m392c ADHM Notes

These notes summarize the M392C Mathematical Gauge Theory class taught by Dan Freed at UT Austin in Spring 2019, covering various topics in linear algebra, principal bundles, harmonic forms, and gauge transformations. The document includes lecture notes on specific mathematical concepts, exercises, and examples related to gauge theory. The notes are intended for students and contain personal annotations and corrections.

M392C NOTES: MATHEMATICAL GAUGE THEORY

ARUN DEBRAY
MAY 9, 2019

These notes were taken in UT Austin’s M392C (Mathematical gauge theory) class in Spring 2019, taught by
Dan Freed. I live-TeXed them using vim, so there may be typos; please send questions, comments, complaints, and
corrections to [email protected]. Any mistakes in the notes are my own. Thanks to Yixian Wu for finding
and fixing a typo.

Contents
1. Some useful linear algebra: 1/22/19
2. Fantastic 2-forms and where to find them: 1/24/19
3. Principal bundles, associated bundles, and the curvature 2-form: 1/29/19
4. Harmonic forms and (anti)-self-dual connections: 1/31/19
5. The Yang-Mills functional: 2/5/19
6. Spinors in low dimensions and special isomorphisms of Lie groups: 2/7/19
7. Some linear algebra underlying spinors in dimension 4: 2/12/19
8. Twistors and Dirac operators: 2/14/19
9. The Nahm transform for anti-self-dual connections: 2/17/19
10. The Chern connection: 2/19/19
11. Proof of the Nahm transform, part 1: 2/26/19
12. Proof of the Nahm transform, part 2: 2/28/19
13. Fredholm theory: 3/7/19
14. Transversality and the obstruction bundle: 3/12/19
15. Constructing moduli spaces: 3/14/19
16. Gauge transformations: 3/26/19
17. : 3/28/19
18. More Sobolev spaces and a spectral theorem: 4/2/19
19. Dirac operators: 4/4/19
20. Elliptic regularity: 4/9/19
21. Chern-Weil theory: 4/11/19
22. Chern-Weil theory on classifying spaces: 4/16/19
23. Chern-Weil and Chern-Simons forms: 4/18/19
24. Chern-Simons forms, II: 4/23/19
25. Classical Chern-Simons theory: 4/25/19
26. Classical Chern-Simons theory, II: 4/30/19
27. Chern-Simons lines: 5/2/19
28. New(er) equations in gauge theory: 5/7/19
29. BPS equations: 5/9/19

Note: I have handwritten notes for the missing lectures, and will try to type them up at some point.

Lecture 1.
Some useful linear algebra: 1/22/19

“Why did the typing stop?”


Today we’ll discuss some basic linear algebra which, in addition to being useful on its own, is helpful for
studying the self-duality equations. You should think of this as happening pointwise on the tangent space of
a smooth manifold.
Let V be a real n-dimensional vector space. The exterior powers of V define more vector spaces: the
scalars R, V, Λ^2 V, and so on, up to Λ^n V = Det V. We can also apply this to the dual space, defining R, V^*,
Λ^2 V^*, etc., up to Λ^n V^* = Det V^*.
There is a duality pairing
\begin{aligned} \theta\colon \Lambda^k V^*\times \Lambda^k V &\longrightarrow \mathbb{R} \\ (v^1\wedge\dots\wedge v^k,\ v_1\wedge\dots\wedge v_k) &\longmapsto \det\bigl(v^i(v_j)\bigr)_{i,j}, \end{aligned}
(1.1)

where v^i ∈ V^* and v_j ∈ V.
Now fix a µ ∈ Det V^* \ {0}, which we call a volume form. Then we get another duality pairing
\begin{aligned} \Lambda^k V\times \Lambda^{n-k}V &\longrightarrow \mathbb{R} \\ (x, y) &\longmapsto \theta(\mu, x\wedge y). \end{aligned}
(1.2)

Thus Λ^k V ≅ Λ^{n−k} V^*.
Suppose we have additional structure: an inner product and an orientation. Let e_1, …, e_n be an oriented,
orthonormal basis of V, and e^1, …, e^n be the dual basis. Now we can choose µ = e^1 ∧ ··· ∧ e^n.
Definition 1.3. The Hodge star operator is the linear operator ⋆ : Λ^k V^* → Λ^{n−k} V^* characterized by
(1.4) \alpha\wedge(\star\beta) = \langle\alpha,\beta\rangle_{\Lambda^k V^*}\,\mu.
The inner product on Λ^k V^* is defined by
(1.5) \langle v^1\wedge\dots\wedge v^k,\ w^1\wedge\dots\wedge w^k\rangle := \det\bigl(\langle v^i, w^j\rangle\bigr)_{i,j}.
The Hodge star was named after W.V.D. Hodge, a British mathematician. Notice how we’ve used both the
metric and the orientation – it’s possible to work with unoriented vector spaces (and eventually unoriented
Riemannian manifolds), but one must keep track of some additional data.
Example 1.6.
• ⋆(e^{i_1} ∧ ··· ∧ e^{i_k}) = e^{j_1} ∧ ··· ∧ e^{j_{n−k}} if the permutation taking 1, …, n to i_1, …, i_k, j_1, …, j_{n−k} of [n] :=
{1, …, n} is even. Otherwise there’s a factor of −1.
• Suppose n = 4. Then ⋆(e^1 ∧ e^2) = e^3 ∧ e^4 and ⋆(e^1 ∧ e^3) = −e^2 ∧ e^4, and so on.
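The sign rule in Example 1.6 is easy to tabulate by machine. The following is a minimal sketch, not part of the lecture: the function names `perm_sign` and `hodge_basis` are made up for illustration, and a basis k-form is encoded as an increasing tuple of indices.

```python
from itertools import combinations

def perm_sign(seq):
    """Sign of the permutation that sends the sorted list to `seq` (inversion count)."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1

def hodge_basis(indices, n):
    """Return (sign, J) with star(e^I) = sign * e^J, J the increasing complement of I."""
    indices = tuple(indices)
    complement = tuple(j for j in range(1, n + 1) if j not in indices)
    return perm_sign(indices + complement), complement

# Tabulate the Hodge star on the basis 2-forms of a 4-dimensional space.
for I in combinations(range(1, 5), 2):
    sign, J = hodge_basis(I, 4)
    print(I, "->", sign, J)
```

The two cases worked out in the example correspond to `hodge_basis((1, 2), 4)` and `hodge_basis((1, 3), 4)`.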
Remark 1.7. The Hodge star is natural. First, you can see that we didn’t make any choices when defining it,
other than an orientation and an inner product, but there’s also a functoriality property. Let T : V → V be an
automorphism; this induces (Λ^k T^*)^{−1} : Λ^k V^* → Λ^k V^*, and if T is an orientation-preserving isometry,
(1.8) \star\circ(\Lambda^k T^*)^{-1} = (\Lambda^{n-k}T^*)^{-1}\circ\star.
Hence ⋆⋆ : Λ^k V^* → Λ^k V^* is some nonzero scalar multiple of the identity, and we can determine which multiple
it is. Certainly we know
(1.9) \star\star(e^1\wedge\dots\wedge e^k) = \star(e^{k+1}\wedge\dots\wedge e^n) = \lambda\, e^1\wedge\dots\wedge e^k,
and we just have to compute the parity of the permutation (k+1, …, n, 1, …, k): moving each of the n − k indices
k+1, …, n past the k indices 1, …, k takes k(n − k) transpositions. Therefore we conclude that
(1.10) \star\star = (-1)^{k(n-k)}\colon \Lambda^k V^*\to \Lambda^k V^*.
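As a sanity check on (1.10), one can apply the basis formula of Example 1.6 twice and compare signs for all small n and k. This is a brute-force sketch under the same encoding of basis forms as increasing index tuples; the helper names are hypothetical.

```python
from itertools import combinations

def perm_sign(seq):
    """Sign of the permutation that sends the sorted list to `seq` (inversion count)."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1

def hodge_basis(indices, n):
    """Return (sign, J) with star(e^I) = sign * e^J, J the increasing complement of I."""
    indices = tuple(indices)
    complement = tuple(j for j in range(1, n + 1) if j not in indices)
    return perm_sign(indices + complement), complement

def double_star_sign(indices, n):
    """The scalar by which star(star(e^I)) differs from e^I."""
    first_sign, complement = hodge_basis(indices, n)
    second_sign, back = hodge_basis(complement, n)
    assert back == tuple(indices)  # applying the star twice returns to the same basis form
    return first_sign * second_sign

def formula_holds(max_n=6):
    """Check star star = (-1)^(k(n-k)) on every basis k-form for n <= max_n."""
    return all(
        double_star_sign(I, n) == (-1) ** (k * (n - k))
        for n in range(1, max_n + 1)
        for k in range(n + 1)
        for I in combinations(range(1, n + 1), k)
    )
```

Calling `formula_holds()` checks every basis form in dimensions 1 through 6; this is evidence for the formula, not a substitute for the parity count above.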
Now suppose n = 2m, so we have a middle dimension m, and ⋆⋆ : Λ^m V^* → Λ^m V^* is (−1)^{m·m} = (−1)^m. This induces
additional structure on Λ^m V^*.
• If m is even (so n ≡ 0 mod 4), the Hodge star is an endomorphism squaring to 1. This defines
a Z/2-grading on Λ^m V^*, given by the ±1-eigenspaces, which we’ll denote Λ^m_± V^*. The +1-eigenspace
is called the self-dual m-forms, and the −1-eigenspace is called the anti-self-dual m-forms.
• If m is odd (so n ≡ 2 mod 4), the Hodge star squares to −1, so this defines a complex structure
on Λ^m V^*, where i acts by the Hodge star.
Exercise 1.11. Especially for those interested in physics, work out this linear algebra in indefinite signature
(particularly Lorentz). The signs are different, and in Lorentz signature the two bullet points above switch!
Exercise 1.12. Show that if 4 | n, the direct-sum decomposition Λ^m V^* = Λ^m_+ V^* ⊕ Λ^m_− V^* is orthogonal. See
if you can find the one-line proof that self-dual and anti-self-dual forms are orthogonal.
Next we introduce conformal structures. This allows the sort of geometry which knows angles, but not
lengths.
Definition 1.13. A conformal structure on a real vector space V is a set C of inner products on V such
that any g_1, g_2 ∈ C are related by g_1 = λg_2 for a λ ∈ R_+.
In this setting, one can obtain g_2 from g_1 by pulling back g_1 along a dilation T_λ : v ↦ λv (so that g_2 = λ^2 g_1; this λ is
normalized differently from the one in Definition 1.13). This induces
an action of (T_λ^*)^{−1} on Λ^k V^*, which is multiplication by λ^{−k}: if µ_i is the volume form induced from g_i and ⋆ is the Hodge star of g_1, so
that
(1.14) \alpha\wedge\star\beta = g_1(\alpha,\beta)\,\mu_1,
then
(1.15) \lambda^{-2k}\,\alpha\wedge\star\beta = g_2(\alpha,\beta)\,\lambda^{-n}\mu_2.
Thus pulling back by dilation carries the Hodge star to λ^{n−2k} ⋆. Importantly, if n = 2m, then ⋆ : Λ^m V^* →
Λ^m V^* is preserved by this dilation, so it only depends on the orientation and the conformal structure.
Remark 1.16. A conformal structure is independent of an orientation. For example, on a one-dimensional
vector space, a conformal structure is no information at all (all inner products are multiples of each other),
but an orientation is a choice.
Example 1.17. Suppose n = 2 and choose an orientation and a conformal structure on V. As we just saw,
this is enough to define the Hodge star ⋆ : V^* → V^*, which squares to −1 and so defines a complex structure on V. Pick a square
root i of −1 and let ⋆ act by it (there are two choices, acted on by a Galois group).
We get more structure by complexifying: V^* ⊗ C splits as the sum of the ±i-eigenspaces of the Hodge star; we
denote the i-eigenspace by V^{(1,0)} (the (1, 0)-forms) and the −i-eigenspace by V^{(0,1)} (the (0, 1)-forms).
Now let’s globalize this: everything has been completely natural, so given an oriented, conformal 2-manifold
X, it picks up a complex structure, hence is a Riemann surface, and the Hodge star is a map ⋆ : Ω^1_X → Ω^1_X.
Moreover, we can do this on the complex differential forms, which split into (1, 0)-forms and (0, 1)-forms.
How do 1-forms most naturally appear? They’re differentials of functions, so given an f : X → C, we can
ask what it means for df ∈ Ω^{1,0}_X. This is the equation
(1.18) \star\,\mathrm{d}f = i\,\mathrm{d}f.
This is precisely the Cauchy-Riemann equation; its solutions are precisely the holomorphic functions on
X.
Remark 1.19. More generally, one can ask about functions to C^n or even sections of complex vector bundles;
the analogue gives you notions of holomorphic sections. In this case, the equation is written ∂̄f = 0, where

(1.20) \bar\partial f = \left(\frac{1+i\star}{2}\right)\mathrm{d}f.
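As a consistency check on (1.20), here is a short computation, assuming only that ⋆⋆ = −1 on 1-forms in dimension 2 (which is (1.10) with n = 2, k = 1). The name P is introduced just for this check. It suggests that (1 + i⋆)/2 is the projection onto the −i-eigenspace, so that the vanishing of (1.20) says exactly that df lies in the i-eigenspace, which is (1.18).

```latex
\begin{aligned}
P &:= \tfrac{1}{2}(1 + i\star), \\
P^2 &= \tfrac{1}{4}\bigl(1 + 2i\star + i^2\star^2\bigr) = \tfrac{1}{4}(1 + 2i\star + 1) = P, \\
\star\alpha = i\alpha &\implies P\alpha = \tfrac{1}{2}(\alpha + i\cdot i\,\alpha) = 0, \\
\star\alpha = -i\alpha &\implies P\alpha = \tfrac{1}{2}\bigl(\alpha + i\cdot(-i)\,\alpha\bigr) = \alpha.
\end{aligned}
```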

We’ll spend some time in this class understanding a four-dimensional analogue of all of this structure.
Symmetry groups. Symmetry is a powerful perspective on geometry. If we think about V together with
some structure (orientation, metric, conformal structure, some combination,. . . ), we can ask about the
symmetries of V preserving this structure. Of course, to know this, we must know V , but we can instead look
at a model space R^n to define a symmetry type, and ask about its symmetry group G: then an isomorphism
R^n → V preserving all of the data we’re interested in defines an isomorphism from G to the symmetry group
of V .
Example 1.21. When dim V = 2, the most general symmetry group is GL_2(R), the invertible matrices
acting on R^2. Adding more structure, we get more options.
• If we restrict to orientation-preserving symmetries, we get GL_2^+(R).
• If we restrict to symmetries preserving a conformal structure, the group is called CO_2 = O_2 × R^{>0}.
• If we ask to preserve an orientation and a conformal structure, we get CO_2^+ = SO_2 × R^{>0}. This is
isomorphic to C^× = GL_1(C): an element of SO_2 × R^{>0} is rotation through some angle θ and a
positive number r; this is sent to re^{iθ} ∈ C^×.
This provides another perspective on why an orientation and a conformal structure give us a complex
structure.
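The identification CO_2^+ ≅ C^× in Example 1.21 can be checked numerically: scaling by r and rotating by θ on R^2 should agree with multiplying x + iy by re^{iθ}. A small sketch follows, with hypothetical helper names and the usual identification (x, y) ↔ x + iy taken as an assumption.

```python
import cmath
import math

def conformal_matrix(r, theta):
    """Matrix of rotation by theta composed with scaling by r > 0 (an element of SO_2 x R^{>0})."""
    c, s = math.cos(theta), math.sin(theta)
    return [[r * c, -r * s], [r * s, r * c]]

def apply(matrix, vector):
    """Apply a 2x2 matrix to a vector in R^2."""
    return tuple(sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2))

def to_complex(r, theta):
    """The element r e^{i theta} of C^x assigned to (rotation by theta, scaling by r)."""
    return cmath.rect(r, theta)

def actions_agree(r, theta, x, y, tol=1e-12):
    """Compare the matrix action on (x, y) with complex multiplication on x + iy."""
    u, v = apply(conformal_matrix(r, theta), (x, y))
    w = to_complex(r, theta) * complex(x, y)
    return abs(u - w.real) < tol and abs(v - w.imag) < tol
```

For instance, `actions_agree(2.0, 0.7, 1.5, -0.25)` compares the two actions on the vector (1.5, −0.25).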
Example 1.22. Now suppose n = 4, and choose a conformal structure C and an orientation on V.
Then orthogonal makes sense, though orthonormal doesn’t, and the Hodge star induces a Z/2-grading on
Λ^2 V^* = Λ^2_+ V^* ⊕ Λ^2_− V^*, the self-dual and anti-self-dual 2-forms. The total space Λ^2 V^* is six-dimensional,
and these two subspaces are each three-dimensional.
Suppose e_1, e_2, e_3, e_4 is an oriented orthonormal basis for some inner product in C. We can use the dual basis e^1, …, e^4 to define
bases of Λ^2_± V^*, given by
\begin{aligned} \alpha_1^\pm &:= e^1\wedge e^2 \pm e^3\wedge e^4\\ \alpha_2^\pm &:= e^1\wedge e^3 \mp e^2\wedge e^4\\ \alpha_3^\pm &:= e^1\wedge e^4 \pm e^2\wedge e^3. \end{aligned}
(1.23)
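One can verify directly that the forms in (1.23) are eigenvectors of the Hodge star with eigenvalue ±1, using the sign rule of Example 1.6. The sketch below encodes a 2-form as a dictionary from index pairs (i, j), i < j, to coefficients; this encoding and the function names are assumptions made for illustration.

```python
def perm_sign(seq):
    """Sign of the permutation that sends the sorted list to `seq` (inversion count)."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1

def star_2form(form, n=4):
    """Hodge star of a 2-form {(i, j): coefficient} in an oriented orthonormal coframe."""
    out = {}
    for (i, j), coeff in form.items():
        complement = tuple(m for m in range(1, n + 1) if m not in (i, j))
        out[complement] = out.get(complement, 0) + perm_sign((i, j) + complement) * coeff
    return out

def alpha(index, sign):
    """The form alpha_index^(sign) of (1.23), with sign = +1 or -1."""
    if index == 1:
        return {(1, 2): 1, (3, 4): sign}
    if index == 2:
        return {(1, 3): 1, (2, 4): -sign}
    if index == 3:
        return {(1, 4): 1, (2, 3): sign}
    raise ValueError("index must be 1, 2, or 3")

def scale(form, c):
    """Multiply a 2-form by the scalar c."""
    return {pair: c * coeff for pair, coeff in form.items()}
```

With these definitions, `star_2form(alpha(i, s)) == scale(alpha(i, s), s)` for i = 1, 2, 3 and s = ±1.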

Now, what symmetry groups do we have? Inside GL_4(R), preserving an orientation lands in the subgroup
GL_4^+(R); preserving a conformal structure lands in O_4 × R^{>0}; and preserving both lands in SO_4 × R^{>0}. The
first three of these act irreducibly on Λ^2(R^4)^*, but the action of SO_4 × R^{>0} has two irreducible summands,
Λ^2_±(R^4)^*.
To understand this better, we should learn a little more about SO_4. Recall that Sp_1 is the Lie group of
unit quaternions. This is isomorphic to SU_2, the group of determinant-1 unitary transformations of C^2. This
group has an irreducible 3-dimensional representation ρ in which Sp_1 acts by conjugation on the imaginary
quaternions (since R ⊂ H is preserved by this action, so is its orthogonal complement).
Remark 1.24. Another way of describing ρ is: let ρ_0 denote the action of SU_2 on C^2 by matrix multiplication.
Then ρ ≅ Sym^2 ρ_0.
Proposition 1.25. There is a double cover Sp_1 × Sp_1 → SO_4. Under this cover, the SO_4-representation
Λ^2_±(R^4)^* pulls back to a real three-dimensional representation in which one copy of Sp_1 acts by ρ and the
other acts trivially.
Proof. Let W′ and W″ be two-dimensional Hermitian vector spaces with compatible quaternionic structures
J′, resp. J″.¹ Then, V := W′ ⊗_C W″ has a real structure J′ ⊗ J″: two minuses make a plus, and compatibility
of J′ and J″ means the real points of V have an inner product. (These kinds of linear-algebraic statements are
things you should prove once in your life.)
By tensoring symmetries we obtain a homomorphism Sp(W′) × Sp(W″) → O(V). This factors through
SO(V) ↪ O(V), which you can see for two reasons:
• Sp(W′) and Sp(W″) are connected, so this homomorphism must factor through the identity component
of O(V), which is SO(V); or
• a complex vector space has a canonical orientation, and using this we know these symmetries are
orientation-preserving.
Now we want to claim this map is two-to-one. One can quickly check that (−1, −1) is in the kernel; the rest
is an exercise.
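For a concrete feel for Proposition 1.25, here is a sketch in a different but standard model from the one in the proof: identify R^4 with the quaternions H and let a pair (p, q) of unit quaternions act by x ↦ p x q̄. This model is an assumption of the example, not the W′ ⊗ W″ construction above, and the function names are made up. Exact rational arithmetic shows that the action preserves the norm, that (−p, −q) acts the same way as (p, q), and that the diagonal action (p, p) is conjugation, which fixes R and preserves imaginary quaternions.

```python
from fractions import Fraction

def qmul(a, b):
    """Product of quaternions a = a0 + a1 i + a2 j + a3 k and b, as 4-tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def qconj(a):
    """Quaternionic conjugate."""
    return (a[0], -a[1], -a[2], -a[3])

def norm_sq(a):
    """Squared norm, i.e. the standard inner product on R^4."""
    return sum(c * c for c in a)

def neg(a):
    return tuple(-c for c in a)

def act(p, q, x):
    """Action of the pair (p, q) of unit quaternions on x in H = R^4: x -> p x conj(q)."""
    return qmul(qmul(p, x), qconj(q))

half = Fraction(1, 2)
p = (half, half, half, half)                                    # unit quaternion
q = (Fraction(3, 5), Fraction(0), Fraction(4, 5), Fraction(0))  # unit quaternion
x = (Fraction(1), Fraction(2), Fraction(3), Fraction(4))

print(norm_sq(act(p, q, x)) == norm_sq(x))     # the action is by isometries
print(act(neg(p), neg(q), x) == act(p, q, x))  # (-1, -1) acts trivially
```

This only exhibits (−1, −1) in the kernel; showing the kernel is no larger is still the exercise.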
Since Spin_n is the double cover of SO_n, this is telling us Spin_4 = Sp_1 × Sp_1. This splitting is the genesis
of a lot of what we’ll do in the next several lectures.
Consider the 16-dimensional space
(1.26) V^*\otimes V^* = (W')^*\otimes (W')^*\otimes (W'')^*\otimes (W'')^*.
¹ That is, J′ is an antilinear endomorphism of W′ squaring to −1, and similarly for J″. Compatible means with the Hermitian
metric: h is a map W × W → C and J is a map W → W, and if ξ, η ∈ W′, we want
h(J′ξ, J′η) = \overline{h(ξ, η)} and h(J′ξ, η) = −h(J′η, ξ).

Because the map

\begin{aligned} \omega'\colon W'\times W' &\longrightarrow \mathbb{C} \\ (\xi',\eta') &\longmapsto h'(J'\xi', \eta') \end{aligned}
(1.27)

is skew-symmetric, it lives in Λ^2(W′)^* ⊂ (W′)^* ⊗ (W′)^*, and similarly for the analogous form ω″ on W″. In particular, the embedding

(1.28) \mathrm{Sym}^2 (W')^*\oplus \mathrm{Sym}^2 (W'')^*\hookrightarrow (W')^*\otimes (W')^*\otimes (W'')^*\otimes (W'')^*

is the map sending

(1.29) (\alpha,\beta) \longmapsto \alpha\otimes\omega'' + \omega'\otimes\beta.
Remark 1.30. This story can be interpreted in terms of representations of Sp(W′) × Sp(W″). Let 1 denote
the trivial representation of Sp_1 and 3 be the three-dimensional irreducible representation we discussed above.
Then (1.26) enhances to
(1.31) V^*\otimes V^* = (\boldsymbol 1\oplus\boldsymbol 3)_{\mathrm{Sp}(W')}\otimes(\boldsymbol 1\oplus\boldsymbol 3)_{\mathrm{Sp}(W'')}.
The skew-symmetric part is 3_{Sp(W′)} ⊗ 1_{Sp(W″)} ⊕ 1_{Sp(W′)} ⊗ 3_{Sp(W″)}, and the “rest” (complement) is symmetric.
The group Sp_1 × Sp_1 = Spin_4 has complex (quaternionic) two-dimensional representations S^±, the spin
representations, and Λ^2_± V ≅ Sym^2 S^±.
So two-forms have self-dual and anti-self-dual parts, and curvature is a natural source of 2-forms!
Lecture 2.
Fantastic 2-forms and where to find them: 1/24/19

“I’ve taught this before, so I know it’s true.”


Last time, we discussed some linear algebra which is a local model for phenomena we will study in
differential geometry. For example, we saw that on an oriented even-dimensional vector space with an inner
product, the Hodge star defines a self-map of the middle-dimensional part of the exterior algebra, which
induces extra structure, such as a splitting into self-dual and anti-self-dual pieces in dimensions divisible by
4. This therefore generalizes to a 4k-dimensional manifold with a metric and an orientation: the space of
2k-forms splits as an orthogonal direct sum of self-dual and anti-self-dual forms. (We also discussed other
examples, such as how 1-forms on an oriented 2-manifold split into holomorphic and antiholomorphic pieces.)
We’re particularly interested in the case k = 1, where this splitting depends only on a conformal structure,
and applies to 2-forms. To study its consequences we’ll discuss where one can find 2-forms in differential
geometry.
Definition 2.1. A fiber bundle is the data of a smooth map π : E → X of smooth manifolds such that for all x ∈ X
there’s an open neighborhood U of x and a diffeomorphism ϕ : U × π^{−1}(x) → π^{−1}(U) such that the diagram

\begin{gathered} \xymatrix@R=0.4cm{ U\times\pi^{-1}(x)\ar[rr]^-{\varphi}\ar[dr]_{\mathrm{proj}_1} && \pi^{-1}(U)\ar[dl]^{\pi} \\ & U } \end{gathered}
(2.2)

commutes. In this case we call X the base space and E the total space. If there is a manifold F such that
in the above definition we can replace π^{−1}(x) with F, we call π a fiber bundle with fiber F.² The map ϕ is
called a local trivialization.
Example 2.3. The trivial bundle with fiber F is the projection map X × F → X.
Remark 2.4. Fiber bundles were first defined by Steenrod [Ste51] in the 1940s, albeit in a different-looking
way. His key insight was local triviality. There are variants depending on what kind of space you care about:
for example, you can replace manifolds with spaces and smooth maps with continuous maps.
Keep in mind that a fiber bundle is data (π) and a condition. Often people say “E is a fiber bundle” when
they really mean “π is a fiber bundle”; specifying E doesn’t uniquely specify π.
² Not all fiber bundles have a fiber in this sense, e.g. a fiber bundle with different fibers over different connected components.
If F has more structure, such as a Lie group, torsor, vector space, algebra, Lie algebra, etc., we ask that
ϕ|_{{x}×F} : F → π^{−1}(x) preserve this structure. For example, in a fiber bundle whose fibers are vector spaces,
we want ϕ to be linear on each fiber; in this case we call it a vector bundle.
Definition 2.5. If π : E → X is a vector bundle, the space of k-forms valued in E, denoted Ω^k_X(E), is the
space of C^∞ sections of Λ^k T^*X ⊗ E → X.
For ordinary differential forms (so when E is a trivial line bundle), we have the de Rham differential d : Ω^k_X →
Ω^{k+1}_X, but we do not have this in general.
Definition 2.6. Let X be a smooth manifold.
(1) A distribution on X is a subbundle E ⊂ TX.
(2) A vector field ξ on X belongs to E if ξ_x ∈ E_x ⊂ T_xX for all x ∈ X.
(3) A submanifold Y ⊂ X is an integral submanifold for E if for all y ∈ Y, T_yY = E_y inside T_yX.
Do integral submanifolds exist? This is a local question and a global question (the latter about maximal
integral submanifolds). In general, the answer is “no,” as in the next example.
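Example 2.7. Consider a distribution on A^3 with coordinates (x, y, z) given by

(2.8) E_{(x,y,z)} = \operatorname{span}\left\{\frac{\partial}{\partial x},\ \frac{\partial}{\partial y} + x\frac{\partial}{\partial z}\right\}.

There is no integral surface for this distribution. (The argument given in lecture is missing from these notes; one way to see it is that
the Lie bracket [∂/∂x, ∂/∂y + x ∂/∂z] = ∂/∂z does not lie in E, so the Frobenius tensor of Definition 2.9 is nonzero and Theorem 2.11 applies.)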
This is the basic example that illustrates curvature. It turns out that the existence of an integral
submanifold is determined completely by the (non)vanishing of a tensor.
Definition 2.9. Let E ⊂ TX be a distribution. The Frobenius tensor is the map φ_E : E × E → TX/E given by
ξ_1, ξ_2 ⟼ [ξ_1, ξ_2] mod E.
Let’s think about this: the Lie bracket is defined for vector fields, not vectors. So we have to extend ξ_1
and ξ_2 to vector fields (well, sections of E, since they’re in E), which is a choice, and then check that what
we obtain is independent of this choice. It suffices to know that this is linear over functions: that
(2.10) [f_1\xi_1, f_2\xi_2] \overset{?}{=} f_1f_2[\xi_1,\xi_2].
Of course, this is not what the Lie bracket does: it differentiates in both variables, so we have the extra terms
f_1(ξ_1 · f_2)ξ_2 and −f_2(ξ_2 · f_1)ξ_1. But both of these are sections of E, so vanish mod E, and therefore we do get a
well-defined, skew-symmetric form, a section of Λ^2 E^* ⊗ TX/E – not quite a differential form.
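To see the Frobenius tensor in action, here is a numerical sketch for the distribution of Example 2.7. It approximates the Lie bracket by central differences using the coordinate formula [X, Y]^i = X^j ∂_j Y^i − Y^j ∂_j X^i, then tests whether the result lies in E via a determinant. The function names, the step size h, and the sample point are all choices made for this illustration.

```python
def lie_bracket(X, Y, p, h=1e-5):
    """Approximate [X, Y] at the point p, for vector fields given as functions p -> components."""
    n = len(p)

    def directional(F, v):
        # Central-difference derivative of the vector-valued function F at p in direction v.
        plus = F([p[i] + h * v[i] for i in range(n)])
        minus = F([p[i] - h * v[i] for i in range(n)])
        return [(a - b) / (2 * h) for a, b in zip(plus, minus)]

    dY_along_X = directional(Y, X(p))
    dX_along_Y = directional(X, Y(p))
    return [a - b for a, b in zip(dY_along_X, dX_along_Y)]

def xi1(p):
    """The vector field d/dx on A^3."""
    return [1.0, 0.0, 0.0]

def xi2(p):
    """The vector field d/dy + x d/dz on A^3."""
    return [0.0, 1.0, p[0]]

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (
        a[0] * (b[1] * c[2] - b[2] * c[1])
        - a[1] * (b[0] * c[2] - b[2] * c[0])
        + a[2] * (b[0] * c[1] - b[1] * c[0])
    )

point = [0.3, -1.2, 2.0]
bracket = lie_bracket(xi1, xi2, point)
# A nonzero determinant means the bracket is not in the span of xi1 and xi2 at this point.
print(bracket, det3(xi1(point), xi2(point), bracket))
```

Since the bracket is approximately ∂/∂z and the determinant is approximately 1, the bracket is nonzero mod E at the sample point.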
Frobenius did many important things in mathematics, across group theory and representation theory and
this theorem, which is about differential equations!
Theorem 2.11 (Frobenius theorem). An integral submanifold of E exists locally iff φ_E = 0.
This is a nonlinear ODE. As such, our proof will rely on some facts from a course on ODEs.
Lemma 2.12. Let X be a smooth manifold, ξ be a vector field on X, and x ∈ X be a point where ξ doesn’t
vanish. Then there are local coordinates x^1, …, x^n around x such that ξ = ∂/∂x^1 in this neighborhood.

Proof. Let ϕ_t be the local flow generated by ξ, and choose coordinates y^1, …, y^n near x, centered at x, such that ξ_x = ∂/∂y^1|_x.
Define a map U from a neighborhood of 0 in R^n to X by
(2.13) (x^1,\dotsc,x^n)\longmapsto \varphi_{x^1}(0, x^2,\dotsc,x^n).
The right-hand side is expressed in y-coordinates. Now we need to check this is a coordinate chart, which
follows from the inverse function theorem, because the differential of U is invertible at 0 (in fact, it’s the
identity). The lemma then follows because x^1 is the time direction for flow along ξ in this coordinate
system. □

Lemma 2.14. With notation as above, let ξ_1, …, ξ_k be vector fields which are linearly independent at x and
such that [ξ_i, ξ_j] = 0 for all 1 ≤ i, j ≤ k. Then there exist local coordinates x^1, …, x^n such that for 1 ≤ i ≤ k,
ξ_i = ∂/∂x^i.
In fact, the converse is also true, but trivially so: it’s the theorem in multivariable calculus that mixed
partials commute.
Proof. Let ϕ_1, …, ϕ_k be the local flows for ξ_1, …, ξ_k. Because the pairwise Lie brackets vanish, the flows commute: ϕ_i ϕ_j =
ϕ_j ϕ_i. Since these vector fields are linearly independent at x, we can choose local coordinates y^1, …, y^n
around x such that ξ_i|_x = ∂/∂y^i|_x for 1 ≤ i ≤ k. Then, as above, define
(2.15) (x^1,\dotsc,x^n)\longmapsto (\varphi_1)_{x^1}(\varphi_2)_{x^2}\dotsb(\varphi_k)_{x^k}(0,\dotsc,0, x^{k+1},\dotsc,x^n).
You can check that the differential of this map is invertible at 0, so this is a change of coordinates, and then, using the fact that the flows
commute, you can see that the lemma follows. □
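The role of the hypothesis [ξ_i, ξ_j] = 0 can be seen by comparing flows. For the two vector fields of Example 2.7, the flows are explicit, and composing them in the two possible orders differs by st in the z-direction; for the commuting fields ∂/∂x and ∂/∂y the two orders agree. The closed-form flows below were worked out by hand for this sketch, and the function names are hypothetical.

```python
def flow_xi1(t, p):
    """Time-t flow of xi_1 = d/dx."""
    x, y, z = p
    return (x + t, y, z)

def flow_xi2(s, p):
    """Time-s flow of xi_2 = d/dy + x d/dz (x is constant along this flow)."""
    x, y, z = p
    return (x, y + s, z + s * x)

def flow_dy(s, p):
    """Time-s flow of d/dy, which commutes with d/dx."""
    x, y, z = p
    return (x, y + s, z)

def commutation_gap(flow_a, flow_b, s, t, p):
    """Componentwise difference between flow_b(s) after flow_a(t) and flow_a(t) after flow_b(s)."""
    one = flow_b(s, flow_a(t, p))
    other = flow_a(t, flow_b(s, p))
    return tuple(u - v for u, v in zip(one, other))

print(commutation_gap(flow_xi1, flow_xi2, 2, 5, (1, 2, 3)))  # nonzero in the z-slot
print(commutation_gap(flow_xi1, flow_dy, 2, 5, (1, 2, 3)))   # all zeros
```

The nonzero gap is why a construction like (2.15) cannot produce coordinates adapted to both fields of Example 2.7 at once.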

These lemmas are important theorems in their own right.


Proof of Theorem 2.11. Since the theorem statement is local, we can work in affine space A^n. Let π : A^n → A^k
be an affine surjection such that dπ_0 restricts to an isomorphism E_0 → R^k. Restrict to a neighborhood U
of 0 in A^n such that dπ_p|_{E_p} : E_p → R^k is an isomorphism for all p ∈ U, and choose ξ_i|_p ∈ E_p such that
dπ_p(ξ_i|_p) = ∂/∂y^i, where y^1, …, y^k are the coordinates on A^k. Then, [ξ_i, ξ_j] = 0: since φ_E = 0 we know it’s in E, and

(2.16) \mathrm{d}\pi[\xi_i, \xi_j] = [\mathrm{d}\pi(\xi_i), \mathrm{d}\pi(\xi_j)] = \left[\frac{\partial}{\partial y^i}, \frac{\partial}{\partial y^j}\right] = 0,

so it vanishes because dπ is injective on E.
Now apply Lemma 2.14 to get coordinates x^1, …, x^n with ξ_i = ∂/∂x^i; then {x^{k+1} = ··· = x^n = 0} gives the desired integral submanifold. □

The idea of the theorem is that it’s a local normal form for an involutive distribution (one whose Frobenius
tensor vanishes): locally it looks like the splitting of R^n into the first k coordinates and the last (n − k)
coordinates. And in that local model, we know what the integral manifolds are.
Consider a fiber bundle with a discrete fiber (i.e. the inverse image of every point has the discrete topology).
This is also known as a covering space. On a “nearby fiber,” whatever that means (without more data, we
don’t have a metric on the base space), we have some sort of parallel transport. The precise statement is that
there’s a neighborhood of any x on the base space such that any path in that neighborhood lifts to a path on
the total space, unique if you specify a point in the fiber. More generally, you can lift families of paths, which
illustrates a homotopy-theoretic generalization of a fiber bundle called a fibration. But globally, given an
element of π_1(X), it might lift to a nontrivial automorphism of the fiber.
We’d like to do this for more general fiber bundles π : E → X, in which case we’ll need more data. The
kernel of dπ is a distribution, and consists of the “vertical” vectors (projection down to X kills them). A
complement is “horizontal”.
Without any choice, we get a short exact sequence at every e ∈ E, where x = π(e):

(2.17) 0 \longrightarrow \ker(\mathrm{d}\pi_e) \longrightarrow T_eE \xrightarrow{\ \mathrm{d}\pi_e\ } T_xX \longrightarrow 0,
and a splitting is exactly the choice of a complement H_e ⊂ T_eE to ker(dπ_e), or equivalently a linear map T_xX → T_eE which is a
right inverse to dπ_e. We would like to do this for all e at once, which motivates the next definition.
Definition 2.18. Let π : E → X be a fiber bundle. A horizontal distribution is a subbundle H ⊂ TE
complementary to ker(dπ), or equivalently a section of the (surjective) map TE → π^*TX of vector bundles on E.
We must address existence and uniqueness. At e the space of splittings is an affine space modeled on
Hom(T_xX, ker(dπ_e)), because two right inverses of dπ_e in the short exact sequence (2.17) differ by a linear map
T_xX → T_eE whose composition with dπ_e is zero, i.e. a map into ker(dπ_e).
Therefore existence and uniqueness of a horizontal distribution is a question about existence and uniqueness
of a section of an affine bundle over E. Using partitions of unity, we can construct many of these: existence
is good, but uniqueness fails.
What about path lifting? Suppose γ : [0, 1] → X is a path in X beginning at x_0 and terminating at x_1.
We can pull back both E and H by γ, to obtain a rank-1 distribution γ^*H in T(γ^*E), and the projection map
to T[0, 1] is a fiberwise isomorphism. Therefore the vector field d/dt on [0, 1] has a unique horizontal lift
to a vector field on γ^*E, and given a point in the fiber over x_0 = γ(0) we get a unique integral curve above γ starting there.
Note that you cannot always lift higher-dimensional submanifolds, and again the obstruction is the
Frobenius tensor, because that’s the obstruction to the existence of an integral submanifold. In this context
the Frobenius tensor is called curvature – right now it’s on the total space, but in some settings we can
descend it to the base.
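Here is a toy computation of path lifting. As an assumption of this sketch (it is not an example from the lecture), regard the distribution of Example 2.7 as a horizontal distribution for the projection A^3 → A^2, (x, y, z) ↦ (x, y). A horizontal lift of a path (x(t), y(t)) then satisfies dz/dt = x dy/dt, and for a polygonal path the integral of x dy over each straight segment is exactly the average of the endpoint x-values times the change in y.

```python
from fractions import Fraction

def lift_polygon(vertices, z0):
    """Final height of the horizontal lift of the polygonal path through `vertices`.

    `vertices` is a list of points (x, y) in the base and z0 is the starting height.
    On each straight segment, the integral of x dy equals (x0 + x1)/2 * (y1 - y0).
    """
    z = Fraction(z0)
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        z += Fraction(x0 + x1) / 2 * (y1 - y0)
    return z

# Two different paths from (0, 0) to (1, 1) lift to different endpoints ...
print(lift_polygon([(0, 0), (1, 0), (1, 1)], 0))
print(lift_polygon([(0, 0), (0, 1), (1, 1)], 0))
# ... so the lift of the closed loop around the unit square does not close up.
print(lift_polygon([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)], 0))
```

The failure of the lifted loop to close is the kind of path dependence that the Frobenius tensor, here playing the role of curvature, measures.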
Lecture 3.
Principal bundles, associated bundles, and the curvature 2-form: 1/29/19

“For whatever reason I’m being a little impressionistic. . . ”


Last time, we discussed a way in which 2-forms appear in geometry: as the obstruction to integrability of
a distribution E ⊂ T X. That is, a distribution contains vectors, and we can ask whether integral curves
of those vectors have tangent vectors contained within E. Associated to E we defined a Frobenius tensor
φ_E : Λ^2 E → TX/E sending
(3.1) \xi_1,\xi_2\longmapsto [\widetilde\xi_1, \widetilde\xi_2]\bmod E,
where ξ̃_i is a vector field extending ξ_i (and we showed this doesn’t depend on the choice of extension). In
Theorem 2.11, we saw that φE is exactly the local obstruction to integrability; we can then move to global
questions.
More generally, suppose that π : E → X is a fiber bundle. Then TE fits into a short exact sequence (2.17),
and we can ask for a horizontal lift from TX to TE, which is a splitting H of (2.17). Then, given a point
e ∈ E_x and a path γ : [0, 1] → X with γ(0) = x, we can pull back³ π and H to obtain a distribution in γ^*E.
The Frobenius tensor vanishes, because [0, 1] is one-dimensional, so we can extend to an integral curve and
therefore parallel-transport along γ. However, if we choose different paths in a ball, there’s no guarantee that
parallel transport along nearby paths agrees at all; the Frobenius tensor may still be nonzero on X.
Steenrod’s elegant perspective on fiber bundles (see his book [Ste51]) considered, in the spirit of Felix Klein, the
symmetry groups associated to fiber bundles. This leads to the definition of a principal G-bundle as a fiber
bundle of right G-torsors.
Definition 3.2. Let G be a Lie group and recall that a right G-torsor is a smooth manifold T and a smooth
right G-action on T such that the map T × G → T × T sending (t, g) ↦ (t, t · g) is a diffeomorphism.
Example 3.3. The prime example of a torsor is to let V be an n-dimensional real vector space; then, the manifold B(V) of
bases of V is a right GL_n(R)-torsor: regarding a basis as an isomorphism R^n → V, GL_n(R) acts by precomposition. This also works over C and H.
Example 3.4. Now let X be a smooth n-manifold. Our first example of a principal bundle spreads Example 3.3
over X: let B(X) be the smooth manifold⁴ of pairs (x, b) where x ∈ X and b is a basis of T_xX, i.e. an
isomorphism b : R^n → T_xX. There’s a natural forgetful map π : B(X) → X sending (x, b) ↦ x.
This fiber bundle is a principal GL_n(R)-bundle: given g ∈ GL_n(R) and a basis b : R^n → T_xX, we let
b · g := b ∘ g : R^n → T_xX, using the standard action of GL_n(R) on R^n. This is called the frame bundle of
X.
This principal bundle controls a lot of the geometry of X, via its associated fiber bundles.
Definition 3.5. Let π : P → X be a principal G-bundle and F be a smooth (left) G-manifold. The associated
fiber bundle with fiber F is the quotient P ×_G F := (P × F)/G, which is a fiber bundle over X with fiber F.
Here, G acts on P × F on the right by (p, f) · g = (p · g, g^{−1} · f).
One has to check this is a fiber bundle, and in particular that the total space is a smooth manifold. Since
G acts freely on P , it acts freely on P × F , but for G noncompact there’s more to say.
Example 3.6. GL_n(R) acts linearly on R^n. Because R^n carries the additional structure of a vector space,
the associated bundle B(X) ×_{GL_n(R)} R^n has additional structure: it’s a vector bundle. In general, additional
structure on F manifests in additional structure on P ×_G F.
Anyways, what vector bundle do we get? (Can you guess?) An element of the fiber of B(X) ×_{GL_n(R)} R^n
is an equivalence class of an element v ∈ R^n and a basis p : R^n → T_xX; let ξ := p(v) ∈ T_xX. Any other
representative of this equivalence class is of the form (p ∘ g, g^{−1}v) for some g ∈ GL_n(R), and (p ∘ g)(g^{−1}v) = p(v), so this pair
defines the same tangent vector ξ. Therefore we recover the tangent bundle.
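The well-definedness claim in Example 3.6 comes down to the identity (p ∘ g)(g^{−1}v) = p(v). Here is a small exact check for n = 2, with a basis p encoded as the matrix whose columns are the basis vectors; the specific matrices and helper names are arbitrary choices for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matvec(A, v):
    """Apply a 2x2 matrix to a vector."""
    return [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]

def inverse_2x2(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def act(p, v, g):
    """Right action of g on a representative: (p, v) . g = (p o g, g^{-1} v)."""
    return matmul(p, g), matvec(inverse_2x2(g), v)

def tangent_vector(p, v):
    """The tangent vector p(v) determined by a representative (p, v)."""
    return matvec(p, v)

p = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]   # a basis of a 2-dimensional tangent space
g = [[Fraction(1), Fraction(3)], [Fraction(2), Fraction(5)]]   # an element of GL_2(R)
v = [Fraction(5), Fraction(-2)]

print(tangent_vector(p, v), tangent_vector(*act(p, v, g)))
```

Both printed vectors agree, as the equivalence relation requires.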
³ In general, we can form the pullback of [0, 1] → X ← E in the category of sets or spaces, but we want to put a smooth
manifold structure on it. We can do it when these two maps are transverse – and since π : E → X is a submersion, this is always
satisfied.
⁴ You have to put a smooth manifold structure on this set! The way to do this is the only tool we have right now: work in an
atlas U of X which trivializes TX, do this locally, and check that the transition maps are smooth. This will also show that the
map π : B(X) → X is a fiber bundle.
In general, a principal bundle is telling you some internal coordinates. You know these coordinates up to
some symmetry G, and the principal bundle tracks that: you have to make a choice to get coordinates, and it
tells you how different choices are related.
We want to show local triviality of a principal G-bundle π : P → X, which will follow from local triviality
as a fiber bundle. Consider a local section s : U → π^{−1}(U), where U ⊂ X; we would like to exhibit an
isomorphism of fiber bundles U × G → π^{−1}(U) over U. The map is exactly
(3.7) (x,g)\longmapsto s(x)\cdot g.
This exhibits U × G ≅ π^{−1}(U) as principal G-bundles, so we have a local trivialization. Then in every associated
bundle to P, we also obtain local triviality, hence local coordinates. For example, if the bundle of frames is
trivialized over U, we get local coordinates (i.e. a local trivialization of TU).
Definition 3.8. A connection on a principal G-bundle π : P → X is a G-invariant horizontal distribution H.
Specifically, given g ∈ G, we have the right action map R_g : P → P, and G-invariance means that H_{p·g} =
(R_g)_*(H_p) for all p ∈ P and g ∈ G.
In this setting, the Frobenius tensor is going to do something nice: it’s a map
(3.9) \phi _H\colon H\wedge H\to TP/H\cong \ker (\pi _*),
so given two horizontal vectors, we get a vertical vector. Since H is G-invariant, the Frobenius tensor is also
G-invariant, so we ought to be able to descend it to the base: there’s only one piece of information on each
fiber. That is, given vectors ξ1 , ξ2 on X, we can lift them to P and compute the Frobenius tensor there, and
G-invariance means it doesn’t matter how we lift.
If g is the Lie algebra of the Lie group G, we have an isomorphism g ≅ ker(π∗ ) of vector bundles on P .
Specifically, let ξ ∈ g, and consider the exponential map exp : g → G. Given p ∈ P with π(p) = x, we get a
curve in P given by t ↦ p · exp(tξ) sending 0 ↦ p, and this curve is contained entirely within Px . Therefore
its tangent vector at p is in ker(π∗ ).
So the Frobenius tensor is a map φH : H ∧ H → g. Now let’s descend to the base. We’d like to claim
that what we get in g is invariant, but that’s just not true: if g ∈ G, the action of g on p · exp(tξ) is not the
same as p · g · exp(tξ): the issue is that g exp(tξ) and exp(tξ)g may not agree. This will make it slightly more
interesting to descend to the base.
First, extend φH to a map
(3.10) \widetilde\phi_H\colon TP\wedge TP\longrightarrow \underline{\mathfrak g}

by first projecting pH : T P ↠ H, which has kernel ker(π∗ ). That is, φ̃_H (η1 ∧ η2 ) := φ_H (pH η1 ∧ pH η2 ). Thus φ̃_H ∈ Ω^2_P (g).
Lemma 3.11. Let g ∈ G. Then in Ω^2_P (g), R_g^∗ φ̃_H = Ad_{g^{-1}} φ̃_H .

So once we choose a basis for g, we can think of elements of Ω2P (g) as matrix-valued differential forms.5
The proof of Lemma 3.11 comes from the observation above: acting by g on the curve p · exp(tξ) gives
(3.12) p\cdot \exp(t\xi)\cdot g = p\cdot g\cdot (g^{-1}\exp(t\xi)g) = p\cdot g\cdot \exp(t\,\mathrm{Ad}_{g^{-1}}\xi),
which is the curve through p · g generated by Ad_{g^{-1}} ξ rather than by ξ.
So this is exactly an example of an associated bundle to P , where the G-manifold F is g with the adjoint
G-action. So associated to P is the adjoint bundle gP → X defined as P ×G g. This is a vector bundle, in
fact a bundle of Lie algebras because the adjoint action preserves the Lie bracket.
A section of gP is a function upstairs valued in g, which is exactly what φeH is.
Corollary 3.13. φ̃_H descends to a 2-form −ΩH ∈ Ω^2_X (gP ).
In this case ΩH is called the curvature of H. In particular, if X is a 4-manifold with a conformal structure,
we can ask for this to be self-dual or anti-self-dual.
In the short exact sequence
(3.14) 0\longrightarrow \underline{\mathfrak g}\longrightarrow TP\xrightarrow{\ \pi_*\ } \pi^*TX\longrightarrow 0,

a section H : π ∗ T X → T P is equivalent to a section Θ : T P → g, i.e. a form Θ ∈ Ω1P (g). This is called the
connection form, and H = ker(Θ). It has to satisfy some properties.
5Here I suppose we need to use a Lie group G that admits a faithful finite-dimensional representation, but all compact Lie
groups, and most noncompact Lie groups that you’ll encounter, have this property.

• Θ must be G-invariant: Rg∗ Θ = Adg−1 Θ. This is a linear equation inside the infinite-dimensional
vector space Ω1P (g).
• The other constraint is affine: Θ|vertical = id.
So the space AP of one-forms Θ satisfying these conditions is affine. This is the space of connections, and in
particular tells us that there are lots of connections.
We can also interpret the Frobenius tensor in terms of Θ. Let ζ1 and ζ2 be horizontal vectors, and extend
them to horizontal vector fields ζ̃1 and ζ̃2 . Then Θ(ζ̃i ) = 0 for i = 1, 2, so

(3.15) \begin{aligned} \mathrm{d}\Theta(\zeta_1, \zeta_2) &= \zeta_1\Theta(\widetilde\zeta_2) - \zeta_2\Theta(\widetilde\zeta_1) - \Theta([\widetilde\zeta_1, \widetilde\zeta_2])\\ &= -\Theta([\widetilde\zeta_1, \widetilde\zeta_2]) = -\phi_H(\zeta_1, \zeta_2). \end{aligned}

Thus we have proved

Proposition 3.16. π ∗ ΩH = −φeH = dΘ + (1/2)[Θ ∧ Θ].

The notation [Θ ∧ Θ] means: Θ ∧ Θ ∈ Ω2P (g ⊗ g), and this has a Lie bracket map [·] : Ω2P (g ⊗ g) → Ω2P (g).

Corollary 3.17 (Bianchi identity). dΩ + [Θ ∧ Ω] = 0.

Proof.
(3.18) \begin{aligned} \mathrm{d}\Omega_H &= [\mathrm{d}\Theta\wedge\Theta]\\ &= \left[\left(\Omega - \tfrac12[\Theta\wedge\Theta]\right)\wedge\Theta\right]\\ &= [\Omega\wedge\Theta] \end{aligned}

by the Jacobi identity. □
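
For readers who want the intermediate steps, the following is a sketch of the computation behind (3.18). It assumes the usual sign conventions for g-valued forms (a graded Leibniz rule and graded antisymmetry of the bracket), which these notes do not spell out, so treat the signs as conditional on those conventions.

```latex
% Conventions assumed (not stated in the notes): for \alpha \in \Omega^p_P(\mathfrak g), \beta \in \Omega^q_P(\mathfrak g),
%   \mathrm d[\alpha\wedge\beta] = [\mathrm d\alpha\wedge\beta] + (-1)^p[\alpha\wedge\mathrm d\beta],
%   [\alpha\wedge\beta] = (-1)^{pq+1}[\beta\wedge\alpha].
\begin{aligned}
\mathrm d\Omega
  &= \mathrm d\bigl(\mathrm d\Theta + \tfrac12[\Theta\wedge\Theta]\bigr)
   = \tfrac12\bigl([\mathrm d\Theta\wedge\Theta] - [\Theta\wedge\mathrm d\Theta]\bigr)
   = [\mathrm d\Theta\wedge\Theta],\\
[[\Theta\wedge\Theta]\wedge\Theta] &= 0 \quad\text{(Jacobi identity)},\\
[\Omega\wedge\Theta] &= -[\Theta\wedge\Omega]
  \quad\Longrightarrow\quad \mathrm d\Omega + [\Theta\wedge\Omega] = 0.
\end{aligned}
```

The second line is what lets the term (1/2)[[Θ ∧ Θ] ∧ Θ] drop out, and the third line converts the result into the form stated in Corollary 3.17.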

This has been more theory than examples of principal bundles, but we will see plenty of examples when
we delve into gauge theory.
Now given a principal G-bundle π : P → X with a connection, and any associated bundle FP with fiber F ,
we get a horizontal distribution. There’s a hands-on way to construct this, or you could think of it in terms
of path lifting: given an x ∈ X and a lift p ∈ P , the connection lifts a path γ : [0, 1] → X based at x to a
path γ̃ : [0, 1] → P based at p, so given an f ∈ F , we can define the path γ̂ : [0, 1] → FP by t ↦ (γ̃(t), f ).
Suppose V is a G-representation, so the associated bundle VP → X is a vector bundle. Then the
horizontal distribution we obtain on VP is tangent to the zero section of VP . Let ψ : X → VP be a section
and ξ ∈ Tx X; we would like to differentiate ψ in the direction ξ. If ψ were valued in a fixed vector space, we
could do this as usual: extend ξ to a curve γ : (−ε, ε) → X, and then define

(3.19) \nabla_\xi\psi \coloneqq \left.\frac{\mathrm d}{\mathrm dt}\right|_{t=0} \psi(\gamma(t)).

This is precisely the directional derivative. In VP , the fibers are different vector spaces, which seems like
a problem except that the connection on P defines parallel transport τt along γ for the fibers of VP , and
therefore we can define the directional derivative of ψ as

(3.20) \nabla_\xi\psi \coloneqq \left.\frac{\mathrm d}{\mathrm dt}\right|_{t=0} \tau_{-t}\psi(\gamma(t)).

This is called the covariant derivative.

Exercise 3.21. Show that this satisfies the Leibniz rule: if f is a function on X, then

(3.22) \nabla _\xi (f\cdot \psi ) = (\xi \cdot f)\psi + f(x)\nabla _\xi \psi .

In other words, the existence of the horizontal distribution is somehow telling us about the Leibniz rule,
though this is a somewhat mysterious fact.

Lecture 4.
Harmonic forms and (anti)-self-dual connections: 1/31/19

“The key to humor is. . . . . . . . . timing!”


Last time, we discussed connections on principal bundles, and what they induce on associated vector bundles.
We also briefly saw the covariant derivative associated to a connection. We begin with more on covariant
derivatives.
Definition 4.1. Let E → X be a vector bundle. A covariant derivative is a linear map ∇ : Ω0X (E) → Ω1X (E)
satisfying the Leibniz rule
(4.2) \nabla (fs) = \d f\cdot s + f\nabla s,
where f is a smooth function on X and s is a smooth section of E.
If E is a trivial bundle with constant fiber V , the usual directional derivative is a covariant derivative, but
there can be others.
We can extend ∇ to a sequence of first-order differential operators

(4.3) \xymatrix { 0\ar [r] & \Omega _X^0(E)\ar [r]^{\d _\nabla } & \Omega _X^1(E)\ar [r]^{\d _\nabla } &\Omega _X^2(E)\ar [r]^{\d _\nabla } &\dotsb }
defined by
(4.4) \d _\nabla (\omega \cdot s)\coloneqq \d \omega \cdot s + (-1)^k\omega \wedge \nabla s,
where ω ∈ ΩkX and s ∈ Ω0X (E). Thus the first map d∇ : Ω0X (E) → Ω1X (E) is just ∇.
Exercise 4.5. Show that d_∇^2 (f s) = f d_∇^2 (s).
In other words, this says the symbol of d_∇^2 vanishes; this apparently second-order operator is really a zeroth-order
operator, i.e. it is linear over functions. Therefore there exists an F∇ ∈ Ω^2_X (End E), called the curvature, such that d_∇^2 (s) = F∇ · s.
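
A sketch of the computation asked for in Exercise 4.5, offered as one possible route rather than the intended solution. It uses (4.4) with k = 1 and the Leibniz rule for E-valued 1-forms, which is itself derived from (4.2) and (4.4).

```latex
% Assumed auxiliary rule: for a function f and an E-valued 1-form \alpha,
%   \mathrm d_\nabla(f\alpha) = \mathrm df\wedge\alpha + f\,\mathrm d_\nabla\alpha.
\begin{aligned}
\mathrm d_\nabla^2(fs)
  &= \mathrm d_\nabla\bigl(\mathrm df\cdot s + f\,\nabla s\bigr)\\
  &= \bigl(\mathrm d(\mathrm df)\cdot s - \mathrm df\wedge\nabla s\bigr)
   + \bigl(\mathrm df\wedge\nabla s + f\,\mathrm d_\nabla\nabla s\bigr)\\
  &= f\,\mathrm d_\nabla^2 s.
\end{aligned}
```

The two cross terms cancel because of the sign (−1)^k in (4.4), and d(df) = 0.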
Digression 4.6. We recall what the symbol of an operator is. Let E, F → X be vector bundles and
D : Ω0X (E) → Ω0X (F ) be a differential operator. By definition, D is first-order if for every function f and
section s,
(4.7) D(fs) = \sigma (\d f)s + fDs

for some σ : T∗X ⊗ E → F , which is called the symbol of D.
Exercise 4.8. Compute d3∇ . (Answer: it’s zero.)
Now we have two notions of curvature: the curvature associated to a covariant derivative as above, and
the curvature associated to a principal bundle with connection and an associated vector bundle.
Exercise 4.9. Let G be a Lie group, π : P → X be a principal G-bundle with connection Θ ∈ Ω1P (g), and
ρ : G → Aut(E) be a linear representation of G. Let E := EP = P ×G E → X be the associated bundle, which
carries a covariant derivative ∇ : Ω0X (E) → Ω1X (E). Compute d2∇ in terms of Ω = dΘ + (1/2)[Θ ∧ Θ].
Example 4.10. Let’s think about connections on a principal T-bundle.6 Consider C2 with coordinates z 0 , z 1
and metric
(4.11) \langle (z^0, z^1), (w^0, w^1)\rangle \coloneqq \overline{z^0}w^0 + \overline{z^1}w^1.
The circle group T acts on S³ ⊂ C² on the right by (z⁰, z¹) · λ := (z⁰λ, z¹λ). This is a free action, so its
quotient is a smooth manifold, specifically CP¹ ≅ S², the manifold of complex lines through the origin in C².
Thus we obtain a principal T-bundle π : S³ → CP¹, called the7 Hopf bundle.

Now let’s put a connection on π. We want a horizontal distribution on the total space S 3 . Inside T(z0 ,z1 ) S 3 ,
there’s a one-dimensional subspace of vectors in the direction of the fiber {(z 0 , z 1 ) · λ}. The standard
Riemannian metric on C2 = R4 allows us to choose a complementary line at each point, which is a horizontal
distribution. Because T acts by isometries, this is an invariant distribution, hence a connection.
6Here T ⊂ C× is the group of unit-magnitude complex numbers, sometimes also denoted U_1 or S¹.
7Well, there’s more than one Hopf bundle, and we’ll see some others later, but this is the first example.

This is all pretty and geometric, but we need to compute the connection form Θ ∈ Ω1S 3 (iR) (the Lie algebra
of T is a line with trivial bracket, and is more canonically iR). Specifically,

(4.12) \Theta = \operatorname{Im}\left(\overline{z^0}\,\mathrm{d}z^0 + \overline{z^1}\,\mathrm{d}z^1\right).

In the vertical direction we want Θ = id. The vertical vector at (z⁰, z¹) is d/dt|_{t=0} (z⁰e^{it}, z¹e^{it}) = (iz⁰, iz¹). Looking inside the complexified tangent bundle (a
four-dimensional complex vector bundle), which has basis {∂_{z⁰}, ∂_{z̄⁰}, ∂_{z¹}, ∂_{z̄¹}}, this vector is

(4.13) iz^0\frac{\partial}{\partial z^0} - i\overline{z^0}\frac{\partial}{\partial\overline{z^0}} + iz^1\frac{\partial}{\partial z^1} - i\overline{z^1}\frac{\partial}{\partial\overline{z^1}}.

So on vertical vectors, this is the identity. One (you) can check that on a vector normal to S 3 , this vanishes –
this is just linear algebra over the complex numbers, so nothing too intimidating.
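
The two checks mentioned here can be sketched as follows. This evaluates the complex 1-form inside (4.12) on the vertical vector (iz⁰, iz¹) and on the normal vector (z⁰, z¹) at a point of S³; identifying the resulting imaginary part with the generator of the Lie algebra of T is an assumption about conventions.

```latex
% At a point of S^3 we have |z^0|^2 + |z^1|^2 = 1.
\begin{aligned}
\bigl(\overline{z^0}\,\mathrm dz^0 + \overline{z^1}\,\mathrm dz^1\bigr)(iz^0, iz^1)
  &= i\bigl(|z^0|^2 + |z^1|^2\bigr) = i,\\
\bigl(\overline{z^0}\,\mathrm dz^0 + \overline{z^1}\,\mathrm dz^1\bigr)(z^0, z^1)
  &= |z^0|^2 + |z^1|^2 = 1.
\end{aligned}
% The first value has imaginary part 1; the second is real, so its imaginary part is 0.
```

So taking imaginary parts gives 1 on the vertical generator and 0 on the normal direction, as claimed.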
Next we’d like to see
(4.14) \Omega = \mathrm{d}\Theta = \operatorname{Im}\left(\overline{\mathrm{d}z^0}\wedge\mathrm{d}z^0 + \overline{\mathrm{d}z^1}\wedge\mathrm{d}z^1\right),

though this is already imaginary, so we can remove the ‘Im’ in front. You can check this descends to CP1 .
It’s a 2-form on C2 , visibly of type (1, 1), and we restrict it to S 3 ; the claim is that there’s a form on CP1
whose pullback by π is Ω|S 3 . This involves verifying two things: that Ω is T-invariant, and that it’s trivial in
the vertical direction. This is a good practice computation.
Let Ω also denote the form on CP¹: Ω ∈ Ω^2_{CP¹} (iR). We claim
(4.15) \int_{\mathbb{CP}^1} \frac{i}{2\pi}\Omega = 1.

To compute this, we need some coordinates on CP¹. We’ll construct a section s of π over CP¹ \ {∞} ≅ C.
Specifically, given z ∈ C, which we think of as [z : 1] ∈ CP¹, let

(4.16) s(z) = \frac{(z, 1)}{\sqrt{1 + |z|^2}}.

The term in the denominator means that the function decays at infinity in C, so we expect this integral to
converge. (But you should still do it!)
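
One way to carry out this integral is sketched below. It treats θ := Im(z̄⁰ dz⁰ + z̄¹ dz¹) as a real-valued 1-form and uses polar coordinates z = ρe^{iφ} on C; the factor of i relating iR and R, and hence the overall sign in (4.15), depends on conventions that are assumed rather than fixed here.

```latex
% Notation introduced for this sketch: r = (1+|z|^2)^{-1/2}, so s(z) = (rz, r); z = \rho e^{i\varphi}.
\begin{aligned}
s^*\bigl(\overline{z^0}\,\mathrm dz^0 + \overline{z^1}\,\mathrm dz^1\bigr)
  &= r^2\,\overline z\,\mathrm dz + r\bigl(1+|z|^2\bigr)\,\mathrm dr,\\
s^*\theta
  &= \frac{\operatorname{Im}(\overline z\,\mathrm dz)}{1+|z|^2}
   = \frac{\rho^2\,\mathrm d\varphi}{1+\rho^2},\\
\mathrm d(s^*\theta)
  &= \frac{2\rho}{(1+\rho^2)^2}\,\mathrm d\rho\wedge\mathrm d\varphi,\\
\int_{\mathbb C}\mathrm d(s^*\theta)
  &= 2\pi\int_0^\infty \frac{2\rho\,\mathrm d\rho}{(1+\rho^2)^2}
   = 2\pi\Bigl[-\frac{1}{1+\rho^2}\Bigr]_0^\infty = 2\pi.
\end{aligned}
```

Dividing by 2π gives magnitude 1, consistent with (4.15) up to the sign and factor-of-i conventions noted above.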
Consider a more general principal T-bundle π : P → X, where X is a smooth manifold. Is it a pullback
of the Hopf bundle by a map X → CP1 ? This need not be true, but something weaker is. Consider the
generalized Hopf bundle S 2N +1 → CPN , defined in the same way as the Hopf bundle.
Theorem 4.17. Every principal T-bundle P over a smooth manifold X arises as a pullback of a Hopf bundle
S 2N +1 → CPN for some N .
We can choose N independent of P , but it will depend on X. So in general you can think of pulling back
from CP∞ .
Proof sketch. A pullback is a T-equivariant map ϕ : P → S^{2N+1}; the quotient by T defines a map X → CP^N
satisfying the theorem. But this is equivalent data to a section of the associated bundle S^{2N+1}_P → X. This is
good: there are tools in topology for constructing sections. First, using an approximation theorem, one shows
that it suffices to find a continuous section. Then, one uses obstruction theory: choose a CW structure on X
and a q-cell D → X. We’d like to extend a section over this cell; since D is contractible, it’s equivalent to ask
that the map S^{q−1} = ∂D → S^{2N+1}_P is trivial (up to homotopy). This is a question about homotopy groups,
and for N large enough, the relevant homotopy group vanishes. □

So the next question is: can we construct universal connections Θuniv on these Hopf bundles such that
every connection arises as a pullback? This is finickier. Supposing it exists, and ϕ : (P, Θ) → (S 2N +1 , Θuniv ),
then since connections form an affine space, there’s an α ∈ Ω1X (iR) such that
(4.18) ϕ∗ Θuniv − Θ = π ∗ α,
and hence
(4.19) ϕ∗ Ωuniv − Ω = dα.

This therefore implies dΩ = 0, where Ω ∈ Ω^2_X (iR), so it has a de Rham cohomology class [iΩ/2π] ∈ H^2_dR (X).
This is the pullback of a class (c1 )R ∈ H^2_dR (CP^N ). We can see this class explicitly; CP^N has a very simple
CW structure with one cell in each even dimension. Therefore the cochain complex for CW cohomology with
Z coefficients looks like Z → 0 → Z → 0 → Z → · · · , and we claim c1 is the generator8 of H 2 (CPN ; Z). Then
there’s an argument for why these two agree, namely just calculate on CPN , and this is the beginning of
Chern-Weil theory, relating curvature and characteristic classes.
Remark 4.20. There’s a similar story for higher Chern classes, but it’s sufficiently complicated that
it’s generally easier to calculate using the splitting principle to split a vector bundle as a direct sum of line
bundles.
Let’s come back to 4-manifolds and self-duality: we let X be an oriented 4-manifold with a conformal
structure [g]. This is enough to define the Hodge star ⋆ : Ω^2_X → Ω^2_X , which squares to the identity. Tensoring
with a vector bundle allows us to define ⋆ : Ω^2_X (E) → Ω^2_X (E) for any vector bundle E → X, which also
squares to the identity; therefore we can also define self-dual and anti-self-dual forms valued in E in the same
way.
Definition 4.21. Let P → X be a principal G-bundle with connection Θ and Ω ∈ Ω^2_X (gP ) be the associated
curvature form. We say Θ is self-dual (resp. anti-self-dual) if ⋆Ω = Ω (resp. ⋆Ω = −Ω).
As we discussed in the first lecture, this is the four-dimensional analogue of a two-dimensional question on
oriented, conformal surfaces: whether a function (form, . . . ) is holomorphic or antiholomorphic. The sign
isn’t all that intrinsic: changing the orientation on X changes it.
Anti-self-dual connections have been of interest to physicists since the 1970s, beginning with work of Polyakov
and others looking at flat space. Uhlenbeck produced a condition guaranteeing that solutions to ⋆Ω = −Ω
extend over S⁴, and later Atiyah, Bott, Hitchin, and Singer claimed there are more solutions, and used
algebraic geometry to produce them. We will study more of this story in this class, but first some examples.
The simplest case is G = T. Often this is called “the” abelian case, though there are certainly other abelian
Lie groups, such as T². Anyways, in this case Ω lives in Ω^2_X (iR), dΩ = 0, and if ⋆Ω = ±Ω, then d⋆Ω = ±dΩ = 0, which is
equivalent to d∗Ω = 0. Together these imply that Ω is a harmonic form if X is closed.
Digression 4.22. Let M be a Riemannian manifold (though for just dimension 4, we’re only going to need
the conformal class of the metric.) For example, we could take M = En , which denotes Rn with the standard
Riemannian metric. Then the Laplacian is

(4.23) \Delta \coloneqq -\left(\frac{\partial^2}{(\partial x^1)^2} + \dotsb + \frac{\partial^2}{(\partial x^n)^2}\right).

Why the minus sign? This has a discrete spectrum, and we’d like it to be nonnegative rather than nonpositive.
The de Rham derivative has the form
(4.24) \mathrm{d} = \varepsilon(\mathrm{d}x^i)\frac{\partial}{\partial x^i},

where ε denotes exterior multiplication (which is its symbol). Using the metric, the formal adjoint is

(4.25) \mathrm{d}^* = -\iota(\mathrm{d}x^i)\frac{\partial}{\partial x^i}

(whose symbol is −ι; here ι is interior multiplication). Then you can check that ∆ = dd∗ + d∗ d.
Now we can bring this to any Riemannian manifold M : we know what d is, and can define d∗ by integrating
by parts to construct the formal adjoint of d, or construct it locally. But, for the same reason that interior
multiplication requires a metric, d∗ depends on the metric. And therefore we can define the Laplacian ∆ on
M to be dd∗ + d∗ d. This means the analogue of (4.23) on M in local coordinates (x1 , . . . , xn ) is

(4.26) \Delta = -\sum_{i,j=1}^{n} g^{ij}\frac{\partial^2}{\partial x^i\partial x^j} + \text{(lower-order terms)}.

Here gij := ⟨∂i , ∂j ⟩, and g^{ij} is the (components of the) inverse to the matrix (gij )i,j . If you haven’t seen this
before, it’s good to work it out.
8We need to pick a sign, but this is determined by the canonical orientation of CPN coming from the complex structure.

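Now suppose M is closed and ∆ω = 0. Then

\begin{aligned} 0 &= \langle \Delta\omega, \omega\rangle_{L^2}\\ &= \int_M \left(\langle \mathrm{d}\mathrm{d}^*\omega, \omega\rangle + \langle \mathrm{d}^*\mathrm{d}\omega, \omega\rangle\right)\mathrm{dvol}\\ &= \int_M \left(\langle \mathrm{d}^*\omega, \mathrm{d}^*\omega\rangle + \langle \mathrm{d}\omega, \mathrm{d}\omega\rangle\right)\mathrm{dvol}, \end{aligned}

so dω = 0 and d∗ ω = 0. In fact, the converse is true.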
Theorem 4.27. On a closed Riemannian manifold, dω = 0 and d∗ ω = 0 iff ∆ω = 0.
Such a form ω is called harmonic. The space of harmonic k-forms is denoted H^k_M (g) ⊂ Ω^k_M . Elliptic theory9
shows this is finite-dimensional, and in fact more is true.
Theorem 4.28 (Hodge decomposition). There is a splitting
Ω^k_M ≅ H^k_M (g) ⊕ Im(d) ⊕ Im(d∗ ),
where the closed forms are exactly the summand H^k_M (g) ⊕ Im(d).
Since harmonic forms are closed, there’s a projection H^k_M (g) → H^k_dR (M ), and in fact this is an isomorphism!
So every cohomology class has a unique harmonic representative.
And now back to 4-manifolds. If X is an oriented Riemannian 4-manifold, we have H^2_X (g) ≅ H^2 (X; R), and
H^2_X (g) has two distinguished subspaces: the self-dual forms H^2_+ (g) and the anti-self-dual forms H^2_− (g). These
are complementary subspaces, so every harmonic 2-form ω decomposes as a sum ω = ω+ + ω− , where ω± ∈ H^2_± (g).
Explicitly,
(4.29) \omega _\pm = \frac {\omega \pm {\star }\omega }{2}.
All of this depended on the metric, so we can ask how this changes as the metric moves, which involves some
Sard-Smale theory, as we discussed in Morse theory last semester. But Chern-Weil theory tells us that if Ω
comes from a connection on a principal T-bundle, then iΩ/2π defines an integer-valued cohomology class.
Therefore self-dual or anti-self-dual connections are the intersection of an integer lattice in H² with two
lines H^2_± (g). Generically, this has no solutions, unless one of H^2_± (g) is zero (so all forms are self-dual, or are
anti-self-dual). Perhaps that’s a little disappointing.
To study this, we’ll look at the intersection form, a symmetric bilinear form on H² (X; Z) sending
c1 , c2 ↦ ⟨c1 ⌣ c2 , [X]⟩. Let b_2^+ (resp. b_2^−) denote the dimension of the largest subspace of H² (X; R) on which this form is
positive (resp. negative) definite. Then b_2^+ + b_2^− = b_2 (X), and their difference is the signature. We’ll put conditions on
b_2^± which make it possible to find (anti)-self-dual connections.
Example 4.30.
(1) On S⁴, b_2 = 0, so b_2^± = 0. So no self-dual forms here.
(2) On CP², b_2^+ = 1 and b_2^− = 0. In this case, self-dual forms exist! Hooray.
(3) But on a K3 surface, b_2^− = 19 and b_2^+ = 3, so no self-dual forms generically.
This is a little annoying. Maybe we should work with a different Lie group.
The next simplest example is SU2 = Sp1 . Associated to it is another Hopf bundle: Sp1 acts on S⁷ ⊂ H²,
as (right) multiplication by unit quaternions, and the quotient is HP¹ ≅ S⁴. We can use this to follow the
same story as above, defining a connection geometrically and so on.
Lecture 5.
The Yang-Mills functional: 2/5/19

Let X be an oriented, conformal 4-manifold, P → X be a principal G-bundle, and Θ ∈ Ω1P (g) be a


connection. We will study gauge theory in this situation; sometimes we will use a Riemannian metric in the
conformal class for X.
9We’ll use some elliptic theory later this semester, and will therefore go over some of the ingredients that you’d use to prove
this.

In gauge theory, people typically use slightly different notation.


• The connection Θ is usually denoted A.
• Its curvature is denoted F = FA = dA + (1/2)[A ∧ A] ∈ Ω2P (g) – but FA is also used to denote the
curvature form on X, FA ∈ Ω2X (gP ).
Definition 5.1. We say that A is self-dual, resp. anti-self-dual, if ⋆FA = FA , resp. ⋆FA = −FA .
These are first-order nonlinear PDEs in A. Let’s say something about where they come from.

Hodge theory and minimization. Let (M, g) be a closed, oriented Riemannian n-manifold.10 Given a
cohomology class c ∈ H k (M ; R), what’s the “best” differential form representative for c? That is, what’s the
“best” ω ∈ ΩkM with [ω] = c?
Well, what does “best” mean? Maybe smallest-norm: let’s ask for an ω which minimizes

(5.2) f\colon \omega\mapsto \int_M \|\omega\|^2\,\mathrm{dvol}_g = \int_M \omega\wedge{\star}\omega

such that [ω] = c.


Remark 5.3. If M is not oriented, then we don’t have a volume form, and ω ∧ ⋆ω is a density. Asking to
minimize this norm still makes sense.
Fix an ω0 such that [ω0 ] = c. On the affine space ω0 + dΩ^{k−1}_M , consider the function

(5.4) f(\omega _0 + \d \eta ) = \int _M (\omega _0 + \d \eta )\wedge {\star }(\omega _0 + \d \eta ).

This is a quadratic function on a real affine space; restricted to any line we know what it looks like – a parabola. So we can find
the unique minimum where the derivative of f is zero. The derivative is

(5.5) \d f_{\omega _0}(\d \eta ) = 2\int _M \d \eta \wedge {\star }\omega _0 = \pm 2\int _M \eta \wedge \d {\star }\omega _0,

using Stokes’ theorem, since M is closed. The output equations are


\begin {aligned} \d {\star }\omega _0 &= 0\\ \d \omega _0 &=0. \end {aligned}
(5.6)

These are the Euler-Lagrange equations for this problem.11 They’re satisfied iff ∆ω0 = 0, where ∆ := dd∗ +d∗ d
is the Laplacian. Solutions to ∆ω0 = 0 are called harmonic forms.
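
The derivative in (5.5) can be unpacked as in the sketch below, where η ∈ Ω^{k−1}_M and t ∈ R parametrize a line through ω0. The only inputs are the symmetry of the pairing (α, β) ↦ ∫ α ∧ ⋆β on k-forms and Stokes’ theorem; the explicit sign (−1)^k is my own bookkeeping for the ± in (5.5).

```latex
\begin{aligned}
f(\omega_0 + t\,\mathrm d\eta)
  &= f(\omega_0) + 2t\int_M \mathrm d\eta\wedge{\star}\omega_0
   + t^2\int_M \mathrm d\eta\wedge{\star}\mathrm d\eta,\\
\mathrm d(\eta\wedge{\star}\omega_0)
  &= \mathrm d\eta\wedge{\star}\omega_0 + (-1)^{k-1}\,\eta\wedge\mathrm d{\star}\omega_0,\\
\int_M \mathrm d\eta\wedge{\star}\omega_0
  &= (-1)^{k}\int_M \eta\wedge\mathrm d{\star}\omega_0
  \qquad\text{(Stokes, since } \partial M = \varnothing\text{)}.
\end{aligned}
```

The linear term in t vanishes for every η exactly when d⋆ω0 = 0, while dω0 = 0 holds because ω0 represents a cohomology class; together these are (5.6).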
Lemma 5.7. On a closed manifold M , ω is harmonic iff dω = 0 and d⋆ω = 0.
Proof sketch. Suppose ω is harmonic. Then
(5.8) 0 = \int_M \Delta\omega\wedge{\star}\omega
(5.9) = \int_M \left(\mathrm{d}\mathrm{d}^*\omega\wedge{\star}\omega + \mathrm{d}^*\mathrm{d}\omega\wedge{\star}\omega\right)
(5.10) = \int_M \left(\|\mathrm{d}^*\omega\|^2 + \|\mathrm{d}\omega\|^2\right)\mathrm{dvol}_g.
Since this is the integral of a nonnegative function, that function must be 0 everywhere, so we conclude that
d∗ ω = 0 and dω = 0. To get from d∗ to d⋆, use the fact that the formal adjoint of d, namely d∗ , is also ±⋆d⋆,
where the sign depends on the degree of ω and the dimension of M .
The other direction is up to you. □

Now let’s apply this to connections on 4-manifolds. Let AP denote the affine space of connections on the
principal G-bundle P → X.

10One can generalize to open manifolds, but then one needs some vanishing or growth conditions at infinity, or a boundary
condition. We’re not going to worry about this in this motivational section.
11Well, this is a little silly in this setting, since all we did is take a derivative. But in general they’re more involved.

Definition 5.11. The Yang-Mills functional Y : AP → R is

(5.12) Y(A)\coloneqq \int_X \|F_A\|^2\,\mathrm{dvol}_g.

This looks nice and all that, but we haven’t yet defined everything: we need to make sense of the norm on
Ω2P (g). We’ll come back to this.

Example 5.13. Suppose G = T, so g = iR. Then FA ∈ Ω2X (iR), and we know how to take the norm of
these kinds of differential forms: the Yang-Mills functional is

(5.14) Y(A) = -\int _X F_A\wedge {\star }F_A.

We can decompose FA into its self-dual and anti-self-dual pieces FA± : FA = FA+ + FA− , and then ⋆FA = FA+ − FA− .
Thus we can rewrite the Yang-Mills functional as12
(5.15) Y(A) = \int_X \Bigl(\underbrace{F_A^-\wedge F_A^-}_{\ge 0} - \underbrace{F_A^+\wedge F_A^+}_{\le 0}\Bigr)
(5.16) \ge \int_X \left(F_A^-\wedge F_A^- + F_A^+\wedge F_A^+\right)
(5.17) = \int_X F_A\wedge F_A
(5.18) = -4\pi^2\langle c_1(P)^2, [X]\rangle.

Here c1 (P ) = (i/2π)[FA ] ∈ H^2_dR (X) is the first (well, only) Chern class of the principal T-bundle P . So we
obtain a lower bound on the Yang-Mills functional in terms of characteristic classes. Equality is achieved
exactly when FA+ ∧ FA+ = 0, i.e. A is anti-self-dual.
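
The algebra behind the passage from (5.14) to (5.17) can be sketched as follows. It assumes only that FA± are the ±1 eigencomponents of ⋆, that 2-forms commute under the wedge product, and that α ∧ ⋆β = β ∧ ⋆α for 2-forms.

```latex
\begin{aligned}
F_A\wedge{\star}F_A
  &= (F_A^+ + F_A^-)\wedge(F_A^+ - F_A^-)
   = F_A^+\wedge F_A^+ - F_A^-\wedge F_A^-,\\
F_A^+\wedge F_A^-
  &= -F_A^+\wedge{\star}F_A^-
   = -F_A^-\wedge{\star}F_A^+
   = -F_A^-\wedge F_A^+
   = -F_A^+\wedge F_A^-
  \;\Longrightarrow\; F_A^+\wedge F_A^- = 0,\\
F_A\wedge F_A
  &= F_A^+\wedge F_A^+ + F_A^-\wedge F_A^-.
\end{aligned}
```

The first line accounts for the two terms in the rewritten functional, and the vanishing cross term is what makes the sum of the two pieces equal to FA ∧ FA.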
Next, what are the critical points of Y ? The differential is

(5.19) \d Y_A(\dot A) = -2\int _X \dot A\wedge \d {\star } F_A,

where A ∈ AP and Ȧ ∈ Ω^1_X (iR) (a variation of A), so the critical points are those such that d⋆FA = 0. The
Bianchi identity says that, in addition, dFA = 0, so the critical points have harmonic curvature forms. These
two PDEs are second-order (curvature is a derivative, and then we take one more derivative), but linear.

Definition 5.20. The Yang-Mills equations are d⋆FA = 0 and dFA = 0.

Suppose b_2^+ (X) > 0. Then for generic g, the minimum of Y is not realized: we’re trying to intersect a
line and a lattice. But we can get arbitrarily close, producing a sequence of connections approaching the
minimum.

What changes for G more general? First we need a G-invariant inner product ⟨–, –⟩_g on g; this will induce
an inner product ⟨–, –⟩_{gP} on Ω^2_X (gP ): we think of ω, η ↦ ω ∧ η ∈ Ω^4_X (g ⊗ g), then use ⟨–, –⟩_g to get from
g ⊗ g to R, and then integrate to get a number.
G acts on g by the adjoint action, so we want something invariant under this action. For compact G such a form always
exists: for example, if G is compact and semisimple you can take minus the Killing form. Then the Yang-Mills functional is

(5.21) Y(A) = \int_X \langle F_A\wedge{\star}F_A\rangle_{\mathfrak{g}_P},

12If the signs look weird, keep in mind FA is imaginary. But there may yet be sign errors.

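and the above calculation generalizes to

(5.22) Y(A) = \int_X \Bigl(\underbrace{\langle F_A^-\wedge F_A^-\rangle_{\mathfrak{g}_P}}_{\ge 0} - \underbrace{\langle F_A^+\wedge F_A^+\rangle_{\mathfrak{g}_P}}_{\le 0}\Bigr)
(5.23) \ge \int_X \left(\langle F_A^-\wedge F_A^-\rangle_{\mathfrak{g}_P} + \langle F_A^+\wedge F_A^+\rangle_{\mathfrak{g}_P}\right)
(5.24) = \int_X \langle F_A\wedge F_A\rangle_{\mathfrak{g}_P}
(5.25) = 4\pi^2\langle c(P), [X]\rangle,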
where c(P ) is some degree-4 characteristic class for principal G-bundles. In particular, this is constant, so we
once again obtain a lower bound.
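Exercise 5.26. Show that
dY_A(\dot A) = -2\int_X \langle \dot A\wedge \mathrm{d}_A{\star}F_A\rangle_{\mathfrak{g}_P},
where Ȧ ∈ Ω^1_X (gP ) is a variation of A.
Proof. First, let’s differentiate the curvature operator F : AP → Ω^2_X (gP ).
(5.27) \mathrm{d}F_A(\dot A) = \left.\frac{\mathrm d}{\mathrm dt}\right|_{t=0} F(A + t\dot A)
(5.28) = \left.\frac{\mathrm d}{\mathrm dt}\right|_{t=0}\left(\mathrm{d}(A + t\dot A) + \frac12[(A + t\dot A)\wedge(A + t\dot A)]\right)
(5.29) = \mathrm{d}\dot A + [A\wedge\dot A] = \mathrm{d}_A\dot A.
We’ll use this to differentiate the Yang-Mills functional, starting with (5.21). Then
(5.30) \mathrm{d}Y_A(\dot A) = 2\int_X \langle \mathrm{d}_A\dot A\wedge{\star}F_A\rangle
(5.31) = -2\int_X \langle \dot A\wedge\mathrm{d}_A{\star}F_A\rangle,
using the Leibniz rule
(5.32) \mathrm{d}\langle\omega\wedge\eta\rangle = \langle\mathrm{d}_A\omega\wedge\eta\rangle \pm \langle\omega\wedge\mathrm{d}_A\eta\rangle
where ω, η ∈ Ω^∗_X (gP ). □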

Finding the critical points of the Yang-Mills functional is now a nonlinear second-order PDE. This is
part of a very general story, including work of many people. One doesn’t have to restrict to dimension 4;
for example, Atiyah and Bott studied the Yang-Mills functional on Riemann surfaces and the topology of
(certain equivalence classes of) the space of connections and its relationship with the algebraic geometry of
the Riemann surface.
For the rest of today’s lecture, we’ll discuss some more classical geometric preliminaries, beginning with
some cool facts about the conformal group. Let W be a finite-dimensional real vector space,13 and suppose
⟨–, –⟩ : W × W → R is a nondegenerate, symmetric bilinear form. We aren’t assuming it’s positive definite,
e.g. it could be Lorentz. Inside W , we have the null cone NW of vectors ξ ∈ W with ⟨ξ, ξ⟩ = 0. This is a
cone because if ξ ∈ NW , then tξ ∈ NW too. If the form is definite, the null cone is just {0}.
Let QW denote the image of NW \ 0 under the quotient W \ 0 ↠ PW by R× . Then QW is a compact real
manifold, a real quadric, and carries a natural conformal structure. Suppose ℓ ∈ QW , meaning it’s a line in
the null cone. Then Tℓ QW ⊂ Tℓ PW = Hom(ℓ, W/ℓ), and under this identification, Tℓ QW = Hom(ℓ, ℓ⊥ /ℓ):
since ℓ is null, ℓ ⊂ ℓ⊥ , and in fact ℓ⊥ is a codimension-one subspace of W : it’s cut out by one equation.
The bilinear form ⟨–, –⟩ descends to ℓ⊥ /ℓ × ℓ⊥ /ℓ → R, and therefore defines a conformal structure on
Hom(ℓ, ℓ⊥ /ℓ) ≅ ℓ⊥ /ℓ ⊗ ℓ∗ . This obviously varies smoothly in ℓ, hence defines a conformal structure on QW .
Suppose, for example, W = V ⊕ H is an orthogonal direct sum, where ⟨–, –⟩V is an inner product and H
is a hyperbolic plane, so it has signature (1, 1) and exactly two null lines. Then PH is a circle, and the null cone
13We could work with complex vector spaces, or infinite-dimensional vector spaces.

is two points inside it. The entire quadric cone splits as QW = V ⨿ NV ⨿ QV . How does this work? Inside
PW = {[v : s : t]}, V = {[v : −‖v‖² : 1]}, NV = {[v : 1 : 0]}, and QV = {[v : 0 : 0]}.
If V is n-dimensional, then NV is (n − 1)-dimensional and QV is (n − 2)-dimensional. Therefore QW is a
compactification of V : we’ve added strata of codimensions one and two to it, and picked up a conformal
structure.
Example 5.33. Suppose ⟨–, –⟩ is an inner product. Then NV is a point and QV is the empty set, so QW is
diffeomorphic to S^n with the conformal structure induced from the usual metric.14 In this case, the symmetry
group is O(V ), and with an orientation we can get SO(V ).
Example 5.34. More generally, let V = Rp,q , where p + q = n, with the indefinite-signature form

(5.35) \langle\xi,\eta\rangle = \sum_{i=1}^{p} \xi^i\eta^i - \sum_{i=p+1}^{n} \xi^i\eta^i.

For example, H = R^{1,1}. In this case, W ≅ R^{p+1,q+1} = R^{p+1} ⊕ R^{q+1} (though the inner product on the second
has a minus sign put in front). If ξ ∈ R^{p+1} and η ∈ R^{q+1}, then (ξ, η) ∈ NW iff ‖ξ‖ = ‖η‖, and rescaling so that both norms equal 1 shows that QW is
diffeomorphic to (S^p × S^q )/{±1}, and PO(W ) acts via conformal transformations on QW . Inside this we have
O(V ) ≅ Op,q . So this says that the conformal symmetries in signature (p, q) are the orthogonal symmetries
in signature (p + 1, q + 1).
For example, the group of conformal symmetries of R4 with positive-definite inner product is O5,1 ,
which acts on S 4 , the conformal compactification of R4 . The conformal group of R2 , sitting inside its
compactification S², is O3,1 ; with an orientation we get SO3,1 ≅ PSL2 (C), one of the low-dimensional
exceptional isomorphisms of Lie groups. These act by Möbius transformations. Analogously, there’s a special
isomorphism SO5,1 ≅ PSL2 (H), allowing us to get at conformal symmetries in dimension 4 via the quaternions.
Next time we’ll apply this to the self-duality equations in dimension 4.
Lecture 6.
Spinors in low dimensions and special isomorphisms of Lie groups: 2/7/19

“I’ll tell a story for three minutes and then you can go.”
Today we’ll continue discussing preliminaries to the ADHM construction, sometimes using them to introduce
interesting nearby ideas in their own right. Today we’ll discuss exceptional isomorphisms of Lie groups
in low dimensions, which are applicable in other cases. For example, if you care about fermions in, e.g.
supersymmetric quantum field theories in dimensions 6 or below, these ideas appear.
Fix a 4-dimensional complex vector space S, which could be C4 . We can take exterior powers Λ2 S, Λ3 S,
and Λ4 S = Det S, and similarly for S∗ . Choose a volume form µ ∈ Det S∗ \ 0; we want to consider the
symmetries of (S, µ). A linear map T : S → S has an associated volume det T , so we’re essentially asking for
automorphisms with determinant 1. This is literally true for S = C4 , in which case Aut(S, µ) = SL4 (C).
Now given such an automorphism T , we obtain an automorphism Λ2 T : Λ2 S → Λ2 S. The condition that T
preserves the volume form translates into the statement that Λ²T preserves the bilinear pairing B : Λ²S × Λ²S → C sending
(6.1) x,y\mapsto \langle\mu, x\wedge y\rangle_{\operatorname{Det}\mathbb{S}^*, \operatorname{Det}\mathbb{S}}.
This is symmetric and nondegenerate, hence an inner product, so Λ²T preserves this inner product! Therefore
the map φ = Λ² : Aut(S, µ) → Aut(Λ²S, B) amounts to a homomorphism SL4 (C) → O6 (C).
You can directly check that {±1} ⊂ ker(φ). Since SL4 (C) is connected, the image of φ is contained in
SO6 (C).
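
The symmetry and invariance of B, together with the dimension count that makes Claim 6.2 plausible, can be sketched as follows. Here T ∈ Aut(S) and x, y ∈ Λ²S; the dimension count is added for orientation and is not part of the argument in the notes.

```latex
\begin{aligned}
B(y, x) &= \langle\mu, y\wedge x\rangle
         = (-1)^{2\cdot 2}\langle\mu, x\wedge y\rangle = B(x, y),\\
B(\Lambda^2T\,x, \Lambda^2T\,y)
        &= \langle\mu, \Lambda^4T\,(x\wedge y)\rangle
         = \det(T)\,B(x, y),\\
\dim_{\mathbb C}\Lambda^2\mathbb S &= \binom{4}{2} = 6,\qquad
\dim_{\mathbb C}\mathrm{SL}_4(\mathbb C) = 16 - 1 = 15 = \binom{6}{2}
  = \dim_{\mathbb C}\mathrm{SO}_6(\mathbb C).
\end{aligned}
```

So B is preserved exactly when det T = 1, and the two groups have the same dimension, which is consistent with φ being a covering map onto SO6 (C).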
Claim 6.2. φ : SL4 (C) → SO6 (C) is surjective.
The proof would amount to checking that it’s an isomorphism on Lie algebras, so the image is open, and
that the image is closed, hence all of SO6 (C). One corollary is that we’ve identified the spinor representation
as S.
Moreover, since SL4 (C) is simply connected, the map φ : SL4 (C) → SO6 (C) is the nontrivial double cover
map. Therefore we have produced an isomorphism SL4 (C) ≅ Spin6 (C).

14TODO: I would like to double-check this.



Remark 6.3. We can define the complex spin groups in the same way as the real ones: for n ≥ 3, π1 SOn (C) =
Z/2, and for n = 2, π1 SO2 (C) = Z, so we can ask for the connected double cover of SOn (C), which has a
canonical Lie group structure, and define it to be Spinn (C).
There can be two different realizations of this double cover, but you can prove there’s a unique Lie group
homomorphism between them respecting the covering map, and it’s an isomorphism.
There are several other exceptional isomorphisms, and they all follow from this one.
Example 6.4. Let J : S → S be a quaternionic structure, i.e. an antilinear map such that J^2 = −idS . Then
Λk J : Λk S → Λk S is also antilinear, and squares to (−1)^k times the identity of Λk S. So on Λ2 S it defines a real structure, and on
Λ3 S it’s quaternionic.
We can impose another constraint, that (det J)∗ µ = µ, i.e. µ is real. That is, it’s in the subspace of Det S∗
fixed by det J, which is a one-dimensional real vector space. Therefore we obtain a map
(6.5) \phi \colon \Aut (\Sph , J, \mu )\longrightarrow \Aut (\Lambda ^2\Sph , B, \Lambda ^2 J).
Now suppose S = C4 . Then the codomain is O_{p,q} for some p + q = 6, since B might not be positive definite.
The domain is SL2 (H) – though you have to be careful with what this means. There’s no determinant
map GLn (H) → H× , but we can take the determinant to be a complex number (via regarding quaternionic
matrices as complex matrices), and ask for it to be 1.
Working out the details, the kernel will once again be {±1}, and this will be a double cover, so this
will identify SL2 (H) ≅ Spinp,q , an isomorphism of 15-dimensional real Lie groups. Then S is the spinor
representation again – but it’s quaternionic, so we know we can’t get Spin6 .15
To determine the signature, let {e1 , Je1 , e2 , Je2 } be a basis for S. Then we can write down a basis for
(Λ2 S)R as follows:
(6.6) \label {6basis} \begin {alignedat}{2} e_1&\wedge Je_1 \qquad \qquad \qquad \qquad & e_2&\wedge Je_2\\ e_1 &\wedge Je_2 + e_2\wedge Je_1 & i(e_1 &\wedge Je_2 - e_2\wedge Je_1)\\ e_1&\wedge e_2 + Je_1\wedge Je_2 &i(e_1 &\wedge e_2 - Je_1\wedge Je_2). \end {alignedat}

You can check that the last four are orthogonal to each other and to the first two. The first two form a hyperbolic pair (signature (1, 1)), and the last four
all self-pair to a negative number (−1 after rescaling). Therefore the signature is (1, 5), and we conclude SL2 (H) ≅ Spin1,5 .
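As a sanity check on the signature, here is the pairing computation written out. This is only a sketch: the normalization µ(e1 ∧ Je1 ∧ e2 ∧ Je2 ) = 1 and the shorthand names ν, a, b, v_1, ..., v_4 are assumptions of this example, not notation from the lecture.

```latex
% Assumption: \mu(\nu) = 1, where \nu = e_1\wedge Je_1\wedge e_2\wedge Je_2.
% Shorthand (hypothetical names): a = e_1\wedge Je_1, b = e_2\wedge Je_2.
\begin{align*}
  a\wedge a = b\wedge b = 0,\quad a\wedge b = \nu
    &\implies B(a,a) = B(b,b) = 0,\quad B(a,b) = 1\\
    &\implies B(a+b,\,a+b) = 2,\quad B(a-b,\,a-b) = -2.
\end{align*}
% For the other four vectors, note that
%   e_1\wedge Je_2\wedge e_2\wedge Je_1 = -\nu, \qquad e_1\wedge e_2\wedge Je_1\wedge Je_2 = -\nu,
% because each differs from \nu by a single transposition.
\begin{align*}
  v_1 &= e_1\wedge Je_2 + e_2\wedge Je_1, & B(v_1,v_1) &= 2\,\mu(e_1\wedge Je_2\wedge e_2\wedge Je_1) = -2,\\
  v_2 &= i(e_1\wedge Je_2 - e_2\wedge Je_1), & B(v_2,v_2) &= i^2\cdot(-2)\cdot\mu(e_1\wedge Je_2\wedge e_2\wedge Je_1) = -2,\\
  v_3 &= e_1\wedge e_2 + Je_1\wedge Je_2, & B(v_3,v_3) &= 2\,\mu(e_1\wedge e_2\wedge Je_1\wedge Je_2) = -2,\\
  v_4 &= i(e_1\wedge e_2 - Je_1\wedge Je_2), & B(v_4,v_4) &= i^2\cdot(-2)\cdot\mu(e_1\wedge e_2\wedge Je_1\wedge Je_2) = -2.
\end{align*}
```

So in the orthogonal basis a + b, a − b, v_1, ..., v_4 there is one positive direction and five negative ones, which is signature (1, 5); dividing each vector by √2 makes the self-pairings ±1.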
Taking the quotient, we also get PSL2 (H) ≅ SO^0_{1,5} 16 – in indefinite signature, special orthogonal groups
aren’t connected. Now, SO^0_{1,5} acts as the group of automorphisms of the conformal compactification of R4
with a definite metric, which gives you S 4 : PSL2 (H) is the conformal group of the 4-sphere. We can also
identify PSL2 (H) ≅ PGL2 (H) via the diffeomorphism S 4 ≅ HP1 .
So in summary, we have SL2 (H) ≅ Spin1,5 and PSL2 (H) ≅ PGL2 (H) ≅ SO^0_{1,5} .
Remark 6.7. If you want to imitate the above story but get Spin6 , you’ll want to get the maximal compact
in SL4 (C), which is SU4 . So throw out J and instead ask that your automorphisms fix a Hermitian inner
product.
Example 6.8. Now let’s introduce a symplectic form ω ∈ Λ2 S∗ , and let’s say that µ = (1/2)ω ∧ ω. We’ll
ask about the automorphisms of S that fix ω, and hence µ; if S = C4 , this group is called Sp4 (C).
Passing up to Λ2 S, such an automorphism must preserve ker(ω), which has codimension 1, and the bilinear
form B from before. The automorphisms of ker(ω) and B form the group O5 (C). As in the previous case,
the map Sp4 (C) → O5 (C) has image the identity component SO5 (C) ⊂ O5 (C), and is a double cover onto it.
Therefore we obtain an isomorphism Spin5 (C) ≅ Sp4 (C).
Example 6.9. What if you ask for automorphisms of S which preserve both J and ω? In this case we get a map
Aut(S, J, ω) → Aut(ker(ω), Λ2 J, B). A basis of ker(ω) contains the last four vectors in (6.6), but instead
of the first two we have the vector e1 ∧ Je1 − e2 ∧ Je2 , which self-pairs to −1. Therefore we’re in definite
signature, so this map is identified with the map Sp2 → O5 . Here, Sp2 is the symmetries of H2 with its
standard inner product.
As usual, this only sees SO5 ⊂ O5 , and is a double cover, providing for us an isomorphism Spin5 ≅ Sp2 .

15You can also check that Z(SL2 (H)) ≅ Z/2 but Z(Spin6 ) ≅ Z/4.
16PSL2 (H) is defined to be SL2 (H) modulo its center {±1}.

Let’s digress from the linear algebra a little bit and talk about quaternions. A general quaternion is of the
form x = x0 + x1 i + x2 j + x3 k.
Definition 6.10. The conjugate of a quaternion x as above is
x̄ := x0 − x1 i − x2 j − x3 k.
The imaginary part of x is Im(x) := x1 i + x2 j + x3 k.
Therefore x x̄ = (x0 )2 + (x1 )2 + (x2 )2 + (x3 )2 ; this is the norm squared of x. We have dx = dx0 + dx1 i +
dx2 j + dx3 k, and therefore

(6.11) \label {isimag} \d x\wedge \d \overline x = -2(\d x^0\wedge \d x^1 + \d x^2 \wedge \d x^3)i - 2(\d x^0\wedge \d x^2 - \d x^1\wedge \d x^3) j - 2(\d x^0\wedge \d x^3 + \d x^1\wedge \d x^2)k.
You’ll notice this is self-dual, and in fact the coefficients are (up to scale) a basis for the space of self-dual 2-forms. Similarly,
dx̄ ∧ dx is anti-self-dual, and its coefficients are a basis for the space of anti-self-dual forms.
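To see where (6.11) comes from, it helps to expand one component by hand. The following sketch collects the terms of dx ∧ dx̄ proportional to i, using only jk = i, kj = −i and the antisymmetry of ∧. It keeps track of the overall constant, which comes out as 2 and plays no role in self-duality. The orientation dx^0 ∧ dx^1 ∧ dx^2 ∧ dx^3 is assumed.

```latex
% dx = dx^0 + dx^1 i + dx^2 j + dx^3 k, \qquad d\bar x = dx^0 - dx^1 i - dx^2 j - dx^3 k.
% Terms of dx\wedge d\bar x proportional to i:
\begin{align*}
  (dx^0)\wedge(-dx^1\,i) &= -\,dx^0\wedge dx^1\; i,\\
  (dx^1\,i)\wedge(dx^0) &= dx^1\wedge dx^0\; i = -\,dx^0\wedge dx^1\; i,\\
  (dx^2\,j)\wedge(-dx^3\,k) &= -\,dx^2\wedge dx^3\; jk = -\,dx^2\wedge dx^3\; i,\\
  (dx^3\,k)\wedge(-dx^2\,j) &= -\,dx^3\wedge dx^2\; kj = dx^3\wedge dx^2\; i = -\,dx^2\wedge dx^3\; i.
\end{align*}
% Sum: -2\,(dx^0\wedge dx^1 + dx^2\wedge dx^3)\, i.
% With the assumed orientation, \star(dx^0\wedge dx^1) = dx^2\wedge dx^3 and
% \star(dx^2\wedge dx^3) = dx^0\wedge dx^1, so this coefficient is self-dual.
```

The j- and k-components work the same way, with ki = j, ik = −j and ij = k, ji = −k.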
Example 6.12. Continuing Example 6.9, consider S 7 ⊂ H2 , which is preserved by Sp2 = Spin5 . Restricting
to H2 \ 0, we have a projection down to HP1 ≅ S 4 , the quaternionic projective line.17 This restricts to a
submersion π : S 7 → S 4 . Two points in S 7 have the same image iff they differ by the action of a unit-norm quaternion.
The group of unit-norm quaternions is Sp1 = SU2 = Spin3 . Since S 7 = Sp2 /Sp1 , S 4 ≅ Sp2 /(Sp1 × Sp1 ). In
other words, π : S 7 → S 4 is a principal Sp1 -bundle, and it is homogeneous for the left Sp2 -action. This is
another example of a Hopf fibration.18
Now who acts on S 4 ? Well, SO5 acts by isometries. The whole situation is invariant under the isometry
group, so we can move the total space around by isometries of the base, and therefore SO5 lifts to the
Spin5 -action we already identified on S 7 . Containing the isometries, there’s a bigger group SO^0_{1,5} of conformal
transformations. We’d like to ask whether these lift to conformal transformations of S 7 . The identification
S 4 ≅ HP1 carries SO5 to PSp2 and SO^0_{1,5} to PSL2 (H), and this lifts to the SL2 (H)-action on H2 .
Now we introduce connections, mimicking the basic story we already saw over C in §4. We want to consider
a connection Θ for S 7 → S 4 regarded as an Sp1 -bundle, i.e. an Sp1 -invariant horizontal distribution. Specifically,
you can check that
(6.13) \label {S7conn} \Theta \coloneqq -\Im \paren {q^0\,\overline {\d q^0} + q^1\,\overline {\d q^1}}
is a connection, by checking it’s orthogonal to orbits everywhere. Then Ω_{(0,1)} , the curvature at the point (0, 1), restricted to the horizontal
distribution, is − Im(dq^0 ∧ \overline{dq^0}). But from the calculation in (6.11), we already know this is imaginary, so
Ω_{(0,1)} |horiz. = −dq^0 ∧ \overline{dq^0}; we also know it’s self-dual.
So we’ve found one solution to the self-duality equations; we can discover others by applying conformal
transformations. That is, SL2 (H) = Spin1,5 acts on Θ to produce self-dual connections. This is an SL2 (H)-orbit
inside A_P ; we know it’s a homogeneous manifold, so if we want to know what it is, we should compute the
stabilizer of Θ. This is precisely the isometries, which are Spin5 , so the orbit is Spin1,5 /Spin5 = SO1,5 /SO5 ,
which is a hyperbolic 5-ball M , and Θ is actually the center. As you get closer to the “edge” S 4 , which we
think of as the base S 4 , the curvature concentrates more and more at the boundary point, and we could
think of the connections at infinity as having curvature in a δ-function (this doesn’t actually work, of course).
The only connection which has no privileged concentration of curvature is Θ, at the center.
Theorem 6.14 (Atiyah-Drinfeld-Hitchin-Manin). SO1,5 /SO5 is the moduli space of self-dual Sp1 -instantons
with Chern class c2 = 1.
This is the only characteristic class data; for an Sp1 = SU2 -bundle, c1 = 0, and there are no more Chern
classes. On S 4 , c2 ∈ H 4 (S 4 ), and the orientation identifies this cohomology group with Z, so we can make
sense of c2 = 1. This is an instance of fixing discrete data in a moduli problem, which is common.
This is a basic case where we can picture what’s going on, and illustrates a good part of the general story.
But how do you prove Theorem 6.14? How do we know that every connection is one of these? Stay tuned;
we’ll prove this later. Then there are other questions involving other Chern classes, other manifolds, and so
on.
17Because H isn’t commutative, we have to specify that we’re taking the quotient of the right action.
18There are also Hopf fibrations over the reals, including the double cover Z/2 → S 1 → RP1 . There’s also one over the
octonions.

This kind of question was first investigated by physicists, Polyakov and others. Atiyah and Singer then
checked the dimension by linearizing the equations and found that the physicists had missed some. When
Simon Donaldson was a graduate student, he had the brilliant idea of taking this example, but generalizing to
4-manifolds where the intersection form is positive definite. Once again, the moduli space is five-dimensional,
and you can take connections with curvature concentrated near a point, and extend by zero. Taubes showed that you
can wiggle this a little bit and actually get a solution – and therefore you again get a copy of the manifold
at infinity. Through this (and of course, plenty more) Donaldson was able to prove his first theorem – this
exhibits a bordism from this manifold to something else.
Lecture 7.
Some linear algebra underlying spinors in dimension 4: 2/12/19

“Is it clear that —” “Yes, it’s clear.”


Recall that we’ve been discussing connections on S 4 with Chern number 1. Fix the usual orientation and
round metric (we only need a conformal structure in dimension 4, but we have a canonical metric so we might
as well). The conformal symmetries SO1,5 ≅ PSL2 (H) act on S 4 , and hence also on the moduli space M of
self-dual Sp1 -connections. This moduli space is diffeomorphic to a 5-ball.
Inside this space, there’s a special point, as we discussed last time: the principal Sp1 -bundle S 7 → S 4
realized as the Hopf fibration (H2 \ {0})/R>0 → HP1 , and HP1 ≅ S 4 . Then we can choose an Sp1 -invariant
horizontal distribution on S 7 , as in (6.13).
Now for some more cool facts about spinors. Previously, we discussed how to obtain several exceptional
isomorphisms of Lie groups by starting with a four-dimensional complex vector space S and considering
various wedge powers of S and S∗ . Now, let’s suppose S = S′ ⊕ S″ , where each summand is two-dimensional.
Then
(7.1) \Lambda ^2\Sph ^* = \Lambda ^2(\Sph ')^*\oplus \Lambda ^2(\Sph '')^*\oplus (\Sph ')^*\otimes (\Sph '')^*.
The summands have dimensions 1, 1, and 4, respectively.
Suppose we fix a µ ∈ Det S∗ , which splits as ε′ ∧ ε″ , where ε′ ∈ Λ2 (S′ )∗ and ε″ ∈ Λ2 (S″ )∗ . Then S′ ⊗ S″
has a bilinear pairing
(7.2) B(s'\otimes s'', \widetilde s'\otimes \widetilde s'') \coloneqq \epsilon '(s', \widetilde s')\epsilon ''(s'', \widetilde s'').
Given an isometry of S′ (i.e. preserving ε′ ) and an isometry of S″ , their tensor product preserves B, so we
obtain a map
(7.3a) \Aut (\Sph ', \epsilon ')\times \Aut (\Sph '', \epsilon '')\longrightarrow \Aut (\Sph '\otimes \Sph '', B),
which if S = C4 with the usual decomposition as C2 ⊕ C2 , is a map
(7.3b) \SL _2(\C )\times \SL _2(\C )\longrightarrow \O _4(\C ),
and as usual, this has image SO4 (C), and is a double cover onto it. Hence Spin4 (C) ≅ SL2 (C) × SL2 (C).
We’re also interested in the real spin group. Hence, fix a quaternionic structure J = J′ ⊕ J″ , where
J′ : S′ → S′ is antilinear and squares to −id, and similarly for J″ . Then Aut(S′ , ε′ , J′ ) ≅ Sp1 , and similarly
for S″ , and Λ2 J restricts to a real structure on S′ ⊗ S″ , so we obtain a map
(7.4) \Sp _1\times \Sp _1\longrightarrow \SO _4,
and this is a double cover, so Spin4 ≅ Sp1 × Sp1 .
We can produce a real basis of S′ ⊗ S″ with respect to this real structure. Let e′ , Je′ be a basis of S′ , where
we ask ε′ (e′ , Je′ ) = 1, and define e″ and Je″ similarly. Then our real basis is
(7.5) \begin {gathered} e'\otimes e'' + Je'\otimes Je''\\ i\paren {e'\otimes e'' - Je'\otimes Je''}\\ e'\otimes Je'' - Je'\otimes e''\\ i\paren {e'\otimes Je'' + Je'\otimes e''}. \end {gathered}
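It is worth checking what B from (7.2) looks like in this basis, since that fixes the normalization of the Clifford relation later on. A sketch, where the shorthand names a, b, c, d and v_1, ..., v_4 are introduced only for this example, and where we assume that both area forms pair the chosen basis vectors to 1:

```latex
% Shorthand (hypothetical): a = e'\otimes e'', \; d = Je'\otimes Je'', \; b = e'\otimes Je'', \; c = Je'\otimes e''.
% From (7.2), with \epsilon'(e',Je') = \epsilon''(e'',Je'') = 1 and both forms antisymmetric:
\begin{align*}
  B(a,d) &= 1, \qquad B(b,c) = -1,\\
  B(a,a) &= B(b,b) = B(c,c) = B(d,d) = 0,\\
  B(a,b) &= B(a,c) = B(b,d) = B(c,d) = 0.
\end{align*}
% Hence, for v_1 = a+d, \; v_2 = i(a-d), \; v_3 = b-c, \; v_4 = i(b+c):
\begin{align*}
  B(v_1,v_1) &= 2B(a,d) = 2, & B(v_2,v_2) &= i^2\cdot\bigl(-2B(a,d)\bigr) = 2,\\
  B(v_3,v_3) &= -2B(b,c) = 2, & B(v_4,v_4) &= i^2\cdot 2B(b,c) = 2,
\end{align*}
% and B(v_m,v_n) = 0 for m \neq n. So B = 2\delta_{mn} in the basis (7.5).
```

In particular B is positive definite on the real span of (7.5), which is consistent with landing in the compact group SO4 in (7.4).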

Remark 7.6. You can also choose a real structure J on S = S′ ⊕ S″ , and stipulate that J(S′ ) = S″ , and can
play a similar game as above.

Now consider the exterior product Λ2 S ⊗ S → Λ3 S. Restricted to V ∗ := S′ ⊗ S″ ⊂ Λ2 S we obtain a map
(7.7) \begin {aligned} V^*\otimes (\Sph '\oplus \Sph '') &\longrightarrow \Sph '\oplus \Sph ''\\ (s'\otimes s''), (\psi ', \psi '') &\longmapsto \paren {\epsilon '(s', \psi ')s'', \epsilon ''(s'', \psi '')s'}. \end {aligned}
In particular, if v ∈ V ∗ , s′ ∈ S′ , and s″ ∈ S″ , then the pair (v, s′ ) lands in S″ , and correspondingly (v, s″ ) lands in
S′ . Therefore we obtain maps γ : V ∗ ⊗ S′ → S″ and γ : V ∗ ⊗ S″ → S′ . These will be the Clifford multiplication
maps when we pass to associated bundles on a spin 4-manifold. The notation is suggestive: V ∗ will be vectors
and S′ and S″ will be the spinors.
Proposition 7.8. If θ1 , θ2 ∈ V ∗ , then
γ(θ1 )γ(θ2 ) + γ(θ2 )γ(θ1 ) = −B(θ1 , θ2 ).
But first we need a quick fact.
Lemma 7.9. Let W be a two-dimensional vector space and ε be an area form for W . For any w1 , w2 , w3 ∈ W ,
ε(w1 , w2 )w3 + ε(w2 , w3 )w1 + ε(w3 , w1 )w2 = 0.
Proof sketch. It suffices to check that the map
(7.10) \begin {aligned} W\times W\times W &\longrightarrow \Lambda ^2 W\otimes W\\ w_1,w_2,w_3 &\longmapsto (w_1\wedge w_2)\otimes w_3 + (w_2\wedge w_3)\otimes w_1 + (w_3\wedge w_1)\otimes w_2 \end {aligned}
factors through Λ3 W , which vanishes because dim W = 2.

Proof of Proposition 7.8. Write θ1 = s′1 ⊗ s″1 and θ2 = s′2 ⊗ s″2 . If ψ′ ∈ S′ , then

(7.11a) \begin {aligned} \gamma (s_1'\otimes s_1'')\gamma (s_2'\otimes s_2'')\psi ' &= \gamma (s_1'\otimes s_1'')\epsilon '(s_2', \psi ')s_2''\\ &= \epsilon '(s_2', \psi ')\epsilon ''(s_1'', s_2'')s_1'. \end {aligned}
Similarly,
(7.11b) \gamma (s_2'\otimes s_2'')\gamma (s_1'\otimes s_1'')\psi ' = -\epsilon '(s_1', \psi ')\epsilon ''(s_1'', s_2'')s_2'.
Adding these together and applying Lemma 7.9 to s′1 , s′2 , and ψ′ , you get
(7.12) \gamma (s_1'\otimes s_1'')\gamma (s_2'\otimes s_2'')\psi ' + \gamma (s_2'\otimes s_2'')\gamma (s_1'\otimes s_1'')\psi ' = -\epsilon '(s_1', s_2')\epsilon ''(s_1'', s_2'')\psi '.\qedhere
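The last step leans on Lemma 7.9, and it may help to see exactly how. Here is a sketch of the bookkeeping, applying the lemma in W = S′ with (w_1, w_2, w_3) = (s_1′, s_2′, ψ′); nothing beyond (7.2), (7.11a), (7.11b), and the lemma is used.

```latex
\begin{align*}
  \text{(7.11a)} + \text{(7.11b)}
    &= \epsilon''(s_1'', s_2'')\,\bigl[\epsilon'(s_2', \psi')\,s_1' - \epsilon'(s_1', \psi')\,s_2'\bigr]\\
    &= \epsilon''(s_1'', s_2'')\,\bigl[\epsilon'(s_2', \psi')\,s_1' + \epsilon'(\psi', s_1')\,s_2'\bigr]
       && \text{antisymmetry of } \epsilon'\\
    &= -\,\epsilon''(s_1'', s_2'')\,\epsilon'(s_1', s_2')\,\psi'
       && \text{Lemma 7.9}\\
    &= -\,B(s_1'\otimes s_1'',\, s_2'\otimes s_2'')\,\psi'
       && \text{by (7.2)}.
\end{align*}
```

Both sides are bilinear in (θ1 , θ2 ) and decomposable tensors span V ∗ , so the identity on S′ follows for all θ1 , θ2 ; the computation on S″ is the same with the roles of the two factors exchanged.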

Proposition 7.13. If θ1 , θ2 ∈ V ∗ , then the assignment


(7.14a) \theta _1\wedge \theta _2\mapsto \gamma (\theta _1)\gamma (\theta _2) - \gamma (\theta _2)\gamma (\theta _1)
defines a map
(7.14b) \Lambda ^2V^*\longrightarrow \Aut (\Sph ')\times \Aut (\Sph ''),
such that Λ2+ V ∗ acts by zero on S″ and Λ2− V ∗ acts by zero on S′ .

Recall first that if W is a vector space, there’s a canonical isomorphism


(7.15) W\otimes W\cong \Sym ^2 W\oplus \Lambda ^2 W.
The general theory of decomposing a tensor product involves a tool called Young diagrams. In our setting,
\begin {aligned} V^*\otimes V^* &\cong \Sph '\otimes \Sph ''\otimes \Sph '\otimes \Sph ''\\ &\cong \underbracket {\Sym ^2\Sph '\otimes \Lambda ^2\Sph ''}_{\Lambda _+^2V^*} \oplus \underbracket {\Lambda ^2\Sph '\otimes \Sym ^2\Sph ''}_{\Lambda _-^2 V^*} \oplus \dotsb \end {aligned}
(7.16)

Now choose a complex line L′ ⊂ S′ , so that S′ = L′ ⊕ JL′ . Then V ∗ splits as
(7.17) V^* = \Sph '\otimes \Sph '' = (L'\oplus JL')\otimes \Sph '' = \underbracket {L'\otimes \Sph ''}_{(1,0)}\oplus \underbracket {JL'\otimes \Sph ''}_{(0,1)},
and J interchanges the two factors, defining a complex structure on VR∗ . Similarly,
(7.18) \begin {aligned} \Lambda _+^2 V^* &\cong \Sym ^2\Sph '\otimes \Lambda ^2\Sph ''\\ &\cong \underbracket {(L')^{\otimes 2}\otimes \Lambda ^2\Sph ''}_{(2,0)} \oplus \underbracket {(JL')^{\otimes 2}\otimes \Lambda ^2\Sph ''}_{(0,2)} \oplus \underbracket {(L'\otimes JL')\otimes \Lambda ^2\Sph ''}_{(1,1)}. \end {aligned}

and
(7.19) \Lambda _-^2 V^* \cong \Lambda ^2\Sph '\otimes \Sym ^2\Sph '' = \underbracket {(L'\otimes JL')\otimes \Sym ^2\Sph ''}_{(1,1)}.

Here the annotations mean the type of a form in the complexification of a real vector space. Specifically,
suppose V is a real vector space with a complex structure, so that V ⊗ C = W ⊕ W̄ . Then V ∗ ⊗ C = W ∗ ⊕ W̄ ∗ too, and therefore
there is a splitting
(7.20) \Lambda ^k(V^*\otimes \C ) = \bigoplus _{p+q=k} \Lambda ^pW^*\otimes \Lambda ^q\overline W^*.
The forms in this summand are said to have type (p, q). If dim V = 4, then
(7.21) \Lambda ^2(V^*\otimes \C ) = \Lambda ^{2,0}\oplus \Lambda ^{1,1}\oplus \Lambda ^{0,2}

of dimensions 1, 4, and 1 respectively.


Theorem 7.22. The intersection over all subspaces L′ of the (1, 1)-forms with respect to L′ is Λ2− V ∗ .
This follows directly from a symmetry argument: this is the only SL2 (C) × SL2 (C)-invariant subspace of
Λ1,1 .
This is very useful, as it establishes a link between self-duality in dimension 4 and complex geometry. If
you have some connection and want to know whether it’s anti-self-dual, it suffices to show that it’s type (1, 1)
in every complex structure.

We’re now in a position to discuss the Dirac operator, which we will do next time. In the last ten minutes
of this lecture, we’ll review some basics of complex geometry.
Let M be a manifold, and suppose I ∈ End(T M ) squares to −idT M . This is what’s called an almost
complex structure on M , and just as above allows us to decompose

(7.23) \label {hodgeac} \Omega _M^k(\C ) = \bigoplus _{p+q=k} \Omega _M^{p,q}.

In more detail, T M ⊗ C splits as W ⊕ W̄ , where W is the subspace where I acts by i (where we’ve chosen a
square root i of −1) and W̄ is where I acts by −i. Thus Λk T ∗ M ⊗ C splits as a sum of Λp W ∗ ⊗ Λq W̄ ∗ over all
(p, q) with p + q = k, and hence sections do as well, giving (7.23).
In this setting, what happens to the de Rham differential? It looks like it gets complicated, but in the good
case dΩ1,0 has no (0, 2)-component, and then more generally, d|Ωp,q lands only in (p + 1, q) and (p, q + 1).
In that case we can let ∂ denote the part of d valued in Ωp+1,q and ∂̄ the part of d valued in Ωp,q+1 , and we get
a diagram of maps

(7.24)

\begin {gathered} \xymatrix @dr{ \Omega ^{p,q}\ar [r]^\partial \ar [d]_\delbar &\Omega ^{p+1,q}\ar [r]^\partial \ar [d]_\delbar & \Omega ^{p+2,q}\\ \Omega ^{p,q+1}\ar [r]^\partial \ar [d]_\delbar & \Omega ^{p+1,q+1}\\ \Omega ^{p,q+2} } \end {gathered}

We know d2 = 0 iff ∂ 2 = 0, ∂̄ 2 = 0, and ∂∂̄ + ∂̄∂ = 0, so we would like this to be true.
Claim 7.25. This condition is exactly the vanishing of the complex Frobenius tensor of the complex
distribution W ⊂ T M ⊗ C.
This isn’t too hard to see. But the crucial equivalent condition is harder:
Theorem 7.26 (Newlander-Nirenberg). This condition holds iff we can cover M by local coordinates in which
the change-of-charts map is holomorphic.

Lecture 8.
Twistors and Dirac operators: 2/14/19

“Dimension 4 is, as always, the problem child, or the interesting child.”


Last time, we discussed that if M is an almost complex manifold, meaning it comes equipped with a map
I : T M → T M with I 2 = −id, then the complex differential forms Ω^k_M (C) split into Ω^{p,q}_M indexed over p, q
with p + q = k based on how I acts on them. We then mentioned Theorem 7.26, which says that if d
restricted to a map Ω^{0,1}_M → Ω^{2,0}_M is zero (which is an integrability condition), then there is an atlas for M
whose change-of-charts maps are holomorphic, i.e. M is a complex manifold. In particular, in these local
coordinates z1 , . . . , zn , dz1 , . . . , dzn are of type (1, 0) and pointwise form a basis of Ω^{1,0}_M .
Assume now that M is a complex manifold, and let E → M be a C ∞ complex vector bundle, i.e. a
complex bundle in the usual sense, and not necessarily holomorphic. Suppose we have a linear operator
∂̄E : Ω^{0,0}_M (E) → Ω^{0,1}_M (E) such that

(8.1) \delbar _E(f\cdot s) = \delbar f\cdot s + f\delbar _E s,


where f is a function and s is a section of E. We can then extend this to a complex

(8.2) \xymatrix { 0\ar [r] & \Omega _M^{0,0}(E)\ar [r]^-{\delbar _E} & \Omega _M^{0,1}(E)\ar [r]^-{\delbar _E} & \Omega _M^{0,2}(E)\ar [r] & \dotsb , }
and ∂̄E^2 : Ω^{0,0}_M (E) → Ω^{0,2}_M (E) is multiplication by some tensor ΦE ∈ Ω^{0,2}_M (End E).
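Why is the square of this operator multiplication by a tensor rather than a second-order differential operator? Here is a short check that it is linear over functions. It assumes the usual extension of the operator to E-valued forms by the graded Leibniz rule, which the notes do not spell out, and it uses that the square of the Dolbeault operator vanishes on functions because M is a complex manifold.

```latex
% Assumed extension to E-valued forms (not stated in the notes):
%   \bar\partial_E(\alpha\wedge s) = \bar\partial\alpha\wedge s + (-1)^{\deg\alpha}\,\alpha\wedge\bar\partial_E s.
\begin{align*}
  \bar\partial_E^2(f s)
    &= \bar\partial_E\bigl(\bar\partial f\cdot s + f\,\bar\partial_E s\bigr)
       && \text{by (8.1)}\\
    &= \bar\partial^2 f\cdot s \;-\; \bar\partial f\wedge\bar\partial_E s
       \;+\; \bar\partial f\wedge\bar\partial_E s \;+\; f\,\bar\partial_E^2 s\\
    &= f\,\bar\partial_E^2 s
       && \text{since } \bar\partial^2 f = 0.
\end{align*}
```

An operator on sections that commutes with multiplication by functions is given by a bundle map, here an End E-valued (0, 2)-form.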

Theorem 8.3. If ΦE = 0, then there exists a local basis of sections s1 , . . . , sn such that ∂̄E sj = 0.
This is easier than Theorem 7.26, though we’ll defer the proof. In this case we can place the structure of a
complex manifold on E, and E is what’s called a holomorphic vector bundle.

The twistor approach to the anti-self-dual equations. Now we’ll briefly take a peek at one approach
to the anti-self-dual equations, as part of some great activity in the 1970s by researchers including the Oxford
school. This involves some algebro-geometric techniques which show off the uses of the linear algebra we did
last time, and is the original approach to this equation on S 4 .
Let X be an oriented Riemannian 4-manifold. If S = S+ ⊕ S− is a 4-dimensional complex vector space
(which we’ll turn into the spinor bundle soon enough) and dim S± = 2, then Sp1 × Sp1 acts on it as the spin
group. If V ∗ := S+ ⊗ S− , then its real form is isomorphic to (R4 )∗ , and SO4 acts on it. Choose a complex line L ⊂ S+ ,
which as we discussed last time defines a complex structure on V ∗ . The space of choices is P(S+ ), which is
also acted on by SO4 .
Bringing in the topology, let BSO (X) → X be the principal SO4 -bundle of oriented orthonormal bases.
The Riemannian metric defines the Levi-Civita connection ΘLC on BSO (X), which is a beautiful and still
somewhat mysterious fact. Because SO4 acts on P(S+ ), we obtain an associated bundle P(S+ ) → X, and the
Levi-Civita connection induces a horizontal distribution on it.
Remark 8.4. We don’t have an action of SO4 on S+ – that’s what we’d need the spin structure for. But we
do have a projective action, so the action on the projective space is well-defined.
P(S+ ) is a six-dimensional real manifold.
Exercise 8.5. Choose S 4 = HP1 with the round metric. Then P(S+ ) is diffeomorphic to CP3 , and the
projection map CP3 → HP1 is the map sending a complex line in C4 ≅ H2 to the unique quaternionic line
containing it. The fiber is CP1 .
In this case, P(S+ ) is a complex manifold. In general, we will only have an almost complex manifold: given a
point x, L ∈ P(S+ ), we have the two lines Tx X and the vertical line in Tx,L P(S+ ) (TODO: I might have this
wrong), and so we can take the usual almost complex structure where Tx X is real and the vertical line is
imaginary.
Definition 8.6. If M is a Riemannian manifold of dimension at least 5, then its Riemann curvature tensor
splits as a sum of three pieces: scalar curvature, Ricci curvature, and something called Weyl curvature, which
is the piece that’s invariant under conformal transformations.

In more detail, the Riemann curvature tensor has some symmetries that mean it’s a section of Sym2 (Λ2 (T ∗ M )).
If V is an n-dimensional vector space with an orthogonal basis, the standard On -action induces an On -action
on Sym2 (Λ2 V ∗ ), and this decomposes into three irreducible components, giving us the scalar curvature (for
the trivial subrepresentation), the Ricci curvature, and the Weyl curvature.
The Weyl curvature requires four antisymmetric indices, so the Weyl curvature vanishes in dimensions 2
and 3: in dimension 2, the Riemannian curvature tensor is just the scalar curvature, and is called the Gauss
curvature. In dimension 3, we also have Ricci curvature, but that’s it, and so Riemannian geometry in 3
dimensions is nice, e.g. when studying Ricci flow. In dimension 4, the Weyl curvature tensor splits into two
pieces: its self-dual and anti-self-dual components. This makes life interesting, as always, in dimension 4.
Theorem 8.7 (Atiyah-Hitchin-Singer). This almost complex structure is integrable iff W+ = 0, where W+ is
the self-dual Weyl curvature tensor on X.
In particular: we’ve passed from an integrability question on the fiber to one on the base.
Now suppose E → X is a complex vector bundle,19 and let A be a connection on E, with curvature
F ∈ Ω2X (End E). Now pull back E and A to π ∗ E → P(S+ ). Now, we’re over an almost complex base, so we
can decompose the curvature into its type components:

(8.8) \pi ^*F\in \Omega _{\P (\Sph ^+)}^2(\End E) = \bigoplus _{p+q=2} \Omega ^{p,q}(\End E).

A bit of linear algebra leads to the following lemma.


Lemma 8.9. The connection A is anti-self-dual iff π ∗ F has type (1, 1).
Now let P → M be a principal G-bundle with a connection Θ ∈ Ω1P (g). This satisfies two equations: if
g ∈ G, then Rg∗ Θ = Ad_{g^{-1}} Θ, and if ξ ∈ g, then ι_{ξ̂} Θ = ξ, where ξ̂ is the vector field on P generated by ξ.
Now let 𝔼 be a vector space and ρ : G → Aut(𝔼) be a representation. It differentiates to a Lie algebra
representation ρ̇ : g → End(𝔼). Let E := 𝔼P be the associated bundle to the data of ρ, and let ψ be a
section. We can think of ψ as a map P → 𝔼, i.e. as an 𝔼-valued function on P ; then ψ descends to Ω0M (E) iff
Rg∗ ψ = ρ(g −1 ) · ψ for all g ∈ G. More generally, if α ∈ ΩkP (𝔼), then it descends to ΩkM (E) iff
(8.10) \begin {aligned} R_g^*\alpha &= \rho (g^{-1})\alpha \\ \iota _{\widehat \xi }\alpha &= 0, \end {aligned}

where as before g ∈ G and ξ ∈ g.


The covariant derivative of ψ is
(8.11) \nabla _\Theta \psi = \d \psi + \dot \rho (\Theta )\psi ,
which a priori lives in Ω1P (E), but you can directly check that it descends, which is a useful exercise. More
generally, if α is a k-form as above,
(8.12) \d _\Theta \alpha = \d \alpha + \dot \rho (\Theta )\wedge \alpha ,
and this also descends, which you can check.
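The "useful exercise" for the covariant derivative of a section is mostly a two-line computation. Here is a sketch of both conditions in (8.10), using only the two defining equations of a connection and the equivariance of ψ; the intertwining identity in the first comment is a standard fact about representations that the notes do not state explicitly.

```latex
% Equivariance. Uses \dot\rho(\mathrm{Ad}_{g^{-1}}X) = \rho(g^{-1})\,\dot\rho(X)\,\rho(g):
\begin{align*}
  R_g^*\nabla_\Theta\psi
    &= d(R_g^*\psi) + \dot\rho(R_g^*\Theta)\,R_g^*\psi
     = \rho(g^{-1})\,d\psi + \dot\rho(\mathrm{Ad}_{g^{-1}}\Theta)\,\rho(g^{-1})\psi\\
    &= \rho(g^{-1})\bigl(d\psi + \dot\rho(\Theta)\psi\bigr)
     = \rho(g^{-1})\,\nabla_\Theta\psi.
\end{align*}
% Horizontality. Differentiating R_{\exp(t\xi)}^*\psi = \rho(\exp(-t\xi))\psi at t = 0
% gives \iota_{\widehat\xi}\,d\psi = -\dot\rho(\xi)\psi, so
\begin{align*}
  \iota_{\widehat\xi}\nabla_\Theta\psi
    = \iota_{\widehat\xi}\,d\psi + \dot\rho(\iota_{\widehat\xi}\Theta)\psi
    = -\dot\rho(\xi)\psi + \dot\rho(\xi)\psi = 0.
\end{align*}
```

The same two steps, with extra signs from the wedge product, handle the k-form case.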
Exercise 8.13 (Bianchi identity). The curvature FΘ = dΘ + (1/2)[Θ ∧ Θ] satisfies dF = 0 + [dΘ ∧ Θ], i.e.
dΘ F := dF + [Θ ∧ F ] = 0,
because F and dΘ differ by (1/2)[Θ ∧ Θ], and the Jacobi identity shows the triple bracket you get by substituting
it in is zero.
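For anyone who wants to see the signs, here is a sketch of the computation behind the exercise. The sign rule for brackets of Lie-algebra-valued forms in the first comment is the standard one and is the only ingredient not stated in the notes.

```latex
% Sign rule: for a \mathfrak g-valued p-form \alpha and q-form \beta,
%   [\alpha\wedge\beta] = (-1)^{pq+1}[\beta\wedge\alpha],
% so in particular [\Theta\wedge d\Theta] = -[d\Theta\wedge\Theta].
\begin{align*}
  dF &= d(d\Theta) + \tfrac12\,d[\Theta\wedge\Theta]
      = \tfrac12\bigl([d\Theta\wedge\Theta] - [\Theta\wedge d\Theta]\bigr)
      = [d\Theta\wedge\Theta],\\
  [\Theta\wedge F] &= [\Theta\wedge d\Theta] + \tfrac12\,[\Theta\wedge[\Theta\wedge\Theta]]
      = -[d\Theta\wedge\Theta] + 0
      && \text{(Jacobi identity)},\\
  d_\Theta F &= dF + [\Theta\wedge F] = 0.
\end{align*}
```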
Now let M be a Riemannian n-manifold with its principal On -bundle of frames B(M ) → M
and the Levi-Civita connection ΘLC . This therefore gives us horizontal vector fields ∂1 , . . . , ∂n on B(M )
defined as follows: given an x ∈ M and a basis b : Rn → Tx M (an isomorphism), hence a point in B(M ), we can take

19As X is not necessarily a complex manifold, we can’t ask for E to be holomorphic.



b(e1 ) ∈ Tx M ⊂ T(x,b) B(M ), which is a horizontal vector, and this defines ∂1 ; then ∂2 , . . . , ∂n are analogous.
Moreover,

(8.14) [\partial _k, \partial _\ell ] = \underbracket {-\frac 12 T_{k\ell }^i \partial _i}_{=0} - \frac 12 R^i_{jk\ell } E_i^j,

where Eij is some matrix (TODO: I did not parse the definition) which is 0 everywhere except for a 1 in
column i and a −1 in column j. The first term vanishes because the Levi-Civita connection is torsion-free.
We can flow along these by geodesics by solving the geodesic equation, which frames the horizontal tangent
bundle on B(M ).
We can use this to write the covariant derivative: if On acts on a vector space S and ψ : B(M ) → S is an
equivariant (TODO: ?) map, then
(8.15) \nabla \psi = e^k\cdot \partial _k\psi \colon \cB (M)\longrightarrow \Sph \otimes (\R ^n)^*.
Now we can say something about the Dirac operator. Assume now that X is a 4-dimensional spin manifold
with a Riemannian metric. All the linear algebra we did last time tells us that if B̃ → X is the principal
Spin4 -bundle of frames, we have Spin4 -equivariant maps ψ ± : B̃ → S± , where S± are as in the previous
lecture, so we obtain Clifford multiplication γ : (R4 )∗ ⊗ S± → S∓ , and we proved a lemma that
(8.16) \gamma (e^i)\gamma (e^j) + \gamma (e^j)\gamma (e^i) = -2\delta ^{ij}.
Definition 8.17. Let S ± denote the spinor bundles on X. The Dirac operator D : Ω0X (S ± ) → Ω0X (S ∓ ) is
defined by
Dψ := γ(ek )∂k ψ.
It’s easy to see this is a first-order differential operator: we differentiated once.
In the last ten minutes,20 we’ll compute the square of the Dirac operator.
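(8.18) D^2\psi = \gamma (e^k)\gamma (e^\ell )\partial _k\partial _\ell \psi
(8.19) = -\sum _{k=1}^n \partial _k^2\psi + \sum _{k<\ell } \gamma (e^k)\gamma (e^\ell )[\partial _k, \partial _\ell ]\psi
(8.20) = \nabla ^*\nabla \psi + \frac 14 R_{\mathrm {scal}}\psi .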
This is called the Weitzenböck formula.21 Here ∇∗ ∇, meaning the composition of ∇ with its formal adjoint, is
called the covariant Laplacian, and exists for any associated bundle for the bundle of frames on a Riemannian
manifold.
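The passage from (8.18) to (8.19) is just a matter of separating the diagonal terms from the off-diagonal ones. A sketch, assuming the normalization in which each γ(e^k) squares to −1 and distinct γ(e^k), γ(e^ℓ) anticommute (this normalization is an assumption of the example; it is what makes the first term come out as minus the sum of second derivatives), and using that the γ(e^k) are constant and so commute with the ∂_k:

```latex
% Assumption: \gamma(e^k)^2 = -1 and \gamma(e^k)\gamma(e^\ell) = -\gamma(e^\ell)\gamma(e^k) for k \neq \ell.
\begin{align*}
  \gamma(e^k)\gamma(e^\ell)\,\partial_k\partial_\ell\psi
    &= \sum_k \gamma(e^k)^2\,\partial_k^2\psi
     + \sum_{k<\ell}\bigl(\gamma(e^k)\gamma(e^\ell)\,\partial_k\partial_\ell
                         + \gamma(e^\ell)\gamma(e^k)\,\partial_\ell\partial_k\bigr)\psi\\
    &= -\sum_k \partial_k^2\psi
     + \sum_{k<\ell}\gamma(e^k)\gamma(e^\ell)\,\bigl(\partial_k\partial_\ell - \partial_\ell\partial_k\bigr)\psi\\
    &= -\sum_k \partial_k^2\psi
     + \sum_{k<\ell}\gamma(e^k)\gamma(e^\ell)\,[\partial_k,\partial_\ell]\psi.
\end{align*}
```

The curvature enters only through the commutators in the last sum, via (8.14).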
Now let’s choose a Hermitian vector bundle E → X of rank N with a connection A and curvature FA , and
let P → X denote the principal UN -bundle of unitary bases for E; then A is a connection here, in the sense
that A ∈ Ω1P (uN ). Cross with the spin bundle to obtain B̃ ×X P → X, which is a principal Spin4 × UN -bundle,
and this has a connection which heuristically is ΘLC + A, and in particular we still have the vector fields
∂1 , . . . , ∂4 .
If 𝔼 = CN is the model vector space for E, we can consider Spin4 × UN -equivariant maps ψ : B̃ ×X P →
S± ⊗ 𝔼, and hence obtain a Dirac operator
(8.21) D_A\colon \Omega _X^0(S^+\otimes E)\longrightarrow \Omega _X^0(S^-\otimes E),
and this squares to TODO: I had to go.
Lecture 9.
The Nahm transform for anti-self-dual connections: 2/17/19

“That’s why I wrote it that way, thinking you might try to pull that one on me.”
Today, we’re going to discuss a kind of Fourier transform for anti-self-dual forms on a Euclidean torus, mostly
following Donaldson-Kronheimer [DK97].
20Measured in my local frame, we have −3 minutes.
21The beer is spelled differently: weizenbock.

Definition 9.1. A lattice Λ in a real vector space V is a finitely generated abelian subgroup of V , necessarily
free. If its rank is equal to the dimension of V , it’s called full.

Equivalently, a full lattice is the Z-span of a basis of V .


Now let V be a 4-dimensional oriented inner product space and Λ ⊂ V be a full lattice. The model example
is V = R4 and Λ = Z4 . Then T := V /Λ is a torus with an orientation and Riemannian metric induced from
V . It is also an abelian Lie group.
Let Λ∗ := Hom(Λ, Z) denote the dual lattice, which sits inside V ∗ = Hom(V, R). There is an isomorphism
Λ∗ → Hom(T, 𝕋) sending θ : Λ → Z to v ↦ e^{2πiθ(v)} .

Definition 9.2. The dual torus to T is T ∗ := V ∗ /Λ∗ .

There is a universal character χ : T × Λ∗ → 𝕋: given θ ∈ Λ∗ and x ∈ T , χ(x, θ) := exp(2πiθ(x)) (essentially,
“evaluate θ on x”). There are projection maps

(9.3)
\begin {gathered} \xymatrix @dr{ T\times \Lambda ^*\ar [r]^{p_2}\ar [d]_{p_1} & \Lambda ^*,\\ T } \end {gathered}

from which we can define a Fourier transform from (a certain class of) functions on T to (a certain class of)
functions on Λ∗ . Specifically, given a function f on T , we’d like to define

(9.4) \label {fourheur} \widehat f\coloneqq (p_2)_*\chi ^{-1} p_1^*f,

or more explicitly,

(9.5) \widehat f(\theta ) = \int _T \d x\, e^{-2\pi i\theta (x)}\cdot f(x).
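As a quick illustration of (9.4), one can transform a character. In the sketch below, the reading of the transform as integration of f against the inverse of the universal character over T, and the symbol θ_0 for a fixed element of Λ∗, are assumptions of this example.

```latex
% Take f(x) = \chi(x,\theta_0) = e^{2\pi i\theta_0(x)} for a fixed \theta_0 \in \Lambda^*.
\begin{align*}
  \widehat f(\theta)
    = \int_T e^{-2\pi i\theta(x)}\,e^{2\pi i\theta_0(x)}\,dx
    = \int_T e^{2\pi i(\theta_0-\theta)(x)}\,dx
    = \begin{cases}
        \operatorname{vol}(T), & \theta = \theta_0,\\
        0, & \theta \neq \theta_0.
      \end{cases}
\end{align*}
% The second case holds because a nontrivial character of a compact group integrates to zero.
```

So characters of T go to multiples of delta functions on Λ∗, which is the usual statement that Fourier series coefficients pick out frequencies.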

Remark 9.6. This is an instance of a more general phenomenon for locally compact abelian groups called
Pontrjagin duality. Another example is a finite analogue: letting µn ⊂ C denote the nth roots of unity, we
have a pairing Z/n × µn → 𝕋 sending (k, λ) ↦ λ^k . These two groups are noncanonically isomorphic, but
each is canonically isomorphic to the character group of the other. In our setting, T and Λ∗ aren’t isomorphic,
but are still each other’s character groups.

We will discuss a categorified version of this, which in the algebro-geometric setting is called the Fourier-
Mukai transform. We will use T ∗ instead of Λ∗ , and push-pull along the diagram

(9.7)
\begin {gathered} \xymatrix @dr{ T\times T^*\ar [r]^{p_2}\ar [d]_{p_1} & T^*.\\ T } \end {gathered}
Instead of the universal character we will have a universal line bundle L → T × T ∗ with a Hermitian connection,
allowing us to exchange vector bundles on T and on T ∗ . This L will be called the Poincaré line bundle.
Heuristically, the formula will look like (9.4): if E → T is a vector bundle, we let

(9.8) \widehat E = (p_2)_*\paren {\sL \otimes p_1^*E}.

In order to make this precise, we have some details to figure out, namely
(1) constructing L and its Hermitian connection,
(2) interpreting (p2 )∗ , and
(3) incorporating connections on E and Ê into the story.
Once we do this, though, we will be able to prove some nice theorems: generic anti-self-dual connections on
E pass to anti-self-dual connections on Ê, and there will be an inversion formula.

Remark 9.9. If we give a complex structure on V , then T and T ∗ acquire the structure of complex manifolds,
and we can do all of this with sheaves. This is what’s called the Fourier-Mukai transform.

Instead of constructing the Poincaré line bundle, we’ll construct a principal T-bundle P → T × T ∗ with
connection; this is equivalent, via passing to the associated bundle P ×T C → T × T ∗ . Begin with the trivial
principal T-bundle V × V ∗ × T → V × V ∗ , with the connection Θ ∈ Ω^1_{V ×V ∗ ×T}(iR) given by
(9.10) \Theta _{(v, \theta , z)}(\dot v, \dot \theta , \dot z) = 2\pi i\theta (\dot v) + z^{-1}\dot z.
This is a universal family of flat connections; you can check that the holonomy vanishes.
Now Λ × Λ∗ acts on this trivial principal T-bundle by
(9.11a) λ · (v, θ, z) := (v + λ, θ, z)
(9.11b) λ∗ · (v, θ, z) := (v, θ + λ∗ , exp(−2πiλ∗ (v))z),
and this covers the usual Λ × Λ∗ -action on the base, so it descends to a principal T-bundle P → T × T ∗ with
a connection Θ. You can think of this as a family of flat connections on T parameterized by T ∗ ; these
bundles are trivializable,22 but not canonically so. Similarly, you can view this as a family of flat connections
on T ∗ parameterized by T , and these are also trivializable but not canonically trivialized. However, the total
bundle is not trivial.
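
To see why the connection descends, one can check directly that the 1-form 2πiθ(dv) + z^{-1}dz on V × V ∗ × T is invariant under both actions (9.11a) and (9.11b). The following is a short worked version in standard LaTeX, assuming that form for Θ; the primed variables are just names for the images under the action.

```latex
% Invariance of Theta = 2 pi i theta(dv) + z^{-1} dz under Lambda x Lambda^*.
% (a) lambda in Lambda: v' = v + lambda, theta' = theta, z' = z.
%     Since lambda is constant, dv' = dv, so Theta is unchanged.
% (b) lambda^* in Lambda^*: v' = v, theta' = theta + lambda^*, z' = exp(-2 pi i lambda^*(v)) z.
\begin{align*}
  (z')^{-1}\,\mathrm{d}z' &= z^{-1}\,\mathrm{d}z - 2\pi i\,\lambda^*(\mathrm{d}v), \\
  2\pi i\,\theta'(\mathrm{d}v') &= 2\pi i\,\theta(\mathrm{d}v) + 2\pi i\,\lambda^*(\mathrm{d}v), \\
  \Theta' &= 2\pi i\,\theta'(\mathrm{d}v') + (z')^{-1}\,\mathrm{d}z'
           = 2\pi i\,\theta(\mathrm{d}v) + z^{-1}\,\mathrm{d}z = \Theta .
\end{align*}
```

The twist exp(−2πiλ∗(v)) in (9.11b) is exactly what cancels the extra term 2πiλ∗(dv).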
The curvature Ω of Θ is a translation-invariant (under the group operation) purely imaginary 2-form
satisfying the formula
(9.12) \Omega \paren {(\dot v_1, \dot \theta _1), (\dot v_2, \dot \theta _2)} = 2\pi i\paren {\dot \theta _1(\dot v_2) - \dot \theta _2(\dot v_1)},
or, in other words,
(9.13) \Omega = 2\pi i\ang {\d \theta \wedge \d v}.
Here dθ ∈ Ω^1_{T ∗}(V ∗ ) and dv ∈ Ω^1_T(V ) are precisely the Maurer-Cartan forms for these two tori.
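
In coordinates the curvature computation is one line. The sketch below assumes a basis e_a of V with dual basis e^a of V ∗ (names introduced only for this illustration) and takes the connection form to be 2πiθ(dv) + z^{-1}dz; since the structure group is abelian, the curvature is just the exterior derivative.

```latex
% Assumed coordinates: v = v^a e_a, theta = theta_a e^a (summation over a = 1..4).
\begin{align*}
  \Theta &= 2\pi i\,\theta_a\,\mathrm{d}v^a + z^{-1}\,\mathrm{d}z, \\
  \Omega = \mathrm{d}\Theta
         &= 2\pi i\,\mathrm{d}\theta_a\wedge\mathrm{d}v^a
            \qquad\text{(because } \mathrm{d}(z^{-1}\,\mathrm{d}z) = 0\text{)}, \\
  \Omega\bigl((\dot v_1,\dot\theta_1),(\dot v_2,\dot\theta_2)\bigr)
         &= 2\pi i\bigl(\dot\theta_{1,a}\,\dot v_2^a - \dot\theta_{2,a}\,\dot v_1^a\bigr)
          = 2\pi i\bigl(\dot\theta_1(\dot v_2) - \dot\theta_2(\dot v_1)\bigr).
\end{align*}
```

In particular Ω vanishes when restricted to any slice T × {θ} or {v} × T ∗ , consistent with the two families of flat connections, but it is nonzero on T × T ∗ .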
Now let E → T be a Hermitian vector bundle with covariant derivative ∇A . We can pull both of these
back to T × T ∗ and tensor with L, but how do we push forward? This will involve the Dirac operator! Let
S+ and S− be the spinor spaces, so that V ∗ = S+ ⊗ S− . The spinor bundles over V are the trivial bundles
S± → V , which then descend to the torus as trivial bundles. Given θ ∈ T ∗ , we have Dirac operators
(9.14) D_{A,\theta }^\pm \colon \Gamma _T(\underline \Sph ^\pm \otimes E\otimes \sL _\theta )\longrightarrow \Gamma _T(\underline \Sph ^\mp \otimes E\otimes \sL _\theta ).
We will use this to define the pushforward (p2 )∗ – crucially, since T is compact, these are Fredholm operators,
which play the role of sheaves in differential geometry: we have the kernel and cokernel of a Fredholm
operator, but in a family these can jump. We would like to extract an honest vector bundle, so we’ll ask
for a hypothesis such that ker(D^+_{A,θ}) = 0. This is equivalent to saying that D^−_{A,θ} = (D^+_{A,θ})∗ is surjective.23
Assuming this, ker(D^−_A) → T ∗ (i.e. the fiber above θ is ker(D^−_{A,θ})) is a vector bundle.
Example 9.15. Before we discuss that hypothesis, let’s look at a toy model. Consider the matrix

(9.16) D_z\coloneqq \begin {pmatrix} 1 & 0 & 0\\0 & z & 0\end {pmatrix}\colon \C ^3\to \C ^2,

where z ∈ C; hence this is a family of linear operators parametrized by C. This is surjective for z ≠ 0,
with kernel C · (0, 0, 1). At 0, the kernel is C · (0, 0, 1) ⊕ C · (0, 1, 0). The dimension of the kernel minus the
dimension of the cokernel is always 1, but the kernel isn’t a vector bundle; it does define a sheaf, however.
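
It may help to tabulate the toy model explicitly. The block below just records the kernel, cokernel, and index of D_z from (9.16) in the two cases; nothing beyond the matrix itself is assumed.

```latex
% Kernel, cokernel, and index of D_z = [[1,0,0],[0,z,0]] : C^3 -> C^2.
\[
\begin{array}{c|c|c|c}
        & \ker D_z & \operatorname{coker} D_z & \dim\ker - \dim\operatorname{coker} \\ \hline
 z\neq 0 & \mathbb{C}\cdot(0,0,1) & 0 & 1 - 0 = 1 \\
 z = 0   & \mathbb{C}\cdot(0,1,0)\oplus\mathbb{C}\cdot(0,0,1)
         & \mathbb{C}^2/(\mathbb{C}\oplus 0)\cong\mathbb{C} & 2 - 1 = 1
\end{array}
\]
```

The index is constant, but the kernel and cokernel jump together at z = 0, which is precisely the phenomenon the surjectivity hypothesis rules out.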
Proposition 9.17. If A is anti-self-dual, then ker(D^+_{A,θ}) = 0, hence we can define the pushforward.
Proof. This is a vanishing theorem, and many vanishing theorems follow a similar proof. We will use the
Weitzenböck formula
(9.18) D_A^-D_A^+ = \nabla _A^*\nabla _A - F_A^+,
and we know F_A^+ = 0. Suppose ψ ∈ Γ_T(S^+ ⊗ E ⊗ L_θ) and D_A^+ψ = 0, but ψ ≠ 0. Then

(9.19) 0 = \int _T \ang {\psi , \nabla _A^*\nabla _A\psi }\ud x = \int _T \norm {\nabla _A\psi }^2.

22This is a flat connection on a torus; Chern-Weil theory implies the Chern class of P vanishes, so it’s abstractly trivializable.
23Though this is a simple linear-algebra exercise in finite dimensions, in the infinite-dimensional Fredholm setting it’s a
harder theorem known as the Fredholm alternative.

Since ‖∇_A ψ‖² is nonnegative, this means it’s zero, so ψ is covariantly constant. This is a typical application
of the Weitzenböck formula. □
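
For reference, here is the integration-by-parts step behind (9.19) spelled out, as a sketch in standard LaTeX. It assumes only what is used above: T is compact, D_A^- is the formal adjoint of D_A^+, and ∇_A denotes the connection coupled to the flat bundle L_θ (so the twist contributes no curvature).

```latex
% Pair the Weitzenbock formula (9.18) with psi and integrate over the compact torus T.
\begin{align*}
  \|D_A^+\psi\|_{L^2}^2
    &= \int_T \langle \psi, D_A^- D_A^+\psi\rangle
     = \int_T \langle \psi, \nabla_A^*\nabla_A\psi\rangle
       - \int_T \langle \psi, F_A^+\psi\rangle \\
    &= \|\nabla_A\psi\|_{L^2}^2 - \int_T \langle \psi, F_A^+\psi\rangle .
\end{align*}
% With F_A^+ = 0 and D_A^+ psi = 0 this reads 0 = ||nabla_A psi||^2, so nabla_A psi = 0.
```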
Spelling this out in more detail, let s1 , s2 be a basis for S+ . We can write
(9.20) \psi (t) = \psi _1(t)s_1 + \psi _2(t)s_2

for t ∈ T , where ψi ∈ ΓT (E ⊗ Lθ ) and ∇A ψi = 0. This section ψ defines a subbundle of E ⊗ Lθ of rank 1,
and the fact that it’s covariantly constant means that it splits off! Therefore there is some other bundle
Ẽ ⊂ E ⊗ Lθ with E ⊗ Lθ = Ẽ ⊕ C; the connection form is diagonal, and is the standard connection on C.
Now, tensoring with L∗θ ,
(9.21) E = \widetilde E\otimes \sL _\theta ^*\oplus \sL _\theta ^* = E'\oplus \sL _\theta ^*.
Definition 9.22. We say E is without flat factors (WFF) if there is no decomposition E = E′ ⊕ L where L
is flat.
This is a generic condition.
Anyways, we have our Fourier-transformed bundle Ê := ker(D^−_A). The next step is to define the covariant
derivative on Ê, via a more general definition.
Definition 9.23. Suppose we are given vector bundles K, L → M , where M is a smooth manifold, and a map R : K → L. We
will assume K and L come with covariant derivatives ∇K , resp. ∇L , and that either R is Fredholm, or both
K and L have finite rank. Assuming that R is fiberwise surjective, then ker R → M is a vector bundle.
Choose a projection π : K → ker R such that π ◦ i = idker R , where i : ker R ,→ K is inclusion. Then the
compressed covariant derivative on ker R is
π ◦ ∇^K ◦ i : Ω^0_M(ker R) −→ Ω^1_M(ker R).
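
One natural choice of π, not spelled out in lecture, is orthogonal projection. The sketch below assumes K and L carry fiberwise Hermitian inner products (L² inner products in the Fredholm case) so that the adjoint R∗ exists; this is an assumption made for illustration, not necessarily the choice intended above.

```latex
% Orthogonal projection onto ker R for a fiberwise surjective R : K -> L.
% Assumption: Hermitian metrics on K and L, so R^* : L -> K is defined and R R^* is invertible.
\begin{align*}
  \pi &:= \mathrm{id}_K - R^*(RR^*)^{-1}R, \\
  R\,\pi &= R - (RR^*)(RR^*)^{-1}R = 0
      &&\text{(so } \pi \text{ lands in } \ker R\text{)}, \\
  \pi\, i &= \mathrm{id}_{\ker R}
      &&\text{(since } R\,i = 0\text{)}, \\
  \pi^* &= \pi, \qquad \pi^2 = \pi
      &&\text{(so } \pi \text{ is the orthogonal projection)}.
\end{align*}
```

With this choice the compressed covariant derivative is automatically compatible with the induced metric on ker R whenever ∇^K is compatible with the metric on K.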
This general construction applies to our situation, where K = S− ⊗ E ⊗ Lθ , L = S+ ⊗ E ⊗ Lθ , and
R = D^−_{A,θ}. (TODO: how did we get π?) Therefore we pick up a connection Â on Ê → T ∗ .

Theorem 9.24. Assuming A is anti-self-dual and (E, ∇A ) is without flat factors, then Â is anti-self-dual.

We want to prove that F_Â ∈ Ω^{2,−}_{T ∗}(End Ê), and will do so by proving that in every complex structure on
T ∗ , F_Â is of type (1, 1), which by Theorem 7.22 suffices. The complex structures on T ∗ are parameterized by
P(S+ ). We will need some more ingredients to do this, which we will discuss next time:
(1) the Chern connection on a holomorphic, Hermitian vector bundle;
(2) how to express the Dirac operator in terms of ∂̄; and
(3) how to fit these vector bundles into a family of Dolbeault complexes.
This will allow us to identify the connection with the Chern connection, which always has type (1, 1); we can
avoid a calculation in favor of a geometric proof.
Lecture 10.
The Chern connection: 2/19/19

Today, we’ll prove Theorem 9.24, that the Fourier transform of an anti-self-dual connection on a 4-torus is
anti-self-dual. To do this, we’ll fill in some facts about geometry, as promised last time.
Definition 10.1. Let π : E → M be a complex vector bundle, and suppose that both E and M are complex
manifolds and π is holomorphic. Then E → M is called a holomorphic vector bundle.
This implies in particular that locally on M , there is a basis s1 , . . . , sN of holomorphic sections of E.
Thus any section locally has the form f^i s_i for C-valued f^i , allowing us to define the ∂̄_E operator by
∂̄_E(f^i s_i) := ∂̄f^i s_i . Of course, one must check that this is independent of coordinates.
Remark 10.2. Conversely, suppose π : E → M is a C ∞ complex vector bundle, M is a complex manifold, and
we have an operator ∂̄_E : Ω^{0,0}_M(E) → Ω^{0,1}_M(E) satisfying a Leibniz rule and ∂̄_E² = 0. Then π : E → M admits
the structure of a holomorphic vector bundle for which ∂̄_E is the operator defined above.24
24This is less difficult to prove than you might expect: you can certainly lean on the Newlander-Nirenberg theorem, but
there’s a more elementary proof you can find in Donaldson-Kronheimer [DK97].

Theorem 10.3. Let π : E → M be a holomorphic vector bundle over a complex manifold, and endow E with
a Hermitian metric ⟨–, –⟩ : Ē ⊗ E → C. Then there exists a unique connection ∇ on E → M such that
(1) if ∇ = ∇′ + ∇″ is the decomposition induced by ∇ : Ω^{0,0}_M(E) → Ω^{1,0}_M(E) ⊕ Ω^{0,1}_M(E), then ∇″ = ∂̄_E ;
(2) ∇ is compatible with the metric, in the sense that
(10.4) \label {ChernHermCompat} \d \ang {\overline s_1, s_2} = \ang {\nabla \overline s_1, s_2} + \ang {\overline s_1, \nabla s_2};
(3) and the curvature of ∇ is type (1, 1).
In this setting, ∇ is called the Chern connection.
Proof. We work locally in a holomorphic chart U , allowing us to choose a basis s1 , . . . , sN of holomorphic
sections of E. Define h_{īj} := ⟨s_i , s_j⟩ and write ∇s_j = α^i_j s_i , so α^i_j ∈ Ω^{1,0}_U . Using (10.4), we have

(10.5) \d h_{\overline \imath j} = \overline {\alpha _i^k}\, h_{\overline k j} + h_{\overline \imath k}\,\alpha _j^k
(10.6) \partial h_{\overline \imath j} = h_{\overline \imath k}\,\alpha _j^k
(10.7) \alpha _j^i = h^{i\overline \ell }\,\partial h_{\overline \ell j}.
Here h^{i ℓ̄} denotes the components of the inverse matrix. Now (10.7) completely characterizes ∇, forcing
uniqueness, and we can take it as a definition and check that it’s consistent, guaranteeing existence. □
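
The rank-one case makes the formula concrete. The following sketch specializes (10.7) to a holomorphic line bundle with a local holomorphic frame s; the only assumption is the notation h := ⟨s, s⟩ > 0 for the single metric coefficient.

```latex
% Chern connection of a Hermitian holomorphic line bundle in a holomorphic frame s.
\begin{align*}
  h &:= \langle s, s\rangle > 0, \\
  \alpha &= h^{-1}\,\partial h = \partial\log h \in \Omega^{1,0}_U, \\
  F_\nabla = \mathrm{d}\alpha
    &= (\partial + \bar\partial)\,\partial\log h
     = \bar\partial\partial\log h \in \Omega^{1,1}_U ,
\end{align*}
% using \partial^2 = 0. In particular the curvature has no (2,0) or (0,2) part.
```

This is the simplest instance of condition (3) in Theorem 10.3.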

Another proof. Let t1 , . . . , tN be a local unitary basis of C ∞ sections, which means that ⟨t_i , t_j⟩ = δ_{ij} , and
write ∇t_j = β^i_j t_i . Then β = (β^i_j) is skew-Hermitian, so we can write β = β′ + β″, where β′ ∈ Ω^{1,0}_M and
β″ ∈ Ω^{0,1}_M . Since β is skew-Hermitian, β′ = −(β″)∗ , but β″ is given by ∂̄_E , so we’ve again completely
determined ∇. □

Remark 10.8. The fact that the curvature is type (1, 1) is saying something about invariance under scalars: a
λ ∈ C× acts on a form of type (p, q) by λ^{p−q} , so it acts by the identity on (p, p)-forms.
A third proof sketch of Theorem 10.3. Let P → M be the principal GL_N(C)-bundle of frames of E. This
is a holomorphic principal bundle, meaning the total space and the projection map are both holomorphic.
Inside it is the principal U_N-bundle Q → M of unitary frames with respect to ⟨–, –⟩. We would like to produce a
connection on P , namely a GL_N(C)-invariant horizontal distribution H ⊂ T P .
Define H := T Q ∩ I(T Q), where I ∈ End(T P ) is the action of i. Then one can check (TODO) that the
induced covariant derivative satisfies the conditions in the theorem statement. □

This proof is due to Is Singer.


Remark 10.9. Suppose E = T M and M has a Riemannian metric; then in the last proof, P = B_GL(M )
and Q = B_U(M ). The proof above tells us that B_U(M ) picks up the Chern connection Θ^c . There’s
another connection on B_U(M ) induced from the Levi-Civita connection Θ^{ℓc} on B_O(M ), using the inclusion
B_U(M ) ↪ B_O(M ) induced from the inclusion U_N ↪ O_{2N} .
Are these connections equal? This is expressing that the complex structure and Riemannian metric are
compatible, so of course what we get is that this is equivalent to M being Kähler. This is perhaps the most
beautiful definition of Kähler manifolds.
More generally, given any kind of tangential structure (e.g. spin structure, almost symplectic structure) on
a manifold, and a Riemannian metric, you can ask whether there’s a canonical torsion-free connection on the
bundle of frames.
Recall that on a complex manifold M , the canonical bundle is the complex line bundle K_M := Det_C(T ∗ M ).
Theorem 10.10 (Atiyah-Bott-Shapiro [ABS64], Hitchin [Hit74]). Let M be a Kähler spin manifold. Then
the spin structure induces a holomorphic Hermitian line bundle L → M together with an isomorphism
L⊗2 → K_M , and the Dirac operator is
∂̄_L + ∂̄_L^∗ : Ω^{0,even/odd}_M(L) −→ Ω^{0,odd/even}_M(L).
To elaborate on the “even/odd” and “odd/even” pieces, recall that ∂̄_L maps from (p, q)-forms to (p, q + 1)-
forms, and therefore ∂̄_L^∗ , its adjoint with respect to the Riemannian metric on M , sends (p, q + 1)-forms to
(p, q)-forms.

Remark 10.11. Not every Kähler manifold is spin, e.g. CP² . And a spin structure is extra data, which in the
holomorphic world is given by a square root of the canonical bundle.

We’re not going to prove Theorem 10.10 in full generality, but we’ve developed enough of the theory of
spin geometry in dimension 4 to do it, using the linear algebra of the spin representations that we’ve worked
through. The details are worked out in Donaldson-Kronheimer [DK97].
Now let K^0 , K^1 , K^2 be complex vector spaces, so the trivial bundles K^i → M over a complex manifold M
are holomorphic. Suppose we have maps α : K^0 → K^1 and β : K^1 → K^2 , and suppose β ◦ α = 0, so that we
have a complex. The cohomology groups might not be vector bundles: the dimension could jump. But if we
assume α is injective and β is surjective, then the zeroth and second cohomology vanish, so as long as α and
β are Fredholm (automatic if these spaces are finite-dimensional) the Euler characteristic of the complex is
equal to − dim H^1 , and therefore the dimension can’t jump, and H^1 is a vector bundle.25

Theorem 10.12. The cohomology H 1 → M has a holomorphic structure. Moreover, given Hermitian metrics
on K i → M , the compressed connection on H 1 → M is its Chern connection.

Proof. For the first part, define a section of H^1 → M to be holomorphic if locally it has a lift to a holomorphic
section of ker(β) → M . Then we need to show that there’s locally a basis of holomorphic sections. Fix an
m0 ∈ M and s0 ∈ H^1_{m0} which has a holomorphic lift to e ∈ K^1_{m0} with β_{m0} e = 0. Choose a P : K^2_{m0} → K^1_{m0}
such that β_{m0} P = id, and trivialize each K^i → M near m0 . Then in a neighborhood U of m0 , we can write
s_m = e + j_m , where j_{m0} = 0 and j_m ∈ Im(P ), and we can write β_m = β_{m0} + η_m , where of course η_{m0} = 0.
So we now have the equation
So we now have the equation

(10.13) 0 = P\beta _m s_m = P(\beta _{m_0} + \eta _m)(e+j_m) = P\eta _me + P\beta _{m_0}j_m + P\eta _m j_m,

i.e. we want to solve

(10.14) (1+P\eta _m)j_m = -P\eta _me.

That is, a solution to this equation allows us to extend s0 to a holomorphic section of H 1 . We can just write
down the solution: let

(10.15) j_m \coloneqq -\sum _{i=0}^\infty (-P\eta _m)^i P\eta _me.

This converges (roughly because 1 + small is invertible), and is holomorphic. In particular, we can lift a
basis of K^1_{m0} to some sections that we hope are a local basis. It will be left as an exercise to show that every
holomorphic section can be written as a linear combination of these.
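
Here is the algebra behind (10.13) through (10.15) written out, as a sketch. It uses only the setup above (β_{m0} e = 0, β_{m0} P = id, j_m ∈ Im(P)); the shorthand N := Pη_m and the element u with j_m = Pu are names introduced just for this check, and the convergence assumes N is small in operator norm near m0.

```latex
% From (10.13) to (10.14): since j_m = P u for some u and beta_{m_0} P = id,
\begin{align*}
  P\beta_{m_0}\,j_m &= P\beta_{m_0}P\,u = P\,u = j_m, \\
  0 &= P\eta_m e + j_m + P\eta_m\,j_m
  \quad\Longleftrightarrow\quad (1 + N)\,j_m = -N e, \qquad N := P\eta_m .
\end{align*}
% Neumann series: for \|N\| < 1 (true near m_0, because \eta_{m_0} = 0),
\begin{align*}
  (1+N)\sum_{i=0}^{\infty}(-N)^i &= 1, \\
  j_m &= -(1+N)^{-1}N e = -\sum_{i=0}^{\infty}(-N)^i\,N e ,
\end{align*}
% which is (10.15). Each term depends holomorphically on m, and so does the sum.
```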
For the second part, let π : K 1 → ker α∗ = (Im α)⊥ , where α∗ : K 1 → K 0 is the adjoint in the Hermitian
metrics. Then π is given by

(10.16) k\mapsto k - \alpha (\alpha ^*\alpha )^{-1} \alpha ^*k.

Since [∂̄, α] = 0, then [∂̄, π] ⊂ Im(α), and therefore π[∂̄, π] = 0. Suppose k is a local holomorphic section of
ker β → U ; then, π(k) is a section of (ker α∗ ) ∩ (ker β) → U and

(10.17) \pi \delbar _{K^1}(\pi (k)) = \pi [\delbar _{K^1}, \pi ]k + \pi ^2(\delbar _{K^1} k) = 0.

There’s a little more to do, but we’re out of time. □
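
The properties of π used in this argument follow directly from (10.16). The sketch below checks them, assuming α is injective (so α∗α is invertible) and β ◦ α = 0 as in the setup above.

```latex
% Checks for pi(k) = k - alpha (alpha^* alpha)^{-1} alpha^* k.
\begin{align*}
  \pi\alpha &= \alpha - \alpha(\alpha^*\alpha)^{-1}(\alpha^*\alpha) = 0
     &&\text{(}\pi\text{ kills } \operatorname{Im}\alpha\text{)}, \\
  \alpha^*\pi &= \alpha^* - (\alpha^*\alpha)(\alpha^*\alpha)^{-1}\alpha^* = 0
     &&\text{(}\pi\text{ lands in } \ker\alpha^*\text{)}, \\
  \pi^2 &= \pi - \pi\alpha(\alpha^*\alpha)^{-1}\alpha^* = \pi
     &&\text{(}\pi\text{ is a projection)}, \\
  \beta\pi &= \beta - (\beta\alpha)(\alpha^*\alpha)^{-1}\alpha^* = \beta
     &&\text{(}\pi\text{ preserves } \ker\beta\text{)}.
\end{align*}
```

The first line is what gives π[∂̄, π] = 0 once [∂̄, π] is known to take values in Im(α), and the last is why π(k) is still a section of ker β.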

This is an instance of something general, that holomorphy and unitarity are often in tension. For example,
even just in complex analysis, a norm-1 holomorphic function is constant.

25Implicit in this paragraph is the definition of a holomorphic vector bundle of possibly infinite rank. One can make sense of
this by determining the right notion of a holomorphic function on a complex Banach space, and then using the same words as in
Definition 10.1.

Lecture 11.
Proof of the Nahm transform, part 1: 2/26/19

“We’re not spinning our wheels here. I mean, we are, but. . . ”


We’re still discussing the Nahm transform (the analogue of the Fourier transform for anti-self-dual connections
on tori). This story is nice because tori are quotients of vector spaces; but there are also more quotients of
vector spaces we could study.
As usual, let V be a four-dimensional real oriented inner product space, Λ ⊂ V be a full lattice, V ∗ be
the dual, and Λ∗ := Hom(Λ, Z). We formed the torus T = V /Λ and its dual torus T ∗ = V ∗ /Λ∗ . Then
there is a correspondence T ← T × T ∗ → T ∗ , and as we usually do with correspondences, we’d like to do
a pullback-and-pushforward weighted by an integral kernel. On functions, this looks like the usual kernel
transform,

(11.1) \widehat f(x) = \int \d y\, K(x,y) f(y).

But we’re interested in vector bundles (and connections), not functions, so the kernel is the Poincaré bundle.
Well, there are two Poincaré bundles, just as how the Fourier transform uses e^{−iπx·ξ} and its inverse uses
e^{iπx·ξ} . The two Poincaré bundles P, P̃ → T × T ∗ are Hermitian line bundles with connection. Recall that
a point v ∈ T defines a flat line bundle over T ∗ , and vice versa; therefore we say that P|T ×{θ} → T is the
flat line bundle on T with holonomy θ, and analogously for the bundle over {v} × T ∗ . For P̃, one of the
holonomies switches to −v. However, there is curvature on T × T ∗ , given by ±i⟨dv ∧ dθ⟩ (here v ∈ T and
θ ∈ T ∗ ; dv is a translation-invariant one-form on V , and hence descends to T , and similarly for dθ).
On to the Nahm transform: given a Hermitian vector bundle E → T with connection A, we would like to
define a Hermitian vector bundle Ê → T ∗ with connection Â. Fix spin spaces S± with V = S+ ⊗ S− , so the
spinors on T are functions T → S± (in other words, on T , the spinor bundles are trivial bundles with fiber
S± , so sections of them are functions to S± ). We have a Dirac operator
(11.2) D_A^-|_\theta \colon \Gamma (\underline \Sph ^-\otimes E\otimes \P _\theta )\longrightarrow \Gamma (\underline \Sph ^+\otimes E\otimes \P _\theta ).
If A is anti-self-dual and without flat factors, then D^−_A|θ is surjective, and we define Ê_θ := ker(D^−_A|θ ); because
the Dirac operator is Fredholm, this is finite-dimensional. We also want a covariant derivative Â, which
we define to be the compressed covariant derivative associated to the Fredholm map D^−_A|θ . We want to prove
Theorem 9.24, that Â is anti-self-dual; we will prove this by showing it’s type (1, 1) in every complex structure
on T ∗ .

Proof of Theorem 9.24. A line L ∈ P(S+ ) defines a complex structure (V, I), as we discussed; call this
two-dimensional complex vector space U . The dual space is U ∗ = (V ∗ , −I ∗ ); the reason we use −I ∗ is so
that the duality pairing works nicely. Specifically, if v ∈ V and θ ∈ V ∗ , we want ⟨θ, v⟩ = ⟨I_{V ∗} θ, I_V v⟩; since
I² = −1, if we took I ∗ instead of −I ∗ , there would be an extra minus sign.
Now split ∇A = ∇′_A + ∇″_A , by types, so ∇″_A : Ω^{0,0}_T(E) → Ω^{0,1}_T(E). Since A is anti-self-dual, its curvature
has type (1, 1), and therefore (∇″_A)² = 0. Therefore ∂̄_A := ∇″_A defines a holomorphic structure on E; we will
call this holomorphic vector bundle ℰ → T .
Remark 11.3. Since the curvatures of P, P̃ → T × T ∗ also have type (1, 1), they also pick up holomorphic
structures in this way; we denote these holomorphic bundles 𝒫, 𝒫̃ → T × T ∗ .
Lemma 11.4. Let (M, I) be an almost complex manifold and ω ∈ Ω²_M(C). Then ω has type (1, 1) iff
ω(Iξ1 , Iξ2 ) = ω(ξ1 , ξ2 ) for all ξ1 , ξ2 ∈ T M .
Proof. Split ω by type:
(11.5) \omega = \underbracket {\omega _{ij}\ud z^i\wedge \d z^j}_{(2,0)} + \underbracket {\omega _{i\overline \jmath } \ud z^i\wedge \overline {\d z^j}}_{(1,1)} + \underbracket {\omega _{\overline \imath \overline \jmath } \,\overline {\d z^i}\wedge \overline {\d z^j}}_{(0,2)}.

Now C× acts on Ω²_M(C) by scalar multiplication, and a λ ∈ C× acts by λ² on Ω^{2,0}_M(C), by 1 on Ω^{1,1}_M(C), and
by λ^{−2} on Ω^{0,2}_M(C). Since I acts by multiplication by i, ω can only be fixed by I if it’s of type (1, 1). □
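
The last step can be made explicit by tracking how I acts on each type. The sketch below works with a local coframe of (1,0)-forms, written dz^j by analogy with (11.5) (on a merely almost complex manifold these need not be exact, which is an abuse of notation made only for this illustration), and writes ω^{p,q} for the type components of ω.

```latex
% I acts by +i on (1,0)-forms and by -i on (0,1)-forms:
\begin{align*}
  \mathrm{d}z^j(I\xi) &= i\,\mathrm{d}z^j(\xi), \qquad
  \overline{\mathrm{d}z^j}(I\xi) = -i\,\overline{\mathrm{d}z^j}(\xi), \\
  \omega(I\xi_1, I\xi_2)
    &= (i)(i)\,\omega^{2,0}(\xi_1,\xi_2) + (i)(-i)\,\omega^{1,1}(\xi_1,\xi_2)
       + (-i)(-i)\,\omega^{0,2}(\xi_1,\xi_2) \\
    &= -\omega^{2,0}(\xi_1,\xi_2) + \omega^{1,1}(\xi_1,\xi_2) - \omega^{0,2}(\xi_1,\xi_2).
\end{align*}
% Hence omega(I., I.) = omega  iff  omega^{2,0} = 0 = omega^{0,2},
% i.e. iff omega has type (1,1).
```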

Now that everything is holomorphic, we have the Dolbeault complex

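(11.6) \xymatrix { \Omega _T^{0,0}(\cE \otimes \cP _\theta )\ar [r]^-{\delbar _A} & \Omega _T^{0,1}(\cE \otimes \cP _\theta )\ar [r]^-{\delbar _A} & \Omega _T^{0,2}(\cE \otimes \cP _\theta ), }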

and the Dirac operator is


(11.7) D_A = \delbar _A + \delbar _A^*\colon \Omega _T^{0,1}(\cE \otimes \cP _\theta )\longrightarrow \Omega _T^{0,0}(\cE \otimes \cP _\theta )\oplus \Omega _T^{0,2}(\cE \otimes \cP _\theta ).
Hence, by the usual Hodge theory story,
(11.8a) \ker (D_A^-|_\theta ) \cong H^1(T, \cE \otimes \cP _\theta )
(11.8b) \operatorname {coker}(D_A^-|_\theta ) \cong H^0(T, \cE \otimes \cP _\theta )\oplus H^2(T, \cE \otimes \cP _\theta ).
Since E is without flat factors, though, H^0 ⊕ H^2 vanishes. Therefore the fiber of Ê at θ is exactly H^1(T, E ⊗ Pθ ).
This is exactly what the Mukai transform in algebraic (or complex) geometry does: given the correspondence,
pull back from T to T × T ∗ , then push forward to T ∗ . Here, though “push forward” means something different:
we have to generalize from vector bundles to sheaves. Given a sheaf F → X and a map f : X → Y , we
obtain a pushforward sheaf f∗ F → Y whose space of sections on an open U ⊂ Y is defined to be F(f −1 (U )).
However, the pushforward isn’t exact, so we have to take its derived functors, which amounts to cohomology.
Here, we only can have H 0 through H 2 , and H 0 and H 2 vanish, leaving just H 1 . Moreover, the fact that E
is without flat factors implies that what we get is a vector bundle, rather than just a sheaf. In particular, as
C ∞ vector bundles, Ê is identified with what we got from the Nahm transform.
The point is (TODO details) that, just as we proved last time, the compressed connection is the Chern
connection, and therefore it has type (1, 1). Since the complex structure was arbitrary, this implies Â is
anti-self-dual. □

Now we want to study the moduli space of such E → T . Generally one fixes discrete invariants to make
the problem simpler.
Proposition 11.9. The discrete invariants of a complex (i.e. just C ∞ ) vector bundle E → T are: the rank
in Z≥0 and its Chern classes c1 (E) ∈ H^2(T ) and c2 (E) ∈ H^4(T ) ≅ Z. That is, these determine the C ∞
isomorphism class of E.
This is a good exercise in algebraic geometry.
Theorem 11.10. The discrete invariants of Ê → T ∗ can be given in terms of those of E:
• rank(Ê) = c2 (E) − (1/2)c1 (E)² ,
• c1 (Ê) = σ(c1 (E)), and
• c2 (Ê) = rank E + (1/2)c1 (E)² .
These follow from the Atiyah-Singer index theorem, though mind that the formulas in the book have a few
mistakes. Here σ : H 2 (T ) → H 2 (T ∗ ) is a map induced from Poincaré duality. Specifically, there is a canonical
class κ ∈ H^1(T ) ⊗ H^1(T ∗ ) ↪ H^2(T × T ∗ ).26 Then we define
(11.11) \sigma (x)\coloneqq (p_2)_* (\kappa ^2 p_1^*x).
The pushforward (p2 )∗ is the Gysin map, which comes from Poincaré duality.
Corollary 11.12. There cannot exist an anti-self-dual connection on a bundle with c1 = 0 and c2 = 1.
Proof. Suppose such a connection exists, and choose one of minimal rank. Then there exists an anti-self-dual
connection on Ê with rank 1 and c2 ≠ 0, but this is a contradiction: the second Chern class of any line
bundle vanishes. □
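
The arithmetic in this proof comes straight from Theorem 11.10. The sketch below plugs c1 = 0 and c2 = 1 into the formulas as stated there (keeping in mind the warning that the published formulas contain mistakes), with H^4 identified with Z and r := rank E a name introduced only for this check.

```latex
% Plugging c_1(E) = 0, c_2(E) = 1, rank E = r >= 1 into Theorem 11.10:
\begin{align*}
  \operatorname{rank}(\widehat E) &= c_2(E) - \tfrac12 c_1(E)^2 = 1 - 0 = 1, \\
  c_2(\widehat E) &= \operatorname{rank} E + \tfrac12 c_1(E)^2 = r + 0 = r \ge 1 .
\end{align*}
% But a rank-1 bundle has c_2 = 0, so no such E can exist.
```

The minimal-rank choice is there so that E has no flat factor to split off, which is what makes the transform defined in the first place.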

Now we want to prove an analogue of Fourier inversion.


Definition 11.13. Let F → T ∗ be a holomorphic vector bundle without flat factors, and define F̌ → T to
be the holomorphic vector bundle whose fiber at v ∈ T is H^1(T ∗ , F ⊗ P̃_v ).

26The fact that we can decompose H^2(T × T ∗ ) as a sum of (H^0(T ) ⊗ H^2(T ∗ )) ⊕ (H^1(T ) ⊗ H^1(T ∗ )) ⊕ (H^2(T ) ⊗ H^0(T ∗ ))
relies on the fact that the homology of T is torsion-free, and is not true in general.

Given a Hermitian structure on F, we also get one on F̌ as before.


Theorem 11.14 (Mukai inversion). Let E → T be a Hermitian holomorphic vector bundle without flat
factors. Then Ê → T ∗ is without flat factors, and there is a natural isomorphism ω : (Ê)ˇ → E of Hermitian
holomorphic vector bundles on T .
Letting p1 : T × T ∗ → T be projection onto the first factor, the complex Ω^{0,•}_{T ×T ∗}(p_1^∗ E ⊗ P) is actually a
double complex C^{•,•} . The bigrading (p, q) exists on forms on any Cartesian product: given a k-form, how
many of its factors are from T and how many are from T ∗ ? We also have that the differential splits as
∂̄_1 + ∂̄_2 , the former of which sends (p, q)-forms to (p + 1, q)-forms, and the latter of which sends (p, q)-forms
to (p, q + 1)-forms.
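
The relations between the two partial differentials can be read off by bidegree. This sketch assumes only that the total operator ∂̄ = ∂̄_1 + ∂̄_2 squares to zero (it comes from a holomorphic structure) and that ∂̄_1, ∂̄_2 have bidegrees (1, 0) and (0, 1).

```latex
% Expand (dbar_1 + dbar_2)^2 = 0 and separate the components by bidegree:
\begin{align*}
  0 = (\bar\partial_1 + \bar\partial_2)^2
    = \underbrace{\bar\partial_1^2}_{(2,0)}
    + \underbrace{\bar\partial_1\bar\partial_2 + \bar\partial_2\bar\partial_1}_{(1,1)}
    + \underbrace{\bar\partial_2^2}_{(0,2)} ,
\end{align*}
% so each piece vanishes separately:
\begin{align*}
  \bar\partial_1^2 = 0, \qquad \bar\partial_2^2 = 0, \qquad
  \bar\partial_1\bar\partial_2 = -\,\bar\partial_2\bar\partial_1 .
\end{align*}
```

With this sign convention the two differentials anticommute rather than commute; either way, ∂̄_2 preserves ker ∂̄_1 and Im ∂̄_1 and so descends to the ∂̄_1-cohomology.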
Given a double complex we obtain a spectral sequence: consider the bigraded vector space E_1^{•,•} :=
H^•(C^{•,•} ; ∂̄_1 ). Since ∂̄_1 and ∂̄_2 commute (up to sign), ∂̄_2 is a differential on E_1^{•,•} , so we can take the cohomology of this
complex, and obtain another bigraded vector space E_2^{•,•} , often called the E2 -page of the spectral sequence.
One can in general continue on in this way, finding more differentials using homological algebra, but in this
case they all vanish, and this approximates the cohomology of Ω^{0,•}_{T ×T ∗}(p_1^∗ E ⊗ P), in the sense that it’s the
associated graded for some filtration of it.
Remark 11.15. We could have also defined E_1^{•,•} and E_2^{•,•} by first taking cohomology with respect to ∂̄_2 , then
using ∂̄_1 . Then E_1^{•,•} may be different, but E_2^{•,•} will not change.
Theorem 11.16. The E2 -page of this spectral sequence is as follows.

(11.17)

The total cohomology of the double complex vanishes except in (total) degree 2, and H^2(T × T ∗ , p_1^∗ E ⊗ P) ≅ E_0 .
We will delve into the proof next time.
Lecture 12.
Proof of the Nahm transform, part 2: 2/28/19

“I have two things to do and ten minutes to do them, and no time for either.”
As before, let V be an oriented real four-dimensional inner product space, and choose a complex structure I
on V . Let U := (V, I) and U ∗ := (V ∗ , −I ∗ ). Let Λ ⊂ V be a full lattice and T := V /Λ, which acquires a flat
Kähler structure from the inner product and complex structure on V . The same is true for the dual torus
T ∗ := V ∗ /Λ∗ . We also defined the Poincaré bundle P → T × T ∗ , which we have made into a holomorphic
line bundle.
With this structure in place, suppose E → T is a holomorphic vector bundle without flat factors. Then
we can define Ê := p_{2∗}(p_1^∗ E ⊗ P), where p1 : T × T ∗ → T and p2 : T × T ∗ → T ∗ are the projections onto
the two factors. Here, “pushforward” means sheaf cohomology, but the assumption that E is without flat
factors means that H 0 (T, E) = 0 and H 2 (T, E) = 0, so we can just take H 1 . We would like to prove a Fourier
inversion formula for this transform.
We then defined a double Dolbeault complex
(12.1) C\coloneqq (\Omega _{T\times T^*}^{0,\bullet }(p_1^*\cE \otimes \cP ), \delbar _1 + \delbar _2).
The second grading comes from the fact that we’re over a product manifold. Associated to this double
complex is a spectral sequence whose E2 -page is H^∗_{∂̄_2}(H^∗_{∂̄_1}(C)). Our first task today is to prove Theorem 11.16,
first verifying that the E2 page looks like (11.17) and then that the spectral sequence collapses because the
total cohomology vanishes except in degree 2, where it’s E_0 .
\begin {gathered} \printpage [name=ex2,page=0] \end {gathered}
Corollary 12.2.
(1) H^0(T ∗ , Ê) = 0 and H^2(T ∗ , Ê) = 0, and
(2) we obtain an isomorphism ω_I : H^1(T ∗ , Ê) ≅ E_0 .
The second point is a piece of the inversion formula: (Ê)ˇ_0 is H^1(T ∗ , Ê), and this shows us it’s E_0 as it should
be. We’ll be able to buoy this up into the full inversion theorem.
Proof of Theorem 11.16. Introduce coordinates z1 , z2 for T and ζ1 , ζ2 for T ∗ . Now, we can view C as a
bundle of complexes over T ∗ :

(12.3) \label {cpxcpx} \xymatrix { \cV ^0\ar [r]^-{\delbar _1} & \cV ^1\ar [r]^-{\delbar _1} & \cV ^2. }

A point in V^0 in the fiber at θ ∈ T ∗ is an element of Γ(E ⊗ Pθ ), hence a section of V^0 looks sort of like a
function λ(z, ζ), with the caveat that E is nontrivial. In the same way, a section of V^1 looks like a 1-form
λ_i (z, ζ) dz^i ; and a section of V^2 looks like a 2-form λ_{12} (z, ζ) dz^1 ∧ dz^2 .
Since C p,0 = ΓT ∗ (V p ), then the cohomology of the global sections of (12.3) is what we’re after. We claim
this is the same as the global sections of the cohomology of (12.3). That these two functors commute is
a statement about the vanishing of derived functors. In this case, the sheaves associated to these vector
bundles are flabby, which follows from elliptic theory, and therefore higher cohomology vanishes. These are
infinite-rank vector bundles, so we can’t lean on the usual vanishing results.
The upshot is that on the E1 -page, E_1^{p,q} is 0 if p ≠ 1, and if p = 1, is Ω^{0,q}_{T ∗}(Ê):

(12.4)

The differential is ∂̄_2 : E_1^{p,q} → E_1^{p,q+1} : it increases vertical degree by 1. The cohomology is clearly H^∗(T ∗ , Ê),
proving the first part of the theorem.
To get at the second part, we’re going to compute the E2 -page in a different way. Fixing v ∈ T , we get a
bundle Ev ⊗ Pv over T ∗ , a slice of E ⊗ Pθ . We care about this because Ê_θ = H^1(T, E ⊗ Pθ ).
We need a quick lemma about the cohomology with coefficients in a flat line bundle.
Lemma 12.5. If v ≠ 0, then H^∗(T ∗ , Pv ) = 0; at v = 0, H^q(T ∗ , P0 ) ≅ Λ^q U .
Recall that U is the complex vector space (V, I).
Proof. For v = 0, consider the Dolbeault complex

(12.6) \xymatrix { \Omega _{T^*}^{0,0}\ar [r]^-{\delbar _2} & \Omega _{T^*}^{0,1}\ar [r]^-{\delbar _2} & \Omega _{T^*}^{0,2}. }

The tangent space to T ∗ is, as a complex vector space, U ∗ , and therefore the (0, 1) part of the cotangent
bundle to T ∗ is U ∗∗ = U .
Now suppose v ≠ 0. Given a λ ∈ Λ, we obtain a function eλ : T ∗ → C by
(12.7a) e_\lambda (\theta )\coloneqq e^{2\pi i\ang {\theta , \lambda }}.
Explicitly, in local coordinates ζ1 , ζ2 coming from V ∗ , this is
(12.7b) e_\lambda (\zeta _1,\zeta _2)\coloneqq e^{2\pi i\Re (\lambda ^1\zeta _1 + \lambda ^2\zeta _2)}.

These give us examples of Dolbeault forms for the complex

(12.8) \xymatrix { \Omega _{T^*}^{0,0}(\cP _v)\ar [r]^-{\delbar _2} & \Omega _{T^*}^{0,1}(\cP _v)\ar [r]^-{\delbar _2} & \Omega _{T^*}^{0,2}(\cP _v): }

eλ in degree zero, eλ dζ1 and eλ dζ2 in degree one, and eλ dζ1 ∧ dζ2 in degree two. Moreover, the theory of
Fourier series says that, as long as we consider forms which rapidly decay in (TODO: I think) the fiber, these
are a Hilbert space basis for Ω^{0,•}_{T ∗}(Pv ).27

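Let v := (z^1 , z^2 ). We can choose z^i dζ_i as a connection form of Pv → T ∗ . Then, you can just compute that

(12.9a) \delbar _v(e_\lambda ) = (\lambda ^i + z^i)e_\lambda \,\d \zeta _i
(12.9b) \delbar _v(e_\lambda \,\d \zeta _1) = -(\lambda ^2 + z^2)e_\lambda \,\d \zeta _1\wedge \d \zeta _2
(12.9c) \delbar _v(e_\lambda \,\d \zeta _2) = (\lambda ^1 + z^1)e_\lambda \,\d \zeta _1\wedge \d \zeta _2,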

and this is manifestly acyclic away from v = 0. □
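
To see the acyclicity concretely, fix λ and abbreviate w_i := λ^i + z^i (a shorthand introduced only for this check). In the basis e_λ; e_λ dζ_1, e_λ dζ_2; e_λ dζ_1 ∧ dζ_2, the formulas (12.9) become a three-term complex of finite-dimensional vector spaces, sketched below.

```latex
% The lambda-summand as a Koszul-type complex  C --d^0--> C^2 --d^1--> C,  w = (w_1, w_2).
\begin{align*}
  d^0(1) &= (w_1, w_2), \qquad d^1(a, b) = -w_2\,a + w_1\,b, \\
  d^1 d^0(1) &= -w_2 w_1 + w_1 w_2 = 0 .
\end{align*}
% If w \neq 0: d^0 is injective, d^1 is surjective, and
%   \ker d^1 = \{(a,b) : w_1 b = w_2 a\} = \mathbb{C}\cdot(w_1, w_2) = \operatorname{Im} d^0,
% so all cohomology vanishes.
% If w = 0: both maps are zero and the cohomology is C, C^2, C in degrees 0, 1, 2.
```

For v ≠ 0 in T no lattice translate λ + v vanishes, so every summand is exact; at v = 0 only the λ = 0 summand contributes, and its cohomology has dimensions 1, 2, 1, matching Λ^q U.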

So λ defines a subcomplex Aλ generated by eλ , eλ dζi , and eλ dζ1 ∧ dζ2 , and the total complex is a
completed direct sum of these over λ ∈ Λ, via the Fourier decomposition.
Now let D^{•,•} denote the germs of forms in a neighborhood of T_0^∗ , i.e. equivalence classes of differential
forms on open neighborhoods of (0, θ) ∈ T × T ∗ , where two forms are equivalent if they agree on some
(possibly smaller) open neighborhood. The map sending a form to its germ defines a map of complexes
φ : C^{•,•} → D^{•,•} , which induces a map of spectral sequences for these two double complexes.
Lemma 12.5 implies that φ induces an isomorphism of all groups on the E1 -page, and a general theorem on
spectral sequences implies the induced maps on all later pages are also isomorphisms (though here everything
stabilizes at E2 ).
Let N denote a formal neighborhood of zero on T , in the sense that Ω^{0,•}_N denotes germs of (0, •)-forms at
zero. We again have a Fourier-theoretic decomposition

(12.10a) D^{\bullet ,\bullet } = \mathop {\widehat \bigoplus }\limits _{\lambda \in \Lambda } D_\lambda ^{\bullet ,\bullet },

where

(12.10b) D_\lambda ^{\bullet ,\bullet }\coloneqq \Omega _N^{0,\bullet }\otimes A_\lambda \otimes \cE _0.

The double grading here arises from ∂̄_1 acting on Ω^{0,•}_N and from ∂̄_2 acting on Aλ .
Now run the spectral sequence for D_λ^{•,•} : the E1 -page is

(12.11)

\begin {gathered} \printpage [name=ex3,page=0] \end {gathered} \otimes \cE _0,

where Ã^q_λ is the space of holomorphic functions from N to A^q_λ (i.e. germs of these functions). Hence the
homology of the E1 -page, which is the E2 -page for D•,• and hence also for C •,• , is

27A little more work has to be done to remove the “rapidly decaying” hypothesis.

(12.12) 

Remark 12.13. There’s a picture of this in terms of something called a Koszul complex. Fixing λ = 0, let
v ∈ T0 U , which defines a complex

(12.14) \xymatrix { \underline {\Lambda ^0 U}\ar [r]^-{\epsilon (v)} & \underline {\Lambda ^1 U}\ar [r]^-{\epsilon (v)} & \underline {\Lambda ^2 U}, }
where ε denotes exterior product. The homology of this complex vanishes except in the top degree (2), where
we get the determinant line Det U .
Or, more algebraically, consider the complex
(12.15) K^\bullet \coloneqq (\Sym ^\bullet (U^*)\otimes \Lambda ^\bullet (U), d_K),
where d_K is multiplication by u^i ⊗ u_i . This complex can be thought of as Λ• (U )-valued polynomials on U .
Then one can prove that the cohomology of this complex is 0 except in degree 2, where it’s Det U . There’s a
similar story for λ ≠ 0.
So we have an abstract isomorphism H^1(T^*, Ê) ≅ E_0 coming from the spectral sequence argument. Can
we get our hands on precisely what this isomorphism is? TODO: I didn’t follow this part as well, so the logic
might be wonky.
In (12.12), E_0 lives in bidegree (0, 2), but in (11.17), H^1(T^*, Ê) lives in bidegree (1, 1). The way to understand
this is: consider α ∈ E_1^{1,1} and β ∈ E_1^{0,2}. If we can get that
(12.16) 0 = (\delbar _1 + \delbar _2)(-\alpha +\beta ) = \delbar _1\beta - \delbar _2\alpha ,
then we have an identification. Knowing that ∂̄_1 α = 0 and ∂̄_2 α = 0 mod Im(∂̄_1), we want to find a β such
that ∂̄_1 β = ∂̄_2 α.
Let r_0 : Ω^{∗,∗}_{T×T^*} → Ω^{0,0}_{{0}×T^*} be restriction to {0} × T^* ⊂ T × T^*. Let ε^− denote a constant in Ω^{2,0}_{T^*}. Then
we need β to satisfy

(12.17) \label {omgI} \omega _I([\alpha ]) = \int _{T^*} r_0(\beta )\wedge \e ^-,

which leads to an explicit answer, and in particular ωI is the isomorphism we want.


Now we want to finish out the proof of Mukai’s theorem, passing back to the C^∞ world and asking about
the relationship between (Â)ˇ and A. We know that (Ê, Â) is without flat factors: if it weren’t, then it would
have H^0 and H^2, which we just saw is not true. We want to show that ω_I not only induces an isomorphism
of (Â)ˇ and A, but also that it doesn’t depend on I. The argument goes by trying to rewrite (12.17) in purely
C^∞ terms. After that, we know it suffices to understand this for I and −I; I takes care of the (1, 0) part and
−I takes care of the (0, 1) part. The details can be found in Donaldson-Kronheimer [DK97].

Lecture 13.
Fredholm theory: 3/7/19

“We don’t know that, except that I told you so.”


In the next few lectures, we’ll do something different: explaining how in a differential-geometric context, one
constructs moduli spaces of solutions to nonlinear equations. Today, we’ll mostly discuss the underlying
linear theory, the theory of Fredholm operators between (usually) infinite-dimensional spaces. A linear
operator between two finite-dimensional vector spaces is always Fredholm, and theorems about operators on
finite-dimensional vector spaces often generalize to the Fredholm setting.
Lemma 13.1. Let 0 → V^0 → V^1 → · · · → V^n → 0 be an exact sequence of finite-dimensional vector spaces
(over any field). Then
(13.2) \label {acycliceulerchar} \sum _{q=0}^n (-1)^q \dim V^q = 0.

Proof. Let T_q denote the map V^q → V^{q+1}. If C_q ⊂ V^q is a complement to Im(T_{q−1}) = ker(T_q), then T_q|_{C_q} is injective,
hence an isomorphism onto its image Im(T_q) ⊂ V^{q+1}. Thus V^q = T_{q−1}(C_{q−1}) ⊕ C_q, and in (13.2), C_q and T_q(C_q) are counted with opposite signs, so the
total contribution is zero.

Corollary 13.3 (Rank-nullity theorem). Let T : V → W be a linear map. Then the sequence

(13.4a) \label {rnex} \xymatrix { 0\ar [r] & \ker T\ar [r] & V\ar [r]^-T & W\ar [r] & \coker T\ar [r] & 0 }
is exact, and in particular
(13.4b) \dim V - \dim W = \dim \ker T - \dim \coker T.
If we try to consider a families version of Corollary 13.3 over some curve in Hom(V, W ), we might ask
whether (13.4a) becomes an exact sequence of vector bundles. This is not true in general: the kernel of a
map of vector bundles need not be a vector bundle, because the dimension can jump. A simple example is

(13.5) T = \begin {pmatrix}1 & 0 & 0\\0 & t & 0\end {pmatrix}\colon \R ^3\to \R ^2,

as t → 0. However, the dimension of the cokernel also jumps, so the expression “dim(V − W )” (as
dim ker T − dim coker T ) is constant, even if the expression is a little weird.
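As a sanity check on the family (13.5), here is a short Python sketch using exact rational arithmetic. It computes dim ker T, dim coker T, and their difference for a few values of t; the helper names `rank` and `kernel_cokernel_index` are illustrative and not from the lecture.

```python
from fractions import Fraction

def rank(rows):
    """Rank by Gaussian elimination over the rationals (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        # Find a pivot in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def kernel_cokernel_index(t):
    """(dim ker, dim coker, index) for the map T in (13.5), from R^3 to R^2."""
    T = [[1, 0, 0], [0, t, 0]]
    rk = rank(T)
    dim_ker, dim_coker = 3 - rk, 2 - rk
    return dim_ker, dim_coker, dim_ker - dim_coker
```

For t ≠ 0 this returns kernel dimension 1 and cokernel dimension 0; at t = 0 both jump by one, and the difference stays equal to dim V − dim W = 1.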
We would like to generalize this to the infinite-dimensional setting (where “dimension” doesn’t play as
well). There are different kinds of topology one considers for infinite-dimensional vector spaces; we will mostly
work with Banach spaces, which are vector spaces complete with respect to a given norm. Sometimes we’ll
have to restrict to Hilbert spaces. We will always work over C.
Definition 13.6. Let X and Y be Banach spaces. A continuous linear map T : X → Y is Fredholm if
(1) Im(T ) ⊂ Y is closed,
(2) ker(T ) is finite-dimensional, and
(3) coker(T ) is finite-dimensional.
In this case the index of T is defined to be ind T := dim ker T − dim coker T .
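The first condition is redundant: a continuous linear map between Banach spaces whose cokernel is finite-dimensional automatically has closed image.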
Remark 13.7. There is a sense in which the definition of the index is “the wrong way around”: an element of
Hom(X, Y ) defines an element of Y ⊗ X ∗ , not X ⊗ Y ∗ , so we should do cokernel minus kernel. We’ll stick
with the standard definition, though, and sometimes this sign will come back and surprise us.
Let Hom(X, Y ) denote the space of continuous linear maps, equipped with the operator norm. This
makes it a Banach space, and induces the norm topology. Let Fred(X, Y ) denote the subspace of Fredholm
operators, which is not a vector space.
Theorem 13.8. Suppose X and Y are Hilbert spaces. Then the index ind : Fred(X, Y ) → Z is an isomorphism
on π0 .
This is not in general true for X and Y Banach spaces: consider X infinite-dimensional and Y finite-dimensional.
However, it should be true if there’s an isomorphism X ≅ Y .
Remark 13.9. The higher topology of Fred(X, Y ) is very interesting; each connected component has the
topology of the classifying space of the infinite unitary group.
Definition 13.10. Let T : X → Y be Fredholm and W be a subspace of Y . We say T is transverse to W ,
denoted T ⋔ W , if T (X) + W = Y .

For each finite-dimensional W ⊂ Y , let


(13.11) \cO _W\coloneqq \set {T\in \Fred (X,Y)\mid T\pitchfork W}.
In this case we have (13.4a) again: the sequence

(13.12) \xymatrix { 0\ar [r] & \ker T\ar [r] & T^{-1}(W)\ar [r]^-T & W\ar [r] & \coker T\ar [r] & 0 }
is exact, and all of its terms are finite-dimensional. In particular, using Lemma 13.1,
(13.13) \ind T = \dim T^{-1}(W)-\dim W.
There is a vector bundle V → OW whose fiber at T is T −1 (W ), and there’s a map of vector bundles V → W
which at T is just T .
Lemma 13.14. OW ⊂ Hom(X, Y ) is open.
Therefore we have a canonical open cover of Fred(X, Y ). It’s uncountable, yes, but it’s still nice to have.
Proof. Let T_0 ∈ O_W , and choose a closed complement X′ to T_0^{−1}(W ).28 Then X = T_0^{−1}(W ) ⊕ X′.
There is also a splitting Y = W ⊕ T_0(X′). Why is this?
• First, we have to show that these don’t intersect except at zero. If y ∈ W ∩ T_0(X′), then y = T_0 x for
some x ∈ X′, and x ∈ T_0^{−1}(W ) ∩ X′, so x = 0.
• Second, we want to write y ∈ Y as a sum of a w ∈ W and y′ ∈ T_0(X′). Since T_0 ⋔ W , we can write
y = T_0 x + w_1 with x ∈ X and w_1 ∈ W ; splitting x = x_1 + x′ with x_1 ∈ T_0^{−1}(W ) and x′ ∈ X′ gives
y = (w_1 + T_0 x_1) + T_0 x′, where w_1 + T_0 x_1 ∈ W .
Now, consider the composition
(13.15) \xymatrix { \Hom (X,Y)\ar [r] & \Hom (X',Y)\ar [r] & \Hom (X', T_0(X')) }
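where the first map is restriction and the second is projection along W . Each map is continuous and linear. Invertibility
is an open condition, so the space of isomorphisms X′ → T_0(X′) is open in Hom(X′, T_0(X′)). Thus its
preimage in Hom(X, Y ) is an open neighborhood of T_0. Any T in this preimage is injective on X′ and satisfies
T (X′) + W = Y , so it is Fredholm and transverse to W ; hence the neighborhood is contained in O_W , and O_W is open.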

Corollary 13.16. The index map ind : Fred(X, Y ) → Z is locally constant.


This follows from V → OW being a vector bundle.
Proof of Theorem 13.8. Suppose T_0, T_1 ∈ Fred(X, Y ) have the same index. Choose a finite-dimensional
subspace W of Y transverse to both T_0 and T_1, and let X′ ⊂ X be a closed complement of both T_0^{−1}(W ) and
T_1^{−1}(W ). Why can you do this? Well, you can take the sum T_0^{−1}(W ) + T_1^{−1}(W ) and ask for a complement
to that, then work within that sum, which is finite-dimensional.
Therefore we can write T_0 and T_1 as maps T_0^{−1}(W ) ⊕ X′ → W ⊕ T_0(X′). Then T_0 and T_1 have the block
decompositions

(13.17) T_0 = \begin {pmatrix} T_0|_{T_0^{-1}(W)} & 0\\0 & T_0|_{X'}\end {pmatrix}\qquad \qquad T_1 = \begin {pmatrix} B & *\\{}* & A\end {pmatrix},

where A is invertible. By considering the family

(13.18) t\mapsto \begin {pmatrix} B & t*\\t* & A\end {pmatrix}

and letting t → 0, we just have to think about B and A. Since B has the same rank as T_0|_{T_0^{−1}(W)}, it
suffices to show that the space of invertible operators X′ → X′ is connected.
Lemma 13.19. Let H be a Hilbert space; then GL(H) ⊂ Hom(H, H) is connected.
Proof. If A ∈ GL(H), then A^*A is positive, so it has a square root P , i.e. P^2 = A^*A, and P is also invertible.
Define U by A = U P ; then
(13.20) U^*U = (AP^{-1})^*(AP^{-1}) = P^{-1}A^*AP^{-1} = A^*AP^{-2} = \id .
We can write P = e^B for some unique self-adjoint B, using the spectral theorem for self-adjoint operators,
and can write U = e^{iC} for some self-adjoint C. This follows because the eigenvalues of U are contained in
the unit complex numbers, and the logarithm map from the circle to [0, 2π) is measurable, which suffices for
the spectral theorem.
Anyways, now we have the paths t ↦ e^{tB} and t ↦ e^{itC}, and their product t ↦ e^{itC} e^{tB} is a path in GL(H) from id to A.

28The existence of such a closed complement follows from the Hahn-Banach theorem. If X and Y are Hilbert spaces, then
it’s easier – you can just take the orthogonal complement.

In fact, it is a (harder) theorem of Kuiper that GL(H) is contractible when H is infinite-dimensional.
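The simplest instance of the polar-decomposition argument is H = C, where GL(H) is the nonzero complex numbers, U = e^{iθ} and P = r. The following Python sketch (the function name `path_to_identity` is made up for this illustration) writes down the path t ↦ e^{itθ} e^{t log r} and is only meant to make the shape of the argument concrete, not to replace the operator-theoretic proof.

```python
import cmath
import math

def path_to_identity(a, t):
    """Point at time t on a path in GL_1(C) from 1 (t = 0) to a (t = 1)."""
    if a == 0:
        raise ValueError("a must be invertible")
    r, theta = cmath.polar(a)   # a = r * e^{i theta}: here U = e^{i theta} and P = r
    b = math.log(r)             # P = e^B with B real, the 1x1 case of "self-adjoint"
    # Product of the unitary path e^{i t theta} and the positive path e^{t B}.
    return cmath.exp(1j * t * theta) * math.exp(t * b)
```

Every point on the path has modulus e^{t log r} > 0, so the path never leaves the invertible elements.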


Now one has to argue that π0 is surjective; this first involves producing a Fredholm operator 

Exercise 13.21. Using the local models OW , show that the composition of two Fredholm maps is Fredholm,
and the indices add.
This in particular means that if X = Y , Fred(X, X) has a composition structure, meaning π0 is a monoid
(in fact, it’s a group), and the map to Z is in fact a group isomorphism.
We will occasionally consider a variant.
Definition 13.22. Consider a complex of Banach spaces

\xymatrix { 0\ar [r] & X^0\ar [r]^{T_1} & X^1\ar [r]^{T_2} & \dotsb \ar [r]^{T_n} & X^n\ar [r] & 0, }
so T_q ◦ T_{q−1} = 0. This is a Fredholm complex if each T_q has closed image and the cohomology groups are all
finite-dimensional.
If n = 1, this is equivalent to T_1 : X^0 → X^1 being a Fredholm map. Some of the notions we’ve considered
above generalize to Fredholm complexes.
Remark 13.23. Peering into the proof of Lemma 13.14, T_0 : X′ → T_0(X′) is an isomorphism, so we can let S
be an inverse. Then T_0 ◦ S is not quite equal to the identity, but it differs from the identity by an operator whose
image is finite-dimensional; one says such an operator has finite rank. The same applies to S ◦ T_0.29 Therefore we’ve shown any Fredholm operator is invertible up
to a finite-rank operator, and in fact the converse is true.
The more common version of that result is that T is Fredholm iff it’s invertible up to a compact operator
(i.e. an operator under which the image of the unit ball has compact closure). On a Hilbert space, the compact operators are precisely the
closure of the finite-rank ones in the norm topology, so what we just said is a little bit stronger.
Next we discuss determinants, first in the finite-dimensional setting, then in the Fredholm setting.
Recall that over any field k, the determinant line of a finite-dimensional vector space V is Det V := Λ^{dim V} V .
The determinant line of the zero vector space is therefore canonically k.
If L is a line over k = R or C (i.e. a one-dimensional k-vector space), there’s a notion of an “inverse” for L:
over R, L^{−1} = L^*, because given any nonzero v ∈ L, there’s a unique ℓ ∈ L^* with ℓ(v) = 1. Over C, L−1 = L for a
similar reason.
Definition 13.24. Consider a complex of finite-dimensional vector spaces over R or C:

(13.25a) \xymatrix { 0\ar [r] & V^0\ar [r]^{T_1} &V^1\ar [r]^{T_2} &\dotsb \ar [r]^{T_n} &V^n\ar [r] & 0, }
so in particular T_q ◦ T_{q−1} = 0. Define the determinant of this complex to be

(13.25b) \Det V^\bullet \coloneqq \bigotimes _{q=0}^n (\Det V^q)^{(-1)^q},

and similarly define the determinant of its cohomology to be

(13.25c) \Det H^*(V^\bullet ) \coloneqq \bigotimes _{q=0}^n (\Det H^q(V^\bullet ))^{(-1)^q}.


=
Lemma 13.26. The maps T define an isomorphism det T : Det H ∗ (V • ) → Det V • .
Corollary 13.27. Letting n = 1, given a linear map T : V 0 → V 1 , there is a canonical element det T ∈
Det(V 0 ⊗ (V 1 )∗ ).
This is called the determinant of T , and generalizes the determinant of an endomorphism (the case
V 0 = V 1 ).
29TODO: I think.

Proof sketch of Lemma 13.26. Let C^q be a complement to T_q(V^{q−1}) in ker(T_{q+1}) and let r_q := rank T_q =
dim T_q(V^{q−1}) and b_q := dim H^q(V^•). Now choose:
• v_q ∈ Λ^{r_q} T_q(V^{q−1}) \ 0,
• c_q ∈ Det C^q,
• h_q ∈ Det H^q(V^•), and
• h̃_q a lift of h_q to Λ^{b_q} V^q,
and define

(13.28) \label {detdefn} \det \paren {\prod _{q=0}^n h_q^{(-1)^q}} \coloneqq \prod _{q=0}^n\paren {\widetilde h_q\wedge c_q\wedge v_q}^{(-1)^q}.

It therefore suffices to show this doesn’t depend on choices. 

Exercise 13.29. Check that the construction in (13.28) does not depend on the choices of v_q, c_q, h_q, and h̃_q.
One might hope for a slicker construction, but this seems to be the most straightforward way to do it.
Given a short exact sequence of complexes 0 → V0• → V1• → V2• → 0, the determinants add.
Now let X and Y be Banach spaces as before. We would like to construct a vector bundle Det → Fred(X, Y ).
Fixing a finite-dimensional subspace W ⊂ Y , let Det_W → O_W be Det(T : V → W ), i.e. Det V ⊗ (Det W )^*,
where V → O_W is the vector bundle with V_T := T^{−1}(W ).
Now we want to patch for subspaces W, W′ ⊂ Y with O_W ∩ O_{W′} ≠ ∅. It suffices to assume W′ ⊂ W , since
any two finite-dimensional subspaces of Y are contained within their sum, which is also finite-dimensional.
We then have a diagram of short exact sequences
(13.30) \gathxy { 0\ar [r] & T^{-1}(W')\ar [r]\ar [d]^{T'} & T^{-1}(W)\ar [r]\ar [d]^T & T^{-1}(W)/T^{-1}(W')\ar [r]\ar [d]^{\overline T} & 0\\ 0\ar [r] & W'\ar [r] & W\ar [r] & W/W'\ar [r] & 0. }

Then T̄ is an isomorphism: it is injective because T x ∈ W′ forces x ∈ T^{−1}(W′), and it is surjective because
T ⋔ W′ lets us write any w ∈ W as T x + w′ with w′ ∈ W′, so that x ∈ T^{−1}(W ). So its determinant


(13.31) \det \overline T\in \Det (T^{-1}(W)/T^{-1}(W'))\otimes (\Det W/W')^{-1}

is nonzero. So we can write this as T̄v̄ ⊗ (w̄)^{−1} for some v̄ ∈ T^{−1}(W )/T^{−1}(W′) and w̄ ∈ W/W′. Lift v̄ to
some v ∈ T^{−1}(W ) and w̄ to some w ∈ W . There are v′ ∈ T^{−1}(W′) and w′ ∈ W′ such that
(13.32) \det T' = T'v'\otimes (w')^{-1} \in \Det T^{-1}(W')\otimes (\Det W')^{-1},
and therefore
(13.33) \det T = \frac {Tv\wedge Tv'}{w\wedge w'} \in \Det T^{-1}(W)\otimes (\Det W)^{-1}.

Lecture 14.
Transversality and the obstruction bundle: 3/12/19

“Did I use the word ‘like’? . . . Then I used it as a millennial.”


Let X and Y be smooth manifolds and f : X → Y be smooth. Recall that a y ∈ Y is a regular value if dfx is
surjective for all x ∈ N := f −1 (y). In this case N is a submanifold of X.
In general, df defines a map of vector bundles T X → f ∗ T Y over X; asking for y to be a regular value
means that the cokernel of the restriction
(14.1) \label {cokercpx} \d f\colon TX|_N\longrightarrow f^*TY|_N

is zero. Suppose this is true; then, thinking of (14.1) as a cochain complex, its cohomology is T N in degree 0
and 0 in degree 1.
Example 14.2. Let X = R^3, Y = R, and f (x^1, x^2, x^3) = (x^1)^2 + (x^2)^2 + (x^3)^2. Then y = 1 is a regular
value, which you can see by computing the derivative of f . The preimage is S^2. Since all vector bundles on
R^3 are trivializable but T S^2 isn’t (the nonzero Euler characteristic of S^2 provides an obstruction), T S^2 → S^2
does not extend to a vector bundle on R^3.

So T N → N doesn’t always extend to X, but it does extend as a virtual bundle, meaning a formal difference
of two vector bundles. Concretely, this means there are vector bundles ξ, η → X such that T N ⊕ ξ|_N ≅ η|_N ,
so we say heuristically that “T N is the restriction of η minus ξ.”
First, using the exact sequence

(14.3) \shortexact [][\d f]{TN}{TX|_N}{f^*TY|_N}{}

of vector bundles over N , then taking determinants, Det T N → N is the restriction to N of


(14.4) (\Det TX)\otimes (\Det f^*TY)^{-1}\longrightarrow X.
In Example 14.2, Det T S 2 is trivializable, because S 2 is orientable, but this of course isn’t always the case.

Example 14.5. We can generalize from functions to sections of vector bundles.


Let V be a four-dimensional vector space and H → P(V ) ≅ RP^3 be the Hopf line bundle: a point in P(V )
is a line L ⊂ V through the origin, and we declare the fiber of H at L to be L^*.
There’s a map ϕ : V^* → Γ(P(V ), H) sending θ ↦ (L ↦ θ|_L). Therefore, the assignment f_θ(L) := θ|_L
defines a section s of H → P(V ). If Z denotes the zero section and N := f_θ^{−1}(Z), then for θ ≠ 0, Z and s are
transverse and so N is a submanifold, diffeomorphic to RP^2 ⊂ RP^3. Then Det T N → N is the restriction
of Det T P(V ) ⊗ H^{−1} → P(V ), and it’s nontrivial: RP^2 is nonorientable, and Det T N has nonzero first
Stiefel-Whitney class.

This approach describes finite-dimensional manifolds as cut out by finitely many equations satisfying
transversality. In the infinite-dimensional setting, things will look similar, though we may cut out the solutions
to infinitely many equations.

Remark 14.6. Another common generalization is to consider two maps f : X → Y and f′ : X′ → Y . We want
to take the fiber product of those two maps, the subset of (x, x′) ∈ X × X′ such that f (x) = f′(x′), and we
want it to be a submanifold. A sufficient, but not necessary, condition is for f ⋔ f′, and then the formulas
about tangent bundles generalize. When X′ = pt and f′ : pt ↦ y, transversality is equivalent to asking for y
to be a regular value of f .

Remark 14.7 (The non-transverse setting). If a := dim X and b := dim Y , then in the transverse setting we
expect dim N = a − b. Sometimes y isn’t a regular value, but N is nonetheless still a manifold; in this case
dim N might be larger than expected.
Let’s assume N is a submanifold of X but that y isn’t necessarily a regular value. We have ker dfx = Tx N ,
and hence E := coker df → N is also a vector bundle, called the obstruction bundle.
For example, let φ : S 2 → S 2 be a diffeomorphism and consider the map f : S 2 → S 2 × S 2 sending
p ↦ (p, φ(p)). The intersection with the diagonal would have expected dimension zero, tracking the fixed
points of φ, but of course if φ = id then we get all of S 2 , which has positive dimension. As an exercise, show
that in this case, the obstruction bundle is identified with T S 2 .
In this case, the way to study the intersection is to also keep track of the obstruction bundle. But this
doesn’t work for all non-transverse intersections – consider the map f_a(x) := x^2 − a from R → R. For a > 0,
this is fine, and we get two points as expected, but at a = 0, we only get a single point. Differential geometry
can’t really tell that you were supposed to get two points there, but algebraic methods can: consider the
graph of fa in R × R, and intersect it with the graph y = 0.
In algebraic geometry, we do this by considering rings of functions. The functions on ℝ × ℝ are R := ℝ[x, y];
the algebra of functions on Z := {y = 0} is M_Z := ℝ[x, y]/(y), and the algebra of functions on P := {y = x^2}
is M_P := ℝ[x, y]/(y − x^2).
The intersection is a pullback, so its algebra of functions is a pushout, which is given by
the tensor product
(14.8) M_P\otimes _R M_Z.
(In more elaborate settings it can be better to consider derived terms as well, which here are factors of Tor.)
To compute this, let’s take a resolution of M_Z by free R-modules:

(14.9) \xymatrix { 0 & \R [x,y]\ar [l] & \R [x,y]\ar [l]_-y & 0\ar [l]. }

Multiplication by y is injective, so this is fine, and the cohomology of this complex has M_Z in degree 0 and
vanishes elsewhere. Now, tensor with M_P :

(14.10) \xymatrix { 0 & M_P\ar [l] & M_P\ar [l]_-y & 0\ar [l]. }
Now, the zeroth cohomology is a two-dimensional vector space, and in this sense we’ve remembered that the
intersection is of multiplicity 2.
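Concretely, M_P ⊗_R M_Z ≅ ℝ[x, y]/(y − x^2, y) ≅ ℝ[x]/(x^2), with basis {1, x}. The Python sketch below reduces polynomials in x modulo x^2 − a (so a = 0 is the tangent case above); the function name `reduce_mod` is a hypothetical helper for this illustration.

```python
def reduce_mod(coeffs, a):
    """Reduce c0 + c1*x + c2*x^2 + ... modulo x^2 - a.

    Returns the pair (c0, c1) representing c0 + c1*x in the quotient,
    which always has basis {1, x}, i.e. dimension 2.
    """
    c0, c1 = 0, 0
    power = (1, 0)                        # current power of x, as (constant, linear) part
    for c in coeffs:
        c0 += c * power[0]
        c1 += c * power[1]
        power = (a * power[1], power[0])  # multiply by x, using x^2 = a
    return (c0, c1)
```

At a = 0 the class of x is nonzero (`reduce_mod([0, 1], 0)` is `(0, 1)`) but squares to zero (`reduce_mod([0, 0, 1], 0)` is `(0, 0)`): the quotient is still two-dimensional even though there is only one geometric point, which is the multiplicity-2 statement.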
We can also consider this story in families. A “family of equations” is a smooth manifold S and a map
f : S × X → Y ; we think of fixing the parameter s ∈ S to obtain an equation x ↦ f_s(x) := f (s, x).
Lemma 14.11. Let y ∈ Y and N := f −1 (y) ⊂ S × X. Let π : N → S be the restriction of projection onto
the first factor S × X → S. Then s is a regular value of π iff y is a regular value of fs .
Sard’s theorem tells us that regular values in Y form a subset of the second category, meaning in particular
that it’s dense, and therefore nonempty. So there are regular values. The takeaway is that if we start with
a nontransverse map and add enough parameters, we can adjust the parameters a little bit and obtain a
transverse map. Though algebraic methods are better at directly dealing with nontransverse situations, they
don’t allow this kind of argument; each perspective is useful in a different way.
Now we throw in symmetries. Suppose G is a Lie group acting on X and Y and that f : X → Y is
G-equivariant. If y ∈ Y is a fixed point of G, then G acts on T X and f ∗ T Y , and df : T X → f ∗ T Y is
G-equivariant. There’s also a Lie algebra action here: at each x ∈ X, there’s a map g → T_x X: a ξ ∈ g
defines a left-invariant vector field on G, and hence a curve t ↦ e^{tξ} in G through the identity. Acting by
these elements defines a curve in X, and we can differentiate at t = 0 to obtain a tangent vector at x. As x
varies, this stitches into a vector bundle map, and so altogether we have a complex

(14.12) \xymatrix { \underline \fg \ar [r]^-\alpha & TX\ar [r]^-{\d f} & f^*TY. }

in degrees −1, 0, and 1, and its cohomology is H^{−1} = ker α, H^0 = T N/ Im α, and H^1 = E, the obstruction
bundle. However, ker α isn’t always a vector bundle – it tells us the Lie algebra of the isotropy groups at
each point, but we don’t yet know that these are always the same dimension. If they are, then N/G is a
smooth orbifold, and T N/ Im(α) = T (N/G). One way to force this would be to ask for all isotropy groups
to be discrete, in which case ker α = 0. So H^1 is non-transversality, H^{−1} is isotropy groups, and H^0 is the
tangent space.

We would like to generalize all of this to infinite dimensions. Recall that if V and W are Banach spaces and
T : V → W is a continuous linear map, then T is Fredholm if its kernel and cokernel are finite-dimensional.
Definition 14.13. Let A and B be affine spaces modeled on the Banach spaces V and W , respectively, and
let U ⊂ A be open. A C 1 map (not necessarily linear) f : U → B is Fredholm if dfp : V → W is Fredholm for
all p ∈ U .
In this case, df defines a map U → Fred(V, W ): it’s a family of linear Fredholm operators parameterized
by U , and this is the infinitesimal information contained in f .
Remark 14.14. One can generalize to C 1 maps between Banach manifolds. The basic theory of Banach
manifolds is set up similarly to the finite-dimensional setting that may be more familiar, at least once you
have the inverse function theorem.
Theorem 14.15. Let y ∈ B be a regular value of a Fredholm map f : U → B. Then f −1 (y) ⊂ U is a
finite-dimensional submanifold, with dimension ind dfx .
Now U might not be connected, so the index of df_x need only be locally constant, since it’s a surjective
map Fred(V, W ) → Z≥0 , at least if V and W are infinite-dimensional. So f^{−1}(y) may have components of
different dimensions.

Proof. Let x_0 ∈ f^{−1}(y) and T_0 = df_{x_0} : V → W . If V′ ⊂ V is a complementary subspace to ker(T_0), then as
we discussed last time, T_0|_{V′} : V′ → T_0(V′) is invertible. Let W′ be a complement to T_0(V′).

Let’s introduce a local change of coordinates ϕ around x0 such that


(14.16) (f\circ \vp )(x_0 + \xi ' + \eta ) = y + T_0(\xi ') + \phi (\xi ',\eta ),
where ξ′ ∈ V′, η ∈ ker(T_0), and φ is a map V → W′. (To do this, we need to know the inverse function
theorem.) The point is, in the direction corresponding to V′, f ◦ ϕ is just T_0, and in the complementary
direction, it’s something else, which might be nonlinear. Define g : ker(T_0) → W′ to send η ↦ φ(0, η); we’ve
just shown a bijection, near x_0, between f^{−1}(y) and g^{−1}(0).
We haven’t yet used that y is a regular value, so as a bonus we learn that we can locally model the inverse
image of a Fredholm map as the zero set of a map between finite-dimensional vector spaces. Since y is a regular value of
f , then 0 is a regular value of g, and therefore the component of f^{−1}(y) containing x_0 is a manifold of the
expected dimension, which is ind T_0 = dim ker(T_0), since the cokernel vanishes.

This is an echo of the techniques we used last time: we used the inverse function theorem to write a
nonlinear Fredholm map locally as a nonlinear finite-dimensional map and a linear invertible map.
There’s also an analogue of Sard’s theorem in this setting, called the Sard-Smale theorem; next time, we’ll
see how to use this to construct moduli spaces.

Lecture 15.
Constructing moduli spaces: 3/14/19

We continue on our journey to the construction of moduli spaces of solutions to (certain nice) PDEs. One
potential roadblock is that we haven’t discussed Sobolev spaces in this class, but we’ll carry on nonetheless.
Recall that if A and B are affine spaces modeled on the Banach spaces V and W respectively and U ⊂ A
is open, we defined what it means for a C 1 map f : U → B, not necessarily linear, to be Fredholm: that
dfp : V → W is Fredholm for all p ∈ U . Thus we can also make sense of this definition for Banach manifolds
locally modeled on V and W .
The crucial lemma about Fredholm maps is that, after a possibly nonlinear change of coordinates around
any p_0 ∈ U , we can rewrite f as “linear plus finite-dimensional.” More precisely, let T_0 = df |_{p_0} : V → W , and
choose complements V_0 of ker(T_0) ⊂ V and W′ of T_0(V_0) ⊂ W , so that f is a map ker(T_0) ⊕ V_0 → W′ ⊕ T_0(V_0).
The inverse function theorem for Banach spaces then tells us there’s a δ > 0 and a nonlinear function
φ : B_δ(0) ⊂ V → W′ such that
(15.1) f(\eta , \xi _0) = (\phi (\eta , \xi _0), T_0(\xi _0)).
Because f is Fredholm, W′ is finite-dimensional. This will greatly simplify some of what we’re going to do:
instead of worrying about an infinite system of equations, we can reduce to considering only finitely many.
Remark 15.2. Suppose Γ is a Lie group acting on U and B, perhaps nonlinearly, and suppose f is Γ-equivariant
and p_0 is a fixed point for Γ. Then, differentiating, we get a linear action on V and W , and then running
through the construction above, T_0 and φ are equivariant for these linear actions.
Corollary 15.3.
(1) If q ∈ B is a regular value, then f −1 (q) ⊂ U is a finite-dimensional submanifold.
(2) The Sard-Smale theorem: regular values are a Baire set (second category), which in particular means
there are a lot of them.
Now we will use this theory to construct moduli spaces – first, the moduli space of pseudoholomorphic curves
in symplectic topology, and then a moduli space in gauge theory, which has a Γ-action as in Remark 15.2.

Pseudoholomorphic curves. Let Σ be a Riemann surface, which is an oriented 2-manifold with a conformal
structure, or equivalently a manifold of complex dimension 1. Let j ∈ End(T Σ) denote the almost complex
structure induced from the complex structure, meaning j is multiplication by i.
Let (X, J) be an almost complex manifold, meaning that J ∈ End(T X) squares to −id.
Definition 15.4. A map φ : Σ → X is pseudoholomorphic (or J-holomorphic) if dφ ◦ j = J ◦ dφ.
In a little more detail, at any σ ∈ Σ, dφσ is a map Tσ Σ → Tφ(σ) X; j acts on the domain and J on the
codomain, and we want dφσ to intertwine those actions.

The space Hom(Tσ Σ, Tφ(σ) X) splits as

(15.5) \label {homsplit} \Hom (T_\sigma \Sigma , T_{\phi (\sigma )}X) = \Hom ^+(T_\sigma \Sigma , T_{\phi (\sigma )}X)\oplus \Hom ^-(T_\sigma \Sigma , T_{\phi (\sigma )}X),

the spaces which commute and anticommute with J and j, respectively. Alternatively, there is an involution
on this space sending L ↦ J ◦ L ◦ j^{−1}, and Hom^+ is the eigenspace for 1, and Hom^− the eigenspace for −1.30
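To see the splitting (15.5) in the simplest case, take a single tangent space with T_σΣ = T_{φ(σ)}X = R^2 = C and j = J = multiplication by i. The Python sketch below (the names `involution` and `project` are illustrative, not from the lecture) implements the projections (L ± J L j^{−1})/2 onto the two eigenspaces of the involution.

```python
J = [[0, -1], [1, 0]]   # multiplication by i on R^2 = C; assumed here for both j and J

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def involution(L):
    """L -> J L j^{-1}, using j = J and j^{-1} = -J."""
    minus_J = [[-x for x in row] for row in J]
    return matmul(matmul(J, L), minus_J)

def project(L, sign):
    """Projection onto Hom^+ (sign = +1) or Hom^- (sign = -1)."""
    s = involution(L)
    return [[(L[i][j] + sign * s[i][j]) / 2 for j in range(2)] for i in range(2)]
```

A matrix of Cauchy-Riemann type such as [[2, -3], [3, 2]] (multiplication by 2 + 3i) has vanishing Hom^− part, while complex conjugation [[1, 0], [0, -1]] lies entirely in Hom^−.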

Definition 15.6. Let π^− : Hom(T_σΣ, T_{φ(σ)}X) → Hom^−(T_σΣ, T_{φ(σ)}X) be the projection. Then we define
∂̄φ = ∂̄_J φ := π^− ◦ dφ.

Where does this live? Well, dφ ∈ Ω^1_Σ(φ^*T X), and this splits like in (15.5):

(15.7) \Omega _\Sigma ^1(\phi ^*TX) = \Omega _\Sigma ^1(\phi ^*TX)^+\oplus \Omega _\Sigma ^1(\phi ^*TX)^-,

and ∂̄ is projection onto Ω^1_Σ(φ^*T X)^−. Thus ∂̄φ = 0 iff φ is pseudoholomorphic.

We would like to study the space of solutions to the equation ∂̄φ = 0 on the space of such φ. First we
need to write a function f : φ ↦ ∂̄_J φ. The domain is Map(Σ, X), but the codomain a priori depends on φ:
for φ given it lands in Ω^1_Σ(φ^*T X)^−. So f is instead a section of a vector bundle E → Map(Σ, X), where
E_φ := Ω^1_Σ(φ^*T X)^−.

Remark 15.8. There is a bit of nuance involved in constructing E → Map(Σ, X). As with anything involving
infinite-dimensional manifolds, there’s the question of what kind of regularity we want; if we just consider
smooth functions, we’ll obtain manifolds modeled on Fréchet spaces, which are weaker than Banach spaces.
Calculus in Fréchet spaces is generally more difficult than in Banach spaces, so it’s often better to complete
in some other norm which leads to a Hilbert manifold, or something like that.
But if you just want to see why E → Map(Σ, X) is a vector bundle, you don’t really need to worry about
that. One way to produce local trivializations of E → Map(Σ, X) is to use a connection on T X to identify
different fibers nearby using parallel transport, but all approaches will involve thinking about what nearby
points in Map(Σ, X) are.

Given a covariant derivative ∇ on X, we can hit f with it, obtaining a map ∇f_φ : T_φ(Map(Σ, X)) → E_φ ,
i.e. a map Ω^0_Σ(φ^*T X) → Ω^1_Σ(φ^*T X)^−. We can write this as a map ξ ↦ (∇ξ)^−.

Claim 15.9. ∇f_φ(ξ) = π^−(∇ξ), and this operator is Fredholm.

The first claim is not so surprising: if you differentiate the ∂̄_J operator, you get almost the same thing
applied to ξ. The second claim is much deeper: it involves not only a significant chunk of elliptic theory,
but also replacing Map(Σ, X) and E with a Hilbert manifold and Hilbert bundle, respectively, by taking
completions with respect to some norm – “thickening” them, in a sense. So “this” in the claim
isn’t entirely right; instead we replace it with its thickened version.
Once we know it’s Fredholm, which we’ll discuss later in a more general setting, we’d like to compute the
index of this operator. One reasonable choice for calculating is the Riemann-Roch theorem, but as φ isn’t
holomorphic, it doesn’t apply, and instead one has to use the more general Atiyah-Singer index theorem;
indeed, this was one of the first successes of this more general theorem.
Then, however, we have to worry about whether 0 is a regular value for f . In fact, maybe it isn’t – but we
can get around this using the Sard-Smale theorem: almost everything is a regular value, so if we can embed
in a family of moduli problems (concretely, writing the equation with parameters), then we have a better
chance of establishing transversality. One option is to vary the almost complex structure J on X, and another
is to consider ∂̄_J φ = ν for ν ≠ 0, where ν is a section of the vector bundle V → Σ × X whose fiber at (σ, x)
is Hom^−(T_σΣ, T_x X). Thus we have a section f : Map(Σ, X) × Γ(V ) → E, and it turns out that just varying
ν is enough to establish transversality. Because Γ(V ) is a linear space, the calculations are a little easier. The
resulting moduli space isn’t quite a moduli space of pseudoholomorphic curves, so if you want topological invariants, it’s important
to check that what you get doesn’t depend on ν for generic ν. If you wanted to let ν = 0 even when you
don’t have transversality, you might have to work with the derived geometry, as we discussed last time.

30 These eigenspaces have the same dimension, which you can see by replacing J with −J.

The moduli space of self-dual connections. Unlike the previous case, there’s a group acting on
connections, and we want solutions up to equivalence.
Specifically, let G be a Lie group and M be a smooth manifold. We want to study principal bundles on M
modulo those which are “the same.”
Definition 15.10. If π : P → M and π′ : P′ → M are principal G-bundles, a morphism from π to π′ is a
smooth, G-equivariant map ϕ : P → P′ commuting with the maps down to the base; that is, π′ ◦ ϕ = π. If
P = P′, we call ϕ a gauge transformation.
As it turns out, this definition forces every morphism to be an isomorphism. Therefore the category
BunG (M ) of principal G-bundles on M is a groupoid. This is the thing which is local, not its set of
isomorphism classes: given intersecting opens U, V ⊂ M , we can’t glue isomorphism classes of bundles on
U and V (how do we work with equivalence classes nicely?), but given two principal bundles P → U and
P′ → V and an isomorphism P|_{U∩V} ≅ P′|_{U∩V}, we can glue.
We’d like to study these gauge transformations ϕ : P → P. From ϕ we can extract the data of a map
gϕ : P → G: ϕ(p) is in the same fiber as p, so it’s equal to p · g for a unique g ∈ G, and we let gϕ(p) := g.
This is equivariant for G acting on itself by conjugation:
(15.11) g_\varphi(p\cdot h) = h^{-1}g_\varphi(p)h.
Moreover, from gϕ we can recover ϕ, and g(·) sends composition to pointwise multiplication. So the group
of gauge transformations is the group of such gϕ, which is clearly infinite-dimensional (if G isn’t
zero-dimensional). Concretely, this is the space of sections of the associated fiber bundle GP → M given by
mixing P with the G-manifold G with action ρ : G → Aut(G) sending h ↦ (g ↦ hgh^{-1}); that is, the group
of gauge transformations is the space of sections C∞(M; GP) of this bundle of groups.
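As a sanity check on (15.11), here is the short computation behind it, written for the right G-action on P with the convention ϕ(p) = p · gϕ(p). The second gauge transformation ψ is a hypothetical name introduced only to illustrate the composition rule.

```latex
\begin{align*}
\varphi(p\cdot h) &= (p\cdot h)\cdot g_\varphi(p\cdot h), \\
\varphi(p\cdot h) &= \varphi(p)\cdot h = p\cdot g_\varphi(p)\,h
  && \text{($G$-equivariance of $\varphi$)}, \\
\Longrightarrow\quad h\,g_\varphi(p\cdot h) &= g_\varphi(p)\,h
  && \text{($G$ acts freely on each fiber)}, \\
\Longrightarrow\quad g_\varphi(p\cdot h) &= h^{-1}g_\varphi(p)\,h. \\[4pt]
(\varphi\circ\psi)(p) &= \varphi\bigl(p\cdot g_\psi(p)\bigr)
  = \varphi(p)\cdot g_\psi(p)
  = p\cdot g_\varphi(p)\,g_\psi(p).
\end{align*}
```

The last line says that the map attached to ϕ ◦ ψ is the pointwise product of the maps attached to ϕ and ψ.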
Since GP is (some suitable infinite-dimensional version of) a Lie group, it has a Lie algebra, and we
should determine what it is. Given a curve in GP, i.e. a path of sections, if we differentiate it at the
identity we obtain a vertical vector field on P, so the Lie algebra of GP is contained in the algebra of
vertical vector fields on P → M. Specifically, we know how this transforms under G, so more precisely it’s
the algebra of G-invariant vertical vector fields on P → M. If this is confusing, it may be helpful to think
of this as an infinitesimal gauge symmetry: vertical because it preserves the fibers, and G-invariant
because gauge symmetries are G-equivariant.
We can then describe the Lie algebra of GP by differentiating gϕ: we want maps ζ : P → g such that
(15.12) \zeta(p\cdot h) = \mathrm{Ad}_{h^{-1}}\zeta(p).
In other words, ζ is a section of the bundle gP → M associated to the adjoint action λ : G → Aut(g)
sending h ↦ Ad_h; that is, ζ ∈ Ω^0_M(gP). This gP is more than a vector bundle: the adjoint action
preserves the Lie bracket, so this is a bundle of Lie algebras. This description makes the fact that these
came from infinitesimal symmetries less clear, but is easier to compute with.
Remark 15.13. So far, we’ve been doing everything with smooth maps, smooth sections, etc., and therefore
obtained Fréchet manifolds. As before, we’ll have to thicken to Hilbert spaces to make calculations more
tractable.
If AP denotes the space of connections on π : P → M (i.e. G-invariant horizontal distributions), there is a
right GP -action on AP , because a gauge transformation pulls back a horizontal distribution to a horizontal
distribution.
Exercise 15.14. Recall that a connection is equivalent to its connection form A ∈ Ω^1_P(g), which satisfies
R_h^*A = Ad_{h^{-1}}(A) and restricts to the Maurer-Cartan form on each fiber. Describe the GP-action in
these terms.

Lecture 16.
Gauge transformations: 3/26/19

“Can I trust you to prove it, or should I? Or do you want Arun to prove it in the notes?”

Today we continue discussing gauge transformations, illustrating an example of the construction of a moduli
space where one has to divide out by a symmetry group.
Let G be a Lie group and π : P → M be a principal G-bundle. Last time, we defined GP to be the group of
G-equivariant diffeomorphisms ϕ : P → P commuting with the projection map back onto M . If ϕ ∈ GP , then
ϕ sends fibers to fibers, so there’s a function gϕ : P → G defined to satisfy
(16.1a) \vp (p) = p\cdot g_\vp (p),
and G-equivariance of ϕ implies
(16.1b) g_\varphi(p\cdot h) = h^{-1}g_\varphi(p)h.
Hence we can descend gϕ to a section of the associated bundle of groups31 GP → M . That is, a gauge
transformation is a section of an associated bundle of groups.
We’d like to have a Lie algebra of “infinitesimal gauge transformations,” but quickly we run into a problem:
we would need to make sense of GP as an infinite-dimensional Lie group. This is easier if you take Sobolev
completions; we’ll begin discussing the general theory of this approach next time.32 Today, we’ll use a more
ad hoc approach, allowing us to finish the construction of a moduli space of a problem with symmetries.
Our ad hoc approach to the Lie algebra of infinitesimal gauge transformations will be to consider a path
of sections of GP → M . The derivative of such a path at t = 0 is a G-invariant vertical vector field on P .
We’ll eventually see that all G-invariant vertical vector fields on P arise in this way, so this will be our Lie algebra.
Given a G-invariant vertical vector field on P , we obtain a function ξ : P → g: the G-invariant vertical
vector fields on each fiber of P are identified with g, since the fiber is a G-torsor. But G-invariance imposes
additional constraints: specifically, if g_t := e^{tξ} : P → G, then
(16.2a) g_t(p\cdot h) = h^{-1} g_t(p)h,
and differentiating at t = 0,
(16.2b) \xi (p\cdot h) = \Ad _{h^{-1}}\xi (p).
So ξ descends to a section of the associated bundle of Lie algebras gP → M . In other words, the Lie algebra
of G-invariant vertical vector fields (hence also of infinitesimal gauge transformations) is Ω0M (gP ). The Lie
bracket is pointwise.
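The passage from (16.2a) to (16.2b), and the claim that the pointwise bracket is well defined, can be spelled out as follows. This is a sketch for a matrix group, where Ad_h ξ = hξh^{-1}; the second equivariant map η is a hypothetical name introduced for illustration.

```latex
\begin{align*}
\xi(p\cdot h)
  &= \frac{d}{dt}\Big|_{t=0} g_t(p\cdot h)
   = \frac{d}{dt}\Big|_{t=0} h^{-1}e^{t\xi(p)}h
   = h^{-1}\xi(p)\,h
   = \mathrm{Ad}_{h^{-1}}\xi(p), \\
[\xi,\eta](p\cdot h)
  &= \bigl[\mathrm{Ad}_{h^{-1}}\xi(p),\,\mathrm{Ad}_{h^{-1}}\eta(p)\bigr]
   = \mathrm{Ad}_{h^{-1}}\,[\xi(p),\eta(p)].
\end{align*}
```

The second line uses that Ad preserves the Lie bracket, so the pointwise bracket of two equivariant maps is again equivariant.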
Now let’s talk about connections: how do gauge transformations and infinitesimal gauge transformations
act on connections? Let θ ∈ Ω1G (g) be the Maurer-Cartan form: its value on a tangent vector v is the unique
left-invariant vector field on G extending v. This form satisfies the equation

(16.3) \d \theta + \frac 12 [\theta \wedge \theta ] = 0.

The connection forms AP ⊂ Ω^1_P(g) are those A satisfying the equations
(16.4a) R_h^*A = \mathrm{Ad}_{h^{-1}}A
(16.4b) i_m^*A = \theta,
for any h ∈ G and m ∈ M, where i_m : P_m ↪ P is the inclusion of the fiber. Because (16.4a) is linear and
(16.4b) is affine, we immediately see that the space of solutions, i.e. the space of connections, is an affine space.
Exercise 16.5. Show that if ϕ ∈ GP and A ∈ AP, then ϕ^*A ∈ AP. This amounts to checking that ϕ^*A
satisfies (16.4a) and (16.4b). For (16.4a), it suffices to apply ϕ^* to both sides: because ϕ is G-equivariant,
ϕ^* commutes with R_h^* and Ad_{h^{-1}}. The proof for (16.4b) is similar, though you also have to check that the
Maurer-Cartan form is invariant. This is ultimately because the vector field it gives you is left-invariant.
So GP acts on AP . Our next step will be to write down an explicit formula for this action. Then we’ll
discuss orbits, stabilizer groups, etc. First the formula.

31 This is not the same as a principal bundle: it is a locally trivial family of groups. In particular, there is a section given by
the identity element.
32 In the context of similar moduli-theoretic problems, controlling this kind of analytic obstacle is one of the many things
Karen Uhlenbeck got the Abel prize for.

Proposition 16.6. Let ϕ ∈ GP and g := gϕ : P → G. Then


(16.7) \vp ^*A = \Ad _{g^{-1}}A + g^*\theta .
Moreover, if G is a matrix group, the adjoint action is conjugation and the Maurer-Cartan form is g^{-1} dg, so
(16.8) \varphi^*A = g^{-1}Ag + g^{-1}\,dg.
Proof. Let p_t be a parameterized curve in P with p_0 = p and ṗ_0 = η ∈ T_pP.

To be continued. . .
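One consistency check on (16.8), for a matrix group: pulling back by two gauge transformations in succession should agree with pulling back by the single gauge transformation whose map to G is the pointwise product. Here g_1, g_2 : P → G are hypothetical names for the maps attached to two gauge transformations. The computation is purely algebraic and is not a substitute for the proof.

```latex
\begin{align*}
(g_1g_2)^{-1}A\,(g_1g_2) + (g_1g_2)^{-1}\,d(g_1g_2)
 &= g_2^{-1}g_1^{-1}A\,g_1g_2
    + g_2^{-1}g_1^{-1}\bigl(dg_1\,g_2 + g_1\,dg_2\bigr) \\
 &= g_2^{-1}\bigl(g_1^{-1}A\,g_1 + g_1^{-1}\,dg_1\bigr)g_2 + g_2^{-1}\,dg_2 .
\end{align*}
```

The right-hand side is formula (16.8) for g_2 applied to the connection g_1^{-1}Ag_1 + g_1^{-1} dg_1, which is what one expects from a right action of the gauge group on connections.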
Lecture 17.
: 3/28/19

Lecture 18.
More Sobolev spaces and a spectral theorem: 4/2/19

Lecture 19.
Dirac operators: 4/4/19

“It’s my job. . . chief refresher.”


Last time, we stated and mostly proved the spectral theorem for positive compact self-adjoint operators: if V
is a separable Hilbert space and T : V → V is positive, compact, and self-adjoint, then T is diagonalizable:
there’s an orthonormal basis (en ) such that for all n, T en = λn en , where λn → 0 as n → ∞.
To fill in the missing step, let S(V ) denote the unit sphere in V and Q : S(V ) → R denote the quadratic
form ξ ↦ ⟨ξ, Tξ⟩. (That this is a quadratic form at all uses that T is positive and self-adjoint.) Let
λ1 := sup_{ξ∈S(V)} Q(ξ). Thus there is a sequence (ξn) ⊂ S(V) such that ⟨ξn, Tξn⟩ → λ1.
Let P := λ1 − T; then ⟨ξn, Pξn⟩ → 0. The pairing
(19.1) \xi',\xi''\mapsto \langle\xi', P\xi''\rangle

is an inner product on (ker P)^⊥, and applying Cauchy-Schwarz,
(19.2) \lvert\langle\xi', P\xi''\rangle\rvert \le \langle\xi', P\xi'\rangle^{1/2}\langle\xi'', P\xi''\rangle^{1/2}.
Applying this when ξ′ = ξ and ξ″ = Pξ, we have
(19.3a) \lVert P\xi\rVert^2 \le \langle\xi, P\xi\rangle^{1/2}\langle P\xi, P^2\xi\rangle^{1/2}
(19.3b) \le \langle\xi, P\xi\rangle^{1/2}\lVert P\rVert^{1/2}\lVert P\xi\rVert
(19.3c) \lVert P\xi\rVert \le \langle\xi, P\xi\rangle^{1/2}\lVert P\rVert^{1/2}.
Plugging in ξn, we get
(19.4) \lVert P\xi_n\rVert \le \langle\xi_n, P\xi_n\rangle^{1/2}\lVert P\rVert^{1/2}\longrightarrow 0,
so Pξn → 0. We haven’t yet used that T is compact, but now let’s use compactness to extract a subsequence
(ξ_{n_i}) such that Tξ_{n_i} → η for some η. Since Pξ_{n_i} → 0, then λ1ξ_{n_i} − Tξ_{n_i} → 0, and therefore
λ1ξ_{n_i} → η. That is, λ1Tξ_{n_i} converges both to Tη and to λ1η, so Tη = λ1η, i.e. η is an eigenvector for T
with eigenvalue λ1 (note that ‖η‖ = λ1, so η ≠ 0 as long as T ≠ 0); then we can set
e1 := η/‖η‖. This fixes the hole in the proof.

Today we’ll discuss how to use this functional analysis to study Dirac operators and elliptic theory, with
the eventual goal of studying moduli spaces. Dirac operators are very general, appearing in both physics and
mathematics.
Let M be an n-dimensional compact Riemannian manifold; we’d like to discuss some differential operators
on M . The metric gives us the Levi-Civita connection on the frame bundle π : BO (M ) → M , and a covariant
derivative on all associated vector bundles. There is a global horizontal framing of BO (M ): for i = 1, . . . , n,

we have a horizontal vector field ∂_i on BO(M): a point of BO(M) is an isomorphism b : R^n → T_{π(b)}M, and we
let ∂_i(b) be the horizontal lift of b(e_i), where e_i is the ith basis vector.
Now suppose 𝕍 is a vector space and ρ : O_n → Aut(𝕍) is a representation. Let V → M denote the
associated vector bundle. A section ψ : M → V is equivalent data to an O_n-equivariant map ψ : BO(M) → 𝕍,
i.e. such that R_g^*ψ = ρ(g)^{-1}ψ for all g ∈ O_n. The covariant derivative of ψ is the section of V ⊗ T^*M
associated to the O_n-equivariant map
(19.5) \nabla\psi = \partial_k\psi\otimes e^k\colon \mathcal B_{\mathrm O}(M)\longrightarrow \mathbb V\otimes(\mathbb R^n)^*.
You can generalize a lot of this discussion to arbitrary connections, but the above formula only holds when
the connection is torsion-free.
Now suppose 𝔼 and 𝔽 are vector spaces with O_n-actions, with associated vector bundles E → M and
F → M, respectively. Given an equivariant map σ : (R^n)^* → Hom(𝔼, 𝔽), we get a differential operator
Dσ : Γ(E) → Γ(F) by the formula
(19.6) D_\sigma(\psi)\coloneqq \sigma(e^k)\partial_k\psi.
Example 19.7. Suppose 𝔼 = 𝔽 = Λ^•(R^n)^* and σ(e^k) := ε(e^k) (meaning exterior multiplication). Then we
recover the de Rham differential: d = ε(e^k)∂_k.
Now assume M is spin, so that we have a bundle of spin frames BSpin(M) → M, which is a principal
Spin_n-bundle. Let 𝕊 = 𝕊^0 ⊕ 𝕊^1 be a Z/2-graded Cℓ((R^n)^*)-module and S denote the associated Z/2-graded
vector bundle (this construction does use the spin structure). Suppose we have a Clifford
module map c : (R^n)^* → End^1(𝕊), meaning
(19.8) c(e^k)c(e^\ell) + c(e^\ell)c(e^k) = -2\delta^{k\ell}.
Then we get an associated Dirac operator D : Γ(S) → Γ(S) given by
(19.9) D\psi \coloneqq c(e^k)\partial _k\psi .
This is an odd operator.
Remark 19.10. One common extension of this is to consider a Lie group G (often O_r or U_r) acting on a
vector space 𝕍. Let P → M be a principal G-bundle with connection Θ. Then BSpin(M) ×_M P → M
is a principal (Spin_n × G)-bundle, and there is a Dirac operator acting on (Spin_n × G)-equivariant maps
ψ : BSpin(M) ×_M P → 𝕊 ⊗ 𝕍 given by
(19.11) D = (c(e^k)\otimes\mathrm{id}_{\mathbb V})\partial_k.
Now let’s compute D2 . We can and will do this in coordinates.
\begin {aligned} D^2 &= c(e^k)\partial _k c(e^\ell )\partial _\ell \\ &= c(e^k)c(e^\ell )\partial _k\partial _\ell \\ &= -\sum _{k=1}^n \partial _k^2 + \sum _{k < \ell } c(e^k)c(e^\ell ) [\partial _k, \partial _\ell ]. \end {aligned}

(19.12)
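The last step of (19.12) packs in two facts, spelled out here under the sign convention c(e^k)^2 = −1 and c(e^k)c(e^ℓ) = −c(e^ℓ)c(e^k) for k ≠ ℓ (the convention of (20.2), and the one that produces the minus sign in the first term). The c(e^ℓ) are constant endomorphisms of the model vector space, so they commute with the ∂_k.

```latex
\begin{align*}
\sum_{k,\ell} c(e^k)c(e^\ell)\,\partial_k\partial_\ell
 &= \sum_k c(e^k)^2\,\partial_k^2
    + \sum_{k<\ell}\bigl(c(e^k)c(e^\ell)\,\partial_k\partial_\ell
                        + c(e^\ell)c(e^k)\,\partial_\ell\partial_k\bigr) \\
 &= -\sum_k \partial_k^2
    + \sum_{k<\ell} c(e^k)c(e^\ell)\,
      \bigl(\partial_k\partial_\ell - \partial_\ell\partial_k\bigr).
\end{align*}
```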

Now recall that the first term is the Laplacian:

(19.13) - \sum _{k=1}^n \partial _k^2 = \nabla ^*\nabla ,

and the commutator appearing in the second term is


(19.14) [\partial _k, \partial _\ell ] = -\Gamma _{k\ell }^i\partial _i - R_{jk\ell }^i Z_i^j.
The first term is the torsion, so it vanishes, because the Levi-Civita connection is torsion-free. The second
term comes from the curvature, and we can’t quite get rid of it. So we have some operator F built out of the
curvature such that
(19.15) \sum_{k<\ell} c(e^k)c(e^\ell)[\partial_k,\partial_\ell]\psi = F\cdot\psi.

In summary, we’ve proved:



Theorem 19.16 (Weitzenböck formula). D2 = ∇∗ ∇ + F .


Determining the precise formula for F is a good exercise in Riemannian geometry, and it is a very useful
thing to have around; but for today we won’t need it.
Elliptic theory will apply to any operator D for which D^2 = ∇^*∇ + F for some F.
Example 19.17. Let d^* denote the adjoint of the de Rham differential and ∆ := (d + d^*)^2 = dd^* + d^*d be
the Hodge Laplacian. The Laplacian is the square of a Dirac operator: consider the operator d + d^* on
Ω^odd_M ⊕ Ω^even_M, which is associated to the vector spaces 𝕊^0 := Λ^odd(R^n)^* and 𝕊^1 := Λ^even(R^n)^*, and
the maps σ(e^k), which exchange 𝕊^0 and 𝕊^1, given by

(19.18) \sigma(e^k) = \epsilon(e^k) - \iota(e^k),

where ε denotes exterior multiplication and ι denotes interior multiplication (contraction). This satisfies the
Clifford relations, and we get out the Laplacian.
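To see why σ satisfies the Clifford relations, one can use the standard anticommutation rules for exterior and interior multiplication on Λ^•(R^n)^* with the standard inner product. These rules are assumed here rather than derived.

```latex
\begin{gather*}
\epsilon(e^k)\epsilon(e^\ell) + \epsilon(e^\ell)\epsilon(e^k) = 0, \qquad
\iota(e^k)\iota(e^\ell) + \iota(e^\ell)\iota(e^k) = 0, \qquad
\epsilon(e^k)\iota(e^\ell) + \iota(e^\ell)\epsilon(e^k) = \delta^{k\ell}, \\
\sigma(e^k)\sigma(e^\ell) + \sigma(e^\ell)\sigma(e^k)
 = -\bigl(\epsilon(e^k)\iota(e^\ell) + \iota(e^\ell)\epsilon(e^k)\bigr)
   -\bigl(\iota(e^k)\epsilon(e^\ell) + \epsilon(e^\ell)\iota(e^k)\bigr)
 = -2\delta^{k\ell}.
\end{gather*}
```

In particular σ(e^k)^2 = −1, matching the sign convention under which D^2 has leading term −Σ ∂_k^2.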
Now we’ll begin using the Sobolev theory we spent the previous few lectures building, though we won’t get
too far until next time.
Definition 19.19. Let E → X be a vector bundle. The H−1-norm of an f ∈ C∞(X; E) is the infimum
of the C > 0 such that |⟨f, ψ⟩_{L2}| ≤ C‖ψ‖_{H1} for all ψ ∈ H1(X; E). We then define H−1(X; E) to be the
completion of C∞(X; E) in the H−1-norm.
More or less by construction, this is the dual norm to the H1-norm: there is a nondegenerate pairing
H−1(X; E) ⊗ H1(X; E) → C.
Lemma 19.20. The inclusion map L2(X; E) → H−1(X; E) is continuous, and with D as above,
D^2 : H1(X; E) → H−1(X; E) is continuous.
Theorem 19.21 (Gårding’s inequality). There is some constant κ > 0 such that
(19.22) \ang {D^2\psi , \psi } + \kappa \ang {\psi , \psi } \ge \norm \psi _{H_1}^2.
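Proof. By Theorem 19.16,
(19.23) D^2\psi = \nabla^*\nabla\psi + F\psi
(19.24) \psi + \nabla^*\nabla\psi = D^2\psi + (1-F)\psi
(19.25) \lVert\psi\rVert_{H_1}^2 = \langle\psi,\psi\rangle + \langle\nabla\psi,\nabla\psi\rangle = \langle D^2\psi,\psi\rangle + \langle(1-F)\psi,\psi\rangle
(19.26) \le \langle D^2\psi,\psi\rangle + \kappa\langle\psi,\psi\rangle.
This is because 1 − F is some algebraic operator on M, and M is compact, so it’s bounded. ∎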

Corollary 19.27. The inner product
⟨⟨ψ1, ψ2⟩⟩ := ⟨D^2ψ1, ψ2⟩ + κ⟨ψ1, ψ2⟩
is equivalent to the H1 inner product.
This is a very strong statement: we can control all first derivatives in terms of the Dirac operator. This is
one manifestation of ellipticity.
Theorem 19.28. D2 + κ : H1 → H−1 is an isomorphism.
Proof. By Theorem 19.21, ‖(D^2 + κ)ψ‖_{H−1} ≥ ‖ψ‖_{H1}. Therefore D^2 + κ is injective with closed range; it
suffices to prove it’s surjective. If f ∈ H−1, consider the map H1 → C sending ϕ ↦ ⟨f, ϕ⟩. By the Riesz
theorem (and substituting in an equivalent inner product), there is some ψ ∈ H1 such that for all ϕ ∈ H1,
(19.29) \langle f,\varphi\rangle = \langle\!\langle\psi,\varphi\rangle\!\rangle = \langle(D^2+\kappa)\psi,\varphi\rangle,
and therefore (D^2 + κ)ψ = f. ∎
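The lower bound used at the start of this proof follows from reading the H−1-norm as a dual norm and then applying Gårding's inequality. A sketch, for smooth ψ ≠ 0, testing against ϕ = ψ:

```latex
\begin{align*}
\lVert (D^2+\kappa)\psi\rVert_{H_{-1}}
 = \sup_{\varphi\neq 0}
   \frac{\lvert\langle (D^2+\kappa)\psi,\varphi\rangle\rvert}{\lVert\varphi\rVert_{H_1}}
 \;\ge\; \frac{\langle (D^2+\kappa)\psi,\psi\rangle}{\lVert\psi\rVert_{H_1}}
 \;\ge\; \frac{\lVert\psi\rVert_{H_1}^2}{\lVert\psi\rVert_{H_1}}
 = \lVert\psi\rVert_{H_1}.
\end{align*}
```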

So using not even all that much analysis, we’ve inverted the Laplacian (plus κ). Now we use the Rellich
theorem: consider the operator

(19.30) T\colon H_0 \longrightarrow H_{-1} \xrightarrow{\;(D^2+\kappa)^{-1}\;} H_1 \longrightarrow H_0.

Each of these maps is continuous, and the last is compact, so T is compact. It’s also self-adjoint, and because κ > 0, it’s positive.
Now invoke the spectral theorem, yielding an orthonormal basis (ψn) of H0 with D^2ψn = λnψn, where
λn ≥ 0,33 and λn → ∞. Thus for any a > 0, the set {n : λn < a} is finite.
The last basic theorem we need guarantees smoothness.
Theorem 19.31 (Elliptic regularity). These ψn are C ∞ .
We’ll do this next time.
Lecture 20.
Elliptic regularity: 4/9/19

“You might think it’s game over, but you’re wrong.”


Today our goal will be to prove elliptic regularity. Notation for Dirac operators may differ from previous
lectures, and objects in the mirror may be closer than they appear.
So let M be a closed spin Riemannian manifold of dimension n, BSO (M ) → M be the bundle of oriented
orthonormal frames, and π : BSpin(M) → M be its bundle of spin frames. The former is a principal
SOn -bundle and the latter is a principal Spinn -bundle.
Definition 20.1. The Clifford algebra Cℓ_{−n} is the unital algebra generated by e^1, ..., e^n subject to the
relation
(20.2) e^ie^j + e^je^i = -2\delta^{ij}.
Beware that there are different sign conventions in the literature for the Clifford algebra.
The Clifford algebra is Z/2-graded: Cℓ_{−n} = Cℓ^0_{−n} ⊕ Cℓ^1_{−n}, where the even piece is spanned by products of
the e^i with an even number of terms, and the odd piece is spanned by products with an odd number of terms.
This is in fact a Z/2-grading because (20.2) only contains products of two and zero generators; since (20.2) is
not homogeneous in the tensor algebra, we don’t get a Z-grading.
Now suppose 𝕍 = 𝕍^0 ⊕ 𝕍^1 is a supermodule for Cℓ_{−n}, i.e. it’s a Z/2-graded Cℓ_{−n}-module such that the
action of Cℓ^0_{−n} preserves the grading and the action of Cℓ^1_{−n} reverses the grading. Given this data, let’s
form some associated bundles.
• SO_n acts on Cℓ_{−n} by conjugation, where we interpret e^1, ..., e^n as an oriented orthonormal basis of
R^n. Let Cℓ(M) := BSO(M) ×_{SO_n} Cℓ_{−n}.
• The inclusion Spin_n ↪ Cℓ_{−n} means the action of Cℓ_{−n} on 𝕍 induces a Spin_n-action on 𝕍, so we
can form the associated bundle V := BSpin(M) ×_{Spin_n} 𝕍. This vector bundle is again Z/2-graded; let
V^0, resp. V^1 denote the even and odd components.
In this setting we can define a Dirac operator. We will let ∂1 , . . . , ∂n denote the horizontal vector fields on
BSO (M ) or on BSpin (M ), as we constructed last time; the specific bundle will be unambiguous from context.
In particular, as we discussed last time,

(20.3) [\partial_k,\partial_\ell] = -\frac12 R_{jk\ell}^i E_i^j

is vertical. Here E_i^j is the matrix with a 1 in position (i, j) and zeros everywhere else. The Dirac operator
D : C∞(M; V) → C∞(M; V) is defined as follows: we can identify
(20.4) C^\infty(M;V) = \{\psi\colon \mathcal B_{\mathrm{Spin}}(M)\to\mathbb V \mid R_g^*\psi = g^{-1}\psi \text{ for all } g\in\mathrm{Spin}_n\},
using the fact that V is an associated bundle to BSpin(M). Then
(20.5) D\coloneqq c(e^k)\partial_k,
where c(e^k) : 𝕍 → 𝕍 is the Clifford action of e^k on 𝕍.
Last time, we proved some important and nontrivial properties of the Dirac operator.
• The Weitzenböck formula (Theorem 19.16), that D2 = ∇∗ ∇ + R for some R related to the curvature.
• Using this, we proved Gårding’s inequality (Theorem 19.21), that ⟨D^2ψ, ψ⟩ + κ⟨ψ, ψ⟩ ≥ ‖ψ‖^2_{H1}, and
hence D^2 + κ can be inverted.
33 Well, really we got that Tψ_n = μ_nψ_n for some μ_n → 0; then you can solve for λ_n in terms of μ_n, getting λ_n = 1/μ_n − κ.

• Invoking spectral theory for D2 to construct a basis {ψn } of L2 (M ; V ) of eigenvectors with eigenvalues
λn , which increase to ∞ as n → ∞. In particular, for any a > 0, the set of n with λn < a is finite.
Today we’ll prove Theorem 19.31, that these ψn are smooth.
For the rest of today’s lecture, ‖·‖_ℓ means ‖·‖_{H_ℓ}.
Proposition 20.6 (Elliptic estimate). Suppose ψ ∈ H_{ℓ+1} for some ℓ ∈ Z≥0. Then
(20.7) \lVert\psi\rVert_{\ell+1} \le C\bigl(\lVert D\psi\rVert_\ell + \lVert\psi\rVert_\ell\bigr),
where C is some constant independent of ψ.

Proof. The ℓ = 0 case follows from Theorem 19.21 and the fact that √(a^2 + b^2) ≤ a + b for a, b ≥ 0:

(20.8) \lVert\psi\rVert_1 \le C\sqrt{\lVert D\psi\rVert_0^2 + \lVert\psi\rVert_0^2} \le C\bigl(\lVert D\psi\rVert_0 + \lVert\psi\rVert_0\bigr).
By induction, let’s assume the estimate is true for 0, ..., ℓ − 1. We claim that [∇, D] is a zeroth-order operator
(sometimes called an algebraic operator).
To prove this, we first show that [t(e^k), c(e^ℓ)] = 0, where t(e^k) : ξ ↦ ξ ⊗ e^k ∈ 𝕍 ⊗ (R^n)^* and
c(e^ℓ) : ξ ↦ c(e^ℓ)ξ ∈ 𝕍: both t(e^k)c(e^ℓ) and c(e^ℓ)t(e^k) send ξ ↦ c(e^ℓ)ξ ⊗ e^k. Hence
(20.9) \begin{aligned} [\nabla, D] &= [t(e^k)\partial_k, c(e^\ell)\partial_\ell]\\ &= t(e^k)c(e^\ell)[\partial_k,\partial_\ell], \end{aligned}

and, as in (20.3), this is zeroth-order. This means we can bound the norm without losing any derivatives
after applying [∇, D] (that is, if ‖ψ‖_ℓ is finite, so is ‖[∇, D]ψ‖_ℓ). Hence
(20.10a) \lVert\nabla\psi\rVert_\ell \le C\bigl(\lVert D\nabla\psi\rVert_{\ell-1} + \lVert\nabla\psi\rVert_{\ell-1}\bigr)
(20.10b) \le C\bigl(\lVert\nabla D\psi\rVert_{\ell-1} + \lVert[\nabla,D]\psi\rVert_{\ell-1} + \lVert\nabla\psi\rVert_{\ell-1}\bigr)
(20.10c) \le C\bigl(\lVert D\psi\rVert_\ell + \lVert\psi\rVert_{\ell-1} + \lVert\psi\rVert_\ell\bigr)
(20.10d) \le C\bigl(\lVert D\psi\rVert_\ell + \lVert\psi\rVert_\ell\bigr).
Here C is again a variable constant: each C is some constant independent of ψ, but we don’t need to care
which constant it is, and it may change between lines. Hence
(20.11) \lVert\psi\rVert_\ell + \lVert\nabla\psi\rVert_\ell \le C\bigl(\lVert D\psi\rVert_\ell + \lVert\psi\rVert_\ell\bigr),
and the left-hand side is equivalent to the H_{ℓ+1} norm of ψ. ∎
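For the base case ℓ = 0 of the elliptic estimate, the constant can be made explicit. This sketch assumes D is formally self-adjoint, so that ⟨D^2ψ, ψ⟩ = ‖Dψ‖_0^2, and writes κ for the constant in Gårding's inequality.

```latex
\begin{align*}
\lVert\psi\rVert_1^2
 &\le \langle D^2\psi,\psi\rangle + \kappa\langle\psi,\psi\rangle
  = \lVert D\psi\rVert_0^2 + \kappa\lVert\psi\rVert_0^2 \\
 &\le \max(1,\kappa)\bigl(\lVert D\psi\rVert_0^2 + \lVert\psi\rVert_0^2\bigr)
  \le \max(1,\kappa)\bigl(\lVert D\psi\rVert_0 + \lVert\psi\rVert_0\bigr)^2 .
\end{align*}
```

Taking square roots gives the ℓ = 0 estimate with C = max(1, κ)^{1/2}.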

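Corollary 20.12. If ψ ∈ H_{ℓ+2}, for ℓ ≥ 0, then ‖ψ‖_{ℓ+2} ≤ C(‖D^2ψ‖_ℓ + ‖ψ‖_ℓ).
Proof.
(20.13a) \lVert\psi\rVert_{\ell+2} \le C\bigl(\lVert D\psi\rVert_{\ell+1} + \lVert\psi\rVert_{\ell+1}\bigr)
(20.13b) \le C\bigl(\lVert D^2\psi\rVert_\ell + \lVert D\psi\rVert_\ell + \lVert\psi\rVert_{\ell+1}\bigr)
(20.13c) \le C\bigl(\lVert D^2\psi\rVert_\ell + \lVert\psi\rVert_{\ell+1} + \lVert\psi\rVert_{\ell+1}\bigr).
Changing the value of C, this simplifies to
(20.13d) \le C\bigl(\lVert D^2\psi\rVert_\ell + \lVert\psi\rVert_{\ell+1}\bigr)
(20.13e) \le C\bigl(\lVert D^2\psi\rVert_\ell + \lVert D\psi\rVert_\ell + \lVert\psi\rVert_\ell\bigr)
(20.13f) \le C\bigl(\lVert D^2\psi\rVert_\ell + \lVert D^2\psi\rVert_{\ell-1} + \lVert\psi\rVert_{\ell-1} + \lVert\psi\rVert_\ell\bigr)
(20.13g) \le C\bigl(\lVert D^2\psi\rVert_\ell + \lVert\psi\rVert_\ell\bigr)
using the elliptic estimate. ∎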

This suggests a very appealing argument: apply this corollary recursively to conclude ψ is in H_{ℓ+4}, then
H_{ℓ+6}, ... until you see that it’s smooth. But we have no base case for this recursion argument and have to
do something different.
We want to approximate our a priori non-smooth sections by smooth ones. One standard way to do this
is with mollifiers, smooth functions which closely approximate a delta-function. Hence convolving a possibly
not smooth f with a mollifier produces a smooth function close to the original f .

Another thing we can do is take difference quotients, which has its own minimalist elegance. In general,
consider an L2 function f : Tn → V, which has a Fourier series:

(20.14) f(x) = \sum _{\nu \in \Z ^n} \widehat f_\nu e^{i\nu \cdot x},

where ν = (ν_1, ..., ν_n) ∈ Z^n, x = (x_1, ..., x_n) ∈ R^n/(2πZ)^n, and ν · x := ν_i x_i. Thus

(20.15) \lVert f\rVert_\ell^2 = \sum_\nu \lvert\widehat f_\nu\rvert^2 \sum_{j=0}^\ell \lvert\nu\rvert^{2j}.

Given h ∈ Rn , define

(20.16) f^h(x) \coloneqq \frac {f(x+h)-f(x)}{\abs h}.

Its Fourier coefficients are

(20.17) \widehat f^h_\nu = \frac{e^{i\nu\cdot h}-1}{\lvert h\rvert}\cdot\widehat f_\nu.
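The formula for the Fourier coefficients of the difference quotient comes from translating the Fourier series term by term; a short derivation, using only (20.14) and (20.16):

```latex
\begin{align*}
f(x+h) &= \sum_{\nu\in\mathbb Z^n} \widehat f_\nu\, e^{i\nu\cdot h}\, e^{i\nu\cdot x}, \\
f^h(x) &= \frac{f(x+h)-f(x)}{\lvert h\rvert}
        = \sum_{\nu\in\mathbb Z^n}
          \frac{e^{i\nu\cdot h}-1}{\lvert h\rvert}\,\widehat f_\nu\, e^{i\nu\cdot x}.
\end{align*}
```

For h = εe_i with ε → 0, the modulus of the multiplier tends to |ν_i|, so the difference quotient in the ith direction behaves like the partial derivative in that direction.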

Proposition 20.18.
(1) If f ∈ H_ℓ, then f(· + h) ∈ H_ℓ and ‖f(· + h)‖_ℓ = ‖f‖_ℓ.
(2) If f ∈ H_{ℓ+1}, then f^h ∈ H_ℓ and ‖f^h‖_ℓ ≤ C‖f‖_{ℓ+1}.
(3) If f ∈ H_ℓ and ‖f^h‖_ℓ ≤ C for all h sufficiently close to 0, with C independent of h, then f ∈ H_{ℓ+1}.

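Proof. (1) is obvious. Next (2): using (20.17),

(20.19) \lvert\widehat f_\nu^h\rvert^2 = \frac{\lvert e^{i\nu\cdot h}-1\rvert^2}{\lvert h\rvert^2}\lvert\widehat f_\nu\rvert^2 \le \frac{\lvert\nu\cdot h\rvert^2}{\lvert h\rvert^2}\lvert\widehat f_\nu\rvert^2 \le \lvert\nu\rvert^2\lvert\widehat f_\nu\rvert^2.

Thus, using (20.15) in the first line and (20.19) in the second,
(20.20a) \lVert f^h\rVert_\ell^2 = \sum_{\nu\in\mathbb Z^n}\bigl(1 + \lvert\nu\rvert^2 + \dotsb + \lvert\nu\rvert^{2\ell}\bigr)\lvert\widehat f_\nu^h\rvert^2
(20.20b) \le \sum_{\nu\in\mathbb Z^n}\bigl(\lvert\nu\rvert^2 + \dotsb + \lvert\nu\rvert^{2(\ell+1)}\bigr)\lvert\widehat f_\nu\rvert^2
(20.20c) \le \lVert f\rVert_{\ell+1}^2.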

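For (3), let e_i := (0, ..., 0, 1, 0, ..., 0), with 1 in the ith position. Then

(20.21) \lim_{\varepsilon\to 0}\sum_{i=1}^n \lvert\widehat f^{\varepsilon e_i}_\nu\rvert^2 = \lvert\nu\rvert^2\lvert\widehat f_\nu\rvert^2.

Hence for N ∈ Z≥0,

(20.22a) \sum_{\lvert\nu\rvert<N}\lvert\widehat f_\nu\rvert^2\sum_{j=1}^{\ell+1}\lvert\nu\rvert^{2j} = \lim_{\varepsilon\to 0}\sum_{i=1}^n\sum_{\lvert\nu\rvert<N}\lvert\widehat f^{\varepsilon e_i}_\nu\rvert^2\sum_{j=0}^{\ell}\lvert\nu\rvert^{2j}
(20.22b) \le \limsup_{\varepsilon\to 0}\sum_{i=1}^n \lVert f^{\varepsilon e_i}\rVert_\ell^2,

which is bounded by some constant independent of N. Letting N → ∞ shows that f ∈ H_{ℓ+1}. ∎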

Theorem 20.23. If ℓ ≥ 1, ψ ∈ H_ℓ, and D^2ψ ∈ H_{ℓ−1}, then ψ ∈ H_{ℓ+1}.


Corollary 20.24. If ℓ ≥ 1, ψ ∈ H_ℓ, and D^2ψ ∈ H_ℓ, then ψ ∈ H_{ℓ+2}. Iterating, we conclude that if
D^2ψ = λψ, then ψ ∈ C∞.
Here we use the Sobolev embedding theorem.

Proof sketch of Theorem 20.23. We use the standard trick to pass to the torus: choose a finite atlas on M
which trivializes all the vector bundles in scope, so it suffices to prove this on each chart, and embed each
chart in T^n. Since D^2 is a second-order differential operator, we can write it as
(20.25) D^2 = L_2\circ\nabla^2 + L_1\circ\nabla + L_0,
where L_0, L_1, and L_2 are matrix-valued functions on T^n. In particular, there is some E (for “error term”)
such that
(20.26) D^2\psi^h(x) = (D^2\psi)^h(x) + E\psi(x+h).
Therefore
(20.27a) \lVert\psi^h\rVert_\ell \le C\bigl(\lVert D^2\psi^h\rVert_{\ell-2} + \lVert\psi^h\rVert_{\ell-2}\bigr)
(20.27b) \le C\bigl(\lVert(D^2\psi)^h\rVert_{\ell-2} + \lVert E\psi\rVert_{\ell-2} + \lVert\psi^h\rVert_{\ell-2}\bigr)
(20.27c) \le C\bigl(\lVert D^2\psi\rVert_{\ell-1} + \lVert\psi\rVert_\ell\bigr)
(20.27d) \le C.
Therefore ψ ∈ H_{ℓ+1}, using Proposition 20.18, part (3). ∎

Lecture 21.
Chern-Weil theory: 4/11/19

Today, we’ll discuss Chern-Weil theory, and at some point also the Chern-Simons form. Let G be a Lie
group, with no restrictions for now, and g be its Lie algebra.
Definition 21.1. An invariant polynomial of degree q ≥ 0 is a symmetric q-linear function
(21.2) f\colon \underbrace{\mathfrak g\times\dotsb\times\mathfrak g}_q\longrightarrow \mathbb R

which is invariant under conjugation, i.e. for all g ∈ G and ξ1 , . . . , ξq ∈ g,


(21.3) f(g\xi _1 g^{-1}, \dotsc , g\xi _q g^{-1}) = f(\xi _1, \dotsc ,\xi _q).
Invariant polynomials of degree q form a vector space, which we denote I^q(G), and which can be identified with
(Sym^q g^*)^G: Sym^q g^* is the space of symmetric q-linear functions on g, and we take the invariant subspace
under G acting by conjugation. Let

(21.4) I^\bullet (G)\coloneqq \bigoplus _{q=0}^\infty I^q(G),

which we make Z-graded by specifying deg I q (G) = 2q. Multiplication of polynomials makes this into a
commutative R-algebra.
Example 21.5.
(1) Suppose G is a countable, discrete group. Then g = {0}, so I • (G) is just the constant functions in
degree 0, or just R.
(2) Now suppose G is connected and abelian, e.g. Tm , Rn , or a product of such things. Then g = Rn
and conjugation is trivial. Thus all polynomials are invariant, so I • (G) = Sym• ((Rn )∗ ), but with a
different grading: a degree-q homogeneous polynomial has grading 2q.
(3) Let’s do a more interesting example: G = SU2, the group of 2 × 2 matrices of the form
\begin{pmatrix}\alpha & -\bar\beta\\ \beta & \bar\alpha\end{pmatrix}, where α, β ∈ C and |α|^2 + |β|^2 = 1. The Lie algebra is

(21.6) \mathfrak{su}_2 = \left\{\begin{pmatrix} ix & -y+iz\\ y+iz & -ix\end{pmatrix}: x,y,z\in\mathbb R\right\}.

One example of an invariant polynomial is f4 : X ↦ tr(X^2): since

(21.7) X^2 = \begin {pmatrix}-x^2-y^2-z^2 & *\\{}* & -x^2-y^2-z^2\end {pmatrix},



then f4(X) = tr(X^2) = −2(x^2 + y^2 + z^2). It turns out that I^•(SU2) is a polynomial algebra generated
by f4, which has degree 4.
(4) What about SL2(R)? Again the Lie algebra is the algebra of traceless matrices, and I^•(SL2(R)) =
R[g4], where g4(X) := tr(X^2) for X ∈ sl2(R). Explicitly, an arbitrary X ∈ sl2(R) is given by a
trace-zero real matrix X = \begin{pmatrix} x & y\\ z & -x\end{pmatrix}, so g4(X) = 2(x^2 + yz). So though the
algebra looks the same, the functions are different: f4 is negative definite, and g4 is indefinite.
(5) If we look at SL2 (C), we again get I • (SL2 (C)) = R[h4 ], given by the same formula: now x, y, and z
are complex, but nothing else changes.
(6) If we consider GLn(R), GLn(C), or Un, the Lie algebra now includes matrices with nonzero trace,
and the trace defines a degree-one invariant polynomial. Let’s focus on G = Un; its Lie algebra un is
the algebra of skew-Hermitian matrices. Consider the polynomial

(21.8) \det(t-X) = \sum_{i=0}^n f_{2i}(X)t^{n-i}.

The determinant is invariant under conjugation, because det(gXg^{-1}) = det(g) det(X) det(g)^{-1} = det(X).
Therefore these coefficients f2i are also conjugation-invariant. The theorem (which we won’t prove here) is that
I^•(Un) = R[f2, ..., f2n]. The same holds for GLn(R) and GLn(C). In each case, f2i(X) = ± tr(Λ^i X).
(7) Suppose G = SO2m. We still have det(t − X), X ∈ so2m, but the trace vanishes, and the trace of any
odd exterior power vanishes: the eigenvalues of X come in conjugate pairs. So

(21.9) \det(t-X) = \sum_{j=0}^m f_{4j}(X)t^{2m-2j},

where f4j = ± tr Λ^{2j} X as before. There’s also a new invariant polynomial Pm of degree m, called the
Pfaffian, defined as follows. Let V = R^{2m}, with the standard inner product. Use this inner product
to identify V and V^*, so we can say that a matrix X is in so2m iff, as a map X : V → V^*, X^* = −X.
The map X : V → V^* can be identified with ωX ∈ Λ^2V^*, and the condition tells us it’s skew. Then
the Pfaffian is
(21.10) \frac{\omega_X^m}{m!} = P_m(X)\cdot\mathrm{vol},
where vol is the volume form induced from the inner product on V and the orientation (preserved
because we’re looking at SO2m and not O2m). (21.10) takes place in Det V^*.
Exercise 21.11. Show that Pm(X)^2 = det(X).
This tells us that the generators f4j and Pm of I^•(SO2m) are not independent: there’s a nontrivial relation between them.
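The smallest case m = 1 already shows how the Pfaffian relates to the determinant. This sketch assumes the identification ω_X(v, w) := ⟨Xv, w⟩ and the standard orientation on R^2; the opposite sign convention replaces P_1(X) by −P_1(X) and does not affect the square. The parameter a is a hypothetical name for the single free entry of X.

```latex
\begin{gather*}
X = \begin{pmatrix} 0 & -a\\ a & 0\end{pmatrix}\in\mathfrak{so}_2, \qquad
Xe_1 = a\,e_2, \qquad Xe_2 = -a\,e_1, \\
\omega_X(e_1,e_2) = \langle Xe_1, e_2\rangle = a
 \;\Longrightarrow\; \omega_X = a\,e^1\wedge e^2 = a\cdot\mathrm{vol},
 \qquad P_1(X) = a, \\
\det X = 0\cdot 0 - (-a)(a) = a^2 = P_1(X)^2 .
\end{gather*}
```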
Lemma 21.12. Let f ∈ I^q(G) and ζ, ζ1, ..., ζq ∈ g. Then
\sum_{i=1}^q f(\zeta_1,\dotsc,[\zeta,\zeta_i],\dotsc,\zeta_q) = 0.
Proof. Differentiate the following identity at t = 0; it holds because f is invariant:
(21.13) f\bigl(\mathrm{Ad}_{\exp(t\zeta)}\zeta_1,\dotsc,\mathrm{Ad}_{\exp(t\zeta)}\zeta_q\bigr) = f(\zeta_1,\dotsc,\zeta_q). ∎
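As a concrete instance of Lemma 21.12, take q = 2 and the trace form f(ζ1, ζ2) = tr(ζ1ζ2) on a matrix Lie algebra (the polarization of X ↦ tr(X^2) from Example 21.5). The identity then reduces to cyclicity of the trace:

```latex
\begin{align*}
f([\zeta,\zeta_1],\zeta_2) + f(\zeta_1,[\zeta,\zeta_2])
 &= \operatorname{tr}(\zeta\zeta_1\zeta_2 - \zeta_1\zeta\zeta_2)
  + \operatorname{tr}(\zeta_1\zeta\zeta_2 - \zeta_1\zeta_2\zeta) \\
 &= \operatorname{tr}(\zeta\zeta_1\zeta_2) - \operatorname{tr}(\zeta_1\zeta_2\zeta)
  = 0 .
\end{align*}
```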

This expresses the infinitesimal invariance of the invariant polynomials. If the group isn’t connected,
though, that doesn’t tell you everything.
Chern-Weil theory is about applying invariant polynomials to study connections on principal G-bundles.
Let P → M be a principal G-bundle, and recall that a connection Θ ∈ Ω1P (g) satisfies a few equations,
including (16.4a) and (16.4b), telling us how Θ transforms under the pullback by right multiplication and
what it must restrict to on each fiber.
The curvature of Θ is given by Ω = dΘ + (1/2)[Θ ∧ Θ]. It satisfies slightly different equations: it transforms
under R_g^* exactly as in (16.4a), but i_m^*Ω = 0,34 which follows from the Maurer-Cartan equation. Finally,
what happens if you differentiate Ω? The d(dΘ) term vanishes, and using the Leibniz rule, we get

(21.14) d\Omega = [d\Theta\wedge\Theta] = [\Omega\wedge\Theta] - \frac12[[\Theta\wedge\Theta]\wedge\Theta].

34 These two imply that the curvature form descends to the base, valued in the adjoint bundle.

The Jacobi identity implies the second term vanishes: plug in three vectors to this 3-form and see what
happens. It may help to work in coordinates, letting Θ = Θα ξα , where {ξα } is a basis of g.
We rearrange (21.14) into a more standard form, called the Bianchi identity:
(21.15) d_\Theta\Omega \coloneqq d\Omega + [\Theta\wedge\Omega] = 0.
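The vanishing of the triple bracket term can be checked in coordinates. This sketch writes Θ = Θ^α ξ_α with Θ^α ordinary 1-forms, and uses that a cyclic permutation of three 1-forms does not change their wedge product.

```latex
\begin{align*}
[[\Theta\wedge\Theta]\wedge\Theta]
 &= \Theta^\alpha\wedge\Theta^\beta\wedge\Theta^\gamma\,
    [[\xi_\alpha,\xi_\beta],\xi_\gamma] \\
 &= \tfrac13\,\Theta^\alpha\wedge\Theta^\beta\wedge\Theta^\gamma
    \Bigl([[\xi_\alpha,\xi_\beta],\xi_\gamma]
        + [[\xi_\beta,\xi_\gamma],\xi_\alpha]
        + [[\xi_\gamma,\xi_\alpha],\xi_\beta]\Bigr)
  = 0 .
\end{align*}
```

The second equality averages over the three cyclic relabelings of the summation indices, and the last uses the Jacobi identity. What remains is dΩ = [Ω ∧ Θ] = −[Θ ∧ Ω], which is the Bianchi identity.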
Though the curvature is in some sense the derivative of the connection, it’s not literally the covariant
derivative.
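Remark 21.16. Suppose ζ ∈ g and $\widehat\zeta$ is the induced vertical vector field on P ; we want to show $\iota_{\widehat\zeta}\Omega = 0$. Well,
let’s compute:
(21.17a) \iota _{\widehat \zeta }\Omega = \iota _{\widehat \zeta }\paren {\d \Theta + \frac 12[\Theta \wedge \Theta ]}.
Using Cartan’s formula,
(21.17b) = -\d \iota _{\widehat \zeta }\Theta + \mathcal L_{\widehat \zeta }\Theta + [\zeta , \Theta ].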
The first term is zero because contraction gives us a constant function P → g, and when we differentiate we
get zero. For the Lie derivative, we have

(21.18) \mathcal L_{\widehat \zeta }\Theta = \left .\dfr {}{t}\right |_{t=0} R_{\exp (t\zeta )}^*\Theta = \left .\dfr {}{t}\right |_{t=0} \Ad _{\exp (-t\zeta )}\Theta = -[\zeta , \Theta ].

So that adds to [ζ, Θ] and we get zero.

Now, given data of P → M and a connection, we’ll construct a map I • (G) → Ω• (M ), called the Chern-Weil
homomorphism, and will show that it’s a map of differential graded algebras. First we need to define a
differential on I • (G), but since it’s concentrated in even degrees, our hand is forced: d = 0. So the condition
“the Chern-Weil homomorphism is a map of differential graded algebras” means that the image of any invariant
polynomial is closed.
Definition 21.19. Let P → M be a principal G-bundle with connection Θ and curvature form Ω. The
Chern-Weil homomorphism ω : I • (G) → Ω• (M ) sends f ∈ I q (G) to
(21.20) \omega _f\coloneqq f(\underbrace {\Omega \many \wedge \Omega }_{\text {$q$ times}}).

Let’s quickly typecheck this. Since $\Omega \in \Omega^2_P(\fg)$, we have $\Omega\wedge\dotsb\wedge\Omega \in \Omega^{2q}_P(\fg^{\otimes q})$, and therefore $f(\Omega\wedge\dotsb\wedge\Omega) \in \Omega^{2q}_P$. There’s still plenty to check here: why does this descend to M ? Why is it closed? But first, an example.
Example 21.21. Let G = Un and f (X) = tr(X 3 ). Then ωf = tr(Ω ∧ Ω ∧ Ω). If you write Ω = (Ωij ), for
Ωij ∈ Ω2P (C), then you can explicitly obtain matrix-valued forms, multiply them together, and take their
trace.
If {ζα } is a basis of g, we can write Ω = Ωα ζα , where Ωα ∈ Ω2P . Then
(21.22) f(\Omega \many \wedge \Omega ) = f(\zeta _{\alpha _1},\dotsc , \zeta _{\alpha _q}) \Omega ^{\alpha _1}\many \wedge \Omega ^{\alpha _q}.
Now, as promised:
Lemma 21.23. For any f ∈ I • (G), ωf is closed, and it descends to M .
Proof. To show it’s closed, we just compute:
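(21.24a) \d \omega _f = \d f(\Omega \wedge \dotsb \wedge \Omega )
(21.24b) = \sum _{i=1}^q f(\Omega \wedge \dotsb \wedge \d \Omega \wedge \dotsb \wedge \Omega ),
where dΩ is in the ith place. By (21.15),
(21.24c) = -\sum _{i=1}^q f(\Omega \wedge \dotsb \wedge [\Theta \wedge \Omega ]\wedge \dotsb \wedge \Omega )
(21.24d) = 0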

by Lemma 21.12.
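To show ωf descends to M , we’ll show that the contraction with any vertical vector field is zero. Let $\widehat\zeta$ be
a vertical vector field; then
(21.25) \iota _{\widehat \zeta }\omega _f = \iota _{\widehat \zeta }f(\Omega \many \wedge \Omega ) = \sum _i f(\Omega \many \wedge \iota _{\widehat \zeta }\Omega \many \wedge \Omega ) = 0,

because we saw in Remark 21.16 that $\iota_{\widehat\zeta}\Omega = 0$. Since Ω transforms under Rg∗ by the adjoint action and f is Ad-invariant, ωf is also G-invariant, so it is the pullback of a form on M . □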

The Chern-Weil homomorphism is natural in the following sense: let ϕ : M′ → M be a smooth map and
ϕ̃ : P′ = ϕ∗ P → P be the induced map for the pullback bundle. Let Θ′ := ϕ̃∗ Θ be the pullback connection.
Proposition 21.26. ωf (Θ′) = ϕ∗ ωf (Θ).
This leads us to ask whether there is a universal target: is there a manifold BG and a principal G-bundle
EG → BG such that every principal bundle and connection arises this way? Then we could just do Chern-Weil
theory there and carry it over to everywhere else. Unfortunately, this dream doesn’t quite work: BG and
EG exist in topology, but are only unique up to homotopy. So we could fix a model, but we’d need for
connections and forms to make sense on it, which is not guaranteed. Moreover, then pullback maps exist but
aren’t unique; the space of such maps is contractible, which is great for homotopy theory (where contractible
spaces are hardly different from points) but not quite for geometry.
Nonetheless there are some nice things to say. If G is compact, there is a manifold model for EG → BG –
albeit an infinite-dimensional, Hilbert manifold. First, let’s do this for Un . Fix a separable infinite-dimensional
complex Hilbert space H.
Definition 21.27. The Stiefel manifold Stn (H) is the space of isometries b : Cn → H.
This is a Hilbert manifold modeled on H, where the topology is the subspace topology of Hn (in fact,
it’s a submanifold of that space). The Grassmannian of H, denoted Grn (H), is the Hilbert manifold of
n-dimensional subspaces W ⊂ H. There is a map Stn (H) → Grn (H) sending a map Cn → H to its image;
since the map is an isometry, it’s injective, so the image is an n-dimensional subspace. Then Un acts on the
fiber by precomposition: precomposing Cn → H with a unitary map doesn’t change the image.
To see why this is universal, we have to check one more thing.
Lemma 21.28. Stn (H) is contractible.
We’ll discuss this more next time (I think). The proof will induct on n; for n = 1, this is a fairly explicit
question about the unit sphere in an infinite-dimensional Hilbert space. Then, one can induct by showing that
Stn (H) fibers over Stn−1 (H) and things are good. This then implies Grn (H) has the homotopy type of BUn .
For general G, we can use the fact that G is compact to know there’s a faithful representation G ↪ Un ,
and use this to produce smooth manifold models for BG and EG.
The next step will be to construct a universal connection Θuniv on Stn (H) and to apply Chern-Weil theory.
The de Rham theorem is trickier in infinite dimensions, but this can be worked around, so we do get a map
I • (G) → H ∗ (BG; R), and it will be interesting to see what classes we get.
Remark 21.29. In general, this map need not be an isomorphism: I • (Z) = R, concentrated in degree 0 (the Lie algebra of Z is zero), but H ∗ (BZ; R) = R[x]/(x2 ),
with x in degree 1, because BZ = S 1 .
Lecture 22.
Chern-Weil theory on classifying spaces: 4/16/19

“This is called transgression, even though it doesn’t seem very sinful.”


As we continue our discussion of Chern-Weil theory, we’ve been discussing classifying spaces of compact Lie
groups, realized as Hilbert manifolds.
Theorem 22.1. Let H be a separable35 Hilbert space and S(H) ⊂ H denote the unit sphere (the vectors of
norm 1). Then S(H) is contractible.
There are several different proofs; this one is due to Dick Palais.
35This theorem likely holds for all Hilbert spaces, but we’d need to find a different proof.

Proof. Let {en }n∈Z be an orthonormal basis of H, and define an embedding i : R ↪ S(H) as follows: if
x = n + θ, where n ∈ Z and 0 ≤ θ < 1, then

(22.2) i(x)\coloneqq \cos \paren {\frac {\pi }{2}\theta } e_n + \sin \paren {\frac {\pi }{2}\theta } e_{n+1}.

The Tietze extension theorem tells us that if X is a metric space (or more generally, a normal space),
C ⊂ X is a closed subset, and f : C → R is continuous, there is a continuous map f̃ : X → R with f̃|C = f .
Since i(R) is a closed subset of the closed unit ball D(H) and is homeomorphic to R, we can use the Tietze extension theorem to extend the map i(x) ↦ i(x + 1) on i(R) to a map
g : D(H) → i(R). Let f : D(H) → D(H) be g followed by inclusion.
This has no fixed points: a fixed point would have to lie in i(R), where f is the shift i(x) ↦ i(x + 1). So we can argue as in Hirsch’s proof of the Brouwer fixed-point theorem (see
Figure 1): for any ξ ∈ D(H), consider the ray based at f (ξ) and passing through ξ, which hits S(H) at a
single point p other than f (ξ). Define h(ξ) := p; because f is continuous, h : D(H) → S(H) is continuous. It’s the identity
on S(H), and therefore is a deformation retraction of D(H) onto S(H). Hence D(H) ≃ S(H), but D(H) is
contractible by the radial deformation retraction onto the origin. □

Figure 1. As in Hirsch’s proof of the Brouwer fixed-point theorem, we construct a deformation retraction of D(H) onto S(H): the ray from f (ξ) through ξ meets S(H) at h(ξ).
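The retraction h in the proof of Theorem 22.1 can be made explicit, which also makes its continuity visible. The following is a sketch under the assumptions of that proof (f takes values in i(R) ⊂ S(H), so |f(ξ)| = 1, and f has no fixed points); the symbols v and s are introduced here only for the computation.

```latex
% Set v = \xi - f(\xi), which is nonzero since f has no fixed points.
% Points on the ray are f(\xi) + s v with s \ge 0, and since |f(\xi)| = 1,
\[
  |f(\xi) + s v|^2 = 1 + 2s\,\mathrm{Re}\langle f(\xi), v\rangle + s^2 |v|^2 .
\]
% This equals 1 at s = 0 and at
\[
  s(\xi) = -\frac{2\,\mathrm{Re}\langle f(\xi), v\rangle}{|v|^2},
  \qquad\text{so}\qquad
  h(\xi) = f(\xi) + s(\xi)\,\bigl(\xi - f(\xi)\bigr).
\]
% At s = 1 the point on the ray is \xi, with |\xi|^2 \le 1, i.e.
% 1 + 2 Re<f, v> + |v|^2 \le 1, which rearranges to s(\xi) \ge 1,
% with equality exactly when |\xi| = 1. Hence h(\xi) = \xi for \xi \in S(H),
% and h is continuous because f is continuous and v never vanishes.
```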

Corollary 22.3. The Stiefel manifold Stn (H) is contractible.


Proof. We induct on n. For the base case, an isometry C1 → H determines and is determined by where it sends 1 ∈ C, so
St1 (H) = S(H), which is contractible by Theorem 22.1.
For the inductive step, there is a fiber bundle (though not a principal bundle) Stn (H) → S(H) given by
evaluating at (1, 0, . . . , 0). The fiber is Stn−1 (H): an isometric embedding C ⊕ Cn−1 ↪ H where we’ve
fixed where the first summand goes is equivalent data to an isometric embedding of Cn−1 into the orthogonal
complement of the image of C, and as all separable infinite-dimensional Hilbert spaces are isomorphic, we get a space homeomorphic
to Stn−1 (H).
There is a theorem that if F → E → B is a fiber bundle where all three spaces are metrizable and F and
B are contractible, then E is contractible. In this case, all three spaces are metrizable (as are all Hilbert
manifolds), so we’re done. □

Palais wrote some elegant and useful papers about using this kind of point-set topology to study the
homotopy theory of various spaces useful in geometry.
As we discussed last time, Un acts freely on Stn (H) by precomposition: given an embedding i : Cn ,→ H
and a g ∈ Un , we let i · g := i ◦ g : Cn ,→ H. The quotient is the Grassmannian Grn (H) of n-dimensional
subspaces of H. Therefore πn : Stn (H) → Grn (H) is a principal Un -bundle whose total space is contractible, so
this is a model for EUn → BUn .
And since Stn (H) and Grn (H) are Hilbert manifolds, we’ll be able to do geometry, by putting a connection
on πn . First we have to identify the tangent space to Stn (H). If we were asking about all injective maps
Cn ,→ H, the tangent space would be Hom(Cn , H); but we impose that our maps are isometries. For maps

Cn → Cn , this imposes the condition of being skew-Hermitian, and similarly if b : Cn ↪ H is an isometric
embedding and b∗ : H → Cn is its adjoint, then the condition we need is that
(22.4) (b^*b)^*(b^*b) = \id _{\C ^n}.
Differentiating this condition yields the equations that cut out the tangent space.
Ok, next we need a metric. But we have a natural one. The space Hom(Cn , H) has a natural inner product,
which on T1 , T2 ∈ Hom(Cn , H) is given by36
(22.5) \ang {T_1,T_2}\coloneqq \tr (T_1^*\circ T_2\colon \C ^n\to \C ^n).
This inner product is Un -equivariant: for g ∈ Un ,
(22.6) \ang {T_1 g, T_2 g} = \tr ((T_1g)^*T_2g) = \tr (g^*T_1^*T_2g) = \tr (gg^*T_1^*T_2) = \tr (T_1^*T_2).
This uses the fact that the trace is cyclic: tr(ABCD) = tr(DABC) and so on.
This allows us to define a connection on Stn (H) → Grn (H) as follows: we know the vertical vectors, which
are ker((πn )∗ ) in T Stn (H). And using the metric, we can take the orthogonal complement; because the
metric is Un -invariant, this defines a Un -invariant horizontal distribution.
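To make the last few paragraphs concrete, here is a sketch of the tangent space, the vertical and horizontal subspaces, and the resulting connection form. It assumes the isometry condition is used in the equivalent form b∗b = id, and that the Riemannian metric is the real part of (22.5); the closing formula Θ = b∗ db is the standard description of this connection and was not written down in the lecture.

```latex
% Tangent space: differentiate b_t^* b_t = id along a path b_t with b_0 = b.
\[
  T_b\mathrm{St}_n(H)
  = \{\dot b \in \mathrm{Hom}(\mathbb C^n, H) : \dot b^{*} b + b^{*}\dot b = 0\},
  \quad\text{i.e. } b^{*}\dot b \in \mathfrak u_n .
\]
% Vertical vectors are tangent to the U_n-orbits b \cdot g = b \circ g:
\[
  V_b = \{\, bA : A \in \mathfrak u_n \,\}.
\]
% Horizontal vectors are orthogonal to V_b for Re<T_1, T_2> = Re tr(T_1^* T_2):
\[
  \mathrm{Re}\,\mathrm{tr}\bigl((bA)^{*}\dot b\bigr)
  = \mathrm{Re}\,\mathrm{tr}\bigl(A^{*}\, b^{*}\dot b\bigr) = 0
  \ \text{ for all } A \in \mathfrak u_n
  \iff b^{*}\dot b = 0,
\]
% because b^* \dot b lies in u_n and Re tr(A^* S) is positive definite on u_n.
% The connection form with this kernel, normalized on vertical vectors, is
\[
  \Theta_b(\dot b) = b^{*}\dot b \in \mathfrak u_n,
  \qquad \Theta_b(bA) = b^{*}bA = A,
  \qquad (R_g^{*}\Theta)_b(\dot b) = (bg)^{*}(\dot b g)
        = \mathrm{Ad}_{g^{-1}}\,\Theta_b(\dot b).
\]
```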
This immediately generalizes to other compact Lie groups: the Peter-Weyl theorem implies every compact
Lie group G has a faithful finite-dimensional unitary representation, which amounts to an embedding G ,→ Un
for some n. Therefore we get a principal G-bundle Stn (H) → Stn (H)/G, which is a model for EG → BG,
and as before, the metric on the total space is G-invariant and allows us to define a connection in the same
way.
Remark 22.7. If G is a noncompact Lie group with π0 G finite, then G deformation retracts onto a compact
subgroup, allowing us to extend this construction to such G.
Now let’s apply this to Chern-Weil theory. Using the connection on EG → BG described above, the
Chern-Weil map ωG : I k (G) → Ω2k (BG) lands in closed forms (Lemma 21.23). Hence, using de Rham’s
theorem, we map into H 2k (BG; R).
Proposition 22.8. Let P → M be a principal G-bundle. Then there exists a G-equivariant map ϕ̃ : P → EG
such that, if ϕ : M → BG is the induced map on quotients, P ≅ ϕ∗ EG.
Proof. Such a ϕ̃ is equivalent data to a section of the associated bundle P ×G EG → M . If EG and M are
metrizable, then a section exists, since the fiber EG is contractible. □

Things get interesting when you add in connections. Using our model of EG → BG and its connection, it
is true that every connection on a principal G-bundle on a manifold can be written as the pullback of the
universal connection by some map to BG. But the map is not unique. If you want to impose uniqueness
then there is no such universal connection, unless you broaden the class of spaces you work with to some
class of simplicial sheaves. But that’s not something we’re going to do.
Example 22.9. Let G = T, with Lie algebra t = iR. If x denotes the coordinate in t, then I • (T) = R[x].
Though we know we have a model for BT above, we can produce some more direct constructions. One is
that if H is a complex separable Hilbert space, its space of lines P(H) is a model for BT. To see this, notice
that T acts freely on S(H) by scalar multiplication, and the quotient is P(H).
Alternatively, we could take CP∞ , defined to be the colimit of CPn along the inclusions CPn ↪ CPn+1
induced from Cn ↪ Cn+1 as the first n coordinates; these two models for BT are homotopy equivalent, but
have very different geometry: P(H) is a Hilbert manifold, and CP∞ isn’t.
In either case, we see H ∗ (BT; R) ≅ R[c1 ], where c1 is called the first Chern class, and lives in degree 2.
The map I • (T) → H 2• (BT; R) sends x to a multiple of c1 . To see this, we’ll apply the Serre spectral sequence
to the fibration T → S(H) → P(H).
This spectral sequence converges to the cohomology of the total space, which is contractible. Therefore
all classes on the E2 -page other than 1 ∈ E20,0 must get killed by differentials. We know H ∗ (S 1 ) = Z[t]/(t2 )
with t in degree 1, so t ∈ E20,1 must get killed by the only differential that it can support:

36 In general, if H1 and H2 are Hilbert spaces, possibly infinite-dimensional, one can define a similar inner product on a
restricted class of operators, namely Hilbert-Schmidt operators. Because Cn is finite-dimensional, we don’t need to worry about
this today.

(22.10)

Therefore there’s some $c \in H^2(B\mathbb T) = E_2^{2,0}$ with $d_2(t) = c$. There can be no other linearly independent classes
in $E_2^{2,0}$, since they could not be killed by any differentials. Similarly, there can be nothing in $H^1(B\mathbb T) = E_2^{1,0}$,
as it would have to survive to the $E_\infty$-page.
This spectral sequence is multiplicative, so $E_2^{2,1}$ is cyclic, generated by ct, and $d_2(ct) = c^2$, since d2 satisfies
the Leibniz rule. This continues along to $d_2(c^n t) = c^{n+1}$, and we see that the spectral sequence collapses on
the E3 -page, and H ∗ (BT) = Z[c] as claimed (well, this implies the result over R).
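Since the spectral sequence diagrams did not survive, the differentials in this example can be summarized in formulas. This is a sketch with integer coefficients; the sign in the Leibniz rule is trivial here because $c^n$ has even total degree.

```latex
% E_2-page of the Serre spectral sequence for T -> S(H) -> P(H):
\[
  E_2^{p,q} = H^p(B\mathbb T) \otimes H^q(S^1),
  \qquad E_2^{p,q} = 0 \text{ unless } q \in \{0, 1\}.
\]
% The transgression and its multiplicative extension:
\[
  d_2(t) = c,
  \qquad
  d_2(c^n t) = d_2(c^n)\,t + c^n\,d_2(t) = c^{n+1},
\]
% where d_2(c^n) = 0 because d_2 lowers q by 1 and c^n sits in the row q = 0.
% Hence d_2 : E_2^{2n,1} -> E_2^{2n+2,0} is an isomorphism for every n >= 0, and
\[
  E_3^{p,q} = E_\infty^{p,q} =
  \begin{cases} \mathbb Z, & (p,q) = (0,0),\\ 0, & \text{otherwise,} \end{cases}
\]
% consistent with the total space S(H) being contractible.
```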

(22.11)

This is an instance of a general phenomenon in spectral sequences called transgression, where a differential
sends a class on the vertical line to a class on the horizontal line.
TODO: there’s a geometric interpretation of this transgression, and it has something to do with canonical
differential forms called the Maurer-Cartan and Chern-Simons forms, but I missed it.
Example 22.12. Suppose G = SU2 . As we discussed in Example 21.5, I • (SU2 ) = R[s], with s in degree 4;
explicitly, s(A) = tr(A2 ) for A ∈ su2 . One can compute the cohomology of BSU2 in a similar manner as BT
in the previous example: apply the Serre spectral sequence to the fibration SU2 → ESU2 → BSU2 , and use
the fact that ESU2 is contractible and SU2 ≅ S 3 . The result is H ∗ (BSU2 ; R) ≅ R[c2 ], with c2 in degree 4,
and the Chern-Weil map is an isomorphism, sending s to a nonzero multiple of c2 .
Example 22.13. The story for SL2 (C) is very similar to SU2 : I • (SL2 (C)) = R[s], with s in degree 4 again
given by s(A) = tr(A2 ). Since SL2 (C) deformation retracts onto SU2 , BSL2 (C) is homotopy equivalent to BSU2 , and therefore its cohomology is the
same. The Chern-Weil map is once again an isomorphism.
Example 22.14. The Chern-Weil map is not always an isomorphism: for SL2 (R), I • (SL2 (R)) = R[s], with
s in degree 4, and since SL2 (R) ≃ SO2 = T, H ∗ (BSL2 (R); R) ≅ R[c], with c in degree 2. The Chern-Weil
map sends s ↦ c2 (up to a nonzero scalar), so it is not surjective.
As long as G is compact, though, things are nice.
Theorem 22.15. If G is a compact Lie group, the Chern-Weil homomorphism ω : I 2∗ (G) → H ∗ (BG; R) is
an isomorphism.
Lecture 23.
Chern-Weil and Chern-Simons forms: 4/18/19

“So take a 0-simplex – that’s a fancy word for point. . . ”


Today, we’d like to sketch the proof of Theorem 22.15: that for a compact Lie group, the Chern-Weil
homomorphism is an isomorphism from the ring of invariant polynomials on G, i.e. (Sym• g∗ )G , onto the real
cohomology of BG.

Definition 23.1. Let G be a Lie group. A torus T ⊂ G is a compact connected abelian Lie subgroup T of G. If T is
not strictly contained in another torus, it’s called a maximal torus.
Compact connected abelian Lie groups are necessarily isomorphic to Tn , hence the name “torus.” In our proof of
Theorem 22.15, we’ll need a few facts.
Proposition 23.2.
(1) Every compact connected Lie group G has a maximal torus T ⊂ G; furthermore, all maximal tori are
conjugate.
(2) G/T is a compact complex manifold, and it has a CW decomposition with only even-dimensional cells
called the Bruhat decomposition.
(3) Let N := N (T ) ⊂ G be the normalizer of T in G. Then W := N (T )/T is a finite group called the
Weyl group. The number of cells in the Bruhat decomposition is |W |.
(4) H ∗ (G/N ; R) ≅ R.
Example 23.3. If G = Un , the diagonal matrices are a maximal torus. Specializing to U2 ,

(23.4) N = \set *{\begin {pmatrix}\lambda _1 & 0\\0 & \lambda _2\end {pmatrix}, \begin {pmatrix}0 & \mu _1\\\mu _2 & 0\end {pmatrix}},

so the Weyl group is Z/2. Then U2 /T ≅ CP1 = S 2 , with a single 0-cell and a single 2-cell. This is a double
cover of U2 /N , which doesn’t leave us many options, and indeed U2 /N ≅ RP2 , whose nonzero-degree real
cohomology does vanish.
Why is G/T complex? Consider the restriction of the adjoint action of G on g to T ⊂ G. One can prove
in Lie theory that this representation splits as

(23.5) \label {fgsplit} \fg = \ft \oplus \bigoplus _{\text {roots }\rho } \fg _\rho ,

where each gρ is two-dimensional, and t is the Lie algebra of T . The tangent space of G/T at the coset T is
g/t, which by (23.5) splits as a T -representation as a sum of these gρ . So it suffices to choose a complex
structure on each gρ , and then check that the Frobenius tensor vanishes. The way to construct the complex
structure on gρ is to complexify: the analogue of (23.5) for gC := g ⊗ C is

(23.6) \fg _\C = \ft _\C \oplus \bigoplus _{\text {roots } \alpha } (\fg _\alpha \oplus \fg _{-\alpha }),

where tC := t ⊗ C, and −α is some negation operation on the roots. Thus, we can get a complex structure by
choosing which of g±α is g1,0 and which is g0,1 , and Lie theory provides nice ways to make this choice. Then
one has to argue that this almost complex structure integrates, but it does and things are good.
Remark 23.7. Another approach is to know that G has a complexification GC , and that G/T is diffeomorphic
to GC modulo a Borel subgroup, and this is manifestly a complex manifold, because GC and the Borel are
complex manifolds.
In general, the principal T -bundle G → G/T controls the geometry of G/T , in that the tangent bundle is
an associated bundle to G → G/T . This is smaller, and therefore makes life easier.
Remark 23.8. There’s a quick proof of Proposition 23.2, part (4), assuming the other three statements: we
know π : G/T → G/N is a degree-|W | cover, so |W |χ(G/N ) = χ(G/T ). Since G/T has cells only in even
degrees, and has exactly |W | of them, then χ(G/T ) = |W |, hence χ(G/N ) = 1.
Now, consider the maps

(23.9) \xymatrix { H^{*}(G/N;\R )\ar @<0.4ex>[r]^-{\pi ^*} & H^{*}(G/T;\R )\ar @<0.4ex>[l]^-{\pi _*}, }

where π ∗ is pullback as usual, and π∗ is the Gysin map, which on differential forms is summing over the fiber.
Thus π∗ ◦ π ∗ = |W |, so π ∗ is an injection. Since H ∗ (G/T ; R) is concentrated in even degrees, so is H ∗ (G/N ; R); as χ(G/N ) = 1, there
can only be one nontrivial cohomology group, and it must be H 0 = R.
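As a quick check of the Euler characteristic argument, here is the case G = U2 from Example 23.3, using only the identifications stated there (U2/T ≅ S² and U2/N ≅ RP²).

```latex
% G = U_2, |W| = 2.
\[
  \chi(U_2/T) = \chi(S^2) = 2 = |W|,
  \qquad
  \chi(U_2/N) = \chi(\mathbb{RP}^2) = 1,
  \qquad
  |W|\cdot\chi(U_2/N) = \chi(U_2/T).
\]
% Real cohomology of the quotient by the normalizer:
\[
  H^k(\mathbb{RP}^2;\mathbb R) =
  \begin{cases} \mathbb R, & k = 0,\\ 0, & k > 0, \end{cases}
\]
% in agreement with Proposition 23.2, part (4).
```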

Proof sketch of Theorem 22.15. First, we prove the theorem in the case where G = T is a compact connected
abelian Lie group. TODO: Serre spectral sequence calculation for the fiber bundle T → ET → BT to show
H ∗ (BT ; R) ≅ Sym• (t∗ ); the adjoint action is trivial, because T is abelian. The idea is: the E2 -page has R in degree
zero, and then $d_2\colon E_2^{0,1} \to E_2^{2,0}$ is an isomorphism t∗ → t∗ , because everything has to vanish. On the next line,
we get $E_2^{0,2} = \Lambda^2\ft^*$, which maps by d2 into $E_2^{2,1} = \ft^*\otimes\ft^*$; the quotient must in turn map isomorphically by d2 onto $E_2^{4,0}$, which
is therefore Sym2 t∗ . And so on.
Now we assume G is connected and compact, but not necessarily abelian. Choose a maximal torus T and
let N and W be as above. Since T and N are subgroups of G, they act freely on EG and therefore we get a
model for ET → BT as EG → EG/T and EN → BN as EG → EG/N . With these models, BT → BN is
a principal W -bundle. Therefore
(23.10) H^*(BN;\R )\cong H^*(BT;\R )^W \cong (\Sym ^\bullet (\ft ^*))^W\cong \Sym ^\bullet (\fg ^*)^G.
Now we descend one step further: the realization of BN above gives us a fiber bundle BN → BG with
fiber G/N . The real cohomology of G/N is trivial (concentrated in degree 0), so if you apply the Serre
spectral sequence to this fiber bundle, it collapses, and we conclude the map H ∗ (BG; R) → H ∗ (BN ; R) is an
isomorphism.
We have two things left to do: first, remove the assumption that G is connected, and then show that
the Chern-Weil homomorphism implements (some scalar multiple of) the above identification H ∗ (BG; R) ≅
(Sym• g∗ )G .
First, let G be a general compact Lie group, and let G0 ⊂ G be the connected component of the identity,
which is a Lie subgroup. There is a short exact sequence
(23.11) \xymatrix { 1\ar [r] & G_0\ar [r] & G\ar [r] & \pi _0G\ar [r] &1, }
hence a principal π0 G-bundle BG0 → BG (constructed in a similar way to the principal W -bundle BT → BN
above). Therefore
(23.12) H^*(BG;\R )\cong H^*(BG_0;\R )^{\pi _0 G} = ((\Sym ^\bullet \fg ^*)^{G_0})^{\pi _0G} \cong (\Sym ^\bullet \fg ^*)^G.
Now the last part, which will be the sketchiest. As above, we’ll start with the torus. The key here is to
identify the transgression d2 : E20,1 → E22,0 in the Leray-Serre spectral sequence with an invariant polynomial;
then, the higher degrees follow, because on both sides of the identification they’re polynomials in t∗ . Next,
one passes from T to N ; because this is a finite cover this isn’t too bad, though one will have to also argue
why the universal connections are compatible across that cover. The last two steps are similar. □

Exercise 23.13. Fill in the details of the last step of the proof, that the identification H ∗ (BG; R) ≅
(Sym• g∗ )G is compatible with the Chern-Weil homomorphism.
Now let’s change gears a bit: let G be any Lie group, not necessarily compact, and let π : P → M be a
principal G-bundle. Let AP ⊂ Ω1P (g) be the affine space of connections. Then there is a principal G-bundle
AP × P → AP × M , where G acts trivially on the first factor, and this carries a universal connection
ΘP ∈ Ω1AP ×P (g). At the point (Θ, p), where Θ ∈ AP and p ∈ P , this connection is a map
(23.14) (\Theta _P)_{(\Theta , p)}\colon T_{(\Theta , p)}(\cA _P\times P) = T_\Theta \cA _P\oplus T_pP\longrightarrow \fg ,
and we have an identification $T_\Theta\cA_P = \Omega^1_M(\fg_P) \subset \Omega^1_P(\fg)$. So, thought of as a map $\Omega^1_P(\fg) \oplus T_pP \to \fg$, this
is the map (τ, ξ) ↦ Θp (ξ). Let $\Omega_P \in \Omega^2_{\cA_P\times M}(\fg_{\cA_P\times P})$ denote the curvature form.
Proposition 23.15. The curvature form splits as follows: its (2, 0)-piece is equal to zero, its (0, 2)-piece is
Ω(Θ), and its (1, 1)-piece is
(ΩP )(Θ,m) (τ, η) = τm (η).
Here the (p, q) decomposition is in terms of horizontal and vertical, since this is over a direct product.
TODO: I didn’t really follow this, and so I can’t fill in the proof. Sorry about that. :(
Now, given an invariant polynomial f , the Chern-Weil construction produces some $\omega_f(\Theta_P) \in \Omega^{2k}_{\cA_P\times M}$
(yes, we are using the calculus of differential forms on an infinite-dimensional space, but will only look on
finite-dimensional submanifolds), and it has the property that the restriction to {Θ} × M for any Θ ∈ AP is ωf (Θ),
the Chern-Weil form for f and this connection Θ.

Now, given two connections Θ0 and Θ1 , there is a unique affine line φ : ∆1 × M → AP × M , and we can
pull back and integrate to obtain a (2k − 1)-form

(23.16) \alpha (\Theta _0, \Theta _1)\coloneqq \int _{\Delta ^1} \phi ^*\omega _f(\Theta _P)

on M .
Proposition 23.17. ωf (Θ1 ) − ωf (Θ0 ) = dα(Θ0 , Θ1 ).
The proof uses Stokes’ theorem and the fact that dωf (ΘP ) = 0. In particular, not only is the difference
between two forms associated to f via different connections exact, but we have a canonical way to see that
it’s exact. Consequently,
Corollary 23.18. Given a principal G-bundle π : P → M and an invariant polynomial f , the cohomology
class of the Chern-Weil form for f and P does not depend on the choice of connection on π.
This α is our first example of a Chern-Simons form. Let’s see what it looks like. Explicitly, there is some
one-form τ such that Θ1 = Θ0 + τ , and φ(t) = Θ0 + tτ . Using Proposition 23.15,
(23.19) (\phi ^*\Omega _P)_{t,m} = \Omega (\Theta _t)_m + \d t\wedge \tau _m.
When we apply f , we get
(23.20) \phi ^*\omega _f(\Theta _P) = k\,\d t\wedge f(\tau , \Omega , \dotsc ,\Omega ) + f(\Omega ,\dotsc ,\Omega ),
where Ω = Ω(Θt ) and there are k − 1 copies of Ω in the first term. Integrating over ∆1 picks out the first term.
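Carrying out the integration gives a closed formula for this Chern-Simons form. The following is a sketch that assumes the orientation and sign conventions implicit in (23.16) and (23.20); with other conventions the overall sign may change.

```latex
% With \Theta_t = \Theta_0 + t\tau and \Omega_t = \Omega(\Theta_t):
\[
  \alpha(\Theta_0, \Theta_1)
  = k \int_0^1 f\bigl(\tau, \underbrace{\Omega_t, \dots, \Omega_t}_{k-1}\bigr)\,\mathrm dt ,
  \qquad
  \omega_f(\Theta_1) - \omega_f(\Theta_0) = \mathrm d\,\alpha(\Theta_0, \Theta_1).
\]
% Special case k = 1 (f linear): the integrand does not depend on t, so
\[
  \alpha(\Theta_0, \Theta_1) = f(\tau) = f(\Theta_1 - \Theta_0).
\]
```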
Lecture 24.
Chern-Simons forms, II: 4/23/19

“One day I threw my hands up – not really, it was some strong language – and decided to
write down a set of consistent sign conventions.”
Let G be a Lie group and π : P → M be a principal G-bundle, where M is a manifold. Let AP ⊂ Ω1P (g) be
the affine space of connections. Last time, we considered the “universal connection” on P : the principal
G-bundle
(24.1) \id \times \pi \colon \cA _P\times P\to \cA _P\times M
carries a tautological 1-form $\Theta_P \in \Omega^{0,1}_{\cA_P\times P}(\fg) \subset \Omega^1_{\cA_P\times P}(\fg)$, where “(0, 1)” refers to the bigrading induced by
the product. The de Rham differential also splits: let δ be the de Rham differential in the AP direction, and d
be the de Rham differential in the P direction. (We will also use d to denote the de Rham differential on M .)
Now ΘP is defined by the formula (ΘP )(Θ,p) = Θp : that is, given a point p ∈ P and a connection Θ ∈ AP ,
the value of ΘP at (Θ, p) ∈ AP × P is the value of Θ at p. Last time, we stated in Proposition 23.15 that
the curvature of ΘP is
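(24.2a) \Omega _P = (\delta + \d )\Theta _P + \frac 12[\Theta _P\wedge \Theta _P]
(24.2b) = \delta \Theta _P + \d \Theta + \frac 12[\Theta \wedge \Theta ]
(24.2c) = \underbrace {\delta \Theta _P}_{\in \Omega ^{1,1}} + \underbrace {\Omega (\Theta )}_{\in \Omega ^{0,2}}.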

At a point (Θ, p), the first term is


(24.3) (\delta \Theta _P)_{(\Theta , p)}((\dot \Theta _1, \dot p_1), (\dot \Theta _2, \dot p_2)) = \dot \Theta _1(\dot p_2) - \dot \Theta _2(\dot p_1).
Definition 24.4. Given f ∈ I k (G), let
\omega _f(\Theta _P) \coloneqq f(\Omega _P\wedge \dotsb \wedge \Omega _P) \in \Omega ^{2k}_{\cA _P\times M},
which is a closed form.
Example 24.5. If G = Un and f : un → R is given by f (A) = c tr(A3 ) for some c ∈ R, then ωf (ΘP ) =
c tr(ΩP ∧ ΩP ∧ ΩP ). Wedging together three matrices of 2-forms gives a single matrix of 6-forms, and we
take the trace to obtain a 6-form.

At the end of the previous lecture, we considered a Chern-Simons form associated to two connections Θ0
and Θ1 . Specifically, because AP is affine, there’s a unique affine line φ : [0, 1] → AP with φ(0) = Θ0 and
φ(1) = Θ1 . Then we define

(24.6) \alpha (\Theta _0, \Theta _1) \coloneqq \int _0^1 \phi ^*\omega _f(\Theta _P) \in \Omega ^{2k-1}(M).

We will compute this explicitly for k = 1, 2.


Example 24.7 (k = 1). In this case f is linear. The displacement vector along φ is Θ̇ = Θ1 − Θ0 , so
Θt := φ(t) = Θ0 + tΘ̇. Since
(24.8) \omega _f(\Theta _P) = f(\Omega _P) = f(\delta \Theta _P + \Omega )
and
(24.9) \phi ^*\Omega _P = \d t\wedge \dot \Theta + \Omega _t \in \Omega ^2_{[0,1]\times P}(\fg ),
where t denotes the coordinate on [0, 1] and Ωt is the curvature of Θt , we get
(24.10) \int _0^1 f(\phi ^*\Omega _P) = \int _0^1\d t\, f(\dot \Theta ) = f(\dot \Theta ).

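Example 24.11 (k = 2). This time, f is quadratic, so we need to care about second-order terms. So let’s
compute:
(24.12a) \Omega _t = \d (\Theta _0 + t\dot \Theta ) + \frac 12\left [(\Theta _0 + t\dot \Theta )\wedge (\Theta _0 + t\dot \Theta )\right ]
(24.12b) = \Omega _0 + t(\d \dot \Theta + [\Theta _0\wedge \dot \Theta ]) + \frac {t^2}{2}[\dot \Theta \wedge \dot \Theta ]
(24.12c) = \Omega _0 + t\,\d _{\Theta _0}\dot \Theta + \frac {t^2}{2}[\dot \Theta \wedge \dot \Theta ].
Thus
(24.13a) \phi ^*\omega _f(\Theta _P) = f\paren {(\d t\wedge \dot \Theta + \Omega _t)\wedge (\d t\wedge \dot \Theta + \Omega _t)}
(24.13b) = 2\,\d t\wedge f(\dot \Theta \wedge \Omega _t) + f(\Omega _t\wedge \Omega _t).
Since f is a bilinear form, we can compute
(24.14a) \int _0^1 \phi ^*\omega _f(\Theta _P) = 2\int _0^1 \d t\, f(\dot \Theta \wedge \Omega _t)
(24.14b) = 2\int _0^1 \d t\, \paren {f(\dot \Theta \wedge \Omega _0) + t f(\dot \Theta \wedge \d _{\Theta _0}\dot \Theta ) + \frac {t^2}{2} f(\dot \Theta \wedge [\dot \Theta \wedge \dot \Theta ])}
(24.14c) = 2f(\dot \Theta \wedge \Omega _0) + f(\dot \Theta \wedge \d _{\Theta _0}\dot \Theta ) + \frac 13 f(\dot \Theta \wedge [\dot \Theta \wedge \dot \Theta ])
(24.14d) = f(\dot \Theta \wedge \Omega _0) + f(\dot \Theta \wedge \Omega _1) - \frac 16 f(\dot \Theta \wedge [\dot \Theta \wedge \dot \Theta ]).
In the last step we used (24.12c) at t = 1, that is, Ω1 = Ω0 + dΘ0 Θ̇ + (1/2)[Θ̇ ∧ Θ̇].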
We’re in the situation of a fiber bundle π : M → S, where the fiber F is a compact n-manifold with
boundary and S is compact, and the relative tangent bundle T (M/S) → M is oriented. In this setting we
can make progress on Proposition 23.17 with Stokes’ theorem.
There is a map
(24.15) \int _{M/S}\colon \Omega ^q(M)\to \Omega ^{q-n}(S)

called integration along the fiber. There is a projection map Ωq (M ) → Ωn (F ) ⊗ Ωq−n (S); then we can
integrate the first component over F as usual.
Theorem 24.16 (Stokes). Let ω ∈ Ωq (M ). Then

(24.17) \d \int _{M/S}\omega = \int _{\partial M/S}\omega \pm \int _{M/S}\d \omega .

The sign depends on n, but it wasn’t clear in lecture what the exact formula is.
Now let’s define the more commonly considered Chern-Simons form, which requires data of only a single
connection. Consider the pullback bundle π ∗ P → P . It’s endowed with a canonical section ∆: the fiber
over any p ∈ P is Pπ(p) , which contains p, so ∆(p) := p. You can think of this as the diagonal map into the
product.
The trivialization means we can choose the trivial connection Θ∆ ∈ Aπ∗ P , and it’s flat (zero curvature).
Thus ∆∗ Θ∆ = 0. Given a connection on P , we can pull it back to π ∗ P , which is an affine embedding
π ∗ : AP → Aπ∗ P .

Definition 24.18. Given a Θ ∈ AP , let $\alpha_f(\Theta) \coloneqq \alpha_f(\Theta_\Delta, \pi^*\Theta) \in \Omega^{2k-1}_P$, which we call the Chern-Simons
form.

Here αf is overloaded: we’re defining αf of a single connection in terms of αf of two connections.

Corollary 24.19. dαf (Θ) = π ∗ ωf (Θ).

What does this tell us? Well, ωf (Θ) is closed but not necessarily exact. If we’d like to write it as d of something, we
have to pull back to P , and here it is exact, and we have a nice, explicit choice of a form hitting it.

Corollary 24.20. Restricted to any fiber of π, αf (Θ) is closed.

This is an example of transgression in the Serre spectral sequence for this fiber bundle. The transgression
tells us how to go from cohomology classes on the fiber to cohomology classes on the base, but differential
geometry tells us a specific choice, and this choice is also useful elsewhere.

Proposition 24.21. Explicitly, for k = 2 the Chern-Simons form is f (Θ ∧ Ω) − (1/6)f (Θ ∧ [Θ ∧ Θ]).

Proof sketch. (TODO: some of this was erased before I could write it down.) We use the formulas we computed
in Examples 24.7 and 24.11. In particular, instead of f (Θ̇) we get f (Θ), and f (Θ̇ ∧ Ω0 ) vanishes (TODO: is
this because Ω0 = 0 and Ω1 = Ω?). □
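One way to see where the formula in Proposition 24.21 comes from is to substitute into (24.14d). The sketch below takes the guess in the TODO as an assumption: in the trivialization of π∗P given by ∆, the connection Θ∆ is zero, so the displacement is Θ̇ = Θ, with Ω0 = 0 (Θ∆ is flat) and Ω1 = Ω.

```latex
% Substitute \dot\Theta = \Theta, \Omega_0 = 0, \Omega_1 = \Omega into (24.14d):
\[
  \alpha_f(\Theta)
  = f(\Theta\wedge\Omega) - \tfrac16\, f(\Theta\wedge[\Theta\wedge\Theta]).
\]
% Equivalent form, using \Omega = d\Theta + (1/2)[\Theta \wedge \Theta]:
\[
  \alpha_f(\Theta)
  = f(\Theta\wedge\mathrm d\Theta)
  + \bigl(\tfrac12 - \tfrac16\bigr) f(\Theta\wedge[\Theta\wedge\Theta])
  = f(\Theta\wedge\mathrm d\Theta) + \tfrac13\, f(\Theta\wedge[\Theta\wedge\Theta]).
\]
```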

Sometimes one writes Ω = dΘ, or leaves off the brackets in [Θ ∧ Θ].

Proposition 24.22. Let m ∈ M and im : Pm ,→ P be the inclusion of the fiber. Then

(24.23) i_m^*\alpha _f(\Theta ) = c_k f(\Theta \wedge \underbracket {[\theta \wedge \theta ]\many \wedge [\theta \wedge \theta ]}_{k-1}) \in \Omega _G^{2k-1},

where c1 = 1 and c2 = −1/6, and θ is the Maurer-Cartan form.

That’s the Chern-Simons form; what do people do with it? Well, Chern and Simons did some interesting
classical things, and Witten did completely different things more recently. We’ll start with the former.
Let k = 2 and fix a G-invariant symmetric bilinear form f : g × g → R. Assume G is compact, so the space
of these is isomorphic to H 4 (BG; R). Let π : P → M be a principal G-bundle, where M is a 3-manifold. Then
α(Θ) ∈ Ω3P is closed, because dα = π ∗ f (Ω ∧ Ω) = 0, as there are no nonzero 4-forms on M .
TODO: then there was something involving integrating over the triangle in Aπ∗ P defined by π ∗ Θ0 , π ∗ Θ1 ,
and Θ∆ , but I could not follow any of it.
Another thing we could do is consider the map F : AP × Γ(π) → R defined by

(24.24) \label {Fthetas} F(\Theta , s) \coloneqq \int _M s^*\alpha (\Theta ).

But are there even sections? Γ(π) could be empty. This is a topological question: when does a principal
G-bundle have a section over a 3-manifold? We can use obstruction theory. If G is simply connected, π0 G
and π1 G vanish, and there’s a general theorem that π2 G = 0. This is enough to imply that there’s always a
section. This works for SU_n and Spin_n (n > 2), but not SO_n, Pin^±_n, or U_n, which are all either not connected
or not simply connected.
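The obstruction-theoretic claim can be spelled out as follows. This is only a sketch, assuming M has a CW structure with cells of dimension at most 3 and that G is connected (so the coefficient systems are untwisted); the symbol \mathfrak{o}_{k+1} is notation introduced here, not in the lecture.

```latex
% Extending a section of P -> M from the k-skeleton to the (k+1)-skeleton
% is obstructed by a class with coefficients in the k-th homotopy group of the fiber G.
\[
  \mathfrak{o}_{k+1} \in H^{k+1}\bigl(M;\, \pi_k(G)\bigr), \qquad k = 0, 1, 2 .
\]
% Since \dim M = 3, only k \le 2 can contribute. If
% \pi_0(G) = \pi_1(G) = \pi_2(G) = 0, all three coefficient groups vanish,
% so every obstruction vanishes and a global section exists.
```

For SO_n or U_n the group π_1(G) is nonzero, so the class in H^2(M; π_1(G)) may be nontrivial, which is why the argument does not apply to them.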
66 M392C (Mathematical gauge theory) Lecture Notes

Anyways, suppose Γ(π) is nonempty. Let δ1 be the de Rham differential on AP × Γ(π) in the AP direction,
and let δ2 be the differential in the Γ(π) direction. If ζ is a vertical vector field,
(24.25a) \iota_\zeta\delta_2 F = \int_M \zeta\cdot[s^*\alpha(\Theta)]
(24.25b) = \int_M s^*\iota_\zeta\d\alpha(\Theta)
(24.25c) = \int_M s^*\iota_\zeta\pi^*\omega(\Theta) = 0,

because ω(Θ) = 0. This suggests that F is constant, which is wrong – it’s only locally constant. The space of
sections is not always connected (though typically it is). Next time we’ll discuss a little more Chern-Simons
theory.

Lecture 25.
Classical Chern-Simons theory: 4/25/19

“There are ten terms. That’s a double-digit number. But undaunted, we continue.”
Today we’ll discuss aspects of classical Chern-Simons theory in three dimensions. Let G be a compact Lie
group; not all of what we do today will require compactness. Fix a degree-2 invariant polynomial, which we
think of as a G-invariant symmetric bilinear form

(25.1) \ang {\bl ,\bl }\colon \fg \times \fg \to \R .

For example, we could take

(25.2) \label {adad} \ang {\zeta _1,\zeta _2}\coloneqq \tr (\ad _{\zeta _1}\circ \ad _{\zeta _2}),

where adζ := [ζ, –] : g → g is the adjoint action (so this pairing is the Killing form). Symmetry of the trace implies this is symmetric. This polynomial
may be zero, e.g. for any abelian Lie group, but on a simple Lie group it is nonzero, and in general such
forms are identified with H 4 (BG; R).
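As a sanity check on (25.2), here is the computation for g = su_2. This is a sketch; the basis e_a := −(i/2)σ_a (with σ_a the Pauli matrices) is a choice made here, not in the lecture.

```latex
% Basis e_a = -(i/2) sigma_a of su_2, with structure constants epsilon_{abc}.
\begin{gather*}
[e_a, e_b] = \textstyle\sum_c \varepsilon_{abc}\, e_c, \\
\operatorname{ad}_{e_1}(e_1) = 0, \qquad
\operatorname{ad}_{e_1}(e_2) = e_3, \qquad
\operatorname{ad}_{e_1}(e_3) = -e_2, \\
% ad_{e_1}^2 acts by 0, -1, -1 on e_1, e_2, e_3, so its trace is -2.
\langle e_1, e_1\rangle
  = \operatorname{tr}\bigl(\operatorname{ad}_{e_1}\circ\operatorname{ad}_{e_1}\bigr)
  = 0 + (-1) + (-1) = -2, \\
% Comparison with the trace form of the defining representation.
4\operatorname{tr}(e_1 e_1) = 4\operatorname{tr}\bigl(-\tfrac14\sigma_1^2\bigr) = -2 .
\end{gather*}
```

So on su_2 the pairing (25.2) agrees with 4 tr(XY) on this basis vector; in particular it is nonzero (and negative definite), consistent with the claim for simple groups.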
Now let π : P → M be a principal G-bundle. Let AP denote the space of connections and Θ ∈ AP . Let Ω
be the curvature 2-form of Θ and ω(Θ) := ⟨Ω ∧ Ω⟩ ∈ Ω4M ; if Θ is clear we will also denote this just by ω.
Let {ζα } be a basis for g; then π ∗ Ω = Ωα ζα for some Ωα ∈ Ω2P , and

(25.3) \pi ^*\omega = \ang {\Omega ^\alpha \zeta _\alpha , \Omega ^\beta \zeta _\beta } = \ang {\zeta _\alpha , \zeta _\beta }\Omega ^\alpha \wedge \Omega ^\beta .

We also have the Chern-Simons form

(25.4) \alpha (\Theta ) = \ang {\Theta \wedge \Omega } - \frac 16\ang {\Theta \wedge [\Theta \wedge \Theta ]}\in \Omega _P^3.

Again, if Θ is clear from context, we’ll call this α. Last time, we proved that if θ denotes the Maurer-Cartan
form, m ∈ M , and i : Pm ,→ P denotes inclusion of the fiber, then dα = π ∗ ω and

(25.5) i^*\alpha = -\frac 16\ang {\theta \wedge [\theta \wedge \theta ]}.

Let ϕ : P → P be a gauge transformation and g : P → G be the map satisfying ϕ(p) = p · g(p). How does the
Chern-Simons form change under ϕ? It suffices to know how Θ and Ω change in terms of g.

(25.6a) ϕ∗ Θ = Adg−1 Θ + g ∗ θ
(25.6b) ϕ∗ Ω = Adg−1 Ω.
Arun Debray May 9, 2019 67

For ease of reading, let φ = φg := g ∗ θ. Now we have


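(25.7a) \vp^*\alpha = \ang{(\Ad_{g^{-1}}\Theta + \phi_g)\wedge \Ad_{g^{-1}}\Omega} - \frac 16\ang{(\Ad_{g^{-1}}\Theta + \phi_g)\wedge [(\Ad_{g^{-1}}\Theta + \phi_g)\wedge (\Ad_{g^{-1}}\Theta + \phi_g)]}
(25.7b) = \ang{\Theta\wedge\Omega} - \frac 16\ang{\Theta\wedge[\Theta\wedge\Theta]} + \underbrace{\ang{\phi_g\wedge\Ad_{g^{-1}}\Omega}}_{\text{(I)}} - \underbrace{\frac 12\ang{\phi_g\wedge[\Ad_{g^{-1}}\Theta\wedge\Ad_{g^{-1}}\Theta]}}_{\text{(II)}} - \frac 12\ang{\Ad_{g^{-1}}\Theta\wedge[\phi_g, \phi_g]} - \frac 16\ang{\phi_g\wedge[\phi_g\wedge\phi_g]}.

Using the curvature formula, we can combine (I) and (II):

(25.7c) = \ang{\Theta\wedge\Omega} - \frac 16\ang{\Theta\wedge[\Theta\wedge\Theta]} + \underbrace{\ang{\phi_g\wedge\d(\Ad_{g^{-1}}\Theta)}}_{\text{(I)+(II)}} - \underbrace{\frac 12\ang{\Ad_{g^{-1}}\Theta\wedge[\phi_g, \phi_g]}}_{\text{(III)}} - \frac 16\ang{\phi_g\wedge[\phi_g\wedge\phi_g]}.

Now, using that dφg + (1/2)[φg ∧ φg ] = 0, we can replace [φg , φg ] in (III) with −2dφg . Hence we can combine
(I) + (II) and (III):

(25.7d) = \alpha + \underbrace{\d\ang{\Ad_{g^{-1}}\Theta\wedge\phi_g}}_{\text{(I)+(II)+(III)}} - \frac 16\ang{\phi_g\wedge[\phi_g\wedge\phi_g]}.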
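The coefficients 1/2 appearing in (25.7b) come from expanding a symmetric cubic expression. A sketch, assuming (as holds for an Ad-invariant symmetric form and g-valued 1-forms) that T(a, b, c) := ⟨a ∧ [b ∧ c]⟩ is unchanged under permuting its three arguments; the names T and a are shorthand introduced only for this check, with a standing for Ad_{g^{-1}}Θ and φ for φ_g.

```latex
\begin{align*}
% Binomial-type expansion of the cubic term, using total symmetry of T.
T(a+\phi,\, a+\phi,\, a+\phi)
  &= T(a,a,a) + 3\,T(\phi,a,a) + 3\,T(a,\phi,\phi) + T(\phi,\phi,\phi), \\
% Multiplying by -1/6 produces the coefficients seen in (25.7b).
-\tfrac16\, T(a+\phi,\, a+\phi,\, a+\phi)
  &= -\tfrac16\, T(a,a,a) - \tfrac12\, T(\phi,a,a)
     - \tfrac12\, T(a,\phi,\phi) - \tfrac16\, T(\phi,\phi,\phi).
\end{align*}
% By Ad-invariance, T(a,a,a) = <Theta ^ [Theta ^ Theta]> when a = Ad_{g^{-1}} Theta.
```

The four terms on the last line are, in order, the cubic term of α, term (II), the [φ_g, φ_g] term, and the final Maurer-Cartan term.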

We also have that

(25.8) \d \vp ^*\alpha = \vp ^*\d \alpha = \vp ^*\pi ^*\omega = (\pi \circ \vp )^*\omega = \pi ^*\omega .

Recall the map F : AP × Γ(π) → R defined in (24.24), and let e : Γ(π) × X → P be the evaluation map. Then
more or less by definition, F = ∫_X e∗α. Therefore we can directly check that F is closed with calculus of
differential forms – yes, they are over infinite-dimensional manifolds, but the logic goes through the same
way. So rather than inventing new and confusing words and notation, you can use the fact that everything
generalizes nicely.
(25.9a) \d F = \d\int_X e^*\alpha = \int_X \d e^*\alpha
(25.9b) = \int_X e^*\d\alpha
(25.9c) = \int_X e^*\pi^*\omega = \int_X \omega = 0.

So F is locally constant. If Γ(π) is disconnected, we can’t a priori expect it to be constant, only locally
constant. So let’s talk about the topology of the space Γ(π) of sections. Do we expect it to be connected?
We can write Γ(π) as a left GP -torsor: given a gauge transformation ϕ and a section s, we obtain a new
section ϕ ◦ s. We can also write Γ(π) as a right Map(X, G)-torsor: given a section s and a map ψ : X → G,
we have a new section x ↦ s(x) · ψ(x). As a corollary we discover that Γ(π), GP , and Map(X, G) all have the
same topology.

Example 25.10. In particular, the identification of Γ(π) with Map(X, G) as topological spaces helps us
understand whether the space of sections is connected, as π0 Map(X, G) = [X, G], the set of homotopy classes
of maps X → G. This is in general nontrivial.
(1) Suppose G = SU2 , which is diffeomorphic to S 3 . For X = S 3 , homotopy classes of maps [S 3 , S 3 ] are
classified by the degree (or winding number), which can be anything in Z. That’s disconnected.
(2) If G = U1 , [X, U1 ] = H 1 (X; Z), and this is often nonzero.

But it might nonetheless be true that F is constant, so let’s compare F (Θ, s) and F (Θ, s′) for two sections
s and s′. Let ϕ be the unique gauge transformation with s′ = ϕ ◦ s. Then
(25.11a) F(\Theta, \vp\circ s) = \int_X (\vp\circ s)^*\alpha(\Theta)
(25.11b) = \int_X s^*\vp^*\alpha(\Theta)
(25.11c) = \int_X s^*\alpha(\vp^*\Theta) = F(\vp^*\Theta, s).
That’s pleasing. Next, we let A := s∗ Θ and compute
(25.12a) F(\Theta, \vp\circ s) - F(\Theta, s) = F(\vp^*\Theta, s) - F(\Theta, s)
(25.12b) = \int_X s^*(\vp^*\alpha(\Theta) - \alpha(\Theta))
(25.12c) = \int_X \d\ang{\Ad_{g^{-1}}A\wedge\phi_g} - \frac 16\ang{\phi_g\wedge[\phi_g\wedge\phi_g]}
(25.12d) = \int_X g^*\paren{-\frac 16\ang{\theta\wedge[\theta\wedge\theta]}},
where g : X → G is the map associated to ϕ. The form (−1/6)⟨θ ∧ [θ ∧ θ]⟩ ∈ Ω3G is closed, hence defines a
class c ∈ H 3 (G; R). In general this is nontrivial, and so in general (25.12d) is nonzero, so F is not constant.
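To see concretely that c can be nontrivial, consider G = SU_2. The following is a sketch under an assumed normalization ⟨a, b⟩ := −(1/8π²) tr(ab) on su_2, which is a choice made here and is not fixed in the lecture; θ = g^{-1} dg is the Maurer-Cartan form written as a matrix of 1-forms.

```latex
\begin{align*}
% For matrix-valued 1-forms, the bracket is twice the matrix wedge product.
[\theta\wedge\theta] &= 2\,\theta\wedge\theta, \\
% Insert the assumed normalization <a,b> = -(1/8 pi^2) tr(ab).
-\tfrac16\,\langle\theta\wedge[\theta\wedge\theta]\rangle
  &= \frac{1}{24\pi^2}\,\operatorname{tr}(\theta\wedge\theta\wedge\theta), \\
% tr(theta^3) is 12 times the round volume form on SU_2 = S^3, of volume 2 pi^2.
\int_{SU_2} \frac{1}{24\pi^2}\,\operatorname{tr}(\theta\wedge\theta\wedge\theta)
  &= \pm 1 \quad\text{(sign depending on the orientation of } SU_2\text{)}.
\end{align*}
```

With this normalization, (25.12d) for X = S^3 and G = SU_2 is, up to that sign, the degree of g : S^3 → SU_2, which can be any integer.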
Hypothesis 25.13. Suppose that c is in the image of the map H 3 (G; Z) → H 3 (G; R).
This is true, for example, if ⟨–, –⟩ is in the image of H 4 (BG; Z) → H 4 (BG; R). So we expect that, in
general, there’s a lattice of symmetric bilinear forms for which this hypothesis holds, inside the vector space
of possible choices.³⁷ Let’s assume this, and let λ denote the chosen preimage of ⟨–, –⟩; this is called the level.
Remark 25.14. If G is connected, H 4 (BG; Z) is torsion-free, so the map H 4 (BG; Z) → H 4 (BG; R) is
injective. If G is disconnected, we cannot expect this; for example, take G = Z/2; since BZ/2 ≃ RP∞ , then
H 4 (BZ/2; Z) = Z/2.
Anyways, assuming Hypothesis 25.13, (25.12d) is an integer, and letting F̄(Θ, s) := F (Θ, s) mod 1 ∈
R/Z, this is independent of s, defining a map F̄ : AP → R/Z.
Remark 25.15. This is a typical example of a secondary invariant, associated to the primary invariant ⟨–, –⟩
(or rather, the characteristic class it defines in H 4 (BG)). As is typical, it’s in one degree lower than the
primary invariant, and depends on the geometric data of the connection, rather than just the topological
data of the principal bundle.
Now we have a map F̄; we would like to understand its derivative. Fix models of EG and BG, which need
not be the smooth manifold model, and choose a G-equivariant map f̃ : P → EG, and let f : X → BG be
the quotient. Then we may consider the class f_*[X] ∈ H3 (BG).
Proposition 25.16. H3 (BG) is finite.
Proof. Well, H3 (BG) ⊗ R is dual to H 3 (BG; R) = 0: by Chern-Weil theory, we only get even-degree classes
here. 

Hence there is some k ∈ Z such that kf∗ [X] = 0. Choose a compact oriented 4-manifold W together with
a diffeomorphism ψ : ∂W → X^{⊔k} and an extension of P, Θ → X to P̃, Θ̃ → W . Let ω̃ ∈ Ω4W denote the
differential form associated to P̃ via ⟨–, –⟩.
Exercise 25.17. We haven’t yet proven that we can do this! To do so, we need to prove that the 3rd oriented
bordism group of BG, denoted Ω^SO_3(BG), is finite. Show this; it suffices to show that Ω^SO_3(BG) ⊗ R = 0,
which you can take care of with the Atiyah-Hirzebruch spectral sequence.
³⁷ Why should this be true? The idea is that α is a transgressing cochain for the Serre spectral sequence associated to the
fibration G → EG → BG, hitting c on the E2 -page. Unwinding the definitions leads one to show that c then transgresses for ω.

The (homotopy class of the) map g : W → BG induced by P̃ lets us consider g∗ [W ] ∈ H4 (BG; R), and by
construction,

(25.18) \partial \paren {\frac 1k g_*[W]} = f_*[X]\in H_3(BG;\R ).

But we know that f∗ [X] is an integer, so mod 1, (1/k)g∗ [W ] is a cycle. It therefore defines a homology class
η = [(1/k)g∗ [W ]] ∈ H4 (BG; (1/k)Z/Z).
Remark 25.19. A manifold W with an identification of its boundary with k copies of some other manifold is
sometimes called a Z/k-manifold; the argument we’ve just given shows that Z/k-manifolds have a fundamental
class in (1/k)Z/Z.
This allows us to explicitly describe F̄ as

(25.20) \overline F(\Theta ) = \frac 1k\int _W\widetilde \omega - \lambda (\eta )\pmod 1.

Here defining λ(η) uses a pairing H 4 (BG; Z) ⊗ H4 (BG; (1/k)Z/Z), which lands in (1/k)Z/Z, which we include
in R/Z by (1/k)Z ↪ R.

Lecture 26.
Classical Chern-Simons theory, II: 4/30/19
As usual, we have a compact Lie group G. Let R(1) := √−1 R and Z(1) ⊂ R(1) be 2π√−1 Z. This provides
a way to discuss the line of imaginary numbers, and a lattice inside it, without having to choose i or −i.
Given a level, i.e. a class λ ∈ H 4 (BG; Z(1)), its image in H 4 (BG; R(1)) determines via Chern-Weil theory
an invariant bilinear pairing ⟨–, –⟩ : g × g → R(1). Assume this is nondegenerate. Given a principal G-bundle
π : P → M , we considered a universal bundle Aπ × P → Aπ × M , where Aπ is the affine space of connections
on π. Over a product, the de Rham differential splits as dAπ ×M = δ + d, with δ the de Rham differential in
the Aπ direction and d the de Rham differential in the M direction.
This bundle carries a tautological connection Θπ with curvature Ωπ = δΘπ + Ω. Its Chern-Simons
form is

(26.1) \alpha _\pi = \ang {\Theta _\pi \wedge \Omega _\pi } - \frac 16\ang {\Theta _\pi \wedge [\Theta _\pi \wedge \Theta _\pi ]}

and its Chern-Weil form is


(26.2) \omega _\pi = \ang {\Omega _\pi \wedge \Omega _\pi } = 2\ang {\delta \Theta _\pi \wedge \Omega },
which is of type (1, 3) in the product splitting.
Now assume M is a closed, oriented 3-manifold. Let Γ(π) denote the space of sections of π, which we know
is a torsor for the gauge group Gπ . Let e : Γ(π) × M → P be the evaluation map (s, m) ↦ s(m), and consider
the map id × e in the diagram

(26.3) \gathxy { \cA _\pi \times \Gamma (\pi )\times M\ar [d]\ar [r]^-{\id \times e} & \cA _\pi \times P\ar [d]^{\id \times \pi }\\ \cA _\pi \times \Gamma (\pi ) & \cA _\pi \times M. }

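We defined F := ∫_M (id × e)∗ απ ∈ Ω0 (Aπ × Γ(π)), and showed that e^{F(Θ, s)} is independent of s. Then
(26.4a) \delta F = \int_M (\id\times e)^*\delta\alpha_\pi
(26.4b) = \int_M (\id\times e)^*(\id\times\pi)^*\omega_\pi
(26.4c) = 2\int_M \ang{\delta\Theta_\pi\wedge\Omega_\pi}\in\Omega^1(\cA_\pi).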

Hence
(26.5) \delta F_\Theta (\dot \Theta ) = 2\int _M \ang {\dot \Theta \wedge \Omega (\Theta )},

for Θ ∈ Aπ and Θ̇ ∈ Ω1M (gP ). We have proved the following result.


Proposition 26.6. The solutions to δFΘ = 0 are exactly the flat connections, i.e. those with Ω(Θ) = 0.
So the critical locus is A^♭_π ⊂ Aπ , the subspace of flat connections.
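The variation formula (26.5) can be checked by hand in a trivialization. This is a sketch, assuming a fixed section s of π and writing A := s^*Θ, a := s^*Θ̇, and F_A := dA + ½[A ∧ A] (notation introduced here); it uses only the symmetry of ⟨–, –⟩ and the total symmetry of ⟨a ∧ [b ∧ c]⟩ on 1-forms.

```latex
\begin{align*}
% Rewrite the Chern-Simons form using F_A = dA + (1/2)[A ^ A].
\alpha(A) &= \langle A\wedge \mathrm{d}A\rangle
            + \tfrac13\,\langle A\wedge[A\wedge A]\rangle, \\
% First-order variation in the direction a.
\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\Big|_{\varepsilon=0}\alpha(A+\varepsilon a)
  &= \langle a\wedge \mathrm{d}A\rangle + \langle A\wedge \mathrm{d}a\rangle
     + \langle a\wedge[A\wedge A]\rangle, \\
% Integrate by parts: <A ^ da> = <a ^ dA> + d<a ^ A>.
  &= 2\,\langle a\wedge F_A\rangle + \mathrm{d}\langle a\wedge A\rangle .
\end{align*}
% On a closed 3-manifold M the exact term integrates to zero,
% leaving 2 \int_M <a ^ F_A>, in agreement with (26.5).
```

Since ⟨–, –⟩ is assumed nondegenerate, the integral vanishes for all a exactly when F_A = 0, which is the content of Proposition 26.6.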
Digression 26.7. This is a Wick-rotated classical field theory.
The Lagrangian formulation of classical mechanics begins with the real line representing time, a Riemannian
manifold M , and a potential function V : M → R. We consider a particle moving on a trajectory x ∈
Map(R, M ), whose energy is given by the Lagrangian functional

(26.8) L = \paren {\frac 12\abs {\d x}^2 - V\circ x}\,\abs {\d t}.

This defines a classical-mechanical system, and is the usual starting point for classical mechanics. One first
computes the Euler-Lagrange equations, to determine where the Lagrangian is extremized. In this example,
the Euler-Lagrange equations tell us Newton’s laws. The classical solutions are those extremal points, and
are the trajectories which obey Newton’s laws.
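For concreteness, here is the Euler-Lagrange computation for (26.8) in the simplest case. This is a sketch assuming M = R^n with its flat metric and a unit-mass particle; the coordinates x^i are introduced here for illustration.

```latex
\begin{align*}
% Lagrangian density of (26.8) in flat coordinates.
L(x, \dot x) &= \tfrac12 \sum_i (\dot x^i)^2 - V(x), \\
% Euler-Lagrange equations: d/dt (dL/d\dot x^i) - dL/dx^i = 0.
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot x^i}
  - \frac{\partial L}{\partial x^i}
  &= \ddot x^i + \frac{\partial V}{\partial x^i} = 0 .
\end{align*}
% This is Newton's second law \ddot x = -\nabla V for a unit mass.
```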
To compute this, we do not need to integrate over time, so we can replace time R with spacetime X,
and instead of Map(R, M ), we consider the space of connections on P . The space of solutions to the Euler-Lagrange
equations is then expected to be a symplectic manifold. Strictly speaking, one obtains a one-form λ, and sets ω := dλ, but
in order for this to be symplectic, one must check a nondegeneracy condition on the second derivatives of L in
terms of the kinetic term and its time derivative: specifically, the Hessian (∂q̇i ∂q̇j L)i,j must be nondegenerate.
For Chern-Simons theory, this is satisfied by the assumption that ⟨–, –⟩ : g × g → R(1) is nondegenerate.
But in physics, one typically considers flat Minkowski space R3,1 , and here we’re considering generally
non-flat Riemannian geometry. This is a two-step process: Wick rotation brings us from flat Minkowski
spacetime to flat Euclidean space, and then one follows Riemann and Cartan into curved geometry. That
said, though, we never needed to use a metric on M ! This isn’t quite a topological field theory, though.
It turns out that we don’t quite get a symplectic structure on the space of classical solutions M^♭_π := A^♭_π /Gπ ;
it’s not quite nice enough. There’s a sense in which it has something called a shifted symplectic structure, but
we’re not going to worry about that.
Example 26.9. Let M := Σ(2, 3, 5), the Poincaré homology sphere, defined as SO3 modulo the subgroup of
symmetries of an icosahedron (isomorphic to A5 ). The fundamental group is a double cover of A5 called the
binary icosahedral group. Flat connections are identified with homomorphisms π1 M → G, and if G = SU2 ,
there are exactly two of these, so this is symplectic, but not in any interesting way.
A more concerning example is X = S 1 × S 2 with G = T; then Hom(π1 (S 1 × S 2 ), T) ≅ T, and this is
odd-dimensional, hence definitely not symplectic!
The place we do expect a symplectic structure is on R × Σ, where Σ is a closed, oriented 2-manifold (this
isn’t compact, but it’s OK). In this case M^♭(R × Σ) ≅ M^♭(Σ), and it’s possible to produce the symplectic
structure from the Chern-Simons form.
Digression 26.10. Let (M, ω) be a symplectic manifold. Is there a line bundle L → M with connection ∇
whose curvature is ω? (Here we’re viewing ω ∈ Ω2M (R(1)).)
Chern-Weil theory tells us one obstruction: the cohomology class Chern-Weil theory assigns to ∇ is in the
image of H 2 (M ; Z(1)) → H 2 (M ; R(1)), so we also need the cohomology class of ω to be in this image. It
turns out this is also a sufficient condition.
One can consider the category of symplectic manifolds and symplectomorphisms for doing symplectic
geometry, but it’s often nicer to work in the category of symplectic manifolds together with such a line bundle;
connections somehow behave nicer. This is sometimes called prequantization.
Suppose a group G acts on M and on the line bundle L → M . The moment map measures to what extent
the action preserves horizontal vectors; we can imagine descending to the quotient and wondering whether we
still have the data of the connection there.

Now let’s construct this line bundle in the setting of Chern-Simons theory. Suppose M is compact, but
possibly with boundary (of course, the boundary is closed), and use the orientation on M to orient ∂M . Fix
a connection Θ on P → M and let α denote the Chern-Simons form for Θ. We argued that if M is closed,
F (Θ, s) = ∫_M s∗ α is independent of s (modulo Z(1)); let’s reexamine this now that M may have a boundary.
First, let ϕ be a gauge transformation. We compute
(26.11a) F(\Theta, \vp\circ s) = \int_M (\vp\circ s)^*\alpha_\Theta = \int_M s^*(\vp^*\alpha_\Theta).
Recalling from (25.7d) the formula for ϕ∗ αΘ ,
(26.11b) = F(\Theta, s) + \int_{\partial M} s^*\ang{\phi_g\wedge\Ad_{g^{-1}}\Theta} + \int_M -\frac 16 s^*\ang{\phi_g\wedge[\phi_g\wedge\phi_g]}.
When M is closed, the last term mod Z(1) is constant, but this is no longer true. However, it only depends
on s∗ φg |∂M , which is the kind of thing that often happens.
Remark 26.12. You can use this to define (in some cases) the Wess-Zumino-Witten invariant of a closed,
oriented surface Y together with a principal bundle P → Y and connection Θ: choose a compact, oriented
3-manifold X with ∂X = Y and a principal bundle Q → X and connection ΘQ extending P and Θ (i.e.
restricted to Y , we get back P and Θ). Then the Wess-Zumino-Witten invariant is the Chern-Simons invariant
of this data on X, in R/Z.
First, you would wonder whether this depends on our choice of X, Q, and ΘQ , but there’s a standard
bordism argument here: if (X, Q, ΘQ ) and (X′, Q′, ΘQ′ ) are two choices of extensions, we can glue across Y
to obtain a closed 3-manifold X ∪Y X′, a principal G-bundle Q ∪P Q′ → X ∪Y X′, and a connection, and
the Chern-Simons invariant of this data is the difference of the Chern-Simons invariants (because we can split
the integrals; difference rather than sum because the orientations are different). And on a closed manifold,
this is in Z(1), so mod Z(1) this is well-defined.
TODO: I missed some details that are important in this, namely, do we have a principal bundle on Y or is
it a map to G? I think it’s supposed to be the latter but then I don’t see how we obtain a Chern-Simons
invariant on X. Caveat lector.
You might be worried that we can’t always extend the principal bundle on Y to a compact 3-manifold
which Y bounds. Indeed, this often is the case – consider the case where Y is a torus, G = T × T, and the
map is the identity. If we can extend the principal bundle, the connection also extends: this is a partition of
unity argument. There is a workaround: the Wess-Zumino-Witten invariant is an example of a secondary
cohomology invariant, much like the Chern-Simons invariant is a secondary invariant associated to the
Chern-Weil invariant. Again it requires geometric data, and both can be interpreted in some cases by picking
a manifold in one dimension higher which our manifold bounds, and computing the primary invariant there. In
situations where the bundle doesn’t extend, there’s another general framework, realizing secondary invariants
in differential cohomology irrespective of whether the data extends.
Now consider the map f := exp(F (Θ, –)) : Γ(∂π) → C. If h : ∂M → G is a function, then
(26.13a) \label {surprise1D} f(s\cdot h) = c(s,h)\cdot f(s),
where
(26.13b) \label {isherm} c(s,h) \coloneqq \exp \paren {\int _{\partial M} \ang {h^*\theta \wedge \Ad _{h^{-1}} s^*\Theta } - \frac 16\int _M \ang {h^*\theta \wedge [ h^*\theta \wedge h^*\theta ]}}.

Because the group of functions ∂M → G acts simply transitively on sections, the space of functions f
satisfying (26.13a) is one-dimensional: if we know f at one section, we know it everywhere. So we get a
complex line L∂M (∂Θ). In fact, because (26.13b) always has norm 1, L∂M (∂Θ) is a Hermitian line. So we
get a Hermitian line, and exp(F (Θ)) is an element of that line.
Lecture 27.
Chern-Simons lines: 5/2/19

As usual, we let G be a compact Lie group and λ ∈ H 4 (BG; Z(1)), whose image in real cohomology induces
a G-invariant inner product h–, –i : g × g → R(1). Let X be a compact oriented 3-manifold, π : P → X be a
principal G-bundle, and Θ ∈ Aπ .

Assume π has a section. If s, s′ ∈ Γ(π), then there’s a unique g : X → G such that s′(x) = s(x) · g(x).
We’ve defined F (Θ, s) = ∫_X s∗ α(Θ), where

(27.1) \alpha (\Theta ) = \ang {\Theta \wedge \Omega } - \frac 16\ang {\Theta \wedge [\Theta \wedge \Theta ]}\in \Omega _P^3(\R (1))
is the Chern-Simons form. Last time, we proved (26.11b), characterizing how F changes under a gauge
transformation.
We also began talking about the Chern-Simons line associated to a closed, oriented 2-manifold Y with a
principal G-bundle ρ : Q → Y with connection Ξ. Again assuming ρ admits a section, we defined for any
t ∈ Γ(ρ) and h : Y → G a function c(Ξ, t, h) in (26.13b) (though there we said Θ instead of Ξ), where X is a
compact oriented 3-manifold with boundary Y and g : X → G satisfies g|∂X = h.
Remark 27.2. You can’t always choose such an X and g: for example, let G = T2 , Y = T2 , and Q → Y be
trivial. If h : Y → T2 is a diffeomorphism, there’s no way to extend this to a 3-manifold bounding Y ; the
degree is the obstruction.
But if G is simply connected, then such X and g exist. So the theory behaves nicely for such groups (e.g.
E7 ).
Exercise 27.3. Show that if h1 , h2 : Y → G, then c(Ξ, t, h1 h2 ) = c(Ξ, t, h1 )c(Ξ, t, h2 ). This will be good
practice with the Maurer-Cartan form if you’re not already familiar with it.
Now define Lρ (Ξ) to be the vector space of functions f : Γ(ρ) → C such that f (t · h) = c(Ξ, t, h)f (t). Since
Map(Y, G) acts simply transitively on Γ(ρ), Lρ (Ξ) is one-dimensional, and is called the Chern-Simons line.
And if you’re given (X, P, Θ), then exp(F (Θ, –)) ∈ L∂π (∂Θ), because it transforms in the correct way. This
is called the Chern-Simons invariant of X and Θ.
What if X is closed? Then ∂X = ∅, a perfectly fine closed 2-manifold, with a unique principal G-
bundle and connection (vacuously), so the Chern-Simons line is canonically C, and we recover the absolute
Chern-Simons invariant from last time.
Digression 27.4. Recall that the action isn’t quite a real number; it’s only defined mod 1. This is good
enough for classical physics, where you can still differentiate the equations of motion. But it’s also a first step
towards quantization – in the Feynman path integral, the action is exponentiated, so the mod 1 problem goes
away, and therefore this is also good enough.
But on a manifold with boundary, this action isn’t even a number. We would like to integrate over intervals
of time, which have boundaries, but Lρ (Ξ) is not canonically trivial, so the exponentiated action there is an element of a line rather than a number. This is a good thing, as it
turns out; one nonstandard way of thinking of this is as an action on 2-dimensional manifolds (this is unusual
in physics), valued not in numbers but in vector spaces.
We could alternatively say that if X is a 3-dimensional oriented bordism between closed, oriented surfaces
Y0 and Y1 , together with a bundle and connection Θ on X restricting to the connections Ξ0 and Ξ1 on Y0 ,
resp. Y1 ,³⁸ we obtain a linear map
(27.5) \exp (F(\Theta ))\colon L_{Y_0}(\Xi _0)\longrightarrow L_{Y_1}(\Xi _1).
This is a little different than we said it before, where the Chern-Simons invariant of X was valued just in
L∂X (Θ|∂X ). This turns out to be equivalent:
• You can check that if Y is a disjoint union Y′ ⊔ Y″, the Chern-Simons line of Y is canonically the
tensor product of the Chern-Simons lines of Y′ and Y″ for the restriction of Ξ to Y′ and Y″.
• In an oriented bordism, X induces the opposite orientation on Y0 , so we ask how orientation-reversal
(but keeping the connection the same) affects the Chern-Simons line. The integral changes sign, so we
end up with the dual line; and since it’s Hermitian, this can be identified with the complex conjugate.
The upshot is that the Chern-Simons invariant of X lives in LY0 (Ξ0 )∗ ⊗ LY1 (Ξ1 ) = Hom(LY0 (Ξ0 ), LY1 (Ξ1 )). You
can check that these lines compose, so we obtain an example of a Segal-style field theory. We computed the
variation of this invariant in the connection, and it’s not zero, so this is not topological. However, because we
obtain lines rather than general vector spaces, this is in fact an invertible field theory. This is to be expected
for classical theories in general.
³⁸ There are some details that go into setting this up nicely, which amount to choosing collar neighborhoods of this data on
Y0 and Y1 .

A field theory is more than just a functor on bordisms – we know what it means to differentiate the action,
but that’s not part of the functorial description. We’d like to say correlation functions depend smoothly
on the parameters, and in particular the Chern-Simons line should too. Therefore, given a family of data
(namely a fiber bundle of it), we expect a family of Chern-Simons invariants: a line bundle of Chern-Simons
lines, a smooth section of Chern-Simons invariants, etc.
The universal family of connections is, of course, Aρ . We claim that ∐_Ξ LY (Ξ) → Aρ fits together to form
a smooth Hermitian line bundle. To see this, consider the trivial bundle C → Aρ × Γ(ρ). The base space is a
principal Map(Y, G)-bundle over Aρ – in particular, Map(Y, G) acts freely on it. We’ll lift this action to C,
which will then descend to a line bundle Lρ → Aρ on the quotient space. Specifically, given h ∈ Map(Y, G)
and (Ξ, t) ∈ Aρ × Γ(ρ), the action on the fiber of C at (Ξ, t) is c(Ξ, t, h). Then the fiber over Ξ in the quotient
is the space of invariant sections, which is exactly Lρ (Ξ).
We want more – namely, a covariant derivative on Lρ . Given a path γ : [0, 1] → Aρ , we would like a
parallel transport map τγ : (Lγ )0 → (Lγ )1 . If we wrote down such a map, how would we know it’s a parallel
transport map? First, it should be a smooth function on the path space; second, it should not depend on
reparameterization, and third, it should compose under composition of paths.
Exercise 27.6. Show that any assignment of parallel transport maps satisfying the above three conditions
arises from a covariant derivative. (A solution can be found in the appendix of [Fre95].)
Well, how do we obtain this parallel transport map? Let γ : [0, 1] → Aρ be a smooth path. Then we can
pull back Ξuniv on Aρ × Q → Aρ × Y to X := [0, 1] × Y : let P := [0, 1] × Q and Θ := γ ∗ Ξuniv . So X is a
bordism, hence defines a map from the Chern-Simons line at γ(0) to the Chern-Simons line at γ(1). One
must check this satisfies the above three properties, but it does, and so we get a covariant derivative.
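But we can do even better, and write down the formula for the covariant derivative. Since Θ = Ξt , then
(27.7a) \Omega = \d_{[0,1]\times Q}\Theta + \frac 12[\Theta\wedge\Theta]
(27.7b) = \d t\wedge\dot\Xi + \d_Q\Theta + \frac 12[\Theta\wedge\Theta]
(27.7c) = \d t\wedge\dot\Xi + \Omega(\Xi_t),
and the Chern-Simons form is
(27.8a) \alpha = \ang{\Theta\wedge\Omega} - \frac 16\ang{\Theta\wedge[\Theta\wedge\Theta]}
(27.8b) = \ang{\Theta\wedge\d t\wedge\dot\Xi} + (\text{terms not involving } \d t)
(27.8c) = -\d t\wedge\ang{\Xi\wedge\dot\Xi} + (\text{terms not involving } \d t).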
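The sign in (27.8c), and the reason the remaining terms play no role, can be spelled out as follows. This is a sketch; it only uses that dt is a scalar-valued 1-form, that Ξ_t and Ξ̇ are g-valued 1-forms on Q, and that Y is 2-dimensional.

```latex
\begin{align*}
% Moving the scalar 1-form dt past the g-valued 1-form Xi costs one sign.
\langle \Xi\wedge \mathrm{d}t\wedge\dot\Xi\rangle
  &= -\,\mathrm{d}t\wedge\langle \Xi\wedge\dot\Xi\rangle, \\
% A 3-form on [0,1] x Y with no dt factor would need three directions
% along the 2-dimensional surface Y, so it vanishes after pulling back by sigma.
\sigma^*\bigl(\text{terms not involving } \mathrm{d}t\bigr)
  &= 0 \quad\text{on } [0,1]\times Y .
\end{align*}
```

So only the −dt ∧ ⟨Ξ ∧ Ξ̇⟩ term contributes to the integral over [0, 1] × Y.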
If σ ∈ Γ(ρ), the Chern-Simons invariant on [0, 1] × Y is

(27.9) \exp \paren {-\int _0^1 \sigma ^*\ang {\Xi \wedge \dot \Xi }}.

The connection form relative to the trivialization of Q given by σ (which induces a trivialization of Lρ → Aρ )
is
(27.10) \int _Y \sigma ^*\ang {\Xi ^{\mathrm {univ}}\wedge \delta \Xi ^{\mathrm {univ}}}.

The curvature of this connection is


(27.11) \mathrm {curv}(L_\rho ) = \int _Y \ang {\delta \Xi ^{\mathrm {univ}}\wedge \delta \Xi ^{\mathrm {univ}}},

which is reminiscent of the universal Chern-Weil form, albeit missing some terms. But the missing terms
aren’t of type (2, 2), so they’ll integrate to zero, and therefore this is equal to
(27.12) \int _Y \ang {\Omega ^{\mathrm {univ}}\wedge \Omega ^{\mathrm {univ}}}.

There’s still more structure, e.g. the moment map coming from the action of gauge transformations.
More natural questions are: what happens if you consider surfaces with boundary? Or what kind of
geometric structure would we use to try to build this without choosing sections, which might not exist (an
introduction to differential cohomology)?

Lecture 28.
New(er) equations in gauge theory: 5/7/19

Today we’ll discuss some equations in gauge theory that have arisen more recently. We’ll discuss two
sets of equations: the Kapustin-Witten equations in 4D, written down in 2006, come out of maximally
supersymmetric 4D gauge theory, with the aim of learning something about the geometric Langlands program.
The Haydys-Witten equations in 5D arose out of independent work of Haydys and Witten; the former on
trying to build an analogue of the Fukaya category in dimension 3, and the latter on trying to see Khovanov
homology in gauge theory, categorifying the appearance of the Jones polynomial in Chern-Simons theory.
First, we’ll discuss Witten’s story about the Jones polynomial, Chern-Simons theory, and Khovanov
homology. The Jones polynomial was introduced by Jones using representation theory, and people later
figured out combinatorial formulas for it. Khovanov homology refines the Jones polynomial, in that the Jones
polynomial is its Euler characteristic; it also has a very combinatorial flavor. So it’s strange to see quantum
field theory, geometric and generally not combinatorial, appearing in this story – but Witten explained how
to get the Jones polynomial out of Chern-Simons theory, so maybe something similar is true for Khovanov
homology.
Let W be an oriented Riemannian 3-manifold and V := R>0 × W . We may also consider a knot K ⊂ W .
From this we can obtain some equations E4 (V ) in 4D, and the moduli space S4 (V ) of solutions with
suitable boundary conditions (at 0) and growth conditions (at ∞). These have particularly nice properties:
S4 (V ) is a finite set, and the solutions are suitably nondegenerate, and they have topological invariants
P : S4 (V ) → Z (akin to a Pontrjagin number) and Q : S4 (V ) → Z/2 (akin to a mod 2 index). Then we can
define

(28.1) J_V(q)\coloneqq \sum _{\sigma \in S_4(V)} (-1)^{Q(\sigma )} q^{P(\sigma )}\in \Z [q, q^{-1}],

and if W = E3 with an embedded knot K, this is exactly the Jones polynomial of K.


This is an appealing interpretation of the Jones polynomial: we have integers, so they should count
something, and this is telling us what it’s counting. However, this is not a theorem! Work of Gaiotto-Witten
and Mazzeo-Witten has made progress on various pieces of this claim, but there is much left to do (in
particular compactness).
Categorification suggests that if you have an integer-valued invariant, it might be the dimension of a vector
space invariant. Negative integers can occur if we use virtual or Z/2-graded vector spaces, or complexes.
Khovanov described a categorification of the Jones polynomial in this way, producing a chain complex
K0 (V ) = Z[S4 (V )], Z × Z/2-graded by P and Q, and in fact if V = R × W , we can refine Q to a Z-valued
invariant F : S4 (V ) → Z and obtain a genuinely bigraded chain complex.
Since we can (conjecturally, and with plenty of heuristic evidence) obtain the Jones polynomial from
4D gauge theory, maybe we can go one dimension higher and obtain Khovanov homology from 5D gauge
theory. This led to the equations E5 (M ) in M := R × V . Let t denote the R coordinate. We obtain static
solutions S5^st(R × V ) = S4 (V ) inside the space S5 (R × V ; σ+ , σ− ), where we prescribe σ → σ± as t → ±∞.
Let S̄5 := S5 /R, where R acts by time translation, the usual thing we do in Morse theory.
The claim/conjecture is that if F (σ+ ) = F (σ− ) + 1, then S̄5 (R × V ; σ+ , σ− ) is finite, with a mod 2 invariant,
and we can define a differential on K0 (V ) with matrix element summing over S̄5 (R × V ; σ+ , σ− ) weighted ±1
by the mod 2 invariant. This differential is expected to square to zero, and one expects the homology to
coincide with Khovanov homology.
These equations emerge from quantum field theory as the BPS equations in some given QFT. These are
first-order equations. Many first-order equations in geometry come from BPS equations in quantum field
theory, including the equations giving us Morse flow lines.

The Kapustin-Witten equations. Let G be a compact Lie group. Within the context of the geometric
Langlands program, this is the dual group G∨ to what you start with, so keep that in mind when reading the
paper.
Let V be an oriented Riemannian 4-manifold, and P → V be a principal G-bundle with connection
A ∈ Ω1P (g). Now choose a φ ∈ Ω1V (g); in some contexts this is called a Higgs field.

Definition 28.2. The Kapustin-Witten equations E4 (V ) are

F_A - \frac 12[\phi\wedge\phi] + \star\d_A\phi = 0
\d_A^*\phi = 0.
The first equation takes place in Ω2V (gP ), and the second takes place in Ω0V (gP ).
Where do these equations come from? Rather than discussing the physics context, we can relate them to
other equations that occur in similar contexts.
For example, the Hitchin equations are associated to the same data on a 2-manifold, and look very similar:
(28.3a) F_A - \frac 12[\phi \wedge \phi ] = 0
(28.3b) \d _A\phi = 0
(28.3c) \d _A^*\phi = 0.
These are elliptic. We have an extra condition so that, roughly speaking, the equations aren’t over- or
underdetermined, but “just right” to be elliptic.
We can also deform the Kapustin-Witten equations to a family of equations parametrized by t ∈ CP1 =
C ∪ {∞}:
(28.4a) \paren {F_A - \frac 12[\phi \wedge \phi ] + t\d _A\phi }_+ = 0
(28.4b) \paren {F_A - \frac 12[\phi \wedge \phi ] - t^{-1}\d _A\phi }_- = 0
(28.4c) \d _A^*\phi = 0.
Here (–)± denotes taking the self-dual and anti-self-dual parts of this 2-form. At t = 1, this recovers
the original Kapustin-Witten equations, because the Hodge star acts by 1 on self-dual forms and −1 on
anti-self-dual forms. In the limits t → 0 and t → ∞, one of the first two equations blows up, and we end up
recovering the self-dual and anti-self-dual equations!
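To spell out the t = 1 claim, here is a short check. It uses only the fact that the Hodge star acts by +1 on self-dual and −1 on anti-self-dual 2-forms; the shorthand Φ is ours, not from the lecture.

```latex
% Shorthand (ours): \Phi := F_A - \tfrac12[\phi\wedge\phi], a \mathfrak g_P-valued 2-form.
\begin{align*}
  \text{(28.4a) at } t = 1:&\quad \Phi_+ + (\mathrm d_A\phi)_+ = 0,\\
  \text{(28.4b) at } t = 1:&\quad \Phi_- - (\mathrm d_A\phi)_- = 0.
\end{align*}
% For any 2-form \omega on an oriented Riemannian 4-manifold, \star\omega = \omega_+ - \omega_-, so
\begin{align*}
  \Phi + \star\,\mathrm d_A\phi
    = \bigl(\Phi_+ + (\mathrm d_A\phi)_+\bigr) + \bigl(\Phi_- - (\mathrm d_A\phi)_-\bigr).
\end{align*}
% The two summands are the self-dual and anti-self-dual parts of the left-hand side,
% so both equations hold iff \Phi + \star\,\mathrm d_A\phi = 0, the first equation of Definition 28.2.
```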
A third context to place this in is complex connections. We can complexify G to a complex Lie group
GC : for example, SUn complexifies to SLn (C) and SOn complexifies to SOn (C). Then we can inflate P to a
principal GC -bundle PC := P ×G GC → V , using the fact that G includes in GC . If A is a connection on P ,
then 𝒜 := A + √−1 φ is a connection on PC (here we use the fact that gC = g ⊕ √−1 g). If φ = 0, this is the
induced connection on PC from A. The curvature of 𝒜 is

(28.5) F_\cA = F_A + \sqrt {-1}\d _A\phi + \frac 12[\sqrt {-1}\phi \wedge \sqrt {-1}\phi ] = \paren {F_A - \frac 12[\phi \wedge \phi ]} + \sqrt {-1}(\d _A\phi ).

So the Hitchin equations are exactly the condition that the complexified connection is flat!39
These equations are strictly speaking not elliptic because they have an infinite-dimensional symmetry
group, so in particular the space of solutions is empty or noncompact. But once you quotient by gauge
symmetries, they are elliptic.
As a warm-up, we’ll show that FA+ = 0 is elliptic modulo gauge transformations. We have to check that at
every solution A, the differential is an elliptic operator, so we have to linearize at A ∈ AP in the direction
Ȧ ∈ Ω1V (gP ). The quadratic equation

(28.6) F_A^+ \coloneqq \paren {\d A + \frac 12 [A\wedge A]}_+ = 0

becomes the linear operator


(28.7) \d _A^+\dot A\coloneqq \paren {\d \dot A + [A\wedge \dot A]}_+ = 0.
The gauge symmetry is A ↦ ϕ∗ A, and the infinitesimal gauge symmetry is ζ ↦ dA ζ, with ζ ∈ Ω0V (gP ).
Ellipticity means the symbol
(28.8) \label {ellsymbol} \sigma \colon \Lambda ^1 T_x^*V\otimes (\fg _P)_x\longrightarrow \Lambda ^2_+T_x^*V\otimes (\fg _P)_x,
which sends α ↦ (θ ∧ α)+ , should be an isomorphism for every nonzero θ ∈ Tx∗ V . Since the source is larger than the
target, this cannot literally hold; the gauge symmetry accounts for the discrepancy.
39There’s something to say about (28.3c) here, but it ends up working out in this setting.


The symbol of the infinitesimal gauge transformation, y ↦ θy, extends (28.8) to a complex
(28.9) \label {SDcpx} \paren {\xymatrix @1{ \Lambda ^0T_x^*V\ar [r]^{y\mapsto \theta y} & \Lambda ^1 T_x^*V\ar [r]^-\sigma &\Lambda _+^2 T_x^*V }}\otimes (\fg _P)_x,
and ellipticity asks for this to be exact when θ ≠ 0, which is an easy exercise. The dimensions visibly work
for exactness: the first is one-dimensional, the second is four-dimensional, and the third is three-dimensional,
but that’s not a proof! Exactness of this complex, hence ellipticity of the self-dual equations, is where this
whole story kicks off.
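For the record, here is one way the "easy exercise" can go. This is a sketch in a basis we choose for convenience (an assumption, not something fixed in lecture): an oriented orthonormal basis e^0, e^1, e^2, e^3 of the cotangent space with θ = e^0, and the shorthand e^{ij} := e^i ∧ e^j.

```latex
% Assumptions (ours): \theta \neq 0; after rescaling and rotating, \theta = e^0 in an
% oriented orthonormal basis e^0, e^1, e^2, e^3 of T_x^*V, and e^{ij} := e^i \wedge e^j.
% The (\mathfrak g_P)_x factor is carried along by the identity, so we omit it.
\begin{align*}
  &\text{Injectivity: } y \mapsto y\,\theta \text{ is injective because } \theta \neq 0.\\
  &\text{Complex: } \sigma(y\,\theta) = y\,(\theta\wedge\theta)_+ = 0.\\
  &\text{Kernel: write } \alpha = a_0 e^0 + a_1 e^1 + a_2 e^2 + a_3 e^3. \text{ Then}\\
  &\qquad (\theta\wedge\alpha)_+
     = \tfrac12\bigl[a_1(e^{01} + e^{23}) + a_2(e^{02} + e^{31}) + a_3(e^{03} + e^{12})\bigr].\\
  &\qquad \text{The three forms in parentheses are a basis of } \Lambda^2_+, \text{ so }
     \sigma(\alpha) = 0 \iff a_1 = a_2 = a_3 = 0 \iff \alpha \in \mathbb R\,\theta.\\
  &\text{Surjectivity: } \dim \operatorname{im}\sigma = 4 - 1 = 3 = \dim \Lambda^2_+.
\end{align*}
```

The same computation with e^{0i} − e^{jk} in place of e^{0i} + e^{jk} handles the anti-self-dual projection.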
Now let’s imitate this with the Kapustin-Witten equations.
Theorem 28.10. The Kapustin-Witten equations are elliptic.
Taking tangent vectors Ȧ and φ̇, they live in the same space, which will be useful. When we linearize, the
equations become
(28.11a) dA Ȧ − [φ ∧ φ̇] + ⋆dA φ̇ + ⋆[Ȧ ∧ φ] = 0 ∈ Ω2V (gP )
(28.11b) d∗A φ̇ − ⟨[Ȧ, φ]⟩ = 0 ∈ Ω0V (gP ).
The symbol and gauge symmetries fit into the complex
(28.12) \label {KWcpx} \paren {\xymatrix @1{ \Lambda ^0T_x^*V\ar [rr]^-{y\mapsto (\theta y, 0)} && \Lambda ^1 T_x^*V\oplus \Lambda ^1 T_x^*V\ar [r]^-\sigma & \Lambda ^2 T_x^*V\oplus \Lambda ^0 T_x^*V }}\otimes (\fg _P)_x,
where
(28.13) \sigma (\alpha , \beta ) = (\theta \wedge \alpha + {\star }(\theta \wedge \beta ), -\ang {\theta , \beta })\otimes \id _{(\fg _P)_x}.
(28.12) looks sort of like two copies of (28.9), though not precisely the same. We’d like to show (28.12) is
exact; again the dimensions check out (in order: 1, 4 + 4, 6 + 1). Checking exactness is an exercise in linear
algebra; it may help to make the substitutions σ := α + β and δ := α − β. Thus reformulated, (28.12) looks
even more like the complexes for the self-dual and anti-self-dual equations, and we can leverage the fact that
those are exact to prove exactness of (28.12).
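Here is a sketch of how that substitution can be used. We read (28.13) as a pair valued in Λ² ⊕ Λ⁰, and we rename the text's σ and δ to u and v (our notation) to avoid a clash with the symbol map σ.

```latex
% Notation (ours): u := \alpha + \beta, v := \alpha - \beta. Assume \theta \neq 0.
% Symbol, read as a pair: \sigma(\alpha,\beta)
%   = (\theta\wedge\alpha + \star(\theta\wedge\beta),\ -\langle\theta,\beta\rangle).
% Uses \star\omega = \omega_+ - \omega_- on 2-forms.
\begin{align*}
  \bigl(\theta\wedge\alpha + \star(\theta\wedge\beta)\bigr)_+ &= (\theta\wedge u)_+,\\
  \bigl(\theta\wedge\alpha + \star(\theta\wedge\beta)\bigr)_- &= (\theta\wedge v)_-.
\end{align*}
% By exactness of the self-dual complex (28.9) and its anti-self-dual analogue,
% (\theta\wedge u)_+ = 0 \iff u \in \mathbb R\theta and (\theta\wedge v)_- = 0 \iff v \in \mathbb R\theta.
\begin{align*}
  \sigma(\alpha,\beta) = 0
  &\iff \alpha, \beta \in \mathbb R\,\theta \text{ and } \langle\theta,\beta\rangle = 0\\
  &\iff \beta = 0 \text{ and } \alpha = y\,\theta \text{ for some } y \in \mathbb R,
\end{align*}
% which is exactly the image of y \mapsto (\theta y, 0). Surjectivity of \sigma then
% follows from the dimension count 8 - 1 = 7 = 6 + 1.
```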
Theorem 28.14. If V is closed, any solution to the Kapustin-Witten equations is a flat GC -connection, i.e.
FA − (1/2)[φ ∧ φ] = 0, dA φ = 0, and d∗A φ = 0.
In other words, this isn’t very interesting on closed manifolds, by contrast with the Donaldson equations,
the Seiberg-Witten equations, etc. The only thing you see is the fundamental group: flat GC -connections are
identified, up to conjugation, with homomorphisms π1 V → GC .
One typically proves things such as Theorem 28.14 using Weitzenböck formulas. Kapustin and Witten
found a clever way to do this using the deformed Kapustin-Witten equations (28.4). Some of the terms of
the Weitzenböck formula are:
(28.15) \d _A^*\d _A + \d _A\d _A^* = \nabla _A^*\nabla _A + \mathrm {Ric},
which gives us the Weitzenböck formula for the Hodge Laplacian. Here Ric is the Ricci curvature, understood
as an endomorphism of the tangent space (which doesn’t see gP at all). You can use this formula to show
that if a closed manifold has everywhere positive Ricci curvature, then its first Betti number vanishes.
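The Betti number remark is the classical Bochner argument. A minimal sketch, under assumptions we add here: M is closed, oriented and Riemannian, the coefficients are trivial (no gP), and α is a harmonic 1-form.

```latex
% Assumptions (ours): M closed, oriented, Riemannian; trivial coefficients;
% \alpha a harmonic 1-form, i.e. (\mathrm d^*\mathrm d + \mathrm d\mathrm d^*)\alpha = 0.
\begin{align*}
  0 &= \int_M \bigl\langle (\mathrm d^*\mathrm d + \mathrm d\mathrm d^*)\alpha,\ \alpha \bigr\rangle
     = \int_M \langle \nabla^*\nabla\alpha, \alpha\rangle
       + \int_M \langle \mathrm{Ric}(\alpha), \alpha\rangle\\
    &= \int_M |\nabla\alpha|^2 + \int_M \langle \mathrm{Ric}(\alpha), \alpha\rangle .
\end{align*}
% If Ric > 0 everywhere, both integrals are nonnegative and the second is positive
% unless \alpha = 0, so \alpha = 0. By Hodge theory, H^1(M;\mathbb R) is the space of
% harmonic 1-forms, hence b_1(M) = 0.
```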
Anyways, the full Weitzenböck formula for the Kapustin-Witten equations is
(28.16) \d _A^*\d _A + \d _A\d _A^* = \nabla _A^*\nabla _A + \mathrm {Ric} + \alpha (F_A),
where
(28.17) \alpha (F_A)\phi \coloneqq {\star }[{\star } F_A\wedge \phi ].
Since F := {(A, φ)} is affine over Ω1V (gP ) ⊕ Ω1V (gP ), we get a symplectic form on its tangent space given by

(28.18) \omega ((\dot A_1, \dot \phi _1), (\dot A_2, \dot \phi _2)) \coloneqq \int _V \ang {\dot A_1, \dot \phi _2} - \ang {\dot A_2, \dot \phi _1}.

The gauge group GP acts on F through symplectomorphisms, and hence we have a moment map
(28.19) \begin {aligned} \mu \colon \mathcal F &\longrightarrow \Omega _V^0(\fg _P)\\ (A,\phi ) &\longmapsto \pm \d _A^*\phi \end {aligned}
(one of the ± signs is right). Then dµ(ζ) = ιζ̂ ω for ζ ∈ Ω0V (gP ), where ζ̂ is the associated vector field on F.
The Haydys-Witten equations. These five-dimensional equations aren’t as general as the Kapustin-
Witten equations, which made sense on any oriented Riemannian 4-manifold. We require in addition to an
oriented Riemannian 5-manifold M , a vector field ξ with |ξ| = 1 (not such a big constraint). We again won’t
get interesting invariants of closed manifolds.
Let θ denote the dual one-form to ξ, so that θ(η) = ⟨ξ, η⟩ for all vectors η. Then we can decompose
T M = R · ξ ⊕ S, where Sx := ξx⊥ ; correspondingly T ∗ M = R · θ ⊕ S ∗ . Since M is oriented and R · ξ is oriented,
we obtain induced orientations on S and S ∗ .
The fields are a G-connection A on a principal G-bundle P → M and a section B of Λ2+ S ∗ ⊗ gP → M (we
have self- and anti-self-duality because S ∗ is an oriented rank-4 bundle).
Definition 28.20. The Haydys-Witten equations with the above data are
F_A^+ - \frac 12[B\times B] - (\nabla _A)_\xi B = 0
\d _A^*B - \iota _\xi F_A = 0.
The expression B × B means the cross product, since Λ2+ S ∗ is an oriented rank-3 bundle.
Where do these equations live? We have the decomposition
(28.21) \Lambda ^2 T^*M = (\R \cdot \theta \otimes S^*) \oplus \Lambda _+^2 S^* \oplus \Lambda _-^2 S^*.
Therefore the first equation gives us gP -valued 2-forms, and the latter gives us gP -valued one-forms.
The mathematical story about these equations is still being written.
Lecture 29.
BPS equations: 5/9/19

“. . . but if I said ‘nonnegative-energy representations’ instead of ‘positive-energy representations’ everyone would look at me skew.”
Note: I was 11 minutes late to class, so I may have missed things in the beginning. Sorry about that.
Today we’ll discuss the BPS formalism, which realizes many of these interesting equations in mathematical
gauge theory via supersymmetric quantum field theory.
Example 29.1. Let’s first talk about Yang-Mills theory. Fix a closed oriented Riemannian 4-manifold M , a
compact Lie group G, and a G-invariant inner product ⟨–, –⟩ on g. The Yang-Mills functional on the space of
connections A on principal G-bundles on M is

(29.2) \mathit {YM}(A) \coloneqq \int _M \frac 12\norm {F_A}^2 = \frac 12\int _M \ang {F_A\wedge {\star } F_A}.

Its derivative is
(29.3) \d \mathit {YM}_A(\dot A) = \int _M \ang {\d _A\dot A\wedge {\star } F_A} = \int _M -\ang {\dot A\wedge \d _A {\star } F_A}.

The critical points are exactly those satisfying the Yang-Mills equation dA ⋆FA = 0.
We can rewrite the Yang-Mills functional: writing FA = F = F + + F − ,
(29.4a) \mathit {YM}(A) = \frac 12\int _M \ang {F_A\wedge {\star } F_A} = \frac 12\int _M \ang {(F^+ + F^-)\wedge (F^+ - F^-)}
(29.4b) = \frac 12\int _M \ang {F^+\wedge F^+} - \ang {F^-\wedge F^-}
(29.4c) = \frac 12\int _M 2\ang {F^+\wedge F^+} - \ang {F\wedge F}
(29.4d) = \int _M \ang {F^+\wedge F^+} - \omega (A) = \int _M \ang {F^+\wedge F^+} + c(P),
where c(P ) ∈ R is a characteristic number. Thus YM (A) ≥ c(P ), with equality iff FA+ = 0.
The point is, though the Yang-Mills equation is second-order, the connections attaining this lower bound are
cut out by the first-order PDE FA+ = 0. This will be typical within the BPS formalism.
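Two pointwise facts drive the computation in (29.4): the cross terms between F⁺ and F⁻ vanish, and ⟨F⁺ ∧ F⁺⟩ is a nonnegative multiple of the volume form. A short sketch, assuming the convention ⟨a ∧ ⋆b⟩ = ⟨a, b⟩ vol for 2-forms (our normalization):

```latex
% Conventions (ours): \star F^\pm = \pm F^\pm and
% \langle a\wedge\star b\rangle = \langle a, b\rangle\,\mathrm{vol} on an oriented Riemannian 4-manifold.
\begin{align*}
  \langle F^\pm\wedge F^\pm\rangle
    &= \pm\langle F^\pm\wedge\star F^\pm\rangle = \pm|F^\pm|^2\,\mathrm{vol},\\
  \langle F^+\wedge F^-\rangle
    &= -\langle F^+\wedge\star F^-\rangle = -\langle F^+, F^-\rangle\,\mathrm{vol} = 0,
\end{align*}
% the last equality because F^+ and F^- lie in different eigenspaces of the
% self-adjoint operator \star. Hence
% \langle F\wedge F\rangle = \langle F^+\wedge F^+\rangle + \langle F^-\wedge F^-\rangle,
% which is the step from (29.4b) to (29.4c), and
\begin{align*}
  \mathit{YM}(A) - c(P) = \int_M \langle F^+\wedge F^+\rangle = \int_M |F_A^+|^2\,\mathrm{vol} \ \ge\ 0,
\end{align*}
% with equality iff F_A^+ = 0.
```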

Today we will often look for solutions with extra symmetry (in particular, supersymmetry), as these
are often the most interesting.
Today we will cover lots of topics quickly; there are plenty of good texts on these topics, and any of them
could be the jumping-off point for a minicourse. First we need to discuss some physics: the theory of fields.
The linear part of this, hence the simpler part, is particles, so let’s begin with that.
Particles. We fix a positive integer n, which will be the dimension of spacetime. We work in the relativistic
setting, so space and time are mixed up. Hence consider a real n-dimensional vector space V with a Lorentz
inner product ⟨–, –⟩, i.e. a symmetric bilinear form of signature (1, n − 1). Minkowski spacetime M is an
affine space over V .
The Lorentz inner product separates vectors into spacelike vectors, i.e. those ξ ∈ V with ⟨ξ, ξ⟩ < 0; timelike
vectors, i.e. those with ⟨ξ, ξ⟩ > 0; and null vectors, those with ⟨ξ, ξ⟩ = 0. In any dimension, the space of
timelike vectors has two components; we need an additional piece of data, which is the choice C of one of
these components. We don’t quite have an arrow of time, but at least we’ve oriented it, and so sometimes
this is called a time orientation. Working in coordinates t, x1 , . . . , xn−1 , where t is timelike and each xi is
spacelike, the metric must have the form
(29.5) c^2(\d t)^2 - (\d x^1)^2 - \dotsb - (\d x^{n-1})^2,
and if we change coordinates, c only depends on how we parametrize t. So this is some sort of universal
constant with units of speed, and indeed it is the speed of light.
Now consider the dual space V ∗ . The component C induces a dual component C ∗ ; here we think of
timelike directions as energy, and spacelike ones as momentum. Thus one thinks of energy as having units
1/length, for example. The norm of a vector in the dual space is

(29.6) \label {einstein} m^2c^2\coloneqq \norm {(E,p)}^2 = \frac {E^2}{c^2} - p^2.


When p = 0, this recovers Einstein’s equation E = mc2 . Of course, we defined m this way, so that seems a little
dubious, but if you did this without ignoring factors of ℏ, they’d cancel out, and m has units of mass, so this
is reasonable.
So the choice of C ∗ is telling us what positive energy is.
Now let’s look at the symmetry groups appearing in this story. Let O(V ) denote the subgroup of GL(V )
which preserves the Lorentz inner product. These are symmetries of M , and we also have V acting on M by
affine translations. The symmetry group is an extension
(29.7) \xymatrix { 1\ar [r] & V\ar [r] & \mathcal G(M)\ar [r] & \O (V)\ar [r] & 1. }
But we actually need a slightly different group. Choose a spin structure on V and let Spin(V ) denote the
corresponding spin group (we’re in indefinite signature, so this isn’t Spinn ). This double covers an index-2
subgroup SO(V ) ⊂ O(V ). Again there’s an extension
(29.8) \xymatrix { 1\ar [r] & V\ar [r] & \mathcal P(M)\ar [r] & \Spin (V)\ar [r] & 1. }
P(M ) is called the Poincaré group.40
Definition 29.9. A particle is an irreducible unitary representation of P(M ) with nonnegative energy.
Since P(M ) is a noncompact Lie group, its representation theory is difficult, but it was studied and solved
by Mackey and Wigner. They parametrized these representations by orbits of the SO(V )-action on C ∗ . Each
orbit is given by points with a fixed mass m (the same m from (29.6)), and these orbits fall into three types.
(1) If m2 > 0, we get an orbit Om2 , which has the geometry of hyperbolic space. The stabilizer group is
Spinn−1 , which means there’s an identification Hn−1 ≅ Spin1,n−1 /Spinn−1 , which is cool.
(2) If m2 = 0, we get a single orbit O0+ , which is a cone. The stabilizer group is a double cover of the
Euclidean group in dimension n − 2, giving us an identification of the cone as above.41
(3) There’s also the zero orbit {0}.

40Does this sequence split? Well, do you think of yourself as at the center of the universe?
41In dimension 2, the cone minus its singular point is disconnected, as it’s one-dimensional, but in higher dimensions that’s
not true. This is the genesis of many special aspects of two-dimensional quantum field theory.

The stabilizer groups are sometimes called Wigner’s little groups. The point of this parametrization is
to understand the representation theory of P(M ) in terms of these simpler groups: we’ll see the P(M )-
representations on the spaces of functions on the orbits.
Example 29.10.
(1) Consider the one-dimensional trivial complex representation of the stabilizer group, which induces a
representation of P(M ) called the scalar representation or scalar particle of mass m.
(2) More interestingly, the vector representation is induced by the defining representation of the stabilizer
group, of dimension n − 1 (if m2 is positive) or n − 2 (if m2 = 0). This tells us something physically
interesting: you can deform a particle in the vector representation by changing its mass, but we can’t do
this for the zero-mass particle.
The representations of the stabilizer groups are Z/2-graded by the action of the central element ε ∈ Spinm :
the graded pieces are the ±1-eigenspaces. If a state is invariant under this, this is called its spin. This means
the induced representations are also Z/2-graded – and hence the irreducible ones are either even, in which
case they’re called bosons, or odd, in which case they’re called fermions. This is (particle) statistics; we’ve
conflated it with spin, but that’s a theorem in quantum field theory.

Fields. That’s the linear story; now let’s talk about fields. First, let Spin(V ) → Aut(E) be a real representation,
where E = E0 ⊕ E1 is Z/2-graded. The space of free fields is F := Map(M, E); since P(M ) acts on E
through P(M ) ↠ Spin(V ), then P(M ) also acts on the space of fields.
For nice linear equations E on F, M := {E(φ) = 0} has a symplectic structure. Typically one considers
the Euler-Lagrange equations for an action functional, but this isn’t always linear.
Example 29.11.
(1) If E = R, this is called the (free) scalar field. In this case we should choose
(29.12) \mathcal E(\phi ) \coloneqq \mathop \square \phi + m^2\phi ,
where □ is the Laplacian in Minkowski signature.
(2) If E is a spin representation, then E(φ) := Dφ + mφ, where D is the Dirac operator.
(3) If E is a vector representation, then E(φ) := □φ, but we also pick up a gauge symmetry.
So we have a real symplectic affine space M. We choose an origin and then Fourier transform to the dual
space, which leads to considering complex-valued functions. This isn’t interesting for the scalar field, but
tells you useful things in the other settings: functions are only interesting on a subspace which varies, so we
get a vector bundle with a section. Anyways we get MC = M′ ⊕ M″ (induced by the choice C ∗ of positive
energy), and Fock space is H := Sym• M′ (more precisely, Sym• M′ is an inner product space but not necessarily a
Hilbert space, so we must take its Hilbert space completion).

Supersymmetry. The Poincaré group separately acts on the even and odd pieces of the particles and fields;
it cannot mix them. But there are remarkable systems in physics, discovered in the 1970s, which have extra
symmetry that can interchange the bosons and fermions. Since we have more symmetries, we have a bigger
symmetry group, and so we need more data. Hence, fix a real spinor representation S of Spin(V ), which is a
real module over the even real Clifford algebra on V .
Now, a miracle occurs: we can always find a Spin(V )-invariant bilinear form Γ : S∗ × S∗ → V which is
positive in the sense that Γ(s∗ , s∗ ) ∈ C. Since this is V -valued, it’s not an inner product.
Theorem 29.13. The space of such Γ is contractible (and in particular nonempty).
This is good, because it means a choice of Γ isn’t much of a choice at all, much like a Riemannian metric.
The spin representation is a choice, but given by data, since the irreducible ones fall into at most two
isomorphism classes. So we just need one or two numbers to describe how many copies of each we have,
which is what in physics corresponds to N = 1, N = 2, and so on.
Let L := V ⊕ ΠS∗ , i.e. the vector space V ⊕ S∗ with the Z/2-grading in which V is even and S∗ is odd.
Then L carries a Lie bracket which vanishes on V and which on ΠS∗ is
(29.14) [f_1, f_2] = -2\Gamma (f_1, f_2)

(here f1 , f2 ∈ S∗ ).

Example 29.15. Let n = 3, so Spin(V ) ≅ SL2 (R), and choose S to be the spinor representation, which is
identified with the defining action of SL2 (R) on R2 . Then Sym2 S∗ is three-dimensional. Choosing some
Γ : S∗ ⊗ S∗ → V (TODO: did we do this in a specific way?), we obtain a quadratic form q : S∗ → V sending
s∗ ↦ 2Γ(s∗ , s∗ ). This maps onto the positive null cone and zero. This is special to n = 3, 4, 6, and 10, when
you have the minimal spin representation in these dimensions.
So given f, f ′ ∈ S∗ , we have −∂t = [f, f ] + [f ′ , f ′ ], which (TODO: ???) implies that we can quantize
and the eigenvalues of the Hamiltonian are nonnegative. This is another manifestation of the fact that we
restricted to nonnegative-energy representations.
Now we enlarge the Poincaré Lie algebra p (i.e. the Lie algebra of the Poincaré group) to the super Poincaré
Lie superalgebra, the Z/2-graded Lie algebra
(29.16) \p (M,\Sph ) \coloneqq (V\oplus \so (V))\oplus \Pi \Sph ^*.
Think of so(V ) as the Lie algebra of Spin(V ). The description of p(M, S) as a direct sum isn’t quite true
when we look at Lie brackets, so this is really an extension. On (super)groups this extension is of the form
(29.17) \xymatrix { 1\ar [r] & V\ar [r] &\cP (M,\Sph )\ar [r] & \Spin (V)\ltimes \Pi \Sph ^*\ar [r] & 1. }
If A denotes the original stabilizer group, which depends on the orbit, then the new stabilizer group is
A′ = A ⋉ ΠS∗ .
TODO: I have no idea what we’re trying to do right now – classify representations? Anyways, you can
see what happens when you deform the mass m, giving particle multiplets. You can restrict to the even
symmetries and see what (ordinary, not super) particles you obtain.
Now we’ll see what happens in the case n = 3, but to simplify even further, let’s do dimensional reduction
along a spacelike subspace L ⊂ V . Thus M := M/L still is an affine space with a Lorentz metric, and
translations V /L act on it. The short exact sequence
(29.18) \shortexact {L}{V}{V/L}{}
induces a dual short exact sequence
(29.19) \shortexact {(V/L)^*}{V^*}{L^*}.

Now we want a bigger Poincaré group P̃(M, L, S) fitting into a short exact sequence

(29.20) \xymatrix { 1\ar [r] & L\ar [r] & \widetilde \cP (M,L,\Sph )\ar [r] & \cP (M/L, \Sph )\ar [r] & 1, }
and we can keep the bigger Poincaré group when looking at the lower-dimensional theory. The kernel L
is central, so we obtain a central extension of the usual super-Poincaré group. In this setting, the spinor
representation Spin(V ) → Aut(S) restricts to a representation of Spin(V /L) along the map Spin(V /L) → Spin(V ).
In the two-dimensional situation, the null cone is just two lines, and we have this extra direction arising
from dimensional reduction. This is also true on the dual cone, so we can choose a projection onto the usual
characters into the 2D dual space. How far we are in the third direction is another character, of L, called
the central charge of the theory. We can arrange that in this three-dimensional case, if we choose the right
elements (the same ones we had before) f1 , f2 , then
(29.21) \frac 12[f_1\pm f_2, f_1\pm f_2] = -\partial _t \pm z,
where z is a basis vector for L. So again, in a representation, this must be nonnegative, at least with the
right sign conventions, so H ≥ |z|. This means the energy is bounded below by the central charge. In some
situations, the central charge is topological, and so the equation we saw bounding the Yang-Mills energy at
the beginning of class is an instance of this (albeit with fields rather than particles).
Equality holds iff f1 ± f2 = 0 as operators on the representation. So this is precisely when we have extra
symmetry, so extra symmetry and the energy bound go together.
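The bound can be sketched in one line. The assumptions below are ours, for illustration: we take (29.21) for both signs (right-hand side −∂t ± z), the odd elements act by self-adjoint operators in a unitary representation, −∂t acts as the Hamiltonian H, and z acts by a real scalar. Sign conventions vary, so treat this as a sketch rather than the lecture's exact derivation.

```latex
% Assumptions (ours): Q_\pm := f_1 \pm f_2 act by self-adjoint operators on the
% representation, -\partial_t acts as the Hamiltonian H, and the central element z
% acts by a real scalar, also written z. For an odd element Q, \tfrac12[Q,Q] = Q^2.
\begin{align*}
  H \pm z = \tfrac12[Q_\pm, Q_\pm] = Q_\pm^2 \ \ge\ 0
  \quad\Longrightarrow\quad H \ge \mp z \ \text{ for both signs}
  \quad\Longrightarrow\quad H \ge |z| .
\end{align*}
% On a state \psi: \langle\psi, (H \pm z)\psi\rangle = \|Q_\pm\psi\|^2. So
% H\psi = |z|\,\psi holds exactly when Q_+\psi = 0 or Q_-\psi = 0, depending on the
% sign of z: saturating the bound is the same as being annihilated by an odd symmetry.
```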
How do we put this together with fields to get equations? At least when n = 3, we go to the nonlinear
theory of fields, and in the dimensionally-reduced case there’s a target Riemannian manifold X and the fields
are pairs of maps φ : M → X and ψ ∈ C ∞ (M ; Π(φ∗ T X ⊗ S)). The target X carries a function h : X → R
called the superpotential. The Poincaré group acts on these, so infinitesimally the odd part of the Lie
superalgebra gives us odd vector fields with a form commonly studied in physics.

Writing ζ := η 1 f1 + η 2 f2 ,
(29.22) \widehat \zeta \psi _+ = \eta ^+\partial _+\phi + \eta ^-\phi ^*\mathrm {grad}(h)
If we take η + = η − , we’re going along the extra symmetries, and we have the system of equations
(29.23a) ∂+ φ + φ∗ grad(h) = 0
(29.23b) ∂− φ − φ∗ grad(h) = 0.
If we restrict to static solutions, namely those independent of the time coordinate t, then we get ∂1 φ =
−φ∗ grad(h). This is the equation for some sort of Morse flow line, and the inequality H ≥ |z| bounds the
energy of the flow line from below.
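To see the static reduction explicitly, here is a sketch under a normalization we assume for the light-cone derivatives; the notes do not fix one, and a different choice changes only constants.

```latex
% Assumption (ours): \partial_\pm := \partial_t \pm \partial_1 on two-dimensional
% Minkowski space with coordinates (t, x^1). "Static" means \partial_t\phi = 0.
\begin{align*}
  \text{(29.23a)}:&\quad \partial_t\phi + \partial_1\phi + \phi^*\mathrm{grad}(h) = 0,\\
  \text{(29.23b)}:&\quad \partial_t\phi - \partial_1\phi - \phi^*\mathrm{grad}(h) = 0.
\end{align*}
% Setting \partial_t\phi = 0, both equations reduce to the same condition
\begin{align*}
  \partial_1\phi = -\phi^*\mathrm{grad}(h),
\end{align*}
% the downward gradient flow equation for h in the spatial variable x^1.
```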
There’s a little more that goes into this, e.g. working on a curved manifold. This involves topologically
twisting these theories, which kills a lot of the odd symmetry, but allows one to run this BPS formalism in
general.

References

[ABS64] M.F. Atiyah, R. Bott, and A. Shapiro. Clifford modules. Topology, 3(Supplement 1):3–38, 1964. https://ac.els-cdn.
com/0040938364900035/1-s2.0-0040938364900035-main.pdf?_tid=4564ce36-f342-11e7-b652-00000aab0f27&
acdnat=1515285341_9261e93c966630f1828dd0c2caa00f14. 30
[DK97] S. K. Donaldson and P. B. Kronheimer. The Geometry of Four-Manifolds. Oxford Mathematical Monographs. Oxford
University Press, 1997. 26, 29, 31, 37
[Fre95] Daniel S. Freed. Classical Chern-Simons theory. I. Adv. Math., 113(2):237–303, 1995. https://arxiv.org/abs/hep-th/
9206021. 73
[Hit74] Nigel Hitchin. Harmonic spinors. Advances in Mathematics, 14(1):1–55, 1974. 30
[Ste51] Norman Steenrod. The Topology of Fibre Bundles. Princeton University Press, 1951. 5, 8
