Hitchin Geometry and Physics Revised
Nigel Hitchin
Mathematical Institute, Woodstock Road, Oxford, OX2 6GG
[email protected]
April 9, 2021
1 Introduction
Over the years, since I was a student, I experienced the many ways in which Michael
Atiyah brought mathematics to life by importing new ideas, reformulating them in his
own way, communicating them in his inimitable style and using these fresh insights
to advance his own research. While in Oxford he would take every opportunity to
travel to the USA and listen to what was currently interesting to him, then explain
the results with his added contributions at his regular Monday seminar, hoping that
some of these themes would be taken up by his audience.
He was at heart a geometer, but in the mid 1970s he became convinced that theo-
retical physics was by far the most promising source of new ideas. From that point
on he became a facilitator of interactions between mathematicians and physicists,
attacking mathematical challenges posed by physicists, using physical ideas to prove
pure mathematical results, and feeding the physicist community with the parts of
modern mathematics he regarded as important but which were unfamiliar to them.
Here is what he wrote in a retrospective view [24], producing two lists of the historical
influences in both directions. It gives an impression of the weighting he applied to
areas of the two disciplines.
Mathematics ahead of physics: curved space, Lie groups, higher dimensions, fibre
bundles, spinors, exterior algebra, non-commutative algebra, Hilbert space, special
holonomy.
Physics ahead of mathematics: infinite-dimensional representations, Maxwell theory,
Dirac theory, supersymmetry, quantum cohomology, conformal field theory, quantum
field theory in 3 and 4 dimensions.
He concludes the article by attacking the conventional view of the relationship – the
market force model where we mathematicians alter our production in the light of
the changing needs of physics. Instead he sees mathematicians as gardeners breeding
new species, and physicists as the modern version of the 19th century collectors who
searched the world for exotic specimens to reinvigorate our gardens. For Michael
Atiyah, I hope to show that both models are valid.
Although Michael had been exposed to physics lectures as a student in Cambridge it
was only to expand his general knowledge – he also attended lectures in architecture
and archaeology. His peer Roger Penrose, also studying algebraic geometry at the
time, took little interest in physics then although he would famously go on to win a
Nobel Prize in Physics! Their subsequent paths diverged and Atiyah’s contributions
to topology and geometry had already won him a Fields Medal in 1966 before he had
any substantial contact with physicists.
In what follows I want to discuss some of the examples of the interaction which is
apparent in his work, in roughly chronological order, giving on the way an outline of
the mathematical issues for a general audience. Michael Atiyah was a bridge-builder,
and I come from one side of that bridge, so others may interpret things differently.
First here is a timeline of his mathematical development before he succumbed to the
influence of physics:
• 1952 - 1958 Algebraic geometry: this period saw him introducing the new ap-
proaches coming from France. His work included results on ruled surfaces ex-
pressed in terms of vector bundles on algebraic curves, a classification of bundles
on elliptic curves, and a new way of looking at characteristic classes in holo-
morphic terms.
• 1959 - 1974 K-theory: during this time he became effectively a topologist, col-
laborating with Hirzebruch in constructing K-theory, a cohomology theory of
vector bundles which provided the framework to resolve efficiently many
outstanding problems in algebraic topology.
• 1963 - 1975 Atiyah-Singer index theorem: here collaborations with Singer and
also Bott drew on both topology and algebraic geometry together with analysis
and differential geometry; this is discussed in the next section.
• 1966 Fields Medal
The index theorem replaces this by the general case of a compact manifold M with
two vector bundles V+ , V− and an elliptic operator D transforming sections of V+ to
sections of V− and then
dim ker D − dim coker D = ind D
where the index ind D is expressed as a specific polynomial in characteristic coho-
mology classes related to the tangent bundle of M and the vector bundles V+ , V− ,
evaluated on the fundamental class of M .
A key moment in the formulation of the theorem, even before its proof, was a visit
in 1962 of Singer to Oxford where he offered the Dirac operator as the candidate for
using the index theorem to explain the integrality of the particular combination of
characteristic classes given by Hirzebruch’s Â-polynomial. Appropriating the Dirac
operator could have been the first contact with physics but, as Atiyah explains in
[23]:
Several decades later, and with all the interaction now taking place be-
tween physicists and geometers, it may seem incredible that we did not
get there earlier and more directly. There are several explanations. First,
at that time physics and mathematics had grown rather far apart in these
areas. Second, physics dealt with Minkowski space and not with Rieman-
nian manifolds, so any relation we might have noticed would have seemed
purely formal. In fact as a student of Hodge I should have known better,
since Hodge’s theory of harmonic forms, while finding its main applica-
tion in algebraic geometry, was explicitly based on analogy and extension
of Maxwell’s theory. Even more surprising is that Hodge and Dirac were
both professors in the mathematics department at Cambridge at the same
time and knew each other well, and yet it never occurred to Hodge to use
the Dirac operator in geometry. Part of the difficulty lies of course in
the mysterious nature of spinors. Unlike differential forms they have no
easy geometrical interpretation. Even now, at the end of the century, and
with some spectacular progress involving spinors and the Seiberg-Witten
equations, we are still in the dark in some fundamental sense. What,
geometrically is a spinor and why are they important?
The last point is probably the most relevant. If you read Hodge’s book, written in
the 1930s, you will see that even defining a manifold and differential forms takes up
a lot of space. To imagine that spinors could be incorporated into that picture is
asking a lot.
The 1960s saw the index theorem develop and multiply in many ways. The first proof
was largely topological using cobordism theory. The second proof was motivated more
by algebraic geometry and Grothendieck’s approach to the Riemann-Roch theorem.
Finally in 1971 came a new method from V. K. Patodi using the heat equation and this
had more resonance with what physicists were doing. It was analytical and involved
the spectrum rather than just the nullspace. This was a fundamental change of direction –
algebraic geometry, the initial inspiration for the theorem, only encounters the
null spaces of operators but physicists are always more interested in the eigenvalues.
Singer’s physicist colleagues at MIT were now able to communicate with him having
a better common language and this was a new opening between the two fields.
Patodi, building on the work of Singer and H. McKean, used the heat kernels for the
second order operators DD∗ and D∗ D where D∗ is the formal adjoint. The cokernel
of D is isomorphic to the kernel of D∗ and since D∗ D is non-negative it has the same
nullspace as D. Hence the index can be written
tr exp(−tD∗ D) − tr exp(−tDD∗ )
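
The cancellation behind this formula can be seen in a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: D is taken to be a real 2 x 3 matrix (so D∗ is the transpose), the matrix exponential is summed as a truncated Taylor series, and the helper names are invented for the example. The non-zero eigenvalues of D∗D and DD∗ coincide, so the difference of traces does not depend on t and equals dim ker D − dim coker D = 3 − 2 = 1.

```python
# Finite-dimensional toy model (an assumption for illustration): D is a real
# 2 x 3 matrix, so D* is its transpose and ind D = dim ker D - dim coker D = 1.

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace_exp_minus(t, a, terms=80):
    """tr exp(-t a) for a square matrix a, via the Taylor series of exp."""
    n = len(a)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    scaled = [[-t * x for x in row] for row in a]
    total, factorial = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            power = matmul(power, scaled)   # now (-t a)^k
            factorial *= k                  # now k!
        total += sum(power[i][i] for i in range(n)) / factorial
    return total

def heat_supertrace(t, d):
    """tr exp(-t D*D) - tr exp(-t DD*)."""
    d_star = transpose(d)
    return (trace_exp_minus(t, matmul(d_star, d))
            - trace_exp_minus(t, matmul(d, d_star)))

D = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]
values = [heat_supertrace(t, D) for t in (0.1, 0.5, 1.0)]
```

In finite dimensions the index is always the difference of the two dimensions, so nothing deep is being computed; the point is only that the t-dependent contributions cancel in pairs, which is the mechanism the heat equation method exploits.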
dimensions may jump, there is nevertheless a well-defined class in the K-theory of
the parameter space, K-theory being specifically created by Atiyah and Hirzebruch
to cater for this scenario. If the original model for the index theorem for families
was Grothendieck’s Riemann-Roch theorem concerning the higher direct images of a
sheaf under a proper map of algebraic varieties, the analysis required to apply this
to elliptic operators became appropriate to the more general discussion of anomalies
later.
Another consequence of Patodi’s approach to proving the index theorem was the
introduction of the eta-invariant. This was another measure of the asymmetry of
the spectrum and also related to anomalies. The eta-invariant concerns the Dirac
and related operators on an odd-dimensional manifold. Here there is only one spinor
bundle V and D is self-adjoint and has a real spectrum, discrete but extending from
−∞ to +∞. The lack of symmetry to be measured here is between the positive and
negative eigenvalues and the device which can detect it – the eta-invariant – is the
analytic continuation to s = 0 of
$$\eta(s) = \frac{1}{2}\left( \sum_{\lambda \neq 0} \frac{\operatorname{sgn}\lambda}{|\lambda|^{s}} + \dim\ker D \right).$$
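
A standard model case gives a feel for what is being measured. The sketch below assumes the operator −i d/dθ + a on the circle with 0 < a < 1, whose eigenvalues are n + a for integers n; this example is not taken from the text. It also replaces the analytic continuation in s by an exponential cutoff exp(−ε|λ|), a different regularization which for this particular model is known to give the same limit, 1 − 2a. With the normalization used above (a factor 1/2, and ker D = 0 here) the corresponding value would be (1 − 2a)/2.

```python
import math

def regularized_spectral_asymmetry(a, eps=0.01, cutoff=20000):
    """Sum of sgn(lam) * exp(-eps * |lam|) over the eigenvalues lam = n + a.

    Assumptions for illustration: the model operator is -i d/dtheta + a on the
    circle with 0 < a < 1, and the exponential cutoff stands in for the
    analytic continuation in s described in the text.
    """
    total = 0.0
    for n in range(-cutoff, cutoff + 1):
        lam = n + a
        total += math.copysign(1.0, lam) * math.exp(-eps * abs(lam))
    return total

# For a = 0.25 the asymmetry should be close to 1 - 2a = 0.5.
asymmetry = regularized_spectral_asymmetry(0.25)
```

The positive and negative eigenvalues each contribute a divergent sum as ε tends to zero; only their difference has a finite limit, and that limit moves continuously with a, reflecting the fact that the invariant is not topological.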
The original motivation was to develop an index theorem for even-dimensional mani-
folds with boundary, for example to compare the index for square integrable solutions
to the polynomial in characteristic classes which gave the index in the compact sit-
uation. The difference is a function on the boundary with one local term involving
curvature (the second fundamental form) and the other the global eta-invariant. It
depends on the metric so is not a topological invariant but its derivative in a family
is the integral of a local curvature term.
The eta-invariant appears as a global anomaly for a diffeomorphism f : X → X of
a manifold. This entails forming the mapping torus M , by taking X × [0, 1] and
identifying the ends to produce a fibration over the circle. The global anomaly in
the limit where the circle is very large is then defined to be e^{2πiη} and appears for the
physicists as the phase of a partition function in the functional integral.
Although these developments appeared somewhat later than the 1970s they provide
enough evidence that the driving force in index theory was moving away from alge-
braic geometry and was more in tune with the requirements of theoretical physics,
even if the authors were not aware of it.
In 1983, the tables were turned when the physicist Alvarez-Gaumé produced a new
proof of the index theorem based on supersymmetry [3], developing ideas of Witten.
It follows the Patodi approach but in a new formalism. For example, the symmetry
of non-zero eigenvalues becomes
The properties of supersymmetry however, imply that this supersymmet-
ric index depends only on the zero energy states due to the fact that all
non-zero energy states appear in bose–fermi pairs.
In Getzler’s more mathematical treatment [36] it is the heat kernel of the one-
dimensional harmonic oscillator which picks out the particular polynomial in the
curvature to yield Hirzebruch’s Â-genus. Explaining why this complicated expression
should give an integer was the initial motivation for Atiyah and Singer back in 1962,
so here was physics providing a coherent structure yielding a precise known formula.
3 Twistor theory
I was Michael Atiyah’s research assistant at the Institute for Advanced Study, where
he was a permanent member, from 1971-73. Most of his work at this time revolved
around applying the heat equation approach to the index theorem, but at some point
in early 1973 we both went over to Princeton University to hear a talk by Roger
Penrose on black holes and singularities. After the seminar, the two of them spoke
together at some length: I imagined they were reminiscing about their student days
in Cambridge, but it turned out they were both planning to take up positions in
Oxford – Penrose to the Rouse Ball Chair of Mathematics and Atiyah to a Royal
Society Professorship in the Mathematical Institute, so this was more likely the topic
of conversation.
On his arrival, Penrose quickly set up a group of young students and postdocs fo-
cusing on his notion of twistor theory, which he initiated in 1967 [47]. This regarded
compactified, complexified Minkowski space as a 4-dimensional complex projective
quadric. Interpreting this as the classical Klein quadric, it parametrizes lines in a
complex projective 3-space, which is the (projective) twistor space. Introducing com-
plex coordinates was philosophically justified by arguing that quantum mechanics
required complex numbers, so any unified theory would have to use them in gravity;
the essential point, though, was that twistors were more fundamental.
It is easier to describe the approach without the compactification, which means re-
moving a projective line from complex projective 3-space CP3 . The one-parameter
planes through this line at infinity give a projection CP3 \CP1 → CP1 which expresses
this space as a vector bundle O(1) ⊕ O(1) → CP^1.
Here O(1) denotes the line bundle of degree one over CP1 whose holomorphic sections
are linear forms aζ + b in a parameter ζ on C ⊂ C ∪ {∞} = CP1 . (Here Riemann-
Roch gives d + 1 − g = 1 + 1 − 0 = 2). A projective line in CP3 \CP1 is now a
holomorphic section of O(1) ⊕ O(1) and so is given by s(ζ) = (aζ + b, cζ + d) with
(a, b, c, d) ∈ C4 .
We are supposed to regard this as complexified Minkowski space rather than just a 4-
dimensional vector space and the key idea is to define the light cone from each point.
A point x ∈ C4 corresponds to a line Px in twistor space and the light cone through
x is defined to be the set of lines meeting Px . So take the zero section s(ζ) = (0, 0) as
Px then the light cone consists of the sections (aζ + b, cζ + d) which vanish for some
value of ζ. This condition is ad − bc = 0. If (a, b, c, d) = (z + t, x + iy, x − iy, t − z)
then this reads t^2 − x^2 − y^2 − z^2 = 0, the usual Minkowski light cone.
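
The light cone condition above is easy to check numerically. The following is a minimal sketch; the function names and the sample points are invented for the example, and the common zero at ζ = ∞ (which occurs when a = c = 0) is deliberately ignored.

```python
def twistor_coefficients(t, x, y, z):
    """(a, b, c, d) = (z + t, x + iy, x - iy, t - z) as in the text."""
    return (z + t, complex(x, y), complex(x, -y), t - z)

def light_cone_form(t, x, y, z):
    """ad - bc, which equals t^2 - x^2 - y^2 - z^2."""
    a, b, c, d = twistor_coefficients(t, x, y, z)
    return a * d - b * c

def common_zero(t, x, y, z):
    """A finite zeta with a*zeta + b = c*zeta + d = 0, or None if there is none."""
    a, b, c, d = twistor_coefficients(t, x, y, z)
    if a == 0:
        return None
    zeta = -b / a
    return zeta if abs(c * zeta + d) < 1e-12 else None

null_point = (5, 3, 4, 0)       # t^2 - x^2 - y^2 - z^2 = 25 - 9 - 16 - 0 = 0
timelike_point = (2, 1, 0, 0)   # t^2 - x^2 - y^2 - z^2 = 4 - 1 = 3
```

For the null point the section (aζ + b, cζ + d) vanishes at ζ = −(3 + 4i)/5, so the corresponding line meets the zero section; for the timelike point no such ζ exists.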
One of the early achievements of twistor theory was a description of solutions of zero
rest mass field equations in terms of contour integrals. These equations include the
Dirac equation, Maxwell’s equation and the wave equation, all conformally invariant
equations.
Here is the process as originally formulated by Penrose, working with a holomorphic
function f of three variables [48]:
Although, 20 years earlier, Penrose and Atiyah had both been graduate students in algebraic
geometry, it was Atiyah who absorbed the new approach of sheaf cohomology
coming from Paris, while Penrose, a student of Todd rather than Hodge, worked in
a more classical vein. So when the two got together in Oxford, it was Atiyah who
pointed out that this was just a sheaf cohomology group.
The simplest case is the wave equation. Penrose’s original contour description [49]
gives
$$\phi(x, y, z, t) = \int_{\gamma \subset \mathbb{CP}^1} f\big((z + t) + (x + iy)\zeta,\ (x - iy) - (z - t)\zeta,\ \zeta\big)\, d\zeta$$
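
One way to see why this integral solves the wave equation is to differentiate under the integral sign. Writing u and v for the first two arguments of f, both are linear in (t, x, y, z), so the wave operator ∂_t^2 − ∂_x^2 − ∂_y^2 − ∂_z^2 applied to f(u, v, ζ) is a combination of f_uu, f_uv and f_vv whose coefficients are Minkowski products of the gradients of u and v. The sketch below, with names chosen for the example, checks that these coefficients vanish for several sample values of ζ.

```python
def minkowski(p, q):
    """Bilinear form p_t q_t - p_x q_x - p_y q_y - p_z q_z on (t, x, y, z) vectors."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def argument_gradients(zeta):
    """Gradients in (t, x, y, z) of the two arguments of f.

    u = (z + t) + (x + iy) * zeta,   v = (x - iy) - (z - t) * zeta
    """
    grad_u = (1, zeta, 1j * zeta, 1)
    grad_v = (zeta, 1, -1j, -zeta)
    return grad_u, grad_v

def wave_operator_coefficients(zeta):
    """Coefficients of f_uu, f_uv and f_vv in the wave operator applied to f(u, v, zeta)."""
    grad_u, grad_v = argument_gradients(zeta)
    return (minkowski(grad_u, grad_u),
            2 * minkowski(grad_u, grad_v),
            minkowski(grad_v, grad_v))

samples = [0, 1, -2, 3 + 4j, 1j]
coefficients = [wave_operator_coefficients(zeta) for zeta in samples]
```

Since every coefficient is zero for every ζ, each integrand is itself a solution, whatever the holomorphic function f, and so is the contour integral.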
He quickly observed that in this case CP^3 is the projective space P(C^4) where C^4
is regarded as a 2-dimensional quaternionic vector space. This gives it an antiholomorphic
involution with no fixed points. Furthermore it fibres over the 4-sphere S^4
with fibres projective lines, the lines which are transformed into themselves by the
involution acting as the antipodal map on S^2 ≅ CP^1. In fact, as Atiyah pointed out,
there was a more general picture. The bundle of complex structures on the tangent
spaces of the 4-sphere is the complex manifold CP^3, the space for S^6 is a complex
manifold with fibre CP^3 and so on. The Euclidean version of twistor space soon
became important in another context.
4 Instantons
In 1977 Singer spent a sabbatical term in Oxford and explained to Michael a problem
that had been put to him by physicists concerning self-dual solutions to the Yang-Mills
equations. These equations described the critical points of the Yang-Mills functional
on connections on a principal SU(2)-bundle over R^4. The connection A is given
locally by a 1-form with values in the Lie algebra g defining covariant differentiation
$$\nabla_i = \frac{\partial}{\partial x_i} + A_i$$
and its curvature F_A given by [∇_i, ∇_j] dx_i ∧ dx_j is a globally defined g-valued 2-form.
The Yang-Mills functional is the L^2 norm square of the curvature, which can be
written using the Hodge star operator ∗ acting on 2-forms as
$$\int_{\mathbb{R}^4} |F_A|^2\, dx = \int_{\mathbb{R}^4} \operatorname{tr} F_A \wedge *F_A$$
The original motivation was an attempt to describe quark confinement, which was
not in the end realized, but from the mathematician’s point of view the Wick rotation
from Minkowski space to Euclidean space placed the problem in a familiar location.
Since the Hodge star is conformally invariant in the middle dimension it meant that
(with suitable justification which came later with Karen Uhlenbeck’s work) it became
a problem on the conformal compactification of R^4, namely the sphere S^4. Since ∗^2 = 1
the curvature splits into two pieces F_A^+ + F_A^- with ∗F_A^+ = F_A^+ and ∗F_A^- = −F_A^-. Then,
using the volume form ν,
$$\int_{S^4} \operatorname{tr} F_A^2 = \int_{S^4} \left( |F_A^+|^2 - |F_A^-|^2 \right) \nu$$
but the left hand side is a topological invariant, a characteristic class, which means
the Yang-Mills functional, the integral of |F_A^+|^2 + |F_A^-|^2, is bounded below by 4πk for
a nonnegative integer k and the bound is achieved if F_A = ∗F_A which is self-dual or
the opposite anti-self-dual F_A = −∗F_A, the choice depending on orientation.
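
The pointwise algebra behind this bound can be checked on a single 2-form. The sketch below makes a simplifying assumption: it works with an ordinary real-valued 2-form on R^4 with constant coefficients, so the Lie algebra trace and its sign conventions play no role. The basis ordering and function names are choices made for the example.

```python
# Components of a real 2-form on R^4 in the basis
# (dx1^dx2, dx1^dx3, dx1^dx4, dx2^dx3, dx2^dx4, dx3^dx4).

def hodge_star(f):
    """Hodge star for the Euclidean metric and orientation dx1^dx2^dx3^dx4."""
    f12, f13, f14, f23, f24, f34 = f
    return (f34, -f24, f23, f14, -f13, f12)

def self_dual_part(f):
    return tuple((p + q) / 2 for p, q in zip(f, hodge_star(f)))

def anti_self_dual_part(f):
    return tuple((p - q) / 2 for p, q in zip(f, hodge_star(f)))

def norm_squared(f):
    return sum(p * p for p in f)

def wedge_with_itself(f):
    """Coefficient of dx1^dx2^dx3^dx4 in f ^ f."""
    f12, f13, f14, f23, f24, f34 = f
    return 2 * (f12 * f34 - f13 * f24 + f14 * f23)

F = (1.0, 2.0, -3.0, 0.5, 4.0, -1.0)   # arbitrary sample coefficients
F_plus, F_minus = self_dual_part(F), anti_self_dual_part(F)
```

The two identities |F|^2 = |F^+|^2 + |F^-|^2 and F ∧ F = (|F^+|^2 − |F^-|^2) times the volume form together show that the norm is at least the absolute value of the topological term, with equality exactly when one of the two parts vanishes.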
The challenge, then, was to find all self-dual solutions for the simplest non-abelian
group SU(2). There are none for an abelian group since the curvature is then a
harmonic 2-form and the 4-sphere has zero cohomology in degree 2.
One aspect of the mathematics/physics relationship is that physicists are much better
at producing examples to prove a point. The mathematician’s response that physicists
are satisfied with examples rather than general theorems may well be justified but
examples are crucially important for getting things moving. In this case Belavin,
Polyakov, Schwarz and Tyupkin had in 1975 exhibited a nonsingular spherically
symmetric solution with k = 1. This was followed by more examples [39] which
yielded solutions depending on 5k + 4 parameters which can locally be written as
$$A_i = -\frac{1}{2}\, e_i \cdot d\log\rho, \qquad \rho = 1 + \sum_{1}^{k} \frac{m_i^2}{|x - x_i|^2}. \qquad (1)$$
Here the function ρ is a linear superposition of 1/r2 potentials – fundamental solutions
of Laplace’s equation in four dimensions. The points xi ∈ R4 could be interpreted
as locations of particles and indeed these were called “pseudoparticle” solutions. The
notation for the Ai here is essentially describing the connection on the spinor bundle
of R4 outside the points xi . The singularities of ρ at these points mean that a local
gauge transformation in a neighbourhood of the points extends the formula to describe
a smooth connection on a bundle with topological invariant k.
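
That ρ satisfies Laplace's equation away from the points x_i can be tested with finite differences. The sketch below is a numerical check only, not a proof; the positions, scale parameters and evaluation point are hypothetical values picked for the example.

```python
def rho(point, centres, masses):
    """rho(x) = 1 + sum_i m_i^2 / |x - x_i|^2 on R^4."""
    total = 1.0
    for centre, m in zip(centres, masses):
        dist_sq = sum((p - c) ** 2 for p, c in zip(point, centre))
        total += m * m / dist_sq
    return total

def laplacian(func, point, h=1e-3):
    """Central-difference approximation to the flat Laplacian at a point."""
    centre_value = func(point)
    total = 0.0
    for axis in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        total += (func(forward) + func(backward) - 2.0 * centre_value) / (h * h)
    return total

centres = [(0.0, 0.0, 0.0, 0.0), (3.0, 0.0, 0.0, 0.0)]   # hypothetical positions x_i
masses = [1.0, 2.0]                                       # hypothetical parameters m_i
x = (1.0, 1.0, 1.0, 1.0)

lap_rho = laplacian(lambda p: rho(p, centres, masses), x)
# For contrast, 1/|x| is not harmonic in four dimensions: its Laplacian is -1/|x|^3.
lap_inverse_r = laplacian(lambda p: sum(c * c for c in p) ** -0.5, x)
```

The contrast with 1/|x| is there to show that the exponent matters: 1/r^2 is the fundamental solution in four dimensions, just as 1/r is in three.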
The first advance during Singer’s visit was that Richard Ward, one of Penrose’s
students, had shown that a complex solution to the self-dual Yang-Mills equations
was given by a holomorphic vector bundle on twistor space – a nonlinear version of
the correspondence which worked so well for linear equations. By chance Michael
had attended the seminar and immediately saw how his Euclidean version of twistor
theory would be relevant to the instanton problem – one needed a holomorphic rank
2 bundle on CP3 which was holomorphically trivial on each real line. Since these
lines are the fibres of the projection CP3 → S 4 this clearly defines a bundle on S 4 .
The connection is the unique extension of the holomorphic trivialization to the first
order neighbourhood of a line.
The final answer to the original question was achieved in the ADHM (Atiyah-Drinfeld-Hitchin-Manin)
construction [6] but Simon Donaldson has explained that in his talk
in the series. Instead, let me point out the collaboration between Atiyah and Ward
which preceded this [5].
How do you construct a holomorphic vector bundle on CP3 ? By way of analogy,
consider the problem of line bundles on a Riemann surface C. Suppose the line bundle
L of degree d ≥ 0 has a holomorphic section. It will vanish at points x1 , . . . , xd and in
the complement we have a non-vanishing section, namely a trivialization of L. On the
other hand, by the definition of a line bundle there are local trivializations on open
discs Di containing xi . Taking a covering {Uα } of C by the Di and C\{x1 , . . . , xd },
on twofold intersections Uα ∩ Uβ the trivializations differ by a holomorphic function
gαβ with values in C∗ . Conversely, if zi is a local coordinate with zi (xi ) = 0 then zi
on the punctured disc Di defines an equivalent family of transition functions. So the
points x1 , . . . , xd define L.
Suppose then that a rank 2 holomorphic vector bundle E on CP3 has a section. Now
the zero set is defined locally by two functions and is therefore a curve C ⊂ CP3 . In
the complement we no longer have a trivialization but we do have a non-zero section
giving a trivial subbundle and this means that there are transition functions of the
form
$$g_{\alpha\beta} = \begin{pmatrix} 1 & a_{\alpha\beta} \\ 0 & b_{\alpha\beta} \end{pmatrix}.$$
Here b_{αβ} is a transition function for the quotient line bundle L of E by the trivial
subbundle and the multiplicative property of the transition function on threefold
intersections means that a_{αβ} is a Čech representative for a class in H^1(CP^3 \ C, L^*).
If L ≅ O(2) then, as we have seen, this is a solution to Laplace’s equation. The class
does not extend to CP^3 but it does as a class in Ext^1_{CP^3}(J_C, O(L^*)) where J_C is the
ideal sheaf of C. Under certain conditions the curve C and some scalar information
on C determines E up to equivalence. This was the so-called Serre construction of
vector bundles.
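
The multiplicative property of upper-triangular transition functions can be made concrete at a single point of a triple overlap. The sketch below uses hypothetical rational values for the entries and assumes the convention g_αβ g_βγ = g_αγ; under that convention the diagonal entries multiply, while the off-diagonal entries satisfy a_αγ = a_βγ + a_αβ b_βγ, a cocycle condition twisted by the b's.

```python
from fractions import Fraction

def transition(a, b):
    """Upper-triangular transition matrix [[1, a], [0, b]] with b non-zero."""
    return ((Fraction(1), Fraction(a)), (Fraction(0), Fraction(b)))

def compose(g, h):
    """Matrix product g h of two 2 x 2 matrices."""
    return tuple(tuple(sum(g[i][k] * h[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

# Hypothetical values of the transition functions at one point of a triple overlap.
g_ab = transition(Fraction(2, 3), 5)
g_bc = transition(-4, Fraction(1, 7))
g_ac = compose(g_ab, g_bc)

a_ab, b_ab = g_ab[0][1], g_ab[1][1]
a_bc, b_bc = g_bc[0][1], g_bc[1][1]
```

The product is again of the same upper-triangular shape, which is what allows the a's to be read as a cohomology class with coefficients in a line bundle built from the b's.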
Taking C to be a collection of disjoint lines in CP3 yields the pseudoparticle solutions
of (1). The class in H^1(CP^3 \ C, O(−2)) corresponds via the Penrose transform to
the solution ρ of the Laplace equation in the Ansatz. By considering L isomorphic to
O(n) more solutions were obtained from a modified Ansatz using more general zero
rest mass field equations, but the condition to be trivial on every real line turned out
to be more difficult.
In the meantime a second advance was a use of the index theorem. Any automorphism
of the principal bundle (a gauge transformation) takes one solution of the equations
to another so a parameter count must take this into account by considering the
moduli space – the set of solutions modulo gauge equivalence. To first order, a gauge
transformation is given by a section ψ of the bundle g of Lie algebras and its effect
on the connection is the variation ∇ψ ∈ Ω1 (S 4 , g), a 1-form with values in g. An
arbitrary first order variation Ȧ of the connection satisfies the self-duality equation
if the anti-self-dual component (d_A)_- of the variation of the curvature d_A Ȧ vanishes.
Then the kernel of the differential operator (d_A)_- transforming Ω^1(S^4, g) to Ω^2_-(S^4, g)
contains the first order deformations and this, modulo the image of ∇_A acting on
sections of g, should give the parameter count.
This is the first cohomology of an elliptic complex on S^4:
$$\Omega^0(S^4, \mathfrak{g}) \xrightarrow{\ \nabla_A\ } \Omega^1(S^4, \mathfrak{g}) \xrightarrow{\ (d_A)_-\ } \Omega^2_-(S^4, \mathfrak{g}) \qquad (2)$$
and the index theorem (for the elliptic operator (d_A)_- ⊕ ∇_A^*) gives 8k − 3 for the
dimension if the cokernel is zero. The positivity of the (scalar) curvature of S^4 gives
this via a differential-geometric vanishing theorem. The two topological invariants for
the right hand side of the index theorem are the Euler characteristic 2 of the sphere
and the second Chern class −k of the rank 2 bundle. These 8k − 3 dimensions were
significantly more than was being produced by constructions during this period and
so pointed towards a better approach.
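
The arithmetic of the count can be tabulated. The sketch below assumes the general form 8k − 3(χ − τ)/2 of the index formula for SU(2) on a compact 4-manifold with Euler characteristic χ and signature τ; that general form is not stated in the text above, which only gives the value on the 4-sphere, and the sign in front of τ depends on the orientation convention. Only the sphere values are used here.

```python
def instanton_moduli_dimension(k, euler_characteristic=2, signature=0):
    """Expected dimension 8k - 3(chi - tau)/2 for SU(2) with instanton number k.

    Assumption: this is the general form of the count for self-dual
    connections; the defaults chi = 2, tau = 0 are those of the 4-sphere and
    reproduce the value 8k - 3 quoted in the text.
    """
    correction = 3 * (euler_characteristic - signature)
    if correction % 2:
        raise ValueError("chi - tau must be even")
    return 8 * k - correction // 2

dimensions_on_sphere = [instanton_moduli_dimension(k) for k in range(1, 5)]
```

The first entry, 5, is the dimension of the k = 1 moduli space, and each further unit of k adds 8 parameters.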
The algebraic geometry of bundles on projective space was a subject of consider-
able interest at the time, independent of the instanton problem, in particular work
by Wolf Barth in Germany, Geoffrey Horrocks in England and Robin Hartshorne
in the USA. More streamlined methods were being used for this problem; the key
algebraic approach was to consider the direct sum of the sheaf cohomology groups
H^1(CP^3, E(n)) for all n as a module over the direct sum of H^0(CP^3, O(n)) which is
the algebra of polynomials in four variables z_1, z_2, z_3, z_4. Each of these spaces has a
Penrose transform interpretation but the algebraic geometry was more flexible and
showed that ultimately all depended on the first part H^1(CP^3, E(−1)) so long as
H^1(CP^3, E(−2)) = 0. Here the Penrose transform came to the rescue with positive
curvature showing that the only solutions to the Laplace equation coupled to the con-
nection were zero. This last result set the algebraic geometry machinery in motion
and reduced the construction to one of matrices.
For Michael Atiyah this whole experience would be, I believe, a pivotal moment in
his attitude to the relationship between geometry and physics.
A little later, Michael returned to twistor theory and revisited the material that went
into the Serre construction. Physicists [28] had already in 1978 used the ADHM
construction to explicitly calculate the Green’s function for the Laplacian coupled to
a self-dual connection and the issue of how to define a Green’s function in terms of
twistor theory arose.
It is more natural to use conformal invariance and consider the conformally invariant
Laplacian, which is ∆ + R/6 with R the scalar curvature of a metric in the conformal
class. Then, because of the positivity of curvature of the 4-sphere, the Green’s
function is unambiguously defined. The answer, for the 4-sphere, is to take the
ideal sheaf J_x of the twistor line L_x corresponding to x ∈ S^4 and the Serre class
λ(x) ∈ Ext^1_{CP^3}(J_x, O(−2)). The full two-variable Green’s function G(x, y) is the Serre
class of the diagonal in CP^3 × CP^3.
The Green’s function paper [7] appeared in 1981 by which time there was a more
general setting. By analogy with the gauge-theoretical instantons, Gary Gibbons
and Stephen Hawking had produced gravitational analogues. These were Euclidean
solutions of Einstein’s vacuum equations with the additional property that the Levi-
Civita connection was self-dual. In 1976, Penrose had also introduced his nonlinear
graviton construction which replaced the twistor space CP3 by a more general 3-
manifold Z with the key property that it contained twistor lines – holomorphic CP1 s
with normal bundle isomorphic to O(1) ⊕ O(1) and the Gibbons-Hawking examples
fitted nicely into this picture. A competition then began with the physicist Don Page
to calculate the Green’s function for these examples. Michael and I used to walk from
the Institute across the University Parks to St Catherine’s College for lunch and I
remember him complaining that he would never do a long intricate calculation like
that again. I think a bottle of champagne was at stake, but I never found out who
won it.
from the elliptic complex (2) was used to define a moduli space, a smooth manifold
of dimension 8k − 3.
When k = 1 this moduli space is 5-dimensional and is acted on by the group of confor-
mal transformations of the 4-sphere SO(5, 1). It did not need the ADHM construction
to show that this was hyperbolic 5-space. The key point that its boundary was the
original 4-manifold went unnoticed until Simon Donaldson started his momentous
work.
There was one example in the literature of such a moduli space of connections and this
was the theorem of M. S. Narasimhan and C. S. Seshadri concerning stable holomorphic
vector bundles on a Riemann surface Σ. In [46] they had shown that a holomorphic
vector bundle on a compact Riemann surface which satisfied an algebro-geometric
stability condition admitted a natural flat unitary connection. In their proof, the
connection barely appears since a flat connection can be viewed geometrically rather
than analytically as either a vector bundle with transition matrices which are locally
constant, or simply as the quotient of Σ̃ × C^n (where Σ̃ is the universal covering)
by the action of the fundamental group π_1(Σ), acting on C^n via a homomorphism
ρ : π_1(Σ) → U(n). This was an important result, linking topology and algebraic
geometry, for it provided two very different structures on the moduli space. One
was as an algebraic variety, the other as the space of unitary n × n matrices A_i, B_i
satisfying the constraint A_1^{-1} B_1^{-1} A_1 B_1 ··· A_g^{-1} B_g^{-1} A_g B_g = 1 up to conjugation. In
[37] Harder and Narasimhan had calculated the Betti numbers of these moduli spaces
by number-theoretic means, using the proof of the Weil conjectures. (Actually to
avoid singularities in the moduli space, they had to restrict to vector bundles whose
degree and rank are coprime, which in the gauge-theoretic interpretation consists of
representations in the projective unitary group.)
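
The constraint on the matrices, and the projective variant mentioned in the parenthesis, can be illustrated for genus 1 and n = 2. The sketch below uses small explicit matrices chosen for the example. The second pair commutes only up to the central element −1; relations of this kind, with a root of unity on the right-hand side, are how the coprime case is usually described, though that description is an assumption not spelled out in the text.

```python
def matmul(a, b):
    """Product of two 2 x 2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def unitary_inverse(a):
    """Inverse of a unitary matrix: the conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def commutator(a, b):
    """The group commutator A^{-1} B^{-1} A B."""
    return matmul(matmul(unitary_inverse(a), unitary_inverse(b)), matmul(a, b))

# Two commuting diagonal unitary matrices satisfy the constraint for g = 1.
A = [[1j, 0j], [0j, -1 + 0j]]
B = [[-1 + 0j, 0j], [0j, -1j]]

# This pair satisfies the relation only up to the scalar -1, so it gives a
# representation into the projective unitary group rather than into U(2).
X = [[0j, 1 + 0j], [1 + 0j, 0j]]
Z = [[1 + 0j, 0j], [0j, -1 + 0j]]
```

Conjugating both matrices of a pair by the same unitary matrix preserves the relation, which is why the moduli space is a space of solutions up to conjugation.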
Atiyah and his long-time collaborator Raoul Bott, who had recently spent time at
the Tata Institute with Narasimhan, set out in 1980 to use the Yang-Mills functional,
not in four dimensions, but in two, to analyse this situation. Connections on a fixed
smooth complex vector bundle E with a Hermitian inner product form an infinite-
dimensional affine space A – the difference of any two covariant derivatives ∇1 − ∇2
lies in Ω1 (Σ, g) where g is the bundle of skew-Hermitian endomorphisms of E. The
Yang-Mills functional
$$\int_\Sigma |F_A|^2\, \nu$$
(where ν is the area form of a metric in the conformal class) is a natural gauge-
invariant function on this space and the absolute minimum is the space of flat con-
nections. The quotient by the group of unitary gauge transformations is the space
studied by Narasimhan and Seshadri. The idea of Atiyah and Bott was to use Morse
theory in infinite dimensions to give an alternative approach to finding the cohomol-
ogy of the moduli space.
Recall the classical picture of Morse theory. We are given a compact manifold M and
a real-valued function f with nondegenerate critical points. At a critical point the
Hessian, the second derivative of f , is a well-defined quadratic form on the tangent
space and its index is the number of negative eigenvalues. At a minimum value, say
0, the index is zero and one builds up the homotopy type of the manifold from that of
f^{-1}[0, t] by adding a cell of dimension n as t passes through a critical value of index
n.
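
The bookkeeping in this classical picture is simple enough to sketch. The code below records only the indices of the critical points; the two examples are the standard height functions on the round 2-sphere and on an upright torus, which are not taken from the text, and the function names are chosen for the example.

```python
from collections import Counter

def morse_counts(indices):
    """Number of critical points of each index, as a list [c_0, c_1, ..., c_n]."""
    counts = Counter(indices)
    return [counts.get(i, 0) for i in range(max(indices) + 1)]

def euler_characteristic(indices):
    """Alternating sum of cell counts, with one cell of dimension n per index-n point."""
    return sum((-1) ** i for i in indices)

sphere_indices = [0, 2]         # a minimum and a maximum
torus_indices = [0, 1, 1, 2]    # a minimum, two saddles and a maximum
```

In both examples the counts coincide with the Betti numbers, (1, 0, 1) and (1, 2, 1), so the functions are perfect in the sense of Morse theory; in general the counts only bound the Betti numbers from above.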
The aim of Atiyah and Bott was to use this the other way round, to determine by
subtraction the topology of the absolute minimum from the topology of M and the
other critical points, or critical submanifolds in this case. The infinite-dimensional
affine space A is of course contractible, but the quotient A/G by the group of gauge
transformations is not and its homotopy type is determined by that of the maps
from Σ to the classifying space BU (n). The other critical points are where the first
variation of the functional is zero for a variation Ȧ ∈ Ω^1(Σ, g) of the connection. This
is
$$\int_\Sigma (d_A \dot{A}, F_A)\, \nu = \int_\Sigma (\dot{A}, d_A^* F_A)\, \nu$$
5.2 Symplectic aspects
The affine space A is acted on transitively by Ω1 (Σ, g) so the tangent space at any
point A is naturally isomorphic to this space. Given two tangent vectors Ȧ, Ḃ ∈
Ω1 (Σ, g) there is a canonical skew pairing
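$$\int_\Sigma \operatorname{tr}(\dot{A} \wedge \dot{B})$$
and
$$df_a(\dot{A}) = \int_\Sigma \operatorname{tr}(d_A \dot{A}\, a) = \int_\Sigma \operatorname{tr}(\dot{A} \wedge d_A a)$$
and since d_A a is the variation of the connection (just as in the 4-dimensional case of
the complex (2)) F_A is the moment map.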
From this viewpoint the space of flat connections is µ^{-1}(0) and the moduli space
µ^{-1}(0)/G is a symplectic quotient or reduced phase space and acquires a symplectic
structure.
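
A finite-dimensional analogue may help fix the idea of a moment map. The sketch below is a toy model and not the infinite-dimensional setting of the text: the circle acts on C^3 by z ↦ exp(iθ)z, the symplectic form is the standard one, and µ(z) = |z|^2/2. It checks numerically the defining relation dµ(v) = ω(v, X), where X is the vector field generating the action, with the same ordering of arguments as in the pairing above. The sample point and tangent vector are hypothetical.

```python
def omega(u, v):
    """Standard symplectic form on C^n: the imaginary part of sum(conj(u_j) * v_j)."""
    return sum((p.conjugate() * q).imag for p, q in zip(u, v))

def mu(z):
    """Candidate moment map for the circle action z -> exp(i*theta) z."""
    return 0.5 * sum(abs(p) ** 2 for p in z)

def generator(z):
    """Vector field of the action at z: the derivative of exp(i*theta) z at theta = 0."""
    return [1j * p for p in z]

def directional_derivative(func, z, v, h=1e-4):
    """Central-difference derivative of func at z in the direction v."""
    plus = [p + h * q for p, q in zip(z, v)]
    minus = [p - h * q for p, q in zip(z, v)]
    return (func(plus) - func(minus)) / (2 * h)

z = [1 + 2j, -0.5 + 1j, 3 - 1j]       # hypothetical point of C^3
v = [0.3 - 1j, 2 + 0.5j, -1 + 4j]     # hypothetical tangent vector

lhs = directional_derivative(mu, z, v)   # d(mu)(v)
rhs = omega(v, generator(z))             # omega(v, X)
```

In this toy model a level set of µ is a sphere, and its quotient by the circle is a complex projective space, the simplest example of a symplectic quotient.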
as we noted above. When Σ has a complex structure, by differentiating in the anti-
holomorphic direction, any covariant derivative ∇ on E defines a differential operator
∇^{0,1} with the property that ∇^{0,1}(f s) = ∂̄f s + f ∇^{0,1}(s) where s is a section of E, f
a function and ∂̄ the Cauchy-Riemann operator. Conversely, given a ∂̄-operator the
Hermitian form defines a complex conjugate which is a ∂-operator and ∂̄ + ∂ is the
covariant derivative ∇ of a connection. Atiyah and Bott showed, although in fact
it had previously been proved by Koszul and Malgrange, that the local solutions to
∇0,1 (s) = 0 are the local holomorphic sections of a holomorphic vector bundle.
Thus the space of connections is also the space of holomorphic structures and the
complexified gauge group G^c is a well-defined object taking one ∂̄-operator to another
by conjugation. In particular A is an infinite-dimensional complex manifold with
an infinite-dimensional group G^c of holomorphic transformations. Together with the
symplectic structure A is an infinite-dimensional Kähler manifold and the moduli
space inherits a natural Kähler metric.
5.4 Stability
These two differential-geometric observations placed the Narasimhan-Seshadri theo-
rem in a new light. A holomorphic vector bundle on a Riemann surface Σ is said to
be stable if for each subbundle U ⊂ E we have
\[ \frac{\deg U}{\operatorname{rk} U} < \frac{\deg E}{\operatorname{rk} E}. \]
The theorem states that a stable bundle of degree zero has a unique flat unitary con-
nection. From the gauge-theoretic viewpoint this means that the stable holomorphic
structures A ∈ A are those which are transformed by an action of G c to the zero set
of the moment map $A \mapsto F_A$. It presented the opportunity for a new proof of the
theorem – to start with a $\bar\partial$-operator on a fixed $C^\infty$ vector bundle with a Hermitian
metric and try to find the minimum of the Yang-Mills functional on a G c -orbit. The
stability condition should imply the existence of a minimum, which is zero. This was
Donaldson’s first paper [30] and he used the analysis developed by Karen Uhlenbeck
mainly for the 4-dimensional Yang-Mills problem to get convergence results.
Simon Donaldson had been my research student but it was when he observed a similar
phenomenon that I transferred him to Michael Atiyah, who was enthusiastically pur-
suing the moment map idea. The problem I suggested to Donaldson was the issue of
stability for holomorphic vector bundles in higher dimensions, and specifically in two
complex dimensions. A two-dimensional Kähler manifold has a canonical orientation
and with respect to this the anti-self-dual condition on a connection implies that ∇0,1
defines a holomorphic structure but there is one more condition, that contraction of
the curvature FA with the Kähler form ω (the Λ-operator in Hodge theory) should
be zero. There was a clear conjecture here that the algebraic geometry notion of
stability was related to this curvature condition. There was evidence:
2. Yau’s proof of the Calabi conjecture gave examples for the tangent bundle of a
K3 surface
Donaldson formulated the conjecture in moment map terms on the space of connec-
tions A. Here one needs the Kähler form to define a formal symplectic structure,
which was unnecessary in the one-dimensional case. The curvature now has different
components corresponding to the decomposition of 2-forms on a complex manifold
into types (1, 1), generated by $dz_i \wedge d\bar z_j$, (2, 0) generated by $dz_i \wedge dz_j$ and (0, 2) by
$d\bar z_i \wedge d\bar z_j$. The vanishing of the (0, 2) part of $F_A$ is the condition for $\nabla_A^{0,1} s = 0$ to have
local solutions and define a holomorphic structure. This subspace of A is a nonlinear
complex submanifold and then ΛFA = 0 is the vanishing of the restricted moment
map, so stability should be equivalent to transforming by a G c -gauge transformation
to a zero of the moment map. Donaldson proved this for surfaces [32] and Uhlenbeck
and Yau in higher dimensions [54], but this came after Donaldson, as Atiyah’s stu-
dent, turned his attention to instantons on general four-dimensional manifolds with
spectacular results.
Michael’s other student at the time, Frances Kirwan, was given the finite dimen-
sional problem of using moment maps and equivariant Morse theory to calculate the
cohomology of quotients in algebraic geometry using the same ideas and the norm
squared of the moment map instead of the Yang-Mills functional. Lessons learned
from a gauge-theoretic approach were valuable in pure algebraic geometry.
5.5 Convexity
There was another aspect of symplectic geometry which made its presence felt during
this piece of research. The brief sketch above of how Morse theory operates should
be amplified by using a Riemannian metric and considering the gradient flow of f .
Each point flows down to a critical point x and the upward flow from x is isomorphic
to Rm where m is the number of positive eigenvalues of the Hessian. Thus the
upward flow from the absolute minimum is open. These stable submanifolds give a cell
decomposition of M and the behaviour of their closures encompasses the complexity
of the cohomology. When the critical points form submanifolds N ⊂ M then these
are total spaces of vector bundles over N .
There is a partial ordering on the critical submanifolds Ni . Considering the downward
gradient flow of f then N1 < N2 if there is a trajectory starting on N1 and whose
limit lies in N2 . In Atiyah and Bott’s case the critical submanifolds are direct sums
of bundles E1 , . . . , Ek of rank n1 , . . . , nk and degree d1 , . . . , dk . A partial ordering was
needed among these sets of integers.
In fact, the gradient flow of the Yang-Mills functional was difficult to apply but instead
there existed a ready-made stratification by considering A as the space of holomorphic
structures. A holomorphic vector bundle may not be stable (or semi-stable which is
where equality occurs in the definition) but instead there is the Harder-Narasimhan
filtration [37]
\[ 0 = E_0 \subset E_1 \subset E_2 \subset \cdots \subset E_n = E \]
where $D_i = E_i/E_{i-1}$ is semistable and $\mu(D_1) > \mu(D_2) > \cdots > \mu(D_n)$ with slope
$\mu(V) = \deg V / \operatorname{rk} V$. Stability is an open condition so the stratum for E = En and µ = 0 is
the analogue of the upward flow from the absolute minimum of the functional.
Using a result of Shatz [52] about how the µ(Di ) specialize under limits the partial
ordering is organized by taking the vector of slopes (µ1 , . . . , µn ) including repetition
and defining λ ≥ µ if and only if
\[ \sum_{j \le i} \lambda_j \ge \sum_{j \le i} \mu_j \qquad (3) \]
for 1 ≤ i ≤ n − 1. Here $\sum \mu_i = \sum \lambda_i = \deg E$ which is a topological invariant.
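The ordering (3) is easy to experiment with. The following is a minimal Python sketch, not taken from the papers cited. It assumes a Harder-Narasimhan type is recorded as a list of (rank, degree) pairs for the quotients D_i. The function names `slope_vector` and `dominates` and the rank 3, degree 0 examples are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import accumulate


def slope_vector(hn_type):
    """Slopes of the Harder-Narasimhan quotients, each repeated rank-many times.

    hn_type is a list of (rank, degree) pairs for D_1, ..., D_k, whose slopes
    must be strictly decreasing.
    """
    slopes = [Fraction(deg, rk) for rk, deg in hn_type]
    if any(a <= b for a, b in zip(slopes, slopes[1:])):
        raise ValueError("slopes of successive quotients must strictly decrease")
    vec = []
    for (rk, _deg), s in zip(hn_type, slopes):
        vec.extend([s] * rk)
    return vec


def dominates(lam, mu):
    """True if lam >= mu in the sense of the partial-sum inequalities (3)."""
    if len(lam) != len(mu) or sum(lam) != sum(mu):
        raise ValueError("vectors must have the same rank and total degree")
    return all(l >= m for l, m in zip(accumulate(lam), accumulate(mu)))


# Hypothetical types for bundles of rank 3 and degree 0.
semistable = slope_vector([(3, 0)])                 # (0, 0, 0)
type_a = slope_vector([(1, 1), (2, -1)])            # (1, -1/2, -1/2)
type_b = slope_vector([(2, 1), (1, -1)])            # (1/2, 1/2, -1)
type_c = slope_vector([(1, 2), (1, 0), (1, -2)])    # (2, 0, -2)
```

In this example every type dominates the semistable one, `type_c` dominates both `type_a` and `type_b`, while `type_a` and `type_b` are incomparable, which is why (3) gives only a partial ordering.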
In subsequent papers [9],[8] Atiyah presents this in the context of the action of the
torus T of diagonal matrices on the symplectic manifold U (n)/T , a coadjoint orbit.
The moment map takes for each generator of the Lie algebra t of T a Hamiltonian
function and gives an equivariant map $\mu : U(n)/T \to \mathfrak{t}^* \cong \mathbb{R}^n$. The theorem is that
the image is convex. To see the link, let a ∈ u(n) be a skew Hermitian matrix with
distinct eigenvalues (iλ1 , . . . , iλn ) and consider the orbit in the Lie algebra which
we can identify with its dual using the inner product tr(ab). The λi are of course
constant on the orbit. The moment map for the action of U (n) is just the inclusion
but restricted to the torus of diagonal matrices in U (n) it is the projection onto the
Lie algebra. So the moment map is (µ1 , . . . , µn ) where these are the diagonal entries.
The convexity of the image gives the inequality (3).
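A numerical experiment makes the link with (3) concrete. The sketch below is an illustration under stated assumptions rather than Atiyah's argument. It samples the orbit using only rotations in SO(n) ⊂ U(n), takes the hypothetical eigenvalue data λ = (3, 1, −1, −3), and checks that the diagonal entries µ_i = Σ_j g_ij² λ_j satisfy the partial-sum inequalities once both vectors are sorted. It tests that the sampled points lie in the polytope cut out by (3); it does not by itself prove convexity.

```python
import math
import random
from itertools import accumulate


def random_rotation(n, rng, steps=20):
    """A matrix in SO(n), a subgroup of U(n), built as a product of plane rotations."""
    g = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        for row in g:  # right-multiply by a rotation in the (i, j) plane
            row[i], row[j] = c * row[i] - s * row[j], s * row[i] + c * row[j]
    return g


def moment_map(g, lam):
    """Diagonal entries of g diag(lam) g^T, that is mu_i = sum_j g_ij^2 lam_j."""
    return [sum(gij * gij * l for gij, l in zip(row, lam)) for row in g]


def majorized_by(mu, lam, tol=1e-9):
    """True if the sorted partial sums of mu never exceed those of lam."""
    mu_sorted = sorted(mu, reverse=True)
    lam_sorted = sorted(lam, reverse=True)
    same_total = abs(sum(mu) - sum(lam)) < tol
    sums = zip(accumulate(lam_sorted), accumulate(mu_sorted))
    return same_total and all(l >= m - tol for l, m in sums)


rng = random.Random(0)
lam = [3.0, 1.0, -1.0, -3.0]  # hypothetical eigenvalue data
samples = [moment_map(random_rotation(4, rng), lam) for _ in range(200)]
all_inside = all(majorized_by(mu, lam) for mu in samples)
```

The design choice of real rotations keeps the example free of complex arithmetic; a general unitary matrix would replace g_ij² by |g_ij|² in `moment_map`.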
In fact, as Atiyah writes:
These insights fed into other fields, clearly into generalizing to Lie groups, where
Kostant had already trodden, but also into stationary phase approximations and
equivariant cohomology.
This journey into two dimensions was motivated by the four-dimensional problem of
instantons and its moduli space was not neglected, but we shall return to this later.
Physicists at the time were very interested in solving the instanton equa-
tions because of speculation by Alexander Polyakov about the dynamics
of gauge theories. However, the ingredients in the twistor transform of the
instanton equation – complex manifolds, sheaf cohomology, fiber bundles
– were quite unfamiliar to me and most other physicists
Without necessarily going into the details, Atiyah developed a nose for the parts of
mathematics which physicists would find useful and absorbed through his conversa-
tions with physicists, Witten in particular, and also his collaborators Bott and Singer,
the radically different viewpoints in the two disciplines.
Here is Witten’s recollection [53]:
At the 1979 Cargèse summer school, Atiyah and Raoul Bott undertook
to educate physicists about Morse theory. I and most (or all?) of the
physicists there had certainly never been exposed to Morse theory be-
fore. Another highlight was a conference in Texas where Atiyah and Is
Singer began to elucidate the topological meaning of what physicists know
as perturbative anomalies in gauge theory. This helped introduce physi-
cists to a deeper understanding of fermion path integrals. Two papers by
Atiyah and Bott in these years were ultimately influential for physicists.
Their 1983 paper “The Yang-Mills equations over a Riemann surface”
introduced ideas that were important later in understanding quantum
gauge theories in two dimensions. Their 1984 paper “The moment map
and equivariant cohomology” helped lead to the important technique of
“localization” in supersymmetric quantum field theory.
Atiyah’s contribution to the interaction was one of issuing challenges to the physics
community and educating them in the parts of mathematics he thought would be
of most interest and use. The process accelerated in the mid 1980s when string
theory provided a much wider interface between the two disciplines. The challenges
included making sense of the relationship between Langlands duality of Lie groups
and electric and magnetic charges, and of finding a quantum field theory to provide a
context for the ongoing work of Donaldson on four-manifold invariants and, perhaps
most successfully, understanding the Jones polynomial as a topological quantum field
theory (see Edward Witten’s article based on a talk in the series). Witten’s description
above of part of the educational process represents just a few of the interactions. In
the next section we shall see what happened to Morse theory in Witten’s hands. To
paraphrase Goethe, replacing mathematicians by physicists: “whatever you say to
them they translate into their own language and forthwith it is something entirely
different.”
$\dim H^p(M, \mathbb{R}) \le m_p$ and
\[ \sum_{p=0}^{k} (-1)^{k-p} m_p \ge \sum_{p=0}^{k} (-1)^{k-p} \dim H^p(M, \mathbb{R}). \]
\[ H^p(M, \mathbb{R}) = \frac{\ker d : \Omega^p \to \Omega^{p+1}}{\operatorname{im} d : \Omega^{p-1} \to \Omega^p} \]
and the Hodge theorem: if M is compact, then each cohomology class has a unique
harmonic representative: dα = 0, d∗ α = 0.
For Witten the direct sum of forms of odd degree Ωod is to be regarded as a space of
fermionic states and the even forms Ωev as bosonic states. There are supersymmetry
operators exchanging bosons and fermions and these are
\[ Q_1 = d + d^*, \qquad Q_2 = i(d - d^*) \]
while the Hamiltonian H = dd∗ + d∗ d is the Laplacian on forms. These satisfy the
supersymmetry relations
\[ Q_1^2 = Q_2^2 = H, \qquad Q_1 Q_2 + Q_2 Q_1 = 0. \]
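These relations use nothing beyond d² = 0 and (d∗)² = 0. As a short check, written out here for convenience and not quoted from Witten's paper:

```latex
\begin{align*}
Q_1^2 &= (d + d^*)^2 = d^2 + dd^* + d^*d + (d^*)^2 = dd^* + d^*d = H,\\
Q_2^2 &= -(d - d^*)^2 = -\bigl(d^2 - dd^* - d^*d + (d^*)^2\bigr) = dd^* + d^*d = H,\\
Q_1 Q_2 + Q_2 Q_1 &= i\bigl[(d + d^*)(d - d^*) + (d - d^*)(d + d^*)\bigr]
                   = 2i\bigl(d^2 - (d^*)^2\bigr) = 0.
\end{align*}
```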
Now introduce the Morse function f and conjugate d to give $d_t = e^{-ft} d\, e^{ft}$ and its adjoint
$d_t^* = e^{ft} d^* e^{-ft}$. Clearly $d_t^2 = 0$ and the cohomology groups defined by $d_t$ are isomorphic to
the de Rham cohomology. However the Hamiltonian is now
(where $i_{e_j}$ denotes contraction, or inner product, with an element of a local orthonormal
basis of tangent vectors). The expression is a Laplacian plus a potential term V ,
familiar territory for a physicist.
Now let t → ∞ and V becomes large except at the critical points where df = 0 and
this implies that the eigenfunctions of Ht are concentrated near the critical points.
Moreover there is an asymptotic expansion of the eigenvalues
where the coefficients are local expressions around the critical points. Fix p and
consider the eigenvalues on p-forms. For large t the number of eigenvalues which
vanish is no larger than the number of coefficients a which vanish and this is equal
to the number of negative eigenvalues of the Hessian term $\nabla^2 f$. Thus the number $m_p$ of
critical points of index p satisfies $m_p \ge \dim H^p(M)$, the first Morse inequality.
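Both the weak inequality just derived and the stronger alternating form, taken here in its standard shape with signs (−1)^(k−p), can be checked mechanically for given data. The sketch below is only an illustration: the function name `morse_inequalities` and the sample counts are hypothetical, with the torus and its height function as the one standard example.

```python
def morse_inequalities(m, b):
    """Check the weak and strong Morse inequalities.

    m[p] is the number of critical points of index p and b[p] = dim H^p(M, R).
    Returns a pair of booleans (weak, strong).
    """
    if len(m) != len(b):
        raise ValueError("need one Morse number and one Betti number per degree")
    weak = all(mp >= bp for mp, bp in zip(m, b))
    strong = True
    for k in range(len(m)):
        lhs = sum((-1) ** (k - p) * m[p] for p in range(k + 1))
        rhs = sum((-1) ** (k - p) * b[p] for p in range(k + 1))
        strong = strong and lhs >= rhs
    return weak, strong


torus_betti = [1, 2, 1]
# Height function on the torus: one minimum, two saddles, one maximum.
height_function = morse_inequalities([1, 2, 1], torus_betti)
# Hypothetical counts that pass the weak test but fail the strong one.
impossible_counts = morse_inequalities([2, 2, 1], torus_betti)
```

The second set of counts respects m_p ≥ dim H^p in every degree but violates the alternating inequality at k = 1, so no Morse function on the torus can realize it.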
To obtain the stronger form of the inequality requires studying the orbits of the gra-
dient flow, passing from one critical point to another, the “paths of steepest descent”.
These now acquire the terminology of “instantons” “tunneling” between two states.
Witten introduced thereby a complex which was in some sense a revival of early
work by Morse, Smale, Thom, and Milnor but because of its setting it was highly
influential, in particular the development of Andreas Floer [35].
The meeting of Hodge theory with Morse theory also meant that the method could
be adapted by replacing the exterior derivative d with the ∂¯ operator on a complex
manifold [29], [57]. This leads to asymptotic inequalities as p → ∞ of the form
\[ \sum_{j=0}^{k} (-1)^{k-j} \dim H^j(M, L^p) \le \frac{p^n}{n!} \int_N p(R) \]
each Σ, a compact oriented d-dimensional manifold, a complex vector space Z(Σ),
and to each compact oriented d+1-manifold with boundary Σ a vector Z(M ) ∈ Z(Σ).
Then the axioms are:
5. Z(M ∗ ) = Z̄(M ).
Axiom 3 for the tensor product of the disjoint union is the feature which to a topologist
is the most unusual, and somehow represents the quantum content of the definition.
If M has empty boundary then Z(M ) ∈ C is an invariant.
This abstraction is a long way from the physical motivation but Σ is modelled on
space and the extra dimension of M is time, Z(M ) is the vacuum state in the Hilbert
space Z(Σ). The cylinder M = Σ×[0, 1] has two boundary components with opposite
orientation and then Axiom 2 says that the “propagation” from one Hilbert space
to the other is the identity, or that the Hamiltonian H = 0, which is what the
“topological” adjective is meant to imply.
The relationship in differential topology between manifolds and their boundaries is
called cobordism and the axioms represent a functorial relationship between two cor-
responding categories. They offer an organizing goal for associated results which in
many cases is difficult to fully achieve, but this is one of the challenges which physics
presents to mathematics.
In low dimensional cases where manifolds are all known up to diffeomorphism, we can
recognize the structure the axioms yield. In particular when d = 1, the circle is the
only compact 1-dimensional manifold and all surfaces with boundary are obtained by
gluing discs and pairs of pants. So Z(S 1 ) = V is a vector space and if we take a pair
of pants M with one incoming boundary and two outgoing ones we get a vector
\[ v \in V^* \otimes V^* \otimes V \]
Frobenius algebra structure on V . It is a theorem that any Frobenius algebra can be
viewed this way [1].
The even degree cohomology H ev (M, C) of a compact oriented manifold M is an
example, the trace is the integral of a de Rham representative of the top degree com-
ponent. For a complex projective variety, the quantum cohomology is a deformation
of this which involves the enumerative geometry of rational curves in M .
One could also take the group ring of a finite abelian group Γ – any u ∈ V is a linear
combination
\[ u = \sum_{i=0}^{n} \alpha_i g_i \]
with gi ∈ Γ, g0 = e the identity and αi ∈ C. The trace is defined by θ(u) = α0 .
Dually, the space V ∗ consists of functions on Γ and group multiplication corresponds
to convolution
\[ (f * g)(x) = \sum_{h \in \Gamma} f(xh^{-1})\, g(h) \]
and θ(f ) = f (e), evaluating at the identity. For a non-abelian group, the space
of functions on conjugacy classes is the character ring, spanned by the characters
of irreducible representations, and this with the convolution product as functions
on Γ is a commutative Frobenius algebra. The characters χ1 , χ2 of two inequivalent
irreducible representations satisfy χ1∗χ2 = 0 and χ1∗χ1 = |Γ|χ1 /χ1 (e) so the structure
of the algebra is quite simple.
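The two character identities can be verified by brute force in a small case. The following sketch assumes Γ = S3, realized as permutations of {0, 1, 2}. The choice of group and the helper names are illustrative only; the three irreducible characters used (trivial, sign, and the 2-dimensional one given by fixed points minus one) are the standard ones for S3.

```python
from itertools import permutations

# Elements of S3 as tuples: g[i] is the image of i.
GROUP = list(permutations(range(3)))


def compose(g, h):
    """The product gh, acting as (gh)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(3))


def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)


def convolve(f1, f2):
    """(f1 * f2)(x) = sum over h of f1(x h^-1) f2(h)."""
    return {x: sum(f1[compose(x, inverse(h))] * f2[h] for h in GROUP) for x in GROUP}


def sign(g):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1


def fixed_points(g):
    return sum(1 for i in range(3) if g[i] == i)


trivial = {g: 1 for g in GROUP}
alternating = {g: sign(g) for g in GROUP}
standard = {g: fixed_points(g) - 1 for g in GROUP}  # the 2-dimensional irreducible

zero = {g: 0 for g in GROUP}
orthogonal = all(
    convolve(a, b) == zero
    for a, b in [(trivial, alternating), (trivial, standard), (alternating, standard)]
)
# |S3| / chi(e) = 6 / 2 = 3 for the standard character.
idempotent_up_to_scale = convolve(standard, standard) == {g: 3 * standard[g] for g in GROUP}
```

So in this case the rescaled characters χ(e)χ/|Γ| are orthogonal idempotents, which is the simple structure referred to above.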
This example is related to the question of flat connections on surfaces: a case with
d = 1. A flat connection on a space X with holonomy in the finite group Γ is just a
principal Γ-bundle, a finite Galois covering. Up to equivalence these are parametrized
by Hom(π1 (X), Γ) modulo the conjugation action of Γ. So if X is the circle these are
the conjugacy classes.
If X = M is a surface with incoming boundary Σ− and outgoing boundary Σ+
then a flat connection on M restricts to one on the boundary, so a function on
Hom(π1 (Σ− ), Γ) can also be considered as a function on the finite set of flat connec-
tions on M . Summing over the inverse images of the map to Hom(π1 (Σ+ ), Γ) and
weighting by the inverse of the number of automorphisms of the connection defines
Z(M ) ∈ Z(Σ). Together with the Frobenius algebra above, this satisfies the axioms.
In physical language the conjugacy classes of Γ form a classical phase space and
the functions on it are the quantization. Then clearly we have the tensor product
property for the disjoint union. The invariant for a closed surface M is the number
of equivalence classes of flat Γ-connections.
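For a small group the invariant of a closed surface can be computed directly. The sketch below again assumes Γ = S3 and counts each flat connection with weight the inverse of its number of automorphisms, which by the orbit-stabilizer theorem equals |Hom(π1(M), Γ)|/|Γ|. The comparison with the sum over irreducible representations of (|Γ|/dim R)^(2g−2) is the classical character formula for finite groups (Frobenius, Mednykh); it is not stated in the text above and is included here as an assumption-labelled cross-check. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

GROUP = list(permutations(range(3)))  # the symmetric group S3
IDENTITY = tuple(range(3))


def compose(g, h):
    return tuple(g[h[i]] for i in range(3))


def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)


def commutator(a, b):
    return compose(compose(a, b), compose(inverse(a), inverse(b)))


def weighted_count(genus):
    """|Hom(pi_1(surface), S3)| / |S3| for a closed oriented surface of the given genus."""
    homs = 0
    for gens in product(GROUP, repeat=2 * genus):
        word = IDENTITY
        for a, b in zip(gens[0::2], gens[1::2]):
            word = compose(word, commutator(a, b))
        homs += word == IDENTITY  # the surface relation: product of commutators is 1
    return Fraction(homs, len(GROUP))


def character_formula(genus):
    """Sum over the irreducible representations of S3 (dimensions 1, 1, 2)."""
    return sum(Fraction(len(GROUP), d) ** (2 * genus - 2) for d in (1, 1, 2))


counts = {g: weighted_count(g) for g in (0, 1, 2)}
```

For the torus the count is 3, the number of conjugacy classes of S3, in agreement with the description of Z(S¹) as functions on conjugacy classes.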
Replacing Γ by a compact Lie group G gives an infinite-dimensional Hilbert space V
and Witten computed the Frobenius algebra structure giving a similar result to the
finite group case. Now instead of counting the number of flat connections one obtains
the volume of the moduli space M. Recall that Atiyah and Bott had observed the
natural symplectic form ω on M. Witten’s formula is
\[ \frac{1}{n!} \int_{\mathcal{M}} \omega^n = |Z(G)|\, \mathrm{vol}(G)^{2g-2} \sum_R (\dim R)^{2-2g} \]
where R runs through the irreducible representations of G and Z(G) is the centre and
dim M = 2n. Since ω is closed the volume is the cohomology class $[\omega]^n/n!$ evaluated
on a fundamental class and this brought into play the multiplicative properties of the
generators of the cohomology of M which Atiyah and Bott had determined. This
was being pursued by geometers from various points of view but one of them related
to moving TQFTs one dimension higher.
This was the context for the Jones polynomials but it is characteristic of the differ-
ence in approach of physicists and mathematicians, or at least those with a geometric
background, that instead of considering a simple space like the 2-sphere with distin-
guished points on it (which led to knots and braid groups), geometers prefer higher
genus surfaces with no embellishments. For a closed oriented surface Σ the analogue
of the set of conjugacy classes to discuss connections on the circle is the moduli space
M of flat G-connections. Its quantization is no longer the Hilbert space of functions
on M, but instead one has to apply geometric quantization as developed by Kostant
and Souriau in the 1970s. Firstly it is a Hilbert space of sections of a line bundle
rather than just functions – a line bundle with connection whose curvature is a mul-
tiple of the symplectic form ω, which introduces an integer k, the level, the Chern
class of this bundle. Secondly the section has to be constant relative to a polarization
of the symplectic structure and given the holomorphic interpretation of M this means
being holomorphic with respect to a complex structure such that ω is the Kähler
form. Determining its dimension is the first task and this, for a geometer, involves
the full multiplicative structure of the cohomology of M and not just the volume,
which gives just the leading term in a polynomial in the level k. The formula was
known to physicists (the “Verlinde formula”) in terms of conformal blocks.
Creating a genuine topological field theory out of this data was only achieved much
later by roundabout methods. Key issues were to prove, in a suitable setting, that
the space of sections is independent of the choice of complex structure on the surface
Σ and also proving the unitarity of Axiom 5. For this case at least, the structure was
a framework for proving what had to be true rather than a means of attaining it.
Atiyah pursued these ideas with enthusiasm, explaining them to a general audience
later in the little book [15]. A turning point occurred at a dinner in a restaurant
in 1988 at the International Conference on Mathematical Physics. As Witten recalls
[53],
In 1990 Fields Medals were awarded to Edward Witten, Vaughan Jones, Vladimir
Drinfeld and Shigefumi Mori. The Selection Committee consisted of L.Faddeev
(chair), M.Atiyah, J-M.Bismut, E.Bombieri, C.Fefferman, K. Iwasawa, P.Lax, and
I.Shafarevich. It was said at the time that there were three quantum prizewinners
(Drinfeld for the introduction of quantum groups) and one mathematician Mori.
Physics had arrived in mathematics.... but not without some controversy.
The paper provoked an avalanche of responses, published the following year. Michael
Atiyah, writing from the Master’s Lodge in Trinity College Cambridge (which, to-
gether with being President of the Royal Society took away most of his research time)
wrote [17]:
I find myself agreeing with much of the detail of the Jaffe-Quinn argu-
ment, especially the importance of distinguishing between results based
on rigorous proofs and those which have a heuristic basis. Overall, how-
ever, I rebel against their general tone and attitude which appears too
authoritarian.
My fundamental objection is that Jaffe and Quinn present a sanitized
view of mathematics which condemns the subject to an arthritic old age.
...The history of mathematics is full of instances of happy inspiration
triumphing over a lack of rigour..... The marvelous formulae emerging at
present from heuristic physical arguments are the modern counterparts of
Euler and Ramanujan, and they should be accepted in the same spirit of
gratitude tempered with caution.
• theoretical (or speculative) work, if taken too far, goes astray because it lacks
the feedback and corrections provided by rigorous proof
• further work is discouraged and confused by uncertainty about which parts are
reliable
• a dead area is often created when full credit is claimed by vigorous theorizers:
there is little incentive for cleaning up the debris that blocks further progress
The article was probably deliberately provocative, but engendered various responses,
such as
My main objection to JQ is that, in their search for credit for some indi-
viduals at the expense of others they consider rogues, they propose to set
up a police state within Charles (River) mathematics, and a world cop
beyond its borders. (B.Mandelbrot)
unorthodox view of physics when he changed, in the words of Bernd Schroers, [50]
from an “inadvertent physicist” to an “intentional one”. This focus on particles then
became more pronounced but here I confine myself to a few of the earlier examples.
the higher critical points and although the existence of such non-minimal solutions
to the Yang-Mills equations on R4 is well-established, the picture is not so clear.
Nevertheless from this approach it is clear that the “pseudoparticle” interpretation
of instantons captures a large amount of the topology of the moduli space.
\[ F_A = *\nabla_A \phi. \]
The boundary conditions for this integral to exist include |φ| → 1 as |x| → ∞ and
the Bogomolny equations describe the absolute minimum when λ = 0 but keeping
the condition |φ| → 1. In the case of an SU (2)-connection, this implies that on a
large sphere in R3 of radius R the eigenspaces of φ define line bundles of degree ±k
and here k is the magnetic charge. The charge 1 monopole, centred at the origin was
given in 1975 by Prasad and Sommerfield:
\[ A_i = -i\,\epsilon_{ijk}\,\sigma_j x_k\,\frac{\sinh r - r}{r^2 \sinh r}, \qquad \phi = i\,x_j \sigma_j\,\frac{r\cosh r - \sinh r}{r^2 \sinh r} \]
The Yang-Mills density |FA |2 for this solution is concentrated around the origin and
suggests a nonlinear particle-like object. In 1980, Jaffe and Taubes [41] produced
by analytic means an existence theorem for solutions representing “widely spaced
monopoles” where the energy density is approximately localized around k points in
R3 , further supporting the particle-like interpretation. Since the Higgs field for the
1-monopole vanishes at the origin one might also track the “locations” by the zeros
of the Higgs field.
I began working on monopoles in the early 1980s. Since the equations, but not the
boundary conditions, were equivalent to instantons on R4 invariant under translation
in one direction, there was clearly a twistor description, but here there was a more
direct interpretation of the twistor space as the space of oriented straight lines in R3 .
As a complex surface the twistor space is the total space of the line bundle O(2) over
CP1 . Holomorphic sections are quadratic in ζ, $\eta = a + b\zeta + c\zeta^2$, so that $(a, b, c) \in \mathbb{C}^3$
parametrizes these twistor lines and is the complexification of R3 . The null cone
through a point is the space of sections tangential to the corresponding line: for the
zero section this means $a + b\zeta + c\zeta^2$ has a double zero, or $b^2 - 4ac = 0$.
The real points for Euclidean space are $(a, b, c) = (x_1 + ix_2, 2ix_3, x_1 - ix_2)$ giving the
Euclidean metric quadratic form $4ac - b^2 = 4(x_1^2 + x_2^2 + x_3^2)$. The sections passing
through the point (η, ζ) in twistor space satisfy the equation
\[ \eta = (x_1 + ix_2) + 2ix_3\,\zeta + (x_1 - ix_2)\,\zeta^2 \]
which, taking the real and imaginary parts, gives linear equations for (x1 , x2 , x3 ) –
two planes in R3 which intersect in a line.
A solution of the Bogomolny equations now corresponds to a holomorphic vector
bundle on this minitwistor space, but there is no compactification to an algebraic
surface which works here – the transcendental nature of the 1-monopole solution
suggests this. Nevertheless algebraic geometry comes to the aid of finding solutions.
My own approach involved solving the ODE ∇s + iφs = 0 along the straight lines
in R3 . The lines which admitted an $L^2$ solution formed an algebraic curve of genus
$(k - 1)^2$ – the spectral curve – which is subject to a transcendental constraint.
Then in 1981 a preprint from CERN by the physicist Werner Nahm [45] gave an alter-
native construction. This involved three k × k matrices Ti (t) satisfying the equations
\[ \frac{dT_1}{dt} = [T_2, T_3], \qquad \frac{dT_2}{dt} = [T_3, T_1], \qquad \frac{dT_3}{dt} = [T_1, T_2]. \]
In the preprint Nahm shows how to solve these for k = 2 by using elliptic functions
which suggests a link with the spectral curve and indeed writing
$T = (T_1 + iT_2) + 2iT_3\zeta + (T_1 - iT_2)\zeta^2$ the equations become
\[ \frac{dT}{dt} = [\,iT_3 + (T_1 - iT_2)\zeta,\ T\,] \]
so that det(η − T ) is independent of t and is a polynomial p(η, ζ). Its zero set is the
spectral curve in general.
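The passage from Nahm's three equations to the single Lax equation is a purely algebraic identity, so it can be tested on arbitrary matrices. The sketch below is a numerical sanity check and not Nahm's construction. It draws random complex k × k matrices (the matrix size, the seed and the sample value of ζ are arbitrary assumptions), substitutes Nahm's equations for the derivatives, and measures the difference between dT/dt and the commutator. Once the Lax form holds, the traces of powers of T, and hence det(η − T), are constant in t.

```python
import random


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def combo(*terms):
    """Linear combination: combo((c1, M1), (c2, M2), ...) = c1*M1 + c2*M2 + ..."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]


def random_matrix(n, rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]


def lax_residual(t1, t2, t3, zeta):
    """Largest entry of dT/dt - [iT3 + (T1 - iT2) zeta, T] under Nahm's equations."""
    d1, d2, d3 = comm(t2, t3), comm(t3, t1), comm(t1, t2)  # Nahm's equations

    def lax(a1, a2, a3):
        # (a1 + i a2) + 2i a3 zeta + (a1 - i a2) zeta^2, linear in (a1, a2, a3)
        return combo((1, a1), (1j, a2), (2j * zeta, a3),
                     (zeta ** 2, a1), (-1j * zeta ** 2, a2))

    big_t = lax(t1, t2, t3)
    d_big_t = lax(d1, d2, d3)
    m = combo((1j, t3), (zeta, t1), (-1j * zeta, t2))
    diff = combo((1, d_big_t), (-1, comm(m, big_t)))
    return max(abs(x) for row in diff for x in row)


rng = random.Random(1)
k = 3
t1, t2, t3 = (random_matrix(k, rng) for _ in range(3))
residual = lax_residual(t1, t2, t3, zeta=complex(0.3, -0.7))
```

The residual is zero up to floating-point rounding, which reflects that the identity holds for all matrices and all ζ, not only for solutions with the monopole boundary conditions.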
The matrices are defined from the monopole by taking an orthonormal basis ψα of
the k-dimensional space of L2 eigenfunctions with eigenvalue t of the Dirac equation
coupled to the monopole and defining the k × k matrix
\[ (T_i)_{\alpha\beta} = \int_{\mathbb{R}^3} (x_i \psi_\alpha, \psi_\beta). \]
Though Atiyah was following these developments at the time he was more involved
with the issues discussed in the previous sections, although in 1987 he wrote about
the parallel version on hyperbolic 3-space. He did however begin to have discussions
concerning monopoles with Nick Manton, a theoretical physicist in Cambridge. Man-
ton had written [43] about conjectured forces between well-separated monopoles, but
also showed how the classical dynamics of slowly moving monopoles should be well
approximated by geodesic motion on the moduli space of static monopoles [44]. To
put this into effect required a knowledge of the natural Riemannian metric on the
moduli space.
The academic year 1983-84 I spent on sabbatical in Stony Brook and there with
physicists Roček, Lindström and Karlhede we developed the hyperkähler quotient
construction. A hyperkähler metric has three symplectic forms ω1 , ω2 , ω3 which are
Kähler forms for complex structures I, J, K which satisfy the algebraic condition
of quaternions. Until the work of Gibbons and Hawking in the late 1970s there
were hardly any explicit examples but the quotient construction, adapted from the
standard symplectic quotient, meant that any quaternionic representation yielded in
principle an example.
When I returned to Oxford and explained this to Michael, he immediately pointed
out that the Bogomolny equations had an interpretation as the zero set of an infinite-
dimensional hyperkähler moment map and as a consequence the moduli space had a
hyperkähler metric – one complex structure for each direction in R3 . Moreover he
wanted the metric for two monopoles in order to test Manton’s ideas about monopole
dynamics. The formalism for implementing the hyperkähler moment map produces
an extra circle factor – the hyperkähler manifold has dimension 4k instead of 4k − 1
for the genuine moduli space. For k = 1 this space is R3 given by a centre for the
monopole, but Atiyah argued that we should really think of the 4-manifold S 1 × R3
as describing a location and an internal U (1) phase.
The charge 2 moduli space is 8-dimensional but a centred version is 4-dimensional with
the rotation group acting isometrically. We determined the metric, which depended
on complete elliptic integrals, as one might have expected from the fact that the
spectral curves were elliptic, and calculated some geodesics. By chance, a seminar
Atiyah was giving was attended by members of IBM’s UK research laboratory in
Southampton and as a test of their parallel processors they proposed making a movie
of the monopole scattering. It can be viewed on my home page [38].
In its preparation, I observed Atiyah as an experimental scientist and it revealed one
other difference between mathematicians and physicists – an appreciation of scale.
For the monopole dynamics it took a great deal of iteration to get any indication of
the nonlinearity of the problem. The first “takes” just showed linear motion with
elastic scattering in various directions and it was a while before, by adjusting the
scale, one could see from the changing distribution of the Yang-Mills density the
nonlinear effect of two monopoles colliding and scattering.
Atiyah suggested after this work that we should write a book based on his Porter
Lectures [13] and it was here that the particle picture of monopoles entered in a
rather novel manner. We needed to describe the metric on the moduli space of
charge k monopoles but explicit formulas like the k = 2 situation were out of reach.
Hyperkähler manifolds however have twistor spaces which are complex manifolds
fibering over CP1 with a (twisted) holomorphic symplectic form along the fibres. In
the case of monopoles, given a direction in R3 , i.e. a point $(u_1, u_2, u_3) \in S^2 \cong \mathbb{CP}^1$,
the fibre over u is the moduli space with complex structure $u_1 I + u_2 J + u_3 K$. The
rotation group SO(3) acts by isometries and takes one complex structure to another so
the fibres are all holomorphically equivalent. Donaldson in fact had identified each
as a space of rational maps as follows.
In [31] Donaldson took Nahm’s description of monopoles and, for the complex struc-
ture in the x1 direction, broke the equations into a complex one and a real one,
introducing a fourth matrix T0 which can be removed eventually by a gauge transfor-
mation. With skew-adjoint matrices Ti one sets α = (T0 + iT1 )/2, β = (T2 + iT3 )/2
and the equations are
\[ \frac{d\beta}{dt} + 2[\alpha, \beta] = 0, \qquad \frac{d}{dt}(\alpha + \alpha^*) + 2\bigl([\alpha, \alpha^*] + [\beta, \beta^*]\bigr) = 0. \]
Acting on solutions to the complex equation by complex gauge transformations is the
analogue of the moment map/stability situation for holomorphic vector bundles which
began with Atiyah and Bott. Given the boundary conditions from Nahm, Donaldson
describes the “stability condition” on (α, β) for there to exist a solution. The end
result is a classification as pairs (B, v) where B is a complex symmetric k × k matrix
(B = β(1) at the midpoint t = 1 of the t-interval) and v ∈ Ck a cyclic vector, up to
the action of a complex orthogonal matrix. The rational map is then
\[ f(z) = v^T (z - B)^{-1} v \]
where (a, b) ∈ C∗ × C. We can take a rational map
\[ f(z) = \sum_{i=1}^{k} \frac{\alpha_i}{z - \beta_i} = \frac{p(z)}{q(z)} \]
and think of this as an approximate superposition of single monopoles located at
(− log |αi |, βi ) and phase αi /|αi |, though this description is of course direction-dependent.
The βi above are the zeros of the denominator q(z). In general if q(βi ) = 0, then
p(βi ) ≠ 0 since the degree of the map f is k, so f (z) defines an unordered sequence of
points ((p(β1 ), β1 ), . . . , (p(βk ), βk )) ∈ (C∗ × C)k , i.e. in the quotient by the symmetric
group Σk . The symmetric product of k surface factors is singular but the space of
rational maps is smooth (the complement of the resultant equation R(p, q) = 0 in
Ck × Ck ) so the space of rational maps is a smooth resolution of the singularities of
the symmetric product. Or put another way, the monopole moduli space smoothes
out the singularities that particles with U (1)-charges acquire under collision.
In algebraic geometry the standard way to resolve the singularities of the symmetric
product of a complex surface X is the Hilbert scheme X [k] , the space of ideal sheaves
I ⊂ OX such that dim OX /I = k. Moreover if X has a holomorphic symplectic form
then the natural form on the product X k extends to a symplectic form on the Hilbert
scheme. The space of rational maps contains no projective spaces as the Hilbert
scheme does, but Atiyah in Chapter 6 of [13] introduces the notion of transverse
Hilbert scheme. This applies to a surface X with a projection π : X → C. The
transverse Hilbert scheme is the subspace of points $D \in X^{[k]}$ such that $\pi$ restricted to $D$ is an
isomorphism onto its scheme-theoretic image. Surprisingly, according to Bielawski
[26] this is smooth for all dimensions of X, though the Hilbert scheme itself is not for
dim X > 2.
To describe the twistor space now, one takes the twistor space for the flat metric on
S 1 × R3 (the quotient of O(1) ⊕ O(1) by the action of Z generated by the additive
action of (ζ, 1)) and applies the fibrewise transverse Hilbert scheme, a more canonical
description than appealing to rational maps.
The symplectic form coming from dz/z ∧ dw on C∗ × C is
$$\omega = \sum_{i=1}^{k} \frac{dp(\beta_i)}{p(\beta_i)} \wedge d\beta_i$$
and this we know extends to the Hilbert scheme. It can be seen more invariantly as
follows [34].
From the expression above one can see that functions of the coefficients of p Poisson
commute as do functions of the coefficients of q. A point z defines the function p(z)
of the numerator p and q(z) of the denominator in p/q. Then the following formula
gives the Poisson bracket even when there are multiple zeros of q:
$$\{p(z), q(w)\} = \frac{p(z)q(w) - q(z)p(w)}{z - w}$$
This is classically the Bezoutian.
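
For distinct zeros the formula can be checked numerically against the symplectic form. The Python sketch below uses the coordinates a_i = p(β_i) and β_i, and assumes the sign convention {log a_i, β_j} = δ_ij for the bracket defined by ω (an assumption; the opposite convention changes the overall sign). Under it, {p(z), q(w)} = Σ_i (∂p(z)/∂a_i) a_i (∂q(w)/∂β_i), which is compared with the Bezoutian expression. All function names are illustrative.

```python
# Sketch: compare the bracket computed from omega with the Bezoutian formula.

def lagrange_basis(betas, i, z):
    """L_i(z) = prod_{j != i} (z - beta_j) / (beta_i - beta_j)."""
    out = 1 + 0j
    for j, b in enumerate(betas):
        if j != i:
            out *= (z - b) / (betas[i] - b)
    return out

def q_eval(betas, w):
    """q(w) = prod_j (w - beta_j)."""
    out = 1 + 0j
    for b in betas:
        out *= (w - b)
    return out

def p_eval(values, betas, z):
    """p has degree at most k - 1 and is fixed by its values a_i at the beta_i."""
    return sum(a * lagrange_basis(betas, i, z) for i, a in enumerate(values))

def bracket_from_omega(values, betas, z, w):
    """sum_i dp(z)/da_i * a_i * dq(w)/dbeta_i, assuming {a_i, beta_i} = a_i."""
    total = 0j
    for i, a in enumerate(values):
        dq = -1 + 0j                      # dq(w)/dbeta_i = -prod_{j != i} (w - beta_j)
        for j, b in enumerate(betas):
            if j != i:
                dq *= (w - b)
        total += lagrange_basis(betas, i, z) * a * dq
    return total

def bezoutian(values, betas, z, w):
    """(p(z) q(w) - q(z) p(w)) / (z - w)."""
    num = (p_eval(values, betas, z) * q_eval(betas, w)
           - q_eval(betas, z) * p_eval(values, betas, w))
    return num / (z - w)

# Hypothetical sample data with distinct zeros of q.
values = [1 + 2j, -0.5, 0.3j]
betas = [0.2, -1 + 0.5j, 1.5j]
z, w = 0.7 - 0.1j, -0.4 + 0.9j
difference = abs(bracket_from_omega(values, betas, z, w) - bezoutian(values, betas, z, w))
```

The check only covers distinct zeros; the point of the Bezoutian form is that it remains meaningful when zeros of q coincide, where the coordinates (a_i, β_i) break down.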
9.3 Skyrmions
Atiyah’s contact with Manton led to another field of exploration of the particle-like
world – the Skyrme model. Mathematically it involves maps f : R3 → SU (2) and
minima of a certain functional.
We have the derivative $df : T\mathbb{R}^3 \to T SU(2)$ which is a 1-form on $\mathbb{R}^3$ with values in
the vector bundle $f^* T SU(2)$, and the usual harmonic map functional is the $L^2$ norm
square of this. For the Skyrme functional one introduces $\Lambda^2 df : \Lambda^2 T\mathbb{R}^3 \to \Lambda^2 T SU(2)$
which is a 2-form with values in $f^* \Lambda^2 T SU(2) \cong f^* T SU(2)$ and the Skyrme energy is
$$E = \int_{\mathbb{R}^3} c_1 |df|^2 + c_2 |\Lambda^2 df|^2 \tag{4}$$
Since the proper large-N effective theory is unknown, we will consider here
a crude description in which the large-N theory is assumed to be a theory
of pions only. In this context, it is necessary to add a non-minimal term to
the non-linear sigma model to prevent the solitons from shrinking to zero-
size..... .... Although the Skyrme model is only a rough description, since
it omits the other mesons and interactions that are present in the large-N
limit of QCD, we regard it as a good model for testing the reasonableness
of a soliton description of nucleons.
In the same paper the authors reduce the determination of the spherically symmetric
k = 1 solution to an ODE and solve it numerically. The obvious Ansatz for f is
9.4 The Berry-Robbins problem
Emerging from years of administration as President of the Royal Society and Master
of Trinity College Cambridge, at the turn of the century Michael latched on to a
problem suggested to him by Michael Berry [25]. The question was simple – is there
a natural map
Cn (R3 ) → U (n)/T
which commutes with the action of the symmetric group Σn ?
Here Cn (R3 ) is, as before, the configuration space of ordered n-tuples of distinct
points x1 , . . . , xn ∈ R3 and U (n)/T is the flag manifold, the quotient space of the
unitary group by its diagonal elements T . The symmetric group acts on the left hand
side by changing the order and on the right as the Weyl group N (T )/T . Part of
the appeal was the association of “particles” in R3 with complex vectors in Cn , a
shadow of quantization, yet it was a purely mathematical question which led Atiyah
into various different fields.
In [18] he gave an elementary example of such a continuous map and used this later
to discuss its action in homotopy, in particular SO(3)-equivariant cohomology. But
he disliked the construction, presumably as not “a proof consistent with the elegance
of the problem” as he would often comment. He offered a different one, which however was
never fully established:
For each pair of points one associates the direction of xj − xi as a point tij ∈ S 2 iden-
tified as CP1 (this is the celestial sphere that Penrose regarded as the key component
of twistor theory). Then define $p_i$ to be the polynomial of degree $n - 1$ with roots
$t_{ij}$, $j \neq i$. The space of polynomials of this degree is n-dimensional and if the $p_i$ are
linearly independent they define a flag
$$\langle p_1 \rangle \subset \langle p_1, p_2 \rangle \subset \langle p_1, p_2, p_3 \rangle \subset \cdots$$
This map not only commutes with the symmetric group, but it is also invariant under
translations in R3 and equivariant under rotations where SO(3) acts on the projective
space CPn−1 via the n-dimensional irreducible representation of its covering SU (2).
The problem was proving linear independence.
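
The construction is easy to experiment with numerically. The following Python sketch forms the polynomials p_i for a given configuration and computes the determinant of their coefficient matrix, whose nonvanishing is the linear independence in question. The identification of S^2 with CP^1 by stereographic projection, the use of homogeneous coordinates to allow a root at infinity, and all function names are assumptions made for illustration. The determinant is not normalised, so it is not the function V considered by Atiyah and Sutcliffe.

```python
import math

def direction_to_homogeneous(v):
    """Unit vector (x, y, z) -> homogeneous coordinates [a : b] of the root a/b on CP^1.

    Two equivalent charts of stereographic projection are used so that
    neither pole gives the degenerate pair [0 : 0].
    """
    x, y, z = v
    if z >= 0:
        return complex(1 + z, 0), complex(x, -y)
    return complex(x, y), complex(1 - z, 0)

def poly_mul(a, b):
    """Product of polynomials given as coefficient lists, constant term first."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def atiyah_polynomials(points):
    """For each i, the polynomial with roots t_ij (j != i), as a list of n coefficients."""
    rows = []
    for i, xi in enumerate(points):
        p = [1 + 0j]
        for j, xj in enumerate(points):
            if j == i:
                continue
            d = [xj[k] - xi[k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            a, b = direction_to_homogeneous([c / r for c in d])
            p = poly_mul(p, [-a, b])      # linear factor b*s - a with root s = a/b
        rows.append(p)
    return rows

def determinant(rows):
    """Determinant of a square complex matrix by Gaussian elimination with pivoting."""
    m = [list(r) for r in rows]
    n = len(m)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) == 0:
            return 0j
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

# Hypothetical configuration of three distinct points in R^3.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
det = determinant(atiyah_polynomials(points))
```

Permuting the points permutes the rows, and translating them leaves the directions unchanged, so the absolute value of the determinant is unaffected by either operation.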
The ad hoc continuous map follows the same pattern but instead one fixes an origin
and identifies any sphere centred at the origin with the unit sphere of directions.
Then for each i, one takes a polynomial whose roots are defined by xj /|xj | if xj is
outside the sphere |x| = |xi | and if xj is inside the root is yj /|yj | where |yj | = |xi | and
yj lies on the line joining xi to xj . Then linear independence is proved by induction.
Atiyah was unhappy with his solution, in particular because it was not translation-invariant,
and this inhibited his interest in “clusters” of well-separated points analogous
to his earlier consideration of monopoles. He pursued the more invariant conjectural
solution together with Sutcliffe [20], focusing on a determinant whose nonvanishing
would give linear independence of the polynomials. This becomes a function V on
the configuration space which is conjecturally greater than or equal to one. Numer-
ical evaluation of V -minimizing polyhedra gave graphics which were reminiscent of
the numerical work on monopoles or skyrmions, and proofs for small n provided the
opportunity for some elementary geometry which Atiyah would delight in introducing
in his papers when appropriate.
The best solution (in the author’s opinion) came from using Nahm’s equations, with
their origins in gauge theory [19]. There was also a more general setting replacing
U (n) by a simple Lie group G and its maximal torus. Instead of the configuration
space one takes the Cartan subalgebra h, tensors with R3 and removes the kernels ∆
of the root homomorphisms α ⊗ 1 : h ⊗ R3 → R3 . The theorem then is that there is
a map
h ⊗ R3 \∆ → G/T
which commutes with the action of the Weyl group. The solution is also equivariant
with respect to SU (2) acting as SO(3) on the second factor in the left hand side and
the principal three-dimensional subgroup of G on the right hand side.
Nahm’s equations are
$$\frac{dT_1}{dt} = [T_2, T_3], \qquad \frac{dT_2}{dt} = [T_3, T_1], \qquad \frac{dT_3}{dt} = [T_1, T_2]$$
and clearly make sense if Ti take values in any Lie algebra. If they acquire a simple
pole, at t = 0 say, then Ti = Ri /t + . . . and the equations give R1 = −[R2 , R3 ] etc.
which is a homomorphism from the Lie algebra of SU (2) to g.
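
The residue condition can be seen in the simplest case g = su(2). The Python sketch below takes the basis e_j = −(i/2)σ_j, which satisfies [e_1, e_2] = e_3 and its cyclic permutations, sets R_j = −e_j, and checks that T_j(t) = R_j/t solves Nahm's equations exactly. The choice of basis and the helper names are assumptions made for illustration.

```python
# Sketch: a pure simple pole T_i = R_i / t solving Nahm's equations in su(2).

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in a]

def max_abs_diff(a, b):
    """Largest absolute entry of a - b."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# e_j = -(i/2) sigma_j, with sigma_j the Pauli matrices.
e1 = [[0, -0.5j], [-0.5j, 0]]
e2 = [[0, -0.5], [0.5, 0]]
e3 = [[-0.5j, 0], [0, 0.5j]]
E = [e1, e2, e3]

# Residues satisfying R_1 = -[R_2, R_3] and cyclic permutations.
R = [scale(-1, e) for e in E]

def nahm_residual(residues, t):
    """Largest entry of dT_i/dt - [T_j, T_k] for T_i(t) = residues[i] / t."""
    T = [scale(1.0 / t, r) for r in residues]
    dT = [scale(-1.0 / t ** 2, r) for r in residues]   # exact derivative of R_i / t
    worst = 0.0
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        worst = max(worst, max_abs_diff(dT[i], commutator(T[j], T[k])))
    return worst

residual = nahm_residual(R, 0.7)
```

With the opposite sign, taking the e_j themselves as residues, the equations fail, which reflects the minus sign in R_1 = −[R_2, R_3].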
For the original application to SU (2) monopoles there is a pole at each end of a finite
interval (0, 2). In the present case, following work of Kronheimer, one takes solutions on
(0, ∞) with a pole at t = 0 given by the principal three-dimensional subgroup and
as t → ∞ the Ti approach a regular commuting triple in g. Fixing the subgroup
ρ : SU (2) → G, an existence theorem shows that there is a manifold N′ of solutions
with these boundary conditions and hence a map N′ → g ⊗ R3 giving the asymptotic
value τ.
Three commuting regular elements lie in a Cartan subalgebra so the orbit Gτ meets a
fixed h ⊗ R3 in an orbit of the Weyl group W. This gives a map N′ → (h ⊗ R3 \∆)/W
and a W -covering N maps N → h ⊗ R3 \∆. Fixing τ ∈ h ⊗ R3 \∆ identifies Gτ with
G/T and so there is a map N → G/T .
All of this works for any three-dimensional subgroup but for the principal one the
map N → h ⊗ R3 \∆ is an isomorphism and thus gives the required map
φ : h ⊗ R3 \∆ → G/T.
The paper [19] makes interesting speculations about Hecke algebras and Kazhdan-
Lusztig theory which go far beyond the “particle” aspects.
10 Conclusion
This article by no means exhausts Michael Atiyah’s interactions with physics. In
particular the students he acquired on his return to Oxford from Princeton followed
his physics interests from the mid 1970s onwards: there was Simon Donaldson of
course with the applications of gauge theory to 4-manifold topology, Michael Murray
describing monopoles for a general simple group, Peter Kronheimer producing a con-
struction and classification of ALE gravitational instantons and Lisa Jeffrey providing
a mathematically rigorous proof of results on the asymptotics of the three-manifold
invariants of Witten and Reshetikhin and Turaev. Also Ruth Lawrence’s thesis on
braid group representations was motivated by the work of Jones and Witten.
In later years he wrote a long paper with Witten [21] on G2 -manifolds and, in response
to the physicist’s notion of D-brane charges, wrote one on twisted K-theory with
Graeme Segal [22]. Even a paper as mathematical as [11] on the Dedekind η-function
contains the remark
References
[1] L. Abrams, Two-dimensional topological quantum field theories and Frobenius
algebras, J. Knot Theory Ramifications 5 (1996) 569–587.
[2] G.Adkins, C.Nappi & E.Witten, Static properties of nucleons in the Skyrme
model, Nucl. Phys. B228 (1983) 552–566.
[4] M.F.Atiyah & I.M.Singer, The index of elliptic operators IV, Ann. of Math. 93
(1971) 119–138.
[5] M.F.Atiyah & R.S.Ward, Instantons and algebraic geometry, Commun. Math.
Phys. 55 (1977) 117–124.
[6] M.F. Atiyah, V.G. Drinfeld, N.J.Hitchin & Y.I. Manin, Construction of instan-
tons, Phys. Lett. A 65 (1978), 185–187.
[8] M.F.Atiyah, Convexity and commuting Hamiltonians, Bull. Lond. Math Soc. 14
(1982) 1–15.
[10] M.F.Atiyah & I.M.Singer, Dirac operators coupled to vector potentials, Proc.
Natl. Acad. Sci. USA, 81 (1984) 2597–2600.
[11] M.F.Atiyah, The logarithm of the Dedekind η-function, Math.Ann. 278 (1987)
335–380.
[12] M.F.Atiyah, Topological quantum field theory, Publ. math. IHES 68 (1988) 175–
186.
[13] M.F.Atiyah & N.Hitchin, “The geometry and dynamics of magnetic monopoles”,
Princeton Univ. Press (1988).
[14] M.F.Atiyah & N.S.Manton, Skyrmions from instantons, Phys. Lett. B222 (1989)
438–442.
[15] M.F.Atiyah, “The geometry and physics of knots”, Cambridge Univ. Press (1990).
[16] M.F.Atiyah & N.S.Manton, Geometry and kinematics of two skyrmions, Com-
mun. Math. Phys. 152 (1993) 391–422.
[17] M.F.Atiyah et al, Responses to “Theoretical Mathematics”: towards a cultural
synthesis of mathematics and theoretical physics, by A.Jaffe and F.Quinn, Bull
AMS 30 (1994) 178–207.
[18] M.F.Atiyah, The geometry of classical particles, Surveys in Differential Geometry
7 (2001) 1–15.
[19] M.F.Atiyah & R.Bielawski, Nahm’s equations, configuration spaces and flag man-
ifolds, Bull. Braz. Math. Soc. 33 (2002) 157–176.
[20] M.F.Atiyah & P.M.Sutcliffe, The geometry of point particles, Proc. R. Soc. London A 458 (2002)
1089–1115.
[21] M.F.Atiyah & E.Witten, M-theory dynamics on a manifold of G2 holonomy, Adv.
Theor. Math. Phys. 6 (2003) 1–106.
[22] M.F.Atiyah & G.Segal, Twisted K-theory, Ukr. Math. Bull. 1 (2004) 291-334.
[23] M.F.Atiyah, Collected Works Vol 6, Oxford University Press (2004) 11.
[24] M.F.Atiyah, Geometry and Physics of the 20th Century, in “Géométrie au XXe
siècle, 1930-2000: Histoire et horizons”, J.Kouneiher et al (eds.), Hermann, Paris
(2005) 4–9. Collected Works Vol 7 259–264.
[25] M.Berry & J.Robbins, Indistinguishability for quantum particles: spin, statistics
and the geometric phase, Proc. R. Soc. London A 453 (1997) 1771–1790.
[26] R.Bielawski, Transverse Hilbert schemes, bi-Hamiltonian systems, and hy-
perkähler geometry, arXiv:2001.05669.
[27] C.Boyer, J.Hurtubise, B.Mann & J.Milgram, The topology of instanton
moduli spaces. I. The Atiyah–Jones conjecture, Ann. of Math. 137 (1993) 561–609.
[28] E.F. Corrigan, D.B. Fairlie, S. Templeton & P. Goddard, A Green function for
the general self-dual gauge field. Nucl. Phys. B 140 (1978) 31–44.
[29] J.-P. Demailly, Champs magnétiques et inégalités de Morse pour la d''-cohomologie,
Ann. Inst. Fourier 35 (1985) 189–229.
[30] S.K.Donaldson, A new proof of a theorem of Narasimhan and Seshadri, J. Dif-
ferential Geom. 18 (1983) 269 – 277.
[31] S.K.Donaldson, Nahm’s equations and the classification of monopoles, Commun.
Math. Phys. 96 (1984) 387–407.
[32] S.K.Donaldson, Anti-self-dual Yang-Mills connections over complex algebraic
surfaces and stable vector bundles, Proc. Lond. Math. Soc. 50 (1985) 1– 26.
[34] L.Faybusovich & M.Gekhtman, Poisson brackets on rational functions and multi-
Hamiltonian structure from integrable lattices, Phys. Lett. A 272 (2000) 236–244.
[36] E.Getzler, Supersymmetry and the Atiyah-Singer index theorem, Commun. Math.
Phys. 90 (1983) 161–173.
[41] A. Jaffe & C. Taubes, “Vortices and monopoles”, Boston, Birkhäuser, (1980).
[42] F.C.Kirwan, Moment maps and convexity: memories of Michael Atiyah, Notices
of the AMS 66 (2019) 540–567.
[43] N.S.Manton, The force between ’t Hooft-Polyakov monopoles, Nucl. Phys. B126
(1977) 525–541.
[44] N.S.Manton, A remark on the scattering of BPS monopoles, Phys. Lett. 110B
(1982) 54–56.
[45] W. Nahm, All self-dual monopoles for arbitrary gauge groups, CERN preprint
TH.3172-CERN (1981).
[46] M.S.Narasimhan & C.S.Seshadri, Stable and unitary vector bundles on a compact
Riemann surface, Ann. of Math. 82 (1965) 540–567.
[48] R.Penrose, Twistor functions and sheaf cohomology, Twistor Newsletter 2 10th
June (1976).
[49] R.Penrose, Solutions of the zero rest mass equations, J.Math.Phys. 10 38 (1969).
[50] B.Schroers, Michael Atiyah and Physics: the later years, Notices of the AMS 66
(2019) 1849–1851.
[51] G.B.Segal, The definition of conformal field theory in “ Topology, geometry and
quantum field theory,” U.Tillmann (ed), London Math. Soc. Lecture Note Ser.,
308, Cambridge Univ. Press, Cambridge, (2004) 421-577
[52] S.Shatz, The decomposition and specialization of algebraic families of vector bun-
dles, Compositio math. 35 (1977) 163 – 187.
[53] E.Witten, Michael Atiyah and physics, Notices of the AMS 66 (2019) 1837–1839.
[56] E.Witten, Supersymmetry and Morse theory, J.Differential Geom. 17 (1982) 661–
692.