2017 215C Lectures
2017 215C Lectures
Spring 2017
Lecturer: McGreevy
These lecture notes live here. Please email corrections to mcgreevy at physics dot
ucsd dot edu.
1
Contents
0.1 Introductory remarks for the third quarter . . . . . . . . . . . . . . . . 4
0.2 Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
0.3 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
12 Anomalies 69
14 Duality 110
14.1 XY transition from superfluid to Mott insulator, and T-duality . . . . . 110
14.2 (2+1)-d XY is dual to (2+1)d electrodynamics . . . . . . . . . . . . . . 120
14.3 Deconfined Quantum Criticality . . . . . . . . . . . . . . . . . . . . . . 133
2
15.7 Effective field theory of Fermi surfaces . . . . . . . . . . . . . . . . . . 165
3
0.1 Introductory remarks for the third quarter
Last quarter, we grappled with the Wilsonian perspective on the RG, which (among
many other victories) provides an explanation of the totalitarian principle of physics,
that anything that can happen must happen. More precisely, this means that the
Hamiltonian should contain all terms consistent with symmetries, organized according
to an expansion in decreasing relevance to low energy physics.
This leads directly to the idea of effective field theory, or, how to do physics without
a theory of everything. (You may notice that all the physics that has been done has
been done without a theory of everything.) It is a weaponized version of selective
inattention.
So here are some goals, both practical and philosophical:
And also like sod, each little patch of degrees of freedom only interacts with its
neighboring patches: this property of sod and of QFT is called locality. More
precisely, in a quantum mechanical system, we specify the degrees of freedom by
their Hilbert space; by an extensive system, I’ll mean one in which the Hilbert
space is of the form H = ⊗patches of space Hpatch and the interactions are local
P
H = patches H(nearby patches).
By ‘coarse-graining’ I mean ignoring things we don’t care about, or rather only
paying attention to them to the extent that they affect the things we do care
about.
To continue the sod example in 2+1 dimensions, a person laying the sod in the
picture above cares that the sod doesn’t fall apart, and rolls nicely onto the
ground (as long as we don’t do high-energy probes like bending it violently or
trying to lay it down too quickly). These long-wavelength properties of rigidity
and elasticity are collective, emergent properties of the microscopic constituents
(sod molecules) – we can describe the dynamics involved in covering the Earth
4
with sod (never mind whether this is a good idea in a desert climate) without
knowing the microscopic theory of the sod molecules (‘grass’). Our job is to think
about the relationship between the microscopic model (grassodynamics) and its
macroscopic counterpart (in this case, suburban landscaping). In my experience,
learning to do this is approximately synonymous with understanding.
• I would like to convince you that “non-renormalizable” does not mean “not worth
your attention,” and explain the incredibly useful notion of an Effective Field
Theory.
• There is more to QFT than perturbation theory about free fields in a Fock vac-
uum. In particular, we will spend some time thinking about non-perturbative
physics, effects of topology, solitons. Topology is one tool for making precise
statements without perturbation theory (the basic idea: if we know something is
an integer, it is easy to get many digits of precision!).
• There is more to QFT than the S-matrix. In a particle-physics QFT course (like
this year’s 215A!) you learn that the purpose in life of correlation functions or
green’s functions or off-shell amplitudes is that they have poles (at pµ pµ −m2 = 0)
5
whose residues are the S-matrix elements, which are what you measure (or better,
are the distribution you sample) when you scatter the particles which are the
quanta of the fields of the QFT. I want to make two extended points about this:
1. In many physical contexts where QFT is relevant, you can actually measure
the off-shell stuff. This is yet another reason why including condensed matter
in our field of view will deepen our understanding of QFT.
2. This is good, because the Green’s functions don’t always have simple poles!
There are lots of interesting field theories where the Green’s functions in-
1
stead have power-law singularities, like G(p) ∼ p2∆ . If you Fourier trans-
form this, you don’t get an exponentially-localized packet. The elementary
excitations created by a field whose two point function does this are not
particles. (Any conformal field theory (CFT) is an example of this.) The
theory of particles (and their dance of creation and annihilation and so on)
is an important but proper subset of QFT.
• The crux of many problems in physics is the correct choice of variables with
which to label the degrees of freedom. Often the best choice is very different
from the obvious choice; a name for this phenomenon is ‘duality’. We will study
many examples of it (Kramers-Wannier, Jordan-Wigner, bosonization, Wegner,
particle-vortex, perhaps others). This word is dangerous (at one point it was one
of the forbidden words on my blackboard) because it is about ambiguities in our
(physics) language. I would like to reclaim it.
An important bias in deciding what is meant by ‘correct’ or ‘best’ in the previous
paragraph is: we will be interested in low-energy and long-wavelength physics,
near the groundstate. For one thing, this is the aspect of the present subject which
is like ‘elementary particle physics’; the high-energy physics of these systems is
of a very different nature and bears little resemblance to the field often called
‘high-energy physics’ (for example, there is volume-law entanglement).
• An important goal for the course is demonstrating that many fancy phenomena
precious to particle physicists can emerge from very humble origins in the kinds of
(completely well-defined) local quantum lattice models we will study. Here I have
in mind: fermions, gauge theory, photons, anyons, strings, topological solitons,
CFT, and many other sources of wonder I’m forgetting right now.
6
Here is a confession, related to several of the points above: The following comment
in the book Advanced Quantum Mechanics by Sakurai had a big effect on my education
in physics: ... we see a number of sophisticated, yet uneducated, theoreticians who
are conversant in the LSZ formalism of the Heisenberg field operators, but do not know
why an excited atom radiates, or are ignorant of the quantum-theoretic derivation of
Rayleigh’s law that accounts for the blueness of the sky. I read this comment during
my first year of graduate school and it could not have applied more aptly to me. I
have been trying to correct the defects in my own education which this exemplifies ever
since. I bet most of you know more about the color of the sky than I did when I was
your age, but we will come back to this question. (If necessary, we will also come back
to the radiation from excited atoms.)
So I intend that there will be two themes of this course: coarse-graining and topol-
ogy. Both of these concepts are important in both hep-th and in cond-mat. Topics
which I hope to discuss include:
• some illustrations of effective field theory (perhaps cleverly mixed in with the
other subjects)
• more deep mysteries of gauge theory and its emergence in physical systems.
• If there is demand for it, we will discuss non-abelian gauge theory, in perturbation
theory: Fadeev-Popov ghosts, and the sign of the Yang-Mills beta function. Sim-
ilarly, we can talk about other topics relevant to the Standard Model of particle
physics if there is demand.
• Large-N expansions?
• duality.
0.2 Sources
The material in these notes is collected from many places, among which I should
mention in particular the following:
Peskin and Schroeder, An introduction to quantum field theory (Wiley)
7
Zee, Quantum Field Theory (Princeton, 2d Edition)
Banks, Modern Quantum Field Theory: A Concise Introduction (Cambridge)
Schwartz, Quantum field theory and the standard model (Cambridge)
Coleman, Aspects of Symmetry (Cambridge)
Polyakov, Gauge Field and Strings (Harwood)
Wen, Quantum field theory of many-body systems (Oxford)
Sachdev, Quantum Phase Transitions (Cambridge, 2d Edition)
Many other bits of wisdom come from the Berkeley QFT courses of Prof. L. Hall
and Prof. M. Halpern.
8
0.3 Conventions
Following most QFT books, I am going to use the + − −− signature convention for
the Minkowski metric. I am used to the other convention, where time is the weird one,
so I’ll need your help checking my signs. More explicitly, denoting a small spacetime
displacement as dxµ ≡ (dt, d~x)µ , the Lorentz-invariant distance is:
+1 0 0 0
0 −1 0 0
ds2 = +dt2 − d~x · d~x = ηµν dxµ dxν with η µν = ηµν = .
0 0 −1 0
0 0 0 −1 µν
µ
(spacelike is negative). We will also write ∂µ ≡ ∂ ~x
= ∂t , ∇ , and ∂ µ ≡ η µν ∂ν . I’ll
∂xµ
use µ, ν... for Lorentz indices, and i, k, ... for spatial indices.
The convention that repeated indices are summed is always in effect unless otherwise
indicated.
A consequence of the fact that english and math are written from left to right is
that time goes to the left.
h
A useful generalization of the shorthand ~ ≡ 2π
is
dk
d̄k ≡ .
2π
d
I will also write /δ (q) ≡ (2π)d δ (d) (q). I will try to be consistent about writing Fourier
transforms as
dd k ikx ˜
Z Z
e f (k) ≡ d̄d k eikx f˜(k) ≡ f (x).
(2π)d
IFF ≡ if and only if.
RHS ≡ right-hand side. LHS ≡ left-hand side. BHS ≡ both-hand side.
IBP ≡ integration by parts. WLOG ≡ without loss of generality.
+O(xn ) ≡ plus terms which go like xn (and higher powers) when x is small.
+h.c. ≡ plus hermitian conjugate.
We work in units where ~ and the speed of light, c, are equal to one unless otherwise
noted. When I say ‘Peskin’ I usually mean ‘Peskin & Schroeder’.
Please tell me if you find typos or errors or violations of the rules above.
9
11 Resolving the identity
The following is an advertisement: When studying a quantum mechanical system, isn’t
it annoying to have to worry about the order in which you write the symbols? What
if they don’t commute?! If you have this problem, too, the path integral is for you. In
the path integral, the symbols are just integration variables – just ordinary numbers,
and you can write them in whatever order you want. You can write them upside down
if you want. You can even change variables in the integral (Jacobian not included).
(What order do the operators end up in? As we showed last quarter, in the kinds of
path integrals we’re thinking about, they end up in time-order. If you want a different
order, you will want to use the Schwinger-Keldysh extension package, sold separately.)
This section is about how to go back and forth from Hilbert space to path integral
representations, aka Hamiltonian and Lagrangian descriptions of QFT. You make a
path integral representation of some physical quantity by sticking lots of 1s in there,
and then resolving each of the identity operators in some basis that you like. Different
bases, different integrals. Some are useful, mostly because we have intuition for the
behavior of integrals.
Let me say a few introductory words about quantum spin systems, the flagship
family of examples of well-regulated QFTs. Such a thing is a collection of two-state
systems (aka qbits) Hj = span{|↑j i , |↓j i} distributed over space and coupled somehow:
O
H= Hj , dim (H) = 2N
j
10
They satisfy
XY = iZ, XZ = −ZX, X2 = 1,
and all cyclic permutations X → Y → Z → X of these statements.
Multiple qbits: If we have more than one site, the paulis on different sites commute:
In this section we’re going to study the ‘path integral’ associated with the Z-basis
resolution, 1 = |+i h+| + |−i h−|. The labels on the states are classical spins ±1 (or
equivalently, classical bits). I put ‘path integral’ in quotes because it is instead a ‘path
sum’, since the integration variables are discrete. This discussion will allow us to further
harness our knowledge of stat mech for QFT purposes. An important conclusion at
which we will arrive is the (inverse) relationship between the correlation length and
the energy gap above the groundstate.
One qbit from classical Ising chain. Let’s begin with the classical ising model
in a (longitudinal) magnetic field:
X P P
Z= e−K hjli sj sl −h j sj . (11.1)
{sj }
Here I am imagining we have classical spins sj = ±1 at each site of some graph, and
hjli denotes pairs of sites which share a link in the graph. You might be tempted to call
K the inverse temperature, which is how we would interpret if we were doing classical
stat mech; resist the temptation.
First, let’s think about the case when the graph in (11.1) is just a
chain:
X Mτ
X Mτ
X
−S
Z1 = e , S = −K sl sl+1 − h sl (11.2)
{sl =±1} l=1 l=1
These ss are now just Mτ numbers, each ±1 – there are 2Mτ terms in this sum. (Notice
that the field h breaks the s → −s symmetry of the summand.) The parameter K > 0
is the ‘inverse temperature’ in the Boltzmann distribution; I put these words in quotes
because I want you to think of it as merely a parameter in the classical hamiltonian.
For definiteness let’s suppose the chain loops back on itself,
11
P
l (...)l e(...)l ,
Q
Using the identity e = l
Mτ
XY
Z1 = T1 (sl , sl+1 )T2 (sl )
{sl } l=1
where
T1 (s1 , s2 ) ≡ eKs1 s2 , T2 (s) ≡ ehs .
What are these objects? The conceptual leap is to think of T1 (s1 , s2 ) as a 2 × 2 matrix:
K −K
e e
T1 (s1 , s2 ) = −K K = hs1 | T1 |s2 i ,
e e s s 1 2
So we have
Z1 = trTMτ = λM Mτ
+ + λ−
τ
In the thermodynamic limit, Mτ 1, the bigger one dominates the free energy
Mτ !
λ−
e−F = Z1 = λM+
τ
1+ ∼ λM
+ .
τ
λ+
12
Now I command you to think of the transfer matrix as
T = e−∆τ H
the propagator in euclidean time (by an amount ∆τ ), where H is the quantum hamil-
tonian operator for a single qbit (note the boldface to denote quantum operators). So
what’s H? To answer this, let’s rewrite the parts of the transfer matrix in terms of
paulis, thinking of s = ± as Z-eigenstates. For T2 , which is diagonal in the Z basis,
this is easy:
T2 = ehZ .
To write T1 this way, stare at its matrix elements in the Z basis:
K −K
e e
hs1 | T1 |s2 i = −K K
e e s s 1 2
which are
aX+b1 b cosh a sinh a
hs1 | e |s2 i = e
sinh a cosh a s
1 ,s2
So we want to identify
T1 T2 = eb1 +aX ehZ ≡ e−∆τ H
for small ∆τ . This requires that a, b, h scale like ∆τ , and so we can combine the
exponents. Assuming that ∆τ E0−1 , h−1 , the result is
∆
H = E0 − X − h̄Z .
2
b h 2a
Here E0 = ∆τ , h̄ = ∆τ , ∆ = ∆τ . (Note that it’s not surprising that the Hamiltonian
for an isolated qbit is of the form H = d0 1 + d~ · σ
~ , since these operators span the set
of hermitian operators on a qbit; but the relation between the parameters that we’ve
found will be important.)
To recap, let’s go backwards: consider the quantum system consisting of a single
spin with H = E0 − ∆2 X + h̄Z . Set h̄ = 0 for a moment. Then ∆ is the energy gap
13
between the groundstate and the first excited state (hence the name). The thermal
partition function is
X
ZQ (T ) = tre−H/T = hs| e−βH |si , (11.6)
s=±
where we’ve evaluated the trace in the Z basis, Z |si = s |si. I emphasize that T here
is the temperature to which we are subjecting our quantum spin; β = T1 is the length
of the euclidean time circle. Break up the euclidean time circle into Mτ intervals of size
∆τ = β/Mτ . Insert many resolutions of unity (this is called ‘Trotter decomposition’)
X
ZQ = hsMτ | e−∆τ H |sMτ −1 i hsMτ −1 | e−∆τ H |sMτ −2 i · · · hs1 | e−∆τ H |sMτ i .
s1 ...sMτ
The RHS is the partition function of a classical Ising chain, Z1 in (11.2), with h = 0
and K given by (11.5), which in the present variables is:
−2K β∆
e = tanh . (11.7)
2Mτ
Notice that if our interest is in the quantum model with couplings E0 , ∆, we can use
any Mτ we want – there are many classical models we could use1 . For given Mτ , the
couplings we should choose are related by (11.7).
A quantum system with just a single spin (for any H not proportional to 1) clearly
has a unique groundstate; this statement means the absence of a phase transition in
the 1d Ising chain.
More than one spin.2 Let’s do that procedure again, this time supposing the
graph in question is a cubic lattice with more than one dimension, and let’s think of
one of the directions as euclidean time, τ . We’ll end up with more than one spin.
We’re going to rewrite the sum in (11.1) as a sum of
products of (transfer) matrices. I will draw the pictures
associated to a square lattice, but this is not a crucial lim-
itation. Label points on the lattice by a vector ~n of inte-
gers; a unit vector in the time direction is τ̌ . First rewrite
P −S
the classical action S in Zc = e , using s2j = 1, as
1
If we include the Z term, we need to take ∆τ small enough so that we can write
2
This discussion comes from this paper of Fradkin and Susskind, and can be found in Kogut’s
review article.
14
X
S=− (Ks(~n + τ̌ )s(~n) + Kx s(~n + x̌)s(~n))
~
n
X 1 2
X
=K (s(~n + τ̌ ) − s(~n)) − 1 − Kx s(~n + x̌)s(~n)
2
~
n X ~
n
= const + L(l + 1, l) (11.8)
rows at fixed time, l
with3
1 X 1 X
L(s, σ) = K (s(j) − σ(j))2 − Kx (s(j + 1)s(j) + σ(j + 1)σ(j)) .
2 j
2 j
σ and s are the names for the spins on successive time slices, as in the figure at left.
The transfer matrix between successive time slices is a
2 × 2M matrix:
M
in terms of which
X Mτ
XY
−S
Z= e = Ts(l+1,j),s(l,j) = trH TMτ .
{s} {s} l=1
This is just as in the one-site case; the difference is that now the hilbert space has a
two-state system for every site on a fixed-l slice of the lattice. I will call this “space”,
and label these sites by an index j. (Note that nothing we say in this discussion requires
N
space to be one-dimensional.) So H = j Hj , where each Hj is a two-state system.
[End of Lecture 41]
The diagonal entries of Ts,σ come from contributions where s(l) = σ(l): they come
with a factor of Ts=σ = e−L(0 flips) with
X
L(0 flips) = −Kx σ(j + 1)σ(j).
j
σ(j) = s(j), except for one site where instead σ(j) = −s(j).
3
R
Note that ‘L’ is for ‘Lagrangian’, so that S = dτ L and ‘S’ is for ‘action’.
15
Similarly,
1 X
L(n flips) = 2nK − Kx (σ(j + 1)σ(j) + s(j + 1)s(j)) .
2 j
T = e−∆τ H ' 1 − ∆τ H ;
we want to consider ∆τ small and must choose Kx , K to make it so. We have to match
the matrix elements hs| T |σi = Tsσ :
P
T (0 flips)sσ = δ eKx
P sσ
j s(j)s(j+1)
' 1 − ∆τ H|0 flips
1
−2K K j (σ(j+1)σ(j)+s(j+1)s(j))
T (1 flip)sσ = e e 2 x ' −∆τ H|1 flip
−2nK 21 Kx j (σ(j+1)σ(j)+s(j+1)s(j))
P
T (n flips)sσ = e e ' −∆τ H|n flips (11.9)
From the first line, we learn that Kx ∼ ∆τ ; from the second we learn e−2K ∼ ∆τ ; we’ll
call the ratio which we’ll keep finite g ≡ Kx−1 e−2K . To make τ continuous, we take
K → ∞, Kx → 0, holding g fixed. Then we see that the n-flip matrix elements go like
e−nK ∼ (∆τ )n and can be ignored – the hamlitonian only has 0- and 1-flip terms.
To reproduce (11.9), we must take
!
X X
HTFIM = −J g Xj + Zj+1 Zj .
j j
Here J is a constant with dimensions of energy that we pull out of ∆τ . The first term
is the ‘one-flip’ term; the second is the ‘zero-flips’ term. The first term is a ‘transverse
magnetic field’ in the sense that it is transverse to the axis along which the neighboring
spins interact. So this is called the transverse field ising model. In D = 1 + 1 it can be
understood completely, and I hope to say more about it later this quarter. As we’ll see,
it contains the universal physics of the 2d Ising model, including Onsager’s solution.
The word ‘universal’ requires some discussion.
S |{sj }j i = |{−sj }j i .
16
It is a symmetry in the sense that:
[HTFIM , S] = 0.
e−∆τ HI ≡ Tx Tz + O(∆τ 2 ).
P P
Tx ≡ eJg∆τ j Xj
, Tz ≡ eJ∆τ j Zj Zj+1
.
4
By ‘usual’ I mean that this is just like in the path integral of a 1d particle, when we write
∆τ 2
e−∆τ H = e− 2m p e−∆τ V (q) + O(∆τ 2 ).
17
many many times, one between each pair of transfer operators; this turns the transfer
operators into transfer matrices. The Tz bit is diagonal, by design:
P
Tz |{sj }i = eJ∆τ j sj sj+1
|{sj }i .
Acting on a single spin at site j, this 2 × 2 matrix is just the one from the previous
discussion:
0 Jg∆τ Xj 0 1
sj e |sj i = e−b eKsj sj , e−b = cosh (2Jg∆τ ) , e−2K = tanh (Jg∆τ ) .
2
Notice that it wasn’t important to restrict to 1 + 1 dimensions here. The only differ-
ence is in the Tz bit, which gets replaced by a product over all neighbors in higher
dimensions: P
0
{sj } Tz |{sj }i = δs,s0 eJ∆τ hjli sj sl
where hjli denotes nearest neighbors, and the innocent-looking δs,s0 sets the spins
sj = s0j equal for all sites.
Label the time slices by a variable l = 1...Mτ .
Mτ
− T1 HI
X Y
Z = tre = h{sj (l + 1)}| Tz Tx |{sj (l)}i
{sj (l)} l=1
The sum on the RHS runs over the 2M Mτ values of sj (l) = ±1, which is the right set
of things to sum over in the d + 1-dimensional classical ising model. The weight in the
partition sum is
X X
Z = e|−bM τ
exp J∆τ sj (l)sj+1 (l) + Ksj (l)sj (l + 1)
{z }
| {z } | {z }
{sj (l)}j,l j,l
space deriv, from Tz time deriv, from Tx
unimportant
constant
X
= e−Sclassical ising
spins
except that the the couplings are a bit anisotropic: the couplings in the ‘space’ direction
Kx = J∆τ are not the same as the couplings in the ‘time’ direction, which satisfy
e−2K = tanh (Jg∆τ ). (At the critical point K = Kc , this can be absorbed in a
rescaling of spatial directions, as we’ll see later.)
18
Dictionary. So this establishes a mapping between classical systems in d + 1 di-
mensions and quantum systems in d space dimensions. Here’s the dictionary:
β→0
free energy in infinite volume groundstate energy: e−F = Z = tre−βH → e−βE0
1
periodicity of euclidean time Lτ temperature: β = T
= ∆τ Mτ
Note that this correspondence between classical and quantum systems is not an iso-
morphism. For one thing, we’ve seen that many classical systems are related to the
same quantum system, which does not care about the lattice spacing in time. There is
a set of physical quantities which agree between these different classical systems, called
universal, which is the information in the quantum system. More on this below.
19
1. λ1 (K) is itself a singular function of K. How can this happen? One way it can
happen is if there is a level-crossing where two completely unrelated eigenvectors
switch which is the smallest (while remaining separated from all the others).
This is a first-order transition. A distinctive feature of a first order transition
is a latent heat: although the free energies of the two phases are equal at the
transition (they have to be in order to exchange dominance there), their entropies
(and hence energies) are not: S ∝ ∂K F jumps across the transition.
Now translate those statements into statements about the corresponding quantum
system. Recall that T = e−∆τ H – eigenvectors of T are eigenvectors of H! Their
eigenvalues are related by
λa = e−∆τ Ea ,
so the largest eigenvalue of the transfer matrix corresponds to the smallest eigenvalue
of H: the groundstate. The two cases described above are:
20
Using the quantum-to-classical dictionary, the groundstate energy of the TFIM at
the transition reproduces Onsager’s tour-de-force free energy calculation.
Another failure mode of this correspondence: there are some quantum systems
which when Trotterized produce a stat mech model with non-positive Boltzmann
weights, i.e. e−S < 0 for some configurations; this requires the classical hamilto-
nian S to be complex. These models are less familiar! An example where this happens
is the spin- 12 Heisenberg (≡ SU(2)-invariant) chain, as you’ll see on the homework. This
is a manifestation of a sign problem, which is a general term for a situation requiring
adding up a bunch of numbers which aren’t all positive, and hence may involve large
cancellations. Sometimes such a problem can be removed by cleverness, sometimes it
is a fundamental issue of computational complexity.
The quantum phase transitions of such quantum systems are not just ordinary finite-
temperature transitions of familiar classical stat mech systems. So for the collector of
QFTs, there is something to be gained by studying quantum phase transitions.
Correlation functions. [Sachdev, 2d ed p. 69] For now, let’s construct correlation
functions of spins in the classical Ising chain, (11.2), using the transfer matrix. (We’ll
study correlation functions in the TFIM later, I think.) Let
1 X −Hc
C(l, l0 ) ≡ hsl sl0 i = e sl sl 0
Z1
{sl }l
0 1 Mτ −l0 l0 −l l
C(l − l ) = tr T ZT ZT . (11.10)
Z
Notice that there is only one operator Z = σ z here; it is the matrix
Zss0 = δss0 s .
All the information about the index l, l0 is encoded in the location in the trace.
21
In this basis
01
hα| Z |βi = , α, β =→ or ← .
10 αβ
So the trace (aka path integral) has two terms: one where the system spends l0 − l
steps in the state |→i (and the rest in |←i), and one where it spends l0 − l steps in the
state |→i. The result (if we take Mτ → ∞ holding fixed l0 − l) is
0 0 0 0
λMτ −l +l λl−−l + λM
−
τ −l +l l −l
λ+ Mτ →∞ 0
C(l − l) = +
0
Mτ Mτ
→ tanhl −l K . (11.11)
λ+ + λ−
sl = Z(τ ), τ = ∆τ l.
Notice that this is the same as our formula for the gap, ∆, in (11.7).5 This connection
between the correlation length in euclidean time and the energy gap is general and
important.
For large K, ξ is much bigger than the lattice spacing:
ξ K1 1 2K
' e 1.
∆τ 2
This is the limit we had to take to make the euclidean time continuous.
5
Seeing this requires the following cool hyperbolic trig fact:
(i.e. this equation is ‘self-dual’) which follows from algebra. Here (11.7) says X = T ∆
Mτ = ∆τ ∆ while
(11.13) says X = ∆τ /ξ. Actually this relation (11.14) can be made manifestly symmetric by writing
it as
1 = sinh 2X sinh 2K .
(You may notice that this is the same combination that appears in the Kramers-Wannier self-duality
condition.) I don’t know a slick way to show this, but if you just solve this quadratic equation for
e−2K and boil it enough, you’ll find tanh X.
22
Notice that if we had taken l < l0 instead, we would have found the same answer
with l0 − l replaced by l − l0 .
[End of Lecture 42]
Continuum scaling limit and universality
[Sachdev, 2d ed §5.5.1, 5.5.2] Now we are going to grapple with the term ‘universal’.
Let’s think about the Ising chain some more. We’ll regard Mτ ∆τ as a physical quantity,
the proper length of the chain. We’d like to take a continuum limit, where Mτ → ∞ or
∆τ → 0 or maybe both. Such a limit is useful if ξ ∆τ . This decides how we should
scale K, h in the limit. More explicitly, here is the prescription: Hold fixed physical
quantities (i.e. eliminate the quantities on the RHS of these expressions in favor of
those on the LHS):
1
the correlation length, ξ ' ∆τ e2K ,
2
the length of the chain, Lτ = ∆τ Mτ ,
physical separations between operators, τ = (l − l0 )∆τ,
the applied field in the quantum system, h̄ = h/∆τ. (11.15)
while taking ∆τ → 0, K → ∞, Mτ → ∞.
What physics of the various chains will agree? Certainly only quantities that don’t
depend explicitly on the lattice spacing; such quantities are called universal.
6
Consider the thermal free energy
pof the single quantum spin (11.6) : The energy
spectrum of our spin is E± = E0 ± (∆/2)2 + h̄2 , which means
q
F = −T log ZQ = E0 − T ln 2 cosh β (∆/2)2 + h̄2
(just evaluate the trace in the energy eigenbasis). In fact, this is just the behavior of
the ising chain partition function in the scaling limit (11.15), since, in the limit (11.4)
becomes r
2ξ ∆τ
q
λ± ' 1± 2
1 + 4h̄ ξ 2
∆τ 2ξ
and so in the scaling limit (11.15)
K 1 Lτ
q
F ' Lτ − − ln 2 cosh ξ −2 + 4h̄2
,
∆τ}
| {z Lτ 2
cutoff-dependent vac. energy
which is the same (up to an additive constant) as the quantum formula under the
previously-made identifications T = L1τ , ξ −1 = ∆.
6
[Sachdev, 1st ed p. 19, 2d ed p. 73]
23
We can also use the quantum system to compute the correlation functions of the
classical chain in the scaling limit (11.11). They are time-ordered correlation functions:
where
Z(τ ) ≡ eHτ Ze−Hτ .
This time-ordering is just the fact that we had to decide whether l0 or l was bigger in
(11.10).
For example, consider what happens to this when T → 0. Then (inserting 1 =
P
n |ni hn|, in an energy eigenbasis H |ni = En |ni),
X
C(τ )|T =0 = | h0| Z |ni |2 e−(En −E0 )|τ |
n
where the |τ | is taking care of the time-ordering. This is a spectral representation of the
correlator. For large τ , the contribution of |ni is exponentially suppressed by its energy,
so the sum is approximated well by the lowest energy state for which the matrix element
is nonzero. Assuming this is the first excited state (which in our two-state system it
has no choice!), we have
τ →∞
C(τ )|T =0 ' e−τ /ξ , ξ = 1/∆,
The scaling limit would not be exactly the same; we would have to scale K 0 somehow
(it would also have to grow in the limit). But we would find the same 2-state quantum
system, and when expressed in terms of physical variables, the ∆τ -independent terms
in F would be identical, as would the form of the correlation functions, which is
24
11.2 Interlude on differential forms and algebraic topology
The next item of business is coherent state path integrals of all kinds. We are going to
make a sneak attack on them.
[Zee section IV.4] We interrupt this physics discussion with a message from our
mathematical underpinnings. This is nothing fancy, mostly just some book-keeping.
It’s some notation that we’ll find useful. As a small payoff we can define some simple
topological invariants of smooth manifolds.
Suppose we are given a smooth manifold X on which we can do calculus. For now,
we don’t even need a metric on X.
A p-form on X is a completely antisymmetric p-index tensor,
1
A≡ Am1 ...mp dxm1 ∧ ... ∧ dxmp .
p!
The coordinate one-forms are fermionic objects in the sense that dxm1 ∧dxm2 = −dxm2 ∧
dxm1 and dx2 = 0. The point in life of a p-form is that it can be integrated over a
p-dimensional space. The order of its indices keeps track of the orientation (and it
saves us the trouble of writing them). It is a geometric object, in the sense that it is
something that can be (wants to be) integrated over a p-dimensional subspace of X,
and its integral will only depend on the subspace, not on the coordinates we use to
describe it.
Familiar examples include the gauge potential A = Aµ dxµ , and its field strength
F = 12 Fµν dxµ ∧ dxν . Given a curve C in X parameterized as xµ (s), we have
dxµ
Z Z Z
µ
A≡ dx Aµ (x) = ds Aµ (x(s))
C C ds
and this would be the same if we chose some other parameterization or some other
local coordinates.
The wedge product of a p-form A and a q-form B is a p + q form
A ∧ B = Am1 ..mp Bmp+1 ...mp+q dxm1 ∧ ... ∧ dxmq ,
7
The space of p-forms on a manifold X is sometimes denoted Ωp (X), especially when
7
The components of A ∧ B are then
(p + q)!
(A ∧ B)m1 ...mp+q = A[m1 ...mp Bmp+1 ...mp+q ]
p!q!
where [..] means sum over permutations with a −1 for odd permutations. Try not to get caught up
in the numerical prefactors.
25
it is to be regarded as a vector space (let’s say over R).
The exterior derivative d acts on forms as
d : Ωp (X) → Ωp+1 (X)
A 7→ dA
by
1
dA = ∂m1 (A)m2 ...mp+1 dxm1 ∧ ... ∧ dxmp+1 .
(p + 1)!
You can check that
d2 = 0
basically because derivatives commute. Notice that F = dA in the example above.
Denoting the boundary of a region D by ∂D, Stokes’ theorem is
Z Z
dα = α.
D ∂D
And notice that Ωp>dim(X) (X) = 0 – there are no forms of rank larger than the
dimension of the space.
A form ωp is closed if it is killed by d: dωp = 0.
A form ωp is exact if it is d of something: ωp = dαp−1 . That something must be a
(p − 1)-form.
26
Here’s a very simple example, where X = S 1 is a circle. x ' x + 2π is a coordinate;
the radius will not matter since it can be varied continuously. An element of Ω0 (S 1 ) is
a smooth periodic function of x. An element of Ω1 (S 1 ) is of the form A1 (x)dx where
A1 is a smooth periodic function. Every such element is closed because there are no
2-forms on a 1d space. The exterior derivative on a 0-form is
Which 0-forms are closed? A00 = 0 means A0 is a constant. Which 1-forms can we
make this way? The only one we can’t make is dx itself, because x is not a periodic
function. Therefore b0 (S 1 ) = b1 (S 1 ) = 1.
? : Ωp → Ωd−p
by
?A(p) ≡ µ1 ...µd A(p) µd−p+1 ...µd
µ1 ...µd−p
An application: consider the Maxwell action, 41 Fµν F µν . You can show that this is
R
the same as S[A] = F ∧ ?F . (Don’t trust my numerical prefactor.) You can derive
δS
R
the Maxwell EOM by 0 = δA . F ∧ F is the θ term. The magnetic dual field strength
is F̃ = ?F . Many generalizations of duality can be written naturally using the Hodge
? operation.
As you can see from the Maxwell example, the Hodge star gives an inner product
R
on Ωp : for two p-forms α, β (α, β) = α ∧ ?β), (α, α) ≥ 0. We can define the adjoint
of d with respect to the inner product by
Z Z
† †
d α ∧ ?β = (d α, β) ≡ (α, dβ) = α ∧ ?dβ
∆ = dd† + d† d.
27
Any cohomology class [ω] has a harmonic representative, [ω] = [ω̃] where in addition
to being closed dω = dω̃ = 0, it is co-closed, 0 = d† ω̃, and hence harmonic ∆ω̃ = 0.
I mention this because it implies Poincare duality: bp (X) = bd−p (X) if X has a
volume form. This follows because the map H p → H d−p [ωp ] 7→ [?ωp ] is an isomorphism.
(Choose the harmonic representative, it has d ? ω̃p = 0.)
The de Rham complex of X can be realized as the groundstates of a physical system,
namely the supersymmetric nonlinear sigma model with target space X. The fermions
play the role of the dxµ s. The states are of the form
d
X
|Ai = Aµ1 ···µp (x)ψ µ1 ψ µ2 · · · ψ µp |0i
p=1
where ψ are some fermion creation operators. This shows that the hilbert space is the
space of forms on X, that is H ' Ω(X) = ⊕p Ωp (X). The supercharges act like d and
d† and therefore the supersymmetric groundstates are (harmonic representatives of)
cohomology classes.
This machinery will be very useful to us. I use it all the time.
s is a number. Suppose we think of this sphere as the phase space of some dynamical
system. We can use ω as the symplectic form. What is the associated quantum
mechanics system?
28
Let me remind you what I mean by ‘the sym-
plectic form’. Recall the phase space formulation
of classical dynamics. The action associated to a
trajectory is
Z t2 Z Z
A[x(t), p(t)] = dt (pẋ − H(x, p)) = p(x)dx− Hdt
t1 γ
where γ is the trajectory through the phase space. The first term is the area ‘under
the graph’ in the classical phase space – the area between (p, x) and (p = 0, x). We
can rewrite it as Z Z Z
p(t)ẋ(t)dt = pdx = dp ∧ dx
∂D D
using Stokes’ theorem; here ∂D is the closed curve made by the classical trajectory and
some reference trajectory (p = 0) and it bounds some region D. Here ω = dp ∧ dx is
the symplectic form. More generally, we can consider an 2n-dimensional phase space
with coordinates uα , α = 1..2n and symplectic form
and action Z Z
A[u] = ω− dtH(u, t).
D ∂D
The symplectic form says who is canonically conjugate to whom. It’s important that
dω = 0 so that the equations of motion resulting from A depend only on the trajectory
γ = ∂D and not on the interior of D. The equations of motion from varying u are
∂H
ωαβ u̇β = .
∂uα
Locally, we can find coordinates p, x so that ω = d(pdx). Globally on the phase
space this is not guaranteed – the symplectic form needs to be closed, but need not be
exact.
So the example above of the two-sphere is one where the symplectic form is closed
(there are no three-forms on the two sphere, so dω = 0 automatically), but is not exact.
One way to see that it isn’t exact is that if we integrate it over the whole two-sphere,
we get the area: Z
ω = 4πs .
S2
On the other hand, the integral of an exact form over a closed manifold (meaning a
manifold without boundary, like our sphere) is zero:
Z Z
dα = α = 0.
C ∂C
29
So there can’t be a globally defined one-form α such that dα = ω. Locally, we can find
one; for example:
α = s cos θdϕ ,
but this is singular at the poles, where ϕ is not a good coordinate.
So: what I mean by “what is the associated quantum system...” is the following:
let’s construct a system whose path integral is
Z
i
Z = [dθdϕ]e ~ A[θ,ϕ] (11.16)
with the action above, and where [dx] denotes the path integral measure:
N
Y
[dx] ≡ ℵ dx(ti )
i=1
where ℵ involves lots of awful constants that drop out of ratios. It is important that
the measure does not depend on our choice of coordinates on the sphere.
• Hint 2: We actually didn’t specify the model yet, since we didn’t choose the
Hamiltonian. For definiteness, let’s pick the hamiltonian to be
H = −s~h · ~n
where ~n ≡ (sin θ cos ϕ, sin θ sin ϕ, cos θ). WLOG, we can take the polar axis to
be along the ‘magnetic field’: ~h = ẑh. The equations of motion are then
δA δA
0= = −s sin θ (ϕ̇ − h) , 0= = −∂t (s cos θ)
δθ(t) δϕ(t)
which by rotation invariance can be written better as
30
In QM we care that the action produces a well-
defined phase – the action must be defined modulo
additions of 2π times an integer. We should get
the same answer whether we fill in one side D of
the trajectory γ or the other D0 . The difference [from Witten]
between them is Z Z Z
s − area = s area .
D D0 S2
R
So in this difference s multiplies S 2 area = 4π (actually, this can be multiplied by an
integer which is the number of times the area is covered). Our path integral will be
well-defined (i.e. independent of our arbitrary choice of ‘inside’ and ‘outside’) only if
4πs ∈ 2πZ, that is if 2s ∈ Z is an integer .
The conclusion of this discussion is that the coefficient of the area term must be an
integer. We will interpret this integer below.
WZW term. We have a nice geometric interpretation of the ‘area’ term in our
action A – it’s the solid angle swept out by the particle’s trajectory. But how do we
write it in a manifestly SU(2) invariant way? We’d like to be able to write it, not in
terms of the annoying coordinates θ, φ, but directly in terms of
where xµ = (t, u), and the tensors are completely antisymmetric in their indices with
all nonzero entries 1 and −1.
31
This WZW term has the property that its vari-
ation with respect to ~n depends only on the values
at the boundary (that is: δW0 is a total deriva-
tive). The crucial reason is that allowed variations
δ~n lie on the 2-sphere, as do derivatives ∂µ~n; this
means abc δna ∂µ nb ∂ν nc = 0, since they all lie in a
two-dimensional tangent plane to the 2-sphere at
~n(t). Therefore:
Z 1 Z Z
1 µν a b c abc 1 a
δW0 = du dt n ∂µ δn ∂ν n = n dδnb ∧ dnc abc
Z0 1 Z 4π 4π
B Z
1 µν a b c abc 1 a b c abc
= du dt ∂µ n δn ∂ν n = d n δn dn
0 Z 4π B 4π
Stokes 1
= dtδ~n · ~n˙ × ~n . (11.18)
4π
(Note that abc na mb `c = ~n · m ~ × ~` . The right expressions in red in each line are
a rewriting in terms of differential forms; notice how much prettier they are.) So the
equations of motion coming from this term do not depend on how we extend it into
the auxiliary dimension.
And in fact they are the same as the ones we found earlier:
δ
0= 4πsW0 [n] + sh · ~n + λ ~n − 1 = s∂t~n × ~n + s~h + 2λ~n
~ 2
δ~n(t)
(λ is a Lagrange multiplier to enforce unit length.) The cross product of this equation
with ~n is ∂t~n = ~h × ~n.
In QM we also care that the action produces a well-defined phase – the action
must be defined modulo additions of 2π times an integer. There may be many ways to
extend n̂ into an extra dimension; another obvious way is shown in the figure above.
The demand that the action is the same modulo 2πZ gives the same quantization law
as above for the coefficient of the WZW term. So the WZW term is topological in the
sense that because of topology its coefficient must be quantized.
(This set of ideas generalizes to many other examples, with other fields in other
dimensions. WZW stands for Wess-Zumino-Witten.)
Coherent quantization of spin systems. [Wen §2.3.1, Fradkin, Sachdev, QPT,
chapter 13 and §2.2 of cond-mat/0109419] To understand more about the path integral
we’ve just constructed, we now go in the opposite direction. Start with a spin one-half
system, with
H 1 ≡ span{|↑i , |↓i}.
2
32
Define spin coherent states |~ni by8 :
~ · ~n |~ni = |~ni .
σ
These states form another basis for H 1 ; they are related to the basis where σ z is
2
diagonal by:
cos 2θ eiψ/2
iϕ/2
z1 e
|~ni = z1 |↑i + z2 |↓i , = −iϕ/2 (11.19)
z2 e sin 2θ eiψ/2
~n = z † σ
~ z, |z1 |2 + |z2 |2 = 1
and the phase of zα does not affect ~n (this is the Hopf fibration S 3 → S 2 ). In (11.19) I
chose a representative of the phase. The space of independent states is a two-sphere:
(Notice that H = 0 here, so U ≡ e−iHt is actually the identity.) The crucial ingredient
is
†
h~n(t + )|~n(t)i = z † (dt)z(0) = 1 − z † (dt) (z(dt) − z(0)) ≈ e−z ∂t zdt .
8
For more general spin representation with spin s > 21 , and spin operator ~S, we would generalize
this equation to
~S · ~n |~ni = s |~ni .
33
Z Z t
D~n iSB [~n(t)]
iG(~n2 , ~n1 , t) = e , SB [~n(t)] = dtiz † ż . (11.22)
2π 0
Notice how weird this is: even though the Hamiltonian of the spins was zero – whatever
their state, they have no potential energy and no kinetic energy – the action in the
path integral is not zero. This phase eiSB is a quantum phenomenon called a Berry
phase. [End of Lecture 44]
Starting from the action SB and doing the Legendre transform to find the Hamil-
tonian you will get zero. The first-derivative action says that z † is the canonical
momentum conjugate to z: the space with coordinates (z, z † ) becomes the phase space
(just like position and momentum)! But this phase space is curved. In fact it is the
two-sphere
S 2 = {(z1 , z2 )||z1 |2 + |z2 |2 = 1}/(zα ' eiψ zα ).
In terms of the coordinates θ, ϕ above, we have
Z
1 1
SB [z] = SB [θ, ϕ] = dt − cos θφ̇ − φ̇ = −4πsW0 [n̂]|s= 1 . (11.23)
2 2 2
BIG CONCLUSION: This is the ‘area’ term that we studied above, with s = 12 ! So the
expression in terms of z in (11.22) gives another way to write the area term which is
manifestly SU(2) invariant; this time the price is introducing these auxiliary z variables.
The Berry phase SB [n] is geometric, in the sense that it depends on the trajec-
tory of the spin through time, but not on its parametrization, or speed or dura-
tion. It is called the Berry phase of the spin history because it is the phase ac-
quired by a spin which follows the instantaneous groundstate (i.e. adiabatic evolution)
|Ψ0 (t)i of H(ň(t), t) ≡ −h(t)ň(t) · S, with h > 0. This is Berry’s adiabatic phase,
R
SB = − lim∂t h→0 dtIm hΨ0 (t)| ∂t |Ψ0 (t)i.
Making different choices of for the phase ψ at different times can shift the constant
in front of the second term in (11.23); as we observed earlier, this term is a total
derivative. Different choices of ψ change the overall phase of the wavefunction, which
doesn’t change physics (recall that this is why the space of normalized states of a
qbit is a two-sphere and not a three-sphere). Notice that At = z † ∂t z is like the time
component of a gauge field.
Since SB is geometric, like integrals of differential forms, let’s take advantage of this
to make it pretty and relate it to familiar objects. Introduce a vector potential (the
Berry connection) on the sphere Aa , a = x, y, z so that
I I Z
a Stokes
SB = dτ ṅa A = A = F
γ D
34
where γ = ∂D is the trajectory. (F = dA is the Berry curvature.) What is the correct
form? We must have (∇ × A) · ň = abc ∂na Ab nc = 1 (for spin half). This is a monopole
field. Two choices which work are
These two expressions differ by the gauge transformation dϕ, which is locally a total
derivative. The first is singular at the N and S poles, ň = ±ž. The second is singular
only at the S pole. Considered as part of a 3d field configuration, this codimension
two singularity is the ‘Dirac string’. The demand of invisibility of the Dirac string
quantizes the Berry flux.
If we redo the above coherent-state quantization for a spin-s system we’ll get the
expression with general s (see below). Notice that this only makes sense when 2s ∈ Z.
We can add a nonzero Hamiltonian for our spin; for example, we can put it in an
external Zeeman field ~h, which adds H = −~h · ~S. This will pass innocently through
the construction of the path integral, adding a term to the action S = SB + Sh ,
Z
Sh = dt s~h · ~n
35
we could do calculus, z(t + ) − z(t) = ż(t) + O(2 ). Is this true of the important
contributions to the path integral? Sometimes not, and we’ll come back to this later.
Schwinger bosons. The following is a helpful device for spin matrix elements.
Consider two copies of the harmonic oscillator algebra, with modes a, b satisfing [a, a† ] =
1 = [b, b† ], [a, b] = [a, b† ] = 0. Then
S+ = a† b, S− = b† a, Sz = a† a − b† b
satisfy the SU(2) algebra. The no-boson state |0i is a singlet of this SU(2), and the
†
a |0i
one-boson states form a spin-half doublet.
b† |0i
More generally, the states
36
cos 2θ eiψ/2
iϕ/2
z1 e
Here = −iϕ/2 as above9 .
z2 e sin 2θ eiψ/2
But now we can compute the crucial ingredient in the coherent state path integral,
the overlap of successive coherent states:
0
0 e−is(ψ−ψ ) 0
0
2s
hň|ň i = h0| (z1? a + z2? b)2s (z10 a† + z20 b† )2s |0i = e−is(ψ−ψ ) (z1? z10 +z2? z20 )2s = e−i(ψ−ψ )/2 z † · z 0 .
(2s)! | {z }
Wick 2s
= (2s)!([z1? a+z2? b,z10 a† +z20 b† ])
Here’s the point: this is the same as the spin-half answer, raised to the 2s power.
(s) (1)
This means that the Berry phase just gets multiplied by 2s, SB [n] = 2sSB2 [n], as we
claimed.
Semi-classical spectrum. Above we found a path integral representation for the
Green’s function of a spin as a function of time, G(nt , n0 ; t). The information this con-
tains about the spectrum of the hamiltonian can be extracted by Fourier transforming
Z ∞
G(nt , n0 ; E) ≡ −i dtG(nt , n0 ; t)ei(E+i)t
0
d2 n0
Z
1
Γ(E) ≡ G(n0 , n0 ; E) = tr .
2π E − H + i
This function has poles at the eigenvalues of H. Its imaginary part is the spectral
density, ρ(E) = π1 ImΓ(E) = α δ(E − Eα ).
P
37
In the second equality we used the fact that the Berry phase is geometric, it depends
only on the trajectory, not on t (how long it takes to get there). So the semiclassical
trajectories are periodic solutions to the EOM with energy E = Hcl [nE ]. The exponent
evaluated on such a trajectory is then just the Berry term. Denoting by nE 1 such
trajectories which traverse once (‘prime’ orbits),
∞
XX X einsSB [n]
Γ(E) ∼ einsSB [n] = .
n=0
1 − einsSB [n]
nE
1 nE
1
This is an instance of the Gutzwiller trace formula. The locations of poles of this func-
m
tion approximate the eigenvalues of H. They occur at E = Esc such that SB [~nEm ] =
2πm
s
. The actual eigenvalues are E m = Escm
+ O(1/s).
R
If the path integral in question were a 1d particle in a potential, with SB = pdx,
and Hcl = p2 + V (x), the semiclassical condition would reduce to
I Z p
2πm = p(x)dx = Em − V (x)
xE m turning points
[Zee §6.5] Now we’ll try D ≥ 1 + 1. Consider a chain of spins, each of spin s ∈ Z/2,
interacting via the Heisenberg hamiltonian:
X
H= J ~Sj · ~Sj+1 .
j
This hamiltonian is invariant under global spin rotations, Saj → RSaj R−1 = Rba Sbj for
D E
all j. For J < 0, this interaction is ferromagnetic, so it favors a state like ~Sj = sẑ.
D E
For J > 0, the neighboring spins want to anti-align; this is an antiferromagnet: ~Sj =
(−1)j sẑ. Note that I am lying about there being spontaneous breaking of a continuous
symmetry in 1+1 dimensions. Really there is only short-range order because of the
Coleman-Mermin-Wagner theorem. But that is enough for the calculation we want to
do.10
10
Even more generally, the consequence of short-range interactions of some particular sign for the
groundstate is not so obvious. For example, antiferromagnetic interactions may be frustrated: If I
want to disagree with both Kenenisa and Lasse, and Kenenisa and Lasse want to disagree with each
other, then some of us will have to agree, or maybe someone has to withhold their opinion, hSi = 0.
38
We can write down the action that we get by coherent-state quantization – it’s just
many copies of the above, where each spin plays the role of the external magnetic field
for its neighbors: X † X
L = is zj ∂t zj − Js2 ~nj · ~nj+1 .
j j
Spin waves in ferromagnets. Let’s use this to find the equation of motion for
small fluctuations δ~ni = S~i − sẑ about the ferromagnetic state. Once we recognize
the existence of the Berry phase term, this is the easy case. In fact the discussion is
not restricted to D = 1 + 1. Assume the system is translation invariant, so we should
Fourier transform. The condition that ~n2j = 1 means that δnz (k) = 0.11 Linearizing in
δ~n (using (11.18)) and fourier transforming, we find
h(k) − 2i ω
δnx (k)
0= i
2
ω h(k) δny (k)
with h(k) determined by the exchange (J) term. It is the lattice laplacian in k-
k→0
space. For example for the square lattice, it is h(k) = 4s|J| (2 − cos kx a − cos ky a) '
2s|J|a2 k 2 , with a the lattice spacing. For small k, the eigenvectors have ω ∼ k 2 , a
z = 2 dispersion (meaning that there is scale invariance near ω = k = 0, but space
and time scale differently: k → λk, ω → λ2 ω. The two spin polarizations have their
relative phases locked δnx (k) = iδny (k)/hk , and so these modes describe precession
of the spin about the ordering vector. These low-lying spin excitations are visible in
neutron scattering and they dominate the low-temperature thermodynamics. Their
thermal excitations produce a version of the blackbody spectrum with z = 2. We can
determine the generalization of the Stefan-Boltzmann law by dimensional analysis: the
free energy (or the energy itself) is extensive, so F ∝ Ld , but it must have dimensions
of energy, and the only other scale available is the temperature. With z 6= 1, temper-
d+1
ature scales like [T ] = [L−z ]. Therefore F = cLd T z . (For z = 1 this is the ordinary
Stefan-Boltzmann law).
Notice that a ferromagnet is a bit special because the order parameter Qz = i Szi
P
is actually conserved, [Qz , H] = 0. This is actually what’s responsible for the funny
z = 2 dispersion of the goldstones, and the fact that although the groundstate breaks
two generators Qx and Qy , there is only one gapless mode. If you are impatient to
understand this connection, take a look at this paper.
11
1 = n2j ∀j =⇒ nj · δnj = 0, ∀j which means that for any k,
X X
0= eikja nj · δnj = nz (k − q)δnz (q) = δnzk .
j q
39
Antiferromagnets. [Fradkin, 2d ed, p. 203] Now, let’s study instead the equation
of motion for small fluctuations about the antiferromagnetic state. The conclusion will
be that there is a linear dispersion relation. This would be the conclusion we came
to if we simply erased the WZW/Berry phase term and replaced it with an ordinary
kinetic term
1 X
∂t~nj · ∂t~nj .
2g 2 j
How this comes about is actually a bit more involved! An important role will be
played12 by the ferromagnetic fluctuation ~`j in
~ j + a~`j .
~nj = (−1)j m
~ j is the AF fluctuation; a is the lattice spacing; s ∈ Z/2 is the spin. The constraint
m
~n2 = 1 tells us that m
~ 2 = 1 and m~ · ~` = 0.
~ 2r + 2`2r )+O(a2 ))
The exchange (J) term in the action is (using ~n2r −~n2r−1 ≈ a (∂x m
Z
j ~ 2 1 2 2
SJ [~nj = (−1) m~ j + a`j ] = −aJs dxdt (∂x m)
~ + 2` .
2
So
1 dx δW0 1
W0 [n2r ] − W [n2r−1 ] = − ∂x n̂i a = − dxn̂ × ∂t n̂ · ∂x n̂.
2 a δni 2
40
Altogether, we find that ` is an auxiliary field with no time derivative:
so we can integrate out ` (this is the step analogous to what we’ll do for ρ in the EFT
of SF in §11.5) to find
Z
1 1 2 2 θ
S[m]
~ = dxdt ~ − vs (∂x m)
(∂t m) ~ + ~ · (∂µ m
µν m ~ × ∂ν m)
~ , (11.24)
2g 2 vs 8π
with g 2 = 2s and vs = 2aJs, and θ = 2πs. The equation of motion for small fluctuations
of m
~ therefore gives linear dispersion with velocity vs . Notice that there are two
independent gapless modes. Some of these fluctuations have wavenumber k close to
π, since they are fluctuations of the AF order (k = π means changing sign between
each site), that is, ω ∼ |k − π|. (For a more microscopic treatment, see the book by
Auerbach.)
The last (‘theta’) term in (11.24) is a total derivative. This means it doesn’t affect
the EOM, and it doesn’t affect the Feynman rules. It is even more topological than
the WZW term – its value only depends on the topology of the field configuration,
and not on local variations. It is like the θF ∧ F term in 4d gauge theory. You might
think then that it doesn’t matter. Although it doesn’t affect small fluctuations of the
fields, it does affect the path integral. Where have we seen this functional before? The
integrand is the same as in our 2d representation of the WZW term in 0+1 dimensions:
the object multiplying theta counts the winding number of the field configuration m, ~
2 2
the number of times Q the map m ~ : R → S covers its image (we can assume that the
map m(|x|
~ → ∞) approaches a constant, say the north pole). We can break up the
1
R
path integral into sectors, labelled by this number Q ≡ 8π ~ · (∂µ m
dxdt µν m ~ × ∂ν m)
~ :
Z XZ
iS
Z = [Dm]e ~ = ~ Q eiSθ=0 eiθQ .
[Dm]
Q∈Z
θ determines the relative phase of different topological sectors (for θ = π, this a minus
sign for odd Q).
Actually, the theta term makes a huge difference. (Perhaps it is not so surprising
if you think about the quantum mechanics of a particle constrained to move on a ring
with magnetic flux through it?) The model with even s flows to a trivial theory in the
IR, while the model with odd s flows to a nontrivial fixed point, called the SU(2)1 WZW
model. It can be described in terms of one free relativistic boson. If you are impatient
to understand more about this, the 2nd edition of the book by Fradkin continues this
discussion. Perhaps I can be persuaded to say more.
[End of Lecture 46]
41
Nonlinear sigma models in perturbation theory. Let us discuss what happens
in perturbation theory in small g. A momentum-shell calculation integrating out fast
modes (see the next subsection, §11.3.3) shows that
dg 2
= (D − 2)g 2 + (n − 2)KD g 4 + O(g 3 ) (11.25)
d`
where ` is the logarithmic RG time, and ` → ∞ is the IR. n is the number of components
ΩD−1
of n̂, here n = 3, and KD = (2π) D as usual. Cultural remark: the second term
is proportional to the curvature of the target space, here S n−1 , which has positive
curvature for n > 1. For n = 2, we get S 1 which is one-dimensional and hence flat and
there is no perturbative beta function. In fact, for n = 2, it’s a free massless scalar.
(But there is more to say about this innocent-looking scalar!)
The fact that the RHS of (11.25) is positive in D = 2 says that this model is
asymptotically free – the coupling is weak in the UV (though this isn’t so important if
we are starting from a lattice model) and becomes strong in the IR. This is opposite
what happens in QED; the screening of the charge in QED makes sense in terms of
polarization of the vacuum by virtual charges. Why does this antiscreening happen
here? There’s a nice answer: the effect of the short-wavelength fluctuations is to make
the spin-ordering vector ~n effectively smaller. It is like what happens when you do
the block spin procedure, only this time don’t use majority rule, but just average the
spins. But rescaling the variable ~n → a~n with a < ∼ 1 is the same as rescaling the
coupling g → g/a – the coupling gets bigger. (Beware Peskin’s comments about the
connection between this result and the Coleman-Mermin-Wagner theorem: it’s true
that the logs in 2d enhance this effect, but in fact the model can reach a fixed point at
finite coupling; in fact, this is what happens when θ = π.)
Beyond perturbation theory. Like in QCD, this infrared slavery (the dark side
of asymptotic freedom) means that we don’t really know what happens in the IR from
this calculation. From other viewpoints (Bethe ansatz solutions, many other methods),
we know that (for integer s) there is an energy gap above the groundstate (named after
Haldane) of order
c
− 2
ΛH ∼ Λ0 e g0
,
analogous to the QCD scale. Here g0 is the value of g at the scale Λ0 ; so ΛH is roughly
the energy scale where g becomes large. This is dimensional transmutation again.
For s ∈ Z, for studying bulk properties like the energy gap, we can ignore the theta
term since it only appears as e2πin , with n ∈ Z in the path integral. 14 For half-integer
14
θ = 2πn does, however, affect other properties, such as the groundstate wavefunction and the
behavior in the presence of a boundary. θ = 2π is actually a different phase of matter than θ = 0.
It is an example of a SPT (symmetry-protected topological) phase, the first one discovered. See the
homework for more on this.
42
s, there is destructive interference between the topological sectors. Various results
(such as the paper by Read and Shankar, Nuclear Physics B336 (1990) 457-474, which
contains an amazingly apt Woody Allen joke) show that this destroys the gap. This last
sentence was a bit unsatisfying; more satisfying would be to understand the origin of
the gap in the θ = 2πn case, and show that this interference removes that mechanism.
This strategy is taken in this paper by Affleck.
[Polyakov §3.2; Peskin §13.3; Auerbach chapter 13] I can’t resist explaining the result
(11.25). Consider this action for a D = 2 non-linear sigma model with target space
S n+1 , of radius R: Z Z
S= d2 xR2 ∂µ n̂ · ∂ µ n̂ ≡ d2 xR2 dn2 .
Notice that R is a coupling constant (it’s what I called 1/g earlier). In the second step
I made some compact notation.
Since not all of the components of n̂ are independent (recall that n̂ · n̂ = 1!),
the expansion into slow and fast modes here is a little trickier than in our previous
examples. Following Polyakov, let
n−1
X
i
p
n (x) ≡ ni< (x) 2
1 − φ> + φ> i
a (x)ea (x). (11.26)
a=1
Here the slow modes are represented by the unit vector ni< (x), n̂< · n̂< = 1; the variables
eia are a basis of unit vectors spanning the n − 1 directions perpendicular to ~n< (x)
they are not dynamical variables and how we choose them does not matter.
RΛ
The fast modes are encoded in φ> a (x) ≡ Λ/s d̄ke
ikx
φk , which only has fourier modes
n−1
in a shell of momenta, and φ2> ≡ a=1 φ> >
P
a φa . Notice that differentiating the relations
in (11.27) gives
n̂< · dn̂< = 0, n̂< · dêa + dn̂< · êa = 0. (11.28)
Below when I write φs, the > symbol is implicit.
We need to plug the expansion (11.26) into the action, whose basic ingredient is
21 φ · dφ
dni = dni< 1 − φ2 − ni< p + dφ · ei + φ · dei .
1−φ 2
43
R
So Seff = d2 x L with
1
L= (d~n)2
2g 2
1
= 2 (dn< )2 1 − φ2 + dφ2
+2φa dφb~ea · d~eb
2g |{z}
kinetic term for φ
So let’s do the integral over φ, by treating the dφ2 term as the kinetic term in a gaussian
integral, and the rest as perturbations:
Z Z
− 1 (dφ)2
R R
−Seff [n< ] Λ − L
e = [Dφ> ]Λ/s e = [Dφ> ]ΛΛ/s e 2g2 (all the rest) ≡ hall the resti>,0 Z>,0 .
Λ
d2 k
Z
2 1
hφa φb i>,0 = δab g 2
= g 2 K2 log(s)δab , K2 = .
Λ/s k 2π
What to do with this d~ea · d~eb nonsense? Remember, ~ea are just some arbitrary
basis of the space perpendicular to n̂< ; its variation can be expanded in our ON basis
at x, (n< , ec ) as
n−1
X
d~ea = (dea · n̂< ) n̂< + (d~ea · ~ec ) ~ec
| {z }
c=1
(11.28)
= −dn̂< ·~ea
Therefore X
d~ea · d~ea = + (dn< )2 + (~ec · d~ea )2
c,a
where the second term is a higher-derivative operator that we can ignore for our present
purposes. Therefore
1
Leff [n] = 2 (dn̂< )2 1 − ((N − 1) − 1) g 2 K2 log s + ...
2g
−1
g4
1
' 2
g + (N − 2) log s + ... (dn̂< )2 + ... (11.30)
2 4π
Differentiating this running coupling with respect to s gives the one-loop term in
the beta function quoted above. The tree-level (order g 2 ) term comes from engineering
dimensions.
44
11.3.4 CP1 representation and Large-N
[Auerbach, Interacting Electrons and Quantum Magnetism, Polyakov, Gauge fields and
strings] Above we used large spin as our small parameter to try to control the con-
tributions to the path integral. Here we describe another route to a small parameter,
which can be just as useful if we’re interested in small spin like spin- 12 .
Recall the relationship between the coherent state vector ň and the spinor compo-
nents z: na = z † σ a z. Imagine doing this at each point in space and time:
We saw that the Berry phase term could be written nicely in terms of z as iz † ż, what
about the rest of the path integral?
First, some counting: 1 = ň2 ↔ 1 = z † · z = m=↑,↓ |zm |2 . But this leaves only two
P
Path integral manipulations. [Auerbach, chapter 14] First notice that the AF
kinetic term is
∂µ na ∂ µ na = 4 ∂µ z † ∂ µ z − Aµ Aµ = 4 ∂µ z † ∂ µ z − Aµ Aµ z † z .
(11.33)
that Aµ → Aµ − ∂χ and the BHS of (11.33) is gauge invariant under (11.32). We must
impose the constraintR |z(x)|2 = 1 at each site, which let’s do it by a lagrange muliptlier
δ[|z|2 − 1] = Dλ ei d xλ(x)(|z| −1) . In the action, the A2 term is a self-interaction of
R d 2
the zs, which makes it difficult to do the integral. The standard trick for ameliorating
this problem is the Hubbard-Stratonovich identity:
r Z
cA2µ c 2 µ
e = dAµ e−cAµ +2cAµ A .
π
The saddle point value of A is A. This gives
Z
−# dn2 2
R R
e = [dA]e−# |(∂−iA)z| .
45
R Rπ R 2π
Finally, let’s think about the measure at each point: d2 nδ(n2 −1)... = 0 sin θdθ 0 dϕ....
cos 2θ eiϕ/2 eiχ/2
iφ1
ρ1 e
Compare this to the integral over zs, parametrized as z = = :
ρ2 eiφ2 sin 2θ e−iϕ/2 eiχ/2
Z Z Y
† 2
dzdz δ(|z| − 1)... = ρm dρm dφm δ(ρ1 + ρ2 − 1)...
Zm=1,2p Z
2 0 θ θ
= c dρρ 1 − ρ dϕdχ... = c sin cos dθdϕdχ...
2 2
R
which is the same as dn except for the extra integral over χ: that’s the gauge di-
rection. The integral over χ is just a number at each point, as long as we integrate
invariant objects (otherwise, it gives zero). Thinking of z as parametrizing an arbitrary
1
normalized spinor z = R(θ, ϕ, χ) , so that R is an arbitrary element of SU(2), we’ve
0
just shown the geometric equivalence between the round S 2 and CP1 = SU(2)/U(1).
Z
2ΛD−2
dD x |(∂−iA)z|2 −λ (|z|2 −1) .
R
† −
ZS 2 ' [dzdz dAdλ]e g2 (11.34)
This is a U(1) gauge theory with N = 2 charged scalars. It is called the CP1 sigma
model. There are two slightly funny things: (1) the first is that the gauge field A lacks a
kinetic term: in the microscopic description we are making here, it is infinitely strongly
coupled. We’ll see what the interactions with matter have to say about the coupling
in the IR. (2) The second funny thing is that the scalars z have a funny interaction
with this field λ which only appears linearly. If we add a λ2 /κ quadratic term, we can
do the lambda integral and find V (|z|2 ) = κ(|z|2 − 1)2 , an ordinary quartic potential
for |z|. This has the effect of replacing the delta function imposition with an energetic
recommendation that |z|2 = 1. This is called a soft constraint, and it shouldn’t change
the universal physics.
46
Large N . This representation allows the introduction of another possible small
parameter, namely the number of components of z. Suppose instead of two components,
it has N
N
X N
|zm |2 = ,
m=1
2
and let’s think about the resulting CPN −1 sigma model (notice that CPN −1 and S N
are different generalizations of S 2 , in the sense that for N → 1 they are both S 2 ):
Z D−2
− dD x 2Λ 2 |(∂−iA)z|2 −λ(|z|2 −N/2)
R
ZCPN −1 = [dzdz † dAdλ]e g
Z
N 1
= [dAdλ]e−N S[A,λ] ' Z 0 e−N S[A,λ] .
The z-integral is gaussian in the representation (11.34) even for N = 2, but the resulting
integrals over A, λ are then horrible, with action
iΛd−2
Z
2
S[A, λ] = tr ln − (∂ − iA) − λ.
g2
The role of large N is to make those integrals well-peaked about their saddle point.
The saddle point equations are solved by A = 0 (though there may sometimes be other
saddles where A 6= 0, which break various discrete symmetries). This leaves us with
Z
i
S[0, λ] = d̄d k ln(k 2 + λ) − 2 V λ
g
(where V is the number of sites, the volume of space, and I’ve assumed constant λ),
which is solved by λ = −iλ satisfying
d̄D k
Z
1
2
= 2.
k +λ g
The solution of this equation depends on the number of dimensions d.
Z Z
1 d̄k 1 d̄k
d=1: = = √ 2 =⇒ λ = g 4 Λ2 .
g2Λ k2 + λ λ k +1
This says that the mass of the excitations is m = g 2 . Where did that come from?
D = 1 means we are studying the quantum mechanics of a particle contrained to move
on CPN −1 :
1 2
H = 2 ∂z ∂z̄ + |z|2 − N/2 .
2g Λ
1
The groundstate is the uniform state hz|groundstatei = Ψ(z) = vol . QM of finite
number of degrees of freedom on a compact space has a gap above the groundstate.
This gap is determined by the kinetic energy and naturally goes like g 2 Λ.
47
d̄2 k
Z
−2 1 λ − 4π
d=2: g =2
= − ln 2 =⇒ λ = Λ2 e g2 .
k +λ 4π Λ
This is the case with asymptotic freedom; here we see again that asymptotic freedom
is accompanied by dimensional transmutation: the interactions have generated a mass
scale
− 2π
m = Λe g2
which is parametrically (in the bare coupling g) smaller than the cutoff.
d̄3 k
Z
Λ p Λ 2 4π
d=3: = =⇒ | λ| = − 2 .
g2 2
k +λ 2 π g
Notice that for d > 3 there is a critical value of g below which there is no solution.
That means symmetry breaking: the saddle point is at λ = m2 = 0, and the z-fields
are gapless Goldstone modes. This doesn’t happen in D ≤ 2. The critical coupling
R D D−2
occurs when gc−2 = d̄k2k ' ΛD−2 . The rate at which the mass goes to zero as g → gc
from above is 2 2
2 2 g − gc2 D−2
m 'Λ .
gc2
This is a universal exponent. (For more on critical exponents from large-N calculations,
see Peskin p. 464-465.)
A quantity we’d like to be able to compute for N = 2 is S +− (x) ≡ hS + (0)S − (x)i.
We can write this in terms of the coherent state variables using the identity
Z
(s + 1)(2s + 1)
S = Ns dn |ňi hň| na , (Ns =
a
).
4π
(Up to the constant factor, this identity follows from SU(2) invariance. The constant
can be checked by looking at a convenient matrix element of the BHS.) Then:
48
Z
e−ikx 1 √
−|x| λ
= d̄d k ' d−1 e .
|k|2 + λ |x| 2
0 x>ξ
This says that the correlation length for the spins in S m6=m (x) ' 1
|x|d−1
e−ξ|x| is ξ = √
1
2 λ
2
depends variously on d. In D = 1, it is ξ = Λ/g , so large-N predicts a gap, growing
2π
−1 + g2
with g. In D = 2, the correlation length is ξ = Λ e In D = 3, the correlation
−1
length diverges as g → gc ξ = Λ−1 2
π
− 4π
g2
, signaling the presence of gapless modes,
which we interpret as Goldstones.
0
Exercise. Check that the other components of the spin such as S z = |z m |2 − |z m |2
have the same falloff, as they must by SU(N ) symmetry.
A dynamical gauge field emerges. Finally, let me show you that a gauge field
emerges. Let’s expand the action Seff [A, λ] about the saddle point at A = 0, λ = λ ≡
m2 :
S[A = 0 + a, λ = m2 + v] = W0 + W0 +W2 + O(δ 3 )
|{z}
=0by def
49
where we regard F = dA as a two-form. This is the 2d theta term, analogous to
R
F ∧ F in D = 4 in that F = dA is locally a total derivative, it doesn’t affect the
equations of motion, and it integrates to an integer on smooth configurations (we will
show this when we study anomalies). This integer is called the Chern number of the
gauge field configuration. What integer is it? On the homework you’ll show that
F ∝ abc na dnb dnc . It’s the skyrmion number! So the coefficient is θ = 2πs.
I think it will help to bring home some of the previous ideas by rederiving them using
diagrams in a familiar context. So let’s study the O(N ) model:
1 g m2
~ · ∂ϕ
L = ∂ϕ ~+ (~ ~ )2 +
ϕ·ϕ ~ ·ϕ
ϕ ~. (11.37)
2 4N 2
Let’s do euclidean spacetime, D dimensions. The bare propagator is
e−ikx
Z Z
D
hϕb (x)ϕa (0)i = d̄ k 2 ≡ d̄D k ∆0 (k).
k + m2
The bare vertex is − 2g
N
(δab δcd + δac δbd + δad δbc ). With this normalization, the leading
correction to the propagator is
Z Z
g d̄q N 1
=− (4N + 8)δab ' −gδab d̄q∆0 (q)
4N q 2 + m2
of order N 0 . This is the motivation for the normalization of the coupling in (11.37).
Which diagrams dominate at large N (and fixed g)? Compare two diagrams at
the same order in λ with different topology of the index flow: eyeball and
cactus . The former has one index loop, and the latter has two, and therefore
dominates. The general pattern is that: at large N cacti dominate the 1PI self-energy.
Each extra pod we add to the cactus costs a factor of g/N but gains an index loop N .
So the sum of cacti is a function of gN 0 .
The full propagator, by the usual geometric series, is then
1
∆F (k) = . (11.38)
k2 + m2 + Σ(k)
We can sum all the cacti by noticing that cacti are self-similar: if we replace ∆0 by ∆F
in the propagator: Z
Σ(p) = g d̄D k∆F (k) + O(1/N ). (11.39)
50
The equations (11.38), (11.39) are integral equations for ∆F ; they are called Schwinger-
Dyson equations,
OK, now notice the p-dependence in (11.39): the RHS is independent of p to leading
order in N , so Σ(p) = δm2 is just a mass shift.
Look at the position-space propagator
Z
hϕb (x)ϕa (y)i = δab d̄D ke−ik(x−y) ∆F (k). (11.40)
Let
ϕ2
P
2 a ϕa (x)ϕa (x)
y ≡ = ;
N N
it is independent of x by translation invariance. Now let y → x in (11.40):
Z
(11.39)
y = d̄D k∆F (k) = g −1 Σ.
2
where I are any invariants of the large-N group (i.e. O(N ) in the O(N ) model (nat-
urally) and SU(N ) in the CPN −1 model), and h...i denotes either euclidean vacuum
expectation value or time-ordered vacuum expectation value. Consider, for example,
in the O(N ) model, normalized as above
2
ϕ (x) ϕ2 (y)
.
N N
In the free theory, g = 0, there are two diagrams
2
ϕ (x) ϕ2 (y)
+ O N −1
= =
N N free
51
– the disconnected diagram dominates, because it has one more index loop and the
same number of interactions (zero). With interactions, representative diagrams are
The basic statement is that mean field theory works for singlets. At large N , the
entanglement follows the flavor lines.
We can still ask: what processes dominate the connected (small) bit at large N ?
And what about non-singlet operators? Consider (no sum on b, a):
Gb64,c
=a
+ O N −2
= hϕb (p4 )ϕb (p3 )ϕa (p2 )ϕa (p1 )i =
The answer is: bubbles. More specifically chains of bubbles, propagating in the s-
channel. What’s special about the s-channel, here? It’s the channel in which we can
but as you can see, these go like N −2 . However, bubbles can have cactuses growing on
them, like this: To sum all of these, we just use the full propagator in the
internal lines of the bubbles, ∆0 → ∆F .
I claim that the bubble sum is a geometric series:
2 g
Gb64,c
=a
= − (∆0 (external))4 + O N −2
(11.41)
N 1 + gL(p1 + p2 )
where L is the loop integral L(p) ≡ d̄D k∆F (k)∆F (p + k). You can see this by being
R
g 2 1 2
= ∆0 (external)4 ·2·4·8· L = ∆0 (external)4 (g)2 L.
4N 2!
|{z} N
Dyson
52
2 3 2
Similarly, the chain of two bubbles is N
g L, etc.
Here’s how we knew this had to work without worrying about the damn symmetry
factors: the bubble chain is the σ propagator! At the saddle, σ ' ϕa ϕa , which is
what is going in and out of this amplitude. And the effective action for sigma (after
integrating out ϕ) is
Z 2
σ
+ tr ln ∂ 2 + m2 + σ .
Seff [σ] =
g
The connected two-point function means we subract of hσi hσi, which is the same as
considering the two point function of the deviation from saddle value. This is
−1 !−1
δ2
1
hσ1 σ2 i = Seff [σ] = 2
δσ1 δσ2 g −1 + 2 1 2 ∂ +m +σ
Two comments: (1) We were pretty brash in integrating out all the ϕ variables and
keeping the σ variable: how do we know which are the slow ones and which are the
fast ones? This sort of non-Wilsonian strategy is common in the literature on large-N ,
where physicists are so excited to see an integral that they can actually do that they
don’t pause to worry about slow and fast. But if we did run afoul of Wilson, at least
we’ll know it, because the action for σ will be nonlocal.
(2) σ ∼ ϕ2 is a composite operator. Nevertheless, the sigma propagator we’ve just
derived can have poles at some p2 = m2 (likely with complex m). These would produce
particle-like resonances in a scattering experiment (such as 2 − 2 scattering of ϕs of the
same flavor) which involved sigmas propagating in the s-channel. Who is to say what
is fundamental.
Now that you believe me, look again at (11.41); it is of the form
2
b6=a
= − (∆0 (external))4 geff (p1 + p2 ) + O N −2
G4,c
N
where now
g
geff (p) = R D
1 + g d̄ k∆F (k)∆F (p + k))
is a momentum-dependent effective coupling, just like one dreams of when talking
about the RG.
53
11.4 Coherent state path integrals for fermions
We’ll need these for our discussion of anomalies, and if we ever get to perturbative
QCD (which differs from Yang-Mills theory by the addition of fermionic quarks).
[Shankar, Principles of QM, path integrals revisited. In this chapter of his great
QM textbook, Shankar sneaks in lots of insights useful for modern condensed matter
physics]
Consider the algebra of a single fermion mode operator15 :
H = c† c (ω0 − µ)
(ω0 and µ are (redundant when there is only one mode) constants). This algebra is
represented on a two-state system |1i = c† |0i. We might be interested in its thermal
partition function
H
Z = tr e− T .
ω0 −µ
(In this example, it happens to equal Z = 1 + e− T , as you can see by computing the
trace in the eigenbasis of n = c† c. But never mind that; the one mode is a proxy for
many, where it’s not quite so easy to sum.) How do we trotterize this? That is, what
is ‘the’ corresponding classical system? (One answer is to use the (0d) Jordan-Wigner
map which relates spins and fermions. Perhaps more about that later. Here’s another,
different, answer.) We can do the Trotterizing using any resolution of the identity on
H, so there can be many very-different-looking answers to this question.
Let’s define coherent states for fermionic operators:
Here ψ is a c-number (not an operator), but acting twice with c we see that we must
have ψ 2 = 0. ψ is a grassmann number. These satisfy
– they anticommute with each other and with fermionic operators, and commute with
ordinary numbers and bosons. They seem weird but they are easy. We’ll need to
15
For many modes,
{ci , cj } = 0, {c†j , c†j } = 0, {cj , c†j } = 1δij .
54
consider multiple grassmann numbers when we have more than one fermion mode,
where {c1 , c2 } = 0 will require that they anticommute {ψ1 , ψ2 } = 0 (as in the def-
inition (11.43)); note that we will be simultaneously diagonalizing operators which
anticommute.
The solution to equation (11.42) is very simple:
where as above |0i is the empty state (c |0i = 0) and |1i = c† |0i is the filled state.
(Check: c |ψi = c |0i − cψ |1i = +ψc |1i = ψ |0i = ψ |ψi .)
Similarly, the left-eigenvector of the creation operator is
†
ψ̄ c = ψ̄ ψ̄, ψ̄ = h0| − h1| ψ̄ = h0| + ψ̄ h1| .
Notice that these states are weird in that they are elements of an enlarged hilbert space
with grassmann coefficients (usually we just allow complex numbers). Also, ψ̄ is not
the complex conjugate of ψ and ψ̄ is not the adjoint of |ψi. Rather, their overlap is
f (ψ) = f0 + f1 ψ .
With more than one grassmann we have to worry about the order:
Z Z
1 = ψ̄ψdψdψ̄ = − ψ̄ψdψ̄dψ.
55
A11 A12 · · · ψ
... .1
Here ψ̄ · A · ψ ≡ ψ̄1 , · · · , ψ̄M A21
· · · .. . One way to get this expression
.. .. . . ψM
. . .
is to change variables to diagonalize the matrix A.
ψ̄ψe−aψ̄ψ dψ̄dψ
R
1
ψ̄ψ ≡ R = − = − ψ ψ̄ .
e−aψ̄ψ dψ̄dψ a
P
If for many grassman variables we use the action S = i ai ψ̄i ψi (diagonalize A
above) then
δij
ψ̄i ψj = ≡ hīji (11.44)
ai
and Wick’s theorem here is
ψ̄i ψ̄j ψk ψl = hīli hj̄ki − hīki hj̄li .
(Note the minus sign; it will lead to a deep statement.) So the partition function is:
Z
H
Z = dψ̄0 dψ0 e−ψ̄0 ψ0 −ψ̄0 e− T
|{z} |ψ0 i
=(1 − ∆τ H) · · · (1 − ∆τ H)
| {z }
M times
Because of the −ψ̄ in (11.45), to get this nice expression we had to define an extra
letter
ψ̄M = −ψ̄0 , ψM = −ψ0 (11.46)
so we could replace −ψ̄0 = ψ̄M . [End of Lecture 49]
56
Now we use the coherent state property to turn the matrix elements into grassmann-
valued functions:
∆τ →0
ψ̄l+1 1 − ∆τ H(c† , c) |ψl i = ψ̄l+1 1 − ∆τ H(ψ̄l+1 , ψl ) |ψl i = eψ̄l+1 ψl e−∆τ H(ψ̄l+1 ,ψl ) .
It was important that in H all cs were to the right of all c† s, i.e. that H was normal
ordered.)
So we have
−1
Z MY
Z = dψ̄l dψl e−ψ̄l ψl eψ̄l+1 ψl e−∆τ H(ψ̄l+1 ,ψl )
l=0
−1
Z MY
ψ̄l+1 − ψ̄l
= dψ̄l dψl exp
∆τ
∆τ ψl − H(ψ̄l+1 , ψl )
l=0 | {z }
=∂τ ψ̄
Z Z 1/T ! Z
' [Dψ̄Dψ] exp dτ ψ̄(τ ) (−∂τ − ω0 + µ) ψ(τ ) = [Dψ̄Dψ]e−S[ψ̄,ψ] . (11.47)
0
Points to note:
• The APBCs (11.46) on ψ(τ + T1 ) = −ψ(τ ) mean that in its fourier representation16
X X
ψ(τ ) = T ψ(ω)e−iωn τ , ψ̄(τ ) = T ψ̄(ω)eiωn τ (11.48)
n n
ωn = (2n + 1)πT, n ∈ Z
• The measure [Dψ̄Dψ] is defined by this equation, just as in the bosonic path
integral.
16
ψ̄ is still not the complex conjugate of ψ but the relative sign is convenient.
57
• The derivative of a grassmann function is also defined by this equation; note that
ψl+1 − ψl is not ‘small’ in any sense.
• In the last step we integrated by parts, i.e. relabeled terms in the sum, so
X X X X X X
ψ̄l+1 − ψ̄l ψl = ψ̄l+1 ψl − ψ̄l ψl = ψ̄l0 ψl−1 − ψ̄l ψl = − ψ̄l (ψl − ψl−1 ) .
l l l l0 =l−1 l l
Note that no grassmanns were moved through each other in this process.
The punchline of this discussion for now is that the euclidean action is
Z
S[ψ̄, ψ] = dτ ψ̄∂τ ψ + H(ψ̄, ψ) .
The first-order kinetic term we’ve found ψ̄∂τ ψ is sometimes called a ‘Berry phase term’.
Note the funny-looking sign.
Continuum limit warning (about the red ' in (11.47)). The Berry phase term
is actually
N
X −1 X
ψ̄(ωn ) 1 − eiωn τ ψ(ωn )
ψ̄l+1 (ψl+1 − ψl ) = T
l=0 ωn
1 − eiωn τ → iωn τ.
ωn τ 1
for all ωn which matter. Which ωn contribute? I claim that if we use a reasonable
H = Hquadratic +Hint , reasonable quantities like Z, O† O , are dominated by ωn τ −1 .
There’s more we can learn from what we’ve done here that I don’t want to pass up.
Let’s use this formalism to compute the fermion density at T = 0:
1 −H/T †
hNi = tre c c.
Z
This is an example where the annoying ∆τ s in the path integral not only matter, but
are extremely friendly to us.
Frequency space, T → 0.
58
Let’s change variables to frequency-space fields, which diagonalize S. The Jacobian
is 1 (since fourier transform is unitary):
T →0
Y
Dψ̄(τ )Dψ(τ ) = dψ̄(ωn )dψ(ωn ) → Dψ̄(ω)Dψ(ω).
n
(This is the same fact as V k 7→ d̄d k in the thermodynamic limit.) So the zero-
P R
Using the gaussian-integral formula (11.44) you can see that the propagator for ψ is
δω1 ,ω2 2π
ψ̄(ω1 )ψ(ω2 ) = . (11.49)
T }
| {z iω1 − ω0 + µ
T →0
→ δ(ω1 −ω2 )
2π/T
In particular ψ̄(ω)ψ(ω) = iω−ω0 +µ
. δ(ω = 0) = 1/T is the ‘volume’ of the time
direction.
Back to the number density. Using the same strategy as above, we have
Z −1+1
MY −1
MY
1 −ψ̄l ψl
ψ̄l+1 |(1 − ∆τ H(c† c))|ψl ψ̄N +1 c† c |ψN i
hNi = dψ̄l dψl e ,
Z l=0 l=1
| {z }
=ψ̄N +1 ψN =ψ̄(τN +∆τ )ψ(τN )
where τN is any of the time steps. This formula has a built-in point-splitting of the
operators!
Z
1
hNi = Dψ̄Dψ e−S[ψ̄,ψ] ψ̄(τN + ∆τ )ψ(τN )
Z
Z ∞
eiω∆τ
= d̄ω = θ(µ − ω0 ). (11.50)
−∞ iω − ω0 + µ
59
Which is the right answer: the mode is occupied in the groundstate only if ω0 < µ.
In the last step we used the fact that ∆τ > 0 to close the contour in the UHP; so
we only pick up the pole if it is in the UHP. Notice that this quantity is very UV
R Λ dω
sensitive: if we put a frequency cutoff on the integral, ω
∼ log Λ, the integral
diverges logarithmically. For most calculations the ∆τ can be ignored, but here it told
us the right way to treat the divergence. 17
Where do topological terms come from? [Abanov ch 7] Here is a quick ap-
plication of fermionic path integrals related to the previous subsection §11.3. Consider
a 0+1 dimensional model of spinful fermions cα , α =↑, ↓ coupled to a single spin s, ~S.
Let’s couple them in an SU(2)-invariant way:
HK = M c†~σ c · ~S
by coupling the spin of the fermion c†α~σαβ cβ to the spin. ‘K’ is for ‘Kondo’. Notice
that M is an energy scale. (Ex: find the spectrum of HK .)
Now apply both of the previous coherent state path integrals that we’ve learned to
write the (say euclidean) partition sum as
Z RT
Z = [DψDψ̄D~n]e−S0 [n]− 0 dtψ̄(∂t −M~n·~σ)ψ
17
The calculation between the first and second lines of (11.50) is familiar to us – it is a single Wick
contraction, and can be described as a feynman diagram with one line between the two insertions.
More prosaically, it is
eiωn ∆τ eiω∆τ
Z
60
with
Seff [~n] = S0 [~n] − log det (∂t − M~n · ~σ ) ≡ − log det D.
The variation of the effective action under a variation of ~n is:
−1 † † −1
δSeff = −tr δDD = −tr δDD DD
61
11.5 Coherent state path integrals for bosons
[Wen §3.3] We can do the same thing for bosons, using ordinary SHO (simple harmonic
oscillator) coherent states. What I mean by ‘bosons’ is a many-body system whose
Hilbert space can be written as H = ⊗k Hk where k is a label (could be real space,
could be momentum space) and
1 † 2
Hk = span{|0ik , a†k |0ik , a |0ik , ...} = span{|ni~k , n = 0, 1, 2...}
2 k
is the SHO Hilbert space. Assume the modes satisfy
[a~k , a~†k0 ] = δ d (~k − ~k 0 ).
The object ~k − µ determines the energy of the state with one boson of momentum
~k: a† |0i. The chemical potential µ shifts the energy of any state by an amount
~k
proportional to * +
X †
a~k a~k = N
~k
So, as Jonathan Lam points out, a better name for these would be Hilbert hotel states.
19
Check this by expanding the coherent states in the number basis and doing the integrals
dθ i(n−n0 )θ ∞
Z 2π
dφdφ? −φφ? n ? n0
Z Z
n+n0
e φ (φ ) = e due−u u 2
π 0 2π 0
P
to get 1 = n |ni hn|.
62
2
If we choose N = e−|φ| /2 , they are normalized, but it is more convenient to set N = 1.
The overcompleteness relation on Hk is
dφdφ? −|φ|2
Z
1k = e |φi hφ| .
π
It will be convenient to arrange all our operators into sums of normal-ordered operators:
with all annihilation operators to the right of all creation operators. Coherent state
expectation values of such operators can be built from the monomials
Y † Mk Y
hφ| ak (ak )Nk |φi = (φ?k )Mk (φk )Nk .
k k
Z = tre−H/T
Z N PN
dφl e− l=0 (φl+1 (φl+1 −φl )−∆τ H(φl+1 φl ))
?
Y
=
ZφN +1 =φ0 l=0 R 1/T ? ?
' [Dφ] e− 0 dτ (φ ∂τ φ+H(φ ,φ)) . (11.52)
φ(0)=φ(1/T )
~ ~2
dD−1 xeik·~x ψ(~x), Taylor expanding ~k − µ = −µ + 2m
k
R
In real space a~k ≡ + O(k 4 ), this
is Z
1 1 ~ ? ~
Z = [Dψ]e d ~xdt( 2 (ψ ∂t ψ−ψ∂t ψ )− 2m ∇ψ ·∇ψ−µψ ψ) .
R d ? ? ?
Real time. If you are interested in real-time propagation, rather than euclidean
time, just replace the euclidean propagator e−τ H 7→ e−itH . The result, for example, for
the amplitude to propagate from one bose coherent state to another is
Z φ(tf )=φf R tf
i ? ?
hφf , tf | e −itH
|ψ0 , t0 i = Dφ? Dφ e ~ t0 dt(i~φ ∂t φ−H(φ,φ )) .
φ(t0 )=φ0
Note a distinguishing feature of the Berry phase term that it produces a complex term
in the real-time action.
63
This is the non-relativistic field theory we found in 215A by taking the E m
limit of a relativistic scalar field. Notice that the field ψ is actually the coherent state
eigenvalue!
If instead we had an interaction term in H, say ∆H = dd x dd y 21 ψ ? (x, t)ψ(x, t)V (x−
R R
y)ψ ? (y, t)ψ(y, t), it would lead to a term in the path integral action
Z Z Z
1
Si = − dt d x dd y ψ ? (x, t)ψ(x, t)V (x − y)ψ ? (y, t)ψ(y, t) .
d
2
In the special case V (x − y) = V (x)δ d (x − y), this is the local quartic interaction we
considered briefly earlier.
Non-relativistic scalar fields
[Zee §III.5, V.1, Kaplan nucl-th/0510023 §1.2.1] In the previous discussion of the
EFT for a superconductor (at the end of 215B), I spoke as if the complex scalar were
relativistic.
In superconducting materials, it is generally not. In real superconductors, at least.
How should we think about a non-relativistic field? A simple answer comes from
realizing that a relativistic field which can make a boson of mass m can certainly make
a boson of mass m which is moving slowly, with v c. By taking a limit of the
relativistic model, then, we can make a description which is useful for describing the
interactions of an indefinite number of bosons moving slowly in some Lorentz frame.
A situation that calls for such a description is a large collection of 4 He atoms.
In particular, we have
φ̇2 ' m2 φ2
and the BHS of this equation is large. To remove this large number let’s change
variables:
1 −imt
φ(x, t) ≡ √ e Φ(x, t) +h.c. .
2m | {z }
complex,Φ̇mΦ
64
Notice that Φ is complex, even if φ is real.
Let’s think about the action governing this NR sector of the theory. We can drop
terms with unequal numbers of Φ and Φ? since such terms would come with a factor
of eimt which gives zero when integrated over time. Starting from (∂φ)2 − m2 φ2 − λφ4
we get: !
~2
∇
Lreal time = Φ? i∂t + Φ − g 2 (Φ? Φ)2 + ... (11.53)
2m
λ
with g 2 = 4m2
.
Notice that Φ is a complex field and its action has a U(1) symmetry, Φ → eiα Φ,
even though the full theory did not. The associated conserved charge is the number of
particles:
i
j0 = Φ? Φ, ji = (Φ? ∂i Φ − ∂i Φ? Φ) , ∂t j0 − ∇ · ~j = 0 .
2m
Notice that the ‘mass term’ Φ? Φ is then actually the chemical potential term, which
encourages a nonzero density of particles to be present.
This is an example of an emergent symmetry: a symmetry of an EFT that is not
a symmetry of the microscopic theory. The ... in (11.53) include terms which break
this symmetry, but they are irrelevant. (Particle physics folks sometimes call such a
symmetry ‘accidental’, which is a terrible name. An example of an emergent symmetry
in the Standard Model is baryon number.)
To see more precisely what we mean by irrelevant, let’s think about scaling. To
keep this kinetic term fixed we must scale time and space differently:
x → x̃ = sx, t → t̃ = s2 t, Φ → Φ̃(x̃, t̃) = ζΦ(sx, s2 t) .
A fixed point with this scaling rule has dynamical exponent z = 2. The scaling of the
bare action (with no mode elimination step) is
!
(0)
Z ∇~2 2
d ? 2 2 2 ? 2
SE = dtd ~
x Φ sx, s t ∂ − Φ(sx, s t) − g Φ Φ(sx, s t) + ...
t
2m
| {z }
=sd+z dt̃dd x̃ | {z }
˜~2
=s−2 ∂˜t − ∇
2m
~˜ 2
! !
∇
Z 2
= sd+z−2 ζ −2 dt̃dd x̃ Φ̃? ∂˜t − Φ̃ − ζ −2 g 2 Φ̃? Φ̃(x̃, t̃) + ... (11.54)
| {z } 2m
!
=1 =⇒ ζ=s−3/2
From this we learn that g̃ = s−3+2=−1 g → 0 in the IR – the quartic term is irrelevant
in D = d + 1 = 3 + 1 with nonrelativistic scaling! Where does it become marginal?
Recall the delta function potential for a particle in two dimensions.
65
Number and phase angle. In the NR theory, the canonical momentum for Φ is
just ∂∂LΦ̇ ∼ Φ? , with no derivatives. This statement becomes more shocking if we change
√
variables to Φ = ρeiθ . This is a useful change of variables, if for example we knew
ρ didn’t want to be zero, as would happen if we add to (11.53) a term of the form
−µΦ? Φ. So consider the action density
!
∇~2
L = Lreal time = Φ? i∂t + Φ − V (Φ? Φ), V (Φ? Φ) ≡ g 2 (Φ? Φ)2 − µΦ? Φ.
2m
In polar coordinates this is
i 1 2 1 2
L = ∂t ρ − ρ∂t θ − ρ (∇θ) + (∇ρ) − V (ρ). (11.55)
2 2m 4ρ
The first term is a total derivative. The second term says that the canonical momentum
for the phase variable θ is ρ = Φ? Φ = j0 , the particle number density. Quantumly,
then:
[ρ̂(~x, t), θ̂(~x0 , t)] = iδ d (~x − ~x0 ). (11.56)
Number and phase are canonically conjugate variables. If we fix the phase, the ampli-
tude is maximally uncertain.
R
If we integrate over space, N ≡ dd xρ(~x, t) gives the total number of particles,
which is time independent, and satisfies [N, θ] = i.
What is the term µΦ? Φ = µρ? It is a chemical potential for the boson number.
This relation (11.56) explains why there’s no Higgs boson in most non-relativistic
superconductors and superfluids (in the absence of some extra assumption of particle-
hole symmetry). In the NR theory with first order time derivative, the would-be ampli-
tude mode which oscillates about the minimum of V (ρ) is actually just the conjugate
momentum for the goldstone boson!
Superfluids. [Zee §V.1] Let me amplify the previous remark. A superconductor is
just a superfluid coupled to an external U(1) gauge field, so we’ve already understood
something about superfluids.
The effective field theory has the basic lagrangian (11.55), with hρi = ρ̄ 6= 0. This
nonzero density can be accomplished by adding an appropriate chemical potential to
(11.55); up to an uninteresting constant, this is
i 1 2 1
L = ∂t ρ − ρ∂t θ − ρ (∇θ) + (∇ρ) − g 2 (ρ − ρ̄)2 .
2
2 2m 4ρ
√ √ √
Expand around such a condensed state in small fluctuations ρ = ρ̄+h, h ρ̄:
√ ρ̄ ~ 2 1 ~ 2
L = −2 ρ̄h∂t θ − ∇θ − ∇h − 4g 2 ρ̄h2 + ...
2m 2m
66
Notice that h, the fluctuation of the amplitude mode, is playing the role of the canonical
momentum of the goldstone mode θ. The effects of the fluctuations can be incorporated
by doing the gaussian integral over h (What suppresses self-interactions of h?), and
the result is
√ 1 √ ρ̄ ~ 2
L = ρ̄∂t θ ∇2
ρ̄∂t θ − ∇θ
4g 2 ρ̄ − 2m 2m
1 ρ̄
= 2 (∂t θ)2 − (∇θ)2 + ... (11.57)
4g 2m
where in the second line we are expanding in the small wavenumber k of the modes,
p is, we are constructing an action for Goldstone modes whose wavenumber is k
that
9g 2 ρ̄m so we can ignore higher gradient terms.
The linearly dispersing mode in this superfluid that we have found is sometimes
called the phonon. This is a good name because the wave involves oscillations of the
density:
1 √
h= ∇ 2 ρ̄∂t θ
4g 2 ρ̄ − 2m
is the saddle point solution for h. The phonon has dispersion relation
2g 2 ρ̄ ~ 2
ω2 = k .
m
This
p mode has an emergent Lorentz symmetry with a lightcone with velocity vc =
g 2ρ̄/m. The fact that the sound velocity involves g – which determined the steepness
of the walls of the wine-bottle potential – is a consequence of the non-relativistic dis-
2
persion of the bosons. In the relativistic theory, we have L = ∂µ Φ? ∂ µ Φ − g (Φ? Φ − v 2 )
and we can take g → ∞ fixing v and still get a linearly dispersing mode by plugging
in Φ = eiθ v.
The importance of the linearly dispersing phonon mode of the superfluid is that
there is no other low energy excitation of the fluid. With a classical pile of (e.g. non
interacting) bosons, a chunk of moving fluid can donate some small momentum ~k to a
~k)2
single boson at energy cost (~2m . A quadratic dispersion means more modes at small
dk
k than a linear dispersion (the density of states is N (E) ∝ k D−1 dE ). With only a
linearly dispersing mode at low energies, there is a critical velocity below which a
non-relativistic chunk of fluid cannot give up any momentum [Landau]: conserving
momentum M~v = M~v 0 + ~~k says the change in energy (which must be negative for
this to happen on its own) is
1 1 (~k)2 (~k)2
M (v 0 )2 + ~ω(k) − M v 2 = −~kv + + ~ω(k) = (−v + vc )~k + .
2 2 2m 2m
For small k, this is only negative when v > vc = ∂k ω|k=0 .
67
You can ask: an ordinary liquid also has a linearly dispersing sound mode; why
doesn’t Landau’s argument mean that it has superfluid flow? The answer is that it has
other modes with softer dispersion (so more contribution at low energies), in particular
diffusion modes, with ω ∝ k 2 (there is an important factor of i in there).
The Goldstone boson has a compact target space, θ(x) ≡ θ(x) + 2π, since, after all,
it is the phase of the boson field. This is significant because it means that as the phase
wanders around in space, it can come back to its initial value after going around the
circle – such a loop encloses a vortex. Somewhere inside, we must have Φ = 0. There
is much more to say about this.
68
12 Anomalies
[Zee §IV.7; Polyakov, Gauge Fields and Strings, §6.3; K. Fujikawa, Phys. Rev. Lett. 42
(1979) 1195; Argyres, 1996 lectures on supersymmetry §14.3; Peskin, chapter 19]
Topology means the study of quantities which can’t vary smoothly, but can only
jump. Like quantities which must be integers. But the Wilson RG is a smooth process.
Therefore topological information in a QFT is something the RG can’t wash away –
information which is RG invariant. An example we’ve seen already is the integer
coefficients of WZW terms, which encode commutation relations. Another class of
examples (in fact they are related) is anomalies.
Suppose we have in our hands a classical field theory in the continuum which
has some symmetry. Is there a well-defined QFT whose classical limit produces this
classical field theory and preserves that symmetry? The path integral construction of
QFT offers some insight here. The path integral involves two ingredients: (1) an action,
which is shared with the classical field theory, and (2) a path integral measure. It is
possible that the action is invariant but the measure is not. This is called an anomaly.
It means that the symmetry is broken, and its current conservation is violated by a
known amount, and this often has many other consequences that can be understood
by humans. [End of Lecture 51]
Notice that here I am speaking about actual, global symmetries. I am not talking
about gauge redundancies. If you think that two field configurations are equivalent but
the path integral tells you that they would give different contributions, you are doing
something wrong. An anomaly in a ‘gauge symmetry’ means that the system has more
degrees of freedom than you thought. (In particular, it does not mean that the world
is inconsistent. For a clear discussion of this, please see Preskill, 1990.)
We have already seen a dramatic example of an anomaly: the violation of classical
scale invariance (e.g. in massless φ4 theory, or in massless QED) by quantum effects.
Notice that the name ‘anomaly’ betrays the bias that we construct a QFT by
starting with a continuum action for a classical field theory; you would never imagine
that e.g. scale invariance was an exact symmetry if you started from a well-defined
quantum lattice model.
The example we will focus on here is the chiral anomaly. This is an equation for the
violation of the chiral (aka axial) current for fermions coupled to a background gauge
field. The chiral anomaly was first discovered in perturbation theory, by computing
a certain Feynman diagram with a triangle; the calculation was motivated by the
experimental observation of the process π 0 → γγ, which would not happen if the chiral
current were conserved.
69
I will outline a derivation of this effect which is more illuminating than the triangle
diagram. It shows that the one-loop result is exact – there are no other corrections.
It shows that the quantity on the right hand side of the continuity equation for the
would-be current integrates to an integer. It gives a proof of the index theorem, relating
numbers of solutions to the Dirac equation in a background field configuration to a
certain integral of field strengths. It butters your toast.
Some more explicit words about chiral fermions in D = 3 + 1, mostly notation. Re-
call Peskin’s Weyl basis of gamma matrices in 3+1 dimensions, in which γ 5 is diagonal:
0 σ̄ µ
µ µ µ µ µ 5 1 0
γ = , σ ≡ (1, σ
~ ) , σ̄ ≡ (1, −~ σ) , γ = .
σµ 0 0 −1
This makes the reducibility of the Dirac representation of SO(3, 1) manifest, since the
Lorentz generators are ∝ [γ µ , γ ν ] block diagonal in this basis. The gammas are a map
from the (1, 2R ) representation to the (2L , 1) representation. It is sometimes useful to
denote the 2R indices by α, β = 1, 2 and the 2L indices by α̇, β̇ = 1, 2. Then we can
define two-component Weyl spinors ψL/R = PL/R ψ ≡ 12 (1 ± γ 5 ) ψ by simply forgetting
70
about the other two components. The conjugate of a L spinor χ = ψL (L means
γ 5 χ = χ) is right-handed:
We can represent any system of Dirac fermions in terms of a collection of twice as many
Weyl fermions.
For a continuous symmetry G, we can be more explicit about the meaning of a
complex representation. The statement that ψ is in representation r means that its
transformation law is
δψa = iA tA
r ab ψb
where tA , A = 1.. dim G are generators of G in representation r; for a compact lie group
G, we may take the tA to be Hermitian. The conjugate representation, by definition,
is one with which you can make a singlet of G – it’s the way ψ ?T transforms:
T ?T
δψa?T = −iA tA r ab ψb .
So:
T
tA A
r̄ = − tr .
The condition for a complex representation is that this is different from tA
r (actually
we have to allow for relabelling of the generators). The simplest case is G = U(1),
where t is just a number indicating the charge. In that case, any nonzero charge gives
a complex representation.
Consider the effective action produced by integrating out Dirac fermions coupled
to a background gauge field (the gauge field is just going to sit there for this whole
calculation): Z
eiSeff [A] ≡ [DψDψ̄] eiS[ψ,ψ̄,A] .
We must specify how the fermions coupled to the gauge field. The simplest example is
if A is a U (1) gauge field and ψ is minimally coupled:
Z
S[ψ, ψ̄, A] = dD xψ̄iDψ,
/ / ≡ γ µ (∂µ + iAµ ) ψ.
Dψ
We will focus on this example, but you could imagine instead that Aµ is a non-
Abelian gauge field for the group G, and ψ is in a representation R, with gauge gener-
ators T A (R) (A = 1...dimG), so the coupling would be
71
Much of the discussion below applies for any even D.
In the absence of a mass term, the action (in the Weyl basis) involves no coupling
between L and R:
Z
S[ψ, ψ̄, A] = dD x ψL† iσ µ Dµ ψL + ψR† iσ̄ µ Dµ ψR
notice that the mass parameter is complex.) The associated Noether current is jµ5 =
?
ψ̄γ̄ 5 γµ ψ, and it seems like we should have ∂ µ jµ5 = 0. This follows from the massless
(classical) Dirac equation 0 = γ µ ∂µ ψ. (With the mass term, we would have instead
?
∂ µ jµ5 = 2iψ̄ (Remγ 5 + Imm) ψ. )
Notice that there is another current j µ = ψ̄γ µ ψ. j µ is the current which is coupled
to the gauge field, L 3 Aµ j µ . The conservation of this current is required for gauge
invariance of the effective action
D R E
! i λ(x)∂µ j µ
Seff [Aµ ] = Seff [Aµ + ∂µ λ] ∼ log e + Seff [Aµ ].
No matter what happens we can’t find an anomaly in j µ . The anomalous one is the
other one, the axial current.
To derive the conservation law we can use the Noether method. This amounts to
5
substituting ψ 0 (x) ≡ eiα(x)γ ψ(x) into the action:
Z Z Z
0 D +iαγ 5 iαγ 5 D 5
IBP
SF [ψ ] = d xψ̄e iDe
/ ψ = d x ψ̄iDψ ∂ α) ψ = SF [ψ]−i α(x)∂ µ trψ̄γ 5 γµ ψ.
/ + ψ̄iγ (/
Then we can completely get rid of α(x) if we can change integration variables, i.e. if
?
[Dψ 0 ] = [Dψ]. Usually this is true, but here we pick up an interesting Jacobian.
Claim:
Z Z
iSF [ψ 0 ] dD xα(x)(∂µ j5µ −A(x))
R
0 0
e iSeff [A]
= [Dψ Dψ̄ ]e = [DψDψ̄]eiSF [ψ]+
where X
A(x) = trξ¯n γ 5 ξn (12.2)
n
72
where ξn are a basis of eigenspinors of the Dirac operator. The contribution to A can
be attributed to zeromodes of the Dirac operator.
The expression above is actually independent of α, since the path integral is in-
variant under a change of variables. For a conserved current, α would multiply the
divergence of the current and this demand would imply current conservation. Here
this implies that instead of current conservation we have a specific violation of the
current:
∂ µ jµ5 = A(x).
What is the anomaly. [Polyakov §6.3] An alternative useful (perhaps more ef-
ficient) perspective is that the anomaly arises from trying to define the axial current
operator, which after all is a composite operator. Thus we should try to compute
For a while the discussion works in any even dimension, where γ 5 = D−1 µ
Q
µ=0 γ satisfies
{γ µ , γ 5 } = 0 and is not the identity. (The discussion that follows actually works also
for non-Abelian gauge fields.) The classical Dirac equation immediately implies that
the axial current is conserved
?
∂µ iψ̄γ µ γ 5 ψ = 0.
=
= −iTr γ γµ γ 5 G[A] (x, x) (12.3)
where G is the Green’s function of the Dirac operator in the gauge field background
(and the figure is from Polyakov’s book). We can construct it out of eigenfunctions of
iD:
/ ←
iDξ
/ n (x) = n ξn (x), ξ¯n (x)iγ µ − ∂ µ + iAµ = n ξ¯n (12.4)
73
in terms of which20 X 1
G(x, x0 ) = ξn (x)ξ¯n (x0 ). (12.5)
n
n
(I am suppressing spinor indices all over the place, note that here we are taking the
outer product of the spinors.)
We want to define the coincidence limit, as x0 → x. The problem with this limit
arises from the large |n | eigenvalues; the contributions of such short-wavelength modes
are local and most of them can be absorbed in renormalization of couplings. It should
not (and does not) matter how we regulate them, but we must pick a regulator. A
convenient choice here is heat-kernel regulator:
2 1
X
Gs (x, x0 ) ≡ e−sn ξn (x)ξ¯n (x0 )
n
n
and X 2 1¯
Jµ5 (x) = e−sn ξn (x)γ 5 γµ ξn (x) .
n
n
The anomaly is
X µ e−s2n
∂ µ Jµ5 = ∂ µ jµ5 = i∂ ξ¯n γµ γ 5 ξn .
n
n
using {γ 5 , γ µ } = 0. (Notice that the story would deviate dramatically here if we were
studying the vector current which lacks the γ 5 .) This gives
2
µ /
5 −s iD
∂ Jµ5 = 2Tr α γ e
with
i
/ 2 = − (γµ (∂µ + iAµ ))2 = − (∂µ + Aµ )2 − Σµν F µν
(iD)
2
where Σµν ≡ 21 [γµ , γν ] is the spin Lorentz generator. This is (12.2), now better defined
by the heat kernel regulator. We’ve shown that in any even dimension,
∂ µ jµ5 (x) = 2Tr α γ 5 esD/
2
(12.6)
This can now be expanded in small s, which amounts to an expansion in powers of
A, F . If there is no background field, A = 0, we get
2 Z
−s i∂/ 2 1 D=4 1
x|e |x = d̄D p e−sp = KD D/2
= . (12.7)
|{z} s 16π 2 s2
ΩD−1
= as before
(2π)D
20
Actually, this step is full of danger. (Polyakov has done it to me again. Thanks to Sridip Pal for
discussions of this point.) See §12.0.2 below.
74
This term will renormalize the charge density
for which we must add a counterterm (in fact, it is accounted for by the counterterm
for the gauge field kinetic term, i.e. the running of the gauge coupling). But it will not
affect the axial current conservation which is proportional to
Similarly, bringing down more powers of (∂ + A)2 doesn’t give something nonzero
since the γ 5 remains.
In D = 4, the first term from expanding Σµν F µν is still zero from the spinor trace.
(Not so in D = 2.) The first nonzero term comes from the next term:
2 E s2
/
D 2
−s iD
= x|e−s(iD) |x · · (i2 ) tr γ 5 Σµν Σρλ · trc (Fµν Fρλ ) + O(s1 ) .
tr γ5 e
xx | {z } 8 | {z
µνρλ
} |{z}
color
(12.7) =4
1
= +O(s−1 )
16π 2 s2
In the abelian case, just ignore the trace over color indices, trc . The terms that go like
positive powers of s go away in the continuum limit. Therefore
1 s2 1
∂µ J5µ = −2 · · · 4µνρλ
trc Fµν F ρλ + O(s 1
) = − trFµν (?F )µν . (12.8)
16πs2 8 8π 2
(Here (?F )µν ≡ 81 µνρλ Fρλ .) This is the chiral anomaly formula. It can also be usefully
written as:
1 1 ~ ~
∂µ J5µ = − 2 trF ∧ F = − E · B.
8π 32π 2
• This object on the RHS is a total derivative. In the abelian case it is
F ∧ F = d (A ∧ F ) .
Its integral over spacetime is a topological (in fact 16π 2 times an integer) char-
acterizing the gauge field configuration. How do I know it is an integer? The
anomaly formula! The change in the number of left-handed fermions minus the
number of right-handed fermions during some time interval is:
F ∧F
Z Z Z
5 µ 5
∆QA ≡ ∆ (NL − NR ) = dt∂t J0 = ∂ Jµ = 2 2
M4 M4 16π
75
• Look back at the diagrams in (12.3). Which term in that expansion gave the
nonzero contribution to the axial current violation? In D = 4 it is the diagram
with three current insertions, the ABJ triangle diagram. So in fact we did end
up computing the triangle diagram. But this calculation also shows that nothing
else contributes, even non-perturbatively. [End of Lecture 52]
• We chose a particular regulator above. The answer we got did not depend on the
cutoff; in fact whatever regulator we used, we would get this answer.
• If we had kept the non-abelian structure in (12.1) through the whole calculation,
the only difference is that the trace in (12.8) would have included a trace over
representations of the gauge group; and we could have considered also a non-
abelian flavor transformation
5 a a
ψI → eiγ g τ ψJ
IJ
Do you see now why I said that the step involving the fermion Green’s function was full
of danger? The danger arises because the Dirac operator (whose inverse is the Green’s
function) can have zeromodes, eigenspinors with eigenvalue n = 0. In that case, iD / is
not invertible, and the expression (12.5) for G is ambiguous. This factor of n is about
to be cancelled when we compute the divergence of the current and arrive at (12.2).
Usually this kind of thing is not a problem because we can lift the zeromodes a little
and put them back at the end. But here it is actually hiding something important. The
zeromodes cannot just be lifted. This is true because nonzero modes of iD / must come
5 5
in left-right pairs: this is because {γ , iD}
/ = 0, so iD
/ and γ cannot be simultaneously
diagonalized in general. That is: if iDξ / = ξ then (γ 5 ξ) is also an eigenvector of iDξ,
/
76
with eigenvalue −. Only for = 0 does this fail, so zeromodes can come by themselves.
So you can’t just smoothly change the eigenvalue of some ξ0 from zero unless it has a
partner with whom to pair up. By taking linear combinations
1
χL/R 1 ± γ 5 ξn
n =
2
these two partners can be arranged into a pair of simultaneous eigenvectors of (iD)/ 2
(with eigenvalue 2n ) and of γ 5 with γ 5 = ± respectively.
This leads us to a deep fact, called the (Atiyah-Singer) index theorem: only zero-
modes can contribute to the anomaly. Any mode ξn with nonzero eigenvalue has a
partner with the opposite sign of γ 5 ; hence they cancel exactly in
2
X
ξ¯n γ 5 ξn e−sn !
n
So the anomaly equation tells us that the number of zeromodes of the Dirac operator,
weighted by handedness (i.e. with a + for L and - for R) is equal to
Z Z
D 1
NL − NR = d xA(x) = F ∧ F.
16π 2
A practical consequence for us is that it makes manifest that the result is indepen-
dent of the regulator s.
77
It would therefore seem to imply a conserved
axial current – the number of left moving fermions
minus the number of right moving fermions. But
the fields ψL and ψR are not independent; with
high-enough energy excitations, you reach the bot-
tom of the band (near k = 0 here) and you can’t
tell the difference. This means that the numbers
are not separately conserved.
We can do better in this 1+1d example and
show that the amount by which the axial current
is violated is given by the anomaly formula. Con-
sider subjecting our poor 1+1d free fermions to an
electric field Ex (t) which is constant in space and
slowly varies in time. Suppose we gradually turn
Figure 1: Green dots represent oc-
it on and then turn it off; here gradually means
cupied 1-particle states. Top: In the
slowly enough that the process is adiabatic. Then groundstate. Bottom: After applying
each particle experiences a force ∂t p = eEx and its Ex (t).
net change in momentum is Z
∆p = e dtEx (t).
This means that the electric field puts the fermions in a state where the Fermi surface
k = kF has shifted to the right by ∆p, as in the figure. Notice that the total number
of fermions is of course the same – charge is conserved.
Now consider the point of view of the low-energy theory at the Fermi points. This
theory has the action Z
S[ψ] = dxdtψ̄ (iγ µ ∂µ ) ψ ,
where γ µ are 2 × 2 and the upper/lower component of ψ creates fermions near the
left/right Fermi point. In the process above, we have added NR right-moving particles
and taken away NL left-moving particles, that is added NL left-moving holes (aka anti-
particles). The axial charge of the state has changed by
Z Z Z
∆p L L e e
∆QA = ∆(NL −NR ) = 2 = ∆p = e dtEx (t) = dtdxEx = µν F µν
2π/L π π π 2π
R
On the other hand, the LHS is ∆QA = ∂ µ JµA . We can infer a local version of this
– e.g. from a lattice model, like X
H = −t c†n cn+1 + h.c.
n
1 2 1
where the dispersion would be ωk = −2t (cos ka − 1) ∼ 2m k + O(k 4 ) with 2m = ta2 .
78
equation by letting E vary slowly in space as well, and we conclude that
e
∂µ JAµ = µν F µν .
2π
This agrees exactly with the anomaly equation in D = 1+1 produced by the calculation
above in (12.6) (see the homework).
79
13 Saddle points, non-perturbative field theory and
resummations
means confinement.
Let’s think more about abelian gauge theory in D = 1 + 1. Consider the case of
N = 1. This could be called the CP0 model, but it is usually called the Abelian Higgs
model.
1 κ µ2 F
L = 2 F 2 + Dµ z † Dµ z + (z † z)2 + z † z + θ .
4e 4 2 2π
What would a classical physicist say is the phase diagram of this model as we vary
µ ? For µ2 > 0, it is 2d scalar QED. There is no propagating photon, but (as we just
2
discussed) the model confines because of the Coulomb force. The spectrum is made
of boundstates of zs and z † s, which are stable because there is no photon for them
to decay into. For µ2 < 0, it looks like the potential wants |z|2 = µ2 /κ ≡ v 2 in the
groundstate. This would mean that Aµ eats the phase of z, gets a mass (a massive
vector in D = 1 + 1 has a propagating component); the radial excitation of z is also
massive. In such a Higgs phase, external charges don’t care about each other, the force
is short-ranged.
Not all of the statements in the classical, shaded box are correct quantumly. In
fact, even at µ2 < 0, external charges are still confined (but with a different string
tension than µ2 > 0). Non-perturbative physics makes a big difference here.
Let’s try to do the euclidean path integral at µ2 < 0 by saddle point. This means
we have to find minima of
Z
1 2 κ †
0
SE ≡ F + Dµ z † D µ z + (z z − v ) d2 x.
2 2
4e2 4
(Ignore θ for now, since it doesn’t affect the EOM.) Where have you seen this before?
This is exactly the functional we had to minimize in §10.1 to find the (Abrikosov-
Nielsen-Olesen) vortex solution of the Abelian Higgs model. There we were thinking
about a 3+1 D field theory, and we found a static configuration, translation invariant
80
in one spatial direction, localized in the two remaining directions. Here we have only
two dimensions. The same solution of the equations now represents an instanton – a
solution of the euclidean equations of motion, localized in euclidean spacetime. Here’s
a quick review of the solution: Choosing polar coordinates about some origin (more on
this soom), the solution has (in order that V (ρ) goes to zero at large r)
r→∞
z(r, θ) → g(θ)v,
where g(θ) is a phase. We can make the |Dz|2 term happy by setting
r→∞
A → −ig∂µ g + O(r−2 ).
In the last step I made a caricature of the saddle point approximation. Notice the
dependence of the instanton (Q 6= 0) contributions: if we scale out an overall coupling
(by rescaling fields) and write the action as S[φ] = g12 S[φ, ratios of couplings], then
1
− S[φ,ratios]
e−S0 = e g2 is non-analytic at g = 0 – all the terms of its taylor expansion
vanish at g = 0. This is not something we could ever produce by perturbation series, it
is non-perturbative. Notice that it is also small at weak coupling. However, sometimes
it is the leading contribution, e.g. to the energy of a metastable vacuum. (For more on
this, see Coleman.)
To do better, we need to understand the saddle points better.
1. First, in the instanton solution we found, we picked a center, the location of the
core of the vortex. But in fact, there is a solution for any center xµ0 , with the same
action. This means the determinant of S 00 actually has a zero! The resolution is
simple: There is actually a family of saddles, labelled by the collective coordinate
xµ0 . We just have to do the integral over these coordinates. The result is simple:
81
R
it produces a factor of dD x0 = V T where V T is the volume of spacetime. The
contribution of one instanton to the integral is then
Ke−S0 eiθ V T
3. We can also have anti-instantons. This just means that individual Qs can be
negative.
So we are going to approximate our integral by a dilute gas of instantons and anti-
instantons. Their actions add. A necessary condition for this to be a good idea is that
V T (core size)2 . eiθ is the instanton fugacity.
T →∞ X n+n̄ 1
Z = Tr e−T H ' Ke−S0 (V T )n+n̄ ei(n−n̄)θ
n,n̄
n!n̄!
!
X 1 n
= Ke−S0 V T einθ × (h.c.)
n
n!
−S0 eiθ +h.c. −S0
= eV T Ke = eV T 2Ke cos θ
. (13.1)
We should be happy about this answer. Summing over the dilute gas of instantons
gives an extensive contribution to the free energy. The free energy per unit time in the
euclidean path integral is the groundstate energy density:
T →∞
Z = Tr e−T H ' e−T V E(θ) , =⇒ E(θ) = −2K cos θe−S0 .
Therefore, when θ 6= 0 mod π, there is a nonzero electric field in the vacuum: hF01 i =
E 6= 0. It is uniform.
82
A small variation of this calculation gives the force between external charges:
2 l T0 + D
*
↔0
E D qR E
i qe 2 Aµ dxµ
H
W L = e = ei e F
This has the effect of shifting the value of θ on the inside of the loop to θin ≡ θ + qe 2π.
So the answer in the dilute instanton gas approximation is
* 2 l T 0 + exp 2Ke
S0
(LT − L0 T 0 ) cos θ + L0 T 0 cos θin
} | {z }
↔0
| {z
inside outside 0 0
W L = = e−T V (L )
2Ke −S0 LT cos θ
e
with q
V (L0 ) = L0 2Ke−S0 cos θ − cos θ + 2π
e
which is linear in the separation between the charges – linear confinement, except when
q = ne, n ∈ Z.
Here’s how to think about this result. For small θ, q/e, the potential between
charges is
0 θ1 0 −S0 q 2 2
V (L ) ' L Ke θ + 2π −θ
e
and the energy and flux are
θ1 θ1
E(θ) ' 2Ke−S0 θ2 + const, hF i ' 4πKe−S0 θ.
θ is like the charge on a pair of parallel capacitor plates at x = ∞. Adding charge and
anticharge changes the electric field in between, and the energy density is quadratic
in the field, U ∝ E 2 . But what happens when q = ne? Notice that the potential is
actually periodic in q → q + ne. If L0 > 2µ1
(µ is the mass of the z excitations), then the
energy can be decreased by pair-creating a z and z † , which then fly to the capacitor
plates and discharge them, changing θ → θ − 2π.
Comments about D = 4. Some of the features of this story carry over to gauge
R
theory in D = 3 + 1. Indeed there is a close parallel between the θ 2 F term and the
R
θ 4 F ∧ F term. In 4d, too, there are solutions of the euclidean equations (even in pure
Yang-Mills theory) which are localized in spacetime. (The word instanton is sometimes
used to refer to these solutions, even when they appear in other contexts than euclidean
saddle points. These solutions were found by Belavin, Polyakov, Schwartz and Tyupin.)
Again, the gauge field looks like a gauge transformation at ∞:
r→∞
A → −ig∂µ g + O(r−# ).
83
Now g is a map from the 3-sphere at infinity (in euclidean 4-space) to the gauge group,
g : S 3 → G. Any simple Lie group has an SU(2) ' S 3 inside, and there is an integer
classification of such maps. So again there is a sum over Q ∈ Z. However: the
calculation leading to confinement does not go through so simply. The 4d θ term does
not produce a nonzero electric field in the vacuum, and an external charge isn’t like
a capacitor plate. As Coleman says, whatever causes confinement in 4d gauge theory,
it’s not instantons.
Many bits of the following discussion are already familiar, but I like the organization.
Feynman diagrams from the path integral. Now that we are using path
integrals all the time, the diagrammatic expansion is much less mysterious (perhaps
we should have started here, like Zee does? maybe next time). Much of what we have
to say below is still interesting for QFT in 0 + 0 dimensions, which means integrals. If
everything is positive, this is probability theory. Suppose we want to do the integral
Z ∞ Z
g 4
− 12 m2 q 2 − 4!
Z(J) = dq e q +Jq
≡ dq e−S(q) . (13.2)
−∞
It is the path integral for φ4 theory with fewer labels. For g = 0, this is a gaussian
integral which we know how to do. For g 6= 0 it’s not an elementary function of its
arguments. We can develop a (non-convergent!) series expansion in g by writing it as
Z ∞
− 21 m2 q 2 +Jq g 4 1 g 4 2
Z(J) = dq e 1− q + − q + ···
−∞ 4! 2 4!
and integrating term by term. And the term with q 4n (that is, the coefficient of g n ) is
Z ∞ 4n Z ∞ 4n r
− 21 m2 q 2 +Jq 4n ∂ − 21 m2 q 2 +Jq ∂ 1
J 1
J 2π
dq e q = dq e = e 2 m2 .
−∞ ∂J −∞ ∂J m2
So: r
2π − 4!g ( ∂J
∂ 4 1
) e 2 J m12 J .
Z(J) = 2
e
m
This is a double expansion in powers of J and powers of g. The process of computing
the coefficient of J n g m can be described usefully in terms of diagrams. There is a factor
of 1/m2 for each line (the propagator), and a factor of (−g) for each 4-point vertex
(the coupling), and a factor of J for each external line (the source). For example, the
coefficient of gJ 4 comes from: 4
1
∼ gJ 4 .
m2
84
There is a symmetry factor which comes from expanding the exponential: if the
diagram has some symmetry preserving the external labels, the multiplicity of diagrams
does not completely cancel the 1/n!.
As another example, consider the analog of the two-point function:
dq q 2 e−S(q)
R
2 ∂
G ≡ q |J=0 = R −S(q)
= −2 log Z(J = 0).
dq e ∂m2
In perturbation theory this is:
G'
−2 1 2 2 −4
=m 1 − gm−2 + g m + O(g )3
(13.3)
2 3
• How do I know the perturbation series about g = 0 doesn’t converge? One way to
see this is to notice that if I made g even infinitesimally negative, the integral itself
would not converge (the potential would be unbounded below), and Zg=−|| is not
defined. Therefore Zg as a function of g cannot be analytic in a neighborhood of
g = 0. This argument is due to Dyson.
• The expansion of the exponential in the integrand is clearly convergent for each
q. The place where we went wrong is exchanging the order of integration over q
and summation over n.
2 √ ρ 3m4
Z(J = 0) = √ ρe K 1 (ρ), ρ≡
m2 4 4g
√
(for Re ρ > 0), as Mathematica will tell you. Because we know about Bessel
functions, in this case we can actually figure out what happens at strong coupling,
when g m4 , using the asymptotics of the Bessel function.
85
• In this case, the perturbation expansion too can be given a closed form expression:
1
r
2π X (−1)n 22n+ 2
1 g n
Z(0) ' Γ 2n + . (13.4)
m2 n n! (4!)n 2 m4
• The fact that the coefficients cn grow means that there is a best number of orders
to keep. The errors start getting bigger when cn+1 mg4 ∼ cn , that is, at order
4
n ∼ 3m
2g
. So if you want to evaluate G at this value of the coupling, you should
stop at that order of n.
This procedure requires both that the series in B(z) converges and that the
Laplace transform can be done. In fact this procedure works in this case.
The existence of saddle-point contributions to Z(g) which go like e−a/g imply
that the number of diagrams at large order grows like n!. This is because they
are associated with singularities of B(z) at z = a; such a singularity means the
sum of cn!n z n must diverge at z = a. (More generally, non-perturbative effects
1/p
which go like e−a/g (larger if p > 1) are associated with (faster) growth like
(pn)!. See this classic work.)
86
• The function G(g) can be analytically continued in g away from the real axis,
and can in fact be defined on the whole complex g plane. It has a branch cut on
the negative real axis, across which its discontinuity is related to its imaginary
a
part. The imaginary part goes like e− |g| near the origin and can be computed by
a tunneling calculation.
How did we know Z has a branch cut? One way is from the asymptotics of the
Bessel function. But, better, why does Z satisfy the Bessel differential equation
as a function of the couplings? The answer, as you’ll check on the homework, is
that the Bessel equation is a Schwinger-Dyson equation,
Z ∞
∂
somethinge−S(q)
0=
−∞ ∂q
which results from demanding that we can change integration variables in the
path integral.
For a bit more about this, you might look at sections 3 and 4 of this recent paper from
which I got some of the details here. See also the giant book by Zinn-Justin. There is a
deep connection between the large-order behavior of the perturbation series about the
trivial saddle point and the contributions of non-trivial saddle points. The keywords
for this connection are resurgence and trans-series and a starting references is here.
The Feynman diagrams we’ve been drawing all along are the same but with more
labels. Notice that each of the qs in our integral could come with a label, q → qa . Then
each line in our diagram would be associated with a matrix (m−2 )ab which is the inverse
of the quadratic term qa m2ab qb in the action. If our diagrams have loops we get free
sums over the label. If that label is conserved by the interactions, the vertices will have
some delta functions. In the case of translation-invariant field theories we can label
lines by the conserved momentum k. Each comes with a factor of the free propagator
i
, each vertex conserves momentum, so comes with igδ D ( k) (2π)D , and we
P
k2 +m2 +i
must integrate over momenta on internal lines d̄D k.
R
Next, three general organizing facts about the diagrammatic expansion, two already
familiar. In thinking about the combinatorics below, we will represent collections of
Feynman diagrams by blobs with legs sticking out, and think about how the blobs
combine. Then we can just renormalize the appropriate blobs and be done.
The following discussion will look like I am talking about a field theory with a single
scalar field. But really each of the φs is a collection of fields and all the indices are too
small to see. This is yet another example of coarse-graining.
87
1. Disconnected diagrams exponentiate. [Zee, I.7, Banks, chapter 3] Recall
that the Feynman rules come with a (often annoying, here crucial) statement
about symmetry factors: we must divide the contribution of a given diagram
by the order of the symmetry group of the diagram (preserving various external
labels). For a diagram with k identical disconnected pieces, this symmetry group
includes the permutation group Sk which permutes the identical pieces and has
k! elements. (Recall that the origin of the symmetry factors is that symmetric
feynman diagrams fail to completely cancel the 1/n! in the Dyson formula. For
a reminder about this, see e.g. Peskin p. 93.) Therefore:
X P
Z= (all diagrams) = e (connected diagrams) = eiW .
You can go a long way towards convincing yourself of this by studying the case
where there are only two connected diagrams A+B (draw whatever two squiggles
you want) and writing out eA+B in terms of disconnected diagrams with symmetry
factors.
Notice that this relationship is just like that of the partition function to the
(Helmholtz) free energy Z = e−βF (modulo the factor of i) in statistical me-
chanics (and is the same as that relationship when we study the euclidean path
integral with periodic boundary conditions in euclidean time). This statement is
extremely general. It remains true if we include external sources:
Z R
Z[J] = [Dφ]eiS[φ]+i φJ = eiW [J] .
Now the diagrams have sources J at which propagator lines can terminate; (the
perturbation theory approximation to) W [J] is the sum of all connected such
diagrams. For example
1 δ δ δ
hφ(x)i = Z= log Z = W
Z iδJ(x) iδJ(x) δJ(x)
δ δ δ δ
hT φ(x)φ(y)i = log Z = iW .
iδJ(x) iδJ(y) iδJ(x) iδJ(y)
(Note that here hφi ≡ hφiJ depends on J. You can set it to zero if you want, but
the equation is true for any J.) If you forget to divide by the normalization Z,
δ δ
and instead look at just δJ(x) δJ(y)
Z, you get disconnected quantities like hφi hφi
(the terminology comes from the diagrammatic representation). 22 The point in
life of W is that by differentiating it with respect to J we can construct all the
connected Green’s functions.
22 δ δ δ
More precisely: δJ(x) δJ(y) Z = δJ(x) (hφ(x)iJ Z) = hφ(x)iJ hφ(y)iJ Z + hφ(x)φ(y)iJ Z.
88
2. Propagator corrections form a geometric series. This one I don’t need to
say more about:
where a tree diagram is one with no loops. But the description in terms of
Legendre transform will be extremely useful. Along the way we will show that
the perturbation expansion is a semi-classical expansion. And we will construct
a useful object called the 1PI effective action Γ. The basic idea is that we can
construct the actual correct correlation functions by making tree diagrams (≡
diagrams with no loops) using the 1PI effective action as the action.
Notice that this is a very good reason to care about the notion of 1PI: if we
sum all the tree diagrams using the 1PI blobs, we clearly are including all the
diagrams. Now we just have to see what machinery will pick out the 1PI blobs.
The answer is: Legendre transform. There are many ways to go about showing
this, and all involve a bit of complication. Bear with me for a bit; we will learn
a lot along the way.
Def ’n of φc , the ‘classical field’. Consider the functional integral for a scalar
field theory: Z
= [Dφ]ei(S[φ]+ Jφ) .
R
iW [J]
Z[J] = e (13.5)
Define
Z
δW [J] 1
[Dφ]ei(S[φ]+ Jφ)
R
φc (x) ≡ = φ(x) = h0| φ̂(x) |0i . (13.6)
δJ(x) Z
This is the vacuum expectation value of the field operator, in the presence of the
source J. Note that φc (x) is a functional of J.
89
Warning: we are going to use the letter φ for many conceptually distinct objects
here: the functional integration variable φ, the quantum field operator φ̂, the
classical field φc . I will not always use the hats and subscripts.
[End of Lecture 54]
Now the functional version: Given a functional W [J], we can make a new asso-
ciated functional Γ of the conjugate variable φc :
Z
Γ[φc ] ≡ W [J] − Jφc .
Again, the RHS of this equation defines a functional of φc implicitly by the fact
that J can be determined from φc , using (13.6)23 .
90
– the extremum of the effective action is hφi. This gives a classical-like equation
of motion for the field operator expectation value in QFT.
Z
δΓ[φc ] δ
Proof of (13.7): = W [J] − dyJ(y)φc (y)
δφc (x) δφc (x)
What do we do here? We use the functional product rule – there are three places
where the derivative hits:
Z
δΓ[φc ] δW [J] δJ(y)
= − J(x) − dy φc (y)
δφc (x) δφc (x) δφc (x)
In the first term we must use the functional chain rule:
Z Z
δW [J] δJ(y) δW [J] δJ(y)
= dy = dy φc (y).
δφc (x) δφc (x) δJ(y) δφc (x)
So we have:
Z Z
δΓ[φc ] δJ(y) δJ(y)
= dy φc (y) − J(x) − dy φc (y) = −J(x). (13.9)
δφc (x) δφc (x) δφc (x)
Now φc |J=0 = hφi. So if we set J = 0, we get the equation (13.8) above. So (13.8)
replaces the action principle in QFT – to the extent that we can calculate Γ[φc ].
(Note that there can be more than one extremum of Γ. That requires further
examination.)
91
Let’s go back to (13.5) and think about its semiclassical expansion. If we were
going to do this path integral by stationary phase, we would solve
Z
δ δS
0= S[φ] + φJ = + J(x) . (13.10)
δφ(x) δφ(x)
This determines some function φ which depends on J; let’s denote it here as
φ[J] (x). In the semiclassical approximation to Z[J] = eiW [J] , we would just plug
this back into the exponent of the integrand:
Z
1 [J] [J]
Wc [J] = 2 S[φ ] + Jφ .
g ~
So in this approximation, (13.10) is exactly the equation determining φc . This
is just the Legendre transformation of the original bare action S[φ] (I hope this
manipulation is also familiar from stat mech, and I promise we’re not going in
circles).
Let’s think about expanding S[φ] about such a saddle point φ[J] (or more cor-
rectly, a point of stationary phase). The stationary phase (or semi-classical)
expansion familiar from QM is an expansion in powers of ~ (WKB):
0
Z Z i
~
S(x0 )+(x−x0 ) S (x0 ) + 12 (x−x0 )2 S 00 (x0 )+...
i | {z }
Z = eiW/~ = dx e ~
S(x)
= dxe =0 = eiW0 /~+iW1 +i~W2 +...
with W0 = S(x0 ), and Wn comes from (the exponentiation of) diagrams involving
n contractions of δx = x−x0 , each of which comes with a power of ~: hδxδxi ∼ ~.
Expansion in ~ = expansion in coupling. Is this semiclassical expansion the
same as the expansion in powers of the coupling? Yes, if there is indeed a notion
of “the coupling”, i.e. only one for each field. Then by a rescaling of the fields
we can put all the dependence on the coupling in front:
1
S= s[φ]
g2
so that the path integral is
Z s[φ] R
i + φJ
[Dφ] e ~g 2 .
(It may be necessary to rescale our sources J, too.) For example, suppose we are
talking about a QFT of a single field φ̃ with action
Z 2
p
S[φ̃] = ∂ φ̃ − λφ̃ .
92
1
Then define φ ≡ φ̃λα and choose α = p−2
to get
Z
1 1
(∂φ)2 − φp = 2 s[φ].
S[φ] = 2
λ p−2 g
1 i
s[φ]
with g ≡ λ p−2 , and s[φ] independent of g. Then the path-integrand is e ~g2
and so g and ~ will appear only in the combination g 2 ~. (If we have more than
one coupling term, this direct connection must break down; instead we can scale
out some overall factor from all the couplings and that appears with ~.)
Loop expansion = expansion in coupling. Now I want to convince you
that this is also the same as the loop expansion. The first correction in the
semi-classical expansion comes from
δ2s
Z
1
S2 [φ0 , δφ] ≡ 2 dxdyδφ(x)δφ(y) |φ=φ0 .
g δφ(x)δφ(y)
For the accounting of powers of g, it’s useful to define ∆ = g −1 δφ, so the action
is X
g −2 s[φ] = g −2 s[φ0 ] + S2 [∆] + g n−2 Vn [∆].
n
With this normalization, the power of the field ∆ appearing in each term of the
action is correlated with the power of g in that term. And the ∆ propagator is
independent of g.
So use the action s[φ], in an expansion about φ? to construct Feynman rules for
correlators of ∆: the propagator is hT ∆(x)∆(y)i ∝ g 0 , the 3-point vertex comes
from V3 and goes like g 3−2=1 , and so on. Consider a diagram that contributes
to an E-point function (of ∆) at order g n , for example this contribution to the
93
where
• The total number of lines leaving all the vertices is equal to the total number
of lines: X
ki = E + 2I. (13.12)
vertices, i
L=I −V +1 (13.14)
We conclude that24
!
(13.14) (13.13) 1 X n−E (13.11) n − E
L = I −V +1 = ki − E −V +1= +1 = + 1.
2 i
2 2
n−E
L= 2
+ 1: More powers of g means (linearly) more loops.
24
You should check that these relations are all true for some random example, like the one above,
P
which has I = 7, L = 2, ki = 18, V = 6, E = 4. You will notice that Banks has several typos in his
discussion of this in §3.4. His Es should be E/2s in the equations after (3.31).
94
Diagrams with a fixed number of external lines and more loops are suppressed
by more powers of g. (By rescaling the external field, it is possible to remove the
dependence on E.)
We can summarize what we’ve learned by writing the sum of connected graphs
as ∞
X L−1
W [J] = g2~ WL
L=0
where WL is the sum of connected graphs with L loops. In particular, the order-
~−1 (classical) bit W0 comes from tree graphs, graphs without loops. Solving the
classical equations of motion sums up the tree diagrams.
Diagrammatic interpretation of Legendre transform. Γ[φ] is called the 1PI
effective action25 . And as its name suggests, Γ has a diagrammatic interpretation:
it is the sum of just the 1PI connected diagrams. (Recall that W [J] is the sum
of all connected diagrams.) Consider the (functional) Taylor expansion Γn in φ
X 1 Z
Γ[φ] = Γn (x1 ...xn )φ(x1 )...φ(xn )dD x1 · · · dD xn .
n
n!
The coefficients Γn are called 1PI Green’s functions (we will justify this name
presently). To get the full connected Green’s functions, we sum all tree diagrams
with the 1PI Green’s functions as vertices, using the full connected two-point
function as the propagators.
Perhaps the simplest way to arrive at this result is to consider what happens if
we try to use Γ as the action in the path integral instead of S.
Z
i
ZΓ,~ [J] ≡ [Dφ]e ~ (Γ[φ]+ Jφ)
R
95
Figure 2: [From Banks, Modern Quantum Field Theory, slightly improved] Wn denotes the connected
∂ n
n-point function, ∂J W [J] = hφn i.
This expression is the definition of the inverse Legendre transform, and we see
that it gives back W [J]: the generating functional of connected correlators! On
the other hand, the counting of powers above indicates that the only terms that
survive the ~ → 0 limit are tree diagrams where we use the terms in the Taylor
expansion of Γ[φ] as the vertices. This is exactly the statement we were trying to
demonstrate: the sum of all connected diagrams is the sum of tree diagrams made
using 1PI vertices and the exact propagator (by definition of 1PI). Therefore Γn
are the 1PI vertices.
For a more arduous but more direct proof of this statement, see the problem set
and/or Banks §3.5. There is an important typo on page 29 of Banks’ book; it
should say:
−1 −1
δ2W δ2Γ
δφ(y) δJ(x) (13.9)
= = = − . (13.15)
δJ(x)δJ(y) δJ(x) δφ(y) δφ(x)δφ(y)
(where φ ≡ φc here). You can prove this from the definitions above. Inverse here
means in the sense of integral operators: dD zK(x, z)K −1 (z, y) = δ D (x − y). So
R
W2 = −Γ−1
2 .
96
Here’s two ways to think about why we get an inverse here: (1) diagrammatically,
the 1PI blob is defined by removing the external propagators; but these external
propagators are each W2 ; removing two of them from one of them leaves −1 of
P R
them. You’re on your own for the sign. (2) In the expansion of Γ = n Γn φn
RR
in powers of the field, the second term is φΓ2 φ, which plays the role of the
kinetic term in the effective action (which we’re instructed to use to make tree
diagrams). The full propagator is then the inverse of the kinetic operator here,
namely Γ−12 . Again, you’re on your own for the sign.
The idea to show the general case in Fig. 2 is to just compute Wn by taking the
derivatives starting from (13.15): Differentiate again wrt J and use the matrix
differentiation formula dK −1 = −K −1 dKK −1 and the chain rule to get
Z Z Z
W3 (x, y, z) = dw1 dw2 dw3 W2 (x, w1 )W2 (y, w2 )W2 (z, w3 )Γ3 (w1 , w2 , w3 ) .
This business is useful in at least two ways. First it lets us focus our attention
on a much smaller collection of diagrams when we are doing our perturbative
renormalization.
Secondly, this notion of effective action is extremely useful in thinking about the
vacuum structure of field theories, and about spontaneous symmetry breaking.
In particular, we can expand the functional in the form
Z
Γ[φc ] = dD x −Veff (φc ) + Z(φc ) (∂φc )2 + ...
(where the ... indicate terms with more derivatives of φ). In particular, in the
case where φc is constant in spacetime we can minimize the function Veff (φc ) to
find the vacuum. This is a lucrative endeavor which you get to do for homework.
[Zee §IV.3, Xi Yin’s notes §4.2] Let us now take seriously the lack of indices on our
field φ, and see about actually evaluating more of the semiclassical expansion of the
path integral of a scalar field (eventually we will specify D = 3 + 1):
Z
i
= [Dφ]e ~ (S[φ]+ Jφ) .
i
R
W [J]
Z[J] = e ~ (13.16)
To add some drama to this discussion consider the following: if the potential V in
(∂φ)2 − V (φ) has a minimum at the origin, then we expect that the vacuum
R 1
S= 2
97
has hφi = 0. If on the other hand, the potential has a maximum at the origin, then
the field will find a minimum somewhere else, hφi = 6 0. If the potential has a discrete
symmetry under φ → −φ (no odd powers of φ in V ), then in the latter case (V 00 (0) < 0)
this symmetry will be broken. If the potential is flat (V 00 (0) = 0) near the origin, what
happens? Quantum effects matter.
The configuration of stationary phase is φ = φ? , which satisfies
R
δ S + Jφ
0= |φ=φ? = −∂ 2 φ? (x) − V 0 (φ? (x)) + J(x) . (13.17)
δφ(x)
In the second line, we integrated by parts to get the ϕ integral to look like a souped-up
version of the fundamental formula of gaussian integrals – just think of ∂ 2 + V 00 as a
big matrix – and in the third line, we did that integral. In the last line we used the
matrix identity tr log = log det. Note that all the φ? s appearing in this expression are
functionals of J, determined by (13.17).
So taking logs of the BHS of the previous equation we have the generating func-
tional: Z
i~
W [J] = S[φ? ] + Jφ? + tr log ∂ 2 + V 00 (φ? ) + O(~2 ) .
2
To find the effective potential, we need to Legendre transform to get a functional of φc :
R
δ S[φ ] + Jφ
Z
δW chain rule ? ? δφ? (z) (13.17)
φc (x) = = dD z +φ? (x)+O(~) = φ? (x)+O(~) .
δJ(x) δφ? (z) δJ(x)
98
general. In the special case that we are interested in φc which is constant in spacetime,
it is doable. This case is also often physically relevant if our goal is to solve (13.8)
to find the groundstate, which often preserves translation invariance (gradients cost
energy). If φc (x) = φ is spacetime-independent then we can write
Z
Γ[φc (x) = φ] ≡ dD x Veff (φ).
The computation of the trace-log is doable in this case because it is translation invari-
ant, and hence we can use fourier space. We do this next.
The tr in the one-loop contribution is a trace over the space on which the differential
operator (≡big matrix) acts; it acts on the space of scalar fields ϕ:
X
∂ 2 + V 00 (φ) ϕ x = ∂ 2 + V 00 (φ) xy ϕy ≡ ∂x2 + V 00 (φ) ϕ(x)
y
Z Z
D
d̄D k log −k 2 + V 00 , (|| hx|ki ||2 = 1)
= d x
R
The dD x goes along for the ride and we conclude that
Z
i~
d̄D k log k 2 − V 00 (φ) + O(~2 ).
Veff (φ) = V (φ) −
2
99
What does it mean to take the log of a dimensionful thing? It means we haven’t been
careful about the additive constant (constant means independent of φ). And we don’t
need to be (unless we’re worried about dynamical gravity); so let’s choose the constant
so that
k − V 00 (φ)
Z 2
i~ D
Veff (φ) = V (φ) − d̄ k log + O(~2 ). (13.18)
2 k2
[End of Lecture 55]
X1
V1 loop = ~ω~k . Here’s the interpretation of the 1-loop potential: V 00 (φ) is the
2
~k
2
mass of the field when it has the constant value φ; the one-loop term V1 loop is the
vacuum energy dD−1~k 12 ~ω~k from the gaussian fluctuations of a field with that mass2 ;
R
k 2 − V 00 (φ) + i ω 2 − ωk2 + i
Z Z
I≡ d̄ω log = d̄ω log
k 2 + i ω 2 − ωk20 + i
q
with ωk = ~k 2 + V 00 (φ), and ωk0 = |~k|. The i prescription is as usual inherited from
the euclidean path integral. Notice that the integral is convergent – at large ω, the
integrand goes like
!
1 − ωA2
2
ω −A A−B 1 A−B
log 2
= log B
= log 1 − 2
+O 4
' .
ω −B 1 − ω2 ω ω ω2
Integrate by parts:
k − V 00 (φ) + i
2 2
ω − ωk2
Z Z
I = d̄ω log = − d̄ωω∂ω log
k 2 + i Z ω − ωk0
ω
= −2 d̄ωω − (ωk → ωk0 )
ω 2− ωk2 + i
1
= −i2ωk2 − (ωk → ωk0 ) = i (ωk − ωk0 ) .
−2ωk
This is what we are summing (times −i 21 ~) over all the modes d̄D−1~k.
R
100
13.3.2 Renormalization of the effective action
So we have a cute expression for the effective potential (13.18). Unfortunately it seems
to be equal to infinity. The problem, as usual, is that we assumed that the parameters in
the bare action S[φ] could be finite without introducing any cutoff. Let us parametrize
R
(following Zee §IV.3) the action as S = dD xL with
1 1 1
L= (∂φ)2 − µ2 φ2 − λφ4 − A (∂φ)2 − Bφ2 − Cφ4
2 2 4!
and we will think of A, B, C as counterterms, in which to absorb the cutoff dependence.
So our effective potential is actually:
Λ
kE2 + V 00 (φ)
Z
1 1 ~ D
Veff (φ) = µ2 φ2 + λφ4 + B(Λ)φ2 + C(Λ)φ4 + d̄ kE log ,
2 4! 2 kE2
(notice that A drops out in this special case with constant φ). We rotated the integra-
tion contour to euclidean space. This permits a nice regulator, which is just to limit
the integration region to {kE |kE2 ≤ Λ2 } for some big (Euclidean) wavenumber Λ.
Now let us specify to the case of D = 4, where the model with µ = 0 is classically
scale invariant. The integrals are elementary26
√ 2
1 2 2 1 4 2 4 Λ2 00 (V 00 (φ))2 eΛ
Veff (φ) = µ φ + λφ + B(Λ)φ + C(Λ)φ + 2
V (φ) − 2
log 00 .
2 4! 32π 64π V (φ)
Notice that the leading cutoff dependence of the integral is Λ2 , and there is also a
subleading logarithmically-cutoff-dependent term. (“log divergence” is certainly easier
to say.)
Luckily we have two counterterms. Consider the case where V is a quartic poly-
nomial; then V 00 is quadratic, and (V 00 )2 is quartic. In that case the two counterterms
are in just the right form to absorb the Λ dependence. On the other hand, if V were
sextic (recall that this is in the non-renormalizable category according to our dimen-
sional analysis), we would have a fourth counterterm Dφ6 , but in this case (V 00 )2 ∼ φ8 ,
and we’re in trouble (adding a bare φ8 term would produce (V 00 )2 ∼ φ12 ... and so
on). We’ll need a better way to think about such non-renormalizable theories. The
better way (which we will return to in the next section) is simply to recognize that in
non-renormalizable theories, the cutoff is real – it is part of the definition of the field
theory. In renormalizable theories, we may pretend that it is not (though it usually is
real there, too).
26
This is not the same as ‘easy’. The expressions here assume that Λ V 00 .
101
Renormalization conditions. Return to the renormalizable case, V = λφ4 where
we’ve found
Λ2 λ2 φ2
2 1 2 4 1
Veff = φ µ +B+λ +φ λ+C + log 2 + O(λ3 ) .
2 64π 2 4! 16π 2 Λ
√
(I’ve absorbed an additive log e in C.) The counting of counterterms works out, but
how do we determine them? We need to impose renormalization conditions, i.e. spec-
ify some observable quantities to parametrize our model, in terms of which we can
eliminate the silly letters in the lagrangian. We need two of these. Of course, what is
observable depends on the physical system at hand. Let’s suppose that we can measure
some properties of the effective potential. For example, suppose we can measure the
mass2 when φ = 0:
∂ 2 Veff Λ2
µ2 = |φ=0 =⇒ we should set B = −λ .
∂φ2 64π 2
For example, we could consider the case µ = 0, when the potential is flat at the origin.
With µ = 0, have
λ2 φ2
1 4 3
Veff (φ) = λ+ 2 log 2 + C(Λ) φ + O(λ ) .
4! (16π) Λ
And for the second renormalization condition, suppose we can measure the quartic
term
∂ 4 Veff
λM = |φ=M . (13.19)
∂φ4
Here M is some arbitrarily chosen quantity with dimensions of mass. We run into
trouble if we try to set it to zero because of ∂φ4 (φ4 log φ) ∼ log φ. So the coupling
depends very explicitly on the value of M at which we set the renormalization condition.
Let’s use (13.19) to eliminate C:
2 2
!
! λ λ φ
λ(M ) = 4! +C + log 2 + c1 |φ=M (13.20)
4! 16π Λ
Here I used the fact that we are only accurate to O(λ2 ) to replace λ = λ(M )+O(λ(M )2 )
in various places. We can feel a sense of victory here: the dependence on the cutoff
102
has disappeared. Further, the answer for Veff does not depend on our renormalization
point M :
2 λ2
d 1 4
M Veff = φ M ∂M λ − + O(λ ) = O(λ3 )
3
(13.21)
dM 4! M (16π 2 )
which vanishes to this order from the definition of λ(M ) (13.20), which implies
3
M ∂M λ(M ) = 2
λ(M )2 + O(λ3 ) ≡ β(λ).
16π
The fact (13.21) is sometimes called the Callan-Symanzik equation, the condition that
λ(M ) must satisfy in order that physics be independent of our choice of renormalization
point M . [End of Lecture 56]
However, the minima lie in a region where our approximations aren’t so great. In
particular, the next correction looks like:
2
λφ4 1 + λ log φ2 + λ log φ2 + ...
– the expansion parameter is really λ log φ. (I haven’t shown this yet, it is an application
of the RG, below.) The apparent minimum lies in a regime where the higher powers
of λ log φ are just as important as the one we’ve kept.
RG-improvement. How do I know the good expansion parameter is actually
4
λ log φ/M ? The RG. Define t ≡ log φc /M and Vef f (φc ) = φ4!c U (t, λ). We’ll regard
U as a running coupling, and t as the RG scaling parameter. Our renormalization
conditions are U (0, λ) = λ, Z(λ) = 1, these provide initial conditions. At one loop in
∂
φ4 theory, there are no anomalous dimensions, γ(λ) = ∂M Z = O(λ2 ). This makes the
RG equations quite simple. The running coupling U satisfies (to this order)
dU 3U 2
= β(U ) =
dt 16π 2
which (with the initial condition U (0, λ) = λ) is solved by
λ
U (λ, t) = 3λt
.
1 − 16π 2
103
Therefore, the RG-improved effective potential is
φ4c 1 λφ4c
Vef f (φc ) = U (t, λ) = .
4! 4! 1 − 3λ2 log φ2c2
32π M
The good news: this is valid as long as U is small, and it agrees with our previous
answer, which was valid as long as λ 1 and λt 1. The bad news is that there is
no sign of the minimum we saw in the raw one-loop answer.
By the way, in nearly every other example, there will be wavefunction renormaliza-
tion. In that case, the Callan-Syzmanzik (CS) equation we need to solve is
(−∂t + β∂λ + 4γ) U (t, λ) = 0
whose solution is
Z t
0 0
U (t, λ) = f (U (t, λ)) exp dt 4γ(U (t , λ)) , ∂t U (t, λ) = β(U ), U (0, λ) = λ.
0
f can be determined by studying the CS equation at t = 0. For more detail, see
E. Weinberg’s thesis.
We can get around this issue by studying a system where the fluctuations producing
the extra terms in the potential for φ come from some other field whose mass depends
on φ. For example, consider a fermion field whose mass depends on φ:
Z
S[ψ, φ] = dD xψ̄ (i/ ∂ − m − gφ) ψ
P1
– then mψ = m + gφ. The 2
~ωs from the fermion will now depend on φ (the also
have the opposite sign because they come from fermions), and we get a reliable answer
for hφi =6 0 from this phenomenon of radiative symmetry breaking. In D = 1 + 1 this is
a field theory description of the Peierls instability of a 1d chain of fermions (ψ) coupled
to phonons (ψ). Notice that when φ gets an expectation value it gives a mass to the
fermions. The microscopic picture is that the translation symmetry is spontaneously
broken to a twice-as-big lattice spacing, alternating between strong and weak hopping
matrix elements. This produces a gap in the spectrum of the tight-binding model. (For
a little more, see Zee page 300.)
A second example where radiative symmetry breaking happens is scalar QED. There
we can play the gauge coupling and the scalar self-coupling off each other. I’ll say a
bit more about this example as it’s realized in condensed matter below.
Another example which has attracted a lot of attention is the Standard Model Higgs.
Its expectation value affects the masses of many fields, and you might imagine this
might produce features in its effective potential. Under various (strong) assumptions
about what lies beyond the Standard Model, there is some drama here; I recommend
Schwarz’s discussion on page 748-750.
104
13.3.3 Useful properties of the effective action
[For a version of this discussion which is better in just about every way, see Coleman,
Aspects of Symmetry §5.3.7. I also highly recommend all the preceding sections! And
the ones that come after. This book is available electronically from the UCSD library.]
Veff as minimum energy with fixed φ. Recall that hφi is the configuration
of φc which extremizes the effective action Γ[φc ]. Even away from its minimum, the
effective potential has a useful physical interpretation. It is the natural extension of
the interpretation of the potential in classical field theory, which is: V (φ) = the value
of the energy density if you fix the field equal to φ everywhere. Consider the space of
states of the QFT where the field has a given expectation value:
one of them has the smallest energy. I claim that its energy is Veff (φ0 ). This fact, which
we’ll show next, has some useful consequences.
Let |Ωφ0 i be the (normalized) state of the QFT which minimizes the energy subject
to the constraint (13.22). The familiar way to do this (familiar from QM, associated
with Rayleigh and Ritz)27 is to introduce Lagrange multipliers to impose (13.22) and
the normalization condition and extremize without constraints the functional
Z
hΩ| H |Ωi − α (hΩ|Ωi − 1) − dD−1~xβ(~x) (hΩ| φ(~x, t) |Ωi − φ0 (~x))
28
with respect to |Ωi and the functions on space α, β.
27
The more familiar thing is to find the state which extremizes ha| H |ai subject to the normalization
condition ha|ai = 1. To do this, we vary ha| H |ai − E (ha|ai − 1) with respect to both |ai and
the Lagrange multiplier E. The equation from varying |ai says that the extremum occurs when
(H − E) |ai = 0, i.e. |ai is an energy eigenstate with energy E. Notice that we could just as well have
varied the simpler thing
ha| (H − E) |ai
and found the same answer.
28
Here is the QM version (i.e. the same thing without all the labels): we want to find the extremum
of ha|H|ai with |ai normalized and ha|A|ai = Ac some fixed number. Then we introduce two Lagrange
multipliers E, J and vary without constraint the quantity
(H − E − JA) |ai = 0
so |ai is an eigenstate of the perturbed hamiltonian H − JA, with energy E. J is an auxiliary thing,
105
Clearly the extremum with respect to α, β imposes the desired constraints. Ex-
tremizing with respect to |Ωi gives:
Z
H |Ωi = α |Ωi + dD−1~xβ(~x)φ(~x, t) |Ωi (13.23)
or Z
D−1
H− d ~xβ(~x)φ(~x, t) |Ωi = α |Ωi (13.24)
dE
ha|H|ai = E + JAc = E − J .
dJ
This Legendre transform is exactly (the QM analog of) the effective potential.
106
very useful. (You’ve previously used it on the homework to compute the potential
between static sources, and to calculate the probability for pair creation in an electric
field.) Notice that it gives an independent proof that W only gets contributions from
connected amplitudes. Amplitudes with n connected components, h....i h...i h...i, go
| {z }
n of these
like T n (where T is the time duration) at large T . Since W = −EJ T goes like T 1 ,
we conclude that it has one connected component (terms that went like T n>1 would
dominate at large T and therefore must be absent). This extensivity of W in T is of
the same nature as the extensivity in volume of the free energy in thermodynamics.
[Brown, 6.4.2] Another important reason why W must be connected is called
the cluster decomposition property. Consider a source which has the form J(x) =
J1 (x) + J2 (x) where the two parts have support in widely-separated (spacelike sepa-
rated) spacetime regions. If all the fields are massive, ‘widely-separated’ means pre-
cisely that the distance between the regions is R 1/m, much larger than the range
of the interactions mediated by φ. In this case, measurements made in region 1 can-
not have any effect on those in region 2, and they should be uncorrelated. If so, the
probability amplitude factorizes
107
positive.29 30
On the other hand, it seems that if V (φ) has a maximum, or even any region of
field space where V 00 (φ) < 0, we get a complex one-loop effective potential (from the
log of a negative V 00 ). What gives? One resolution is that in this case the minimum
energy state with fixed hφi is not a φ eigenstate.
For example, consider a quartic potential 12 m2 φ2 + 4!g φ4 with m2 < 0, with minima
q 2
at φ± ≡ ± 6|m|g
. Then for hφi ∈ (φ− , φ+ ), rather we can lower the energy below V (φ)
by considering a state
The one-loop effective potential at φ only knows about some infinitesimal neighborhood
of the field space near φ, and fails to see this non-perturbative stuff. In fact, the correct
effective potential is exactly flat in between the two minima. More generally, if the two
minima have unequal energies, we have
– the potential interpolates linearly between the energies of the two surrounding min-
ima.
The imaginary part of V1 loop is a decay rate. If we find that the (pertur-
bative approximation to) effective potential E ≡ V1 loop is complex, it means that the
amplitude for our state to persist is not just a phase:
dD x δ Γ[φ] dD x
Z Z
∂ 1
(13.7)
Veff (φ0 ) = − |φ(x)=φ0 = − (−J(x)) |φ(x)=φ0 .
∂φ0 V δφ(x) V V V
In the first expression here, we are averaging over space the functional derivative of Γ. The second
derivative is then
2 Z D Z D Z Z
∂ 1 d y δ d x 1 δJ(x)
Veff (φ0 ) = (J(x)) |φ(x)=φ0 = + 3 |φ(x)=φ0
∂φ0 V V δφ(y) V V y x δφ(y)
– the inverse is in a matrix sense, with x, y as matrix indices. But W2 is a positive operator – it is
the groundstate expectation value of the square of a hermitian operator.
30 δ2 Γ
In fact, the whole effective action Γ[φ] is a convex functional: δφ(x)δφ(y) is a positive integral
operator. For more on this, I recommend Brown, Quantum Field Theory, Chapter 6.
108
has a modulus different from one (V is the volume of space). Notice that the |0i here
is our perturbative approximation to the groundstate of the system, which is wrong in
the region of field space where V 00 < 0. The modulus of this object is
– we can interpret 2ImE as the (connected!) decay probability of the state in question
per unit time per unit volume. (Notice that this relation means that the imaginary
part of V1-loop had better be positive, so that the probability stays less than one! In the
one-loop approximation, this is guaranteed by the correct i prescription.)
For more on what happens when the perturbative answer becomes complex and
non-convex, and how to interpret the imaginary part, see this paper by E. Weinberg
and Wu.
109
14 Duality
In this subsection and the next we’re going to think about ways to think about bosonic
field theories with a U(1) symmetry, and dualities between them, in D = 1 + 1 and
D = 2 + 1.
[This discussion is from Ashvin Vishwanath’s lecture notes.] Consider the Bose-
Hubbard model (in any dimension, but we’ll specify to D = 1 + 1 at some point)
X UX X
HBH = −J˜ b†i bj + h.c. + ni (ni − 1) − µ ni
2 i i
hiji
where the b† s and b are bosonic creation and annihilation operators at each site:
[bi , b†j ] = δij . ni ≡ b†i bi counts the number of bosons at site i. The last Hubbard-U
term is zero if nbj = 0, 1, but exacts an energetic penalty ∆E = U if a single site j is
occupied by two bosons.
The Hilbert space which represents the boson algebra has a useful number-phase
representation in terms of
(where the last statement pertains to the eigenvalues of the operator). The bosons are
√ √
bi = e−iφi ni , b†i = ni e+iφi ;
these expressions have the same algebra as the original bs. In terms of these operators,
the hamiltonian is
X √ √ UX X
HBH = −J˜ ni ei(φi −φj ) nj + h.c. + ni (ni − 1) − µ ni .
2 i i
hiji
√ √
If hni i = n0 1, so that ni = n0 + ∆ni , ∆ni n0 then bi = e−iφ ni ' e−iφi n0
and X UX
HBH ' − 2Jn ˜ 0 cos (φi − φj ) + (∆ni )2 ≡ Hrotors
| {z } 2 i
≡J hiji
110
the groundstate. This is a Mott insulator, with a gap of order U . Since n and φ are
conjugate variables, definite number means wildly fluctuating phase.
U J : then we must satisfy the J term first and the phase is locked, φ = 0 in
the groundstate, or at least it will try. This is the superfluid (SF). That is, we can try
to expand the cosine potential31
X X X X 1 2
2 2
Hrotors = U ni − J cos (φi − φj ) ' U ni − J 1 − (φi − φj ) + ...
i i
2
hiji hiji
(14.1)
1
P −ik·xi
which is a bunch of harmonic oscillators and can be solved by Fourier: φi = √ d ie φk ,
N
so X
H' (U πk π−k + J (1 − cos ka) φk φ−k )
k
η= = .
r 2πρs 2K
This is algebraic long range order. This is a sharp distinction between the two phases
we’ve discussed, even though the IR fluctuations destroy the hbi.
[End of Lecture 57]
31
From now on the background density n0 will not play a role and I will write ni for ∆ni .
111
Massless scalars in D = 1 + 1 and T -duality-invariance of the spectrum.
A lot of physics is hidden in the innocent-looking theory of the superfluid goldstone
boson. Consider the following (real-time) continuum action for a free massless scalar
field in 1+1 dimensions:
Z Z L Z
T 2 2
S[φ] = dt dx (∂0 φ) − (∂x φ) = 2T dxdt∂+ φ∂− φ . (14.3)
2 0
So the field space is a circle S 1 with (angular) coordinate φ. It can be useful to think
of the action (14.3) as describing the propagation of a string, since a field configuration
describes an embedding of the real two dimensional space into the target space, which
here is a circle. This is a simple special case of a nonlinear sigma model. The name
T-duality comes from the literature on string theory. The worldsheet theory of a string
√
propagating on a circle of radius R = ρs is governed by the Lagrangian (14.2). To
see this, recall that the action of a 2d nonlinear sigma model with target space metric
gµν φµ φν is α10 d2 σgµν ∂φµ ∂φν . Here α10 is the tension (energy per unit length) of the
R
string; work in units where this disappears from now on. Here we have only one
dimension, with gφφ = ρs .
Notice that we could rescale φ → λφ and change the radius; but this would change
the periodicity of φ ≡ φ + 2π. The proper length of the period is 2πR and is invariant
under a change of field variables. This proper length distinguishes different theories
because the operators : eαφ : (which you saw on a previous homework were good
operators of definite scaling dimension in the theory of the free boson (unlike φ itself))
must be periodic; this determines the allowed values of α.
First a little bit of classical field theory. The equations of motion for φ are
δS
0= ∝ ∂ µ ∂µ φ ∝ ∂+ ∂− φ
δφ(x, t)
which is solved by
φ(x, t) ≡ φL (x+ ) + φR (x− ) .
In euclidean time, φL,R depend (anti-)holomorphically on the complex coordinate z ≡
1
2
(x + iτ ) and the machinery of complex analysis becomes useful.
112
Symmetries: Since S[φ] only depends on φ through its derivatives, there is a
simple symmetry φ → φ + . By the Nöther method the associated current is
jµ = T ∂µ φ . (14.4)
This symmetry is translations in the target space, and I will sometimes call the asso-
ciated conserved charge ‘momentum’.
There is another symmetry which is less obvious. It comes about because of the
topology of the target space. Since φ(x, t) ≡ φ(x, t) + 2πm, m ∈ Z describe the same
point (it is a redundancy in our description, in fact a discrete gauge redundancy), we
don’t need φ(x + L, t) = φ(x, t). It is enough to have
The number m cannot change without the string breaking: it is a topological charge,
a winding number: Z L
1 x=L FTC 1
m= φ(x, t)|x=0 = dx∂x φ . (14.5)
2π 2π 0
1
The associated current whose charge density is ∂ φ
π x
(which integrates over space to
the topological charge) is
1 1 µν
j̃µ = (∂x φ, −∂0 φ)µ = ∂ν φ.
2π 2π
This is conserved because of the equality of the mixed partials: µν ∂µ ∂ν = 0.
Let’s expand in normal modes: φ = φL + φR with
r
L X ρn in(t+x) 2π
φL (t + x) = qL + (p + w)(t + x) − i e L ,
| {z } 4πT n6=0 n
1
≡ 2T pL
r
L X ρ̃n in(t−x) 2π
φR (t − x) = qR + (p − w)(t − x) − i e L , (14.6)
| {z } 4πT n6=0 n
1
≡ 2T pR
The factor of n1 is a convention whose origin you will appreciate below, as are the other
normalization factors. Real φ means ρ†n = ρ−n (If we didn’t put the i it would have
been −ρ−n ).
RL
Here q ≡ L1 0 dxφ(x, t) = qL − qR is the center-of-mass position of the string. The
canonical momentum for φ is π(x, t) = T ∂0 φ(x, t) = T (∂+ φL + ∂− φR ).
QM. Now we’ll do quantum mechanics. Recall that a quantum mechanical particle
on a circle has momentum quantized in units of integers over the period. Since φ is
113
periodic, the wavefunction(al)s must be periodic in the center-of-mass coordinate q
with period 2π, and this means that the total (target-space) momentum must be an
integer Z L Z L
(14.6)
Z 3 j = π0 ≡ dxπ(x, t) = T dx∂t φ = LT 2p
0 0
So our conserved charges are quantized according to
j (14.6)(14.5) πm
p= , w = , j, m ∈ Z .
2LT L
(Don’t confuse the target-space momentum j with the ‘worldsheet momentum’ n!)
(Note that this theory is scale-free. We could use this freedom to choose units where
L = 2π.)
Now I put the mode coefficients in boldface:
r
+ 1 + L X ρn i 2π nx+
φL (x ) = qL + pL x − i e L ,
2T 4πT n6=0 n
r
1 L X ρ̃n i 2π nx−
φR (x− ) = qR + pR x− − i e L , (14.7)
2T 4πT n6=0 n
which determines the commutators of the modes (this was the motivation for the weird
normalizations)
and the same for the rightmovers with twiddles. This is one simple harmonic oscillator
for each n ≥ 1 (and each chirality); the funny normalization is conventional.
Z 1Z 2
π 2
H= dx π(x)φ̇(x) − L = dx + T (∂x φ)
2 T
∞
1 2 2
X
=L pL + pR +π (ρ−n ρn + ρ̃−n ρ̃n ) + a
|4T {z } n=1
2
π0
2T
+ T2 w2
2
∞
1 j 2
X
= + T (2πm) + π n Nn + Ñn + a (14.8)
2L T n=1
Here a is a (UV sensitive) constant which will not be important for us (it is very
important in string theory), which is the price we pay for writing the hamiltonian as
114
a sum of normal-ordered terms – the modes with negative indices are to the right and
they annihilate the vacuum:
which takes the radius of the circle to its inverse and exchanges the momentum and
winding modes. This is called T-duality. The required duality map on the fields is
φL + φR ↔ φL − φR .
√
(The variable R in the plot is R ≡ πT .)
115
T-duality says string theory on a large circle is the same as string theory on a
small circle. On the homework you’ll get to see a derivation of this statement in the
continuum which allows some generalizations.
Vertex operators. It is worthwhile to pause for another moment and think about
the operators which create the winding modes. They are like vortex creation operators.
Since φ has logarithmic correlators, you might think that exponentiating it is a good
idea. First let’s take advantage of the fact that the φ correlations split into left and
right bits to write φ(z, z̄) = φL (z) + φR (z̄):
1 z 1 z̄
hφL (z)φL (0)i = − log , hφR (z̄)φR (0)i = − log , hφL (z)φR (0)i = 0 .
πT a πT a
(14.9)
A set of operators with definite scaling dimension is:
P ρn n P ρn n
: eiαφL (z) :≡ eiαqL eiαpL z eiα n<0 n w eiα n>0 n w
(14.10)
The monster in front here creates oscillator excitations. I wrote q0 ≡ qL + qR and
φ̃0 ≡ qL − qR . The important thing is that the winding number has been incremented
by α − β; this means that α − β must be an integer, too. We conclude that
α + β ∈ Z, α−β ∈Z (14.11)
116
so they can both be half-integer, or they can both be integers.
By doing the gaussian integral (or moving the annihilation operators to the right)
their correlators are
D0
hVα,β (z, z̄)Vα0 ,β 0 (0, 0)i = α2 β2 . (14.12)
z πT z̄ πT
The zeromode prefactor D0 is:
D 0 0
E
D0 = ei((α+α )qL +(β+β )qR ) = δα+α0 δβ+β 0 .
0
1
(hL , hR ) = (α2 , β 2 ).
2πT
Notice the remarkable fact that the exponential of a dimension-zero operator manages
to have nonzero scaling dimension. This requires that the multiplicative prefactor
depend on the cutoff a to the appropriate power (and it is therefore nonuniversal). We
could perform a multiplicative renormalization of our operators V to remove this cutoff
dependence from the correlators.
The values of α, β allowed by single-valuedness of φ and its wavefunctional are
integers. We see (at least) three special values of the parameter T :
• The SU(2) radius: When 2πT = 1, the operators with (n, m) = 1 are marginal.
Also, the operators with (n, m) = (1, 0) and (n, m) = (0, 1) have the scaling
behavior of currents, and by holomorphicity are in fact conserved.
• The free fermion radius: when 2πT = 2, V1,0 has dimension ( 21 , 0), which is
R
the dimension of a left-moving free fermion, with action dtdxψ̄∂+ ψ. In fact the
scalar theory with this radius is equivalent to a massless Dirac fermion! This
equivalence is an example of bosonization. In particular, the radius-changing de-
formation of the boson maps to a marginal four-fermion interaction: by studying
free bosons we can learn about interacting fermions. Should I say more about
this?
After this detour, let’s turn to the drama of the bose-Hubbard model. Starting
from large J/U , where we found a superfluid, what happens as U grows and makes
117
the phase fluctuate more? Our continuum description in terms of harmonic oscillators
hides (but does not ignore) the fact that φ ' φ + 2π. The system admits vortices, aka
winding modes.
Lattice T-duality. To see their effects let us do T-duality on the lattice.
The dual variables live on the bonds, la-
belled by ī = 12 , 23 , 52 ....
Introduce
φi+1 − φi X
mī ≡ , Θī ≡ 2πnj (14.13)
2π
j<ī
To understand where these expressions come from, notice that the operator
P
eiΘī = ei j<ī 2πnj
rotates the phase of the boson on all sites to the left of ī (by 2π). It inserts a vortex
in between the sites i and i + 1. The rotor hamiltonian is
2
U X Θī+1 − Θī X
Hrotors = −J cos 2πmī
2 2π
ī ī !
2
SF X U ∆Θ J
' + (2πmī )2 (14.14)
2 2π 2
ī
where in the second step, we assumed we were in the SF phase, so the phase fluctuations
and hence mī are small. This looks like a chain of masses connected by springs again,
but with the roles of kinetic and potential energies reversed – the second term should
be regarded as a π 2 kinetic energy term. BUT: we must not forget that Θ ∈ 2πZ! It’s
oscillators with discretized positions. We can rewrite it in terms of continuous Θ at the
expense of imposing the condition Θ ∈ 2πZ energetically by adding a term −λ cos Θ32 .
The resulting model has the action
1
Leff = (∂µ Θ)2 − λ cos Θ. (14.15)
2(2π)2 ρs
32
This step seems scary at first sight, since we’re adding degrees of freedom to our system, albeit
gapped ones. Θī is the number of bosons to the left of ī (times 2π). An analogy that I find useful is
to the fact that the number of atoms of air in the room is an integer. This constraint can have some
important consequences, for example, were they to solidify. But in our coarse-grained description
of the fluid phase, we use variables (the continuum number density) where the number of atoms
(implicitly) varies continuously. The nice thing about this story (both for vortices and for air) is that
the system tells us when we can’t ignore this quantization constraint.
118
Ignoring the λ term, this is the T-dual action, with ρs replaced by (2π)12 ρs . The coupling
got inverted here because in the dual variables it’s the J term that’s like the π 2 inertia
term, and the U term is like the restoring force. This Θ = φL − φR is therefore T-dual
variable, with ETCRs
[φ(x), Θ(y)] = 2πisign(x − y). (14.16)
This commutator follows directly from the definition of Θ (14.13). (14.16) means that
the operator cos Θ(x) jumps the SF phase variable φ by 2π – it inserts a 2π vortex, as
we designed it to do. So λ is like a chemical potential for vortices.
This system has two regimes, depending on the scaling dimension of the vortex
insertion operator:
• If λ is an irrelevant coupling, we can ignore it in the IR and we get a superfluid,
with algebraic LRO.
• If the vortices are relevant, λ → ∞ in the IR, and we pin the dual phase, Θī =
0, ∀ī. This is the Mott insulator, since Θī = 0 means ni = 0 – the number fluctuations
are frozen.
When is λ relevant? Expanding around the free theory,
iΘ(x) −iΘ(0) c
e e =
x2πρs
this has scaling dimension ∆ = πρs which is relevant if 2 > ∆ = πρs . Since the bose
correlators behave as b† b ∼ x−η with η = 2πρ 1
, we see that only if η < 41 do we
p s
have a stable SF phase. (Recall that ρs = J/U .) If η > 14 , the SF is unstable to
proliferation of vortices and we end up in the Mott insulator, where the quantization of
particle number matters. A lesson: we can think of the Mott insulator as a condensate
of vortices. [End of Lecture 58]
Note: If we think about this euclidean field theory as a 2+0 dimensional stat-mech
problem, the role of the varying ρs is played by temperature, and this transition we’ve
found of the XY model, where by varying the radius the vortices become relevant, is
the Kosterlitz-Thouless transition. Most continuous phase transitions occur by tuning
the coefficient of a relevant operator to zero (recall the general O(n) transition, where
we have to tune r → rc to get massless scalars). This is not what happens in the 2d
XY model; rather, we are varying a marginal parameter and the dimensions of other
operators depend on it and become relevant at some critical value of that marginal
−√ a
parameter. This leads to very weird scaling near the transition, of the form e K−Kc
(for example, in the correlation length, the exponential arises from inverting expres-
1
sions involving GR (z) = − 4πK log z) – it is sometimes called an ‘infinite order’ phase
transition, because all derivatives of such a function are continuous.
119
14.2 (2+1)-d XY is dual to (2+1)d electrodynamics
Earlier (during our discussion of boson coherent states) I made some claims about the
phase diagram of the Bose-Hubbard model
X X †
HBH = (−µni + U ni (ni − 1)) + bi wij bj
i ij
Here Ψ is an effective field which incorporates the effects of the neighboring sites.
Notice that nonzero Ψ breaks the U(1) boson number conservation: particles can hop
out of the site we are considering. This also means that nonzero Ψ will signal SSB.
What does this simple approximation give up? For one, it assumes the ground-
state preserves the lattice translation symmetry, which doesn’t always happen. More
painfully, it also gives up on any entanglement at all in the groundstate. Phases for
which entanglement plays an important role will not be found this way.
We want to minimize over Ψ the quantity
1 1
E0 ≡ hΨvar | HBH |Ψvar i = hΨvar | HBH − HM F +HM F |Ψvar i
M M | P {z }
=w b† b+Ψb+h.c.
1
EM F (Ψ) − zw b† hbi + hbi Ψ? + b† Ψ.
= (14.17)
M
Here z is the coordination number of the lattice (the number of neighbors of a site,
which we assume is the same for every site), and h..i ≡ hΨvar | .. |Ψvar i.
First consider w = 0, no hopping. Then ΨB = 0 (neighbors
don’t matter), and the single-site state is a number eigenstate
|ψi i = |n0 (µ/U )i, where n0 (x) = 0 for x < 0, and n0 (x) =
dxe, (the ceiling of x, i.e. , the next integer larger than x), for
x > 0. Precisely when µ/U is an integer, there is a twofold
degeneracy per site.
120
This degeneracy is broken by a small hopping term. Away from the degenerate
points, within a single Mott plateau, the hopping term does very little (even away
from mean field theory). This is because there is an energy gap, and [N, HBH ] = 0,
which means that a small perturbation has no other states to mix in which might
have other eigenvalues of N . Therefore, within a whole open set, the particle number
remains fixed. This means ∂µ hN i = 0, the system is incompressible.
We can find the boundaries of this region by expanding E0 in Ψ, following Landau:
E0 = E00 + r|Ψ|2 + O(|Ψ|4 ). We can compute the coefficients in perturbation theory,
and this produces the following picture.
Mean field theory gives the famous picture at
right, with lobes of different Mott insulator states
with different (integer!) numbers of bosons per
site. (The hopping parameter w is called t in the
figure.)
where we introduced the hopping matrix wij = w if hiji share a link, otherwise zero.
Here the bs are numbers, coherent state eigenvalues. Here is another application of the
Hubbard-Stratonovich transformation:
Z R 1/T 0
Z = [d2 b][d2 Ψ]e− 0 dτ Lb
X † X
with L0b = bi ∂τ bi − µb†i bi + U b†i b†i bi bi − Ψb†i − Ψ? bi + −1
Ψi wij Ψj .
i ij
(Warning: if w has negative eigenvalues, so that the gaussian integral over Ψ is well-
defined, we need to add a big constant to it, and subtract it from the single-particle
terms.) Now integrate out the b fields. It’s not gaussian, but notice that the result-
R 2 −S[b]+R Ψb+h.c.
ing action for Ψ is the connected generating function W [J]: [d b]e =
−W [Ψ,Ψ? ]
e . More specifically,
Z R 1/T
V
Z = [d2 Ψ]e− T F0 − 0 dτ LB
121
~ 2 + r̃|Ψ|2 + u|Ψ|4 + · · ·
with LB = K1 Ψ? ∂τ Ψ + K2 |∂τ Ψ|2 + K3 |∇Ψ|
Here V = M ad is the volume of space, and F0 is the mean-field free energy. The
coefficients K etc are connected Green’s functions of the bs. The choice of which terms
I wrote was dictated by Landau, and the order in which I wrote them should have been
determined by Wilson. The Mott-SF transition occurs when r̃ changes sign, that is,
the condition r̃ = 0 determines the location of the Mott-SF boundaries. You can see
that generically we have z = 2 kinetic terms. Less obvious is that r̃ is proportional to
the mean field coefficient r.
Here’s the payoff. I claim that the coefficients in the action for Ψ are related by
This means that K1 = 0 precisely when the boundary of the lobe has a vertical tangent.
This means that right at those points (the ends of the dashed lines in the figure) the
second-order kinetic term is the leading one, and we have z = 1.
Here’s the proof of (14.18). LB must have the same symmetries as Lb . One such
invariance is
bi → bi eiφ(τ ) , Ψi → Ψi eiφ(τ ) , µ → µ + i∂τ φ.
This is a funny transformation which acts on the couplings, so doesn’t produce Noether
currents. It is still useful though, because it implies
14.2.3 Duality
We have seen above (in §14.1) that the prevention of vortices is essential to superfluidity,
which is the condensation of bosons. In D = 1 + 1, vortices are events in spacetime.
In D = 2 + 1, vortices are actual particles, i.e. localizable objects, around which the
superfluid phase variable winds by 2π (times an integer).
More explicitly, if the boson field which condenses is b(x) = veiφ , and we choose
polar coordinates in space x + iy ≡ Reiϕ , then a vortex is a configuration of the order
R→∞
parameter field of the form b(x) = f (R)eiϕ , where f (R) → v far away: the phase
of the order parameter winds around. Notice that the phase is ill-defined in the core
R→0
of the vortex where f (R) → 0. (This is familiar from our discussion of the Abelian
Higgs model.)
To see the role of vortices in destroying superfluidity more clearly, consider super-
fluid flow in a 2d annulus geometry, with the same polar coordinates x + iy = Reiϕ . If
122
the superfluid phase variable is in the configuration φ(R, ϕ) = nϕ, then the current is
~
J(R, ~ = ϕ̌ρs n .
ϕ) = ρs ∇φ
2πR
The current only changes if the integer n changes. This happens if vortices enter from
the outside; removing the current (changing n to zero) requires n vortices to tunnel all
the way through the sample, which if they are gapped and the sample is macroscopic
can take a cosmologically long time.
There is a dual statement to the preceding three paragraphs: a state where the
bosons themselves are gapped and localized – that is, a Mott insulator – can be de-
scribed starting from the SF phase by the condensation of vortices. To see this, let us
consider again the (simpler-than-Bose-Hubbard) 2 + 1d rotor model
X X
Hrotors = U n2i − J cos (φi − φj )
i hiji
and introduce dual variables. Introduce a dual lattice whose sites are (centered in) the
faces of the original (direct) lattice; each link of the dual lattice crosses one link of the
direct lattice.
φ −φ
• First let eīj̄ ≡ i2π j . Here we define īj̄ by the right hand rule:
ij × īj̄ = +ž (ij denotes the unit vector pointing from i to j). This
is a lattice version of ~e = ž × ∇φ ~ 1 . Defining lattice derivatives
2π
∆x φi ≡ φi − φi+x̌ , the definition is ex = − ∆2πy φ , ey = ∆2πx φ . It is like
an electric field vector.
• The conjugate variable to the electric field is aīj̄ , which must
therefore be made from the conjugate variable of φi , namely ni :
[ni , φj ] = −iδij . Acting with ni translates φi , which means that it
shifts all the eīj̄ from the surrounding plaquettes. More precisely:
2πni = a1̄2̄ + a2̄3̄ + a3̄4̄ + a4̄1̄ .
1 ~
This is a lattice, integer version of n ∼ 2π ∇ × a · ž. In terms of these variables,
2
U X ∆×a X
Hrotors = −J cos 2πeīj̄
2 i 2π
hīj̄i
123
(More explicitly, 2π ∇~ · ~e = zij ∂i ∂j φ = [∂x , ∂y ]φ clearly vanishes if φ is single-valued.)
Since this is true for any region R, we have
~ · ~e = 2πδ 2 (vortices).
∇
Actually, the lattice version of the equation has more information (and is true) because
it keeps track of the fact that the number of vortices is an integer:
~ · ~e(ī) = 2πnv (ī),
∆x ex + ∆y ey ≡ ∆ nv (ī) ∈ Z.
It will not escape your notice that this is Gauss’ law, with the density of vortices playing
the role of the charge density.
Phases of the 2d rotors. Since ~e ∼ ∇φ ~ varies continuously, i.e. electric flux
is not quantized, this is called noncompact electrodynamics. Again we will impose
the integer constraint a ∈ 2πZ energetically, i.e. let a ∈ R and add (something like)
?
∆H = −t cos a and see what happens when we make t finite. The expression in the
previous sentence is not quite right, yet, however: This operator does not commute
~ · ~e − 2πnv = 0 – it jumps ~e but not nv 33 .
with our constraint ∆
We can fix this by introducing explicitly the variable which creates vortices, e−iχ ,
with:
[nv (ī), χ(j̄)] = −iδīj̄ .
Certainly our Hilbert space contains states with different number of vortices, so we
can introduce an operator which maps these sectors. Its locality might be an issue:
certainly it is nonlocal with respect to the original variables, but we will see that we
can treat it as a local operator (except for the fact that it carries gauge charge) in the
dual description. Since nv ∈ Z, χ ' χ + 2π lives on a circle. So:
!
X U ∆ × a 2 J
H∼ + (2πe)2 − t cos (∆χ − a)
2 2π 2
ī
~ · ~e = 2πnv .
still subject to the constraint ∆
Two regimes:
J U, t : This suppresses e and its fluctuations, which means a fluctuates. The
fluctuating a is governed by the gaussian hamiltonian
X
H∼ 2 ~
~e + b 2
33
A set of words which has the same meaning as the above: cos a is not gauge invariant. Under-
~ · ~e − 2πnv as the generator of a
standing these words requires us to think of the operator G(ī) ≡ ∆
transformation, X
δO = s(ī)[G(ī), O].
ī
It can be a useful picture.
124
with b ≡ ∆×a
2π
, which should look familiar. This deconfined phase has a gapless photon;
a 2 + 1d photon has a single polarization state. This is the goldstone mode, and this
regime describes the superfluid phase (note that the parameters work out right in the
original variables). The relation between the photon a and the original phase variable,
in the continuum is
µνρ ∂ν aρ = ∂µ φ.
The above is easier to understand (but a bit less precise) in the continuum. Consider a
quantum system of bosons in D = 2 + 1 with a U(1) particle-number symmetry (a real
symmetry, not a gauge redundancy). Let’s focus on a complex, non-relativistic bose
field b with action
Z
S[b] = dtd2 x b† i∂t − ∇ ~ 2 − µ b − U (b† b)2 . (14.19)
125
In the disordered/unbroken/Mott insulator phase, hbi = 0, and there is a mass gap. A
dimensionless parameter which interpolates between these phases is g = µ/U ; large g
encourages condensation of b.
We can ‘solve’ the continuity equation by writing
j µ = µ·· ∂· a· (14.20)
where a· is a gauge potential. The time component of this equation says that the
boson density is represented by the magnetic flux of a. The spatial components relate
the boson charge current to the electric flux of a. The continuity equation for j is
automatic – it is the Bianchi identity for a – as long as a is single-valued. That is:
as long as there is no magnetic charge present. A term for this condition which is
commonly used in the cond-mat literature is: “a is non-compact.” (More on the other
case below.)
The relation (14.20) is the basic ingredient of the duality, but it is not a complete
description: in particular, how do we describe the boson itself in the dual variables?
In the disordered phase, adding a boson is a well-defined thing which costs a definite
energy. The boson is described by a localized clump of magnetic flux of a. Such a
configuration is energetically favored if a participates in a superconductor – i.e. if a is
coupled to a condensate of a charged field. The Meissner effect will then ensure that
its magnetic flux is bunched together. So this suggests that we should introduce into
the dual description a scalar field, call it Φ, minimally coupled to the gauge field a:
126
The vortices have relativistic kinetic terms, i.e. particle-
hole symmetry. This is the statement that in the ordered
phase of the time-reversal invariant bose system, a vortex
and an antivortex have the same energy. An argument
for this claim is the following. We may create vortices
by rotating the sample, as was done in the figure at right.
With time-reversal symmetry, rotating the sample one way
will cost the same energy as rotating it the other way. Fig: M. Zwierlein.
We can parametrize V as
2
V = λ Φ† Φ − v
– when v < 0, hΦi = 0, Φ is massive and we are in the Coulomb phase. When v > 0
Φ condenses and we are in the Anderson-Higgs phase.
In the previous discussion I have been assuming that the vortices of b have unit
charge under a and are featureless bosons, i.e. do not carry any non-trivial quan-
tum numbers under any other symmetry. If e.g. the vortices have more-than-minimal
charge under a, say charge q, then condensing them leaves behind a Zq gauge theory
and produces a state with topological order. If the vortices carry some charge un-
der some other symmetry (like lattice translations or rotations) then condensing them
127
breaks that symmetry. If the vortices are minimal-charge fermions, then they can only
condense in pairs, again leaving behind an unbroken Z2 gauge theory.
[End of Lecture 59]
Here nij = nji , Θij = Θji – we have not oriented our links (yet). We also impose the
Gauss’ law constraint X
Gs ≡ nl = 0 ∀ sites s,
l∈v(s)
where the notation v(s) means the set of links incident upon the site s (‘v’ is for ‘vicin-
ity’).
We’ll demand that the Hamiltonian is ‘gauge invariant’, that is,
that [H, Gs ] = 0∀s. Any terms which depend only on n are OK.
The natural single-valued object made from Θ is eiΘl , but this is
not gauge invariant. A combination which is gauge invariant is the
plaquette operator, associated to a face p of the lattice:
y
Y
e(−1) iΘl ≡ ei(Θ12 −Θ23 +Θ34 −Θ41 )
l∈∂p
– we put a minus sign on the horizontal links. ∂p denotes the links running around the
boundary of p. So a good hamiltonian is
!
UX 2 X X
H= n −K cos (−1)y Θl .
2 l l 2 l∈∂2
Local Hilbert space. The space of gauge-invariant states is not a tensor product
over local Hilbert spaces. This sometimes causes some confusion. Notice, however, that
we can arrive at the gauge-theory hilbert space by imposing the Gauss’ law constraint
energetically (as in the toric code): Start with the following Hamiltonian acting on the
full unconstrained rotor Hilbert space:
X
Hbig = +Γ∞ Gi + H.
i
128
True to its name, the coefficient Γ∞ is some huge energy scale which penalizes configu-
rations which violate Gauss’ law (if you like, such configurations describe some matter
with rest mass Γ∞ ). So, states with energy Γ∞ all satisfy Gauss’ law. Then further,
we want H to act within this subspace, and not create excitations of enormous energies
like Γ∞ . This requires [Gi , H] = 0, ∀i, which is exactly the condition that H is gauge
invariant.
cos (Θ12 − Θ23 + Θ34 − Θ41 ) = cos (a12 + a23 + a34 + a41 ) ≡ cos (∆ × a)
(in the last term we emphasize that this works in D ≥ 2 + 1 if we remember to take the
component of the curl normal to the face in question). This is (compact) lattice U(1)
gauge theory, with no charges. The word ‘compact’ refers to the fact that the charge
is quantized; the way we would add charge is by modifying the Gauss’ law to
∆ · e(ī) = charge at ī
| {z } | {z }
∈Z =⇒ ∈Z
where the charge must be quantized because the LHS is an integer. (In the noncompact
electrodynamics we found dual to the superfluid, it was the continuous angle variable
which participated in the Gauss’ law, and the discrete variable which was gauge vari-
ant.)
129
(here n(x) is the density of charge) is the generator of gauge transformations, in the
sense that a gauge transformation acts on any operator O by
P P
O 7→ e−i x α(x)G(x)
Oei x α(x)G(x)
(14.21)
This is a fact we’ve seen repeatedly above, and it is familiar from ordinary QED, where
using the canonical commutation relations
[ai (x), ej (y)] = −iδ ij δ(x − y), [φ(x), n(y)] = −iδ(x − y)
(φ is the phase of a charged field, Φ = ρeiφ ) in (14.21) reproduce the familiar gauge
transformations
~
~a → ~a + ∇α, φ→φ+α .
SO: if all the objects appearing in Gauss’ law are integers (which is the case if
charge is quantized and electric flux is quantized), it means that the gauge parameter
α itself only enters mod 2π, which means the gauge transformations live in U(1), as
opposed to R. So it’s the gauge group that’s compact.
This distinction is very important, because (in the absence of matter) this model
does not have a deconfined phase! To see this result (due to Polyakov), first consider
strong coupling:
U K : The groundstate has el̄ = 0, ∀¯l. (Notice that this configuration satisfies
the constraint.) There is a gap to excitations where some link has an integer e 6= 0, of
order U . (If e were continuous, there would not be a gap!) In this phase, electric flux
is confined, i.e. costs energy and is generally unwanted.
U K : The surprising thing is what happens when we make the gauge coupling
weak.
Then we should first minimize the magnetic flux term: min-
imizing − cos(∆ × a) means ∆ × a ∈ 2πZ. Near each min-
imum, the physics looks like Maxwell, h ∼ e2 + b2 + · · · .
BUT: it turns out to be a colossally bad idea to ignore the
tunnelling between the minima. To see this, begin by solving the Gauss law constraint
∆ · e = 0 by introducing
1
e1̄2̄ ≡ (χ2 − χ1 ) (14.22)
2π
1
(i.e. ~e = ž · ∆χ 2π .) χ is a (discrete!) ‘height variable’. Then the operator
ei(∆×a)(ī)
increases the value of eīā for all neighboring sites ā, which means it jumps χī → χī +2π.
So we should regard
(∆ × a) (ī) ≡ Πχ (ī)
130
as the conjugate variable to χ, in the sense that
Notice that this is consistent with thinking of χ as the dual scalar related to the
gauge field by our friend the (Hodge) duality relation
∂µ χ = µνρ ∂ν aρ .
The spatial components i say ∂i χ = ij f0j , which is the continuum version of (14.22).
The time component says χ̇ = ij fij = ∇×a, which indeed says that (if χ has quadratic
kinetic terms), the field momentum of χ is the magnetic flux. So χ is the would-be
transverse photon mode.
– that is, eiχ is a raising operator for ∆ × a. To analyze whether the Maxwell limit
survives this, let’s go to the continuum and study perturbations of the free hamiltonian
Z 2
U ~ K 2
H0 = ∇χ + Πχ
2 2
by Z
H1 = − V0 cos χ .
131
has constant amplitude at large r! That means that the operator has dimension zero,
R
and the perturbation in the action has [S1 = − V0 cos χd2 xdτ ] ∼ L3 , very relevant.
The result is that it pins the χ field (the would-be photon mode) to an integer, from
which it can’t escape. This result is due to Polyakov.
And
1
hcos χ(x) cos χ(0)i = cos
4T x
which does not decay at long distance, and in fact approaches a constant.
(I’ve absorbed a factor of the gauge coupling into χ to make the dimensions work
nicely, µνρ ∂ν Aρ = g∂µ χ) and the expectation is
Z
2
R
−1
hW (2)i = Z [dχ]e−Sχ +gi χ̇ ∼ e−cg mχ ·area() .
132
In the last step we did the gaussian integral from small χ fluctuations. This
area-law behavior proportional to mχ means that the mass for χ confines the
gauge theory. This is the same (Polyakov) effect we saw in the previous section,
where the monopole tunneling events produced the mass.
• Think about the action of eiχ(x,t) from the point of view of 2 + 1d spacetime:
it inserts 2π magnetic flux at the spacetime point x, t. From that path integral
viewpoint, this is an event localized in three dimensions which is a source of mag-
netic flux – a magnetic monopole. In Polyakov’s paper, he uses a UV completion
of the abelian gauge theory (not the lattice) in which the magnetic monopole is
a smooth solution of field equations (the ’t Hooft-Polyakov monopole), and these
solutions are instanton events. The cos χ potential we have found above arises
from, that point of view, by the same kind of dilute instanton gas sum that we
did in the D = 1 + 1 Abelian Higgs model.
[The original papers are this and this; this treatment follows Ami Katz’ BU Physics 811
notes.] Consider a square lattice with quantum spins (spin half) at the sites, governed
by the Hamiltonian
X X 1
1
HJQ ≡ J S~i · S
~j + Q S~i · S
~j − S~k · S
~l − .
4 4
hiji [ijkl]
Here hiji denotes pairs of sites which share a link, and [ijkl] denotes groups of four sites
at the corners of a plaquette. This JQ-model is a somewhat artificial model designed
to bring out the following competition which also exists in more realistic models:
J Q : the groundstate is a Neel antiferromagnet (AFM), with local order param-
~i , whose expectation value breaks the spin symmetry SU(2) →
eter ~n = i (−1)xi +yi S
P
133
parameter on the square lattice is
X
V = (−1)xi S~i · S
~i+x + i(−1)yi S
~i · S
~i+y ∈ Z4 .
i
In the four solid states, it takes the values 1, i, −1, −i. Notice
that they are related by multiplication by i = eiπ/2 . V is
a singlet of the spin SU(2), but the VBS states do break
spacetime symmetries: a lattice rotation acts by Rπ/2 : V →
−iV (the Neel order ~n is invariant), while a translation by a
single lattice site acts by
To get a big hint, notice that the VBS order parameter is like
a discrete rotor: if we had a triangular lattice it would be in
Z6 and would come closer to approximating a circle-valued
field. In any case, we can consider vortex configurations,
where the phase of V rotates (discretely, between the four
quadrants) as we go around a point in space. Such a vortex
looks like the picture at right.
Notice that inside the core of the vortex, there is necessarily a spin which is not
paired with another spin: The vortex carries spin: it transforms as a doublet under
the spin SU(2). Why do we care about such vortices? I’ve been trying to persuade
you for the past two sections that the way to think about destruction of (especially
U(1)) ordered phases is by proliferating vortex defects. Now think about proliferating
this kind of VBS vortex. Since it carries spin, it necessarily must break the SU(2)
134
symmetry, as the Neel phase does. This is why the transitions happen at the same
point.
To make this more quantitative, let’s think about it from the AFM side: how do
we make V from the degrees of freedom of the low energy theory? It’s not made from
n since it’s a spin singlet which isn’t 1 (spin singlets made from n are even under a
lattice translation). What about the CP1 version, aka the Abelian Higgs model, aka
scalar QED (now in D = 2 + 1)?
1 2 λ
L=− 2
F + |Dz|2 − m2 |z|2 − |z|4
4g 4
z
where z = ↑ , and Dµ z = (∂µ − iAµ )z as usual. Let’s think about the phases of this
z↓
model.
m2 < 0 : Here z condenses and breaks SU(2) → U(1), and Aµ is higgsed. A gauge
invariant order parameter is ~n = z †~σ z, and there are two goldstones associated with
its rotations. This is the AFM. The cautionary tale I told you about this phase in
D = 1 + 1 doesn’t happen because now the vortices are particles rather than instanton
events. More on these particles below.
m2 > 0 : Naively, in this phase, z are uncondensed and massive, leaving at low
?
energies only Llow-E = − 4g12 F 2 , Maxwell theory in D = 2 + 1. This looks innocent
but it will occupy us for quite a few pages starting now. This model has a conserved
current (conserved by the Bianchi identity)
The thing that’s conserved is the lines of magnetic flux. We can follow these more
effectively by introducing the dual scalar field by a by-now-familiar duality relation:
You can think of the last equation here as a solution of the conservation law ∂µ JFµ = 0.
The symmetry acts on χ by shifts: χ → χ + constant. In terms of χ, the Maxwell
action is
? 1 1
Llow-E = − 2 F 2 = ∂µ χ∂ µ χ.
4g 2
But this is a massless scalar, a gapless theory. And what is the χ → χ + c symmetry
in terms of the spin system? I claim that it’s the rotation of the phase of the VBS
order parameter, which is explicitly broken by the squareness of the square lattice. An
improvement would then be
1 1
Llow-E = ∂µ χ∂ µ χ − m2χ χ2 + · · ·
2 2
135
1
where mχ ∼ a2
comes from the lattice breaking the rotation invariance (a is the lattice
spacing).
To see that shifts of χ are VBS rotations, let’s reproduce the lattice symmetries in
the Abelian Higgs model. Here’s the action of lattice translations T ≡ Tx or Ty (take
a deep breath.): T : na → −na but na = z † σ a z, so on z we must have T : z → iσ 2 z ? .
The gauge current is jµ = iz † ∂µ z + h.c. → −jµ which means we must have Aµ → −Aµ
and Fµν → −Fµν . Therefore by (14.26) we must have T : ∂χ → −∂χ which means
that
Tx,y : χ → −χ + gαx,y
where αx,y are some so-far-undetermined numbers, and g is there on dimensional
grounds. Therefore, by choosing Tx,y χ → −χ ± gπ/2, Rπ/2 : χ → χ − gπ/2 we can
reproduce the transformation (14.25) by identifying
V = ceiχ/g
(up to an undetermined overall complex number). Notice for future reference the
canonical commutation relation between the flux current density (JF0 = g χ̇ = gi δχ
δ
) and
V:
[JF0 (x), V (0)] = V (0)δ 2 (x). (14.27)
It creates flux.
So χ is like the phase of the bosonic operator V which is condensed in the VBS
phase; lattice effects break the U(1) symmetry down to some discrete subgroup (Z4 for
the square lattice, Z6 for triangular, Z3 for honeycomb), with a potential of the form
V(V k ) = m3χ cos(4χ/g) + · · · , where k = 4, 6, 3... depends on the lattice, which has k
minima, corresponding to the k possible VBS states. By (14.27), such a potential has
charge k under JF .
Consider this phase from the point of view of the gauge theory now. Notice that χ
is the same (up to a factor) dual variable we introduced in our discussion of compact
QED, and the Wilson loop will again produce an area law if χ is massive, as with the
Polyakov effect.
In order for this story to make sense, we need that M, g 2 a12 , so that χ is actually
a low-energy degree of freedom. The idea is that the critical point from tuning J/Q
to the critical value is reached by taking mχ → 0. What is the nature of this critical
theory? It has emergent deconfined gauge fields, even though the phases on either side
of the critical point do not (they are confined m > 0 and Higgsed m < 0 respectively).
Hence the name deconfined quantum criticality.
The conjecture (which would explain the phase diagram above) is that this gauge
theory is a critical theory (in fact a conformal field theory) with only one relevant
136
operator (the one which tunes us through the phase transition, the mass for χ) which
is a singlet under all the symmetries. Recall that eikχ has charge k under the JF
symmetry, and the square lattice preserves a Z4 ⊂ U(1) subgroup, so only allows the 4-
vortex-insertion operator ei4χ . What is the dimension of this operator? The conjecture
is that it has dimension larger than 3.
[End of Lecture 60]
Insanely brief sketch of a check at large N . Actually, this can be checked very
explicitly in a large-N version of the model, with N component z fields, so that the spin
is φA = z † T A z, A = 1..N 2 − 1. This has SU(N ) symmetry. When m2 < 0, it is broken
to SU(N − 1), with 2(N − 1) goldstone bosons. (Actually there is a generalization of
the lattice model which realizes this – just make the spins into N × N matrices.)
Introducing an H-S field σ to decouple the |z|4 interaction, we can make the z
integrals gaussian, and find (this calculation is just like our earlier analysis in §11.3.4)
Z
1 1 c1 N 2m + ip µν 1 c2 N 2m + ip
S[A, σ] = d̄p Fµν (p) + log F (−p) + σ(p) − + log σ(−p)
4 gU2 V ip 2m − ip λ ip 2m − ip
This is Z Z Z
Zk = [dA]δ F −k [dzdz † ]e−S[z,A] ≡ e−Fk
137
which at large-N we can do by saddle point. The dominant configuration of the gauge
field is the charge-k magnetic monopole Aϕ = k2 (1 − cos ϕ), and we must compute
Z −N/2
† †
N
z † (−DA DA +m2 )z
= e− 2 tr log(−DA DA +m )
† 2
2 2
[d z]e = det −DA DA + m
Z X
Fk = N T d̄ω (2` + 1) log(ω 2 + λ` (k) + m2 ).
`
Pure field theory description. We’ve been discussing a theory with U(1)VBS ×
SU(2)spin symmetry. Lattice details aside, how can we encode the way these two
symmetries are mixed up which forces the order parameter of one to be the disor-
der operator for the other? To answer this, briefly consider enlarging the symmetry
to SO(5) ⊂ U(1)VBS × SU(2)spin , and organize (ReV, ImV, n1 , n2 , n3 ) ≡ na into a 5-
component mega-voltron-spin vector. We saw that in D = 0 + 1, we could make a
WZW term with a 3-component spin
Z
1 2 3
W0 [(n , n , n )] = abc na dnb ∧ dnc .
B2
Its point in life was to impose the spin commutation relations at spin s when the
coefficient is 2s. In D = 1 + 1, we can make a WZW term with a 4-component spin,
which can have SO(4) symmetry
Z
1 2 3 4
W1 [(n , n , n , n )] = abcd na dnb ∧ dnc ∧ dnd .
B3
34
Once we’ve got this far, how can you resist considering
Z
1 2 3 4 5
W2 [(n , n , n , n , n )] = abcde na dnb ∧ dnc ∧ dnd ∧ dne .
B4
34
In fact the D = 1 + 1 version of this is extremely interesting. A few brief comments: (1) involves
a real VBS order parameter n4 .) (2) The D = 1 + 1 term has the same number of derivatives (in
the EOM) as the kinetic term ∂na ∂na . This means they can compete at a fixed point. The resulting
CFTs are called WZW models. (3) The above is in fact a description of the spin-half chain, which
previously we’ve described by an O(3) sigma model at θ = π.
138
What does this do? Break the SO(5) → U(1) × SU(2) and consider a vortex configu-
ration of V at x2 = x3 = 0. Suppose our action contains the term kW2 [n] with k = 1.
Evaluate this in the presence of the vortex:
Z
1 2 3 4 5
kW2 [(n , n , n , n , n )|vortex of n1 + in2 at x2 = x3 = 0 ] = abc na dnb ∧dnc = kW0 [(n1 , n2 , n3 )].
B2 |x2 =x3 =0
139
15 Effective field theory
[Some nice lecture notes on effective field theory can be found here: J. Polchinski,
A. Manohar, D. B. Kaplan, H. Georgi.]
Diatribe about ‘renormalizability’. Having internalized Wilson’s perspective
on renormalization – namely that we should include all possible operators consistent
with symmetries and let the dynamics decide which are important at low energies – we
are led immediately to the idea of an effective field theory (EFT). There is no reason to
demand that a field theory that we have found to be relevant for physics in some regime
should be a valid description of the world to arbitrarily short (or long!) distances. This
is a happy statement: there can always be new physics that has been so far hidden
from us. Rather, an EFT comes with a regime of validity, and with necessary cutoffs.
As we will discuss, in a useful implementation of an EFT, the cutoff implies a small
parameter in which we can expand (and hence compute).
Caring about renormalizibility is pretending to know about physics at arbitrarily
short distances. Which you don’t.
Even when theories are renormalizable, this apparent victory is often false. For
example, QED requires only two independent counterterms (mass and charge of the
electron), and is therefore by the old-fashioned definition renormalizable, but it is
superseded by the electroweak theory above 80GeV. Also: the coupling in QED actually
increases logarithmically at shorter distances, and ultimately reaches a Landau pole
c 1
at SOME RIDICULOUSLY HIGH ENERGY (of order e+ α where α ∼ 137 is the fine
structure constant (e.g. at the scale of atomic physics) and c is some numerical number.
Plugging in numbers gives something like 10330 GeV, which is quite a bit larger than the
Planck scale). This is of course completely irrelevant for physics and even in principle
because of the previous remark about electroweak unification. And if not because of
that, because of the Planck scale. A heartbreaking historical fact is that Landau and
many other smart people gave up on QFT as a whole because of this silly fantasy about
QED in an unphysical regime.
We will see below that even in QFTs which are non-renormalizable in the strict
sense, there is a more useful notion of renormalizability: effective field theories come
with a parameter (often some ratio of mass scales), in which we may expand the action.
A useful EFT requires a finite number of counterterms at each order in the expansion.
Furthermore, I claim that this is always the definition of renormalizability that
we are using, even if we are using a theory which is renormalizable in the traditional
sense, which allows us to
pretend
n that there is no cutoff. That is, there could always
E
be corrections of order Enew where E is some energy scale of physics that we are
140
doing and Enew is some UV scale where new physics might come in; for large enough
n, this is too small for us to have seen. The property of renormalizibility that actually
matters is that we need a finite number of counterterms at each order in the expansion
E
in Enew .
Renormalizable QFTs are in some sense less powerful than non-renormalizable ones
– the latter have the decency to tell us when they are giving the wrong answer! That
is, they tell us at what energy new physics must come in; with a renormalizable theory
we may blithely pretend that it is valid in some ridiculously inappropriate regime like
10330 GeV.
Notions of EFT. There is a dichotomy in the way EFTs are used. Sometimes one
knows a lot about the UV theory (e.g.
• QCD,
• electrons in a solid,
• water molecules
...) but it is complicated and unwieldy for the questions one wants to answer, so instead
one develops an effective field theory involving just the appropriate and important dofs
(e.g., respectively,
• Landau Fermi liquid theory (or the Hubbard model or a topological field theory
or ...),
...). As you can see from the preceding lists of examples, even a single UV theory
can have many different IR EFTs depending on what phase it is in, and depending on
what question one wants to ask. The relationship between the pairs of theories above
is always coarse-graining from the UV to the IR, though exactly what plays the role
of the RG parameter can vary wildly. For example, in the example of the Fermi liquid
theory, the scaling is ω → 0, and momenta scale towards the Fermi surface, not ~k = 0.
A second situation is when one knows a description of some low-energy physics up
to some UV scale, and wants to try to infer what the UV theory might be. This is a
141
common situation in physics! Prominent examples include: the Standard Model, and
quantized Einstein gravity. Occasionally we (humans) actually learn some physics and
an example of an EFT from the second category moves to the first category.
Then write down all interactions between the dofs which preserve the symmetry in an
expansion in derivatives, with higher-dimension operators suppressed by more powers
of the UV scale.
I must also emphasize two distinct usages of the term ‘effective field theory’ which
are common, and which the discussion above is guilty of conflating (this (often slip-
pery) distinction is emphasized in the review article by Georgi linked at the beginning
of this subsection). The Wilsonian perspective advocated above produces a low-energy
description of the physics which is really just a way of solving (if you can) the original
model; very reductively, it’s just a physically well-motivated order for doing the inte-
grals. If you really integrate out the high energy modes exactly, you will get a non-local
action for the low energy modes. This is to be contrasted with the local actions one
uses in practice, by truncating the derivative expansion. It is the latter which is really
the action of the effective field theory, as opposed to the full theory, with some of the
integrals done already. The latter will give correct answers for physics below the cutoff
scale, and it will give them much more easily.
Some interesting and/or important examples of EFT that we will not discuss ex-
plicitly, and where you can learn about them:
• Hydrodynamics [Kovtun]
• Fermi liquid theory [J. Polchinski, R. Shankar, Rev. Mod. Phys. 66 (1994) 129]
142
• color superconductors [D. B. Kaplan, §5]
There are many others, the length of this list was limited by how long I was willing to
spend digging up references. Here is a longer list.
[from §5 of A. Manohar’s EFT lectures] As a first example, let’s think about part of
the Standard Model.
ig
LEW 3 − √ ψ̄i γ µ PL ψj Wµ Vij + terms involving Z bosons (15.1)
2
If we are asking questions with external momenta less than MW , we can integrate
out W and make our lives simpler:
2
−igµν
Z
ig
Vij Vk` d̄D p 2
?
ψ̄i γ µ PL ψj (p) ψ̄k γ ν PL ψ` (−p)
δSef f ∼ √ 2
2 p − MW
(I am lying a little bit about the W propagator in that I am not explicitly projecting
out the fourth polarization with the negative residue. Also hidden in my notation is
the fact that the W carries electric charge, so the charges of ψ̄i and ψj in (15.1) must
143
differ by one.) This is non-local at scales p >
∼ MW (recall our discussion in §8 (215B)
2 2
with the two oscillators). But for p MW ,
1 p2 MW
2
1 p2 p4
' − 1 + 2 + 4 + ... (15.2)
2 2 2
p − MW MW M M
| W {z W
}
derivative couplings
Z
4GF ? 4 µ
1
SF = − √ Vij Vkl d x ψ̄i γ PL ψj (x) ψ̄k γµ PL ψ` (x)+O 2
+kinetic terms for fermions
2 MW
(15.3)
√ g2
where GF / 2 ≡ 8M 2 is the Fermi coupling. We can use this (Fermi’s) theory to
W
compute the amplitudes above, and it is much simpler than the full electroweak theory
(for example I don’t have to lie about the form of the propagator of the W-boson like
I did above).
On the other hand, this theory is not the same as the electroweak theory; for
example it is not renormalizable, while the EW theory is. Its point in life is to help
facilitate the expansion in 1/MW . There is something about the expression (15.3) that
2
should make you nervous, namely the big red 1 in the 1/MW corrections: what makes
up the dimensions? This becomes an issue when we ask about ...
I skipped this subsection in lecture. Skip to §15.3. Suppose we try to define the Fermi
theory SF with a euclidean momentum cutoff |kE | < Λ, like we’ve been using for most
of our discussion so far. We expect that we’ll have to set Λ ∼ MW . A simple example
which shows that this is problematic is to ask about radiative corrections in the 4-Fermi
theory to the coupling between the fermions and the Z (or the photon).
We are just trying to estimate the magnitude of this correction, so don’t worry
about the factors and the gamma matrices:
Z Λ
1 11
∼I≡ 2 d̄4 k tr (γ...) ∼ O(1).
MW kk
|
|{z} R {z }
∝GF Λ 2
∼ kdk∼Λ2 ∼MW
`
p2
Even worse, consider what happens if we use the vertex coming from the 2
MW
144
correction in (15.2)
Λ `
k2
Z
1 4 1
∼ I` ≡ 2 d̄ k 2 2
∼ O(1)
MW k MW
where m is some mass scale other than the RG scale µ (like a fermion mass parameter,
or an external momentum, or a dynamical scale like ΛQCD ).
We will give a more detailed example next. The point is that in a mass-independent
scheme, the regulator doesn’t produce new dimensionful things that can cancel out the
factors of MW in the denominator. It respects the ‘power counting’: if you see 2`
powers of 1/MW in the coefficient of some term in the action, that’s how many powers
will suppress its contributions to amplitudes. This means that the EFT is like a
renormalizable theory at each order in the expansion (here in 1/MW ), in that there is
only a finite number of allowed vertices that contribute at each order (counterterms
for which need to be fixed by a renormalization condition). The insatiable appetite for
counterterms is still insatiable, but it eats only a finite number at each order in the
expansion. Eventually you’ll get to an order in the expansion that’s too small to care
about, at which point the EFT will have eaten only a finite number of counterterms.
There is a price for these wonderful features of mass-independent schemes, which
has two aspects:
• Heavy particles (of mass m) don’t decouple when µ < m. For example, in a
mass-independent scheme for a gauge theory, heavy charged particles contribute
to the beta function for the gauge coupling even at µ m.
• Perturbation theory will break down at low energies, when µ < m; in the example
just mentioned this happens because the coupling keeps running.
145
We will show both these properties very explicitly in the next subsection. The solution
of both these problems is to integrate out the heavy particles by hand at µ = m, and
make a new EFT for µ < m which simply omits that field. Processes for which we
should set µ < m don’t have enough energy to make the heavy particles in external
states anyway. (For some situations where you should still worry about them, see
Aneesh Manohar’s notes linked above.)
The case study we will make is the contribution of a charged fermion of mass m to the
running of the QED gauge coupling.
Recall that the QED Lagrangian is
1
− Fµν F µν − ψ̄ (iD
/ − m) ψ
4
with Dµ = ∂µ − ieAµ . By redefining the field Fµν = ∂µ Aν − ∂ν Aµ by a constant factor
we can move around where the e appears, i.e. by writing à = eA, we can make the
gauge kinetic term look like 4e12 F̃µν F̃ µν . This means that the charge renormalization
can be seen either in the vacuum polarization, the correction to the photon propagator:
146
The contribution of a fermion of mass m and charge e is (factoring out the momentum-
conserving delta function):
!
−i / −i p + /
k + m
Z
(k + m) /
p,µ p,ν
=− d̄D ktr (−ieγ µ ) 2 (−ieγ ν )
k − m2 (p + k)2 − m2
The minus sign out front is from the fermion loop. Some boiling, which you can find
in Peskin (page 247) or Zee (§III.7), reduces this to something manageable. The steps
1
involved are: (1) a trick to combine the denominators, like the Feynman trick AB =
R1 2
1
0
dx (1−x)A+xB . (2) some Dirac algebra, to turn the numerator into a polynomial
in k, p. As Zee says, our job in this course is not to train to be professional integrators.
The result of this boiling can be written
Z 1
N µν
Z
µν 2 D
iΠ = −e d̄ ` dx
0 (`2 − ∆)2
with ` = k + xp is a new integration variable, ∆ ≡ m2 − x(1 − x)p2 , and the numerator
is
In dim reg, the one-loop vacuum polarization correction satisfies the gauge in-
varaince Ward identity Πµν = P µν δΠ2 (unlike the euclidean momentum cutoff which
is not gauge invariant). A peek at the tables of dim reg integrals shows that δΠ2 is:
Z 1
2 Peskin p. 252 8e2 Γ(2 − D/2)
δΠ2 (p ) = − D/2
dxx(1 − x) µ̄
(4π)Z 0 ∆2−D/2
1
e2
D→4 2 ∆
= − 2 dxx(1 − x) − log (15.5)
2π 0 µ2
where we have introduced the heralded µ:
µ2 ≡ 4π µ̄2 e−γE
147
e2
2
m − x(1 − x)p2
Z
(M )
=⇒ Π2 (p2 )
= 2 dxx(1 − x) log .
2π m2 + x(1 − x)M 2
Notice that the µs go away in this scheme.
Mass-Independent scheme: This is to be contrasted with what we get in a mass-
independent scheme, such as MS, in which Π is defined by the rule that we subtract
the 1/ pole. This means that the counterterm is
e2 2 1
Z
(MS)
δF 2 = − 2 dxx(1 − x) .
2π 0
| {z }
=1/6
Next we will talk about beta functions, and verify the claim above about the failure
of decoupling. First let me say some words about what is failing. What is failing – the
price we are paying for our power counting – is the basic principle of the RG, namely
that physics at low energies shouldn’t care about physics at high energies, except for
small corrections to couplings. An informal version of this statement is: you don’t need
to know about nuclear physics to make toast. A more formal version is the Appelquist-
Carazzone Decoupling Theorem, which I will not state (Phys. Rev. D11, 28565 (1975)).
So it’s something we must and will fix.
Beta functions. M : First in the mass-dependent scheme. Demanding that
physics is independent of our made-up RG scale, we find
!
d (M ) 2 ∂ (M ) ∂ (M ) 2 ∂ (M ) (M )
0=M Π2 (p ) = M + βe e Π2 (p ) = M + βe ·2 Π2 (p2 )
dM ∂M ∂e ∂M |{z}
to this order
148
mM R1
e2 e2
' dxx(1 − x) =
2π 2 0 12π 2
. (15.6)
mM e2
R1 2 x(1−x)
e2 M 2
dxx(1 − x) M
' 2π 2 0 m2
= 60π 2 m2
!
d (MS) ∂ (MS) ∂ (MS) ∂ (MS)
MS : 0 = µ Π2 (p2 ) = µ + βe e Π2 (p2 ) = µ + βe(MS) ·2 Π2 (p2 )
dµ ∂µ ∂e ∂µ |{z}
to this order
1
1 e2 m2 − p2 x(1 − x)
Z
=⇒ βe(MS) =− dxx(1 − x) µ∂µ log
2 2π 2 µ2
|0 {z }| {z }
=1/6 =−2
2
e
= . (15.7)
12π 2
Figure 3: The blue curve is the mass-dependent-scheme beta function; at scales M m, the mass
of the heavy fermion, the fermion sensibly stops screening the charge. The red line is the MS beta
function, which is just a constant, pinned at the UV value.
Also, the MS vacuum polarization behaves for small external momenta like
Z 1
2 2 e2 m2
Π2 (p m ) ' − 2 dxx(1 − x) log 2
2π 0 µ
| {z }
1,for µm! bad!
149
As I mentioned, the resolution of both these prob-
lems is simply to define a new EFT for µ < m
which omits the heavy field. Then the strong cou-
pling problem goes away and the heavy fields do
decouple. The price is that we have to do this by
hand, and the beta function jumps at µ = m; the
coupling is continuous, though.
Table 1: The Standard Model fields and their quantum numbers under the gauge group. 2 indicates
fundamental representation, - indicates singlet. Except for the Higgs, each row is copied three times.
Except for the Higgs all the other fields are Weyl fermions of the indicated handedness. Gauge fields
as implied by the gauge groups. (Some people might leave out the right-handed neutrino, νR .)
Whence the values of the charges under the U(1) (“hypercharge”)? The condition
YL + 3YQ = 0 (where Y is the hypercharge). is required by anomaly cancellation. This
implies that electrons and protons p = ijk ui uj dk have exactly opposite charges of the
same magnitude.
The Lagrangian is just all the terms which are invariant under the gauge group
SU(3) × SU(2) × U(1) with dimension less than or equal to four – all renormalizable
terms. This includes a potential for the Higgs, V (|H|) = m2H |H|2 + λ|H|4 , where it
turns out that m2H ≤ 0. The resulting Higgs vacuum expectation value breaks the
Electroweak part of the gauge group
hHi
SU(2) × U(1)Y U(1)EM .
The broken gauge bosons get masses from the Higgs kinetic term
2 a a 1 0
|Dµ H| | with D H =
µ ∂µ − igWµ τ − ig Yµ H
0 2
H= √
v/ 2
150
where Yµ is the hypercharge gauge boson, and W a , a = 1, 2, 3 are the SU(2) gauge
bosons. The photon and Z boson are
3
Aµ cos θw sin θw Wµ
= .
Zµ − sin θw cos θw Yµ
There are also two massive W -bosons with electric charge ±1.
Fermion masses come from Yukawa couplings
LYukawa = −Yij` L̄i HejR − Yiju Q̄i HdjR − Yijd Q̄i iτ 2 H ? ujR + h.c.
The singlet of SU(5) is the right-handed neutrino, but if we include it, one generation is
an irreducible (spinor) representation of SO(10). This idea is called grand unification.
It is easy to imagine that another instance of the Higgs mechanism accomplishes the
breaking down to the Standard Model. Notice that this means leptons and quarks are
in the same representations – they can turn into each other. This predicts that the
proton should not be perfectly stable. Next we’ll say more about this.
Beyond the Standard Model with EFT. At what energy does the Standard
Model stop working? Because of the annoying feature of renormalizibility, it doesn’t
tell us. However, we have experimental evidence against a cutoff on the Standard
Model (SM) at energies less than something like 10 TeV. The evidence I have in mind
is the absence of interactions of the form
1
δL = ψ̄Aψ · ψ̄Bψ
M2
(where ψ represent various SM fermion fields and A, B can be various gamma and
flavor matrices) with M <
∼ 10 TeV. Notice that I am talking now about interactions
other than the electroweak interactions, which as we’ve just discussed, for energies
151
above MW ∼ 80GeV cannot be treated as contact interactions – you can see the W s
propagate!
If such operators were present, we would have found different answers for exper-
iments at LEP. But such operators would be present if we consider new physics in
addition to the Standard Model (in most ways of doing it) at energies less than 10
TeV. For example, many interesting ways of coupling in new particles with masses
that make them accessible at the LHC would have generated such operators.
A little more explicitly: the Standard Model Lagrangian L0 contains all the renor-
malizable (i.e. engineering dimension ≤ 4) operators that you can make from its fields
(though the coefficients of the dimension 4 operators do vary through quite a large
range, and the coefficients of the two relevant operators – namely the identity operator
which has dimension zero, and the Higgs mass, which has engineering dimension two,
are strangely small, and so is the QCD θ angle).
To understand what lies beyond the Standard Model, we can use our knowledge
that whatever it is, it is probably heavy (it could also just be very weakly coupled,
which is a different story), with some intrinsic scale Λnew , so we can integrate it out
and include its effects by corrections to the Standard Model:
1 1 X (6)
L = L0 + O(5) + 2 ci O i
Λnew Λnew i
where the Os are made of SM fields, and have the indicated engineering dimensions,
and preserve the necessary symmetries of the SM.
In fact there is only one kind of operator of dimension 5:
i
O(5) = c5 ij L̄c H j kl Lk H l
You should read the above tangle of symbols as ‘qqq`’ – it turns three quarks into a
lepton. The epsilon tensor makes a color SU(3) singlet; this thing has the quantum
numbers of a baryon. The long lifetime of the proton (you can feel it in your bones –
see Zee p. 413) then directly constrains the scale of new physics appearing in front of
this operator.
152
Two more comments about this:
• If we didn’t know about the Standard Model, (but after we knew about QM and
GR and EFT (the last of which people didn’t know before the SM for some rea-
son)) we should have made the estimate that dimension-5 Planck-scale-suppressed
1
operators like MPlanck pO would cause proton decay (into whatever O makes). This
m3
predicts Γp ∼ M 2 p ∼ 10−13 s−1 which is not consistent with our bodies not glow-
Planck
ing. Actually it is a remarkable fact that there are no gauge-invariant operators
made of SM fields of dimension less than 6 that violate baryon number. This is
an emergent symmetry, expected to be violated by the UV completion.
2
1
• Surely nothing can prevent ∆L ∼ MPlanck qqq`. Happily, this is consistent
with the observed proton lifetime.
There are ∼ 102 dimension 6 operators that preserve baryon number, and therefore
are not as tightly constrained36 (Those that induce flavor-changing processes in the
SM are more highly constrained and must have Λnew > 104 TeV.) Two such operators
are considered equivalent if they differ by something which vanishes by the tree-level
SM equations of motion. This is the right thing to do, even for off-shell calculations
(like green’s functions and for fields running in loops). You know this from a previous
problem set: the EOM are true as operator equations – Ward identities resulting from
being free to change integration variables in the path integral37 .
15.4 Pions
[Schwartz §28.1] Below the scale of electroweak symmetry breaking, we can forget the
W and Z bosons. Besides the 4-Fermi interactions, the remaining drama is QCD and
electromagnetism:
1 2 X X
LQCD2 = − Fµν +i / αf − mq̄M q.
q̄αf Dq
4 α=L,R f
Here f is a sum over quark flavors, which includes the electroweak doublets, u and
d. Let’s focus on just these two lightest flavors, u and d. We can diagonalize the
36
Recently, humans have gotten better at counting these operators. See this paper.
37
There are a few meaningful subtleties here, as you might expect if you recall that the Ward
identity is only true up to contact terms. The measure in the path integral can produce a Jacobian
which renormalizes some of the couplings; the changes in source terms will drop out of S-matrix
elements (recall our discussion of changing field variables in §??) but can change the form of Green’s
functions. For more information on the use of eom to eliminate redundant operators in EFT, see Arzt,
hep-ph/9304230 and Georgi, “On-Shell EFT”.
153
mass matrix by a field redefinition (this is what makes the CKM matrix meaningful):
mu 0
M= . If it were the case that mu = md , we would have isospin symmetry
0 md
u u
→U , U ∈ SU(Nf = 2).
d d
If, further, there were no masses m = 0, then L and R decouple and we also have chiral
symmetry, q → eiγ5 α q, i.e.
qL → V qL , qR → V −1 qR , V ∈ SU(Nf = 2).
Why do I restrict to SU(2) and not U(2)? The central bit of the axial symmetry
U(1)A is anomalous – it’s divergence is proportional to the gluon F ∧ F , which has all
kinds of nonzero matrix elements. It’s not a symmetry (see Peskin page 673 for more
detail). The central bit of the vectorlike transformation q → eiα q is baryon number, B.
(Actually this is anomalous under the full electroweak symmetry, but B − L is not).
The vacuum of QCD is mysterious, because of infrared slavery. Apparently it is the
case that
hq̄f qf i = V 3
independent of flavor f . This condensate breaks
154
(this will be called a linear sigma model, because Σ transforms linearly) and we can
make singlets (hence an action) out of |Σ|2 = Σij Σ†ji = trΣΣ† :
λ
L = |∂µ Σ|2 + m2 |Σ|2 − |Σ|4 + · · · (15.9)
4
√
V 10
which is designed to have a minimum at hΣi = √2 , with V = 2m/ λ, which
01
preserves SU(2)isospin . We can parametrize the fluctuations about this configuration as
V + σ(x) 2iπaF(x)τ a
Σ(x) = √ e π
2
a
where Fπ will be chosen to give π a (x) canonical kinetic terms. Under gL/R = eiθL/R τ ,
the pion field transforms as
Fπ a 1 abc a
πa → πa + a
(θL − θR ) − a
f (θL + θR ) πc .
|2 {z } |2 {z }
nonlinear realization of SU(2)axial linear realiz’n (adj rep) of SU(2)isospin
The fields π ± , π 0 create pions, they transform in the adjoint representation of the
diagonal SU(2)isospin , and they shift under the broken symmetry. This shift symmetry
forbids mass terms π 2 . The radial excitation σ, on the other hand, is a fiction which
we’ve introduced in (15.9), and which has no excuse to stick around at low energies
(and does not). We can put it out of its misery by taking m → ∞, λ → ∞ fixing Fπ .
In the limit, the useful field to use is
√
2 2iπ a τ a
U (x) ≡ Σ(x)|σ=0 = e Fπ
V
which is unitary U U † = U † U = 1. This last identity means that all terms in an action
for U require derivatives, so (again) no mass for π. The most general Lagrangian for
U can be written as an expansion in derivatives, and is called the chiral Lagrangian:
Fπ2 2
Lχ = trDµ U Dµ U † +L1 tr Dµ U Dµ U † +L2 trDµ U Dν U † trDν U † Dµ U +L3 trDµ U Dµ U † Dν U Dν U † +· · ·
4
(15.10)
In terms of π, the leading terms are
1 a µ a 1 1 0 0 + µ − 1 1 − + 2
0 µ 0
Lχ = ∂µ π ∂ π + 2 − π π Dµ π D π + · · · + 4 π π Dµ π D π + · · ·
2 Fπ 3 Fπ 18
This fixes the relative coefficients of many irrelevant interactions, all with two deriva-
tives, suppressed by powers of Fπ . The expansion of the Li terms have four derivatives,
and are therefore suppressed by further powers of E/Fπ .
155
Pion masses. The pions aren’t actually massless: mπ± ∼ 140MeV. In terms of
quarks, one source for such a thing is the quark mass term L 3 q̄M q. This breaks the
isospin symmetry if the eigenvalues of M aren’t equal. But an invariance of L is
The coefficient V 3 is chosen so that the first term matches hq̄M qi = V 3 (mu + md ). The
second term then gives
V3
m2π ' 2 (mu + md )
Fπ
which is called the Gell-Mann Oakes Renner relation.
Electroweak interactions. You may have noticed that I used covariant-looking
Ds in (15.10). That’s because the SU(2)L symmetry we’ve been speaking about is
actually gauged by Wµa . (The electroweak gauge boson kinetic terms are in the · · · of
(15.10).) Recall that
g
LWeak 3 Wµa Jµa − Jµ5a = Wµa Vij Q̄i γ µ (1 − γ 5 )τ a Qj + L̄i γ µ τ a (1 − γ 5 )Li
2 | {z }
‘V’ - ‘A’
u e
where Q1 = , L1 = are doublets of SU(2)L .
d νe
Now, in equations, the statement “a pion is a Goldstone boson for the axial SU(2)”
is:
h0| Jµ5a (x) π b (p) = ipµ Fπ e−ip·x δ ab .
If the vacuum were invariant under the symmetry transformation generated by Jµ , the
BHS would vanish. The momentum dependence implements the fact that a global
156
rotation does not change the energy. Contracting the BHS with pµ and using current
conservation gives 0 = p2 Fπ2 = m2π Fπ2 , a massless dispersion for the pions.
Combining the previous two paragraphs, we see that the following process can
happen
Goldstone electroweak interaction
π → Jµ5 → leptons
(15.12)
and in fact is responsible for the dominant decay channel of charged pions. (Time goes
from left to right in these diagrams, sorry.)
GF
M(π + → µ+ νµ ) = √ Fπ pµ v̄νµ γ µ (1 − γ 5 )uµ
2
where the Fermi constant GF ∼ 10−5 GeV −2 (known from e.g. µ− → e− ν̄e νµ ) is a good
way to parametrize the Weak interaction amplitude. Squaring this and integrating
over two-body phase space gives the decay rate
2
G2F Fπ2 m2µ
+ + 2
Γ(π → µ νµ ) = mπ mµ 1 − 2 .
4π mπ
(You can see from the answer why the decay to muons is more important than the decay
to electrons, since mµ /me ∼ 200. This is called helicity suppression – the decay of the
helicity-zero π − into back-to-back spin-half particles by the weak interaction (which
only produces L particles and R antiparticles) can’t happen if helicity is conserved
– the mass term is required to flip the eL into an eR .) This contributes most of
τπ+ = Γ−1 = 2.6 · 10−8 s.
Knowing further the mass of the muon mµ = 106MeV then determines Fπ = 92MeV
which fixes the leading terms in the chiral Lagrangian. This is why Fπ is called the pion
decay constant. This gives a huge set of predictions for e.g. pion scattering π 0 π 0 →
π+π−.
Note that the neutral pion can decay by an anomaly into two photons:
e2 νλαβ
qµ hp, k| Jµ5,a=3 (q) |0i = − pα kβ
4π 2
157
where hp, k| is a state with two photons, and this is a matrix element of the Je Je Jisospin
anomaly,
e2 νλαβ
∂µ J µ5a = − a 2
F νλ F αβ tr τ Q
16π 2
2/3 0
where Q = is the quark charge matrix.
0 −1/3
SU(3) and baryons. The strange quark mass is also pretty small ms ∼ 95MeV,
and hs̄si ∼ V 3 . This means the approximate invariance and symmetry breaking pattern
is actually SU(3)L × SU(3)R → SU(3)diag , meaning that there are 16 − 8 = 8 pseudo
NGBs. Besides π ±,0 , the others are the kaons K ±,0 and η. It’s still only the SU(2)L
that’s gauged.
We can also include baryons B = αβγ qα qβ qγ . Since q ∈ 3 of SU(3), the baryons are
in the representation
3 ⊗ 3 ⊗ 3 = (6 ⊕ 3̄) ⊗ 3 = 10 ⊕ 8 ⊕ 8 ⊕ 1
⊗ ⊗ =( ⊕ )⊗ = ⊕ ⊕ ⊕ (15.13)
The proton and neutron are in one of the octets. This point of view brought some
order (and some predictions) to the otherwise-bewildering zoo of hadrons.
Returning to the two-flavor SU(2) approximation, We can include the nucleons
p
NL/R = and couple them to pions by the symmetric coupling
n L/R
L 3 λN N π N̄L ΣNR .
158
−1
to the baryon number current Bµ = 24π µναβ
2 trU ∂ν U U −1 ∂α U U −1 ∂β U whose conserved
R
charge space B0 is the winding number of the map from space (plus the point at infinity)
to the space of goldstones S 3 → SU(3) × SU(3)/SU(3)preserved ' SU(3)broken .
[End of Lecture 61]
We didn’t have lecture time for the remaining sections, but you might still enjoy them.
[from hep-ph/9606222 and nucl-th/0510023] Why is the sky blue? Basically, it’s
because the blue light from the sun scatters in the atmosphere more than the red light,
and you (I hope) only look at the scattered light.
Here is an understanding of this fact using the EFT logic. Consider the scattering
of photons off atoms at low energies. Low energy means that the photon does not have
enough energy to probe the substructure of the atom – it can’t excite the electrons or
the nuclei. This means that the atom is just a particle, with some mass M .
The dofs are just the photon field and the field that creates an atom.
The symmetries are Lorentz invariance and charge conjugation invariance and par-
ity. We’ll use the usual redundant description of the photon which has also gauge
invariance.
The cutoff is the energy ∆E that it takes to excite atomic energy levels we’ve left
out of the discussion. We allow no inelastic scattering. This means we require
α
Eγ ∆E ∼ a−1
0 Matom
a0
Because of this separation of scales, we can also ignore the recoil of the atom, and treat
it as infinitely heavy.
Since there are no charged objects in sight – atoms are neutral – gauge invariance
means the Lagrangian can depend on the field strength Fµν . Let’s call the field which
destroys an atom with velocity v φv . v µ vµ = 1 and vµ = (1, 0, 0, 0)µ in the atom’s rest
frame. The Lagrangian can depend on v µ . We can write a Lagrangian for the free
atoms as
Latom = φ†v iv µ ∂µ φv .
This action is related by a boost to the statement that the atom at rest has zero energy
– in the rest frame of the atom, the eom is just ∂t φv=(1,~0) = 0.
So the Lagrangian density is
159
and we must determine Lint . It is made from local, Hermitian, gauge-invariant, Lorentz
invariant operators we can construct out of φv , Fµν , vµ , ∂µ (It can only depend on Fµν =
∂µ Aν −∂ν Aµ , and not Aµ directly, by gauge invariance.). It should actually only depend
on the combination φ†v φv since we will not create and destroy atoms. Therefore
Lint = c1 φ†v φv Fµν F µν + c2 φ†v φv v σ Fσµ vλ F λµ + c3 φ†v φv v λ ∂λ Fµν F µν + . . .
. . . indicates terms with more derivatives and more powers of velocity (i.e. an expansion
in ∂ · v). Which are the most important terms at low energies? Demanding that the
Maxwell term dominate, we get the power counting rules (so time and space should
scale the same way):
[∂µ ] = 1, [Fµν ] = 2
This then implies [φv ] = 3/2, [v] = 0 and therefore
[c1 ] = [c2 ] = −3, [c3 ] = −4 .
Terms with more partials are more irrelevant.
What makes up these dimensions? They must come from the length scales that we
have integrated out to get this description – the size of the atom a0 ∼ αme and the
energy gap between the ground state and the electronic excited states ∆E ∼ α2 me .
For Eγ ∆E, a−10 , we can just keep the two leading terms.
In the rest frame of the atom, these two leading terms c1,2 represent just the scat-
tering of E and B respectively. To determine their coefficients one would have to do
a matching calculation to a more complete theory (compute transition rates in a the-
ory that does include extra energy levels of the atom). But a reasonable guess is just
that the scale of new physics (in this case atomic physics) makes up the dimensions:
c_1 ≃ c_2 ≃ a_0^3. (In fact the magnetic term c_2 comes with an extra factor of v/c which
suppresses it.) The scattering cross section then goes like σ ∼ c_i^2 ∼ a_0^6; dimensional
analysis ([σ] = −2 is an area, [a_0^6] = −6) then tells us that we have to make up four
powers with the only other scale around:
σ ∝ E_γ^4 a_0^6 .
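To see what this scaling means in practice, here is a one-line Python comparison (my own illustration; 450 nm and 650 nm are just representative blue and red wavelengths, and the overall a_0^6 prefactor drops out of the ratio):

# sigma ~ E_gamma^4 ~ 1/lambda^4: compare representative blue and red wavelengths
lam_blue_nm, lam_red_nm = 450.0, 650.0
ratio = (lam_red_nm / lam_blue_nm) ** 4
print(ratio)    # ~4.4: blue light is scattered several times more strongly than red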
15.6 Superconductors
Φ ∼ ψ_α ψ_β ε^{αβ} ;    (15.15)
where
S_2 = ∫ dt ∫ d̄^d k ψ_k^† ( i∂_t − ε(k) ) ψ_k .
Notice the strong similarity with the XY model action in our discussion of the RG (in
fact this similarity was Shankar’s motivation for explaining the RG for the XY model in
the (classic) paper I cited there). A mean field theory description of the condensation of
Cooper pairs (15.15) is obtained by replacing the quartic term in (15.16) by expectation
values:
S_MFT[ψ] = S_2[ψ] + ∫ dt d^d x ( u ⟨ψψ⟩ ψ^†ψ^† + h.c. )
         = S_2[ψ] + ∫ dt d^d x ( u Φ ψ^†ψ^† + h.c. )    (15.17)
So an expectation value for Φ is a mass for the fermions. It is a funny kind of symmetry-
breaking mass, but if you diagonalize the quadratic operator in (15.17) (actually it is
done below) you will find that it costs an energy of order ∆Eψ = u hΦi to excite a
fermion. That’s the cutoff on the LG EFT.
A general lesson from this example is: the useful degrees of freedom at low energies
can be very different from the microscopic dofs.
I am sure that some of you are nervous about the step from S[ψ] to SM F T [ψ] above. To
make ourselves feel better about it, I will say a few more words about the steps from
the microscopic model of electrons (15.16) to the LG theory of Cooper pairs (these
steps were taken by Bardeen, Cooper and Schrieffer (BCS)).
First recall the Hubbard-Stratonovich transformation, aka completing the square, in
0+0 dimensional field theory:
e^{−iux^4} = (1/√(2πu)) ∫_{−∞}^{∞} dσ e^{ −σ^2/(iu) − 2i x^2 σ } .    (15.18)
At the cost of introducing an extra field σ, we turn a quartic term in x into a quadratic
term in x. The RHS of (15.18) is gaussian in x and we know how to integrate it over
x. (The version with i is relevant for the real-time integral.) Notice the weird extra
factor of i lurking in (15.18). This can be understood as arising because we are trying
to use a scalar field σ, to mediate a repulsive interaction (which it is, for positive u)
(see Zee p. 193, 2nd Ed).
Actually, we’ll need a complex H-S field:
e^{−iu x^2 x̄^2} = (1/(2πu^2)) ∫_{−∞}^{∞} dσ ∫_{−∞}^{∞} dσ̄ e^{ −|σ|^2/(iu) − i x^2 σ̄ − i x̄^2 σ } .    (15.19)
(The field-independent prefactor is, as usual, not important for path integrals.)
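As a sanity check on the completing-the-square step, here is a quick numerical verification of a Euclidean-weight cousin of (15.18) (my own rewriting, with arbitrary test values): e^{−ux^4} = (4πu)^{−1/2} ∫ dσ e^{−σ^2/(4u) + iσx^2}. Note that for repulsive u > 0 an i is still needed in the coupling to σ, the same point as the weird factor of i noted above.

import numpy as np
from scipy.integrate import quad

u, x = 0.7, 1.3                    # arbitrary test values
lhs = np.exp(-u * x**4)
# the imaginary part of the integrand is odd in sigma, so only the cosine survives
integral, _ = quad(lambda s: np.exp(-s**2 / (4*u)) * np.cos(s * x**2), -np.inf, np.inf)
rhs = integral / np.sqrt(4 * np.pi * u)
print(lhs, rhs)                    # the two agree to quadrature accuracy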
We can use a field theory generalization of (15.19) to ‘decouple’ the 4-fermion
interaction in (15.16):
Z = ∫ [DψDψ^†] e^{iS[ψ]} = ∫ [DψDψ^† DσDσ^†] e^{ iS_2[ψ] + i∫ d^D x ( σ̄ψψ + h.c. ) − ∫ d^D x |σ|^2(x)/(iu) } .    (15.20)
The point of this is that now the fermion integral is gaussian. At the saddle point
of the σ integral (which is exact because it is gaussian), σ is the Cooper pair field,
σsaddle = uψψ.
Notice that we made a choice here about the 'channel' in which to make the decoupling
– we could have instead introduced a different auxiliary field ρ and written
S[ρ, ψ] = ∫ ρ ψ^†ψ + ∫ ρ^2/(2u) ,
which would break up the 4-fermion interaction in the t-channel (as an interaction of
the fermion density ψ^†ψ) instead of the s (BCS) channel (as an interaction of Cooper
pairs ψ^2). At this stage both are correct, but they lead to different mean-field
approximations below. That the BCS mean field theory wins is a consequence of the RG.
How can you resist doing the fermion integral in (15.20)? Let’s study the case where
the single-fermion dispersion is ε(k) = k⃗^2/(2m) − µ.
I_ψ[σ] ≡ ∫ [DψDψ^†] e^{ i ∫ dt d^d x ( ψ^†( i∂_t + ∇^2/(2m) + µ )ψ + ψ σ̄ ψ + ψ̄ ψ̄ σ ) } ,
so the integral is
I_ψ[σ] = det M = e^{tr log M(σ)} .
The matrix M is diagonal in momentum space, and the integral remaining to be done is
∫ [DσDσ^†] e^{ −∫ d^D x |σ(x)|^2/(2iu) + ∫ d̄^D k log( ω^2 − ε_k^2 − |σ_k|^2 ) } .
It is often possible to do this integral by saddle point. This can be justified, for example,
by the largeness of the volume of the Fermi surface, {k | ε(k) = µ}, or by a large number N
of species of fermions. The result is an equation which determines σ, which as we saw
earlier determines the fermion gap.
0 = δ(exponent)/δσ̄ = i σ/(2u) + ∫ d̄ω d̄^d k · 2σ/( ω^2 − ε_k^2 − |σ|^2 + iη ) .
We can do the frequency integral by residues:
∫ d̄ω 1/( ω^2 − ε_k^2 − |σ|^2 + iη ) = 2πi · (1/(2π)) · 1/( 2√(ε_k^2 + |σ|^2) ) ,
which you can imagine solving self-consistently for σ. Plugging back into the action
(15.20) says that σ determines the energy cost to have electrons around; more precisely,
σ is the energy required to break a Cooper pair.
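To see the exponentially small gap come out of this self-consistency condition, here is a minimal numerical solution under simplifying assumptions of my own: a constant attractive coupling u, a constant density of states N_0, and an energy shell of width Λ about the FS, so that the condition takes the standard BCS form 1 = u N_0 ∫_0^Λ dε/√(ε^2 + Δ^2) (absorbing signs and factors of 2 into u N_0). The numbers are arbitrary.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

u_N0, Lam = 0.25, 1.0              # hypothetical dimensionless coupling u*N_0 and shell width

def gap_condition(Delta):
    # 1 = u N_0 * integral_0^Lam d(eps) / sqrt(eps^2 + Delta^2)
    integral, _ = quad(lambda eps: 1.0 / np.sqrt(eps**2 + Delta**2), 0.0, Lam)
    return u_N0 * integral - 1.0

Delta = brentq(gap_condition, 1e-4 * Lam, Lam)   # solve the self-consistency condition
print(Delta)                                      # numerical gap
print(2 * Lam * np.exp(-1.0 / u_N0))              # weak-coupling estimate 2 Lam e^{-1/(u N_0)}

Note that the result is non-analytic in u at u = 0 – no finite order of perturbation theory produces it.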
Restoring the momentum dependence of the interaction, the gap equation reads
σ(p⃗) = − (1/2) ∫ d̄^d p′ u(p, p′) σ(p⃗′) / √( ε(p′)^2 + |σ(p′)|^2 ) .
Comments:
• I haven’t included here effects of the fluctuations of the fermions. In fact, they
make the four-fermion interaction which leads to Cooper pairing marginally rel-
evant. This breaks the degeneracy in deciding how to split up the ψψψ † ψ † into
e.g. ψψσ or ψ † ψρ. BCS wins. This is explained beautifully in Polchinski, lecture
2, and R. Shankar. If there were time, I would summarize the EFT framework
for understanding this in §15.7.
• I’ve tried to give the most efficient introduction I could here. I left out any
possibility of k-dependence or spin dependence of the interactions or the pair
field, and I’ve conflated the pair field with the gap. In particular, I’ve been
sloppy about the dependence on k of σ above.
• You studied a very closely related manipulation on a previous problem set, in an
example (the Gross-Neveu model) where the saddle point is justified by large N .
15.7 Effective field theory of Fermi surfaces
Scales involved: The Planck scale of solid state physics (made by the logic by
which Planck made his quantum gravity energy scale, namely by making a quantity
with dimensions of energy out of the available constants) is
E_0 = (1/2) e^4 m/ℏ^2 = (1/2) e^2/a_0 ∼ 13 eV
(where m ≡ me is the electron mass and the factor of 2 is an abuse of outside informa-
tion) which is the energy scale of chemistry. Chemistry is to solids as the melting of
spacetime is to particle physics. There are other scales involved however. In particular
a solid involves a lattice of nuclei, each with mass M ≫ m (approximately the proton mass).
So m/M is a useful small parameter which controls the coupling between the electrons
and the lattice vibrations. Also, the actual speed of light c ≫ v_F can generally be
treated as ∞ to first approximation; v_F/c suppresses spin-orbit couplings (though
large atomic numbers enhance them: λ_SO ∝ Z v_F/c).
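For orientation, here are the numbers (my own back-of-the-envelope arithmetic with standard constants; the last line anticipates the phonon discussion below, where ω^2 ∝ M^{-1} with spring constants set by electronic scales):

alpha, me_c2_eV, M_over_m = 1/137.036, 0.511e6, 1836.15   # fine-structure constant, m_e c^2 in eV, m_p/m_e

E0 = 0.5 * alpha**2 * me_c2_eV     # (1/2) e^2/a_0 = (1/2) alpha^2 m_e c^2
print(E0)                           # ~13.6 eV: the "Planck scale" of solid state physics
print(1 / M_over_m)                 # m/M ~ 5e-4 for hydrogen; smaller for heavier nuclei
print((1 / M_over_m)**0.5)          # sqrt(m/M) ~ 0.02: rough ratio of phonon to electronic scales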
We will show that there is a nearly-RG-stable fixed point describing gapless quasi-
electrons. Notice that we are not trying to match this description directly to some
microscopic lattice model of a solid; rather we will do bottom-up effective field theory.
Having guessed the necessary dofs, let’s try to write an action for them consistent
with the symmetries. A good starting point is the free theory:
S_free[ψ] = ∫ dt d̄^d p [ i ψ_σ^†(p) ∂_t ψ_σ(p) − ( ε(p) − ε_F ) ψ_σ^†(p) ψ_σ(p) ]
(If you don’t like continuous products, put the system in a box so that p is a discrete
label.) The Fermi surface is the set of points in momentum space at the boundary of
the filled states:
FS ≡ { p | ε(p) = ε_F } .
The low-lying excitations are made by adding an electron just above the FS or
removing an electron (creating a hole) just below.
We would like to define a scaling transformation which focuses on the low-energy
excitations. We scale energies by a factor E → bE, b < 1. In relativistic QFT, p~ scales
like E, toward zero, p~ → b~p, since all the low-energy stuff is near p~ = 0. Here the
situation is much more interesting because the low-energy stuff is on the FS.
One way to implement this is to introduce a hi-
erarchical labeling of points in momentum space,
by breaking the momentum space into patches
around the FS. (An analogous strategy of labeling
is also used in heavy quark EFT and in SCET.)
We’ll use a slightly different strategy, follow-
ing Polchinski. To specify a point p~, we pick the
nearest point k⃗ on the FS, ε(k⃗) = ε_F (draw a line perpendicular to the FS from p⃗), and let
p⃗ = k⃗ + ℓ⃗ .
[Footnote 38: Notice that we are assuming translation invariance. I am not saying anything at the moment about whether translation invariance is discrete (the ions make a periodic potential) or continuous.]
[Footnote 39: We have chosen the normalization of ψ to fix the coefficient of the ∂_t term (this rescaling may depend on p).]
This implies
dt → b^{-1} dt,   d^{d−1}k⃗ → d^{d−1}k⃗,   dℓ⃗ → b dℓ⃗,   ∂_t → b ∂_t
S_free = ∫ dt d^{d−1}k⃗ dℓ⃗ [ i ψ^†(p) ∂_t ψ(p) − ℓ v_F(k) ψ^†(p) ψ(p) ] ,
where v_F(k) comes from linearizing the dispersion about the FS, ε(p) − ε_F ≃ ℓ v_F(k⃗).
The measure dt d^{d−1}k⃗ dℓ⃗ scales like b^0 (= b^{-1} · b^0 · b^{+1}), while ∂_t and ℓ v_F(k)
each scale like b^{+1}. In order to make this go like b^0 we require ψ → b^{-1/2} ψ near the
free fixed point.
Next we will play the EFT game. To do so we must enumerate the symmetries we
demand of our EFT:
4. Let's assume that ε(p) = ε(−p), which is a consequence of e.g. parity invariance.
Now we enumerate all terms analytic in ψ (since we are assuming that there are no
other low-energy operators, integrating out which is the only way to get non-analytic
terms in ψ) and consistent with the symmetries; we can order them by the number of
fermion operators involved. Particle number symmetry means every ψ comes with a
ψ^†. The possible quadratic terms are:
∫ dt d^{d−1}k⃗ dℓ⃗ µ(k) ψ_σ^†(p) ψ_σ(p) ∼ b^{-1}
(the measure scales like b^0 and ψ^†ψ like b^{-1}), which
is relevant. This is like a mass term. But don’t panic: it just shifts the FS around. The
existence of a Fermi surface is Wilson-natural; any precise location or shape (modulo
something enforced by symmetries, like roundness) is not.
Adding one extra ∂_t or factor of ℓ costs a b^1 and makes the operator marginal; those
terms are already present in S_free. Adding more than one makes it irrelevant.
Quartic terms:
S_4 = ∫ dt ∏_{i=1}^{4} d^{d−1}k⃗_i dℓ⃗_i u(4···1) ψ_σ^†(p_1) ψ_σ(p_3) ψ_{σ′}^†(p_2) ψ_{σ′}(p_4) δ^d( p⃗_1 + p⃗_2 − p⃗_3 − p⃗_4 )  ∼  b^{−1+4−4/2}
Note the similarity with the discussion of the XY model in §??. The minus signs on
p_{3,4} are because ψ(p) removes a particle with momentum p. We assume u depends only
on k, σ, so it does not scale – this will give the most relevant piece. How does the delta
function scale?
δ^d( p⃗_1 + p⃗_2 − p⃗_3 − p⃗_4 ) = δ^d( k_1 + k_2 − k_3 − k_4 + ℓ_1 + ℓ_2 − ℓ_3 − ℓ_4 ) ≟ δ^d( k_1 + k_2 − k_3 − k_4 )
In the last (questioned) step, we used the fact that ℓ ≪ k to ignore the contributions
of the ℓs. If this is correct then the delta function does not scale (since the ks do not),
and S_4 ∼ b^1 is irrelevant (and quartic interactions with derivatives are more so). If this
were correct, the free fixed point would be exactly stable.
There are two important subtleties: (1) there exist phonons. (2) the questioned
equality above is questionable because of kinematics of the Fermi surface. We will
address these two issues in reverse order.
The kinematic subtlety in the treatment of the
scaling of δ(p1 + p2 − p3 − p4 ) arises because of the
geometry of the Fermi surface. Consider scattering
between two points on the FS, where (in the labeling
convention above) the final momenta are displaced slightly from the initial ones,
p_3 = p_1 + δk_1 + δℓ_1 and p_4 = p_2 + δk_2 + δℓ_2, in which case the momentum delta function is
δ^d( p_1 + p_2 − p_3 − p_4 ) = δ^d( δk_1 + δℓ_1 + δk_2 + δℓ_2 ).
For generic choices of the two points p_{1,2} (top figure at left), δk_1 and δk_2 are linearly
independent and the δℓs can indeed be ignored as we did above. However, for two points
with p_1 = −p_2 (they are called nested, as depicted in the bottom figure at left), one
component of δk_1 + δk_2 is automatically zero, revealing the tiny δℓs to the force of (one
component of) the delta function. In this case, δ(ℓ) scales like b^{-1}, and for this particular
kinematic configuration the four-fermion interaction is (classically) marginal. Classically
marginal means quantum mechanics has a chance to make a big difference.
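To keep the bookkeeping straight, here is a tiny Python helper (my own, not from the notes) that reproduces the tree-level counting above: dt ∼ b^{-1}, each dℓ ∼ b^{+1}, d^{d−1}k ∼ b^0, ψ ∼ b^{-1/2}, each ∂_t or factor of ℓ costs b^{+1}, and a delta-function component costs b^{-1} only when it is forced to act on the ℓs (the nested kinematics):

def b_power(n_dl_integrals, n_psi, n_t_derivs=0, n_ells=0, n_delta_on_ell=0):
    # dt ~ b^-1; each dl ~ b^+1 (d^{d-1}k ~ b^0); psi ~ b^-1/2;
    # each time derivative or factor of l ~ b^+1;
    # each delta-function component forced onto an l ~ b^-1.
    return -1 + n_dl_integrals - 0.5 * n_psi + n_t_derivs + n_ells - n_delta_on_ell

# quadratic terms, written with a single momentum integral as in S_free:
print(b_power(1, 2))                        # -1.0: mu(k) psi^dag psi is relevant (it shifts the FS)
print(b_power(1, 2, n_t_derivs=1))          #  0.0: the i psi^dag d_t psi term is marginal
print(b_power(1, 2, n_ells=1))              #  0.0: the l v_F psi^dag psi term is marginal
# quartic term, with four momentum integrals and delta^d(p1+p2-p3-p4):
print(b_power(4, 4))                        # +1.0: a generic 4-fermion coupling is irrelevant
print(b_power(4, 4, n_delta_on_ell=1))      #  0.0: nested (BCS) kinematics make it marginal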
A useful visualization is at right (d = 2 with
a round FS is shown; this is what’s depicted on
the cover of the famous book by Abrikosov-Gorkov-
Dzyaloshinski): the blue circles have radius kF ; the
yellow vector is the sum of the two initial momenta
p1 + p2 , both of which are on the FS; the condition
that p_3 and p_4, each also on the FS, add up to the same vector means that p_3 must lie on
the intersection of the two circles (spheres in d > 2). But when p_1 + p_2 = 0, the two
circles are on top of each other, so they intersect everywhere!
Comments:
1. We assumed that both p_1 and −p_2 were actually on the FS. This is automatic if
ε(p) = ε(−p), i.e. if ε is only a function of p^2.
Now let’s think about what decision the fluctuations make
about the fate of the nested interactions. The first claim,
which I will not justify here, is that ε_F is not renormalized
at one loop. The interesting bit is the renormalization of the
BCS interaction:
The electron propagator, obtained by inverting the kinetic operator in S_free, is
G(ε, p = k + ℓ) = 1 / ( ε(1 + iη) − v_F(k) ℓ + O(ℓ^2) )
where I used η ≡ 0+ for the infinitesimal specifying the contour prescription. (To
understand the contour prescription for the hole propagator, it is useful to begin with
G(t, p) = ⟨F| c_p^†(t) c_p(0) |F⟩ ,   c_p^†(t) ≡ e^{−iHt} c_p^† e^{iHt} .)
δ^{(1)}V = [one-loop diagram] = iV^2 ∫_{bε_0}^{ε_0} (dε′ d^{d−1}k′ dℓ′)/(2π)^{d+1} · 1/[ ( ε + ε′ − v_F(k′)ℓ′ )( ε − ε′ − v_F(k′)ℓ′ ) ]
= iV^2 ∫ (dε′ d^{d−1}k′)/(2π)^{d+1} · 1/v_F(k′) · [ (ε − ε′) − (ε + ε′) ]^{−1}        (the last bracket = −2ε′)
= −V^2 ( ∫_{bε_0}^{ε_0} dε′/ε′ ) ( ∫ d^{d−1}k′/((2π)^d v_F(k′)) )        (15.22)
where the first factor is log(1/b) and the second factor is the density of states at the FS.
Between the first and second lines, we did the ℓ′ integral by residues. The crucial point
is that we are interested in external energies ε ∼ 0, but we are integrating out a shell
near the cutoff, so |ε′| > |ε| and the sign of ε + ε′ is opposite that of ε − ε′; therefore
there is a pole on either side of the real ℓ′ axis and we get the same answer by closing
the contour either way. On one side the pole is at ℓ′ = (ε + ε′)/v_F(k′). (In the t-channel
diagram (what Shankar calls ZS), the poles are on the same side and it therefore does
not renormalize the four-fermion interaction.)
The result to one loop is then
V(E) = V(E_1) / ( 1 − N V(E_1) log(E_1/E) ) .    (15.23)
There is therefore a very significant dichotomy depending on the sign of the coupling
at the microscopic scale E1 , as in this phase diagram:
The conclusion is that if the interaction starts attractive at some scale it flows
to large attractive values. The thing that is decided by our perturbative analysis is
that (if V (E1 ) > 0) the decoupling we did with σ (‘the BCS channel’) wins over the
decoupling with ρ (’the particle-hole channel’). What happens at V → −∞? Here we
need non-perturbative physics.
The non-perturbative physics is in general hard, but we’ve already done what we
can in §15.6.1.
The remaining question is: Who is V1 and why would it be attractive (given that
Coulomb interactions between electrons, while screened and therefore short-ranged, are
repulsive)? The answer is:
Phonons. The lattice of positions taken by the ions making up a crystalline solid
spontaneously break many spacetime symmetries of their governing Hamiltonian. This
implies a collection of gapless Goldstone modes in any low-energy effective theory of
such a solid40 . The Goldstone theorem is satisfied by including a field
~ ∝ (local) displacement δ~r of ions from their equilibrium positions
D
with spring constants k independent of the nuclear mass M . It is useful to introduce
a canonically normalized field in terms of which the action is
S[ D⃗ = M^{1/2} δr⃗ ] = (1/2) ∫ dt d^d q [ ∂_t D_i(q) ∂_t D_i(−q) − ω^2_{ij}(q) D_i(q) D_j(−q) ] .
Here ω 2 ∝ M −1 . Their status as Goldstones means that the eigenvalues of ωij2 (q) ∼ |q|2
at small q: moving everyone by the same amount does not change the energy. This also
constrains the coupling of these modes to the electrons: they can only couple through
derivative interactions.
For purposes of their interactions with the elec-
trons, a nonzero q which keeps the e− on the FS must
scale like q ∼ b0 . Therefore
∫ dt d^d q (∂_t D)^2 ∼ b^{+1+2[D]}   ⟹   D ∼ b^{-1/2}
– here we took the delta function to scale like b0 as above. This is relevant when we
use the Ḋ2 scaling for the phonons; when the restoring force dominates we should scale
D differently and this is irrelevant for generic kinematics. This is consistent with our
previous analysis of the four-fermion interaction.
The summary of this discussion is: phonons do not destroy the Fermi surface,
but they do produce an attractive contribution to the 4-fermion interaction, which is
relevant in some range of scales (above the Debye energy). Below the Debye energy, it
becomes marginal and the purely electronic analysis above takes over.
Notice that the scale at which the coupling V becomes strong (V(E_BCS) ≡ 1 in
(15.23)) is
E_BCS ∼ E_D e^{−1/(N V_D)} .
Two comments about this: First, it is non-perturbative in the interaction V_D. Second,
it provides some verification of the role of phonons, since E_D ∼ M^{-1/2} can be varied
by studying the same material with different isotopes and seeing how the critical
superconducting temperature (∼ E_BCS) scales with the nuclear mass.
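Here is a rough numerical illustration of the flow to strong coupling (my own sketch, with an assumed O(1) normalization): write the one-loop flow as dV/ds = N V^2 with s = log(E_D/E), in the convention where positive V is attractive and grows toward the IR. The coupling runs off to infinity at s_* = 1/(N V_D), i.e. at E ∼ E_D e^{−1/(N V_D)}; the scale where V first becomes O(1) differs from this only by an O(1) factor.

import numpy as np

N, V_D, E_D = 1.0, 0.2, 1.0          # hypothetical dos factor, coupling at the Debye scale, and E_D
V, s, ds = V_D, 0.0, 1e-4
while V < 50.0:                       # integrate until the coupling is clearly strong
    V += N * V**2 * ds                # Euler step of dV/ds = N V^2
    s += ds
print(E_D * np.exp(-s))               # scale at which the coupling has blown up
print(E_D * np.exp(-1.0 / (N * V_D))) # compare: E_D e^{-1/(N V_D)}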
Here's the narrative, proceeding as a function of decreasing energy scale, beginning at
E_0, the Planck scale of solids: (1) Electrons repel each other by the Coulomb interaction.
However, in a metal, this interaction is screened and becomes short-ranged.
1. Putting back the possible angular dependence of the BCS interaction, the result
at one loop is
dV(θ_1 − θ_3)/dℓ = − (1/(8π^2)) ∫_0^{2π} d̄θ V(θ_1 − θ) V(θ − θ_3) ,
or in terms of angular momentum components,
dV_l/dℓ = − V_l^2/(4π) .
(A small numerical check of this angular decomposition appears after this list.)
2. This example is interesting and novel in that it is a (family of) fixed point(s)
characterized by a dimensionful quantity, namely kF . This leads to a phenomenon
called hyperscaling violation where thermodynamic quantities need not have their
naive scaling with temperature.
3. The one loop analysis gives the right answer to all loops in the limit that N ≡
k_F/Λ ≫ 1, where Λ is the UV cutoff on the momentum.
4. The forward scattering interaction (for any choice of function F (θ13 )) is not renor-
malized at one loop. This means it is exactly marginal at leading order in N .
5. Like in φ4 theory, the sunrise diagram at two loops is the first appearance of
wavefunction renormalization. In the context of the Fermi liquid theory, this
leads to the renormalization of the effective mass, which is called m⋆.
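Here is the small check promised in comment 1 (my own, with an arbitrary test interaction): decomposing into angular momentum components diagonalizes the flow, because the Fourier coefficients of the angular convolution (1/2π)∫ dθ V(θ_1 − θ)V(θ − θ_3) are just V_l^2, so each V_l runs independently.

import numpy as np

ntheta = 256
theta = np.linspace(0.0, 2*np.pi, ntheta, endpoint=False)
V = 0.3 + 0.1*np.cos(theta) - 0.05*np.cos(2*theta)        # an arbitrary test interaction V(theta)

Vl = np.fft.fft(V) / ntheta                                # angular momentum components V_l
conv = np.real(np.fft.ifft(np.fft.fft(V)**2)) / ntheta     # (1/2pi) int dtheta' V(theta-theta') V(theta')
conv_l = np.fft.fft(conv) / ntheta
print(np.allclose(conv_l, Vl**2))                          # True: each V_l flows independently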
where
f(ε) = lim_{T→0} 1/( e^{(ε−ε_F)/T} + 1 ) = θ(ε < ε_F)
is the Fermi function. This is just the demand that a particle can only scatter into
an empty state and a hole can only scatter into a filled state. These constraints imply
that all the energies are near the Fermi energy: both ε_{k′+q} and ε_{k′} lie in a shell of radius
ε about the FS; the answer is proportional to the density of possible final states, which is thus
τ^{-1} ∝ ε^2/ε_F .
So the width of the quasiparticle resonance is
τ^{-1} ∝ ε^2