Lectures on

QUANTUM FIELD THEORY

Jiří Hořejší

Institute of Particle and Nuclear Physics


Faculty of Mathematics and Physics
Charles University

Prague 2023
Contents

Preface
Conventions, notations and units
1 Klein–Gordon and Dirac equations: brief history
2 Physical contents of Dirac equation: preliminary discussion
3 Covariant form of Dirac equation. Fun with γ-matrices
4 Relativistic covariance of Dirac equation
5 C, P and T
6 Plane-wave solutions of Dirac equation: u and v
7 Description of spin states of Dirac particle
8 Helicity and chirality
9 Weyl equation
10 Wave packets. Zitterbewegung
11 Klein paradox
12 Relativistic equation for spin-1 particle
13 Splendors and miseries of relativistic quantum mechanics
14 Interlude: Lagrangian formalism for classical fields
15 Conservation laws from symmetries
16 Canonical quantization of real scalar field
17 Particle interpretation of quantized field
18 Complex scalar field. Antiparticles
19 Quantization of Dirac field. Anticommutators
20 Quantization of massive vector field
21 Interactions of classical and quantum fields
22 Examples of S-matrix elements. Some simple Feynman diagrams
23 Decay rates and cross sections
24 Sample lowest-order calculations for physical processes
25 Scattering in external Coulomb field. Mott formula
26 Propagator of scalar field
27 Propagator of Dirac field
28 Propagator of massive vector field
29 Fate of non-covariant term in vector boson propagator
30 Some applications: QED with massive photon
31 Quantization of electromagnetic field: covariant and non-covariant
32 Gupta–Bleuler method
33 Compton scattering: Klein–Nishina formula
34 S-matrix and Wick’s theorems: an overview
35 S-matrix and Wick’s theorems: some applications
36 S-matrix in fourth order: QED example
37 One-loop QED diagrams in momentum space
38 Regularization of UV divergences
39 Accomplishing dimensional regularization of $\Pi_{\mu\nu}(q)$
40 Pauli–Villars regularization
41 $\Sigma(p)$ and all that
42 More about QED loops
43 Fate of higher fermionic loops
44 Index of UV divergence of 1PI diagram
45 Renormalization in QED: preliminary considerations
46 Renormalization counterterms
47 Renormalization and radiative corrections
48 One-loop vacuum polarization in detail
49 Calculable quantities: UV finite without counterterms
50 Schwinger correction
A Basic properties of Lorentz transformations
B Representations of Lorentz group
C Review of “diracology”
D More about spin states of Dirac field
E Photon propagator in a general covariant gauge
F Electromagnetic form factors of electron
Bibliography
Index

Preface

This work covers the material of a two-semester course of quantum field theory (QFT) that I
taught for more than 20 years at Charles University and the Czech Technical University in Prague.
For years, I was reluctant to write up such a set of lecture notes, since the current literature in
this area is quite rich and there are dozens of books on the subject. However, eventually I was
forced to do it because of the pandemic of the infamous coronavirus that broke out in
spring 2020. I comment on this in more detail below. Conceptually, my approach is traditional,
starting with several introductory chapters on relativistic quantum mechanics. Then, after a
brief interlude on classical field theory, one proceeds to the quantization of free fields and
to some elementary examples of field interactions, the basic tool being the Dyson perturbation
expansion of the 𝑆-matrix in the interaction representation. The pragmatic aim of the first half of
the text (chapters 1–25) is to arrive at the basic techniques for calculations of Feynman diagrams
in the lowest perturbative order, as well as for the computation of the particle decay rates and
scattering cross sections. This is precisely the material that should ideally be explained during the first
(winter) semester, since a part of the curriculum in the second (summer) semester, at least for
some students, is a course on the standard model of particle physics, where a Feynman diagram
calculation is an everyday occurrence. The second half (chapters 26–50) represents topics to be
explained during the second semester and the main theme here is quantum electrodynamics at
the level of one-loop diagrams, including techniques of regularization of ultraviolet divergences
and renormalization. In this way, the whole material of the present lecture notes is divided
into 50 chapters, each of which corresponds roughly to a 90-minute lecture (the total number
of QFT lectures in a given academic year is about fifty). I would like to stress that the text is
really intended to have the character of lecture notes, which means that, among other things,
some explicit calculations are shown here in greater detail than in most of the representative
monographs and textbooks, so as to make the life of a QFT beginner easier. Throughout the
text one also encounters numerous hints to possible independent calculations, addressed to
interested diligent readers; some of the problems in question may also serve as appropriate
topics for tutorials. Admittedly, readers who are not fond of performing independent
calculations may find the repeated offers of problems left to them as “instructive exercises”
somewhat disturbing (or even annoying); anyway, there are only about three dozen such hints
in the whole text, i.e. fewer than one per chapter on average.
As I have indicated above, these lecture notes were written under rather special
circumstances, during the protracted coronavirus (COVID-19) crisis in 2020 and 2021. It was a
situation that people of my generation had never experienced before, so let me add some personal
recollections (which are, admittedly, somewhat emotional). The outbreak of the pandemic was
officially announced in March 2020. Thus, on Wednesday, March 11, the personal attendance of
students in the lecture rooms was banned “until further notice” and I decided immediately to write up
the text of a lecture scheduled for Thursday, to be able to send it to students via e-mail. Such
a procedure seemed to me more efficient than a system of videoconferences or the like, and I also hoped
that the students’ opinion would coincide with that of the aspiring student in Goethe’s Faust,
expressed in a dialogue with Mephistopheles, namely, “You won’t need to tell me twice! I think,
myself, it’s very helpful, too, that one can take back home, and use, what someone’s penned in
black and white”.¹ In any case, it is obvious that a carefully written text is more durable than
lectures presented on a blackboard and erased immediately after the classes. Thus I went on in
this manner, sticking to the maxim “nulla dies sine linea”, till the end of May, when the semester
ends. When the summer semester and the students’ exams were over, I returned to the
material of the envisaged next winter semester and continued writing down the relevant lectures
so as to have a complete set (in musical terms, “da capo al fine”). In the meantime, I had to
put together a collection of lectures for another course, aimed at a more advanced audience (25
chapters as well). In this way, the whole work was essentially completed in May 2021, with
the nasty virus still around. Then there followed a period of transforming the manuscript full of
handwritten formulae into a user-friendly electronic file, as well as gradual detailed proofreading
of the text, mostly during the academic year 2021/2022. This was largely finished in autumn
2022, when the pandemic was fading away but was overshadowed by even more tragic events:
of course, I have in mind the absurd criminal war that Russia started against Ukraine.
When I started writing the lecture notes, in the gloomy atmosphere of the covid calamity
on the rise, it came to my mind that there is a famous work of world literature that was
created under similar circumstances and has survived over centuries. Yes, you guessed right; it is
the Decameron by Giovanni Boccaccio. Its origin is widely known. It represents a collection
of one hundred tales told by a group of ten young people who escaped from Florence, where
the epidemic of plague broke out in 1348, and stayed in a hideout in the countryside to avoid
the dangerous infection. Concerning my text, I have also written the lecture notes partly in
a hideout (the “home office”). These consist of only fifty tales told by myself (not young
anymore), concerning topics not so easily accessible to the general public, and I certainly do not
expect that my opus will become as famous as Boccaccio’s Decameron, or that it could
survive through centuries. Nevertheless, I believe that it may have an appropriate (though
inevitably limited) lifetime and may be useful for at least some students and other potentially
interested scientifically minded readers. My primary aim has been to make it a comprehensible
and digestible introduction to the rather difficult subject of quantum field theory, which, among
other things, forms the basis of contemporary particle physics.
One last remark is perhaps in order here. In view of the above-mentioned origin of
these lecture notes, it is to be expected that most of the potential readers will be university
students fluent in Czech. Thus, I could not resist the temptation to include, occasionally, some
notes concerning the Czech equivalents of the international English terminology, or even some
elements of a common literary folklore. Hopefully, this might add some cheering moments to
the serious scholarly style of the whole opus.

Acknowledgements
From what I have written above it might seem that I should thank the malicious coronavirus
in the first place, for stimulating me to write up these lecture notes. But I will not, taking into
account that, apart from the positive impact mentioned above, this dangerous invisible bug also
did so much harm to so many people all over the world. Needless to say, my acknowledgements
are aimed in a completely different, genuinely positive, direction. In particular, I recognize the
work of my younger colleagues who conducted and supervised, during the previous years, the
tutorials related to my lectures. They are, in alphabetical order: Karol Kampf, Karel Kolář, Jiří
Novotný and Martin Zdráhal. Further, I appreciate the questions and comments that the students
made throughout the years; these certainly led to many improvements of the style and contents
of the lectures. Actually, I have also received some useful remarks from other colleagues; for
instance, Walter Grimus from Vienna University has drawn my attention to the fact that the
frequently cited “Lorentz condition” in electromagnetism is in fact the “Lorenz condition”. Finally,
my great thanks are due to Tomáš Husek and Tomáš Kadavý, who recast my manuscript in
LaTeX and thus made it ready for publication; the whole work matured to its present form in
spring 2023.

¹ A translation into English by A. S. Kline, 2003. In Czech (in the classic translation by O. Fischer) it reads:
“Toť praktické, i heleďme se! To tělem duší při tom jsem. Neb co je černé na bílém, to vesele se domů nese.”

Prague, May 2023 J. Hořejší

Conventions, notations and units

Unless stated otherwise, we use the natural system of units, in which ℏ = 𝑐 = 1 (note that Peskin
and Schroeder call it “God-given” units in their book [14]). Obviously, within such a system,
the time and length have the same dimension, the energy, momentum and mass have the same
dimension, inverse length has the dimension of a mass, etc. The passage from the economical
natural system to ordinary units is quite straightforward. To this end, one may use the commonly
known approximate values of the Planck constant $\hbar$ and the “conversion constant” $\hbar c$, namely
$$ \hbar = 6.58 \times 10^{-22}\ \mathrm{MeV\,s} , \qquad \hbar c = 197\ \mathrm{MeV\,fm} , $$
where $1\ \mathrm{fm} = 10^{-13}\ \mathrm{cm}$ (fm stands for “fermi” or “femtometer”). Numerical values of observable
quantities (such as decay rates or scattering cross sections) are then converted into ordinary units
by setting
$$ 1\ \mathrm{MeV}^{-1} = 6.58 \times 10^{-22}\ \mathrm{s} , $$
or
$$ 1\ \mathrm{MeV}^{-1} = 197\ \mathrm{fm} . $$
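The conversion rules above can be sketched in a few lines of code. This is only a minimal illustration: the function names are hypothetical, the constants are the rounded values quoted in the text, and the relation 1 fm² = 10 mb (millibarn) is a standard fact brought in here as an extra assumption, since the text itself does not mention barns.

```python
# Rounded conversion constants quoted in the text (natural units, hbar = c = 1).
HBAR_MEV_S = 6.58e-22     # hbar in MeV s
HBARC_MEV_FM = 197.0      # hbar*c in MeV fm


def lifetime_seconds(width_mev):
    """Lifetime tau = hbar / Gamma for a decay width Gamma given in MeV."""
    return HBAR_MEV_S / width_mev


def length_fm(length_inverse_mev):
    """Convert a length given in MeV^-1 into femtometers."""
    return length_inverse_mev * HBARC_MEV_FM


def cross_section_mb(sigma_inverse_mev2):
    """Convert a cross section given in MeV^-2 into millibarns.

    Assumption added for this sketch: 1 fm^2 = 10 mb.
    """
    return sigma_inverse_mev2 * HBARC_MEV_FM ** 2 * 10.0


if __name__ == "__main__":
    print(lifetime_seconds(1.0))     # lifetime for a width of 1 MeV, in seconds
    print(length_fm(1.0))            # 1 MeV^-1 expressed in fm
    print(cross_section_mb(1.0e-6))  # 1 GeV^-2 expressed in mb
```

With the rounded value of $\hbar c$ used here, 1 GeV$^{-2}$ comes out close to 0.39 mb, which is a convenient number to remember when converting cross sections.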
While the natural system of units is universally accepted in the literature concerning
quantum field theory and particle physics, there are three other conventions that may differ
between books, so we must state our particular choice explicitly (to avoid any misunderstanding
when comparing our results with other books or papers). First, the metric of the flat
spacetime used throughout the present text is defined by
$$ g_{\mu\nu} = g^{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1) . $$
In other words, the metric we are employing here has the signature $(+ - - -)$. Let us remark
that such a choice seems to be prevalent in current literature; for instance, among the books
that we cite in the list of relevant literature, only [13] and [18] use the metric with the opposite
signature $(- + + +)$. Anyway, one should keep in mind that there is no question of which metric
is “right” or “wrong”; its choice is just a matter of convention. Note also that readers specialized
mostly in relativity and gravitation should not worry about our notation $g_{\mu\nu}$ for the metric tensor,
which they are used to employing for the general case of curved Riemann space (distinguishing the
case of the flat spacetime by the symbol $\eta_{\mu\nu}$ or the like). The notation used here is common
practice in the literature concerning relativistic quantum theory and particle physics, since in
this area one is dealing just with flat spacetime (nevertheless, $\eta_{\mu\nu}$ is employed conventionally
e.g. in the books [11, 13] or [29]).
Second, another important convention is that for the fifth Dirac gamma matrix $\gamma_5$. Here
we use the definition
$$ \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 . $$
Again, this choice seems to be prevalent in the literature (note that within our list, the books [7]
and [13] define $\gamma_5$ with opposite sign).
Finally, our convention for the fully antisymmetric Levi-Civita tensor is such that
$$ \varepsilon_{0123} = +1 . $$
In this case, one must admit that this is a minority choice, since the option prevalent in current
literature is $\varepsilon^{0123} = +1$ (which corresponds to the sign change in contrast to our convention). So,
the reader must be careful when comparing our formulae in Appendix C and elsewhere (see in
particular (C.11)) with those presented in other textbooks. Note that the convention employed
here agrees with the classic books by Bjorken and Drell [1, 2].

Chapter 1

Klein–Gordon and Dirac equations: brief history

The best known equation of quantum mechanics is undoubtedly the Schrödinger equation, which
for a particle moving in an external field reads
$$ i\hbar\frac{\partial\psi}{\partial t} = \left( -\frac{\hbar^2}{2m}\Delta + V(\vec{x}) \right)\psi , \tag{1.1} $$
where $\Delta$ is the Laplace operator, $\Delta = \vec{\nabla}^2$, and $V(\vec{x})$ is the potential energy corresponding to
an external force. The wave function $\psi = \psi(\vec{x}, t)$ has the familiar interpretation: $|\psi(\vec{x}, t)|^2$
represents the probability density for the particle localization at the point $\vec{x}$ and time $t$. Erwin
Schrödinger published the equation in 1926 (and subsequently won the Nobel Prize in 1933). Let us
consider first Eq. (1.1) for a free particle, i.e. for $V = 0$. There is a simple “correspondence
principle” that may serve as a recipe for recovering the Schrödinger equation. Denoting the
energy as $E$ and momentum as $\vec{p}$, one may observe the correspondence
$$ E \longleftrightarrow i\hbar\frac{\partial}{\partial t} , \qquad \vec{p} \longleftrightarrow -i\hbar\vec{\nabla} , \tag{1.2} $$
which leads from the usual non-relativistic relation between kinetic energy and momentum
$$ E = \frac{\vec{p}^{\,2}}{2m} \tag{1.3} $$
to the Schrödinger equation
$$ i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\vec{\nabla}^2\psi . \tag{1.4} $$
Let us stress as emphatically as possible that the correspondence (1.2) does not represent a
derivation of the Schrödinger equation. This cannot be derived, it can only be postulated; this
is what the founding fathers of quantum theory did. The meaning of the correspondence (1.2)
is that it guarantees recovering the right relation between the energy and momentum (1.3) when
the operators in (1.2) act on an appropriate wave function $\psi$, in particular the plane wave
$$ \psi(\vec{x}, t) \propto e^{-\frac{i}{\hbar}(Et - \vec{p}\cdot\vec{x})} . \tag{1.5} $$
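As a quick check of the last statement (a worked step added here for convenience, not part of the original argument), one may let the operators in (1.2) act on the plane wave (1.5) explicitly:

```latex
\begin{align*}
i\hbar\,\frac{\partial}{\partial t}\, e^{-\frac{i}{\hbar}(Et-\vec{p}\cdot\vec{x})}
  &= E\, e^{-\frac{i}{\hbar}(Et-\vec{p}\cdot\vec{x})} , \\
-i\hbar\,\vec{\nabla}\, e^{-\frac{i}{\hbar}(Et-\vec{p}\cdot\vec{x})}
  &= \vec{p}\, e^{-\frac{i}{\hbar}(Et-\vec{p}\cdot\vec{x})} , \\
-\frac{\hbar^2}{2m}\,\vec{\nabla}^2\, e^{-\frac{i}{\hbar}(Et-\vec{p}\cdot\vec{x})}
  &= \frac{\vec{p}^{\,2}}{2m}\, e^{-\frac{i}{\hbar}(Et-\vec{p}\cdot\vec{x})} .
\end{align*}
```

Thus the plane wave satisfies (1.4) precisely when $E = \vec{p}^{\,2}/2m$, i.e. when (1.3) holds.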

In the same year in which the non-relativistic equation (1.4) or (1.1) was postulated, a
pertinent relativistic version was considered (preferably as a quantum mechanical equation for
the electron). In that case, one has to use as a motivating hint the relation between the energy
and momentum valid in special relativity, i.e.
$$ E^2 = c^2\vec{p}^{\,2} + m^2c^4 . \tag{1.6} $$
Then, using the correspondence (1.2), one gets immediately
$$ -\hbar^2\frac{\partial^2\psi}{\partial t^2} = \left( -\hbar^2c^2\vec{\nabla}^2 + m^2c^4 \right)\psi , \tag{1.7} $$
and this can be recast in a more elegant form
$$ \left( \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \Delta + \frac{m^2c^2}{\hbar^2} \right)\psi = 0 . \tag{1.8} $$
Now, the differential operator in Eq. (1.8) is the familiar d’Alembert operator
$$ \Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \Delta , \tag{1.9} $$
and thus we end up with
$$ \left( \Box + \frac{m^2c^2}{\hbar^2} \right)\psi(x) = 0 . \tag{1.10} $$

Equation (1.10) was formulated in 1926 independently by several theorists: Erwin
Schrödinger (who subsequently rejected it), Oskar Klein, Walter Gordon, and Vladimir Fock
(or, better, Fok: this is how the name reads in Russian). So, although it is apparently an equation with
many parents, it is universally called the Klein–Gordon equation.
A remark is perhaps in order here. The constant appearing in Eq. (1.10) is the inverse
square of the Compton wavelength, and one might wonder why it happens to be there, when
(1.10) clearly has nothing to do with the famous Compton process (the photon scattering on
a charged particle). The answer is guessed easily on dimensional grounds: the d’Alembert
operator $\Box$ has, obviously, the dimension of inverse length squared, and any possible additive
constant (with the same dimension) must be made of the fundamental constants of a relativistic
quantum theory, i.e. $c$ and $\hbar$ and, possibly, the relevant mass $m$. The combination $\hbar/mc$ is
then the only way to form a constant with the dimension of length (it is a refreshing
simple exercise to show that such a combination of $c$, $\hbar$ and $m$ is indeed unique).²
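The dimensional argument above can be sketched as a small linear-algebra exercise: writing the sought length as $\hbar^a c^b m^c$ and matching the powers of mass, length and time gives three linear equations for the exponents. The snippet below is only an illustration under stated assumptions: the function names are hypothetical, $\hbar c = 197$ MeV fm is the rounded value from the conventions section, and the electron rest energy of 0.511 MeV is a standard number that is not quoted in the text.

```python
from fractions import Fraction

# Dimensions (mass, length, time) of hbar, c and m.
DIMS = [
    (1, 2, -1),   # hbar: M L^2 T^-1
    (0, 1, -1),   # c:    L T^-1
    (1, 0, 0),    # m:    M
]


def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def length_exponents(target=(0, 1, 0)):
    """Solve for (a, b, c) such that hbar^a c^b m^c has the target dimension.

    Cramer's rule is used; a non-zero determinant means the solution is unique.
    """
    # Row i collects the power of dimension i contributed by each constant.
    matrix = [[DIMS[j][i] for j in range(3)] for i in range(3)]
    d = det3(matrix)
    if d == 0:
        raise ValueError("exponents are not uniquely determined")
    solution = []
    for k in range(3):
        replaced = [row[:] for row in matrix]
        for i in range(3):
            replaced[i][k] = target[i]
        solution.append(Fraction(det3(replaced), d))
    return tuple(solution)


def reduced_compton_wavelength_fm(rest_energy_mev, hbarc_mev_fm=197.0):
    """hbar/(m c) in fm, computed as (hbar c) / (m c^2)."""
    return hbarc_mev_fm / rest_energy_mev


if __name__ == "__main__":
    print(length_exponents())                     # exponents of hbar, c, m
    print(reduced_compton_wavelength_fm(0.511))   # electron, in fm
```

The exponents come out as $(1, -1, -1)$, i.e. $\hbar/mc$, and for the electron the resulting length is roughly 386 fm.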
For convenience, let us now pass to the natural system of units with $\hbar = 1$, $c = 1$. Using
the standard relativistic covariant notation, one then has
$$ \left( \Box + m^2 \right)\psi(x) = 0 , \tag{1.11} $$
with $\Box = \partial_\mu\partial^\mu$. The simplest solutions of Eq. (1.11) have the form of plane waves; for their
description one may use two linearly independent exponentials
$$ \psi^{(+)}(x) = \mathrm{const.}\; e^{-ip\cdot x} , \qquad \psi^{(-)}(x) = \mathrm{const.}\; e^{ip\cdot x} , \tag{1.12} $$

where $p\cdot x = p_0x_0 - \vec{p}\cdot\vec{x}$ (the logic of the chosen notation will become clear shortly). Inserting
(1.12) into (1.11), one gets the condition
$$ p^2 = m^2 , \tag{1.13} $$
i.e. $p_0^2 = \vec{p}^{\,2} + m^2$. Without loss of generality, one may choose $p_0 > 0$,
$$ p_0 = \sqrt{\vec{p}^{\,2} + m^2} . \tag{1.14} $$
So, as expected, one recovers the correct relation between the energy and momentum of a particle
with the mass $m$. Using the correspondence (1.2), one sees that the solution $\psi^{(+)}(x)$ describes
a state with positive energy $E = p_0$ and momentum $\vec{p}$, while $\psi^{(-)}(x)$ carries negative energy
$E = -p_0$ and momentum $-\vec{p}$. In any case, the four-component quantity $p$ satisfying (1.13) is
rightly called the four-momentum of the particle with the mass $m$. Thus we have encountered,
for the first time, the problem of a wave function for the free particle with negative energy; we
will see that this is a generic feature of the equations of relativistic quantum mechanics.

² In fact, sticking to the traditional terminology, $\hbar/mc$ is the Compton wavelength divided by $2\pi$.
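The statements about the plane waves (1.12) can be checked numerically. The following is a minimal sketch under simplifying assumptions: one space dimension only, natural units, finite differences instead of exact derivatives, and hypothetical function names chosen for this illustration.

```python
import cmath
import math


def plane_wave(sign, p, m):
    """Return (psi, p0) with psi(t, x) = exp(-i*sign*(p0*t - p*x)).

    sign = +1 gives psi^(+), sign = -1 gives psi^(-); p0 obeys (1.14).
    """
    p0 = math.sqrt(p * p + m * m)

    def psi(t, x):
        return cmath.exp(-1j * sign * (p0 * t - p * x))

    return psi, p0


def second_difference(f, s, h):
    """Central finite-difference approximation of the second derivative."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)


def klein_gordon_residual(psi, m, t, x, h=1e-3):
    """Left-hand side of (1.11) in 1+1 dimensions; it should be close to zero."""
    d2t = second_difference(lambda s: psi(s, x), t, h)
    d2x = second_difference(lambda s: psi(t, s), x, h)
    return d2t - d2x + m * m * psi(t, x)


def energy_from_correspondence(psi, t, x, h=1e-5):
    """Apply E <-> i d/dt from (1.2) and divide by psi to read off the energy."""
    dpsi_dt = (psi(t + h, x) - psi(t - h, x)) / (2.0 * h)
    return (1j * dpsi_dt / psi(t, x)).real


if __name__ == "__main__":
    for sign in (+1, -1):
        psi, p0 = plane_wave(sign, p=0.7, m=1.0)
        print(sign, abs(klein_gordon_residual(psi, 1.0, 0.3, 0.2)),
              energy_from_correspondence(psi, 0.3, 0.2), p0)
```

Both exponentials solve the equation to within the discretization error, while the energy read off via (1.2) is $+p_0$ for $\psi^{(+)}$ and $-p_0$ for $\psi^{(-)}$.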
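In fact, there is another difficulty inherent in the Klein–Gordon equation. If one wants
to implement the probabilistic interpretation of the wave function $\psi$, one should derive first a
pertinent continuity equation connecting the probability density (for particle localization) and
the density of probability current. Let us first remind the reader how one proceeds in the case
of the non-relativistic Schrödinger equation (1.4) (we are going to use the natural system of units,
i.e. set $\hbar = 1$). We have the equations for $\psi$ and $\psi^*$,
$$ i\frac{\partial\psi}{\partial t} = -\frac{1}{2m}\vec{\nabla}^2\psi , \tag{1.15} $$
$$ -i\frac{\partial\psi^*}{\partial t} = -\frac{1}{2m}\vec{\nabla}^2\psi^* . \tag{1.16} $$
Multiplying Eq. (1.15) by $\psi^*$ and (1.16) by $\psi$, and subtracting the two equations, one gets
immediately
$$ i\frac{\partial}{\partial t}(\psi\psi^*) = -\frac{1}{2m}\left( \psi^*\vec{\nabla}^2\psi - \psi\vec{\nabla}^2\psi^* \right) = -\frac{1}{2m}\vec{\nabla}\cdot\left( \psi^*\vec{\nabla}\psi - \psi\vec{\nabla}\psi^* \right) . $$
Thus one obtains the familiar result
$$ \frac{\partial}{\partial t}\rho_{\mathrm{Sch.}} + \vec{\nabla}\cdot\vec{j}_{\mathrm{Sch.}} = 0 , \tag{1.17} $$
with
$$ \rho_{\mathrm{Sch.}} = \psi\psi^* = |\psi|^2 , \qquad \vec{j}_{\mathrm{Sch.}} = \frac{1}{2mi}\left( \psi^*\vec{\nabla}\psi - \psi\vec{\nabla}\psi^* \right) . \tag{1.18} $$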
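For the Klein–Gordon equation (1.11) one may try to proceed in a similar manner. To begin
with, (1.11) is recast as
$$ \frac{\partial^2\psi}{\partial t^2} = \vec{\nabla}^2\psi - m^2\psi , \tag{1.19} $$
and the same equation holds for $\psi^*$. Next, using the multiplication and subtraction trick as
above, one gets first
$$ \frac{\partial}{\partial t}\left( \psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t} \right) = \vec{\nabla}\cdot\left( \psi^*\vec{\nabla}\psi - \psi\vec{\nabla}\psi^* \right) . \tag{1.20} $$
In order to make the left-hand side of Eq. (1.20) real, one has to include a factor of $i$; for getting
quantities with the same dimension as in the case of the Schrödinger equation, one may write
finally
$$ \frac{\partial}{\partial t}\rho_{\mathrm{KG}} + \vec{\nabla}\cdot\vec{j}_{\mathrm{KG}} = 0 , $$
where
$$ \rho_{\mathrm{KG}} = \frac{i}{2m}\left( \psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t} \right) , \qquad \vec{j}_{\mathrm{KG}} = \frac{1}{2mi}\left( \psi^*\vec{\nabla}\psi - \psi\vec{\nabla}\psi^* \right) . \tag{1.21} $$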
Obviously, in contrast to (1.18), the would-be probability density $\rho_{\mathrm{KG}}$ in (1.21) is not a priori
positive. In particular, it is easy to see that for $\psi^{(+)}$ shown in (1.12) the expression for $\rho_{\mathrm{KG}}$ is
positive, while for $\psi^{(-)}$ one gets a negative value of $\rho_{\mathrm{KG}}$. This, of course, is a serious flaw.
On top of that, it soon became clear that the Klein–Gordon equation is not viable as
an equation for the electron, because it cannot incorporate a description of intrinsic angular
momentum, the spin (note that the concept of electron spin appeared on the physical stage in
1925, when it was introduced by George Uhlenbeck and Samuel Goudsmit; surprisingly, they
never received the Nobel Prize for it).
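The sign statement about $\rho_{\mathrm{KG}}$ for the two plane waves can be illustrated numerically. The sketch below makes the same simplifying assumptions as any such toy check: one space dimension, natural units, a unit normalization constant, a central finite difference for the time derivative, and hypothetical function names.

```python
import cmath
import math


def plane_wave(sign, p, m):
    """psi(t, x) = exp(-i*sign*(p0*t - p*x)) with p0 = sqrt(p^2 + m^2)."""
    p0 = math.sqrt(p * p + m * m)

    def psi(t, x):
        return cmath.exp(-1j * sign * (p0 * t - p * x))

    return psi, p0


def rho_kg(psi, m, t, x, h=1e-5):
    """Would-be density (1.21): (i/2m) (psi* dpsi/dt - psi dpsi*/dt)."""
    dpsi_dt = (psi(t + h, x) - psi(t - h, x)) / (2.0 * h)
    value = psi(t, x)
    bracket = value.conjugate() * dpsi_dt - value * dpsi_dt.conjugate()
    return (1j * bracket / (2.0 * m)).real


if __name__ == "__main__":
    m = 1.0
    for sign in (+1, -1):
        psi, p0 = plane_wave(sign, p=0.7, m=m)
        print(sign, rho_kg(psi, m, 0.3, 0.2), p0 / m)
```

For a unit amplitude one finds $\rho_{\mathrm{KG}} = +p_0/m$ for $\psi^{(+)}$ and $\rho_{\mathrm{KG}} = -p_0/m$ for $\psi^{(-)}$, in line with the discussion above.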
Despite the above-mentioned difficulty with the interpretation of the probability density,
the Klein–Gordon equation, as an equation of relativistic quantum mechanics, does have some
limited applicability for the description of spinless particles (for more details, see e.g. the book
[1]). However, we will exploit this equation fully later on, within the framework of field theory.
Anyway, it is clear that a topical question that certainly resonated in the minds of quantum
theorists in the second half of the 1920s was: So, what is the right relativistic quantum equation for
the electron? The problem was resolved in 1928 by Paul Dirac. His solution was, at that time, quite
astonishing and this historical breakthrough is thus worth recapitulating here (for the original
paper, see [32]).
As we have already stressed, a major flaw of the Klein–Gordon equation is the non-
positivity of the would-be probability density in (1.21). It is clear what the source of this
inherent feature of $\rho_{\mathrm{KG}}$ is: the equation (1.19) is of the second order in time and thus a time
derivative emerges necessarily in (1.21). Thus, it is desirable to have an equation that would be
of the first order with respect to time. To ensure the relativistic covariance, it should also be
of the first order in space variables (time and space coordinates are treated on an equal footing
in Lorentz transformations). In any case, one has to maintain the relativistic relation between
energy and momentum (1.6) (for a moment, we come back to ordinary units). For this purpose,
Dirac took the square root of (1.6) by linearizing it as follows,
$$ E = c\,\alpha^j p^j + \beta mc^2 , \tag{1.22} $$
where $\alpha^j$, $j = 1, 2, 3$, and $\beta$ are some constant coefficients (summation over the index $j$ is
understood here, so $\alpha^j p^j$ can also be written as $\vec{\alpha}\cdot\vec{p}$). Now, employing the correspondence
(1.2) one arrives at the equation
$$ i\hbar\frac{\partial\psi}{\partial t} = \left( -i\hbar c\,\alpha^j\nabla^j + \beta mc^2 \right)\psi . \tag{1.23} $$
The consistency condition for such an equation is that upon squaring it, one should recover the
Klein–Gordon equation (which corresponds trivially to the energy–momentum relation (1.6)).
Before squaring Eq. (1.23) one must clarify a simple point: If one has an equation $A\psi = B\psi$
with $A$, $B$ being some operators, it does not imply automatically that $A^2\psi = B^2\psi$. Indeed, if
$A$ and $B$ do not commute, the latter identity is not guaranteed. However, if $AB = BA$, then
obviously $A\psi = B\psi \Rightarrow A^2\psi = AB\psi = BA\psi = B^2\psi$. Eq. (1.23) clearly corresponds to the case
$[A, B] = 0$, since the time derivative commutes with $\nabla^j$ on the right-hand side. One may now
square Eq. (1.23) with confidence; the only caveat is that one must not assume a priori that the
coefficients $\alpha^j$, $\beta$ commute (they cannot be ordinary numbers). Thus, one gets
$$ -\hbar^2\frac{\partial^2\psi}{\partial t^2} = \left[ -\hbar^2c^2\,(\vec{\alpha}\cdot\vec{\nabla})(\vec{\alpha}\cdot\vec{\nabla}) - i\hbar c\cdot mc^2\,(\vec{\alpha}\beta + \beta\vec{\alpha})\cdot\vec{\nabla} + \beta^2m^2c^4 \right]\psi . \tag{1.24} $$
In Eq. (1.24) one has
$$ (\vec{\alpha}\cdot\vec{\nabla})(\vec{\alpha}\cdot\vec{\nabla}) = \alpha^j\alpha^k\nabla^j\nabla^k = \frac{1}{2}\{\alpha^j, \alpha^k\}\nabla^j\nabla^k + \frac{1}{2}[\alpha^j, \alpha^k]\nabla^j\nabla^k , \tag{1.25} $$
but the last term in (1.25) vanishes, since $\nabla^j\nabla^k = \nabla^k\nabla^j$. In order to turn Eq. (1.24) into the
form of the Klein–Gordon equation, the coefficients $\alpha^j$, $\beta$ must obviously satisfy the identities
$$ \{\alpha^j, \alpha^k\} = 2\delta^{jk} , \qquad \{\beta, \alpha^j\} = 0 , \qquad \beta^2 = 1 . \tag{1.26} $$
So, it is clear that $\alpha^j$ and $\beta$ must be matrices rather than ordinary numbers, as we have rightly
anticipated before. Moreover, the equation (1.23) has a “Schrödinger-like” form; the operator on
its right-hand side could be interpreted as a Hamiltonian that should be Hermitian (self-adjoint).
It means that one should impose an additional constraint on $\alpha^j$ and $\beta$, namely
$$ (\alpha^j)^\dagger = \alpha^j , \qquad \beta^\dagger = \beta . \tag{1.27} $$
Now the question is what matrices can satisfy (1.26) and (1.27). First of all, it is not
difficult to show that the dimension of such matrices must be even. Indeed, (1.26) means, in
particular, that
$$ \alpha^j\alpha^k = -\alpha^k\alpha^j \quad \text{for } j \neq k , \tag{1.28} $$
and
$$ (\alpha^j)^2 = 1 \quad \text{for } j = 1, 2, 3 . \tag{1.29} $$
Let us now consider the determinants of the matrix products in (1.28). One has
$$ \det\alpha^j \det\alpha^k = \det(-1)\det\alpha^k \det\alpha^j = (-1)^d \det\alpha^j \det\alpha^k , \tag{1.30} $$
where $d$ is the dimension of matrices in question. Obviously, $\det\alpha^j \neq 0$ because of (1.29), so
(1.30) implies $(-1)^d = 1$, i.e. $d$ is even.
The simplest choice would be $d = 2$, but it does not work; the point is that there are not
four mutually anticommuting 2 × 2 matrices. Indeed, for $\alpha^j$, $j = 1, 2, 3$, one could take the Pauli
matrices $\sigma^j$, but then there is no non-trivial $\beta$ that would anticommute with them. Proving this
statement independently is left to the reader as an instructive algebraic exercise.
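For readers who wish to compare notes, one possible route to this statement is sketched here (offered only as a hint, not necessarily the solution the author has in mind). It expands a general 2 × 2 matrix in the basis formed by the unit matrix and the Pauli matrices; the complex coefficients $b_0$, $b_k$ are introduced just for this illustration, and the relation $\{\sigma^k, \sigma^j\} = 2\delta^{kj}$ is used.

```latex
\begin{align*}
\beta &= b_0 \cdot 1 + b_k\,\sigma^k , \\
\{\beta, \sigma^j\} &= 2 b_0\,\sigma^j + b_k \{\sigma^k, \sigma^j\}
                     = 2 b_0\,\sigma^j + 2 b_j \cdot 1 .
\end{align*}
```

Since the unit matrix and the Pauli matrices are linearly independent, the requirement $\{\beta, \sigma^j\} = 0$ for $j = 1, 2, 3$ forces $b_0 = b_1 = b_2 = b_3 = 0$, i.e. $\beta = 0$.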
The next try is $d = 4$ and we will see shortly that it does work. Needless to say, it then
also means that the wave function $\psi$ in Eq. (1.23) has four components. Before showing an
explicit example of 4 × 4 matrices satisfying (1.26) and (1.27), let us mention another general
property of the matrices in question. It is easy to show that matrices $\alpha^j$ and $\beta$ are traceless,
$$ \mathrm{Tr}\,\alpha^j = 0 , \quad j = 1, 2, 3 , \qquad \mathrm{Tr}\,\beta = 0 . \tag{1.31} $$
Let us prove e.g. the first identity (1.31). Since $\beta^2 = 1$, one may write
$$ \mathrm{Tr}\,\alpha^j = \mathrm{Tr}\left( \beta^2\alpha^j \right) = \mathrm{Tr}\left( \beta\alpha^j\beta \right) = -\mathrm{Tr}\left( \alpha^j\beta^2 \right) = -\mathrm{Tr}\,\alpha^j , $$
so that indeed $\mathrm{Tr}\,\alpha^j = 0$. Note that we have utilized just the trace cyclicity and the anticom-
mutation property $\beta\alpha^j = -\alpha^j\beta$. The second identity (1.31) can be proved in a similar way,
employing the same trick with e.g. $(\alpha^1)^2 = 1$.
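Finally, let us display an explicit example of the 4 × 4 matrices satisfying (1.26) and
(1.27). They are
$$ \alpha^j = \begin{pmatrix} 0 & \sigma^j \\ \sigma^j & 0 \end{pmatrix} , \quad j = 1, 2, 3 , \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \tag{1.32} $$
where $\sigma^j$ are the familiar Pauli matrices
$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \tag{1.33} $$
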
and 1 stands for the 2 × 2 unit matrix. It is straightforward to verify that the matrices (1.32)
have indeed the required properties. The representation (1.32) is used frequently in practical
calculations and is called the standard representation.
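The verification mentioned above is easy to automate. The following sketch builds the standard representation (1.32) from the Pauli matrices (1.33) and checks the identities (1.26), (1.27) and (1.31); the helper names are hypothetical and plain nested lists are used so that no external library is assumed.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def scaled_unit(s, n=4):
    return [[s if i == j else 0 for j in range(n)] for i in range(n)]


ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])


def check_dirac_algebra():
    """Return True if (1.26), (1.27) and (1.31) hold for ALPHA and BETA."""
    ok = matmul(BETA, BETA) == scaled_unit(1)
    ok = ok and dagger(BETA) == BETA and trace(BETA) == 0
    for j in range(3):
        ok = ok and anticommutator(BETA, ALPHA[j]) == scaled_unit(0)
        ok = ok and dagger(ALPHA[j]) == ALPHA[j] and trace(ALPHA[j]) == 0
        for k in range(3):
            expected = scaled_unit(2 if j == k else 0)
            ok = ok and anticommutator(ALPHA[j], ALPHA[k]) == expected
    return ok


if __name__ == "__main__":
    print(check_dirac_algebra())
```

All entries are small integers or multiples of $i$, so the comparisons are exact and no numerical tolerance is needed.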
In the next chapter we will see that Dirac’s magic trick of taking the square root
of the energy–momentum relation (1.6) in terms of 4 × 4 matrix coefficients leads indeed to
a successful description of the electron. The great leap from the simple kinematical relation
(1.6) to the deep quantum equation with rich physical contents makes the Dirac equation one
of the most remarkable achievements of 20th-century physics. Note that Dirac received
the Nobel Prize in 1933 together with E. Schrödinger. Many historical details concerning
Dirac’s discovery can be found in the book [9].

Chapter 2

Physical contents of Dirac equation: preliminary discussion

As we have noted in the preceding chapter, the prime motivation for finding an alternative to the
Klein–Gordon equation was the requirement that the probability density defined in terms of a quantum
mechanical wave function should be positive. So, let us now examine this problem for the Dirac
equation; for convenience, we return to the natural units. Eq. (1.23) then reads
$$ i\frac{\partial\psi}{\partial t} = -i\,\vec{\alpha}\cdot\vec{\nabla}\psi + \beta m\psi \tag{2.1} $$
(we will use the standard representation (1.32) in what follows). Let us recall that $\psi$ is a
four-component wave function that is conventionally written as a column
$$ \psi(x) = \begin{pmatrix} \psi_1(x) \\ \psi_2(x) \\ \psi_3(x) \\ \psi_4(x) \end{pmatrix} . \tag{2.2} $$
Upon Hermitian conjugation of Eq. (2.1) one has
$$ -i\frac{\partial\psi^\dagger}{\partial t} = i\,\vec{\nabla}\psi^\dagger\cdot\vec{\alpha} + m\psi^\dagger\beta , \tag{2.3} $$
where $\psi^\dagger = (\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*)$, and we have utilized the hermiticity property (1.27) of $\vec{\alpha}$ and $\beta$.
Multiplying Eq. (2.1) by $\psi^\dagger$ from the left and (2.3) by $\psi$ from the right, and then taking the
difference of the two equations, one gets immediately
$$ \frac{\partial}{\partial t}(\psi^\dagger\psi) + \vec{\nabla}\cdot(\psi^\dagger\vec{\alpha}\psi) = 0 , \tag{2.4} $$
which is the anticipated continuity equation. Thus we may identify the probability density and
the probability current as
$$ \rho_{\mathrm{Dirac}} = \psi^\dagger\psi , \qquad \vec{j}_{\mathrm{Dirac}} = \psi^\dagger\vec{\alpha}\psi . \tag{2.5} $$
The positivity of $\rho_{\mathrm{Dirac}}$ is obvious, since
$$ \psi^\dagger\psi = |\psi_1|^2 + |\psi_2|^2 + |\psi_3|^2 + |\psi_4|^2 . \tag{2.6} $$
This is an expected result, due to the fact that the Dirac equation (2.1) is, in a sense, the “square root
of the Klein–Gordon equation”; more precisely, it is an evolution equation of the first order in time,
having the form
$$ i\frac{\partial\psi}{\partial t} = H\psi , \tag{2.7} $$
where $H$ is the Dirac Hamiltonian
$$ H = -i\,\vec{\alpha}\cdot\vec{\nabla} + \beta m . \tag{2.8} $$
Thus, the time evolution is generated by an operator of energy, as it should be, in accordance
with the general principles of quantum theory.
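To make the role of $H$ as the energy operator slightly more concrete, one can look at its action on plane waves, where $-i\vec{\nabla}$ may be replaced by a momentum eigenvalue $\vec{p}$, so that $H$ becomes the 4 × 4 matrix $\vec{\alpha}\cdot\vec{p} + \beta m$. The sketch below rests on that replacement (an assumption of this illustration) and on arbitrarily chosen sample values $\vec{p} = (1, 2, 2)$, $m = 4$; the helper names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


# Standard representation (1.32).
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])


def dirac_hamiltonian(p, m):
    """H(p) = alpha . p + beta m for a momentum eigenvalue p = (p1, p2, p3)."""
    return [[sum(p[k] * ALPHA[k][i][j] for k in range(3)) + m * BETA[i][j]
             for j in range(4)] for i in range(4)]


def check_hamiltonian(p=(1, 2, 2), m=4):
    """Return (H is Hermitian, H^2 = (p^2 + m^2) * 1, Tr H = 0)."""
    h = dirac_hamiltonian(p, m)
    e2 = sum(c * c for c in p) + m * m
    e2_unit = [[e2 if i == j else 0 for j in range(4)] for i in range(4)]
    tr = sum(h[i][i] for i in range(4))
    return dagger(h) == h, matmul(h, h) == e2_unit, tr == 0


if __name__ == "__main__":
    print(check_hamiltonian())
```

For the sample values one has $\vec{p}^{\,2} + m^2 = 25$, so $H^2$ is 25 times the unit matrix; together with the vanishing trace this means that the eigenvalues of $H(\vec{p})$ are $+5$ and $-5$, each occurring twice.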
The next issue is the angular momentum. Let us start with the orbital angular momentum,
defined in the standard way as $\vec{L} = \vec{x}\times\vec{p}$, where $\vec{p}$ is the (linear) momentum $\vec{p} = -i\vec{\nabla}$.
As we know, $\vec{L}$ commutes with the non-relativistic Hamiltonian in the Schrödinger equation
(1.4). For the Dirac Hamiltonian (2.8) one gets, employing the canonical commutation relation
$[x_j, p_k] = i\delta_{jk}$,
\[ [H, \vec{L}] = -i(\vec{\alpha}\times\vec{p}) . \tag{2.9} \]
Let us remark that the vector product in (2.9) is defined formally as usual, i.e.

\[ (\vec{\alpha}\times\vec{p})^j = \varepsilon^{jkl}\alpha^k p^l . \]

So, apparently, there is something missing, since any decent angular momentum should be an
integral of motion for the free particle, i.e. the corresponding operator should commute with the
Hamiltonian. In other words, the fact that $[H, \vec{L}] \neq 0$ is a hint that we are on the right track
towards the electron spin. A good candidate for such an additional ingredient of the full angular
momentum is guessed quite easily. Let us consider the 4 × 4 matrices

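\[ \vec{S} = \frac{1}{2}\vec{\Sigma} , \qquad \vec{\Sigma} = \begin{pmatrix} \vec{\sigma} & 0 \\ 0 & \vec{\sigma} \end{pmatrix} , \tag{2.10} \]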

and recall that the Pauli matrices have the commutation relations

[𝜎 𝑗 , 𝜎𝑘 ] = 2𝑖𝜀 𝑗 𝑘𝑙 𝜎𝑙 . (2.11)

This means that the matrices $\vec{S}$ defined by (2.10) satisfy

[𝑆 𝑗 , 𝑆 𝑘 ] = 𝑖𝜀 𝑗 𝑘𝑙 𝑆𝑙 , (2.12)

which, of course, is a set of commutation relations for components of an angular momentum.


Needless to say, the matrices $\vec{S}$ possess eigenvalues ±1/2 (because $(\sigma_j)^2 = 1$ for $j = 1, 2, 3$).
Now we may evaluate the commutator $[H, \vec{S}]$. Clearly, $\vec{S}$ commutes with the diagonal matrix $\beta$
(see (1.32)). Concerning the commutator involving $\vec{\alpha}$, one gets first
\[ [\alpha^j, \Sigma^k] = \begin{pmatrix} 0 & 2i\varepsilon^{jkl}\sigma^l \\ 2i\varepsilon^{jkl}\sigma^l & 0 \end{pmatrix} , \]
so that
\[ [H, \Sigma^k] = 2i(\vec{\alpha}\times\vec{p})^k . \tag{2.13} \]
Summarizing the results of our simple algebraic exercise, we have

\[ [H, \vec{L}] = -i(\vec{\alpha}\times\vec{p}) , \qquad [H, \vec{S}] = i(\vec{\alpha}\times\vec{p}) , \tag{2.14} \]
and thus
\[ [H, \vec{J}] = 0 , \tag{2.15} \]
with
\[ \vec{J} = \vec{L} + \vec{S} . \tag{2.16} \]
Thus, in such a straightforward manner we have recovered the electron spin as a part of the
conserved total angular momentum (2.16).
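The purely matrix part of this exercise, namely (2.12) and (2.13), can be cross-checked numerically. The sketch below is only an illustration: it assumes the standard representation with $\beta = \mathrm{diag}(1, 1, -1, -1)$ and $\alpha^j$ off-diagonal in $\sigma^j$, and it treats $\vec{p}$ as an ordinary numerical vector (which is legitimate here, since $\vec{p}$ commutes with all the matrices). The helpers `comm` and `eps` are hypothetical names; the relation (2.9) involves $\vec{x}$ and is not covered by this check.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
Sigma = [np.block([[s, Z2], [Z2, s]]) for s in sigma]   # Eq. (2.10)
S = [0.5 * s for s in Sigma]

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

def eps(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) / 2

# Eq. (2.12): [S_j, S_k] = i eps_jkl S_l
spin_algebra_ok = all(
    np.allclose(comm(S[j], S[k]),
                1j * sum(eps(j, k, l) * S[l] for l in range(3)))
    for j in range(3) for k in range(3))

# Eq. (2.13): [H, Sigma^k] = 2i (alpha x p)^k, with a numerical momentum p
m, p = 1.0, np.array([0.3, -0.4, 1.2])
H = sum(p[j] * alpha[j] for j in range(3)) + m * beta
alpha_cross_p = [sum(eps(k, l, n) * alpha[l] * p[n]
                     for l in range(3) for n in range(3))
                 for k in range(3)]
h_sigma_ok = all(np.allclose(comm(H, Sigma[k]), 2j * alpha_cross_p[k])
                 for k in range(3))
print(spin_algebra_ok, h_sigma_ok)
```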
Let us now recall the problem of negative energy solutions of the Klein–Gordon equation,
mentioned in the preceding chapter (cf. (1.12)). One may wonder whether the Dirac equation
suffers from an analogous difficulty. To clarify this point, we are going to consider the solution
of Eq. (2.1) in the form of a plane wave involving the usual factor $\exp[-i(Et - \vec{p}\cdot\vec{x})]$. To make
our discussion as simple as possible, we will restrict ourselves to the case of a particle at rest,
i.e. set $\vec{p} = 0$. Eq. (2.1) is then reduced to
\[ i\,\frac{\partial\psi}{\partial t} = \beta m\psi . \tag{2.17} \]
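Taking into account the block diagonal structure of the matrix $\beta$ (see (1.32)), it is useful to split
$\psi$ as
\[ \psi = \begin{pmatrix} \varphi \\ \chi \end{pmatrix} , \tag{2.18} \]
where $\varphi$ and $\chi$ are two-component column vectors. Eq. (2.17) is then recast as
\[ i\,\frac{\partial\varphi}{\partial t} = m\varphi , \tag{2.19} \]
\[ i\,\frac{\partial\chi}{\partial t} = -m\chi . \tag{2.20} \]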
Thus, two linearly independent solutions of Eq. (2.19) may be written e.g. as

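\[ \varphi^{(1)} = e^{-imt}\begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \varphi^{(2)} = e^{-imt}\begin{pmatrix} 0 \\ 1 \end{pmatrix} , \tag{2.21} \]
and similarly for (2.20),
\[ \chi^{(1)} = e^{imt}\begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \chi^{(2)} = e^{imt}\begin{pmatrix} 0 \\ 1 \end{pmatrix} . \tag{2.22} \]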
In this way, we obtain a set of four independent solutions of Eq. (2.1)

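\[ \psi^{(1)} = e^{-imt}\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad \psi^{(2)} = e^{-imt}\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad \psi^{(3)} = e^{imt}\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad \psi^{(4)} = e^{imt}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} . \tag{2.23} \]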
Obviously, 𝜓 (1) and 𝜓 (2) correspond to the positive rest energy 𝐸 = 𝑚, while 𝜓 (3) and 𝜓 (4)
carry negative energy 𝐸 = −𝑚 (they are also characterized by the two possible spin projections
to the third axis, up and down (±1/2)). It is interesting to notice that in the considered case,
the existence of the negative energy solutions is a consequence of the specific structure of the
matrix 𝛽. If 𝛽 were the 4 × 4 unit matrix, we would have only a solution with positive energy. But,
alas, 𝛽 can never be the unit matrix because of the required anticommutation relations (1.26).
As we have already noted in the preceding chapter, the appearance of negative energy solutions
is a generic feature of the equations of relativistic quantum mechanics. We will discuss the
plane-wave solutions of Dirac equation in detail later on.
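The quantum numbers of the four rest-frame solutions (2.23) can be read off with a few lines of code. This is only a sketch under the assumption that, in the standard representation, $\beta = \mathrm{diag}(1, 1, -1, -1)$ and $\Sigma^3 = \mathrm{diag}(1, -1, 1, -1)$; the mass value is arbitrary and the time-dependent phase factors are dropped, since they do not affect the eigenvalue equations.

```python
import numpy as np

m = 2.0                                                # illustrative mass
beta = np.diag([1, 1, -1, -1]).astype(complex)
Sigma3 = np.diag([1, -1, 1, -1]).astype(complex)
H_rest = m * beta                                      # Hamiltonian for p = 0, cf. (2.17)
S3 = 0.5 * Sigma3                                      # third component of the spin (2.10)

quantum_numbers = []
for n in range(4):
    u = np.eye(4, dtype=complex)[:, n]                 # constant spinor of psi^(n+1)
    energy = (u.conj() @ H_rest @ u).real
    spin = (u.conj() @ S3 @ u).real
    # each basis spinor is a simultaneous eigenvector of H_rest and S3
    assert np.allclose(H_rest @ u, energy * u)
    assert np.allclose(S3 @ u, spin * u)
    quantum_numbers.append((float(energy), float(spin)))

print(quantum_numbers)
```

The list pairs each solution with its rest energy ($\pm m$) and its spin projection ($\pm 1/2$).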
The last topic that we are going to discuss here is a derivation of the spin magnetic
moment of the electron. Soon after the birth of relativistic quantum mechanics this was indeed

one of the most remarkable achievements of the Dirac theory, so it certainly deserves a detailed
exposition.
To this end, one has to start with the Dirac equation for the electron in an external
electromagnetic field. Using the scalar potential $\phi$ and vector potential $\vec{A}$, one may write the
relevant equation as
\[ i\,\frac{\partial\psi}{\partial t} = \left[ \vec{\alpha}\cdot(-i\vec{\nabla} - e\vec{A}) + e\phi + \beta m \right]\psi . \tag{2.24} \]
Note that the form (2.24) represents the so-called minimal electromagnetic interaction and
certainly satisfies the requirement of gauge invariance (invariance under gauge transformations
of the potentials $\phi$ and $\vec{A}$). In fact, it is not the most general choice, but coincides with the recipe
to be employed later on, in quantum electrodynamics. More comments on a possible extension
of the gauge invariant electromagnetic interaction within the framework of Dirac equation are
deferred to Chapter 13.
Our ultimate goal is to get the non-relativistic two-component Pauli equation, from
which one can extract easily the value of the magnetic moment in question. For this purpose,
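we will separate the upper and lower components of the wave function $\psi$ as
\[ \psi = \begin{pmatrix} \tilde{\varphi} \\ \tilde{\chi} \end{pmatrix} . \tag{2.25} \]
Then, denoting
\[ -i\vec{\nabla} - e\vec{A} = \vec{\pi} , \tag{2.26} \]
Eq. (2.24) is recast as a pair of coupled two-component equations
\[ \begin{aligned} i\,\frac{\partial\tilde{\varphi}}{\partial t} &= (\vec{\sigma}\cdot\vec{\pi})\,\tilde{\chi} + (e\phi + m)\,\tilde{\varphi} , \\ i\,\frac{\partial\tilde{\chi}}{\partial t} &= (\vec{\sigma}\cdot\vec{\pi})\,\tilde{\varphi} + (e\phi - m)\,\tilde{\chi} . \end{aligned} \tag{2.27} \]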
Throughout our calculation we will have in mind a situation close to the non-relativistic limit;
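thus, it is convenient to factorize in the wave function a part corresponding to the rest energy
(cf. (2.21)), i.e. introduce the Ansatz
\[ \begin{pmatrix} \tilde{\varphi} \\ \tilde{\chi} \end{pmatrix} = e^{-imt}\begin{pmatrix} \varphi \\ \chi \end{pmatrix} . \tag{2.28} \]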
Inserting (2.28) into Eq. (2.27) one gets, after a simple manipulation,
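\[ i\,\frac{\partial\varphi}{\partial t} = (\vec{\sigma}\cdot\vec{\pi})\,\chi + e\phi\,\varphi , \tag{2.29a} \]
\[ i\,\frac{\partial\chi}{\partial t} = (\vec{\sigma}\cdot\vec{\pi})\,\varphi + e\phi\,\chi - 2m\chi . \tag{2.29b} \]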
We consider weak fields, in particular 𝑒𝜙 ≪ 𝑚, as well as a small kinetic energy; the latter
assumption may be expressed, technically, as
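\[ \frac{\partial\chi}{\partial t} \ll m\chi . \]
Thus, in Eq. (2.29b) we will neglect $\partial\chi/\partial t$ and $e\phi\chi$ in comparison with $2m\chi$. Consequently,
the function $\chi$ can be approximately written as
\[ \chi \simeq \frac{1}{2m}(\vec{\sigma}\cdot\vec{\pi})\,\varphi . \tag{2.30} \]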
Using the last expression in Eq. (2.29a), we have
\[ i\,\frac{\partial\varphi}{\partial t} = \frac{1}{2m}(\vec{\sigma}\cdot\vec{\pi})(\vec{\sigma}\cdot\vec{\pi})\,\varphi + e\phi\,\varphi . \tag{2.31} \]
To work out the right-hand side of Eq. (2.31), one may utilize the familiar identity for Pauli
matrices
𝜎 𝑗 𝜎𝑘 = 𝛿 𝑗 𝑘 · 1 + 𝑖𝜀 𝑗 𝑘𝑙 𝜎𝑙 . (2.32)
From (2.32) one then gets
\[ (\vec{\sigma}\cdot\vec{\pi})(\vec{\sigma}\cdot\vec{\pi}) = \vec{\pi}^{\,2} + i\,\vec{\sigma}\cdot(\vec{\pi}\times\vec{\pi}) . \tag{2.33} \]
One must treat the vector product carefully, since 𝜋® is a differential operator. So, one has to
evaluate it by letting it act on an arbitrary test function 𝑓 ; one obtains, after some manipulations,
\[ (\vec{\pi}\times\vec{\pi})_j f = ie(\vec{\nabla}\times\vec{A})_j f , \]
so that
\[ \vec{\pi}\times\vec{\pi} = ie(\vec{\nabla}\times\vec{A}) = ie\vec{B} , \tag{2.34} \]

where 𝐵® is the magnetic field (the reader is encouraged to reproduce independently the result
(2.34)). In total, we thus have
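\[ (\vec{\sigma}\cdot\vec{\pi})(\vec{\sigma}\cdot\vec{\pi}) = (-i\vec{\nabla} - e\vec{A})^2 - e\,\vec{\sigma}\cdot\vec{B} . \]
The two-component equation (2.31) thus becomes
\[ i\,\frac{\partial\varphi}{\partial t} = \left[ \frac{1}{2m}(\vec{p} - e\vec{A})^2 + e\phi - \frac{e}{2m}\,\vec{\sigma}\cdot\vec{B} \right]\varphi , \tag{2.35} \]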
and this is the anticipated Pauli equation. Obviously, the last term in the square brackets
represents an interaction of the magnetic moment with the magnetic field $\vec{B}$. Since the Pauli matrices
have eigenvalues ±1, one may conclude that the value of the magnetic moment in question is
𝑒/(2𝑚) (i.e. one Bohr magneton). Note that Wolfgang Pauli formulated Eq. (2.35) in 1927
as a phenomenological description of the electron moving in an external field; he then used
the empirically known value of the spin magnetic moment. The derivation described above is
actually a prediction of the relevant value, made on the basis of a more fundamental equation
(though restricted to the minimal electromagnetic interaction). This is why the result (2.35)
obtained as a non-relativistic approximation of Dirac equation is extolled as a true achievement.
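The algebraic core of the derivation is the Pauli-matrix identity (2.32). For ordinary (commuting) vectors $\vec{a}$, $\vec{b}$ it gives $(\vec{\sigma}\cdot\vec{a})(\vec{\sigma}\cdot\vec{b}) = \vec{a}\cdot\vec{b} + i\vec{\sigma}\cdot(\vec{a}\times\vec{b})$, which can be checked numerically as sketched below. The vectors and the helper `sigma_dot` are illustrative assumptions. Note that for $\vec{a} = \vec{b}$ the cross term vanishes; the magnetic term in (2.35) arises only because the components of the operator $\vec{\pi}$ do not commute, and that step is not reproduced by this numerical check.

```python
import numpy as np

# Pauli matrices stacked into an array of shape (3, 2, 2)
sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for an ordinary 3-vector v."""
    return np.einsum('j,jab->ab', v, sigma)

a = np.array([0.5, -1.0, 2.0])     # illustrative commuting vectors
b = np.array([1.5, 0.2, -0.7])

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = (a @ b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
pauli_identity_ok = np.allclose(lhs, rhs)

# For a = b the cross product vanishes and (sigma . a)^2 = a^2
square_ok = np.allclose(sigma_dot(a) @ sigma_dot(a), (a @ a) * np.eye(2))
print(pauli_identity_ok, square_ok)
```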
One more remark is in order here. Magnetic moment of a particle is usually characterized
also by its gyromagnetic ratio, which is the ratio of the magnetic moment to the angular
momentum. It is a well-known fact that for the orbital motion, the gyromagnetic ratio is equal to
𝑒/(2𝑚) (this holds both in classical and in quantum theory). For the spin magnetic moment we
obviously have the gyromagnetic ratio 𝑒/𝑚, since the magnitude of the spin projection is 1/2.
Thus, the spin magnetic moment of electron does not obey the “normal” rule and differs from
it by a dimensionless factor called simply 𝒈-factor, here equal to 2. The 𝑔-factor has become a
usual way of description of intrinsic magnetic moment of subatomic particles.
The above-described elegant derivation of the electron spin magnetic moment, in par-
ticular the natural explanation of the “anomalous” value 𝑔 = 2 for the 𝑔-factor was certainly a
great success of the Dirac theory in 1928. In fact, even more remarkable was the continuation
of this success story some 20 years later. It turned out that quantum electrodynamics (QED)
leads to a tiny correction to the Dirac’s prediction. The correction is of relative order of one
per-mille; it was found experimentally in 1947 and subsequently calculated theoretically by
Julian Schwinger, one of the founding fathers of modern QED. This achievement corroborated
strongly the QED as the relevant model of quantum field theory capable of describing the most
subtle electromagnetic phenomena. We will discuss this topic in detail in Chapter 50.

Chapter 3

Covariant form of Dirac equation.


Fun with 𝜸-matrices

The main topic of this and the following chapter is the relativistic invariance of the Dirac equation.
Before proceeding to this extensive theme, let us return briefly to the Klein–Gordon equation. In
that case, the relativistic invariance is almost obvious, since the d’Alembert operator $\Box = \partial^\mu\partial_\mu$
has the structure of a scalar product in the four-dimensional spacetime. So, when passing from
one Lorentz reference frame to another, with coordinates transformed as 𝑥 ′ = Λ𝑥 (with Λ being
the matrix of a Lorentz transformation), one can get along with a trivial transformation of the
wave function
𝜓 ′ (𝑥 ′) = 𝜓(𝑥) . (3.1)
As we will see in the next chapter, in the case of Dirac equation the situation is much more
interesting, i.e. far from trivial. For a proper assessment of this problem, it is convenient to recast
first the Dirac equation in a form that is more symmetric with respect to spacetime coordinates;
this is what is meant by the term “covariant form” in the title of this chapter.
For a moment, let us use the ordinary units with ℏ ≠ 1, 𝑐 ≠ 1. As we know, Dirac
equation reads
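\[ i\hbar\,\frac{\partial\psi}{\partial t} = -i\hbar c\,\vec{\alpha}\cdot\vec{\nabla}\psi + \beta mc^2\psi . \tag{3.2} \]
Introducing the usual notation $x^0 = ct$ and taking into account that $\nabla_j$ is defined, conventionally,
as $\partial/\partial x^j = \partial_j$, one may rewrite Eq. (3.2) as
\[ i\hbar c\,\frac{\partial\psi}{\partial x^0} = -i\hbar c\,\vec{\alpha}\cdot\vec{\nabla}\psi + mc^2\beta\psi , \tag{3.3} \]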
and this subsequently becomes
𝑖ℏ𝛽𝜕0 𝜓 + 𝑖ℏ𝛽𝛼 𝑗 𝜕 𝑗 𝜓 − 𝑚𝑐𝜓 = 0 . (3.4)
If we now denote
𝛾0 = 𝛽 , 𝛾 𝑗 = 𝛽𝛼 𝑗 , (3.5)
then Eq. (3.4) can be rewritten as
\[ i\gamma^\mu\partial_\mu\psi - \frac{mc}{\hbar}\,\psi = 0 . \tag{3.6} \]

One may notice the appearance of the inverse Compton length in the second term; this, of course,
was to be expected on dimensional grounds. Thus, in natural units, Eq. (3.6) reads
(𝑖𝛾 𝜇 𝜕𝜇 − 𝑚)𝜓(𝑥) = 0 , (3.7)

and this is the promised “covariant form” of Dirac equation, which will be our staple food from
now on. The Dirac matrices 𝛾 𝜇 will be called simply gamma matrices, or 𝛾-matrices in what
follows. Notice that Eq. (3.7) seems to look covariant, since the term 𝛾 𝜇 𝜕𝜇 has, at first sight,
the form of a scalar product in Minkowski spacetime; however, 𝛾 𝜇 , 𝜇 = 0, 1, 2, 3, are fixed 4 × 4
matrices to be used in any reference frame, so one should not jump to conclusions at this point.
Anyway, a most economical form of Eq. (3.7), utilizing the scalar product symbol, is perhaps

𝑖𝛾 · 𝜕𝜓 = 𝑚𝜓 , (3.8)

and this is precisely what is engraved in the commemorative marker in Dirac’s honour in
Westminster Abbey (it has been there since 1995).
So, from now on, we will work with the set of matrices 𝛾 𝜇 introduced in (3.5); it is also
convenient to employ formally the rule for raising and lowering the Lorentz indices and define
$\gamma_\mu = g_{\mu\nu}\gamma^\nu$, i.e.
\[ \gamma_0 = \gamma^0 , \qquad \gamma_j = -\gamma^j . \tag{3.9} \]
To begin with, we should rewrite the anticommutation relations (1.26) in terms of 𝛾 𝜇 . This is
an easy exercise; we have

{𝛾 0 , 𝛾 𝑗 } = {𝛽, 𝛽𝛼 𝑗 } = 𝛽2 𝛼 𝑗 + 𝛽𝛼 𝑗 𝛽 = 0 ,
{𝛾 𝑗 , 𝛾 𝑘 } = {𝛽𝛼 𝑗 , 𝛽𝛼 𝑘 } = 𝛽𝛼 𝑗 𝛽𝛼 𝑘 + 𝛽𝛼 𝑘 𝛽𝛼 𝑗 = −{𝛼 𝑗 , 𝛼 𝑘 } = −2𝛿 𝑗 𝑘 · 1 ,

and, of course,
{𝛾 0 , 𝛾 0 } = 2𝛽2 = 2 · 1 ,
where 1 denotes the 4 × 4 unit matrix. Thus, we can summarize the above results as

{𝛾 𝜇 , 𝛾 𝜈 } = 2𝑔 𝜇𝜈 · 1 (3.10)

(for brevity, we will usually omit 1 when using (3.10)). In mathematics, the anticommutation
relations (3.10) are known to correspond to generators of the so-called Clifford algebra. Note
that (3.10) means, in particular,

(𝛾 0 ) 2 = 1 , (𝛾 𝑗 ) 2 = −1 . (3.11)

Further, let us see what becomes of the hermiticity relations (1.27). Obviously, one has

(𝛾 0 ) † = 𝛾 0 , (𝛾 𝑗 ) † = (𝛽𝛼 𝑗 ) † = 𝛼 𝑗 𝛽 = −𝛾 𝑗 . (3.12)

So, taking into account (3.10), (3.12) may be summarized as

(𝛾 𝜇 ) † = 𝛾 0 𝛾 𝜇 𝛾 0 . (3.13)

This last relation is one of the identities that will be used very frequently in our forthcoming
calculations.
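The relations (3.10) and (3.13) are easily confirmed numerically for a concrete set of matrices. The sketch below assumes the standard representation ($\beta = \mathrm{diag}(1, 1, -1, -1)$, $\alpha^j$ off-diagonal in $\sigma^j$) and builds $\gamma^\mu$ according to (3.5); the variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]

# Eq. (3.5): gamma^0 = beta, gamma^j = beta alpha^j
gamma = [beta] + [beta @ a for a in alpha]
g = np.diag([1.0, -1.0, -1.0, -1.0])          # metric tensor g^{mu nu}

# Eq. (3.10): {gamma^mu, gamma^nu} = 2 g^{mu nu} * 1
clifford_ok = all(
    np.allclose(gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu],
                2 * g[mu, nu] * np.eye(4))
    for mu in range(4) for nu in range(4))

# Eq. (3.13): (gamma^mu)^dagger = gamma^0 gamma^mu gamma^0
hermiticity_ok = all(
    np.allclose(gamma[mu].conj().T, gamma[0] @ gamma[mu] @ gamma[0])
    for mu in range(4))
print(clifford_ok, hermiticity_ok)
```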
It is highly useful to introduce a fifth 𝛾-matrix, denoted traditionally as 𝛾5 , which is
proportional to the product 𝛾0 𝛾1 𝛾2 𝛾3 . The salient feature of such a matrix product is that it
anticommutes with any 𝛾 𝜇 , 𝜇 = 0, 1, 2, 3 (the reader is encouraged to check this statement
independently). Note that we fix the definition of 𝛾5 conventionally as

𝛾5 = 𝑖𝛾 0 𝛾 1 𝛾 2 𝛾 3 . (3.14)

So, we have
{𝛾5 , 𝛾 𝜇 } = 0 , 𝜇 = 0, 1, 2, 3 . (3.15)
Other basic properties of the 𝛾5 shown in (3.14) are

(𝛾5 ) 2 = 1 , (𝛾5 ) † = 𝛾5 (3.16)

(again, proving (3.16) is left to the reader as an easy exercise).
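A numerical cross-check of (3.15) and (3.16) may serve as a sanity test before attempting the general proof. It is only a sketch in one particular representation (the standard one, assumed to be as in (3.37)), so it illustrates the statements rather than proving them.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
# Standard representation of gamma^mu, cf. (3.37)
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + \
        [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# Eq. (3.14): gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

# Eq. (3.15): gamma_5 anticommutes with every gamma^mu
anticommutes = all(np.allclose(gamma5 @ gm + gm @ gamma5, 0) for gm in gamma)
# Eq. (3.16): (gamma_5)^2 = 1 and gamma_5 is Hermitian
squares_to_one = np.allclose(gamma5 @ gamma5, np.eye(4))
is_hermitian = np.allclose(gamma5.conj().T, gamma5)
print(anticommutes, squares_to_one, is_hermitian)
print(gamma5.real)   # off-diagonal unit blocks in this representation
```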


The rest of this chapter is devoted to a rather detailed discussion of various properties of
the gamma matrices; it is just an algebra, no physics. So, this is the would-be “fun” mentioned
in the heading (though, admittedly, the “fun” in the present context is a matter of personal taste
— obviously, the trick here was to lure the reader into studying an otherwise somewhat boring
subject).
First of all, it is useful to master some simple formulae for traces of products of 𝛾-
matrices. We already know (see (1.31)) that traces of the original Dirac matrices 𝛼 𝑗 and 𝛽 are
zero; this finding is easily reproduced for the 𝛾-matrices as well. With the matrix 𝛾5 at hand,
it is straightforward to prove that the trace of a product of odd number of 𝛾-matrices is zero;
symbolically,
Tr(odd #) = 0 . (3.17)
The proof can be left to the reader as an instructive exercise. Hint: Use 𝛾52 = 1, the anticom-
mutation property (3.15) and trace cyclicity. By the way, using a similar trick, one can prove as
well that
Tr 𝛾5 = 0 (3.18)
(to this end, one may use e.g. 𝛾02 = 1).
For products of an even number 𝑛 = 2𝑘 of 𝛾-matrices one gets a series of formulae that
have quite uniform structure. Let’s start with 𝑛 = 2. One has certainly Tr(𝛾 𝜇 𝛾𝜈 ) = Tr(𝛾𝜈 𝛾 𝜇 ), so
that, employing (3.10),
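\[ \mathrm{Tr}(\gamma_\mu\gamma_\nu) = \frac{1}{2}\,\mathrm{Tr}(\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu) = \frac{1}{2}\cdot 2g_{\mu\nu}\,\mathrm{Tr}\,\mathbb{1} = 4g_{\mu\nu} . \tag{3.19} \]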
How about 𝑛 = 4? We have, using (3.10),

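\[ \mathrm{Tr}(\gamma_\mu\gamma_\nu\gamma_\rho\gamma_\sigma) = \mathrm{Tr}\left[ (2g_{\mu\nu} - \gamma_\nu\gamma_\mu)\gamma_\rho\gamma_\sigma \right] = 2g_{\mu\nu}\,\mathrm{Tr}(\gamma_\rho\gamma_\sigma) - \mathrm{Tr}(\gamma_\nu\gamma_\mu\gamma_\rho\gamma_\sigma) . \tag{3.20} \]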
Now, we can go on anticommuting the 𝛾 𝜇 with 𝛾 𝜌 and then with 𝛾𝜎 . In this way, we end up with

Tr(𝛾 𝜇 𝛾𝜈 𝛾 𝜌 𝛾𝜎 ) = 2𝑔 𝜇𝜈 Tr(𝛾 𝜌 𝛾𝜎 ) − 2𝑔 𝜇𝜌 Tr(𝛾𝜈 𝛾𝜎 ) + 2𝑔 𝜇𝜎 Tr(𝛾𝜈 𝛾 𝜌 ) − Tr(𝛾𝜈 𝛾 𝜌 𝛾𝜎 𝛾 𝜇 ) . (3.21)

However, in the last term on the right-hand side of (3.21) we can use the trace cyclicity and one
thus gets, eventually,
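\[ \begin{aligned} \mathrm{Tr}(\gamma_\mu\gamma_\nu\gamma_\rho\gamma_\sigma) &= \frac{1}{2}\left[ 2g_{\mu\nu}\,\mathrm{Tr}(\gamma_\rho\gamma_\sigma) - 2g_{\mu\rho}\,\mathrm{Tr}(\gamma_\nu\gamma_\sigma) + 2g_{\mu\sigma}\,\mathrm{Tr}(\gamma_\nu\gamma_\rho) \right] \\ &= 4(g_{\mu\nu}g_{\rho\sigma} - g_{\mu\rho}g_{\nu\sigma} + g_{\mu\sigma}g_{\nu\rho}) , \end{aligned} \tag{3.22} \]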

where we have utilized the preceding result (3.19).
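The trace formulae (3.17), (3.19) and (3.22) can be verified by brute force over all index combinations. The sketch below assumes the standard representation (3.37) and works with upper indices, which is equivalent for this purpose since the metric has the same components in both positions; the helper `tr` is a hypothetical name.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + \
        [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
g = np.diag([1.0, -1.0, -1.0, -1.0])
idx = range(4)

def tr(*indices):
    """Trace of the ordered product of gamma matrices with the given indices."""
    prod = np.eye(4, dtype=complex)
    for mu in indices:
        prod = prod @ gamma[mu]
    return np.trace(prod)

# Eq. (3.17) for one and three gamma matrices
odd_traces_vanish = all(
    np.isclose(tr(*c), 0)
    for n in (1, 3) for c in itertools.product(idx, repeat=n))

# Eq. (3.19)
trace2_ok = all(np.isclose(tr(m, n), 4 * g[m, n])
                for m, n in itertools.product(idx, repeat=2))

# Eq. (3.22)
trace4_ok = all(
    np.isclose(tr(m, n, r, s),
               4 * (g[m, n] * g[r, s] - g[m, r] * g[n, s] + g[m, s] * g[n, r]))
    for m, n, r, s in itertools.product(idx, repeat=4))
print(odd_traces_vanish, trace2_ok, trace4_ok)
```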


The above example makes it clear how to proceed further, i.e. for 𝑛 ≥ 6: one moves the
first 𝛾-matrix in the product step by step (employing the basic anticommutation relation (3.10))
to the last position, and then the trace cyclicity can be used. On the way, one encounters products
with the number of 𝛾-matrices less by two, so one can utilize the result for the preceding member

of the whole hierarchy. Thus, it is quite clear that the results for traces in question are expressed
as products of pertinent components of the metric tensor 𝑔; for 𝑛 = 2𝑘 these products consist of
just 𝑘 factors. The resulting number 𝑁 of terms for the trace with 𝑛 = 2𝑘 grows rapidly with 𝑛;
the recursive procedure outlined above shows clearly that 𝑁 (2𝑘) = (2𝑘 − 1)𝑁 (2𝑘 − 2), which
means that
\[ N(2k) = (2k-1)!! = \frac{(2k)!}{2^k\,k!} \tag{3.23} \]
(so, for 𝑛 = 6 one gets 15 terms, for 𝑛 = 8 there are 105 terms, etc.). One would certainly
have a lot of fun computing such a trace for 𝑛 = 14, which amounts to 135 135 terms (sic!),
but rest assured that we will always get along with smaller numbers (see footnote 3). In any case, one might
wonder what it is all good for; please, don’t worry and be patient, you will see that the traces of
products of 𝛾-matrices will come in handy later (in QED, in particular). Perhaps one may refer
to a well-known quotation (due to A. P. Chekhov (Čechov)) saying that “If in the first act (of a
drama) there is a rifle hanging on the wall, then in the last act someone must fire it.” (in fact, we
will “fire the rifle” much earlier than in the last act of this lecture course).
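The counting of terms can be reproduced with a few lines of code, comparing the recursion $N(2k) = (2k-1)N(2k-2)$ with the closed form (3.23). The function names are illustrative, and the starting value $N(2) = 1$ is taken from (3.19), where the trace is a single term.

```python
from math import factorial

def n_terms_recursive(k):
    """Number of terms in the trace of 2k gamma matrices via N(2k) = (2k-1) N(2k-2)."""
    n = 1                       # N(2) = 1, cf. (3.19)
    for j in range(2, k + 1):
        n *= 2 * j - 1
    return n

def n_terms_closed(k):
    """Closed form (3.23): (2k)! / (2^k k!)."""
    return factorial(2 * k) // (2**k * factorial(k))

counts = {2 * k: n_terms_recursive(k) for k in range(1, 8)}
print(counts)
```

For $n = 4, 6, 8$ and $14$ this gives 3, 15, 105 and 135135 terms, in agreement with the numbers quoted above.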
There are many other special identities for 𝛾-matrices that will be practically useful later
on (they are collected in Appendix C), but now we are going to study some of their deeper
structural properties that will be needed soon. The basic point of the analysis that follows is
the observation that one can find an appropriate basis in the space of 4 × 4 matrices, made of
products of 𝛾-matrices. To this end, we will consider 16 matrices, denoted for convenience as
Γ𝐴 , 𝐴 = 1, . . . , 16, and defined as follows. First, Γ1 = 1 (this can be obtained as e.g. (𝛾0 ) 2 );
next, we take
𝛾 𝜇 , 𝜇 = 0, 1, 2, 3 : Γ2 , Γ3 , Γ4 , Γ5 , (3.24)
and then one may form products of two, three and four 𝛾-matrices. That’s the end of the story
— it is clear that products of five and more 𝛾-matrices would not bring anything new, because
of (3.11) (such expressions are reduced to products of less than five 𝛾-matrices). Thus, let us
denote
𝛾0 𝛾1 , 𝛾0 𝛾2 , 𝛾0 𝛾3 , 𝛾1 𝛾2 , 𝛾1 𝛾3 , 𝛾2 𝛾3 : Γ6 , . . . , Γ11 . (3.25)
Further, four independent products of three 𝛾-matrices are equivalent to

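\[ \gamma_0\tilde{\gamma}_5 ,\ \gamma_1\tilde{\gamma}_5 ,\ \gamma_2\tilde{\gamma}_5 ,\ \gamma_3\tilde{\gamma}_5 : \quad \Gamma_{12}, \Gamma_{13}, \Gamma_{14}, \Gamma_{15} , \tag{3.26} \]
where $\tilde{\gamma}_5$ is defined as
\[ \tilde{\gamma}_5 = \gamma_0\gamma_1\gamma_2\gamma_3 \tag{3.27} \]
(we have chosen this provisional notation instead of (3.14) for simplicity). Finally, we set
\[ \Gamma_{16} = \tilde{\gamma}_5 . \tag{3.28} \]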

Using (3.10), it is easy to see that the square of any matrix Γ𝐴 is either 1 or −1. In particular,
one has

Γ2𝐴 = 1 for 𝐴 = 1, 2, 6, 7, 8, 12 ,
Γ2𝐴 = −1 for 𝐴 = 3, 4, 5, 9, 10, 11, 13, 14, 15, 16 . (3.29)

So, the set of Γ𝐴 , 𝐴 = 1, . . . , 16, has the right number of terms to be a good candidate for a basis
in the considered 16-dimensional space of 4 × 4 matrices. Before showing that the Γ𝐴 are indeed
[Footnote 3: Note that discovering this remarkable number has been just serendipitous; obviously, an average QFT
practitioner can hardly come across it in routine calculations.]
linearly independent, we are going to present a few auxiliary statements (lemmas) describing
some simple, but important, properties of the matrices Γ𝐴 .
L1 (commutation & anticommutation): Any pair 𝛤A , 𝛤B either commutes or anticommutes.
This can be proved easily by employing the anticommutation relations (3.10) and (3.15).
L2 (on traces): Tr 𝛤A = 0 for any A > 1.
A part of this statement we have already proved before; in general, using (3.10) and (3.15) is
sufficient.
L3 (on the rearrangement): When multiplying all 𝛤A ’s by a particular 𝛤B from left or right,
one gets again the same set, up to signs and the order.
One may prove such a statement simply “by inspection” (in principle, one should produce a
pertinent multiplication table with 16 × 16 = 256 entries).
The above lemmas are now sufficient for establishing the fact that such Γ𝐴 ’s form a basis.
L4 (on the linear independence): The matrices 𝛤A , A = 1, . . . , 16, are linearly independent.
Proof: Suppose that
𝑎 1 Γ1 + 𝑎 2 Γ2 + . . . + 𝑎 16 Γ16 = 0 (3.30)
for some coefficients 𝑎 1 , . . . , 𝑎 16 . For convenience, let us denote the linear combination on the
left-hand side of (3.30) simply as 𝐿. Obviously, Eq. (3.30) implies, for any 𝐴 = 1, . . . , 16,

Tr(Γ𝐴 𝐿) = 0 . (3.31)

Now, using lemmas L2 and L3, along with the identities (3.28), (3.29), one gets Tr(Γ𝐴 𝐿) = ±4𝑎 𝐴 ,
depending on whether Γ2𝐴 is 1 or −1. In any case, Eq. (3.31) thus implies 𝑎 𝐴 = 0 for any
𝐴 = 1, . . . , 16 and the statement L4 is thereby proved.
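The statements (3.29), L2 and L4 lend themselves to a direct numerical check. The sketch below builds the sixteen matrices in the order of (3.24)–(3.28), assuming the standard representation (3.37); the list name `Gammas` and the 1-based labelling via `enumerate(..., start=1)` are illustrative choices. Linear independence is tested by flattening each matrix into a 16-component vector and computing the rank.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
g0, g1, g2, g3 = [np.block([[I2, Z2], [Z2, -I2]])] + \
                 [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
g5t = g0 @ g1 @ g2 @ g3                      # provisional gamma_5 of (3.27)

Gammas = [np.eye(4, dtype=complex),          # Gamma_1
          g0, g1, g2, g3,                    # Gamma_2 .. Gamma_5
          g0 @ g1, g0 @ g2, g0 @ g3,         # Gamma_6 .. Gamma_8
          g1 @ g2, g1 @ g3, g2 @ g3,         # Gamma_9 .. Gamma_11
          g0 @ g5t, g1 @ g5t, g2 @ g5t, g3 @ g5t,   # Gamma_12 .. Gamma_15
          g5t]                               # Gamma_16

# Eq. (3.29): squares are +1 for A in {1, 2, 6, 7, 8, 12} and -1 otherwise
plus = {1, 2, 6, 7, 8, 12}
squares_ok = all(
    np.allclose(G @ G, (1 if A in plus else -1) * np.eye(4))
    for A, G in enumerate(Gammas, start=1))

# Lemma L2: Tr Gamma_A = 0 for A > 1
traces_ok = all(abs(np.trace(G)) < 1e-12 for G in Gammas[1:])

# Tr(Gamma_A Gamma_B) vanishes for A != B and equals +-4 for A = B
gram = np.array([[np.trace(A @ B) for B in Gammas] for A in Gammas])
orthogonal_ok = (np.allclose(gram, np.diag(np.diag(gram)))
                 and np.allclose(np.abs(np.diag(gram)), 4))

# Lemma L4: the 16 flattened matrices span the whole 16-dimensional space
rank = np.linalg.matrix_rank(np.array([G.reshape(-1) for G in Gammas]))
print(squares_ok, traces_ok, orthogonal_ok, rank)
```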
Now we are in a position to prove the following important statement:
L5: Let M be a matrix 4 × 4 that commutes with any 𝛾 𝜇 , 𝜇 = 0, 1, 2, 3. Then M is a multiple of
the unit matrix.
Proof: According to the preceding lemma L4, the matrices Γ1 , . . . , Γ16 form a basis; thus, the
matrix 𝑀 can be expressed as a linear combination

𝑀 = 𝑎 1 Γ1 + . . . + 𝑎 16 Γ16 . (3.32)

The premise represents four conditions, namely [𝑀, 𝛾 𝜇 ] = 0 for 𝜇 = 0, 1, 2, 3. Let us start with
𝜇 = 0. Using (3.32), the condition [𝑀, 𝛾0 ] = 0 means

𝑎 1 Γ1 𝛾0 + . . . + 𝑎 16 Γ16 𝛾0 = 𝑎 1 𝛾0 Γ1 + . . . + 𝑎 16 𝛾0 Γ16 . (3.33)

It is easy to find out that 𝛾0 commutes with Γ𝐴 for 𝐴 = 1, 2, 9, 10, 11, 13, 14, 15 and anticommutes
with Γ𝐴 for 𝐴 = 3, 4, 5, 6, 7, 8, 12, 16. Thus, the terms involving the commuting Γ𝐴 ’s drop out
of Eq. (3.33), while the anticommuting Γ𝐴 ’s survive, and Eq. (3.33) eventually becomes

𝑎 3 Γ3 + 𝑎 4 Γ4 + 𝑎 5 Γ5 + 𝑎 6 Γ6 + 𝑎 7 Γ7 + 𝑎 8 Γ8 + 𝑎 12 Γ12 + 𝑎 16 Γ16 = 0 .

Of course, according to the lemma L4 this amounts to

𝑎 3 = 𝑎 4 = 𝑎 5 = 𝑎 6 = 𝑎 7 = 𝑎 8 = 𝑎 12 = 𝑎 16 = 0 . (3.34)

Consequently, after this first step, the expansion (3.32) is reduced to

𝑀 = 𝑎 1 Γ1 + 𝑎 2 Γ2 + 𝑎 9 Γ9 + 𝑎 10 Γ10 + 𝑎 11 Γ11 + 𝑎 13 Γ13 + 𝑎 14 Γ14 + 𝑎 15 Γ15 . (3.35)

One may now continue in this way, using for (3.35) the condition [𝑀, 𝛾1 ] = 0. It reduces
further the form of 𝑀, and then one goes on along the same line with 𝛾2 and 𝛾3 . It turns out
that eventually one is left with 𝑀 = 𝑎 1 Γ1 = 𝑎 1 · 1 (since Γ1 commutes with anything), and
this is precisely what we wanted to prove. The reader is urged to check independently the steps
involving the commutators [𝑀, 𝛾 𝜇 ] for 𝜇 = 1, 2, 3.
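As a complement to the algebraic proof, the statement L5 can be tested numerically by treating the four conditions $[M, \gamma^\mu] = 0$ as a homogeneous linear system for the 16 elements of $M$. The sketch below assumes the standard representation (3.37) and uses the row-major identity $\mathrm{vec}(AXB) = (A \otimes B^T)\,\mathrm{vec}(X)$; the variable names are illustrative. A one-dimensional null space spanned by the unit matrix is what L5 predicts.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + \
        [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
I4 = np.eye(4)

# vec(M gamma - gamma M) = (1 (x) gamma^T - gamma (x) 1) vec(M), row-major vec;
# stacking the four conditions gives a 64 x 16 matrix acting on vec(M)
C = np.vstack([np.kron(I4, gm.T) - np.kron(gm, I4) for gm in gamma])

rank = np.linalg.matrix_rank(C)
null_dim = 16 - rank

# The right singular vector of the zero singular value spans the null space
_, _, Vh = np.linalg.svd(C)
M = Vh[-1].conj().reshape(4, 4)
proportional_to_unit = np.allclose(M, M[0, 0] * np.eye(4)) and abs(M[0, 0]) > 1e-6
print(null_dim, proportional_to_unit)
```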
The above series of statements concerning matrices Γ𝐴 culminates in a profound theorem,
usually called the “fundamental theorem on 𝛾-matrices”. It can be formulated as follows.
Theorem: Let 𝛾 𝜇 and 𝛾 ′ 𝜇 , 𝜇 = 0, 1, 2, 3, be two sets of 4 × 4 matrices satisfying the relations
{𝛾 𝜇 , 𝛾 𝜈 } = 2𝑔 𝜇𝜈 , {𝛾 ′ 𝜇 , 𝛾 ′ 𝜈 } = 2𝑔 𝜇𝜈 . Then there exists a non-singular matrix 𝑆, unique up to a
multiplicative factor, such that
𝛾 ′ 𝜇 = 𝑆𝛾 𝜇 𝑆 −1 (3.36)
for any 𝜇 = 0, 1, 2, 3. Further, if the 𝛾-matrices satisfy the hermiticity conditions (3.12), the
matrix 𝑆 can be chosen to be unitary.
The proof of this remarkable statement is somewhat long and thus we will refrain from
presenting it here. The interested reader can find the proof e.g. in the book [3].
Obviously, the importance of this theorem consists in the observation that all possible
realizations of the Dirac 𝛾-matrices are equivalent. Nevertheless, some particular representations
may be more convenient than others in practical calculations. So, one might use a paraphrase
of the familiar sentence from a famous book by George Orwell, namely: “All representations of
𝛾-matrices are equal, but some of them are more equal than others”.
We already know at least one explicit realization of 𝛾-matrices; more precisely, we know
the so-called standard representation of 𝛼® and 𝛽 (see (1.32)), and using (3.5) one then has

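\[ \gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \qquad \gamma^j = \begin{pmatrix} 0 & \sigma_j \\ -\sigma_j & 0 \end{pmatrix} , \quad j = 1, 2, 3 , \tag{3.37} \]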

as the standard representation for 𝛾 𝜇 . We have noticed before that within such a representation,
the non-relativistic limit is characterized by a suppression of the lower two components of the
wave function with respect to the upper ones. There are at least two more examples of 𝛾-matrix
representations that are worth mentioning here. One of them is the so-called spinor (or chiral)
representation (the origin of these names will become clear later), which is

\[ \gamma_S^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \gamma_S^j = \begin{pmatrix} 0 & -\sigma_j \\ \sigma_j & 0 \end{pmatrix} , \quad j = 1, 2, 3 . \tag{3.38} \]
Further, there is a remarkable representation, in which all 𝛾-matrices are purely imaginary (this
in turn means that in the Dirac equation only real coefficients are then involved). It is called the
Majorana representation and the corresponding matrices $\gamma_M^\mu$ can be expressed with the help
of the standard 𝛾-matrices as
\[ \gamma_M^0 = \gamma^0\gamma^2 , \qquad \gamma_M^1 = -\gamma^1\gamma^2 , \qquad \gamma_M^2 = -\gamma^2 , \qquad \gamma_M^3 = \gamma^2\gamma^3 . \tag{3.39} \]

One might also wonder whether there could be a representation involving purely real 𝛾-matrices.
The answer is no. The proof is quite tedious, but it could be a real challenge for a hard-working
student. In any case, an enjoyable exercise would be finding the transformation matrices
implementing the passage from the standard representation to the other two mentioned above.
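As a partial illustration of this exercise, one candidate for the matrix connecting the standard representation (3.37) with the spinor representation (3.38) is a block "Hadamard" matrix; this particular choice of $S$ is our own guess to be tested, not a result quoted from the text, and by the fundamental theorem it is fixed only up to a multiplicative factor. The sketch below checks it numerically and also confirms that the matrices (3.39) are purely imaginary and satisfy (3.10). The transformation matrix for the Majorana case is left open.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
g = np.diag([1.0, -1.0, -1.0, -1.0])

# Standard representation (3.37) and spinor (chiral) representation (3.38)
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + \
        [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma_chiral = [np.block([[Z2, I2], [I2, Z2]])] + \
               [np.block([[Z2, -s], [s, Z2]]) for s in sigma]

# Candidate transformation (an assumption to be tested): S = H (x) 1_2 / sqrt(2)
hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
S = np.kron(hadamard, np.eye(2))
S_inv = np.linalg.inv(S)
chiral_ok = all(np.allclose(S @ gamma[mu] @ S_inv, gamma_chiral[mu])
                for mu in range(4))
S_is_unitary = np.allclose(S.conj().T @ S, np.eye(4))

# Majorana representation (3.39), built from the standard gamma matrices
g0, g1, g2, g3 = gamma
gamma_majorana = [g0 @ g2, -g1 @ g2, -g2, g2 @ g3]
majorana_imaginary = all(np.allclose(G.real, 0) for G in gamma_majorana)
majorana_clifford_ok = all(
    np.allclose(gamma_majorana[m] @ gamma_majorana[n]
                + gamma_majorana[n] @ gamma_majorana[m],
                2 * g[m, n] * np.eye(4))
    for m in range(4) for n in range(4))
print(chiral_ok, S_is_unitary, majorana_imaginary, majorana_clifford_ok)
```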
After all those preparatory steps, we are ready to take up seriously the problem of
relativistic invariance of the Dirac equation. This will be the main theme of the next chapter.

Chapter 4

Relativistic covariance of Dirac equation

The problem indicated in the title of this chapter can be formulated as follows. Let us consider
the Dirac equation
\[ i\gamma^\mu\,\frac{\partial\psi(x)}{\partial x^\mu} - m\,\psi(x) = 0 \tag{4.1} \]
in some coordinate frame, and a Lorentz transformation to another frame,
\[ x'^\mu = \Lambda^\mu{}_\nu\, x^\nu . \tag{4.2} \]

The question is whether there is an appropriate linear transformation of the wave function,
𝜓(𝑥) → 𝜓 ′ (𝑥 ′), such that 𝜓 ′ (𝑥 ′) satisfies the equation
\[ i\gamma^\mu\,\frac{\partial\psi'(x')}{\partial x'^\mu} - m\,\psi'(x') = 0 . \tag{4.3} \]
So, suppose that
𝜓 ′ (𝑥 ′) = 𝑆 𝜓(𝑥) , (4.4)
where 𝑆 is a 4 × 4 constant invertible matrix depending on Λ, i.e. 𝑆 = 𝑆(Λ). Our goal is to find
an appropriate 𝑆 corresponding to a given Λ. To this end, one may use (4.4) to express 𝜓(𝑥) as

𝜓(𝑥) = 𝑆 −1 𝜓 ′ (𝑥 ′) . (4.5)

Inserting now (4.5) into (4.1), one has


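\[ i\gamma^\mu S^{-1}\,\frac{\partial\psi'(x')}{\partial x'^\lambda}\,\frac{\partial x'^\lambda}{\partial x^\mu} - m\,S^{-1}\psi'(x') = 0 . \tag{4.6} \]
However, from (4.2) it is clear that
\[ \frac{\partial x'^\lambda}{\partial x^\mu} = \Lambda^\lambda{}_\mu . \tag{4.7} \]
Thus, using (4.7) and multiplying Eq. (4.6) by 𝑆 from the left, one gets
\[ i\,S\gamma^\mu S^{-1}\,\Lambda^\lambda{}_\mu\,\frac{\partial\psi'(x')}{\partial x'^\lambda} - m\,\psi'(x') = 0 . \tag{4.8} \]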
Obviously, if one wants to arrive at Eq. (4.3), the condition

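\[ \Lambda^\lambda{}_\mu\, S\gamma^\mu S^{-1} = \gamma^\lambda \]

is to be imposed. This can be recast in a more elegant form

\[ \Lambda^\mu{}_\nu\,\gamma^\nu = S^{-1}\gamma^\mu S . \tag{4.9} \]

For a practical evaluation of 𝑆 = 𝑆(Λ), the exponential form of Λ (see the formula (A.17) in
Appendix A)
\[ \Lambda = \exp\left( -\frac{i}{2}\,\omega^{\alpha\beta} I_{\alpha\beta} \right) \tag{4.10} \]
is instrumental. Let us recall that $\omega^{\alpha\beta} = -\omega^{\beta\alpha}$ are the six independent parameters of a general
continuous Lorentz transformation and $I_{\alpha\beta} = -I_{\beta\alpha}$ are the corresponding generators. It is
reasonable to write 𝑆 in analogy with (4.10) as
\[ S = \exp\left( -\frac{i}{4}\,\omega^{\alpha\beta}\sigma_{\alpha\beta} \right) , \tag{4.11} \]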
where 𝜎𝛼𝛽 = −𝜎𝛽𝛼 is a set of unknown would-be generators (the factor 1/4 in the exponent
is introduced for later convenience). In this way, the solution of the problem is reduced to
finding the set of the matrices 𝜎𝛼𝛽 . This means that it is sufficient to consider infinitesimal
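transformations. Let us denote the infinitesimal parameters in (4.10) and (4.11) as $\Delta\omega^{\alpha\beta}$. We
know (see (A.19)) that the form of the generators in (4.10) then leads to
\[ \Lambda^\mu{}_\nu = g^\mu{}_\nu + \Delta\omega^\mu{}_\nu , \tag{4.12} \]
or, equivalently,
\[ \Lambda^{\mu\nu} = g^{\mu\nu} + \Delta\omega^{\mu\nu} . \]
For infinitesimal 𝑆 and $S^{-1}$ one may write
\[ \begin{aligned} S &= 1 - \frac{i}{4}\,\sigma_{\mu\nu}\Delta\omega^{\mu\nu} , \\ S^{-1} &= 1 + \frac{i}{4}\,\sigma_{\mu\nu}\Delta\omega^{\mu\nu} . \end{aligned} \tag{4.13} \]
Using (4.12) and (4.13) in the condition (4.9) one has
\[ \left( 1 + \frac{i}{4}\,\sigma_{\alpha\beta}\Delta\omega^{\alpha\beta} \right)\gamma^\mu\left( 1 - \frac{i}{4}\,\sigma_{\alpha\beta}\Delta\omega^{\alpha\beta} \right) = (g^{\mu\nu} + \Delta\omega^{\mu\nu})\gamma_\nu . \tag{4.14} \]
After some simple manipulations, (4.14) is recast as
\[ -\frac{i}{4}\,\Delta\omega^{\alpha\beta}(\gamma^\mu\sigma_{\alpha\beta} - \sigma_{\alpha\beta}\gamma^\mu) = g^\mu{}_\alpha\,\Delta\omega^{\alpha\beta}\gamma_\beta . \tag{4.15} \]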
4
Utilizing the antisymmetry of parameters Δ𝜔𝛼𝛽 , one gets eventually the condition for the
generators 𝜎𝛼𝛽 :
\[ [\gamma_\mu, \sigma_{\alpha\beta}] = 2i(g_{\mu\alpha}\gamma_\beta - g_{\mu\beta}\gamma_\alpha) . \tag{4.16} \]
For any fixed pair of indices 𝛼, 𝛽 we thus have 64 equations for 16 unknowns (elements of
the 4 × 4 matrix 𝜎𝛼𝛽 ). So, at first sight one could say that 𝜎𝛼𝛽 is overconstrained by the
conditions (4.16). An uninspired way of solving Eq. (4.16) would consist in writing 𝜎𝛼𝛽 as a
linear combination of matrices Γ𝐴 , 𝐴 = 1, . . . , 16, from the preceding chapter, and employing the
commutation and anticommutation relations to fix the values of the relevant coefficients. This
is possible, but rather tedious; one might call it a “poor man’s way”. Instead, let us try to guess
some hint that would help us find a short cut to the desired solution. To this end, we are going
to start e.g. with 𝜎01 . The conditions (4.16) then give, for 𝜇 = 0, 1, 2, 3,
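\[ \begin{aligned} [\gamma_0, \sigma_{01}] &= 2i\gamma_1 , \\ [\gamma_1, \sigma_{01}] &= 2i\gamma_0 , \\ [\gamma_2, \sigma_{01}] &= 0 , \\ [\gamma_3, \sigma_{01}] &= 0 . \end{aligned} \tag{4.17} \]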
Contemplating (4.17) one may guess that 𝜎01 must be proportional to 𝛾0 𝛾1 ; more precisely, a
solution of (4.17) is, obviously,
𝜎01 = 𝑖𝛾0 𝛾1 . (4.18)
This is just the hint we need. One may, tentatively, generalize (4.18) to 𝜎𝛼𝛽 = 𝑖𝛾𝛼 𝛾 𝛽 (of course,
such an Ansatz is meaningful just for 𝛼 ≠ 𝛽; for 𝛼 = 𝛽 the matrix 𝜎𝛼𝛽 is trivial). Since we know
that 𝜎𝛼𝛽 = −𝜎𝛽𝛼 , this is equivalent to

\[ \sigma_{\alpha\beta} = \frac{i}{2}\,[\gamma_\alpha, \gamma_\beta] . \tag{4.19} \]
It is not difficult to verify that the expression (4.19) satisfies indeed Eq. (4.16). For this purpose,
one may employ the elementary algebraic identity

[ 𝐴, 𝐵𝐶] = {𝐴, 𝐵}𝐶 − 𝐵{𝐴, 𝐶} . (4.20)

Then
[𝛾 𝜇 , 𝜎𝛼𝛽 ] = [𝛾 𝜇 , 𝑖𝛾𝛼 𝛾 𝛽 ] = 2𝑖𝑔 𝜇𝛼 𝛾 𝛽 − 2𝑖𝑔 𝜇𝛽 𝛾𝛼 ,
and (4.16) is thereby proved.
So, we have guessed a particular solution of the conditions (4.16), but the question
remains whether it is unique or not. To clarify this point, suppose there is another solution,
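denoted as $\sigma'_{\alpha\beta}$. Using (4.16) for $\sigma_{\alpha\beta}$ and $\sigma'_{\alpha\beta}$, one sees immediately that

\[ [\sigma_{\alpha\beta} - \sigma'_{\alpha\beta}, \gamma_\mu] = 0 \]

for any 𝜇 = 0, 1, 2, 3. Then, according to the lemma L5 from the preceding chapter, it must hold

\[ \sigma_{\alpha\beta} - \sigma'_{\alpha\beta} = a\cdot 1 , \]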

where 𝑎 is an arbitrary coefficient. Thus, (4.19) is in fact the general solution of Eq. (4.16), up
to a multiple of the unit matrix. Needless to say, such a trivial ambiguity has been clear from the
very beginning; the non-trivial point here is that it is the only possible ambiguity. From now on,
we will use (4.19) as the relevant formula for the generators 𝜎𝛼𝛽 .
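The fact that (4.19) solves the conditions (4.16) can also be confirmed numerically for all $4 \times 4 \times 4$ index combinations. The sketch assumes the standard representation (3.37) and lowers indices with $g = \mathrm{diag}(1, -1, -1, -1)$ as in (3.9); the names `gamma_lo`, `sig` and `comm` are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma_up = [np.block([[I2, Z2], [Z2, -I2]])] + \
           [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
g = np.diag([1.0, -1.0, -1.0, -1.0])
# Eq. (3.9): gamma_mu = g_{mu nu} gamma^nu (diagonal metric)
gamma_lo = [g[mu, mu] * gamma_up[mu] for mu in range(4)]

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

# Eq. (4.19): sigma_{alpha beta} = (i/2) [gamma_alpha, gamma_beta]
sig = [[0.5j * comm(gamma_lo[a], gamma_lo[b]) for b in range(4)]
       for a in range(4)]

# Eq. (4.16): [gamma_mu, sigma_{alpha beta}]
#             = 2i (g_{mu alpha} gamma_beta - g_{mu beta} gamma_alpha)
condition_ok = all(
    np.allclose(comm(gamma_lo[mu], sig[a][b]),
                2j * (g[mu, a] * gamma_lo[b] - g[mu, b] * gamma_lo[a]))
    for mu in range(4) for a in range(4) for b in range(4))

# The special case (4.18): sigma_01 = i gamma_0 gamma_1
sigma01_ok = np.allclose(sig[0][1], 1j * gamma_lo[0] @ gamma_lo[1])
print(condition_ok, sigma01_ok)
```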
An important remark is in order here. Our construction of the transformation matrix 𝑆
in (4.9) has been based on generators 𝜎𝛼𝛽 that correspond to the Lorentz generators 𝐼𝛼𝛽 . More
precisely, the correspondence between (4.10) and (4.11) is
\[ \frac{1}{2}\,\sigma_{\alpha\beta} \longleftrightarrow I_{\alpha\beta} . \tag{4.21} \]
As we know, the generators 𝐼𝛼𝛽 satisfy commutation relations characteristic of a Lie algebra. In
particular (see (A.22)),

[𝐼 𝜇𝜈 , 𝐼 𝜌𝜎 ] = 𝑖(𝑔 𝜇𝜎 𝐼𝜈𝜌 + 𝑔𝜈𝜌 𝐼 𝜇𝜎 − 𝑔 𝜇𝜌 𝐼𝜈𝜎 − 𝑔𝜈𝜎 𝐼 𝜇𝜌 ) . (4.22)

One might suspect that the matrices $\tfrac{1}{2}\sigma_{\alpha\beta}$ satisfy the same commutation relation (in mathematical
language, it would mean that the six matrices $\tfrac{1}{2}\sigma_{\alpha\beta}$ constitute a representation of the Lie algebra
of the Lorentz group). It turns out that it is indeed the case. For the evaluation of the relevant
commutators one may use the identity (an extension of (4.20))

[ 𝐴𝐵, 𝐶𝐷] = 𝐴{𝐵, 𝐶}𝐷 − 𝐴𝐶{𝐵, 𝐷} − 𝐶{𝐴, 𝐷}𝐵 + { 𝐴, 𝐶}𝐷𝐵 . (4.23)

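Then, one has
\[ \begin{aligned}
\left[ \tfrac{1}{2}\sigma_{\mu\nu}, \tfrac{1}{2}\sigma_{\rho\sigma} \right]
&= \frac{i}{2}\cdot\frac{i}{2}\,[\gamma_\mu\gamma_\nu, \gamma_\rho\gamma_\sigma] \\
&= -\frac{1}{4}\left( 2g_{\nu\rho}\gamma_\mu\gamma_\sigma - 2g_{\nu\sigma}\gamma_\mu\gamma_\rho - 2g_{\mu\sigma}\gamma_\rho\gamma_\nu + 2g_{\mu\rho}\gamma_\sigma\gamma_\nu \right) \\
&= -\frac{1}{2}\Big[ -g_{\nu\sigma}\left( \tfrac{1}{2}\{\gamma_\mu, \gamma_\rho\} + \tfrac{1}{2}[\gamma_\mu, \gamma_\rho] \right) + g_{\nu\rho}\left( \tfrac{1}{2}\{\gamma_\mu, \gamma_\sigma\} + \tfrac{1}{2}[\gamma_\mu, \gamma_\sigma] \right) \\
&\qquad\quad - g_{\mu\sigma}\left( \tfrac{1}{2}\{\gamma_\rho, \gamma_\nu\} + \tfrac{1}{2}[\gamma_\rho, \gamma_\nu] \right) + g_{\mu\rho}\left( \tfrac{1}{2}\{\gamma_\sigma, \gamma_\nu\} + \tfrac{1}{2}[\gamma_\sigma, \gamma_\nu] \right) \Big] \\
&= -\frac{1}{2}\Big( \cancel{g_{\nu\rho}g_{\mu\sigma}} + g_{\nu\rho}\tfrac{1}{i}\sigma_{\mu\sigma} - \cancel{g_{\nu\sigma}g_{\mu\rho}} - g_{\nu\sigma}\tfrac{1}{i}\sigma_{\mu\rho} \\
&\qquad\quad - \cancel{g_{\mu\sigma}g_{\rho\nu}} - g_{\mu\sigma}\tfrac{1}{i}\sigma_{\rho\nu} + \cancel{g_{\mu\rho}g_{\sigma\nu}} + g_{\mu\rho}\tfrac{1}{i}\sigma_{\sigma\nu} \Big) \\
&= i\left( g_{\mu\sigma}\tfrac{1}{2}\sigma_{\nu\rho} + g_{\nu\rho}\tfrac{1}{2}\sigma_{\mu\sigma} - g_{\mu\rho}\tfrac{1}{2}\sigma_{\nu\sigma} - g_{\nu\sigma}\tfrac{1}{2}\sigma_{\mu\rho} \right) ,
\end{aligned} \tag{4.24} \]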
and this is precisely the anticipated commutation relation for the generators 𝜎𝛼𝛽 that matches
Eq. (4.22).
Thus, in our straightforward way we have discovered a four-dimensional representation
of the Lorentz group (or Lorentz algebra, if you want). It is rightly called the bispinor (or Dirac
spinor) representation, as we will see shortly. An elementary mathematical theory of representations of Lorentz
group is described briefly in Appendix B.
It will be certainly instructive to present some explicit examples of the matrix 𝑆 imple-
menting the transformation (4.4) of Dirac wave function. First, let us consider a spatial rotation
around the third axis of coordinate system, i.e. in the plane (12). So, the relevant transformation
(4.2) is given by
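\[
\begin{pmatrix} x'^{0} \\ x'^{1} \\ x'^{2} \\ x'^{3} \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\varphi & \sin\varphi & 0 \\
0 & -\sin\varphi & \cos\varphi & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x^{0} \\ x^{1} \\ x^{2} \\ x^{3} \end{pmatrix} .
\tag{4.25}
\]

The corresponding infinitesimal form thus amounts to

\[
\begin{aligned}
x'^{1} &= x^{1} + \delta\varphi\, x^{2} , \\
x'^{2} &= -\delta\varphi\, x^{1} + x^{2} .
\end{aligned}
\tag{4.26}
\]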

Using the notation (4.12), this means


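\[ (\Delta\omega)^{1}{}_{2} = \delta\varphi , \qquad (\Delta\omega)^{2}{}_{1} = -\delta\varphi . \tag{4.27} \]
It is easy to see that raising the indices in (4.27) leads to
\[ (\Delta\omega)^{12} = -\delta\varphi , \qquad (\Delta\omega)^{21} = \delta\varphi . \tag{4.28} \]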
Then, according to the formula (4.13), one has
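\[ S = 1 - 2\cdot\tfrac{i}{4}\,(\Delta\omega)^{12}\sigma_{12} = 1 + \tfrac{i}{2}\,\delta\varphi\,\sigma_{12} \tag{4.29} \]
for the infinitesimal transformation, i.e. for the representation of the finite rotation (4.25) one
may write
\[ S(\varphi) = \exp\left(\tfrac{i}{2}\,\varphi\,\sigma_{12}\right) . \tag{4.30} \]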

Now, in the standard representation of 𝛾-matrices one gets

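\[
\sigma_{12} = i\gamma_1\gamma_2
= i \begin{pmatrix} 0 & \sigma_1 \\ -\sigma_1 & 0 \end{pmatrix}
\begin{pmatrix} 0 & \sigma_2 \\ -\sigma_2 & 0 \end{pmatrix}
= \begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix}
= \Sigma_3 .
\]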

Thus, the result (4.30) can be recast as


 
\[ S(\varphi) = \exp\left(\tfrac{i}{2}\,\varphi\,\Sigma_3\right) . \tag{4.31} \]

Expanding the exponential (4.31) in Taylor series, one gets, taking into account that (Σ3 ) 2 = 1,
\[ S(\varphi) = \cos\frac{\varphi}{2}\cdot 1 + i \sin\frac{\varphi}{2}\,\Sigma_3 . \tag{4.32} \]
This means, in particular,
𝑆(2𝜋) = −1 . (4.33)
Thus, the full rotation with 𝜑 = 360◦ changes the sign of a wave function in question. This
is a typical property of spinors (well-known already from the non-relativistic description of a
spin-1/2 particle in terms of a two-component wave function). For this reason, it is natural to
call the four-component Dirac wave function the bispinor (in Appendix B, one may find a more
precise explanation of this concept).
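The sign flip (4.33) can be checked numerically. The following is only a minimal sketch: it assumes the standard-representation matrices 𝛾¹, 𝛾² written out explicitly below, and the helper names (`mul`, `lin`, `close`, `exp_series`) are illustrative, not part of the text. It confirms that 𝜎12 equals Σ3, that the closed form (4.32) agrees with a truncated Taylor series of (4.31), and that a rotation by 2𝜋 gives −1 while 4𝜋 gives +1.

```python
import math

# Standard representation: gamma^k = [[0, sigma_k], [-sigma_k, 0]] (assumed explicit form).
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGMA3 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def mul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(c1, a, c2, b):
    """Linear combination c1*a + c2*b."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

# sigma_12 = i gamma_1 gamma_2; lowering both spatial indices does not change the product.
sigma12 = [[1j * x for x in row] for row in mul(g1, g2)]

def S(phi):
    """Closed form (4.32): cos(phi/2) * 1 + i sin(phi/2) * sigma_12."""
    return lin(math.cos(phi / 2), ONE, 1j * math.sin(phi / 2), sigma12)

def exp_series(phi, terms=40):
    """Truncated Taylor series of exp(i*phi/2 * sigma_12), cf. (4.31)."""
    x = [[0.5j * phi * e for e in row] for row in sigma12]
    total, term = ONE, ONE
    for n in range(1, terms):
        term = [[e / n for e in row] for row in mul(term, x)]
        total = lin(1, total, 1, term)
    return total

sigma12_is_Sigma3 = close(sigma12, SIGMA3)
series_matches = close(S(0.73), exp_series(0.73))
full_turn_flips_sign = close(S(2 * math.pi), lin(-1, ONE, 0, ONE))
double_turn_is_identity = close(S(4 * math.pi), ONE)
print(sigma12_is_Sigma3, series_matches, full_turn_flips_sign, double_turn_is_identity)
```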
Next, let us consider the case of a Lorentz boost; in particular, the example we take
up is the uniform motion along the coordinate axis 1 with a velocity 𝑣. The corresponding
transformation of spacetime coordinates reads

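\[
\begin{pmatrix} x'^{0} \\ x'^{1} \\ x'^{2} \\ x'^{3} \end{pmatrix}
=
\begin{pmatrix}
\operatorname{ch}\varphi & -\operatorname{sh}\varphi & 0 & 0 \\
-\operatorname{sh}\varphi & \operatorname{ch}\varphi & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x^{0} \\ x^{1} \\ x^{2} \\ x^{3} \end{pmatrix} ,
\tag{4.34}
\]
where
\[ \operatorname{ch}\varphi = \frac{1}{\sqrt{1 - v^2}} , \qquad \operatorname{sh}\varphi = \frac{v}{\sqrt{1 - v^2}} . \]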
For the infinitesimal transformation we then have

\[
\begin{aligned}
x'^{0} &= x^{0} - \delta\varphi\, x^{1} , \\
x'^{1} &= -\delta\varphi\, x^{0} + x^{1} .
\end{aligned}
\tag{4.35}
\]

Using the notation (4.12), this means

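\[ (\Delta\omega)^{0}{}_{1} = -\delta\varphi , \qquad (\Delta\omega)^{1}{}_{0} = -\delta\varphi . \]

The corresponding infinitesimal transformation of the Dirac wave function is thus, according to
(4.13),
\[ S = 1 - 2\cdot\tfrac{i}{4}\,(\Delta\omega)^{01}\sigma_{01} = 1 - \tfrac{i}{2}\,\delta\varphi\,\sigma_{01} , \]
so that the relevant finite transformation may be written as
\[ S(\varphi) = \exp\left(-\tfrac{i}{2}\,\varphi\,\sigma_{01}\right) . \tag{4.36} \]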

In the standard representation of 𝛾-matrices one gets

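\[
\sigma_{01} = i\gamma_0\gamma_1
= i \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} 0 & -\sigma_1 \\ \sigma_1 & 0 \end{pmatrix}
= -i \begin{pmatrix} 0 & \sigma_1 \\ \sigma_1 & 0 \end{pmatrix} .
\]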

Obviously, (𝜎01 ) 2 = −1, and the exponential (4.36) can thus be worked out as
\[ S(\varphi) = \operatorname{ch}\frac{\varphi}{2}\cdot 1 - i \operatorname{sh}\frac{\varphi}{2}\,\sigma_{01} . \tag{4.37} \]
In this way, we have established Lorentz covariance of the Dirac equation. Up to now we
have considered continuous Lorentz transformations (spatial rotations and boosts) described, in
general, by six parameters. The case of discrete symmetries will be discussed in the next chapter.
To extend our technical tools for “diracology”, let us now mention a simple but very
useful formula that relates 𝑆 −1 and 𝑆 † , namely

𝑆 −1 = 𝛾0 𝑆 † 𝛾0 . (4.38)

The proof of this identity is quite easy. Let us write, for convenience,

𝑆 = exp(−𝑖Ω) , (4.39)

with
\[ \Omega = \tfrac14\,\omega^{\alpha\beta}\sigma_{\alpha\beta} . \]
Expanding the exponential (4.39) in Taylor series, one has
\[ S = 1 + \frac{1}{1!}(-i\Omega) + \frac{1}{2!}(-i\Omega)^2 + \dots \tag{4.40} \]

Using the relation 𝛾 †𝜇 = 𝛾0 𝛾 𝜇 𝛾0 (see (3.13)), it is easy to arrive at the identity

Ω† = 𝛾0 Ω𝛾0 . (4.41)

This, of course, means that such a rule is valid for any power of Ω, i.e.

(Ω𝑛 ) † = 𝛾0 Ω𝑛 𝛾0 . (4.42)

So, taking into account (4.42), from (4.40) one gets immediately

\[ S^\dagger = \gamma_0 \left[ 1 + \frac{1}{1!}\, i\Omega + \frac{1}{2!}\, (i\Omega)^2 + \dots \right] \gamma_0 . \tag{4.43} \]

The expression in square brackets is just exp(𝑖Ω) = 𝑆 −1 . Thus, we have 𝑆 † = 𝛾0 𝑆 −1 𝛾0 and the
identity (4.38) is thereby proved.
One more remark is in order here. For spatial rotations, Ω is clearly Hermitian, so that the
corresponding 𝑆 in (4.39) is unitary. For boosts, Ω is anti-Hermitian, and 𝑆 is thus Hermitian.
This corresponds precisely to the known properties of the matrices Λ of Lorentz transformations.
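As a numerical illustration of (4.38) and of the remark on boosts, here is a small sketch. It assumes the standard-representation matrices written out below and uses (4.37) in the equivalent form 𝑆 = ch(𝜑/2) − sh(𝜑/2) 𝛾⁰𝛾¹ (which follows from 𝜎01 = −𝑖𝛾⁰𝛾¹); the helper names are illustrative only. It also checks the covariance condition 𝑆⁻¹𝛾⁰𝑆 = ch 𝜑 𝛾⁰ − sh 𝜑 𝛾¹ expected from (4.9) and (4.34).

```python
import math

# Standard representation (assumed explicit form).
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(c1, a, c2, b):
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(4)] for i in range(4)]

def dagger(a):
    """Hermitian conjugate of a 4x4 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

phi = 0.4  # arbitrary rapidity used for the test
ch, sh = math.cosh(phi / 2), math.sinh(phi / 2)
g0g1 = mul(g0, g1)

# Boost along axis 1: S from (4.37), and its inverse (phi -> -phi).
S = lin(ch, ONE, -sh, g0g1)
S_inv = lin(ch, ONE, sh, g0g1)

is_inverse = close(mul(S, S_inv), ONE)
identity_4_38 = close(S_inv, mul(g0, mul(dagger(S), g0)))  # S^{-1} = gamma_0 S^dagger gamma_0
boost_is_hermitian = close(S, dagger(S))

# Covariance check: S^{-1} gamma^0 S = ch(phi) gamma^0 - sh(phi) gamma^1
lhs = mul(S_inv, mul(g0, S))
rhs = lin(math.cosh(phi), g0, -math.sinh(phi), g1)
transforms_as_vector = close(lhs, rhs)
print(is_inverse, identity_4_38, boost_is_hermitian, transforms_as_vector)
```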
With the basic elements of the covariant formalism at hand, let us now return briefly to
the continuity equation (2.4) for the probability density and probability current. Let us see how
it can be recovered from the covariant form of Dirac equation. We have

𝑖𝛾 𝜇 𝜕𝜇 𝜓 = 𝑚𝜓 , (4.44)

and its Hermitian conjugation becomes, upon multiplication by 𝛾0 from the right,

−𝑖𝜕𝜇 𝜓 † 𝛾0 𝛾 𝜇 = 𝑚𝜓 † 𝛾0 . (4.45)

One can see that in (4.45) the expression

\[ \bar\psi = \psi^\dagger \gamma_0 \tag{4.46} \]

emerges naturally. It is called the Dirac conjugation and we will use it frequently from now on.
So, (4.45) is recast as
\[ -i\,\partial_\mu \bar\psi\, \gamma^\mu = m \bar\psi . \tag{4.47} \]
Now, multiplying equations (4.44) and (4.47) by $\bar\psi$ from the left and by $\psi$ from the right,
respectively, the difference of the resulting expressions gives immediately

\[ \partial_\mu \left( \bar\psi \gamma^\mu \psi \right) = 0 . \tag{4.48} \]

Taking into account the relations between matrices $\gamma^\mu$ and $\vec\alpha$, $\beta$, it is obvious that (4.48) coincides
with (2.4). Thus, we have recovered the continuity equation in the covariant form

\[ \partial_\mu j^\mu = 0 \tag{4.49} \]

involving a four-current $j^\mu$,
\[ j^\mu = \bar\psi \gamma^\mu \psi . \tag{4.50} \]
Our notation suggests that $j^\mu$ might be a four-vector under Lorentz transformations. It is indeed
so. Denoting

\[ j'^{\mu}(x') = \bar\psi'(x') \gamma^\mu \psi'(x') , \tag{4.51} \]
and using the transformation law (4.4), as well as the definition (4.46), one has

\[
\begin{aligned}
j'^{\mu}(x') &= \psi'^{\dagger}(x') \gamma_0 \gamma^\mu S \psi(x) \\
&= \psi^\dagger(x) S^\dagger \gamma_0 \gamma^\mu S \psi(x) \\
&= \bar\psi(x) \gamma_0 S^\dagger \gamma_0 \gamma^\mu S \psi(x) .
\end{aligned}
\tag{4.52}
\]

So, utilizing the identity (4.38), the last expression becomes

\[ j'^{\mu}(x') = \bar\psi(x) S^{-1} \gamma^\mu S \psi(x) , \]

and, taking into account (4.9), one gets finally

\[ j'^{\mu}(x') = \Lambda^{\mu}{}_{\nu}\, j^{\nu}(x) , \tag{4.53} \]

which is the anticipated result.


Thus, we have seen that one can construct a four-vector as a bilinear form made of
bispinors (in this sense, a bispinor is a “square root of a four-vector”). We will see later on
that the example described above can be generalized in a systematic way; such constructions are
particularly useful within the framework of field theory.

Chapter 5

𝑪, 𝑷 and 𝑻

In this chapter, we are going to examine discrete symmetries of the Dirac equation. In the title,
they are shown in alphabetical order, but now we will perform a cyclic permutation and start
with 𝑃. So, what is 𝑃? It is spatial inversion, or space reflection, if you want. Usually, it is
also called the parity transformation (hence 𝑃). Such a transformation simply means

\[ (x^0, \vec x) \longrightarrow (x^0, -\vec x) . \tag{5.1} \]

It is certainly a Lorentz transformation, since it preserves the spacetime interval $x^2 = (x^0)^2 - \vec x^{\,2}$.


The corresponding transformation matrix is, obviously,

\[
\Lambda_P =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix} .
\tag{5.2}
\]
Using the standard terminology (see Appendix A), spatial inversion (5.1) belongs to the set of
orthochronous transformations, with det Λ𝑃 = −1.
Now, the question is what could be the corresponding transformation of the Dirac wave
function satisfying Eq. (3.7). The relation (4.9) is quite general; it is not restricted to the
continuous proper Lorentz transformations discussed in detail in the preceding chapter. So,
using (4.9) and taking into account the simple structure of the matrix Λ𝑃 , one gets immediately
the conditions for the relevant matrix 𝑆 𝑃 :
\[
\begin{aligned}
S_P^{-1} \gamma^0 S_P &= \gamma^0 , \\
S_P^{-1} \gamma^k S_P &= -\gamma^k , \qquad k = 1, 2, 3 .
\end{aligned}
\tag{5.3}
\]

Thus, the sought matrix 𝑆 𝑃 should commute with 𝛾 0 and anticommute with any 𝛾 𝑘 , 𝑘 = 1, 2, 3.
A solution is then clear:
𝑆𝑃 = 𝑎 · 𝛾0 , (5.4)
where 𝑎 is an arbitrary constant factor. It remains to be clarified whether (5.4) is the general
solution or not. So, suppose there are two matrices 𝑅 and 𝑆 satisfying (5.3). It then means

𝑆 −1 𝛾 0 𝑆 = 𝑅 −1 𝛾 0 𝑅 ,
(5.5)
𝑆 −1 𝛾 𝑘 𝑆 = 𝑅 −1 𝛾 𝑘 𝑅 , 𝑘 = 1, 2, 3 .

In other words, one has


𝑆 −1 𝛾 𝜇 𝑆 = 𝑅 −1 𝛾 𝜇 𝑅 (5.6)

for any 𝜇 = 0, 1, 2, 3. Eq. (5.6) can be recast as

𝑅𝑆 −1 𝛾 𝜇 (𝑅𝑆 −1 ) −1 = 𝛾 𝜇 ,

and this means that 𝑅𝑆 −1 commutes with any 𝛾 𝜇 , 𝜇 = 0, 1, 2, 3. According to the lemma L5
from Chapter 3 this implies that
𝑅𝑆 −1 = 𝑏 · 1 , (5.7)
where 𝑏 is an arbitrary factor. Thus,
𝑅 = 𝑏·𝑆. (5.8)
In view of the relation (5.8) one may conclude that (5.4) is indeed the general solution of
conditions (5.3). Conventionally, we will use 𝑆 𝑃 = 𝛾0 henceforth.
Thus, we have established the covariance of Dirac equation under the parity transformation:
if $\psi(x)$ is a solution in a given reference frame, then for $x' = (x^0, -\vec x)$ the function

\[ \psi'(x') = \gamma_0 \psi(x) \tag{5.9} \]

is the corresponding solution in the primed system. Note that this is tantamount to the statement
that if $\psi(x)$ is a solution of the Dirac equation, then

\[ \psi_P(x) = \gamma_0 \psi(x^0, -\vec x) \tag{5.10} \]

is its solution as well.


One may notice that we have achieved such a result quite easily and the parity symmetry
seems to be almost automatic in the present case. In fact, as we will see later on, it is not difficult
to find an example of a relativistic equation that does exhibit parity violation.
With the knowledge of the parity transformation at hand, we may now extend our previous
considerations concerning bilinear forms made of Dirac spinors (cf. the discussion following
the formula (4.50)). Although such a technical progress is not of immediate importance for our
study of Dirac equation, it will be useful later on, within the framework of field theory (so, it
will be another “rifle hanging on the wall” à la A. P. Chekhov). In any case, at the moment it
may serve as a refreshing exercise for a loyal reader.
For simplicity, let us start with the expression $\bar\psi\psi$. Using the relation (4.38), it is easy to
see that such a form is a scalar under proper Lorentz transformations. Moreover, from (5.9) it is
obvious that it is invariant under spatial inversion as well. So, in this sense, $\bar\psi\psi$ is a true scalar.
Next, let us consider the combination $\bar\psi\gamma_5\psi$. For $x' = \Lambda x$ one gets, in general,

\[ \bar\psi'(x') \gamma_5 \psi'(x') = \psi^\dagger(x) S^\dagger \gamma_0 \gamma_5 S \psi(x) = \bar\psi(x) \gamma_0 S^\dagger \gamma_0 \gamma_5 S \psi(x) = \bar\psi(x) S^{-1} \gamma_5 S \psi(x) . \tag{5.11} \]

For a proper Lorentz transformation, the generators of 𝑆 are made of products of two 𝛾-matrices,
and therefore they commute with $\gamma_5$; this in turn means that $[S, \gamma_5] = 0$. Thus, we see that $\bar\psi\gamma_5\psi$
is invariant in such a case. On the other hand, for the spatial inversion one has $S = \gamma_0$, so that
$S^{-1}\gamma_5 S = -\gamma_5$ and one thus gets

\[ \bar\psi'(x') \gamma_5 \psi'(x') = -\bar\psi(x) \gamma_5 \psi(x) . \tag{5.12} \]

So, we end up with the conclusion that $\bar\psi\gamma_5\psi$ is a pseudoscalar.


In a similar way, one may compare the behaviour of $\bar\psi\gamma^\mu\psi$ and $\bar\psi\gamma^\mu\gamma_5\psi$. We have already
observed that $j^\mu = \bar\psi\gamma^\mu\psi$ is a four-vector (cf. (4.53)) under proper Lorentz transformations.
Obviously, for spatial inversion the relation (4.53) holds equally well; this means that one has,
schematically,
\[ (j^0, \vec j\,) \xrightarrow{\;P\;} (j^0, -\vec j\,) , \tag{5.13} \]
i.e. $j^\mu$ is a true four-vector. For $j_5^\mu = \bar\psi\gamma^\mu\gamma_5\psi$ one gets, after a simple manipulation,
\[ j_5'^{\mu}(x') = \psi^\dagger(x) S^\dagger \gamma_0 \gamma^\mu \gamma_5 S \psi(x) = \bar\psi(x) S^{-1} \gamma^\mu \gamma_5 S \psi(x) . \tag{5.14} \]

For proper Lorentz transformations one has $[S, \gamma_5] = 0$ and (5.14) then amounts to
\[ j_5'^{\mu}(x') = \Lambda^{\mu}{}_{\nu}\, j_5^{\nu}(x) . \]

For spatial inversion, $\gamma_5 S = -S\gamma_5$ and one thus gets

\[ (j_5^0, \vec j_5) \xrightarrow{\;P\;} (-j_5^0, \vec j_5) . \tag{5.15} \]
This means that $j_5^\mu$ is a pseudovector (axial vector). Thus, in the above examples, the matrix
$\gamma_5$ is responsible for the prefix “pseudo-” in the notation of the quantities in question.
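The parity behaviour of the four bilinear forms can be verified directly on the matrices. The sketch below is illustrative only: it assumes the standard-representation matrices written out explicitly and the usual definition 𝛾5 = 𝑖𝛾⁰𝛾¹𝛾²𝛾³ (Chapter 3 is not reproduced here, so this definition is an assumption), and the helper names are hypothetical. It conjugates each matrix with 𝑆𝑃 = 𝛾⁰ and compares the signs with (5.3), (5.12), (5.13) and (5.15).

```python
# Standard representation (assumed explicit form).
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA = [g0, g1, g2, g3]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def parity_conjugate(m):
    """S_P^{-1} m S_P with S_P = gamma^0, which is its own inverse."""
    return mul(g0, mul(m, g0))

# Assumed definition: gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3.
g5 = scale(1j, mul(g0, mul(g1, mul(g2, g3))))

VECTOR_SIGNS = [+1, -1, -1, -1]  # diagonal of Lambda_P, Eq. (5.2)

vector_ok = all(close(parity_conjugate(GAMMA[mu]), scale(VECTOR_SIGNS[mu], GAMMA[mu]))
                for mu in range(4))
pseudoscalar_ok = close(parity_conjugate(g5), scale(-1, g5))
axial_ok = all(close(parity_conjugate(mul(GAMMA[mu], g5)),
                     scale(-VECTOR_SIGNS[mu], mul(GAMMA[mu], g5)))
               for mu in range(4))
# Products of two different gamma matrices (the generators) commute with gamma_5.
generators_commute = all(close(mul(mul(GAMMA[m], GAMMA[n]), g5),
                               mul(g5, mul(GAMMA[m], GAMMA[n])))
                         for m in range(4) for n in range(4) if m != n)
print(vector_ok, pseudoscalar_ok, axial_ok, generators_commute)
```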
Let us now proceed to the item 𝑇 of our list. It denotes time reversal (or time inversion,
if you want). Before examining a pertinent transformation for Dirac equation, let us return
briefly to the non-relativistic Schrödinger equation mentioned in the first two chapters; it may
provide us with an inspiring hint. From (1.15), (1.16) it is clear that the replacement 𝑡 → −𝑡
changes the sign of the time derivative, and the complex conjugation of the wave function does
the same. Thus, one may observe that the free-particle Schrödinger equation is invariant under
time reversal, in the sense that if $\psi(t, \vec x)$ is a solution, then $\psi^*(-t, \vec x)$ is a solution as well. So,
an important point is that the transformation 𝑡 → −𝑡 is to be accompanied by the complex
conjugation. This has a clear and desirable physical effect. Upon such a transformation, the
energy is not changed, while the momentum changes its sign; to see this explicitly, please recall
the form of a plane wave, involving the familiar factor $\exp[-i(Et - \vec p \cdot \vec x)]$.
So, let us come back to our staple food, the Dirac equation. The considered transformation
of spacetime coordinates is now described by means of the matrix

\[
\Lambda = \Lambda_T =
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{5.16}
\]
Motivated by the preceding considerations, we may try an Ansatz for the corresponding trans-
formation of the Dirac wave function, defined as

𝜓 ′ (𝑥 ′) = 𝐵𝜓 ∗ (𝑥) , (5.17)

where $x' = (-x^0, \vec x)$ and 𝐵 is an invertible constant 4 × 4 matrix. By the way, in contrast to
the preceding case of spatial inversion, (5.17) represents an antilinear transformation, due to
the involvement of the complex conjugation — this is a well-known characteristic of the time
reversal in quantum theory. So, our task is now to find the matrix 𝐵 (if it exists). To this end, we
start with the complex conjugation of the Dirac equation, i.e.

−𝑖𝛾 𝜇 ∗ 𝜕𝜇 𝜓 ∗ − 𝑚𝜓 ∗ = 0 . (5.18)

Using (5.17), we express 𝜓 ∗ as


𝜓 ∗ (𝑥) = 𝐵−1 𝜓 ′ (𝑥 ′) , (5.19)
and plugging this into Eq. (5.18) one gets

\[ -i \gamma^{\mu *} B^{-1}\, \frac{\partial \psi'(x')}{\partial x'^{\lambda}}\, \frac{\partial x'^{\lambda}}{\partial x^{\mu}} - m B^{-1} \psi'(x') = 0 . \tag{5.20} \]

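Now, taking into account that
\[ \frac{\partial x'^{\lambda}}{\partial x^{\mu}} = \Lambda_T{}^{\lambda}{}_{\mu} , \]
and multiplying Eq. (5.20) by 𝐵 from the left, one obtains
\[ -i B \gamma^{\mu *} B^{-1}\, \Lambda_T{}^{\lambda}{}_{\mu}\, \frac{\partial \psi'(x')}{\partial x'^{\lambda}} - m \psi'(x') = 0 . \tag{5.21} \]
The condition of the covariance of Dirac equation under time reversal thus reads
\[ B^{-1} \gamma^{\mu} B = -\Lambda_T{}^{\mu}{}_{\nu}\, \gamma^{\nu *} . \tag{5.22} \]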

To work out the general relation (5.22) explicitly, one should realize that the complex conjugation
of 𝛾-matrices depends on their particular representation. In the standard representation, 𝛾 𝜇 ∗ = 𝛾 𝜇
for 𝜇 = 0, 1, 3 and 𝛾 𝜇 ∗ = −𝛾 𝜇 for 𝜇 = 2. Thus, in this “household representation” the conditions
(5.22) read

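\[
\begin{aligned}
B^{-1} \gamma^0 B &= \gamma^0 , \\
B^{-1} \gamma^1 B &= -\gamma^1 , \\
B^{-1} \gamma^2 B &= \gamma^2 , \\
B^{-1} \gamma^3 B &= -\gamma^3 .
\end{aligned}
\tag{5.23}
\]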

It is easy to guess that a solution of (5.23) is

𝐵 = 𝑎 · 𝛾1𝛾3 , (5.24)

where 𝑎 is an arbitrary constant factor. One can also show that such a solution is unique; to
prove this, one may proceed in the same way as before, in the case of the parity transformation.
If we want to make (5.17) an antiunitary operator, the factor 𝑎 in (5.24) should be chosen so that
|𝑎| = 1. A conventional choice used frequently in the literature is 𝑎 = 𝑖. Sticking to such a
convention, our result can be written as

\[ \psi'(x') = i \gamma^1 \gamma^3 \psi^*(x) , \tag{5.25} \]

with $x' = (-x^0, \vec x)$.
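The conditions (5.22) and (5.23) can be checked numerically for 𝐵 = 𝑖𝛾¹𝛾³. This is a minimal sketch under the assumption that the standard-representation matrices are those written out below; the helper names are illustrative. Since 𝐵² = 1 for this choice of 𝑎, the inverse 𝐵⁻¹ is 𝐵 itself, which the code verifies before using it.

```python
# Standard representation (assumed explicit form).
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA = [g0, g1, g2, g3]
ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
LAMBDA_T = [-1, 1, 1, 1]  # diagonal of Lambda_T, Eq. (5.16)

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def conj(a):
    """Entry-wise complex conjugate."""
    return [[complex(x).conjugate() for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

B = scale(1j, mul(g1, g3))  # Eq. (5.24) with the conventional choice a = i

B_squares_to_one = close(mul(B, B), ONE)       # hence B^{-1} = B
B_is_unitary = close(mul(dagger(B), B), ONE)

def required(mu):
    """Right-hand side of (5.22): -Lambda_T^mu_nu gamma^{nu *} (Lambda_T is diagonal)."""
    return scale(-LAMBDA_T[mu], conj(GAMMA[mu]))

condition_5_22 = all(close(mul(B, mul(GAMMA[mu], B)), required(mu)) for mu in range(4))
print(B_squares_to_one, B_is_unitary, condition_5_22)
```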


Let us remind the reader that the concept of antiunitary vs. unitary operators is mostly
due to E. P. Wigner, who is the author of a fundamental theorem on symmetries in quantum
theory, which bears his name.4
Finally, let us take up the item 𝐶 in our list. It is the so-called charge conjugation,
and in distinction to the preceding two discrete symmetries, 𝐶 is not related to any spacetime
transformation. Rather, it is an internal symmetry and its substantial ingredient is the complex
conjugation. An obvious motivation for investigating such a symmetry is the existence of free-
particle solutions with positive and negative energy; one may thus naturally contemplate the
possibility of a transformation turning one type of a solution into another.
We are going to start with the Ansatz

𝜓 ′ (𝑥) = 𝐴𝜓 ∗ (𝑥) , (5.26)


4 Biographical remark: Eugene Paul Wigner (1902–1995) (originally Wigner Jenő Pál) was an eminent theorist
in the field of quantum theory. Born in Hungary, he emigrated to the U.S. in the 1930s and received the Nobel Prize in
1963 (shared with Maria Goeppert Mayer and J. Hans D. Jensen) for his work on symmetries in nuclear and particle
physics. He was a brother-in-law of Paul Dirac.

where 𝐴 is an invertible constant matrix. Let us stress once again that spacetime coordinates are
left unchanged. Expressing then 𝜓 ∗ (𝑥) as

𝜓 ∗ (𝑥) = 𝐴−1 𝜓 ′ (𝑥) , (5.27)

and inserting (5.27) into the complex conjugate of the Dirac equation (5.18), one gets first

−𝑖𝛾 𝜇 ∗ 𝐴−1 𝜕𝜇 𝜓 ′ (𝑥) − 𝑚 𝐴−1 𝜓 ′ (𝑥) = 0 ,

and, subsequently,
−𝑖 𝐴𝛾 𝜇 ∗ 𝐴−1 𝜕𝜇 𝜓 ′ (𝑥) − 𝑚𝜓 ′ (𝑥) = 0 . (5.28)
The condition of the invariance under the considered transformation thus reads

−𝐴𝛾 𝜇 ∗ 𝐴−1 = 𝛾 𝜇 ,

or, equivalently,
−𝛾 𝜇 ∗ = 𝐴−1 𝛾 𝜇 𝐴 . (5.29)
As we have already noted before, there is no universal pattern of complex conjugation for 𝛾-
matrices, so one has to resort to an explicit representation. Thus, for the good old standard
representation the set of conditions (5.29) is worked out as

𝐴−1 𝛾 0 𝐴 = −𝛾 0 ,
𝐴−1 𝛾 1 𝐴 = −𝛾 1 ,
(5.30)
𝐴−1 𝛾 2 𝐴 = 𝛾 2 ,
𝐴−1 𝛾 3 𝐴 = −𝛾 3 .

A solution of (5.30) is immediately obvious: it is

𝐴 = 𝑎 · 𝛾2 , (5.31)

where 𝑎 is an arbitrary constant. Again, one can prove easily that (5.31) is in fact the general
solution of the conditions (5.30). It is an old hat by now, relying on the lemma L5 from Chapter 3.
Conventionally, one may set 𝑎 = 𝑖; then 𝐴 is unitary matrix (and Hermitian as well). Thus, we
have established a remarkable internal symmetry of Dirac equation, described by the antiunitary
transformation5
𝜓 ′ (𝑥) = 𝑖𝛾 2 𝜓 ∗ (𝑥) . (5.32)
Instead of (5.32), it is more practical to express the charge conjugation transformation in terms
of the Dirac conjugation $\bar\psi$ (which, as we know, appears naturally e.g. in bilinear covariant forms
discussed above). To this end, $\psi^*$ is expressed as

\[ \psi^* = \left(\psi^\dagger\right)^{T} = \left(\bar\psi \gamma^0\right)^{T} . \tag{5.33} \]

Within the standard representation, (5.33) becomes

\[ \psi^* = \gamma^0\, \bar\psi^{T} , \]

5 Let us emphasize that (5.32) holds within the standard representation of 𝛾-matrices. A watchful reader may
have noticed that for the Majorana representation mentioned in Chapter 3 one would get 𝐴 = 𝑎 ·1, i.e. conventionally
𝜓 ′ (𝑥) = 𝜓 ∗ (𝑥).

and (5.32) may be recast as
\[ \psi' = C\, \bar\psi^{T} , \tag{5.34} \]
where
\[ C = i \gamma^2 \gamma^0 . \tag{5.35} \]
Let us also rewrite (5.29) as a corresponding condition for the matrix 𝐶. First, one has

−𝛾 𝜇 ∗ = (𝐶𝛾 0 ) −1 𝛾 𝜇 𝐶𝛾 0 .

Using the simple properties of 𝛾 0 in the standard representation, one gets finally, after some
simple manipulations,
−𝛾 𝜇 T = 𝐶 −1 𝛾 𝜇 𝐶 . (5.36)
A remark is in order here. We have arrived at (5.34) with 𝐶 satisfying (5.36) within the standard
representation of 𝛾-matrices. In fact, these relations are quite general; what is specific for the
standard representation, is the result (5.35) for 𝐶. So, let us show how an alternative derivation
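of (5.36) can proceed. First, an equation for $\bar\psi$ is obtained easily; it reads

\[ -i\, \partial_\mu \bar\psi\, \gamma^\mu - m \bar\psi = 0 . \tag{5.37} \]
Next, using the Ansatz (5.34), $\bar\psi^{T}$ is expressed as
\[ \bar\psi^{T} = C^{-1} \psi' . \tag{5.38} \]

Eq. (5.37) is recast as
\[ -i \gamma^{\mu T} \partial_\mu \bar\psi^{T} - m \bar\psi^{T} = 0 , \]
and inserting there (5.38), one gets readily

\[ -i\, C \gamma^{\mu T} C^{-1}\, \partial_\mu \psi' - m \psi' = 0 . \]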

The requirement of invariance then reads

−𝐶𝛾 𝜇 T 𝐶 −1 = 𝛾 𝜇 ,

i.e.
−𝛾 𝜇 T = 𝐶 −1 𝛾 𝜇 𝐶 ,
so that (5.36) is recovered. Note that it is also quite easy to obtain the formula (5.35) directly
from (5.36). Indeed, using the familiar properties of 𝛾-matrices, one can see that in the standard
representation 𝛾 𝜇 T = 𝛾 𝜇 for 𝜇 = 0, 2 and 𝛾 𝜇 T = −𝛾 𝜇 for 𝜇 = 1, 3. The relation (5.36) thus
means that 𝐶 should commute with 𝛾 1 , 𝛾 3 and anticommute with 𝛾 0 , 𝛾 2 . In this way, one is
led immediately to the solution (5.35) (up to an arbitrary coefficient). Let us also notice that the
form (5.35) reveals the following simple properties of the matrix 𝐶:

𝐶 −1 = 𝐶 † = 𝐶T = −𝐶 . (5.39)

It is interesting that the relation 𝐶T = −𝐶 (i.e. 𝐶 is an antisymmetric matrix) is quite general,


i.e. it holds in any representation of 𝛾-matrices. To see this, let us consider two different
representations, $\gamma^\mu$ and $\tilde\gamma^\mu$. According to the fundamental theorem on the 𝛾-matrices, formulated
in Chapter 3, there is a similarity transformation between the two sets,

\[ \tilde\gamma^\mu = U \gamma^\mu U^{-1} . \tag{5.40} \]

Then

\[
\begin{aligned}
\tilde\gamma^{\mu T} &= (U^{-1})^{T} \gamma^{\mu T} U^{T} = (U^{T})^{-1} \left(-C^{-1} \gamma^\mu C\right) U^{T} \\
&= -(U^{T})^{-1} C^{-1} U^{-1}\, \tilde\gamma^\mu\, U C U^{T} = -\left(U C U^{T}\right)^{-1} \tilde\gamma^\mu\, U C U^{T} .
\end{aligned}
\tag{5.41}
\]

Thus, if one denotes the charge conjugation matrix for the representation $\tilde\gamma^\mu$ as $\tilde C$, from (5.41)
one has
\[ \tilde C = U C U^{T} . \tag{5.42} \]
Now, we already know that in the standard representation $C^{T} = -C$. From the relation (5.42)
one gets
\[ \tilde C^{T} = U C^{T} U^{T} , \]
and thus we see that the antisymmetry of $C$ in the standard representation implies the antisymmetry
of $\tilde C$ in any other representation.
For an illustration, let us show the explicit form of the matrix 𝐶 in the standard represen-
tation:
\[
C = i \gamma^2 \gamma^0 =
\begin{pmatrix}
0 & 0 & 0 & -1 \\
0 & 0 & 1 & 0 \\
0 & -1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix} .
\tag{5.43}
\]
Let us add that another universal property of the charge conjugation is the unitarity of the matrix
𝐶. Indeed, according to (5.39), this holds in the standard representation and the passage to
any other set of 𝛾-matrices is implemented by means of the similarity transformation (5.40),
where one may set 𝑈 −1 = 𝑈 † (assuming that the Dirac matrices in question possess standard
properties under Hermitian conjugation). Employing then the relation (5.42), one can see easily
that 𝐶 −1 = 𝐶 † is valid for any relevant 𝛾-matrix representation.
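The properties of 𝐶 quoted above are easy to confirm on explicit matrices. The following sketch assumes the standard-representation 𝛾-matrices written out below (helper names are illustrative) and checks the explicit form (5.43), the chain of equalities (5.39), and the defining condition (5.36) for all four 𝛾-matrices.

```python
# Standard representation (assumed explicit form).
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA = [g0, g1, g2, g3]
ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

C = scale(1j, mul(g2, g0))  # Eq. (5.35)
C_EXPLICIT = [[0, 0, 0, -1], [0, 0, 1, 0], [0, -1, 0, 0], [1, 0, 0, 0]]  # Eq. (5.43)
minus_C = scale(-1, C)

matches_5_43 = close(C, C_EXPLICIT)
inverse_is_minus_C = close(mul(C, minus_C), ONE)         # C^{-1} = -C
dagger_is_minus_C = close(dagger(C), minus_C)            # C^dagger = -C
transpose_is_minus_C = close(transpose(C), minus_C)      # C^T = -C

# Condition (5.36): C^{-1} gamma^mu C = -gamma^{mu T}, using C^{-1} = -C.
condition_5_36 = all(close(mul(minus_C, mul(g, C)), scale(-1, transpose(g)))
                     for g in GAMMA)
print(matches_5_43, inverse_is_minus_C, dagger_is_minus_C,
      transpose_is_minus_C, condition_5_36)
```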
We may now resume, for a moment, our earlier theme of the “fun with 𝛾-matrices”; we
are going to add a remarkable item to the collection of 𝛾-matrix identities. It turns out that the
trace of a product of 𝛾-matrices is not changed when the order of matrices is reversed, i.e.

Tr(𝛾𝛼 𝛾 𝛽 · · · 𝛾𝜏 𝛾𝜔 ) = Tr(𝛾𝜔 𝛾𝜏 · · · 𝛾 𝛽 𝛾𝛼 ) . (5.44)

The proof of the “palindromic” relation (5.44) is quite easy and relies on the identity (5.36).
First, if the number of 𝛾-matrices under the trace is odd, (5.44) holds trivially. For an even
number 𝑛 of 𝛾-matrices one has

Tr(𝛾𝛼 𝛾 𝛽 · · · 𝛾𝜏 𝛾𝜔 ) = Tr(𝛾𝛼 𝛾 𝛽 · · · 𝛾𝜏 𝛾𝜔 )T = Tr(𝛾T𝜔 𝛾T𝜏 · · · 𝛾T𝛽 𝛾T𝛼 ) ,

and then the relation (5.36) can be employed. Thus one gets

Tr(𝛾𝛼 𝛾 𝛽 · · · 𝛾𝜏 𝛾𝜔 ) = (−1) 𝑛 Tr(𝐶 −1 𝛾𝜔 𝐶𝐶 −1 𝛾𝜏 𝐶 · · · 𝐶 −1 𝛾𝛼 𝐶) ,

and using trace cyclicity (as well as (−1) 𝑛 = 1) one arrives at (5.44).
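The “palindromic” identity (5.44) can also be tested by brute force. The sketch below is an illustration under the assumption of the standard-representation matrices written out explicitly (with upper indices; lowering indices only changes signs that are the same on both sides of the identity). It compares traces of all products of two, three and four 𝛾-matrices with their reversed counterparts, plus a few arbitrarily chosen products of six.

```python
from itertools import product

# Standard representation (assumed explicit form).
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA = [g0, g1, g2, g3]
ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace_of(indices):
    """Trace of the ordered product gamma^{i1} gamma^{i2} ... for a tuple of indices."""
    m = ONE
    for mu in indices:
        m = mul(m, GAMMA[mu])
    return sum(m[i][i] for i in range(4))

# All index combinations for products of 2, 3 and 4 matrices.
reversal_ok = all(abs(trace_of(idx) - trace_of(idx[::-1])) < 1e-12
                  for n in (2, 3, 4) for idx in product(range(4), repeat=n))

# Traces of an odd number of gamma matrices vanish, so (5.44) is trivial there.
odd_traces_vanish = all(abs(trace_of(idx)) < 1e-12
                        for idx in product(range(4), repeat=3))

# A few arbitrarily chosen products of six matrices.
SAMPLES = [(0, 1, 2, 3, 1, 0), (0, 1, 0, 2, 1, 2), (3, 2, 1, 0, 2, 1), (1, 3, 3, 2, 1, 2)]
six_ok = all(abs(trace_of(idx) - trace_of(idx[::-1])) < 1e-12 for idx in SAMPLES)
print(reversal_ok, odd_traces_vanish, six_ok)
```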
After this relatively long exposition concerning 𝐶 one might naturally ask why it is called
“charge conjugation”, when there was no charge at play. Well, one natural answer comes from
the form of Dirac equation for the charged particle in an external electromagnetic field. Denoting
the corresponding four-potential as 𝐴 𝜇 , the “minimal” interaction is incorporated in the equation

\[ \left[\, i \gamma^\mu \left(\partial_\mu - i e A_\mu\right) - m \,\right] \psi = 0 . \tag{5.45} \]

Since the transformation
\[ \psi_C = C\, \bar\psi^{T} \]
involves the complex conjugation, it is easy to see that 𝜓𝐶 satisfies Eq. (5.45) with 𝑒 replaced
by −𝑒. Moreover, we have motivated our discussion of the charge conjugation by the desire
to find a symmetry transformation connecting the solutions with positive and negative energy.
We will see later (within the framework of field theory) that the negative energy solutions are
closely related to antiparticles (carrying the charge opposite to particles). So, this is another,
quite standard, justification of the label “charge conjugation”.
We have devoted this chapter to the discrete symmetries 𝐶, 𝑃, 𝑇 and now one might
wonder whether this is all, i.e. whether one would not be able to find another independent
symmetry of such a type. For instance, one might try full spacetime inversion $x^0 \to -x^0$,
$\vec x \to -\vec x$ without complex conjugation of the wave function. In such a case, one has to solve the
conditions $\Lambda^{\mu}{}_{\nu} \gamma^{\nu} = S^{-1} \gamma^{\mu} S$ with $\Lambda = -1$. It means that the matrix 𝑆 should anticommute with
any $\gamma^\mu$, 𝜇 = 0, 1, 2, 3; of course, the answer is then obvious, $S = \gamma_5$. It is easy to realize that
such a transformation is in fact equivalent to the product 𝐶𝑃𝑇. If one tries other possibilities,
seemingly not covered by 𝐶, 𝑃 and 𝑇, one always arrives at a combination of some of them. In
this sense, our set of discrete symmetries is complete.

Chapter 6

Plane-wave solutions
of Dirac equation: u and v

The subject indicated in the title was already touched upon in Chapter 2, but here we are
going to discuss it in detail; the solutions we have in mind are plane waves, describing particle
states with a definite energy and momentum. One might wonder why we should pay so much
attention to plane waves, which may be viewed as a description of an almost trivial physical
situation. Well, a knowledgeable reader may guess that the plane-wave solutions will come in
handy later on, when considering problems of particle scattering within perturbation theory: in
this context one should recall the famous Born approximation for the potential scattering. Of
course, the plane waves for a Dirac particle are more complicated than what one may remember
from the introductory quantum mechanics course, because of spin degrees of freedom; such
technical aspects may add more flavour to the subject that otherwise might seem rather boring
at first sight. So much for the motivation and apologies, and now let us proceed to calculations.
For convenience, let us reiterate here the familiar equation of our interest:
\[ i \gamma^\mu \frac{\partial \psi}{\partial x^\mu} - m \psi = 0 . \tag{6.1} \]
Following our earlier treatment of the Klein–Gordon equation (cf. (1.12)), it is straightforward
to figure out an appropriate form of the Dirac plane waves in question. One may write

\[
\begin{aligned}
\psi^{(+)}(x) &= u(p)\, e^{-ipx} , \\
\psi^{(-)}(x) &= v(p)\, e^{ipx} ,
\end{aligned}
\tag{6.2}
\]

where, according to the preliminary discussion in Chapter 2 (cf. (2.23)), we envisage the
existence of solutions with positive and negative energy. The coefficients u ( 𝑝) and v ( 𝑝) are
four-component bispinor amplitudes of the considered plane waves and their properties will be
the main subject of our subsequent study. Substituting the expressions (6.2) into Eq. (6.1), one
gets readily

\[
\begin{aligned}
(p_\mu \gamma^\mu - m)\, u(p) &= 0 , \\
(p_\mu \gamma^\mu + m)\, v(p) &= 0 .
\end{aligned}
\tag{6.3}
\]

The linear combination of 𝛾-matrices appearing in (6.3) is denoted as

𝑝 𝜇 𝛾 𝜇 = 𝑝/ . (6.4)

The symbol $\slashed{p}$ is to be pronounced as “𝑝 slash” (this somewhat crazy notation is due to R. Feynman).
Thus, linear algebraic equations (6.3) are recast as

\[
\begin{aligned}
(\slashed{p} - m)\, u(p) &= 0 , \\
(\slashed{p} + m)\, v(p) &= 0 .
\end{aligned}
\tag{6.5}
\]

Multiplying the first equation (6.5) by the matrix $\slashed{p} + m$, or the second equation by $\slashed{p} - m$, one
gets
\[ (\slashed{p}\slashed{p} - m^2)\, u(p) = 0 , \tag{6.6} \]
and the same for $v(p)$. However, it is easy to see that $\slashed{p}\slashed{p} = p^2 = p_0^2 - \vec p^{\,2}$. Indeed, one has

\[ \slashed{p}\slashed{p} = p_\mu \gamma^\mu p_\nu \gamma^\nu = \tfrac12\, p_\mu p_\nu \{\gamma^\mu, \gamma^\nu\} = \tfrac12\, p_\mu p_\nu \cdot 2 g^{\mu\nu} = p_\mu p^\mu = p^2 . \tag{6.7} \]
Thus, we get the condition $p^2 = m^2$, i.e. $p_0^2 = \vec p^{\,2} + m^2$, which is certainly good (though expected)
news. Conventionally, one may choose $p_0 > 0$ (without loss of generality); then one may find
readily the values of the energy for $\psi^{(+)}$ and $\psi^{(-)}$ in (6.2). One has

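\[
\begin{aligned}
i \frac{\partial}{\partial x^0} \psi^{(+)}(x) &= p_0\, \psi^{(+)}(x) , \\
i \frac{\partial}{\partial x^0} \psi^{(-)}(x) &= -p_0\, \psi^{(-)}(x) ,
\end{aligned}
\tag{6.8}
\]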
so that one may conclude that 𝜓 (+) and 𝜓 (−) correspond to a positive and negative energy,
respectively.
Our main goal is to find an explicit form of the amplitudes $u(p)$ and $v(p)$. For this
purpose, it is useful to realize that $\psi^{(-)}$ can be related to $\psi^{(+)}$ via charge conjugation; it means
that one may define $v(p)$ as
\[ v(p) = u_C(p) = C\, \bar u^{T}(p) , \tag{6.9} \]
where 𝐶 is determined by (5.36). So, we will concentrate now on $u(p)$. Let us start with $\vec p = \vec 0$,
i.e. the particle at rest. One then has $p_0 = m$; denoting the corresponding $u(p)$ as $u(m, \vec 0)$,
Eq. (6.5) is reduced to
\[ (\gamma_0 - 1)\, u(m, \vec 0) = 0 \tag{6.10} \]
(similarly, for $v(m, \vec 0)$ one gets (6.10) with 1 replaced by −1). From now on we will employ
the standard representation of 𝛾-matrices. From (6.10) one then gets two linearly independent
solutions, e.g.
\[
u^{(1)}(m, \vec 0) = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \qquad
u^{(2)}(m, \vec 0) = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}
\tag{6.11}
\]
(a watchful reader has certainly observed that we have just reproduced our earlier result (2.23)).
The form of $u(p)$ for $\vec p \neq \vec 0$ could be obtained from $u(m, \vec 0)$ by means of an appropriate Lorentz
boost, but there is an elegant short cut to the desired solution, utilizing a simple trick. Clearly, if
one acts with $\slashed{p} + m$ on $u(m, \vec 0)$, the result certainly satisfies the equation $(\slashed{p} - m)u = 0$, because
of $(\slashed{p} - m)(\slashed{p} + m) = 0$ for $p^2 = m^2$, and for $p = (m, \vec 0)$ it is proportional to $u(m, \vec 0)$. So, one
may employ an Ansatz

\[ u^{(r)}(p) = N (\slashed{p} + m)\, u^{(r)}(m, \vec 0) , \qquad r = 1, 2 , \tag{6.12} \]
where 𝑁 is an appropriate normalization factor. For the standard representation of 𝛾-matrices
one has
\[
\slashed{p} + m = \begin{pmatrix} p_0 + m & -\vec\sigma \cdot \vec p \\ \vec\sigma \cdot \vec p & -p_0 + m \end{pmatrix} ,
\tag{6.13}
\]
and $u^{(r)}(m, \vec 0)$ can be written as
\[
u^{(r)}(m, \vec 0) = \begin{pmatrix} \varphi^{(r)} \\ 0 \end{pmatrix} ,
\tag{6.14}
\]
with
\[ \varphi^{(1)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \varphi^{(2)} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} . \]
Using (6.12), (6.13) and (6.14), one thus gets

\[
u^{(r)}(p) = N (E + m) \begin{pmatrix} \varphi^{(r)} \\[4pt] \dfrac{\vec\sigma \cdot \vec p}{E + m}\, \varphi^{(r)} \end{pmatrix} ,
\tag{6.15}
\]
where we have denoted $E = p_0$. Notice that the formula (6.15) demonstrates clearly the
suppression of the lower components for $|\vec p\,| \ll m$, i.e. in the non-relativistic approximation (as
we have stressed before, it is a characteristic feature of the standard representation of 𝛾-matrices).
Let us now fix the value of the factor 𝑁. It is reasonable to formulate the normalization
condition in terms of the quantity $\bar u u$, since it is a scalar. The convention we are going to use is

\[ \bar u(p)\, u(p) = 2m . \tag{6.16} \]

Note that it is not universally used in the literature; some authors prefer the normalization defined
by $\bar u u = 1$. We will see later why the choice (6.16) is convenient. So, using (6.12) and denoting,
for brevity, $u(m, \vec 0)$ as $u(0)$, one gets first

\[
\begin{aligned}
\bar u(p)\, u(p) &= u^\dagger(p) \gamma_0 u(p) = N^2 u^\dagger(0) (\slashed{p} + m)^\dagger \gamma_0 (\slashed{p} + m)\, u(0) \\
&= N^2 u^\dagger(0) \gamma_0 (\slashed{p} + m)^2 u(0) = N^2 u^\dagger(0) \gamma_0 (p^2 + 2m\slashed{p} + m^2)\, u(0) \\
&= 2m N^2 u^\dagger(0) \gamma_0 (\slashed{p} + m)\, u(0)
\end{aligned}
\]

(for simplicity, we take 𝑁 to be real). Then, employing (6.13), the last expression is recast as
\[
\begin{aligned}
\bar u(p)\, u(p) &= 2m N^2\, (\varphi^\dagger, 0)
\begin{pmatrix} p_0 + m & -\vec\sigma \cdot \vec p \\ -\vec\sigma \cdot \vec p & p_0 - m \end{pmatrix}
\begin{pmatrix} \varphi \\ 0 \end{pmatrix} \\
&= 2m N^2 (p_0 + m)\, \varphi^\dagger \varphi = 2m N^2 (E + m) .
\end{aligned}
\tag{6.17}
\]

The condition (6.16) thus gives

\[ N = (E + m)^{-1/2} , \tag{6.18} \]
and the final form of (6.15) becomes

\[
u^{(r)}(p) = \sqrt{E + m} \begin{pmatrix} \varphi^{(r)} \\[4pt] \dfrac{\vec\sigma \cdot \vec p}{E + m}\, \varphi^{(r)} \end{pmatrix} .
\tag{6.19}
\]
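The construction (6.12) with the normalization (6.18) can be tried out numerically. The sketch below is only an illustration: the mass and momentum values are arbitrary, the standard-representation matrices are assumed to be those written out explicitly, and the helper names (`act`, `shifted`, `bar_product`) are hypothetical. It checks the mass-shell relation (6.7), the equation (𝑝/ − 𝑚)u = 0, the normalization (6.16), and the explicit form (6.19) for 𝑟 = 1.

```python
import math

# Standard representation (assumed explicit form).
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

m = 1.0               # arbitrary test mass
p = (0.3, -0.4, 1.2)  # arbitrary test momentum
E = math.sqrt(m * m + sum(c * c for c in p))

# p-slash = E gamma^0 - p . gamma
pslash = [[E * g0[i][j] - p[0] * g1[i][j] - p[1] * g2[i][j] - p[2] * g3[i][j]
           for j in range(4)] for i in range(4)]

def shifted(sign):
    """Matrix p-slash + sign * m."""
    return [[pslash[i][j] + sign * m * ONE[i][j] for j in range(4)] for i in range(4)]

def act(a, v):
    """Matrix a acting on a column vector v."""
    return [sum(a[i][k] * v[k] for k in range(4)) for i in range(4)]

def bar_product(a, b):
    """Scalar a-bar b = a^dagger gamma^0 b."""
    gb = act(g0, b)
    return sum(complex(a[i]).conjugate() * gb[i] for i in range(4))

N = 1 / math.sqrt(E + m)                         # Eq. (6.18)
REST = [[1, 0, 0, 0], [0, 1, 0, 0]]              # Eq. (6.11)
U = [[N * x for x in act(shifted(+1), rest)] for rest in REST]  # Eq. (6.12)

sq = [[sum(pslash[i][k] * pslash[k][j] for k in range(4)) for j in range(4)]
      for i in range(4)]
mass_shell = all(abs(sq[i][j] - m * m * ONE[i][j]) < 1e-12
                 for i in range(4) for j in range(4))
dirac_equation = all(abs(x) < 1e-12 for u in U for x in act(shifted(-1), u))
normalized = all(abs(bar_product(U[r], U[s]) - (2 * m if r == s else 0)) < 1e-12
                 for r in range(2) for s in range(2))

# Explicit form (6.19) for r = 1: sigma.p acting on (1, 0) gives (p3, p1 + i p2).
explicit = [math.sqrt(E + m) * c
            for c in (1, 0, p[2] / (E + m), (p[0] + 1j * p[1]) / (E + m))]
matches_6_19 = all(abs(U[0][i] - explicit[i]) < 1e-12 for i in range(4))
print(mass_shell, dirac_equation, normalized, matches_6_19)
```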
Next, how about $v(p)$ defined by (6.9)? For brevity, we will write simply $v$, $u$ instead of $v(p)$,
$u(p)$ in what follows. Using (6.9), one has
\[ \bar v v = \bar u_C u_C = u_C^\dagger \gamma_0 u_C = \left(\bar u^{T}\right)^\dagger C^\dagger \gamma_0 C\, \bar u^{T} . \tag{6.20} \]
Now it is useful to take into account the trivial fact that the expression (6.20) is a number, i.e. the
matrix 1 × 1. This, of course, means that it is equal to its transpose. Thus, we may write
\[ \bar v v = \left[ \left(\bar u^{T}\right)^\dagger C^\dagger \gamma_0 C\, \bar u^{T} \right]^{T} = \bar u\, C^{T} \gamma_0^{T} \left(C^{-1}\right)^{T} \bar u^{\dagger} , \tag{6.21} \]

where we have employed the relation $C^\dagger = C^{-1}$. Using the known properties of $C$ and $\gamma_0$ in the
standard representation, from (6.21) one thus gets finally

\[ \bar v v = \bar u\, C \gamma_0 C^{-1} \gamma_0\, u = -\bar u u . \tag{6.22} \]

The last result is quite remarkable: it means that the normalization condition for $v(p)$,

\[ \bar v(p)\, v(p) = -2m , \tag{6.23} \]

is a necessary consequence of our choice (6.16) for $u(p)$.


As we have already noted at the beginning of this chapter, some knowledge of the prop-
erties of the plane-wave solutions of Dirac equation will be important later on, in computation
of scattering amplitudes and cross sections of various physical processes. Well, the formula
(6.19) perhaps does not look quite encouraging from the point of view of the envisaged algebraic
manipulations, but there is good news for a skeptical reader. In fact, we will never need an
explicit form of u ( 𝑝) or v ( 𝑝) like (6.19) or (6.9); instead, when working out the physically rel-
evant quantities, one usually needs just some specific bilinear combinations of these functions,
which are well suited for creating efficient computational algorithms (by the way, the knowledge
of traces of 𝛾-matrix products then becomes essential). An example of the above-mentioned
bilinear combination is the sum over two spin states
\[ \sum_{r=1}^{2} u^{(r)}(p)\, \bar u^{(r)}(p) = \slashed{p} + m \tag{6.24} \]

(such an identity is usually called the “completeness relation”). As a warm-up exercise, let us prove Eq. (6.24) by means of a straightforward calculation. So, using (6.12) one has
\[ \begin{aligned} u^{(r)}(p)\,\bar{u}^{(r)}(p) &= N^{2}(\slashed{p}+m)\,u^{(r)}(0)\,u^{(r)\dagger}(0)\,(\slashed{p}+m)^{\dagger}\gamma_0 \\ &= N^{2}(\slashed{p}+m)\,u^{(r)}(0)\,\bar{u}^{(r)}(0)\,(\slashed{p}+m), \end{aligned} \tag{6.25} \]
where we have utilized the familiar identity $\slashed{p}^{\dagger} = \gamma_0\slashed{p}\gamma_0$. From (6.11) it is easy to see that
\[ \sum_{r=1}^{2} u^{(r)}(0)\,\bar{u}^{(r)}(0) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \frac{1}{2}(1+\gamma_0). \tag{6.26} \]
Then one has
\[ \sum_{r=1}^{2} u^{(r)}(p)\,\bar{u}^{(r)}(p) = N^{2}(\slashed{p}+m)\,\frac{1+\gamma_0}{2}\,(\slashed{p}+m). \tag{6.27} \]
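Working out the product of $\gamma$-matrices in (6.27) is straightforward. One gets, after some obvious manipulations,
\[ \begin{aligned} X &\equiv (\slashed{p}+m)(1+\gamma_0)(\slashed{p}+m) \\ &= 2m^{2} + 2m\slashed{p} + \slashed{p}\gamma_0\slashed{p} + m\slashed{p}\gamma_0 + m\gamma_0\slashed{p} + m^{2}\gamma_0. \end{aligned} \tag{6.28} \]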

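Employing the basic anticommutation relation for $\gamma$-matrices, one has $\slashed{p}\gamma_0 + \gamma_0\slashed{p} = 2p_0$, so that (6.28) can be further simplified as
\[ \begin{aligned} X &= 2m^{2} + 2m\slashed{p} + \slashed{p}\,(2p_0 - \slashed{p}\gamma_0) + 2mp_0 + m^{2}\gamma_0 \\ &= 2m^{2} + 2m\slashed{p} + 2p_0\slashed{p} + 2mp_0 = 2(p_0+m)(\slashed{p}+m). \end{aligned} \tag{6.29} \]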

Thus, substituting (6.29) into (6.27) and using the value (6.18) for $N$, one indeed recovers the formula (6.24). The corresponding result for the bispinors $v^{(r)}(p)$ reads
\[ \sum_{r=1}^{2} v^{(r)}(p)\,\bar{v}^{(r)}(p) = \slashed{p} - m, \tag{6.30} \]

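and it can be proved by using (6.24) and the charge conjugation transformation. Indeed, from (6.9) one has
\[ \bar{v} = v^{\dagger}\gamma_0 = \bar{u}^{T\dagger} C^{\dagger}\gamma_0, \]
so that
\[ \begin{aligned} v^{(r)}\bar{v}^{(r)} &= C\,\bar{u}^{(r)T}\left(\bar{u}^{(r)T}\right)^{\dagger} C^{\dagger}\gamma_0 = C\,\bar{u}^{(r)T}\left[\left(u^{(r)\dagger}\gamma_0\right)^{\dagger}\right]^{T} C^{\dagger}\gamma_0 \\ &= C\,\bar{u}^{(r)T}\left(\gamma_0 u^{(r)}\right)^{T} C^{\dagger}\gamma_0 = C\,\bar{u}^{(r)T} u^{(r)T}\gamma_0 C^{-1}\gamma_0 \\ &= C\left(u^{(r)}\bar{u}^{(r)}\right)^{T}\gamma_0 C^{-1}\gamma_0. \end{aligned} \tag{6.31} \]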

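Thus, using (6.24), the sum in question becomes
\[ \begin{aligned} \sum_{r=1}^{2} v^{(r)}(p)\,\bar{v}^{(r)}(p) &= C(\slashed{p}+m)^{T}\gamma_0 C^{-1}\gamma_0 = C\left(-C^{-1}\slashed{p}\,C + m\right)\gamma_0 C^{-1}\gamma_0 \\ &= (-\slashed{p}\,C + mC)\,\gamma_0 C^{-1}\gamma_0 = -\slashed{p}\,C\gamma_0 C^{-1}\gamma_0 + m\,C\gamma_0 C^{-1}\gamma_0 \\ &= \slashed{p} - m, \end{aligned} \tag{6.32} \]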

and (6.30) is thereby proved.
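The two completeness relations can also be checked numerically. The following is only a sketch under stated assumptions: it uses the standard representation of the $\gamma$-matrices, takes the rest-frame spinors $u^{(r)}(0)$ to be the unit vectors suggested by (6.26), and assumes the charge conjugation matrix $C = i\gamma^2\gamma^0$ (the text only refers to (6.9) for its definition). The helper names (`mul`, `lin`, `slash`, `bar`) and the test momentum are ours.

```python
import math

def mul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(a, A, b, B):
    """Linear combination a*A + b*B of equal-shape matrices."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(A, B)]

def close(A, B, tol=1e-9):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for r, s in zip(A, B) for x, y in zip(r, s))

def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [a + b for a, b in zip(A, B)] + [c + d for c, d in zip(C, D)]

def neg(A):
    return [[-x for x in r] for r in A]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE = block(I2, Z2, Z2, I2)
G0 = block(I2, Z2, Z2, neg(I2))                    # gamma^0, standard representation
GAM = [block(Z2, s, neg(s), Z2) for s in PAULI]    # gamma^1, gamma^2, gamma^3

def slash(a):
    """a-slash = a^0 gamma^0 - a_vec . gamma_vec for a = (a0, a1, a2, a3)."""
    M = lin(a[0], G0, 0, G0)
    for j in range(3):
        M = lin(1, M, -a[j + 1], GAM[j])
    return M

def bar(w):
    """Dirac conjugate (row vector) w^dagger gamma^0 of a 4x1 column w."""
    return mul([[x[0].conjugate() for x in w]], G0)

m, pv = 0.5, (0.3, -0.4, 1.2)                      # arbitrary test mass and momentum
E = math.sqrt(m * m + sum(x * x for x in pv))
p = (E,) + pv
N = (E + m) ** -0.5                                # Eq. (6.18)
C = mul([[1j * x for x in r] for r in GAM[1]], G0) # assumed C = i gamma^2 gamma^0

zero = lin(0, ONE, 0, ONE)
sum_uu, sum_vv, norms = zero, zero, []
for r in range(2):
    u0 = [[1 if i == r else 0] for i in range(4)]  # assumed rest-frame spinor u^(r)(0)
    u = [[N * x[0]] for x in mul(lin(1, slash(p), m, ONE), u0)]  # u = N (pslash + m) u(0)
    v = mul(C, [[x] for x in bar(u)[0]])           # v = C ubar^T
    sum_uu = lin(1, sum_uu, 1, mul(u, bar(u)))
    sum_vv = lin(1, sum_vv, 1, mul(v, bar(v)))
    norms.append((mul(bar(u), u)[0][0], mul(bar(v), v)[0][0]))

print(close(sum_uu, lin(1, slash(p), m, ONE)))     # completeness relation (6.24)
print(close(sum_vv, lin(1, slash(p), -m, ONE)))    # completeness relation (6.30)
print(norms)                                       # to be compared with 2m and -2m
```

Pure nested lists are used instead of a numerical library only to keep the sketch free of dependencies; any matrix package would serve equally well.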


At this point, one can also see the advantage of using the normalizations (6.16) and (6.23). For the convention $\bar{u}u = 1$ and $\bar{v}v = -1$ one would get in (6.24) the result $(\slashed{p}+m)/2m$, and (6.30) would become $(\slashed{p}-m)/2m$. In some situations, the limit $m \to 0$ may be of interest; within our normalization convention this can be implemented directly at the level of the formulae (6.23), (6.24), while in the alternative scheme more care is needed.

Chapter 7

Description of spin states of Dirac particle

The solutions of the Dirac equation described in the preceding chapter (see (6.11), (6.12)) have a straightforward physical interpretation. They correspond to a particle that has (apart from definite values of the energy and momentum) a definite projection of the spin onto the third coordinate axis in its rest system. In particular, for $r = 1$ such a projection (corresponding to $\frac{1}{2}\Sigma_3$) is $+1/2$ and for $r = 2$ its value is $-1/2$. Of course, one would like to have a more direct explicit characterization of the spin states in the reference frame where the particle is moving. However, it turns out that one cannot simply rely on the eigenvalues of the matrix $\vec{s}\cdot\vec{\Sigma}$, with $\vec{s}$ being the unit vector for a fixed space direction. The point is that the plane-wave amplitude $u(p)$ is an eigenvector of the matrix $\slashed{p}$ (with the eigenvalue $m$, see (6.5)), but $\slashed{p}$ does not, in general, commute with $\vec{s}\cdot\vec{\Sigma}$. We will return to this issue in more technical terms later on.
So, how should one proceed in order to achieve the desired goal? A sensible strategy is to start in the rest system, perform an appropriate Lorentz boost and find out how the condition for the spin projection gets transformed. To this end, let us consider two solutions of the equation $(\slashed{p} - m)u = 0$ for $p = p^{(0)} = (m, \vec{0})$ that are eigenvectors of the matrix $\vec{s}\cdot\vec{\Sigma}$ with $|\vec{s}| = 1$. We have in mind the standard representation of the $\gamma$-matrices, so that $\vec{\Sigma}$ has the usual form
\[ \vec{\Sigma} = \begin{pmatrix} \vec{\sigma} & 0 \\ 0 & \vec{\sigma} \end{pmatrix}. \tag{7.1} \]
Using the familiar properties of the Pauli matrices, it is clear that $(\vec{s}\cdot\vec{\Sigma})^{2} = |\vec{s}|^{2} = 1$; thus, the eigenvalues of $\vec{s}\cdot\vec{\Sigma}$ are $\pm 1$. Note that the spin projection matrix is in fact $\vec{s}\cdot\vec{S}$, with $\vec{S} = \frac{1}{2}\vec{\Sigma}$, but for convenience we are going to work with $\vec{s}\cdot\vec{\Sigma}$ in the sequel. The two linearly independent eigenvectors of $\vec{s}\cdot\vec{\Sigma}$ in the particle rest system can be written as
\[ \vec{s}\cdot\vec{\Sigma}\;u(0,\vec{s}) = u(0,\vec{s}), \tag{7.2a} \]
\[ -\vec{s}\cdot\vec{\Sigma}\;u(0,-\vec{s}) = u(0,-\vec{s}), \tag{7.2b} \]

which is a form most appropriate for our purpose. Let us now start processing Eq. (7.2a) along the lines indicated above. A key ingredient of the calculational procedure is the remarkable identity
\[ \vec{\Sigma} = \gamma_5\,\vec{\alpha}. \tag{7.3} \]
It can be verified easily within the standard representation, since here we have
\[ \gamma_5 = i\gamma^{0}\gamma^{1}\gamma^{2}\gamma^{3} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{7.4} \]
(the reader is urged to verify this result independently) and the matrices $\vec{\alpha}$ are given by (1.32) (in fact, the formula (7.3) is valid in any representation of $\gamma$-matrices; for technical details see Appendix C). So, using (7.3) and the relation $\vec{\alpha} = \gamma_0\vec{\gamma}$, Eq. (7.2a) is recast as
\[ \gamma_5\gamma_0\,\vec{s}\cdot\vec{\gamma}\;u(0,\vec{s}) = u(0,\vec{s}). \tag{7.5} \]
Next, as we know, in the rest system one has
\[ \gamma_0\,u(0,\vec{s}) = u(0,\vec{s}) \]
(cf. (6.10)). Thus, (7.5) becomes
\[ -\gamma_5\,\vec{s}\cdot\vec{\gamma}\;u(0,\vec{s}) = u(0,\vec{s}). \tag{7.6} \]
The matrix product on the left-hand side of Eq. (7.6) can be formally rewritten as $\gamma_5\slashed{s}^{(0)}$, if one adopts a natural notation
\[ s^{(0)} = (0, \vec{s}). \tag{7.7} \]
So, an intermediate result of the first step of our calculation reads
\[ \gamma_5\,\slashed{s}^{(0)}\,u(0,\vec{s}) = u(0,\vec{s}). \tag{7.8} \]

Next, let us consider the Lorentz boost. The coordinate system in which the particle has a momentum $\vec{p}$ (velocity $\vec{v}$) is moving with velocity $-\vec{v}$ relative to the particle rest frame. Thus, the reference system of our interest is connected with the rest frame by means of the Lorentz transformation described by a matrix denoted briefly as $\Lambda(-\vec{v})$. Dirac spinors are then transformed by means of the corresponding matrix $S(-\vec{v})$ (cf. (4.9)). Denoting for brevity $u(0,\vec{s})$ simply as $u(0)$, we thus have $u(p) = S(-\vec{v})\,u(0)$, i.e.
\[ u(0) = S^{-1}(-\vec{v})\,u(p). \tag{7.9} \]
Of course, one should keep in mind that
\[ S^{-1}(-\vec{v}) = S(\vec{v}) \;\leftrightarrow\; \Lambda(\vec{v}). \tag{7.10} \]
So, substituting (7.9) into (7.8), one has
\[ \gamma_5\,\slashed{s}^{(0)} S^{-1}(-\vec{v})\,u(p) = S^{-1}(-\vec{v})\,u(p), \]
i.e.
\[ \gamma_5\,S(-\vec{v})\,\slashed{s}^{(0)} S^{-1}(-\vec{v})\,u(p) = u(p), \tag{7.11} \]
where we have utilized the fact that $S(-\vec{v})$ commutes with $\gamma_5$. Working out the left-hand side of Eq. (7.11), one gets
\[ S(-\vec{v})\,\slashed{s}^{(0)} S^{-1}(-\vec{v}) = S^{-1}(\vec{v})\,s^{(0)}_{\mu}\gamma^{\mu}\,S(\vec{v}) = s^{(0)}_{\mu}\,\Lambda^{\mu}{}_{\nu}(\vec{v})\,\gamma^{\nu}. \tag{7.12} \]
In this way, one arrives at the combination
\[ \Lambda^{\mu}{}_{\nu}(\vec{v})\,s^{(0)}_{\mu} \equiv s_{\nu}. \tag{7.13} \]
This relation can be written in the matrix form as
\[ s = \Lambda^{T}(\vec{v})\,s^{(0)}. \tag{7.14} \]

Using the notation introduced in Appendix A, one may employ the known result for ΛT , namely
(see (A.7) or (A.8))
  −1
T
Λ = Λ .
So, (7.14) can be recast as
\[ s = \left[\Lambda(\vec{v})\right]^{-1} s^{(0)} = \Lambda(-\vec{v})\,s^{(0)}. \tag{7.15} \]
Having in mind the definition of the matrix $\Lambda$, Eq. (7.15) can be written in terms of components as
\[ s^{\mu} = \Lambda^{\mu}{}_{\rho}(-\vec{v})\,s^{(0)\rho} \tag{7.16} \]
or, equivalently,
\[ s_{\mu} = \Lambda_{\mu}{}^{\nu}(-\vec{v})\,s^{(0)}_{\nu}. \tag{7.17} \]
Then, using (7.13) and (7.16), the expression (7.12) becomes $\slashed{s}$ and the relation (7.11) can thus be rewritten as
\[ \gamma_5\,\slashed{s}\,u(p) = u(p). \tag{7.18} \]
The result (7.18) is quite remarkable. We have found out that the original condition (7.2a) for the particle at rest is equivalent to the relation (7.18) for the particle in motion. The four-component quantity $s$ is obtained from $s^{(0)}$, defined by (7.7), by means of the same Lorentz transformation that turns the rest four-momentum $p^{(0)} = (m, \vec{0})$ into $p = (E, \vec{p})$. This, of course, means that $s = s(p)$ is a four-vector. Moreover, since obviously $p^{(0)}\cdot s^{(0)} = 0$, one also has
\[ p\cdot s(p) = 0. \tag{7.19} \]

The four-vector $s$ has another obvious property; it is space-like, and
\[ s^{2} = -1. \tag{7.20} \]
This is a direct consequence of $(s^{(0)})^{2} = -1$ and Lorentz invariance. Now one may check immediately the consistency of the condition (7.18). One would like to see that $\gamma_5\slashed{s}$ commutes with $\slashed{p}$. It is quite easy; one has
\[ [\slashed{p}, \gamma_5\slashed{s}] = \slashed{p}\gamma_5\slashed{s} - \gamma_5\slashed{s}\slashed{p} = -\gamma_5\{\slashed{p}, \slashed{s}\} = -2\gamma_5\,p\cdot s = 0. \]
Moreover,
\[ (\gamma_5\slashed{s})^{2} = 1. \tag{7.21} \]
Indeed, taking into account (7.20), one has
\[ (\gamma_5\slashed{s})^{2} = \gamma_5\slashed{s}\gamma_5\slashed{s} = -\slashed{s}\slashed{s} = -s^{2} = 1. \]
So, owing to (7.21), the eigenvalues of $\gamma_5\slashed{s}$ are $\pm 1$.
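The chain of statements (7.18) to (7.21) lends itself to a quick numerical cross-check. The sketch below makes several assumptions of our own: the standard representation, an arbitrarily chosen momentum and rest-frame spin direction, the textbook formula for a pure boost applied to $s^{(0)} = (0, \vec{s})$, and the fact that $(\slashed{p} + m)\,u(0,\vec{s})$ is proportional to the boosted spinor $S(-\vec{v})\,u(0,\vec{s})$. All helper names are hypothetical.

```python
import math

def mul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(a, A, b, B):
    """Linear combination a*A + b*B of equal-shape matrices."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(A, B)]

def close(A, B, tol=1e-9):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for r, s in zip(A, B) for x, y in zip(r, s))

def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [a + b for a, b in zip(A, B)] + [c + d for c, d in zip(C, D)]

def neg(A):
    return [[-x for x in r] for r in A]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE = block(I2, Z2, Z2, I2)
G0 = block(I2, Z2, Z2, neg(I2))                    # gamma^0, standard representation
G5 = block(Z2, I2, I2, Z2)                         # gamma_5, Eq. (7.4)
GAM = [block(Z2, s, neg(s), Z2) for s in PAULI]    # gamma^1, gamma^2, gamma^3

def slash(a):
    """a-slash = a^0 gamma^0 - a_vec . gamma_vec for a = (a0, a1, a2, a3)."""
    M = lin(a[0], G0, 0, G0)
    for j in range(3):
        M = lin(1, M, -a[j + 1], GAM[j])
    return M

def dot(a, b):
    """Minkowski product with signature (+,-,-,-)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

m, pv, sv = 0.5, (0.3, -0.4, 1.2), (0.6, 0.0, 0.8)  # sv: unit spin direction at rest
E = math.sqrt(m * m + sum(x * x for x in pv))
p = (E,) + pv
ps = sum(a * b for a, b in zip(pv, sv))
# pure boost of s0 = (0, sv) from the rest frame to the frame with momentum pv
s = (ps / m,) + tuple(si + pi * ps / (m * (E + m)) for si, pi in zip(sv, pv))

g5s = mul(G5, slash(s))
comm = lin(1, mul(slash(p), g5s), -1, mul(g5s, slash(p)))

# rest-frame spinor (chi, 0) with (sigma . sv) chi = chi; u(p, s) ~ (pslash + m) u(0, sv)
nrm = math.sqrt(2 * (1 + sv[2]))
u0 = [[(1 + sv[2]) / nrm], [(sv[0] + 1j * sv[1]) / nrm], [0], [0]]
u = mul(lin(1, slash(p), m, ONE), u0)

print(abs(dot(p, s)) < 1e-9, abs(dot(s, s) + 1) < 1e-9)  # (7.19), (7.20)
print(close(comm, lin(0, ONE, 0, ONE)))                   # [pslash, gamma5 sslash] = 0
print(close(mul(g5s, g5s), ONE))                          # (7.21)
print(close(mul(g5s, u), u))                              # (7.18)
```

The overall normalization of `u` is left out on purpose, since an eigenvalue condition does not depend on it.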


The condition (7.18) represents a covariant characterization of the spin state of the Dirac particle, so it is quite natural to call $s$ the spin four-vector; an alternative term that is used is the polarization four-vector. From now on, we will use the notation $u(p,s)$ for the solution of the equation $(\slashed{p} - m)u = 0$ satisfying (7.18):
\[ \gamma_5\,\slashed{s}\,u(p,s) = u(p,s). \tag{7.22} \]

Next, the question is what happens with $u(0,-\vec{s})$ satisfying (7.2b). The answer is quite clear: following the same procedure as above, one gets a spin four-vector equal to $-s$, since in this case we would start with $u(0,-\vec{s})$ in the rest system, and the ensuing Lorentz transformation is linear. Thus, we end up with a solution $u(p,-s)$ of $(\slashed{p} - m)u = 0$, satisfying
\[ -\gamma_5\,\slashed{s}\,u(p,-s) = u(p,-s). \tag{7.23} \]
Just to be sure: please note that while $s$ and $-s$ are linearly dependent, $u(p,s)$ and $u(p,-s)$ are linearly independent.
As regards the solutions of the equation $(\slashed{p} + m)v = 0$, we may rely on the approach outlined in Chapter 6, i.e. define $v(p,s)$ by means of the charge conjugation of $u(p,s)$:
\[ v(p,s) = u_C(p,s) = C\,\bar{u}^{T}(p,s). \tag{7.24} \]

Then it is not difficult to show that
\[ \gamma_5\,\slashed{s}\,v(p,s) = v(p,s). \tag{7.25} \]
Indeed, starting with (7.22) one gets first, after some simple manipulations,
\[ -\bar{u}(p,s)\,\slashed{s}\,\gamma_5 = \bar{u}(p,s). \tag{7.26} \]
The transpose of (7.26) becomes
\[ C^{-1}\gamma_5\,\slashed{s}\,C\,\bar{u}^{T}(p,s) = \bar{u}^{T}(p,s), \tag{7.27} \]
so that finally one has
\[ \gamma_5\,\slashed{s}\,C\,\bar{u}^{T}(p,s) = C\,\bar{u}^{T}(p,s), \tag{7.28} \]
and (7.25) is thereby proved. Note that when deriving (7.27) we have used the relation $\gamma_5^{T} = C^{-1}\gamma_5 C$ (the reader is encouraged to verify this identity independently). Clearly, in the same way one may obtain $v(p,-s)$ as $u_C(p,-s)$ that satisfies
\[ -\gamma_5\,\slashed{s}\,v(p,-s) = v(p,-s). \tag{7.29} \]

At this point, a conceptual remark is in order. Our road to the covariant treatment of the spin states of a free Dirac particle relied on an elementary description of the states in question in the rest system, and the ensuing Lorentz transformation. In fact, one might now forget about any reference to the rest system and adopt a more abstract approach, defining e.g. $u(p,s)$ directly as a common eigenvector of $\slashed{p}$ and $\gamma_5\slashed{s}$, with $s$ being a space-like four-vector orthogonal to $p$ and normalized as $s^{2} = -1$. We will come back to this viewpoint in more detail later on.
Hopefully, the preceding discussion contained some exciting moments (in particular, the appearance of the somewhat mysterious $\gamma_5$ in the spin description might be surprising at first sight). Now we have to add some more technicalities to our description of the functions $u(p,\pm s)$ and $v(p,\pm s)$ that you may find rather boring, but they will turn out to be very important later (another “rifle hanging on the wall”. . . ). The technique we are going to develop here usually appears in textbooks under the heading “projection operators for energy and spin”. So, what are the projection operators in question? Let us start with $u(p,s)$. As we know, it satisfies the equation
\[ (\slashed{p} - m)\,u(p,s) = 0. \]
This can be rewritten, somewhat artificially, as
\[ \frac{m + \slashed{p}}{2m}\,u(p,s) = u(p,s). \tag{7.30} \]

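Using it, we are on the right track, since the matrix
\[ \Lambda_{+} = \frac{m + \slashed{p}}{2m} \tag{7.31} \]
is a projection operator, in the sense that $(\Lambda_{+})^{2} = \Lambda_{+}$ (i.e. it is an idempotent matrix). Indeed,
\[ (\Lambda_{+})^{2} = \left(\frac{m + \slashed{p}}{2m}\right)^{2} = \frac{m^{2} + 2m\slashed{p} + \slashed{p}\slashed{p}}{4m^{2}} = \frac{m^{2} + 2m\slashed{p} + p^{2}}{4m^{2}} = \frac{m + \slashed{p}}{2m}. \]
The notation chosen in (7.31) refers to the fact that $u(p,s)$ is the amplitude of a plane wave with positive energy. Similarly, the relation (7.22) can be recast as
\[ \frac{1 + \gamma_5\slashed{s}}{2}\,u(p,s) = u(p,s), \tag{7.32} \]
and the matrix
\[ \Sigma(s) = \frac{1 + \gamma_5\slashed{s}}{2} \tag{7.33} \]
is another projection operator, since
\[ \Sigma(s)^{2} = \frac{1}{4}\left(1 + 2\gamma_5\slashed{s} + \gamma_5\slashed{s}\gamma_5\slashed{s}\right) = \frac{1}{4}\left(1 + 2\gamma_5\slashed{s} - s^{2}\right) = \Sigma(s), \]
where we have used (7.20) in the last step.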
Furthermore, since we know that $\slashed{p}$ and $\gamma_5\slashed{s}$ commute, $\Lambda_{+}$ and $\Sigma(s)$ commute as well. This in turn means that the product $\Lambda_{+}\Sigma(s)$ is also a projection operator. Obviously, the same can be done for $u(p,-s)$, replacing $\Sigma(s)$ with $\Sigma(-s)$.
For $v(p,s)$, which is a solution of the equation
\[ (\slashed{p} + m)\,v(p,s) = 0, \tag{7.34} \]
one may proceed analogously. Eq. (7.34) may be recast as
\[ \frac{m - \slashed{p}}{2m}\,v(p,s) = v(p,s), \]
and it is easily seen that
\[ \Lambda_{-} = \frac{m - \slashed{p}}{2m} \tag{7.35} \]
is a projection operator. As for the spin operators, we get the same as before.
Our findings can be summarized as follows. One may construct four matrices
\[ \begin{aligned} P_1 &= \Lambda_{+}\Sigma(s) = \frac{m + \slashed{p}}{2m}\,\frac{1 + \gamma_5\slashed{s}}{2}, \\ P_2 &= \Lambda_{+}\Sigma(-s) = \frac{m + \slashed{p}}{2m}\,\frac{1 - \gamma_5\slashed{s}}{2}, \\ P_3 &= \Lambda_{-}\Sigma(s) = \frac{m - \slashed{p}}{2m}\,\frac{1 + \gamma_5\slashed{s}}{2}, \\ P_4 &= \Lambda_{-}\Sigma(-s) = \frac{m - \slashed{p}}{2m}\,\frac{1 - \gamma_5\slashed{s}}{2}, \end{aligned} \tag{7.36} \]
that have the basic property of projection operators, i.e. satisfy
\[ P_j^{2} = P_j, \qquad j = 1, \dots, 4. \tag{7.37} \]

These projection operators are in one-to-one correspondence with the functions $u(p,\pm s)$, $v(p,\pm s)$, in such a way that
\[ \begin{aligned} P_1\,u(p,s) &= u(p,s), & P_2\,u(p,-s) &= u(p,-s), \\ P_3\,v(p,s) &= v(p,s), & P_4\,v(p,-s) &= v(p,-s). \end{aligned} \tag{7.38} \]
It is not difficult to find out that the matrices $P_j$, $j = 1, \dots, 4$, have the following properties:
\[ \sum_{j=1}^{4} P_j = 1, \qquad P_i\,P_j = 0 \ \text{ for } i \neq j, \qquad P_j^{\dagger} = \gamma_0 P_j \gamma_0. \tag{7.39} \]

Proving (7.39) is left to the reader as a simple exercise.
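For readers who would like a sanity check before attempting the algebra, here is a rough numerical sketch of (7.37) and (7.39), together with the relation $u\bar{u} = 2m\,P_1$ that follows from them. It assumes the standard representation, an arbitrary test momentum, and a spin four-vector built from a rest-frame direction by the usual pure boost; none of these choices or helper names come from the text.

```python
import math

def mul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(a, A, b, B):
    """Linear combination a*A + b*B of equal-shape matrices."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(A, B)]

def close(A, B, tol=1e-9):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for r, s in zip(A, B) for x, y in zip(r, s))

def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [a + b for a, b in zip(A, B)] + [c + d for c, d in zip(C, D)]

def neg(A):
    return [[-x for x in r] for r in A]

def dagger(A):
    """Hermitian conjugate."""
    return [[x.conjugate() for x in col] for col in zip(*A)]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE = block(I2, Z2, Z2, I2)
G0 = block(I2, Z2, Z2, neg(I2))                    # gamma^0, standard representation
G5 = block(Z2, I2, I2, Z2)                         # gamma_5
GAM = [block(Z2, s, neg(s), Z2) for s in PAULI]    # gamma^1, gamma^2, gamma^3

def slash(a):
    """a-slash = a^0 gamma^0 - a_vec . gamma_vec for a = (a0, a1, a2, a3)."""
    M = lin(a[0], G0, 0, G0)
    for j in range(3):
        M = lin(1, M, -a[j + 1], GAM[j])
    return M

def bar(w):
    """Dirac conjugate (row vector) w^dagger gamma^0 of a 4x1 column w."""
    return mul([[x[0].conjugate() for x in w]], G0)

m, pv, sv = 0.5, (0.3, -0.4, 1.2), (0.6, 0.0, 0.8)  # sv: unit spin direction at rest
E = math.sqrt(m * m + sum(x * x for x in pv))
p = (E,) + pv
ps = sum(a * b for a, b in zip(pv, sv))
s = (ps / m,) + tuple(si + pi * ps / (m * (E + m)) for si, pi in zip(sv, pv))

g5s = mul(G5, slash(s))
Lp, Lm = lin(0.5, ONE, 0.5 / m, slash(p)), lin(0.5, ONE, -0.5 / m, slash(p))  # Lambda_+, Lambda_-
Sp, Sm = lin(0.5, ONE, 0.5, g5s), lin(0.5, ONE, -0.5, g5s)                    # Sigma(s), Sigma(-s)
P = [mul(Lp, Sp), mul(Lp, Sm), mul(Lm, Sp), mul(Lm, Sm)]                      # Eq. (7.36)
zero = lin(0, ONE, 0, ONE)

idempotent = all(close(mul(Pj, Pj), Pj) for Pj in P)                          # (7.37)
orthogonal = all(close(mul(P[i], P[j]), zero)
                 for i in range(4) for j in range(4) if i != j)
total = zero
for Pj in P:
    total = lin(1, total, 1, Pj)
complete = close(total, ONE)
conjugation = all(close(dagger(Pj), mul(G0, mul(Pj, G0))) for Pj in P)

# u(p,s) with ubar u = 2m, built from a rest-frame spinor polarized along sv
N, nrm = (E + m) ** -0.5, math.sqrt(2 * (1 + sv[2]))
u0 = [[(1 + sv[2]) / nrm], [(sv[0] + 1j * sv[1]) / nrm], [0], [0]]
u = [[N * x[0]] for x in mul(lin(1, slash(p), m, ONE), u0)]
outer = close(mul(u, bar(u)), lin(2 * m, P[0], 0, P[0]))                       # u ubar = 2m P_1

print(idempotent, orthogonal, complete, conjugation, outer)
```

Such a check is of course no substitute for the proof, but it catches sign errors quickly.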


The relations (7.38) indicate that the projection operators $P_j$ can eventually be expressed in terms of bilinear combinations of the functions $u(p,\pm s)$, $v(p,\pm s)$. It is indeed so, as we will see shortly. First, let us introduce a simpler notation, adapted directly to the formalism of projection operators, namely
\[ w_1 = u(p,s), \quad w_2 = u(p,-s), \quad w_3 = v(p,s), \quad w_4 = v(p,-s). \tag{7.40} \]
Thus, the identities (7.38) are recast as
\[ P_j\,w_j = w_j, \qquad j = 1, \dots, 4, \tag{7.41} \]
and from this one gets readily
\[ \bar{w}_j\,P_j = \bar{w}_j. \tag{7.42} \]
These relations imply
\[ \bar{w}_j\,w_k = 0 \quad \text{for } j \neq k \tag{7.43} \]
as an immediate consequence of the second identity in (7.39). Let us also recall that the normalization conditions read
\[ \bar{w}_j\,w_j = 2m \ \text{ for } j = 1, 2, \qquad \bar{w}_j\,w_j = -2m \ \text{ for } j = 3, 4. \tag{7.44} \]
Thus, we have a basis in the 4-dimensional complex space $\mathbb{C}^{4}$, made of $w_1, \dots, w_4$. Now, let $\phi$ be an arbitrary element of such a space; it can be expanded as
\[ \phi = C_1 w_1 + \dots + C_4 w_4. \tag{7.45} \]
Using (7.43) and (7.44), the coefficients $C_j$ can be expressed as
\[ C_1 = \frac{1}{2m}\,\bar{w}_1\phi, \quad C_2 = \frac{1}{2m}\,\bar{w}_2\phi, \quad C_3 = -\frac{1}{2m}\,\bar{w}_3\phi, \quad C_4 = -\frac{1}{2m}\,\bar{w}_4\phi. \tag{7.46} \]
On the other hand, owing to the properties of the projection operators, one gets readily
\[ P_1\phi = C_1 w_1, \ \dots, \ P_4\phi = C_4 w_4. \tag{7.47} \]

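Thus, (7.47) along with (7.46) give
\[ P_1\phi = \frac{1}{2m}\,(\bar{w}_1\phi)\,w_1, \ \dots, \ P_4\phi = -\frac{1}{2m}\,(\bar{w}_4\phi)\,w_4. \tag{7.48} \]
Now the results for $P_1, \dots, P_4$ are quite clear, but in order to have a foolproof argument, let us write (7.48) in terms of components:
\[ (P_1)_{ab}\,\phi_b = \frac{1}{2m}\,\bar{w}_{1b}\,\phi_b\,w_{1a} = \frac{1}{2m}\,w_{1a}\,\bar{w}_{1b}\,\phi_b, \tag{7.49} \]
etc. (the indices $a$, $b$ take on values $1, \dots, 4$). From (7.48) or (7.49) one may thus conclude that
\[ P_1 = \frac{1}{2m}\,w_1\bar{w}_1, \quad P_2 = \frac{1}{2m}\,w_2\bar{w}_2, \quad P_3 = -\frac{1}{2m}\,w_3\bar{w}_3, \quad P_4 = -\frac{1}{2m}\,w_4\bar{w}_4. \tag{7.50} \]
Taking into account (7.36), one then gets finally
\[ \begin{aligned} u(p,s)\,\bar{u}(p,s) &= (\slashed{p} + m)\,\frac{1 + \gamma_5\slashed{s}}{2}, \\ u(p,-s)\,\bar{u}(p,-s) &= (\slashed{p} + m)\,\frac{1 - \gamma_5\slashed{s}}{2}, \\ v(p,s)\,\bar{v}(p,s) &= (\slashed{p} - m)\,\frac{1 + \gamma_5\slashed{s}}{2}, \\ v(p,-s)\,\bar{v}(p,-s) &= (\slashed{p} - m)\,\frac{1 - \gamma_5\slashed{s}}{2}. \end{aligned} \tag{7.51} \]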
These formulae are highly important in computations of observable quantities, like decay rates
and scattering cross sections, for physical processes; we will enjoy such applications later
on. Recall that we have already mentioned such a technical point in the preceding chapter,
in connection with the so-called completeness relations (see (6.24) and (6.30)). The formulae
(7.51) represent, in a sense, an “anatomy” of the completeness relations, since they refer to
particular spin (polarization) states and by summing over ±𝑠 one recovers the completeness
relations. Practically, this means that the formulae (7.51) will be instrumental in calculations
concerning processes that involve polarized particles.

Chapter 8

Helicity and chirality

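In the preceding chapter we have noted that, in general, $\slashed{p}$ does not commute with $\vec{s}\cdot\vec{\Sigma}$, where $\vec{s}$ represents a fixed space direction. Let us now work out the commutator in question explicitly. One has, using the identity (7.3), as well as other properties of $\gamma$-matrices,
\[ \begin{aligned} [\slashed{p}, \vec{s}\cdot\vec{\Sigma}] &= [p_0\gamma_0 - \vec{p}\cdot\vec{\gamma},\ \vec{s}\cdot\vec{\Sigma}] = [p_0\gamma_0 - \vec{p}\cdot\vec{\gamma},\ \gamma_5\gamma_0\,\vec{s}\cdot\vec{\gamma}] = -[\vec{p}\cdot\vec{\gamma},\ \gamma_5\gamma_0\,\vec{s}\cdot\vec{\gamma}] \\ &= -\gamma_5\gamma_0\,[\vec{p}\cdot\vec{\gamma},\ \vec{s}\cdot\vec{\gamma}] = -\gamma_5\gamma_0\,p^{j}s^{k}\,[\gamma^{j}, \gamma^{k}] = 2i\gamma_5\gamma_0\,p^{j}s^{k}\sigma^{jk} = 2i\gamma_5\gamma_0\,p^{j}s^{k}\varepsilon^{jkl}\Sigma^{l}, \end{aligned} \tag{8.1} \]
where we have employed the general representation of the spin matrices $\vec{\Sigma}$ in the last step (see the formula (C.36) in Appendix C). Thus, the relation (8.1) may be finally recast as
\[ [\slashed{p}, \vec{s}\cdot\vec{\Sigma}] = 2i\gamma_5\gamma_0\,(\vec{p}\times\vec{s})\cdot\vec{\Sigma}. \tag{8.2} \]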

This result means that the considered commutator vanishes if and only if $\vec{p}\times\vec{s} = 0$. Obviously, there are just two exceptional situations in which this can happen: either $\vec{p} = 0$ (the familiar case of the rest system), or $\vec{p} \neq 0$ and $\vec{s}$ parallel to $\vec{p}$.
Let us now focus on the second case. Thus, we are going to consider the projection of the particle spin onto the direction of momentum (again, instead of the true spin matrices $\vec{S} = \frac{1}{2}\vec{\Sigma}$ we will work with $\vec{\Sigma}$, for convenience). Such a quantity is called, traditionally, the helicity. The etymology of this technical term may be understood as follows. Let us visualize, naïvely, a particle in the above-mentioned spin state as e.g. a small disc rotating in the plane perpendicular to the direction of motion (defined by the unit vector $\vec{n} = \vec{p}/|\vec{p}|$). Then, a positive value of the projection of its angular momentum (spin) onto the direction $\vec{n}$ reminds one of the motion of a right-handed screw; similarly, the negative value of the considered projection would correspond to a left-handed screw. Geometrically, the motion of a screw is associated with the shape (curve) called helix (which, of course, is encountered in many other situations, including everyday life). For etymological completeness, let us add that the name originates from the Greek word ἕλιξ (meaning twisted). So, the Dirac particle with a definite momentum and positive helicity is called right-handed, and a state with negative helicity corresponds to a left-handed particle.
The straightforward definition of the helicity introduced above is quite transparent. Nevertheless, it is certainly desirable to have also an equivalent description in terms of an appropriate spin four-vector; in other words, instead of the scalar product $\vec{n}\cdot\vec{\Sigma}$ one would like to have a covariant matrix form that may be employed within the framework of the technique developed in the preceding chapter. So, we want to find a four-vector $s = s(p)$ with the standard general properties
\[ s\cdot p = 0, \qquad s^{2} = -1, \tag{8.3} \]

such that $\gamma_5\slashed{s}$ is equivalent to $\vec{n}\cdot\vec{\Sigma}$ when acting on the solution of the equation $(\slashed{p} - m)u = 0$. Note that in this case we do not intend to rely on a starting point in the rest system and the ensuing Lorentz transformation; rather we would like to implement the viewpoint advocated in the remark following the relation (7.29) in the preceding chapter, i.e. find an appropriate $s$ directly. For this purpose, we will start with an “educated guess”: having in mind that the only relevant space direction is here $\vec{n} = \vec{p}/|\vec{p}|$, we will try to construct the four-vector $s$ so that its space part is parallel to $\vec{n}$. More precisely, for a right-handed state, we will employ the Ansatz
\[ s_R(p) = \left(s_R^{0},\ \lambda\,\frac{\vec{p}}{|\vec{p}|}\right), \tag{8.4} \]
with $\lambda > 0$; the parameters $s_R^{0}$ and $\lambda$ are to be determined by means of the conditions (8.3). So, let us work it out. The requirement $p\cdot s_R(p) = 0$ yields, obviously,
\[ 0 = E\,s_R^{0} - \lambda\,\frac{\vec{p}}{|\vec{p}|}\cdot\vec{p} = E\,s_R^{0} - \lambda|\vec{p}|, \]
so that
\[ s_R^{0} = \lambda\,\frac{|\vec{p}|}{E} = \lambda\beta, \tag{8.5} \]
where $\beta$ is the particle velocity. Further, the normalization condition leads to
\[ -1 = \left(s_R(p)\right)^{2} = \lambda^{2}\beta^{2} - \lambda^{2}, \]
and thus
\[ \lambda^{2} = \frac{1}{1 - \beta^{2}} = \frac{1}{1 - \dfrac{|\vec{p}|^{2}}{E^{2}}} = \frac{E^{2}}{m^{2}}. \tag{8.6} \]
Taking into account (8.5) and (8.6), the result for $s_R(p)$ reads, finally,
\[ s_R(p) = \left(\frac{|\vec{p}|}{m},\ \frac{E}{m}\,\frac{\vec{p}}{|\vec{p}|}\right). \tag{8.7} \]
Now, let us check whether we are indeed on the right track, i.e. whether the spin four-vector (8.7) has the desired relation to the helicity. Fortunately, it is so. The following statement holds: let $u$ be a solution of the equation $(\slashed{p} - m)u = 0$. Then
\[ \gamma_5\,\slashed{s}_R(p)\,u = \frac{\vec{\Sigma}\cdot\vec{p}}{|\vec{p}|}\,u. \tag{8.8} \]
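The easiest way of proving Eq. (8.8) is to work out both sides independently, until one gets identical expressions. Let us start with the left-hand side. Using the formula (8.7) and taking into account that $\slashed{p} - m = E\gamma_0 - \vec{p}\cdot\vec{\gamma} - m$, one has
\[ \begin{aligned} \gamma_5\,\slashed{s}_R(p)\,u &= \gamma_5\left(\frac{|\vec{p}|}{m}\,\gamma_0 - \frac{E}{m}\,\frac{1}{|\vec{p}|}\,\vec{p}\cdot\vec{\gamma}\right)u \\ &= \gamma_5\left(\frac{|\vec{p}|}{m}\,\gamma_0 - \frac{E^{2}}{m|\vec{p}|}\,\gamma_0 + \frac{E}{|\vec{p}|}\right)u = \frac{1}{|\vec{p}|}\,\gamma_5\,(E - m\gamma_0)\,u. \end{aligned} \tag{8.9} \]
For the right-hand side of (8.8) one gets, using the identity (7.3),
\[ \frac{\vec{\Sigma}\cdot\vec{p}}{|\vec{p}|}\,u = \frac{1}{|\vec{p}|}\,\gamma_5\,\vec{\alpha}\cdot\vec{p}\,u = \frac{1}{|\vec{p}|}\,\gamma_5\gamma_0\,\vec{\gamma}\cdot\vec{p}\,u = \frac{1}{|\vec{p}|}\,\gamma_5\gamma_0\,(E\gamma_0 - m)\,u = \frac{1}{|\vec{p}|}\,\gamma_5\,(E - m\gamma_0)\,u. \tag{8.10} \]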

So, our calculation turned out well; the coincidence of (8.9) and (8.10) proves the validity of
(8.8).
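The identity (8.8), its analogue for the negative-energy amplitudes, and the underlying relation (7.3) can also be confirmed numerically. The following sketch assumes the standard representation and an arbitrary test momentum; the solutions are generated as $(\slashed{p} \pm m)$ acting on unit vectors, which is our own shortcut rather than the construction used in the text.

```python
import math

def mul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(a, A, b, B):
    """Linear combination a*A + b*B of equal-shape matrices."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(A, B)]

def close(A, B, tol=1e-9):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for r, s in zip(A, B) for x, y in zip(r, s))

def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [a + b for a, b in zip(A, B)] + [c + d for c, d in zip(C, D)]

def neg(A):
    return [[-x for x in r] for r in A]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE = block(I2, Z2, Z2, I2)
G0 = block(I2, Z2, Z2, neg(I2))                    # gamma^0, standard representation
G5 = block(Z2, I2, I2, Z2)                         # gamma_5
GAM = [block(Z2, s, neg(s), Z2) for s in PAULI]    # gamma^1, gamma^2, gamma^3
SIGMA = [block(s, Z2, Z2, s) for s in PAULI]       # spin matrices, Eq. (7.1)

def slash(a):
    """a-slash = a^0 gamma^0 - a_vec . gamma_vec for a = (a0, a1, a2, a3)."""
    M = lin(a[0], G0, 0, G0)
    for j in range(3):
        M = lin(1, M, -a[j + 1], GAM[j])
    return M

m, pv = 0.5, (0.3, -0.4, 1.2)                      # arbitrary test mass and momentum
pabs = math.sqrt(sum(x * x for x in pv))
E = math.sqrt(m * m + pabs ** 2)
p = (E,) + pv
sR = (pabs / m,) + tuple(E * x / (m * pabs) for x in pv)   # Eq. (8.7)

# identity (7.3): Sigma^j = gamma_5 alpha^j with alpha^j = gamma^0 gamma^j
identity_73 = all(close(SIGMA[j], mul(G5, mul(G0, GAM[j]))) for j in range(3))

hel = lin(0, ONE, 0, ONE)
for j in range(3):
    hel = lin(1, hel, pv[j] / pabs, SIGMA[j])      # Sigma . p / |p|
g5sR = mul(G5, slash(sR))

checks_u, checks_v = [], []
for r in range(4):
    w0 = [[1 if i == r else 0] for i in range(4)]
    if r < 2:
        u = mul(lin(1, slash(p), m, ONE), w0)      # solves (pslash - m) u = 0
        checks_u.append(close(mul(g5sR, u), mul(hel, u)))        # Eq. (8.8)
    else:
        v = mul(lin(1, slash(p), -m, ONE), w0)     # solves (pslash + m) v = 0
        checks_v.append(close(mul(g5sR, v), mul(neg(hel), v)))   # Eq. (8.16)

print(identity_73, checks_u, checks_v)
```

Note that the amplitudes used here are not helicity eigenstates; the identity is tested on generic solutions, exactly as it is stated.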
If we now denote the plane-wave amplitude $u(p)$ describing a right-handed particle (i.e. carrying positive helicity) as $u_R(p)$, the identity (8.8) means that
\[ \gamma_5\,\slashed{s}_R(p)\,u_R(p) = u_R(p). \tag{8.11} \]
Recalling the results of the preceding chapter, one may then also write (cf. (7.51))
\[ u_R(p)\,\bar{u}_R(p) = (\slashed{p} + m)\,\frac{1 + \gamma_5\slashed{s}_R}{2}. \tag{8.12} \]
Further, considering a free-particle state with negative helicity, the relevant spin four-vector is
\[ s_L = -s_R. \tag{8.13} \]
For the corresponding plane-wave amplitude $u_L(p)$ one has
\[ \gamma_5\,\slashed{s}_L(p)\,u_L(p) = u_L(p), \tag{8.14} \]
and thus also
\[ u_L(p)\,\bar{u}_L(p) = (\slashed{p} + m)\,\frac{1 + \gamma_5\slashed{s}_L}{2}. \tag{8.15} \]
Now, it is natural to ask what one gets for a pertinent $v(p,s)$ corresponding to a definite helicity. The answer is quite straightforward and is based on an identity analogous to (8.8). The following statement holds: let $v$ be a solution of the equation $(\slashed{p} + m)v = 0$. Then
\[ \gamma_5\,\slashed{s}_R\,v = \frac{\vec{\Sigma}\cdot(-\vec{p})}{|\vec{p}|}\,v, \tag{8.16} \]
where $s_R = s_R(p)$ is given by (8.7).
The proof of (8.16) is practically the same as in the case of the identity (8.8).
A remark is in order here. Recall that the function $v(p)$ is the amplitude of the plane wave
\[ \psi^{(-)}(x) = v(p)\,e^{ipx}. \tag{8.17} \]
In our conventional notation $p = (E, \vec{p})$, and the solution (8.17) carries negative energy $-E$ and the momentum $-\vec{p}$. So, the scalar product on the right-hand side of Eq. (8.16) corresponds indeed to the straightforward definition of the helicity. In analogy with (8.11) we may therefore write
\[ \gamma_5\,\slashed{s}_R\,v_R(p) = v_R(p), \tag{8.18} \]
where $v_R(p)$ corresponds to positive particle helicity (right-handed state). Thus, we may also write
\[ v_R(p)\,\bar{v}_R(p) = (\slashed{p} - m)\,\frac{1 + \gamma_5\slashed{s}_R}{2}. \tag{8.19} \]
Similarly, we write
\[ \gamma_5\,\slashed{s}_L\,v_L(p) = v_L(p) \tag{8.20} \]
for left-handed states, with $s_L = -s_R$ according to (8.13), and
\[ v_L(p)\,\bar{v}_L(p) = (\slashed{p} - m)\,\frac{1 + \gamma_5\slashed{s}_L}{2}. \tag{8.21} \]
There is another important issue arising in connection with the discussion of helicity states, namely the problem of the limit of zero mass (shortly, the “massless limit”). Of course, we know that at present there is no serious candidate for a spin-1/2 massless particle, but despite this, the massless limit is interesting for practical, technical reasons. The point is that in the high-energy limit, $E \gg m$, the energy of a relativistic particle $E = \sqrt{\vec{p}^{\,2} + m^{2}}$ is very close to $|\vec{p}|$; in other words, in such a situation one is indeed close to the limit $m \to 0$. Now, from (8.7) it is clear that the limit of $s_R(p)$ for $m \to 0$ does not exist. On the other hand, the limit of $u(p)$ or $v(p)$ does exist, since the equation $\slashed{p}u = 0$ certainly has a solution. Thus, we expect that the identities (8.12) or (8.19) have a well-defined limit for $m \to 0$. To verify this, let us work out the limit of the right-hand side of the relation (8.12) explicitly. The starting point is the expression
\[ \frac{1}{2}\,(\slashed{p} + m)(1 + \gamma_5\slashed{s}_R) = \frac{1}{2}\left(\slashed{p} + m - \gamma_5\,\slashed{p}\,\slashed{s}_R + m\gamma_5\slashed{s}_R\right). \tag{8.22} \]
Taking into account the formula (8.7), the last term in parentheses obviously has a finite limit for $m \to 0$, since the coefficient $m$ cancels the $1/m$ factors appearing in $s_R(p)$. In particular,
\[ \lim_{m\to 0} m\gamma_5\,\slashed{s}_R(p) = \lim_{m\to 0} m\gamma_5\left(\frac{|\vec{p}|}{m}\,\gamma_0 - \frac{E}{m}\,\frac{\vec{p}}{|\vec{p}|}\cdot\vec{\gamma}\right) = \gamma_5\,\slashed{p}, \tag{8.23} \]
since $\lim_{m\to 0} E/|\vec{p}| = 1$. The third term in parentheses is more complicated, since at first sight there is no clear suppression of the $1/m$ factor in $s_R(p)$. So, let us evaluate the relevant limit in detail. One has
\[ \begin{aligned} \lim_{m\to 0} \slashed{p}\,\slashed{s}_R &= \lim_{m\to 0}\,(E\gamma_0 - \vec{p}\cdot\vec{\gamma})\left(\frac{|\vec{p}|}{m}\,\gamma_0 - \frac{E}{m|\vec{p}|}\,\vec{p}\cdot\vec{\gamma}\right) \\ &= \lim_{m\to 0}\left[\frac{E|\vec{p}|}{m} + \frac{E^{2}}{m|\vec{p}|}\,\vec{p}\cdot\vec{\gamma}\,\gamma_0 - \frac{|\vec{p}|}{m}\,\vec{p}\cdot\vec{\gamma}\,\gamma_0 + \frac{E}{m|\vec{p}|}\,(\vec{p}\cdot\vec{\gamma})(\vec{p}\cdot\vec{\gamma})\right]. \end{aligned} \tag{8.24} \]
For the last term in square brackets one gets readily
\[ (\vec{p}\cdot\vec{\gamma})(\vec{p}\cdot\vec{\gamma}) = \frac{1}{2}\,p^{j}p^{k}\{\gamma^{j}, \gamma^{k}\} = \frac{1}{2}\,p^{j}p^{k}\,(-2\delta^{jk}) = -|\vec{p}|^{2}. \tag{8.25} \]
The first term then gets cancelled and the expression (8.24) becomes
\[ \lim_{m\to 0} \frac{E^{2} - |\vec{p}|^{2}}{m|\vec{p}|}\,\vec{p}\cdot\vec{\gamma}\,\gamma_0 = \lim_{m\to 0} \frac{m}{|\vec{p}|}\,\vec{p}\cdot\vec{\gamma}\,\gamma_0 = 0. \tag{8.26} \]
Thus, surprisingly, the term that looked potentially troublesome vanishes in the massless limit! Putting all the above results together, the limit of the expression (8.22) for $m \to 0$ becomes
\[ \frac{1}{2}\,(\slashed{p} + \gamma_5\slashed{p}) = \slashed{p}\,\frac{1 - \gamma_5}{2}. \tag{8.27} \]
Let us summarize the results of our calculation. We have found that for $m = 0$ one may write
\[ u_R(p)\,\bar{u}_R(p) = \slashed{p}\,\frac{1 - \gamma_5}{2}. \tag{8.28} \]
In the same way, we would get
\[ u_L(p)\,\bar{u}_L(p) = \slashed{p}\,\frac{1 + \gamma_5}{2}. \tag{8.29} \]
Concerning the plane-wave amplitudes $v(p)$ for $m \to 0$, one obtains, clearly (cf. (8.19)),
\[ v_R(p)\,\bar{v}_R(p) = \slashed{p}\,\frac{1 + \gamma_5}{2}, \tag{8.30} \]
and finally
\[ v_L(p)\,\bar{v}_L(p) = \slashed{p}\,\frac{1 - \gamma_5}{2}. \tag{8.31} \]
Let us now return, for a moment, to the remarkable finding (8.26), i.e.
\[ \lim_{m\to 0} \slashed{p}\,\slashed{s}_R(p) = 0. \tag{8.32} \]
To get a better insight into this result, it is useful to examine the behaviour of $s_R(p)$ in the high-energy limit (which, technically, is tantamount to $m \to 0$). It turns out that it behaves asymptotically like the four-momentum $p$, up to a remainder that is suppressed as $1/E$. More precisely, for $E \gg m$ one has
\[ s_R^{\mu}(p) = \frac{p^{\mu}}{m} + \Delta^{\mu}(p), \tag{8.33} \]
where $\Delta^{\mu}(p)$ is of the order $O(m/E)$, i.e. it vanishes for $E \to \infty$, or, equivalently, for $m \to 0$. The proof of such a statement is not difficult. Using the formula (8.7), one gets, after a trivial manipulation,
\[ \Delta^{\mu}(p) = s_R^{\mu}(p) - \frac{p^{\mu}}{m} = \frac{E - |\vec{p}|}{m}\left(-1,\ \frac{\vec{p}}{|\vec{p}|}\right). \tag{8.34} \]
The factor $(E - |\vec{p}|)/m$ can be recast as
\[ \frac{E - |\vec{p}|}{m} = \frac{(E - |\vec{p}|)(E + |\vec{p}|)}{m\,(E + |\vec{p}|)} = \frac{m}{E + |\vec{p}|}. \tag{8.35} \]
For $E \gg m$, $|\vec{p}|$ is of the same order of magnitude as $E$, and the proof is thereby completed. Let us add that from (8.33) it is easy to see that $p\cdot\Delta(p) = -m$. In any case, our conclusion is that $\Delta^{\mu}(p) \to 0$ for $m \to 0$. Thus, one has
\[ \slashed{p}\,\slashed{s}_R(p) = \slashed{p}\left(\frac{\slashed{p}}{m} + \slashed{\Delta}(p)\right) = \frac{p^{2}}{m} + \slashed{p}\,\slashed{\Delta}(p) = m + \slashed{p}\,\slashed{\Delta}(p). \]
According to our preceding discussion, the second term in the last expression vanishes for $m \to 0$, and thus we recover the result (8.32).
Let us add that the relation (8.33) is, technically, quite remarkable in its own right; we will encounter such a type of algebraic identity later on, in a completely different physical situation.
Our preceding description of the limiting case $m = 0$ can be reformulated in a different way, which in fact is even simpler and more transparent. One may return to the original natural definition of the helicity in terms of the scalar product
\[ \hat{h} = \frac{\vec{\Sigma}\cdot\vec{p}}{|\vec{p}|} \tag{8.36} \]
standing on the right-hand side of Eq. (8.8). Clearly, such a definition is directly applicable even for a massless particle. Let us consider first the plane-wave amplitude $u(p)$. For $m = 0$ it satisfies the equation $(E\gamma_0 - \vec{p}\cdot\vec{\gamma})\,u(p) = 0$ with $E = |\vec{p}|$. For our purpose, it is useful to write it as
\[ \vec{\alpha}\cdot\vec{p}\;u = |\vec{p}|\,u. \tag{8.37} \]

Let us now consider a solution of Eq. (8.37) with definite helicity, i.e.
\[ \frac{\vec{\Sigma}\cdot\vec{p}}{|\vec{p}|}\,u = h\,u, \tag{8.38} \]
where, of course, $h$ can be $+1$ or $-1$. Employing the identity (7.3) and Eq. (8.37), one gets from (8.38)
\[ \gamma_5\,u = h\,u. \tag{8.39} \]
Thus, it turns out that for a massless particle described by means of a plane-wave amplitude the helicity is equal to an eigenvalue of the matrix $\gamma_5$ (as a consistency check one should realize that $\gamma_5$ has eigenvalues $\pm 1$, since $\gamma_5^{2} = 1$). Such an eigenvalue has a special name; traditionally it is called chirality. So, the right-handed $u$, i.e. $u_R$, has the chirality $+1$ and the left-handed $u_L$ carries the chirality $-1$. This way of distinguishing between right and left explains the etymological origin of the term chirality: in Greek, χειρ means the hand, so that the chirality could also be dubbed the “handedness”. Note that this term was introduced into relativistic quantum physics by Satosi Watanabe in 1957 (see ref. [34]).
Similarly, if one considers a plane-wave amplitude $v(p)$ (corresponding to a state with negative energy $-E = -|\vec{p}|$ and momentum $-\vec{p}$), the helicity operator is
\[ \hat{h} = \frac{\vec{\Sigma}\cdot(-\vec{p})}{|\vec{p}|}, \tag{8.40} \]
and $v(p)$ satisfies for $m = 0$ the equation
\[ \vec{\alpha}\cdot\vec{p}\;v = |\vec{p}|\,v. \tag{8.41} \]
Then, in the same manner as before, one gets, for the solution with a definite helicity $h$,
\[ \gamma_5\,v = -h\,v. \tag{8.42} \]
Thus, for a plane-wave amplitude $v$, the helicity is equal to minus chirality in the massless case. Our results can be summarized briefly as the following simple rule:
\[ \begin{aligned} &\text{For } u(p) \text{ with } m = 0, \quad \text{helicity} = \text{chirality}, \\ &\text{For } v(p) \text{ with } m = 0, \quad \text{helicity} = -\,\text{chirality}. \end{aligned} \tag{8.43} \]
In more detail, this means that for $m = 0$ one has
\[ \gamma_5 u_R = u_R, \qquad \gamma_5 u_L = -u_L, \qquad \gamma_5 v_R = -v_R, \qquad \gamma_5 v_L = v_L. \tag{8.44} \]
Using the relations (8.44), one may recover easily our previous results for $u_R\bar{u}_R$, etc. (see (8.28), (8.29), (8.30) and (8.31)). To this end, let us first recast (8.44) in terms of appropriate projection operators. One gets, after some simple manipulations,
\[ \gamma_5 u_R = u_R \ \Rightarrow\ \frac{1 + \gamma_5}{2}\,u_R = u_R, \tag{8.45} \]
\[ \gamma_5 u_L = -u_L \ \Rightarrow\ \frac{1 - \gamma_5}{2}\,u_L = u_L, \tag{8.46} \]

and similarly for $v_R$ and $v_L$. Clearly, the matrices $(1 \pm \gamma_5)/2$ are mutually orthogonal projection operators, so that one may write e.g.
\[ \frac{1 + \gamma_5}{2}\,u_R = u_R, \qquad \frac{1 + \gamma_5}{2}\,u_L = 0. \tag{8.47} \]
It is easy to see that the Dirac conjugation of (8.47) amounts to
\[ \bar{u}_R\,\frac{1 - \gamma_5}{2} = \bar{u}_R, \qquad \bar{u}_L\,\frac{1 - \gamma_5}{2} = 0. \tag{8.48} \]
Then, multiplying the relations (8.48) from the left by $u_R$ and $u_L$, respectively, and summing the resulting relations, one gets
\[ \left(u_R\bar{u}_R + u_L\bar{u}_L\right)\frac{1 - \gamma_5}{2} = u_R\bar{u}_R. \tag{8.49} \]
However, the sum $u_R\bar{u}_R + u_L\bar{u}_L$ is equal to $\slashed{p}$, since it is the massless limit of the completeness relation (polarization sum) for the functions $u(p,s)$ (cf. (7.51) and (6.24)). Thus, we end up with the result
\[ u_R(p)\,\bar{u}_R(p) = \slashed{p}\,\frac{1 - \gamma_5}{2} \tag{8.50} \]
for 𝑚 = 0. The other relations shown above can be proved in an analogous manner. The simple
trick exemplified by the relations (8.47) is worth remembering as an ideal starting point for a
quick derivation of the important formulae like (8.50) (i.e. a derivation written on the back of
an envelope, or in the sand of a beach on a desert island).
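The massless relations can be illustrated with explicit spinors. The sketch below is an assumption-laden example rather than part of the derivation: it works in the standard representation, picks an arbitrary momentum, and uses the ad hoc amplitudes $u_R = \sqrt{E}\,(\chi_+, \chi_+)^T$ and $u_L = \sqrt{E}\,(\chi_-, -\chi_-)^T$, where $\chi_\pm$ are two-component eigenvectors of $\vec{\sigma}\cdot\vec{n}$; these explicit forms are ours and do not appear in the text.

```python
import math

def mul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(a, A, b, B):
    """Linear combination a*A + b*B of equal-shape matrices."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(A, B)]

def close(A, B, tol=1e-9):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for r, s in zip(A, B) for x, y in zip(r, s))

def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [a + b for a, b in zip(A, B)] + [c + d for c, d in zip(C, D)]

def neg(A):
    return [[-x for x in r] for r in A]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE = block(I2, Z2, Z2, I2)
G0 = block(I2, Z2, Z2, neg(I2))                    # gamma^0, standard representation
G5 = block(Z2, I2, I2, Z2)                         # gamma_5
GAM = [block(Z2, s, neg(s), Z2) for s in PAULI]    # gamma^1, gamma^2, gamma^3

def slash(a):
    """a-slash = a^0 gamma^0 - a_vec . gamma_vec for a = (a0, a1, a2, a3)."""
    M = lin(a[0], G0, 0, G0)
    for j in range(3):
        M = lin(1, M, -a[j + 1], GAM[j])
    return M

def bar(w):
    """Dirac conjugate (row vector) w^dagger gamma^0 of a 4x1 column w."""
    return mul([[x[0].conjugate() for x in w]], G0)

pv = (0.3, -0.4, 1.2)                              # arbitrary test momentum
E = math.sqrt(sum(x * x for x in pv))              # massless case: E = |p|
p = (E,) + pv
n = [x / E for x in pv]
nrm = math.sqrt(2 * (1 + n[2]))
chi_plus = [(1 + n[2]) / nrm, (n[0] + 1j * n[1]) / nrm]     # (sigma . n) chi = +chi
chi_minus = [-(n[0] - 1j * n[1]) / nrm, (1 + n[2]) / nrm]   # (sigma . n) chi = -chi
root_e = math.sqrt(E)
uR = [[root_e * c] for c in chi_plus + chi_plus]
uL = [[root_e * c] for c in chi_minus + [-c for c in chi_minus]]

zero_col = [[0], [0], [0], [0]]
massless = close(mul(slash(p), uR), zero_col) and close(mul(slash(p), uL), zero_col)
chirality = close(mul(G5, uR), uR) and close(mul(G5, uL), neg(uL))     # Eq. (8.44)
rel_R = close(mul(uR, bar(uR)), mul(slash(p), lin(0.5, ONE, -0.5, G5)))  # Eq. (8.50)
rel_L = close(mul(uL, bar(uL)), mul(slash(p), lin(0.5, ONE, 0.5, G5)))   # Eq. (8.29)
total = close(lin(1, mul(uR, bar(uR)), 1, mul(uL, bar(uL))), slash(p))   # polarization sum

print(massless, chirality, rel_R, rel_L, total)
```

The normalization $u^{\dagger}u = 2E$ built into these spinors is what makes the polarization sum come out as $\slashed{p}$ with no extra factor.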
We have seen that the concept of chirality is quite useful from the technical point of view,
since it obviously leads to a considerable simplification of the treatment of the helicity states in
high-energy limit. In fact, such a concept has much broader impact as it is very important also
in field theory models; in particular, it plays a key role in the formulation of the standard model
of electroweak interactions.

Chapter 9

Weyl equation

In the preceding chapter, we have examined the massless limit of solutions of the Dirac equation and we have uncovered a remarkable rule for the description of right-handed and left-handed states in terms of the chirality. Now, as a natural continuation of the study of the $m \to 0$ limit, one might try to go back “ad fontes” and take $m = 0$ from the very beginning. Well, we have already noted that at present there is no truly massless spin-1/2 particle; nevertheless, for various reasons, such a case retains some purely theoretical interest, and thus it certainly makes sense to devote one chapter to this issue.
Let us start with re-examining the guesswork presented in the introductory Chapter 1, which has led to the Dirac equation. If $m = 0$, one may use a simple Ansatz
\[ i\,\frac{\partial\psi}{\partial t} = -i\,\vec{\alpha}\cdot\vec{\nabla}\psi \tag{9.1} \]
for the relativistic equation in question (cf. (1.23) with $\hbar = c = 1$). In the same way as before, the matrices $\alpha_j$, $j = 1, 2, 3$, should satisfy the anticommutation relations
\[ \{\alpha_j, \alpha_k\} = 2\delta_{jk}. \tag{9.2} \]
The matrix coefficient $\beta$ is absent now, and this makes a significant difference. The point is that the relation (9.2) can be satisfied by means of 2 × 2 matrices; obviously, the Pauli matrices $\sigma_j$ represent a possible choice. More precisely, there are two inequivalent options for the triplet of $\alpha_j$ in (9.2), namely
\[ 1)\quad \alpha_j = \sigma_j, \tag{9.3} \]
\[ 2)\quad \alpha_j = -\sigma_j. \tag{9.4} \]
The sets (9.3) and (9.4) are indeed inequivalent, since there is no invertible matrix $S$ such that
\[ -\sigma_j = S\sigma_j S^{-1}. \tag{9.5} \]
We have come across this fact in Chapter 1: such a matrix 𝑆 would have to anticommute with
all 𝜎 𝑗 , 𝑗 = 1, 2, 3, and there is no such thing within the space of 2 × 2 matrices (let us recall that
it was the problem of finding a matrix 𝛽 anticommuting with all 𝛼 𝑗 , which has led us to 4 × 4
matrices in the Dirac equation with 𝑚 ≠ 0).
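As a quick sanity check of the statements above, the following minimal Python sketch (plain nested lists, standard library only) verifies that both triplets (9.3) and (9.4) satisfy the relation (9.2). The helper names `mul`, `scale`, `anticomm`, `satisfies_9_2` and `triple_product` are introduced here purely for illustration. The sketch also evaluates the product $\alpha_1\alpha_2\alpha_3$, which equals $+i$ times the unit matrix for (9.3) and $-i$ times the unit matrix for (9.4); since a similarity transformation leaves a multiple of the unit matrix unchanged, this offers an alternative way of seeing that the two sets are inequivalent.

```python
# Pauli matrices as plain nested lists of (complex) numbers.
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
ID2 = [[1, 0], [0, 1]]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply a 2x2 matrix by a number."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def anticomm(a, b):
    """Anticommutator {a, b} = ab + ba."""
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(2)] for i in range(2)]

def satisfies_9_2(alphas):
    """Check {alpha_j, alpha_k} = 2 delta_jk for a triplet of matrices."""
    return all(
        anticomm(alphas[j], alphas[k]) == scale(2 if j == k else 0, ID2)
        for j in range(3) for k in range(3)
    )

def triple_product(alphas):
    """alpha_1 alpha_2 alpha_3; unchanged by any similarity transformation
    whenever it is a multiple of the unit matrix."""
    return mul(mul(alphas[0], alphas[1]), alphas[2])

option_1 = [S1, S2, S3]                       # Eq. (9.3)
option_2 = [scale(-1, s) for s in option_1]   # Eq. (9.4)

print(satisfies_9_2(option_1), satisfies_9_2(option_2))
print(triple_product(option_1))   # i times the unit matrix
print(triple_product(option_2))   # -i times the unit matrix
```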
Of course, instead of (9.3) or (9.4), one could use any matrices related to these by means
of a similarity transformation, i.e. ±𝑈𝜎 𝑗 𝑈 −1 ; thus, there is an infinite number of triplets of 2 × 2
matrices 𝛼 𝑗 satisfying (9.2), but they fall into two distinct classes, generated either by (9.3) or

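(9.4). Taking (9.3) or (9.4) as the familiar representatives of these classes, one has two possible
types of relativistic equations in question, namely
$$ i\,\frac{\partial\psi}{\partial t} = -i\,\vec{\sigma}\cdot\vec{\nabla}\psi , \tag{9.6} $$
$$ i\,\frac{\partial\psi}{\partial t} = i\,\vec{\sigma}\cdot\vec{\nabla}\psi . \tag{9.7} $$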
Equations (9.6) and (9.7) are called Weyl equations (they are due to Hermann Weyl, who
introduced them in 1929, shortly after the formulation of the Dirac equation).
As in the case of the Dirac equation, we should check the relativistic covariance of the
equations (9.6), (9.7). To this end, let us first rewrite them in a “covariant form”. It means that
Eq. (9.6) can be recast as
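$$ \sigma^\mu \partial_\mu \psi = 0 , \tag{9.8} $$
and (9.7) as
$$ \tilde\sigma^\mu \partial_\mu \psi = 0 , \tag{9.9} $$
where
$$ \sigma^\mu = (1, \vec{\sigma}) , \qquad \tilde\sigma^\mu = (1, -\vec{\sigma}) . \tag{9.10} $$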
In what follows, let us concentrate e.g. on Eq. (9.8); Eq. (9.9) can be treated in an analogous
manner. So, let 𝑥 ′ = Λ𝑥 be a Lorentz transformation of spacetime coordinates; the corresponding
transformation of the wave function 𝜓 can be written as

𝜓 ′ (𝑥 ′) = 𝑆𝜓(𝑥) , (9.11)

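where $S = S(\Lambda)$ is a constant invertible 2 × 2 matrix. Substituting $\psi(x) = S^{-1}\psi'(x')$ in Eq. (9.8),
one gets, after a simple manipulation,
$$ 0 = \sigma^\mu S^{-1}\,\frac{\partial\psi'}{\partial x'^\lambda}\,\frac{\partial x'^\lambda}{\partial x^\mu} = \Lambda^\lambda{}_\mu\,\sigma^\mu S^{-1}\,\frac{\partial\psi'}{\partial x'^\lambda} . \tag{9.12} $$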
In order to obtain from (9.12) the Weyl equation (9.8) in primed coordinates, the most general
condition to be imposed is, obviously,

$$ \Lambda^\lambda{}_\mu\,\sigma^\mu S^{-1} = R\,\sigma^\lambda , \tag{9.13} $$

where 𝑅 is another non-singular matrix; the point is that if (9.13) is satisfied, then multiplying
Eq. (9.12) by 𝑅 −1 , the form of the equation (9.8) is restored. Thus, the condition that should
determine the transformation matrix 𝑆 can be written as
$$ \Lambda^\mu{}_\nu\,\sigma^\nu = R\,\sigma^\mu S \tag{9.14} $$

(simultaneous determination of the auxiliary matrix 𝑅 is a part of the procedure). Notice that
the relation (9.14) differs from Eq. (4.9) that we have employed earlier for the Dirac equation.
It is so because here the mass term is absent; recovering such a term in the Dirac equation gives
in fact an additional requirement to be imposed on the transformation matrix 𝑆.
For the proper (continuous) Lorentz transformations, the procedure of finding the corre-
sponding matrices 𝑅 and 𝑆 is quite lengthy, so in current literature the solution for the matrix 𝑆
is usually shown immediately, relying on the theory of representations of the Lorentz group (see
Appendix B). Using the standard notation, the relevant representation here is $(0, \tfrac12)$, i.e. it is one
of the two inequivalent 2-dimensional spinor representations. Similarly, for the second Weyl
equation (9.9) the pertinent representation of the Lorentz group is $(\tfrac12, 0)$. Nevertheless, a direct
solution of the matrix equation (9.14) should be a challenge for any enthusiast, so let us now
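take it up. Denoting the six independent parameters of an infinitesimal Lorentz transformation
as $\Delta\omega^{\alpha\beta}$ ($\Delta\omega^{\beta\alpha} = -\Delta\omega^{\alpha\beta}$), one may write
$$ S = 1 - \frac{i}{4}\,\Delta\omega^{\alpha\beta}\,W_{\alpha\beta} , \tag{9.15} $$
$$ R = 1 - \frac{i}{4}\,\Delta\omega^{\alpha\beta}\,V_{\alpha\beta} \tag{9.16} $$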
(of course, 𝑊 stands here for Weyl); our goal is to find the generators 𝑊𝛼𝛽 and 𝑉𝛼𝛽 . Before
starting the calculation, a remark is in order. The solution of Eq. (9.14) for the matrices 𝑅
and 𝑆 has an obvious ambiguity: if a pair 𝑅, 𝑆 is a solution, then 𝑅′, 𝑆′ such that 𝑅′ = 𝜆𝑅,
𝑆′ = 𝜆−1 𝑆 represent a solution as well. We will eliminate such an essentially trivial ambiguity
by restricting ourselves to the generators that are traceless (i.e. we discard possible multiples of
the unit matrix). So, substituting the expressions (9.15), (9.16) into Eq. (9.14) and using the
familiar form of the infinitesimal Λ, one obtains the conditions for the generators 𝑊𝛼𝛽 , 𝑉𝛼𝛽 (in
much the same way as in the case of the Dirac equation in Chapter 4). These read

2𝑖(𝑔 𝜇𝛼 𝜎𝛽 − 𝑔 𝜇𝛽 𝜎𝛼 ) = 𝑉𝛼𝛽 𝜎𝜇 + 𝜎𝜇 𝑊𝛼𝛽 . (9.17)

Thus, for a fixed pair (𝛼, 𝛽) we have 16 equations for 8 unknowns (elements of the matrices 𝑉𝛼𝛽
and 𝑊𝛼𝛽 ). The procedure of solving the equations (9.17) is straightforward though somewhat
tedious, so we will describe only some salient features of the whole calculation.
As a first step, take (𝛼, 𝛽) = ( 𝑗, 𝑘) with 𝑗, 𝑘 = 1, 2, 3. For 𝜇 = 0 in Eq. (9.17) one gets
readily (recall that 𝜎 0 = 1)
𝑉 𝑗 𝑘 = −𝑊 𝑗 𝑘 . (9.18)
For 𝜇 = 1, 2, 3 one then gets, e.g. for 𝑊12 ,

[𝜎1 , 𝑊12 ] = −2𝑖𝜎2 ,


[𝜎2 , 𝑊12 ] = 2𝑖𝜎1 , (9.19)
[𝜎3 , 𝑊12 ] = 0 .

Notice that the aforementioned ambiguity is obviously manifested here (to any solution 𝑊12 one
may add an arbitrary multiple of the unit matrix). The only traceless solution of the equations
(9.19) is 𝑊12 = 𝜎3 . In a similar way, one obtains 𝑊13 = −𝜎2 and 𝑊23 = 𝜎1 . These results can
be summarized in a compact form as

𝑊 𝑗 𝑘 = 𝜀 𝑗 𝑘𝑙 𝜎𝑙 . (9.20)

Next, consider the combination (𝛼, 𝛽) = (0, 𝑗), 𝑗 = 1, 2, 3. Taking 𝜇 = 0 in Eq. (9.17), one gets

𝑉0 𝑗 + 𝑊0 𝑗 = 2𝑖𝜎 𝑗 . (9.21)

The remaining conditions (9.17), i.e. those with 𝜇 = 1, 2, 3, then yield, after some simple
manipulations, e.g. for 𝑊01 ,

[𝜎1 , 𝑊01 ] = 0 ,
[𝜎2 , 𝑊01 ] = 2𝜎3 , (9.22)
[𝜎3 , 𝑊01 ] = −2𝜎2 .

The only traceless solution of these equations is 𝑊01 = 𝑖𝜎1 . Analogously, one can get 𝑊02 = 𝑖𝜎2
and 𝑊03 = 𝑖𝜎3 .
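The commutator conditions (9.19) and (9.22) can be checked mechanically. The sketch below is only an illustration (the helpers `mul`, `scale` and `comm` are names chosen here, and only $W_{12}$ and $W_{01}$ are tested); it confirms that $W_{12} = \sigma_3$ and $W_{01} = i\sigma_1$ satisfy the quoted relations, with $\sigma_j$ taken as the usual Pauli matrices.

```python
# Pauli matrices as plain nested lists of (complex) numbers.
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
ZERO = [[0, 0], [0, 0]]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply a 2x2 matrix by a number."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

W12 = S3               # candidate solution of (9.19)
W01 = scale(1j, S1)    # candidate solution of (9.22)

eq_9_19_holds = (comm(S1, W12) == scale(-2j, S2)
                 and comm(S2, W12) == scale(2j, S1)
                 and comm(S3, W12) == ZERO)
eq_9_22_holds = (comm(S1, W01) == ZERO
                 and comm(S2, W01) == scale(2, S3)
                 and comm(S3, W01) == scale(-2, S2))

print(eq_9_19_holds, eq_9_22_holds)
```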
Putting all these results together, we conclude that the only solution of the conditions
(9.17) such that Tr 𝑊𝛼𝛽 = 0, Tr 𝑉𝛼𝛽 = 0 is given by
𝑊 𝑗 𝑘 = 𝜀 𝑗 𝑘𝑙 𝜎𝑙 , 𝑊0 𝑗 = 𝑖𝜎 𝑗 , (9.23)
and
𝑉 𝑗 𝑘 = −𝑊 𝑗 𝑘 , 𝑉0 𝑗 = 𝑊0 𝑗 . (9.24)
Using this in (9.15) and (9.16), and taking into account that 𝑊 𝑗 𝑘 is Hermitian, while 𝑊0 𝑗 is
anti-Hermitian, one can see that 𝑅 = 𝑆 † (but 𝑆 † is not, in general, equal to 𝑆 −1 ).
It is to be expected that the generators $W_{\alpha\beta}$ correspond to a representation of the Lorentz
algebra; in particular, the matrices $w_{\alpha\beta} = \tfrac12 W_{\alpha\beta}$ should satisfy the commutation relations of
the form (A.22). It is indeed so, though the explicit calculation is somewhat lengthy (another
challenge for a diligent reader). In any case, the matrices $w_{\alpha\beta}$ are seen to coincide with the
generators of the representation $(\tfrac12, 0)$ described in Appendix B.
Let us now turn to discrete symmetries. First, we are going to examine the spatial
inversion (the parity operation). In such a case, the matrix of Lorentz transformation is
diag(+1, −1, −1, −1). Using this in the relation (9.14), one gets first, for 𝜇 = 0, 1 = 𝑅𝑆,
i.e. 𝑅 = 𝑆 −1 . Then, for 𝜇 = 𝑗 = 1, 2, 3 one has the requirement
−𝜎 𝑗 = 𝑆 −1 𝜎 𝑗 𝑆 ,
but we already know that such a 2 × 2 matrix does not exist. Thus, we see that the Weyl equation
(9.6) is not invariant under space inversion. Of course, the same is true for the second Weyl
equation (9.7) as well. This is the example promised earlier, in Chapter 5 — a simple relativistic
equation involving parity violation. In fact, it is not difficult to realize that the spatial inversion
converts Eq. (9.6) into (9.7) and vice versa. Explicitly, this means that if 𝜓(𝑥) is a solution of
Eq. (9.6), then
$$ \psi_P(x) = \psi(x^0, -\vec{x}) \tag{9.25} $$
satisfies Eq. (9.7).
Next, let us consider the charge conjugation, i.e. an operation of internal symmetry
involving complex conjugation. The corresponding transformation of the wave function can be
written as
𝜓𝐶 (𝑥) = 𝐶𝜓 ∗ (𝑥) . (9.26)
The complex conjugation of Eq. (9.6) reads
𝜕0 𝜓 ∗ + 𝜎 𝑗∗ 𝜕 𝑗 𝜓 ∗ = 0 ,
so that for 𝜓𝐶 one gets
𝜕0 𝜓𝐶 + 𝐶𝜎 𝑗∗𝐶 −1 𝜕 𝑗 𝜓𝐶 = 0 .
So, in order to restore Eq. (9.6) for 𝜓𝐶 , one would like to impose the conditions
𝜎 𝑗 = 𝐶𝜎 𝑗∗𝐶 −1 , 𝑗 = 1, 2, 3 . (9.27)
Taking into account familiar properties of the Pauli matrices, the requirement (9.27) means
𝜎1 = 𝐶𝜎1𝐶 −1 ,
−𝜎2 = 𝐶𝜎2𝐶 −1 , (9.28)
𝜎3 = 𝐶𝜎3𝐶 −1 .

In other words, the matrix 𝐶 should commute with 𝜎1 and 𝜎3 , and anticommute with 𝜎2 . It turns
out that such a matrix does not exist. Proving this is a refreshing algebraic exercise and the reader
is encouraged to do it. (Hint: Use the most general form of 𝐶 written as a linear combination of
the unit matrix and Pauli matrices.) Thus, the Weyl equation (9.6) is not invariant under charge
conjugation (of course, the same is true for Eq. (9.7)).
On the other hand, while there is no similarity transformation between 𝜎 𝑗∗ and 𝜎 𝑗 , such
a transformation between 𝜎 𝑗∗ and −𝜎 𝑗 does exist. Indeed, it holds

𝜎2 𝜎 𝑗∗ 𝜎2−1 = −𝜎 𝑗 (9.29)

(of course, 𝜎2−1 = 𝜎2 ). This in turn means that the transformation (that may be called charge
conjugation)
𝜓𝐶 (𝑥) = 𝜎2 𝜓 ∗ (𝑥) (9.30)
converts a solution of Eq. (9.6) into a solution of (9.7) (and vice versa). Thus, the combination
of the spatial inversion and charge conjugation is a symmetry of the Weyl equation, i.e. if 𝜓(𝑥)
is a solution of Eq. (9.6), then
$$ \psi_{CP}(x) = \sigma_2\,\psi^*(x^0, -\vec{x}) \tag{9.31} $$
is a solution as well (and the same is true for Eq. (9.7)).
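The identity (9.29), on which the above argument rests, is easy to confirm explicitly. The following short sketch (helper names `mul`, `scale` and `conj` are chosen here for illustration only) checks $\sigma_2\sigma_j^*\sigma_2 = -\sigma_j$ for $j = 1, 2, 3$, using $\sigma_2^{-1} = \sigma_2$.

```python
# Pauli matrices as plain nested lists of (complex) numbers.
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply a 2x2 matrix by a number."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def conj(a):
    """Element-wise complex conjugate (no transposition)."""
    return [[complex(x).conjugate() for x in row] for row in a]

# Eq. (9.29): sigma_2 sigma_j^* sigma_2^{-1} = -sigma_j, with sigma_2^{-1} = sigma_2.
eq_9_29_holds = all(
    mul(mul(S2, conj(s)), S2) == scale(-1, s) for s in (S1, S2, S3)
)
print(eq_9_29_holds)
```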
Further, let us examine the possible invariance of the Weyl equation under time reversal.
In analogy with our earlier discussion of the Dirac equation, an appropriate Ansatz for the
corresponding transformation is
𝜓 ′ (𝑥 ′) = 𝐵𝜓 ∗ (𝑥) , (9.32)
where $x' = (-x^0, \vec{x})$. An observant reader will have no problem with finding out that the relevant
condition for the matrix $B$ reads
$$ \Lambda^\mu{}_\nu\,\sigma^{\nu *} = -B^{-1}\sigma^\mu B , \tag{9.33} $$

where the matrix of the Lorentz transformation is now Λ = diag(−1, +1, +1, +1). This means
that 𝐵 should commute with 𝜎2 and anticommute with 𝜎1 and 𝜎3 . An obvious choice is therefore
𝐵 = 𝜎2 . So, if 𝜓(𝑥) is a solution of Eq. (9.6), then

$$ \psi_T(x) = \sigma_2\,\psi^*(-x^0, \vec{x}) \tag{9.34} $$

is a solution as well, and the 𝑇-invariance of the Weyl equation is thereby established.
Note that this last discrete symmetry does not come as a surprise, as it is perfectly
consistent with the famous 𝐶𝑃𝑇 theorem (an interested reader can find the relevant information
e.g. in [6] or [21]): once we have established the invariance under 𝐶𝑃 (see (9.31)), the 𝑇-
invariance must hold as well, since the 𝐶𝑃𝑇 symmetry is guaranteed in the considered case.
One can verify it explicitly: it is easy to see that the Weyl equation is invariant under the
transformation
𝜓 ′ (𝑥 ′) = 𝜓(𝑥) , (9.35)
where $x' = (-x^0, -\vec{x})$, and this is just the $CPT$ transformation.
As the last issue to be discussed in this chapter, we are going to describe briefly the salient
features of the plane-wave solutions of the Weyl equations. We will consider the equation of the
first type (9.6) as an instructive example; the relevant results can be extended in a straightforward
manner to Eq. (9.7). The solutions in question can be written in a form analogous to what we
know from our previous study of the Dirac equation. Thus, we have

𝜓 (+) (𝑥) = u ( 𝑝) 𝑒 −𝑖 𝑝𝑥 (9.36)

and
𝜓 (−) (𝑥) = v ( 𝑝) 𝑒𝑖 𝑝𝑥 , (9.37)
where $p = (p_0, \vec{p})$ with $p_0 = |\vec{p}|$. As the notation employed in (9.36), (9.37) indicates, $\psi^{(+)}$
corresponds to the positive energy $E = |\vec{p}|$ and momentum $\vec{p}$, while $\psi^{(-)}$ carries the negative
energy $E = -|\vec{p}|$ and momentum $-\vec{p}$. Substituting the Ansatz (9.36) into Eq. (9.6), one gets
readily
$$ (\vec{\sigma}\cdot\vec{p})\,u = |\vec{p}|\,u . \tag{9.38} $$
This result is quite remarkable, since it means that

$$ \frac{\vec{\sigma}\cdot\vec{p}}{|\vec{p}|}\,u = u , \tag{9.39} $$
i.e. the solution with positive energy has automatically positive helicity. Similarly, for the
negative energy solution (9.37) one obtains

$$ \frac{\vec{\sigma}\cdot(-\vec{p})}{|\vec{p}|}\,v = -v , \tag{9.40} $$
which means that 𝜓 (−) has negative helicity. From the above exposition it is also clear that for
the second Weyl equation (9.7) the correlation between the energy and helicity is reversed: a
plane-wave solution of Eq. (9.7) with positive energy has negative helicity and a negative energy
plane wave is endowed with positive helicity.
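The helicity condition (9.39) can be illustrated numerically. In the sketch below, the momentum value and the explicit (unnormalized) spinor $u \propto (|\vec{p}| + p_3,\; p_1 + ip_2)$ are assumptions made for the sake of the example; the code simply checks that this two-component spinor satisfies $(\vec{\sigma}\cdot\vec{p})\,u = |\vec{p}|\,u$, i.e. Eq. (9.38).

```python
import math

def sigma_dot(p):
    """The 2x2 matrix sigma.p for a momentum p = (p1, p2, p3)."""
    p1, p2, p3 = p
    return [[p3, p1 - 1j * p2], [p1 + 1j * p2, -p3]]

def apply(mat, vec):
    """Apply a 2x2 matrix to a two-component spinor."""
    return [mat[0][0] * vec[0] + mat[0][1] * vec[1],
            mat[1][0] * vec[0] + mat[1][1] * vec[1]]

def positive_helicity_spinor(p):
    """Unnormalized eigenvector of sigma.p with eigenvalue +|p|
    (illustrative choice; it degenerates for p along the negative z axis)."""
    p1, p2, p3 = p
    abs_p = math.sqrt(p1 * p1 + p2 * p2 + p3 * p3)
    return [abs_p + p3, p1 + 1j * p2]

p = (1.0, 2.0, 2.0)                      # arbitrary example, |p| = 3
abs_p = math.sqrt(sum(c * c for c in p))
u = positive_helicity_spinor(p)

lhs = apply(sigma_dot(p), u)             # (sigma.p) u
residual = max(abs(lhs[i] - abs_p * u[i]) for i in range(2))
print(residual)                          # vanishes up to rounding errors
```

The same spinor, used as the amplitude $v$ of the negative energy solution (9.37), then satisfies (9.40), because the momentum carried by $\psi^{(-)}$ is $-\vec{p}$.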
The fact that for a given energy and momentum one can have just a single value of the
helicity is a clear physical manifestation of the non-invariance of the Weyl equations under space
reflection; we have in mind a natural intuitive picture of the parity violation as the absence of a
“mirror symmetry” between right-handed and left-handed states.
A historical remark is perhaps in order here. The two-component Weyl equation, formu-
lated almost simultaneously with the four-component Dirac equation, was rejected soon after
its birth, just because of the inherent parity violation (the invariance under spatial inversion
was taken for granted then, as an automatic symmetry of the fundamental laws of nature). The
situation changed in 1957 with the advent of the experimentally confirmed parity violation
in weak interactions. The Weyl equation was then rehabilitated and accepted as a proper
description of the neutrino (that was believed for a long time to be strictly massless); thus, the
“two-component Weyl neutrino” was taken to be responsible for the left-right asymmetry of
weak interactions. Such an idea indeed inspired a very successful theory of weak forces, but
now we know that the neutrinos do have tiny masses; so, a true source of the parity violation
must lie deeper. In any case, the Weyl equation(s) still represent a useful theoretical tool in
building models of fundamental physics, where one may start with massless spin-$\tfrac12$ particles
that acquire masses by means of some additional mechanism. From the purely mathematical
point of view, the two-dimensional spinor representations of the Lorentz group, implemented in
the Weyl equations, are the basic building blocks for the construction of any higher-dimensional
representation; this certainly underlines the fundamental nature of Weyl’s original work.

Chapter 10

Wave packets. Zitterbewegung

Up to now we have discussed mostly the plane waves, as the free-particle solutions of the Dirac
equation. Such states are rather boring from the physical point of view, but, as we have already
stressed earlier, their detailed description is important with regard to the perturbative calculations
of the scattering and decay processes to be pursued later on. In fact, there is one point that has
not been clarified yet, namely the relevance of the negative energy states. Naively, one might
be tempted to ignore them completely, but it would not be consistent as they form, along with
the positive energy states, a complete system of solutions of the quantum mechanical Dirac
equation.
Apart from this general aspect, which is common to all models of relativistic quantum
mechanics, the presence of negative energy solutions has an intriguing and rather surprising
manifestation when one considers the time evolution of a wave packet, i.e. a superposition of the
plane waves (that may represent a spatially localized state); here we are alluding to the strange
German word in the title. So, let us now pack the plane waves thoroughly; the German dictionary
will be consulted at the end.
We will start with a packet of the positive energy plane waves, i.e. the superposition we
have in mind may be written as
$$ \Psi^{(+)}(x) = \int \frac{\mathrm{d}^3p}{(2\pi)^{3/2}(2E)^{1/2}} \sum_s b(p,s)\,u(p,s)\,e^{-ipx} , \tag{10.1} $$

where $p = (E, \vec{p})$ with $E = p_0 = (\vec{p}^{\,2} + m^2)^{1/2}$. The reader should not feel uneasy about
the unfamiliar factor (2𝜋) −3/2 (2𝐸) −1/2 (to be denoted for brevity as 𝑁 ( 𝑝) in what follows); it
will prove to be an appropriate contribution to the normalization of the expansion coefficients
𝑏( 𝑝, 𝑠). The sum over the spin label 𝑠 is to be understood as including the two possible states
corresponding e.g. to polarizations ±𝑠. First of all, let us compute the normalization integral;
since we know that the product 𝜓 † 𝜓 in general represents the probability density, the relevant
integral is

$$ I = \int \mathrm{d}^3x\, \Psi^{(+)\dagger}(x)\,\Psi^{(+)}(x) = \int \mathrm{d}^3x \iint \mathrm{d}^3p\,\mathrm{d}^3q\, N(p)N(q) \sum_{s,s'} b(p,s)\,b^*(q,s')\,u^\dagger(q,s')\,u(p,s)\,e^{iqx-ipx} . \tag{10.2} $$

The integration in (10.2) gets simplified readily. First, one has, obviously
$$ \int \mathrm{d}^3x\, e^{iqx-ipx} = e^{i(q_0-p_0)x_0} \int \mathrm{d}^3x\, e^{-i(\vec{q}-\vec{p})\cdot\vec{x}} = (2\pi)^3\,\delta^{(3)}(\vec{p}-\vec{q})\,e^{i(q_0-p_0)x_0} . \tag{10.3} $$

However, since $p_0 = \sqrt{\vec{p}^{\,2}+m^2}$, $q_0 = \sqrt{\vec{q}^{\,2}+m^2}$, the values of $p_0$ and $q_0$ become effectively
equal, due to the presence of $\delta^{(3)}(\vec{p}-\vec{q})$. Further, taking into account that the dependence on $p$
and $q$ in (10.2) means in fact a dependence on $\vec{p}$ and $\vec{q}$, the integral then becomes
$$ I = \int \mathrm{d}^3p\,(2\pi)^3 N^2(p) \sum_{s,s'} b(p,s)\,b^*(p,s')\,u^\dagger(p,s')\,u(p,s) . \tag{10.4} $$

The scalar product $u^\dagger u = \bar u\gamma_0 u$ in (10.4) can be worked out by means of the Gordon identity
(cf. Appendix C). In general, it holds
$$ \bar u(p)\gamma^\mu u(p') = \frac{1}{2m}\,\bar u(p)\bigl[(p+p')^\mu + i\sigma^{\mu\nu}(p-p')_\nu\bigr]u(p') . \tag{10.5} $$
Then one gets
$$ u^\dagger(p,s')\,u(p,s) = \bar u(p,s')\gamma^0 u(p,s) = \frac{1}{2m}\,\bar u(p,s')\,u(p,s)\,2p^0 . \tag{10.6} $$
According to the results summarized in Chapter 7, one has $\bar u(p,s')\,u(p,s) = 2m\,\delta_{ss'}$ and the
expression (10.6) thus becomes
$$ u^\dagger(p,s')\,u(p,s) = 2E\,\delta_{ss'} . \tag{10.7} $$

One then gets, finally,


$$ I = \int \mathrm{d}^3p\,(2\pi)^3\,2E\,N^2(p) \sum_s |b(p,s)|^2 = \int \mathrm{d}^3p \sum_s |b(p,s)|^2 \tag{10.8} $$

(so, now it is also clear why the choice of the normalization factor 𝑁 ( 𝑝) in (10.1) is convenient).
Thus, the expansion coefficients 𝑏( 𝑝, 𝑠) may be normalized according to

$$ \int \mathrm{d}^3p \sum_s |b(p,s)|^2 = 1 . \tag{10.9} $$

One may now employ the wave packet (10.1), normalized properly according to (10.9),
to compute expectation values of various physical quantities. The quantity that we will choose
as an example is the velocity. Admittedly, at first sight it does not look like the most attractive and
urgent case, but we will see that in fact it does lead to interesting and rather unexpected results.
Before starting the calculation, we must make a brief digression concerning the quantum
mechanical picture of the quantity we have in mind. The components of velocity are defined
in terms of the time derivative of the corresponding coordinates, so one should invoke the
Heisenberg picture (representation), in which the observables are time-dependent, in general.
Let us recall that an observable 𝐴(𝑡) in such a picture satisfies the equation
$$ \frac{\mathrm{d}}{\mathrm{d}t}A(t) = i[H, A(t)] , \tag{10.10} $$
where $H$ is the relevant Hamiltonian. So, let us take $A(t)$ to be the coordinate (vector) $\vec{x}(t)$.
Of course, in the non-relativistic case with the Hamiltonian $H = \vec{p}^{\,2}/2m$, one gets from (10.10)
readily $\vec{v} = \vec{p}/m$, if the canonical commutation relations between the momentum and coordinate
components are utilized. For the Dirac Hamiltonian the situation is different. One has
$$ \frac{\mathrm{d}x^k(t)}{\mathrm{d}t} = i[H, x^k(t)] = i[H, e^{iHt}x^k e^{-iHt}] = i\,e^{iHt}[H, x^k]\,e^{-iHt} , \tag{10.11} $$

where now $H = \alpha^j p^j + \beta m$. The commutator in (10.11) is worked out easily; one gets
$$ [H, x^k] = [\alpha^j p^j + \beta m, x^k] = -i\alpha^k , \tag{10.12} $$
as $[p^j, x^k] = -i\delta^{jk}$. Thus, for the operator of velocity one has finally
$$ v^k(t) = \frac{\mathrm{d}x^k(t)}{\mathrm{d}t} = e^{iHt}\alpha^k e^{-iHt} . \tag{10.13} $$
This result is in fact rather strange. The operator on the right-hand side obviously has just the
eigenvalues ±1 (since for 𝛼 𝑘 it is so, and (10.13) represents a similarity transformation); so it
does not comply with the common idea about what the velocity of a relativistic particle should
be (in contrast to the non-relativistic case, which is quite transparent). On the other hand, the
familiar relativistic relation for the particle velocity can be written simply in terms of the energy
and momentum as
$$ \vec{v} = \frac{\vec{p}}{E} . \tag{10.14} $$
Thus, one might try to transform our result (10.13) in an appropriate way to the momentum
representation, and see whether the form (10.14) emerges there. Practically, it means that we
should evaluate the expectation value of the velocity operator in the state described by the
wave packet (10.1), and write it in a form involving the probability density |𝑏( 𝑝, 𝑠)| 2 . For this
purpose, it is convenient to employ the Schrödinger picture of time evolution, because the wave
function (10.1) is already represented so; thus, one has to compute the expectation value of the
time-independent $\vec{\alpha}$ (cf. (10.13)) in the state given by (10.1) (having in mind the normalization
condition (10.9)). We will return to the Heisenberg picture later on.
The quantity we are going to evaluate is given by

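$$ \langle v^j\rangle = \int \mathrm{d}^3x\, \Psi^{(+)\dagger}(x)\,\alpha^j\,\Psi^{(+)}(x) = \int \mathrm{d}^3x \sum_{s,s'} \iint \mathrm{d}^3p\,\mathrm{d}^3q\, N(p)N(q)\,b^*(p,s)\,b(q,s')\,\bar u(p,s)\gamma^j u(q,s')\,e^{ipx-iqx} . \tag{10.15} $$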

Proceeding in the same way as before (cf. (10.2) through (10.4)), one gets first

$$ \langle v^j\rangle = \int \mathrm{d}^3p\,(2\pi)^3 N^2(p) \sum_{s,s'} b^*(p,s)\,b(p,s')\,\bar u(p,s)\gamma^j u(p,s') . \tag{10.16} $$

Next, using the Gordon identity (10.5), one has


$$ \bar u(p,s)\gamma^j u(p,s') = \frac{1}{2m}\,\bar u(p,s)\,u(p,s')\,2p^j = 2p^j\,\delta_{ss'} . \tag{10.17} $$
Substituting this and the value of 𝑁 ( 𝑝) into (10.16), one obtains finally
$$ \langle v^j\rangle = \int \mathrm{d}^3p \sum_s |b(p,s)|^2\,\frac{p^j}{E} . \tag{10.18} $$
So, we have arrived at the envisaged result; the familiar relativistic velocity appears here in the
right place, and this is certainly reassuring.
However, this cannot be the whole story. As we know, the complete system of solutions
of the Dirac equation comprises also the plane waves with amplitudes v ( 𝑝, 𝑠), i.e. the negative
energy states. Thus, the most general wave packet should be written in the form

$$ \Psi(x) = \int \mathrm{d}^3p\, N(p) \sum_s \Bigl[ b(p,s)\,u(p,s)\,e^{-ipx} + d^*(p,s)\,v(p,s)\,e^{ipx} \Bigr] . \tag{10.19} $$

Note that we have used here a conventional notation 𝑑 ∗ ( 𝑝, 𝑠) instead of 𝑑 ( 𝑝, 𝑠) (which would
seem more natural) with regard to later applications of such a formula within the field theory;
please be patient and you will see. As before, we will consider first the normalization integral

$$ I = \int \mathrm{d}^3x\, \Psi^\dagger(x)\,\Psi(x) . $$

Using the form (10.19), one gets for 𝐼 the following long expression:
$$ \begin{aligned} I = \int \mathrm{d}^3x \iint \mathrm{d}^3p\,\mathrm{d}^3q \sum_{s,s'} N(p)N(q)\,\Bigl[\, & b(p,s)\,b^*(q,s')\,\bar u(q,s')\gamma_0 u(p,s)\,e^{-ipx+iqx} + d^*(p,s)\,d(q,s')\,\bar v(q,s')\gamma_0 v(p,s)\,e^{ipx-iqx} \\ & + b^*(q,s')\,d^*(p,s)\,\bar u(q,s')\gamma_0 v(p,s)\,e^{iqx+ipx} + d(q,s')\,b(p,s)\,\bar v(q,s')\gamma_0 u(p,s)\,e^{-iqx-ipx} \Bigr] . \end{aligned} \tag{10.20} $$
The delta functions that emerge, as usual, upon the spatial integration result in the identification
$\vec{q} = \vec{p}$ in the first two terms in the square brackets, and $\vec{q} = -\vec{p}$ in the third and fourth terms.
Further, the time-dependent factor $\exp(2ip_0x_0)$ then survives in the third term and $\exp(-2ip_0x_0)$
in the fourth term. For the remaining algebra one has to employ all variants of the Gordon identity
(see Appendix C). As a result, the expression in the square brackets is simplified considerably:
for the first term one gets $|b(p,s)|^2\,2E\,\delta_{ss'}$ and for the second term $|d(p,s)|^2\,2E\,\delta_{ss'}$, while the
third and the fourth terms vanish. The reader is encouraged to verify these statements with the
help of the identities (C.33).
Now, putting all this together, including also the ubiquitous overall factor (2𝜋) 3 accom-
panying the delta functions, one has finally
$$ \int \mathrm{d}^3x\, \Psi^\dagger(x)\,\Psi(x) = \int \mathrm{d}^3p \sum_s \Bigl[ |b(p,s)|^2 + |d(p,s)|^2 \Bigr] . \tag{10.21} $$

Up to now, everything seems to be OK. So, let us compute the expectation value of the velocity,
in order to generalize our previous result. As before, it is given by
$$ \langle v^k\rangle = \int \mathrm{d}^3x\, \Psi^\dagger(x)\,\alpha^k\,\Psi(x) = \int \mathrm{d}^3x\, \bar\Psi(x)\,\gamma^k\,\Psi(x) . \tag{10.22} $$

The calculation is straightforward and proceeds along the same lines as before. In comparison
with the previous case of the restricted wave packet there is more algebraic work involving the
Gordon identities and the result is rather long. It reads
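$$ \begin{aligned} \langle v^k\rangle = {} & \int \mathrm{d}^3p \sum_s \Bigl[ |b(p,s)|^2 + |d(p,s)|^2 \Bigr] \frac{p^k}{E} \\ & + \int \mathrm{d}^3p \sum_{s,s'} \frac{i}{2m} \Bigl[ b^*(\tilde p,s')\,d^*(p,s)\,\bar u(\tilde p,s')\,\sigma^{k0}\,v(p,s)\,e^{2iEx_0} \\ & \qquad\qquad - d(\tilde p,s')\,b(p,s)\,\bar v(\tilde p,s')\,\sigma^{k0}\,u(p,s)\,e^{-2iEx_0} \Bigr] \\ & + \int \mathrm{d}^3p \sum_{s,s'} \frac{1}{2m} \Bigl[ d(\tilde p,s')\,b(p,s)\,\bar v(\tilde p,s')\,u(p,s)\,e^{-2iEx_0} \\ & \qquad\qquad - b^*(\tilde p,s')\,d^*(p,s)\,\bar u(\tilde p,s')\,v(p,s)\,e^{2iEx_0} \Bigr] \frac{p^k}{E} , \end{aligned} \tag{10.23} $$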

where we have denoted $\tilde p = (p_0, -\vec{p})$. Note that in several books one may find the expression
(10.23) without the last two lines (see e.g. [1, 6]); at one place there is even an explicit statement
that such terms vanish (see [5]). It is not true, since, in general, $\bar v(\tilde p,s')\,u(p,s) \neq 0$ and
similarly for $\bar u(\tilde p,s')\,v(p,s)$. In any case, the result (10.23) is quite intriguing. Apart from the

“normal” term in the first line, there are “anomalous” time-dependent oscillatory terms involving
an interference of positive and negative energy waves, which are a priori unexpected and their
physical interpretation is not clear (the point of the conundrum is that there is no external force
to cause the oscillations). Obviously, the frequency of such an oscillatory motion is at
least $2m$ (i.e. $2mc^2/\hbar$ in ordinary units), which for the electron amounts to about $10^{21}\,\mathrm{s}^{-1}$. This
effect was observed theoretically for the first time in 1930 by E. Schrödinger, who called it, in
German, die Zitterbewegung (it means “jittery”, or “quivering” motion).
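The order of magnitude quoted above for the electron can be reproduced with a few lines of arithmetic. The sketch below is merely illustrative; the numerical constants are standard values of $\hbar$ and of the electron rest energy inserted here by hand, and the function name is hypothetical.

```python
HBAR_MEV_S = 6.582119569e-22      # reduced Planck constant in MeV*s
ELECTRON_MC2_MEV = 0.51099895     # electron rest energy in MeV

def zitterbewegung_min_angular_frequency(rest_energy_mev):
    """Lower bound 2 m c^2 / hbar on the oscillation (angular) frequency, in 1/s."""
    return 2.0 * rest_energy_mev / HBAR_MEV_S

omega_min = zitterbewegung_min_angular_frequency(ELECTRON_MC2_MEV)
print(f"{omega_min:.2e} s^-1")    # about 1.55e+21 s^-1
```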
Such a phenomenon may cast some doubt on the interpretation of the quantity in question
as a true velocity expectation value, and this in turn is related to the interpretation of the
space coordinates as the true quantum mechanical observables; it is indeed so that the operator
of multiplication by the space coordinate is not satisfactory in this context (see e.g. [8, 24]).
In general, the issue of a correct definition of a position operator is one of the fundamental
consistency problems of relativistic quantum mechanics.
On the other hand, even quite recently there have been serious attempts to prove or
disprove experimentally the reality of the “Zitterbewegung” (not directly for electrons in the
vacuum, but for some more appropriate materials) and some results seem to be positive (see
e.g. [38]). Anyway, one should say, with due caution, that this enigmatic issue is still not settled
at present. An illuminating discussion of Zitterbewegung can be found also in the book [4].
The Schrödinger picture of time evolution makes it clear that the phenomenon of Zit-
terbewegung is intimately related to the presence of negative energy waves in the packet. The
anomalous terms contributing to the particle velocity can be also represented in a compact form
in the Heisenberg picture. Below we describe it briefly, following the treatment presented in the
book [11].
One may start with Eq. (10.13). For convenience, we will denote the expression on its
right-hand side as $\vec{\alpha}(t)$. It satisfies the differential equation (Heisenberg equation of motion)
$$ \dot{\vec{\alpha}}(t) = i[H, \vec{\alpha}(t)] , \tag{10.24} $$

where we use, for the sake of brevity, Newton’s dot notation for the time derivative. Obviously,
the commutator can be recast as
$$ [H, \vec{\alpha}(t)] = e^{iHt}[H, \vec{\alpha}]\,e^{-iHt} = e^{iHt}\bigl(\{H, \vec{\alpha}\} - 2\vec{\alpha}H\bigr)e^{-iHt} = e^{iHt}\bigl(2\vec{p} - 2\vec{\alpha}H\bigr)e^{-iHt} = 2\vec{p} - 2\vec{\alpha}(t)H , $$
where we have utilized the familiar anticommutation relations for $\vec{\alpha}$, as well as the fact that $H$
commutes with $\vec{p}$. Thus, Eq. (10.24) becomes
$$ \dot{\vec{\alpha}}(t) = 2i\vec{p} - 2i\vec{\alpha}(t)H . \tag{10.25} $$

Now, one may differentiate Eq. (10.25) with respect to time, taking into account that $\vec{p}$ and $H$
are constants of motion; one gets
$$ \ddot{\vec{\alpha}}(t) = -2i\,\dot{\vec{\alpha}}(t)H , \tag{10.26} $$
and this can be readily integrated so that
$$ \dot{\vec{\alpha}}(t) = \dot{\vec{\alpha}}(0)\,e^{-2iHt} . \tag{10.27} $$

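However, from (10.25) one has
$$ \dot{\vec{\alpha}}(0) = 2i\vec{p} - 2i\vec{\alpha}H = -2i\bigl(\vec{\alpha} - \vec{p}H^{-1}\bigr)H , \tag{10.28} $$
and substituting this into (10.27), the expression for $\dot{\vec{\alpha}}(t)$ becomes
$$ \dot{\vec{\alpha}}(t) = -2i\bigl(\vec{\alpha} - \vec{p}H^{-1}\bigr)H\,e^{-2iHt} . \tag{10.29} $$

Further, one may use it in (10.25) and thus we end up with a relation for $\vec{\alpha}(t)$, namely
$$ -2i\bigl(\vec{\alpha} - \vec{p}H^{-1}\bigr)H\,e^{-2iHt} = 2i\vec{p} - 2i\vec{\alpha}(t)H , $$
from which one obtains, after a simple manipulation,
$$ \vec{\alpha}(t) = \frac{\vec{p}}{H} + \Bigl(\vec{\alpha} - \frac{\vec{p}}{H}\Bigr)e^{-2iHt} \tag{10.30} $$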

(where, of course, 1/𝐻 means 𝐻 −1 ). A technical remark is in order here. We have reproduced
here the calculation presented in [11], since it is a very efficient and elegant way of obtaining
the result (10.30). An alternative procedure would be to start with (10.13) and employ the
well-known general formula
$$ e^{A}Be^{-A} = B + \frac{1}{1!}[A, B] + \frac{1}{2!}[A, [A, B]] + \ldots \tag{10.31} $$
along with the commutator
$$ [H, \vec{\alpha}] = 2\vec{p} - 2\vec{\alpha}H $$
shown above. Taking into account that $[H, \vec{p}] = 0$ and proceeding with due care, the result
(10.30) can be recovered (as usual, a diligent reader is encouraged to perform this calculation).
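For a reader who prefers a numerical cross-check to either analytic route, the result (10.30) can be tested on a momentum eigenstate, where $\vec{p}$ acts as an ordinary number and $H$ becomes a 4 × 4 matrix. The sketch below makes several assumptions that are not part of the derivation above: it uses the standard (Dirac) representation of $\alpha^k$ and $\beta$, arbitrary illustrative values of $m$, $\vec{p}$ and $t$, and the closed form $e^{iHt} = \cos(Et) + i\,(H/E)\sin(Et)$ that follows from $H^2 = E^2$. All helper names are hypothetical.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def alpha_from(s):
    """Dirac-representation alpha^k = [[0, sigma_k], [sigma_k, 0]]."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

ALPHA = [alpha_from(s) for s in PAULI]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ID4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def max_abs_diff(a, b):
    """Largest element-wise deviation between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# Illustrative numbers in natural units; p is a c-number on a momentum eigenstate.
m, p = 1.0, (0.3, -0.4, 1.2)
E = math.sqrt(m * m + sum(c * c for c in p))

H = lin(m, BETA, 0, BETA)                 # H = alpha.p + beta m
for j in range(3):
    H = lin(1, H, p[j], ALPHA[j])

def exp_iHt(t):
    """e^{iHt} = cos(Et) + i sin(Et) H/E, valid because H^2 = E^2."""
    return lin(math.cos(E * t), ID4, 1j * math.sin(E * t) / E, H)

k, t = 2, 0.7
# Left-hand side of (10.30): alpha^k(t) = e^{iHt} alpha^k e^{-iHt}, cf. (10.13).
direct = mul(mul(exp_iHt(t), ALPHA[k]), exp_iHt(-t))
# Right-hand side of (10.30), using H^{-1} = H / E^2.
p_over_H = lin(p[k] / E**2, H, 0, H)
formula = lin(1, p_over_H,
              1, mul(lin(1, ALPHA[k], -1, p_over_H), exp_iHt(-2 * t)))

err_H2 = max_abs_diff(mul(H, H), lin(E * E, ID4, 0, ID4))          # H^2 = E^2
err_anticomm = max_abs_diff(
    lin(1, mul(H, ALPHA[k]), 1, mul(ALPHA[k], H)),                 # {H, alpha^k}
    lin(2 * p[k], ID4, 0, ID4))                                    # = 2 p^k
err_10_30 = max_abs_diff(direct, formula)
print(err_H2, err_anticomm, err_10_30)    # all at the level of rounding errors
```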
The first term on the right-hand side of (10.30) is the “normal” one, since it corresponds
to the standard relation for velocity of a relativistic particle (10.14). The second term is “anoma-
lous”, as it is time-dependent (oscillatory) and is clearly responsible for the Zitterbewegung that
we have uncovered previously. Finally, it is also instructive to get an expression for the time
evolution of the coordinate. We have
$$ \dot{\vec{x}}(t) = \vec{\alpha}(t) = \frac{\vec{p}}{H} + \Bigl(\vec{\alpha} - \frac{\vec{p}}{H}\Bigr)e^{-2iHt} , $$
and integrating it we obtain
$$ \vec{x}(t) = \vec{a} + \frac{\vec{p}}{H}\,t + \frac{i}{2}\Bigl(\vec{\alpha} - \frac{\vec{p}}{H}\Bigr)H^{-1}e^{-2iHt} , \tag{10.32} $$

where $\vec{a}$ is a constant. The last term on the right-hand side of (10.32) is another representation
of the quivering motion of a free particle, the mysterious Zitterbewegung. One may also notice
that the amplitude of such oscillations is at most of the order $1/m$, the Compton wavelength of
the particle, since for a definite momentum and energy $1/E \le 1/m$ and, of course, $|\vec{p}/E| \le 1$.
We have seen that the (inevitable) presence of the negative energy solutions leads to an
unexpected and counter-intuitive behaviour of a wave packet representing, in general, a localized
state of a Dirac particle. In this context, one may naturally ask when the admixture of the negative
energy waves can be substantial, or when it is negligible. Technically, it is the problem of the
relative magnitude of the expansion coefficients 𝑏( 𝑝, 𝑠) and 𝑑 ( 𝑝, 𝑠).

As an explicit illustration, let us consider an example of a wave packet that for 𝑡 = 0 has
the Gaussian form
$$ \Psi(0, \vec{x}) = \frac{1}{(\pi a^2)^{3/4}}\,\exp\Bigl(-\frac{\vec{x}^{\,2}}{2a^2}\Bigr)\,w , \tag{10.33} $$
where $w$ is e.g.
$$ w = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} . \tag{10.34} $$
Note that $\Psi(0, \vec{x})$ given by (10.33) is normalized to unity. It represents an initial condition for a
time-dependent solution of the Dirac equation that can be generally written in the form (10.19).
Setting $x_0 = 0$ in (10.19) one has
$$ \Psi(0, \vec{x}) = \int \mathrm{d}^3p\, N(p) \sum_s \Bigl[ b(p,s)\,u(p,s)\,e^{i\vec{p}\cdot\vec{x}} + d^*(p,s)\,v(p,s)\,e^{-i\vec{p}\cdot\vec{x}} \Bigr] . \tag{10.35} $$

Evaluation of the expansion coefficients in (10.35) amounts to the implementation of inverse
Fourier transformation; practically, it means that one multiplies (10.35) by $e^{-i\vec{q}\cdot\vec{x}}$ and integrates
over $\vec{x}$. The calculation is not difficult and the result is
$$ \Bigl(\frac{a^2}{\pi}\Bigr)^{3/4} e^{-\frac12 a^2\vec{q}^{\,2}}\,w = \frac{1}{\sqrt{2E(q)}} \sum_s \Bigl[ b(q,s)\,u(q,s) + d^*(\tilde q,s)\,v(\tilde q,s) \Bigr] , \tag{10.36} $$

where we have denoted, as before, $\tilde q = (q_0, -\vec{q})$ and, of course, $E(q) = (\vec{q}^{\,2} + m^2)^{1/2}$. In the
sequel, we will change the notation (just for convenience), writing $p$ instead of $q$. The coefficients
$b(p,s)$ and $d^*(p,s)$ can now be determined with the help of the familiar orthogonality relations
for $u$ and $v$. For instance, multiplying Eq. (10.36) by $u^\dagger(p,s')$ from the left and using the Gordon
identities, one gets
$$ u^\dagger(p,s')\,u(p,s) = 2E\,\delta_{ss'} \tag{10.37} $$
and
$$ u^\dagger(p,s')\,v(\tilde p,s) = 0 . \tag{10.38} $$
Similarly, the multiplication by $v^\dagger(\tilde p,s')$ yields
$$ v^\dagger(\tilde p,s')\,u(p,s) = 0 \tag{10.39} $$
and
$$ v^\dagger(\tilde p,s')\,v(\tilde p,s) = 2E\,\delta_{ss'} . \tag{10.40} $$
Final results for the expansion coefficients then read
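$$ \begin{aligned} b(p,s) &= \frac{1}{(2E)^{1/2}} \Bigl(\frac{a^2}{\pi}\Bigr)^{3/4} e^{-\frac12 a^2\vec{p}^{\,2}}\,u^\dagger(p,s)\,w , \\ d^*(\tilde p,s) &= \frac{1}{(2E)^{1/2}} \Bigl(\frac{a^2}{\pi}\Bigr)^{3/4} e^{-\frac12 a^2\vec{p}^{\,2}}\,v^\dagger(\tilde p,s)\,w . \end{aligned} \tag{10.41} $$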

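Now, to get explicit values of the functions (10.41), one may use the form of $u(p, s)$ and $v(p, s)$
discussed in Chapter 5. It is
\[
u(p, s) = \sqrt{E + m} \begin{pmatrix} \varphi^{(r)} \\ \dfrac{\vec{\sigma} \cdot \vec{p}}{E + m}\, \varphi^{(r)} \end{pmatrix} , \qquad
v(p, s) = \sqrt{E + m} \begin{pmatrix} \dfrac{\vec{\sigma} \cdot \vec{p}}{E + m}\, \varphi^{(r)} \\ \varphi^{(r)} \end{pmatrix} , \tag{10.42}
\]
where $\varphi^{(r)}$, with $r = 1, 2$, is e.g. $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ or $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Choosing $\varphi^{(r)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, one gets non-trivial values of
the expressions (10.41):
\[
\begin{aligned}
d^*(p, s) = d(p, s) &= \left( \frac{E + m}{2E} \right)^{1/2} \left( \frac{a^2}{\pi} \right)^{3/4} e^{-\frac{1}{2} a^2 \vec{p}^{\,2}}\, \frac{p_3}{E + m} , \\
b(p, s) &= \left( \frac{E + m}{2E} \right)^{1/2} \left( \frac{a^2}{\pi} \right)^{3/4} e^{-\frac{1}{2} a^2 \vec{p}^{\,2}} .
\end{aligned} \tag{10.43}
\]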

Thus, the ratio of typical non-zero expansion coefficients is quite simple, namely
\[
\left| \frac{d^*(p, s)}{b(p, s)} \right| = \frac{|p_3|}{E + m} \lesssim \frac{|\vec{p}|}{E + m} . \tag{10.44}
\]
Now we are in a position to find out when the contribution of negative energy waves may be
substantial. In any case, the exponential factor $\exp\left( -\frac{1}{2} a^2 \vec{p}^{\,2} \right)$ has non-negligible value (both for
$b$ and $d$) if $|\vec{p}|\, a \lesssim 1$, i.e.
\[
|\vec{p}| \lesssim \frac{1}{a} . \tag{10.45}
\]
As regards the magnitude of the constant $a$ (the width of the wave packet), there are two
possibilities. First,
\[
a \gg \frac{1}{m} \quad \text{i.e.} \quad \frac{1}{a} \ll m . \tag{10.46}
\]
In such a situation, in the region (10.45) of the non-negligible values of $\exp\left( -\frac{1}{2} a^2 \vec{p}^{\,2} \right)$ one has
$|\vec{p}| \lesssim 1/a \ll m$, i.e. $|\vec{p}| \ll m$, and according to (10.44) one then has $|d/b| \ll 1$, i.e. the
contribution of the negative energy waves is suppressed. Second,
\[
a \lesssim \frac{1}{m} \quad \text{i.e.} \quad \frac{1}{a} \gtrsim m . \tag{10.47}
\]
In this case, for $|\vec{p}| \simeq 1/a$ one has $|\vec{p}| \gtrsim m$ and the ratio $d/b$ is not suppressed for such values
of $|\vec{p}|$.
Note that the condition (10.46) corresponds to a particle localized in a region much larger
than its Compton wavelength. Our example thus demonstrates that a substantial contribution
of negative energy states may be expected for a wave packet localized in a region with linear
dimension of the order of Compton wavelength or less.
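The estimate above can be illustrated numerically. The following minimal sketch evaluates the bound $|\vec{p}|/(E + m)$ from (10.44) at the typical momentum $|\vec{p}| = 1/a$ allowed by (10.45), for several packet widths. The function name and the choice of units ($m = 1$, so that $a$ is measured in Compton wavelengths) are assumptions made only for this illustration.

```python
import math


def negative_energy_ratio_bound(a, m=1.0):
    """Bound |p|/(E + m) on |d/b| from (10.44), taken at the typical momentum |p| = 1/a."""
    p = 1.0 / a
    energy = math.sqrt(p * p + m * m)
    return p / (energy + m)


if __name__ == "__main__":
    # Packet widths a in units of the Compton wavelength 1/m (illustrative values).
    for a in (100.0, 10.0, 1.0, 0.1):
        print(f"a = {a:6.1f}/m   |d/b| bound = {negative_energy_ratio_bound(a):.4f}")
```

For a width of a hundred Compton wavelengths the bound is well below one percent, while for $a \lesssim 1/m$ it becomes of order one, in line with the two cases (10.46) and (10.47).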

A physical system with the size corresponding to (10.46) is e.g. the hydrogen atom, where
the electron is localized in a region with the linear dimension characterized by the Bohr radius;
in our system of units it is
\[
a_{\text{Bohr}} = \frac{1}{m\alpha} , \tag{10.48}
\]
where $\alpha$ is the fine-structure constant, $\alpha \simeq 1/137$ (thus, the size of the hydrogen atom is about
two orders of magnitude larger than the Compton wavelength of the electron). Of course, our
preceding considerations concerning the wave packets correspond to localized free particles, but
one might hope that even for the electron bound in an external field one could get results that are
not heavily contaminated by the puzzling effects of relativistic quantum mechanics mentioned
above. Such an expectation is strengthened by the fact that the hydrogen atom is described quite
successfully even within the non-relativistic quantum mechanics.
Needless to say, these intuitive arguments turn out to be true. The treatment of the
hydrogen atom within the framework of relativistic quantum mechanics has become one of
the early triumphs of the Dirac equation. The relevant calculations were carried out in 1928
independently by Walter Gordon and Charles Darwin (the grandson of the famous Charles
Darwin) and, among other things, explained very naturally the so-called fine structure of the
hydrogen energy levels. Although the calculation is really impressive (but also long and tedious),
it will not be reproduced here, since we would like to proceed faster on our way to quantum field
theory. The interested reader can find the details e.g. in [1], [6] or [7].

Chapter 11

Klein paradox

As the title of this chapter indicates, we are going to discuss another intriguing phenomenon
that emerges within the framework of relativistic quantum mechanics. This time, the effect in
question concerns the theoretical description of a simple scattering process involving a one-
dimensional potential step. The reader may remember that solving a problem like this is a
traditional mundane topic of one of the first technical exercises in an introductory course of the
non-relativistic quantum mechanics. There one can demonstrate some effects that are impossible
classically (such as quantum tunnelling), but otherwise well understood and interpreted in terms
of the behaviour of the particle wave function. In contrast to that, here we will see that in
the relativistic case one encounters some surprising counterintuitive new phenomena that may
reveal limits of applicability of relativistic quantum mechanics as a single-particle theory.
So, let us formulate the problem we have in mind. We are going to consider a Dirac
particle moving in a static external field, described in terms of the corresponding potential energy
that has the form
\[
V(\vec{x}) = V(z) , \qquad
V(z) = \begin{cases} 0 & \text{for } z < 0 , \\ V & \text{for } z \ge 0 , \end{cases} \tag{11.1}
\]

where 𝑉 is a constant, 𝑉 > 0. Thus, the potential energy defining our model depends non-
trivially just on the 3rd coordinate 𝑧 and has the idealized shape of a “potential step” with sharp
boundary. Suppose that the particle is in a stationary state with a definite energy 𝐸 > 0. The
time-independent part of the wave function then satisfies the equation
\[
\left( -i \vec{\alpha} \cdot \vec{\nabla} + \beta m + V(\vec{x}) \right) \psi(\vec{x}) = E \psi(\vec{x}) . \tag{11.2}
\]

In view of the form of our $V(\vec{x})$ we may examine the solution of Eq. (11.2) separately in the
two distinct regions indicated in (11.1); for convenience, the half-spaces with $z < 0$ and $z \ge 0$
will be denoted as the regions I and II, respectively. Thus, we have two simple equations to be
solved, namely
\[
\left( -i \vec{\alpha} \cdot \vec{\nabla} + \beta m \right) \psi_{\mathrm{I}}(\vec{x}) = E \psi_{\mathrm{I}}(\vec{x}) \tag{11.3}
\]
in the region I, and
\[
\left( -i \vec{\alpha} \cdot \vec{\nabla} + \beta m \right) \psi_{\mathrm{II}}(\vec{x}) = (E - V) \psi_{\mathrm{II}}(\vec{x}) \tag{11.4}
\]
in the region II. Formally, on the left-hand side of both equations one has the free-particle Dirac
Hamiltonian, so that one may look for solutions of the type $\psi(\vec{x}) = u(k) \exp(i \vec{k}\, \vec{x})$. It makes
sense to choose the momentum variable as $\vec{k} = (0, 0, k)$; with such a choice, solving equations
(11.3), (11.4) then turns out to be effectively a one-dimensional problem, since the action of
$\partial/\partial x$ and $\partial/\partial y$ on the wave function becomes trivial. Thus, one is left with
\[
\left( -i \alpha^3 \frac{\partial}{\partial z} + \beta m \right) \psi_{\mathrm{I}}(z) = E \psi_{\mathrm{I}}(z) \tag{11.5}
\]
and
\[
\left( -i \alpha^3 \frac{\partial}{\partial z} + \beta m \right) \psi_{\mathrm{II}}(z) = (E - V) \psi_{\mathrm{II}}(z) . \tag{11.6}
\]
Now we are in a position to specify the physical boundary conditions to be imposed on the wave
function in question. The setting we have in mind corresponds to one-dimensional scattering,
where one has to consider incident and reflected waves in the region I, along with the transmitted
wave in the region II. The incident wave can be written as
\[
\psi_{\text{inc.}}(z) = u(p)\, e^{i p z} = \begin{pmatrix} \varphi \\ \chi \end{pmatrix} e^{i p z} , \tag{11.7}
\]
with $p > 0$ (it corresponds to the particle motion toward the potential step). The four-component
amplitude $u(p)$ is conveniently described in terms of its two-component upper and lower parts
$\varphi$ and $\chi$; the $\varphi$ can be chosen e.g. as
\[
\varphi = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \tag{11.8}
\]
and $\chi$ is then obtained by solving Eq. (11.5). Working in the standard representation for Dirac
matrices, one has
\[
\alpha^3 = \begin{pmatrix} 0 & \sigma_3 \\ \sigma_3 & 0 \end{pmatrix} , \tag{11.9}
\]
and Eq. (11.5) becomes
\[
p \sigma_3 \chi + m \varphi = E \varphi , \tag{11.10}
\]
\[
p \sigma_3 \varphi - m \chi = E \chi . \tag{11.11}
\]
Thus, Eq. (11.11) yields
\[
\chi = \frac{p}{E + m}\, \sigma_3 \varphi , \tag{11.12}
\]
and substituting this into Eq. (11.10), one of course gets the expected relation $p^2 = E^2 - m^2$.
So, using (11.8) and (11.12), one may write finally
\[
\psi_{\text{inc.}}(z) = \begin{pmatrix} 1 \\ 0 \\ \dfrac{p}{E + m} \\ 0 \end{pmatrix} e^{i p z} . \tag{11.13}
\]

Note that our choice (11.8) thus corresponds to an incident particle with positive helicity.
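Next, a solution corresponding to the reflected particle should carry the momentum $-p$,
and one cannot exclude, a priori, the helicity flip; thus, a general form of the reflected wave can
be written as
\[
\psi_{\text{refl.}} = a\, e^{-i p z} \begin{pmatrix} \varphi_1 \\ \chi_1 \end{pmatrix} + b\, e^{-i p z} \begin{pmatrix} \varphi_2 \\ \chi_2 \end{pmatrix} , \tag{11.14}
\]
where
\[
\varphi_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \varphi_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \tag{11.15}
\]
and $\chi_1$, $\chi_2$ are obtained, as before, by solving Eq. (11.5). In analogy with (11.12) one gets here
\[
\chi_{1,2} = \frac{-p}{E + m}\, \sigma_3 \varphi_{1,2} , \tag{11.16}
\]
so that (11.14) becomes
\[
\psi_{\text{refl.}}(z) = a\, e^{-i p z} \begin{pmatrix} 1 \\ 0 \\ \dfrac{-p}{E + m} \\ 0 \end{pmatrix} + b\, e^{-i p z} \begin{pmatrix} 0 \\ 1 \\ 0 \\ \dfrac{p}{E + m} \end{pmatrix} . \tag{11.17}
\]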
Finally, let us consider the transmitted wave that represents the particle penetrating the
potential step for $z \ge 0$. Obviously, one may employ an Ansatz
\[
\psi_{\text{trans.}} = c\, e^{i q z} \begin{pmatrix} \varphi_1 \\ \chi_1 \end{pmatrix} + d\, e^{i q z} \begin{pmatrix} \varphi_2 \\ \chi_2 \end{pmatrix} , \tag{11.18}
\]
which is supposed to satisfy Eq. (11.6). The relevant equations for any pair $\varphi$, $\chi$ are now
obtained from (11.10), (11.11) by replacing $E$ with $E - V$. So, we have
\[
q \sigma_3 \chi + m \varphi = (E - V) \varphi , \tag{11.19}
\]
\[
q \sigma_3 \varphi - m \chi = (E - V) \chi . \tag{11.20}
\]
Eq. (11.20) yields immediately
\[
\chi = \frac{q}{E - V + m}\, \sigma_3 \varphi , \tag{11.21}
\]
and substituting this back into Eq. (11.19) one gets, after a trivial manipulation,
\[
q^2 = (E - V)^2 - m^2 \tag{11.22}
\]
(as expected, it corresponds to the substitution $E \to E - V$ in the previous relation for $p^2$).
Eq. (11.18) now reads
\[
\psi_{\text{trans.}}(z) = c\, e^{i q z} \begin{pmatrix} 1 \\ 0 \\ \dfrac{q}{E - V + m} \\ 0 \end{pmatrix} + d\, e^{i q z} \begin{pmatrix} 0 \\ 1 \\ 0 \\ \dfrac{-q}{E - V + m} \end{pmatrix} , \tag{11.23}
\]

with 𝑞 given by (11.22). With the expressions (11.13), (11.17) and (11.23) at hand, one may
impose the natural condition of continuity of the wave function at 𝑧 = 0, the edge of the potential
step. Explicitly, the condition reads

𝜓I (0) = 𝜓II (0) , (11.24)

where
𝜓I (𝑧) = 𝜓inc. (𝑧) + 𝜓refl. (𝑧) (11.25)

and
𝜓II (𝑧) = 𝜓trans. (𝑧) . (11.26)
The “matching” condition (11.24) then becomes, after a trivial manipulation,
\[
1 + a = c , \tag{11.27}
\]
\[
b = d , \tag{11.28}
\]
\[
\frac{p}{E + m} + a\, \frac{-p}{E + m} = c\, \frac{q}{E - V + m} , \tag{11.29}
\]
\[
b\, \frac{p}{E + m} = \frac{-q}{E - V + m}\, d . \tag{11.30}
\]
It is easy to see that the equations (11.28) and (11.30) imply $b = d = 0$, i.e. the particle helicity
is not flipped in a reflection or transmission event. The equations (11.27) and (11.29) yield
\[
a = \frac{1 - r}{1 + r} , \qquad c = \frac{2}{1 + r} , \tag{11.31}
\]
where
\[
r = \frac{E + m}{E - V + m}\, \frac{q}{p} . \tag{11.32}
\]
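As a quick cross-check of (11.31) and (11.32), the following minimal sketch computes $a$ and $c$ for sample values of $E$, $V$ and $m$ and evaluates the residuals of the matching conditions (11.27) and (11.29). It is restricted, as an assumption of this illustration, to the simplest regime $E - V > m$, where $q$ is real and positive; the function names are hypothetical.

```python
import math


def step_amplitudes(E, V, m=1.0):
    """Return (p, q, a, c) for the regime E - V > m, where q is real and positive."""
    if not (E - V > m > 0):
        raise ValueError("this sketch assumes E - V > m > 0")
    p = math.sqrt(E * E - m * m)           # from p^2 = E^2 - m^2
    q = math.sqrt((E - V) ** 2 - m * m)    # from (11.22)
    r = (E + m) / (E - V + m) * q / p      # kinematical factor (11.32)
    a = (1 - r) / (1 + r)                  # reflected amplitude, (11.31)
    c = 2 / (1 + r)                        # transmitted amplitude, (11.31)
    return p, q, a, c


def matching_residuals(E, V, m=1.0):
    """Left-hand side minus right-hand side of (11.27) and of (11.29)."""
    p, q, a, c = step_amplitudes(E, V, m)
    upper = (1 + a) - c
    lower = (p / (E + m)) * (1 - a) - c * q / (E - V + m)
    return upper, lower


if __name__ == "__main__":
    print(matching_residuals(3.0, 1.0))    # both entries should vanish up to rounding
```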
Let us now examine the character of the solution in the region II (i.e. the transmitted
wave), in dependence on the particle energy 𝐸 and the potential step height 𝑉. For this purpose,
the key relation is Eq. (11.22). First, if |𝐸 − 𝑉 | < 𝑚, 𝑞 2 is negative, and 𝑞 is thus purely
imaginary. It is reasonable to choose 𝑞 = 𝑖|𝑞|; the transmitted wave then decreases exponentially
inside the step (the choice 𝑞 = −𝑖|𝑞| would lead to exponential growth of the wave function,
which is an unacceptable behaviour). Note that the condition |𝐸 − 𝑉 | < 𝑚 means

𝑉 −𝑚 < 𝐸 <𝑉 +𝑚. (11.33)

Thus, 𝐸 − 𝑚 < 𝑉, i.e. the kinetic energy is less than 𝑉; the exponential decay of the wave
function inside the step is in agreement with the familiar result known from the analysis of the
non-relativistic Schrödinger equation in an analogous situation.
The second possibility is |𝐸 − 𝑉 | > 𝑚, which means either

𝐸 −𝑉 > 𝑚, i.e. 𝐸 −𝑚 >𝑉 (11.34)

or
𝐸 − 𝑉 < −𝑚 , i.e. 𝐸 <𝑉 −𝑚. (11.35)
Equivalently, (11.35) yields a condition for 𝑉:

𝑉 > 𝐸 +𝑚. (11.36)

In this case, (11.22) tells us that 𝑞 2 > 0, so that the transmitted wave has oscillatory character:
the solution in question behaves as exp(±𝑖|𝑞|𝑧). If (11.34) holds, such a result is quite transparent
and corresponds to the non-relativistic description (the kinetic energy 𝐸 − 𝑚 is greater than the
potential step height and the free motion inside is classically feasible). On the other hand, the
situation corresponding to the condition (11.35) is rather different. Here the kinetic energy is
certainly less than 𝑉, but the transmitted wave still has an oscillatory character! Such a result is
astonishing, if one relies on an earlier experience based on classical physics and non-relativistic
quantum mechanics. It simply means that an incident particle with kinetic energy less than
potential energy for 𝑧 ≥ 0 can get arbitrarily far inside the step, i.e. for any 𝑧 → +∞ (because of

an undamped wave function). In this context, the relation (11.36) is also remarkable; according
to this condition, the height of the potential barrier is quite big, certainly at least 2𝑚, twice
the rest energy of the incident particle. The counterintuitive behaviour of the wave function
described above is the core of the effect called commonly the Klein paradox, since Oskar Klein
was the first who came across such a conundrum in 1929.
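The case distinction discussed above depends only on the sign of $q^2$ in (11.22) and on the sign of $E - V$. A minimal sketch of this classification follows; the function name and the returned labels are chosen here for illustration only.

```python
def transmitted_wave_character(E, V, m=1.0):
    """Classify the transmitted wave by the sign of q^2 = (E - V)^2 - m^2, Eq. (11.22)."""
    q_squared = (E - V) ** 2 - m ** 2
    if q_squared < 0:
        return "damped"                        # (11.33): V - m < E < V + m
    if E - V > m:
        return "oscillatory, above the step"   # (11.34): classically allowed motion
    if E - V < -m:
        return "oscillatory, Klein region"     # (11.35), equivalently V > E + m
    return "threshold"                         # q = 0 exactly


if __name__ == "__main__":
    for V in (0.5, 2.0, 4.0):
        print(V, transmitted_wave_character(2.0, V))
```

With $E = 2m$, the three sample heights $V = 0.5m$, $2m$ and $4m$ fall into the three regimes, the last one being the Klein region $V > E + m$.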
Preceding considerations can be supplemented with the evaluation of the corresponding
reflection and transmission coefficients characterizing the considered dynamical process. As we
have noted at the beginning of this chapter, we are in fact studying a one-dimensional scattering,
and in such a case the above-mentioned coefficients describe the event probabilities, instead of a
scattering cross section used in three dimensions. The coefficients in question can be defined by
means of the relevant probability density currents (recall the basic formula 𝑗® = 𝜓 † 𝛼® 𝜓 derived
in Chapter 2). The probability currents are expressed in terms of the incident, reflected and
transmitted waves shown above; using the result (11.31) one gets
\[
T = \frac{j_{\text{trans.}}}{j_{\text{inc.}}} = \frac{4r}{(1 + r)^2} \tag{11.37}
\]
for the transmission coefficient, and the reflection coefficient becomes
\[
R = \frac{j_{\text{refl.}}}{j_{\text{inc.}}} = \frac{(1 - r)^2}{(1 + r)^2} , \tag{11.38}
\]
where 𝑟 is given by (11.32) (the reader is encouraged to reproduce these results independently).
Now, the question is what is in fact the value of $q$ appearing in the expression for the quantity
$r$. Using the relation (11.22), and taking mechanically the value of $q$ as the positive square
root $q = \left[ (E - V)^2 - m^2 \right]^{1/2}$, the kinematical factor $r$ becomes negative, when $E$ and $V$ satisfy
the condition (11.35). Then $T$ is negative, while $R$ becomes greater than 1, but we still have
$R + T = 1$. This strange fact is often taken as an additional attribute of the Klein paradox (see
e.g. [1, 6, 11]). However, a closer look reveals that it is not quite so. Important explanatory
remarks can be found e.g. in the book [5] or in the nice review paper [36]. The point is that
one should be more careful when defining the boundary condition for the transmitted wave.
For that purpose, one may consider the (group) velocity $v = \mathrm{d}E/\mathrm{d}q$. This is computed easily;
differentiating the relation (11.22) with respect to $q$, one has
\[
2q = 2(E - V)\, \frac{\mathrm{d}E}{\mathrm{d}q} ,
\]
so that
\[
\frac{\mathrm{d}E}{\mathrm{d}q} = \frac{q}{E - V} . \tag{11.39}
\]
Since $E < V$ in the considered case (11.35), for a motion of the particle from $z = 0$ to $+\infty$ (from
left to right), i.e. with positive $v$, one should take $q$ to be the negative square root
\[
q = -\sqrt{(E - V)^2 - m^2} . \tag{11.40}
\]
Then $r$ is positive,
\[
r = \sqrt{\frac{(V - E + m)(E + m)}{(V - E - m)(E - m)}} , \tag{11.41}
\]
so that for $T$ and $R$ given by (11.37) and (11.38) one has $0 < T < 1$, $0 < R < 1$ and, of course,
$R + T = 1$. As noted in ref. [36], a hint concerning the velocity was given to O. Klein by W. Pauli
(thus, the seeming paradox referring to $T < 0$ and $R > 1$ was not an issue in Klein's original
paper).
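The role of the sign of $q$ can be made concrete with a short numerical sketch. The function below evaluates $r$, $R$ and $T$ from (11.32), (11.37) and (11.38) in the region (11.35); its name and the `root_sign` parameter are assumptions introduced for this illustration. The default `root_sign=-1` corresponds to the choice (11.40), while `root_sign=+1` reproduces the "mechanical" positive root.

```python
import math


def klein_coefficients(E, V, m=1.0, root_sign=-1):
    """Return (r, R, T) in the Klein region E > m, V > E + m.

    root_sign = -1 selects q as in (11.40), i.e. positive group velocity;
    root_sign = +1 selects the positive square root of (11.22).
    """
    if not (E > m > 0 and V > E + m):
        raise ValueError("this sketch assumes E > m > 0 and V > E + m")
    p = math.sqrt(E * E - m * m)
    q = root_sign * math.sqrt((E - V) ** 2 - m * m)
    r = (E + m) / (E - V + m) * q / p          # (11.32)
    T = 4 * r / (1 + r) ** 2                   # (11.37)
    R = (1 - r) ** 2 / (1 + r) ** 2            # (11.38)
    return r, R, T


if __name__ == "__main__":
    print(klein_coefficients(2.0, 5.0))                 # 0 < T < 1, 0 < R < 1
    print(klein_coefficients(2.0, 5.0, root_sign=+1))   # T < 0, R > 1
```

For $E = 2m$ and $V = 5m$ the physical choice gives $r = \sqrt{6}$, in agreement with (11.41), whereas the positive root yields $T < 0$ and $R > 1$; in both cases $R + T = 1$.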
Anyway, the physical essence of the Klein paradox persists. Once again: its content is that
the solution of the Dirac equation for transmitted wave is not exponentially damped inside the
potential step, in a situation that is classically forbidden.6 Having found that the Klein paradox is
real, one may wonder how to live with it. A natural reaction is that such a puzzling observation
reveals a limit of applicability of the Dirac equation as a one-particle equation of relativistic
quantum mechanics. In particular, the Klein paradox shows that one may run into controversy in
the presence of a very strong field; let us recall that in the considered case we have 𝑉 > 2𝑚. The
“critical” value 𝑉 = 2𝑚 suggests a seemingly crazy idea that such a huge potential energy might
give rise to two more particles with the mass 𝑚 and thus the whole picture of the scattering
process would be substantially altered. In fact, such an idea is not quite crazy (after all, there
is an equivalence of the energy and mass in relativistic physics!). Quantum field theory does
enable one to describe the creation of particle-antiparticle pairs in a sufficiently strong external
field (the so-called Schwinger process); the condition 𝑉 > 2𝑚 corresponds precisely to such a
situation. So, the resolution of the conundrum, called traditionally Klein paradox, should rely
on abandoning the one-particle interpretation of the Dirac equation in favour of quantum field
theory, which can describe naturally the processes with changing number and even the type of
particles involved. We will not pursue this topic further now; referring to the Klein paradox has
served us here just as another hint at possible difficulties of the one-particle relativistic quantum
mechanics. An interested reader may find a detailed discussion of this issue e.g. in [36] and
some references therein, or in a more recent paper [37].

6 Moreover, from (11.41) it is clear that $r \to [(E + m)/(E - m)]^{1/2}$ for $V \to \infty$ and $E$ fixed. Thus, according to (11.37),
$T \to 2p/(E + p)$ in such a limit, which is non-zero (and approaches 1 for $E \gg m$). It means that an arbitrarily high barrier
is penetrable for the Dirac particle; this is certainly a paradoxical behaviour.

Chapter 12

Relativistic equation for spin-1 particle

Up to now we have considered relativistic equations for particles with spin 0 (Klein–Gordon)
and spin 1/2 (Dirac or Weyl). Next in the hierarchy are particles with spin 1 (which also play
a crucial role in the present-day standard model of particle physics). So, it certainly makes
sense to devote one chapter to a brief discussion of the relevant relativistic equation describing
a massive spin-1 particle (the case of massless photon is deferred, for convenience, to one of the
later chapters on the field theory).
In fact, finding such an equation is a non-trivial task. The point is that a massive spin-1
particle has three independent states, which correspond to spin projections ±1 and 0 onto a
given space direction in its rest system; thus, one would like to describe it in terms of a wave
function with three independent components. However, if one wants to maintain the relativistic
covariance, writing a pertinent equation for such an object is not straightforward. The simplest
and rather natural way to achieve this goal is to employ a wave function that is Lorentz four-
vector, and impose one covariant constraint on its components. Since we wish to maintain the
standard relativistic relation for energy, momentum and mass of a free particle, it is clear that
the components of the wave function should satisfy the Klein–Gordon equation. So, denoting
the wave function in question as $A^\mu(x)$, $\mu = 0, 1, 2, 3$, one has
\[
(\Box + m^2)\, A^\mu(x) = 0 . \tag{12.1}
\]
A simple choice of the envisaged Lorentz covariant constraint, linear in $A^\mu$, reads
\[
\partial_\mu A^\mu(x) = 0 . \tag{12.2}
\]

In this way, one arrives at covariant equations for a wave function with three independent
components; these can be chosen as e.g. 𝐴 𝑗 (𝑥), 𝑗 = 1, 2, 3, and 𝐴0 (𝑥) is supposed to be
evaluated using the constraint (12.2).
It turns out that there is another elegant way leading to the equations (12.1), (12.2). The
idea is to use, as a starting point, Maxwell equations in the covariant form and add an appropriate
mass term (sure, one might object that Maxwell equations describe a classical field, but it does
not matter — the Lorentz covariance is the main point here). Thus, one may write, tentatively,

𝜕𝜇 𝐹 𝜇𝜈 + 𝑚 2 𝐴𝜈 = 0 , (12.3)

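where $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$. In order to check whether we are on the right track, let us work out
the left-hand side of Eq. (12.3). One gets
\[
\partial_\mu F^{\mu\nu} + m^2 A^\nu = \partial_\mu (\partial^\mu A^\nu - \partial^\nu A^\mu) + m^2 A^\nu = \Box A^\nu - \partial^\nu (\partial_\mu A^\mu) + m^2 A^\nu .
\]
Thus, Eq. (12.3) can be recast as
\[
(\Box + m^2)\, A^\nu - \partial^\nu (\partial_\mu A^\mu) = 0 . \tag{12.4}
\]
Differentiating this equation by means of $\partial_\nu$ and denoting $\partial_\mu A^\mu$ as $\partial \cdot A$, one obtains
\[
\Box (\partial \cdot A) + m^2\, \partial \cdot A - \Box (\partial \cdot A) = 0 . \tag{12.5}
\]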

It means that we are left with 𝑚 2 𝜕 · 𝐴 = 0, so that 𝜕 · 𝐴 = 0, which is precisely the constraint
(12.2). As a result, the equation (12.4) is reduced to (12.1). Thus we see that the equation
(12.3) is equivalent to the pair (12.1), (12.2): Eq. (12.3) implies the validity of (12.1) along with
(12.2) and from (12.4) it is obvious that the converse is true as well. A remark is in order here.
The equation (12.2) is formally identical with the Lorenz condition in the Maxwell theory of
electromagnetism. However, in the considered case, this identity emerges as a consequence of
the basic equation (12.3), while in the Maxwell theory it must be imposed by hand. Of course,
it is the mass 𝑚 ≠ 0 that makes the difference.
The construction described above is due to A. Proca,7 who came up with it in 1936 within
the framework of field theory (trying to extend a theory of nuclear forces); Eq. (12.3) is therefore
usually called the Proca equation. Of course, such an equation can be utilized equally well
within relativistic quantum mechanics; as noted before, this is precisely what we are doing here
— it is just the Lorentz covariance that matters.
Let us now examine some basic properties of solutions of the equations (12.1), (12.2).
Of course, our primary goal is to verify that these equations provide us indeed with a proper
description of a spin-1 particle. To this end, we are going to consider the solutions of the
plane-wave form, i.e. those corresponding to the states with a given energy and momentum. In
analogy with our previous treatment of relativistic equations, an appropriate Ansatz for such a
solution reads
𝐴 𝜇 (𝑥) = 𝜀 𝜇 (𝑘) 𝑒 −𝑖𝑘𝑥 . (12.6)
Since Eqs. (12.1), (12.2) are invariant under complex conjugation, another independent solution
then would be
𝐴∗𝜇 (𝑥) = 𝜀 ∗𝜇 (𝑘) 𝑒 +𝑖𝑘𝑥 . (12.7)
Note that the form (12.6) is an analogue of the Dirac plane wave, with $\varepsilon^\mu(k)$ being a counterpart
of $u(k)$; the $\varepsilon^{*\mu}(k)$ is then analogous to $v(k)$ (cf. Chapter 6). Substituting (12.6) into Eq. (12.1),
one gets immediately $k^2 = m^2$, i.e. $k_0^2 = \vec{k}^{\,2} + m^2$. Choosing $k_0 > 0$, the solution (12.6) carries
positive energy $k_0 = (\vec{k}^{\,2} + m^2)^{1/2}$ and momentum $\vec{k}$. Similarly, (12.7) then corresponds to
negative energy $-k_0$ and momentum $-\vec{k}$. Further, Eq. (12.2) amounts to the requirement

𝑘 𝜇 𝜀 𝜇 (𝑘) = 0 . (12.8)

A terminological remark is perhaps in order here. Since the form of the solutions (12.6), (12.7)
including the condition (12.8) is so similar to the description of plane waves in Maxwell theory,
the amplitude 𝜀 𝜇 (𝑘) is called generally the polarization vector.
So, now the question is what are the independent polarization vectors satisfying the
constraint (12.8). In order to clarify this point, one may start in the rest system, i.e. with
$k = k^{(0)} = (m, \vec{0})$. Writing $\varepsilon^\mu = (\varepsilon^0, \vec{\varepsilon})$, from (12.8) one then gets readily $\varepsilon^0 = 0$, so that
\[
\varepsilon^\mu(k^{(0)}) = (0, \vec{\varepsilon}) . \tag{12.9}
\]
7 Alexandru Proca (1897-1955) was an eminent Romanian theoretical physicist who spent most of his professional

career in France. His work belongs to the “golden age” of the quantum theory in the first half of the 20th century.

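Thus, $\varepsilon^\mu$ is space-like. Obviously, there are three independent 3-vectors $\vec{\varepsilon}$; for convenience, one
may write them as columns, and as a particular example one can choose
\[
\vec{\varepsilon}^{\,(1)} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} , \qquad
\vec{\varepsilon}^{\,(2)} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} , \qquad
\vec{\varepsilon}^{\,(3)} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} . \tag{12.10}
\]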
Having in mind our earlier remark concerning the independent components of the wave function
(i.e. that these are just $A^j$, $j = 1, 2, 3$), one may observe that the set (12.10) defines three possible
solutions for the particle at rest, which are the eigenstates of the spin matrix
\[
S^3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix} , \tag{12.11}
\]
with eigenvalues $+1$, $0$ and $-1$. This, of course, is an anticipated result for a spin-1 particle.
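Let us now extend our analysis to the case of a particle in motion, i.e. $\vec{k} \neq 0$ in the
relation (12.8). There is a natural way of constructing a triplet of space-like polarization vectors
$\varepsilon^\mu(k, \lambda)$, $\lambda = 1, 2, 3$, satisfying (12.8). First, one can use
\[
\begin{aligned}
\varepsilon^\mu(k, 1) &= \left( 0, \vec{\varepsilon}(\vec{k}, 1) \right) , \\
\varepsilon^\mu(k, 2) &= \left( 0, \vec{\varepsilon}(\vec{k}, 2) \right) ,
\end{aligned} \tag{12.12}
\]
where $\vec{\varepsilon}(\vec{k}, 1)$ and $\vec{\varepsilon}(\vec{k}, 2)$ are two linearly independent vectors lying in the plane perpendicular
to $\vec{k}$, i.e. satisfying
\[
\vec{k} \cdot \vec{\varepsilon}(\vec{k}, \lambda) = 0 , \qquad \lambda = 1, 2 . \tag{12.13}
\]
Following (12.10), one may normalize conventionally $\vec{\varepsilon}(\vec{k}, \lambda)$ to be mutually orthogonal unit
vectors, so that then
\[
\varepsilon_\mu(k, \lambda)\, \varepsilon^\mu(k, \lambda') = -\delta_{\lambda\lambda'} , \qquad \lambda, \lambda' = 1, 2 . \tag{12.14}
\]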
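For the third member of the sought triplet one may try an Ansatz
\[
\varepsilon^\mu(k, 3) = \left( \varepsilon^0, \, a\, \frac{\vec{k}}{|\vec{k}|} \right) , \tag{12.15}
\]
where the numbers $\varepsilon^0$ and $a$ are to be determined by means of (12.8) and the normalization
condition indicated in (12.14) (needless to say, $\varepsilon(k, 3)$ is orthogonal to $\varepsilon(k, 1)$ and $\varepsilon(k, 2)$).
From (12.8) one gets readily
\[
\varepsilon^0 = a\, \frac{|\vec{k}|}{k_0} , \tag{12.16}
\]
and the normalization condition $\varepsilon_\mu(k, 3)\, \varepsilon^\mu(k, 3) = -1$ then yields
\[
a^2 = \frac{-1}{\dfrac{\vec{k}^{\,2}}{k_0^2} - 1} = \frac{k_0^2}{m^2} .
\]
Conventionally, we will choose $a > 0$, i.e. $a = k_0/m$. Thus, we have finally
\[
\varepsilon^\mu(k, 3) = \left( \frac{|\vec{k}|}{m}, \, \frac{k_0}{m}\, \frac{\vec{k}}{|\vec{k}|} \right) . \tag{12.17}
\]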
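The construction of the three polarization vectors can be checked numerically. The minimal sketch below builds $\varepsilon^\mu(k, \lambda)$ for a given momentum and mass, following (12.12), (12.15) and (12.17), and lets one verify the transversality condition (12.8) and the normalization $-\delta_{\lambda\lambda'}$. The helper names, and the particular way of picking the first transverse direction (a cross product with an auxiliary axis), are assumptions of this illustration; any orthonormal transverse pair would do.

```python
import math


def cross(u, v):
    """Ordinary three-dimensional vector product."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def minkowski_dot(x, y):
    """Scalar product of two four-vectors with metric signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]


def polarization_vectors(k_vec, m):
    """Return the four-momentum k and the three real vectors eps(k, 1), eps(k, 2), eps(k, 3)."""
    k_abs = math.sqrt(sum(c * c for c in k_vec))
    k0 = math.sqrt(k_abs ** 2 + m ** 2)
    n = [c / k_abs for c in k_vec]
    # Any axis not parallel to n serves to build a first transverse direction.
    helper = [1.0, 0.0, 0.0] if abs(n[0]) < 0.9 else [0.0, 1.0, 0.0]
    e1 = cross(n, helper)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = [c / norm for c in e1]
    e2 = cross(n, e1)                                # second transverse unit vector
    eps1 = [0.0] + e1                                # (12.12)
    eps2 = [0.0] + e2
    eps3 = [k_abs / m] + [k0 / m * c for c in n]     # longitudinal vector (12.17)
    return [k0] + list(k_vec), [eps1, eps2, eps3]


if __name__ == "__main__":
    k, eps = polarization_vectors([1.0, 2.0, 2.0], 0.5)
    for e in eps:
        print(minkowski_dot(k, e), minkowski_dot(e, e))   # expected: ~0 and ~-1
```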

Since the spatial part of $\varepsilon^\mu(k, 3)$ is directed along $\vec{k}$, it is called longitudinal polarization;
correspondingly, $\varepsilon^\mu(k, \lambda)$ with $\lambda = 1, 2$ are transverse polarizations. There is a remarkable
technical point that should be mentioned here. It is seen that the four-vector $\varepsilon(k, 3)$ given by
(12.17) has, mathematically, the same form as the spin four-vector corresponding to the helicity
of a spin-1/2 Dirac particle (cf. (8.7)). Of course, such an accidental coincidence is due to the fact
that the defining conditions for $\varepsilon(k, 3)$ and $s_R(k)$ are the same, but one should keep in mind that
the physical roles of these two quantities are quite different.
So, we have arrived at a rather detailed form of the plane-wave solutions of the Proca
equation and now one may try to identify the spin states in terms of the polarization vectors
$\varepsilon(k, \lambda)$, $\lambda = 1, 2, 3$. Hopefully, the concept of helicity could be relevant here, similarly to
the case of the Dirac equation. To this end, let us first specify the relevant spin matrices $S^j$,
$j = 1, 2, 3$. It turns out that a triplet $\vec{S}$ best suited for our purpose is
\[
S^1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix} , \qquad
S^2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix} , \qquad
S^3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} . \tag{12.18}
\]
It is easy to check that the matrices (12.18) satisfy indeed the commutation relations
\[
[S^j, S^k] = i \varepsilon^{jkl} S^l . \tag{12.19}
\]
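The check of the commutation relations (12.19) is easily automated. The following minimal sketch encodes the matrices (12.18) as nested lists (with indices 0, 1, 2 standing for 1, 2, 3) and evaluates the largest entry of $[S^j, S^k] - i\varepsilon^{jkl} S^l$; the helper names are introduced only for this illustration.

```python
# Spin matrices (12.18); list index 0, 1, 2 corresponds to S^1, S^2, S^3.
S = [
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def commutator_defect(j, k):
    """Largest absolute entry of [S^j, S^k] - i eps^{jkl} S^l."""
    AB, BA = matmul(S[j], S[k]), matmul(S[k], S[j])
    worst = 0.0
    for row in range(3):
        for col in range(3):
            rhs = sum(1j * levi_civita(j, k, l) * S[l][row][col] for l in range(3))
            worst = max(worst, abs(AB[row][col] - BA[row][col] - rhs))
    return worst


if __name__ == "__main__":
    print(max(commutator_defect(j, k) for j in range(3) for k in range(3)))
```

As a further consistency check, the sum of the squares of the three matrices equals twice the unit matrix, i.e. $s(s+1)$ with $s = 1$.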
The reader may recognize in (12.18) the familiar antisymmetric generators of rotations in three-
dimensional space. Notice also that this is a representation different from that mentioned before
(cf. (12.11)); please don’t worry about it, it is just a matter of choice of the algebraic basis in the
space of 3 × 3 matrices (more precisely, the basis of the Lie algebra of the rotation group). One
defines the operator (matrix) of helicity as before, i.e. as the spin projection onto the direction
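of particle momentum. So, one has
\[
\hat{h}(\vec{k}) = \vec{n} \cdot \vec{S} , \qquad \vec{n} = \vec{k}/|\vec{k}| . \tag{12.20}
\]
Substituting the matrices (12.18) into the definition (12.20), one gets
\[
\hat{h}(\vec{k}) = i \begin{pmatrix} 0 & -n_3 & n_2 \\ n_3 & 0 & -n_1 \\ -n_2 & n_1 & 0 \end{pmatrix} . \tag{12.21}
\]
Writing the spatial part of a polarization vector $\varepsilon(k, \lambda)$ as a three-component column, one gets
from (12.21)
\[
\hat{h}(\vec{k})\, \vec{\varepsilon} = i (\vec{n} \times \vec{\varepsilon}) . \tag{12.22}
\]
Now, the two vectors $\vec{\varepsilon}(k, 1)$, $\vec{\varepsilon}(k, 2)$ perpendicular to $\vec{k}$ (the transverse polarizations (12.12))
can be chosen conventionally, so that
\[
\vec{\varepsilon}(k, 2) = \vec{n} \times \vec{\varepsilon}(k, 1) . \tag{12.23}
\]
Then one also has
\[
\vec{n} \times \vec{\varepsilon}(k, 2) = \vec{n} \times (\vec{n} \times \vec{\varepsilon}(k, 1)) = \vec{n}\, (\vec{n} \cdot \vec{\varepsilon}(k, 1)) - \vec{\varepsilon}(k, 1)\, (\vec{n} \cdot \vec{n}) = -\vec{\varepsilon}(k, 1) . \tag{12.24}
\]
Thus, using (12.22), (12.23) and (12.24) one gets
\[
\begin{aligned}
\hat{h}(\vec{k})\, \vec{\varepsilon}(k, 1) &= i\, \vec{\varepsilon}(k, 2) , \\
\hat{h}(\vec{k})\, \vec{\varepsilon}(k, 2) &= -i\, \vec{\varepsilon}(k, 1) , \\
\hat{h}(\vec{k})\, \vec{\varepsilon}(k, 3) &= 0 .
\end{aligned} \tag{12.25}
\]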
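Note that the last identity in (12.25) is obtained readily from (12.22), taking into account that
$\vec{\varepsilon}(k, 3)$ is parallel to $\vec{k}$ (see (12.17)). With (12.25) at hand, a natural next step is to introduce
complex combinations
\[
\begin{aligned}
\vec{\varepsilon}(k, +) &= \frac{1}{\sqrt{2}} \left[ \vec{\varepsilon}(k, 1) + i\, \vec{\varepsilon}(k, 2) \right] , \\
\vec{\varepsilon}(k, -) &= \frac{1}{\sqrt{2}} \left[ \vec{\varepsilon}(k, 1) - i\, \vec{\varepsilon}(k, 2) \right]
\end{aligned} \tag{12.26}
\]
(one may notice that the transformation (12.26) represents the passage from linear to circular
polarizations). From (12.25) and (12.26) one then gets immediately
\[
\begin{aligned}
\hat{h}(\vec{k})\, \vec{\varepsilon}(k, +) &= \vec{\varepsilon}(k, +) , \\
\hat{h}(\vec{k})\, \vec{\varepsilon}(k, -) &= -\vec{\varepsilon}(k, -) .
\end{aligned} \tag{12.27}
\]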
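The eigenvalue relations (12.27), together with the last identity in (12.25), can be verified numerically for an arbitrary direction of the momentum. In the minimal sketch below the helicity matrix is taken from (12.21) and the circular polarizations from (12.26), with the transverse pair chosen according to (12.23); the helper names and the auxiliary axis used to pick the first transverse vector are assumptions of this illustration.

```python
import math


def cross(u, v):
    """Ordinary three-dimensional vector product."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def helicity_matrix(n):
    """Matrix (12.21) for a unit vector n."""
    return [[0, -1j * n[2], 1j * n[1]],
            [1j * n[2], 0, -1j * n[0]],
            [-1j * n[1], 1j * n[0], 0]]


def apply(matrix, vec):
    """Action of a 3x3 matrix on a three-component column."""
    return [sum(matrix[i][j] * vec[j] for j in range(3)) for i in range(3)]


def circular_polarizations(n):
    """Return (eps_plus, eps_minus) as in (12.26), with eps_2 = n x eps_1 as in (12.23)."""
    helper = [1.0, 0.0, 0.0] if abs(n[0]) < 0.9 else [0.0, 1.0, 0.0]
    e1 = cross(n, helper)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = [c / norm for c in e1]
    e2 = cross(n, e1)
    s = 1.0 / math.sqrt(2.0)
    plus = [s * (x + 1j * y) for x, y in zip(e1, e2)]
    minus = [s * (x - 1j * y) for x, y in zip(e1, e2)]
    return plus, minus


if __name__ == "__main__":
    n = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
    h = helicity_matrix(n)
    plus, minus = circular_polarizations(n)
    print(apply(h, plus), plus)      # the two lists should coincide
    print(apply(h, n))               # longitudinal direction: zero helicity
```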

In this way we have achieved the desired goal: the results (12.27) along with the last identity
(12.25) tell us that the transverse polarization vectors 𝜀(𝑘, +) and 𝜀(𝑘, −) correspond to helicities
+1 and −1, respectively, and the longitudinal polarization 𝜀(𝑘, 3) represents the state with zero
helicity. A minor technical remark is in order here: It is easy to realize that the polarization
vectors, which now are in general complex, satisfy the orthonormality relation

𝜀 𝜇 (𝑘, 𝜆)𝜀 ∗𝜇 (𝑘, 𝜆′) = −𝛿𝜆𝜆′ , (12.28)

where 𝜆, 𝜆′ take on values +, − and 3 (it follows readily from the relations mentioned above for
the real 𝜀’s and from (12.26)).
As we have stressed before, the polarization vectors $\varepsilon(k, \lambda)$ are in fact counterparts of the
plane-wave amplitudes $u(k, s)$ or $v(k, s)$ known in the Dirac theory of spin-1/2 particles. We know
that the spinorial amplitudes $u$ and $v$ satisfy the completeness relations (6.24), (6.30), and one
may thus wonder how to obtain an analogous relation for the polarization vectors we are working
with. For such a purpose, one may utilize the following trick: Obviously, the triplet of space-like
vectors $\varepsilon^\mu(k, \lambda)$, $\lambda = 1, 2, 3$, along with the time-like vector $k^\mu/m$ form an orthonormal basis
in the Minkowski space and thus it is not difficult to guess that the corresponding completeness
relation can be written as
\[
-\sum_{\lambda=1}^{3} \varepsilon_\mu(k, \lambda)\, \varepsilon^*_\nu(k, \lambda) + \frac{k_\mu}{m} \frac{k_\nu}{m} = g_{\mu\nu} . \tag{12.29}
\]

The reader is encouraged to verify this identity; for those who would be reluctant to take up this
task, a hint may be helpful: Multiply both sides of Eq. (12.29) consecutively by the basis vectors
and employ the orthonormality relation (12.28). So, Eq. (12.29) leads to the desired polarization
sum
\[
\sum_{\lambda=1}^{3} \varepsilon_\mu(k, \lambda)\, \varepsilon^*_\nu(k, \lambda) = -g_{\mu\nu} + \frac{k_\mu k_\nu}{m^2} . \tag{12.30}
\]
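For readers who prefer a numerical check of the polarization sum (12.30), a minimal sketch is given below. For simplicity it assumes a momentum along the positive $z$ axis, so that the helicity states are $(0, 1, \pm i, 0)/\sqrt{2}$ and the longitudinal vector follows from (12.17); the function names are hypothetical.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]   # diagonal entries of g_{mu nu}


def lower(vec):
    """Lower the index of a four-vector given with an upper index."""
    return [g * c for g, c in zip(METRIC, vec)]


def polarization_sum_defect(kz, m):
    """Largest deviation between the two sides of (12.30) for momentum kz > 0 along z."""
    k0 = math.sqrt(kz * kz + m * m)
    s = 1.0 / math.sqrt(2.0)
    eps_up = [
        [0, s, 1j * s, 0],          # helicity +1
        [0, s, -1j * s, 0],         # helicity -1
        [kz / m, 0, 0, k0 / m],     # helicity 0, Eq. (12.17)
    ]
    eps_low = [lower(e) for e in eps_up]
    k_low = lower([k0, 0, 0, kz])
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            lhs = sum(e[mu] * e[nu].conjugate() for e in eps_low)
            g = METRIC[mu] if mu == nu else 0.0
            rhs = -g + k_low[mu] * k_low[nu] / m ** 2
            worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    print(polarization_sum_defect(2.0, 1.0))   # should vanish up to rounding
```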
We will stop our discussion of the Proca equation here; what we have done up to now
will be sufficient for later applications. In particular, the properties of the plane waves and
polarization vectors that we have found will be utilized in our future perturbative calculations of
decay and scattering processes. Of course, it is the same story as our previous detailed treatment
of the plane waves for Dirac particles (remember how we tried to find excuses for pursuing such
a rather boring topic). One should also add that the Proca equation, as an equation of relativistic

quantum mechanics, is not as interesting as the Dirac equation, as regards phenomenological
applications. The reason is that there is no stable massive elementary particle with spin 1, so
that e.g. the problem of finding bound states of a spin-1 particle in an external Coulomb field is not
as important as the same problem for Dirac's electron. Strictly speaking, the Proca equation
is more important for quantum field theory; as we have noted before, this was the framework
in which A. Proca originally formulated his idea.

Chapter 13

Splendors and miseries of relativistic quantum mechanics

The present chapter is a sort of epilogue, in which we would like to review the successes and
failures of the relativistic quantum mechanics; its specific character is reflected in a literary
style of the title (the connoisseurs of the classical French literature would perhaps appreciate the
original expression “splendeurs et misères”).
In a recapitulation of successes (splendors), the prominent position certainly belongs to
the Dirac equation, which brought a real breakthrough in the early years of quantum theory in
the 1920s. As we have seen, it assigns very naturally the half-integer intrinsic angular momentum
(spin 1/2) to the electron. Furthermore, it leads to a simple and elegant explanation of the value
of the spin magnetic moment. In its time, this was actually a prediction, in a limited sense;
we will return to this point shortly. Another highlight of the Dirac theory of the electron is the
famous Darwin–Gordon formula for the energy levels of the hydrogen atom; though we have
not discussed its derivation before, let us reproduce the result here for reader’s convenience. It
is written as an expression for the full electron energy, which is the sum of the rest energy and
the (negative) binding energy. It reads
 −1/2
𝛼2

𝐸 =𝑚 1+ , (13.1)
(𝑛𝑟 + 𝛾) 2

where $\alpha$ is the fine-structure constant ($\alpha \approx 1/137$), $n_r$ is the radial quantum number,
$n_r = 0, 1, 2, \ldots$, and $\gamma = \left[ (j + \tfrac12)^2 - \alpha^2 \right]^{1/2}$, with $j$ denoting the full angular momentum
(including both the orbital momentum and spin, so that $j$ takes on half-integer values). The formula (13.1)
becomes more transparent when expanded in powers of $\alpha^2$; one thus gets

$$ E = m \left[ 1 - \frac{\alpha^2}{2n^2} - \frac{\alpha^4}{2n^3} \left( \frac{1}{j + \frac12} - \frac{3}{4n} \right) + O(\alpha^6) \right] , \tag{13.2} $$

where $n$ now denotes the principal quantum number, $n = n_r + j + \tfrac12$. It is easy to see that
the second term in square brackets reproduces the familiar Balmer formula for hydrogen levels
obtained in the non-relativistic theory (recall that $\alpha = e^2/\hbar c$ in ordinary units, so that $m\alpha^2/2n^2$
then becomes $me^4/2n^2\hbar^2$). The term proportional to $\alpha^4$ represents the relativistic corrections
that bring about the famous “fine structure” of the energy levels. So, the natural explanation
of such subtle effects in the hydrogen atom spectrum was certainly a stunning success of the
Dirac equation at the end of the 1920s. However, twenty years later, more accurate measurements
(performed by W. Lamb and R. Retherford) revealed that there is a tiny difference between the
energies of two states carrying identical values of $n$ and $j$ (recall that according to (13.1), such
a difference should be exactly zero). In the common spectroscopic notation, the difference in
question is $E(2S_{1/2}) - E(2P_{1/2})$ and it is traditionally called the Lamb shift. Numerically,
it is of the order of $10^{-6}$ eV and the Dirac equation alone is not capable of explaining such an
effect. However, soon after this experimental discovery it was clarified within the framework
of quantum electrodynamics (QED), and this computational success became one of the
powerful arguments in favour of QED as a physically relevant model of quantum field theory.⁸
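The relation between the exact formula (13.1) and its expansion (13.2) is easy to check numerically. The following is only a minimal sketch: the numerical values of $\alpha$ and of the electron rest energy are approximate inputs assumed here (they are not quoted in the text), and the function names are hypothetical.

```python
import math

ALPHA = 1 / 137.035999   # fine-structure constant (approximate value, assumed here)
M_E_EV = 510998.95       # electron rest energy in eV (approximate value, assumed here)


def dirac_energy(n_r, j, alpha=ALPHA):
    """Exact energy E/m of Eq. (13.1) for radial number n_r and total angular momentum j."""
    gamma = math.sqrt((j + 0.5) ** 2 - alpha ** 2)
    return (1.0 + alpha ** 2 / (n_r + gamma) ** 2) ** -0.5


def expanded_energy(n_r, j, alpha=ALPHA):
    """Expansion (13.2) of E/m, truncated after the alpha^4 term."""
    n = n_r + j + 0.5  # principal quantum number
    return (1.0 - alpha ** 2 / (2 * n ** 2)
            - alpha ** 4 / (2 * n ** 3) * (1 / (j + 0.5) - 3 / (4 * n)))


def fine_structure_splitting_n2():
    """E(2P_3/2) - E(2P_1/2) in eV, computed from the exact formula."""
    e_2p32 = dirac_energy(n_r=0, j=1.5)  # n = 2, j = 3/2
    e_2p12 = dirac_energy(n_r=1, j=0.5)  # n = 2, j = 1/2 (the same inputs describe 2S_1/2)
    return M_E_EV * (e_2p32 - e_2p12)


if __name__ == "__main__":
    for n_r, j in [(0, 0.5), (1, 0.5), (0, 1.5)]:
        exact = dirac_energy(n_r, j)
        print(n_r, j, exact, exact - expanded_energy(n_r, j))
    print("n = 2 fine-structure splitting [eV]:", fine_structure_splitting_n2())
```

Since (13.1) depends only on $n_r$ and $j$, the states $2S_{1/2}$ and $2P_{1/2}$ (both with $n_r = 1$, $j = 1/2$) enter the function with identical arguments and come out exactly degenerate; this is why the Lamb shift cannot be obtained from the Dirac equation alone.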
Let us now come back to the electron magnetic moment. As we have already noted in
Chapter 2, the successful prediction of the value $\mu_e = \mu_B = e/2m$ depends on the assumption
that the electromagnetic interaction has the “minimal” form, obtained by replacing $\partial_\mu$ in the
Dirac equation with the covariant derivative $\partial_\mu + ieA_\mu$ (where $A_\mu$ is the relevant electromagnetic
four-potential). In fact, the interaction may have a more general form that is easily incorporated
in the Dirac equation. Let us now explain briefly how it can be done. The above-mentioned
minimal scheme relies on the Dirac equation written as
$$ \left[ i\gamma^\mu (\partial_\mu + ieA_\mu) - m \right] \psi = 0 , \tag{13.3} $$

and it is the covariant derivative that guarantees here the desired gauge invariance. Now, the
point is that apart from the four-potential $A_\mu$, there is another electromagnetic quantity that may
be utilized as well, namely the field strength tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. It is gauge invariant and
antisymmetric, so one may couple it to the familiar antisymmetric combination of the $\gamma$-matrices,
$\sigma_{\mu\nu} = \frac{i}{2} [\gamma_\mu, \gamma_\nu]$. A simple extension of Eq. (13.3) can then be written as
$$ \left( i\slashed{\partial} - e\slashed{A} - m - \kappa \frac{e}{4m} \sigma_{\mu\nu} F^{\mu\nu} \right) \psi = 0 , \tag{13.4} $$
where we have used the slash notation for brevity, and the form of the prefactor in the last term
has been chosen for convenience. The constant 𝜅 is an arbitrary dimensionless coefficient, and
the whole expression involving 𝐹𝜇𝜈 is traditionally called the Pauli term. Choosing the potential
𝐴 𝜇 so that it represents a magnetic field and employing the non-relativistic approximation, one
may proceed in analogy with what we have done in Chapter 2. We will not go into the details of
the calculation (it might be a non-trivial challenge for a diligent reader), and rather show the final
result. It turns out that the particle described by Eq. (13.4) carries the spin magnetic moment
with the magnitude
$$ \mu = (1 + \kappa) \frac{e}{2m} . \tag{13.5} $$
In this way, one may arrive correctly at any value of the spin magnetic moment; in this sense,
the result obtained previously for the electron is not a true prediction (rather, it is a “conditional
prediction” relying on the assumption that 𝜅 = 0). On the other hand, Eq. (13.4) also enables one to
describe the other spin-1/2 particles like the proton or neutron; as we know, these are not pointlike
objects and their magnetic moments differ substantially from the simple Dirac values.
In fact, the problem of the Pauli term (to be or not to be) has one more aspect. As we
will see later, the presence of such a term would change dramatically the properties of QED;
in particular, it would spoil the conventional perturbative renormalizability, which is generally
recognized as a user-friendly feature of any model of quantum field theory (QFT).

⁸ The experimental result of Lamb and Retherford was published in 1947, and Willis Lamb (1913–2008)
received the Nobel Prize for it in 1955. The first successful theoretical calculation of the Lamb shift was also performed
in 1947 by Hans Bethe (1906–2005), who received the Nobel Prize in 1967 for something else (the so-called
CNO cycle in astrophysics).

So, for the electron, the value 𝜅 = 0 seems to be the right option to start with, but this
does not mean that it is the end of the story. As in the case of the Lamb shift, accurate
measurements performed in the late 1940s uncovered a tiny deviation from the value $\mu_B$ assigned
to the electron within the Dirac theory.⁹ Such a difference, which is of the relative order of
$10^{-3}$, was explained successfully by means of QED (the first theorist who did the calculation
was J. Schwinger). This achievement became another milestone in establishing QED as
a realistic physical theory at the fundamental level.
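To make the quoted order of magnitude concrete, one may recall the leading QED correction found by Schwinger, $\alpha/2\pi$. It is not derived in these notes and is quoted here only as an external, well-known result; in the language of Eq. (13.5) it plays the role of an effective $\kappa$. The value $\alpha \approx 1/137.04$ is an assumed numerical input.

```latex
\mu_e \simeq \left( 1 + \frac{\alpha}{2\pi} \right) \frac{e}{2m} ,
\qquad
\frac{\alpha}{2\pi} \approx \frac{1}{2\pi \cdot 137.04} \approx 1.16 \times 10^{-3} .
```

This is indeed a relative deviation of the order of $10^{-3}$ from the Dirac value $\mu_B$.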
Let us now turn to the difficulties (“miseries”) of relativistic quantum mechanics. As we
have seen in the preceding chapters, a generic feature of all considered cases is the appearance of
negative energy solutions for the free particle. This leads to specific interpretation problems that
we are not going to summarize here, but perhaps the most serious flaw, from the physical point
of view, is the potential instability of matter that would be caused by spontaneous transitions of
particles to the available (unoccupied) states with negative energy. Worried by such an obvious
drawback of his theory, Dirac came up with an original and rather surprising solution. He
proposed that what we perceive as the “vacuum” (the no-particle state) is in fact a state in
which all negative energy levels are occupied, so that no spontaneous transitions are possible.
Such an infinite continuum of negative energy particles is supposed to be a sort of unobservable
background and is traditionally called the Dirac sea. Under the influence of an external
force, a particle (an electron, for definiteness) can be excited from a negative energy level to
become an ordinary electron with positive energy. Removing a negative energy particle from
the sea amounts to creating a “hole” and, effectively, such a state carries a positive energy (since
subtracting a negative value is tantamount to adding a positive one); moreover, its charge is
opposite to that of the electron. After a short period of confusion, the picture of a possible hole
in the Dirac sea was recognized as the prediction of an antiparticle of the electron. The origin
of such a term is obvious; furthermore, since the electron charge is conventionally taken to be
negative, the (by now familiar) name positron for such an antiparticle was established soon
after its theoretical prediction. The proposal made by Dirac turned into a triumph when the
positron was observed experimentally in cosmic rays (Carl Anderson in 1932).
The prediction of antiparticles (or, more generally, antimatter) was certainly one of the
most spectacular successes of the Dirac theory, but it must be taken with a grain of salt. The
concept of the filled sea of the negative energy states is in fact highly controversial. Why is it
so? We know that electrons are fermions, which means, in particular, that they satisfy the Pauli
exclusion principle — there can be just one particle in a given state. So, in such a case one
may, in principle, imagine a filled sea that would not absorb any other particle falling into it as
a result of a spontaneous quantum transition. However, we also know that the spinless particles
described by the Klein–Gordon equation, or spin-1 particles obeying the Proca equation are
bosons, which means that their number in a given state is unlimited. Thus, a would-be sea of
negative energy bosonic states could never be filled, so this concept fails completely
(for an early critique see e.g. [39]). On the other hand, we know that the existence of antiparticles
is a universal phenomenon: they exist both for bosons and for fermions. So, strictly speaking,
the amazing theoretical construction of the positron was a correct prediction based on a wrong
(or at least dubious) argument. The reader following the above exposition may have noticed
how the “splendors” and “miseries” are sometimes remarkably intertwined in the achievements
of relativistic quantum mechanics. Fortunately, as we know now, there is a truly universal and
consistent treatment of antiparticles within the QFT framework, in which the idea of a filled
“sea” is totally irrelevant. One may thus conclude that the original concept of the Dirac sea is
obsolete and in fact may be forgotten (or, more politely, deferred to a history chapter), since it
has been superseded by the QFT methods. Having in mind our ultimate goal, it is in order to
offer the reader some references to sources describing the QFT genesis and history: apart from
the papers [39] quoted above, instructive discussion of the subject can be found e.g. in [9, 13]
and [54]. In any case, the existence of negative energy solutions is a fundamental difficulty
of relativistic quantum mechanics in general, so it is reassuring that this problem is eradicated
naturally within QFT.

⁹ The corresponding experimental result was published in 1947 by P. Kusch and H. Foley. Polykarp Kusch
(1911–1993) received the Nobel Prize for it in 1955 together with Willis Lamb.

As a last item on the list of “miseries” one should mention the puzzling behaviour of a
particle in a strong external field, as exemplified by the so-called Klein paradox that we have
discussed in Chapter 11. This also points to limitations of the relativistic quantum mechanics
as a theory of essentially one-particle systems. As we have already noted before, in contrast to
this, QFT is able to describe very naturally physical processes in which the number and type
of participating particles are variable; this in fact is the most important QFT asset as regards the
description of processes involving subatomic particles.
Of course, one could say more about the issues mentioned above, but the main message
is clear. Equations of relativistic quantum mechanics certainly yielded some wonderful or even
astonishing predictions, though sometimes the employed arguments are not quite watertight.
In fact, these equations represent a precursor of a deeper approach, which is the quantum field
theory; as we have seen, there is certainly more than one reason to proceed in this way. A bonus
drawn from the preceding chapters is that many mathematical formulae and relations derived
there will be utilized without change in our forthcoming discussion of the QFT models.

Chapter 14

Interlude:
Lagrangian formalism for classical fields

Before proceeding to the study of quantum fields, it is necessary to discuss first an appropriate
formalism for their classical counterparts. As in the ordinary classical mechanics of
point particles, the key concept is the Lagrange function, or briefly the Lagrangian.
At the end of the preceding chapter we have mentioned a certain bonus descending
from our previous effort. A substantial part of this bonus is that we in fact already know all
relevant equations of motion for the classical fields we would like to deal with. Indeed, such
relativistically covariant wave equations coincide with the equations of relativistic quantum
mechanics considered previously; so, we will discuss a Klein–Gordon field, Dirac field, etc.
Although the physical content is different (earlier we have worked with quantum mechanical
wave functions, while now we have in mind classical fields), the equations are the same, since
their form is determined uniquely by the requirement of Lorentz covariance.
Since the Lagrangian formalism for fields is developed in close analogy with classical
mechanics, we would like to recall first the basic principles of the latter. Let us consider a
mechanical system with $f$ degrees of freedom, described in terms of the generalized coordinates
$q_i = q_i(t)$, $i = 1, \ldots, f$. The Lagrange function depends on the coordinates and on the velocities
$\dot{q}_i(t)$,
$$ L = L\big( q_i(t), \dot{q}_i(t) \big) \tag{14.1} $$
(from the mathematical point of view, $L$ is a functional depending on the time variable $t$).
Equations of motion (Lagrange equations) are obtained from the variational principle of stationary
action
$$ \delta S = 0 , \tag{14.2} $$
where the functional of action is given by

$$ S = \int_{t_1}^{t_2} L\big( q_i(t), \dot{q}_i(t) \big) \, \mathrm{d}t . \tag{14.3} $$

The resulting Lagrange equations then read

$$ \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 , \qquad i = 1, \ldots, f . \tag{14.4} $$
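As a quick illustration of (14.4), one may take the one-dimensional linear harmonic oscillator. The Lagrangian below (unit mass, frequency $\omega$) is a standard textbook choice assumed here for the sake of the example.

```latex
L = \tfrac12 \dot{x}^2 - \tfrac12 \omega^2 x^2
\quad\Longrightarrow\quad
\frac{\partial L}{\partial \dot{x}} = \dot{x} , \qquad
\frac{\partial L}{\partial x} = -\omega^2 x
\quad\Longrightarrow\quad
\frac{\mathrm{d}}{\mathrm{d}t} \dot{x} - (-\omega^2 x) = \ddot{x} + \omega^2 x = 0 .
```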

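Now, how about fields? A classical field is described, in general, by a multiplet of functions
$\varphi_r(x) = \varphi_r(t, \vec{x})$, $r = 1, \ldots, n$. A natural counterpart of the index $i$ in (14.1) or (14.4) is
the continuous “index” $\vec{x}$. Since we have in mind relativistically covariant models (where all
spacetime coordinates are treated on an equal footing), it is natural to introduce the density
of the Lagrange function (briefly, Lagrangian density), depending on the fields and their first
derivatives, i.e.
$$ \mathcal{L} = \mathcal{L}\big( \varphi_r(x), \partial_\mu \varphi_r(x) \big) , \tag{14.5} $$
so that the Lagrange function is the integral over the whole three-dimensional space

$$ L = \int \mathcal{L} \, \mathrm{d}^3x . \tag{14.6} $$

The action is then conveniently defined as

$$ S = \int_{-\infty}^{+\infty} L \, \mathrm{d}x^0 = \int \mathcal{L} \, \mathrm{d}^4x , \tag{14.7} $$

and the variational principle of stationary action then yields

$$ \begin{aligned}
0 = \delta S = \int \mathrm{d}^4x \, \delta\mathcal{L}
&= \int \mathrm{d}^4x \left[ \frac{\partial \mathcal{L}}{\partial \varphi_r} \delta\varphi_r + \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \delta(\partial_\mu \varphi_r) \right] \\
&= \int \mathrm{d}^4x \left[ \frac{\partial \mathcal{L}}{\partial \varphi_r} \delta\varphi_r + \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \partial_\mu (\delta\varphi_r) \right] ,
\end{aligned} \tag{14.8} $$
where in the last step we have used the obvious rule that the variation of a derivative of a
function is the derivative of the variation. Now, in the last expression we may carry out partial
integration; in order to get rid of the surface terms, let us assume that the variation $\delta\varphi_r$ vanishes at
the spacetime infinity (this is a usual constraint on the trial functions in the considered variational
problem). The expression (14.8) is then recast as
$$ 0 = \delta S = \int \mathrm{d}^4x \left[ \frac{\partial \mathcal{L}}{\partial \varphi_r} - \partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \right] \delta\varphi_r , \tag{14.9} $$
and one thus arrives at the Lagrange equations (or, if you want, Euler–Lagrange equations)
$$ \partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} - \frac{\partial \mathcal{L}}{\partial \varphi_r} = 0 , \qquad r = 1, \ldots, n . \tag{14.10} $$
Notice that (14.10) is a “covariantized” form of the familiar equations (14.4) of classical
mechanics, so it is easy to remember.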
Now we are in a position to find some examples of Lagrangian densities corresponding
to the relativistic wave equations we already know (note that for the sake of brevity, we will
usually use the term “Lagrangian” instead of “Lagrangian density”). Let us start with a real
Klein–Gordon field $\varphi(x)$, which, as we know, satisfies the equation
$$ (\Box + m^2) \varphi(x) = 0 . \tag{14.11} $$
It is easy to verify that an appropriate Lagrangian can be written as
$$ \mathcal{L} = \frac12 \partial_\mu \varphi \, \partial^\mu \varphi - \frac12 m^2 \varphi^2 . \tag{14.12} $$
Indeed, from (14.12) one gets readily
$$ \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi)} = \partial^\mu \varphi , \qquad \frac{\partial \mathcal{L}}{\partial \varphi} = -m^2 \varphi , \tag{14.13} $$
and substituting this into (14.10), the Klein–Gordon equation (14.11) is recovered (needless to
say, (14.13) amounts just to elementary differentiation of quadratic functions). Note that the
factors 1/2 standing in (14.12) are conventional; it is clear that any Lagrangian proportional
to $\partial_\mu \varphi \, \partial^\mu \varphi - m^2 \varphi^2$ would yield Eq. (14.11) as well. The option (14.12) leads to a convenient
normalization of the quantities to be discussed later on (e.g. the field energy and momentum).
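For a reader who prefers to see the “elementary differentiation” spelled out, the two steps can be sketched as follows (the dummy indices $\alpha$, $\beta$ are introduced here only for the purpose of the calculation):

```latex
\frac{\partial}{\partial (\partial_\mu \varphi)}
\left( \tfrac12 g^{\alpha\beta} \partial_\alpha \varphi \, \partial_\beta \varphi \right)
= \tfrac12 \left( g^{\mu\beta} \partial_\beta \varphi + g^{\alpha\mu} \partial_\alpha \varphi \right)
= \partial^\mu \varphi ,
\\[4pt]
\partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi)} - \frac{\partial \mathcal{L}}{\partial \varphi}
= \partial_\mu \partial^\mu \varphi - (-m^2 \varphi)
= (\Box + m^2) \varphi = 0 .
```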
Next, let us consider a complex Klein–Gordon field. Of course, the equation of motion
is (14.11), but now the field has in fact two components; one may describe it in terms of its
real and imaginary parts, or, equivalently, $\varphi$ and $\varphi^*$ are considered as the two independent field
variables. It is easy to see that an appropriate Lagrangian reads

$$ \mathcal{L} = \partial_\mu \varphi \, \partial^\mu \varphi^* - m^2 \varphi \varphi^* . \tag{14.14} $$

The relevant derivatives are

$$ \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi)} = \partial^\mu \varphi^* , \qquad \frac{\partial \mathcal{L}}{\partial \varphi} = -m^2 \varphi^* , \tag{14.15} $$
and similarly for the complex conjugate quantities. From (14.10) one thus gets readily the
Klein–Gordon equations for $\varphi^*$ and $\varphi$, respectively. Concerning the conventional overall factor
embodied in (14.14), the same remark holds as in the preceding case.
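The absence of the factors 1/2 in (14.14) can be understood by passing to the real and imaginary parts. The decomposition below, with the normalization $1/\sqrt{2}$, is a common convention assumed here; the real fields $\varphi_1$, $\varphi_2$ are introduced only for this illustration.

```latex
\varphi = \tfrac{1}{\sqrt{2}} (\varphi_1 + i \varphi_2) , \quad \varphi_{1,2} \ \text{real}
\quad\Longrightarrow\quad
\mathcal{L} = \partial_\mu \varphi \, \partial^\mu \varphi^* - m^2 \varphi \varphi^*
= \sum_{a=1}^{2} \left( \tfrac12 \partial_\mu \varphi_a \, \partial^\mu \varphi_a - \tfrac12 m^2 \varphi_a^2 \right) .
```

In other words, (14.14) is just the sum of two copies of the real Klein–Gordon Lagrangian (14.12), each with its conventional normalization.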
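As another example, let us consider a real Proca field $A^\mu(x)$, $\mu = 0, 1, 2, 3$. This is
defined by means of the equations of motion

$$ \partial_\mu F^{\mu\nu} + m^2 A^\nu = 0 , \tag{14.16} $$

with $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$. As we know, Eq. (14.16) is equivalent to the pair

$$ (\Box + m^2) A^\mu = 0 , \qquad \partial_\mu A^\mu = 0 . \tag{14.17} $$

It turns out that the corresponding (properly normalized) Lagrangian reads

$$ \mathcal{L} = -\frac14 F_{\mu\nu} F^{\mu\nu} + \frac12 m^2 A_\mu A^\mu . \tag{14.18} $$
Let us verify that it is indeed the case. The derivative of $\mathcal{L}$ with respect to $A_\mu$ is obtained
immediately; obviously, it is
$$ \frac{\partial \mathcal{L}}{\partial A_\mu} = m^2 A^\mu . \tag{14.19} $$
The differentiation with respect to the field derivatives is slightly more complicated, so let us
now carry out the calculation in detail. One has
$$ \frac{\partial \mathcal{L}}{\partial (\partial_\rho A_\sigma)} = \frac{\partial}{\partial (\partial_\rho A_\sigma)} \left( -\frac14 F_{\mu\nu} F^{\mu\nu} \right) , \tag{14.20} $$
and the expression in parentheses is worked out as
$$ \begin{aligned}
-\frac14 F_{\mu\nu} F^{\mu\nu} &= -\frac14 (\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial^\mu A^\nu - \partial^\nu A^\mu) \\
&= -\frac12 (\partial_\mu A_\nu)(\partial^\mu A^\nu) + \frac12 (\partial_\mu A_\nu)(\partial^\nu A^\mu) .
\end{aligned} \tag{14.21} $$
So, the basic ingredients are the derivatives
$$ \frac{\partial (\partial_\mu A_\nu)}{\partial (\partial_\rho A_\sigma)} = g^\rho_\mu \, g^\sigma_\nu \tag{14.22} $$
and
$$ \frac{\partial (\partial^\mu A^\nu)}{\partial (\partial_\rho A_\sigma)} = g^{\mu\rho} g^{\nu\sigma} . \tag{14.23} $$
The relations (14.22), (14.23) are easily understood: basically, they represent a straightforward
generalization of the trivial identity $\partial x^\mu / \partial x^\nu = g^\mu_\nu$ (please remember that $\delta^\mu_\nu = g^\mu_\nu$). In this way,
one gets
$$ \begin{aligned}
\frac{\partial \mathcal{L}}{\partial (\partial_\rho A_\sigma)}
&= -\frac12 \left( g^\rho_\mu \, g^\sigma_\nu \, \partial^\mu A^\nu + \partial_\mu A_\nu \, g^{\rho\mu} g^{\sigma\nu} \right)
 + \frac12 \left( g^\rho_\mu \, g^\sigma_\nu \, \partial^\nu A^\mu + \partial_\mu A_\nu \, g^{\rho\nu} g^{\sigma\mu} \right) \\
&= -(\partial^\rho A^\sigma - \partial^\sigma A^\rho) .
\end{aligned} $$

So, the result is

$$ \frac{\partial \mathcal{L}}{\partial (\partial_\rho A_\sigma)} = -F^{\rho\sigma} . \tag{14.24} $$
Using Eq. (14.10) and the identities (14.19), (14.24) one obtains

$$ \partial_\rho F^{\rho\sigma} + m^2 A^\sigma = 0 , \tag{14.25} $$

i.e. the Proca equation is thereby recovered (just to be sure, the label $r$ in the general equations
(14.10) coincides here with $\sigma$).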
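The equivalence of (14.16) with the pair (14.17), which is quoted above from the earlier chapters, can be recalled in two lines; this is only a reminder sketch and it assumes $m \neq 0$.

```latex
\partial_\nu \left( \partial_\mu F^{\mu\nu} + m^2 A^\nu \right) = m^2 \, \partial_\nu A^\nu = 0
\qquad \text{(since } \partial_\nu \partial_\mu F^{\mu\nu} = 0 \text{ by the antisymmetry of } F^{\mu\nu} \text{)} ,
\\[4pt]
\partial_\mu F^{\mu\nu} = \Box A^\nu - \partial^\nu (\partial_\mu A^\mu) = \Box A^\nu
\quad\Longrightarrow\quad
(\Box + m^2) A^\nu = 0 .
```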
As a last example, let us consider the Dirac field satisfying the familiar equation

$$ i\gamma^\mu \partial_\mu \psi - m\psi = 0 . \tag{14.26} $$

The function $\psi$ is in general complex, so the set of dynamical field variables in Eq. (14.10) should
also include an appropriate conjugation of $\psi$, e.g. $\psi^\dagger$ or $\bar\psi$. The Dirac conjugation $\bar\psi = \psi^\dagger \gamma_0$ is
the right option, since, as we know, it is instrumental in constructing bilinear covariants like $\bar\psi\psi$,
etc. One possible form of a Lagrangian yielding Eq. (14.26) is then guessed easily. It reads

$$ \mathcal{L} = \bar\psi (i\gamma^\mu \partial_\mu - m) \psi . \tag{14.27} $$

Indeed, in (14.27) there is no derivative of $\bar\psi$, and the left-hand side of Eq. (14.26) appears
there manifestly as a (column) factor. Thus, following the general recipe (14.10), one Lagrange
equation is simply
$$ \frac{\partial \mathcal{L}}{\partial \bar\psi} = 0 . \tag{14.28} $$
This, using (14.27), gives immediately Eq. (14.26). A technical remark is in order here. To
be absolutely correct, one should work with the bispinor components of $\psi$ and $\bar\psi$, i.e. use $\psi_j$,
$j = 1, 2, 3, 4$ and similarly for $\bar\psi$, when working out Eq. (14.10). Hopefully, the compact matrix
notation employed here will not cause any confusion.
Further, differentiating (14.27) with respect to $\psi$ and $\partial_\mu \psi$, one has

$$ \frac{\partial \mathcal{L}}{\partial (\partial_\mu \psi)} = i \bar\psi \gamma^\mu , \qquad \frac{\partial \mathcal{L}}{\partial \psi} = -m \bar\psi . \tag{14.29} $$

So, apart from (14.28), one gets from (14.10) and (14.29) also

$$ i \partial_\mu \bar\psi \gamma^\mu + m \bar\psi = 0 . \tag{14.30} $$

It is easy to see that Eq. (14.30) is, as expected, just the Dirac conjugation of Eq. (14.26).
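The conjugation step can be sketched explicitly. The only ingredient assumed here beyond the text above is the standard hermiticity property of the $\gamma$-matrices, $\gamma_0 \gamma^{\mu\dagger} \gamma_0 = \gamma^\mu$, together with $\gamma_0 \gamma_0 = 1$.

```latex
\left( i\gamma^\mu \partial_\mu \psi - m\psi \right)^\dagger
= -i \, \partial_\mu \psi^\dagger \gamma^{\mu\dagger} - m \psi^\dagger = 0 ,
\\[4pt]
\left( -i \, \partial_\mu \psi^\dagger \gamma^{\mu\dagger} - m \psi^\dagger \right) \gamma_0
= -i \, \partial_\mu \bar\psi \, (\gamma_0 \gamma^{\mu\dagger} \gamma_0) - m \bar\psi
= -\left( i \, \partial_\mu \bar\psi \gamma^\mu + m \bar\psi \right) = 0 .
```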
The form (14.27) is simple and user-friendly in further applications (as we will see later
on). However, an aesthetically minded reader might worry about the asymmetric way in which
$\psi$ and $\bar\psi$ appear there. It is a legitimate observation, so one may try to find a more sophisticated
form of the Lagrangian density. Such a more symmetric solution does exist, namely
$$ \tilde{\mathcal{L}} = \frac{i}{2} \bar\psi \gamma^\mu \overleftrightarrow{\partial_\mu} \psi - m \bar\psi \psi \tag{14.31} $$
(cf. e.g. the book [6]), where the symbol $\leftrightarrow$ means the both-sided derivative, defined here
conventionally as
$$ f \overleftrightarrow{\partial_\mu} g = f \, \partial_\mu g - (\partial_\mu f) \, g . \tag{14.32} $$
So, the expression (14.31) may be recast as
$$ \tilde{\mathcal{L}} = \frac{i}{2} \bar\psi \gamma^\mu \partial_\mu \psi - \frac{i}{2} \partial_\mu \bar\psi \gamma^\mu \psi - m \bar\psi \psi . \tag{14.33} $$
The derivatives to be utilized in the general Eq. (14.10) are then e.g.

$$ \frac{\partial \tilde{\mathcal{L}}}{\partial (\partial_\mu \psi)} = \frac{i}{2} \bar\psi \gamma^\mu , \qquad \frac{\partial \tilde{\mathcal{L}}}{\partial \psi} = -\frac{i}{2} \partial_\mu \bar\psi \gamma^\mu - m \bar\psi . \tag{14.34} $$
From (14.34) and (14.10) one gets immediately Eq. (14.30). Similarly, using the derivatives
involving $\bar\psi$, one obtains directly Eq. (14.26).
Thus, one may say that the Lagrangians (14.27) and (14.31) are equivalent, in the sense
that they lead to the same equation of motion for the field $\psi$. In fact, it is a nice demonstration
of the ambiguity of the Lagrangian density, which is not determined uniquely by the equations
of motion. We have already noticed an ambiguity that is basically trivial, corresponding to the
rescaling of $\mathcal{L}$ by a constant factor. The relation between (14.27) and (14.31) is more intriguing.
As the reader may verify easily, the following relation holds:
$$ \mathcal{L} = \tilde{\mathcal{L}} + \frac{i}{2} \partial_\mu \left( \bar\psi \gamma^\mu \psi \right) , \tag{14.35} $$
where $\mathcal{L}$ denotes the Lagrangian (14.27) and $\tilde{\mathcal{L}}$ its symmetric counterpart (14.31).
The essential point is that the difference $\mathcal{L} - \tilde{\mathcal{L}}$ is equal to a four-divergence (total derivative)
of a particular expression made of $\psi$ and $\bar\psi$. This in turn means that the action is not changed
when passing from $\mathcal{L}$ to $\tilde{\mathcal{L}}$ (adding a total derivative leads, upon the spacetime integration,
to an extra surface term that vanishes because of boundary conditions). Thus, the equations of
motion following from the principle of stationary action (cf. (14.8)) should be left unchanged as
well.
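The verification of (14.35) amounts to one application of the Leibniz rule. In the sketch below, $\mathcal{L}$ stands for the Lagrangian (14.27) and $\tilde{\mathcal{L}}$ for the symmetric form (14.33); the tilde is a label adopted here to tell the two apart.

```latex
\frac{i}{2} \partial_\mu \left( \bar\psi \gamma^\mu \psi \right)
= \frac{i}{2} \partial_\mu \bar\psi \, \gamma^\mu \psi + \frac{i}{2} \bar\psi \gamma^\mu \partial_\mu \psi ,
\\[4pt]
\tilde{\mathcal{L}} + \frac{i}{2} \partial_\mu \left( \bar\psi \gamma^\mu \psi \right)
= \left( \frac{i}{2} + \frac{i}{2} \right) \bar\psi \gamma^\mu \partial_\mu \psi
+ \left( -\frac{i}{2} + \frac{i}{2} \right) \partial_\mu \bar\psi \, \gamma^\mu \psi
- m \bar\psi \psi
= \bar\psi (i\gamma^\mu \partial_\mu - m) \psi = \mathcal{L} .
```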
The next chapter will be devoted to the important issue of the conservation laws for
classical fields and their relation to symmetries. A conservation law is, basically, a consequence
of the equations of motion (an integral of motion). So, as a preliminary step, it may be useful to
carry out here a straightforward derivation of the conservation law for the energy and momentum,
in the simplest case of the real Klein–Gordon field.
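We start with Eq. (14.11), i.e.

$$ \partial_\mu \partial^\mu \varphi + m^2 \varphi = 0 . $$

Multiplying it by $\partial^\nu \varphi$, one has

$$ \partial_\mu \partial^\mu \varphi \, \partial^\nu \varphi + m^2 \varphi \, \partial^\nu \varphi = 0 , $$

and this can be recast as
$$ \begin{aligned}
0 &= \partial_\mu (\partial^\mu \varphi \, \partial^\nu \varphi) - \partial^\mu \varphi \, \partial_\mu \partial^\nu \varphi + \frac12 m^2 \partial^\nu (\varphi^2) \\
&= \partial_\mu (\partial^\mu \varphi \, \partial^\nu \varphi) - \frac12 \partial^\nu (\partial_\mu \varphi \, \partial^\mu \varphi) + \frac12 m^2 \partial^\nu (\varphi^2) .
\end{aligned} \tag{14.36} $$
Taking into account (14.12), the last identity may be rewritten as

$$ \partial_\mu (\partial^\mu \varphi \, \partial^\nu \varphi) - g^{\mu\nu} \partial_\mu \mathcal{L} = 0 . \tag{14.37} $$

Thus, we have arrived at the four-divergence identity (“continuity equation”)

$$ \partial_\mu \mathcal{T}^{\mu\nu} = 0 , \tag{14.38} $$

where
$$ \mathcal{T}^{\mu\nu} = \partial^\mu \varphi \, \partial^\nu \varphi - g^{\mu\nu} \mathcal{L} . \tag{14.39} $$
By the way, an observant reader may have noticed that our derivation has been quite similar to
what one does in classical mechanics when deriving the energy conservation for the linear harmonic
oscillator (LHO). Indeed, starting with the LHO equation of motion $\ddot{x} + \omega^2 x = 0$ and multiplying
it by $\dot{x}$, one gets immediately
$$ \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac12 \dot{x}^2 + \frac12 \omega^2 x^2 \right) = 0 , $$
and that’s it.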
Now, with such an analogy in mind, one may wonder what is in fact the meaning of
Eq. (14.38). It is not difficult to realize that the tensor $\mathcal{T}^{\mu\nu}$ is closely related, per analogiam, to
an energy. Indeed, let us consider the component $\mathcal{T}^{00}$. According to (14.39), this is

$$ \mathcal{T}^{00} = \partial_0 \varphi \, \partial_0 \varphi - \mathcal{L} . \tag{14.40} $$

However, using (14.13) one has

$$ \partial_0 \varphi = \frac{\partial \mathcal{L}}{\partial (\partial_0 \varphi)} , \tag{14.41} $$
so that the relation (14.40) is analogous to the Legendre transformation

$$ H = \dot{q} \, \frac{\partial L}{\partial \dot{q}} - L \tag{14.42} $$
that in classical mechanics leads from the Lagrange function to the Hamiltonian function (i.e. the
energy). Thus, it is quite natural to interpret $\mathcal{T}^{00}$ as the energy density (for a given field
configuration $\varphi(x)$). It remains to be shown that the energy, i.e. the integrated energy density, is
indeed conserved, i.e. constant in time. To this end, let us set $\nu = 0$ in Eq. (14.38). One then has

$$ \partial_0 \mathcal{T}^{00} = -\partial_j \mathcal{T}^{j0} . \tag{14.43} $$

When Eq. (14.43) is integrated over the three-dimensional space and one employs the Gauss
theorem to work out the right-hand side (i.e. transform it to a surface integral), this is seen to
vanish (because of the boundary conditions at infinity); thus, we are left with

$$ \partial_0 \int \mathcal{T}^{00} \, \mathrm{d}^3x = 0 , \tag{14.44} $$

and this is the desired result. Please notice also that using (14.40) along with (14.12) one gets
$$ \mathcal{T}^{00} = \frac12 \partial_0 \varphi \, \partial_0 \varphi + \frac12 \partial_j \varphi \, \partial_j \varphi + \frac12 m^2 \varphi^2 . \tag{14.45} $$
So, since $\varphi(x)$ is taken to be real, $\mathcal{T}^{00} \geq 0$.
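The conservation of the integrated energy density can also be observed numerically. The following is only an illustrative sketch under several simplifying assumptions that are not part of the text: one space dimension instead of three, a periodic lattice in place of boundary conditions at infinity, a single sine mode as the initial configuration, and a velocity-Verlet (leapfrog) time stepping. All function names and parameter values are hypothetical.

```python
import math


def make_lattice(n_sites=64, dx=0.1):
    """Initial data on a periodic 1D lattice: one sine mode at rest (an arbitrary choice)."""
    length = n_sites * dx
    phi = [math.sin(2 * math.pi * i * dx / length) for i in range(n_sites)]
    pi = [0.0] * n_sites  # pi denotes the time derivative of the field
    return phi, pi


def force(phi, dx, mass):
    """Discrete right-hand side of phi_tt = phi_xx - m^2 phi (periodic neighbours)."""
    n = len(phi)
    return [
        (phi[(i + 1) % n] - 2 * phi[i] + phi[i - 1]) / dx ** 2 - mass ** 2 * phi[i]
        for i in range(n)
    ]


def energy(phi, pi, dx, mass):
    """Lattice sum of T^00 = (1/2) pi^2 + (1/2) (d phi/dx)^2 + (1/2) m^2 phi^2."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        grad = (phi[(i + 1) % n] - phi[i]) / dx
        total += 0.5 * pi[i] ** 2 + 0.5 * grad ** 2 + 0.5 * mass ** 2 * phi[i] ** 2
    return total * dx


def evolve(phi, pi, dx, dt, mass, n_steps):
    """Velocity-Verlet time stepping of the lattice Klein-Gordon equation."""
    f = force(phi, dx, mass)
    for _ in range(n_steps):
        pi = [p + 0.5 * dt * fi for p, fi in zip(pi, f)]   # half kick
        phi = [q + dt * p for q, p in zip(phi, pi)]        # drift
        f = force(phi, dx, mass)
        pi = [p + 0.5 * dt * fi for p, fi in zip(pi, f)]   # half kick
    return phi, pi


def relative_energy_drift(n_steps=1000, dx=0.1, dt=0.01, mass=1.0):
    """Return the initial energy and the relative change of energy after n_steps."""
    phi, pi = make_lattice(dx=dx)
    e_start = energy(phi, pi, dx, mass)
    phi, pi = evolve(phi, pi, dx, dt, mass, n_steps)
    e_end = energy(phi, pi, dx, mass)
    return e_start, abs(e_end - e_start) / e_start


if __name__ == "__main__":
    e0, drift = relative_energy_drift()
    print("initial energy:", e0, " relative drift:", drift)
```

The energy stays constant up to a small discretization error that shrinks with the time step, and it is manifestly positive, in accordance with (14.45).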
Having identified the energy density, we may also call the components $\mathcal{T}^{0j}$, $j = 1, 2, 3$,
the momentum density, since, repeating the preceding procedure, one gets first

$$ \partial_0 \mathcal{T}^{0j} = -\partial_k \mathcal{T}^{kj} , \tag{14.46} $$

and subsequently
$$ \partial_0 \int \mathcal{T}^{0j} \, \mathrm{d}^3x = 0 . \tag{14.47} $$

Thus, we conclude that the four-component quantity

$$ P^\mu = \int \mathcal{T}^{0\mu} \, \mathrm{d}^3x , \qquad \mu = 0, 1, 2, 3 , \tag{14.48} $$

may be consistently interpreted as the conserved four-momentum of the Klein–Gordon field.
The tensor $\mathcal{T}^{\mu\nu}$ is called the energy–momentum tensor; as we have seen, its components
represent the relevant densities.

Chapter 15

Conservation laws from symmetries

A direct derivation of the energy–momentum tensor from the equations of motion is not always
so easy and straightforward as in the case of the Klein–Gordon field described in the preceding
chapter. Fortunately, there is an elegant general method of obtaining conservation laws, which
relies on the symmetry properties of the Lagrangian (or the action) of the considered field theory
model. Such a method originates in the work of Emmy Noether published more than 100 years
ago; so, what we are going to present in this chapter is a particular implementation of the famous
Noether’s theorem.
In general, the analysis of conservation laws à la Noether starts usually by considering
the invariance of the action that defines the dynamics of a given physical system. Since our
playground is the classical field theory, one may try to employ directly the Lagrangian density
as the basic quantity, instead of the action. It turns out that such a simplified approach does
work, if the notion of the Lagrangian symmetry is defined properly. So, let us consider a general
Lagrangian density of the form (14.5). There are basically two possible types of the symmetries:
either a transformation of spacetime coordinates is involved (which induces an appropriate field
transformation), or the coordinates are left unchanged, and only the functional form of the field
is transformed (this is the case of the so-called internal symmetry). Let us start with the
former case, corresponding to symmetries of geometric origin. We consider the transformation
of coordinates and fields
$$ x^\mu \longrightarrow x'^\mu , \qquad \varphi_r(x) \longrightarrow \varphi'_r(x') , \tag{15.1} $$

and assume that the form of the Lagrangian does not change under (15.1), i.e. that
$$ \mathcal{L}\left( \varphi'_r(x'), \frac{\partial \varphi'_r(x')}{\partial x'^\mu} \right) = \mathcal{L}\left( \varphi_r(x), \frac{\partial \varphi_r(x)}{\partial x^\mu} \right) . \tag{15.2} $$

For obvious reasons, we will call such a relation the form-invariance of the Lagrangian.¹⁰ It
is not difficult to realize that the Lagrangians discussed in the preceding chapter satisfy the
condition (15.2) when the coordinate transformation in (15.1) is a spacetime translation or
Lorentz transformation; more about this later. For the purpose of further discussion, let us
introduce the following notation:
$$ \begin{aligned}
\mathcal{L}'(x) &= \mathcal{L}\left( \varphi'_r(x), \frac{\partial \varphi'_r(x)}{\partial x^\mu} \right) , \\
\delta\mathcal{L}(x) &= \mathcal{L}'(x) - \mathcal{L}(x) , \\
\delta\varphi_r(x) &= \varphi'_r(x) - \varphi_r(x) , \\
\delta x^\mu &= x'^\mu - x^\mu .
\end{aligned} \tag{15.3} $$

¹⁰ Note that such a term is not quite common in the current literature, but occasionally it is used explicitly, see e.g. the
book [12].

Note that the term “form-invariance” has been introduced so as to distinguish the relation (15.2)
from a plain invariance that would mean, in the shorthand notation (15.3),
$$ \mathcal{L}'(x) = \mathcal{L}(x) , \tag{15.4} $$
i.e.
$$ \delta\mathcal{L}(x) = 0 . \tag{15.5} $$
Obviously, such a relation would be relevant in the case of an internal symmetry (to be discussed
later on).
To proceed, we will consider continuous transformations of the type (15.1) (i.e. those
depending on some continuous parameters), and these may be restricted to the infinitesimal
form. Utilizing the notation (15.3), the form-invariance condition (15.2) may be written as
$$ \mathcal{L}'(x') = \mathcal{L}(x) , \tag{15.6} $$
i.e.
$$ \mathcal{L}'(x') = \mathcal{L}(x' - \delta x) . \tag{15.7} $$
Neglecting terms of higher order in $\delta x$, one then gets
$$ \mathcal{L}'(x) = \mathcal{L}(x - \delta x) \tag{15.8} $$
(the reader is encouraged to check independently the last statement). The relation (15.8) may be
further recast in the form
$$ \mathcal{L}'(x) - \mathcal{L}(x) = \mathcal{L}(x - \delta x) - \mathcal{L}(x) \tag{15.9} $$
that is worth noting: while its left-hand side represents the variation of $\mathcal{L}(x)$ due to the variation
$\varphi_r(x) \to \varphi'_r(x)$, the right-hand side is the change of the value of $\mathcal{L}(x)$ due to the (infinitesimal)
coordinate transformation. Thus, the identity (15.9) may be expressed as
$$ \frac{\partial \mathcal{L}}{\partial \varphi_r} \delta\varphi_r + \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \delta(\partial_\mu \varphi_r) = -\delta x^\mu \frac{\partial \mathcal{L}}{\partial x^\mu} \tag{15.10} $$
(where one is summing over the index $r$ on the left-hand side).
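The passage from (15.9) to (15.10) is just a first-order Taylor expansion of both sides; a brief sketch, with all terms of second and higher order in the infinitesimal quantities dropped, reads:

```latex
\mathcal{L}'(x) - \mathcal{L}(x)
= \mathcal{L}\big( \varphi_r + \delta\varphi_r , \, \partial_\mu \varphi_r + \delta(\partial_\mu \varphi_r) \big)
- \mathcal{L}\big( \varphi_r , \, \partial_\mu \varphi_r \big)
\simeq \frac{\partial \mathcal{L}}{\partial \varphi_r} \delta\varphi_r
+ \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \delta(\partial_\mu \varphi_r) ,
\\[4pt]
\mathcal{L}(x - \delta x) - \mathcal{L}(x) \simeq -\delta x^\mu \frac{\partial \mathcal{L}}{\partial x^\mu} .
```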
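As we know, one has to invoke the equations of motion in order to obtain a conservation
law. So, one employs the Euler–Lagrange equations (14.10) to express the derivative $\partial\mathcal{L}/\partial\varphi_r$,
and one may also use the obvious identity $\delta(\partial_\mu \varphi_r) = \partial_\mu (\delta\varphi_r)$. One thus gets
$$ \left( \partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \right) \delta\varphi_r + \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \partial_\mu (\delta\varphi_r) = -\delta x^\mu \frac{\partial \mathcal{L}}{\partial x^\mu} , $$
and this becomes, finally,
$$ \frac{\partial}{\partial x^\mu} \left( \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \delta\varphi_r \right) + \delta x^\mu \frac{\partial \mathcal{L}}{\partial x^\mu} = 0 . \tag{15.11} $$

The general relation (15.11) may be called Noether’s identity, and one expects that it will
yield some relevant conservation laws if the appropriate transformations (15.1) are considered.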
Let us start with spacetime translations. The transformation of coordinates is simply

$$ x'^\mu = x^\mu + a^\mu , \tag{15.12} $$

where $a^\mu$, $\mu = 0, 1, 2, 3$, are some constants. It is easy to see that the Lagrangians we have
discussed in the preceding chapter are form-invariant if the transformed fields are defined by

$$ \varphi'_r(x') = \varphi_r(x) . \tag{15.13} $$

For infinitesimal translations, we write

$$ x'^\mu = x^\mu + \varepsilon^\mu , $$

i.e. $\delta x^\mu = \varepsilon^\mu$, and
$$ \varphi'_r(x) = \varphi_r(x - \varepsilon) = \varphi_r(x) - \varepsilon^\mu \frac{\partial \varphi_r}{\partial x^\mu} . $$
Thus, one has
$$ \delta\varphi_r(x) = -\varepsilon^\mu \frac{\partial \varphi_r}{\partial x^\mu} . \tag{15.14} $$
Substituting these expressions into the relation (15.11), one gets first
$$ \frac{\partial}{\partial x^\mu} \left( -\frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \, \varepsilon^\nu \frac{\partial \varphi_r}{\partial x^\nu} \right) + \varepsilon^\nu \frac{\partial \mathcal{L}}{\partial x^\nu} = 0 , $$

and this can be recast as

$$ \partial_\mu \mathcal{T}^{\mu\nu} \varepsilon_\nu = 0 , \tag{15.15} $$
where
$$ \mathcal{T}^{\mu\nu} = \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi_r)} \partial^\nu \varphi_r - g^{\mu\nu} \mathcal{L} . \tag{15.16} $$
The identity (15.15) holds for an arbitrary infinitesimal $\varepsilon_\nu$, so one may conclude that

$$ \partial_\mu \mathcal{T}^{\mu\nu} = 0 , \qquad \nu = 0, 1, 2, 3 . \tag{15.17} $$

The quantity $\mathcal{T}^{\mu\nu}$ is called the canonical energy–momentum tensor. It is reassuring that using
the general formula (15.16) for the real Klein–Gordon field one recovers the expression (14.39)
we have found before by means of a direct manipulation with the equation of motion. One
may notice that (15.16) looks like a covariantized form of the Legendre transformation $L \to H$
in classical mechanics, which we have recalled in the preceding chapter (cf. (14.42)). This
observation may serve as a helpful mnemonic for remembering the important result (15.16).
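The statement about the Klein–Gordon field can be checked in one line by inserting (14.13) into (15.16). As a further illustration, which goes beyond what is worked out in the text and should be taken as a sketch, the same formula applied to the Dirac Lagrangian (14.27) with the derivatives (14.29) gives the second expression below (the field $\bar\psi$ does not contribute, since (14.27) contains no derivative of it).

```latex
\text{Klein--Gordon:} \qquad
\mathcal{T}^{\mu\nu} = \frac{\partial \mathcal{L}}{\partial (\partial_\mu \varphi)} \partial^\nu \varphi - g^{\mu\nu} \mathcal{L}
= \partial^\mu \varphi \, \partial^\nu \varphi - g^{\mu\nu} \mathcal{L} ,
\\[4pt]
\text{Dirac:} \qquad
\mathcal{T}^{\mu\nu} = \frac{\partial \mathcal{L}}{\partial (\partial_\mu \psi)} \partial^\nu \psi - g^{\mu\nu} \mathcal{L}
= i \bar\psi \gamma^\mu \partial^\nu \psi - g^{\mu\nu} \mathcal{L} .
```

Note that for fields satisfying the Dirac equation (14.26) the Lagrangian (14.27) itself vanishes, so that the second term drops out on the solutions of the equation of motion.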
We already know that the four-divergence equations (15.17) for 𝜈 = 0, 1, 2, 3 lead to
four constants of motion, which can be identified with the energy and momentum. Let us
emphasize the nice feature of the above derivation, namely that we have thereby confirmed
an expected connection between the four-momentum conservation and the invariance under
spacetime translations (in a more philosophical parlance, the energy and momentum conservation
is a consequence of the homogeneity of the flat four-dimensional spacetime).
The canonical form (15.16) is certainly suitable for computing the energy and momentum
(defined as the corresponding space integrals of densities), but the tensor $\mathcal{T}^{\mu\nu}$ itself does not
always possess the properties we would like to have. In particular, for some applications it
is important to have a symmetric tensor (recall that it stands e.g. on the right-hand side of
Einstein’s gravitational equations), but, as it turns out, the formula (15.16) does not always
guarantee such a property. A prominent example is the Maxwell field, whose Lagrangian is
given by (14.18) with $m = 0$. In general, the way out of such a problem is to realize that $\mathcal{T}^{\mu\nu}$ is
not determined uniquely by Eq. (15.17); one may always add to the canonical expression (15.16)
an appropriate term $X^{\mu\nu}$ such that $\partial_\mu X^{\mu\nu} = 0$.
Let us now discuss this point for the aforementioned case of the Maxwell field. One has
$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} , \tag{15.18}$$
with 𝐹𝜇𝜈 = 𝜕𝜇 𝐴𝜈 − 𝜕𝜈 𝐴 𝜇 . The canonical energy–momentum tensor is thus given by

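$$\mathcal{T}^{\mu\nu}_{\text{can.}} = \frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\rho)}\,\partial^\nu A_\rho - g^{\mu\nu}\mathcal{L} \tag{15.19}$$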

(so that 𝜌 now plays the role of the index 𝑟 in (15.16)). According to (14.24) it holds

$$\frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\rho)} = -F^{\mu\rho} . \tag{15.20}$$

Substituting this into (15.19) and using (15.18), one gets

$$\mathcal{T}^{\mu\nu}_{\text{can.}} = -F^{\mu\rho}\,\partial^\nu A_\rho + \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} . \tag{15.21}$$
It is clear that the expression (15.21) has two flaws: it is not gauge invariant (since its first term
depends explicitly on the potential 𝐴 𝜌 ) and it is not symmetric under 𝜇 ↔ 𝜈 (also because of
the first term). To get some hint how to proceed further, let us examine the variation of (15.21)
under the gauge transformation
𝐴 𝜌 −→ 𝐴 𝜌 + 𝜕𝜌 𝑓 , (15.22)
where 𝑓 is an arbitrary differentiable function. Carrying out the change (15.22) in (15.21) one
gets

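$$\begin{aligned}
\mathcal{T}^{\prime\,\mu\nu}_{\text{can.}} &= -F^{\mu\rho}\left(\partial^\nu A_\rho + \partial^\nu\partial_\rho f\right) + \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} \\
&= \mathcal{T}^{\mu\nu}_{\text{can.}} - F^{\mu\rho}\,\partial^\nu\partial_\rho f \\
&= \mathcal{T}^{\mu\nu}_{\text{can.}} - \partial_\rho\left(F^{\mu\rho}\,\partial^\nu f\right) + \left(\partial_\rho F^{\mu\rho}\right)\partial^\nu f .
\end{aligned} \tag{15.23}$$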

In the last term one can use the equation of motion, i.e. 𝜕𝜌 𝐹 𝜇𝜌 = 0, and we are thus left with
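$$\mathcal{T}^{\prime\,\mu\nu}_{\text{can.}} = \mathcal{T}^{\mu\nu}_{\text{can.}} - \partial_\rho\left(F^{\mu\rho}\,\partial^\nu f\right) . \tag{15.24}$$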

The desired hint is now clear: one may add to the expression (15.21) the term

$$\Delta^{\mu\nu} = \partial_\rho\left(F^{\mu\rho} A^\nu\right) \tag{15.25}$$
that is designed to compensate the gauge dependence of the original $\mathcal{T}^{\mu\nu}_{\text{can.}}$. At the same time, it
holds
$$\partial_\mu \Delta^{\mu\nu} = 0$$
(because of the antisymmetry of $F^{\mu\rho}$), so that the divergence equation (15.17) remains valid for
the modified energy–momentum tensor. So, instead of (15.21) one may consider
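$$\begin{aligned}
\mathcal{T}^{\mu\nu} &= \mathcal{T}^{\mu\nu}_{\text{can.}} + \Delta^{\mu\nu} \\
&= -F^{\mu\rho}\,\partial^\nu A_\rho + \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} + \partial_\rho\left(F^{\mu\rho} A^\nu\right) \\
&= -F^{\mu\rho}\,\partial^\nu A_\rho + F^{\mu\rho}\,\partial_\rho A^\nu + \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} \\
&= F^{\mu}{}_{\rho} F^{\rho\nu} + \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} ,
\end{aligned} \tag{15.26}$$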
where we have used the equation of motion 𝜕𝜌 𝐹 𝜇𝜌 = 0. Now the gauge invariance of the new
tensor is manifest and there is even an additional bonus: the expression (15.26) is also symmetric
under the interchange 𝜇 ↔ 𝜈! Indeed, one has
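$$F^{\mu}{}_{\rho} F^{\rho\nu} \;\xrightarrow{\;\mu\leftrightarrow\nu\;}\; F^{\nu}{}_{\rho} F^{\rho\mu} = -F^{\nu}{}_{\rho} F^{\mu\rho} = -F^{\nu\rho} F^{\mu}{}_{\rho} = F^{\rho\nu} F^{\mu}{}_{\rho}$$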

(of course, the second term in (15.26) is manifestly symmetric). Thus, one may conclude that
our achievement consists in identifying a symmetric, gauge invariant energy–momentum tensor
for the Maxwell field, which reads

$$\mathcal{T}^{\mu\nu}_{\text{sym.}} = F^{\mu}{}_{\rho} F^{\rho\nu} + \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} . \tag{15.27}$$
It is also possible to check that the energy density corresponding to (15.27) has the familiar value

$$\mathcal{T}^{00}_{\text{sym.}} = \frac{1}{2}\left(\vec{E}^{\,2} + \vec{B}^{\,2}\right) , \tag{15.28}$$

where $\vec{E}$ and $\vec{B}$ denote the electric and magnetic field strength, respectively. To this end, one
can use the relations $F^{j0} = E^j$ and $F^{jk} = -\varepsilon^{jkl} B^l$.
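The check of (15.28), and of the symmetry of (15.27), can also be done numerically. The following is a minimal sketch, not part of the original notes: it assumes the metric signature (+,−,−,−), builds $F^{\mu\nu}$ from arbitrarily chosen sample values of $\vec{E}$ and $\vec{B}$ using the relations quoted above, and evaluates (15.27). The helper names (`field_tensor`, `t_sym`, `levi_civita`) are hypothetical.

```python
# Sketch: evaluate the symmetric Maxwell tensor (15.27) for sample field values.
# Assumptions: metric diag(+1, -1, -1, -1); E and B below are arbitrary numbers.

G = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu} (equal to that of g^{mu nu})

def levi_civita(j, k, l):
    """Antisymmetric symbol for indices 1..3, normalised so that eps(1,2,3) = 1."""
    return (j - k) * (k - l) * (l - j) / 2

def field_tensor(E, B):
    """Contravariant F^{mu nu} with F^{j0} = E^j and F^{jk} = -eps^{jkl} B^l."""
    F = [[0.0] * 4 for _ in range(4)]
    for j in range(1, 4):
        F[j][0] = E[j - 1]
        F[0][j] = -E[j - 1]
        for k in range(1, 4):
            F[j][k] = -sum(levi_civita(j, k, l) * B[l - 1] for l in range(1, 4))
    return F

def t_sym(F):
    """T^{mu nu} = F^mu_rho F^{rho nu} + (1/4) g^{mu nu} F_{ab} F^{ab}."""
    # For a diagonal metric, F_{ab} = g_aa g_bb F^{ab}.
    invariant = sum(G[a] * G[b] * F[a][b] ** 2 for a in range(4) for b in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for mu in range(4):
        for nu in range(4):
            # F^mu_rho = F^{mu rho} g_{rho rho} for a diagonal metric.
            T[mu][nu] = sum(F[mu][r] * G[r] * F[r][nu] for r in range(4))
            if mu == nu:
                T[mu][nu] += 0.25 * G[mu] * invariant
    return T

E = [0.3, -1.2, 0.7]
B = [1.1, 0.4, -0.5]
T = t_sym(field_tensor(E, B))
energy_density = 0.5 * (sum(e * e for e in E) + sum(b * b for b in B))
print("T^00 =", T[0][0], " (E^2 + B^2)/2 =", energy_density)
```

With any other choice of `E` and `B` the two printed numbers should agree to rounding accuracy, and `T` should come out symmetric.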
As a last example, let us consider the case of the Dirac field. According to (15.16) one
has, in general
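$$\mathcal{T}^{\mu\nu} = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\partial^\nu\psi + \partial^\nu\bar\psi\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\bar\psi)} - g^{\mu\nu}\mathcal{L} . \tag{15.29}$$
For the computation of the energy and momentum it is convenient to employ the simple form
(14.27) for the Lagrangian, since it vanishes for the solutions of the Dirac equation, and,
moreover, it does not contain derivatives of $\bar\psi$. Thus, the expression (15.29) is then reduced to
$$\mathcal{T}^{\mu\nu} = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\partial^\nu\psi = i\,\bar\psi\gamma^\mu\partial^\nu\psi . \tag{15.30}$$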

We will utilize this simple formula later on, when discussing the quantization of the Dirac field.
Next, we are going to examine conservation laws associated with the Lorentz symmetry.
It is not difficult to realize that the Lagrangians we have considered up to now possess the
form-invariance under proper Lorentz transformations 𝑥 ′ = Λ𝑥, where Λ is, in general, a six-
parametric matrix encompassing spatial rotations and boosts. Indeed, for the Klein–Gordon
field (which is a Lorentz scalar) one may use the trivial transformation 𝜑′ (𝑥 ′) = 𝜑(𝑥). For the
Dirac field (which is a bispinor), an appropriate choice is 𝜓 ′ (𝑥 ′) = 𝑆(Λ)𝜓(𝑥) with 𝑆(Λ) being
the 4 × 4 matrix that we have uncovered in Chapter 4 (cf. Eq. (4.9)). Similarly, for the Proca
and Maxwell fields one may assume that 𝐴 𝜇 (𝑥) is transformed as a four-vector (i.e. through the
matrix Λ itself). These remarks justify the usual statement that the Lorentz form-invariance of
the Lagrangians in question is manifest.
So, let us consider an infinitesimal Lorentz transformation. As we know, it can be written
as
𝑥 ′𝜇 = 𝑥 𝜇 + 𝛿𝜔 𝜇𝜈 𝑥 𝜈 , (15.31)
where 𝛿𝜔 𝜇𝜈 = −𝛿𝜔 𝜈𝜇 are the six independent parameters for rotations and boosts. The field
transformation is, in general, given by
 
$$\varphi'_r(x') = \left(1 - \frac{i}{4}\,\delta\omega_{\alpha\beta}\,\Sigma^{\alpha\beta}\right)_{rs} \varphi_s(x) , \tag{15.32}$$

where the generators Σ 𝛼𝛽 stand for the corresponding representation of the Lorentz algebra. For
instance,
$$\Sigma^{\alpha\beta} = \sigma^{\alpha\beta} = \frac{i}{2}\,[\gamma^\alpha, \gamma^\beta] \tag{15.33}$$
for the Dirac bispinor field. Referring to the notation introduced in (15.3), the relation (15.31)
means that
𝛿𝑥 𝜇 = 𝛿𝜔 𝜇𝜈 𝑥 𝜈 , (15.34)
and for the field variation 𝛿𝜑𝑟 one gets

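$$\begin{aligned}
\delta\varphi_r(x) &= \varphi'_r(x) - \varphi_r(x) = \varphi'_r(x') - \varphi_r(x) + \varphi'_r(x) - \varphi'_r(x') \\
&= -\frac{i}{4}\,\delta\omega_{\alpha\beta}\,\Sigma^{\alpha\beta}_{rs}\,\varphi_s(x) - \frac{\partial\varphi'_r}{\partial x^\alpha}\,\delta x^\alpha \\
&= -\frac{i}{4}\,\delta\omega_{\alpha\beta}\,\Sigma^{\alpha\beta}_{rs}\,\varphi_s(x) - \frac{\partial}{\partial x^\alpha}\left(\varphi_r + \delta\varphi_r\right)\delta x^\alpha .
\end{aligned}$$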
Thus, neglecting the higher-order term involving 𝛿𝜑𝑟 𝛿𝑥 𝛼 , one is left with
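$$\delta\varphi_r = -\frac{i}{4}\,\delta\omega_{\alpha\beta}\,\Sigma^{\alpha\beta}_{rs}\,\varphi_s(x) - \delta\omega_{\alpha\beta}\, x^\beta\,\frac{\partial\varphi_r}{\partial x_\alpha} . \tag{15.35}$$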
For later convenience, the last relation can be rewritten as
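$$\delta\varphi_r(x) = -\frac{i}{2}\,\delta\omega_{\alpha\beta}\left[\frac{1}{2}\,\Sigma^{\alpha\beta}_{rs}\,\varphi_s(x) + i\left(x^\alpha\partial^\beta - x^\beta\partial^\alpha\right)\varphi_r(x)\right] , \tag{15.36}$$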
where we have utilized the antisymmetry of the parameters 𝛿𝜔𝛼𝛽 .
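Now, Noether’s relation (15.11) reads
$$\frac{\partial}{\partial x^\mu}\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}\,\delta\varphi_r\right] + \delta\omega_{\alpha\beta}\, x^\beta\,\frac{\partial\mathcal{L}}{\partial x_\alpha} = 0 . \tag{15.37}$$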
In order to turn this into a “continuity equation” analogous to (15.17), one has to work out the
second term:
 
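$$\delta\omega_{\alpha\beta}\, x^\beta\,\frac{\partial\mathcal{L}}{\partial x_\alpha} = \delta\omega_{\alpha\beta}\left[\frac{\partial}{\partial x_\alpha}\left(x^\beta\mathcal{L}\right) - g^{\alpha\beta}\mathcal{L}\right] = \delta\omega_{\alpha\beta}\,\frac{\partial}{\partial x_\alpha}\left(x^\beta\mathcal{L}\right) . \tag{15.38}$$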

Note that for arriving at (15.38) we have taken into account that 𝛿𝜔𝛼𝛽 𝑔 𝛼𝛽 = 0 because of the
antisymmetry of 𝛿𝜔𝛼𝛽 . After some simple manipulations, the expression (15.38) can be finally
recast as
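$$\delta\omega_{\alpha\beta}\, x^\beta\,\frac{\partial\mathcal{L}}{\partial x_\alpha} = \frac{1}{2}\,\delta\omega_{\alpha\beta}\,\frac{\partial}{\partial x^\mu}\left[\left(g^{\alpha\mu} x^\beta - g^{\beta\mu} x^\alpha\right)\mathcal{L}\right] . \tag{15.39}$$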
So, using (15.36) and (15.39), the relation (15.37) yields
$$\frac{\partial}{\partial x^\mu}\,\mathcal{M}^{\mu\alpha\beta} = 0 , \tag{15.40}$$
where
 
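$$\mathcal{M}^{\mu\alpha\beta} = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}\left[-\frac{i}{2}\,\Sigma^{\alpha\beta}_{rs}\,\varphi_s + \left(x^\alpha\partial^\beta - x^\beta\partial^\alpha\right)\varphi_r\right] + \left(g^{\alpha\mu} x^\beta - g^{\beta\mu} x^\alpha\right)\mathcal{L} . \tag{15.41}$$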

Eq. (15.40) is the desired “continuity equation”, or, if you want, “current conservation” (in fact,
a set of six such equations) corresponding to the Lorentz symmetry. It is interesting that the
formula (15.41) can also be recast in terms of the canonical energy–momentum tensor (15.16),
namely
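$$\mathcal{M}^{\mu\alpha\beta} = x^\alpha\,\mathcal{T}^{\mu\beta}_{\text{can.}} - x^\beta\,\mathcal{T}^{\mu\alpha}_{\text{can.}} - \frac{i}{2}\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}\,\Sigma^{\alpha\beta}_{rs}\,\varphi_s \tag{15.42}$$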
(checking (15.42) is an easy task).
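For completeness, the check can be sketched as follows. It uses only the definition (15.16) of the canonical tensor, written here with the field index $r$ summed over:

```latex
\begin{align*}
x^\alpha \mathcal{T}^{\mu\beta}_{\text{can.}} - x^\beta \mathcal{T}^{\mu\alpha}_{\text{can.}}
 &= x^\alpha\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}\,\partial^\beta\varphi_r
      - g^{\mu\beta}\mathcal{L}\right]
  - x^\beta\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}\,\partial^\alpha\varphi_r
      - g^{\mu\alpha}\mathcal{L}\right] \\
 &= \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}
      \left(x^\alpha\partial^\beta - x^\beta\partial^\alpha\right)\varphi_r
  + \left(g^{\alpha\mu}x^\beta - g^{\beta\mu}x^\alpha\right)\mathcal{L} .
\end{align*}
```

Adding the spin term $-\frac{i}{2}\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}\,\Sigma^{\alpha\beta}_{rs}\varphi_s$ to both sides reproduces the right-hand side of (15.41).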
Let us now consider some particular examples. In the simplest case of the scalar Klein–
Gordon field one has Σ 𝛼𝛽 = 0, so that
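$$\mathcal{M}^{\mu\alpha\beta}_{\text{scalar}} = x^\alpha\,\mathcal{T}^{\mu\beta}_{\text{can.}} - x^\beta\,\mathcal{T}^{\mu\alpha}_{\text{can.}} . \tag{15.43}$$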

The identity (15.40) then yields


   
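$$\begin{aligned}
0 &= \partial_\mu\left(x^\alpha\,\mathcal{T}^{\mu\beta}_{\text{can.}}\right) - \partial_\mu\left(x^\beta\,\mathcal{T}^{\mu\alpha}_{\text{can.}}\right) \\
&= g^{\alpha}{}_{\mu}\,\mathcal{T}^{\mu\beta}_{\text{can.}} + x^\alpha\,\partial_\mu\mathcal{T}^{\mu\beta}_{\text{can.}} - g^{\beta}{}_{\mu}\,\mathcal{T}^{\mu\alpha}_{\text{can.}} - x^\beta\,\partial_\mu\mathcal{T}^{\mu\alpha}_{\text{can.}} \\
&= \mathcal{T}^{\alpha\beta}_{\text{can.}} - \mathcal{T}^{\beta\alpha}_{\text{can.}} .
\end{aligned} \tag{15.44}$$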

Thus we see that there is a simple general consequence of the equations (15.40) and (15.42):
for a scalar field the energy–momentum tensor is always symmetric. An elementary illustration
of this fact is provided by our formula (14.39) for the free field, but the result (15.44) is quite
general (i.e. it is valid also for interacting scalar fields described by Lagrangians involving terms
beyond the quadratic form (14.12)).
As an example of a situation involving non-trivial Σ 𝛼𝛽 in Eq. (15.42), it is quite instructive
to consider the Dirac field. Before working out this particular example, a general remark is in
order. One might guess that the most interesting components of the quantity M 𝜇𝛼𝛽 are those
with (𝛼𝛽) = ( 𝑗 𝑘), 𝑗, 𝑘 = 1, 2, 3. Why? The reason is that such a combination of indices 𝛼,
𝛽 corresponds to spatial rotations, and one knows from other theories (classical and quantum
mechanics) that the rotational invariance is intimately connected with the conservation of angular
momentum. So, one should be ready to identify the corresponding quantities

$$M^{jk} = \int \mathrm{d}^3x\, \mathcal{M}^{0jk} , \tag{15.45}$$

with components of the field angular momentum (note that owing to the antisymmetry in the
indices 𝑗, 𝑘 there are just three independent components 𝑀 𝑗 𝑘 ). Now, let us turn to the case of
the Dirac field. According to (15.42) and taking into account (15.33), one has
$$\mathcal{M}^{0jk} = x^j\,\mathcal{T}^{0k} - x^k\,\mathcal{T}^{0j} - \frac{i}{2}\cdot i\,\bar\psi\gamma^0\,\frac{i}{2}\,[\gamma^j, \gamma^k]\,\psi . \tag{15.46}$$
Components of the energy–momentum tensor are given by (15.30), so that e.g.
$$\mathcal{T}^{0j} = i\,\bar\psi\gamma^0\partial^j\psi = -i\,\psi^\dagger\frac{\partial}{\partial x^j}\psi . \tag{15.47}$$
One thus gets, finally,
$$\mathcal{M}^{0jk} = \psi^\dagger\left(x^j p^k - x^k p^j + \frac{1}{2}\,\sigma^{jk}\right)\psi , \tag{15.48}$$
where we have used a suggestive notation
$$p^n = -i\,\frac{\partial}{\partial x^n} , \qquad n = j, k . \tag{15.49}$$
Furthermore, it holds $\sigma^{jk} = \varepsilon^{jkl}\Sigma^l$, where $\Sigma^l$ (or, more precisely, $\frac{1}{2}\Sigma^l$), $l = 1, 2, 3$, are the spin
matrices that we know from the chapters on the relativistic quantum mechanics. So one has, for
instance,
$$\mathcal{M}^{012} = \psi^\dagger\left(x^1 p^2 - x^2 p^1 + \frac{1}{2}\,\Sigma^3\right)\psi , \tag{15.50}$$
and this offers another suggestive notation, namely $x^1 p^2 - x^2 p^1 = L^3$. In general, one may
introduce the vector $\vec{L}$ defined by
$$x^j p^k - x^k p^j = \varepsilon^{jkl} L^l , \tag{15.51}$$
and in this way one is led to the vector dual of the antisymmetric tensor (15.48) (tensor under
spatial rotations), i.e. $\mathcal{M}^{0jk} \to \vec{\mathcal{M}}$, where
$$\vec{\mathcal{M}} = \psi^\dagger\left(\vec{L} + \frac{1}{2}\,\vec{\Sigma}\right)\psi . \tag{15.52}$$

The conserved angular momentum for the Dirac field is then



$$\vec{M} = \int \vec{\mathcal{M}}\; \mathrm{d}^3x . \tag{15.53}$$

We should also recall that the conserved field momentum is, according to (15.47),

$$\vec{P} = \int \mathrm{d}^3x\; \psi^\dagger\left(-i\vec{\nabla}\right)\psi . \tag{15.54}$$

An astute reader might now say that our results (15.52) through (15.54) are a sort of
“déjà vu”, in the sense that they involve quantum mechanical operators of the momentum and
angular momentum. In fact, it is not surprising; within the relativistic quantum mechanics,
the expressions (15.53), (15.54) would represent the relevant expectation values and these are
certainly constant in time. As we have stressed repeatedly, the classical field considered here is
a completely different system, but it satisfies the same Dirac equation, and this is what really
matters.
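The relation $\sigma^{jk} = \varepsilon^{jkl}\Sigma^l$ used on the way to (15.50) can be verified by brute force. The sketch below is an illustration only: it assumes the standard (Dirac) representation of the $\gamma$-matrices, in which $\gamma^j$ has the off-diagonal blocks $\pm\sigma_j$ and $\Sigma^l$ is block-diagonal with two copies of the Pauli matrix $\sigma_l$; all function names are hypothetical.

```python
# Sketch: check sigma^{jk} = (i/2)[gamma^j, gamma^k] = eps^{jkl} Sigma^l.
# Assumption: standard (Dirac) representation of the gamma matrices.

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def gamma(j):
    """gamma^j (j = 1, 2, 3) as the block matrix [[0, s_j], [-s_j, 0]]."""
    s = PAULI[j - 1]
    g = [[0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            g[a][b + 2] = s[a][b]
            g[a + 2][b] = -s[a][b]
    return g

def spin(l):
    """Sigma^l as the block-diagonal matrix diag(s_l, s_l)."""
    s = PAULI[l - 1]
    m = [[0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            m[a][b] = s[a][b]
            m[a + 2][b + 2] = s[a][b]
    return m

def sigma(j, k):
    """sigma^{jk} = (i/2) [gamma^j, gamma^k]."""
    ab, ba = matmul(gamma(j), gamma(k)), matmul(gamma(k), gamma(j))
    return [[0.5j * (ab[r][c] - ba[r][c]) for c in range(4)] for r in range(4)]

def levi_civita(j, k, l):
    """Antisymmetric symbol for indices 1..3, with eps(1,2,3) = 1."""
    return (j - k) * (k - l) * (l - j) / 2

def max_deviation(j, k):
    """Largest matrix element of sigma^{jk} - eps^{jkl} Sigma^l."""
    lhs = sigma(j, k)
    spins = [spin(l) for l in range(1, 4)]
    return max(
        abs(lhs[r][c] - sum(levi_civita(j, k, l) * spins[l - 1][r][c]
                            for l in range(1, 4)))
        for r in range(4) for c in range(4)
    )

worst = max(max_deviation(j, k) for j in range(1, 4) for k in range(1, 4))
print("largest deviation:", worst)
```

The printed deviation should vanish up to rounding; in particular $\sigma^{12} = \Sigma^3$, which is the combination appearing in (15.50).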
As a last item of our program we are going to consider the case of conservation laws
associated with internal symmetries. This, in a sense, is the simplest situation, since the
coordinate transformations are not involved here. As we have noted earlier, one may then say
that the Lagrangian is truly invariant under the field transformation in question (cf. Eq. (15.4)).
So, suppose that the Lagrangian is invariant under a continuous transformation 𝜑𝑟 (𝑥) → 𝜑′𝑟 (𝑥);
its infinitesimal form is given by a variation 𝛿𝜑𝑟 (𝑥) and, at the same time, one sets 𝛿𝑥 = 0.
Noether’s identity (15.11) thus reads
$$\partial_\mu\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_r)}\,\delta\varphi_r\right] = 0 . \tag{15.55}$$
It is clear that such an identity yields immediately the desired conserved currents (or “conti-
nuity equations”); it is sufficient to disentangle the dependence of the variations 𝛿𝜑𝑟 on the
transformation parameters. Let us show, on some instructive examples, how it works.
Our first example will refer to the complex Klein–Gordon field. Its Lagrangian

L = (𝜕𝜇 𝜑)(𝜕 𝜇 𝜑∗ ) − 𝑚 2 𝜑𝜑∗

is obviously invariant under phase transformations

$$\varphi'(x) = e^{i\alpha}\varphi(x) , \qquad \varphi^{*\prime}(x) = e^{-i\alpha}\varphi^*(x) , \tag{15.56}$$

where 𝛼 is a real constant parameter. Infinitesimal variations 𝛿𝜑, 𝛿𝜑∗ are then

𝛿𝜑(𝑥) = 𝑖𝛿𝛼𝜑(𝑥) , 𝛿𝜑∗ (𝑥) = −𝑖𝛿𝛼𝜑∗ (𝑥) . (15.57)

Substituting this into the identity (15.55) one gets


 
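$$0 = \partial_\mu\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi)}\,\delta\varphi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi^*)}\,\delta\varphi^*\right] = \partial_\mu\left[(\partial^\mu\varphi^*)\, i\delta\alpha\,\varphi - (\partial^\mu\varphi)\, i\delta\alpha\,\varphi^*\right] .$$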
It means that the conserved current satisfying 𝜕𝜇 𝐽 𝜇 = 0 has the form

𝐽 𝜇 = 𝑖 [𝜑(𝜕 𝜇 𝜑∗ ) − (𝜕 𝜇 𝜑)𝜑∗ ] (15.58)

(an attentive reader may recall that we have in fact arrived at such a result in Chapter 1 by means
of direct manipulations with the Klein–Gordon equation).
Next, one may repeat an analogous exercise with the Dirac field. In this case, the
infinitesimal phase transformations are written as

$$\delta\psi(x) = i\delta\alpha\,\psi(x) , \qquad \delta\bar\psi(x) = -i\delta\alpha\,\bar\psi(x) . \tag{15.59}$$

Noether’s identity (15.11) then leads immediately to the current conservation equation
$\partial_\mu J^\mu = 0$ with
$$J^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x) \tag{15.60}$$
(again, we have thus recovered the expression found earlier in Chapter 4).
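The intermediate step can be written out as follows. This is a sketch that assumes the Lagrangian (14.27) has the form $\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi$, which is consistent with the result (15.30) but is not quoted in this chapter:

```latex
\begin{align*}
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} &= i\bar\psi\gamma^\mu ,
\qquad
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\bar\psi)} = 0 , \\
0 &= \partial_\mu\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\delta\psi\right]
   = \partial_\mu\left[i\bar\psi\gamma^\mu\cdot i\delta\alpha\,\psi\right]
   = -\delta\alpha\;\partial_\mu\left(\bar\psi\gamma^\mu\psi\right) .
\end{align*}
```

Since $\delta\alpha$ is an arbitrary constant, the current $\bar\psi\gamma^\mu\psi$ is conserved.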
There is another interesting point concerning the Dirac field that is worth mentioning
here. It turns out that for 𝑚 = 0 its Lagrangian possesses an extra internal symmetry, in addition
to the phase invariance discussed above. The corresponding transformations read

$$\psi'(x) = e^{i\alpha\gamma_5}\psi(x) , \qquad \bar\psi'(x) = \bar\psi(x)\, e^{i\alpha\gamma_5} \tag{15.61}$$

(where $\alpha$ is a real constant parameter), and usually they are called “chiral”, because the matrix
$\gamma_5$ is involved here. The reader is urged to check that the transformations for $\psi$ and $\bar\psi$ are
mutually compatible (it is easy, one has just to utilize the anticommutativity property of $\gamma_5$).
The infinitesimal form of (15.61) is immediately clear, and from Noether’s identity (15.11)
one thus gets readily the current conservation

$$\partial_\mu J^{\mu}_{5} = 0 , \tag{15.62}$$

with
$$J^{\mu}_{5} = \bar\psi\gamma^\mu\gamma_5\psi . \tag{15.63}$$
Recall that whereas the current (15.60) is a true Lorentz four-vector, the expression (15.63) is
an axial vector (cf. (5.15)). One might wonder what happens with such an axial vector current
for 𝑚 ≠ 0. The answer is straightforward: employing the Dirac equation, one gets

$$\partial_\mu J^{\mu}_{5} = 2im\,\bar\psi\gamma_5\psi . \tag{15.64}$$

So, in the limit 𝑚 → 0 Eq. (15.62) is recovered. Discovering the extra conservation law for
the massless Dirac field, embodied in (15.62), has been an elementary exercise at this level. In
fact, the concept of the chiral symmetry has been immensely important in the development of
particle physics in the 20th century (but a more detailed treatment of this topic would go far beyond
the scope of these lecture notes).
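The relation (15.64) follows in a few lines. The sketch below assumes the Dirac equation in the form $i\gamma^\mu\partial_\mu\psi = m\psi$, together with its conjugate $i\,\partial_\mu\bar\psi\,\gamma^\mu = -m\bar\psi$, and uses the anticommutativity $\gamma^\mu\gamma_5 = -\gamma_5\gamma^\mu$:

```latex
\begin{align*}
\partial_\mu\left(\bar\psi\gamma^\mu\gamma_5\psi\right)
 &= \left(\partial_\mu\bar\psi\,\gamma^\mu\right)\gamma_5\psi
  + \bar\psi\,\gamma^\mu\gamma_5\,\partial_\mu\psi \\
 &= \left(\partial_\mu\bar\psi\,\gamma^\mu\right)\gamma_5\psi
  - \bar\psi\,\gamma_5\left(\gamma^\mu\partial_\mu\psi\right) \\
 &= \left(im\,\bar\psi\right)\gamma_5\psi - \bar\psi\,\gamma_5\left(-im\,\psi\right)
  = 2im\,\bar\psi\gamma_5\psi .
\end{align*}
```

The same manipulation without the $\gamma_5$ gives a relative plus sign between the two terms, so they cancel and the vector current (15.60) is conserved for any $m$.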
Let us now summarize briefly the main results we have achieved here. First of all, we
have found ten conservation laws, embodied in the identities (15.17) and (15.40). Four of the
“conserved currents”, which coincide with the canonical energy–momentum tensor, correspond
to spacetime translations, and the remaining six are related to the Lorentz symmetry. All
together they reflect the well-known Poincaré invariance, which is supposed to be a universal
property of the laws of relativistic physics. Apart from this, for some models there are also
“accidental” internal symmetries that lead to additional conservation laws. In principle, the
conserved quantities may be derived directly from the field equations of motion; however, the
procedure à la Emmy Noether is certainly much more efficient. So, the general identity (15.11)
is indeed an invaluable tool in this context.
There is another point that should be stressed. Most of the classical fields we have
considered here are rather abstract entities that are not realized as physical objects in nature;
an obvious exception is the Maxwell field (and, of course, the gravitational field, which, however,
is beyond our scope in the present context). In particular, one can certainly never encounter
a classical Dirac field in our Universe. Nevertheless, this is by no means a drawback of our
approach. The real meaning of our effort is that we have developed a basic formalism for
the description of relativistic fields, and these may now be quantized. Historically, the first
attempt in such a direction concerned the electromagnetic Maxwell field, and it resulted in a
successful description of the light quanta — photons. The procedure of the field quantization
was subsequently extended to encompass other fields and the corresponding particles, and it
marked the birth of the methods of the quantum field theory that we are using today. Precisely
this, i.e. the theory of quantized fields and their relation to relativistic particles, constitutes the
contents of the forthcoming chapters.

Chapter 16

Canonical quantization of real scalar field

We are going to start our tour of the field quantization schemes with the simplest possible case,
which is, of course, a real Klein–Gordon field. By the way, maybe it is not entirely impossible
that you come across such a classical field when wandering around the universe, since it does
appear in some cosmological models (though, admittedly, it is still a highly speculative matter).
Anyway, it is not our concern here. Our goal is to find out what is the physical picture that
emerges upon quantizing such a field.
So, let us take up this task. The real scalar field we have in mind is described, classically,
by means of a single function $\varphi(x) = \varphi(\vec{x}, t)$ that may be understood as a “generalized coordinate”
within the framework of the Lagrangian formalism developed in chapters 14 and 15. Using the
familiar Lagrangian density
$$\mathcal{L} = \frac{1}{2}\,\partial_\mu\varphi\,\partial^\mu\varphi - \frac{1}{2}\, m^2\varphi^2 , \tag{16.1}$$
one may define, in analogy with the classical particle mechanics, the “conjugate momentum”
$$\pi(x) = \frac{\partial\mathcal{L}}{\partial(\partial_0\varphi)} = \partial_0\varphi(x) . \tag{16.2}$$
Now, with the “canonical variables” (coordinate and momentum) at hand, one may consider
𝜑(𝑥) and 𝜋(𝑥) as operators (in an as yet unspecified Hilbert space), and try to imitate the
commutation relations of the ordinary quantum mechanics. Before doing this, one should
realize that $\varphi(x) = \varphi(\vec{x}, t)$ and $\pi(x) = \pi(\vec{x}, t)$ are to be understood as operators in the Heisenberg
picture, due to their time dependence. Canonical quantum mechanical commutation relations
read (in the Schrödinger picture)
[𝑥 𝑗 , 𝑝 𝑘 ] = 𝑖𝛿 𝑗 𝑘 ,
[𝑥 𝑗 , 𝑥 𝑘 ] = 0 , (16.3)
[ 𝑝 𝑗 , 𝑝𝑘 ] = 0 .
It is easy to see that in the Heisenberg picture, where
𝑥 𝑗 (𝑡) = 𝑒𝑖𝐻𝑡 𝑥 𝑗 𝑒 −𝑖𝐻𝑡 , 𝑝 𝑘 (𝑡) = 𝑒𝑖𝐻𝑡 𝑝 𝑘 𝑒 −𝑖𝐻𝑡 ,
the commutators (16.3) remain the same. So, the equations (16.3) are equivalent to the corre-
sponding set of equal-time commutation relations (ETCR). Inspired by these observations,
one may postulate the following “canonical” commutation relations for the field variables 𝜑(𝑥)
and 𝜋(𝑥):
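$$\begin{aligned}
[\varphi(\vec{x}, t), \pi(\vec{y}, t)] &= i\,\delta^{(3)}(\vec{x} - \vec{y}) , \\
[\varphi(\vec{x}, t), \varphi(\vec{y}, t)] &= 0 , \\
[\pi(\vec{x}, t), \pi(\vec{y}, t)] &= 0 .
\end{aligned} \tag{16.4}$$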

Note that we have used here $\delta^{(3)}(\vec{x} - \vec{y})$ as a counterpart of the Kronecker delta appearing in
(16.3). This seems to be a quite natural replacement, since, as we have noted in Chapter 14,
in field theory the spatial coordinates $\vec{x}$ play the role of the discrete indices used for labelling
dynamical variables in the particle mechanics. One can also check that mass dimensions come
out right in the commutator of $\varphi$ and $\pi$. It may not be obvious at first sight, so let us explain it in
detail. Within our system of units with $\hbar = c = 1$ the dimension of the Lagrangian density $\mathcal{L}$ is
$M^4$ (where $M$ is an arbitrary mass), simply because the action (14.7) is dimensionless (due to
$\hbar = 1$) and $\mathrm{d}^4x$ has the dimension (length)$^4$, which is $M^{-4}$. Thus, the dimension of the field $\varphi$
in (16.1) is $M$, and, subsequently, $\pi = \partial_0\varphi$ must have dimension $M^2$ (why?). It means that the
commutator of $\varphi$ and $\pi$ has dimension $M^3$. On the other hand, the delta function $\delta^{(3)}(\vec{x} - \vec{y})$ has
dimension (length)$^{-3}$ (why?), which is $M^3$, and this completes the argument.
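The two questions left open in the text can be answered by a short piece of dimension counting. In the sketch below, $[X]$ denotes the mass dimension of $X$ in units with $\hbar = c = 1$, and the only inputs are that the action is dimensionless and that every derivative carries one power of mass:

```latex
\begin{align*}
[S] = 1,\quad [\mathrm{d}^4x] = M^{-4}
   &\;\Longrightarrow\; [\mathcal{L}] = M^{4} , \\
[\partial_\mu\varphi\,\partial^\mu\varphi] = M^{2}[\varphi]^{2} = M^{4}
   &\;\Longrightarrow\; [\varphi] = M , \\
[\pi] = [\partial_0\varphi] = M\cdot[\varphi]
   &\;\Longrightarrow\; [\pi] = M^{2} ,\quad [\varphi\,\pi] = M^{3} , \\
\int \mathrm{d}^3x\;\delta^{(3)}(\vec{x}-\vec{y}) = 1
   &\;\Longrightarrow\; [\delta^{(3)}] = (\text{length})^{-3} = M^{3} .
\end{align*}
```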
A general solution of the Klein–Gordon equation for the classical field $\varphi(x)$ can be
written as
$$\varphi(x) = \int \mathrm{d}^3k\; N_k\left[a(k)\, e^{-ikx} + a^*(k)\, e^{ikx}\right] , \tag{16.5}$$
where $k = (k_0, \vec{k})$ with $k_0 = (\vec{k}^{\,2} + m^2)^{1/2}$, $N_k = (2k_0)^{-1/2}(2\pi)^{-3/2}$, and $a(k)$ (in fact $a(\vec{k})$) are
some arbitrary expansion coefficients. The normalization factor $N_k$ is conventional and has been
chosen so for later convenience. Note also that in (16.5) we have taken into account that $\varphi(x)$
is supposed to be real. Upon quantization, a classical real $\varphi(x)$ becomes a Hermitian operator
written as
$$\varphi(x) = \int \mathrm{d}^3k\; N_k\left[a(k)\, e^{-ikx} + a^\dagger(k)\, e^{ikx}\right] , \tag{16.6}$$
where the expansion coefficients 𝑎(𝑘), 𝑎 † (𝑘) are operators (of course, 𝑎 † (𝑘) denotes the Her-
mitian conjugation of 𝑎(𝑘)). Having postulated the canonical commutation relations (16.4),
we may now find out what are the corresponding formulae for 𝑎(𝑘), 𝑎 † (𝑘). These operator
coefficients can be extracted from 𝜑(𝑥) by utilizing appropriate orthogonality relations for the
normalized exponential functions
𝑓 𝑘 (𝑥) = 𝑁 𝑘 𝑒 −𝑖𝑘𝑥 , 𝑓 𝑘∗ (𝑥) = 𝑁 𝑘 𝑒𝑖𝑘𝑥 (16.7)
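appearing in (16.6). It holds
$$\begin{aligned}
\int f_k^*(\vec{x}, t)\; i\overleftrightarrow{\partial_0}\, f_{k'}(\vec{x}, t)\; \mathrm{d}^3x &= \delta^{(3)}(\vec{k} - \vec{k}') , \\
\int f_k(\vec{x}, t)\; i\overleftrightarrow{\partial_0}\, f_{k'}(\vec{x}, t)\; \mathrm{d}^3x &= 0 ,
\end{aligned} \tag{16.8}$$
where we have used the symbol $\overleftrightarrow{\partial_0}$ for the two-sided derivative defined earlier (cf. (14.32)). The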
proof of the identities (16.8) is not difficult and can be left to the reader as an instructive exercise.
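A sketch of that exercise is given below. It assumes that the two-sided derivative of (14.32) is defined by $f\overleftrightarrow{\partial_0}g = f\,\partial_0 g - (\partial_0 f)\,g$, and it writes $kx = k_0 t - \vec{k}\cdot\vec{x}$:

```latex
\begin{align*}
f_k^*\; i\overleftrightarrow{\partial_0}\, f_{k'}
 &= i\left[f_k^*\,(-ik'_0)\,f_{k'} - (ik_0)\,f_k^*\,f_{k'}\right]
  = N_k N_{k'}\,(k_0 + k'_0)\; e^{i(k-k')x} , \\
\int \mathrm{d}^3x\; f_k^*\; i\overleftrightarrow{\partial_0}\, f_{k'}
 &= N_k N_{k'}\,(k_0 + k'_0)\; e^{i(k_0-k'_0)t}\,(2\pi)^3\,\delta^{(3)}(\vec{k}-\vec{k}')
  = N_k^2\; 2k_0\,(2\pi)^3\,\delta^{(3)}(\vec{k}-\vec{k}')
  = \delta^{(3)}(\vec{k}-\vec{k}') , \\
\int \mathrm{d}^3x\; f_k\; i\overleftrightarrow{\partial_0}\, f_{k'}
 &= N_k N_{k'}\,(k'_0 - k_0)\; e^{-i(k_0+k'_0)t}\,(2\pi)^3\,\delta^{(3)}(\vec{k}+\vec{k}') = 0 .
\end{align*}
```

In the second line the delta function sets $\vec{k}' = \vec{k}$, hence $k'_0 = k_0$, and $N_k^2 = (2k_0)^{-1}(2\pi)^{-3}$ does the rest. In the third line $\vec{k}' = -\vec{k}$ again gives $k'_0 = k_0$, so the prefactor vanishes.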
Employing the formulae (16.8), one gets from (16.6)


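$$\begin{aligned}
a(k) &= i\int \mathrm{d}^3x\; f_k^*(\vec{x}, t)\,\overleftrightarrow{\partial_0}\,\varphi(\vec{x}, t) , \\
a^\dagger(k) &= -i\int \mathrm{d}^3x\; f_k(\vec{x}, t)\,\overleftrightarrow{\partial_0}\,\varphi(\vec{x}, t) .
\end{aligned} \tag{16.9}$$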

Now we are in a position to work out the commutators of the operators 𝑎(𝑘), 𝑎 † (𝑘). One has
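$$\begin{aligned}
[a(k), a^\dagger(k')] &= \iint \mathrm{d}^3x\, \mathrm{d}^3y\, \left[f_k^*(\vec{x}, t)\,\overleftrightarrow{\partial_0}\,\varphi(\vec{x}, t),\; f_{k'}(\vec{y}, t)\,\overleftrightarrow{\partial_0}\,\varphi(\vec{y}, t)\right] \\
&= \iint \mathrm{d}^3x\, \mathrm{d}^3y\, \left[f_k^*(\vec{x}, t)\,\dot\varphi(\vec{x}, t) - \dot f_k^*(\vec{x}, t)\,\varphi(\vec{x}, t),\; f_{k'}(\vec{y}, t)\,\dot\varphi(\vec{y}, t) - \dot f_{k'}(\vec{y}, t)\,\varphi(\vec{y}, t)\right] .
\end{aligned} \tag{16.10}$$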
Now, using (16.2), the last expression may be recast in terms of the canonical commutators
(16.4). Discarding immediately the vanishing commutators of the type [𝜑, 𝜑] and [𝜋, 𝜋], one is
left with

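$$\begin{aligned}
[a(k), a^\dagger(k')] &= \iint \mathrm{d}^3x\, \mathrm{d}^3y\, \Big\{ -f_k^*(\vec{x}, t)\,\dot f_{k'}(\vec{y}, t)\,[\pi(\vec{x}, t), \varphi(\vec{y}, t)] - \dot f_k^*(\vec{x}, t)\, f_{k'}(\vec{y}, t)\,[\varphi(\vec{x}, t), \pi(\vec{y}, t)] \Big\} \\
&= \iint \mathrm{d}^3x\, \mathrm{d}^3y\, \Big\{ i\,\delta^{(3)}(\vec{x} - \vec{y})\, f_k^*(\vec{x}, t)\,\dot f_{k'}(\vec{y}, t) - i\,\delta^{(3)}(\vec{x} - \vec{y})\,\dot f_k^*(\vec{x}, t)\, f_{k'}(\vec{y}, t) \Big\} \\
&= i\int \mathrm{d}^3x\; f_k^*(\vec{x}, t)\,\overleftrightarrow{\partial_0}\, f_{k'}(\vec{x}, t) \\
&= \delta^{(3)}(\vec{k} - \vec{k}') ,
\end{aligned} \tag{16.11}$$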


where we have utilized (16.8) in the final step. In a similar way, it can be shown that
[𝑎(𝑘), 𝑎(𝑘 ′)] = 0 (which immediately implies also [𝑎 † (𝑘), 𝑎 † (𝑘 ′)] = 0). Thus, our results
can be summarized as
$$\begin{aligned}
[a(k), a^\dagger(k')] &= \delta^{(3)}(\vec{k} - \vec{k}') , \\
[a(k), a(k')] &= 0 , \\
[a^\dagger(k), a^\dagger(k')] &= 0 .
\end{aligned} \tag{16.12}$$

As a next step towards the physical interpretation of the quantized scalar field, let us
calculate the energy and momentum according to the formulae derived in the preceding chapters.
Needless to say, these quantities now become operators (hopefully, time-independent) expressed
eventually in terms of 𝑎(𝑘) and 𝑎 † (𝑘). We are going to start with (the operator of) the energy.
Anticipating its role as the Hamiltonian of the quantized field, we denote it as 𝐻. One has
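$$H = \int \mathcal{T}^{00}\, \mathrm{d}^3x = \int \left(\frac{1}{2}\,\partial_0\varphi\,\partial_0\varphi + \frac{1}{2}\,\vec{\nabla}\varphi\cdot\vec{\nabla}\varphi + \frac{1}{2}\, m^2\varphi^2\right) \mathrm{d}^3x , \tag{16.13}$$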
where we have employed the formula (14.45). The calculation is straightforward, but rather
lengthy, so please be patient (at subsequent stages of our quantization tour we will be able to
proceed faster, utilizing the skills gained here).
Substituting the expression (16.6) into (16.13), we will evaluate consecutively the indi-
vidual terms appearing there. First,
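$$\begin{aligned}
H_1 &= \frac{1}{2}\int \mathrm{d}^3x\; \partial_0\varphi\,\partial_0\varphi \\
&= \frac{1}{2}\iint \mathrm{d}^3k\, \mathrm{d}^3l\; N_k N_l \Big[ (-ik_0)(-il_0)\, e^{-i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a(k)a(l) \\
&\qquad + (-ik_0)(il_0)\, e^{-i(k_0-l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a(k)a^\dagger(l) \\
&\qquad + (ik_0)(-il_0)\, e^{i(k_0-l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a^\dagger(k)a(l) \\
&\qquad + (ik_0)(il_0)\, e^{i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a^\dagger(k)a^\dagger(l) \Big] .
\end{aligned} \tag{16.14}$$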

Note that in arriving at (16.14) we have performed the integration with $\mathrm{d}^3x$ producing the delta
functions $\delta^{(3)}(\vec{k} \pm \vec{l})$. Taking into account how $k_0$, $l_0$ depend on $\vec{k}$, $\vec{l}$, the expression (16.14) is
readily simplified to
$$\begin{aligned}
H_1 = \frac{1}{2}\int \mathrm{d}^3k\; N_k^2\,(2\pi)^3 \Big[ &-k_0^2\, e^{-2ik_0x_0}\, a(k)a(-k) + k_0^2\, a(k)a^\dagger(k) \\
&+ k_0^2\, a^\dagger(k)a(k) - k_0^2\, e^{2ik_0x_0}\, a^\dagger(k)a^\dagger(-k) \Big] .
\end{aligned} \tag{16.15}$$
Just to be sure, let us stress that $a(-k)$, $a^\dagger(-k)$ here in fact mean $a(-\vec{k})$, $a^\dagger(-\vec{k})$.
In a similar way, for the second term in (16.13) one obtains
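$$\begin{aligned}
H_2 &= \frac{1}{2}\int \mathrm{d}^3x\; \vec{\nabla}\varphi\cdot\vec{\nabla}\varphi \\
&= \frac{1}{2}\iint \mathrm{d}^3k\, \mathrm{d}^3l\; N_k N_l \Big[ -\vec{k}\cdot\vec{l}\; e^{-i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a(k)a(l) \\
&\qquad + \vec{k}\cdot\vec{l}\; e^{-i(k_0-l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a(k)a^\dagger(l) \\
&\qquad + \vec{k}\cdot\vec{l}\; e^{i(k_0-l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a^\dagger(k)a(l) \\
&\qquad - \vec{k}\cdot\vec{l}\; e^{i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a^\dagger(k)a^\dagger(l) \Big] .
\end{aligned} \tag{16.16}$$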

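The integration over the variable $\vec{l}$ then obviously yields
$$\begin{aligned}
H_2 = \frac{1}{2}\int \mathrm{d}^3k\; N_k^2\,(2\pi)^3 \Big[ &\vec{k}^{\,2}\, e^{-2ik_0x_0}\, a(k)a(-k) + \vec{k}^{\,2}\, a(k)a^\dagger(k) \\
&+ \vec{k}^{\,2}\, a^\dagger(k)a(k) + \vec{k}^{\,2}\, e^{2ik_0x_0}\, a^\dagger(k)a^\dagger(-k) \Big] .
\end{aligned} \tag{16.17}$$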

Finally,
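$$\begin{aligned}
H_3 &= \frac{1}{2}\int \mathrm{d}^3x\; m^2\varphi^2 \\
&= \frac{1}{2}\, m^2 \iint \mathrm{d}^3k\, \mathrm{d}^3l\; N_k N_l \Big[ e^{-i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a(k)a(l) \\
&\qquad + e^{-i(k_0-l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a(k)a^\dagger(l) \\
&\qquad + e^{i(k_0-l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a^\dagger(k)a(l) \\
&\qquad + e^{i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a^\dagger(k)a^\dagger(l) \Big] ,
\end{aligned} \tag{16.18}$$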

and this becomes


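$$\begin{aligned}
H_3 = \frac{1}{2}\int \mathrm{d}^3k\; N_k^2\,(2\pi)^3\, m^2 \Big[ &e^{-2ik_0x_0}\, a(k)a(-k) + a(k)a^\dagger(k) \\
&+ a^\dagger(k)a(k) + e^{2ik_0x_0}\, a^\dagger(k)a^\dagger(-k) \Big] .
\end{aligned} \tag{16.19}$$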

Putting all this together, it is seen that the sum of the time-dependent terms in (16.15),
(16.17) and (16.19) includes the overall coefficient $-k_0^2 + \vec{k}^{\,2} + m^2$, which is zero. It is gratifying,
since we have anticipated a time-independent energy as a result of correct calculation. On the
other hand, the sum of coefficients in the time-independent terms amounts to $k_0^2 + \vec{k}^{\,2} + m^2 = 2k_0^2$.
Taking also into account that $N_k^2 = (2\pi)^{-3}(2k_0)^{-1}$ one then gets, after a simple manipulation,
$$H = H_1 + H_2 + H_3 = \frac{1}{2}\int \mathrm{d}^3k\; k_0\left[a^\dagger(k)a(k) + a(k)a^\dagger(k)\right] . \tag{16.20}$$
An observant reader may have noticed already that the integrand in the last expression bears a
striking resemblance to the Hamiltonian of the linear harmonic oscillator (written in terms of
the popular “ladder” operators). It is indeed so and we will elaborate on this point in the next
chapter.
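The “simple manipulation” behind (16.20) amounts to collecting the prefactors of the time-independent terms in (16.15), (16.17) and (16.19):

```latex
\begin{align*}
\frac{1}{2}\,N_k^2\,(2\pi)^3\left(k_0^2 + \vec{k}^{\,2} + m^2\right)
 &= \frac{1}{2}\cdot\frac{1}{(2\pi)^3\,2k_0}\cdot(2\pi)^3\cdot 2k_0^2
  = \frac{1}{2}\,k_0 , \\
\frac{1}{2}\,N_k^2\,(2\pi)^3\left(-k_0^2 + \vec{k}^{\,2} + m^2\right) &= 0 .
\end{align*}
```

The first line is the coefficient of $a(k)a^\dagger(k) + a^\dagger(k)a(k)$; the second is the common coefficient of the two time-dependent terms.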

Further, let us consider the (operator of) momentum. As we know, it is given by the space
integral of the components $\mathcal{T}^{0j}$ of the energy–momentum tensor; according to (14.39) this is
$$P^j = \int \mathrm{d}^3x\; \partial_0\varphi\,\partial^j\varphi . \tag{16.21}$$
For convenience, we will switch to the notation involving the gradient symbol $\vec{\nabla}$, similarly to
(15.54). Recall that $\vec{\nabla}$ represents conventionally the derivatives $\partial/\partial x^j$ ($\equiv \partial_j$). Thus, (16.21)
can be recast as
$$\vec{P} = -\int \mathrm{d}^3x\; \partial_0\varphi\,\vec{\nabla}\varphi . \tag{16.22}$$
Substituting the expression (16.6) into (16.22) and proceeding similarly as before, one gets first
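$$\begin{aligned}
\vec{P} = \iint \mathrm{d}^3k\, \mathrm{d}^3l\; N_k N_l \Big[ &-k_0\,\vec{l}\; e^{-i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a(k)a(l) \\
&+ k_0\,\vec{l}\;(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a(k)a^\dagger(l) \\
&+ k_0\,\vec{l}\;(2\pi)^3\delta^{(3)}(\vec{k}-\vec{l})\, a^\dagger(k)a(l) \\
&- k_0\,\vec{l}\; e^{i(k_0+l_0)x_0}\,(2\pi)^3\delta^{(3)}(\vec{k}+\vec{l})\, a^\dagger(k)a^\dagger(l) \Big] ,
\end{aligned} \tag{16.23}$$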
and this is readily turned into the form



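$$\begin{aligned}
\vec{P} = \int \mathrm{d}^3k\; N_k^2\,(2\pi)^3 \Big[ &k_0\,\vec{k}\; e^{-2ik_0x_0}\, a(k)a(-k) + k_0\,\vec{k}\; e^{2ik_0x_0}\, a^\dagger(k)a^\dagger(-k) \\
&+ k_0\,\vec{k}\; a(k)a^\dagger(k) + k_0\,\vec{k}\; a^\dagger(k)a(k) \Big] .
\end{aligned} \tag{16.24}$$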

Now, one would like to get rid of the time-dependent terms in (16.24). This is done quite easily.
Indeed, these contributions to the integrand are odd functions under $\vec{k} \to -\vec{k}$, since all factors
appearing there are even, except $\vec{k}$; in particular, the products $a(k)a(-k)$ and $a^\dagger(k)a^\dagger(-k)$ are
even, due to the commutation relations (16.12). Thus, the first two terms in (16.24) vanish upon
the integration. The final result can be then written, after a simple manipulation, as
$$\vec{P} = \frac{1}{2}\int \mathrm{d}^3k\; \vec{k}\left[a^\dagger(k)a(k) + a(k)a^\dagger(k)\right] . \tag{16.25}$$

We will study the properties of the operators $H$ and $\vec{P}$ in detail in the next chapter.
However, there is at least one simple observation that should be mentioned already here, to
convince ourselves that we are on the right track. One should rightly expect that the energy
operator (16.20) is a correct quantum Hamiltonian, in the sense that it controls the time evolution
of the field operator $\varphi(\vec{x}, t)$. To see this, let us evaluate the relevant commutator $[H, \varphi(x)]$. First,
one can show that
$$\begin{aligned}
[H, a(k)] &= -k_0\, a(k) , \\
[H, a^\dagger(k)] &= k_0\, a^\dagger(k) .
\end{aligned} \tag{16.26}$$

Proving (16.26) is not difficult. Indeed, using the basic commutation relations (16.12), it is easy
to see that

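$$\begin{aligned}
[a^\dagger(l)a(l), a(k)] &= [a(l)a^\dagger(l), a(k)] = -\delta^{(3)}(\vec{k} - \vec{l})\, a(l) , \\
[a^\dagger(l)a(l), a^\dagger(k)] &= [a(l)a^\dagger(l), a^\dagger(k)] = \delta^{(3)}(\vec{k} - \vec{l})\, a^\dagger(l) .
\end{aligned} \tag{16.27}$$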
Then, employing the formula (16.20), one has
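$$\begin{aligned}
[H, a(k)] &= \frac{1}{2}\int \mathrm{d}^3l\; l_0\left[a^\dagger(l)a(l) + a(l)a^\dagger(l),\; a(k)\right] \\
&= \frac{1}{2}\int \mathrm{d}^3l\; l_0\left(-\delta^{(3)}(\vec{k} - \vec{l})\, a(l) - \delta^{(3)}(\vec{k} - \vec{l})\, a(l)\right) \\
&= -k_0\, a(k) .
\end{aligned}$$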
Similarly, one gets the second identity (16.26). With the identities (16.26) at hand, the calculation
of the commutator [𝐻, 𝜑(𝑥)] is straightforward. One has

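$$\begin{aligned}
i[H, \varphi(x)] &= i\int \mathrm{d}^3k\; N_k\left([H, a(k)]\, e^{-ikx} + [H, a^\dagger(k)]\, e^{ikx}\right) \\
&= \int \mathrm{d}^3k\; N_k\left(-ik_0\, a(k)\, e^{-ikx} + ik_0\, a^\dagger(k)\, e^{ikx}\right) .
\end{aligned} \tag{16.28}$$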

However, the expression (16.28) is precisely what one obtains for the time derivative 𝜕0 𝜑 by
using (16.6). Thus, we have arrived at the equation of motion (in the Heisenberg picture)
𝜕0 𝜑(𝑥) = 𝑖[𝐻, 𝜑(𝑥)] , (16.29)
which is certainly reassuring.
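The algebra behind (16.26) can be illustrated on a single mode. The sketch below is a toy model, not the field-theoretic calculation itself: it represents one pair $a$, $a^\dagger$ by truncated matrices in the occupation-number basis, takes $H = \frac{1}{2}k_0(a^\dagger a + a a^\dagger)$ as the one-mode analogue of (16.20), and compares $[H, a]$ with $-k_0 a$. The cutoff `DIM` and the value `K0` are arbitrary assumptions.

```python
# Sketch: one field mode with truncated ladder-operator matrices.
# DIM (cutoff) and K0 (mode energy k_0) are arbitrary illustrative values.
import math

DIM = 8
K0 = 1.7

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def lowering(dim):
    """Matrix of a in the occupation-number basis: a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

a = lowering(DIM)
a_dag = transpose(a)  # the matrix is real, so conjugation is just transposition

# One-mode analogue of (16.20): H = (k_0 / 2) (a^dagger a + a a^dagger)
n_op, anti = matmul(a_dag, a), matmul(a, a_dag)
H = [[0.5 * K0 * (n_op[i][j] + anti[i][j]) for j in range(DIM)]
     for i in range(DIM)]

Ha, aH = matmul(H, a), matmul(a, H)
commutator = [[Ha[i][j] - aH[i][j] for j in range(DIM)] for i in range(DIM)]

# Compare [H, a] with -k_0 a. The last column is skipped, because the cutoff
# spoils the relation [a, a^dagger] = 1 for the highest retained state.
deviation = max(abs(commutator[i][j] + K0 * a[i][j])
                for i in range(DIM) for j in range(DIM - 1))
print("max deviation away from the cutoff:", deviation)
```

The truncation is the only reason for excluding the last column; for the full infinite-dimensional operators the relation holds exactly.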
In a similar way, using (16.26), one gets easily
𝜕 𝑗 𝜑(𝑥) = 𝑖[𝑃 𝑗 , 𝜑(𝑥)] , 𝑗 = 1, 2, 3 . (16.30)
Thus, (16.29) and (16.30) may be written summarily as a covariant relation
𝜕 𝜇 𝜑(𝑥) = 𝑖[𝑃 𝜇 , 𝜑(𝑥)] , 𝜇 = 0, 1, 2, 3 , (16.31)
for the commutator of quantized field with the four-momentum operator. Obviously, the identity
(16.31) indicates that 𝑃 𝜇 is the generator of spacetime shifts (of course, this is in agreement with
our expectations). Now, employing the well-known operator identity
$$e^{A} B\, e^{-A} = B + \frac{1}{1!}\,[A, B] + \frac{1}{2!}\,[A, [A, B]] + \ldots \tag{16.32}$$
along with (16.31), it is not difficult to obtain an elegant formula for the shift of the spacetime
coordinates in 𝜑(𝑥), namely
𝜑(𝑥 + 𝑎) = 𝑒𝑖𝑃·𝑎 𝜑(𝑥) 𝑒 −𝑖𝑃·𝑎 . (16.33)
Proving (16.33) is left to an interested reader as an instructive exercise.
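One possible route through this exercise is sketched below. It sets $A = iP\cdot a = i\,a_\mu P^\mu$ in (16.32), with $a_\mu$ a constant four-vector, and uses (16.31) repeatedly (the relation also holds with $\varphi$ replaced by any of its derivatives, as follows by differentiating (16.31)):

```latex
\begin{align*}
[\,iP\cdot a,\; \varphi(x)\,] &= a_\mu\, i[P^\mu, \varphi(x)] = (a\cdot\partial)\,\varphi(x) , \\
\underbrace{[\,iP\cdot a, [\,iP\cdot a, \ldots [\,iP\cdot a}_{n\ \text{times}},\; \varphi(x)\,]\ldots]]
  &= (a\cdot\partial)^n\,\varphi(x) , \\
e^{iP\cdot a}\,\varphi(x)\,e^{-iP\cdot a}
  &= \sum_{n=0}^{\infty}\frac{1}{n!}\,(a\cdot\partial)^n\,\varphi(x) = \varphi(x+a) .
\end{align*}
```

The last equality is just the Taylor expansion of $\varphi(x+a)$ around the point $x$.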
In concluding this chapter, let us add one more remark. It is clear that along with the
above treatment of the quantized field 𝜑(𝑥), one gets immediately the results for the energy and
momentum of the classical Klein–Gordon field configuration (16.5); it suffices to replace the
operators 𝑎(𝑘), 𝑎 † (𝑘) in the formulae (16.20) and (16.25) with the numerical coefficients 𝑎(𝑘),
𝑎 ∗ (𝑘) from (16.5). One thus gets readily

$$E_{\text{class.}} = \int \mathrm{d}^3k\; k_0\, |a(k)|^2 , \qquad \vec{P}_{\text{class.}} = \int \mathrm{d}^3k\; \vec{k}\, |a(k)|^2 .$$

It is seen, as we have noted earlier, that 𝐸 class. ≥ 0 (this is obvious already at the level of the energy
density appearing in (16.13)). Thus, the problem of negative energies, which is detrimental to
the relativistic quantum mechanics, is absent in the theory of classical Klein–Gordon field. Of
course, the point is that the energy here and in the one-particle quantum mechanics has a different
meaning; a comparison of the two cases is left to the reader as a useful revision of the old stuff.

Chapter 17

Particle interpretation of quantized field

Now we are on the way to establish an adequate description of the physical states of the quantized
scalar field; in particular, we have in mind the eigenstates of the energy and momentum. The
formulae (16.20) and (16.25) could be employed for such a purpose, but it is more convenient to
develop a slightly different technical language, or rather a “dialect”, which turns out to be more
user-friendly (and, moreover, it becomes highly useful also in other situations).
The idea of the technique we are going to introduce is that one may consider the field in
question (classical or quantum) to be confined within a finite spatial box, which is e.g. a cube
with the side length 𝐿, i.e. the volume 𝑉 = 𝐿 3 . The auxiliary parameter 𝐿 is essentially arbitrary,
but the reader can rest assured that no relevant physical result will depend on it. To make such a
scheme workable, one should also choose appropriate boundary conditions; one possible option
is to require that the field has the same value for 𝑥 𝑗 = 0 and 𝑥 𝑗 = 𝐿, 𝑗 = 1, 2, 3 (briefly, one thus
requires the periodicity with respect to the box side). Such a choice turns out to be convenient
with regard to the ubiquitous integrals of the exponential functions, which thus get simplified in
a desirable way. Let us emphasize that the restrictions we impose on the field in question are
just technical — there is no deeper principle that would enforce a “correct” formulation of the
whole set-up.
The field $\varphi(\vec{x}, t)$ placed in the box should of course satisfy the Klein–Gordon equation, so
that a general solution can again be represented as a superposition of the exponentials $\exp(\pm ikx)$
with $k_0 = (\vec{k}^{\,2} + m^2)^{1/2}$, but these must now satisfy the above-mentioned periodic boundary
conditions in the 3-dimensional space. It is easy to see that the variables $k_j$, $j = 1, 2, 3$, in
$\exp(\pm i\vec{k}\cdot\vec{x})$ then can take on just the discrete values
$$k_j = \frac{2\pi n_j}{L} , \qquad n_j = 0, \pm 1, \pm 2, \ldots \tag{17.1}$$
So, the quantized Klein–Gordon field $\varphi(x) = \varphi(\vec{x}, t)$ can be written as an infinite sum
$$ \varphi(x) = \sum_{\vec{k}} N_k \left( a_k\, e^{-ikx} + a_k^\dagger\, e^{ikx} \right), \tag{17.2} $$
with $\vec{k}$ given by (17.1). It turns out that an appropriate value of the normalization factor $N_k$ is
now
$$ N_k = \frac{1}{(2k_0 V)^{1/2}}\,. \tag{17.3} $$
Let us see why the formula (17.3) is a good choice. First, because of our boundary conditions it
holds, taking into account (17.1),
$$ \int d^3x\; e^{-i\vec{k}\cdot\vec{x}}\, e^{i\vec{l}\cdot\vec{x}} = V \delta_{\vec{k},\vec{l}}\,, \tag{17.4} $$

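where the integration domain is just the box. Then, defining the functions $f_k(x)$, $f_k^*(x)$ in the
same manner as in (16.7), and utilizing Eq. (17.4), one gets the orthogonality relations
$$
\begin{aligned}
\int d^3x\; f_k^*(\vec{x}, t)\, i\overleftrightarrow{\partial_0}\, f_{k'}(\vec{x}, t) &= \delta_{\vec{k},\vec{k}'}\,, \\
\int d^3x\; f_k(\vec{x}, t)\, i\overleftrightarrow{\partial_0}\, f_{k'}(\vec{x}, t) &= 0\,,
\end{aligned} \tag{17.5}
$$
where $A\overleftrightarrow{\partial_0}B$ stands for $A\,\partial_0 B - (\partial_0 A)\,B$.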

The option (17.3) is thereby justified: the identities (17.5) are directly analogous to (16.8) and
one may thus expect that the commutation relations for the operators 𝑎 𝑘 , 𝑎 †𝑘 will have a simple
structure similar to (16.12). Indeed, owing to the identities (17.5), 𝑎 𝑘 and 𝑎 †𝑘 can be expressed
in terms of 𝜑(𝑥) in the same way as before (cf. (16.9)); employing then the commutators (16.4),
one gets readily
$$ [a_k, a_l^\dagger] = \delta_{kl}\,, \qquad [a_k, a_l] = 0\,, \qquad [a_k^\dagger, a_l^\dagger] = 0 \tag{17.6} $$

(again, the commutators (17.6) are ultimately determined by the orthogonality relations (17.5)).
Recall that we use the notation $k$, $l$ instead of $\vec{k}$, $\vec{l}$ whenever it cannot cause confusion.
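As a quick sanity check, the relations (17.4) and (17.5) can be verified numerically. The following minimal sketch does so in one spatial dimension only (an assumption made for brevity; in the cubic box the three-dimensional integrals factorize into such one-dimensional ones). The numerical values of `L`, `M`, `T` and all function names are illustrative choices that do not appear in the text, and the two-sided derivative is taken to mean $i\,[f\,\partial_0 g - (\partial_0 f)\,g]$.

```python
import cmath
import math

L = 2.0        # box length (arbitrary illustrative value)
M = 1.0        # mass m in natural units (illustrative value)
T = 0.3        # arbitrary fixed time t
STEPS = 256    # number of grid points in the Riemann sum


def wave_number(n):
    """Allowed wave number k = 2*pi*n/L, cf. (17.1)."""
    return 2.0 * math.pi * n / L


def energy(n):
    """k_0 = (k^2 + m^2)^(1/2)."""
    return math.sqrt(wave_number(n) ** 2 + M ** 2)


def f(n, x, t):
    """Mode function f_k = N_k exp(-i(k_0 t - k x)) with N_k = (2 k_0 L)^(-1/2)."""
    norm = 1.0 / math.sqrt(2.0 * energy(n) * L)
    return norm * cmath.exp(-1j * (energy(n) * t - wave_number(n) * x))


def dt_f(n, x, t):
    """Time derivative of f_k (each mode is a pure exponential in t)."""
    return -1j * energy(n) * f(n, x, t)


def box_integral(g):
    """Riemann sum over one period [0, L); exact for the low Fourier modes used here."""
    h = L / STEPS
    return h * sum(g(j * h) for j in range(STEPS))


def plane_wave_overlap(n, l):
    """One-dimensional analogue of the left-hand side of (17.4)."""
    return box_integral(
        lambda x: cmath.exp(-1j * wave_number(n) * x) * cmath.exp(1j * wave_number(l) * x)
    )


def kg_product_conj(n, l):
    """Integral of f_k^* i(d_0 <->) f_l, i.e. the first line of (17.5)."""
    return box_integral(
        lambda x: 1j * (f(n, x, T).conjugate() * dt_f(l, x, T)
                        - dt_f(n, x, T).conjugate() * f(l, x, T))
    )


def kg_product(n, l):
    """Integral of f_k i(d_0 <->) f_l, i.e. the second line of (17.5)."""
    return box_integral(
        lambda x: 1j * (f(n, x, T) * dt_f(l, x, T) - dt_f(n, x, T) * f(l, x, T))
    )
```

With these definitions, `plane_wave_overlap(2, 2)` should reproduce the box length, `kg_product_conj(1, 1)` should equal 1 up to rounding errors, and the off-diagonal combinations as well as `kg_product(1, -1)` should vanish. The last case is the only one where the spatial integral alone does not give zero; there the two-sided time derivative supplies the factor $k_0 - l_0 = 0$.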
The calculation of the energy and momentum proceeds along the same lines as in
Chapter 16; the only difference here is that for the integrals involving the exponentials like
$\exp[i(\vec{k} - \vec{l})\cdot\vec{x}]$ one gets $V\delta_{\vec{k},\vec{l}}$ instead of $(2\pi)^3 \delta^{(3)}(\vec{k} - \vec{l})$. But this matches precisely the
expression (17.3) for the normalization factors and one thus immediately obtains
$$
\begin{aligned}
H &= \frac{1}{2} \sum_{\vec{k}} k_0 \left( a_k^\dagger a_k + a_k a_k^\dagger \right), \\
\vec{P} &= \frac{1}{2} \sum_{\vec{k}} \vec{k} \left( a_k^\dagger a_k + a_k a_k^\dagger \right).
\end{aligned} \tag{17.7}
$$

We have already noticed in the preceding chapter that the structure of the field Hamiltonian
𝐻 reminds one of the linear harmonic oscillator (LHO). Such a picture becomes even clearer
now, when we focus on the formula (17.7). Indeed, from (17.6) one gets, in particular,
$$ [a_k, a_k^\dagger] = 1 \tag{17.8} $$

for any 𝑘 satisfying (17.1). By the way, the validity of the simple relation (17.8) is the reason why
the present “discrete” formalism can be considered as more user-friendly than the “continuous”
one used in Chapter 16: it is seen that from (16.12) one would get
$$ [a(k), a^\dagger(k)] = \delta^{(3)}(0)\,, \tag{17.9} $$

and the (notoriously ill-defined) expression on the right-hand side of (17.9) is surely a nuisance.
Utilizing (17.8), the expression for $H$ in (17.7) may be recast as
$$ H = \sum_{\vec{k}} k_0 \left( a_k^\dagger a_k + \frac{1}{2} \right), \tag{17.10} $$

and the “oscillator picture” of the quantized field should now be obvious. The formula (17.10)
represents an infinite sum of LHO Hamiltonians written in terms of the “ladder operators” (or,
raising and lowering operators), where the coefficient $k_0$ corresponds to the oscillator frequency
(recall that in ordinary units the coefficient of $a^\dagger a + \tfrac{1}{2}$ is $\hbar\omega$). Thus, the field Hamiltonian
(17.10) incorporates an infinite number of oscillator modes with different frequencies; this is
why the term “decomposition into oscillators” is traditionally used in connection with the field
quantization.
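The oscillator picture can be made concrete for a single mode by representing $a_k$ and $a_k^\dagger$ as matrices in a truncated number basis. What follows is only a sketch under stated assumptions: the cutoff `dim`, the value of `k0` and the helper name `ladder_operators` are hypothetical, and the truncation necessarily spoils $[a, a^\dagger] = 1$ in the last diagonal entry.

```python
import numpy as np


def ladder_operators(dim):
    """Matrices of a and a^dagger in the truncated basis |0>, ..., |dim-1>."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # a|n> = sqrt(n)|n-1>
    return a, a.T.copy()


dim = 6      # truncation of the single-mode state space (hypothetical choice)
k0 = 1.7     # energy k_0 of the chosen mode (hypothetical value)

a, a_dag = ladder_operators(dim)
number_op = a_dag @ a                            # diag(0, 1, ..., dim-1)
H_mode = k0 * (number_op + 0.5 * np.eye(dim))    # one term of the sum (17.10)

# [a, a^dagger]: unit matrix except for the last entry (truncation artefact)
commutator = a @ a_dag - a_dag @ a

# Equally spaced levels k0 * (n + 1/2)
levels = np.diag(H_mode)

# [H, a^dagger] = k0 a^dagger holds exactly even in the truncated space
raising_check = H_mode @ a_dag - a_dag @ H_mode
```

The diagonal of `H_mode` displays the familiar equidistant spectrum, and `raising_check` coincides with `k0 * a_dag`, which is the single-mode version of the relations (16.26).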
Obviously, the constant
$$ \frac{1}{2} \sum_{\vec{k}} k_0 = \frac{1}{2} \sum_{\vec{k}} (\vec{k}^2 + m^2)^{1/2} \tag{17.11} $$

is infinite, in view of the formula (17.1). However, such a constant does not influence the
equation of motion (16.29), since it commutes with anything. So, let us discard the infinite
constant (17.11) from (17.10); such a redefinition of 𝐻 will be called “normal ordering” (of the
field operators) from now on (one might also say that it is the first example of a “renormalization”
in quantum field theory). The formula (17.7) for the momentum $\vec{P}$ may be recast in the same
way; thus, in what follows we are going to work with the normal ordered expressions
$$
\begin{aligned}
H &= \sum_{\vec{k}} k_0\, a_k^\dagger a_k\,, \\
\vec{P} &= \sum_{\vec{k}} \vec{k}\, a_k^\dagger a_k\,.
\end{aligned} \tag{17.12}
$$
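To get a feeling for the divergence of the constant (17.11), one may sum the zero-point energies over the modes (17.1) with an artificial cutoff $|n_j| \le n_{\max}$. The cutoff, the default values of `L` and `m`, and the function name are assumptions introduced only for this illustration.

```python
import itertools
import math


def zero_point_energy(n_max, L=1.0, m=1.0):
    """(1/2) * sum of k_0 over all box modes with |n_j| <= n_max, cf. (17.1), (17.11)."""
    total = 0.0
    for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
        k_squared = sum((2.0 * math.pi * nj / L) ** 2 for nj in n)
        total += 0.5 * math.sqrt(k_squared + m ** 2)
    return total


# Partial sums for increasing cutoffs; they keep growing without bound
partial_sums = [zero_point_energy(n_max) for n_max in (1, 2, 4, 8)]
```

The partial sums grow roughly like the fourth power of the cutoff, so no finite limit exists; this is the infinite constant that is discarded in passing from (17.10) to (17.12).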

Now we are in a position to examine the eigenvalues and eigenstates of the operators $H$
and $\vec{P}$. First of all, let us note that $[H, \vec{P}] = 0$ (as expected a priori). Indeed, to see this, it
suffices to prove that
$$ [a_k^\dagger a_k, a_l^\dagger a_l] = 0 \tag{17.13} $$
for any 𝑘, 𝑙. The proof of (17.13) is easy: for 𝑘 = 𝑙 it holds trivially, and for 𝑘 ≠ 𝑙 one should
take into account that the commutator in question involves only mutually commuting operators
(cf. (17.6)).
To proceed, we will invoke the formulae (16.26). Obviously, such identities are valid
within our “discrete formalism” as well, so let us reproduce them here for the reader’s convenience:
$$
\begin{aligned}
[H, a_k] &= -k_0\, a_k\,, \\
[H, a_k^\dagger] &= k_0\, a_k^\dagger\,.
\end{aligned} \tag{17.14}
$$

Now, suppose that $|\Psi\rangle$ is a common eigenvector of $H$ and $\vec{P}$, so that
$$ H|\Psi\rangle = E|\Psi\rangle\,, \qquad \vec{P}|\Psi\rangle = \vec{p}\,|\Psi\rangle\,. \tag{17.15} $$

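Let us find out what happens if the operators $a_k$ and $a_k^\dagger$ act on $|\Psi\rangle$. One gets, using (17.14),
$$ H a_k^\dagger |\Psi\rangle = (a_k^\dagger H + k_0 a_k^\dagger)|\Psi\rangle = (E + k_0)\, a_k^\dagger |\Psi\rangle \tag{17.16} $$
and
$$ H a_k |\Psi\rangle = (a_k H - k_0 a_k)|\Psi\rangle = (E - k_0)\, a_k |\Psi\rangle\,. \tag{17.17} $$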

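Similarly, the results for the momentum operator become
$$ \vec{P} a_k^\dagger |\Psi\rangle = (\vec{p} + \vec{k})\, a_k^\dagger |\Psi\rangle \tag{17.18} $$
and
$$ \vec{P} a_k |\Psi\rangle = (\vec{p} - \vec{k})\, a_k |\Psi\rangle\,. \tag{17.19} $$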
The relations (17.16) through (17.19) are quite remarkable. They show that the action of $a_k^\dagger$
amounts to adding the momentum $\vec{k}$ and the energy $k_0 = (\vec{k}^2 + m^2)^{1/2}$, while $a_k$ removes the same
momentum and energy when acting on a four-momentum eigenstate. Thus, such a behaviour
may be described as adding a particle with the four-momentum $k$, $k^2 = m^2$, to the original state
$|\Psi\rangle$, or removing such a particle from it. This leads naturally to the usual terminology: $a_k^\dagger$ is
called the creation operator (of a particle) and $a_k$ is the annihilation operator. Recall that in
the theory of LHO, $a^\dagger$ and $a$ are called the raising and lowering operators, respectively (referring
to the corresponding shift of the oscillator energy).
Now we are very close to our final goal. In view of the structure of the Hamiltonian
(17.12), one may construct its eigenstates as the tensor products of the eigenvectors of the
operators $a_k^\dagger a_k$ for different values of $k$ (these commute, according to (17.13)!). Owing to the
commutation relation (17.8) it is clear that the action of $a_k$ decreases the eigenvalue of $a_k^\dagger a_k$
by 1. In full analogy with the LHO theory one may then observe that for a given annihilation
operator $a_k$ there must be an eigenvector $|\Omega_k\rangle$ of $a_k^\dagger a_k$ such that $a_k|\Omega_k\rangle = 0$. Indeed, if there
were no such $|\Omega_k\rangle$, the repeated action of $a_k$ would produce a negative eigenvalue of $a_k^\dagger a_k$, but
this is in contradiction with the fact that $a_k^\dagger a_k$ is, obviously, a non-negative operator. So, let us define
the vector $|0\rangle$ such that
$$ a_k|0\rangle = 0 \quad \text{for any } k \tag{17.20} $$
($|0\rangle$ might be represented formally as a tensor product of the vectors $|\Omega_k\rangle$ for all possible values
of $k$). It is clear that one then has
$$ H|0\rangle = 0\,, \qquad \vec{P}|0\rangle = 0\,. \tag{17.21} $$

Thus, |0⟩ may be considered as a no-particle state and called the vacuum. It is the ground state
of the Hamiltonian (17.12).
Taking into account what we already know about the properties of the creation operators
(see (17.16), (17.18)), one may construct one-particle, two-particle, etc. states by means of
appropriate actions of $a_k^\dagger$ on the vacuum. In general, an $n$-particle state can be built as
$$ \left(a_{k_1}^\dagger\right)^{n_1} \cdots \left(a_{k_r}^\dagger\right)^{n_r} |0\rangle\,, \tag{17.22} $$
with $n_1 + \ldots + n_r = n$. Of course, the corresponding eigenvalue of $H$ is the sum of particle
energies, i.e. $n_1 E(k_1) + \ldots + n_r E(k_r)$. The numbers $n_1, \ldots, n_r$ are naturally called the occupation
numbers, since they tell us how many particles occupy a given state. Consequently, the
description of the field states shown in (17.22) is called the representation of occupation
numbers (a knowledgeable reader may recall that such a scheme is employed also in the theory
of many-body systems, e.g. in the condensed matter physics or nuclear physics). The linear span
of all possible state vectors of the type (17.22) with a fixed $n$ can be rightly called the $n$-particle subspace of
the whole space of states of the quantized Klein–Gordon field, and denoted e.g. as $\mathcal{H}^{(n)}$. The
full space is then the direct sum of such subspaces, i.e.
$$ \mathcal{H} = \mathcal{H}^{(0)} \oplus \mathcal{H}^{(1)} \oplus \mathcal{H}^{(2)} \oplus \cdots\,, \tag{17.23} $$

where $\mathcal{H}^{(0)}$ is the one-dimensional subspace spanned by the vacuum state. $\mathcal{H}$ is called the Fock space, in honour of the Russian
theorist V. A. Fock (as we have already noted in Chapter 1, the transcription “Fok” would perhaps
be more appropriate).
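The representation of occupation numbers translates naturally into a small data structure: a basis state of the type (17.22) is fully specified by a map from mode labels to occupation numbers. The sketch below is purely illustrative; labelling the modes by the integer triples $(n_1, n_2, n_3)$ of (17.1), the values of `BOX_LENGTH` and `MASS`, and all function names are assumptions not taken from the text.

```python
import math
from collections import Counter

BOX_LENGTH = 1.0   # L (illustrative value)
MASS = 1.0         # m (illustrative value)


def mode_momentum(n):
    """Momentum k = 2*pi*n/L of the box mode labelled by the integer triple n."""
    return tuple(2.0 * math.pi * nj / BOX_LENGTH for nj in n)


def mode_energy(n):
    """Single-particle energy E(k) = (k^2 + m^2)^(1/2)."""
    return math.sqrt(sum(kj ** 2 for kj in mode_momentum(n)) + MASS ** 2)


def particle_number(occupation):
    """Total number of particles n = n_1 + ... + n_r."""
    return sum(occupation.values())


def total_energy(occupation):
    """Eigenvalue of the normal ordered H in (17.12)."""
    return sum(count * mode_energy(n) for n, count in occupation.items())


def total_momentum(occupation):
    """Eigenvalue of the normal ordered momentum operator in (17.12)."""
    total = [0.0, 0.0, 0.0]
    for n, count in occupation.items():
        for j, kj in enumerate(mode_momentum(n)):
            total[j] += count * kj
    return tuple(total)


vacuum = Counter()                                  # no particles at all
state = Counter({(1, 0, 0): 2, (0, -1, 0): 1})      # two quanta in one mode, one in another
```

Here `state` stands for a three-particle vector of the form (17.22); its energy is the sum of the single-particle energies weighted by the occupation numbers, and the empty `Counter` plays the role of the vacuum with zero energy and momentum.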
It is important to notice that the number of particles sharing the same state is, in principle,
unlimited, i.e. the occupation numbers 𝑛1 , . . . , 𝑛𝑟 in (17.22) are arbitrary; the point is that the
commutation relations for the creation and annihilation operators do not imply any constraint
here. This means that the Klein–Gordon scalar particles are bosons, i.e. they obey the Bose–
Einstein statistics. Referring to our earlier experience with the Klein–Gordon equation, one
may guess that these particles have zero spin. Thus, we have an explicit example of the famous
connection between the spin and statistics (namely that particles with an integer spin are bosons
and those with half-integer spin are fermions), which is a general result of relativistic quantum
field theory (many details of the history of this problem and several proofs of this remarkable
theorem can be found in the book [20]; for an instructive comprehensive treatment of the subject
see also the book [27]).
A last remark: One might wonder if a scalar particle described here really exists in nature.
Yes, it does; it is the celebrated Higgs boson (see e.g. [22]).
Let us summarize our results. We have arrived at the explicit construction of a space of
states corresponding to the quantized real scalar field. Such a space is spanned by the common
eigenvectors of the field energy and momentum, and the most remarkable finding is that such
states may be described in terms of particles satisfying the standard relativistic relation $k^2 = m^2$
for the four-momentum (the condition of mass shell). This is why one says, in the common
physical folklore, that “particles are the field quanta”. Obviously, the “oscillator decomposition”
of the quantized field is instrumental in the whole treatment based on the notion of creation
and annihilation operators (this seems to support the popular wisdom that most of the quantum
theory relies on the linear harmonic oscillator). The expansion of the quantized field in terms of
the creation and annihilation operators may be implemented either in the infinite 3-dimensional
space (“continuous formalism”) or in a finite spatial box (“discrete formalism”). The origin of
such provisional names should be clear when comparing (16.5) and (17.2) along with (17.1).
Obviously, the discrete scheme is more convenient for establishing the correspondence with the
oscillator picture, but in the subsequent chapters we will use freely both formalisms. In fact,
these two “dialects” of the same language could also be compared, in musical terms, to playing
the same tune on two different instruments, e.g. violin or piano, according to one’s taste.
In concluding this chapter, a historical remark is in order. The method of canonical
quantization that we have demonstrated here and in Chapter 16 on the example of the real Klein–
Gordon field is due to Werner Heisenberg and Wolfgang Pauli, who published their original
work in 1929. Another important contribution appeared in the paper by Wolfgang Pauli and
Victor Weisskopf published in 1934. The latter work treated the problem of antiparticles and it
was, in part, a response opposing Dirac’s idea of a filled sea of negative energy states. For a brief
review of the QFT history, see e.g. [13]. As we know, it is indeed the field theory approach
that provides a consistent description of particles and antiparticles. This fundamental QFT
theme will be discussed in the following chapter.

Chapter 18

Complex scalar field. Antiparticles

Let us now consider a complex Klein–Gordon field. As we have seen in Chapter 14, its
Lagrangian can be written as
$$ \mathcal{L} = \partial_\mu\varphi\, \partial^\mu\varphi^* - m^2 \varphi\varphi^*\,, \tag{18.1} $$
and the independent dynamical variables (“generalized coordinates”) are 𝜑(𝑥) and 𝜑∗ (𝑥). The
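corresponding conjugate momenta are then
$$
\begin{aligned}
\varphi(x) &\longrightarrow \pi(x) = \frac{\partial\mathcal{L}}{\partial(\partial_0\varphi)} = \partial_0\varphi^*(x)\,, \\
\varphi^*(x) &\longrightarrow \pi^*(x) = \frac{\partial\mathcal{L}}{\partial(\partial_0\varphi^*)} = \partial_0\varphi(x)\,.
\end{aligned} \tag{18.2}
$$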

In the quantum case, 𝜑(𝑥) and 𝜑∗ (𝑥) become operators 𝜑(𝑥) and 𝜑† (𝑥) related by means of
Hermitian conjugation. Thus, there are two pairs of canonical variables, namely
$$ \left( \varphi(x),\, \dot{\varphi}^\dagger(x) \right), \qquad \left( \varphi^\dagger(x),\, \dot{\varphi}(x) \right), \tag{18.3} $$

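and it is easy to realize that the relevant equal-time commutation relations to be postulated are
$$
\begin{aligned}
&[\varphi(\vec{x}, t), \dot{\varphi}^\dagger(\vec{y}, t)] = i\delta^{(3)}(\vec{x} - \vec{y})\,, \\
&[\varphi(\vec{x}, t), \dot{\varphi}(\vec{y}, t)] = 0\,, \\
&[\varphi(\vec{x}, t), \varphi(\vec{y}, t)] = 0\,, \\
&[\varphi(\vec{x}, t), \varphi^\dagger(\vec{y}, t)] = 0\,, \\
&[\dot{\varphi}(\vec{x}, t), \dot{\varphi}(\vec{y}, t)] = 0\,, \\
&[\dot{\varphi}(\vec{x}, t), \dot{\varphi}^\dagger(\vec{y}, t)] = 0\,.
\end{aligned} \tag{18.4}
$$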
Other possible canonical commutators are reduced to the Hermitian conjugation of those shown
above. Needless to say, to arrive at the relations (18.4), we have followed consistently the
paradigm of quantum mechanics, formulated succinctly in Chapter 16 (see (16.3)). In brief, one
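could also say that
$$ [\varphi(\vec{x}, t), \dot{\varphi}^\dagger(\vec{y}, t)] = i\delta^{(3)}(\vec{x} - \vec{y})\,, \qquad [\varphi^\dagger(\vec{x}, t), \dot{\varphi}(\vec{y}, t)] = i\delta^{(3)}(\vec{x} - \vec{y})\,, \tag{18.5} $$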

and all remaining equal-time commutators vanish.


As a solution of the Klein–Gordon equation, the classical field $\varphi(x)$ can be written as
$$ \varphi(x) = \int d^3k\; N_k \left[ b(k)\, e^{-ikx} + d^*(k)\, e^{ikx} \right], \tag{18.6} $$

where $k = (k_0, \vec{k})$, $k_0 = (\vec{k}^2 + m^2)^{1/2}$ and $N_k$ has the familiar form $(2\pi)^{-3/2}(2k_0)^{-1/2}$. We
have denoted the coefficient at $\exp(ikx)$ deliberately as $d^*(k)$, which may seem rather artificial.
However, the quantum counterpart of (18.6) is then naturally written as
$$ \varphi(x) = \int d^3k\; N_k \left[ b(k)\, e^{-ikx} + d^\dagger(k)\, e^{ikx} \right], \tag{18.7} $$
and, consequently,
$$ \varphi^\dagger(x) = \int d^3k\; N_k \left[ b^\dagger(k)\, e^{ikx} + d(k)\, e^{-ikx} \right]. \tag{18.8} $$

The conventional notation 𝑑 † (𝑘) in (18.7) gets a deeper meaning: it turns out that it will be
possible to interpret it as a specific creation operator.
So, let us find out what are the commutation relations for the operator coefficients in
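(18.6), (18.7). To this end, one may express them in analogy with the formulae (16.9); one gets
$$
\begin{aligned}
b(k) &= i \int f_k^*(\vec{x}, t)\, \overleftrightarrow{\partial_0}\, \varphi(\vec{x}, t)\; d^3x\,, \\
b^\dagger(k) &= -i \int f_k(\vec{x}, t)\, \overleftrightarrow{\partial_0}\, \varphi^\dagger(\vec{x}, t)\; d^3x\,, \\
d(k) &= i \int f_k^*(\vec{x}, t)\, \overleftrightarrow{\partial_0}\, \varphi^\dagger(\vec{x}, t)\; d^3x\,, \\
d^\dagger(k) &= -i \int f_k(\vec{x}, t)\, \overleftrightarrow{\partial_0}\, \varphi(\vec{x}, t)\; d^3x\,.
\end{aligned} \tag{18.9}
$$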

The computation of the commutators in question relies on (18.9) and the canonical relations
(18.4), and it is essentially the same procedure as before, in Chapter 16. The results can be
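summarized as follows:
$$
\begin{aligned}
&[b(k), b^\dagger(l)] = \delta^{(3)}(\vec{k} - \vec{l})\,, \\
&[d(k), d^\dagger(l)] = \delta^{(3)}(\vec{k} - \vec{l})\,, \\
&[b(k), b(l)] = 0\,, \\
&[b(k), d(l)] = 0\,, \\
&[d(k), d(l)] = 0\,, \\
&[b(k), d^\dagger(l)] = 0\,.
\end{aligned} \tag{18.10}
$$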
Again, the succinct statement could be that just [𝑏, 𝑏 † ] and [𝑑, 𝑑 † ] are non-zero, having the
standard value as in (16.12), and the rest is zero. According to the findings presented in
Chapter 17 it is clear that 𝑏, 𝑑 and 𝑏 † , 𝑑 † are candidates for the annihilation and creation
operators, respectively. The derivation of the above results is straightforward, but, just to be
sure, let us verify the second relation in (18.10), which we have envisaged before, in connection
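with the notation introduced in (18.6), (18.7). One has, using (18.9),
$$
[d(k), d^\dagger(l)] = \iint d^3x\, d^3y\; \Big[ f_k^*(\vec{x}, t)\, \dot{\varphi}^\dagger(\vec{x}, t) - \dot{f}_k^*(\vec{x}, t)\, \varphi^\dagger(\vec{x}, t)\,,\;
f_l(\vec{y}, t)\, \dot{\varphi}(\vec{y}, t) - \dot{f}_l(\vec{y}, t)\, \varphi(\vec{y}, t) \Big]\,. \tag{18.11}
$$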
Discarding the terms involving vanishing equal-time commutators according to (18.4), one is
left with
$$
[d(k), d^\dagger(l)] = \iint d^3x\, d^3y\; \Big( - f_k^*(\vec{x}, t)\, \dot{f}_l(\vec{y}, t)\, [\dot{\varphi}^\dagger(\vec{x}, t), \varphi(\vec{y}, t)]
- \dot{f}_k^*(\vec{x}, t)\, f_l(\vec{y}, t)\, [\varphi^\dagger(\vec{x}, t), \dot{\varphi}(\vec{y}, t)] \Big)\,. \tag{18.12}
$$

Then, employing again (18.4), one gets finally
$$ [d(k), d^\dagger(l)] = \int d^3x\; f_k^*(\vec{x}, t)\, i\overleftrightarrow{\partial_0}\, f_l(\vec{x}, t) = \delta^{(3)}(\vec{k} - \vec{l})\,, \tag{18.13} $$

and that’s it.


The next step is the evaluation of the energy and momentum. Using the general formula
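(15.16) for the energy–momentum tensor, one may write
$$
\begin{aligned}
H &= \int d^3x \left( \partial_0\varphi^\dagger\, \partial_0\varphi + \vec{\nabla}\varphi^\dagger \cdot \vec{\nabla}\varphi + m^2 \varphi^\dagger\varphi \right), \\
\vec{P} &= -\int d^3x \left( \partial_0\varphi^\dagger\, \vec{\nabla}\varphi + \partial_0\varphi\, \vec{\nabla}\varphi^\dagger \right).
\end{aligned} \tag{18.14}
$$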

Substituting there the expressions (18.7), (18.8), one may follow basically the same boring
procedure that in Chapter 16 led from (16.13) and (16.22) to (16.20) and (16.25). So, let us skip
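the calculational details and announce immediately the relevant results. One gets
$$
\begin{aligned}
H &= \int d^3k\; k_0 \left[ b^\dagger(k) b(k) + d(k) d^\dagger(k) \right], \\
\vec{P} &= \int d^3k\; \vec{k} \left[ b^\dagger(k) b(k) + d(k) d^\dagger(k) \right].
\end{aligned} \tag{18.15}
$$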


From the preceding chapter we know that it is quite convenient to play the same tune on a
different instrument; so, shrinking the 3-dimensional space to a finite box, the formulae (18.15)
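are recast in the form
$$
\begin{aligned}
H &= \sum_{\vec{k}} k_0 \left( b_k^\dagger b_k + d_k d_k^\dagger \right), \\
\vec{P} &= \sum_{\vec{k}} \vec{k} \left( b_k^\dagger b_k + d_k d_k^\dagger \right)
\end{aligned} \tag{18.16}
$$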

in analogy with (17.7). The commutation relations (18.10) now become
$$ [b_k, b_l^\dagger] = \delta_{kl}\,, \qquad [d_k, d_l^\dagger] = \delta_{kl}\,, \tag{18.17} $$
and one may thus repeat the analysis presented in the preceding chapter. As a result, (18.16) can
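be rewritten in normal ordered form
$$
\begin{aligned}
H &= \sum_{\vec{k}} k_0 \left( b_k^\dagger b_k + d_k^\dagger d_k \right), \\
\vec{P} &= \sum_{\vec{k}} \vec{k} \left( b_k^\dagger b_k + d_k^\dagger d_k \right),
\end{aligned} \tag{18.18}
$$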

and one may then construct the common eigenstates of the operators (18.18) in the same manner
as in the preceding chapter, recovering the architecture of the Fock space. The resulting picture
is that one can generate multiparticle states by means of the creation operators of two different
types, $b_k^\dagger$ and $d_k^\dagger$, but the corresponding particles carry the same mass; therefore, one would like
to figure out a way of discerning the “$d$-particles” from “$b$-particles”.
Fortunately, the solution is at hand, and is rather profound from the physical point of
view. As we have seen in Chapter 15, for the complex Klein–Gordon field there is an extra integral
of motion, corresponding to the conserved current $J^\mu$, which can be written, classically, as
$$ J^\mu = i \left[ (\partial^\mu\varphi)\varphi^* - \varphi(\partial^\mu\varphi^*) \right] \tag{18.19} $$

(cf. (15.58); the choice of sign here is purely conventional). From the identity $\partial_\mu J^\mu = 0$ one gets
the time-independent quantity
$$ Q = \int d^3x\; J_0(\vec{x}, t) \tag{18.20} $$

that may be called, conventionally, the charge. The quantum counterpart of the expression
(18.20) is the operator
$$ Q = i \int d^3x \left( \partial_0\varphi\, \varphi^\dagger - \varphi\, \partial_0\varphi^\dagger \right). \tag{18.21} $$

Substituting (18.7) and (18.8) into (18.21), one gets the result
$$ Q = \int d^3k \left[ b(k) b^\dagger(k) - d^\dagger(k) d(k) \right]. \tag{18.22} $$

Note that to arrive at (18.22), one employs basically the same manipulations that we have
described in detail in Chapter 16 when deriving the formulae (16.20) and (16.25). So, the
verification of (18.22) is left to the reader as an instructive exercise. In the “discrete dialect”,
corresponding to the field quantized in a finite box, the formula (18.22) becomes, as usual,
$$ Q = \sum_{\vec{k}} \left( b_k b_k^\dagger - d_k^\dagger d_k \right). \tag{18.23} $$

Using the commutation relation [𝑏 𝑘 , 𝑏 †𝑘 ] = 1 and discarding an infinite constant, the expression
for $Q$ is recast in the normal ordered form
$$ Q = \sum_{\vec{k}} \left( b_k^\dagger b_k - d_k^\dagger d_k \right). \tag{18.24} $$

Now, it is easy to see that the commutation relations (18.17) imply
$$
\begin{aligned}
[Q, b_k^\dagger] &= b_k^\dagger\,, \\
[Q, d_k^\dagger] &= -d_k^\dagger\,.
\end{aligned} \tag{18.25}
$$

Of course, the vacuum state $|0\rangle$ is defined by
$$ b_k|0\rangle = 0\,, \qquad d_k|0\rangle = 0 \tag{18.26} $$

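for any $k$. Then also $Q|0\rangle = 0$ and from (18.25) one gets readily
$$
\begin{aligned}
Q\, b_k^\dagger|0\rangle &= b_k^\dagger|0\rangle\,, \\
Q\, d_k^\dagger|0\rangle &= -d_k^\dagger|0\rangle\,.
\end{aligned} \tag{18.27}
$$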

Thus, the one-particle state $b_k^\dagger|0\rangle$ can be distinguished from $d_k^\dagger|0\rangle$ by means of the eigenvalue
of the charge operator $Q$: for the “$b$-particle” one has $Q = +1$, and the “$d$-particle” carries the
value $Q = -1$. In this way one arrives naturally at the concept of antiparticles: the one-particle
state created by means of $d_k^\dagger$ can be conventionally called the antiparticle with respect to the
particle state produced by the action of $b_k^\dagger$.
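The statements (18.25) and (18.27) can be checked explicitly for a single momentum $k$ by realizing $b_k$ and $d_k$ as matrices acting on a tensor product of two truncated oscillator spaces. This is merely a sketch: the truncation `dim`, the value of `k0` and the Kronecker-product construction are illustrative assumptions, not a prescription taken from the text.

```python
import numpy as np

dim = 4      # truncation of each oscillator space (hypothetical choice)
k0 = 1.3     # common energy k_0 of the b- and d-quanta (hypothetical value)

a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # single-mode annihilation matrix
identity = np.eye(dim)

b = np.kron(a, identity)      # acts on the first factor only
d = np.kron(identity, a)      # acts on the second factor only
b_dag, d_dag = b.T, d.T


def commutator(A, B):
    return A @ B - B @ A


H = k0 * (b_dag @ b + d_dag @ d)    # single-k term of (18.18)
Q = b_dag @ b - d_dag @ d           # single-k term of (18.24)

vacuum = np.zeros(dim * dim)
vacuum[0] = 1.0                     # |0> (x) |0>, annihilated by b and d

b_state = b_dag @ vacuum            # one "b-particle"
d_state = d_dag @ vacuum            # one "d-particle"
```

Both `b_state` and `d_state` are eigenvectors of `H` with the same eigenvalue `k0`, whereas `Q` assigns them the eigenvalues $+1$ and $-1$; the energy alone therefore cannot tell them apart, but the charge can.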
Let us summarize our main results. The lesson to be learnt from the above simple analysis
is that the theory of quantized complex scalar field leads naturally to the appearance of particles
and antiparticles; the ground state of the whole system is the vacuum, in which particles and
antiparticles are totally absent; at the same time, there is no question of negative energies. Such

a picture should be appreciated as a real achievement of the quantum field theory. It differs
radically from the older approach formulated within the framework of the relativistic quantum
mechanics, which relied on the idea of the vacuum as a filled sea of negative energy states;
as we have noted earlier (cf. the end of Chapter 13), such an idea is in a sense absurd and
fails completely just in the case of bosonic particles. Thus, it is reassuring that the theory of
quantized complex Klein–Gordon field accommodates bosons and leads to a natural prediction
of antiparticles and their consistent description.

Chapter 19

Quantization of Dirac field. Anticommutators

The next stop on our quantization tour is the Dirac field. We will see that this case exhibits some
fundamentally new features that go far beyond our previous experience (the most important point
has already been indicated in the title of this chapter).
Let us start with the conventional Lagrangian (14.27) that we reproduce here for the
reader’s convenience:
$$ \mathcal{L} = i\bar{\psi}\gamma^\mu\partial_\mu\psi - m\bar{\psi}\psi\,. \tag{19.1} $$
As we know, the independent dynamical variables are $\psi(x)$ and $\bar{\psi}(x)$. One may try to develop
the canonical formalism in accordance with the paradigm established in previous chapters, but
it is clear that one immediately runs into a difficulty. Indeed, while the conjugate momentum
for $\psi$ is
$$ \frac{\partial\mathcal{L}}{\partial(\partial_0\psi)} = i\bar{\psi}\gamma_0 = i\psi^\dagger\,, \tag{19.2} $$
for $\bar{\psi}$ one obviously gets zero. Well, an astute reader might object that one could invoke the
alternative form (14.31), which is more symmetric in $\psi$ and $\bar{\psi}$. However, in fact, the real problem
lies deeper. It turns out that when sticking to commutators of field operators, one would reach
an impasse!
To elucidate this crucial point, we are going to evaluate the energy and momentum of
the Dirac field. For such a purpose, the Lagrangian (19.1) can be used safely. Let us recall that
according to our earlier results (15.30), (15.54) we have
$$ H = \int d^3x\; \psi^\dagger\, i\partial_0\, \psi\,, \tag{19.3} $$
$$ \vec{P} = \int d^3x\; \psi^\dagger \left( -i\vec{\nabla} \right) \psi\,. \tag{19.4} $$

Now, one may employ the formula (10.19) for a general solution of the Dirac equation (we have
considered it originally in the context of relativistic quantum mechanics, but, as we have stressed
repeatedly, it is equally valid for the classical field). So, one has
$$ \psi(x) = \int d^3p \sum_s N_p \left[ b(p, s)\, u(p, s)\, e^{-ipx} + d^*(p, s)\, v(p, s)\, e^{ipx} \right], \tag{19.5} $$

where all used symbols have the standard meaning defined in preceding chapters. Concerning the
conventional notation 𝑑 ∗ ( 𝑝, 𝑠), it is the same story as in the case of the complex Klein–Gordon

field: the quantum version of the formula (19.5) reads
$$ \psi(x) = \int d^3p \sum_s N_p \left[ b(p, s)\, u(p, s)\, e^{-ipx} + d^\dagger(p, s)\, v(p, s)\, e^{ipx} \right], \tag{19.6} $$

where 𝑏( 𝑝, 𝑠), 𝑑 ( 𝑝, 𝑠) are operators, and we anticipate the role of 𝑑 † ( 𝑝, 𝑠) as a creation operator.
Of course, the Hermitian conjugation of the formula (19.6) then reads
$$ \psi^\dagger(x) = \int d^3p \sum_s N_p \left[ b^\dagger(p, s)\, u^\dagger(p, s)\, e^{ipx} + d(p, s)\, v^\dagger(p, s)\, e^{-ipx} \right]. \tag{19.7} $$

Substituting the expressions (19.6), (19.7) into (19.3), one integrates first the exponentials and
then the resulting delta functions are used up. When these routine manipulations are carried out,
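we are left with
$$
\begin{aligned}
H = \int d^3p\; N_p^2\, (2\pi)^3\, p_0 \sum_{s,s'} \Big[ & b^\dagger(p, s)\, b(p, s')\, u^\dagger(p, s)\, u(p, s') \\
& - d(p, s)\, d^\dagger(p, s')\, v^\dagger(p, s)\, v(p, s') \\
& - b^\dagger(p, s)\, d^\dagger(\tilde{p}, s')\, u^\dagger(p, s)\, v(\tilde{p}, s')\, e^{2ip_0x_0} \\
& + d(p, s)\, b(\tilde{p}, s')\, v^\dagger(p, s)\, u(\tilde{p}, s')\, e^{-2ip_0x_0} \Big]\,,
\end{aligned} \tag{19.8}
$$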

where $\tilde{p} = (p_0, -\vec{p})$ if $p = (p_0, \vec{p})$. Now one may utilize the results mentioned in Chapter 10
(cf. the text following Eq. (10.20)), namely
$$ u^\dagger(p, s)\, u(p, s') = 2p_0\, \delta_{ss'}\,, \qquad v^\dagger(p, s)\, v(p, s') = 2p_0\, \delta_{ss'} \tag{19.9} $$
and
$$ u^\dagger(p, s)\, v(\tilde{p}, s') = 0\,, \qquad v^\dagger(p, s)\, u(\tilde{p}, s') = 0 \tag{19.10} $$
(let us recall that these relations are derived by using the Gordon identity (10.5) and its siblings
summarized in Appendix C). Of course, it is reassuring that the relations (19.10) hold, since the
time-dependent terms in (19.8) are thereby eliminated. Thus, the final result for the Hamiltonian
reads
$$ H = \int d^3p \sum_s p_0 \left[ b^\dagger(p, s)\, b(p, s) - d(p, s)\, d^\dagger(p, s) \right]. \tag{19.11} $$
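The spinor relations (19.9) and (19.10) are easy to test numerically. The sketch below assumes the standard (Dirac) representation of the $\gamma$-matrices and labels the two spin states by rest-frame two-component spinors; both choices, together with the function names and the sample momentum, are assumptions made for this illustration and need not coincide in every detail with the conventions of Chapter 10, where the spin is described by a four-vector $s$.

```python
import numpy as np

MASS = 1.0   # m (illustrative value)

SIGMA = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)
BASIS = (np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex))


def energy(p):
    """p_0 = (p^2 + m^2)^(1/2)."""
    return float(np.sqrt(np.dot(p, p) + MASS ** 2))


def sigma_dot(p):
    """The 2x2 matrix sigma . p."""
    return sum(pj * sj for pj, sj in zip(p, SIGMA))


def u_spinor(p, s):
    """Positive-energy spinor normalized so that u^dagger u = 2 p_0."""
    p0 = energy(p)
    chi = BASIS[s]
    return np.sqrt(p0 + MASS) * np.concatenate([chi, sigma_dot(p) @ chi / (p0 + MASS)])


def v_spinor(p, s):
    """Negative-energy spinor normalized so that v^dagger v = 2 p_0."""
    p0 = energy(p)
    eta = BASIS[s]
    return np.sqrt(p0 + MASS) * np.concatenate([sigma_dot(p) @ eta / (p0 + MASS), eta])


p = np.array([0.3, -1.2, 0.7])   # arbitrary three-momentum
p_tilde = -p                     # spatial part of (p_0, -p)
p0 = energy(p)

# np.vdot conjugates its first argument, i.e. it computes x^dagger y
norm_u = np.vdot(u_spinor(p, 0), u_spinor(p, 0))          # 2 * p0
mixed = np.vdot(u_spinor(p, 0), v_spinor(p_tilde, 1))     # vanishes
```

Note that the vanishing products in (19.10) involve the reflected momentum $\tilde{p}$; with equal momenta, $u^\dagger(p, s)\, v(p, s')$ is in general different from zero.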
In a similar way, for the momentum (19.4) one gets
$$ \vec{P} = \int d^3p \sum_s \vec{p} \left[ b^\dagger(p, s)\, b(p, s) - d(p, s)\, d^\dagger(p, s) \right]. \tag{19.12} $$

Note that the formulae (19.11) and (19.12) can be modified readily so as to get the energy and
momentum of the classical Dirac field configuration defined by (19.5). In particular, for the
energy one gets, mutatis mutandis,
$$ E = \int d^3p \sum_s p_0 \left( |b(p, s)|^2 - |d(p, s)|^2 \right). \tag{19.13} $$

So, in contrast to the case of the scalar field, the energy of Dirac field is, in general, not positive.
Now we are in a position to take up our main task, i.e. to formulate a meaningful theory
of quantized Dirac field, including its particle interpretation. For convenience, let us first switch

to the familiar alternative dialect, describing the field confined in a finite box. Of course, it
means that the formulae (19.11), (19.12) are recast, as usual, in the form of the sums over the
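admissible discrete values of $\vec{p}$:
$$
\begin{aligned}
H &= \sum_{\vec{p}} \sum_s p_0 \left( b_{p,s}^\dagger b_{p,s} - d_{p,s} d_{p,s}^\dagger \right), \\
\vec{P} &= \sum_{\vec{p}} \sum_s \vec{p} \left( b_{p,s}^\dagger b_{p,s} - d_{p,s} d_{p,s}^\dagger \right).
\end{aligned} \tag{19.14}
$$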

It is seen that the expressions (19.14) have the right “oscillator-like” structure, so that 𝑏, 𝑏 † ,
etc. are good candidates for possible annihilation and creation operators. According to our
previous experience, for achieving a natural particle interpretation of the eigenstates of $H$ and
$\vec{P}$, appropriate commutation relations for creation and annihilation operators are instrumental.
However, in the present case the commutation relations are useless, since the problem of negative
energy would persist; the point is that 𝐻 is a difference of two positive operators and the minus
sign inside (19.14) is not removed upon commuting 𝑑 and 𝑑 † . The radical way out, historically
due to Pascual Jordan and Eugene Wigner, is to introduce anticommutation relations for 𝑑
and 𝑑 † (and, subsequently, for 𝑏, 𝑏 † as well). So, let us proceed by postulating the set of
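anticommutation relations
$$
\begin{aligned}
\{b_{p,s}, b_{q,s'}^\dagger\} &= \delta_{ss'}\delta_{pq}\,, & \{d_{p,s}, d_{q,s'}^\dagger\} &= \delta_{ss'}\delta_{pq}\,, \\
\{b_{p,s}, b_{q,s'}\} &= 0\,, & \{d_{p,s}, d_{q,s'}\} &= 0\,, \\
\{b_{p,s}, d_{q,s'}\} &= 0\,, & \{b_{p,s}, d_{q,s'}^\dagger\} &= 0
\end{aligned} \tag{19.15}
$$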

(further relevant identities are given by the Hermitian conjugation of (19.15)). Briefly, one may
say that {𝑏, 𝑏 † } and {𝑑, 𝑑 † } are non-trivial (being expressed in terms of the notorious Kronecker
symbols), and all remaining anticommutators vanish. Needless to say, if we decide to work
in the infinite space, i.e. use a continuous variable $\vec{p}$ like in (19.11), (19.12), $\delta_{pq}$ in (19.15) is
replaced with $\delta^{(3)}(\vec{p} - \vec{q})$.
The impact of the anticommutators (19.15) on the expressions (19.14) is immediately
clear. In particular, using the relation $d_{p,s} d_{p,s}^\dagger + d_{p,s}^\dagger d_{p,s} = 1$, one has
$$ H = \sum_{\vec{p}} \sum_s p_0 \left( b_{p,s}^\dagger b_{p,s} + d_{p,s}^\dagger d_{p,s} - 1 \right), \tag{19.16} $$

and, discarding the (negative) infinite constant emerging from (19.16), one is left with
$$ H = \sum_{\vec{p}} \sum_s p_0 \left( b_{p,s}^\dagger b_{p,s} + d_{p,s}^\dagger d_{p,s} \right). \tag{19.17} $$

Such a redefinition guarantees positivity of the Hamiltonian (19.17); now it remains to be
clarified whether $b^\dagger$, $b$, etc. may indeed play the desired role of creation and annihilation
operators. Fortunately, it is so. The crucial point is that even though the basic relations (19.15)
involve anticommutators, for the composite commutators of the type $[b^\dagger b, b^\dagger]$, etc. we recover
the same structure as before, in the case of a scalar field. To see this, one employs the general
identity
$$ [AB, C] = A\{B, C\} - \{A, C\}B\,. \tag{19.18} $$
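The identity (19.18) is purely algebraic (the terms $ACB$ cancel on the right-hand side), so it may be checked with arbitrary matrices. The following lines do so for random complex $4 \times 4$ matrices and, for comparison, also verify the more familiar bosonic twin $[AB, C] = A[B, C] + [A, C]B$; the matrix size, the seed and the helper names are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)


def commutator(A, B):
    return A @ B - B @ A


def anticommutator(A, B):
    return A @ B + B @ A


def random_matrix(n):
    """Random complex n x n matrix (no special structure assumed)."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


A, B, C = (random_matrix(4) for _ in range(3))

lhs = commutator(A @ B, C)
rhs_anticommutators = A @ anticommutator(B, C) - anticommutator(A, C) @ B   # (19.18)
rhs_commutators = A @ commutator(B, C) + commutator(A, C) @ B               # bosonic analogue
```

Both right-hand sides agree with `lhs` to machine precision; which form is useful depends on whether the basic relations are given in terms of commutators or anticommutators.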

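Using (19.15), one then gets, for example,
$$
\begin{aligned}
[b_{p,s}^\dagger b_{p,s}, b_{q,s'}^\dagger] &= b_{p,s}^\dagger \{b_{p,s}, b_{q,s'}^\dagger\} - \{b_{p,s}^\dagger, b_{q,s'}^\dagger\}\, b_{p,s} \\
&= \delta_{pq}\delta_{ss'}\, b_{p,s}^\dagger\,.
\end{aligned} \tag{19.19}
$$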

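Similarly,
$$
\begin{aligned}
[b_{p,s}^\dagger b_{p,s}, b_{q,s'}] &= b_{p,s}^\dagger \{b_{p,s}, b_{q,s'}\} - \{b_{p,s}^\dagger, b_{q,s'}\}\, b_{p,s} \\
&= -\delta_{pq}\delta_{ss'}\, b_{p,s}\,.
\end{aligned} \tag{19.20}
$$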

Of course, analogous relations are obtained for the 𝑑-operators. The above results then imply
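readily
$$
\begin{aligned}
[H, b_{p,s}^\dagger] &= p_0\, b_{p,s}^\dagger\,, \\
[H, b_{p,s}] &= -p_0\, b_{p,s}\,,
\end{aligned} \tag{19.21}
$$
and similarly for $d_{p,s}^\dagger$, $d_{p,s}$. For the operator $\vec{P}$ one obviously gets analogous identities, with $p_0$
replaced by $\vec{p}$.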
As we already know from the previous chapters, the relations like (19.21) represent the
true basis for the particle interpretation of a quantized field. So, we have achieved our goal, at
least a substantial part of it: 𝑏 † and 𝑑 † can be interpreted as creation operators, and 𝑏, 𝑑 are their
annihilation counterparts. It remains to be clarified how one can distinguish the “𝑑-particles”
from “𝑏-particles” (note that using just the information concerning the energy and momentum,
both types of particles obviously carry the same mass). Having in mind our experience with
the complex scalar field, it is natural to expect that we might uncover here a particle-antiparticle
connection. To verify such an educated guess, we will consider the charge corresponding to the
conserved current $J^\mu(x) = \bar{\psi}(x)\gamma^\mu\psi(x)$ derived in Chapter 15 (cf. (15.60)), i.e.
$$ Q = \int d^3x\; \bar{\psi}(x)\gamma_0\psi(x) = \int d^3x\; \psi^\dagger(x)\psi(x)\,. \tag{19.22} $$

Substituting the expressions (19.6), (19.7) into (19.22) and performing basically the same
manipulations as before, one obtains the result
$$ Q = \int d^3p \sum_s \left[ b^\dagger(p, s)\, b(p, s) + d(p, s)\, d^\dagger(p, s) \right] \tag{19.23} $$

(note that such a result is in fact a natural counterpart of our earlier formula (10.21) obtained
within the framework of relativistic quantum mechanics). For the field placed inside a box the
formula for $Q$ reads, of course,
$$ Q = \sum_{\vec{p}} \sum_s \left( b_{p,s}^\dagger b_{p,s} + d_{p,s} d_{p,s}^\dagger \right). \tag{19.24} $$

So, 𝑄 is a positive operator; however, by means of the anticommutation relations (19.15) it may
be recast in the normal ordered form, namely
$$ Q = \sum_{\vec{p}} \sum_s \left( b_{p,s}^\dagger b_{p,s} - d_{p,s}^\dagger d_{p,s} \right) \tag{19.25} $$

(needless to say, when passing from (19.24) to (19.25) a positive infinite constant is subtracted).
The expression (19.25) is precisely what we need. Using (19.15) one gets
$$
\begin{aligned}
[Q, b_{p,s}^\dagger] &= b_{p,s}^\dagger\,, \\
[Q, d_{p,s}^\dagger] &= -d_{p,s}^\dagger\,,
\end{aligned} \tag{19.26}
$$

so that the one-particle states $b_{p,s}^\dagger|0\rangle$ and $d_{p,s}^\dagger|0\rangle$ carry opposite eigenvalues of $Q$. In this way,
the description of states of the quantized Dirac field, incorporating particles and antiparticles,
is implemented. Conventionally, we will associate particles with the 𝑏-operators and the 𝑑-
operators will represent antiparticles.
As regards the structure of the corresponding Fock space, there is a very important point
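that must be mentioned here. The anticommutation relations
$$ \{b_{p,s}^\dagger, b_{q,s'}^\dagger\} = 0\,, \qquad \{d_{p,s}^\dagger, d_{q,s'}^\dagger\} = 0 $$
mean, in particular, that
$$ \left( b_{p,s}^\dagger \right)^2 = 0\,, \qquad \left( d_{p,s}^\dagger \right)^2 = 0\,. \tag{19.27} $$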
This in turn leads to an enormous simplification of the possible contents of the particle states: the
identities (19.27) imply, obviously, that the same state cannot be occupied by more than one
particle (antiparticle). In other words, the occupation numbers, characterizing the state vectors
in the Fock space of the quantized Dirac field, can take on just the values 0 and 1. Of course,
such a statement is the famous Pauli exclusion principle. Thus, the particles and antiparticles
corresponding to a quantum Dirac field are fermions (i.e. they obey the Fermi–Dirac statistics).
This is another example of the spin-statistics connection, since there is a good reason to believe
that we are working here with spin-1/2 particles (note that a clear hint to this point is provided by
the form of the Dirac field angular momentum shown in (15.52)). In fact, the problem of the
spin labels of the particles we have described up to now in terms of the creation operators $b_{p,s}^\dagger$
and $d_{p,s}^\dagger$ would deserve a more detailed treatment and we defer it to Appendix D. Throughout
the subsequent chapters we will assume, bona fide, that the spin state of a particle in question
is described by means of the spin four-vector 𝑠, in much the same way as in the framework of
relativistic quantum mechanics.
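The algebraic content of (19.26) and (19.27) can be checked on a toy system with just two fermionic modes, one "particle" mode b and one "antiparticle" mode d. The sketch below is only an illustration: the 4×4 matrix representation (a Jordan–Wigner construction) and the two-mode operator Q = b†b − d†d are assumptions made for this example, not the field-theoretic operators of the text.

```python
# Two fermionic modes represented by 4x4 matrices (Jordan-Wigner construction).
# This finite toy model is an assumption made for illustration only.

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(x, y):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[x[i // 2][j // 2] * y[i % 2][j % 2] for j in range(4)]
            for i in range(4)]

def add(x, y, sign=1):
    return [[x[i][j] + sign * y[i][j] for j in range(len(x))]
            for i in range(len(x))]

def transpose(x):
    return [list(row) for row in zip(*x)]

def comm(x, y):
    return add(matmul(x, y), matmul(y, x), -1)

def anticomm(x, y):
    return add(matmul(x, y), matmul(y, x), +1)

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # sign string that makes different modes anticommute
LOWER = [[0, 1], [0, 0]]     # single-mode annihilation operator

b = kron(LOWER, I2)
d = kron(Z, LOWER)
b_dag, d_dag = transpose(b), transpose(d)   # real matrices: dagger = transpose

ZERO = [[0] * 4 for _ in range(4)]
ONE = [[int(i == j) for j in range(4)] for i in range(4)]

# Toy "charge" operator Q = b^dagger b - d^dagger d
Q = add(matmul(b_dag, b), matmul(d_dag, d), -1)

# (19.27): squares of creation operators vanish
pauli_ok = matmul(b_dag, b_dag) == ZERO and matmul(d_dag, d_dag) == ZERO
# canonical anticommutation relations
car_ok = (anticomm(b, b_dag) == ONE and anticomm(d, d_dag) == ONE
          and anticomm(b, d) == ZERO and anticomm(b, d_dag) == ZERO)
# (19.26): b^dagger raises Q by one unit, d^dagger lowers it
minus_d_dag = [[-v for v in row] for row in d_dag]
charge_ok = comm(Q, b_dag) == b_dag and comm(Q, d_dag) == minus_d_dag
# occupation numbers of the b-mode
occupations = sorted({matmul(b_dag, b)[i][i] for i in range(4)})

print(pauli_ok, car_ok, charge_ok, occupations)
```

The occupation-number operator b†b has only the eigenvalues 0 and 1 here, which is the exclusion principle in its simplest form.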

Chapter 20

Quantization of massive vector field

Following the sequence of relativistic wave equations for classical fields, the next example is the
massive vector field satisfying the Proca equation that we have discussed earlier (cf. chapters 12
and 14). Let us stress again that while the basic equation for a four-vector function 𝐴 𝜇 (𝑥) that we
are going to use is the same as in Chapter 12, its interpretation is totally different: in Chapter 12,
we have dealt with a relativistic quantum mechanical wave function for a single spin-1 particle,
but now we have in mind the classical field described in Chapter 14 (which, admittedly, is a
rather abstract quantity).
The quantization of a massive vector (Proca) field is quite interesting for more than one
reason. First of all, it is designed to describe massive spin-1 bosons (which, as we know, play a
crucial role in weak interactions); in fact, in many applications of quantum electrodynamics one
may also treat the photon as a massless limit of a massive vector boson. Moreover, the relevant
quantization scheme follows basically the canonical paradigm used before for a scalar field,
although one is dealing here with a set of dynamical variables subject to a non-trivial constraint.
Anyway, such a procedure is conceptually simpler than the quantization of the massless Maxwell field,
where one has to cope with consequences of the electromagnetic gauge invariance.
So, let us see how the canonical quantization of the Proca field can be implemented. For
simplicity, we are going to consider the case of a real field. According to our previous results,
the Lagrangian density is given by (14.18) and the corresponding equation of motion reads

𝜕𝜇 𝐹 𝜇𝜈 + 𝑚 2 𝐴𝜈 = 0 , (20.1)

where 𝐹𝜇𝜈 = 𝜕𝜇 𝐴𝜈 − 𝜕𝜈 𝐴 𝜇 . This, as we know, is equivalent to the pair consisting of the Klein–Gordon equation
and a “Lorenz-like constraint”, namely

\[
(\Box + m^2)\,A_\mu = 0\,, \qquad \partial^\mu A_\mu = 0\,. \tag{20.2}
\]
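The equivalence can be made explicit in two steps. The following is a short worked sketch that uses only the antisymmetry of the field-strength tensor and the notation $\Box = \partial_\mu\partial^\mu$; note that the first step relies on $m \neq 0$.

```latex
\begin{align*}
\partial_\nu\bigl(\partial_\mu F^{\mu\nu} + m^2 A^\nu\bigr) = 0
 \;&\Longrightarrow\; m^2\,\partial_\nu A^\nu = 0
 \qquad (\partial_\nu\partial_\mu F^{\mu\nu} = 0 \text{ by antisymmetry of } F^{\mu\nu}),\\
\partial_\mu F^{\mu\nu} = \Box A^\nu - \partial^\nu(\partial_\mu A^\mu)
 \;&\Longrightarrow\; (\Box + m^2)\,A^\nu = 0
 \qquad (\text{using } \partial_\mu A^\mu = 0).
\end{align*}
```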

For a definition of the relevant canonical variables, the formula (14.24) is instrumental, so let us
reproduce it here for the reader’s convenience:

\[
\frac{\partial\mathcal{L}}{\partial(\partial_\rho A_\sigma)} = -F^{\rho\sigma}\,. \tag{20.3}
\]

Supposing that the field components 𝐴 𝜇 (𝑥) play the role of generalized coordinates, the corre-
sponding conjugate momenta can be defined, conventionally, by

\[
A_\mu \;\longrightarrow\; \pi^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_0 A_\mu)} = -F^{0\mu}\,. \tag{20.4}
\]

It is seen that 𝜋0 would be then identically zero, so it makes sense to use just 𝐴 𝑗 (𝑥), 𝑗 = 1, 2, 3, as
dynamical variables, and 𝐴0 (𝑥) is supposed to be calculated in terms of these, or the conjugate
momenta 𝜋 𝑗 (𝑥). Thus, the canonical equal-time commutation relations may be written (in
analogy with the scalar field case described in Chapter 16) as

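\[
\begin{aligned}
\bigl[A_j(x), \pi^k(y)\bigr]_{\text{E.T.}} &= i\,\delta_{jk}\,\delta^{(3)}(\vec{x}-\vec{y})\,,\\
\bigl[A_j(x), A_k(y)\bigr]_{\text{E.T.}} &= 0\,,\\
\bigl[\pi^j(x), \pi^k(y)\bigr]_{\text{E.T.}} &= 0\,.
\end{aligned} \tag{20.5}
\]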
Note also that according to (20.4) one has

𝜋 𝑗 = 𝐹0 𝑗 (= 𝜕0 𝐴 𝑗 − 𝜕 𝑗 𝐴0 ) . (20.6)

Now, the question is how to express 𝐴0 in terms of the above canonical variables. For this
purpose, one may invoke the equation of motion (20.1). One gets readily
\[
A_0 = -\frac{1}{m^2}\,\partial_j F_{0j}\,, \tag{20.7}
\]
and this means, in view of (20.6),
\[
A_0 = -\frac{1}{m^2}\,\partial_j \pi^j\,. \tag{20.8}
\]
As we will see shortly, the relation (20.8) is crucial for a successful implementation of the
whole quantization procedure. In addition to (20.8), the time derivative $\dot{A}_0$ can be immediately obtained from the
constraint 𝜕 𝜇 𝐴 𝜇 = 0:
\[
\dot{A}_0 = \partial_j A_j\,. \tag{20.9}
\]
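For completeness, here is how (20.7) follows from the $\nu = 0$ component of the equation of motion (20.1). This is a sketch that assumes the metric signature $(+,-,-,-)$, so that raising a spatial index costs a sign while $A^0 = A_0$.

```latex
\begin{align*}
\nu = 0:\quad & \partial_\mu F^{\mu 0} + m^2 A^0 = \partial_j F^{j0} + m^2 A^0 = 0
 \qquad (F^{00} = 0),\\
& F^{j0} = -F_{j0} = F_{0j}\,, \quad A^0 = A_0
 \;\Longrightarrow\; A_0 = -\frac{1}{m^2}\,\partial_j F_{0j}\,.
\end{align*}
```

No time derivative of $A_0$ appears in this component, which is why $A_0$ is fixed by the other variables and is not an independent dynamical degree of freedom.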
Of course, our ultimate goal is the particle interpretation of the quantized field, based on the
appropriate creation and annihilation operators. To this end, one may proceed in analogy with
the previous examples of the scalar and spinor field, i.e. write a solution of the equations (20.2) as
a general superposition of plane waves and establish commutation relations of the corresponding
expansion coefficients. We have examined the properties of plane-wave solutions of the Proca
equation in Chapter 12, so it comes in handy now. The general expression for 𝐴 𝜇 (𝑥) that we
need can be written as
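\[
A_\mu(x) = \int \mathrm{d}^3k\, N_k \sum_{\lambda=1}^{3} \Bigl[ a(k,\lambda)\,\varepsilon_\mu(k,\lambda)\, e^{-ikx} + a^\dagger(k,\lambda)\,\varepsilon^*_\mu(k,\lambda)\, e^{ikx} \Bigr]\,, \tag{20.10}
\]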

where 𝜀 𝜇 (𝑘, 𝜆) are the three space-like polarization vectors satisfying the conditions (12.8),
(12.28), and 𝑁 𝑘 = (2𝜋) −3/2 (2𝑘 0 ) −1/2 is the usual conventional normalization factor; of course,
𝑘 0 = ( 𝑘® 2 + 𝑚 2 ) 1/2 . The next key step is deriving the commutation relations for the operator
coefficients 𝑎, 𝑎 † . Basically, the procedure is the same as in the case of the scalar field discussed
in Chapter 16, but the practical calculation is algebraically much more complicated. First of
all, one must express the operators 𝑎, 𝑎 † in terms of the original canonical variables 𝐴 𝑗 , 𝜋 𝑗
appearing in the set of commutators (20.5). For this purpose, it is convenient to introduce the
combinations
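\[
\begin{aligned}
a_\mu(k) &= \sum_{\lambda=1}^{3} a(k,\lambda)\,\varepsilon_\mu(k,\lambda)\,,\\
a^\dagger_\mu(k) &= \sum_{\lambda=1}^{3} a^\dagger(k,\lambda)\,\varepsilon^*_\mu(k,\lambda)\,.
\end{aligned} \tag{20.11}
\]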

Then the expansion (20.10) is recast as

\[
A_\mu(x) = \int \mathrm{d}^3k\, N_k \Bigl[ a_\mu(k)\, e^{-ikx} + a^\dagger_\mu(k)\, e^{ikx} \Bigr]\,, \tag{20.12}
\]

and the operators 𝑎 𝜇 (𝑘), 𝑎 †𝜇 (𝑘) can be expressed with the help of the relations analogous to
(16.9), namely


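\[
\begin{aligned}
a_\mu(k) &= i \int \mathrm{d}^3x\; f_k^*(x)\, \overleftrightarrow{\partial_0}\, A_\mu(x)\,,\\
a^\dagger_\mu(k) &= -i \int \mathrm{d}^3x\; f_k(x)\, \overleftrightarrow{\partial_0}\, A_\mu(x)\,,
\end{aligned} \tag{20.13}
\]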

where 𝑓 𝑘 (𝑥), 𝑓 𝑘∗ (𝑥) are given by (16.7). Now it is clear that for the evaluation of commutators
involving 𝑎 𝜇 (𝑘) and 𝑎 †𝜇 (𝑘) one needs to know all possible equal-time commutators of $A_\mu$ and
$\dot{A}_\mu$. This is why the relation (20.8) is so important: using it, along with (20.6) and (20.9),
all commutators in question can be reduced to the basic relations (20.5). The corresponding
calculation is quite boring, long and tedious; according to a little known literary character, it
might be called “the horrors of the unborn” (in Czech: “hrůzy nezrozeného”).¹¹ Anyway, let
us at least summarize the results for the commutators in question. An awkward feature of the
formulae shown below is that they cannot be presented in a covariant form (this, of course, is
due to the intrinsically non-covariant way of distinguishing between 𝐴0 and 𝐴 𝑗 , 𝑗 = 1, 2, 3). So,
there are basically three types of commutators:
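a) Type $[A_\mu(x), A_\nu(y)]_{\text{E.T.}}$

\[
\begin{aligned}
\bigl[A_0(x), A_j(y)\bigr]_{\text{E.T.}} &= \frac{i}{m^2}\,\frac{\partial}{\partial x^j}\,\delta^{(3)}(\vec{x}-\vec{y})\,,\\
\bigl[A_j(x), A_k(y)\bigr]_{\text{E.T.}} &= 0\,,\\
\bigl[A_0(x), A_0(y)\bigr]_{\text{E.T.}} &= 0\,.
\end{aligned} \tag{20.14}
\]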
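b) Type $[\dot{A}_\mu(x), \dot{A}_\nu(y)]_{\text{E.T.}}$

\[
\begin{aligned}
\bigl[\dot{A}_0(x), \dot{A}_j(y)\bigr]_{\text{E.T.}} &= i\,\frac{\partial}{\partial x^j}\,\delta^{(3)}(\vec{x}-\vec{y}) - \frac{i}{m^2}\,\frac{\partial}{\partial x^j}\,\Delta\,\delta^{(3)}(\vec{x}-\vec{y})\,,\\
\bigl[\dot{A}_j(x), \dot{A}_k(y)\bigr]_{\text{E.T.}} &= 0\,,\\
\bigl[\dot{A}_0(x), \dot{A}_0(y)\bigr]_{\text{E.T.}} &= 0\,.
\end{aligned} \tag{20.15}
\]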
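c) Type $[A_\mu(x), \dot{A}_\nu(y)]_{\text{E.T.}}$

\[
\begin{aligned}
\bigl[A_0(x), \dot{A}_0(y)\bigr]_{\text{E.T.}} &= -\frac{i}{m^2}\,\Delta\,\delta^{(3)}(\vec{x}-\vec{y})\,,\\
\bigl[A_0(x), \dot{A}_j(y)\bigr]_{\text{E.T.}} &= 0\,,\\
\bigl[A_j(x), \dot{A}_0(y)\bigr]_{\text{E.T.}} &= 0\,,\\
\bigl[A_j(x), \dot{A}_k(y)\bigr]_{\text{E.T.}} &= i\left(\delta_{jk} - \frac{1}{m^2}\,\frac{\partial}{\partial x^j}\frac{\partial}{\partial x^k}\right)\delta^{(3)}(\vec{x}-\vec{y})\,.
\end{aligned} \tag{20.16}
\]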
11 By way of explanation: True connoisseurs of the novel “The good soldier Švejk” by J. Hašek may recognize
in it a statement of the cook occultist Jurajda concerning a difficult and confusing situation.

Well, the above list appears to be a welter of cumbersome formulae, but when these are utilized
in commutators of the operators 𝑎 𝜇 (𝑘), 𝑎 †𝜈 (𝑘) expressed by means of (20.13), the final result
that emerges from such an “ugly-duckling form” is quite elegant; one gets
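\[
\begin{aligned}
\bigl[a_\mu(k), a^\dagger_\nu(k')\bigr] &= \left(-g_{\mu\nu} + \frac{1}{m^2}\,k_\mu k_\nu\right)\delta^{(3)}(\vec{k}-\vec{k}')\,,\\
\bigl[a_\mu(k), a_\nu(k')\bigr] &= 0\,.
\end{aligned} \tag{20.17}
\]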

From (20.17) it is then easy to obtain the corresponding commutators for 𝑎(𝑘, 𝜆), 𝑎 † (𝑘, 𝜆).
Taking into account the definition (20.11) and employing the orthogonality relation (12.28) one
gets readily
\[
a(k,\lambda) = -\varepsilon^{*\mu}(k,\lambda)\,a_\mu(k)\,, \tag{20.18}
\]
so that

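\[
\begin{aligned}
\bigl[a(k,\lambda), a^\dagger(k',\lambda')\bigr]
 &= \bigl[-\varepsilon^{*\mu}(k,\lambda)\,a_\mu(k),\; -\varepsilon^{\nu}(k',\lambda')\,a^\dagger_\nu(k')\bigr]\\
 &= \varepsilon^{*\mu}(k,\lambda)\,\varepsilon^{\nu}(k',\lambda')\left(-g_{\mu\nu} + \frac{1}{m^2}\,k_\mu k_\nu\right)\delta^{(3)}(\vec{k}-\vec{k}')\\
 &= -\varepsilon^{*}(k,\lambda)\cdot\varepsilon(k,\lambda')\,\delta^{(3)}(\vec{k}-\vec{k}')\\
 &= \delta_{\lambda\lambda'}\,\delta^{(3)}(\vec{k}-\vec{k}')\,,
\end{aligned} \tag{20.19}
\]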

where we have used again the basic properties of the polarization vectors. Obviously, one also
gets immediately
[𝑎(𝑘, 𝜆), 𝑎(𝑘 ′, 𝜆′)] = 0 . (20.20)
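Inserting (20.19) and (20.20) back into the definition (20.11) shows that the tensor appearing in (20.17) must coincide with the polarization sum over the three polarization states. The sketch below checks this numerically for one momentum. The explicit polarization vectors (two transverse, one longitudinal, for a momentum along the z axis) and the numerical values are assumptions chosen for illustration; the conventions of Chapter 12 may differ in detail.

```python
import math

# Metric g = diag(+1, -1, -1, -1); all four-vectors below carry upper indices.
G = [1.0, -1.0, -1.0, -1.0]

def dot(u, v):
    """Minkowski product u.v for real four-vectors."""
    return sum(G[i] * u[i] * v[i] for i in range(4))

m = 1.3                       # mass (arbitrary illustrative value)
kz = 0.7                      # momentum along the z axis (illustrative)
E = math.sqrt(kz**2 + m**2)   # on-shell energy k_0
k = [E, 0.0, 0.0, kz]

# Assumed explicit polarization basis: two transverse vectors, one longitudinal.
eps = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [kz / m, 0.0, 0.0, E / m],
]

# Transversality k.eps = 0 and orthonormality eps(l).eps(l') = -delta_{l l'}
transverse = all(abs(dot(k, e)) < 1e-12 for e in eps)
orthonormal = all(
    abs(dot(eps[a], eps[b]) + (1.0 if a == b else 0.0)) < 1e-12
    for a in range(3) for b in range(3)
)

# Completeness: sum_l eps^mu eps^nu = -g^{mu nu} + k^mu k^nu / m^2
max_dev = 0.0
for mu in range(4):
    for nu in range(4):
        lhs = sum(e[mu] * e[nu] for e in eps)
        g_up = G[mu] if mu == nu else 0.0
        rhs = -g_up + k[mu] * k[nu] / m**2
        max_dev = max(max_dev, abs(lhs - rhs))

print(transverse, orthonormal, max_dev)
```

Since the chosen vectors are real, complex conjugation plays no role in this particular check.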
Thus, the results (20.19), (20.20) suggest that we are on the right track for the interpretation of
𝑎(𝑘, 𝜆) and 𝑎 † (𝑘, 𝜆) as an annihilation and creation operator, respectively. Of course, as before,
a necessary further step towards this goal is the calculation of the field energy and momentum.
For this purpose, it is convenient to employ an alternative Lagrangian for the Proca field, which
is equivalent to the original form (14.18); it reads

\[
\tilde{\mathcal{L}} = -\frac12\,(\partial_\mu A_\nu)(\partial^\mu A^\nu) + \frac12\,(\partial_\mu A^\mu)^2 + \frac12\,m^2 A_\mu A^\mu\,. \tag{20.21}
\]
The equivalence of (20.21) and (14.18) is verified easily. To derive the corresponding equation
of motion, one gets first
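\[
\frac{\partial\tilde{\mathcal{L}}}{\partial(\partial_\rho A_\sigma)} = -\partial^\rho A^\sigma + g^{\rho\sigma}\,\partial\cdot A\,, \tag{20.22}
\]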
and utilizing it in the general relation (14.10), one obtains readily the Proca equation (20.1).
In addition to that, one may check that the Lagrangians (14.18) and (20.21) differ by a four-
divergence. Indeed, the form (14.18) may be recast as
\[
\mathcal{L} = -\frac12\,(\partial_\mu A_\nu)(\partial^\mu A^\nu) + \frac12\,(\partial_\mu A_\nu)(\partial^\nu A^\mu) + \frac12\,m^2 A_\mu A^\mu\,, \tag{20.23}
\]
and one can prove easily that the following identity holds:

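\[
(\partial_\mu A^\mu)^2 - (\partial_\mu A_\nu)(\partial^\nu A^\mu) = \partial_\mu\bigl[A^\mu\,(\partial_\nu A^\nu) - A^\nu\,(\partial_\nu A^\mu)\bigr]\,. \tag{20.24}
\]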
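A short check of the identity (20.24), with the total derivative on the right-hand side spelled out explicitly. This is a sketch; the grouping of terms inside the four-divergence is our reading of how the right-hand side is meant. The two terms containing second derivatives cancel after relabelling the summation indices, since partial derivatives commute.

```latex
\begin{align*}
\partial_\mu\bigl[A^\mu(\partial_\nu A^\nu) - A^\nu(\partial_\nu A^\mu)\bigr]
&= (\partial_\mu A^\mu)(\partial_\nu A^\nu) + A^\mu\,\partial_\mu\partial_\nu A^\nu
 - (\partial_\mu A^\nu)(\partial_\nu A^\mu) - A^\nu\,\partial_\mu\partial_\nu A^\mu\\
&= (\partial_\mu A^\mu)^2 - (\partial_\mu A_\nu)(\partial^\nu A^\mu)\,.
\end{align*}
```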


Thus, $\mathcal{L}$ and $\tilde{\mathcal{L}}$ differ by a term proportional to the right-hand side of (20.24). To evaluate the
energy (the field Hamiltonian), one has to substitute into (20.25) the solution of the equation

of motion given by (20.10) and integrate over d3 𝑥. The calculation is somewhat tedious, but it
proceeds basically in the same manner as e.g. for the scalar field; of course, it is algebraically
slightly more complicated (but, in fact, one gets along with the orthogonality relation (12.28)
for the polarization vectors). The reader is encouraged to carry out the calculation in detail; in
any case it is a useful revision exercise. As one may have rightly expected, the final result is
\[
H = \frac12 \int \mathrm{d}^3k\; k_0 \sum_{\lambda=1}^{3} \Bigl[ a^\dagger(k,\lambda)\,a(k,\lambda) + a(k,\lambda)\,a^\dagger(k,\lambda) \Bigr]\,, \tag{20.25}
\]

and this may be subsequently recast in the normal-ordered form. The evaluation of the momen-
tum operator proceeds similarly and it is easy to guess the result, which can be written as an
obvious analogue of our previous formulae. An analysis of the angular momentum would be
technically more complicated, so for the time being we will simply take for granted that the label
𝜆 = 1, 2, 3 corresponds to a definite spin (helicity) state of the spin-1 particle, in analogy with
our earlier treatment within the framework of relativistic quantum mechanics.
Finally, let us add that the above discussion can be generalized in a straightforward manner
to the case of a complex (non-Hermitian) Proca field; in such a case, the pair of the operators 𝑎,
𝑎 † appearing in (20.10) is replaced with 𝑏, 𝑑 † in much the same way as for the complex scalar
field described in Chapter 18.

Chapter 21

Interactions of classical and quantum fields

The quantization of free fields that we have pursued in preceding chapters is certainly a quite
remarkable and elegant way of describing relativistic particles. Particularly valuable is the
connection of particle states with the creation and annihilation operators satisfying appropriate
commutation or anticommutation relations. On the other hand, a set of totally free particles is
rather dull stuff, since real life is obviously based on interactions. So, the next step forward is
a treatment of dynamical processes involving particles, in terms of the field interactions.
Let us start with classical fields. A formal description of an envisaged interaction is,
conceptually, quite simple: the idea is to have a coupling of fields that mutually influence each
other (it is also possible to imagine a situation where a field influences itself; in such a case one
may speak of a self-interaction). An instructive example is a system of coupled spinor (Dirac)
and scalar (Klein–Gordon) fields. The full Lagrangian for such a system can be written e.g. as

\[
\mathcal{L} = \frac12\,\partial_\mu\varphi\,\partial^\mu\varphi - \frac12\,M^2\varphi^2 + i\bar\psi\gamma^\mu\partial_\mu\psi - m\bar\psi\psi + g\,\bar\psi\psi\,\varphi\,. \tag{21.1}
\]
The notation used in (21.1) is standard, so it is clear that the first four monomials constitute
the familiar free Lagrangians discussed earlier. The last term represents an interaction; the
dimensionless parameter 𝑔 is therefore traditionally called the “coupling constant”. (Note
that the type of the interaction shown in (21.1) is called, for historical reasons, the Yukawa
coupling.) The role of the interaction term appearing in (21.1) becomes quite transparent when
the corresponding equations of motion are written down. Using the general relations (14.10),
one gets easily the Lagrange equations for the considered system; these read
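\[
\begin{aligned}
(\Box + M^2)\,\varphi &= g\,\bar\psi\psi\,,\\
i\gamma^\mu\partial_\mu\psi - m\psi + g\,\varphi\,\psi &= 0\,.
\end{aligned} \tag{21.2}
\]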
From (21.2) it is clear that the fields 𝜑 and 𝜓 influence each other owing to the presence of the
interaction; of course, the two equations become decoupled for 𝑔 = 0.
Another example is the interaction of the Dirac field with the electromagnetic Maxwell field;
the corresponding Lagrangian has the form

\[
\mathcal{L} = i\bar\psi\gamma^\mu\partial_\mu\psi - m\bar\psi\psi - \frac14\,F_{\mu\nu}F^{\mu\nu} + e\,\bar\psi\gamma^\mu\psi\,A_\mu\,, \tag{21.3}
\]

where, of course, 𝐹𝜇𝜈 = 𝜕𝜇 𝐴𝜈 − 𝜕𝜈 𝐴 𝜇 . Again, in (21.3) one may distinguish the free part of $\mathcal{L}$
and an interaction, which now has the structure “current × field” (recall that $\bar\psi\gamma^\mu\psi$ is a conserved
vector current in Dirac theory). It is worth noting that (21.3) may be recast as

\[
\mathcal{L} = -\frac14\,F_{\mu\nu}F^{\mu\nu} + i\bar\psi\gamma^\mu(\partial_\mu - ieA_\mu)\psi - m\bar\psi\psi\,. \tag{21.4}
\]
The point is that the presence of the covariant derivative 𝐷 𝜇 = 𝜕𝜇 − 𝑖𝑒 𝐴 𝜇 guarantees the
invariance of the considered Maxwell–Dirac Lagrangian under gauge transformations

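\[
\begin{aligned}
\psi'(x) &= e^{i\omega(x)}\,\psi(x)\,,\\
\bar\psi'(x) &= e^{-i\omega(x)}\,\bar\psi(x)\,,\\
A'_\mu(x) &= A_\mu(x) + \frac{1}{e}\,\partial_\mu\omega(x)\,,
\end{aligned} \tag{21.5}
\]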
where 𝜔(𝑥) is a real function. In other words, the expression (21.4) exhibits manifestly the local
U(1) symmetry, which is a familiar feature of classical electrodynamics. The form (21.3) or
(21.4) can be easily extended to incorporate also a mass term for the vector field 𝐴 𝜇 (i.e. replace
Maxwell with Proca); then, of course, the gauge invariance under (21.5) is lost and one is left
just with global U(1) symmetry (corresponding to 𝜔(𝑥) ≡ 𝜔 = const.). Anyway, the current
conservation is maintained in such a case. Adding the mass term $\tfrac12 M^2 A_\mu A^\mu$ to (21.3), one gets
the equations of motion

\[
\begin{aligned}
\partial_\mu F^{\mu\nu} + M^2 A^\nu &= -e\,\bar\psi\gamma^\nu\psi\,,\\
\bigl[i\gamma^\mu(\partial_\mu - ieA_\mu) - m\bigr]\psi &= 0\,,
\end{aligned} \tag{21.6}
\]

and thus it is seen again how the fields 𝜓 and 𝐴 𝜇 are intertwined within the coupled system (21.6).
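The gauge invariance of (21.4) under (21.5) can be verified in a couple of lines. The following is a worked sketch in which $D'_\mu = \partial_\mu - ieA'_\mu$ denotes the covariant derivative built from the transformed vector field (a notation introduced only for this check).

```latex
\begin{align*}
D'_\mu\psi' &= \bigl(\partial_\mu - ieA_\mu - i\,\partial_\mu\omega\bigr)\bigl(e^{i\omega}\psi\bigr)
 = e^{i\omega}\bigl(\partial_\mu + i\,\partial_\mu\omega - ieA_\mu - i\,\partial_\mu\omega\bigr)\psi
 = e^{i\omega}\,D_\mu\psi\,,\\
\bar\psi'\gamma^\mu D'_\mu\psi' &= \bar\psi\,e^{-i\omega}\gamma^\mu e^{i\omega}\,D_\mu\psi
 = \bar\psi\gamma^\mu D_\mu\psi\,,
\qquad
F'_{\mu\nu} = F_{\mu\nu} + \frac{1}{e}\bigl(\partial_\mu\partial_\nu - \partial_\nu\partial_\mu\bigr)\omega = F_{\mu\nu}\,.
\end{align*}
```

A mass term $\tfrac12 M^2 A_\mu A^\mu$, by contrast, picks up extra pieces proportional to $\partial_\mu\omega$, which is why only the global symmetry survives in the Proca case.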
Finally, let us mention a simple example of a self-interaction of a real scalar field. The
Lagrangian in question reads

\[
\mathcal{L} = \frac12\,\partial_\mu\varphi\,\partial^\mu\varphi - \frac12\,m^2\varphi^2 - \frac14\,\lambda\varphi^4\,. \tag{21.7}
\]
For definiteness, we assume that the coupling constant 𝜆 is positive, so as to get a positive
energy density (please check it). Note that the quartic term appearing in (21.7) occurs also in the
standard model (SM) of particle physics, where 𝜑 may be identified with the Higgs field (note,
however, that this sector of SM has not been fully tested yet). From (21.7) one obtains easily the
corresponding equation of motion:

\[
(\Box + m^2)\,\varphi = -\lambda\varphi^3\,. \tag{21.8}
\]

So, the “source term” on the right-hand side of Eq. (21.8) (cf. also (21.6)) is made of the field 𝜑
itself and this makes the concept of self-interaction manifest.
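For the record, the step from (21.7) to (21.8) reads as follows. This is a sketch which assumes that the general relation (14.10) has the standard Euler–Lagrange form $\partial_\mu\bigl[\partial\mathcal{L}/\partial(\partial_\mu\varphi)\bigr] - \partial\mathcal{L}/\partial\varphi = 0$.

```latex
\begin{align*}
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi)} = \partial^\mu\varphi\,, \qquad
\frac{\partial\mathcal{L}}{\partial\varphi} = -m^2\varphi - \lambda\varphi^3
\;\Longrightarrow\;
\partial_\mu\partial^\mu\varphi + m^2\varphi + \lambda\varphi^3 = 0\,.
\end{align*}
```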
The above considerations give a clear instruction for proceeding to the QFT case. One
may utilize interaction terms written as products of quantum field operators, and figure out a
viable procedure of calculating transition amplitudes for decay and scattering processes. Recall
that, as we have already noted in Chapter 13, the formalism of quantized fields based on creation
and annihilation operators should lead naturally to a description of physical processes, in which
a transmutation of particles occurs, and also their number may be changed when passing from
the initial to a final state (in the Fock space). Well, the problem is that we have only been
able to quantize the free fields, but still one would like to utilize this valuable knowledge for
calculations of dynamical processes involving interacting particles. It is not difficult to guess
that a practicable scheme could be provided by an appropriate form of perturbation theory
(assuming, bona fide, that the particle interactions in question are not too strong). To develop
the pertinent technique, we are going to make now a brief digression concerning the general
formalism for the description of time evolution in quantum theory (for most of the readers it

will be probably just a revision of a part of knowledge gained in an earlier course of quantum
mechanics).
To begin with, let us recall the two best known “pictures” (or representations) of time
evolution. In the Schrödinger picture, the state vectors are time-dependent and satisfy the
familiar equation
\[
i\,\frac{\partial}{\partial t}\,|\Psi(t)\rangle = H\,|\Psi(t)\rangle\,, \tag{21.9}
\]
with 𝐻 being the relevant Hamiltonian, while an operator 𝐴 = 𝐴S representing a standard
observable is time-independent. In the Heisenberg picture it is the other way round, so that the
observables are viewed as time-dependent dynamical variables

\[
A_{\mathrm{H}}(t) = e^{iHt}\,A_{\mathrm{S}}\,e^{-iHt} \tag{21.10}
\]

(where one has to take into account that the label H has a different meaning on the l.h.s. and r.h.s.
of Eq. (21.10)). One should notice that the value of a general matrix element of an observable
is not changed upon transition between the two pictures; one has

⟨ΦS (𝑡)| 𝐴S |ΨS (𝑡)⟩ = ⟨ΦH | 𝐴H (𝑡)|ΨH ⟩ , (21.11)

where we have marked explicitly that the state vectors in the Heisenberg picture are constant in
time.
An “interpolation” between the two above-mentioned schemes is the so-called interaction
picture (sometimes also called Dirac picture). It is defined as follows. Suppose that the full
Hamiltonian can be conveniently split into two terms,

𝐻 = 𝐻0 + 𝐻int , (21.12)

where 𝐻0 is an appropriate “solvable” part (i.e. such that its spectrum and eigenstates can be
computed explicitly); in particular, it may be the Hamiltonian for free particles. 𝐻int then
denotes, conventionally, the “interaction term” (referring to the free Hamiltonian 𝐻0 ). Typically,
𝐻int incorporates a small parameter that may warrant the use of perturbation expansion in powers
of 𝐻int . Let us see how such a scheme is implemented technically. The basic defining relation
determines the passage from a time-independent operator 𝐴 in Schrödinger picture to 𝐴I (𝑡) in
the interaction picture:
\[
A_{\mathrm{I}}(t) = e^{iH_0 t}\,A\,e^{-iH_0 t}\,. \tag{21.13}
\]
Next, we require that all pictures of time evolution give the same results for matrix elements of the
relevant operators; in such a way one can arrive at the corresponding law for the time-dependence
of a state vector |ΨI (𝑡)⟩. So, one may impose e.g. the condition

⟨Φ| 𝐴H (𝑡)|Ψ⟩ = ⟨ΦI (𝑡)| 𝐴I (𝑡)|ΨI (𝑡)⟩ (21.14)

relating the Heisenberg and interaction pictures. Using (21.10) and (21.13), the identity (21.14)
yields
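\[
|\Psi_{\mathrm{I}}(t)\rangle = e^{iH_0 t}\,e^{-iHt}\,|\Psi\rangle\,. \tag{21.15}
\]
Now we are in a position to derive an evolution equation for $|\Psi_{\mathrm{I}}(t)\rangle$. From (21.15) one gets
\[
\begin{aligned}
\frac{\partial}{\partial t}\,|\Psi_{\mathrm{I}}(t)\rangle
 &= \Bigl( iH_0\,e^{iH_0 t}e^{-iHt} + e^{iH_0 t}e^{-iHt}(-iH) \Bigr)|\Psi\rangle\\
 &= \Bigl( i\,e^{iH_0 t}H_0\,e^{-iHt} - i\,e^{iH_0 t}H\,e^{-iHt} \Bigr)|\Psi\rangle\\
 &= -i\,e^{iH_0 t}\,(H - H_0)\,e^{-iHt}\,|\Psi\rangle\,.
\end{aligned} \tag{21.16}
\]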

However, according to (21.12), 𝐻 − 𝐻0 = 𝐻int , so that the equation (21.16) may be recast
conveniently as
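\[
i\,\frac{\partial}{\partial t}\,|\Psi_{\mathrm{I}}(t)\rangle = e^{iH_0 t}\,H_{\text{int}}\,e^{-iH_0 t}\;e^{iH_0 t}\,e^{-iHt}\,|\Psi\rangle\,. \tag{21.17}
\]
Taking into account the basic definitions (21.13) and (21.15), one thus gets finally
\[
i\,\frac{\partial}{\partial t}\,|\Psi_{\mathrm{I}}(t)\rangle = H^{\text{int}}_{\mathrm{I}}(t)\,|\Psi_{\mathrm{I}}(t)\rangle\,, \tag{21.18}
\]
with
\[
H^{\text{int}}_{\mathrm{I}}(t) = e^{iH_0 t}\,H_{\text{int}}\,e^{-iH_0 t}\,. \tag{21.19}
\]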
In this way, one arrives at the desired equation, which determines the time evolution of a state
vector just through the interaction part of the Hamiltonian (and the name of the considered
picture is thereby justified). As any quantum theory practitioner knows, an equation like (21.18)
may be solved formally by means of an evolution operator, which transforms a state vector at an
initial time 𝑡0 to its value at any other time 𝑡. So, one has

|ΨI (𝑡)⟩ = 𝑈I (𝑡, 𝑡0 )|ΨI (𝑡0 )⟩ , (21.20)

and from (21.18) one obtains readily the differential equation for 𝑈I (𝑡, 𝑡0 ):
\[
\frac{\partial}{\partial t}\,U_{\mathrm{I}}(t,t_0) = -i\,H^{\text{int}}_{\mathrm{I}}(t)\,U_{\mathrm{I}}(t,t_0)\,. \tag{21.21}
\]
Of course, an appropriate initial condition for the solution of Eq. (21.21) is

𝑈I (𝑡0 , 𝑡0 ) = 1 . (21.22)

The equation (21.21) may be recast in the form of an integral equation, namely
\[
U_{\mathrm{I}}(t,t_0) = 1 - i\int_{t_0}^{t} \mathrm{d}t'\, H^{\text{int}}_{\mathrm{I}}(t')\,U_{\mathrm{I}}(t',t_0)\,, \tag{21.23}
\]

which incorporates automatically the initial condition (21.22). The main virtue of Eq. (21.23)
is that it can be solved formally by means of iterations that lead eventually to an operator power
expansion. Indeed, the first step amounts to

\[
U_{\mathrm{I}}(t,t_0) = 1 - i\int_{t_0}^{t} \mathrm{d}t'\, H^{\text{int}}_{\mathrm{I}}(t') \left[ 1 - i\int_{t_0}^{t'} \mathrm{d}t''\, H^{\text{int}}_{\mathrm{I}}(t'')\,U_{\mathrm{I}}(t'',t_0) \right], \tag{21.24}
\]
and in such a way one may proceed further. Thus we end up with the result

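\[
\begin{aligned}
U_{\mathrm{I}}(t,t_0) = 1 &+ (-i)\int_{t_0}^{t} H^{\text{int}}_{\mathrm{I}}(t_1)\,\mathrm{d}t_1
 + (-i)^2 \int_{t_0}^{t}\mathrm{d}t_1 \int_{t_0}^{t_1}\mathrm{d}t_2\; H^{\text{int}}_{\mathrm{I}}(t_1)\,H^{\text{int}}_{\mathrm{I}}(t_2) + \ldots\\
 &+ (-i)^n \int_{t_0}^{t}\mathrm{d}t_1 \int_{t_0}^{t_1}\mathrm{d}t_2 \cdots \int_{t_0}^{t_{n-1}}\mathrm{d}t_n\; H^{\text{int}}_{\mathrm{I}}(t_1)\,H^{\text{int}}_{\mathrm{I}}(t_2)\cdots H^{\text{int}}_{\mathrm{I}}(t_n) + \ldots\,,
\end{aligned} \tag{21.25}
\]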

which is called the Dyson series.¹² It has obviously the form of a perturbation expansion, if
𝐻int may be viewed as a “small correction” (in whatever sense) to 𝐻0 in the decomposition
(21.12). One may notice that the individual terms in (21.25) exhibit a specific time ordering
$t_1 > t_2 > \cdots > t_n$ for the factors $H^{\text{int}}_{\mathrm{I}}(t_j)$ in the integrands, which is naturally enforced by the
structure of the original equation (21.23). It is useful to rewrite the expression (21.25) with the
help of the formal operation of chronological ordering, or briefly T-ordering. For its general
definition, let us start with two operator factors 𝐴(𝑡1 ), 𝐵(𝑡 2 ). One defines
\[
T\bigl[A(t_1)B(t_2)\bigr] =
\begin{cases}
A(t_1)\,B(t_2) & \text{for } t_1 > t_2\,,\\
B(t_2)\,A(t_1) & \text{for } t_2 > t_1\,.
\end{cases} \tag{21.26}
\]
Then it is not difficult to realize that the third term in the expansion (21.25) may be recast as
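\[
\int_{t_0}^{t}\mathrm{d}t_1 \int_{t_0}^{t_1}\mathrm{d}t_2\; H_{\mathrm{I}}(t_1)\,H_{\mathrm{I}}(t_2)
 = \frac{1}{2!}\int_{t_0}^{t}\mathrm{d}t_1 \int_{t_0}^{t}\mathrm{d}t_2\; T\bigl[H_{\mathrm{I}}(t_1)\,H_{\mathrm{I}}(t_2)\bigr]\,, \tag{21.27}
\]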

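where we have omitted the label “int” for brevity. Indeed, the double integral on the right-hand
side of (21.27) is equal to
\[
I = I_1 + I_2 = \int_{t_0}^{t}\mathrm{d}t_1 \int_{t_0}^{t_1}\mathrm{d}t_2\; H_{\mathrm{I}}(t_1)\,H_{\mathrm{I}}(t_2)
 + \int_{t_0}^{t}\mathrm{d}t_1 \int_{t_1}^{t}\mathrm{d}t_2\; H_{\mathrm{I}}(t_2)\,H_{\mathrm{I}}(t_1)\,. \tag{21.28}
\]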

However, in the second term in (21.28) one may interchange the order of integrations and get
\[
I_2 = \int_{t_0}^{t}\mathrm{d}t_2 \int_{t_0}^{t_2}\mathrm{d}t_1\; H_{\mathrm{I}}(t_2)\,H_{\mathrm{I}}(t_1)\,. \tag{21.29}
\]

Then the integration variables can be renamed, 𝑡 1 ↔ 𝑡2 , and thus it is immediately seen that
𝐼1 = 𝐼2 . The factor 1/2! is thereby compensated and (21.27) is proven. Notice that no assumption
concerning commutation properties of the operators 𝐻I (𝑡) for different times was needed; the
derivation of (21.27) relies just on standard rules for interchanging the order of integrations in
a double integral.
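The statement (21.27) can also be illustrated numerically. The sketch below uses a two-level toy model, with the interaction-picture Hamiltonian obtained from $H_0 = (\omega/2)\sigma_z$ and $H_{\text{int}} = g\,\sigma_x$; the model, the parameter values and the midpoint discretization are assumptions made for this example. It compares the nested (time-ordered region) integral with one half of the T-ordered integral over the full square, and also with one half of the naive unordered product, which differs because the operators at different times do not commute.

```python
import cmath

def h_int(t, g=0.3, omega=1.0):
    """Toy H_I(t): g*sigma_x conjugated with exp(i*H0*t), H0 = (omega/2)*sigma_z."""
    return [[0.0, g * cmath.exp(1j * omega * t)],
            [g * cmath.exp(-1j * omega * t), 0.0]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add_scaled(acc, x, w):
    """In-place acc += w * x for 2x2 matrices."""
    for i in range(2):
        for j in range(2):
            acc[i][j] += w * x[i][j]

def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

t0, t, n = 0.0, 2.0, 200
dt = (t - t0) / n
H = [h_int(t0 + (i + 0.5) * dt) for i in range(n)]   # midpoint samples

nested = [[0j, 0j], [0j, 0j]]      # region t1 > t2: left-hand side of (21.27)
t_ordered = [[0j, 0j], [0j, 0j]]   # full square with T-ordering
unordered = [[0j, 0j], [0j, 0j]]   # full square without any ordering

for i in range(n):
    for j in range(n):
        # T-ordering: the factor with the later time stands on the left
        later_first = matmul(H[i], H[j]) if i >= j else matmul(H[j], H[i])
        add_scaled(t_ordered, later_first, dt * dt)
        add_scaled(unordered, matmul(H[i], H[j]), dt * dt)
        if i > j:
            add_scaled(nested, matmul(H[i], H[j]), dt * dt)
        elif i == j:
            # a diagonal cell is cut in half by the line t1 = t2
            add_scaled(nested, matmul(H[i], H[i]), 0.5 * dt * dt)

half_t_ordered = [[0.5 * v for v in row] for row in t_ordered]
half_unordered = [[0.5 * v for v in row] for row in unordered]

err_ordered = max_abs_diff(nested, half_t_ordered)
err_unordered = max_abs_diff(nested, half_unordered)
print(err_ordered, err_unordered)
```

With T-ordering the two sides agree up to rounding errors, while the unordered product deviates by an amount of order $g^2$.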
It is straightforward to generalize the definition of T-ordering (T-product of operators)
for an arbitrary number of factors. The operators appearing in the considered chain are ordered
in such a way that the corresponding time arguments decrease from left to right. Obviously,
for a product of 𝑛 factors there are 𝑛! possible time orderings and one may repeat the previous
argumentation based on interchanging the order of integrations in a multiple integral. In such
a manner, the 𝑛-fold integral over the hypercube (𝑡0 , 𝑡) × · · · × (𝑡 0 , 𝑡) is shown to consist of 𝑛!
identical contributions, each equal to the nested integral in the (𝑛 + 1)-th term in (21.25). This means that the formula (21.25)
may be recast as
\[
U_{\mathrm{I}}(t,t_0) = 1 + \sum_{n=1}^{\infty} \frac{(-i)^n}{n!} \int_{t_0}^{t}\mathrm{d}t_1 \cdots \int_{t_0}^{t}\mathrm{d}t_n\; T\bigl[H^{\text{int}}_{\mathrm{I}}(t_1)\cdots H^{\text{int}}_{\mathrm{I}}(t_n)\bigr]\,. \tag{21.30}
\]

For obvious reasons, the Dyson expansion (21.30) is usually written in a compact form as
\[
U_{\mathrm{I}}(t,t_0) = T\exp\left( -i\int_{t_0}^{t} H^{\text{int}}_{\mathrm{I}}(t')\,\mathrm{d}t' \right) \tag{21.31}
\]
12 Freeman J. Dyson (1923-2020) was one of the founding fathers of modern QED.

(but one should still keep in mind that the definition of the symbol Texp is given by (21.30); the
expression (21.31) is just a shorthand notation for the exponential-like series (21.30)).
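To see the perturbative character of the Dyson series at work, one can compare the exact interaction-picture evolution operator $e^{iH_0 t}e^{-iHt}$ (cf. (21.15), with $t_0 = 0$) with the first-order truncation of (21.25). The sketch below does this for a two-level toy model with $H_0 = (\omega/2)\sigma_z$ and $H_{\text{int}} = g\,\sigma_x$; the model and the parameter values are assumptions chosen so that everything is available in closed form.

```python
import cmath
import math

omega, g, t = 1.0, 0.01, 2.0   # illustrative values; g plays the role of a small coupling

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# H = H0 + H_int with H0 = (omega/2)*sigma_z and H_int = g*sigma_x
H = [[omega / 2, g], [g, -omega / 2]]
Omega = math.sqrt(omega**2 / 4 + g**2)   # H*H = Omega^2 * identity

# exp(-iHt) = cos(Omega t) - i sin(Omega t) H / Omega (valid because H*H is a multiple of 1)
c, s = math.cos(Omega * t), math.sin(Omega * t)
exp_minus_iHt = [[(c if i == j else 0.0) - 1j * s * H[i][j] / Omega
                  for j in range(2)] for i in range(2)]
exp_plus_iH0t = [[cmath.exp(1j * omega * t / 2), 0.0],
                 [0.0, cmath.exp(-1j * omega * t / 2)]]

# Exact interaction-picture evolution operator U_I(t, 0)
U_exact = matmul(exp_plus_iH0t, exp_minus_iHt)

# First-order Dyson term: H_I(t') has off-diagonal entries g*exp(+-i*omega*t'),
# whose integrals over (0, t) are elementary.
int_plus = (cmath.exp(1j * omega * t) - 1) / (1j * omega)
int_minus = (1 - cmath.exp(-1j * omega * t)) / (1j * omega)
U_first = [[1.0, -1j * g * int_plus],
           [-1j * g * int_minus, 1.0]]

err_zeroth = max(abs(U_exact[i][j] - (1.0 if i == j else 0.0))
                 for i in range(2) for j in range(2))
err_first = max(abs(U_exact[i][j] - U_first[i][j])
                for i in range(2) for j in range(2))
print(err_zeroth, err_first)
```

The deviation of the zeroth-order approximation (the unit operator) is of order $g$, while the first-order truncation is accurate up to terms of order $g^2$, as one expects from the structure of (21.30).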
So much for a digression concerning general methods of quantum theory that are in-
strumental in our subsequent discussion. Let us now come back to the field theory. The basic
quantity we are working with is a Lagrangian density L . From L one can proceed to the energy
density H , which is then integrated over the 3-dimensional space to get the full Hamiltonian; it
remains to be clarified how the perturbative technique described above is implemented in a field
theory model. For an illustration, we will consider the simple case of the 𝜑4 self-interaction
shown in (21.7). Using the general formula (15.16), one gets readily
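\[
\mathcal{H} = T^{00} = \frac12\,\partial_0\varphi\,\partial_0\varphi + \frac12\,\vec\nabla\varphi\cdot\vec\nabla\varphi + \frac12\,m^2\varphi^2 + \frac14\,\lambda\varphi^4\,. \tag{21.32}
\]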
Thus, it is natural to split H as
H = H0 + Hint , (21.33)
with
\[
\mathcal{H}_{\text{int}} = \frac14\,\lambda\varphi^4\,. \tag{21.34}
\]
One may notice that
Hint = −Lint , (21.35)
which is an obvious consequence of the simple structure of the Lagrangian density (21.7). In
fact, the relation (21.35) is valid for many other field theory models; for example, it is easy to
find out that it is the case for the Yukawa interaction described by (21.1). Nevertheless, there
are exceptions to this rule; we will discuss one prominent example in Chapter 29.
As regards the general formula (21.31), it is often used for a situation, in which one
considers a transition from “distant past” to “distant future”; formally, one then sets 𝑡0 = −∞,
𝑡 = +∞ (an idealized description, of course). Typically, this is pertinent for the treatment of
scattering or decay processes. Thus, it is not surprising that the corresponding evolution operator
has a special name: it is the famous 𝑺-matrix (“𝑆” from “scattering” in English or “Streuung”
in German). So, one has, in the interaction picture,
𝑆 = 𝑈I (+∞, −∞) . (21.36)
In field theory, the interaction Hamiltonian appearing in the formula (21.31) is the space integral
of the density Hint . This in turn means that in the expression for 𝑆 one ends up with the
integral of Hint over the whole spacetime. Taking also into account the relation (21.35) (with
the aforementioned caveat), the 𝑆-matrix operator may be written compactly as
\[
S = T\exp\left( i\int \mathcal{L}^{\text{int}}_{\mathrm{I}}(x)\,\mathrm{d}^4x \right). \tag{21.37}
\]

Now, the question is how to work it out practically. To this end, the key observation is that
according to the definition of the interaction picture (cf. (21.13)), the time evolution of operators
is controlled by the free Hamiltonian; this means that L int (in its form copied from classical
theory) is made of free fields, which we know explicitly. This is the main asset of the interaction
picture in the context of field theory: within such a framework, the individual terms of the
perturbation expansion based on the Dyson series
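\[
S = 1 + i\int \mathcal{L}^{\text{int}}_{\mathrm{I}}(x)\,\mathrm{d}^4x + \frac{i^2}{2!}\iint \mathrm{d}^4x\,\mathrm{d}^4y\; T\bigl[\mathcal{L}^{\text{int}}_{\mathrm{I}}(x)\,\mathcal{L}^{\text{int}}_{\mathrm{I}}(y)\bigr] + \ldots \tag{21.38}
\]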
can be evaluated with the help of the results that we have obtained for free fields in previous
chapters. Some instructive examples will be discussed in the sequel.

Chapter 22

Examples of 𝑺-matrix elements.


Some simple Feynman diagrams

The definitions (21.20) and (21.36) mean that

|Ψ(+∞)⟩ = 𝑆|Ψ(−∞)⟩ , (22.1)

where |Ψ(±∞)⟩ represent states of the considered system in distant past and distant future. For
the sake of brevity, we will denote the initial state vector as |𝑖⟩; Eq. (22.1) thus tells us how this is
evolved during a sufficiently long interval of time under an interaction embodied in the 𝑆-matrix
operator. The probability amplitude for a definite process (scattering or whatever) |𝑖⟩ → | 𝑓 ⟩ is
thus given by the scalar product

⟨ 𝑓 |Ψ(+∞)⟩ = ⟨ 𝑓 |𝑆|𝑖⟩ . (22.2)

In what follows, we are going to work out some examples of 𝑆-matrix elements ⟨ 𝑓 |𝑆|𝑖⟩ = 𝑆 𝑓 𝑖 in
the lowest order of Dyson perturbation series (21.38).
Let us start with the model of Yukawa coupling (21.1). We will assume that 𝑀 > 2𝑚;
then one may rightly expect that the interaction in question is responsible, among others, for
the decay of the scalar boson 𝜑 into the fermion-antifermion pair. To see how it may proceed,
we will utilize the 1st order term in the expansion (21.38) (an exhilarating feature of such an
otherwise quite boring calculation is that it touches some truly modern physics, namely a decay
of the enigmatic Higgs boson; an attentive reader should appreciate it). In general, a basic input
to the evaluation of an 𝑆-matrix element 𝑆 𝑓 𝑖 is the description of the initial and final states: quite
naturally, one may assume that these consist of free particles. To put it more eloquently, one
starts with a set of free particles in a definite state prepared before the interaction and these (or
a transmuted set) emerge from the interaction region (and are detected) as free particles again.
Thus, the states |𝑖⟩ and | 𝑓 ⟩ are described as the appropriate vectors belonging to the relevant
Fock space, i.e. they are generated by means of the action of the corresponding creation operators
on the vacuum state. In the considered case of the decay process $\varphi \to f\bar{f}$ one thus has

\[
|i\rangle = a^\dagger(q)\,|0\rangle\,, \qquad |f\rangle = b^\dagger(k,s)\,d^\dagger(p,r)\,|0\rangle\,, \tag{22.3}
\]

where we have brought back the familiar notation introduced in Chapters 16 and 19. In the first
order of the Dyson expansion (21.38) one then gets

\[
S^{(1)}_{fi} = i\int \mathrm{d}^4x\; \langle f|\mathcal{L}_{\text{int}}(x)|i\rangle\,, \tag{22.4}
\]

where Lint is the interaction term of (21.1) (from now on, we are omitting the label I for the interaction
picture). So, let us examine the matrix element of Lint (𝑥). In order not to get lost in a welter of
symbols, it is useful to write the structure of the field operators $\varphi$, $\psi$ and $\bar\psi$ as

\[
\varphi = a \;\&\; a^\dagger\,, \qquad \psi = b \;\&\; d^\dagger\,, \qquad \bar\psi = b^\dagger \;\&\; d\,. \tag{22.5}
\]

This is, hopefully, a comprehensible shortcut notation for the contents of the operators in
question in terms of the relevant creation and annihilation operators. Of course, we refer here to
the original formulae (16.6), (19.6) and (19.7). The matrix element $\langle f|\bar\psi\psi\varphi|i\rangle$ is then reduced,
basically, to the vacuum expectation value (v.e.v.) that looks like

⟨0|𝑑 𝑏 (𝑏 † & 𝑑) (𝑏 & 𝑑 † ) (𝑎 & 𝑎 † ) 𝑎 † |0⟩ . (22.6)

Obviously, such an expression consists of eight separate contributions involving different chains
of annihilation and creation operators. The good news is that there is only one term yielding
a nontrivial result. Indeed, taking into account the basic property of an annihilation operator,
namely 𝑎|0⟩ = 0 (equivalently, ⟨0|𝑎 † = 0), etc. (both for bosons and fermions), it is clear that a
non-zero v.e.v. may be obtained just for a chain, in which annihilation and creation operators of
the same kind are paired completely; the point is that the commutator or anticommutator of such
a pair gives a Kronecker delta (or a delta function) and this, of course, survives in v.e.v. On the
other hand, if there are some unpaired operators, their contribution to the v.e.v. (22.6) vanishes
due to the action of an extra annihilation operator on the vacuum. Thus, the only non-zero
contribution to (22.6) has the form

⟨0|𝑑 𝑏 𝑏 † 𝑑 † 𝑎 𝑎 † |0⟩ , (22.7)

while the remaining seven terms drop out; for example

⟨0|𝑑 𝑏 𝑏 † 𝑏 𝑎 𝑎 † |0⟩ = 0 , (22.8)

etc.
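The counting argument can be reproduced by brute force. The sketch below models one bosonic mode (a) and two fermionic modes (b, d) in the occupation-number basis and evaluates all eight operator chains contained in (22.6). The single-mode truncation, the string labels for the operators and the sign convention for the fermionic modes are assumptions of this toy model; momentum and spin labels are ignored.

```python
import math
from itertools import product

# A basis state is (n_a, n_b, n_d): one boson occupation, two fermion occupations.
# Convention (an assumption): |n_a, n_b, n_d> = (a+)^n_a (b+)^n_b (d+)^n_d |0>, suitably normalized.

def apply(op, amp, state):
    """Apply one operator; return (amplitude, state) or None if the state is annihilated."""
    n_a, n_b, n_d = state
    if op == "a":                       # boson annihilation
        return (amp * math.sqrt(n_a), (n_a - 1, n_b, n_d)) if n_a > 0 else None
    if op == "a+":                      # boson creation
        return (amp * math.sqrt(n_a + 1), (n_a + 1, n_b, n_d))
    if op == "b":                       # fermion annihilation
        return (amp, (n_a, 0, n_d)) if n_b == 1 else None
    if op == "b+":                      # fermion creation (exclusion principle)
        return (amp, (n_a, 1, n_d)) if n_b == 0 else None
    sign = -1.0 if n_b == 1 else 1.0    # d-type operators anticommute past an occupied b-mode
    if op == "d":
        return (amp * sign, (n_a, n_b, 0)) if n_d == 1 else None
    if op == "d+":
        return (amp * sign, (n_a, n_b, 1)) if n_d == 0 else None
    raise ValueError(op)

def vev(ops):
    """<0| ops[0] ops[1] ... ops[-1] |0>, with operators acting from right to left."""
    amp, state = 1.0, (0, 0, 0)
    for op in reversed(ops):
        result = apply(op, amp, state)
        if result is None:
            return 0.0
        amp, state = result
    return amp if state == (0, 0, 0) else 0.0

# psi-bar ~ (b+ & d), psi ~ (b & d+), phi ~ (a & a+), as in (22.5)
results = {}
for psibar_part, psi_part, phi_part in product(("b+", "d"), ("b", "d+"), ("a", "a+")):
    chain = ["d", "b", psibar_part, psi_part, phi_part, "a+"]
    results[(psibar_part, psi_part, phi_part)] = vev(chain)

nonzero = {key: val for key, val in results.items() if val != 0.0}
print(nonzero)
```

Only the chain built from the $b^\dagger$ part of $\bar\psi$, the $d^\dagger$ part of $\psi$ and the $a$ part of $\varphi$ survives, in agreement with (22.7).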
Now we are in a position to carry out the evaluation of the matrix element ⟨ 𝑓 |Lint |𝑖⟩ in
detail. As we know (cf. Chapter 17), the basic formulae for the field operators $\varphi$, $\psi$, $\bar\psi$ can be
expressed in two possible “dialects” (“continuous” or “discrete”). For simplicity, we are going
to utilize now the discrete formalism and suppress, for brevity, the spin labels of the fermion
operators (these can be easily retrieved at the end of the calculation). So, using the formula
(17.2) as well as the discrete counterparts of (19.6) and (19.7), one may write
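\[
\begin{aligned}
\langle f|\mathcal{L}_{\text{int}}(x)|i\rangle = g\,\langle 0|\,d_p\,b_k
 &\sum_{l_1} N_{l_1}\Bigl( b^\dagger_{l_1}\,\bar{u}(l_1)\,e^{il_1x} + \ldots \Bigr)\\
 \times &\sum_{l_2} N_{l_2}\Bigl( \ldots + d^\dagger_{l_2}\,v(l_2)\,e^{il_2x} \Bigr)
 \sum_{l_3} N_{l_3}\Bigl( a_{l_3}\,e^{-il_3x} + \ldots \Bigr)\,a^\dagger_q\,|0\rangle\,,
\end{aligned} \tag{22.9}
\]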

where we have marked explicitly just the relevant parts of the field operators, indicated by the
above discussion around the expression (22.7). As a next step, one employs the commutation
and anticommutation relations, moving the annihilation operators to the right so as to let them
act on the vacuum state. In this way, one is eventually left just with the corresponding Kronecker
deltas and the v.e.v. of the operator chain in question becomes

$$ \langle 0|\, d_p b_k\, b^\dagger_{l_1} d^\dagger_{l_2}\, a_{l_3} a^\dagger_q\, |0\rangle = \delta_{k l_1}\,\delta_{p l_2}\,\delta_{q l_3} . \tag{22.10} $$
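
The reduction of the operator chain to Kronecker deltas can be spelled out step by step. The following is only a sketch; it assumes the standard normalization of the discrete formalism, namely $[a_l, a^\dagger_{l'}] = \delta_{ll'}$ and $\{b_l, b^\dagger_{l'}\} = \{d_l, d^\dagger_{l'}\} = \delta_{ll'}$, with all other (anti)commutators vanishing and the bosonic operators commuting with the fermionic ones.

```latex
\begin{align*}
a_{l_3} a^\dagger_q |0\rangle
  &= \big( [a_{l_3}, a^\dagger_q] + a^\dagger_q a_{l_3} \big)|0\rangle
   = \delta_{q l_3}\,|0\rangle , \\
b_k\, b^\dagger_{l_1} d^\dagger_{l_2} |0\rangle
  &= \big( \{b_k, b^\dagger_{l_1}\} - b^\dagger_{l_1} b_k \big)\, d^\dagger_{l_2} |0\rangle
   = \delta_{k l_1}\, d^\dagger_{l_2}|0\rangle + b^\dagger_{l_1} d^\dagger_{l_2}\, b_k |0\rangle
   = \delta_{k l_1}\, d^\dagger_{l_2}|0\rangle , \\
\langle 0|\, d_p\, d^\dagger_{l_2} |0\rangle
  &= \langle 0| \big( \{d_p, d^\dagger_{l_2}\} - d^\dagger_{l_2} d_p \big) |0\rangle
   = \delta_{p l_2} .
\end{align*}
```

Multiplying the three factors reproduces the right-hand side of (22.10); no relative sign arises, since the only anticommutation of unlike operators ($b_k$ past $d^\dagger_{l_2}$) occurs in a term that vanishes on the vacuum.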

Using the result (22.10) in (22.9), one thus gets

$$ \langle f|\mathcal{L}_{\rm int}(x)|i\rangle = N_k N_p N_q\, \bar{u}(k,s)\, v(p,r)\, e^{i(k+p-q)x} , \tag{22.11} $$

where we have restored also the spin labels, according to (22.3). It is quite clear that if we used
in (22.9) the continuous formalism (i.e. integrals instead of infinite sums), the structure of the
result would be the same, only the normalization factors in (22.11) would be different (recall that
such a change amounts to the replacement 𝑉 −1/2 → (2𝜋) −3/2 in the corresponding formula).
So, let us now switch to the continuous values of the particle momenta, which correspond
to the infinite 3-dimensional coordinate space. The spacetime integration in (22.4) then leads to

$$ S^{(1)}_{fi} = i g\, N_k N_p N_q\, \bar{u}(k,s)\, v(p,r)\, (2\pi)^4 \delta^{(4)}(k+p-q) . \tag{22.12} $$

Conventionally, we will write (22.12) in the form

$$ S^{(1)}_{fi} = N_k N_p N_q\; i\mathcal{M}^{(1)}_{fi}\, (2\pi)^4 \delta^{(4)}(k+p-q) , \tag{22.13} $$

with

$$ \mathcal{M}^{(1)}_{fi} = g\, \bar{u}(k,s)\, v(p,r) . \tag{22.14} $$
The structure of the 𝑆-matrix element embodied in the formula (22.13) is quite general, as we will
see repeatedly throughout this text. As expected, there is the four-dimensional delta function
representing the energy–momentum conservation, and a product of normalization factors for
particles involved in the considered process. These ingredients are universal; on the other hand,
the quantity (22.14) is specific for a given type of interaction, i.e. it carries an information about
the true dynamics of the process in question. It is also seen that its form is Lorentz invariant; such
a quantity M 𝑓 𝑖 is thus usually called the invariant matrix element, or invariant amplitude
(in the common parlance, it is just a “matrix element”, whenever it cannot lead to confusion).
The expression (22.14) for $\mathcal{M}^{(1)}_{fi}$ can be represented graphically by a simple Feynman diagram
shown in Fig. 22.1. It exhibits the first few “canonical” rules that will be encountered in all
forthcoming calculations. In particular, there is four-momentum conservation in the vertex and
the contribution of the pair of fermion lines is read off conveniently by running against the
direction of the arrows; for this purpose, the outgoing antiparticle is represented by an incoming
line with inverted four-momentum.

[Figure: scalar line carrying 𝑞 = 𝑘 + 𝑝 attached to two fermion lines labelled 𝑘 and −𝑝]

Fig. 22.1: Lowest-order Feynman graph for the process $\varphi \to f\bar{f}$. The outgoing fermion (e.g. electron)
is depicted as an outgoing solid line, while the outgoing antifermion (e.g. positron) is represented by an
incoming line carrying four-momentum with inverted sign.

Before leaving the above simple example, some more remarks are in order. It is gratifying
that the decay process $\varphi \to f\bar{f}$ is so easily and naturally described by means of the QFT
formalism; it is an obvious manifestation of its strength, when one is dealing with a process, in
which the number as well as the nature of the particles involved is changing. Apart from the
conservation of the energy and momentum, one gets automatically also the charge conservation
in the considered decay of a neutral particle. It is instructive to check that charge non-conserving
processes like $\varphi \to f f$, or $\varphi \to \bar{f}\bar{f}$ are impossible within the model (21.1) we are using; the
reader is encouraged to find the relevant argument for such a statement.
Let us now consider an appropriate example of a scattering process. Obviously, restricting
ourselves to the 1st order of Dyson expansion, within the model (21.1) we are not able to describe
a process of fermion scattering, simply because the monomial $\bar\psi\psi\varphi$ does not contain a sufficient
number of the relevant creation and annihilation operators. For this purpose, one would have to
proceed to the 2nd order, but we will defer such a discussion to Chapter 26. Instead, one may
contemplate another field theory model, which would not be too far from reality, but still be
suitable for a description of purely fermionic processes in the lowest order. A good example is
a toy model of a direct four-fermion interaction described by the interaction Lagrangian
  
$$ \mathcal{L}_{\rm int} = G\, (\bar\psi_1 \psi_2)(\bar\psi_2 \psi_1) . \tag{22.15} $$

Here 𝜓1 and 𝜓2 represent two different Dirac fields; for definiteness, one may identify the label 1
with neutrino and 2 with electron. 𝐺 is a coupling constant, which, unlike the case of the Yukawa
interaction, is not dimensionless: it is easy to find out that it has a dimension (mass)^{-2}, since
$\mathcal{L}_{\rm int}$ has dimension (mass)^4 and the dimension of each Dirac field is (mass)^{3/2} (the observant
reader is supposed to guess that all these statements are based on the simple fact that the action
functional is a dimensionless quantity in the natural system of units with ℏ = 1). As regards a
relation of (22.15) to the “real world”, let us note that it is a greatly simplified version of the
effective electron-neutrino weak interaction (in a more realistic scheme corresponding to the
standard model there would be some 𝛾-matrices inside the field products). Let us also recall
that an interaction of the above type has been used first by Enrico Fermi in the early 1930s for
a description of the radioactive beta decay; the model introduced by Fermi involved fields of
neutron, proton, electron and neutrino and it was one of the earliest successful applications of
quantum field theory.
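
The dimensional counting quoted above can be written out explicitly. In the following sketch, $[X]$ denotes the mass dimension of $X$ (a notation introduced here only for illustration), and the canonical kinetic terms $\bar\psi\, i\gamma^\mu\partial_\mu \psi$ and $\tfrac12\,\partial_\mu\varphi\,\partial^\mu\varphi$ are assumed for the Dirac and scalar fields.

```latex
\begin{align*}
[S] = 0, \quad S = \int d^4x\, \mathcal{L}
  \;&\Longrightarrow\; [\mathcal{L}] = 4 , \\
2[\psi] + 1 = 4 \;&\Longrightarrow\; [\psi] = \tfrac{3}{2} ,
  \qquad 2[\varphi] + 2 = 4 \;\Longrightarrow\; [\varphi] = 1 , \\
[G] + 4\cdot\tfrac{3}{2} = 4 \;&\Longrightarrow\; [G] = -2 ,
  \qquad [g] + 2\cdot\tfrac{3}{2} + 1 = 4 \;\Longrightarrow\; [g] = 0 .
\end{align*}
```

The last relation recovers the statement that the Yukawa coupling of the monomial $\bar\psi\psi\varphi$ is dimensionless, in contrast to the four-fermion coupling $G$.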
Anyway, the phenomenology is not our primary interest here; rather we will focus on
the basic calculational technique. So, let us consider the process of elastic neutrino–electron
scattering, 𝜈(𝑘) + 𝑒( 𝑝) → 𝜈(𝑘 ′) + 𝑒( 𝑝′), where we have marked explicitly the corresponding
four-momenta. In such a case, the initial and final state may be represented as
|𝑖⟩ = 𝑏 †1 (𝑘, 𝑟)𝑏 †2 ( 𝑝, 𝑠)|0⟩ , | 𝑓 ⟩ = 𝑏 †1 (𝑘 ′, 𝑟 ′)𝑏 †2 ( 𝑝′, 𝑠′)|0⟩ . (22.16)
Utilizing the experience gained in the preceding example (cf. (22.5) through (22.9)), one may
now proceed faster (in musical terms, the tempo can be “allegro” rather than “largo”). To
evaluate the corresponding 𝑆-matrix element $S^{(1)}_{fi}$ given by the general formula (22.4), one has
to work out first the v.e.v.

$$ \langle 0|\, b_2(p')\, b_1(k')\, (\bar\psi_1 \psi_2)(\bar\psi_2 \psi_1)\, b_1^\dagger(k)\, b_2^\dagger(p)\, |0\rangle \tag{22.17} $$

(we have suppressed again the spin labels). Employing the “structural symbols” like (22.5),
one can see that the expression (22.17) consists of 16 separate terms, but only one of them is
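non-zero, namely that involving the complete pairing of creation and annihilation operators of
the same kind. In (22.17), these pairings are as follows: $b_2(p')$ and $b_1(k')$ pair with the creation
parts of $\bar\psi_2$ and $\bar\psi_1$, while the annihilation parts of $\psi_2$ and $\psi_1$ pair with $b_2^\dagger(p)$ and $b_1^\dagger(k)$,
respectively. Then, using the complete expressions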
for the field operators in (22.17) and carrying out some simple manipulations analogous to what
we have done previously (cf. (22.11) through (22.14)), one arrives at the result
$$ S^{(1)}_{fi} = N_k N_p N_{k'} N_{p'}\; i\mathcal{M}^{(1)}_{fi}\, (2\pi)^4 \delta^{(4)}(k' + p' - k - p) , \tag{22.18} $$

where

$$ \mathcal{M}^{(1)}_{fi} = G\, \big[\bar{u}(p',s')\, u(k,r)\big] \big[\bar{u}(k',r')\, u(p,s)\big] \tag{22.19} $$
(hopefully, an attentive reader would not have any problem to reproduce independently the above
results).
The algebraic expression (22.19) can be represented graphically by the simple Feynman
diagram shown in Fig. 22.2. One should notice that in addition to the Feynman rules uncovered
in the previous example of the decay process, here one encounters also incoming particle lines,
representing the plane-wave bispinor amplitudes u(𝑘, 𝑟) and u(𝑝, 𝑠).

[Figure: four-fermion vertex with incoming lines 𝜈(𝑘), 𝑒(𝑝) and outgoing lines 𝑒(𝑝′), 𝜈(𝑘′)]

Fig. 22.2: The lowest-order Feynman diagram for the elastic scattering 𝜈 + 𝑒 → 𝜈 + 𝑒. One has to read its
contribution separately along the green and blue lines, to copy the structure of the interaction Lagrangian.

In fact, the elastic scattering is not the only physical process that can be described (in
the lowest order) by means of the interaction Lagrangian (22.15). Another possibility is e.g. the
electron–positron annihilation $e^- e^+ \to \nu\bar\nu$. Sticking to the above notation of relevant operators,
one may write the initial and final state as

$$ |i\rangle = b_2^\dagger(k,r)\, d_2^\dagger(p,s)\, |0\rangle , \qquad |f\rangle = b_1^\dagger(k',r')\, d_1^\dagger(p',s')\, |0\rangle . \tag{22.20} $$

This means that the corresponding 𝑆-matrix element involves v.e.v. of an operator chain, which
reads, following the paradigm (22.16),
  
$$ \langle 0|\, d_1(p')\, b_1(k')\, (\bar\psi_1 \psi_2)(\bar\psi_2 \psi_1)\, b_2^\dagger(k)\, d_2^\dagger(p)\, |0\rangle . \tag{22.21} $$

Again, 15 out of the 16 possible terms descending from (22.21) are zero. It is easy to figure
out what is the complete pairing of creation and annihilation operators yielding a non-vanishing
result, so this is left to the reader as a simple one-minute exercise. Thus, recalling that the
𝑏-operators are associated with spinors $u$ or $\bar{u}$, while the 𝑑-operators are accompanied by $v$ or
$\bar{v}$, the result for $S^{(1)}_{fi}$ reads, eventually,

$$ S^{(1)}_{fi} = N_k N_p N_{k'} N_{p'}\; i\mathcal{M}^{(1)}_{fi}\, (2\pi)^4 \delta^{(4)}(k' + p' - k - p) , \tag{22.22} $$

with

$$ \mathcal{M}^{(1)}_{fi} = G\, \big[\bar{u}(k',r')\, u(k,r)\big] \big[\bar{v}(p,s)\, v(p',s')\big] . \tag{22.23} $$
So, in (22.22) one may observe (as expected) the recurrent general form of an 𝑆-matrix element,
but the formula (22.23) exhibits a new Feynman rule: the incoming antiparticle (here positron)
corresponds to $\bar{v}$ (while, as before, an outgoing antiparticle (antineutrino) is represented by $v$).
The expression (22.23) may be now depicted by the Feynman diagram displayed in Fig. 22.3.

[Figure: four-fermion vertex with incoming lines 𝑒⁻(𝑘), 𝑒⁺(−𝑝) and outgoing lines 𝜈(𝑘′), $\bar\nu$(−𝑝′)]

Fig. 22.3: The lowest-order Feynman diagram for the annihilation process $e^- e^+ \to \nu\bar\nu$. Its contribution
(22.23) incorporates all relevant Feynman rules for the external lines representing fermions and
antifermions.

In this way, we have arrived at a set of Feynman rules for Dirac fermions and these will
be utilized in many forthcoming calculations. Another remarkable finding is that the content
of the modest Lagrangian (22.15) is in fact quite rich: it embodies both elastic scattering
and annihilation processes (needless to say, apart from those already discussed, the scattering
$\bar\nu e \to \bar\nu e$, or the annihilation $\nu\bar\nu \to e^- e^+$, etc. are also possible within the considered model).
This is a salient feature of QFT models involving various types of fields; moreover (and not
surprisingly), the number of calculable processes increases in higher orders of the 𝑆-matrix
perturbation expansion.

Chapter 23

Decay rates and cross sections

According to the results of the preceding chapter, we are now able to compute (at least in
the lowest perturbative order) some quantum transition amplitudes, either for a particle decay,
or a scattering process. To get measurable quantities, one has to calculate the corresponding
probabilities. In general, such a transition probability is given by the square of an 𝑆-matrix
element |𝑆 𝑓 𝑖 | 2 ; but, as we will see, for obtaining a truly measurable quantity one must, roughly
speaking, normalize such a probability in an appropriate manner.
Let us start with a decay process. Up to now, we have discussed explicitly just a two-body
decay, but here we will consider a general case of a particle decaying into an arbitrary number
𝑛 of other particles. We will assume, bona fide, that the corresponding 𝑆-matrix element is
factorized as
$$ S_{fi} = N_i N_{f_1} \cdots N_{f_n}\; i\mathcal{M}_{fi}\, (2\pi)^4 \delta^{(4)}(p_{f_1} + \cdots + p_{f_n} - p_i) . \tag{23.1} $$
Of course, the Ansatz (23.1) is inspired by our previous results (cf. (22.13), (22.18), (22.22)).
Although momentarily we have no rigorous theorem for justifying it, the formula (23.1) is,
hopefully, quite plausible, because of the arguments indicated in the preceding chapter. Note
also that the relation (23.1) is supposed to be valid generally, without referring to perturbation
expansion.
As we have noted above, |𝑆 𝑓 𝑖 | 2 is the probability of the transition |𝑖⟩ → | 𝑓 ⟩, but for a
connection with an experimental observation one should rather consider a probability of the
transition to a definite region of the space of final-state momenta. Thus, one may start with
examining the differential decay probability corresponding to final states involving momenta in
a close neighbourhood of a given value $\vec{p}_f = \vec{p}_{f_1} + \ldots + \vec{p}_{f_n}$. To proceed, we will streamline
our notation a bit; the four-momenta of the decay products will be denoted as 𝑝 1 , . . . , 𝑝 𝑛 and for
the four-momentum of the decaying particle we choose the symbol 𝑃. For obtaining the number
of possible states (momenta) in a close vicinity of a value $\vec{p}$, it is useful to employ the “discrete
dialect” introduced in Chapter 17 (which means that the relevant fields are quantized within a
finite spatial box with the volume $V = L^3$). Taking into account the formula (17.1), one can see
that in the interval $(\vec{p}, \vec{p} + \Delta\vec{p})$ one has $\Delta N(\vec{p})$ states, where

$$ \Delta N(\vec{p}) = \frac{\Delta p_1}{2\pi/L} \cdot \frac{\Delta p_2}{2\pi/L} \cdot \frac{\Delta p_3}{2\pi/L} = \frac{V\, \Delta p_1 \Delta p_2 \Delta p_3}{(2\pi)^3} . \tag{23.2} $$

So, passing from differences to differentials, the relation (23.2) becomes

$$ \Delta N(\vec{p}) = \frac{V\, \mathrm{d}^3 p}{(2\pi)^3} . \tag{23.3} $$

Note that such an expression is traditionally called the phase-space element. Now it is clear
that, including the relevant phase-space elements, the differential probability for the considered
decay may be written as
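$$ \mathrm{d}\mathcal{P}_{i\to f} = |S_{fi}|^2\, \frac{V\, \mathrm{d}^3 p_1}{(2\pi)^3} \cdots \frac{V\, \mathrm{d}^3 p_n}{(2\pi)^3} . \tag{23.4} $$

Substituting the expression (23.1) into (23.4), one gets

$$ \mathrm{d}\mathcal{P}_{i\to f} = N_i^2 N_1^2 \cdots N_n^2\, |\mathcal{M}_{fi}|^2 \Big[ (2\pi)^4 \delta^{(4)}(P_f - P) \Big]^2\, \frac{V\, \mathrm{d}^3 p_1}{(2\pi)^3} \cdots \frac{V\, \mathrm{d}^3 p_n}{(2\pi)^3} , \tag{23.5} $$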
where 𝑃 𝑓 = 𝑝 1 + . . . + 𝑝 𝑛 and the normalization factors 𝑁𝑖 , 𝑁1 , . . . , 𝑁𝑛 are given by the formula
(17.3). The main difficulty is now the square of the delta function; this, in fact, is a kind of a
“mathematician’s nightmare”. To cope with it, one may proceed as follows. First, since for an
ordinary function 𝑓 (𝑥) one has 𝑓 (𝑥)𝛿(𝑥) = 𝑓 (0)𝛿(𝑥), we will set, denoting Δ = 𝑃 𝑓 − 𝑃,
$$ \Big[ \delta^{(4)}(\Delta) \Big]^2 = \delta^{(4)}(0)\, \delta^{(4)}(\Delta) . \tag{23.6} $$

Thus, the challenge now consists in making sense of 𝛿 (4) (0). This is an ill-defined infinity, but
one may regularize it by means of a clever trick: utilizing the integral representation

$$ (2\pi)^4 \delta^{(4)}(\Delta) = \int \mathrm{d}^4 x\; e^{i\Delta\cdot x} \tag{23.7} $$

and restricting the infinite spacetime to the spatial box with a volume 𝑉, along with a finite (but
large) time interval (−𝑇/2, 𝑇/2), for Δ = 0 one gets on the right-hand side of (23.7) the value
𝑉𝑇. Thus, our regularization prescription reads

(2𝜋) 4 𝛿 (4) (0) −→ 𝑉𝑇 . (23.8)

Of course, from the point of view of pure rigorous mathematics, the replacement (23.8) is a dirty
trick, but it is commonly used in physics literature, since, as we will see immediately, it is really
instrumental in obtaining the relevant result for the description of a general decay process (and
for scattering processes as well). At the end of this chapter, we will show how the short cut way
leading to (23.8) can be (slightly) improved.
So, let us substitute the regularization recipe (23.8) into the expression (23.5), using also
(23.6) and the standard formulae for the normalization factors according to (17.3). One thus gets

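$$ \mathrm{d}\mathcal{P}_{i\to f} = \frac{1}{2E_i V}\, \frac{1}{2E_1 V} \cdots \frac{1}{2E_n V}\; |\mathcal{M}_{fi}|^2\; \frac{V\, \mathrm{d}^3 p_1}{(2\pi)^3} \cdots \frac{V\, \mathrm{d}^3 p_n}{(2\pi)^3}\; VT\, (2\pi)^4 \delta^{(4)}(P_f - P) , \tag{23.9} $$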

and it becomes obvious that the factors of the space volume 𝑉 (which is an auxiliary parameter)
are cancelled. The quantity dP𝑖→ 𝑓 is proportional to 𝑇; it means that one is led naturally to
the definition of the decay probability per unit time, the quantity commonly called the decay
rate. To distinguish it from dP𝑖→ 𝑓 , we will introduce a special notation
$$ \mathrm{d}w_{fi} = \frac{1}{T}\, \mathrm{d}\mathcal{P}_{i\to f} . \tag{23.10} $$
From (23.9) one then gets

$$ \mathrm{d}w_{fi} = \frac{1}{2E_i}\, |\mathcal{M}_{fi}|^2\, \frac{\mathrm{d}^3 p_1}{(2\pi)^3\, 2E_1} \cdots \frac{\mathrm{d}^3 p_n}{(2\pi)^3\, 2E_n}\, (2\pi)^4 \delta^{(4)}(p_1 + \cdots + p_n - P) , \tag{23.11} $$

and this is a basic formula that can be found in all handbooks of particle physics; with such
a knowledge at hand, the reader may “join the crowd” and start calculating any fancy decay
process occurring in the universe (provided that the matrix element M 𝑓 𝑖 is available). An
attentive reader may have also noticed that the derivation of the formula (23.10) has been quite
straightforward, due to the trick embodied in (23.8). Indeed, the above procedure is designed to
be user-friendly; it can be reproduced easily e.g. when writing with your finger in the sand of a
beach on a desert island.
A particularly simple case, which we have already touched in the preceding chapter,
is a two-body decay. In such a case, the general formula for the differential decay rate may
be further elaborated on; one can carry out explicitly an integration over the phase space and
obtain another useful formula that is frequently employed in applications. So, let us consider
such a decay process for the “parent” particle (with a mass 𝑀) at rest, i.e. the initial four-momentum
is $P = (M, \vec{0})$. Then the delta function for the momentum conservation in (23.11)
becomes $\delta^{(3)}(\vec{p}_1 + \vec{p}_2)$ and a first integration (e.g. over $\vec{p}_2$) is basically trivial; the decay products
(“daughter” particles with masses 𝑚 1 , 𝑚 2 ) have momenta $\vec{p}_1 = -\vec{p}_2 \equiv \vec{p}$ and one is left with

$$ \mathrm{d}w_{fi} = \frac{1}{2M}\, |\mathcal{M}_{fi}|^2\, \frac{1}{16\pi^2}\, \frac{\mathrm{d}^3 p}{E_1 E_2}\, \delta(E_1 + E_2 - M) , \tag{23.12} $$

where, of course,

$$ E_1 = \sqrt{\vec{p}^{\,2} + m_1^2} , \qquad E_2 = \sqrt{\vec{p}^{\,2} + m_2^2} . \tag{23.13} $$

The next step is the integration involving the one-dimensional delta function in (23.12), through
which the energy conservation is implemented. The integral to be evaluated is

$$ \mathrm{d}(\mathrm{LIPS}_2) = \frac{1}{16\pi^2}\, \frac{\mathrm{d}^3 p}{E_1 E_2}\, \delta(E_1 + E_2 - M) , \tag{23.14} $$
where LIPS is an acronym for “Lorentz-invariant phase space”. To this end, one may employ
the familiar formula for a composite delta function, namely
$$ \delta[f(x)] = \frac{1}{|f'(x_0)|}\, \delta(x - x_0) , \tag{23.15} $$

where 𝑓 (𝑥 0 ) = 0. In the present case, denoting $|\vec{p}| = x$, we have 𝛿[ 𝑓 (𝑥)] with

$$ f(x) = \sqrt{x^2 + m_1^2} + \sqrt{x^2 + m_2^2} - M , \tag{23.16} $$

so that
$$ f'(x) = \frac{x}{\sqrt{x^2 + m_1^2}} + \frac{x}{\sqrt{x^2 + m_2^2}} . \tag{23.17} $$
For the zero of 𝑓 (𝑥), i.e. for 𝑥 0 such that
$$ \sqrt{x_0^2 + m_1^2} + \sqrt{x_0^2 + m_2^2} = M , \tag{23.18} $$

one then gets readily


$$ f'(x_0) = \frac{x_0\, M}{\sqrt{x_0^2 + m_1^2}\, \sqrt{x_0^2 + m_2^2}} . \tag{23.19} $$

In the integral (23.14) one may introduce spherical variables, so that
$\mathrm{d}^3 p = |\vec{p}|^2\, \mathrm{d}|\vec{p}|\, \mathrm{d}\Omega = x^2\, \mathrm{d}x\, \mathrm{d}\Omega$, where dΩ is the element of solid angle
corresponding to the direction of $\vec{p}$. Employing (23.15) and (23.19) we thus get, after a simple manipulation,

$$ \mathrm{d}(\mathrm{LIPS}_2) = \frac{\mathrm{d}\Omega}{16\pi^2}\, \frac{x_0}{M} . \tag{23.20} $$
Now, what is 𝑥 0 ? According to (23.18), it is the magnitude of the momentum of a decay product
in the rest system of the parent particle, in other words, in the c. m. system of the daughter
particles. So, we will denote 𝑥 0 as $|\vec{p}_{\rm c.m.}|$ later on. The value of $x_0 = |\vec{p}_{\rm c.m.}|$ is obtained easily
from the kinematical relation (23.18). Indeed, this can be recast as

$$ x_0^2 + m_1^2 = \Big( M - \sqrt{x_0^2 + m_2^2} \Big)^2 , \tag{23.21} $$

and one thus gets immediately


$$ 2M \sqrt{x_0^2 + m_2^2} = M^2 + m_2^2 - m_1^2 . \tag{23.22} $$

The solution of Eq. (23.22) may be written as


$$ x_0 = \left[ \frac{\lambda(M^2, m_1^2, m_2^2)}{4M^2} \right]^{1/2} , \tag{23.23} $$

where
𝜆(𝑥, 𝑦, 𝑧) = 𝑥 2 + 𝑦 2 + 𝑧2 − 2𝑥𝑦 − 2𝑥𝑧 − 2𝑦𝑧 (23.24)
is the so-called Källén’s function;13 its alternative name is “triangle function” since it is closely
related to the area 𝐴 of a triangle, expressed in terms of its sides (cf. also the famous Heron’s
formula). Explicitly, it holds
$$ A = \frac{1}{4} \sqrt{-\lambda(a^2, b^2, c^2)} $$
if 𝑎, 𝑏, 𝑐 are the triangle side lengths. Note that another useful formula for the function 𝜆 in
(23.23), which can be obtained easily from (23.22), reads

$$ \lambda(M^2, m_1^2, m_2^2) = \big[ M^2 - (m_1 + m_2)^2 \big] \big[ M^2 - (m_1 - m_2)^2 \big] . \tag{23.25} $$

13 Gunnar Källén (1926–1968) was an eminent Swedish theoretical physicist who made many important
contributions to QFT and particle physics. He died tragically in an airplane accident; he was flying his
own plane from Malmö to CERN and it crashed during the emergency landing in Hannover.

So, returning to the relations (23.12) and (23.14), one has a general formula for the
differential decay rate

$$ \mathrm{d}w_{fi} = \frac{1}{2M}\, |\mathcal{M}_{fi}|^2\, \mathrm{d}(\mathrm{LIPS}_2) , \tag{23.26} $$

with the element of phase space given by (23.20), i.e.

$$ \mathrm{d}(\mathrm{LIPS}_2) = \frac{\mathrm{d}\Omega}{16\pi^2}\, \frac{|\vec{p}_{\rm c.m.}|}{M} . \tag{23.27} $$

In a situation where the matrix element squared in (23.26) is independent of the spherical angles
involved in dΩ (this happens e.g. for unpolarized particles), the relation (23.26) may be integrated
trivially and one gets a useful formula for the integral decay rate (called also the decay width),
namely

$$ \Gamma = \frac{1}{2M}\, |\mathcal{M}_{fi}|^2\, \mathrm{LIPS}_2 , \tag{23.28} $$

where

$$ \mathrm{LIPS}_2 = \frac{1}{4\pi}\, \frac{|\vec{p}_{\rm c.m.}|}{M} . \tag{23.29} $$

Using the above kinematical formulae (23.23) and (23.25), one may observe that there are some
notable cases, in which the formula (23.29) becomes particularly simple. So, for 𝑚 1 = 𝑚 2 = 𝑚
one has
$$ \mathrm{LIPS}_2 = \frac{1}{8\pi} \sqrt{1 - \frac{4m^2}{M^2}} . \tag{23.30} $$
Further, for 𝑚 1 , 𝑚 2 ≪ 𝑀 one gets an extremely simple approximate formula

$$ \mathrm{LIPS}_2 \approx \frac{1}{8\pi} \qquad (m_1, m_2 \ll M) . \tag{23.31} $$
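
The kinematical formulae of the two-body decay are easy to check numerically. The following Python sketch is meant only as an illustration: the function names (`kallen`, `p_cm`, `lips2`) and the sample masses are our own choices, not part of the text, and the masses are in arbitrary units.

```python
import math

def kallen(x, y, z):
    """Kallen (triangle) function, eq. (23.24)."""
    return x*x + y*y + z*z - 2*x*y - 2*x*z - 2*y*z

def p_cm(M, m1, m2):
    """Momentum magnitude of either daughter in the parent rest frame, eq. (23.23)."""
    if M <= 0 or M < m1 + m2:
        raise ValueError("decay is kinematically forbidden")
    lam = kallen(M*M, m1*m1, m2*m2)
    # max(...) guards against tiny negative rounding errors at threshold
    return math.sqrt(max(lam, 0.0)) / (2.0*M)

def lips2(M, m1, m2):
    """Integrated two-body phase space, eq. (23.29)."""
    return p_cm(M, m1, m2) / (4.0*math.pi*M)

# Illustrative masses (hypothetical values, arbitrary units)
M, m1, m2 = 10.0, 3.0, 2.0
x0 = p_cm(M, m1, m2)
E1 = math.sqrt(x0**2 + m1**2)
E2 = math.sqrt(x0**2 + m2**2)

energy_balance = E1 + E2 - M                  # should vanish, eq. (23.18)
lam_factorized = (M**2 - (m1 + m2)**2) * (M**2 - (m1 - m2)**2)   # eq. (23.25)
fprime_sum = x0/E1 + x0/E2                    # eq. (23.17) at x = x0
fprime_closed = x0*M/(E1*E2)                  # eq. (23.19)
area_345 = 0.25*math.sqrt(-kallen(9.0, 16.0, 25.0))   # triangle with sides 3, 4, 5

print(x0, energy_balance, fprime_sum - fprime_closed, area_345)
print(lips2(10.0, 3.0, 3.0), lips2(10.0, 0.0, 0.0), 1.0/(8.0*math.pi))
```

For the sample values one finds $E_1 = 5.25$ and $E_2 = 4.75$, so that the energy balance (23.18) is satisfied, and the triangle with sides 3, 4, 5 yields the expected area 6. The last line compares the equal-mass and massless cases with (23.30) and (23.31).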
So much for decay processes. Let us now consider a general process of the scattering
type
1+2 → 3+4+···+𝑛. (23.32)
In analogy with our previous discussion we will employ an Ansatz for the corresponding 𝑆-matrix
element, which now reads
$$ S_{fi} = \delta_{fi} + N_1 N_2 N_3 N_4 \cdots N_n\; i\mathcal{M}_{fi}\, (2\pi)^4 \delta^{(4)}(P_f - P_i) , \tag{23.33} $$
where the first term reflects the possibility of a trivial event with 𝑓 = 𝑖, 𝑃 𝑓 = 𝑝 3 + . . . + 𝑝 𝑛 and
𝑃𝑖 = 𝑝 1 + 𝑝 2 . Considering just a non-trivial situation, in which 𝑓 ≠ 𝑖, the transition probability
becomes, after the manipulations described above for the case of a decay,
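$$ \mathrm{d}\mathcal{P}_{i\to f} = \frac{1}{2VE_1}\, \frac{1}{2VE_2}\, \frac{1}{2VE_3} \cdots \frac{1}{2VE_n}\; |\mathcal{M}_{fi}|^2\; \frac{V\, \mathrm{d}^3 p_3}{(2\pi)^3} \cdots \frac{V\, \mathrm{d}^3 p_n}{(2\pi)^3}\; (2\pi)^4 \delta^{(4)}(P_f - P_i)\, VT . \tag{23.34} $$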
The relevant quantity that can be derived from the transition probability is the cross section. As
in the case of decay processes, one may first define a differential cross section. This is given by
the number of events per unit time corresponding to the elements of the phase space shown in
(23.34), divided by the current density of the colliding particles. Concerning the latter quantity,
if one considers a target particle at rest (e.g. that labelled as 2), this would be

$$ j_{\rm inc} = \frac{N}{V}\, |\vec{v}_1| \quad \text{for } \vec{v}_2 = 0 , \tag{23.35} $$

where 𝑁 is the total number of the incident particles (1) and $\vec{v}_1$ is their velocity (thus, $j_{\rm inc}$ is the
number of incident particles passing per unit time through unit area). For colliding beams of
particles 1 and 2 (with parallel velocities), the relation (23.35) is generalized to

$$ j_{\rm inc} = \frac{N}{V}\, |\vec{v}_1 - \vec{v}_2| \tag{23.36} $$

(let us stress that $|\vec{v}_1 - \vec{v}_2|$ is not the relative velocity of the particles 1 and 2; it just defines an
effective volume to be multiplied by the particle density 𝑁/𝑉). Thus, one defines the differential
cross section as

$$ \mathrm{d}\sigma = \frac{\text{number of events per unit time}}{\text{incident current density}} = \frac{N \cdot \dfrac{1}{T}\, \mathrm{d}\mathcal{P}_{i\to f}}{\dfrac{N}{V}\, |\vec{v}_1 - \vec{v}_2|} . \tag{23.37} $$

Using the expression (23.34) for the transition probability, one gets from (23.37)

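$$ \mathrm{d}\sigma = \frac{1}{|\vec{v}_1 - \vec{v}_2|}\, \frac{1}{2E_1}\, \frac{1}{2E_2}\, |\mathcal{M}_{fi}|^2\, \frac{\mathrm{d}^3 p_3}{(2\pi)^3\, 2E_3} \cdots \frac{\mathrm{d}^3 p_n}{(2\pi)^3\, 2E_n}\, (2\pi)^4 \delta^{(4)}(P_f - P_i) . \tag{23.38} $$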
It is certainly reassuring that the auxiliary parameters 𝑉 and 𝑇 have dropped out (but this is
something that we have in fact “ordered” in advance). Note that the cross section d𝜎 has, by its
definition, the dimension of an area; in the natural system of units this is (mass)^{-2}, or, if you
want, (energy)^{-2}. Let us also remark that the factor $|\vec{v}_1 - \vec{v}_2|^{-1}$ may be recast by means of the
kinematical identity

$$ |\vec{v}_1 - \vec{v}_2| = \frac{1}{E_1 E_2} \Big[ (p_1 \cdot p_2)^2 - m_1^2 m_2^2 \Big]^{1/2} \tag{23.39} $$

(proving it is left to the reader as an instructive exercise; please don’t forget that $\vec{v}_1$ and $\vec{v}_2$ are
supposed to be directed along the same line).
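
A reader who wishes to test the identity (23.39) before proving it may do so numerically. The sketch below is only an illustration under the stated assumption of collinear motion: momenta are signed numbers along a single axis, and the function names and sample values are hypothetical.

```python
import math

def rel_velocity_direct(p1, m1, p2, m2):
    """|v1 - v2| for collinear motion; p1, p2 are signed momenta along one axis."""
    E1 = math.hypot(p1, m1)   # E = sqrt(p^2 + m^2)
    E2 = math.hypot(p2, m2)
    return abs(p1/E1 - p2/E2)

def rel_velocity_invariant(p1, m1, p2, m2):
    """Right-hand side of eq. (23.39)."""
    E1 = math.hypot(p1, m1)
    E2 = math.hypot(p2, m2)
    p1_dot_p2 = E1*E2 - p1*p2             # Minkowski product for collinear momenta
    radicand = p1_dot_p2**2 - (m1*m2)**2
    # max(...) guards against tiny negative rounding errors
    return math.sqrt(max(radicand, 0.0)) / (E1*E2)

# (p1, m1, p2, m2): head-on collision, target at rest, particles moving the same way
samples = [(3.0, 1.0, -2.0, 0.5), (5.0, 0.1, 0.0, 1.0), (4.0, 1.0, 1.0, 2.0)]
differences = [rel_velocity_direct(*c) - rel_velocity_invariant(*c) for c in samples]
print(differences)
```

The numerical agreement is, of course, no substitute for the proof; the latter follows from the observation that for collinear momenta the square bracket in (23.39) equals $(E_1 |\vec{p}_2| \mp E_2 |\vec{p}_1|)^2$, with the sign depending on the relative orientation.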
Now we are going to examine in more detail a very important special case, namely a
binary process 1 + 2 → 3 + 4. Similarly as for the two-body particle decay, we will be able to
carry out a substantial part of the integration over the phase space and obtain a practical formula
for the angular distribution of the final-state particles. To this end, it is most convenient to work
in the c. m. system; then $\vec{p}_2 = -\vec{p}_1$, and the momentum conservation yields $\vec{p}_4 = -\vec{p}_3$. Thus,
we will use the notation $\vec{p}_1 = \vec{p}_{\rm c.m.}$ and $\vec{p}_3 = \vec{p}^{\,\prime}_{\rm c.m.}$ in what follows. Returning to the general
formula (23.38), let us first work out the factor multiplying the matrix element squared. One
gets

$$ \frac{1}{|\vec{v}_1 - \vec{v}_2|}\, \frac{1}{2E_1}\, \frac{1}{2E_2} = \frac{1}{4 |\vec{p}_{\rm c.m.}|\, E_{\rm c.m.}} = \frac{1}{4 |\vec{p}_{\rm c.m.}|\, s^{1/2}} , \tag{23.40} $$

where $E_{\rm c.m.} = E_1 + E_2$ and we have expressed $E_{\rm c.m.}$ in terms of the familiar Mandelstam
kinematical invariant $s = (p_1 + p_2)^2 = E_{\rm c.m.}^2$. Thus, the starting point of our calculation
becomes

$$ \mathrm{d}\sigma = \frac{1}{4 |\vec{p}_{\rm c.m.}|\, s^{1/2}}\, |\mathcal{M}_{fi}|^2\, \frac{\mathrm{d}^3 p_3}{(2\pi)^3\, 2E_3}\, \frac{\mathrm{d}^3 p_4}{(2\pi)^3\, 2E_4}\, (2\pi)^4 \delta^{(4)}(p_3 + p_4 - p_1 - p_2) . \tag{23.41} $$

The integration over d3 𝑝 4 is trivial and one gets, after a simple manipulation,

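$$ \mathrm{d}\sigma = \frac{1}{4 |\vec{p}_{\rm c.m.}|\, s^{1/2}}\, |\mathcal{M}_{fi}|^2\, \frac{\mathrm{d}\Omega_{\rm c.m.}}{16\pi^2}\, \frac{\mathrm{d}x\, x^2}{E_3 E_4}\, \delta(E_3 + E_4 - s^{1/2}) , \tag{23.42} $$

where we have denoted $x = |\vec{p}_3|$ ($= |\vec{p}^{\,\prime}_{\rm c.m.}|$), and, of course, the final energies 𝐸 3 , 𝐸 4 are given by
$E_3 = (x^2 + m_3^2)^{1/2}$, $E_4 = (x^2 + m_4^2)^{1/2}$. Thus, we may utilize our previous results for a two-body
decay (obviously, $s^{1/2}$ now plays basically the same role as the mass 𝑀 in (23.12)); in particular,
integrating (23.42) over 𝑥 and taking into account the formula (23.27) for d(LIPS2 ), one gets
finally

$$ \frac{\mathrm{d}\sigma}{\mathrm{d}\Omega_{\rm c.m.}} = \frac{1}{64\pi^2}\, \frac{1}{s}\, \frac{|\vec{p}^{\,\prime}_{\rm c.m.}|}{|\vec{p}_{\rm c.m.}|}\, |\mathcal{M}_{fi}|^2 , \tag{23.43} $$

where, according to (23.23), it holds

$$ |\vec{p}_{\rm c.m.}| = \left[ \frac{\lambda(s, m_1^2, m_2^2)}{4s} \right]^{1/2} , \qquad |\vec{p}^{\,\prime}_{\rm c.m.}| = \left[ \frac{\lambda(s, m_3^2, m_4^2)}{4s} \right]^{1/2} . \tag{23.44} $$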

The relation (23.43) is a key formula for the angular distribution of final-state particles (in the
c. m. system) in a general binary process and it is used most frequently in applications. Thus,
a reader who has mastered it earns an enhanced status within the crowd of particle physicists
and QFT practitioners. In a similar manner, one may derive a corresponding result for the
laboratory frame, where the target particle 2 would be at rest; an explicit discussion of such a
case is deferred to Chapter 33. Note also that in some specific situations the formula (23.43)
gets simplified considerably; in particular, for elastic scattering 1 + 2 → 1 + 2 one obviously has
$|\vec{p}^{\,\prime}_{\rm c.m.}| = |\vec{p}_{\rm c.m.}|$ and the formula for the differential cross section thus reads

$$ \left. \frac{\mathrm{d}\sigma}{\mathrm{d}\Omega_{\rm c.m.}} \right|_{\rm elast.} = \frac{1}{64\pi^2}\, \frac{1}{s}\, |\mathcal{M}_{fi}|^2 . \tag{23.45} $$

Similarly, for any binary process involving massless particles one also gets the result (23.45)
(recall that the zero-mass approximation is relevant in the high-energy limit, i.e. for $s \gg m_j^2$,
𝑗 = 1, . . . , 4). Finally, it may be instructive to make a remark on dimensions of the quantities
we are working with. Taking into account that the dimension of a cross section is (length)^2,
i.e. (mass)^{-2} in natural units, one may deduce easily from the formula (23.38) that an 𝑛-body
matrix element $\mathcal{M}_{fi}$ has a dimension (mass)^{4−n}. Thus, for a binary process it is a dimensionless
quantity; this is confirmed immediately by the simple formula (23.43) (or (23.45)). It means
that the structure of the relation (23.45) is particularly easy to remember: the square of M 𝑓 𝑖
is an obvious ingredient, and to get a cross section with the proper dimension, the factor 1/𝑠
is mandatory. If one is able to recall even the numerical factor (64𝜋 2 ) −1 , it is an extra bonus
(surprisingly, students frequently remember just such a detail).
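
The formula (23.43) is simple enough to be coded in a few lines. The following Python sketch is an illustration only: the function names and the numerical inputs are hypothetical, `amp2` stands for the value of $|\mathcal{M}_{fi}|^2$ at the chosen scattering angle, and all quantities are understood in natural units. As a consistency check, the same number is assembled from the flux factor (23.40) and the phase-space element (23.27) with $M$ replaced by $s^{1/2}$.

```python
import math

def kallen(x, y, z):
    """Kallen (triangle) function, eq. (23.24)."""
    return x*x + y*y + z*z - 2*x*y - 2*x*z - 2*y*z

def p_cm(s, ma, mb):
    """c.m. momentum of a pair with masses ma, mb at invariant s, eq. (23.44)."""
    return math.sqrt(kallen(s, ma*ma, mb*mb) / (4.0*s))

def dsigma_domega(amp2, s, m1, m2, m3, m4):
    """Differential cross section in the c.m. system, eq. (23.43)."""
    return amp2 / (64.0*math.pi**2*s) * p_cm(s, m3, m4) / p_cm(s, m1, m2)

def dsigma_domega_stepwise(amp2, s, m1, m2, m3, m4):
    """Same quantity built from the flux factor (23.40) and d(LIPS_2)/dOmega, eq. (23.27)."""
    flux_factor = 1.0 / (4.0*p_cm(s, m1, m2)*math.sqrt(s))
    lips_per_solid_angle = p_cm(s, m3, m4) / (16.0*math.pi**2*math.sqrt(s))
    return flux_factor * amp2 * lips_per_solid_angle

# Hypothetical inputs, arbitrary units; s lies above both thresholds
s, amp2 = 100.0, 2.5
general = dsigma_domega(amp2, s, 1.0, 2.0, 3.0, 4.0)
stepwise = dsigma_domega_stepwise(amp2, s, 1.0, 2.0, 3.0, 4.0)
elastic = dsigma_domega(amp2, s, 1.0, 2.0, 1.0, 2.0)
elastic_formula = amp2 / (64.0*math.pi**2*s)     # eq. (23.45)
print(general, stepwise, elastic, elastic_formula)
```

For an angle-independent $|\mathcal{M}_{fi}|^2$ and distinguishable final-state particles, the total cross section is obtained simply by multiplying the result by $4\pi$.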
In concluding this chapter, let us return to a technical point concerning the regularization
rule (23.8). As we have stressed, its “derivation” has been a mathematical “hocus-pocus”, so
now we will try harder to improve it a bit. To this end, we are going to revisit the integral (23.7).
Evaluating it in the “restricted spacetime” consisting of a finite spatial box with a volume 𝑉 = 𝐿 3
and a finite time interval (−𝑇/2, 𝑇/2) one gets
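$$ I(V,T) = \int_{(V,T)} \mathrm{d}^4 x\; e^{i\Delta\cdot x} = \int_{-T/2}^{T/2} \mathrm{d}x_0\; e^{i\Delta_0 x_0} \int_{(V)} \mathrm{d}^3 x\; e^{-i\vec{\Delta}\cdot\vec{x}} = \frac{\sin\!\big( \frac{\Delta_0}{2} T \big)}{\frac{\Delta_0}{2}} \cdot V \delta_{\vec{\Delta},\vec{0}} . \tag{23.46} $$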

Note that in the evaluation of the integral over d3 𝑥 we have taken into account the fact that $\vec{\Delta}$ is
a difference of momenta with discrete values given by (17.1). Now, the square of the expression
(23.46) becomes

$$ I^2(V,T) = \frac{\sin^2\!\big( \frac{\Delta_0}{2} T \big)}{\big( \frac{\Delta_0}{2} \big)^2} \cdot V^2 \delta_{\vec{\Delta},\vec{0}} , \tag{23.47} $$

where we have used the obvious identity $\delta^2_{\vec{\Delta},\vec{0}} = \delta_{\vec{\Delta},\vec{0}}$ (due to the fact that the Kronecker delta
takes on just values 0 or 1). However, it holds

$$ \lim_{T\to\infty} \frac{\sin^2 xT}{T x^2} = \pi \delta(x) \tag{23.48} $$

(this is a rigorous statement) and thus one may write

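$$ I^2(V,T) \underset{T\to\infty}{=} T\, \pi \delta\!\Big( \frac{\Delta_0}{2} \Big)\, V^2 \delta_{\vec{\Delta},\vec{0}} . \tag{23.49} $$

At the same time, one relies on an (intuitive) rule that

$$ V \delta_{\vec{\Delta},\vec{0}} \longrightarrow (2\pi)^3 \delta^{(3)}(\vec{\Delta}) \tag{23.50} $$

for 𝑉 → ∞. In this way, one eventually arrives at

$$ I^2(V,T) \longrightarrow T \cdot 2\pi \delta(\Delta_0)\, V\, (2\pi)^3 \delta^{(3)}(\vec{\Delta}) = VT\, (2\pi)^4 \delta^{(4)}(\Delta) \tag{23.51} $$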

for 𝑉, 𝑇 → ∞, and this corresponds precisely to the earlier replacement recipe (23.8).
A scrupulous reader has certainly noticed that the above argumentation is not fully
rigorous; the main asset is the identity (23.48), but (23.50) is again hand-waving. Obviously,
the main difficulty consists, from the very beginning, in working with the energy–momentum
delta function. A rigorous treatment would be possible if one used a representation of the
initial and final states by means of wave packets, smearing somewhat the values of the particle
momenta (see e.g. [6]). Our excuse for adopting a more intuitive short cut approach is that in
most textbooks the other authors proceed in the same way, so that it has become, as Germans
say, “salonfähig”, at least in a QFT environment.
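
The limit (23.48), which carries the rigorous part of the above argument, can be probed numerically by smearing the kernel $\sin^2(xT)/(Tx^2)$ with a smooth test function and watching the result approach $\pi f(0)$. The following Python sketch is merely an illustration: the Gaussian test function, the integration range, the step size and the function names are our own (hypothetical) choices, and a simple midpoint rule is used instead of any library integrator.

```python
import math

def kernel(x, T):
    """sin^2(xT) / (T x^2), with its x -> 0 limit equal to T."""
    if x == 0.0:
        return T
    return math.sin(x*T)**2 / (T*x*x)

def smeared(f, T, half_width=6.0, n=48000):
    """Midpoint-rule estimate of the integral of f(x)*kernel(x, T) over [-half_width, half_width]."""
    h = 2.0*half_width/n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5)*h
        total += f(x)*kernel(x, T)
    return total*h

def gauss(x):
    """Test function with f(0) = 1, negligible outside the integration range."""
    return math.exp(-x*x)

# Ratio of the smeared integral to pi*f(0) for increasing T
ratios = [smeared(gauss, T)/math.pi for T in (10.0, 50.0, 200.0)]
print(ratios)
```

The ratios approach unity from below as $T$ grows, the deviation falling off roughly as $1/T$; this is the behaviour one expects if the kernel tends to $\pi\delta(x)$ in the sense of distributions.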

Chapter 24

Sample lowest-order calculations for physical processes

Now we are equipped with some basic general formulae that may be employed for the evaluation
of decay rates and scattering cross sections. In fact, the crucial dynamical part of such a quantity
is the matrix element squared, |M 𝑓 𝑖 | 2 , which represents a specific contribution of the considered
particular QFT model. In the present chapter we will work out several typical examples within
the lowest order of the Dyson perturbation expansion for various field theory models.
Let us start with the process of fermion-antifermion decay of a scalar boson, discussed
previously in Chapter 22. Using the model of Yukawa interaction (21.1), we have arrived
at the simple result (22.14) for the matrix element in the first perturbative order. From the
computational point of view, the simplest case corresponds to the situation, in which the final-
state particles are unpolarized. It means that the conditions of a corresponding experiment
(thought or real) are supposed to be set in such a way that the considered decay process is
inclusive with respect to particle spins; consequently, in a pertinent calculation the final-state
spins (more precisely, spin projections, such as e.g. helicities) are summed over. So, let us see
how it works for the scalar boson decay in question. First, one gets

$$ |\mathcal{M}|^2 = \mathcal{M}\mathcal{M}^* = g^2\, \bar{u}(k,s)\, v(p,r)\; v^\dagger(p,r)\, \gamma_0\, u(k,s) = g^2\, \bar{u}(k,s)\, v(p,r)\; \bar{v}(p,r)\, u(k,s) . \tag{24.1} $$

Just to be sure, a trivial remark is perhaps in order here. The complex conjugation M ∗ is
calculated, equivalently, as the Hermitian conjugation M † ; obviously, this is most convenient,
since the number M is a matrix product. Needless to say, in (24.1) we have also utilized the
familiar relation 𝛾0† = 𝛾0 .
Now comes a crucial trick that will be instrumental in most of the forthcoming calcula-
tions. The matrix product in (24.1) is a number, so that it is equal to its trace; however, such a
trace is invariant under a cyclic permutation and thus one has

$$ |\mathcal{M}|^2 = g^2\, \mathrm{Tr}\big[ \bar{u}(k,s)\, v(p,r)\, \bar{v}(p,r)\, u(k,s) \big] = g^2\, \mathrm{Tr}\big[ u(k,s)\, \bar{u}(k,s)\, v(p,r)\, \bar{v}(p,r) \big] . \tag{24.2} $$

In this way, the original expression (24.1) has been recast as the trace of a product of 4 × 4
matrices and it is clear why such a seemingly simple-minded trick is so powerful: the reader
should recall that the expressions like u (𝑘, 𝑠)u
u (𝑘, 𝑠) or v ( 𝑝, 𝑟)vv ( 𝑝, 𝑟) (and their sums over the
spin labels) are known explicitly (see chapters 6 and 7) and, moreover, there are many practical
formulae for traces of products of Dirac matrices (see Chapter 3 and Appendix C). Thus, the

151
reader may appreciate now that the promise made in Chapter 3, based on a quotation from
A. P. Chekhov, is thereby fulfilled. The procedure outlined above is usually called simply the
trace technique.14
So, how about the expression (24.2)? In the envisaged case of unpolarized particles in
the final state one has, using (6.24) and (6.30),
$$\sum_{\text{pol.}} |\mathcal{M}|^2 = g^2 \sum_{r,s} \mathrm{Tr}\big[u(k,s)\bar{u}(k,s)\,v(p,r)\bar{v}(p,r)\big] = g^2\,\mathrm{Tr}\big[(\slashed{k}+m)(\slashed{p}-m)\big] \tag{24.3}$$

(note that here and in what follows we denote, conventionally, the sum over spins by the label
“pol.”). Employing some elementary formulae for Dirac traces, the result for (24.3) becomes

$$\sum_{\text{pol.}} |\mathcal{M}|^2 = 4g^2\,(k\cdot p - m^2) . \tag{24.4}$$
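The step from (24.3) to (24.4) can be cross-checked numerically with explicit 4 × 4 matrices. The following is only an illustrative sketch: it assumes the standard (Dirac) representation of the γ-matrices and the metric signature (+, −, −, −), and the helper names (`slash`, `spin_sum_trace`, `decay_momenta`) are hypothetical, not taken from the text.

```python
import math

# 2x2 building blocks; the Dirac representation is an assumption of this sketch
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
METRIC = [1, -1, -1, -1]  # assumed signature (+, -, -, -)

def neg(a):
    return [[-x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

# gamma^0, gamma^1, gamma^2, gamma^3
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace(a):
    return sum(a[i][i] for i in range(4))

def dot(a, b):
    """Minkowski product of two four-vectors given with upper indices."""
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))

def slash(p, mass=0.0):
    """Return p-slash + mass * identity for a four-vector p (upper indices)."""
    return [[sum(METRIC[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             + (mass if i == j else 0.0) for j in range(4)] for i in range(4)]

def spin_sum_trace(k, p, m):
    """Tr[(k-slash + m)(p-slash - m)], the trace appearing in (24.3)."""
    return trace(matmul(slash(k, m), slash(p, -m)))

def decay_momenta(M, m):
    """Back-to-back momenta k, p of the decay products in the parent rest frame."""
    E = M / 2
    kz = math.sqrt(E**2 - m**2)
    return (E, 0.0, 0.0, kz), (E, 0.0, 0.0, -kz)
```

For instance, with `k, p = decay_momenta(5.0, 1.0)` the value of `spin_sum_trace(k, p, 1.0)` should agree with `4 * (dot(k, p) - 1.0)` up to rounding errors.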

The four-momentum conservation 𝑞 = 𝑘 + 𝑝 amounts to


$$k\cdot p = \frac{1}{2}\,(q^2 - k^2 - p^2) = \frac{1}{2}M^2 - m^2 , \tag{24.5}$$
and thus we get finally
$$\sum_{\text{pol.}} |\mathcal{M}|^2 = 2g^2\,(M^2 - 4m^2) . \tag{24.6}$$

Now we are in a position to evaluate the decay rate; according to the general formulae
(23.28) and (23.30) one has
$$\Gamma(\varphi \to f\bar{f}) = \frac{g^2}{8\pi}\,M\left(1 - \frac{4m^2}{M^2}\right)^{3/2} . \tag{24.7}$$

Thus, any reader who has reached this point may be happy to learn how to compute, in the lowest
order, at least one decay channel of the Higgs boson (the value of the relevant coupling constant
𝑔 can be found elsewhere, see e.g. ref. [22]).
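The passage from (24.6) to (24.7) can be sketched in a few lines of code. The block below assumes that the general formulae (23.28) and (23.30) amount to Γ = (1/2M) Σ|M|² · LIPS₂ with the standard equal-mass two-body phase-space factor LIPS₂ = (1/8π)√(1 − 4m²/M²); this assumption and all function names are illustrative and not taken from the text.

```python
import math

def summed_matrix_element_squared(g, M, m):
    """Spin-summed |M|^2 for the scalar decay, eq. (24.6)."""
    return 2 * g**2 * (M**2 - 4 * m**2)

def two_body_phase_space(M, m):
    """Assumed equal-mass two-body phase-space factor LIPS_2."""
    return math.sqrt(1 - 4 * m**2 / M**2) / (8 * math.pi)

def width_from_matrix_element(g, M, m):
    """Gamma = (1/2M) * sum|M|^2 * LIPS_2; zero below the threshold M = 2m."""
    if M <= 2 * m:
        return 0.0
    return summed_matrix_element_squared(g, M, m) * two_body_phase_space(M, m) / (2 * M)

def width_closed_form(g, M, m):
    """Closed-form decay rate, eq. (24.7)."""
    if M <= 2 * m:
        return 0.0
    return g**2 * M / (8 * math.pi) * (1 - 4 * m**2 / M**2) ** 1.5
```

Both functions should return the same number for any M > 2m; in the massless limit the rate reduces to g²M/(8π).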
In fact, we can do more. We certainly have sufficient technical means to evaluate a decay
rate for polarized particles in the final state; of course, for such a purpose, the formulae (7.51)
are instrumental. In the present context, the most instructive description of the spin states relies
on the helicity formalism. So, let us study the “anatomy” of the result (24.6). It means that
we would like to compute the matrix element squared for all combinations of the final state
helicities, namely RR, LL, RL, LR, where R (right-handed) stands for the positive helicity and
similarly L (left-handed) denotes a negative helicity. By the way, it is not difficult to guess the
result a priori, on the basis of the angular momentum conservation. Indeed, staying in the rest
frame of the parent particle, the decay products emerge with opposite momenta and thus the
admissible combinations of helicities should be just RR or LL; for RL or LR configuration, the
spin projections would be oriented in the same direction and this is incompatible with the zero
angular momentum (spin) of the decaying scalar boson. Furthermore, an educated guess is that,
for symmetry reasons, one should get |MRR | 2 = |MLL | 2 . In this way, one might arrive at a
complete solution of the above problem. Anyway, it will be instructive to perform an explicit
14 Note that this elegant method was invented by the well-known Dutch physicist Hendrik Casimir (1909-2000)
in 1933, so it is sometimes called Casimir's trick.

calculation and verify the expected results; at the same time, one thus may confirm that our
formalism is sound indeed.
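    So, let us start with $|\mathcal{M}_{RR}|^2$. According to (24.2), one has
$$\frac{1}{g^2}\,|\mathcal{M}_{RR}|^2 = \mathrm{Tr}\big[u_R(k)\bar{u}_R(k)\,v_R(p)\bar{v}_R(p)\big] = \frac{1}{4}\,\mathrm{Tr}\big[(\slashed{k}+m)\big(1+\gamma_5\slashed{s}_R(k)\big)(\slashed{p}-m)\big(1+\gamma_5\slashed{s}_R(p)\big)\big] , \tag{24.8}$$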
where we have employed the formulae (8.12). Carrying out the matrix multiplication inside the
trace, one gets 16 terms, but most of them do not contribute in the end. Taking into account
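the basic fact that the trace of the product of an odd number of $\gamma$-matrices vanishes, as well as the
identity $\mathrm{Tr}(\gamma_\mu\gamma_\nu\gamma_5) = 0$, the expression (24.8) is reduced to
$$\begin{aligned}
\frac{1}{g^2}\,|\mathcal{M}_{RR}|^2 &= \frac{1}{4}\,\mathrm{Tr}\big[\slashed{k}\slashed{p} - m^2 + \slashed{k}\gamma_5\slashed{s}_R(k)\,\slashed{p}\gamma_5\slashed{s}_R(p) - m^2\gamma_5\slashed{s}_R(k)\gamma_5\slashed{s}_R(p)\big] \\
&= \frac{1}{4}\Big[4k\cdot p - 4m^2 + \mathrm{Tr}\big(\slashed{k}\,\slashed{s}_R(k)\,\slashed{p}\,\slashed{s}_R(p)\big) + m^2\,\mathrm{Tr}\big(\slashed{s}_R(k)\,\slashed{s}_R(p)\big)\Big] ,
\end{aligned} \tag{24.9}$$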
where we have also utilized the anticommutation property of $\gamma_5$ and the identity $\gamma_5^2 = 1$. To work
out the traces in (24.9), one may employ the orthogonality property of the spin four-vectors,
$k\cdot s_R(k) = 0$, $p\cdot s_R(p) = 0$, and it is also useful to take into account that $k + p = q$, with
$q = (M, \vec{0})$. Then we get readily
$$k\cdot s_R(p) = (q-p)\cdot s_R(p) = q\cdot s_R(p) = M s_R^0(p) = M\,\frac{|\vec{p}|}{m} , \tag{24.10}$$
and similarly
$$p\cdot s_R(k) = M\,\frac{|\vec{k}|}{m} . \tag{24.11}$$
Note that to arrive at (24.10) and (24.11), we have used the explicit form of the helicity four-vector
according to the formula (8.7). Putting all this together, the expression (24.9) becomes
$$\frac{1}{g^2}\,|\mathcal{M}_{RR}|^2 = k\cdot p - m^2 - (k\cdot p)\,\big(s_R(k)\cdot s_R(p)\big) + M^2\,\frac{|\vec{k}|^2}{m^2} + m^2\, s_R(k)\cdot s_R(p) , \tag{24.12}$$

where we have also taken into account that $|\vec{p}| = |\vec{k}|$. The scalar product $s_R(k)\cdot s_R(p)$ is
evaluated easily by means of the formula (8.7) (keeping in mind that $\vec{p} = -\vec{k}$); one gets
$$s_R(k)\cdot s_R(p) = \frac{|\vec{k}|^2}{m^2} + \frac{E^2}{m^2} = \frac{2E^2 - m^2}{m^2} = \frac{\frac{1}{2}M^2 - m^2}{m^2} . \tag{24.13}$$

Now, substituting (24.13) and (24.5) (as well as $|\vec{k}|^2 = \frac{1}{4}M^2 - m^2$) into (24.12) we obtain, after
a simple manipulation,
$$|\mathcal{M}_{RR}|^2 = g^2\,(M^2 - 4m^2) . \tag{24.14}$$
Note that this is just one half of (24.6), in accordance with our previous expectation. Let us also
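remark that from the intermediate result (24.9) one gets immediately $|\mathcal{M}_{LL}|^2 = |\mathcal{M}_{RR}|^2$, since
$s_L = -s_R$. Consequently, $|\mathcal{M}_{RL}|^2 = |\mathcal{M}_{LR}|^2 = 0$, because
$$\sum_{\text{pol.}} |\mathcal{M}|^2 = |\mathcal{M}_{RR}|^2 + |\mathcal{M}_{LL}|^2 + |\mathcal{M}_{RL}|^2 + |\mathcal{M}_{LR}|^2 , \tag{24.15}$$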

and the first two terms in (24.15) already saturate the value (24.6) for the left-hand side. Needless
to say, using the relations derived above, it is easy to show directly that the combinations RL
and LR give null results for the corresponding matrix elements.
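The helicity calculation leading to (24.14) can be replayed numerically with explicit four-vectors. The sketch below assumes that the helicity four-vector of (8.7) has the form s_R(p) = (|p|/m, E p̂/m), which is consistent with (24.10) and (24.13); the function names are hypothetical.

```python
import math

def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def helicity_vector(p, m):
    """Assumed form of (8.7): s_R(p) = (|p|/m, E * p_hat / m)."""
    E = p[0]
    p_abs = math.sqrt(p[1]**2 + p[2]**2 + p[3]**2)
    return (p_abs / m,) + tuple(E * x / (m * p_abs) for x in p[1:])

def rest_frame_momenta(M, m):
    """Momenta k, p of the decay products in the rest frame of the parent."""
    E = M / 2
    k_abs = math.sqrt(E**2 - m**2)
    return (E, 0.0, 0.0, k_abs), (E, 0.0, 0.0, -k_abs)

def msq_rr_over_g2(M, m):
    """Right-hand side of (24.12), evaluated from explicit four-vectors."""
    k, p = rest_frame_momenta(M, m)
    s_k, s_p = helicity_vector(k, m), helicity_vector(p, m)
    return (dot(k, p) - m**2
            - dot(k, p) * dot(s_k, s_p)
            + dot(k, s_p) * dot(p, s_k)   # equals M^2 |k|^2 / m^2
            + m**2 * dot(s_k, s_p))
```

The result of `msq_rr_over_g2(M, m)` should coincide with M² − 4m², i.e. with one half of (24.6) divided by g².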
As a next instructive example, we will consider the decay of a massive vector boson
(i.e. spin-1 particle) into a fermion-antifermion pair within the model (21.3) involving the
interaction Lagrangian
$$\mathcal{L}_{\text{int}} = g\,\bar{\psi}\gamma^\mu\psi\,A_\mu . \tag{24.16}$$
As we will see in Chapter 29, there is a subtle point concerning the validity of the relation (21.35),
but in the first order of Dyson expansion it is safe, so we are going to proceed confidently, utilizing
the basic formula (21.38).
For convenience, we will denote the vector boson as 𝑉. The derivation of the relevant
matrix element for the decay $V \to f\bar{f}$ can be carried out in much the same way as in the
preceding case of the Yukawa interaction. Indeed, taking into account the representation (20.10)
of the Proca field, it is easy to realize that the only substantial difference in comparison with
the scalar case is that here the annihilation and creation operators are accompanied by the
polarization vectors $\varepsilon_\mu(k,\lambda)$. Thus, following the steps that in Chapter 22 led from (22.9) to
(22.14), one gets
$$\mathcal{M}_{fi}^{(1)} = g\,\bar{u}(k,s)\gamma^\mu v(p,r)\,\varepsilon_\mu(q,\lambda) . \tag{24.17}$$
This is represented graphically by the Feynman diagram shown in Fig. 24.1.

[Figure 24.1: a wavy line carrying momentum q = k + p attached to two fermion lines labelled k and −p]

Fig. 24.1: First order Feynman diagram for the process $V \to f\bar{f}$. Feynman rules for the fermion lines
are the same as before, the wavy line represents the initial vector boson and its contribution is given by
the polarization vector $\varepsilon_\mu(q,\lambda)$.

Now, we would like to evaluate the contribution of the matrix element squared in the
case of the decay of an unpolarized vector boson into the $f\bar{f}$ pair in an arbitrary admissible
spin state. A novel issue here is a proper characterization of the state of the initial vector boson.
From the point of view of the general formalism of quantum theory, such an unpolarized particle
is considered to be in a mixed state described by a density matrix defined by equal probabilities
for any spin state; in the present case there are three possible spin states (polarizations), so that
all probabilities in question are equal to 1/3. This in turn means that |M 𝑓 𝑖 | 2 should be summed
over the vector boson polarizations 𝜆 = 1, 2, 3 and multiplied by 1/3. Such an operation is
called, conventionally, averaging over the initial spin states (polarizations). So, squaring the
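expression (24.17) we get first (dropping the obvious extra labels)
$$|\mathcal{M}|^2 = g^2\,\big[\bar{u}(k,s)\gamma_\mu v(p,r)\big]\big[\bar{v}(p,r)\gamma_\nu u(k,s)\big]\,\varepsilon^\mu(q,\lambda)\,\varepsilon^{\nu *}(q,\lambda) , \tag{24.18}$$
where we have used the identities $\gamma_\nu^\dagger = \gamma_0\gamma_\nu\gamma_0$ and $(\gamma_0)^2 = 1$. Then, utilizing the trace technique,
summing over the final spins and averaging over the initial polarizations, one obtains
$$\overline{|\mathcal{M}|^2} \equiv \frac{1}{3}\sum_{\text{pol.}} |\mathcal{M}|^2 = \frac{1}{3}\,g^2\,\mathrm{Tr}\big[(\slashed{k}+m)\gamma_\mu(\slashed{p}-m)\gamma_\nu\big]\left(-g^{\mu\nu} + \frac{1}{M^2}\,q^\mu q^\nu\right) , \tag{24.19}$$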

where we have employed the formula (12.30) for the polarization sum for the massive vector
boson. Note that the symbol $\overline{|\mathcal{M}|^2}$ used on the left-hand side of (24.19) is a standard notation
for a spin-averaged matrix element squared that will be used from now on. The expression
(24.19) can be worked out easily with the help of standard identities for traces of Dirac matrices.
A welcome simplification is due to the fact that the term proportional to $1/M^2$ gives zero (an
attentive reader may find out that such a striking result is not accidental: in fact, it can be traced
back to the “current conservation” identity $\bar{u}(k,s)\,\slashed{q}\,v(p,r) = 0$). Thus, one is left with
$$\overline{|\mathcal{M}|^2} = -\frac{1}{3}\,g^2\,\mathrm{Tr}\big[(\slashed{k}+m)\gamma_\mu(\slashed{p}-m)\gamma^\mu\big] ,$$
and another simple application of “diracology” yields the final result
$$\overline{|\mathcal{M}|^2} = \frac{1}{3}\,g^2\,\big(8k\cdot p + 16m^2\big) = \frac{4}{3}\,g^2\,(M^2 + 2m^2) . \tag{24.20}$$
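The corresponding decay rate then becomes
$$\Gamma(V \to f\bar{f}) = \frac{1}{2M}\,\overline{|\mathcal{M}|^2}\;\mathrm{LIPS}_2 = \frac{g^2}{12\pi}\,M\left(1 + \frac{2m^2}{M^2}\right)\sqrt{1 - \frac{4m^2}{M^2}} . \tag{24.21}$$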
As for a physical background of the above technical study, note that we have calculated the
decay rate of a hypothetical “heavy photon”. More realistically, the result (24.21) corresponds
to a simplified treatment of the neutral intermediate vector boson (𝑍) of the standard model of
electroweak interactions. A knowledgeable reader may be aware of the fact that in the realistic
case, the electroweak interaction of the $Z$ boson involves, along with the vector current appearing
in (24.16), also an axial-vector part $\bar{\psi}\gamma_\mu\gamma_5\psi$. Such a generalization of the above calculation is
left to the reader as an instructive exercise.
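The two "diracology" steps used above, namely the vanishing of the q^μ q^ν term in (24.19) and the value of the contracted trace in (24.20), can be checked with explicit matrices. This is a sketch only: the Dirac representation, the signature (+, −, −, −), the equal-mass phase-space factor LIPS₂ = (1/8π)√(1 − 4m²/M²) and all helper names are assumptions of the example.

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
METRIC = [1, -1, -1, -1]

def neg(a):
    return [[-x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]

def prod(*matrices):
    """Ordered product of 4x4 matrices."""
    out = matrices[0]
    for m in matrices[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def trace(a):
    return sum(a[i][i] for i in range(4))

def slash(p, mass=0.0):
    """p-slash + mass * identity for a four-vector p with upper indices."""
    return [[sum(METRIC[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             + (mass if i == j else 0.0) for j in range(4)] for i in range(4)]

def decay_momenta(M, m):
    E = M / 2
    kz = math.sqrt(E**2 - m**2)
    return (E, 0.0, 0.0, kz), (E, 0.0, 0.0, -kz)

def contracted_trace(k, p, m):
    """Sum over mu of Tr[(k/ + m) gamma_mu (p/ - m) gamma^mu]."""
    return sum(METRIC[mu] * trace(prod(slash(k, m), GAMMA[mu], slash(p, -m), GAMMA[mu]))
               for mu in range(4))

def longitudinal_trace(k, p, m):
    """Tr[(k/ + m) q/ (p/ - m) q/] with q = k + p (the q^mu q^nu term)."""
    q = [a + b for a, b in zip(k, p)]
    return trace(prod(slash(k, m), slash(q), slash(p, -m), slash(q)))

def averaged_msq(g, M, m):
    """Spin-averaged |M|^2 obtained from the contracted trace."""
    k, p = decay_momenta(M, m)
    return -(g**2 / 3) * contracted_trace(k, p, m).real

def width_vector(g, M, m):
    """Closed-form decay rate, eq. (24.21)."""
    return g**2 * M / (12 * math.pi) * (1 + 2 * m**2 / M**2) * math.sqrt(1 - 4 * m**2 / M**2)
```

With these definitions, `averaged_msq(g, M, m)` should reproduce (4/3) g² (M² + 2m²), and multiplying it by LIPS₂/(2M) should give `width_vector(g, M, m)`.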
    As a last example to be discussed in this chapter, let us consider the process $e^- e^+ \to \nu\bar{\nu}$
that we have mentioned, within a simplified model, in Chapter 22. Here we would like to make
its treatment more realistic; we are going to use for its description a theory of weak interactions
that preceded the present-day standard model. The SM precursor we have in mind is called,
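historically, the $V - A$ theory.15 The corresponding Lagrangian reads
$$\mathcal{L}_{\text{int}} = -\frac{G_F}{\sqrt{2}}\,\big[\bar{\psi}_1\gamma^\mu(1-\gamma_5)\psi_2\big]\big[\bar{\psi}_2\gamma_\mu(1-\gamma_5)\psi_1\big] , \tag{24.22}$$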
where the labelling of the Dirac fields is the same as in Chapter 22. The coupling constant is
written here in the form generally accepted in particle physics; in particular, 𝐺 F is the famous
Fermi constant that can be found in any particle data tables (note that the overall minus sign is a
pure convention). The most spectacular feature of the Lagrangian (24.22), in comparison with
the toy model (22.15), is the algebraic structure involving 𝛾 𝜇 (1 − 𝛾5 ). This has been established
on the phenomenological grounds and it also explains the 𝑉 − 𝐴 name: the bilinear combinations
of Dirac fields (currents) appearing in (24.22) can be read as vector (𝑉) minus axial vector (𝐴).
So much for an excursion into the history of particle physics; our main concern now is a technical
study of the relevant matrix element and the cross section for the above-mentioned process.
So, how about the matrix element M 𝑓 𝑖 ? Comparing the interaction Lagrangian (24.22)
with (22.15), it is clear that one may rely on the basic structure of the expression (22.23) and
15 Concerning its inventors, it is mostly attributed to Richard Feynman and Murray Gell-Mann, but there was
also a significant contribution by Robert Marshak and George Sudarshan. An interested reader can find some
details of this story e.g. in [22].

just insert the matrix product $\gamma^\mu(1-\gamma_5)$ between the Dirac spinors $\bar{u}$, $u$ and $\bar{v}$, $v$. We are thus
led immediately to the result
$$\mathcal{M}_{fi} = -\frac{G_F}{\sqrt{2}}\,\big[\bar{u}(k',r')\gamma^\mu(1-\gamma_5)u(k,r)\big]\big[\bar{v}(p,s)\gamma_\mu(1-\gamma_5)v(p',s')\big] . \tag{24.23}$$
For brevity, we will suppress the spin labels in what follows. The matrix element squared then
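becomes, after a simple manipulation,
$$\begin{aligned}
|\mathcal{M}|^2 = \frac{1}{2}\,G_F^2\,&\big[\bar{u}(k')\gamma^\rho(1-\gamma_5)u(k)\big]\big[\bar{v}(p)\gamma_\rho(1-\gamma_5)v(p')\big] \\
\times\,&\big[\bar{u}(k)\gamma_\sigma(1-\gamma_5)u(k')\big]\big[\bar{v}(p')\gamma^\sigma(1-\gamma_5)v(p)\big] .
\end{aligned} \tag{24.24}$$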

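To obtain the spin-averaged quantity $\overline{|\mathcal{M}|^2}$, one has to take into account that there are four
possible spin combinations in the initial state, so that the averaging factor is 1/4. Employing
also the trace technique, we get
$$\begin{aligned}
\overline{|\mathcal{M}|^2} = \frac{1}{8}\,G_F^2\,&\mathrm{Tr}\big[(\slashed{k}'+m_\nu)\gamma^\rho(1-\gamma_5)(\slashed{k}+m_e)\gamma_\sigma(1-\gamma_5)\big] \\
\times\,&\mathrm{Tr}\big[(\slashed{p}-m_e)\gamma_\rho(1-\gamma_5)(\slashed{p}'-m_\nu)\gamma^\sigma(1-\gamma_5)\big] .
\end{aligned} \tag{24.25}$$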

To proceed further, one may neglect safely the neutrino mass, since this is tiny indeed; in any
case, 𝑚 𝜈 ≪ 𝑚 𝑒 . When this is done, it also becomes clear that the terms in (24.25) involving
explicitly 𝑚 𝑒 drop out because of Tr(odd #) = 0 (note that we employ here the shorthand notation
introduced in (3.17), see also (C.6), Appendix C). Taking into account these simplifications and
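using also the obvious identity $(1-\gamma_5)^2 = 2(1-\gamma_5)$, the expression (24.25) is reduced to
$$\overline{|\mathcal{M}|^2} = \frac{1}{2}\,G_F^2\,\mathrm{Tr}\big[\slashed{k}'\gamma^\rho\slashed{k}\gamma_\sigma(1-\gamma_5)\big]\cdot\mathrm{Tr}\big[\slashed{p}\gamma_\rho\slashed{p}'\gamma^\sigma(1-\gamma_5)\big] . \tag{24.26}$$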
To work out the product of the traces in (24.26), it is very efficient to employ a set of formulae,
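which come in handy in many QFT calculations; these read
$$\mathrm{Tr}\big[\slashed{a}\gamma_\mu\slashed{b}\gamma_\nu\big]\cdot\mathrm{Tr}\big[\slashed{c}\gamma^\mu\slashed{d}\gamma^\nu\big] = 32\,\big[(a\cdot c)(b\cdot d) + (a\cdot d)(b\cdot c)\big] , \tag{24.27}$$
$$\mathrm{Tr}\big[\slashed{a}\gamma_\mu\slashed{b}\gamma_\nu\gamma_5\big]\cdot\mathrm{Tr}\big[\slashed{c}\gamma^\mu\slashed{d}\gamma^\nu\gamma_5\big] = 32\,\big[(a\cdot c)(b\cdot d) - (a\cdot d)(b\cdot c)\big] , \tag{24.28}$$
$$\mathrm{Tr}\big[\slashed{a}\gamma_\mu\slashed{b}\gamma_\nu\big]\cdot\mathrm{Tr}\big[\slashed{c}\gamma^\mu\slashed{d}\gamma^\nu\gamma_5\big] = 0 . \tag{24.29}$$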

As a practical abbreviation, we will call these remarkably simple identities the “formulae 32”.
Their proof is straightforward, based on the standard formulae for products of Dirac matrices
and the Levi-Civita pseudotensor 𝜀 𝜇𝜈𝜌𝜎 . Anyway, it is somewhat lengthy, so we offer it to a
diligent reader as an instructive exercise.
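Short of the analytic proof, the "formulae 32" can at least be tested numerically for random four-vectors. The sketch below assumes the Dirac representation, the signature (+, −, −, −) and γ5 = iγ⁰γ¹γ²γ³; the helper names are hypothetical and the check is not a substitute for the exercise.

```python
import random

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
METRIC = [1, -1, -1, -1]

def neg(a):
    return [[-x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def prod(*matrices):
    """Ordered product of 4x4 matrices."""
    out = matrices[0]
    for m in matrices[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def trace(a):
    return sum(a[i][i] for i in range(4))

GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]
GAMMA5 = [[1j * x for x in row] for row in prod(*GAMMA)]  # assumed convention
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ONE_MINUS_G5 = [[IDENTITY4[i][j] - GAMMA5[i][j] for j in range(4)] for i in range(4)]

def dot(a, b):
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))

def slash(p):
    return [[sum(METRIC[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def trace_product(a, b, c, d, left=IDENTITY4, right=IDENTITY4):
    """Sum over mu, nu of Tr[a/ g_mu b/ g_nu L] * Tr[c/ g^mu d/ g^nu R]."""
    total = 0
    for mu in range(4):
        for nu in range(4):
            t1 = trace(prod(slash(a), GAMMA[mu], slash(b), GAMMA[nu], left))
            t2 = trace(prod(slash(c), GAMMA[mu], slash(d), GAMMA[nu], right))
            total += METRIC[mu] * METRIC[nu] * t1 * t2
    return total

def random_vector(rng):
    return [rng.uniform(-1, 1) for _ in range(4)]
```

Passing `GAMMA5` as `left` and/or `right` gives the left-hand sides of (24.28) and (24.29); passing `ONE_MINUS_G5` for both gives the combination needed in (24.26), which should equal 64 (a·c)(b·d).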
In the present case, using the above formulae in (24.26), one gets readily

$$\overline{|\mathcal{M}|^2} = 32\,G_F^2\,(k'\cdot p)(k\cdot p') \tag{24.30}$$

(so, this is the right moment to appreciate the efficiency of the formulae 32, isn’t it?). The
kinematics of the considered process is such that 𝑘 + 𝑝 = 𝑘 ′ + 𝑝′. Thus, the Mandelstam
invariants may be defined conventionally as

𝑠 = (𝑘 + 𝑝) 2 = (𝑘 ′ + 𝑝′) 2 ,
𝑡 = (𝑘 − 𝑘 ′) 2 = ( 𝑝 − 𝑝′) 2 , (24.31)
𝑢 = (𝑘 − 𝑝′) 2 = (𝑘 ′ − 𝑝) 2 .

It means that the scalar products appearing in (24.30) can be recast as
$$k\cdot p' = k'\cdot p = -\frac{1}{2}\,(u - m_e^2) , \tag{24.32}$$
and one gets finally
$$\overline{|\mathcal{M}|^2} = 8\,G_F^2\,(u - m_e^2)^2 . \tag{24.33}$$
For an illustration (as well as for simplicity) let us see now what is the explicit form of
the angular distribution of the final-state particles in the high-energy limit, i.e. for 𝑠 ≫ 𝑚 2𝑒 . In
such a case, $m_e^2$ in (24.33) may be neglected and it is not difficult to find out that
$$u = -\frac{1}{2}\,s\,(1 + \cos\vartheta_{\text{c.m.}}) , \tag{24.34}$$
where $\vartheta_{\text{c.m.}}$ is the angle between the momenta of the incoming electron and outgoing neutrino.
Thus, using the general formula (23.45) (valid for massless particles), we get
$$\frac{d\sigma}{d\Omega_{\text{c.m.}}} = \frac{G_F^2\, s}{32\pi^2}\,(1 + \cos\vartheta_{\text{c.m.}})^2 . \tag{24.35}$$
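The kinematic relations (24.32) and (24.34), as well as the step from (24.30) to (24.35), can be verified with explicit c.m. four-momenta. The sketch assumes massless neutrinos and takes the general formula (23.45) to be the standard massless 2 → 2 expression dσ/dΩ = |M|²/(64π²s); the function names are illustrative.

```python
import math

def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def cm_momenta(s, m_e, theta):
    """k, p (incoming e-, e+) and k', p' (outgoing nu, nubar) in the c.m. frame."""
    E = math.sqrt(s) / 2
    pe = math.sqrt(E**2 - m_e**2)
    k = (E, 0.0, 0.0, pe)
    p = (E, 0.0, 0.0, -pe)
    k_out = (E, E * math.sin(theta), 0.0, E * math.cos(theta))
    p_out = (E, -E * math.sin(theta), 0.0, -E * math.cos(theta))
    return k, p, k_out, p_out

def mandelstam(k, p, k_out, p_out):
    """Invariants s, t, u as defined in (24.31)."""
    add = lambda a, b: tuple(x + y for x, y in zip(a, b))
    sub = lambda a, b: tuple(x - y for x, y in zip(a, b))
    s = dot(add(k, p), add(k, p))
    t = dot(sub(k, k_out), sub(k, k_out))
    u = dot(sub(k, p_out), sub(k, p_out))
    return s, t, u

def averaged_msq(G_F, k, p, k_out, p_out):
    """Spin-averaged |M|^2, eq. (24.30)."""
    return 32 * G_F**2 * dot(k_out, p) * dot(k, p_out)

def dsigma_domega(G_F, s, theta):
    """High-energy angular distribution, eq. (24.35)."""
    return G_F**2 * s / (32 * math.pi**2) * (1 + math.cos(theta))**2
```

For a massive electron one should find s + t + u = 2 m_e², and in the limit m_e → 0 the ratio `averaged_msq / (64 * pi**2 * s)` should coincide with `dsigma_domega`.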
The expression (24.35) may be integrated easily over the scattering angle and one thus arrives
at the result for the total cross section
$$\sigma(s)\Big|_{s \gg m_e^2} = \frac{1}{6\pi}\,G_F^2\, s . \tag{24.36}$$
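The angular integration that leads from (24.35) to (24.36) is elementary, but it can also be done numerically as a sanity check. The following sketch uses a simple midpoint rule in cos ϑ; the function names and the number of integration steps are arbitrary choices of the example.

```python
import math

def dsigma_domega(G_F, s, cos_theta):
    """Differential cross section (24.35) as a function of cos(theta)."""
    return G_F**2 * s / (32 * math.pi**2) * (1 + cos_theta)**2

def total_cross_section(G_F, s, n=2000):
    """Integrate (24.35) over the full solid angle (midpoint rule in cos(theta))."""
    h = 2.0 / n
    integral = sum(dsigma_domega(G_F, s, -1.0 + (i + 0.5) * h) for i in range(n)) * h
    return 2 * math.pi * integral  # the azimuthal integration is trivial

def total_cross_section_exact(G_F, s):
    """Closed-form result (24.36)."""
    return G_F**2 * s / (6 * math.pi)
```

The numerical and the closed-form values should agree to within the accuracy of the midpoint rule.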
It is worth noting that a substantial part of the result (24.36) could have been guessed a
priori, just on dimensional grounds. Indeed, the cross section has the dimension (mass)$^{-2}$ and
in the lowest order it is proportional to $G_F^2$. As we have already pointed out in Chapter 22, the
dimension of $G_F$ is (mass)$^{-2}$. In the high-energy limit, particle masses may be neglected, so the
only other dimensionful quantity, apart from $G_F$, is the energy. Thus, obviously, to get a cross
section $\sigma$ with the right dimension, one has to multiply $G_F^2$ by an energy squared; since $\sigma$ is
Lorentz invariant, the relevant variable is $s$ ($= E_{\text{c.m.}}^2$). Thus, one arrives at an estimate
$$\sigma(s)\Big|_{s \gg m_e^2} = \text{const.}\times G_F^2\, s . \tag{24.37}$$
In other words, the detailed calculation that we have performed just supplements (24.37) with
the numerical factor $1/(6\pi)$.
It may be also instructive to find a numerical value of the cross section (24.36) for a
typical high energy, e.g. $E_{\text{c.m.}} = 1$ GeV, i.e. $s = 1\ \text{GeV}^2$. According to particle data tables, the
value of the Fermi constant amounts to $G_F \approx 1.166 \times 10^{-5}\ \text{GeV}^{-2}$. Thus, the cross section
(24.36) comes out, in the natural units,
$$\sigma(s = 1\ \text{GeV}^2) \approx 0.07 \times 10^{-10}\ \text{GeV}^{-2} . \tag{24.38}$$
To recast it in ordinary units, one employs the well-known conversion constant $\hbar c = 197$ MeV fm,
where 1 fm = $10^{-13}$ cm. Thus, in natural units one has $1\ \text{GeV}^{-1} \approx 0.2$ fm. Putting all this
together, one gets finally
$$\sigma(s = 1\ \text{GeV}^2) \approx 2.8 \times 10^{-39}\ \text{cm}^2 = 2.8\ \text{fb (femtobarn)} , \tag{24.39}$$

where 1 b = $10^{-24}\ \text{cm}^2$. Later on, we will see that for a typical electromagnetic process like
$e^- e^+ \to \mu^- \mu^+$ (production of a muon pair in the electron–positron annihilation) one gets the
result (within QED)
$$\sigma(e^- e^+ \to \mu^- \mu^+)\Big|_{s = 1\ \text{GeV}^2} \approx 8.7 \times 10^{-32}\ \text{cm}^2 ,$$

i.e. a value that is seven orders of magnitude larger than (24.39) (such a huge difference justifies
the attribute “weak” for the interaction responsible for the process $e^- e^+ \to \nu\bar{\nu}$). Well, an astute
reader might object that the cross section (24.36) grows indefinitely for 𝑠 → ∞, so that the
weak interaction might become strong at a sufficiently high energy. The standard model of
electroweak interactions gives a clear answer to such a question, but this is another story that
goes beyond the scope of these lectures (an interested reader is referred to [22]).
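The numerical estimates (24.38) and (24.39) and the comparison with the electromagnetic process can be reproduced as follows. The sketch uses the slightly more precise value ħc ≈ 197.327 MeV fm instead of the rounded 197 MeV fm, and for the QED cross section it assumes the standard high-energy formula σ = 4πα²/(3s), which is not derived in this chapter; all names are illustrative.

```python
import math

G_F = 1.166e-5        # Fermi constant in GeV^-2 (value quoted in the text)
HBAR_C = 0.197327     # GeV fm (the text rounds this to 197 MeV fm)
ALPHA = 1 / 137.036   # fine-structure constant
FM2_TO_CM2 = 1e-26    # 1 fm = 1e-13 cm
CM2_PER_FB = 1e-39    # 1 fb = 1e-15 b and 1 b = 1e-24 cm^2

def natural_to_cm2(sigma_gev_minus2):
    """Convert a cross section from GeV^-2 (natural units) to cm^2."""
    return sigma_gev_minus2 * HBAR_C**2 * FM2_TO_CM2

def sigma_weak(s):
    """Cross section (24.36) in GeV^-2 for s in GeV^2."""
    return G_F**2 * s / (6 * math.pi)

def sigma_qed_mu_pair(s):
    """Assumed high-energy QED result 4 pi alpha^2 / (3 s), in GeV^-2."""
    return 4 * math.pi * ALPHA**2 / (3 * s)

if __name__ == "__main__":
    weak = natural_to_cm2(sigma_weak(1.0))
    qed = natural_to_cm2(sigma_qed_mu_pair(1.0))
    print(f"weak: {weak:.2e} cm^2 = {weak / CM2_PER_FB:.1f} fb")
    print(f"QED : {qed:.2e} cm^2, ratio QED/weak = {qed / weak:.1e}")
```

At s = 1 GeV² the ratio of the two cross sections comes out at a few times 10⁷, in line with the "seven orders of magnitude" mentioned above.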

Chapter 25

Scattering in external Coulomb field. Mott formula

There is another example of a physical process that deserves a separate treatment. It is the
scattering of a Dirac particle (e.g. electron) in an external static electromagnetic field; of
particular interest is the case of the Coulomb field. For the description of such a situation one
may employ the interaction Lagrangian

$$\mathcal{L}_{\text{int}} = e\,\bar{\psi}(x)\gamma^\mu\psi(x)\,A_\mu^{\text{clas.}}(x) , \tag{25.1}$$

where $\psi(x)$ is the quantized Dirac field and $A_\mu^{\text{clas.}}(x)$ represents the external classical field. Of
course, such a “hybrid” formalism is an idealized picture of a situation, in which the electron
interacts with a heavy charged particle (e.g. proton or another atomic nucleus) that is the source
of the pertinent electromagnetic field. We will come back to this issue later on, within the
framework of quantum electrodynamics, but now we are going to proceed using (25.1).
It is not difficult to obtain the relevant matrix element for the considered scattering
process. The states corresponding to the incident and scattered electron can be represented as

|𝑖⟩ = 𝑏 † ( 𝑝, 𝑠)|0⟩ , | 𝑓 ⟩ = 𝑏 † ( 𝑝′, 𝑠′)|0⟩ , (25.2)

and the 1st order $S$-matrix element is then
$$S_{fi}^{(1)} = ie\int d^4x\;\langle 0|\,b(p',s')\,\bar{\psi}(x)\gamma^\mu\psi(x)\,b^\dagger(p,s)\,|0\rangle\,A_\mu(x) . \tag{25.3}$$

Note that here and in what follows we suppress the label “clas.” on the electromagnetic field
(four-potential) $A_\mu$. Moreover, we will assume that the field is static, i.e. $A_\mu(x) = A_\mu(\vec{x})$. The
v.e.v. in the integrand in (25.3) is worked out easily, in analogy with the examples discussed
previously; the non-trivial contribution corresponds to the pairings of $b(p',s')$ with $\bar{\psi}(x)$ and
$b^\dagger(p,s)$ with $\psi(x)$. The rest of the calculation is routine and one ends up with
$$S_{fi}^{(1)} = ie\,N_p N_{p'}\,\bar{u}(p',s')\gamma^\mu u(p,s)\int d^4x\; e^{i(p'-p)x}\,A_\mu(\vec{x}) . \tag{25.4}$$

The time integration is trivial and yields $2\pi\delta(p'_0 - p_0)$, which means that the energy is conserved.
Needless to say, this is what must happen in the presence of a static field. On the other hand,
the three-momentum is not conserved, for the same reason (more precisely, the momentum
magnitude is conserved, but its direction is changed, defining the scattering angle). Thus, we
have
$$S_{fi}^{(1)} = N_p N_{p'}\; i\mathcal{M}_{fi}^{(1)}\; 2\pi\delta(E' - E) , \tag{25.5}$$

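with
$$\mathcal{M}_{fi}^{(1)} = e\,\bar{u}(p',s')\gamma^\mu u(p,s)\,\tilde{A}_\mu(\vec{q}) , \tag{25.6}$$
where
$$\tilde{A}_\mu(\vec{q}) = \int d^3x\; A_\mu(\vec{x})\, e^{-i\vec{q}\cdot\vec{x}} \tag{25.7}$$
for
$$\vec{q} = \vec{p}\,' - \vec{p} . \tag{25.8}$$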
Such an expression can be represented graphically by means of a simple Feynman diagram
shown in Fig. 25.1.

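[Figure 25.1: a fermion line with incoming momentum p and outgoing momentum p′; a wavy line carrying q = p′ − p ends in a cross]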

Fig. 25.1: Feynman diagram for the scattering in an external field. The contribution of the wavy line
with the cross at the endpoint is the Fourier transform of the external field.

The evaluation of the scattering cross section proceeds basically along the same lines as
before in Chapter 23, but there are some minor modifications due to the different kinematics of
the considered process, embodied in (25.5). In analogy with (23.37), one may start with
$$d\sigma = \frac{\dfrac{1}{T}\,|S_{fi}|^2\,\dfrac{V\, d^3p_f}{(2\pi)^3}}{\dfrac{1}{V}\,|\vec{v}_i|} \tag{25.9}$$
(for convenience, we have temporarily switched to the self-explanatory notation $p_i = p$, $p_f = p'$).
The expression for $|S_{fi}|^2$ involves, among other things, the nasty singular factor $\delta(0)$; its
pragmatic regularization now reads (cf. (23.8))
$$2\pi\delta(0) \longrightarrow \int_{-T/2}^{T/2} dx_0 = T . \tag{25.10}$$

Thus, substituting all relevant ingredients into (25.9) one gets first, after a simple manipulation,
$$d\sigma = \frac{1}{16\pi^2}\,|\mathcal{M}_{fi}|^2\,\delta(E_f - E_i)\,\frac{d^3p_f}{|\vec{p}_i|\,E_f} . \tag{25.11}$$

Next, using spherical variables so that $d^3p_f = |\vec{p}_f|^2\, d|\vec{p}_f|\, d\Omega_f$ and taking into account that
$|\vec{p}_f|\, d|\vec{p}_f| = E_f\, dE_f$, the integration over $dE_f$ is carried out readily and we obtain finally
$$\frac{d\sigma}{d\Omega} = \frac{1}{16\pi^2}\,|\mathcal{M}|^2 \tag{25.12}$$
(we have streamlined our notation as much as possible). Such a remarkably simple formula holds
generally, for any static external field; now let us consider the particular case of the Coulomb
field.

    In such a case, the external field can be written as
$$A^\mu(\vec{x}) = \left(\frac{1}{4\pi}\,\frac{e}{r},\ \vec{0}\right) , \tag{25.13}$$
where $r = |\vec{x}|$. We have in mind here that the source carries the elementary charge, i.e. it
may be viewed as a static proton. Note that the form (25.13) corresponds to the rationalized
(Heaviside–Lorentz) electromagnetic units; in compliance with the natural system of units we
are using, it then holds $e^2 = 4\pi\alpha$, where $\alpha \approx 1/137$ is the famous fine-structure constant. For
the evaluation of the Fourier transform of (25.13) one may utilize the well-known integral
$$\int d^3x\;\frac{1}{|\vec{x}|}\,e^{i\vec{q}\cdot\vec{x}} = \frac{4\pi}{\vec{q}^{\,2}} . \tag{25.14}$$
Thus,
$$\tilde{A}_\mu(\vec{q}) = \left(\frac{e}{\vec{q}^{\,2}},\ \vec{0}\right) , \tag{25.15}$$
and the matrix element (25.6) becomes
$$\mathcal{M} = \frac{e^2}{\vec{q}^{\,2}}\,\bar{u}(p',s')\gamma_0 u(p,s) . \tag{25.16}$$
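The integral (25.14) is not absolutely convergent, so a numerical check needs a regulator. A common choice, assumed here and not made in the text, is a Yukawa-type screening factor exp(−μr), for which the Fourier transform is 4π/(q² + μ²) and the Coulomb result is recovered for μ → 0. The sketch below performs the angular integration analytically and the radial one by Simpson's rule; all names and numerical parameters are illustrative.

```python
import math

def screened_coulomb_ft(q, mu, n=100000):
    """Fourier transform of exp(-mu r)/r evaluated numerically.

    After the angular integration the integral becomes
    (4 pi / q) * int_0^inf sin(q r) exp(-mu r) dr,
    which is approximated by composite Simpson's rule on [0, 40/mu].
    """
    r_max = 40.0 / mu   # exp(-40) makes the neglected tail negligible
    h = r_max / n       # n must be even for Simpson's rule

    def f(r):
        return math.sin(q * r) * math.exp(-mu * r)

    total = f(0.0) + f(r_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return 4 * math.pi / q * total * h / 3

def coulomb_ft_exact(q, mu=0.0):
    """Closed form 4 pi / (q^2 + mu^2); mu = 0 reproduces (25.14)."""
    return 4 * math.pi / (q**2 + mu**2)
```

For small μ the screened result approaches 4π/q², which is the value used in (25.15).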
We are going to compute now the differential cross section for unpolarized electrons. The
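spin-averaged matrix element squared is given by
$$\begin{aligned}
\overline{|\mathcal{M}|^2} &= \frac{1}{2}\left(\frac{e^2}{\vec{q}^{\,2}}\right)^2 \sum_{\text{pol.}} \big[\bar{u}(p',s')\gamma_0 u(p,s)\big]\big[\bar{u}(p,s)\gamma_0 u(p',s')\big] \\
&= \frac{1}{2}\left(\frac{e^2}{\vec{q}^{\,2}}\right)^2 \mathrm{Tr}\big[(\slashed{p}'+m)\gamma_0(\slashed{p}+m)\gamma_0\big] .
\end{aligned} \tag{25.17}$$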
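The trace in (25.17) is evaluated easily and one gets
$$\begin{aligned}
\overline{|\mathcal{M}|^2} &= \frac{2e^4}{(\vec{q}^{\,2})^2}\,\big(2E^2 - p\cdot p' + m^2\big) \\
&= \frac{2e^4}{(\vec{q}^{\,2})^2}\,\big(E^2 + |\vec{p}|^2\cos\vartheta + m^2\big) \\
&= \frac{2e^4}{(\vec{q}^{\,2})^2}\,\big[E^2 + m^2 + (E^2 - m^2)\cos\vartheta\big] ,
\end{aligned} \tag{25.18}$$
where $\vartheta$ is the scattering angle (between $\vec{p}\,'$ and $\vec{p}$). For $\vec{q}^{\,2}$ we have $\vec{q}^{\,2} = (\vec{p}\,' - \vec{p})^2 =
2|\vec{p}|^2(1 - \cos\vartheta) = 4|\vec{p}|^2\sin^2\frac{\vartheta}{2}$ and the cross section (25.12) becomes, finally,
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4\beta^2|\vec{p}|^2}\,\frac{1}{\sin^4\frac{\vartheta}{2}}\left(1 - \beta^2\sin^2\frac{\vartheta}{2}\right) , \tag{25.19}$$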
where $\beta$ is the particle velocity, $\beta = |\vec{p}|/E$. We have thus arrived at the celebrated Mott
formula$^{16}$ announced in the title of the present chapter. It represents a relativistic generalization
16 Sir Nevill Mott (1905-1996) published this formula in 1929. Later he conducted highly successful research
in solid state physics and received the Nobel Prize in 1977 for his work on disordered systems such as amorphous
semiconductors.

of the familiar Rutherford formula known from non-relativistic quantum and classical mechanics.
Indeed, let us see what one gets from (25.19) in the low-energy approximation, i.e. for $|\vec{p}| \ll m$.
In such a case, one has $\beta \ll 1$ and the second term in parentheses can be neglected. In the factor
multiplying the angular distribution $1/\sin^4\frac{\vartheta}{2}$ one may then set $4\beta^2|\vec{p}|^2 \approx 4\beta^2\cdot m^2\beta^2 = 16E_{\text{kin}}^2$,
where $E_{\text{kin}} = \frac{1}{2}m\beta^2$ is the non-relativistic kinetic energy. Putting all this together, (25.19) is
reduced to
$$\frac{d\sigma}{d\Omega}\bigg|_{\beta \ll 1} = \left(\frac{\alpha}{4E_{\text{kin}}}\right)^2 \frac{1}{\sin^4\frac{\vartheta}{2}} , \tag{25.20}$$
and this is just the Rutherford formula. Of course, in the opposite limit $\beta \to 1$ the angular
distribution is substantially modified by the relativistic effect embodied in the factor $1 - \beta^2\sin^2\frac{\vartheta}{2}$;
in particular, the backward scattering, corresponding to $\vartheta = 180^\circ$, is suppressed.
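The two limits of the Mott formula discussed above are easy to explore numerically. The following sketch implements (25.19) and (25.20) in natural units; the function names and the default value of α are choices of the example.

```python
import math

ALPHA = 1 / 137.036  # fine-structure constant

def mott(E, theta, m, alpha=ALPHA):
    """Mott cross section (25.19) for total energy E and scattering angle theta."""
    p2 = E**2 - m**2                 # |p|^2
    beta2 = p2 / E**2                # beta^2 = |p|^2 / E^2
    s2 = math.sin(theta / 2)**2
    return alpha**2 / (4 * beta2 * p2 * s2**2) * (1 - beta2 * s2)

def rutherford(E_kin, theta, alpha=ALPHA):
    """Rutherford cross section (25.20) for non-relativistic kinetic energy E_kin."""
    return (alpha / (4 * E_kin))**2 / math.sin(theta / 2)**4

def backward_suppression(E, m):
    """Ratio of (25.19) at theta = 180 degrees to the same expression without
    the factor 1 - beta^2 sin^2(theta/2); it equals 1 - beta^2 = m^2 / E^2."""
    p2 = E**2 - m**2
    beta2 = p2 / E**2
    return mott(E, math.pi, m) / (ALPHA**2 / (4 * beta2 * p2))
```

For β of the order of 10⁻³ the two formulae agree to about one part in a million, while for E ≫ m the backward cross section is reduced by the factor m²/E².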
There is another aspect of the Mott scattering that is worth mentioning, namely the
behaviour of polarized particles. To elucidate this point, let us consider the case of the scattering
of a fully polarized electron, e.g. with positive helicity. One may define the degree of polarization
of scattered electrons as
$$P = \frac{\dfrac{d\sigma_R}{d\Omega} - \dfrac{d\sigma_L}{d\Omega}}{\dfrac{d\sigma_R}{d\Omega} + \dfrac{d\sigma_L}{d\Omega}} , \tag{25.21}$$
where the indices $R$, $L$ mark the scattered electron with positive and negative helicity, respectively.
The calculation based on (25.12) (and relying heavily on the trace techniques) is
straightforward, but rather tedious, so we will give here just the final result. This reads
$$P = \frac{E^2\cos^2\frac{\vartheta}{2} - m^2\sin^2\frac{\vartheta}{2}}{E^2\cos^2\frac{\vartheta}{2} + m^2\sin^2\frac{\vartheta}{2}} = 1 - \frac{2m^2\sin^2\frac{\vartheta}{2}}{E^2\cos^2\frac{\vartheta}{2} + m^2\sin^2\frac{\vartheta}{2}} . \tag{25.22}$$
The derivation of this formula is left as a challenge for a truly diligent reader. One can thus
see that there is a depolarization effect depending, in general, on the energy and the scattering
angle. A salient feature of the result (25.22) is that in the ultrarelativistic limit 𝐸 ≫ 𝑚 the initial
polarization is preserved, 𝑃 → 1. In fact, there is a deeper reason for such a behaviour. In the
high-energy limit, the mass can be neglected and the helicity becomes equivalent to chirality
(cf. Chapter 8); however, in the electromagnetic interaction the chirality is conserved, since e.g.
$$\bar{u}_L\gamma_\mu u_R = \bar{u}_L\,\frac{1+\gamma_5}{2}\,\gamma_\mu\,\frac{1+\gamma_5}{2}\,u_R = 0 .$$
One may also notice immediately that in this aspect, namely the chirality conservation, the
electromagnetic (vector-like) force differs substantially from the Yukawa interaction, since, in
general, $\bar{u}_L u_R \neq 0$.
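The depolarization described by (25.22) can be tabulated with a few lines of code. The sketch below implements both forms of (25.22); the function names are illustrative.

```python
import math

def polarization(E, theta, m):
    """Degree of polarization of the scattered electron, first form of (25.22)."""
    c2 = math.cos(theta / 2)**2
    s2 = math.sin(theta / 2)**2
    return (E**2 * c2 - m**2 * s2) / (E**2 * c2 + m**2 * s2)

def polarization_alt(E, theta, m):
    """Second form of (25.22): 1 minus the depolarization term."""
    c2 = math.cos(theta / 2)**2
    s2 = math.sin(theta / 2)**2
    return 1 - 2 * m**2 * s2 / (E**2 * c2 + m**2 * s2)
```

Two limits are worth noting: for E ≫ m one gets P → 1 at any fixed angle below 180°, while in the static limit E → m the formula reduces to P = cos ϑ.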

Chapter 26

Propagator of scalar field

In the preceding chapters we have discussed some elementary applications of perturbative QFT,
employing just the first-order term of the Dyson expansion of the 𝑆-matrix. These lowest-order
calculations are very simple indeed and the resulting Feynman diagrams consist of vertices and
external lines only; nevertheless, one thus gets a lot of useful results.
Now, let us proceed to the 2nd order of Dyson expansion. This will lead us naturally
to a new crucial ingredient of the diagram technique, namely to internal lines connecting the
vertices; mathematically, such an internal line corresponds to the Feynman propagator of a
quantized field.
    As a motivating example, we are going to consider a model of Yukawa-like interaction

$$\mathcal{L}_{\text{int}} = g_1\,\bar{\psi}_1\psi_1\varphi + g_2\,\bar{\psi}_2\psi_2\varphi , \tag{26.1}$$

where $\psi_1$, $\psi_2$ are two different Dirac fields (corresponding, for definiteness, e.g. to electron and
muon) and $\varphi$ is a real scalar field. (Note that a knowledgeable reader may recognize in (26.1) the
interaction of the Higgs boson field with fermions within the current standard model of particle
physics.) Let us consider the electron–muon scattering process

𝑒(𝑘) + 𝜇( 𝑝) → 𝑒(𝑘 ′) + 𝜇( 𝑝′) , (26.2)

where we have also indicated the corresponding particle four-momenta. Obviously, to get a
non-trivial contribution to such a process, one has to use (at least) the 2nd order of the 𝑆-matrix
expansion, since one needs two electron operators, as well as two muon operators. Thus, we
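start with the $S$-matrix operator approximation
$$S^{(2)} = \frac{i^2}{2!}\iint d^4x\, d^4y\; T\big[\mathcal{L}_{\text{int}}(x)\mathcal{L}_{\text{int}}(y)\big] , \tag{26.3}$$
and with the initial and final states defined as
$$\begin{aligned}
|i\rangle &= b_1^\dagger(k)\,b_2^\dagger(p)\,|0\rangle , \\
|f\rangle &= b_1^\dagger(k')\,b_2^\dagger(p')\,|0\rangle ,
\end{aligned} \tag{26.4}$$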

where the notation is self-explanatory; for brevity, we have suppressed the relevant spin labels.
For convenience, let us denote the two parts of the interaction Lagrangian (26.1) as $\mathcal{L}_1$ and
$\mathcal{L}_2$, respectively. It is clear that a non-vanishing contribution to the matrix element $S_{fi}^{(2)}$ can
be obtained only from products $\mathcal{L}_1(x)\mathcal{L}_2(y)$ and $\mathcal{L}_2(x)\mathcal{L}_1(y)$ in (26.3). Taking into account

that $T[A(x)B(y)] = T[B(y)A(x)]$, as well as the simple fact that in the integral (26.3) one can
perform the change of variables $x \leftrightarrow y$ if necessary, $S_{fi}^{(2)}$ can be written as
$$S_{fi}^{(2)} = i^2 g_1 g_2 \iint d^4x\, d^4y\; \langle f|\,T\big[\bar{\psi}_1(x)\psi_1(x)\varphi(x)\,\bar{\psi}_2(y)\psi_2(y)\varphi(y)\big]\,|i\rangle , \tag{26.5}$$

because the above-mentioned two terms under the T-product give the same contribution. To
work out the last expression, we use the definitions (26.4) and the usual general representation
of the T-product in terms of the Heaviside step functions,
$$T[A(x)B(y)] = \vartheta(x_0 - y_0)\,A(x)B(y) + \vartheta(y_0 - x_0)\,B(y)A(x) . \tag{26.6}$$




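For the integrand in Eq. (26.5) one then has
$$\begin{aligned}
\langle f|\,T\big[\bar{\psi}_1(x)&\psi_1(x)\varphi(x)\,\bar{\psi}_2(y)\psi_2(y)\varphi(y)\big]\,|i\rangle \\
&= \vartheta(x_0 - y_0)\,\langle 0|\,b_2(p')b_1(k')\,\bar{\psi}_1(x)\psi_1(x)\varphi(x)\,\bar{\psi}_2(y)\psi_2(y)\varphi(y)\,b_1^\dagger(k)b_2^\dagger(p)\,|0\rangle \\
&\quad + \vartheta(y_0 - x_0)\,\langle 0|\,b_2(p')b_1(k')\,\bar{\psi}_2(y)\psi_2(y)\varphi(y)\,\bar{\psi}_1(x)\psi_1(x)\varphi(x)\,b_1^\dagger(k)b_2^\dagger(p)\,|0\rangle .
\end{aligned} \tag{26.7}$$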

According to our previous experience, we know that the non-vanishing contributions to the
vacuum expectation values like those in (26.7) are given by complete pairings of the annihilation
and creation operators of the same kind. Of course, we exclude pairings between 𝑏 1 (𝑘 ′) and
𝑏 †1 (𝑘), etc., since these correspond to kinematically trivial situation (no real scattering). Taking
into account the operator structure of Dirac fields, one gets for the relevant pairings (determined
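by the basic anticommutators), in the usual notation (the products on the left-hand sides denote pairings)
$$\begin{aligned}
b_1(k')\,\bar{\psi}_1(x) &= N_{k'}\,\bar{u}(k')\,e^{ik'x} , & \psi_1(x)\,b_1^\dagger(k) &= N_k\,u(k)\,e^{-ikx} , \\
b_2(p')\,\bar{\psi}_2(y) &= N_{p'}\,\bar{u}(p')\,e^{ip'y} , & \psi_2(y)\,b_2^\dagger(p) &= N_p\,u(p)\,e^{-ipy} .
\end{aligned} \tag{26.8}$$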

While the expressions (26.8) are factorized from the vacuum matrix elements in (26.7), the
scalar fields “stay inside”, i.e. one is left with their time-ordered product sandwiched between
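the vacuum states. Thus, the integrand in (26.5) becomes
$$\begin{aligned}
\langle f|\,T\big[\bar{\psi}_1(x)&\psi_1(x)\varphi(x)\,\bar{\psi}_2(y)\psi_2(y)\varphi(y)\big]\,|i\rangle \\
&= N_k N_p N_{k'} N_{p'}\; e^{-ikx}e^{ik'x}e^{-ipy}e^{ip'y}\,\big[\bar{u}(k')u(k)\big]\big[\bar{u}(p')u(p)\big]\,\langle 0|\,T\big[\varphi(x)\varphi(y)\big]\,|0\rangle .
\end{aligned} \tag{26.9}$$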

So, now it is in order to examine the new object involving scalar fields in detail. First of all, let
us introduce an appropriate (standard) notation, namely
$$i\mathcal{D}_F(x - y) \equiv \langle 0|\,T\big[\varphi(x)\varphi(y)\big]\,|0\rangle . \tag{26.10}$$




The function DF (𝑥 − 𝑦) is called the Feynman propagator and we have already used the fact
that it is a function of the difference 𝑥 − 𝑦, not 𝑥 and 𝑦 separately. This is easy to prove, as we
are going to find out shortly. It is certainly highly desirable to have an explicit representation of
the function DF , so let us now evaluate the vacuum expectation value in (26.10). The quantized
(free) scalar field has the form
$$\varphi(x) = \int d^3k\; N_k\,\big[a(k)\,e^{-ikx} + a^\dagger(k)\,e^{ikx}\big] . \tag{26.11}$$

Just to be sure, let us recall that in Eq. (26.11), $kx \equiv k\cdot x = k_0x_0 - \vec{k}\cdot\vec{x}$, with $k_0 = \sqrt{|\vec{k}|^2 + m^2}$.
Denoting the annihilation and creation parts of the field $\varphi(x)$ as $\varphi^-(x)$ and $\varphi^+(x)$, respectively,
it holds, clearly,
$$\langle 0|\,T\big[\varphi(x)\varphi(y)\big]\,|0\rangle = \vartheta(x_0 - y_0)\,\langle 0|\,\varphi^-(x)\varphi^+(y)\,|0\rangle + \vartheta(y_0 - x_0)\,\langle 0|\,\varphi^-(y)\varphi^+(x)\,|0\rangle . \tag{26.12}$$


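So, using the explicit expressions appearing in (26.11), one gets
$$\begin{aligned}
\langle 0|\,T\big[\varphi(x)\varphi(y)\big]\,|0\rangle &= \vartheta(x_0 - y_0)\iint d^3k\, d^3l\; N_k N_l\,\langle 0|\,a(k)a^\dagger(l)\,|0\rangle\, e^{-ikx + ily} \\
&\quad + \vartheta(y_0 - x_0)\iint d^3k\, d^3l\; N_k N_l\,\langle 0|\,a(l)a^\dagger(k)\,|0\rangle\, e^{ikx - ily} .
\end{aligned} \tag{26.13}$$
However, one finds out easily that
$$\langle 0|\,a(k)a^\dagger(l)\,|0\rangle = \langle 0|\,a(l)a^\dagger(k)\,|0\rangle = \delta^{(3)}(\vec{k} - \vec{l}) ,$$
as a simple consequence of the canonical commutation relations. So, we have finally
$$\begin{aligned}
\langle 0|\,T\big[\varphi(x)\varphi(y)\big]\,|0\rangle &= \vartheta(x_0 - y_0)\int d^3k\; N_k^2\, e^{-ik(x-y)} \\
&\quad + \vartheta(y_0 - x_0)\int d^3k\; N_k^2\, e^{ik(x-y)} .
\end{aligned} \tag{26.14}$$
The last result makes it obvious that the above statement about the dependence of the function
$\mathcal{D}_F$ on $x - y$ is valid; one may notice that technically, it is due to the commutation relation
$\big[a(k), a^\dagger(l)\big] = \delta^{(3)}(\vec{k} - \vec{l})$.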
A remark is in order here. In fact, there is another explanation of the 𝑥 − 𝑦 dependence
manifested now in (26.14), which is more general and in a sense more elegant than the explicit
calculation. Indeed, one may employ the relation

𝜑(𝑥 + 𝑎) = 𝑒𝑖𝑃·𝑎 𝜑(𝑥)𝑒 −𝑖𝑃·𝑎 (26.15)

(see (16.33)), where $a$ is a constant spacetime shift and $P$ is the operator of four-momentum of
the quantized field $\varphi(x)$, i.e.
$$P^0 = \int d^3k\; k^0\, a^\dagger(k)a(k) , \qquad \vec{P} = \int d^3k\; \vec{k}\, a^\dagger(k)a(k) . \tag{26.16}$$

Thus,
\[
\langle 0|\varphi(x+a)\varphi(y+a)|0\rangle = \langle 0|e^{iP\cdot a}\,\varphi(x)\varphi(y)\,e^{-iP\cdot a}|0\rangle ,
\]
but $e^{-iP\cdot a}|0\rangle = |0\rangle$, since $P^\mu|0\rangle = 0$ due to the normal ordering in the expressions (26.16). In this way, we see that
\[
\langle 0|\varphi(x+a)\varphi(y+a)|0\rangle = \langle 0|\varphi(x)\varphi(y)|0\rangle ,
\]
and this in turn means that the vacuum expectation value in (26.10) is invariant under spacetime translations; this, of course, is tantamount to the statement that $\mathcal{D}_F$ depends just on $x-y$.
Having in mind our ultimate goal of constructing an appropriate momentum-space Feynman diagram representing the scattering amplitude contained in (26.5), one should endeavour to get the four-dimensional Fourier transform of the function $\mathcal{D}_F(x-y)$. To this end, one can utilize the known formula for the Fourier transformation of the $\vartheta$-function, namely
\[
\vartheta(t) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} d\omega\, \frac{e^{i\omega t}}{\omega - i\epsilon} , \tag{26.17}
\]

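where $\epsilon > 0$ is an arbitrarily small constant. Using this in the expression (26.14) and taking into account that $N_k^2 = (2\pi)^{-3}(2k_0)^{-1}$ one has, after a trivial manipulation,
\[
\langle 0|T\big(\varphi(x)\varphi(y)\big)|0\rangle
= \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}
\left[ \frac{e^{i(\omega-k_0)(x_0-y_0)}}{\omega - i\epsilon}\, e^{i\vec{k}\cdot(\vec{x}-\vec{y})}
+ \frac{e^{i(\omega-k_0)(y_0-x_0)}}{\omega - i\epsilon}\, e^{-i\vec{k}\cdot(\vec{x}-\vec{y})} \right] . \tag{26.18}
\]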

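The integration variable $\omega$ can be shifted by means of the substitution $\omega - k_0 = \omega'$. Further, in the second term in (26.18), one may perform the change $\omega' \to -\omega'$ and in the first term one substitutes $\vec{k} \to -\vec{k}$. After these simple manipulations one has (renaming $\omega'$ as $\omega$)
\[
\langle 0|T\big(\varphi(x)\varphi(y)\big)|0\rangle
= \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\,
e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})}
\left[ \frac{1}{\omega + k_0 - i\epsilon} + \frac{1}{-\omega + k_0 - i\epsilon} \right] . \tag{26.19}
\]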
The sum of the fractions in the integrand in (26.19) becomes
\[
\frac{1}{\omega + k_0 - i\epsilon} - \frac{1}{\omega - k_0 + i\epsilon} = \frac{-2k_0}{\omega^2 - k_0^2 + 2ik_0\epsilon} ,
\]
where we have discarded irrelevant terms involving the auxiliary parameter $\epsilon \to 0^+$. As for the term $2ik_0\epsilon$ in the denominator, this is in fact equivalent to $i\epsilon$, since $k_0 = \sqrt{|\vec{k}|^2 + m^2} \geq m > 0$ and $\epsilon$ is taken to be infinitesimal, but otherwise arbitrary. Thus, (26.19) can be recast as
\[
\langle 0|T\big(\varphi(x)\varphi(y)\big)|0\rangle
= i\int \frac{d^3k\, d\omega}{(2\pi)^4}\, e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})}\, \frac{1}{\omega^2 - k_0^2 + i\epsilon} . \tag{26.20}
\]
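The algebra connecting the two fractions in (26.19) with the single denominator in (26.20) can be checked numerically. The following is only a sketch in plain Python; the function names and the sample values of $\omega$, $k_0$ and $\epsilon$ are illustrative assumptions, not part of the lecture notes. It compares the sum of the fractions with the exact combined form and with the truncated form quoted in the text, in which terms of relative order $\epsilon$ are dropped.

```python
def fraction_sum(omega, k0, eps):
    """Sum of the two fractions in the integrand of (26.19)."""
    return 1 / (omega + k0 - 1j * eps) + 1 / (-omega + k0 - 1j * eps)


def exact_combined(omega, k0, eps):
    """The same sum over a common denominator, nothing discarded."""
    return (-2 * k0 + 2j * eps) / (omega**2 - (k0 - 1j * eps) ** 2)


def truncated(omega, k0, eps):
    """Form quoted in the text, with the subleading eps terms dropped."""
    return -2 * k0 / (omega**2 - k0**2 + 2j * k0 * eps)


# Illustrative off-shell point (omega != k0), chosen arbitrarily.
omega, k0 = 0.7, 1.3
for eps in (1e-2, 1e-4, 1e-6):
    lhs = fraction_sum(omega, k0, eps)
    err_exact = abs(lhs - exact_combined(omega, k0, eps))
    err_truncated = abs(lhs - truncated(omega, k0, eps))
    print(eps, err_exact, err_truncated)
```

The deviation from the truncated form should shrink in proportion to $\epsilon$, which is the sense in which the discarded terms are irrelevant for $\epsilon \to 0^+$.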

Now, taking into account that $k_0^2 = |\vec{k}|^2 + m^2$, and introducing the notation $q = (\omega, \vec{k})$, i.e. $\omega = q_0$ and $\vec{q} = \vec{k}$, the result (26.20) can be written finally as
\[
\langle 0|T\big(\varphi(x)\varphi(y)\big)|0\rangle = i\int \frac{d^4q}{(2\pi)^4}\, \frac{e^{iq(x-y)}}{q^2 - m^2 + i\epsilon} , \tag{26.21}
\]
which is the desired four-dimensional Fourier representation. It also shows the utility of the
conventional factor of 𝑖 in the definition (26.10); for the function DF (𝑥 − 𝑦) we have

\[
\mathcal{D}_F(x-y) = \int \frac{d^4q}{(2\pi)^4}\, \frac{e^{iq(x-y)}}{q^2 - m^2 + i\epsilon} . \tag{26.22}
\]
Thus, we see that the Fourier transform of the function DF (𝑥 − 𝑦) is remarkably simple; denoting
it as 𝐷 F (𝑞), one has
\[
D_F(q) = \frac{1}{q^2 - m^2 + i\epsilon} \tag{26.23}
\]
(on the other hand, the function DF (𝑥) is quite complicated for 𝑚 ≠ 0, but we will not need
its explicit form). The simple form (26.23) is not accidental; the point is that the function DF
defined by (26.10) is in fact a Green’s function of the Klein–Gordon equation. We are going to
prove it now ab initio (i.e. without any reference to our preceding calculation).
So, our task is to find out how the d’Alembert operator acts on the T-product in (26.10).
In our calculation we will employ the canonical equal-time commutation relations (ETCR)

\[
[\varphi(x), \varphi(y)]_{\rm E.T.} = 0 , \tag{26.24}
\]
\[
[\varphi(x), \dot{\varphi}(y)]_{\rm E.T.} = i\delta^{(3)}(\vec{x}-\vec{y}) , \tag{26.25}
\]
and the familiar Klein–Gordon equation
\[
(\Box + m^2)\varphi(x) = 0 . \tag{26.26}
\]

Let us start with the time derivative 𝜕0 = 𝜕/𝜕𝑥 0 . To proceed, the readers should recall any good
math course, where they learned that 𝜕0 𝜗(𝑥 0 − 𝑦 0 ) = 𝛿(𝑥 0 − 𝑦 0 ). Then one has

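\[
\begin{aligned}
\partial_0 T\big(\varphi(x)\varphi(y)\big) ={}& \partial_0\big[\vartheta(x_0-y_0)\varphi(x)\varphi(y) + \vartheta(y_0-x_0)\varphi(y)\varphi(x)\big] \\
={}& \delta(x_0-y_0)\varphi(x)\varphi(y) + \vartheta(x_0-y_0)\dot{\varphi}(x)\varphi(y) \\
& - \delta(x_0-y_0)\varphi(y)\varphi(x) + \vartheta(y_0-x_0)\varphi(y)\dot{\varphi}(x) \\
={}& \delta(x_0-y_0)\,[\varphi(x),\varphi(y)]_{\rm E.T.} \\
& + \vartheta(x_0-y_0)\dot{\varphi}(x)\varphi(y) + \vartheta(y_0-x_0)\varphi(y)\dot{\varphi}(x) .
\end{aligned}
\tag{26.27}
\]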

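Now, taking into account (26.24), the second derivative becomes
\[
\begin{aligned}
\partial_0\partial_0 T\big(\varphi(x)\varphi(y)\big) ={}& \delta(x_0-y_0)\dot{\varphi}(x)\varphi(y) + \vartheta(x_0-y_0)\ddot{\varphi}(x)\varphi(y) \\
& - \delta(x_0-y_0)\varphi(y)\dot{\varphi}(x) + \vartheta(y_0-x_0)\varphi(y)\ddot{\varphi}(x) ,
\end{aligned}
\tag{26.28}
\]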

and using (26.25) one is then left with
\[
\partial_0\partial_0 T\big(\varphi(x)\varphi(y)\big) = \vartheta(x_0-y_0)\ddot{\varphi}(x)\varphi(y) + \vartheta(y_0-x_0)\varphi(y)\ddot{\varphi}(x) - i\delta^{(4)}(x-y) . \tag{26.29}
\]

The differentiation with respect to spatial coordinates is much simpler, since the 𝜗-functions
then do not get in the way. So, for the action of the Laplacian one gets immediately

\[
\Delta_x T\big(\varphi(x)\varphi(y)\big) = \vartheta(x_0-y_0)\,\Delta\varphi(x)\,\varphi(y) + \vartheta(y_0-x_0)\,\varphi(y)\,\Delta\varphi(x) . \tag{26.30}
\]




Putting together the results (26.29) and (26.30), the Klein–Gordon equation (26.26) may be
utilized and one obtains, finally,

\[
(\Box_x + m^2)\, T\big(\varphi(x)\varphi(y)\big) = -i\delta^{(4)}(x-y) .
\]




This in turn means that the function DF (𝑥 − 𝑦) defined by (26.10) satisfies the differential
equation
\[
(\Box_x + m^2)\,\mathcal{D}_F(x-y) = -\delta^{(4)}(x-y) , \tag{26.31}
\]
and this is indeed the advertised equation for the Green’s function. Now it also becomes obvious that the Fourier transform $D_F(q)$ must have the form $(q^2 - m^2)^{-1}$, at least for $q^2 \neq m^2$: the point is that the d’Alembert operator $\Box$ is turned into $-q^2$ upon Fourier transformation and the delta function on the right-hand side of Eq. (26.31) becomes 1. We know that the equation
(26.31) has infinitely many solutions, consisting of a particular solution of the inhomogeneous

Klein–Gordon equation and the general solution of its homogeneous counterpart. The specific
form of the Feynman propagator (26.23) involving 𝑖𝜖 (which serves as a regularization of the pole
singularity for 𝑞 2 = 𝑚 2 ) is the choice that corresponds to the so-called causal Green’s function.
We will not discuss this concept here, since it is not necessary for our description of momentum-
space Feynman diagrams (for some details concerning Green’s functions in relativistic quantum
theory see e.g. the books [2], [6] or [16]).
With the results (26.22) and (26.23) at hand, let us now come back to the 𝑆-matrix element
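(26.5) and the expression (26.9). Putting all things together, one has
\[
\begin{aligned}
S^{(2)}_{fi} ={}& i^2\cdot i\, g_1 g_2\, N_k N_p N_{k'} N_{p'}\, \big[\bar{u}(k')u(k)\big]\big[\bar{u}(p')u(p)\big] \\
&\times \iint d^4x\, d^4y\; e^{-ikx}\, e^{ik'x}\, e^{-ipy}\, e^{ip'y} \int \frac{d^4q}{(2\pi)^4}\, D_F(q)\, e^{iq(x-y)} .
\end{aligned}
\tag{26.32}
\]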
The integration over 𝑥 and 𝑦 is trivial; one gets
\[
\iint d^4x\, d^4y\; e^{i(-k+k'+q)x}\, e^{i(-p+p'-q)y} = (2\pi)^4\delta^{(4)}(k'-k+q)\,(2\pi)^4\delta^{(4)}(p'-p-q) . \tag{26.33}
\]

The ensuing integration over $q$ then leads to the expected delta function that embodies the overall four-momentum conservation; at the same time, the intermediate delta functions in (26.33) mean that the momentum-space propagator $D_F(q)$ is to be taken here for $q = p' - p$, or $k - k'$ (which is the same). Thus, we have arrived at the result for $S^{(2)}_{fi}$ that can be written in the by now familiar form
\[
S^{(2)}_{fi} = N_k N_p N_{k'} N_{p'}\; i\mathcal{M}^{(2)}_{fi}\; (2\pi)^4\delta^{(4)}(k'+p'-k-p) , \tag{26.34}
\]
with
\[
i\mathcal{M}^{(2)}_{fi} = i^3 g_1 g_2\, \big[\bar{u}(k')u(k)\big]\, D_F(q)\, \big[\bar{u}(p')u(p)\big] . \tag{26.35}
\]
The last expression suggests naturally a graphical representation in the form of the Feynman diagram in Fig. 26.1. Note that when drawing such a diagram, the four-momentum conservation holds at each vertex and the rules for the external lines are the same as before, in the case of 1st order diagrams. Let us also notice that the direction of the internal line, indicated in Fig. 26.1, is in fact not important, since $D_F(q)$ is an even function.

[Fig. 26.1: Electron–muon scattering at the lowest order. The external lines carry the momenta $k$, $p$ (initial state) and $k'$, $p'$ (final state); the internal line carries $q = k - k' = p' - p$.]
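The statement that $D_F(q)$ is an even function follows from the fact that (26.23) depends on $q$ only through $q^2$. A minimal sketch in plain Python makes this explicit; the function names, the finite value of $\epsilon$ and the sample momentum are assumptions introduced here for illustration only.

```python
def minkowski_square(q):
    """q^2 = q0^2 - |q_vec|^2 for q = (q0, q1, q2, q3), metric (+,-,-,-)."""
    q0, q1, q2, q3 = q
    return q0 * q0 - q1 * q1 - q2 * q2 - q3 * q3


def scalar_propagator(q, m, eps=1e-9):
    """Momentum-space propagator (26.23), with a small finite eps."""
    return 1.0 / (minkowski_square(q) - m * m + 1j * eps)


# Arbitrary sample four-momentum (a spacelike one, as in t-channel exchange).
q = (0.3, 1.0, -0.5, 0.2)
minus_q = tuple(-component for component in q)

print(scalar_propagator(q, m=0.14))
print(scalar_propagator(minus_q, m=0.14))
```

Both calls return the same complex number, so reversing the orientation of the internal line (i.e. replacing $q$ by $-q$) leaves the amplitude (26.35) unchanged.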
The usual verbal description of a diagram like Fig. 26.1 refers to an “exchange of the
virtual particle”; one can also say that the interaction of fermions is mediated by the exchange
of the scalar boson. Of course, the adjective “virtual” here means that, in general, 𝑞 2 ≠ 𝑚 2 ;
the would-be pole at 𝑞 2 = 𝑚 2 corresponds to the real “on-shell” particle. Fig. 26.1 and its
intuitive verbal description also provide a natural justification of the term “propagator”: one may continue the standard popular lore by saying that the incident electron emits a virtual
scalar boson, which is captured by the incident muon, and the four-momenta of the initial-state
particles are thereby changed; in such a thought (“gedanken”) process the intermediate scalar
boson propagates between the interaction vertices in the diagram in Fig. 26.1.
In any case, one should take the above-mentioned popular common parlance with a grain
of salt and always keep in mind the simple fact that the internal line in a diagram like Fig. 26.1
is just a graphical representation of a characteristic mathematical function — the propagator
𝐷 F (𝑞).
Let us now turn to a practical aspect of the propagator contribution in Feynman diagram
calculations. The question is what is in fact the role played by the still somewhat mysterious
term 𝑖𝜖 in the propagator denominator. Well, it depends. For instance, in the contribution of the
diagram in Fig. 26.1 the term 𝑖𝜖 can be safely neglected, since in the elastic scattering process
in question one can never encounter the pole at 𝑞 2 = 𝑚 2 ; the point is that here one has always
𝑞 2 ≤ 0 (proving this is left to the reader as an exercise in kinematics). On the other hand, the
infinitesimal $i\epsilon$ term plays quite an important role in the evaluation of closed-loop diagrams, as we will see in later chapters. Some more remarks concerning the problem of the potential pole singularity of the Feynman propagator are deferred to Chapter 29.
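The kinematical statement $q^2 \leq 0$ can be explored numerically before proving it. For $q = k - k'$, with $k$ and $k'$ on-shell momenta of the same particle of mass $M$ (a symbol introduced here only for illustration), one has $q^2 = 2M^2 - 2\,k\cdot k'$, and $k\cdot k' \geq M^2$. The sketch below, in plain Python with hypothetical helper names and arbitrarily chosen sample ranges, draws random pairs of three-momenta and records the largest value of $q^2$ found.

```python
import math
import random


def on_shell(mass, p_vec):
    """Four-momentum (E, px, py, pz) with E = sqrt(|p|^2 + mass^2)."""
    energy = math.sqrt(sum(c * c for c in p_vec) + mass * mass)
    return (energy, *p_vec)


def minkowski_square(q):
    """q^2 with metric (+,-,-,-)."""
    return q[0] ** 2 - sum(c * c for c in q[1:])


def momentum_transfer_squared(mass, p_in, p_out):
    """(k - k')^2 for on-shell k, k' of the same mass."""
    k = on_shell(mass, p_in)
    k_prime = on_shell(mass, p_out)
    q = tuple(a - b for a, b in zip(k, k_prime))
    return minkowski_square(q)


random.seed(0)  # fixed seed so that the scan is reproducible
worst = max(
    momentum_transfer_squared(
        0.5,
        [random.uniform(-3, 3) for _ in range(3)],
        [random.uniform(-3, 3) for _ in range(3)],
    )
    for _ in range(1000)
)
print(worst)  # largest q^2 found in the scan
```

Such a scan is of course no substitute for the proof left as an exercise; it merely illustrates that the momentum transfer between two on-shell states of equal mass never becomes timelike, so the pole at $q^2 = m^2 > 0$ is out of reach.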
In closing this chapter, a remark is in order here concerning the form of the propagator
in coordinate space. As we have already mentioned before, the explicit expression for DF (𝑥)
is quite complicated for 𝑚 ≠ 0; indeed, one needs all sorts of Bessel functions for its full
description (see e.g. [16]). Nevertheless, for 𝑚 = 0 the result is quite simple; in such a case one
gets
\[
\mathcal{D}_F(x) = -\frac{1}{4\pi}\,\delta(x^2) + \frac{i}{4\pi^2}\,\frac{1}{x^2} ,
\]
which can be recast as
\[
\mathcal{D}_F(x) = \frac{i}{4\pi^2}\,\frac{1}{x^2 - i\epsilon} . \tag{26.36}
\]
In fact, such a simple form is not so surprising, on dimensional grounds; by its definition, the
scalar field propagator has obviously the dimension of inverse length squared, so in the absence
of a mass scale, the only functional forms with such a dimension are just 𝛿(𝑥 2 ) and 1/𝑥 2 . The
above-mentioned formulae will not be needed in our subsequent treatment of perturbative QFT,
but provide at least a nice illustration of the singular behaviour of massless field theories.

Chapter 27

Propagator of Dirac field

Taking into account the results of the preceding chapter, one may naturally expect that the concept
of the propagator is relevant for any quantized field. The case we are going to consider now is
the Dirac field. To have a clear motivation for an appropriate definition of the corresponding
propagator, we will employ another variant of a Yukawa-type interaction, namely

\[
\mathcal{L}_{\rm int} = g\,\bar{\psi}_1\psi_2\,\varphi + g\,\bar{\psi}_2\psi_1\,\varphi^\dagger , \tag{27.1}
\]

where $\psi_1$, $\psi_2$ are two different Dirac fields (corresponding to fermions $f_1$, $f_2$) and $\varphi$ is a complex
Klein–Gordon field (the corresponding particles will be denoted as $\varphi^-$, $\varphi^+$). Note that the
interaction Lagrangian (27.1) represents just a “toy model”, since there is actually no realistic
physical system that would be described in such a way; nevertheless, it is good enough to set the
stage for our purpose.
Let us consider the process of pair production of charged scalars in the fermion–
antifermion annihilation, i.e.
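\[
f_1 + \bar{f}_1 \to \varphi^- + \varphi^+ , \tag{27.2}
\]
with four-momenta $k$, $l$ for $f_1$, $\bar{f}_1$ and $p$, $r$ for $\varphi^-$, $\varphi^+$. In the usual notation, the initial and final states are defined as
\[
|i\rangle = b_1^\dagger(k)\,d_1^\dagger(l)|0\rangle , \qquad |f\rangle = b_\varphi^\dagger(p)\,d_\varphi^\dagger(r)|0\rangle . \tag{27.3}
\]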

Clearly, there is a non-trivial contribution to such a process in the 2nd order of the Dyson
expansion of the 𝑆-matrix. Following analogous arguments as in the preceding chapter, it is not
difficult to realize that the relevant 𝑆-matrix element can be written immediately as
\[
S^{(2)}_{fi} = i^2 g^2 \iint d^4x\, d^4y\; \langle f|T\big(\bar{\psi}_1(x)\psi_2(x)\varphi(x)\,\bar{\psi}_2(y)\psi_1(y)\varphi^\dagger(y)\big)|i\rangle . \tag{27.4}
\]


Let us examine the matrix element in the integrand. It reads

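\[
\begin{aligned}
&\langle f|T\big(\bar{\psi}_1(x)\psi_2(x)\varphi(x)\,\bar{\psi}_2(y)\psi_1(y)\varphi^\dagger(y)\big)|i\rangle \\
&\quad = \vartheta(x_0-y_0)\,\langle 0|b_\varphi(p)d_\varphi(r)\,\bar{\psi}_{1a}(x)\psi_{2a}(x)\varphi(x)\,\bar{\psi}_{2b}(y)\psi_{1b}(y)\varphi^\dagger(y)\,b_1^\dagger(k)d_1^\dagger(l)|0\rangle \\
&\qquad + \vartheta(y_0-x_0)\,\langle 0|b_\varphi(p)d_\varphi(r)\,\bar{\psi}_{2b}(y)\psi_{1b}(y)\varphi^\dagger(y)\,\bar{\psi}_{1a}(x)\psi_{2a}(x)\varphi(x)\,b_1^\dagger(k)d_1^\dagger(l)|0\rangle ,
\end{aligned}
\tag{27.5}
\]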

where we have introduced explicitly the bispinor indices of the Dirac field operators with regard
to the later appearance of 4 × 4 matrices. Now we can utilize our universal finding concerning
the vacuum expectation values like those shown in (27.5): the field operators have to be paired
completely, in the usual sense. Here it means that $\psi_{1b}(y)$ is paired with $b_1^\dagger(k)$, $\bar{\psi}_{1a}(x)$ is paired with $d_1^\dagger(l)$, and $b_\varphi(p)$, $d_\varphi(r)$ get paired with $\varphi^\dagger(y)$ and $\varphi(x)$, respectively. Looking carefully at the fermion pairings in the two terms in (27.5), one should notice that when moving, say, $\psi_{1b}(y)$ to its counterpart $b_1^\dagger(k)$ in the second term, the operator $\bar{\psi}_{1a}(x)$ gets in the way, so that there is one more anticommutation as compared with the first term (where $\psi_{1b}(y)$ and $b_1^\dagger(k)$ are
neighbours). Thus, one may conclude that there is a relative minus sign in the contributions of
the two terms in the expression (27.5). (It is instructive to get back for a moment to the preceding
chapter, in order to see that there is no such thing in the evaluation of the expression (26.7).)
The operator pairings have the familiar form and so it is easy to see that the matrix
element (27.5) can be recast as
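\[
\begin{aligned}
&\langle f|T\big(\bar{\psi}_1(x)\psi_2(x)\varphi(x)\,\bar{\psi}_2(y)\psi_1(y)\varphi^\dagger(y)\big)|i\rangle \\
&\quad = N_k N_l N_p N_r\; e^{-ilx}\, e^{-iky}\, e^{irx}\, e^{ipy} \\
&\qquad \times \bar{v}_a(l)\Big[\vartheta(x_0-y_0)\,\langle 0|\psi_{2a}(x)\bar{\psi}_{2b}(y)|0\rangle - \vartheta(y_0-x_0)\,\langle 0|\bar{\psi}_{2b}(y)\psi_{2a}(x)|0\rangle\Big] u_b(k) .
\end{aligned}
\tag{27.6}
\]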
Now, it is natural to denote the expression in square brackets as the T-product of Dirac field
operators and subsequently we can also define the propagator of Dirac field by means of the
relation
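\[
i\,\mathcal{S}_{F\,ab}(x-y) = \langle 0|T\big(\psi_a(x)\bar{\psi}_b(y)\big)|0\rangle , \tag{27.7}
\]
with
\[
T\big(\psi_a(x)\bar{\psi}_b(y)\big) = \vartheta(x_0-y_0)\,\psi_a(x)\bar{\psi}_b(y) - \vartheta(y_0-x_0)\,\bar{\psi}_b(y)\psi_a(x) . \tag{27.8}
\]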


We have omitted the label 2, since the relations (27.7), (27.8) will serve as general definitions
from now on. Note also that the symbol SF for the Feynman propagator of Dirac field refers
to the spinor nature of the field in question. The dependence of the matrix function SF on the
difference 𝑥 − 𝑦 can be proven analogously as in the case of the scalar field.
In this way, we have arrived at a generalization of the definition of chronological product:
for fermion fields we have to take it with change of sign when passing from 𝑥 0 > 𝑦 0 to 𝑥 0 < 𝑦 0 .
This, of course, is not in contradiction with the original definition of the T-product à la Dyson:
the Lagrangian density is a composite scalar and we already know that even for an elementary
scalar field the original definition of the T-product is pertinent. So, the next step is an explicit
evaluation of the propagator SF (𝑥 − 𝑦). Employing the familiar representation of the Dirac field
in terms of creation and annihilation operators, one has
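\[
\begin{aligned}
&\langle 0|T\big(\psi_a(x)\bar{\psi}_b(y)\big)|0\rangle \\
&\quad = \vartheta(x_0-y_0)\sum_{s,s'}\iint d^3k\, d^3k'\, N_k N_{k'}\; u_a(k,s)\,\bar{u}_b(k',s')\,\langle 0|b(k,s)\,b^\dagger(k',s')|0\rangle\, e^{-ikx+ik'y} \\
&\qquad - \vartheta(y_0-x_0)\sum_{s,s'}\iint d^3k\, d^3k'\, N_k N_{k'}\; \bar{v}_b(k',s')\,v_a(k,s)\,\langle 0|d(k',s')\,d^\dagger(k,s)|0\rangle\, e^{ikx-ik'y} .
\end{aligned}
\tag{27.9}
\]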
Using the standard anticommutation relations, the vacuum expectation values in Eq. (27.9) are equal to $\delta_{ss'}\,\delta^{(3)}(\vec{k}-\vec{k}')$ and one thus gets
\[
\begin{aligned}
\langle 0|T\big(\psi_a(x)\bar{\psi}_b(y)\big)|0\rangle ={}& \vartheta(x_0-y_0)\sum_s\int d^3k\, N_k^2\; u_a(k,s)\,\bar{u}_b(k,s)\, e^{-ikx+iky} \\
& - \vartheta(y_0-x_0)\sum_s\int d^3k\, N_k^2\; v_a(k,s)\,\bar{v}_b(k,s)\, e^{ikx-iky} \\
={}& \vartheta(x_0-y_0)\int d^3k\, \frac{1}{(2\pi)^3\, 2k_0}\,(\slashed{k}+m)_{ab}\, e^{-ik(x-y)} \\
& - \vartheta(y_0-x_0)\int d^3k\, \frac{1}{(2\pi)^3\, 2k_0}\,(\slashed{k}-m)_{ab}\, e^{ik(x-y)} ,
\end{aligned}
\tag{27.10}
\]
where we have also utilized the “completeness relations” (6.24), (6.30) for $u(k,s)$ and $v(k,s)$.
Next, one employs the Fourier representation of the 𝜗-functions and Eq. (27.10) then
becomes
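\[
\begin{aligned}
\langle 0|T\big(\psi_a(x)\bar{\psi}_b(y)\big)|0\rangle ={}& \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\,(\slashed{k}+m)_{ab}\, \frac{e^{i(\omega-k_0)(x_0-y_0)}}{\omega - i\epsilon}\, e^{i\vec{k}\cdot(\vec{x}-\vec{y})} \\
& - \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\,(\slashed{k}-m)_{ab}\, \frac{e^{i(\omega-k_0)(y_0-x_0)}}{\omega - i\epsilon}\, e^{-i\vec{k}\cdot(\vec{x}-\vec{y})} .
\end{aligned}
\tag{27.11}
\]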

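Performing now the same substitutions as in the case of scalar field (cf. (26.18) and (26.19)), the expression (27.11) is recast as
\[
\begin{aligned}
\langle 0|T\big(\psi_a(x)\bar{\psi}_b(y)\big)|0\rangle ={}& \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\,(\tilde{\slashed{k}}+m)_{ab}\, \frac{e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})}}{\omega + k_0 - i\epsilon} \\
& - \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\,(\slashed{k}-m)_{ab}\, \frac{e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})}}{-\omega + k_0 - i\epsilon} ,
\end{aligned}
\tag{27.12}
\]
where $\tilde{k} = (k_0, -\vec{k})$, i.e. $\tilde{\slashed{k}} = k_0\gamma_0 + \vec{k}\cdot\vec{\gamma}$, while $\slashed{k} = k_0\gamma_0 - \vec{k}\cdot\vec{\gamma}$. The integrals in (27.12) can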
then be reorganized as follows (omitting from now on the matrix indices 𝑎, 𝑏):

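\[
\begin{aligned}
&\langle 0|T\big(\psi(x)\bar{\psi}(y)\big)|0\rangle \\
&\quad = \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\; k_0\gamma_0 \left[\frac{1}{\omega + k_0 - i\epsilon} - \frac{1}{-\omega + k_0 - i\epsilon}\right] e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})} \\
&\qquad + \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\; (\vec{k}\cdot\vec{\gamma}+m) \left[\frac{1}{\omega + k_0 - i\epsilon} + \frac{1}{-\omega + k_0 - i\epsilon}\right] e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})} .
\end{aligned}
\tag{27.13}
\]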

Then, after some simple manipulations analogous to those carried out in the scalar case, one
gets
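\[
\langle 0|T\big(\psi(x)\bar{\psi}(y)\big)|0\rangle = \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4}\; (\omega\gamma_0 - \vec{k}\cdot\vec{\gamma} - m)\, \frac{1}{\omega^2 - k_0^2 + i\epsilon}\; e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})} . \tag{27.14}
\]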

Thus, taking into account that $k_0^2 = |\vec{k}|^2 + m^2$, introducing the notation $q = (\omega, \vec{k})$, and performing finally the substitution $q \to -q$, (27.14) can be rewritten as
\[
\langle 0|T\big(\psi(x)\bar{\psi}(y)\big)|0\rangle = i\int \frac{d^4q}{(2\pi)^4}\, \frac{\slashed{q}+m}{q^2 - m^2 + i\epsilon}\, e^{-iq(x-y)} . \tag{27.15}
\]
According to the definition (27.7) one then has for the propagator of Dirac field

\[
\mathcal{S}_F(x-y) = \int \frac{d^4q}{(2\pi)^4}\, \frac{\slashed{q}+m}{q^2 - m^2 + i\epsilon}\, e^{-iq(x-y)} . \tag{27.16}
\]
So, in the momentum-space representation it reads simply
\[
S_F(q) = \frac{\slashed{q}+m}{q^2 - m^2 + i\epsilon} . \tag{27.17}
\]
A QFT newcomer should thus remember that $S_F(q)$ is a $4\times 4$ matrix. Note that the last expression can also be written as (neglecting the infinitesimal term $i\epsilon$ for a moment)
\[
S_F(q) = \frac{1}{\slashed{q}-m} , \tag{27.18}
\]
where, of course, $1/(\slashed{q}-m) = (\slashed{q}-m)^{-1}$. It is easy to see that the expressions (27.18) and (27.17) are equivalent. Indeed, taking into account that $\slashed{q}\slashed{q} = q^2$, one has, obviously,
\[
(\slashed{q}-m)\,\frac{\slashed{q}+m}{q^2 - m^2} = 1 ,
\]
and that’s it. In fact, (27.18) is the most common form used for the momentum-space propagator $S_F(q)$.
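The equivalence of (27.17) and (27.18) can also be verified numerically with explicit matrices. The sketch below uses plain Python lists and assumes the standard (Dirac) representation of the $\gamma$-matrices; the helper names and the sample values of $q$ and $m$ are illustrative choices, and the $i\epsilon$ term is dropped since the chosen $q$ is off-shell.

```python
# gamma^0 ... gamma^3 in the Dirac representation (an assumed, standard choice).
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGNS = (1, -1, -1, -1)  # diagonal of the metric (+,-,-,-)


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def slash(q):
    """q-slash = q0*gamma^0 - q1*gamma^1 - q2*gamma^2 - q3*gamma^3."""
    return [[sum(SIGNS[mu] * q[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]


def minkowski_square(q):
    return sum(SIGNS[mu] * q[mu] ** 2 for mu in range(4))


def dirac_propagator(q, m):
    """(q-slash + m) / (q^2 - m^2), i.e. (27.17) without the i*eps term."""
    qs = slash(q)
    denom = minkowski_square(q) - m * m
    return [[(qs[i][j] + m * IDENTITY[i][j]) / denom for j in range(4)]
            for i in range(4)]


def q_slash_minus_m(q, m):
    qs = slash(q)
    return [[qs[i][j] - m * IDENTITY[i][j] for j in range(4)] for i in range(4)]


q, m = (2.0, 0.3, -0.4, 0.5), 1.0  # arbitrary off-shell sample point
product = matmul(q_slash_minus_m(q, m), dirac_propagator(q, m))
max_dev = max(abs(product[i][j] - IDENTITY[i][j])
              for i in range(4) for j in range(4))
print(max_dev)  # deviation of (q-slash - m) S_F(q) from the unit matrix

minus_q = tuple(-c for c in q)
print(dirac_propagator(q, m)[0][0], dirac_propagator(minus_q, m)[0][0])
```

The first printed number should be at the level of rounding errors. The second line shows that a matrix element of $S_F(q)$ changes under $q \to -q$, in contrast with the scalar propagator.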

[Fig. 27.1: The tree-level diagram for the process (27.2). The external lines are labelled $k$, $-l$, $p$ and $r$; the internal fermion line carries $q = k - p = r - l$.]

Let us now return to the 𝑆-matrix element for the annihilation process (27.2). Employing
the expressions (27.4), (27.6), (27.7) and (27.16), one has
\[
S^{(2)}_{fi} = i^2\cdot i\, g^2\, N_k N_l N_p N_r \iint d^4x\, d^4y \int \frac{d^4q}{(2\pi)^4}\; e^{-ilx-iky+irx+ipy}\, e^{-iq(x-y)}\; \bar{v}(l)\,S_F(q)\,u(k) . \tag{27.19}
\]
The integrations over 𝑥 and 𝑦 lead to the product of delta functions
\[
\delta^{(4)}(-l+r-q)\;\delta^{(4)}(-k+p+q) , \tag{27.20}
\]
and the subsequent integration over $q$ produces the anticipated delta function $\delta^{(4)}(p+r-k-l)$ for the overall four-momentum conservation. The delta functions (27.20) show that the variable $q$ is effectively equal to $r-l$ or $k-p$ (which is the same). Thus, we get finally
\[
S^{(2)}_{fi} = N_k N_l N_p N_r\; i\mathcal{M}^{(2)}_{fi}\; (2\pi)^4\delta^{(4)}(p+r-k-l) ,
\]
with
\[
i\mathcal{M}^{(2)}_{fi} = i^3 g^2\, \bar{v}(l)\,S_F(q)\,u(k) , \tag{27.21}
\]
and such an expression can be represented graphically as shown in Fig. 27.1.
In this case we see that the direction of the internal fermion line, corresponding to the
Dirac field propagator, is important, since 𝑆F (𝑞) is not an even function. So, in this way we have
extended our catalogue of Feynman rules by including diagrams involving another type of the
internal line. The other rules implemented in the contribution of the diagram in Fig. 27.1 are
reproduced here.
Another remark is perhaps in order. The simple expression (27.18) indicates that the
propagator of quantized Dirac field can also be interpreted as an appropriate Green’s function of
the Dirac equation. It is indeed so, but we will not discuss this theme here; an interested reader
is encouraged to examine this problem independently.

Chapter 28

Propagator of massive vector field

Proceeding in the spirit of the previous two chapters, let us now consider a model of the interaction
of massive vector (Proca) field with fermions, and an appropriate scattering process that will
lead us directly to the propagator of such a vector field. We will see that the straightforward
calculation will bring a new ingredient to our perturbative treatment of the relevant 𝑆-matrix
elements.
So, let our model be defined by the interaction Lagrangian
\[
\mathcal{L}_{\rm int} = g\left(\bar{\psi}_1\gamma^\mu\psi_1 + \bar{\psi}_2\gamma^\mu\psi_2\right) A_\mu , \tag{28.1}
\]
where $\psi_1$, $\psi_2$ are two different Dirac fields (the corresponding fermions being denoted as $f_1$, $f_2$)
and 𝐴 𝜇 is a (real) massive vector field. Note that such a model can be understood as quantum
electrodynamics (QED) with a spin-1 “massive photon”. The process we are going to examine
is the elastic scattering
\[
f_1 + f_2 \to f_1 + f_2 , \tag{28.2}
\]
where the particle four-momenta can be denoted e.g. as $k$, $p$ for the initial state and $k'$, $p'$ for the final state. Thus, the initial and final state are defined as
\[
|i\rangle = b_1^\dagger(k)\,b_2^\dagger(p)|0\rangle , \qquad |f\rangle = b_1^\dagger(k')\,b_2^\dagger(p')|0\rangle . \tag{28.3}
\]
We will start our calculation by assuming bona fide that the Dyson expansion in terms of powers
of the interaction Lagrangian is valid (though we will see later on that this is a subtle point to
be clarified separately). The first few steps are then almost identical with what we did before in
the case of Yukawa-type interaction involving a scalar field. In the present case, when working
out the corresponding 𝑆-matrix element, we encounter the vacuum expectation value of the
T-product of vector fields
\[
\langle 0|T\big(A_\mu(x)A_\nu(y)\big)|0\rangle = \vartheta(x_0-y_0)\,\langle 0|A_\mu(x)A_\nu(y)|0\rangle + \vartheta(y_0-x_0)\,\langle 0|A_\nu(y)A_\mu(x)|0\rangle , \tag{28.4}
\]


that we would like to relate to the vector field propagator. For the explicit evaluation of the
expression (28.4) we employ the standard representation of 𝐴 𝜇 (𝑥) in terms of the creation and
annihilation operators, i.e. (see (20.10))
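\[
A_\mu(x) = \int d^3k\, N_k \sum_{\lambda=1}^{3}\left[ a(k,\lambda)\,\epsilon_\mu(k,\lambda)\,e^{-ikx} + a^\dagger(k,\lambda)\,\epsilon^*_\mu(k,\lambda)\,e^{ikx} \right] . \tag{28.5}
\]

Inserting (28.5) into (28.4) and using the canonical commutation relation (20.19)
\[
[a(k,\lambda), a^\dagger(k',\lambda')] = \delta_{\lambda\lambda'}\,\delta^{(3)}(\vec{k}-\vec{k}') ,
\]
one gets, after a simple manipulation,
\[
\begin{aligned}
\langle 0|T\big(A_\mu(x)A_\nu(y)\big)|0\rangle ={}& \vartheta(x_0-y_0)\int d^3k\, N_k^2 \sum_{\lambda=1}^{3}\epsilon_\mu(k,\lambda)\,\epsilon^*_\nu(k,\lambda)\, e^{-ik(x-y)} \\
& + \vartheta(y_0-x_0)\int d^3k\, N_k^2 \sum_{\lambda=1}^{3}\epsilon^*_\mu(k,\lambda)\,\epsilon_\nu(k,\lambda)\, e^{ik(x-y)} \\
={}& \int d^3k\, N_k^2\, P_{\mu\nu}(k)\left[\vartheta(x_0-y_0)\,e^{-ik(x-y)} + \vartheta(y_0-x_0)\,e^{ik(x-y)}\right] ,
\end{aligned}
\tag{28.6}
\]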

where $P_{\mu\nu}(k)$ is the well-known polarization sum for massive spin-1 boson (see (12.30))
\[
P_{\mu\nu}(k) = -g_{\mu\nu} + \frac{1}{m^2}\,k_\mu k_\nu \tag{28.7}
\]
(let us stress that $P_{\mu\nu}(k)$ should in fact be written as $P_{\mu\nu}(\vec{k})$, since only $\vec{k}$ are independent variables, and $k_0 = \sqrt{|\vec{k}|^2 + m^2}$). The following steps are routine; the reader may revisit the paradigm of the scalar field for details, if necessary. Upon the relevant manipulations one gets

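\[
\begin{aligned}
&\langle 0|T\big(A_\mu(x)A_\nu(y)\big)|0\rangle \\
&\quad = \frac{1}{i}\int \frac{d^3k\, d\omega}{(2\pi)^4\, 2k_0}\left[ P_{\mu\nu}(-\vec{k})\,\frac{1}{\omega + k_0 - i\epsilon} + P_{\mu\nu}(\vec{k})\,\frac{1}{-\omega + k_0 - i\epsilon} \right] e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})} .
\end{aligned}
\tag{28.8}
\]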
Let us now work out the expression (28.8) step by step, for all possible combinations of
the indices 𝜇, 𝜈.
1) First, let $(\mu,\nu) = (i,j)$, $i,j = 1,2,3$. Then (28.7) becomes
\[
P_{ij}(\vec{k}) = \delta_{ij} + \frac{1}{m^2}\,k_i k_j ,
\]
so that obviously $P_{ij}(-\vec{k}) = P_{ij}(\vec{k})$. Repeating again the simple manipulations performed in the case of scalar field and using the notation $q = (\omega, \vec{k})$, one has
\[
\langle 0|T\big(A_i(x)A_j(y)\big)|0\rangle = i\int \frac{d^4q}{(2\pi)^4}\, P_{ij}(q)\, \frac{1}{q^2 - m^2 + i\epsilon}\, e^{iq(x-y)} . \tag{28.9}
\]
2) Next, for $(\mu,\nu) = (0,j)$, $j = 1,2,3$, we have
\[
P_{0j}(\vec{k}) = \frac{1}{m^2}\,k_0 k_j ,
\]
so that $P_{0j}(-\vec{k}) = -P_{0j}(\vec{k})$. Then the usual sequence of elementary manipulations yields first
\[
\langle 0|T\big(A_0(x)A_j(y)\big)|0\rangle = i\int \frac{d^3k\, d\omega}{(2\pi)^4}\, \frac{1}{m^2}\,\omega k_j\, \frac{1}{\omega^2 - k_0^2 + i\epsilon}\, e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})} , \tag{28.10}
\]
and introducing our favourite four-component variable $q$, the expression (28.10) is recast as
\[
\langle 0|T\big(A_0(x)A_j(y)\big)|0\rangle = i\int \frac{d^4q}{(2\pi)^4}\, P_{0j}(q)\, \frac{1}{q^2 - m^2 + i\epsilon}\, e^{iq(x-y)} . \tag{28.11}
\]

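3) Finally, consider the combination $(\mu,\nu) = (0,0)$. Then
\[
P_{00}(\vec{k}) = -1 + \frac{1}{m^2}\,k_0^2 ,
\]
so that $P_{00}(-\vec{k}) = P_{00}(\vec{k})$. Proceeding as usual, from (28.8) one gets
\[
\langle 0|T\big(A_0(x)A_0(y)\big)|0\rangle = i\int \frac{d^3k\, d\omega}{(2\pi)^4}\left(-1 + \frac{k_0^2}{m^2}\right) \frac{1}{\omega^2 - k_0^2 + i\epsilon}\, e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})} . \tag{28.12}
\]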

Now we can see that there is a problem: in order to recover the 00 component of the $P_{\mu\nu}(q)$ with $q = (\omega, \vec{k})$ in the integrand in (28.12), one would like to have there the factor $-1 + \omega^2/m^2$ rather than $-1 + k_0^2/m^2$! Well, the situation is serious, but not desperate. So as to get the desired term at play, we can rewrite Eq. (28.12) as
\[
\langle 0|T\big(A_0(x)A_0(y)\big)|0\rangle = i\int \frac{d^3k\, d\omega}{(2\pi)^4}\left(-1 + \frac{\omega^2}{m^2} + \frac{k_0^2 - \omega^2}{m^2}\right) \frac{1}{\omega^2 - k_0^2 + i\epsilon}\, e^{i\omega(x_0-y_0) - i\vec{k}\cdot(\vec{x}-\vec{y})} . \tag{28.13}
\]
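The term in parentheses that is proportional to $k_0^2 - \omega^2$ is seen to cancel the denominator of the integrand. Introducing now the variable $q$, the integral (28.13) can be written down as
\[
\begin{aligned}
\langle 0|T\big(A_0(x)A_0(y)\big)|0\rangle ={}& i\int \frac{d^4q}{(2\pi)^4}\, P_{00}(q)\, \frac{1}{q^2 - m^2 + i\epsilon}\, e^{iq(x-y)} - \frac{i}{m^2}\int \frac{d^4q}{(2\pi)^4}\, e^{iq(x-y)} \\
={}& i\int \frac{d^4q}{(2\pi)^4}\, \frac{P_{00}(q)}{q^2 - m^2 + i\epsilon}\, e^{iq(x-y)} - \frac{i}{m^2}\,\delta^{(4)}(x-y) .
\end{aligned}
\tag{28.14}
\]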

Putting all things together, we may write our results in a unified form

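\[
\begin{aligned}
\langle 0|T\big(A_\mu(x)A_\nu(y)\big)|0\rangle ={}& i\int \frac{d^4q}{(2\pi)^4}\left[ \frac{P_{\mu\nu}(q)}{q^2 - m^2 + i\epsilon} - \frac{1}{m^2}\,\delta_{\mu 0}\delta_{\nu 0} \right] e^{iq(x-y)} \\
={}& i\,\mathcal{D}^{\rm covar.}_{\mu\nu}(x-y) - \frac{i}{m^2}\,\delta_{\mu 0}\delta_{\nu 0}\,\delta^{(4)}(x-y) .
\end{aligned}
\tag{28.15}
\]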
Thus, while the first term in (28.15) is a good candidate for covariant Feynman propagator of
the massive vector field, the second term is manifestly non-covariant (of course, we could write
equivalently 𝑔 𝜇0 𝑔𝜈0 instead of 𝛿 𝜇0 𝛿 𝜈0 , but it does not help). On the other hand, the non-covariant
contribution is extremely simple — it is a “contact term”, proportional to the delta function.
In fact, its emergence is closely related to our straightforward definition of the chronological
product (28.4): due to the presence of time-dependent 𝜗-functions it is not manifestly covariant
and the contribution of the whole expression at 𝑥 = 𝑦 is not uniquely defined. In the next chapter
we will uncover a way out of this difficulty; the good news is that when the dust settles, one is left
with just the covariant part of the expression (28.15) and a simple structure of the perturbation
expansion is thereby saved.

Chapter 29

Fate of non-covariant term
in vector boson propagator

In order to solve the difficulty with the non-covariant term in the propagator of massive vector
field (28.15), one has to return to the original form of the Dyson expansion of the 𝑆-matrix.
It is a perturbation series in powers of the interaction Hamiltonian (21.31), in the interaction
picture of time evolution. In field-theory models it is recast in terms of powers of the interaction
Hamiltonian density Hint . We know that in some simple cases, such as e.g. the direct four-
fermion interaction, or a Yukawa-type coupling of scalar field, Hint is equal to the interaction
Lagrangian density with minus sign. We will see that in the theory of massive vector field
interacting with fermions (like e.g. (28.1)) such a simple relation between Hint and Lint may
fail. The reason is that the component 𝐴0 of the vector field is not an independent dynamical
variable — rather it is to be understood as a solution of a constraint equation.
To examine the relation between Hint and Lint in detail, let us consider the full Lagrangian
of the model in question, i.e.
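\[
\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}m^2 A_\mu A^\mu + J_\mu A^\mu + \ldots , \tag{29.1}
\]
with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$; we have denoted here
\[
J_\mu = g\,\bar{\psi}_1\gamma_\mu\psi_1 + g\,\bar{\psi}_2\gamma_\mu\psi_2 . \tag{29.2}
\]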

The ellipsis in (29.1) means the Lagrangian of the free Dirac fields; these terms do not play any
substantial role in our discussion. Now, the essential “strategic” point is that the independent
dynamical variables for the description of the quantized vector field are the components 𝐴 𝑗 ,
𝑗 = 1, 2, 3; the corresponding canonical conjugate momenta are then (see (20.4))

\[
\pi_j = \frac{\partial\mathcal{L}}{\partial(\partial_0 A_j)} = -F^{0j} = F_{0j} . \tag{29.3}
\]

In the Hamiltonian density we will single out the part quadratic in 𝐴 𝑗 and 𝜋 𝑗 (including their
derivatives) and the rest will be identified with Hint .
So much for our strategy; now let us proceed to evaluate the relevant Hamiltonian density
H . This is defined, in general, as the component T 00 of the energy–momentum tensor; so, we
have
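\[
\mathcal{H} = \frac{\partial\mathcal{L}}{\partial(\partial_0 A_j)}\,\partial_0 A_j - \mathcal{L} + \ldots \tag{29.4}
\]
(here and in what follows, three dots always mean irrelevant terms originating in Dirac fields). Using (29.3), one gets
\[
\begin{aligned}
\mathcal{H} ={}& \partial_0 A_j\, F_{0j} - \mathcal{L} + \ldots \\
={}& \left(\partial_0 A_j - \partial_j A_0 + \partial_j A_0\right) F_{0j} \\
& - \left( -\frac{1}{4}F_{0j}F^{0j} - \frac{1}{4}F_{j0}F^{j0} - \frac{1}{4}F_{jk}F^{jk} + \frac{1}{2}m^2 (A_0)^2 - \frac{1}{2}m^2 A_j A_j + J_0 A_0 - J_k A_k \right) + \ldots
\end{aligned}
\]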

After some simple manipulations, and employing again (29.3), it is simplified to


\[
\mathcal{H} = \frac{1}{2}\pi_j\pi_j + \partial_j A_0\, F_{0j} + \frac{1}{4}F_{jk}F_{jk} - \frac{1}{2}m^2 (A_0)^2 + \frac{1}{2}m^2 A_j A_j - J_0 A_0 + J_k A_k + \ldots \tag{29.5}
\]
Now comes a crucial step: we are going to express 𝐴0 in terms of canonical variables by using
the equation of motion corresponding to the Lagrangian L . It is easy to see that the relevant
Euler–Lagrange equation reads

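\[
\partial_\mu F^{\mu\nu} + m^2 A^\nu + J^\nu = 0 ,
\]
i.e. for $\nu = 0$ one gets
\[
\begin{aligned}
A_0 &= -\frac{1}{m^2}\left(\partial_j F_{0j} + J_0\right) \\
&= -\frac{1}{m^2}\left(\partial_j \pi_j + J_0\right) .
\end{aligned}
\tag{29.6}
\]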
Employing now the expression (29.6) in (29.5), one has
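\[
\begin{aligned}
\mathcal{H} ={}& \frac{1}{2}\pi_j\pi_j + \frac{1}{2}m^2 A_j A_j + \frac{1}{4}F_{jk}F_{jk} \\
& - \frac{1}{m^2}\,\partial_j\!\left(\partial_k\pi_k + J_0\right)\pi_j \\
& - \frac{1}{2m^2}\left[\partial_j\pi_j\,\partial_k\pi_k + 2\,\partial_j\pi_j\, J_0 + (J_0)^2\right] \\
& + \frac{1}{m^2}\,\partial_j\pi_j\, J_0 + \frac{1}{m^2}(J_0)^2 + J_k A_k + \ldots
\end{aligned}
\tag{29.7}
\]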
As a next step, one may discard terms that have the form of total derivative, since they do not
contribute upon the integration of the density H over the 3-dimensional space. From (29.7)
one thus gets an equivalent expression
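\[
\begin{aligned}
\mathcal{H} ={}& \frac{1}{2}\pi_j\pi_j + \frac{1}{2}m^2 A_j A_j + \frac{1}{4}F_{jk}F_{jk} + \frac{1}{2m^2}\,\partial_j\pi_j\,\partial_k\pi_k \\
& + \frac{1}{m^2}\,\partial_j\pi_j\, J_0 + \frac{1}{2m^2}(J_0)^2 + J_k A_k + \ldots
\end{aligned}
\tag{29.8}
\]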
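For readers who prefer to see the bookkeeping behind the passage from (29.7) to (29.8), the step can be spelled out as follows. This is only a sketch; the symbol $\simeq$, meaning equality up to a total spatial derivative, is introduced here for illustration. First, the second line of (29.7) is integrated by parts,

```latex
\[
-\frac{1}{m^2}\,\partial_j\!\left(\partial_k\pi_k + J_0\right)\pi_j
\;\simeq\; \frac{1}{m^2}\left(\partial_k\pi_k + J_0\right)\partial_j\pi_j
= \frac{1}{m^2}\,\partial_j\pi_j\,\partial_k\pi_k + \frac{1}{m^2}\,\partial_j\pi_j\,J_0 ,
\]
% Collecting the coefficients of each structure appearing in (29.7):
\[
\begin{aligned}
\partial_j\pi_j\,\partial_k\pi_k &: \quad \frac{1}{m^2} - \frac{1}{2m^2} = \frac{1}{2m^2} , \\
\partial_j\pi_j\,J_0 &: \quad \frac{1}{m^2} - \frac{1}{m^2} + \frac{1}{m^2} = \frac{1}{m^2} , \\
(J_0)^2 &: \quad -\frac{1}{2m^2} + \frac{1}{m^2} = \frac{1}{2m^2} .
\end{aligned}
\]
```

These are precisely the coefficients appearing in (29.8).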
The terms quadratic in canonical variables 𝐴 𝑗 and 𝜋 𝑗 (including their derivatives) constitute the
free part of the Hamiltonian density H and the rest can be identified with the interaction part,
as we have indicated above. Thus, we have (using again (29.3))
\[
\mathcal{H}_{\rm int} = \frac{1}{m^2}\,J_0\,\partial_j F_{0j} + \frac{1}{2m^2}(J_0)^2 + J_k A_k . \tag{29.9}
\]
However, in the Dyson expansion we use the interaction picture, i.e. the expression (29.9) should
be evaluated by inserting there free fields. The equation of motion (29.6) for free fields (i.e. for
𝐽0 = 0) amounts to
\[
\frac{1}{m^2}\,\partial_j F^{\rm free}_{0j} = -A^{\rm free}_0 . \tag{29.10}
\]
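So, denoting $\mathcal{H}_{\rm int}$ in the interaction picture as $\mathcal{H}^{(I)}_{\rm int}$, one has
\[
\begin{aligned}
\mathcal{H}^{(I)}_{\rm int} &= -J_0 A_0 + J_k A_k + \frac{1}{2m^2}(J_0)^2 \\
&= -J_\mu A^\mu + \frac{1}{2m^2}(J_0)^2 \\
&= -\mathcal{L}^{(I)}_{\rm int} + \frac{1}{2m^2}(J_0)^2 .
\end{aligned}
\tag{29.11}
\]
We have thus obtained the result announced above. Indeed, one has $\mathcal{H}_{\rm int} \neq -\mathcal{L}_{\rm int}$, in particular,
\[
\mathcal{H}^{(I)}_{\rm int} = -\mathcal{L}^{(I)}_{\rm int} + \frac{1}{2m^2}(J_0)^2 . \tag{29.12}
\]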
It remains to be clarified if the two “anomalous” effects, namely the non-covariant term
in the propagator (28.15) and the extra term in Hint(I) appearing in (29.12) could somehow cancel
each other. To this end, let us consider the scattering process (28.2) within the model described
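by (29.1), (29.2). Now we know that one has to start correctly with the $S$-matrix operator written as
\[
\begin{aligned}
S &= T\exp\left(-i\int \mathcal{H}_{\rm int}(x)\, d^4x\right) \\
&= 1 - i\int \mathcal{H}_{\rm int}(x)\, d^4x + \frac{(-i)^2}{2!}\iint T\big(\mathcal{H}_{\rm int}(x)\mathcal{H}_{\rm int}(y)\big)\, d^4x\, d^4y + \ldots ,
\end{aligned}
\tag{29.13}
\]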
where Hint is given by Eq. (29.12). In the Dyson expansion we should collect terms of the same
order in the coupling constant 𝑔. In the considered case we are interested in contributions of the
order O (𝑔 2 ). There are obviously two such terms descending from Eq. (29.13), namely
\[
S^{(2)}_1 = -\frac{i}{2m^2}\int J_0^2(x)\, d^4x \tag{29.14}
\]
and
\[
S^{(2)}_2 = \frac{i^2}{2!}\iint T\big(\mathcal{L}_{\rm int}(x)\mathcal{L}_{\rm int}(y)\big)\, d^4x\, d^4y . \tag{29.15}
\]
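As regards (29.14), only its part containing the product $(\bar{\psi}_1\gamma_0\psi_1)(\bar{\psi}_2\gamma_0\psi_2)$ can contribute to the considered process; denoting it as $X_1$ for convenience, one has
\[
X_1 = -i\,\frac{g^2}{m^2}\int \big(\bar{\psi}_1\gamma_0\psi_1\big)\big(\bar{\psi}_2\gamma_0\psi_2\big)\, d^4x . \tag{29.16}
\]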
Concerning (29.15), in the integrand one also has to take into account just the products combining
the fermion fields 1 and 2 (according to our previous experience, the factor 1/2! is thus cancelled).
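Now, there are two types of contributions originating in (29.15). First, the “normal” term leading ultimately to the standard 2nd order Feynman diagram involving the covariant part of the vector-field propagator (28.15). Second, there is the contribution of the non-covariant part of the propagator; denoting it as $X_2$, one gets
\[
\begin{aligned}
X_2 &= i^2 g^2 \iint d^4x\, d^4y\, \big(\bar{\psi}_1(x)\gamma^\mu\psi_1(x)\big)\big(\bar{\psi}_2(y)\gamma^\nu\psi_2(y)\big)\left(-\frac{i}{m^2}\,\delta_{\mu 0}\delta_{\nu 0}\,\delta^{(4)}(x-y)\right) \\
&= i\,\frac{g^2}{m^2}\int \big(\bar{\psi}_1(x)\gamma_0\psi_1(x)\big)\big(\bar{\psi}_2(x)\gamma_0\psi_2(x)\big)\, d^4x .
\end{aligned}
\tag{29.17}
\]

[Fig. 29.1: 2nd order Feynman diagram involving the exchange of the massive vector boson. The external lines carry the momenta $k$, $p$ (initial state) and $k'$, $p'$ (final state); the internal line carries $q = k - k' = p' - p$.]
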
So, $X_1 + X_2 = 0$; this is the envisaged “miraculous” cancellation of the unwanted ingredients
of our calculations. The “normal” term from $S_2^{(2)}$ can be processed in a standard routine way,
and we are thus left with the 2nd order contribution to the considered scattering process that is
represented by the Feynman diagram in Fig. 29.1. Its contribution reads
$$
i\mathcal{M} = i^3 g^2\,\big[\bar u(k')\gamma_\mu u(k)\big]\,D_F^{\mu\nu}(q)\,\big[\bar u(p')\gamma_\nu u(p)\big] , \tag{29.18}
$$
with
$$
D_F^{\mu\nu}(q) = \frac{-g^{\mu\nu} + \dfrac{q^\mu q^\nu}{m^2}}{q^2 - m^2 + i\epsilon} . \tag{29.19}
$$
Notice that the propagator (29.19) is an even function of 𝑞, so the direction of the wavy line in
Fig. 29.1 is irrelevant. Finally, let us add that our discussion of the cancellation of the terms 𝑋1
and 𝑋2 is just an illustration of such a remarkable mechanism at the level of the lowest non-trivial
order. More comments on this problem can be found e.g. in [10] or [13].

Chapter 30

Some applications:
QED with massive photon

After the long and perhaps rather boring exposition of the preceding four chapters, we have now
sufficient tools for considering some further physically interesting applications, in addition to
those already discussed earlier, which have been restricted to the first order of Dyson expansion
of the 𝑆-matrix.
In fact, with the propagators of the Dirac field and the massive vector field at hand, we have
come very close to quantum electrodynamics (QED). So, let us continue working with the
model defined by (29.1), (29.2); since we envisage a road to QED, we will change the notation
for the coupling constant, using $e$ instead of $g$. For definiteness, let us identify the fermion 1
with the electron and the fermion 2 with the muon; the mass of the “heavy photon” is denoted simply as $m$
in what follows. Apart from the elastic scattering process $e + \mu \to e + \mu$ mentioned briefly in the
preceding chapters, one can also consider the production of a muon pair in electron–positron
annihilation,
$$
e^-(k) + e^+(p) \to \mu^-(k') + \mu^+(p') . \tag{30.1}
$$
Surely, the reader is already knowledgeable enough to draw the relevant Feynman diagram
immediately; obviously, this is as shown in Fig. 30.1 (any “doubting Thomas” is encouraged to
put a finger into the 2nd order of the Dyson expansion involving (29.1) and (29.2) so as to verify
it).

[Fig. 30.1: Electron–positron annihilation into a muon pair at the leading order in QED with massive photon. The initial-state fermion lines are labelled $k$ and $-p$, the final-state ones $k'$ and $-p'$; the exchanged boson carries $q = k + p = k' + p'$.]

The matrix element corresponding to the diagram in Fig. 30.1 reads
$$
i\mathcal{M} = i^3 e^2\,\big[\bar v(p)\gamma_\mu u(k)\big]\,\frac{-g^{\mu\nu} + \dfrac{q^\mu q^\nu}{m^2}}{q^2 - m^2 + i\epsilon}\,\big[\bar u(k')\gamma_\nu v(p')\big] . \tag{30.2}
$$
At first sight, the desired passage to the massless photon could be thwarted by the presence of
the term proportional to $1/m^2$ in the propagator numerator. Fortunately, it is not so; in fact, the
contribution of this term vanishes. Indeed, one has e.g.
$$
\bar v(p)\gamma_\mu u(k)\,q^\mu = \bar v(p)\,\slashed{q}\,u(k) = \bar v(p)(\slashed{p} + \slashed{k})\,u(k) . \tag{30.3}
$$
But the Dirac equation tells us that (with $m_e$ denoting the electron mass, not the heavy-photon mass $m$)
$$
\slashed{k}\,u(k) = m_e\,u(k) , \qquad \bar v(p)\,\slashed{p} = -m_e\,\bar v(p) , \tag{30.4}
$$

and there you are: the two mass terms cancel in (30.3). Then the limit $m \to 0$ in the denominator is safe; at the same time, $i\epsilon$ becomes
irrelevant, since $q^2 = (k' + p')^2 \geq 4m_\mu^2$ for the given process (in other words, the total energy in the
c.m. system must be at least $2m_\mu$). Thus, for $m = 0$ the propagator in the expression (30.2) is
effectively equal to $-g^{\mu\nu}/q^2$. As we will see in Chapter 32, this is precisely the simplest form of
the photon propagator obtained within the covariant quantization of the electromagnetic field.
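
The vanishing of the $q^\mu q^\nu/m^2$ term can also be checked numerically. The following is a minimal sketch, assuming the standard (Dirac) representation of the $\gamma$-matrices and an arbitrary illustrative fermion mass and momenta (none of these numbers come from the text); the spinors are built by projection, so that they satisfy the Dirac equation by construction, and their normalization is irrelevant for the check.

```python
import numpy as np

# Dirac matrices in the standard (Dirac) representation; metric (+,-,-,-).
I2, Z2 = np.eye(2), np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
GAMMA += [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def slash(p):
    """Return gamma^mu p_mu for a contravariant four-vector p^mu."""
    p_lower = METRIC @ p
    return sum(GAMMA[mu] * p_lower[mu] for mu in range(4))


def on_shell(mass, three_momentum):
    """Build p^mu = (E, p) with E = sqrt(mass^2 + |p|^2)."""
    three_momentum = np.asarray(three_momentum, dtype=float)
    energy = np.sqrt(mass**2 + three_momentum @ three_momentum)
    return np.concatenate(([energy], three_momentum))


def bar(w):
    """Dirac conjugate: w-bar = w^dagger gamma^0."""
    return w.conj() @ GAMMA[0]


mass = 0.3  # illustrative fermion mass in arbitrary units (assumption)
k = on_shell(mass, [0.3, -0.2, 1.1])   # electron momentum (illustrative)
p = on_shell(mass, [-0.5, 0.4, 0.7])   # positron momentum (illustrative)
q = k + p

# (k-slash - m) u = 0 and (p-slash + m) v = 0 hold by construction,
# because (k-slash - m)(k-slash + m) = k^2 - m^2 = 0 on the mass shell.
u = (slash(k) + mass * np.eye(4)) @ np.array([1, 0, 0, 0], dtype=complex)
v = (slash(p) - mass * np.eye(4)) @ np.array([0, 0, 1, 0], dtype=complex)

current = np.array([bar(v) @ GAMMA[mu] @ u for mu in range(4)])  # v-bar gamma^mu u
contraction = bar(v) @ slash(q) @ u                              # q_mu v-bar gamma^mu u

print("largest current component:", np.max(np.abs(current)))
print("|q_mu (v-bar gamma^mu u)|:", abs(contraction))
```

The current itself is of order one, while its contraction with $q$ vanishes up to rounding errors, in accordance with (30.3) and (30.4).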
Let us now proceed to the evaluation of the cross section. For $m = 0$ we have from (30.2)
$$
\mathcal{M} = e^2\,\big[\bar v(p)\gamma_\mu u(k)\big]\big[\bar u(k')\gamma^\mu v(p')\big]\,\frac{1}{q^2} . \tag{30.5}
$$
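We are going to consider first the case of unpolarized particles. Using the familiar trace technique,
the spin-averaged square of the matrix element becomes, after some simple manipulations,
$$
\begin{aligned}
\overline{|\mathcal{M}|^2} &= \frac14\sum_{\rm pol.}|\mathcal{M}|^2 \\
&= \frac14\,\frac{e^4}{(q^2)^2}\,{\rm Tr}\big[(\slashed{p} - m_e)\gamma_\mu(\slashed{k} + m_e)\gamma_\nu\big]\cdot{\rm Tr}\big[(\slashed{k}' + m_\mu)\gamma^\mu(\slashed{p}' - m_\mu)\gamma^\nu\big] .
\end{aligned} \tag{30.6}
$$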
As we have already noted, the threshold energy for the considered process is $E_{\rm c.m.} = 2m_\mu$. The
electron mass is two orders of magnitude smaller than $m_\mu$, so $m_e$ can be safely neglected in
(30.6). Employing the familiar identities for $\gamma$-matrices, in particular our “formulae 32” (C.24),
the expression (30.6) is worked out easily; one gets, for $m_e = 0$,
$$
\overline{|\mathcal{M}|^2} = 8\,\frac{e^4}{s^2}\Big[m_\mu^2\,k\cdot p + (k\cdot p')(k'\cdot p) + (k\cdot k')(p\cdot p')\Big] , \tag{30.7}
$$
where we have also introduced the standard Mandelstam variable $s = q^2$.


It is instructive to consider now the high-energy limit, i.e. the collision energy such that
$s \gg m_\mu^2$. Then one may set $m_\mu = 0$ wherever it occurs; this also means that the scalar products
in (30.7) can be expressed simply in terms of the Mandelstam variables $t = (k - k')^2 = (p - p')^2$
and $u = (k - p')^2 = (k' - p)^2$. One thus ends up with an elegant result
$$
\overline{|\mathcal{M}|^2}\,\Big|_{s\gg m_\mu^2} = 2e^4\,\frac{t^2 + u^2}{s^2} . \tag{30.8}
$$
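Now we can utilize the kinematical formulae for $t$ and $u$ that hold in the massless case, namely
$$
\begin{aligned}
t &= -\frac12\,s\,(1 - \cos\vartheta_{\rm c.m.}) , \\
u &= -\frac12\,s\,(1 + \cos\vartheta_{\rm c.m.}) ,
\end{aligned} \tag{30.9}
$$
where $\vartheta_{\rm c.m.}$ is the angle between $\vec k$ and $\vec k'$ in the c.m. system. The expression (30.8) is thus recast
as
$$
\overline{|\mathcal{M}|^2}\,\Big|_{s\gg m_\mu^2} = e^4\,(1 + \cos^2\vartheta_{\rm c.m.}) . \tag{30.10}
$$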
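
The chain (30.7) → (30.8) → (30.10) is easy to cross-check numerically. The sketch below assumes massless particles in the c.m. system, with the beam along the third axis and the scattering in the 1–3 plane (a choice made here only for illustration); the function names are hypothetical, and the overall factor $e^4$ is divided out.

```python
import math


def dot(a, b):
    """Minkowski product with metric (+,-,-,-)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def sub(a, b):
    """Component-wise difference of two four-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cm_momenta(sqrt_s, theta):
    """Massless c.m. momenta for e-(k) + e+(p) -> mu-(k') + mu+(p')."""
    e = sqrt_s / 2.0
    k = (e, 0.0, 0.0, e)
    p = (e, 0.0, 0.0, -e)
    k_out = (e, e * math.sin(theta), 0.0, e * math.cos(theta))
    p_out = (e, -e * math.sin(theta), 0.0, -e * math.cos(theta))
    return k, p, k_out, p_out


def msq_over_e4(sqrt_s, theta):
    """Return (30.7) with m_mu = 0, (30.8) and (30.10), each divided by e^4."""
    k, p, k_out, p_out = cm_momenta(sqrt_s, theta)
    s = sqrt_s**2
    t = dot(sub(k, k_out), sub(k, k_out))
    u = dot(sub(k, p_out), sub(k, p_out))
    from_dot_products = 8.0 / s**2 * (
        dot(k, p_out) * dot(k_out, p) + dot(k, k_out) * dot(p, p_out)
    )
    from_mandelstam = 2.0 * (t**2 + u**2) / s**2
    from_angle = 1.0 + math.cos(theta) ** 2
    return from_dot_products, from_mandelstam, from_angle


for theta in (0.3, 1.0, 2.5):
    print(theta, msq_over_e4(10.0, theta))
```

For each angle the three numbers coincide, independently of the chosen value of $\sqrt{s}$.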
Further, the differential cross section in terms of variables of the c.m. system becomes
$$
\frac{d\sigma}{d\Omega_{\rm c.m.}}\bigg|_{s\gg m_\mu^2} = \frac{\alpha^2}{4s}\,(1 + \cos^2\vartheta_{\rm c.m.}) , \tag{30.11}
$$

where we have introduced the fine-structure constant 𝛼 = 𝑒 2 /(4𝜋). Note that the angular distri-
bution (30.11) is characteristic for an interaction mediated by a spin-1 particle (here the photon,
either massless or massive). The reader is encouraged to perform an analogous calculation
in a model of Yukawa-type interaction of a scalar field like that promoted in Chapter 26 (it
is algebraically much simpler than within QED); it turns out that in such a case the angular
distribution is trivial, i.e. isotropic.
Finally, one may integrate the expression (30.11) over the solid angle $\Omega_{\rm c.m.}$; for the
integral cross section we arrive at the well-known approximate QED formula
$$
\sigma(s)\,\Big|_{s\gg m_\mu^2} = \frac{4\pi\alpha^2}{3s} . \tag{30.12}
$$
A remark is in order here. The most important part of the formula (30.12) may in fact be
obtained simply by means of an “educated guess”: In the 2nd order of perturbation expansion,
the matrix element is proportional to $e^2$, i.e. $\alpha$; thus, the cross section must be proportional to
$\alpha^2$, to the given order. Further, in the high-energy limit, where all masses are neglected, the total
cross section can only depend on the energy, i.e. on the Mandelstam invariant $s$. The cross
section has the dimension of the length squared, i.e. inverse energy squared. So, it is clear that
in the high-energy limit the cross section is proportional to $\alpha^2/s$, since $\alpha$ is dimensionless. The
explicit evaluation using the diagram in Fig. 30.1 thus makes this estimate more precise just by
adding the factor $4\pi/3$. In this way we see that the simple guesswork described above gives us
essentially a correct order-of-magnitude estimate of the cross section in question.
If the muon mass is not neglected, i.e. if one wants to get a formula for $\sigma(s)$ valid in
the whole kinematical region $E_{\rm c.m.} \geq 2m_\mu$, the relevant calculation is slightly more complicated
than before and the result is (of course, still taking $m_e = 0$)
$$
\sigma(s) = \frac{4\pi\alpha^2}{3s}\left(1 + \frac{2m_\mu^2}{s}\right)\sqrt{1 - \frac{4m_\mu^2}{s}} . \tag{30.13}
$$

The derivation of the formula (30.13) is left as an instructive exercise for any diligent reader.
Note that another useful exercise would be the evaluation of the relevant cross sections for
various combinations of polarizations of the participating particles (i.e. combining left-handed
and right-handed states). Such a calculation is quite easy in the high-energy limit, where one
can utilize the known simple relation between the helicity and chirality (see Chapter 8).
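
To get a feeling for the numbers, the formulae (30.11)–(30.13) can be evaluated directly. The following sketch works in natural units with energies in GeV; the numerical values of $\alpha$, $m_\mu$ and of the conversion factor $(\hbar c)^2$ are approximate inputs supplied here, not taken from the text, and the function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.036        # fine-structure constant (approximate input)
M_MU = 0.105658              # muon mass in GeV (approximate input)
GEV2_TO_NB = 3.8938e5        # (hbar c)^2 in nb GeV^2: converts GeV^-2 to nanobarn


def dsigma_domega(s, cos_theta):
    """Eq. (30.11): high-energy differential cross section in GeV^-2."""
    return ALPHA**2 / (4.0 * s) * (1.0 + cos_theta**2)


def sigma_high_energy(s):
    """Eq. (30.12): total cross section for s >> m_mu^2, in GeV^-2."""
    return 4.0 * math.pi * ALPHA**2 / (3.0 * s)


def sigma_with_muon_mass(s):
    """Eq. (30.13): total cross section for s >= 4 m_mu^2 (m_e = 0), in GeV^-2."""
    x = 4.0 * M_MU**2 / s
    if x >= 1.0:
        return 0.0  # at or below the threshold E_cm = 2 m_mu
    return sigma_high_energy(s) * (1.0 + x / 2.0) * math.sqrt(1.0 - x)


def integrate_over_solid_angle(s, steps=2000):
    """Midpoint rule in cos(theta); the azimuthal integration gives 2*pi."""
    h = 2.0 / steps
    total = sum(dsigma_domega(s, -1.0 + (i + 0.5) * h) for i in range(steps))
    return 2.0 * math.pi * total * h


for sqrt_s in (0.25, 1.0, 10.0):   # c.m. energies in GeV
    s = sqrt_s**2
    print(sqrt_s,
          sigma_with_muon_mass(s) * GEV2_TO_NB,
          sigma_high_energy(s) * GEV2_TO_NB)
```

The numerical integration of (30.11) reproduces (30.12), and the mass-dependent factor in (30.13) matters only in the vicinity of the threshold.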
Another physically interesting process is the annihilation of the electron–positron pair
into a pair of photons. In such a case, there are two contributing diagrams, namely those in
Fig. 30.2. These are quite often depicted also as shown in Fig. 30.3 in order to stress “crossing”
of the external photon lines. We will make more explanatory comments on the origin of these
diagrams in the forthcoming chapters, but in fact their genesis should be intuitively clear already
now: in the 2nd order of Dyson expansion there appear two operators of the vector field, say
𝐴 𝜇 (𝑥) and 𝐴𝜈 (𝑦), and there are obviously two possibilities of pairing them with the operators
representing the final-state vector bosons, 𝑎( 𝑝) and 𝑎(𝑟). For the time being, let us adopt a
pragmatic position, employing the diagrams from Fig. 30.2 as they stand; note that the outgoing
vector boson lines correspond to the polarization vectors $\epsilon^*$ (let us recall that for a vector boson
in the initial state, the incoming line would represent the polarization vector $\epsilon$).

[Fig. 30.2: Electron–positron annihilation into a pair of photons. Two diagrams with incoming fermion lines labelled $k$ and $-l$ and outgoing photons $p$ and $r$; the internal fermion line carries $q = k - p = r - l$ in the first diagram and $Q = k - r = p - l$ in the second.]

[Fig. 30.3: Electron–positron annihilation into a pair of photons: an alternative way of graphical representation of the configuration of external photons.]

Using the by now familiar Feynman rules, the sum of contributions of the diagrams in Fig. 30.2 reads
$$
\begin{aligned}
i\mathcal{M} &= i^3 e^2\,\bar v(l)\gamma^\mu\frac{1}{\slashed{q} - m_e}\gamma^\nu u(k)\,\epsilon^*_\mu(r,\lambda)\,\epsilon^*_\nu(p,\lambda') \\
&\quad + i^3 e^2\,\bar v(l)\gamma^\nu\frac{1}{\slashed{Q} - m_e}\gamma^\mu u(k)\,\epsilon^*_\mu(r,\lambda)\,\epsilon^*_\nu(p,\lambda') ,
\end{aligned} \tag{30.14}
$$
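where $\lambda, \lambda' = 1, 2, 3$ label the possible polarization (spin) states of the vector bosons (we still
consider “heavy photons” with three possible polarizations). For later convenience, the matrix
element $\mathcal{M}$ can be written as
$$
\mathcal{M} = -e^2\,\mathcal{M}^{\mu\nu}\,\epsilon^*_\mu(r,\lambda)\,\epsilon^*_\nu(p,\lambda') , \tag{30.15}
$$
where
$$
\mathcal{M}^{\mu\nu} = \bar v(l)\gamma^\mu\frac{1}{\slashed{q} - m_e}\gamma^\nu u(k) + \bar v(l)\gamma^\nu\frac{1}{\slashed{Q} - m_e}\gamma^\mu u(k) . \tag{30.16}
$$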
Let us now consider unpolarized vector bosons in the final state. For the matrix element squared,
involving the summation over polarization states $\lambda$, $\lambda'$, we have, using (30.15),
$$
\begin{aligned}
\overline{|\mathcal{M}|^2} &= e^4\sum_{\lambda,\lambda'}\mathcal{M}^{\mu\nu}\,\epsilon^*_\mu(r,\lambda)\,\epsilon^*_\nu(p,\lambda')\,\mathcal{M}^{*\rho\sigma}\,\epsilon_\rho(r,\lambda)\,\epsilon_\sigma(p,\lambda') \\
&= e^4\,\mathcal{M}^{\mu\nu}\mathcal{M}^{*\rho\sigma}\left(-g_{\mu\rho} + \frac{1}{m^2}\,r_\mu r_\rho\right)\left(-g_{\nu\sigma} + \frac{1}{m^2}\,p_\nu p_\sigma\right) ,
\end{aligned} \tag{30.17}
$$
where we have utilized the standard formula for the vector boson polarization sum. Now we
face again the problem of the feasibility of the limit $m \to 0$. At first sight, there is the same
difficulty as before concerning the heavy-photon propagator. In the present case, the solution is
similar to that discovered above: we will be able to show that the terms proportional to $1/m^2$ in
the polarization sums in (30.17) do not contribute at all. Indeed, it turns out that the following
identities hold,
$$
r_\mu\,\mathcal{M}^{\mu\nu} = 0 , \qquad p_\nu\,\mathcal{M}^{\mu\nu} = 0 , \tag{30.18}
$$
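and this makes the above statement obvious. It is sufficient to prove the first identity (30.18), for
the other one it goes in the same way. So, using (30.16), the momentum relations $r = l + q = k - Q$
and the Dirac equations $\bar v(l)\,\slashed{l} = -m_e\,\bar v(l)$, $\slashed{k}\,u(k) = m_e\,u(k)$, one gets
$$
\begin{aligned}
r_\mu\,\mathcal{M}^{\mu\nu} &= \bar v(l)\,\slashed{r}\,\frac{1}{\slashed{q} - m_e}\,\gamma^\nu u(k) + \bar v(l)\gamma^\nu\,\frac{1}{\slashed{Q} - m_e}\,\slashed{r}\,u(k) \\
&= \bar v(l)(\slashed{l} + \slashed{q})\,\frac{1}{\slashed{q} - m_e}\,\gamma^\nu u(k) + \bar v(l)\gamma^\nu\,\frac{1}{\slashed{Q} - m_e}\,(\slashed{k} - \slashed{Q})\,u(k) \\
&= \bar v(l)(-m_e + \slashed{q})\,\frac{1}{\slashed{q} - m_e}\,\gamma^\nu u(k) + \bar v(l)\gamma^\nu\,\frac{1}{\slashed{Q} - m_e}\,(m_e - \slashed{Q})\,u(k) \\
&= \bar v(l)\gamma^\nu u(k) - \bar v(l)\gamma^\nu u(k) \\
&= 0 .
\end{aligned}
$$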
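
Both identities (30.18) can be verified numerically as well. The sketch below is only an illustration under stated assumptions: the Dirac representation of the $\gamma$-matrices, an arbitrary fermion mass and arbitrary momenta (the boson momentum $p$ is not even put on the mass shell, since the derivation above uses only momentum conservation and the Dirac equation for the external fermions), and unnormalized spinors built by projection.

```python
import numpy as np

# Dirac matrices in the standard (Dirac) representation; metric (+,-,-,-).
I2, Z2 = np.eye(2), np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
GAMMA += [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])
ONE = np.eye(4)


def slash(p):
    """Return gamma^mu p_mu for a contravariant four-vector p^mu."""
    p_lower = METRIC @ p
    return sum(GAMMA[mu] * p_lower[mu] for mu in range(4))


def on_shell(mass, three_momentum):
    """Build p^mu = (E, p) with E = sqrt(mass^2 + |p|^2)."""
    three_momentum = np.asarray(three_momentum, dtype=float)
    energy = np.sqrt(mass**2 + three_momentum @ three_momentum)
    return np.concatenate(([energy], three_momentum))


m_e = 0.3                                # illustrative fermion mass (assumption)
k = on_shell(m_e, [0.1, 0.2, 0.9])       # electron momentum (illustrative)
ell = on_shell(m_e, [-0.4, 0.3, -0.6])   # positron momentum l (illustrative)
p = np.array([0.7, 0.2, -0.1, 0.5])      # one boson momentum, chosen arbitrarily
r = k + ell - p                          # the other one, from momentum conservation
q, Q = k - p, k - r

# Spinors obeying (k-slash - m) u = 0 and v-bar (l-slash + m) = 0 by construction.
u = (slash(k) + m_e * ONE) @ np.array([1, 0, 0, 0], dtype=complex)
v = (slash(ell) - m_e * ONE) @ np.array([0, 0, 1, 0], dtype=complex)
v_bar = v.conj() @ GAMMA[0]

S_q = np.linalg.inv(slash(q) - m_e * ONE)   # 1 / (q-slash - m_e)
S_Q = np.linalg.inv(slash(Q) - m_e * ONE)   # 1 / (Q-slash - m_e)

# M[mu, nu] as defined in (30.16)
M = np.array([[v_bar @ GAMMA[mu] @ S_q @ GAMMA[nu] @ u
               + v_bar @ GAMMA[nu] @ S_Q @ GAMMA[mu] @ u
               for nu in range(4)] for mu in range(4)])

ward_r = (METRIC @ r) @ M   # r_mu M^{mu nu}, one entry per nu
ward_p = M @ (METRIC @ p)   # p_nu M^{mu nu}, one entry per mu

print("max |M|:", np.max(np.abs(M)))
print("max |r.M|:", np.max(np.abs(ward_r)), " max |p.M|:", np.max(np.abs(ward_p)))
```

The components of $\mathcal{M}^{\mu\nu}$ are of order one, whereas both contractions vanish within rounding errors.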

Note that the relations (30.18) in fact represent an example of the so-called Ward identities. A
deeper aspect of the considered situation is that we are working here with a neutral vector field
coupled to a conserved current. We will come across more examples of such a kind later on.
Thus, we see that the limit 𝑚 → 0 can be performed safely. In the forthcoming chapters,
where we will discuss the quantization of the electromagnetic field, we will convince ourselves
that the polarization sums for massless photons (i.e. massless from the very beginning) lead to
the same result as our approach based on the limit 𝑚 → 0. So, in principle, we are now in a
position to evaluate the two-photon annihilation of the electron–positron pair in detail. However,
we will not pursue this theme now and rather defer it to later chapters devoted to an extensive
treatment of QED. The above exposition was meant as an elucidation of a part of the intriguing
problem of massless limit in the physics of massive vector bosons. Actually, the problem is
more complex. For charged vector bosons the situation is different and the simple procedure
described above does not work. It is in fact one of the crucial topics in the standard model of
electroweak interactions, but this would be another tale (see e.g. [22]).
Let us now leave the theme of massless limit as a possible road towards QED, and consider
an opposite case, where the vector boson mass is sufficiently large, in particular 𝑚 > 2𝑚 𝜇 . Then,
revisiting the expression (30.2), it is clear that one may run into a potential singularity of the
propagator for 𝑞 2 = 𝑚 2 ; a legitimate question is then what can be done about it (obviously,
the infinitesimal term 𝑖𝜖 does not provide a satisfactory answer). The solution goes beyond
the tree-level diagrams we have been working with up to now. If one includes higher-order
corrections to the propagator, these modify the propagator denominator in such a way that its
resulting form is 𝑞 2 − 𝑚 2 + 𝑖𝑚Γ, where Γ is the width of the unstable heavy vector boson
corresponding to its decay into the muon pair. Such an expression for the corrected propagator
is called the Breit–Wigner form and it is relevant in many situations. In particular, in the theory
of electroweak interactions, it is appropriate for the description of processes of production of
massive vector bosons, the most prominent example being the production of the neutral boson
𝑍 that was studied in much detail at the electron–positron collider LEP at CERN during the last
decade of the 20th century.
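
The effect of the width is easily visualized on the modulus squared of the propagator denominator. In the following minimal sketch the mass and the width are purely illustrative numbers in arbitrary units, and the function name is hypothetical.

```python
import math


def breit_wigner_weight(s, mass, width):
    """Return |1 / (s - m^2 + i m Gamma)|^2 for s = q^2."""
    return 1.0 / abs(complex(s - mass**2, mass * width)) ** 2


mass, width = 10.0, 0.5   # illustrative values (assumption)

peak = breit_wigner_weight(mass**2, mass, width)
half = breit_wigner_weight(mass**2 + mass * width, mass, width)

print("peak value at s = m^2:      ", peak)   # equals 1 / (m Gamma)^2
print("value at s = m^2 + m Gamma: ", half)   # one half of the peak value
```

Instead of the tree-level singularity at $q^2 = m^2$, one thus gets a resonance peak of finite height $1/(m\Gamma)^2$, which drops to one half of its maximum at $q^2 = m^2 \pm m\Gamma$.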

Chapter 31

Quantization of electromagnetic field:
covariant and non-covariant

As we have seen in the preceding chapter, in some situations one may describe photons (both
real and virtual) by means of the massless limit of quantized Proca field. Nevertheless, one
should also take up the task of a direct quantization of the electromagnetic Maxwell field. In a
sense, it is the most intriguing case to be discussed in the present text and that’s why we have
postponed it up until this moment (though, historically, the photon had been the first example of
a particle interpreted as a field quantum).
So, let us start with the familiar Lagrangian for the free Maxwell field
$$
\mathcal{L} = -\frac14\,F_{\mu\nu}F^{\mu\nu} , \tag{31.1}
$$
with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. The corresponding equation of motion reads $\partial_\mu F^{\mu\nu} = 0$, i.e.
$$
\Box A^\mu - \partial^\mu(\partial\cdot A) = 0 , \tag{31.2}
$$
where we have employed the usual shorthand notation $\partial\cdot A = \partial_\mu A^\mu$. An attempt to implement
the straightforward procedure of canonical quantization runs into a difficulty that we have already
encountered in Chapter 20 in the case of the Proca field: from the relation
$$
\frac{\partial\mathcal{L}}{\partial(\partial_\rho A_\sigma)} = -F^{\rho\sigma} \tag{31.3}
$$
(see Eq. (20.3)) one gets
$$
\frac{\partial\mathcal{L}}{\partial\dot A_0} = 0 ,
$$
and this means that $A_0$ certainly cannot play the role of an appropriate canonical variable. In any
case, it is clear a priori that all four components of the four-potential $A^\mu$, $\mu = 0, 1, 2, 3$, cannot
be taken as independent dynamical variables; one should recall that, classically, electromagnetic
plane waves have two physical polarizations. Thus, there are just two independent degrees of
freedom to be quantized. Of course, this is intimately related to the gauge symmetry, i.e. the
invariance of the field strength $F_{\mu\nu}$ under the gauge transformation $A_\mu \to A_\mu + \partial_\mu f$, with $f$ being
an essentially arbitrary function. In view of the gauge freedom, one may impose the Lorenz
condition¹⁷
$$
\partial_\mu A^\mu(x) = 0 . \tag{31.4}
$$
¹⁷ Note that in many textbooks and papers the relation (31.4) is called the “Lorentz condition”. However, it is
originally due to the Danish physicist Ludvig Lorenz (1829–1891), who had introduced it in 1867, when the more
famous Hendrik Lorentz (1853–1928) was only 14 years old.

Let us recall that such a relation holds also for the massive vector (Proca) field, but there it
emerges as a consequence of the equations of motion. For the Maxwell field, Eq. (31.4) is
just a subsidiary condition imposed by hand. An obvious advantage of utilizing this standard
constraint is that Eq. (31.2) is thereby reduced to the d’Alembert equation
$$
\Box A^\mu(x) = 0 . \tag{31.5}
$$
In this manner, one unphysical degree of freedom is eliminated; as a next step in this reduction
procedure, one may set e.g. $A_0 = 0$ (recall that in the classical theory, this is a usual way of
description of the electromagnetic radiation far from its source). Thus, taking into account
Eq. (31.4), our gauge conditions read
$$
A_0 = 0 , \qquad \vec\nabla\cdot\vec A = 0 . \tag{31.6}
$$
This defines what is usually called, for obvious reasons, the radiation gauge. Note that such a
way of gauge fixing is not Lorentz covariant, since the component $A_0$ is singled out and treated
differently from $A_j$, $j = 1, 2, 3$. These latter components can be employed as the canonical
variables for quantization, but one must keep in mind the constraint $\vec\nabla\cdot\vec A = 0$. Defining the
conjugate momenta $\pi^j$ as
$$
\pi^j = \frac{\partial\mathcal{L}}{\partial(\partial_0 A_j)} ,
$$
one has, using (31.3) and (31.6),
$$
\pi^j = F_{0j} = \partial_0 A_j . \tag{31.7}
$$
It is immediately clear that the quantization based on the canonical commutation relations
$$
\big[A_j(\vec x, t), \pi^k(\vec y, t)\big] \overset{?}{=} i\,\delta_{jk}\,\delta^{(3)}(\vec x - \vec y) \tag{31.8}
$$
would not work, since it is not compatible with the constraint $\vec\nabla\cdot\vec A = 0$ (acting with $\partial/\partial x_j$ on (31.8),
the left-hand side vanishes, while the right-hand side becomes a non-zero gradient of the delta function). As we already know,
the ultimate goal of any quantization procedure is an identification of creation and annihilation
operators worthy of the name, i.e. endowed with standard properties regarding the field energy and
momentum. So, we will proceed in the indicated direction, starting with an “educated guess”
for the annihilation and creation operators (i.e. the other way round than e.g. in the case of the
Klein–Gordon or Proca fields) and subsequently one may find out what would be a pertinent
modification of the commutation relation (31.8). Notice that such an approach is similar to the
procedure we have employed previously for the quantization of the Dirac field.
To this end, let us examine solutions of the d’Alembert equation (31.5) satisfying the
gauge constraint (31.6). It is not difficult to see that such a general solution can be written as
$$
A_j(x) = \int d^3k\,N_k\sum_{\lambda=1}^{2}\Big[a(k,\lambda)\,\epsilon_j(k,\lambda)\,e^{-ikx} + a^\dagger(k,\lambda)\,\epsilon_j^*(k,\lambda)\,e^{ikx}\Big] , \tag{31.9}
$$
where $k^2 = 0$, i.e. $k_0 = |\vec k|$, and $\epsilon_j(k,\lambda)$, $\lambda = 1, 2$, are two linearly independent polarization
vectors such that (in compliance with $\vec\nabla\cdot\vec A = 0$)
$$
\vec k\cdot\vec\epsilon(k,\lambda) = 0 . \tag{31.10}
$$
A convenient normalization condition is
$$
\vec\epsilon(k,\lambda)\cdot\vec\epsilon^{\,*}(k,\lambda') = \delta_{\lambda\lambda'} . \tag{31.11}
$$
Needless to say, $N_k$ is the conventional factor $N_k = (2\pi)^{-3/2}(2k_0)^{-1/2}$.
Before considering the envisaged commutation relations for the operator coefficients $a$,
$a^\dagger$ it is useful to establish an important relation for the polarization vectors. The formula we
have in mind is
$$
P_{ij} \equiv \sum_{\lambda=1}^{2}\epsilon_i(k,\lambda)\,\epsilon_j^*(k,\lambda) = \delta_{ij} - \frac{k_i k_j}{|\vec k|^2} , \tag{31.12}
$$
where $P_{ij}$ is the natural notation for the polarization sum. The identity (31.12) can be proved
quite easily. One may start with the completeness relation for the orthogonal basis in the
3-dimensional Euclidean space, made of $\vec\epsilon(k,\lambda)$, $\lambda = 1, 2$, and $\vec\epsilon(k,3) = \vec k/|\vec k|$. This reads
$$
\sum_{\lambda=1}^{2}\epsilon_i(k,\lambda)\,\epsilon_j^*(k,\lambda) + \frac{k_i}{|\vec k|}\,\frac{k_j}{|\vec k|} = \delta_{ij} ,
$$
and the result (31.12) thus becomes immediately obvious.
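
The polarization sum (31.12) can be illustrated by a short numerical sketch. The momentum below is an arbitrary illustrative choice and the helper names are hypothetical; both a linear and a circular basis of transverse polarization vectors are tried, since the sum does not depend on the choice of the basis.

```python
import numpy as np


def transverse_basis(k):
    """Return k/|k| and two real orthonormal vectors perpendicular to k."""
    k_hat = np.asarray(k, dtype=float) / np.linalg.norm(k)
    helper = np.eye(3)[np.argmin(np.abs(k_hat))]   # coordinate axis least aligned with k
    e1 = np.cross(k_hat, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(k_hat, e1)
    return k_hat, e1, e2


def polarization_sum(vectors):
    """Sum over lambda of eps_i(lambda) eps_j^*(lambda)."""
    return sum(np.outer(eps, np.conj(eps)) for eps in vectors)


k = np.array([0.4, -1.3, 2.2])                     # illustrative momentum (assumption)
k_hat, e1, e2 = transverse_basis(k)

expected = np.eye(3) - np.outer(k_hat, k_hat)      # right-hand side of (31.12)
linear = polarization_sum([e1, e2])
circular = polarization_sum([(e1 + 1j * e2) / np.sqrt(2),
                             (e1 - 1j * e2) / np.sqrt(2)])

print(np.allclose(linear, expected), np.allclose(circular, expected))
```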


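It is also convenient to recast (31.12) in a “pseudo-covariant” form, namely
$$
\sum_{\lambda=1}^{2}\epsilon^\mu(k,\lambda)\,\epsilon^{*\nu}(k,\lambda) = -g^{\mu\nu} + \frac{1}{\eta\cdot k}\,(k^\mu\eta^\nu + \eta^\mu k^\nu) - \frac{1}{(\eta\cdot k)^2}\,k^\mu k^\nu , \tag{31.13}
$$
where $\eta = (1, 0, 0, 0)$. The proof goes as follows. We can use the completeness relation for
a basis in the four-dimensional Minkowski space that is formed by $\epsilon^\mu(k,\lambda) = \big(0, \vec\epsilon(k,\lambda)\big)$,
$\lambda = 1, 2$, $\epsilon^\mu(k,3) = (0, \vec k/|\vec k|)$ and $\eta = (1, 0, 0, 0)$. The first three $\epsilon$’s are space-like and $\eta$ is
time-like. Thus, one may write
$$
-\sum_{\lambda=1}^{2}\epsilon^\mu(k,\lambda)\,\epsilon^{*\nu}(k,\lambda) - \epsilon^\mu(k,3)\,\epsilon^{*\nu}(k,3) + \eta^\mu\eta^\nu = g^{\mu\nu} . \tag{31.14}
$$
The vector $\epsilon^\mu(k,3)$ can be rewritten artificially as
$$
\epsilon^\mu(k,3) = \frac{1}{k_0}\,k^\mu - \eta^\mu = \frac{1}{\eta\cdot k}\,k^\mu - \eta^\mu . \tag{31.15}
$$
From (31.14) and (31.15) one then gets
$$
\sum_{\lambda=1}^{2}\epsilon^\mu(k,\lambda)\,\epsilon^{*\nu}(k,\lambda) = -g^{\mu\nu} - \left(\frac{1}{\eta\cdot k}\,k^\mu - \eta^\mu\right)\left(\frac{1}{\eta\cdot k}\,k^\nu - \eta^\nu\right) + \eta^\mu\eta^\nu .
$$
The term $\eta^\mu\eta^\nu$ gets cancelled and the identity (31.13) is thereby proved.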
As we have indicated above, we will proceed by postulating commutation relations
$$
\big[a(k,\lambda), a^\dagger(k',\lambda')\big] = \delta_{\lambda\lambda'}\,\delta^{(3)}(\vec k - \vec k') , \qquad \big[a(k,\lambda), a(k',\lambda')\big] = 0 \tag{31.16}
$$
for the annihilation and creation operators “in spe”. In order to confirm their anticipated contents,
one would like to show that the energy (Hamiltonian) has the proper form, i.e.
$$
H = \frac12\int d^3k\;k_0\sum_{\lambda=1}^{2}\Big[a^\dagger(k,\lambda)\,a(k,\lambda) + a(k,\lambda)\,a^\dagger(k,\lambda)\Big] . \tag{31.17}
$$
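Then, in accordance with our previous experience, one may conclude that
$$
\begin{aligned}
\big[H, a^\dagger(k,\lambda)\big] &= k_0\,a^\dagger(k,\lambda) , \\
\big[H, a(k,\lambda)\big] &= -k_0\,a(k,\lambda) ,
\end{aligned} \tag{31.18}
$$
and this guarantees the particle interpretation of the quantized field. So, how about the energy
operator? Employing the standard formula for the component $T^{00}$ of the energy–momentum
tensor and the Lagrangian (31.1), one gets, after some simple manipulations (utilizing also the
gauge condition (31.6)),
$$
\begin{aligned}
\mathcal{H} = T^{00} &= \frac{\partial\mathcal{L}}{\partial(\partial_0 A_j)}\,\partial_0 A_j - \mathcal{L} \\
&= (\partial_0 A_j)(\partial_0 A_j) + \frac14\,F_{\mu\nu}F^{\mu\nu} \\
&= \frac12\,(\partial_0 A_j)(\partial_0 A_j) + \frac14\,F_{jk}F_{jk} \\
&= \frac12\,(\partial_0 A_j)(\partial_0 A_j) + \frac12\,(\partial_j A_k)(\partial_j A_k) - \frac12\,(\partial_j A_k)(\partial_k A_j) .
\end{aligned} \tag{31.19}
$$
Then the energy becomes
$$
H = \int d^3x\,\mathcal{H} = \frac12\int d^3x\,\Big[(\partial_0 A_j)(\partial_0 A_j) - A_j\,\Delta A_j\Big] , \tag{31.20}
$$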
where we have used the integration by parts as well as the gauge condition 𝜕 𝑗 𝐴 𝑗 = 0. Now one
may substitute the solution (31.9) into (31.20) and arrive at the result (31.17). The calculation
is somewhat tedious, but straightforward; so, the hard-working reader is encouraged to perform
it as an instructive exercise.
With these results at hand, we are on the right track to photons as the quanta of the
Maxwell field. We will not proceed further in such an analysis; let us only remark that one
may expect rightly that the spin (helicity) states of photons are determined by the polarization
vectors 𝜖®(𝑘, 𝜆), similarly as in the case of the quantized Proca field. However, for photons
only two helicity states are possible, corresponding to the transverse polarizations displayed in
Eq. (31.10). So, a “massive photon” would differ from the massless one by the additional degree
of freedom — the longitudinal polarization (zero helicity). A remark is in order here. One
might wonder what is the fate of the would-be “canonical” commutation relation (31.8) within
the quantization scheme that we have described above. The answer is that the right-hand side
of (31.8) is replaced by a more complicated object, aptly dubbed “transverse delta function”; its
explicit form can be found e.g. in the book [2].
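Next, let us examine the propagator of the quantized Maxwell field. One may define, as
usual,
$$
i\,\mathcal{D}_F^{\mu\nu}(x - y) = \langle 0|T\big(A^\mu(x)A^\nu(y)\big)|0\rangle . \tag{31.21}
$$
Since $A_0 = 0$, this in fact means that only the components with $(\mu\nu) = (ij)$, $i, j = 1, 2, 3$, are
non-zero. Following then the similar steps as in the case of massive vector (Proca) field, one
gets first (cf. (28.8))
$$
\begin{aligned}
&\langle 0|T\big(A_i(x)A_j(y)\big)|0\rangle \\
&\quad = \frac{1}{i}\int\frac{d^3k\,d\omega}{(2\pi)^4}\,\frac{1}{2k_0}\left[\frac{1}{\omega + k_0 - i\epsilon}\,P_{ij}(-\vec k) + \frac{1}{-\omega + k_0 - i\epsilon}\,P_{ij}(\vec k)\right]e^{i\omega(x_0 - y_0) - i\vec k\cdot(\vec x - \vec y)} .
\end{aligned}
$$
Since $P_{ij}(\vec k)$ is an even function (see (31.12)), this yields immediately, after some simple
manipulations,
$$
\langle 0|T\big(A_i(x)A_j(y)\big)|0\rangle = i\int\frac{d^4q}{(2\pi)^4}\,P_{ij}(\vec q)\,\frac{1}{q^2 + i\epsilon}\,e^{iq(x - y)} , \tag{31.22}
$$
where we have denoted, as usual, $q = (\omega, \vec k)$. Thus, we have
$$
\mathcal{D}_F^{\mu\nu}(x - y) = \int\frac{d^4q}{(2\pi)^4}\,D_F^{\mu\nu}(q)\,e^{iq(x - y)} ,
$$
where
$$
D_F^{\mu\nu}(q) = \frac{P^{\mu\nu}(q)}{q^2 + i\epsilon} , \tag{31.23}
$$
with
$$
P^{\mu\nu}(q) = \begin{cases} \delta_{ij} - \dfrac{q_i q_j}{|\vec q|^2} & \text{for } (\mu, \nu) = (i, j) , \\[2mm] 0 & \text{otherwise} . \end{cases} \tag{31.24}
$$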

Let us also note that the above result is equivalent to the following “pseudo-covariant” form of
$P^{\mu\nu}(q)$:
$$
P^{\mu\nu}(q) = -g^{\mu\nu} + \frac{(\eta\cdot q)(\eta^\mu q^\nu + q^\mu\eta^\nu) - q^\mu q^\nu - q^2\,\eta^\mu\eta^\nu}{(\eta\cdot q)^2 - q^2} , \tag{31.25}
$$
with $\eta = (1, 0, 0, 0)$ as before. The reader is encouraged to verify the equivalence of (31.24)
and (31.25). Hint: One may proceed similarly as in the derivation of (31.13), but here one must
take into account that $q^2 \neq 0$ in general. In particular, one has $|\vec q| = \sqrt{(\eta\cdot q)^2 - q^2}$.
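
Before embarking on the algebraic proof, the equivalence of (31.24) and (31.25) may be tested numerically. The sketch below uses an arbitrary off-shell four-momentum (an illustrative assumption); the function names are hypothetical.

```python
import numpy as np

G = np.diag([1.0, -1.0, -1.0, -1.0])       # metric; numerically g^{mu nu} = g_{mu nu}
ETA = np.array([1.0, 0.0, 0.0, 0.0])


def p_radiation_gauge(q):
    """P^{mu nu}(q) according to (31.24); q holds the contravariant components."""
    P = np.zeros((4, 4))
    q3 = q[1:]
    P[1:, 1:] = np.eye(3) - np.outer(q3, q3) / (q3 @ q3)
    return P


def p_pseudo_covariant(q):
    """P^{mu nu}(q) according to (31.25)."""
    eta_q = ETA @ G @ q
    q2 = q @ G @ q
    numerator = (eta_q * (np.outer(ETA, q) + np.outer(q, ETA))
                 - np.outer(q, q) - q2 * np.outer(ETA, ETA))
    return -G + numerator / (eta_q**2 - q2)


q = np.array([0.7, 0.3, -1.2, 0.5])        # illustrative off-shell momentum, q^2 != 0
print(np.allclose(p_radiation_gauge(q), p_pseudo_covariant(q)))
```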
Thus, one may notice that the resulting form of the photon propagator is rather com-
plicated. Owing to the non-covariant character of the gauge condition (31.6), the expression
(31.25) is non-covariant as well, as manifested by the presence of 𝜂, which is just a fixed set of
numbers, not a four-vector (the attribute “pseudo-covariant” is therefore pertinent in the context).
In particular, the term 𝜂 𝜇 𝜂 𝜈 in 𝑃 𝜇𝜈 (𝑞) indicates that in a simple application, like e.g. describing
the process 𝑒 + 𝑒 − → 𝜇+ 𝜇− discussed earlier, one cannot get the same result as in the approach to
QED based on the limit $m_\gamma \to 0$ (see the preceding Chapter 30). More precisely, one does not
obtain the same results for the scattering amplitudes, if one takes, naively, $\mathcal{H}_{\rm int} = -\mathcal{L}_{\rm int}$. But this
is just the point where one should be more cautious. It turns out that here in fact $\mathcal{H}_{\rm int} \neq -\mathcal{L}_{\rm int}$
(similarly as in the case of the interaction of Proca field) and, in the end, the extra 𝜂 𝜇 𝜂 𝜈 term in
the propagator (31.23), (31.25) gets cancelled in combination with the additional term in Hint .
We will not discuss the details here (the interested reader can find an explicit treatment of the
problem e.g. in the books [2] and [10]); instead, we are going to pursue an alternative route and
develop a quantization scheme, which is manifestly covariant (though the price to be paid is the
appearance of unphysical degrees of freedom).
So, let us see how one can quantize the Maxwell field and maintain the Lorentz covariance
(avoiding thus the cumbersome form of the photon propagator involving (31.25)). To this end, it is
certainly desirable to preserve the d’Alembert equation (31.5) supplemented, in a proper manner, with
the Lorenz condition (31.4). The basic idea is to modify the gauge invariant Lagrangian (31.1)
in such a way that Eq. (31.5) is obtained directly. Then, the key invention is the implementation
of the Lorenz condition as a constraint imposed on the states of the quantized field. The
above-mentioned modification of the Lagrangian consists in the so-called “Fermi trick”, which means
that Eq. (31.1) is replaced with
$$
\mathcal{L} = -\frac14\,F_{\mu\nu}F^{\mu\nu} - \frac12\,(\partial\cdot A)^2 \tag{31.26}
$$
(notice that such a form is not gauge invariant any longer; its part proportional to $(\partial\cdot A)^2$
is therefore usually called the gauge-fixing term). Let us check that (31.26) yields indeed
the d’Alembert equation for $A^\mu$. Utilizing the by now familiar relation (31.3), one gets the
Euler–Lagrange equation of motion
$$
\begin{aligned}
0 = \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\nu)} &= \partial_\mu\left[-F^{\mu\nu} - \frac12\cdot 2\cdot(\partial\cdot A)\,\frac{\partial(\partial\cdot A)}{\partial(\partial_\mu A_\nu)}\right] \\
&= -\Box A^\nu + \partial^\nu(\partial\cdot A) - \partial^\nu(\partial\cdot A) \\
&= -\Box A^\nu .
\end{aligned}
$$
Note that we have employed here an obvious identity
$$
\frac{\partial(\partial\cdot A)}{\partial(\partial_\mu A_\nu)} = g^{\mu\nu} .
$$
Thus, Eq. (31.5) is thereby recovered. A caveat is in order here: the four-vector field $A^\mu$ is not
the Maxwell field by itself, since the Lorenz condition is yet to be implemented.
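Anyway, one may now proceed to quantize such a massless vector field by postulating
canonical equal-time commutation relations; all components $A_\mu$ may be treated on an equal
footing. So, let us define
$$
\pi^\mu \equiv \frac{\partial\mathcal{L}}{\partial(\partial_0 A_\mu)} = -F^{0\mu} - g^{0\mu}\,\partial\cdot A . \tag{31.27}
$$
In particular, one gets
$$
\pi^0 = -\partial\cdot A . \tag{31.28}
$$
The commutation relations in question read
$$
\begin{aligned}
\big[A_\mu(x), \pi^\nu(y)\big]_{\rm E.T.} &= i\,\delta_\mu^\nu\,\delta^{(3)}(\vec x - \vec y) , \\
\big[A_\mu(x), A_\nu(y)\big]_{\rm E.T.} &= 0 , \\
\big[\pi^\mu(x), \pi^\nu(y)\big]_{\rm E.T.} &= 0 .
\end{aligned} \tag{31.29}
$$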

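Then it is not difficult to show that from Eq. (31.29) one gets
$$
\begin{aligned}
\big[A_\mu(\vec x, t), \dot A_\nu(\vec y, t)\big] &= -i\,g_{\mu\nu}\,\delta^{(3)}(\vec x - \vec y) , \\
\big[\dot A_\mu(\vec x, t), \dot A_\nu(\vec y, t)\big] &= 0 .
\end{aligned} \tag{31.30}
$$
The derivation of the identities (31.30) is straightforward and may be left to the reader as an
instructive exercise. Now, the general solution of the d’Alembert equation can be written as
$$
A^\mu(x) = \int d^3k\,N_k\sum_{\lambda=0}^{3}\Big[a(k,\lambda)\,\epsilon^\mu(k,\lambda)\,e^{-ikx} + a^\dagger(k,\lambda)\,\epsilon^{\mu *}(k,\lambda)\,e^{ikx}\Big] , \tag{31.31}
$$
where the “polarization vectors” are as follows:
$$
\begin{aligned}
\epsilon^\mu(k,\lambda) &= \big(0, \vec\epsilon(\vec k,\lambda)\big) , \qquad \vec k\cdot\vec\epsilon(\vec k,\lambda) = 0 \quad \text{for } \lambda = 1, 2 , \\
\epsilon^\mu(k,3) &= \left(0, \frac{\vec k}{|\vec k|}\right) , \\
\epsilon^\mu(k,0) &= (1, 0, 0, 0) .
\end{aligned} \tag{31.32}
$$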
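The corresponding orthonormality relations read, obviously,
$$
\epsilon(k,\lambda)\cdot\epsilon^*(k,\lambda') = g_{\lambda\lambda'} . \tag{31.33}
$$
Using the conventionally normalized exponentials
$$
f_k(x) = N_k\,e^{-ikx}
$$
as in our previous examples of quantized fields (cf. e.g. the case of scalar field), one may recall
the orthogonality relations
$$
\int d^3x\,f_k^*(x)\,i\overset{\leftrightarrow}{\partial_0}f_{k'}(x) = \delta^{(3)}(\vec k - \vec k') , \qquad
\int d^3x\,f_k(x)\,i\overset{\leftrightarrow}{\partial_0}f_{k'}(x) = 0 ,
$$
where $f\overset{\leftrightarrow}{\partial_0}g = f\,\partial_0 g - (\partial_0 f)\,g$,
and express the operators $a(k,\lambda)$, $a^\dagger(k,\lambda)$ in terms of $A^\mu$ and $\dot A^\mu$. To make the calculation more
transparent, it is convenient to define the combinations
$$
a^\mu(k) = \sum_{\lambda=0}^{3}a(k,\lambda)\,\epsilon^\mu(k,\lambda) . \tag{31.34}
$$
One then obtains first
$$
a^\mu(k) = i\int d^3x\,f_k^*(x)\overset{\leftrightarrow}{\partial_0}A^\mu(x) , \tag{31.35}
$$
and thus also
$$
a^{\dagger\mu}(k) = -i\int d^3x\,f_k(x)\overset{\leftrightarrow}{\partial_0}A^\mu(x) . \tag{31.36}
$$
Then, employing (31.30), we get
$$
\begin{aligned}
\big[a^\mu(k), a^{\dagger\nu}(k')\big] &= -g^{\mu\nu}\,\delta^{(3)}(\vec k - \vec k') , \\
\big[a^\mu(k), a^\nu(k')\big] &= 0 .
\end{aligned} \tag{31.37}
$$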

As a final step, utilizing the orthonormality properties (31.33), one may express $a(k,\lambda)$ as
$$
a(k,\lambda) = g_{\lambda\lambda'}\,\epsilon^*_\mu(k,\lambda')\,a^\mu(k) , \tag{31.38}
$$
and we thus get, after some simple manipulations,
$$
\begin{aligned}
\big[a(k,\lambda), a^\dagger(k',\lambda')\big] &= -g_{\lambda\lambda'}\,\delta^{(3)}(\vec k - \vec k') , \\
\big[a(k,\lambda), a(k',\lambda')\big] &= 0 .
\end{aligned} \tag{31.39}
$$
Again, the reader is encouraged to prove that the relations (31.39) indeed follow from (31.37)
and (31.38).
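
The index gymnastics behind this exercise can be cross-checked numerically. In the sketch below, the polarization vectors (31.32) are constructed for an arbitrary illustrative momentum, and the coefficient of $\delta^{(3)}(\vec k - \vec k')$ in the commutator of $a(k,\lambda)$ and $a^\dagger(k',\lambda')$ is assembled from (31.37) and (31.38); the helper names are hypothetical and real polarization vectors are assumed.

```python
import numpy as np

G = np.diag([1.0, -1.0, -1.0, -1.0])   # serves as g_{mu nu} as well as g_{lambda lambda'}


def polarization_vectors(k3):
    """Rows are eps^mu(k, lambda) for lambda = 0, 1, 2, 3, as in (31.32)."""
    k_hat = np.asarray(k3, dtype=float) / np.linalg.norm(k3)
    helper = np.eye(3)[np.argmin(np.abs(k_hat))]   # coordinate axis least aligned with k
    e1 = np.cross(k_hat, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(k_hat, e1)
    eps = np.zeros((4, 4))
    eps[0, 0] = 1.0
    eps[1, 1:], eps[2, 1:], eps[3, 1:] = e1, e2, k_hat
    return eps


eps = polarization_vectors([0.4, -1.3, 2.2])       # illustrative momentum (assumption)

# (31.33): gram[s, t] = eps*_mu(k, s) eps^mu(k, t)
gram = eps.conj() @ G @ eps.T

# a(lam) = g_{lam s} eps*_mu(s) a^mu and [a^mu, a^{dagger nu}] = -g^{mu nu} delta give
# the coefficient  -g_{lam s} gram[s, t] g_{t lam'}  in front of the delta function.
coefficient = -(G @ gram @ G)

print(np.allclose(gram, G), np.allclose(coefficient, -G))
```

In particular, the entry with $\lambda = \lambda' = 0$ equals $-1$, while those with $\lambda = \lambda' = 1, 2, 3$ equal $+1$.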
The most remarkable feature of the commutation relations (31.39) is that for $\lambda = \lambda' = 0$
one has
$$
\big[a(k,0), a^\dagger(k',0)\big] = -\delta^{(3)}(\vec k - \vec k') , \tag{31.40}
$$
i.e. there is an opposite sign on the right-hand side in comparison with “normal” commutators
of annihilation and creation operators (note that for $\lambda = 1, 2, 3$ the situation is as usual). For
convenience, we may resort to a finite box (with volume $V$) instead of the infinite 3-dimensional
space, and then one has
$$
\big[a(k,0), a^\dagger(k',0)\big] = -\delta_{\vec k,\vec k'} ,
$$
in particular,
$$
\big[a(k,0), a^\dagger(k,0)\big] = -1 . \tag{31.41}
$$

This indicates that the state defined by means of the action of 𝑎 † (𝑘, 0) on the conventional
vacuum should have negative norm squared, since Eq. (31.41) yields

⟨0|𝑎(𝑘, 0)𝑎 † (𝑘, 0)|0⟩ = −1 . (31.42)

The discussion of this intriguing fact is a main theme of the following chapter. It will turn out
that a way out of such a conundrum consists in a proper implementation of the Lorenz condition.

Chapter 32

Gupta–Bleuler method

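Before proceeding to the main topic of this chapter, let us calculate the energy of our quantized
massless vector field. For this purpose it is useful to notice that the original Lagrangian (31.26)
is in fact equivalent to a somewhat simpler form of the “Klein–Gordon type”, namely
$$
\widetilde{\mathcal{L}} = -\frac12\,\partial_\mu A_\nu\,\partial^\mu A^\nu . \tag{32.1}
$$
Indeed, the Lagrangian (31.26) reads
$$
\mathcal{L} = -\frac14\,(\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial^\mu A^\nu - \partial^\nu A^\mu) - \frac12\,(\partial\cdot A)^2 = -\frac12\,\partial_\mu A_\nu\,\partial^\mu A^\nu + \Delta ,
$$
with
$$
\Delta = \frac12\,\partial_\mu A_\nu\,\partial^\nu A^\mu - \frac12\,(\partial\cdot A)^2 , \tag{32.2}
$$
and it is easy to verify that
$$
\Delta = \frac12\,\partial_\mu\big(A_\nu\,\partial^\nu A^\mu - A^\mu\,\partial^\nu A_\nu\big) . \tag{32.3}
$$
Thus, using the notation (32.1), we have
$$
\mathcal{L} = \widetilde{\mathcal{L}} + \partial_\mu\Delta^\mu , \tag{32.4}
$$
where
$$
\Delta^\mu = \frac12\,\big(A_\nu\,\partial^\nu A^\mu - A^\mu\,\partial^\nu A_\nu\big) .
$$
However, we know that two Lagrangian densities differing just by a four-divergence lead to the
same physical results; in particular, they give the same energy (and momentum). Now, the $\widetilde{\mathcal{L}}$
becomes
$$
\begin{aligned}
\widetilde{\mathcal{L}} &= -\frac12\,\partial_\mu A_\nu\,\partial^\mu A^\nu \\
&= -\frac12\,\partial_\mu A_0\,\partial^\mu A_0 + \frac12\sum_{j=1}^{3}\partial_\mu A_j\,\partial^\mu A_j ,
\end{aligned} \tag{32.5}
$$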

and thus we have an alternating sum of terms of the Klein–Gordon (KG) type (the components
$A_\mu$ corresponding to four independent KG fields). So, using our previous knowledge from the
theory of quantized KG field, and employing the notation (31.34) in the decomposition (31.31),
one gets first
$$
\begin{aligned}
H &= -\frac12\int d^3k\;k_0\,\Big[a_0^\dagger(k)\,a_0(k) + a_0(k)\,a_0^\dagger(k)\Big] + \frac12\sum_{j=1}^{3}\int d^3k\;k_0\,\Big[a_j^\dagger(k)\,a_j(k) + a_j(k)\,a_j^\dagger(k)\Big] \\
&= -\frac12\int d^3k\;|\vec k|\,\Big[a^\dagger_\mu(k)\,a^\mu(k) + a_\mu(k)\,a^{\dagger\mu}(k)\Big] .
\end{aligned} \tag{32.6}
$$
Next, using (31.34) and the orthogonality relations (31.33), one gets finally
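\[
\begin{aligned}
H = {} & -\frac{1}{2}\int d^3k\; |\vec{k}| \left[ a^\dagger(k,0)\,a(k,0) + a(k,0)\,a^\dagger(k,0) \right] \\
 & + \frac{1}{2}\sum_{\lambda=1}^{3}\int d^3k\; |\vec{k}| \left[ a^\dagger(k,\lambda)\,a(k,\lambda) + a(k,\lambda)\,a^\dagger(k,\lambda) \right] .
\end{aligned} \tag{32.7}
\]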

We will proceed further in the usual way, and try to interpret 𝑎(𝑘, 𝜆) and 𝑎 † (𝑘, 𝜆), 𝜆 = 0, 1, 2, 3,
as the annihilation and creation operators, respectively. Then we can employ the usual trick
of normal ordering; for convenience, we may also resort to the finite volume of 3-dimensional
space (“box”), replacing thereby the integrals in the expression (32.7) by infinite sums over
discrete values of $\vec{k}$. So, instead of (32.7) we are going to work with the form
\[
H = -\sum_{\vec{k}} |\vec{k}|\, a^\dagger(k,0)\,a(k,0) + \sum_{\lambda=1}^{3}\sum_{\vec{k}} |\vec{k}|\, a^\dagger(k,\lambda)\,a(k,\lambda) . \tag{32.8}
\]

A remark is in order here. It is instructive to realize that despite the minus sign in the first term
in (32.8), the eigenvalues of the operator 𝐻 are non-negative. This is due to the anomalous
commutation relation (31.41)
\[
\left[ a(k,0),\, a^\dagger(k,0) \right] = -1 . \tag{32.9}
\]

Indeed, it is easy to see that if $[a, a^\dagger] = -1$, then the operator $-a^\dagger a$ has eigenvalues 0, 1, 2, . . .

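On the other hand, if one denotes $\hat{N} = -a^\dagger a$ and $|\psi\rangle = a^\dagger|0\rangle$, then $|\psi\rangle$ is an eigenvector of
the operator $\hat{N}$ with the eigenvalue 1, $\hat{N}|\psi\rangle = |\psi\rangle$, but the expectation value of $\hat{N}$ is negative,
$\langle\psi|\hat{N}|\psi\rangle = \langle\psi|\psi\rangle = \langle 0|a a^\dagger|0\rangle = -1$. One should keep in mind such rather counterintuitive facts.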
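
The statements above can be checked in a small numerical toy model. The following is only a sketch under stated assumptions: a single mode truncated to `DIM` levels, real amplitudes, and an indefinite scalar product with $\langle n|n\rangle = (-1)^n$. The names `DIM`, `inner`, `a`, `a_dag` and `number_op` are illustrative and do not come from the text.

```python
import math

DIM = 6  # number of retained levels |0>, ..., |DIM-1> (illustrative truncation)

def inner(phi, psi):
    """Indefinite scalar product with <n|n> = (-1)^n (real amplitudes)."""
    return sum((-1) ** n * phi[n] * psi[n] for n in range(DIM))

def a(state):
    """a|n> = sqrt(n)|n-1>, so that a|0> = 0."""
    out = [0.0] * DIM
    for n in range(1, DIM):
        out[n - 1] += math.sqrt(n) * state[n]
    return out

def a_dag(state):
    """Adjoint of a with respect to inner(): a_dag|n> = -sqrt(n+1)|n+1>."""
    out = [0.0] * DIM
    for n in range(DIM - 1):
        out[n + 1] -= math.sqrt(n + 1) * state[n]
    return out

def number_op(state):
    """N = -a_dag a; on the basis states it acts as N|n> = n|n>."""
    return [-x for x in a_dag(a(state))]

vac = [1.0] + [0.0] * (DIM - 1)
psi = a_dag(vac)                 # one-quantum state a_dag|0>
norm_sq = inner(psi, psi)        # negative norm squared
n_psi = number_op(psi)           # equals psi, i.e. eigenvalue +1
expectation = inner(psi, n_psi)  # negative, although the eigenvalue is +1
```

In this representation the minus sign of the commutator is carried by `a_dag`, so that $-a^\dagger a$ has the spectrum 0, 1, 2, ..., while the states with an odd number of quanta have negative norm squared.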

After this somewhat lengthy introduction let us now proceed to the discussion of the
Lorenz condition. As a first attempt, one might try to impose the constraint 𝜕 𝜇 𝐴 𝜇 = 0 as an
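operator identity. We have
\[
A_\mu(x) = \sum_{\vec{k}}\sum_{\lambda=0}^{3} N_k \left[ a(k,\lambda)\,\epsilon_\mu(k,\lambda)\,e^{-ikx} + a^\dagger(k,\lambda)\,\epsilon^*_\mu(k,\lambda)\,e^{ikx} \right] , \tag{32.10}
\]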

and in the expression for 𝜕 𝜇 𝐴 𝜇 one thus gets scalar products 𝑘 𝜇 𝜖 𝜇 (𝑘, 𝜆). Using the definition
(31.32), one sees that 𝑘 𝜇 𝜖 𝜇 (𝑘, 𝜆) = 0 for 𝜆 = 1, 2, but

\[
\begin{aligned}
k^\mu \epsilon_\mu(k,3) &= -\vec{k}\cdot\vec{\epsilon}(k,3) = -|\vec{k}| , \\
k^\mu \epsilon_\mu(k,0) &= |\vec{k}| .
\end{aligned} \tag{32.11}
\]
The operator identity 𝜕 𝜇 𝐴 𝜇 (𝑥) = 0 would then obviously mean 𝑎(𝑘, 0) − 𝑎(𝑘, 3) = 0, i.e.
𝑎(𝑘, 0) = 𝑎(𝑘, 3), but this is impossible, since 𝑎(𝑘, 0) and 𝑎(𝑘, 3) satisfy different commutation
relations (cf. (31.39)).

Thus, one is forced to change the strategy. Instead of a constraint for field operators, one
may try to impose the Lorenz condition on the states of the quantized massless vector field in
question. One may expect, quite naturally, that Lorenz condition should be connected with a
selection of physical states (in a vague analogy with the classical theory). So, as a first attempt
within such strategy, let us consider the definition of physical states according to

𝜕 𝜇 𝐴 𝜇 (𝑥)|𝜓phys. ⟩ = 0 . (32.12)

In fact, such a straightforward definition leads to a contradiction: it turns out that then not even
the vacuum could satisfy the condition (32.12). Let us explain this observation in more detail.
For the vacuum state one certainly has 𝜕 𝜇 𝐴−𝜇 (𝑥)|0⟩ = 0, where 𝐴−𝜇 (𝑥) denotes the annihilation
part of the field operator 𝐴 𝜇 (𝑥). Thus, the condition (32.12) would in fact mean that

𝜕 𝜇 𝐴+𝜇 (𝑥)|0⟩ = 0 , (32.13)

where 𝐴+𝜇 denotes the creation part of 𝐴 𝜇 . Among other things, it would then also mean that

⟨0| 𝐴−𝜇 (𝑥)𝜕 𝜈 𝐴+𝜈 (𝑦)|0⟩ = 0 . (32.14)

On the other hand, one has


\[
\langle 0|A^-_\mu(x)\,A^+_\nu(y)|0\rangle = -g_{\mu\nu}\int \frac{d^3k}{(2\pi)^3\, 2|\vec{k}|}\; e^{-ik(x-y)} \tag{32.15}
\]
(this is a result of a simple calculation, using in particular the commutation relations (31.37)).18
Thus, one would get

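\[
\begin{aligned}
\langle 0|A^-_\mu(x)\,\partial^\nu A^+_\nu(y)|0\rangle &= -i\,g_{\mu\nu}\,\partial^{\nu}_{(y)} D^-(x-y) \\
 &= -i\,\partial^{(y)}_\mu D^-(x-y) ,
\end{aligned} \tag{32.16}
\]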
but the right-hand side of Eq. (32.16) is certainly non-zero.
Nevertheless, one may impose the Lorenz condition in a weaker form, namely

𝜕 𝜇 𝐴−𝜇 (𝑥)|𝜓phys. ⟩ = 0 . (32.17)

This is precisely the formulation due to Suraj Gupta and Konrad Bleuler, who suggested this
independently in 1950. Needless to say, Eq. (32.17) is automatically valid for the vacuum, as it
involves only annihilation operators.
Now, what does the condition (32.17) mean explicitly, in terms of annihilation operators?
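To see this, let us use again the expansion (32.10) and the relations (32.11). One gets
\[
\begin{aligned}
\partial^\mu A^-_\mu(x) &= \sum_{\vec{k}}\sum_{\lambda=0}^{3} N_k\, a(k,\lambda)\left[ -i k^\mu \epsilon_\mu(k,\lambda) \right] e^{-ikx} \\
 &= -i\sum_{\vec{k}} N_k\, |\vec{k}|\, e^{-ikx}\left[ a(k,0) - a(k,3) \right] .
\end{aligned} \tag{32.18}
\]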

Thus, it is clear that the condition (32.17) for a physical state means

\[
\left[ a(k,0) - a(k,3) \right] |\psi_{\rm phys.}\rangle = 0 \tag{32.19}
\]
for any value of $\vec{k}$.

18 Note also that the integral in Eq. (32.15) is usually denoted as $iD^-(x-y)$ and $D^-$ is called the Pauli–Jordan
commutator function.
A remark is in order here. The relation (32.17) also means that

⟨𝜓phys. |𝜕 𝜇 𝐴 𝜇 (𝑥)|𝜓phys. ⟩ = 0 . (32.20)

Why is it so? Obviously, the Hermitian conjugate of (32.17) reads $\langle\psi_{\rm phys.}|\partial^\mu A^+_\mu(x) = 0$, and since
$A_\mu = A^-_\mu + A^+_\mu$, combining these two identities one obtains (32.20). However, it is not difficult to find out that the conditions (32.17)
and (32.20) are not equivalent. An appropriate example of the state vector satisfying (32.20) but
not (32.17) is 𝑎 † (𝑘, 0)|0⟩ (the reader is encouraged to verify this simple statement).
Thus, one may conclude that (32.20) would be an elegant candidate for quantum Lorenz
condition, but it turns out that it is weaker than the correct relation (32.17).
The overall picture of the quantization scheme developed so far can be described as
follows. We use four types of creation and annihilation operators 𝑎 † (𝑘, 𝜆), 𝑎(𝑘, 𝜆), 𝜆 = 0, 1, 2, 3,
that act in a rather broad vector space (let’s call it V ) whose metric, induced by a scalar product
(adapted to the 𝑎, 𝑎 † algebra), is indefinite, i.e. the would-be “norm squared” ⟨𝜓|𝜓⟩ may be
negative for some |𝜓⟩ ∈ V . In particular, defining vacuum state |0⟩ in terms of the annihilation
operators in the usual way, then for |𝜓⟩ = 𝑎 † (𝑘, 0)|0⟩ one has ⟨𝜓|𝜓⟩ < 0. The operators
𝑎(𝑘, 𝜆), 𝑎 † (𝑘, 𝜆) are associated with vectors 𝜖 𝜇 (𝑘, 𝜆) in the expansion (32.10) and this leads
naturally to the conventional terminology: the state created by 𝑎 † (𝑘, 𝜆) is generally called a
“photon”, which for 𝜆 = 1, 2 is “transverse”, for 𝜆 = 3 we use the label “longitudinal”, and the
case 𝜆 = 0 corresponds to a “scalar” (or “time-like”) photon. From our previous discussion
of the radiation gauge we know that only transverse photons are physical, so the states with
𝜆 = 0, 3 involved in our present treatment are to be considered unphysical, corresponding to the
redundant components of the 𝐴 𝜇 field. The Lorenz condition (32.17), or equivalently (32.19),
is intended to select a physical subspace Vphys. ⊂ V . As a simple example of properties of the
states |𝜓phys. ⟩ satisfying Eq. (32.19), let us consider the expectation values of the Hamiltonian
(32.8). According to Eq. (32.19) one has

𝑎(𝑘, 0)|𝜓phys. ⟩ = 𝑎(𝑘, 3)|𝜓phys. ⟩ ,

so
⟨𝜓phys. |𝑎 † (𝑘, 0) = ⟨𝜓phys. |𝑎 † (𝑘, 3) .
Thus one gets

⟨𝜓phys. |𝑎 † (𝑘, 0)𝑎(𝑘, 0)|𝜓phys. ⟩ = ⟨𝜓phys. |𝑎 † (𝑘, 3)𝑎(𝑘, 3)|𝜓phys. ⟩ . (32.21)

From (32.8) it is then clear that the contributions of the scalar and longitudinal photons mutually
cancel in the expectation value in question and one gets

⟨𝜓phys. |𝐻|𝜓phys. ⟩ = ⟨𝜓phys. |𝐻tr |𝜓phys. ⟩ , (32.22)

where
\[
H_{\rm tr} = \sum_{\vec{k}}\sum_{\lambda=1}^{2} |\vec{k}|\, a^\dagger(k,\lambda)\,a(k,\lambda)
\]
is the transverse part of $H$.
In any case, one would like to know the complete structure of $V_{\rm phys.}$. First
of all, it is obvious that any state $|\psi_{\rm tr}\rangle$, consisting of transverse photons only, belongs to $V_{\rm phys.}$.

This, of course, is due to the fact that 𝑎(𝑘, 0) and 𝑎(𝑘, 3) commute with any 𝑎 † (𝑘, 𝜆), 𝜆 = 1, 2,
so 𝑎(𝑘, 𝜆)|𝜓tr ⟩ = 0 for 𝜆 = 0, 3. Further, let us denote

\[
\begin{aligned}
L^-(k) &= a(k,0) - a(k,3) , \\
L^+(k) &= a^\dagger(k,0) - a^\dagger(k,3)
\end{aligned} \tag{32.23}
\]

(so, 𝐿 − (𝑘) and 𝐿 + (𝑘) may be provisionally called annihilation and creation part of the “Lorenz
operator” in momentum representation). From the commutation relations (31.37) one can see
immediately that
\[
\left[ L^-(k),\, L^+(k') \right] = 0 \tag{32.24}
\]

for any 𝑘, 𝑘 ′. From (32.24) it is then clear that a vector |𝜓 (0) ⟩ obtained by a repeated action of
the operators 𝐿 + (𝑘) on a vector |𝜓tr ⟩ (in general, for various values of 𝑘) is a zero-norm state,
⟨𝜓 (0) |𝜓 (0) ⟩ = 0. Thus, any |𝜓⟩ of the type |𝜓⟩ = |𝜓tr ⟩ + |𝜓 (0) ⟩ belongs to Vphys. and for the scalar
product of two such vectors one gets

⟨𝜙|𝜓⟩ = ⟨𝜙tr |𝜓tr ⟩ . (32.25)

Moreover, it should be obvious that for any |𝜓tr ⟩ one has ⟨𝜓tr |𝜓tr ⟩ ≥ 0 (this, of course, is due
to the “normal” commutation relations for 𝑎(𝑘, 𝜆), 𝑎 † (𝑘, 𝜆) with 𝜆 = 1, 2). A key result of the
Gupta–Bleuler theory is that the vectors of the type |𝜓tr ⟩ + |𝜓 (0) ⟩ in fact constitute the whole
Vphys. , i.e. any state in Vphys. is essentially transverse, with the possible admixture of a zero-norm
vector mentioned above. For more technical details concerning this point see e.g. the books [8]
and [16].
One may conclude that the Gupta–Bleuler quantization scheme is successful in selecting
correctly the physical states corresponding essentially to transverse photons, and in identifying a
physical subspace Vphys. with positive definite metric. Vphys. is thus a standard Hilbert space —
although one has always a whole “tower” of states of the form |𝜓tr ⟩ + |𝜓 (0) ⟩, the purely transverse
|𝜓tr ⟩ may be chosen as a representative of such an equivalence class in practical perturbative
calculations.
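
The role of the operators $L^\pm(k)$ can be illustrated in a toy Fock space for one fixed momentum. This is a sketch under stated assumptions, not part of the Gupta–Bleuler construction itself: only the scalar and longitudinal modes are kept, a state is stored as a dictionary `{(n0, n3): amplitude}` with real amplitudes, and the scalar product carries the sign $(-1)^{n_0}$. All function names are illustrative.

```python
import math
from collections import defaultdict

def inner(phi, psi):
    """Indefinite scalar product: <n0 n3|n0 n3> = (-1)^n0."""
    return sum((-1) ** n0 * phi.get((n0, n3), 0.0) * amp
               for (n0, n3), amp in psi.items())

def apply(op, state):
    """Apply a ladder operator given by its action on a basis state |n0, n3>."""
    out = defaultdict(float)
    for (n0, n3), amp in state.items():
        for key, coeff in op(n0, n3):
            out[key] += coeff * amp
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def a0(n0, n3):
    return [((n0 - 1, n3), math.sqrt(n0))] if n0 > 0 else []

def a0_dag(n0, n3):
    # the minus sign implements the anomalous relation [a0, a0_dag] = -1
    return [((n0 + 1, n3), -math.sqrt(n0 + 1))]

def a3(n0, n3):
    return [((n0, n3 - 1), math.sqrt(n3))] if n3 > 0 else []

def a3_dag(n0, n3):
    return [((n0, n3 + 1), math.sqrt(n3 + 1))]

def subtract(s1, s2):
    out = defaultdict(float)
    for k, v in s1.items():
        out[k] += v
    for k, v in s2.items():
        out[k] -= v
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def L_minus(state):
    return subtract(apply(a0, state), apply(a3, state))

def L_plus(state):
    return subtract(apply(a0_dag, state), apply(a3_dag, state))

vac = {(0, 0): 1.0}
psi0 = L_plus(vac)          # zero-norm state, satisfies L_minus psi0 = 0
psi00 = L_plus(psi0)        # repeated action of L_plus, still zero norm
chi = apply(a0_dag, vac)    # scalar photon: negative norm, not in V_phys
```

Here `L_minus(psi0)` is the empty dictionary (the zero vector) and `inner(psi0, psi0)` vanishes, whereas `chi` violates the condition (32.19) and has norm squared $-1$.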
One must admit that the Gupta–Bleuler theory is rather sophisticated and quite complicated.
However, for practical calculations it is sufficient to know the polarization sum (31.13)
for the physical photons (of course, this is the same for any approach to the quantization of the
Maxwell field) and, last but not least, the propagator.
In fact, evaluating the propagator of the field 𝐴 𝜇 within our covariant quantization scheme
is quite simple — it is almost identical to the case of the (massless) Klein–Gordon scalar field.
Indeed, $A_\mu(x)$ is written as
\[
A_\mu(x) = \int d^3k\; N_k \left[ a_\mu(k)\,e^{-ikx} + a^\dagger_\mu(k)\,e^{ikx} \right] ,
\]

and from the commutation relations (31.37) one gets

\[
\langle 0|a_\mu(k)\,a^\dagger_\nu(l)|0\rangle = -g_{\mu\nu}\,\delta^{(3)}(\vec{k} - \vec{l}\,) . \tag{32.26}
\]

Thus, one may proceed in the same way as in the case of the scalar field, the only exception
being the extra factor −𝑔 𝜇𝜈 in (32.26). The result can therefore be written down immediately:

\[
\langle 0|T\big(A_\mu(x)\,A_\nu(y)\big)|0\rangle = i\int \frac{d^4q}{(2\pi)^4}\; D_{\mu\nu}(q)\,e^{iq(x-y)} , \tag{32.27}
\]
with
\[
D_{\mu\nu}(q) = \frac{-g_{\mu\nu}}{q^2 + i\epsilon} . \tag{32.28}
\]
It is reassuring that this result coincides with the form obtained in the massless limit from
the covariant part of the propagator of the Proca field, as it enters the scattering amplitudes of physical
processes like $e^+e^- \to \mu^+\mu^-$, $ep \to ep$, etc. In the Dyson perturbation series one can also use
$\mathcal{H}_{\rm int} = -\mathcal{L}_{\rm int}$ in a straightforward way.

Chapter 33

Compton scattering:
Klein–Nishina formula

As a prelude to the main topic of this chapter, we are going to derive first a useful formula for
the cross section of a general binary process in variables of the laboratory system (i.e. in the rest
system of a target particle).
So, let us consider a process 1 + 2 → 3 + 4, denoting the corresponding four-momenta as
𝑝 1 , 𝑝 2 , 𝑝 3 , 𝑝 4 and the masses as 𝑚 1 , . . . , 𝑚 4 . We choose the laboratory system to be the rest
frame of the particle 2, i.e.
\[
p_2 = (m_2, \vec{0}\,) . \tag{33.1}
\]
According to the general formula (23.38) for differential cross section we have

\[
d\sigma = |\mathcal{M}|^2\,\frac{1}{|\vec{v}_1|}\,\frac{1}{2E_1}\,\frac{1}{2E_2}\;
\frac{d^3p_3}{(2\pi)^3\,2E_3}\,\frac{d^3p_4}{(2\pi)^3\,2E_4}\;(2\pi)^4\,\delta^{(4)}(p_3 + p_4 - p_1 - p_2) . \tag{33.2}
\]

Our goal is to derive the angular distribution of the particle 3. The integration over d3 𝑝 4 in
(33.2) is trivial and one thus gets first (using also the familiar relation |®𝑣 1 | = | 𝑝®1 |/𝐸 1 )

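\[
\begin{aligned}
d\sigma = {} & |\mathcal{M}|^2\,\frac{E_1}{|\vec{p}_1|}\,\frac{1}{2E_1}\,\frac{1}{2E_2}\;
\frac{d^3p_3}{(2\pi)^3\,2E_3}\,\frac{1}{(2\pi)^3\,2E_4} \\
 & \times (2\pi)^4\,\delta\Big( \sqrt{|\vec{p}_3|^2 + m_3^2} + \sqrt{|\vec{p}_4|^2 + m_4^2} - E_1 - E_2 \Big) ,
\end{aligned} \tag{33.3}
\]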

where $\vec{p}_4 = \vec{p}_1 - \vec{p}_3$ (see Fig. 33.1).

Fig. 33.1: Kinematics of the 1 + 2 → 3 + 4 process in the laboratory system.

Thus, denoting by $\vartheta$ the scattering angle, i.e. the angle between $\vec{p}_1$ and $\vec{p}_3$, and
setting also $E_2 = m_2$, one has

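\[
\begin{aligned}
d\sigma = {} & |\mathcal{M}|^2\,\frac{1}{2|\vec{p}_1|}\,\frac{1}{2m_2}\;
\frac{d^3p_3}{(2\pi)^3\,2E_3}\,\frac{1}{(2\pi)^3\,2E_4} \\
 & \times (2\pi)^4\,\delta\Big( \sqrt{|\vec{p}_3|^2 + m_3^2} + \sqrt{|\vec{p}_3|^2 - 2|\vec{p}_1||\vec{p}_3|\cos\vartheta + |\vec{p}_1|^2 + m_4^2} - E_1 - m_2 \Big) .
\end{aligned} \tag{33.4}
\]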

For brevity, let us now denote 𝑥 ≡ | 𝑝®3 |. Eq. (33.4) may be recast as

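\[
d\sigma = |\mathcal{M}|^2\,\frac{1}{2|\vec{p}_1|}\,\frac{1}{2m_2}\;
\frac{x^2\,dx\,d\Omega}{(2\pi)^3\,2E_3}\,\frac{1}{(2\pi)^3\,2E_4}\;(2\pi)^4\,\delta\big( f(x) \big) , \tag{33.5}
\]
where
\[
f(x) = \sqrt{x^2 + m_3^2} + \sqrt{x^2 - 2x|\vec{p}_1|\cos\vartheta + |\vec{p}_1|^2 + m_4^2} - E_1 - m_2 . \tag{33.6}
\]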
Then one may utilize the standard formula
\[
\delta\big( f(x) \big) = \frac{1}{|f'(x_0)|}\,\delta(x - x_0) , \tag{33.7}
\]

where 𝑥 0 is the zero of 𝑓 (𝑥), i.e. 𝑓 (𝑥 0 ) = 0. Differentiating the function (33.6), one gets
\[
f'(x) = \frac{x}{\sqrt{x^2 + m_3^2}} + \frac{x - |\vec{p}_1|\cos\vartheta}{\sqrt{x^2 - 2x|\vec{p}_1|\cos\vartheta + |\vec{p}_1|^2 + m_4^2}} , \tag{33.8}
\]

i.e.
\[
f'(x_0) = \frac{x_0}{E_3} + \frac{x_0 - |\vec{p}_1|\cos\vartheta}{E_4}
= \frac{|\vec{p}_3|(E_1 + m_2) - E_3|\vec{p}_1|\cos\vartheta}{E_3 E_4} , \tag{33.9}
\]
where we have taken into account the energy conservation 𝐸 3 + 𝐸 4 = 𝐸 1 + 𝑚 2 . The integration
over the variable 𝑥 in the expression (33.5) thus yields

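\[
d\sigma = |\mathcal{M}|^2\,\frac{1}{2|\vec{p}_1|}\,\frac{1}{2m_2}\;
\frac{|\vec{p}_3|^2\,d\Omega}{(2\pi)^3\,2E_3}\,\frac{1}{(2\pi)^3\,2E_4}\;
\frac{E_3 E_4\,(2\pi)^4}{|\vec{p}_3|(E_1 + m_2) - E_3|\vec{p}_1|\cos\vartheta} ,
\]
and the resulting formula for the angular distribution in question becomes
\[
\frac{d\sigma}{d\Omega_{\rm lab}} = \frac{1}{64\pi^2}\,|\mathcal{M}|^2\,\frac{1}{|\vec{p}_1|\,m_2}\;
\frac{|\vec{p}_3|}{E_1 + m_2 - \dfrac{|\vec{p}_1|}{|\vec{p}_3|}\,E_3\cos\vartheta} . \tag{33.10}
\]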
Let us add that | 𝑝®3 | and 𝐸 3 are given as functions of the scattering angle 𝜗 and masses through
the condition of energy conservation, i.e.
\[
\sqrt{x_0^2 + m_3^2} + \sqrt{x_0^2 - 2x_0|\vec{p}_1|\cos\vartheta + |\vec{p}_1|^2 + m_4^2} = E_1 + m_2 , \tag{33.11}
\]

where we have kept the notation 𝑥 0 = | 𝑝®3 |. In a general case of arbitrary masses 𝑚 1 , . . . , 𝑚 4 the
result for 𝑥 0 is rather complicated, but it is simplified substantially e.g. for the elastic scattering
of a massless particle (which is precisely the envisaged case of the Compton scattering).
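
For arbitrary masses the condition (33.11) can also be solved numerically. The sketch below uses plain bisection on the function $f(x)$ of (33.6) and compares the result, for the massless elastic case, with the closed-form Compton relation. It assumes that $f$ changes sign exactly once on the interval $[0, E_1 + m_2]$, which holds in the Compton case but not necessarily for arbitrary masses; the function names and the numerical values in the usage line are illustrative.

```python
import math

def f(x, p1, theta, m1, m2, m3, m4):
    """Energy-balance function (33.6); it vanishes for x = |p3|."""
    E1 = math.sqrt(p1 ** 2 + m1 ** 2)
    E3 = math.sqrt(x ** 2 + m3 ** 2)
    E4 = math.sqrt(x ** 2 - 2.0 * x * p1 * math.cos(theta) + p1 ** 2 + m4 ** 2)
    return E3 + E4 - E1 - m2

def solve_p3(p1, theta, m1, m2, m3, m4, iterations=200):
    """Bisection for the root x0 of f on [0, E1 + m2] (single sign change assumed)."""
    E1 = math.sqrt(p1 ** 2 + m1 ** 2)
    lo, hi = 0.0, E1 + m2
    if f(lo, p1, theta, m1, m2, m3, m4) > 0.0 or f(hi, p1, theta, m1, m2, m3, m4) < 0.0:
        raise ValueError("no sign change of f on the chosen interval")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid, p1, theta, m1, m2, m3, m4) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def compton_omega_prime(omega, theta, m):
    """Closed-form solution for m1 = m3 = 0, m2 = m4 = m."""
    return omega / (1.0 + (omega / m) * (1.0 - math.cos(theta)))

# usage: photon energy 1.0 and target mass 0.511 in the same (arbitrary) units
numeric = solve_p3(1.0, 1.0, 0.0, 0.511, 0.0, 0.511)
closed = compton_omega_prime(1.0, 1.0, 0.511)
```

Bisection is chosen here only for robustness; any one-dimensional root finder would serve equally well.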
So, let us now consider the case 𝑚 1 = 𝑚 3 = 0, 𝑚 2 = 𝑚 4 = 𝑚 ≠ 0 and rename the
four-momenta as 𝑝 1 ≡ 𝑘, 𝑝 2 ≡ 𝑝, 𝑝 3 ≡ 𝑘 ′, 𝑝 4 ≡ 𝑝′. The energy conservation (33.11) then
means
\[
|\vec{k}'| + \sqrt{|\vec{k}'|^2 - 2|\vec{k}||\vec{k}'|\cos\vartheta + |\vec{k}|^2 + m^2} = |\vec{k}| + m . \tag{33.12}
\]
For brevity, let us denote provisionally 𝜔 = | 𝑘® |, 𝜔′ = | 𝑘®′ |. The solution of Eq. (33.12) can be
then written as
\[
\omega' = \frac{\omega}{1 + \dfrac{\omega}{m}(1 - \cos\vartheta)} , \tag{33.13}
\]
which is the famous Compton relation for the change of energy (or frequency) of the photon in
the scattering process.19 From (33.10) one then gets, after some simple manipulations,

\[
\frac{d\sigma}{d\Omega_{\rm lab}} = \frac{1}{64\pi^2}\,\frac{1}{m^2}\,|\mathcal{M}|^2\,\frac{|\vec{k}'|^2}{|\vec{k}|^2} , \tag{33.14}
\]

where | 𝑘®′ | = 𝜔′ is given by (33.13).
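
The "simple manipulations" leading from (33.10) to (33.14) can be cross-checked numerically. The following sketch evaluates both expressions for the Compton kinematics, with the matrix element squared treated as a plain number `M2`; the function names and the sample values are illustrative assumptions.

```python
import math

def compton_omega_prime(omega, theta, m):
    """Compton relation (33.13)."""
    return omega / (1.0 + (omega / m) * (1.0 - math.cos(theta)))

def dsigma_lab_general(M2, p1, p3, E1, E3, m2, theta):
    """General laboratory-frame angular distribution (33.10)."""
    denom = E1 + m2 - (p1 / p3) * E3 * math.cos(theta)
    return M2 / (64.0 * math.pi ** 2) * p3 / (p1 * m2 * denom)

def dsigma_lab_compton(M2, omega, omega_prime, m):
    """Special case (33.14) for a massless projectile scattered elastically."""
    return M2 / (64.0 * math.pi ** 2 * m ** 2) * (omega_prime / omega) ** 2

omega, m, theta = 1.3, 0.511, 0.7           # sample values in arbitrary units
omega_prime = compton_omega_prime(omega, theta, m)
general = dsigma_lab_general(1.0, omega, omega_prime, omega, omega_prime, m, theta)
special = dsigma_lab_compton(1.0, omega, omega_prime, m)
```

The agreement rests on the identity $\omega + m - \omega\cos\vartheta = m\,\omega/\omega'$, which is just (33.13) rewritten.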


Let us now consider the Compton scattering, i.e. the scattering of the photon on a charged
particle — for definiteness we will have in mind the electron. So, the process in question is

𝛾(𝑘) + 𝑒 − ( 𝑝) → 𝛾(𝑘 ′) + 𝑒 − ( 𝑝′) , (33.15)

where we have also specified the corresponding four-momenta. In what follows, we are going to
work in the laboratory frame where $p = (m, \vec{0}\,)$. The interaction Lagrangian has the familiar form
$\mathcal{L}_{\rm int} = e\,\bar{\psi}\gamma^\mu\psi A_\mu$ and the lowest order Feynman diagrams describing the scattering amplitude
are shown in Fig. 33.2 (the reader is encouraged to find out how these diagrams emerge from
the 2nd order of the Dyson expansion of the $S$-matrix).

Fig. 33.2: 2nd order Feynman diagrams for Compton scattering. The internal electron line carries the
four-momentum q = p + k in the first diagram and Q = p − k′ in the second one.

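Utilizing standard Feynman rules, the matrix element $\mathcal{M} = \mathcal{M}_a + \mathcal{M}_b$ is given by
\[
\begin{aligned}
i\mathcal{M} = {} & i^3 e^2\, \bar{u}(p')\,\gamma_\mu\,\frac{1}{\slashed{p} + \slashed{k} - m}\,\gamma_\nu\, u(p)\;\epsilon'^\mu(k')\,\epsilon^\nu(k) \\
 & + i^3 e^2\, \bar{u}(p')\,\gamma_\mu\,\frac{1}{\slashed{p} - \slashed{k}' - m}\,\gamma_\nu\, u(p)\;\epsilon^\mu(k)\,\epsilon'^\nu(k') .
\end{aligned} \tag{33.16}
\]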

19 The relation (33.13) can be easily recast in terms of the corresponding wavelengths; in the ordinary system of
units it reads $\lambda' - \lambda = (h/mc)(1 - \cos\vartheta)$, where $h = 2\pi\hbar$. In 1922, Arthur Holly Compton (1892-1962) discovered
experimentally such a shift of the X-ray wavelength due to the scattering by free electrons and this is why the term
Compton wavelength refers traditionally to the quantity $h/mc$. This observation was of fundamental importance,
since it provided truly convincing evidence for photons as the quanta of the electromagnetic field. Compton
received the Nobel Prize in 1927, according to the original (rather terse) citation of the Nobel committee, “for his
discovery of the effect named after him”.

Note that in general one has to use $\epsilon(k)$ for the incoming photon and $\epsilon^*(k')$ for the outgoing one.
For simplicity, we consider here real values of $\epsilon$'s (having thus in mind linear polarizations). In
what follows, we are going to use the shorthand notation $\epsilon$ and $\epsilon'$ for the relevant polarization
vectors. Recall that the physical (transverse) polarizations have the form $\epsilon = (0, \vec{\epsilon}\,)$ and it holds
$k\cdot\epsilon(k) = 0$, $k'\cdot\epsilon'(k') = 0$. So, we have

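\[
\mathcal{M} = -e^2\left[ \bar{u}(p')\,\slashed{\epsilon}'\,\frac{\slashed{p} + \slashed{k} + m}{(p + k)^2 - m^2}\,\slashed{\epsilon}\,u(p)
 + \bar{u}(p')\,\slashed{\epsilon}\,\frac{\slashed{p} - \slashed{k}' + m}{(p - k')^2 - m^2}\,\slashed{\epsilon}'\,u(p) \right] ,
\]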

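and using $p^2 = m^2$, $k^2 = 0$, $k'^2 = 0$, it becomes
\[
\mathcal{M} = -e^2\,\bar{u}(p')\left[ \frac{\slashed{\epsilon}'(\slashed{p} + \slashed{k} + m)\slashed{\epsilon}}{2p\cdot k}
 + \frac{\slashed{\epsilon}(\slashed{p} - \slashed{k}' + m)\slashed{\epsilon}'}{-2p\cdot k'} \right] u(p) . \tag{33.17}
\]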

The last expression can be further simplified. Indeed, taking into account that $p = (m, \vec{0}\,)$, one
has $p\cdot\epsilon = 0$, $p\cdot\epsilon' = 0$, and this in turn means that

𝑝/ 𝜖/ = −𝜖/ 𝑝/ , 𝑝/ 𝜖/′ = −𝜖/′ 𝑝/ . (33.18)

Then it becomes clear that $(\slashed{p} + m)\slashed{\epsilon}\,u(p) = \slashed{\epsilon}(-\slashed{p} + m)u(p) = 0$ due to the Dirac equation for
$u(p)$, and in the same way one gets $(\slashed{p} + m)\slashed{\epsilon}'\,u(p) = 0$. Moreover, because of $k\cdot\epsilon = 0$ and
$k'\cdot\epsilon' = 0$ one has
𝑘/ 𝜖/ = −𝜖/ 𝑘/ , 𝑘/ ′𝜖/′ = −𝜖/′ 𝑘/ ′ . (33.19)
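Employing all this in (33.17), one obtains
\[
\mathcal{M} = e^2\,\bar{u}(p')\left[ \frac{\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}}{2p\cdot k}
 + \frac{\slashed{\epsilon}\slashed{\epsilon}'\slashed{k}'}{2p\cdot k'} \right] u(p) . \tag{33.20}
\]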

Next, we will calculate the matrix element squared |M | 2 for unpolarized electrons; it amounts
to summing over the spin states of the outgoing electron and averaging over the spins of the
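initial (target) electron. Using the standard trace technique, one gets first
\[
|\mathcal{M}|^2 = \frac{1}{2}\,e^4\,\mathrm{Tr}\left[ (\slashed{p}' + m)
\left( \frac{\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}}{2p\cdot k} + \frac{\slashed{\epsilon}\slashed{\epsilon}'\slashed{k}'}{2p\cdot k'} \right)
(\slashed{p} + m)
\left( \frac{\slashed{k}\slashed{\epsilon}\slashed{\epsilon}'}{2p\cdot k} + \frac{\slashed{k}'\slashed{\epsilon}'\slashed{\epsilon}}{2p\cdot k'} \right) \right] . \tag{33.21}
\]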

In this first step, we keep photon polarizations fixed, the case of unpolarized photons will be
discussed at the end of our calculation. So, in (33.21) one can see four types of traces, namely

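\[
\begin{aligned}
T_1 &= \mathrm{Tr}\left[ (\slashed{p}' + m)\,\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}\,(\slashed{p} + m)\,\slashed{k}\slashed{\epsilon}\slashed{\epsilon}' \right] , \\
T_2 &= \mathrm{Tr}\left[ (\slashed{p}' + m)\,\slashed{\epsilon}\slashed{\epsilon}'\slashed{k}'\,(\slashed{p} + m)\,\slashed{k}'\slashed{\epsilon}'\slashed{\epsilon} \right] , \\
T_3 &= \mathrm{Tr}\left[ (\slashed{p}' + m)\,\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}\,(\slashed{p} + m)\,\slashed{k}'\slashed{\epsilon}'\slashed{\epsilon} \right] , \\
T_4 &= \mathrm{Tr}\left[ (\slashed{p}' + m)\,\slashed{\epsilon}\slashed{\epsilon}'\slashed{k}'\,(\slashed{p} + m)\,\slashed{k}\slashed{\epsilon}\slashed{\epsilon}' \right] .
\end{aligned} \tag{33.22}
\]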

These expressions may look rather complicated at first sight, as there are products of up to eight
𝛾-matrices under the trace symbol. Fortunately, the situation is in fact far better than it might
seem. Let us illustrate the feasibility of the algebraic calculations on the example of the trace
𝑇1 . One gets first
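\[
T_1 = \mathrm{Tr}\left[ \slashed{p}'\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}\slashed{p}\slashed{k}\slashed{\epsilon}\slashed{\epsilon}' \right]
 + m^2\,\mathrm{Tr}\left[ \slashed{\epsilon}'\slashed{\epsilon}\slashed{k}\slashed{k}\slashed{\epsilon}\slashed{\epsilon}' \right] .
\]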

The second term is zero because of 𝑘/ 𝑘/ = 𝑘 2 = 0. In the first term, we can write

𝑘/ 𝑝/ 𝑘/ = (2𝑘 · 𝑝 − 𝑝/ 𝑘/ ) 𝑘/ = 2𝑘 · 𝑝 𝑘/ .

Thus,

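\[
\begin{aligned}
T_1 &= 2k\cdot p\;\mathrm{Tr}\left[ \slashed{p}'\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}\slashed{\epsilon}\slashed{\epsilon}' \right] \\
    &= -2k\cdot p\;\mathrm{Tr}\left[ \slashed{p}'\slashed{\epsilon}'\slashed{k}\slashed{\epsilon}\slashed{\epsilon}\slashed{\epsilon}' \right] \\
    &= 2k\cdot p\;\mathrm{Tr}\left[ \slashed{p}'\slashed{\epsilon}'\slashed{k}\slashed{\epsilon}' \right] .
\end{aligned} \tag{33.23}
\]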


Note that in arriving at (33.23) we have used (33.19) and 𝜖/𝜖/ = 𝜖 2 = −1. The trace of the product
of four 𝛾-matrices is worked out easily:

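\[
\begin{aligned}
\mathrm{Tr}\left[ \slashed{p}'\slashed{\epsilon}'\slashed{k}\slashed{\epsilon}' \right]
 &= 4\left[ 2(p'\cdot\epsilon')(k\cdot\epsilon') - (k\cdot p')(\epsilon'\cdot\epsilon') \right] \\
 &= 4\left[ 2\big( (k + p - k')\cdot\epsilon' \big)(k\cdot\epsilon') + k\cdot p' \right] \\
 &= 4\left[ 2(k\cdot\epsilon')^2 + k'\cdot p \right] .
\end{aligned}
\]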

Here we have used the four-momentum conservation 𝑝′ = 𝑘 + 𝑝 − 𝑘 ′, and also the relations
𝑝 · 𝜖 ′ = 0, 𝑘 ′ · 𝜖 ′ = 0, as well as 𝑘 · 𝑝′ = 𝑘 ′ · 𝑝 (this follows easily from (𝑘 − 𝑝′) 2 = (𝑘 ′ − 𝑝) 2 ).
Putting all this together, one has finally

\[
T_1 = 8k\cdot p\left[ 2(k\cdot\epsilon')^2 + k'\cdot p \right] . \tag{33.24}
\]

The remaining traces in (33.22) can be evaluated in a similar way. In fact, it is a good topic for
a homework or a boring exercise, so I will just summarize the relevant results:

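\[
\begin{aligned}
T_2 &= 8(k'\cdot p)\left[ -2(k'\cdot\epsilon)^2 + k\cdot p \right] , \\
T_3 &= 8(k\cdot p)(k'\cdot p)\left[ 2(\epsilon\cdot\epsilon')^2 - 1 \right] - 8(k\cdot\epsilon')^2(k'\cdot p) + 8(k'\cdot\epsilon)^2(k\cdot p) , \\
T_4 &= T_3 .
\end{aligned} \tag{33.25}
\]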

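Now, the trace in (33.21) is
\[
\mathrm{Tr}\left[ \ldots \right] = \frac{1}{(2p\cdot k)^2}\,T_1 + \frac{1}{(2p\cdot k')^2}\,T_2 + \frac{2}{(2p\cdot k)(2p\cdot k')}\,T_3 . \tag{33.26}
\]
Using the results (33.24) and (33.25), the expression (33.26) becomes, after some elementary
manipulations,
\[
\mathrm{Tr}\left[ \ldots \right] = 2\left[ \frac{k'\cdot p}{k\cdot p} + \frac{k\cdot p}{k'\cdot p} + 4(\epsilon\cdot\epsilon')^2 - 2 \right] ,
\]
and $|\mathcal{M}|^2$ given by (33.21) is then
\[
|\mathcal{M}|^2 = e^4\left[ \frac{k'\cdot p}{k\cdot p} + \frac{k\cdot p}{k'\cdot p} + 4(\epsilon\cdot\epsilon')^2 - 2 \right] . \tag{33.27}
\]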
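
The "elementary manipulations" that combine the traces into the compact bracket can be verified with exact rational arithmetic. In the sketch below the scalar products are treated as independent rational numbers; the shorthand names `kp` $= k\cdot p$, `kpp` $= k'\cdot p$, `u` $= (k\cdot\epsilon')^2$, `v` $= (k'\cdot\epsilon)^2$ and `c` $= (\epsilon\cdot\epsilon')^2$ are illustrative and do not appear in the text.

```python
from fractions import Fraction
import random

def trace_from_T(kp, kpp, u, v, c):
    """Right-hand side of (33.26) with T1, T2, T3 taken from (33.24), (33.25)."""
    T1 = 8 * kp * (2 * u + kpp)
    T2 = 8 * kpp * (-2 * v + kp)
    T3 = 8 * kp * kpp * (2 * c - 1) - 8 * u * kpp + 8 * v * kp
    return T1 / (2 * kp) ** 2 + T2 / (2 * kpp) ** 2 + 2 * T3 / ((2 * kp) * (2 * kpp))

def trace_closed_form(kp, kpp, c):
    """Compact form of the trace: 2 [k'.p/k.p + k.p/k'.p + 4 (eps.eps')^2 - 2]."""
    return 2 * (kpp / kp + kp / kpp + 4 * c - 2)

random.seed(0)  # fixed seed, so that the check is reproducible

def random_fraction():
    return Fraction(random.randint(1, 50), random.randint(1, 50))

samples = [tuple(random_fraction() for _ in range(5)) for _ in range(20)]
all_equal = all(trace_from_T(kp, kpp, u, v, c) == trace_closed_form(kp, kpp, c)
                for kp, kpp, u, v, c in samples)
```

The terms containing `u` and `v` cancel between $T_1$, $T_2$ and $T_3$, which is why only $(\epsilon\cdot\epsilon')^2$ survives in the final bracket.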
Thus, the calculation has been somewhat tedious, but the result is quite simple and rewarding
(note that the first two terms in the square bracket in (33.27) are in fact 𝜔′/𝜔 and 𝜔/𝜔′,
respectively, if we return to the notation used in (33.13)). Using now the formula (33.14) for the
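cross section, we get finally
\[
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4m^2}\left( \frac{\omega'}{\omega} \right)^2
\left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\epsilon\cdot\epsilon')^2 - 2 \right] , \tag{33.28}
\]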

where we have introduced the fine-structure constant $\alpha = e^2/(4\pi)$. The result (33.28) is
traditionally called the Klein–Nishina formula, in honour of Oskar Klein and Yoshio Nishina,
who derived the formula for Compton scattering of unpolarized photons in 1929 (it was done
independently also by Igor Tamm). The formula (33.28) for polarized photons was obtained first
by Ugo Fano in 1949.
Let us now proceed to the case of unpolarized photons. It means that we have to sum
the expression (33.28) over polarizations of the initial and final photon and multiply by 1/2 (this
corresponds to averaging over the polarizations of the incident photon). For this purpose we
may write the scalar product of polarizations on the right-hand side of (33.28) explicitly as

\[
\epsilon\cdot\epsilon' = \epsilon(k,\lambda)\cdot\epsilon(k',\lambda') = -\vec{\epsilon}(k,\lambda)\cdot\vec{\epsilon}(k',\lambda') . \tag{33.29}
\]

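Then
\[
\begin{aligned}
\sum_{\lambda,\lambda'=1}^{2}\left[ \vec{\epsilon}(k,\lambda)\cdot\vec{\epsilon}(k',\lambda') \right]^2
 &= \sum_{\lambda,\lambda'=1}^{2} \epsilon_i(k,\lambda)\,\epsilon_i(k',\lambda')\,\epsilon_j(k,\lambda)\,\epsilon_j(k',\lambda') \\
 &= \sum_{\lambda=1}^{2} \epsilon_i(k,\lambda)\,\epsilon_j(k,\lambda)\;\sum_{\lambda'=1}^{2} \epsilon_i(k',\lambda')\,\epsilon_j(k',\lambda') \\
 &= \left( \delta_{ij} - \frac{k_i k_j}{|\vec{k}|^2} \right)\left( \delta_{ij} - \frac{k'_i k'_j}{|\vec{k}'|^2} \right) .
\end{aligned} \tag{33.30}
\]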
Note that in the last line we have utilized the formula (31.24) for the photon polarization sum.
Now, taking into account that 𝛿𝑖 𝑗 𝛿𝑖 𝑗 = 3 and 𝑘 𝑖 𝑘 𝑖′ = 𝑘® · 𝑘®′ = | 𝑘® || 𝑘®′ | cos 𝜗, the expression (33.30)
becomes
3 − 1 − 1 + cos2 𝜗 = 1 + cos2 𝜗 . (33.31)
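
The value $1 + \cos^2\vartheta$ of the polarization sum can be confirmed with explicit transverse vectors. The sketch below assumes a particular choice of axes (photon momentum $\vec{k}$ along $z$, $\vec{k}'$ in the $x$-$z$ plane at the angle $\vartheta$) and a particular real basis of linear polarizations; the function names are illustrative.

```python
import math

def dot(a, b):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def polarization_sum(theta):
    """Sum over lambda, lambda' of (eps(k, lambda) . eps(k', lambda'))^2."""
    # k is taken along z, so its transverse polarizations lie along x and y
    eps = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    # k' = (sin(theta), 0, cos(theta)); both vectors below are orthogonal to it
    eps_prime = [(math.cos(theta), 0.0, -math.sin(theta)), (0.0, 1.0, 0.0)]
    return sum(dot(e, ep) ** 2 for e in eps for ep in eps_prime)

angles = [0.0, 0.3, 1.0, math.pi / 2, 2.5, math.pi]
deviations = [abs(polarization_sum(t) - (1.0 + math.cos(t) ** 2)) for t in angles]
```

Only two of the four scalar products are non-zero in this basis, namely $\cos\vartheta$ and 1, which makes the result transparent.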
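For the unpolarized cross section one then gets
\[
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2m^2}\left( \frac{\omega'}{\omega} \right)^2
\left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\vartheta \right] . \tag{33.32}
\]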
For passing from (33.28) to (33.32) don’t forget that upon summing over photon polarizations,
the polarization-independent terms in (33.28) are simply multiplied by four and for the term
involving (𝜖 · 𝜖 ′) 2 we employ the results (33.30), (33.31). Taking into account the relation
(33.13), it is clear that the angular distribution (33.32) is rather complicated and its form also
depends strongly on the photon energy. An instructive picture can be found in the literature,
see e.g. the book [15]. One prominent feature of the angular dependence in (33.32) is that at
high energy, there is a pronounced forward peak. Let us stress that such a picture is valid in
the laboratory frame; in the c.m. system the angular distribution is quite different. Anyway,
an interesting point is what becomes of the expression (33.32) in the low-energy limit, i.e. for
𝜔 ≪ 𝑚. In such a case one may expect that the classical result could be reproduced, since for
𝜔 ≪ 𝑚 one gets from (33.13) that 𝜔′ ≈ 𝜔. Note that in classical electrodynamics 𝜔′ = 𝜔,
i.e. the frequency of the scattered radiation is the same as the initial one: it is an inevitable
consequence of the scattering mechanism in classical theory, which, of course, is quite different
from the quantum case.
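So, setting simply $\omega' = \omega$ in the formula (33.32), one gets
\[
\left.\frac{d\sigma}{d\Omega}\right|_{\omega\ll m} = \frac{\alpha^2}{2m^2}\left( 1 + \cos^2\vartheta \right) , \tag{33.33}
\]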

and this is the famous classical formula for Thomson scattering (named in honour of J. J. Thomson,
the discoverer of the electron and, by the way, the Nobel Prize winner for 1906). The expression
(33.33) can be easily integrated over the scattering angle and one gets

\[
\sigma\big|_{\omega\ll m} \simeq \frac{8\pi}{3}\,\frac{\alpha^2}{m^2} . \tag{33.34}
\]
For the reader’s convenience, let us recast the result (33.34) in ordinary units, in which $\alpha = e^2/\hbar c$
and $1/m$ becomes $\hbar/mc$. Then $\alpha/m$ turns into $e^2/mc^2$, which is the classical electron radius $r_0$.
Putting in numbers, in particular $e = 4.8 \times 10^{-10}$ esu, one has $r_0 \simeq 2.8 \times 10^{-13}$ cm and thus
\[
\sigma\big|_{\omega\ll m} \approx \frac{8\pi}{3}\,r_0^2 \approx 6.65 \times 10^{-25}\ {\rm cm}^2 . \tag{33.35}
\]
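
The integration over angles and the low-energy limit can be reproduced numerically. The sketch below integrates the unpolarized distribution (33.32) with a simple midpoint rule, working in units of $\alpha^2/m^2$; the function names, the number of integration steps and the rounded reference value `R0_CM` of the classical electron radius are assumptions made for this illustration.

```python
import math

def kn_unpolarized(omega_over_m, theta):
    """Angular distribution (33.32) in units of alpha^2 / m^2."""
    ratio = 1.0 / (1.0 + omega_over_m * (1.0 - math.cos(theta)))  # omega'/omega
    return 0.5 * ratio ** 2 * (ratio + 1.0 / ratio - math.sin(theta) ** 2)

def total_cross_section(omega_over_m, steps=20000):
    """Midpoint-rule integral over the solid angle, dOmega = 2 pi sin(theta) dtheta."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += kn_unpolarized(omega_over_m, theta) * 2.0 * math.pi * math.sin(theta) * h
    return total

thomson_units = total_cross_section(0.0)   # close to 8 pi / 3
at_electron_mass = total_cross_section(1.0)  # smaller than the Thomson value

R0_CM = 2.8179e-13                         # classical electron radius in cm (rounded)
thomson_cm2 = 8.0 * math.pi / 3.0 * R0_CM ** 2
```

With the rounded radius quoted above, `thomson_cm2` comes out close to $6.65 \times 10^{-25}\ {\rm cm}^2$, and the total cross section decreases as the photon energy grows.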

Chapter 34

𝑺-matrix and Wick’s theorems:
an overview

In our previous calculations of 𝑆-matrix elements for various decay and scattering processes we
have seen that any such matrix element can be recast as the vacuum expectation value (v. e. v.)
of a product of annihilation and creation operators and the non-zero contributions to such a
v. e. v. correspond to complete “pairing” of annihilation and creation operators that match each
other. The procedure of pairing is based on the observation that a non-zero contribution to
the considered 𝑆-matrix element originates in a commutator (or anticommutator) of a pair of
operators of the same sort. In doing this, one in fact uses, though only implicitly, a particular
consequence of the famous Wick’s theorems, which are the main topic of this chapter.20
So, now it is the right time to discuss products of creation and annihilation operators
in a systematic way. To begin with, we are going to formulate some basic definitions. First,
we will consider bosonic operators in the Fock space; more precisely, the operators in question
are in general linear combinations of annihilation and creation operators (satisfying standard
commutation relations) and may depend on spacetime coordinates (a typical example: operator
of a quantized bosonic field, e.g. scalar or another one).
Let $A$ and $B$ be such operators. We define the contraction (pairing) of $A$ and $B$, denoted
as $\underbracket{AB}$, according to
\[
AB = {:}AB{:} + \underbracket{AB} , \tag{34.1}
\]
where ${:}AB{:}$ is the normal product, in which creation operators stand on the left of the
annihilation operators. Since we know that the basic commutators are “c-numbers” (i.e. multiples
of the unit operator: Kronecker deltas or delta functions), it is clear that the contraction $\underbracket{AB}$ is a
c-number. Then, it is also obvious that $\underbracket{AB}$ is the v. e. v. of $AB$,
\[
\underbracket{AB} = \langle 0|AB|0\rangle . \tag{34.2}
\]

An important point to be mentioned here is that within the normal product the operators commute.
Indeed, any operator we are dealing with can be written as a sum of its creation and annihilation
parts, i.e. 𝐴 = 𝐴− + 𝐴+ , 𝐵 = 𝐵− + 𝐵+ . Then,

\[
{:}AB{:} = {:}(A^- + A^+)(B^- + B^+){:} = A^-B^- + B^+A^- + A^+B^- + A^+B^+ ,
\]
while
\[
{:}BA{:} = {:}(B^- + B^+)(A^- + A^+){:} = B^-A^- + A^+B^- + B^+A^- + B^+A^+ .
\]

20 Gian Carlo Wick (1909–1992) was an Italian theoretical physicist. He was an assistant of Enrico Fermi in
Rome, later worked in the United States, in particular at Columbia University, N. Y.

However, 𝐴− 𝐵− = 𝐵− 𝐴− and 𝐴+ 𝐵+ = 𝐵+ 𝐴+ (commutators between annihilation operators are
trivial and the same is true for the creation operators). Thus, it is clear that one has

: 𝐴𝐵 : = : 𝐵𝐴 : . (34.3)

In a similar way, one may define chronological contraction (pairing) through

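\[
T\big(A(x)B(y)\big) = {:}A(x)B(y){:} + \overbracket{A(x)B(y)} . \tag{34.4}
\]

It is not difficult to realize that $\overbracket{A(x)B(y)}$ is also a c-number.21 Indeed, one has
\[
\begin{aligned}
T\big(A(x)B(y)\big) &= \vartheta(x_0 - y_0)\,A(x)B(y) + \vartheta(y_0 - x_0)\,B(y)A(x) \\
 &= \vartheta(x_0 - y_0)\left[ {:}A(x)B(y){:} + \underbracket{A(x)B(y)} \right]
  + \vartheta(y_0 - x_0)\left[ {:}B(y)A(x){:} + \underbracket{B(y)A(x)} \right] .
\end{aligned}
\]
Then, using (34.3) and taking into account that
\[
\vartheta(x_0 - y_0) + \vartheta(y_0 - x_0) = 1 ,
\]
one gets
\[
T\big(A(x)B(y)\big) = {:}A(x)B(y){:} + \vartheta(x_0 - y_0)\,\underbracket{A(x)B(y)} + \vartheta(y_0 - x_0)\,\underbracket{B(y)A(x)} .
\]
So,
\[
\overbracket{A(x)B(y)} = \vartheta(x_0 - y_0)\,\underbracket{A(x)B(y)} + \vartheta(y_0 - x_0)\,\underbracket{B(y)A(x)} , \tag{34.5}
\]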

and that’s it.


Remark on the notation: We have distinguished the symbols for “ordinary” and “chronological”
contractions as $\underbracket{\phantom{AB}}$ and $\overbracket{\phantom{AB}}$, respectively. In fact, there is no firmly established
notation used in current literature, so the reader can take the present convention as provisional
and rather ad hoc.
Now it is also clear that

\[
\overbracket{A(x)B(y)} = \langle 0|T\big(A(x)B(y)\big)|0\rangle . \tag{34.6}
\]




Thus, we see that (34.4) in fact coincides with our earlier definition of chronological pairing that
led us to the concept of the propagator of a quantized field.
So far we have considered boson fields. For fermion (Dirac) fields one may formulate
analogous definitions, except that commutators are replaced by anticommutators. This in turn
means that in the definition of normal product one has to incorporate a sign change for trans-
position of creation and annihilation operators and the same rule concerns the definition of the
T-product. Thus, in particular, fermion operators anticommute within the normal product.
21 The symbol for the chronological contraction might be called the “cramp” or “clamp”, alluding to the form
of a simple carpenter’s tool (the reader fluent in Czech might appreciate the colloquial expression “kramle”). This
linguistic extempore could be understood as the author’s self-made contribution to QFT terminology.

Before proceeding to the announced Wick’s theorems, let us work out, for the sake of
motivation, an instructive example. We will use bosonic annihilation and creation operators $a_k$,
$a^\dagger_k$, with $k$ being the standard label for the momentum; for brevity, we will denote $a_1 \equiv a_{k_1}$,
$a_2 \equiv a_{k_2}$, etc. Now, consider the product $a_1 a_2 a^\dagger_3 a^\dagger_4$ and rearrange the operator factors so as to
have creation operators on the left and annihilation operators on the right (i.e. to accomplish
“normal ordering”). We get, step by step:

𝑎 1 𝑎 2 𝑎 †3 𝑎 †4 = 𝑎 1 (𝑎 †3 𝑎 2 + 𝛿23 )𝑎 †4
= 𝑎 1 𝑎 †3 (𝑎 †4 𝑎 2 + 𝛿24 ) + (𝑎 †4 𝑎 1 + 𝛿14 )𝛿23 = . . . (34.7)
= 𝑎 †3 𝑎 †4 𝑎 1 𝑎 2 + 𝛿14 𝑎 †3 𝑎 2 + 𝛿13 𝑎 †4 𝑎 2 + 𝛿24 𝑎 †3 𝑎 1 + 𝛿23 𝑎 †4 𝑎 1 + 𝛿13 𝛿24 + 𝛿23 𝛿14 .
From (34.7) we see that the original operator product can be rearranged as a sum of normal
products accompanied by the appropriate contractions, including the term without contraction
and completely contracted terms. The result (34.7) is leading us to the following general
definition of a normal product with contractions. Symbolically,

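\[
{:}\,A^{(1)} B^{(2)} \ldots R \ldots X^{(2)}\, Y\, Z^{(1)}\,{:} \;=\; \pm\,\underbracket{AZ}\;\underbracket{BX}\;{:}\ldots R \ldots Y{:} , \tag{34.8}
\]

where equal superscripts in parentheses on the left-hand side mark the contracted pairs (here $A$ with $Z$
and $B$ with $X$), and the sign ± is included depending on whether the permutation of fermionic factors
corresponding to the passage from the left-hand side to the right-hand side is even or odd. The definition
(34.8) is valid also for a “normal product with chronological contractions”.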
Now we are in a position to formulate the basic Wick’s theorems (they are valid for the
sort of operators specified at the beginning of this chapter).
T1 (Wick’s theorem for ordinary products): A product of operators is equal to the sum
of their normal products with all possible contractions, including the normal product without
contractions.
For T-products, the statement is almost identical:
T2 (Wick’s theorem for T-products): T-product of operators is equal to the sum of their
normal products with all possible chronological contractions, including the normal product
without contractions.
Hopefully, the validity of the theorem T1 is plausible, on the basis of our explicit motivation
example. In general, the theorems are proved by induction (which is not a very inspiring
procedure, as we know).
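
The content of the theorem T1 for bosonic ladder operators can be made explicit by a small rewriting routine. The sketch below repeatedly applies $a_i a^\dagger_j = a^\dagger_j a_i + \delta_{ij}$ until every term is normal ordered. The data layout (operators as tuples `('a', i)` and `('ad', i)`, terms as pairs of a set of delta labels and a remaining normal-ordered word) is an assumption made for this illustration.

```python
from collections import Counter

def canonical(word):
    """Inside a normal product bosonic operators commute: creation first, then by label."""
    return tuple(sorted(word, key=lambda op: (op[0] != 'ad', op[1])))

def normal_order(ops):
    """Expand a product of bosonic ladder operators into normal-ordered terms.

    Returns a Counter mapping (frozenset of delta pairs, normal-ordered word)
    to the integer coefficient of that term.
    """
    result = Counter()
    stack = [(frozenset(), tuple(ops))]
    while stack:
        deltas, word = stack.pop()
        for pos in range(len(word) - 1):
            left, right = word[pos], word[pos + 1]
            if left[0] == 'a' and right[0] == 'ad':
                # a_i ad_j = ad_j a_i + delta_ij: one swapped term, one contracted term
                swapped = word[:pos] + (right, left) + word[pos + 2:]
                contracted = word[:pos] + word[pos + 2:]
                stack.append((deltas, swapped))
                stack.append((deltas | {(left[1], right[1])}, contracted))
                break
        else:
            result[(deltas, canonical(word))] += 1
    return result

# the motivating example: a1 a2 ad3 ad4
terms = normal_order((('a', 1), ('a', 2), ('ad', 3), ('ad', 4)))
```

For the product $a_1 a_2 a^\dagger_3 a^\dagger_4$ the routine yields seven terms, each with coefficient one: the uncontracted normal product, four singly contracted terms and two fully contracted ones, in agreement with (34.7).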
Let us also remark that in our previous straightforward calculations of the 𝑆-matrix
elements we have obviously employed particular examples of the theorem T1, namely its appli-
cation to the vacuum expectation value of operator products in question (this is tantamount to
considering just the fully contracted terms).
The great value of Wick’s theorems consists in the systematic reordering of operator
products in terms of normal products, since the matrix elements of the latter have some important
specific properties. Let us illustrate it on a simple example. Consider the operator

𝑂 = 𝑎 †𝑘 1 𝑎 †𝑘 2 𝑎 𝑘 3 𝑎 𝑘 4 . (34.9)

It turns out that the only non-trivial matrix elements of (34.9) are of the type ⟨𝑘 1′ 𝑘 2′ |𝑂|𝑘 3′ 𝑘 4′ ⟩, i.e.
(34.9) can only have non-zero matrix elements between two-particle states. In more detail: for
instance, there is no matrix element of the operator (34.9) between one-particle states. Indeed,

⟨𝑝|𝑎 †𝑘 1 𝑎 †𝑘 2 𝑎 𝑘 3 𝑎 𝑘 4 |𝑘⟩ = ⟨0|𝑎 𝑝 𝑎 †𝑘 1 𝑎 †𝑘 2 𝑎 𝑘 3 𝑎 𝑘 4 𝑎 †𝑘 |0⟩


= ⟨0|(𝛿 𝑘 1 𝑝 + 𝑎 †𝑘 1 𝑎 𝑝 )𝑎 †𝑘 2 𝑎 𝑘 3 (𝛿 𝑘 4 𝑘 + 𝑎 †𝑘 𝑎 𝑘 4 )|0⟩
= 𝛿 𝑘 1 𝑝 𝛿 𝑘 4 𝑘 ⟨0|𝑎 †𝑘 2 𝑎 𝑘 3 |0⟩
= 0.

On the other hand, if one takes

⟨𝑝𝑞|𝑂|𝑘𝑙⟩ = ⟨0|𝑎 𝑝 𝑎 𝑞 𝑎 †𝑘 1 𝑎 †𝑘 2 𝑎 𝑘 3 𝑎 𝑘 4 𝑎 †𝑘 𝑎 †𝑙 |0⟩ , (34.10)

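then there is a non-zero contribution: according to the Wick’s theorem T1, the expression (34.10)
contains completely contracted terms in which each of the operators $a_p$, $a_q$ is contracted with one
of $a^\dagger_{k_1}$, $a^\dagger_{k_2}$, and each of $a_{k_3}$, $a_{k_4}$ with one of $a^\dagger_k$, $a^\dagger_l$.

So, what is the substantial difference in comparison with the preceding case? For one-particle
states, any fully contracted term of
\[
\langle 0|a_p\, a^\dagger_{k_1} a^\dagger_{k_2} a_{k_3} a_{k_4}\, a^\dagger_k|0\rangle
\]
necessarily involves a pair with a creation operator standing on the left, like $a^\dagger_{k_2}$ contracted with $a_{k_3}$,
but such a contraction is zero, since $\langle 0|a^\dagger_{k_2} a_{k_3}|0\rangle = 0$. In other words, we cannot avoid here trivial contractions!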
On the other hand, for an operator of the sort (34.9) that is not normally ordered, one
does obtain non-zero matrix elements even for one-particle states. Indeed, as an example, let us
consider the operator
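$$\tilde{O} = a_{k_1} a^\dagger_{k_2} a_{k_3} a^\dagger_{k_4} . \tag{34.11}$$

For its matrix element between one-particle states,

$$\langle 0|a_p a_{k_1} a^\dagger_{k_2} a_{k_3} a^\dagger_{k_4} a^\dagger_k|0\rangle ,$$

one gets e.g. the contribution in which $a_{k_1}$ is contracted with $a^\dagger_{k_2}$, $a_{k_3}$ with $a^\dagger_{k_4}$ and $a_p$ with $a^\dagger_k$,
and also some other possible fully contracted terms.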
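
The statements above can be checked by brute force in a toy Fock space. The following is only a sketch under stated assumptions: a bosonic field with three discrete modes (the name `N_MODES` and all helper functions are hypothetical, introduced here for illustration), with states stored as dictionaries of occupation numbers.

```python
from itertools import product
from math import sqrt

N_MODES = 3  # assumption: a small set of discrete momentum modes


def annihilate(state, k):
    """Apply a_k to a state stored as {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if occ[k] == 0:
            continue  # a_k annihilates components with no quantum in mode k
        new = list(occ)
        new[k] -= 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + amp * sqrt(occ[k])
    return out


def create(state, k):
    """Apply a_k^dagger to a state stored as {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        new = list(occ)
        new[k] += 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + amp * sqrt(occ[k] + 1)
    return out


def apply_ops(ops, state):
    """ops is written left to right as in the text; the rightmost acts first."""
    for kind, k in reversed(ops):
        state = create(state, k) if kind == "adag" else annihilate(state, k)
    return state


def inner(bra, ket):
    """Scalar product of two states with real amplitudes."""
    return sum(amp * ket.get(occ, 0.0) for occ, amp in bra.items())


VACUUM = {(0,) * N_MODES: 1.0}


def one_particle(k):
    return create(VACUUM, k)


def normal_ordered(k1, k2, k3, k4):
    # O of (34.9): two creation operators to the left of two annihilation operators
    return [("adag", k1), ("adag", k2), ("a", k3), ("a", k4)]


def not_normal_ordered(k1, k2, k3, k4):
    # O-tilde of (34.11)
    return [("a", k1), ("adag", k2), ("a", k3), ("adag", k4)]


modes = range(N_MODES)

# Largest one-particle matrix element of the normal-ordered operator.
max_normal = max(
    abs(inner(one_particle(p),
              apply_ops(normal_ordered(k1, k2, k3, k4), one_particle(k))))
    for p, k, k1, k2, k3, k4 in product(modes, repeat=6)
)

# One-particle matrix element of the operator that is not normal ordered.
sample_tilde = inner(one_particle(1),
                     apply_ops(not_normal_ordered(0, 0, 0, 0), one_particle(1)))

# Two-particle matrix element of the normal-ordered operator.
two_state = create(one_particle(0), 1)
two_particle = inner(two_state, apply_ops(normal_ordered(0, 1, 0, 1), two_state))

print(max_normal, sample_tilde, two_particle)
```

In this toy setting the normal-ordered operator has no one-particle matrix element at all, while the operator (34.11) and the two-particle element of (34.9) do not vanish.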


Thus, it is clear that Wick’s theorems provide an efficient tool for a systematic investigation
of matrix elements of operator products in the Fock space and thereby may serve as a basis for
the derivation and construction of Feynman diagrams.

Chapter 35

𝑺-matrix and Wick’s theorems: some applications

Before considering applications for the 𝑆-matrix calculations, we should clarify some remaining
problems. We know that in the Dyson expansion of the 𝑆-matrix, one encounters T-products
of Lagrangian densities, i.e. whole “blocks” of field operators like $\bar\psi\psi\varphi$, or $\bar\psi\gamma^\mu\psi A_\mu$, etc. The
fields constituting such monomials are taken at the same spacetime point, so their time ordering
is in a sense arbitrary. Let us now concentrate on quantum electrodynamics (QED), which is the
most important QFT model discussed throughout these lecture notes. It turns out that it is a good
idea to improve the definition of the QED interaction Lagrangian by replacing the current
$\bar\psi\gamma^\mu\psi$ by its normal-ordered form, i.e.
$$\bar\psi(x)\gamma^\mu\psi(x) \longrightarrow {:}\,\bar\psi(x)\gamma^\mu\psi(x)\,{:} . \tag{35.1}$$
The redefined interaction Lagrangian thus becomes
$$\mathcal{L}_{\rm int} = e\,{:}\,\bar\psi(x)\gamma^\mu\psi(x)\,{:}\,A_\mu(x) , \tag{35.2}$$
which in fact is the same as
$$\mathcal{L}_{\rm int} = e\,{:}\,\bar\psi(x)\gamma^\mu\psi(x) A_\mu(x)\,{:} , \tag{35.3}$$
since the operator of the photon field $A_\mu$ commutes with $\psi$ and $\bar\psi$. We will be able to appreciate
the improvement (35.3) in the Dyson expansion somewhat later, but there are at least two assets
connected with the redefinition of the current shown in (35.2) that one can notice right now.
First, the charge $Q$ corresponding to the current ${:}\,\bar\psi\gamma^\mu\psi\,{:}$ is
$$Q = \int d^3x\,{:}\,\bar\psi(x)\gamma_0\psi(x)\,{:} = \int d^3x\,{:}\,\psi^\dagger(x)\psi(x)\,{:} , \tag{35.4}$$

and the normal product in (35.4) thus guarantees that 𝑄 annihilates the vacuum, which is certainly
desirable (in fact, we have used such a prescription in our earlier discussion of the quantization
of free Dirac field). Second, one may consider the transformation of the current under charge
conjugation 𝐶. We know that for the Dirac field this transformation means
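$$\psi \longrightarrow \psi_C = C\bar\psi^{T} , \tag{35.5}$$
where T denotes transposition and $C$ is the matrix $i\gamma_2\gamma_0$ in the standard representation of $\gamma$-matrices.
For the current $\bar\psi\gamma_\mu\psi$ made of classical fields one gets, after some simple manipulations
(and using the known properties of the matrix $C$, like $C^\dagger = C^{-1} = -C$, $C^{-1}\gamma_\mu C = -\gamma_\mu^{T}$):
$$\bar\psi_C\gamma_\mu\psi_C = \psi^{T} C\gamma_\mu C\,\bar\psi^{T} = \psi^{T}\gamma_\mu^{T}\bar\psi^{T} . \tag{35.6}$$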

For classical Dirac fields, the expression (35.6) obviously becomes $\bar\psi\gamma_\mu\psi$, since in such a case
there is no obstacle to commuting $\psi$ and $\bar\psi$. However, for quantized fields we have to take
into account the relevant anticommutation relations; these lead to a sign change when passing
from $\psi^{T}\gamma_\mu^{T}\bar\psi^{T}$ to $\bar\psi\gamma_\mu\psi$ and also produce an additional awkward c-number term. Using the
normal-ordered form (35.1) one can simply anticommute the Dirac fields, and the result is then

$${:}\,\bar\psi_C\gamma_\mu\psi_C\,{:} = -\,{:}\,\bar\psi\gamma_\mu\psi\,{:} . \tag{35.7}$$

Intuitively, this is a desirable relation, since from the physical point of view one would certainly
prefer such a change of sign for the electromagnetic current under charge conjugation.
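
The quoted properties of the matrix $C$ can be verified numerically. The sketch below assumes the standard (Dirac) representation of the $\gamma$-matrices and takes $C = i\gamma_2\gamma_0$ with lower indices; the overall sign of $C$ does not affect any of the identities tested. All helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def neg2(m):
    return [[-x for x in row] for row in m]


ZERO2 = [[0j, 0j], [0j, 0j]]
ONE2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]

# Dirac representation: gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, s_i], [-s_i, 0]]
GAMMA_UP = [block(ONE2, ZERO2, ZERO2, neg2(ONE2))] + [
    block(ZERO2, s, neg2(s), ZERO2) for s in SIGMA
]
METRIC = [1, -1, -1, -1]
GAMMA_DOWN = [scale(METRIC[m], GAMMA_UP[m]) for m in range(4)]
IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)

C = scale(1j, matmul(GAMMA_DOWN[2], GAMMA_DOWN[0]))  # assumed form of C
C_INV = scale(-1, C)                                  # claim: C^{-1} = -C

checks = {
    "C_inverse_is_minus_C": close(matmul(C, C_INV), IDENTITY),
    "C_dagger_is_minus_C": close(dagger(C), scale(-1, C)),
    "C_inv_gamma_C_is_minus_gamma_T": all(
        close(matmul(matmul(C_INV, g), C), scale(-1, transpose(g)))
        for g in GAMMA_DOWN
    ),
}
print(checks)
```

The same check passes with $C$ replaced by $-C$, which is why the index placement in the definition of $C$ is immaterial for the manipulations leading to (35.6).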
Now the main problem is which variant of Wick’s theorem is relevant for the “mixed
products” appearing in the $S$-matrix expansion, namely

$$T\big[\,{:}\,\bar\psi(x_1)\gamma^\alpha\psi(x_1) A_\alpha(x_1)\,{:}\;\ldots\;{:}\,\bar\psi(x_n)\gamma^\omega\psi(x_n) A_\omega(x_n)\,{:}\,\big]$$

(explicitly, by a “mixed product” we mean a T-product containing already some normal products
inside). The answer to this question is provided by the Wick’s theorem for mixed products,
which states:
T3: A mixed (Dyson) T-product can be decomposed into a sum of normal products with
chronological contractions, omitting contractions between operators that are already normally
ordered.
A proof of the theorem can be found e.g. in [16].
Now we may proceed to some applications, employing the formalism based on the Wick’s
theorems. We will consider the 𝑆-matrix in a given perturbative order (i.e. a definite term of
the Dyson expansion) and represent it as a Wick series of normal products with contractions,
involving field operators that constitute the relevant interaction Lagrangian. Let us start with
the second order of the 𝑆-matrix for QED (as for the coupling constant, we will set temporarily
𝑒 = 1). One has

$$S^{(2)} = \frac{i^2}{2!}\int\!\!\int d^4x\, d^4y\; T\big[\,{:}\,\bar\psi(x)\gamma^\mu\psi(x) A_\mu(x)\,{:}\;{:}\,\bar\psi(y)\gamma^\nu\psi(y) A_\nu(y)\,{:}\,\big] .$$

Using the Wick’s theorem for mixed products (T3), one can examine consecutively normal
products with no contraction, one contraction, etc. Then one may utilize our earlier finding
that in order to get a non-zero matrix element, the number of particles involved in the states in
question should coincide with the number of the corresponding operators in the normal product.
As for the term without any contraction, it is not difficult to realize that it cannot de-
scribe any real physical process (the reader is recommended to verify this negative statement).
Concerning terms with one contraction, there are essentially two possibilities, namely

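$${:}\,\bar\psi(x)\gamma^\mu\psi(x) A_\mu(x)\,\bar\psi(y)\gamma^\nu\psi(y) A_\nu(y)\,{:}\;\Big|_{A_\mu(x)\ \text{contracted with}\ A_\nu(y)}
 = \langle 0|T\big(A_\mu(x) A_\nu(y)\big)|0\rangle\;{:}\,\bar\psi(x)\gamma^\mu\psi(x)\,\bar\psi(y)\gamma^\nu\psi(y)\,{:} \tag{35.8}$$

and

$${:}\,\bar\psi(x)\gamma^\mu\psi(x) A_\mu(x)\,\bar\psi(y)\gamma^\nu\psi(y) A_\nu(y)\,{:}\;\Big|_{\psi(x)\ \text{contracted with}\ \bar\psi(y)}
 = \langle 0|T\big(\psi(x)\bar\psi(y)\big)|0\rangle\;{:}\,\bar\psi(x)\gamma^\mu A_\mu(x)\,\gamma^\nu\psi(y) A_\nu(y)\,{:} . \tag{35.9}$$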


From our previous observations it should be clear that the expression (35.8) corresponds to pro-
cesses involving four fermions (electrons and/or positrons) only, so one may anticipate Feynman
diagrams with four external fermion lines and one internal photon line (photon propagator).
Similarly, the form (35.9) describes processes involving two photons and two fermions (i.e.
either the Compton scattering, or the two-photon annihilation of the electron–positron pair).
Next, there are two terms with two contractions; let us show explicitly one of them:
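$${:}\,\bar\psi_j(x)\gamma^\mu_{jk}\psi_k(x) A_\mu(x)\,\bar\psi_r(y)\gamma^\nu_{rs}\psi_s(y) A_\nu(y)\,{:}\;\Big|_{\psi_k(x)\text{ with }\bar\psi_r(y)\text{ and }\psi_s(y)\text{ with }\bar\psi_j(x)\text{ contracted}}$$
$$= (-1)\,\langle 0|T\psi_s(y)\bar\psi_j(x)|0\rangle\,\langle 0|T\psi_k(x)\bar\psi_r(y)|0\rangle\,\gamma^\mu_{jk}\gamma^\nu_{rs}\;{:}\,A_\mu(x) A_\nu(y)\,{:} . \tag{35.10}$$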

In arriving at (35.10), we have used our previous definition of the normal product with chrono-
logical contractions and the minus sign has been included so as to take into account the signature
of the relevant permutation of fermion fields. The “coefficient function” multiplying the normal
product : 𝐴 𝜇 (𝑥) 𝐴𝜈 (𝑦) : in (35.10) is obviously equal to

$$(-1)\,\mathrm{Tr}\big[\,i\mathcal{S}_F(y-x)\,\gamma^\mu\, i\mathcal{S}_F(x-y)\,\gamma^\nu\,\big] , \tag{35.11}$$

where we have used our earlier notation for the propagator of Dirac field. From (35.10) one can
obviously get only a trivial process of photon–photon transition. Nevertheless, the structures
(35.10) and (35.11) suggest the graphical representation corresponding to a Feynman diagram
(here in the coordinate space) shown in Fig. 35.1. While the photon–photon transition is

Fig. 35.1: QED fermion bubble: the simplest correction to the photon propagator.

physically trivial, we will see soon that the closed loop in Fig. 35.1 made of internal fermion
lines plays an important role as a subdiagram in higher-order contributions to the 𝑆-matrix.
Let us also mention that apart from the configuration of contractions (35.10) there is a term
involving one contraction of Dirac fields along with the contraction of the electromagnetic
fields 𝐴 𝜇 and 𝐴𝜈 . In analogy with the picture shown in Fig. 35.1 one may guess easily that
the corresponding diagram is the one shown in Fig. 35.2. Again, Fig. 35.2 would describe a

Fig. 35.2: One-loop correction to the fermion propagator.

physically trivial process of electron–electron transition, but, similarly to Fig. 35.1, it will also
become an important subdiagram in higher-order Feynman graphs.
Finally, there is the fully contracted term that would correspond to a physically irrelevant
term of vacuum–vacuum transition. It can be visualized by means of a diagram that reminds
one of a cracked egg, see Fig. 35.3 (the reader may find some relevant comments on the role
played by those vacuum graphs e.g. in the book [10]).

Fig. 35.3: “Cracked-egg” vacuum diagram.

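Fig. 35.4: Feynman diagrams for Møller scattering in the second order of QED. [Figure omitted: incoming momenta k, p; in panel (a) the outgoing momenta are k′, p′ and the photon carries q = k − k′; in panel (b) the outgoing lines are interchanged and the photon carries Q = k − p′.]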

Let us now come back briefly to the normal product with the contraction (35.8). Its
operator content indicates that it can describe elastic scattering processes $e^-e^- \to e^-e^-$ (Møller
scattering) and $e^-e^+ \to e^-e^+$ (Bhabha scattering).²² As an illustration, let us consider the
first case. The initial and final states can be written as
$$|i\rangle = b^\dagger(k)\,b^\dagger(p)|0\rangle , \qquad |f\rangle = b^\dagger(k')\,b^\dagger(p')|0\rangle . \tag{35.12}$$
Now, in order to work out the matrix element
$$\langle f|\,{:}\,\bar\psi(x)\gamma^\mu\psi(x)\,\bar\psi(y)\gamma^\nu\psi(y)\,{:}\,|i\rangle , \tag{35.13}$$
one has to consider the possible pairings in the usual way. Then it is not difficult to find out that
a non-zero contribution can only originate from
$$\langle 0|b(p')\,b(k')\;{:}\,\bar\psi^{+}(x)\gamma^\mu\psi^{-}(x)\,\bar\psi^{+}(y)\gamma^\nu\psi^{-}(y)\,{:}\;b^\dagger(k)\,b^\dagger(p)|0\rangle , \tag{35.14}$$
where + and − denote the creation and annihilation parts, respectively. It is clear that one has
to calculate the fully contracted terms; it is easy to see that there are four possibilities: $b(k')$
and $b(p')$ can be contracted with $\bar\psi^{+}(x)$ or $\bar\psi^{+}(y)$, while $b^\dagger(k)$ and $b^\dagger(p)$ should be contracted
with $\psi^{-}(x)$ or $\psi^{-}(y)$. The procedure is straightforward and working out its details may serve
as a useful exercise for any serious reader. The net result is that the factor 1/2! from the Dyson
expansion is cancelled, and one is left with two types of contributions, which can be depicted
as Feynman diagrams in Fig. 35.4. An important point is that there is a relative minus sign for
contributions (a) and (b).
In a similar way, one can treat the process of Bhabha scattering
𝑒 − (𝑘) + 𝑒 + ( 𝑝) → 𝑒 − (𝑘 ′) + 𝑒 + ( 𝑝′) . (35.15)
In this case, one ends up with two diagrams, namely those shown in Fig. 35.5.

Fig. 35.5: Feynman diagrams for Bhabha scattering in the second order of QED. [Figure omitted: panels (a) and (b); the photon lines carry q = k − k′ and Q = k + p, and the positron lines are labelled −p and −p′.]

²² Christian Møller (1904–1980) was a Danish physicist who spent his scientific career at the University of
Copenhagen (he also studied there, with Niels Bohr). Homi Bhabha (1909–1966) was a prominent Indian physicist,
father of the Indian nuclear program. He died at Mont Blanc in an airplane accident.

The corresponding
matrix elements can be written down by using the familiar Feynman rules that we already know
from our previous experience with other processes. Again, as in the case of Møller scattering,
one has to include an additional relative minus sign between the contributions of the diagrams
(a) and (b).
With the matrix elements for these scattering processes at hand, one can also evaluate the
corresponding cross sections. Such a calculation is straightforward, though somewhat tedious.
As usual, one can employ the standard trace technique, as well as the elementary kinematic
relations. Let us display here the results for both processes in the high-energy limit, i.e. for
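$E_{\rm c.m.} = s^{1/2} \gg m$. One has, in the variables of the c.m. system,

$$\left(\frac{d\sigma}{d\Omega}\right)_{\!M} = \frac{\alpha^2}{2s}\left[\frac{1+\cos^4\frac{\vartheta}{2}}{\sin^4\frac{\vartheta}{2}} + \frac{2}{\sin^2\frac{\vartheta}{2}\cos^2\frac{\vartheta}{2}} + \frac{1+\sin^4\frac{\vartheta}{2}}{\cos^4\frac{\vartheta}{2}}\right] \tag{35.16}$$

for the Møller scattering and

$$\left(\frac{d\sigma}{d\Omega}\right)_{\!B} = \frac{\alpha^2}{2s}\left[\frac{1+\cos^4\frac{\vartheta}{2}}{\sin^4\frac{\vartheta}{2}} - \frac{2\cos^4\frac{\vartheta}{2}}{\sin^2\frac{\vartheta}{2}} + \frac{1+\cos^2\vartheta}{2}\right] \tag{35.17}$$

for the Bhabha scattering (cf. e.g. the book [1]).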


The derivation of the formulae (35.16) and (35.17) should be a challenge for any diligent
reader. In any case, it is a good topic for a tutorial.
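
As a quick sanity check of the angular dependence in (35.16) and (35.17), one may compare the square brackets with the compact forms $2(3+\cos^2\vartheta)^2/\sin^4\vartheta$ and $\tfrac12\big[(3+\cos^2\vartheta)/(1-\cos\vartheta)\big]^2$. These compact forms are not taken from the text; they are commonly quoted rewritings, used here as an assumption of the sketch, and the function names are hypothetical. The common prefactor $\alpha^2/2s$ is left out.

```python
from math import cos, pi, sin


def moller_bracket(theta):
    """Square bracket of (35.16)."""
    s2 = sin(theta / 2) ** 2
    c2 = cos(theta / 2) ** 2
    return (1 + c2 ** 2) / s2 ** 2 + 2 / (s2 * c2) + (1 + s2 ** 2) / c2 ** 2


def bhabha_bracket(theta):
    """Square bracket of (35.17)."""
    s2 = sin(theta / 2) ** 2
    c2 = cos(theta / 2) ** 2
    return (1 + c2 ** 2) / s2 ** 2 - 2 * c2 ** 2 / s2 + (1 + cos(theta) ** 2) / 2


def moller_compact(theta):
    # assumed equivalent form, checked numerically below
    return 2 * (3 + cos(theta) ** 2) ** 2 / sin(theta) ** 4


def bhabha_compact(theta):
    # assumed equivalent form, checked numerically below
    return 0.5 * ((3 + cos(theta) ** 2) / (1 - cos(theta))) ** 2


ANGLES = [0.3, 0.7, 1.2, pi / 2, 2.0, 2.8]

for theta in ANGLES:
    print(theta,
          moller_bracket(theta) / moller_compact(theta),
          bhabha_bracket(theta) / bhabha_compact(theta),
          # identical particles: Moller is symmetric under theta -> pi - theta
          moller_bracket(theta) / moller_bracket(pi - theta))
```

All printed ratios should be equal to one up to rounding, which also exhibits the forward-backward symmetry of the Møller cross section expected for identical particles.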

Chapter 36

𝑺-matrix in fourth order: QED example

In previous chapters, we have discussed several examples of processes described by 2nd order
contributions to the Dyson expansion of the 𝑆-matrix. These are graphically represented by tree
diagrams that are made of external and internal lines connecting the interaction vertices and do
not contain closed loops of internal lines (precisely this is the defining feature of the tree-level
diagrams). In the preceding chapter, we also came across two simple examples of closed-loop
diagrams, in the 𝑆-matrix contributions corresponding to processes that are in a sense physically
trivial.
In the present chapter, we are going to examine one-loop diagrams that arise as contribu-
tions to the 4th order 𝑆-matrix elements within QED, for a particular physical process, namely
electron–muon elastic scattering (chosen here as one instructive example from the variety of
possibilities provided by QED). An astute reader might wonder why we have skipped the 3rd
order; we will comment briefly on this point later on.
For our present purpose, we will work with the interaction Lagrangian
$$\mathcal{L}_{\rm int} = e\,\big(\,{:}\,\bar\psi_1\gamma_\mu\psi_1\,{:} + {:}\,\bar\psi_2\gamma_\mu\psi_2\,{:}\,\big) A^\mu , \tag{36.1}$$


where 𝜓1 and 𝜓2 correspond to electron and muon fields, respectively. Obviously, in the 2nd
order the process
𝑒 − (𝑘) + 𝜇− ( 𝑝) → 𝑒 − (𝑘 ′) + 𝜇− ( 𝑝′) (36.2)
is described by a single tree diagram, namely the one shown in Fig. 36.1.

Fig. 36.1: The lowest-order contribution to the process (36.2) represented by a tree-level diagram. [Figure omitted: external lines k, k′ and p, p′; the exchanged photon carries q = k′ − k.]

The 4th order term in the $S$-matrix Dyson expansion reads

$$S^{(4)} = \frac{i^4}{4!}\, e^4 \int d^4x_1 \ldots d^4x_4\; T\big[\,J_\mu(x_1) A^\mu(x_1) \ldots J_\sigma(x_4) A^\sigma(x_4)\,\big] , \tag{36.3}$$

where we have introduced the usual notation for the current
$$J_\alpha = {:}\,\bar\psi_1\gamma_\alpha\psi_1\,{:} + {:}\,\bar\psi_2\gamma_\alpha\psi_2\,{:} = J_\alpha^{(1)} + J_\alpha^{(2)} . \tag{36.4}$$

In order to get from the expression (36.3) a contribution to the process (36.2), one has to take
the muon part from at least one current in (36.3) (but not from all of them). There are three
possibilities, namely

(i) three $J^{(1)}$ and one $J^{(2)}$,
(ii) two $J^{(1)}$ and two $J^{(2)}$, (36.5)
(iii) one $J^{(1)}$ and three $J^{(2)}$.

Let us start with the variant (i). There are four possibilities of how to get a term of such a type
(taking $J_\mu^{(2)}$ at $x_1$, $x_2$, $x_3$, or $x_4$). Obviously, the corresponding contributions are the same. Thus,
one may write

$$S^{(4)}_{(i)} = 4\,\frac{i^4}{4!}\, e^4 \int d^4x_1 \ldots d^4x_4\; T\big[\,J_\mu^{(2)}(x_1) A^\mu(x_1)\, J_\nu^{(1)}(x_2) A^\nu(x_2)\, J_\rho^{(1)}(x_3) A^\rho(x_3)\, J_\sigma^{(1)}(x_4) A^\sigma(x_4)\,\big] . \tag{36.6}$$
Now, if one wants to describe the four-fermion process like (36.2), photon fields must be
contracted completely. Furthermore, two pairs of Dirac fields must be contracted as well, since
only two electron and two muon fields should survive as constituents of the relevant normal
product (needless to say, we are relying here on the Wick’s theorem T3 for the mixed operator
products). In this way, the basic strategy for constructing the individual Feynman diagrams is
given. As a first step, let us consider the configuration of contractions that would lead to a closed
purely fermion loop (this amounts to taking Dirac field contractions at a single pair of spacetime
points). So, the anticipated structure of the emerging Feynman graph we have in mind is e.g.
the one shown in Fig. 36.2.

Fig. 36.2: The contribution of the closed fermion loop to the process (36.2) in QED at the fourth order. [Figure omitted: vertices labelled x1, x2, x3, x4.]

For convenience, let us denote the electron field simply as (lower case) 𝜓 and the muon
field as (capital) Ψ. In fact, there are 3 possibilities of how to contract the 𝐴 𝜇 (𝑥 1 ) (coupled to
𝐽 𝜇(2) (𝑥 1 )), but for a chosen variant, e.g. 𝐴 𝜇 (𝑥1 ) 𝐴𝜈 (𝑥 2 ), the second contraction 𝐴𝐴 is determined
uniquely. With those bosonic contractions fixed, there are still two possibilities of how to
contract electron fields to arrive at the closed loop as in Fig. 36.2 (making it either at 𝑥 2 , 𝑥3
or 𝑥 2 , 𝑥4 ). Taking into account all those observations, one may conclude that it is sufficient
to consider the diagram in Fig. 36.2 and cancel the factor 1/4! from Dyson expansion by our
combinatorial factor 4 × 3 × 2. Notice also that one might recover our careful combinatorics as
the 4! permutations of points $x_1, \ldots, x_4$ in Fig. 36.2.

Fig. 36.3: Other 4th-order contributions to the process (36.2). [Figure omitted: panels (a), (b), (c) with vertices labelled 1 to 4.]

Fig. 36.4: A disconnected fourth-order graph contributing to the process (36.2). [Figure omitted.]

Thus, the diagram in Fig. 36.2 represents the normal product with contractions

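$${:}\,\bar\Psi(x_1)\gamma_\mu\Psi(x_1) A^\mu(x_1)\;\bar\psi(x_2)\gamma_\nu\psi(x_2) A^\nu(x_2)\;\bar\psi(x_3)\gamma_\rho\psi(x_3) A^\rho(x_3)\;\bar\psi(x_4)\gamma_\sigma\psi(x_4) A^\sigma(x_4)\,{:} , \tag{36.7}$$
in which $A^\mu(x_1)$ is contracted with $A^\nu(x_2)$, $A^\rho(x_3)$ with $A^\sigma(x_4)$, and the electron fields at $x_2$ and $x_3$ are contracted with each other so as to close the loop.
The explicit expression corresponding to (36.7) is, according to our previous knowledge,
$$i\mathcal{D}_F^{\mu\nu}(x_1-x_2)\,(-1)\,\mathrm{Tr}\big[\,i\mathcal{S}_F(x_3-x_2)\,\gamma_\nu\, i\mathcal{S}_F(x_2-x_3)\,\gamma_\rho\,\big]\, i\mathcal{D}_F^{\rho\sigma}(x_3-x_4)\;\times\;{:}\,\bar\Psi(x_1)\gamma_\mu\Psi(x_1)\,\bar\psi(x_4)\gamma_\sigma\psi(x_4)\,{:} . \tag{36.8}$$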

Now one might proceed further by considering the relevant matrix element and using Fourier
representations of the propagators in the coefficient function in (36.8). We will do it later (in
the next chapter); now, let us examine other diagrams corresponding to the remaining possible
configurations of chronological contractions according to the Wick’s theorem. We are going
to avoid writing explicitly further long expressions like (36.7); hopefully, one such instructive
example is enough. Instead, let us show the relevant graphical representations; the rest of this
chapter should thus be understood as a eulogy to Feynman diagrams.
Employing the shorthand notation 1 ≡ 𝑥 1 , 2 ≡ 𝑥 2 , etc., the diagrams we have in mind are
guessed quite easily (see Fig. 36.3). This is essentially all, as far as the topologically connected
graphs are concerned. Apart from these, one can also obtain a disconnected diagram as shown
in Fig. 36.4, but this is irrelevant for the evaluation of the scattering amplitude in question.
Next, there are some other connected diagrams that originate in the options (ii) and (iii)
in (36.5). As for the variant (ii), one gets a new type of Feynman graph, namely the “box” (see
Fig. 36.5), where the lines in the lower part correspond to muons and the upper part represents
electrons.

Fig. 36.5: Box-type contribution to the process (36.2). [Figure omitted: panels (a) and (b) with vertices labelled 1 to 4.]

Fig. 36.6: Another disconnected contribution to the process (36.2). [Figure omitted.]

Here one could also consider the disconnected graph shown in Fig. 36.6, but this
would obviously lead to a kinematically trivial matrix element (in fact, no scattering at all).
Finally, the option (iii) in (36.5) is essentially the same as (i), with graphs from Fig. 36.4
turned “upside down” and with the electron loop in Fig. 36.2 being replaced by muon loop
(hopefully, such an observation is more or less obvious).
Let us now return to our remark at the beginning of this chapter, namely why did we skip
the third order of the Dyson expansion? Well, in the 3rd order in QED, one certainly encounters
some new tree-level diagrams, e.g. for 𝑒 − + 𝑒 + → 𝛾𝛾𝛾 (see Fig. 36.7), or 𝑒 + 𝜇 → 𝑒 + 𝜇 + 𝛾 (the
so-called bremsstrahlung, see Fig. 36.8).

Fig. 36.7: The process 𝑒 − + 𝑒 + → 𝛾𝛾𝛾 at the tree level.

Fig. 36.8: The process 𝑒 + 𝜇 → 𝑒 + 𝜇 + 𝛾 at the tree level.


Fig. 36.9: Processes involving classical external field in the 3rd order in QED.

In fact, one could also get some one-loop diagrams, if only an interaction with a classical
external (e.g. Coulomb) field is included along with the quantized photon field (remember our
earlier derivation of the famous Mott formula). Then one could consider diagrams like e.g. those
shown in Fig. 36.9.
For the main part of the present chapter, our primary option was an analysis of the 4th
order of the Dyson expansion because of the methodology, since this gives a fuller view of
the relevant one-loop diagrams (including the box in Fig. 36.5). Nevertheless, we will discuss
the graphs in Fig. 36.9 later on as well, since they enable one to elucidate some important
results in QED applications. The bremsstrahlung graphs of the type shown in Fig. 36.8 also play
an extremely important role in QED, in the analysis of the problem of the so-called infrared
divergences due to the zero mass of the photon (for more details, the reader is referred e.g. to
the books [1], [6], or [7]).
A final remark is perhaps in order here. One should keep in mind that the diagrams
like those in Figs. 36.2 and 36.3, etc. represent contributions to the 𝑺-matrix operator. So,
the external lines represent operators $\psi$, $\bar\psi$, $\Psi$, $\bar\Psi$ and internal lines are propagators in the
coordinate space. We will proceed to the usual Feynman diagrams for scattering matrix elements
(amplitudes) in the next chapter.

Chapter 37

One-loop QED diagrams in momentum space

In the preceding chapter, we have found several types of one-loop diagrams characteristic for the
QED 𝑆-matrix. As we have emphasized there, internal lines in those diagrams depict spacetime
propagators and the external lines represent some field operators (or a classical field like in
Fig. 36.9). Needless to say, the integration over the relevant set of spacetime coordinates is
tacitly assumed for the evaluation of the whole diagram contribution (i.e. over 𝑥1 , . . . , 𝑥 4 in our
examples).
Thus, as the next step, we would like to calculate the one-loop contributions to the matrix
element (scattering amplitude) for our sample process 𝑒 + 𝜇 → 𝑒 + 𝜇, in analogy with what
we have done before at the tree level. For the purpose of illustration we will perform the
calculation in detail for the diagram in Fig. 36.2 involving the purely fermion loop. Hopefully,
such a sample calculation should be sufficient for understanding the basic technicalities; in fact,
an evaluation of the other diagrams, like those in Fig. 36.3, etc. proceeds essentially along the
same lines and does not bring any new ingredients.
So, let us start with the expression (36.8). It must be integrated over all possible values
of 𝑥 1 , . . . , 𝑥 4 , and the full contribution of the diagram in Fig. 36.2 to the 𝑆-matrix thus reads

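$$S_a^{(4)} = e^4 \int d^4x_1 \ldots d^4x_4\; \mathcal{D}_F^{\alpha\mu}(x_1-x_2)\,(-1)\,\mathrm{Tr}\big[\,\gamma_\mu\,\mathcal{S}_F(x_2-x_3)\,\gamma_\nu\,\mathcal{S}_F(x_3-x_2)\,\big]\,\mathcal{D}_F^{\nu\beta}(x_3-x_4)\;\times\;{:}\,\bar\Psi(x_1)\gamma_\alpha\Psi(x_1)\,\bar\psi(x_4)\gamma_\beta\psi(x_4)\,{:} . \tag{37.1}$$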

Note that when passing from (36.8) to (37.1), we have employed the trace cyclicity and changed,
for later convenience, the labelling of the summation indices; we have also used the obvious fact
that 𝑖 4 · 𝑖 4 = 1. Now one may employ the Fourier representations of the spacetime propagators,
given by

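$$\begin{aligned}
\mathcal{D}_F^{\alpha\mu}(x_1-x_2) &= \int\frac{d^4p}{(2\pi)^4}\, D_F^{\alpha\mu}(p)\, e^{ip(x_1-x_2)} , \\
\mathcal{D}_F^{\beta\nu}(x_3-x_4) &= \int\frac{d^4q}{(2\pi)^4}\, D_F^{\beta\nu}(q)\, e^{iq(x_3-x_4)} , \\
\mathcal{S}_F(x_2-x_3) &= \int\frac{d^4k}{(2\pi)^4}\, S_F(k)\, e^{-ik(x_2-x_3)} , \\
\mathcal{S}_F(x_3-x_2) &= \int\frac{d^4l}{(2\pi)^4}\, S_F(l)\, e^{-il(x_3-x_2)}
\end{aligned} \tag{37.2}$$

(where $D_F^{\rho\sigma}(q)$ and $S_F(p)$ are the functions found in chapters 27 and 32). Further, for the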
evaluation of the matrix element of the operator in (37.1) we will also change, for convenience,
the labelling of the electron and muon four-momenta, so that the initial and final state will be

$$|i\rangle = b_1^\dagger(p_1)\, b_2^\dagger(p_2)|0\rangle , \qquad |f\rangle = b_1^\dagger(p_1')\, b_2^\dagger(p_2')|0\rangle , \tag{37.3}$$

where 1 and 2 denote the electron and muon, respectively. Now, one may take into account the
contractions of the operators in (37.3) and field operators appearing in the expression (37.1):

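$$\begin{aligned}
b_1(p_1')\,\bar\psi(x_4) &\;\to\; N_{p_1'}\,\bar u(p_1')\, e^{ip_1' x_4} , \\
\psi(x_4)\, b_1^\dagger(p_1) &\;\to\; N_{p_1}\, u(p_1)\, e^{-ip_1 x_4} , \\
b_2(p_2')\,\bar\Psi(x_1) &\;\to\; N_{p_2'}\,\bar u(p_2')\, e^{ip_2' x_1} , \\
\Psi(x_1)\, b_2^\dagger(p_2) &\;\to\; N_{p_2}\, u(p_2)\, e^{-ip_2 x_1} ,
\end{aligned} \tag{37.4}$$
where the left-hand sides stand for the contractions of the indicated pairs of operators.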

The matrix element of the operator ${:}\,\bar\Psi(x_1) \ldots \psi(x_4)\,{:}$ in (37.1) between the states (37.3) thus
becomes
$$N_{p_1} N_{p_2} N_{p_1'} N_{p_2'}\,\big[\bar u(p_1')\gamma_\beta u(p_1)\big]\big[\bar u(p_2')\gamma_\alpha u(p_2)\big]\, e^{i(p_2'-p_2)x_1}\, e^{i(p_1'-p_1)x_4} . \tag{37.5}$$

Using the representation (37.2) in (37.1), one has

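$$\begin{aligned}
S_a^{(4)} = e^4 \int d^4x_1 \ldots d^4x_4 &\int\frac{d^4p}{(2\pi)^4}\int\frac{d^4k}{(2\pi)^4}\int\frac{d^4l}{(2\pi)^4}\int\frac{d^4q}{(2\pi)^4} \\
&\times e^{ip(x_1-x_2)}\, e^{-ik(x_2-x_3)}\, e^{-il(x_3-x_2)}\, e^{iq(x_3-x_4)} \\
&\times D_F^{\alpha\mu}(p)\,(-1)\,\mathrm{Tr}\big[\,\gamma_\mu S_F(k)\,\gamma_\nu S_F(l)\,\big]\, D_F^{\nu\beta}(q) \\
&\times {:}\,\bar\Psi(x_1)\gamma_\alpha\Psi(x_1)\,\bar\psi(x_4)\gamma_\beta\psi(x_4)\,{:} .
\end{aligned} \tag{37.6}$$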


Such an expression may look intimidating, but most of the integrations are actually trivial. When
we form the matrix element $\langle f|S_a^{(4)}|i\rangle$, the exponential factors can be accumulated as
$$e^{i(p_2'-p_2)x_1}\, e^{i(p_1'-p_1)x_4}\, e^{ip(x_1-x_2)}\, e^{-ik(x_2-x_3)}\, e^{-il(x_3-x_2)}\, e^{iq(x_3-x_4)} , \tag{37.7}$$

and the integration with $d^4x_1 \ldots d^4x_4$ leads to the product of delta functions:

$$(2\pi)^4\delta^{(4)}(p_2' + p - p_2)\,(2\pi)^4\delta^{(4)}(-p - k + l)\,(2\pi)^4\delta^{(4)}(k - l + q)\,(2\pi)^4\delta^{(4)}(-q + p_1' - p_1) . \tag{37.8}$$

Now, the integration over the variable 𝑘 reduces the third 𝛿-function to 𝛿 (4) (𝑞 − 𝑝) (it means
that photon propagators in (37.6) carry the same four-momentum 𝑝 = 𝑞). The subsequent
integrations over 𝑝 and 𝑞 then lead finally to 𝛿 (4) ( 𝑝′1 + 𝑝′2 − 𝑝 1 − 𝑝 2 ), which embodies the
anticipated overall energy and momentum conservation. Note also that the second and third
𝛿-functions in (37.8) mean that 𝑙 = 𝑘 + 𝑝 and 𝑙 = 𝑘 + 𝑞, respectively (and we also know that 𝑝 = 𝑞). The
variable 𝑙 remains unconstrained and the integral over the infinite domain of 𝑙 stays with us in the
contribution of the considered matrix element. Precisely this is a substantially new ingredient
concerning closed-loop diagrams, in comparison with tree-level graphs (where all four-momenta
flowing through the diagram are determined by their values for incoming and outgoing particles
— simply because of the conservation laws acting in the interaction vertices).
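
The bookkeeping of the delta functions in (37.8) can be illustrated with a short sketch. The numerical four-momenta below are arbitrary sample values chosen only to satisfy overall conservation (they are not meant to be on-shell), and the helper names are hypothetical.

```python
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def delta_arguments(p1, p2, p1f, p2f, p, q, k, l):
    """Arguments of the four delta functions in (37.8), in the same order."""
    return [
        add(sub(p2f, p2), p),   # from x1:  p2' + p - p2
        sub(sub(l, p), k),      # from x2: -p - k + l
        add(sub(k, l), q),      # from x3:  k - l + q
        sub(sub(p1f, p1), q),   # from x4: -q + p1' - p1
    ]


def solve_internal_momenta(p1, p2, p1f, p2f, l):
    """Internal momenta fixed by the delta functions for a given loop momentum l."""
    p = sub(p2, p2f)    # first delta function
    q = sub(p1f, p1)    # fourth delta function
    k = sub(l, p)       # second delta function
    return p, q, k


# Sample external momenta (integers, so that the arithmetic is exact).
P1 = (5, 0, 0, 3)
P2 = (7, 1, 0, -3)
P1F = (4, 1, 2, 0)
P2F = sub(add(P1, P2), P1F)   # enforce p1 + p2 = p1' + p2'

ZERO = (0, 0, 0, 0)
results = []
for loop_momentum in [(1, 2, 3, 4), (-10, 0, 7, 25)]:
    p, q, k = solve_internal_momenta(P1, P2, P1F, P2F, loop_momentum)
    args = delta_arguments(P1, P2, P1F, P2F, p, q, k, loop_momentum)
    results.append((p == q, all(a == ZERO for a in args)))

print(results)
```

Both choices of the loop momentum satisfy all four constraints, which is a small-scale picture of the statement that $l$ is left undetermined by momentum conservation and must be integrated over.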

So, putting all things together, in the matrix element in question one sees the usual
factorization of the normalization factors
$$N_{p_1} N_{p_2} N_{p_1'} N_{p_2'} = \prod_{i,f} N_{i,f} \tag{37.9}$$

and the delta function for overall four-momentum conservation


(2𝜋) 4 𝛿 (4) ( 𝑝′1 + 𝑝′2 − 𝑝 1 − 𝑝 2 ) = (2𝜋) 4 𝛿 (4) (𝑃 𝑓 − 𝑃𝑖 ) . (37.10)
Using the conventional form (37.10), one may then attach a factor of 1/(2𝜋) 4 to d4 𝑙 in the
integral over the arbitrary loop momentum 𝑙.
The result of our calculation may thus be written as

$$\langle f|S_a^{(4)}|i\rangle = \prod_{i,f} N_{i,f}\; i\mathcal{M}_{fi}\,(2\pi)^4\delta^{(4)}(P_f - P_i) , \tag{37.11}$$

with
$$i\mathcal{M}_{fi} = e^4\,\big[\bar u(p_2')\gamma_\alpha u(p_2)\big]\, D_F^{\alpha\mu}(q)\,\big[-i\Pi_{\mu\nu}(q)\big]\, D_F^{\nu\beta}(q)\,\big[\bar u(p_1')\gamma_\beta u(p_1)\big] , \tag{37.12}$$
where we have denoted, for convenience,


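$$-i\Pi_{\mu\nu}(q) = (-1)\int\frac{d^4l}{(2\pi)^4}\,\mathrm{Tr}\left[\gamma_\mu\,\frac{1}{\slashed{l}-\slashed{q}-m}\,\gamma_\nu\,\frac{1}{\slashed{l}-m}\right] . \tag{37.13}$$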
The expression (37.12) can be represented by the Feynman graph shown in Fig. 37.1.

Fig. 37.1: Fourth-order contribution to the electron-muon scattering matrix element involving closed
fermion loop as a correction to the photon propagator. [Figure omitted: external lines p1, p1′ and p2, p2′; both photon lines carry q = p1′ − p1 = p2 − p2′; the loop lines carry l and l − q.]

Note that in (37.13) we have already used the explicit form of the propagator of Dirac field in momentum
space; for the photon propagators one can use e.g. the simple expression
$$D_F^{\mu\nu}(q) = \frac{-g^{\mu\nu}}{q^2} . \tag{37.14}$$
Let us also mention that when “reading” the diagram in Fig. 37.1, one maintains, apart from
other familiar rules, also the factors of 𝑖 for any vertex and any internal line (propagator).
As for the other one-loop diagrams displayed in the preceding chapter (see Figs. 36.3(a),(b),(c)),
one may proceed in much the same way. Instead of repeating the rather boring steps in the evalua-
tion of the relevant 𝑆-matrix elements, let us just show some final results for the momentum-space
Feynman diagrams.

Fig. 37.2: Fourth-order contribution to the electron-muon scattering matrix element involving the one-loop
correction to the interaction vertex. [Figure omitted: external lines p1, p1′ and p2, p2′; loop momenta l, l + p1 and l + p1′; the exchanged photon carries q = p1′ − p1 = p2 − p2′.]

From Fig. 36.3(a) one gets the Feynman graph shown in Fig. 37.2 and its contribution
(the corresponding matrix element $\mathcal{M}$) is given by
$$i\mathcal{M} = e^4\, i^2\,\big[\bar u(p_1')\, i\Gamma^\mu(p_1', p_1)\, u(p_1)\big]\,\frac{-g_{\mu\nu}}{q^2}\,\big[\bar u(p_2')\gamma^\nu u(p_2)\big] , \tag{37.15}$$
where we have denoted, for later convenience,

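$$i\Gamma^\mu(p_1', p_1) = -\int\frac{d^4l}{(2\pi)^4}\,\gamma_\alpha\,\frac{1}{\slashed{l}+\slashed{p}_1'-m}\,\gamma^\mu\,\frac{1}{\slashed{l}+\slashed{p}_1-m}\,\gamma_\beta\,\frac{-g^{\alpha\beta}}{l^2} . \tag{37.16}$$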

Here the prefactor −1 is 𝑖 6 , since we observe the “book-keeping rule” of 𝑖’s for vertices and
propagators.
Finally, concerning the diagrams (b) and (c) in Fig. 36.3, let us write down explicitly just
the expression for the one-loop subdiagram shown in Fig. 37.3.

Fig. 37.3: One-loop graph that appears as a subdiagram in various higher-order Feynman graphs. It is
commonly called the “electron self-energy graph”. [Figure omitted: external momentum p, internal fermion momentum l + p.]

Employing a conventional (quite standard) symbol $i\Sigma(p)$ for the graph in Fig. 37.3, we have (suppressing the coupling factor $e^2$)

$$i\Sigma(p) = \int\frac{d^4l}{(2\pi)^4}\,\gamma_\alpha\,\frac{1}{\slashed{l}+\slashed{p}-m}\,\gamma_\beta\,\frac{-g^{\alpha\beta}}{l^2} . \tag{37.17}$$

An important lesson to be learnt from the preceding discussion is that while the original
mathematical expressions for the higher-order 𝑆-matrix contributions are rather long and perhaps
somewhat awkward, the resulting picture in terms of Feynman diagrams is quite elegant and
easy to recover almost by heart. There is a set of relatively simple rules that work both at the
tree level and for closed-loop graphs; a substantially new extra rule for loop diagrams is that one

has to integrate over the arbitrary values of a four-momentum circulating around the loop. Thus,
there is a good reason to consider the main part of the forthcoming chapters as a “canticle for
Feynman diagrams”.
In this chapter we have already had a lot of good news, but there is bad news as well;
that’s life! The bad news about the nice expressions (37.13), (37.16) and (37.17) is that the
integrals diverge.²³ Indeed, e.g. in (37.13) the integrand behaves like $l^{-2}$ at infinity and the
integration volume element $d^4l$ amounts to $l^3\,dl$ (if one employs e.g. hyperspherical coordinates
in the four-dimensional space). Thus, for $l \to \infty$, the integral in question looks like $\int l\,dl$ in
the neighbourhood of infinity; if one introduces a “cut-off” $\Lambda$ (a provisional upper limit in this
integral), one gets
$$\int^{\Lambda} l\,dl \simeq \Lambda^2 , \tag{37.18}$$

for Λ → ∞. Thus, one may say that the integral in the formula (37.13) is quadratically divergent
in the asymptotic region 𝑙 → ∞. In QFT, a divergence of such a type is generally called
ultraviolet (UV) divergence (since it is due to the large values of loop energies and momenta).
In a similar way, one may conclude that the integral in (37.16) is logarithmically divergent and
the expression (37.17) exhibits a linear divergence.
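
The power counting used above can be put into a few lines of code. The sketch assumes the usual large-momentum behaviour of the propagators (one power of $1/l$ per fermion line, two per photon line, four powers of $l$ per loop integration); the function and dictionary names are hypothetical.

```python
from math import log


def superficial_degree(loops, fermion_lines, photon_lines):
    """Naive degree of UV divergence: 4 per loop, -1 per fermion, -2 per photon line."""
    return 4 * loops - fermion_lines - 2 * photon_lines


def radial_cutoff_integral(degree, cutoff, lower=1.0):
    """Integral of l**(degree - 1) dl from `lower` to `cutoff`."""
    if degree == 0:
        return log(cutoff / lower)
    return (cutoff ** degree - lower ** degree) / degree


# (loops, internal fermion lines, internal photon lines)
ONE_LOOP = {
    "vacuum polarization (37.13)": (1, 2, 0),
    "vertex correction (37.16)": (1, 2, 1),
    "electron self-energy (37.17)": (1, 1, 1),
}

degrees = {name: superficial_degree(*counts) for name, counts in ONE_LOOP.items()}

for name, degree in degrees.items():
    # Growth of the cut-off integral when the cut-off is doubled.
    growth = radial_cutoff_integral(degree, 2.0e6) / radial_cutoff_integral(degree, 1.0e6)
    print(name, degree, round(growth, 3))
```

Doubling the cut-off multiplies the integral by about four, two and only slightly more than one, respectively, in line with the quadratic, linear and logarithmic divergences quoted in the text.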
On the other hand, Feynman diagrams like the one in Fig. 37.1, etc., being of the order
$O(e^4)$ (i.e. $O(\alpha^2)$), should represent higher-order corrections to the contribution of the
lowest-order (tree-level) graph(s) (that are of the order $O(e^2)$). Thus, one would like to tame the
annoying UV divergences and give some reasonable meaning to the UV finite parts of the loop
diagrams. This is precisely the program of the regularization and renormalization to be
pursued in the subsequent chapters.

²³ So, if you know the popular animated sitcom South Park, you might be tempted to exclaim: “Oh my God, they
killed Kenny!”

Chapter 38

Regularization of UV divergences

In several forthcoming chapters, we will discuss the problem of the regularization of UV
divergences arising in QED one-loop diagrams. We are going to start with the closed purely
fermion loop (see Fig. 38.1) that emerged as a subgraph of the 4th order 𝑆-matrix diagram shown
in Fig. 37.1. Let us remark already now that it is usually called “vacuum polarization graph”
(the reason for such a fancy name will become clear later).

Fig. 38.1: The vacuum polarization loop in QED. [Figure omitted: external photon momentum q on both sides; one internal line is labelled l − q.]

As a prelude to the main subject of this chapter, let us examine some simple general
properties of the contribution of the closed loop in Fig. 38.1. In the preceding chapter we have
arrived at its integral representation, which reads

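$$i\Pi_{\mu\nu}(q) = \int\frac{d^4l}{(2\pi)^4}\,\mathrm{Tr}\left[\gamma_\mu\,\frac{1}{\slashed{l}-\slashed{q}-m}\,\gamma_\nu\,\frac{1}{\slashed{l}-m}\right] = \int\frac{d^4l}{(2\pi)^4}\,\frac{\mathrm{Tr}\big[\gamma_\mu(\slashed{l}-\slashed{q}+m)\,\gamma_\nu(\slashed{l}+m)\big]}{\big[(l-q)^2-m^2\big](l^2-m^2)} . \tag{38.1}$$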
It is not difficult to find out that Π 𝜇𝜈 (𝑞) is symmetric, i.e. Π 𝜇𝜈 (𝑞) = Π𝜈𝜇 (𝑞), and is also an
even function, Π 𝜇𝜈 (−𝑞) = Π 𝜇𝜈 (𝑞). Indeed, these two properties can be verified by means of an
appropriate formal shift of the integration variable in (38.1) and by using the trace cyclicity (the
reader is encouraged to do this, as a simple exercise). In fact, there is an alternative argument,
which is perhaps even more elegant. Whatever definition might be used for the (up to now
ill-defined) integral in (38.1), one would like to get a covariant form for Π 𝜇𝜈 (𝑞); this would
mean that
Π 𝜇𝜈 (𝑞) = 𝐴(𝑞 2 )𝑔 𝜇𝜈 + 𝐵(𝑞 2 )𝑞 𝜇 𝑞 𝜈 , (38.2)
where 𝐴 and 𝐵 are essentially arbitrary functions. It is so, because Π 𝜇𝜈 (𝑞) should be a 2nd
rank Lorentz tensor depending on a single four-vector variable 𝑞; the expression (38.2) is then
obviously its most general form. The Ansatz (38.2) makes the above-mentioned symmetry
properties manifest. With these simple facts in mind, it is clear that the contribution of the graph
in Fig. 38.1 is the same as for Fig. 38.2. It means that when drawing the vacuum polarization
diagram in question, one need not worry about the orientation of the loop (the direction of
running around the loop is irrelevant; clockwise or counterclockwise, it doesn’t matter).

Fig. 38.2: An alternative equivalent labelling of the loop momenta for the vacuum polarization graph (loop momenta l and l + q).

There is one more property, which may not be so obvious at first sight: Π 𝜇𝜈 (𝑞) is
transverse with respect to 𝑞, i.e.
𝑞 𝜇 Π 𝜇𝜈 (𝑞) = 0 . (38.3)
So, how can one derive Eq. (38.3), at least heuristically? A hand-waving argument may proceed
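as follows. Multiplying (38.1) by $q^\mu$, one has

$$
\begin{aligned}
i q^\mu \Pi_{\mu\nu}(q) &= \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr}\left[\slashed{q}\, \frac{1}{\slashed{l} - \slashed{q} - m}\, \gamma_\nu \frac{1}{\slashed{l} - m}\right] \\
&= \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr}\left[\big((\slashed{l} - m) - (\slashed{l} - \slashed{q} - m)\big) \frac{1}{\slashed{l} - \slashed{q} - m}\, \gamma_\nu \frac{1}{\slashed{l} - m}\right] \\
&= \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr}\left[(\slashed{l} - m) \frac{1}{\slashed{l} - \slashed{q} - m}\, \gamma_\nu \frac{1}{\slashed{l} - m}\right]
 - \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr}\left[\gamma_\nu \frac{1}{\slashed{l} - m}\right] .
\end{aligned}
$$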

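Now, using the trace cyclicity in the first term, one has

$$
i q^\mu \Pi_{\mu\nu}(q) = \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr}\left[\gamma_\nu \frac{1}{\slashed{l} - \slashed{q} - m}\right]
 - \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr}\left[\gamma_\nu \frac{1}{\slashed{l} - m}\right] . \tag{38.4}
$$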

Then, upon the shift 𝑙 − 𝑞 → 𝑙 in the first integral, the first and the second term on the right-
hand side of Eq. (38.4) become identical and Eq. (38.3) is thereby proved. Of course, we have
used here manipulations with badly divergent integrals, so our “proof” has been rather sloppy,
certainly not rigorous.
On the other hand, the transversality property (38.3) is highly desirable for ensuring
gauge independence of physical 𝑆-matrix elements. Such a strong statement clearly calls for an
explanation. So, as an instructive example, let us consider the two-loop “sausage-like” diagram
contributing (at 6th order) e.g. to our earlier sample process 𝑒 + 𝜇 → 𝑒 + 𝜇, which is shown in
Fig. 38.3. If one uses photon propagators in the form (E.16) (see Appendix E), i.e. in the general
covariant 𝛼-gauge, the contribution of the diagram in Fig. 38.3 would certainly be 𝛼-independent
if (38.3) holds. However, if Eq. (38.3) were invalid, the $\alpha$-dependence in the contribution of
the graph in Fig. 38.3 would persist, because of the propagator 2 in the middle (marked in red).
Notice also that the propagators 1 and 3 are harmless in this context because of the familiar identities
$\bar{u}(p_1')\, \slashed{q}\, u(p_1) = 0$, $\bar{u}(p_2')\, \slashed{q}\, u(p_2) = 0$.
So, any definition of the tensor Π 𝜇𝜈 (𝑞) based on an explicit regularization should maintain
the identity (38.3). Employing the general form of Π 𝜇𝜈 (𝑞) according to (38.2), the relation (38.3)
obviously amounts to
𝐴(𝑞 2 ) = −𝑞 2 𝐵(𝑞 2 ) . (38.5)

Fig. 38.3: An example of a multiloop QED diagram, whose contribution is sensitive to the gauge
dependence of photon propagators and to the transversality of the vacuum polarization loop
(external momenta p1, p2 incoming and p′1, p′2 outgoing; photon propagators labelled 1, 2, 3).

Thus, the general form of Π 𝜇𝜈 (𝑞), satisfying Eq. (38.3), can be written in terms of a single “form
factor”. In a universally accepted convention it reads

Π 𝜇𝜈 (𝑞) = Π(𝑞 2 )(𝑞 2 𝑔 𝜇𝜈 − 𝑞 𝜇 𝑞 𝜈 ) . (38.6)

Note also that the form factor $\Pi(q^2)$ is dimensionless, as should be clear from the integral
representation (38.1).
There is another important point that should be mentioned here in connection with the
transversality property (38.3). It can be shown that Π 𝜇𝜈 (𝑞), as given by the expression (38.1), is
in fact the Fourier transform of the vacuum expectation value of the T-product of two currents,

$$
\langle 0|\, T\big(J_\mu(x) J_\nu(0)\big)\, |0\rangle , \tag{38.7}
$$

with $J_\mu = \bar{\psi}\gamma_\mu\psi$. Multiplication of $\Pi_{\mu\nu}(q)$ by $q^\mu$ is thus equivalent to differentiating the
expression (38.7) with respect to $x$. However, the current $J_\mu$ is conserved, $\partial^\mu J_\mu = 0$. So, one
may write (on the heuristic level we still stick to)

𝜕 𝜇 𝐽 𝜇 (𝑥) = 0 =⇒ 𝑞 𝜇 Π 𝜇𝜈 (𝑞) = 0 . (38.8)

Technical details of the calculation confirming the result (38.8) are left as a non-trivial exercise
for an interested reader.
In QFT, identities that are related to the gauge invariance (manifested here in the current
conservation) are generally called the Ward identities, in honour of the British theorist John C.
Ward (1924–2000). So, the transversality relation (38.3) is one of the whole set of QED Ward
identities. Note that, historically, the first identity of such a type, discovered by Ward, was a
simple relation between quantities Σ and Γ𝜇 introduced in Chapter 37 (see (37.16) and (37.17));
we will discuss it later on, in Chapter 42.
Let us now proceed to the main theme of this chapter, the UV regularization. A regular-
ization of UV divergences in an expression like (38.1) means that the corresponding divergent
integral is replaced by a convergent one, at the price of introducing an auxiliary (regularization)
parameter, in such a way that by removing it one returns to the original form. This is admittedly
a rather vague statement, so now we are going to make it explicit. There are essentially three
possible strategies for the UV regularization. First, one can replace the infinite integration do-
main by a finite one, simply by introducing a finite upper bound for the integration over the loop
four-momentum. This type of regularization is usually called momentum cut-off, for obvious
reasons. Such a straightforward procedure is too crude to guarantee special relations like
e.g. (38.3), i.e. to maintain the symmetry properties of the QFT model in question, but we will
see that it can be useful as an auxiliary regularization within a more complicated scheme.
Second, taking into account that the UV divergence is also due to a bad behaviour of
the integrand at infinity, one may improve such an asymptotics by modifying appropriately the
integrand. For instance, the propagator $1/(l^2 - m^2)$ may be replaced by a subtracted form,
$$
\frac{1}{l^2 - m^2} \longrightarrow \frac{1}{l^2 - m^2} - \frac{1}{l^2 - M^2} , \tag{38.9}
$$
where 𝑀 is a “regulator mass”. Obviously, the expression (38.9) behaves like 1/(𝑙 2 ) 2 for 𝑙 → ∞
in contrast with the 1/𝑙 2 behaviour of the original propagator. So, here the regularization
parameter is the auxiliary mass 𝑀 and removing such a regularization means, of course, taking
𝑀 → ∞. An example of this regularization strategy is the so-called Pauli–Villars method (to
be discussed in detail later on).
Third, another source of UV divergences is the integration in the four-dimensional
space. The integrals in question would, obviously, have better convergence properties in lower-
dimensional spaces. To implement this idea, one may derive formulae for the relevant integrals
in dependence on the dimensionality 𝑛 of the integration space. Then one can use an analytic
continuation of these formulae to non-integer (or even complex) values of the parameter 𝑛. The
original UV divergence is then revealed as a singularity of such an 𝑛-dependent regularized
expression in the limit 𝑛 → 4. This last mentioned method is called simply the dimensional
regularization. It was invented independently by the Dutch theorists Gerardus ’t Hooft
and Martinus Veltman, and by the Argentinians Carlos Bollini and Juan Giambiagi, in the early 1970s
(but usually it is attributed only to ’t Hooft and Veltman, who became much more famous and
also received the Nobel Prize in 1999 for their work on gauge theories in particle physics). For
the original papers, see [41] and [42]. In fact, the dimensional regularization (DR) is the least
intuitive method introduced here, and for a newcomer in QFT it might represent a sort of “shock
therapy”. Nevertheless, from the point of view of practical calculations, in most cases it is the
most efficient and flexible procedure, so we will employ it as the first example of how to deal
with the UV divergence appearing in $\Pi_{\mu\nu}(q)$, the contribution of the vacuum polarization graph.
Before implementing the relevant steps of the DR method, let us introduce a general cal-
culational tool (used in any computation of Feynman diagrams), namely the so-called Feynman
parametrization. Such a trick amounts to replacing a product of propagator denominators by
a power of a single expression, at the price of introducing some new integrations over a finite
domain of “Feynman parameters”. The simplest example is the elementary formula

$$
\frac{1}{ab} = \int_0^1 dx\, \frac{1}{\left[a x + b(1 - x)\right]^2}
= \int_0^1 dx\, \frac{1}{\left[(a - b)x + b\right]^2} . \tag{38.10}
$$

A generalization of (38.10) reads

$$
\frac{1}{a_1 a_2 \cdots a_n} = (n-1)! \int_0^1 dx_1 \int_0^{1-x_1} dx_2 \cdots \int_0^{1-x_1-\dots-x_{n-2}} dx_{n-1}\,
\frac{1}{\left[(a_1 - a_n)x_1 + \dots + (a_{n-1} - a_n)x_{n-1} + a_n\right]^n} . \tag{38.11}
$$
A formal proof of the identity (38.11) can be found in many places and it is also an appropriate
topic for a tutorial.
A remark concerning Eq. (38.10) is in order here (in fact it could be extended to the
general case (38.11) as well). Proving the identity (38.10) is elementary. But one might wonder
how it can hold when $a$, $b$ are real and $ab < 0$, while the integral on the right-hand side is
apparently positive. This is a legitimate question and the answer is as follows. By $a$ and $b$ one means
propagator denominators that have, as we know, the form 𝑙 2 − 𝑚 2 + 𝑖𝜖, 𝜖 > 0. It turns out that it
is precisely the tiny piece 𝑖𝜖, which saves the consistency of the relation (38.10). For simplicity,
let us consider an almost trivial example, where 𝑎 = 1 and 𝑏 = −1, i.e. in fact
𝑎 = 1 + 𝑖𝜖 , 𝑏 = −1 + 𝑖𝜖 . (38.12)
Then, obviously, the left-hand side of (38.10) is $-1$ (up to corrections of order $\epsilon^2$). On the right-hand side one has
$$
I(\epsilon) = \int_0^1 dx\, \frac{1}{(2x - 1 + i\epsilon)^2} . \tag{38.13}
$$

Now it is clear what the role of $i\epsilon$ is: without it, the integrand in (38.13) has a singularity at $x = \frac{1}{2}$
and the integral is divergent. In the presence of $i\epsilon$, the integrand is equal to $-1/\epsilon^2$ for $x = \frac{1}{2}$
(so it becomes highly negative!). The evaluation of the integral (38.13) is of course elementary;
using the substitution $2x - 1 = y$ for convenience, one gets
$$
I(\epsilon) = \frac{1}{2} \int_{-1}^{1} dy\, \frac{1}{(y + i\epsilon)^2}
= -\frac{1}{2} \left( \frac{1}{1 + i\epsilon} - \frac{1}{-1 + i\epsilon} \right) = -\frac{1}{1 + \epsilon^2} ,
$$
so that $\lim_{\epsilon \to 0} I(\epsilon) = -1$.
Thus, the 𝑖𝜖 prescription is welcome for avoiding singularities of integrands consisting of
products of field propagators — a typical situation in the evaluation of Feynman diagrams. Note
also that there are some further generalizations of the Feynman-parametric formulae (38.10) or
(38.11) that can be found easily in the current literature.
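
The role of the $i\epsilon$ term in (38.10) can also be checked numerically. The following is only a minimal sketch: the helper names `simpson` and `feynman_rhs` and the value $\epsilon = 0.1$ are illustrative assumptions, not part of the text. It compares the left-hand side $1/(ab)$ with a numerical evaluation of the Feynman-parameter integral for the example (38.12).

```python
def simpson(f, a, b, steps=20_000):
    """Composite Simpson rule; also works for complex-valued integrands."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def feynman_rhs(a, b):
    """Right-hand side of (38.10): integral of 1/[(a-b)x + b]^2 over x in [0, 1]."""
    return simpson(lambda x: 1.0 / ((a - b) * x + b) ** 2, 0.0, 1.0)


eps = 0.1                               # illustrative value of the i*epsilon regulator
a, b = 1 + 1j * eps, -1 + 1j * eps      # the example (38.12)
lhs = 1.0 / (a * b)                     # analytically -1/(1 + eps^2)
rhs = feynman_rhs(a, b)                 # numerical I(eps) of (38.13)
print(lhs, rhs)
```

Repeating the run with smaller values of `eps` requires a finer grid (the integrand develops a peak of width of order `eps` around $x = 1/2$), which mirrors the statement that the integral is singular without the $i\epsilon$ term.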
Let us now start the evaluation of the Π 𝜇𝜈 (𝑞) within dimensional regularization. We will
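consider the diagram in Fig. 38.2, so one may write, integrating formally in $n$ dimensions:
$$
i\Pi^{\rm DR}_{\mu\nu}(q) = \mu^{4-n} \int \frac{d^n l}{(2\pi)^n}\, \mathrm{Tr}\left[\gamma_\mu \frac{1}{\slashed{l} - m}\, \gamma_\nu \frac{1}{\slashed{l} + \slashed{q} - m}\right] . \tag{38.14}
$$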

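Note that we have introduced an additional overall factor $\mu^{4-n}$, where $\mu$ is an arbitrary mass
scale; this is done to preserve the right dimensionality of $\Pi_{\mu\nu}(q)$. So, the integral on the
right-hand side of (38.14) is
$$
I_{\mu\nu}(q) = \int \frac{d^n l}{(2\pi)^n}\, \frac{\mathrm{Tr}\left[\gamma_\mu (\slashed{l} + m)\, \gamma_\nu (\slashed{l} + \slashed{q} + m)\right]}{\left[(l + q)^2 - m^2\right](l^2 - m^2)} . \tag{38.15}
$$

Introducing the Feynman parametrization according to (38.10), the expression (38.15) becomes

$$
\begin{aligned}
I_{\mu\nu}(q) &= \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{\mathrm{Tr}\left[\gamma_\mu (\slashed{l} + m)\, \gamma_\nu (\slashed{l} + \slashed{q} + m)\right]}{\left\{\left[(l + q)^2 - l^2\right]x + l^2 - m^2\right\}^2} \\
&= \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{\mathrm{Tr}\left[\gamma_\mu (\slashed{l} + m)\, \gamma_\nu (\slashed{l} + \slashed{q} + m)\right]}{\left[(2 l \cdot q + q^2)x + l^2 - m^2\right]^2} .
\end{aligned} \tag{38.16}
$$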

The next step is an appropriate shift of the integration variable: since we assume implicitly that
the integral (38.16) is now convergent (regularized), a shift of 𝑙 is legitimate. For this purpose
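the expression (38.16) is recast as

$$
I_{\mu\nu}(q) = \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{\mathrm{Tr}\left[\gamma_\mu (\slashed{l} + m)\, \gamma_\nu (\slashed{l} + \slashed{q} + m)\right]}{\left[(l + xq)^2 - x^2 q^2 + x q^2 - m^2\right]^2} . \tag{38.17}
$$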

Now, the relevant shift is $l' = l + xq$, i.e. $l = l' - xq$. Performing this in (38.17) and renaming $l'$
again as $l$, one has

$$
I_{\mu\nu}(q) = \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{\mathrm{Tr}\left[\gamma_\mu (\slashed{l} - x\slashed{q} + m)\, \gamma_\nu (\slashed{l} + (1 - x)\slashed{q} + m)\right]}{\left(l^2 - C\right)^2} , \tag{38.18}
$$

with
$$
C = m^2 - x(1 - x) q^2 \tag{38.19}
$$
(one should keep in mind that $m^2$ in the expression (38.19) is in fact $m^2 - i\epsilon$).

Chapter 39

Accomplishing dimensional regularization of $\Pi_{\mu\nu}(q)$

Our starting point is now the integral (38.18) along with the formula (38.19) from the preceding
chapter. A great advantage of the form (38.18) is that the denominator of the integrand is an
even function of 𝑙. This enables one to eliminate immediately terms in the numerator that
are odd (i.e. linear in 𝑙). So, working out the matrix trace under the integral in (38.18) in a
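straightforward way and discarding the odd terms, one gets
$$
\begin{aligned}
I_{\mu\nu}(q) &= \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{\mathrm{Tr}(\gamma_\mu \slashed{l} \gamma_\nu \slashed{l}) - x(1 - x)\, \mathrm{Tr}(\gamma_\mu \slashed{q} \gamma_\nu \slashed{q}) + m^2\, \mathrm{Tr}(\gamma_\mu \gamma_\nu)}{\left(l^2 - C\right)^2} \\
&= 4 \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{2 l_\mu l_\nu - l^2 g_{\mu\nu} - x(1 - x)\left(2 q_\mu q_\nu - q^2 g_{\mu\nu}\right) + m^2 g_{\mu\nu}}{\left(l^2 - C\right)^2} .
\end{aligned} \tag{39.1}
$$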

A remark is in order here: The overall factor 4 comes from the familiar formulae for traces of
products of $\gamma$-matrices in four dimensions (recall that it is the trace of the unit $4 \times 4$ matrix). In fact,
in $n$ dimensions, one should work with $2^{n/2} \times 2^{n/2}$ matrices, so the trace factor would then be
$2^{n/2}$. However, such an overall factor does not play any significant role in practical calculations,
so usually one employs, as an “operational prescription”, the trace factor equal to four.
Now, the first term in the numerator of the integrand in (39.1) can be simplified by using
a rule for “symmetric integration”, namely by means of the replacement
$$
l_\alpha l_\beta \longrightarrow \frac{1}{n}\, l^2 g_{\alpha\beta} . \tag{39.2}
$$
How can one justify the “trick” (39.2)? It is clear that the integral
$$
\int d^n l\, \frac{l_\alpha l_\beta}{(l^2 - C)^2} \tag{39.3}
$$
must have the tensor form $f(C)\, g_{\alpha\beta}$. Taking the trace over the indices $\alpha$, $\beta$ and realizing that
$g^\alpha{}_\alpha = n$ in $n$-dimensional space, one has
$$
\int d^n l\, \frac{l^2}{(l^2 - C)^2} = f(C)\, n ,
$$
i.e.
$$
f(C) = \frac{1}{n} \int d^n l\, \frac{l^2}{(l^2 - C)^2} . \tag{39.4}
$$

Returning to the expression (39.3) and using (39.4), one thus has the result
$$
\int d^n l\, \frac{l_\alpha l_\beta}{(l^2 - C)^2} = f(C)\, g_{\alpha\beta} = \frac{1}{n}\, g_{\alpha\beta} \int d^n l\, \frac{l^2}{(l^2 - C)^2} ,
$$
and the rule (39.2) is thereby proved.
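In this way, the integral (39.1) becomes
$$
I_{\mu\nu}(q) = 4 \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{\left(\frac{2}{n} - 1\right) l^2 g_{\mu\nu} - 2x(1 - x) q_\mu q_\nu + x(1 - x) q^2 g_{\mu\nu} + m^2 g_{\mu\nu}}{\left(l^2 - C\right)^2} . \tag{39.5}
$$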

Now, we would like to prove that our regularized expression satisfies the transversality condition
(38.3), i.e. that the integral 𝐼 𝜇𝜈 (𝑞) (which is simply proportional to Π 𝜇𝜈 (𝑞)) has the form (38.6).
To this end, let us separate in (39.5) the transverse part and then try to prove that the rest is zero.
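Thus, the expression (39.5) may be recast as

$$
I_{\mu\nu}(q) = 4 \int_0^1 dx \int \frac{d^n l}{(2\pi)^n}\, \frac{2x(1 - x)\left(q^2 g_{\mu\nu} - q_\mu q_\nu\right) + \left[\left(\frac{2}{n} - 1\right) l^2 - x(1 - x) q^2 + m^2\right] g_{\mu\nu}}{\left(l^2 - C\right)^2} , \tag{39.6}
$$
and we would like to prove that
$$
\int \frac{d^n l}{(2\pi)^n}\, \frac{\left(\frac{2}{n} - 1\right) l^2 - x(1 - x) q^2 + m^2}{\left(l^2 - C\right)^2} = 0 . \tag{39.7}
$$
For this purpose, one needs a general formula suitable for the type of integrals appearing in
(39.7), which in fact represents the very core of the method of dimensional regularization. It
reads
$$
\int \frac{d^n l}{(2\pi)^n}\, \frac{(l^2)^r}{(l^2 - C + i\epsilon)^s}
= \frac{i}{(4\pi)^{n/2}}\, (-1)^{r-s}\, C^{\,r + \frac{n}{2} - s}\, \frac{\Gamma\!\left(r + \frac{n}{2}\right) \Gamma\!\left(s - r - \frac{n}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right) \Gamma(s)} , \tag{39.8}
$$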

where Γ denotes the Euler gamma function. Note that the integral in (39.8) is convergent for
𝑠 −𝑟 > 𝑛/2. Notice also that we have marked explicitly the contribution 𝑖𝜖 in the denominator on
the left-hand side so as to emphasize the way of avoiding a possible singularity of the integrand.
Because of the key position of the relation (39.8) in the whole regularization scheme, it may be
aptly dubbed the “master formula” of the DR method. We will prove the formula (39.8) at the
end of this chapter; now let us utilize it for proving the identity (39.7). Taking into account that
$m^2 - x(1 - x) q^2 = C$ (see Eq. (38.19)), the integral in Eq. (39.7) is
$$
I = \int \frac{d^n l}{(2\pi)^n}\, \frac{\left(\frac{2}{n} - 1\right) l^2 + C}{\left(l^2 - C\right)^2} . \tag{39.9}
$$
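Then, using the formula (39.8), one has

$$
I = \frac{i}{(4\pi)^{n/2}} \left(\frac{2}{n} - 1\right) (-1)\, C^{\frac{n}{2} - 1}\, \frac{\Gamma\!\left(1 + \frac{n}{2}\right) \Gamma\!\left(1 - \frac{n}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right) \Gamma(2)}
+ \frac{i}{(4\pi)^{n/2}}\, C \cdot C^{\frac{n}{2} - 2}\, \frac{\Gamma\!\left(\frac{n}{2}\right) \Gamma\!\left(2 - \frac{n}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right) \Gamma(2)} . \tag{39.10}
$$
The last expression can be simplified by means of the well-known identity

$$
z\, \Gamma(z) = \Gamma(1 + z) . \tag{39.11}
$$

One thus gets, after a simple manipulation,
$$
I = \frac{i}{(4\pi)^{n/2}}\, C^{\frac{n}{2} - 1}\, \Gamma\!\left(1 - \frac{n}{2}\right) \left[ -\left(\frac{2}{n} - 1\right) \frac{n}{2} + \left(1 - \frac{n}{2}\right) \right] = 0 .
$$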
There is another interesting aspect of the above calculation that is worth mentioning here. Along
with the elimination of the non-transverse part of the expression (39.6), the original quadratic
divergence drops out (note that such a divergence would correspond to the part of the integrand
involving 𝑙 2 /(𝑙 2 −𝐶) 2 ). Thus, the remaining transverse term in Eq. (39.6) is only logarithmically
divergent, according to our earlier preliminary classification.
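
The cancellation in (39.10) can be checked numerically for non-integer $n$, where all the $\Gamma$ functions involved are finite. The following is a minimal sketch; the function names and the sample values of $n$ and $C$ are illustrative assumptions. It evaluates the right-hand side of the master formula (39.8) with the common factor $i/(4\pi)^{n/2}$ stripped off and combines the two terms of (39.9).

```python
import math


def master_coefficient(r, s, n, C):
    """Right-hand side of (39.8) without the common factor i/(4*pi)^(n/2)."""
    sign = -1.0 if (r - s) % 2 else 1.0          # (-1)^(r-s) for integer r, s
    return (sign * C ** (r + n / 2 - s)
            * math.gamma(r + n / 2) * math.gamma(s - r - n / 2)
            / (math.gamma(n / 2) * math.gamma(s)))


def non_transverse_part(n, C):
    """The integral (39.9): (2/n - 1) times the l^2 term plus C times the constant term."""
    return ((2 / n - 1) * master_coefficient(1, 2, n, C)
            + C * master_coefficient(0, 2, n, C))


# Sample non-integer dimensions (n = 2 and n = 4 hit poles of the Gamma function).
for n in (2.5, 3.3, 3.9, 4.1):
    print(n, non_transverse_part(n, C=1.7))
```

Each individual term grows as $n$ approaches 4, while their sum stays at the level of rounding errors, in line with the statement that the quadratic divergence drops out together with the non-transverse part.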
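In this way, the integral (39.6) has been reduced to
$$
I_{\mu\nu}(q) = 8 \left(q^2 g_{\mu\nu} - q_\mu q_\nu\right) \int_0^1 dx\, x(1 - x) \int \frac{d^n l}{(2\pi)^n}\, \frac{1}{\left(l^2 - C\right)^2} . \tag{39.12}
$$

Getting back to the original quantity $\Pi_{\mu\nu}(q)$, this means (cf. (38.6) and (38.14)) that we have
$$
i\Pi^{\rm DR}_{\mu\nu}(q) = i\Pi^{\rm DR}(q^2) \left(q^2 g_{\mu\nu} - q_\mu q_\nu\right) ,
$$
with
$$
i\Pi^{\rm DR}(q^2) = 8\, \mu^{4-n} \int_0^1 dx\, x(1 - x) \int \frac{d^n l}{(2\pi)^n}\, \frac{1}{\left(l^2 - C\right)^2} . \tag{39.13}
$$
One may now carry out the integration over $d^n l$ with the help of the formula (39.8) and we thus
get, after a simple manipulation,
$$
\Pi^{\rm DR}(q^2) = \frac{8}{(4\pi)^{n/2}}\, \Gamma\!\left(2 - \frac{n}{2}\right) \int_0^1 dx\, x(1 - x) \left(\frac{C}{\mu^2}\right)^{\frac{n}{2} - 2} . \tag{39.14}
$$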

Removing the regularization would mean, formally, performing the limit 𝑛 → 4, so it makes
sense to expand the expression (39.14) in the vicinity of 𝑛 = 4. Its form suggests that it is natural
to introduce a parameter
$$
\epsilon = 2 - \frac{n}{2} , \tag{39.15}
$$
i.e. $4 - n = 2\epsilon$ (hopefully, such a notation will not cause confusion as regards the symbol $\epsilon$
appearing in any propagator denominator). So, (39.14) can be recast as
$$
\Pi^{\rm DR}(q^2) = \frac{1}{2\pi^2}\, (4\pi)^\epsilon\, \Gamma(\epsilon) \int_0^1 dx\, x(1 - x) \left(\frac{C}{\mu^2}\right)^{-\epsilon} . \tag{39.16}
$$

In view of the definition (39.15), the limit 𝑛 → 4 corresponds to 𝜖 → 0. The Euler gamma
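function $\Gamma(\epsilon)$ has a simple pole at $\epsilon = 0$, and we may utilize the well-known form of its Laurent
expansion around $\epsilon = 0$:
$$
\Gamma(\epsilon) = \frac{1}{\epsilon} - \gamma_{\rm E} + O(\epsilon) , \tag{39.17}
$$
where $\gamma_{\rm E} \approx 0.577$ is the Euler–Mascheroni constant (note that $\gamma_{\rm E} = -\Gamma'(1)$). The remaining
$\epsilon$-dependent factors in (39.16) can be expanded in Taylor series around $\epsilon = 0$. One has
$$
\begin{aligned}
(4\pi)^\epsilon &= 1 + \epsilon \ln 4\pi + O(\epsilon^2) , \\
\left(\frac{C}{\mu^2}\right)^{-\epsilon} &= 1 - \epsilon \ln \frac{C}{\mu^2} + O(\epsilon^2) .
\end{aligned} \tag{39.18}
$$

Putting all this together, one gets finally²⁴

$$
\Pi^{\rm DR}(q^2) = \frac{1}{2\pi^2} \left[ \frac{1}{6} \left( \frac{1}{\epsilon} - \gamma_{\rm E} + \ln 4\pi \right) - \int_0^1 dx\, x(1 - x) \ln \frac{m^2 - x(1 - x) q^2}{\mu^2} + O(\epsilon) \right] . \tag{39.19}
$$
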
In this way, we have isolated the UV divergence as the pole term 1/𝜖. The general strategy
of the dimensional regularization should now be clear: we use the master formula (39.8) that is
strictly valid for some values of parameters 𝑛, 𝑟, 𝑠 (𝑠 − 𝑟 > 𝑛/2) and then one can employ its
analytic continuation in the parameter $n$. This is quite straightforward, because of the known
simple properties of the Γ function.
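
The step from (39.16) to (39.19) can be illustrated numerically. The sketch below makes several assumptions that are not in the text: the names `pi_dr_exact` and `pi_dr_expanded`, the units $m^2 = \mu^2 = 1$, and a spacelike momentum $q^2 = -1$, so that $C > 0$ and no imaginary part arises. It compares the finite-$\epsilon$ expression (39.16) with the expansion (39.19) truncated at order $\epsilon^0$.

```python
import math


def simpson(f, a, b, steps=2_000):
    """Composite Simpson rule for a real integrand."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def pi_dr_exact(eps, q2, m2=1.0, mu2=1.0):
    """Eq. (39.16) at finite eps; assumes C = m2 - x(1-x) q2 > 0 on [0, 1]."""
    integral = simpson(
        lambda x: x * (1 - x) * ((m2 - x * (1 - x) * q2) / mu2) ** (-eps), 0.0, 1.0)
    return (4 * math.pi) ** eps * math.gamma(eps) * integral / (2 * math.pi ** 2)


def pi_dr_expanded(eps, q2, m2=1.0, mu2=1.0):
    """Eq. (39.19) with the O(eps) remainder dropped."""
    euler_gamma = 0.5772156649015329
    log_term = simpson(
        lambda x: x * (1 - x) * math.log((m2 - x * (1 - x) * q2) / mu2), 0.0, 1.0)
    pole_part = (1 / eps - euler_gamma + math.log(4 * math.pi)) / 6
    return (pole_part - log_term) / (2 * math.pi ** 2)


for eps in (1e-2, 1e-3):
    exact = pi_dr_exact(eps, q2=-1.0)
    approx = pi_dr_expanded(eps, q2=-1.0)
    print(eps, exact, approx, exact - approx)
```

The difference between the two columns should shrink roughly in proportion to `eps`, as expected for an $O(\epsilon)$ remainder, while both values are dominated by the pole term $1/(12\pi^2\epsilon)$.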
As a technical appendix to our conceptual exposition of basic DR techniques, let us now
prove the “master formula” (39.8). So, we would like to compute the integral

$$
I = \int \frac{d^n l}{(2\pi)^n}\, \frac{(l^2)^r}{(l^2 - C + i\epsilon)^s} . \tag{39.20}
$$

The integration variable $l$ lives in the space with pseudo-Euclidean metric, so that

$$
l^2 = l_0^2 - |\vec{l}\,|^2 = l_0^2 - \left(l_1^2 + \dots + l_{n-1}^2\right) . \tag{39.21}
$$

In (39.20) one may integrate first over the components of $\vec{l}$ and subsequently over $l_0$. In this last
integration along the real axis one avoids potential singularities (poles) thanks to the presence of
$i\epsilon$. One may then rotate the real interval $(-\infty, +\infty)$ for $l_0$ to the imaginary axis in the complex plane
(this should be understandable for anybody with at least basic knowledge of complex analysis).
Such a transformation then means that $l_0 = i l_n$, $l_n \in (-\infty, +\infty)$, and $l^2$ in (39.21) becomes

$$
l^2 = l_0^2 - |\vec{l}\,|^2 = -l_{\rm E}^2 , \tag{39.22}
$$

with
$$
l_{\rm E}^2 = l_1^2 + \dots + l_{n-1}^2 + l_n^2 \tag{39.23}
$$
(the index E stands here for Euclidean, for obvious reasons). Such a trick is quite common in
QFT and is called Wick rotation (due to the same G. C. Wick as before). The substitution
$l_0 = i l_n$ in the integral of course means that

$$
d^n l \longrightarrow i\, d^n l_{\rm E} . \tag{39.24}
$$

So, we have, as a first step,

$$
I = i (-1)^{r-s} \int \frac{d^n l_{\rm E}}{(2\pi)^n}\, \frac{(l_{\rm E}^2)^r}{(l_{\rm E}^2 + C)^s} . \tag{39.25}
$$

24 Note that the factor 1/6 arises here from the integration over the Feynman parameter, as
$\int_0^1 dx\, x(1 - x) = \frac{1}{6}$.

For simplicity, let us assume that $C > 0$ throughout our calculation. A most convenient way of
evaluating the integral (39.25) is based on the exponentiation of the rational function in
the integrand, with the help of the identity
$$
\int_0^\infty t^k e^{-tA}\, dt = \frac{k!}{A^{k+1}} = \frac{\Gamma(k + 1)}{A^{k+1}} . \tag{39.26}
$$
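Thus, we may write (omitting the index E)

$$
\begin{aligned}
I &= i (-1)^{r-s} \int \frac{d^n l}{(2\pi)^n}\, (l^2)^r\, \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1} e^{-t(l^2 + C)}\, dt \\
&= \frac{i (-1)^{r-s}}{\Gamma(s)}\, \frac{1}{(2\pi)^n} \int_0^\infty dt\, t^{s-1} e^{-tC} \int d^n l\, (l^2)^r e^{-t l^2} .
\end{aligned} \tag{39.27}
$$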

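The integral over $l$ is expressed easily in terms of the Gaussian integral, since it holds, obviously,
$$
\int d^n l\, (l^2)^r e^{-t l^2} = (-1)^r \left(\frac{d}{dt}\right)^{r} \int d^n l\, e^{-t l^2} , \tag{39.28}
$$
and
$$
\int d^n l\, e^{-t l^2} = \int d^n l\, e^{-t (l_1^2 + \dots + l_n^2)} = \left(\frac{\sqrt{\pi}}{\sqrt{t}}\right)^{n} . \tag{39.29}
$$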
Differentiating the last expression as in (39.28) one gets, after some simple manipulations (and
using some well-known properties of the Γ function),
$$
\int d^n l\, (l^2)^r e^{-t l^2} = \pi^{n/2}\, \frac{\Gamma\!\left(\frac{n}{2} + r\right)}{\Gamma\!\left(\frac{n}{2}\right)}\, t^{-\frac{n}{2} - r} . \tag{39.30}
$$

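Inserting (39.30) into (39.27), one has

$$
I = i (-1)^{r-s}\, \frac{1}{\Gamma(s)}\, \frac{\pi^{n/2}}{(2\pi)^n}\, \frac{\Gamma\!\left(\frac{n}{2} + r\right)}{\Gamma\!\left(\frac{n}{2}\right)} \int_0^\infty dt\, t^{\,s - r - \frac{n}{2} - 1} e^{-tC} . \tag{39.31}
$$

Using the familiar definition

$$
\Gamma(z) = \int_0^\infty dt\, t^{z-1} e^{-t} , \tag{39.32}
$$
one gets easily, via a substitution $tC = u$ in (39.31),

$$
I = \frac{i (-1)^{r-s}}{(4\pi)^{n/2}}\, C^{\,r + \frac{n}{2} - s}\, \frac{\Gamma\!\left(r + \frac{n}{2}\right) \Gamma\!\left(s - r - \frac{n}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right) \Gamma(s)} , \tag{39.33}
$$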
and the proof is thereby completed.
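
For integer dimensions in which the integral converges, the Euclidean form (39.25) of the master formula can be compared with a direct numerical integration. The following is a rough sketch under stated assumptions: the function names are made up for illustration, the angular integration is done with the standard solid-angle factor $2\pi^{n/2}/\Gamma(n/2)$, and the radial integral is evaluated with the substitution $l_{\rm E} = \sqrt{C}\tan\theta$ and a midpoint rule. The overall factor $i(-1)^{r-s}$ is left out on both sides.

```python
import math


def euclidean_integral_numeric(n, r, s, C, steps=200_000):
    """Integral of (l_E^2)^r / (l_E^2 + C)^s over d^n l_E / (2 pi)^n, done numerically.

    Radial substitution l_E = sqrt(C) * tan(theta) maps [0, inf) to [0, pi/2).
    """
    p = n - 1 + 2 * r                 # power of l_E from the measure and from (l_E^2)^r
    cos_pow = 2 * s - 2 - p           # must exceed -1, i.e. s - r > n/2
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):            # midpoint rule in theta
        th = (k + 0.5) * h
        total += math.sin(th) ** p * math.cos(th) ** cos_pow
    radial = C ** ((p + 1) / 2 - s) * total * h
    solid_angle = 2 * math.pi ** (n / 2) / math.gamma(n / 2)
    return solid_angle * radial / (2 * math.pi) ** n


def euclidean_integral_formula(n, r, s, C):
    """Closed form (39.33) without the prefactor i * (-1)^(r-s)."""
    return (C ** (r + n / 2 - s) / (4 * math.pi) ** (n / 2)
            * math.gamma(r + n / 2) * math.gamma(s - r - n / 2)
            / (math.gamma(n / 2) * math.gamma(s)))


for n, r, s, C in [(3, 0, 2, 1.0), (4, 0, 3, 2.0), (4, 1, 4, 0.5)]:
    print((n, r, s, C),
          euclidean_integral_numeric(n, r, s, C),
          euclidean_integral_formula(n, r, s, C))
```

The sample parameter sets all satisfy the convergence condition $s - r > n/2$; outside this region only the analytically continued right-hand side is meaningful.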

Chapter 40

Pauli–Villars regularization

As we have already indicated in Chapter 38, the Pauli–Villars regularization relies on the strategy
of modifying (“deforming”) the integrand of an integral representing a closed-loop Feynman
diagram (while staying firmly in four dimensions). The simplest variant of such a scheme can
be implemented via a propagator subtraction shown in (38.9). Such a procedure is good enough
e.g. for the triangular loop in Fig. 37.2 or the diagram in Fig. 37.3, where one can consider the
above-mentioned modification of the photon propagator. However, for the purely fermion loop
appearing in the vacuum polarization graph one has to employ a more sophisticated approach.
The reason is that one would like to maintain the transversality property (38.3); performing
a subtraction in just a single propagator in the loop would introduce terms involving different
masses in the internal lines attached to the same vertex, and this would certainly destroy the
identity (38.3) — it should be clear from our heuristic derivation of the condition in question.
For an additional argument, one may realize that the appearance of two different masses in such
a loop would mean that it is made of non-conserved currents (recall that the four-divergence of
a vector current made of Dirac fields with unequal masses is proportional to their difference).
Wolfgang Pauli and Felix Villars came up with the right (and very natural) solution of
this problem in the early days of the modern era of QED, in 1949 (for the original paper see [43]).
They proposed to subtract the whole fermion loop involving an auxiliary regulator mass — then,
obviously, the transversality (38.3) is preserved automatically. More generally, it may happen
that more than one subtraction of such a type is needed; as we are going to see immediately, this
is precisely the case of the quantity Π 𝜇𝜈 (𝑞), which is currently our toy object for testing various
regularization methods.
Fig. 40.1: Vacuum polarization loop reproduced here for reader’s convenience (loop momenta l and l + q).

So, let us consider the contribution of the familiar loop in Fig. 40.1 and try to regularize it
using the above-mentioned basic idea of the Pauli–Villars (PV) method. It amounts
to replacing the ill-defined contribution of the diagram in Fig. 40.1 by a bona fide convergent
integral
$$
i\Pi^{\rm PV}_{\mu\nu}(q) = \int \frac{d^4 l}{(2\pi)^4} \left[ J(m) - C_1 J(M_1) - C_2 J(M_2) \right] , \tag{40.1}
$$
where we have denoted
$$
J(m) = \frac{\mathrm{Tr}\left[\gamma_\mu (\slashed{l} + m)\, \gamma_\nu (\slashed{l} + \slashed{q} + m)\right]}{(l^2 - m^2)\left[(l + q)^2 - m^2\right]} , \tag{40.2}
$$
and similarly for the other two terms in the integrand of the expression (40.1). We will see that
in the considered case, two PV subtractions are sufficient. Thus, the prescription (40.1) can be
pictured symbolically as shown in Fig. 40.2.

Fig. 40.2: Pictorial representation of the Pauli–Villars regularization of the vacuum polarization loop:
(loop)$_{\rm PV}$ = (loop with mass $m$) − $C_1$ (loop with mass $M_1$) − $C_2$ (loop with mass $M_2$).

One may employ now the Feynman parametrization (38.10) for each term in (40.1) and
then shift the integration variable 𝑙 as before, in Chapter 38 (𝑙 ′ = 𝑙 + 𝑥𝑞). Let us stress that
such a shift is legitimate, since we work with a convergent integral. From 𝐽 (𝑚) one then gets
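(preserving the original notation)
$$
J(m) = \frac{\mathrm{Tr}\left[\gamma_\mu (\slashed{l} - x\slashed{q} + m)\, \gamma_\nu (\slashed{l} + (1 - x)\slashed{q} + m)\right]}{\left[l^2 - f(m)\right]^2} , \tag{40.3}
$$
with
$$
f(m) = m^2 - x(1 - x) q^2 \tag{40.4}
$$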
(cf. (38.19)) and similarly for 𝐽 (𝑀1 ) and 𝐽 (𝑀2 ). Working out the trace in (40.3) and carrying
out the symmetric integration (see Chapter 39), the relevant integrand in (40.1) becomes, after
some simple manipulations,
𝐽 = 𝐽 (𝑚) − 𝐶1 𝐽 (𝑀1 ) − 𝐶2 𝐽 (𝑀2 ) , (40.5)
where
$$
J(m) = 4 \left[ -\frac{1}{2}\, l^2 g_{\mu\nu} - 2x(1 - x) q_\mu q_\nu + x(1 - x) q^2 g_{\mu\nu} + m^2 g_{\mu\nu} \right] \frac{1}{\left[l^2 - f(m)\right]^2} , \tag{40.6}
$$
and analogously for 𝐽 (𝑀1 ) and 𝐽 (𝑀2 ).
Consider now the power-like behaviour of the individual terms in the integrand 𝐽 for
𝑙 → ∞ and the potential UV divergences they could produce. Obviously, the 𝑙 2 term in the
numerators could lead to a quadratic divergence; so, let us examine first a condition for its
elimination. The part of the integrand (40.5) responsible for such a would-be leading UV
divergence is proportional to
$$
X_{\rm q.\,div.} = \frac{l^2}{\left[l^2 - f(m)\right]^2} - C_1 \frac{l^2}{\left[l^2 - f(M_1)\right]^2} - C_2 \frac{l^2}{\left[l^2 - f(M_2)\right]^2} . \tag{40.7}
$$

Let us denote, for brevity,

𝑓 (𝑚) = 𝑓0 , 𝑓 (𝑀1 ) = 𝑓1 , 𝑓 (𝑀2 ) = 𝑓2 . (40.8)

The expression (40.7) can then be written as

$$
X_{\rm q.\,div.} = l^2\, \frac{(l^2 - f_1)^2 (l^2 - f_2)^2 - C_1 (l^2 - f_0)^2 (l^2 - f_2)^2 - C_2 (l^2 - f_0)^2 (l^2 - f_1)^2}{(l^2 - f_0)^2 (l^2 - f_1)^2 (l^2 - f_2)^2} . \tag{40.9}
$$

The denominator in (40.9) is of the order $O(l^{12})$ for $l \to \infty$, so it is clear that a UV divergence
can arise from the terms in the numerator that are at least of the order $O(l^8)$. In other words,
contributions of the type $l^2 (l^2)^2 f_1^2$, $l^2 (l^2)^2 f_1 f_2$, etc. are harmless (i.e. lead to UV convergent
integrals). Let us start with the worst possible case, namely the power $l^2 (l^2)^2 (l^2)^2$ appearing in
the numerator. Clearly, the condition for eliminating these terms reads $1 - C_1 - C_2 = 0$, i.e.

𝐶1 + 𝐶2 = 1 . (40.10)

Next, some logarithmic (next-to-leading) UV divergences would originate in terms of the type
$l^2 (l^2)^2 (-2 l^2 f)$. There are plenty of them; a full list is as follows:

$$
\begin{aligned}
& l^2 (l^2)^2 (-2 l^2 f_2) + l^2 (l^2)^2 (-2 l^2 f_1) \\
& - C_1 \left[ l^2 (l^2)^2 (-2 l^2 f_2) + l^2 (l^2)^2 (-2 l^2 f_0) \right] \\
& - C_2 \left[ l^2 (l^2)^2 (-2 l^2 f_1) + l^2 (l^2)^2 (-2 l^2 f_0) \right] .
\end{aligned} \tag{40.11}
$$

Now, the factors 𝑓0 , 𝑓1 , 𝑓2 have the structure (40.4), so the mass-independent terms get cancelled
because of the relation (40.10). On the other hand, the mass-dependent terms are proportional
to
$$
\begin{aligned}
& M_2^2 - C_1 M_2^2 - C_2 M_1^2 + M_1^2 - C_1 m^2 - C_2 m^2 \\
& \quad = M_1^2 + M_2^2 - C_1 M_2^2 - C_2 M_1^2 - (C_1 + C_2)\, m^2 .
\end{aligned} \tag{40.12}
$$

Using now (40.10), the last expression becomes $C_1 M_1^2 + C_2 M_2^2 - m^2$, and the condition for the
elimination of the considered next-to-leading terms reads

$$
C_1 M_1^2 + C_2 M_2^2 = m^2 . \tag{40.13}
$$

Completing this discussion, it is not difficult to see that due to the conditions (40.10) and (40.13),
also the 𝑙-independent terms in (40.6) do not contribute to potential UV divergences.
To summarize our results: We have found out that UV divergences disappear from our
expression (40.1) if the parameters $C_1$, $C_2$ and $M_1$, $M_2$ satisfy the conditions

$$
\begin{aligned}
C_1 + C_2 &= 1 , \\
C_1 M_1^2 + C_2 M_2^2 &= m^2 .
\end{aligned} \tag{40.14}
$$

Obviously, there are many ways to satisfy the relations (40.14); one particular choice
could be
$$
\begin{aligned}
C_1 &= 2 , & M_1^2 &= m^2 + M^2 , \\
C_2 &= -1 , & M_2^2 &= m^2 + 2M^2 .
\end{aligned} \tag{40.15}
$$

The conditions (40.14) also show clearly that just a single PV subtraction would not be enough;
indeed, such a setting would correspond to 𝐶2 = 0 and, consequently, 𝐶1 = 1 along with 𝑀1 = 𝑚,
which of course would be useless. Thus, we have verified explicitly the well-known fact that the
number of PV subtractions depends generally on the degree of UV divergence of the Feynman
diagram in question.
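
The effect of the conditions (40.14) on the numerator of (40.9) can be verified with exact rational arithmetic. The sketch below is illustrative only: the helper names and the sample values of $m^2$, $M^2$, $q^2$ and $x$ are assumptions. It expands the bracket in the numerator of (40.9) as a polynomial in $L = l^2$ for the particular choice (40.15) and prints its coefficients.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in ascending powers."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def squared_factor(f):
    """Coefficients of (L - f)^2 in L = l^2, ascending powers."""
    return [f * f, -2 * f, Fraction(1)]


def numerator_of_40_9(m2, M2, q2, x):
    """Bracket in the numerator of (40.9), without the overall l^2, for the choice (40.15)."""
    C1, C2 = Fraction(2), Fraction(-1)
    shift = x * (1 - x) * q2
    f0 = m2 - shift                   # f(m)
    f1 = (m2 + M2) - shift            # f(M_1) with M_1^2 = m^2 + M^2
    f2 = (m2 + 2 * M2) - shift        # f(M_2) with M_2^2 = m^2 + 2 M^2
    P0, P1, P2 = squared_factor(f0), squared_factor(f1), squared_factor(f2)
    t0, t1, t2 = poly_mul(P1, P2), poly_mul(P0, P2), poly_mul(P0, P1)
    return [a - C1 * b - C2 * c for a, b, c in zip(t0, t1, t2)]


# Hypothetical sample point: m^2 = 1, M^2 = 100, q^2 = -3, x = 1/3.
coeffs = numerator_of_40_9(Fraction(1), Fraction(100), Fraction(-3), Fraction(1, 3))
print(coeffs)   # ascending powers of L; the L^4 and L^3 entries should vanish
```

With the coefficients of $L^4$ and $L^3$ gone, the numerator of (40.9) including its overall factor $l^2$ is at most of order $l^6$, so the integrand falls off like $l^{-6}$ and the four-dimensional integral of this piece converges.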
The preceding discussion was, in a sense, just a warm-up exercise, showing that the
proposal of Pauli and Villars does work. Now we would like to evaluate explicitly the regularized
contribution $\Pi^{\rm PV}_{\mu\nu}(q)$ and establish its dependence on the employed regularization parameters.
To this end, it is very helpful to use an auxiliary “intermediate” regularization for the
evaluation of the individual integrals appearing in the whole expression for $\Pi^{\rm PV}_{\mu\nu}$; the reason for
such a “side step” (following the mythical Czech genius Jára Cimrman) is that a direct evaluation
of the complete integral in Eq. (40.1) using the integrand given by Eqs. (40.5), (40.6) would be a
hopeless task. A needed auxiliary tool could be dimensional regularization, but we would rather
stick to our original commitment to stay firmly in four dimensions. Thus, we are going to use
a direct “momentum cut-off” mentioned earlier as one of the possible regularization strategies.
We will need two integrals, namely

$$
\int \frac{d^4 l}{(2\pi)^4}\, \frac{l^2}{(l^2 - f)^2} , \tag{40.16}
$$
and
$$
\int \frac{d^4 l}{(2\pi)^4}\, \frac{1}{(l^2 - f)^2} , \tag{40.17}
$$

with an appropriately chosen upper integration limit. The integrals (40.16) and (40.17) can
be computed easily with the help of the Wick rotation (sketched at the end of Chapter 39).
A relevant cut-off $\Lambda$ can then be defined as the radius of a hypersphere in four-dimensional
Euclidean space, which represents the boundary of the finite integration region. Details of the
calculation are left to the reader as an exercise; here let us only show the relevant results, up to
inessential terms vanishing for $\Lambda \to \infty$. One gets

$$
\int_{(\Lambda)} \frac{d^4 l}{(2\pi)^4}\, \frac{l^2}{(l^2 - f)^2} = -\frac{i}{16\pi^2} \left[ \Lambda^2 - 2f \ln \frac{\Lambda^2}{f} + f + O\!\left(\frac{1}{\Lambda^2}\right) \right] , \tag{40.18}
$$
$$
\int_{(\Lambda)} \frac{d^4 l}{(2\pi)^4}\, \frac{1}{(l^2 - f)^2} = \frac{i}{16\pi^2} \left[ \ln \frac{\Lambda^2}{f} - 1 + O\!\left(\frac{1}{\Lambda^2}\right) \right] . \tag{40.19}
$$
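
The asymptotic forms (40.18) and (40.19) can be sanity-checked numerically. The sketch below rests on a few assumptions made for illustration: after the Wick rotation and the angular integration in four Euclidean dimensions the measure becomes $\frac{1}{16\pi^2}\, dL\, L$ with $L = l_{\rm E}^2$, the radial integrals are evaluated with the substitution $L = f(e^u - 1)$, and the sample values $f = 1$, $\Lambda^2 = 10^4$ and all function names are hypothetical. The factors $\mp i/(16\pi^2)$ are left out, so the code compares only the square brackets.

```python
import math


def simpson(f, a, b, steps=4_000):
    """Composite Simpson rule for a real integrand."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def radial_cutoff_integral(power, f, cutoff2):
    """Integral of L^power / (L + f)^2 over L from 0 to cutoff2, with L = f (e^u - 1)."""
    u_max = math.log(1 + cutoff2 / f)

    def integrand(u):
        L = f * math.expm1(u)
        jacobian = f * math.exp(u)            # dL/du
        return L ** power / (L + f) ** 2 * jacobian

    return simpson(integrand, 0.0, u_max)


f_val, cutoff2 = 1.0, 1.0e4                   # hypothetical sample values

# Bracket of (40.18): the l^2 in the numerator supplies one extra power of L.
numeric_quadratic = radial_cutoff_integral(2, f_val, cutoff2)
asymptotic_quadratic = cutoff2 - 2 * f_val * math.log(cutoff2 / f_val) + f_val

# Bracket of (40.19).
numeric_log = radial_cutoff_integral(1, f_val, cutoff2)
asymptotic_log = math.log(cutoff2 / f_val) - 1

print(numeric_quadratic, asymptotic_quadratic)
print(numeric_log, asymptotic_log)
```

The residual differences are of the order $f^2/\Lambda^2$ and $f/\Lambda^2$, respectively, which is the size of the terms neglected in (40.18) and (40.19).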

Now we can employ these formulae in the expression for $\Pi^{\rm PV}_{\mu\nu}$ in its present form (cf. (40.5),
(40.6)). Instead of writing an unbearably long expression, let us try to find out which
contributions will drop out (in fact, we expect some significant cancellations). First of all, it is
clear that the terms from (40.18) proportional to $\Lambda^2$ will disappear because of $1 - C_1 - C_2 = 0$
(i.e. the quadratic divergence is cancelled, similarly to the case of dimensional regularization).
Analogously, terms $f$ from (40.18) get cancelled owing to the relations (40.14) and the constant
$(-1)$ from (40.19) also drops out due to $1 - C_1 - C_2 = 0$. Thus, it is sufficient to take into account
the logarithmic terms only. One then gets, returning to the original notation for $f$’s:
$$
\begin{aligned}
i\Pi^{\rm PV}_{\mu\nu} &= 4 \int_0^1 dx \left\{ -\frac{1}{2}\, g_{\mu\nu} \left(-\frac{i}{16\pi^2}\right) \left(-2 f(m) \ln \frac{\Lambda^2}{f(m)}\right) \right. \\
&\qquad + \left[ -2x(1 - x) q_\mu q_\nu + x(1 - x) q^2 g_{\mu\nu} + m^2 g_{\mu\nu} \right] \frac{i}{16\pi^2} \ln \frac{\Lambda^2}{f(m)} \\
&\qquad \left. - C_1 \big(m \to M_1\big) - C_2 \big(m \to M_2\big) \right\} \\
&= \frac{i}{4\pi^2} \int_0^1 dx \left\{ \left[ -\left(m^2 - x(1 - x) q^2\right) g_{\mu\nu} - 2x(1 - x) q_\mu q_\nu + x(1 - x) q^2 g_{\mu\nu} + m^2 g_{\mu\nu} \right] \ln \frac{\Lambda^2}{f(m)} \right. \\
&\qquad \left. - C_1 \big(m \to M_1\big) - C_2 \big(m \to M_2\big) \right\} \\
&= \frac{i}{2\pi^2} \int_0^1 dx\, x(1 - x) \left(q^2 g_{\mu\nu} - q_\mu q_\nu\right) \left[ \ln \frac{\Lambda^2}{f(m)} - C_1 \ln \frac{\Lambda^2}{f(M_1)} - C_2 \ln \frac{\Lambda^2}{f(M_2)} \right] .
\end{aligned} \tag{40.20}
$$
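For arriving at a final neat form of the expression $\Pi^{\rm PV}_{\mu\nu}$, it is useful to write
$$
\ln \frac{\Lambda^2}{f(m)} = \ln \frac{\Lambda^2}{m^2} + \ln \frac{m^2}{f(m)} , \qquad
\ln \frac{\Lambda^2}{f(M_1)} = \ln \frac{\Lambda^2}{m^2} + \ln \frac{m^2}{f(M_1)} , \quad \text{etc.}
$$
Then,
$$
\ln \frac{\Lambda^2}{m^2} - C_1 \ln \frac{\Lambda^2}{m^2} - C_2 \ln \frac{\Lambda^2}{m^2} = 0 ,
$$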
because of 1 − 𝐶1 − 𝐶2 = 0. In this way, one may say that the auxiliary cut-off has fulfilled its
role and disappears from the scene; of course, we are left with a non-trivial dependence on the
PV regulator masses 𝑀1 , 𝑀2 . Another reassuring feature of (40.20) is that we have recovered
the transverse tensor structure — since we know a priori that the PV recipe should obey such a
rule, we thus have a consistency check of the explicit calculation.
So, we have
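$$
\begin{aligned}
i\Pi^{\mathrm{PV}}_{\mu\nu}(q) &= \frac{i}{2\pi^2}\,(q^2 g_{\mu\nu} - q_\mu q_\nu) \\
&\quad\times \int_0^1 \mathrm{d}x\; x(1-x)\left[\ln\frac{m^2}{m^2 - x(1-x)q^2} - C_1\ln\frac{m^2}{M_1^2 - x(1-x)q^2} - C_2\ln\frac{m^2}{M_2^2 - x(1-x)q^2}\right].
\end{aligned}
\tag{40.21}
$$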
The regulator masses 𝑀1 , 𝑀2 are supposed to be, in principle, arbitrarily large. It means that
one may write e.g.
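$$
\begin{aligned}
\ln\frac{m^2}{M_1^2 - x(1-x)q^2} &= \ln\frac{m^2}{M_1^2} - \ln\left[1 - x(1-x)\frac{q^2}{M_1^2}\right] \\
&= \ln\frac{m^2}{M_1^2} + O\!\left(\frac{1}{M_1^2}\right).
\end{aligned}
$$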

So, we are allowed to replace the second and third logarithmic terms in (40.21) by ln(𝑚 2 /𝑀12 )
and ln(𝑚 2 /𝑀22 ), respectively, and recast (40.21) as

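$$
\begin{aligned}
i\Pi^{\mathrm{PV}}_{\mu\nu}(q) &= \frac{i}{2\pi^2}\,(q^2 g_{\mu\nu} - q_\mu q_\nu) \\
&\quad\times \int_0^1 \mathrm{d}x\; x(1-x)\left[\ln\frac{m^2}{m^2 - x(1-x)q^2} + C_1\ln\frac{M_1^2}{m^2} + C_2\ln\frac{M_2^2}{m^2}\right] + O\!\left(\frac{1}{M^2}\right),
\end{aligned}
\tag{40.22}
$$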

where 𝑀 is a generic large mass scale. The two large logarithms involving 𝑀1 and 𝑀2 can be
combined into one term, i.e. one can define a new mass parameter 𝑀 according to

$$\ln\frac{M^2}{m^2} = C_1\ln\frac{M_1^2}{m^2} + C_2\ln\frac{M_2^2}{m^2} \tag{40.23}$$
(the reader is recommended to use in the expression (40.23) the earlier choice (40.15) and find
an approximate relation between 𝑀 and 𝑀).
Thus, finally,
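$$\Pi^{\mathrm{PV}}_{\mu\nu}(q) = \Pi^{\mathrm{PV}}(q^2)\,(q^2 g_{\mu\nu} - q_\mu q_\nu)\,,$$
with
$$\Pi^{\mathrm{PV}}(q^2) = \frac{1}{2\pi^2}\int_0^1 \mathrm{d}x\; x(1-x)\left[\ln\frac{M^2}{m^2} - \ln\frac{m^2 - x(1-x)q^2}{m^2}\right],$$
i.e.
$$\Pi^{\mathrm{PV}}(q^2) = \frac{1}{2\pi^2}\left[\frac{1}{6}\ln\frac{M^2}{m^2} - \int_0^1 \mathrm{d}x\; x(1-x)\ln\frac{m^2 - x(1-x)q^2}{m^2}\right]. \tag{40.24}$$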
 
 
Notice that this result is quite similar to our previous expression obtained within the DR scheme:
there is a simple correspondence between the UV divergences in both schemes, namely

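$$\ln\frac{M^2}{m^2} \;\longleftrightarrow\; \frac{1}{\epsilon}\,, \tag{40.25}$$
or, if you want,
$$\ln\frac{M}{m} \;\longleftrightarrow\; \frac{1}{4-n}\,. \tag{40.26}$$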
It was a lot of work, a real tour de force, but the result (40.24) is quite elegant and
in fact gratifying. One may now also appreciate the flexibility and technical simplicity of
the dimensional regularization scheme. Nevertheless, in some situations the DR scheme may
encounter specific difficulties. In particular, the 𝛾5 matrix needs special treatment within the
DR scheme, since, as we know, the identities for traces of products of 𝛾-matrices involving 𝛾5
do not have a simple uniform structure as those without 𝛾5 . Fortunately, in QED we are spared
those subtle problems.
In closing this chapter, an additional general remark is in order. Apart from the examples
we have discussed explicitly up to now, there are in fact infinitely many UV regularization
schemes with the desirable properties. In particular, one may consider a class of regularizations
devised in the 1980s by Oleg I. Zavialov, which are implemented as “continuous superpositions of
the PV cut-offs” (see [44] and references therein). An individual scheme belonging to this class
can be defined by means of an integration over the variable PV mass parameter, involving an
appropriate weight function (whose form is rather general). A specific peculiar choice of such
a weight function enables one to imitate the dimensional regularization (sic!). This amusing
example may thus be dubbed aptly “DR in four dimensions”. Anyway, the straightforward DR
scheme remains the most efficient regularization method in perturbative QFT.

Chapter 41

𝚺( 𝒑) and all that

We are going to examine the other one-loop QED diagrams that we have already encountered in
Chapter 37 (as well as some extra species that are yet to be identified). Let us start with the loop
shown in Fig. 41.1 (we have reproduced here Fig. 37.3 for reader’s convenience). As we have
noted in Chapter 37, it is commonly known as the fermion self-energy diagram — the origin
of such a name will become clear later.

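[Diagram: fermion line with external momentum p on both sides and internal fermion momentum l + p; the loop is denoted iΣ(p).]

Fig. 41.1: Electron self-energy loop.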

The original integral representation (37.17) for Σ( 𝑝) can now be written within the
dimensional regularization scheme as

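$$i\Sigma^{\mathrm{DR}}(p) = \mu^{4-n} e^2 \int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma_\beta\;\frac{-g^{\alpha\beta}}{l^2 - \lambda^2}\,. \tag{41.1}$$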

Note that we have included here also the coupling factor 𝑒 2 (of course, in the same way we might
complete our previous result for the vacuum polarization graph). Another important point is
to be noticed here: In the photon propagator we have included an auxiliary “fictitious” mass
squared 𝜆2 , since we anticipate later appearance of the infrared divergence due to massless
photon; this aspect will be clarified when we arrive at the relevant explicit formulae for Σ( 𝑝). In
subsequent manipulations with the expression (41.1) we will set, temporarily, 𝑒 = 1 and restore
the coupling factor at the very end of the calculation.
From (41.1) one gets first
$$i\Sigma^{\mathrm{DR}}(p) = -\mu^{4-n} \int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \frac{\gamma_\mu\,(\slashed{l} + \slashed{p} + m)\,\gamma^\mu}{[(l+p)^2 - m^2]\,(l^2 - \lambda^2)}\,. \tag{41.2}$$
Introducing Feynman parametrization, shifting appropriately the integration variable $l$ ($l' = l + xp$)
and utilizing symmetric integration, one has

$$i\Sigma^{\mathrm{DR}}(p) = \mu^{4-n} \int_0^1 \mathrm{d}x \int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \frac{1}{(l^2 - C)^2}\,\big[(n-2)(1-x)\,\slashed{p} - nm\big]\,, \tag{41.3}$$

where
$$C = x m^2 + (1-x)\lambda^2 - x(1-x)\,p^2\,. \tag{41.4}$$
The reader is encouraged to verify independently the results (41.3) and (41.4). Note that in
arriving at (41.3), we have used the identities

$$\gamma_\mu (\slashed{l} + \slashed{p})\gamma^\mu = (2-n)(\slashed{l} + \slashed{p})\,, \qquad \gamma_\mu\gamma^\mu = n \tag{41.5}$$

valid in 𝑛 dimensions. One should also mention that the superficial linear divergence apparent
in the original expression (41.1) disappears upon the symmetric integration and one is thus left
with a logarithmic UV divergence only.
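The identities (41.5) can be spot-checked numerically in the special case n = 4, where the factor (2 − n) becomes −2. The sketch below is only such a four-dimensional check (it says nothing about general n); it assumes the Dirac representation of the γ-matrices and the metric signature (+, −, −, −), and all helper names are illustrative.

```python
# Dirac matrices gamma^mu in the Dirac representation; metric (+, -, -, -).
J = 1j
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],   # gamma^0
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],   # gamma^1
    [[0, 0, 0, -J], [0, 0, J, 0], [0, J, 0, 0], [-J, 0, 0, 0]],   # gamma^2
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],   # gamma^3
]
METRIC = [1.0, -1.0, -1.0, -1.0]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def linear_combination(coeffs, mats):
    """Sum of coeffs[n] * mats[n] for 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(4)]
            for i in range(4)]

def slash(v):
    """v-slash = v_mu gamma^mu, given the contravariant components v^mu."""
    return linear_combination([METRIC[mu] * v[mu] for mu in range(4)], GAMMA)

def contract(m):
    """gamma_mu M gamma^mu = sum over mu of g_{mu mu} gamma^mu M gamma^mu."""
    terms = [matmul(matmul(GAMMA[mu], m), GAMMA[mu]) for mu in range(4)]
    return linear_combination(METRIC, terms)

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

n = 4                               # only the four-dimensional case is tested
a = [0.7, -1.3, 0.4, 2.1]           # arbitrary four-vector standing for l + p
a_slash = slash(a)
diff_vector = max_abs_diff(contract(a_slash),
                           [[(2 - n) * x for x in row] for row in a_slash])
diff_unit = max_abs_diff(contract(IDENTITY),
                         [[n * x for x in row] for row in IDENTITY])
print("deviation of gamma_mu a-slash gamma^mu from (2 - n) a-slash:", diff_vector)
print("deviation of gamma_mu gamma^mu from n:", diff_unit)
```

Both deviations should vanish up to floating-point rounding.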
Next, employing the master formula (39.8) and introducing the parameter 𝜖 = 2 − 𝑛/2
instead of 𝑛, one gets

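$$\Sigma(p) = \frac{1}{8\pi^2}\,(4\pi)^\epsilon\,\Gamma(\epsilon) \int_0^1 \mathrm{d}x\;\big[(1-\epsilon)(1-x)\,\slashed{p} - (2-\epsilon)\,m\big]\left(\frac{C}{\mu^2}\right)^{-\epsilon}. \tag{41.6}$$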

The result thus can be expressed in the form

$$\Sigma^{\mathrm{DR}}(p) = X(p^2) + Y(p^2)\,\slashed{p} \tag{41.7}$$

(where, in general, 𝑝 2 ≠ 𝑚 2 ), with

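$$
\begin{aligned}
X(p^2) &= -\frac{1}{8\pi^2}\,m\,(4\pi)^\epsilon\,\Gamma(\epsilon)\,(2-\epsilon)\int_0^1 \mathrm{d}x \left(\frac{C}{\mu^2}\right)^{-\epsilon}, \\
Y(p^2) &= \frac{1}{8\pi^2}\,(4\pi)^\epsilon\,\Gamma(\epsilon)\,(1-\epsilon)\int_0^1 \mathrm{d}x\,(1-x)\left(\frac{C}{\mu^2}\right)^{-\epsilon},
\end{aligned}
$$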

and, using 𝜖 Γ(𝜖) = Γ(1 + 𝜖), this can be conveniently rewritten as

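$$
\begin{aligned}
X(p^2) &= -\frac{1}{4\pi^2}\,m\,(4\pi)^\epsilon\,\Gamma(\epsilon)\int_0^1 \mathrm{d}x \left(\frac{C}{\mu^2}\right)^{-\epsilon}
 + \frac{1}{8\pi^2}\,m\,(4\pi)^\epsilon\,\Gamma(1+\epsilon)\int_0^1 \mathrm{d}x \left(\frac{C}{\mu^2}\right)^{-\epsilon},
\end{aligned}
\tag{41.8}
$$

$$
\begin{aligned}
Y(p^2) &= \frac{1}{8\pi^2}\,(4\pi)^\epsilon\,\Gamma(\epsilon)\int_0^1 \mathrm{d}x\,(1-x)\left(\frac{C}{\mu^2}\right)^{-\epsilon}
 - \frac{1}{8\pi^2}\,(4\pi)^\epsilon\,\Gamma(1+\epsilon)\int_0^1 \mathrm{d}x\,(1-x)\left(\frac{C}{\mu^2}\right)^{-\epsilon}.
\end{aligned}
\tag{41.9}
$$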

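Expanding now the expressions (41.8), (41.9) around $\epsilon = 0$, one obtains

$$X(p^2) = -\frac{1}{4\pi^2}\,m\left[\Delta_{\mathrm{UV}} - \frac{1}{2} - \int_0^1 \mathrm{d}x\,\ln\frac{C}{\mu^2}\right] + O(\epsilon)\,, \tag{41.10}$$

$$Y(p^2) = \frac{1}{16\pi^2}\left[\Delta_{\mathrm{UV}} - 1 - 2\int_0^1 \mathrm{d}x\,(1-x)\ln\frac{C}{\mu^2}\right] + O(\epsilon)\,, \tag{41.11}$$

where
$$\Delta_{\mathrm{UV}} = \frac{1}{\epsilon} - \gamma_E + \ln 4\pi\,.$$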
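The step from (41.8), (41.9) to (41.10), (41.11) rests on a single standard expansion. As a brief sketch, assuming only $\Gamma(\epsilon) = 1/\epsilon - \gamma_E + O(\epsilon)$, $\Gamma(1+\epsilon) = 1 + O(\epsilon)$ and $a^{\epsilon} = 1 + \epsilon\ln a + O(\epsilon^2)$, one may write:

```latex
\begin{align*}
(4\pi)^{\epsilon}\,\Gamma(\epsilon)\left(\frac{C}{\mu^2}\right)^{-\epsilon}
  &= \left(\frac{1}{\epsilon} - \gamma_E + O(\epsilon)\right)
     \left(1 + \epsilon\ln 4\pi - \epsilon\ln\frac{C}{\mu^2} + O(\epsilon^2)\right) \\
  &= \Delta_{\mathrm{UV}} - \ln\frac{C}{\mu^2} + O(\epsilon)\,, \\[4pt]
(4\pi)^{\epsilon}\,\Gamma(1+\epsilon)\left(\frac{C}{\mu^2}\right)^{-\epsilon}
  &= 1 + O(\epsilon)\,. \\[4pt]
\text{Hence, from (41.8):}\quad
X(p^2) &= -\frac{m}{4\pi^2}\left[\Delta_{\mathrm{UV}} - \int_0^1 \mathrm{d}x\,\ln\frac{C}{\mu^2}\right]
          + \frac{m}{8\pi^2} + O(\epsilon) \\
       &= -\frac{m}{4\pi^2}\left[\Delta_{\mathrm{UV}} - \frac{1}{2}
          - \int_0^1 \mathrm{d}x\,\ln\frac{C}{\mu^2}\right] + O(\epsilon)\,, \\[4pt]
\text{and from (41.9), using } \int_0^1 \mathrm{d}x\,(1-x) = \tfrac{1}{2}:\quad
Y(p^2) &= \frac{1}{8\pi^2}\left[\frac{1}{2}\Delta_{\mathrm{UV}}
          - \int_0^1 \mathrm{d}x\,(1-x)\ln\frac{C}{\mu^2} - \frac{1}{2}\right] + O(\epsilon) \\
       &= \frac{1}{16\pi^2}\left[\Delta_{\mathrm{UV}} - 1
          - 2\int_0^1 \mathrm{d}x\,(1-x)\ln\frac{C}{\mu^2}\right] + O(\epsilon)\,.
\end{align*}
```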
Fig. 41.2: A fourth-order contribution to the Compton scattering, involving a one-loop correction to the
electron propagator.

In general, the loop in Fig. 41.1 may emerge as an insertion into the fermion propagator,
for instance in the 4th order diagram for Compton scattering like the one in Fig. 41.2. In other
words, the quantity $\Sigma(p)$ may play the role of a correction to the fermion propagator (we will
discuss this in more detail later on). With this in mind, it is natural to consider an expansion of
$\Sigma(p)$ in powers of $\slashed{p} - m$, writing

$$\Sigma(p) = A + B(\slashed{p} - m) + C(\slashed{p} - m)^2 + \dots\,, \tag{41.12}$$

where 𝐴, 𝐵, . . . are constants (i.e. independent of 𝑝 2 ). The most economical way of obtaining
an expansion like (41.12) is to employ, quite formally, the corresponding Taylor expansion with
the coefficients 𝐴, 𝐵, etc. being determined by derivatives of the function Σ( 𝑝) with respect to
𝑝/ , taken at 𝑝/ = 𝑚 (although we know that there is no 𝑝 such that 𝑝/ = 𝑚!). In doing this, one
should also take into account that 𝑝 2 = 𝑝/ 𝑝/ . Then, using the decomposition (41.7), one gets

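$$A = \Sigma(\slashed{p})\Big|_{\slashed{p}=m} = X(m^2) + m\,Y(m^2)\,, \tag{41.13}$$

$$
\begin{aligned}
B = \frac{\partial\Sigma}{\partial\slashed{p}}\bigg|_{\slashed{p}=m}
 &= X'(p^2 = m^2)\,2\slashed{p}\Big|_{\slashed{p}=m}
  + Y'(p^2 = m^2)\,2\slashed{p}\cdot\slashed{p}\Big|_{\slashed{p}=m}
  + Y(p^2 = m^2)\cdot 1 \\
 &= Y(m^2) + 2m\,X'(m^2) + 2m^2\,Y'(m^2)\,.
\end{aligned}
\tag{41.14}
$$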

A watchful reader, who would consider our calculational trick with 𝑝/ = 𝑚 suspect, may arrive
at the same results by expanding 𝑋 ( 𝑝 2 ) and 𝑌 ( 𝑝 2 ) around 𝑝 2 = 𝑚 2 and using 𝑝 2 = 𝑝/ 𝑝/ wherever
necessary (in particular, 𝑝 2 − 𝑚 2 = ( 𝑝/ − 𝑚)( 𝑝/ + 𝑚), etc.). In any case, it is also important to
notice that the coefficients 𝐴 and 𝐵 contain the UV divergence, while 𝐶 is UV finite (as well as
the higher terms). Indeed, for computing the coefficient 𝐶 one has to take the second derivative
of Σ( 𝑝) with respect to 𝑝/ and the resulting expression then involves only derivatives of 𝑋 and 𝑌 ;
from (41.10), (41.11) it is clear that these are UV finite (since ΔUV is manifestly 𝑝-independent).
The evaluation of 𝐴 and 𝐵 by means of the relations (41.13), (41.14), (41.10) and (41.11) is
straightforward; one gets
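$$
\begin{aligned}
A &= -\frac{1}{4\pi^2}\,m\left[\Delta_{\mathrm{UV}} - \frac{1}{2} - \int_0^1 \mathrm{d}x\,\ln\frac{(1-x)\lambda^2 + x^2 m^2}{\mu^2}\right] \\
  &\quad + \frac{1}{16\pi^2}\,m\left[\Delta_{\mathrm{UV}} - 1 - 2\int_0^1 \mathrm{d}x\,(1-x)\ln\frac{(1-x)\lambda^2 + x^2 m^2}{\mu^2}\right]
\end{aligned}
\tag{41.15}
$$

and

$$
\begin{aligned}
B &= \frac{1}{16\pi^2}\left[\Delta_{\mathrm{UV}} - 1 - 2\int_0^1 \mathrm{d}x\,(1-x)\ln\frac{(1-x)\lambda^2 + x^2 m^2}{\mu^2}\right] \\
  &\quad - \frac{1}{2\pi^2}\,m^2\int_0^1 \mathrm{d}x\; x(1-x)\,\frac{1}{(1-x)\lambda^2 + x^2 m^2} \\
  &\quad + \frac{1}{4\pi^2}\,m^2\int_0^1 \mathrm{d}x\; x(1-x)^2\,\frac{1}{(1-x)\lambda^2 + x^2 m^2}\,.
\end{aligned}
\tag{41.16}
$$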

In (41.15) the infrared (IR) regulator 𝜆 can be removed (𝜆 → 0) and the result is simplified
considerably. Restoring also the coupling factor 𝑒 2 = 4𝜋𝛼, one has finally
$$A = -\frac{3\alpha}{4\pi}\,m\left[\Delta_{\mathrm{UV}} - \ln\frac{m^2}{\mu^2} + \frac{4}{3}\right] \tag{41.17}$$
(the reader is encouraged to verify this independently). In the expression (41.16) for the
coefficient 𝐵 the IR divergence persists (the second and third integrals obviously diverge for
𝜆 = 0), but one may simplify the result by using an appropriate expansion in the vicinity of
𝜆 = 0, and discarding the terms that vanish for 𝜆 → 0. One thus gets, after somewhat tedious
manipulations (including then 𝑒 2 = 4𝜋𝛼 as well)
$$B = \frac{\alpha}{4\pi}\left[\Delta_{\mathrm{UV}} - \ln\frac{m^2}{\mu^2} + 2\ln\frac{\lambda^2}{m^2} + 4\right]. \tag{41.18}$$
A detailed derivation of the result (41.18) is left as a challenge for a hard-working reader.
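Before attempting the analytic derivation, the finite constants in (41.17) and (41.18) can be checked numerically from the integral representations (41.15) and (41.16). The following is a rough sketch under stated assumptions: units with m = μ = 1, the common term Δ_UV dropped, the overall factor 1/(16π²) stripped, and a simple midpoint rule for the Feynman-parameter integrals; the function names and the sample value λ² = 10⁻⁶ are illustrative only.

```python
import math

def midpoint(func, n=200_000):
    """Midpoint rule on (0, 1); the endpoints are never evaluated."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

def finite_part_A(lam_sq=0.0):
    """16 pi^2 A / m from (41.15), with m = mu = 1 and Delta_UV omitted."""
    c = lambda x: (1.0 - x) * lam_sq + x * x
    i1 = midpoint(lambda x: math.log(c(x)))
    i2 = midpoint(lambda x: (1.0 - x) * math.log(c(x)))
    return -4.0 * (-0.5 - i1) + (-1.0 - 2.0 * i2)

def finite_part_B(lam_sq):
    """16 pi^2 B from (41.16), with m = mu = 1 and Delta_UV omitted."""
    c = lambda x: (1.0 - x) * lam_sq + x * x
    i_log = midpoint(lambda x: (1.0 - x) * math.log(c(x)))
    i_2 = midpoint(lambda x: x * (1.0 - x) / c(x))
    i_3 = midpoint(lambda x: x * (1.0 - x) ** 2 / c(x))
    return (-1.0 - 2.0 * i_log) - 8.0 * i_2 + 4.0 * i_3

# (41.17) with m = mu = 1: 16 pi^2 A / m -> -3 * (4/3) = -4 once Delta_UV is dropped.
a_numeric = finite_part_A()
a_expected = -3.0 * (4.0 / 3.0)

# (41.18) with m = mu = 1: 16 pi^2 B -> 2 ln(lambda^2) + 4 once Delta_UV is dropped,
# up to terms that vanish for lambda -> 0.
lam_sq = 1.0e-6
b_numeric = finite_part_B(lam_sq)
b_expected = 2.0 * math.log(lam_sq) + 4.0

print(f"A: numeric = {a_numeric:.5f}, expected = {a_expected:.5f}")
print(f"B: numeric = {b_numeric:.5f}, expected = {b_expected:.5f}")
```

For B the agreement is only approximate at any finite λ, since (41.18) discards terms vanishing as λ → 0; decreasing λ² (and refining the grid accordingly) should reduce the difference.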
Note that such a calculation can be carried out also by using the Pauli–Villars regularization.
The results can be found e.g. in the textbook [6]; it is quite remarkable that the UV
divergence is reproduced in much the same way as in the case of the vacuum polarization graph.
In particular, there is a one-to-one correspondence
$$\Delta_{\mathrm{UV}} - \ln\frac{m^2}{\mu^2} \;\longleftrightarrow\; \ln\frac{M^2}{m^2}\,.$$
On the other hand, the constant terms are different; for 𝐴, instead of 4/3 one has 1/2 in the PV
scheme, and for 𝐵, 4 is replaced by 9/4. There is nothing wrong with such a discrepancy; it
corresponds to a common experience with regularization procedures of different types.

All this was again quite a long and tedious procedure, but our goal was to exhibit a new
feature of the loop calculations in QED, namely the possible appearance of the IR divergence,
and this requires a detailed calculation. Moreover, the expansion of Σ( 𝑝) shown in (41.12) and
the coefficients 𝐴, 𝐵 will play an important role in our future discussion of the renormalization
program in QED.
Let us now proceed to another example of a closed-loop diagram, namely the triangle
shown in Fig. 41.3.

[Diagram: vertex triangle with external fermion momenta p and p′, internal fermion momenta l + p and l + p′, internal photon momentum l, and external photon momentum q = p′ − p.]

Fig. 41.3: One-loop vertex correction in QED.

In this case (usually called, for obvious reasons, the vertex correction),
the UV finite part is quite complicated, and we will not discuss it now; later on, we will
evaluate in detail at least a part of it in connection with the famous QED application —
the so-called Schwinger correction to the electron magnetic moment. Anyway, the diagram in
Fig. 41.3 provides a good opportunity to demonstrate the power and efficiency of the dimensional
regularization method for extracting the UV divergence — it turns out that many calculational
details can be simply ignored and the road to the UV divergent part of the loop contribution is
surprisingly short and straightforward. So, using musical terminology, while the tempo of our
previous calculations was “andante”, now it will be “allegro”, or “allegro moderato”. Let us
start with the DR form of the contribution of Fig. 41.3 (cf. (37.16)). Setting again provisionally
𝑒 = 1, one has, after a trivial manipulation,

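$$i\Gamma^{\mathrm{DR}}_\mu(p', p) = \mu^{4-n}\int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \frac{\gamma_\alpha\,(\slashed{l} + \slashed{p}' + m)\,\gamma_\mu\,(\slashed{l} + \slashed{p} + m)\,\gamma^\alpha}{[(l+p')^2 - m^2]\,[(l+p)^2 - m^2]\,(l^2 - \lambda^2)}\,. \tag{41.19}$$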
The relevant Feynman parametrization now reads (cf. (38.11))

$$\frac{1}{abc} = 2\int_0^1 \mathrm{d}x \int_0^{1-x} \mathrm{d}y\; \frac{1}{[(a-c)x + (b-c)y + c]^3}\,.$$

Using this, and performing an appropriate shift of the integration variable in (41.19) (note that
such a shift is a linear combination of 𝑝 and 𝑝′ involving Feynman parameters 𝑥, 𝑦), one is left
with an expression that has rather complicated structure for UV finite terms, but the part of the
integrand leading to UV divergence is very simple: it is 𝛾𝛼 /𝑙 𝛾 𝜇 /𝑙 𝛾 𝛼 and nothing more. So, one
has
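$$i\Gamma^{\mathrm{DR}}_\mu(p', p) = 2\mu^{2\epsilon}\int_0^1 \mathrm{d}x \int_0^{1-x} \mathrm{d}y \int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \big(\gamma_\alpha\,\slashed{l}\,\gamma_\mu\,\slashed{l}\,\gamma^\alpha + \dots\big)\,\frac{1}{(l^2 - f)^3}\,, \tag{41.20}$$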

where 𝑓 is a function of 𝑝 2 , 𝑝′2 , 𝑝 · 𝑝′, 𝑥, 𝑦.

Now comes a crucial observation: The integration using our master formula (39.8)
produces Γ(2 − 𝑛/2) = Γ(𝜖), and this is multiplied by factors that are finite for 𝑛 = 4, some
of them appearing under the integral over Feynman parameters. Thus, for extracting the UV
divergence, i.e. the pole term 1/𝜖, one can set 𝑛 = 4 in the factors surrounding Γ(𝜖). Employing
in (41.20) the trick of symmetric integration and an identity for 𝛾-matrices (see (41.5)), one thus
has
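$$
\begin{aligned}
i\Gamma^{\mathrm{DR}}_\mu(p', p) &= 2\cdot(-2)^2\cdot\frac{1}{4}\,\gamma_\mu \int_0^1 \mathrm{d}x \int_0^{1-x} \mathrm{d}y \int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \frac{l^2}{(l^2 - f)^3} + \dots \\
&= 2\gamma_\mu \int_0^1 \mathrm{d}x \int_0^{1-x} \mathrm{d}y\; \frac{i}{16\pi^2}\, \frac{\Gamma\!\left(1 + \frac{n}{2}\right)\Gamma\!\left(2 - \frac{n}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)\Gamma(3)}\, f^{\frac{n}{2} - 2} + \dots \\
&= 2\gamma_\mu\,\frac{i}{16\pi^2}\,\Gamma\!\left(2 - \frac{n}{2}\right) \int_0^1 \mathrm{d}x \int_0^{1-x} \mathrm{d}y + \dots \\
&= \frac{i}{16\pi^2}\,\Gamma(\epsilon)\,\gamma_\mu + \dots
\end{aligned}
\tag{41.21}
$$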
In all preceding expressions, the ellipses stand for the UV finite contributions. Notice that a
highly welcome simplification of our procedure is due to the fact that 𝑓 𝑛/2−2 is trivialized for
𝑛 = 4, so there are no complications related to the integration over 𝑥 and 𝑦.
Thus, the calculation carried out here has led to a remarkably simple result for the UV
divergent part of the vertex loop, namely
$$\Gamma^{\mathrm{DR}}_\mu(p', p) = \frac{1}{16\pi^2}\,\frac{1}{\epsilon}\,\gamma_\mu + \dots \tag{41.22}$$
The tempo was perhaps too fast for a beginner, so I recommend that the reader go through the
calculation “da capo al fine” and try to master firmly the DR method, which is so useful for
QFT practitioners. In particular, it is useful for anybody to work out the details that we did not
need for obtaining the result (41.22), e.g. to find the explicit expression for the shift of the loop
momentum 𝑙 in (41.19), as well as the function 𝑓 in (41.20). Good luck!

Chapter 42

More about QED loops

Before proceeding further, let us make more precise our notation conventions for contributions
of QED loops we are dealing with. As it is practical to work sometimes with the relevant
quantities stripped of the corresponding coupling factors, it is also useful to introduce special
symbols for them: we will use tilde to distinguish such “truncated” contributions from those
involving the full coupling factor (i.e. $\tilde\Sigma$ instead of $\Sigma$, with $\Sigma = e^2\tilde\Sigma$, etc.).
Let us now come back to our results for the self-energy and the vertex correction. We
have seen that $\tilde\Sigma$ can be expanded like

$$\tilde\Sigma(p) = \tilde A + \tilde B\,(\slashed{p} - m) + \text{UV finite terms}\,, \tag{42.1}$$

where, among other things,


$$\tilde B = \frac{1}{16\pi^2}\,\frac{1}{\epsilon} + \dots \tag{42.2}$$
(note that the coefficient of the pole term 1/𝜖 is obtained from (41.18) by dividing it by 4𝜋𝛼 = 𝑒 2 ).
For $\tilde\Gamma_\mu(p', p)$ we have
$$\tilde\Gamma_\mu(p', p) = \frac{1}{16\pi^2}\,\frac{1}{\epsilon}\,\gamma_\mu + \dots \tag{42.3}$$

Now, one might think that the coincidence of expressions (42.2) and (42.3) is purely accidental.
But it is not. It turns out that it has a deeper origin — it is a simple consequence of the so-called
Ward identity, as we are going to explain below.
The identity we have in mind reads
$$\frac{\partial}{\partial p^\mu}\,\tilde\Sigma(p) = \tilde\Gamma_\mu(p, p) \tag{42.4}$$
(as we have already noted in Chapter 38, it had been first observed by J. C. Ward in 1950, see
the paper [45]). A proof of the identity (42.4) is not difficult; it is based on a simple formula for
the differentiation of the propagator of Dirac field, namely
$$\frac{\partial}{\partial p^\mu}\,\frac{1}{\slashed{p} - m} = -\frac{1}{\slashed{p} - m}\,\gamma_\mu\,\frac{1}{\slashed{p} - m}\,. \tag{42.5}$$

So, let us first verify the relation (42.5). It is elementary, one has just to keep in mind that the
derivative of a matrix 𝑀 does not, in general, commute with 𝑀. One must therefore start with
the definition of inverse matrix

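$$(\slashed{p} - m)(\slashed{p} - m)^{-1} = 1\,. \tag{42.6}$$

Differentiating (42.6) one gets
$$\gamma_\mu\,(\slashed{p} - m)^{-1} + (\slashed{p} - m)\,\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1} = 0\,,$$
and thus
$$\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1} = -(\slashed{p} - m)^{-1}\,\gamma_\mu\,(\slashed{p} - m)^{-1}\,,$$
which is precisely the identity (42.5).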
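Now one may recall our previous expressions for $\tilde\Sigma(p)$ and $\tilde\Gamma_\mu(p, p)$, namely
$$i\tilde\Sigma^{\mathrm{DR}}(p) = \mu^{4-n}\int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma_\beta\;\frac{-g^{\alpha\beta}}{l^2 - \lambda^2}\,, \tag{42.7}$$

$$i\tilde\Gamma^{\mathrm{DR}}_\mu(p, p) = -\mu^{4-n}\int \frac{\mathrm{d}^n l}{(2\pi)^n}\; \gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma_\mu\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma_\beta\;\frac{-g^{\alpha\beta}}{l^2 - \lambda^2} \tag{42.8}$$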

(just to be sure: let us remind the reader that the minus sign in (42.8) is $i^6$ coming from three
vertices and three propagators, while in (42.7) the corresponding “bookkeeping factor” is $i^4 = 1$).
Taking into account the elementary identity (42.5), the validity of (42.4) is obvious, since the
differentiation of $(\slashed{l} + \slashed{p} - m)^{-1}$ in (42.7) with respect to $p^\mu$ leads precisely to the matrix chain
in (42.8), including the necessary minus sign.
So, how can one utilize the Ward identity (42.4) for elucidating the coincidence of the
pole terms in (42.2) and (42.3)? In fact, it is quite obvious now. Differentiating the expression
(42.1), one gets
$$\frac{\partial}{\partial p^\mu}\,\tilde\Sigma(p) = \tilde B\,\gamma_\mu + \text{UV finite terms}\,, \tag{42.9}$$
and so, according to (42.4), one also has
$$\tilde\Gamma_\mu(p, p) = \tilde B\,\gamma_\mu + \text{UV finite terms}\,. \tag{42.10}$$
Since we know that the UV divergence in $\tilde\Gamma_\mu(p', p)$ does not depend on the external momenta
$p$, $p'$, the validity of the results (42.9) and (42.10) is sufficient for our argument.
Let us remark that our derivation of the Ward identity (42.4) in fact does not depend on
an explicit use of dimensional regularization. It is clear that it would be equally valid within
the conventional Pauli–Villars scheme (or any other that does not deform fermion propagators).
Therefore we may work formally with the expressions for $\tilde\Sigma$ and $\tilde\Gamma_\mu$, in which the regularization
operation is suppressed.
The discussion of the Ward identity can be extended so as to get it in a form that involves
the full vertex function $\tilde\Gamma_\mu(p', p)$ with $p' \neq p$ in general. The resulting relation was derived first
by Yasushi Takahashi in 1957 (see [46]) and thus carries the name Ward–Takahashi identity.
Let us demonstrate how such an identity emerges from the structure of one-loop QED diagrams
we are dealing with.
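After a trivial manipulation, $\tilde\Gamma_\mu(p', p)$ has the form (as we have stressed, the dimension
of the integration space is now irrelevant)
$$i\tilde\Gamma_\mu(p', p) = \int \frac{\mathrm{d}^4 l}{(2\pi)^4}\; \gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p}' - m}\,\gamma_\mu\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma^\alpha\;\frac{1}{l^2 - \lambda^2}\,, \tag{42.11}$$

and for $\tilde\Sigma(p)$ we have
$$i\tilde\Sigma(p) = -\int \frac{\mathrm{d}^4 l}{(2\pi)^4}\; \gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma^\alpha\;\frac{1}{l^2 - \lambda^2}\,. \tag{42.12}$$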

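Now, multiplying the expression (42.11) by $q^\mu$, where $q = p' - p$, one gets

$$i q^\mu\,\tilde\Gamma_\mu(p', p) = \int \frac{\mathrm{d}^4 l}{(2\pi)^4}\; \gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p}' - m}\,(\slashed{p}' - \slashed{p})\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma^\alpha\;\frac{1}{l^2 - \lambda^2}\,. \tag{42.13}$$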

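One may now write
$$\slashed{p}' - \slashed{p} = (\slashed{l} + \slashed{p}' - m) - (\slashed{l} + \slashed{p} - m)\,,$$
and utilizing this in Eq. (42.13), the propagator denominators can be partially cancelled, so that
one gets

$$i q^\mu\,\tilde\Gamma_\mu(p', p) = \int \frac{\mathrm{d}^4 l}{(2\pi)^4}\left[\gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma^\alpha - \gamma_\alpha\,\frac{1}{\slashed{l} + \slashed{p}' - m}\,\gamma^\alpha\right]\frac{1}{l^2 - \lambda^2}\,. \tag{42.14}$$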

Thus, taking into account (42.12), the identity (42.14) can be obviously recast as

$$q^\mu\,\tilde\Gamma_\mu(p', p) = \tilde\Sigma(p') - \tilde\Sigma(p)\,. \tag{42.15}$$

The relation (42.15) is just the famous Ward–Takahashi (WT) identity. It is not difficult to find
out that the original Ward identity (42.4) can be obtained from the WT identity by means of the
Taylor expansion of both sides of Eq. (42.15) in powers of $q$. Indeed, on the left-hand side one
then has
$$q^\mu\,\tilde\Gamma_\mu(p', p) = q^\mu\,\tilde\Gamma_\mu(p, p) + O(q^2)\,, \tag{42.16}$$
while the right-hand side of (42.15) gives
$$\tilde\Sigma(p') - \tilde\Sigma(p) = \frac{\partial}{\partial p^\mu}\,\tilde\Sigma(p)\; q^\mu + O(q^2)\,. \tag{42.17}$$
So, comparing the terms of the first order in $q^\mu$ one gets immediately the relation (42.4).
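The only non-trivial algebra behind (42.13)–(42.15) is the matrix identity S(l + p′)(p̸′ − p̸)S(l + p) = S(l + p) − S(l + p′), where S(k) = (k̸ − m)⁻¹ is the fermion propagator without the factor i. A small numerical sketch of this step follows; it assumes the Dirac representation, the metric (+, −, −, −), the explicit form S(k) = (k̸ + m)/(k² − m²), and arbitrary off-shell sample momenta, and all helper names are illustrative.

```python
# Dirac matrices gamma^mu in the Dirac representation; metric (+, -, -, -).
J = 1j
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],   # gamma^0
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],   # gamma^1
    [[0, 0, 0, -J], [0, 0, J, 0], [0, J, 0, 0], [-J, 0, 0, 0]],   # gamma^2
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],   # gamma^3
]
METRIC = [1.0, -1.0, -1.0, -1.0]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def slash(v):
    """v-slash = v_mu gamma^mu, given the contravariant components v^mu."""
    return [[sum(METRIC[mu] * v[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def minkowski_sq(v):
    return sum(METRIC[mu] * v[mu] * v[mu] for mu in range(4))

def propagator(k, m):
    """S(k) = (k-slash - m)^(-1) = (k-slash + m) / (k^2 - m^2), off shell."""
    denom = minkowski_sq(k) - m * m
    ks = slash(k)
    return [[(ks[i][j] + (m if i == j else 0.0)) / denom for j in range(4)]
            for i in range(4)]

def add(u, v, sign=1.0):
    return [a + sign * b for a, b in zip(u, v)]

# Arbitrary off-shell sample momenta (illustrative values only).
m = 1.0
l = [0.3, 0.1, -0.2, 0.5]
p = [1.2, 0.4, 0.1, -0.3]
p_prime = [0.9, -0.2, 0.6, 0.2]

s_p = propagator(add(l, p), m)
s_pp = propagator(add(l, p_prime), m)
q_slash = slash(add(p_prime, p, sign=-1.0))

lhs = matmul(matmul(s_pp, q_slash), s_p)        # integrand structure of (42.13)
rhs = [[s_p[i][j] - s_pp[i][j] for j in range(4)] for i in range(4)]  # of (42.14)
wt_deviation = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
print("max deviation between the two sides:", wt_deviation)
```

Since the identity holds pointwise in l, it carries over to the integrals once a regularization that does not deform the fermion propagators is understood.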
One more remark is in order here. If the vertex triangular loop is a part of a Feynman graph
with external lines carrying on-shell four-momenta $p$ and $p'$ (and the plane-wave amplitudes
$u(p)$, $\bar u(p')$), the matrix function $\tilde\Gamma_\mu(p', p)$ is then sandwiched between $\bar u(p')$ and $u(p)$; the
multiplication by $q^\mu$ then gives, according to the WT identity,
$$q^\mu\,\bar u(p')\,\tilde\Gamma_\mu(p', p)\,u(p) = \bar u(p')\left[\tilde\Sigma(p') - \tilde\Sigma(p)\right]u(p)\,. \tag{42.18}$$

Next, using the decomposition (42.1) for $\tilde\Sigma(p')$ and $\tilde\Sigma(p)$, one has

$$
\begin{aligned}
\tilde\Sigma(p') - \tilde\Sigma(p) &= \tilde B\,(\slashed{p}' - m) - \tilde B\,(\slashed{p} - m) \\
&\quad + \tilde C\,(\slashed{p}' - m)^2 - \tilde C\,(\slashed{p} - m)^2 + \dots
\end{aligned}
\tag{42.19}
$$

So, the right-hand side of (42.18) obviously vanishes, because of $\bar u(p')(\slashed{p}' - m) = 0$,
$(\slashed{p} - m)\,u(p) = 0$, and one is left with the relation

$$q^\mu\,\bar u(p')\,\tilde\Gamma_\mu(p', p)\,u(p) = 0\,, \tag{42.20}$$

which looks like “current conservation” in momentum representation (in this context, recall the
familiar identity $q^\mu\,\bar u(p')\gamma_\mu u(p) = \bar u(p')\,\slashed{q}\,u(p) = 0$ that we used repeatedly in our previous
calculations). The identity (42.20) will be useful later, in the calculation of the celebrated
Schwinger correction to the spin magnetic moment of electron.
As we have already noted before, the transversality of the vacuum polarization tensor
Π 𝜇𝜈 (𝑞) is an example of a Ward identity, in general sense. So, now we have two examples of
this kind, at the level of one-loop QED diagrams. A general systematic analysis of WT identities
and their connection with gauge invariance of QED would go beyond the scope of the present
text; an interested reader can find the relevant deeper exposition of the subject e.g. in the book
[6] (Chapter 8, section 8.4.1 therein).
Up to now we have discussed in detail three one-loop diagrams that indeed belong to the
standard “QED household”, and we have used them primarily for practicing techniques of the
UV regularization. Now, a legitimate question could be: How about other possible one-loop
diagrams? In particular, can one encounter also UV finite loops? In closing this chapter, let us
mention briefly at least two new specific examples; the remaining more substantial stuff will be
a subject of the next chapter.
In chapters 36 and 37, we have in fact missed the opportunity to notice a curious closed
loop that could, naïvely, also be present within the set of 4th order graphs contributing to our
sample scattering process 𝑒 + 𝜇 → 𝑒 + 𝜇. The example in question may be represented by the
funny picture shown in Fig. 42.1.

Fig. 42.1: A would-be tadpole contribution to the 4th order matrix element for 𝑒–𝜇 scattering.

When saying “naïvely”, one means a mechanical usage of the “pictographic script”
of Feynman diagrams, involving internal lines, external lines and familiar vertices specified
in Chapter 35. In fact, a diagram like that in Fig. 42.1 is excluded automatically owing to
Wick’s theorem T3. Indeed, one should keep in mind that in our definition of the QED interaction
Lagrangian, the current has the form $:\!\bar\psi(x)\gamma^\mu\psi(x)\!:$ and the loop in Fig. 42.1 would involve a
contraction of field operators at the same spacetime point; however, they occur inside the normal
product and such contractions are omitted, according to the cited Wick’s theorem. It is quite
amusing that the contribution of such a loop would, in fact, vanish even without the intervention
of Wick’s theorem (the reader is encouraged to prove this statement independently; hint:
don’t forget that for the evaluation of a closed fermion loop, a matrix trace is involved). The funny
loop in Fig. 42.1 has its special name in the QFT literature: it is called tadpole because of its
similarity with the junior (larval) stage of a frog.[25]
The second example to be mentioned is the box diagram shown in Fig. 42.2. It is easy to
guess that its contribution is UV finite. Indeed, taking into account the asymptotic behaviour of
the propagators, the relevant integrand corresponding to the loop in Fig. 42.2 behaves like

$$l^3\,\mathrm{d}l \cdot l^{-1} \cdot l^{-1} \cdot l^{-2} \cdot l^{-2}$$

for $l \to \infty$, i.e. summarily we have $\mathrm{d}l \cdot l^{-3}$ (so, in a sense, the integral over $l$ is “more convergent
than necessary”). Later on, in our discussion of the renormalization program in QED, we will
appreciate greatly the UV finiteness of the loop in Fig. 42.2, which involves four external fermion
lines. In the next chapter, we will discover more sophisticated examples of UV finite diagrams,
namely the higher purely fermion loops.

Fig. 42.2: A QED box diagram that is manifestly UV finite.

[25] In Czech: pulec, in Slovak: žubrienka.
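The power counting used above (and earlier for the self-energy and vertex loops) can be condensed into a one-line rule. The sketch below assumes four dimensions, fermion propagators falling off like 1/l and photon propagators like 1/l²; the function name and the way the diagrams are encoded are illustrative. The number it returns is only the superficial degree of divergence, so symmetric integration or gauge invariance may lower the actual degree, as happens for Σ(p).

```python
def superficial_degree(n_loops, n_fermion_props, n_photon_props):
    """Superficial UV degree D of a QED loop integral in four dimensions.

    Each loop contributes d^4 l (four powers of l), each internal fermion
    line l^(-1) and each internal photon line l^(-2); the radial integrand
    then behaves like l^(D - 1) dl for large l.
    D > 0: power divergence, D = 0: logarithmic, D < 0: superficially finite.
    """
    return 4 * n_loops - n_fermion_props - 2 * n_photon_props

# (loops, internal fermion lines, internal photon lines) for the one-loop
# diagrams discussed in the text.
one_loop_diagrams = {
    "vacuum polarization": (1, 2, 0),
    "fermion self-energy": (1, 1, 1),
    "vertex correction": (1, 2, 1),
    "box with four external fermion lines": (1, 2, 2),
}

degrees = {name: superficial_degree(*counts)
           for name, counts in one_loop_diagrams.items()}
for name, degree in degrees.items():
    print(f"{name}: D = {degree}")
```

For the box of Fig. 42.2 this gives D = −2, matching the behaviour dl · l⁻³ quoted in the text.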

Chapter 43

Fate of higher fermionic loops

In chapters 38, 39 and 40, we have processed the by now familiar vacuum polarization graph
in a rather detailed manner and in various ways (perhaps almost ad nauseam). Now it is
straightforward to introduce other close relatives of the bubble shown in Fig. 40.1, namely the purely
fermionic loops with more than two vertices: a triangle, a square (“box”), etc. This is what is
meant by “higher fermionic loops” in the title of the present chapter. It is also easy to imagine
how such loops could emerge within Feynman diagrams for physical scattering amplitudes. For
instance, let us consider the two-photon annihilation of an electron–positron pair. In the second
order, there are the familiar tree-level diagrams shown in Fig. 43.1.

Fig. 43.1: Tree-level QED graphs (a), (b) for the process $e^+ e^- \to \gamma\gamma$.

In the 4th order, there are many diagrams; among them, one could consider the one in
Fig. 43.2, where “cross.” means crossing of the photon lines in analogy with Fig. 43.1.

Fig. 43.2: A fourth-order QED graph (+ cross.) contributing to the process $e^+ e^- \to \gamma\gamma$.

Further, within the 4th order of QED one could contemplate possible diagrams for a
process involving four real photons. Obviously, there is only one type (topology) of such a
diagram, namely the box shown in Fig. 43.3. A detailed analysis of the 4th order 𝑆-matrix
contribution shows that one must include all six permutations of three external photon lines,
while keeping the fourth one fixed. Similarly, one could consider the process 𝛾𝛾 → 𝛾𝛾𝛾, which
would proceed via a pentagon loop, in the 5th order in QED. So, there is a hierarchy of fermionic
loops shown in Fig. 43.4, and our goal is to explore properties of the other members of the family
appearing there, beyond the familiar vacuum polarization bubble.

Fig. 43.3: The fourth-order QED graph describing the elastic photon–photon scattering.

Fig. 43.4: Examples of purely fermionic QED loops with external photon lines (panels (a) to (d)).

Fig. 43.5: Pure fermionic triangles in QED. Both panels have vertices ρ, μ, ν and external photon momenta k and p; the internal fermion momenta are l − k, l, l + p in (a) and l − p, l, l + k in (b).
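Let us start with the triangle. Considering the 4th order annihilation diagram in Fig. 43.2,
it is not difficult to realize that the part of its contribution involving the loop in question amounts
to $T_{\rho\mu\nu}(k, p)\,\epsilon^{*\mu}(k)\,\epsilon^{*\nu}(p)$, where $k$, $p$ denote photon four-momenta and $T_{\rho\mu\nu}(k, p)$ is, due to
the crossing, given by the sum of two graphs, namely those shown in Fig. 43.5. It means that
$$T_{\rho\mu\nu}(k, p) = \Gamma_{\rho\mu\nu}(k, p) + \Gamma_{\rho\nu\mu}(p, k)\,, \tag{43.1}$$
where, symbolically (i.e. without invoking an explicit regularization),
$$\Gamma_{\rho\mu\nu}(k, p) = \int \mathrm{d}^4 l\; \mathrm{Tr}\left[\frac{1}{\slashed{l} - \slashed{k} - m}\,\gamma_\mu\,\frac{1}{\slashed{l} - m}\,\gamma_\nu\,\frac{1}{\slashed{l} + \slashed{p} - m}\,\gamma_\rho\right], \tag{43.2}$$
so that
$$\Gamma_{\rho\nu\mu}(p, k) = \int \mathrm{d}^4 l\; \mathrm{Tr}\left[\frac{1}{\slashed{l} - \slashed{p} - m}\,\gamma_\nu\,\frac{1}{\slashed{l} - m}\,\gamma_\mu\,\frac{1}{\slashed{l} + \slashed{k} - m}\,\gamma_\rho\right]. \tag{43.3}$$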
Note that the crossing operation (43.1) is usually called “Bose symmetrization”. Let us also
remark that throughout this chapter we set $e = 1$, since the coupling factors are irrelevant for our
present purpose. In the same way, we suppress the overall factor $(-1)$ that otherwise belongs to
any closed fermionic loop. For the sake of brevity, we also write simply $\mathrm{d}^4 l$ instead of the usual
$\mathrm{d}^4 l/(2\pi)^4$. Concerning the convergence properties of the considered loops, they are obviously
linearly divergent, since the integrand behaves like $l^3\,\mathrm{d}l\,(l^{-1})^3 = \mathrm{d}l$.

Fig. 43.6: Topologically inequivalent triangle loops with mutually reversed orientation. Both panels have vertices ρ, μ, ν and external photon momenta k (at μ) and p (at ν); the internal fermion momenta are l − k, l, l + p in (a) and l + k, l, l − p in (b).
It is easy to see that the expression Γ𝜌𝜈𝜇 ( 𝑝, 𝑘) can equivalently be obtained by reversing
the direction of the loop momenta flowing through the first diagram in Fig. 43.5. In other words,
instead of the pair from Fig. 43.5 we can use the one shown in Fig. 43.6 (note that, conventionally,
we read the contribution of the loops by running against the arrows in fermion internal lines).
Let us now compare the contributions of loops (a) and (b) in Fig. 43.6. We have

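$$\Gamma_{\rho\mu\nu}(k, p) = \int \mathrm{d}^4 l\; \frac{\mathrm{Tr}\big[(\slashed{l} - \slashed{k})\gamma_\mu\,\slashed{l}\,\gamma_\nu(\slashed{l} + \slashed{p})\gamma_\rho\big] + \dots}{[(l-k)^2 - m^2]\,(l^2 - m^2)\,[(l+p)^2 - m^2]}\,, \tag{43.4}$$

$$\Gamma_{\rho\nu\mu}(p, k) = \int \mathrm{d}^4 l\; \frac{\mathrm{Tr}\big[(\slashed{l} - \slashed{p})\gamma_\nu\,\slashed{l}\,\gamma_\mu(\slashed{l} + \slashed{k})\gamma_\rho\big] + \dots}{[(l+k)^2 - m^2]\,(l^2 - m^2)\,[(l-p)^2 - m^2]}\,. \tag{43.5}$$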

In the last two expressions, the ellipsis represents terms proportional to 𝑚 2 , which involve just
four 𝛾-matrices inside the trace; for the moment, we are going to deal with the leading mass-
independent terms only. The structure of the expression (43.5) suggests that the substitution
𝑙 → −𝑙 might be useful. So, doing this and using trace cyclicity, one gets

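$$\Gamma_{\rho\nu\mu}(p, k) = \int \mathrm{d}^4 l\; \frac{(-1)^3\,\mathrm{Tr}\big[\gamma_\rho(\slashed{l} + \slashed{p})\gamma_\nu\,\slashed{l}\,\gamma_\mu(\slashed{l} - \slashed{k})\big] + \dots}{[(l-k)^2 - m^2]\,(l^2 - m^2)\,[(l+p)^2 - m^2]}\,. \tag{43.6}$$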
To proceed further, let us recall a remarkable “palindromic” identity for the trace of a product
of 𝛾-matrices, namely
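$$\mathrm{Tr}(\gamma_\alpha\gamma_\beta \dots \gamma_\tau\gamma_\omega) = \mathrm{Tr}(\gamma_\omega\gamma_\tau \dots \gamma_\beta\gamma_\alpha)\,, \tag{43.7}$$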
i.e. the trace is not changed when the order of 𝛾-matrices is reversed (see (C.10) in Appendix C,
or Chapter 5; its proof follows (5.44)).
So, with the identity (43.7) at hand, the expression (43.6) can be recast as

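$$\Gamma_{\rho\nu\mu}(p, k) = -\int \mathrm{d}^4 l\; \frac{\mathrm{Tr}\big[(\slashed{l} - \slashed{k})\gamma_\mu\,\slashed{l}\,\gamma_\nu(\slashed{l} + \slashed{p})\gamma_\rho\big]}{[(l-k)^2 - m^2]\,(l^2 - m^2)\,[(l+p)^2 - m^2]} + \dots\,, \tag{43.8}$$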
which makes it clear that the leading term in Γ𝜌𝜈𝜇 ( 𝑝, 𝑘) is exactly opposite to its counterpart in
Γ𝜌𝜇𝜈 (𝑘, 𝑝). In other words, the leading terms (involving six 𝛾-matrices) in the sum (43.1) are
mutually cancelled.

In fact, one can proceed in much the same way in the case of terms that are proportional
to $m^2$ and find out that they are cancelled in the sum (43.1) as well (the reader is encouraged to
verify independently such a statement).
It is important to notice that our analysis admits a generalization for any closed fermionic
loop with an odd number of vertices. Indeed, there have been two main ingredients in our
calculation: first, the substitution $l \to -l$ that brings about the factor $(-1)^n$, with $n$ being
the number of the internal lines in the loop (and this of course coincides with the number of
vertices). Second, the 𝛾-matrix identity (43.7). In this way, one can prove that the contribution
of a fermionic loop with an odd number of vertices is exactly cancelled by its counterpart with
the reverse orientation of the loop momenta. There are some further generalizations of such
a statement, but we will not need them now. In QED, the exact cancellation of odd fermionic
loops that we have just observed is called Furry’s theorem, in honour of Wendell Furry
(1907–1984) who discovered it in 1937.
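Both ingredients of the argument can be tested numerically for the leading (six-γ) term of the triangle: the “palindromic” identity (43.7) and the resulting cancellation between (43.4) and (43.6) at the level of the trace numerators. The sketch below assumes the Dirac representation, the metric (+, −, −, −), upper-index matrices γ^μ (which differ from γ_μ only by signs) and arbitrary sample momenta; helper names are illustrative.

```python
# Dirac matrices gamma^mu in the Dirac representation; metric (+, -, -, -).
J = 1j
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],   # gamma^0
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],   # gamma^1
    [[0, 0, 0, -J], [0, 0, J, 0], [0, J, 0, 0], [-J, 0, 0, 0]],   # gamma^2
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],   # gamma^3
]
METRIC = [1.0, -1.0, -1.0, -1.0]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def slash(v):
    """v-slash = v_mu gamma^mu, given the contravariant components v^mu."""
    return [[sum(METRIC[mu] * v[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def trace_of_product(mats):
    """Trace of the ordered product of a list of 4x4 matrices."""
    prod = mats[0]
    for mat in mats[1:]:
        prod = matmul(prod, mat)
    return sum(prod[i][i] for i in range(4))

# Arbitrary sample momenta (illustrative values only).
l = [0.3, 0.1, -0.2, 0.5]
k = [1.1, 0.2, 0.7, -0.4]
p = [0.8, -0.5, 0.3, 0.6]
l_minus_k = slash([a - b for a, b in zip(l, k)])
l_slash = slash(l)
l_plus_p = slash([a + b for a, b in zip(l, p)])

worst_palindrome = 0.0    # violation of the reversal identity (43.7)
worst_furry = 0.0         # failure of the cancellation in the sum (43.1)
for mu in range(4):
    for nu in range(4):
        for rho in range(4):
            # Numerator of (43.4).
            forward = trace_of_product(
                [l_minus_k, GAMMA[mu], l_slash, GAMMA[nu], l_plus_p, GAMMA[rho]])
            # Trace appearing in (43.6): same matrices in reversed order.
            backward = trace_of_product(
                [GAMMA[rho], l_plus_p, GAMMA[nu], l_slash, GAMMA[mu], l_minus_k])
            worst_palindrome = max(worst_palindrome, abs(forward - backward))
            worst_furry = max(worst_furry, abs(forward + (-1) ** 3 * backward))

print("max violation of (43.7):", worst_palindrome)
print("max leading-term residue in (43.1):", worst_furry)
```

Both numbers should be zero up to rounding, for every choice of the indices μ, ν, ρ.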
To conclude this part of the story, the main message of Furry’s theorem is that the
odd members of the family in Fig. 43.4 can be discarded completely. Among other things, the
elimination of the triangular loops that, at first sight, could generate linear UV divergences, plays
an important role in the implementation of the renormalization program in QED.
Let us now proceed to the box diagram sketched in Fig. 43.3. As we have already noted,
for its full contribution one has to take into account six permutations of three external momenta,
with a fourth one fixed. Let us display the box diagram once again, now equipped with all labels
necessary for the explicit calculation. So, we will select as a “basic configuration” the picture
shown in Fig. 43.7. Conventionally, we have taken all external momenta as outgoing, so that
𝑘 1 + 𝑘 2 + 𝑘 3 + 𝑘 4 = 0.

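[Figure: fermionic box loop. The vertices 𝜇, 𝜈, 𝜌, 𝜎 are attached to the external photon momenta 𝑘1, 𝑘2, 𝑘3, 𝑘4, respectively; the internal lines carry 𝑙, 𝑙 + 𝑘1, 𝑙 + 𝑘1 + 𝑘2 and 𝑙 + 𝑘1 + 𝑘2 + 𝑘3.]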

Fig. 43.7: A detailed labelling of the four-momenta appearing in the fermionic box diagram.

Our primary interest is a possible UV divergence that might originate in these box
diagrams. Obviously, the diagram in Fig. 43.7 is a priori logarithmically divergent, since the
relevant integrand behaves like 𝑙³ d𝑙 (𝑙⁻¹)⁴ = d𝑙/𝑙 for 𝑙 → ∞. So, our approach will consist in
computing the UV divergence descending from the diagram in Fig. 43.7 for the given setting of
vertex indices, and then performing the “Bose symmetrization” with respect to the indices 𝜈, 𝜌, 𝜎.
We are going to consider the fermionic loop itself, i.e. stripped of the external photon
lines (we also suppress all irrelevant overall factors, as in the preceding example of the
triangle loop). Thus, we start with

$$
i\Gamma_{\mu\nu\rho\sigma}(k_1,k_2,k_3,k_4) = \int \frac{d^n l}{(2\pi)^n}\,
\frac{\mathrm{Tr}\big[(\slashed{l}+m)\gamma_\mu(\slashed{l}+\slashed{p}_1+m)\gamma_\nu(\slashed{l}+\slashed{p}_2+m)\gamma_\rho(\slashed{l}+\slashed{p}_3+m)\gamma_\sigma\big]}
{(l^2-m^2)\,[(l+p_1)^2-m^2]\,[(l+p_2)^2-m^2]\,[(l+p_3)^2-m^2]} ,
\tag{43.9}
$$
where 𝑝 1 = 𝑘 1 , 𝑝 2 = 𝑘 1 + 𝑘 2 , 𝑝 3 = 𝑘 1 + 𝑘 2 + 𝑘 3 . For the evaluation of the anticipated UV
divergence we choose the tempo “allegro moderato”, in analogy with our calculation of the
vertex function in Chapter 41. It means that we are going to preserve only those parts of the full

contribution, which are absolutely necessary for extracting the pole factor 1/𝜖 within the DR
scheme. In particular, we will set 𝑛 = 4 in factors multiplying 1/𝜖 whenever possible.
To begin with, we introduce Feynman parametrization and carry out a pertinent shift of
the loop momentum 𝑙; as a result of performing these routine steps one gets first (suppressing
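also the factor 1/(2𝜋)^𝑛 in the integrand)
$$
i\Gamma_{\mu\nu\rho\sigma} = \int d^n l \int dX\;
\mathrm{Tr}\big(\slashed{l}\gamma_\mu \slashed{l}\gamma_\nu \slashed{l}\gamma_\rho \slashed{l}\gamma_\sigma + \ldots\big)\,
\frac{1}{(l^2-f)^4} ,
\tag{43.10}
$$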
where we have used the abbreviation ∫d𝑋 for the full integration over Feynman parameters, namely
$$
\int dX = 3! \int_0^1 dx \int_0^{1-x} dy \int_0^{1-x-y} dz ,
$$
and the ellipsis denotes, as usual, the irrelevant terms. The quantity 𝑓 in the denominator is a
function of the mass 𝑚, external momenta and Feynman parameters, but we will not need its
explicit value (however, hard-working readers are encouraged to strain their muscles to evaluate
𝑓, at least for on-shell photon momenta 𝑘_𝑖² = 0, 𝑖 = 1, 2, 3, 4).
The next step is the symmetric integration; one has to find out what becomes of
$$
I^{\alpha\beta\iota\kappa} = \int d^n l\; l^\alpha l^\beta l^\iota l^\kappa\, \frac{1}{(l^2-f)^4} .
\tag{43.11}
$$
In fact, it is not difficult to get the answer. The integral 𝐼 𝛼𝛽𝜄𝜅 is a completely symmetric tensor
that can be expressed solely in terms of components of the metric tensor (there is not any other
vector or tensor at play; of course, 𝑓 is a scalar). So, denoting for brevity the denominator in the
expression (43.11) as 𝐷 (𝑙 2 ), we may write first

𝐼 𝛼𝛽𝜄𝜅 = 𝐹 (𝑔 𝛼𝛽 𝑔 𝜄𝜅 + 𝑔 𝛼𝜄 𝑔 𝛽𝜅 + 𝑔 𝛼𝜅 𝑔 𝛽𝜄 ) , (43.12)

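and contracting the indices 𝛼, 𝛽, one gets
$$
\int d^n l\, \frac{l^2\, l^\iota l^\kappa}{D(l^2)} = F\,(n g^{\iota\kappa} + g^{\iota\kappa} + g^{\iota\kappa}) = (n+2)\,F g^{\iota\kappa} .
\tag{43.13}
$$
Contracting now 𝜄 and 𝜅 in (43.13), one has
$$
\int d^n l\, \frac{(l^2)^2}{D(l^2)} = n(n+2)\,F ,
$$
so that
$$
F = \frac{1}{n(n+2)} \int d^n l\, \frac{(l^2)^2}{D(l^2)} .
\tag{43.14}
$$
Thus, we have finally
$$
\int d^n l\, \frac{l^\alpha l^\beta l^\iota l^\kappa}{D(l^2)}
= \frac{1}{n(n+2)} \int d^n l\, \frac{(l^2)^2}{D(l^2)}\,
(g^{\alpha\beta} g^{\iota\kappa} + g^{\alpha\iota} g^{\beta\kappa} + g^{\alpha\kappa} g^{\beta\iota}) .
$$
In other words, a rule for the symmetric integration in the tensor expression (43.11) reads
$$
l^\alpha l^\beta l^\iota l^\kappa \;\longrightarrow\; \frac{1}{n(n+2)}\,(l^2)^2\,
(g^{\alpha\beta} g^{\iota\kappa} + g^{\alpha\iota} g^{\beta\kappa} + g^{\alpha\kappa} g^{\beta\iota}) .
\tag{43.15}
$$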
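The normalization factor 1/𝑛(𝑛 + 2) in the rule (43.15) rests on the statement that the full contraction of the symmetric combination of metric tensors equals 𝑛(𝑛 + 2). As a quick cross-check, the following minimal Python sketch performs this contraction by brute force. It assumes a diagonal metric diag(+1, −1, ..., −1) with integer 𝑛; the helper names `metric` and `full_contraction` are purely illustrative and do not appear in the notes.

```python
from itertools import product


def metric(n):
    """Diagonal entries of the metric diag(+1, -1, ..., -1) in n dimensions."""
    return [1] + [-1] * (n - 1)


def full_contraction(n):
    """Contract (g^ab g^ik + g^ai g^bk + g^ak g^bi) with g_ab g_ik."""
    g = metric(n)  # for this diagonal metric the inverse has the same entries

    def G(a, b):
        return g[a] if a == b else 0

    total = 0
    for a, b, i, k in product(range(n), repeat=4):
        symmetric = G(a, b) * G(i, k) + G(a, i) * G(b, k) + G(a, k) * G(b, i)
        total += symmetric * G(a, b) * G(i, k)
    return total


# The contraction should reproduce n(n + 2) for every integer dimension tried.
for n in range(2, 7):
    assert full_contraction(n) == n * (n + 2)

print(full_contraction(4))  # 24
```

For 𝑛 = 4 the result is 24, which is the factor used when the rule is applied with 𝑛 set to four.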

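Of course, in our accelerated process of evaluating the UV divergence one can replace the factor
1/𝑛(𝑛 + 2) with 1/24. Using the rule (43.15) in (43.10) we have

$$
\begin{aligned}
i\Gamma_{\mu\nu\rho\sigma}
&= \frac{1}{24}\int dX \int d^n l\, \frac{(l^2)^2}{(l^2-f)^4}\,
\mathrm{Tr}(\gamma_\alpha\gamma_\mu\gamma_\beta\gamma_\nu\gamma_\iota\gamma_\rho\gamma_\kappa\gamma_\sigma)\,
(g^{\alpha\beta} g^{\iota\kappa} + g^{\alpha\iota} g^{\beta\kappa} + g^{\alpha\kappa} g^{\beta\iota}) + \ldots \\
&= \frac{1}{24}\int dX \int d^n l\, \frac{(l^2)^2}{(l^2-f)^4} \\
&\quad\times \Big[\mathrm{Tr}(\gamma_\alpha\gamma_\mu\gamma^\alpha\gamma_\nu\gamma_\iota\gamma_\rho\gamma^\iota\gamma_\sigma)
+ \mathrm{Tr}(\gamma_\alpha\gamma_\mu\gamma_\beta\gamma_\nu\gamma^\alpha\gamma_\rho\gamma^\beta\gamma_\sigma)
+ \mathrm{Tr}(\gamma_\alpha\gamma_\mu\gamma_\beta\gamma_\nu\gamma^\beta\gamma_\rho\gamma^\alpha\gamma_\sigma)\Big] + \ldots
\end{aligned}
\tag{43.16}
$$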

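One may now use the identities for chains of 𝛾-matrices, namely

$$
\begin{aligned}
\gamma_\alpha\gamma_\mu\gamma^\alpha &= (2-n)\,\gamma_\mu , \\
\gamma_\alpha\gamma_\mu\gamma_\nu\gamma^\alpha &= 4 g_{\mu\nu} + (n-4)\,\gamma_\mu\gamma_\nu , \\
\gamma_\alpha\gamma_\mu\gamma_\nu\gamma_\rho\gamma^\alpha &= -2\gamma_\rho\gamma_\nu\gamma_\mu + (4-n)\,\gamma_\mu\gamma_\nu\gamma_\rho ,
\end{aligned}
$$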

in their simplified form for 𝑛 = 4; note that in the last trace in (43.16), also the palindromic
identity (43.7) is eventually utilized. The integration over 𝑙 with the help of the master formula
(39.8) obviously generates Γ(𝜖), and 𝑓 appears only in the factor 𝑓^(𝑛/2−2). Thus, ignoring all
inessential factors, one may write down the UV divergence in question as
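$$
i\Gamma_{\mu\nu\rho\sigma}\Big|_{\rm UVdiv} = \mathrm{const.}\,\frac{1}{\epsilon}\,
\Big[(-2)^2\,\mathrm{Tr}(\gamma_\mu\gamma_\nu\gamma_\rho\gamma_\sigma)
+ (-2)\,\mathrm{Tr}(\gamma_\nu\gamma_\beta\gamma_\mu\gamma_\rho\gamma^\beta\gamma_\sigma)
+ (-2)^2\,\mathrm{Tr}(\gamma_\mu\gamma_\nu\gamma_\rho\gamma_\sigma)\Big] ,
$$
so that finally
$$
\Gamma_{\mu\nu\rho\sigma}\Big|_{\rm UVdiv} = \mathrm{const.}\,\frac{1}{\epsilon}\,
(g_{\mu\nu} g_{\rho\sigma} + g_{\mu\sigma} g_{\nu\rho} - 2 g_{\mu\rho} g_{\nu\sigma})
\tag{43.17}
$$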
(needless to say, “const.” here is different from the preceding one). The aforementioned
Bose symmetrization means adding to the expression (43.17) the other five terms involving
permutations of the indices 𝜈, 𝜌, 𝜎. For better transparency, let us denote 𝜇 ≡ 1, 𝜈 ≡ 2,
𝜌 ≡ 3, 𝜎 ≡ 4. The permutations of the expression in parentheses in (43.17) are collected in
Table 43.1 and we have shown there the cancellations among the terms in the sum of UV divergent
contributions of the form (43.17). Thus, it turns out that the box diagram is effectively UV finite.

permutation    expression in parentheses in (43.17)

234            𝑔12 𝑔34 + 𝑔14 𝑔23 − 2𝑔13 𝑔24
342            𝑔13 𝑔42 + 𝑔12 𝑔34 − 2𝑔14 𝑔32
423            𝑔14 𝑔23 + 𝑔13 𝑔42 − 2𝑔12 𝑔43
324            𝑔13 𝑔24 + 𝑔14 𝑔32 − 2𝑔12 𝑔34
243            𝑔12 𝑔43 + 𝑔13 𝑔24 − 2𝑔14 𝑔23
432            𝑔14 𝑔32 + 𝑔12 𝑔43 − 2𝑔13 𝑔42

Table 43.1: Cancellations among UV divergent terms in permutations of (43.17). Each of the three products 𝑔12 𝑔34, 𝑔13 𝑔24 and 𝑔14 𝑔23 appears four times with coefficient +1 and twice with coefficient −2, so the sum of the six rows vanishes.

This is again a favourable circumstance from the point of view of the QED renormalization
program, as we will see later. Moreover, it means that e.g. the process of photon–photon elastic
scattering is calculable in QED. More about this later.
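The cancellation collected in Table 43.1 can also be checked mechanically. The sketch below is an illustrative bookkeeping exercise, not part of the original calculation: it labels each product of two metric tensors by the way it pairs the indices 1, 2, 3, 4, sums the bracket of (43.17) over the six permutations of (𝜈, 𝜌, 𝜎) = (2, 3, 4), and confirms that every coefficient vanishes. The function names `pair`, `bracket` and `symmetrized_sum` are hypothetical helpers introduced here.

```python
from collections import Counter
from itertools import permutations


def pair(a, b, c, d):
    """Canonical label for the product g_ab g_cd (each factor is symmetric)."""
    return frozenset({frozenset({a, b}), frozenset({c, d})})


def bracket(mu, nu, rho, sigma):
    """Terms of g_mu,nu g_rho,sigma + g_mu,sigma g_nu,rho - 2 g_mu,rho g_nu,sigma."""
    return {
        pair(mu, nu, rho, sigma): 1,
        pair(mu, sigma, nu, rho): 1,
        pair(mu, rho, nu, sigma): -2,
    }


def symmetrized_sum():
    """Sum the bracket of (43.17) over all permutations of the indices 2, 3, 4."""
    total = Counter()
    for nu, rho, sigma in permutations((2, 3, 4)):
        for term, coeff in bracket(1, nu, rho, sigma).items():
            total[term] += coeff
    return dict(total)


result = symmetrized_sum()
assert len(result) == 3                       # only three distinct pairings exist
assert all(c == 0 for c in result.values())   # each one cancels completely
```

Only three distinct index pairings occur, and each receives the total coefficient 1 + 1 + 1 + 1 − 2 − 2 = 0.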

Chapter 44

Index of UV divergence of 1PI diagram

Up to now, we have discussed one-loop diagrams in QED. Our conclusions can be summarized
as follows. There are three types of UV divergent loops, namely those shown in Fig. 44.1, while

(a) (b) (c)

Fig. 44.1: “Canonical” UV divergent one-loop diagrams in QED.

some other potentially suspicious species eventually turn out to be UV finite or even trivial.
In particular, this is the case for pure fermionic loops: the triangle drops out completely (due to
Furry’s theorem) and for the box (square) diagrams the UV divergent parts cancel (while
the finite part survives and gives a calculable contribution to the intriguing process of photon–
photon scattering). Needless to say, even higher fermionic loops (hexagon, etc.) are obviously
UV convergent. This knowledge is of lasting value and we will utilize it amply later on.
However, one might tackle a more ambitious goal: to explore the convergence properties
of higher-order diagrams, in principle multiloop ones, as well. An analysis of such a problem is
the main topic of this chapter; note that we follow here closely the exposition of the book [6].
To begin with, we must introduce a basic definition: A one-particle-irreducible (1PI)
Feynman diagram is such that it does not fall into two disjoint pieces by cutting one internal
line. For an illustration: the diagram shown in Fig. 44.2 is obviously 1PI, while the one depicted
in Fig. 44.3 is one-particle-reducible (to see this, please cut the middle wavy line).


Fig. 44.2: 1PI two-loop vacuum polarization graph in QED.


Fig. 44.3: An example of a one-particle-reducible graph.

Let us now consider the contribution of a general 1PI graph 𝐺. It is represented by an integral
$$
\Gamma = \int d^4 l_1 \ldots d^4 l_L\; J(l_1, \ldots, l_L;\, p_{\rm ext}, m) ,
\tag{44.1}
$$

where the integration variables are the independent loop momenta, 𝑝 ext denotes collectively the
external momenta and 𝑚 stands for masses in propagators. The index 𝐿 is, by definition, the
number of closed loops. A terminological remark is perhaps in order here: Looking at the
diagram in Fig. 44.2, one might be tempted to say that it involves three closed loops (and in a
purely geometrical sense it is true). However, we classify it as a two-loop graph, since there are
only two independent integration variables (loop momenta) for its contribution: one may choose
them e.g. as belonging to the internal photon line and one of the internal fermion lines — the
other loop momenta are then already fixed in terms of these two and the external momentum
𝑞. It should also be clear that if we work with 1PI diagrams, integration variables are present
in any internal line (an obvious counterexample is provided by the picture shown in Fig. 44.3).
Precisely this is the crucial feature of 1PI diagrams, as regards the analysis we are going to carry
out.
Our previous experience with convergence properties of one-loop graphs can be extended
and reformulated in the following way: In the UV asymptotic region of the integration variables
𝑙1 , . . . , 𝑙 𝐿 the integrand in (44.1) is a homogeneous function, since the dependence on external
momenta and masses can be neglected there. Including also the integration volume elements
d4 𝑙1 . . . d4 𝑙 𝐿 , one may employ the degree of homogeneity of the whole expression as an indicator
of UV convergence properties of the integral in Eq. (44.1). We will denote such a number as
𝜔(𝐺) and call it the index of divergence of the graph 𝐺 (note that in many textbooks it is also
called “superficial degree of divergence”). Our previous results for one-loop diagrams suggest
that 𝜔(𝐺) < 0 would correspond to a convergent integral, and 𝜔(𝐺) ≥ 0 to a divergent one; for
𝜔(𝐺) = 0 the UV divergence is logarithmic, for 𝜔(𝐺) = 1 it is linear, etc. (let us remind the
reader that for a one-loop logarithmically divergent graph the basic structure of the integrand for
𝑙 → ∞ is d𝑙/𝑙, etc.).
Now, let us consider a general QFT model involving bosons and fermions with polynomial
interaction terms (including, in general, also field derivatives). For simplicity, we are going
to assume first that all boson propagators behave like 1/𝑙 2 for 𝑙 → ∞; a straightforward
generalization will follow later on. So, what is the value of 𝜔(𝐺)? Just to be sure, let us recall
that the degree of homogeneity is obtained from rescaling the relevant variables by a factor, say,
𝜆 and finding the corresponding power of 𝜆. In the case of the integrand in (44.1) one takes
into account that each boson propagator contributes 𝜆⁻², each fermion (Dirac) propagator gives
𝜆⁻¹, any d⁴𝑙_𝑖, 𝑖 = 1, . . . , 𝐿, will produce 𝜆⁴, and one must also consider positive powers of loop
momenta generated possibly by derivatives acting in some interaction vertices. With all this in
mind, we can write down the value of 𝜔(𝐺) as
$$
\omega(G) = 4L - 2I_{\rm B} - I_{\rm F} + \sum_v \delta_v ,
\tag{44.2}
$$

where 𝐼B is the number of boson internal lines (propagators), 𝐼F means the same for fermions
and there is a sum over vertices 𝑣, with 𝛿𝑣 denoting the power of a loop momentum generated by
derivatives operating in the vertex 𝑣 (i.e. acting on the internal lines attached to such a vertex).
Further, it holds
𝐿 = 𝐼 −𝑉 + 1, (44.3)
where 𝐼 is the total number of internal lines, 𝐼 = 𝐼F + 𝐼B , and 𝑉 is the number of vertices of the
diagram. The relation (44.3) means that there are 𝑉 constraints (four-momentum conservation)
on the loop momenta appearing in the internal lines, but one such constraint represents just the
overall conservation of the external momenta. Thus, the expression (44.2) can be recast, after a
trivial manipulation, as
$$
\omega(G) - 4 = 3I_{\rm F} + 2I_{\rm B} + \sum_v (\delta_v - 4) .
\tag{44.4}
$$
Now, 𝐼F and 𝐼B can be expressed as
$$
\begin{aligned}
I_{\rm F} &= \frac{1}{2}\sum_v f_v , \\
I_{\rm B} &= \frac{1}{2}\sum_v b_v ,
\end{aligned}
\tag{44.5}
$$

where 𝑓𝑣 is the number of fermion lines attached to the vertex 𝑣, and 𝑏 𝑣 means the same for
boson lines (note that the factor 1/2 in (44.5) is included there so as to avoid double counting,
since every internal line has two ends inside the diagram). So, we have
$$
\omega(G) - 4 = \sum_v \Big(\frac{3}{2} f_v + b_v + \delta_v - 4\Big) .
\tag{44.6}
$$

Next, 𝑓𝑣 , 𝑏 𝑣 and 𝛿𝑣 can be written as


𝑓𝑣 = 𝑛F;𝑣 − 𝐸 F;𝑣 ,
𝑏 𝑣 = 𝑛B;𝑣 − 𝐸 B;𝑣 , (44.7)
𝛿𝑣 = 𝑛D;𝑣 − 𝐸 D;𝑣 ,
where 𝑛F;𝑣 means the full number of fermion lines attached to the vertex 𝑣 (of course, it is the
number of fermion fields appearing in the interaction term corresponding to the vertex 𝑣) and
𝐸 F;𝑣 is the number of external fermion lines attached to the vertex 𝑣. An analogous notation holds
for boson lines and, concerning the last line in (44.7), 𝑛D;𝑣 denotes the number of derivatives in
the interaction term corresponding to vertex 𝑣 and 𝐸 D;𝑣 means the number of derivatives “acting
on external lines” (this amounts to the factorization of a power of external momenta). From
(44.6) and (44.7) we thus get
$$
\omega(G) - 4 = \sum_v (\omega_v - 4) - \Big(\frac{3}{2} E_{\rm F} + E_{\rm B} + E_{\rm D}\Big) ,
\tag{44.8}
$$

where the “index of interaction vertex” 𝜔𝑣 is


$$
\omega_v = \frac{3}{2}\, n_{{\rm F};v} + n_{{\rm B};v} + n_{{\rm D};v}
\tag{44.9}
$$
and 𝐸_F = Σ_𝑣 𝐸_F;𝑣 is the total number of external fermion lines; 𝐸_B means the same for boson
lines and 𝐸_D is the complete power of external momenta factorized in the diagram contribution
due to the operation of derivatives appearing in the interaction terms.
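The equivalence of the two ways of counting, (44.2) with (44.3) on one side and (44.8) with (44.9) on the other, is easy to test on familiar one-loop QED graphs. The following Python sketch is only an illustration under stated assumptions: it specializes to QED, where every vertex carries two fermion legs and one photon leg and no derivatives occur, and the graph data (𝑉, 𝐼_F, 𝐼_B) are entered by hand. The function and dictionary names are hypothetical.

```python
from fractions import Fraction


def omega_from_lines(V, I_F, I_B):
    """Eq. (44.2) for QED (no derivative couplings), using L = I - V + 1."""
    L = I_F + I_B - V + 1
    return 4 * L - 2 * I_B - I_F


def omega_from_external(V, I_F, I_B):
    """Eq. (44.8) for QED, where omega_v = 4 for every vertex and E_D = 0."""
    E_F = 2 * V - 2 * I_F  # two fermion legs per vertex, minus internal line ends
    E_B = V - 2 * I_B      # one photon leg per vertex, minus internal line ends
    return 4 - Fraction(3, 2) * E_F - E_B


# (V, I_F, I_B) for some one-loop QED graphs, entered by hand
examples = {
    "vacuum polarization": (2, 2, 0),
    "electron self-energy": (2, 1, 1),
    "vertex correction": (3, 2, 1),
    "fermion triangle": (3, 3, 0),
    "fermion box": (4, 4, 0),
}

for name, (V, I_F, I_B) in examples.items():
    a = omega_from_lines(V, I_F, I_B)
    b = omega_from_external(V, I_F, I_B)
    assert a == b
    print(f"{name}: omega = {a}")
```

Both counting rules give the same index for each graph, and the values agree with the naive degrees of divergence quoted earlier for these loops.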

(a) (b)

Fig. 44.4: (a) UV convergent box diagram, (b) the box diagram involving an UV divergent subdiagram.

The results (44.8), (44.9) are quite remarkable, but there is also one caveat: in the
interpretation of the value of 𝜔(𝐺) as the criterion for UV convergence or divergence we have,
in fact, tacitly assumed that the diagram in question does not involve UV divergent subdiagrams.
This can be illustrated on an example of the QED diagrams shown in Fig. 44.4. In QED,
obviously, 𝜔𝑣 = 4, since 𝑛F = 2, 𝑛B = 1 and 𝑛D = 0. Thus, according to the formula (44.8) we
get for both diagrams in Fig. 44.4 the value 𝜔(𝐺) = −2. While for the diagram in Fig. 44.4(a) the
conclusion about UV convergence is correct, for the diagram (b) we observe a UV divergence
hidden in the self-energy subdiagram inserted in an internal fermion line.26
Thus, for the integral representing the diagram (b) it is more appropriate to use the term
“conditional convergence”, which is quite common in the literature (cf. e.g. the book [6]).
In any case, the formulae (44.8), (44.9) are very valuable for providing us with the
information about the possible number of types of UV divergent 1PI diagrams. In this connection,
the value 𝜔𝑣 = 4 is in a sense critical. Indeed, if for a given QFT model one has at least one
interaction term with 𝜔𝑣 > 4, it is clear that for any chosen configuration of external lines (i.e.
for chosen values of 𝐸 F and 𝐸 B ) one can get a UV divergent graph by going to a sufficiently
high order of perturbation expansion. On the other hand, if 𝜔𝑣 ≤ 4 for every vertex 𝑣,
the number of types of UV divergent graphs is finite.
Another point is supposed to be obvious: the simple “power counting” leading to the
formula (44.8) is dealing just with individual diagrams and cannot reflect e.g. the cancellation
of some UV divergences within groups of diagrams (cf. the case of the fermion box in QED).
One more remark is in order here. The value of 𝜔𝑣 is seen to coincide with the “mass
dimension” of the corresponding term in the interaction Lagrangian, stripped of the coupling
constant (i.e. simply the dimension of the relevant monomial in participating fields). Indeed,
we know that the dimension of Dirac field is 3/2, for any boson field it is 1 and the derivative
has the dimension of inverse length, which is also 1. The complete Lagrangian density, including
the interaction terms (with their coupling constants), has dimension four; thus, the critical value
𝜔𝑣 = 4 corresponds to a dimensionless coupling constant.
The derivation of the formulae (44.8), (44.9) can be generalized to the case where bosons
are represented by a massive vector (Proca) field. In such a case, the boson propagator has
26 It is fair to admit that our simple analysis of the convergence properties of Feynman graphs, which has led us
to the key quantity 𝜔(𝐺), has been somewhat sloppy. A detailed treatment of the problem and a relevant rigorous
theorem concerning the convergence issue in question can be found e.g. in the book [6] (see Section 8.1.3
therein). In any case, an important result of the rigorous analysis is that a non-negative value of 𝜔(𝐺) definitely
signifies a UV divergent 1PI diagram.

different scaling properties: its form is
$$
D_{\mu\nu}(l) = \frac{-g_{\mu\nu} + \dfrac{l_\mu l_\nu}{M^2}}{l^2 - M^2} ,
\tag{44.10}
$$

so that for 𝑙 → ∞ it behaves like O(1), i.e. the zeroth power of 𝑙. Thus, for 𝜔(𝐺) one may now
write, instead of (44.2),
$$
\omega(G) = 4L - I_{\rm F} + \sum_v \delta_v .
\tag{44.11}
$$
The following steps are the same as before and one ends up with the formula

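$$
\omega(G) - 4 = \sum_v (\omega_v - 4) - \Big(\frac{3}{2} E_{\rm F} + 2E_{\rm B} + E_{\rm D}\Big) ,
\tag{44.12}
$$

where now
$$
\omega_v = \frac{3}{2}\, n_{{\rm F};v} + 2 n_{{\rm B};v} + n_{{\rm D};v} .
\tag{44.13}
$$
In the most general case, where both types of bosons are present, one gets
$$
\omega_v = \frac{3}{2}\, n_{{\rm F};v} + n^{(1)}_{{\rm B};v} + 2 n^{(2)}_{{\rm B};v} + n_{{\rm D};v} ,
\tag{44.14}
$$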
where (1) and (2) refer to the bosons of the 1st type (like the photon or a scalar) and 2nd type
(Proca fields), respectively.
Finally, for an illustration of our general formulae, let us discuss some explicit examples.
1) QED
As we have already noted, here 𝜔𝑣 = 4, so that for any 1PI graph one has, using the
formula (44.8), 𝜔(𝐺) = 4 − (3/2)𝐸_F − 𝐸_B. The readers are encouraged to identify a full set of
configurations (𝐸 F , 𝐸 B ) for which 𝜔(𝐺) ≥ 0, and confront their findings with our previous
knowledge concerning one-loop diagrams.
2) QED involving Pauli term
As we have noted in Chapter 13, if one wants to describe a spin-1/2 particle with an essentially
arbitrary magnetic moment, one has to include the so-called Pauli term in the Dirac equation.
In field theory, this amounts to adding the (gauge invariant) interaction term

$$
\mathcal{L}_{\rm Pauli} = f\, \bar\psi \sigma^{\mu\nu} \psi\, F_{\mu\nu}
\tag{44.15}
$$

to the basic QED Lagrangian. It is easy to see that the coupling constant 𝑓 has the dimension
of an inverse mass. The numbers relevant for the evaluation of the corresponding index of
interaction vertex (44.9) are 𝑛F = 2, 𝑛B = 1 and 𝑛D = 1 so that 𝜔v = 5. This indicates that
within such an “extended QED” one may anticipate an unlimited proliferation of types of UV
divergent graphs in higher orders of perturbation expansion. It turns out that it is indeed so
— their number becomes infinite.
3) QED with massive photon
In such a case, one should use the formulae (44.12), (44.13). Thus, one gets 𝜔𝑣 = 5 and
this suggests that the number of types of UV divergent graphs is infinite. However, it turns
out (it is a non-trivial finding) that there are many cancellations and the resulting picture is
essentially the same as in ordinary QED.

(a) (b)

Fig. 44.5: Interaction vertices in the scalar QED.

4) Scalar QED
This describes the interaction of a charged scalar and the photon. Here we have two types of
interaction vertices, namely those shown in Fig. 44.5.

For the vertex (a) one has 𝑛B = 3, 𝑛D = 1 (the interaction term 𝑖𝑒𝜑* 𝜕_𝜇𝜑 𝐴^𝜇 involving the
derivative), while for (b) 𝑛B = 4, 𝑛D = 0 (the “seagull” term 𝑒²𝜑*𝜑 𝐴_𝜇 𝐴^𝜇) and, of course, for
both cases 𝑛F = 0. Thus, according to (44.9), both for (a) and (b) one has 𝜔𝑣 = 4.
5) Direct four-fermion interaction 
Schematically, take $\mathcal{L}_{\rm int} = G\,(\bar\psi_1 \psi_2)(\bar\psi_2 \psi_1)$ (which is a simplified prototype of the weak
interaction of the electron and neutrino). Here 𝑛F = 4, 𝑛B = 0, 𝑛D = 0, so that 𝜔𝑣 = 6.
In this case, the number of types of UV divergent graphs is indeed infinite; the types
of diagrams involving UV divergences proliferate with increasing order of perturbation
expansion (i.e. with the increasing number of vertices in the sum appearing in (44.8)).
6) Standard model of electroweak interactions
One encounters there, among other things, self-interactions of massive vector bosons (𝑊 ± , 𝑍)
and these are of two types — trilinear (involving one derivative) and quadrilinear (without
derivative). Moreover, charged vector bosons 𝑊 ± also interact with photons. For such
interactions one then has, according to our formulae:

𝜔𝑊𝑊 𝛾 = 2 · 2 + 1 + 1 = 6 ,
𝜔𝑊𝑊 𝑍 = 2 · 3 + 1 = 7,
𝜔𝑊𝑊𝑊𝑊 = 2 · 4 = 8,
𝜔𝑊𝑊 𝛾𝛾 = 2 · 2 + 2 = 6.

Thus, it might seem that the number of types of UV divergences is infinite. However, there are
delicate cancellations due to the underlying gauge invariance, and when the dust settles, the
resulting situation is similar to that in QED. An alternative way of understanding the remarkable
phenomenon of “miraculous” cancellations is a sophisticated reformulation of the theory in
such a way that the propagators of massive vector bosons also behave like 1/𝑙² for 𝑙 → ∞,
so that the simple formula (44.8) is applicable (for details, see e.g. [22]).
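The vertex indices quoted in the examples above, and the exercise suggested for QED, can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the function `vertex_index` simply encodes (44.14), the field content of each vertex is typed in by hand (for the scalar QED vertex (a) all three boson fields 𝜑*, 𝜑 and 𝐴 are counted), and in the QED enumeration the number of external fermion lines is taken to be even, since fermion lines pass through QED vertices in pairs. All names are illustrative.

```python
from fractions import Fraction


def vertex_index(n_F, n_B1=0, n_B2=0, n_D=0):
    """Eq. (44.14): n_B1 counts photon or scalar legs, n_B2 counts Proca legs."""
    return Fraction(3, 2) * n_F + n_B1 + 2 * n_B2 + n_D


vertices = {
    "QED": vertex_index(2, n_B1=1),
    "Pauli term": vertex_index(2, n_B1=1, n_D=1),
    "QED with massive photon": vertex_index(2, n_B2=1),
    "scalar QED (a)": vertex_index(0, n_B1=3, n_D=1),
    "scalar QED (b)": vertex_index(0, n_B1=4),
    "four-fermion": vertex_index(4),
    "WW gamma": vertex_index(0, n_B1=1, n_B2=2, n_D=1),
    "WWZ": vertex_index(0, n_B2=3, n_D=1),
    "WWWW": vertex_index(0, n_B2=4),
    "WW gamma gamma": vertex_index(0, n_B1=2, n_B2=2),
}


def qed_divergent_configurations(max_legs=8):
    """List (E_F, E_B, omega) with omega = 4 - (3/2) E_F - E_B >= 0, E_F even."""
    found = []
    for E_F in range(0, max_legs + 1, 2):
        for E_B in range(0, max_legs + 1):
            if E_F + E_B == 0:
                continue  # skip vacuum graphs without external lines
            omega = 4 - Fraction(3, 2) * E_F - E_B
            if omega >= 0:
                found.append((E_F, E_B, int(omega)))
    return found


for name, value in vertices.items():
    print(f"{name}: omega_v = {value}")
print(qed_divergent_configurations())
```

The enumeration returns the configurations (𝐸_F, 𝐸_B) = (0, 1), (0, 2), (0, 3), (0, 4), (2, 0) and (2, 1). Power counting alone cannot see that the graphs with one or three external photons vanish by Furry’s theorem, or that the divergence of the four-photon box cancels after Bose symmetrization.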

In summary, one might be tempted to say now that a subtitle for this chapter could read
“UV divergences wherever you look”. In such a situation, Eric Cartman from South Park would
probably exclaim: “Screw you guys! I’m going home.” But we persist. In the forthcoming
chapters, we will try to cope with the problem of UV divergences by developing the program of
the renormalization in QED.

Chapter 45

Renormalization in QED:
preliminary considerations

Let us come back to our sample scattering process 𝑒(𝑘) + 𝜇( 𝑝) → 𝑒(𝑘 ′) + 𝜇( 𝑝′). The second
order matrix element corresponding to the familiar tree-level diagram has the form
$$
\mathcal{M}^{(2)} = e^2\, \big[\bar u(p')\gamma_\alpha u(p)\big]\, \frac{g^{\alpha\beta}}{q^2}\, \big[\bar u(k')\gamma_\beta u(k)\big] ,
\tag{45.1}
$$
where 𝑞 = 𝑘 ′ − 𝑘 (= 𝑝 − 𝑝′). The 4th order correction due to the vacuum polarization loop
corresponds to the diagram shown in Fig. 45.1. Using our conventional notation, its contribution

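[Figure: tree-level 𝑒–𝜇 exchange diagram with a vacuum polarization loop inserted in the photon line of momentum 𝑞; the electron line 𝑘 → 𝑘′ carries the vertex index 𝛽, the muon line 𝑝 → 𝑝′ the index 𝛼, and the loop is attached through the indices 𝜇, 𝜈.]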

Fig. 45.1: Vacuum polarization insertion into the photon propagator in the matrix element for 𝑒–𝜇
scattering.

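can be written as
$$
\mathcal{M}_a^{(4)} = e^2\, \big[\bar u(p')\gamma_\alpha u(p)\big]\, \frac{g^{\alpha\mu}}{q^2}\, \big[-\Pi_{\mu\nu}(q)\big]\, \frac{g^{\nu\beta}}{q^2}\, \big[\bar u(k')\gamma_\beta u(k)\big] ,
\tag{45.2}
$$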
where another factor of 𝑒 2 is included in Π 𝜇𝜈 (𝑞). Taking into account the transversality property
of Π 𝜇𝜈 (𝑞), i.e.
Π 𝜇𝜈 (𝑞) = Π(𝑞 2 )(𝑞 2 𝑔 𝜇𝜈 − 𝑞 𝜇 𝑞 𝜈 ) , (45.3)
and using the identity $\bar u(p')\,\slashed{q}\,u(p) = 0$, or $\bar u(k')\,\slashed{q}\,u(k) = 0$ (due to the Dirac equation), one gets,
after some simple manipulations,

$$
\mathcal{M}_a^{(4)} = \mathcal{M}^{(2)} \cdot \big[-\Pi(q^2)\big] .
\tag{45.4}
$$




So, it is surprisingly simple, isn’t it? Nevertheless, the main problem lies ahead. We know how
to regularize the UV divergent form factor Π(𝑞 2 ), but the question is what is the right candidate
for a “true” contribution of the expression like (45.4); obviously, a physical scattering amplitude
depending on a regularization parameter would be unsatisfactory.
For the exploration of such a problem it will be useful to utilize our earlier “tilde” notation
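for quantities stripped of the coupling factors (see Chapter 42). Then, we may write

$$
\mathcal{M}^{(2)} + \mathcal{M}_a^{(4)} = e^2 \big[1 - e^2\, \tilde\Pi(q^2)\big]\, \widetilde{\mathcal{M}}^{(2)} .
\tag{45.5}
$$

Obviously, the UV divergence may be singled out in the form factor $\tilde\Pi(q^2)$ for any particular
value of 𝑞², e.g. 𝑞² = 0. Splitting $\tilde\Pi(q^2)$ into $\tilde\Pi(0)$ and the UV finite part $\tilde\Pi(q^2) - \tilde\Pi(0)$,
Eq. (45.5) may be recast as

$$
\begin{aligned}
\mathcal{M}^{(2)} + \mathcal{M}_a^{(4)} &= e^2 \big[1 - e^2\, \tilde\Pi(0)\big]\, \widetilde{\mathcal{M}}^{(2)} \\
&\quad - e^4 \big[\tilde\Pi(q^2) - \tilde\Pi(0)\big]\, \widetilde{\mathcal{M}}^{(2)} .
\end{aligned}
\tag{45.6}
$$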

Now we are ready to make a crucial step forward. The first line in (45.6) suggests that we
might try to reinterpret the original coupling constant 𝑒 in such a way that the whole coefficient
multiplying $\widetilde{\mathcal{M}}^{(2)}$ is identified with the physical coupling constant squared, i.e. to define a
“renormalized” coupling constant 𝑒_R by

$$
e_{\rm R}^2 = e^2 \big[1 - e^2\, \tilde\Pi(0)\big] .
\tag{45.7}
$$

Thus, 𝑒 is now considered to be an unphysical parameter, usually called bare coupling constant
(in Czech: holá vazbová konstanta). Our trick of introducing 𝑒 R instead of 𝑒 is therefore
a reinterpretation of the relevant parameter in the interaction Lagrangian and the subsequent
reparametrization of the expression for the scattering matrix element in question.
Let us now return to the relation (45.7). One would like to express 𝑒² in terms of 𝑒_R². It
is almost trivial; but, just to be sure, we will do it explicitly here.
So, denoting 𝑒² for brevity as 𝑥, one has to solve the quadratic equation
$$
\tilde\Pi(0)\, x^2 - x + e_{\rm R}^2 = 0 .
\tag{45.8}
$$

Of course, it has two solutions, namely

$$
x_{1,2} = \frac{1}{2\tilde\Pi(0)} \Big[1 \pm \sqrt{1 - 4 e_{\rm R}^2\, \tilde\Pi(0)}\Big] .
\tag{45.9}
$$

It is clear how to select the right solution. We wish, naturally, that 𝑒² = 0 also means 𝑒_R² = 0 and
vice versa. So, we choose
$$
x = \frac{1}{2\tilde\Pi(0)} \Big[1 - \sqrt{1 - 4 e_{\rm R}^2\, \tilde\Pi(0)}\Big] .
\tag{45.10}
$$

A remark is in order here. Having in mind that $\tilde\Pi(0)$ contains the UV divergence, one might
worry about a negative number under the square root sign, resulting in complex values of 𝑥.
Please, don’t be afraid! In our calculation, we always keep the regularization parameter fixed,
and its value can be taken such that $e_{\rm R}^2\, \tilde\Pi(0) \ll 1$. So, without further misgivings, we may
expand the square root in (45.10) to get, after some simple manipulations,

$$
e^2 = e_{\rm R}^2 \big[1 + e_{\rm R}^2\, \tilde\Pi(0)\big] + O(e_{\rm R}^6) .
\tag{45.11}
$$
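The passage from (45.7) to (45.11) can be checked numerically. The sketch below is only an illustration with made-up inputs: the values chosen for 𝑒_R² and for the regularized $\tilde\Pi(0)$ are hypothetical and serve merely to satisfy $e_{\rm R}^2\,\tilde\Pi(0) \ll 1$; the function name `bare_coupling_squared` is likewise an assumption of this example.

```python
import math


def bare_coupling_squared(e_R2, Pi0):
    """Root (45.10) of Pi0 * x**2 - x + e_R2 = 0 that vanishes together with e_R2."""
    return (1 - math.sqrt(1 - 4 * e_R2 * Pi0)) / (2 * Pi0)


# Hypothetical numbers, chosen only so that e_R2 * Pi0 is much smaller than 1.
e_R2, Pi0 = 0.09, 0.05
x = bare_coupling_squared(e_R2, Pi0)

# The root must reproduce the defining relation (45.7): e_R^2 = e^2 (1 - e^2 Pi0).
assert abs(x * (1 - x * Pi0) - e_R2) < 1e-12

# The truncated expansion (45.11) differs from the exact root only at order e_R^6.
approx = e_R2 * (1 + e_R2 * Pi0)
a = e_R2 * Pi0
assert abs(x - approx) < 3 * e_R2 * a**2

print(x, approx)
```

The difference between the exact root and the truncated expression is of relative order $(e_{\rm R}^2\,\tilde\Pi(0))^2$, in line with the remainder O(𝑒_R⁶) in (45.11).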

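Next, in the second line of (45.6) one may now simply replace 𝑒⁴ with 𝑒_R⁴ and one thus has, to
the order O(𝑒_R⁴),

$$
\mathcal{M}^{(2)} + \mathcal{M}_a^{(4)} = e_{\rm R}^2\, \widetilde{\mathcal{M}}^{(2)} - e_{\rm R}^4 \big[\tilde\Pi(q^2) - \tilde\Pi(0)\big]\, \widetilde{\mathcal{M}}^{(2)} + O(e_{\rm R}^6) .
\tag{45.12}
$$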

What we have done up to now is just a first hint of the renormalization procedure. Of course,
the diagram in Fig. 45.1 is not the only 4th order contribution to the considered process, so we
cannot yet make a full statement about the meaning of 𝑒 R as the physical measurable coupling
constant. Nevertheless, our first step into the area of renormalization theory gives us useful
insight into the conceptual foundations of the whole method.
Next, we are going to discuss one more example of this kind, namely the renormalization
of fermion mass. For this purpose, we will consider the fermion (e.g. electron) self-energy
loop inserted into the propagator in a 4th order Feynman diagram contributing to the Compton
scattering, as shown in Fig. 41.2. In this picture, the self-energy loop is sandwiched between
two Dirac field propagators, so that the interior of the diagram in Fig. 41.2 is proportional to
$(\slashed{p} - m)^{-1}\, \Sigma(p)\, (\slashed{p} - m)^{-1}$. Before proceeding further, let us recall a matrix identity that will
come in handy in our calculation. The identity in question is as follows. Suppose that 𝑋, 𝑌 are
non-singular matrices such that 𝑋 − 𝑌 is non-singular as well; then

$$
(X - Y)^{-1} = X^{-1} + X^{-1} Y X^{-1} + X^{-1} Y X^{-1} Y X^{-1} + \ldots
\tag{45.13}
$$

The proof of Eq. (45.13) is easy. Indeed, one may write

$$
\begin{aligned}
(X - Y)^{-1} &= \big[X (1 - X^{-1} Y)\big]^{-1} = (1 - X^{-1} Y)^{-1} X^{-1} \\
&= (1 + X^{-1} Y + X^{-1} Y X^{-1} Y + \ldots)\, X^{-1} ,
\end{aligned}
\tag{45.14}
$$
and there you are. Note that in arriving at the last line in (45.14) we have utilized a standard
formula for geometric series.
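A small numerical experiment makes the identity (45.13) tangible. The following sketch uses plain Python lists for 2 × 2 matrices; the particular matrices 𝑋 and 𝑌 are arbitrary illustrative choices, picked so that 𝑋⁻¹𝑌 is small and the geometric series converges quickly (for a "large" 𝑌 the series need not converge, which is why the identity is used here in the perturbative sense). All helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(A, B):
    """Sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


def matsub(A, B):
    """Difference of two 2x2 matrices."""
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]


def inverse(A):
    """Inverse of a non-singular 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def series(X, Y, terms):
    """Partial sum X^-1 + X^-1 Y X^-1 + ... containing `terms` terms."""
    X_inv = inverse(X)
    term = X_inv
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(terms):
        total = matadd(total, term)
        term = matmul(matmul(X_inv, Y), term)  # prepend one more factor X^-1 Y
    return total


def max_abs_diff(A, B):
    """Largest absolute difference between corresponding entries."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


X = [[2.0, 1.0], [0.0, 3.0]]      # arbitrary non-singular matrix
Y = [[0.1, 0.2], [0.05, 0.1]]     # arbitrary "small" perturbation
exact = inverse(matsub(X, Y))

assert max_abs_diff(series(X, Y, 30), exact) < 1e-12
# Keeping only the first two terms already gives an error of second order in Y.
assert max_abs_diff(series(X, Y, 2), exact) < 1e-2
```

Truncating the sum after the second term corresponds to what is done for the corrected propagator, where 𝑌 is of the order O(𝑒²) and the neglected terms are O(𝑒⁴).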
When the diagram in Fig. 41.2 is added to the familiar 2nd order (tree-level) graph, one
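is dealing with the “corrected propagator”

$$
\frac{i}{\slashed{p} - m} + \frac{i}{\slashed{p} - m}\, i\Sigma(p)\, \frac{i}{\slashed{p} - m}
= i \left[\frac{1}{\slashed{p} - m} + \frac{1}{\slashed{p} - m}\, \big(-\Sigma(p)\big)\, \frac{1}{\slashed{p} - m}\right] ,
$$
and this, using the identity (45.13), can be written as
$$
i\, \frac{1}{\slashed{p} - m + \Sigma(p)} + O(e^4)
\tag{45.15}
$$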

(the point is that Σ( 𝑝) is of the order O (𝑒 2 ); for the time being we do not distinguish here
between bare and renormalized couplings).
The quantity Σ( 𝑝) has been examined in detail in chapters 41 and 42, where we have
used the expansion
$$
\Sigma(p) = A + B(\slashed{p} - m) + \ldots ,
\tag{45.16}
$$
with the remainder “. . . ” that can be written, for convenience, as

$$
C\,(\slashed{p} - m)^2 + \cdots = C(p)(\slashed{p} - m)
\tag{45.17}
$$

(so that $C(p) = O(\slashed{p} - m)$ and this is, as we know, UV finite). Thus, the expression for the
corrected propagator in (45.15) can be recast as
$$
\frac{1}{\slashed{p} - m + \Sigma(p)} = \frac{1}{\slashed{p} - m + A + B(\slashed{p} - m) + C(p)(\slashed{p} - m)} .
\tag{45.18}
$$

The structure of the denominator in (45.18) suggests a reinterpretation of the original mass
parameter 𝑚: one may now denote 𝑚 as an unphysical “bare mass” and identify the combination
𝑚 − 𝐴 ≡ 𝑚 − 𝛿𝑚 as the physical (“renormalized”) mass. The contribution −𝛿𝑚 = −𝐴,
originating in Σ( 𝑝), may be intuitively understood as an “electromagnetic self-energy” (since the
corresponding closed loop looks like an interaction of the electron with its own electromagnetic
field, represented by the virtual photon in Fig. 41.2). Thus, the term “self-energy graph” for
Σ( 𝑝) is clarified.
In any case, the idea of the mass renormalization is embodied in the relation

$$
m - A = m_{\rm phys} .
\tag{45.19}
$$

As we know from Chapter 41, the constant 𝐴 depends on a regularization parameter; in particular,
in the Pauli–Villars (PV) scheme one has for the UV divergent part

$$
A_{\rm div} = -\frac{3\alpha}{4\pi}\, m \ln\frac{M^2}{m^2}
\tag{45.20}
$$
(see the result (41.17) and its PV counterpart shown e.g. in the book [6]). So, 𝛿𝑚 = 𝐴 is an extra
(negative) ingredient of the bare mass, in addition to the (positive) 𝑚_phys. Both the bare mass and
𝛿𝑚 depend on the regularization (cut-off) parameter, while 𝑚_phys should be the (measurable)
physical mass. Notice also a curious feature of the relation (45.19) along with (45.20): for
𝑀 → ∞ (which corresponds to 𝜖 → 0 in dimensional regularization) one would get a negative
value for the bare mass. However, one need not worry about a “paradox” like this, since we
always have in mind a fixed finite value of the regularization parameter; the limit of removed
cut-off is thus of purely academic interest (this is similar to our preceding discussion around
Eq. (45.10) for the coupling).
From now on, we will write simply 𝑚 instead of 𝑚_phys. Since the coefficients 𝐴, 𝐵 and
𝐶(𝑝) in (45.16), (45.17) are of the order O(𝑒²), the expression (45.18) can be rewritten in terms
of the renormalized mass as
$$
\frac{1}{\slashed{p} - m + B(\slashed{p} - m) + C(p)(\slashed{p} - m)} ,
\tag{45.21}
$$

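if one neglects terms of the order O(𝑒⁴). We may make another step forward and recast the
expression (45.21) as
$$
\begin{aligned}
\frac{1}{\slashed{p} - m}\, \frac{1}{1 + B + C(p)} &= \frac{1}{\slashed{p} - m}\, \big[1 - B - C(p)\big] + O(e^4) \\
&= (1 - B)\, \frac{1}{\slashed{p} - m}\, \big[1 - C(p)\big] + O(e^4) .
\end{aligned}
\tag{45.22}
$$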

The idea of mass renormalization outlined above is, hopefully, quite plausible. But, in (45.22)
there is still an UV divergent remnant; the fermion propagator with the one-loop correction
incorporates, among other things, a constant UV divergent factor 1 − 𝐵. We know that the
propagator is defined by means of a product of fields. So, one might come up with a crazy idea

to renormalize, with the help of an appropriate factor, the quantized field itself. Well, such an
idea is crazy only at first sight. We will see soon that its implementation is feasible, but for such
a purpose it is convenient to develop a novel technique, based on the concept of the so-called
renormalization counterterms. This will be the main topic of the next chapter.
The contents of the present chapter may be summarized as follows. We have introduced
the idea of renormalization,27 which means reinterpretation of some parameters in the basic
Lagrangian and subsequent reparametrization of the scattering amplitudes in terms of those
renormalized constants (which are supposed to be measurable quantities). The procedure highlights
the concept of “bare parameters” along with the renormalized ones. The UV divergences
coming from the closed-loop diagrams are eliminated by renormalization, as they are “absorbed”
in the redefinition of parameters (like the coupling constant and the mass) when passing from
bare to renormalized ones. In other words, UV divergences become part of the unphysical bare
parameters.
In several forthcoming chapters, we are going to describe a basic practical technique
of the renormalization in perturbative QED. In fact, the theme of renormalization of QFT
is wide-ranging and its various aspects are treated in many books and review papers (see
e.g. [2, 6, 8, 10–14, 19, 24, 25, 27–31, 55]). To paraphrase Woody Allen, in the aforementioned
literature there is (almost) “everything you always wanted to know about renormalization (but
were afraid to ask)”.

27 Note that, historically, the term “renormalization” (or rather “to renormalize”) had been used apparently for
the first time by Robert Serber in the paper [48].

Chapter 46

Renormalization counterterms

The QFT model we are going to employ in what follows is the QED of electrons, positrons and
photons; a single fermion species is sufficient for the discussion of methods we would like to
develop here. The relevant Lagrangian can be written, schematically, as

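$$
\mathcal{L}_{\rm QED} = \mathcal{L}^{(1)} + \mathcal{L}^{(2)} + \mathcal{L}^{(3)} , \tag{46.1}
$$

where

$$
\begin{aligned}
\mathcal{L}^{(1)} &= i\bar{\psi}\slashed{\partial}\psi - m\bar{\psi}\psi , \\
\mathcal{L}^{(2)} &= -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} , \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu , \\
\mathcal{L}^{(3)} &= e\bar{\psi}\gamma^\mu\psi A_\mu .
\end{aligned} \tag{46.2}
$$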

In the preceding chapter we have observed, besides other things, that there is a reasonable
motivation for introducing the concept of bare mass for the electron. The point is that a part
of the UV divergent contribution of the electron self-energy loop can be buried in such an
unphysical (regularization-dependent) parameter and one is left with the physical renormalized
mass that defines the pole of the corresponding field propagator. Thus, L (1) in (46.2) may be
rewritten, tentatively, as
L (1) = 𝑖𝜓 𝜕/𝜓 − 𝑚 0 𝜓𝜓 , (46.3)
where 𝑚 0 is the bare mass. Let us denote the difference between 𝑚 0 and the physical mass 𝑚 as
𝛿𝑚, so that
𝑚 0 = 𝑚 + 𝛿𝑚 . (46.4)
The Lagrangian (46.3) is thus

L (1) = 𝑖𝜓 𝜕/𝜓 − 𝑚𝜓𝜓 − 𝛿𝑚𝜓𝜓 . (46.5)

Now we make a basic strategic step. Since we would like to define the free part of the relevant
Lagrangian in terms of the physical mass 𝑚, the additional term −𝛿𝑚𝜓𝜓 will be incorporated in
the interaction part of the QED Lagrangian. Of course, up to now we have been used to identifying
as “interactions” the terms involving at least three fields, but in fact there is no fundamental
principle that would forbid quadratic field monomials as interactions — from our point of
view it is just a choice of an organizational principle for the perturbation expansion. The full
correction to the electron propagator then looks, pictorially, as shown in Fig. 46.1. Using the
familiar expansion
Σ( 𝑝) = 𝐴 + 𝐵( 𝑝/ − 𝑚) + . . . ,

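[Figure 46.1: an electron line of momentum 𝑝 with the self-energy insertion 𝑖Σ(𝑝), plus the same line with the counterterm vertex −𝑖𝛿𝑚.]

Fig. 46.1: Electron self-energy loop and the corresponding mass counterterm.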

the sum corresponding to diagrams in Fig. 46.1 is proportional to

𝐴 + 𝐵( 𝑝/ − 𝑚) + · · · + (−𝛿𝑚) , (46.6)

and an obvious choice for the cancellation is

𝛿𝑚 = 𝐴 . (46.7)

Comparing this with the relation (46.4), one sees that the mass renormalization introduced in
the preceding chapter is thereby recovered.
Thus, the funny interaction term −𝛿𝑚𝜓𝜓 plays the role of a compensation of a part of
the self-energy loop in Fig. 46.1; precisely this is the reason why we call it the counterterm:
against the UV divergent loop term, a counterterm is invoked.28
So, we have reproduced, in an almost trivial way, our previous picture of the mass
renormalization. However, having done it, there is still a UV divergent remnant in the sum
(46.6). Once we have launched the program relying on counterterms, one may cast off all
inhibitions and continue such a game. In particular, the coefficient 𝐵 multiplies 𝑝/ and this
indicates that a structure involving a pair of Dirac fields and a derivative could come in handy.
Having this in mind and recalling the elementary result for the mass renormalization, one may
consider a pair of counterterms incorporated in the Lagrangian structure
$$
\mathcal{L}^{(1)}_{\rm CT} = iK_2\bar{\psi}\slashed{\partial}\psi - K_1\bar{\psi}\psi , \tag{46.8}
$$

designed to take care of the UV divergent terms coming from Σ( 𝑝). Thus, one may say that
the Dirac-like structure (46.8) is the counterterm Lagrangian induced by the electron self-
energy function Σ( 𝑝), corresponding to the loop diagram in Fig. 46.1. Now, the corrected
electron propagator up to the order O (𝑒 2 ), including contributions of the counterterms (46.8),
is, pictorially, shown in Fig. 46.2. In the last diagram we have indicated the Feynman rule

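[Figure 46.2: the free electron line of momentum 𝑝, plus the same line with the insertion 𝑖Σ(𝑝), plus the line with the counterterm vertex −𝑖𝐾1, plus the line with the counterterm vertex 𝑖𝐾2 𝑝/.]

Fig. 46.2: Corrected electron propagator including the self-energy loop and two independent counterterms.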
28 In Czech, it reads kontrčlen, though it is an ugly hybrid word. A linguistic remark addressed to potential
polyglots: In English, the word “counterterm” is obviously quite consistent. In Czech, so as to avoid the hybrid
word “kontrčlen”, one should in fact use a funny but consistent term “protičlen”. But this is clearly not viable. Some
people tried to say “kontračlen”, but this also did not take root. Most probably, the word “kontrčlen” migrated to
the Czech language from Russian term “контрчлен”, which is constructed by using the prefix “contre” of French
origin (see e.g. the Russian original of the book [16]).

corresponding to the first term in (46.8). It is not difficult to find out that such a rule is indeed
valid, but we will not discuss it now (it is also a good topic for a tutorial, not time-consuming).
Thus, the corrections shown in Fig. 46.2 amount to the sum

Σ( 𝑝) − 𝐾1 + 𝐾2 𝑝/ . (46.9)

Again, using the familiar expansion of Σ( 𝑝), namely

$$
\Sigma(p) = A + B(\slashed{p} - m) + \overline{\Sigma}(p) , \tag{46.10}
$$

where $\overline{\Sigma}(p)$ is a shorthand notation for the UV finite part, one may set the values of the
counterterm constants 𝐾1 , 𝐾2 so as to cancel the first two terms in (46.10) completely. It means

$$
\begin{aligned}
A - Bm - K_1 &= 0 , \\
B + K_2 &= 0 ,
\end{aligned} \tag{46.11}
$$

i.e.

$$
\begin{aligned}
K_1 &= A - Bm , \\
K_2 &= -B .
\end{aligned} \tag{46.12}
$$
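Let us now see what one gets for the sum of the free Dirac Lagrangian and $\mathcal{L}^{(1)}_{\rm CT}$ (recall that now
the free Lagrangian is simply $i\bar{\psi}\slashed{\partial}\psi - m\bar{\psi}\psi$ with $m$ being the physical mass). We have

$$
\mathcal{L}^{(1)} + \mathcal{L}^{(1)}_{\rm CT} = i(1 + K_2)\bar{\psi}\slashed{\partial}\psi - (m + K_1)\bar{\psi}\psi . \tag{46.13}
$$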

The form of the expression (46.13) suggests that one may introduce a rescaled Dirac field
according to
$$
\psi_0 = Z_2^{1/2}\psi , \tag{46.14}
$$
with 𝑍2 = 1 + 𝐾2 , i.e., using (46.12),

𝑍2 = 1 − 𝐵 . (46.15)

Note that the choice of notation in (46.14) is a standard convention. The rescaling (46.14) is in
fact an implementation of our previous hint at a “crazy” idea to renormalize the field itself. In
the spirit of previous considerations we will call $\psi_0$ the bare field and the sum $\mathcal{L}^{(1)} + \mathcal{L}^{(1)}_{\rm CT}$ can
be denoted as $\mathcal{L}^{(1)}_{\rm bare}$. If we express this in terms of the bare field (46.14) one gets, using (46.12)
and (46.13),

$$
\mathcal{L}^{(1)}_{\rm bare} = i\bar{\psi}_0\slashed{\partial}\psi_0 - (m + A - Bm)Z_2^{-1}\bar{\psi}_0\psi_0 . \tag{46.16}
$$
Then, up to the lowest non-trivial order in the coupling constant $e$, (46.16) can be recast as

$$
\mathcal{L}^{(1)}_{\rm bare} = i\bar{\psi}_0\slashed{\partial}\psi_0 - (m + A)\bar{\psi}_0\psi_0 . \tag{46.17}
$$

Note that in arriving at (46.17) we have used $Z_2^{-1} = (1 - B)^{-1} = 1 + B + O(e^4)$ and then we have
discarded terms proportional to $AB$ or $B^2$, since these are of the order $O(e^4)$. Thus, we have

$$
\mathcal{L}^{(1)}_{\rm bare} = i\bar{\psi}_0\slashed{\partial}\psi_0 - m_0\bar{\psi}_0\psi_0 , \tag{46.18}
$$

with

$$
m_0 = m + A ,
$$

[Figure 46.3: one-loop QED vertex diagram with momentum labels 𝑝, 𝑝′, 𝑙, 𝑙 + 𝑝, 𝑙 + 𝑝′ and 𝑞 = 𝑝′ − 𝑝.]

Fig. 46.3: QED vertex one-loop correction reproduced here for reader’s convenience.

which is just our earlier relation between the bare and physical mass (cf. (45.19)).
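
The one-loop bookkeeping of this section can be cross-checked numerically. What follows is only a minimal sketch under stated assumptions, not part of the original derivation: 𝑝/ is replaced by an ordinary number `ps`, the constants `A` and `B` are given arbitrary finite (regularized) values proportional to a small parameter `eps` standing in for 𝑒², and the UV finite remainder is modelled by a hypothetical function `finite_part`. All function names are illustrative.

```python
# Sketch under stated assumptions: p-slash is treated as a commuting number `ps`,
# and A, B are regularized (finite) constants of order eps ~ e^2.

def finite_part(ps, m, C=0.3):
    """Hypothetical UV finite remainder, modelled as C * (ps - m)^2."""
    return C * (ps - m) ** 2

def sigma(ps, m, A, B):
    """Expansion (46.10): Sigma = A + B (ps - m) + finite part."""
    return A + B * (ps - m) + finite_part(ps, m)

def counterterms(m, A, B):
    """Solution (46.12) of the cancellation conditions (46.11)."""
    K1 = A - B * m
    K2 = -B
    return K1, K2

def corrected_sum(ps, m, A, B):
    """The sum (46.9): Sigma - K1 + K2 * ps."""
    K1, K2 = counterterms(m, A, B)
    return sigma(ps, m, A, B) - K1 + K2 * ps

def bare_mass_exact(m, A, B):
    """Mass coefficient in (46.16): (m + K1) / Z2 with Z2 = 1 + K2."""
    K1, K2 = counterterms(m, A, B)
    return (m + K1) / (1.0 + K2)

if __name__ == "__main__":
    m, a, b = 1.0, 2.0, 3.0  # arbitrary illustrative numbers
    for eps in (1e-2, 1e-3, 1e-4):
        A, B = a * eps, b * eps
        leftover = corrected_sum(1.7, m, A, B) - finite_part(1.7, m)
        diff = bare_mass_exact(m, A, B) - (m + A)
        print(f"eps={eps:g}  leftover={leftover:.2e}  m0_exact-(m+A)={diff:.3e}")
```

In this toy setting the counterterms remove the `A` and `B` pieces for any value of `ps`, and the difference between the exact mass coefficient of (46.16) and 𝑚 + 𝐴 shrinks like `eps` squared, which is the statement that the neglected terms are of the order O(𝑒⁴).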
As a next step, let us consider the vertex correction corresponding to the diagram shown
in Fig. 46.3. We know from our previous treatment of the vertex function Γ𝜇 ( 𝑝′, 𝑝) (see chapters
41, 42) that its UV divergent part resides e.g. in Γ𝜇 ( 𝑝, 𝑝). In other words, if we split Γ𝜇 ( 𝑝′, 𝑝)
as
$$
\Gamma_\mu(p', p) = \Gamma_\mu(p, p) + \Big(\Gamma_\mu(p', p) - \Gamma_\mu(p, p)\Big) , \tag{46.19}
$$


the term in parentheses is UV finite. Further, important information is provided by the Ward
identity (42.4), which tells us that, employing the “tilde notation”,

$$
\widetilde{\Gamma}_\mu(p, p) = \widetilde{B}\gamma_\mu + \text{UV finite terms} . \tag{46.20}
$$

Including the coupling factors, $\Gamma_\mu(p, p)$ is proportional to $e^3$, while $e^2\widetilde{B} = B$. So, one may
write, instead of (46.20),

$$
\Gamma_\mu(p, p) = eB\gamma_\mu + \ldots \tag{46.21}
$$
Now, it is clear what could be the relevant counterterm compensating the UV divergent term in
(46.21). Since the latter is proportional to 𝛾 𝜇 (which appears in the basic QED interaction term),
it is natural to choose the counterterm with the structure of the interaction Lagrangian itself, i.e.
$$
\mathcal{L}^{(3)}_{\rm CT} = K_{\rm v}\bar{\psi}\gamma^\mu\psi A_\mu . \tag{46.22}
$$

Of course, the field composition of the counterterm (46.22) is dictated by the structure of the
diagram in Fig. 46.3, which contains two external fermion lines and one external photon line.
For a compensation of the UV divergent term in (46.21) one must obviously set

𝐾v = −𝑒𝐵 . (46.23)

Thus, our result for the “vertex counterterm” reads


$$
\mathcal{L}^{(3)}_{\rm CT} = -eB\bar{\psi}\gamma^\mu\psi A_\mu , \tag{46.24}
$$

so that
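$$
\begin{aligned}
\mathcal{L}^{(3)} + \mathcal{L}^{(3)}_{\rm CT} &= e(1 - B)\bar{\psi}\gamma^\mu\psi A_\mu \\
&= eZ_1\bar{\psi}\gamma^\mu\psi A_\mu .
\end{aligned} \tag{46.25}
$$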

Here we have introduced the conventional notation for the renormalization constant

𝑍1 = 1 − 𝐵 . (46.26)

Thus, we see that owing to the Ward identity one has
𝑍1 = 𝑍2 (46.27)
(cf. (46.15)).
Finally, let us examine the familiar vacuum polarization bubble shown in Fig. 40.1. Let
us recall that, conventionally, we denote its contribution as −𝑖Π 𝜇𝜈 (𝑞) and we know that in any
good regularization scheme the tensor Π 𝜇𝜈 (𝑞) is transverse, i.e.

Π 𝜇𝜈 (𝑞) = Π(𝑞 2 )(𝑞 2 𝑔 𝜇𝜈 − 𝑞 𝜇 𝑞 𝜈 ) .


From our earlier results it is clear that if Π(𝑞 2 ) is split as e.g.
$$
\Pi(q^2) = \Pi(0) + \Big(\Pi(q^2) - \Pi(0)\Big) , \tag{46.28}
$$


then the term in parentheses is UV finite. This gives us a hint for finding a corresponding
counterterm. It is clear that it should be quadratic in the electromagnetic fields (because there
are two photon lines attached to the fermion loop in Fig. 40.1) and it is expected to involve two
derivatives (because of the quadratic 𝑞-dependence of Π(0)(𝑞 2 𝑔 𝜇𝜈 − 𝑞 𝜇 𝑞 𝜈 )). It turns out that
the right answer is
$$
\mathcal{L}^{(2)}_{\rm CT} = K_3\left(-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}\right) . \tag{46.29}
$$
So, the counterterm in question is proportional to the Maxwell Lagrangian in (46.2)! This might
seem quite surprising at first sight, but our comments preceding (46.29) should make it plausible.
In any case, a detailed examination of the contribution of the counterterm (46.29) leads to the
conclusion that the corresponding Feynman rule is

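$$
\big[\text{counterterm vertex on a photon line with indices } \mu, \nu \text{ and momentum } q\big] = -iK_3(q^2 g_{\mu\nu} - q_\mu q_\nu) \tag{46.30}
$$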

(verifying this is left to a diligent reader as an instructive exercise). So, following the decompo-
sition (46.28), one may set
𝐾3 = −Π(0) . (46.31)
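
For readers who want to see where the rule (46.30) comes from, here is a brief sketch of the standard argument. It assumes the sign and momentum-routing conventions that go with writing the loop contribution as −𝑖Π_𝜇𝜈(𝑞), plane-wave factors e^{−𝑖𝑞·𝑥}, and the usual factor 2 for a vertex built from two identical fields; these conventions are assumptions on our side and should be compared with the Feynman rules fixed in the earlier chapters.

```latex
\begin{aligned}
\mathcal{L}^{(2)}_{\rm CT}
  &= -\frac{K_3}{4}F_{\mu\nu}F^{\mu\nu}
   = -\frac{K_3}{2}\left(\partial_\mu A_\nu\,\partial^\mu A^\nu
      - \partial_\mu A_\nu\,\partial^\nu A^\mu\right) \\
  &\simeq \frac{K_3}{2}\,A^\mu\left(g_{\mu\nu}\Box - \partial_\mu\partial_\nu\right)A^\nu
   \qquad \text{(up to a total derivative)} \\
  &\;\longrightarrow\; -\frac{K_3}{2}\,A^\mu\left(q^2 g_{\mu\nu} - q_\mu q_\nu\right)A^\nu
   \qquad (\partial_\mu \to -iq_\mu) , \\
\text{vertex}
  &= i \times 2 \times \left(-\frac{K_3}{2}\right)\left(q^2 g_{\mu\nu} - q_\mu q_\nu\right)
   = -iK_3\left(q^2 g_{\mu\nu} - q_\mu q_\nu\right) .
\end{aligned}
```

The tensor structure is automatically transverse, which is why a single constant 𝐾3 is enough to compensate the Π(0) part of the vacuum polarization.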
Thus,
$$
\mathcal{L}^{(2)} + \mathcal{L}^{(2)}_{\rm CT} = -\frac{1}{4}(1 + K_3)F_{\mu\nu}F^{\mu\nu} , \tag{46.32}
$$
and this leads us naturally to rescaling the photon field 𝐴 𝜇 , in analogy with what we have done
before for the Dirac field. So, let us define the bare photon field by

$$
A^0_\mu = Z_3^{1/2}A_\mu , \tag{46.33}
$$
where
𝑍3 = 1 + 𝐾3 = 1 − Π(0) . (46.34)
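Thus, the original Maxwell term $\mathcal{L}^{(2)}$ in the QED Lagrangian is reproduced, with $A^\mu$ being
replaced by the bare field $A_0^\mu$.
Now we may express the interaction term (46.25) in terms of the bare fields. One has

$$
\begin{aligned}
\mathcal{L}^{(3)} + \mathcal{L}^{(3)}_{\rm CT} &= eZ_1\bar{\psi}\gamma^\mu\psi A_\mu \\
&= eZ_1 Z_2^{-1} Z_3^{-1/2}\,\bar{\psi}_0\gamma^\mu\psi_0 A^0_\mu \\
&= eZ_3^{-1/2}\,\bar{\psi}_0\gamma^\mu\psi_0 A^0_\mu ,
\end{aligned} \tag{46.35}
$$
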
where we have used (46.14) and the Ward identity 𝑍1 = 𝑍2 , displayed in (46.27). Thus, the
coefficient multiplying the field monomial in (46.35) may be identified with the bare coupling
constant, i.e.
$$
e_0 = Z_3^{-1/2}e , \tag{46.36}
$$
and this in turn means that
$$
e = Z_3^{1/2}e_0 . \tag{46.37}
$$
It is worth noticing that in such a way we have recovered our earlier preliminary relation
between the renormalized and bare coupling constant (cf. (45.7)). The point is that, owing to
the Ward identity, the only contribution to the renormalization of the coupling constant (“charge
renormalization”) comes from the vacuum polarization loop. By the way, from (46.34) and
(46.37) it is seen that $e^2 < e_0^2$; we will comment on this at the end of Chapter 48.
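
The role of the Ward identity in the charge renormalization can be illustrated with a few lines of code. This is a hedged sketch only: the numerical values of `B` and `Pi0` below are made-up stand-ins for the regularized quantities 𝐵 and Π(0), and the helper names `bare_coupling` and `qed_bare_charge` are hypothetical.

```python
import math

def bare_coupling(g, Z_vertex, Z_fermion, Z_boson):
    """Generic relation g0 = g * Z_vertex / (Z_fermion * sqrt(Z_boson)), cf. (46.35)."""
    return g * Z_vertex / (Z_fermion * math.sqrt(Z_boson))

def qed_bare_charge(e, B, Pi0):
    """QED case: Z1 = Z2 = 1 - B (Ward identity), Z3 = 1 - Pi(0)."""
    Z1 = 1.0 - B      # (46.26)
    Z2 = 1.0 - B      # (46.15)
    Z3 = 1.0 - Pi0    # (46.34)
    return bare_coupling(e, Z1, Z2, Z3)

if __name__ == "__main__":
    e = math.sqrt(4.0 * math.pi / 137.036)  # renormalized charge for alpha ~ 1/137
    for B in (0.0, 0.05, 0.2):              # illustrative values only
        e0 = qed_bare_charge(e, B, Pi0=0.01)
        print(f"B={B:4.2f}  e0={e0:.6f}  e0^2 > e^2: {e0 * e0 > e * e}")
```

Since the vertex and fermion factors cancel, the result does not depend on `B` at all; only the vacuum polarization constant enters, and for a positive `Pi0` smaller than one the bare charge comes out larger than the renormalized one.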
So, we have arrived at a rather remarkable conclusion. The original QED Lagrangian,
when supplemented with all necessary counterterms, has the same form as before, except that
one employs here bare parameters and bare fields. The salient point is that the necessary
counterterms have the form of terms already present in the basic Lagrangian. In this sense,
QED is a renormalizable theory (we have verified it here at the level of one-loop Feynman
diagrams). We can also see now, why it is so important that the fermion triangle and box
diagrams do not require counterterms. Indeed, these would have to involve three or four photon
fields, respectively, in a gauge invariant form; thus, the triangle would induce a counterterm
of the type, schematically, 𝐹𝐹𝐹 and the box would produce a counterterm structure 𝐹𝐹𝐹𝐹.
But those are monomials with mass dimension six and eight, respectively, and according to our
earlier formula for the index of divergence of 1PI diagrams (see (44.8), (44.9)) they would lead
then to an infinite number of types of UV divergences in higher orders. Similarly beneficial for
our purpose is the finiteness of the box diagram with four external fermion lines (see Fig. 42.2).
Thus, it is reassuring that we have been able to avoid potential destructive complications that
could be induced by interaction terms (counterterms) with dimension greater than four. Let
us also recall that adding the Pauli term (44.15) to the standard QED Lagrangian would lead
to the appearance of an infinite number of types of UV divergent graphs, i.e. it would spoil the
renormalizability.
Two more remarks are in order here. First, an essential point for our success was the
simple structure of the UV divergent terms in the loop diagrams in question. Indeed, all three of
them, corresponding to functions Π 𝜇𝜈 (𝑞), Σ( 𝑝) and Γ𝜇 ( 𝑝′, 𝑝) exhibit UV divergences that have
polynomial dependence on the external momenta. This enables one to use local counterterms
(which are polynomials made of fields and involving possibly just a finite number of their
derivatives). Second, returning to the formulae (44.8), (44.9), one may say in another way what
is so problematic about a field theory model involving an interaction for which 𝜔𝑣 > 4: the
ensuing infinite number of types of UV divergences would mean the need for an infinite number
of renormalization counterterms. In conventional terms, it would mean that such a theory is
non-renormalizable.
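
The dimension counting used in the last two paragraphs is easy to automate. The sketch below assumes the standard mass dimensions in four spacetime dimensions (3/2 for a Dirac field, 1 for a boson field and for each derivative) and assumes that the Pauli term (44.15) has the usual structure of two Dirac fields contracted with one field strength; both the table `DIM` and the function `mass_dimension` are illustrative names, not taken from the text.

```python
from fractions import Fraction

# Standard mass dimensions in four spacetime dimensions (assumed conventions).
DIM = {
    "psi": Fraction(3, 2),  # Dirac field
    "A": Fraction(1),       # photon field
    "phi": Fraction(1),     # scalar field
    "d": Fraction(1),       # one derivative
}

def mass_dimension(monomial):
    """Sum the dimensions of all factors; `monomial` maps factor name -> power."""
    return sum(DIM[name] * power for name, power in monomial.items())

# Each field strength F counts as one derivative times one photon field.
MONOMIALS = {
    "psibar gamma psi A (QED vertex)": {"psi": 2, "A": 1},
    "psibar psi phi (Yukawa)": {"psi": 2, "phi": 1},
    "phi^3": {"phi": 3},
    "phi^4": {"phi": 4},
    "psibar sigma F psi (Pauli-type, assumed form)": {"psi": 2, "d": 1, "A": 1},
    "F F F": {"d": 3, "A": 3},
    "F F F F": {"d": 4, "A": 4},
}

if __name__ == "__main__":
    for name, monomial in MONOMIALS.items():
        dim = mass_dimension(monomial)
        verdict = "dimension <= 4" if dim <= 4 else "dimension > 4 (problematic)"
        print(f"{name:48s} {str(dim):>3s}  {verdict}")
```

With these assumptions the QED and Yukawa vertices and the scalar self-interactions stay at dimension four or below, while the Pauli-type term and the would-be three- and four-photon counterterms exceed it, in line with the discussion above.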
A historical remark is perhaps in order here. The Nobel Prize for the modern formu-
lation of QED including renormalization techniques was awarded to Richard Feynman, Julian
Schwinger and Sin-itiro Tomonaga in 1965 for their work published in the late 1940s. In fact,
another crucial contribution is due to Freeman Dyson (see, in particular, his paper [40]), who
reformulated the conventional renormalization procedure in the current “textbook” form. In this
context, Steven Weinberg once noted that, taking this into account, one may say that Dyson was
in fact “fleeced” of the Nobel Prize. Fortunately, he received many other prestigious prizes for
his achievements.

Chapter 47

Renormalization and radiative corrections

Before proceeding to some simple practical applications of QED at the one-loop level, let
us mention briefly also some other examples of QFT models from the point of view of the
renormalization theory and the technique of counterterms.
First, let us consider the model with the interaction of Yukawa type, namely

Lint = 𝑔𝜓𝜓𝜑 , (47.1)

where 𝜓 is a Dirac field (mass 𝑚) and 𝜑 is a real scalar (Klein–Gordon) field (mass 𝑀). The
free part of the full Lagrangian has the familiar form, so it need not be repeated here. Feynman
diagrams corresponding to such a model are topologically quite similar to those of QED. Namely,
one-loop graphs contributing to propagator corrections are shown in Fig. 47.1 and there is also

(a) (b)

Fig. 47.1: One-loop graphs representing propagator corrections in a model of Yukawa-type interaction.

Fig. 47.2: One-loop vertex correction for a Yukawa-type interaction.

the vertex correction shown in Fig. 47.2. Obviously, the graphs shown in Figs. 47.1 and 47.2
are analogous to the familiar QED diagrams (cf. Fig. 44.1), with the wavy photon line being
replaced by a dashed line corresponding to the scalar field. Needless to say, contributions of
Feynman graphs within such a model are algebraically simpler than in QED, since there are no
𝛾-matrices here. Similarly to QED, the diagrams in Figs. 47.1, 47.2 induce counterterms leading
to field renormalizations, renormalization of masses (of course, renormalization of the scalar
boson mass is an extra ingredient in comparison with QED), and renormalization of the coupling

constant 𝑔. The latter is determined by the 𝑍-factors for the fields 𝜓, 𝜑 and the interaction vertex
𝜓𝜓𝜑; similarly to QED, this leads to the relation

$$
g_0 = gZ_v Z_\psi^{-1} Z_\varphi^{-1/2} , \tag{47.2}
$$


where we have employed a notation that is perhaps more instructive than that practiced in QED
(here 𝑍 𝑣 corresponds to the constant 𝑍1 of QED, 𝑍𝜓 to 𝑍2 , and 𝑍 𝜑 to 𝑍3 ). However, unlike QED,
there is no Ward identity here, so 𝑍 𝑣 ≠ 𝑍𝜓 . An explicit evaluation of the renormalization factors
entering (47.2) is left as a challenge for a hard-working reader.
In fact, there is another feature, in which the Yukawa-like model differs from QED.
Namely, higher fermion loops, in particular the triangle and the box shown in Fig. 47.3, do not

(a) (b)

Fig. 47.3: Fermionic triangle and the box in a Yukawa-type model. These loops yield UV divergent
contributions and induce the corresponding counterterms.

drop out of the game. Indeed, as for the triangle, the graph with reverse orientation of the loop
momentum does not cancel the contribution of its counterpart, since Furry’s theorem does
not work here (because of the absence of 𝛾-matrices in the triangle vertices). As for the box,
its algebraic structure is so simple that there cannot occur any cancellation of UV divergences
due to the permutation of external boson lines. It means that two extra counterterms are needed,
proportional to 𝜑3 and 𝜑4 , respectively. Nevertheless, both these scalar self-interactions have
dimension less than or equal to four (so that 𝜔𝑣 ≤ 4 in the formulae (44.8), (44.9)) and the
Yukawa-like model supplemented with such counterterms remains renormalizable.
A similar situation occurs in the scalar QED (the interaction of charged spin-0 particles
with photons, cf. the end of Chapter 44). Vacuum polarization graphs representing the correction
to the photon propagator are shown in Fig. 47.4. It is a highly useful and instructive exercise

µ ν
q q
µ ν
q q
(a) (b)

Fig. 47.4: Vacuum polarization loop in scalar QED.

to show that both diagrams in Fig. 47.4 are necessary to guarantee the transversality of the
corresponding function Π 𝜇𝜈 (𝑞). The vertex correction is described, at one-loop level, by the
diagrams shown in Fig. 47.5, and the self-energy graphs for the charged scalar are shown in
Fig. 47.6.

(a) (b) (c)

Fig. 47.5: One-loop vertex correction in scalar QED.

(a) (b)

Fig. 47.6: One-loop corrections to the propagator of the spin-0 charged boson in scalar QED.

The diagrams shown in figures 47.4 through 47.6 contribute to “ordinary” renormalization
constants in analogy with spinor QED. However, there are also UV divergent diagrams with four
external scalar lines, namely those shown in Fig. 47.7. It is easy to realize that these graphs

(a) (b) (c)

Fig. 47.7: One-loop graphs generating the counterterm representing a quartic self-interaction of the
scalar field.

are logarithmically divergent; so, because of the configuration of the external lines they induce
a counterterm proportional to (𝜑∗ 𝜑) 2 . On the other hand, diagrams with four external photon
lines, namely those shown in Fig. 47.8, give a UV finite contribution (in their sum).
Thus, similarly to the situation in spinor QED, there is no need to introduce a four-photon
counterterm (which would be destructive for the renormalization program). The upshot of all

this is as follows: Scalar QED with the original interaction terms of the type 𝜑 𝜕𝜑∗ 𝐴 and 𝜑𝜑∗ 𝐴𝐴
should be supplemented with the quartic counterterm (𝜑∗ 𝜑) 2 , and then it is “closed under renor-
malization” (i.e. it contains all counterterms necessary for implementing the renormalization
program).
Finally, let us mention a curious example of a renormalizable field theory model, which
involves not only a finite number of types of UV divergences, but even a finite number of all
UV divergent 1PI diagrams (such a model is called “superrenormalizable”). The model in
question involves the cubic self-interaction 𝜑3 alone. We know that such an interaction term is

(a) (b) (c)

Fig. 47.8: One-loop diagrams contributing to the photon–photon scattering in scalar QED; their sum is
UV finite.

present within some more complex models; by the way, it also appears in the standard model
of electroweak interactions as the (yet untested) self-interaction of the Higgs boson. When one
considers the 𝜑3 self-interaction by itself, i.e. the model Lagrangian that reads simply
$$
\mathcal{L} = \frac{1}{2}\partial_\mu\varphi\,\partial^\mu\varphi - \frac{1}{2}m^2\varphi^2 + g\varphi^3 , \tag{47.3}
$$
it turns out that there is only one UV divergent 1PI graph (if one ignores tadpoles) namely the
self-energy diagram shown in Fig. 47.9. The reader is encouraged to verify such a statement,

Fig. 47.9: Scalar self-energy loop in the 𝜑3 model.

possibly with the help of formulae (44.8), (44.9). Please notice that the dimension of the
interaction term 𝜑3 is 3, i.e. the coupling constant 𝑔 in (47.3) has the dimension of a mass.
Two more remarks addressed to potential QFT enthusiasts: In fact, the model (47.3) is,
in a broader perspective, inconsistent, since the corresponding energy density is not bounded
from below (precisely because of 𝜑3 ). Of course, such a pathological feature does not prevent
us from studying such a model within perturbation theory. Second, when the model (47.3) is
considered, academically, in six-dimensional spacetime (see e.g. the book [12]), the coupling
constant is dimensionless and the relevant Feynman diagrams are much more interesting than
in four dimensions: such a model is renormalizable in the conventional sense, and, moreover,
it exhibits a remarkable property, called asymptotic freedom, which in four dimensions is
reserved just for field theory models involving non-Abelian gauge fields (Yang–Mills fields).
The most famous theory of this type in four dimensions is quantum chromodynamics (QCD),
the modern theory of strong interactions in particle physics (the Nobel Prize in physics in 2004
was awarded just for this). The end of remarks for QFT enthusiasts.
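
The power counting behind the statements about the 𝜑3 model can be sketched in a few lines. The code below does not use the formulae (44.8), (44.9) themselves (they are not reproduced here); it assumes instead the textbook counting for a connected graph with 𝐸 external legs and 𝑉 cubic vertices: 𝐼 = (3𝑉 − 𝐸)/2 internal lines, 𝐿 = 𝐼 − 𝑉 + 1 loops, and superficial degree of divergence 𝐷 = 𝑑·𝐿 − 2𝐼 in 𝑑 spacetime dimensions. The function names are illustrative.

```python
def loops_and_degree(E, V, d=4):
    """Return (L, D) for a connected phi^3 graph, or None if no such graph exists.

    Assumed counting: I = (3V - E)/2, L = I - V + 1, D = d*L - 2*I.
    """
    if 3 * V < E or (3 * V - E) % 2:
        return None
    I = (3 * V - E) // 2
    L = I - V + 1
    return L, d * L - 2 * I

def superficially_divergent(d, E_values=range(2, 7), V_max=12):
    """List (E, V, L, D) for loop graphs with D >= 0; tadpoles (E = 1) are left out."""
    found = []
    for E in E_values:
        for V in range(1, V_max + 1):
            result = loops_and_degree(E, V, d)
            if result is None:
                continue
            L, D = result
            if L >= 1 and D >= 0:
                found.append((E, V, L, D))
    return found

if __name__ == "__main__":
    print("d = 4:", superficially_divergent(4))
    print("d = 6:", superficially_divergent(6))
```

In four dimensions the scan returns a single entry, the one-loop self-energy with two vertices. In six dimensions the degree depends only on the number of external legs, so there are infinitely many divergent graphs but only two types (two and three external legs), which is the conventional renormalizable situation mentioned above. Note that this is a statement about the superficial degree only.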
Let us now come back to our good old QED. We have found that all necessary renormal-
ization counterterms can be incorporated by rewriting the original Lagrangian in terms of bare
parameters and bare fields, so that one has
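$$
\mathcal{L}_{\rm QED} = i\bar{\psi}_0\slashed{\partial}\psi_0 - m_0\bar{\psi}_0\psi_0 - \frac{1}{4}F^{(0)}_{\mu\nu}F^{(0)\mu\nu} + e_0\bar{\psi}_0\gamma^\mu\psi_0 A^{(0)}_\mu . \tag{47.4}
$$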
Thus, the counterterms (which, according to the accepted strategy, belong to interactions) are
“wrapped” in an elegant way in the fundamental form (47.4). This Lagrangian could now be “unwrapped”
so as to recover all terms that are relevant for the evaluation of renormalized Feynman
diagrams, but in fact it is not necessary: we know how we have fixed the counterterms in terms
of UV divergent parts of one-loop diagrams and we understand that including the counterterms
in the interaction Lagrangian is tantamount to specific subtractions in the regularized functions
Π(𝑞 2 ), Σ( 𝑝) and Γ𝜇 ( 𝑝′, 𝑝). In particular, our renormalization scheme (i.e. our particular choice
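of counterterms) means that $\Pi(q^2)$ is replaced by

$$
\overline{\Pi}(q^2) = \Pi(q^2) - \Pi(0) , \tag{47.5}
$$

and from $\Sigma(p)$ one is led to $\overline{\Sigma}(p) = \Sigma(p) - A - B(\slashed{p} - m)$, i.e.

$$
\overline{\Sigma}(p) = \Sigma(p) - \Sigma(p)\Big|_{\slashed{p}=m} - \frac{\partial}{\partial\slashed{p}}\Sigma(p)\Big|_{\slashed{p}=m}(\slashed{p} - m) . \tag{47.6}
$$

Thus,
$$
\overline{\Sigma}(p) = C(\slashed{p} - m)^2 + \ldots \tag{47.7}
$$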
As for Γ𝜇 ( 𝑝′, 𝑝), our choice (related to the Ward identity) was (cf. (46.21), (46.24))

$$
\overline{\Gamma}_\mu(p', p) = \Gamma_\mu(p', p) - eB\gamma_\mu . \tag{47.8}
$$

It is easy to recast the subtraction in the last expression in a more compact way. Indeed, in
arriving at the counterterm in (47.8) we have utilized the Ward identity (42.4) that gives us
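$$
\begin{aligned}
\widetilde{\Gamma}_\mu(p, p) &= \frac{\partial}{\partial p^\mu}\widetilde{\Sigma}(p) \\
&= \frac{\partial}{\partial p^\mu}\Big[\widetilde{A} + \widetilde{B}(\slashed{p} - m) + \widetilde{C}(\slashed{p} - m)^2 + \ldots\Big] \\
&= \widetilde{B}\gamma_\mu + \widetilde{C}\Big[\gamma_\mu(\slashed{p} - m) + (\slashed{p} - m)\gamma_\mu\Big] + \ldots
\end{aligned} \tag{47.9}
$$

Obviously, all terms in the last expression, except $\widetilde{B}\gamma_\mu$, vanish when we set, formally, $\slashed{p} = m$.
Thus, the subtraction in (47.8) can be written as

$$
\overline{\Gamma}_\mu(p', p) = \Gamma_\mu(p', p) - \Gamma_\mu(p, p)\Big|_{\slashed{p}=m} . \tag{47.10}
$$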

So, the recipe emerging from the above considerations is that the renormalized QED scattering
amplitudes can be obtained, at the one-loop level, by replacing contributions of the relevant sub-
diagrams (vacuum polarization, fermion self-energy and vertex correction) with the subtracted
expressions $\overline{\Pi}(q^2)$, $\overline{\Sigma}(p)$ and $\overline{\Gamma}_\mu(p', p)$ defined in (47.5), (47.6) and (47.10).
Let us now see how it works in practice. In particular, the loop corrections in the external
lines deserve special attention. First, let us consider the 4th order diagram for our favourite
scattering process 𝑒 + 𝜇 → 𝑒 + 𝜇, shown in Fig. 47.10, where we have denoted by the heavy dot
the contribution of the function $\overline{\Sigma}(p)$. Full contribution of the line involving the insertion of
$\overline{\Sigma}(p)$ amounts to
$$
\frac{1}{\slashed{p} - m}\,\overline{\Sigma}(p)\,u(p) . \tag{47.11}
$$
However, according to (47.7), $\overline{\Sigma}(p)$ is proportional to $(\slashed{p} - m)^2$, so that in the expression (47.11)
one certainly gets a factor $(\slashed{p} - m)u(p)$, which is zero. Of course, the fate of the self-energy
insertion in the outgoing line would be the same, because of $\bar{u}(p')(\slashed{p}' - m) = 0$.

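[Figure 47.10: 𝑒–𝜇 scattering diagram with momentum labels 𝑘, 𝑘′ and 𝑝, 𝑝′; a heavy dot marks the self-energy insertion in an external line.]

Fig. 47.10: A would-be contribution to 𝑒–𝜇 scattering involving the self-energy correction in an external
line.

[Figure 47.11: Compton scattering diagram with momentum labels 𝑘, 𝑘′, 𝑝, 𝑝′ and 𝑞 = 𝑝 + 𝑘; a heavy dot marks the vacuum-polarization insertion in an external photon line.]

Fig. 47.11: A would-be contribution to the Compton scattering, involving the vacuum-polarization
correction in an external photon line.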

Similarly, one may consider the 4th order diagram for Compton scattering with the
insertion of the vacuum polarization bubble in an external photon line, as shown in Fig. 47.11,
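where the heavy dot denotes the function

$$
\overline{\Pi}_{\mu\nu}(k) = \overline{\Pi}(k^2)(k^2 g_{\mu\nu} - k_\mu k_\nu) .
$$

Now, the full contribution of the line including $\overline{\Pi}_{\mu\nu}(k)$ is proportional to

$$
\epsilon^\mu(k)\,\overline{\Pi}_{\mu\nu}(k)\,\frac{g^{\nu\rho}}{k^2} = \epsilon^\mu(k)\,\overline{\Pi}(k^2)(k^2 g_{\mu\nu} - k_\mu k_\nu)\,\frac{g^{\nu\rho}}{k^2} . \tag{47.12}
$$
However, $k_\mu\epsilon^\mu(k) = 0$ for a physical photon polarization, so one is left just with the factor of
$\overline{\Pi}(k^2)$, but this is zero, since $k^2 = 0$ and $\overline{\Pi}(0) = 0$ by its definition.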
The above two examples show that within our renormalization scheme, corrections on
the external lines vanish. In any case, our elementary calculations show that one need not worry
about the anticipated singularity of the propagator standing between the vertex and the loop
(anticipated because of the value of the physical four-momentum, 𝑝 2 = 𝑚 2 or 𝑘 2 = 0, imposed
by the attached external line).
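
The tensor algebra behind the photon example can be checked with explicit four-vectors. The following sketch assumes the metric signature (+, −, −, −) and a photon moving along the third axis with a transverse polarization vector; the helper names are illustrative. It evaluates the contraction of the polarization vector with the transverse structure of (47.12).

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+, -, -, -)

def lower(v):
    """Lower the index of a contravariant four-vector."""
    return tuple(g * x for g, x in zip(METRIC, v))

def dot(a, b):
    """Minkowski product a.b of two contravariant four-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def contract_with_projector(eps, k):
    """Return eps^mu (k^2 g_{mu nu} - k_mu k_nu) as a covariant vector (index nu)."""
    k2 = dot(k, k)
    eps_dot_k = dot(eps, k)
    return tuple(k2 * e - eps_dot_k * kl for e, kl in zip(lower(eps), lower(k)))

if __name__ == "__main__":
    eps = (0.0, 1.0, 0.0, 0.0)        # transverse polarization
    k_on_shell = (2.0, 0.0, 0.0, 2.0)  # k^2 = 0
    k_off_shell = (3.0, 0.0, 0.0, 2.0)  # k^2 = 5, still eps.k = 0
    print("on shell :", contract_with_projector(eps, k_on_shell))
    print("off shell:", contract_with_projector(eps, k_off_shell))
```

For a transverse polarization the contraction equals 𝑘² times the lowered polarization vector, so the explicit factor 𝑘² compensates the propagator pole 1/𝑘² and what remains is controlled by the form factor evaluated at 𝑘² = 0, exactly as argued in the text.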
In this context, a terminological remark is in order. The renormalization scheme we are
using is called the on-shell scheme, since the choice of counterterms corresponds to subtractions
at physical (on-shell) values of the relevant four-momenta. One should come to terms with the
fact that the choice of the counterterms (including their UV finite parts) is to some extent
arbitrary. For instance, within the dimensional regularization, another popular choice of the
counterterms corresponds just to the subtraction of the pole terms 1/𝜖; such a renormalization
scheme is called minimal subtraction scheme, or briefly MS scheme.
Apart from the corrections in external lines, which are trivial in the on-shell scheme,
in propagators and vertices of Feynman diagrams one, of course, gets non-trivial results. At
the one-loop level, the renormalization corrections are of the order O (𝑒 2 ) = O (𝛼) and they

are commonly called radiative corrections, because they involve, pictorially, virtual particles
radiated from some lines of Feynman diagrams. Some examples of radiative corrections will be
discussed in the forthcoming chapters.

Chapter 48

One-loop vacuum polarization in detail

Renormalized UV finite parts of the contributions of the one-loop QED diagrams, which rep-
resent an important portion of the radiative corrections to scattering amplitudes are, in general,
quite complicated functions of external momenta. For instance, the expression for the vertex
correction Γ 𝜇 ( 𝑝′, 𝑝) contains also higher transcendental functions like the dilogarithm (Spence’s
function). On the other hand, the vacuum polarization form factor Π(𝑞 2 ) is relatively simple and
can be expressed fully in terms of elementary functions. A detailed description of this quantity
is the main subject of this chapter.
Let us start with the regularized expressions for Π(𝑞 2 ) that we have obtained in chapters
39 and 40. Including also the coupling factor 𝑒 2 = 4𝜋𝛼 in the formulae (39.19), (40.24), we
have
$$
\Pi^{\rm DR}(q^2) = \frac{e^2}{2\pi^2}\left[\frac{1}{6}\left(\frac{1}{\epsilon} - \gamma_{\rm E} + \ln 4\pi - \ln\frac{m^2}{\mu^2}\right) - \int_0^1 dx\, x(1 - x)\ln\frac{m^2 - x(1 - x)q^2}{m^2}\right] \tag{48.1}
$$
in the dimensional regularization and
$$
\Pi^{\rm PV}(q^2) = \frac{e^2}{2\pi^2}\left[\frac{1}{6}\ln\frac{M^2}{m^2} - \int_0^1 dx\, x(1 - x)\ln\frac{m^2 - x(1 - x)q^2}{m^2}\right] \tag{48.2}
$$
in the Pauli–Villars scheme.
Thus, we see that $\overline{\Pi}(q^2) = \Pi(q^2) - \Pi(0)$ does not depend on the regularization scheme.
One has, using $4\pi\alpha$ instead of $e^2$,
$$
\overline{\Pi}(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1 - x)\ln\frac{m^2 - x(1 - x)q^2}{m^2} . \tag{48.3}
$$

A brief inspection of the integrand in the last expression reveals that one should distinguish three
regions of the 𝑞 2 values, namely
$$
\begin{aligned}
&\text{I)} && q^2 < 0 , \\
&\text{II)} && 0 < q^2 \le 4m^2 , \\
&\text{III)} && q^2 > 4m^2 ,
\end{aligned} \tag{48.4}
$$
with regard to the distinct properties of the quadratic function
𝐶 (𝑥) = 𝑚 2 − 𝑥(1 − 𝑥)𝑞 2 (48.5)

in these kinematical areas. Indeed, for $q^2 < 0$ the function $C(x)$ is positive for any $x \in (0, 1)$
and has real zeroes outside the interval $(0, 1)$, namely
$$
x_\pm = \frac{1}{2}\left(1 \pm \sqrt{1 + \frac{4m^2}{|q^2|}}\right) . \tag{48.6}
$$
In the region II, $C(x)$ is positive for any $x \in (0, 1)$ and has no real roots whatsoever. The region
III is, in a sense, the most interesting case. The function $C(x)$ then has real roots inside the
interval $(0, 1)$, namely
$$
x_\pm = \frac{1}{2}\left(1 \pm \sqrt{1 - \frac{4m^2}{|q^2|}}\right) , \tag{48.7}
$$
and thus it changes sign for 𝑥 between 0 and 1. In particular, 𝐶 (𝑥) < 0 for 𝑥 ∈ (𝑥 − , 𝑥 + )
and 𝐶 (𝑥) > 0 for 𝑥 ∈ (0, 𝑥 − ) ∪ (𝑥 + , 1). However, the negative value of 𝐶 (𝑥) means that the
logarithm in the integrand in (48.3) has a non-trivial imaginary part for 𝑥 ∈ (𝑥 − , 𝑥 + ). Thus,
one may expect that the function Π(𝑞 2 ) will be purely real for 𝑞 2 ∈ (−∞, 4𝑚 2 ) and complex for
𝑞 2 > 4𝑚 2 !
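
The case distinction above is simple to reproduce numerically. The sketch below evaluates the quadratic function (48.5) and its real roots for sample values of 𝑞²; the function names and the sample numbers (in units with 𝑚 = 1) are illustrative assumptions.

```python
import math

def C(x, q2, m):
    """The quadratic function (48.5): C(x) = m^2 - x (1 - x) q^2."""
    return m * m - x * (1.0 - x) * q2

def real_roots(q2, m):
    """Real roots (x_minus, x_plus) of C(x), or None if there are none."""
    if q2 == 0.0:
        return None  # C(x) = m^2 is a positive constant
    disc = 1.0 - 4.0 * m * m / q2
    if disc < 0.0:
        return None
    r = math.sqrt(disc)
    return 0.5 * (1.0 - r), 0.5 * (1.0 + r)

def region(q2, m):
    """Kinematical region of (48.4); q2 = 0 is assigned to II here for convenience."""
    if q2 < 0.0:
        return "I"
    if q2 <= 4.0 * m * m:
        return "II"
    return "III"

if __name__ == "__main__":
    m = 1.0
    for q2 in (-4.0, 2.0, 16.0):
        print(f"q2={q2:6.1f}  region {region(q2, m):3s}  roots={real_roots(q2, m)}  "
              f"C(1/2)={C(0.5, q2, m):.3f}")
```

Only in region III do both roots fall inside the unit interval, and the minimum of the function at 𝑥 = 1/2, equal to 𝑚² − 𝑞²/4, becomes negative there; this is where the logarithm in (48.3) picks up its imaginary part. At the threshold 𝑞² = 4𝑚² the sketch returns a double root at 𝑥 = 1/2, where the function touches zero without changing sign.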
The evaluation of real parts of the integral in (48.3) in the regions I, II, III is elementary,
but somewhat tedious. It is clear that it can be carried out by means of partial integration, which
results in integrating a rational function. We leave it to a hard-working reader as an exercise in
the elementary calculus, and here we only summarize the relevant results.
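I) For $q^2 < 0$, one gets
$$
\overline{\Pi}(q^2) = \frac{\alpha}{3\pi}\left[-\frac{1}{3} + \left(1 - \frac{2m^2}{|q^2|}\right)\left(\sqrt{1 + \frac{4m^2}{|q^2|}}\,\ln\frac{\sqrt{1 + \frac{4m^2}{|q^2|}} - 1}{\sqrt{1 + \frac{4m^2}{|q^2|}} + 1} + 2\right)\right] . \tag{48.8}
$$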

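II) For $0 \le q^2 \le 4m^2$,
$$
\overline{\Pi}(q^2) = \frac{\alpha}{3\pi}\left[-\frac{1}{3} - 2\left(1 + \frac{2m^2}{q^2}\right)\left(\sqrt{\frac{4m^2}{q^2} - 1}\,\arctan\frac{1}{\sqrt{\frac{4m^2}{q^2} - 1}} - 1\right)\right] . \tag{48.9}
$$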

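III) For $q^2 > 4m^2$,

$$
\operatorname{Re}\overline{\Pi}(q^2) = \frac{\alpha}{3\pi}\left[-\frac{1}{3} + \left(1 + \frac{2m^2}{q^2}\right)\left(\sqrt{1 - \frac{4m^2}{q^2}}\,\ln\frac{1 - \sqrt{1 - \frac{4m^2}{q^2}}}{1 + \sqrt{1 - \frac{4m^2}{q^2}}} + 2\right)\right] . \tag{48.10}
$$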
The evaluation of the imaginary part of $\overline{\Pi}(q^2)$ for $q^2 > 4m^2$ is quite simple, so it is worth
doing it here explicitly. To begin with, let us recall the origin of the function $C(x)$ in
the argument of the logarithm in (48.3). Going back to the expression (38.19) for $C$, one cannot
overlook the remark that $m^2$ is to be understood as $m^2 - i\epsilon$, where $\epsilon > 0$ is an infinitesimal
constant, ubiquitous in Feynman propagators. Then, if the real part of $C(x)$ is negative, the
logarithm acquires the imaginary term $-i\pi$. An explanatory comment is perhaps in order here:
Note that the logarithm of a complex variable $z = |z|e^{i\varphi}$ is $\ln z = \ln|z| + i\varphi$ and it has the branch
cut on the real axis, extending from $-\infty$ to 0; with the specification of $m^2$ as $m^2 - i\epsilon$ in mind,
we are on the lower side of the cut, where $\varphi = -\pi$.

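Thus, the evaluation of $\operatorname{Im}\overline{\Pi}(q^2)$ is easy. We have

$$
\overline{\Pi}(q^2) = \operatorname{Re}\overline{\Pi}(q^2) + i\operatorname{Im}\overline{\Pi}(q^2) ,
$$

where
$$
i\operatorname{Im}\overline{\Pi}(q^2) = -\frac{2\alpha}{\pi}\int_{x_-}^{x_+} dx\, x(1 - x)(-i\pi) ,
$$

i.e.
$$
\operatorname{Im}\overline{\Pi}(q^2) = 2\alpha\int_{x_-}^{x_+} dx\, x(1 - x) , \tag{48.11}
$$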

with 𝑥 ± being given by (48.7) as the solution of the quadratic equation

$$
x^2 - x + \frac{m^2}{q^2} = 0 . \tag{48.12}
$$
So, from (48.11) we have

$$
\operatorname{Im}\overline{\Pi}(q^2) = 2\alpha\left[\frac{1}{2}(x_+^2 - x_-^2) - \frac{1}{3}(x_+^3 - x_-^3)\right] . \tag{48.13}
$$

Working out the last expression is a refreshing exercise in high school maths. Indeed, one may
utilize the elementary identity

𝑥+3 − 𝑥 −3 = (𝑥 + − 𝑥 − )(𝑥 +2 + 𝑥 − 𝑥 + + 𝑥−2 ) ,

as well as the properties of roots of the quadratic equation (48.12), such as


$$
x_+ + x_- = 1 , \qquad x_- x_+ = \frac{m^2}{q^2} , \qquad x_+ - x_- = \sqrt{1 - \frac{4m^2}{q^2}} ,
$$

$$
x_+^2 + x_-^2 = x_+ + x_- - \frac{2m^2}{q^2} = 1 - \frac{2m^2}{q^2} .
$$
For (48.13) one then gets, after a simple manipulation,
$$
\mathrm{Im}\,\Pi(q^2) = \frac{\alpha}{3}\left(1 + \frac{2m^2}{q^2}\right)\sqrt{1 - \frac{4m^2}{q^2}} . \tag{48.14}
$$
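
For the reader who wants to see the "simple manipulation" spelled out, here is one possible route; it uses nothing beyond the properties of the roots $x_\pm$ of the quadratic equation $x^2 - x + m^2/q^2 = 0$:

```latex
\begin{align*}
x_+^2 - x_-^2 &= (x_+ - x_-)(x_+ + x_-) = \sqrt{1 - \frac{4m^2}{q^2}} , \\
x_+^3 - x_-^3 &= (x_+ - x_-)\left(x_+^2 + x_-^2 + x_- x_+\right)
               = \sqrt{1 - \frac{4m^2}{q^2}}\left(1 - \frac{m^2}{q^2}\right) , \\
\mathrm{Im}\,\Pi(q^2) &= 2\alpha\sqrt{1 - \frac{4m^2}{q^2}}
   \left[\frac{1}{2} - \frac{1}{3}\left(1 - \frac{m^2}{q^2}\right)\right]
   = \frac{\alpha}{3}\left(1 + \frac{2m^2}{q^2}\right)\sqrt{1 - \frac{4m^2}{q^2}} .
\end{align*}
```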

The last result is in fact highly remarkable. If we denote, provisionally, $q^2 = M^2$ and come back to $e^2 = 4\pi\alpha$, the formula (48.14) reads

$$
\mathrm{Im}\,\Pi(q^2 = M^2) = \frac{e^2}{12\pi}\left(1 + \frac{2m^2}{M^2}\right)\sqrt{1 - \frac{4m^2}{M^2}} . \tag{48.15}
$$
Now it turns out that
$$
\mathrm{Im}\,\Pi(q^2 = M^2) = \frac{\Gamma}{M} , \tag{48.16}
$$
where $\Gamma$ is the rate of the decay of a massive vector boson (“massive photon” with mass $M$) into a pair of fermions, with all particles unpolarized. To appreciate this, the reader is urged to revisit Chapter 24, where we have carried out the calculation of the decay rate $\Gamma$ at the tree level, i.e. using the diagram in Fig. 48.1 (see the result (24.21)). Such a relation, which might seem mysterious at first sight, is in fact not accidental; it is a direct consequence of the $S$-matrix unitarity (a variant of the “optical theorem”). Unfortunately, there is not enough space to discuss this interesting topic in detail now; we may refer the reader to the book [7] for more details. In any case, it is instructive to work out explicitly more examples of this kind, e.g. within a model of Yukawa interaction, etc.

Fig. 48.1: Tree-level graph for the vector boson decay into a fermion-antifermion pair (momentum labels in the figure: $q = k + p$, $k$, $-p$).

Looking back at the formulae (48.8) through (48.10), it is seen that the analytic form
of Π(𝑞 2 ) is different in the regions I, II, III, so one would like to have at hand an instructive
global picture of this quantity. An exact form of the plot can be found e.g. in the book [15]
(see Fig. 5.13 therein; note that an opposite sign convention is used there, so the quantity
displayed is −Π(𝑞 2 )). Let us mention at least several salient features of the function Π(𝑞 2 ).
Some of its simple properties can be established rather easily on the basis of the original integral representation (48.3). For instance, it is clear that $\Pi(0) = 0$ (by its definition) and $\frac{\mathrm{d}}{\mathrm{d}q^2}\Pi(q^2) > 0$ for any $q^2 \in (-\infty, 4m^2)$. Similarly, $\frac{\mathrm{d}}{\mathrm{d}q^2}\mathrm{Re}\,\Pi(q^2) < 0$ for $q^2 \in (4m^2, +\infty)$. The function $\Pi(q^2)$ is continuous at $q^2 = 4m^2$ and it is not difficult to find out that the corresponding value is
$$
\Pi(q^2 = 4m^2) = \frac{8\alpha}{9\pi} . \tag{48.17}
$$
On the other hand, the derivative of Π(𝑞 2 ) is discontinuous at 𝑞 2 = 4𝑚 2 . It is a refreshing exercise
in elementary calculus to show that the left limit of the derivative is infinite, while the right limit
is finite. In any case, (48.17) is the maximum value of the real part of Π(𝑞 2 ) and, because of its
discontinuous derivative, Re Π(𝑞 2 ) exhibits a pronounced peak (spike) at 𝑞 2 = 4𝑚 2 ; such a type
of singularity is commonly called the cusp. Let us also notice that the asymptotic behaviour
of Re Π(𝑞 2 ) for |𝑞 2 | → ∞ is given by the leading logarithm descending from (48.3), namely
(−𝛼/3𝜋) ln |𝑞 2 /𝑚 2 |. The reader is encouraged to verify the above statements explicitly and look
up the picturesque plot shown in [15].
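
As an illustration of how such statements can be checked, the value (48.17) follows in a few lines from the Feynman-parameter representation $\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1\mathrm{d}x\,x(1-x)\ln\bigl[1 - x(1-x)q^2/m^2\bigr]$. The sketch below uses the identity $1 - 4x(1-x) = (1-2x)^2$ and the substitution $u = |1 - 2x|$ (the latter is just one convenient choice):

```latex
\begin{align*}
\Pi(4m^2) &= -\frac{2\alpha}{\pi}\int_0^1 \mathrm{d}x\; x(1-x)\,\ln (1-2x)^2 \\
          &= -\frac{2\alpha}{\pi}\cdot\frac{1}{2}\int_0^1 \mathrm{d}u\;(1-u^2)\ln u \\
          &= -\frac{\alpha}{\pi}\left(-1 + \frac{1}{9}\right) = \frac{8\alpha}{9\pi} ,
\end{align*}
```

where $\int_0^1 \ln u\,\mathrm{d}u = -1$ and $\int_0^1 u^2\ln u\,\mathrm{d}u = -1/9$ have been used.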
Next, we are going to discuss another useful representation of Π(𝑞 2 ) that can be obtained
by an appropriate transformation of the integration variables in the expression (48.3). We have
$$
\Pi(q^2) = -\frac{2\alpha}{\pi}\, I(q^2) , \tag{48.18}
$$
where
$$
I(q^2) = \int_0^1 \mathrm{d}x\; x(1-x)\,\ln\left[1 - x(1-x)\frac{q^2}{m^2}\right] . \tag{48.19}
$$
Let us recall again that in the argument of the logarithm, 1 is to be understood as $1 - i\epsilon$. First, we use the simple substitution $x = \frac{1}{2}(1 - u)$, i.e. $1 - x = \frac{1}{2}(1 + u)$, so that $u \in (-1, 1)$. Performing then a partial integration, one gets, after some manipulations,
$$
I(q^2) = -\frac{1}{2}\,\frac{q^2}{4m^2}\int_0^1 \mathrm{d}u\; u^2\left(1 - \frac{1}{3}u^2\right)\frac{1}{1 - (1-u^2)\frac{q^2}{4m^2}} , \tag{48.20}
$$

where we have also taken into account that the integrand is an even function of the variable u.
As a second step, we employ the substitution
$$
u = \left(1 - \frac{4m^2}{t}\right)^{1/2} . \tag{48.21}
$$
The integral (48.20) is then recast as
$$
I(q^2) = -\frac{1}{6}\, q^2 \int_{4m^2}^{\infty}\frac{\mathrm{d}t}{t}\sqrt{1 - \frac{4m^2}{t}}\left(1 + \frac{2m^2}{t}\right)\frac{1}{t - q^2 - i\epsilon} , \tag{48.22}
$$

where we have also restored the infinitesimal 𝑖𝜖 term implicit in the denominator of the integrand
in (48.20). So, returning to the relation (48.18), one has
$$
\Pi(q^2) = \frac{\alpha}{3\pi}\, q^2 \int_{4m^2}^{\infty}\frac{\mathrm{d}t}{t}\sqrt{1 - \frac{4m^2}{t}}\left(1 + \frac{2m^2}{t}\right)\frac{1}{t - q^2 - i0} , \tag{48.23}
$$

where we have used the symbol 𝑖0 instead of 𝑖𝜖. The formula (48.23) represents another highly
remarkable result that we have achieved here. Why is it so remarkable? Well, looking back at
our result for Im Π, one may notice that (48.23) can be recast as
$$
\frac{1}{q^2}\,\Pi(q^2) = \frac{1}{\pi}\int_{4m^2}^{\infty}\mathrm{d}t\;\frac{\mathrm{Im}\,\Pi(t)}{t}\,\frac{1}{t - q^2 - i0} . \tag{48.24}
$$

It means that the full expression for $\Pi(q^2)$ can be obtained from its imaginary part by means of a relatively simple integral transformation! (Remember how easy it is to get $\mathrm{Im}\,\Pi$: in principle one could just use the “optical theorem” (48.16).) By the way, the subtracted and unsubtracted vacuum polarization functions have the same imaginary part, since the subtraction constant is real.
Again, one must stress that the peculiar relation (48.24) is not accidental. It is called the dispersion relation (please do not confuse it with the same term that is used e.g. in optics for a relation between the wavelength and frequency) and it is a special consequence of Cauchy’s theorem in complex analysis, which in the considered case relies on specific analytic properties of the vacuum polarization function (subtracted or not) extended to the complex plane. Unfortunately, within this introductory text there is not enough space for discussing such a problem in more detail; so, we adopt a pragmatic approach by using the dispersion relation (48.24) when it is helpful and refer the interested reader e.g. to the book [7]. In any case, let us add a technical remark. We know that the subtraction does not change the imaginary part; in fact, the relation (48.16) clearly indicates a priori that the imaginary part of the vacuum polarization has nothing to do with UV divergences. If one takes for granted the form of dispersion relation, one might try to write down such a representation for the unsubtracted function itself, denoted here by $\Pi_{\mathrm{unsub}}(q^2)$, namely
$$
\Pi_{\mathrm{unsub}}(q^2) \overset{?}{=} \frac{1}{\pi}\int_{4m^2}^{\infty}\mathrm{d}t\;\frac{\mathrm{Im}\,\Pi(t)}{t - q^2} \tag{48.25}
$$
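(for the moment, we suppress the symbol $i0$). However, $\mathrm{Im}\,\Pi(t)$ is asymptotically (i.e. for $t \to \infty$) constant (see (48.14)), so the integral in (48.25) diverges logarithmically. This is an alternative manifestation of the UV divergence of the fermion loop (UV divergence “in disguise”). One subtraction (which does not change the analytic properties of the function) is needed to remove the divergence; in particular, subtracting the divergent constant
$$
\Pi_{\mathrm{unsub}}(0) = \frac{1}{\pi}\int_{4m^2}^{\infty}\mathrm{d}t\;\frac{\mathrm{Im}\,\Pi(t)}{t} , \tag{48.26}
$$
i.e. the value of the unsubtracted function at $q^2 = 0$, one gets
$$
\begin{aligned}
\Pi(q^2) = \Pi_{\mathrm{unsub}}(q^2) - \Pi_{\mathrm{unsub}}(0)
 &= \frac{1}{\pi}\int_{4m^2}^{\infty}\mathrm{d}t\left[\frac{\mathrm{Im}\,\Pi(t)}{t - q^2} - \frac{\mathrm{Im}\,\Pi(t)}{t}\right] \\
 &= \frac{1}{\pi}\, q^2\int_{4m^2}^{\infty}\mathrm{d}t\;\frac{\mathrm{Im}\,\Pi(t)}{t\,(t - q^2)} ,
\end{aligned} \tag{48.27}
$$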

and the relation (48.24) is thereby recovered. Thus, the convergent integral representation
(48.24) or (48.27) is rightly called “dispersion relation with one subtraction”. The technique of
dispersion relations is quite useful in some practical calculations, as we will see in what follows.
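
The once-subtracted dispersion integral can also be tested numerically. The sketch below evaluates the representation (48.23) for spacelike $q^2$ (where the integrand has no pole) and compares it with the closed form valid for $q^2 < 0$. The change of variable $v = 4m^2/t$, the midpoint rule, the function names and the numerical value of $\alpha$ are choices made here for illustration only.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate numerical value)


def pi_dispersion(s, n=200_000):
    """Pi(q^2) from the dispersion integral (48.23), with s = q^2/m^2 < 4.

    Substituting v = 4 m^2 / t turns the integral into
        (alpha / 3 pi) * s * int_0^1 dv sqrt(1 - v) (1 + v/2) / (4 - s v),
    which is evaluated by the midpoint rule (no pole for s < 4).
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h
        total += math.sqrt(1.0 - v) * (1.0 + 0.5 * v) / (4.0 - s * v)
    return ALPHA / (3.0 * math.pi) * s * total * h


def pi_closed_spacelike(s):
    """Closed form (48.8), valid for s = q^2/m^2 < 0."""
    a = abs(s)
    r = math.sqrt(1.0 + 4.0 / a)
    bracket = r * math.log((r - 1.0) / (r + 1.0)) + 2.0
    return ALPHA / (3.0 * math.pi) * (-1.0 / 3.0 + (1.0 - 2.0 / a) * bracket)


if __name__ == "__main__":
    for s in (-0.1, -1.0, -10.0, -100.0):
        print(s, pi_dispersion(s), pi_closed_spacelike(s))
```

Agreement of the two columns is a simple consistency check of the chain (48.18) through (48.23).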
In the rest of this chapter we are going to discuss a simple example of one-loop QED radiative correction, namely the correction to the Coulomb potential due to the vacuum polarization. It is one of the earliest and best known results in QED; the original papers concerning this were published in the mid-1930s by Edwin A. Uehling [49] and independently by Robert Serber [48] (of course, without Feynman diagrams, which were born much later). So, let us consider the scattering of a charged spin-$\frac{1}{2}$ fermion (electron) in an external electrostatic field described by the potential $V(\vec{x})$; it means that we employ QED incorporating both quantized and classical electromagnetic fields. In the lowest (first) order, such a scattering process is described by the familiar tree diagram shown in Fig. 48.2. Note that $q = p' - p = (0, \vec{q})$, since the energy is conserved, in contrast to the momentum.

Fig. 48.2: Tree-level graph for the electron scattering in an external electromagnetic field.

The corresponding matrix element is


$$
\mathcal{M}^{(1)} = e\,\bar{u}(p')\gamma_0 u(p)\,U(\vec{q}) , \tag{48.28}
$$
where $U(\vec{q})$ is the Fourier transform of the potential, i.e.
$$
U(\vec{q}) = \int \mathrm{d}^3x\; e^{-i\vec{q}\cdot\vec{x}}\,V(\vec{x}) . \tag{48.29}
$$

The contribution of the vacuum polarization is given by the 3rd order diagram shown in Fig. 48.3, where the fermion bubble is now represented by the renormalized quantity $-i\Pi_{\mu\nu}(q)$ involving the subtracted form factor $\Pi(q^2)$.

Fig. 48.3: Vacuum polarization contribution to electron scattering in an external field.

Using the standard Feynman rules and taking into account the transversality of $\Pi_{\mu\nu}(q)$, as well as the well-known identity $\bar{u}(p')\,\slashed{q}\,u(p) = 0$, one gets for the corresponding matrix element

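$$
\mathcal{M}^{(3)} = e\,\bar{u}(p')\gamma_0 u(p)\,U(\vec{q})\,\bigl[-\Pi(q^2)\bigr] . \tag{48.30}
$$

Thus,
$$
\mathcal{M}^{(1)} + \mathcal{M}^{(3)} = e\,\bar{u}(p')\gamma_0 u(p)\,U(\vec{q})\,\bigl[1 - \Pi(q^2)\bigr] , \tag{48.31}
$$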

where, of course, $q^2 = -|\vec{q}|^2$, since $q = (0, \vec{q})$. Eq. (48.31) means that the sum of diagrams in Figs. 48.2 and 48.3 leads to an “effective potential” whose Fourier transform is

$$
U_{\mathrm{eff}}(\vec{q}) = U(\vec{q})\,\bigl[1 - \Pi(-|\vec{q}|^2)\bigr] . \tag{48.32}
$$

The Coulomb potential is, in our system of units,

$$
V(\vec{x}) = \frac{1}{4\pi}\,\frac{e}{r} , \qquad r = |\vec{x}| , \tag{48.33}
$$
so that its Fourier transform $U(\vec{q})$ becomes
$$
U(\vec{q}) = \frac{e}{|\vec{q}|^2} . \tag{48.34}
$$

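For $\Pi(-|\vec{q}|^2)$, the dispersion relation (48.23) now comes in handy. It reads
$$
\Pi(-|\vec{q}|^2) = -\frac{\alpha}{3\pi}\,|\vec{q}|^2\int_{4m^2}^{\infty}\frac{\mathrm{d}t}{t}\sqrt{1 - \frac{4m^2}{t}}\left(1 + \frac{2m^2}{t}\right)\frac{1}{t + |\vec{q}|^2} , \tag{48.35}
$$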

and from (48.32) along with (48.34) we thus have


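$$
U_{\mathrm{eff}}(\vec{q}) = \frac{e}{|\vec{q}|^2}\left[1 + \frac{\alpha}{3\pi}\,|\vec{q}|^2\int_{4m^2}^{\infty}\frac{\mathrm{d}t}{t}\sqrt{1 - \frac{4m^2}{t}}\left(1 + \frac{2m^2}{t}\right)\frac{1}{t + |\vec{q}|^2}\right] . \tag{48.36}
$$
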
Now, our task is to carry out the inverse Fourier transformation, i.e. evaluate $V_{\mathrm{eff}}(\vec{x})$ according to
$$
V_{\mathrm{eff}}(\vec{x}) = \int\frac{\mathrm{d}^3q}{(2\pi)^3}\;U_{\mathrm{eff}}(\vec{q})\,e^{i\vec{q}\cdot\vec{x}} . \tag{48.37}
$$

For this purpose, one needs the formula

$$
\int\frac{\mathrm{d}^3q}{(2\pi)^3}\;e^{i\vec{q}\cdot\vec{x}}\,\frac{1}{t + |\vec{q}|^2} = \frac{1}{4\pi}\,\frac{e^{-r\sqrt{t}}}{r} \tag{48.38}
$$
(its verification is left as homework for any diligent reader). Thus, one gets
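$$
V_{\mathrm{eff}}(r) = \frac{1}{4\pi}\,\frac{e}{r}\left(1 + \frac{2\alpha}{3\pi}\int_1^{\infty}\mathrm{d}u\; e^{-2mru}\left(1 + \frac{1}{2u^2}\right)\frac{\sqrt{u^2 - 1}}{u^2}\right) , \tag{48.39}
$$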
where we have introduced, instead of 𝑡, a dimensionless integration variable 𝑢 such that 𝑡 =
4𝑚 2 𝑢 2 .
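
For orientation, the two ingredients of this step can be sketched as follows. The first line is the standard evaluation of the Fourier transform of $1/(t + |\vec{q}|^2)$ in spherical coordinates, relying on the tabulated integral $\int_0^\infty \mathrm{d}q\, q\sin(qr)/(q^2 + a^2) = \frac{\pi}{2}e^{-ar}$ for $a, r > 0$; the second line collects the effect of the substitution $t = 4m^2u^2$ on the individual factors of the integrand.

```latex
\begin{align*}
\int\!\frac{\mathrm{d}^3q}{(2\pi)^3}\,\frac{e^{i\vec{q}\cdot\vec{x}}}{t + |\vec{q}|^2}
  &= \frac{1}{(2\pi)^2}\int_0^\infty \mathrm{d}q\,\frac{q^2}{t + q^2}
     \int_{-1}^{1}\mathrm{d}(\cos\theta)\,e^{iqr\cos\theta}
   = \frac{1}{2\pi^2 r}\int_0^\infty \mathrm{d}q\,\frac{q\,\sin(qr)}{q^2 + t}
   = \frac{1}{4\pi}\,\frac{e^{-r\sqrt{t}}}{r} , \\[4pt]
t = 4m^2u^2 :\quad
  & \frac{\mathrm{d}t}{t} = \frac{2\,\mathrm{d}u}{u} , \qquad
    \sqrt{1 - \frac{4m^2}{t}} = \frac{\sqrt{u^2 - 1}}{u} , \qquad
    1 + \frac{2m^2}{t} = 1 + \frac{1}{2u^2} , \qquad
    e^{-r\sqrt{t}} = e^{-2mru} .
\end{align*}
```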
So, the result of our calculation can be summarized in an elegant form by writing it as
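$$
V_{\mathrm{eff}}(r) = \frac{e}{4\pi r}\,Q(r) , \tag{48.40}
$$
where $Q(r)$ is an “effective charge” depending on the distance,
$$
Q(r) = 1 + \frac{2\alpha}{3\pi}\int_1^{\infty}\mathrm{d}u\; e^{-2mru}\left(1 + \frac{1}{2u^2}\right)\frac{\sqrt{u^2 - 1}}{u^2} . \tag{48.41}
$$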

It is most instructive to find the limiting behaviour of $Q(r)$ for small and large $r$. It turns out that for leading terms one gets

$$
Q(r) = 1 + \frac{\alpha}{3\pi}\,\ln\frac{1}{(mr)^2} + \ldots , \qquad mr \ll 1 \tag{48.42}
$$

and
$$
Q(r) = 1 + \frac{\alpha}{4\sqrt{\pi}\,(mr)^{3/2}}\,e^{-2mr} + \ldots , \qquad mr \gg 1 . \tag{48.43}
$$
Thus, it is seen that at large distances, one has the Coulomb potential with the usual charge 𝑒,
while at small distances the effective charge grows logarithmically and goes to infinity for 𝑟 → 0.
A derivation of the results (48.42) and (48.43) is left as a challenge for calculus aficionados.
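
Short of an analytic derivation, the two limits can at least be probed numerically. The sketch below evaluates the correction $Q(r) - 1$ from the integral (48.41) and compares it with the leading terms (48.42) and (48.43). The substitution $u = 1 + w^2$, the Simpson rule, the cutoff of the exponential tail, the function names and the value of $\alpha$ are assumptions made for this illustration; only $Q - 1$ is returned, since at large $mr$ the correction is far below double-precision resolution when added to 1.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate numerical value)


def scaled_integral(mr, n=20_000, cutoff=40.0):
    """J = int_1^inf du exp(-2 m r (u - 1)) (1 + 1/(2u^2)) sqrt(u^2 - 1) / u^2.

    The substitution u = 1 + w^2 smooths the square-root endpoint; the upper
    limit is cut where the exponential factor has dropped to exp(-cutoff).
    Composite Simpson rule with n (even) subintervals.
    """
    w_max = math.sqrt(cutoff / (2.0 * mr))
    h = w_max / n

    def f(w):
        u = 1.0 + w * w
        root = w * math.sqrt(w * w + 2.0)          # sqrt(u^2 - 1)
        return (math.exp(-2.0 * mr * w * w) * (1.0 + 0.5 / (u * u))
                * root / (u * u) * 2.0 * w)          # du = 2 w dw

    total = f(0.0) + f(w_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(k * h)
    return total * h / 3.0


def delta_q(mr):
    """Q(r) - 1 from (48.41)."""
    return 2.0 * ALPHA / (3.0 * math.pi) * math.exp(-2.0 * mr) * scaled_integral(mr)


def delta_q_small_r(mr):
    """Leading logarithm of (48.42)."""
    return ALPHA / (3.0 * math.pi) * math.log(1.0 / (mr * mr))


def delta_q_large_r(mr):
    """Leading term of (48.43)."""
    return ALPHA / (4.0 * math.sqrt(math.pi) * mr ** 1.5) * math.exp(-2.0 * mr)


if __name__ == "__main__":
    for mr in (1e-3, 1e-2, 5.0, 20.0):
        print(mr, delta_q(mr), delta_q_small_r(mr), delta_q_large_r(mr))
```

Note that (48.42) fixes only the coefficient of the logarithm, so at small $mr$ it is the change of $Q$ between two distances, rather than $Q$ itself, that should be compared with the leading term; at large $mr$ the ratio to (48.43) approaches 1 only slowly.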
Intuitively, such a picture now justifies the term “vacuum polarization” that we have
been using from Chapter 38 on. The idea is that the QED vacuum behaves, in a sense, like a
polarizable medium that contributes to “screening” of the charge at larger distances by means
of a “cloud of virtual electron–positron pairs” appearing in the loops. The relation between the
values of $Q(r)$ at large and small distances is thus analogous to the relation between renormalized and bare electromagnetic coupling: recall that according to (46.37) one has $e^2 = \bigl(1 - \Pi(0)\bigr)\,e_0^2$, with $\Pi(0) > 0$ (here $\Pi(0)$ stands for the unsubtracted quantity). Thus, $e^2 < e_0^2$, i.e. the vacuum polarization $\Pi(0)$ causes the screening of the bare coupling constant in QED.

Chapter 49

Calculable quantities:
UV finite without counterterms

We already know that in QED there are one-loop Feynman diagrams that are UV finite, and
thus do not generate renormalization counterterms. They correspond precisely to situations, in
which the would-be counterterms with dimension greater than four could destroy the perturbative
renormalizability of the theory in higher orders. In this context, a particularly interesting case
is the fermion box, discussed in detail in Chapter 43. Such a diagram is the only contribution
to an intriguing process, namely the photon–photon elastic scattering (or, as it is often called,
the light-by-light scattering). Such a process certainly does not exist at the classical level,
i.e. it is a purely quantum effect. A detailed calculation of the contribution of the box diagram
shown in Fig. 43.7 is unfortunately very tedious; explicit results are available e.g. in the book [7].
Nevertheless, one can get at least a reasonable estimate of the cross section of the photon–photon
scattering in the low-energy limit, e.g. for visible light. The technique appropriate for such a
purpose is based on the idea of an effective Lagrangian; so, the discussion of the scattering
process in question provides us with an opportunity to illustrate the power and modest beauty of
this rather general method.
The basic idea is in fact quite simple. If one considers photon energies (𝐸 𝛾 ) much smaller
than the rest mass (𝑚) of the virtual charged particle (the electron, for definiteness) inside the
loop, one can assume that the contribution of the closed-loop diagram is approximately equal
to the contribution of a 1st order (tree-level) graph corresponding to an effective Lagrangian
for direct interaction of four photons. Now, before proceeding further, a remark is in order
here. An assumption of the above-mentioned kind should sound familiar to anybody acquainted
with fundamentals of weak interaction theory. There, the original Fermi-type model of a direct
four-fermion interaction is a low-energy effective theory corresponding to the deeper underlying
theory (the standard electroweak model) involving intermediate vector bosons (IVB) as the “force
carriers”. The “matching condition” for those two theories to be equivalent in the low-energy limit is $G_F \propto g^2/m_W^2$, where $G_F$ is the Fermi coupling constant, $g$ is a dimensionless coupling for the IVB interactions and $m_W$ is the IVB mass.
Returning to the problem of the photon–photon scattering in QED, our assumption
corresponds, schematically, to the approximate equality depicted in Fig. 49.1 for 𝐸 𝛾 ≪ 𝑚; the
main problem is how to estimate the interaction strength 𝐺 corresponding to the heavy dot in
the tree diagram appearing there.
Fig. 49.1: Schematic depiction of the origin of an effective Lagrangian for the light-by-light scattering in QED.

To begin with, one must take into account that an effective four-photon Lagrangian should be gauge invariant (since the underlying QED Lagrangian is so), i.e. it must be made of a product of four electromagnetic field tensors $F_{\mu\nu}$; appropriate forms would be e.g.

$$
F_{\alpha\beta}F^{\beta\mu}F_{\mu\nu}F^{\nu\alpha} , \tag{49.1}
$$

or
$$
\left(F_{\mu\nu}F^{\mu\nu}\right)^2 . \tag{49.2}
$$
We will discuss the relevant 𝐹𝐹𝐹𝐹 structures later on; for now, it is sufficient to notice that the
mass dimension of a term like this is eight. This in turn means that the effective coupling 𝐺 has
the dimension minus four (since the full Lagrangian has the dimension four). Obviously, the
coupling 𝐺 must be made of the electron mass (in general, the mass of the charged particle in
the loop) and, because of the equality depicted in Fig. 49.1, it must be proportional to 𝑒 4 (i.e.
𝛼2 ). Thus, the matching condition reads

$$
G = \mathrm{const.}\;\frac{\alpha^2}{m^4} \tag{49.3}
$$
(notice that the last relation is analogous to the above-mentioned matching condition for the Fermi constant). Now, estimating the scattering cross section is an easy task. The contribution of the tree-level effective diagram in Fig. 49.1 is proportional to $G$, so that the cross section is proportional to $G^2$. The dimension of cross section is $(\mathrm{length})^2$, i.e. $(\mathrm{mass})^{-2}$, and apart from $G^2$, it is proportional to an appropriate power of photon energy; since $G^2$ carries the dimension $(\mathrm{mass})^{-8}$, six powers of energy are needed. So, it is clear that the cross section we are trying to estimate behaves like

$$
\sigma(E_\gamma)\Big|_{E_\gamma \ll m} = \mathrm{const.}\;\frac{\alpha^4}{m^8}\,E_\gamma^6 . \tag{49.4}
$$
This is the key result: in the low-energy limit, the light-by-light scattering cross section is proportional to $E_\gamma^6$! Note that our derivation has been rather simple, two basic ingredients being gauge invariance and dimensional arguments, but the result (49.4) is in fact quite strong. Concerning the numbers, for visible light, $E_\gamma \simeq 2$–$3$ eV; thus, taking into account that the electron mass is $m \simeq 0.5$ MeV, $\alpha = 1/137$ and invoking the conversion constant $\hbar c = 197$ MeV fm with $1\ \mathrm{fm} = 10^{-13}$ cm, one has $1\ \mathrm{MeV}^{-2} \simeq 4 \times 10^{-22}\ \mathrm{cm}^2$. Then one gets

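$$
\sigma(\gamma\gamma \to \gamma\gamma)\Big|_{\text{visible light}} \sim \mathrm{const.}\times\left(10^{-62}\text{ to }10^{-61}\right)\ \mathrm{cm}^2 . \tag{49.5}
$$
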

So, for the visible light the cross section is terribly small and thus it is not surprising that it has
not been observed yet.
Let us now review some more detailed results available in the literature. The effective low-energy four-photon Lagrangian resulting from the calculation of the box diagram in Fig. 49.1 can be written as a linear combination of the expressions (49.1) and (49.2). In fact, it is more common to use another basis, consisting of $(F^2)^2 = (F_{\mu\nu}F^{\mu\nu})^2$ and $(F\cdot\tilde{F})^2 = (F_{\mu\nu}\tilde{F}^{\mu\nu})^2$, where $\tilde{F}_{\mu\nu}$ is the dual of $F_{\mu\nu}$, i.e.
$$
\tilde{F}_{\mu\nu} = \frac{1}{2}\,\epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma} . \tag{49.6}
$$
With some effort, utilizing the properties of the Levi-Civita symbol $\epsilon_{\mu\nu\rho\sigma}$, one finds out that

$$
\left(F\cdot\tilde{F}\right)^2 = -2\left(F^2\right)^2 + 4\,\langle FFFF\rangle , \tag{49.7}
$$

where ⟨𝐹𝐹𝐹𝐹⟩ is a shorthand notation for the expression (49.1). A detailed evaluation of the
box diagram (including, of course, all permutations of external photon lines) leads to the result

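$$
\mathcal{L}_{\mathrm{eff}} = \frac{\alpha^2}{90\,m^4}\left[\left(F^2\right)^2 + \frac{7}{4}\left(F\cdot\tilde{F}\right)^2\right] . \tag{49.8}
$$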

This can be recast in a form that is most usual in current literature, namely

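$$
\mathcal{L}_{\mathrm{eff}} = \frac{2\alpha^2}{45\,m^4}\left[\left(\vec{E}^2 - \vec{B}^2\right)^2 + 7\left(\vec{E}\cdot\vec{B}\right)^2\right] , \tag{49.9}
$$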

where $\vec{E}$ and $\vec{B}$ denote the strength of the electric and magnetic field, respectively. To arrive at (49.9) from (49.8), one employs the identities
$$
\vec{E}^2 - \vec{B}^2 = -\frac{1}{2}\,F_{\mu\nu}F^{\mu\nu} , \qquad \vec{E}\cdot\vec{B} = \frac{1}{4}\,F_{\mu\nu}\tilde{F}^{\mu\nu} . \tag{49.10}
$$
The result (49.9) is called the Euler–Heisenberg Lagrangian, since, historically, it was derived
first in mid 1930s by Werner Heisenberg and his doctoral student Hans Euler (needless to say,
before the advent of Feynman diagrams). Since then, it was recovered by other authors using
different techniques, including, of course, the direct evaluation of the box diagram. Nice reviews
of this time-honoured subject can be found in ref. [50]. A lot of useful information is contained
also in the book [15] (including some remarkable details of Euler’s biography).
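
The passage from (49.8) to (49.9) is a one-line substitution; it is written out here for convenience. Only the squares of the identities (49.10) enter, so the result does not depend on the sign convention for the Levi-Civita symbol:

```latex
\begin{align*}
\left(F^2\right)^2 &= \left(F_{\mu\nu}F^{\mu\nu}\right)^2 = 4\left(\vec{E}^2 - \vec{B}^2\right)^2 , \qquad
\left(F\cdot\tilde{F}\right)^2 = 16\left(\vec{E}\cdot\vec{B}\right)^2 , \\
\mathcal{L}_{\mathrm{eff}} &= \frac{\alpha^2}{90\,m^4}\left[4\left(\vec{E}^2 - \vec{B}^2\right)^2
      + \frac{7}{4}\cdot 16\left(\vec{E}\cdot\vec{B}\right)^2\right]
  = \frac{2\alpha^2}{45\,m^4}\left[\left(\vec{E}^2 - \vec{B}^2\right)^2
      + 7\left(\vec{E}\cdot\vec{B}\right)^2\right] .
\end{align*}
```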
It is perhaps worth mentioning that an effective Lagrangian of the Euler–Heisenberg type
can also be evaluated for other QED models. In particular, for scalar QED the contributing
one-loop diagrams are shown in Chapter 47 (see Fig. 47.8); the sum of their contributions leads
to the effective low-energy Lagrangian of the form

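$$
\mathcal{L}_{\mathrm{eff}}^{(\mathrm{scalar})} = \frac{\alpha^2}{1440\,m_S^4}\left[7\left(F^2\right)^2 + \left(F\cdot\tilde{F}\right)^2\right] , \tag{49.11}
$$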

where 𝑚 𝑆 is the mass of the charged scalar in the relevant closed loops. By the way, this result
has been recovered in the paper [51]. In this paper, one can also find the result for QED of
massive charged spin-1 particles (i.e. electromagnetic interactions of 𝑊 ± bosons within SM).
The calculation is indeed a formidable task and the result is

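$$
\mathcal{L}_{\mathrm{eff}}^{(\mathrm{vector})} = \frac{\alpha^2}{160\,m_W^4}\left[29\left(F^2\right)^2 + 27\left(F\cdot\tilde{F}\right)^2\right] \tag{49.12}
$$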

(note that in such a case the relevant QED one-loop diagrams are topologically the same as in
scalar QED, with the obvious replacement of propagators and the rule for vertices).

Let us return to the “textbook case” of spinor QED. When one employs the effective
Euler–Heisenberg Lagrangian (49.9), the evaluation of the low-energy cross section for photon–
photon scattering is straightforward, though somewhat tedious. For unpolarized photons one
gets the result (see e.g. [6])

$$
\sigma(\gamma\gamma \to \gamma\gamma)\Big|_{E_\gamma \ll m} = \frac{1}{2\pi}\,\frac{7784}{89100}\,\frac{\alpha^4}{m^8}\,E_\gamma^6 . \tag{49.13}
$$
Note that the numerical factor in (49.13) turns out to be roughly 0.01, so one may say that our
estimate based on the simple form (49.4) is not so bad.
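
The arithmetic behind such estimates is easily reproduced. The sketch below evaluates $\mathrm{const.}\times\alpha^4E_\gamma^6/m^8$ in cm² for photon energies in the visible range, once with the constant set to 1 (an arbitrary illustrative choice, since the dimensional argument leaves it undetermined) and once with the numerical factor as quoted in (49.13). The rounded inputs are those given in the text; the names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.0        # fine-structure constant, value used in the text
M_E_MEV = 0.5              # electron mass in MeV (rounded value from the text)
HBARC_MEV_FM = 197.0       # conversion constant hbar*c in MeV fm
FM_IN_CM = 1.0e-13         # 1 fm expressed in cm

# 1 MeV^-2 expressed in cm^2 (about 4e-22 cm^2)
MEV_M2_IN_CM2 = (HBARC_MEV_FM * FM_IN_CM) ** 2

# Numerical factor in front of alpha^4 E^6 / m^8 as quoted in (49.13)
COEFF_49_13 = 7784.0 / 89100.0 / (2.0 * math.pi)


def sigma_low_energy_cm2(e_gamma_mev, const=1.0, m_mev=M_E_MEV):
    """const * alpha^4 * E^6 / m^8, converted from MeV^-2 to cm^2."""
    return const * ALPHA ** 4 * e_gamma_mev ** 6 / m_mev ** 8 * MEV_M2_IN_CM2


if __name__ == "__main__":
    print("1 MeV^-2 in cm^2:", MEV_M2_IN_CM2)
    print("numerical factor of (49.13):", COEFF_49_13)
    for e_ev in (2.0, 3.0):
        e_mev = e_ev * 1.0e-6
        print(e_ev, "eV:",
              sigma_low_energy_cm2(e_mev),               # const = 1
              sigma_low_energy_cm2(e_mev, COEFF_49_13))  # factor from (49.13)
```

Because of the sixth power of the energy, the result changes by an order of magnitude between 2 eV and 3 eV, so only the order of magnitude of such numbers is meaningful.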
If one goes beyond the low-energy region, the cross section 𝜎(𝛾𝛾 → 𝛾𝛾) depends quite
strongly on the photon energy. The analytic expression is, of course, rather complicated (see
e.g. the book [7]), but the plot of the energy dependence of 𝜎 is quite instructive and the reader
is encouraged to look it up in Chapter 12 of [7] (Fig. 24 therein). It is remarkable that the cross
section in question, which is so tiny for the visible light, becomes quite sizable (at least by the
particle physics standards) in the region of photon energies of the order of 1 MeV. The maximum value of $\sigma$ is about $10^{-30}\ \mathrm{cm}^2$ and corresponds to the photon collision energy in the c.m. system close to the threshold for production of an electron–positron pair (i.e. for the inelastic process $\gamma\gamma \to e^-e^+$). At this threshold, the plot of $\sigma(E_\gamma)$ has a cusp singularity corresponding to the discontinuous derivative of $\sigma$. For $E_\gamma \to \infty$ the cross section decreases rapidly to zero.
Finally, let us remark that a measurement of such a cross section is an extremely challeng-
ing task, for obvious reasons. Nevertheless, quite recently, the collaboration ATLAS at LHC
(CERN) announced an observation of events of the desired type, in the so-called ultraperipheral
collisions of heavy ions; in fact, what is observed is a collision of “quasi-real” virtual photons
constituting the electromagnetic fields of colliding lead ions (see [52]).
The matrix element for the photon–photon scattering is, as we have emphasized, UV
finite in QED and no particular counterterm is needed. Generally, such a quantity is called
“calculable”.²⁹ There is another prominent example of a quantity of such a type, namely the
magnetic moment of the electron (or muon, if you want). The lowest-order prediction for
this follows already from the Dirac equation considered as an equation of relativistic quantum
mechanics (see Chapter 2). Within QED, there is a calculable radiative correction to the lowest-
order result. The successful calculation of such an $O(\alpha)$ correction to the electron magnetic moment became a breakthrough argument in favour of QED at the end of the 1940s and quantum field theory then entered its modern era, in a sense its “golden age”.³⁰ The evaluation of the so-called Schwinger correction mentioned above is the subject of the next chapter.

²⁹ In Czech: “spočitatelná” or “vyčíslitelná” veličina.

³⁰ It is fair to note that the term “golden age” should be taken with a grain of salt. In particular, while there had been many precision calculations within renormalized QED, the late 1950s and 1960s brought a sort of crisis of the QFT status, which was mostly due to unsatisfactory description of strong and weak interactions by means of the available QFT models. Thus, many theorists in the particle physics community rejected QFT at that time as a fundamental concept and looked for radically different alternatives. Nevertheless, the ultimate change of the paradigm in favour of QFT came in the early 1970s with the advent of the non-Abelian gauge theory of weak and strong interactions, i.e. the highly successful scheme now commonly called the standard model of particle physics. Concerning this, the reader may find relevant information in several books cited here, see e.g. [14, 22, 27] and [31].

Chapter 50

Schwinger correction

As we have recalled at the end of the preceding chapter, one of the highlights of the original
Dirac theory is a prediction of the value of the spin magnetic moment of the electron; we have
discussed various aspects of this achievement in some detail in Chapters 2 and 13. In particular, in Chapter 13 we have stressed that the original result
$$
\mu_e = \mu_B = \frac{e}{2m} \tag{50.1}
$$
(where 𝜇B is the common symbol for the Bohr magneton) is in fact a “conditional” prediction,
as it depends on an additional assumption concerning the interaction of the Dirac particle with
an external electromagnetic field: for deriving the result (50.1), it is assumed that the interaction
is “minimal”, i.e. it is defined by means of the familiar replacement 𝜕𝜇 → 𝜕𝜇 − 𝑖𝑒 𝐴 𝜇 in the
equation for the free particle. Apparently, such a recipe is appropriate for a pointlike particle,
like the electron or muon (but, as we know, it would be insufficient for a phenomenological
description of the proton or neutron).
In any case, the theoretical derivation of the result (50.1) was a stunning success in
its time, since it is indeed very close to the experimental value of 𝜇 𝑒 . However, some high-
precision measurements carried out in 1940s (see Chapter 13) have shown that there is in fact
a small deviation from the Dirac prediction, with the relative magnitude of about 0.001. So,
it has become an obvious challenge for theorists to explain such a tiny discrepancy within the
framework of QED, which at that time was developing rapidly. The first successful calculation of the QED correction to the Dirac prediction is due to J. Schwinger, who did it in 1948 without using Feynman diagrams (he never used them). In what follows, we are going to derive Schwinger’s famous result with the help of standard techniques of Feynman diagrams.
First of all, we have to find out how to extract the desired magnetic moment from the matrix element for the electron scattering in an external field. QED Feynman diagrams of the first and third order that might be of interest are shown in Fig. 50.1 (we have already discussed the diagrams (a) and (b) in Chapter 26; here they are all summarized for the reader’s convenience).
Before proceeding to the diagram calculation, let us outline our basic strategy. We will consider the electron scattering in a static, spatially homogeneous magnetic field, i.e. use the four-potential of the form
$$
A_\mu(x) = \bigl(0, A_j(\vec{x})\bigr) , \tag{50.2}
$$
with the vector $A_j(\vec{x})$ chosen so that the field strength
$$
B^j = \epsilon^{jkl}\,\partial_k A_l(\vec{x}) \tag{50.3}
$$
is a constant. Since the spin magnetic moment is a static characteristic of a particle, we will examine the relevant matrix element for non-relativistic, quasi-static electrons; then, one may envisage obtaining a form involving the scalar product of $\vec{B}$ with the vector of the would-be magnetic moment, proportional to Pauli spin matrices. The corresponding coefficient of proportionality is then identified as the value of the electron magnetic moment in question.

Fig. 50.1: (a) 1st order diagram for electron scattering in an external electromagnetic field; (b) and (c) are 3rd order diagrams involving both external and quantized field.

So, now we are ready to evaluate the relevant diagrams in Fig. 50.1, keeping in mind the
hint sketched above. For the contribution of the diagram (a) one has
$$
\mathcal{M}_a = e\,\bar{u}(p')\gamma^j u(p)\,\tilde{A}_j(\vec{q}) , \tag{50.4}
$$

where $\tilde{A}_j(\vec{q})$ is the Fourier transform of the vector potential in (50.2) (note that in what follows, tilde always denotes the Fourier transform). For our purpose it will be useful to employ the Gordon decomposition (see (C.28) in Appendix C)
$$
\bar{u}(p')\gamma^j u(p) = \frac{1}{2m}\,\bar{u}(p')\bigl[(p + p')^j + i\sigma^{jk}q_k\bigr]u(p) , \tag{50.5}
$$
separating the “convective” and “spin” parts of the Dirac current. Let us recall that $\sigma_{jk} = \frac{i}{2}[\gamma_j, \gamma_k]$, and this is simply related to the spin matrices $\vec{\Sigma}$ through the identity (see (C.36))
$$
\sigma_{jk} = \epsilon_{jkl}\,\Sigma^l . \tag{50.6}
$$
Needless to say, in the standard representation of $\gamma$-matrices one has
$$
\vec{\Sigma} = \begin{pmatrix} \vec{\sigma} & 0 \\ 0 & \vec{\sigma} \end{pmatrix} , \tag{50.7}
$$
where $\vec{\sigma}$ are the Pauli matrices.
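
The relation between $\sigma^{jk}$ and the spin matrices can be verified by brute force. The following sketch builds the spatial $\gamma$-matrices in the standard (Dirac) representation, $\gamma^j = \begin{pmatrix} 0 & \sigma_j \\ -\sigma_j & 0\end{pmatrix}$, and checks that $\frac{i}{2}[\gamma^j, \gamma^k] = \epsilon^{jkl}\Sigma^l$ for all spatial index pairs. It uses plain nested lists so as to need no external library; all helper names are hypothetical.

```python
# Pauli matrices as 2x2 nested lists
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(factor, m):
    return [[factor * x for x in row] for row in m]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Standard (Dirac) representation: gamma^j = [[0, sigma_j], [-sigma_j, 0]]
GAMMA = [block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI]
# Spin matrices (50.7): Sigma^l = diag(sigma_l, sigma_l)
SPIN = [block(s, ZERO2, ZERO2, s) for s in PAULI]


def levi_civita(j, k, l):
    """Levi-Civita symbol for indices taking the values 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2


def sigma_tensor(j, k):
    """sigma^{jk} = (i/2) [gamma^j, gamma^k]."""
    commutator = add(matmul(GAMMA[j], GAMMA[k]),
                     scale(-1, matmul(GAMMA[k], GAMMA[j])))
    return scale(0.5j, commutator)


def epsilon_times_spin(j, k):
    """Sum over l of epsilon^{jkl} Sigma^l, the right-hand side of (50.6)."""
    total = [[0] * 4 for _ in range(4)]
    for l in range(3):
        total = add(total, scale(levi_civita(j, k, l), SPIN[l]))
    return total


def identity_holds(tol=1e-12):
    """Compare both sides of (50.6) entry by entry for all j, k."""
    return all(
        abs(x - y) < tol
        for j in range(3) for k in range(3)
        for row_a, row_b in zip(sigma_tensor(j, k), epsilon_times_spin(j, k))
        for x, y in zip(row_a, row_b)
    )


if __name__ == "__main__":
    print(identity_holds())
```

Since two spatial indices are lowered at once, the same check covers $\sigma_{jk}$ as well as $\sigma^{jk}$.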
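Let us now work out the part of the matrix element (50.4) involving the spin matrices $\sigma^{jk}$. We have
$$
\tilde{A}_j(\vec{q}) = \int \mathrm{d}^3x\; e^{-i\vec{q}\cdot\vec{x}}\,A_j(\vec{x}) ,
$$

and thus
$$
q_k\,\tilde{A}_j(\vec{q}) = i\int \mathrm{d}^3x\; e^{-i\vec{q}\cdot\vec{x}}\,\partial_k A_j(\vec{x}) . \tag{50.8}
$$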
Then, using the identities (50.5) and (50.6) (note also that $\sigma^{jk} = \sigma_{jk}$), the expression (50.4) is recast as
$$
\mathcal{M}_a = \frac{e}{2m}\,\bar{u}(p')\bigl[(p + p')^j + i\epsilon^{jkl}\Sigma^l q_k\bigr]u(p)\,\tilde{A}_j(\vec{q}) . \tag{50.9}
$$

Furthermore, from (50.8), along with (50.3), one gets

$$
i\tilde{B}^l(\vec{q}) = \epsilon^{lkj}\,q_k\,\tilde{A}_j(\vec{q}) . \tag{50.10}
$$

Thus, the matrix element (50.9) becomes finally

$$
\mathcal{M}_a = \frac{e}{2m}\,\bar{u}(p')u(p)\,(p + p')^j\,\tilde{A}_j(\vec{q}) + \frac{e}{2m}\,\bar{u}(p')\bigl(\vec{\Sigma}\cdot\tilde{\vec{B}}\bigr)u(p) \tag{50.11}
$$
(when passing from (50.9), (50.10) to (50.11) please don’t forget that $\epsilon^{jkl} = -\epsilon^{lkj}$).
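
The two steps just described can be sketched explicitly. The index placement assumed below (lower indices on $q_k$, $\partial_k$ and $A_j$; $B^l = \epsilon^{lkj}\partial_k A_j$; $q_k\tilde{A}_j = i\int\mathrm{d}^3x\,e^{-i\vec{q}\cdot\vec{x}}\partial_k A_j$) is the one under which all the signs quoted above come out consistently:

```latex
\begin{align*}
\tilde{B}^l(\vec{q}) &= \int \mathrm{d}^3x\; e^{-i\vec{q}\cdot\vec{x}}\,\epsilon^{lkj}\,\partial_k A_j(\vec{x})
   = \epsilon^{lkj}\bigl(-i\,q_k\,\tilde{A}_j(\vec{q})\bigr)
   \quad\Longrightarrow\quad
   i\tilde{B}^l(\vec{q}) = \epsilon^{lkj}\,q_k\,\tilde{A}_j(\vec{q}) , \\
i\,\epsilon^{jkl}\,\Sigma^l\,q_k\,\tilde{A}_j
  &= -i\,\Sigma^l\,\epsilon^{lkj}\,q_k\,\tilde{A}_j
   = -i\,\Sigma^l\,\bigl(i\tilde{B}^l\bigr)
   = \vec{\Sigma}\cdot\tilde{\vec{B}} .
\end{align*}
```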
For a constant $\vec{B}$, its Fourier transform is proportional to $\delta^{(3)}(\vec{q})$ (thus, in such a scattering process both energy and momentum are conserved, as it must be for any static homogeneous external field); having it in mind, in what follows we will write simply $\vec{\Sigma}\cdot\vec{B}$ in the second term in the expression (50.11). Now, in the spirit of the strategy outlined above, we restrict ourselves to quasi-static electrons. Appealing again to the reader’s previous training in diracology, let us recall that in such a case, $u(p)$ (as well as $u(p')$) is reduced to
$$
u(p) = \begin{pmatrix} w \\ 0 \end{pmatrix} , \tag{50.12}
$$

i.e. only the upper two components survive, and the two-component column $w$ embodies the two possible electron spin states. Note that we have in mind the standard representation of $\gamma$-matrices; so, we are going to utilize the form (50.7) for $\vec{\Sigma}$. Moreover, since

$$
\gamma_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
$$

in the standard representation, the matrix products in (50.11) obviously become

$$
\bar{u}(p')u(p) = w_f^\dagger w_i , \qquad
\bar{u}(p')\,\vec{\Sigma}\cdot\vec{B}\,u(p) = w_f^\dagger\,\vec{\sigma}\cdot\vec{B}\,w_i , \tag{50.13}
$$

where 𝑖 and 𝑓 are usual labels for the initial and final state, respectively.
In fact, experimental measurements of the magnetic moment are based on detecting spin-flip transitions from the initial to final state. This means that the first term in the matrix element (50.11), being proportional to $w_f^\dagger w_i$, is irrelevant for our purpose; we are left with the non-trivial spin contribution
$$
\mathcal{M}_a = \frac{e}{2m}\,w_f^\dagger\,\vec{\sigma}\cdot\vec{B}\,w_i , \tag{50.14}
$$
and there one can see the value of the electron magnetic moment corresponding precisely to the original Dirac prediction (50.1). This is just the anticipated result.
Note that, in musical terms, the tempo we have chosen for the preceding exposition was
“andante”, or rather “largo”, but it has not been a loss of time or energy, because in most of the
available textbooks the details of the relevant calculation are usually skipped; so, our lengthy
discussion has been presented here for the reader’s convenience.
Now, we would like to calculate the $O(\alpha)$ radiative correction to the lowest-order value (50.1) that might originate in one-loop diagrams in Figs. 50.1(b),(c). To this end, we will need a formula describing the general form of the matrix element in question. It turns out that the matrix element for the electron scattering in an arbitrary external field $A_\mu(x)$ can be written, in general, as follows:
$$
\mathcal{M} = e\,\bar{u}(p')\left[F_1(q^2)\,\gamma^\mu + \frac{i}{2m}\,\sigma^{\mu\nu}q_\nu\,F_2(q^2)\right]u(p)\,\tilde{A}_\mu(q) , \tag{50.15}
$$
where 𝐹1 and 𝐹2 are two independent form factors. Note that the form (50.15) does not depend
on the perturbation expansion. Obviously, the tree-level matrix element (50.4) has such a form,
with 𝐹1 (𝑞 2 ) = 1 and 𝐹2 (𝑞 2 ) = 0. If we stick to the on-shell renormalization scheme, the form
factor 𝐹1 (𝑞 2 ) is normalized by the condition 𝐹1 (0) = 1; this corresponds to the subtraction in
the vertex function Γ𝜇 ( 𝑝′, 𝑝) at 𝑞 = 0 (see (47.10)). The derivation of the formula (50.15) can be
found in the Appendix F. Here, let us emphasize its important consequences for our calculation
of the desired radiative correction.
We already know that for obtaining the value of the magnetic moment, the kinematical configuration with four-momentum transfer $q = (0, \vec{q})$, $\vec{q} = 0$ is considered. Now, the contribution of the diagram (b) in Fig. 50.1 is just the tree-level matrix element multiplied by the subtracted form factor $\Pi(q^2)$, i.e. the unsubtracted function minus its value at $q^2 = 0$ (cf. (48.30)), so it vanishes for $q = 0$; thus, the vacuum polarization graph (b) does not contribute to the radiative correction we are looking for. As for the diagram
(c), there is a subtraction at 𝑞 = 0 for the term involving 𝛾 𝜇 , so the contribution to the form factor
𝐹1 from this diagram vanishes for 𝑞 = 0. It means that the form factor 𝐹1 (𝑞 2 ), normalized by
𝐹1 (0) = 1, gives just the tree-level value (50.1). On the other hand, the term in (50.15) involving
the form factor 𝐹2 can contribute even for 𝑞 = 0, since, as we know from the previous discussion
of the tree diagram (a), the factor 𝑞 𝜈 is absorbed in the definition of the magnetic field strength.
In fact, it should be clear now that for working out the contribution of the 𝐹2 term in (50.15) one
can essentially repeat the steps that followed the relation (50.5); the only extra ingredient is the
additional factor 𝐹2 (0) in the contribution to the magnetic moment. Thus, we end up with the
general formula
$$ \mu_e = \frac{e}{2m} \left[ 1 + F_2(0) \right] . \tag{50.16} $$
To summarize our preceding discussion: First, the diagram (b) does not contribute at all. Second,
the contribution of the diagram (c) involving 𝛾 𝜇 is irrelevant, as it vanishes for 𝑞 = 0 (due to
the on-shell subtraction). Finally, when computing the vertex function $\Gamma_\mu(p', p)$, the term
proportional to $\sigma_{\mu\nu} q^\nu$ is UV convergent (let us recall that the only UV divergence in $\Gamma_\mu(p', p)$
is proportional to $\gamma_\mu$) and thus the form factor $F_2(q^2)$ is calculable.³¹ The only problem that
remains to be clarified is how to get all relevant terms involving the structure 𝜎 𝜇𝜈 𝑞 𝜈 in a practical
evaluation of the diagram (c). The answer is quite easy. Because of Lorentz covariance, an
explicit calculation of the vertex function Γ𝜇 ( 𝑝′, 𝑝) brings naturally terms proportional to 𝛾 𝜇 ,
𝑝 𝜇 and 𝑝′𝜇 . While the 𝛾 𝜇 terms can be thrown away, those involving 𝑝 𝜇 and 𝑝′𝜇 are expected
to appear eventually in the combination 𝑝 𝜇 + 𝑝′𝜇 , and this in turn can be converted to 𝜎𝜇𝜈 𝑞 𝜈 by
means of the Gordon identity. Thus, the strategy of the evaluation of the relevant part of the
vertex diagram (c) is clearly set, and we may now go in medias res to start the calculation, which
is quite tedious, but its result is nice and truly rewarding.
The contribution of the “interior” of the diagram (c) can be written as (cf. also (41.19))
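$$ i\Gamma_\mu(p', p) = e^3 \int \frac{d^4 l}{(2\pi)^4}\, \frac{\gamma_\alpha\, (\slashed{l} + \slashed{p}' + m)\, \gamma_\mu\, (\slashed{l} + \slashed{p} + m)\, \gamma^\alpha}{[(l + p')^2 - m^2]\, [(l + p)^2 - m^2]\, (l^2 - \lambda^2)} . \tag{50.17} $$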
Note that we do not invoke here any UV regularization, since we need only the above-mentioned
finite part of the integral (50.17).

31 This, of course, is gratifying, since a counterterm for the contribution proportional to $\sigma_{\mu\nu} q^\nu$ would have the
structure $\bar{\psi} \sigma_{\mu\nu} \psi F^{\mu\nu}$, and such a UV-divergent term would spoil the renormalizability (cf. Chapter 44).

The first few steps of the evaluation of the expression (50.17)
are routine: introducing Feynman parametrization, shifting properly the loop momentum $l$ and
using the symmetric integration, one gets
$$ i\Gamma_\mu(p', p) \Big|_{\text{UV fin.}} = 2e^3 \int_0^1 dx \int_0^{1-x} dy \int \frac{d^4 l}{(2\pi)^4}\, \frac{N_\mu}{(l^2 - C)^3} , \tag{50.18} $$

where the numerator 𝑁 𝜇 reads

$$ N_\mu = \gamma_\alpha \left[ -x\slashed{p} + (1 - y)\slashed{p}' + m \right] \gamma_\mu \left[ (1 - x)\slashed{p} - y\slashed{p}' + m \right] \gamma^\alpha , \tag{50.19} $$




and
$$ C = (x + y)^2 m^2 + (1 - x - y)\lambda^2 - xy\, q^2 . \tag{50.20} $$
Note that in arriving at the formula (50.20) we have set $p^2 = m^2$, $p'^2 = m^2$. (Needless to
say, the reader is encouraged to recover the above expressions independently.) Now, we in fact
need the relevant expression for $\Gamma_\mu(p', p)$ sandwiched between $\bar{u}(p')$ and $u(p)$, as this enters
the matrix element (50.15). To work it out, we utilize the algebra of $\gamma$-matrices, in particular
the “chain identities” (see Appendix C, formulae (C.24)) for $n = 4$, and also the Dirac equations
$(\slashed{p} - m)u(p) = 0$ and $\bar{u}(p')(\slashed{p}' - m) = 0$. At the same time, we systematically discard the terms
involving just $\gamma_\mu$ alone, as these are irrelevant. The calculation is quite lengthy and boring; an
intermediate result is
$$ \bar{u}(p') N_\mu u(p) = \bar{u}(p') \Big\{ \big[ 4x(1 - x) - 4(1 - x)(1 - y) + 4(1 - 2x) \big]\, m\, p_\mu + \big[ 4y(1 - y) - 4(1 - x)(1 - y) + 4(1 - 2y) \big]\, m\, p'_\mu + \ldots \Big\} u(p) , \tag{50.21} $$

where the ellipsis means the irrelevant terms.


As a next step, we may employ our master formula for the loop-momentum integration
(see (39.8)). In our case, this gives

$$ \int \frac{d^4 l}{(2\pi)^4}\, \frac{1}{(l^2 - C)^3} = -\frac{i}{32\pi^2}\, \frac{1}{C} . \tag{50.22} $$
From (50.20) it is obvious that 𝐶 is symmetric under the interchange of 𝑥 and 𝑦.
Now, one may notice that the expression (50.21) has a remarkable symmetry: the coef-
ficients at 𝑝 𝜇 and 𝑝′𝜇 can be made equal upon the interchange 𝑥 ↔ 𝑦 in one of them. To see
that such an interchange is legitimate, one should realize that the integration over the Feynman
parameters has the following simple property:
$$ \int_0^1 dx \int_0^{1-x} dy\, f(x, y) = \int_0^1 dx \int_0^{1-x} dy\, f(y, x) \tag{50.23} $$

(the proof is easy). Taking all this into account, it is clear that under the integral over 𝑥 and 𝑦 one
may identify the coefficients at 𝑝 𝜇 and 𝑝′𝜇 and the expression (50.21) thus becomes, effectively,

$$ \bar{u}(p') N_\mu u(p) \;\longrightarrow\; 4m\, (p_\mu + p'_\mu) \big[ x(1 - x) + 1 - 2x - (1 - x)(1 - y) \big]\, \bar{u}(p') u(p) = 4m\, (p_\mu + p'_\mu)(y - xy - x^2)\, \bar{u}(p') u(p) , \tag{50.24} $$

where we have already omitted the irrelevant terms. This is the anticipated result: $p_\mu$ and $p'_\mu$
enter in the combination $p_\mu + p'_\mu$, which can be immediately turned into $\sigma_{\mu\nu} q^\nu$ (with the help of
the Gordon identity) and thus fits in the general form (50.15). Explicitly, the Gordon identity tells
us that

$$ (p_\mu + p'_\mu)\, \bar{u}(p') u(p) = 2m\, \bar{u}(p') \gamma_\mu u(p) - i\, \bar{u}(p') \sigma_{\mu\nu} q^\nu u(p) . \tag{50.25} $$
One thus has, preserving only the relevant term,

$$ \bar{u}(p') \Gamma_\mu(p', p) u(p) = e^3\, \frac{i}{4\pi^2}\, m\, \bar{u}(p') \sigma_{\mu\nu} q^\nu u(p) \int_0^1 dx \int_0^{1-x} dy\, \frac{y - xy - x^2}{(x + y)^2 m^2 - xy\, q^2} . \tag{50.26} $$

Note that to obtain the last result, we have used Eqs. (50.18), (50.20), (50.22), (50.24) and
(50.25); in the denominator of the integrand we have also discarded the IR regulator 𝜆, as the
integral in question has no IR divergence. The identification of the form factor 𝐹2 (𝑞 2 ) is now
easy: the contribution of the diagram (c) to the scattering matrix element in question is

$$ \mathcal{M}_c = \bar{u}(p') \Gamma_\mu(p', p) u(p)\, \tilde{A}^\mu(q) , $$

and comparing the structure of the expression (50.26) with the general form (50.15), one gets

$$ F_2(q^2) = \frac{e^2}{2\pi^2}\, m^2 \int_0^1 dx \int_0^{1-x} dy\, \frac{y - xy - x^2}{(x + y)^2 m^2 - xy\, q^2} . \tag{50.27} $$

By means of appropriate substitutions, the last expression can be reduced to a simple
one-dimensional integral

$$ F_2(q^2) = \frac{\alpha}{2\pi} \int_0^1 du\, \frac{1}{1 - \dfrac{q^2}{m^2}\, u(1 - u)} , \tag{50.28} $$

where we have also used $e^2 = 4\pi\alpha$. (The passage from (50.27) to (50.28) is left to enthusiasts as
a nice exercise in computing integrals.) Thus, the correction to the electron magnetic moment
is, according to the key formula (50.16),
$$ F_2(0) = \frac{\alpha}{2\pi} . \tag{50.29} $$
This is the celebrated Schwinger correction, published first in 1948. Obviously, J. Schwinger
was proud of this achievement and so it is also placed on his tombstone (see Fig. 50.2). Let
us add that this computational success (that matched the experimental results) was certainly
a part of the reason why he received the Nobel Prize in 1965 together with R. Feynman and
S. Tomonaga (the precise citation of the Nobel committee was “for their fundamental work
in quantum electrodynamics, with deep-ploughing consequences for the physics of elementary
particles”).
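The passage from (50.27) to (50.28) and the value (50.29) can be cross-checked numerically. The following Python sketch is only an illustration and not a part of the derivation: it assumes the substitution x = s·u, y = s·(1 − u) (one possible choice of the "appropriate substitutions" mentioned above) and a plain midpoint rule; the function names and the variable t = q²/m² are ad hoc.

```python
import math

def f2_from_50_27(t, n=300):
    """F2 in units of alpha/(2*pi), computed from the double integral (50.27).

    t = q^2/m^2 (must be below 4, so that the denominator stays positive).
    Assumed substitution: x = s*u, y = s*(1-u), with Jacobian s, which maps
    the triangle 0 <= x, y, x+y <= 1 onto the unit square in (s, u).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        for j in range(n):
            u = (j + 0.5) * h
            # (y - x*y - x^2) * s / (x + y)^2 reduces to (1 - u) - s*u
            numerator = (1.0 - u) - s * u
            denominator = 1.0 - t * u * (1.0 - u)
            total += numerator / denominator
    # e^2/(2 pi^2) = 4 * alpha/(2 pi), because e^2 = 4 pi alpha
    return 4.0 * total * h * h

def f2_from_50_28(t, n=300):
    """F2 in units of alpha/(2*pi), computed from the single integral (50.28)."""
    h = 1.0 / n
    return h * sum(1.0 / (1.0 - t * u * (1.0 - u))
                   for u in ((j + 0.5) * h for j in range(n)))

def schwinger_term(alpha=1.0 / 137.0):
    """The correction (50.29) for a given value of alpha."""
    return alpha / (2.0 * math.pi)

if __name__ == "__main__":
    for t in (0.0, -1.0, -10.0, 2.0):
        print(t, f2_from_50_27(t), f2_from_50_28(t))
    print("alpha/(2 pi) =", schwinger_term())
```

For t = 0 both functions return 1 (in units of α/2π), in agreement with (50.29); for other values of t below 4 the two expressions agree as well.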
So, the QED prediction for the electron spin magnetic moment is, more generally,

$$ \mu_e = \mu_B \left( 1 + \frac{\alpha}{2\pi} + \ldots \right) , \tag{50.30} $$
where the ellipsis stands for contributions of the order $O(\alpha^2)$ and higher. Of course, the
evaluation of those higher-order contributions is rather laborious; nevertheless, as of
today, theoretical calculations have been done up to the order $(\alpha/\pi)^4$ (both numerically and
analytically) and, recently, the correction of the order $(\alpha/\pi)^5$ has also been obtained (numerically).
Fig. 50.2: The tombstone of J. Schwinger at Mt. Auburn Cemetery in Cambridge, Massachusetts, USA.
Source: Wikipedia, the article on Julian Schwinger; photo by Jacob Bourjaily.

Such a theoretical precision matches the accuracy of the best current experiments; the
electron magnetic moment is in fact one of the most precisely measured quantities in physics. For an illustration,
let us mention explicitly the order $O(\alpha^2)$. The corresponding result was obtained in 1957
independently by Charles Sommerfield and André Petermann. Using their result, one may write,
approximately,

$$ \mu_e = \mu_B \left[ 1 + \frac{\alpha}{2\pi} - 0.328 \left( \frac{\alpha}{\pi} \right)^2 + O(\alpha^3) \right] , \tag{50.31} $$
where the coefficient at $(\alpha/\pi)^2$ arises as

$$ \frac{197}{144} + \frac{\pi^2}{12} - \frac{\pi^2}{2} \ln 2 + \frac{3}{4}\, \zeta(3) \approx -0.328 , \tag{50.32} $$

with $\zeta(3) = \sum_{n=1}^{\infty} 1/n^3$ being the relevant value of the Riemann zeta-function ($\zeta(3) \approx 1.2$ is also
known as Apéry’s constant). The structure of the expression (50.32) indicates that obtaining
it must be a tough job.
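The arithmetic in (50.32) is easily reproduced. The short Python sketch below is an illustration only: it sums the zeta series directly (the number of terms is an arbitrary choice) and then evaluates the truncated expansion (50.31) for the rounded value α = 1/137 used in this chapter; the function names are ad hoc.

```python
import math

def zeta3(terms=200_000):
    """Partial sum of sum_{n>=1} 1/n^3; the neglected tail is below 1/(2*terms**2)."""
    return sum(1.0 / n**3 for n in range(1, terms + 1))

def second_order_coefficient():
    """Left-hand side of (50.32)."""
    return (197.0 / 144.0 + math.pi**2 / 12.0
            - 0.5 * math.pi**2 * math.log(2.0) + 0.75 * zeta3())

def mu_over_bohr_magneton(alpha=1.0 / 137.0):
    """Expansion (50.31) truncated after the (alpha/pi)^2 term."""
    x = alpha / math.pi
    return 1.0 + 0.5 * x + second_order_coefficient() * x**2

if __name__ == "__main__":
    print("zeta(3)      =", zeta3())
    print("coefficient  =", second_order_coefficient())
    print("mu_e / mu_B  =", mu_over_bohr_magneton())
```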

***

Epilogue

Coming back to the Schwinger correction (50.29) that we have been able to recover
explicitly, after some effort that is not overly time-consuming, we can see that numerically,
$\alpha/2\pi \approx 0.001$ if we set $\alpha = 1/137$. One may now recall the introductory Chapter 2, where
we have reproduced Dirac’s original prediction $\mu_e = \mu_B$. Thus, one might say, with some
self-irony, that after going through the remaining 48 chapters (which correspond roughly to six
months of a lecture course within the academic year), our knowledge of relativistic quantum
theory has improved by about one per mille. Well, of course, I am kidding. In fact, we have
seen that one has to master a lot of material in order to be able to compute a tiny QED radiative
correction like (50.29). One may conclude that QED is a great theory indeed, which fully
deserved the Nobel Prize in its time. Now it is a part of the more comprehensive standard
model of elementary particle physics.

***

Appendix A

Basic properties
of Lorentz transformations

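The Lorentz transformation of spacetime coordinates is written as

$$ x'^\mu = \Lambda^\mu{}_\nu\, x^\nu \tag{A.1} $$

or

$$ x'_\mu = \Lambda_\mu{}^\nu\, x_\nu , \tag{A.2} $$

where, in accordance with the rules for raising and lowering the indices, one has

$$ \Lambda_\mu{}^\nu = g_{\mu\rho}\, \Lambda^\rho{}_\sigma\, g^{\sigma\nu} . \tag{A.3} $$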
When the coordinates $x^\mu$ are ordered in a column, the relation (A.1) is recast in the matrix form
simply as $x' = \Lambda x$ (with $\mu$ and $\nu$ being the row and column index, respectively); for convenience,
the matrix defined by the elements appearing in (A.2) may then be denoted as $\bar{\Lambda}$. The formula
(A.3) is thus tantamount to the matrix identity

$$ \bar{\Lambda} = g \cdot \Lambda \cdot g , \tag{A.4} $$

where $g$ stands for the $4 \times 4$ matrix representing the metric of the Minkowski space (which is the
same for the components $g_{\mu\nu}$ and $g^{\mu\nu}$); recall that our convention is $g = \mathrm{diag}(+1, -1, -1, -1)$.
The invariance of the spacetime interval under Lorentz transformations means that
$x'^\mu x'_\mu = x^\rho x_\rho$. Thus one gets

$$ \Lambda^\mu{}_\rho\, \Lambda_\mu{}^\sigma\, x^\rho x_\sigma = \delta^\sigma_\rho\, x^\rho x_\sigma , \tag{A.5} $$

and this implies the pseudoorthogonality relation

$$ \Lambda^\mu{}_\rho\, \Lambda_\mu{}^\sigma = \delta^\sigma_\rho . \tag{A.6} $$

In the matrix form it reads

$$ \Lambda^{T} \cdot \bar{\Lambda} = 1 , \tag{A.7} $$

where $\Lambda^{T}$ is the transpose of $\Lambda$. Since we are working with finite-dimensional matrices, the
relation (A.7) implies also

$$ \bar{\Lambda} \cdot \Lambda^{T} = 1 , \tag{A.8} $$

which, in compliance with the above definitions, means

$$ \Lambda_\nu{}^\rho\, \Lambda^\mu{}_\rho = \delta^\mu_\nu . \tag{A.9} $$

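Now, taking into account that $\delta^\mu_\nu = g^\mu{}_\nu$ and raising the index $\nu$, from (A.9) one gets

$$ \Lambda^\mu{}_\rho\, \Lambda^{\nu\rho} = g^{\mu\nu} , $$

and this can be rewritten as

$$ g^{\mu\nu} = \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, g^{\rho\sigma} . \tag{A.10} $$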
However, (A.10) is just the transformation law for a 2nd order tensor. Thus, the elementary
considerations presented above lead to the conclusion that the metric components 𝑔 𝜇𝜈 in fact
constitute a 2nd order tensor under Lorentz transformations (this justifies the standard term
metric tensor). Obviously, it is a rather exceptional tensor, whose salient feature is that its
components are the same in all Lorentz reference frames. In other words, 𝑔 𝜇𝜈 is a purely
numerical Lorentz tensor (recall that the Kronecker delta 𝛿 𝑗 𝑘 has an analogous property with
respect to three-dimensional rotations).
The above identities (A.4), (A.7) have a simple consequence that is worth mentioning
here. Employing them, one has
ΛT · 𝑔 · Λ · 𝑔 = 1 . (A.11)
Taking the determinant of both sides of the identity (A.11), having in mind that det 𝑔 = −1 and
det ΛT = det Λ, one gets
$$ (\det \Lambda)^2 = 1 . \tag{A.12} $$
Thus one arrives at the well-known fact that for any Lorentz transformation one has

det Λ = ±1 . (A.13)

At this point, it is useful to recall the standard terminology concerning the whole set (in fact group)
of Lorentz transformations. Among the matrices Λ satisfying the pseudoorthogonality relations
(A.7), (A.8), there are such that preserve the direction of time, as well as those that reverse
it. Transformations of the first type are called orthochronous. Concerning the second type,
the basic example is provided by the pure time reversal denoted conventionally as 𝑇, for which
$\Lambda = \Lambda_T = \mathrm{diag}(-1, 1, 1, 1)$. According to the sign of $\det \Lambda$ shown in (A.13), one distinguishes
between proper ($\det \Lambda = +1$) and improper ($\det \Lambda = -1$) Lorentz transformations. Under
such a criterion, the time reversal matrix $\Lambda_T$ certainly represents an improper transformation.
Another obvious example is the spatial inversion (space reflection, or the parity transformation
$P$), for which $\Lambda = \Lambda_P = \mathrm{diag}(1, -1, -1, -1)$. On the other hand, the continuous Lorentz
transformations, such as “boosts” and spatial rotations, are proper (more about these shortly).
The sign ambiguity embodied in (A.13) enables one to distinguish between tensors and
pseudotensors. Let us first recall some familiar examples. A quantity is a true scalar if it is
invariant under both proper Lorentz transformations and parity, while a pseudoscalar changes
its sign under parity. Concerning vectors and pseudovectors, both behave in the same way
under proper transformations (according to the paradigm (A.1)); with respect to parity, a true
vector $V^\mu$ transforms as $V^\mu \to (V^0, -\vec{V})$, while a pseudovector (axial vector) $A^\mu$ transforms
as $A^\mu \to (-A^0, \vec{A})$. To summarize and generalize the above examples, for a pseudotensor, an
additional factor of $\det \Lambda$ should be involved in its transformation law, apart from the standard
chain of matrix elements of $\Lambda$.
There is an important example of a pseudotensor that is just a collection of numbers,
i.e. its components are the same in all Lorentz reference frames: it is the familiar Levi-Civita
symbol $\varepsilon_{\mu\nu\rho\sigma}$, known also as the permutation symbol. Just to be sure, let us recall that $\varepsilon_{\mu\nu\rho\sigma}$
is $\pm 1$ according to the signature of the particular permutation $\mu\nu\rho\sigma$ with respect to the basic
set 0123 (our convention is such that $\varepsilon_{0123} = +1$). So, let us see that $\varepsilon_{\mu\nu\rho\sigma}$ is indeed a Lorentz
pseudotensor. To this end, one may start with an appropriate representation of the determinant
of $\Lambda$, namely

$$ \det \Lambda = \Lambda^0{}_\mu\, \Lambda^1{}_\nu\, \Lambda^2{}_\rho\, \Lambda^3{}_\sigma\, (-\varepsilon^{\mu\nu\rho\sigma}) \tag{A.14} $$

(note that the minus sign in (A.14) is due to our convention, according to which $\varepsilon^{0123} = -1$).
Now, since a transposition of the rows in a determinant changes its sign, one may also write

$$ \Lambda^\alpha{}_\mu\, \Lambda^\beta{}_\nu\, \Lambda^\gamma{}_\rho\, \Lambda^\delta{}_\sigma\, (-\varepsilon^{\mu\nu\rho\sigma}) = \det \Lambda \cdot (-\varepsilon^{\alpha\beta\gamma\delta}) . \tag{A.15} $$

Thus we get

$$ \varepsilon^{\alpha\beta\gamma\delta} = \det \Lambda\; \Lambda^\alpha{}_\mu\, \Lambda^\beta{}_\nu\, \Lambda^\gamma{}_\rho\, \Lambda^\delta{}_\sigma\, \varepsilon^{\mu\nu\rho\sigma} , \tag{A.16} $$

where we have also utilized the simple fact that $(\det \Lambda)^{-1} = \det \Lambda$, due to (A.13). The
pseudotensor property of the four-dimensional Levi-Civita symbol is thereby proven. It means that
one has at one's disposal two purely numerical Lorentz tensors, namely $g_{\mu\nu}$ and $\varepsilon_{\mu\nu\rho\sigma}$, which thus
naturally appear in many formulae for various quantities of tensor character encountered in field
theory models.
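The statements (A.11), (A.13) and (A.16) can be tested on concrete matrices. The Python sketch below is an illustration only; the boost along the first axis (with an arbitrarily chosen rapidity), the parity matrix and all function names are choices made for this example, and `lam[mu][nu]` stands for $\Lambda^\mu{}_\nu$.

```python
import math
from itertools import permutations, product

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g = diag(+1, -1, -1, -1)

def signature(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

# Contravariant symbol: eps^{0123} = -1, because eps_{0123} = +1.
EPS_UP = {p: -signature(p) for p in permutations(range(4))}

def det(lam):
    """Determinant written as in (A.14)."""
    return sum(-EPS_UP[p] * lam[0][p[0]] * lam[1][p[1]] * lam[2][p[2]] * lam[3][p[3]]
               for p in EPS_UP)

def boost_x(rapidity):
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, -s, 0.0, 0.0], [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

PARITY = [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, -1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def preserves_metric(lam, tol=1e-12):
    """Check Lambda^T g Lambda = g, which is equivalent to (A.11)."""
    for r in range(4):
        for s in range(4):
            value = sum(lam[m][r] * METRIC[m] * lam[m][s] for m in range(4))
            expected = METRIC[r] if r == s else 0.0
            if abs(value - expected) > tol:
                return False
    return True

def eps_is_invariant(lam, tol=1e-9):
    """Check (A.16) component by component."""
    d = det(lam)
    for a, b, c, e in product(range(4), repeat=4):
        rhs = d * sum(lam[a][m] * lam[b][n] * lam[c][r] * lam[e][s] * value
                      for (m, n, r, s), value in EPS_UP.items())
        if abs(rhs - EPS_UP.get((a, b, c, e), 0)) > tol:
            return False
    return True

if __name__ == "__main__":
    for name, lam in (("boost", boost_x(0.7)), ("parity", PARITY),
                      ("parity * boost", matmul(PARITY, boost_x(0.7)))):
        print(name, det(lam), preserves_metric(lam), eps_is_invariant(lam))
```

Note that without the extra factor $\det \Lambda$ the check would fail for the improper transformations, which is precisely the pseudotensor property.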
Let us now proceed to the formal description of the continuous (proper) Lorentz trans-
formations. The corresponding matrices Λ form a six-parameter Lie group; it means that a
particular $\Lambda$ may be written in the exponential form

$$ \Lambda = \exp\left( -\frac{i}{2}\, \omega^{\alpha\beta} I_{\alpha\beta} \right) , \tag{A.17} $$
where 𝜔𝛼𝛽 = −𝜔 𝛽𝛼 stand for the relevant parameters and 𝐼𝛼𝛽 = −𝐼 𝛽𝛼 are 4 × 4 matrix generators
(which form a basis of the corresponding Lie algebra). Of course, the summation in the exponent
runs over 𝛼, 𝛽 = 0, 1, 2, 3. An attentive reader should notice that the antisymmetry property of
𝜔𝛼𝛽 guarantees that there are just six independent values of these parameters. As usual, one
may now consider an infinitesimal form of (A.17), which reads
$$ \Lambda = 1 - \frac{i}{2}\, \Delta\omega^{\alpha\beta} I_{\alpha\beta} , \tag{A.18} $$
where we have distinguished the infinitesimal parameters Δ𝜔𝛼𝛽 from the original finite ones. In
fact, one would like to represent such an infinitesimal Λ in a natural way as a small deviation
from unit matrix; for the matrix elements this would mean
$$ \Lambda^\mu{}_\nu = \delta^\mu_\nu + \Delta\omega^\mu{}_\nu . \tag{A.19} $$
It is not difficult to show that one can arrive at (A.19) if the generators in (A.18) are chosen as
$$ (I_{\alpha\beta})^\mu{}_\nu = i \left( g^\mu{}_\alpha\, g_{\beta\nu} - g^\mu{}_\beta\, g_{\alpha\nu} \right) \tag{A.20} $$
(let us stress that one should not confuse the role of the Greek indices (𝛼, 𝛽) and (𝜇, 𝜈); 𝛼, 𝛽
are labelling the six generators, while 𝜇, 𝜈 mark the elements of a given matrix 𝐼𝛼𝛽 ). Indeed,
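employing (A.20) in (A.18), one gets

$$ \begin{aligned}
\Lambda^\mu{}_\nu &= \left( 1 - \frac{i}{2}\, \Delta\omega^{\alpha\beta} I_{\alpha\beta} \right)^{\mu}{}_{\nu} \\
&= \delta^\mu_\nu - \frac{i}{2}\, \Delta\omega^{\alpha\beta}\, i \left( g^\mu{}_\alpha\, g_{\beta\nu} - g^\mu{}_\beta\, g_{\alpha\nu} \right) \\
&= \delta^\mu_\nu + \frac{1}{2}\, \Delta\omega^{\mu\beta} g_{\beta\nu} - \frac{1}{2}\, \Delta\omega^{\alpha\mu} g_{\alpha\nu} \\
&= \delta^\mu_\nu + \frac{1}{2}\, \Delta\omega^\mu{}_\nu + \frac{1}{2}\, \Delta\omega^{\mu\alpha} g_{\alpha\nu} \\
&= \delta^\mu_\nu + \Delta\omega^\mu{}_\nu ,
\end{aligned} \tag{A.21} $$

and that’s it.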
With the explicit form of the generators at hand, one may verify that the following
commutation relation is valid:

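$$ [I_{\mu\nu}, I_{\rho\sigma}] = i \left( g_{\mu\sigma} I_{\nu\rho} + g_{\nu\rho} I_{\mu\sigma} - g_{\mu\rho} I_{\nu\sigma} - g_{\nu\sigma} I_{\mu\rho} \right) \tag{A.22} $$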




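(the corresponding calculation is left to a diligent reader as an instructive exercise). Further, it
is useful to express the generators $I_{\alpha\beta}$ in an explicit matrix form. Utilizing the formula (A.20)
one gets

$$ I_{01} = -i \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \quad
I_{02} = -i \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \quad
I_{03} = -i \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} , $$

$$ I_{23} = -i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix} , \quad
I_{31} = -i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} , \quad
I_{12} = -i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{A.23} $$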

It is worth noting that the matrices (A.23) are traceless; this in turn means that for Λ given by
(A.17) one has det Λ = 1, due to the known identity

Tr ln 𝑀 = ln det 𝑀 (A.24)

valid for any regular matrix 𝑀. In this way it is confirmed that a matrix Λ in (A.17) represents
a proper Lorentz transformation.
When one examines the transformations defined by (A.17) in detail, it is not difficult
to arrive at the conclusion that $I_{0j}$, $j = 1, 2, 3$, generate Lorentz boosts (i.e. correspond to
relative uniform motions of two reference frames), while $I_{jk}$, $j, k = 1, 2, 3$, generate spatial
rotations. The point is that due to some obvious properties of the matrices (A.23) with respect
to multiplication, one can evaluate the exponential (A.17) explicitly for the individual
one-parameter subgroups of transformations generated by $I_{01}$, $I_{02}$, etc. and obtain e.g. the formulae
shown in Chapter 4 (cf. (4.25), (4.34)); any hard-working reader is encouraged to perform
such an exercise and recover the relevant hyperbolic or trigonometric functions from the Taylor
expansion of the exponential (A.17) in question.
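The exercise suggested above can also be done numerically. The following Python sketch is an illustration only: it builds the generators from (A.20), sums a truncated Taylor series of the exponential (A.17) with a single non-zero parameter $\omega^{\alpha\beta} = -\omega^{\beta\alpha} = \omega$, and compares the result with hyperbolic and trigonometric functions. The function names, the number of Taylor terms and the sample parameter values are arbitrary choices; no claim is made here about the sign conventions of the formulae in Chapter 4.

```python
import math

METRIC = [1, -1, -1, -1]  # diagonal of g = diag(+1, -1, -1, -1)

def generator(alpha, beta):
    """Matrix (I_{alpha beta})^mu_nu of (A.20); rows are mu, columns are nu."""
    def g_low(a, b):
        return METRIC[a] if a == b else 0
    return [[1j * ((mu == alpha) * g_low(beta, nu) - (mu == beta) * g_low(alpha, nu))
             for nu in range(4)] for mu in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def expm(a, terms=40):
    """Truncated Taylor series of exp(a) for a 4x4 matrix."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return result

def one_parameter_lambda(alpha, beta, omega):
    """(A.17) with omega^{alpha beta} = -omega^{beta alpha} = omega, all others zero.

    The two equal terms in the sum cancel the factor 1/2, so the exponent
    is -i * omega * I_{alpha beta}.
    """
    gen = generator(alpha, beta)
    return expm([[-1j * omega * x for x in row] for row in gen])

if __name__ == "__main__":
    boost = one_parameter_lambda(0, 1, 0.8)
    print(boost[0][0], math.cosh(0.8))
    print(boost[0][1], -math.sinh(0.8))
    rotation = one_parameter_lambda(1, 2, 0.5)
    print(rotation[1][1], math.cos(0.5))
    print(rotation[1][2], -math.sin(0.5))
```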
Next, let us introduce a convenient notation

𝐼01 = 𝐾1 , 𝐼02 = 𝐾2 , 𝐼03 = 𝐾3 ,


𝐼23 = 𝐽1 , 𝐼31 = 𝐽2 , 𝐼12 = 𝐽3 . (A.25)

Then it is straightforward to show that the commutation relations (A.22) are tantamount to

[𝐽 𝑗 , 𝐽𝑘 ] = 𝑖𝜀 𝑗 𝑘𝑙 𝐽𝑙 ,
[𝐽 𝑗 , 𝐾 𝑘 ] = 𝑖𝜀 𝑗 𝑘𝑙 𝐾𝑙 , (A.26)
[𝐾 𝑗 , 𝐾 𝑘 ] = −𝑖𝜀 𝑗 𝑘𝑙 𝐽𝑙 .

Notice that the rotation generators 𝐽 𝑗 , 𝑗 = 1, 2, 3, emerge indeed with the expected commu-
tators (corresponding to an angular momentum). The formulae (A.26) are instrumental in the
discussion of representations of the Lorentz algebra (see Appendix B).
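The commutation relations (A.26) can be verified by brute force from the explicit generators. The Python sketch below is an illustration only: it rebuilds the matrices from (A.20), uses the identification (A.25) and returns the largest deviation found in any element of the nine-plus-nine-plus-nine commutators; the helper names are ad hoc.

```python
METRIC = [1, -1, -1, -1]  # diagonal of g = diag(+1, -1, -1, -1)

def generator(alpha, beta):
    """Matrix (I_{alpha beta})^mu_nu of (A.20); rows are mu, columns are nu."""
    def g_low(a, b):
        return METRIC[a] if a == b else 0
    return [[1j * ((mu == alpha) * g_low(beta, nu) - (mu == beta) * g_low(alpha, nu))
             for nu in range(4)] for mu in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def levi_civita3(j, k, l):
    """Three-dimensional permutation symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2

# Identification (A.25); list position 0, 1, 2 corresponds to the label 1, 2, 3.
K = [generator(0, 1), generator(0, 2), generator(0, 3)]
J = [generator(2, 3), generator(3, 1), generator(1, 2)]

def deviation(lhs, coefficient, triplet, j, k):
    """Largest element of  lhs - coefficient * eps_{jkl} * triplet[l]."""
    worst = 0.0
    for r in range(4):
        for c in range(4):
            rhs = sum(coefficient * levi_civita3(j, k, l) * triplet[l][r][c]
                      for l in range(3))
            worst = max(worst, abs(lhs[r][c] - rhs))
    return worst

def max_violation():
    """Largest deviation from (A.26) over all pairs of generators."""
    worst = 0.0
    for j in range(3):
        for k in range(3):
            worst = max(worst,
                        deviation(commutator(J[j], J[k]), 1j, J, j, k),
                        deviation(commutator(J[j], K[k]), 1j, K, j, k),
                        deviation(commutator(K[j], K[k]), -1j, J, j, k))
    return worst

if __name__ == "__main__":
    print("largest deviation from (A.26):", max_violation())
```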

Appendix B

Representations of Lorentz group

We provide here an elementary treatment of finite-dimensional representations of the proper
Lorentz group; in fact, relying on the basic relation (A.17), we examine the representations
of the Lorentz algebra, i.e. we are aiming at a systematic description of all possible matrix
generators satisfying the commutation relations (A.26) (or, equivalently, (A.22)).
There is a simple trick that enables one to solve such a problem. Instead of $\vec{J}$ and $\vec{K}$
appearing in (A.25), (A.26), let us introduce their linear combinations

$$ \vec{M} = \frac{1}{2} (\vec{J} + i\vec{K}) , \qquad \vec{N} = \frac{1}{2} (\vec{J} - i\vec{K}) \tag{B.1} $$

(note that we have used here a shorthand vector notation, so that $\vec{J}$ stands for $J_j$, $j = 1, 2, 3$, etc.).
It is straightforward to show that the commutation relations (A.26) imply

[𝑀 𝑗 , 𝑀𝑘 ] = 𝑖𝜀 𝑗 𝑘𝑙 𝑀𝑙 ,
[𝑁 𝑗 , 𝑁 𝑘 ] = 𝑖𝜀 𝑗 𝑘𝑙 𝑁𝑙 , (B.2)
[𝑀 𝑗 , 𝑁 𝑘 ] = 0 .

Thus, $\vec{M}$ and $\vec{N}$ are triplets of matrices representing two independent angular momenta; this
also means that the Lorentz algebra is equivalent to that of the product group SU(2) × SU(2)
(in a better mathematical notation it can be viewed as su(2) ⊕ su(2)). However, the construction
of the finite-dimensional representations of the SU(2) algebra is known very well from the
quantum mechanical theory of angular momentum (spin). Using a generic notation $\vec{J}$ for the
spin components, the corresponding matrices can be labelled by a number $j$ that is a non-negative
integer or half-integer and they are of the order $2j + 1$. As the knowledgeable reader
may recall, the explicit form of such spin matrices can be obtained by means of the technique
of the raising and lowering (“ladder”) operators $J_\pm = J_1 \pm iJ_2$. The label $j$ corresponds to the
eigenvalue $j(j + 1)$ of the quadratic Casimir operator $\vec{J}^{\,2} = J_1^2 + J_2^2 + J_3^2$, which commutes with
any component $J_j$, $j = 1, 2, 3$ (and in a given representation it is a multiple of the unit matrix).
Thus, one may classify the representations of the matrices $\vec{M}$, $\vec{N}$ by means of pairs $(j_1, j_2)$
with $j_1$, $j_2$ obeying the above familiar rules. Consequently, such a labelling may be employed
for the representations of the generators $\vec{J}$, $\vec{K}$, which are expressed in terms of $\vec{M}$, $\vec{N}$ as

$$ \vec{J} = \vec{M} + \vec{N} , \qquad \vec{K} = \frac{1}{i} (\vec{M} - \vec{N}) . \tag{B.3} $$

Since $\vec{M}$, $\vec{N}$ are independent (commuting) angular momenta, the Lorentz group representation
corresponding to a given pair $\vec{J}$, $\vec{K}$ (i.e. $\vec{M}$, $\vec{N}$) is described by the direct product of the
matrices corresponding to $\vec{M}$ and $\vec{N}$, and thus the dimension of the representation in question is
$(2j_1 + 1)(2j_2 + 1)$.
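The above described construction may be elucidated by means of some examples of the
lowest-dimensional representations. The simplest case is the trivial (scalar) representation $(0, 0)$,
in which the matrices $\vec{M}$, $\vec{N}$ are zero (and, of course, the same is then true for $\vec{J}$ and $\vec{K}$). In the
context of relativistic quantum mechanics or field theory this corresponds to the Klein–Gordon
equation that describes spinless particles. Next, one may consider the representations $(0, \tfrac{1}{2})$ and
$(\tfrac{1}{2}, 0)$. It is quite easy to realize what their matrix content is. Indeed, spin $\tfrac{1}{2}$ is described by the
familiar Pauli matrices (more precisely, by $\tfrac{1}{2}\vec{\sigma}$), so that one has

$$ \begin{aligned}
(0, \tfrac{1}{2}) :& \quad \vec{M} = 0 , \quad \vec{N} = \frac{1}{2}\, \vec{\sigma} ,
\end{aligned} \tag{B.4} $$

$$ \begin{aligned}
(\tfrac{1}{2}, 0) :& \quad \vec{M} = \frac{1}{2}\, \vec{\sigma} , \quad \vec{N} = 0 .
\end{aligned} \tag{B.5} $$

Thus, according to (B.3), one gets

$$ \begin{aligned}
(0, \tfrac{1}{2}) :& \quad \vec{J} = \frac{1}{2}\, \vec{\sigma} , \quad \vec{K} = \frac{i}{2}\, \vec{\sigma} ,
\end{aligned} \tag{B.6} $$

$$ \begin{aligned}
(\tfrac{1}{2}, 0) :& \quad \vec{J} = \frac{1}{2}\, \vec{\sigma} , \quad \vec{K} = -\frac{i}{2}\, \vec{\sigma} .
\end{aligned} \tag{B.7} $$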
Both these representations are two-dimensional, and one may thus expect that they are imple-
mented in the description of the Weyl equation(s) (cf. Chapter 9). To check it, let us look at the
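equation (9.8), i.e. $\sigma^\mu \partial_\mu \psi = 0$, which is the same as

$$ i\, \frac{\partial \psi}{\partial t} = -i\, (\vec{\sigma} \cdot \vec{\nabla})\, \psi \tag{B.8} $$

(cf. (9.6)). In our straightforward analysis of its relativistic covariance we have found out that
the relevant transformation of the wave function is

$$ S = \exp\left( -\frac{i}{4}\, \omega^{\alpha\beta} W_{\alpha\beta} \right) , \tag{B.9} $$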

with (see (9.23))

𝑊12 = 𝜎3 , 𝑊31 = 𝜎2 , 𝑊23 = 𝜎1 ,


𝑊01 = 𝑖𝜎1 , 𝑊02 = 𝑖𝜎2 , 𝑊03 = 𝑖𝜎3 . (B.10)

Now, comparing (B.9) with the generic form (A.17) of a representation of the Lorentz group, and
taking into account the correspondence (A.25), one would like to match the generators (B.10)
with $\vec{J}$ and $\vec{K}$ in such a way that

$$ \begin{aligned}
\tfrac{1}{2} W_{12} &= J_3 , & \tfrac{1}{2} W_{31} &= J_2 , & \tfrac{1}{2} W_{23} &= J_1 , \\
\tfrac{1}{2} W_{01} &= K_1 , & \tfrac{1}{2} W_{02} &= K_2 , & \tfrac{1}{2} W_{03} &= K_3 .
\end{aligned} \tag{B.11} $$

From (B.6) it is seen that such a coincidence occurs indeed for the representation $(0, \tfrac{1}{2})$. In
a similar way, one would be able to show that the relevant transformation law for the Weyl
equation of the second type, i.e. (9.7), corresponds to the representation $(\tfrac{1}{2}, 0)$ specified in
(B.7). The representations $(0, \tfrac{1}{2})$ and $(\tfrac{1}{2}, 0)$ are irreducible in the usual sense and, despite
having the same dimension, they are inequivalent; this is due to the simple fact that there is no
similarity transformation between $\vec{\sigma}$ and $-\vec{\sigma}$. Concerning the common terminology, these two
representations correspond to 2-component relativistic Weyl spinors.
As we know, the Dirac equation involving a non-zero mass term is written for a four-component
field or wave function, so the question is what is the corresponding representation of
the Lorentz group, from the point of view of the classification described above. The answer is
that it is the direct sum $(\tfrac{1}{2}, 0) \oplus (0, \tfrac{1}{2})$ (note that such a representation is reducible). Let us convince
ourselves that it is indeed so. Using (B.6) and (B.7), the Lorentz generators corresponding to
$(\tfrac{1}{2}, 0) \oplus (0, \tfrac{1}{2})$ may be written as

$$ \vec{J} = \begin{pmatrix} \tfrac{1}{2}\vec{\sigma} & 0 \\ 0 & \tfrac{1}{2}\vec{\sigma} \end{pmatrix} , \qquad
\vec{K} = \begin{pmatrix} -\tfrac{i}{2}\vec{\sigma} & 0 \\ 0 & \tfrac{i}{2}\vec{\sigma} \end{pmatrix} . \tag{B.12} $$

On the other hand, looking back at the results of Chapter 4, the transformation matrix for the
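solution of the Dirac equation is given by

$$ S = \exp\left( -\frac{i}{4}\, \omega^{\alpha\beta} \sigma_{\alpha\beta} \right) , \qquad \sigma_{\alpha\beta} = \frac{i}{2}\, [\gamma_\alpha, \gamma_\beta] . \tag{B.13} $$

Thus, similarly as in (B.11), one would like to arrive at the identification

$$ \begin{aligned}
\tfrac{1}{2} \sigma_{23} &= J_1 , & \tfrac{1}{2} \sigma_{31} &= J_2 , & \tfrac{1}{2} \sigma_{12} &= J_3 , \\
\tfrac{1}{2} \sigma_{01} &= K_1 , & \tfrac{1}{2} \sigma_{02} &= K_2 , & \tfrac{1}{2} \sigma_{03} &= K_3
\end{aligned} \tag{B.14} $$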
for a particular realization of the 𝛾-matrices appearing in (B.13). It turns out that this is indeed
possible; a subtle point is that the relevant set of 𝛾-matrices is not the most common standard
representation (3.37), but the choice (3.38) that is called spinor (or chiral) representation.
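Indeed, using (3.38) one gets

$$ \begin{aligned}
\sigma_{01} &= i\gamma_0\gamma_1 = \begin{pmatrix} -i\sigma_1 & 0 \\ 0 & i\sigma_1 \end{pmatrix} , \\
\sigma_{02} &= i\gamma_0\gamma_2 = \begin{pmatrix} -i\sigma_2 & 0 \\ 0 & i\sigma_2 \end{pmatrix} , \\
\sigma_{03} &= i\gamma_0\gamma_3 = \begin{pmatrix} -i\sigma_3 & 0 \\ 0 & i\sigma_3 \end{pmatrix} , \\
\sigma_{12} &= i\gamma_1\gamma_2 = \begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix} , \\
\sigma_{31} &= i\gamma_3\gamma_1 = \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix} , \\
\sigma_{23} &= i\gamma_2\gamma_3 = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_1 \end{pmatrix} ,
\end{aligned} \tag{B.15} $$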

where we have utilized some well-known properties of Pauli matrices, in particular 𝜎1 𝜎2 = 𝑖𝜎3 ,
𝜎2 𝜎3 = 𝑖𝜎1 , 𝜎3 𝜎1 = 𝑖𝜎2 . Thus, the pattern of matching (B.14) is verified. Obviously, such a
coincidence also justifies the usual term “spinor representation” for (3.38) and the Dirac field or
a wave function transforming according to (B.13) is rightly called bispinor (or Dirac spinor).

The alternative label “chiral” for (3.38) is related to the fact that the corresponding matrix $\gamma_5$
has then the block diagonal form

$$ \gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . $$
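The block structure (B.15) and the above form of $\gamma_5$ can be reproduced with explicit matrices. Since the formula (3.38) is not repeated in this appendix, the Python sketch below ASSUMES a particular chiral representation, namely $\gamma^0$ with unit matrices in the off-diagonal blocks and $\gamma^j$ with $-\sigma_j$ in the upper right and $\sigma_j$ in the lower left block; this choice was made because it reproduces (B.15) and may differ from (3.38) by conventions. The helper names are ad hoc.

```python
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
UNIT2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]

def scaled(c, m):
    return [[c * x for x in row] for row in m]

def blocks(a, b, c, d):
    """4x4 matrix built from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# ASSUMED chiral representation (an illustrative choice, see the text above):
# gamma^0 = [[0, 1], [1, 0]],  gamma^j = [[0, -sigma_j], [sigma_j, 0]]
GAMMA_UP = [blocks(ZERO2, UNIT2, UNIT2, ZERO2)] + [
    blocks(ZERO2, scaled(-1, s), s, ZERO2) for s in PAULI]
# Lower-index matrices: gamma_0 = gamma^0, gamma_j = -gamma^j
GAMMA_LOW = [GAMMA_UP[0]] + [scaled(-1, g) for g in GAMMA_UP[1:]]

def sigma(alpha, beta):
    """sigma_{alpha beta} = (i/2) [gamma_alpha, gamma_beta], cf. (B.13)."""
    ab = matmul(GAMMA_LOW[alpha], GAMMA_LOW[beta])
    ba = matmul(GAMMA_LOW[beta], GAMMA_LOW[alpha])
    return [[0.5j * (ab[i][j] - ba[i][j]) for j in range(4)] for i in range(4)]

def gamma5():
    """gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3."""
    result = GAMMA_UP[0]
    for g in GAMMA_UP[1:]:
        result = matmul(result, g)
    return scaled(1j, result)

def same(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a)))

if __name__ == "__main__":
    for j in range(3):
        expected = blocks(scaled(-1j, PAULI[j]), ZERO2, ZERO2, scaled(1j, PAULI[j]))
        print("sigma_0%d matches (B.15):" % (j + 1), same(sigma(0, j + 1), expected))
    print("gamma_5 = diag(1, -1):",
          same(gamma5(), blocks(UNIT2, ZERO2, ZERO2, scaled(-1, UNIT2))))
```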
As a last example of a low-lying representation of Lorentz algebra we will consider the
case $(\tfrac{1}{2}, \tfrac{1}{2})$. The dimension of such a representation is four, so one may guess that it could
be equivalent to the Lorentz transformations of a four-vector (e.g. spacetime coordinates, in
particular), with the relevant generators given by the “canonical” matrices shown in (A.23),
(A.25). It is indeed so, but some commentary is in order here. For $j_1 = j_2 = \tfrac{1}{2}$, the matrix
triplets $\vec{M}$ and $\vec{N}$ can be expressed directly in terms of Pauli matrices, but to construct the desired
$(\tfrac{1}{2}, \tfrac{1}{2})$ representation of $\vec{J}$ and $\vec{K}$ one should find some appropriate four-dimensional equivalents
for $\vec{M}$ and $\vec{N}$. It is not immediately obvious how to guess them offhand, but one may proceed
the other way round; substituting (A.23) into (B.1) one gets

$$ M_1 = \frac{1}{2} \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix} , \quad
M_2 = \frac{1}{2} \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & i \\ 1 & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix} , \quad
M_3 = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{B.16} $$

and

$$ N_1 = \frac{1}{2} \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix} , \quad
N_2 = \frac{1}{2} \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & i \\ -1 & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix} , \quad
N_3 = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} . \tag{B.17} $$

One may check readily that the squares of the matrices (B.16), (B.17) are proportional to the
$4 \times 4$ unit matrix and the Casimir operators $\vec{M}^{\,2}$ and $\vec{N}^{\,2}$ become

$$ \vec{M}^{\,2} = \frac{3}{4} \cdot 1 , \qquad \vec{N}^{\,2} = \frac{3}{4} \cdot 1 . \tag{B.18} $$

This is reassuring, due to the familiar arithmetic identity $\tfrac{3}{4} = \tfrac{1}{2}(\tfrac{1}{2} + 1)$; the representation content
$(\tfrac{1}{2}, \tfrac{1}{2})$ of the standard four-vector transformations generated by (A.23) is thereby confirmed.
Note also that unlike the four-dimensional bispinors discussed previously, the representation
$(\tfrac{1}{2}, \tfrac{1}{2})$ is irreducible.
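The properties of the matrices (B.16), (B.17) quoted above can be checked mechanically. The Python sketch below is an illustration only: it rebuilds the four-vector generators from (A.20), forms the combinations (B.1) and tests the Casimir values (B.18) together with the commutators (B.2); the helper names are ad hoc.

```python
METRIC = [1, -1, -1, -1]  # diagonal of g = diag(+1, -1, -1, -1)

def generator(alpha, beta):
    """Matrix (I_{alpha beta})^mu_nu of (A.20); rows are mu, columns are nu."""
    def g_low(a, b):
        return METRIC[a] if a == b else 0
    return [[1j * ((mu == alpha) * g_low(beta, nu) - (mu == beta) * g_low(alpha, nu))
             for nu in range(4)] for mu in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def commutator(a, b):
    return combine(1, matmul(a, b), -1, matmul(b, a))

def max_abs(m):
    return max(abs(x) for row in m for x in row)

K = [generator(0, 1), generator(0, 2), generator(0, 3)]
J = [generator(2, 3), generator(3, 1), generator(1, 2)]
M = [combine(0.5, J[j], 0.5j, K[j]) for j in range(3)]   # (B.1)
N = [combine(0.5, J[j], -0.5j, K[j]) for j in range(3)]  # (B.1)

def casimir(triplet):
    """Sum of the squares of the three matrices in the triplet."""
    total = [[0] * 4 for _ in range(4)]
    for m in triplet:
        total = combine(1, total, 1, matmul(m, m))
    return total

def casimir_deviation(triplet, value):
    """Largest element of  casimir(triplet) - value * (unit matrix)."""
    unit = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    return max_abs(combine(1, casimir(triplet), -value, unit))

def cross_commutator_size():
    """Largest element among all commutators [M_j, N_k]."""
    return max(max_abs(commutator(M[j], N[k])) for j in range(3) for k in range(3))

if __name__ == "__main__":
    print("M^2 - 3/4:", casimir_deviation(M, 0.75))
    print("N^2 - 3/4:", casimir_deviation(N, 0.75))
    print("max |[M_j, N_k]|:", cross_commutator_size())
    print("[M_1, M_2] - i M_3:",
          max_abs(combine(1, commutator(M[0], M[1]), -1j, M[2])))
```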
We would like to conclude this appendix with a challenge for a truly diligent and deter-
mined reader. Obviously, there must be also 3-dimensional representations (1, 0) and (0, 1) and
the question is, where one may encounter them. Well, quite correctly one should stick to the
rule formulated in the well-known play by J. Cimrman “The soothsayer” (in Czech: “Vizionář”),
namely “We may not even indicate...”. Nevertheless, so as to make the life of a potential explorer
easier, let us point towards the electromagnetic tensor $F_{\mu\nu}$, more precisely to the field strengths
$\vec{E}$ and $\vec{B}$. So, good luck!
Appendix C

Review of “diracology”

The purpose of this appendix is to collect in one place some important identities for the Dirac
gamma matrices and solutions of the Dirac equation, which are employed frequently in the main
text; thus we have taken the liberty to use the rather informal expression “diracology” in the title.
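To begin with, let us recall the basic anticommutation relation

$$ \{\gamma_\mu, \gamma_\nu\} = 2g_{\mu\nu} \tag{C.1} $$

and the definition of the fully anticommuting matrix $\gamma_5$, which in our convention reads

$$ \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 . \tag{C.2} $$

So, $\gamma_5$ satisfies

$$ \{\gamma_\mu, \gamma_5\} = 0 , \quad \mu = 0, 1, 2, 3 , \qquad \gamma_5^2 = 1 . \tag{C.3} $$

Note that the right-hand side of (C.1) in fact means $2g_{\mu\nu} \cdot 1$ with 1 being the $4 \times 4$ unit matrix,
but we are going to use the shorthand notation like (C.1) whenever it does not lead to confusion.
Taking into account that $g^\mu{}_\mu = 4$, from (C.1) one gets immediately

$$ \gamma_\mu \gamma^\mu = 4 , \tag{C.4} $$

and a sequence of “chain identities” then follows:

$$ \begin{aligned}
\gamma_\alpha \gamma_\mu \gamma^\alpha &= -2\gamma_\mu , \\
\gamma_\alpha \gamma_\mu \gamma_\nu \gamma^\alpha &= 4g_{\mu\nu} , \\
\gamma_\alpha \gamma_\lambda \gamma_\mu \gamma_\nu \gamma^\alpha &= -2\gamma_\nu \gamma_\mu \gamma_\lambda .
\end{aligned} \tag{C.5} $$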
The derivation of these formulae is straightforward: one starts with the basic anticommutation
relation (C.1) and then (C.4) may be utilized; the procedure is recursive. Finding the next term
in the sequence (C.5) is left to the reader as an instructive exercise. Note that another pertinent
label for the identities of the type (C.5) would be “sandwich relations”.
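The relations (C.4) and (C.5) can be tested on any explicit set of $\gamma$-matrices. The Python sketch below is an illustration only: it uses the standard (Dirac) representation as a convenient explicit choice (the identities themselves do not depend on the representation) and takes the free indices in the upper position, which does not change the form of the identities; the helper names are ad hoc.

```python
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
UNIT2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
METRIC = [1, -1, -1, -1]  # diagonal of g = diag(+1, -1, -1, -1)

def scaled(c, m):
    return [[c * x for x in row] for row in m]

def added(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def blocks(a, b, c, d):
    """4x4 matrix built from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Upper-index gamma matrices in the standard (Dirac) representation.
GAMMA = [blocks(UNIT2, ZERO2, ZERO2, scaled(-1, UNIT2))] + [
    blocks(ZERO2, s, scaled(-1, s), ZERO2) for s in PAULI]
UNIT4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def chain(*indices):
    """Product gamma^mu gamma^nu ... for the given indices (unit matrix if none)."""
    result = UNIT4
    for mu in indices:
        result = matmul(result, GAMMA[mu])
    return result

def sandwich(inner):
    """gamma_alpha * inner * gamma^alpha, summed over alpha."""
    total = [[0] * 4 for _ in range(4)]
    for a in range(4):
        term = matmul(matmul(GAMMA[a], inner), GAMMA[a])
        total = added(total, scaled(METRIC[a], term))  # gamma_a = g_aa gamma^a
    return total

def distance(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def max_chain_violation():
    """Largest deviation from (C.4) and the three identities (C.5)."""
    r = range(4)
    worst = distance(sandwich(UNIT4), scaled(4, UNIT4))
    for mu in r:
        worst = max(worst, distance(sandwich(chain(mu)), scaled(-2, chain(mu))))
        for nu in r:
            g = METRIC[mu] if mu == nu else 0
            worst = max(worst, distance(sandwich(chain(mu, nu)), scaled(4 * g, UNIT4)))
            for lam in r:
                worst = max(worst, distance(sandwich(chain(lam, mu, nu)),
                                            scaled(-2, chain(nu, mu, lam))))
    return worst

if __name__ == "__main__":
    print("largest deviation from (C.4), (C.5):", max_chain_violation())
```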
Next, there are highly useful identities for traces of products of 𝛾-matrices. First of all,
utilizing the properties (C.3) of the matrix 𝛾5 , it is easy to show that the trace of a product of an
odd number of 𝛾-matrices is zero; in a neat shorthand notation we will write

Tr(odd #) = 0 . (C.6)

As for the products involving an even number of 𝛾-matrices one has, e.g.
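$$ \begin{aligned}
\mathrm{Tr}(\gamma_\mu \gamma_\nu) &= 4g_{\mu\nu} , \\
\mathrm{Tr}(\gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma) &= 4 (g_{\mu\nu} g_{\rho\sigma} - g_{\mu\rho} g_{\nu\sigma} + g_{\mu\sigma} g_{\nu\rho}) .
\end{aligned} \tag{C.7} $$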
The proof of such formulae is straightforward and relies just on employing (C.1) in an appropriate
way; again, the procedure is recursive, as indicated in Chapter 3. The overall factor 4 comes
from Tr 1; this is immediately obvious from the first identity (C.7). Another familiar form of the
relations (C.7) reads

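$$ \begin{aligned}
\mathrm{Tr}(\slashed{a} \slashed{b}) &= 4\, a \cdot b , \\
\mathrm{Tr}(\slashed{a} \slashed{b} \slashed{c} \slashed{d}) &= 4 \left[ (a \cdot b)(c \cdot d) - (a \cdot c)(b \cdot d) + (a \cdot d)(b \cdot c) \right] ,
\end{aligned} \tag{C.8} $$

with $\slashed{a} = a^\mu \gamma_\mu$, etc. Of course, this is obtained readily from (C.7) and the definition of the scalar
products, $a \cdot b = g_{\mu\nu} a^\mu b^\nu$, etc.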
The formulae (C.7) have a clear tensor character; it is not surprising, since the products
of the metric tensor components are the only algebraic expressions that may descend from
anticommutators of 𝛾-matrices. In fact, there is another argument in favour of such a tensor
structure. To see this, one may recall the relation (4.9) involving the bispinor transformation
matrix 𝑆. Then one has

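$$ \begin{aligned}
\mathrm{Tr}(\gamma^\mu \gamma^\nu \cdots \gamma^\tau \gamma^\omega) &= \mathrm{Tr}(S S^{-1} \gamma^\mu \gamma^\nu \cdots \gamma^\tau \gamma^\omega) \\
&= \mathrm{Tr}(S^{-1} \gamma^\mu \gamma^\nu \cdots \gamma^\tau \gamma^\omega S) \\
&= \mathrm{Tr}(S^{-1} \gamma^\mu S S^{-1} \gamma^\nu S \cdots S^{-1} \gamma^\tau S S^{-1} \gamma^\omega S) \\
&= \Lambda^\mu{}_\alpha\, \Lambda^\nu{}_\beta \cdots \Lambda^\tau{}_\gamma\, \Lambda^\omega{}_\delta\; \mathrm{Tr}(\gamma^\alpha \gamma^\beta \cdots \gamma^\gamma \gamma^\delta) .
\end{aligned} \tag{C.9} $$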

Thus it is clear that any such trace is a (purely numerical) tensor under Lorentz transformations;
obviously, such a quantity can be constructed just from components of the metric tensor. By the
way, the relation (C.9) also provides an alternative proof of the fact that Tr(odd #) = 0, since one
obviously cannot construct a tensor with an odd number of indices by using only components of
the metric tensor carrying two indices. It is worth noting that the structure of the formulae for
the traces of the type Tr(even #) is quite uniform (one just has to keep in mind that the number of
relevant terms grows rapidly with increasing number of 𝛾-matrices inside the trace, cf. (3.23)).
In addition to the formulae shown above, one may also add the “palindromic” (or “backwards”)
relation
Tr(𝛾𝛼 𝛾 𝛽 · · · 𝛾𝜏 𝛾𝜔 ) = Tr(𝛾𝜔 𝛾𝜏 · · · 𝛾 𝛽 𝛾𝛼 ) (C.10)
that can be proved by employing the properties of the matrix of charge conjugation (cf. Chapter 5).
Let us now consider the trace identities involving 𝛾5 . Apart from the situations that may
be reduced to Tr(odd #) = 0, the most frequently used identities are as follows:

Tr 𝛾5 = 0 ,
Tr(𝛾 𝜇 𝛾𝜈 𝛾5 ) = 0 , (C.11)
Tr(𝛾 𝜇 𝛾𝜈 𝛾 𝜌 𝛾𝜎 𝛾5 ) = 4𝑖 𝜀 𝜇𝜈𝜌𝜎 .

A proof of the first two relations is quite straightforward. As for the last formula in (C.11), it is
not difficult to find out that the result for such a trace is totally antisymmetric in the indices 𝜇, 𝜈,
𝜌, 𝜎; thus, it must be proportional to the Levi-Civita symbol 𝜀 𝜇𝜈𝜌𝜎 . The overall factor 4𝑖 is then
fixed easily by an explicit calculation for (𝜇𝜈𝜌𝜎) = (0123); one gets, using the definition (C.2),

$$\mathrm{Tr}(\gamma_0\gamma_1\gamma_2\gamma_3\gamma_5) = \mathrm{Tr}(i\gamma_5\cdot\gamma_5) = 4i \tag{C.12}$$

(let us stress that our convention in (C.11) is 𝜀0123 = +1).
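The identities (C.11) and the normalization (C.12) can be checked by brute force over all index combinations. This is only a sketch under stated assumptions: the standard (Dirac) representation, the signature (+, −, −, −), lower indices on the 𝛾-matrices inside the trace, and 𝛾5 = 𝑖𝛾⁰𝛾¹𝛾²𝛾³ (the four-dimensional case of (C.20)). The function `levi_civita` is a helper written for this illustration.

```python
import numpy as np
from itertools import product

I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma_up = [np.block([[I2, Z2], [Z2, -I2]])] + \
           [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
metric = np.diag([1.0, -1.0, -1.0, -1.0])
gamma_lo = [metric[m, m] * gamma_up[m] for m in range(4)]  # gamma_mu = g_{mu nu} gamma^nu
gamma5 = 1j * gamma_up[0] @ gamma_up[1] @ gamma_up[2] @ gamma_up[3]

def levi_civita(idx):
    """Totally antisymmetric symbol, normalized so that eps_{0123} = +1."""
    if len(set(idx)) < len(idx):
        return 0
    inversions = sum(idx[i] > idx[j]
                     for i in range(len(idx)) for j in range(i + 1, len(idx)))
    return -1 if inversions % 2 else 1

# Tr(gamma5) and Tr(gamma_mu gamma_nu gamma5) should vanish.
dev_two = max(abs(np.trace(gamma_lo[m] @ gamma_lo[n] @ gamma5))
              for m in range(4) for n in range(4))
dev_zero = abs(np.trace(gamma5))

# Tr(gamma_mu gamma_nu gamma_rho gamma_sigma gamma5) versus 4i eps_{mu nu rho sigma}.
dev_four = 0.0
for mu, nu, rho, sig in product(range(4), repeat=4):
    tr = np.trace(gamma_lo[mu] @ gamma_lo[nu] @ gamma_lo[rho] @ gamma_lo[sig] @ gamma5)
    dev_four = max(dev_four, abs(tr - 4j * levi_civita((mu, nu, rho, sig))))

tr_0123 = np.trace(gamma_lo[0] @ gamma_lo[1] @ gamma_lo[2] @ gamma_lo[3] @ gamma5)
print(dev_zero, dev_two, dev_four, tr_0123)
```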


Now, obtaining a continuation of the series (C.11) is not as straightforward as in the
case of 𝛾-matrix products without 𝛾5 . For instance, if one wants to derive a formula for
Tr(𝛾𝜆 𝛾 𝜇 𝛾𝜈 𝛾 𝜌 𝛾𝜎 𝛾𝜏 𝛾5 ), the procedure relying on sequential anticommutations of 𝛾𝜆 toward the
end of the whole chain yields, because of the presence of the extra 𝛾5 , just the so-called Schouten
identity
𝑔 𝜇𝜆 𝜀 𝜈𝜌𝜎𝜏 − 𝑔𝜈𝜆 𝜀 𝜇𝜌𝜎𝜏 + 𝑔 𝜌𝜆 𝜀 𝜇𝜈𝜎𝜏 − 𝑔𝜎𝜆 𝜀 𝜇𝜈𝜌𝜏 + 𝑔𝜏𝜆 𝜀 𝜇𝜈𝜌𝜎 = 0 (C.13)
instead of the desired trace formula. However, one may proceed differently. As we know from
Chapter 3, sixteen 4 × 4 matrices Γ𝐴 defined by means of products of 𝛾-matrices (see (3.24)
through (3.28)) form a basis in the space of 4 × 4 matrices. So, one may expand e.g. the product
𝛾𝜆 𝛾 𝜇 𝛾𝜈 in such a basis and utilize the known properties of traces of 𝛾-matrix products. In
this way it becomes clear that there are only two types of non-trivial contributions to such an
expansion; explicitly, one has
𝛾𝜆 𝛾 𝜇 𝛾𝜈 = (𝑔𝜆𝜇 𝑔𝜈𝜔 − 𝑔𝜆𝜈 𝑔 𝜇𝜔 + 𝑔𝜆𝜔 𝑔 𝜇𝜈 )𝛾 𝜔 − 𝑖𝜀𝜆𝜇𝜈𝜔 𝛾 𝜔 𝛾5 . (C.14)
Then, using (C.14), one gets
Tr(𝛾𝜆 𝛾 𝜇 𝛾𝜈 𝛾 𝜌 𝛾𝜎 𝛾𝜏 𝛾5 ) = 4𝑖(𝑔𝜆𝜇 𝜀 𝜈𝜌𝜎𝜏 −𝑔𝜆𝜈 𝜀 𝜇𝜌𝜎𝜏 +𝑔 𝜇𝜈 𝜀𝜆𝜌𝜎𝜏 +𝑔𝜎𝜏 𝜀𝜆𝜇𝜈𝜌 −𝑔 𝜌𝜏 𝜀𝜆𝜇𝜈𝜎 +𝑔 𝜌𝜎 𝜀𝜆𝜇𝜈𝜏 ) .
(C.15)
The expressions appearing in (C.11) and (C.15) are pseudotensors due to the presence
of the Levi-Civita symbol (cf. Appendix A). It is not difficult to realize that such an algebraic
structure is in fact a simple consequence of the properties of the 𝛾5 matrix. To see this, one may
utilize the same procedure that has led to (C.9). For a proper Lorentz transformation, the matrix
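𝑆 in (4.9) is generated by $\sigma_{\mu\nu} = \frac{i}{2}[\gamma_\mu, \gamma_\nu]$, which commutes with $\gamma_5$, and thus we get
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\cdots\gamma^\tau\gamma^\omega\gamma_5) = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\cdots\Lambda^\tau{}_\gamma\Lambda^\omega{}_\delta\,\mathrm{Tr}(\gamma^\alpha\gamma^\beta\cdots\gamma^\gamma\gamma^\delta\gamma_5) . \tag{C.16}$$
However, for the parity transformation one has $S = S_P = \gamma_0$ (see Chapter 5), so that
$$S_P^{-1}\gamma_5 S_P = -\gamma_5 . \tag{C.17}$$
Thus, a general result may be written as
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\cdots\gamma^\tau\gamma^\omega\gamma_5) = \det\Lambda\;\Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\cdots\Lambda^\tau{}_\gamma\Lambda^\omega{}_\delta\,\mathrm{Tr}(\gamma^\alpha\gamma^\beta\cdots\gamma^\gamma\gamma^\delta\gamma_5) , \tag{C.18}$$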
which is just the envisaged pseudotensor transformation law (cf. Appendix A). On the basis
of the above symmetry argument it is obvious that any trace of the considered type must be
expressed by means of products of a certain number of components of the metric tensor and a
single Levi-Civita symbol. Now one also arrives at an alternative elegant proof of the identity
Tr(𝛾 𝜇 𝛾𝜈 𝛾5 ) = 0: such an expression would be a Lorentz pseudotensor with two indices, but,
obviously, there is no such a thing (to get it from 𝜀 𝜇𝜈𝜌𝜎 , one would have to contract two indices
with the help of the metric tensor, but this gives zero due to the antisymmetry of the Levi-Civita
symbol).
Some of the identities shown above can be generalized in a straightforward way to an
𝑛-dimensional spacetime (such a generalization may be useful, among other things, within the
method of dimensional regularization for closed-loop diagrams). To this end, let us reconsider
the construction of the matrix basis described in Chapter 3, which relies on forming all possible
independent products of 𝛾-matrices. In an 𝑛-dimensional spacetime one has 𝑛 𝛾-matrices
satisfying the relation (C.1). Thus, they possess basically the same properties as in the case
$n = 4$, and the total number of their linearly independent products, starting with 1 and ending
up with $\gamma_0\gamma_1\cdots\gamma_{n-1}$, is equal to
$$\binom{n}{0} + \binom{n}{1} + \ldots + \binom{n}{n-1} + \binom{n}{n} = 2^n . \tag{C.19}$$
According to the statements formulated in Chapter 3 (which hold for a general $n$ as well), these
$2^n$ matrices constitute a basis in the matrix space in question, and this in turn means that they
can be represented in the form $2^{n/2} \times 2^{n/2}$ (for the purpose of our discussion we may restrict
ourselves to even-dimensional spacetimes, $n = 2k$, $k = 1, 2, \ldots$). For $n = 4$ we recover our good
old 4 × 4 matrices; for 𝑛 = 2 one should use two 𝛾-matrices 2 × 2, for 𝑛 = 6 one has six basic
𝛾-matrices 8 × 8, etc. Note that we may introduce immediately also a fully anticommuting extra
matrix 𝛾𝑛+1 (as an analogue of 𝛾5 in four dimensions), defined conventionally as e.g.
𝛾𝑛+1 = 𝑖𝛾 0 𝛾 1 · · · 𝛾 𝑛−1 . (C.20)
Now, how about a generalization of the identities shown above? Concerning the traces,
one gets readily Tr(odd #) = 0 and for Tr(even #) we obtain, in analogy with (C.7),
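$$\begin{aligned}
\mathrm{Tr}(\gamma_\mu\gamma_\nu) &= 2^{n/2}\, g_{\mu\nu} ,\\
\mathrm{Tr}(\gamma_\mu\gamma_\nu\gamma_\rho\gamma_\sigma) &= 2^{n/2}\,(g_{\mu\nu}g_{\rho\sigma} - g_{\mu\rho}g_{\nu\sigma} + g_{\mu\sigma}g_{\nu\rho}) ,
\end{aligned} \tag{C.21}$$
etc. The point is that the derivation of (C.21) proceeds in much the same way as in the case
$n = 4$, except that here $\mathrm{Tr}\,1 = 2^{n/2}$.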
As for the “chain” or “sandwich” identities generalizing (C.3), (C.5), one has to use
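$g^\mu{}_\mu = n$, so that the anticommutation relation (C.1) then yields
$$\begin{aligned}
\gamma_\mu\gamma^\mu &= n ,\\
\gamma_\alpha\gamma^\mu\gamma^\alpha &= (2 - n)\gamma^\mu ,\\
\gamma_\alpha\gamma^\mu\gamma^\nu\gamma^\alpha &= 4g^{\mu\nu} + (n - 4)\gamma^\mu\gamma^\nu ,\\
\gamma_\alpha\gamma^\lambda\gamma^\mu\gamma^\nu\gamma^\alpha &= -2\gamma^\nu\gamma^\mu\gamma^\lambda - (n - 4)\gamma^\lambda\gamma^\mu\gamma^\nu .
\end{aligned} \tag{C.22}$$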
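For illustration, the second and third relation in (C.22) can be worked out explicitly as follows. The sketch below assumes only the anticommutator (C.1) in the form $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ and the first relation $\gamma_\alpha\gamma^\alpha = n$.

```latex
\begin{aligned}
\gamma_\alpha\gamma^\mu\gamma^\alpha
  &= \gamma_\alpha\left(2g^{\mu\alpha} - \gamma^\alpha\gamma^\mu\right)
   = 2\gamma^\mu - n\,\gamma^\mu = (2 - n)\gamma^\mu ,\\[4pt]
\gamma_\alpha\gamma^\mu\gamma^\nu\gamma^\alpha
  &= \gamma_\alpha\gamma^\mu\left(2g^{\nu\alpha} - \gamma^\alpha\gamma^\nu\right)
   = 2\gamma^\nu\gamma^\mu - (2 - n)\gamma^\mu\gamma^\nu \\
  &= 2\left(2g^{\mu\nu} - \gamma^\mu\gamma^\nu\right) - (2 - n)\gamma^\mu\gamma^\nu
   = 4g^{\mu\nu} + (n - 4)\gamma^\mu\gamma^\nu .
\end{aligned}
```

The last relation in (C.22) follows by repeating the same step once more, i.e. by anticommuting the rightmost $\gamma^\alpha$ through $\gamma^\nu$ and using the result for the product with two matrices inside.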
As we have already noted before, formulae for traces involving 𝛾𝑛+1 do not have such a
regular structure as (C.21), because of the specific properties of the Levi-Civita pseudotensor.
For an illustration, let us display at least one particular example (in which, however, some salient
features of the problem are manifested). So, for 𝑛 = 6 one gets
Tr 𝛾7 = 0,
Tr(𝛾 𝜇 𝛾𝜈 𝛾7 ) = 0,
(C.23)
Tr(𝛾 𝜇 𝛾𝜈 𝛾 𝜌 𝛾𝜎 𝛾7 ) = 0,
Tr(𝛾 𝜇 𝛾𝜈 𝛾 𝜌 𝛾𝜎 𝛾𝜏 𝛾𝜔 𝛾7 ) = 8𝑖 𝜀 𝜇𝜈𝜌𝜎𝜏𝜔 ,
if 𝛾7 is defined according to (C.20) and, in compliance with our earlier convention, we set
𝜀012345 = +1. An attentive reader may notice that a generic feature of the formulae like
(C.23) is the increasing number of vanishing traces of the type Tr(even # 𝛾𝑛+1 ) due to the total
antisymmetry of the Levi-Civita symbol.
Let us now return to four dimensions. Apart from the identities shown previously, there
is a triplet of elegant formulae that are highly useful in calculations of decay rates and scattering
cross sections in models involving Dirac fields. These read
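$$\begin{aligned}
\mathrm{Tr}(\slashed{a}\gamma_\mu\slashed{b}\gamma_\nu)\cdot\mathrm{Tr}(\slashed{c}\gamma^\mu\slashed{d}\gamma^\nu) &= 32\left[(a\cdot c)(b\cdot d) + (a\cdot d)(b\cdot c)\right] ,\\
\mathrm{Tr}(\slashed{a}\gamma_\mu\slashed{b}\gamma_\nu\gamma_5)\cdot\mathrm{Tr}(\slashed{c}\gamma^\mu\slashed{d}\gamma^\nu\gamma_5) &= 32\left[(a\cdot c)(b\cdot d) - (a\cdot d)(b\cdot c)\right] ,\\
\mathrm{Tr}(\slashed{a}\gamma_\mu\slashed{b}\gamma_\nu)\cdot\mathrm{Tr}(\slashed{c}\gamma^\mu\slashed{d}\gamma^\nu\gamma_5) &= 0 .
\end{aligned} \tag{C.24}$$
The proof of the first relation is based on a straightforward application of (C.7) and the third
identity is obtained easily from (C.7) and (C.11). To prove the second formula, one employs
(C.11) and an identity for the Levi-Civita tensor, namely
$$\varepsilon^{\mu\nu\alpha\beta}\varepsilon_{\rho\sigma\alpha\beta} = -2\left(\delta^\mu_\rho\delta^\nu_\sigma - \delta^\mu_\sigma\delta^\nu_\rho\right) . \tag{C.25}$$
Note that in the present text we occasionally use the nickname “formulae 32” for the triplet
(C.24).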
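The triplet (C.24) lends itself to a quick numerical test with random four-vectors. The sketch below again assumes the standard representation, the signature (+, −, −, −) and 𝛾5 = 𝑖𝛾⁰𝛾¹𝛾²𝛾³; the function `contracted` and the other helper names are hypothetical and serve only this illustration.

```python
import numpy as np

I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma_up = [np.block([[I2, Z2], [Z2, -I2]])] + \
           [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
metric = np.diag([1.0, -1.0, -1.0, -1.0])
gamma_lo = [metric[m, m] * gamma_up[m] for m in range(4)]
gamma5 = 1j * gamma_up[0] @ gamma_up[1] @ gamma_up[2] @ gamma_up[3]

def dot(a, b):
    return a @ metric @ b

def slash(a):
    a_lo = metric @ a
    return sum(a_lo[m] * gamma_up[m] for m in range(4))

rng = np.random.default_rng(7)
a, b, c, d = rng.normal(size=(4, 4))

def contracted(g5_left, g5_right):
    """Sum over mu, nu of Tr(a/ g_mu b/ g_nu [g5]) * Tr(c/ g^mu d/ g^nu [g5])."""
    total = 0
    for mu in range(4):
        for nu in range(4):
            left = slash(a) @ gamma_lo[mu] @ slash(b) @ gamma_lo[nu]
            right = slash(c) @ gamma_up[mu] @ slash(d) @ gamma_up[nu]
            if g5_left:
                left = left @ gamma5
            if g5_right:
                right = right @ gamma5
            total += np.trace(left) * np.trace(right)
    return total

first = contracted(False, False)
second = contracted(True, True)
third = contracted(False, True)
rhs_plus = 32 * (dot(a, c) * dot(b, d) + dot(a, d) * dot(b, c))
rhs_minus = 32 * (dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c))
print(first - rhs_plus, second - rhs_minus, third)  # all three should be ~0
```

Note that the second relation contains a product of two traces with 𝛾5, so the result does not depend on the sign convention chosen for 𝛾5.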
Next, we are going to discuss the Gordon identity (often called also Gordon decompo-
sition). Its basic ingredients are bispinor amplitudes of Dirac plane waves u ( 𝑝), v ( 𝑝) satisfying
the familiar equations

$$(\slashed{p} - m)\,u(p) = 0 , \qquad (\slashed{p} + m)\,v(p) = 0 , \tag{C.26}$$
and, equivalently,
$$\bar{u}(p)(\slashed{p} - m) = 0 , \qquad \bar{v}(p)(\slashed{p} + m) = 0 \tag{C.27}$$
(we suppress here the spin labels). A basic Gordon identity reads
$$\bar{u}(p)\gamma_\mu u(p') = \frac{1}{2m}\,\bar{u}(p)\left[(p + p')_\mu + i\sigma_{\mu\nu}(p - p')^\nu\right]u(p') , \tag{C.28}$$
with $\sigma_{\mu\nu} = \frac{i}{2}[\gamma_\mu, \gamma_\nu]$. The derivation of this formula is quite easy if one hits on a clever trick
that consists in representing zero in a rather sophisticated form, namely
$$0 = \bar{u}(p)\left[(\slashed{p} - m)\gamma_\mu + \gamma_\mu(\slashed{p}' - m)\right]u(p') . \tag{C.29}$$


 

From (C.29) one gets readily


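$$\bar{u}(p)\gamma_\mu u(p') = \frac{1}{2m}\,\bar{u}(p)\left[\slashed{p}\gamma_\mu + \gamma_\mu\slashed{p}'\right]u(p') , \tag{C.30}$$
and the matrix products in the square brackets are then conveniently rewritten in terms of anticommutators
and commutators (recall that in general one has $AB = \frac{1}{2}\{A, B\} + \frac{1}{2}[A, B]$). Thus we
have
$$\slashed{p}\gamma_\mu = \frac{1}{2}\{\slashed{p}, \gamma_\mu\} + \frac{1}{2}[\slashed{p}, \gamma_\mu] = p_\mu + \frac{1}{2}p^\nu[\gamma_\nu, \gamma_\mu] = p_\mu + i\sigma_{\mu\nu}p^\nu , \tag{C.31}$$
and similarly
$$\gamma_\mu\slashed{p}' = \frac{1}{2}\{\gamma_\mu, \slashed{p}'\} + \frac{1}{2}[\gamma_\mu, \slashed{p}'] = p'_\mu - i\sigma_{\mu\nu}p'^\nu . \tag{C.32}$$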
So, substituting (C.31) and (C.32) into (C.30), one recovers (C.28). Now it is also clear that
there are three additional identities of the type (C.28) involving $v$ and/or $\bar{v}$ along with (or instead
of) $u$ and $\bar{u}$. To derive these, one may start with the trick (C.29), where an expression $\slashed{p} - m$ or
$\slashed{p}' - m$ is replaced by $(-\slashed{p} - m)$ or $(-\slashed{p}' - m)$ if $\bar{v}$ and/or $v$ enters the game. Thus it is clear that
the “master identity” (C.28) generates three extra variants simply by changing appropriately the
sign of $p$ and/or $p'$. In such a way, one arrives at
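$$\begin{aligned}
\bar{u}(p)\gamma_\mu v(p') &= \frac{1}{2m}\,\bar{u}(p)\left[(p - p')_\mu + i\sigma_{\mu\nu}(p + p')^\nu\right]v(p') ,\\
\bar{v}(p)\gamma_\mu u(p') &= \frac{1}{2m}\,\bar{v}(p)\left[(-p + p')_\mu + i\sigma_{\mu\nu}(-p - p')^\nu\right]u(p') ,\\
\bar{v}(p)\gamma_\mu v(p') &= \frac{1}{2m}\,\bar{v}(p)\left[(-p - p')_\mu + i\sigma_{\mu\nu}(-p + p')^\nu\right]v(p') .
\end{aligned} \tag{C.33}$$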
Note that the forms (C.28), (C.33) may be viewed as a decomposition of the “Dirac current” into
its “convective” and “spin” part; the common terminology reflects just this observation.
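The master identity (C.28) can also be verified numerically. In the sketch below (standard representation and signature (+, −, −, −) assumed), a solution of (C.26) is produced as $u(p) = (\slashed{p} + m)w$ with an arbitrary bispinor $w$, which works because $(\slashed{p} - m)(\slashed{p} + m) = p^2 - m^2 = 0$ on the mass shell. Such spinors carry no particular normalization or spin label, which is immaterial for the identity. The identity is checked in the equivalent form with an upper index 𝜇; all helper names are illustrative.

```python
import numpy as np

I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma_up = [np.block([[I2, Z2], [Z2, -I2]])] + \
           [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
metric = np.diag([1.0, -1.0, -1.0, -1.0])
one = np.eye(4, dtype=complex)

def slash(a):
    a_lo = metric @ a
    return sum(a_lo[m] * gamma_up[m] for m in range(4))

def on_shell(p3, m):
    """Four-momentum (E, p) with E = sqrt(m^2 + p^2)."""
    return np.concatenate(([np.sqrt(m**2 + p3 @ p3)], p3))

def bar(u):
    """Dirac conjugate u-bar = u^dagger gamma^0 (as a row vector)."""
    return u.conj() @ gamma_up[0]

m = 1.0
p = on_shell(np.array([0.3, -0.2, 0.5]), m)
pp = on_shell(np.array([-0.4, 0.1, 0.7]), m)   # plays the role of p'

rng = np.random.default_rng(3)
u = (slash(p) + m * one) @ (rng.normal(size=4) + 1j * rng.normal(size=4))
up = (slash(pp) + m * one) @ (rng.normal(size=4) + 1j * rng.normal(size=4))

# sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]
sigma_up = [[0.5j * (gamma_up[a] @ gamma_up[b] - gamma_up[b] @ gamma_up[a])
             for b in range(4)] for a in range(4)]
k_lo = metric @ (p - pp)                       # (p - p')_nu

lhs = np.array([bar(u) @ gamma_up[mu] @ up for mu in range(4)])
rhs = np.array([bar(u) @ ((p + pp)[mu] * one
                          + 1j * sum(sigma_up[mu][nu] * k_lo[nu] for nu in range(4))) @ up
                for mu in range(4)]) / (2 * m)
print(np.max(np.abs(lhs - rhs)))               # should be at rounding level
```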
Finally, let us comment briefly on a general definition of the spin matrices and on the
intriguing relation (7.3) involving 𝛾5 . In Chapter 2 we have introduced the triplet Σ 𝑗 , 𝑗 = 1, 2, 3,
defined directly in terms of Pauli matrices (see (2.10)). Such a straightforward definition
corresponds to the standard representation of 𝛾-matrices. In fact, it is not difficult to figure out a
natural definition of $\vec{\Sigma}$, which is independent of a specific representation of 𝛾-matrices. It reads
$$\Sigma_j = \frac{1}{2}\varepsilon_{jkl}\sigma_{kl} , \quad \text{i.e.} \quad \sigma_{jk} = \varepsilon_{jkl}\Sigma_l \tag{C.34}$$
(let us recall that the position of the indices is irrelevant — up or down, it does not matter).
From (C.34) one gets readily

Σ1 = 𝜎23 , Σ2 = 𝜎31 , Σ3 = 𝜎12 . (C.35)

As we know from Chapter 4, the matrices $\frac{1}{2}\sigma_{\alpha\beta}$ satisfy commutation relations of the Lorentz
algebra (see (4.22) and (4.24)). Further, the fundamental Lorentz generators $I_{kl}$ labelled appropriately as
$$J_1 = I_{23} , \quad J_2 = I_{31} , \quad J_3 = I_{12} \tag{C.36}$$
correspond to components of an angular momentum, in view of the relations (A.26) in Appendix A.
Thus it is clear that the 4 × 4 matrices $\vec{S} = \frac{1}{2}\vec{\Sigma}$ defined through Eq. (C.34) satisfy the
commutation relations of angular momentum (spin).
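As a simple cross-check of the definition (C.34), one may evaluate it in the standard representation and compare with the block-diagonal matrices built from Pauli matrices. The sketch below does this and also tests the angular momentum algebra for $\vec{S} = \frac{1}{2}\vec{\Sigma}$. The helper `eps3` and the zero-based labelling of spatial indices are conventions of this illustration only.

```python
import numpy as np

I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
# Spatial gamma matrices gamma^1, gamma^2, gamma^3 in the standard representation.
gamma_sp = [np.block([[Z2, s], [-s, Z2]]) for s in pauli]

def comm(a, b):
    return a @ b - b @ a

def eps3(i, j, k):
    """Three-index Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# sigma_{kl} = (i/2)[gamma_k, gamma_l]; for two spatial indices the position
# (upper or lower) of the indices makes no difference.
sigma_sp = [[0.5j * comm(gamma_sp[k], gamma_sp[l]) for l in range(3)] for k in range(3)]

# Sigma_j = (1/2) eps_{jkl} sigma_{kl}, cf. (C.34)
Sigma = [0.5 * sum(eps3(j, k, l) * sigma_sp[k][l] for k in range(3) for l in range(3))
         for j in range(3)]
S = [0.5 * X for X in Sigma]

matches_pauli = all(np.allclose(Sigma[j], np.kron(I2, pauli[j])) for j in range(3))
spin_algebra = all(
    np.allclose(comm(S[i], S[j]), 1j * sum(eps3(i, j, k) * S[k] for k in range(3)))
    for i in range(3) for j in range(3))
print(matches_pauli, spin_algebra)
```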
Now it should also be obvious that the above-mentioned relation (7.3) is representation
independent. Indeed, using the general definition (C.34) and the fundamental theorem on 𝛾-matrices,
one can see immediately that $\vec{\Sigma}$, $\vec{\alpha}$ and $\gamma_5$ are changed via a similarity transformation
when passing from one representation to another. Therefore, the validity of Eq. (7.3) in the
standard representation guarantees that it holds in any other one as well.

Appendix D

More about spin states of Dirac field

The purpose of this appendix is to show that the one-particle states of the quantized Dirac field,
𝑏 † ( 𝑝, 𝑠)|0⟩ or 𝑑 † ( 𝑝, 𝑠)|0⟩ are indeed eigenstates of the helicity. Our treatment is adapted, with
minor modifications, from the book [18].
To begin with, one has to establish some identities that are instrumental in achieving the
desired goal. First, employing the basic anticommutation relations (19.15), one obtains

$$\begin{aligned}
\{\psi(x), \psi(y)\} &= 0 ,\\
\{\psi(\vec{x}, t), \psi^\dagger(\vec{y}, t)\} &= \delta^{(3)}(\vec{x} - \vec{y}) .
\end{aligned} \tag{D.1}$$

Note that the first identity (D.1) is an obvious consequence of (19.15). A proof of the second
relation requires some calculation, which is not difficult and hopefully can be left to the diligent
reader as an instructive exercise.
Next, utilizing the results of Chapter 15, the operators of the momentum and angular
momentum of the quantized Dirac field may be written as (cf. (15.52) through (15.54))

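$$\begin{aligned}
\vec{P} &= \int d^3x\, \psi^\dagger(x)\left(-i\vec{\nabla}\right)\psi(x) ,\\
\vec{M} &= \int d^3x\, \psi^\dagger(x)\left(\vec{L} + \frac{1}{2}\vec{\Sigma}\right)\psi(x) ,
\end{aligned} \tag{D.2}$$
where
$$\vec{L} = -i(\vec{x} \times \vec{\nabla}) \tag{D.3}$$
and $\frac{1}{2}\vec{\Sigma}$ are the familiar spin matrices. Let us also recall that in our notation, $\nabla_j$, $j = 1, 2, 3$,
means $\partial/\partial x^j$. A crucial consequence of the above results is represented by the following
commutators
$$\begin{aligned}
[\vec{P}, \psi(x)] &= i\vec{\nabla}\psi(x) ,\\
[\vec{M}, \psi(x)] &= \left[i(\vec{x} \times \vec{\nabla}) - \frac{1}{2}\vec{\Sigma}\right]\psi(x) .
\end{aligned} \tag{D.4}$$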

The above commutation relations are proved easily. Indeed, one may utilize the elementary
identity
[ 𝐴𝐵, 𝐶] = 𝐴{𝐵, 𝐶} − { 𝐴, 𝐶}𝐵 , (D.5)

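and this yields, e.g. for the first commutator in (D.4),
$$\begin{aligned}
[\vec{P}, \psi(x)] &= \int d^3y \left[\psi^\dagger(\vec{y}, t)\left(-i\vec{\nabla}_y\right)\psi(\vec{y}, t),\, \psi(\vec{x}, t)\right]\\
&= \int d^3y \left(\psi^\dagger(\vec{y}, t)\left(-i\vec{\nabla}_y\right)\{\psi(\vec{y}, t), \psi(\vec{x}, t)\} - \{\psi^\dagger(\vec{y}, t), \psi(\vec{x}, t)\}\left(-i\vec{\nabla}_y\right)\psi(\vec{y}, t)\right)
\end{aligned} \tag{D.6}$$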

(note that we have been allowed to use the same value of 𝑡 in all field operators appearing in
(D.6), since 𝑃® is time independent). From (D.6) and (D.1) one then gets readily the first formula
(D.4). The second relation (D.4) is obtained in the same way.
Now, we may take Hermitian conjugation of (D.4) and let the result act on the vacuum
|0⟩, taking into account that
$$\vec{P}|0\rangle = 0 , \qquad \vec{M}|0\rangle = 0 \tag{D.7}$$
(because of normal ordering). One thus gets, after a simple manipulation,

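$$\begin{aligned}
\vec{P}\,\psi^\dagger(x)|0\rangle &= i\vec{\nabla}\psi^\dagger(x)|0\rangle ,\\
\vec{M}\,\psi^\dagger(x)|0\rangle &= \left[i(\vec{x} \times \vec{\nabla})\psi^\dagger(x) + \psi^\dagger(x)\,\frac{1}{2}\vec{\Sigma}\right]|0\rangle .
\end{aligned} \tag{D.8}$$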

As a last item (last but not least), we will need a formula expressing the creation operator
𝑏 † ( 𝑝, 𝑠) (or 𝑑 † ( 𝑝, 𝑠)) in terms of 𝜓(𝑥). To this end, let us recall the expansion of 𝜓(𝑥) in the
annihilation and creation operators (see (19.6)); we reproduce it here for reader’s convenience:

$$\psi(x) = \int d^3k\, N_k \sum_r \left[b(k, r)\,u(k, r)\,e^{-ikx} + d^\dagger(k, r)\,v(k, r)\,e^{ikx}\right] . \tag{D.9}$$

Multiplying (D.9) by $e^{ipx}$ and integrating over $d^3x$, one gets the notorious delta functions that
enable one to replace the variable $\vec{k}$ with $\vec{p}$ or $-\vec{p}$, respectively. Next, one may multiply the
resulting expression by $u^\dagger(p, s)$ from the left and employ then the relations (19.9), (19.10) (these
follow from Gordon identities discussed in Appendix C, see (C.28), (C.33)). In this way, the
term in (D.9) that involves $d^\dagger$ along with $v$ drops out and we are left just with the annihilation
operator $b(p, s)$. One thus has
$$b(p, s) = N_p \int d^3x\, e^{ipx}\, u^\dagger(p, s)\,\psi(x) . \tag{D.10}$$
Taking the Hermitian conjugate of (D.10) we obtain finally
$$b^\dagger(p, s) = N_p \int d^3x\, e^{-ipx}\, \psi^\dagger(x)\,u(p, s) . \tag{D.11}$$

Now we are in a position to take up the problem of helicity. We would like to examine
the action of the scalar product $\vec{P} \cdot \vec{M}$ on the one-particle state $b^\dagger(p, s)|0\rangle$, i.e.
$$|a\rangle = \vec{P} \cdot \vec{M}\, b^\dagger(p, s)|0\rangle . \tag{D.12}$$
In this context, it is useful to realize that $\vec{P} \cdot \vec{M} = \vec{M} \cdot \vec{P}$, since
$$[M_j, P_j] = 0 \tag{D.13}$$
(such a vanishing commutator has an obvious analogue in ordinary quantum mechanics, where
one has, in general, $[M_j, P_k] = i\varepsilon_{jkl}P_l$). So, we are going to evaluate
$$|a\rangle = \vec{M} \cdot \vec{P}\, b^\dagger(p, s)|0\rangle = \vec{M} \cdot \vec{p}\; b^\dagger(p, s)|0\rangle . \tag{D.14}$$

Employing the representation (D.11), one has



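$$|a\rangle = \vec{p} \cdot \vec{M}\, N_p \int d^3x\, e^{-ipx}\, \psi^\dagger(x)|0\rangle\, u(p, s) . \tag{D.15}$$
Since $\vec{M}$ is independent of spacetime coordinates, (D.15) can be recast as
$$\begin{aligned}
|a\rangle &= \vec{p} \cdot N_p \int d^3x\, e^{-ipx}\, \vec{M}\psi^\dagger(x)|0\rangle\, u(p, s)\\
&= \vec{p} \cdot N_p \int d^3x\, e^{-ipx} \left[i(\vec{x} \times \vec{\nabla})\psi^\dagger(x) + \psi^\dagger(x)\,\frac{1}{2}\vec{\Sigma}\right]|0\rangle\, u(p, s) ,
\end{aligned} \tag{D.16}$$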
where we have utilized the second identity in (D.8). Let us first consider the “orbital part” of
(D.16), which involves the operator

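$$\begin{aligned}
\mathcal{O} &= p^j \int d^3x\, e^{-ipx}\, (\vec{x} \times \vec{\nabla})^j\, \psi^\dagger(x)\\
&= p^j \varepsilon^{jrs} \int d^3x\, e^{-ipx}\, x^r \nabla^s \psi^\dagger(x) .
\end{aligned} \tag{D.17}$$
Performing the integration by parts (and discarding the surface term), (D.17) is recast as
$$\begin{aligned}
\mathcal{O} &= -p^j \varepsilon^{jrs} \int d^3x\, \nabla^s\!\left(e^{-ipx}\, x^r\right) \psi^\dagger(x)\\
&= -p^j \varepsilon^{jrs} \int d^3x \left(i p^s e^{-ipx}\, x^r + e^{-ipx}\, \delta^{rs}\right) \psi^\dagger(x) .
\end{aligned} \tag{D.18}$$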

Now it is clear that $\mathcal{O}$ vanishes, because of the total antisymmetry of the Levi-Civita symbol
$\varepsilon^{jrs}$. Thus, the contribution of the orbital part of the scalar product $\vec{M} \cdot \vec{P}$ in (D.14) is seen
to be identically zero, in accordance with an intuitive expectation (note that classically, orbital
angular momentum is orthogonal to the linear momentum, $\vec{L} \cdot \vec{P} = 0$).
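Spelled out, the two terms in the integrand of (D.18) drop out separately; in each of them the antisymmetric symbol $\varepsilon^{jrs}$ is contracted with an object that is symmetric in a pair of its indices:

```latex
\begin{aligned}
p^j \varepsilon^{jrs}\, p^s x^r &= x^r \left(\varepsilon^{jrs}\, p^j p^s\right) = 0
  \qquad \text{($p^j p^s$ is symmetric in $j$, $s$)} ,\\
p^j \varepsilon^{jrs}\, \delta^{rs} &= p^j \varepsilon^{jrr} = 0
  \qquad \text{($\delta^{rs}$ is symmetric in $r$, $s$)} .
\end{aligned}
```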
All this means that the expression (D.16) is reduced to
$$|a\rangle = N_p \int d^3x\, e^{-ipx}\, \psi^\dagger(x)|0\rangle\, \frac{1}{2}(\vec{p} \cdot \vec{\Sigma})\, u(p, s) . \tag{D.19}$$
However, according to the definition of $u(p, s)$ as the amplitude of a plane wave carrying a
definite helicity one has
$$\vec{n} \cdot \vec{S}\, u(p, s) = h\, u(p, s) , \tag{D.20}$$
where $\vec{n} = \vec{p}/|\vec{p}|$, $\vec{S} = \frac{1}{2}\vec{\Sigma}$ and $h$ is the corresponding helicity ($h = \pm\frac{1}{2}$). Thus, using (D.20) and
(D.11) we can see that
$$|a\rangle = h\,|\vec{p}|\, b^\dagger(p, s)|0\rangle ,$$
i.e.
$$\frac{\vec{p}}{|\vec{p}|} \cdot \vec{M}\, b^\dagger(p, s)|0\rangle = h\, b^\dagger(p, s)|0\rangle , \tag{D.21}$$
and this is what we wanted to show. The analysis of the state 𝑑 † ( 𝑝, 𝑠)|0⟩ may proceed in much
the same way.
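The key input (D.20) can be illustrated numerically. The following sketch (standard representation assumed, with $\vec{\Sigma}$ block-diagonal in Pauli matrices) builds a positive-energy solution as $u = (\slashed{p} + m)w$, where $w = (\chi, 0)$ and $\chi$ is an eigenvector of $\vec{\sigma} \cdot \vec{n}$. Since $\vec{\Sigma} \cdot \vec{n}$ commutes with $\slashed{p}$, the resulting $u$ is a helicity eigenstate. The normalization of $u$ is left arbitrary here, and all variable names are illustrative.

```python
import numpy as np

I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma_up = [np.block([[I2, Z2], [Z2, -I2]])] + \
           [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
one = np.eye(4, dtype=complex)

m = 1.0
p3 = np.array([0.3, -1.1, 0.8])                 # an arbitrary three-momentum
E = np.sqrt(m**2 + p3 @ p3)
n = p3 / np.linalg.norm(p3)

# p-slash = E gamma^0 - p . gamma  for signature (+,-,-,-)
pslash = E * gamma_up[0] - sum(p3[j] * gamma_up[j + 1] for j in range(3))
Sigma_n = sum(n[j] * np.kron(I2, pauli[j]) for j in range(3))   # Sigma . n

# Two-component eigenvectors of sigma . n (eigenvalues -1 and +1).
vals, vecs = np.linalg.eigh(sum(n[j] * pauli[j] for j in range(3)))

results = []
for lam, chi in zip(vals, vecs.T):
    h = lam / 2                                  # helicity, expected to be -1/2 or +1/2
    w = np.concatenate((chi, np.zeros(2)))
    u = (pslash + m * one) @ w                   # solves (p-slash - m) u = 0
    dirac_residual = np.linalg.norm((pslash - m * one) @ u)
    helicity_residual = np.linalg.norm(0.5 * Sigma_n @ u - h * u)
    results.append((h, dirac_residual, helicity_residual, np.linalg.norm(u)))

for row in results:
    print(row)
```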

Appendix E

Photon propagator
in a general covariant gauge

The “Fermi trick” (31.26) may be generalized by means of a simple modification of the gauge-
fixing term, using
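$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\alpha}(\partial \cdot A)^2 , \tag{E.1}$$
with $\alpha$ being an arbitrary real parameter. In such a case one has
$$\frac{\partial\mathcal{L}}{\partial(\partial_\rho A_\sigma)} = -F^{\rho\sigma} - \frac{1}{\alpha}(\partial \cdot A)\,g^{\rho\sigma} , \tag{E.2}$$
and the equation of motion then becomes
$$\Box A^\sigma + \left(\frac{1}{\alpha} - 1\right)\partial_\rho\partial^\sigma A^\rho = 0 ,$$
or
$$\left[g^\sigma{}_\rho\,\Box + \left(\frac{1}{\alpha} - 1\right)\partial^\sigma\partial_\rho\right]A^\rho = 0 . \tag{E.3}$$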
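For completeness, the step from (E.2) to (E.3) may be sketched as follows. It is assumed here that $F^{\rho\sigma} = \partial^\rho A^\sigma - \partial^\sigma A^\rho$; since the Lagrangian (E.1) does not depend on $A_\sigma$ itself, the Euler–Lagrange equation reduces to the vanishing divergence of (E.2).

```latex
\begin{aligned}
0 = \partial_\rho \frac{\partial\mathcal{L}}{\partial(\partial_\rho A_\sigma)}
  &= -\partial_\rho F^{\rho\sigma} - \frac{1}{\alpha}\,\partial^\sigma(\partial \cdot A) \\
  &= -\Box A^\sigma + \partial^\sigma(\partial \cdot A) - \frac{1}{\alpha}\,\partial^\sigma(\partial \cdot A) ,
\end{aligned}
\qquad\Longrightarrow\qquad
\Box A^\sigma + \left(\frac{1}{\alpha} - 1\right)\partial^\sigma\partial_\rho A^\rho = 0 .
```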
Thus, Eq. (E.3) for 𝛼 ≠ 1 differs from d’Alembert equation and the procedure of canonical
operator quantization is therefore considerably more complicated than in the case 𝛼 = 1.
However, since we are primarily interested in the corresponding propagator, one may try a
heuristic method relying on a general observation that the propagator of a quantized field is
a particular Green’s function of the equation of motion in question (cf. our previous findings
concerning scalar field). Now, for 𝛼 = 1 one obviously gets
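$$\Box\,\mathcal{D}^\mu{}_\nu(x) = g^\mu{}_\nu\,\delta^{(4)}(x) , \tag{E.4}$$
simply because
$$[A_\mu(x), \dot{A}_\nu(y)]_{\text{E.T.}} = -i g_{\mu\nu}\,\delta^{(3)}(\vec{x} - \vec{y})$$
(see (31.30)). Thus, for Eq. (E.3) one may try
$$\left[g^\mu{}_\lambda\,\Box + \left(\frac{1}{\alpha} - 1\right)\partial^\mu\partial_\lambda\right]\mathcal{D}^\lambda{}_\nu(x) = g^\mu{}_\nu\,\delta^{(4)}(x) . \tag{E.5}$$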
The differential equation (E.5) can be solved in the usual way by means of Fourier transformation.
Writing
$$\mathcal{D}_{\mu\nu}(x) = \int \frac{d^4q}{(2\pi)^4}\, D_{\mu\nu}(q)\, e^{iqx} , \tag{E.6}$$

from (E.5) one then gets an algebraic equation for $D_{\mu\nu}(q)$, namely
$$\left[-g^\mu{}_\lambda\, q^2 - \left(\frac{1}{\alpha} - 1\right)q^\mu q_\lambda\right]D^\lambda{}_\nu(q) = g^\mu{}_\nu . \tag{E.7}$$

This can be written in matrix form as

𝐿 · 𝐷 = 1, (E.8)

where the matrix elements of 𝐿 are just

$$L^\mu{}_\lambda = -g^\mu{}_\lambda\, q^2 - \left(\frac{1}{\alpha} - 1\right)q^\mu q_\lambda \tag{E.9}$$

and 1 is the 4 × 4 unit matrix. Finding 𝐷 thus amounts to inverting 𝐿, i.e. 𝐷 = 𝐿 −1 . The most
efficient way of solving (E.8) consists in decomposing the matrices 𝐿 and 𝐷 (as well as 1) in
terms of two independent projection operators. Since the matrices in question are also Lorentz
tensors, one may formally raise and lower the matrix indices whenever it may be convenient.
The projection operators we have in mind are
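$$\begin{aligned}
(P_{\mathrm{T}})_{\mu\nu} &= g_{\mu\nu} - \frac{q_\mu q_\nu}{q^2} ,\\
(P_{\mathrm{L}})_{\mu\nu} &= \frac{q_\mu q_\nu}{q^2} ,
\end{aligned} \tag{E.10}$$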
where the labels T and L stand for transverse and longitudinal, respectively. It is easy to see that
𝑃T and 𝑃L are (mutually orthogonal) projection operators: it holds

(𝑃T ) 2 = 𝑃T , (𝑃L ) 2 = 𝑃L , 𝑃T · 𝑃L = 0 , 𝑃T + 𝑃L = 1 . (E.11)

Now, writing 𝐿 and 𝐷 as

𝐿 = 𝐴𝑃T + 𝐵𝑃L ,
(E.12)
𝐷 = 𝑋 𝑃T + 𝑌 𝑃L ,

where 𝐴, 𝐵, 𝑋, 𝑌 are some numerical coefficients, and using (E.11), Eq. (E.8) is recast as

𝐴𝑋 𝑃T + 𝐵𝑌 𝑃L = 𝑃T + 𝑃L , (E.13)

and thus
$$X = \frac{1}{A} , \qquad Y = \frac{1}{B} . \tag{E.14}$$
In this way, the problem of inverting the 4 × 4 matrix 𝐿 is reduced to inverting numbers 𝐴 and
𝐵, a trivial operation. From (E.9) it is seen that
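$$L = -q^2 P_{\mathrm{T}} - \frac{1}{\alpha}\, q^2 P_{\mathrm{L}} , \tag{E.15}$$
so
$$D = \frac{1}{-q^2}\, P_{\mathrm{T}} + \frac{1}{-\frac{1}{\alpha}q^2}\, P_{\mathrm{L}} ,$$
i.e., using (E.10), one gets finally
$$D_{\mu\nu}(q) = \frac{1}{q^2}\left[-g_{\mu\nu} + (1 - \alpha)\frac{q_\mu q_\nu}{q^2}\right] . \tag{E.16}$$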

The expression (E.16) represents a class of propagators of the electromagnetic field, corresponding
to covariant gauges (often called “𝛼-gauges”). Admittedly, this is a somewhat vague statement
at the present heuristic level, but this subtle issue goes beyond the scope of our elementary
approach. Of course, one thus obtains just the relevant functional dependence of $D_{\mu\nu}(q)$, while
the replacement $1/q^2 \to 1/(q^2 + i\varepsilon)$ for the true Feynman propagator is done by hand (well, but
we know that at least for $\alpha = 1$ we are able to do it correctly).
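The algebra leading from (E.7) to (E.16) can be cross-checked numerically by treating $L$ and $D$ as 4 × 4 matrices with mixed indices (upper row index, lower column index), so that the index contraction in (E.8) becomes an ordinary matrix product. This is a minimal sketch with an arbitrarily chosen off-shell momentum and gauge parameter; the signature (+, −, −, −) is assumed and the variable names are illustrative.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])
alpha = 2.7                                   # arbitrary gauge parameter
q_up = np.array([1.3, 0.4, -0.7, 0.2])        # arbitrary off-shell momentum q^mu
q_lo = metric @ q_up                          # q_mu
q2 = q_up @ q_lo                              # q^2 (non-zero for this choice)

one = np.eye(4)
qq = np.outer(q_up, q_lo)                     # matrix with elements q^mu q_lambda

P_L = qq / q2                                 # longitudinal projector, cf. (E.10)
P_T = one - P_L                               # transverse projector

L = -q2 * one - (1.0 / alpha - 1.0) * qq      # (E.9)
D = (-one + (1.0 - alpha) * qq / q2) / q2     # (E.16) with one index raised

projectors_ok = (np.allclose(P_T @ P_T, P_T) and np.allclose(P_L @ P_L, P_L)
                 and np.allclose(P_T @ P_L, 0))
decomposition_ok = np.allclose(L, -q2 * P_T - q2 / alpha * P_L)   # (E.15)
inverse_ok = np.allclose(L @ D, one)                               # (E.8)
print(projectors_ok, decomposition_ok, inverse_ok)
```

The same check can be repeated for other values of `alpha`; only `alpha = 0` has to be excluded in the matrix `L`, in line with the remark on the Landau gauge.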
There are some prominent values of 𝛼, related to famous names:
• 𝛼 = 1: Feynman gauge ,
• 𝛼 = 0: Landau gauge .
Note that 𝛼 = 0 does not make sense in the Lagrangian (E.1), but the propagator (E.16) is perfectly well defined for this value.
Finally, notice that for 𝛼 → ∞ one recovers the original gauge invariant Maxwell Lagrangian,
but in this case the matrix 𝐿 is singular (it becomes proportional to 𝑃T ) and has no inverse. This is precisely
the point of gauge fixing: for the gauge invariant Lagrangian itself the problem of finding the
propagator is ill-defined.

Appendix F

Electromagnetic form factors of electron

In this appendix, we discuss briefly the general formula (50.15). As we have already mentioned
before, at the tree level it is valid automatically, with 𝐹1 (𝑞 2 ) = 1, 𝐹2 (𝑞 2 ) = 0. Now the question
is what one gets in higher orders for the matrix product

$$\mathcal{M}_\mu = \bar{u}(p')\,\Gamma_\mu(p', p)\,u(p) , \tag{F.1}$$

where Γ𝜇 ( 𝑝′, 𝑝) denotes the vertex function represented by Feynman diagrams with two external
electron lines and one external photon line (pictorially, see Fig. F.1). Needless to say, $p$ and $p'$
are taken to be on the mass shell, $p^2 = p'^2 = m^2$, while the external photon line is in general
off-shell. Within the covariant perturbation expansion we are using, the $\mathcal{M}_\mu$ should be a Lorentz
four-vector, and it is a function of two independent four-momenta $p$ and $p'$.

Fig. F.1: Schematic depiction of the QED vertex function that embodies the electron magnetic moment (external lines labelled p, p′ and q = p′ − p).

Thus, one may guess immediately that the most general form of the M 𝜇 could be

$$\mathcal{M}_\mu = \bar{u}(p')\left[A_1(q^2)\gamma_\mu + A_2(q^2)\,p_\mu + A_3(q^2)\,p'_\mu\right]u(p) . \tag{F.2}$$
As regards the invariant amplitudes (form factors) $A_j(q^2)$, $j = 1, 2, 3$, these might depend on $p^2$,
$p'^2$ and the scalar product $p \cdot p'$. But $p^2 = p'^2 = m^2$, so there is just one independent kinematical
invariant, e.g. $q^2 = (p' - p)^2 = 2m^2 - 2p \cdot p'$. A remark is perhaps in order here. When the loop
integrations inside the blob in Fig. F.1 are carried out, one gets some Lorentz invariant form
factors and the resulting Γ𝜇 ( 𝑝′, 𝑝) incorporates diverse products of 𝛾-matrices, which enter the
game either as 𝛾 𝜇 , or slashed combinations 𝑝/ and 𝑝/ ′. To simplify the products of 𝛾-matrices,
one can employ their basic anticommutation relations, which lead e.g. to 𝑝/ 𝛾 𝜇 = 2𝑝 𝜇 − 𝛾 𝜇 𝑝/ ,
𝑝/ 𝑝/ ′ = 2𝑝 · 𝑝′ − 𝑝/ ′ 𝑝/ , etc. In this way, one may eventually encounter just a finite number of
𝛾-matrix products, such as
𝑝/ 𝛾 𝜇 , 𝑝 𝜇 𝑝/ ′, 𝑝/ 𝛾 𝜇 𝑝/ ′, . . . (F.3)

The readers are encouraged to activate their imagination and try to find all possible relevant
𝛾-matrix products involved here. By the way, a detailed explicit evaluation of Γ𝜇 ( 𝑝′, 𝑝) at one-
loop level, sketched in Chapter 50, is quite instructive in this context. Higher-order diagrams can
of course produce long chains of 𝛾-matrices, but one can always employ the anticommutation
relations, and move the matrices inside the chain in such a way that one eventually gets ( 𝑝/ ) 2 =
𝑝 2 = 𝑚 2 and similarly for ( 𝑝/ ′) 2 . Moreover, one should put factors 𝑝/ and 𝑝/ ′ in the right order,
such that 𝑝/ ′ stands on the left and 𝑝/ on the right; one may then utilize Dirac equations u ( 𝑝′) 𝑝/ ′ =
𝑚 u ( 𝑝′) and 𝑝/ u ( 𝑝) = 𝑚 u ( 𝑝). A typical example of the above-mentioned manipulations is as
follows: one may get easily, upon appropriate anticommutations of 𝛾-matrices,

𝑝/ 𝛾 𝜇 𝑝/ ′ = 2𝑝 𝜇 𝑝/ ′ + 2𝑝′𝜇 𝑝/ − 2𝑝 · 𝑝′ 𝛾 𝜇 − 𝑝/ ′ 𝛾 𝜇 𝑝/ .

In such a way, one can justify the simple structure (F.2).
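The example quoted above is obtained in three steps, each of them being a single application of the anticommutation relation $\{\gamma_\mu, \gamma_\nu\} = 2g_{\mu\nu}$ (written with slashed momenta where appropriate):

```latex
\begin{aligned}
\slashed{p}\,\gamma_\mu\,\slashed{p}'
 &= \left(2p_\mu - \gamma_\mu\slashed{p}\right)\slashed{p}'
  = 2p_\mu\slashed{p}' - \gamma_\mu\left(2\,p\cdot p' - \slashed{p}'\slashed{p}\right) \\
 &= 2p_\mu\slashed{p}' - 2\,p\cdot p'\,\gamma_\mu + \left(2p'_\mu - \slashed{p}'\gamma_\mu\right)\slashed{p} \\
 &= 2p_\mu\slashed{p}' + 2p'_\mu\slashed{p} - 2\,p\cdot p'\,\gamma_\mu - \slashed{p}'\gamma_\mu\slashed{p} .
\end{aligned}
```

In the last term, $\slashed{p}'$ stands on the left and $\slashed{p}$ on the right, so that both can be replaced by $m$ when the expression is sandwiched between $\bar{u}(p')$ and $u(p)$.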


Well, after such a long explanatory comment, let us take the form (F.2) for granted and
proceed further. For our purpose, it is more convenient to recast the expression in terms of the
combinations 𝑝′𝜇 + 𝑝 𝜇 and 𝑝′𝜇 − 𝑝 𝜇 , so that we write Eq. (F.2) in an equivalent form

$$\mathcal{M}_\mu = \bar{u}(p')\left[A(q^2)\gamma_\mu + B(q^2)(p'_\mu + p_\mu) + C(q^2)(p'_\mu - p_\mu)\right]u(p) . \tag{F.4}$$


 

Now we may use the Ward–Takahashi (WT) identity (see Chapter 42, the formula (42.20)),
which in our present notation reads simply

𝑞 𝜇 M𝜇 = 0 . (F.5)

Using the decomposition (F.4) and the identities u ( 𝑝′) 𝑞/ u ( 𝑝) = 0, ( 𝑝′𝜇 + 𝑝 𝜇 )( 𝑝′ 𝜇 − 𝑝 𝜇 ) = 0, the
WT relation (F.5) is reduced to 𝑞 2𝐶 (𝑞 2 ) = 0, or,

𝐶 (𝑞 2 ) = 0 . (F.6)

Thus, the form (F.4) becomes

$$\mathcal{M}_\mu = \bar{u}(p')\left[A(q^2)\gamma_\mu + B(q^2)(p'_\mu + p_\mu)\right]u(p) , \tag{F.7}$$


 

and with the help of the Gordon identity (50.25) this can be immediately rewritten as a combi-
nation of terms involving 𝛾 𝜇 and 𝜎𝜇𝜈 𝑞 𝜈 . The formula (50.15) is thereby proven.
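The two steps used above may be written out explicitly as follows. The sketch relies only on the Dirac equations for $\bar{u}(p')$ and $u(p)$, on the mass-shell conditions, and on the Gordon identity in the form (C.28) with the momenta $p'$ and $p$ playing the roles of $p$ and $p'$ there. The precise identification of the resulting coefficients with $F_1(q^2)$ and $F_2(q^2)$ depends on the normalization adopted in (50.15), which is not reproduced here.

```latex
\begin{aligned}
q^\mu \mathcal{M}_\mu
 &= \bar{u}(p')\left[A\,\slashed{q} + B\,(p' + p)\cdot(p' - p) + C\,q^2\right]u(p) , \\
\bar{u}(p')\,\slashed{q}\,u(p)
 &= \bar{u}(p')\left(\slashed{p}' - \slashed{p}\right)u(p) = (m - m)\,\bar{u}(p')u(p) = 0 , \\
(p' + p)\cdot(p' - p) &= p'^2 - p^2 = 0
 \qquad\Longrightarrow\qquad q^2\,C(q^2)\,\bar{u}(p')u(p) = 0 ; \\[6pt]
\bar{u}(p')\,(p' + p)_\mu\,u(p)
 &= \bar{u}(p')\left[2m\,\gamma_\mu - i\sigma_{\mu\nu}q^\nu\right]u(p) , \\
\mathcal{M}_\mu
 &= \bar{u}(p')\left[\left(A + 2mB\right)\gamma_\mu - i B\,\sigma_{\mu\nu}q^\nu\right]u(p) .
\end{aligned}
```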
Finally, let us remark that we have demonstrated the validity of the WT identity at the
one-loop level, but in fact it is quite general, as a consequence of the gauge invariance of QED.
A detailed discussion of this topic can be found e.g. in the book [6], Chapter 8, section 8.4.1.
Thus, we may conclude that the form (50.15) is valid to any order of perturbation expansion.

Bibliography

Below I have listed some books that I was using occasionally when preparing my lectures during
the past years. Thus, the reader may find there more details concerning the topics discussed in
the present text. One item of the list is rather special, so it is worth mentioning explicitly; I have
in mind [23] and the reason is as follows. In my experience, the mathematically minded students
sometimes complain that the standard QFT methods are somewhat sloppy (though efficient) and
strive for due rigorousness. The remarkable book [23], written by a professional mathematician
(with deep respect for physics) responds, at least partly, to such needs. So, I have included it in
the list, for the reader’s convenience. Needless to say, apart from the limited set of textbooks and
monographs displayed here, there are many other good books covering the huge and fascinating
area of the quantum field theory. Further, the list of relevant literature continues with references
to original and review papers dealing with some particular themes treated in these lecture notes.
The selection of cited papers has been rather minimalistic; some of them reflect the history of
QFT and particle physics, while the other ones might (along with the comprehensive books)
arouse the reader’s interest and open new horizons for them in the rich QFT landscape. In particular,
the papers cited in ref. [56] contain a remarkable defence of the “conventional QFT” (i.e. the
approach pursued also in the present lecture notes) in contrast to a rigorous axiomatic theory,
which might be a “holy grail” for a mathematically minded reader. One more remark is perhaps
in order here. It is clear that the present text is oriented primarily to possible applications in
particle physics. In fact, the scope of quantum field theory is much broader. In particular, QFT
methods are highly efficient also in the condensed matter physics or nuclear physics, i.e. in the
theory of many-body systems in general. The reader interested in these aspects of quantum field
theory may find some relevant information e.g. in the book [10].

[1] J. Bjorken, S. Drell: Relativistic quantum mechanics, McGraw Hill Book Company,
New York 1964.
[2] J. Bjorken, S. Drell: Relativistic quantum fields, McGraw Hill Book Company, New York 1965.
[3] A. Messiah: Quantum mechanics, Dover Publications Inc., Mineola, New York 1999.
[4] P. Strange: Relativistic quantum mechanics, Cambridge Univ. Press, Cambridge 1998.
[5] W. Greiner: Relativistic quantum mechanics, Springer-Verlag, Berlin 1994.
[6] C. Itzykson, J. B. Zuber: Quantum field theory, McGraw Hill Book Company, New York 1980.
[7] V. B. Berestetskii, E. M. Lifshitz and L. P. Pitaevskii: Quantum electrodynamics (Landau & Lifshitz
course of theoretical physics, Vol. 4), Butterworth–Heinemann, Oxford 1999.
[8] S. Schweber: An introduction to relativistic quantum field theory, Dover Publications, Mineola,
New York 1989.
[9] S. Schweber: QED and the men who made it, Princeton University Press, Princeton 1994.
[10] S. J. Chang: Introduction to quantum field theory, World Scientific, Singapore 1990.
[11] A. Das: Lectures on quantum field theory, World Scientific, Singapore 2008.
[12] G. Sterman: An introduction to quantum field theory, Cambridge University Press,
Cambridge 1993.

[13] S. Weinberg: The quantum theory of fields I, Cambridge University Press 1995.
[14] M. Peskin, D. Schroeder: Introduction to quantum field theory, Westview Press 1995.
[15] W. Greiner, J. Reinhardt: Quantum electrodynamics, Springer-Verlag, Berlin 1994.
[16] N. N. Bogoliubov, D. V. Shirkov: Quantum fields, Addison-Wesley, Boston 1982.
[17] N. Nakanishi, I. Ojima: Covariant operator formalism of gauge theories and quantum gravity,
World Scientific, Singapore 1990.
[18] T. D. Lee: Particle physics and introduction to field theory, Harwood Academic Publishers,
London 1988.
[19] L. Ryder: Quantum field theory, Cambridge University Press 1996.
[20] I. Duck, E. C. G. Sudarshan: Pauli and the spin-statistics theorem, World Scientific,
Singapore 1997.
[21] R. Streater, A. Wightman: PCT, spin and statistics, and all that, Princeton University Press,
Princeton 1980.
[22] J. Hořejší: Fundamentals of electroweak theory, Karolinum Press, Prague 2003;
P. Langacker: The standard model and beyond, 2nd edition, CRC Press, Boca Raton 2017.
[23] G. Folland: Quantum field theory: A tourist guide for mathematicians, American Mathematical
Society, Providence, Rhode Island 2008.
[24] B. Thaller: The Dirac equation, Springer-Verlag, Berlin 1992.
[25] J. Collins: Renormalization, Cambridge University Press, Cambridge 1984.
[26] J. Formánek: Úvod do relativistické kvantové mechaniky a kvantové teorie pole I (in Czech),
Karolinum, Praha 2004.
[27] M. Schwartz: Quantum field theory and the standard model, Cambridge University Press,
Cambridge 2014.
[28] L. M. Brown (editor): Renormalization: From Lorentz to Landau (and beyond), Springer-Verlag,
Berlin 1993.
[29] L. Álvarez-Gaumé, M. Vázquez-Mozo: An invitation to quantum field theory, Springer-Verlag,
Berlin 2012.
[30] E. Manoukian: Renormalization, Academic Press, New York 1983.
[31] K. Huang: Quarks, leptons & gauge fields, 2nd edition, World Scientific, Singapore 1992.
[32] P. A. M. Dirac: The quantum theory of the electron, Proc. Roy. Soc. London A 117 (1928) 610.
[33] P. A. M. Dirac: A theory of electrons and protons, Proc. Roy. Soc. London A 126 (1930) 360.
[34] S. Watanabe: Chirality of 𝐾 particle, Phys. Rev. 106 (1957) 1306.
[35] R. P. Feynman, M. Gell-Mann: Theory of Fermi interaction, Phys. Rev. 109 (1958) 193.
[36] N. Dombey, A. Calogeracos: Seventy years of the Klein paradox, Phys. Reports 315 (1999) 41.
[37] P. Krekora et al.: Klein paradox in spatial and temporal resolution,
Phys. Rev. Lett. 92 (2004) 040406.
[38] L. Lamata et al.: Dirac equation and quantum relativistic effects in a single trapped ion,
Phys. Rev. Lett. 98 (2007) 253005;
R. Gerritsma et al.: Quantum simulation of the Dirac equation, Nature 463 (2010) 68;
Zhi-Yong Wang, Cai-Dong Xiong: Zitterbewegung by quantum field-theory considerations,
Phys. Rev. A 77 (2008) 045402;
P. Krekora et al.: Relativistic electron localization and the lack of Zitterbewegung,
Phys. Rev. Lett. 93 (2004) 043004.
[39] W. Dittrich: On the Pauli–Weisskopf anti-Dirac paper, Eur. Phys. J. H 40 (2015) 261.
[40] F. J. Dyson: The radiation theories of Tomonaga, Schwinger, and Feynman,
Phys. Rev. 75 (1949) 486.
[41] G. 't Hooft, M. J. G. Veltman: Regularization and renormalization of gauge fields,
Nucl. Phys. B 44 (1972) 189.
[42] C. G. Bollini, J. J. Giambiagi: Dimensional renormalization: The number of dimensions as a
regularizing parameter, Nuovo Cim. B 12 (1972) 20.

[43] W. Pauli, F. Villars: On the invariant regularization in relativistic quantum theory,
Rev. Mod. Phys. 21 (1949) 434.
[44] J. Hořejší, J. Novotný, O. I. Zavialov: Dimensional regularization in four dimensions,
Czech. J. Phys. B 39 (1989); Dimensional regularization of the VVA triangle graph as a continuous
superposition of Pauli–Villars regularizations, Phys. Lett. B 213 (1988) 173.
[45] J. C. Ward: An identity in quantum electrodynamics, Phys. Rev. 78 (1950) 182.
[46] Y. Takahashi: On the generalized Ward identity, Nuovo Cimento 6 (1957) 371.
[47] F. J. Dyson: The 𝑆-matrix in quantum electrodynamics, Phys. Rev. 75 (1949) 1736.
[48] R. Serber: A note on positron theory and proper energies, Phys. Rev. 49 (1936) 545.
[49] E. A. Uehling: Polarization effects in the positron theory, Phys. Rev. 48 (1935) 55.
[50] G. Dunne: Heisenberg-Euler effective Lagrangians: basics and extensions, arXiv:hep-th/0406216;
The Heisenberg–Euler effective action: 75 years on, Int. J. Mod. Phys. A 27 (2012) 1260004.
[51] F. Přeučil, J. Hořejší: Effective Euler–Heisenberg Lagrangians in models of QED,
J. Phys. G 45 (2018) 085005.
[52] M. Aaboud et al. (ATLAS Collaboration): Evidence for light-by-light scattering in heavy ion
collisions with the ATLAS detector at the LHC, Nature Phys. 13 (2017) 852.
[53] J. Schwinger: On quantum electrodynamics and the magnetic moment of the electron,
Phys. Rev. 73 (1948) 416.
[54] A. S. Blum: The state is not abolished, it withers away: How quantum field theory became a theory
of scattering, Stud. Hist. Phil. Sci. B 60 (2017) 46.
[55] B. Delamotte: A hint of renormalization, Am. J. Phys. 72 (2004) 170.
[56] D. Wallace: In defence of naiveté: The conceptual status of Lagrangian quantum field theory,
Synthese 151 (2006) 33;
Taking particle physics seriously: A critique of the algebraic approach to quantum field theory,
Stud. Hist. Phil. Sci. B 42 (2011) 116.

329
Index

angular momentum, 9, 13–14, 84
annihilation operators, 114, 123, 197
anticommutation relations, 123, 319
antilinear transformation, 32
antiparticles, 37, 86, 116, 119, 125
antiunitary transformation, 34
asymptotic freedom, 281
averaging over polarizations, 154
axial vector, 32, 104, 155

Balmer formula, 84
bare coupling, 268, 277, 292
bare field, 274
bare mass, 270, 272, 281
Bhabha scattering, 214
bilinear forms, 29, 31, 34
bispinor, 26–27, 29, 38, 91, 99
Bohr magneton, 16, 297
Bohr radius, 71
boost, 27, 43, 306
Bose symmetrization, 256
Bose–Einstein statistics, 115
bosons, 86, 115
box diagram, 253, 294
Breit–Wigner form, 185
bremsstrahlung, 219

calculable quantities, 293, 296
canonical commutation relations, 106
canonical quantization, 105, 126, 186
canonical variables, 105, 126
Casimir’s trick, 152
chain identities for Dirac matrices, 313, 316
charge conjugation, 33
charge renormalization, 277
chiral representation of Dirac matrices, 22, 311
chiral symmetry, 104
chirality, 50, 55
chronological ordering, 135
chronological product, 171, 176
classical fields, 88–104
Clifford algebra, 18
completeness relation, 41, 82, 188
Compton scattering, 200, 246, 283
Compton wavelength, 7, 68, 70, 202
conjugate momentum, 105
conserved charge, 119, 124
conserved currents, 103, 118, 124, 185
continuity equation, 8, 12, 28, 93, 100
contraction, 207, 208, 213, 217, 253
conversion constant, 4, 157
counterterms, 272–277
coupling constant, 131, 140, 181
covariant derivative, 132
covariant gauges, 324
𝐶𝑃 invariance, 61
𝐶𝑃𝑇 theorem, 61
creation operators, 114, 117, 123, 125
cross section, 147, 200
cusp singularity, 288, 296

Darwin–Gordon formula, 84
decay of scalar boson, 151
decay of vector boson, 154
decay rate, 144
decay width, 147
dimensional regularization, 229, 233, 242
Dirac conjugation, 29
Dirac equation, 11, 17
Dirac field, 91, 121
Dirac matrices, 18, 19, 313
Dirac picture, 133
Dirac sea, 86
discrete symmetries, 30
dispersion relation, 289–291

330
Dyson expansion, series, 135, 137, 179

effective charge, 292
effective Lagrangian, 293, 295
electrodynamics of 𝑊 bosons, 295
electromagnetic field, 15, 159, 186
electromagnetic form factors, 325
electron magnetic moment, 16, 85, 298, 302–303
electron self-energy, 244, 269, 273
electron spin, 9, 13–14
electron–positron annihilation, 158, 181, 184
energy–momentum tensor, 97, 104, 109, 118
equal-time commutation relations, 105, 127, 191
Euclidean space, 188, 240
Euler–Heisenberg Lagrangian, 295–296, 329
Euler–Lagrange equations, 89, 96
Euler–Mascheroni constant, 234
evolution equation, 133
evolution operator, 134

Fermi constant, 155, 157, 294
Fermi trick, 190, 322
Fermi–Dirac statistics, 125
fermionic loops, 255, 258
fermions, 86, 115, 125, 142, 170
Feynman diagrams, 137, 163, 168, 210, 214, 218, 223
Feynman gauge, 324
Feynman parametrization, 229, 248, 259
Feynman propagator, 164, 171, 176, 190, 199, 324
Feynman rules, 141–142, 154, 173, 184, 202, 291
field angular momentum, 101, 125, 319
field interactions, 131
field renormalization, 274, 276, 278
field strength, 85, 99, 186, 297
fine structure, 71
fine-structure constant, 71, 161, 183
Fock space, 115, 125, 137
form-invariance, 95
formulae 32, 156, 182, 317
four-current, 29
four-fermion interaction, 140, 266
four-momentum, 8, 114, 165, 225, 283
fundamental theorem on 𝛾-matrices, 22, 35, 318
Furry’s theorem, 258, 261, 279

𝑔-factor, 16
𝛾5, 4, 18–19, 43, 55, 242, 313
gamma matrices, 18, 313
gauge fixing, 191, 324
gauge invariance, 15, 85, 99, 132, 228, 253, 266, 294
generators of Lorentz transformations, 24, 59, 100, 307, 311
generators of rotations, 81, 308
global symmetry, 132
Gordon identity, 64, 122, 300, 317
Green’s function, 167, 322
Gupta–Bleuler method, 194
gyromagnetic ratio, 16

Hamiltonian, 10, 107, 108, 133, 177
harmonic oscillator, 93, 112, 115
Heisenberg picture, 64, 67, 105, 110, 133
helicity, 50–56, 152–153
Higgs boson, 115, 132, 137, 152, 163, 281
hole theory, 86

imaginary part of closed loop, 286
improper Lorentz transformations, 306
indefinite metric, 197
index of interaction vertex, 263
index of UV divergence, 261
infrared divergence, 244
interaction picture, 133
internal symmetry, 33, 95

Källén’s function, 146
Klein paradox, 72
Klein–Gordon equation, 7, 12, 78, 90
Klein–Gordon field, 89, 105, 114, 116
Klein–Nishina formula, 200, 204

Lagrangian density, 89
Lamb shift, 85
Landau gauge, 324
left-handed particle, 50
Legendre transformation, 93, 97
Levi-Civita symbol, 5, 306, 316
Lie algebra, 25, 81, 307
Lie group, 307

331
light-by-light scattering, 293
local U(1) symmetry, 132
longitudinal polarization, 81
loop diagrams, 217, 221
loop momentum, 223, 249, 263, 301
Lorentz algebra, 100, 309
Lorentz transformations, 28, 305
Lorentz-invariant phase space, 145
Lorenz condition, 3, 190, 195
Lorenz-like constraint, 126

Majorana representation of Dirac matrices, 22, 34
Mandelstam invariants, 148, 156, 182
mass dimension, 264, 277, 294
mass renormalization, 270, 273
mass shell, 115, 325
matching condition, 293
Maxwell field, 98, 104, 186
metric tensor, 4, 20, 306
minimal electromagnetic interaction, 15
minimal subtraction, 283
Minkowski space, 18, 82, 188, 305
momentum cut-off, 229, 240
Mott formula, 159, 161
Møller scattering, 214

negative energy solutions, 14, 37, 68, 86–87
negative norm squared, 193
neutrino, 62, 140, 156
Noether’s identity, 97
Noether’s theorem, 95
non-Abelian gauge theory, 296
non-covariant photon propagator, 190
non-renormalizable theory, 277
normal ordering, 113
normal product, 207

occupation numbers, 114
on-shell scheme, 283
one-particle-irreducible Feynman diagram, 261
optical theorem, 288
orbital angular momentum, 13
orthochronous Lorentz transformations, 306
oscillator decomposition, 115

palindromic relation, 36, 314
parity, 30, 60
parity violation, 31, 62
particle interpretation, 111, 122–124, 127, 189
Pauli equation, 16
Pauli exclusion principle, 86, 125
Pauli matrices, 11, 13, 16, 57, 60
Pauli term, 85, 265, 277
Pauli–Jordan function, 196
Pauli–Villars regularization, 237
photon propagator, 190, 199, 322
photon–photon scattering, 256, 261, 281, 296
Planck constant, 4
plane waves, 6, 38
Poincaré invariance, 104
polarization sum, 82, 175, 185, 188
polarization vector, 79, 187
positron, 86, 139, 158, 181, 184
power counting, 264
probability current, 8, 12, 28
probability density, 6, 9, 12, 63
Proca equation, 79, 91, 126
Proca field, 90, 126, 154, 174
propagator of Dirac field, 170, 172, 223
propagator of massive vector field, 174, 177
propagator of scalar field, 163
proper Lorentz transformations, 306
pseudoscalar, 31
pseudovector, 32, 306

QED with massive photon, 181, 265
quantized field, 107, 110, 111, 115, 163
quantum chromodynamics, 281
quantum electrodynamics, 1, 16, 174, 181, 211, 302
quasi-real virtual photons, 296

radiation gauge, 187, 197
radiative corrections, 278, 284
relativistic covariance, 9, 23, 58, 78
renormalization, 225, 267, 271–277
renormalization constant, 275
renormalization counterterms, 272–277
representations of Lorentz group, 25, 309
right-handed particle, 52
rotations, 28, 81, 99

𝑆-matrix, 1, 136, 139, 143, 163, 212, 216
𝑆-matrix unitarity, 288

332
sandwich relations for Dirac matrices, 313, 316
scalar photon, 197
scalar QED, 266, 279–281, 295
scattering in Coulomb field, 159
Schouten identity, 315
Schrödinger equation, 6
Schrödinger picture, 65, 133
Schwinger correction, 248, 252, 297, 302, 304
screening of bare charge, 292
self-interaction, 132
six-dimensional spacetime, 281
slash notation, 39, 85
space reflection, 30
space-like vector, 46, 82, 127
spacetime translation, 97, 165
spatial inversion, 30
spin and statistics, 115, 328
spin four-vector, 45, 52, 81, 125
spinor QED, 280, 296
standard model, 132, 140, 155, 158, 163, 185, 266, 296
standard representation of Dirac matrices, 11, 22, 33–36, 39, 318
strong interactions, 281, 296
superficial degree of divergence, 262
symmetric integration, 232, 249, 259

T-product, 135, 164, 171
tadpole, 253
Thomson scattering, 205
time evolution, 13, 63, 67, 134
time ordering, 135, 211
time reversal, 32, 61, 306
trace identities for Dirac matrices, 314
trace technique, 152, 156, 182
tree diagrams, 216

Uehling correction, 290
ultraviolet (UV) divergence, 225, 228, 235, 247, 261
unpolarized particles, 146, 182

𝑉 − 𝐴 theory, 155
vacuum, 86, 114, 119, 164, 197
vacuum expectation value, 164, 207, 228
vacuum polarization, 226, 237, 267, 285
vector current, 131, 155, 237
velocity, 27, 44, 51
velocity operator, 65
vertex correction, 248, 275, 280
virtual electron–positron pair, 292
virtual particle, 168, 284, 293

Ward identity, 228, 250, 253, 276
Ward–Takahashi (WT) identity, 251, 326
wave function, 6, 8, 12, 23, 27, 58, 76, 78
wave packet, 63, 65, 70
weak interaction, 140, 155, 158, 266, 293
Weyl equation, 57, 62, 310
Weyl spinors, 311
Wick rotation, 235, 240
Wick’s theorems, 207, 209, 211

Yang–Mills field, 281
Yukawa interaction (coupling), 136, 151, 288

zero-norm state, 198
Zitterbewegung, 63, 67

333
