Advanced Quantum Mechanics
Peter S. Riseborough
June 4, 2009
Contents
1 Introduction
3 Maxwell’s Equations
3.1 Vector and Scalar Potentials
3.2 Gauge Invariance
7 The Electromagnetic Lagrangian
7.1 Conservation Laws for Electromagnetic Fields
7.2 Massive Spin-One Particles
11 The Dirac Equation
11.1 Conservation of Probability
11.2 Covariant Form of the Dirac Equation
11.3 The Field Free Solution
11.4 Coupling to Fields
11.4.1 Mott Scattering
11.4.2 Maxwell’s Equations
11.4.3 The Gordon Decomposition
11.5 Lorentz Covariance of the Dirac Equation
11.5.1 The Space of the Anti-commuting $\gamma^\mu$-Matrices
11.5.2 Polarization in Mott Scattering
11.6 The Non-Relativistic Limit
11.7 Conservation of Angular Momentum
11.8 Conservation of Parity
11.9 Bi-linear Covariants
11.10 The Spherically Symmetric Dirac Equation
11.10.1 The Hydrogen Atom
11.10.2 Lowest-Order Radial Wavefunctions
11.10.3 The Relativistic Corrections for Hydrogen
11.10.4 The Kinematic Correction
11.10.5 Spin-Orbit Coupling
11.10.6 The Darwin Term
11.10.7 The Fine Structure of Hydrogen
11.10.8 A Particle in a Spherical Square Well
11.10.9 The MIT Bag Model
11.10.10 The Temple Meson Model
11.11 Scattering by a Spherically Symmetric Potential
11.11.1 Polarization in Coulomb Scattering
11.11.2 Partial Wave Analysis
11.12 An Electron in a Uniform Magnetic Field
11.13 Motion of an Electron in a Classical Electromagnetic Field
11.14 The Limit of Zero Mass
11.15 Classical Dirac Field Theory
11.15.1 Chiral Gauge Symmetry
11.16 Hole Theory
11.16.1 Compton Scattering
11.16.2 Charge Conjugation
12.4 The Connection between Spin and Statistics
1 Introduction
Non-relativistic mechanics yields a reasonable approximate description of
physical phenomena in the range where the particles' kinetic energies are small
compared with their rest-mass energies. However, it should be noted that the
expression for the relativistic invariant mass
\[ E^2 - p^2 c^2 = m^2 c^4 \tag{1} \]
implies that the dispersion relation has two branches
\[ E = \pm\, c \sqrt{ p^2 + m^2 c^2 } \tag{2} \]
Figure 1: The positive and negative energy branches E(p) for a relativistic
particle with rest mass m, plotted against pc. The minimum separation between
the positive-energy branch and the negative-energy branch is $2mc^2$.
can only change continuously, it is impossible that a particle with positive energy
can make a transition from the positive to negative-energy states. However, in
quantum mechanics, particles can make discontinuous transitions. Therefore, it
is necessary to consider both the positive and negative-energy branches. These
considerations naturally lead one to the concept of particles and anti-particles,
and also to the realization that one must consider multi-particle quantum me-
chanics or field theory.
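As an illustrative aside, the two branches of equation (2) can be evaluated numerically. The following is a minimal Python sketch, not part of the original notes: the function name `energy_branches` and the electron-mass value are assumptions chosen for illustration. It evaluates both branches at p = 0 and shows that they are separated by $2mc^2$.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def energy_branches(p, m, c=C):
    """Return (E_plus, E_minus) from E = +/- c * sqrt(p^2 + m^2 c^2)."""
    root = c * math.sqrt(p * p + (m * c) ** 2)
    return root, -root


# Illustrative value only: approximate electron rest mass in kg.
m_e = 9.109e-31

e_plus, e_minus = energy_branches(0.0, m_e)
gap = e_plus - e_minus  # separation of the branches at p = 0

print("gap / (m c^2) =", gap / (m_e * C ** 2))
```

For any non-zero momentum the separation between the branches is larger, so p = 0 gives the minimum gap.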
perturbation theory does not converge. In fact, straightforward perturbation
theory is plagued by infinities. However, physics is a discipline which is aimed
at uncovering the relationships between measured quantities. The quantities
e and m which occur in quantum electrodynamics are theoretical constructs
which, respectively, describe the bare charge of the electron and bare mass of
the electron. This means one is assuming that e and m would be the results
of measurements on a (fictional) electron which does not interact electromag-
netically. That is, e and m are not physically measurable and their values are
therefore unknown. What can be measured experimentally are the renormal-
ized mass and the renormalized charge of the electron. The divergences found in
quantum electrodynamics can be shown to cancel or drop out, when one relates
different physically measurable quantities, as only the renormalized masses and
energies enter the theory. Despite the existence of infinities, quantum electrody-
namics is an extremely accurate theory. Experimentally determined quantities
can be predicted to an extremely high degree of precision.
A spin-zero particle has just one state and is uniquely described by a one-
component field ψ.
A spin one-half particle has two independent states corresponding to the two
allowed values of the z-component of the intrinsic angular momentum
$S^z = \pm \frac{\hbar}{2}$. The wave function ψ of a spin one-half particle is
a spinor which has two independent components
\[ \psi(r, t) = \begin{pmatrix} \psi^{(1)}(r, t) \\ \psi^{(2)}(r, t) \end{pmatrix} \tag{4} \]
These two components can be used to represent two independent basis states.
We conjecture that since a particle with intrinsic spin S has (2S + 1) inde-
pendent basis states, then the wave function should have (2S + 1) independent
components.
or equivalently
The above equation can be used to determine $\psi'(r)$ by using the substitution
$r \to \hat R^{-1} r$, so
\[ \psi'(r) = \psi( \hat R^{-1} r ) \tag{8} \]
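[Figure: the rotation of the vector r about the axis ê through the angle ϕ,
showing the vectors ê ∧ r and ê ∧ (ê ∧ r).]
If ê is a unit vector along the axis of rotation, the rotation of r through an
infinitesimal angle δϕ is given by
\[ \hat R\, r = r + \delta\varphi\; \hat e \wedge r + \ldots \tag{9} \]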
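where terms of order $\delta\varphi^2$ have been neglected. Hence, under an
infinitesimal rotation, the transformation of a scalar wave function can be
found from the Taylor expansion
\[
\begin{aligned}
\psi'(r) &= \psi( r - \delta\varphi\; \hat e \wedge r ) \\
&= \psi(r) - \delta\varphi\, ( \hat e \wedge r ) \cdot \nabla \psi(r) + \ldots \\
&= \psi(r) - \delta\varphi\, ( r \wedge \nabla ) \cdot \hat e\; \psi(r) + \ldots \\
&= \psi(r) - \frac{i\, \delta\varphi}{\hbar}\, ( \hat e \cdot \hat L )\, \psi(r) + \ldots \\
&= \exp\left[ - \frac{i\, \delta\varphi}{\hbar}\, ( \hat e \cdot \hat L ) \right] \psi(r)
\end{aligned}
\tag{10}
\]
where the operator L̂ has been defined as
\[ \hat L = - i \hbar\; r \wedge \nabla \tag{11} \]
[Figure: a scalar field ψ(r) and the rotated field $\psi(\hat R^{-1} r)$, with
$r' = \hat R^{-1} r$, drawn against the x- and y-axes.]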
Therefore, locally, rotations of the scalar field are generated by the orbital an-
gular momentum operator L̂.
Figure 4: The effect of a rotation R̂ on a vector field ψ(r). The rotation affects
both the magnitude and direction of the vector.
Since the operation R̂ is a rotation, it also rotates a vector field ψ(r). Not
only does the rotation transfer the magnitude of ψ(r) to the new point $r'$, but
it must also rotate the transferred vector so that $\psi'(r')$ has the same
direction as R̂ ψ(r). That is
\[ \psi'(r') = \hat R\, \psi(r) \tag{12} \]
or equivalently
The part of the rotational operator designated by R̂ does not affect the posi-
tional coordinates (r) of the vector field, and so can be found by considering
the rotation of the vector field ψ at the origin
\[ \hat R\, \psi = \left( \hat I + \delta\varphi\; \hat e\, \wedge \right) \psi \tag{15} \]
That is, the operator R̂ only produces a mixing of the components of ψ. Hence,
the complete rotational transformation of the vector field can be represented as
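\[
\begin{aligned}
\psi'(r) &= \hat R\, \psi( r - \delta\varphi\; \hat e \wedge r ) \\
&= \psi( r - \delta\varphi\; \hat e \wedge r ) + \delta\varphi\; \hat e \wedge \psi( r - \delta\varphi\; \hat e \wedge r ) \\
&= \psi( r - \delta\varphi\; \hat e \wedge r ) + \delta\varphi\; \hat e \wedge \psi(r) + \ldots \\
&= \psi(r) - \frac{i\, \delta\varphi}{\hbar}\, ( \hat e \cdot \hat L )\, \psi(r) + \delta\varphi\; \hat e \wedge \psi(r) + \ldots \\
&= \psi(r) - \frac{i\, \delta\varphi}{\hbar}\, ( \hat e \cdot \hat L )\, \psi(r) - \frac{i\, \delta\varphi}{\hbar}\, ( \hat e \cdot \hat S )\, \psi(r)
\end{aligned}
\tag{16}
\]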
where the terms of order $(\delta\varphi)^2$ have been neglected and a vector
operator Ŝ has been introduced. The operator Ŝ only admixes the components of
$\psi^\mu$, unlike L̂ which only acts on the r dependence of the components. The
components of the three-dimensional vector operator Ŝ are expressed as 3 × 3
matrices$^1$, with matrix elements
\[ ( \hat S^{(i)} )_{j,k} = - i \hbar\, \xi^{i,j,k} \tag{18} \]
where $\xi^{i,j,k}$ is the antisymmetric Levi-Civita symbol. Specifically, the
antisymmetric matrices are given by
\[ \hat S^{(1)} = \hbar \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix} \tag{19} \]
and by
\[ \hat S^{(2)} = \hbar \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix} \tag{20} \]
and finally by
\[ \hat S^{(3)} = \hbar \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{21} \]
By using a unitary transform, these operators can be transformed into the stan-
dard representation of spin-one operators where S (3) is chosen to be diagonal.
It is easily shown that the components of the matrix operators L̂ and Ŝ satisfy
the same type of commutation relations
\[ [\, \hat L^{(i)} , \hat L^{(j)} \,] = i \hbar\, \xi^{i,j,k}\, \hat L^{(k)} \tag{22} \]
and
\[ [\, \hat S^{(i)} , \hat S^{(j)} \,] = i \hbar\, \xi^{i,j,k}\, \hat S^{(k)} \tag{23} \]
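The commutation relations (23) can be checked directly from the matrix elements (18). The following is a minimal Python sketch, not part of the original notes: it assumes units in which ħ = 1, labels the components 0, 1, 2 instead of 1, 2, 3, and uses hypothetical helper names (`levi_civita`, `spin_matrix`, `satisfies_algebra`). As a by-product it also sums the squares of the three matrices.

```python
HBAR = 1.0  # assumption: units with hbar = 1


def levi_civita(i, j, k):
    """Antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def spin_matrix(i):
    """Matrix with elements (S^(i))_{j,k} = -i hbar xi_{i,j,k}, as in (18)."""
    return [[-1j * HBAR * levi_civita(i, j, k) for k in range(3)]
            for j in range(3)]


def matmul(a, b):
    return [[sum(a[r][n] * b[n][c] for n in range(3)) for c in range(3)]
            for r in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]


S = [spin_matrix(i) for i in range(3)]


def satisfies_algebra(tol=1e-12):
    """Check [S_i, S_j] = i hbar xi_{i,j,k} S_k for every pair (i, j)."""
    for i in range(3):
        for j in range(3):
            lhs = commutator(S[i], S[j])
            for r in range(3):
                for c in range(3):
                    rhs = sum(1j * HBAR * levi_civita(i, j, k) * S[k][r][c]
                              for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True


# Sum of the squares of the three components.
squares = [matmul(s, s) for s in S]
S_squared = [[sum(sq[r][c] for sq in squares) for c in range(3)]
             for r in range(3)]

print(satisfies_algebra())
print(S_squared)
```

Working with plain nested lists keeps the sketch free of external dependencies; a linear-algebra library would express the same check more compactly.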
The above set of operators form a Lie algebra associated with the corresponding
Lie group of continuous rotations. Thus, it is natural to identify these operators
which arise in the analysis of transformations in classical physics with the an-
gular momentum operators of quantum mechanics. In terms of these operators,
the infinitesimal transformation has the form
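\[ \psi'(r) \approx \psi(r) - \frac{i\, \delta\varphi}{\hbar}\; \hat e \cdot ( \hat L + \hat S )\, \psi(r) + \ldots \tag{24} \]
$^1$ The symbol $( \hat S )_{j,k}$ (17) denotes the element of Ŝ in the j-th row
and k-th column.
or
\[ \psi'(r) = \exp\left[ - \frac{i\, \delta\varphi}{\hbar}\; \hat e \cdot ( \hat L + \hat S ) \right] \psi(r) \tag{25} \]
Thus, the transformation is locally accomplished by
\[ \psi'(r) = \exp\left[ - \frac{i\, \delta\varphi}{\hbar}\, ( \hat e \cdot \hat J ) \right] \psi(r) \tag{26} \]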
where
Ĵ = L̂ + Ŝ (27)
is the total angular momentum. The operator Ŝ is the intrinsic angular momentum
of the vector field ψ. The magnitude of S is found from
which is evaluated as
\[ \hat S^2 = 2 \hbar^2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{29} \]
which is the Casimir operator. It is seen that a vector field has intrinsic
angular momentum, with a magnitude given by the eigenvalue of $\hat S^2$ which is
\[ E^2 - p^2 c^2 = 0 \tag{31} \]
One finds that a real scalar wave function ψ(r, t) satisfies the wave equation
\[ \left[ \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \right] \psi = 0 \tag{33} \]
since ħ drops out. This is not a very useful result, since it is a second-order
differential equation in time, and the solution of a second-order differential
equation can only be determined if two initial conditions are given. Usually,
the initial conditions are given by
\[
\begin{aligned}
\psi(r, 0) &= f(r) \\
\left. \frac{\partial \psi}{\partial t} \right|_{t=0} &= g(r)
\end{aligned}
\tag{34}
\]
In quantum mechanics, measurements disturb the state of the system and so it
becomes difficult to design two independent measurements which can uniquely
specify two initial conditions for one state. Hence, one has reached an impasse.
Due to this difficulty and since there are no known examples of massless spinless
particles found in nature, this theory is not very useful.
We shall try to factorize the wave equation for the vector E into two
first-order differential equations, each of which requires one boundary condition.
This requires one to specify six quantities. Therefore, one needs to postulate
the existence of two independently measurable fields, E and B. Each of these
fields should satisfy the two wave equations
\[ \left[ \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \right] E = 0 \tag{35} \]
and
\[ \left[ \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \right] B = 0 \tag{36} \]
\[ a = d = 0 \tag{38} \]
and
\[ b = - e \tag{39} \]
\[ E \to E , \qquad B \to - B \tag{40} \]
under time-reversal invariance2 . On taking the time derivative of the first equa-
tion, one obtains
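\[ - \hbar^2 \frac{\partial^2 E}{\partial t^2} = - c^2 b^2\; \hat p \wedge ( \hat p \wedge E ) \tag{41} \]
\[ - \hbar^2 \frac{\partial^2 B}{\partial t^2} = - c^2 b^2\; \hat p \wedge ( \hat p \wedge B ) \tag{42} \]
\[ - \hbar^2 \frac{\partial^2 E}{\partial t^2} = - c^2 b^2 \left[ - \hat p^2\, E + \hat p\, ( \hat p \cdot E ) \right] \tag{43} \]
and
\[ - \hbar^2 \frac{\partial^2 B}{\partial t^2} = - c^2 b^2 \left[ - \hat p^2\, B + \hat p\, ( \hat p \cdot B ) \right] \tag{44} \]
\[ \frac{\partial^2 E}{\partial t^2} = - c^2 b^2 \left[ - \nabla^2 E + \nabla ( \nabla \cdot E ) \right] \tag{45} \]
and
\[ \frac{\partial^2 B}{\partial t^2} = - c^2 b^2 \left[ - \nabla^2 B + \nabla ( \nabla \cdot B ) \right] \tag{46} \]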
so h̄ drops out. To reduce these equations to the form of wave equations, one
needs to impose the conditions
∇.E = 0 (47)
and
∇.B = 0 (48)
$^2$ For the non-relativistic Schrödinger equation, time-reversal invariance
implies that $t \to t' = -t$ and $\psi \to \psi' = \psi^*$.
On identifying the coefficients with those of the wave equation, one requires
that
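\[ b^2 = 1 \tag{49} \]
Thus, one has arrived at the set of the source-free Maxwell’s equations
\[
\begin{aligned}
\frac{1}{c} \frac{\partial E}{\partial t} &= \nabla \wedge B \\
- \frac{1}{c} \frac{\partial B}{\partial t} &= \nabla \wedge E \\
\nabla \cdot E &= 0 \\
\nabla \cdot B &= 0
\end{aligned}
\tag{50}
\]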
3 Maxwell’s Equations
Classical Field Theories describe systems in which a very large number of par-
ticles are present. Measurements on systems containing very large numbers of
particles are expected to result in average values, with only very small devia-
tions. Hence, we expect that the subtleties of quantum measurements should be
completely absent in systems that can be described as quantum fields. Classical
Electromagnetism is an example of such a quantum field, in which an infinitely
large number of photons are present.
The field equations ensure that the sources j and ρ satisfy a continuity equation.
Taking the divergence of the first equation and combining it with the time
derivative of the third, one obtains
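\[
\begin{aligned}
\nabla \cdot ( \nabla \wedge B ) - \frac{1}{c} \frac{\partial}{\partial t} ( \nabla \cdot E ) &= \frac{4\pi}{c}\, \nabla \cdot j \\
- \frac{1}{c} \frac{\partial}{\partial t} ( \nabla \cdot E ) &= \frac{4\pi}{c}\, \nabla \cdot j \\
- \frac{4\pi}{c} \frac{\partial \rho}{\partial t} &= \frac{4\pi}{c}\, \nabla \cdot j
\end{aligned}
\tag{52}
\]
Hence, one has derived the continuity equation
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot j = 0 \tag{53} \]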
which shows that charge is conserved.
One can solve the two source-free Maxwell equations, by expressing the
electric E and magnetic fields B in terms of the vector A and scalar φ potentials,
via
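\[ E = - \frac{1}{c} \frac{\partial A}{\partial t} - \nabla \phi \tag{54} \]
and
\[ B = \nabla \wedge A \tag{55} \]
The expressions for B and E automatically satisfy the two source-free Maxwell’s
equations. This can be seen by examining
\[ \nabla \wedge E + \frac{1}{c} \frac{\partial B}{\partial t} = 0 \tag{56} \]
which, on substituting the expressions for the electromagnetic fields in terms of
the vector and scalar potentials, becomes
\[ \nabla \wedge \left( - \frac{1}{c} \frac{\partial A}{\partial t} - \nabla \phi \right) + \frac{1}{c} \frac{\partial}{\partial t} ( \nabla \wedge A ) = 0 \tag{57} \]
which is automatically satisfied since
\[ \nabla \wedge \nabla \phi = 0 \tag{58} \]
and the terms involving A cancel since A is analytic. The remaining source-free
Maxwell equation is satisfied, since it has the form
\[ \nabla \cdot B = 0 \tag{59} \]
which reduces to
\[ \nabla \cdot ( \nabla \wedge A ) = 0 \tag{60} \]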
Therefore, the six components of E and B have been replaced by the four
quantities A and φ. These four quantities are determined by the Maxwell equa-
tions which involve the sources, which are four in number.
The fields are governed by the set of non-trivial equations which relate A and
φ to the sources j and ρ. When expressed in terms of A and φ, the remaining
non-trivial Maxwell equations become
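\[
\begin{aligned}
\nabla \wedge ( \nabla \wedge A ) + \frac{1}{c} \frac{\partial}{\partial t} \left( \frac{1}{c} \frac{\partial A}{\partial t} + \nabla \phi \right) &= \frac{4\pi}{c}\, j \\
- \nabla \cdot \left( \frac{1}{c} \frac{\partial A}{\partial t} + \nabla \phi \right) &= 4 \pi \rho
\end{aligned}
\tag{61}
\]
but since
\[ \nabla \wedge ( \nabla \wedge A ) = \nabla ( \nabla \cdot A ) - \nabla^2 A \tag{62} \]
one finds
\[
\begin{aligned}
\left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] A + \nabla \left( \nabla \cdot A + \frac{1}{c} \frac{\partial \phi}{\partial t} \right) &= \frac{4\pi}{c}\, j \\
- \nabla^2 \phi - \frac{1}{c} \frac{\partial}{\partial t} ( \nabla \cdot A ) &= 4 \pi \rho
\end{aligned}
\tag{63}
\]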
fields remain invariant. These transformations are known as gauge transforma-
tions of the second kind3 .
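\[
\begin{aligned}
A \to A' &= A - \nabla \Lambda \\
\phi \to \phi' &= \phi + \frac{1}{c} \frac{\partial \Lambda}{\partial t}
\end{aligned}
\tag{66}
\]
where Λ is an arbitrary analytic function and this transformation leaves the E
and B fields invariant. The magnetic field is seen to be invariant since
\[
\begin{aligned}
B' &= \nabla \wedge A' \\
&= \nabla \wedge ( A - \nabla \Lambda ) \\
&= \nabla \wedge A \\
&= B
\end{aligned}
\tag{67}
\]
where the identity
\[ \nabla \wedge \nabla \Lambda = 0 \tag{68} \]
valid for any scalar function Λ has been used. The electric field is invariant,
since the transformed electric field is given by
\[
\begin{aligned}
E' &= - \frac{1}{c} \frac{\partial A'}{\partial t} - \nabla \phi' \\
&= - \frac{1}{c} \frac{\partial}{\partial t} ( A - \nabla \Lambda ) - \nabla \left( \phi + \frac{1}{c} \frac{\partial \Lambda}{\partial t} \right) \\
&= - \frac{1}{c} \frac{\partial A}{\partial t} - \nabla \phi \\
&= E
\end{aligned}
\tag{69}
\]
In the above derivation, it has been noted that the order of the derivatives can
be interchanged,
\[ \nabla \frac{\partial \Lambda}{\partial t} = \frac{\partial}{\partial t} \nabla \Lambda \tag{70} \]
since Λ is an analytic scalar function.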
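The invariance of the electric field can also be illustrated numerically. The following minimal Python sketch is an addition to the notes and makes several assumptions: it works in one spatial dimension with c = 1, uses central finite differences, and the potentials `A`, `phi` and the gauge function `gauge_function` are arbitrary smooth functions invented for the example. It applies a transformation of the form A → A − ∂Λ/∂x, φ → φ + (1/c) ∂Λ/∂t and compares the electric field before and after.

```python
import math

C = 1.0    # assumption: units with c = 1
H = 1e-3   # step used for the central finite differences


def d_dx(f, x, t):
    return (f(x + H, t) - f(x - H, t)) / (2 * H)


def d_dt(f, x, t):
    return (f(x, t + H) - f(x, t - H)) / (2 * H)


def electric_field(A, phi, x, t):
    """x-component of E = -(1/c) dA/dt - grad(phi)."""
    return -d_dt(A, x, t) / C - d_dx(phi, x, t)


# Illustrative (hypothetical) potentials and gauge function.
def A(x, t):
    return math.sin(x - t)


def phi(x, t):
    return x * x * math.exp(-t)


def gauge_function(x, t):
    return math.cos(2 * x) * t ** 2


# Gauge-transformed potentials.
def A_prime(x, t):
    return A(x, t) - d_dx(gauge_function, x, t)


def phi_prime(x, t):
    return phi(x, t) + d_dt(gauge_function, x, t) / C


E_before = electric_field(A, phi, 0.3, 0.7)
E_after = electric_field(A_prime, phi_prime, 0.3, 0.7)

print(E_before, E_after)
```

The two values agree up to rounding because the central differences in x and t commute, which mirrors the interchange of derivatives used in the derivation.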
$^3$ The transformation
\[ \psi \to \psi' = \psi\, \exp\left[ \frac{i \chi}{\hbar} \right] , \qquad \hat p \to \hat p' = - i \hbar \nabla - \nabla \chi \]
used in quantum mechanics is known as a gauge transformation of the first kind.
The gauge invariance allows us the freedom to impose a gauge condition
which fixes the gauge. Two gauge conditions which are commonly used are the
Lorenz gauge
\[ \nabla \cdot A + \frac{1}{c} \frac{\partial \phi}{\partial t} = 0 \tag{71} \]
and the Coulomb or radiation gauge
∇.A = 0 (72)
The Lorenz gauge is manifestly Lorentz invariant, whereas the Coulomb gauge
is frequently used in cases where the electrostatic interactions are important.
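For example, if the fields (φ, A) do not satisfy the Lorenz gauge condition,
since
\[ \nabla \cdot A + \frac{1}{c} \frac{\partial \phi}{\partial t} = \chi(r, t) \tag{73} \]
where χ is non-zero, then one can perform the gauge transformation to the new
fields $(\phi', A')$, for which
\[
\begin{aligned}
\nabla \cdot A' + \frac{1}{c} \frac{\partial \phi'}{\partial t} &= \nabla \cdot A - \nabla^2 \Lambda + \frac{1}{c} \frac{\partial \phi}{\partial t} + \frac{1}{c^2} \frac{\partial^2 \Lambda}{\partial t^2} \\
&= \chi - \left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \Lambda
\end{aligned}
\tag{74}
\]
The new fields satisfy the Lorenz condition if one chooses Λ to be the solution
of the wave equation
\[ \left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \Lambda = \chi(r, t) \tag{75} \]
This can always be done, since the driven wave equation always has a solution.
Hence, one can always insist that the fields satisfy the gauge condition
\[ \nabla \cdot A' + \frac{1}{c} \frac{\partial \phi'}{\partial t} = 0 \tag{76} \]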
Alternatively, if one is to impose the Coulomb gauge condition
\[ \nabla \cdot A' = 0 \tag{77} \]
one can use Poisson’s equation to show that one can always find a Λ such that
the Coulomb gauge condition is satisfied$^4$.
since in the case of the Coulomb gauge, the vector potential is only known up to the gradient
of any harmonic function Λ.
In the Lorenz gauge, the equations of motion for the electromagnetic field
are given by
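\[
\begin{aligned}
\left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] A &= \frac{4\pi}{c}\, j \\
\left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \phi &= 4 \pi \rho
\end{aligned}
\tag{78}
\]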
Hence, A and φ both satisfy the wave equation, where j and ρ are the sources.
The solutions are waves which travel with velocity c.
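\[
\begin{aligned}
\left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] A &= \frac{4\pi}{c}\, j - \frac{1}{c} \frac{\partial}{\partial t} \nabla \phi \\
- \nabla^2 \phi &= 4 \pi \rho
\end{aligned}
\tag{79}
\]
\[ \phi(r, t) = \int d^3 r'\; \frac{\rho(r', t)}{| r - r' |} \tag{80} \]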
Exercise:
of the four-vector $x^{(0)}$ (the time-like component) is defined to be ct, where
c is the velocity of light, in order that all the components have the dimensions
of length. In Minkowski space, the four-vector is defined as having contravariant
components $x^\mu = (ct, x^{(1)}, x^{(2)}, x^{(3)})$, while the covariant
components are denoted by $x_\mu = (ct, -x^{(1)}, -x^{(2)}, -x^{(3)})$. The
invariant length is given by the scalar product
which is related to the proper time τ. This definition can be generalized to the
scalar product of two arbitrary four-vectors $A^\mu$ and $B^\mu$ as
where repeated indices are summed over. In special relativity, the four-vector
scalar product can be written in terms of the product of the time-like
components and the scalar product of the usual three-vectors as
\[ A^\mu B_\mu = g_{\mu,\nu}\, A^\mu B^\nu \tag{84} \]
\[ A_\mu = g_{\mu,\nu}\, A^\nu \tag{85} \]
where µ labels the rows and ν labels the columns. If the four-vectors are ex-
pressed as column-vectors
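\[ A^\nu = \begin{pmatrix} A^{(0)} \\ A^{(1)} \\ A^{(2)} \\ A^{(3)} \end{pmatrix} \tag{87} \]
and
\[ A_\nu = \begin{pmatrix} A_{(0)} \\ A_{(1)} \\ A_{(2)} \\ A_{(3)} \end{pmatrix} \tag{88} \]
then the transformation from contravariant to covariant components can be
expressed as
\[
\begin{pmatrix} A_{(0)} \\ A_{(1)} \\ A_{(2)} \\ A_{(3)} \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
\begin{pmatrix} A^{(0)} \\ A^{(1)} \\ A^{(2)} \\ A^{(3)} \end{pmatrix}
\tag{89}
\]
\[ A^\mu = g^{\mu,\nu}\, A_\nu \tag{90} \]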
A familiar example of the Lorentz invariant scalar product involves the
momentum four-vector with contravariant components
$p^\mu \equiv ( \frac{E}{c}, p^{(1)}, p^{(2)}, p^{(3)} )$ where E is the energy.
The covariant components of the momentum four-vector are given by
$p_\mu \equiv ( \frac{E}{c}, -p^{(1)}, -p^{(2)}, -p^{(3)} )$ and the scalar
product defines the invariant mass m via
\[ p^\mu p_\mu = \left( \frac{E}{c} \right)^2 - p^2 = m^2 c^2 \tag{92} \]
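The lowering of an index with the metric and the resulting invariant can be sketched in a few lines of Python. This is an illustrative addition, not part of the notes: the helper names (`lower_index`, `scalar_product`, `four_momentum`) and the numerical values are assumptions, and units with c = 1 are used by default.

```python
import math

# Metric tensor g_{mu,nu} with signature (+, -, -, -).
METRIC = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]


def lower_index(a_contra):
    """Covariant components A_mu = g_{mu,nu} A^nu."""
    return [sum(METRIC[mu][nu] * a_contra[nu] for nu in range(4))
            for mu in range(4)]


def scalar_product(a_contra, b_contra):
    """Invariant A^mu B_mu built from two contravariant four-vectors."""
    b_co = lower_index(b_contra)
    return sum(a_contra[mu] * b_co[mu] for mu in range(4))


def four_momentum(m, p, c=1.0):
    """Contravariant p^mu = (E/c, p) on the positive-energy branch."""
    energy = c * math.sqrt(sum(pi * pi for pi in p) + (m * c) ** 2)
    return [energy / c, *p]


# Illustrative values: m = 2 and three-momentum (3, 4, 12), with c = 1.
p_mu = four_momentum(2.0, (3.0, 4.0, 12.0))
print(scalar_product(p_mu, p_mu))  # should be close to m^2 c^2 = 4
```

The same `scalar_product` helper applies to any pair of four-vectors, for example two displacement vectors.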
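\[ x^\mu \to x^{\mu\prime} = x^\mu + a^\mu \tag{94} \]
the scalar function $\phi(x^{\mu\prime})$ is still a scalar. Therefore, on
performing a Taylor expansion, one has
\[ \phi( x^\mu + a^\mu ) = \phi( x^\mu ) + a^\mu \frac{\partial}{\partial x^\mu} \phi( x^\mu ) + \ldots \tag{95} \]
which is also a scalar. Therefore, the quantity
\[ a^\mu \frac{\partial}{\partial x^\mu} \phi( x^\mu ) \tag{96} \]
is a scalar and can be interpreted as a scalar product between the contravariant
vector displacement $a^\mu$ and the covariant gradient
\[ \frac{\partial}{\partial x^\mu} \phi( x^\mu ) \tag{97} \]
The covariant gradient can be interpreted in terms of a covariant derivative
\[
\begin{aligned}
\partial_\mu &= \frac{\partial}{\partial x^\mu} \\
&= \left( \frac{1}{c} \frac{\partial}{\partial t} , \frac{\partial}{\partial x^{(1)}} , \frac{\partial}{\partial x^{(2)}} , \frac{\partial}{\partial x^{(3)}} \right) \\
&= \left( \frac{1}{c} \frac{\partial}{\partial t} , \nabla \right)
\end{aligned}
\tag{98}
\]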
which is of the form of a Lorentz scalar. Likewise, if one introduces the current
density four-vector $j^\mu$ with contravariant components
\[ j^\mu = ( c \rho , j^{(1)} , j^{(2)} , j^{(3)} ) = ( c \rho , j ) \tag{102} \]
which is a Lorentz scalar. Also, the gauge transformation can be compactly
expressed in terms of a transformation of the contravariant vector potential
\[ A^\mu \to A^{\mu\prime} = A^\mu + \partial^\mu \Lambda \tag{104} \]
Similarly, one can use the contravariant notation to express the quantization
conditions
\[
\begin{aligned}
E &\to i \hbar \frac{\partial}{\partial t} \\
p &\to - i \hbar \nabla
\end{aligned}
\tag{106}
\]
in the form
\[ p^\mu = i \hbar\, \partial^\mu \tag{107} \]
One can also express the wave equation operator in terms of the scalar product
of the contravariant and covariant derivative operators
\[ \partial^\mu \partial_\mu = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \tag{108} \]
Hence, in the Lorenz gauge, the equations of motion for the four-vector potential
$A^\mu$ can be expressed concisely as
\[ \partial^\nu \partial_\nu A^\mu = \frac{4\pi}{c}\, j^\mu \tag{109} \]
However, these equations are not gauge invariant.
4.3 Lorentz Transformations
A Lorentz transform can be defined as any transformation which leaves the
scalar product of two four-vectors invariant. Under a Lorentz transformation,
an arbitrary four-vector Aµ is transformed to Aµ0 , via
\[ A^{\mu\prime} = \Lambda^\mu{}_\nu\, A^\nu \tag{110} \]
If the scalar product is to be invariant, the transform must satisfy the condition
where µ labels the rows and ν labels the columns. In terms of the matrices, the
condition that Λ is a Lorentz transformation can be written as
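\[ g = \Lambda^T g\, \Lambda \tag{115} \]
\[ ( \Lambda^T )_\nu{}^\mu = \Lambda^\mu{}_\nu \tag{116} \]
\[
\Lambda^\mu{}_\nu = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}
\begin{pmatrix}
1 & -\frac{v}{c} & 0 & 0 \\
-\frac{v}{c} & 1 & 0 & 0 \\
0 & 0 & \sqrt{1 - \frac{v^2}{c^2}} & 0 \\
0 & 0 & 0 & \sqrt{1 - \frac{v^2}{c^2}}
\end{pmatrix}
\tag{117}
\]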
Figure 5: Two inertial frames of reference moving with a constant relative
velocity with respect to each other.
Figure 6: Two inertial frames of reference rotated with respect to each other.
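Likewise, the rotation through an angle ϕ about the $x^{(3)}$-axis is
represented by
\[
\Lambda^\mu{}_\nu =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\varphi & \sin\varphi & 0 \\
0 & -\sin\varphi & \cos\varphi & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{119}
\]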
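The matrix condition $g = \Lambda^T g \Lambda$ can be tested numerically for both kinds of transformation. The sketch below is an illustrative addition: it assumes the standard form of a boost along the $x^{(1)}$-axis with β = v/c, uses the rotation about the $x^{(3)}$-axis given above, and the helper names (`boost_x1`, `rotation_x3`, `is_lorentz`) are hypothetical.

```python
import math

# Metric tensor with signature (+, -, -, -).
G = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def matmul(a, b):
    return [[sum(a[r][n] * b[n][c] for n in range(4)) for c in range(4)]
            for r in range(4)]


def transpose(a):
    return [[a[c][r] for c in range(4)] for r in range(4)]


def boost_x1(beta):
    """Boost along x^(1) with beta = v/c (assumed standard form)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def rotation_x3(phi):
    """Rotation through the angle phi about the x^(3)-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, c, s, 0.0],
        [0.0, -s, c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def is_lorentz(lam, tol=1e-12):
    """Check the condition g = Lambda^T g Lambda element by element."""
    lhs = matmul(matmul(transpose(lam), G), lam)
    return all(abs(lhs[r][c] - G[r][c]) < tol
               for r in range(4) for c in range(4))


print(is_lorentz(boost_x1(0.6)), is_lorentz(rotation_x3(0.4)))
```

Since the condition is preserved under matrix multiplication, a boost followed by a rotation passes the same check, whereas a simple rescaling of the time axis does not.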
Since the boost velocity v and the angles of rotation ϕ are continuous, one
could consider transformations where these quantities are infinitesimal. Such
infinitesimal transformations can be expanded as
\[ \Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \epsilon^\mu{}_\nu + \ldots \tag{120} \]
where $\delta^\mu{}_\nu$ is the Kronecker delta function representing the
identity transformation$^5$ and $\epsilon^\mu{}_\nu$ is a matrix which is
first-order in the infinitesimal parameter. The condition on
$\epsilon^\mu{}_\nu$ required for $\Lambda^\mu{}_\nu$ to be a Lorentz transform
is given by
or, on using the metric tensor to lower the indices, one has
Exercise:
Show that a Lorentz transformation from the unprimed rest frame to the
primed reference frame moving along the x(3) -axis with constant velocity v, can
be considered as a rotation through an imaginary angle θ = i χ in space-time,
where i c t plays the role of a spatial coordinate. Find the equation that deter-
mines χ.
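shall express the six ($\frac{16 - 4}{2} = 6$) independent components of the
antisymmetric tensor in terms of the four-vector potential $A^\mu$ and the
contravariant derivative as
\[ F^{\mu,\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \tag{125} \]
so the tensor is antisymmetric
\[ F^{\mu,\nu} = - F^{\nu,\mu} \tag{126} \]
It is immediately obvious that $F^{\mu,\nu}$ is invariant under gauge
transformations
\[ A^\mu \to A^{\mu\prime} = A^\mu + \partial^\mu \Lambda \tag{127} \]
since
\[ \partial^\mu \partial^\nu \Lambda - \partial^\nu \partial^\mu \Lambda \equiv 0 \tag{128} \]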
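Alternatively, explicit evaluation of $F^{\mu,\nu}$ shows that the six
independent components can be expressed in terms of the electric and magnetic
fields, which are gauge invariant. Components of the field tensor are explicitly
evaluated from the definition as
\[
\begin{aligned}
F^{0,1} &= \frac{1}{c} \frac{\partial}{\partial t} A^{(1)} - \frac{\partial}{\partial x_1} \phi \\
&= \frac{1}{c} \frac{\partial}{\partial t} A^{(1)} + \frac{\partial}{\partial x^{(1)}} \phi \\
&= - E^{(1)}
\end{aligned}
\tag{129}
\]
and
\[
\begin{aligned}
F^{1,2} &= \frac{\partial}{\partial x_1} A^{(2)} - \frac{\partial}{\partial x_2} A^{(1)} \\
&= - \frac{\partial}{\partial x^{(1)}} A^{(2)} + \frac{\partial}{\partial x^{(2)}} A^{(1)} \\
&= - B^{(3)}
\end{aligned}
\tag{130}
\]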
The non-zero components of the field tensor are related to the spatial compo-
nents (i, j, k) of the electromagnetic field by
\[ F^{i,0} = E^{(i)} \tag{131} \]
and
\[ F^{i,j} = - \xi^{i,j,k}\, B^{(k)} \tag{132} \]
where $\xi^{i,j,k}$ is the Levi-Civita symbol. The Levi-Civita symbol is given by
$\xi^{i,j,k} = 1$ if the ordered set (i, j, k) is obtained by an even number of
permutations of (1, 2, 3) and is −1 if it is obtained by an odd number of
permutations, and is zero if two or more indices are repeated. Therefore, the
field tensor can be expressed as the matrix
\[
F^{\mu,\nu} =
\begin{pmatrix}
0 & -E^{(1)} & -E^{(2)} & -E^{(3)} \\
E^{(1)} & 0 & -B^{(3)} & B^{(2)} \\
E^{(2)} & B^{(3)} & 0 & -B^{(1)} \\
E^{(3)} & -B^{(2)} & B^{(1)} & 0
\end{pmatrix}
\tag{133}
\]
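The assembly of the field tensor from the relations (131) and (132) can be sketched as follows. This Python fragment is an illustrative addition to the notes; the function names and the sample field values are assumptions. It fills in the components from the Levi-Civita symbol and confirms that the result is antisymmetric.

```python
def levi_civita(i, j, k):
    """Antisymmetric symbol for indices taken from {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2


def field_tensor(E, B):
    """Build F^{mu,nu} from three-vectors E and B (index 0 is time-like)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(1, 4):
        F[i][0] = E[i - 1]       # F^{i,0} = E^(i), equation (131)
        F[0][i] = -E[i - 1]      # fixed by antisymmetry
        for j in range(1, 4):
            # F^{i,j} = - xi^{i,j,k} B^(k), equation (132)
            F[i][j] = -sum(levi_civita(i, j, k) * B[k - 1]
                           for k in range(1, 4))
    return F


def is_antisymmetric(F):
    return all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))


# Illustrative field values only.
F = field_tensor((1, 2, 3), (4, 5, 6))
for row in F:
    print(row)
print(is_antisymmetric(F))
```

With these sample values the rows reproduce the sign pattern of the matrix (133).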
Maxwell’s equations can be written in terms of the field tensor as
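\[ \partial_\nu F^{\nu,\mu} = \frac{4\pi}{c}\, j^\mu \tag{134} \]
For µ = i, the field equations become
\[
\begin{aligned}
\frac{1}{c} \frac{\partial}{\partial t} F^{0,i} + \frac{\partial}{\partial x^j} F^{j,i} &= \frac{4\pi}{c}\, j^{(i)} \\
- \frac{1}{c} \frac{\partial}{\partial t} E^{(i)} + \xi^{i,j,k} \frac{\partial}{\partial x^j} B^{(k)} &= \frac{4\pi}{c}\, j^{(i)} \\
- \frac{1}{c} \frac{\partial}{\partial t} E^{(i)} + ( \nabla \wedge B )^{(i)} &= \frac{4\pi}{c}\, j^{(i)}
\end{aligned}
\tag{135}
\]
since $F^{0,0}$ vanishes. The above field equations are the two Maxwell’s
equations which involve the sources of the fields. The remaining two source-less
Maxwell equations are expressed in terms of the antisymmetric field tensor as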
where the indices are permuted cyclically. These internal equations reduce to
∇.B = 0 (138)
when µ, ν and ρ are the space indices (1, 2, 3). When one index taken from the
set (µ, ν, ρ) is the time index, and the other two are different space indices, the
field equations reduce to
\[ \frac{1}{c} \frac{\partial B}{\partial t} + \nabla \wedge E = 0 \tag{139} \]
If two indices are repeated, the above equations are satisfied identically, due to
the antisymmetry of the field tensor.
Alternatively, when expressed in terms of the vector potential, the field equa-
tions of motion are equivalent to the wave equations
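\[ \partial_\nu \partial^\nu A^\mu - \partial^\mu \partial_\nu A^\nu = \frac{4\pi}{c}\, j^\mu \tag{140} \]
Since four-vectors $A^\mu$ and $j^\mu$ transform as
\[
\begin{aligned}
A^{\mu\prime} &= \Lambda^\mu{}_\nu\, A^\nu \\
j^{\mu\prime} &= \Lambda^\mu{}_\nu\, j^\nu
\end{aligned}
\tag{141}
\]
and likewise for the contravariant derivative
\[ \partial^{\mu\prime} = \Lambda^\mu{}_\nu\, \partial^\nu \tag{142} \]
then one can conclude that the field tensor transforms as
\[ F^{\mu,\nu\,\prime} = \Lambda^\mu{}_\sigma\, \Lambda^\nu{}_\tau\, F^{\sigma,\tau} \tag{143} \]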
This shows that, under a Lorentz transform, the electric and magnetic fields
(E, B) transform into themselves.
Exercise:
Show explicitly, how the components of the electric and magnetic fields
change, when the coordinate system is transformed from the unprimed refer-
ence frame to a primed reference frame which is moving along the x(3) -axis with
constant velocity v.
The Lagrangian for the string is a function of the coordinates $y_i$ and the
velocities $\frac{dy_i}{dt}$. The Lagrangian is given by
\[ L = \sum_{i=1}^{N} \left[ \frac{m_i}{2} \left( \frac{dy_i}{dt} \right)^2 - \frac{\kappa_i}{2} \left( y_i - y_{i-1} \right)^2 \right] \tag{144} \]
The first term represents the kinetic energy of the mass elements, and the second
term represents the increase in the elastic potential energy of the section of
the string between the i-th and (i − 1)-th elements when it is stretched from
its equilibrium position. This follows since $\Delta s_i$, the length of the
section of string between mass elements i and i − 1 in a non-equilibrium
position, is given by
\[
\begin{aligned}
\Delta s_i^2 &= ( x_i - x_{i-1} )^2 + ( y_i - y_{i-1} )^2 \\
&= a^2 + ( y_i - y_{i-1} )^2
\end{aligned}
\tag{145}
\]
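since the x-coordinates are fixed. Thus, if one assumes that the spring constant
for the stretched string segment is $\kappa_i$, then the potential energy of the
segment is given by
\[ V_i = \frac{\kappa_i}{2} ( y_i - y_{i-1} )^2 \tag{146} \]
[Figure: the displacements $y_{i-1}$, $y_i$, $y_{i+1}$ of neighbouring string
elements at the positions $x_{i-1}$, $x_i$, $x_{i+1}$, which are separated by
the spacing a.]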
to be evaluated for arbitrary functions $y_i(t)$. The string follows the
trajectories $y_i^{ex}(t)$ which minimize the action and which travel between
the fixed initial value $y_i(0)$ and the final value $y_i(T)$. We shall represent
the deviation of an arbitrary trajectory $y_i(t)$ from the extremal trajectory by
$\delta y_i(t)$, so that
\[ \delta y_i(t) = y_i(t) - y_i^{ex}(t) \tag{148} \]
The action can be expanded in powers of the deviations $\delta y_i$ as
\[ S = S_0 + \delta^1 S + \delta^2 S + \ldots \tag{149} \]
where $S_0$ is the action evaluated for the extremal trajectories. The
first-order deviation found by varying $\delta y_i$ is given by
\[ \delta^1 S = \int_0^T dt \sum_{i=1}^{N} \left[ m_i \frac{d \delta y_i}{dt} \frac{d y_i^{ex}}{dt} - \kappa\, \delta y_i \left( y_i^{ex} - y_{i-1}^{ex} \right) + \kappa\, \delta y_i \left( y_{i+1}^{ex} - y_i^{ex} \right) \right] \tag{150} \]
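in which $y_i(t)$ and $\frac{dy_i}{dt}$ are to be evaluated for the extremal
trajectory. Since the trajectory which the string follows minimizes the action,
the term $\delta^1 S$ must vanish for an arbitrary variation $\delta y_i$. We can
eliminate the time derivative of the deviation by integrating by parts with
respect to t. This yields
\[
\begin{aligned}
\delta^1 S = {} & \int_0^T dt \sum_{i=1}^{N} \left[ - m_i\, \delta y_i \frac{d}{dt} \left( \frac{d y_i^{ex}}{dt} \right) - \kappa\, \delta y_i \left( y_i^{ex} - y_{i-1}^{ex} \right) + \kappa\, \delta y_i \left( y_{i+1}^{ex} - y_i^{ex} \right) \right] \\
& + \sum_i m_i\, \delta y_i(t) \left. \frac{d y_i^{ex}}{dt} \right|_0^T
\end{aligned}
\tag{151}
\]
The boundary term vanishes since the initial and final configurations are fixed,
so
\[ \delta y_i(T) = \delta y_i(0) = 0 \tag{152} \]
Hence the first-order variation of the action reduces to
\[ \delta^1 S = \int_0^T dt \sum_{i=1}^{N} \delta y_i \left[ - m_i \frac{d}{dt} \left( \frac{d y_i^{ex}}{dt} \right) - \kappa \left( 2 y_i^{ex} - y_{i-1}^{ex} - y_{i+1}^{ex} \right) \right] \tag{153} \]
The linear variation of the action vanishes for an arbitrary $\delta y_i(t)$, if
the term in the square brackets vanishes
\[ m_i \frac{d}{dt} \left( \frac{d y_i^{ex}}{dt} \right) + \kappa \left( 2 y_i^{ex} - y_{i-1}^{ex} - y_{i+1}^{ex} \right) = 0 \tag{154} \]
Thus, out of all possible trajectories, the physical trajectory $y_i^{ex}(t)$ is
determined by the equation of motion
\[ m_i \frac{d}{dt} \left( \frac{d y_i}{dt} \right) = - \kappa \left( 2 y_i - y_{i-1} - y_{i+1} \right) \tag{155} \]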
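The Hamiltonian is only a function of the pairs of canonically conjugate
momenta $p_i$ and coordinates $y_i$. This can be seen by considering infinitesimal
changes in $y_i$, $\frac{dy_i}{dt}$ and $p_i$. The resulting infinitesimal change
in the Hamiltonian dH is expressed as
\[
\begin{aligned}
dH &= \sum_{i=1}^{N} \left[ \frac{dy_i}{dt}\, dp_i + p_i\, d\!\left( \frac{dy_i}{dt} \right) - \frac{\partial L}{\partial ( \frac{dy_i}{dt} )}\, d\!\left( \frac{dy_i}{dt} \right) - \frac{\partial L}{\partial y_i}\, dy_i \right] \\
&= \sum_{i=1}^{N} \left[ \frac{dy_i}{dt}\, dp_i - \frac{\partial L}{\partial y_i}\, dy_i \right]
\end{aligned}
\tag{159}
\]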
When expressed in terms of the Hamiltonian, the equations of motion have the
form
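\[
\begin{aligned}
\frac{dy_i}{dt} &= \frac{\partial H}{\partial p_i} \\
\frac{dp_i}{dt} &= - \frac{\partial H}{\partial y_i}
\end{aligned}
\tag{163}
\]
The Hamilton equations of motion reduce to
\[
\begin{aligned}
\frac{dy_i}{dt} &= \frac{p_i}{m_i} \\
\frac{dp_i}{dt} &= - \kappa_i \left( y_i - y_{i-1} \right) + \kappa_{i+1} \left( y_{i+1} - y_i \right)
\end{aligned}
\tag{164}
\]
for each i value N ≥ i ≥ 1.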
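The Hamilton equations (164) are straightforward to integrate numerically. The sketch below is an illustrative addition and rests on several assumptions that are not taken from the notes: all masses and spring constants are equal, both ends of the string are held fixed ($y_0 = y_{N+1} = 0$), and a leapfrog scheme is used for the time stepping. It checks that the total energy stays nearly constant along the trajectory.

```python
import math


def force(y, kappa):
    """dp_i/dt from (164) with equal spring constants and fixed ends."""
    padded = [0.0] + list(y) + [0.0]
    return [-kappa * (padded[i] - padded[i - 1])
            + kappa * (padded[i + 1] - padded[i])
            for i in range(1, len(y) + 1)]


def energy(y, p, m, kappa):
    """Kinetic energy plus elastic energy of every stretched segment."""
    padded = [0.0] + list(y) + [0.0]
    kinetic = sum(pi * pi / (2.0 * m) for pi in p)
    potential = sum(0.5 * kappa * (padded[i] - padded[i - 1]) ** 2
                    for i in range(1, len(padded)))
    return kinetic + potential


def leapfrog_step(y, p, m, kappa, dt):
    """One step: half kick in p, full drift in y, half kick in p."""
    f = force(y, kappa)
    p_half = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
    y_new = [yi + dt * ph / m for yi, ph in zip(y, p_half)]
    f_new = force(y_new, kappa)
    p_new = [ph + 0.5 * dt * fi for ph, fi in zip(p_half, f_new)]
    return y_new, p_new


def simulate(y, p, m, kappa, dt, steps):
    for _ in range(steps):
        y, p = leapfrog_step(y, p, m, kappa, dt)
    return y, p


# Illustrative initial condition: lowest standing wave on N = 8 elements.
N = 8
y0 = [0.1 * math.sin(math.pi * i / (N + 1)) for i in range(1, N + 1)]
p0 = [0.0] * N

e_start = energy(y0, p0, 1.0, 1.0)
y1, p1 = simulate(y0, p0, 1.0, 1.0, dt=0.01, steps=1000)
e_end = energy(y1, p1, 1.0, 1.0)

print(e_start, e_end)
```

A leapfrog scheme is chosen here because it updates coordinates and momenta in the alternating fashion suggested by the pair of first-order equations and keeps the energy error bounded.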
One can define the Poisson brackets of two arbitrary quantities A and B in
terms of derivatives of the canonically conjugate variables
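\[ \{ A , B \} = \sum_{i=1}^{N} \left[ \frac{\partial A}{\partial y_i} \frac{\partial B}{\partial p_i} - \frac{\partial B}{\partial y_i} \frac{\partial A}{\partial p_i} \right] \tag{165} \]
The Poisson bracket is antisymmetric in A and B
\[ \{ A , B \} = - \{ B , A \} \tag{166} \]
and
\[ \{ p_i , p_j \} = \{ y_i , y_j \} = 0 \tag{168} \]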
The increase in energy of this segment, per unit time, is clearly given by the
difference of the quantity
\[ P_i = - \kappa\, \frac{dy_i}{dt} \left( y_{i+1} - y_i \right) \tag{172} \]
at the front end of the segment and Pi−1 at the back end of the segment. Since,
from continuity of energy, the rate of increase in the energy of the segment must
equal the net inflow of energy into the segment, one can identify Pi as the flux
of energy flowing out of the i-th into the (i + 1)-th segment.
5.1 The Continuum Limit
The displacement of each element of the string can be expressed as a function
of its position, via
yi = y(xi ) (173)
where each segment has length a, so that xi+1 = xi + a. The displacement
y(xi+1 ) can be Taylor expanded about xi as
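\[ y(x_{i+1}) = y(x_i) + a \left. \frac{\partial y}{\partial x} \right|_{x_i} + \frac{a^2}{2!} \left. \frac{\partial^2 y}{\partial x^2} \right|_{x_i} + \ldots \tag{174} \]
We intend to take the limit a → 0, so that only the first few terms of the series
need to be retained. The summations over i are to be replaced by integrations
\[ \sum_{i=1}^{N} \to \frac{1}{a} \int_0^L dx \tag{175} \]
\[ T = \kappa a \tag{176} \]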
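where
\[ \mathcal{L} = \frac{1}{2} \left[ \rho \left( \frac{dy}{dt} \right)^2 - \kappa a \left( \frac{\partial y}{\partial x} \right)^2 \right] \tag{178} \]
The equations of motion are found from the extrema of the action
\[ S = \int_0^T dt \int_0^L dx\; \mathcal{L} \tag{179} \]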
It should be noted, that in S time and space are treated on the same footing
and that L is a scalar quantity.
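As a brief worked sketch (an addition to the notes), the equation of motion implied by the Lagrangian density (178) can be obtained by varying the action (179). It is assumed here that the variation δy vanishes at the initial and final times and at both ends of the string, so that the boundary terms from the integrations by parts drop out.

```latex
\begin{aligned}
\delta S &= \int_0^T dt \int_0^L dx \left[ \rho\, \frac{\partial y}{\partial t} \frac{\partial\, \delta y}{\partial t}
          - \kappa a\, \frac{\partial y}{\partial x} \frac{\partial\, \delta y}{\partial x} \right] \\
         &= \int_0^T dt \int_0^L dx \; \delta y \left[ - \rho\, \frac{\partial^2 y}{\partial t^2}
          + \kappa a\, \frac{\partial^2 y}{\partial x^2} \right] = 0
\quad \Longrightarrow \quad
\frac{\partial^2 y}{\partial t^2} = \frac{\kappa a}{\rho}\, \frac{\partial^2 y}{\partial x^2}
\end{aligned}
```

This is a wave equation with the squared velocity $v^2 = \kappa a / \rho$, the same combination that appears in the dispersion relation (199).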
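where the Hamiltonian density $\mathcal{H}$ is given by
\[ \mathcal{H} = \frac{1}{2} \left[ \rho \left( \frac{dy}{dt} \right)^2 + \kappa a \left( \frac{\partial y}{\partial x} \right)^2 \right] \tag{181} \]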
where the $c_k$ are arbitrary complex functions of k. If the time dependence of
the $\psi_k(x)$ is absorbed into the complex functions $c_k$ via
\[ c_k(t) = c_k(0)\, \exp\left[ - i \omega_k t \right] \tag{191} \]
which is purely real. Thus, the field y(x) is determined by the amplitudes
$c_k(t)$ of the normal modes. The time-dependent amplitude $c_k(t)$ satisfies the
equation of motion
\[ \frac{d^2 c_k}{dt^2} = - \omega_k^2\, c_k \tag{193} \]
and, therefore, behaves like a classical harmonic oscillator. To quantize this
classical field theory, one needs to quantize these harmonic oscillators.
and
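\[ p(x) = \frac{\rho a}{\sqrt{L}} \sum_k \left( \frac{dc_k}{dt} + \frac{dc^*_{-k}}{dt} \right) \exp\left[ i k x \right] \tag{196} \]
then after integrating over x, one finds that the energy has the form
\[
\begin{aligned}
H = {} & \frac{\rho}{2} \sum_k \left( \frac{dc_{-k}}{dt} + \frac{dc^*_k}{dt} \right) \left( \frac{dc_k}{dt} + \frac{dc^*_{-k}}{dt} \right) \\
& + \frac{\kappa a}{2} \sum_k k^2 \left( c_{-k}(t) + c^*_k(t) \right) \left( c_k(t) + c^*_{-k}(t) \right)
\end{aligned}
\tag{197}
\]
but the frequency is given by the dispersion relation
\[ \omega_k^2 = v^2 k^2 = \frac{\kappa a}{\rho}\, k^2 \tag{199} \]
Therefore, the expression for the Hamiltonian simplifies to
\[
\begin{aligned}
H &= \rho \sum_k \omega_k^2 \left[ c^*_k(t)\, c_k(t) + c_{-k}(t)\, c^*_{-k}(t) \right] \\
&= \rho \sum_k \omega_k^2 \left[ c^*_k(0)\, c_k(0) + c_{-k}(0)\, c^*_{-k}(0) \right]
\end{aligned}
\tag{200}
\]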
These relations are simply the results of applying the inverse Fourier transform
to y(x) and p(x). One can find the Poisson brackets relations between ck and
c∗k from
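\[
\begin{aligned}
- i \omega_{k'} \rho \left\{ c_{k'} - c^*_{-k'} \, , \, c_k + c^*_{-k} \right\}
&= - \frac{a}{L} \sum_{i,j} \left\{ p(x_i) , y(x_j) \right\} \exp\left[ - i ( k x_i + k' x_j ) \right] \\
&= - \frac{a}{L} \sum_{i,j} \delta_{i,j}\, \exp\left[ - i ( k x_i + k' x_j ) \right] \\
&= - \frac{a}{L} \sum_i \exp\left[ - i ( k + k' ) x_i \right] \\
&= - \delta_{k+k'}
\end{aligned}
\tag{204}
\]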
Likewise, one can obtain similar expressions for the other commutation relations.
This set of equations can be satisfied by setting
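\[ \left\{ c^*_{k'} , c_k \right\} = \frac{i}{2 \omega_k \rho}\, \delta_{k,k'} \tag{205} \]
and
\[ \left\{ c^*_{k'} , c^*_k \right\} = \left\{ c_{k'} , c_k \right\} = 0 \tag{206} \]
The above set of Poisson brackets can be recast in a simpler form by defining
\[ c_k = \frac{1}{\sqrt{2 \omega_k \rho}}\, a_k \tag{207} \]
and
\[ \left\{ a^*_{k'} , a^*_k \right\} = \left\{ a_{k'} , a_k \right\} = 0 \tag{209} \]
So one has
\[ [\, \hat a^\dagger_{k'} , \hat a_k \,] = - \hbar\, \delta_{k,k'} \tag{211} \]
and
\[ [\, \hat a^\dagger_{k'} , \hat a^\dagger_k \,] = [\, \hat a_{k'} , \hat a_k \,] = 0 \tag{212} \]
To get rid of the annoying ħ in the commutator, one can set
\[
\begin{aligned}
\hat a_k &= \sqrt{\hbar}\; \hat b_k \\
\hat a^\dagger_k &= \sqrt{\hbar}\; \hat b^\dagger_k
\end{aligned}
\tag{213}
\]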
It should be noted that $\hat{b}^\dagger_k$ is indeed the Hermitean conjugate of $\hat{b}_k$. The Hermitean relation can be proved by taking the Hermitean conjugate of $\hat{y}(x_i)$, and noting that the third rule of quantization states, “Measurable quantities are to be replaced by Hermitean operators”. Therefore, the operator
\[ \hat{y}(x) = \frac{1}{\sqrt{L}} \sum_k \sqrt{\frac{\hbar}{2 \rho \omega_k}} \left( \hat{b}_k(t) + \hat{b}^\dagger_{-k}(t) \right) \exp[ i k x ] \tag{214} \]
has to be the same as ŷ(x). On setting k = −k 0 in the above equation, one has
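\[ \hat{y}^\dagger(x) = \frac{1}{\sqrt{L}} \sum_{k'} \sqrt{\frac{\hbar}{2 \rho \omega_{k'}}} \left( \hat{b}^\dagger_{-k'}(t) + ( \hat{b}^\dagger_{k'}(t) )^\dagger \right) \exp[ + i k' x ] \tag{216} \]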
For this to be equal to ŷ(x), it is necessary that the Hermitean conjugate of the
operator b̂†k0 is equal to b̂k0 . This shows that the pair of operators are indeed
Hermitean conjugates. The quantum field is represented by the operator
\[ \hat{y}(x) = \frac{1}{\sqrt{L}} \sum_{k'} \sqrt{\frac{\hbar}{2 \rho \omega_{k'}}} \left( \hat{b}^\dagger_{-k'}(t) + \hat{b}_{k'}(t) \right) \exp[ + i k' x ] \tag{217} \]
where the b̂k and b̂†k are to be identified as annihilation and creation operators
for the quanta.
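The quantum operator $\hat{P}$ corresponding to the classical quantity $P$
\[ P = \int_0^L dx \, \mathcal{P} \tag{221} \]
is evaluated as
\[ \hat{P} = - \frac{\kappa a}{2 \rho} \sum_k \hbar k \left( \hat{b}^\dagger_{-k} - \hat{b}_k \right) \left( \hat{b}^\dagger_k + \hat{b}_{-k} \right) \tag{222} \]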
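where the plane-wave orthogonality properties have been used. This quantity can be expressed as the sum of two terms
\[ \begin{aligned} \hat{P} &= - \frac{\kappa a}{2 \rho} \sum_k \hbar k \left( \hat{b}^\dagger_{-k} \hat{b}^\dagger_k - \hat{b}_k \hat{b}_{-k} \right) \\ &\quad + \frac{\kappa a}{2 \rho} \sum_k \hbar k \left( \hat{b}_k \hat{b}^\dagger_k - \hat{b}^\dagger_{-k} \hat{b}_{-k} \right) \end{aligned} \tag{223} \]
\[ \hat{P} = v^2 \sum_k \hbar k \, \hat{b}^\dagger_k \hat{b}_k \tag{224} \]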
which obviously is proportional to the sum of the momenta of the quanta. The quantity $\kappa a / \rho$ is just the square of the wave velocity, $v^2$. On noting that the quanta travel with velocities given by $v \, \mathrm{sign}(k)$ and have energies given by $\hbar \omega_k = \hbar v | k |$, one sees that $\hat{P}$ is expressed as the total energy flux associated with the quanta.
That eigenstates of the number operators ( $| \{ n_k \} >$ ) cannot represent classical states can be seen by noting that the expectation value of the field operator is zero
follows from the expectation value of the creation and annihilation operators
Despite the fact that the average value of the field is zero, the fluctuation in the
field amplitude is infinite since
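\[ \begin{aligned} < \{ n_k \} | \, \hat{y}(x)^2 \, | \{ n_k \} > &= \frac{1}{L} \sum_{k'} \frac{\hbar}{2 \rho \omega_{k'}} < \{ n_k \} | \left( \hat{b}^\dagger_{k'} + \hat{b}_{-k'} \right) \left( \hat{b}_{k'} + \hat{b}^\dagger_{-k'} \right) | \{ n_k \} > \\ &= \frac{1}{L} \sum_{k'} \frac{\hbar}{2 \rho \omega_{k'}} ( 1 + 2 n_{k'} ) \end{aligned} \tag{227} \]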
and the zero-point contribution diverges logarithmically at the upper and lower
limits of integration.
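The logarithmic growth of the zero-point term in eqn(227) can be illustrated numerically. The sketch below assumes periodic modes $k = 2 \pi n / L$ with $0 < |n| \le n_{max}$, the dispersion $\omega_k = v |k|$, and units in which $\hbar = \rho = v = 1$; the cut-off $n_{max}$ and the function name are illustrative assumptions.

```python
import math


def zero_point_fluctuation(n_max, hbar=1.0, rho=1.0, v=1.0):
    """Zero-point part of eqn(227): (1/L) sum_k hbar / (2 rho omega_k).

    With k = 2 pi n / L and omega_k = v |k|, each term equals
    hbar / (4 pi rho v |n|), so the length L drops out of the sum.
    """
    return sum(hbar / (4.0 * math.pi * rho * v * abs(n))
               for n in range(-n_max, n_max + 1) if n != 0)


for n_max in (10, 100, 1000, 10000):
    print(n_max, zero_point_fluctuation(n_max))
```

Each factor of ten in the cut-off adds roughly $\ln(10) / 2 \pi$ to the sum, which is the logarithmic divergence at large $k$. The divergence at the lower limit appears in the same way when $L \rightarrow \infty$ and arbitrarily small $|k|$ are admitted.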
Hence, the eigenstates of Ĥ do not describe the classical states of the string.
Classical states must be expressed as a linear superposition of energy eigenstates.
\[ \phi^\alpha = \phi^\alpha_{ex} + \delta\phi^\alpha \tag{229} \]
The space and time derivatives of the arbitrary field can also be expressed as
the derivatives of the sum of the extremal field and the deviation
\[ \partial_\nu \phi^\alpha = \partial_\nu \phi^\alpha_{ex} + \partial_\nu \delta\phi^\alpha \tag{230} \]
The first-order change in the action δS is given by
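\[ \delta S = \int_0^t dt' \int d^3x \left[ \delta\phi^\alpha \frac{\partial}{\partial \phi^\alpha} \mathcal{L}( \phi^\alpha_{ex} , \partial_\mu \phi^\alpha_{ex} ) + ( \partial_\nu \delta\phi^\alpha ) \frac{\partial}{\partial ( \partial_\nu \phi^\alpha )} \mathcal{L}( \phi^\alpha_{ex} , \partial_\mu \phi^\alpha_{ex} ) \right] \tag{231} \]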
On integrating by parts with respect to xν in the last term, and on assuming
appropriate boundary conditions, one finds
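\[ \delta S = \int_0^t dt' \int d^3x \, \delta\phi^\alpha \left[ \frac{\partial}{\partial \phi^\alpha} \mathcal{L}( \phi^\alpha_{ex} , \partial_\mu \phi^\alpha_{ex} ) - \partial_\nu \frac{\partial}{\partial ( \partial_\nu \phi^\alpha )} \mathcal{L}( \phi^\alpha_{ex} , \partial_\mu \phi^\alpha_{ex} ) \right] \tag{232} \]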
which has to vanish for an arbitrary choice of δφα . Hence, one obtains the
Euler-Lagrange equations
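\[ \frac{\partial}{\partial \phi^\alpha} \mathcal{L}( \phi^\alpha_{ex} , \partial_\mu \phi^\alpha_{ex} ) = \partial_\nu \frac{\partial}{\partial ( \partial_\nu \phi^\alpha )} \mathcal{L}( \phi^\alpha_{ex} , \partial_\mu \phi^\alpha_{ex} ) \tag{233} \]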
This set of equations determines the time dependence of the classical fields $\phi^\alpha_{ex}(x)$. That is, out of all possible fields with components $\phi^\alpha$, the equations of motion determine the physical field which has the components $\phi^\alpha_{ex}$. It is convenient to define the field momentum density $\pi^0_\alpha(x)$ conjugate to $\phi^\alpha$ as
\[ \pi^0_\alpha(x^\nu) = \frac{1}{c} \frac{\partial}{\partial ( \partial_0 \phi^\alpha )} \mathcal{L}( \phi^\alpha , \partial_\mu \phi^\alpha ) \tag{234} \]
The Hamiltonian density H is then defined as the Legendre transform
\[ \mathcal{H} = \sum_\alpha c \, \pi^0_\alpha ( \partial_0 \phi^\alpha ) - \mathcal{L} \tag{235} \]
Exercise:
Exercise:
for a complex scalar field ψ. Treat ψ and ψ ∗ as independent fields.
(i) Determine the Euler-Lagrange equation and the Hamiltonian density H.
(ii) By Fourier transforming with respect to space and time, determine the form
of the general solution for ψ.
Exercise:
The Lagrangian density for the complex field ψ representing a charged par-
ticle is given by
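\[ \mathcal{L} = - \frac{\hbar^2}{2 m} \nabla \psi^* \cdot \nabla \psi - \frac{\hbar}{2 i} \left( \psi^* \frac{\partial \psi}{\partial t} - \frac{\partial \psi^*}{\partial t} \psi \right) - \psi^* V(x) \psi \tag{238} \]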
(i) Determine the equation of motion, and the Hamiltonian density H.
(ii) Consider the case V (x) ≡ 0, then by Fourier transforming with respect to
space and time, determine the form of the general solution for ψ.
by noting that H is only a functional of πα0 and φα . This can be seen, since as
\[ H = \int d^3x \left[ \sum_\alpha c \, \pi^0_\alpha ( \partial_0 \phi^\alpha ) - \mathcal{L} \right] \tag{240} \]
which does not involve the time derivative of the fields. This implies that
the Hamiltonian is a function of the fields πα0 , φα and their derivatives. On
calculating the variation of H using the independent variables πα0 and φα , and
integrating by parts, one finds that the Hamiltonian equations of motion are
given by
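\[ \begin{aligned} c \, \partial_0 \phi^\alpha &= \frac{\partial \mathcal{H}}{\partial \pi^0_\alpha} - \nabla \cdot \frac{\partial \mathcal{H}}{\partial ( \nabla \pi^0_\alpha )} \\ - c \, \partial_0 \pi^0_\alpha &= \frac{\partial \mathcal{H}}{\partial \phi^\alpha} - \nabla \cdot \frac{\partial \mathcal{H}}{\partial ( \nabla \phi^\alpha )} \end{aligned} \tag{244} \]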
The structure of these equations is similar to that of the classical mechanics of point particles. As in the classical mechanics of point particles, one can define Poisson Brackets with fields. When quantizing the fields, the Poisson Bracket relations between the fields can be replaced by commutation relations.
so
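\[ \delta \mathcal{L} = \pi^\mu_\alpha ( \partial_\mu \delta\phi^\alpha ) + \frac{\partial \mathcal{L}}{\partial \phi^\alpha} \delta\phi^\alpha \tag{249} \]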
The Euler-Lagrange equation for each field φα is given by
\[ \partial_\mu \pi^\mu_\alpha - \frac{\partial \mathcal{L}}{\partial \phi^\alpha} = 0 \tag{250} \]
where φα satisfies the appropriate boundary conditions. Thus,
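\[ \begin{aligned} \delta \mathcal{L} &= \pi^\mu_\alpha \, \partial_\mu \delta\phi^\alpha + ( \partial_\mu \pi^\mu_\alpha ) \delta\phi^\alpha + \left( \frac{\partial \mathcal{L}}{\partial \phi^\alpha} - \partial_\mu \pi^\mu_\alpha \right) \delta\phi^\alpha \\ &= \partial_\mu \left( \pi^\mu_\alpha \delta\phi^\alpha \right) + \left( \frac{\partial \mathcal{L}}{\partial \phi^\alpha} - \partial_\mu \pi^\mu_\alpha \right) \delta\phi^\alpha \\ &= \partial_\mu \left( \pi^\mu_\alpha \delta\phi^\alpha \right) \end{aligned} \tag{251} \]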
since the last term vanishes if the fields φα satisfy the Euler-Lagrange equations.
If the Lagrangian is invariant under the transformation, then δL = 0, so
\[ \partial_\mu \left( \pi^\mu_\alpha \delta\phi^\alpha \right) = 0 \tag{252} \]
where the field index α is to be summed over. The above equation can be
re-written as a continuity equation
∂µ j µ = 0 (253)
The conserved charge Q is defined as the integral over all space of the time-like
component of the current density j (0) . That is, the conserved charge is given by
\[ Q = \int d^3x \, j^{(0)}(x) \tag{256} \]
or, more specifically
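\[ \begin{aligned} Q &= \int d^3x \, \pi^{(0)}_\alpha(x) \, \delta\phi^\alpha(x) \\ &= \int d^3x \, \frac{\partial \mathcal{L}}{\partial ( \partial_0 \phi^\alpha )} \delta\phi^\alpha(x) \end{aligned} \tag{257} \]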
Since $\epsilon$ is a constant, the total charge $Q$ is constant. Therefore, the total time derivative of $Q$ vanishes
\[ \frac{dQ}{dt} = 0 \tag{258} \]
The spatial components of j µ form the current density vector.
\[ \begin{aligned} \psi' &= \psi + i \epsilon \psi \\ \psi^{*\prime} &= \psi^* - i \epsilon \psi^* \end{aligned} \tag{261} \]
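where $\psi$ and its complex conjugate $\psi^*$ are regarded as independent fields. The transformation represents an infinitesimal constant shift of the phase of the field^8. The conserved current is
\[ j^\mu = - i \left( \frac{\partial \mathcal{L}}{\partial ( \partial_\mu \psi )} \psi(x) - \frac{\partial \mathcal{L}}{\partial ( \partial_\mu \psi^* )} \psi^*(x) \right) \tag{262} \]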
8 This particular transformation is a specific example of a gauge transformation of the first kind, in which
\[ \psi'(x) = \exp\left[ - i \frac{q}{\hbar c} \Lambda(x) \right] \psi(x) \]
A gauge transformation of the second kind is one in which the field changes according to
\[ A^{\mu\prime} = A^\mu + ( \partial^\mu \Lambda ) \]
Since $\hat{p}^\mu = i \hbar \partial^\mu$, the combination of these transformations keeps the quantity $( \hat{p}^\mu - \frac{q}{c} A^\mu ) \psi$ invariant up to the same phase factor.
which is the electromagnetic current density four-vector.
Exercise:
Exercise:
one has
\[ \partial_\mu \Lambda^\mu = \partial_\mu \left( \pi^\mu_\alpha \delta\phi^\alpha \right) \tag{268} \]
If the conserved currents are identified as
\[ j^\mu = \pi^\mu_\alpha \delta\phi^\alpha - \Lambda^\mu \tag{269} \]
In this case, the change in the Lagrangian density is given by the total derivative
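\[ \begin{aligned} \delta \mathcal{L} &= \frac{\partial \mathcal{L}}{\partial ( \partial_\nu \phi^\alpha )} ( \partial_\nu \delta\phi^\alpha ) + \frac{\partial \mathcal{L}}{\partial \phi^\alpha} \delta\phi^\alpha \\ &= \epsilon^\mu \left[ \frac{\partial \mathcal{L}}{\partial ( \partial_\nu \phi^\alpha )} \partial_\nu \partial_\mu \phi^\alpha + \frac{\partial \mathcal{L}}{\partial \phi^\alpha} \partial_\mu \phi^\alpha \right] \\ &= \epsilon^\mu \partial_\mu \mathcal{L} \end{aligned} \tag{273} \]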
where the last line follows since the Lagrangian only depends implicitly on xµ
through the fields. Hence, the change in the Lagrangian is a total derivative
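\[ \delta \mathcal{L} = \epsilon^\mu \partial_\mu \Lambda \tag{274} \]
\[ \phi^\alpha \rightarrow \phi^\alpha + \epsilon^\mu ( \partial_\mu \phi^\alpha ) \tag{275} \]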
where the Euler-Lagrange equation has been used in the second line. Thus, the
fields satisfy the continuity conditions
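\[ 0 = \epsilon^\mu \partial_\nu \left( \frac{\partial \mathcal{L}}{\partial ( \partial_\nu \phi^\alpha )} \partial_\mu \phi^\alpha - \delta_\mu{}^\nu \mathcal{L} \right) \tag{277} \]
where
\[ \delta_\mu{}^\nu = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{otherwise} \end{cases} \tag{278} \]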
are related to the momentum density since the total momentum of the field is given by
\[ P^{(j)} = \frac{1}{c} \int d^3x \, T^{0,j} \tag{286} \]
Since $T^{0,j}$ is the momentum density, one expects that the components of the orbital angular momentum density are proportional to
\[ M^{0,j,k} = T^{0,j} x^{(k)} - T^{0,k} x^{(j)} \tag{287} \]
One can define a third-rank tensor via
\[ M^{\mu,\nu,\rho} = T^{\mu,\nu} x^\rho - T^{\mu,\rho} x^\nu \tag{288} \]
The divergence of the third-rank tensor is evaluated as
\[ \partial_\mu M^{\mu,\nu,\rho} = ( \partial_\mu T^{\mu,\nu} ) x^\rho + T^{\mu,\nu} \delta^\rho{}_\mu - ( \partial_\mu T^{\mu,\rho} ) x^\nu - T^{\mu,\rho} \delta^\nu{}_\mu \]
It should be noted that the tensor $T^{\mu,\nu}$ is only symmetric for scalar fields. This is related to the fact that a vector or tensor field carries a non-zero intrinsic angular momentum. It is possible to incorporate an additional term in the momentum-energy tensor of a vector field to make it symmetric.
Exercise:
(i) Determine the momentum-energy tensor for a complex scalar field ψ governed
by the Lagrangian density
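\[ \mathcal{L} = \frac{1}{2} \left[ ( \partial_\mu \psi^* ) ( \partial^\mu \psi ) - \left( \frac{m c}{\hbar} \right)^2 | \psi |^2 \right] \tag{291} \]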
(ii) Find the forms of the energy and momentum density of the field.
(iii) Using the form of the general solution, find expressions for the total energy
and momentum of the field in terms of the Fourier components of the field.
Exercise:
(i) Determine the energy-momentum tensor for the Lagrangian density for the
complex Schrödinger field representing a charged particle given by
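\[ \mathcal{L} = - \frac{\hbar^2}{2 m} \nabla \psi^* \cdot \nabla \psi - \frac{\hbar}{2 i} \left( \psi^* \frac{\partial \psi}{\partial t} - \frac{\partial \psi^*}{\partial t} \psi \right) - \psi^* V(x) \psi \tag{292} \]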
(ii) Find the forms of the energy and momentum density of the field.
(iii) Find the forms of the generalized orbital angular momentum density of the
field.
(iv) Consider the case where V (x) ≡ 0. Using the form of the general solution,
find expressions for the total energy and momentum of the field in terms of the
Fourier components of the field.
∂µ F µ,ν = 0 (296)
lowering both indices with the metric tensor. The contravariant field tensor can
be expressed as the matrix
and the covariant field tensor can be expressed as the matrix
in which the signs of the terms with mixed time and space indices have changed.
Therefore, the Lagrangian density can be expressed in terms of the electromag-
netic fields as
\[ \mathcal{L} = \frac{1}{8 \pi} ( \mathbf{E}^2 - \mathbf{B}^2 ) \tag{299} \]
Since the Lagrangian density is completely expressed in terms of the electro-
magnetic field, it is gauge invariant.
Since charge is conserved, the current density must satisfy the continuity equa-
tion
∂ µ jµ = 0 (305)
The continuity condition can be used to express the interaction as the untrans-
formed Lagrangian density and a perfect derivative
\[ \mathcal{L}_{int}' = - \frac{1}{c} A^\mu j_\mu - \frac{1}{c} \partial^\mu ( \Lambda j_\mu ) \tag{306} \]
The perfect derivative term only adds a constant term to the action which does
not affect the equations of motion9 . Hence, although the Lagrangian density
is not gauge invariant in the presence of sources, the Lagrangian equations of
motion are gauge invariant.
mation should be taken as a warning against considering quantities in a field theory as being
localized.
which can be combined with the term
\[ - \frac{1}{8 \pi} ( \mathbf{E}^2 - \mathbf{B}^2 ) \tag{311} \]
originating from the Lagrangian density. This combination results in the term
\[ \frac{1}{8 \pi} \left( \mathbf{E}^2 + \mathbf{B}^2 \right) \tag{312} \]
which is recognized as the usual expression for the energy density of a free
electromagnetic field. On substituting eqn(309) into the second term in the
third line, one finds
\[ + \frac{1}{4 \pi} ( \nabla A^0 ) \cdot \mathbf{E} \tag{313} \]
which can be expressed as
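\[ \frac{1}{4 \pi} ( \nabla A^0 ) \cdot \mathbf{E} = \frac{1}{4 \pi} \nabla \cdot ( A^0 \mathbf{E} ) - \frac{1}{4 \pi} A^0 ( \nabla \cdot \mathbf{E} ) \tag{314} \]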
This relation has been used in arriving at the fourth line of eqn(308). Since the
divergence of the electric field satisfies Gauss’s law
Hamiltonian is expressed entirely in terms of an interaction between the current
density and the vector potential, which demonstrates that the Hamiltonian is
not invariant under a Lorentz transformation
\[ \mathcal{H}_{int} = - \frac{1}{c} \, \mathbf{j} \cdot \mathbf{A} \tag{320} \]
but is invariant under rotations in space. This situation is to be contrasted
with the interaction term in the Lagrangian which was Lorentz invariant as it
explicitly included an interaction between the scalar potential and the charge
density.
On raising the index µ with the metric tensor, one has the contravariant second-
rank tensor
\[ T^{\nu,\mu} = - \frac{1}{4 \pi} F^{\nu,\rho} \, \partial^\mu A_\rho - g^{\nu,\mu} \mathcal{L} \tag{326} \]
The energy-momentum tensor is not gauge invariant, as it explicitly involves
the fields Aµ . On using the expression for the source-free Lagrangian density
\[ \mathcal{L} = \frac{1}{8 \pi} \left( \mathbf{E}^2 - \mathbf{B}^2 \right) \tag{327} \]
where the relation between the space-like components of the covariant and con-
travariant four-vector Ai = − A(i) has been used. Since the time-like compo-
nent of the field tensor is given by
and^10
\[ \partial^{(i)} A^{(j)} - \partial^{(j)} A^{(i)} = - \sum_k \xi^{i,j,k} B^{(k)} \tag{331} \]
∇.E = 0 (333)
10 Since the vector relationship B = ∇ ∧ A involves the covariant derivative, there is a
and by adding a term proportional to A(j) ( ∇ . E ) to the expression for T 0,j
in eqn(332), one arrives at the result
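\[ T^{0,j} = \frac{1}{4 \pi} \left( \mathbf{E} \wedge \mathbf{B} \right)^{(j)} + \frac{1}{4 \pi} \nabla \cdot \left( A^{(j)} \mathbf{E} \right) \tag{334} \]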
The components T 0,ν , apart from the terms involving total derivatives which
integrate out to zero, are related to the total energy and the components of the
total momentum of the electromagnetic field. The components of T µ,ν satisfy
the continuity equations
∂µ T µ,ν = 0 (335)
which represent the conservation of energy and momentum. The other mixed
time and spatial components of the energy-momentum tensor are evaluated as
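\[ T^{j,0} = \frac{1}{4 \pi} \left( \mathbf{E} \wedge \mathbf{B} \right)^{(j)} + \frac{1}{4 \pi} \left[ \left( \nabla \wedge \phi \mathbf{B} \right)^{(j)} - \frac{1}{c} \frac{\partial}{\partial t} \left( \phi E^{(j)} \right) \right] \tag{336} \]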
The components T j,0 represent the components of the energy flux.
where Λρ;µ,ν is an arbitrary tensor that is antisymmetric under the interchange of the first
pair of indices
Λρ;µ,ν = − Λµ;ρ,ν
will automatically satisfy the same continuity conditions as T µ,ν and leave the total energy
and momentum unaltered.
The first two terms are symmetric and are gauge invariant. These two terms
will form the basis for Θµ,ν , which will be expressed as
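\[ \Theta^{\mu,\nu} = \frac{1}{4 \pi} \left[ g^{\mu,\rho} F_{\rho,\lambda} F^{\lambda,\nu} + \frac{1}{4} g^{\mu,\nu} F_{\rho,\lambda} F^{\rho,\lambda} \right] \tag{340} \]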
The expression Θµ,ν is symmetric under the interchange of µ and ν, as can be
seen by writing
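\[ \begin{aligned} \Theta^{\mu,\nu} &= \frac{1}{4 \pi} \left[ F^\mu{}_\lambda F^{\lambda,\nu} + \frac{1}{4} g^{\mu,\nu} F_{\rho,\lambda} F^{\rho,\lambda} \right] \\ &= \frac{1}{4 \pi} \left[ F^{\mu,\lambda} F_\lambda{}^\nu + \frac{1}{4} g^{\mu,\nu} F_{\rho,\lambda} F^{\rho,\lambda} \right] \end{aligned} \tag{341} \]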
If Θµ,ν and T µ,ν are to represent the same set of conserved quantities, the last
term in eqn(339) must be expressible as a total derivative. That this is true can
be seen by examining the asymmetric term
\[ - \frac{1}{4 \pi} g^{\mu,\rho} F_{\rho,\lambda} ( \partial^\lambda A^\nu ) = - \frac{1}{4 \pi} F^{\mu,\lambda} ( \partial_\lambda A^\nu ) \tag{342} \]
where the index ρ was raised by using the metric tensor. On combining the
above expression with the source free Maxwell equation
(∂λ F µ,λ ) = 0 (343)
one obtains
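\[ \begin{aligned} - \frac{1}{4 \pi} g^{\mu,\rho} F_{\rho,\lambda} ( \partial^\lambda A^\nu ) &= - \frac{1}{4 \pi} \left[ F^{\mu,\lambda} ( \partial_\lambda A^\nu ) + A^\nu ( \partial_\lambda F^{\mu,\lambda} ) \right] \\ &= - \frac{1}{4 \pi} \partial_\lambda \left( F^{\mu,\lambda} A^\nu \right) \end{aligned} \tag{344} \]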
which is a total derivative. Furthermore, this term does not alter the conserva-
tion laws since their difference involves the double derivative
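\[ \partial_\mu ( \Theta^{\mu,\nu} - T^{\mu,\nu} ) = - \frac{1}{4 \pi} \partial_\mu \partial_\lambda \left( F^{\lambda,\mu} A^\nu \right) \tag{345} \]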
and F λ,µ is antisymmetric. On interchanging the order of the derivatives in the
right hand side, switching the summation labels, and using the antisymmetric
property of F λ,µ , one has
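\[ \begin{aligned} \partial_\mu ( \Theta^{\mu,\nu} - T^{\mu,\nu} ) &= - \frac{1}{4 \pi} \partial_\mu \partial_\lambda \left( F^{\lambda,\mu} A^\nu \right) \\ &= - \frac{1}{4 \pi} \partial_\lambda \partial_\mu \left( F^{\lambda,\mu} A^\nu \right) \\ &= - \frac{1}{4 \pi} \partial_\mu \partial_\lambda \left( F^{\mu,\lambda} A^\nu \right) \\ &= + \frac{1}{4 \pi} \partial_\mu \partial_\lambda \left( F^{\lambda,\mu} A^\nu \right) \end{aligned} \tag{346} \]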
On comparing the right hand sides of the first and last lines, one finds that they have opposite signs and, therefore, must vanish. Thus, the difference between the continuity relations vanishes
Noether’s theorem is purely classical, but there are generalizations for quantum fields. Quantum generalizations include the Ward-Takahashi and Taylor-Slavnov identities.
Exercise:
Exercise:
Show that in the presence of sources, the symmetric energy-momentum ten-
sor has components with the form
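\[ \begin{aligned} \Theta^{0,0} &= \frac{1}{8 \pi} \left( \mathbf{E}^2 + \mathbf{B}^2 \right) - \frac{1}{c} \, \mathbf{j} \cdot \mathbf{A} \\ \Theta^{0,j} &= \frac{1}{4 \pi} \left( \mathbf{E} \wedge \mathbf{B} \right)^{(j)} - \rho A^{(j)} \end{aligned} \tag{352} \]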
Verify the form of the conservation laws for energy and momentum.
Exercise:
Show that the extra term included in the tensor $\Theta^{i,j}$ produces a contribution to the angular momentum density of the form
\[ S^{0,j} = \frac{1}{4 \pi} \left( \mathbf{E} \wedge \mathbf{A} \right)^{(j)} \tag{353} \]
pµ pµ = m2 c2 (354)
where h̄ no longer drops out. This equation can be derived from the Lagrangian
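\[ \mathcal{L} = - \frac{1}{16 \pi} F^{\mu,\nu} F_{\mu,\nu} + \frac{1}{8 \pi} \left( \frac{m c}{\hbar} \right)^2 A^\mu A_\mu - \frac{1}{c} j^\mu A_\mu \tag{357} \]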
12 A. Proca, J. Phys. et Radium 7, 147 (1936).
For example, on varying Aµ , one obtains the equation of motion
\[ \partial_\nu F^{\nu,\mu} + \left( \frac{m c}{\hbar} \right)^2 A^\mu = \frac{4 \pi}{c} j^\mu \tag{358} \]
Neither the Lagrangian, nor the equation of motion are gauge invariant. The ap-
propriate gauge condition can be enforced by imposing conservation of charge13
∂µ j µ = 0 (359)
F ν,µ = ∂ ν Aµ − ∂ µ Aν (361)
one finds
∂ν F ν,µ = ∂ν ∂ ν Aµ − ∂ µ ∂ν Aν (362)
therefore
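\[ \begin{aligned} \partial_\mu \partial_\nu F^{\nu,\mu} &= \partial_\nu \partial^\nu \partial_\mu A^\mu - \partial_\mu \partial^\mu \partial_\nu A^\nu \\ &= 0 \end{aligned} \tag{363} \]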
The term on the right-hand side of eqn(360) also vanishes, because it was chosen
to impose charge conservation. Hence, one finds that Aµ for a massive spin-one
particle must satisfy the Lorenz gauge condition
∂µ Aµ = 0 (364)
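The vanishing of the double divergence in eqn(363) can be checked for a single plane wave. The sketch below assumes $A^\mu = \epsilon^\mu \exp[ - i k . x ]$, so that each derivative $\partial_\nu$ brings down a factor $- i k_\nu$, and it assumes a metric with signature $(+,-,-,-)$. The function names are illustrative and the overall factors of $-i$ are dropped.

```python
import random

METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+, -, -, -)


def minkowski_dot(a, b):
    """Contract two contravariant four-vectors with the metric."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def div_field_tensor(k, eps):
    """Plane-wave amplitude of d_nu F^{nu,mu}, from eqn(362): -(k.k) eps^mu + k^mu (k.eps)."""
    k2 = minkowski_dot(k, k)
    k_eps = minkowski_dot(k, eps)
    return [-k2 * e + km * k_eps for e, km in zip(eps, k)]


def double_divergence(k, eps):
    """k_mu contracted with the amplitude above; proportional to d_mu d_nu F^{nu,mu}."""
    return minkowski_dot(k, div_field_tensor(k, eps))


random.seed(3)
k = [random.uniform(-2, 2) for _ in range(4)]
eps = [random.uniform(-2, 2) for _ in range(4)]
print(double_divergence(k, eps))   # zero up to rounding, for any k and eps
```

Since this contraction vanishes identically, applying $\partial_\mu$ to eqn(358) with a conserved current leaves only the mass term, which forces $k_\mu \epsilon^\mu = 0$ when $m \neq 0$.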
Exercise:
additional assumption.
only short-ranged interactions) has a broken symmetry state then the system
supports a branch of small amplitude excitations with a dispersion relation ωk
that vanishes at k = 0. We shall then examine the situation in which the system
is coupled by long-ranged interactions, as modelled by an electromagnetic field.
As was first pointed out by Anderson, the long-ranged interactions alter the
excitation spectrum of the symmetry broken state by removing the Goldstone
modes and generating a branch of massive excitations.
for any real constant α. The static or minimum energy solution corresponds to
| ψ | = φ0 (367)
which leaves the phase of ψ undetermined. Since the phase of ψ is continuous,
[Figure: the potential v(Ψ) plotted over the plane of Re[Ψ] and Im[Ψ], with φ0 marked.]
then the Lagrangian can be written as a Lagrangian density involving the two
real scalar fields φ1 and φ2 . The Lagrangian density has a U (1) symmetry which
corresponds to the rotation of ψ around a circle about the origin in the (φ1 , φ2 )
plane.
We shall assume the field ψ representing the physical ground state corre-
sponds to only one of the infinite number of possible candidates. The physical
state must have a phase, which shall be defined as zero. That is, one starts with
a ground state ψ = φ0 , and then considers the small amplitude excitations. A
low-energy excited state corresponds to the complex field
ψ = φ0 + δψ (369)
where δψ is static and uniform and can be considered to be very small. The
small amplitude complex field δψ can be expressed in terms of its real and
imaginary parts
δψ = χ1 + i χ2 (370)
The Lagrangian density takes the form
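\[ \mathcal{L} = ( \partial_\mu \chi_1 ) ( \partial^\mu \chi_1 ) + ( \partial_\mu \chi_2 ) ( \partial^\mu \chi_2 ) - \left( \frac{m c}{2 \hbar \phi_0} \right)^2 \left( 2 \phi_0 \chi_1 + \chi_1^2 + \chi_2^2 \right)^2 \tag{371} \]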
If one only considers infinitesimally small amplitude oscillations, one only needs to consider terms quadratic in the fields. The quadratic Lagrangian density $\mathcal{L}_{Free}$ describes non-interacting fields. The quadratic Lagrangian density is given by
\[ \mathcal{L}_{Free} = ( \partial_\mu \chi_1 ) ( \partial^\mu \chi_1 ) - \left( \frac{m c}{\hbar} \right)^2 \chi_1^2 + ( \partial_\mu \chi_2 ) ( \partial^\mu \chi_2 ) \tag{372} \]
The symmetry breaking has resulted in the complex field breaking up into two fields: The first field $\chi_1$ describes massive excitations with mass $m$ and the second field $\chi_2$ describes massless excitations. The first field $\chi_1$ has plane-wave solutions if the energy and momentum are related via the dispersion relation
\[ \omega^2 = c^2 k^2 + \left( \frac{m c^2}{\hbar} \right)^2 \tag{373} \]
and represents excitations which corresponds to a “stretching” of φ0 . It is
massive since this excitation moves the field away from the minimum of the
potential. The second excitation χ2 represents δψ which is transverse to φ0 in
the (φ1 , φ2 ) plane. This last excitation is known as a Goldstone boson14 . The
Goldstone boson has a dispersion relation
ω 2 = c2 k 2 (374)
which vanishes at k = 0. The Goldstone boson dynamically restores the spontaneously broken U(1) symmetry since, at k = 0, it just corresponds to a change of the value of the (static and uniform) broken symmetry field from (φ0, 0) to the new direction (φ0, χ2). Therefore, if infinitely many zero-energy Goldstone bosons are excited in the system, the resulting state should correspond to a new ground state with a different value of the phase. As noted by Anderson^15 prior to Goldstone’s work, the Goldstone theorem breaks down when long-ranged interactions are present. Anderson’s work was subsequently amplified by Peter Higgs and Tom Kibble.
14 J. Goldstone, Il Nuovo Cimento, 19, 154 (1961).
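The difference between the two branches of excitations can be made concrete by evaluating the dispersion relations of eqn(373) and eqn(374). This is only a small sketch; the function names and the default units $c = \hbar = 1$ are assumptions made for the illustration.

```python
import math


def omega_massive(k, m, c=1.0, hbar=1.0):
    """Frequency of the chi_1 branch, eqn(373)."""
    return math.sqrt((c * k) ** 2 + (m * c ** 2 / hbar) ** 2)


def omega_goldstone(k, c=1.0):
    """Frequency of the chi_2 (Goldstone) branch, eqn(374)."""
    return c * abs(k)


for k in (0.0, 0.5, 1.0, 5.0):
    print(k, omega_massive(k, m=2.0), omega_goldstone(k))
```

At k = 0 the massive branch retains the gap $m c^2 / \hbar$, whereas the Goldstone branch costs no energy, and for $c k \gg m c^2 / \hbar$ the two branches approach each other.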
The system has minimum energy when ψ has a constant value with a magnitude
given by
| ψ | = φ0 (379)
and the Aµ vanish. Any local gauge transformation leads to a state with the
same energy, therefore, the ground state is infinitely degenerate.
The small amplitude excitations can be expressed as
ψ = φ0 + δψ (380)
δψ = χ1 (381)
and on substituting in the Lagrangian and collecting the quadratic terms, one
obtains
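\[ \begin{aligned} \mathcal{L}_{Free} &= ( \partial_\mu \chi_1 ) ( \partial^\mu \chi_1 ) - \left( \frac{m c}{\hbar} \right)^2 \chi_1^2 \\ &\quad - \frac{1}{16 \pi} F^{\mu,\nu} F_{\mu,\nu} + \left( \frac{q \phi_0}{\hbar c} \right)^2 A_\mu A^\mu \end{aligned} \tag{382} \]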
Therefore, one finds that the charged boson field has a mass m and the gauge
field has acquired a mass mA given by
\[ m_A^2 = 8 \pi \left( \frac{q \phi_0}{c^2} \right)^2 \tag{383} \]
16 P. W. Higgs, Phys. Rev. Lett. 12, 132 (1964), Phys. Rev. 145, 1156 (1966). T. W.
In this paper Dirac uses two different approaches to quantizing electromagnetism. In one
approach he treated a single photon as satisfying a single-particle Schrödinger equation, that
has a similar form to Maxwell’s equations. The other approach treated the fields as dynamical
variables and then quantized them. Dirac then showed that these two methods produce
equivalent results. By doing this, Dirac created second quantization.
In the absence of sources, the (classical) wave equation for the vector poten-
tial has the form
\[ \left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \mathbf{A} = 0 \tag{384} \]
when the Coulomb gauge condition is imposed
∇.A = 0 (385)
On Fourier transforming the wave equation with respect to space and time, one
finds the equation of motion
\[ \left[ k^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \mathbf{A}(\mathbf{k}, t) = 0 \tag{388} \]
k . A(k, t) = 0 (389)
We shall look for solutions for $\mathbf{A}(\mathbf{k}, t)$ that have a time dependence given by a linear superposition of terms proportional to
\[ \exp[ \mp i \omega_k t ] \tag{390} \]
By substituting the above terms into the wave equation, it is found that linear
superpositions of plane-waves are solutions of Maxwell’s equation but only if
the frequency ωk and wave vector k are related via the dispersion relation
ωk2 = c2 k 2 (391)
The gauge condition also requires that the vector potential is oriented perpendicular to the direction of propagation. Therefore, an arbitrary plane-wave solution can be represented as a linear superposition of two polarized waves with polarizations described by two mutually orthogonal unit vectors denoted by $\hat{\epsilon}_\alpha(\mathbf{k})$. The polarization vectors satisfy
\[ \begin{aligned} \mathbf{k} \cdot \hat{\epsilon}_\alpha(\mathbf{k}) &= 0 \\ \hat{\epsilon}_\alpha(\mathbf{k}) \cdot \hat{\epsilon}_\beta(\mathbf{k}) &= \delta_{\alpha,\beta} \end{aligned} \tag{392} \]
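One possible way of constructing a pair of polarization vectors satisfying eqn(392) for a given wave vector is sketched below. The choice of helper axis and the right-handed orientation (first vector, second vector, direction of propagation) are assumptions of this sketch, since any rotation of the pair about the wave vector would serve equally well.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)


def polarization_vectors(k):
    """Return unit vectors (e1, e2) transverse to k, with e1 x e2 along k."""
    k_hat = normalize(k)
    # Use the Cartesian axis least aligned with k as a helper direction.
    helper = min(((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
                 key=lambda axis: abs(dot(axis, k_hat)))
    e1 = normalize(cross(helper, k_hat))
    e2 = cross(k_hat, e1)   # unit length, since k_hat and e1 are orthonormal
    return e1, e2


k = (1.0, 2.0, -0.5)
e1, e2 = polarization_vectors(k)
print(dot(k, e1), dot(k, e2), dot(e1, e2))   # all zero up to rounding
print(cross(e1, e2), normalize(k))           # the same unit vector
```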
Figure 9: The normal modes of the classical electromagnetic field are plane-
polarized waves, in which E and B are transverse to the direction of propagation
k, and oscillate in phase.
We shall assume that the three vectors $\mathbf{k}$, $\hat{\epsilon}_1(\mathbf{k})$, $\hat{\epsilon}_2(\mathbf{k})$ form a mutually orthogonal
coordinate system. We shall define
The algebraic equations for A(k) can be solved trivially. One can express the
vector potential as a linear superposition
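\[ \mathbf{A}(\mathbf{r}, t) = \frac{1}{\sqrt{V}} \sum_{\mathbf{k},\alpha} \hat{\epsilon}_\alpha(\mathbf{k}) \exp[ - i \mathbf{k} \cdot \mathbf{r} ] \, \Phi_\alpha(\mathbf{k}, t) \tag{394} \]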
9.1 The Lagrangian and Hamiltonian Density
The Lagrangian density L for the electromagnetic field can be expressed as
\[ \mathcal{L} = \frac{1}{8 \pi} \left( \mathbf{E}^2 - \mathbf{B}^2 \right) \tag{397} \]
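in the Coulomb gauge, the electromagnetic field is given by
\[ \begin{aligned} \mathbf{E} &= - \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} \\ \mathbf{B} &= \nabla \wedge \mathbf{A} \end{aligned} \tag{398} \]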
On substituting A(r, t) in the form of eqn(394) and integrating over r and using
the identity
\[ \frac{1}{V} \int d^3r \, \exp[ i ( \mathbf{k} + \mathbf{k}' ) \cdot \mathbf{r} ] = \delta_{\mathbf{k}+\mathbf{k}'} \tag{401} \]
one finds the Lagrangian is given by
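\[ \begin{aligned} L &= \frac{1}{8 \pi} \sum_{\mathbf{k},\mathbf{k}'} \sum_{\alpha,\beta} \delta_{\mathbf{k}+\mathbf{k}'} \left[ \hat{\epsilon}_\alpha(\mathbf{k}) \cdot \hat{\epsilon}_\beta(\mathbf{k}') \, \frac{1}{c^2} \frac{\partial \Phi_\alpha(\mathbf{k})}{\partial t} \frac{\partial \Phi_\beta(\mathbf{k}')}{\partial t} \right. \\ &\qquad \left. + \, ( \mathbf{k} \wedge \hat{\epsilon}_\alpha(\mathbf{k}) ) \cdot ( \mathbf{k}' \wedge \hat{\epsilon}_\beta(\mathbf{k}') ) \, \Phi_\alpha(\mathbf{k}) \Phi_\beta(\mathbf{k}') \right] \\ &= \frac{1}{8 \pi} \sum_{\mathbf{k},\alpha} \left[ \frac{1}{c^2} \frac{\partial \Phi_\alpha(\mathbf{k})}{\partial t} \frac{\partial \Phi^*_\alpha(\mathbf{k})}{\partial t} - k^2 \, \Phi^*_\alpha(\mathbf{k}) \Phi_\alpha(\mathbf{k}) \right] \end{aligned} \tag{402} \]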
(403)
Figure 10: A possible partition of k-space, which does not contain both k and
its inverse −k.
where the prime over the summation denotes the restriction of k to values
in the “positive” half volume of k-space. Since there are half the number of
independent normal modes, their contributions are twice as big. The Lagrangian
is a function of the six generalized variables Φα (k) and Φ∗α (k) for the independent
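k values. The generalized momenta are found as
\[ \begin{aligned} \Pi_\alpha(\mathbf{k}) &= \frac{2}{8 \pi c^2} \frac{\partial \Phi^*_\alpha(\mathbf{k})}{\partial t} \\ \Pi^*_\alpha(\mathbf{k}) &= \frac{2}{8 \pi c^2} \frac{\partial \Phi_\alpha(\mathbf{k})}{\partial t} \end{aligned} \tag{404} \]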
The Lagrangian equations of motion of the field are given by
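\[ \frac{\partial}{\partial t} \left( \frac{1}{8 \pi c^2} \frac{\partial \Phi_\alpha(\mathbf{k})}{\partial t} \right) = - \frac{k^2}{8 \pi} \Phi_\alpha(\mathbf{k}) \tag{405} \]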
or
\[ \frac{\partial^2 \Phi_\alpha(\mathbf{k})}{\partial t^2} = - \omega_k^2 \, \Phi_\alpha(\mathbf{k}) \tag{406} \]
where ωk = c k. Thus, the classical field Φα (k) has a time-dependent ampli-
tude which resembles that of a harmonic oscillator with frequency ωk = c k.
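The Hamiltonian can be obtained from the Lagrangian via the Legendre transformation
\[ H = {\sum_{\mathbf{k},\alpha}}' \left( \Pi^*_\alpha(\mathbf{k}) \frac{\partial \Phi^*_\alpha(\mathbf{k})}{\partial t} + \Pi_\alpha(\mathbf{k}) \frac{\partial \Phi_\alpha(\mathbf{k})}{\partial t} \right) - L \tag{407} \]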
where the summation over (k, α) runs over the independent normal modes.
Hence, the k summation only runs over the set of points in k space which are
not related via the inversion operator. The Hamiltonian is related to the energy
of the electromagnetic field, as shall be seen below.
in which the summation over k is unrestricted. Thus, the above expression for
the energy is identical to the Hamiltonian for the electromagnetic field. Fur-
thermore, the Hamiltonian has been expressed in terms of a set of the normal
modes labeled by (k, α).
by the operators
Φ̂α (k) , Π̂α (k) (414)
and their complex conjugates are replaced by the Hermitean conjugate opera-
tors. The canonically conjugate coordinates and momenta operators satisfy the
commutation relations
The quantized Hamiltonian for the electromagnetic field is given by
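\[ \hat{H} = \sum_{\mathbf{k},\alpha} \left[ \frac{8 \pi c^2}{4} \hat{\Pi}_\alpha(\mathbf{k}) \hat{\Pi}^\dagger_\alpha(\mathbf{k}) + \frac{1}{8 \pi} k^2 \, \hat{\Phi}^\dagger_\alpha(\mathbf{k}) \hat{\Phi}_\alpha(\mathbf{k}) \right] \tag{416} \]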
known as creation operators. The commutation relations for the creation and
annihilation operators can be obtained directly from the commutation relations
of the field operators Φ̂α (k) and Π̂α (k) which are shown in eqn(415). It can
be shown that the creation and annihilation operators satisfy the commutation
relations
The field operators can be expressed in terms of the creation and annihilation
operators. Starting with
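\[ \hat{a}_{\mathbf{k},\alpha} = \frac{1}{\sqrt{2}} \left[ i \sqrt{\frac{8 \pi c^2}{2 \hbar \omega_k}} \, \hat{\Pi}_\alpha(\mathbf{k}) + \sqrt{\frac{2 k^2}{8 \pi \hbar \omega_k}} \, \hat{\Phi}^\dagger_\alpha(\mathbf{k}) \right] \tag{420} \]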
transforming k → −k and then by noting that Π̂α (−k) = Π̂†α (k) and Φ̂†α (−k) =
Φ̂α (k), one finds
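\[ \hat{a}_{-\mathbf{k},\alpha} = \frac{1}{\sqrt{2}} \left[ i \sqrt{\frac{8 \pi c^2}{2 \hbar \omega_k}} \, \hat{\Pi}^\dagger_\alpha(\mathbf{k}) + \sqrt{\frac{2 k^2}{8 \pi \hbar \omega_k}} \, \hat{\Phi}_\alpha(\mathbf{k}) \right] \tag{421} \]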
One can eliminate Π̂†α (k) by adding the expression for the creation operator
given by eqn(418) and the expression for the annihilation operator with mo-
mentum −k given by eqn(421). This process yields the expression for the field
component operators Φ̂α (k) in the form
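\[ \hat{\Phi}_\alpha(\mathbf{k}) = \sqrt{\frac{2 \pi \hbar \omega_k}{k^2}} \left( \hat{a}^\dagger_{\mathbf{k},\alpha} + \hat{a}_{-\mathbf{k},\alpha} \right) \tag{422} \]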
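and, by an analogous procedure, the Hermitean conjugate operator is found to be given by
\[ \hat{\Phi}^\dagger_\alpha(\mathbf{k}) = \sqrt{\frac{2 \pi \hbar \omega_k}{k^2}} \left( \hat{a}_{\mathbf{k},\alpha} + \hat{a}^\dagger_{-\mathbf{k},\alpha} \right) \tag{423} \]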
which is identical to Φ̂α (−k). Likewise, the canonically conjugate momenta
operators are given by
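\[ \hat{\Pi}_\alpha(\mathbf{k}) = i \sqrt{\frac{\hbar \omega_k}{8 \pi c^2}} \left( \hat{a}^\dagger_{-\mathbf{k},\alpha} - \hat{a}_{\mathbf{k},\alpha} \right) \tag{424} \]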
as was anticipated.
(427)
If one sets k → −k in the second set of terms, then one finds the Hamiltonian
becomes the sum over independent harmonic oscillators for each k value and
polarization
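\[ \hat{H} = \sum_{\mathbf{k},\alpha} \frac{\hbar \omega_k}{2} \left( \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\alpha} + \hat{a}_{\mathbf{k},\alpha} \hat{a}^\dagger_{\mathbf{k},\alpha} \right) \tag{428} \]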
and has integer eigenvalues denoted by nk,α . Hence, the energy eigenvalues E
are given by
\[ E = \sum_{\mathbf{k},\alpha} \hbar \omega_k \left( n_{\mathbf{k},\alpha} + \frac{1}{2} \right) \tag{430} \]
It should be noted that the contributions to the total energy from the zero-point energy terms $\hbar \omega_k / 2$ diverge. However, in most circumstances, only the excitation energy of the field is measurable, hence the divergence is mainly irrelevant. The zero-point energy does have physical consequences, and can be observed if the volume or boundary conditions of the field are changed. The change in the zero-point energy of the field due to a change in volume or boundary conditions is known as the Casimir effect^18.
(434)
The above equation was obtained by noting that, in the basis composed of
eigenstates of the number operators |nk,α >, one has
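\[ \begin{aligned} \hat{a}_{\mathbf{k},\alpha}(t) \, | n_{\mathbf{k},\alpha} > &= \exp[ + i \omega_k t ( \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\alpha} + 1/2 ) ] \; \hat{a}_{\mathbf{k},\alpha}(0) \, | n_{\mathbf{k},\alpha} > \exp[ - i \omega_k t ( n_{\mathbf{k},\alpha} + 1/2 ) ] \\ &= \exp[ + i \omega_k t ( n_{\mathbf{k},\alpha} - 1/2 ) ] \, \sqrt{n_{\mathbf{k},\alpha}} \; | n_{\mathbf{k},\alpha} - 1 > \exp[ - i \omega_k t ( n_{\mathbf{k},\alpha} + 1/2 ) ] \\ &= \exp[ - i \omega_k t ] \; \hat{a}_{\mathbf{k},\alpha} \, | n_{\mathbf{k},\alpha} > \end{aligned} \tag{435} \]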
and that the time-dependent creation operator is given by the Hermitean con-
jugate expression. Thus, the explicit form of time dependence of the vector
potential is a consequence of the explicit time dependence of the creation and
annihilation operators in the Heisenberg representation. Alternatively, one can
find the time dependence of the creation and annihilation operators directly
from the Heisenberg equations of motion without invoking a privileged set of
basis states. The equation of motion for the creation operator is given by
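\[ i \hbar \frac{\partial \hat{a}^\dagger_{\mathbf{k},\alpha}}{\partial t} = [ \hat{a}^\dagger_{\mathbf{k},\alpha} , \hat{H} ] \tag{436} \]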
and the commutator is evaluated as
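\[ [ \hat{a}^\dagger_{\mathbf{k},\alpha} , \hat{a}^\dagger_{\mathbf{k}',\beta} \hat{a}_{\mathbf{k}',\beta} ] = - \hat{a}^\dagger_{\mathbf{k},\alpha} \, \delta_{\alpha,\beta} \, \delta_{\mathbf{k},\mathbf{k}'} \tag{437} \]
\[ i \hbar \frac{\partial \hat{a}^\dagger_{\mathbf{k},\alpha}}{\partial t} = - \hbar \omega_k \, \hat{a}^\dagger_{\mathbf{k},\alpha} \tag{438} \]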
Therefore, one finds the result
\[ \hat{a}^\dagger_{\mathbf{k},\alpha}(t) = \hat{a}^\dagger_{\mathbf{k},\alpha} \exp[ i \omega_k t ] \tag{439} \]
which is just the Hermitean conjugate of the $\hat{a}_{\mathbf{k},\alpha}(t)$ that was found previously. Therefore, the time-dependence of the vector potential is entirely due to the time-dependence of the Heisenberg representation of the creation and annihilation operators.
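The commutator of eqn(437), on which the time dependence of eqn(439) rests, can be checked for a single mode with finite matrices. The sketch below represents the creation operator in a number basis truncated to a few states; the truncation size and the helper names are assumptions of the illustration.

```python
import math


def creation_matrix(dim):
    """Matrix of a^dagger in the truncated number basis |0>, ..., |dim-1>."""
    m = [[0.0] * dim for _ in range(dim)]
    for n in range(dim - 1):
        m[n + 1][n] = math.sqrt(n + 1)   # a^dagger |n> = sqrt(n+1) |n+1>
    return m


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


a_dag = creation_matrix(6)
a = transpose(a_dag)               # real matrix: Hermitean conjugate = transpose
number = matmul(a_dag, a)          # diagonal, with entries 0, 1, ..., 5
comm = commutator(a_dag, number)   # should reproduce -a^dagger
for row in comm:
    print([round(x, 12) for x in row])
```

The product of creation and annihilation matrices in this order is exact in the truncated basis, so the commutator equals minus the creation matrix without truncation error. Inserting it into eqn(436) with the Hamiltonian of eqn(428) gives eqn(438).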
9.2.3 The Momentum of the Field
The total momentum operator for the electromagnetic field is given by the
integral over all space of the Poynting vector
\[ \hat{\mathbf{P}} = \frac{1}{4 \pi c} \int d^3r \; \hat{\mathbf{E}} \wedge \hat{\mathbf{B}} \tag{444} \]
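This will be evaluated by expressing the $\hat{\mathbf{E}}$ and $\hat{\mathbf{B}}$ field operators in terms of the vector potential operator $\hat{\mathbf{A}}$ via
\[ \begin{aligned} \hat{\mathbf{E}} &= - \frac{1}{c} \frac{\partial \hat{\mathbf{A}}}{\partial t} \\ \hat{\mathbf{B}} &= \nabla \wedge \hat{\mathbf{A}} \end{aligned} \tag{445} \]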
The vector potential operator can be written in terms of the creation and anni-
hilation operators for the normal modes as
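\[ \hat{\mathbf{A}}(\mathbf{r}, t) = \sum_{\mathbf{k},\alpha} \sqrt{\frac{2 \pi \hbar c^2}{\omega_k V}} \; \hat{\epsilon}_\alpha(\mathbf{k}) \left( \hat{a}^\dagger_{\mathbf{k},\alpha}(t) + \hat{a}_{-\mathbf{k},\alpha}(t) \right) \exp[ - i \mathbf{k} \cdot \mathbf{r} ] \tag{446} \]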
and
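\[ \hat{\mathbf{B}}(\mathbf{r}) = - i \sum_{\mathbf{k},\alpha} \sqrt{\frac{2 \pi \hbar c^2}{\omega_k V}} \; ( \mathbf{k} \wedge \hat{\epsilon}_\alpha(\mathbf{k}) ) \left( \hat{a}^\dagger_{\mathbf{k},\alpha} + \hat{a}_{-\mathbf{k},\alpha} \right) \exp[ - i \mathbf{k} \cdot \mathbf{r} ] \tag{448} \]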
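For a fixed $\mathbf{k}$, the polarization vectors $\hat{\epsilon}_\alpha(\mathbf{k})$ and $\mathbf{k}$ are mutually orthogonal. Therefore, one has
\[ \begin{aligned} \hat{\epsilon}_\alpha(\mathbf{k}) \wedge ( \mathbf{k} \wedge \hat{\epsilon}_\beta(\mathbf{k}) ) &= \mathbf{k} \, ( \hat{\epsilon}_\alpha(\mathbf{k}) \cdot \hat{\epsilon}_\beta(\mathbf{k}) ) - \hat{\epsilon}_\beta(\mathbf{k}) \, ( \mathbf{k} \cdot \hat{\epsilon}_\alpha(\mathbf{k}) ) \\ &= \mathbf{k} \, \delta_{\alpha,\beta} \end{aligned} \tag{449} \]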
Hence, the total momentum of the electromagnetic field is determined from
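\[ \begin{aligned} \hat{\mathbf{P}} &= \frac{\hbar}{2} \sum_{\mathbf{k},\alpha} \hat{\epsilon}_\alpha(\mathbf{k}) \wedge ( \mathbf{k} \wedge \hat{\epsilon}_\alpha(\mathbf{k}) ) \left( \hat{a}^\dagger_{\mathbf{k},\alpha} - \hat{a}_{-\mathbf{k},\alpha} \right) \left( \hat{a}^\dagger_{-\mathbf{k},\alpha} + \hat{a}_{\mathbf{k},\alpha} \right) \\ &= \frac{\hbar}{2} \sum_{\mathbf{k},\alpha} \mathbf{k} \left( \hat{a}^\dagger_{\mathbf{k},\alpha} - \hat{a}_{-\mathbf{k},\alpha} \right) \left( \hat{a}^\dagger_{-\mathbf{k},\alpha} + \hat{a}_{\mathbf{k},\alpha} \right) \end{aligned} \tag{450} \]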
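It should be noted that the momentum from each normal mode of the field is parallel to the direction of propagation. Since the creation operators commute
\[ \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}^\dagger_{-\mathbf{k},\alpha} = \hat{a}^\dagger_{-\mathbf{k},\alpha} \hat{a}^\dagger_{\mathbf{k},\alpha} \tag{451} \]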
and that the annihilation operators also commute
one finds that the part of the momentum represented by the summation over k
given by
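\[ \sum_{\mathbf{k},\alpha} \hbar \mathbf{k} \left( \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}^\dagger_{-\mathbf{k},\alpha} - \hat{a}_{-\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\alpha} \right) = 0 \tag{453} \]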
vanishes since the summand is odd under inversion symmetry. Thus, the mo-
mentum of the electromagnetic field is given by
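\[ \begin{aligned} \hat{\mathbf{P}} &= \frac{\hbar}{2} \sum_{\mathbf{k},\alpha} \mathbf{k} \left( \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\alpha} - \hat{a}_{-\mathbf{k},\alpha} \hat{a}^\dagger_{-\mathbf{k},\alpha} \right) \\ &= \frac{1}{2} \sum_{\mathbf{k},\alpha} \left( \hbar \mathbf{k} \, \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\alpha} - \hbar \mathbf{k} \, \hat{a}^\dagger_{-\mathbf{k},\alpha} \hat{a}_{-\mathbf{k},\alpha} - \hbar \mathbf{k} \right) \end{aligned} \tag{454} \]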
where the commutation relations for the creation and annihilation operators
were used to obtain the last line. The last term vanishes when summed over k,
due to inversion symmetry. Hence, the momentum of the field is given by the
operator
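\[ \hat{\mathbf{P}} = \frac{1}{2} \sum_{\mathbf{k},\alpha} \left( \hbar \mathbf{k} \, \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\alpha} - \hbar \mathbf{k} \, \hat{a}^\dagger_{-\mathbf{k},\alpha} \hat{a}_{-\mathbf{k},\alpha} \right) \tag{455} \]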
Finally, on transforming −k to k in the last term of the summand, one finds the
total momentum of the field is carried by the excitations since
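\[ \hat{\mathbf{P}} = \sum_{\mathbf{k},\alpha} \hbar \mathbf{k} \; \hat{a}^\dagger_{\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\alpha} \tag{456} \]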
9.2.4 The Angular Momentum of the Field
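The total angular momentum operator of the electromagnetic field $\hat{\mathbf{J}}_{EM}$ is given by
\[ \hat{\mathbf{J}}_{EM} = \frac{1}{4 \pi c} \int d^3r \; \mathbf{r} \wedge ( \hat{\mathbf{E}} \wedge \hat{\mathbf{B}} ) \tag{458} \]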
The i-th component is given by
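\[ \begin{aligned} \hat{J}^{(i)}_{EM} &= \frac{1}{4 \pi c} \int d^3r \; \xi^{i,j,k} x^{(j)} ( \hat{\mathbf{E}} \wedge \hat{\mathbf{B}} )^{(k)} \\ &= \frac{1}{4 \pi c} \int d^3r \; \xi^{i,j,k} x^{(j)} \xi^{k,l,m} \hat{E}^{(l)} \hat{B}^{(m)} \\ &= \frac{1}{4 \pi c} \int d^3r \; \xi^{i,j,k} x^{(j)} \xi^{k,l,m} \hat{E}^{(l)} \xi^{m,n,p} \frac{\partial \hat{A}^{(p)}}{\partial x^{(n)}} \end{aligned} \tag{459} \]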
However, due to the identity
\[ \xi^{k,l,m} \xi^{m,n,p} = \delta^{k,n} \delta^{l,p} - \delta^{k,p} \delta^{l,n} \tag{460} \]
one finds
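\[ \hat{J}^{(i)}_{EM} = \frac{1}{4 \pi c} \int d^3r \; \xi^{i,j,k} \left( x^{(j)} \hat{E}^{(l)} \frac{\partial \hat{A}^{(l)}}{\partial x^{(k)}} - x^{(j)} \hat{E}^{(l)} \frac{\partial \hat{A}^{(k)}}{\partial x^{(l)}} \right) \tag{461} \]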
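The contraction identity of eqn(460) can be confirmed by brute force over all index values. The sketch below uses indices 0, 1, 2 in place of 1, 2, 3, and a compact closed form for the antisymmetric symbol; both choices are conveniences of this illustration.

```python
def levi_civita(i, j, k):
    """Antisymmetric symbol xi^{i,j,k} for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    return 1 if i == j else 0


def identity_holds():
    """Check sum_m xi^{k,l,m} xi^{m,n,p} = delta^{k,n} delta^{l,p} - delta^{k,p} delta^{l,n}."""
    r = range(3)
    for k in r:
        for l in r:
            for n in r:
                for p in r:
                    lhs = sum(levi_civita(k, l, m) * levi_civita(m, n, p) for m in r)
                    rhs = delta(k, n) * delta(l, p) - delta(k, p) * delta(l, n)
                    if lhs != rhs:
                        return False
    return True


print(identity_holds())
```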
On integrating by parts in the last term, one has
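\[ \hat{J}^{(i)}_{EM} = \frac{1}{4 \pi c} \int d^3r \; \xi^{i,j,k} \left( x^{(j)} \hat{E}^{(l)} \frac{\partial \hat{A}^{(l)}}{\partial x^{(k)}} + \left[ \frac{\partial}{\partial x^{(l)}} \left( x^{(j)} \hat{E}^{(l)} \right) \right] \hat{A}^{(k)} \right) \tag{462} \]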
The divergence of the electric field vanishes,
\[ \frac{\partial \hat{E}^{(l)}}{\partial x^{(l)}} = 0 \tag{463} \]
and since
\[ \frac{\partial x^{(j)}}{\partial x^{(l)}} = \delta^{j,l} \tag{464} \]
the total angular momentum can be re-written as
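\[ \hat{J}^{(i)}_{EM} = \frac{1}{4 \pi c} \int d^3r \; \xi^{i,j,k} \left( \hat{E}^{(l)} x^{(j)} \frac{\partial \hat{A}^{(l)}}{\partial x^{(k)}} + \hat{E}^{(j)} \hat{A}^{(k)} \right) \tag{465} \]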
The first term can be recognized as the orbital angular momentum of the field.
The orbital angular momentum operator L̂(i) is given by
\[ \hat{L}^{(i)} = - i \hbar \, \xi^{i,j,k} x^{(j)} \frac{\partial}{\partial x^{(k)}} \tag{466} \]
so the total angular momentum of the field is given by
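\[ \begin{aligned} \hat{J}^{(i)}_{EM} &= \frac{i}{4 \pi \hbar c} \int d^3r \left( \hat{E}^{(l)} \hat{L}^{(i)} \hat{A}^{(l)} - i \hbar \, \xi^{i,j,k} \hat{E}^{(j)} \hat{A}^{(k)} \right) \\ &= \frac{i}{4 \pi \hbar c} \int d^3r \left( \hat{E}^{(l)} \hat{L}^{(i)} \hat{A}^{(l)} + \hat{E}^{(j)} ( \hat{S}^{(i)} )^{j,k} \hat{A}^{(k)} \right) \end{aligned} \tag{467} \]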
which shows that the orbital angular momentum is diagonal with respect to
the field components and the spin angular momentum mixes the different field
components.
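The total spin component of the angular momentum operator for the electromagnetic field is given by
$$
\begin{aligned}
\hat{S}^{(i)}_{EM} &= \frac{i}{4 \pi \hbar c} \int d^3r \; \hat{E}^{(j)} ( \hat{S}^{(i)} )^{j,k} \hat{A}^{(k)} \\
&= \frac{1}{4 \pi c} \int d^3r \; \xi^{i,j,k} \hat{E}^{(j)} \hat{A}^{(k)} \\
&= \frac{1}{4 \pi c} \int d^3r \left( \hat{\mathbf{E}} \wedge \hat{\mathbf{A}} \right)^{(i)}
\end{aligned}
\tag{470}
$$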
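This can be expressed in terms of the photon creation and annihilation operators
as
$$
\hat{S}^{(i)}_{EM} = - i \frac{\hbar}{2} \sum_{\mathbf{k},\alpha,\beta} \left( \hat{\epsilon}^{(j)}_{\beta}(\mathbf{k}) \, \xi^{i,j,k} \, \hat{\epsilon}^{(k)}_{\alpha}(\mathbf{k}) \right) \left( \hat{a}^{\dagger}_{-\mathbf{k},\beta} - \hat{a}_{\mathbf{k},\beta} \right) \left( \hat{a}^{\dagger}_{\mathbf{k},\alpha} + \hat{a}_{-\mathbf{k},\alpha} \right)
\tag{471}
$$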
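The first term in parentheses is recognized as the i-th component of the vector
product
$$
\hat{\epsilon}_{\beta}(\mathbf{k}) \wedge \hat{\epsilon}_{\alpha}(\mathbf{k})
\tag{472}
$$
and, therefore, it is antisymmetric in the polarization indices α and β and the
non-zero contributions are restricted to the case α ≠ β. Since the creation
and annihilation operators corresponding to different polarizations commute,
the product of the two remaining parentheses can be re-arranged as the sum of
two terms
$$
\begin{aligned}
\hat{S}^{(i)}_{EM} = - i \frac{\hbar}{2} \sum_{\mathbf{k},\alpha,\beta} \left( \hat{\epsilon}_{\beta}(\mathbf{k}) \wedge \hat{\epsilon}_{\alpha}(\mathbf{k}) \right)^{(i)}
& \left[ \left( \hat{a}^{\dagger}_{-\mathbf{k},\beta} \hat{a}^{\dagger}_{\mathbf{k},\alpha} - \hat{a}_{\mathbf{k},\beta} \hat{a}_{-\mathbf{k},\alpha} \right) \right. \\
& \left. + \left( \hat{a}^{\dagger}_{-\mathbf{k},\beta} \hat{a}_{-\mathbf{k},\alpha} - \hat{a}^{\dagger}_{\mathbf{k},\alpha} \hat{a}_{\mathbf{k},\beta} \right) \right]
\end{aligned}
\tag{473}
$$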
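On defining the sense of the polarization vectors relative to $\hat{\mathbf{k}}$ ($\equiv \hat{e}_3(\mathbf{k})$), the unit
vector in the direction of propagation, via
$$
\hat{\epsilon}_{1}(\mathbf{k}) \wedge \hat{\epsilon}_{2}(\mathbf{k}) = \hat{\mathbf{k}}
\tag{475}
$$
On setting −k → k in the second part of the summation, the spin of the electromagnetic field is found as
$$
\hat{\mathbf{S}}_{EM} = i \hbar \sum_{\mathbf{k}} \hat{\mathbf{k}} \left( \hat{a}^{\dagger}_{\mathbf{k},2} \hat{a}_{\mathbf{k},1} - \hat{a}^{\dagger}_{\mathbf{k},1} \hat{a}_{\mathbf{k},2} \right)
\tag{477}
$$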
It should be noted that in this expression, the indices (1, 2) refer to directions
in three-dimensional space and do not refer to the z-component of the spin an-
gular momentum. Therefore, the above equation shows that a plane-polarized
photon is not an eigenstate of the single-particle spin operator quantized along
the k-axis20 .
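are given by $\tilde{\Phi}_m(\mathbf{k})$, where
$$
\tilde{\Phi}_{+1}(\mathbf{k}) = - \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix} , \qquad
\tilde{\Phi}_{0}(\mathbf{k}) = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} , \qquad
\tilde{\Phi}_{-1}(\mathbf{k}) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ - i \\ 0 \end{pmatrix}
\tag{479}
$$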
and where the subscript m refers to the eigenvalue of Ŝ (3) , in units of h̄. From
this, it follows that an arbitrary transverse vector wave function Φ(k) can only
be expressed as a linear superposition of states involving m = ±1, and that the
m = 0 component is absent. On expressing an arbitrary (non-transverse) vector
wave function Φ(k) with components Φ(1) (k), Φ(2) (k) and Φ(3) (k) in terms of
its components referred to the helicity eigenstates Φm (k) one has
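$$
\begin{pmatrix} \Phi_{+1}(\mathbf{k}) \\ \Phi_{0}(\mathbf{k}) \\ \Phi_{-1}(\mathbf{k}) \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix} -1 & i & 0 \\ 0 & 0 & \sqrt{2} \\ 1 & i & 0 \end{pmatrix}
\begin{pmatrix} \Phi^{(1)}(\mathbf{k}) \\ \Phi^{(2)}(\mathbf{k}) \\ \Phi^{(3)}(\mathbf{k}) \end{pmatrix}
\tag{480}
$$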
This relation between the two bases can be expressed in the alternate form
$$
\Phi(\mathbf{k}) = \sum_{m=-1}^{m=1} \hat{e}_m \, \Phi_m(\mathbf{k})
\tag{481}
$$
The circularly-polarized unit vectors are associated with photons which have
definite helicity eigenvalues. It should be noted that these complex unit vectors
are orthogonal, and satisfy
The above relations allow one to define the circularly-polarized creation and
annihilation operators via their relation to the quantum fields. This procedure
yields
$$
\sum_{i=1}^{i=3} \hat{\epsilon}_i(\mathbf{k}) \, \hat{a}_{\mathbf{k},i} = \sum_{m=-1}^{m=1} \hat{e}_m(\mathbf{k}) \, \hat{a}_{\mathbf{k},m}
\tag{484}
$$
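When expressed in terms of the circularly-polarized unit vectors, the spin operator for the electromagnetic field becomes
$$
\hat{\mathbf{S}}_{EM} = \hbar \sum_{\mathbf{k}} \hat{\mathbf{k}} \left( \hat{a}^{\dagger}_{\mathbf{k},m=1} \hat{a}_{\mathbf{k},m=1} - \hat{a}^{\dagger}_{\mathbf{k},m=-1} \hat{a}_{\mathbf{k},m=-1} \right)
\tag{487}
$$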
which is expressed in terms of photons with definite helicity. Within the man-
ifold of single-photon states with momentum h̄ k, the spin operator has eigen-
values of ±h̄ when measured along the direction k̂. It is seen that the photon
has helicity m = ±1 but does not involve the helicity state with m = 0 since
the electromagnetic field is transverse. The transverse nature of the field is due
to the photon being massless. In general, a massive particle with spin S should
have (2S + 1) helicity states. However, a massless particle can only have the
two helicity states corresponding to m = ±S.
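The statement that the states of eqn(479) carry helicity m = +1, 0, −1 can be checked directly. The sketch below assumes that k̂ lies along the 3-axis and that the spin matrices are $( \hat{S}^{(i)} )^{j,k} = - i \hbar \, \xi^{i,j,k}$, as read off from eqn(467); it works in units of ħ, and the function names are illustrative only.

```python
import math


def levi_civita(i, j, k):
    """Antisymmetric symbol xi^{i,j,k} for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


# (S^(3))^{j,k} = -i xi^{3,j,k} in units of hbar (the 3-axis is index 2 here).
S3 = [[-1j * levi_civita(2, j, k) for k in range(3)] for j in range(3)]


def apply(matrix, vec):
    """Matrix-vector product for a 3x3 matrix."""
    return [sum(matrix[j][k] * vec[k] for k in range(3)) for j in range(3)]


s = 1 / math.sqrt(2)
# The three vectors of eqn (479), labelled by m.
helicity_states = {
    +1: [-s, -1j * s, 0],
    0: [0, 0, 1],
    -1: [s, -1j * s, 0],
}


def is_eigenvector(m, vec, tol=1e-12):
    """True if S3 vec = m vec within the tolerance."""
    image = apply(S3, vec)
    return all(abs(image[j] - m * vec[j]) < tol for j in range(3))


for m, vec in helicity_states.items():
    print(m, is_eigenvector(m, vec))
```

A plane-polarized vector such as (1, 0, 0) fails the same test for every m, which is the sense in which a plane-polarized photon is not a helicity eigenstate.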
$$
\hat{\mathbf{E}} = - \frac{1}{c} \frac{\partial \hat{\mathbf{A}}}{\partial t}
\tag{488}
$$
Although the expectation value of $\hat{\mathbf{E}}$ vanishes for any eigenstate of the set of
occupation numbers $| \{ n_{\mathbf{k}',\beta} \} \rangle$
since
$$
\langle \{ n_{\mathbf{k}',\beta} \} | \, \hat{a}_{\mathbf{k},\alpha} \, | \{ n_{\mathbf{k}',\beta} \} \rangle = 0
\tag{490}
$$
the fluctuations in the fields are given by
→ ∞ (491)
The fluctuations in the field diverge because of the zero-point energy fluctuations.
21 R. A. Beth, Phys. Rev. 50, 115 (1936).
22 J. H. Poynting, Proc. Roy. Soc. A82, 560 (1909).
The commutation relations between the x-component of the E field and the
B field at the same instant of time are non-zero23 . That is,
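$$
\begin{aligned}
[ \hat{E}_x(\mathbf{r}) , \hat{B}_y(\mathbf{r}') ] &= \frac{2 \pi}{V} \sum_{\mathbf{k},\alpha} \hbar \omega_k \, \hat{\epsilon}_{\alpha}(\mathbf{k})_x \, ( \hat{\mathbf{k}} \wedge \hat{\epsilon}_{\alpha}(\mathbf{k}) )_y \, \exp\left[ i \mathbf{k} \cdot ( \mathbf{r}' - \mathbf{r} ) \right] \\
&\quad - \frac{2 \pi}{V} \sum_{\mathbf{k},\alpha} \hbar \omega_k \, \hat{\epsilon}_{\alpha}(\mathbf{k})_x \, ( \hat{\mathbf{k}} \wedge \hat{\epsilon}_{\alpha}(\mathbf{k}) )_y \, \exp\left[ - i \mathbf{k} \cdot ( \mathbf{r}' - \mathbf{r} ) \right] \\
&= - \frac{4 \pi \hbar c}{V} \sum_{\mathbf{k}} k_z \exp\left[ - i \mathbf{k} \cdot ( \mathbf{r}' - \mathbf{r} ) \right] \\
&= i \frac{4 \pi c \hbar}{V} \frac{\partial}{\partial z} \sum_{\mathbf{k}} \exp\left[ - i \mathbf{k} \cdot ( \mathbf{r}' - \mathbf{r} ) \right] \\
&= i \frac{c \hbar}{2 \pi^2} \frac{\partial}{\partial z} \delta^3( \mathbf{r}' - \mathbf{r} )
\end{aligned}
\tag{492}
$$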
The fact that the two polarizations are transverse to the unit vector k̂ has been
used to obtain the third line. Since Ê and B̂ do not commute, it follows that E
and B obey an uncertainty relation in that the values of E and B cannot both
be specified to arbitrary accuracy at the same point.
However, if two points in space-time $x$ and $x'$ are not causally related, i.e.
$$
| \mathbf{r}' - \mathbf{r} | \neq c \, | t' - t |
\tag{493}
$$
Thus, if the two points in space-time are not connected by the propagation of
light, then the Ex and By fields can both be determined to arbitrary accuracy.
For example, the vacuum state or ground state is an eigenstate of the annihila-
tion operator, in which case aϕ = 0.
On taking the matrix elements of this equation with the state < m |, and using
the orthonormality of the eigenstates of the number operator, one finds
$$
C_{m+1} \sqrt{m + 1} = a_{\varphi} \, C_m
\tag{499}
$$
The normalization constant $C_0$ can be found from
$$
1 = C_0^{*} C_0 \sum_{n=0}^{\infty} \frac{ a_{\varphi}^{* \, n} \, a_{\varphi}^{n} }{ n! }
\tag{502}
$$
From this, it can be shown that if the number of photons in a coherent state
is measured, the result n will occur with a probability given by
$$
P(n) = \frac{ ( a_{\varphi}^{*} a_{\varphi} )^n }{ n! } \exp\left[ - a_{\varphi}^{*} a_{\varphi} \right]
\tag{505}
$$
Figure 12: The probability of finding n photons P(n) in a normal mode represented by a coherent state.
the quantity a∗ϕ aϕ is the average number of photons n present in the coherent
state.
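The Poisson form of eqn(505) is easy to explore numerically. The following minimal sketch evaluates P(n) for a given complex amplitude and confirms that the distribution is normalized with mean $a^{*}_{\varphi} a_{\varphi}$. The function names and the cutoff `n_max` are assumptions made for the illustration; the logarithmic form is used only to avoid overflow in the factorial.

```python
import math


def photon_number_probability(n, a_phi):
    """P(n) of eqn (505) for a coherent state with complex amplitude a_phi."""
    mean = abs(a_phi) ** 2
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    # exp(n ln(mean) - mean - ln n!) avoids forming n! explicitly.
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def photon_statistics(a_phi, n_max=200):
    """Return (sum of P(n), average photon number) over n < n_max."""
    probs = [photon_number_probability(n, a_phi) for n in range(n_max)]
    total = sum(probs)
    mean = sum(n * p for n, p in enumerate(probs))
    return total, mean


total, mean = photon_statistics(2 + 1j)
print(total, mean)
```

For the amplitude 2 + i used above, the average photon number should come out close to |2 + i|² = 5, provided the cutoff is well above the mean.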
The coherent states can be written in a more compact form. Since the state
with occupation number n can be written as
$$
| n \rangle = \frac{ ( \hat{a}^{\dagger} )^n }{ \sqrt{n!} } \, | 0 \rangle
\tag{506}
$$
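the coherent state can also be expressed as
$$
| a_{\varphi} \rangle = \exp\left[ - \frac{1}{2} a_{\varphi}^{*} a_{\varphi} \right] \sum_{n=0}^{\infty} \frac{ ( a_{\varphi} \hat{a}^{\dagger} )^n }{ n! } \, | 0 \rangle
\tag{507}
$$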
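The number states can be expressed in terms of the coherent states via the
inverse transformation
$$
| n \rangle = \frac{ \sqrt{n!} }{ a^n } \exp\left[ + \frac{1}{2} a^2 \right] \int_0^{2\pi} \frac{d\varphi}{2 \pi} \exp\left[ - i n \varphi \right] | a_{\varphi} \rangle
\tag{510}
$$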
by integrating over the phase ϕ of the coherent state. Since the set of occupa-
tion number states is complete, the set of coherent states must also span Hilbert
space. In fact, the set of coherent states is over-complete.
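The coherent state $| a_{\varphi} \rangle$ can be represented by the point $a_{\varphi}$ in the Argand
plane. The overlap matrix element between two coherent states is calculated
as
$$
| \langle a'_{\varphi'} | a_{\varphi} \rangle |^2 = \exp\left[ - | a_{\varphi} - a'_{\varphi'} |^2 \right]
\tag{511}
$$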
Hence, coherent states corresponding to different points are not orthogonal. The
coherent states form an over-complete basis set. The over-completeness relation
can be expressed as
$$
\int \frac{ d \, \Re e \, a_{\varphi} \; d \, \Im m \, a_{\varphi} }{ \pi } \; | a_{\varphi} \rangle \langle a_{\varphi} | = \hat{I}
\tag{512}
$$
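This relation can be proved by taking the matrix elements between the occupation number states $\langle n' |$ and $| n \rangle$, which leads to
$$
\int \frac{ d \, \Re e \, a_{\varphi} \; d \, \Im m \, a_{\varphi} }{ \pi } \; \langle n' | a_{\varphi} \rangle \langle a_{\varphi} | n \rangle = \delta_{n',n}
\tag{513}
$$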
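Both the overlap formula of eqn(511) and the over-completeness relation of eqn(513) can be checked numerically from the number-state expansion of eqn(507). The sketch below assumes a truncation of the number basis at `n_max` states and evaluates the Argand-plane integral in polar form with a simple midpoint rule; the function names, cutoffs and grid sizes are illustrative assumptions.

```python
import cmath
import math


def fock_amplitudes(a_phi, n_max=80):
    """<n|a_phi> = exp(-|a|^2 / 2) a^n / sqrt(n!) for n < n_max."""
    amps = []
    term = cmath.exp(-abs(a_phi) ** 2 / 2)  # n = 0 amplitude
    for n in range(n_max):
        amps.append(term)
        term = term * a_phi / math.sqrt(n + 1)
    return amps


def overlap_squared(a1, a2, n_max=80):
    """|<a1|a2>|^2 from the truncated number-state expansion."""
    c1 = fock_amplitudes(a1, n_max)
    c2 = fock_amplitudes(a2, n_max)
    return abs(sum(x.conjugate() * y for x, y in zip(c1, c2))) ** 2


def completeness_element(n_prime, n, a_max=8.0, n_radial=4000, n_phases=64):
    """Left-hand side of eqn (513), integrated over modulus a and phase."""
    # Discrete phase average of exp[i (n' - n) phi].
    phase_sum = sum(cmath.exp(1j * (n_prime - n) * 2 * math.pi * s / n_phases)
                    for s in range(n_phases)) * (2 * math.pi / n_phases)
    norm = math.sqrt(math.factorial(n) * math.factorial(n_prime))
    da = a_max / n_radial
    radial = 0.0
    for i in range(n_radial):
        a = (i + 0.5) * da
        radial += a ** (n + n_prime) * math.exp(-a * a) / norm * a * da
    return radial * phase_sum / math.pi


a1, a2 = 1 + 0.5j, 0.2 - 0.3j
print(overlap_squared(a1, a2), math.exp(-abs(a1 - a2) ** 2))
print(completeness_element(2, 2), completeness_element(1, 3))
```

The two numbers on the first printed line should agree closely, and the second line should be close to one and zero respectively, in line with the Kronecker delta of eqn(513).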
[Figure: the point aϕ in the Argand plane (axes Re z and Im z), with modulus a and phase ϕ.]
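The effect of the creation operator on the coherent state can be expressed
as
$$
\begin{aligned}
\hat{a}^{\dagger} | a_{\varphi} \rangle &= \hat{a}^{\dagger} \exp\left[ - \frac{1}{2} a_{\varphi}^{*} a_{\varphi} \right] \exp\left[ a_{\varphi} \hat{a}^{\dagger} \right] | 0 \rangle \\
&= \exp\left[ - \frac{1}{2} a_{\varphi}^{*} a_{\varphi} \right] \hat{a}^{\dagger} \exp\left[ a_{\varphi} \hat{a}^{\dagger} \right] | 0 \rangle \\
&= \exp\left[ - \frac{1}{2} a_{\varphi}^{*} a_{\varphi} \right] \frac{\partial}{\partial a_{\varphi}} \exp\left[ a_{\varphi} \hat{a}^{\dagger} \right] | 0 \rangle \\
&= \exp\left[ - \frac{1}{2} a_{\varphi}^{*} a_{\varphi} \right] \frac{\partial}{\partial a_{\varphi}} \left( \exp\left[ + \frac{1}{2} a_{\varphi}^{*} a_{\varphi} \right] | a_{\varphi} \rangle \right)
\end{aligned}
\tag{516}
$$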
The coherent state is not an eigenstate of the creation operator, since the re-
sulting state does not include the zero-photon state.
The expectation value of the field operators between the coherent states
yields the classical value, since
has been used in the term involving the annihilation operator and the term originating from the creation operator is evaluated using the Hermitean conjugate
equation
$$
\langle a_{\varphi} | \, \hat{a}^{\dagger} = \langle a_{\varphi} | \, a_{\varphi}^{*}
\tag{519}
$$
One also finds that the expectation value of the number operator is given
by
$$
\langle a_{\varphi} | \, \hat{a}^{\dagger} \hat{a} \, | a_{\varphi} \rangle = a_{\varphi}^{*} a_{\varphi}
\tag{520}
$$
so the magnitude of aϕ is related to the average number of photons in the
coherent state n. This identification is consistent with the Poisson distribution
of eqn(505) which governs the probability of finding n photons in the coherent
state. The coherent state is not an eigenstate of the number operator since
there are fluctuations in any measurement of the number of photons. The rms
fluctuation ∆n can be evaluated by noting that
where the boson commutation relations have been used in the second line. Thus,
the mean squared fluctuation in the number operator is given by
The rms fluctuation of the photon number is only negligible when compared to
the average value if $a_{\varphi}$ has a large magnitude
$$
a_{\varphi}^{*} a_{\varphi} \gg 1
\tag{523}
$$
The expectation values of coherent states behave almost completely classically. The deviation from the classical expectation values can be seen by
examining
$$
\langle a_{\varphi} | \, \hat{a} \, \hat{a}^{\dagger} \, | a_{\varphi} \rangle = a_{\varphi}^{*} a_{\varphi} + 1
\tag{524}
$$
which is evaluated by using the commutation relations. It is seen that the ex-
pectation values can be approximated by the classical values, if the magnitude
of aϕ is much greater than unity.
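The expectation values of eqn(520) and eqn(524), together with the rms fluctuation of the photon number, can be checked by acting with the operators on a truncated number-state expansion of the coherent state. This is only a sketch: the truncation `n_max` and all function names are assumptions made for the illustration.

```python
import cmath
import math


def fock_amplitudes(a_phi, n_max=120):
    """<n|a_phi> = exp(-|a|^2 / 2) a^n / sqrt(n!) for n < n_max."""
    amps = []
    term = cmath.exp(-abs(a_phi) ** 2 / 2)
    for n in range(n_max):
        amps.append(term)
        term = term * a_phi / math.sqrt(n + 1)
    return amps


def apply_creation(amps):
    """Components of a^dagger |psi>: <n+1| a^dagger |psi> = sqrt(n+1) <n|psi>."""
    return [0j] + [math.sqrt(n + 1) * c for n, c in enumerate(amps)]


def apply_annihilation(amps):
    """Components of a |psi>: <n-1| a |psi> = sqrt(n) <n|psi>."""
    return [math.sqrt(n) * amps[n] for n in range(1, len(amps))]


def norm_squared(amps):
    return sum(abs(c) ** 2 for c in amps)


def coherent_state_statistics(a_phi):
    """Return (<a^dagger a>, <a a^dagger>, rms fluctuation of n)."""
    amps = fock_amplitudes(a_phi)
    n_mean = norm_squared(apply_annihilation(amps))   # <a^dagger a>
    anti_normal = norm_squared(apply_creation(amps))  # <a a^dagger>
    n_squared = sum(n * n * abs(c) ** 2 for n, c in enumerate(amps))
    rms = math.sqrt(n_squared - n_mean ** 2)
    return n_mean, anti_normal, rms


print(coherent_state_statistics(3.0))
```

For an amplitude of magnitude 3, the relative fluctuation rms/average is 1/3, which illustrates why the classical limit requires a large magnitude of the amplitude.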
Exercise:
Determine the expectation values for the electric and magnetic field opera-
tors in a coherent state which represents a plane-polarized electromagnetic wave.
Exercise:
Determine the expectation values for the electric and magnetic field opera-
tors in a coherent state which represents a left circularly-polarized electromag-
netic wave composed of photons with a helicity of +1.
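and the Hermitean conjugate operator, the creation operator can be expressed
as
$$
\hat{a}^{\dagger}_{\mathbf{k},\alpha} = \sqrt{ \hat{n}_{\mathbf{k},\alpha} } \; \exp\left[ - i ( \hat{\varphi}_{\mathbf{k},\alpha} - \omega_k t ) \right]
\tag{526}
$$
since it has been required that $\sqrt{\hat{n}}$ and $\hat{\varphi}$ are Hermitean. Furthermore, the
operator $\sqrt{\hat{n}}$ must have the property
$$
\sqrt{ \hat{n}_{\mathbf{k},\alpha} } \, \sqrt{ \hat{n}_{\mathbf{k},\alpha} } = \hat{n}_{\mathbf{k},\alpha}
\tag{527}
$$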
25 P. A. M. Dirac, Proc. Roy. Soc. A 114, 243 (1927).
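On substituting the expressions for the creation and annihilation operators,
in terms of the phase and amplitude, into the boson commutation relations
$$
- \sqrt{ \hat{n}_{\mathbf{k}',\beta} } \; \exp\left[ - i ( \hat{\varphi}_{\mathbf{k}',\beta} - \omega_{k'} t ) \right] \exp\left[ + i ( \hat{\varphi}_{\mathbf{k},\alpha} - \omega_k t ) \right] \sqrt{ \hat{n}_{\mathbf{k},\alpha} }
\tag{529}
$$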
This relationship is satisfied, if the phase and number operators satisfy the
commutation relation
$$
[ \hat{n}_{\mathbf{k},\alpha} , \hat{\varphi}_{\mathbf{k},\alpha} ] = i
\tag{531}
$$
If one can construct the Hermitean operators that satisfy this commutation
relation, then one can show that the rms uncertainties in the phase and number must
satisfy the inequality
It should be noted that only the relative phase can be measured26 . Thus, if the
phase difference of any two components (k, α) and (k 0 , α0 ) is specified precisely,
then the occupation number of either component can not be specified.
Exercise:
Express the vector potential and the electric and magnetic field operators in
terms of the amplitude and phase operators.
Hence, coherent states are not orthogonal. In fact, their overlap decreases expo-
nentially with large “separations” between the points aϕ and a0ϕ0 in the Argand
plane. We shall denote | aϕ | by a. Two states separated by distances a ∆ϕ or
∆a such that a ∆ϕ ≥ 1 and ∆a ≥ 1 are effectively orthogonal or independent.
However, states within an area given by ∆a × a ∆ϕ ≈ 1 have significant overlap
and so can represent the same state. Therefore, a minimum uncertainty
state occupies an area ∆a × a ∆ϕ ≈ 1. We note that 2 a ∆a can be interpreted
as a measure of the uncertainty ∆nϕ in the particle number for the state, and
∆ϕ is the uncertainty in the phase of the state. Hence, the phase-number
uncertainty relation sets the area of the Argand diagram that can be associated
with a single state as
$$
a \, \Delta a \, \Delta\varphi \sim 1
\tag{534}
$$
Figure 14: Due to the phase-number uncertainty principle, the minimum area
of the Argand diagram needed to represent a minimum uncertainty state has
dimensions such that a ∆a ∆ϕ ∼ 1.
$$
\begin{aligned}
\hat{H} &= \frac{ \hat{\mathbf{p}}^2 }{ 2 m } + q \, \phi(\mathbf{r}) - \frac{q}{2 m c} \left( \hat{\mathbf{p}} \cdot \hat{\mathbf{A}}(\mathbf{r}) + \hat{\mathbf{A}}(\mathbf{r}) \cdot \hat{\mathbf{p}} \right) + \frac{q^2}{2 m c^2} \hat{\mathbf{A}}^2(\mathbf{r}) \\
&\quad + \int d^3r' \; \frac{ \hat{\mathbf{E}}^2(\mathbf{r}') + \hat{\mathbf{B}}^2(\mathbf{r}') }{ 8 \pi }
\end{aligned}
\tag{535}
$$
when the vector potential is chosen to satisfy the Coulomb gauge. The second
and third terms are to be evaluated at the location of the charged point particle,
r, and the last term is evaluated at all points in space. The Hamiltonian can
be expressed as
$$
\hat{H} = \hat{H}_0 + \hat{H}_{rad} + \hat{H}_{int}
\tag{536}
$$
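where $\hat{H}_0$ is the Hamiltonian for the charged particle in the electrostatic potential φ
$$
\hat{H}_0 = \frac{ \hat{\mathbf{p}}^2 }{ 2 m } + q \, \phi(\mathbf{r})
\tag{537}
$$
and $\hat{H}_{rad}$ is the Hamiltonian for the electromagnetic radiation and $\hat{H}_{int}$ is the
interaction
$$
\hat{H}_{int} = - \frac{q}{2 m c} \left( \hat{\mathbf{p}} \cdot \hat{\mathbf{A}} + \hat{\mathbf{A}} \cdot \hat{\mathbf{p}} \right) + \frac{q^2}{2 m c^2} \hat{\mathbf{A}}^2
\tag{538}
$$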
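diamagnetic interaction is expressed as
$$
\begin{aligned}
\hat{H}_{dia} &= \frac{q^2}{2 m c^2} \sum_{\mathbf{k},\mathbf{k}',\alpha,\beta} \left( \frac{ 2 \pi \hbar c^2 }{ \sqrt{ \omega_k \omega_{k'} } \, V } \right) \hat{\epsilon}_{\beta}(\mathbf{k}') \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, \exp\left[ - i ( \mathbf{k} + \mathbf{k}' ) \cdot \mathbf{r} \right] \\
&\quad \times \left( \hat{a}^{\dagger}_{\mathbf{k}',\beta} \hat{a}^{\dagger}_{\mathbf{k},\alpha} + \hat{a}^{\dagger}_{\mathbf{k}',\beta} \hat{a}_{-\mathbf{k},\alpha} + \hat{a}_{-\mathbf{k}',\beta} \hat{a}^{\dagger}_{\mathbf{k},\alpha} + \hat{a}_{-\mathbf{k}',\beta} \hat{a}_{-\mathbf{k},\alpha} \right)
\end{aligned}
\tag{542}
$$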
For charged particles with spin one-half, analysis of the non-relativistic
Pauli equation shows that there is another interaction term involving the particles' spins. This interaction can be described by the anomalous Zeeman interaction
$$
\hat{H}_{Zeeman} = - \frac{q \hbar}{2 m c} \, \sigma \cdot \mathbf{B}
\tag{543}
$$
where
$$
\mathbf{B} = \nabla \wedge \mathbf{A}(\mathbf{r})
\tag{544}
$$
and $\sigma_i$ are the three Pauli matrices.
Generally, the paramagnetic interaction has a greater strength than the Zeeman interaction. This can be seen by examining the magnitudes of the interactions. The paramagnetic interaction has a magnitude given by
$$
\frac{e}{m c} \, \mathbf{p} \cdot \hat{\epsilon}_{\alpha} \, A
\tag{545}
$$
and for an atom of size a, the uncertainty principle yields
$$
p \sim \frac{\hbar}{a}
\tag{546}
$$
The Zeeman interaction has a magnitude given by
$$
\frac{e \hbar}{m c} \, \sigma \cdot ( \mathbf{k} \wedge \mathbf{A} )
\tag{547}
$$
but since the magnitude of the wave vector k is set by the wavelength λ of the light
$$
k \sim \frac{1}{\lambda}
\tag{548}
$$
and since the wavelength of light is larger than the linear dimension of an
atom, λ > a, one finds the inequality between the magnitude of the paramagnetic interaction and the Zeeman interaction
$$
\frac{e \hbar}{m c} \, \frac{1}{a} \, A > \frac{e \hbar}{m c} \, \frac{1}{\lambda} \, A
\tag{549}
$$
Both the paramagnetic and Zeeman coupling strengths are proportional to the
magnitude of the vector potential A, hence the ratio of the strengths of the
interactions is independent of A. Therefore, their magnitudes satisfy the inequality
$$
\frac{1}{a} > \frac{1}{\lambda}
\tag{550}
$$
so the Zeeman interaction can frequently be neglected in comparison with the
paramagnetic interaction.
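To attach a number to the inequality of eqn(550), one can take the ratio of the two estimates, which is simply λ/a. The values used below are assumptions chosen for illustration only: an atomic size of one Angstrom and an ultraviolet wavelength of 1216 Angstroms, with the order-of-magnitude estimate k ∼ 1/λ used in the text.

```python
ANGSTROM = 1.0e-10  # metres


def interaction_ratio(atom_size, wavelength):
    """Paramagnetic / Zeeman strength, (1/a) / (1/lambda) = lambda / a."""
    return wavelength / atom_size


# Assumed illustrative values: a ~ 1 Angstrom, lambda ~ 1216 Angstrom.
ratio = interaction_ratio(1.0 * ANGSTROM, 1216.0 * ANGSTROM)
print(ratio)
```

With these assumed values the paramagnetic coupling exceeds the Zeeman coupling by roughly three orders of magnitude.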
Figure 16: An electron in the initial atomic state with energy $E_{n,l,m}$ makes a
transition to the final atomic state with energy $E_{n',l',m'}$, by emitting a photon
with energy $\hbar \omega_{k'}$.
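photon is emitted, the final state of the photon field is described by the set $\{ n'_{\mathbf{k}',\beta} \}$
where
$$
n'_{\mathbf{k}',\beta} = n_{\mathbf{k}',\beta} \quad \text{for} \quad ( \mathbf{k}' , \beta ) \neq ( \mathbf{k} , \alpha )
\tag{551}
$$
and the number of photons in a normal mode (k, α) is increased by one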
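The transition rate for the electron to make a transition from (n, l, m) to
(n′, l′, m′) can be calculated27 from the Fermi Golden Rule expression
$$
\frac{1}{\tau} = \frac{2 \pi}{\hbar} \sum_{\mathbf{k},\alpha} | \langle n' l' m' \{ n'_{\mathbf{k}',\beta} \} | \, \hat{H}_{int} \, | n l m \{ n_{\mathbf{k}',\beta} \} \rangle |^2 \; \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_{\mathbf{k},\alpha} )
\tag{553}
$$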
The delta function expresses the conservation of energy. The energy of the
initial state is given by
$$
E_{nlm} + \sum_{\mathbf{k}',\beta} \hbar \omega_{\mathbf{k}',\beta} \left( n_{\mathbf{k}',\beta} + \frac{1}{2} \right)
\tag{554}
$$
The difference in the energy of the initial state and final state is evaluated as
which is the argument of the delta function and must vanish if energy is conserved. The sum over k can be evaluated by assuming that the radiation field is
confined to a volume V. The allowed k values for the normal modes are determined by the boundary conditions. In this case, the sum over k is transformed
to an integral over k-space via
$$
\sum_{\mathbf{k}} \rightarrow \frac{V}{( 2 \pi )^3} \int d^3k
\tag{557}
$$
(558)
since only the paramagnetic part of the interaction has non-zero matrix elements. For the photon emission process, the matrix elements of the creation
operator between the initial and final states of the electromagnetic cavity are
evaluated as
(560)
(561)
27 P. A. M. Dirac, Proc. Roy. Soc. A 112, 661 (1926), A 114, 243 (1927).
The above expression shows that the rate for emitting a photon into state (k, α)
is proportional to a factor of nk,α + 1, which depends on the state of occupation
of the normal mode. The term proportional to the photon occupation number
describes stimulated emission. However, if there are no photons initially present
in this normal mode, one still has a non-zero transition rate corresponding to
spontaneous emission. These factors are the result of the rigorous calculations28
based on Dirac’s quantization of the electromagnetic field, but were previously
derived by Einstein29 using a different argument. From the above expression, it
is seen that the number of photons emitted into state (k, α) increases in proportion to the number of photons present in that normal mode. This stimulated
emission increases the number of photons and can lead to amplification of the
number of quanta in the normal mode, which is the phenomenon of Light
Amplification by Stimulated Emission of Radiation (LASER).
whereas the typical length scale r for the electronic state is of the order of an
Angstrom. Therefore the product $k r \sim 10^{-3}$, so the exponential factor in the
vector potential can be Taylor expanded as
$$
\exp\left[ - i \mathbf{k} \cdot \mathbf{r} \right] \sim 1 - i \mathbf{k} \cdot \mathbf{r} + \ldots
\tag{563}
$$
The first term in the expression produces results that are equivalent to the radiation from an oscillating classical electric dipole. If only the first term in the
Figure 17: A cartoon depicting the relative length-scales assumed in the dipole
approximation.
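In the dipole approximation, the transition rate for single photon emission
is given by
$$
\begin{aligned}
\frac{1}{\tau} &\approx \frac{2 \pi}{\hbar} \left( \frac{q}{m c} \right)^2 \frac{V}{( 2 \pi )^3} \int d^3k \sum_{\alpha} \left( \frac{ 2 \pi \hbar c^2 }{ V \omega_k } \right) ( n_{\mathbf{k},\alpha} + 1 ) \\
&\quad \times | \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \langle n' l' m' | \, \hat{\mathbf{p}} \, | n l m \rangle |^2 \; \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k )
\end{aligned}
\tag{564}
$$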
The matrix elements of the momentum can be evaluated by noting that the
states | nlm > are eigenstates of the unperturbed electronic Hamiltonian so
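where the unperturbed Hamiltonian is given by
$$
\hat{H}_0 = \frac{ \hat{\mathbf{p}}^2 }{ 2 m } + V(r)
\tag{566}
$$
The electronic momentum operator $\hat{\mathbf{p}}$ can be expressed in terms of the commutator of the Hamiltonian $\hat{H}_0$ and r through the relation
$$
[ \mathbf{r} , \hat{H}_0 ] = i \frac{\hbar}{m} \, \hat{\mathbf{p}}
\tag{567}
$$
On using this relation, the matrix elements of the momentum operator can be
written in terms of the matrix elements of the electron's position operator r by
$$
\begin{aligned}
\langle n' l' m' | \, \hat{\mathbf{p}} \, | n l m \rangle &= - i \frac{m}{\hbar} \langle n' l' m' | \, [ \mathbf{r} , \hat{H}_0 ] \, | n l m \rangle \\
&= - i \frac{m}{\hbar} ( E_{nlm} - E_{n'l'm'} ) \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle
\end{aligned}
\tag{568}
$$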
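$$
\begin{aligned}
\frac{1}{\tau} &\approx \frac{q^2}{( 2 \pi )} \int d^3k \sum_{\alpha} \omega_k \, ( n_{\mathbf{k},\alpha} + 1 ) \\
&\quad \times | \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle |^2 \; \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k )
\end{aligned}
\tag{569}
$$
where the property of the delta function has been used to set
$$
\frac{ ( E_{nlm} - E_{n'l'm'} )^2 }{ \hbar^2 } \; \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k ) = \omega_k^2 \; \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k )
\tag{570}
$$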
It is seen that the volume of the electromagnetic cavity has dropped out of the
expression of eqn(569) for the transition rate. We shall assume that the number
of photons nk,α in the initial state is zero. The (complex) factor
is defined as the electric dipole moment, and the electronic energy difference is
denoted by the frequency
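The integration over $d^3k$ can be performed by separating the integration over
the direction $d\Omega_k$ of the outgoing photon and an integration over the magnitude
of k. The integration over the magnitude of k can be performed by noting that
the integrand is proportional to a Dirac delta function, so the transition rate
can be evaluated as
$$
\begin{aligned}
\frac{1}{\tau} &= \frac{ \omega^2_{nl,n'l'} }{ 2 \pi \hbar } \int d\Omega_k \int_0^{\infty} dk \, \frac{k^2}{\omega_k} \sum_{\alpha} | \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{d}_{nlm,n'l'm'} |^2 \; \delta( \omega_{nl,n'l'} - c k ) \\
&= \frac{ \omega^3_{nl,n'l'} }{ 2 \pi \hbar c^3 } \int d\Omega_k \sum_{\alpha} | \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{d}_{nlm,n'l'm'} |^2
\end{aligned}
\tag{574}
$$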
The above expression yields the rate at which an electron makes a transition
between the initial and final electronic state, in which one photon of any polar-
ization is emitted in any direction.
If one is only interested in the decay rate of the electronic state via the
emission of a photon, one should sum over all polarizations and integrate over
all directions of the emitted photon. The direction of the emitted photon k̂ is
expressed in terms of polar coordinates defined with respect to an arbitrarily
chosen polar axis. The direction of the photon's wave vector $\hat{\mathbf{k}}$ is defined as
$\hat{\mathbf{k}} = ( \sin\theta_k \cos\varphi_k , \sin\theta_k \sin\varphi_k , \cos\theta_k )$. The directions of the two transverse
polarizations α are defined as
$$
\begin{aligned}
\hat{\epsilon}_{1}(\mathbf{k}) &= ( \cos\theta_k \cos\varphi_k , \; \cos\theta_k \sin\varphi_k , \; - \sin\theta_k ) \\
\hat{\epsilon}_{2}(\mathbf{k}) &= ( - \sin\varphi_k , \; \cos\varphi_k , \; 0 )
\end{aligned}
\tag{575}
$$
Figure 18: A photon is emitted with wave vector k with a direction denoted
by the polar coordinates $( \theta_k , \varphi_k )$. The polarization vector $\hat{e}_1(\mathbf{k})$ is chosen to be
in the plane containing the polar-axis and k, therefore, $\hat{e}_2(\mathbf{k})$ is parallel to the
x − y plane.
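The scalar product between the polarization vectors and the dipole moment can
be expressed in terms of the Cartesian components via
$$
\hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{d}_{nlm,n'l'm'} = \sum_{i} \hat{\epsilon}^{(i)}_{\alpha}(\mathbf{k}) \, ( \mathbf{d}_{nlm,n'l'm'} )^{(i)}
\tag{576}
$$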
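As neither the polarization nor the direction of the outgoing photon is measured, the transition rate is determined as an integral over all directions
$$
\frac{1}{\tau} = \frac{ \omega^3_{nl,n'l'} }{ 2 \pi \hbar c^3 } \sum_{i,j} \sum_{\alpha} \int d\Omega_k \; \hat{\epsilon}^{(j)}_{\alpha}(\mathbf{k}) \, \hat{\epsilon}^{(i)}_{\alpha}(\mathbf{k}) \, ( \mathbf{d}_{nlm,n'l'm'}^{(j)} )^{*} \, ( \mathbf{d}_{nlm,n'l'm'}^{(i)} )
\tag{577}
$$
one finds that the transition rate is given by the scalar product of complex
vectors
$$
\frac{1}{\tau} = \frac{ 4 \, \omega^3_{nl,n'l'} }{ 3 \hbar c^3 } \; \mathbf{d}^{*}_{nlm,n'l'm'} \cdot \mathbf{d}_{nlm,n'l'm'}
\tag{579}
$$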
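The step from eqn(577) to eqn(579) rests on the angular integral of the polarization sum being proportional to the unit tensor, with the value 8π/3 on the diagonal, which is the factor that converts the prefactor 1/(2π) into 4/3. The sketch below checks this numerically with the polarization vectors of eqn(575) and a midpoint rule. The grid sizes and the function names are assumptions for the illustration.

```python
import math


def polarization_vectors(theta, phi):
    """The two transverse polarization vectors of eqn (575)."""
    e1 = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi),
          -math.sin(theta))
    e2 = (-math.sin(phi), math.cos(phi), 0.0)
    return e1, e2


def angular_integral(n_theta=200, n_phi=100):
    """Midpoint estimate of int dOmega sum_alpha eps^(i) eps^(j), as a 3x3 list."""
    result = [[0.0] * 3 for _ in range(3)]
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        weight = math.sin(theta) * d_theta * d_phi  # solid-angle element
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            for eps in polarization_vectors(theta, phi):
                for i in range(3):
                    for j in range(3):
                        result[i][j] += weight * eps[i] * eps[j]
    return result


tensor = angular_integral()
print(tensor[0][0], tensor[2][2], 8 * math.pi / 3)
```

The off-diagonal entries should vanish to within the quadrature error, so only the scalar product of the dipole vectors survives.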
The electric dipole matrix elements can be shown to vanish between most pairs
of states. The selection rules determine which matrix elements are non-zero
and, therefore, which electric dipole transitions are allowed.
where $R_{n,l}(r)$ is the radial wave function, and $Y^{l}_{m}(\theta, \varphi)$ is the spherical
harmonic function quantized along the z-direction. The components of an arbitrarily oriented electric dipole matrix element involve matrix elements of the
quantities
$$
\begin{aligned}
x &= r \sin\theta \cos\varphi \\
y &= r \sin\theta \sin\varphi \\
z &= r \cos\theta
\end{aligned}
\tag{581}
$$
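Since the above expressions are the components of a vector, they can be re-written as combinations of the spherical harmonics with angular momentum
l = 1, via
$$
\begin{aligned}
x &= \frac{1}{2} \, r \sqrt{ \frac{8 \pi}{3} } \left( Y^{1}_{-1}(\theta, \varphi) - Y^{1}_{1}(\theta, \varphi) \right) \\
y &= \frac{i}{2} \, r \sqrt{ \frac{8 \pi}{3} } \left( Y^{1}_{-1}(\theta, \varphi) + Y^{1}_{1}(\theta, \varphi) \right) \\
z &= r \sqrt{ \frac{4 \pi}{3} } \; Y^{1}_{0}(\theta, \varphi)
\end{aligned}
\tag{582}
$$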
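Hence, the components of the vector r can be written as
$$
\mathbf{r} = r \sqrt{ \frac{4 \pi}{3} } \left( \frac{ \hat{e}_x + i \hat{e}_y }{ \sqrt{2} } \, Y^{1}_{-1} + \hat{e}_z \, Y^{1}_{0} - \frac{ \hat{e}_x - i \hat{e}_y }{ \sqrt{2} } \, Y^{1}_{1} \right)
\tag{583}
$$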
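The circular polarization vectors are given by
$$
\begin{aligned}
\hat{e}_{m=-1} &= \frac{ \hat{e}_x - i \hat{e}_y }{ \sqrt{2} } \\
\hat{e}_{m=0} &= \hat{e}_z \\
\hat{e}_{m=+1} &= - \frac{ \hat{e}_x + i \hat{e}_y }{ \sqrt{2} }
\end{aligned}
\tag{584}
$$
which are orthogonal
$$
\hat{e}^{*}_{m'} \cdot \hat{e}_{m} = \delta_{m,m'}
\tag{585}
$$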
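Hence, the vector r can be written in the alternate forms
$$
\begin{aligned}
\mathbf{r} &= r \sqrt{ \frac{4 \pi}{3} } \sum_{m} \hat{e}^{*}_{m} \, Y^{1}_{m}(\theta, \varphi) \\
&= r \sqrt{ \frac{4 \pi}{3} } \sum_{m} \hat{e}_{m} \, Y^{1}_{m}(\theta, \varphi)^{*}
\end{aligned}
\tag{586}
$$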
an electron with angular momentum quantized along the z-direction most natu-
rally couples to circularly-polarized light with the same quantization axis. The
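electric dipole matrix elements involve the three factors
$$
\begin{aligned}
& \int_0^{2\pi} d\varphi \int_0^{\pi} d\theta \, \sin\theta \; Y^{l'}_{m'}(\theta, \varphi)^{*} \, Y^{1}_{\pm 1}(\theta, \varphi) \, Y^{l}_{m}(\theta, \varphi) \\
& \int_0^{2\pi} d\varphi \int_0^{\pi} d\theta \, \sin\theta \; Y^{l'}_{m'}(\theta, \varphi)^{*} \, Y^{1}_{0}(\theta, \varphi) \, Y^{l}_{m}(\theta, \varphi)
\end{aligned}
\tag{588}
$$
which come from the angular integrations. Conservation of angular momentum
leads to the dipole-transition selection rules
$$
l' = l \pm 1
\tag{589}
$$
and
$$
\begin{aligned}
m' &= m \pm 1 \\
m' &= m
\end{aligned}
\tag{590}
$$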
because one unit of angular momentum is carried away by the photon in the
form of its spin30 .
mentum. Therefore, the angular momentum is completely transformed to the photon’s spin.
More generally, the spatial (plane-wave) part of vector potential should be expanded in terms
of spherical harmonics to exhibit the photon’s orbital angular momentum components.
On taking the matrix elements between states with definite z-components of the
angular momenta, one finds
which reduce to
$$
( m' - m )^2 = 1
\tag{600}
$$
or
$$
\langle n' l' m' | \, x \, | n l m \rangle = 0
\tag{601}
$$
Hence, the m-selection rules for the electric dipole transitions are ∆m = ±1, 0.
The selection rules for the magnitude of the electron's orbital angular momentum can be found by considering the double commutator
$$
[ \hat{\mathbf{L}}^2 , [ \hat{\mathbf{L}}^2 , \mathbf{r} ] ] = 2 \hbar^2 \left( \mathbf{r} \, \hat{\mathbf{L}}^2 + \hat{\mathbf{L}}^2 \, \mathbf{r} \right)
\tag{602}
$$
and
$$
2 \left[ \, l' ( l' + 1 ) + l ( l + 1 ) \, \right] = ( l' + l + 1 )^2 + ( l' - l )^2 - 1
\tag{605}
$$
or
$$
\left[ ( l' + l + 1 )^2 - 1 \right] \left[ ( l' - l )^2 - 1 \right] = 0
\tag{607}
$$
The first factor in eqn(603) is always positive when l′ ≠ l, therefore, the electric
dipole selection rule becomes ∆l = ±1.
The actual values of the matrix elements can be found from explicit calculations. The θ-dependence of the matrix elements is governed by the associated
Legendre functions through
$$
\Theta^{l}_{m}(\theta) = \sqrt{ \frac{ ( 2 l + 1 ) \, ( l - m )! }{ 2 \, ( l + m )! } } \; P^{l}_{m}(\cos\theta)
\tag{608}
$$
for ∆m = 1, while for ∆m = −1 one finds
$$
\sin\theta \; \Theta^{l}_{m}(\theta) = \sqrt{ \frac{ ( l + m ) ( l + m - 1 ) }{ ( 2 l + 1 ) ( 2 l - 1 ) } } \; \Theta^{l-1}_{m-1} - \sqrt{ \frac{ ( l + 2 - m ) ( l + 1 - m ) }{ ( 2 l + 1 ) ( 2 l + 3 ) } } \; \Theta^{l+1}_{m-1}
\tag{613}
$$
The coefficients in the above equation have a similar form to the Clebsch-Gordan
coefficients. The dipole matrix elements can be evaluated by taking the matrix
elements of the above set of relations with $\Theta^{l'}_{m'}(\theta)^{*}$ and then using the orthogonality properties. The above three relations give rise to the selection rules for
the magnitude of the orbital angular momentum l
$$
l' = l \pm 1
\tag{615}
$$
Hence, not only have the selection rules on l been re-derived but the angular
integrations have also been evaluated.
What the above mathematics describes is how the spin angular momentum
of the emitted photon is combined with the orbital angular momentum of the
electron in the final state, so that total angular momentum is conserved. This
implies the selection rules, which require the magnitudes of the initial and final
electronic angular momenta l and l′ to satisfy the triangular inequality
$$
l' + 1 \geq l \geq | l' - 1 |
\tag{616}
$$
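Table 1: Matrix Elements of the Components of the Dipole Moment
$$
\begin{array}{ll|ccc}
l' & m' & x & y & z \\
\hline
l' = l + 1 & m' = m + 1 & \frac{1}{2} \sqrt{ \frac{ (l+2+m)(l+1+m) }{ (2l+1)(2l+3) } } & - \frac{i}{2} \sqrt{ \frac{ (l+2+m)(l+1+m) }{ (2l+1)(2l+3) } } & - \\
l' = l + 1 & m' = m & - & - & \sqrt{ \frac{ (l+1+m)(l+1-m) }{ (2l+1)(2l+3) } } \\
l' = l + 1 & m' = m - 1 & - \frac{1}{2} \sqrt{ \frac{ (l+2-m)(l+1-m) }{ (2l+1)(2l+3) } } & - \frac{i}{2} \sqrt{ \frac{ (l+2-m)(l+1-m) }{ (2l+1)(2l+3) } } & - \\
\hline
l' = l - 1 & m' = m + 1 & - \frac{1}{2} \sqrt{ \frac{ (l-m)(l-1-m) }{ (2l-1)(2l+1) } } & \frac{i}{2} \sqrt{ \frac{ (l-m)(l-1-m) }{ (2l-1)(2l+1) } } & - \\
l' = l - 1 & m' = m & - & - & \sqrt{ \frac{ (l+m)(l-m) }{ (2l-1)(2l+1) } } \\
l' = l - 1 & m' = m - 1 & \frac{1}{2} \sqrt{ \frac{ (l+m)(l-1+m) }{ (2l-1)(2l+1) } } & \frac{i}{2} \sqrt{ \frac{ (l+m)(l-1+m) }{ (2l-1)(2l+1) } } & -
\end{array}
$$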
Then, since the electric dipole carries angular momentum (1, µ), the Wigner-Eckart theorem reduces to
$$
\langle n' l' m' | \, V^{\mu} \, | n l m \rangle = \frac{1}{ \sqrt{ 2 l' + 1 } } \; \langle l, m ; 1, \mu | l' m' \rangle \; \langle n' l' \| V \| n l \rangle
\tag{618}
$$
where the first factor, which represents the angular integration, is a Clebsch-Gordan coefficient and the second factor is the reduced matrix element which
does depend on the form of the particular vector, but is independent of any
choice of coordinate system. Furthermore, the Wigner-Eckart theorem yields
the selection rules for the electric dipole transition
$$
l + l' \geq 1 \geq | l - l' |
\tag{619}
$$
Exercise:
Using the commutation relations for the j-th component of a vector $\hat{V}^{j}$ with
the i-th component of the orbital angular momentum $\hat{L}^{i}$,
$$
[ \hat{L}^{i} , \hat{V}^{j} ] = i \hbar \sum_{k} \xi^{i,j,k} \, \hat{V}^{k}
\tag{620}
$$
= − 2 i h̄ L̂ ∧ V̂ − i h̄ V̂ (621)
(622)
The parity operator is its own inverse since for any state ψ(r)
P̂ 2 ψ(r) = P̂ ψ(−r)
= ψ(r) (624)
Therefore, the parity operator has eigenvalues p = ±1 for the eigenstates which
are defined by
P̂ φp (r) = p φp (r) (625)
so
P̂ 2 φp (r) = p2 φp (r) = φp (r) (626)
which yields p2 = 1 or p = ± 1. In polar coordinates, the parity operation is
equivalent to a reflection
θ → π − θ (627)
followed by a rotation
ϕ → ϕ + π (628)
In electromagnetic processes, parity is conserved since the Coulomb potential is
symmetric under reflection31 . Therefore, the parity operator P̂ commutes with
the Hamiltonian
[ P̂ , Ĥ ] = 0 (629)
and so one can find states | φn > that are simultaneous eigenstates of Ĥ and
P̂.
Ĥ | φn > = En | φn >
P̂ | φn > = pn | φn > (630)
31 The weak interaction does not conserve parity.
[Figure: under the parity operation r → −r, the polar coordinates transform as θ → π − θ and ϕ → π + ϕ.]
P̂ r P̂ −1 = − r (631)
Hence, for any matrix elements of r between any eigenstates of the parity operator,
one has
$$
p_{n'} \, p_{n} = - 1
\tag{633}
$$
This is known as the Laporte selection rule for electric dipole transitions32. The
validity of this selection rule follows from the fact that inversion commutes with the
orbital angular momentum operator. The spherical harmonics are eigenstates
of the parity operator since
same eigenvalue since the lowering operator (like any component of the angular
momentum) commutes with the parity operator. Therefore, one can use the
angular momentum selection rule to show that parity does change in an electric
dipole transition since
$$
( - 1 )^{l + l'} = - 1
\tag{636}
$$
The Laporte selection rule is satisfied since ∆l = ±1.
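and
$$
\begin{aligned}
\hat{\epsilon}_{2}(\mathbf{k}) \cdot \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle &= - \frac{i}{2} \exp\left[ - i \varphi_k \right] \langle n' l' m' | \, ( x + i y ) \, | n l m \rangle \\
&\quad + \frac{i}{2} \exp\left[ + i \varphi_k \right] \langle n' l' m' | \, ( x - i y ) \, | n l m \rangle
\end{aligned}
\tag{641}
$$
and
$$
\langle n' l' m' | \, z \, | n l m \rangle \propto \delta_{m'-m}
\tag{643}
$$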
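the cross-terms in the square of the matrix elements are zero. Hence, on summing over the polarizations, one finds that the $( \theta_k , \varphi_k )$ dependence of the decay
is governed by the dipole matrix elements through
$$
\begin{aligned}
\sum_{\alpha} | \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{r}_{nlm,n'l'm'} |^2 &= \frac{1}{4} \left( 1 + \cos^2\theta_k \right) | \langle n' l' m' | \, ( x + i y ) \, | n l m \rangle |^2 \\
&\quad + \frac{1}{4} \left( 1 + \cos^2\theta_k \right) | \langle n' l' m' | \, ( x - i y ) \, | n l m \rangle |^2 \\
&\quad + \sin^2\theta_k \; | \langle n' l' m' | \, z \, | n l m \rangle |^2
\end{aligned}
\tag{644}
$$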
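For l′ = l + 1, the above sum is found to depend on the angular factors
$$
\begin{aligned}
I^{m'}_{l'=l+1}(\theta_k, \varphi_k) &= \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \frac{ (l+2+m)(l+1+m) }{ (2l+1)(2l+3) } \; \delta_{m'-m-1} \\
&\quad + \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \frac{ (l+2-m)(l+1-m) }{ (2l+1)(2l+3) } \; \delta_{m'-m+1} \\
&\quad + \sin^2\theta_k \; \frac{ (l+1+m)(l+1-m) }{ (2l+1)(2l+3) } \; \delta_{m'-m}
\end{aligned}
\tag{645}
$$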
Since the z-component of the final electron's orbital angular momentum is not
measured, m′ should be summed over. The angular distribution of the emitted
radiation for the l′ = l + 1 transition when neither the polarization nor the final
state m′ value are measured is given by
$$
\sum_{m'} I^{m'}_{l'=l+1}(\theta_k, \varphi_k) = \frac{1}{2} \, \frac{ (l+2)(l+1) + m^2 }{ (2l+1)(2l+3) } \left( 1 + \cos^2\theta_k \right) + \frac{ (l+1)^2 - m^2 }{ (2l+1)(2l+3) } \, \sin^2\theta_k
\tag{646}
$$
This factor determines the angular dependence of the emitted electromagnetic
radiation, which clearly depends on the value of m specifying the initial electronic state. On rearranging the expression, one finds that the anisotropy is
governed by the factor
$$
\sum_{m'} I^{m'}_{l'=l+1}(\theta_k, \varphi_k) = \frac{ (l+2)(l+1) + m^2 }{ (2l+1)(2l+3) } + \frac{1}{2} \, \frac{ l(l+1) - 3 m^2 }{ (2l+1)(2l+3) } \, \sin^2\theta_k
\tag{647}
$$
which shows that for m = 0 the photons are preferentially emitted perpendicular
to the direction of quantization axis since this maximizes the overlap between
the polarization and the dipole matrix element. In the opposite case of large
values of $m^2$ [ $3 m^2 > l ( l + 1 )$ ], one finds that the photons are preferentially
emitted parallel (or anti-parallel) to the axis of quantization. On integrating
over all directions of the emitted photon, one obtains
$$
\frac{1}{4 \pi} \int d\Omega_k \sum_{m'} I^{m'}_{l'=l+1}(\theta_k, \varphi_k) = \frac{ 2 \, ( l + 1 ) }{ 3 \, ( 2 l + 1 ) }
\tag{648}
$$
The independence of the result on m follows since, in this case, there are no angular correlations and the choice of direction of quantization of m is completely
arbitrary. The total decay rate for an electron in a state with fixed m due to
an l′ = l + 1 transition is given by
$$
\frac{1}{\tau_{l'=l+1}} = \frac{ 4 \, e^2 \, \omega^3_{nl,n'l'} }{ 3 \hbar c^3 } \; \frac{ ( l + 1 ) }{ ( 2 l + 1 ) } \left| \int_0^{\infty} dr \, r^2 \, R^{*}_{n' l+1}(r) \, r \, R_{nl}(r) \right|^2
\tag{649}
$$
However, if the initial electronic state is unpolarized, then one should statistically average over the initial m. In this case, the emitted radiation becomes
isotropic
$$
\frac{1}{2 l + 1} \sum_{m=-l}^{l} \sum_{m'} I^{m'}_{l'=l+1}(\theta_k, \varphi_k) = \frac{ 2 \, ( l + 1 ) }{ 3 \, ( 2 l + 1 ) }
\tag{650}
$$
since
$$
\frac{1}{2 l + 1} \sum_{m=-l}^{l} m^2 = \frac{1}{3} \, l ( l + 1 )
\tag{651}
$$
for l′ = l + 1. This is the same result that was previously obtained for the decay
rate of a level with a specific m value, when the m′ value of the final state and
the polarization or direction of the emitted photon are not measured.
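The chain of results from eqn(645) to eqn(651) can be verified with exact rational arithmetic. The sketch below writes the summed pattern as A + B sin²θ, using (1 + cos²θ)/4 = 1/2 − sin²θ/4, and then checks the solid-angle average and the isotropy of the m-averaged pattern. The function names are illustrative assumptions and are not notation from the notes.

```python
from fractions import Fraction


def angular_weights(l, m):
    """Coefficients multiplying the three angular factors in eqn (645)."""
    denom = (2 * l + 1) * (2 * l + 3)
    up = Fraction((l + 2 + m) * (l + 1 + m), denom)    # m' = m + 1
    down = Fraction((l + 2 - m) * (l + 1 - m), denom)  # m' = m - 1
    same = Fraction((l + 1 + m) * (l + 1 - m), denom)  # m' = m
    return up, down, same


def summed_pattern(l, m):
    """Return (A, B) with sum over m' of I = A + B sin^2(theta), cf. eqn (647)."""
    up, down, same = angular_weights(l, m)
    return (up + down) / 2, same - (up + down) / 4


def solid_angle_average(l, m):
    """Left-hand side of eqn (648); the average of sin^2 over the sphere is 2/3."""
    a, b = summed_pattern(l, m)
    return a + Fraction(2, 3) * b


def m_averaged_anisotropy(l):
    """Average over the initial m of the sin^2 coefficient B."""
    return sum(summed_pattern(l, m)[1] for m in range(-l, l + 1)) / (2 * l + 1)


for l in range(4):
    print(l, [solid_angle_average(l, m) for m in range(-l, l + 1)],
          m_averaged_anisotropy(l))
```

For every l the solid-angle average is the same for all m, and the m-averaged anisotropy coefficient is zero, which is the isotropy statement of eqn(650).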
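For the case where l′ = l − 1, one finds that the decay rate involves the
angular factor
$$
\begin{aligned}
I^{m'}_{l'=l-1}(\theta_k, \varphi_k) &= \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \frac{ (l-m)(l-1-m) }{ (2l-1)(2l+1) } \; \delta_{m'-m-1} \\
&\quad + \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \frac{ (l-1+m)(l+m) }{ (2l-1)(2l+1) } \; \delta_{m'-m+1} \\
&\quad + \sin^2\theta_k \; \frac{ (l+m)(l-m) }{ (2l-1)(2l+1) } \; \delta_{m'-m}
\end{aligned}
\tag{653}
$$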
which on summing over the final values of m′ yields the angular dependence of
the radiation field
$$
\sum_{m'} I^{m'}_{l'=l-1}(\theta_k, \varphi_k) = \frac{1}{2} \, \frac{ l(l-1) + m^2 }{ (2l-1)(2l+1) } \left( 1 + \cos^2\theta_k \right) + \frac{ l^2 - m^2 }{ (2l-1)(2l+1) } \, \sin^2\theta_k
\tag{654}
$$
The anisotropy of the emitted radiation is determined by the factor
$$
\sum_{m'} I^{m'}_{l'=l-1}(\theta_k, \varphi_k) = \frac{ l(l-1) + m^2 }{ (2l-1)(2l+1) } + \frac{1}{2} \, \frac{ l(l+1) - 3 m^2 }{ (2l-1)(2l+1) } \, \sin^2\theta_k
\tag{655}
$$
which shows that for $m = 0$ the photons are preferentially emitted perpendicular
to the quantization axis, since this maximizes the overlap between
the polarization and the dipole matrix element. In the opposite case of larger
$m^2$ [ $3 m^2 > l (l + 1)$ ], one finds that the photons are preferentially emitted
parallel (or anti-parallel) to the axis of quantization.
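The algebra connecting eqn(653), eqn(654) and the isotropic average can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original derivation; the function names are illustrative. It sums the three terms of eqn(653) over $m'$ and then averages over the initial $m$, using the fact that $A (1 + \cos^2\theta_k) + B \sin^2\theta_k$ is independent of angle only if $A = B$.

```python
from fractions import Fraction


def angular_coefficients(l, m):
    """Coefficients appearing in eqn (653) for l' = l - 1.

    Returns (up, down, same): the factors multiplying (1 + cos^2 theta_k)
    for m' = m + 1 and m' = m - 1, and the factor multiplying
    sin^2 theta_k for m' = m.
    """
    denom = (2 * l - 1) * (2 * l + 1)
    up = Fraction((l - m) * (l - 1 - m), 4 * denom)
    down = Fraction((l - 1 + m) * (l + m), 4 * denom)
    same = Fraction((l + m) * (l - m), denom)
    return up, down, same


def summed_over_final_m(l, m):
    """Return (A, B) with sum over m' of I = A (1 + cos^2) + B sin^2."""
    up, down, same = angular_coefficients(l, m)
    return up + down, same


def averaged_over_initial_m(l):
    """Average (A, B) over the 2l + 1 initial values of m."""
    pairs = [summed_over_final_m(l, m) for m in range(-l, l + 1)]
    a = sum(p[0] for p in pairs) / (2 * l + 1)
    b = sum(p[1] for p in pairs) / (2 * l + 1)
    return a, b


if __name__ == "__main__":
    for l in range(1, 5):
        a, b = averaged_over_initial_m(l)
        # a == b signals an isotropic distribution with value 2 a
        print(l, a, b, 2 * a)
```

For each $l$ the averaged coefficients coincide, and the isotropic value $2A$ equals $\frac{2}{3} \frac{l}{(2l+1)}$, which is the angular factor that enters the $l' = l - 1$ decay rate.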
Therefore, the decay rate in which the photon is emitted in any direction is
given by the expression
\[
\frac{1}{\tau_{l'=l-1}} = \frac{4 e^2 \omega^3_{nl,n'l'}}{3 \hbar c^3} \, \frac{l}{(2l+1)} \left| \int_0^\infty dr \, r^2 \, R^*_{n'l-1}(r) \, r \, R_{nl}(r) \right|^2 \tag{657}
\]
for $l' = l - 1$.
Classical Interpretation.
The quantum mechanical results for the angular distribution of the radiation
can be understood in terms of a simple classical model of the atom. In Bohr’s
model, a single electron orbits a central nucleus to which it is bound by the
attractive Coulomb potential. We shall assume that the radius of the orbit is a
and that the electron is performing a circular orbit in the x − y plane. Since the
direction of the electron’s orbital angular momentum is aligned with the z-axis,
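it corresponds to the case where $m \approx l$ and $l \gg 1$. In this case, the electron
has an oscillating dipole moment given by
\[
d(t) = q a \left( \cos\omega t \; \hat{e}_x + \sin\omega t \; \hat{e}_y \right) = q a \, \Re e \left[ \left( \hat{e}_x - i \hat{e}_y \right) \exp[ i \omega t ] \right] \tag{658}
\]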
This rotating dipole moment can be decomposed into two orthogonal linear
dipole moments which oscillate out of phase with each other. It should be
recalled that a classical oscillating (linear) electric dipole moment radiates power
P (ω) into a solid angle dΩk with a distribution given by
\[
\left( \frac{dP}{d\Omega_k} \right)_{linear} = \frac{c}{8\pi} \left( \frac{\omega}{c} \right)^4 | d |^2 \sin^2\Theta_{kd} \tag{659}
\]

Figure 20: A classical electron orbiting in the x-y plane (m = l) can be considered as producing two perpendicular linearly-oscillating electric-dipole moments.

where $\Theta_{kd}$ is the angle between the detector and the direction of the electric
dipole. On considering the radiation from the atom to be generated from two

Figure 21: The polarization of the radiated electromagnetic field for an electron orbiting in the x-y plane (m = l) can be comprehended in terms of the classical radiation emanating from two linearly oscillating electric-dipole moments. The angles $\Theta_{kx}$ and $\Theta_{ky}$, respectively, are the angles between the emitted radiation and the x-axis and the angle subtended by the emitted radiation and the y-axis.

which on using
cos Θkx = sin θk cos ϕk
cos Θky = sin θk sin ϕk (661)
becomes
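\[
\left( \frac{dP}{d\Omega_k} \right)_{dipole} = \frac{c}{8\pi} \left( \frac{\omega}{c} \right)^4 | d |^2 \left( 1 + \cos^2\theta_k \right) \tag{662}
\]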
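The step from two linear dipoles to the $( 1 + \cos^2\theta_k )$ pattern can be checked numerically. The sketch below assumes that the time-averaged powers of the x- and y-dipoles simply add, since the two oscillate a quarter of a cycle out of phase. The function names are illustrative only.

```python
import math


def linear_dipole_pattern(cos_angle):
    """sin^2 of the angle between the emission direction and the dipole, as in eqn (659)."""
    return 1.0 - cos_angle ** 2


def rotating_dipole_pattern(theta_k, phi_k):
    """Sum of the patterns of the x- and y-dipoles, using the direction cosines of eqn (661)."""
    cos_kx = math.sin(theta_k) * math.cos(phi_k)
    cos_ky = math.sin(theta_k) * math.sin(phi_k)
    return linear_dipole_pattern(cos_kx) + linear_dipole_pattern(cos_ky)


if __name__ == "__main__":
    for theta_k, phi_k in [(0.3, 0.1), (1.2, 2.5), (2.8, 4.0)]:
        print(rotating_dipole_pattern(theta_k, phi_k), 1.0 + math.cos(theta_k) ** 2)
```

The two printed columns agree for any choice of $\varphi_k$, which reflects the axial symmetry of the circular orbit about the z-axis.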
Since the energy of the emitted photon is given by h̄ ω, one finds the angular
dependence of the semi-classical prediction of the decay rate is given by
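\[
\frac{1}{\tau_{d\Omega_k}} = \frac{1}{8\pi} \left( \frac{e^2}{\hbar a} \right) \left( \frac{\omega a}{c} \right)^3 \left( 1 + \cos^2\theta_k \right) d\Omega_k \tag{663}
\]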
The polarization vector is parallel to the direction of the electric field, which in
turn is given by the direction of the oscillating dipole that produced it. Hence,
The angular dependence of the decay rate follows directly from the expressions
of eqn(646) and eqn(654) by setting $m \approx l \gg 1$, replacing the radial matrix
elements of r by a, adding the expressions and inserting them into eqn(637). The
analysis shows that quantum mechanics reproduces the classical limit correctly,
as is expected from the correspondence principle.
10.1.5 The Decay Rate from Dipole Transitions.
The decay rate due to dipole transitions includes processes in which photons
of all polarizations are emitted in all directions. Accordingly, the decay rate
is found by summing over all polarizations and integrating over the directions
of the emitted photon. For a spherically symmetric system, the energy will
be independent of the z-component of the orbital angular momentum. In this
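case, one should sum over all values of $m'$ corresponding to the degenerate final
states. On summing over all final states corresponding to a specific $l'$ value, that
is on summing over $m'$ where $m' = m, m \pm 1$, one finds that the transition
rate can be expressed as
\[
\frac{1}{\tau} = \frac{4 e^2 \omega^3_{nl,n'l'}}{3 \hbar c^3} \left\{ \begin{array}{c} \frac{(l+1)}{(2l+1)} \\ \frac{l}{(2l+1)} \end{array} \right\} \left| \int_0^\infty dr \, r^2 \, R^*_{n'l'}(r) \, r \, R_{nl}(r) \right|^2 \tag{664}
\]
for
\[
l' = \left\{ \begin{array}{c} l + 1 \\ l - 1 \end{array} \right. \tag{665}
\]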
It should be noted that, for a fixed $l'$, the lifetime of the state $| nlm \rangle$ is independent
of the value of $m$. This is expected since the choice of the quantization
direction is completely arbitrary.
There are no selection rules associated with the radial integration in the
dipole matrix elements
\[
\int_0^\infty dr \, r^2 \, R_{n',l-1} \, r \, R_{n,l} \tag{666}
\]
The radial part of the dipole matrix element can be expressed in terms of the
hypergeometric function $F(a, b, c; z)$ via
\[
\int_0^\infty dr \, r^2 \, R_{n',l-1} \, r \, R_{n,l} = a \, \frac{(-1)^{n'-l}}{4 \, (2l-1)!} \sqrt{ \frac{(n+l)! \, (n'+l-1)!}{(n-l-1)! \, (n'-l)!} } \; \frac{(4 n' n)^{l+1} \, (n - n')^{n'+n-2l-2}}{(n'+n)^{n'+n}}
\]
\[
\times \left[ F\left( l+1-n, \, l-n', \, 2l; \, - \frac{4 n' n}{(n'-n)^2} \right) - \left( \frac{n'-n}{n'+n} \right)^2 F\left( l-1-n, \, l-n', \, 2l; \, - \frac{4 n' n}{(n'-n)^2} \right) \right] \tag{667}
\]
Simple analytic expressions for the squares of the matrix elements for small values
of $(n', l)$ are shown in Table(3).
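Table 2: Radial wave functions $R_{n,l}(\rho)$ for a Hydrogenic-like atom, where $\rho = \frac{Z r}{a}$. The functions are normalized so that $\int_0^\infty d\rho \, \rho^2 \, R_{nl} R_{nl} = 1$.

n = 1   l = 0   $2 \exp[ - \rho ]$
n = 2   l = 0   $\frac{1}{\sqrt{2}} \left( 1 - \frac{\rho}{2} \right) \exp[ - \frac{\rho}{2} ]$
        l = 1   $\frac{1}{2 \sqrt{6}} \, \rho \, \exp[ - \frac{\rho}{2} ]$
n = 3   l = 0   $\frac{2}{3^{3/2}} \left( 1 - \frac{2}{3} \rho + \frac{2}{27} \rho^2 \right) \exp[ - \frac{\rho}{3} ]$
        l = 1   $\frac{2^{5/2}}{3^{7/2}} \left( 1 - \frac{\rho}{6} \right) \rho \, \exp[ - \frac{\rho}{3} ]$
        l = 2   $\frac{2^{3/2}}{3^{9/2} \sqrt{5}} \, \rho^2 \, \exp[ - \frac{\rho}{3} ]$
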
Table 3: Values of $| \int_0^\infty dr \, r^2 \, R_{nl} \, r \, R_{n'l-1} |^2$ in atomic units.

n, l    n', l - 1
np      1s          $2^8 \, n^7 \, (n-1)^{2n-5} \, (n+1)^{-2n-5}$

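The np to 1s entry of Table 3 can be cross-checked against the radial functions of Table 2. The sketch below is an illustrative check rather than the method used in the text. It uses the elementary integral $\int_0^\infty d\rho \, \rho^k \exp[ - \beta \rho ] = k! / \beta^{k+1}$, sets $Z = 1$ and works in units where $a = 1$. The function names are assumptions.

```python
import math


def exp_moment(k, beta):
    """Integral of rho^k exp(-beta rho) from 0 to infinity, equal to k! / beta^(k+1)."""
    return math.factorial(k) / beta ** (k + 1)


def radial_dipole_2p_1s():
    """Integral of rho^3 R_10 R_21 using the Table 2 functions."""
    norm = 2.0 * (1.0 / (2.0 * math.sqrt(6.0)))   # product of the two prefactors
    return norm * exp_moment(4, 1.0 + 0.5)         # rho^3 * rho, exponents add


def radial_dipole_3p_1s():
    """Integral of rho^3 R_10 R_31 using the Table 2 functions."""
    norm = 2.0 * (2.0 ** 2.5 / 3.0 ** 3.5)
    beta = 1.0 + 1.0 / 3.0
    # R_31 contains the factor rho (1 - rho / 6)
    return norm * (exp_moment(4, beta) - exp_moment(5, beta) / 6.0)


def closed_form_np_1s(n):
    """Squared radial matrix element for np -> 1s from Table 3 (valid for n >= 2)."""
    return 2 ** 8 * n ** 7 * (n - 1) ** (2 * n - 5) / (n + 1) ** (2 * n + 5)


if __name__ == "__main__":
    print(radial_dipole_2p_1s() ** 2, closed_form_np_1s(2))
    print(radial_dipole_3p_1s() ** 2, closed_form_np_1s(3))
```

The squared integrals agree with the closed form for n = 2 and n = 3, and the n = 2 value is the one that enters the 2p to 1s decay rate.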
of the hydrogenic states increases with increasing $n$, varying roughly as $n^3$ for
a fixed value of l. The decrease in the dipole matrix elements with increasing n
is simply due to the increasing numbers of nodes in the radial wave functions.
so the decay time is approximately eight orders of magnitude larger than the
time taken for the photon to cross the atom. When averaged over l, the electric
dipole decay rate is given by
\[
\frac{1}{\tau_n} \propto \sum_l \frac{(2l+1)}{n^5} \sim n^{-\frac{9}{2}} \tag{672}
\]
so, as seen in Table(4), the decay is slower for the higher energy levels.
Table 4: Electric Dipole Transition Rates for Hydrogen, in units of $10^8$ sec$^{-1}$.

Initial   Final   n = 1   n = 2   n = 3
2p        ns      6.25    -       -
3s        np      -       0.063   -
3p        ns      1.64    0.22    -
3d        np      -       0.64    -
4s        np      -       0.025   0.018
4p        ns      0.68    0.095   0.030
4p        nd      -       -       0.003
4d        np      -       0.204   0.070
4f        nd      -       -       0.137

\[
a = \frac{\hbar^2}{m e^2} \tag{675}
\]
The decay rate in the Fermi-Golden rule, evaluated in the dipole approximation,
is given by
\[
\frac{1}{\tau} = \frac{4 \, \omega^3_{1,2}}{3 \hbar c^3} \; d^*_{1s,2p} \cdot d_{1s,2p} \tag{676}
\]
The frequency is evaluated from
Therefore, the scattering rate is determined by the ratio $\frac{c}{a}$ but also is modified
by the fourth power of the dimensionless electromagnetic coupling strength
\[
\left( \frac{e^2}{\hbar c} \right) \approx \frac{1}{137.0359979} \tag{679}
\]
The smallness of this factor allows us to only consider the Fermi-Golden rule
expression for the decay rate. The dimensionless dipole matrix elements are
expected to be non-zero, since they obey the selection rules. They are non-zero,
as can be directly verified by performing an integration. The only non-zero
dipole matrix element originates from the z-component of the dipole
\[
d_{1s,2p} = e \int d^3r \; \psi^*_{1s}(r) \, r \, \psi_{2p}(r) \tag{680}
\]
since only the z-component satisfies the ∆m = 0 selection rule. The angular
integration is evaluated as
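\[
\frac{\sqrt{3}}{4\pi} \int_0^{2\pi} d\varphi \int_0^{\pi} d\theta \, \sin\theta \, \cos^2\theta = \frac{\sqrt{3}}{4\pi} \, 2\pi \, \frac{2}{3} = \frac{1}{\sqrt{3}} \tag{681}
\]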
and the radial integration yields
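\[
\int_0^\infty dr \, r^2 \, R^*_{1s}(r) \, r \, R_{2p}(r) = \frac{2}{2 \sqrt{6}} \int_0^\infty dr \, \frac{r^4}{a^4} \exp\left[ - \frac{3 r}{2 a} \right]
\]
\[
= \frac{a}{\sqrt{6}} \left( \frac{2}{3} \right)^5 \int_0^\infty dx \, x^4 \exp[ - x ]
\]
\[
= a \, \frac{4!}{\sqrt{6}} \left( \frac{2}{3} \right)^5 = 4 a \sqrt{6} \left( \frac{2}{3} \right)^5 \tag{682}
\]
Hence, the time scale $\tau$ is of the order of $10^{-10}$ seconds. The exact value of the
decay time is calculated to be $1.6 \times 10^{-9}$ seconds.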
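The quoted decay time can be reproduced by inserting the angular factor of eqn(681) and the radial integral of eqn(682) into eqn(676). The following is a rough numerical sketch. It assumes the hydrogenic energy difference $\hbar \omega_{1,2} = \frac{3}{8} \frac{e^2}{a}$, the CGS values of $c$ and $a$ given in the code comments, and it rewrites the rate as $\frac{1}{\tau} = \frac{4}{3} ( \frac{\omega a}{c} )^3 ( \frac{e^2}{\hbar c} ) \frac{c}{a} \frac{| d |^2}{e^2 a^2}$. The constant and function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.0359979       # e^2 / (hbar c), the value quoted in eqn (679)
C_CM_PER_S = 2.99792458e10      # speed of light in cm/s (assumed value)
BOHR_RADIUS_CM = 0.529177e-8    # a = hbar^2 / (m e^2) in cm (assumed value)


def decay_rate_2p_to_1s():
    """Estimate of the 2p (m = 0) -> 1s electric dipole decay rate in sec^-1."""
    radial = 4.0 * math.sqrt(6.0) * (2.0 / 3.0) ** 5   # eqn (682), in units of a
    angular = 1.0 / math.sqrt(3.0)                      # eqn (681)
    dipole_sq = (radial * angular) ** 2                 # |d|^2 in units of (e a)^2
    omega_a_over_c = (3.0 / 8.0) * ALPHA                # assumes hbar omega = (3/8) e^2 / a
    return (4.0 / 3.0) * omega_a_over_c ** 3 * ALPHA * dipole_sq * C_CM_PER_S / BOHR_RADIUS_CM


if __name__ == "__main__":
    rate = decay_rate_2p_to_1s()
    print(rate, 1.0 / rate)
```

With these inputs the rate comes out close to the 2p entry of Table(4), and its inverse is consistent with the decay time quoted above.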
10.1.7 Electric Quadrupole and Magnetic Dipole Transitions.
Consider decays such as the 3d state (with m = 0) to the 1s state in the hydrogen
atom. Since, in this transition, the change in the electron’s angular momentum
is two units, the transition is forbidden in the dipole approximation. Therefore,
the transition rate is evaluated by keeping the next order term in the expansion
\[
\exp[ - i k \cdot r ] \approx 1 - i k \cdot r + \ldots \tag{685}
\]
The second term in the expansion describes electric quadrupole and magnetic
dipole transitions.
This shall be written as the sum of two terms, with different symmetries with re-
spect to interchange of r and p. These two terms will describe electric quadrupole
and magnetic dipole transitions. The matrix elements are written as the sum
of a term symmetric under the interchange of r and p̂ and a term that is anti-
symmetric
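\[
( k \cdot r ) ( \hat{\epsilon}_\alpha(k) \cdot \hat{p} ) = \frac{1}{2} \left[ ( k \cdot r ) ( \hat{\epsilon}_\alpha(k) \cdot \hat{p} ) + ( k \cdot \hat{p} ) ( \hat{\epsilon}_\alpha(k) \cdot r ) \right]
\]
\[
+ \frac{1}{2} \left[ ( k \cdot r ) ( \hat{\epsilon}_\alpha(k) \cdot \hat{p} ) - ( k \cdot \hat{p} ) ( \hat{\epsilon}_\alpha(k) \cdot r ) \right] \tag{687}
\]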
The first term represents the matrix elements for the electric quadrupole transitions$^{36}$,
and the second term represents the matrix elements for the magnetic dipole
transitions. The first term can be written as the scalar products of a symmetric
dyadic
\[
k \cdot \left( r \, \hat{p} + \hat{p} \, r \right) \cdot \hat{\epsilon}_\alpha(k) \tag{688}
\]
The scalar products are organized such that the left most vector outside the
parenthesis forms a scalar product with the left most vectors within the paren-
thesis, and likewise with the right most vectors. The electronic matrix elements
only involve the dyadic operator, as the wave vector and polarization vectors
are properties of the photon. The matrix elements
\[
\langle n'l'm' | \; r \, \hat{p} + \hat{p} \, r \; | nlm \rangle \tag{689}
\]
\[
[ \, r \, , \, \hat{p}^2 \, ] = 2 i \hbar \, \hat{p} \tag{690}
\]
36 J. A. Gaunt and W. H. McCrea, Proc. Camb. Phil. Soc. 23, 930 (1927).
which allows the momentum operator to be written as
\[
\hat{p} = \frac{i m}{\hbar} \, [ \, \hat{H} \, , \, r \, ] \tag{691}
\]
Therefore, the matrix elements of the dyadic can be expressed in the form of
the matrix elements of the commutator with the dyadic
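\[
\langle n'l'm' | \; r \, \hat{p} + \hat{p} \, r \; | nlm \rangle = \frac{i m}{\hbar} \, \langle n'l'm' | \, [ \, \hat{H} \, , \, r \, r \, ] \, | nlm \rangle
\]
\[
= i m \, \frac{( E_{n'l'm'} - E_{nlm} )}{\hbar} \, \langle n'l'm' | \, r \, r \, | nlm \rangle
\]
\[
= i m \, \omega_{n',n} \, \langle n'l'm' | \, r \, r \, | nlm \rangle \tag{692}
\]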
The decay rate in the Fermi-Golden rule, evaluated in the electric quadrupole
approximation, is given by
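\[
\frac{1}{\tau} = \left( \frac{e}{m c} \right)^2 \int d^3k \; \frac{m^2 c^2 \omega_k^2}{8 \pi \omega_k} \sum_\alpha | \, k \cdot \langle n'l'm' | \, r \, r \, | nlm \rangle \cdot \hat{\epsilon}_\alpha(k) \, |^2 \; \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k )
\]
\[
= \frac{e^2}{8 \pi \hbar} \left( \frac{\omega_{nl,n'l'}}{c} \right)^5 \int d\Omega_k \sum_\alpha | \, \hat{k} \cdot \langle n'l'm' | \, r \, r \, | nlm \rangle \cdot \hat{\epsilon}_\alpha(k) \, |^2 \tag{693}
\]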
Therefore, the scattering rate is determined by the ratio $\frac{c}{a}$ but also is modified
by the sixth power of the dimensionless electromagnetic coupling strength
\[
\left( \frac{e^2}{\hbar c} \right) \approx \frac{1}{137.0359979} \tag{697}
\]
The smallness of this factor allows us to only consider the Fermi-Golden rule
expression for the decay rate. The dimensionless quadrupole matrix elements
are expected to be non-zero, since they obey the selection rules which involve
the exchange of two units of angular momentum. They are non-zero, as can
be directly verified by performing an integration. Therefore, the quadrupole
allowed decay rate is given by
\[
\frac{1}{\tau} \approx \left( \frac{e^2}{\hbar c} \right)^6 \left( \frac{c}{a} \right) \tag{698}
\]
one can add a diagonal term to the dyadic without affecting the result. A
diagonal term with a magnitude that makes the resulting dyadic traceless is
added to the dyadic, leading to the expression
\[
Q_{i,j} = e \left( x_i \, x_j - \frac{1}{3} \, \delta_{i,j} \, | r |^2 \right) \tag{700}
\]
The symmetric dyadic $Q_{i,j}$ has six inequivalent components, which because of
the restriction that dyadic is traceless, can be reduced to five independent components.
Due to the transformational properties of the dyadic under rotation,
it can be expressed as a linear combination of the spherical harmonics $Y^2_m(\theta, \varphi)$
and nothing else$^{38}$. This can be seen from rewriting the quadrupole tensor $\tilde{Q}$
\[
\frac{\tilde{Q}}{e} = \left( \begin{array}{ccc} x x - \frac{r^2}{3} & x y & x z \\ y x & y y - \frac{r^2}{3} & y z \\ z x & z y & z z - \frac{r^2}{3} \end{array} \right) \tag{702}
\]
37 This estimate will be modified upwards by several orders of magnitude, due to the presence
in terms of spherical polar coordinates
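\[
\frac{\tilde{Q}}{e \, r^2} = \left( \begin{array}{ccc}
\frac{1}{2} \sin^2\theta \cos 2\varphi - \frac{1}{6} ( 3 \cos^2\theta - 1 ) & \frac{1}{2} \sin^2\theta \sin 2\varphi & \sin\theta \cos\theta \cos\varphi \\
\frac{1}{2} \sin^2\theta \sin 2\varphi & - \frac{1}{2} \sin^2\theta \cos 2\varphi - \frac{1}{6} ( 3 \cos^2\theta - 1 ) & \sin\theta \cos\theta \sin\varphi \\
\sin\theta \cos\theta \cos\varphi & \sin\theta \cos\theta \sin\varphi & \frac{1}{3} ( 3 \cos^2\theta - 1 )
\end{array} \right) \tag{703}
\]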
The presence of states with orbital angular momentum of only two makes the
dyadic an irreducible second rank tensor. Application of the Wigner-Eckart
theorem to an irreducible second rank tensor results in the electric quadrupole
selection rules
l + l0 ≥ 2 ≥ | l − l0 | (704)
The angular momentum carried away by the photon consists of the spin-one
carried away by the photon in addition to the component of the photon’s wave
function described by the spherical Bessel function j1 (kr) ∼ k r which carries
off one unit of orbital angular momentum. In addition to the angular momen-
tum selection rules, there are parity selection rules for the electric quadrupole
transitions. Since the parity operator satisfies
\[
\hat{P} \, r \, \hat{P}^{-1} = - r \tag{705}
\]
The magnetic dipole transition should be extended from orbital angular mo-
mentum to include the spin magnetic moment which is of the same order
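\[
\frac{e \hbar}{2 m c} \; \sigma \cdot ( k \wedge \hat{\epsilon}_\alpha(k) )
\]
\[
\frac{e}{2 m c} \; ( r \wedge p ) \cdot ( k \wedge \hat{\epsilon}_\alpha(k) ) \tag{712}
\]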
since orbital angular momentum is quantized in units of h̄. The angular mo-
mentum selection rule for the magnetic dipole transition is given by
∆l = 0 (713)
and
1 ≥ | ∆m | (714)
also parity does not change.
Terms with higher-order orbital angular momentum that occur in the expansion
of the photon's wave function $\exp[ i k \cdot r ]$ can be found by using the
Rayleigh expansion. The terms with orbital angular momentum $l$ are proportional
to the spherical Bessel functions $j_l(kr)$ which vary as $(kr)^l$ when $kr \to 0$, as is
found from the expansion of the exponential term. The presence of the extra
factors $k^l$ in the matrix element has the result that the electric $2^s$-th multi-pole
transition rates are found to vary as
\[
\frac{1}{\tau} \propto \left( \frac{\omega_{n,n'}}{c} \right)^{2s+1} a^{2s} \tag{715}
\]
where $s$ is the magnitude of the change in the electronic orbital angular momentum,
which satisfies the inequality
\[
( l + l' ) \ge s \ge | l' - l | \tag{716}
\]
The extra factors from the photon's angular momentum result in an overall decrease
in the electric multi-pole transition rate by a factor of $( \frac{e^2}{\hbar c} )^{2s}$. It should
also be noted that the relative strength of the higher-order electric multi-pole
transitions increases more rapidly with $Z$ than the electric dipole transitions.
Therefore, it is frequently found that the quadrupole transitions cannot be ne-
glected for the heavy elements. Alternatively, higher-order multipole transitions
do become important in the x-ray region, since in this region the wavelength of
the radiation is comparable to the spatial extent of the charged particle’s wave
function.
transition. The electric quadrupole transition rate can be expressed as
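\[
\frac{1}{\tau} = \frac{1}{8 \pi \hbar} \left( \frac{\omega}{c} \right)^5 \int d\Omega_k \sum_\alpha | \, \hat{k} \cdot \langle 1s | \, \tilde{Q} \, | 3d \rangle \cdot \hat{\epsilon}_\alpha(k) \, |^2 \tag{717}
\]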
where $\tilde{Q}$ represents the quadrupole tensor. The frequency factor can be evaluated as
\[
\frac{\omega}{c} = \frac{4}{9 a} \left( \frac{e^2}{\hbar c} \right) \tag{718}
\]
hence, the rate can be expressed in the form
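\[
\frac{1}{\tau} = \frac{c}{8 \pi a} \left( \frac{4}{9} \right)^5 \left( \frac{e^2}{\hbar c} \right)^6 \int d\Omega_k \sum_\alpha \frac{1}{e^2 a^4} \, | \, \hat{k} \cdot \langle 1s | \, \tilde{Q} \, | 3d \rangle \cdot \hat{\epsilon}_\alpha(k) \, |^2 \tag{719}
\]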
We shall consider the transition from the m = 0 state of the 3d level to the 1s
state. As can be easily shown, the matrix elements of quadrupole tensor for this
transition are diagonal and are given by
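\[
\langle 1s | \, \tilde{Q} \, | 3d \rangle = \langle 1s | \left( \begin{array}{ccc} - \frac{Q_{zz}}{2} & 0 & 0 \\ 0 & - \frac{Q_{zz}}{2} & 0 \\ 0 & 0 & Q_{zz} \end{array} \right) | 3d \rangle \tag{720}
\]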
Therefore, the transition matrix elements are of the form
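\[
\int d\Omega_k \sum_\alpha \frac{1}{e^2 a^4} \, | \, \hat{k} \cdot \langle 1s | \, \tilde{Q} \, | 3d \rangle \cdot \hat{\epsilon}_\alpha(k) \, |^2
\]
\[
= \int d\Omega_k \sum_\alpha \frac{| \langle 1s | \, Q_{zz} \, | 3d \rangle |^2}{e^2 a^4} \left( \hat{k}_z \, \hat{\epsilon}_\alpha(k)_z - \frac{1}{2} \, \hat{k}_x \, \hat{\epsilon}_\alpha(k)_x - \frac{1}{2} \, \hat{k}_y \, \hat{\epsilon}_\alpha(k)_y \right)^2 \tag{721}
\]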
The direction of the emitted photon k̂ is expressed as
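\[
\hat{k} = ( \sin\theta_k \cos\varphi_k \, , \, \sin\theta_k \sin\varphi_k \, , \, \cos\theta_k ) \tag{722}
\]
and the polarization vectors are given by
\[
\hat{\epsilon}_1(k) = ( \cos\theta_k \cos\varphi_k \, , \, \cos\theta_k \sin\varphi_k \, , \, - \sin\theta_k )
\]
\[
\hat{\epsilon}_2(k) = ( - \sin\varphi_k \, , \, \cos\varphi_k \, , \, 0 ) \tag{723}
\]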
Therefore for the m = 0 level, one finds that the integral over the angular
distribution is given by
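\[
\int d\Omega_k \sum_\alpha \frac{1}{e^2 a^4} \, | \, \hat{k} \cdot \langle 1s | \, \tilde{Q} \, | 3d \rangle \cdot \hat{\epsilon}_\alpha(k) \, |^2
\]
\[
= \frac{| \langle 1s | \, Q_{zz} \, | 3d \rangle |^2}{e^2 a^4} \int d\Omega_k \left( - \frac{3}{2} \sin\theta_k \cos\theta_k \right)^2
\]
\[
= \frac{3}{10} \, 4 \pi \, \frac{| \langle 1s | \, Q_{zz} \, | 3d \rangle |^2}{e^2 a^4} \tag{724}
\]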
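The angular integral in eqn(724) can be verified numerically by building the vectors of eqn(722) and eqn(723) explicitly and summing the squared amplitude of eqn(721) over both polarizations. This is only an illustrative check that uses a simple midpoint rule. The function names and grid sizes are assumptions.

```python
import math


def polarization_sum(theta, phi):
    """Sum over both polarizations of the squared bracket in eqn (721)."""
    k_hat = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
    eps_1 = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    eps_2 = (-math.sin(phi), math.cos(phi), 0.0)
    total = 0.0
    for eps in (eps_1, eps_2):
        amplitude = k_hat[2] * eps[2] - 0.5 * k_hat[0] * eps[0] - 0.5 * k_hat[1] * eps[1]
        total += amplitude ** 2
    return total


def solid_angle_integral(n_theta=400, n_phi=32):
    """Midpoint-rule integral of polarization_sum over the full solid angle."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += polarization_sum(theta, phi) * math.sin(theta) * d_theta * d_phi
    return total


if __name__ == "__main__":
    print(solid_angle_integral() / (4.0 * math.pi))
```

The ratio to $4\pi$ reproduces the factor $\frac{3}{10}$, and only the polarization $\hat{\epsilon}_1(k)$ contributes, since the amplitude for $\hat{\epsilon}_2(k)$ vanishes identically.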
The scattering rate becomes
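\[
\frac{1}{\tau} = \frac{3}{20} \left( \frac{4}{9} \right)^5 \left( \frac{c}{a} \right) \left( \frac{e^2}{\hbar c} \right)^6 \frac{| \langle 1s | \, Q_{zz} \, | 3d \rangle |^2}{e^2 a^4} \tag{725}
\]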
Finally, one finds the resulting expression for the quadrupole decay rate of the
3d state with $m = 0$
\[
\frac{1}{\tau} = \frac{1}{3600} \left( \frac{c}{a} \right) \left( \frac{e^2}{\hbar c} \right)^6 \tag{727}
\]
which is evaluated as 228 sec$^{-1}$. From the above analysis, it is seen that the angular
distribution for the emitted photon is governed by the factor
and the intensity is largest for the cone with $\theta_k \approx 0.28 \pi$ or $0.72 \pi$. This angular
dependence of the emitted radiation is the same as found by considering the
radiation from an oscillating classical quadrupole, for which the radiated power
is given by
\[
\left( \frac{dP}{d\Omega_k} \right)_{quad} = \frac{c}{288 \pi} \left( \frac{\omega}{c} \right)^6 Q^2 \cos^2\theta_k \, \sin^2\theta_k \tag{729}
\]
\[
\sin^2\theta_k \tag{730}
\]
which is maximum for $\theta_k = \frac{\pi}{2}$.
Hence, the transition matrix element is given by
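\[
- i \, \frac{\hbar}{a} \int_0^\infty d^3r \; \psi^*_{2s}(r) \, \frac{\hat{\epsilon}_\alpha(k) \cdot r}{r} \, \exp[ - i k \cdot r ] \, \psi_{1s}(r) \tag{735}
\]
the factor $\hat{\epsilon}_\alpha(k) \cdot r$ only depends on x and y and is antisymmetric with respect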
to the transformations x → − x and y → − y. All other factors are even
functions of x and y. On integrating over the directions in the x − y plane, one
finds that the integral is identically zero.
The above result could have been (partially) anticipated by considering the
selection rules. The electric dipole transition is forbidden by parity. The mag-
netic dipole transition is zero in this non-relativistic treatment. All magnetic
and electric quadrupole and higher multipole transitions are forbidden by an-
gular momentum conservation.
The 2s state decays via two-photon emission which is described by the dia-
magnetic interaction and by the effect of the paramagnetic interaction taken to
second-order in time-dependent perturbation theory. Since only the part of the
paramagnetic interaction that creates a photon is involved, for our purposes the
paramagnetic interaction can be replaced by
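\[
\hat{H}_{para} \to - \frac{q}{m c} \sum_{k,\alpha} \left( \frac{2 \pi \hbar c^2}{V \omega_k} \right)^{\frac{1}{2}} \hat{p} \cdot \hat{\epsilon}_\alpha(k) \; a^\dagger_{k,\alpha} \exp[ - i k \cdot r ] \tag{737}
\]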
\[
\hat{H}_{dia} \to \frac{q^2}{2 m c^2} \sum_{k,\alpha;k',\alpha'} \left( \frac{2 \pi \hbar c^2}{V} \right) \frac{\hat{\epsilon}_\alpha(k) \cdot \hat{\epsilon}_{\alpha'}(k')}{\sqrt{\omega_k \omega_{k'}}} \; a^\dagger_{k,\alpha} a^\dagger_{k',\alpha'} \exp[ - i ( k + k' ) \cdot r ] \tag{738}
\]
for two-photon emission. The system is assumed to be initially in an eigenstate
of the unperturbed Hamiltonian $| n \rangle$ but, due to the interaction $\hat{H}_{int}$, makes
transitions to states $| n' \rangle$. Following the usual procedure of time-dependent
perturbation theory, the state $| \psi_n \rangle$ can be decomposed in
terms of a complete set of non-interacting energy eigenstates $| n' \rangle$ via
\[
| \psi_n \rangle = \sum_{n'} C_{n'}(t) \, | n' \rangle \tag{739}
\]
where $C_{n'}(t)$ are time-dependent coefficients. The probability of finding the
system in the final state $| n' \rangle$ at time $t$ is then given by $| C_{n'}(t) |^2$. The rate
at which the transition $n \to n'$ occurs is then given by the time-derivative of
$| C_{n'}(t) |^2$.
where $\omega = c k$ and $\omega' = c k'$ are the frequencies of the two photons in the final
state. The small quantity $\eta$ has been absorbed as a small imaginary part to the
initial state energy
\[
E_n \to E_n + i \eta \hbar \tag{741}
\]
The paramagnetic interaction is of order of q and the diamagnetic interaction
is of order q 2 . Thus, to second-order in q, one must include the diamagnetic
interaction and the paramagnetic interaction to second-order. There are two
second-order terms which represent:
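(b) emission of a photon $(k', \alpha')$ followed by the emission of the photon $(k, \alpha)$.
\[
\frac{\cdots}{( E_n - E_{n''} - \hbar \omega )} + \frac{\langle n'l'm' | \, \hat{\epsilon}_\alpha(k) \cdot \hat{p} \, | n''l''m'' \rangle \, \langle n''l''m'' | \, \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} \, | nlm \rangle}{( E_n - E_{n''} - \hbar \omega' )} \tag{743}
\]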
as long as the denominators are non-vanishing.
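The coefficients $C^{(1)}_{n'}(t)$ and $C^{(2)}_{n'}(t)$ have the same type of time-dependence.
The remaining integration over time yields
\[
- \frac{i}{\hbar} \int_{-\infty}^{t} dt' \, \exp\left[ \frac{i}{\hbar} ( \hbar\omega' + \hbar\omega + E_{n'} - E_n - i \hbar \eta ) \, t' \right] = - \frac{\exp\left[ \frac{i}{\hbar} ( \hbar\omega' + \hbar\omega + E_{n'} - E_n - i \hbar \eta ) \, t \right]}{( \hbar\omega' + \hbar\omega + E_{n'} - E_n - i \hbar \eta )} \tag{744}
\]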
where the matrix elements M are due to the combined effect of the diamagnetic
interaction and the paramagnetic interaction taken to second-order. That is,
These three terms add coherently, and it should be noted that the intermediate
state is only a virtual state and it can have a higher energy than the 2s state$^{39}$.
39 Due to the Lamb shift, there is a 2p state with slightly lower energy than the 2s state.
However, due to the small magnitude of the energy difference, the part of the decay process
involving any real 2p transition is negligibly small.
In the limit η → 0 the first term in the expression for the transition rate
of eqn(747) reduces to a delta function which expresses conservation of energy
between the initial and final states.
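\[
\lim_{\eta \to 0} \frac{1}{\pi} \, \frac{\eta \, \exp[ 2 \eta t ]}{( \hbar\omega' + \hbar\omega + E_{n'} - E_n )^2 + \hbar^2 \eta^2} = \delta( E_{2s} - E_{1s} - \hbar\omega_{k'} - \hbar\omega_k ) \tag{749}
\]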
In the limit η → 0 the transition rate reduces to the Fermi-Golden rule
expression
\[
\frac{1}{\tau} = \frac{2\pi}{\hbar} \sum_{k,\alpha;k',\alpha'} | M |^2 \; \delta( E_{2s} - E_{1s} - \hbar\omega_{k'} - \hbar\omega_k ) \tag{750}
\]
The emitted photons have continuous spectra. In the expression for the matrix
elements M , the last two terms differ in the time-order that the two photons
are emitted. On inserting the expressions for the interactions into M , one can
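pull out the common factors leaving a dimensionless matrix element $M'$. This
leads to the expression
\[
M = \frac{q^2}{2 m c^2} \left( \frac{2 \pi \hbar c^2}{V} \right) \frac{1}{\sqrt{\omega_k \omega_{k'}}} \; M' \tag{751}
\]
\[
\cdots + \frac{2}{m} \sum_{n''l''m''} \frac{\langle 1s | \, \hat{\epsilon}_\alpha(k) \cdot \hat{p} \, \exp[ - i k \cdot r ] \, | n''l''m'' \rangle \, \langle n''l''m'' | \, \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} \, \exp[ - i k' \cdot r ] \, | 2s \rangle}{E_{2s} - E_{n''l''m''} - \hbar\omega_{k'}}
\]
\[
+ \frac{2}{m} \sum_{n''l''m''} \frac{\langle 1s | \, \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} \, \exp[ - i k' \cdot r ] \, | n''l''m'' \rangle \, \langle n''l''m'' | \, \hat{\epsilon}_\alpha(k) \cdot \hat{p} \, \exp[ - i k \cdot r ] \, | 2s \rangle}{E_{2s} - E_{n''l''m''} - \hbar\omega_k} \tag{752}
\]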
One can assume that the dipole matrix elements of the intermediate states
should be randomly oriented in space, since the initial and final electronic states
are isotropic. After summing over the polarizations, the transition rate becomes
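isotropic. On setting
\[
\sum_{\alpha,\alpha'} | M' |^2 \approx 1 \tag{754}
\]
one finds
\[
\frac{1}{\tau} = \left( \frac{e^2}{2 m c^2} \right)^2 \frac{\hbar c^2}{( 2 \pi )^3} \int \frac{d^3k}{k} \int \frac{d^3k'}{k'} \; \delta( E_{2s} - E_{1s} - \hbar\omega_k - \hbar\omega_{k'} ) \tag{755}
\]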
where $\omega_{12}$ is related to the energy difference of the 1s and 2s states. An elementary
integration yields
\[
\frac{1}{\tau} = \left( \frac{e^2}{m c^2} \right)^2 \frac{c}{12 \pi} \left( \frac{\omega_{12}}{c} \right)^3 \tag{758}
\]
The first factor has dimensions of length squared and can be recognized as the
square of the classical radius of the electron. However, since
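\[
\frac{\omega_{12}}{c} = \frac{3}{8} \, \frac{e^2}{\hbar c \, a} \tag{759}
\]
and
\[
a = \frac{\hbar^2}{m e^2} \tag{760}
\]
or
\[
\frac{e^2}{m c^2} = a \, \frac{e^4}{\hbar^2 c^2} \tag{761}
\]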
one finds the decay rate is approximated by
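\[
\frac{1}{\tau} = \frac{1}{12 \pi} \left( \frac{3}{8} \right)^3 \left( \frac{c}{a} \right) \left( \frac{e^2}{\hbar c} \right)^7 \tag{762}
\]
Thus, the estimated decay rate is 8.75 sec$^{-1}$. The exact value calculated by
Shapiro and Breit$^{40}$ is 8.266 sec$^{-1}$.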
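The estimate of eqn(762) can be evaluated directly. The sketch below uses the coupling constant quoted in the text together with assumed CGS values for $c$ and $a$ (given in the code comments), so the last digits depend on those inputs. The names are illustrative.

```python
import math

ALPHA = 1.0 / 137.0359979       # e^2 / (hbar c), as quoted in the text
C_CM_PER_S = 2.99792458e10      # speed of light in cm/s (assumed value)
BOHR_RADIUS_CM = 0.529177e-8    # Bohr radius a in cm (assumed value)


def two_photon_rate_estimate():
    """Order-of-magnitude estimate of the 2s -> 1s two-photon decay rate, eqn (762)."""
    prefactor = (1.0 / (12.0 * math.pi)) * (3.0 / 8.0) ** 3
    return prefactor * (C_CM_PER_S / BOHR_RADIUS_CM) * ALPHA ** 7


if __name__ == "__main__":
    print(two_photon_rate_estimate())
```

The result lies within about one percent of the estimate quoted above, the small difference being attributable to the assumed constants. The strong suppression relative to the allowed dipole rate comes from the factor $( \frac{e^2}{\hbar c} )^7$ compared with the fourth power that governs the 2p decay.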
10.1.10 The Absorption of Radiation
If a process occurs in which only a photon with quantum numbers (k, α) is
absorbed, then the numbers of quanta in the initial and final state of the elec-
tromagnetic field are given by
\[
n'_{k,\alpha} = n_{k,\alpha} - 1
\]
\[
n'_{k',\beta} = n_{k',\beta} \tag{763}
\]
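The matrix elements of the paramagnetic interaction are given by
\[
\langle n'l'm' \{ n'_{k',\beta} \} | \, \hat{H}_{para} \, | nlm \{ n_{k',\beta} \} \rangle = - \frac{q}{m c} \sum_{k,\alpha} \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \sqrt{n_{k,\alpha}} \; \langle n'l'm' | \, \hat{p} \cdot \hat{\epsilon}_\alpha(k) \, \exp[ + i k \cdot r ] \, | nlm \rangle \tag{764}
\]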
The photon absorption rate is found from the Fermi-Golden rule expression
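\[
\frac{1}{\tau} = \frac{2\pi}{\hbar} \left( \frac{q}{m c} \right)^2 \sum_{n'l'm'} \left( \frac{2 \pi \hbar c^2}{V \omega_k} \right) n_{k,\alpha} \; \delta( E_{nlm} + \hbar\omega_k - E_{n'l'm'} )
\]
\[
\times \; | \langle n'l'm' | \, \hat{p} \cdot \hat{\epsilon}_\alpha(k) \, \exp[ + i k \cdot r ] \, | nlm \rangle |^2 \tag{765}
\]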
This is related to the lifetime due to stimulated emission, if the initial and final
states are interchanged.
which simplifies to
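\[
\sigma_{absorb}(\omega_k) = \frac{4 \pi^2 e^2}{m^2 \omega_k c} \sum_{n'l'm'} \delta( E_{nlm} + \hbar\omega_k - E_{n'l'm'} ) \; | \langle n'l'm' | \, \hat{p} \cdot \hat{\epsilon}_\alpha(k) \, \exp[ + i k \cdot r ] \, | nlm \rangle |^2 \tag{768}
\]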
The absorption cross-section is independent of the volume of the electromag-
netic cavity and the number of photons in the incident beam. As a function of
frequency, the Born approximation for the cross-section for photon absorption
contains delta function lines corresponding to the atomic excitation energies.
Measured absorption lines do have natural widths $\Delta\omega_{nl,n'l'}$ and the absorption
spectra can be approximated by the sums of Lorentzian functions.

Figure 27: A sketch of the photon absorption cross-section $\sigma(\omega)$ (in units of $\frac{\hbar^2}{m}$) as a function of photon energy $\hbar\omega$ (in units of Rydbergs). The plot over-emphasizes the role of the photon lifetimes, since the ratio of the line-width to the photon frequency is of the order of $( \frac{e^2}{\hbar c} )^3$.

The widths of the lines are governed by half the sum of the decay rates of the initial and
final electronic levels.
\[
\Delta\omega_{nl,n'l'} = \frac{1}{2} \left( \frac{1}{\tau_{nl}} + \frac{1}{\tau_{n'l'}} \right) \tag{769}
\]
This formula implies that rapidly decaying levels will yield broad lines, but does
not imply the converse$^{41}$. The spectral widths can be described by the inclusion
of the effects of interaction to higher orders$^{42}$. The higher-order processes
produce small shifts of the atomic energy levels and also give the energies small
imaginary parts, resulting in a Lorentzian line shape. Since a typical atomic
transition rate is of the order of $10^8$ sec$^{-1}$ and a typical photon frequency is of
the order of $10^{15}$ sec$^{-1}$, the widths of the lines can usually be neglected.
\[
\times \; | \langle n'l'm' | \, \hat{p} \cdot \hat{\epsilon}_\alpha(k) \, | nlm \rangle |^2 \tag{770}
\]
For an isotropic medium, the electronic states are degenerate with respect to
the z-components of the orbital angular momentum, so the initial state (n, l, m)
should be averaged over the different values of m
\[
\frac{1}{(2l+1)} \sum_{m=-l}^{l} \tag{772}
\]
and the values of $m'$ for the final states are summed over all possible values.
This averaging process results in an isotropic absorption rate, and is equivalent
to averaging the polarization vector over all directions in space. Therefore, in
the dipole approximation, the absorption cross-section for an isotropic medium
is given by the expression
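\[
\sigma_{absorb}(\omega) = \frac{4 \pi^2}{3} \left( \frac{e^2}{\hbar c} \right) \sum_{n'l'm'} \omega_{n'l',nl} \; | \langle n'l'm' | \, r \, | nlm \rangle |^2 \; \delta( \omega_{n'l',nl} - \omega ) \tag{773}
\]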
The strength of each absorption line can be found by integrating the cross-
section over a narrow frequency range centered on the frequency of the absorp-
tion line. (More specifically, the width of the interval of integration must be
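greater than the natural line-width.) The integrated intensity of the transition
$(nlm) \to (n'l'm')$ is given by
\[
\int_{\omega_{nl,n'l'} - \epsilon}^{\omega_{nl,n'l'} + \epsilon} d\omega \; \sigma_{absorb}(\omega) = \frac{4 \pi^2}{3} \left( \frac{e^2}{\hbar c} \right) \omega_{n'l',nl} \; | \langle n'l'm' | \, r \, | nlm \rangle |^2 \tag{774}
\]
The intensity of each line is proportional to the “oscillator strength” $f_{nl \to n'l'}$
defined as
\[
f_{nl \to n'l'} = \frac{2 m \, \omega_{n'l',nl}}{\hbar} \; | \langle n'l'm' | \, r \, | nlm \rangle |^2 \tag{775}
\]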
The intensities and the frequencies of all the transitions are related via sum
rules$^{43}$. These sum rules involve quantities of the form
\[
\sum_{n'l'm'} \omega^p_{n'l'm',nlm} \; | \langle n'l'm' | \, r \, | nlm \rangle |^2 \tag{776}
\]
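Table 5: Sum Rules for Dipole Transitions

p    $\sum_{n'l'm'} \omega^p_{n'l'm',nlm} \, | \langle n'l'm' | \, r \, | nlm \rangle |^2$
1    $\frac{3 \hbar}{2 m}$
2    $\frac{2}{m} \left( E_{nlm} - \langle nlm | \, V \, | nlm \rangle \right)$
3    $\frac{\hbar}{2 m^2} \, \langle nlm | \, \nabla^2 V \, | nlm \rangle$
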
and have values given in the Table(5). The sum rules can be used to provide
checks of experimental data.
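The $p = 1$ entry of Table(5) does not depend on the form of the potential, so it can be illustrated with a model whose matrix elements are known in closed form. The sketch below uses a one-dimensional harmonic oscillator rather than hydrogen, which is an assumption made purely for illustration. For this model $| \langle n \pm 1 | x | n \rangle |^2$ is given by $\frac{\hbar}{2 m \omega_0}$ times $(n+1)$ or $n$, and each Cartesian component should contribute $\frac{\hbar}{2 m}$ to the sum. The function and parameter names are hypothetical.

```python
def x_matrix_element_sq(n_final, n_initial, hbar=1.0, mass=1.0, omega0=1.0):
    """|<n_final| x |n_initial>|^2 for a 1D harmonic oscillator of frequency omega0."""
    scale = hbar / (2.0 * mass * omega0)
    if n_final == n_initial + 1:
        return scale * (n_initial + 1)
    if n_final == n_initial - 1:
        return scale * n_initial
    return 0.0


def first_moment_sum(n_initial, hbar=1.0, mass=1.0, omega0=1.0):
    """Sum over final states of omega_{n',n} |<n'| x |n>|^2, the p = 1 sum of eqn (776)."""
    total = 0.0
    # only n' = n - 1 and n' = n + 1 have non-zero matrix elements
    for n_final in range(0, n_initial + 2):
        omega_transition = (n_final - n_initial) * omega0
        total += omega_transition * x_matrix_element_sq(n_final, n_initial, hbar, mass, omega0)
    return total


if __name__ == "__main__":
    for n in range(4):
        print(n, first_moment_sum(n))
```

For every initial state the sum equals $\frac{\hbar}{2 m}$ per component, so that three components reproduce the value $\frac{3 \hbar}{2 m}$ listed in the table. The downward transition enters with a negative frequency, which is why the result is independent of $n$.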
There exists a systematic way of deriving sum rules for the weighted inten-
sities of the dipole allowed transitions. The sum rules are of the form
\[
\sum_{n'l'm'} \omega^p_{nl,n'l'} \; | \langle nlm | \, \hat{A} \, | n'l'm' \rangle |^2 \tag{777}
\]
where
\[
\hbar \, \omega_{nl,n'l'} = E_{nl} - E_{n'l'} \tag{778}
\]
and p is a positive integer.
Then, on taking successive derivatives of F (t) with respect to t, one finds
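\[
\frac{\partial F}{\partial t} = \frac{i}{\hbar} \, \langle nlm | \, [ \hat{H}_0 , \hat{A}(t) ] \, \hat{A}^\dagger(0) \, | nlm \rangle \tag{781}
\]
and
\[
\frac{\partial^2 F}{\partial t^2} = \left( \frac{i}{\hbar} \right)^2 \langle nlm | \, [ \hat{H}_0 , [ \hat{H}_0 , \hat{A}(t) ] ] \, \hat{A}^\dagger(0) \, | nlm \rangle \tag{782}
\]
etc. This process shows that the p-th derivative is expressed as p nested commutators
\[
\frac{\partial^p F}{\partial t^p} = \left( \frac{i}{\hbar} \right)^p \langle nlm | \, [ \hat{H}_0 , [ \ldots [ \hat{H}_0 , [ \hat{H}_0 , \hat{A}(t) ] ] \ldots ] ] \, \hat{A}^\dagger(0) \, | nlm \rangle \tag{783}
\]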
Alternatively, one can insert a complete set of states in the definition for F (t)
yielding
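\[
F(t) = \sum_{n'l'm'} \langle nlm | \, \hat{A}(t) \, | n'l'm' \rangle \, \langle n'l'm' | \, \hat{A}^\dagger(0) \, | nlm \rangle \tag{784}
\]
but since the states $| nlm \rangle$ are eigenstates of $\hat{H}_0$, one has
\[
F(t) = \sum_{n'l'm'} \exp[ \, i \, \omega_{nl,n'l'} \, t \, ] \; | \langle nlm | \, \hat{A} \, | n'l'm' \rangle |^2 \tag{785}
\]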
The sum rules are found by equating the two forms of the p-th time-derivative
and then setting t = 0
\[
\sum_{n'l'm'} \left( E_{nl} - E_{n'l'} \right)^p \; | \langle nlm | \, \hat{A} \, | n'l'm' \rangle |^2
\]
Hence, the p-th moment of the matrix elements of $\hat{A}$ is related to the expectation
value of the product of the p-th nested commutator of $\hat{H}_0$ and $\hat{A}$ multiplied
by $\hat{A}^\dagger$.
value is homogeneous in time. The q-th nested commutator of the operator Â
can be defined by
B̂q = [ Ĥ0 , B̂q−1 ] (788)
where
B̂0 = Â (789)
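Likewise, $\hat{C}_q$ can be defined as the q-th nested commutator of $\hat{A}^\dagger$. However, for
any pair of operators $\hat{B}_{p-q-1}$ and $\hat{C}_q$, one has
\[
\langle nlm | \, [ \hat{H}_0 , \hat{B}_{p-q-1} ] \, \hat{C}_q \, | nlm \rangle = ( - 1 ) \, \langle nlm | \, \hat{B}_{p-q-1} \, [ \hat{H}_0 , \hat{C}_q ] \, | nlm \rangle \tag{791}
\]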
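On using the definition of the operators $\hat{B}_p$ and $\hat{C}_q$, the above equation reduces
to
\[
\langle nlm | \, \hat{B}_{p-q} \, \hat{C}_q \, | nlm \rangle = ( - 1 ) \, \langle nlm | \, \hat{B}_{p-q-1} \, \hat{C}_{q+1} \, | nlm \rangle \tag{792}
\]
By induction, this shows that the nested commutators can be distributed between
the two sides of the expression.
\[
\langle nlm | \, \hat{B}_p \, \hat{C}_0 \, | nlm \rangle = ( - 1 )^q \, \langle nlm | \, \hat{B}_{p-q} \, \hat{C}_q \, | nlm \rangle \tag{793}
\]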
The cross-section is given by
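\[
\sigma = \sum_{k'} \frac{4 \pi^2 e^2}{m^2 \omega_k c} \; | \langle k' | \, p \cdot \hat{\epsilon}_\alpha(k) \, \exp[ i k \cdot r ] \, | 1s \rangle |^2 \; \delta\left( E_{1s} + \hbar\omega_k - \frac{\hbar^2 k'^2}{2 m} \right) \tag{795}
\]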
where the initial wave function is given by
\[
\psi_{1s}(r) = \frac{1}{\sqrt{\pi a^3}} \exp\left[ - \frac{r}{a} \right] \tag{796}
\]
As long as the emitted electron is not close to threshold, the final state wave
function can be approximated by a plane-wave
\[
\psi_{k'}(r) = \frac{1}{\sqrt{V}} \exp[ \, i k' \cdot r \, ] \tag{797}
\]
The sum over final states of the electron can be replaced by an integral over the
magnitude of its momentum and its direction
\[
\sum_{k'} \to \frac{V}{( 2 \pi )^3} \int_0^\infty dk' \, k'^2 \int d\Omega_{k'} \tag{798}
\]
It is seen that the factor of the volume in the density of final states cancels with
the factors from the normalization of the electron’s final state. The differential
cross-section corresponds to the part of the cross-section where the outgoing
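electron is emitted into the solid angle $d\Omega_{k'}$. Hence,
\[
\frac{d\sigma}{d\Omega'} = \frac{e^2 \, V}{2 \pi m^2 \omega_k c} \int_0^\infty dk' \, k'^2 \; | \langle k' | \, p \cdot \hat{\epsilon}_\alpha(k) \, \exp[ i k \cdot r ] \, | 1s \rangle |^2 \; \delta\left( E_{1s} + \hbar\omega_k - \frac{\hbar^2 k'^2}{2 m} \right) \tag{799}
\]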
The integration over the magnitude of electron's final momentum $k'$ can be
performed by using the properties of the energy conserving delta function. The
magnitude of electron's final momentum is denoted by $k_f$
\[
k_f^2 = \frac{2 m}{\hbar^2} \left( \hbar\omega_k + E_{1s} \right) \tag{800}
\]
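The result of the integration over $k'$ is
\[
\frac{d\sigma}{d\Omega'} = \frac{e^2 \, V}{2 \pi \hbar^2 m \, \omega_k c} \; k_f \; | \langle k_f \, d\Omega' | \, p \cdot \hat{\epsilon}_\alpha(k) \, \exp[ i k \cdot r ] \, | 1s \rangle |^2 \tag{801}
\]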
It is assumed that the initial photon is propagating along the x-axis and is
polarized along the z-direction. The matrix elements involving the momentum
operator only yield a finite result when $\hat{p}$ acts on $\psi_{1s}(r)$, since $k \cdot \hat{\epsilon}_\alpha(k) = 0$.
However,
\[
\hat{\epsilon}_\alpha(k) \cdot \hat{p} \; \psi_{1s}(r) = i \, \frac{\hbar \cos\theta}{a} \, \psi_{1s}(r) \tag{802}
\]

Figure 28: The geometry for the photo-emission of an electron from an atom. An electromagnetic wave, with polarization along the z-axis, is incident along the x-axis. The photo-emitted electron propagates along the direction $k'$.

\[
\frac{d\sigma}{d\Omega'} = \frac{e^2 \, V}{2 \pi m \, \omega_k c \, a^2} \; k_f \; | \langle k_f \, d\Omega' | \, \cos\theta \, \exp[ i k \cdot r ] \, | 1s \rangle |^2 \tag{803}
\]
where $(\theta, \varphi)$ are the polar coordinates of the vector $r$. The matrix elements are
evaluated by using the dipole approximation for the photon wave function, setting
\[
\exp[ \, i k \cdot r \, ] \approx 1 + i k \cdot r + \ldots \tag{804}
\]
and only keeping the first term of the expansion. The factor $\cos\theta$ can be expressed
as a spherical harmonic through
\[
\cos\theta = \sqrt{\frac{4 \pi}{3}} \; Y^1_0(\theta, \varphi) \tag{805}
\]
and the final state electronic wave function can be expressed in terms of the
Rayleigh expansion
\[
\exp[ \, i k_f \, \hat{k}' \cdot r \, ] = 4 \pi \sum_{l,m} i^l \, j_l( k_f r ) \, Y^l_m(\theta, \varphi) \, Y^{l \, *}_m(\theta', \varphi') \tag{806}
\]
where $(\theta', \varphi')$ are the polar coordinates of the electron's final momentum. The
angular integration over the polar coordinates $(\theta, \varphi)$ can be performed by using
the orthogonality relations for the spherical harmonics. The end result is
\[
\langle k_f \, d\Omega' | \, \cos\theta \, | 1s \rangle = - 4 \pi i \, \cos\theta' \, \frac{1}{\sqrt{\pi a^3 V}} \int_0^\infty dr \, r^2 \, j_1( k_f r ) \, \exp\left[ - \frac{r}{a} \right] \tag{807}
\]
where the $\cos\theta'$ dependence refers to the direction of the emitted electron's
momentum. The radial integral is evaluated to yield
\[
\langle k_f \, d\Omega' | \, \cos\theta \, | 1s \rangle = - 4 \pi i \, \cos\theta' \, \sqrt{\frac{a^3}{\pi V}} \; \frac{2 \, k_f a}{( 1 + k_f^2 a^2 )^2} \tag{808}
\]
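The step from eqn(807) to eqn(808) relies on the radial integral $\int_0^\infty dr \, r^2 \, j_1( k_f r ) \exp[ - r / a ] = \frac{2 k_f a^4}{( 1 + k_f^2 a^2 )^2}$. The sketch below checks this identity numerically with a midpoint rule. The cutoff radius, the number of steps and the function names are arbitrary choices made for illustration.

```python
import math


def spherical_j1(x):
    """Spherical Bessel function j_1(x) = sin(x)/x^2 - cos(x)/x."""
    if abs(x) < 1e-3:
        # leading terms of the power series, avoids cancellation near x = 0
        return x / 3.0 - x ** 3 / 30.0
    return math.sin(x) / x ** 2 - math.cos(x) / x


def radial_integral_numeric(k_f, a=1.0, r_max_in_a=50.0, steps=20000):
    """Midpoint-rule estimate of the radial integral in eqn (807)."""
    dr = r_max_in_a * a / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += r * r * spherical_j1(k_f * r) * math.exp(-r / a) * dr
    return total


def radial_integral_closed_form(k_f, a=1.0):
    """Closed form implied by eqn (808)."""
    return 2.0 * k_f * a ** 4 / (1.0 + (k_f * a) ** 2) ** 2


if __name__ == "__main__":
    for k_f in (0.5, 1.0, 2.0):
        print(k_f, radial_integral_numeric(k_f), radial_integral_closed_form(k_f))
```

The numerical and closed-form values agree closely for the sampled momenta. Since the integral falls off as $k_f^{-3}$ for $k_f a \gg 1$, the cross-section decreases rapidly for photon energies well above threshold.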
Therefore, the differential cross-section is given by
\[
\frac{d\sigma}{d\Omega'} = 8 \, \frac{k_f}{k} \left( \frac{e^2 a}{m c^2} \right) \cos^2\theta' \left( \frac{2 \, k_f a}{( 1 + k_f^2 a^2 )^2} \right)^2 \tag{809}
\]
Using
\[
a = \frac{\hbar^2}{m e^2} \tag{810}
\]
the photo-emission cross-section can be re-written as
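\[
\frac{d\sigma}{d\Omega'} = 8 \, \frac{k_f}{k} \, a^2 \left( \frac{e^2}{\hbar c} \right)^2 \cos^2\theta' \left( \frac{2 \, k_f a}{( 1 + k_f^2 a^2 )^2} \right)^2 \tag{811}
\]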
Thus, although the photon is propagating along the x-direction, the electron is
preferentially emitted along the direction of the polarization ($\theta' \approx 0$). This can
be understood as being due to the effect of $c$ being large, so that the photon's
momentum is negligible compared with the energy, therefore, (in the dipole
approximation) only the direction of the polarization determines the angular
distribution of the emitted electron. It should be noted that in the relativistic
case, where the momentum of the photon is important, the electrons are predominantly
ejected in the direction of the photon$^{44}$. This formula also breaks
down for emitted electrons with low energies. In this case, the correct electronic
wave function for the continuous spectrum of $\hat{H}_0$ should be used$^{45}$. The inclusion
of the Coulomb attraction of the ion in the final state has the effect of
reducing the cross-section near the threshold.
which is evaluated as
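$$ -\,\frac{q}{m c}\, \sqrt{\frac{2\pi\hbar c^2}{V \omega_k}}\;\; \hat{p} \cdot \hat{\epsilon}_\alpha(k)\;\; \delta_{k+k'-k''}\; \sqrt{n_{k,\alpha}} \tag{816} $$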
This shows that momentum is conserved. Furthermore, for the transition rate
Hence,
$$ ( p^\mu + p'^\mu )\, ( p_\mu + p'_\mu ) = p''^\mu\, p''_\mu \tag{819} $$
but the electron's momenta form a Lorentz scalar which is related to the rest
mass
$$ p'^\mu\, p'_\mu = p''^\mu\, p''_\mu = m^2 c^2 \tag{820} $$
and the photon has zero mass
$$ p^\mu\, p_\mu = 0 \tag{821} $$
$$ p^\mu\, p'_\mu = 0 \tag{822} $$
10.2 Scattering of Light
Kramers and Heisenberg evaluated the scattering cross-section for light incident
on atomic electrons46. The incident photon is denoted by (k, α) and the
scattered photon by (k′, α′). The scattering cross-section involves the paramagnetic
interaction to second-order and the diamagnetic interaction to first-order. The
matrix elements of the diamagnetic interaction are given by
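$$ \begin{aligned} \langle n'l'm'k'\alpha' | \hat{H}_{dia} | nlmk\alpha \rangle &= \frac{e^2}{2 m c^2}\, \langle n'l'm'k'\alpha' | \hat{A} \cdot \hat{A} | nlmk\alpha \rangle \\ &= \frac{e^2}{2 m c^2}\, \langle n'l'm'k'\alpha' | \left( a_{k,\alpha}\, a^\dagger_{k',\alpha'} + a^\dagger_{k',\alpha'}\, a_{k,\alpha} \right) \exp\left[ i ( k - k' ) \cdot r \right] | nlmk\alpha \rangle \\ &\quad \times \frac{2\pi\hbar c^2}{\sqrt{\omega_k \omega_{k'}}\; V}\;\; \hat{\epsilon}_\alpha(k) \cdot \hat{\epsilon}_{\alpha'}(k') \end{aligned} \tag{823} $$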
where it has been assumed that only the initial and final photon are present. On
making use of the long-wavelength approximation λ ≫ a, the matrix elements
simplify to
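$$ \langle n'l'm'k'\alpha' | \hat{H}_{dia} | nlmk\alpha \rangle \approx \frac{e^2}{2 m c^2}\, \langle n'l'm' | nlm \rangle \times \frac{2\pi\hbar c^2}{\sqrt{\omega_k \omega_{k'}}\; V}\;\; \hat{\epsilon}_\alpha(k) \cdot \hat{\epsilon}_{\alpha'}(k') \tag{824} $$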
The scattering cross-section will be expressed in terms of a transition rate and
the transition rate will be calculated using a similar procedure to that which
was used in describing two-photon decay. An arbitrary state | ψn > can be
expressed in terms of a complete set of non-interacting states | n >
$$ | \psi_n \rangle = \sum_{n'} C_{n'}(t)\, | n' \rangle \tag{825} $$
where ω = c k and ω′ = c k′ and the long-wavelength approximation has
been used. The small quantity η has been absorbed as a small imaginary part
to the initial state energy
$$ E_n \rightarrow E_n + i\, \eta\, \hbar \tag{827} $$
[Figure: diagrams of the second-order scattering processes from the state n to the state n′ via an intermediate state n″. Panel (b): emission of a photon (k′, α′) followed by the absorption of the photon (k, α).]
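$$ \begin{aligned} \cdots\; &\times \langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | nlm \rangle \\ &+ \sum_{n''} \exp\left[ \frac{i}{\hbar} ( E_{n'} - E_{n''} - \hbar\omega )\, t' \right] \exp\left[ -\frac{i}{\hbar} ( E_n - E_{n''} - \hbar\omega' )\, t'' \right] \\ &\quad \times \langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | nlm \rangle \end{aligned} \tag{828} $$
$$ \cdots\; \frac{\cdots}{( E_n - E_{n''} + \hbar\omega )} + \frac{\langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | nlm \rangle}{( E_n - E_{n''} - \hbar\omega' )} \tag{829} $$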
where the matrix elements are given by
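$$ \begin{aligned} M &= \hat{\epsilon}_\alpha(k) \cdot \hat{\epsilon}_{\alpha'}(k')\; \langle n'l'm' | nlm \rangle \\ &+ \frac{1}{m} \sum_{n''} \frac{\langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | nlm \rangle}{( E_n - E_{n''} + \hbar\omega )} \\ &+ \frac{1}{m} \sum_{n''} \frac{\langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | nlm \rangle}{( E_n - E_{n''} - \hbar\omega' )} \end{aligned} \tag{834} $$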
On taking the limit η → 0, the first factor in the decay rate reduces to an
energy conserving delta function. Therefore, one obtains the Fermi-Golden rule
expression
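$$ \frac{1}{\tau} = \frac{2\pi}{\hbar} \left( \frac{e^2}{m c^2} \right)^2 \left( \frac{2\pi\hbar c^2}{\sqrt{\omega_k \omega_{k'}}\; V} \right)^2 | M |^2\; \delta( \hbar\omega_{k'} + E_{n'} - E_n - \hbar\omega_k ) \tag{835} $$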
The magnitudes of the final state photon quantum numbers (k′) must be
integrated over, since these are not measured. This integration imparts a physical
meaning to the expression for the rate which contains the Dirac delta function.
We shall assume that the direction of the scattered photon is to be measured
and that the photon is absorbed by a detector which subtends a solid angle dΩ′
to the material. Therefore, the scattering rate is given by
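$$ \frac{1}{\tau_{d\Omega'}} = \frac{2\pi}{\hbar} \left( \frac{e^2}{m c^2} \right)^2 \frac{V}{( 2\pi )^3}\, d\Omega' \int_0^\infty dk'\, k'^2 \left( \frac{2\pi\hbar c^2}{\sqrt{\omega_k \omega_{k'}}\; V} \right)^2 | M |^2\; \delta( \hbar\omega_{k'} + E_{n'} - E_n - \hbar\omega_k ) \tag{836} $$
Since ħω_{k′} = ħ c k′, the integration over the delta function can be performed,
yielding
$$ \frac{1}{\tau_{d\Omega'}} = \frac{2\pi}{\hbar} \left( \frac{e^2}{m c^2} \right)^2 \frac{V\, d\Omega'\, \omega'^2}{( 2\pi )^3\, \hbar\, c^3} \left( \frac{2\pi\hbar c^2}{\sqrt{\omega\, \omega'}\; V} \right)^2 | M |^2 \tag{837} $$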
The scattering cross-section is defined as the transition rate divided by the
photon flux. The photon flux is found by noting that it has been assumed that
there is one photon per volume V, so the photon density is 1/V, and the speed of
light is c. Hence, the photon flux is given by c/V. Therefore, the cross-section is
determined by the Kramers-Heisenberg formula
$$ \frac{d\sigma}{d\Omega'} = \left( \frac{e^2}{m c^2} \right)^2 \left( \frac{\omega'}{\omega} \right) | M |^2 \tag{838} $$
The magnitude of the scattering rate is determined by the quantity r_e which
has the dimensions of length
$$ r_e = \frac{e^2}{m c^2} \tag{839} $$
This quantity is often called the classical radius of the electron. The quantity
r_e can be expressed as
$$ r_e = \frac{e^2}{m c^2} = \left( \frac{e^2}{\hbar c} \right) \left( \frac{\hbar}{m c} \right) \approx 2.82 \times 10^{-15}\ \mathrm{m} \tag{840} $$
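The numerical value in (840) can be reproduced from the two factors separately. The sketch below uses rounded reference values for the fine-structure constant e²/(ħc) and for the reduced Compton wavelength ħ/(mc) of the electron; the constant names are illustrative.

```python
# Rounded reference values (assumed here, not taken from the notes).
ALPHA = 1.0 / 137.035999          # fine-structure constant e^2 / (hbar c)
HBAR_OVER_MC = 3.8615927e-13      # reduced Compton wavelength of the electron, in metres

# Classical electron radius r_e = (e^2 / hbar c) * (hbar / m c), eqn (840).
r_e = ALPHA * HBAR_OVER_MC
print(f"r_e = {r_e:.3e} m")
```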
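but one can re-write the Kronecker delta function in terms of the commutation
relation
$$ [\, x_i\, ,\, \hat{p}_j\, ] = i\, \hbar\, \delta_{i,j} \tag{843} $$
Thus, one can express the scalar product as a commutator
$$ \begin{aligned} \hat{\epsilon}_\alpha(k) \cdot \hat{\epsilon}_{\alpha'}(k') &= \frac{1}{i\hbar} \sum_{i,j} \hat{\epsilon}_\alpha(k)_i\; [\, x_i\, ,\, \hat{p}_j\, ]\; \hat{\epsilon}_{\alpha'}(k')_j \\ &= \frac{1}{i\hbar}\; [\, \hat{\epsilon}_\alpha(k) \cdot r\, ,\, \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k')\, ] \end{aligned} \tag{844} $$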
Since, in the dipole approximation, the diamagnetic contribution to the matrix
elements M is proportional to the overlap integral
the initial and final states must be identical if this is non-zero. Hence, the
result is equivalent to the expectation value in the state | nlm >. On replacing
the matrix elements by the expectation value and then inserting a complete set of
electronic states, one finds
The matrix elements of r can be expressed in terms of the matrix elements of p̂
via
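$$ \begin{aligned} \langle nlm | \hat{p} | n''l''m'' \rangle &= \frac{1}{2 i \hbar}\, \langle nlm | [\, r\, ,\, \hat{p}^2\, ] | n''l''m'' \rangle \\ &= \frac{m}{i\hbar}\, \langle nlm | [\, r\, ,\, \hat{H}_0\, ] | n''l''m'' \rangle \\ &= \frac{m}{i\hbar}\, ( E_{n''l''m''} - E_{nlm} )\, \langle nlm | r | n''l''m'' \rangle \end{aligned} \tag{847} $$
where
$$ E_{n''l''m''} - E_{nlm} = \hbar\, \omega_{n''n} \tag{849} $$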
Thus, the elastic scattering term in the Kramers-Heisenberg formula is given by
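$$ -\,\frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle}{\hbar\, \omega_{nn''}} \tag{850} $$
$$ +\,\frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle}{E_{n''} - E_n} \tag{851} $$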
On substituting this back into the expression for the matrix elements M , one
obtains
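$$ \begin{aligned} M &= \frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | nlm \rangle}{E_{n''} - E_n} \\ &+ \frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle}{E_{n''} - E_n} \\ &+ \frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle}{( E_n - E_{n''} + \hbar\omega )} \\ &+ \frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | nlm \rangle}{( E_n - E_{n''} - \hbar\omega )} \end{aligned} \tag{852} $$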
which simplifies to
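$$ \begin{aligned} M = \frac{\omega}{m \hbar} \sum_{n''l''m''} \Bigg[ & \frac{\langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle}{\omega_{n''n}\, ( \omega_{nn''} + \omega )} \\ & -\, \frac{\langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | nlm \rangle}{\omega_{n''n}\, ( \omega_{nn''} - \omega )} \Bigg] \end{aligned} \tag{853} $$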
In the limit of small photon frequencies compared with the electronic energies,
one can expand the denominators of the matrix element as
$$ \frac{1}{\omega_{nn''}\, ( \omega_{nn''} \pm \omega )} = \frac{1}{\omega_{nn''}^2} \mp \frac{\omega}{\omega_{nn''}^3} + \ldots \tag{854} $$
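The quality of the truncation in (854) can be checked numerically. In the minimal sketch below (function names are illustrative), the difference between the exact expression and the two retained terms should be of the order of the first neglected term, ω²/ω_{nn″}⁴.

```python
def exact(w_el, w, sign):
    """1 / (w_el * (w_el +/- w)), with sign = +1 or -1."""
    return 1.0 / (w_el * (w_el + sign * w))

def two_term_expansion(w_el, w, sign):
    """First two terms of the low-frequency expansion (854)."""
    return 1.0 / w_el**2 - sign * w / w_el**3

w_el, w = 10.0, 0.1   # electronic frequency much larger than the photon frequency
for sign in (+1, -1):
    error = exact(w_el, w, sign) - two_term_expansion(w_el, w, sign)
    print(sign, error, w**2 / w_el**4)
```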
When this low-frequency expansion is substituted into the matrix elements, the
leading term vanishes. This can be seen since the leading term becomes
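$$ \sum_{n''} \frac{1}{\omega_{nn''}^2} \Big[ \langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle - \langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | nlm \rangle \Big] \tag{855} $$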
which can be expressed as
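$$ m^2 \sum_{n''} \Big[ \langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot r | n''l''m'' \rangle\, \langle n''l''m'' | r \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle - \langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot r | n''l''m'' \rangle\, \langle n''l''m'' | r \cdot \hat{\epsilon}_{\alpha'}(k') | nlm \rangle \Big] \tag{856} $$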
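or, on using the completeness relation, one finds the expectation value of the
commutator
$$ m^2\, \langle n'l'm' | [\, \hat{\epsilon}_{\alpha'}(k') \cdot r\, ,\, r \cdot \hat{\epsilon}_\alpha(k)\, ] | nlm \rangle = 0 \tag{857} $$
Thus, the leading term of the low-frequency expansion vanishes.
Therefore, the scattering cross-section is expressed as
$$ \frac{d\sigma}{d\Omega'} = \left( \frac{r_e}{m \hbar} \right)^2 \omega^4\; \Bigg| \sum_{n''l''m''} \left( \frac{1}{\omega_{nn''}} \right)^3 \Big[ \langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle + \langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{p} \cdot \hat{\epsilon}_{\alpha'}(k') | nlm \rangle \Big] \Bigg|^2 \tag{858} $$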
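Finally, the scattering cross-section can be expressed in terms of the dipole matrix
elements. Each pair of momentum matrix elements contributes a factor m² ω²_{nn″} through eqn (847), so that
$$ \frac{d\sigma}{d\Omega'} = \left( \frac{r_e\, m}{\hbar} \right)^2 \omega^4\; \Bigg| \sum_{n''l''m''} \frac{1}{\omega_{nn''}} \Big[ \langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot r | n''l''m'' \rangle\, \langle n''l''m'' | r \cdot \hat{\epsilon}_\alpha(k) | nlm \rangle + \langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot r | n''l''m'' \rangle\, \langle n''l''m'' | r \cdot \hat{\epsilon}_{\alpha'}(k') | nlm \rangle \Big] \Bigg|^2 \tag{859} $$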
Hence, at long wavelengths, the scattering cross-section varies as ω⁴, as expected
from Rayleigh's law. Since the typical electronic frequency ω_{nn″} is in the
ultra-violet spectrum, then
$$ \omega_{nn''} \gg \omega \tag{860} $$
for all frequencies in the visible optical spectrum. This leads to the phenomenon
of blue skies in the day and red sunsets at dusk.
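To give a feeling for the size of the ω⁴ effect, the following sketch compares the scattering strength at two visible wavelengths. The wavelengths 450 nm and 700 nm are illustrative choices for "blue" and "red", and the function name is hypothetical.

```python
def rayleigh_ratio(lambda_short_nm, lambda_long_nm):
    """Ratio of the Rayleigh cross-sections, sigma(short) / sigma(long).

    Since omega is proportional to 1/lambda, the omega^4 law gives
    (lambda_long / lambda_short)^4.
    """
    return (lambda_long_nm / lambda_short_nm) ** 4

# Blue light is scattered several times more strongly than red light.
print(rayleigh_ratio(450.0, 700.0))
```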
In the investigation of the angular dependence of Thomson scattering, it is
convenient to introduce a coordinate system which is defined by the direction of
propagation of the incident photon and its polarization vectors. The coordinate
system is composed of the three orthogonal unit vectors
$(\hat{\epsilon}_1(k), \hat{\epsilon}_2(k), \hat{k})$. Thus, the direction of the polarization vector $\hat{\epsilon}_1(k)$ defines
the x-direction. In this coordinate system, the scattered photon (k′, α′) is in
the direction k′ with polar coordinates (θ_{k′}, ϕ_{k′}). The polarization vectors
$\hat{\epsilon}_{\alpha'}(k')$ of the final photon must be transverse to k′. Two polarization vectors are defined
according to
$$ \hat{\epsilon}_1(k') = (\, \cos\theta_{k'} \cos\varphi_{k'}\, ,\; \cos\theta_{k'} \sin\varphi_{k'}\, ,\; -\sin\theta_{k'}\, ) \tag{864} $$
Figure 32: The coordinate system and polarization vectors used to describe
Thomson scattering.
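vectors, the scattering cross-section for incident radiation that is polarized along
the x-direction takes on the form
$$ \left( \frac{d\sigma}{d\Omega'} \right)_{x-pol} = r_e^2 \begin{cases} \cos^2\theta_{k'} \cos^2\varphi_{k'} & \text{for } \alpha' = 1 \\ \sin^2\varphi_{k'} & \text{for } \alpha' = 2 \end{cases} \tag{866} $$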
If the incident beam has its polarization along the x-direction, and the
detector is not sensitive to the polarization, then the final polarization must be
summed over. In this case of a polarized beam and a polarization insensitive
detector, the cross-section is given by
$$ \left( \frac{d\sigma}{d\Omega'} \right)_{x-pol} = r_e^2 \left( \cos^2\theta_{k'} \cos^2\varphi_{k'} + \sin^2\varphi_{k'} \right) \tag{867} $$
where the polarizations of the final state photon have been summed over.
if the polarizations of the final state photons are measured. This result is
identical to that obtained by assuming that the initial beam is composed of one half
of the number of photons polarized along the x-direction and the other half of the
number of photons polarized along the y-direction. That is
$$ \left( \frac{d\sigma}{d\Omega'} \right)_{unpol} = r_e^2 \begin{cases} \frac{1}{2} \cos^2\theta_{k'}\, ( \cos^2\varphi_{k'} + \sin^2\varphi_{k'} ) & \text{for } \alpha' = 1 \\ \frac{1}{2}\, ( \sin^2\varphi_{k'} + \cos^2\varphi_{k'} ) & \text{for } \alpha' = 2 \end{cases} \tag{869} $$
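The polarization sums above can be checked with a few lines of code. The sketch below works in units of r_e² and takes the first final-state polarization vector from (864); the second vector, (−sin ϕ, cos ϕ, 0), is an assumption made here (it is transverse to k′ and reproduces the α′ = 2 entry of (866)). Integrating the unpolarized cross-section over all directions should give 8π/3, the Thomson value. All function names are illustrative.

```python
import math

def final_polarizations(theta, phi):
    """Two unit vectors transverse to k' at polar angles (theta, phi)."""
    e1 = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    e2 = (-math.sin(phi), math.cos(phi), 0.0)  # assumed form of the second vector
    return e1, e2

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dsigma(theta, phi, incident):
    """Cross-section in units of r_e^2, summed over the final polarizations."""
    return sum(dot(e, incident) ** 2 for e in final_polarizations(theta, phi))

def dsigma_unpolarized(theta, phi):
    """Average over incident polarizations along x and along y."""
    return 0.5 * (dsigma(theta, phi, (1.0, 0.0, 0.0)) + dsigma(theta, phi, (0.0, 1.0, 0.0)))

def total_cross_section(n=200):
    """Midpoint-rule integral of the unpolarized cross-section over the sphere."""
    d_theta, d_phi = math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            total += dsigma_unpolarized(theta, phi) * math.sin(theta) * d_theta * d_phi
    return total

print(total_cross_section(), 8.0 * math.pi / 3.0)
```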
Classical Interpretation
In the first process, an electron which is bound harmonically to the atom and
responds to an electromagnetic field E_0 exp[ i ω t ] can be described by the
equation of motion
$$ \ddot{r} + \omega_0^2\, r = \frac{q}{m}\, E_0\; \Re e\, \exp[\, i \omega t\, ] \tag{873} $$
where ω_0 is the frequency of the electron's natural motion. In the steady state,
one finds
$$ r = \frac{\frac{q}{m}\, E_0}{\omega_0^2 - \omega^2}\; \Re e\, \exp[\, i \omega t\, ] \tag{874} $$
The acceleration of the charged particle can be described by
$$ \ddot{r} = -\,\frac{\frac{q}{m}\, \omega^2\, E_0}{\omega_0^2 - \omega^2}\; \Re e\, \exp[\, i \omega t\, ] \tag{875} $$
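$$ \begin{aligned} P(\omega) &= \frac{2}{3}\, \frac{q^2\, r^2\, \omega^4}{c^3} \\ &= \frac{2\, q^4\, E_0^2}{3\, m^2 c^3}\; \frac{\omega^4}{( \omega_0^2 - \omega^2 )^2} \end{aligned} \tag{876} $$
$$ \sigma = \frac{8\pi}{3}\, r_e^2\; \frac{\omega^4}{( \omega_0^2 - \omega^2 )^2} \tag{878} $$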
This formula has the correct frequency dependence in the limit ω ≪ ω_0, in which
case the classical cross-section varies as ω⁴, as expected for Rayleigh scattering.
On the other hand, in the limit ω ≫ ω_0 the cross-section becomes frequency
independent, as is expected for Thomson scattering.
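The two limits of (878) can be made concrete with a short evaluation. In the sketch below the cross-section is measured in units of r_e², the frequencies are in arbitrary units, and the function name is illustrative.

```python
import math

THOMSON = 8.0 * math.pi / 3.0   # Thomson cross-section in units of r_e^2

def sigma_classical(omega, omega0):
    """Classical cross-section (878) in units of r_e^2."""
    return THOMSON * omega**4 / (omega0**2 - omega**2) ** 2

omega0 = 1.0
# Rayleigh limit: sigma is close to (8 pi / 3) (omega / omega0)^4.
print(sigma_classical(0.01, omega0) / (THOMSON * 0.01**4))
# Thomson limit: sigma is close to 8 pi / 3, independent of omega.
print(sigma_classical(100.0, omega0) / THOMSON)
```

Both printed ratios should be close to one.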
10.2.3 Raman Scattering
For inelastic scattering, one has ħω ≠ ħω′; therefore, the condition of
conservation of energy requires that
Since it is most probable that the initial electron is in the ground state, one has
$$ \hbar\omega > \hbar\omega' \tag{881} $$
Hence, the final photon has less energy than the initial photon. That is, the
electromagnetic field has lost energy and left the electron in an excited state.
This inelastic process describes the Stokes line. On the other hand, if the
electron is initially in an excited state, then it is possible that the electron
loses energy and makes a transition to the ground state. In this case,
$$ \hbar\omega < \hbar\omega' \tag{883} $$
Figure 33: The schematic frequency dependence of the observed intensity I(ω′)
expected in a Raman scattering experiment, showing the Stokes (n → n′) and
anti-Stokes (n′ → n) lines. The ratio of intensities of the Stokes
and anti-Stokes lines provides a relative measure of the initial occupation of the
low-energy state n and the higher-energy excited state n′.
10.2.4 Radiation Damping and Resonance Fluorescence
In the analysis of photon scattering, it has been assumed that the energy de-
nominators ( En − En00 + h̄ ω ) do not vanish. If the energy denominator
vanishes, the Kramers-Heisenberg formula becomes singular, however, the phys-
ically observed scattering cross-section may become large but does not diverge.
This is the phenomenon of resonance-fluorescence. Using the classical model,
one can describe the scattering cross-section, if damping is introduced to rep-
resent the lifetime of the electronic states. That is, the dynamics of the bound
electron is modelled by a damped harmonic oscillator
$$ \ddot{r} + \gamma\, \dot{r} + \omega_0^2\, r = \frac{q}{m}\, E_0\; \Re e\, \exp[\, i \omega t\, ] \tag{884} $$
Since γ, which is related to the decay rate, is of the order of 10⁸ sec⁻¹, it is usually
negligible compared with the frequency of light, which is estimated as ω ∼ 10¹⁵
sec⁻¹. Following our previous arguments, one finds that the scattering
cross-section is given by
$$ \sigma(\omega) = \frac{8\pi}{3} \left( \frac{q^2}{m c^2} \right)^2 \frac{\omega^4}{( \omega^2 - \omega_0^2 )^2 + \gamma^2 \omega^2} \tag{886} $$
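A short evaluation shows how the damping keeps (886) finite on resonance. The sketch below works in units of (q²/mc²)² and uses the order-of-magnitude values quoted in the text, γ ∼ 10⁸ sec⁻¹ and ω_0 ∼ 10¹⁵ sec⁻¹; the function name is illustrative.

```python
import math

def sigma_damped(omega, omega0, gamma):
    """Cross-section (886) in units of (q^2 / m c^2)^2."""
    return (8.0 * math.pi / 3.0) * omega**4 / ((omega**2 - omega0**2) ** 2 + gamma**2 * omega**2)

omega0, gamma = 1.0e15, 1.0e8
on_resonance = sigma_damped(omega0, omega0, gamma)
thomson = 8.0 * math.pi / 3.0
# On resonance the cross-section is finite and equals (8 pi / 3) (omega0 / gamma)^2.
print(on_resonance / thomson, (omega0 / gamma) ** 2)
```

With these values the resonant cross-section exceeds the Thomson value by a factor of about 10¹⁴, which is large but not divergent.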
so
$$ E_n = E_n^{(0)} + \Delta E_n \tag{889} $$
Hence, due to the form of the expressions for the shift and the lifetime as the
real and imaginary parts of a complex function, it is possible to consider an
unstable state as having a complex energy47 given by
$$ E_n - i\, \Gamma_n \approx E_n^{(0)} + \Delta E_n - i\, \frac{\hbar}{2 \tau_n} \tag{890} $$
That is, the lifetime can be considered as giving the state an energy-width Γ_n.
This is the natural width of the electronic state. The factor of two in the width
can be understood by considering the time-dependence of the state | ψ_n(t) >
which is given by
$$ | \psi_n(t) \rangle = \exp\left[ -\frac{i}{\hbar}\, ( E_n - i\, \Gamma_n )\, t \right] | \psi_n(0) \rangle \tag{891} $$
Hence, the probability P_n(t) that the state has not decayed at time t is given
by
$$ P_n(t) = \langle \psi_n(t) | \psi_n(t) \rangle = \exp\left[ -\frac{2\, \Gamma_n\, t}{\hbar} \right] = \exp\left[ -\frac{t}{\tau_n} \right] $$
Therefore, for the case of resonant scattering, one should replace the energies
by complex numbers such that the real part represents the state’s energy and
the imaginary part describes half the state’s decay rate. In the case of resonant
scattering, the Kramers-Heisenberg formula is modified48 to
$$ \frac{d\sigma}{d\Omega'} = \left( \frac{e^2}{m c^2} \right)^2 \left( \frac{\omega'}{\omega} \right) | M |^2 \tag{895} $$
47 That is, the perturbation produces a complex shift of the energy-shift which related to
where the matrix elements are given by
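$$ \begin{aligned} M &= \hat{\epsilon}_\alpha(k) \cdot \hat{\epsilon}_{\alpha'}(k')\; \langle n'l'm' | nlm \rangle \\ &+ \frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | nlm \rangle}{( E_n - E_{n''} - i\, \Gamma_{n''} + \hbar\omega )} \\ &+ \frac{1}{m} \sum_{n''l''m''} \frac{\langle n'l'm' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | nlm \rangle}{( E_n - E_{n''} - i\, \Gamma_{n''} - \hbar\omega' )} \end{aligned} \tag{896} $$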
Close to resonance, the resonant denominator is given by $\Gamma \sim \frac{\hbar c}{a} \left( \frac{e^2}{\hbar c} \right)^4$,
whereas the numerator is of the order of $\frac{e^2}{a}$. Hence, on resonance the matrix
elements can be of the order $\left( \frac{e^2}{\hbar c} \right)^{-3}$ larger than the non-resonant matrix
elements. Therefore, on resonance, the non-resonant terms may be neglected. In
the following, it shall be assumed that the resonant state is non-degenerate
$$ \frac{d\sigma}{d\Omega'} = \left( \frac{e^2}{m^2 c^2} \right)^2 \left( \frac{\omega'}{\omega} \right) \frac{| \langle n'l'm' | \hat{\epsilon}_{\alpha'}(k') \cdot \hat{p} | n''l''m'' \rangle\, \langle n''l''m'' | \hat{\epsilon}_\alpha(k) \cdot \hat{p} | nlm \rangle |^2}{( E_n - E_{n''} + \hbar\omega )^2 + \Gamma_{n''}^2} \tag{897} $$
This expression can be re-expressed in terms of the product of two factors
which is the probability for absorption from the ground state to the resonant
state | n″l″m″ > (divided by the incident flux) times the probability for its
decay via emission. On resonance, it appears that the process corresponds to
two sequential processes, first absorption and secondly emission.
The difference between a resonant process and a two-step process is
determined by the lifetime of the intermediate state | n″l″m″ > compared with the
inverse of the frequency width of the photon beam. The frequency width of the photon beam
may be limited by the monochromator, or by the time-scale of the experiment
if it involves a pulsed light source. If the lifetime of the intermediate state is
sufficiently long compared with the time-scale of the experiment, it may be
possible to observe the decay long after the incident light has been switched off. In
this case, the resonance can be considered to be composed of two independent
processes49. Furthermore, it may be possible to perform further experiments
on the surviving intermediate state. In the opposite case, where the lifetime
of the intermediate state is shorter than the time-scale of the experiment, the
intermediate state will have decayed before the experiment has terminated.
where the decay includes transitions to all possible final states, one finds
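$$ \begin{aligned} \rho_n(E) &= \frac{1}{2\pi i} \left[ \frac{1}{E - E_n - i\, \frac{\hbar}{2\tau_n}} - \frac{1}{E - E_n + i\, \frac{\hbar}{2\tau_n}} \right] \\ &= \frac{1}{\pi}\; \frac{\frac{\hbar}{2\tau_n}}{( E - E_n )^2 + \left( \frac{\hbar}{2\tau_n} \right)^2} \end{aligned} \tag{901} $$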
This can only be an approximate form of the energy-distribution since the
energy must be bounded from below. The existence of a lower bound to the
energy distribution implies that the width of the electronic energy level
ħ/(2τ_n) = Γ_n has to be energy-dependent, as this must become zero below a threshold
energy. However, it should be noted that the width of the energy-distribution
will determine the approximate exponential decay. Since the perturbations
introduce an energy-dependent width to the wave packet, causality requires that
the energy-shift ∆E_n should also be energy-dependent. Hence, the effects of
the perturbation (such as the energy-shift and lifetime) should be described in
terms of a self-energy Σ_n(E)
49 V. Weisskopf, Ann. der Physik, 9, 23 (1931).
This complex self-energy has a real and imaginary part. The imaginary part
can be thought of as occurring via amplification of the infinitesimal imaginary
term i η in the denominator, and can be seen to be non-zero when the energy
E of the component in the wave packet falls in the region where the spectral
density of the approximate E_{n′} is finite. Hence, since the E_{n′} are bounded from
below, then so is the energy-distribution ρ_n(E) since
$$ \rho_n(E) = -\,\frac{1}{\pi}\; \frac{\Im m\, \Sigma_n(E + i\eta)}{( E - E_n - \Re e\, \Sigma_n(E) )^2 + ( \Im m\, \Sigma_n(E + i\eta) )^2} \tag{904} $$
The real part of the self-energy must also be energy-dependent, since it is related
to the imaginary part via the Kramers-Kronig relations
$$ \begin{aligned} \Re e\, \Sigma_n(E) &= -\,\frac{1}{\pi} \int_{-\infty}^{\infty} dz\; \frac{\Im m\, \Sigma_n(z + i\eta)}{E - z} \\ \Im m\, \Sigma_n(E + i\eta) &= +\,\frac{Pr}{\pi} \int_{-\infty}^{\infty} dz\; \frac{\Re e\, \Sigma_n(z)}{E - z} \end{aligned} \tag{905} $$
Hence, the real part of the self-energy is also energy-dependent. The
Kramers-Kronig relation is an expression of causality.
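Since the electronic states in the expression for the Fermi-Golden rule decay
$$ \left( \frac{1}{\tau} \right)_{nl \rightarrow n'l'} = \frac{2\pi}{\hbar}\, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2\; \delta( E_{n'l'} - E_{nl} - \hbar\omega ) \tag{907} $$
are to be interpreted as wave packets with a distribution of energies, the factor
expressing conservation of energy should be expressed in terms of the energy
conservation for the components of the wave packets. Hence, the decay rate
should be written as the convolution
$$ \begin{aligned} \left( \frac{1}{\tau} \right)_{nl \rightarrow n'l'} &= \frac{2\pi}{\hbar}\, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2 \int_{-\infty}^{\infty} dE'\, \rho_{n'l'}(E') \int_{-\infty}^{\infty} dE\, \rho_{nl}(E)\; \delta( E' - E - \hbar\omega ) \\ &= \frac{2\pi}{\hbar}\, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2 \int_{-\infty}^{\infty} dE\; \rho_{n'l'}( E + \hbar\omega )\, \rho_{nl}(E) \end{aligned} \tag{908} $$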
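We shall use the approximation for the energy distributions suggested by eqn (901).
In this case, the convolution is evaluated by contour integration as
$$ \left( \frac{1}{\tau} \right)_{nl \rightarrow n'l'} = \frac{2}{\hbar}\, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2\; \frac{\frac{\hbar}{2\tau_{nl}} + \frac{\hbar}{2\tau_{n'l'}}}{( \hbar\omega + E_{nl} - E_{n'l'} )^2 + \left( \frac{\hbar}{2\tau_{nl}} + \frac{\hbar}{2\tau_{n'l'}} \right)^2} \tag{909} $$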
since only the terms with poles on the opposite sides of the real-axis yield
non-zero contributions. From this, one can show that the optical absorption
cross-section is given by
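$$ \sigma_{absorb}(\omega) = \frac{4\pi}{3} \left( \frac{e^2}{\hbar c} \right) \sum_{n'l'm'} | \langle n'l'm' | r | nlm \rangle |^2\; \frac{\omega_{n'l',nl} \left( \frac{1}{2\tau_{n'l'}} + \frac{1}{2\tau_{nl}} \right)}{( \omega_{n'l',nl} - \omega )^2 + \left( \frac{1}{2\tau_{n'l'}} + \frac{1}{2\tau_{nl}} \right)^2} \tag{910} $$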
which was first derived by Weisskopf and Wigner50 . Hence, the natural width
is given by the average of the decay rates for the initial and final electronic
states. This leads to the conclusion that even weak lines can be broad, if the
final electronic state has a short lifetime.
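The statement that the widths of the initial and final states add can be checked numerically from the convolution in (908). The sketch below assumes the Lorentzian form (901) for both energy distributions, with Γ = ħ/(2τ), and compares a Simpson-rule estimate of the overlap integral with a single Lorentzian whose width is the sum of the two widths, as in (909). The parameter values, the integration window and the function names are illustrative.

```python
import math

def lorentzian(E, E0, Gamma):
    """Normalized Lorentzian of eqn (901) with half-width Gamma = hbar / (2 tau)."""
    return (Gamma / math.pi) / ((E - E0) ** 2 + Gamma**2)

def overlap(hw, E_final, G_final, E_init, G_init, half_range=200.0, n=200_000):
    """Simpson estimate of int dE rho_final(E + hw) rho_init(E), as in (908)."""
    lo = E_init - half_range
    h = 2.0 * half_range / n          # n must be even
    total = 0.0
    for i in range(n + 1):
        E = lo + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * lorentzian(E + hw, E_final, G_final) * lorentzian(E, E_init, G_init)
    return total * h / 3.0

hw, E_final, G_final, E_init, G_init = 2.3, 3.0, 0.5, 1.0, 0.8
numeric = overlap(hw, E_final, G_final, E_init, G_init)
combined = lorentzian(hw, E_final - E_init, G_final + G_init)
print(numeric, combined)
```

Multiplying the common value by (2π/ħ) |⟨n′l′m′| Ĥ_I |nlm⟩|² reproduces the structure of (909).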
10.3 Renormalization
Quantum Electrodynamics treats the interactions between charged particles and
the electromagnetic field, and often contains infinities. The zero-point energy of
the electromagnetic field is one such infinity. In most cases, these infinities can
be ignored since they are not measurable: the infinities occur as modifications
caused by the introduction of interactions between the charged particles
of a hypothetical system and an electromagnetic field. That is, the infinities
occur in the form of a renormalization of the quantities of the non-interacting
theory. These infinite renormalizations do not lead to the rejection of the theory
of Quantum Electrodynamics since the quantities of the non-interacting system
are not measurable. To be sure, the infinities occur in relations between
hypothetical quantities and physically measurable quantities, and so these infinities
can be ignored since the hypothetical quantities are undefined. However, it is
possible to use the theory to eliminate the unmeasurable quantities, thereby
yielding relations between physically measurable quantities and other physically
measurable quantities. In Quantum Electrodynamics, the infinities cancel in
equations which only contain physically measurable quantities. This fortunate
circumstance makes the theory of Quantum Electrodynamics renormalizable.
50 V. F. Weisskopf and E. Wigner, Z. Physik, 63, 54 (1930).
First, it shall be shown how the infinite zero-point energy of the electro-
magnetic field can lead to a (finite) physically measurable force between its
containing walls.
Figure 34: The geometry of the partitioned electromagnetic cavity used to con-
sider the Casimir effect.
evaluate the total energy of this configuration and then deduce the form of the
interaction between the partition and the walls of the cavity.
We shall consider the total energy due to the zero-point fluctuations in the
container. Since the zero-point energy is divergent due to the presence of
arbitrarily large frequencies, we shall introduce a convergence factor. The
convergence factor can be motivated by the observation that, in matter, electromagnetic
radiation becomes exponentially damped at large frequencies. Hence, one can
write
$$ E = \frac{1}{2} \sum_{k,\alpha} \hbar\, \omega_{k,\alpha}\, \exp\left[ -\lambda\, \frac{\omega_{k,\alpha}}{c} \right] \tag{911} $$
and then take the limit λ → 0.
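The factor of t⁻¹ can be eliminated by performing one of the
differentiations with respect to λ.
$$ E_d = \frac{\hbar c L^2}{2 d} \int_1^\infty dt\; \frac{\partial^2}{\partial\lambda^2}\; \frac{\exp[ \frac{\pi \lambda t}{d} ]}{( \exp[ \frac{\pi \lambda t}{d} ] - 1 )^2} \tag{920} $$
We shall set
$$ s = \exp\left[ \frac{\pi \lambda t}{d} \right] - 1 \tag{921} $$
therefore
$$ E_d = \frac{\hbar c L^2}{2 d}\; \frac{\partial^2}{\partial\lambda^2} \left( \frac{d}{\pi\lambda} \right) \int_{s_0}^\infty \frac{ds}{s^2} \tag{922} $$
$$ \begin{aligned} E_d &= \frac{\hbar c L^2}{2 d}\; \frac{\partial^2}{\partial\lambda^2} \left[ \left( \frac{d}{\pi\lambda} \right) \frac{1}{s_0} \right] \\ &= \frac{\hbar c L^2}{2 d}\; \frac{\partial^2}{\partial\lambda^2} \left[ \frac{\frac{d}{\pi\lambda}}{\exp[ \frac{\pi\lambda}{d} ] - 1} \right] \\ &= \frac{\hbar c L^2}{2 d}\; \frac{\partial^2}{\partial\lambda^2} \left[ \left( \frac{d}{\pi\lambda} \right)^2 \frac{\frac{\pi\lambda}{d}}{\exp[ \frac{\pi\lambda}{d} ] - 1} \right] \end{aligned} \tag{924} $$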
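with n = 4 remains finite in the limit λ → 0 and all the higher-order terms
vanish in this limit. Explicitly, one has
$$ E_d = \frac{\hbar c L^2}{2 d} \left[ \frac{6 B_0\, d^2}{\pi^2 \lambda^4} + \frac{2 B_1\, d}{\pi \lambda^3} + \frac{2 B_4\, \pi^2}{4!\; d^2} + O(\lambda) \right] \tag{927} $$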
The first term in the energy is proportional to L2 d, which is the volume of the
cavity and the second term is proportional to L2 the surface area of the walls.
The third term is independent of the cut-off and the higher order terms vanish
in the limit λ → 0.
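The coefficients in (927) follow from expanding the last line of (924) with the generating function x/(eˣ − 1) = Σ_n B_n xⁿ/n! and differentiating twice with respect to λ, which turns the n-th term into B_n (n − 2)(n − 3)/n! × (π/d)ⁿ⁻² λⁿ⁻⁴. The following sketch, in exact rational arithmetic, evaluates the numerical prefactors; the helper names are illustrative, and the Bernoulli numbers are generated from the standard recurrence with the convention B_1 = −1/2.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(n_max):
    """B_0 ... B_n_max for the generating function x / (e^x - 1), so B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # Recurrence: sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1.
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def lambda_coefficient(n, B):
    """Prefactor of (pi/d)^(n-2) * lambda^(n-4) after the double differentiation."""
    return B[n] * (n - 2) * (n - 3) / factorial(n)

B = bernoulli_numbers(4)
for n in range(5):
    print(n, lambda_coefficient(n, B))
```

The n = 0 and n = 1 entries give the factors 6 B_0 and 2 B_1 of the divergent terms, the n = 2 and n = 3 terms drop out, and the n = 4 entry is −1/360, so the cut-off independent part of E_d is −π² ħ c L²/(720 d³).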
The Casimir force is the force between two planes, which originates from
the energy of the field52. This energy can be separated out into a volume
part and parts due to the creation of the surfaces and an interaction energy
between the surfaces. In order to eliminate both the volume dependence of the
energy and the surface energies, we are considering two configurations of the
partitions in the cavity. In one configuration the plane divides the volume into
two unequal volumes d L² and (L − d) L², and the other configuration is a
reference configuration where the cavity is partitioned into two equal volumes
L³/2. The difference of energies for these configurations is given by
$$ \Delta E = E_d + E_{L-d} - 2\, E_{\frac{L}{2}} \tag{928} $$
$$ \lim_{L \gg d,\; \lambda \rightarrow 0} \Delta E \rightarrow -\,\frac{\pi^2\, \hbar c\, L^2}{720\; d^3} \tag{929} $$
The d-dependence of the energy difference leads to an attractive force between
the two plates separated by a distance d, which is the Casimir force
$$ F = -\,\frac{\pi^2}{240}\; \hbar c\; \frac{L^2}{d^4} \tag{930} $$
The force is proportional to L2 which is the area of the wall of the cavity. The
predicted force was measured by Sparnaay53 . A more recent experiment involv-
ing a similar force between a planar surface and a sphere has achieved greater
accuracy54 .
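To indicate the order of magnitude involved, the sketch below evaluates the force per unit area implied by (930), F/L² = −π² ħ c/(240 d⁴), for a plate separation of one micron. The rounded SI values of ħ and c, the chosen separation and the function name are illustrative.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s

def casimir_pressure(d):
    """Force per unit area between ideal parallel plates, F / L^2 from eqn (930), in pascals."""
    return -math.pi**2 * HBAR * C / (240.0 * d**4)

print(casimir_pressure(1.0e-6))   # of the order of a millipascal, attractive
```

Because of the d⁻⁴ dependence, halving the separation increases the pressure by a factor of sixteen.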
52 Our considerations only includes the part of Fock space that corresponds to having zero
numbers of excited quanta. Hence, the Casimir force is due to the properties of the field, and
is not due to the transmission of real particles (photons) between the planes.
53 M. J. Sparnaay, Physica 24, 751 (1958).
54 S. K. Lamoreaux, Phys. Rev. Lett. 78, 5 (1997).
Figure 35: The separation-dependent force between two closely spaced metallic
surfaces due to the modification of the zero-point energy. The lower panel shows
[Caption printed with the reproduced plot: FIG. 4. Top: All data with electric force subtracted, averaged into bins (of varying width), compared to the expected Casimir force for a 11.3 cm spherical plate. Bottom: Theoretical Casimir force, without the thermal correction, subtracted from top plot; the solid line shows the expected residuals.]
Cut-Off Independence
It is the boundary condition and not the cut-off that plays an important role
in the Casimir effect. For simplicity, one can choose zero boundary conditions.
The zero-point energy of a cylindrical electromagnetic cavity of radius R and
length d can be expressed as the sum
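$$ E_d = 2\; \frac{\hbar c}{2}\; \frac{\pi R^2}{2\pi} \sum_{n_z=1}^{\infty} \int dk_\rho\; k_\rho\, \sqrt{k_\rho^2 + \left( \frac{\pi n_z}{d} \right)^2}\;\; F\left( \sqrt{k_\rho^2 + \left( \frac{\pi n_z}{d} \right)^2} \right) \tag{931} $$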
where F(z) is an arbitrary cut-off function (which may depend on an arbitrary
parameter λ which is ultimately going to be set to zero). The cut-off must not
affect the low-energy modes, so one can choose F(0) = 1 and all the derivatives
of F(z) to be zero for finite values of z. These assumptions are all in accord with
the ideal case of no cut-off function or F(z) = 1. The energy can be written as
$$ E_d = \frac{\hbar c R^2}{2} \sum_{n_z=1}^{\infty} f(n_z) \tag{932} $$
55 The independence of any cut-off procedure can be shown by evaluating the divergent sums
[Figure: the cut-off function F(z) plotted against z/N.]
where
$$ f(n_z) = \int_0^\infty dk_\rho\; k_\rho\, \sqrt{k_\rho^2 + \left( \frac{\pi n_z}{d} \right)^2}\;\; F\left( \sqrt{k_\rho^2 + \left( \frac{\pi n_z}{d} \right)^2} \right) \tag{933} $$
We shall assume that f(n) and all its derivatives vanish in the limit
of large n, lim_{N→∞} f(N) → 0, due to the behavior of the cut-off function.
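The corrections in the Euler-Maclaurin summation formula can be evaluated
by noting that the first derivative of f(n) with respect to n is given by
$$ f^{(1)}(n) = \frac{\pi^2 n}{d^2} \int_0^\infty dk_\rho\; \frac{k_\rho}{\sqrt{k_\rho^2 + ( \frac{\pi n}{d} )^2}}\;\; F\left( \sqrt{k_\rho^2 + \left( \frac{\pi n}{d} \right)^2} \right) \tag{936} $$
since the derivatives of F(z) all vanish for finite z. On integrating by parts, one
obtains
$$ \begin{aligned} f^{(1)}(n) &= \frac{\pi^2 n}{d^2} \int_{\frac{\pi n}{d}}^\infty dz\; F(z) \\ &= \frac{\pi^2 n}{d^2} \int_{\frac{\pi n}{d}}^\infty dz\; \frac{\partial z}{\partial z}\, F(z) \\ &= \frac{\pi^2 n}{d^2}\; \Big[\, z\, F(z)\, \Big]_{\frac{\pi n}{d}}^{\infty} \\ &= -\,\frac{\pi^3 n^2}{d^3} \end{aligned} \tag{937} $$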
In deriving the above expression, the condition that the first-order derivative of
F(z) vanishes for finite z has been used. It immediately follows that
$$ f^{(2)}(n) = -\,\frac{2 \pi^3 n}{d^3} \tag{938} $$
and
$$ f^{(3)}(n) = -\,\frac{2 \pi^3}{d^3} \tag{939} $$
and all higher order derivatives vanish. Hence, one finds that at n = 0 all the
m-th order derivatives f^{(m)}(0) vanish, except for m = 3 which is given by
$$ f^{(3)}(0) = -\,\frac{2 \pi^3}{d^3} \tag{940} $$
Hence, on evaluating the energy of the cylindrical cavity (and using the zero
boundary conditions), one finds that the energy has a number of infinite terms.
The integral part of the expression only depends on the volume of the cav-
ity, and hence drops out when the energy differences are taken. The only terms
that yield non-zero contributions to the energy difference depend on d and these
terms give rise to the Casimir force. This approach also showed that any par-
ticular choice made for the cut-off is irrelevant.
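As a hedged cross-check of this conclusion, the d-dependent finite term can be written out explicitly. The sketch below assumes the Euler-Maclaurin formula in the form quoted in the Mathematical Interlude, with f(N) and all its derivatives vanishing as N → ∞, and uses f^{(1)}(0) = 0, f^{(3)}(0) = −2π³/d³ and B_4 = −1/30:

$$ \sum_{n=1}^{\infty} f(n) = \int_0^\infty dn\, f(n) - \frac{1}{2}\, f(0) - \frac{B_2}{2!}\, f^{(1)}(0) - \frac{B_4}{4!}\, f^{(3)}(0) $$

$$ \frac{\hbar c R^2}{2} \left( -\frac{B_4}{4!}\, f^{(3)}(0) \right) = \frac{\hbar c R^2}{2} \left( -\frac{\pi^3}{360\, d^3} \right) = -\,\frac{\pi^2\, \hbar c\, ( \pi R^2 )}{720\; d^3} $$

This has the same form as eqn (929), with the plate area L² replaced by the cross-sectional area πR² of the cylinder. The term −f(0)/2 does not depend on d, since f(0) only involves the integral of k_ρ² F(k_ρ), and so it drops out of the energy difference.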
Mathematical Interlude:
The Euler-Maclaurin Summation Formula.
The Euler-Maclaurin formula allows one to accurately evaluate the difference
between finite summations and their approximate evaluations in the form of integrals.
The Euler-Maclaurin formula provides expressions for the difference between the
sum and the integral in terms of the higher derivatives f^{(n)} at the end points
of the interval 0 and N. For any integer p, one has
$$ S + \frac{1}{2} \Big[ f(0) + f(N) \Big] - I = \sum_{n=1}^{p} \frac{B_{2n}}{(2n)!} \Big[ f^{(2n-1)}(N) - f^{(2n-1)}(0) \Big] + R \tag{944} $$
where B_1 = −1/2, B_2 = 1/6, B_3 = 0, B_4 = −1/30, B_5 = 0, B_6 = 1/42, B_7 =
0, B_8 = −1/30, ... are the Bernoulli numbers, and R is an error term which is
normally small if the series on the right is truncated at a suitable value of p.
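A simple exact check of (944) is provided by a polynomial, for which the series terminates and R = 0. The sketch below takes f(x) = x⁴ and assumes that S denotes the sum of f(n) over the interior points n = 1, ..., N − 1 and that I is the integral of f from 0 to N (the defining equations are not reproduced in this excerpt). The function names are illustrative.

```python
from fractions import Fraction
from math import factorial

B2, B4 = Fraction(1, 6), Fraction(-1, 30)   # Bernoulli numbers quoted in the text

def left_hand_side(N):
    """S + (f(0) + f(N)) / 2 - I for f(x) = x^4, with S assumed to run over n = 1 .. N-1."""
    S = sum(n**4 for n in range(1, N))
    I = Fraction(N**5, 5)
    return S + Fraction(N**4, 2) - I

def right_hand_side(N):
    """Bernoulli series of (944) for f(x) = x^4; the terms with 2n - 1 >= 5 vanish."""
    first = lambda x: 4 * x**3     # f^(1)(x)
    third = lambda x: 24 * x       # f^(3)(x)
    return (B2 / factorial(2)) * (first(N) - first(0)) + (B4 / factorial(4)) * (third(N) - third(0))

print(left_hand_side(10), right_hand_side(10))
```

Both sides evaluate to the same rational number for every N, which confirms the signs and the factor of 1/2 in (944) under the stated assumption about S.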
where P_n(x) = B_n(x − [x]) are the periodic Bernoulli polynomials. The
remainder term can be estimated as
$$ | R | \leq \frac{2}{(2\pi)^{p}} \int_0^N dx\; | f^{(2p-1)}(x) | \tag{946} $$
Derivation by Induction
First we shall examine the properties of the Bernoulli polynomials and the
Bernoulli numbers. Then we shall indicate how the Euler-Maclaurin formula
can be obtained by induction.
where Bn are the Bernoulli constants. Hence, the Bernoulli constants are the
Bernoulli polynomials evaluated at x = 0, i.e. Bn (0) = Bn . Furthermore, on
differentiating the generating function w.r.t. x, one finds
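\[
\frac{\partial G(z,x)}{\partial x} = z \, G(z,x) \tag{949}
\]
which implies that
\[
\sum_{n=0}^{\infty} \frac{\partial B_n(x)}{\partial x} \frac{z^n}{n!} = z \sum_{n=0}^{\infty} B_n(x) \frac{z^n}{n!} \tag{950}
\]
On equating the coefficients of $z^n$ in the above equation, one obtains the important relation
\[
\frac{\partial B_n(x)}{\partial x} = n \, B_{n-1}(x) \tag{951}
\]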
Therefore, by integration it is easy to show that the $B_n(x)$ are polynomials of degree n. The first few Bernoulli polynomials can be explicitly constructed from the generating function expansion. The first few polynomials are given by
\[
\begin{aligned}
B_0(x) &= 1 \\
B_1(x) &= x - \frac{1}{2} \\
B_2(x) &= x^2 - x + \frac{1}{6} \\
B_3(x) &= x^3 - \frac{3}{2} x^2 + \frac{1}{2} x \\
B_4(x) &= x^4 - 2 x^3 + x^2 - \frac{1}{30} \\
B_5(x) &= x^5 - \frac{5}{2} x^4 + \frac{5}{3} x^3 - \frac{1}{6} x \\
&\ldots
\end{aligned} \tag{952}
\]
From the generating function expansion, one can show that the Bernoulli polynomials are either even or odd functions of $x - \frac{1}{2}$. The generating function can be expressed
\[
G(z,x) = e^{z ( x - \frac{1}{2} )} \, \frac{z}{e^{\frac{z}{2}} - e^{-\frac{z}{2}}} = \sum_{n=0}^{\infty} B_n(x) \frac{z^n}{n!} \tag{953}
\]
where the second factor is an even function of z, thus, the generating function is invariant under the combined transformation $z \to -z$ and $( x - \frac{1}{2} ) \to - ( x - \frac{1}{2} )$. Therefore, one has
\[
\sum_{n=0}^{\infty} B_n\left( \frac{1}{2} + x - \frac{1}{2} \right) \frac{z^n}{n!} = \sum_{n=0}^{\infty} B_n\left( \frac{1}{2} + \frac{1}{2} - x \right) ( - 1 )^n \frac{z^n}{n!} \tag{954}
\]
The even part has only even terms in its Taylor expansion, and there is only
one term in the odd part. Hence, the odd Bernoulli numbers vanish for n > 1,
i.e. B2n+1 (0) = 0 for n > 0. Therefore, for n ≥ 2, one has Bn (0) = Bn (1). This
equality can be used to evaluate the integrals of the Bernoulli polynomial over
the range from 0 to 1. On expressing the integral of Bn (x) in terms of Bn+1 (x),
one has
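\[
\begin{aligned}
\int_0^1 dx \, B_n(x) &= \frac{1}{( n + 1 )} \int_0^1 dx \, \frac{\partial B_{n+1}(x)}{\partial x} \\
&= \frac{B_{n+1}(1) - B_{n+1}(0)}{( n + 1 )} \\
&= 0 \quad \mbox{for } n \ge 1
\end{aligned} \tag{958}
\]
Hence, the Bernoulli polynomials may be defined recursively via the relation
\[
\frac{\partial B_n(x)}{\partial x} = n \, B_{n-1}(x) \tag{959}
\]
if the constant of integration is fixed by
\[
\int_0^1 dx \, B_n(x) = 0 \quad \mbox{for } n \ge 1 \tag{960}
\]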
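The recursive definition of eqn(959) and eqn(960) translates directly into a short procedure. The following is a minimal sketch, not part of the original derivation: polynomials are stored as lists of exact rational coefficients (lowest power first), and the function name is chosen here purely for illustration.

```python
from fractions import Fraction

def bernoulli_polynomials(n_max):
    """Return coefficient lists (lowest power first) for B_0 .. B_{n_max}."""
    polys = [[Fraction(1)]]                  # B_0(x) = 1
    for n in range(1, n_max + 1):
        prev = polys[-1]
        # Integrate n * B_{n-1}(x) term by term: x**k -> x**(k+1) / (k+1).
        coeffs = [Fraction(0)] + [n * c / (k + 1) for k, c in enumerate(prev)]
        # Fix the constant of integration so the integral over [0, 1] vanishes.
        coeffs[0] = -sum(c / (k + 1) for k, c in enumerate(coeffs))
        polys.append(coeffs)
    return polys

polys = bernoulli_polynomials(5)
for n, coeffs in enumerate(polys):
    print(n, [str(c) for c in coeffs])
```

The constant terms produced in this way are the Bernoulli numbers B_n = B_n(0), and the coefficient lists can be compared with the polynomials listed in eqn(952).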
The periodic Bernoulli functions $P_n(x)$ can be defined by
\[
P_n(x) = B_n( x - [x] )
\]
where [x] is the integral part of x. This definition of $P_n(x)$ reduces to the Bernoulli polynomials on the interval (0, 1) since [x] = 0 in this interval. The functions $P_n(x)$ are periodic over an extended range of x with period 1.
v = P1 (x) (965)
\[
\sum_{n=1}^{N} f(n) = \int_1^N dx \, f(x) + \frac{f(1) + f(N)}{2} + \int_1^N dx \, P_1(x) \, \frac{\partial f(x)}{\partial x} \tag{970}
\]
The last two terms, therefore, give the error when the sum is approximated by
an integral. The first correction is simply the end point corrections from the
“trapezoidal rule”, and the second correction has to be evaluated to yield the
Euler-Maclaurin formula. The last correction is of the form of an integral which
can be expressed in terms of the sum of the integrals
\[
\int_n^{n+1} dx \, f'(x) \, P_1(x) \tag{971}
\]
where the prime refers to the derivative of f(x) w.r.t. x. The above expression can be evaluated by integrating by parts. The integrand is re-written as
\[
\int_n^{n+1} dx \, f'(x) \, P_1(x) = \int_n^{n+1} dx \, u \, \frac{\partial v}{\partial x} \tag{972}
\]
where one identifies the two factors as
\[
\begin{aligned}
u &= f'(x) \\
\frac{\partial v}{\partial x} &= P_1(x)
\end{aligned} \tag{973}
\]
Since the indefinite integral is evaluated as
\[
\int^x dx' \, P_1(x') = \frac{1}{2} P_2(x) \tag{974}
\]
the integration by parts yields
\[
\int_n^{n+1} dx \, P_1(x) \, f'(x) = \left. \frac{P_2(x) \, f'(x)}{2} \right|_n^{n+1} - \frac{1}{2} \int_n^{n+1} dx \, f''(x) \, P_2(x) \tag{975}
\]
However, one has $P_2(0) = P_2(1) = B_2$, therefore the above expression simplifies to
\[
\int_n^{n+1} dx \, P_1(x) \, f'(x) = \frac{B_2}{2} \left( f'(n+1) - f'(n) \right) - \frac{1}{2} \int_n^{n+1} dx \, f''(x) \, P_2(x) \tag{976}
\]
Then, on summing the above expression from n = 1 to n = N − 1, one finds
\[
\int_1^N dx \, P_1(x) \, f'(x) = \frac{B_2}{2} \left( f'(N) - f'(1) \right) - \frac{1}{2} \int_1^N dx \, f''(x) \, P_2(x) \tag{977}
\]
This yields the first term in the series of end point corrections in the Euler-Maclaurin formula, where the correction is the difference of the first derivatives at the end points multiplied by $B_2 / 2!$. The above process can be iterated yielding a complete proof of the Euler-Maclaurin summation formula.
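As a simple illustration of eqn(970) and eqn(977), one may take the test function $f(x) = x^2$ with N = 4 (a choice made here only for the purpose of the example). Since $f''(x) = 2$ and the integral of $P_2(x)$ over each unit interval vanishes, eqn(977) reduces to its end point term, and eqn(970) is satisfied exactly:
\[
\begin{aligned}
\int_1^4 dx \, P_1(x) \, f'(x) &= \frac{B_2}{2} \left( f'(4) - f'(1) \right) = \frac{1}{12} \left( 8 - 2 \right) = \frac{1}{2} \\
\sum_{n=1}^{4} n^2 &= \int_1^4 dx \, x^2 + \frac{1 + 16}{2} + \frac{1}{2} = 21 + \frac{17}{2} + \frac{1}{2} = 30
\end{aligned}
\]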
In order to get bounds on the size of the error when the sum is approximated
by the integral, we note that the Bernoulli polynomials on the interval [0, 1] at-
tain their maximum absolute values at the endpoints and the value Bn (1) is the
n-th Bernoulli number.
\[
V(r) = - \frac{e^2}{r} \tag{980}
\]
the Laplacian is related to a point charge density at the nucleus
Hence, the shift due to the fluctuations in the electron’s potential energy occurs
primarily at the origin. The effect of the electromagnetic fluctuations on the kinetic energy is not state specific, and can be considered as a uniform shift of all the energy levels, like the electron’s rest mass energy $m c^2$. Thus, the
relative energy shift of the levels is solely determined by the potential at the
origin. Therefore, the states with non-zero angular momenta do not experience
the relative energy-shift since the electronic wave functions vanish at the origin.
Thus, only the 2s state experiences a shift but the 2p state is unshifted.
∆r(t)
where $E_\omega$ is the Fourier component of the fluctuating electric field. Hence, the ω-component of the mean squared fluctuation59 in the particle’s position is given by
\[
< | \Delta r_\omega^2 | > = \left( \frac{q}{m} \right)^2 < | \frac{E_\omega^2}{( \omega_0^2 - \omega^2 )^2} | > \tag{984}
\]
58 T.A. Welton, Phys. Rev. 74, 1557 (1948).
59 The average squared fluctuation of the electromagnetic field should, in principle, be cal-
culated as an average over a volume in time and space which encompasses the electron’s
trajectory.
On approximating the electromagnetic energy associated with the fluctuating electromagnetic field $< | E_\omega^2 | >$ by half the sum of the zero-point energies of the photon modes, one has
\[
\frac{V}{8 \pi} < | E_\omega^2 | > = 2 \, \frac{1}{4} \, \hbar \omega \tag{985}
\]
where the factor 2 represents the two types of polarization of the normal modes.
Therefore, on summing over the normal modes, one finds that the mean squared
deviation of the electron’s trajectory from the classical orbit is proportional to
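\[
\begin{aligned}
& \frac{V}{( 2 \pi c )^3} \int d\Omega \int_0^{\infty} d\omega \, \omega^2 < | \frac{E_\omega^2}{( \omega_0^2 - \omega^2 )^2} | > \\
&= \frac{4 \pi \hbar}{V} \, \frac{V}{( 2 \pi c )^3} \int d\Omega \int_0^{\infty} d\omega \, \frac{\omega^3}{( \omega_0^2 - \omega^2 )^2}
\end{aligned} \tag{986}
\]
The integration over ω can be approximated as
\[
\int_{\omega_0}^{\frac{m c^2}{\hbar}} \frac{d\omega}{\omega} = \ln \frac{m c^2}{\hbar \omega_0} \tag{987}
\]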
where an upper and lower cut-off have been introduced to prevent the integral
from diverging60 . The expectation value of the second derivative of the potential
for the 2s state is given by
\[
< | \nabla^2 V | > = 4 \pi e^2 \left( \frac{1}{\pi a^3} \right) \tag{988}
\]
where the second factor represents the 2s electron density at the origin. The
corresponding factor for an ns level is expected to vary proportionally to n−3 .
Combining the above expressions, one finds that the 2s level is shifted by an
energy given by
2 3
m e4 m c2
4 e
∆E2s = 2 ln (989)
2 π h̄ c h̄ h̄ ω0
where the frequency of the electron’s orbit ω0 has been chosen as a lower cut-off
on the frequency of the electromagnetic fluctuations.
a state with momentum q will be evaluated via perturbation theory.
The lowest-order correction to the electron’s energy comes from the diamag-
netic interaction. From first-order perturbation theory, one finds the correction
Figure 38: The first-order correction to the rest mass of the electron due to the diamagnetic interaction.
since the electronic matrix elements give rise to the condition of conservation of
momentum. Hence, the correction to the energy is found as
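\[
\begin{aligned}
\Delta E_q^{(1)} &= \frac{e^2}{2 m c^2} \, \frac{V}{( 2 \pi )^3} \, 2 \int d^3k \, \frac{2 \pi \hbar c}{V} \, \frac{1}{k} \\
&= \frac{e^2}{2 m c^2} \, \frac{V}{( 2 \pi )^3} \, \frac{2 \pi \hbar c}{V} \, 8 \pi \int_0^{\infty} dk \, k \\
&= \frac{e^2 \hbar}{\pi m c} \int_0^{\infty} dk \, k
\end{aligned} \tag{993}
\]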
which diverges. This contribution is independent of the electron’s momentum q,
and since $k = k'$ it can be seen that the contribution of the diamagnetic interaction to first-order is independent of the quantum state of the electron. This
contribution to the electron’s energy can be lumped together with the electron’s
rest-energy m c2 . However, since the corrections are being evaluated for non-
relativistic electrons, it is customary to ignore the rest-energy and, therefore,
this correction shall no longer be considered.
to a virtual process in which the electron emits a photon and then re-absorbs
it. The second-order correction to the energy is evaluated from
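\[
\Delta E_q^{(2)} = \sum_{q',k,\alpha} \frac{ < q \{0\} | \hat{H}_{para} | q' 1_{k,\alpha} > < q' 1_{k,\alpha} | \hat{H}_{para} | q \{0\} > }{ E_q - E_{q'} - \hbar \omega_k } \tag{994}
\]
where $| q' 1_{k,\alpha} >$ is a one-photon intermediate state of the electron-photon system. We assume that the process does not conserve energy, so that the denominator is finite. The matrix elements are evaluated as
\[
\begin{aligned}
< q' 1_{k,\alpha} | \hat{H}_{para} | q \{0\} > &= \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \hbar \, \hat{\epsilon}_\alpha(k) . q \\
& \times \frac{1}{V} \int d^3r \, \exp[ - i q' . r ] \, \exp[ - i k . r ] \, \exp[ i q . r ] \\
&= \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \hbar \, \hat{\epsilon}_\alpha(k) . q \; \delta_{q'+k-q}
\end{aligned} \tag{995}
\]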
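\[
\Delta E_q^{(2)} = \frac{e^2}{m^2 c^2} \sum_{k,\alpha} \frac{2 \pi \hbar c^2}{V \omega_k} \, \frac{ | \hbar q . \hat{\epsilon}_\alpha(k) |^2 }{ \frac{\hbar^2 q^2}{2 m} - \frac{\hbar^2 (q-k)^2}{2 m} - \hbar \omega_k } \tag{996}
\]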
On summing over the polarizations by using the dyadic completeness relation62
\[
\sum_\alpha \hat{\epsilon}_\alpha(k) \, \hat{\epsilon}_\alpha(k) = \hat{I} - \hat{k} \, \hat{k} \tag{997}
\]
(1001)
It should be evident that the integral diverges logarithmically at large k. The
divergent part of the integral can be written as
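\[
\begin{aligned}
\Delta E_q^{(2)} &\sim - \frac{\hbar^2 q^2}{2 m} \left( \frac{e^2}{\hbar c} \right) \frac{2}{\pi} \int_0^{\pi} d\theta \, \sin\theta \, ( 1 - \cos^2\theta ) \int_{\frac{2 m c}{\hbar}}^{\infty} \frac{dk}{k} \\
&= - \frac{\hbar^2 q^2}{2 m} \left( \frac{e^2}{\hbar c} \right) \frac{8}{3 \pi} \int_{\frac{2 m c}{\hbar}}^{\infty} \frac{dk}{k}
\end{aligned} \tag{1002}
\]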
62 The completeness relation merely expresses the fact that any vector in a three-dimensional space can be expressed in terms of the components along three orthogonal directions $\hat{e}_i$
\[
A = \sum_{i=1}^{3} A_i \, \hat{e}_i
\]
where the components are given by the scalar product
\[
A_i = A . \hat{e}_i
\]
Hence, the completeness relation follows as
\[
I = \sum_i \hat{e}_i \, \hat{e}_i
\]
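If an upper cut-off $\lambda_+^{-1}$ is introduced, then the correction to the electron’s kinetic energy can be estimated as
\[
\Delta E_q^{(2)} = - \frac{\hbar^2 q^2}{2 m} \, \frac{8}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \ln \left( \frac{\hbar}{2 m c \lambda_+} \right) \tag{1003}
\]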
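\[
\Delta E_{nlm}^{(2)} = \frac{e^2}{m^2 c^2} \, \frac{\hbar c}{( 2 \pi )} \int_0^{\infty} dk \, k \int_0^{\pi} d\theta_k \, \sin\theta_k \sum_{n'l'm'} \frac{ | < n'l'm' | \hat{p} | nlm > |^2 \, ( 1 - \cos^2\theta_k ) }{ E_{nlm} - E_{n'l'm'} - \hbar \omega_k } \tag{1006}
\]
where $\theta_k$ is the angle subtended between k and the matrix elements of p. The angular integration can be performed, yielding
\[
\Delta E_{nlm}^{(2)} = \frac{e^2}{m^2 c^2} \, \frac{2 \hbar c}{3 \pi} \int_0^{\infty} dk \, k \sum_{n'l'm'} \frac{ | < n'l'm' | \hat{p} | nlm > |^2 }{ E_{nlm} - E_{n'l'm'} - \hbar \omega_k } \tag{1007}
\]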
In the completely non-relativistic limit, the integration over k can be shown to
be linearly divergent at the upper limit of integration.
Hans Bethe argued63 that, within the same approximation, the correction to
the kinetic energy of the electron in the state | nlm > is given by an expression
analogous to that of an electron in a continuum state n
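\[
\Delta T_n^{(2)} = \frac{2}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \left( \frac{\hbar}{m c} \right)^2 \int_0^{\infty} d\omega \, \omega \sum_{n'} \frac{ | < n' | \hat{p} | n > |^2 }{ E_n - E_{n'} - \hbar \omega } \tag{1008}
\]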
63 H. A. Bethe, Phys. Rev. 72, 339 (1947).
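Since momentum is conserved for continuum states (on average), only the states with $n = n'$ contribute so the denominator simplifies. The expression for the mass renormalization is divergent and is given by
\[
\begin{aligned}
\Delta T_n^{(2)} &= - \frac{2}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \left( \frac{\hbar}{m c} \right)^2 \int_0^{\infty} d\omega \, \omega \sum_{n'} \frac{ | < n' | \hat{p} | n > |^2 }{ \hbar \omega } \\
&= - \frac{4}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \frac{\hbar}{m c^2} \int_0^{\infty} \frac{d\omega \, \omega}{\omega} \, \frac{ < n | \hat{p}^2 | n > }{2 m}
\end{aligned} \tag{1009}
\]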
where the completeness relation has been used. This expression is valid if n
labels either a continuum or a discrete state, since only the mass of the electron is
being altered and the expectation value of p̂ is unaltered. The bare Hamiltonian
is given by
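\[
\hat{H}_0 = \frac{\hat{p}^2}{2 m} + V(r) \tag{1010}
\]
and the unperturbed energy of the hypothetical state $| nlm >$ is calculated in the non-relativistic Schrödinger theory as
\[
E_{nlm}^{(0)} = < nlm | \frac{\hat{p}^2}{2 m} | nlm > + < nlm | V(r) | nlm > \tag{1011}
\]
However, when this is evaluated, the approximate energy has to be expressed in terms of the observed physical mass via
\[
\begin{aligned}
E_{nlm}^{(0)} &= < nlm | \frac{\hat{p}^2}{2 m^*} | nlm > + < nlm | V(r) | nlm > \\
& + \frac{4}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \frac{\hbar}{m c^2} \int_0^{\infty} \frac{d\omega \, \omega}{\omega} \, \frac{ < nlm | \hat{p}^2 | nlm > }{2 m} \\
&= < nlm | \frac{\hat{p}^2}{2 m^*} | nlm > + < nlm | V(r) | nlm > \\
& + \frac{4}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \frac{\hbar}{m c^2} \int_0^{\infty} \frac{d\omega \, \omega}{\omega} \sum_{n'l'm'} \frac{ | < n'l'm' | \hat{p} | nlm > |^2 }{2 m}
\end{aligned} \tag{1012}
\]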
where the completeness relation was used in obtaining the last line. The second
term in the unperturbed energy is a correction due to the mass renormalization64
which should be combined with the second-order radiative correction. The total
energy (to second-order) is given by
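\[
\begin{aligned}
E_{nlm} &= < nlm | \frac{\hat{p}^2}{2 m^*} | nlm > + < nlm | V(r) | nlm > \\
& + \frac{2}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \left( \frac{\hbar}{m c} \right)^2 \int_0^{\infty} d\omega \, \omega \sum_{n'l'm'} \frac{ | < n'l'm' | \hat{p} | nlm > |^2 }{ \hbar \omega } \\
& + \frac{2}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \left( \frac{\hbar}{m c} \right)^2 \int_0^{\infty} d\omega \, \omega \sum_{n'l'm'} \frac{ | < n'l'm' | \hat{p} | nlm > |^2 }{ E_{nlm} - E_{n'l'm'} - \hbar \omega }
\end{aligned} \tag{1013}
\]
64 Renormalization is an idea which Bethe attributed to H. A. Kramers. Kramers had proposed that physical quantities should be expressed in terms of observable quantities, with all mention of bare quantities removed. Kramers was advocating a classical treatment from which Bethe created a non-relativistic quantum treatment.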
The overall (second-order) shift of Schrödinger’s estimate of the energy of the state $| nlm >$ (as calculated with the physical mass) is given by the sum of the last two terms, which is expressed as
\[
\Delta E_{nlm}^{shift} = \frac{2}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \left( \frac{\hbar}{m c} \right)^2 \int_0^{\infty} d\omega \, \omega \sum_{n'l'm'} \frac{ | < n'l'm' | \hat{p} | nlm > |^2 \, ( E_{nlm} - E_{n'l'm'} ) }{ ( E_{nlm} - E_{n'l'm'} - \hbar \omega ) \, \hbar \omega } \tag{1014}
\]
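The step from eqn(1013) to eqn(1014) only requires the two energy denominators to be combined. Writing $\Delta \equiv E_{nlm} - E_{n'l'm'}$ (a shorthand introduced here only for this intermediate step), one has
\[
\frac{1}{\hbar \omega} + \frac{1}{\Delta - \hbar \omega} = \frac{ ( \Delta - \hbar \omega ) + \hbar \omega }{ ( \Delta - \hbar \omega ) \, \hbar \omega } = \frac{ \Delta }{ ( \Delta - \hbar \omega ) \, \hbar \omega }
\]
so that the integrand falls off one power of ω faster at large ω than either of the two separate terms.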
If the rest energy of the electron is used as the upper cut-off energy $m c^2 \sim 0.5 \times 10^6$ eV, and assuming that the averaged logarithm of the electron excitation energy corresponds to an energy of the order of 17.8 Ryd, then the logarithm has a value of about 7.63 and is not sensitive to the precise value of $E_{nlm} - E_{n'l'm'}$ and, therefore, can be taken outside the summation
\[
\begin{aligned}
\Delta E_{nlm}^{shift} &= - \frac{2}{3 \pi} \left( \frac{e^2}{\hbar c} \right) \ln \left( \frac{2 \hbar^2 c^2}{Z^2 e^4} \right) \\
& \times \sum_{n'l'm'} \frac{ | < n'l'm' | \hat{p} | nlm > |^2 }{ m^2 c^2 } \, ( E_{nlm} - E_{n'l'm'} )
\end{aligned} \tag{1016}
\]
65 F. J. Dyson, Phys. Rev. 75, 1736 (1949).
As later shown by Dyson65, the divergences found in any order in $e^2 / \hbar c$ can be removed by consistently using the ideas of mass and charge renormalization66. Hence, a completely consistent relativistic theory does yield a finite shift, without the need to invoke any cut-off67. The weighted sum over the matrix elements can be evaluated by expressing it in terms of an expectation value involving commutators of $\hat{H}_0$ with $\hat{p}$. That is
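\[
\begin{aligned}
& \sum_{n'l'm'} | < n'l'm' | \hat{p} | nlm > |^2 \, ( E_{nlm} - E_{n'l'm'} ) \\
&= \sum_{n'l'm'} < nlm | \hat{p} | n'l'm' > < n'l'm' | [ \hat{p} , \hat{H}_0 ] | nlm >
\end{aligned} \tag{1017}
\]
On substituting
\[
\hat{p} = - i \hbar \nabla \tag{1019}
\]
and
\[
\hat{H}_0 = \frac{\hat{p}^2}{2 m} + V(r) \tag{1020}
\]
into the expression for the matrix elements, one obtains
\[
- \hbar^2 \int d^3r \; \psi_{nlm}(r) \, \nabla . ( \nabla V(r) ) \, \psi_{nlm}(r) \tag{1021}
\]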
Thus, the energy-shift only occurs for bound electrons as the expectation value of
the Laplacian of the potential will vanish for extended states. For a hydrogenic-
like atom
\[
\nabla^2 V(r) = 4 \pi Z e^2 \, \delta^3(r) \tag{1024}
\]
66 This statement does not imply that a properly renormalized perturbation theory is con-
vergent. In fact, one may argue that if the coupling constant changed sign then systems
containing electrons would be unstable to BCS pairing. Since the radius of convergence of
any expansion is limited by the closest singularity, perturbation theory may only have a zero
radius of convergence. In this case, the theory may be expected to contain non-analytic terms
of the form exp[ − h̄ c/ e2 ].
67 F. J. Dyson, Phys. Rev. 173, 617 (1948).
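so
\[
\Delta E_{nlm}^{shift} = \frac{4 Z e^2}{3} \left( \frac{e^2}{\hbar c} \right) \frac{\hbar^2}{m^2 c^2} \, | \psi_{nlm}(0) |^2 \, \ln \left( \frac{2 \hbar^2 c^2}{Z^2 e^4} \right) \tag{1025}
\]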
Therefore, the Lamb shift only occurs for electrons with $l = 0$, since electronic wave functions with $l \neq 0$ vanish at the origin. The atomic wave function at the position of the nucleus is given by
\[
| \psi_{n00}(0) |^2 = \frac{1}{\pi} \left( \frac{Z}{n a} \right)^3 \tag{1026}
\]
This yields Bethe’s estimate for the Lamb shift as
\[
\Delta E_{n00}^{shift} = \frac{4}{3 \pi n^3} \left( \frac{e^2}{\hbar c} \right)^3 \left( \frac{Z^4 e^4 m}{\hbar^2} \right) \ln \left( \frac{2 \hbar^2 c^2}{Z^2 e^4} \right) \tag{1027}
\]
The above formula leads to the estimate of 1040 MHz which is in good agreement with the experimentally determined value68. The exact relativistic calculation69 yields the result
\[
\Delta E_{n00}^{shift} = \frac{4}{3 \pi n^3} \, Z^4 \, m c^2 \left( \frac{e^2}{\hbar c} \right)^5 \left[ \ln \left( \frac{m c^2}{2 \hbar \omega_{n,n'}} \right) + \frac{31}{120} \right] \tag{1028}
\]
where the mc2 in the logarithm comes from the Dirac theory without invoking
any cut-off. The most recent experimentally measured value70 is 1057.851 MHz
which is in good agreement with the theoretical value of 1057.857 MHz.
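The size of Bethe’s estimate can be checked numerically. The sketch below is only an order-of-magnitude illustration under stated assumptions: it evaluates the prefactor of eqn(1027) for Z = 1 and n = 2, uses the value 7.63 quoted in the text for the logarithm rather than the argument written in eqn(1027), and takes rounded values of the physical constants that are supplied here and are not part of the original text.

```python
import math

# Rounded constants, supplied for this illustration only.
ALPHA = 1 / 137.035999    # fine-structure constant e^2 / (hbar c)
HARTREE_EV = 27.211386    # m e^4 / hbar^2 in eV (twice the Rydberg energy)
EV_TO_MHZ = 2.417989e8    # frequency in MHz corresponding to 1 eV (E = h nu)

def bethe_logarithm(cutoff_ev=0.5e6, mean_excitation_ryd=17.8):
    """ln(cut-off / mean excitation energy), with the values quoted in the text."""
    rydberg_ev = HARTREE_EV / 2
    return math.log(cutoff_ev / (mean_excitation_ryd * rydberg_ev))

def bethe_shift_mhz(n=2, Z=1, log_value=7.63):
    """Non-relativistic estimate of the s-level shift, expressed in MHz."""
    prefactor = 4 / (3 * math.pi * n**3)
    shift_ev = prefactor * ALPHA**3 * Z**4 * HARTREE_EV * log_value
    return shift_ev * EV_TO_MHZ

print(round(bethe_logarithm(), 2))
print(round(bethe_shift_mhz()))
```

With these inputs the logarithm reproduces the value of about 7.63 used above, and the shift of the 2s level comes out close to the 1040 MHz quoted in the text; the small difference depends on the rounding of the constants and of the logarithm.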
10.3.5 Bremsstrahlung
Accelerating (or decelerating) charged particles radiate. The radiation emitted
by a charged particle that scatters from a massive charged particle via the
Coulomb interaction, shall be considered. It is assumed that the mass of the
charged particle (in most cases, this is a nucleus) M is significantly greater
than the electron mass, so that the recoil of the nucleus can be neglected. The
(instantaneous) Coulomb interaction between the electron and the nucleus is
given by
Z e2
V (r) = − (1029)
r
The Hamiltonian of the unperturbed electron is simply the kinetic energy. The incident electron is assumed to have a momentum q, the scattered electron has momentum $q'$, and the cross-section for the scattering process will be calculated via low-order perturbation theory.
Rutherford Scattering
Figure 41: The geometry for Rutherford scattering. For elastic scattering, the magnitude of the initial momentum q is equal to the magnitude of the final momentum $q'$ and the scattering angle is $\theta'$ (the figure labels the momentum transfer as $2 q \sin \theta'/2$).
Figure 42: The scattering angle dependence of the differential scattering cross-section.
and E. Marsden in 1913 through scattering of charged α-particles71 that was instrumental in verifying Rutherford’s 1911 conjecture72 that the atom has a nucleus of very small spatial extent. The divergence in the scattering cross-section at $\theta' = 0$ is due to the long-ranged nature of the Coulomb interaction, which causes electrons to undergo scattering (no matter how slight the scattering is) at arbitrarily large distances from the nucleus.
Bremsstrahlung
(a) Scattering of an electron from the nucleus followed by the emission of a photon. The initial state of the electron is assumed to have momentum q and the final state of the electron is given by $q'$ while the emitted photon has momentum k. Therefore, from conservation of momentum, the momentum of the electron in the intermediate state is given by $q' + k$.
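The matrix elements for these second-order processes are given by
\[
\begin{aligned}
M_a &= \left( \frac{4 \pi Z e^2}{V \, | q - q' - k |^2} \right) \left( \frac{e \hbar}{m c} \right) \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \hat{\epsilon}_\alpha(k) . ( q' + k ) \\
& \times \frac{1}{E_q - E_{q'+k} + i \eta}
\end{aligned} \tag{1037}
\]
and
\[
\begin{aligned}
M_b &= \left( \frac{4 \pi Z e^2}{V \, | q - q' - k |^2} \right) \left( \frac{e \hbar}{m c} \right) \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \hat{\epsilon}_\alpha(k) . ( q - k ) \\
& \times \frac{1}{E_q - E_{q-k} - \hbar \omega_k + i \eta}
\end{aligned} \tag{1038}
\]
It should be noted that the numerators of the matrix elements simplify because the photons have transverse polarizations
\[
\hat{\epsilon}_\alpha(k) . k = 0 \tag{1039}
\]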
From the energy conserving delta function in the expression for the decay rate,
one finds
Eq = Eq0 + h̄ ωk (1040)
hence the first energy-denominator can be expressed in a similar form to the
second
Eq − Eq0 +k = Eq0 − Eq0 +k + h̄ ωk (1041)
For small k, the energy-denominators can be expanded, yielding
\[
E_{q'} - E_{q'+k} + \hbar \omega_k = \hbar \omega_k - \frac{\hbar^2}{m} \, q' . k - \frac{\hbar^2 k^2}{2 m} \tag{1042}
\]
and
\[
E_q - E_{q-k} - \hbar \omega_k = - \hbar \omega_k + \frac{\hbar^2}{m} \, q . k - \frac{\hbar^2 k^2}{2 m} \tag{1043}
\]
Since the energy of the photon cannot exceed the energy of the initial electron, one must have q > k, so the third term is smaller than the second term. Due to the large magnitude of c compared with the electron velocities $\hbar q / m$, the second and third terms can be neglected. Therefore, the photon-energy dominates both the energy-denominators. On substituting the above expressions in the sum of the matrix elements, one finds
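\[
\begin{aligned}
M_a + M_b &= \left( \frac{4 \pi Z e^2}{V \, | q - q' - k |^2} \right) \left( \frac{e}{m c} \right) \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \\
& \times \left[ \frac{ \hat{\epsilon}_\alpha(k) . q' \, \hbar }{ E_{q'} - E_{q'+k} + \hbar \omega_k + i \eta } + \frac{ \hat{\epsilon}_\alpha(k) . q \, \hbar }{ E_q - E_{q-k} - \hbar \omega_k + i \eta } \right] \\
&\approx \left( \frac{4 \pi Z e^2}{V \, | q - q' - k |^2} \right) \left( \frac{e}{m c} \right) \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \\
& \times \left( \frac{ \hat{\epsilon}_\alpha(k) . ( q' - q ) }{ \omega_k } \right)
\end{aligned} \tag{1044}
\]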
Using this approximation for the matrix elements, the transition rate is given
by
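\[
\begin{aligned}
\frac{1}{\tau} &= \frac{2 \pi}{\hbar} \sum_{q'} \sum_{k,\alpha} \left( \frac{4 \pi Z e^2}{V \, | q - q' - k |^2} \right)^2 \left( \frac{e}{m c} \right)^2 \frac{2 \pi \hbar c^2}{V \omega_k} \\
& \times \left| \frac{ \hat{\epsilon}_\alpha(k) . ( q' - q ) }{ \omega_k } \right|^2 \delta( E_q - E_{q'} - \hbar \omega_k )
\end{aligned} \tag{1045}
\]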
If the angular distributions of the emitted photon and the scattered electron are
both measured, the scattering cross-section can be represented as
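\[
\begin{aligned}
\left( \frac{d^3\sigma}{d\Omega' \, d\Omega_k \, d\omega_k} \right)_{Brems} &= \frac{q'}{q} \left( \frac{d\sigma}{d\Omega'} \right)_{Rutherford} \\
& \times \frac{1}{4 \pi^2 \omega_k} \left( \frac{e^2}{\hbar c} \right) \sum_\alpha \left| \frac{ \hbar \, \hat{\epsilon}_\alpha(k) . ( q' - q ) }{ m c } \right|^2
\end{aligned} \tag{1047}
\]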
where the second factor is the probability of emitting a photon with energy h̄ ωk
into solid angle dΩk . On summing over the polarization α and integrating over
the directions of the emitted photon, one obtains
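\[
\begin{aligned}
\left( \frac{d^2\sigma}{d\Omega' \, d\omega_k} \right)_{Brems} &= \frac{q'}{q} \left( \frac{2 m Z e^2}{\hbar^2 \, | q - q' |^2} \right)^2 \\
& \times \frac{2}{3 \pi \omega_k} \left( \frac{e^2}{\hbar c} \right) \left| \frac{ \hbar ( q' - q ) }{ m c } \right|^2
\end{aligned} \tag{1048}
\]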
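The numerical factor $2 / ( 3 \pi \omega_k )$ in eqn(1048) follows from eqn(1047) once the polarization sum and the integration over the photon directions are carried out. As a sketch of this intermediate step, with θ denoting the angle between k and $q' - q$ (a label introduced here only for the purpose of the illustration) and using the completeness relation for the polarization vectors, one has
\[
\begin{aligned}
\sum_\alpha | \hat{\epsilon}_\alpha(k) . ( q' - q ) |^2 &= | q' - q |^2 \, ( 1 - \cos^2\theta ) \\
\int d\Omega_k \, ( 1 - \cos^2\theta ) &= 2 \pi \int_0^{\pi} d\theta \, \sin\theta \, ( 1 - \cos^2\theta ) = \frac{8 \pi}{3} \\
\frac{1}{4 \pi^2 \omega_k} \, \frac{8 \pi}{3} &= \frac{2}{3 \pi \omega_k}
\end{aligned}
\]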
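Hence, the scattering rate which includes the emission of a photon of energy $\hbar \omega_k$ is given by the product of the Rutherford scattering rate with a factor
\[
\frac{q'}{q} \, \frac{2}{3 \pi \omega_k} \left( \frac{e^2}{\hbar c} \right) \left( \frac{ 2 q \hbar \sin \frac{\theta'}{2} }{ m c } \right)^2 \tag{1049}
\]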
This particular factorization of the cross-section involving the simultaneous emission of a soft photon is common to many processes involving the emission of low-energy bosons. The soft-photon theorem74 shows that the properties of the emitted low-energy photon are insensitive to anything except the global properties (such as the total charge or total magnetic moment) of the scattered particle. The cross-section involving the emission of a low-energy photon diverges as $\omega_k \to 0$. This type of divergence is an infrared divergence. What this implies is that, in Bremsstrahlung, arbitrarily large numbers of low-energy photons are emitted. Furthermore, similar singularities are also found in the ω = 0 limit when elastic scattering corrections to the Rutherford scattering process are considered75. In any experiment with finite energy resolution, elastic scattering and very low-energy quasi-elastic scattering processes cannot be distinguished, so it might be expected that the elastic scattering and quasi-elastic scattering divergences should be combined.
it is expected that the classical limit of quantum theory applies so that classical electromag-
netic theory should produce exact results.
so the cut-off $\lambda_-$ cancels and the scattering cross-section does not diverge logarithmically. With this reasoning, Bloch and Nordsieck found that the appropriate expansion parameter is not $\frac{e^2}{\hbar c}$ but instead is given by $\frac{e^2}{\hbar c} \ln \frac{\hbar \omega_0}{m c^2}$. The higher-order perturbations may also describe processes involving larger numbers of emitted soft photons and result in a multiplicative exponential factor to the quasi-elastic scattering rate
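\[
\left( \frac{d\sigma}{d\Omega} \right)_{Quasi-Elastic} \approx \left( \frac{d\sigma}{d\Omega} \right)_{Rutherford} \exp \left[ \frac{1}{2 \pi} \left( \frac{e^2}{\hbar c} \right) B \, \ln \left( \frac{2 \hbar \omega_0}{m c^2} \right) + \ldots \right] \tag{1053}
\]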
Therefore, the scattering rate from soft photons vanishes in the limit ω0 → 0.
This occurs because perturbation theory causes the normalization of the starting
approximate wave function to change, and hence the probabilities of the vari-
ous processes are changed by including higher-order processes. In other words,
since the probability of emitting an arbitrarily large number of soft-photons is
finite, the probability of emitting either zero or any fixed number of soft photons
must be zero. Bloch and Nordsieck’s calculation was restricted to the case of
emission of sufficiently low-energy photons. Pauli and Fierz78 also considered
Bremsstrahlung in a non-relativistic approximation. Pauli and Fierz showed
that the infra-red divergences, discussed above, cancel. Pauli and Fierz went on
to examine the remaining ultra-violet divergences, and showed that portions of
the ultra-violet infinities that were found in the calculations of the scattering
processes could be associated with mass renormalization. Using a relativistic
theory Ito, Koba and Tomonaga79 showed that the remaining infinities could
be absorbed into a renormalization of the electron charge. Similar conclusions
were arrived at by Lewis80 and by Epstein81 . Dyson82 showed that all infinities
that appear in Quantum Electrodynamics could be cured by renormalization to
arbitrarily high-orders in perturbation theory.
equation which is first-order in time. Dirac83 searched for a set of coupled first-
order (in time) equations for a multi-component wave function ψ
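\[
\psi = \begin{pmatrix} \psi^{(0)} \\ \psi^{(1)} \\ \vdots \\ \psi^{(N)} \end{pmatrix} \tag{1055}
\]
The wave function was assumed to satisfy an equation of the form
\[
\begin{aligned}
\left[ i \frac{\hbar}{c} \frac{\partial}{\partial t} - \alpha . \hat{p} \right] \psi &= \beta \, m c \, \psi \\
\left[ i \frac{\hbar}{c} \frac{\partial}{\partial t} + i \hbar \, \alpha . \nabla \right] \psi &= \beta \, m c \, \psi
\end{aligned} \tag{1056}
\]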
The equations have to be of this form since, if the equation is a first-order partial
differential equation in time then it must also only involve the first-order partial
derivatives with respect to the spatial components if the resulting equation is
to be relativistically covariant. The wave function ψ is an N-component (column) wave function, and the three as yet unknown components of α, together with β, are N × N matrices. Since the Hamiltonian is the generator of time translations, Ĥ should be equivalent to $i \hbar \frac{\partial}{\partial t}$. Hence, as the Hamiltonian operator Ĥ must be Hermitean, the operators α and β must be Hermitean matrices.
This set of equations is required to yield the dispersion relation for a relativistic
particle
\[
\left( \frac{E}{c} \right)^2 - p^2 = m^2 c^2 \tag{1057}
\]
which, following the ordinary rules of quantization, leads to the Klein-Gordon equation
\[
\left[ - \frac{\hbar^2}{c^2} \frac{\partial^2}{\partial t^2} + \hbar^2 \nabla^2 \right] \psi = m^2 c^2 \, \psi \tag{1058}
\]
(which is a second-order partial differential equation in time). The requirement that the Dirac equation is compatible with the Klein-Gordon equation imposes conditions on the form of the matrices. On writing the Dirac equation as
\[
i \frac{\hbar}{c} \frac{\partial \psi}{\partial t} = \left[ \beta \, m c - i \hbar \, \alpha . \nabla \right] \psi \tag{1059}
\]
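and iterating, one has
\[
\begin{aligned}
- \frac{\hbar^2}{c^2} \frac{\partial^2 \psi}{\partial t^2} &= \left[ \beta \, m c - i \hbar \, \alpha . \nabla \right]^2 \psi \\
&= \left[ \beta^2 m^2 c^2 - i \hbar \, m c \, ( \beta \alpha + \alpha \beta ) . \nabla - \hbar^2 ( \alpha . \nabla )^2 \right] \psi
\end{aligned} \tag{1060}
\]
When expressed in terms of individual matrices $\alpha^{(j)}$, the above equation becomes
\[
\begin{aligned}
- \frac{\hbar^2}{c^2} \frac{\partial^2 \psi}{\partial t^2} &= \Big[ \beta^2 m^2 c^2 - i \hbar \, m c \sum_j ( \beta \alpha^{(j)} + \alpha^{(j)} \beta ) \nabla_j \\
& - \frac{\hbar^2}{2} \sum_{i,j} ( \alpha^{(i)} \alpha^{(j)} + \alpha^{(j)} \alpha^{(i)} ) \nabla_i \nabla_j \Big] \psi
\end{aligned} \tag{1061}
\]
83 P. A. M. Dirac, Proc. Roy. Soc. A 117, 610 (1928).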
From eqn(1062), one concludes that if the Hermitean matrices are brought to
diagonal form then the diagonal elements are given by ± 1. The possible dimen-
sions N of the matrix can be determined by considering the anti-commutation
relations. On taking the determinant of eqn(1063), one finds
det α(i) det β = ( −1 )N det β det α(i)
det α(i) det α(j) = ( −1 )N det α(j) det α(i) (1064)
Hence, on cancelling the common factors of determinants, one finds
( − 1 )N = 1 (1065)
so N must be even. Furthermore, the matrices must be traceless. This can be
seen by considering
α(i) α(j) = − α(j) α(i) (1066)
which on multiplying by α(i) , yields the relation
α(j) = − α(i) α(j) α(i)
α(j) = − ( α(i) )−1 α(j) α(i) (1067)
193
since α(i) is its own inverse. Apart from the negative sign, the form of the
left-hand side is of the form of an equivalence transformation. By using cyclic
invariance, it can be shown that the trace of a matrix is invariant under equiv-
alence transformations. Therefore, one has
or
Trace α(i) = 0 (1069)
which proves that the matrices are traceless.
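\[
\begin{aligned}
\beta^2 &= \hat{I} \\
( \alpha^{(i)} )^2 &= \hat{I}
\end{aligned} \tag{1070}
\]
then their eigenvalues must all be ±1, as can be seen by operating on the eigenvalue equation
\[
\beta \, \phi_\beta = \lambda_\beta \, \phi_\beta \tag{1071}
\]
with β. This process yields
\[
\begin{aligned}
\beta^2 \phi_\beta &= \lambda_\beta \, \beta \, \phi_\beta \\
&= \lambda_\beta^2 \, \phi_\beta
\end{aligned} \tag{1072}
\]
\[
\lambda_\beta^2 = 1 \tag{1073}
\]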
This and the condition that the matrices are traceless imply that the set of eigenvalues of each matrix is composed of equal numbers of +1 and −1, and it also confirms the conclusion that the dimension N of the matrices must be even. The smallest dimension for which there is a representation of the matrices is N = 4. The smallest even value of N, N = 2, cannot be used since one can only construct three linearly independent anti-commuting 2 × 2 matrices84. These three matrices are the Pauli spin matrices $\sigma^{(j)}$. Hence, Dirac constructed the relativistic theory with N = 4.
d+1 linearly independent (anti-commuting) Dirac-matrices. We shall assume that the product matrices are linearly independent. Since the number of linearly independent N × N matrices is $N^2$, the minimum dimension N which will yield a representation of the Dirac-matrices is $N = 2^{\frac{d+1}{2}}$.
the 4 × 4 matrices in the form of 2 × 2 block matrices. In this case, one can represent the matrix in the block-diagonal form
\[
\beta = \begin{pmatrix} I & 0 \\ 0 & - I \end{pmatrix} \tag{1074}
\]
If the three matrices $\alpha^{(i)}$ are to anti-commute with β and be Hermitean, they must have the off-diagonal form
\[
\alpha^{(i)} = \begin{pmatrix} 0 & A^{(i)} \\ A^{(i)\dagger} & 0 \end{pmatrix} \tag{1075}
\]
where $A^{(i)}$ is an arbitrary 2 × 2 matrix. We shall choose all three $A^{(i)}$ matrices to be Hermitean. Since the three $\alpha^{(i)}$ matrices must anti-commute with each other, the $A^{(i)}$ must also anti-commute with each other. Since the three Pauli matrices are mutually anti-commuting, one can set
\[
\alpha^{(i)} = \begin{pmatrix} 0 & \sigma^{(i)} \\ \sigma^{(i)} & 0 \end{pmatrix} \tag{1076}
\]
where the $\sigma^{(i)}$ and I are, respectively, the 2 × 2 Pauli matrices and the 2 × 2 unit matrix. The Pauli matrices are given by
\[
\sigma^{(1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{1077}
\]
\[
\sigma^{(2)} = \begin{pmatrix} 0 & - i \\ i & 0 \end{pmatrix} \tag{1078}
\]
and
\[
\sigma^{(3)} = \begin{pmatrix} 1 & 0 \\ 0 & - 1 \end{pmatrix} \tag{1079}
\]
The matrix $\alpha^{(0)}$ is defined as the 4 × 4 identity matrix
\[
\alpha^{(0)} = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \tag{1080}
\]
This set of matrices forms a representation of the Dirac matrices. This can be seen by directly showing that they satisfy the appropriate relations. Many different representations of the Dirac matrices can be found, but they are all related by equivalence transformations and the physical results are independent of which choice is made.
Exercise:
By direct matrix multiplication, show that the above matrices satisfy the
relations
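\[
( \alpha^{(j)} )^2 = \beta^2 = \hat{I} \tag{1081}
\]
and the anti-commutation relations
\[
\begin{aligned}
\alpha^{(i)} \beta + \beta \alpha^{(i)} &= 0 \\
\alpha^{(i)} \alpha^{(j)} + \alpha^{(j)} \alpha^{(i)} &= 2 \, \delta^{i,j} \, \hat{I}
\end{aligned} \tag{1082}
\]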
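The relations of the exercise can also be confirmed numerically before they are worked through by hand. The following is a minimal sketch using plain nested lists for the matrices of eqns(1074) to (1079); the helper names are chosen here for illustration only, and the check does not replace the algebraic proof.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """Return a b + b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def off_diagonal(s):
    """4x4 block matrix [[0, s], [s, 0]] built from a 2x2 block s."""
    return [[0, 0, s[0][0], s[0][1]],
            [0, 0, s[1][0], s[1][1]],
            [s[0][0], s[0][1], 0, 0],
            [s[1][0], s[1][1], 0, 0]]

SIGMA = [
    [[0, 1], [1, 0]],        # sigma^(1)
    [[0, -1j], [1j, 0]],     # sigma^(2)
    [[1, 0], [0, -1]],       # sigma^(3)
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ALPHAS = [off_diagonal(s) for s in SIGMA]

def check_dirac_algebra():
    """True if beta^2 = I, {alpha_i, beta} = 0 and {alpha_i, alpha_j} = 2 delta_ij I."""
    ok = matmul(BETA, BETA) == IDENTITY
    for i, a in enumerate(ALPHAS):
        ok = ok and anticommutator(a, BETA) == ZERO
        for j, b in enumerate(ALPHAS):
            expected = [[2 * x if i == j else 0 for x in row] for row in IDENTITY]
            ok = ok and anticommutator(a, b) == expected
    return ok

print(check_dirac_algebra())
```

Since all the matrix elements are 0, ±1 or ±i, the comparisons are exact and no numerical tolerance is required.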
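one obtains
\[
i \hbar \, \psi^\dagger \frac{\partial \psi}{\partial t} = - i \hbar c \, \psi^\dagger \alpha . \nabla \psi + \psi^\dagger \beta \psi \, m c^2 \tag{1085}
\]
The Hermitean conjugate of the Dirac equation is given by
\[
- i \hbar \frac{\partial \psi^\dagger}{\partial t} = + i \hbar c \, \nabla \psi^\dagger . \alpha^\dagger + \psi^\dagger \beta^\dagger \, m c^2 \tag{1086}
\]
Therefore, since α and β are Hermitean matrices, the Hermitean conjugate equation simplifies to
\[
- i \hbar \frac{\partial \psi^\dagger}{\partial t} = + i \hbar c \, \nabla \psi^\dagger . \alpha + \psi^\dagger \beta \, m c^2 \tag{1087}
\]
Post-multiplying the equation by the column-vector ψ, yields
\[
- i \hbar \frac{\partial \psi^\dagger}{\partial t} \psi = + i \hbar c \, \nabla \psi^\dagger . \alpha \, \psi + \psi^\dagger \beta \psi \, m c^2 \tag{1088}
\]
On subtracting eqn(1088) from eqn(1085) and combining terms, one obtains
\[
i \hbar \frac{\partial}{\partial t} ( \psi^\dagger \psi ) = - i \hbar c \, \nabla . ( \psi^\dagger \alpha \, \psi ) \tag{1089}
\]
The above equation is in the form of a continuity equation
\[
\frac{\partial \rho}{\partial t} + \nabla . j = 0 \tag{1090}
\]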
in which the probability density is given by
\[
\rho = \psi^\dagger \psi \tag{1091}
\]
Using the rules of matrix multiplication the probability density is a real scalar quantity, which is given by the sum of squares
\[
\rho = | \psi^{(0)} |^2 + | \psi^{(1)} |^2 + | \psi^{(2)} |^2 + | \psi^{(3)} |^2 \tag{1092}
\]
and so it is positive definite. Hence, unlike the Klein-Gordon equation, the Dirac equation does not lead to negative probability densities. The probability current density j is given by
\[
j = c \, \psi^\dagger \alpha \, \psi \tag{1093}
\]
In this case, the total probability
\[
Q = \int d^3x \, \psi^\dagger \psi = \int d^3x \, \rho \tag{1094}
\]
is conserved, since
\[
\begin{aligned}
\frac{dQ}{dt} &= \int d^3x \, \frac{\partial \rho}{\partial t} \\
&= - \int d^3x \, \nabla . j \\
&= - \int d^2S . j
\end{aligned} \tag{1095}
\]
where Gauss’s theorem has been used to represent the volume integral as a surface integral. For a sufficiently large volume, the current at the boundary vanishes, hence the total probability is conserved
\[
\frac{dQ}{dt} = 0 \tag{1096}
\]
Or equivalently, after multiplying the Dirac equation by β and then introducing the four γ matrices via
\[
\gamma^\mu = \beta \, \alpha^\mu \tag{1100}
\]
one finds that the Dirac equation appears in the alternate forms
\[
\begin{aligned}
\gamma^\mu \hat{p}_\mu \psi &= m c \, \psi \\
i \hbar \, \gamma^\mu \partial_\mu \psi &= m c \, \psi
\end{aligned} \tag{1101}
\]
\[
\gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 \, g^{\mu,\nu} \, \hat{I} \tag{1102}
\]
where $\hat{I}$ is the 4 × 4 identity matrix, and $g^{\mu,\nu}$ is the Minkowski metric. The gamma matrices labelled by the spatial indices are unitary and anti-Hermitean, as shall be proved below.
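The stated properties of the gamma matrices can be checked numerically in the representation introduced earlier. The sketch below is an illustration only: it assumes the metric signature (+, −, −, −), which is the one implied by $( \gamma^{(0)} )^2 = \hat{I}$, takes $\gamma^{(0)} = \beta$ because $\alpha^{(0)}$ is the identity, and uses helper names that are not part of the original text.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitean conjugate (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def off_diagonal(s):
    """4x4 block matrix [[0, s], [s, 0]] built from a 2x2 block s."""
    return [[0, 0, s[0][0], s[0][1]],
            [0, 0, s[1][0], s[1][1]],
            [s[0][0], s[0][1], 0, 0],
            [s[1][0], s[1][1], 0, 0]]

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
# gamma^(0) = beta, gamma^(i) = beta alpha^(i)
GAMMA = [BETA] + [matmul(BETA, off_diagonal(s)) for s in SIGMA]
METRIC = [1, -1, -1, -1]     # assumed signature (+, -, -, -)

def clifford_ok():
    """True if gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} I."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(GAMMA[mu], GAMMA[nu])
            ba = matmul(GAMMA[nu], GAMMA[mu])
            anti = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]
            expected = [[2 * METRIC[mu] if (mu == nu and i == j) else 0
                         for j in range(4)] for i in range(4)]
            if anti != expected:
                return False
    return True

def hermiticity_ok():
    """True if gamma^(0) is Hermitean and the spatial gammas are anti-Hermitean."""
    temporal = dagger(GAMMA[0]) == GAMMA[0]
    spatial = all(dagger(g) == [[-x for x in row] for row in g]
                  for g in GAMMA[1:])
    return temporal and spatial

print(clifford_ok(), hermiticity_ok())
```

Unitarity of the spatial matrices then follows from the two checks together, since an anti-Hermitean matrix whose square is $- \hat{I}$ satisfies $\gamma^\dagger \gamma = - \gamma^2 = \hat{I}$.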
It is easy to show that the matrix with the temporal index (0) is unitary
and Hermitean
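\[
\begin{aligned}
( \gamma^{(i)} )^\dagger &= ( \beta \, \alpha^{(i)} )^\dagger \\
&= ( ( \alpha^{(i)} )^\dagger \beta^\dagger ) \\
&= ( \alpha^{(i)} \beta ) \\
&= ( - \beta \, \alpha^{(i)} ) \\
&= - \gamma^{(i)}
\end{aligned} \tag{1104}
\]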
since α(i) and β are Hermitean and, in the fourth line the operators have been
anti-commuted. Now, the gamma matrices with spatial indices can be shown
to be unitary since
where, in obtaining the second line, the anti-commutation properties of α(i) and
β have been used, and the property
\[
( \alpha^{(i)} )^2 = \beta^2 = \hat{I} \tag{1106}
\]
was used to obtain the last line. Since it has already been demonstrated that the spatial matrices are anti-Hermitean
Hence, since
\[
( \gamma^{(0)} )^2 = \hat{I} \tag{1110}
\]
the Hermitean conjugate wave function $\psi^\dagger$ can be expressed in terms of the adjoint spinor $\overline{\psi}^\dagger$ via
\[
\psi^\dagger = \overline{\psi}^\dagger \, \gamma^{(0)} \tag{1111}
\]
The continuity equation is given by the Lorentz covariant form
\[
\frac{\partial j^\mu}{\partial x^\mu} = 0 \tag{1112}
\]
where the four-vector conserved probability current $j^\mu$ is given by
\[
j^\mu = c \, \psi^\dagger \alpha^\mu \psi \tag{1113}
\]
By using the definition of the Dirac adjoint, the current density can be re-expressed as the four quantities
\[
\begin{aligned}
j^{(0)} &= c \, \overline{\psi}^\dagger \gamma^{(0)} \psi \\
j^{(i)} &= c \, \overline{\psi}^\dagger \gamma^{(i)} \psi
\end{aligned} \tag{1114}
\]
that, respectively, represent c times the probability density and the $j^{(i)}$ are the contravariant components of the probability current density.
explicit dependence on position. The solution can be expressed as a momentum
eigenstate in the form
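$$ \psi = \begin{pmatrix} u^{(0)} \\ u^{(1)} \\ u^{(2)} \\ u^{(3)} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1115} $$
Hence, the Dirac equation can be expressed as the 2 × 2 block matrix equation
$$ \begin{pmatrix} \left( - k^{(0)} + \frac{m c}{\hbar} \right) I & \underline{k} \cdot \underline{\sigma} \\ \underline{k} \cdot \underline{\sigma} & \left( - k^{(0)} - \frac{m c}{\hbar} \right) I \end{pmatrix} \begin{pmatrix} \phi^{A} \\ \phi^{B} \end{pmatrix} = 0 \tag{1119} $$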
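Using Pauli’s identity
$$ ( \underline{\sigma} \cdot \underline{A} ) \, ( \underline{\sigma} \cdot \underline{B} ) = \underline{A} \cdot \underline{B} \; I + i \, \underline{\sigma} \cdot ( \underline{A} \wedge \underline{B} ) \tag{1122} $$
one finds the energy eigenvalues are given by the doubly-degenerate dispersion
relations
$$ k^{(0)} = \pm \sqrt{ \left( \frac{m c}{\hbar} \right)^2 + k^2 } \tag{1123} $$
Thus, the field free relativistic electron can have positive and negative-energy
eigenvalues given by
$$ E = \pm \sqrt{ m^2 c^4 + p^2 c^2 } \tag{1124} $$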
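Pauli’s identity, eqn(1122), is easy to spot-check numerically for ordinary (commuting) vectors. The sketch below assumes NumPy and uses randomly chosen real vectors; the helper name `sigma_dot` is illustrative. It also checks the special case A = B = k, for which the cross product vanishes and (σ . k)² = k² I, which is the step that produces the dispersion relation of eqn(1123).

```python
import numpy as np

# Pauli matrices stacked as sigma[0], sigma[1], sigma[2]
sigma = np.array(
    [
        [[0, 1], [1, 0]],
        [[0, -1j], [1j, 0]],
        [[1, 0], [0, -1]],
    ],
    dtype=complex,
)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for an ordinary (commuting) 3-vector v."""
    return np.einsum("i,ijk->jk", np.asarray(v, dtype=complex), sigma)


rng = np.random.default_rng(seed=0)
a, b = rng.normal(size=3), rng.normal(size=3)

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
pauli_identity_holds = np.allclose(lhs, rhs)

# Special case a = b = k: the cross product vanishes, so (sigma . k)^2 = k^2 I
k = np.array([0.3, -0.4, 1.2])
square_is_k2 = np.allclose(sigma_dot(k) @ sigma_dot(k), np.dot(k, k) * np.eye(2))

print(pauli_identity_holds, square_is_k2)
```

For vector operators whose components do not commute, the cross-product term need not vanish even when A = B, so this numerical check covers only the c-number case.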
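Since the solutions are degenerate, solutions can be found that are simultaneous
eigenstates of the Hamiltonian Ĥ given by
$$ \hat{H} = \begin{pmatrix} m c^2 \, I & - i \hbar c \, \underline{\sigma} \cdot \nabla \\ - i \hbar c \, \underline{\sigma} \cdot \nabla & - m c^2 \, I \end{pmatrix} \tag{1125} $$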
and another operator that commutes with Ĥ. It is convenient to choose the
second operator to be the helicity operator.
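Figure 44: A cartoon depicting the two helicity states of a spin one-half particle
(panels: Σ_k = +1 and Σ_k = −1).
The helicity operator is given by
$$ \hat{\Sigma} = - i \hbar \begin{pmatrix} \underline{\sigma} \cdot \nabla & 0 \\ 0 & \underline{\sigma} \cdot \nabla \end{pmatrix} \tag{1126} $$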
This is the appropriate relativistic generalization of spin valid only for free
particles85, as the helicity is a conserved quantity since
$$ [ \hat{H} , \hat{\Sigma} ] = 0 \tag{1127} $$
85 Helicity is not conserved for spherically symmetric potentials. However, if only a time-
independent vector potential is present, the generalized quantity
$$ \hat{\Sigma} = \underline{\sigma} \cdot \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right) $$
is conserved. This conservation law implies that the spin will always retain its alignment with
the velocity.
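In the absence of electromagnetic fields, the Hamiltonian is evaluated as
$$ \hat{H}(k) = \begin{pmatrix} m c^2 \, I & \hbar c \, \underline{\sigma} \cdot \underline{k} \\ \hbar c \, \underline{\sigma} \cdot \underline{k} & - m c^2 \, I \end{pmatrix} \tag{1128} $$
Likewise, for the source free case, the properly normalized helicity operator is
found as
$$ \Lambda(k) = \begin{pmatrix} \underline{\sigma} \cdot \hat{\underline{k}} & 0 \\ 0 & \underline{\sigma} \cdot \hat{\underline{k}} \end{pmatrix} \tag{1129} $$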
which has eigenvalues of ±1.
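The statements about Ĥ(k) and Λ(k) can be illustrated by diagonalizing the two 4 × 4 matrices for one arbitrary wave vector. This is a sketch only, assuming NumPy, units with ħ = c = m = 1, and an arbitrarily chosen k; it checks that the energies are ±(m²c⁴ + ħ²c²k²)^{1/2}, each doubly degenerate, that the helicity eigenvalues are ±1, and that the two matrices commute.

```python
import numpy as np

m, c, hbar = 1.0, 1.0, 1.0           # illustrative units (assumption)
k = np.array([0.3, -0.4, 1.2])       # arbitrary wave vector (assumption)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

sigma_k = k[0] * sx + k[1] * sy + k[2] * sz   # sigma . k

# Field-free Hamiltonian of eqn(1128) and helicity operator of eqn(1129)
H = np.block([[m * c**2 * I2, hbar * c * sigma_k],
              [hbar * c * sigma_k, -m * c**2 * I2]])
helicity = np.block([[sigma_k, Z2], [Z2, sigma_k]]) / np.linalg.norm(k)

E = np.sqrt(m**2 * c**4 + (hbar * c) ** 2 * np.dot(k, k))
energies = np.linalg.eigvalsh(H)              # sorted in ascending order
helicities = np.linalg.eigvalsh(helicity)
commutator_norm = np.linalg.norm(H @ helicity - helicity @ H)

print(energies, helicities, commutator_norm)
```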
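$$ \phi^{A}_{+} = u^{(0)} \chi_{+} = u^{(0)} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{1131} $$
and $\phi^{B}_{+}$ as
$$ \phi^{B}_{+} = u^{(2)} \chi_{+} = u^{(2)} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{1132} $$
Therefore, one has
$$ \psi_{+}(x) = \begin{pmatrix} u^{(0)} \chi_{+} \\ u^{(2)} \chi_{+} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1133} $$
$$ \phi^{A}_{-} = u^{(1)} \chi_{-} = u^{(1)} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{1134} $$
and $\phi^{B}_{-}$ as
$$ \phi^{B}_{-} = u^{(3)} \chi_{-} = u^{(3)} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{1135} $$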
Thus, the eigenstates with helicity −1 are the spin-down eigenstates
$$ \psi_{-}(x) = \begin{pmatrix} u^{(1)} \chi_{-} \\ u^{(3)} \chi_{-} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1136} $$
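On substituting the helicity eigenstates $\psi_{\Lambda}$ into the Dirac equation for the
free spin one-half particle
$$ i \hbar \frac{\partial}{\partial t} \psi_{\Lambda} = \hat{H} \, \psi_{\Lambda} \tag{1138} $$
one finds
$$ E \begin{pmatrix} \phi^{A}_{\Lambda} \\ \phi^{B}_{\Lambda} \end{pmatrix} = \begin{pmatrix} m c^2 & \sigma^{(3)} c \hbar k^{(3)} \\ \sigma^{(3)} c \hbar k^{(3)} & - m c^2 \end{pmatrix} \begin{pmatrix} \phi^{A}_{\Lambda} \\ \phi^{B}_{\Lambda} \end{pmatrix} \tag{1139} $$
$$ \phi^{B}_{\Lambda} = \frac{ \sigma^{(3)} c \hbar k^{(3)} }{ E + m c^2 } \, \phi^{A}_{\Lambda} \tag{1140} $$
This equation shows that the components $\phi^{B}_{\Lambda}$ are small for the positive-energy
solutions, whereas the complementary expression
$$ \phi^{A}_{\Lambda} = - \frac{ \sigma^{(3)} c \hbar k^{(3)} }{ m c^2 - E } \, \phi^{B}_{\Lambda} \tag{1141} $$
shows that $\phi^{A}_{\Lambda}$ is small for the negative-energy solutions. Hence, the two
positive-energy and two negative-energy (un-normalized) solutions of the Dirac
equation can be written as
$$ \psi_{+}(x) = N_e \begin{pmatrix} \chi_{+} \\ \frac{ c \hbar k^{(3)} }{ E + m c^2 } \chi_{+} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1142} $$
$$ \psi_{-}(x) = N_e \begin{pmatrix} \chi_{-} \\ - \frac{ c \hbar k^{(3)} }{ E + m c^2 } \chi_{-} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1143} $$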
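The normalization condition is
$$ \int d^3 r \; \psi^{\dagger} \psi = 1 \tag{1144} $$
$$ \begin{aligned}
1 &= V N_e^2 \left( 1 + \frac{ c^2 \hbar^2 k^2 }{ ( E + m c^2 )^2 } \right) \\
&= V N_e^2 \, \frac{ E^2 + 2 E m c^2 + m^2 c^4 + c^2 \hbar^2 k^2 }{ ( E + m c^2 )^2 } \\
&= V N_e^2 \, \frac{ 2 E^2 + 2 E m c^2 }{ ( E + m c^2 )^2 } \\
&= V N_e^2 \, \frac{ 2 E }{ E + m c^2 }
\end{aligned} $$
Hence, the normalization constant can be set as
$$ N_e = \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \tag{1145} $$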
for positive E.
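The chain of equalities leading to eqn(1145) can be confirmed with a few lines of arithmetic. The sketch below assumes units with ħ = c = m = 1 and arbitrary illustrative values for k and V; the function name is hypothetical.

```python
import math


def electron_normalization(E, m, V, c=1.0):
    """Normalization constant N_e of eqn(1145) for a positive-energy solution."""
    return math.sqrt((E + m * c**2) / (2.0 * E * V))


# Illustrative values in units with hbar = c = 1 (assumption)
m, c, hbar, V = 1.0, 1.0, 1.0, 2.5
k = 0.7
E = math.sqrt(m**2 * c**4 + (c * hbar * k) ** 2)

N_e = electron_normalization(E, m, V, c)

# Integral of psi^dagger psi over the volume V: upper plus lower components
total_probability = V * N_e**2 * (1.0 + (c * hbar * k) ** 2 / (E + m * c**2) ** 2)
print(total_probability)
```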
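the lower components are the large components. In this case, it is more convenient
to express the negative-energy solutions as
$$ \psi_{+}(x) = N_p \begin{pmatrix} - \frac{ c \hbar k^{(3)} }{ m c^2 - E } \chi_{+} \\ \chi_{+} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1147} $$
for helicity +1, and
$$ \psi_{-}(x) = N_p \begin{pmatrix} \frac{ c \hbar k^{(3)} }{ m c^2 - E } \chi_{-} \\ \chi_{-} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1148} $$
for helicity -1. Furthermore, in this expression the normalization constant has
the form
$$ N_p = \sqrt{ \frac{ m c^2 - E }{ - 2 E V } } \tag{1149} $$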
Hence, the positive and negative-energy solutions are symmetric under the in-
terchange E → − E, if Λ → − Λ and the upper and lower two-component
spinors (φA , φB ) are interchanged.
General Helicity Eigenstates
11.4 Coupling to Fields
The Dirac equation describes relativistic spin one-half fermions, and their
anti-particles. It describes all massive leptons such as the electron, muon and tau
particle, and can be generalized to describe their interaction with the
electromagnetic field, or its generalization the electro-weak field. In the limit m → 0,
the Dirac equation reduces to the Weyl equation87 which describes neutrinos.
The Dirac equation also describes massive quarks and the interaction can be
generalized to quantum chromodynamics.
11.4.1 Mott Scattering
We shall consider the scattering of positive-energy electrons from a nucleus of
charge Z. The initial electron beam has momentum ħ k which is scattered by
the target nucleus. The detector is placed so as to detect scattered electrons
with momentum ħ k′. The initial and final states of the positive-energy electron
can be represented by the Dirac spinors of the form
$$ \psi_{k,\sigma}(x) = N_k \begin{pmatrix} \chi_{\sigma} \\ \frac{ c \hbar \, \underline{k} \cdot \underline{\sigma} }{ E_k + m c^2 } \chi_{\sigma} \end{pmatrix} \exp\left[ - i \, k_{\mu} x^{\mu} \right] \tag{1162} $$
The interaction Hamiltonian with the electrostatic field of the nucleus is given
by the diagonal matrix
$$ \hat{H}_{Int} = - \frac{ Z e^2 }{ r } \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \tag{1164} $$
$$ F = \frac{ \hbar k c^2 }{ V E_k } \tag{1167} $$
The elastic scattering cross-section in which the final state polarization is
unmeasured is given by
$$ \frac{ d\sigma }{ d\Omega' } = \frac{ E_k V^2 }{ ( 2 \pi )^2 \hbar^2 k c^2 } \sum_{\sigma'} \int_0^{\infty} dk' \, k'^2 \, | \langle k' \sigma' | \hat{H}_{Int} | k, \sigma \rangle |^2 \, \delta( E_k - E_{k'} ) \tag{1168} $$
where the delta function ensures conservation of energy. Since the polarization
of the final state electron is unmeasured, the spin σ′ is summed over. The
integration over k′ can be performed, yielding
$$ \frac{ d\sigma }{ d\Omega' } = \left( \frac{ E V }{ 2 \pi \hbar^2 c^2 } \right)^2 \sum_{\sigma'} | \langle k' \sigma' | \hat{H}_{Int} | k, \sigma \rangle |^2 \tag{1169} $$
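where k and k′ are restricted to be on the energy shell (E = E_k = E_k′). The
matrix elements can then be evaluated as
$$ \langle k', \sigma' | \hat{H}_{Int} | k, \sigma \rangle = - \frac{ 4 \pi Z e^2 }{ V \, | \underline{k} - \underline{k}' |^2 } \, \frac{ E + m c^2 }{ 2 E } \; \chi^{T}_{\sigma'} \left( I + \frac{ c^2 \hbar^2 \, ( \underline{\sigma} \cdot \underline{k}' ) \, ( \underline{\sigma} \cdot \underline{k} ) }{ ( E + m c^2 )^2 } \right) \chi_{\sigma} \tag{1170} $$
where the normalization constants have been combined, since energy is
conserved, and a single factor of 1/V remains after the spatial integration.
Likewise, the complex conjugate matrix elements are given by
$$ \langle k, \sigma | \hat{H}_{Int} | k', \sigma' \rangle = - \frac{ 4 \pi Z e^2 }{ V \, | \underline{k} - \underline{k}' |^2 } \, \frac{ E + m c^2 }{ 2 E } \; \chi^{T}_{\sigma} \left( I + \frac{ c^2 \hbar^2 \, ( \underline{\sigma} \cdot \underline{k} ) \, ( \underline{\sigma} \cdot \underline{k}' ) }{ ( E + m c^2 )^2 } \right) \chi_{\sigma'} \tag{1171} $$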
These expressions for the matrix elements are inserted into the scattering
cross-section. Since the final state polarization is not detected, σ′ must be
summed over. The trace over σ′ is evaluated by using the completeness relation
$$ \sum_{\sigma'} \chi_{\sigma'} \, \chi^{T}_{\sigma'} = I \tag{1172} $$
Hence, the cross-section is given by
$$ \frac{ d\sigma }{ d\Omega' } = \left( \frac{ Z e^2 }{ \hbar^2 c^2 \, | \underline{k} - \underline{k}' |^2 } \right)^2 \left[ ( E + m c^2 )^2 + 2 c^2 \hbar^2 \, \underline{k} \cdot \underline{k}' + \frac{ c^4 \hbar^4 k^2 k'^2 }{ ( E + m c^2 )^2 } \right] \tag{1176} $$
It should be noted that the last two terms originated from the combined action
of the Pauli spin operators and involved the lower two-component spinors. The
last term can be simplified by using the elastic scattering condition k = k′ and
then using the identity
$$ c^4 \hbar^4 k^4 = ( E^2 - m^2 c^4 )^2 \tag{1177} $$
$$ \underline{k} \cdot \underline{k}' = k^2 \cos \theta' \tag{1179} $$
$$ m^2 c^4 = E^2 - c^2 \hbar^2 k^2 \tag{1181} $$
has been introduced. The above result is the Mott scattering cross-section88,
which describes the scattering of charged electrons. It differs from the
Rutherford scattering cross-section due to the multiplicative factor of relativistic origin,
which deviates from unity due to the electron’s internal degrees of freedom. The
extra contribution to the scattering is interpreted in terms of scattering from the
magnetic moment associated with the electron’s spin interacting with the
magnetic field of the nuclear charge that the electron experiences in its rest frame.
It should be noted that even if the initial beam of electrons is un-polarized, the
scattered beam will be partially spin-polarized (due to higher-order corrections).
88 N. F. Mott, Proc. Roy. Soc. A 124, 425 (1929).
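The spin sum behind eqn(1176) can be reproduced numerically and compared with the familiar closed form of the relativistic factor, 4E²(1 − β² sin²(θ′/2)) with β = v/c. The sketch below is illustrative only: it assumes NumPy, units with ħ = c = m = 1, an initial spin-up spinor, and arbitrary values of k and θ′. The closed form is quoted as the standard Mott factor rather than taken from the equations shown here.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)


def sigma_dot(v):
    """2x2 matrix sigma . v for a real 3-vector v."""
    return v[0] * sx + v[1] * sy + v[2] * sz


def spin_summed_bracket(E, m, k_in, k_out, c=1.0, hbar=1.0):
    """Square bracket of eqn(1176), obtained from an explicit sum over final spins."""
    a = (c * hbar) ** 2 / (E + m * c**2) ** 2
    M = np.eye(2) + a * sigma_dot(k_out) @ sigma_dot(k_in)
    chi = np.array([1.0, 0.0], dtype=complex)   # initial spin-up spinor (assumption)
    amplitudes = M @ chi                        # entries: amplitudes for final spin up / down
    return (E + m * c**2) ** 2 * np.vdot(amplitudes, amplitudes).real


def mott_bracket(E, m, theta, c=1.0):
    """Closed form 4 E^2 (1 - beta^2 sin^2(theta/2)), with beta = v/c."""
    beta_sq = 1.0 - (m * c**2 / E) ** 2
    return 4.0 * E**2 * (1.0 - beta_sq * np.sin(theta / 2.0) ** 2)


m, k, theta = 1.0, 0.8, 1.1                     # illustrative values
E = np.sqrt(m**2 + k**2)
k_in = k * np.array([0.0, 0.0, 1.0])
k_out = k * np.array([np.sin(theta), 0.0, np.cos(theta)])

numeric = spin_summed_bracket(E, m, k_in, k_out)
closed_form = mott_bracket(E, m, theta)
print(numeric, closed_form)
```

In the limit β → 0 the factor reduces to 4E², so that only the Rutherford angular dependence remains.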
We shall require that the matrices $\alpha^{\mu}$ are Hermitean and that they satisfy the
equation
$$ ( \alpha^{\mu} )^2 = \hat{I} \tag{1187} $$
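On comparing with the form of Maxwell’s equations89, one finds that the
matrices are given by
$$ \alpha^{(0)} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1188} $$
$$ \alpha^{(1)} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix} \tag{1189} $$
$$ \alpha^{(2)} = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & i \\ -1 & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix} \tag{1190} $$
$$ \alpha^{(3)} = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{1191} $$
89 Since the first element of ψ is zero, the first columns of the matrices are not determined
directly from the comparison. The first rows are determined by demanding that the matrices
are Hermitean.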
The matrices corresponding to the spatial indices are traceless and satisfy the
anti-commutation relations
and
$$ \alpha^{(i)} \alpha^{(j)} = i \sum_{k} \xi^{i,j,k} \, \alpha^{(k)} \tag{1193} $$
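The algebraic properties claimed for the four matrices of eqn(1188) to eqn(1191) can be verified directly. The sketch below assumes NumPy and treats eqn(1193) as a statement about i ≠ j; the helper `levi_civita` is an illustrative name for the anti-symmetric symbol ξ.

```python
import numpy as np

# The matrices of eqn(1188) - eqn(1191); alpha[0] is the identity
alpha = [
    np.eye(4, dtype=complex),
    np.array([[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]], dtype=complex),
    np.array([[0, 0, -1, 0], [0, 0, 0, 1j], [-1, 0, 0, 0], [0, -1j, 0, 0]], dtype=complex),
    np.array([[0, 0, 0, -1], [0, 0, -1j, 0], [0, 1j, 0, 0], [-1, 0, 0, 0]], dtype=complex),
]
I4 = np.eye(4)


def levi_civita(i, j, k):
    """Totally anti-symmetric symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) / 2


hermitean = all(np.allclose(a.conj().T, a) for a in alpha)
squares_to_identity = all(np.allclose(a @ a, I4) for a in alpha)
spatial_traceless = all(abs(np.trace(alpha[i])) < 1e-12 for i in (1, 2, 3))
spatial_anticommute = all(
    np.allclose(alpha[i] @ alpha[j] + alpha[j] @ alpha[i], 0)
    for i in (1, 2, 3) for j in (1, 2, 3) if i != j
)
# alpha^(i) alpha^(j) = i sum_k xi^{i,j,k} alpha^(k), checked for i != j
product_rule = all(
    np.allclose(
        alpha[i] @ alpha[j],
        1j * sum(levi_civita(i, j, k) * alpha[k] for k in (1, 2, 3)),
    )
    for i in (1, 2, 3) for j in (1, 2, 3) if i != j
)

print(hermitean, squares_to_identity, spatial_traceless, spatial_anticommute, product_rule)
```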
11.4.3 The Gordon Decomposition
The interaction of the Dirac particle with the electromagnetic field is described
by the interaction Hamiltonian, which is the 4 × 4 matrix
$$ \hat{H}_I = \frac{q}{c} \, c \, \gamma^{(0)} \gamma^{\mu} A_{\mu} \tag{1200} $$
The matrix interaction Hamiltonian operator yields an interaction Hamiltonian
density $\hat{\mathcal{H}}_I$ given by
$$ \begin{aligned}
\hat{\mathcal{H}}_I &= \frac{q}{c} \, c \, \overline{\psi}^{\dagger} \gamma^{\mu} \psi \, A_{\mu} \\
&= \frac{q}{c} \, j^{\mu} A_{\mu}
\end{aligned} \tag{1201} $$
where $j^{\mu}$ is the four-vector probability current density which satisfies the
condition for conservation of probability. The current density operator is prominent
in applications of the Dirac equation, since it naturally describes both the
interaction with an electromagnetic field and the conservation laws. Therefore, the
physical content of the current densities shall be examined next.
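where the partial derivatives only operate on the wave function immediately to
their right. The identity
$$ \gamma^{(0)} \gamma^{(0)} = \hat{I} \tag{1207} $$
has been used to express $\psi^{\dagger}$ in terms of $\overline{\psi}^{\dagger}$. However, since the γ matrices
satisfy
$$ \gamma^{(0)} \, \gamma^{\mu \dagger} \, \gamma^{(0)} = \gamma^{\mu} \tag{1208} $$
the current can be further simplified to yield
$$ j^{\nu} = - \frac{ i \hbar }{ 2 m } \left[ \left( \left( \partial_{\mu} - i \frac{q}{\hbar c} A_{\mu} \right) \overline{\psi}^{\dagger} \right) \gamma^{\mu} \gamma^{\nu} \psi - \overline{\psi}^{\dagger} \gamma^{\nu} \gamma^{\mu} \left( \partial_{\mu} + i \frac{q}{\hbar c} A_{\mu} \right) \psi \right] \tag{1209} $$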
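where, once again, the partial derivative only operates on the wave function
immediately to the right of it. Furthermore, if one sets
$$ \frac{1}{2} \left( \gamma^{\mu} \gamma^{\nu} + \gamma^{\nu} \gamma^{\mu} \right) = g^{\mu,\nu} \, \hat{I} , \qquad \frac{1}{2} \left( \gamma^{\mu} \gamma^{\nu} - \gamma^{\nu} \gamma^{\mu} \right) = - i \, \sigma^{\mu,\nu} \tag{1210} $$
then the current density can be expressed as the sum of two contributions
$$ \begin{aligned}
j^{\nu} &= j_c^{\nu} + j_s^{\nu} \\
&= - \frac{ i \hbar }{ 2 m } \left[ g^{\mu,\nu} \left( ( \partial_{\mu} \overline{\psi}^{\dagger} ) \psi - \overline{\psi}^{\dagger} \partial_{\mu} \psi \right) - 2 i \, g^{\mu,\nu} \frac{q}{\hbar c} \, \overline{\psi}^{\dagger} A_{\mu} \psi \right] - \frac{ \hbar }{ 2 m } \frac{ \partial }{ \partial x^{\mu} } \left( \overline{\psi}^{\dagger} \sigma^{\mu,\nu} \psi \right)
\end{aligned} \tag{1211} $$
where
$$ \begin{aligned}
j_c^{\nu} &= - \frac{ i \hbar }{ 2 m } \left[ ( \partial^{\nu} \overline{\psi}^{\dagger} ) \psi - \overline{\psi}^{\dagger} \partial^{\nu} \psi - 2 i \frac{q}{\hbar c} \, \overline{\psi}^{\dagger} A^{\nu} \psi \right] \\
j_s^{\nu} &= - \frac{ \hbar }{ 2 m } \frac{ \partial }{ \partial x^{\mu} } \left( \overline{\psi}^{\dagger} \sigma^{\mu,\nu} \psi \right)
\end{aligned} \tag{1212} $$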
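Let us examine the first term in the probability current density. If ψ
represents an energy eigenstate, then $j_c^{(0)}$ is given by
$$ j_c^{(0)} = \frac{E}{m c} \, \overline{\psi}^{\dagger} \psi - \frac{q}{m c} \, \overline{\psi}^{\dagger} A^{(0)} \psi \tag{1213} $$
This contribution obviously yields the main contribution to (c times) the
probability density
$$ j_c^{(0)} \approx c \, \overline{\psi}^{\dagger} \psi \tag{1214} $$
in the non-relativistic limit since the rest mass energy dominates the energy
E ∼ m c². The spatial components of $j_c^{(i)}$ are given by
$$ \underline{j}_c = \frac{ i \hbar }{ 2 m } \left[ ( \nabla \overline{\psi}^{\dagger} ) \psi - \overline{\psi}^{\dagger} ( \nabla \psi ) \right] - \frac{q}{m c} \, \overline{\psi}^{\dagger} \underline{A} \, \psi \tag{1215} $$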
where the derivatives have been expressed as derivatives w.r.t. the contravariant
components x(i) of the position vector. This expression coincides with the full
non-relativistic expression for the current density j (i) .
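We now examine the second term $j_s^{\mu}$ in the Gordon decomposition. For
future reference, the anti-symmetrized products of the Dirac matrices $\sigma^{\mu,\nu}$ will
be expressed in 2 × 2 block form. Therefore, since
$$ \gamma^{(0)} = \begin{pmatrix} I & 0 \\ 0 & - I \end{pmatrix} , \qquad \gamma^{(i)} = \begin{pmatrix} 0 & \sigma^{(i)} \\ - \sigma^{(i)} & 0 \end{pmatrix} \tag{1216} $$
and
$$ \sigma^{\mu,\nu} = \frac{i}{2} \left( \gamma^{\mu} \gamma^{\nu} - \gamma^{\nu} \gamma^{\mu} \right) \tag{1217} $$
the matrices are found as
$$ \sigma^{0,j} = i \begin{pmatrix} 0 & \sigma^{(j)} \\ \sigma^{(j)} & 0 \end{pmatrix} \tag{1218} $$
and
$$ \sigma^{i,j} = \sum_{k} \xi^{i,j,k} \begin{pmatrix} \sigma^{(k)} & 0 \\ 0 & \sigma^{(k)} \end{pmatrix} \tag{1219} $$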
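The block forms of eqn(1218) and eqn(1219) follow from straightforward matrix multiplication, which can be delegated to a short script. The following sketch assumes NumPy and the representation of eqn(1216); the function name `sigma_tensor` is illustrative.

```python
import numpy as np

pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# gamma matrices in the representation of eqn(1216)
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]


def sigma_tensor(mu, nu):
    """sigma^{mu,nu} = (i/2) [gamma^mu, gamma^nu], as in eqn(1217)."""
    return 0.5j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])


# eqn(1218): sigma^{0,j} = i * off-diagonal blocks sigma^(j)
time_space_ok = all(
    np.allclose(sigma_tensor(0, j), 1j * np.block([[Z2, pauli[j - 1]], [pauli[j - 1], Z2]]))
    for j in (1, 2, 3)
)

# eqn(1219) for the cyclic index pairs (1,2), (2,3), (3,1): diagonal blocks sigma^(k)
space_space_ok = all(
    np.allclose(sigma_tensor(i, j), np.block([[pauli[k - 1], Z2], [Z2, pauli[k - 1]]]))
    for (i, j, k) in ((1, 2, 3), (2, 3, 1), (3, 1, 2))
)

print(time_space_ok, space_space_ok)
```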
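The two by two block diagonal matrix of Pauli spin matrices will be denoted by
σ̂. For an energy eigenstate, the time-like component $j_s^{(0)}$ is identically zero.
Hence, the space-like components of $j_s^{(i)}$ are given by
$$ \underline{j}_s = - \frac{ \hbar }{ 2 m } \, \nabla \wedge \left( \overline{\psi}^{\dagger} \, \hat{\underline{\sigma}} \, \psi \right) \tag{1220} $$
where σ̂ is the 2 × 2 block-diagonal Pauli spin matrix
$$ \hat{\underline{\sigma}} = \begin{pmatrix} \underline{\sigma} & 0 \\ 0 & \underline{\sigma} \end{pmatrix} \tag{1221} $$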
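The additional term in the current density clearly involves the Pauli
spin-matrices. To elucidate its meaning, its contribution to the energy shall be
examined. On substituting this term in the interaction Hamiltonian density,
one finds a contribution
$$ \begin{aligned}
\hat{\mathcal{H}}_I^{spin} &= - \frac{q}{c} \, \underline{j}_s \cdot \underline{A} \\
&= + \frac{ q \hbar }{ 2 m c } \, \underline{A} \cdot \nabla \wedge \left( \overline{\psi}^{\dagger} \, \hat{\underline{\sigma}} \, \psi \right)
\end{aligned} \tag{1222} $$
On integrating over space, the interaction Hamiltonian density gives rise to the
interaction’s contribution to the total energy. By integrating by parts, it can
be shown that this energy contribution is equivalent to the energy contribution
caused by an equivalent form of the interaction Hamiltonian density
$$ \begin{aligned}
\hat{\mathcal{H}}_I^{spin} &\equiv - \frac{ q \hbar }{ 2 m c } \left( \overline{\psi}^{\dagger} \, \hat{\underline{\sigma}} \, \psi \right) \cdot ( \nabla \wedge \underline{A} ) \\
&\equiv - \frac{ q \hbar }{ 2 m c } \left( \overline{\psi}^{\dagger} \, \hat{\underline{\sigma}} \, \psi \right) \cdot \underline{B}
\end{aligned} \tag{1223} $$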
where B is the magnetic field. Hence, the interaction energy contains a term
which represents an interaction between the electron’s internal degree of free-
dom and the magnetic field.
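The first step of the proof of the Lorentz covariance of the Dirac equation
requires that one should be able to show that under a Lorentz transformation
defined by
$$ A^{\mu} \rightarrow A^{\mu \prime} = \Lambda^{\mu}{}_{\nu} \, A^{\nu} \tag{1224} $$
then the Dirac equation is transformed from
$$ \gamma^{\mu} \left( \hat{p}_{\mu} - \frac{q}{c} A_{\mu} \right) \psi = m c \, \psi \tag{1225} $$
to an equation with an equivalent form
$$ \gamma^{\mu \prime} \left( \hat{p}'_{\mu} - \frac{q}{c} A'_{\mu} \right) \psi' = m c \, \psi' \tag{1226} $$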
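Furthermore, the four components of the spinor wave function ψ′ are assumed
to be linearly related to the components of ψ by a four by four matrix R̂(Λ)
which is independent of $x^{\mu}$
$$ \psi' = \hat{R}(\Lambda) \, \psi \tag{1227} $$
Hence, the transformed Dirac equation can be re-written in terms of the
untransformed spinor
$$ \begin{aligned}
\gamma^{\mu \prime} \left( \hat{p}'_{\mu} - \frac{q}{c} A'_{\mu} \right) \psi' &= m c \, \psi' \\
\gamma^{\mu \prime} \left( \hat{p}'_{\mu} - \frac{q}{c} A'_{\mu} \right) \hat{R}(\Lambda) \, \psi &= m c \, \hat{R}(\Lambda) \, \psi
\end{aligned} \tag{1228} $$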
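if such an R̂(Λ) exists. The $\gamma^{\mu \prime}$ matrices must satisfy the same anti-commutation
relations as the $\gamma^{\mu}$ and, therefore, only differ from them by a similarity
transformation91. The transformation of $\gamma^{\mu \prime}$ just results in the set of the four linear
equations that compose the Dirac equation being combined in different ways,
so this rearrangement can be absorbed in the definition of R̂(Λ). That is, one
can choose to impose the convention that $\gamma^{\mu \prime} = \gamma^{\mu}$. The transformed Dirac
equation can be expressed as
$$ \begin{aligned}
\gamma^{\mu \prime} \left( \hat{p}'_{\mu} - \frac{q}{c} A'_{\mu} \right) \hat{R}(\Lambda) \, \psi &= m c \, \hat{R}(\Lambda) \, \psi \\
\gamma^{\mu \prime} \, \Lambda_{\mu}{}^{\nu} \left( \hat{p}_{\nu} - \frac{q}{c} A_{\nu} \right) \hat{R}(\Lambda) \, \psi &= m c \, \hat{R}(\Lambda) \, \psi
\end{aligned} \tag{1229} $$
where the transformation properties of the momentum four-vector have been
used92. On multiplying by the inverse of R̂(Λ), one has
$$ \hat{R}^{-1}(\Lambda) \, \gamma^{\mu \prime} \, \Lambda_{\mu}{}^{\nu} \left( \hat{p}_{\nu} - \frac{q}{c} A_{\nu} \right) \hat{R}(\Lambda) \, \psi = m c \, \psi \tag{1230} $$
$$ \hat{R}^{-1}(\Lambda) \, \gamma^{\mu \prime} \, \hat{R}(\Lambda) \, \Lambda_{\mu}{}^{\nu} \left( \hat{p}_{\nu} - \frac{q}{c} A_{\nu} \right) \psi = m c \, \psi \tag{1231} $$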
91 This is a statement of Pauli’s fundamental theorem [W. Pauli, Ann. Inst. Henri Poincaré
6, 109 (1936).]. For a general discussion, see R. H. Good Jr. Rev. Mod. Phys. 27, 187
(1955).
92 It should be noted that the matrices $\Lambda_{\mu}{}^{\nu}$ and R̂ act on totally different spaces. The
matrices $\Lambda_{\mu}{}^{\nu}$ act on the components of the four-vectors $x^{\nu}$, whereas the R̂ matrices act on
the components of the four-component Dirac spinor ψ.
where the four by four matrices R̂(Λ) have been commuted with the differential
operators and also with the components of the Lorentz transform. This yields the
condition for covariance as
$$ \hat{R}^{-1}(\Lambda) \, \gamma^{\mu \prime} \, \hat{R}(\Lambda) \, \Lambda_{\mu}{}^{\nu} = \gamma^{\nu} \tag{1232} $$
The transformed Dirac equation has the same form as the original equation if
the transformed $\gamma^{\mu \prime}$ matrices satisfy the same anti-commutations and conditions
as the unprimed matrices. This can be achieved by choosing $\gamma^{\mu \prime} = \gamma^{\mu}$. This
choice yields the condition for covariance as
$$ \Lambda^{\mu}{}_{\nu} \, \Lambda_{\rho}{}^{\nu} = \delta^{\mu}{}_{\rho} \tag{1234} $$
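The above equation determines the 4 × 4 matrix R̂(Λ). If R̂(Λ) exists, the Dirac
equation has the same form in the two frames of reference and the solutions
are linearly related. Pauli’s “fundamental theorem” guarantees that a matrix
R̂(Λ) exists which does satisfy the condition. Instead of following the general
theorem, the solution will be inferred from consideration of infinitesimal Lorentz
transformations.
$$ \Lambda^{\mu}{}_{\nu} = \delta^{\mu}{}_{\nu} + \epsilon^{\mu}{}_{\nu} + \ldots \tag{1236} $$
where $\delta^{\mu}{}_{\nu}$ is the Kronecker delta function. The matrix R̂(Λ) for the infinitesimal
transformation can also be expanded as
$$ \hat{R} = \hat{I} - \frac{i}{4} \, \epsilon^{\mu}{}_{\nu} \, \omega_{\mu}{}^{\nu} + \ldots \tag{1237} $$
where $\omega_{\mu}{}^{\nu}$ is a four by four matrix that has yet to be determined. The inverse
matrix can be written as
$$ \hat{R}^{-1} = \hat{I} + \frac{i}{4} \, \epsilon^{\mu}{}_{\nu} \, \omega_{\mu}{}^{\nu} + \ldots \tag{1238} $$
to first-order in the infinitesimal quantity $\epsilon^{\mu}{}_{\nu}$. On substituting the matrices for
the infinitesimal transform into the equation that determines R̂, one obtains
$$ \frac{i}{4} \, \epsilon^{\rho}{}_{\sigma} \left( \omega_{\rho}{}^{\sigma} \gamma^{\mu} - \gamma^{\mu} \omega_{\rho}{}^{\sigma} \right) = \epsilon^{\mu}{}_{\nu} \, \gamma^{\nu} \tag{1239} $$
or on raising and lowering indices
$$ \frac{i}{4} \, \epsilon_{\rho\sigma} \left( \omega^{\rho\sigma} \gamma^{\mu} - \gamma^{\mu} \omega^{\rho\sigma} \right) = g^{\mu\rho} \, \epsilon_{\rho\sigma} \, \gamma^{\sigma} \tag{1240} $$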
——————————————————————————————————
Proof of Solution
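It can be shown that the expression for $\sigma^{\alpha,\beta}$ given in eqn(1244) satisfies the
requirement of eqn(1243), by evaluating the nested commutator through
repeatedly using the anti-commutation properties of the γ matrices. The commutator
can be expressed as a nested commutator or as the sum of two commutators
$$ \begin{aligned}
[ \sigma^{\alpha\beta} , \gamma^{\mu} ] &= \frac{i}{2} \, [ \, [ \gamma^{\alpha} , \gamma^{\beta} ] , \gamma^{\mu} ] \\
&= \frac{i}{2} \, [ \gamma^{\alpha} \gamma^{\beta} , \gamma^{\mu} ] - \frac{i}{2} \, [ \gamma^{\beta} \gamma^{\alpha} , \gamma^{\mu} ]
\end{aligned} \tag{1245} $$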
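On using the anti-commutation relation for the γ matrices
$$ \frac{1}{2} \left( \gamma^{\alpha} \gamma^{\beta} + \gamma^{\beta} \gamma^{\alpha} \right) = g^{\alpha,\beta} \, \hat{I} \tag{1246} $$
the commutator reduces to
$$ \begin{aligned}
[ \sigma^{\alpha\beta} , \gamma^{\mu} ] &= i \, [ \gamma^{\alpha} \gamma^{\beta} , \gamma^{\mu} ] - i \, g^{\alpha,\beta} \, [ \hat{I} , \gamma^{\mu} ] \\
&= i \, [ \gamma^{\alpha} \gamma^{\beta} , \gamma^{\mu} ]
\end{aligned} \tag{1247} $$
where the second line follows since the identity matrix commutes with $\gamma^{\mu}$. One
notices that if the $\gamma^{\mu}$’s are anti-commuted to the center of each product, some
terms will cancel and there may be some simplification. On using the
anti-commutation relation in the second term of the expression
$$ [ \sigma^{\alpha\beta} , \gamma^{\mu} ] = i \left( \gamma^{\alpha} \gamma^{\beta} \gamma^{\mu} - \gamma^{\mu} \gamma^{\alpha} \gamma^{\beta} \right) \tag{1248} $$
one finds
$$ [ \sigma^{\alpha\beta} , \gamma^{\mu} ] = i \left( \gamma^{\alpha} \gamma^{\beta} \gamma^{\mu} + \gamma^{\alpha} \gamma^{\mu} \gamma^{\beta} - 2 \, g^{\mu,\alpha} \gamma^{\beta} \right) \tag{1249} $$
Likewise, the γ matrices in the first term can also be anti-commuted, leading to
$$ \begin{aligned}
[ \sigma^{\alpha\beta} , \gamma^{\mu} ] &= i \left( 2 \, g^{\mu,\beta} \gamma^{\alpha} - \gamma^{\alpha} \gamma^{\mu} \gamma^{\beta} + \gamma^{\alpha} \gamma^{\mu} \gamma^{\beta} - 2 \, g^{\mu,\alpha} \gamma^{\beta} \right) \\
&= 2 i \left( g^{\mu,\beta} \gamma^{\alpha} - g^{\mu,\alpha} \gamma^{\beta} \right)
\end{aligned} \tag{1250} $$
since the middle pair of terms cancel. Hence, one has proved that
$$ \frac{i}{2} \, [ \sigma^{\alpha\beta} , \gamma^{\mu} ] = g^{\mu,\alpha} \gamma^{\beta} - g^{\mu,\beta} \gamma^{\alpha} \tag{1251} $$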
which completes the identification of the solution of the equation for ω α,β .
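The identity of eqn(1251) holds for every combination of the three indices, which makes it a natural candidate for a brute-force numerical check. The sketch below assumes NumPy and the standard (Dirac) representation of the γ matrices; the identity itself is representation independent, so the choice is only for convenience.

```python
import numpy as np

pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Standard (Dirac) representation; an assumption made for this check
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
g = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric g^{mu,nu}


def sigma_tensor(a, b):
    """sigma^{alpha,beta} = (i/2) [gamma^alpha, gamma^beta]."""
    return 0.5j * (gamma[a] @ gamma[b] - gamma[b] @ gamma[a])


def identity_1251_holds(a, b, mu):
    """Check (i/2)[sigma^{ab}, gamma^mu] = g^{mu,a} gamma^b - g^{mu,b} gamma^a."""
    s = sigma_tensor(a, b)
    lhs = 0.5j * (s @ gamma[mu] - gamma[mu] @ s)
    rhs = g[mu, a] * gamma[b] - g[mu, b] * gamma[a]
    return np.allclose(lhs, rhs)


all_cases_hold = all(
    identity_1251_holds(a, b, mu)
    for a in range(4) for b in range(4) for mu in range(4)
)
print(all_cases_hold)
```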
Therefore, since R̂(Λ) exists, it has been shown that the form of the Dirac
equation is maintained in the primed reference frame and that there is a one
to one correspondence between the solutions of the primed and unprimed frames.
——————————————————————————————————
It remains to be shown that ψ and ψ′ describe the properties of the same
physical system, albeit in two different frames of reference. That is, the
properties associated with ψ must be related to the properties of ψ′ and the relation
can be obtained by considering the Lorentz transformation. The most complete
physical descriptions of a unique quantum mechanical state are related to the
probability density, which can only be inferred from an infinite set of position
measurements. The probability density should behave similarly to the time-like
component of a four-vector as was seen from the consideration of the continuity
equation. Therefore, it follows that if the four-vector probability currents of ψ
and ψ′ are related via a Lorentz transformation, then the two spinors describe
the same physical state of the system.
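$$ \begin{aligned}
j^{\mu \prime} &= c \, \psi'^{\dagger} \, \gamma^{(0)} \gamma^{\mu} \, \psi' \\
&= c \, \psi^{\dagger} \, \hat{R}^{\dagger} \gamma^{(0)} \gamma^{\mu} \hat{R} \, \psi
\end{aligned} \tag{1253} $$
The identity
$$ \hat{R}^{-1} = \gamma^{(0)} \, \hat{R}^{\dagger} \, \gamma^{(0)} \tag{1254} $$
will be proved below, so on using this identity together with
$$ \begin{aligned}
j^{\mu \prime} &= c \, \psi^{\dagger} \gamma^{(0)} \, \Lambda^{\mu}{}_{\nu} \, \gamma^{\nu} \psi \\
&= \Lambda^{\mu}{}_{\nu} \; c \, \overline{\psi}^{\dagger} \gamma^{\nu} \psi \\
&= \Lambda^{\mu}{}_{\nu} \, j^{\nu}
\end{aligned} \tag{1258} $$
Hence, the probability current densities $j^{\mu \prime}$ and $j^{\mu}$ found in the two reference
frames are simply related via the Lorentz transformation. Therefore, the Dirac
equation gives consistent results, no matter what inertial frame of reference is
used.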
——————————————————————————————————
Proof of Identity
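The identity
$$ \hat{R}^{-1}(\Lambda) = \gamma^{(0)} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \tag{1259} $$
can be proved by starting from the expression for R̂ appropriate
for an infinitesimal transformation given by
$$ \hat{R} = \hat{I} + \frac{1}{8} \, \epsilon_{\mu\nu} \, [ \gamma^{\mu} , \gamma^{\nu} ] + \ldots \tag{1260} $$
Hence, the Hermitean conjugate is given by
$$ \begin{aligned}
\hat{R}^{\dagger} &= \hat{I} + \frac{1}{8} \, \epsilon_{\mu\nu} \, [ \gamma^{\nu \dagger} , \gamma^{\mu \dagger} ] + \ldots \\
&= \hat{I} - \frac{1}{8} \, \epsilon_{\mu\nu} \, [ \gamma^{\mu \dagger} , \gamma^{\nu \dagger} ] + \ldots
\end{aligned} \tag{1261} $$
since the Hermitean conjugate of a product is the product of the Hermitean
conjugate of the factors taken in opposite order. On forming the product
$\gamma^{(0)} \hat{R}^{\dagger} \gamma^{(0)}$ and inserting a factor of
$$ \gamma^{(0)} \gamma^{(0)} = \hat{I} \tag{1262} $$
between the pairs of four by four γ matrices in the commutator and noting that
$$ \gamma^{(0)} \, \gamma^{\mu \dagger} \, \gamma^{(0)} = \gamma^{\mu} \tag{1263} $$
one finds that
$$ \begin{aligned}
\gamma^{(0)} \, \hat{R}^{\dagger} \, \gamma^{(0)} &= \hat{I} - \frac{1}{8} \, \epsilon_{\mu\nu} \, [ \gamma^{\mu} , \gamma^{\nu} ] + \ldots \\
&= \hat{R}^{-1}
\end{aligned} \tag{1264} $$
The last line follows from the observation that on combining the expression for
R̂ with the expression for $\gamma^{(0)} \hat{R}^{\dagger} \gamma^{(0)}$, the terms of order ε cancel. Hence, up to
terms of order ε², the product $\gamma^{(0)} \hat{R}^{\dagger} \gamma^{(0)}$ coincides with $\hat{R}^{-1}$. This concludes
the discussion of the desired identity.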
Finite Rotations
Figure 45: A passive rotation of the coordinate system through an angle ϕ about
the ê3 -axis.
$$ \Lambda^{\mu}{}_{\nu} = \delta^{\mu}{}_{\nu} + \epsilon^{\mu}{}_{\nu} + \ldots \tag{1268} $$
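since $\sigma^{\mu,\nu}$ is anti-symmetric. On compounding N infinitesimal transformations
about the same axis R̂(δϕ) using their exponential form, and defining N δϕ =
ϕ, one obtains the finite rotation R̂(ϕ)
$$ \begin{aligned}
\hat{R}(\varphi) &= \left( \hat{R}(\delta\varphi) \right)^{N} \\
&= \exp\left[ i \, N \frac{\delta\varphi}{2} \, \sigma^{1,2} \right] \\
&= \exp\left[ i \, \frac{\varphi}{2} \, \sigma^{1,2} \right]
\end{aligned} \tag{1272} $$
Therefore, for a finite rotation, the transformation matrix is given by
$$ \hat{R}(\varphi) = \exp\left[ i \, \frac{\varphi}{2} \, \sigma^{1,2} \right] \tag{1273} $$
which can be expressed in terms of even and odd-powers of $\sigma^{1,2}$ via
$$ \hat{R}(\varphi) = \cos\left( \frac{\varphi}{2} \, \sigma^{1,2} \right) + i \sin\left( \frac{\varphi}{2} \, \sigma^{1,2} \right) \tag{1274} $$
but since
$$ \sigma^{1,2} = \frac{i}{2} \, [ \gamma^{(1)} , \gamma^{(2)} ] = \hat{\sigma}^{(3)} \tag{1275} $$
the transformation can be expressed as
$$ \hat{R}(\varphi) = \cos\left( \frac{\varphi}{2} \, \hat{\sigma}^{(3)} \right) + i \sin\left( \frac{\varphi}{2} \, \hat{\sigma}^{(3)} \right) \tag{1276} $$
The above expression can be simplified by expanding the trigonometric functions
in series of ϕ and then using the property of the $\hat{\sigma}^{(j)}$ matrices
$$ ( \hat{\sigma}^{(3)} )^2 = \hat{I} \tag{1277} $$
Since the repeated use of the above identity leads to
$$ ( \hat{\sigma}^{(3)} )^{2n} = \hat{I} , \qquad ( \hat{\sigma}^{(3)} )^{2n+1} = \hat{\sigma}^{(3)} \tag{1278} $$
the series simplify and can be re-summed leading to
$$ \hat{R}(\varphi) = \cos\frac{\varphi}{2} \; \hat{I} + i \sin\frac{\varphi}{2} \; \hat{\sigma}^{(3)} \tag{1279} $$
Therefore, under a finite rotation through angle ϕ around the unit vector ê, a
spinor is rotated by the operator
$$ \hat{R}(\varphi) = \cos\frac{\varphi}{2} \; \hat{I} + i \sin\frac{\varphi}{2} \; \hat{\underline{e}} \cdot \hat{\underline{\sigma}} \tag{1280} $$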
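From the above equation, due to the presence of the half-angle, one notes that
a rotation through ϕ and a rotation through ϕ + 2π are not equivalent, since
$$ \hat{R}(\varphi + 2\pi) = - \hat{R}(\varphi) \tag{1281} $$
which changes the sign of the spinor. For spin one-half electrons, it is necessary
to rotate through 4π to return to the same state
$$ \hat{R}(\varphi + 4\pi) = \hat{R}(\varphi) \tag{1282} $$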
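The half-angle behaviour of the spinor rotation operator, and the way the finite rotation emerges from compounding many infinitesimal ones, can both be illustrated numerically. This is a sketch under stated assumptions: NumPy is available, the rotation is about the ê3-axis as in eqn(1279), σ̂^(3) is taken in the standard representation as diag(1, −1, 1, −1), and the values of ϕ and N are arbitrary.

```python
import numpy as np

I4 = np.eye(4, dtype=complex)
# Block-diagonal matrix diag(sigma^(3), sigma^(3)) in the standard representation (assumption)
sigma3_hat = np.diag([1, -1, 1, -1]).astype(complex)


def rotation(phi):
    """Finite spinor rotation about the e3-axis, eqn(1279)."""
    return np.cos(phi / 2) * I4 + 1j * np.sin(phi / 2) * sigma3_hat


phi = 1.0
N = 100_000

# Compound N first-order infinitesimal rotations through phi / N
step = I4 + 1j * (phi / N) / 2 * sigma3_hat
compounded = np.linalg.matrix_power(step, N)

compounding_error = np.max(np.abs(compounded - rotation(phi)))
sign_flip_after_2pi = np.allclose(rotation(phi + 2 * np.pi), -rotation(phi))
identity_after_4pi = np.allclose(rotation(4 * np.pi), I4)

print(compounding_error, sign_flip_after_2pi, identity_after_4pi)
```

The residual `compounding_error` shrinks as N grows, which mirrors the limit taken in eqn(1272).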
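A finite Lorentz boost by velocity v along the ê1 direction can be expressed
in terms of the transformation
$$ \Lambda = \begin{pmatrix} \cosh\chi & - \sinh\chi & 0 & 0 \\ - \sinh\chi & \cosh\chi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1283} $$
$$ \Lambda^{\mu}{}_{\nu} = \delta^{\mu}{}_{\nu} + \epsilon^{\mu}{}_{\nu} + \ldots \tag{1287} $$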
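The infinitesimal transformation of a Dirac spinor was determined to be given
by
$$ \hat{R}(\delta\chi) = \hat{I} - \frac{i}{4} \, \epsilon_{\mu\nu} \, \sigma^{\mu\nu} + \ldots \tag{1289} $$
Hence, for an infinitesimal Lorentz boost one has
$$ \begin{aligned}
\hat{R}(\delta\chi) &= \hat{I} + \frac{i}{4} \, 2 \, \delta\chi \, \sigma^{0,1} + \ldots \\
&= \exp\left[ i \, \frac{\delta\chi}{2} \, \sigma^{0,1} \right]
\end{aligned} \tag{1290} $$
but since
$$ \sigma^{0,1} = \frac{i}{2} \, [ \gamma^{(0)} , \gamma^{(1)} ] = i \, \alpha^{(1)} \tag{1294} $$
$$ ( \alpha^{(1)} )^2 = \hat{I} \tag{1296} $$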
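Since the repeated use of the above identity leads to
$$ ( \alpha^{(1)} )^{2n} = \hat{I} , \qquad ( \alpha^{(1)} )^{2n+1} = \alpha^{(1)} \tag{1297} $$
the series simplify and can be re-summed leading to
$$ \begin{aligned}
\hat{R}(\chi) &= \cosh\left( - \frac{\chi}{2} \right) \hat{I} + \sinh\left( - \frac{\chi}{2} \right) \alpha^{(1)} \\
&= \cosh\frac{\chi}{2} \; \hat{I} - \sinh\frac{\chi}{2} \; \alpha^{(1)}
\end{aligned} \tag{1298} $$
Therefore, under a finite boost through velocity v, a spinor is transformed by the
operator
$$ \begin{aligned}
\hat{R}(\chi) &= \cosh\frac{\chi}{2} \left[ \hat{I} - \tanh\frac{\chi}{2} \; \alpha^{(1)} \right] \\
&= \cosh\frac{\chi}{2} \left[ \hat{I} - \tanh\frac{\chi}{2} \; \hat{\underline{v}} \cdot \underline{\alpha} \right]
\end{aligned} \tag{1299} $$
where the rapidity χ is determined by
$$ \tanh\chi = \frac{v}{c} \tag{1300} $$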
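The boost operator of eqn(1298) can be tested against the Lorentz matrix of eqn(1283). The sketch below assumes NumPy, the standard (Dirac) representation, the convention γ^µ′ = γ^µ, and an arbitrary speed v/c = 0.6. It checks the covariance condition in the equivalent form R̂⁻¹ γ^µ R̂ = Λ^µ_ν γ^ν, the identity R̂⁻¹ = γ^(0) R̂† γ^(0), and the fact that the boost operator is not unitary.

```python
import numpy as np

pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Standard (Dirac) representation; an assumption made for this check
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
alpha1 = gamma[0] @ gamma[1]          # alpha^(1) = gamma^(0) gamma^(1)


def boost_spinor(chi):
    """Spinor transformation for a boost along e1, eqn(1298)."""
    return np.cosh(chi / 2) * I4 - np.sinh(chi / 2) * alpha1


def boost_vector(chi):
    """Four-vector transformation for the same boost, eqn(1283)."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cosh(chi)
    L[0, 1] = L[1, 0] = -np.sinh(chi)
    return L


chi = np.arctanh(0.6)                 # rapidity for v/c = 0.6 (illustrative)
R = boost_spinor(chi)
R_inv = np.linalg.inv(R)
L = boost_vector(chi)

covariant = all(
    np.allclose(R_inv @ gamma[mu] @ R, sum(L[mu, nu] * gamma[nu] for nu in range(4)))
    for mu in range(4)
)
adjoint_identity = np.allclose(R_inv, gamma[0] @ R.conj().T @ gamma[0])
unitary = np.allclose(R.conj().T @ R, I4)

print(covariant, adjoint_identity, unitary)
```

The boost operator is Hermitean rather than unitary, which is why the adjoint spinor, and not the Hermitean conjugate alone, is needed to build covariant bilinears.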
Exercise:
Show that the helicity eigenvalue of a free Dirac particle can be reversed by
going to a new reference frame which is “overtaking” the particle.
all other products can be reduced to the above products. The order of the
matrices is irrelevant, since the different matrices anti-commute. Also, since
$( \gamma^{\mu} )^2 = \pm \hat{I}$, one only needs to consider the products in which each matrix
enters at most one time. Hence, since each of the four matrices either appears as
a factor or does not, there are only 2⁴ = 16 such matrices. These sixteen $\Gamma_i$ matrices
can be constructed from $\hat{I}$, $\gamma^{\mu}$, $\sigma^{\mu,\nu} = i \gamma^{\mu} \gamma^{\nu}$, $\gamma^{(4)}$ and $\gamma^{(4)} \gamma^{\mu}$, by choosing
appropriate phase factors.
The set of matrices $\Gamma_i$ formed from the set of $\gamma^{\mu}$ is closed under
multiplication, so
$$ \Gamma_i \, \Gamma_j = a_{i,j} \, \Gamma_k \tag{1302} $$
where $a_{i,j}^4 = 1$. The sixteen $\Gamma_i$ matrices can be chosen as the product of the
members of the above set multiplied by a phase factor taken from the set ± 1
and ± i, such that the condition
$$ ( \Gamma_i )^2 = \hat{I} \tag{1303} $$
is satisfied. Furthermore,
$$ \Gamma_i \, \Gamma_j = \hat{I} \quad \text{only if} \quad i = j \tag{1304} $$
Also, by anti-commuting the factors of $\gamma^{\mu}$ in the products, one can show that
$$ \Gamma_i \, \Gamma_j = \pm \, \Gamma_j \, \Gamma_i \tag{1305} $$
Table 6: The Set of the Sixteen Matrices Γn with their Phase Factors (j > i)
Iˆ
γ (0) i γ (i)
Specifically, for a fixed $\Gamma_i$ not equal to the identity, one can always find a
specific $\Gamma_k$ such that
$$ \Gamma_i \, \Gamma_k = - \Gamma_k \, \Gamma_i \tag{1306} $$
which on multiplying by $\Gamma_k$ results in
$$ \Gamma_k \, \Gamma_i \, \Gamma_k = - \Gamma_i \tag{1307} $$
Traceless Matrices
The above facts can be used to show that the $\Gamma_i$ matrices, other than the
identity, are traceless. This can be proved by considering
$$ \begin{aligned}
- \mathrm{Trace} \, \Gamma_i = \mathrm{Trace} ( - \Gamma_i ) &= \mathrm{Trace} ( \Gamma_k \Gamma_i \Gamma_k ) \\
&= \mathrm{Trace} ( \Gamma_i \Gamma_k \Gamma_k ) = \mathrm{Trace} \, \Gamma_i
\end{aligned} \tag{1308} $$
in which the existence of a specific $\Gamma_k$ which anti-commutes with $\Gamma_i$ has been
used, and where the cyclic invariance of the trace has been used as has been
$( \Gamma_k )^2 = \hat{I}$. Hence, all the $\Gamma_i$ matrices, other than the identity, are traceless
$$ \mathrm{Trace} \, \Gamma_i = 0 \tag{1309} $$
Linear Independence
other than $C_i \equiv 0$ for all i. If the $\Gamma_i$ are linearly independent, the only solution
of this equation is
$$ C_i \equiv 0 \quad \text{for all } i \tag{1311} $$
This can be proved by multiplying eqn(1310) by any one $\Gamma_j$ in the set which
leads to
$$ \begin{aligned}
C_j \, \hat{I} + \sum_{i \neq j} C_i \, \Gamma_i \Gamma_j &= 0 \\
C_j \, \hat{I} + \sum_{i \neq j} C_i \, a_{i,j} \, \Gamma_k &= 0
\end{aligned} \tag{1312} $$
since the matrices $\Gamma_k$ are traceless. Hence, all the $C_j$ are zero, so the matrices
are linearly independent.
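The counting, tracelessness and linear independence of the sixteen Γ matrices can be illustrated by constructing them explicitly. The sketch below assumes NumPy and the standard (Dirac) representation; the phase convention used here (multiply by i whenever a product squares to −Î) is one possible choice consistent with eqn(1303), not necessarily the one tabulated in Table 6.

```python
from itertools import combinations

import numpy as np

pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Standard (Dirac) representation; an assumption made for this check
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]

# Each gamma matrix either appears in a product or does not: 2^4 = 16 products
basis = []
for size in range(5):
    for subset in combinations(range(4), size):
        product = I4.copy()
        for mu in subset:
            product = product @ gamma[mu]
        if np.allclose(product @ product, -I4):
            product = 1j * product    # phase chosen so that the square is +I
        basis.append(product)

squares_to_identity = all(np.allclose(G @ G, I4) for G in basis)
traceless = all(abs(np.trace(G)) < 1e-12 for G in basis[1:])   # basis[0] is the identity
rank = np.linalg.matrix_rank(np.array([G.flatten() for G in basis]))

print(len(basis), squares_to_identity, traceless, rank)
```

A rank of sixteen for the flattened matrices is equivalent to the statement that only the trivial set of coefficients solves the linear dependence equation.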
Uniqueness of Expansions
The existence of sixteen linearly independent matrices requires that the
matrices can be represented in a space of N × N matrices, where N ≥ 4. Any
matrix A in the space of 4 × 4 matrices can be uniquely expressed in terms of
the basis set of the $\Gamma_i$. For example, if
$$ A = \sum_{i} C_i \, \Gamma_i \tag{1314} $$
Schur’s Lemma
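The uniqueness of the expansion can be used to show that the product of $\Gamma_i$,
for fixed i, with the members of the set of $\Gamma_j$ leads to a different $\Gamma_k$ for each j. This can be
shown by assuming that there exist two different (linearly independent) values
$\Gamma_j$ and $\Gamma_{j'}$ which lead to the same $\Gamma_k$
$$ \Gamma_i \, \Gamma_j = a_{i,j} \, \Gamma_k , \qquad \Gamma_i \, \Gamma_{j'} = a_{i,j'} \, \Gamma_k \tag{1317} $$
$$ \Gamma_j = a_{i,j} \, \Gamma_i \, \Gamma_k , \qquad \Gamma_{j'} = a_{i,j'} \, \Gamma_i \, \Gamma_k \tag{1318} $$
One can also prove Schur’s lemma. Schur’s Lemma states that if a matrix A
commutes with all the $\gamma^{\mu}$’s, then A is a multiple of the identity. If A commutes
with the $\gamma^{\mu}$’s, it also commutes with all the $\Gamma_i$’s. Schur’s lemma follows from
the expansion of A as
$$ A = C_i \, \Gamma_i + \sum_{j \neq i} C_j \, \Gamma_j \tag{1320} $$
$$ \Gamma_k \, \Gamma_i \, \Gamma_k = - \Gamma_i \tag{1321} $$
Since it has been assumed that A commutes with all the $\Gamma_i$, for the specific $\Gamma_k$
one has
$$ \begin{aligned}
A &= \Gamma_k \, A \, \Gamma_k \\
&= C_i \, \Gamma_k \Gamma_i \Gamma_k + \sum_{j \neq i} C_j \, \Gamma_k \Gamma_j \Gamma_k \\
&= - C_i \, \Gamma_i + \sum_{j \neq i} C_j \, \Gamma_k \Gamma_j \Gamma_k
\end{aligned} \tag{1322} $$
$$ \Gamma_k \, \Gamma_j \, \Gamma_k = ( \pm 1 )_{j,k} \, \Gamma_j \tag{1323} $$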
the above equation reduces to
$$ A = - C_i \, \Gamma_i + \sum_{j \neq i} C_j \, ( \pm 1 )_{j,k} \, \Gamma_j \tag{1324} $$
Since the expansion is unique, the coefficients of the $\Gamma_j$ are unique and in
particular
$$ C_i = - C_i \tag{1326} $$
so $C_i = 0$ for any i such that $\Gamma_i \neq \hat{I}$. Hence, if A commutes with all the $\Gamma_i$
then A must be proportional to the identity.
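$$ \gamma^{\mu \prime} = \hat{S} \, \gamma^{\mu} \, \hat{S}^{-1} \tag{1327} $$
The theorem requires that one constructs a set of sixteen matrices $\Gamma_i'$ from
the $\gamma^{\mu \prime}$ following the same rules with which the $\Gamma_i$ were constructed from $\gamma^{\mu}$.
Then one can describe the non-singular matrix by
$$ \hat{S} = \sum_{i} \Gamma_i' \, F \, \Gamma_i \tag{1328} $$
since
$$ \Gamma_k^2 = \hat{I} \tag{1331} $$
On pre-multiplying eqn(1330) by $\Gamma_j \Gamma_i$, one obtains
$$ \Gamma_j \, \Gamma_i \, \Gamma_i \, \Gamma_j \, \Gamma_i \, \Gamma_j = a_{i,j}^2 \, \Gamma_j \, \Gamma_i \tag{1332} $$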
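but since
$$ \Gamma_j \, \Gamma_i \, \Gamma_i \, \Gamma_j = \hat{I} \tag{1333} $$
eqn(1332) reduces to
$$ \Gamma_i \, \Gamma_j = a_{i,j}^2 \, \Gamma_j \, \Gamma_i \tag{1334} $$
However, as
$$ \Gamma_i \, \Gamma_j = a_{i,j} \, \Gamma_k \tag{1335} $$
the equation becomes
$$ \Gamma_j \, \Gamma_i = a_{i,j}^3 \, \Gamma_k \tag{1337} $$
The $\Gamma_i'$ matrices are constructed so that they satisfy similar relations to the $\Gamma_i$.
In particular, the $\Gamma_i'$ matrices satisfy
$$ \Gamma_i' \, \Gamma_j' = a_{i,j} \, \Gamma_k' \tag{1338} $$
However, since i is fixed and j is being summed over, every $\Gamma_k$ appears once
and only once in the product. Therefore, the sum can be performed over k
$$ \Gamma_i' \, \hat{S} \, \Gamma_i = \sum_{k} \Gamma_k' \, F \, \Gamma_k = \hat{S} \tag{1344} $$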
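If one can show that the matrix Ŝ has an inverse, then on post-multiplying by
$\hat{S}^{-1}$, one finds
$$ \Gamma_i' \, \hat{S} \, \Gamma_i \, \hat{S}^{-1} = \hat{I} \tag{1345} $$
Furthermore, since $\Gamma_i'$ is its own inverse, then on pre-multiplying by $\Gamma_i'$ the
equation reduces to
$$ \hat{S} \, \Gamma_i \, \hat{S}^{-1} = \Gamma_i' \tag{1346} $$
This is a generalization of the statement of the theorem. As a particular case,
one may choose $\Gamma_i = \gamma^{\mu}$ in which case the theorem becomes
$$ \gamma^{\mu \prime} = \hat{S} \, \gamma^{\mu} \, \hat{S}^{-1} \tag{1347} $$
which was the initial statement of Pauli’s fundamental theorem made above.
$$ \hat{S} = \Gamma_i' \, \hat{S} \, \Gamma_i \tag{1349} $$
$$ \begin{aligned}
\hat{S}' \, \hat{S} &= \Gamma_i \, \hat{S}' \, \Gamma_i' \, \Gamma_i' \, \hat{S} \, \Gamma_i \\
&= \Gamma_i \, \hat{S}' \, \hat{S} \, \Gamma_i
\end{aligned} \tag{1351} $$
Hence, Ŝ′ Ŝ commutes with all the matrices in the space and, by Schur’s Lemma,
it must be a multiple of the identity
$$ \hat{S}' \, \hat{S} = \kappa \, \hat{I} \tag{1352} $$
$$ \hat{S}' \, \hat{S} = \hat{I} \tag{1353} $$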
11.5.2 Polarization in Mott Scattering
When evaluated in the Born Approximation, Mott scattering does not result
in the polarization of an unpolarized beam. However, when higher-order
corrections are included, Mott scattering produces a partial polarization of the
scattered electrons93. If the incident beam is polarized by having a definite
helicity, it is expected that the helicity may change as a result of the scattering.
Figure 47: Helicity non-flip and helicity flip Mott scattering of an electron with
helicity +1. The scattering angle is $\theta_{p'}$.
The probability of non-helicity flip scattering and helicity flip scattering can
be evaluated using the Born approximation. The initial beam will be considered
as having a momentum p parallel to the ê3 -axis and as having a helicity of +1.
The initial spinor is proportional to
$$ \psi_{p,+}(r) = \sqrt{ \frac{ E_p + m c^2 }{ 2 E_p V } } \begin{pmatrix} \chi_{+} \\ \frac{ c \, p }{ E_p + m c^2 } \chi_{+} \end{pmatrix} \exp\left[ i \, \frac{ \underline{p} \cdot \underline{r} }{ \hbar } \right] \tag{1354} $$
The electrons are assumed to be elastically scattered to a state with final
momentum p′. The scattering is defined to occur through an angle $\theta_{p'}$ in the z − x
plane. The final state is composed of a linear-superposition of states with
different helicities. Since the final state helicities are specified relative to the final
momentum, the final state helicity eigenstates can be obtained by rotating the
initial state helicity eigenstates through an angle $\theta_{p'}$ around the ê2 -axis
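where the rotation operator is given by
$$ \hat{R}(\theta_{p'}) = \cos\frac{\theta_{p'}}{2} \; \hat{I} - i \sin\frac{\theta_{p'}}{2} \; \hat{\sigma}^{(2)} \tag{1356} $$
which does not mix the upper and lower two-component spinors. Therefore, one
finds that the final state two-component spinors representing helicity eigenstates
are given by
$$ \chi'_{+} = \left( \cos\frac{\theta_{p'}}{2} \; I - i \sin\frac{\theta_{p'}}{2} \; \sigma^{(2)} \right) \chi_{+} = \begin{pmatrix} \cos\frac{\theta_{p'}}{2} \\ \sin\frac{\theta_{p'}}{2} \end{pmatrix} \tag{1357} $$
and
$$ \chi'_{-} = \left( \cos\frac{\theta_{p'}}{2} \; I - i \sin\frac{\theta_{p'}}{2} \; \sigma^{(2)} \right) \chi_{-} = \begin{pmatrix} - \sin\frac{\theta_{p'}}{2} \\ \cos\frac{\theta_{p'}}{2} \end{pmatrix} \tag{1358} $$
which is evaluated as
$$ \begin{aligned}
& \frac{ 4 \pi Z e^2 }{ V \, | \underline{p} - \underline{p}' |^2 } \left( 1 + \Lambda' \, \frac{ c^2 p^2 }{ ( E_p + m c^2 )^2 } \right) \frac{ E_p + m c^2 }{ 2 E_p } \; \chi'^{\dagger}_{\Lambda'} \, \chi_{+} \\
&= \frac{ 4 \pi Z e^2 }{ V \, | \underline{p} - \underline{p}' |^2 } \; \frac{ ( E_p + m c^2 )^2 + \Lambda' \, c^2 p^2 }{ 2 E_p \, ( E_p + m c^2 ) } \; \chi'^{\dagger}_{\Lambda'} \, \chi_{+}
\end{aligned} \tag{1361} $$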
It is seen that the probability for helicity flip scattering vanishes in the
ultra-relativistic limit. Also, in the non-relativistic limit, a static charge cannot flip
the spin. Therefore, in the non-relativistic limit, if one expresses the spin
eigenstate as a linear superposition of the final helicity eigenstates
$$ \chi_{+} = \cos\frac{\theta_{p'}}{2} \; \chi'_{+} - \sin\frac{\theta_{p'}}{2} \; \chi'_{-} \tag{1364} $$
one is led to expect that the relative probability of helicity flip to non-helicity
flip will be governed by a factor of
$$ \tan^2 \frac{\theta_{p'}}{2} \tag{1365} $$
which agrees with the above matrix elements evaluated in the non-relativistic
limit. The cross-section for non-flip scattering is determined as
$$ \left( \frac{ d\sigma }{ d\Omega' } \right)_{+,+} = \left( \frac{ 2 Z e^2 E_p }{ 4 c^2 p^2 \sin^2 \frac{\theta_{p'}}{2} } \right)^2 \cos^2 \frac{\theta_{p'}}{2} \tag{1366} $$
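The helicity-flip argument above can be made concrete by evaluating the spinor overlaps of eqn(1357) and eqn(1358) together with the helicity-dependent factor of eqn(1361). The following sketch assumes NumPy and units with c = 1; the function names are illustrative. Under these assumptions the ratio of flip to non-flip probabilities works out to (m c²/E)² tan²(θ/2), which tends to tan²(θ/2) non-relativistically and to zero in the ultra-relativistic limit.

```python
import numpy as np

sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
chi_plus = np.array([1, 0], dtype=complex)
chi_minus = np.array([0, 1], dtype=complex)


def final_helicity_spinors(theta):
    """Rotate chi_+ and chi_- through theta about the e2-axis, eqn(1357) and eqn(1358)."""
    rotation = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * sigma2
    return rotation @ chi_plus, rotation @ chi_minus


def kinematic_factor(E, m, p, final_helicity, c=1.0):
    """Helicity-dependent factor multiplying the spinor overlap in eqn(1361)."""
    return ((E + m * c**2) ** 2 + final_helicity * (c * p) ** 2) / (2 * E * (E + m * c**2))


def flip_to_non_flip_ratio(m, p, theta, c=1.0):
    """Ratio of helicity-flip to non-flip probabilities for initial helicity +1."""
    E = np.sqrt((m * c**2) ** 2 + (c * p) ** 2)
    chi_p, chi_m = final_helicity_spinors(theta)
    non_flip = kinematic_factor(E, m, p, +1, c) * np.vdot(chi_p, chi_plus)
    flip = kinematic_factor(E, m, p, -1, c) * np.vdot(chi_m, chi_plus)
    return abs(flip) ** 2 / abs(non_flip) ** 2


theta = 1.0
for p in (1e-4, 0.5, 1e4):            # non-relativistic, intermediate, ultra-relativistic
    print(p, flip_to_non_flip_ratio(1.0, p, theta))
```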
11.6 The Non-Relativistic Limit
The non-relativistic limit of the Dirac equation should reduce to the Schrödinger
equation. As shall be seen, the appropriate Schrödinger equation for a particle
with positive-energy is modified due to the existence of spin. The non-relativistic
limit is described by the Pauli equation94 .
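The equation can be written in 2 × 2 block form, if the wave function
is expressed in the form of two two-component spinors. We shall mainly focus
on the positive-energy solutions and recognize that, in the non-relativistic limit,
the largest component of the wave function is $\phi^{A}$ and the largest term in the
energy is the rest mass energy m c². Therefore, the spinor wave function will
be expressed as
$$ \psi = \begin{pmatrix} \phi^{A} \\ \phi^{B} \end{pmatrix} \exp\left[ - i \, \frac{ m c^2 }{ \hbar } \, t \right] \tag{1371} $$
The above form explicitly displays the rest-mass energy of the positive-energy
solution of the Dirac equation. Hence, the Dirac equation takes the form
$$ \left( i \hbar \frac{\partial}{\partial t} - q A^{0} \right) \begin{pmatrix} \phi^{A} \\ \phi^{B} \end{pmatrix} = \begin{pmatrix} c \, \underline{\sigma} \cdot ( \hat{\underline{p}} - \frac{q}{c} \underline{A} ) \, \phi^{B} \\ c \, \underline{\sigma} \cdot ( \hat{\underline{p}} - \frac{q}{c} \underline{A} ) \, \phi^{A} \end{pmatrix} - 2 m c^2 \begin{pmatrix} 0 \\ \phi^{B} \end{pmatrix} \tag{1372} $$
where the rest mass has been eliminated from the equation for the large
component $\phi^{A}$ of the positive-energy solution. Since the kinetic energy and the
potential energy are assumed to be smaller than the rest mass energy, the
equation for the small component
$$ \left( i \hbar \frac{\partial}{\partial t} - q A^{0} \right) \phi^{B} = c \, \underline{\sigma} \cdot \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right) \phi^{A} - 2 m c^2 \, \phi^{B} \tag{1373} $$
can be approximated as
$$ \phi^{B} = \frac{1}{2 m c} \; \underline{\sigma} \cdot \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right) \phi^{A} \tag{1374} $$
Substituting the expression for the small component into the equation for the
large component, hence eliminating $\phi^{B}$, one finds the equation
$$ \left( i \hbar \frac{\partial}{\partial t} - q A^{0} \right) \phi^{A} = \frac{1}{2 m} \left( \underline{\sigma} \cdot \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right) \right)^2 \phi^{A} \tag{1375} $$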
94 W. Pauli, Z. Phys. 44, 601 (1927).
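which is the Pauli equation. The equation can be simplified by expanding the
terms involving the Pauli spin matrices. The Pauli identity can be used to
obtain
$$ \begin{aligned} \left[ \boldsymbol{\sigma} \cdot \left( \hat{\mathbf{p}} - \frac{q}{c} \mathbf{A} \right) \right]^2 &= I \left( \hat{\mathbf{p}} - \frac{q}{c} \mathbf{A} \right)^2 + i \, \boldsymbol{\sigma} \cdot \left[ \left( \hat{\mathbf{p}} - \frac{q}{c} \mathbf{A} \right) \wedge \left( \hat{\mathbf{p}} - \frac{q}{c} \mathbf{A} \right) \right] \\ &= I \left( \hat{\mathbf{p}} - \frac{q}{c} \mathbf{A} \right)^2 - \frac{q \hbar}{c} \, \boldsymbol{\sigma} \cdot ( \nabla \wedge \mathbf{A} ) \end{aligned} \tag{1376} $$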
where the last term originates from the non-commutativity of the components
of p̂ and A. Since the magnetic field B is given by
B = ∇ ∧ A (1377)
The Pauli equation95 is the non-relativistic limit of the Dirac equation. It rep-
resents the Schrödinger equation for a charged particle with spin one-half. The
two components of the spinor φA in the Pauli equation represent the internal
spin of the electron. The last term represents the anomalous Zeeman interaction
between the magnetic field and the electron’s spin.
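The Pauli identity used above can be checked numerically. The following is a minimal sketch, assuming ordinary commuting (c-number) vectors a and b, for which the identity reads (σ.a)(σ.b) = (a.b) I + i σ.(a ∧ b). The helper names are illustrative and not part of the text. For commuting vectors the cross term vanishes when a = b; the spin term in eqn(1376) survives only because the components of p̂ and A do not commute.

```python
import random

# Pauli matrices as nested lists of (complex) numbers.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector v."""
    return [[v[0] * SX[i][j] + v[1] * SY[i][j] + v[2] * SZ[i][j]
             for j in range(2)] for i in range(2)]


def cross(a, b):
    """Ordinary vector product a ^ b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def pauli_identity_residual(a, b):
    """Largest entry of (sigma.a)(sigma.b) - [(a.b) I + i sigma.(a ^ b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    a_dot_b = sum(x * y for x, y in zip(a, b))
    s_cross = sigma_dot(cross(a, b))
    rhs = [[a_dot_b * I2[i][j] + 1j * s_cross[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


random.seed(0)
a = [random.uniform(-1, 1) for _ in range(3)]
b = [random.uniform(-1, 1) for _ in range(3)]
residual = pauli_identity_residual(a, b)
print(residual)  # expected to be at the level of floating-point rounding
```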
The other contribution to the Zeeman interaction originates with the elec-
tron’s orbital angular momentum L. The ordinary Zeeman interaction occurs
between the constant magnetic field B and the orbital angular momentum and
originates from the gauge-invariant term in the Hamiltonian
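$$ \frac{1}{2 m} \left( \hat{\mathbf{p}} - \frac{q}{c} \mathbf{A} \right)^2 = \frac{1}{2 m} \left( \hat{\mathbf{p}} - \frac{q}{2 c} \, \mathbf{B} \wedge \mathbf{r} \right)^2 \tag{1379} $$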
where the vector potential has been expressed in terms of the uniform magnetic
field via
$$ \mathbf{A} = \frac{1}{2} \, \mathbf{B} \wedge \mathbf{r} \tag{1380} $$
The expression for the energy term can be further simplified to
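$$ \begin{aligned} \frac{1}{2 m} \left( \hat{\mathbf{p}} - \frac{q}{c} \mathbf{A} \right)^2 &= \frac{\hat{\mathbf{p}}^2}{2 m} - \frac{q}{4 m c} \left[ \hat{\mathbf{p}} \cdot ( \mathbf{B} \wedge \mathbf{r} ) + ( \mathbf{B} \wedge \mathbf{r} ) \cdot \hat{\mathbf{p}} \right] + \frac{q^2}{2 m c^2} \mathbf{A}^2 \\ &= \frac{\hat{\mathbf{p}}^2}{2 m} - \frac{q}{2 m c} \, ( \mathbf{B} \wedge \mathbf{r} ) \cdot \hat{\mathbf{p}} + \frac{q^2}{2 m c^2} \mathbf{A}^2 \\ &= \frac{\hat{\mathbf{p}}^2}{2 m} - \frac{q}{2 m c} \, \mathbf{B} \cdot ( \mathbf{r} \wedge \hat{\mathbf{p}} ) + \frac{q^2}{2 m c^2} \mathbf{A}^2 \\ &= \frac{\hat{\mathbf{p}}^2}{2 m} - \frac{q}{2 m c} \, \mathbf{B} \cdot \hat{\mathbf{L}} + \frac{q^2}{2 m c^2} \mathbf{A}^2 \end{aligned} \tag{1381} $$
95 W. Pauli, Z. Phys. 44, 601 (1927).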
In obtaining the second line, the i-th component of p̂ has been commuted with
the i-th component of ( B ∧ r ). In obtaining the third line, the (cyclic) vector
identity
(A ∧ B).C = (B ∧ C ).A (1382)
has been used. The first term in eqn(1381) represents the usual non-relativistic
expression for the kinetic energy of the electron, and the second term represents
the ordinary Zeeman interaction, which originates from the paramagnetic
interaction. The last term represents the diamagnetic interaction.
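The two ingredients of the derivation of eqn(1381) can be illustrated with a short numerical sketch, assuming classical (commuting) vectors: the symmetric-gauge potential A = (1/2) B ∧ r reproduces a uniform field B, and the cyclic identity turns (B ∧ r).p into B.(r ∧ p). The function names and the finite-difference step are illustrative choices, not part of the text.

```python
import random


def cross(a, b):
    """Ordinary vector product a ^ b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def vector_potential(B, r):
    """Symmetric-gauge potential A = (1/2) B ^ r for a uniform field B."""
    return [0.5 * c for c in cross(B, r)]


def curl(field, r, h=1e-3):
    """Central-difference curl of a vector field evaluated at the point r."""
    def d(i, j):
        # partial derivative of component i with respect to coordinate j
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (field(rp)[i] - field(rm)[i]) / (2 * h)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]


random.seed(1)
B = [random.uniform(-1, 1) for _ in range(3)]
r = [random.uniform(-1, 1) for _ in range(3)]
p = [random.uniform(-1, 1) for _ in range(3)]

# curl A should reproduce the uniform field B
curl_A = curl(lambda x: vector_potential(B, x), r)
curl_error = max(abs(c - b) for c, b in zip(curl_A, B))

# cyclic identity: (B ^ r) . p = B . (r ^ p)
triple_error = abs(dot(cross(B, r), p) - dot(B, cross(r, p)))
print(curl_error, triple_error)
```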
The total Zeeman interaction is the energy of the total magnetic moment M
in the field B
ĤZeeman = − M . B (1383)
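The Dirac equation results in the Zeeman interaction of the form
$$ \hat{H}_{Zeeman} = - \frac{q}{2 m c} \, \mathbf{B} \cdot \left( \hat{\mathbf{L}} + \hbar \, \boldsymbol{\sigma} \right) = - \frac{q}{2 m c} \, \mathbf{B} \cdot \left( \hat{\mathbf{L}} + 2 \, \mathbf{S} \right) \tag{1384} $$
where the spin angular momentum S has been identified as
$$ \mathbf{S} = \frac{\hbar}{2} \, \boldsymbol{\sigma} \tag{1385} $$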
It is seen that both the spin angular momentum and the orbital angular
momentum of the charged particle interact with the magnetic field and,
therefore, both contribute to the magnetic moment. However, it is noted that
the magnetic moment can be written in the form
$$ \mathbf{M} = \frac{q}{2 m c} \left( \hat{\mathbf{L}} + g \, \mathbf{S} \right) \tag{1386} $$
where the magnitude of the magnetic moment is determined by the factor
$\frac{q \hbar}{2 m c}$, which is the Bohr magneton. The Dirac equation shows
that the spin angular momentum couples to the magnetic field with a different
strength than the orbital angular momentum, and the relative coupling strength
g (the gyromagnetic ratio) is given by g = 2.
The existence of spin and the value of 2 for the gyromagnetic ratio were the
first successes of Dirac’s theory. Quantum Electrodynamics96 yields a small
correction to the gyromagnetic ratio of
$$ g = 2 \left( 1 + \frac{1}{2 \pi} \frac{e^2}{\hbar c} + \ldots \right) \tag{1387} $$
96 J. S. Schwinger, Phys. Rev. 73, 416 (1948).
which has been experimentally verified to incredible precision97. Using the
features associated with spin, Dirac’s theory correctly described the fine
structure of the Hydrogen atom. The second success of the Dirac equation
followed Dirac’s physical interpretation of the negative-energy states in terms
of anti-particles98. This second round of success came with the discovery of
the positron by Anderson99.
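To give a feeling for the size of the correction in eqn(1387), the first-order value of g can be evaluated directly. This is a minimal sketch; the numerical value used for the fine structure constant e2/(h̄ c) is an approximate input assumed here, not a number quoted in the text.

```python
import math

# Approximate value of the fine structure constant e^2/(hbar c) (assumed input).
ALPHA = 1 / 137.035999


def g_factor_first_order(alpha):
    """g = 2 (1 + alpha / (2 pi)), the first two terms of eqn(1387)."""
    return 2 * (1 + alpha / (2 * math.pi))


g = g_factor_first_order(ALPHA)
print(f"{g:.6f}")  # about 2.002323, i.e. roughly a 0.1 percent shift from g = 2
```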
Exercise:
L̂ = r ∧ p̂ (1389)
97 R. S. Van Dyck Jr., P. B. Schwinberg and H. G. Dehmelt, Phys. Rev. Lett. 59, 26 (1987).
98 P. A. M. Dirac, Proc. Roy. Soc. A 126, 360 (1930).
99 C. D. Anderson, Phys. Rev. 43, 491 (1933).
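Finally, the α matrices are of off-diagonal form
$$ \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \hat{\boldsymbol{\sigma}} \tag{1393} $$
where σ̂ is the 2 × 2 block-diagonal Pauli spin matrix. The rate of change of
orbital angular momentum is given by the Heisenberg equation of motion
$$ i \hbar \frac{\partial}{\partial t} \hat{\mathbf{L}} = [ \hat{\mathbf{L}} , \hat{H} ] \tag{1394} $$
The orbital angular momentum operator commutes with the mass term and
with the spherically symmetric potential V (r). The orbital angular momentum
does not commute with the momentum. Thus,
$$ i \hbar \frac{\partial}{\partial t} \hat{\mathbf{L}} = c \, [ \hat{\mathbf{L}} , \boldsymbol{\alpha} \cdot \hat{\mathbf{p}} ] \tag{1395} $$
Hence, the Heisenberg equation of motion can be expressed in the form
$$ \begin{aligned} i \hbar \frac{\partial}{\partial t} \hat{\mathbf{L}} &= c \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} [ \hat{\mathbf{L}} , \hat{\boldsymbol{\sigma}} \cdot \hat{\mathbf{p}} ] \\ &= - c \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \hat{\boldsymbol{\sigma}} \cdot [ \hat{\mathbf{p}} , \hat{\mathbf{L}} ] \end{aligned} \tag{1396} $$
However, the components of the orbital angular momentum $\hat{L}^{(i)}$ and momenta
$p^{(j)}$ satisfy the commutation relations
$$ [ \hat{L}^{(i)} , p^{(j)} ] = i \hbar \sum_k \xi^{i,j,k} \, p^{(k)} \tag{1397} $$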
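The spin angular momentum is also not conserved. This can be seen by
examining the Heisenberg equation of motion for the Pauli spin operator
$$ i \hbar \frac{\partial}{\partial t} \hat{\boldsymbol{\sigma}} = [ \hat{\boldsymbol{\sigma}} , \hat{H} ] \tag{1399} $$
The spin operator commutes with $\hat{I}$ and β but does not commute with the α
matrices. Hence,
$$ \begin{aligned} i \hbar \frac{\partial}{\partial t} \hat{\boldsymbol{\sigma}} &= c \, [ \hat{\boldsymbol{\sigma}} , \boldsymbol{\alpha} \cdot \hat{\mathbf{p}} ] \\ &= c \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} [ \hat{\boldsymbol{\sigma}} , \hat{\boldsymbol{\sigma}} \cdot \hat{\mathbf{p}} ] \end{aligned} \tag{1400} $$
The components of the Pauli spin operators satisfy the commutation relations
$$ [ \sigma^{(i)} , \sigma^{(j)} ] = 2 i \sum_k \xi^{i,j,k} \, \sigma^{(k)} \tag{1401} $$
which, clearly, have a similar form to the commutation relations for the orbital
angular momentum. Hence, spin angular momentum is not conserved since
$$ i \hbar \frac{\partial}{\partial t} \hat{\boldsymbol{\sigma}} = - 2 i c \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} ( \hat{\boldsymbol{\sigma}} \wedge \hat{\mathbf{p}} ) \tag{1402} $$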
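$$ \hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}} = \hat{\mathbf{L}} + \frac{\hbar}{2} \hat{\boldsymbol{\sigma}} \tag{1403} $$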
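The total angular momentum is conserved since
$$ \begin{aligned} i \hbar \frac{\partial}{\partial t} \hat{\mathbf{J}} &= i \hbar \frac{\partial}{\partial t} \hat{\mathbf{L}} + i \hbar \frac{\partial}{\partial t} \hat{\mathbf{S}} \\ &= i \hbar c \left[ \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} ( \hat{\boldsymbol{\sigma}} \wedge \hat{\mathbf{p}} ) - \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} ( \hat{\boldsymbol{\sigma}} \wedge \hat{\mathbf{p}} ) \right] \\ &= 0 \end{aligned} \tag{1404} $$
which follows from combining eqn(1398) and eqn(1402). This confirms the
interpretation of the quantity Ŝ defined by
$$ \hat{\mathbf{S}} = \frac{\hbar}{2} \hat{\boldsymbol{\sigma}} \tag{1405} $$
as the spin angular momentum of the electron.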
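The matrix algebra behind eqn(1402) can be verified with a small sketch. It assumes that the momentum is replaced by an ordinary numerical vector p (so that only the Pauli-matrix algebra is tested) and checks, component by component, that [ σ(i) , σ . p ] = − 2 i ( σ ∧ p )(i). All helper names are illustrative.

```python
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULI = [SX, SY, SZ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def sigma_dot(p):
    """The 2x2 matrix sigma . p for a numerical 3-vector p."""
    total = [[0, 0], [0, 0]]
    for component, s in zip(p, PAULI):
        total = add(total, scale(component, s))
    return total


def sigma_cross(p, i):
    """i-th component of sigma ^ p, using cyclic indices (i, j, k)."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return add(scale(p[k], PAULI[j]), scale(-p[j], PAULI[k]))


def max_abs(a):
    return max(abs(x) for row in a for x in row)


p = [0.3, -1.2, 0.7]  # arbitrary test vector standing in for the momentum
residuals = []
for i in range(3):
    lhs = commutator(PAULI[i], sigma_dot(p))
    rhs = scale(-2j, sigma_cross(p, i))
    residuals.append(max_abs(add(lhs, scale(-1, rhs))))
print(residuals)
```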
The parity transform P acting on the coordinates (t, r) has the effect
Lee and C. N. Yang [T. D. Lee and C. N. Yang, Phys. Rev. 104, 254 (1956).]
which is an inversion of the spatial coordinates. Thus, the parity operation
reverses the space-like components of vectors, so the effects of the parity
operation on the position and momentum vectors are given by
P̂ r P̂ −1 = −r
P̂ p P̂ −1 = −p (1407)
P̂ L P̂ −1 = L (1408)
which is unchanged. This implies that spin angular momentum should also be
invariant under the parity transform
P̂ σ P̂ −1 = σ (1409)
Ĥ = P̂ Ĥ P̂ −1 (1410)
Ĥ = c α . p̂ + β m c2 + Iˆ V (r) (1411)
P̂ α P̂ −1 = −α
P̂ β P̂ −1 = β (1413)
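The condition on the potential is the familiar condition for parity invariance
in classical mechanics. In the standard representation, in 2 × 2 block diagonal
form, the requirement of parity invariance on the Dirac matrices becomes the
matrix equations
$$ \begin{aligned} \hat{\mathcal{P}} \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \hat{\mathcal{P}}^{-1} &= - \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \\ \hat{\mathcal{P}} \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \hat{\mathcal{P}}^{-1} &= \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \end{aligned} \tag{1414} $$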
The above equation shows that, in the standard representation, the full parity
operator $\hat{\mathcal{P}}$ can be uniquely factorized as
$$ \hat{\mathcal{P}} = \hat{P} \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{1415} $$
where the operator $\hat{P}$ only acts on the coordinates r. The presence of the matrix
in the parity operation on the Dirac spinor should be compared with the effect
of the parity operator on the four-vector potential of Electrodynamics Aµ (r)
which is given by the product of spatial inversion and a matrix operation
P̂ Aµ (r, t) = γ µ ν P̂ Aν (r, t)
= γ µ ν Aν (−r, t) (1416)
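The effect of the parity operator on the Dirac four-component spinor wave
function can be computed from
$$ \begin{aligned} \hat{\mathcal{P}} \, \psi(t, \mathbf{r}) &= \hat{\mathcal{P}} \begin{pmatrix} \phi^A(t, \mathbf{r}) \\ \phi^B(t, \mathbf{r}) \end{pmatrix} \\ &= \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \begin{pmatrix} \phi^A(t, -\mathbf{r}) \\ \phi^B(t, -\mathbf{r}) \end{pmatrix} \\ &= \begin{pmatrix} \phi^A(t, -\mathbf{r}) \\ - \phi^B(t, -\mathbf{r}) \end{pmatrix} \end{aligned} \tag{1418} $$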
Hence, in the standard representation, the parity operator changes the relative
sign of the two two-component spinors. Due to the presence of the term − I in
the lower diagonal block of the parity matrix, the lower two-component spinor
φB in the Dirac spinor is said to have a negative intrinsic parity.
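$$ \hat{\mathcal{P}} \, \psi = \eta_p \, \psi \tag{1419} $$
$$ \begin{pmatrix} \hat{P} \, \phi^A \\ - \hat{P} \, \phi^B \end{pmatrix} = \begin{pmatrix} \eta_p \, \phi^A \\ \eta_p \, \phi^B \end{pmatrix} \tag{1420} $$
Hence, the two-component spinors φA (r) and φB (r) have opposite parities under
spatial inversion
$$ \begin{aligned} \hat{P} \, \phi^A(\mathbf{r}) &= \eta_p \, \phi^A(\mathbf{r}) \\ \hat{P} \, \phi^B(\mathbf{r}) &= - \eta_p \, \phi^B(\mathbf{r}) \end{aligned} \tag{1421} $$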
In polar coordinates, the spatial part of the parity operation P̂ is equivalent to
a reflection
θ → π − θ (1422)
followed by a rotation
ϕ → ϕ + π (1423)
which has the effect that
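$$ \begin{aligned} \sin\theta &\to \sin\theta \\ \cos\theta &\to - \cos\theta \\ \exp[ i m \varphi ] &\to ( - 1 )^m \exp[ i m \varphi ] \end{aligned} \tag{1424} $$
are eigenstates of the parity operator and have parity eigenvalues of $(-1)^l$. The
lowering operator $\hat{L}^-$, defined via
$$ \hat{L}^- = - \hbar \exp[ - i \varphi ] \left( \frac{\partial}{\partial\theta} - i \cot\theta \frac{\partial}{\partial\varphi} \right) \tag{1426} $$
Therefore, on repeatedly operating on $Y^l_l(\theta, \varphi)$ with the lowering operator $\hat{L}^-$
(l − m) times, one finds that under the parity transformation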
which shows that all states with a definite magnitude of the orbital angular mo-
mentum l are eigenstates of the parity operator and have the same eigenvalue.
Exercise:
Show that under a parity transformation the positive-energy solution for the
free Dirac particle $\psi^+_{k,\sigma}(x)$ transforms as
$$ \hat{\mathcal{P}} \, \psi^+_{k,\sigma}(x) = \psi^+_{-k,\sigma}(x) \tag{1429} $$
while the negative-energy solutions $\psi^-_{k,\sigma}(x)$ transform as
$$ \hat{\mathcal{P}} \, \psi^-_{k,\sigma}(x) = - \psi^-_{-k,\sigma}(x) \tag{1430} $$
Hence, the parity operation reverses the momentum and keeps the spin
invariant for both the positive-energy and the negative-energy solutions. The
extra negative sign implies that the negative-energy solution has opposite
intrinsic parity to the positive-energy solution.
Exercise:
where R̂(Λ) “rotates” the spinor. The covariant condition for the Dirac equation
is
$$ \hat{R}^{-1}(\Lambda) \, \gamma^\mu \, \hat{R}(\Lambda) = \Lambda^\mu{}_\nu \, \gamma^\nu \tag{1433} $$
For a parity transformation, one has
$$ x^{\mu'} = x_\mu \tag{1434} $$
since the spatial components of $x^\mu$ change sign. Hence, for a parity
transformation, the transformation matrix is determined as
$$ \Lambda^\mu{}_\nu = g_{\mu,\nu} \tag{1435} $$
$$ \det | g | = - 1 \tag{1436} $$
Solve for the matrix R̂(Λ) which shuffles the components of the Dirac spinor.
$$ x^{\mu'} = \Lambda^\mu{}_\nu \, x^\nu \tag{1438} $$
and the condition that the Dirac equation is covariant under the orthochronous
Lorentz transformation is
$$ \hat{R}^{-1}(\Lambda) \, \gamma^\mu \, \hat{R}(\Lambda) = \Lambda^\mu{}_\nu \, \gamma^\nu \tag{1440} $$
From the transformational properties of the Dirac spinors, together with the
identity
$$ \gamma^{(0)} \, \hat{R}^\dagger(\Lambda) \, \gamma^{(0)} = \hat{R}^{-1}(\Lambda) \tag{1441} $$
one can find the transformational properties of quantities that are bi-linear in
the Dirac spinors.
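Thus, for example, the bi-linear quantity $\overline{\psi}^\dagger \psi$ transforms according to
$$ \begin{aligned} \overline{\psi}^{\dagger\prime} \psi' &= \psi^{\dagger\prime} \gamma^{(0)} \psi' \\ &= \psi^\dagger \hat{R}^\dagger(\Lambda) \gamma^{(0)} \hat{R}(\Lambda) \psi \\ &= \psi^\dagger ( \gamma^{(0)} )^2 \hat{R}^\dagger(\Lambda) \gamma^{(0)} \hat{R}(\Lambda) \psi \\ &= \psi^\dagger \gamma^{(0)} \hat{R}^{-1}(\Lambda) \hat{R}(\Lambda) \psi \\ &= \overline{\psi}^\dagger \psi \end{aligned} \tag{1442} $$
where a factor of $( \gamma^{(0)} )^2 = \hat{I}$ has been used in the third line and the identity
of eqn(1441) has been used in the fourth. Thus, one finds that $\overline{\psi}^\dagger \psi$ transforms
like a scalar.
Likewise, one can show that the bi-linear quantities $\overline{\psi}^\dagger \gamma^\mu \psi$ transform like
the components of a four-vector. That is
$$ \begin{aligned} \overline{\psi}^{\dagger\prime} \gamma^\mu \psi' &= \psi^{\dagger\prime} \gamma^{(0)} \gamma^\mu \psi' \\ &= \psi^\dagger \hat{R}^\dagger(\Lambda) \gamma^{(0)} \gamma^\mu \hat{R}(\Lambda) \psi \\ &= \psi^\dagger ( \gamma^{(0)} )^2 \hat{R}^\dagger(\Lambda) \gamma^{(0)} \gamma^\mu \hat{R}(\Lambda) \psi \\ &= \psi^\dagger \gamma^{(0)} \hat{R}^{-1}(\Lambda) \gamma^\mu \hat{R}(\Lambda) \psi \\ &= \overline{\psi}^\dagger \hat{R}^{-1}(\Lambda) \gamma^\mu \hat{R}(\Lambda) \psi \\ &= \Lambda^\mu{}_\nu \, \overline{\psi}^\dagger \gamma^\nu \psi \end{aligned} \tag{1443} $$
where the covariant condition has been used in obtaining the last line. Since
this relation holds for Lorentz boosts, rotations and spatial inversions, $\overline{\psi}^\dagger \gamma^\mu \psi$
is a four-vector.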
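Table 7: The sixteen bi-linear covariants $\overline{\psi}^\dagger \hat{Q} \psi$ for the Dirac equation.

Type                     $\overline{\psi}^\dagger \hat{Q} \psi$                       $\hat{R}^{-1}(\Lambda) \hat{Q} \hat{R}(\Lambda)$                          Components
Scalar                   $\overline{\psi}^\dagger \hat{I} \psi$                       $\hat{I}$                                                                  1
Vector                   $\overline{\psi}^\dagger \gamma^\mu \psi$                    $\Lambda^\mu{}_\nu \gamma^\nu$                                             4
Anti-symmetric Tensor    $\overline{\psi}^\dagger \sigma^{\mu,\nu} \psi$              $\Lambda^\mu{}_\rho \Lambda^\nu{}_\tau \sigma^{\rho,\tau}$                 6
Pseudo-scalar            $\overline{\psi}^\dagger \gamma^{(4)} \psi$                  $\det | \Lambda | \, \gamma^{(4)}$                                         1
Axial-Vector             $\overline{\psi}^\dagger \gamma^{(4)} \gamma^\mu \psi$       $\det | \Lambda | \, \Lambda^\mu{}_\nu \gamma^{(4)} \gamma^\nu$            4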
where we have inserted a factor of $\hat{I} = \hat{R}(\Lambda) \hat{R}^{-1}(\Lambda)$ in the second line, and
used the covariant condition (twice) in the third line. Hence, the bi-linear
quantity $\overline{\psi}^\dagger \sigma^{\mu,\nu} \psi$ transforms like an anti-symmetric second-rank tensor.
One can define a quantity $\gamma^{(4)}$ in terms of a product of all the γ-matrices
$$ \gamma^{(4)} = i \, \gamma^{(0)} \gamma^{(1)} \gamma^{(2)} \gamma^{(3)} \tag{1448} $$
It is easily verified that $\gamma^{(4)}$ anti-commutes with all the $\gamma^\mu$,
$$ \{ \gamma^\mu , \gamma^{(4)} \}_+ = 0 \tag{1449} $$
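so one has
$$ \overline{\psi'}^\dagger(x') \, \gamma^{(4)} \, \psi'(x') = \det | \Lambda | \; \overline{\psi}^\dagger(x) \, \gamma^{(4)} \, \psi(x) \tag{1456} $$
Therefore, the quantity $\overline{\psi}^\dagger \gamma^{(4)} \psi$ transforms as a pseudo-scalar.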
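The anti-commutation property of eqn(1449) can be checked explicitly. The sketch below assumes the standard (Dirac) representation, with γ(0) = β and γ(i) = β α(i) built from 2 × 2 blocks; the helper names are illustrative. In this representation the product i γ(0) γ(1) γ(2) γ(3) is expected to be the off-diagonal block matrix with unit blocks, and its square the unit matrix.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def neg(a):
    return [[-x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Standard representation (assumed): gamma^0 = beta, gamma^i = beta alpha^i.
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]

# gamma^(4) = i gamma^0 gamma^1 gamma^2 gamma^3
gamma4 = GAMMA[0]
for g in GAMMA[1:]:
    gamma4 = matmul(gamma4, g)
gamma4 = [[1j * x for x in row] for row in gamma4]

# Largest entry of each anti-commutator {gamma^mu, gamma^(4)}
anticommutators = [
    max_abs_diff(matmul(g, gamma4), neg(matmul(gamma4, g))) for g in GAMMA
]
identity4 = block(I2, Z2, Z2, I2)
square_error = max_abs_diff(matmul(gamma4, gamma4), identity4)
offdiag_error = max_abs_diff(gamma4, block(Z2, I2, I2, Z2))
print(anticommutators, square_error, offdiag_error)
```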
One can also define the bi-linear axial-vector $\overline{\psi}^\dagger \gamma^{(4)} \gamma^\mu \psi$. From
considerations similar to those used previously, one can show that these
quantities transform according to
$$ \overline{\psi'}^\dagger(x') \, \gamma^{(4)} \gamma^\mu \, \psi'(x') = \det | \Lambda | \; \Lambda^\mu{}_\nu \; \overline{\psi}^\dagger(x) \, \gamma^{(4)} \gamma^\nu \, \psi(x) \tag{1457} $$
Hence, $\overline{\psi}^\dagger \gamma^{(4)} \gamma^\mu \psi$ transforms like a four-vector under proper orthochronous
Lorentz transformations. However, the space-like components do not change
sign under an inversion, but the time-like components do change sign.
Therefore, $\overline{\psi}^\dagger \gamma^{(4)} \gamma^\mu \psi$ transforms like an axial-vector.
Exercise:
Show, by considering the non-relativistic limit, that the above equation de-
scribes an electron with an electric dipole moment. Determine an expression for
the electric dipole moment.
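The angular momentum operator $\hat{\mathbf{J}}$ and the parity operator $\hat{\mathcal{P}}$ commute with
the Hamiltonian Ĥ. Therefore, one can find simultaneous eigenstates of the
four operators Ĥ, $\hat{\mathbf{J}}^2$, $\hat{J}_z$ and $\hat{\mathcal{P}}$. The energy eigenstates satisfy the equation
$$ \left[ c \, \boldsymbol{\alpha} \cdot \hat{\mathbf{p}} + \beta \, m c^2 + \hat{I} \, V(r) \right] \psi = E \, \psi \tag{1460} $$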
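In spherical polar coordinates, the operator ( σ . p̂ ) can be expressed as
$$ \begin{aligned} ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) = &- i \hbar \begin{pmatrix} \cos\theta & \sin\theta \exp[-i\varphi] \\ \sin\theta \exp[+i\varphi] & - \cos\theta \end{pmatrix} \frac{\partial}{\partial r} \\ &- \frac{i \hbar}{r} \begin{pmatrix} - \sin\theta & \cos\theta \exp[-i\varphi] \\ \cos\theta \exp[+i\varphi] & \sin\theta \end{pmatrix} \frac{\partial}{\partial\theta} \\ &- \frac{i \hbar}{r \sin\theta} \begin{pmatrix} 0 & - i \exp[-i\varphi] \\ i \exp[+i\varphi] & 0 \end{pmatrix} \frac{\partial}{\partial\varphi} \end{aligned} \tag{1462} $$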
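which has a quite complicated structure. For future reference, it shall be noted
that the matrix part of the coefficient of the partial derivative w.r.t. r is simply
equal to
$$ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \tag{1463} $$
which is independent of the radial coordinate r. The operator ( σ . p̂ ) can be
cast in a more convenient form through the repeated use of the Pauli identity.
First, the 2 × 2 unit matrix can be written as
$$ I = \left( \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \right)^2 \tag{1464} $$
and are their own inverses. Therefore, one can express the operator ( σ . p̂ ) as
$$ \begin{aligned} ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) &= \left( \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \right)^2 ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) \\ &= \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r^2} \, ( \mathbf{r} \cdot \boldsymbol{\sigma} ) \, ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) \\ &= \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r^2} \left( \mathbf{r} \cdot \hat{\mathbf{p}} + i \, \boldsymbol{\sigma} \cdot ( \mathbf{r} \wedge \hat{\mathbf{p}} ) \right) \\ &= \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r^2} \left( - i \hbar \, r \frac{\partial}{\partial r} + i \, \boldsymbol{\sigma} \cdot \hat{\mathbf{L}} \right) \\ &= \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r^2} \left( - i \hbar \, r \frac{\partial}{\partial r} + \frac{2 i}{\hbar} \, \mathbf{S} \cdot \hat{\mathbf{L}} \right) \end{aligned} \tag{1466} $$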
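where the Pauli identity has been used in going between the second and third
lines. Therefore, the two-component spinors satisfy the set of coupled equations
$$ \begin{aligned} ( E - V(r) - m c^2 ) \, \phi^A(\mathbf{r}) &= c \, \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r^2} \left( - i \hbar \, r \frac{\partial}{\partial r} + \frac{2 i}{\hbar} \, \mathbf{S} \cdot \hat{\mathbf{L}} \right) \phi^B(\mathbf{r}) \\ ( E - V(r) + m c^2 ) \, \phi^B(\mathbf{r}) &= c \, \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r^2} \left( - i \hbar \, r \frac{\partial}{\partial r} + \frac{2 i}{\hbar} \, \mathbf{S} \cdot \hat{\mathbf{L}} \right) \phi^A(\mathbf{r}) \end{aligned} \tag{1467} $$

Table 8: The Clebsch-Gordan Coefficients for adding orbital angular momentum
(l, m) with spin quantum numbers (1/2, s_z) to yield a state with total angular
momentum quantum numbers (j, j_z). The allowed values of m are given by
j_z = m + s_z.

               s_z = +1/2                                       s_z = −1/2
j = l + 1/2    $\sqrt{\frac{l + j_z + \frac{1}{2}}{2 l + 1}}$     $\sqrt{\frac{l - j_z + \frac{1}{2}}{2 l + 1}}$
j = l − 1/2    $- \sqrt{\frac{l - j_z + \frac{1}{2}}{2 l + 1}}$   $\sqrt{\frac{l + j_z + \frac{1}{2}}{2 l + 1}}$
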
It is seen that, due to the effect of special relativity, the Dirac equation results
in the coupling of the spin and the orbital angular momentum.
J = L + S (1468)
Thus, the two-component spinor eigenstates of total angular momentum $\Omega^l_{j,j_z}(\theta, \varphi)$,
which describe the angular dependence, are formed by combining states of
orbital angular momentum l, represented by $Y^l_m(\theta, \varphi)$, and the spin eigenfunctions
χ±. On combining states with orbital angular momentum l and spin s = 1/2, one
finds states with total angular momentum which satisfy
$$ l + \frac{1}{2} \geq j \geq l - \frac{1}{2} \tag{1469} $$
Thus, it is found that the possible eigenstates correspond to j = l + 1/2 and
j = l − 1/2. Furthermore, the corresponding eigenfunctions are expressed as
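$$ \begin{aligned} \Omega^l_{l+\frac{1}{2},j_z}(\theta, \varphi) &= \sqrt{\frac{l + \frac{1}{2} + j_z}{2 l + 1}} \; Y^l_{j_z - \frac{1}{2}}(\theta, \varphi) \, \chi_+ + \sqrt{\frac{l + \frac{1}{2} - j_z}{2 l + 1}} \; Y^l_{j_z + \frac{1}{2}}(\theta, \varphi) \, \chi_- \\ \Omega^l_{l-\frac{1}{2},j_z}(\theta, \varphi) &= - \sqrt{\frac{l + \frac{1}{2} - j_z}{2 l + 1}} \; Y^l_{j_z - \frac{1}{2}}(\theta, \varphi) \, \chi_+ + \sqrt{\frac{l + \frac{1}{2} + j_z}{2 l + 1}} \; Y^l_{j_z + \frac{1}{2}}(\theta, \varphi) \, \chi_- \end{aligned} \tag{1470} $$
where the coefficients are identified with the Clebsch-Gordan coefficients given
in Table(8). The functions $\Omega^l_{j,j_z}(\theta, \varphi)$ are the analogue of the spherical
harmonics $Y^l_m(\theta, \varphi)$ in relativistic problems where spin and orbital angular
momentum are coupled.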
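The coefficients appearing in eqn(1470) can be tabulated with a short sketch, which also checks that for each (l, jz) the two coefficient pairs are normalized and mutually orthogonal, as required for the states j = l + 1/2 and j = l − 1/2. The function name and the use of exact fractions for jz are illustrative choices.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def cg_coefficients(l, jz):
    """Coefficients of (Y chi_+, Y chi_-) for j = l + 1/2 and for j = l - 1/2.

    Returns ((a, b), (-b, a)) with a = sqrt((l + 1/2 + jz) / (2l + 1)) and
    b = sqrt((l + 1/2 - jz) / (2l + 1)), following eqn(1470).
    """
    denominator = 2 * l + 1
    a = math.sqrt((l + HALF + jz) / denominator)
    b = math.sqrt((l + HALF - jz) / denominator)
    return (a, b), (-b, a)


worst = 0.0
for l in range(0, 4):
    # jz runs over -(l + 1/2), ..., +(l + 1/2) in integer steps
    for twice_jz in range(-(2 * l + 1), 2 * l + 2, 2):
        jz = Fraction(twice_jz, 2)
        (a, b), (c, d) = cg_coefficients(l, jz)
        worst = max(worst,
                    abs(a * a + b * b - 1),   # normalization of j = l + 1/2
                    abs(c * c + d * d - 1),   # normalization of j = l - 1/2
                    abs(a * c + b * d))       # orthogonality of the two states
print(worst)
```

For example, `cg_coefficients(1, Fraction(1, 2))` gives the pair (√(2/3), √(1/3)) for j = 3/2.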
(1472)
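As shall be seen later, the two-component spinors $\Omega^{l'}_{l'-\frac{1}{2},j_z}(\theta, \varphi)$ and $\Omega^{l}_{l+\frac{1}{2},j_z}(\theta, \varphi)$
have opposite parities. In fact, the two-component spinors generated by angular
momentum l and l' = (l + 1) are related by the action of the pseudo-scalar
$$ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} = \begin{pmatrix} \cos\theta & \sin\theta \exp[-i\varphi] \\ \sin\theta \exp[+i\varphi] & - \cos\theta \end{pmatrix} \tag{1473} $$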
one finds that the inverse relationship between the two-component spinors is
also given by
$$ \Omega^{l+1}_{j,j_z}(\theta, \varphi) = - \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{l}_{j,j_z}(\theta, \varphi) \tag{1476} $$
Therefore, one concludes that the two angular momentum eigenstates have dif-
ferent properties under the spatial inversion transformation r → −r.
——————————————————————————————————
Mathematical Interlude:
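The Action of the Operator ( r̂ . σ ) on the Spinor Spherical Harmonics
$\Omega^{j\pm\frac{1}{2}}_{j,j_z}(\theta, \varphi)$.
Here, it will be argued that the spinor spherical harmonics satisfy the
equations
$$ \begin{aligned} \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= - \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \\ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= - \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) \end{aligned} \tag{1477} $$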
$$ [ \hat{J}^{(i)} , ( \mathbf{r} \cdot \hat{\mathbf{S}} ) ] = 0 \tag{1479} $$
The complete proof of this statement immediately follows from the proof of the
relation for any one component $\hat{J}^{(i)}$, since ( r . Ŝ ) is spherically symmetric.
Thus, for i = 1, one has
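and
$$ [ \hat{L}^{(i)} , x^{(j)} ] = i \hbar \, \varepsilon^{i,j,k} \, x^{(k)} \tag{1482} $$
one finds that
$$ [ \hat{J}^{(1)} , ( \mathbf{r} \cdot \hat{\mathbf{S}} ) ] = i \hbar \left( \hat{S}^{(2)} x^{(3)} - \hat{S}^{(3)} x^{(2)} + x^{(2)} \hat{S}^{(3)} - x^{(3)} \hat{S}^{(2)} \right) = 0 \tag{1483} $$
which was to be shown. From repeated use of the above commutation relations
which involve the components $\hat{J}^{(i)}$, it immediately follows that
$$ [ \hat{\mathbf{J}}^2 , ( \mathbf{r} \cdot \hat{\mathbf{S}} ) ] = 0 \tag{1484} $$
Thus, since $\Omega^{j\pm\frac{1}{2}}_{j,j_z}$ is a simultaneous eigenstate of $\hat{\mathbf{J}}^2$ and $\hat{J}^{(3)}$ and because these
operators commute with ( r . Ŝ ), then $( \mathbf{r} \cdot \hat{\mathbf{S}} ) \, \Omega^{j\pm\frac{1}{2}}_{j,j_z}$ is also a simultaneous
eigenstate with eigenvalues (j, jz ).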
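Since the states $( \mathbf{r} \cdot \hat{\mathbf{S}} ) \, \Omega^{j\pm\frac{1}{2}}_{j,j_z}$ are simultaneous eigenstates of $\hat{\mathbf{J}}^2$ and $\hat{J}^{(3)}$
with eigenvalues (j, jz ), and because this subspace is spanned by the basis
composed of the two states $\Omega^{j\pm\frac{1}{2}}_{j,j_z}(\theta, \varphi)$, the transformed states can be decomposed
as
$$ \begin{aligned} \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= C_{++}(j, j_z) \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j, j_z) \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \\ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= C_{-+}(j, j_z) \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{--}(j, j_z) \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \end{aligned} \tag{1485} $$
where the coefficients C±,± (j, jz ) will be determined below.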
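First, we shall show that the coefficients C±,± (j, jz ) are independent of jz .
This follows as $\hat{J}^\pm$ commutes with ( r . Ŝ ) since all the components $\hat{J}^{(i)}$
commute with ( r . Ŝ ). Thus, one has
$$ \hat{J}^\pm \, \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) = \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \hat{J}^\pm \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1486} $$
and
$$ \begin{aligned} \hat{J}^\pm \, \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= C_{++}(j, j_z) \, \hat{J}^\pm \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j, j_z) \, \hat{J}^\pm \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \\ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \hat{J}^\pm \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= C_{++}(j, j_z \pm 1) \, \hat{J}^\pm \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j, j_z \pm 1) \, \hat{J}^\pm \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \end{aligned} \tag{1487} $$
Hence, on comparing the linearly-independent terms on the right-hand sides, one
concludes that
$$ \begin{aligned} C_{++}(j, j_z \pm 1) &= C_{++}(j, j_z) \\ C_{+-}(j, j_z \pm 1) &= C_{+-}(j, j_z) \end{aligned} \tag{1488} $$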
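etc. Therefore, the coefficients C±,± (j, jz ) are independent of the value of jz .
Henceforth, we shall omit the index jz in C±,± (j, jz ).
From considerations of parity, it can be determined that C++ (j) = C−− (j) =
0. Under the parity transformation r → − r, one has
$$ \Omega^{j\pm\frac{1}{2}}_{j,j_z}(\theta, \varphi) \to ( - 1 )^{j\pm\frac{1}{2}} \, \Omega^{j\pm\frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1489} $$
which follows from the properties of the spherical harmonics $Y^l_m(\theta, \varphi)$ under the
parity transformation. Also one has
$$ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \to - \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \tag{1490} $$
under the parity transform. Thus, after the parity transform, one finds that the
transformed states have the decompositions
$$ \begin{aligned} \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= - C_{++}(j) \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j) \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \\ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= C_{-+}(j) \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) - C_{--}(j) \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \end{aligned} \tag{1491} $$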
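Therefore, recalling that the coefficients are independent of jz , one can express
the effect of the operator on the spinor spherical harmonics as
$$ \begin{aligned} \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= C_{+-}(j) \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \\ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= C_{-+}(j) \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) \end{aligned} \tag{1493} $$
Furthermore, since
$$ \left( \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \right)^2 = I \tag{1494} $$
one obtains the condition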
The above two equations suggest that C−+ (j) and C+− (j) are pure phase
factors, such as
$$ \begin{aligned} C_{+-}(j) &= \exp[ + i \phi(j) ] \\ C_{-+}(j) &= \exp[ - i \phi(j) ] \end{aligned} \tag{1497} $$
which becomes real for ϕ = 0 since the spherical harmonics become real. Hence,
on inspecting eqn(1493) with ϕ = 0, one concludes that the phase factors are
equal and are purely real. That is
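are non-zero. The spinor spherical harmonics with θ = 0 are connected via
$$ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \Omega^{j\pm\frac{1}{2}}_{j,\pm\frac{1}{2}}(0, \varphi) = - \Omega^{j\mp\frac{1}{2}}_{j,\pm\frac{1}{2}}(0, \varphi) \tag{1504} $$
which holds independent of the values of θ and j, so the effect of the operator
on the spinor spherical harmonics is completely specified by
$$ \begin{aligned} \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= - \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) \\ \frac{\mathbf{r} \cdot \boldsymbol{\sigma}}{r} \, \Omega^{j-\frac{1}{2}}_{j,j_z}(\theta, \varphi) &= - \Omega^{j+\frac{1}{2}}_{j,j_z}(\theta, \varphi) \end{aligned} \tag{1506} $$
as was to be shown.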
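The result of eqn(1506) can be illustrated for the simplest case j = 1/2, jz = +1/2, where the two spinor spherical harmonics are built from l = 0 and l = 1. This is a minimal sketch that assumes the Condon-Shortley phase convention for the spherical harmonics (an assumption made here, since the explicit forms are not quoted in this section) and uses the coefficients of eqn(1470). The function names are illustrative.

```python
import cmath
import math


def y00(theta, phi):
    return 1 / math.sqrt(4 * math.pi)


def y10(theta, phi):
    return math.sqrt(3 / (4 * math.pi)) * math.cos(theta)


def y11(theta, phi):
    # Condon-Shortley phase convention (assumed)
    return -math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi)


def omega_l0(theta, phi):
    """j = 1/2, jz = +1/2 spinor built from l = 0 (j = l + 1/2)."""
    return [y00(theta, phi), 0.0]


def omega_l1(theta, phi):
    """j = 1/2, jz = +1/2 spinor built from l = 1 (j = l - 1/2)."""
    return [-math.sqrt(1 / 3) * y10(theta, phi),
            math.sqrt(2 / 3) * y11(theta, phi)]


def rhat_dot_sigma(theta, phi):
    """The matrix (r . sigma) / r of eqn(1473)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s * cmath.exp(-1j * phi)],
            [s * cmath.exp(1j * phi), -c]]


def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def residual(theta, phi):
    """Largest deviation from (r.sigma/r) Omega = - Omega', in both directions."""
    m = rhat_dot_sigma(theta, phi)
    w0, w1 = omega_l0(theta, phi), omega_l1(theta, phi)
    down = [a + b for a, b in zip(apply(m, w0), w1)]
    up = [a + b for a, b in zip(apply(m, w1), w0)]
    return max(abs(x) for x in down + up)


print(residual(0.7, 1.3))
```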
——————————————————————————————————
The Ansatz
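If one only considers the spatial part of the parity operator, P̂ , the
two-component spinor states $\Omega^{l'}_{l'\pm\frac{1}{2},j_z}(\theta, \varphi)$ have parities $(-1)^{l'}$
$$ \hat{P} \, \Omega^{l'}_{l'\pm\frac{1}{2},j_z}(\theta, \varphi) = (-1)^{l'} \, \Omega^{l'}_{l'\pm\frac{1}{2},j_z}(\theta, \varphi) \tag{1507} $$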
Furthermore, as has been seen, the upper and lower two-component spinors of
the four-component Dirac spinor must have opposite intrinsic parity. Therefore,
the desired simultaneous eigenstates for the relativistic electron can be either
represented by the four-component Dirac spinor $\psi^-_{j,j_z}(\mathbf{r})$ with parity
$(-1)^l = (-1)^{(l+\frac{1}{2}-\frac{1}{2})}$ of the form
$$ \psi^-_{l+\frac{1}{2},j_z}(\mathbf{r}) = \begin{pmatrix} \frac{f^-(r)}{r} \, \Omega^{l}_{l+\frac{1}{2},j_z}(\theta, \varphi) \\ i \, \frac{g^-(r)}{r} \, \Omega^{l+1}_{l+\frac{1}{2},j_z}(\theta, \varphi) \end{pmatrix} \tag{1508} $$
or by $\psi^+_{j,j_z}(\mathbf{r})$
$$ \psi^+_{l+\frac{1}{2},j_z}(\mathbf{r}) = \begin{pmatrix} \frac{f^+(r)}{r} \, \Omega^{l+1}_{l+\frac{1}{2},j_z}(\theta, \varphi) \\ i \, \frac{g^+(r)}{r} \, \Omega^{l}_{l+\frac{1}{2},j_z}(\theta, \varphi) \end{pmatrix} \tag{1509} $$
which has parity $(-1)^{l+\frac{1}{2}+\frac{1}{2}}$. In these expressions f ± (r) and g ± (r) are scalar
radial functions that have to be determined as solutions of the radial equation.
These states do not correspond to definite values of the orbital angular
momentum since the upper and lower two-component spinors correspond to the
different values of either l or l' = l + 1 for the orbital angular momentum.
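To condense the notation, the energy eigenstates will be written in the
compact form
$$ \psi^\pm_{j,j_z}(\mathbf{r}) = \begin{pmatrix} \frac{f^\pm(r)}{r} \, \Omega^{l_A}_{j,j_z}(\theta, \varphi) \\ i \, \frac{g^\pm(r)}{r} \, \Omega^{l_B}_{j,j_z}(\theta, \varphi) \end{pmatrix} \tag{1510} $$
where $l_A = j \pm \frac{1}{2}$ and $l_B = j \mp \frac{1}{2}$.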
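$$ \hat{\mathbf{J}} = \hat{\mathbf{L}} + \mathbf{S} \tag{1511} $$
When this operator acts on the relativistic two-component spinor spherical
harmonic $\Omega^{l_A}_{j,j_z}$, one finds
$$ \mathbf{S} \cdot \hat{\mathbf{L}} \; \Omega^{l_A}_{j,j_z} = \frac{\hbar^2}{2} \left( j ( j + 1 ) - l_A ( l_A + 1 ) - \frac{3}{4} \right) \Omega^{l_A}_{j,j_z} \tag{1513} $$
which for $j = l_A + \frac{1}{2}$ yields
$$ \mathbf{S} \cdot \hat{\mathbf{L}} \; \Omega^{l_A}_{j,j_z} = \frac{\hbar^2}{2} \left( j - \frac{1}{2} \right) \Omega^{l_A}_{j,j_z} \tag{1514} $$
and for $j = l_A - \frac{1}{2}$ yields
$$ \mathbf{S} \cdot \hat{\mathbf{L}} \; \Omega^{l_A}_{j,j_z} = - \frac{\hbar^2}{2} \left( j + \frac{3}{2} \right) \Omega^{l_A}_{j,j_z} \tag{1515} $$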
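Following Dirac, it is customary to define an integer κ in terms of the eigenvalues
of S . L̂ via
$$ \begin{aligned} ( \mathbf{S} \cdot \hat{\mathbf{L}} ) \; \Omega^{l_A}_{j,j_z} &= - \frac{\hbar^2}{2} ( 1 + \kappa ) \; \Omega^{l_A}_{j,j_z} \\ ( \mathbf{S} \cdot \hat{\mathbf{L}} ) \; \Omega^{l_B}_{j,j_z} &= - \frac{\hbar^2}{2} ( 1 - \kappa ) \; \Omega^{l_B}_{j,j_z} \end{aligned} \tag{1517} $$
Therefore, if $\Omega^{l_A}_{j,j_z} = \Omega^{j+\frac{1}{2}}_{j,j_z}$, i.e. $j = l_A - \frac{1}{2}$, then $\kappa = ( j + \frac{1}{2} )$.
Otherwise, if $\Omega^{l_A}_{j,j_z} = \Omega^{j-\frac{1}{2}}_{j,j_z}$, i.e. $j = l_A + \frac{1}{2}$, then $\kappa = - ( j + \frac{1}{2} )$.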
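The bookkeeping that relates κ to the eigenvalues of S . L̂ can be checked with exact fractions. The sketch below works in units of h̄2 and verifies that the eigenvalue of eqn(1513) equals − (1 + κ)/2 for the upper spinor (orbital angular momentum lA) and − (1 − κ)/2 for the lower spinor (lB), as stated in eqn(1517). The function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_orbit_eigenvalue(j, l):
    """Eigenvalue of S.L in units of hbar^2: [ j(j+1) - l(l+1) - 3/4 ] / 2."""
    return (j * (j + 1) - l * (l + 1) - Fraction(3, 4)) / 2


def kappa(j, l_a):
    """Dirac's kappa: +(j + 1/2) if l_A = j + 1/2, -(j + 1/2) if l_A = j - 1/2."""
    if l_a == j + HALF:
        return j + HALF
    if l_a == j - HALF:
        return -(j + HALF)
    raise ValueError("l_A must equal j + 1/2 or j - 1/2")


checks = []
for twice_j in (1, 3, 5, 7):
    j = Fraction(twice_j, 2)
    for l_a, l_b in ((j + HALF, j - HALF), (j - HALF, j + HALF)):
        k = kappa(j, l_a)
        checks.append(spin_orbit_eigenvalue(j, l_a) == -(1 + k) / 2)
        checks.append(spin_orbit_eigenvalue(j, l_b) == -(1 - k) / 2)
print(all(checks))
```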
Therefore, the Dirac radial equation consists of the two coupled first-order
differential equations for f (r) and g(r). On multiplying by a factor of r and
simplifying the derivatives of f (r)/r, one finds the pair of more symmetrical
equations
$$ \begin{aligned} ( E - V(r) - m c^2 ) \, f(r) + c \hbar \left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) g(r) &= 0 \\ ( E - V(r) + m c^2 ) \, g(r) - c \hbar \left( \frac{\partial}{\partial r} + \frac{\kappa}{r} \right) f(r) &= 0 \end{aligned} \tag{1520} $$
The above pair of equations are the central result of this lecture.
Table 9: The Relationship between j, l_A, l_B, κ and Parity. The parity eigenvalue
is given by $\eta_p = (-1)^{l_A}$ and κ = ±(j + 1/2).

κ                   l_A        l_B        Parity
κ = (j + 1/2)       j + 1/2    j − 1/2    $(-1)^{\kappa}$
κ = −(j + 1/2)      j − 1/2    j + 1/2    $(-1)^{1-\kappa}$
(1521)
However, due to the identity
$$ \Omega^{l_A}_{j,j_z}(\theta, \varphi)^\dagger \, \Omega^{l_A}_{j,j_z}(\theta, \varphi) = \Omega^{l_B}_{j,j_z}(\theta, \varphi)^\dagger \, \Omega^{l_B}_{j,j_z}(\theta, \varphi) = A_{j,|j_z|}(\theta) \tag{1522} $$
the probability is independent of the azimuthal angle ϕ and the sign of jz (just
like in the non-relativistic case) and has a common angular factor of $A_{j,|j_z|}(\theta)$.
Thus, the probability distribution factorizes into a radial and an angular factor
$$ P(\mathbf{r}) = \left( \frac{|f(r)|^2}{r^2} + \frac{|g(r)|^2}{r^2} \right) A_{j,|j_z|}(\theta) \tag{1523} $$
The angular distribution function for a closed shell is given by the sum over the
angular distribution functions. Due to the identity,
$$ \sum_{j_z=-j}^{j} A_{j,|j_z|}(\theta) = \frac{2 j + 1}{4 \pi} \tag{1524} $$
one finds that closed shells are spherically symmetric, as is expected. The first
few angular dependent factors $A_{j,|j_z|}(\theta)$ are given in Table(10) and the
corresponding non-relativistic angular factors are given in Table(11). On
comparing the relativistic angular dependent factors with the non-relativistic
factors $|Y^l_m(\theta, \varphi)|^2$, one finds that they are identical for |jz | = j. Since the
relativistic distribution is the sum of two generally different positive definite
forms originally associated with the two spinors χ+ and χ− , it generally does
not go to zero for non-zero values of θ.
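The closed-shell identity of eqn(1524) can be checked against the tabulated relativistic angular factors. The sketch below transcribes the entries of Table(10) for j = 1/2, 3/2 and 5/2 and sums them over jz, counting each |jz| twice. The function names and the labelling of states by the integers 2j and 2|jz| are illustrative choices.

```python
import math


def a_rel(twice_j, twice_jz, theta):
    """Relativistic angular factor A_{j,|jz|}(theta), labelled by 2j and 2|jz|."""
    c2 = math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    table = {
        (1, 1): 1 / (4 * math.pi),
        (3, 1): (1 + 3 * c2) / (8 * math.pi),
        (3, 3): 3 * s2 / (8 * math.pi),
        (5, 1): 3 * (1 - 2 * c2 + 5 * c2 ** 2) / (16 * math.pi),
        (5, 3): 3 * s2 * (1 + 15 * c2) / (32 * math.pi),
        (5, 5): 15 * s2 ** 2 / (32 * math.pi),
    }
    return table[(twice_j, twice_jz)]


def closed_shell_sum(twice_j, theta):
    """Sum of A_{j,|jz|} over jz = -j, ..., +j (each |jz| occurs twice)."""
    return 2 * sum(a_rel(twice_j, twice_jz, theta)
                   for twice_jz in range(1, twice_j + 1, 2))


worst = 0.0
for twice_j in (1, 3, 5):
    expected = (twice_j + 1) / (4 * math.pi)  # (2j + 1) / (4 pi)
    for theta in (0.0, 0.3, 1.0, math.pi / 2, 2.5):
        worst = max(worst, abs(closed_shell_sum(twice_j, theta) - expected))
print(worst)
```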
Figure 48: The relativistic (left) and non-relativistic (right) angular
distributions $A_{j,|j_z|}(\theta)$ for j = 1/2 and j = 3/2.
Figure 49: The relativistic (left) and non-relativistic (right) angular
distributions $A_{j,|j_z|}(\theta)$ for j = 5/2.
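Table 10: Relativistic Angular Distribution Functions

j      |j_z|    $A_{j,|j_z|}(\theta)$
1/2    1/2      $\frac{1}{4 \pi}$
3/2    1/2      $\frac{1}{8 \pi} ( 1 + 3 \cos^2\theta )$
3/2    3/2      $\frac{3}{8 \pi} \sin^2\theta$
5/2    1/2      $\frac{3}{16 \pi} ( 1 - 2 \cos^2\theta + 5 \cos^4\theta )$
5/2    3/2      $\frac{3}{32 \pi} \sin^2\theta \, ( 1 + 15 \cos^2\theta )$
5/2    5/2      $\frac{15}{32 \pi} \sin^4\theta$

Table 11: Non-Relativistic Angular Distribution Functions

l      |m|      $|Y^l_m(\theta, \varphi)|^2$
0      0        $\frac{1}{4 \pi}$
1      0        $\frac{3}{4 \pi} \cos^2\theta$
1      1        $\frac{3}{8 \pi} \sin^2\theta$
2      0        $\frac{5}{16 \pi} ( 1 - 3 \cos^2\theta )^2$
2      1        $\frac{15}{8 \pi} \sin^2\theta \cos^2\theta$
2      2        $\frac{15}{32 \pi} \sin^4\theta$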
11.10.1 The Hydrogen Atom
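The radial energy eigenvalue equation for a hydrogen-like atom is given by
$$ \begin{aligned} \left( E + \frac{Z e^2}{r} - m c^2 \right) f(r) + c \hbar \left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) g(r) &= 0 \\ \left( E + \frac{Z e^2}{r} + m c^2 \right) g(r) - c \hbar \left( \frac{\partial}{\partial r} + \frac{\kappa}{r} \right) f(r) &= 0 \end{aligned} \tag{1525} $$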
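The above equations will be written in dimensionless units, where the energy is
expressed in terms of the rest mass energy m c2 and lengths are expressed in
terms of the Compton wave length $\frac{\hbar}{m c}$. A dimensionless energy $\epsilon$ is defined
as the ratio of E to the rest mass energy
$$ \epsilon = \frac{E}{m c^2} \tag{1526} $$
For a bound state, m c2 > E > − m c2 so the value of the magnitude of $\epsilon$
is expected to be a little less than unity. A dimensionless radial variable ρ is
introduced which governs the asymptotic large r decay of the bound state wave
function. The variable is defined by
$$ \rho = \sqrt{1 - \epsilon^2} \; \frac{m c}{\hbar} \, r \tag{1527} $$
In terms of these dimensionless variables, the Dirac radial equations for the
hydrogen-like atom become
$$ \begin{aligned} \left( - \sqrt{\frac{1 - \epsilon}{1 + \epsilon}} + \frac{\gamma}{\rho} \right) f + \left( \frac{\partial}{\partial\rho} - \frac{\kappa}{\rho} \right) g &= 0 \\ \left( \sqrt{\frac{1 + \epsilon}{1 - \epsilon}} + \frac{\gamma}{\rho} \right) g - \left( \frac{\partial}{\partial\rho} + \frac{\kappa}{\rho} \right) f &= 0 \end{aligned} \tag{1528} $$
where
$$ \gamma = \frac{Z e^2}{\hbar c} \tag{1529} $$
is a small number.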
Boundary Conditions
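The asymptotic ρ → ∞ form of the solution can be found from the asymptotic
form of the equations
$$ \begin{aligned} - \sqrt{\frac{1 - \epsilon}{1 + \epsilon}} \; f + \frac{\partial g}{\partial\rho} &\sim 0 \\ \sqrt{\frac{1 + \epsilon}{1 - \epsilon}} \; g - \frac{\partial f}{\partial\rho} &\sim 0 \end{aligned} \tag{1530} $$
Hence, on combining these equations, one sees that the asymptotic form of the
equation is given by
$$ \frac{\partial^2 f}{\partial\rho^2} = f \tag{1531} $$
Therefore, one has
$$ f \sim A \exp[ - \rho ] + B \exp[ + \rho ] \tag{1532} $$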
one must choose the positive solution for s. Normalizability near the origin
requires that 2 s > − 1. Hence, one may set
$$ s = \sqrt{\kappa^2 - \gamma^2} \tag{1538} $$
This will be a good solution for κ = − 1 if Z does not exceed a critical value.
For values of Z greater than ≈ 172, the point charge can spark the vacuum
and spontaneously generate electron-positron pairs101. The solution with the
negative value of s given by
$$ s = - \sqrt{\kappa^2 - \gamma^2} \tag{1539} $$
could also possibly exist and be normalizable if γ is greater than a critical value
γc determined as
$$ \frac{1}{2} = \sqrt{1 - \gamma_c^2} \tag{1540} $$
This critical value of γ is found to be
$$ \gamma_c = \frac{\sqrt{3}}{2} \tag{1541} $$
which corresponds to Zc ∼ 118. The solutions corresponding to negative s
are, in fact, un-physical and do not survive if the nucleus is considered to have
a finite spatial extent.
We shall use the Frobenius method to find a solution. The solutions of the
radial equation shall be written in the form
$$ \begin{aligned} f(r) &= \exp[ - \rho ] \, \rho^s \, F(\rho) \\ g(r) &= \exp[ - \rho ] \, \rho^s \, G(\rho) \end{aligned} \tag{1542} $$
H. Bokemeyer, P. Vincent, Y. Nakayama, and J. S. Greenberg, Phys. Rev. Lett. 40, 1443
(1978).
where the coefficients $a_n$ and $b_n$ are constants which have still to be determined.
The coefficients are determined by substituting the series in the differential
equation and then equating the coefficients of the same power in ρ. Equating
the coefficient of $\rho^n$ yields the set of relations
$$ \begin{aligned} - \sqrt{\frac{1 - \epsilon}{1 + \epsilon}} \; a_{n-1} + \gamma \, a_n + ( n + s - \kappa ) \, b_n - b_{n-1} &= 0 \\ \sqrt{\frac{1 + \epsilon}{1 - \epsilon}} \; b_{n-1} + \gamma \, b_n - ( n + s + \kappa ) \, a_n + a_{n-1} &= 0 \end{aligned} \tag{1545} $$
valid for any n. The above equation can be used to eliminate the coefficients
$b_n$ and yield a recursion relation between $a_n$ and $a_{n-1}$. The ensuing recursion
relation will enable us to explicitly calculate the wave functions G(ρ) and hence
F (ρ).
The behavior of the recursion relation for large values of n can be found by
noting that eqn(1547) yields
$$ n \, a_n \sim \sqrt{\frac{1 + \epsilon}{1 - \epsilon}} \; n \, b_n \tag{1548} $$
which when substituted back into the large n limit of the first relation of
eqn(1545) yields
$$ n \, a_n \sim 2 \, a_{n-1} \tag{1549} $$
Since the large ρ limit of the function is dominated by the highest powers of
ρ, it is seen that if the series does not terminate, the functions F (ρ) and G(ρ)
would be exponentially growing functions of ρ
$$ \begin{aligned} F(\rho) &\sim \exp[ + 2 \rho ] \\ G(\rho) &\sim \exp[ + 2 \rho ] \end{aligned} \tag{1550} $$
Therefore, the set of recursion relations must terminate, since if the series does
not terminate, the large ρ behavior of the functions F (ρ) and G(ρ) would be
governed by the growing exponentials. Even when combined with the decaying
exponential terms that appear in the relations
$$ \begin{aligned} f(r) &= \rho^s \, F(\rho) \, \exp[ - \rho ] \\ g(r) &= \rho^s \, G(\rho) \, \exp[ - \rho ] \end{aligned} \tag{1551} $$
the resulting functions f (r) and g(r) would not satisfy the required boundary
conditions at ρ → ∞. We shall assume that the series truncate after the $n_r$-th
terms. That is, it is possible to set
$$ \begin{aligned} a_{n_r+1} &= 0 \\ b_{n_r+1} &= 0 \end{aligned} \tag{1552} $$
Thus, the components of the radial wave function may have $n_r$ nodes. Assuming
that the coefficients with indices $n_r + 1$ vanish and using the first relation in
eqn(1545) with $n = n_r + 1$, one obtains the condition
$$ \sqrt{\frac{1 + \epsilon}{1 - \epsilon}} \; b_{n_r} = - a_{n_r} \tag{1553} $$
A second condition is given by the relation between $a_n$ and $b_n$
$$ \left[ \sqrt{\frac{1 + \epsilon}{1 - \epsilon}} \; \gamma - ( n + s + \kappa ) \right] a_n + \left[ \gamma + \sqrt{\frac{1 + \epsilon}{1 - \epsilon}} \; ( n + s - \kappa ) \right] b_n = 0 \tag{1554} $$
valid for any n. We shall set $n = n_r$ and then eliminate $a_{n_r}$ using the termination
condition expressed by eqn(1553). After some simplification, this leads to the
equation
$$ \epsilon \, \gamma = ( n_r + s ) \sqrt{1 - \epsilon^2} \tag{1555} $$
This equation determines the square of the dimensionless energy eigenvalue $\epsilon^2$.
On squaring this equation, simplifying and taking the square root, one finds
$$ \epsilon = \pm \frac{( n_r + s )}{\sqrt{( n_r + s )^2 + \gamma^2}} \tag{1556} $$
or, equivalently, the energy of the hydrogen atom102 is given by
$$ E = \pm \frac{m c^2}{\sqrt{1 + \frac{\gamma^2}{( n_r + s )^2}}} \tag{1557} $$
where
$$ s = \sqrt{\left( j + \frac{1}{2} \right)^2 - \gamma^2} \tag{1558} $$
This expression for the energy eigenvalue is independent of the sign of κ and,
therefore, it holds for both cases
$$ \begin{aligned} j &= ( l + 1 ) - \frac{1}{2} \\ j &= l + \frac{1}{2} \end{aligned} \tag{1559} $$
Hence, the energy eigenstates are predicted to be doubly degenerate (in addition
to the (2j + 1) degeneracy associated with j3 ), since states with the same j but
different values of the orbital angular momentum (l or l' = l + 1) have the same
energy. If the positive-energy eigenvalue is expanded in powers of γ, one obtains
$$ E \approx m c^2 - \frac{1}{2} \, m c^2 \, \frac{\gamma^2}{( n_r + j + \frac{1}{2} )^2} + \ldots \tag{1560} $$
which agrees with the energy eigenvalues found from the non-relativistic
Schrödinger equation. However, as has been seen, the exact energy eigenvalue
depends on $n_r$ and $(j + \frac{1}{2})$ separately, as opposed to being a function of the
principal quantum number n which is defined as the sum $n = n_r + j + \frac{1}{2}$. Hence,
the Dirac equation lifts the degeneracy between states with different values of
the angular momentum. The energy levels together with their quantum numbers
are shown in Table(12). The energy splitting between states with the same n and
different j values has a magnitude which is governed by the square of the fine
structure constant $Z ( \frac{e^2}{\hbar c} )$. That is
$$ E \approx m c^2 \left[ 1 - \frac{1}{2} \frac{\gamma^2}{( n_r + j + \frac{1}{2} )^2} - \frac{1}{2} \frac{\gamma^4}{( n_r + j + \frac{1}{2} )^3} \left( \frac{1}{( j + \frac{1}{2} )} - \frac{3}{4 ( n_r + j + \frac{1}{2} )} \right) + \ldots \right] \tag{1561} $$
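The exact eigenvalue of eqn(1556) and the expansion of eqn(1561) can be compared numerically. The following is a minimal sketch for Z = 1; the numerical values of the fine structure constant and of the electron rest energy are approximate inputs assumed here, and the function names are illustrative. The last line estimates the splitting between the n = 2 levels with j = 3/2 (nr = 0, κ = −2) and j = 1/2 (nr = 1, κ = ∓1).

```python
import math

ALPHA = 1 / 137.035999   # e^2 / (hbar c), approximate value (assumed input)
MC2_EV = 510998.95       # electron rest energy in eV, approximate (assumed input)


def dirac_energy(n_r, kappa, Z=1):
    """Positive root of eqn(1556): E / (m c^2) = (n_r + s) / sqrt((n_r + s)^2 + gamma^2)."""
    gamma = Z * ALPHA
    s = math.sqrt(kappa ** 2 - gamma ** 2)
    return (n_r + s) / math.sqrt((n_r + s) ** 2 + gamma ** 2)


def expanded_energy(n_r, kappa, Z=1):
    """Expansion of eqn(1561) through order gamma^4, with j + 1/2 = |kappa|."""
    gamma = Z * ALPHA
    j_plus_half = abs(kappa)
    n = n_r + j_plus_half
    return (1 - gamma ** 2 / (2 * n ** 2)
            - gamma ** 4 / (2 * n ** 3) * (1 / j_plus_half - 3 / (4 * n)))


# The 2S_{1/2} and 2P_{1/2} levels share n_r = 1 and |kappa| = 1: exactly degenerate.
degeneracy_gap = dirac_energy(1, -1) - dirac_energy(1, +1)

# Fine structure splitting between 2P_{3/2} and the j = 1/2 levels, in eV.
split_ev = (dirac_energy(0, -2) - dirac_energy(1, -1)) * MC2_EV

# Difference between the exact expression and the expansion for the same level.
expansion_error = abs(dirac_energy(1, -1) - expanded_energy(1, -1))
print(degeneracy_gap, split_ev, expansion_error)
```

With these inputs the splitting comes out at a few times 10^-5 eV, of order γ4 m c2/32, which illustrates why the fine structure is small for Hydrogen and grows rapidly with Z.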
The fine structure splittings for H-like atoms were first observed by Michelson103
and the theoretical prediction is in agreement with the accurate measurements
102 C. G. Darwin, Proc. Roy. Soc. A 118, 654 (1928).
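Table 12: The Equivalence between Relativistic and Spectroscopic Quantum
Numbers.

n = n_r + |κ|   n_r   κ = ±(j + 1/2)   nL_j       Degenerate Partner   E / (m c^2)
1               0     −1               1S_{1/2}                        $\sqrt{1 - \gamma^2}$
2               1     −1               2S_{1/2}   2P_{1/2}             $\sqrt{1 - \frac{\gamma^2}{2 + 2 \sqrt{1 - \gamma^2}}}$
2               1     +1               2P_{1/2}   2S_{1/2}             --
2               0     −2               2P_{3/2}                        $\sqrt{1 - \frac{1}{4} \gamma^2}$
3               2     −1               3S_{1/2}   3P_{1/2}             $\sqrt{1 - \frac{\gamma^2}{5 + 4 \sqrt{1 - \gamma^2}}}$
3               2     +1               3P_{1/2}   3S_{1/2}             --
3               1     −2               3P_{3/2}   3D_{3/2}             $\sqrt{1 - \frac{\gamma^2}{5 + 2 \sqrt{4 - \gamma^2}}}$
3               1     +2               3D_{3/2}   3P_{3/2}             --
3               0     −3               3D_{5/2}                        $\sqrt{1 - \frac{1}{9} \gamma^2}$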
of Paschen104. The fine structure splitting is important for atoms with larger Z.
This observation has a classical interpretation which reflects the fact that for
large Z the electrons move in orbits with smaller radii and, therefore, the
electrons must move faster. Relativistic effects become more important for electrons
which move faster, and this occurs for atoms with larger values of Z. Although
the fine structure splitting does remove some degeneracy, the two states with
the same principal quantum number n and the same angular momentum j but
which have different values of l are still predicted to be degenerate. Thus, for
example, the 2S_{j=1/2} and the 2P_{j=1/2} states of Hydrogen are predicted to be
degenerate by the Dirac equation. It has been shown that this degeneracy is
removed by the Lamb shift, which is due to the interaction of an electron with
its own radiation field. The Lamb shift is smaller than the fine structure shifts
discussed above because it involves an extra factor of e^2 / ( ħ c ).
The ground state wave function of the hydrogen atom is slightly singular
at the origin. This can be seen by noting that it corresponds to nr = 0 and
κ = −1. Since the dimensionless energy is given by the expression
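\[ \frac{E}{m c^2} = \sqrt{1 - \gamma^2} \qquad (1562) \]
\[ \gamma \, a_0 + ( s + 1 ) \, b_0 = 0 \]
\[ \gamma \, b_0 - ( s - 1 ) \, a_0 = 0 \qquad (1565) \]
and
\[ \frac{b_0}{a_0} = \frac{s + \kappa}{\gamma} = \frac{\sqrt{1 - \gamma^2} - 1}{\gamma} \qquad (1567) \]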
This shows that the lower component is smaller than the upper component by
a factor of order γ, which has the magnitude of v/c where v is the velocity in Bohr’s
theory. The ratio of b_0 to a_0 determines the radial functions as
\[ \frac{f(r)}{r} = a_0 \, \rho^{s-1} \exp[ - \rho ] \]
\[ \frac{g(r)}{r} = \frac{\sqrt{1 - \gamma^2} - 1}{\gamma} \, a_0 \, \rho^{s-1} \exp[ - \rho ] \qquad (1568) \]
[Figure 50: The large f(r) and small component g(r) radial wave functions for
the 1S_{1/2} ground state of Hydrogen. The curves f(r) and g(r) × 100 are plotted
against m c γ r/ħ.]
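Since Y^0_0(θ, ϕ) = 1/\sqrt{4π}, the angular spherical harmonics for the upper
components are just
\[ \Omega_A(\theta, \varphi) = \frac{1}{\sqrt{4 \pi}} \, \chi_\sigma \qquad (1569) \]
and the lower components are given by
\[ \Omega_B(\theta, \varphi) = - \frac{r \cdot \sigma}{r} \, \frac{1}{\sqrt{4 \pi}} \, \chi_\sigma
 = - \frac{1}{\sqrt{4 \pi}}
 \begin{pmatrix} \cos\theta & \sin\theta \, \exp[ - i \varphi ] \\ \sin\theta \, \exp[ + i \varphi ] & - \cos\theta \end{pmatrix}
 \chi_\sigma \qquad (1570) \]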
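Thus, apart from an overall normalization factor, the four-component spinor
Dirac wave function ψ is given by
\[ \psi = \frac{N}{\sqrt{4 \pi}} \, \rho^{\sqrt{1 - \gamma^2} - 1} \exp[ - \rho ]
 \begin{pmatrix} \chi_\sigma \\ - i \, \dfrac{\sqrt{1 - \gamma^2} - 1}{\gamma} \, \dfrac{r \cdot \sigma}{r} \, \chi_\sigma \end{pmatrix} \qquad (1571) \]
where the lower component carries the ratio b_0/a_0 of eqn(1567).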
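Hence, it is seen that as ρ approaches the origin, at first the wave function is
slowly varying since
\[ \rho^{\sqrt{1 - \gamma^2} - 1} \sim \exp\left[ - \frac{\gamma^2}{2} \ln \rho \right] \sim 1 - \frac{\gamma^2}{2} \ln \rho \qquad (1572) \]
but for distances smaller than the characteristic length scale
\[ r_c = \frac{\hbar}{m c \gamma} \exp\left[ - \frac{2}{\gamma^2} \right] \qquad (1573) \]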
the wave function exhibits a slight singularity. This length scale is much smaller
than the nuclear radius so, due to the spatial distribution of the nuclear charge,
the singularity is largely irrelevant. This singularity is not present in the
non-relativistic limit, since in this limit one assumes that the inequality | V(r) | ≪
m c^2 always holds, although this assumption is invalid for r ∼ 0. Therefore, one
concludes that the relativistic theory differs from the non-relativistic theory at
small distances, which could have been discerned from the use of the Heisenberg
uncertainty principle.
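To give a feeling for how small the length scale of eq. (1573) is for hydrogen, the sketch below evaluates log10(r_c) rather than r_c itself, since exp(-2/γ^2) underflows ordinary floating-point numbers. The numerical constants and the function name are assumptions made for this illustration.

```python
import math

ALPHA = 1 / 137.035999                # assumed value of e^2/(hbar c)
REDUCED_COMPTON_M = 3.8615927e-13     # assumed value of hbar/(m c) for the electron, in metres


def log10_rc_metres(Z):
    """log10 of r_c = (hbar/(m c gamma)) exp(-2/gamma^2) with gamma = Z alpha, eq. (1573)."""
    gamma = Z * ALPHA
    # Work with logarithms because exp(-2/gamma^2) is far below the float range for small Z.
    return math.log10(REDUCED_COMPTON_M / gamma) - (2.0 / gamma ** 2) / math.log(10.0)


if __name__ == "__main__":
    print(f"Z = 1: log10(r_c / metre) = {log10_rc_metres(1):.1f}")
```

For Z = 1 the exponent is many thousands of orders of magnitude below any nuclear length scale (of order 10^-15 m), which is the sense in which the singularity is irrelevant.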
[Figure 51: The radial wave functions for the 2S_{1/2} and 2P_{1/2} states of Hydrogen.
Each panel shows f(r) and g(r) × 100 plotted against m c γ r/ħ.]
where the length scale a is given in terms of the energy E and the Compton
wavelength by
\[ a = \left[ 1 - \left( \frac{E}{m c^2} \right)^2 \right]^{-\frac{1}{2}} \frac{\hbar}{m c} \qquad (1577) \]
The values of the indices s, energy E, length scale a and normalization N
are given in Table(13). Since the two-component spinor spherical harmonics
Ω^l_{j,j_z}(θ, ϕ) are normalized to unity, the normalization condition is determined
from the integral
\[ N^2 \int_0^\infty dr \, \left( | f |^2 + | g |^2 \right) = 1 \qquad (1578) \]
involving the radial wave functions. The integral is evaluated with the aid of
the identity
\[ \int_0^\infty d\rho \; \rho^{a+b} \exp[ - 2 \rho ] = 2^{-(a+b+1)} \, \Gamma(a + b + 1) \qquad (1579) \]
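The identity (1579) can be spot-checked numerically. The sketch below compares a composite Simpson estimate of the integral with the Gamma-function expression; the function names, the finite upper cut-off and the number of steps are choices made here for illustration.

```python
import math


def simpson_integral(p, upper=60.0, steps=20000):
    """Composite Simpson estimate of the integral of rho^p exp(-2 rho) over [0, upper]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = upper / steps

    def integrand(rho):
        return rho ** p * math.exp(-2.0 * rho)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def gamma_formula(p):
    """Right-hand side of eq. (1579) with p = a + b."""
    return 2.0 ** (-(p + 1.0)) * math.gamma(p + 1.0)


if __name__ == "__main__":
    for p in (2.0, 2.0 * math.sqrt(1.0 - (1 / 137.035999) ** 2), 3.5):
        print(f"p = {p:.6f}: numeric = {simpson_integral(p):.10f}, formula = {gamma_formula(p):.10f}")
```

The non-integer exponent in the second case mimics p = 2s for the hydrogen ground state, where s = \sqrt{1 - γ^2}.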
The coefficients c_n and d_n in the above expansion of the radial functions differ
from the coefficients a_n and b_n that occur in the Frobenius expansion, since the
value of the ratio c_{n_r}/d_{n_r} has been chosen to simplify in the limit of large n. In
particular, at the value of n_r (at which the series terminates), the ratio is chosen
to satisfy
\[ \frac{c_{n_r}}{d_{n_r}} = 1 \qquad (1580) \]
instead of the condition
\[ \frac{a_{n_r}}{b_{n_r}} = - \sqrt{ \frac{1 + ( \frac{E}{m c^2} )}{1 - ( \frac{E}{m c^2} )} } \qquad (1581) \]
The relative negative sign and the square root factors in the coefficients have
been absorbed into the expressions for the upper and lower components f (r) and
g(r). The square root factors are responsible for converting the upper and lower
components, respectively, into the large and small components for positive E,
and vice versa for negative E. The expansion coefficients are given in Table(14).
Since the ratio of the magnitudes of the polynomial factors is generally of the
order of unity, the ratio of the magnitudes of the small to large components is
found to be of the order of γ.
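Table 13: Parameters specifying the Radial Functions for the Hydrogen atom.

State             | s                   | E / (m c^2)                             | a γ m c / ħ          | N
1S_{1/2}, κ = −1  | \sqrt{1 - \gamma^2} | \sqrt{1 - \gamma^2}                     | 1                    | \frac{1}{\sqrt{a}} \frac{2^{s + \frac{1}{2}}}{\sqrt{2 \Gamma(2s+1)}}
2S_{1/2}, κ = −1  | \sqrt{1 - \gamma^2} | \sqrt{\frac{1 + \sqrt{1 - \gamma^2}}{2}} | 2 \frac{E}{m c^2}    | \sqrt{\frac{( \frac{2E}{m c^2} - 1 )}{2 ( \frac{2E}{m c^2} )}} \frac{1}{\sqrt{a}} \frac{2^{s + \frac{1}{2}}}{\sqrt{\Gamma(2s+1)}}
2P_{1/2}, κ = 1   | \sqrt{1 - \gamma^2} | \sqrt{\frac{1 + \sqrt{1 - \gamma^2}}{2}} | 2 \frac{E}{m c^2}    | \sqrt{\frac{( \frac{2E}{m c^2} + 1 )}{2 ( \frac{2E}{m c^2} )}} \frac{1}{\sqrt{a}} \frac{2^{s + \frac{1}{2}}}{\sqrt{\Gamma(2s+1)}}
2P_{3/2}, κ = −2  | \sqrt{4 - \gamma^2} | \sqrt{1 - \frac{\gamma^2}{4}}           | 2                    | \frac{1}{\sqrt{a}} \frac{2^{s + \frac{1}{2}}}{\sqrt{2 \Gamma(2s+1)}}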
treated by first-order perturbation theory, yield the fine structure. The physical
interpretation of the interactions will be examined. Historically, the following
type of analysis and the ensuing discussion of the Thomas precession played a
decisive role in compelling Pauli to reluctantly accept Dirac’s theory.
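Table 14: Coefficients for the Polynomial in the Hydrogen atom Radial
Wavefunctions.

State             | c_0                   | c_1                                      | d_0                   | d_1
1S_{1/2}, κ = −1  | 1                     | 0                                        | 1                     | 0
2S_{1/2}, κ = −1  | 2 \frac{E}{m c^2}     | \frac{( \frac{2E}{m c^2} ) + 1}{2 s + 1} | 2 \frac{E}{m c^2} + 1 | \frac{( \frac{2E}{m c^2} ) + 1}{2 s + 1}
2P_{1/2}, κ = 1   | 2 \frac{E}{m c^2} - 1 | \frac{( \frac{2E}{m c^2} ) - 1}{2 s + 1} | 2 \frac{E}{m c^2}     | \frac{( \frac{2E}{m c^2} ) - 1}{2 s + 1}
2P_{3/2}, κ = −2  | 1                     | 0                                        | 1                     | 0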
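Also the non-relativistic energy ε will be defined as the energy referenced with
respect to the rest-mass energy
\[ E = m c^2 + \epsilon \qquad (1584) \]
The coupled equations reduce to
\[ ( \epsilon - V ) \, \phi_A = - i \hbar c \, ( \sigma \cdot \nabla ) \, \phi_B \]
\[ ( \epsilon - V + 2 m c^2 ) \, \phi_B = - i \hbar c \, ( \sigma \cdot \nabla ) \, \phi_A \qquad (1585) \]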
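The pair of equations will be expanded in powers of ( p / m c )^2 and only the
first-order relativistic corrections will be retained. One can express φ_B as
\[ \begin{aligned}
 \phi_B &= \frac{- i \hbar c \, ( \sigma \cdot \nabla ) \, \phi_A}{\epsilon - V + 2 m c^2} \\
 &= \frac{1}{2 m c} \left( 1 + \frac{\epsilon - V}{2 m c^2} \right)^{-1} ( \sigma \cdot \hat{p} ) \, \phi_A \\
 &\approx \frac{1}{2 m c} \left( 1 - \frac{\epsilon - V}{2 m c^2} + \ldots \right) ( \sigma \cdot \hat{p} ) \, \phi_A
 \end{aligned} \qquad (1586) \]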
to the required order of approximation. The above equation can be used to
obtain a Schrödinger-like equation for the two-component spinor φA . Since a
Schrödinger equation is sought for ψS , a correspondence must be established
between the pair of spinors (φA ,φB ) and ψS . The probability density is the
physical quantity which is directly associated with both types of wave functions.
The probability density associated with the Schrödinger equation should be
equivalent to the probability density associated with the Dirac equation. The
probability density associated with the four-component Dirac spinor depends
on both φA and φB ,
\[ P(r) = \phi_A^\dagger \phi_A + \phi_B^\dagger \phi_B \qquad (1587) \]
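\[ = \frac{( \sigma \cdot \hat{p} )}{2 m} \left[ ( \sigma \cdot \hat{p} ) \left( I - \frac{\hat{p}^2}{8 m^2 c^2} \right)
 - \frac{\epsilon - V}{2 m c^2} \, ( \sigma \cdot \hat{p} ) \right] \psi_S \qquad (1593) \]
or
\[ \left[ \epsilon - V - \epsilon \, \frac{\hat{p}^2}{8 m^2 c^2} + V \, \frac{\hat{p}^2}{8 m^2 c^2} \right] \psi_S
 = \left[ \frac{\hat{p}^2}{2 m} \left( I - \frac{\hat{p}^2}{8 m^2 c^2} \right)
 - ( \sigma \cdot \hat{p} ) \, \frac{\epsilon - V}{4 m^2 c^2} \, ( \sigma \cdot \hat{p} ) \right] \psi_S \qquad (1594) \]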
where the relativistic corrections are symmetric in p^2 and V. This represents the
energy eigenvalue equation for a two-component wave function ψ_S, similar to
the Schrödinger wave function, but the above equation does include relativistic
corrections to the Hamiltonian. The first correction term is
\[ \hat{H}_{Kin} = - \frac{\hat{p}^4}{8 m^3 c^2} \qquad (1599) \]
which is recognized as the relativistic kinematic energy correction, which
originates from the expansion of the kinetic energy
\[ \epsilon = \sqrt{m^2 c^4 + p^2 c^2} - m c^2 \approx \frac{p^2}{2 m} - \frac{p^4}{8 m^3 c^2} + \ldots \qquad (1600) \]
The remaining two correction terms
\[ ( \sigma \cdot \hat{p} ) \, \frac{V}{4 m^2 c^2} \, ( \sigma \cdot \hat{p} ) - \frac{\hat{p}^2 V + V \hat{p}^2}{8 m^2 c^2} \qquad (1601) \]
will be interpreted as the sum of the spin-orbit interaction and the Darwin term.
It should be noted that the sum of these two terms would identically cancel in
a purely classical theory. This cancellation can be shown to occur since, in the
classical limit, V and p commute, and then the Pauli-identity can be used to
show that the resulting pairs of terms cancel.
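The Pauli identity ( σ . a ) ( σ . b ) = a . b I + i σ . ( a ∧ b ), which underlies the classical cancellation described above, can be verified numerically for ordinary commuting vectors. The helper names below are hypothetical and the 2 × 2 matrices are represented as plain nested lists.

```python
import random

# Pauli matrices sigma^(1), sigma^(2), sigma^(3) as nested tuples of complex numbers.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a three-component vector v."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def cross(a, b):
    """Vector product a ^ b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pauli_identity_residual(a, b):
    """Largest element of (sigma.a)(sigma.b) - [a.b I + i sigma.(a ^ b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    dot = sum(x * y for x, y in zip(a, b))
    spin = sigma_dot(cross(a, b))
    return max(abs(lhs[i][j] - (dot * (i == j) + 1j * spin[i][j]))
               for i in range(2) for j in range(2))


def classical_cancellation_residual(p, V):
    """Largest element of 2 (sigma.p) V (sigma.p) - (p^2 V + V p^2) I for commuting numbers."""
    sp2 = matmul(sigma_dot(p), sigma_dot(p))
    p2 = sum(x * x for x in p)
    return max(abs(2 * V * sp2[i][j] - 2 * V * p2 * (i == j))
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    random.seed(0)
    a = [random.uniform(-1, 1) for _ in range(3)]
    b = [random.uniform(-1, 1) for _ in range(3)]
    print("Pauli identity residual:", pauli_identity_residual(a, b))
    print("classical cancellation residual:", classical_cancellation_residual(a, 0.7))
```

Both residuals vanish to rounding error for c-number vectors; the quantum-mechanical terms survive only because p̂ and V do not commute.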
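The factor
\[ 2 \, ( \sigma \cdot \hat{p} ) \, V \, ( \sigma \cdot \hat{p} ) - \left( \hat{p}^2 V + V \hat{p}^2 \right) \qquad (1602) \]
can be evaluated as
\[ 2 \, \hat{p} \cdot V \hat{p} - \left( \hat{p}^2 V + V \hat{p}^2 \right) + 2 i \, \sigma \cdot \left( \hat{p} \wedge V \hat{p} \right) \qquad (1603) \]
The first two terms can be combined to form a double commutator, yielding
\[ - [ \, \hat{p} \, , [ \, \hat{p} \, , V \, ] \, ] + 2 i \, \sigma \cdot \left( \hat{p} \wedge V \hat{p} \right) \qquad (1604) \]
or
\[ + \hbar^2 \nabla^2 V + 2 i \, \sigma \cdot \left( \hat{p} \wedge V \hat{p} \right) \qquad (1605) \]
since
\[ \hat{p} \wedge \hat{p} \equiv 0 \qquad (1607) \]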
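Using these substitutions, the remaining interactions can be expressed as the
sum of the spin-orbit interaction and the Darwin interaction
\[ \hat{H}_{SO} + \hat{H}_{Darwin} = + \frac{\hbar}{4 m^2 c^2} \, \sigma \cdot \left( \nabla V \wedge \hat{p} \right) + \frac{\hbar^2}{8 m^2 c^2} \nabla^2 V \qquad (1608) \]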
The first term is the spin-orbit interaction term, and the second term is the
Darwin term. For central potentials, the Darwin term is only important for
electrons with l = 0. The evaluation and the physical interpretation of the
energy shifts due to the three fine-structure interactions will be discussed sepa-
rately.
The first-order energy shift due to the kinematic correction Ĥ_Kin can be
evaluated by using the solution to the non-relativistic Schrödinger equation
\[ \frac{\hat{p}^2}{2 m} \, \psi_S = \left[ - \frac{m c^2}{2 n^2} \left( \frac{Z e^2}{\hbar c} \right)^2 + \frac{Z e^2}{r} \right] \psi_S \qquad (1611) \]
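which leads to
\[ \begin{aligned}
 \Delta E_{Kin} &= - \int d^3 r \; \psi_S^\dagger(r) \, \frac{\hat{p}^4}{8 m^3 c^2} \, \psi_S(r) \\
 &= - \frac{1}{2 m c^2} \int d^3 r \; \psi_S^\dagger(r) \left[ - \frac{m c^2}{2 n^2} \left( \frac{Z e^2}{\hbar c} \right)^2 + \frac{Z e^2}{r} \right]^2 \psi_S(r) \\
 &= m c^2 \left( \frac{Z e^2}{\hbar c} \right)^4 \frac{3}{8 n^4} - \frac{Z^2 e^4}{2 m c^2} \int d^3 r \; \psi_S^\dagger(r) \, \frac{1}{r^2} \, \psi_S(r)
 \end{aligned} \qquad (1612) \]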
Hence, the first-order energy shift due to the kinematic correction is evaluated
as
\[ \Delta E_{Kin} = m c^2 \left( \frac{Z e^2}{\hbar c} \right)^4 \left[ \frac{3}{8 n^4} - \frac{1}{n^3 \, ( 2 l + 1 )} \right] \qquad (1613) \]
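To get a sense of the size of the shift in eq. (1613), the following sketch evaluates it in electron volts for a few low-lying levels. The numerical values of m c^2 and e^2/(ħ c), and the function name, are assumptions introduced for illustration.

```python
ALPHA = 1 / 137.035999   # assumed value of e^2/(hbar c)
MC2_EV = 510998.95       # assumed electron rest energy m c^2 in eV


def kinematic_shift_ev(n, l, Z=1):
    """First-order kinematic energy shift of eq. (1613), in eV."""
    gamma = Z * ALPHA
    return MC2_EV * gamma ** 4 * (3.0 / (8.0 * n ** 4) - 1.0 / (n ** 3 * (2 * l + 1)))


if __name__ == "__main__":
    for n, l in [(1, 0), (2, 0), (2, 1)]:
        print(f"n = {n}, l = {l}: Delta E_Kin = {kinematic_shift_ev(n, l):.3e} eV")
```

The shift is negative for every level, and within the n = 2 shell it is larger in magnitude for l = 0 than for l = 1, which is how this term lifts the degeneracy in l.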
This term is found to lift the degeneracy between states with a fixed principal
quantum number n and different values of the angular momentum l. The relativistic
kinematic correction to the energy is found to be smaller than the non-relativistic
energy by a factor of
\[ \left( \frac{Z e^2}{\hbar c} \right)^2 \sim Z^2 \times 10^{-4} \qquad (1614) \]
which can be identified with a factor of ( v/c )^2, as can be inferred from an analysis
based on the Bohr model of the atom. One sees that the relativistic corrections
become more important for atoms with larger Z, since the correction varies as
Z^4. This occurs because for larger Z the electrons are drawn closer to the
nucleus and, hence, have higher kinetic energies, so the electrons’ velocities draw
closer to the velocity of light.
which is caused by the apparent rotation of the charged nucleus. In the electron’s
rest frame, the electron’s spin S should interact with the magnetic field through
the Zeeman interaction
\[ \hat{H}^{rest}_{Int} = - \frac{q}{2 m c} \, g_S \, B' \cdot S \qquad (1618) \]
where g_S is the gyromagnetic ratio for the electron’s spin. Dirac’s theory
predicts that the spin is a relativistic phenomenon and also that g_S = 2 for an
electron in its rest frame. This interaction with the magnetic field will cause
the spin of the electron to precess. The spin precession rate found in the
electron’s rest frame is calculated as
\[ \omega_{rest} = \frac{e}{2 m c} \, g_S \, B' \qquad (1619) \]
However, the electron is bound to the nucleus and is orbiting with angular
momentum L. Therefore, one has to consider the corrections to the precession
rate (and the interaction) caused by the acceleration of the electron’s rest frame.
Thomas Precession
In the electron’s rest frame, the gyromagnetic ratio that governs the coupling
of the spin to the orbital magnetic field B' (caused by the charged nucleus) is
given by g_S = 2. This gyromagnetic ratio yields a spin precession rate in the
electron’s rest frame of
\[ \omega_{rest} = \frac{e}{2 m c} \, g_S \, B' \qquad (1620) \]
The spin precession rate observed in the lab frame will be calculated later. The
rate of precession as observed in the electron’s rest frame has to be corrected
by taking into account the motion of the electron. The correction is due to
the non-additivity of velocities in successive Lorentz transformations. First,
the transformation properties of Dirac spinors under infinitesimal rotations and
boosts will be re-examined. Secondly, infinitesimal transformations will be suc-
cessively applied to describe the particle’s instantaneous rest frame and the
Thomas precession.
infinitesimal Lorentz transform has the non-zero elements
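Hence, for a passive rotation through an infinitesimal angle δϕ, the four-component
Dirac spinor is rotated by
\[ \hat{R}(\delta\varphi) = \hat{I} + i \, \frac{\delta\varphi}{2} \sum_k \xi^{i,j,k}
 \begin{pmatrix} \sigma^{(k)} & 0 \\ 0 & \sigma^{(k)} \end{pmatrix} + \ldots \qquad (1625) \]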
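which can be expressed in terms of the projection of the (block diagonal) spin
operator Ŝ = (ħ/2) σ̂ on the axis of rotation ê as
\[ \begin{aligned}
 \hat{R}(\delta\varphi) &= \hat{I} + \frac{i \, \delta\varphi}{2} \, ( \hat{e} \cdot \hat{\sigma} ) + \ldots \\
 &= \hat{I} + \frac{i \, \delta\varphi}{\hbar} \, ( \hat{e} \cdot \hat{S} ) + \ldots
 \end{aligned} \qquad (1626) \]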
which is in accord with the definition of spin Ŝ as the generator of rotations.
If the primed frame of reference has a velocity v along the k-axis relative
to the un-primed frame, the infinitesimal Lorentz transform has the non-zero
elements
\[ \epsilon^{0,k} = - \epsilon^{k,0} = - \frac{v}{c} \qquad (1627) \]
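A Lorentz boost along the k-axis corresponds to a rotation in the 0 - k plane
through an “angle” χ
\[ \hat{R}(\chi) = \exp\left[ + i \, \frac{\chi}{2} \, \sigma^{0,k} \right] \qquad (1628) \]
where the “angle” χ is governed by the boost velocity v through
\[ \tanh \chi = \frac{v}{c} \qquad (1629) \]
However,
\[ \sigma^{0,k} = \frac{i}{2} \, [ \, \gamma^{(0)} , \gamma^{(k)} \, ] = i \, \alpha^{(k)} \qquad (1630) \]
so
\[ \hat{R}(\chi) = \exp\left[ - \frac{\chi}{2} \, \alpha^{(k)} \right] \qquad (1631) \]
Therefore, for a Lorentz boost with an infinitesimal velocity v along the k-th
direction, one finds
\[ \hat{R}(\chi) = \hat{I} - \frac{v}{2 c} \, \alpha^{(k)} + \ldots \qquad (1632) \]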
The infinitesimal transformation is guaranteed to be consistent with the source
free solution of the Dirac equation. For example, if the above transformation is
applied to the solution of the Dirac equation describing a positive-energy parti-
cle at rest, the transformed solution describes a particle moving with momentum
p = − m v when viewed from the moving frame of reference.
[Figure 52: A cartoon depicting a rotating charged spin one-half particle, along
with the precession of the spin due to the external field in the particle’s rest
frame (ω_Rest) and the Thomas precession (ω_T).]
The Pauli identity can be used to evaluate the last term
\[ ( \alpha \cdot a ) ( \alpha \cdot v ) = ( \hat{\sigma} \cdot a ) ( \hat{\sigma} \cdot v )
 = a \cdot v \; \hat{I} + i \, \hat{\sigma} \cdot ( a \wedge v ) \qquad (1637) \]
where the product of the two α’s yields a two by two block diagonal form
which involves the four by four matrices Î and σ̂. Hence, the right-hand side acts
equally on both the upper and lower two-component spinors. Furthermore, since
the orbit is circular, the acceleration is perpendicular to the velocity, therefore
\[ a \cdot v = 0 \qquad (1638) \]
Thus, the combined boost corresponds to the transformation
\[ \hat{R} = \hat{I} - \frac{1}{2 c} \, \alpha \cdot ( v + a \, \delta t ) + \frac{i}{4 c^2} \, \hat{\sigma} \cdot ( a \wedge v ) \, \delta t + \ldots \qquad (1639) \]
The combined boost is identified as producing an infinitesimal Lorentz boost
through v + a δt and a rotation around an axis ê through the infinitesimal
angle δϕ given by
\[ \delta\varphi \; \hat{e} \approx \frac{1}{2 c^2} \, ( a \wedge v ) \, \delta t \qquad (1640) \]
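The magnitude of the rotation angle in eq. (1640) can be cross-checked with ordinary 4 × 4 Lorentz matrices instead of spinor transformations. In the sketch below, which is an illustration under stated assumptions rather than the derivation used in the text, a boost with β_v = v/c along the x-axis is followed by a perpendicular boost with β_dv = a δt/c along the y-axis; to leading order in the velocities, half the antisymmetric part of the spatial block of the product is the rotation angle. The function names and the choice of axes are hypothetical, and only the magnitude (not the sign convention) is compared.

```python
import math


def boost(beta, axis):
    """4x4 Lorentz boost with velocity beta = v/c along spatial axis 1, 2 or 3."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    lam = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    lam[0][0] = g
    lam[axis][axis] = g
    lam[0][axis] = lam[axis][0] = -g * beta
    return lam


def matmul4(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rotation_angle_estimate(beta_v, beta_dv):
    """Leading-order rotation angle hidden in the product of two perpendicular boosts."""
    lam = matmul4(boost(beta_dv, 2), boost(beta_v, 1))
    # A pure boost is a symmetric matrix, so any antisymmetric part signals a rotation.
    return 0.5 * (lam[2][1] - lam[1][2])


if __name__ == "__main__":
    beta_v, beta_dv = 1.0e-3, 1.0e-4
    print("estimate from boosts :", rotation_angle_estimate(beta_v, beta_dv))
    print("(a ^ v) dt / (2 c^2) :", 0.5 * beta_v * beta_dv)
```

For small velocities the two printed numbers agree, reproducing the factor of one half that distinguishes the Thomas precession rate.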
The rotation part acts on both the upper and lower two-component spinors in
the Dirac spinor. The rotation angle δϕ is linearly proportional to the time
interval δt. Rotations of this class, which arise from the combination of Lorentz
boosts, are known as Wigner rotations. Hence, it was shown that the spinor rotates
with the angular velocity given by
\[ \omega_T = \frac{1}{2 c^2} \, ( a \wedge v ) = \frac{q}{2 m c^2} \, ( E \wedge v ) \qquad (1641) \]
The magnitude of ω_T is calculated as
\[ \omega_T = \frac{e}{2 m c} \, B' \qquad (1642) \]
and its direction is opposite to the precession of the spin in the electron’s rest
frame.
On combining the two precession frequencies, one finds that in the lab frame
the spin’s precession rate is given by
\[ \omega_{Lab} = \omega_{rest} - \omega_T = \frac{e}{2 m c} \, ( g_S - 1 ) \, B' \qquad (1643) \]
It is clear that the moving spin experiences an effective interaction which is
reduced by the factor
\[ \frac{g_S - 1}{g_S} \qquad (1644) \]
when compared to the interaction in the electron’s rest frame. Hence, the
gyromagnetic ratio that enters the spin-orbit coupling should not be g_S but should
be given by ( g_S − 1 ).
In the lab frame, the interaction between the magnetic moment of the moving
electron’s spin S and the field is inferred to be
\[ \hat{H}^{Lab}_{Int} = - \frac{q}{2 m c} \, ( g_S - 1 ) \, B' \cdot S \qquad (1645) \]
where g_S is the gyromagnetic ratio. Since the magnetic induction field is given
by
\[ B' = - \frac{1}{m c \, r} \frac{\partial \phi}{\partial r} \, L \qquad (1646) \]
and
\[ \phi(r) = \frac{q'}{r} \qquad (1647) \]
the spin-orbit interaction can be expressed as
\[ \hat{H}_{SO} = - \frac{q}{2 m c} \, ( g_S - 1 ) \, \frac{q'}{m c \, r^3} \, L \cdot S \qquad (1648) \]
Hence, the spin-orbit interaction is found to be given by
\[ \hat{H}_{SO} = \frac{Z e^2}{2 m^2 c^2 r^3} \, ( g_S - 1 ) \, L \cdot S \qquad (1649) \]
The spin-orbit coupling is a relativistic coupling which, apart from the Thomas
precession factor, indicates that the electron’s spin interacts with a magnetic
field in its rest frame via the gyromagnetic ratio of 2. The magnitude of the
interaction agrees precisely with the interaction found from the perturbative
treatment of the Dirac equation.
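The expectation value of r^{-3} is evaluated as
\[ \int d^3 r \; \psi_S^\dagger(r) \, \frac{1}{r^3} \, \psi_S(r) = \frac{1}{l \, ( l + \frac{1}{2} ) \, ( l + 1 )} \left( \frac{Z}{n a} \right)^3 \qquad (1652) \]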
Therefore, the spin-orbit interaction lifts the degeneracy between states with
different j = l ± 1/2 values. For l = 0, the numerator vanishes since the total
angular momentum can only take the value
\[ j = + \frac{1}{2} \qquad (1654) \]
The energy shift produced by the spin-orbit coupling is about a factor of the
square of the fine structure constant
\[ \left( \frac{e^2}{\hbar c} \right)^2 \sim \left( \frac{1}{137} \right)^2 \sim 10^{-4} \qquad (1655) \]
smaller than the energy levels of the hydrogen-like atom
\[ E_n \approx - \frac{m c^2}{2 n^2} \left( \frac{Z e^2}{\hbar c} \right)^2 \qquad (1656) \]
calculated using the non-relativistic Schrödinger equation. The spin-orbit split
levels are labeled by the angular momentum values and the j values, and are
denoted by nL_j. Hence, for n = 2 and l = 1, one has the two levels 2P_{1/2} and
2P_{3/2}, while for n = 3 and l = 2 one has the levels 3D_{3/2} and 3D_{5/2}, and so on. It
is seen that the spin-orbit interaction is increasingly important for atoms with
large Z values, as it varies like Z^4.
\[ \hat{H}_{Darwin} = \frac{\pi Z e^2 \hbar^2}{2 m^2 c^2} \, \delta^3(r) \qquad (1657) \]
which produces the first-order shift
\[ \Delta E_{Darwin} = \frac{\pi Z e^2 \hbar^2}{2 m^2 c^2} \, \psi_S^\dagger(0) \, \psi_S(0) \qquad (1658) \]
Hence, the shift only occurs for electrons with l = 0. Furthermore, since the
probability density for finding the electron at the origin is given by
\[ \psi_S^\dagger(0) \, \psi_S(0) = \frac{1}{\pi} \left( \frac{Z}{n a} \right)^3 \delta_{l,0} \qquad (1659) \]
which shifts the energies of s states upwards. The Darwin term reflects the fact
that the relativistic corrections are important for small r since the inequality
\[ \frac{Z e^2}{r} \ll m c^2 \qquad (1661) \]
required for the non-relativistic treatment to be reasonable is violated in this
region.
[Figure 53: The Grotrian energy level diagram for the n = 2 shell of hydrogen
(blue). The diagram shows the magnitude and sign of the various relativistic
corrections (kinematic, spin-orbit and Darwin) for the 2S_{1/2}, 2P_{1/2} and 2P_{3/2}
levels. It should be noted that states with the same j are degenerate.]
When the various relativistic corrections are combined, for l = 0, the Darwin
term exactly compensates for the absence of the spin-orbit interaction. Therefore,
the energy shifts combine to yield one formula in which l drops out. This
implies that the energy levels only depend on the principal quantum number n
and the total angular momentum j. States with different orbital angular momenta
are degenerate, even though the individual interactions appear to lift
the degeneracy. The relativistic corrections inherent in Dirac’s theory of hydrogen
yield energy shifts and line-splittings which are described as fine structure.
The energy levels are described by
\[ E \approx m c^2 \left[ 1 - \frac{1}{2} \frac{Z^2 \alpha^2}{n^2} - \frac{1}{2} \frac{Z^4 \alpha^4}{n^3} \left( \frac{1}{( j + \frac{1}{2} )} - \frac{3}{4 n} \right) + \ldots \right] \qquad (1662) \]
where
\[ \alpha = \frac{e^2}{\hbar c} \qquad (1663) \]
is the fine structure constant. Generally, states with larger j values have higher
energies. The fine structure splittings decrease with increasing n like n^{-3}, but
increase with increasing Z like Z^4. The splittings of the lower energy levels are
largest, for example
\[ E_{2P_{3/2}} - E_{2P_{1/2}} = - \frac{m c^2 \alpha^4}{16} \left( \frac{1}{2} - \frac{1}{1} \right) \approx 4.533 \times 10^{-5} \ \mathrm{eV} \qquad (1664) \]
This splitting corresponds to a frequency of 10.96 GHz. The energy levels are
predicted to be doubly degenerate (in addition to the degeneracy associated
with j_3); the degeneracy is just the number of states with different l values that
yield the same value of j. Since j is found by combining l with the electronic
spin s = 1/2, there are two possible l values for each energy level which are given
by the solutions of either
\[ j = l + \frac{1}{2} \qquad (1665) \]
or
\[ j = l - \frac{1}{2} \qquad (1666) \]
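The 2P splitting quoted in eq. (1664) can be estimated directly from the exact Dirac eigenvalue rather than from the expansion. The sketch below is only an illustration: the constants (m c^2, e^2/(ħ c) and Planck's constant) are assumed values, the function names are hypothetical, and the result depends slightly on the constants adopted.

```python
import math

ALPHA = 1 / 137.035999         # assumed value of e^2/(hbar c)
MC2_EV = 510998.95             # assumed electron rest energy in eV
PLANCK_EV_S = 4.135667696e-15  # assumed value of Planck's constant h in eV s


def dirac_energy_ev(n_r, j, Z=1):
    """Exact Dirac eigenvalue in eV for radial quantum number n_r and total angular momentum j."""
    gamma = Z * ALPHA
    s = math.sqrt((j + 0.5) ** 2 - gamma ** 2)
    return MC2_EV / math.sqrt(1.0 + (gamma / (n_r + s)) ** 2)


def splitting_2p_ev(Z=1):
    """E(2P3/2) - E(2P1/2): the levels (n_r = 0, j = 3/2) and (n_r = 1, j = 1/2)."""
    return dirac_energy_ev(0, 1.5, Z) - dirac_energy_ev(1, 0.5, Z)


if __name__ == "__main__":
    delta = splitting_2p_ev()
    print(f"2P3/2 - 2P1/2 splitting: {delta:.4e} eV")
    print(f"corresponding frequency: {delta / PLANCK_EV_S / 1e9:.2f} GHz")
```

With these constants the result should land close to the values quoted above. Because the eigenvalue depends only on n_r and j, the same function returns identical energies for the 2S_{1/2} and 2P_{1/2} states.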
The higher-order relativistic corrections do not alter the conclusion that the
states labeled by (n, j) are degenerate, as the energy levels found from the exact
solution of the Dirac equation only depend on n and j. For j = 1/2 the energy
levels, although predicted to be degenerate by Dirac’s theory, are experimentally
observed as being non-degenerate. The first experiments that revealed this
splitting were performed by Lamb and Retherford106. These scientists found that
the 2S_{1/2} level was shifted by about 1057 MHz to higher energies relative to the
2P_{1/2} level. The relative shift of the nS_{1/2} level of hydrogen with respect to the
nP_{1/2} level is
106 W. E. Lamb Jr. and R. C. Retherford, Phys. Rev. 72, 241 (1947).
[Figure 54: The Grotrian energy level diagram for the n = 3 shell of hydrogen
(blue). The diagram shows the magnitude and sign of the various relativistic
corrections (kinematic, spin-orbit and Darwin) for the 3S_{1/2}, 3P_{1/2}, 3P_{3/2},
3D_{3/2} and 3D_{5/2} levels. It should be noted that states with the same j are
degenerate.]
[Figure residue: apparatus schematic with the labels Oven, H1, EM Cavity, e- and I.]
state. Therefore, the current due to the emitted electrons was proportional to
the number of meta-stable hydrogen atoms that survived the passage through
the resonator. Hence, analysis of the experiment yielded the number of transi-
tions undergone in the electromagnetic resonator.
Figure 56: The dependence of the current emitted from the tungsten plate on
the applied magnetic field. The resonance frequency was set to 9487 Megacycles.
[W. E. Lamb Jr. and R. C. Retherford, Phys. Rev. 72, 241 (1947).]
In the resonator, an applied magnetic field Zeeman split the excited levels of
hydrogen and, when the oscillating field was on-resonance with the splitting of
the energy levels, the hydrogen atom made transitions out from the meta-stable
2S_{1/2} state. At resonance, the frequency of the oscillating electromagnetic field
multiplied by Planck’s constant is equal to the energy splitting. Therefore, for
fixed frequency, knowledge of the resonance magnetic field allowed the splitting
of the energy levels to be accurately determined. The field dependence of the
resonance frequency indicated that at zero field the degeneracy between the
2S_{1/2} and 2P_{1/2} states was lifted, with the 2S_{1/2} state having the higher energy.
Figure 57: The observed dependence of the resonance frequencies on the applied
magnetic field. The solid lines are the predictions of the Dirac theory and the
dashed lines are the result of Dirac’s theory if the energy of the 2S state is
simply shifted. [W. E. Lamb Jr. and R. C. Retherford, Phys. Rev. 72, 241
(1947).]
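We shall examine the case of an attractive central square well potential V(r)
which is defined by
\[ V(r) = \begin{cases} - V_0 & \text{for } r < a \\ 0 & \text{for } r > a \end{cases} \qquad (1668) \]
[Figure: The square well potential V(r)/V_0 plotted against r/a.]
In the region r < a where the potential is non-zero, the Dirac radial equation
becomes
\[ ( E + V_0 - m c^2 ) \, f(r) + c \hbar \left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) g(r) = 0 \]
\[ ( E + V_0 + m c^2 ) \, g(r) - c \hbar \left( \frac{\partial}{\partial r} + \frac{\kappa}{r} \right) f(r) = 0 \qquad (1669) \]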
By using a similar procedure, starting from the second equation, one can find
the analogous equation for g(r)
\[ c^2 \hbar^2 \left[ \frac{\partial^2}{\partial r^2} - \frac{\kappa ( \kappa - 1 )}{r^2} \right] g(r) = - \left[ ( E + V_0 )^2 - m^2 c^4 \right] g(r) \qquad (1672) \]
Real Momenta
of half-integer order. The spherical Bessel functions and spherical Neumann
functions of order n are defined in terms of the Bessel functions via
\[ j_n(\rho) = \sqrt{\frac{\pi}{2 \rho}} \; J_{n + \frac{1}{2}}(\rho) \]
\[ \eta_n(\rho) = \sqrt{\frac{\pi}{2 \rho}} \; N_{n + \frac{1}{2}}(\rho) \qquad (1679) \]
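Therefore, the general solutions of each of the radial equations can be expressed
as
\[ \frac{f(r)}{r} = A_0 \, j_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(k_0 r) + A_1 \, \eta_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(k_0 r) \qquad (1680) \]
and
\[ \frac{g(r)}{r} = B_0 \, j_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(k_0 r) + B_1 \, \eta_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(k_0 r) \qquad (1681) \]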
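However, since the functions f(r) and g(r) in the upper and lower components
are related by the differential equations
\[ \left( \frac{\partial}{\partial \rho} + \frac{1 + \kappa}{\rho} \right) \frac{f(r)}{r} = \frac{( E + V_0 + m c^2 )}{c \hbar k_0} \, \frac{g(r)}{r} \qquad (1682) \]
and
\[ \left( \frac{\partial}{\partial \rho} + \frac{1 - \kappa}{\rho} \right) \frac{g(r)}{r} = - \frac{( E + V_0 - m c^2 )}{c \hbar k_0} \, \frac{f(r)}{r} \qquad (1683) \]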
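the two sets of coefficients (A_0, A_1) and (B_0, B_1) must also be related. The
explicit relations can be found by using the recurrence relations for the spherical
Bessel functions j_n(ρ)
\[ \frac{\partial}{\partial \rho} \left[ \rho^{n+1} j_n(\rho) \right] = \rho^{n+1} j_{n-1}(\rho) \qquad (1684) \]
and
\[ \frac{\partial}{\partial \rho} \left[ \rho^{-n} j_n(\rho) \right] = - \rho^{-n} j_{n+1}(\rho) \qquad (1685) \]
The spherical Neumann functions η_n(ρ) satisfy identical recurrence relations.
This yields the relations
\[ A_0 = \mathrm{sign}\,\kappa \; \frac{E + V_0 + m c^2}{c \hbar k_0} \; B_0 \]
\[ A_1 = \mathrm{sign}\,\kappa \; \frac{E + V_0 + m c^2}{c \hbar k_0} \; B_1 \qquad (1686) \]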
Hence, for positive-energy solutions, the upper components are the large
components and the lower components are the small components. In the inner
region, one must set A_1 = B_1 = 0, since the wave function is required to be
normalizable near the origin and the spherical Neumann functions η_n(ρ) diverge
as ρ^{-(n+1)} as ρ → 0.
Imaginary Momenta
Bound States
For asymptotically large ρ, these functions are complex conjugates and represent
out-going or incoming spherical waves
\[ \lim_{\rho \to \infty} h^{\pm}_n(\rho) \to \frac{1}{\rho} \exp\left[ \pm i \left( \rho - ( n + \frac{1}{2} ) \frac{\pi}{2} \right) \right] \qquad (1695) \]
The factor of ρ^{-1} reflects the fact that the intensity of an outgoing wave-packet
decreases in proportion to ρ^{-2} in order to conserve energy and probability. From
the asymptotic variation, it is seen that the spherical Hankel functions h^{\pm}_n(iρ)
with imaginary arguments, respectively, represent exponentially attenuating or
growing spherical waves. In the exterior region, the solutions are represented
by
\[ \frac{f(r)}{r} = C'_0 \, h^{+}_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 r) + C'_1 \, h^{-}_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 r) \qquad (1696) \]
and
\[ \frac{g(r)}{r} = D'_0 \, h^{+}_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 r) + D'_1 \, h^{-}_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 r) \qquad (1697) \]
The coefficients of the upper and lower components are related via
\[ C'_0 = - \frac{E + m c^2}{c \hbar \kappa_1} \, D'_0 \]
\[ C'_1 = \frac{E + m c^2}{c \hbar \kappa_1} \, D'_1 \qquad (1698) \]
as can be seen by substituting the asymptotic form of the Hankel functions given
by eqn(1695) in the asymptotic form of the differential equations relating f(r)
and g(r) with V_0 = 0. If this wave function is to be normalizable at ρ → ∞,
one must set C'_1 = D'_1 = 0.
The solutions for the wave functions have been found in the inner and outer
regions of the potential. The solution must also hold at r = a. This is achieved
by demanding that the upper and lower components of the wave function are
continuous at r = a. These conditions are demanded due to charge conservation
∂µ j µ = 0, since the current j µ only depends on the components of ψ and does
not (explicitly) depend on their derivatives.
Since the wave function at the origin must be normalizable, and since the
wave function must be exponentially decaying, when r → ∞, the matching
condition for the upper component becomes
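By eliminating the amplitudes from the two matching conditions by using
eqn(1698), one can arrive at the equation
\[ \mathrm{sign}(\kappa) \; \frac{E + V_0 + m c^2}{c \hbar k_0} \;
 \frac{j_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(k_0 a)}{j_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(k_0 a)}
 = - \frac{E + m c^2}{c \hbar \kappa_1} \;
 \frac{h^{+}_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 a)}{h^{+}_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 a)} \qquad (1701) \]
In the above expression, the quantities k_0 and κ_1 are defined by
\[ \hbar^2 c^2 k_0^2 = ( E + V_0 )^2 - m^2 c^4 \qquad (1702) \]
and
\[ \hbar^2 c^2 \kappa_1^2 = m^2 c^4 - E^2 \qquad (1703) \]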
These equations determine the allowed values for the energy. The above set of
equations has to be solved numerically to find the energy eigenvalues. We note
that for the Dirac particle, the spin effectively results in the formation of a
centrifugal barrier (either for the upper or the lower component) even for electrons
in s states. As a result, the potential V_0 must exceed a critical strength if it is
to yield a bound state.
The MIT bag model107 is a simple purely phenomenological model for the
structure of strongly interacting particles (hadrons). The model is based on
the spherically symmetric potential of radius a, but it will be assumed that the
quark mass can have one or the other of two values. The quark is assumed to
have a small mass (approximately zero) if it is located within a sphere of radius
a, and the mass is assumed to be very large (or infinite) if r > a. To be sure,
the quark mass is assumed to be a function of r such that
\[ m = 0 \quad \text{if } r < a \]
\[ m \to \infty \quad \text{if } r > a \qquad (1704) \]
107 A. Chodos, R. L. Jaffe, K. Johnson, C. B. Thorn, and V. F. Weisskopf, Phys. Rev. D 9,
3471, (1974).
It is the infinite mass of the quark for r > a that results in the confinement of
the quark to within the hadron. That is, in the exterior region, the infinite rest
mass energy exceeds the bound state energy so the exterior region is classically
forbidden, therefore, the particle is confined to the interior.
Inside the hadron, where both the potential energy and the mass m are zero,
the kinetic energy parameter k_0 can be expressed entirely in terms of the energy
via E = ħ c k_0. Therefore, the radial components of the Dirac wave function can
be expressed as
\[ \frac{f(r)}{r} = A_0 \, j_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(k_0 r) \]
\[ \frac{g(r)}{r} = \mathrm{sign}(\kappa) \, A_0 \, j_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(k_0 r) \qquad (1705) \]
where the amplitudes of the upper and lower components are the same, since
the potential and mass are zero for r < a.
Outside the hadron, where r > a, the energy E is assumed to be much less
than the rest mass energy, m c^2 ≫ E; therefore, the momentum parameter is
imaginary and one can set ħ c κ_1 ≈ m c^2. In the exterior region, the radial
functions can be expressed as
\[ \frac{f(r)}{r} = C'_0 \, h^{+}_{|\kappa + \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 r) \]
\[ \frac{g(r)}{r} = - C'_0 \, h^{+}_{|\kappa - \frac{1}{2}| - \frac{1}{2}}(i \kappa_1 r) \qquad (1706) \]
Due to the asymptotic properties of the spherical Hankel functions, their ratio
is unity for large κ1 . This leads to the energies of the quarks being governed by
the simplified matching condition
where
E = c h̄ k0 (1709)
The above equation governs the ground state and excited state energies of the
individual quarks inside the hadron. Since the spherical Bessel functions
oscillate in sign, the above equations will result in a set of solutions for k_0 with fixed
κ. From the structure of the equations, it is seen that the solutions k_0 will only
depend on the integer number κ and the value of a. Since another boundary
condition should also be imposed at the bag’s surface, only states with angular
momentum j = 1/2 should be retained. This extra condition restricts the interest
to states with κ = − 1.
Since
\[ j_0(\rho) = \frac{\sin \rho}{\rho} \qquad (1711) \]
and
\[ j_1(\rho) = \frac{\sin \rho - \rho \cos \rho}{\rho^2} \qquad (1712) \]
the energy eigenvalues are determined by the solutions of
\[ \rho = \frac{1}{1 + \cot \rho} \qquad (1713) \]
which has an infinite number of solutions which, asymptotically, are spaced by
π. The smallest solution corresponds to k0 a = 2.04. Hence, the energy of the
[Figure 59: The radial dependence of the quark-distribution in the ground state
of the MIT bag, shown as P(ρ)/P(0) plotted against ρ.]
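The smallest solution of eq. (1713) can be located numerically. Using eqs. (1711) and (1712), eq. (1713) is equivalent to j_0(ρ) = j_1(ρ) whenever sin ρ ≠ 0, so the sketch below finds the first zero of j_0 − j_1 by bisection. The bracketing interval, the function names, the value of ħ c and the bag radius of 1 fm used in the last step are assumptions made purely for illustration.

```python
import math

HBAR_C_MEV_FM = 197.327  # assumed value of hbar c in MeV fm


def bag_condition(rho):
    """j_0(rho) - j_1(rho); its zeros satisfy rho = 1 / (1 + cot rho)."""
    j0 = math.sin(rho) / rho
    j1 = (math.sin(rho) - rho * math.cos(rho)) / rho ** 2
    return j0 - j1


def bisect(func, lo, hi, tol=1.0e-12):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("interval does not bracket a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_lo * func(mid) <= 0:
            hi = mid
        else:
            lo, f_lo = mid, func(mid)
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    root = bisect(bag_condition, 1.5, 2.5)
    print(f"smallest solution: k0 a = {root:.4f}")
    radius_fm = 1.0  # hypothetical bag radius, not a value taken from the text
    print(f"E = hbar c k0 = {HBAR_C_MEV_FM * root / radius_fm:.0f} MeV for a = {radius_fm} fm")
```

The root agrees with the value k_0 a = 2.04 quoted above; higher roots can be found by bracketing successive intervals of width of order π.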
Table 15: The lowest single-particle energies (in units of E_{κ,n_r} a/(c ħ)) of the MIT
Bag Model.
n_r | κ = −1 | κ = +1 | κ = −2 | κ = +2
single particle levels. This could allow one to calculate the excitation energies
required to change the hadron’s internal structure.
Table 16: The Observed Energy Levels for the charmonium system (c c̄) in units
of MeV/c^2.
^1S_0 | ^3S_1 | ^3P_0 | ^3P_1 | ^3P_2
which yields a ratio of 1.36. This ratio is far too small for the triplet of π mesons
since M_π ∼ 139 MeV/c^2, and M_n ∼ 938 MeV/c^2. Although it is inadequate
for the pseudo-scalar mesons, the MIT bag model is more appropriate for the ω
vector meson, which is composed of (1/√2)(u ū + d d̄) and has a mass of M_ω ∼ 783
MeV/c^2, or the ρ vector meson (1/√2)(u ū − d d̄) with a mass M_ρ ∼ 776
MeV/c^2. Hence, at best, the MIT Bag model produces mixed results. The MIT
Bag model is also quite unappealing, since the basic assumptions of the bag
model do not follow from Quantum Chromodynamics, and the model is neither
re-normalizable nor is it Lorentz invariant.
Table 17: The Observed Energy Levels for the Upsilon system (b b̄) in units of
MeV/c^2.
^1S_0 | ^3P_0 | ^3P_1 | ^3P_2
that at small separations the quarks only interact weakly. This property is
called asymptotic freedom. It was the realization by ’t Hooft114 , Gross and
Wilczek115 and Politzer116 that non-Abelian gauge theories possessed the prop-
erties of asymptotic freedom that led to the acceptance of the theory of Quantum
Chromodynamics. The screening of the color force between the quarks at large
distances (due to virtual quark/anti-quark pairs) is more than compensated by
an anti-screening due to virtual gluon pairs. However, at small distances the
color force vanishes.
Exercise:
Show that the positive energy eigenvalues of the Dirac equation with the
mass m(r) given by
$$ m(r)\, c = m_0\, c - i\, \omega\, \underline{\alpha} \cdot \underline{r} \qquad (1718) $$
are determined as
$$ E_{n,j,l} = m_0 c^2 \sqrt{ t A + 1 } \qquad (1719) $$
114 G. 't Hooft, unpublished (1972).
115 D. J. Gross and F. A. Wilczek, Phys. Rev. Lett. 30, 1343 (1973).
116 H. D. Politzer, Phys. Rev. Lett. 30, 1346 (1973).
117 D. Ito, K. Mori and E. Carriere, Nuovo Cimento, 51 A, 1119, (1967).
where the dimensionless parameter t corresponding to the string tension is given
by
$$ t = \frac{\hbar\omega}{m_0 c^2} \qquad (1720) $$
and A is given in terms of the quantum numbers as
$$ A = 2 ( n - j ) + 1 \quad \text{if} \quad j = l + \tfrac{1}{2} , \qquad A = 2 ( n + j ) + 3 \quad \text{if} \quad j = l - \tfrac{1}{2} \qquad (1721) $$
Hence, find the best fit to the excitation spectra of quarkonium.
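The Dirac wave function ψ(r) can be expressed in terms of two two-component
spinors
$$ \psi(\underline{r}) = \begin{pmatrix} \phi^A(\underline{r}) \\ \phi^B(\underline{r}) \end{pmatrix} \qquad (1722) $$
One only need specify the upper component $\phi^A(\underline{r})$, since once $\phi^A(\underline{r})$ has been
specified $\phi^B(\underline{r})$ is completely determined. For example, for the in and out
asymptotes, the Dirac equation reduces to
$$ \begin{pmatrix} E_p - m c^2 & - c\, \hat{\underline{p}} \cdot \underline{\sigma} \\ - c\, \hat{\underline{p}} \cdot \underline{\sigma} & E_p + m c^2 \end{pmatrix} \begin{pmatrix} \phi^A(\underline{r}) \\ \phi^B(\underline{r}) \end{pmatrix} = 0 \qquad (1723) $$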
In the scattering experiment, a plane-wave with momentum $\underline{p}$ parallel to the
$\hat{e}_3$-axis is incident on the target. The in-asymptote can be described by a
state which is in a superposition of eigenstates of $\hat{S}^{(3)}$ given by
$$ \psi^{in}_{\pm}(\underline{r}) = N_{E_p} \begin{pmatrix} \chi_\pm \\ \pm \frac{c\, p}{E_p + m c^2}\, \chi_\pm \end{pmatrix} \exp\left[ i\, \frac{p\, r}{\hbar} \cos\theta \right] \qquad (1725) $$
From the Rayleigh expansion, one observes that the in-asymptotes are not eigenstates
of $(\hat{\underline{J}})^2 = (\hat{\underline{L}} + \hat{\underline{S}})^2$, since they are formed of linear superpositions of many
states with differing eigenvalues of $\hat{\underline{L}}^2$ but have a fixed eigenvalue of $\hat{\underline{S}}^2$. However,
the in-asymptotes are eigenstates of $\hat{J}^{(3)} = \hat{L}^{(3)} + \hat{S}^{(3)}$ with eigenvalues
$\pm \frac{\hbar}{2}$.
[Figure: the incident beam Ψin with momentum p and the outgoing wave Ψout in the direction (θ, ϕ).]
Figure 60: The geometry of the asymptotic final state of Mott scattering. At
large r, the beam separates into an unscattered beam ψin and a spherical outgoing
wave ψout.
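active in the vicinity of the target. In spherical polar coordinates, the orbital
angular momentum raising and lowering operators are given by
$$ \hat{L}^{\pm} = \pm\, \hbar\, \exp[ \pm i \varphi ] \left( \frac{\partial}{\partial\theta} \pm i \cot\theta\, \frac{\partial}{\partial\varphi} \right) \qquad (1727) $$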
In light of the comment about the upper two-component spinor, one sees that
the scattered wave is determined by
$$ \begin{pmatrix} f(\theta) \\ g(\theta) \exp[ + i \varphi ] \end{pmatrix} \qquad (1732) $$
if the initial beam has a negative helicity. The quantities f(θ) and g(θ) are
generalized scattering amplitudes that have the dimensions of length, and depend
on θ but do not depend on ϕ, as both the in and out asymptotes are eigenstates
of $\hat{J}^{(3)}$ with eigenvalues $\pm \frac{\hbar}{2}$. A partial wave analysis can be performed on the
Dirac equation to yield expressions for the scattering amplitudes f(θ) and g(θ)
in terms of phase shifts. A detailed knowledge of the scattering amplitudes is
not required for the following analysis.
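If the in-asymptote has the spin quantized along the direction given by
$(\sin\theta_s \cos\varphi_s , \sin\theta_s \sin\varphi_s , \cos\theta_s)$, the upper component of the Dirac wave spinor
is determined by the two-component spinor
$$ \phi^A(\underline{r}) = \begin{pmatrix} f(\theta) \cos\frac{\theta_s}{2} \exp[ - i \frac{\varphi_s}{2} ] - g(\theta) \sin\frac{\theta_s}{2} \exp[ + i \frac{\varphi_s}{2} ] \exp[ - i \varphi ] \\ g(\theta) \cos\frac{\theta_s}{2} \exp[ - i \frac{\varphi_s}{2} ] \exp[ + i \varphi ] + f(\theta) \sin\frac{\theta_s}{2} \exp[ + i \frac{\varphi_s}{2} ] \end{pmatrix} \frac{1}{r} \exp\left[ i\, \frac{p\, r}{\hbar} \right] \qquad (1735) $$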
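The probability for scattering is proportional to
$$ \begin{aligned} I(\theta, \varphi) &\propto \left| f(\theta) \cos\frac{\theta_s}{2} \exp[ - i \frac{\varphi_s}{2} ] - g(\theta) \sin\frac{\theta_s}{2} \exp[ + i \frac{\varphi_s}{2} ] \exp[ - i \varphi ] \right|^2 \\ &\quad + \left| g(\theta) \cos\frac{\theta_s}{2} \exp[ - i \frac{\varphi_s}{2} ] \exp[ + i \varphi ] + f(\theta) \sin\frac{\theta_s}{2} \exp[ + i \frac{\varphi_s}{2} ] \right|^2 \\ &= | f(\theta) |^2 + | g(\theta) |^2 + \sin\theta_s \sin( \varphi - \varphi_s )\; i \left( f^*(\theta)\, g(\theta) - f(\theta)\, g^*(\theta) \right) \end{aligned} \qquad (1736) $$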
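The reduction of eqn(1736) to its closed form can be checked numerically. The sketch below is only an illustration: it evaluates the two moduli-squared directly from the spinor of eqn(1735) and compares the result with the closed form, for arbitrary test values of f, g and the angles (the numbers and function names are not taken from the notes).

```python
import cmath
import math

def intensity_direct(f, g, theta_s, phi_s, phi):
    """Sum of the moduli-squared of the two spinor components of eqn (1735)."""
    c, s = math.cos(theta_s / 2), math.sin(theta_s / 2)
    em, ep = cmath.exp(-0.5j * phi_s), cmath.exp(0.5j * phi_s)
    top = f * c * em - g * s * ep * cmath.exp(-1j * phi)
    bottom = g * c * em * cmath.exp(1j * phi) + f * s * ep
    return abs(top) ** 2 + abs(bottom) ** 2

def intensity_closed(f, g, theta_s, phi_s, phi):
    """Closed form on the last line of eqn (1736)."""
    # i (f* g - f g*) is real, so only its real part is kept.
    interference = (1j * (f.conjugate() * g - f * g.conjugate())).real
    return abs(f) ** 2 + abs(g) ** 2 + math.sin(theta_s) * math.sin(phi - phi_s) * interference

# Arbitrary test amplitudes (illustrative values only).
f_test, g_test = 0.7 + 0.2j, -0.3 + 0.5j
for angles in [(0.4, 1.1, 2.0), (2.2, 0.3, 5.1), (1.57, 0.0, 1.57)]:
    print(intensity_direct(f_test, g_test, *angles), intensity_closed(f_test, g_test, *angles))
```

The two columns agree to rounding error, and the ϕ-dependent term vanishes when the spin is quantized along the beam direction ($\theta_s = 0$).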
If the initial beam is unpolarized, the direction of the initial spin $(\theta_s , \varphi_s)$
should be averaged over by integrating over the solid angle $d\Omega_s = d\varphi_s\, d\theta_s \sin\theta_s$.
This process yields the scattering probability for the unpolarized beam
$$ \int \frac{d\Omega_s}{4\pi}\; I(\theta, \varphi) = | f(\theta) |^2 + | g(\theta) |^2 \qquad (1737) $$
which is independent of the azimuthal angle ϕ. It should be noted that the
unpolarized cross-section differs from the polarized cross-section.
Even if the initial beam is unpolarized, the final beam will be partially
polarized. The direction of the net polarization is determined by evaluating the
matrix elements of $\hat{\underline{S}}$ and averaging over the direction of the initial spin, $\theta_s$ and
$\varphi_s$. The result is proportional to
$$ \hat{\underline{S}} = i\, \frac{\hbar}{2}\; \frac{ f^*(\theta)\, g(\theta) - f(\theta)\, g^*(\theta) }{ | f(\theta) |^2 + | g(\theta) |^2 }\; ( \sin\varphi , - \cos\varphi , 0 ) \qquad (1738) $$
Hence, the polarization is perpendicular to the scattering plane. It should also
be noted that the net polarization of the scattered wave is determined by the
relative deviation of the scattering cross-section for polarized electrons from the
unpolarized scattering cross-section.
11.11.2 Partial Wave Analysis
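The Dirac equation with a spherically symmetric potential V(r) has solutions
of the form
$$ \psi(\underline{r}) = \begin{pmatrix} \frac{f(r)}{r}\; \Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi) \\ \frac{i\, g(r)}{r}\; \Omega^{j \mp \frac{1}{2}}_{j,j_z}(\theta, \varphi) \end{pmatrix} \qquad (1739) $$
where the two-component spinor spherical harmonics $\Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi)$ are given by
$$ \Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi) = \begin{pmatrix} \mp \sqrt{ \frac{ j + \frac{1}{2} \pm \frac{1}{2} \mp j_z }{ 2 j + 1 \pm 1 } }\; Y^{j \pm \frac{1}{2}}_{j_z - \frac{1}{2}}(\theta, \varphi) \\ \sqrt{ \frac{ j + \frac{1}{2} \pm \frac{1}{2} \pm j_z }{ 2 j + 1 \pm 1 } }\; Y^{j \pm \frac{1}{2}}_{j_z + \frac{1}{2}}(\theta, \varphi) \end{pmatrix} \qquad (1740) $$
$$ c^2 \hbar^2 k^2 = E^2 - m^2 c^4 \qquad (1742) $$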
where $\delta_\kappa(k)$ are the phase shifts that characterize the potential. The phase
shifts depend directly on κ (and the energy) and only depend indirectly on j
and l through κ. The phase shifts are defined so that the asymptotic variation
of the radial functions is given by
$$ \frac{f_\kappa(r)}{r} \sim e^{ i \delta_\kappa(k) }\; \frac{ \cos\left( k r - ( \kappa + 1 ) \frac{\pi}{2} + \delta_\kappa(k) \right) }{ r } \qquad (1747) $$
and only differs from the asymptotic variation of the free particle solutions
through the phase shifts. Furthermore, if this is decomposed in terms of incoming
and outgoing spherical waves,
$$ \frac{f_\kappa(r)}{r} \sim \frac{ \exp\left[ i \left( k r - ( \kappa + 1 ) \frac{\pi}{2} + 2 \delta_\kappa(k) \right) \right] }{ 2 r } + \frac{ \exp\left[ - i \left( k r - ( \kappa + 1 ) \frac{\pi}{2} \right) \right] }{ 2 r } \qquad (1748) $$
their fluxes are equal due to conservation of particles and, as written, the incoming
spherical waves are not modified by the phase-shifts.
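The general asymptotic r → ∞ form of the wave function for the scattering
is composed of the un-scattered wave and a spherical outgoing wave. The polar-axis
is chosen to be parallel to the direction of the incident beam, which is also chosen
to be the quantization axis for the spin. If the incident beam is polarized with
spin-up, the upper two-component spinor has the form
$$ \phi^A_{\uparrow}(\underline{r}) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \exp[ i k r \cos\theta ] + \begin{pmatrix} f(\theta) \\ g(\theta) \exp[ i \varphi ] \end{pmatrix} \frac{ \exp[ i k r ] }{ r } \qquad (1749) $$
whereas for a down-spin polarized incident beam
$$ \phi^A_{\downarrow}(\underline{r}) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \exp[ i k r \cos\theta ] + \begin{pmatrix} - g(\theta) \exp[ - i \varphi ] \\ f(\theta) \end{pmatrix} \frac{ \exp[ i k r ] }{ r } \qquad (1750) $$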
On recalling the Rayleigh expansion
$$ \exp[ i k r \cos\theta ] = \sum_l i^l\, ( 2 l + 1 )\, j_l(kr)\, P_l(\cos\theta) \qquad (1751) $$
one can find the scattered spherical outgoing wave by subtracting the un-scattered
beam from the total wave function. On using the asymptotic large r
variation, one obtains the asymptotic form
$$ \exp[ i k r \cos\theta ] \to \sum_l i^l\, ( 2 l + 1 )\; \frac{ \cos\left( k r - ( l + 1 ) \frac{\pi}{2} \right) }{ k r }\; P_l(\cos\theta) \qquad (1752) $$
which has a similar form to the asymptotic form of the total wave function.
In particular, the spin and orbital angular momentum eigenstates can be decomposed
in terms of the spinor spherical harmonics. Thus, for the up-spin
polarized incident beam one has the upper two-component spinor
$$ \begin{aligned} P_l(\cos\theta)\, \chi_+ &= \sqrt{ \frac{4\pi}{2 l + 1} }\; Y^l_0(\theta, \varphi)\, \chi_+ \\ &= \frac{ \sqrt{4\pi} }{ 2 l + 1 } \left( \sqrt{ l + 1 }\; \Omega^{l}_{l + \frac{1}{2}, \frac{1}{2}} - \sqrt{ l }\; \Omega^{l}_{l - \frac{1}{2}, \frac{1}{2}} \right) \end{aligned} \qquad (1753) $$
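and the down-spin component is given by
$$ \frac{ \sqrt{4\pi} }{ 2 i k } \sum_l \sqrt{ \frac{ l ( l + 1 ) }{ 2 l + 1 } } \left[ \left( \exp[ 2 i \delta_{-l-1}(k) ] - 1 \right) - \left( \exp[ 2 i \delta_l(k) ] - 1 \right) \right] Y^l_1(\theta, \varphi)\; \frac{ \exp[ i k r ] }{ r } \qquad (1759) $$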
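In the above expressions, the index on the phase-shifts $\delta_\kappa(k)$ refers to the value
of κ. Hence, for a spin-up polarized incident beam, the scattering amplitudes
are given in terms of the phase-shifts via
$$ f(\theta) = \frac{ \sqrt{4\pi} }{ 2 i k } \sum_l \left[ \frac{ ( l + 1 ) }{ \sqrt{ 2 l + 1 } } \left( \exp[ 2 i \delta_{-l-1}(k) ] - 1 \right) + \frac{ l }{ \sqrt{ 2 l + 1 } } \left( \exp[ 2 i \delta_l(k) ] - 1 \right) \right] Y^l_0(\theta, \varphi) \qquad (1760) $$
and
$$ g(\theta) \exp[ i \varphi ] = \frac{ \sqrt{4\pi} }{ 2 i k } \sum_l \sqrt{ \frac{ l ( l + 1 ) }{ 2 l + 1 } } \left[ \left( \exp[ 2 i \delta_{-l-1}(k) ] - 1 \right) - \left( \exp[ 2 i \delta_l(k) ] - 1 \right) \right] Y^l_1(\theta, \varphi) \qquad (1761) $$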
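As an illustration of how eqn(1760) is used in practice, the sketch below sums the partial-wave series for f(θ) up to a cut-off $l_{max}$. It assumes the standard relation $Y^l_0 = \sqrt{(2l+1)/4\pi}\, P_l(\cos\theta)$, so that the series becomes $\frac{1}{2ik} \sum_l [ (l+1)( e^{2 i \delta_{-l-1}} - 1 ) + l ( e^{2 i \delta_l} - 1 ) ] P_l(\cos\theta)$. The function names, the cut-off and the toy phase shifts are hypothetical and are not taken from the notes.

```python
import cmath
import math

def legendre_values(l_max, x):
    """Return [P_0(x), ..., P_l_max(x)] from Bonnet's recurrence."""
    values = [1.0, x]
    for l in range(1, l_max):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: l_max + 1]

def amplitude_f(theta, k, phase_shift, l_max):
    """Non-spin-flip amplitude f(theta) of eqn (1760), truncated at l_max.

    phase_shift(kappa) must return delta_kappa(k); it is supplied by the
    caller (a hypothetical interface chosen for this sketch).
    """
    p = legendre_values(l_max, math.cos(theta))
    total = 0.0 + 0.0j
    for l in range(l_max + 1):
        # kappa = -(l + 1) corresponds to j = l + 1/2
        total += (l + 1) * (cmath.exp(2j * phase_shift(-l - 1)) - 1.0) * p[l]
        if l > 0:
            # kappa = +l corresponds to j = l - 1/2 (absent for l = 0)
            total += l * (cmath.exp(2j * phase_shift(l)) - 1.0) * p[l]
    return total / (2j * k)

# Toy phase shifts that depend only on l (illustrative numbers).
toy_shifts = [0.5, 0.2, 0.05]

def spin_independent(kappa):
    l = kappa if kappa > 0 else -kappa - 1
    return toy_shifts[l]

print(amplitude_f(0.7, 1.3, spin_independent, 2))
```

When the phase shifts are the same for κ = l and κ = −l−1, the weights (l+1) and l add up to (2l+1) and the series reduces to the familiar non-relativistic partial-wave expansion, while the spin-flip amplitude of eqn(1761) vanishes term by term.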
$$ \underline{A} = B\, x\, \hat{e}_y \qquad (1763) $$
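In the standard representation, the energy eigenvalue equation is represented
by the set of coupled equations
$$ \begin{aligned} ( E - m c^2 )\, \phi^A(\underline{r}) &= c\, \underline{\sigma} \cdot \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right) \phi^B(\underline{r}) \\ ( E + m c^2 )\, \phi^B(\underline{r}) &= c\, \underline{\sigma} \cdot \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right) \phi^A(\underline{r}) \end{aligned} \qquad (1765) $$
Substituting the expression for $\phi^B$ from the second equation into the first, one
obtains the second-order differential equation for $\phi^A$
$$ \begin{aligned} ( E^2 - m^2 c^4 )\, \phi^A &= c^2 \left[ \underline{\sigma} \cdot \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right) \right]^2 \phi^A(\underline{r}) \\ &= c^2 \left[ \left( \hat{\underline{p}} - \frac{q}{c} \underline{A} \right)^2 - \frac{q \hbar}{c}\, \underline{\sigma} \cdot \underline{B} \right] \phi^A(\underline{r}) \\ &= \left[ \hat{p}^2 c^2 + q^2 B^2 x^2 - 2\, q\, \hat{p}_y\, c\, B\, x - q\, c\, \hbar\, \sigma^{(z)} B \right] \phi^A(\underline{r}) \end{aligned} \qquad (1766) $$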
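Since $\hat{p}_y$ and $\hat{p}_z$ commute with x, one can find simultaneous eigenstates of $\hat{H}$,
$\hat{p}_y$ and $\hat{p}_z$. Hence, the two-component spinor $\phi^A$ can be expressed as
$$ \phi^A(\underline{r}) = \exp[ i k_y y + i k_z z ]\; \Phi^A(x) \qquad (1767) $$
where
$$ \sigma^{(z)} \chi_\sigma = \sigma\, \chi_\sigma \qquad (1770) $$
in which the eigenvalues of $\sigma^{(z)}$ are denoted by σ. Therefore, the eigenvalue
equation can be reduced to
$$ \left[ - \hbar^2 c^2 \frac{\partial^2}{\partial x^2} + ( q B )^2 \left( x - \frac{ c \hbar k_y }{ q B } \right)^2 \right] f(x) = \left( E^2 - m^2 c^4 - c^2 \hbar^2 k_z^2 + q\, c\, \hbar\, B\, \sigma \right) f(x) \qquad (1771) $$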
which (apart from an overall scale factor) is formally equivalent$^{118}$ to the (non-relativistic)
energy eigenvalue equation for a shifted harmonic oscillator, with
frequency $2 c | q | B$. The modulus sign was inserted to ensure that the frequency
$\omega_{HO}$ is positive. The energy eigenvalues are determined from
$$ \left( E^2 - m^2 c^4 - c^2 \hbar^2 k_z^2 + q\, c\, \hbar\, B\, \sigma \right) = 2\, | q |\, c\, \hbar\, B \left( n + \frac{1}{2} \right) \qquad (1772) $$
Hence, for an electron with negative charge q = −e one finds that the positive-energy
eigenvalue is given by the solution
$$ E = c\, \sqrt{ m^2 c^2 + \hbar^2 k_z^2 + ( 2 n + 1 + \sigma )\, \frac{ | e | \hbar }{ c }\, B } \qquad (1773) $$
This expression has an infinite degeneracy as it is independent of the continuous
variable $k_y$. It also has a discrete (two-fold) degeneracy between the levels with
quantum numbers (n, σ = 1) and (n + 1, σ = −1). The two-fold degeneracy
can be understood as a consequence of the generalized helicity $\underline{\sigma} \cdot ( \hat{\underline{p}} - \frac{q}{c} \underline{A} )$
commuting with the Hamiltonian $\hat{H}$. This results in the spin’s alignment with
the electron’s velocity being preserved, as the spin’s precession is precisely balanced
by the electron’s orbital precession. It should be noted that if the g factor
deviates from 2, and such an anomaly in the g factor is expected from Quantum
Electrodynamics and has been found in experiment, then this degeneracy will
be lifted. The calculated ( g − 2 ) anomaly for an electron is given by
$$ \left( \frac{ g - 2 }{ 2 } \right)_{Theor} = \frac{1}{2} \left( \frac{\alpha}{\pi} \right) - 0.3284986 \left( \frac{\alpha}{\pi} \right)^2 + 1.17611 \left( \frac{\alpha}{\pi} \right)^3 - 1.434 \left( \frac{\alpha}{\pi} \right)^4 + \ldots \qquad (1774) $$
where
$$ \alpha = \frac{ e^2 }{ \hbar c } \qquad (1775) $$
is the fine structure constant. The experimentally determined value of the g
anomaly is found as
$$ \left( \frac{ g - 2 }{ 2 } \right)_{Expt} = 0.0011659208 \qquad (1776) $$
118 The explicit (but dimensionally incorrect) analogy is obtained by setting the Harmonic
and differs from the theoretical value in the last two decimal places$^{119}$. In
the non-relativistic limit, the expression for the relativistic energy eigenvalue
reproduces the expression for energies of the well-known Landau levels
$$ E \approx m c^2 + \frac{ \hbar^2 k_z^2 }{ 2 m } + \left( n + \frac{ 1 + \sigma }{ 2 } \right) \frac{ | e | \hbar B }{ m c } \qquad (1777) $$
which are doubly-degenerate.
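The degeneracy of the levels (n, σ = 1) and (n + 1, σ = −1), and the approach to the non-relativistic Landau levels, can be illustrated numerically. The sketch below simply evaluates eqn(1773) and eqn(1777); the default units (m = c = ħ = |e| = 1) and the sample field strengths are assumptions made for the illustration only.

```python
import math

def landau_energy(n, sigma, kz, B, m=1.0, c=1.0, hbar=1.0, e=1.0):
    """Relativistic positive-energy eigenvalue of eqn (1773), Gaussian units."""
    return c * math.sqrt(m**2 * c**2 + hbar**2 * kz**2
                         + (2 * n + 1 + sigma) * e * hbar * B / c)

def landau_energy_nr(n, sigma, kz, B, m=1.0, c=1.0, hbar=1.0, e=1.0):
    """Non-relativistic limit of eqn (1777), including the rest energy."""
    return (m * c**2 + hbar**2 * kz**2 / (2 * m)
            + (n + (1 + sigma) / 2) * e * hbar * B / (m * c))

# Two-fold degeneracy between (n, sigma = +1) and (n + 1, sigma = -1).
print(landau_energy(0, +1, 0.3, 0.01), landau_energy(1, -1, 0.3, 0.01))

# Weak field and small k_z: the two expressions nearly coincide.
print(landau_energy(2, +1, 0.01, 1e-4), landau_energy_nr(2, +1, 0.01, 1e-4))
```

The first pair of numbers is identical because both levels share the same value of (2 n + 1 + σ); the second pair differs only at second order in the small quantities under the square root.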
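$$ A^\mu = A^\mu(\phi) \qquad (1779) $$
$$ \partial_\mu A^\mu = k_\mu\, A^\mu(\phi)' = 0 \qquad (1780) $$
where the prime indicates differentiation with respect to φ. The classical vector
potential must satisfy the source-free wave equation
$$ \partial_\nu \partial^\nu A^\mu = k_\nu k^\nu\, A^\mu(\phi)'' = 0 \qquad (1781) $$
$$ k_\nu k^\nu = 0 \qquad (1782) $$
The Dirac equation for a spin one-half particle with charge q can be used to
obtain the second-order differential equation
$$ \left[ - \hbar^2 \partial_\mu \partial^\mu - 2\, i\, \hbar\, \frac{q}{c}\, A^\mu \partial_\mu + \frac{q^2}{c^2}\, A_\mu A^\mu - m^2 c^2 - i\, \hbar\, \frac{q}{c}\, \gamma^\mu k_\mu\, \gamma^\nu A_\nu(\phi)' \right] \psi = 0 \qquad (1783) $$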
119 This discrepancy could indicate the importance of virtual processes in which heavy
particle/antiparticle pairs are created. The (g − 2) anomalies for the muon and its anti-particle
have also been measured [G. W. Bennett et al., Phys. Rev. Lett. 92, 161802 (2004)].
These experiments show that particles and anti-particles precess at the same rate. However,
the value of the (g − 2) anomaly is inconsistent with the theoretical prediction based on the
standard model of particle physics.
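where ψ is the four-component Dirac spinor. In deriving this, the Lorenz gauge
condition has been used to re-write
$$ \gamma^\mu \gamma^\nu \left( \partial_\mu A_\nu \right) \psi = \left( \gamma^\mu \gamma^\nu\, \partial_\mu A_\nu - g^{\mu,\nu}\, \partial_\mu A_\nu \right) \psi \qquad (1784) $$
in the diagonal terms.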
does commute with the Hamiltonian and, therefore, is conserved. The conservation
of this quantity can be interpreted in terms of the energy absorbed
or emitted by the electron due to interaction with the classical electromagnetic
field being accompanied by the absorption or emission of a similar amount of
momentum$^{121}$. Despite the different interpretation of $p^\mu$ in the presence of the
classical field, the four-vector $p^\mu$ shall be chosen to satisfy the condition
$$ p^\mu p_\mu = m^2 c^2 \qquad (1788) $$
which is the dispersion relation for a free electron$^{122}$.
The form of the wave function of eqn(1785) is to be substituted into the
second-order differential eqn(1783). It shall be noted that
$$ \begin{aligned} A^\mu \partial_\mu F(\phi) &= k_\mu A^\mu\, F(\phi)' = 0 \\ \partial^\mu \partial_\mu F(\phi) &= k^\mu k_\mu\, F(\phi)'' = 0 \end{aligned} \qquad (1789) $$
since $A^\mu$ satisfies the Lorenz gauge condition and $k^\mu$ satisfies the dispersion
relation for electromagnetic waves in vacuum. On substituting the ansatz into
the second-order equation, using the above two equations and the choice of $p^\mu$
satisfying the free-electron dispersion relation, one finds that the second-order
equation reduces to a first-order differential equation for the spinor F(φ)
$$ 2\, i\, \hbar\, p_\mu k^\mu\, F(\phi)' = \left[ 2\, \frac{q}{c}\, A_\mu p^\mu - \frac{q^2}{c^2}\, A_\mu A^\mu + i\, \hbar\, \frac{q}{c}\, \gamma^\mu k_\mu\, \gamma^\nu A_\nu(\phi)' \right] F(\phi) \qquad (1790) $$
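which only depends on φ since the exponential phase-factor which depends on
$p_\mu x^\mu$ has been factored out. The first-order equation can be integrated w.r.t.
φ to yield
$$ F(\phi) = \exp\left[ - \frac{ i\, q }{ \hbar\, c\, p_\lambda k^\lambda } \int_0^\phi \left( p^\mu A_\mu(\phi') - \frac{1}{2} \frac{q}{c}\, A^\mu(\phi') A_\mu(\phi') \right) d\phi' + \frac{q}{c}\, \frac{ \gamma^\mu k_\mu\, \gamma^\nu A_\nu }{ 2\, p_\lambda k^\lambda } \right] F(0) \qquad (1791) $$
where F(0) is an arbitrary constant four-component spinor. The exponential of
the matrix is defined in terms of its series expansion.
$$ F(\phi) = \exp\left[ - \frac{ i\, q }{ \hbar\, c\, p_\lambda k^\lambda } \int_0^\phi \left( p^\mu A_\mu(\phi') - \frac{1}{2} \frac{q}{c}\, A^\mu(\phi') A_\mu(\phi') \right) d\phi' \right] \times \exp\left[ \frac{q}{c}\, \frac{ \gamma^\mu k_\mu\, \gamma^\nu A_\nu }{ 2\, p_\lambda k^\lambda } \right] F(0) \qquad (1792) $$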
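The above form can be simplified by expanding the last exponential factor due
to the identity
$$ \left( \gamma^\mu k_\mu\, \gamma^\nu A_\nu \right)^n = 0 \qquad (1793) $$
for all integers n such that n > 1. The identity can be proved by
$$ \begin{aligned} \gamma^\mu k_\mu\, \gamma^\nu A_\nu\, \gamma^\tau k_\tau\, \gamma^\rho A_\rho &= - \gamma^\mu k_\mu\, \gamma^\tau k_\tau\, \gamma^\nu A_\nu\, \gamma^\rho A_\rho + 2\, g^{\nu,\tau} A_\nu k_\tau\, \gamma^\mu k_\mu\, \gamma^\rho A_\rho \\ &= - \gamma^\mu k_\mu\, \gamma^\tau k_\tau\, \gamma^\nu A_\nu\, \gamma^\rho A_\rho \end{aligned} \qquad (1794) $$
where the first line follows by using the anti-commutation relations for the γ
matrices and the second line follows from applying the Lorenz gauge condition.
The expression can be further simplified by noting that on anticommuting the
first pair of γ matrices, one has
$$ \begin{aligned} &= - \gamma^\mu k_\mu\, \gamma^\tau k_\tau\, \gamma^\nu A_\nu\, \gamma^\rho A_\rho \\ &= \gamma^\tau k_\tau\, \gamma^\mu k_\mu\, \gamma^\nu A_\nu\, \gamma^\rho A_\rho - 2\, g^{\mu,\tau} k_\mu k_\tau\, \gamma^\nu A_\nu\, \gamma^\rho A_\rho \\ &= \gamma^\tau k_\tau\, \gamma^\mu k_\mu\, \gamma^\nu A_\nu\, \gamma^\rho A_\rho \\ &= \gamma^\mu k_\mu\, \gamma^\tau k_\tau\, \gamma^\nu A_\nu\, \gamma^\rho A_\rho \end{aligned} \qquad (1795) $$
where the third line follows from the condition $k^\mu k_\mu = 0$ and the last line follows
from interchanging the first two pairs of summation indices. On comparing the
first and last lines, one notes that the right-hand side is zero. Therefore, one
has proved the identity
$$ \gamma^\mu k_\mu\, \gamma^\nu A_\nu\, \gamma^\tau k_\tau\, \gamma^\rho A_\rho = 0 \qquad (1796) $$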
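The nilpotency expressed by eqn(1796) can be confirmed with explicit matrices. The sketch below builds the γ matrices in the standard representation and uses one particular choice of vectors, $k^\mu = (1, 0, 0, 1)$ and $A^\mu = (0, 1, 0, 0)$, which satisfies $k^\mu k_\mu = 0$ and $k^\mu A_\mu = 0$; this choice and the helper names are assumptions made for the illustration.

```python
# Pauli matrices and the gamma matrices of the standard representation.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu,nu}

def gamma_space(s):
    """4x4 matrix with +s in the upper-right block and -s in the lower-left block."""
    g = [[0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            g[a][b + 2] = s[a][b]
            g[a + 2][b] = -s[a][b]
    return g

GAMMA = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
GAMMA += [gamma_space(s) for s in SIGMA]

def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(4)) for j in range(4)] for i in range(4)]

def slash(v):
    """gamma^mu v_mu for contravariant components v = (v^0, v^1, v^2, v^3)."""
    return [[sum(METRIC[mu] * v[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def dot(a, b):
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))

k = (1, 0, 0, 1)   # light-like wave vector
A = (0, 1, 0, 0)   # transverse potential, so that k.A = 0

M = matmul(slash(k), slash(A))   # gamma^mu k_mu gamma^nu A_nu
M_squared = matmul(M, M)
print(dot(k, k), dot(k, A), M_squared)
```

Since the square vanishes while the matrix itself does not, the series for the last exponential factor in eqn(1792) terminates after the term linear in $\gamma^\mu k_\mu \gamma^\nu A_\nu$.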
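Therefore, one demands that F(0) satisfies the above supplementary condition
which is the same as for a free particle. Hence, one can set
$$ F(0) = N_F \begin{pmatrix} \chi_\sigma \\ \frac{ \underline{p} \cdot \underline{\sigma} }{ p^{(0)} + m c }\; \chi_\sigma \end{pmatrix} \qquad (1802) $$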
The spectrum of eigenvalues of the electron’s energy can be found by Fourier
transforming the above solution with respect to time, which shows that the electron
absorbs and emits radiation in multiples of $\hbar\omega$. The Volkov solutions have
been used to describe the Compton scattering of electrons by intense coherent
laser beams, and are also the basis of the strong-field approximation sometimes
found useful in atomic physics$^{123}$.
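$$ \overline{\psi}^{\dagger} = \overline{F}^{\dagger}(0) \left[ \hat{I} + \frac{q}{c}\, \frac{ \gamma^\nu A_\nu\, \gamma^\mu k_\mu }{ 2\, p_\lambda k^\lambda } \right] \exp\left[ - i\, \frac{S}{\hbar} \right] \qquad (1805) $$
$$ j^\mu = \frac{ c }{ p^{(0)} V } \left[ p^\mu - \frac{q}{c}\, A^\mu + k^\mu \left( \frac{q}{c}\, \frac{ p_\nu A^\nu }{ k^\lambda p_\lambda } - \frac{q^2}{c^2}\, \frac{ A_\nu A^\nu }{ 2\, k^\lambda p_\lambda } \right) \right] \qquad (1806) $$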
$$ \gamma^\mu \hat{p}_\mu\, \psi = m\, c\, \psi \qquad (1808) $$
where the γ matrices are any set of matrices which satisfy the anti-commutation
relations
$$ \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2\, g^{\mu,\nu}\, \hat{I} \qquad (1809) $$
123 L. V. Keldysh, Zh. Eksp. Teor. Fiz. 47, 1945 (1964). [Sov. Phys. J.E.T.P. 20, 1307
(1965).]
F. H. M. Faisal, J. Phys. B 6, L89 (1973).
H. R. Reiss, Phys. Rev. A 22, 1786 (1980).
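The Dirac equation is independent of the specific representation of the γ matrices.
We have chosen the representation
$$ \gamma^{(0)} = \begin{pmatrix} I & 0 \\ 0 & - I \end{pmatrix} \qquad (1810) $$
and
$$ \gamma^{(i)} = \begin{pmatrix} 0 & \sigma^{(i)} \\ - \sigma^{(i)} & 0 \end{pmatrix} \qquad (1811) $$
where $\sigma^{(i)}$ are the Pauli-matrices. This is the standard representation.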
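That the standard representation of eqn(1810) and eqn(1811) satisfies the anti-commutation relations of eqn(1809) can be checked by direct multiplication. The following is a small sketch of such a check, with the metric taken as diag(1, −1, −1, −1); the helper names are introduced only for this illustration.

```python
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
METRIC_DIAGONAL = [1, -1, -1, -1]

def standard_gamma(mu):
    """gamma^(mu) in the standard representation as a 4x4 nested list."""
    if mu == 0:
        return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
    s = PAULI[mu - 1]
    g = [[0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            g[a][b + 2] = s[a][b]      # upper-right block: +sigma
            g[a + 2][b] = -s[a][b]     # lower-left block:  -sigma
    return g

def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(4)) for j in range(4)] for i in range(4)]

def anticommutator(mu, nu):
    ab = matmul(standard_gamma(mu), standard_gamma(nu))
    ba = matmul(standard_gamma(nu), standard_gamma(mu))
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def satisfies_clifford_algebra():
    """True if {gamma^mu, gamma^nu} = 2 g^{mu,nu} I for all sixteen index pairs."""
    for mu in range(4):
        for nu in range(4):
            g_mu_nu = METRIC_DIAGONAL[mu] if mu == nu else 0
            result = anticommutator(mu, nu)
            for i in range(4):
                for j in range(4):
                    expected = 2 * g_mu_nu if i == j else 0
                    if abs(result[i][j] - expected) > 1e-12:
                        return False
    return True

print(satisfies_clifford_algebra())
```

The same function can be applied to any other candidate representation by replacing `standard_gamma`, which is one way to confirm that a unitary change of representation leaves the algebra intact.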
$$ \gamma^{\mu\prime} = \hat{U}\, \gamma^\mu\, \hat{U}^{\dagger} \qquad (1813) $$
invariant.
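The components $\phi^L$ and $\phi^R$ are related to the components of ψ in the standard
representation via
$$ \begin{pmatrix} \phi^{L\prime} \\ \phi^{R\prime} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \phi^A - \phi^B \\ \phi^A + \phi^B \end{pmatrix} \qquad (1820) $$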
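The chiral representation is particularly useful for the description of massless
spin one-half particles, such as might be the case for the neutrino. The neutrino
masses are extremely small. The masses have evaded direct experimental measurement.
However, direct measurements have set upper limits on the masses
which decrease with time$^{124}$. In this case, with the limit m → 0, the Dirac
equation takes the form
$$ \begin{pmatrix} 0 & \frac{\partial}{\partial t} + c\, \underline{\sigma} \cdot \underline{\nabla} \\ \frac{\partial}{\partial t} - c\, \underline{\sigma} \cdot \underline{\nabla} & 0 \end{pmatrix} \begin{pmatrix} \phi^{L\prime} \\ \phi^{R\prime} \end{pmatrix} = 0 \qquad (1821) $$
Hence, the Dirac equation for a massless free particle reduces to two uncoupled
equations, each of which has the form of the equation proposed by Weyl$^{125}$
$$ \left( \frac{\partial}{\partial t} + c\, \underline{\sigma} \cdot \underline{\nabla} \right) \phi^{R\prime} = 0 \qquad (1822) $$
and
$$ \left( \frac{\partial}{\partial t} - c\, \underline{\sigma} \cdot \underline{\nabla} \right) \phi^{L\prime} = 0 \qquad (1823) $$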
The Weyl equation describes a spin one-half massless particle by a two-component
spinor wave function. The Weyl equation violates parity invariance.
The Weyl equation was considered to be un-physical until the discovery of the
(anti-)neutrino$^{126}$ and the associated violation of parity invariance$^{127}$. After the
parity violation of the weak interaction was established, the Weyl equation was
adopted to describe the neutrino$^{128}$.
Inexplicably, nature seems to have selected the Weyl equation for $\phi^L$, but not
$\phi^R$, to describe neutrinos. The solutions of the Weyl equation for free particles
$$ \left( \frac{\partial}{\partial t} - c\, \underline{\sigma} \cdot \underline{\nabla} \right) \phi^L = 0 \qquad (1824) $$
124 L. Langer and R. Moffat, Phys. Rev. 88, 689 (1952).
1413 (1957).
128 T. D. Lee and C. N. Yang, Phys. Rev. 105, 1671 (1957).
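can be written as
$$ \phi^L = \begin{pmatrix} u^{(0)} \\ u^{(1)} \end{pmatrix} \frac{1}{\sqrt{V}} \exp\left[ - \frac{i}{\hbar} ( E\, t - \underline{p} \cdot \underline{r} ) \right] \qquad (1825) $$
Since helicity is conserved, one can choose the direction of $\underline{p}$ as the axis of
quantization. The positive-energy solution is given by
$$ \phi^L_{-} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \frac{1}{\sqrt{V}} \exp\left[ - \frac{i}{\hbar} ( E\, t - p\, z ) \right] \qquad (1826) $$
which has negative helicity and has energy given by
$$ E_{-} = c\, p \qquad (1827) $$
$$ E_{+} = - c\, p \qquad (1829) $$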
This negative-energy solution will describe anti-particles. The Weyl equation for
φR has a positive-energy solution with positive helicity, and a negative-energy
solution with negative helicity. Since only neutrinos with negative helicity are
observed in nature, only φL is needed. The anti-neutrinos have positive helicity
and are represented by φR .
[Figure: energy E against momentum for φR and φL, with branches labelled by the helicity Λ = ±1, and the elementary excitations ν and ν̄.]
Figure 61: The dispersion relations for φL and φR. The elementary excitations
are the negative-helicity neutrino ν and a positive-helicity anti-neutrino ν̄.
The Neutrino
decay products included a proton and an electron. However, it was observed
that the emitted electron had a continuous range of kinetic energies. Therefore,
another neutral particle must have been emitted in the decay. This particle was
termed the anti-neutrino, and the reaction can be written as
$$ n \to p + e^- + \bar{\nu}_e \qquad (1830) $$
Conservation of angular momentum requires that the neutrino has a spin of $\frac{\hbar}{2}$.
Furthermore, since an energy of 1.2934 MeV is released in the transformation of
a neutron to a proton, and since sometimes the decay processes produce electrons
which seem to take up all the released energy, the neutrino was suggested as
having zero mass. An upper limit on the neutrino’s mass of a few eV follows
from the Fermi-Kurie plot$^{129}$.
[Figure: $[ N(p) / F p^2 ]^{1/2}$ plotted against the electron energy in keV.]
Figure 62: The Fermi-Kurie plot of the energy distribution of the electrons
emitted in the beta decay of tritium, $^3$H → $^3$He + $e^-$ + $\bar{\nu}_e$. The decay releases
18.1 keV. It is seen that the electrons produced in the decay process have a
non-zero probability for carrying off most of the released energy. Hence, one
concludes that the anti-neutrinos are almost massless. The dashed blue curve
is the curve expected if the neutrino had a mass of 3 keV.
The Fermi-Kurie plot of the electron energy
distribution is based on the phase space available for the emission of the electron
and anti-neutrino$^{130}$. The joint phase-space available for the electron of four-momentum
$(E_e/c, \underline{p})$ and the anti-neutrino of four-momentum $(E_\nu/c, \underline{q})$ is
proportional to the factor
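$$ \begin{aligned} d\Gamma &= dp\; p^2 \int dq\; q^2\; \delta( E - E_e(p) - E_\nu(q) ) \\ &= dE_e(p)\; \frac{ p\, E_e(p) }{ c^2 } \int dE_\nu(q)\; \frac{ q\, E_\nu(q) }{ c^2 }\; \delta( E - E_e(p) - E_\nu(q) ) \\ &= \frac{1}{c^5}\; dE_e(p)\; p\, E_e(p)\; \sqrt{ E_\nu(q)^2 - m_\nu^2 c^4 }\; E_\nu(q)\; \Big|_{E_\nu(q) = E - E_e(p)} \\ &= \frac{1}{c^5}\; dE_e(p)\; p\, E_e(p)\; \sqrt{ ( E - E_e(p) )^2 - m_\nu^2 c^4 }\; ( E - E_e(p) ) \end{aligned} \qquad (1831) $$
129 L. Langer and R. Moffat, Phys. Rev. 88, 689 (1952).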
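The effect of a neutrino mass on the end-point of the Fermi-Kurie plot follows directly from eqn(1831). The sketch below evaluates the square root of the neutrino part of the phase-space factor, $[ E_\nu \sqrt{ E_\nu^2 - m_\nu^2 c^4 } ]^{1/2}$ with $E_\nu = Q - T$, as a function of the electron kinetic energy T. The function name, the use of keV, and the neglect of the Coulomb factor F are simplifying assumptions of this illustration.

```python
import math

def kurie_ordinate(kinetic_energy, released_energy, neutrino_rest_energy=0.0):
    """Square root of the neutrino phase-space factor of eqn (1831).

    All energies are in the same units (keV here). The neutrino energy is
    E_nu = released_energy - kinetic_energy, which includes its rest energy.
    """
    e_nu = released_energy - kinetic_energy
    if e_nu <= neutrino_rest_energy:
        return 0.0  # kinematically forbidden: the plot has ended
    return math.sqrt(e_nu * math.sqrt(e_nu**2 - neutrino_rest_energy**2))

Q = 18.1  # energy released in tritium decay, in keV (value quoted in the caption)
for T in (5.0, 10.0, 15.0, 16.0, 18.0):
    print(T, kurie_ordinate(T, Q), kurie_ordinate(T, Q, neutrino_rest_energy=3.0))
```

For a massless neutrino the ordinate is the straight line Q − T, whereas a finite mass bends the curve down and moves the end-point to $T = Q - m_\nu c^2$, which is the feature used to bound the mass.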
The process of beta decay does not conserve parity. The non-conservation
of parity was discovered in the experiments of C. S. Wu et al.$^{131}$. In these
experiments, the spin of a $^{60}$Co nucleus was aligned with a magnetic field. The
spin $S = 5\hbar$ $^{60}$Co nucleus decayed into a spin $S = 4\hbar$ $^{60}$Ni nucleus by emitting
an electron and an anti-neutrino.
$$ ^{60}\mathrm{Co} \to {}^{60}\mathrm{Ni} + e^- + \bar{\nu}_e \qquad (1832) $$
Since angular momentum is conserved, the spin of the electron and the anti-
neutrino initially must both be aligned with the field. In the experiment, the
angular distribution of the emitted electrons was observed. Because the helicity
of the electrons is conserved, the angular distribution of the electrons can be
used to prove that the electrons all have negative helicity, and hence it is inferred
that the anti-neutrinos should have positive helicity. Since helicity should be
reversed under the parity operation, and since only negative helicity electrons
are observed, the process is not invariant under parity. Hence, parity is not
conserved.
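The electrons that are emitted in beta decay have negative helicities. If
the momentum of an emitted electron is given by $(p, \theta_p, \varphi_p)$, then its helicity
operator is
$$ \Lambda_p = \begin{pmatrix} \cos\theta_p & \sin\theta_p \exp[ - i \varphi_p ] \\ \sin\theta_p \exp[ + i \varphi_p ] & - \cos\theta_p \end{pmatrix} \qquad (1833) $$
The helicity operator has eigenstates χ given by
$$ \Lambda_p\, \chi^{\pm}_{\theta_p} = \pm\, \chi^{\pm}_{\theta_p} \qquad (1834) $$
which are determined as
$$ \chi^{+}_{\theta_p} = \begin{pmatrix} \cos\frac{\theta_p}{2} \exp[ - i \frac{\varphi_p}{2} ] \\ \sin\frac{\theta_p}{2} \exp[ + i \frac{\varphi_p}{2} ] \end{pmatrix} , \qquad \chi^{-}_{\theta_p} = \begin{pmatrix} - \sin\frac{\theta_p}{2} \exp[ - i \frac{\varphi_p}{2} ] \\ \cos\frac{\theta_p}{2} \exp[ + i \frac{\varphi_p}{2} ] \end{pmatrix} \qquad (1835) $$
131 C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes and R. P. Hudson, Phys. Rev. 105,
1413 (1957).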
Since angular momentum is conserved and the emitted electrons only have negative
helicity, the angular distribution of the emitted electrons is proportional to
the square of the overlap of the initial electron spin-up spinor with the negative
helicity spinors
$$ \left| \chi^{+\,\dagger}_{\theta = 0}\; \chi^{-}_{\theta_p} \right|^2 = \sin^2\frac{\theta_p}{2} = \frac{1}{2} ( 1 - \cos\theta_p ) \qquad (1836) $$
which is in exact agreement with the experimentally observed distribution. From
the distribution of emitted electrons one is led to expect that the anti-neutrino
has positive helicity.
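The eigenvalue equation (1834) and the overlap of eqn(1836) can be verified with a few lines of arithmetic. The sketch below represents the helicity operator and its eigenspinors as plain Python lists; the function names and the sample angles are chosen only for this illustration.

```python
import cmath
import math

def helicity_matrix(theta, phi):
    """The 2x2 helicity operator of eqn (1833)."""
    return [
        [math.cos(theta), math.sin(theta) * cmath.exp(-1j * phi)],
        [math.sin(theta) * cmath.exp(1j * phi), -math.cos(theta)],
    ]

def chi_plus(theta, phi):
    return [math.cos(theta / 2) * cmath.exp(-0.5j * phi),
            math.sin(theta / 2) * cmath.exp(0.5j * phi)]

def chi_minus(theta, phi):
    return [-math.sin(theta / 2) * cmath.exp(-0.5j * phi),
            math.cos(theta / 2) * cmath.exp(0.5j * phi)]

def apply(matrix, spinor):
    return [sum(matrix[i][j] * spinor[j] for j in range(2)) for i in range(2)]

def overlap_squared(a, b):
    """| a^dagger b |^2 for two-component spinors."""
    amplitude = sum(x.conjugate() * y for x, y in zip(a, b))
    return abs(amplitude) ** 2

theta_p, phi_p = 1.1, 0.4   # sample emission angles (illustrative values)
spin_up = chi_plus(0.0, 0.0)
print(overlap_squared(spin_up, chi_minus(theta_p, phi_p)),
      0.5 * (1.0 - math.cos(theta_p)))
```

The overlap vanishes at $\theta_p = 0$ and is largest at $\theta_p = \pi$, so negative-helicity electrons are emitted preferentially in the direction opposite to the nuclear spin.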
[Figure: a Co nucleus (S = 5ħ) decaying to a Ni nucleus (S = 4ħ) with emission of an electron e⁻ (S = ħ/2) and an anti-neutrino ν̄e (S = ħ/2).]
Figure 63: The spin $S = 5\hbar$ of the Co nucleus is aligned with the magnetic field.
The Co undergoes beta decay to Ni which has $S = 4\hbar$ by emitting an electron
$e^-$ and an anti-neutrino $\bar{\nu}_e$. The spin of the electron and the anti-neutrino
produced by the decay must initially be aligned with the magnetic field, due to
conservation of angular momentum.
[Figure: I(θ) plotted against θ/π.]
Figure 64: The angular distribution of the emitted electron in the beta decay
experiment of Wu et al.
captures an electron from the K-shell and decays to the excited state of a $^{152}$Sm
nucleus with angular momentum $J = \hbar$ and emits a neutrino.
$$ ^{152}\mathrm{Eu} + e^- \to {}^{152}\mathrm{Sm}^* + \nu_e \qquad (1837) $$
Goldhaber et al. measured the photons with the full Doppler shift, from which
they were able to infer the direction of the recoil of the nucleus. The photons
[Figure: labels e⁻, Eu, Sm* (J = 1), νe, γ (Λγ = −1), Sm (J = 0).]
oriented along the direction of motion of the emitted neutrino. Since the sum of
the angular momentum of the excited state ($J = \hbar$) and the emitted neutrino
must equal the spin of the captured electron $\frac{\hbar}{2}$, the neutrino must have its spin
oriented anti-parallel to the angular momentum of the Sm* nucleus. Hence, the
neutrino has negative helicity.
The Lagrangian equation of motion is found from the variational principle which
states that the action is extremal with respect to ψ and $\psi^\dagger$. The condition that
the action is extremal with respect to variations in $\psi^\dagger$ leads to the Dirac equation
$$ i\, \hbar\, \gamma^\mu \left( \partial_\mu + i\, \frac{q}{c \hbar}\, A_\mu \right) \psi = m\, c\, \psi \qquad (1842) $$
after the resulting equation has been multiplied by a factor of $\gamma^{(0)}$. On making
a variation of the action with respect to ψ, one finds the Hermitean conjugate
equation
$$ - i\, \hbar\, c \left( \partial_\mu - i\, \frac{q}{c \hbar}\, A_\mu \right) \overline{\psi}^{\dagger} \gamma^\mu - m\, c^2\, \overline{\psi}^{\dagger} = 0 \qquad (1843) $$
That this is the Hermitean conjugate of the Dirac equation can be shown by
taking its Hermitean conjugate, which results in
$$ i\, \hbar\, c\, \gamma^{\mu \dagger} \gamma^{(0)} \left( \partial_\mu + i\, \frac{q}{c \hbar}\, A_\mu \right) \psi - m\, c^2\, \gamma^{(0)} \psi = 0 \qquad (1844) $$
The above equation can be reduced to the conventional form by multiplying by
$\gamma^{(0)}$ and by using the identities
$$ \begin{aligned} \gamma^{(0)} \gamma^{(0)} &= \hat{I} \\ \gamma^{(0)} \gamma^{\mu \dagger} \gamma^{(0)} &= \gamma^\mu \end{aligned} \qquad (1845) $$
Hence, the equation found by varying ψ is just the Hermitean conjugate of the
Dirac equation
$$ i\, \hbar\, \gamma^\mu \left( \partial_\mu + i\, \frac{q}{c \hbar}\, A_\mu \right) \psi = m\, c\, \psi \qquad (1846) $$
Furthermore, it is surmised that the starting Lagrangian is appropriate to describe
the Dirac field theory.
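Likewise, (c times) the momentum density $T^0{}_j$ is found from
$$ \begin{aligned} T^0{}_j &= i\, \hbar\, c\, \overline{\psi}^{\dagger} \gamma^{(0)} \partial_j \psi \\ &= i\, \hbar\, c\, \psi^\dagger\, \partial_j \psi \\ &= c\, \psi^\dagger\, \hat{p}_j\, \psi \end{aligned} \qquad (1852) $$
where the partial derivative has been identified with the covariant momentum
operator. Hence, the contravariant component of the momentum is given by
$$ \begin{aligned} T^{0,j} &= - i\, \hbar\, c\, \psi^\dagger\, \frac{\partial}{\partial x_j}\, \psi \\ &= c\, \psi^\dagger\, \hat{p}^{(j)}\, \psi \end{aligned} \qquad (1853) $$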
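One can also determine the conserved Noether charges by noting that the
Lagrangian is invariant under a global gauge transformation
$$ \begin{aligned} \psi \to \psi' &= \exp[ + i \varphi ]\; \psi \\ \psi^* \to \psi'^{*} &= \exp[ - i \varphi ]\; \psi^* \end{aligned} \qquad (1856) $$
$$ \begin{aligned} \delta\psi &= + i\, \delta\varphi\; \psi \\ \delta\psi^* &= - i\, \delta\varphi\; \psi^* \end{aligned} \qquad (1857) $$
$$ \delta L = 0 \qquad (1858) $$
so we have
$$ \begin{aligned} 0 &= \delta L \\ &= \frac{\partial L}{\partial \psi}\, \delta\psi + \frac{\partial L}{\partial \psi^*}\, \delta\psi^* + \frac{\partial L}{\partial ( \partial_\mu \psi )}\, \delta( \partial_\mu \psi ) + \frac{\partial L}{\partial ( \partial_\mu \psi^* )}\, \delta( \partial_\mu \psi^* ) \end{aligned} \qquad (1859) $$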
After substituting the Euler-Lagrange equations for the derivatives w.r.t. the
fields ψ and ψ*, the variation is expressed as
$$ 0 = \partial_\mu \left[ \frac{\partial L}{\partial ( \partial_\mu \psi )}\, \delta\psi + \frac{\partial L}{\partial ( \partial_\mu \psi^* )}\, \delta\psi^* \right] \qquad (1860) $$
For an arbitrary gauge transformation through the fixed infinitesimal angle δϕ,
this condition becomes
$$ 0 = i\, \delta\varphi\; \partial_\mu \left[ \frac{\partial L}{\partial ( \partial_\mu \psi )}\, \psi - \frac{\partial L}{\partial ( \partial_\mu \psi^* )}\, \psi^* \right] \qquad (1861) $$
Hence, one finds that there is a current $j^\mu$ which satisfies the continuity equation
$$ \partial_\mu j^\mu = 0 \qquad (1862) $$
where (apart from the infinitesimal constant of proportionality) the current is
given by
$$ j^\mu \propto i\, \delta\varphi \left[ \frac{\partial L}{\partial ( \partial_\mu \psi )}\, \psi - \frac{\partial L}{\partial ( \partial_\mu \psi^* )}\, \psi^* \right] \qquad (1863) $$
For the Dirac Lagrangian, the second term is identically zero and the first term
is non-zero. Hence, on adopting a conventional normalization, the conserved
current is identified as
$$ j^\mu = c\, \overline{\psi}^{\dagger} \gamma^\mu\, \psi \qquad (1864) $$
This is the same expression for the conserved current that was previously
derived for the one-electron Dirac equation. Hence, the one-particle Dirac equation
yields the same expectation values and obeys the same conservation laws
as the (classical) Dirac field theory.
where $\phi^L$ and $\phi^R$ are two-component Dirac spinors and the two sets of quantities
$\sigma_L^\mu$ and $\sigma_R^\mu$ are expressed in terms of the Pauli matrices as
$$ \begin{aligned} \sigma_L^\mu &= ( \sigma_0 , - \underline{\sigma} ) \\ \sigma_R^\mu &= ( \sigma_0 , \underline{\sigma} ) \end{aligned} \qquad (1868) $$
The difference between $\sigma_L^\mu$ and $\sigma_R^\mu$ reflects the different chirality of $\phi^L$ and $\phi^R$. In
the absence of the mass term, the Dirac Lagrangian possesses two independent
scalar gauge transformations. These transformations correspond to the global
gauge transformations
$$ \begin{aligned} \phi^L \to \phi^{L\prime} &= \phi^L \exp[ i \theta_L ] \\ \phi^R \to \phi^{R\prime} &= \phi^R \exp[ i \theta_R ] \end{aligned} \qquad (1869) $$
where $\theta_L$ and $\theta_R$ are independent angles. The Lagrangian has a U(1) × U(1)
gauge symmetry. The presence of a mass term would couple the two fields and
reduce the gauge transformation to one in which $\theta_R = \theta_L$.
$$ \psi = \begin{pmatrix} \phi^L \\ \phi^R \end{pmatrix} \qquad (1873) $$
The first factor represents the usual global gauge transformation for the Dirac
Lagrangian with finite mass. This transformation yields the usual conserved
four-vector current $j_V^\mu$ defined by
$$ j_V^\mu = c\, \overline{\psi}^{\dagger} \gamma^\mu\, \psi \qquad (1874) $$
The second factor is specific to the Dirac Lagrangian with zero mass. It is
called the chiral transformation or axial U(1) transformation. Using the anti-commutation
relation
$$ \{ \gamma^{(4)} , \gamma^\mu \}_+ = 0 \qquad (1875) $$
one can show that the exponential factor in the chiral gauge transformation has
the property that
$$ \gamma^\mu \exp\left[ i\, \frac{ \theta_R - \theta_L }{ 2 }\, \gamma^{(4)} \right] = \exp\left[ - i\, \frac{ \theta_R - \theta_L }{ 2 }\, \gamma^{(4)} \right] \gamma^\mu \qquad (1876) $$
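This property can be used to show that the Lagrangian is invariant under the
chiral transformation because
$$ \begin{aligned} \overline{\psi}'^{\dagger} \gamma^\mu \partial_\mu \psi' &= \psi^\dagger \exp\left[ - i\, \frac{ \theta_R - \theta_L }{ 2 }\, \gamma^{(4)} \right] \gamma^{(0)} \gamma^\mu \partial_\mu \exp\left[ i\, \frac{ \theta_R - \theta_L }{ 2 }\, \gamma^{(4)} \right] \psi \\ &= \psi^\dagger \gamma^{(0)} \gamma^\mu \partial_\mu \psi \\ &= \overline{\psi}^{\dagger} \gamma^\mu \partial_\mu \psi \end{aligned} \qquad (1877) $$
which involves two commutations. Since the massless Dirac Lagrangian is invariant
under the chiral transformation, Noether’s theorem shows that the current
$$ j_A^\mu = c\, \overline{\psi}^{\dagger} \gamma^\mu \gamma^{(4)} \psi \qquad (1878) $$
is conserved. This conserved current transforms like a vector under proper
orthochronous Lorentz transformations but does not transform as a vector under
improper orthochronous transformations. Therefore, the current is an axial
current. The conserved axial density $j_A^{(0)}$ is given by
$$ \begin{aligned} j_A^{(0)} &= \overline{\psi}^{\dagger} \gamma^{(0)} \gamma^{(4)} \psi \\ &= \psi^\dagger \gamma^{(4)} \psi \\ &= - \phi_L^\dagger \phi_L + \phi_R^\dagger \phi_R \end{aligned} \qquad (1879) $$
which is the difference between the number of particles with positive helicity
and the number of particles with negative helicity.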
Exercise:
11.16 Hole Theory
The negative-energy solutions of the Dirac equation lead to the conclusion that
one-particle quantum mechanics is an inadequate description of nature. In classical
mechanics, the dispersion relation for a free particle is found to be given
by
$$ E = \pm \sqrt{ m^2 c^4 + p^2 c^2 } \qquad (1882) $$
The negative-energy states found in classical mechanics can be safely ignored.
The rationale for ignoring the negative-energy states in classical mechanics is
that the dynamics is governed by a set of differential equations which result
in the classical variables changing in a continuous fashion. Since the particle’s
energy can only change in a continuous fashion, there is no mechanism which
allows it to connect with the negative branch of the dispersion relation.
However, in quantum mechanics, particles can make discontinuous transitions
between different energy levels by emitting photons. Hence, if one has a single
electron in a positive-energy state where $E > m c^2$, this state would be unstable
to the electron making a transition to a negative-energy state, which occurs
with the simultaneous emission of photons which carry away an energy greater
than $2 m c^2$. The transition rate for such a process is quite large; therefore, one
might conclude that positive-energy particles should not exist in nature. Furthermore,
if one does have particles in the negative-energy branch, they might
be able to further lower their energies by multiple photon emission processes.
Hence, the states of negative-energy particles with finite momenta could be unstable
to states in which the momentum has an infinite value.
Dirac noted that if the negative-energy states were all filled, then the Pauli
exclusion principle would prevent the decay of positive-energy particles into
the negative-energy states. Furthermore, in the absence of any positive-energy
particles, the Pauli exclusion principle would cause the set of particles in the
negative-energy state to be completely inert. In this picture, the filled sea
of negative-energy states would represent the physical vacuum, and would be
unobservable in experiments. For example, if charge is measured, it is the
non-uniform part of the charge distribution that is measured, but the infinite
number of particles in the negative-energy states do produce a uniform charge
density. Likewise, when energies are measured, the energy is usually measured
with respect to some reference level. For the case of a vacuum in which all
the negative-energy states are filled with electrons, the measured energies cor-
respond to energy differences and so the infinite negative energy of the vacuum
should cancel. Therefore, Dirac postulated that the vacuum consists of the
state in which all the negative-energy states are filled with electrons133. Furthermore,
physical states correspond to states in which relatively few of the
positive-energy states are filled with electrons and a few negative-energy states are
unoccupied. In this case, the electrons in the positive-energy states are identified
with observable electrons, and the unfilled states or holes in the distribution of
133 P. A. M. Dirac, Proc. Roy. Soc. A 126, 360 (1930).
[Figure: energy-level diagram with vertical axis E/mc^2 (ticks at 2, 1 and -1) and the label "Unoccupied Positive Energy States".]
Figure 66: A cartoon depicting the vacuum for Dirac's Hole Theory, in which
the negative-energy states are filled and the positive-energy states are empty.
negative-energy states are also observable. These holes are known as positrons
and are the anti-particles of the electrons. The properties of a positron are
found by computing the difference between the property for a state with an
absent negative-energy electron and the property of the vacuum state.
\[ q_p = ( N - 1 ) \, q_e - N \, q_e \tag{1883} \]
Therefore, one finds that the positron has the opposite charge to that of an
electron
\[ q_p = - q_e \tag{1884} \]
Hence, the positron has a positive charge. Likewise, the energy of the vacuum
in which the electrons occupy all the negative-energy states is denoted by
E_0. The positron energy will be denoted as E_p(p_p). The positron corresponds
to all states with negative energy being filled except for the state with the energy
\[ E_e(p_e) = - \sqrt{ m^2 c^4 + p_e^2 c^2 } \tag{1885} \]
which is unfilled. The positron energy is defined as the energy difference
\[ E_p(p_p) = \left( E_0 - E_e(p_e) \right) - E_0 = - E_e(p_e) = \sqrt{ m^2 c^4 + p_e^2 c^2 } \tag{1886} \]
= − pe (1887)
Hence, the momentum of the positron is the negative of the momentum of the
missing electron
\[ p_p = - p_e \tag{1888} \]
Likewise, the spin of the positron is opposite to the spin of the missing electron,
etc. The velocity of an electron is defined as the group velocity of a wave packet
of momentum p_e. Hence, one finds the velocity of the negative-energy electron
from
\[ v_e = \frac{\partial}{\partial p_e} E_e(p_e) = - \frac{ p_e c^2 }{ \sqrt{ m^2 c^4 + p_e^2 c^2 } } \tag{1889} \]
Therefore, the positron and the negative-energy electron states have the same
velocities.
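The statement about the velocities can be checked numerically. The following is only a minimal sketch, assuming illustrative units m = c = 1 and hypothetical function names that do not appear in the text: it differentiates the negative-energy dispersion relation by central differences, compares the result with the closed form of eqn(1889), and repeats the differentiation for the positron energy evaluated at p_p = - p_e.

```python
import math

M, C = 1.0, 1.0  # illustrative units (assumption): m = c = 1


def electron_energy(p):
    """Negative-energy branch, E_e(p_e) = -sqrt(m^2 c^4 + p_e^2 c^2)."""
    return -math.sqrt(M**2 * C**4 + p**2 * C**2)


def positron_energy(p):
    """Positron energy, E_p(p_p) = +sqrt(m^2 c^4 + p_p^2 c^2)."""
    return math.sqrt(M**2 * C**4 + p**2 * C**2)


def group_velocity(energy, p, h=1e-6):
    """Central-difference estimate of dE/dp."""
    return (energy(p + h) - energy(p - h)) / (2.0 * h)


def closed_form_velocity(p):
    """Right-hand side of eqn(1889)."""
    return -p * C**2 / math.sqrt(M**2 * C**4 + p**2 * C**2)


p_e = 0.75                                     # momentum of the missing electron
v_e = group_velocity(electron_energy, p_e)     # velocity of the negative-energy electron
v_p = group_velocity(positron_energy, -p_e)    # positron, with p_p = -p_e
print(v_e, closed_form_velocity(p_e), v_p)
```

The two signs (negative energy and reversed momentum) compensate, so the hole moves in the same direction as the electron that is missing from the sea.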
Table 18: The relation between properties of Negative Energy Electron and
Positron States.
            Charge   Energy   Momentum   Spin         Helicity   Velocity
Electron    −|e|     −|E|     +p         +(h̄/2) σ     σ.p        v
Positron    +|e|     +|E|     −p         −(h̄/2) σ     σ.p        v
\[ e^- + e^+ \rightarrow 2 \gamma \tag{1891} \]
In this process, it is necessary that the excess energy be carried off by two
photons if the energy-momentum conservation laws are to be satisfied. Likewise,
by supplying an energy greater than the threshold energy of 2 m c^2, it should be
possible to promote an electron from a negative-energy state to a positive-energy
state, thereby creating an electron-positron pair. Since it is unlikely that more
than one photon can be absorbed simultaneously, electron-positron pair creation
only occurs in the vicinity of a charged nucleus which can carry off any excess
momentum.
\[ \gamma \rightarrow e^- + e^+ \tag{1892} \]
authors were the first who correctly identified the positively charged particle as the anti-
particle of the electron, in full accord with the predictions of Dirac’s hole theory.
137 J. Thibaud, Phys. Rev, 35, 78 (1934).
[Figure: energy-level diagram with vertical axis E/mc^2 (ticks at 2, 1 and -1), the label "Unoccupied Positive Energy States" and a photon (k,α).]
which only connects the upper and lower two-component spinors of the initial
and final states ψ_n and ψ_{n'}. Hence, as light scattering processes are at least of
second-order in A, the intermediate state ψ_{n''} must involve a negative-energy
electron state. Since the Pauli exclusion principle forbids the occupation of
the filled negative-energy states, hole theory ascribes the intermediate states
as involving virtual electron-positron creation and annihilation processes. This
shows that, even for processes which appear to involve a single electron in the
138 P. A. M. Dirac, Proc. Roy. Soc. A126, 360 (1930).
initial and final states, a purely single-particle description is inadequate
and one must adopt a many-particle description such as quantum field theory.
\[ \frac{d\sigma}{d\Omega_{k'}} = \left( \frac{ V \omega_{k'} }{ 2 \pi \hbar c^2 } \right)^2 | M |^2 \tag{1894} \]
and where q indicates all the quantum numbers of a positive-energy free electron
state. The sum over q'' represents a sum over all possible intermediate states
of the electron, no matter whether they are positive or negative-energy states.
The matrix element M is composed of a coherent superposition of matrix elements
for virtual processes which represent the absorption of a photon (k, α)
followed by the subsequent emission of a photon (k', α') and the process where
the emission of light precedes the absorption process.
[Figure: two diagrams for the scattering of a photon (k,α) into (k',α') by an electron q → q' through the intermediate state q''.]
Since the basis set is composed of momentum eigenstates, the evaluation
of the spatial integration in the matrix elements of the interaction results in
the condition of conservation of momentum. Hence, for the process where the
photon (k, α) is absorbed before the emission of the photon (k', α'), the momenta
are restricted by
\[ k + q = q'' , \qquad q'' = k' + q' \tag{1896} \]
which leads to the identification of the momentum of the intermediate and final
states as
\[ q'' = q + k , \qquad q' = q + k - k' \tag{1897} \]
In the second process, where the emission process precedes the absorption,
conservation of momentum yields
\[ k + q = k + k' + q'' , \qquad k + k' + q'' = k' + q' \tag{1898} \]
which yields
\[ q'' = q - k' , \qquad q' = q + k - k' \tag{1899} \]
The limit in which the initial electron is at rest, q = 0, shall be considered. The
momenta of the incident and scattered photon will be assumed sufficiently low
so that the momentum of the electron in the intermediate state can be neglected,
since q'' ≈ 0. That is, the Compton scattering process will be considered in the
limit k → 0 and k' → 0.
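If the initial (positive-energy) electron is stationary and has spin σ, its wave
function can be represented by the Dirac spinor
\[ \psi_{\sigma,q}(r) = \frac{1}{\sqrt{V}} \begin{pmatrix} \chi_\sigma \\ 0 \end{pmatrix} \tag{1900} \]
zero momentum. Hence, the electron in the virtual state must have the form of
a negative-energy eigenstate
\[ \psi_{\sigma'',q''}(r) \approx \frac{1}{\sqrt{V}} \begin{pmatrix} 0 \\ \chi_{\sigma''} \end{pmatrix} \tag{1902} \]
Likewise, the matrix elements which involve the final (positive energy) electron
are evaluated as
\[ < \psi_{\sigma',q'} | \hat{H}_{Int} | \psi_{\sigma'',q''} > = | e | \, \chi^\dagger_{\sigma'} \, \sigma \, \chi_{\sigma''} \, . \, A \tag{1904} \]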
From these one finds that, to second-order, the matrix elements that appear in
the transition rate are given by
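\[ \sum_{\sigma''} \chi_{\sigma''} \, \chi^\dagger_{\sigma''} = I \tag{1907} \]
\[ ( \sigma . \hat{\epsilon}_\alpha(k) ) \, ( \sigma . \hat{\epsilon}_{\alpha'}(k') ) = ( \hat{\epsilon}_\alpha(k) . \hat{\epsilon}_{\alpha'}(k') ) + i \, \sigma . ( \hat{\epsilon}_\alpha(k) \wedge \hat{\epsilon}_{\alpha'}(k') ) \tag{1909} \]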
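The Pauli-matrix identity of eqn(1909) can be confirmed numerically. The sketch below is illustrative only: it assumes the standard representation of the Pauli matrices and two hypothetical real unit polarization vectors, and the helper names are not taken from the text. It also adds the product taken in the opposite order, which is how the vector-product terms drop out when the two contributions to M are combined.

```python
import math

# Pauli matrices (standard representation) as nested tuples.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(a, b):
    """Product of two 2x2 matrices given as nested sequences."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sigma_dot(v):
    """The 2x2 matrix sigma . v for a real 3-vector v."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Two illustrative (hypothetical) real unit polarization vectors.
eps_in = [1.0, 0.0, 0.0]
eps_out = [math.cos(0.7), 0.0, math.sin(0.7)]

# Left- and right-hand sides of eqn(1909).
lhs = matmul(sigma_dot(eps_in), sigma_dot(eps_out))
s_cross = sigma_dot(cross(eps_in, eps_out))
rhs = [[dot(eps_in, eps_out) * (1 if i == j else 0) + 1j * s_cross[i][j]
        for j in range(2)] for i in range(2)]
identity_residual = max_abs_diff(lhs, rhs)

# Adding the opposite ordering cancels the vector-product term.
rev = matmul(sigma_dot(eps_out), sigma_dot(eps_in))
sym = [[lhs[i][j] + rev[i][j] for j in range(2)] for i in range(2)]
target = [[2 * dot(eps_in, eps_out) * (1 if i == j else 0) for j in range(2)]
          for i in range(2)]
cancellation_residual = max_abs_diff(sym, target)

print(identity_residual, cancellation_residual)
```

Both residuals should vanish to rounding accuracy, leaving a sum that is proportional to the unit matrix in spin space.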
Therefore, after combining both terms and noting that the pair of vector product
terms cancel, one finds that the matrix elements are diagonal in the spin indices
and are given by
\[ M \approx \frac{e^2}{2 m c^2} \, \frac{ 2 \pi \hbar c^2 }{ V \sqrt{ \omega_k \omega_{k'} } } \, \delta_{\sigma,\sigma'} \, 2 \, \hat{\epsilon}_\alpha(k) . \hat{\epsilon}_{\alpha'}(k') \tag{1910} \]
These matrix elements are identical to the matrix elements that occur in the
non-relativistic quantum theory of Thomson scattering. On substituting this
into eqn(1894), one recovers the non-relativistic expression for the differential
scattering cross-section
\[ \frac{d\sigma}{d\Omega_{k'}} \approx \delta_{\sigma,\sigma'} \, \frac{\omega_{k'}}{\omega_k} \left( \frac{e^2}{m c^2} \right)^2 | \hat{\epsilon}_\alpha(k) . \hat{\epsilon}_{\alpha'}(k') |^2 \approx \delta_{\sigma,\sigma'} \, \frac{\omega_{k'}}{\omega_k} \left( \frac{e^2}{m c^2} \right)^2 \cos^2 \Theta \tag{1911} \]
where Θ, defined by
\[ \cos \Theta = \hat{\epsilon}_\alpha(k) . \hat{\epsilon}_{\alpha'}(k') \tag{1912} \]
is the angle between the initial and final polarization vectors. Hence, one
concludes that the negative-energy states do play an important role in light
scattering processes which involve low-energy electrons. The result, although
correct, does need re-interpretation, since the states of negative energy are
assumed to be filled with electrons in the vacuum and, therefore, the electron is
forbidden to occupy these levels in the intermediate states.
[Figure: the two scattering diagrams in the Electron-Positron Interpretation, with electrons (e-) q and q', a virtual positron (e+) q'', and photons (k,α) and (k',α').]
The first contribution to the matrix elements, which was described above,
has to be re-interpreted as representing a process in which an electron that
initially occupies the negative-energy state q'' makes a transition to the
positive-energy state q' while emitting the photon (k', α'). This transition is subsequently
followed by the positive-energy electron q absorbing the photon (k, α) and falling
into the empty negative-energy state. In this process, the negative-energy states
are completely occupied in the initial and final states, and energy is conserved
between the initial and final states. By re-ordering the factors in the matrix elements
and noting that since
the contributions to the matrix element of these two descriptions are identical
(apart from an overall negative sign).
one finds an identical expression (and the multiplicative factor of minus one).
Hence, Dirac hole-theory does lead to the correct classical result.
The above description is quite cumbersome, but can be made more concise by
adopting an anti-particle description of the unoccupied negative-energy states.
The first contribution to M first involves the creation of a virtual
electron-positron pair with the emission of the photon (k', α'). The electron which has
just been created in the momentum eigenstate (q', σ') remains unchanged in
the final state. Subsequently, the positron annihilates with the initial electron
(q, σ) while absorbing the photon (k, α). Since the intermediate state is a virtual
state, energy does not have to be conserved. The second contribution to M
involves the creation of a virtual electron-positron pair with the absorption of
the photon (k, α). The created electron (q', σ') remains in the final state while
the positron subsequently annihilates with the initial electron (q, σ) and emits
the photon (k', α'). This process is also a virtual process if the energy of the
incident light h̄ ω_k is less than 2 m c^2.
derived by Klein and Nishina139 in 1928.
We shall multiply the complex conjugate of the Dirac equation by γ^(2), anti-commute
γ^(2) with the real γ^µ* matrices and commute γ^(2) with the γ^(2)* matrix. This
procedure changes the sign in front of the term originating from the differential
momentum operator relative to the sign of the mass term. This procedure yields
\[ \gamma^{(2)} \left[ \gamma^{\mu *} \left( - i \hbar \partial_\mu - \frac{q}{c} A_\mu \right) - m c \right] \psi^* = 0 \]
\[ \left[ \gamma^{\mu} \left( i \hbar \partial_\mu + \frac{q}{c} A_\mu \right) - m c \right] \gamma^{(2)} \psi^* = 0 \tag{1919} \]
Hence, one sees that γ (2) ψ ∗ describes a Dirac particle with mass m and a charge
of − q moving in the presence of a vector potential Aµ . The fact that the opera-
tion of charge conjugation (in any representation) involves complex conjugation
is related to gauge invariance. Charge conjugation is a new type of symmetry for
particles that have complex wave functions which relates particles to particles
with opposite charges. The charge conjugate field ψ c is defined as
\[ \psi^c = \hat{C} \psi^* \tag{1920} \]
which is the result of the complex conjugation followed by the action of a linear
operator Ĉ. The joint operation can be represented as an anti-unitary operator.
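The sign manipulation behind eqn(1919) rests on the relation γ^(2) γ^µ* = − γ^µ γ^(2) for every µ. The sketch below checks this numerically, under the assumption that the γ matrices are taken in the standard (Dirac) representation; the helper names are hypothetical. It also confirms that the linear part − i γ^(2) of the charge conjugation operation is a real matrix that is both Hermitean and unitary.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def neg(a):
    return [[-x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Assumption: Dirac representation, gamma^0 = diag(I, -I), gamma^k = [[0, s_k], [-s_k, 0]].
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]
g2 = GAMMA[2]

# gamma^(2) gamma^(mu)* = - gamma^(mu) gamma^(2) for mu = 0, 1, 2, 3.
relation_holds = all(
    close(matmul(g2, conj(g)), scale(-1, matmul(g, g2))) for g in GAMMA)

# The matrix C = -i gamma^(2).
C = scale(-1j, g2)
expected_C = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]
identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
c_matches = close(C, expected_C)
c_hermitean = close(C, dagger(C))
c_unitary = close(matmul(C, dagger(C)), identity4)

print(relation_holds, c_matches, c_hermitean, c_unitary)
```

In a different representation of the γ matrices the explicit form of the matrix would change, so the check applies only to the representation assumed here.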
139 O. Klein and Y. Nishina, Zeit. für Physik, 52, 843 (1928).
The charge conjugation operator Ĉ is defined as the unitary and Hermitean
operator
\[ \hat{C} = - i \gamma^{(2)} \tag{1921} \]
The charge conjugation operator is Hermitean as
where the anti-commutation relations of the γ matrices have been used. It was
through this type of logic that Kramers140 discovered the form of the charge
conjugation transformation which turns a particle into an anti-particle.
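where we have used the identity z = (z^*)^* in the second line. However, since Ĉ
is real, one finds
\[ \int d^3r \, \psi^{c \dagger}(r) \, \hat{A} \, \psi^c(r) = \left( \int d^3r \, \psi^\dagger(r) \, \hat{C} \hat{A}^* \hat{C} \, \psi(r) \right)^* = - \left( \int d^3r \, \psi^\dagger(r) \, \gamma^{(2)} \hat{A}^* \gamma^{(2)} \, \psi(r) \right)^* \tag{1926} \]
We shall examine the effect of charge conjugation on the plane wave solutions
of the Dirac equation. The plane-wave solutions can be written as
\[ \psi_{\sigma,k}(x) = \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \begin{pmatrix} \chi_\sigma \\ \frac{ c \hbar \, \sigma . k }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp\left[ - i k^\mu x_\mu \right] \tag{1927} \]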
140 H. A. Kramers, Proc. Amst. Akad. Sci. 40, 814 (1937).
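The charge conjugate wave function is given by
\[ \psi^c_{\sigma,k}(x) = \hat{C} \psi^*_{\sigma,k}(x) = \hat{C} \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \begin{pmatrix} \chi^*_\sigma \\ \frac{ c \hbar \, \sigma^* . k }{ E + m c^2 } \chi^*_\sigma \end{pmatrix} \exp\left[ + i k^\mu x_\mu \right] \tag{1928} \]
where
\[ \hat{C} = - i \gamma^{(2)} = \begin{pmatrix} 0 & - i \sigma^{(2)} \\ i \sigma^{(2)} & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{1929} \]
\[ \chi^*_{+\sigma}(\theta, \varphi) = \begin{pmatrix} \cos\frac{\theta}{2} \exp[ + i \frac{\varphi}{2} ] \\ \sin\frac{\theta}{2} \exp[ - i \frac{\varphi}{2} ] \end{pmatrix} \tag{1931} \]
\[ \chi_{-\sigma}(\theta, \varphi) = \begin{pmatrix} - \sin\frac{\theta}{2} \exp[ - i \frac{\varphi}{2} ] \\ \cos\frac{\theta}{2} \exp[ + i \frac{\varphi}{2} ] \end{pmatrix} \tag{1932} \]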
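Likewise, it can be shown that the upper two-component spinor is proportional
to
\[ i \sigma^{(2)} ( \sigma^* . k ) \, \chi_{+\sigma}(\theta, \varphi)^* = - ( \sigma . k ) ( i \sigma^{(2)} ) \, \chi_{+\sigma}(\theta, \varphi)^* = - ( \sigma . k ) \, \chi_{-\sigma}(\theta, \varphi) = ( \sigma . ( - k ) ) \, \chi_{-\sigma}(\theta, \varphi) \tag{1934} \]
The end result is that the charge conjugated single-particle wave function has
the form
\[ \psi^c_{\sigma,k}(x) = \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \begin{pmatrix} - \frac{ c \hbar \, ( \sigma . (-k) ) }{ E + m c^2 } \chi_{-\sigma} \\ \chi_{-\sigma} \end{pmatrix} \exp\left[ + i k^\mu x_\mu \right] \tag{1935} \]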
The properties described above are the properties of a state of a relativistic free
particle with a negative energy eigenvalue − E, momentum − h̄ k and spin − σ.
The absence of an electron in the charge conjugated state describes a positron,
with positive energy E, momentum h̄ k and spin σ.
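Exercise:
Prove the completeness relation for the set of solutions for the Dirac equation
for a free particle
\[ \sum_\alpha \left[ \phi_\alpha(r)_\lambda \, \phi^\dagger_\alpha(r')_\rho + \phi^c_\alpha(r)_\lambda \, \phi^{c \dagger}_\alpha(r')_\rho \right] = \delta^3(r - r') \, \delta_{\lambda,\rho} \tag{1936} \]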
142 Frequently, the relativistic free electron states are given a manifestly covariant normalization,
in order to facilitate covariant perturbation theory. The use of different normalization
conventions results in changes to the form of the completeness relation.
12 The Many-Particle Dirac Field
12.1 Second Quantization of Fermions
No. Accounting for fermions143 .
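\[ \{ \hat{b}^\dagger_\alpha , \hat{b}_\beta \}_+ = \delta_{\alpha,\beta} \tag{1939} \]
for the positron operators, and the mixed electron/positron anti-commutation
relations are given by
\[ \{ \hat{c}^\dagger_\alpha , \hat{b}^\dagger_\beta \}_+ = \{ \hat{c}_\alpha , \hat{b}_\beta \}_+ = \{ \hat{c}^\dagger_\alpha , \hat{b}_\beta \}_+ = 0 \tag{1940} \]
The mixed electron/positron anti-commutation relations are all zero, since the
operators describe electrons in different single-particle energy eigenstates. In
this notation, the field operators are expressed as146
\[ \hat{\psi}(r) = \sum_\alpha \left[ \phi_\alpha(r) \, \hat{c}_\alpha + \phi^c_\alpha(r) \, \hat{b}^\dagger_\alpha \right] \tag{1941} \]
and
\[ \hat{\psi}^\dagger(r) = \sum_\alpha \left[ \phi^*_\alpha(r) \, \hat{c}^\dagger_\alpha + \phi^{c *}_\alpha(r) \, \hat{b}_\alpha \right] \tag{1942} \]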
The field operators ψ̂(r) and ψ̂ † (r) are expected to be canonically conjugate, as
we shall show below.
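\[ \hat{\Pi}(r) = \frac{1}{c} \frac{ \delta L }{ \delta ( \partial_0 \hat{\psi} ) } = i \hbar \, \hat{\overline{\psi}}^\dagger(r) \, \gamma^{(0)} = i \hbar \, \hat{\psi}^\dagger(r) \tag{1944} \]
Hence, one expects that the field operators ψ̂†(r) and ψ̂(r) are canonically
conjugate and, therefore, satisfy the equal-time anti-commutation relations
\[ \{ \hat{\psi}^\dagger(r)_\lambda , \hat{\psi}(r')_\rho \}_+ = \delta^3(r - r') \, \delta_{\lambda,\rho} \tag{1945} \]
where λ and ρ label the components of the Dirac spinor. The anti-commutation
relations for the field operators can be verified by noting that
\[ \begin{aligned}
\{ \hat{\psi}^\dagger(r) , \hat{\psi}(r') \}_+ &= \sum_{\alpha,\beta} \Big[ \{ \hat{c}^\dagger_\alpha , \hat{c}_\beta \}_+ \, \phi^*_\alpha(r) \, \phi_\beta(r') + \{ \hat{c}^\dagger_\alpha , \hat{b}^\dagger_\beta \}_+ \, \phi^*_\alpha(r) \, \phi^c_\beta(r') \\
&\qquad\quad + \{ \hat{b}_\alpha , \hat{c}_\beta \}_+ \, \phi^{c *}_\alpha(r) \, \phi_\beta(r') + \{ \hat{b}_\alpha , \hat{b}^\dagger_\beta \}_+ \, \phi^{c *}_\alpha(r) \, \phi^c_\beta(r') \Big] \\
&= \sum_{\alpha,\beta} \left[ \delta_{\alpha,\beta} \, \phi^*_\alpha(r) \, \phi_\beta(r') + \delta_{\alpha,\beta} \, \phi^{c *}_\alpha(r) \, \phi^c_\beta(r') \right] \\
&= \sum_\alpha \left[ \phi^*_\alpha(r) \, \phi_\alpha(r') + \phi^{c *}_\alpha(r) \, \phi^c_\alpha(r') \right] \\
&= \delta^3(r - r')
\end{aligned} \tag{1946} \]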
146 W. Heisenberg and W. Pauli, Zeit. für Physik, 56, 1 (1929).
where the fermion anti-commutation relations have been used in arriving at the
second line. The positive-energy states and their charge conjugated states form
a complete set of basis states for the single-particle Dirac equation, so their
completeness condition has been used in going from the third to the fourth line.
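The operator algebra used in the second line of eqn(1946) can be illustrated with an explicit matrix model. The sketch below is an assumption-laden toy example, not part of the derivation: it restricts attention to one electron mode and one positron mode and represents the operators by a Jordan-Wigner construction (a standard device that is not used in the text), with hypothetical variable names.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(len(a))] for i in range(len(a))]


LOWER = [[0, 1], [0, 0]]     # single-mode annihilation operator, basis (empty, occupied)
PARITY = [[1, 0], [0, -1]]   # Jordan-Wigner sign factor
ONE2 = [[1, 0], [0, 1]]

c = kron(LOWER, ONE2)        # electron annihilation operator
b = kron(PARITY, LOWER)      # positron annihilation operator
c_dag, b_dag = transpose(c), transpose(b)   # real matrices, so dagger = transpose

ONE4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

checks = {
    "{c+, c} = 1": anticommutator(c_dag, c) == ONE4,
    "{b+, b} = 1": anticommutator(b_dag, b) == ONE4,
    "{c+, b+} = 0": anticommutator(c_dag, b_dag) == ZERO4,
    "{c, b} = 0": anticommutator(c, b) == ZERO4,
    "{c+, b} = 0": anticommutator(c_dag, b) == ZERO4,
}
for label, ok in checks.items():
    print(label, ok)
```

The sign factor in the positron operator is what makes the mixed anti-commutators vanish; without it the two sets of operators would commute rather than anti-commute.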
The equal-time field anti-commutation relations can be generalized to field
anti-commutators at space-time points with a general type of separation. In the case
where the two field points x and x' have a space-like separation
\[ ( x^\mu - x'^\mu ) \, ( x_\mu - x'_\mu ) < 0 \]
the anti-commutator vanishes
\[ \{ \hat{\psi}^\dagger(x) , \hat{\psi}(x') \}_+ = 0 \]
[Figure: space-time diagram with axes ct and r, showing the regions ∆x^2 > 0 and ∆x^2 < 0.]
Figure 70: Due to causality, the anti-commutator of the field operator should
vanish for space-like separations. The anti-commutators can be non-zero inside
or on the light cone.
L. Rosenfeld148 have put forward general arguments that the commutation rela-
tions also place limitations on the measurement of fields at time-like separations.
The Hamiltonian density for the (non-interacting) quantized Dirac field theory
can be expressed as the operator
\[ \hat{H} = \hat{\psi}^\dagger \, \gamma^{(0)} \, c \left( - i \hbar \, \gamma . \nabla + m c \right) \hat{\psi} \tag{1947} \]
147 Outside the light-cone there is no way to distinguish between future and past.
148 N. Bohr and L. Rosenfeld, Kon. Dansk. Vid. Selskab., Mat.-Fys. Medd. XII, 8 (1933).
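When the expansion of the quantized field in terms of single-particle wave
functions is substituted into the Hamiltonian, one finds
\[ \hat{H} = \sum_\alpha \left[ E_\alpha \, \hat{c}^\dagger_\alpha \hat{c}_\alpha + E^c_\alpha \, \hat{b}_\alpha \hat{b}^\dagger_\alpha \right] = \sum_\alpha \left[ E_\alpha \, \hat{c}^\dagger_\alpha \hat{c}_\alpha - E_\alpha \, \hat{b}_\alpha \hat{b}^\dagger_\alpha \right] \tag{1949} \]
where the expression for the energy of the charge conjugated state
\[ E^c_\alpha = - E_\alpha \tag{1950} \]
has been used. On anti-commuting the positron creation and annihilation operators, one
finds
\[ \hat{H} = \sum_\alpha E_\alpha \left( \hat{c}^\dagger_\alpha \hat{c}_\alpha + \hat{b}^\dagger_\alpha \hat{b}_\alpha - 1 \right) \tag{1951} \]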
The last term, when summed over α, yields the infinitely negative energy of
Dirac’s vacuum in which all the negative-energy states are filled. The vacuum
energy shall be used as the reference energy, so the Hamiltonian becomes
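\[ \hat{H} = \sum_\alpha E_\alpha \left( \hat{c}^\dagger_\alpha \hat{c}_\alpha + \hat{b}^\dagger_\alpha \hat{b}_\alpha \right) \tag{1952} \]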
which describes the energy of the excited state as the sum of the energies of the
excited electrons and the excited positrons. The energies of the positrons and
electrons are given by positive numbers.
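The steps from eqn(1949) to eqn(1952) can be followed in a small matrix model. This is a sketch under stated assumptions only: a single mode α with an illustrative energy E_ALPHA = 1.5, one electron and one positron operator in a Jordan-Wigner representation (not used in the text), and hypothetical variable names. The diagonal entries are the energies of the four occupation states (empty, one positron, one electron, one of each).

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def transpose(a):
    return [list(col) for col in zip(*a)]


LOWER = [[0, 1], [0, 0]]     # annihilation operator, basis (empty, occupied)
PARITY = [[1, 0], [0, -1]]   # Jordan-Wigner sign factor
ONE2 = [[1, 0], [0, 1]]

c = kron(LOWER, ONE2)        # electron annihilation operator
b = kron(PARITY, LOWER)      # positron annihilation operator
c_dag, b_dag = transpose(c), transpose(b)

E_ALPHA = 1.5                # illustrative single-particle energy (assumption)

n_c = matmul(c_dag, c)       # electron number operator
n_b = matmul(b_dag, b)       # positron number operator
b_bdag = matmul(b, b_dag)    # the ordering that appears in eqn(1949)

# Eqn(1949): H = E c+c - E b b+
h_1949 = [[E_ALPHA * n_c[i][j] - E_ALPHA * b_bdag[i][j] for j in range(4)]
          for i in range(4)]
# Eqn(1951): H = E (c+c + b+b - 1)
h_1951 = [[E_ALPHA * (n_c[i][j] + n_b[i][j] - (1 if i == j else 0))
           for j in range(4)] for i in range(4)]

spectrum_1949 = [h_1949[i][i] for i in range(4)]
# Eqn(1952): energies measured relative to the vacuum energy -E.
spectrum_1952 = [E_ALPHA * (n_c[i][i] + n_b[i][i]) for i in range(4)]

print(h_1949 == h_1951, spectrum_1949, spectrum_1952)
```

In this toy model the lowest energy of eqn(1949) is −E, which plays the role of the vacuum energy; once it is taken as the reference level, all excitation energies are non-negative.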
which is just the sum of the momenta of the (positive-energy) electrons and the
positrons. The spin operator is defined as
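\[ \hat{S} = \frac{\hbar}{2} \int d^3r \, \hat{\psi}^\dagger \, \hat{\sigma} \, \hat{\psi} \tag{1954} \]
This is evaluated by substituting the expression for the field operators in terms
of the single-particle wave functions and the particle creation and annihilation
operators. The expectation value of the spin operator in the charge conjugated
state φ^c_α is given by
\[ \begin{aligned}
\int d^3r \, \phi^{c \dagger}_\alpha(r) \, \hat{\sigma} \, \phi^c_\alpha(r) &= - \left( \int d^3r \, \phi^\dagger_\alpha(r) \, \gamma^{(2)} \sigma^* \gamma^{(2)} \, \phi_\alpha(r) \right)^* \\
&= \left( \int d^3r \, \phi^\dagger_\alpha(r) \, \sigma^{(2)} \hat{\sigma}^* \sigma^{(2)} \, \phi_\alpha(r) \right)^* \\
&= - \left( \int d^3r \, \phi^\dagger_\alpha(r) \, \hat{\sigma} \, \phi_\alpha(r) \right)^* \\
&= - \int d^3r \, \phi^\dagger_\alpha(r) \, \hat{\sigma} \, \phi_\alpha(r)
\end{aligned} \tag{1955} \]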
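The last line follows since σ is Hermitean. Hence, the spin operator is evaluated
as
\[ \hat{S} = \frac{\hbar}{2} \sum_{k; \sigma', \sigma''} \chi^\dagger_{\sigma''} \, \sigma \, \chi_{\sigma'} \left( \hat{c}^\dagger_{k,\sigma''} \hat{c}_{k,\sigma'} + \hat{b}^\dagger_{k,\sigma''} \hat{b}_{k,\sigma'} \right) \tag{1957} \]
which is just the sum of the spins of the electrons and positrons.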
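The step between the second and third lines of eqn(1955) uses the property σ^(2) σ* σ^(2) = − σ of the Pauli matrices. A minimal numerical check, assuming the standard representation of the Pauli matrices and using hypothetical helper names, is:

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]


sigma2 = SIGMA[1]
spin_flipped = []
for s in SIGMA:
    # sigma^(2) sigma^(k)* sigma^(2), to be compared with -sigma^(k)
    transformed = matmul(matmul(sigma2, conj(s)), sigma2)
    spin_flipped.append(
        all(abs(transformed[i][j] + s[i][j]) < 1e-12
            for i in range(2) for j in range(2)))

print(spin_flipped)
```

All three components change sign, which is the origin of the reversed spin of the charge conjugated state.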
The last term in the parenthesis, when summed over all states α, yields the total
charge of the vacuum which is to be discarded. Hence, the observable charge is
defined as
\[ \hat{Q} = \sum_\alpha \left( \hat{c}^\dagger_\alpha \hat{c}_\alpha - \hat{b}^\dagger_\alpha \hat{b}_\alpha \right) \tag{1959} \]
which shows that the total electrical charge, defined as the difference between
the number of electrons and the number of positrons, is conserved.
12.3.1 Parity
The parity eigenvalue equation for a multi-particle state with parity η_ψ can be
expressed as
\[ \hat{P} \, | \psi > = \eta_\psi \, | \psi > \tag{1960} \]
Since the action of the parity operator on states is described by a unitary operator,
operators transform under parity according to the general form of a unitary
transformation. In particular, the effect of the parity transformation on the
field operator is determined as
\[ \hat{\psi}(r) \rightarrow \hat{\psi}'(r') = \hat{P} \, \hat{\psi}(r) \, \hat{P} \tag{1961} \]
The parity transformation is going to be determined in analogy with the parity
transformation of a classical field, in which the creation and annihilation
operators are replaced by complex numbers. The parity operation on the quantum
field can be interpreted as only acting on the wave functions and not the particle
creation and annihilation operators. Quantum mechanically, this corresponds
to viewing the parity operator as changing the properties of the states to the
properties associated with the parity reversed states. Since the field operator is
expressed as
\[ \hat{\psi}(r) = \sum_\alpha \left[ \hat{c}_\alpha \, \phi_\alpha(r) + \hat{b}^\dagger_\alpha \, \phi^c_\alpha(r) \right] \tag{1962} \]
one has
\[ \hat{P} \, \hat{\psi}(r) \, \hat{P} = \sum_\alpha \left[ \hat{c}_\alpha \, \hat{P} \phi_\alpha(r) \hat{P} + \hat{b}^\dagger_\alpha \, \hat{P} \phi^c_\alpha(r) \hat{P} \right] \tag{1963} \]
so the quantum field operator transforms in a similar fashion to the classical
field.
The relations between parity reversed states and parity reversed charge con-
jugated states can be verified by examining the free particle solutions of the
Dirac equation and noting that the parity operator consists of the product of
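γ^(0) and spatial inversion r → − r. Spatial inversion turns a wave
function with momentum k and spin σ into a wave function with momentum
−k and spin σ, up to a constant of proportionality. A free particle momentum
eigenstate is given by
\[ \phi_{\sigma,k}(x) = N \begin{pmatrix} \chi_\sigma \\ \frac{ c \hbar \, k . \sigma }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp\left[ - i ( k_0 x^{(0)} - k . r ) \right] \tag{1968} \]
The application of the parity operator to the above wave function yields
\[ \begin{aligned}
\hat{P} \, \phi_{\sigma,k}(x) &= N \gamma^{(0)} \begin{pmatrix} \chi_\sigma \\ \frac{ c \hbar \, k . \sigma }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp\left[ - i ( k_0 x^{(0)} + k . r ) \right] \\
&= N \begin{pmatrix} \chi_\sigma \\ - \frac{ c \hbar \, k . \sigma }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp\left[ - i ( k_0 x^{(0)} + k . r ) \right] \\
&= \phi_{\sigma,-k}(x)
\end{aligned} \tag{1969} \]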
as anticipated. The charge conjugated state is given by
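\[ \phi^c_{\sigma,k}(x) = \hat{C} \, \phi^*_{\sigma,k}(x) = N \, \hat{C} \begin{pmatrix} \chi^*_\sigma \\ \frac{ c \hbar \, k . \sigma^* }{ E + m c^2 } \chi^*_\sigma \end{pmatrix} \exp\left[ + i k^\mu x_\mu \right] \tag{1970} \]
where
\[ \hat{C} = - i \gamma^{(2)} = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{1971} \]
Therefore, the charge conjugate wave function is given by
\[ \phi^c_{\sigma,k}(x) = - i \hat{\sigma}^{(2)} \, N \begin{pmatrix} \frac{ c \hbar \, k . \sigma^* }{ E + m c^2 } \chi^*_\sigma \\ - \chi^*_\sigma \end{pmatrix} \exp\left[ + i k^\mu x_\mu \right] \tag{1972} \]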
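The effect of the parity operator on this state leads to
\[ \begin{aligned}
\hat{P} \, \phi^c_{\sigma,k}(x) &= - i \hat{\sigma}^{(2)} \, N \begin{pmatrix} \frac{ c \hbar \, k . \sigma^* }{ E + m c^2 } \chi^*_\sigma \\ + \chi^*_\sigma \end{pmatrix} \exp\left[ + i ( k^{(0)} x_0 + k . r ) \right] \\
&= - i \hat{\sigma}^{(2)} \, N \begin{pmatrix} - \frac{ c \hbar \, ( - k . \sigma^* ) }{ E + m c^2 } \chi^*_\sigma \\ + \chi^*_\sigma \end{pmatrix} \exp\left[ + i ( k^{(0)} x_0 + k . r ) \right] \\
&= - \phi^c_{\sigma,-k}(x)
\end{aligned} \tag{1973} \]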
where in the first line the parity operator has sent r → − r and the factor of
γ^(0) has flipped the sign of the lower components. In the second line we have
re-written k as −(−k) in the upper two-component spinor, in anticipation of the
comparison with eqn(1970) which allows us to identify the factor of φ^c_{σ,−k}(x).
This example shows that a state and its charge conjugate have opposite intrinsic
parities.
From the general form of the parity transformation on Dirac spinors, one
infers that the parity transform of the field operator is given by
\[ \hat{P} \, \hat{\psi}(r) \, \hat{P} = \sum_\alpha \left[ \hat{c}_\alpha \, \eta^P_\alpha \, \phi_{P\alpha}(r) - \hat{b}^\dagger_\alpha \, \eta^P_\alpha \, \phi^c_{P\alpha}(r) \right] \tag{1974} \]
Thus, the parity operation can also be interpreted as only affecting the particle
creation and annihilation operators, and not the wave functions. Quantum
mechanically, this interpretation corresponds to viewing the particles as
being transferred into their parity reversed states
\[ \hat{P} \, \hat{\psi}(r) \, \hat{P} = \sum_\alpha \left[ \hat{P} \hat{c}_\alpha \hat{P} \, \phi_\alpha(r) + \hat{P} \hat{b}^\dagger_\alpha \hat{P} \, \phi^c_\alpha(r) \right] \tag{1977} \]
In this new interpretation, the effects of parity on the fermion operators are
found by identifying the operators multiplying the single-particle wave functions
in the previous two equations. The resulting operator equations are
\[ \hat{P} \, \hat{c}_\alpha \, \hat{P} = \eta^P_{P\alpha} \, \hat{c}_{P\alpha} \tag{1978} \]
and
\[ \hat{P} \, \hat{b}^\dagger_\alpha \, \hat{P} = - \eta^P_{P\alpha} \, \hat{b}^\dagger_{P\alpha} \tag{1979} \]
which shows that fermion particles and anti-particles have opposite intrinsic
parities. Therefore, we conclude that, irrespective of which interpretation is
used, the field operator transforms as
\[ \hat{P} \, \hat{\psi}(r) \, \hat{P} = \sum_\alpha \left[ \eta^P_\alpha \, \hat{c}_\alpha \, \phi_{P\alpha}(r) - \eta^P_\alpha \, \hat{b}^\dagger_\alpha \, \phi^c_{P\alpha}(r) \right] \tag{1980} \]
which shows that the quantum field operator transforms in a similar fashion
to the classical field.
12.3.2 Charge Conjugation
Under charge conjugation, the classical Dirac field transforms as
\[ \psi \rightarrow \psi^c = - i \gamma^{(2)} \psi^* \tag{1981} \]
(up to an arbitrary phase) since this is how the single-particle wave functions
transform. Classically, the (anti-linear) charge conjugation operator \hat{\mathcal{C}} is the
product of complex conjugation and the unitary matrix operator Ĉ = − i γ^(2).
If the classical field is expressed as a linear superposition of energy eigenfunc-
tions, the amplitudes of the eigenfunctions are represented by complex numbers.
In the charge conjugated state, these amplitudes are replaced by the complex
conjugates. In the quantum field, the amplitudes must be replaced by parti-
cle creation and annihilation operators. If an amplitude is associated with an
annihilation operator, then the complex conjugate of the amplitude is usually
associated with a creation operator. Hence, we should expect that charge con-
jugation will result in the creation and annihilation operators being switched.
where, in accord with the earlier comment about the relation between the quan-
tum and classical fields, the single-particle operators have been replaced by their
Hermitean conjugates. However, under charge conjugation general Dirac spinors
satisfy
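therefore,
\[ \hat{\psi}^c(r) = \hat{\mathcal{C}} \, \hat{\psi}(r) \, \hat{\mathcal{C}} = \eta^c \sum_\alpha \left[ \hat{c}^\dagger_\alpha \, \phi^c_\alpha(r) + \hat{b}_\alpha \, \phi_\alpha(r) \right] \tag{1985} \]
For consistency, the two expressions for ψ̂^c(r) must be equivalent. Hence, the
operator coefficients of φ_α(r) and φ^c_α(r) in the two expressions should be identical.
Therefore, one requires that
\[ \hat{\mathcal{C}} \, \hat{c}_\alpha \, \hat{\mathcal{C}} = \eta^c \, \hat{b}_\alpha , \qquad \hat{\mathcal{C}} \, \hat{b}^\dagger_\alpha \, \hat{\mathcal{C}} = \eta^c \, \hat{c}^\dagger_\alpha \tag{1987} \]
\[ \hat{\psi}^c = \hat{\mathcal{C}} \, \hat{\psi} \, \hat{\mathcal{C}} = - i \, \eta^c \, \gamma^{(2)} \, \hat{\psi}^\dagger \tag{1988} \]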
where ψ̂ † is the Hermitean conjugate (column) field operator. Apart from the
replacement of the complex amplitudes with the Hermitean conjugates of the
creation and annihilation operators, the above expression is identical to the ex-
pression for charge conjugation on the classical field.
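The charge conjugation operator has the effect of reversing the current density
operator
\[ \hat{\mathcal{C}} \, \hat{\overline{\psi}}^\dagger \gamma^\mu \hat{\psi} \, \hat{\mathcal{C}} = - \hat{\overline{\psi}}^\dagger \gamma^\mu \hat{\psi} \tag{1989} \]
which is understood as resulting from the change in the sign of the charge.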
Thus, under time reversal, the time-like and space-like components of the po-
sition four-vector have different transformational properties. Furthermore, the
energy-momentum four-vector transforms as
Hence, the position four-vector and momentum four-vector have different trans-
formational properties. Due to the above properties, angular momentum (in-
cluding spin) transforms as
T̂ J = − J (1992)
Therefore, we find that time reversal reverses momenta and flips spins.
T̂ interchanges the initial and final states, then
satisfies the Dirac equation with t → − t. For example, the plane wave
solutions of the Dirac equation can be shown to transform as
which flips the momentum and the spin angular momentum. It should be noted
that the matrix operator γ (1) γ (3) does not couple the upper and lower two-
component spinors, but nevertheless is closely related to the operator − i γ (2)
which occurs in the charge conjugation operator.
\[ \hat{T} \, c_\alpha \, \hat{T} = c_{T\alpha} , \qquad \hat{T} \, b_\alpha \, \hat{T} = b_{T\alpha} \tag{2000} \]
Table 19: Discrete Symmetries of Particles.
The charge conjugate of a state is a negative-energy state with momentum −p
and spin −σ, which is interpreted as the state of an antiparticle with momentum p
and spin σ.
                      Q    p    σ    Λ
Charge Conjugation    −    +    +    +
Parity                +    −    +    −
Time Reversal         +    −    −    +
CPT                   −    +    −    −
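\[ \psi'(x') = \hat{\mathcal{C}} \hat{P} \hat{T} \, \psi(x) = - i \gamma^{(2)} \left( \hat{P} \hat{T} \psi(x) \right)^* = + i \gamma^{(2)} \left( \gamma^{(0)} \gamma^{(1)} \gamma^{(3)} \psi^*(-x) \right)^* \]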
12.3.4 The CPT Theorem
The CPT theorem states that any local152 quantum field theory with a Her-
mitean Lorentz invariant Lagrangian which satisfies the spin-statistics theorem,
is invariant under the compound operation Cˆ P̂ T̂ , where the operators can be
placed in any order.
The proof of the theorem relies on the fact that any Lorentz invariant quan-
tity must be created out of contracting the indices of bi-linear covariants (quan-
tities such as the current density jµ which involve products of the γµ ) with the
indices of contravariant derivatives ∂^µ. Since the joint operation P̂ T̂ results in
each of the contravariant derivatives ∂^µ in the product changing sign, the theorem
ensures that the corresponding bi-linear covariants with which the derivatives
are contracted must undergo an equivalent number of sign changes
under the compound operation \hat{\mathcal{C}} P̂ T̂. The theorem only assumes invariance
under proper orthochronous Lorentz transformations and makes no assumptions
about reflection. The improper transformations are treated as analytic continu-
ation of the Lorentz transformation into complex space-time. The theorem was
first discussed by Lüders153 and Pauli154 , and then by Lee, Oehme and Yang155 .
The theorem has several consequences, such as the equality of the masses of
particles and their anti-particles. This follows since the mass mc is an eigenvalue
of p̂(0) in the particle’s rest frame and since one can find simultaneous eigenstates
of the commuting operators p̂µ and the product Cˆ P̂ T̂ . If one denotes the
compound operator as
Θ̂ = Cˆ P̂ T̂ (2002)
then
since the CPT theorem ensures that Θ̂ commutes with the Hamiltonian
Θ̂ Ĥ Θ̂−1 = Ĥ (2004)
interactions can be expressed in terms of products of fields at the same point in space-time.
It would be truly remarkable if this concept were to continue to work at arbitrarily small
distances!
153 G. Lüders, Dan. Mat. Fys. Medd. 28, 5 (1954).
then the state Θ̂ | Ψ > describes an anti-particle with flipped angular momen-
tum. This follows since the vacuum satisfies
one finds that the energy of a particle is equal to the energy of an anti-particle
with a reversed spin. However, as the rest mass cannot depend on the angular
momentum, the mass of a particle is equal to the mass of its anti-particle. For
unstable particles, the equality of the mass of the particle and anti-particle is
ensured by the invariance of the S-matrix under Cˆ P̂ T̂ .
Likewise, one can use the CPT theorem to show that the total decay rate
of a particle into products is equal to the total decay rate of the anti-particle
into its products156 . It should be noted that the partial decay rates into specific
final states are not equivalent, only the sums over all final states are equal.
which only has positive excitation energies. Hence, if the wave function changes
sign under the interchange of a pair of spin one-half particles the energy is
bounded from below. If the field operators had been chosen to obey commutation
relations, then the wave function would have been symmetric under the
156 T. D. Lee, R. Oehme and C. N. Yang, Phys. Rev. 106, 340 (1957).
157 W. Pauli, Phys. Rev. 58, 716 (1940).
interchange of particles. If this were the case, there would be a negative sign in
front of the positron energies so that the energy would have been unbounded
from below. This would have implied that the vacuum would not be stable, and
that the theory would be erroneous. This can be taken as implying that spin one-half
particles must obey Fermi-Dirac Statistics. The other part of the theorem compels
integer spin particles to be bosons. Therefore, since photons have spin one, the
expression for the energy of the electromagnetic field is considered to be given
by
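\[ \hat{H}_{Photon} = \sum_{k,\alpha} \frac{ \hbar \omega_k }{ 2 } \left( \hat{a}^\dagger_{k,\alpha} \hat{a}_{k,\alpha} + \hat{a}_{k,\alpha} \hat{a}^\dagger_{k,\alpha} \right) \tag{2010} \]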
which is the sum of the vacuum energy (the zero-point energies) and the ener-
gies of each excited photon. The excitation energies are positive. If it had been
assumed that the photon wave functions were anti-symmetric under the inter-
change of particles, then one would have found that the photon energies would
have been identically equal to zero. Furthermore, the excited photons would
have carried zero momentum and, therefore, be completely void of any physi-
cal consequence. Hence, one concludes that spin-one photons must obey Bose-
Einstein Statistics. The generalized theorem158 is an assertion that a non-trivial
integer spin field cannot have an anti-commutator that vanishes for space-like
separations and a non-trivial odd half-integer spin field cannot have a commutator
that vanishes for space-like separations.
This is equivalent to assuming four independent real fields. The inner product
is defined as
\[ \Phi^\dagger \Phi = \Phi_1^* \Phi_1 + \Phi_2^* \Phi_2 \tag{2014} \]
where α(0) is an arbitrary scalar. The invariance of the Lagrangian under mul-
tiplication of the wave function by the phase factor, is equivalent to the usual
U (1) gauge invariance which has been discussed in the context of the electro-
magnetic field. The operator Û must be a unitary operator, if the norm of Φ is
conserved by the generalized gauge transformation
\[ \Phi'^\dagger \Phi' = \Phi^\dagger \, \hat{U}^\dagger \hat{U} \, \Phi = \Phi^\dagger \Phi \tag{2016} \]
Therefore, one requires
\[ \hat{U}^\dagger \hat{U} = \hat{I} \tag{2017} \]
and so Û must be a unitary operator. The operator Û is assumed to be an
arbitrary unitary matrix that acts on isospin states, that is, it acts on the two
components of Φ. Furthermore, it shall be assumed that the unitary matrix
has determinant + 1. Hence, the Lagrangian is assumed to be invariant under
a set of SU(2) gauge transformations. A general transformation of SU(2) is
generated by the three operators
\[ \tau^{(1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \tau^{(2)} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \tau^{(3)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{2018} \]
where these matrices generate a Lie algebra. That is, the algebra of the
commutation relations is closed, since
\[ [ \tau^{(i)} , \tau^{(j)} ] = 2 i \, \xi^{i,j,k} \, \tau^{(k)} \tag{2019} \]
where ξ^{i,j,k} is the antisymmetric Levi-Civita symbol. An arbitrary unitary
transformation can be expressed as
\[ \hat{U} = \exp\left[ - i \sum_k \alpha^k \tau^{(k)} \right] \tag{2020} \]
where the α^k are three real quantities. This represents an arbitrary rotation in
isospin space160. The U(1) gauge transformation can also be represented in the
same way. Namely, the U(1) transformation can be expressed as
\[ \hat{U}_0 = \exp\left[ - i \alpha^{(0)} \tau^{(0)} \right] \tag{2021} \]
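The algebraic statements about the τ matrices can be checked directly. The sketch below is illustrative only, with hypothetical helper names and an arbitrarily chosen set of parameters α^k: it verifies the commutation relations of eqn(2019) and the product rule τ^(i) τ^(j) = δ^{i,j} τ^(0) + i ξ^{i,j,k} τ^(k), taking τ^(0) to be the 2x2 unit matrix, and then builds one SU(2) element from the closed form exp[−i α.τ] = cos|α| − i sin|α| (α.τ)/|α| (a standard identity that is not quoted in the text), comparing it with a truncated power series.

```python
import math

# tau^(0) is taken to be the 2x2 unit matrix; tau^(1..3) are those of eqn(2018).
TAU = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDX = (1, 2, 3)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lincomb(coeffs, mats):
    """Sum of coeffs[n] * mats[n] for 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def levi_civita(i, j, k):
    """Antisymmetric symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2


def commutator(a, b):
    return lincomb([1, -1], [matmul(a, b), matmul(b, a)])


algebra_ok = all(
    close(commutator(TAU[i], TAU[j]),
          lincomb([2j * levi_civita(i, j, k) for k in IDX], TAU[1:]))
    for i in IDX for j in IDX)

product_ok = all(
    close(matmul(TAU[i], TAU[j]),
          lincomb([1 if i == j else 0]
                  + [1j * levi_civita(i, j, k) for k in IDX], TAU))
    for i in IDX for j in IDX)


def su2_element(alpha):
    """Closed form of exp(-i sum_k alpha^k tau^(k)) for real alpha."""
    norm = math.sqrt(sum(a * a for a in alpha))
    if norm == 0.0:
        return [row[:] for row in TAU[0]]
    coeffs = [math.cos(norm)] + [-1j * math.sin(norm) * a / norm for a in alpha]
    return lincomb(coeffs, TAU)


def exp_series(m, terms=40):
    """Truncated power series for the matrix exponential of m."""
    result = [row[:] for row in TAU[0]]
    term = [row[:] for row in TAU[0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, m)]
        result = lincomb([1, 1], [result, term])
    return result


alpha = [0.3, -0.5, 0.8]   # arbitrary illustrative parameters
generator = lincomb([-1j * a for a in alpha], TAU[1:])
U = su2_element(alpha)
U_dagger = [[U[j][i].conjugate() for j in range(2)] for i in range(2)]

series_ok = close(U, exp_series(generator), 1e-10)
unitary_ok = close(matmul(U_dagger, U), TAU[0])
determinant = U[0][0] * U[1][1] - U[0][1] * U[1][0]

print(algebra_ok, product_ok, series_ok, unitary_ok, determinant)
```

The resulting matrix is unitary with unit determinant, as required of an element of SU(2), whereas the U(1) factor generated by τ^(0) only multiplies Φ by a phase.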
We shall alter the Lagrangian, such that it is invariant under a gauge trans-
formation which varies from point to point in space. These are local gauge
transformations, in which the αk (x) depend on x. If the Lagrangian is to be
invariant under local gauge transformations, then one must introduce a coupling
to gauge fields Aµ . This coupling compensates for the change of the derivatives
under the gauge transformation, so that
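\[ \left[ \left( \partial_{\mu} - i g A_{\mu} \right) \Phi \right]^{\dagger} \left( \partial^{\mu} - i g A^{\mu} \right) \Phi
   = \left[ \left( \partial_{\mu} - i g A'_{\mu} \right) \Phi' \right]^{\dagger} \left( \partial^{\mu} - i g A'^{\mu} \right) \Phi' \tag{2025} \]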
160 We shall not stop and contemplate the question of what restricts our measurements to
being quantized along the isospin z-direction, and shall not ponder why there is a
super-selection rule at work.
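Since
\[ \Phi = \hat{U}^{\dagger} \Phi' \tag{2026} \]
we require that
\[ \left( \partial^{\mu} - i g A'^{\mu} \right) \Phi' = \hat{U} \left( \partial^{\mu} - i g A^{\mu} \right) \Phi \tag{2027} \]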
where the derivative only acts on the unitary transformation. Since the Û are
generated by τ (k) , there must be four components of Aµ , i.e. the fields have
four components Aµ,k . The matrix form of Aµ is given by
\[ A^{\mu} = \sum_{k=1}^{3} A^{\mu,k} \tau^{(k)}
   = \begin{pmatrix} A^{\mu,(3)} & A^{\mu,(1)} - i A^{\mu,(2)} \\ A^{\mu,(1)} + i A^{\mu,(2)} & - A^{\mu,(3)} \end{pmatrix} \tag{2029} \]
We shall identify the contravariant derivative for the massive scalar particles
as161
\[ D^{\mu} = \partial^{\mu} - i g A^{\mu} - i g' A^{\mu}_{(0)} \tag{2030} \]
and one recognizes that this has the same form as the coupling of charged
particles to the EM field. In that case, the coupling occurs solely via τ (0) , the
coupling constant is given by g′ = q / ( h̄ c ) and the field Aµ(0) = Aµ is the
four-vector potential. Since τ (0) commutes with all isospin operators, it is not
necessary to consider g′ to be identical with the g value for the SU (2) gauge
fields.
relativity, if one follows the logic adopted by Weyl and considers GR as a gauge field theory.
where D is the covariant derivative only involving the SU (2) triplet of gauge
fields. It should be noted that since the gauge fields do not commute, this
involves terms which are second-order in the field amplitudes. That is
\[ F^{\mu,\nu} = \partial^{\mu} A^{\nu} - \partial^{\nu} A^{\mu} - i g \left( A^{\mu} A^{\nu} - A^{\nu} A^{\mu} \right) \tag{2032} \]
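\[ = \sum_{k=1}^{3} \left( \partial^{\mu} A^{\nu,k} - \partial^{\nu} A^{\mu,k} + 2 g \, \xi^{i,j,k} A^{\mu,i} A^{\nu,j} \right) \tau^{(k)} \tag{2033} \]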
where the indices i and j are summed over and ξ i,j,k is the Levi-Civita symbol.
In arriving at the above expression, we have used the identity
\[ \tau^{(i)} \tau^{(j)} = \delta^{i,j} \tau^{(0)} + i \sum_{k=1}^{3} \xi^{i,j,k} \tau^{(k)} \tag{2034} \]
as expected for an electromagnetic field. Since the SU (2) gauge fields don’t
commute, the field theory is a non-Abelian gauge field theory. Under an SU (2)
transformation, the field tensors transform according to
\[ F^{\mu,\nu} \rightarrow F'^{\mu,\nu} = \hat{U} F^{\mu,\nu} \hat{U}^{\dagger} \tag{2036} \]
which is just a local unitary transform in isospin space. The Lagrangian density
for all the free gauge fields can be expressed as
\[ \mathcal{L}_{gauge} = - \frac{1}{32 \pi} \, \mathrm{Trace} \, F^{\mu,\nu} F_{\mu,\nu} \tag{2037} \]
where the Trace is evaluated in isospin space and takes into account that there
are a total of four fields. The Lagrangian density can be expressed directly in
terms of the contributions from four components of the field. The result can be
expressed as
\[ \mathcal{L}_{gauge} = - \frac{1}{16 \pi} \sum_{k=0}^{3} F^{k}_{\mu,\nu} F^{\mu,\nu,k} \tag{2038} \]
where we have decomposed the fields as
\[ F_{\mu,\nu} = \sum_{k} F^{k}_{\mu,\nu} \tau^{(k)} \tag{2039} \]
evaluated the product of the Pauli spin matrices and used the fact that the
Pauli spin matrices τ (k) for k ≠ 0 are traceless.
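The change of prefactor from 1/(32 π) to 1/(16 π) follows from the trace normalization Trace( τ (a) τ (b) ) = 2 δ a,b . The sketch below checks this for a single fixed (µ, ν) component and also checks that the trace is unchanged by F → Û F Û † . It assumes NumPy, takes τ (0) to be the identity, and uses arbitrary illustrative numbers for the components and the rotation.

```python
import numpy as np

# Basis tau^(0..3): identity (assumed for tau^(0)) followed by the Pauli matrices.
tau = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Gram matrix of traces: Tr(tau^a tau^b) = 2 delta^{ab}.
gram = np.array([[np.trace(a @ b) for b in tau] for a in tau])

# Arbitrary real components F^k for one fixed (mu, nu) pair.
rng = np.random.default_rng(0)
f = rng.normal(size=4)
F = sum(fk * t for fk, t in zip(f, tau))

trace_FF = np.trace(F @ F).real
sum_of_squares = float(np.sum(f**2))      # Tr(F F) should equal twice this

# A sample SU(2) rotation U = cos(a) - i sin(a) n.tau with a unit vector n.
a = 0.9
n = np.array([1.0, 2.0, -2.0]) / 3.0
U = np.cos(a) * tau[0] - 1j * np.sin(a) * sum(nk * t for nk, t in zip(n, tau[1:]))
F_rot = U @ F @ U.conj().T
trace_rot = np.trace(F_rot @ F_rot).real  # invariant under the rotation

print(np.allclose(gram, 2 * np.eye(4)),
      np.isclose(trace_FF, 2 * sum_of_squares),
      np.isclose(trace_rot, trace_FF))
```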
One can consider the k-components of the vector potential Aµ (i.e. the three
real components Aµk for fixed µ) as forming three-vectors Aµ in isospin space.
These quantities transform as three-vectors under transformations in isospin
space, and also the Aµ transform as four-vectors under Lorentz transformations
in Minkowski space-time. The three-vector fields are spin-one bosons with
isospin one. Hence, we might expect that the isospin triplet should contain two
oppositely charged particles and one uncharged particle. These particles are
supplemented by the particle corresponding to the single uncharged field Aµ(0) .
In terms of this set of isospin vectors, the free gauge field Lagrangian density
can be written in the form of a sum of a scalar product in isospin space and an
isospin scalar
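\[ \mathcal{L}_{gauge} = - \frac{1}{16 \pi} \left[ \left( \partial^{\mu} \mathbf{A}^{\nu} - \partial^{\nu} \mathbf{A}^{\mu} \right) + 2 g \, \mathbf{A}^{\mu} \wedge \mathbf{A}^{\nu} \right] \cdot \left[ \left( \partial_{\mu} \mathbf{A}_{\nu} - \partial_{\nu} \mathbf{A}_{\mu} \right) + 2 g \, \mathbf{A}_{\mu} \wedge \mathbf{A}_{\nu} \right]
   - \frac{1}{16 \pi} \left( \partial^{\mu} A^{\nu}_{(0)} - \partial^{\nu} A^{\mu}_{(0)} \right) \left( \partial_{\mu} A_{\nu,(0)} - \partial_{\nu} A_{\mu,(0)} \right) \tag{2040} \]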
It should be noted that the Lagrangian reduces to the sum of four non-interacting
electromagnetic Lagrangians in the limit g → 0. However, at finite values of
g, the Lagrangian density contains cubic and quartic interactions with coupling
strengths that are fixed by gauge invariance in terms of the single gauge param-
eter g.
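The term 2g Aµ ∧ Aν, which is responsible for the cubic and quartic couplings, originates from the commutator − i g ( Aµ Aν − Aν Aµ ) of the matrix-valued fields. A minimal numerical sketch of that step is given here, assuming NumPy and reading ∧ as the ordinary cross product of isospin three-vectors (an interpretation, not stated explicitly in the text); the field components, the value of g, and the helper names `to_matrix` and `to_vector` are arbitrary illustrative choices.

```python
import numpy as np

# Pauli matrices tau^(1), tau^(2), tau^(3).
pauli = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def to_matrix(vec):
    """Matrix form sum_k vec^k tau^(k) of an isospin three-vector."""
    return np.einsum("k,kab->ab", vec, pauli)


def to_vector(mat):
    """Isospin components of a traceless matrix, using Tr(tau^j tau^k) = 2 delta^{jk}."""
    return np.array([0.5 * np.trace(p @ mat) for p in pauli])


g = 0.65                                   # arbitrary coupling strength
rng = np.random.default_rng(1)
a_mu = rng.normal(size=3)                  # isospin components of A^mu at one point
a_nu = rng.normal(size=3)                  # isospin components of A^nu at one point

A_mu, A_nu = to_matrix(a_mu), to_matrix(a_nu)
commutator_term = -1j * g * (A_mu @ A_nu - A_nu @ A_mu)

components = to_vector(commutator_term)    # isospin components of the non-Abelian term
expected = 2 * g * np.cross(a_mu, a_nu)    # 2 g (A^mu wedge A^nu)

print(np.allclose(components, expected))
```

Because this term is linear in g and quadratic in the fields, squaring the field tensor produces contributions of order g (three fields) and of order g² (four fields), and these contributions vanish when g → 0.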
Figure 71: The interaction vertices representing the interaction of three and
four isospin triplet gauge field bosons.
Exercise:
Determine the equations of motion for the vector gauge fields, in the presence
of a source term
\[ \mathcal{L}_{int} = - \frac{1}{c} \, \mathrm{Trace} \left( A_{\mu} \cdot j^{\mu} \right) \tag{2041} \]
where the current source j µ has also been decomposed in terms of Pauli spin
matrices.
In terms of these new combinations, the free Lagrangian for the gauge fields
becomes
\[ \mathcal{L}_{gauge} = - \frac{1}{16 \pi} F^{(0)}_{\mu,\nu} F^{\mu,\nu,(0)} - \frac{1}{16 \pi} F^{(3)}_{\mu,\nu} F^{\mu,\nu,(3)} - \frac{1}{8 \pi} F^{-}_{\mu,\nu} F^{\mu,\nu,+} \tag{2046} \]
where the first two terms are recognized as being similar to the Lagrangian den-
sity for the electromagnetic field. It was first hypothesized by Sheldon Glashow
that the electro-weak interaction is produced by the massless vector bosons de-
scribed by the above Lagrangian162 . Masses for the gauge bosons should not
be added by hand, since the resulting theory would not be renormalizable. To
retain renormalizability of the theory, and to have massive vector bosons, we
need to break the symmetry.
be modified, as will be the excitations of the gauge fields. Due to the symmetry-
breaking of the scalar field, the U (1) vector gauge field will become coupled to
the triplet of SU (2) gauge fields. When the symmetry is broken, the elementary
excitations of the coupled system of fields change and these new excitations will
represent the observable particles.
The symmetry is broken by assuming that the physical ground state corre-
sponds to one specific choice of the uniform field Φ. Given the specific ground
state which the system chooses spontaneously, one can make use of the global
gauge invariance to describe the ground state Φ0 as a field which has one non-
zero component which is real. That is, αk can be chosen so that
\[ \Phi_0 = \begin{pmatrix} \Re e \, \Phi_1 \\ 0 \end{pmatrix} = \begin{pmatrix} \phi_0 \\ 0 \end{pmatrix} \tag{2049} \]
The excited states can be expressed as
\[ \Phi = \begin{pmatrix} \phi_0 + \chi_1 \\ 0 \end{pmatrix} \tag{2050} \]
where the local gauge degrees of freedom have been used to make χ1 real. This
excited field is invariant under the transformation
\[ \Phi \rightarrow \Phi' = \hat{U}_{EM} \Phi \tag{2051} \]
where ÛEM is restricted to have the form
\[ \hat{U}_{EM} = \begin{pmatrix} 1 & 0 \\ 0 & \exp( - i \Lambda ) \end{pmatrix} \tag{2052} \]
and will turn out to represent the residual U (1) gauge invariance of the electro-
magnetic field.
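The invariance of the excited field under ÛEM can be verified directly, and ÛEM can be written as a product of a U (1) phase and a rotation about the isospin z-axis. The sketch below assumes NumPy and τ (0) equal to the identity; the values of Λ, φ0 and χ1 are arbitrary, and the factorization with parameters α(0) = Λ/2 and α3 = − Λ/2 is worked out here rather than quoted from the text.

```python
import numpy as np

Lambda = 0.7                 # arbitrary gauge parameter
phi0, chi1 = 1.3, 0.2        # arbitrary real values for the uniform field and its excitation

# Residual transformation of Eq. (2052).
U_em = np.diag([1.0, np.exp(-1j * Lambda)])

# Excited field of Eq. (2050): only the upper component is non-zero.
Phi = np.array([phi0 + chi1, 0.0], dtype=complex)
Phi_transformed = U_em @ Phi

# Factorization: exp(-i alpha0 tau^(0)) exp(-i alpha3 tau^(3))
# with alpha0 = Lambda / 2 and alpha3 = -Lambda / 2.
alpha0, alpha3 = Lambda / 2, -Lambda / 2
U_phase = np.exp(-1j * alpha0) * np.eye(2)
U_tau3 = np.diag([np.exp(-1j * alpha3), np.exp(1j * alpha3)])
U_product = U_phase @ U_tau3

print(np.allclose(Phi_transformed, Phi), np.allclose(U_product, U_em))
```

The factorization shows that the surviving symmetry mixes the original U (1) phase with the third SU (2) generator, which is consistent with the later mixing of the U (1) field and the third component of the SU (2) triplet.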
The Lagrangian density for the isospin doublet of scalar fields and their
couplings can be evaluated for the excited state as
\[ \mathcal{L}_{scalar} = \left( D_{\mu} \Phi \right)^{\dagger} D^{\mu} \Phi - \left( \frac{m c}{\hbar} \right)^{2} \chi_1^{2} \tag{2054} \]
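\[ D^{\mu} \Phi = \begin{pmatrix} \partial^{\mu} \chi_1 \\ 0 \end{pmatrix}
   - i g' \begin{pmatrix} A^{\mu,(0)} ( \phi_0 + \chi_1 ) \\ 0 \end{pmatrix}
   - i g \begin{pmatrix} A^{\mu,(3)} ( \phi_0 + \chi_1 ) \\ \sqrt{2} \, A^{\mu,-} ( \phi_0 + \chi_1 ) \end{pmatrix} \tag{2055} \]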
A new interaction strength λ can be defined as
\[ \lambda = \sqrt{ g'^{2} + g^{2} } \tag{2056} \]
\[ g' = \lambda \cos \theta \]
\[ g = \lambda \sin \theta \tag{2058} \]
Thus, the covariant derivative has the connection with the field
The field AµZ will turn out to be the field that describes the neutral Z particle.
The field orthogonal to the Z field is defined as
When expressed in terms of the transformed fields and constants, the covariant
derivative terms become
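\[ D^{\mu} \Phi = \begin{pmatrix} \partial^{\mu} \chi_1 \\ 0 \end{pmatrix}
   - i ( \phi_0 + \chi_1 ) \begin{pmatrix} \lambda A^{\mu}_{Z} \\ g \sqrt{2} \, A^{\mu,-} \end{pmatrix} \tag{2061} \]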
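The passage from the couplings g′ and g to the single strength λ amounts to a rotation by θ in the space of the fields Aµ,(0) and Aµ,(3) . The sketch below illustrates this for one field component, assuming NumPy. The combination A_Z = cos θ A(0) + sin θ A(3) is read off from the upper components of the covariant derivative; the orthogonal combination A_EM = − sin θ A(0) + cos θ A(3) is an assumed sign convention, chosen to agree with the decomposition F(3) = sin θ FZ + cos θ FEM used later. All numerical values are arbitrary.

```python
import numpy as np

g_prime, g = 0.35, 0.65                  # arbitrary U(1) and SU(2) coupling strengths
lam = np.hypot(g_prime, g)               # lambda = sqrt(g'^2 + g^2), Eq. (2056)
theta = np.arctan2(g, g_prime)           # g' = lambda cos(theta), g = lambda sin(theta)

A0, A3 = 0.8, -0.3                       # one component of A^(0) and of A^(3)

# Rotated fields (A_EM sign convention is an assumption, see the lead-in).
A_Z = np.cos(theta) * A0 + np.sin(theta) * A3
A_EM = -np.sin(theta) * A0 + np.cos(theta) * A3

# The combination entering the upper component of D Phi depends on A_Z only.
upper_coupling = g_prime * A0 + g * A3

# Inverse rotation recovers the original fields.
A0_back = np.cos(theta) * A_Z - np.sin(theta) * A_EM
A3_back = np.sin(theta) * A_Z + np.cos(theta) * A_EM

print(np.isclose(upper_coupling, lam * A_Z),
      np.isclose(A0_back, A0), np.isclose(A3_back, A3))
```

Since the uniform field φ0 couples to A_Z but not to the orthogonal combination, only the former acquires a mass term.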
The lowest-order terms in the Lagrangian density of the non-uniform scalar field
and all its couplings to the gauge fields are expressed as
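\[ \mathcal{L}_{scalar} = \partial_{\mu} \chi_1 \, \partial^{\mu} \chi_1 - \left( \frac{m c}{\hbar} \right)^{2} \chi_1^{2} + \lambda^{2} \phi_0^{2} \, A^{\mu}_{Z} A_{\mu,Z} + 2 g^{2} \phi_0^{2} \, A^{\mu}_{+} A^{-}_{\mu} \tag{2062} \]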
The higher-order terms, which have been neglected, describe the self-interactions
of the scalar field and the residual interactions between the scalar field
and the gauge fields.
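The two mass-like terms proportional to φ0² can be checked by acting with the matrix-valued connection on the uniform field. The sketch below does this for a single space-time component, so the Minkowski metric signs of the contraction over µ are ignored. It assumes NumPy, τ (0) equal to the identity, and A− = ( A(1) + i A(2) ) / √2 with A+ its complex conjugate, which is the reading suggested by the lower component of the covariant derivative; all numbers are arbitrary.

```python
import numpy as np

tau0 = np.eye(2, dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

g_prime, g = 0.35, 0.65                  # arbitrary coupling strengths
phi0 = 1.3                               # arbitrary uniform field value
lam = np.hypot(g_prime, g)
theta = np.arctan2(g, g_prime)

rng = np.random.default_rng(2)
A0, A1, A2, A3 = rng.normal(size=4)      # one component of each of the four gauge fields

# Matrix-valued connection g' A^(0) tau^(0) + g sum_k A^(k) tau^(k) acting on (phi0, 0).
connection = g_prime * A0 * tau0 + g * (A1 * pauli[0] + A2 * pauli[1] + A3 * pauli[2])
Phi0 = np.array([phi0, 0.0], dtype=complex)
v = connection @ Phi0
quadratic = np.vdot(v, v).real           # |connection . Phi0|^2

# The same quantity written in terms of the rotated fields.
A_Z = np.cos(theta) * A0 + np.sin(theta) * A3
A_minus = (A1 + 1j * A2) / np.sqrt(2)    # assumed definition, see the lead-in
A_plus = np.conj(A_minus)
mass_terms = phi0**2 * (lam**2 * A_Z**2 + 2 * g**2 * (A_plus * A_minus).real)

print(np.isclose(quadratic, mass_terms))
```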
\[ F^{\mu,\nu}_{(3)} = \sin \theta \, F^{\mu,\nu}_{Z} + \cos \theta \, F^{\mu,\nu}_{EM} + 2 i g \left( A^{\mu}_{-} A^{\nu}_{+} - A^{\nu}_{-} A^{\mu}_{+} \right) \tag{2066} \]
The Lagrangian density describing the small amplitude excitations of the scalar
field and the gauge fields can be written as
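\[ \begin{aligned}
\mathcal{L}_{Free} = {} & \partial_{\mu} \chi_1 \, \partial^{\mu} \chi_1 - \left( \frac{m c}{\hbar} \right)^{2} \chi_1^{2} \\
& - \frac{1}{16 \pi} F_{\mu,\nu,Z} F^{\mu,\nu}_{Z} + \lambda^{2} \phi_0^{2} \, A^{\mu}_{Z} A_{\mu,Z} \\
& - \frac{1}{16 \pi} F_{\mu,\nu,EM} F^{\mu,\nu}_{EM} \\
& - \frac{1}{8 \pi} F^{-}_{\mu,\nu} F^{\mu,\nu,+} + 2 g^{2} \phi_0^{2} \, A^{\mu}_{+} A^{-}_{\mu}
\end{aligned} \tag{2067} \]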
In electro-weak theory, the first term represents the free uncharged scalar
boson. The second term describes an uncharged vector particle with mass MZ
proportional to λ φ0 . The third term describes the uncharged massless vector
particle known as the photon. From the equations of motion for Aµ,± , the
remaining term can be shown to describe a pair of charged particles with masses
MW proportional to g φ0 = sin θ λ φ0 . These particles are known as the W +
and W − particles. The W + and W − particles are charged and the observed
charges are ± e. The interaction mediated by the massive vector bosons is found
to have a finite range (≈ 10^−18 m), and is responsible for the weak interaction.
The experimentally determined masses164 are MW c² ≈ 80.33 GeV and MZ c² ≈
91.187 GeV. Nearly all the parameters of this theory have been determined
through experiment; the only exception is the mass m of the scalar particle,
which remains to be discovered. The ratio of the masses determines the angle
θ via165
\[ \frac{M_W}{M_Z} = \sin \theta \tag{2068} \]
which yields sin θ ≈ 0.8810. The W ± particles carry electrical charges ±e since
they couple to the electromagnetic field. This can be seen by examining the
W ± field tensor
where
Therefore, the covariant derivatives of the fields Dµ Aν± couple them to the elec-
tromagnetic field AµEM with either a positive or negative coupling constant of
magnitude 2 g cos θ. Since only electrically charged particles couple to the
electromagnetic field, one can make the identification
\[ \frac{e}{\hbar c} = 2 g \cos \theta = \lambda \sin 2\theta \tag{2071} \]
\[ \phi_0^{2} = \frac{ \sin^{2} 2\theta \, ( M_Z c^{2} )^{2} }{ 8 \pi \left( \frac{e^{2}}{\hbar c} \right) ( \hbar c ) } \tag{2072} \]
which leads to φ0 ≈ 178 GeV / √( h̄c ), where h̄c ≈ 197 MeV fm. Hence, the only
undetermined parameter is the mass of the Higgs particle m. This theory was
shown to be renormalizable by G. 't Hooft166 .
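The quoted numbers can be reproduced with a few lines of arithmetic. The sketch below uses the masses and the value of h̄c given in the text; the fine-structure constant e²/( h̄c ) ≈ 1/137.036 is an assumed input that the text does not quote, and the range of the interaction is estimated as h̄/( MW c ), which is an order-of-magnitude estimate rather than a formula taken from the notes.

```python
import math

MW_c2 = 80.33        # GeV, from the text
MZ_c2 = 91.187       # GeV, from the text
hbar_c = 0.197       # GeV fm, from the text (197 MeV fm)
alpha = 1 / 137.036  # e^2 / (hbar c); assumed value, not quoted in the text

# Eq. (2068): the mass ratio fixes the mixing angle.
sin_theta = MW_c2 / MZ_c2
theta = math.asin(sin_theta)
sin_2theta = math.sin(2 * theta)

# Eq. (2072): phi0^2 (hbar c) = sin^2(2 theta) (MZ c^2)^2 / (8 pi alpha).
phi0_times_sqrt_hbarc = sin_2theta * MZ_c2 / math.sqrt(8 * math.pi * alpha)  # GeV

# Rough range of the W-mediated interaction, hbar / (MW c), converted from fm to m.
range_m = (hbar_c / MW_c2) * 1e-15

print(f"sin(theta)         = {sin_theta:.4f}")
print(f"phi0 * sqrt(hbar c) = {phi0_times_sqrt_hbarc:.1f} GeV")
print(f"range              = {range_m:.2e} m")
```

With these inputs the mass ratio comes out close to 0.881, the uniform field value close to 178 GeV / √( h̄c ), and the range is a few times 10^−18 m, in line with the values quoted above.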
164 G. Arnison, A. Astbury, B. Aubert, et al., Phys. Lett. B, 122 103-116 (1983).