
Quantum Mechanics
A Graduate Course

Written for a two-semester graduate course in quantum mechanics, this comprehensive text helps
the reader to develop the tools and formalism of quantum mechanics and its applications to physical
systems. It will suit students who have taken some introductory quantum mechanics and modern
physics courses at undergraduate level, but it is self-contained and does not assume any specific
background knowledge beyond appropriate fluency in mathematics. The text takes a modern logical
approach rather than a historical one and it covers standard material, such as the hydrogen atom and
the harmonic oscillator, the WKB approximations, and Bohr–Sommerfeld quantization. Important
modern topics and examples are also described, including Berry phase, quantum information,
complexity and chaos, decoherence and thermalization, and nonstandard statistics, as well as more
advanced material such as path integrals, scattering theory, multiparticles, and Fock space. Readers
will gain a broad overview of quantum mechanics as a solid preparation for further study or research.

Horaţiu Năstase is a Researcher at the Institute for Theoretical Physics, State University of São Paulo.
He completed his PhD at Stony Brook with Peter van Nieuwenhuizen, co-discoverer of supergravity.
While in Princeton as a postdoctoral fellow, in a 2002 paper with David Berenstein and Juan
Maldacena he started the pp-wave correspondence, a sub-area of the AdS/CFT correspondence.
He has written more than 100 scientific articles and five other books, including Introduction to the
AdS/CFT Correspondence (2015), String Theory Methods for Condensed Matter Physics (2017),
Classical Field Theory (2019), Introduction to Quantum Field Theory (2019), and Cosmology and
String Theory (2019).
“There are a few dozen books on quantum mechanics. Who needs another one? Graduate students.
Most of the existing books aim at the undergraduate level. Often, students of graduate courses of QM
are all familiar with the motivation and its historical development but have a very varied technical
background. This book skips the history of the field and enables all the students to reach a common
needed level after going over the first chapters.
This book spans a very wide manifold of QM topics. It provides the tools for further researching
any quantum system. An important further value of the book is that it covers the forefront aspects
of QM including the Berry phase, anyons, entanglement, quantum information, quantum complexity
and chaos, and quantum thermalization. Readers understand a topic only if they are able to solve
problems associated with it; the book includes at least 7 exercises for each of its 59 chapters.”
— Professor Jacob Sonnenschein, Tel Aviv University
“Năstase’s Quantum Mechanics is another marvellous addition to his encyclopaedic collection of
graduate-level courses that take both the dedicated student and the hardened researcher on a grand
tour of contemporary theoretical physics. It will serve as an ideal bridge between a comprehensive
undergraduate course in quantum mechanics and the frontiers of research in quantum systems.”
— Professor Jeff Murugan, University of Cape Town
Quantum Mechanics
A Graduate Course

HORAŢIU NĂSTASE
Universidade Estadual Paulista, São Paulo
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre,
New Delhi – 110025, India
103 Penang Road, #05–06/07, Visioncrest Commercial, Singapore 238467

Cambridge University Press is part of the University of Cambridge.


It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/highereducation/isbn/9781108838733
DOI: 10.1017/9781108976299
© Horaţiu Năstase 2023
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2023
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Năstase, Horaţiu, 1972– author.
Title: Quantum mechanics : a graduate course / by Horatiu Nastase,
Instituto de Fisica Teorica, UNESP, São Paulo, Brazil.
Description: Cambridge, United Kingdom ; New York, NY : Cambridge University Press, 2022. |
Includes bibliographical references.
Identifiers: LCCN 2022010290 | ISBN 9781108838733 (hardback)
Subjects: LCSH: Quantum theory. | BISAC: SCIENCE / Physics / Quantum Theory
Classification: LCC QC174.12 .N38 2022 | DDC 530.12–dc23/eng20220517
LC record available at https://lccn.loc.gov/2022010290
ISBN 978-1-108-83873-3 Hardback
Additional resources for this publication at www.cambridge.org/Nastase
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
To the memory of my mother,
who inspired me to become a physicist
Contents

Preface page xix


Acknowledgements xx
Introduction xxi

Part I Formalism and Basic Problems 1

Introduction: Historical Background 3


0.1 Experiments Point towards Quantum Mechanics 3
0.2 Quantized States: Matrix Mechanics, and Waves for Particles: Correspondence
Principle 6
0.3 Wave Functions Governing Probability, and the Schrödinger Equation 8
0.4 Bohr–Sommerfeld Quantization and the Hydrogen Atom 8
0.5 Review: States of Quantum Mechanical Systems 10

1 The Mathematics of Quantum Mechanics 1: Finite-Dimensional Hilbert Spaces 12


1.1 (Linear) Vector Spaces V 12
1.2 Operators on a Vector Space 15
1.3 Dual Space, Adjoint Operation, and Dirac Notation 17
1.4 Hermitian (Self-Adjoint) Operators and the Eigenvalue Problem 19
1.5 Traces and Tensor Products 21
1.6 Hilbert Spaces 21

2 The Mathematics of Quantum Mechanics 2: Infinite-Dimensional Hilbert Spaces 24


2.1 Hilbert Spaces and Related Notions 24
2.2 Functions as Limits of Discrete Sets of Vectors 25
2.3 Integrals as Limits of Sums 26
2.4 Distributions and the Delta Function 27
2.5 Spaces of Functions 29
2.6 Operators in Infinite Dimensions 30
2.7 Hermitian Operators and Eigenvalue Problems 30
2.8 The Operator D_x 31

3 The Postulates of Quantum Mechanics and the Schrödinger Equation 35


3.1 The Postulates 35
3.2 The First Postulate 37
3.3 The Second Postulate 37
3.4 The Third Postulate 38


3.5 The Fourth Postulate 38


3.6 The Fifth Postulate 41
3.7 The Sixth Postulate 41
3.8 Generalization of States to Ensembles: the Density Matrix 42

4 Two-Level Systems and Spin-1/2, Entanglement, and Computation 45


4.1 Two-Level Systems and Time Dependence 45
4.2 General Stationary Two-State System 47
4.3 Oscillations of States 50
4.4 Unitary Evolution Operator 52
4.5 Entanglement 52
4.6 Quantum Computation 54

5 Position and Momentum and Their Bases; Canonical Quantization, and Free Particles 57
5.1 Translation Operator 57
5.2 Momentum in Classical Mechanics as a Generator of Translations 58
5.3 Canonical Quantization 60
5.4 Operators in Coordinate and Momentum Spaces 61
5.5 The Free Nonrelativistic Particle 63

6 The Heisenberg Uncertainty Principle and Relations, and Gaussian Wave Packets 66
6.1 Gaussian Wave Packets 66
6.2 Time Evolution of Gaussian Wave Packet 68
6.3 Heisenberg Uncertainty Relations 69
6.4 Minimum Uncertainty Wave Packet 71
6.5 Energy–Time Uncertainty Relation 72

7 One-Dimensional Problems in a Potential V(x) 74


7.1 Set-Up of the Problem 74
7.2 General Properties of the Solutions 75
7.3 Infinitely Deep Square Well (Particle in a Box) 78
7.4 Potential Step and Reflection and Transmission of Modes 81
7.5 Continuity Equation for Probabilities 83
7.6 Finite Square Well Potential 84
7.7 Penetration of a Potential Barrier and the Tunneling Effect 87

8 The Harmonic Oscillator 91


8.1 Classical Set-Up and Generalizations 91
8.2 Quantization in the Creation and Annihilation Operator Formalism 92
8.3 Generalization 95
8.4 Coherent States 96
8.5 Solution in the Coordinate, |x⟩, Representation (Basis) 96
8.6 Alternative to |x⟩ Representation: Basis Change from |n⟩ Representation 99
8.7 Properties of Hermite Polynomials 100
8.8 Mathematical Digression (Appendix): Classical Orthogonal Polynomials 101

9 The Heisenberg Picture and General Picture; Evolution Operator 106


9.1 The Evolution Operator 106
9.2 The Heisenberg Picture 108
9.3 Application to the Harmonic Oscillator 109
9.4 General Quantum Mechanical Pictures 110
9.5 The Dirac (Interaction) Picture 112

10 The Feynman Path Integral and Propagators 116


10.1 Path Integral in Phase Space 116
10.2 Gaussian Integration 119
10.3 Path Integral in Configuration Space 120
10.4 Path Integral over Coherent States (in “Harmonic Phase Space”) 121
10.5 Correlation Functions and Their Generating Functional 123

11 The Classical Limit and Hamilton–Jacobi (WKB Method), the Ehrenfest Theorem 127
11.1 Ehrenfest Theorem 127
11.2 Continuity Equation for Probability 129
11.3 Review of the Hamilton–Jacobi Formalism 130
11.4 The Classical Limit and the Geometrical Optics Approximation 132
11.5 The WKB Method 133

12 Symmetries in Quantum Mechanics I: Continuous Symmetries 137


12.1 Symmetries in Classical Mechanics 137
12.2 Symmetries in Quantum Mechanics: General Formalism 139
12.3 Example 1. Translations 141
12.4 Example 2. Time Translation Invariance 142
12.5 Mathematical Background: Review of Basics of Group Theory 143

13 Symmetries in Quantum Mechanics II: Discrete Symmetries and Internal Symmetries 149
13.1 Discrete Symmetries: Symmetries under Discrete Groups 149
13.2 Parity Symmetry 150
13.3 Time Reversal Invariance, T 152
13.4 Internal Symmetries 154
13.5 Continuous Symmetry 155
13.6 Lie Groups and Algebras and Their Representations 155

14 Theory of Angular Momentum I: Operators, Algebras, Representations 159


14.1 Rotational Invariance and SO(n) 159
14.2 The Lie Groups SO(2) and SO(3) 160
14.3 The Group SU(2) and Its Isomorphism with SO(3) Mod Z2 162
14.4 Generators and Lie Algebras 164
14.5 Quantum Mechanical Version 165
14.6 Representations 166

15 Theory of Angular Momentum II: Addition of Angular Momenta and Representations;


Oscillator Model 172
15.1 The Spinor Representation, j = 1/2 172
15.2 Composition of Angular Momenta 174
15.3 Finding the Clebsch–Gordan Coefficients 177
15.4 Sums of Three Angular Momenta, J1 + J2 + J3: Racah Coefficients 178
15.5 Schwinger’s Oscillator Model 179

16 Applications of Angular Momentum Theory: Tensor Operators, Wave Functions


and the Schrödinger Equation, Free Particles 183
16.1 Tensor Operators 183
16.2 Wigner–Eckart Theorem 185
16.3 Rotations and Wave Functions 186
16.4 Wave Function Transformations under Rotations 188
16.5 Free Particle in Spherical Coordinates 190

17 Spin and L + S 194


17.1 Motivation for Spin and Interaction with Magnetic Field 194
17.2 Spin Properties 196
17.3 Particle with Spin 1/2 197
17.4 Rotation of Spinors with s = 1/2 199
17.5 Sum of Orbital Angular Momentum and Spin, L + S 200
17.6 Time-Reversal Operator on States with Spin 201

18 The Hydrogen Atom 204


18.1 Two-Body Problem: Reducing to Central Potential 204
18.2 Hydrogenoid Atom: Set-Up of Problem 205
18.3 Solution: Sommerfeld Polynomial Method 207
18.4 Confluent Hypergeometric Function and Quantization of Energy 210
18.5 Orthogonal Polynomials and Standard Averages over Wave Functions 211

19 General Central Potential and Three-Dimensional (Isotropic) Harmonic Oscillator 214


19.1 General Set-Up 214
19.2 Types of Potentials 215
19.3 Diatomic Molecule 220
19.4 Free Particle 221
19.5 Spherical Square Well 221
19.6 Three-Dimensional Isotropic Harmonic Oscillator: Set-Up 222
19.7 Isotropic Three-Dimensional Harmonic Oscillator in Spherical Coordinates 223
19.8 Isotropic Three-Dimensional Harmonic Oscillator in Cylindrical Coordinates 225

20 Systems of Identical Particles 229


20.1 Identical Particles: Bosons and Fermions 229
20.2 Observables under Permutation 231
20.3 Generalization to N Particles 232
20.4 Canonical Commutation Relations 233

20.5 Spin–Statistics Theorem 234


20.6 Particles with Spin 236

21 Application of Identical Particles: He Atom (Two-Electron System) and H2 Molecule 240


21.1 Helium-Like Atoms 240
21.2 Ground State of the Helium (or Helium-Like) Atom 243
21.3 Approximation 3: Variational Method, “Light Version” 245
21.4 H2 Molecule and Its Ground State 246

22 Quantum Mechanics Interacting with Classical Electromagnetism 250


22.1 Classical Electromagnetism plus Particle 250
22.2 Quantum Particle plus Classical Electromagnetism 251
22.3 Application to Superconductors 254
22.4 Interaction with a Plane Wave 256
22.5 Spin–Magnetic-Field and Spin–Orbit Interaction 256

23 Aharonov–Bohm Effect and Berry Phase in Quantum Mechanics 260


23.1 Gauge Transformation in Electromagnetism 260
23.2 The Aharonov–Bohm Phase δ 262
23.3 Berry Phase 264
23.4 Example: Atoms, Nuclei plus Electrons 265
23.5 Spin–Magnetic Field Interaction, Berry Curvature, and Berry Phase
as Geometric Phase 266
23.6 Nonabelian Generalization 269
23.7 Aharonov–Bohm Phase in Berry Form 269

24 Motion in a Magnetic Field, Hall Effect and Landau Levels 272


24.1 Spin in a Magnetic Field 272
24.2 Particle with Spin 1/2 in a Time-Dependent Magnetic Field 272
24.3 Particle with or without Spin in a Magnetic Field: Landau Levels 273
24.4 The Integer Quantum Hall Effect (IQHE) 275
24.5 Alternative Derivation of the IQHE 278
24.6 An Atom in a Magnetic Field and the Landé g-Factor 278

25 The WKB: A Semiclassical Approximation 282


25.1 Review and Generalization 282
25.2 Approximation and Connection Formulas at Turning Points 283
25.3 Application: Potential Barrier 285
25.4 The WKB Approximation in the Path Integral 287

26 Bohr–Sommerfeld Quantization 290


26.1 Bohr–Sommerfeld Quantization Condition 290
26.2 Example 1: Parity-Even Linear Potential 291
26.3 Example 2: Harmonic Oscillator 292
26.4 Example 3: Motion in a Central Potential 294
26.5 Example: Coulomb Potential (Hydrogenoid Atom) 298

27 Dirac Quantization Condition and Magnetic Monopoles 301


27.1 Dirac Monopoles from Maxwell Duality 301
27.2 Dirac Quantization Condition from Semiclassical Nonrelativistic Considerations 303
27.3 Contradiction with the Gauge Field 304
27.4 Patches and Magnetic Charge from Transition Functions 305
27.5 Dirac Quantization from Topology and Wave Functions 307
27.6 Dirac String Singularity and Obtaining the Dirac Quantization Condition from It 308

28 Path Integrals II: Imaginary Time and Fermionic Path Integral 311
28.1 The Forced Harmonic Oscillator 311
28.2 Wick Rotation to Euclidean Time and Connection with Statistical Mechanics
Partition Function 315
28.3 Fermionic Path Integral 318
28.4 Gaussian Integration over the Grassmann Algebra 320
28.5 Path Integral for the Fermionic Harmonic Oscillator 321

29 General Theory of Quantization of Classical Mechanics and (Dirac) Quantization


of Constrained Systems 325
29.1 Hamiltonian Formalism 325
29.2 Constraints in the Hamiltonian Formalism: Primary and Secondary Constraints,
and First and Second Class Constraints 326
29.3 Quantization and Dirac Brackets 329
29.4 Example: Electromagnetic Field 332

Part IIa Advanced Foundations 337

30 Quantum Entanglement and the EPR Paradox 339


30.1 Entanglement: Spin 1/2 System 339
30.2 Entanglement: The General Case 341
30.3 Entanglement: Careful Definition 342
30.4 Entanglement Entropy 343
30.5 The EPR Paradox and Hidden Variables 345

31 The Interpretation of Quantum Mechanics and Bell’s Inequalities 350


31.1 Bell’s Original Inequality 350
31.2 Bell–Wigner Inequalities 353
31.3 CHSH Inequality (or Bell–CHSH Inequality) 356
31.4 Interpretations of Quantum Mechanics 357

32 Quantum Statistical Mechanics and “Tracing” over a Subspace 361


32.1 Density Matrix and Statistical Operator 361
32.2 Review of Classical Statistics 363
32.3 Defining Quantum Statistics 365
32.4 Bose–Einstein and Fermi–Dirac Distributions 370
32.5 Entanglement Entropy 371

33 Elements of Quantum Information and Quantum Computing 375


33.1 Classical Computation and Shannon Theory 375
33.2 Quantum Information and Computation, and von Neumann Entropy 377
33.3 Quantum Computation 378
33.4 Quantum Cryptography, No-Cloning, and Teleportation 380

34 Quantum Complexity and Quantum Chaos 384


34.1 Classical Computation Complexity 384
34.2 Quantum Computation and Complexity 385
34.3 The Nielsen Approach to Quantum Complexity 386
34.4 Quantum Chaos 388

35 Quantum Decoherence and Quantum Thermalization 393


35.1 Decoherence 393
35.2 Schrödinger’s Cat 393
35.3 Qualitative Decoherence 394
35.4 Quantitative Decoherence 395
35.5 Qualitative Thermalization 397
35.6 Quantitative Thermalization 398
35.7 Bogoliubov Transformation and Appearance of Temperature 400

Part IIb Approximation Methods 405

36 Time-Independent (Stationary) Perturbation Theory: Nondegenerate, Degenerate,


and Formal Cases 407
36.1 Set-Up of the Problem: Time-Independent Perturbation Theory 407
36.2 The Nondegenerate Case 408
36.3 The Degenerate Case 410
36.4 General Form of Solution (to All Orders) 412
36.5 Example: Stark Effect in Hydrogenoid Atom 414

37 Time-Dependent Perturbation Theory: First Order 418


37.1 Evolution Operator 418
37.2 Method of Variation of Constants 419
37.3 A Time-Independent Perturbation Being Turned On 421
37.4 Continuous Spectrum and Fermi’s Golden Rule 421
37.5 Application to Scattering in a Collision 423
37.6 Sudden versus Adiabatic Approximation 425

38 Time-Dependent Perturbation Theory: Second and All Orders 429


38.1 Second-Order Perturbation Theory and Breit–Wigner Distribution
(Energy Shift and Decay Width) 429
38.2 General Perturbation 432
38.3 Finding the Probability Coefficients bn (t) 433

39 Application: Interaction with (Classical) Electromagnetic Field, Absorption, Photoelectric


and Zeeman Effects 436
39.1 Particles and Atoms in an Electromagnetic Field 436
39.2 Zeeman and Paschen–Back Effects 437
39.3 Electromagnetic Radiation and Selection Rules 438
39.4 Absorption of Photon Energy by Atom 442
39.5 Photoelectric Effect 444

40 WKB Methods and Extensions: State Transitions and Euclidean Path Integrals (Instantons) 447
40.1 Review of the WKB Method 447
40.2 WKB Matrix Elements 448
40.3 Application to Transition Probabilities 451
40.4 Instantons for Transitions between Two Minima 452

41 Variational Methods 457


41.1 First Form of the Variational Method 457
41.2 Ritz Variational Method 458
41.3 Practical Variational Method 459
41.4 General Method 460
41.5 Applications 462

Part IIc Atomic and Nuclear Quantum Mechanics 467

42 Atoms and Molecules, Orbitals and Chemical Bonds: Quantum Chemistry 469
42.1 Hydrogenoid Atoms (Ions) 469
42.2 Multi-Electron Atoms and Shells 470
42.3 Couplings of Angular Momenta 471
42.4 Methods of Quantitative Approximation for Energy Levels 472
42.5 Molecules and Chemical Bonds 473
42.6 Adiabatic Approximation and Hierarchy of Scales 474
42.7 Details of the Adiabatic Approximation 475
42.8 Method of Molecular Orbitals 476
42.9 The LCAO Method 477
42.10 Application: The LCAO Method for the Diatomic Molecule 478
42.11 Chemical Bonds 478

43 Nuclear Liquid Droplet and Shell Models 483


43.1 Nuclear Data and Droplet Model 483
43.2 Shell Models 1: Single-Particle Shell Models 484
43.3 Spin–Orbit Interaction Correction 487
43.4 Many-Particle Shell Models 489

44 Interaction of Atoms with Electromagnetic Radiation: Transitions and Lasers 492


44.1 Two-Level System for Time-Dependent Transitions 492
44.2 First-Order Perturbation Theory for Harmonic Potential 494

44.3 The Case of Quantized Radiation 495


44.4 Planck Formula 497

Part IId Scattering Theory 501

45 One-Dimensional Scattering, Transfer and S-Matrices 503


45.1 Asymptotics and Integral Equations 504
45.2 Green’s Functions 506
45.3 Relations between Abstract States and Lippmann–Schwinger Equation 507
45.4 Physical Interpretation of Scattering Solution 509
45.5 S-Matrix and T-Matrix 511
45.6 Application: Spin Chains 512

46 Three-Dimensional Lippmann–Schwinger Equation, Scattering Amplitudes and Cross Sections 516


46.1 Potential for Relative Motion 516
46.2 Behavior at Infinity 517
46.3 Scattering Solution, Cross Sections, and S-Matrix 518
46.4 S-Matrix and Optical Theorem 521
46.5 Green’s Functions and Lippmann–Schwinger Equation 522

47 Born Approximation and Series, S-Matrix and T-Matrix 528


47.1 Born Approximation and Series 528
47.2 Time-Dependent Scattering Point of View 530
47.3 Higher-Order Terms and Abstract States, S- and T-Matrices 530
47.4 Validity of the Born Approximation 532

48 Partial Wave Expansion, Phase Shift Method, and Scattering Length 535
48.1 The Partial Wave Expansion 535
48.2 Phase Shifts 537
48.3 T-Matrix Element 540
48.4 Scattering Length 541
48.5 Jost Functions, Wronskians, and the Levinson Theorem 542

49 Unitarity, Optics, and the Optical Theorem 547


49.1 Unitarity: Review and Reanalysis 547
49.2 Application to Cross Sections 548
49.3 Radial Green’s Functions 550
49.4 Optical Theorem 551
49.5 Scattering on a Potential with a Finite Range 553
49.6 Hard Sphere Scattering 554
49.7 Low-Energy Limit 554

50 Low-Energy and Bound States, Analytical Properties of Scattering Amplitudes 557


50.1 Low-Energy Scattering 557
50.2 Relation to Bound States 558

50.3 Bound State from Complex Poles of S_l(k) 560


50.4 Analytical Properties of the Scattering Amplitudes 561
50.5 Jost Functions Revisited 563

51 Resonances and Scattering, Complex k and l 566


51.1 Poles and Zeroes in the Partial Wave S-Matrix S_l(k) 566
51.2 Breit–Wigner Resonance 568
51.3 Physical Interpretation and Its Proof 570
51.4 Review of Levinson Theorem 573
51.5 Complex Angular Momentum 574

52 The Semiclassical WKB and Eikonal Approximations for Scattering 577


52.1 WKB Review for One-Dimensional Systems 577
52.2 Three-Dimensional Scattering in the WKB Approximation 578
52.3 The Eikonal Approximation 581
52.4 Coulomb Scattering and the Semiclassical Approximation 584

53 Inelastic Scattering 588


53.1 Generalizing Elastic Scattering from Unitarity Loss 588
53.2 Inelastic Scattering Due to Target Structure 590
53.3 General Theory of Collisions: Inelastic Case and Multi-Channel Scattering 592
53.4 General Theory of Collisions 593
53.5 Multi-Channel Analysis 595
53.6 Scattering of Identical Particles 597

Part IIe Many Particles 601

54 The Dirac Equation 603


54.1 Naive Treatment 603
54.2 Relativistic Dirac Equation 606
54.3 Interaction with Electromagnetic Field 607
54.4 Weakly Relativistic Limit 607
54.5 Correction to the Energy of Hydrogenoid Atoms 611

55 Multiparticle States in Atoms and Condensed Matter: Schrödinger versus Occupation Number 614
55.1 Schrödinger Picture Multiparticle Review 614
55.2 Approximation Methods 616
55.3 Transition to Occupation Number 616
55.4 Schrödinger Equation in Occupation Number Space 617
55.5 Analysis for Fermions 621

56 Fock Space Calculation with Field Operators 623


56.1 Creation and Annihilation Operators 623
56.2 Occupation Number Representation for Fermions 624
56.3 Schrödinger Equation on Occupation Number States 625

56.4 Fock Space 626


56.5 Fock Space for Fermions 627
56.6 Schrödinger Equations for Fermions, and Generalizations 628
56.7 Field Operators 628
56.8 Example of Interaction: Coulomb Potential, as a Limit of the Yukawa Potential,
for Spin 1/2 Fermions 630

57 The Hartree–Fock Approximation and Other Occupation Number Approximations 632


57.1 Set-Up of the Problem 632
57.2 Derivation of the Hartree–Fock Equation 633
57.3 Hartree and Fock Terms 635
57.4 Other Approximations in the Occupation Number Picture 637

58 Nonstandard Statistics: Anyons and Nonabelions 641


58.1 Review of Statistics and Berry Phase 641
58.2 Anyons in 2+1 Dimensions: Explicit Construction 642
58.3 Chern–Simons Action 644
58.4 Example: Fractional Quantum Hall Effect (FQHE) 646
58.5 Nonabelian Statistics 647

References 650
Index 652
Preface

There are so many books on quantum mechanics that one has to ask: why write another one?
When teaching graduate quantum mechanics in either the US or Brazil (countries I am familiar with), as well as in other places, one is faced with a conundrum: how does one address all possible backgrounds while keeping the course both interesting and comprehensive enough to give the graduate student a chance to compete in any area? Indeed, while graduate students usually have very different backgrounds, there is typically a compulsory graduate quantum mechanics course, usually a two-semester one. The students will certainly have had some introductory undergraduate quantum mechanics and some modern physics, but some have studied these topics in detail, while others less so.
To that end, I believe there is little need for a long historical introduction or a detailed explanation of why we define quantum mechanics the way we do. To challenge students, we cannot simply repeat what they heard in the undergraduate course. So one needs a tighter presentation, with more emphasis on building up the formalism of quantum mechanics than on its motivation, as well as a presentation that includes interesting new developments besides the standard advanced concepts. On the other hand, I have personally found that many (even very bright) students are caught between the two systems and miss out on important information: they have had an introductory quantum mechanics course that did not treat all the standard classical material (for example, the hydrogen atom in full detail, the harmonic oscillator in its various treatments, and the WKB and Bohr–Sommerfeld formalisms), yet the graduate course assumes that students have covered all these topics, and so they struggle in their subsequent research.
Since I have found no graduate book that meets these conditions of comprehensiveness to my satisfaction, I decided to write one, and this is the result. The book is intended as a two-semester course, corresponding to the two parts, with each chapter corresponding to one two-hour lecture (though sometimes to an extended version of one).

Acknowledgements

I should first thank all the people who have helped and guided me on my journey as a physicist,
starting with my mother Ligia, who was the first example I had of a physicist, and from whom I first
learned to love physics. My high school physics teacher, Iosif Sever Georgescu, helped to start me
on the career of a physicist, and made me realize that it was something that I could do very well. My
student exchange advisor at the Niels Bohr Institute, Poul Olesen, first showed me what string theory
is, and started me in a career in this area, and my PhD advisor, Peter van Nieuwenhuizen, taught
me the beauty of theoretical physics, of rigor and of long calculations, and an approach to teaching
graduate physics that I still try to follow.
With respect to teaching quantum mechanics, most of the credit goes to my teachers in the Physics
Department of Bucharest University, since during my four undergraduate years there (1990–1994),
many of the courses there taught me about various aspects of quantum mechanics. Some recent
developments described here I learned from my research, so I thank all my collaborators, students,
and postdocs for their help with understanding them. My wife Antonia I thank for her patience and
understanding while I wrote this book, as well as for help with an accident during the writing. I would
like to thank my present editor at Cambridge University Press, Vince Higgs, who helped me get this
book published, as well as my outgoing (now retired) editor Simon Capelin, for his encouragement
in starting me on the path to publishing books. To all the staff at CUP, thanks for making sure this
book, as well as my previous ones, is as good as it can be.

Introduction

As described in the Preface, this book is intended for graduate students, and so I assume that readers will have had an introductory undergraduate course in quantum mechanics, which familiarized them with both the historical motivation and the basic formalism. For completeness, however, I have included a brief historical background as an optional introductory chapter. I do assume a certain level of mathematical proficiency, as might be expected of a graduate student. Nevertheless, the first two chapters are dedicated to the mathematics of quantum mechanics; they are divided into finite- and infinite-dimensional Hilbert spaces, since these concepts are necessary for a smooth introduction.
I start the discussion of quantum mechanics with its postulates and the Schrödinger equation, after
which I start developing the formalism. I mostly use the bra-ket notation for abstract states, as it
is often cleaner, though for most of the standard systems I use wave functions. I introduce early
on the notion of the path integral, since it is an important alternative way to describe quantization,
through the sum over all paths, classical or otherwise, though I mostly use the operatorial formalism.
I also introduce early on the notion of pictures, including the Heisenberg and interaction pictures, as
alternatives for the description of time evolution. The important notion of angular momentum theory
is presented within the larger context of symmetries in quantum mechanics, since this is the modern
viewpoint, especially within the relativistic quantum field theory that extends quantum mechanics.
With respect to topics, in Part I, on basic concepts, I consider both the older, but still very useful,
topics such as WKB and Bohr–Sommerfeld quantization as well as more recent ones such as the
Berry phase and Dirac quantization condition. In Part II I consider scattering theory, variational
methods, occupation number space, quantum entanglement and information, quantum complexity,
quantum chaos, and thermalization; these are the more advanced topics. The advanced foundations of quantum mechanics (Part IIa) are discussed in Part II since they are newer and harder to understand, even though, being foundational issues, they could be said to belong in Part I.
After each chapter, I summarize a set of “Important Concepts to Remember” and present seven exercises whose solutions are meant to clarify the concepts in the chapter.

PART I

FORMALISM AND BASIC PROBLEMS


Introduction: Historical Background

In this book, we will describe the formalism and applications of quantum mechanics, starting from its first principles and postulates and expanding them logically into the fully developed system that we currently have. We assume that the reader has had an undergraduate course in modern physics, as well as an undergraduate course in quantum mechanics, and thus a first encounter with the ideas of quantum mechanics and with the historical development and experiments that led to it. However, for the sake of consistency and completeness, in this chapter we will quickly review these historical developments and how we were led to the current formalism of quantum mechanics.
In classical mechanics and optics, there were deterministic laws, involving two types of objects:
particles of matter, following well-defined classical paths, with their evolution defined by their
Hamiltonian and the initial conditions, and waves for light, defined by wave functions (for the electric
and magnetic fields $\vec{E}$ and $\vec{B}$) and the observable intensities $I$ defined by them, showing interference
patterns in the sum of waves. But there was no overlap between the two descriptions, in terms of
particles and waves. Notably, Newton had a rudimentary particle description for light (in his Opticks),
but with the development of the classical wave picture of Huygens for optics, that description was
forgotten for a long time.

0.1 Experiments Point towards Quantum Mechanics

Then, around the turn of the nineteenth century into the twentieth, a string of developments
changed the classical picture, allowing for a complementary particle description for waves, and for a
complementary wave description for particles. One of the first such developments was the discovery
of radioactivity by Henri Becquerel in 1896, who, after Wilhelm Roentgen’s discovery of X-rays in
1895, showed that radioactive elements produce rays similar to X-rays in that they can go through
matter and then leave a pointlike mark on a photographic plate, thus involving emission of highly
energetic particles. Now we know that these emitted particles can be α (helium nuclei), β (electrons and
positrons), but also γ (high-energy photons, so lightlike).
Next came what is believed to be the start of the modern quantum era, with the first theoretical idea
of a quantum, specifically of light, used to describe blackbody radiation. In 1900, Planck managed
to explain the blackbody emission spectrum with a simple but revolutionary idea. Assuming that the
energy exchange between matter and the emitted radiation occurs not continuously (as assumed in
classical physics), but only in given quanta of energy, proportional to the frequency ν of the radiation,

ΔE = hν, (0.1)


where the constant h is called the Planck constant, Planck managed to derive a formula for the
spectral radiance:
$$B_\nu(\nu, T) = \frac{2h\nu^3}{c^2} \, \frac{1}{e^{h\nu/k_B T} - 1}, \qquad (0.2)$$

which matches the emission spectrum of the radiation of a perfect black body, which was then known
experimentally. Note that by integrating, we obtain the power per unit of emission area,
$$P = \int_0^\infty d\nu \int d\Omega \, B_\nu \cos\theta = \sigma T^4, \qquad (0.3)$$

where we have used the differential of solid angle dΩ = sin θ dθdφ and have defined the Stefan–
Boltzmann constant
$$\sigma = \frac{2\pi^5 k_B^4}{15 c^2 h^3}, \qquad (0.4)$$
obtaining the (experimentally discovered) Stefan–Boltzmann law. Planck himself didn’t quite believe
in the physical reality of the quantum of energy (as existing independently of the phenomenon of
emission of radiation), but thought of it as a useful trick that manages to provide the correct result.
It was necessary to wait until Einstein’s explanation of the photoelectric effect in 1905 for the true
start of the quantum era. Indeed, Einstein took Planck’s idea to its logical conclusion, and postulated
that there is a quantum of energy, that he called a photon and that can interact with matter, such that
its energy is E = hν. Such an energy quantum is absorbed in the photoelectric effect by the bound
electrons with binding energy W , such that they are released with kinetic energy
$$E_{\rm kin} = \frac{mv^2}{2} = h\nu - W, \qquad (0.5)$$
providing an electrical current; see Fig. 0.1a.
Continuing the idea of the interaction of electrons with quanta of light, i.e., photons, we also note
the Compton effect, observed experimentally by Arthur Compton in 1923, in which a photon scatters
off a free electron and changes its wavelength according to the law
$$\Delta\lambda = \frac{2h}{mc} \sin^2\frac{\theta}{2}. \qquad (0.6)$$
The law is easily explained by assuming that the photon has a relativistic relation between the
energy and momentum, E = pc, leading to a momentum
$$p = \frac{h\nu}{c} = \frac{h}{\lambda}. \qquad (0.7)$$


(a) Photoelectric (b) Compton (c) Balmer

Figure 0.1 Interactions of photons with electrons: (a) photoelectric effect; (b) Compton effect; (c) Balmer relation (transition to
excited state).

Then relativistic energy and momentum conservation for the process e− + γ → e− + γ (see
Fig. 0.1b), where the initial e− is (almost) at rest,

$$\vec{p}_\gamma = \vec{p}\,'_\gamma + \vec{p}_e,$$
$$mc^2 + p_\gamma c = \sqrt{p_e^2 c^2 + m^2 c^4} + p'_\gamma c, \qquad (0.8)$$

leads to the Compton law (0.6).
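A numerical sketch of this statement (editorial, not from the text; the incident wavelength and scattering angle below are arbitrary test values): compute the scattered wavelength from (0.6) and check that the conservation equations (0.8) are satisfied.

```python
import math

h = 6.62607015e-34    # Planck constant, J s
m = 9.1093837015e-31  # electron mass, kg
c = 2.99792458e8      # speed of light, m/s

def check_compton(lam=2.0e-12, theta=1.0):
    """Relative energy-conservation error for e- + gamma -> e- + gamma."""
    lam2 = lam + (2 * h / (m * c)) * math.sin(theta / 2) ** 2  # eq. (0.6)
    p1, p2 = h / lam, h / lam2          # photon momenta, eq. (0.7)
    # electron momentum from vector conservation, photon deflected by theta
    pex = p1 - p2 * math.cos(theta)
    pey = -p2 * math.sin(theta)
    pe = math.hypot(pex, pey)
    lhs = m * c**2 + p1 * c                                     # initial energy
    rhs = math.sqrt((pe * c) ** 2 + (m * c**2) ** 2) + p2 * c   # final energy
    return abs(lhs - rhs) / lhs

print(check_compton())  # essentially zero: energy conserved to rounding error
```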


Next, we can also consider the case where the absorbed or emitted energy quantum leaves the
electron bound to an atom, just changing its state inside the atom; see Fig. 0.1c. Since not all possible
energies for the photon can be absorbed in this way, this means that the allowed energy states for
the electron inside the atom are also quantized, i.e., they can only take well-defined values. In fact,
experimentally we have the Balmer relation, discovered by Johann Balmer in 1885, for the possible
frequencies of the absorbed or emitted radiation for the hydrogen (H) atom, given by
 
$$\nu = R\left(\frac{1}{n^2} - \frac{1}{m^2}\right), \qquad (0.9)$$

where R is called the Rydberg constant. Together with Planck’s and Einstein’s relation ΔE = hν, we
obtain the law that the possible energies of the electron inside the H atom are given by the energy
levels
$$E_n = -\frac{hR}{n^2}, \qquad (0.10)$$
where n is a natural nonzero number, i.e. n = 1, 2, 3, . . .
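As an illustration (an editorial sketch, not in the text), the Balmer relation (0.9) with n = 2 reproduces the visible hydrogen lines; the value used for the Rydberg frequency R is the standard one, inserted here as an assumption.

```python
R = 3.2898419603e15  # Rydberg frequency, Hz (standard value, assumed here)
c = 2.99792458e8     # speed of light, m/s

def balmer_wavelength_nm(m, n=2):
    """Wavelength (nm) of the transition m -> n from eq. (0.9)."""
    nu = R * (1.0 / n**2 - 1.0 / m**2)
    return 1e9 * c / nu

for m in (3, 4, 5):
    print(m, round(balmer_wavelength_nm(m), 1))
# m = 3 gives the red H-alpha line near 656 nm, m = 4 near 486 nm, ...
```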
The fact that the states of the electron in the H atom (and, in fact, of any atoms or molecules, as
we now know) are quantized suggests that it is possible to have more general quantization relations
for energy states. That this is true, and that we can turn it into a concrete description for states in
quantum mechanics, was proved in another important experiment, the Stern–Gerlach experiment of
1922 (by Otto Stern and Walter Gerlach). In this experiment, horizontal paramagnetic atomic beams
are passed through a magnetic field varying as a function of the vertical position, B = B(z), created
by two magnets oriented on a vertical line, as in Fig. 0.2. The electrons are understood as “spinning”,
like magnets with a magnetic moment μ, so the energy of the interaction of the “spinning” electrons
with the magnetic field is given by

$$\Delta E = -\vec{\mu} \cdot \vec{B}. \qquad (0.11)$$

But the magnetic moment is proportional to the "angular momentum" $\vec{l}$, $\vec{\mu} = m\vec{l}$. Classically, we
expect any possible value for $\vec{l}$, and so any possible value for both the magnetic moment $\vec{\mu}$ and its
projection μz on the z direction. Since the force in the z direction is

$$F_z = -\partial_z \Delta E = \partial_z (\vec{\mu} \cdot \vec{B}) \approx \mu_z \, \partial_z B_z, \qquad (0.12)$$

we expect the arbitrary value of μz to translate into an arbitrary deviation of the beam, experimentally
detected on a screen transverse to the original beam. But in fact one observes only two possible
deviations, implying only two possible values of μz , symmetric about zero, which in fact can only
be +μ and −μ, with a fixed μ. That in turn means that l z has only the possible values +l and −l,
with l fixed.


Figure 0.2 The Stern–Gerlach experiment.

0.2 Quantized States: Matrix Mechanics, and Waves for Particles: Correspondence Principle

The conclusion is that the “spinning” states of the electron have a projection onto any axis (since z is
a randomly chosen direction) that can have only two given values, which is another example of
the quantization of states, involving the simplest possibility: two possible choices only, states “1”
and “2”. In general, we expect more quantized values for the energy, therefore more, but usually
countable, states. Observables in these states can be “diagonal”, meaning they map the state to
itself (for instance, the energy of a state), or “off-diagonal”, meaning they map one state to another
(like those related to the transition between matter states due to interaction with light). In total, we
obtain a "matrix mechanics", with matrix observables $M_{ij}$ acting on states $i$. This was defined by
Werner Heisenberg, Max Born, and Pascual Jordan in 1925, in three papers (first by Heisenberg,
then generalized and formalized by Born and Jordan, then by all three).
But one still had to understand what is the physical meaning of the states, and obtain an alternative
to the classical idea of the path of a matter-particle. This came with the idea of a wave associated with
any particle, and conversely, a particle associated with any wave, or particle–wave complementarity;
see Fig. 0.3a. This is illustrated best in the classic double-slit experiment, which today is a tabletop
experiment covered in undergraduate physics. Consider a plane wave, or a beam of particles, moving
perpendicularly towards a screen with two slits in it, and observe the result on a detecting parallel
screen behind it, as in Fig. 0.3b.
If we have classical waves, as in classical electromagnetism (the description of the macroscopic
quantities of light), a traveling plane wave is described by a function ψ depending only on the
propagation direction x and the time t, as
  
$$\psi(x, t) = A \exp\left[i\left(\frac{2\pi}{\lambda}x - \frac{2\pi}{T}t\right)\right], \qquad (0.13)$$
where $\psi$ is made up from $\vec{E}$ and $\vec{B}$. The general (spherical) form of the traveling wave is
$$\psi(r, t) = \frac{A}{r} \exp\left[i(kr - \omega t)\right], \qquad (0.14)$$


Figure 0.3 (a) Particle–wave complementarity (correspondence principle). (b) Two-slit interference of waves ψ1 , ψ2 giving a screen
image made up of individual points (particles), forming an interference pattern at large times.

and is the form that is relevant for the wave that comes out of a slit. One can however measure only
the intensity, given by
$$I = |\psi|^2. \qquad (0.15)$$
The role of the double slits is to split the plane wave into two waves, 1 and 2, with origin in each
slit, but converging on the same point on the detecting screen. Then one detects the total $I = I_{1+2}$,
but what adds up is not $I$ but rather $\psi$, so $\psi_{1+2} = \psi_1 + \psi_2$, and
$$I_{1+2} = |\psi_1 + \psi_2|^2. \qquad (0.16)$$
This leads to the interference pattern observed on the screen, with many local maxima of decaying
magnitude.
On the other hand, for particles, the particle beam divides at the slits in two beams. The “intensity”
of the beam is proportional to the number of particles per unit area (with the total number of particles
classically a constant, since they cannot be created or destroyed), so for them one finds instead
$$I_{1+2} = I_1 + I_2. \qquad (0.17)$$
This would lead to a pattern with a single maximum, situated at the midpoint of the screen, but
this is not what is observed.
In fact, as one knows from the experiment, if one decreases the intensity of the beam, such that,
according to the previous description, we have only individual photons (quanta of light) coming
through the slits, one can see the individual photons hitting the screen. However, their locations are
such that, when sufficiently many of them have hit, they still create the same interference pattern of
the waves. That means that, in a sense, a photon can “interfere with itself”.
It also implies a correspondence principle, that classical physics corresponds to macroscopic
effects, whereas microscopic effects are described by quantum mechanics. As long as something
becomes macroscopic (through many iterations of the microscopic, like for instance the large number
of photons in the double slit experiment), it becomes classical.
So the question arises, is there a wave associated with the photon itself? And what does it
correspond to? And if this is true for a photon, could it be true for any particle, even a particle of
matter? In fact, historically, this was first proposed by (the Marquis) Louis de Broglie in 1924, who
said that there should be a wave number k associated with any particle of momentum p, given by
$$k = \frac{p}{\hbar}, \qquad (0.18)$$

where $\hbar = h/(2\pi)$. Then it follows that the de Broglie wavelength of the particle is
$$\lambda = \frac{2\pi}{k} = \frac{h}{p}. \qquad (0.19)$$
This gives the correct formula for a photon, since for a photon, p = E/c and λ = c/ν. But
the formula is supposed to also apply for any matter particle, such as for instance an electron.
For electrons, the formula was confirmed experimentally by Davisson and Germer in experiments
performed between 1923 and 1927. Today, the principle implied in these experiments is used in the
electron microscope, which uses electrons instead of photons in order to “see”. This allows it to have
a better resolution, since the resolution is of the order of λ and the de Broglie λ for an electron is
much smaller than that for a photon of light (since the momentum p is much larger).
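This comparison is easy to make concrete (an editorial sketch; the 100 V accelerating potential is an illustrative choice, and the nonrelativistic relation p = √(2m_e eV) is assumed):

```python
import math

h = 6.62607015e-34     # Planck constant, J s
m_e = 9.1093837015e-31 # electron mass, kg
e = 1.602176634e-19    # elementary charge, C

def de_broglie_nm(volts):
    """de Broglie wavelength (nm), nonrelativistic electron at energy eV."""
    p = math.sqrt(2 * m_e * e * volts)
    return 1e9 * h / p    # lambda = h/p, eq. (0.19)

print(de_broglie_nm(100.0))  # ~0.12 nm, far below optical wavelengths (~500 nm)
```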

0.3 Wave Functions Governing Probability, and the Schrödinger Equation

Because the intensity of a beam of particles is I = |ψ| 2 , if we allow for the possibility that particles
have only probabilities of behaving in some way, we arrive at the conclusion that the intensity I must
be proportional to this probability, and that as such the probability is given by |ψ| 2 , where ψ is a wave
function associated with any particle (this probabilistic interpretation of the wave function is due to
Max Born). Schrödinger then went on to write an equation, now known as Schrödinger's equation, for
the wave function. The equation for ψ is in general

$$i\hbar\,\partial_t \psi = \hat{H}\psi, \qquad (0.20)$$

where Ĥ is an operator associated with the classical Hamiltonian H, now acting on the wave
function ψ.
This interpretation of quantum mechanics, of the Schrödinger equation for a wave function
associated with probability, was complementary to the matrix mechanics of Heisenberg, Born, and
Jordan. The two however are united into a single one by the interpretation of Dirac in terms of bra
⟨ψ| and ket |ψ⟩ states, which we will develop. In it, thinking of various states |ψᵢ⟩ as the states i of
matrix mechanics, operators like Ĥ appear as matrices $H_{ij}$.

0.4 Bohr–Sommerfeld Quantization and the Hydrogen Atom

Quantization of the electrons in the H atom on the other hand is obtained from the Bohr–Sommerfeld
quantization rules, which state that the total action over the domain of a state, such as a cycle of the
motion of an electron around the H atom, is quantized in units of h,

$$\oint_{H=E} \vec{p} \cdot d\vec{q} = nh, \qquad (0.21)$$

where E is the energy of the electron. More generally, this should be true for any variable q and its
canonically conjugate momentum p, i.e.,

$$\oint p_i \, dq_i = nh. \qquad (0.22)$$

We will study this in more detail later, but in this review we will just recall the principal details.
For the H atom, we have three simple quantizations. For pφ (the momentum associated with the
angular variable φ),

$$\oint p_\phi \, d\phi = lh \qquad (0.23)$$

leads to quantization of the total angular momentum,


$$L = l\hbar. \qquad (0.24)$$
For pr (the momentum associated with the radial variable r),

$$\oint p_r \, dr = kh \qquad (0.25)$$

leads to the relation ($e_0^2 \equiv e^2/(4\pi\epsilon_0)$)
$$\pi\sqrt{\frac{2me_0^4}{-E}} - 2\pi L = kh. \qquad (0.26)$$
The sum of the quantum numbers k and l is called the principal quantum number,
n = l + k. (0.27)
This then leads to the derivation of the energy levels obtained from the Balmer law,
$$E_n = -\frac{me_0^4}{2\hbar^2 n^2}. \qquad (0.28)$$
Physically, the original Bohr–Sommerfeld quantization is interpreted as the particle trajectory
containing an integral number of de Broglie wavelengths, as in Fig. 0.4.
On the other hand, the projection of the angular momentum L on any direction z must also be
quantized (as we deduced from the Stern–Gerlach experiment), so

$$\oint L_z \, d\phi = mh, \qquad (0.29)$$

leading to
$$L_z = m\hbar. \qquad (0.30)$$
Thus the H atom is described by the three quantum numbers (n, l, m).
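As a numerical check of (0.28) (an editorial sketch; the constants are the standard CODATA values, inserted here as an assumption), the n = 1 level should come out near the familiar −13.6 eV:

```python
import math

hbar = 1.054571817e-34  # reduced Planck constant, J s
m_e = 9.1093837015e-31  # electron mass, kg
e = 1.602176634e-19     # elementary charge, C
eps0 = 8.8541878128e-12 # vacuum permittivity, F/m

e0sq = e**2 / (4 * math.pi * eps0)  # e_0^2 = e^2/(4 pi eps0)

def E_n_eV(n):
    """Hydrogen energy level from eq. (0.28), in eV."""
    return -m_e * e0sq**2 / (2 * hbar**2 * n**2) / e

print(E_n_eV(1))  # ~ -13.6
```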

Figure 0.4 Bohr–Sommerfeld quantization, physical interpretation.



0.5 Review: States of Quantum Mechanical Systems

To summarize, in the case of quantum mechanical systems we have the following possibilities for
the states.

• We can have a finite number of discrete states, as in the case of spin systems. Quantization in
this case means exactly that: instead of a continuum of possible states (for a classical angular
momentum, in this case), now we have a discrete set of states. In particular, for spin s = 1/2 (as in
the case of electrons, proven by the Stern–Gerlach experiment), we have a two-state system, the
simplest possible, which will be analyzed first.
• We can have an infinite number of discrete states, as in the case of the hydrogen atom, considered
above. In this case, quantization means the same: instead of a continuum of states, we have a
discrete set, with an allowed set of energies and states, as well as other observables.
• We can also have a continuum of states, as in the case of free particles. In this case, quantization
refers only to the types of states allowed; there is no restriction on the allowed number of energies
and states associated with them.

Important Concepts to Remember

• Some states for some systems are quantized (quantized energy states, for instance), leading to
matrix mechanics.
• There is always a wave associated with a particle and a particle with a wave, leading to a
complementarity.
• There is always a (complex-valued) wave function ψ describing probabilities for a measurement
via P = |ψ| 2 . It satisfies the Schrödinger equation.
• We can have a finite or infinite discrete number of states, or even a continuum of states, but all
are still described by matrix mechanics. In any case, there is a wave function.

Further Reading
See, for instance, Messiah’s book [2]. It has a more thorough treatment of the historical background.

Exercises

(1) Derive the Stefan–Boltzmann law from the Planck law for the spectral radiance.
(2) In the photoelectric effect, if the incoming flux of light has frequency ν and the released electrons
get thermalized through interaction, what is the resulting temperature? If ν is within the visible
spectrum and W is comparable with the energy of the photon, what kind of temperature range
do you expect?
(3) Derive the Compton effect Δλ from the relativistic collision of the photon from a free electron.

(4) Considering a relativistic traveling wave of frequency ν, calculate the distance between the local
maxima (or local minima) in the interference pattern on a screen.
(5) Calculate the de Broglie wavelength for an electron at “room temperature”. How does it compare
with a corresponding wavelength in the radiation spectrum? What does this imply for the
feasibility of microscopes using electrons versus those using radiation?
(6) Can we apply Bohr–Sommerfeld quantization to a circular pendulum? Why, and would it be
useful to do so?
(7) Do you think that the Schrödinger equation is applicable for light as well? Explain why.
1 The Mathematics of Quantum Mechanics 1: Finite-Dimensional Hilbert Spaces

Before we define quantum mechanics through its postulates, we will review relevant issues about the
mathematics of vector spaces, formulating it in a way more easily applicable to the physical case in
which we are interested, and in particular the vector spaces we will call Hilbert spaces, relevant for
quantum mechanics. In this chapter we will consider the case of finite-dimensional spaces, which are
easier to define and analyze; in the next chapter we will turn to the more complicated, but more
physically relevant, case of infinite-dimensional Hilbert spaces.

1.1 (Linear) Vector Spaces V

We start with the general definition of a vector space. A vector space is a generalization of the space
of vectors familiar from classical mechanics. As such, it is a collection {|v⟩} of elements denoted
by |v⟩ (in the so-called "ket" notation defined by Dirac), together with an addition rule "+" and a
multiplication rule "·", and a set of axioms about them, as follows.

• There is an addition operation "+" that respects the space (a group property): ∀|v⟩, |w⟩ ∈ V,
|v⟩ + |w⟩ ∈ V.
• There is a multiplication by a scalar operation "·" that also respects the space: ∀α ∈ C and |v⟩ ∈ V,
α|v⟩ ∈ V.
• The addition of scalars is distributive: ∀α, β ∈ C and |v⟩ ∈ V, (α + β)|v⟩ = α|v⟩ + β|v⟩.
• The addition of vectors is also distributive: ∀α ∈ C and |v⟩, |w⟩ ∈ V, α(|v⟩ + |w⟩) = α|v⟩ + α|w⟩.
• The addition of vectors "+" is commutative and associative, and so is the multiplication by
scalars "·".
• There is a unique zero vector |0⟩, such that |v⟩ + |0⟩ = |v⟩.
• There is a unique inverse under vector addition: ∀|v⟩ ∃ |−v⟩ such that |v⟩ + |−v⟩ = |0⟩.

Given a set of vectors |v₁⟩, |v₂⟩, . . . , |vₙ⟩, we can form linear combinations of them. We say that
the vectors are linearly dependent if there is a linear combination of them that vanishes, i.e., if
∃ αᵢ ≠ 0 (at least two of them nonzero), complex numbers such that
$$\sum_{i=1}^n \alpha_i |v_i\rangle = |0\rangle. \qquad (1.1)$$
In this case, we can write one of the vectors in terms of the others. For instance, if αₙ ≠ 0, we
write
$$|v_n\rangle = -\sum_{i=1}^{n-1} \frac{\alpha_i}{\alpha_n} |v_i\rangle \equiv \sum_{i=1}^{n-1} \alpha'_i |v_i\rangle. \qquad (1.2)$$

We will then call a vector space V n-dimensional if there are n linearly independent vectors
|v₁⟩, |v₂⟩, . . . , |vₙ⟩, but ∀ sets of n + 1 vectors |w₁⟩, |w₂⟩, . . . , |wₙ₊₁⟩, they are linearly dependent.
We call {|v₁⟩, |v₂⟩, . . . , |vₙ⟩} a basis for the space.
We can then decompose any vector |v⟩ ∈ V in this basis, i.e., there is a unique set of coefficients
αᵢ, such that
$$|v\rangle = \sum_{i=1}^n \alpha_i |v_i\rangle. \qquad (1.3)$$

The fact that the set of coefficients is unique follows from the fact that, if there were a second set
α̃ᵢ satisfying the same condition,
$$|v\rangle = \sum_{i=1}^n \tilde{\alpha}_i |v_i\rangle, \qquad (1.4)$$
then by taking the difference of the two decompositions, we would obtain
$$\sum_{i=1}^n (\alpha_i - \tilde{\alpha}_i) |v_i\rangle = 0, \qquad (1.5)$$
which would mean that the |vᵢ⟩ are linearly dependent, which is a contradiction.

Subspaces
If there is a subset V′ ⊂ V such that, ∀|v⟩ ∈ V′ and ∀α ∈ C, α|v⟩ ∈ V′ as well, and also
∀|v⟩, |w⟩ ∈ V′, |v⟩ + |w⟩ ∈ V′, we call V′ a subspace of the vector space.
In this case, the subspace will have dimension k ≤ n (n being the dimension of V), and there will
be a basis (|v₁⟩, |v₂⟩, . . . , |vₖ⟩) for V′.

Scalar Product and Orthonormality


On a vector space we can define a notion of a scalar product, which associates with any two vectors
|v⟩ and |w⟩ a complex number ⟨w|v⟩ such that:
(1) ⟨w|v⟩ = ⟨v|w⟩*.
(2) The product is distributive in the second term, i.e., ⟨w|(α|v⟩ + β|u⟩) = ⟨w|αv + βu⟩ = α⟨w|v⟩ +
β⟨w|u⟩.
(3) The product of a vector with itself is nonnegative, ⟨v|v⟩ ≥ 0, and is zero only for the null vector,
$$\langle v|v\rangle = 0 \Leftrightarrow |v\rangle = |0\rangle. \qquad (1.6)$$
Note that the first and second axioms imply also that the product is distributive in the first term,
$$\langle \alpha v + \beta u|w\rangle = (\alpha^* \langle v| + \beta^* \langle u|)|w\rangle = \alpha^* \langle v|w\rangle + \beta^* \langle u|w\rangle. \qquad (1.7)$$
Given this scalar product, we can define orthogonality, namely when
$$\langle v|w\rangle = 0. \qquad (1.8)$$

We also deduce, using the second axiom, that the product of any vector with the null vector gives
zero, since
$$\langle v|0\rangle = \langle v|w\rangle - \langle v|w\rangle = 0. \qquad (1.9)$$
We define the norm as the square root of the scalar product of a vector with itself,
$$\|v\| \equiv \sqrt{\langle v|v\rangle}. \qquad (1.10)$$
Next, we can define orthonormality as orthogonality with a unit norm for each vector, ‖v‖ = 1, ∀|v⟩.
If a basis vector does not have unit norm, we can always find a complex number c such that ‖cv‖ = 1,
since
$$\langle cv|cv\rangle = cc^* \langle v|v\rangle \Rightarrow |c|^2 = \frac{1}{\|v\|^2}. \qquad (1.11)$$
We can consider a basis of orthonormal vectors |i⟩, and decompose an arbitrary vector |v⟩
using it, as
$$|v\rangle = \sum_i v_i |i\rangle. \qquad (1.12)$$
Considering also another vector decomposed in this way,
$$|w\rangle = \sum_i w_i |i\rangle, \qquad (1.13)$$
we obtain for the scalar product
$$\langle v|w\rangle = \sum_{i,j} v_i^* w_j \langle i|j\rangle. \qquad (1.14)$$
If the basis is orthonormal, we have ⟨i|j⟩ = δᵢⱼ, so
$$\langle v|w\rangle = \sum_i v_i^* w_i. \qquad (1.15)$$
Then the norm is given by
$$\|v\|^2 = \langle v|v\rangle = \sum_i |v_i|^2. \qquad (1.16)$$
By multiplying the decomposition of |v⟩ by ⟨k| (considering the scalar product with |k⟩), we obtain
(using ⟨k|i⟩ = δᵢₖ)
$$\langle k|v\rangle = \sum_i v_i \langle k|i\rangle = v_k. \qquad (1.17)$$
Since vᵢ = ⟨i|v⟩, the decomposition becomes
$$|v\rangle = \sum_i |i\rangle\langle i|v\rangle. \qquad (1.18)$$

The Gram–Schmidt theorem (orthonormalization)

Given a linearly independent basis, we can always find an orthonormal basis by making linear
combinations of the basis vectors |vᵢ⟩.
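A minimal sketch of the Gram–Schmidt procedure (editorial, not from the text), using the scalar product $\langle v|w\rangle = \sum_i v_i^* w_i$ of (1.15); the example vectors are arbitrary:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of complex vectors; skips dependent ones."""
    basis = []
    for v in vectors:
        w = v.astype(complex)
        for b in basis:
            w = w - np.vdot(b, w) * b      # subtract projection <b|w> |b>
        norm = np.sqrt(np.vdot(w, w).real)
        if norm > 1e-12:                   # linearly dependent: drop it
            basis.append(w / norm)
    return basis

vs = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0j])]
e1, e2 = gram_schmidt(vs)
print(np.vdot(e1, e2))  # ~0: the output basis is orthonormal
```

Note that `np.vdot` conjugates its first argument, matching the convention $\langle v|w\rangle = \sum_i v_i^* w_i$.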

1.2 Operators on a Vector Space

An operator Â is an action on the vector space that takes us back into the vector space, i.e., if |v⟩ ∈ V
then also |w⟩ = Â|v⟩ ∈ V. In other words, it associates with any vector another vector,
$$\hat{A} : |v\rangle \to |w\rangle. \qquad (1.19)$$
It is thus the generalization of the notion of a function to an action on a vector space.
We will mostly be interested in linear operators, which means that

$$\hat{A}(\alpha|v\rangle + \beta|w\rangle) = \alpha\hat{A}|v\rangle + \beta\hat{A}|w\rangle. \qquad (1.20)$$
Note however that there are also useful antilinear operators, relevant for the time-reversal invariance
(T) operator,
$$\hat{A}(\alpha|v\rangle + \beta|w\rangle) = \alpha^*\hat{A}|v\rangle + \beta^*\hat{A}|w\rangle. \qquad (1.21)$$
Nevertheless, at present we will consider only linear operators (except later on in the book, when we
will consider the T symmetry).
In the space of operators (or "functions" on the vector space) there exists the notion of 0̂ (a neutral
element under addition) and 1̂ (a neutral element under multiplication), defined by
$$\hat{0}|v\rangle = |0\rangle, \quad \hat{1}|v\rangle = |v\rangle. \qquad (1.22)$$
We also have the same definition of the product with a scalar (in general a complex number), and of
addition, as in the vector space itself,
$$(\alpha\hat{A})|v\rangle = \alpha(\hat{A}|v\rangle),$$
$$(\alpha\hat{A} + \beta\hat{B})|v\rangle = \alpha(\hat{A}|v\rangle) + \beta(\hat{B}|v\rangle). \qquad (1.23)$$
Since the operator is the generalization of the notion of a function for a vector space, we can also
define the product of operators as the generalization of the composition of functions, A ◦ B, namely
$$\hat{A} \cdot \hat{B}|v\rangle \equiv \hat{A}(\hat{B}|v\rangle). \qquad (1.24)$$

On the space of operators, obtained from the vector space, we can define a completeness relation,
saying that an orthonormal basis of vectors |i⟩ is also complete in the space, i.e., that we can expand
any vector in the vector space in terms of them. As we saw, this expansion formula was
$$|v\rangle = \sum_i |i\rangle\langle i|v\rangle, \qquad (1.25)$$
and by identifying this with 1̂|v⟩ (by the definition of 1̂), we obtain the completeness relation
$$\sum_i |i\rangle\langle i| = \hat{1}. \qquad (1.26)$$
This "ket-bra" notation for operators is used more generally than just for the completeness relation,
and it can be shown that we can write an operator Â in the form
$$\hat{A} = \sum_{a \in A,\, b \in B} |a\rangle\langle b|, \qquad (1.27)$$

where A, B are some sets of vectors, since by acting on an arbitrary vector |v⟩ we obtain
$$\hat{A}|v\rangle = \sum_{a \in A,\, b \in B} |a\rangle\langle b|v\rangle = \sum_{a \in A,\, b \in B} v_b |a\rangle \equiv |w\rangle, \qquad (1.28)$$
where $v_b \equiv \langle b|v\rangle$.
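The completeness relation (1.26) is easy to verify numerically; the following sketch (editorial, with a randomly generated orthonormal basis) sums the outer products |i⟩⟨i| over the columns of a unitary matrix and recovers the identity:

```python
import numpy as np

# An orthonormal basis: the columns of a random unitary matrix,
# obtained from the QR decomposition of a random complex matrix.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

# sum_i |i><i|, with |i> the ith column and <i| its conjugate
resolution = sum(np.outer(U[:, i], U[:, i].conj()) for i in range(4))
print(np.allclose(resolution, np.eye(4)))  # True
```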

Matrix Representation of Operators


We have talked until now in an abstract way of vector spaces and operators acting on them, but we
now show that we can represent operators as matrices acting on column vectors. Then, in the case of
quantum mechanics, one obtains the representation of the "matrix mechanics" of Heisenberg.
Consider an (orthonormal) basis |1⟩, |2⟩, . . . , |n⟩ for the vector space. Then an arbitrary vector
|v⟩, decomposed as
$$|v\rangle = \sum_i |i\rangle v_i = \sum_i |i\rangle\langle i|v\rangle, \qquad (1.29)$$

can be represented by the numbers ⟨i|v⟩, arranged as a column vector,
$$|v\rangle = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} \langle 1|v\rangle \\ \langle 2|v\rangle \\ \vdots \\ \langle n|v\rangle \end{pmatrix}, \qquad (1.30)$$
where the first form is general (in terms of coefficients vᵢ of the basis), and the second is for an
orthonormal basis.
In this representation, the ith basis vector corresponds to the column vector with only a 1 in the
ith position,
$$|1\rangle = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad |i\rangle = \begin{pmatrix} 0 \\ \vdots \\ 1_i \\ \vdots \\ 0 \end{pmatrix}. \qquad (1.31)$$
Under this representation of the vectors, the operators Â associating a vector with another vector are
associated with a matrix. The matrix element of the operator in the basis {|i⟩} is
$$A_{ij} = \langle i|\hat{A}|j\rangle. \qquad (1.32)$$
If moreover the set {|i⟩} is complete, then the product of operators Â · B̂ is associated with the matrix
product of the corresponding matrices. Indeed, inserting the identity written as the completeness
relation, we obtain
$$(\hat{A} \cdot \hat{B})_{ij} = \langle i|\hat{A} \cdot \hat{B}|j\rangle = \sum_k \langle i|\hat{A}|k\rangle\langle k|\hat{B}|j\rangle = \sum_k A_{ik} B_{kj}. \qquad (1.33)$$

Then it follows also that the operator inverse Â⁻¹, defined by
$$\hat{A} \cdot \hat{A}^{-1} = \hat{A}^{-1} \cdot \hat{A} = \hat{1}, \qquad (1.34)$$
is associated with the matrix inverse, $(\hat{A}^{-1})_{ij} = (A^{-1})_{ij}$.

1.3 Dual Space, Adjoint Operation, and Dirac Notation

We represented the vectors |v⟩ as column vectors (1.30), where the basis vectors |i⟩ correspond
to the components of the column vector. But it follows that we can define the adjoint matrix of this
column vector, i.e., the transposed and complex-conjugated matrix, a row vector, defined as the adjoint
of the column vector:
$$(|v\rangle)^\dagger = (v_1^*, v_2^*, \ldots, v_n^*). \qquad (1.35)$$
Then we can also write the scalar product of two vectors in an orthonormal basis, which we saw
was written as $\sum_i v_i^* u_i$, as the matrix product of row and column vectors,
$$\langle v|u\rangle = \sum_i v_i^* u_i = (v_1^* \;\; v_2^* \;\; \cdots \;\; v_n^*) \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}. \qquad (1.36)$$
But that means that we can represent the adjoint of the vector as the formal "bra" vector,
$$\langle v| = (|v\rangle)^\dagger \equiv (v_1^*, v_2^*, \ldots, v_n^*). \qquad (1.37)$$
The "bra" and "ket" notation is due to Dirac, and it is a splitting of the word "bracket", which refers
to the scalar product ⟨v|u⟩. Then |u⟩, the vectors in the space V, are called ket vectors and ⟨v|, the
vectors in the "dual" (adjoint) space, denoted by V*, are called bra vectors.
We can define the adjoint operation on vectors in a vector space a bit more generally, in order to
better define V*, by defining the adjoint operation under multiplication by scalars a ∈ C and under a
sum of the basis vectors,
$$\langle av| = (|av\rangle)^\dagger = (a|v\rangle)^\dagger = a^* \langle v|,$$
$$|v\rangle = \sum_i v_i |i\rangle \Rightarrow \langle v| = (|v\rangle)^\dagger = \sum_i v_i^* \langle i|, \qquad (1.38)$$
where we have used that $(|i\rangle)^\dagger = \langle i|$ for the basis vectors.
Having defined the adjoint operation for the vectors in the space, we can also define the same
for the operators relating two vectors, also via the adjoint matrix, i.e., the transpose and complex
conjugate, in the basis of vectors |i⟩, so
$$(\hat{A}^\dagger)_{ij} \equiv \langle i|\hat{A}^\dagger|j\rangle = \langle j|\hat{A}|i\rangle^* = A_{ji}^*. \qquad (1.39)$$

We can write this adjoint operation formally, without reference to a basis of vectors. Since an operator
associates a vector with another vector,
$$\hat{A}|v\rangle = |\hat{A}v\rangle, \qquad (1.40)$$
it follows that
$$\langle \hat{A}v| = \left(\hat{A}|v\rangle\right)^\dagger = \langle v|\hat{A}^\dagger. \qquad (1.41)$$
The matrix definition, or this abstract definition, also implies that the dagger operation acts in
reverse on a product, as follows:
$$(\hat{A} \cdot \hat{B})^\dagger = \hat{B}^\dagger \cdot \hat{A}^\dagger. \qquad (1.42)$$

Change of Basis

Consider a change of basis, from the orthonormal basis {|i⟩} to {|i′⟩}. The fact that both are bases
means that we can expand a vector in one basis in terms of the other, as the linear combination
$$|i'\rangle = \sum_j U_{ji} |j\rangle. \qquad (1.43)$$
Multiplying by the bra vector ⟨k| and using orthonormality, ⟨k|i⟩ = δᵢₖ, we obtain
$$U_{ki} = \langle k|i'\rangle. \qquad (1.44)$$
We can consider the reverse mapping, and represent this matrix as an abstract operator Û, with
matrix elements $U_{ki}$, i.e.,
$$U_{ki} = \langle k|\hat{U}|i\rangle = \langle k|\hat{U}i\rangle, \qquad (1.45)$$
which implies that formally we have
$$|i'\rangle = \hat{U}|i\rangle = |\hat{U}i\rangle. \qquad (1.46)$$
For the inverse relation, we can expand the vector |i⟩ in the basis |i′⟩, giving the linear combination
$$|i\rangle = \sum_j V_{ji} |j'\rangle. \qquad (1.47)$$
Multiplying by the vector ⟨k′| and using the orthonormality relation ⟨k′|j′⟩ = δⱼₖ, we obtain
$$V_{ki} = \langle k'|i\rangle = \langle i|k'\rangle^* = U_{ik}^* = (\hat{U}^\dagger)_{ki}, \qquad (1.48)$$
implying that formally we have V̂ = Û†. Then, applying the transformation twice, from |i⟩ to |i′⟩ and
back to |i⟩, we obtain
$$|i\rangle = \hat{U}^\dagger \hat{U}|i\rangle \Rightarrow \hat{U}^\dagger \hat{U} = \hat{1}, \qquad (1.49)$$
and similarly ÛÛ† = 1̂, implying that
$$\hat{U}^{-1} = \hat{U}^\dagger. \qquad (1.50)$$
Such an operator is called a unitary operator. Thus, operators that change basis in a vector space
are unitary.
Moreover, for such an operator, taking the determinant of the relation (mapped to matrices)
ÛÛ† = 1̂, and using the facts that the determinant of a matrix product is the product of the
determinants and that det Uᵀ = det U, so that det U† = (det U)*, we obtain

$$1 = \det(\hat{U} \cdot \hat{U}^\dagger) = \det U \cdot \det U^\dagger = |\det U|^2. \qquad (1.51)$$


These unitary matrices U form a group G, which means that their multiplication is such that:
(a) ∀ Â, B̂ ∈ G, Â · B̂ ∈ G.
(b) ∃ a unit operator (matrix) 1̂, such that  · 1̂ = 1̂ ·  = Â, ∀  ∈ G.
(c) ∀ Â, ∃ Â−1 such that Â−1 · Â = Â · Â−1 = 1̂.
The group of such unitary n × n matrices is called the unitary group U(n). Such a unitary
transformation of basis Û preserves the scalar product, meaning that if
|v′⟩ = U|v⟩, |w′⟩ = U|w⟩, (1.52)
then the scalar product is invariant,
⟨w′|v′⟩ = ⟨Uw|Uv⟩ = ⟨w|U†U|v⟩ = ⟨w|v⟩. (1.53)
The transformation of the basis is called an "active" transformation, |v⟩ → U|v⟩. Then the matrix
elements of an operator Â change as follows:
⟨i|Â|j⟩ → ⟨i′|Â|j′⟩ = ⟨Ui|Â|Uj⟩ = ⟨i|U†ÂU|j⟩. (1.54)
Equivalently, describing this as a "passive" transformation, we can think of the transformation as
acting on the operators instead:
Â → U†ÂU = U⁻¹ÂU. (1.55)
If, however, we consider the matrix element as invariant, the change of basis |i⟩ → U|i⟩ is
accompanied by a compensating inverse transformation on operators Â given by Â → UÂU⁻¹.

1.4 Hermitian (Self-Adjoint) Operators and the Eigenvalue Problem

We call an operator Hermitian or self-adjoint if
Â† = Â, (1.56)
and anti-Hermitian if
Â† = −Â. (1.57)
The notion of Hermitian and anti-Hermitian operators corresponds to the notion of real or
imaginary complex numbers. As for complex numbers (where we can decompose any complex
number into a real and an imaginary part), we can decompose any operator into a Hermitian and
an anti-Hermitian part,
Â = (Â + Â†)/2 + (Â − Â†)/2 ≡ Â_H + Â_A-H. (1.58)
Indeed, Â_H = (Â + Â†)/2 is Hermitian and Â_A-H = (Â − Â†)/2 is anti-Hermitian.
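This decomposition is immediate to check in components. A minimal numpy sketch (ours, not from the book), mirroring z = Re z + i Im z for complex numbers:

```python
import numpy as np

# Split an arbitrary square matrix into Hermitian and anti-Hermitian parts.
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

A_H = (A + A.conj().T) / 2    # Hermitian part
A_AH = (A - A.conj().T) / 2   # anti-Hermitian part

assert np.allclose(A_H, A_H.conj().T)      # A_H^dagger = A_H
assert np.allclose(A_AH, -A_AH.conj().T)   # A_AH^dagger = -A_AH
assert np.allclose(A_H + A_AH, A)          # they sum back to A
```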
In matrix notation, a Hermitian (self-adjoint) operator satisfies
A_ij = ⟨i|Âj⟩ = ⟨i|Â|j⟩ = (⟨j|Â†|i⟩)* = (⟨j|Â|i⟩)* = A*_ji. (1.59)
We will be interested in eigenvalue problems for operators, especially Hermitian operators.

An eigenvalue problem amounts to finding a vector |v⟩, called an eigenvector, for which the
operator acts on it simply as multiplication by a complex number λ, called the eigenvalue. Thus
we have
Â|v⟩ = λ|v⟩ = λ1̂|v⟩, (1.60)
or equivalently
(Â − λ1̂)|v⟩ = 0. (1.61)
Expanding this equation in a basis, we obtain a matrix equation; it has a nontrivial solution |v⟩
only if the determinant vanishes, which gives the eigenvalue condition
det(Â − λ1̂) = 0. (1.62)

As mentioned previously, we are mostly interested in Hermitian operators. For them, we have the
following:

Theorem The eigenvalues of a Hermitian operator are all real.

Proof We have
Â|v⟩ = λ|v⟩ ⇒ ⟨v|Â|v⟩ = λ‖v‖². (1.63)
On the other hand, for a Hermitian operator we also have
(Â|v⟩)† = ⟨v|Â† = ⟨v|Â. (1.64)
It then follows that we can rewrite the diagonal element of Â† in two ways:
⟨v|Â†|v⟩ = ⟨v|Â|v⟩ = λ‖v‖²
⟨v|Â†|v⟩ = ⟨v|λ*|v⟩ = λ*‖v‖². (1.65)
Therefore λ = λ*, so λ is real. q.e.d.

We can use the eigenvectors of a Hermitian operator to build an orthonormal basis, according to
the following:

Theorem For a Hermitian operator, Â† = Â, there is an orthonormal basis of eigenvectors for V. In
it, if the λ_i are all different,
Â = diag(λ₁, λ₂, . . . , λ_n), (1.66)
i.e., a diagonal matrix with λ₁, λ₂, . . . , λ_n on the diagonal.
Then, the eigenvalue condition (1.62) corresponds to an nth-order polynomial in λ, P_n(λ) = 0, with
n solutions λ₁, λ₂, . . . , λ_n.
That means that we can diagonalize a Hermitian operator by a unitary transformation, and on the
diagonal we have the eigenvalues of the operator.
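Both statements, real eigenvalues and diagonalization by a unitary transformation, can be checked numerically. A minimal numpy sketch (ours, not from the book):

```python
import numpy as np

# Make a Hermitian matrix and diagonalize it with np.linalg.eigh,
# which returns real eigenvalues and a unitary matrix of eigenvectors.
rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                  # Hermitian by construction

lam, U = np.linalg.eigh(A)
assert np.allclose(lam.imag, 0)                       # eigenvalues are real
assert np.allclose(U.conj().T @ U, np.eye(4))         # U is unitary
assert np.allclose(U.conj().T @ A @ U, np.diag(lam))  # diagonal form, cf. (1.66)
```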

1.5 Traces and Tensor Products

We can define the trace of an operator formally, as corresponding to the trace of the matrix in its
matrix representation,
Tr Â = Σ_i ⟨i|Â|i⟩ = Σ_i A_ii. (1.67)

We can also define tensor products of vector spaces, and operator actions on them, in a way also
obvious from the behavior of matrices.
Consider the vector spaces V and V′. Then define the tensor product V ⊗ V′ as follows. If |i⟩ ∈ V
is a basis for V and |a⟩ ∈ V′ is a basis for V′, we denote the basis for V ⊗ V′ as the tensor product of
the basis elements, i.e.,
|ia⟩ ≡ |i⟩ ⊗ |a⟩, (1.68)
and any vector |v⟩ ∈ V ⊗ V′ is expandable in this basis,
|v⟩ = Σ_{i,a} v_ia |ia⟩. (1.69)
Then operators on the tensor product space are also tensor products Â = Â_V ⊗ Â_V′, acting as tensor
products of the corresponding matrices in the basis, i.e., acting independently in each space,
Â|v⟩ = (Â_V ⊗ Â_V′) Σ_{i,a} v_ia |i⟩ ⊗ |a⟩ = Σ_{i,a} v_ia (Â_V |i⟩) ⊗ (Â_V′ |a⟩). (1.70)
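In numpy, the tensor product of operators in a basis is the Kronecker product, `np.kron`. The following sketch (ours, not from the book) checks the defining property in (1.70), that each factor acts in its own space:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))   # acts on V
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))   # acts on V'

AB = np.kron(A, B)                 # A tensor B, acting on the 6-dim product space
assert AB.shape == (6, 6)

# (A tensor B)(|v> tensor |w>) = (A|v>) tensor (B|w>)
v = rng.normal(size=2)
w = rng.normal(size=3)
assert np.allclose(AB @ np.kron(v, w), np.kron(A @ v, B @ w))
```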

1.6 Hilbert Spaces

Finally, we come to the definition of Hilbert spaces, which are the spaces needed for states in quantum
mechanics. We will forgo a more precise and rigorous definition until the next chapter, where we will
also generalize Hilbert spaces to infinite-dimensional spaces.
Briefly, Hilbert spaces are vector spaces, with a notion of a scalar product and the norm derived
from it, which contain only proper vectors, which can be normalized to unity, v = 1, and for which
we have a notion of continuity and completeness of the space with respect to a metric.
Physically, as mentioned above the Hilbert space is the state space for quantum mechanics. In
the finite-dimensional case, a Hilbert space can be thought of as a regular topological vector space,
associated with the metric distance defined from the scalar product (and its norm). In the infinite-
dimensional case more care is needed, since the space of functions will be included in it, so in the
next chapter we will define the Hilbert space more carefully.
Note that for vectors |v⟩ in a Hilbert space, if they are normalizable, which means we can write
⟨v|v⟩ = 1, and expand the vectors in an orthonormal basis |i⟩, ⟨j|i⟩ = δ_ij, we have
|v⟩ = Σ_i v_i |i⟩ = Σ_i ⟨i|v⟩|i⟩, (1.71)
so that
1 = ⟨v|v⟩ = Σ_{i,j} v_j* v_i ⟨j|i⟩ = Σ_i |v_i|² = Σ_i |⟨i|v⟩|². (1.72)
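The sum rule (1.72) holds in any orthonormal basis. A small numpy sketch (ours, not from the book): normalize a random complex vector, project it onto a random orthonormal basis, and check that the squared components sum to one.

```python
import numpy as np

rng = np.random.default_rng(4)
v = rng.normal(size=5) + 1j * rng.normal(size=5)
v = v / np.linalg.norm(v)                 # normalize: <v|v> = 1

# Random orthonormal basis: columns of Q from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5)))
components = Q.conj().T @ v               # v_i = <i|v>
assert np.isclose(np.sum(np.abs(components) ** 2), 1.0)   # sum_i |<i|v>|^2 = 1
```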

Important Concepts to Remember

• Linear vector spaces are a collection of elements together with the addition of vectors and
multiplication by scalars, and some axioms.
• Vector spaces have bases. A basis contains a maximal number of linearly independent vectors, such
that any element in the space can be decomposed into it.
• A subspace is a subset of elements of the space, such that the addition and multiplication by scalars
defined for the space apply also to its subspaces.
• The scalar product associates a complex scalar with two vectors. In Dirac’s bra-ket notation, the
scalar product is v|w.
• The norm is the square root of the scalar product of a vector with itself.
• An orthonormal basis is a basis of vectors of unit norm that are orthogonal to each other (the scalar
product of two different basis vectors vanishes), i.e., ⟨v_i|v_j⟩ = δ_ij.
• An operator Â on a vector space associates any vector with another vector. A linear operator respects
linearity.
• Representing vectors as column vectors whose elements are the coefficients of the expansion into
a basis, an operator is represented as a matrix acting on these column vectors.
• When changing the basis between two orthonormal bases, the corresponding operator Û is unitary,
Û ∈ U(n), and the operators are changed as Â → ÛÂÛ⁻¹.
• Hermitian operators, Â† = Â, admit an eigenvalue–eigenstate problem, Â|v⟩ = λ|v⟩, with real
eigenvalues λ ∈ R.
• Traces of operators are associated with the traces of their associated matrices, while tensor products
of vector spaces mean that for any element of one space, we associate with it a full copy of the
other vector space.
• Hilbert spaces are vector spaces with scalar products and norms and vectors that can be normalized,
and for which we have a notion of continuity and completeness of the space with respect to a metric.
Physically, they are the state space for quantum mechanics.

Further Reading
See, for instance, Shankar’s book [3] for more details on vector spaces, from the point of view of a
physicist.

Exercises

(1) Write formally for a general vector space the triangle inequality for three vectors, generalized
from vectors in a three-dimensional space, with c = a + b.
(2) Considering three linearly independent vectors in a three-dimensional Euclidean space, a , b, c,
construct an orthonormal basis out of them.
(3) Using the bra-ket Dirac notation for operators, show that we can put the product Â · B̂ into the
same form and that the product is associated with the matrix product of the associated matrices.

(4) Show that the trace of a product of Hermitian operators is real.


(5) What is the trace of a tensor product of two operators?
(6) Show that, modulo some discrete symmetries, U (n) can be split up into SU (n) times a group
U (1) of complex phases eiα (and show that these phases do form a group).
(7) Find the two eigenvalues of an operator associated with a 2 × 2 matrix with arbitrary elements.
2 The Mathematics of Quantum Mechanics 2: Infinite-Dimensional Hilbert Spaces

In this chapter we move to the more complicated case of an infinite-dimensional Hilbert space, in
particular the vector space of functions.
Before that, however, we will take a second, more rigorous, look at the definition of a Hilbert space
and various related definitions.

2.1 Hilbert Spaces and Related Notions

We define a Hilbert space as a complex vector space with a scalar product, with which we associate
a distance function (“metric”) for any two vectors x and y in the space,

d(x, y) = ‖x − y‖ = √⟨x − y|x − y⟩, (2.1)

and which is a complete metric space with respect to this distance function.
A metric space, sometimes also called a pre-Hilbert space, is a vector space with a metric or
distance d(x, y) defined on it. The distance d(x, y) must obey the triangle inequality:

d(x, z) ≤ d(x, y) + d(y, z). (2.2)

In the case (as for a Hilbert space) where this comes from a scalar product, the triangle inequality
follows from the Cauchy–Schwarz inequality

|⟨x|y⟩| ≤ ‖x‖ ‖y‖. (2.3)
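Both the Cauchy–Schwarz inequality and the triangle inequality it implies can be checked numerically. A minimal numpy sketch (ours, not from the book), on random complex vectors:

```python
import numpy as np

rng = np.random.default_rng(5)
x, y, z = (rng.normal(size=4) + 1j * rng.normal(size=4) for _ in range(3))

# Cauchy-Schwarz: |<x|y>| <= ||x|| ||y||  (small tolerance for float roundoff)
assert abs(np.vdot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12

# Triangle inequality for the induced metric d(x, y) = ||x - y||
d = lambda a, b: np.linalg.norm(a - b)
assert d(x, z) <= d(x, y) + d(y, z) + 1e-12
```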

A metric space M is called complete (or a “Cauchy space”) if a so-called Cauchy sequence of
points in M, that is, a sequence vn of points that become arbitrarily close to each other (in the sense
of the metric or distance) as n → ∞, has a limit, as n → ∞, that is also in M. A counterexample to
this, i.e., an incomplete metric space, is the space of rational numbers Q, since there are irrationals
such as √2 that are the limit of rational numbers (in a Cauchy sequence), yet they are not part of the
space itself.
A metric space (pre-Hilbert) that is complete is called Hilbert.
Related to the Hilbert space is the notion of a Banach space, which means a vector space with an
associated notion of a norm (one that need not come from a scalar product) that is also complete.
That means that any Hilbert space is Banach, but the converse is not true: there are Banach spaces
that are not Hilbert.
Any Hilbert space is also a topological vector space, which means a vector space that is also
a topological space. A topological space is a space admitting a notion of continuity and having a
uniform topological structure, i.e., a notion of convergence (for the definition of Cauchy sequences).
Thus any Banach space is also a topological vector space.

2.2 Functions as Limits of Discrete Sets of Vectors

We are interested in the relevant and nontrivial case of infinite-dimensional Hilbert spaces. Indeed,
quantum mechanical systems are usually of this type, so we need to generalize to this case. Moreover,
in the quantum mechanical case we have functions ψ(x) as vectors, and operators acting on them, so
we need to generalize further to the case of infinite-dimensional vectors. Thus we need to find a way
to think of ψ(x) as a vector space the same as we do for the x’s themselves, and to define operators
acting on them, Âψ(x). That is, we need to replace a set of vectors |f_k⟩ with a function f(x).
A way to do that is to understand f(x) as a limit of a function defined on discrete points, f(x_k),
though the details are a bit different. The first step is to define a basis |x_i⟩ that is orthonormal,

⟨x_i|x_j⟩ = δ_ij, (2.4)

and complete in the space of n vectors,
Σ_{i=1}^n |x_i⟩⟨x_i| = 1, (2.5)

where 1 is the unit operator, which means that any vector |f⟩ can be expanded in this basis as
|f⟩ = Σ_{i=1}^n f(x_i) |x_i⟩. (2.6)

If we understand the components f(x_i) as, by definition,
f(x_i) ≡ ⟨x_i|f⟩, (2.7)
then, in view of the completeness relation, we obtain an identity,
|f⟩ = Σ_{i=1}^n |x_i⟩⟨x_i|f⟩. (2.8)

Note that here we define |x_i⟩ as the column vector with a 1 only in the ith position and zeroes in the
other positions:
|x_i⟩ = (0, . . . , 0, 1, 0, . . . , 0)ᵀ, with the 1 in the ith entry. (2.9)
The decomposition of |f⟩ in |x_i⟩ and the fact that f(x_i) = ⟨x_i|f⟩ suggest that the scalar product
can be understood in a similar way, as
⟨f|g⟩ = Σ_{i=1}^n ⟨f|x_i⟩⟨x_i|g⟩ = Σ_{i=1}^n f*(x_i)g(x_i), (2.10)

since
⟨f|x_i⟩ = (⟨x_i|f⟩)* = f*(x_i). (2.11)
Then we also obtain that the norm of a vector |f⟩ is
‖f‖² = ⟨f|f⟩ = Σ_{i=1}^n |f(x_i)|². (2.12)

2.3 Integrals as Limits of Sums

In both the scalar product and the norm defined above we saw sums appearing, which we need to
generalize to the continuum case. It seems obvious that sums should generalize to integrals. Indeed,
the Riemann integral is defined as the limit of a Riemann sum.
A very small (infinitesimal) constant step of integration Δx_i = ε → 0 becomes the differential dx,
where ε = (b − a)/n, n is the number of points and a, b the limits of integration. Then the sum tends
to an integral in the limit n → ∞,
ε Σ_{i=1}^n f(x_i) → ∫_a^b dx f(x). (2.13)
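The convergence of the Riemann sum to the integral is easy to see numerically. A minimal sketch (ours, not from the book), for f(x) = x² on [0, 1], whose exact integral is 1/3:

```python
import numpy as np

def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum eps * sum_i f(x_i), eps = (b - a)/n."""
    eps = (b - a) / n
    x = a + eps * np.arange(n)          # grid points x_i
    return eps * np.sum(f(x))

approx = riemann_sum(lambda x: x ** 2, 0.0, 1.0, 100000)
assert abs(approx - 1.0 / 3.0) < 1e-4   # tends to the exact integral as n grows
```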

However, it turns out that a better limit of the sum is the more general Lebesgue integral, which
allows for integration in the presence of distributions such as Dirac’s delta function, treated next. The
limiting Lebesgue integral is
  ∞
f dμ = f ∗ (t)dt, (2.14)
0

where
f ∗ (t) = μ({x}| f (x) > t) (2.15)
is the measure (the values of x for which f (x) is greater than a given t). With respect to the Lebesgue
integral, we can define L 2 (X, μ), the space of square integrable functions over the domain, which is
a space X, with measure μ. We say that f belongs to L 2 (X, μ) if

| f | 2 dμ < ∞. (2.16)
X

This means that the scalar product can be generalized from a sum to a Riemann integral, as
⟨f|g⟩ = Σ_{i=1}^n f*(x_i)g(x_i) → ∫_a^b f*(x)g(x)dx, (2.17)
or more generally to a Lebesgue integral, as
⟨f|g⟩ = ∫ f*(t)g(t)w(t)dt, (2.18)
where the Lebesgue measure is
μ(A) = ∫_A w(t)dt. (2.19)

Thus the norm of a function can be written as a Riemann integral or as a Lebesgue integral,
‖f‖² = ∫_a^b |f(x)|² dx → ∫_A |f|² dμ. (2.20)

2.4 Distributions and the Delta Function

Since the so-called Dirac delta function, which famously is not a function, but rather a distribution,
will also make an appearance in infinite-dimensional Hilbert spaces, we need to define the theory of
distributions, and specialize to the delta function.
In short, a distribution T is a linear and continuous application (not a function), defined on the
space of test functions of compact support, and with complex (or real) values; thus T : D → C.
The support means the effective domain of a function, which is smaller than the full domain, and it
is supposed that the function vanishes outside the support. The space of test functions for distributions
is then defined as the set of infinitely differentiable functions with compact support,

D ≡ {φ : Rⁿ → C | supp(φ) compact, φ ∈ C^∞(Rⁿ)}. (2.21)

This space is a vector space, with convergence structure (so that it is a topological space) φ n → φ
in D.
Distributions are in some sense generalizations of functions that allow us to make sense of
quantities like the Dirac delta function. Indeed, for any locally integrable function, f(x) ∈ L¹_loc(A),
we can associate a distribution f̃ with it, f̃ : D → C, such that, for any test function φ,
⟨f̃|φ⟩ = ∫_A f*(x)φ(x)dx. (2.22)

The space of distributions D∗ is called the dual of the space of test functions D,

D∗ = {T : D → C|T a distribution}. (2.23)

It is also a vector space, and has a convergence structure (so it is a topological space): if the sequence
{T_n} ⊂ D*, then T_n → T (weak convergence) ⇔ for any φ ∈ D we have ⟨T_n|φ⟩ → ⟨T|φ⟩.
On this space, we can define multiplication of the distribution T ∈ D∗ by any infinitely differen-
tiable function a(x) ∈ C ∞ (Rn ), by

⟨aT|φ⟩ ≡ ⟨T|a*(x)φ(x)⟩. (2.24)

If there exists a function f such that a distribution has a function associated with it, T = f˜, we call
the distribution T a regular distribution. If there is no such function, we call the distribution singular;
for example, the Dirac delta function is a singular distribution:

δ_{x₀} : D → C, ⟨δ_{x₀}|φ⟩ = φ(x₀). (2.25)

On the other hand the Heaviside (step) distribution is regular. It is defined by
⟨H_{x₀}|φ⟩ = ∫_{x₀}^∞ φ(x)dx. (2.26)

Then the Heaviside function associated with it is defined as
H_{x₀}(x) = { 1, x > x₀; 0, x ≤ x₀ }. (2.27)
The Heaviside function is discontinuous at x₀, and hence not differentiable there as a function. It is
only differentiable as a distribution.
We can define the derivative of a distribution T ∈ D* by a bracket with a test function φ ∈ D, as
⟨∂T/∂x_j | φ⟩ ≡ −⟨T | ∂φ/∂x_j⟩. (2.28)
It is a good definition, since it is self-consistent for regular distributions (those associated with
functions). For T = f̃, the relation is written as
⟨∂f̃/∂x_j | φ⟩ ≡ −⟨f̃ | ∂φ/∂x_j⟩. (2.29)

But the right-hand side can be written as follows:
−∫_{Rⁿ} f*(x) (∂φ/∂x_j) dx = ∫_{Rⁿ} [−∂/∂x_j (f* · φ(x)) + (∂f*/∂x_j) φ(x)] dx = ⟨∂f̃/∂x_j | φ⟩, (2.30)
where in the first equality we have used partial integration, and in the second we used the fact that
the test functions have compact support, so the boundary integral on Rⁿ vanishes and the remaining
term is the definition of the scalar product.
Furthermore, as a distribution, we have that the delta function is the derivative of the Heaviside
function,
H′_{x₀} = δ_{x₀}. (2.31)

To prove this, note that
⟨H′_{x₀}|φ⟩ = −⟨H_{x₀}|φ′⟩ = −∫_{x₀}^∞ φ′(x)dx = φ(x₀) − φ(∞) = φ(x₀) = ⟨δ_{x₀}|φ⟩, (2.32)
where we have used the fact that φ(∞) = 0 because of the compact support of φ.
We can also define a notion of the support of a distribution T ∈ D*,
supp(T) ≡ {x ∈ Rⁿ | T ≠ 0 at x}. (2.33)
Then the support of the delta function is a single point,
supp(δ_{x₀}) = {x₀}. (2.34)

Moreover, there is a theorem that, for any function f ∈ L¹_loc(A), we have supp(f̃) ⊂ supp(f), and
for any continuous function f, supp(f̃) = supp(f).

2.5 Spaces of Functions

We can now go back to defining the notion of a space of functions. As we saw, the scalar product of
two functions f, g on an interval (a, b) is defined as
⟨f|g⟩ = ∫_a^b f*(x)g(x)dx. (2.35)
We can consider the case when one of the function vectors is replaced by the |x⟩ vector itself.
It follows that we need to define the product of two such vectors, giving something like δ_{x,y}. Of
course, now we know that the correct thing to do is to use the delta function, or rather, distribution,
δ(x − y), so
⟨x|y⟩ = δ(x − y), (2.36)
which vanishes for x ≠ y and gives 1 on integration,
∫_a^b δ(x − y)dy = 1, a < x < b. (2.37)

In fact, we can restrict the domain of integration to an infinitesimal region around the zero of the
argument of δ, and write
∫_a^b δ(x − y)f(y)dy = f(x) = ∫_{x−ε}^{x+ε} δ(x − y)f(y)dy. (2.38)

On the other hand, the completeness relation for |x⟩, defined as a (Riemann) sum, is now generalized
to an integral:
Σ_{i=1}^n |x_i⟩⟨x_i| = 1 → ∫_a^b dx |x⟩⟨x| = 1. (2.39)

Then we can also generalize ⟨x_i|f⟩ ≡ f(x_i) to a continuous function ⟨x|f⟩ = f(x), since
f(x) = ⟨x|f⟩ = ⟨x|1|f⟩ = ∫_a^b dx′ ⟨x|x′⟩⟨x′|f⟩ = ∫_a^b dx′ f(x′)⟨x|x′⟩ (2.40)
only makes sense if ⟨x|x′⟩ = δ(x − x′).


Note that the delta function is not a function (it is a distribution), but can be obtained as the limit
when Δ → 0 of a Gaussian,
f_Δ(x − y) = (1/√(πΔ²)) e^{−(x−y)²/Δ²}. (2.41)
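The limit can be checked numerically in the distributional sense: integrating f_Δ against a smooth test function φ(y) reproduces φ(x) as Δ shrinks. A minimal numpy sketch (ours, not from the book):

```python
import numpy as np

def f_delta(u, Delta):
    """Normalized Gaussian approximating delta(u) as Delta -> 0."""
    return np.exp(-u ** 2 / Delta ** 2) / np.sqrt(np.pi * Delta ** 2)

y = np.linspace(-10.0, 10.0, 200001)
dy = y[1] - y[0]
phi = np.cos(y)                        # a smooth test function
x0 = 0.7

# Riemann-sum approximation of the integral of f_Delta(x0 - y) * phi(y)
val = np.sum(f_delta(x0 - y, 0.01) * phi) * dy
assert abs(val - np.cos(x0)) < 1e-4    # -> phi(x0) in the limit Delta -> 0
```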
The delta function δ(x − y) is even, owing to its definition as a distribution, and since it is the limit
of f_Δ, and also because
δ(x − y) = ⟨x|y⟩ = (⟨y|x⟩)* = δ*(y − x) = δ(y − x). (2.42)
From the fact that δ(x − y) is a distribution, its derivative with respect to its argument,
δ′(x − y) = (d/dx) δ(x − y) = −(d/dy) δ(x − y), (2.43)
acts in reality as
δ(x − y) (d/dy) (2.44)
(the proof is by partial integration, as we saw from the general derivative of a distribution). Note that
here we have assumed that the space of integration for the test functions is in y, f(y)dy, which is
the opposite convention to the usual one, but it makes sense if we want to think of the integration as
a generalization of summation coming from the matrix product.
This fact in particular means that δ′(x − y) is a kind of matrix in (x, y) space (since it has indices).

2.6 Operators in Infinite Dimensions

We saw that the space of functions |f⟩ such that ⟨x|f⟩ = f(x) is a Hilbert space, which means from
the general theory that we can define operators Â on the functions, acting as Â|f⟩.
Among such operators we have the trivial ones, corresponding to the usual multiplication by a
number. One nontrivial case is the differentiation operator, which can be thought of as a matrix too. It
is defined as
d/dx : f(x) → f′(x) = (d/dx) f(x), (2.45)
but we should extend this definition to the abstract space |f⟩ (without the basis |x⟩) as D, where
D|f⟩ ≡ |df/dx⟩. (2.46)
Then we can also multiply by ⟨x|, in order to obtain a matrix element,
⟨x|D|f⟩ = ⟨x|df/dx⟩ = (df/dx)(x). (2.47)
On the other hand, introducing the completeness relation inside this expression, we obtain
(df/dx)(x) = ∫ dx′ ⟨x|D|x′⟩⟨x′|f⟩ ≡ ∫ dx′ D_{xx′} f(x′), (2.48)
which means that the matrix element of D is (note the integration over x′ in the delta function)
D_{xx′} = δ′(x − x′) = δ(x − x′) (d/dx′). (2.49)
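On a discrete grid, the "matrix in (x, x′) space" picture of D becomes literal: a finite-difference matrix acting on the column vector of function values. A minimal numpy sketch (ours, not from the book), using a periodic central difference as the discretization:

```python
import numpy as np

# Discretized derivative operator D_{xx'} as an explicit matrix on a periodic grid.
n, a, b = 200, 0.0, 2 * np.pi
x = np.linspace(a, b, n, endpoint=False)
h = x[1] - x[0]

D = np.zeros((n, n))
for i in range(n):
    D[i, (i + 1) % n] = 1 / (2 * h)     # central difference, periodic wrap
    D[i, (i - 1) % n] = -1 / (2 * h)

f = np.sin(x)
# Matrix-vector product approximates the derivative: (D f)(x) ~ f'(x) = cos(x)
assert np.allclose(D @ f, np.cos(x), atol=1e-3)
```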

2.7 Hermitian Operators and Eigenvalue Problems

Consider a linear operator Â : H → H on a Hilbert space H. Then its adjoint Â† is defined as before
(in the finite-dimensional case), by
⟨Â†f|g⟩ ≡ ⟨f|Âg⟩. (2.50)
Then a Hermitian (or self-adjoint) operator is one for which
Â = Â†. (2.51)

One thing we have to be careful about is that in principle f and g live in different Hilbert spaces
(the vector space of functions and its dual), but Â = Â† is only meaningful if the domain (more
precisely, the support) of the two operators is the same. This is a subtle issue, since there are operators
that are formally self-adjoint, but act a priori on different spaces, and requiring that the spaces are
identical imposes constraints.
We have several important theorems about Hermitian operators:

Theorem For Â : H → H Hermitian, i.e., ⟨Âf|g⟩ = ⟨f|Âg⟩ for all f, g ∈ H, it follows that Â is
bounded.

Theorem For Â : H → H (Â ∈ L(H)), there is a unique adjoint Â† ∈ L(H), and it has the same
norm, ‖Â†‖ = ‖Â‖.

Theorem For Â ∈ L(H) Hermitian, Â† = Â, the diagonal elements are real, ⟨Âf|f⟩ ∈ R.

Theorem For a Hermitian operator Â ∈ L(H), Â† = Â, eigenvectors corresponding to different
eigenvalues are orthogonal. This leads to the same Gram–Schmidt orthonormalization procedure as
that considered in the finite-dimensional case.
We next consider the eigenvalue problem (or spectral problem) Âf = λf. It follows that
(Â − λ1)f = 0. (2.52)

Then we have another theorem:

Theorem (Hilbert–Schmidt) For a Hermitian operator Â = Â†, with a set of eigenvalues λ_i and
eigenvectors f_i, {λ_i, f_i}_{i∈N}, the set of eigenvectors forms a complete basis, so
Â|f⟩ = Σ_{i∈N} λ_i ⟨f_i|f⟩ |f_i⟩. (2.53)

Theorem (Fredholm) For an operator Â : H → H, consider the inhomogeneous and homogeneous
eigenvalue equations,
(1): (Â − λ1)x = z,
(2): (Â − λ1)x = 0. (2.54)
Then we have two statements:
(a) (1) admits a unique solution ⇔ λ is not an eigenvalue.
(b) If λ is an eigenvalue, then (1) admits a solution ⇔ (2) admits a nontrivial solution, and z ∈ H⊥
(the transverse subspace).

2.8 The Operator D_{xx′}

We can ask, is the operator D_{xx′} from (2.49) Hermitian? No, but from it we can make a potentially
Hermitian operator, −iD_{xx′}, since Hermiticity requires
−iD_{xx′} = (−iD_{x′x})* = +iD_{x′x}, (2.55)



which can be rewritten as
−iδ′(x − x′) = +iδ′(x′ − x), (2.56)
which indeed holds, since δ′ is odd in its argument.

But if the domain of definition of the functions is actually finite (as opposed to just having a
compact support), there can be an issue with boundary terms. Indeed, for a Hermitian operator Â, we
have the equality
⟨g|Â|f⟩ = ⟨g|Âf⟩ = ⟨Âg|f⟩ = ⟨f|Âg⟩* = ⟨f|Â|g⟩*, (2.57)
and by introducing two completeness relations we can write the relations in terms of the x, x′ matrix
components of the operators:
∫_a^b dx ∫_a^b dx′ ⟨g|x⟩⟨x|Â|x′⟩⟨x′|f⟩ = ( ∫_a^b dx ∫_a^b dx′ ⟨f|x⟩⟨x|Â|x′⟩⟨x′|g⟩ )*. (2.58)

We want to see whether this can be satisfied by A_{xx′} = −iD_{xx′}. The left-hand side becomes
∫_a^b dx ∫_a^b dx′ g*(x) [−iδ(x − x′) d/dx′] f(x′) = ∫_a^b dx g*(x) (−i d/dx) f(x), (2.59)
whereas the right-hand side gives, by partial integration,
( ∫_a^b dx ∫_a^b dx′ f*(x) [−iδ(x − x′) d/dx′] g(x′) )* = −i ∫_a^b dx g*(x) (d/dx) f(x) + i g*(x)f(x)|_a^b. (2.60)
That means that we get equality only if the boundary term
+ i g*(x)f(x)|_a^b (2.61)
vanishes.
The eigenvalue problem for Â = −iD, defined as
Â|k⟩ = k|k⟩, (2.62)
gives
k⟨x|k⟩ = ⟨x|Â|k⟩ = ∫ dx′ ⟨x|Â|x′⟩⟨x′|k⟩, (2.63)
which in turn becomes (since A_{xx′} = −iD_{xx′} = −iδ(x − x′) d/dx′ and can be integrated, and defining
⟨x|k⟩ ≡ ψ_k(x))
−i (d/dx) ψ_k(x) = kψ_k(x), (2.64)
which is solved by
ψ_k(x) = Ae^{ikx} = (1/√(2π)) e^{ikx}. (2.65)
The normalization constant A has been chosen so that ⟨k|k′⟩ = δ(k − k′). Note then that in this
new |k⟩ basis we have that
−iD_{kk′} = ⟨k| − iD|k′⟩ = k′⟨k|k′⟩ = k′δ(k − k′) (2.66)
is diagonal.
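The plane-wave eigenfunctions can be checked on a periodic grid. The following numpy sketch (ours, not from the book) uses a central-difference derivative; for e^{ikx} this is exact up to the replacement k → sin(kh)/h, which tends to k as the grid spacing h shrinks.

```python
import numpy as np

# psi_k(x) = e^{ikx}/sqrt(2*pi) as an eigenfunction of -i d/dx on a periodic box.
n = 400
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
h = x[1] - x[0]
k = 3                                   # integer k fits the periodic box [0, 2*pi)

psi = np.exp(1j * k * x) / np.sqrt(2 * np.pi)
dpsi = (np.roll(psi, -1) - np.roll(psi, 1)) / (2 * h)   # central difference

# -i psi' = k_eff psi, with the discrete eigenvalue k_eff = sin(k h)/h
assert np.allclose(-1j * dpsi, (np.sin(k * h) / h) * psi)
assert abs(np.sin(k * h) / h - k) < 1e-2                # k_eff -> k as h -> 0
```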

Important Concepts to Remember

• The distance on a metric or pre-Hilbert space is d(x, y) = ‖x − y‖, and a Hilbert space is a metric
space that is also complete, meaning that the limit of a Cauchy sequence in the space also belongs
to the space, and the norm comes from a scalar product.
• A Banach space is like a Hilbert space, except that the norm does not come from a scalar product.
• For quantum mechanics, we need infinite-dimensional Hilbert spaces in which the vectors are
functions. To discretize, f(x_i) = ⟨x_i|f⟩ but, in the continuous case, |x_i⟩ is replaced by the
continuous variable |x⟩.
• The scalar product on the Hilbert space of functions is ⟨f|g⟩ = ∫_a^b f*(x)g(x)dx, but the complete-
ness relation for the |x⟩ basis vectors is ∫_a^b dx|x⟩⟨x| = 1, reducing the previous definition of a
scalar product to a trivial identity.
• Distributions are linear and continuous applications that act on the space of some test functions
and give a complex number result. For instance, the distribution T = f̃ is defined such that
⟨f̃|φ⟩ = ∫_A f*(x)φ(x)dx.
• The "delta function" distribution δ_{x₀} is defined by ⟨δ_{x₀}|φ⟩ = φ(x₀), or, more commonly, by
φ(x₀) = ∫ δ(x − x₀)φ(x)dx.
• The derivative of a distribution is defined by ⟨∂T/∂x_j |φ⟩ = −⟨T|∂φ/∂x_j⟩.
• The Heaviside function is a function, but it can also be associated with a distribution. As a
distribution, H′_{x₀} = δ_{x₀}.
• The orthonormality relation of the |x⟩ vectors is ⟨x|y⟩ = δ(x − y).
• Nontrivial operators on the Hilbert space of functions are those that involve derivatives, in
particular the operator D, with D|f⟩ = |df/dx⟩ and with matrix element D_{xx′} = δ′(x − x′) =
δ(x − x′) d/dx′.
• For Hermitian (or self-adjoint) operators Â† = Â there are many solutions to the eigenvalue–
eigenstate problem (or spectral problem) (Â − λ1)f = 0. In fact, the set of (linearly independent)
eigenvectors forms a complete basis for the Hilbert space.
• The operator −iD_{xx′}, thought of as a distribution or as an operator on the space of functions, is
Hermitian, but only if the Hilbert space (the "space of test functions for the distribution") involves
functions with compact support, or otherwise if the boundary terms match for the matrix element
with two functions.

Further Reading
See, for instance, Shankar’s book [3] for more details about spaces of functions, from the point of
view of a physicist.

Exercises

(1) Discretize the product of two functions, as compared to discretizing each function indepen-
dently, and describe what that means in the language of kets.
(2) For a tensor product of kets, describe what the norm is in the abstract sense, and then in the
function form (with integrals).

(3) Is the square of the “delta function” a distribution? If so, prove it using Dirac’s bra-ket notation.
(4) Show how δ  (x − y) (the second derivative with respect to x) acts as a distribution on functions.
(5) Is a real function of a Hermitian operator Â, f ( Â), also Hermitian? Give examples.
(6) Consider the unitary operator e^{iÂ}, with Â Hermitian and acting on function space. Can it be
diagonalized? If so, write an expression for its diagonal elements.
(7) Diagonalize e^{σ₁}, where σ₁ = ((0, 1), (1, 0)) is the first Pauli matrix.
3 The Postulates of Quantum Mechanics and the Schrödinger Equation

After developing a somewhat lengthy mathematical background, we are now ready to define
quantum mechanics from some first principles, or postulates. The word “postulates” implies some
mathematical-type axiomatic system, but the situation is not quite so clear cut. There isn’t a
mathematically rigorous collection of assumptions, only a set of rules that can be presented in many
ways. The number, order, and content of the postulates varies, though the total physical content of
the system is the same. Here I will present my own viewpoint on the postulates. We will start with
the postulates themselves, and then we will explain them and add comments.
The crucial difference from classical mechanics is that we no longer have classical paths
defined by a Hamiltonian H(p_i, q_i) and initial conditions for the phase-space variables (p_i, q_i) (giving
a “state” at time t). Instead of that, we have probabilities for everything we can observe, including
analogs of classical variables like (pi , qi ); but there are also new variables that have no classical
counterpart. We thus have to define states, observables, probabilities for them, time evolution, and
the postulates dealing with these concepts.

3.1 The Postulates

First postulate, P1
At every time t, the state of a physical system is defined by a ket |ψ in a Hilbert space.

Second postulate, P2
For every observable described classically by a quantity A, there is a Hermitian operator  acting on
the physical states |ψ of the system. The fact that the operator is Hermitian means that its eigenvalues
are real, which as we will shortly see means that the observables are real, as they should be.

Third postulate, P3
The only possible result of a measurement of an observable is an eigenvalue λ of the operator
corresponding to it:

Â|ψ_λ⟩ = λ|ψ_λ⟩. (3.1)

Note that if the spectrum of the operator  is discrete, there is a discrete number of values
λ n , explaining the “quantum” in “quantum mechanics”. Since the operators  corresponding to


observables are Hermitian, it follows that the Hilbert space has a basis |n⟩ (corresponding to λ_n)
formed by eigenvectors of the operator Â, so every state can be expanded in it,
|ψ⟩ = Σ_n c_n |n⟩. (3.2)

Fourth postulate, P4
The probability of obtaining an eigenvalue λ_n corresponding to the eigenstate |v_n⟩ in a measurement
of the observable Â in a normalized state |ψ⟩ is
P_n = |⟨v_n|ψ⟩|², (3.3)
where, besides |ψ⟩, the |v_n⟩ are also normalized states. Here we have to assume that the sum of all
probabilities is one, Σ_n P_n = 1.
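The Born-rule probabilities of (3.3) are straightforward to compute in a finite-dimensional example. A minimal numpy sketch (ours, not from the book): diagonalize a Hermitian "observable", project a normalized state on its eigenbasis, and check that the probabilities sum to one.

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                 # a Hermitian observable

psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = psi / np.linalg.norm(psi)          # normalized state |psi>

lam, V = np.linalg.eigh(A)               # columns of V are the eigenstates |v_n>
P = np.abs(V.conj().T @ psi) ** 2        # P_n = |<v_n|psi>|^2
assert np.isclose(P.sum(), 1.0)          # sum_n P_n = 1

# Consistency check: sum_n lam_n P_n equals the expectation value <psi|A|psi>.
assert np.isclose(np.sum(lam * P), np.vdot(psi, A @ psi).real)
```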

Fifth postulate, P5
After a measurement of a variable corresponding to an operator Â, giving as a result the eigenvalue
λ_n corresponding to the eigenvector |v_n⟩, the state of the system has changed from the original |ψ⟩
to |v_n⟩.
This is the strangest of the postulates, which clashes most with our intuition since it implies a
discontinuous, nonlinear, change. But it is experimentally found to be true. If we re-measure the
same quantity right after the first measurement, we always find the same result λ n .

Sixth postulate, P6
The time evolution of a state is given by

|ψ(t)⟩ = Û(t, t_0)|ψ(t_0)⟩, (3.4)

where the operator Û is unitary, Û †Û = 1̂, in order to preserve norms, and thus the total probability
(which equals 1), i.e.,

⟨ψ(t)|ψ(t)⟩ = ⟨ψ(t_0)|ψ(t_0)⟩. (3.5)

This evolution operator (also known as a “propagator”) is found by imposing the requirement that
the time dependence of the state satisfies the Schrödinger equation,

iℏ (d/dt)|ψ(t)⟩ = Ĥ|ψ(t)⟩. (3.6)
Note that here we have assumed that states change with time but operators corresponding to
observables do not, which is known as the “Schrödinger picture”, but there are other pictures, as
we will see later on in the book.
We next start to comment on and expand on the various postulates.

3.2 The First Postulate

As we have just seen, the state of a system is defined at a given time (in the Schrödinger picture),
but it depends on time, with the time variation given by the Schrödinger equation. We described
abstract states |ψ⟩, but for states described by quantities that depend on spatial coordinates, such
as, for instance, states corresponding to single particles with a classical position, we can define the
function
⟨x|ψ⟩ = ψ(x), (3.7)
known as the wave function. Since |x⟩ is an eigenvector of the position operator X̂, from postulate 4
it follows that if we measure the position through X̂ then |v_n⟩ → |x⟩, so the probability of obtaining
the value x is
|⟨v_n|ψ⟩|² → |⟨x|ψ⟩|² = |ψ(x)|². (3.8)
Thus it is this function that defines position probabilities through its modulus squared, justifying the
name wave function.

3.3 The Second Postulate

One observation to make is that there is some ambiguity about the operator  corresponding to a
classical mechanics quantity A: there is an issue concerning the order of the operators (such as X̂, P̂)
involved in the expression for a composite operator. For instance, consider a Hamiltonian H (p, q)
that classically contains a term pq2 . Quantum mechanically, should we replace that with P̂Q̂2 , Q̂ P̂Q̂,
Q̂2 P̂, or some linear combination of the three possibilities?
The other observation is that for a Hermitian operator Â, the eigenvalues λ_n in Â|v_n⟩ = λ_n|v_n⟩
are real, and the |v_n⟩ eigenstates form a complete basis for the Hilbert space, so any state |ψ⟩ can be
expanded in them:

|ψ⟩ = Σ_n c_n |v_n⟩. (3.9)

It could be, however, (this is more generic) that there are other observables that can also be
measured, which could mean that there are several states of given eigenvalue λ_n for Â, which
are distinguished by some other index b (corresponding perhaps to eigenvalues of some other
observables), i.e.,
Â|v_n, b⟩ = λ_n |v_n, b⟩. (3.10)
But we can also have a linear combination of the states with index b, so

Â (Σ_b c_b |v_n, b⟩) = λ_n (Σ_b c_b |v_n, b⟩). (3.11)

Then an arbitrary state can still be expanded in this set, but now we write

|ψ⟩ = Σ_{n,b} c_{n,b} |v_n, b⟩. (3.12)

3.4 The Third Postulate

As we saw, we measure only eigenvalues of operators, which can be discrete, i.e., indexed by a
natural number, λ n . This is the reason we talk about quantum mechanics.
(1) The eigenvalues, and the corresponding eigenstates associated with them, can be not only
discrete but also finite in number, in which case we have a finite-dimensional Hilbert space. The
standard example is the system with only two states that arises from a spin 1/2 particle: spin up ↑ or
spin down ↓, or more precisely

|s_z = +1/2⟩ and |s_z = −1/2⟩. (3.13)

This system has no classical counterpart.


(2) Another possibility is a discrete spectrum, for an operator that corresponds to a classical
continuous variable, but still to have a countable (indexable by N) set of eigenvalues. As we said, in
this case we talk about quantization of the observable. The standard example in this case is the energy
of an electron in a hydrogen atom, En . It could also be, as in this case, that the energy spectrum is
degenerate, meaning that there are other observables besides Ĥ (giving the energy) that add an index
b to the state,

Ĥ|ψ_n, b⟩ = E_n |ψ_n, b⟩. (3.14)

3.5 The Fourth Postulate

Since the probability of finding the eigenvalue λ_i corresponding to eigenstate |v_i⟩ of Â is

P_i = |⟨v_i|ψ⟩|², (3.15)

and we can expand the state in the complete and orthonormal (⟨v_i|v_j⟩ = δ_ij) set of the eigenstates of
Â, we write

|ψ⟩ = Σ_i c_i |v_i⟩, (3.16)

so that the probability of measuring λ_i is

P_i = |c_i|². (3.17)

Since the sum of all the probabilities is one, we obtain

Σ_i P_i = Σ_i |c_i|² = 1, (3.18)

which amounts (since the |v_i⟩ are normalized states) to normalization of the state,

⟨ψ|ψ⟩ = Σ_{i,j} c_i* c_j ⟨v_i|v_j⟩ = Σ_i |c_i|² = 1. (3.19)

This also means that only normalizable states |ψ⟩ can be physical (hence our definition for the
Hilbert space), since normalizable states imply that we can construct a probability set that sums
to one.
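This bookkeeping is easy to check numerically. A minimal sketch in plain Python, assuming a hypothetical three-state expansion (the coefficients below are illustrative, not taken from the text):

```python
import math

# Hypothetical expansion coefficients c_n of a state |psi> in the
# eigenbasis |v_n> of some observable A-hat (illustrative values).
c = [0.5, 0.5, 1 / math.sqrt(2)]

# Fourth postulate: P_n = |c_n|^2, eq. (3.17).
probs = [abs(cn) ** 2 for cn in c]

# Eq. (3.18): the probabilities sum to one exactly when <psi|psi> = 1,
# i.e., when the state is normalized.
assert math.isclose(sum(probs), 1.0)
```

Any overall normalization failure of |ψ⟩ shows up directly as the probabilities not summing to one.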

In the case of degenerate states, Â|v_n, b⟩ = λ_n |v_n, b⟩, instead of projecting onto a single state ⟨v_i|,
we project onto the set of states with the same eigenvalue λ_n, using the projector P_{V_n}, so onto the
state
P_{V_n}|ψ⟩. (3.20)
A projector must satisfy the relations
P_{V_n}² = P_{V_n}, P_{V_n}† = P_{V_n}, (3.21)
the first stating that projecting an already projected state changes nothing, so P² = P, and the second
that the projections onto the bra and ket are identical.
It is easy to realize that the projector must be the part of the completeness relation involving only
the states in question, so in our case

P_{V_n} = Σ_a |v_n, a⟩⟨v_n, a|. (3.22)

Acting on the general state

|ψ⟩ = Σ_{m,b} c_{m,b} |v_m, b⟩ (3.23)

gives

P_{V_n}|ψ⟩ = Σ_{a,m,b} |v_n, a⟩⟨v_n, a|v_m, b⟩ c_{m,b}. (3.24)

Then we must replace P_n = |⟨v_n|ψ⟩|² = |c_n|² by the more general

P_n = ⟨ψ|P_{V_n}|ψ⟩ = ⟨P_{V_n}ψ|P_{V_n}ψ⟩ = Σ_a |c_{n,a}|². (3.25)

Finally, we come to the possibility of having a continuous spectrum for an operator. As we said,
this is the case for the position operator, with matrix elements
⟨x|X̂|x′⟩ = x δ(x − x′), (3.26)
and the momentum operator, with matrix elements
⟨x|P̂|x′⟩ = −iℏ δ′(x − x′). (3.27)
The position operator has eigenstates |x⟩,
X̂|x⟩ = x|x⟩, (3.28)
which means we can define the wave function ⟨x|ψ⟩ ≡ ψ(x). In terms of it, the abstract relation for
eigenvectors of operators,
Â|ψ⟩ = λ|ψ⟩ (3.29)
becomes, by inserting the identity as a completeness relation using the states |x⟩,

∫dx′ ⟨x|Â|x′⟩⟨x′|ψ⟩ = λ⟨x|ψ⟩. (3.30)

Writing this in terms of wave functions, we obtain

∫dx′ A_{xx′} ψ(x′) = λψ(x). (3.31)

In the case of the momentum operator, Â = P̂, we obtain

∫dx′ (−iℏ δ′(x − x′)) ψ(x′) = p ψ(x), (3.32)

and, since δ′(x − x′) acts as δ(x − x′) d/dx′, we finally obtain

−iℏ dψ(x)/dx = pψ(x). (3.33)

On the other hand, for the position operator Â = X̂, since X̂_{xx′} = x δ(x − x′) we obtain just an
identity, xψ(x) = xψ(x).
We have presented three cases, of finite dimensional Hilbert spaces (such as for electron spin), of
discrete countable states (such as the energy states of the hydrogen atom), and of continuous states
(such as position and momentum). But we can have a system that has eigenvalues corresponding to
more than one observable, for instance both spin and position like the electron. In such a case, the
total Hilbert space is a tensor product of the individual Hilbert spaces,

H = H1 ⊗ H 2 . (3.34)

Example For the electron, with spin 1/2, consider that H1 corresponds to spin 1/2, with basis |s_z =
+1/2⟩, |s_z = −1/2⟩ (or |+⟩, |−⟩), and H2 corresponds to position, with ⟨x|ψ⟩ = ψ(x). Then we
consider states of the type |s⟩ ⊗ |ψ⟩ and basis states of the type ⟨+| ⊗ ⟨x|.

If we have two independent observables that can be diagonalized simultaneously (so we can
measure their eigenvalues simultaneously), corresponding to operators Â1 , Â2 , so that

Â_1|λ_1, λ_2⟩ = λ_1|λ_1, λ_2⟩, Â_2|λ_1, λ_2⟩ = λ_2|λ_1, λ_2⟩, (3.35)

we can take the commutator of the two operators, and obtain

(Â_1 Â_2 − Â_2 Â_1)|λ_1, λ_2⟩ = 0. (3.36)

Then, more generally (since the vectors |λ_1, λ_2⟩ form a basis for the Hilbert space), we can write
a commutation condition for operators:

[ Â1 , Â2 ] = 0. (3.37)

Thus operators corresponding to independent observables commute.
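A quick numerical illustration of this statement, using toy 2 × 2 matrices as lists of lists (the observables A1, A2 below are hypothetical examples, not tied to any physical system):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

# Two observables already diagonal in the common basis |lambda_1, lambda_2>:
A1 = [[1, 0], [0, -1]]    # eigenvalues lambda_1 = +1, -1
A2 = [[5, 0], [0, 3]]     # eigenvalues lambda_2 = 5, 3
assert commutator(A1, A2) == [[0, 0], [0, 0]]    # [A1, A2] = 0

# By contrast, observables with no common eigenbasis need not commute:
Sz, Sx = [[1, 0], [0, -1]], [[0, 1], [1, 0]]
assert commutator(Sz, Sx) != [[0, 0], [0, 0]]
```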


From the fourth postulate we also obtain a result for the experimentally measured average value
of an observable, which is sometimes included as part of the postulates:

⟨A⟩ ≡ ⟨ψ|Â|ψ⟩ = Σ_{n,m} c_n* c_m ⟨v_n|Â|v_m⟩ = Σ_n |c_n|² a_n = Σ_n P_n a_n, (3.38)

where we have used Â|v_n⟩ = a_n|v_n⟩ and the orthonormality of the eigenstates. Since the right-hand
side (Σ_n P_n a_n) is the experimentally measured average value, it follows that the average value of Â
in the state |ψ⟩ is consistently defined as

⟨A⟩ ≡ ⟨ψ|Â|ψ⟩. (3.39)

We also saw that the sum of probabilities must be 1, which means that states must be normalized,
Σ_n P_n = Σ_n |c_n|² = 1. But this relation, the conservation of probability, must be preserved by any
transformation, such as a change of basis by an operator Û, or time evolution through the operator
Û(t, t_0). That means in both cases that Û must be unitary, since

|ψ⟩ → Û|ψ⟩ ⇒ ⟨ψ|ψ⟩ → ⟨ψ|Û†Û|ψ⟩ = ⟨ψ|ψ⟩ (3.40)

implies Û†Û = 1̂.

3.6 The Fifth Postulate

As we said, this is the weirdest postulate, the one that most contradicts our common intuition but is
experimentally verified. On the one hand it looks very singular, since the state changes abruptly by
measurement, which is an interaction with a classical apparatus, and on the other it seems to happen
instantaneously. Moreover, it is a nonlinear process. Other quantum processes, as we just saw, are
obtained by unitary evolution (in either time, or change of basis), but this one process is nonunitary
as well as nonlinear. Since classical processes are supposed to appear in the limit of a large number
of components (as a macroscopic object is formed of a very large number of atoms), they should
also be related somehow to quantum mechanical processes. Yet measurements, interactions with a
classical system, somehow give a nonlinear and nonunitary process. This is very puzzling, and its
interpretation is still the subject of some debate.

3.7 The Sixth Postulate

We now consider the Schrödinger equation, and how it leads to the time evolution operator. The
eigenstates |E⟩, corresponding to eigenvalues, or energies E, of the Hamiltonian Ĥ form a complete
set, so we introduce their completeness relation in the definition of a state |ψ(t)⟩, to find

|ψ(t)⟩ = Σ_E |E⟩⟨E|ψ(t)⟩ ≡ Σ_E ψ_E(t)|E⟩, (3.41)

and substitute in the time-dependent Schrödinger equation

iℏ (∂/∂t)|ψ(t)⟩ = Ĥ|ψ(t)⟩, (3.42)

while imposing the time-independent Schrödinger equation, or eigenvalue problem for the Hamiltonian,

Ĥ|E⟩ = E|E⟩. (3.43)

We obtain

(iℏ ∂/∂t − Ĥ)|ψ(t)⟩ = Σ_E (iℏ dψ_E(t)/dt − E ψ_E(t)) |E⟩ = 0, (3.44)

with solution

ψ_E(t) = e^{−iE(t−t_0)/ℏ} ψ_E(t_0), (3.45)



leading to the full time-dependent state

|ψ(t)⟩ = Σ_E e^{−iE(t−t_0)/ℏ} ψ_E(t_0)|E⟩. (3.46)

Then the time evolution operator is

Û(t, t_0) = Σ_E |E⟩⟨E| e^{−iE(t−t_0)/ℏ}. (3.47)

For a degenerate energy spectrum, we will have a sum over states |E, a⟩:

|ψ(t)⟩ = Σ_{E,a} e^{−iE(t−t_0)/ℏ} ψ_{E,a}(t_0)|E, a⟩,
                                                       (3.48)
Û(t, t_0) = Σ_{E,a} |E, a⟩⟨E, a| e^{−iE(t−t_0)/ℏ}.
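The spectral form of the propagator can be sketched numerically: in the energy eigenbasis it is simply a diagonal matrix of phases. The spectrum below is a made-up example, and we set ℏ = 1:

```python
import cmath

energies = [0.0, 1.5, 2.0]     # hypothetical eigenvalues E_n of H-hat
t, t0 = 0.7, 0.0
n = len(energies)

# In the eigenbasis |E_n>, each projector |E_n><E_n| fills one diagonal
# slot, so U(t, t0) of eq. (3.47) is a diagonal matrix of phases.
U = [[cmath.exp(-1j * energies[i] * (t - t0)) if i == j else 0.0
      for j in range(n)] for i in range(n)]

# Each phase has modulus 1, so U is unitary and norms are preserved:
psi0 = [0.6, 0.0, 0.8]                          # normalized initial state
psi_t = [U[i][i] * psi0[i] for i in range(n)]   # |psi(t)> = U |psi(t0)>
norm = sum(abs(c) ** 2 for c in psi_t)
assert abs(norm - 1.0) < 1e-12
```

The same construction works for a degenerate spectrum; repeated energies simply produce repeated phases on the diagonal.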

3.8 Generalization of States to Ensembles: the Density Matrix

Until now we have considered the “pure quantum mechanical” case, where the state of the system is
defined by a unique state |ψ⟩ in a Hilbert space, also known as a “pure state”. However, in practice,
many important cases correspond to a combination of quantum mechanical and classical systems.
Specifically, instead of having a pure quantum state, we have a “collection” or ensemble of states in
which the system may be found, each one with its own classical probability p_i. We characterize this
situation by a “density matrix” (though it is really an operator, and it should be thought of this way),

ρ̂ = Σ_i p_i |i⟩⟨i|. (3.49)

Note that |i⟩ can be any kind of state, not necessarily an eigenstate of some observable.
Then we say that the average value of an observable associated with an operator Â is the result
of a double averaging: first a classical averaging (denoted with a bar over the operator) giving
probabilities p_i, then the standard quantum averaging over states |i⟩,

⟨Ā⟩ = Σ_i p_i ⟨i|Â|i⟩. (3.50)

This average is obtained from the quantity

Tr(ρ̂Â) = Σ_n Σ_i p_i ⟨n|i⟩⟨i|Â|n⟩ = Σ_i p_i Σ_n |⟨i|n⟩|² λ_n = Σ_i p_i ⟨A⟩_i = ⟨Ā⟩, (3.51)

where in the first equality we have used the fact that there is an orthonormal basis |n⟩ of eigenstates
of Â, and we have taken the trace of the corresponding matrix, and in the second we have used the
eigenvalue equation, Â|n⟩ = λ_n|n⟩, and finally we have used that the quantum probability of finding
the state |i⟩ in the state |n⟩ is P_{i,n} = |⟨i|n⟩|².
Note that the density matrix is normalized, i.e., we assume that the sum of all the probabilities is
one, which amounts to (setting formally Â = 1, or λ_n = 1)

Tr ρ̂ = Σ_i p_i Σ_n |⟨i|n⟩|² = Σ_i Σ_n p_i P_{i,n} = 1. (3.52)

To come back to the original case, of a “pure state” (as opposed to a “mixed state”, i.e., an ensemble
with a density matrix), or, more properly stated, a “pure ensemble”, we now take into account that
p_i = δ_{i,ψ}, thus the density matrix is

ρ̂ = |ψ⟩⟨ψ|. (3.53)

In this case, the average reduces to the quantum average, since

⟨Ā⟩ = Tr(ρ̂Â) = Σ_n ⟨n|ψ⟩⟨ψ|Â|n⟩ = Σ_n λ_n |⟨n|ψ⟩|² = ⟨A⟩_ψ, (3.54)

the same as for a pure state.
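The equality of the trace formula and the double average can be checked in a small example. The sketch below assumes a hypothetical two-level ensemble (states, probabilities, and observable are illustrative choices):

```python
def outer(v, w):   # |v><w| as a 2x2 matrix
    return [[v[i] * w[j].conjugate() for j in range(2)] for i in range(2)]

def expval(v, A):  # <v|A|v>
    return sum(v[i].conjugate() * A[i][j] * v[j]
               for i in range(2) for j in range(2))

s = 2 ** -0.5
states = [[s, s], [s, -s]]     # the |i>: here the two sigma_x eigenstates
p = [0.25, 0.75]               # classical probabilities p_i, summing to 1
A = [[0, 1], [1, 0]]           # observable A-hat: sigma_x

# rho = sum_i p_i |i><i|, eq. (3.49)
rho = [[sum(p[k] * outer(states[k], states[k])[i][j] for k in range(2))
        for j in range(2)] for i in range(2)]

# Tr(rho A) versus the double average sum_i p_i <i|A|i>, eq. (3.51)
trace_avg = sum(sum(rho[i][k] * A[k][i] for k in range(2)) for i in range(2))
double_avg = sum(p[k] * expval(states[k], A) for k in range(2))

assert abs(trace_avg - double_avg) < 1e-12        # both equal -0.5 here
assert abs(rho[0][0] + rho[1][1] - 1.0) < 1e-12   # Tr rho = 1, eq. (3.52)
```

Setting p = [1, 0] reduces ρ̂ to the pure-state projector of eq. (3.53), and the two averages collapse to the ordinary quantum average.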


The formalism of density matrices is relevant for the transition from quantum to classical regimes
which, as we alluded to already, is a tricky issue. As we said, we have a combination of classical and
quantum aspects, and in the presence of temperature we obtain something of this type. Temperature
means a degree of “thermalization”, or interaction with a classical temperature bath, that means
classicalization to some degree. These are however issues that will be discussed more at length later
on, in Part IIa of this book.

Important Concepts to Remember

• The postulates of quantum mechanics are just a convenient way to package the physical content of
quantum mechanics, not mathematical axioms.
• The first postulate: at every time, the state of a physical system is a ket |ψ⟩ in a Hilbert space.
• The second postulate: observables correspond to a Hermitian operator Â acting on |ψ⟩. The
eigenvalues of Â are real, as they must be in order to correspond to observables.
• The third postulate: the result of a measurement of an observable A is an eigenvalue λ_n of Â. If
the λ_n are discrete, we have quantization. We can expand a state |ψ⟩ in terms of a basis consisting
of the eigenstates |v_n⟩ of Â.
• The fourth postulate: the probability of obtaining the eigenvalue λ_n associated with |v_n⟩ is
P_n = |⟨v_n|ψ⟩|². Since |ψ⟩ and |v_n⟩ are normalized, Σ_n P_n = 1.
• The fifth postulate: after measuring the variable A and obtaining λ_n, the state of the system has
changed from |ψ⟩ to |v_n⟩.
• The sixth postulate: the evolution in time of a state is governed by the Schrödinger equation
iℏ d/dt |ψ(t)⟩ = Ĥ|ψ(t)⟩.
• If we consider the position of a particle in a state |ψ⟩, with vector |x⟩, then the probability of finding
the particle at position x is |ψ(x)|².
• Independent observables can be diagonalized simultaneously and have commuting operators:
[Â, B̂] = 0.
• For an eigenvalue E of the Hamiltonian of a system, we have the time-independent Schrödinger
equation Ĥ|ψ_E⟩ = E|ψ_E⟩; then |ψ_E(t)⟩ = e^{−iE(t−t_0)/ℏ}|ψ_E(t_0)⟩.
• The evolution operator Û(t, t_0) is defined by |ψ(t)⟩ = Û(t, t_0)|ψ(t_0)⟩.
• A classical ensemble of quantum states corresponds to a density matrix ρ̂ = Σ_i p_i |i⟩⟨i|, normalized
as Tr ρ̂ = 1 and with average observable value ⟨Ā⟩ = Tr(ρ̂Â).

Further Reading
See, for instance, Messiah’s book [2] or Shankar’s book [3] for an alternative approach.

Exercises

(1) Suppose that the probability of finding some particle 1 at x 1 is a Gaussian around x 1 , with
standard deviation σ1 , and the probability of finding another particle 2 at x 2 is a Gaussian around
x 2 with standard deviation σ2 . What is the condition on the wave functions of the two particles
such that the probability to find either particle at the position mentioned is the sum of the two
Gaussians?
(2) Consider the classical Hamiltonian
H = αp² + βq² + p f(q). (3.55)
Write a quantum Hamiltonian that is ordering-symmetric with respect to Q̂ and P̂, taking into
account that [Q̂, P̂] is a constant.
(3) Consider infinite and discrete energy spectra En . If the spectrum extends by a finite amount,
between Emin and Emax , what conditions can you impose on the En very close to either of these
values? What if Emax = +∞ or Emin = −∞?
(4) Consider a (spinless) particle in a three-dimensional potential. How many commuting operators
associated with observables are there?
(5) Consider the Hamiltonian
H = x². (3.56)
Solve the Schrödinger equation and find the time evolution operator.
(6) Consider the density matrix
ρ̂ = (1/3)(|1⟩⟨1| + |2⟩⟨2| + |3⟩⟨3|), (3.57)
where |1⟩ and |2⟩ ± |3⟩ are eigenstates of some operator Â. Calculate the average value of A.
(7) In exercise 6, if the states |1⟩, |2⟩, |3⟩ are eigenstates of the Hamiltonian Ĥ, write the evolution
in time of the density matrix ρ̂.
4 Two-Level Systems and Spin-1/2, Entanglement, and Computation

In this chapter we consider the simplest possible situation, that of a Hilbert space with dimension 2,
i.e., a system with only two orthonormal states leading to operators represented as 2 × 2 matrices,
also known as a two-level system. We will only analyze the time-independent case, leaving the
time-dependent case for later. Then, we will consider the case of several such two-level systems
coupled together, with a tensor product Hilbert space (as we described previously). The relevant new
phenomenon that we will observe is called entanglement, related to the quantum coupling of the
Hilbert spaces. Moreover, this will allow us to define a quantum version of classical computation,
which here we will only touch upon; both entanglement and quantum computation will be addressed
in more detail later on in the book.

4.1 Two-Level Systems and Time Dependence

We will motivate the study of general two-level systems by first considering the simplest such system,
of a fermion (such as an electron) with spin |S⃗| = 1/2 (more precisely, the spin is 1/2 times the
quantum unit of angular momentum, which is ℏ, so |S⃗| = ℏ/2, but we will suppress the ℏ for the
moment). In this case, as we know experimentally, for instance from the Stern–Gerlach experiment
(described in Chapter 0), the projection of the spin onto any direction, here denoted by z, S_z, can
take only the values +1/2 and −1/2. In the Stern–Gerlach experiment, we put a magnetic field in the
z direction, splitting an electron beam passing through it transversally.
Therefore, the independent states in this simplest system are |S = 1/2, S_z = +1/2⟩, also called |+⟩
or |↑⟩, and |S = 1/2, S_z = −1/2⟩, also called |−⟩ or |↓⟩. In a matrix notation for the Hilbert space and
the operators on it, the states are (see Fig. 4.1)

|↑⟩ = |+⟩ = (1, 0)^T, |↓⟩ = |−⟩ = (0, 1)^T. (4.1)

Operators acting on this space are then represented by 2 × 2 matrices acting on the above states.
The most important operators are the spin operators themselves, or more precisely the spin
projections onto the three axes S_x, S_y, S_z. We will describe the general theory of spin and angular
momentum later, but for the moment we note that, since the spin projections are ±1/2, we want the
matrices representing S_x, S_y, S_z to satisfy

S_x² = S_y² = S_z² = 1/4. (4.2)

Moreover, we want to have (we will explain this condition later)

(S_x + iS_y)² = (S_x − iS_y)² = 0, (4.3)


45
46 4 Two-Level Systems and Spin 1/2, Entanglement

Figure 4.1 A two-level system, with a common notation: the upper level is labeled +, ↑, with energy E_2, and the lower level −, ↓, with energy E_1.

which leads to

S_x S_y + S_y S_x = 0, (4.4)

i.e., these matrices anticommute. The above conditions, together with the anticommutativity of S_x
and S_y with S_z, are actually sufficient to define the matrices as S_i = (1/2)σ_i, with σ_i the Pauli matrices,

σ_1 = σ_x = (0 1; 1 0), σ_2 = σ_y = (0 −i; i 0), σ_3 = σ_z = (1 0; 0 −1). (4.5)

More precisely, the Pauli matrices satisfy

σ_i σ_j = δ_ij 1_{2×2} + i ε_ijk σ_k, (4.6)

which we can check explicitly; we have used implicit Einstein summation over the index k, and
the Levi–Civita tensor is ε_ijk = +1 for (ijk) = (123) and for cyclic permutations thereof, −1 for odd
permutations, and 0 otherwise.
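Relation (4.6) can indeed be verified by brute force; a short check with the Pauli matrices written as 2 × 2 complex lists:

```python
sig = {
    1: [[0, 1], [1, 0]],        # sigma_x
    2: [[0, -1j], [1j, 0]],     # sigma_y
    3: [[1, 0], [0, -1]],       # sigma_z
}

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eps(i, j, k):               # Levi-Civita symbol epsilon_ijk
    return {(1, 2, 3): 1, (2, 3, 1): 1, (3, 1, 2): 1,
            (1, 3, 2): -1, (3, 2, 1): -1, (2, 1, 3): -1}.get((i, j, k), 0)

# Check sigma_i sigma_j = delta_ij 1 + i eps_ijk sigma_k for all i, j.
for i in range(1, 4):
    for j in range(1, 4):
        lhs = matmul(sig[i], sig[j])
        rhs = [[(1 if i == j else 0) * (1 if a == b else 0)
                + sum(1j * eps(i, j, k) * sig[k][a][b] for k in range(1, 4))
                for b in range(2)] for a in range(2)]
        assert all(abs(lhs[a][b] - rhs[a][b]) < 1e-12
                   for a in range(2) for b in range(2))
```

In particular, the i = j cases reproduce σ_i² = 1, i.e., eq. (4.2) after rescaling by S_i = σ_i/2.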
The energy due to the interaction of the spin of the electron, manifested through its magnetic
moment, with the magnetic field B⃗ is

ΔH_spin = −μ⃗ · B⃗, (4.7)

where μ⃗ is the magnetic moment of the electron. Classically,

μ⃗ = (e/2m) L⃗. (4.8)

Quantum mechanically, as we will see, the quantum unit of |L⃗| is ℏ. Then the quantum of |μ⃗| is
called the Bohr magneton, and is

μ_B = eℏ/2m. (4.9)
On the other hand the electron, as a fermion, has half-integer spin (i.e., intrinsic angular momentum),
and more precisely

|S⃗| = ℏ/2. (4.10)

Moreover, the quantum magnetic moment of the electron due to its spin has an extra factor, the Landé
g-factor, so

μ⃗_S = g (e/2m) S⃗, (4.11)

where g ≈ 2. The magnetic moment as an operator, or more precisely a matrix, acting on the two
spin states, is then

μ⃗_S = (geℏ/4m) σ⃗, (4.12)

so the spin energy, leading to spin splitting of the electron’s energy levels, is

ΔH_spin = −μ⃗_S · B⃗ = −μ σ⃗ · B⃗. (4.13)

If μ⃗ = μ e⃗_z and B⃗ is time independent, then we finally obtain

ΔH_spin = −μB_z (1 0; 0 −1). (4.14)

Its eigenvectors are the basis vectors,

|↑⟩ = |+⟩ = (1, 0)^T, |↓⟩ = |−⟩ = (0, 1)^T. (4.15)
If we are measuring this energy, for instance as in the Stern–Gerlach experiment, a prototype of
a quantum state that does not have a well-defined energy but rather probabilities for energies, is the
normalized linear combination

|ψ_±⟩ ≡ (1/√2)(|↑⟩ ± |↓⟩) = (1/√2)(1, ±1)^T. (4.16)

In this state, the probabilities for the occurrence of each eigenstate |i⟩ = |±⟩ are equal, and given by

p_i = |c_i|² = 1/2, (4.17)

for both values of the measured energy,

E_+ = E + μB_z and E_− = E − μB_z. (4.18)
We could in principle also add a time-dependent magnetic field with circular polarization in the
(x, y) directions,

B_x = −b cos ωt, B_y = b sin ωt. (4.19)

Then its interaction with the magnetic moment of the electron gives

−(B_x σ_x + B_y σ_y) = b (0  cos ωt + i sin ωt; cos ωt − i sin ωt  0) = b (0  e^{iωt}; e^{−iωt}  0), (4.20)

leading to a total spin Hamiltonian

ΔH_spin = −μB_z (1 0; 0 −1) + μb (0  e^{iωt}; e^{−iωt}  0). (4.21)
However, this case is more complicated and will not be analyzed here, but later in the book.

4.2 General Stationary Two-State System

Consider a complete basis |φ_1⟩, |φ_2⟩ corresponding to some part of the Hamiltonian Ĥ_0. In coordinate
space, this would correspond to wave functions ⟨x|φ_1⟩ = φ_1(x) and ⟨x|φ_2⟩ = φ_2(x). As before,
representing |φ_1⟩ by |+⟩ = (1, 0)^T and |φ_2⟩ by |−⟩ = (0, 1)^T, a general wave function can be decomposed as

ψ(x, t) = c_1(t)φ_1(x) + c_2(t)φ_2(x), (4.22)



which is written formally as

|ψ(t)⟩ = c_1(t)|φ_1⟩ + c_2(t)|φ_2⟩ = (c_1(t), c_2(t))^T. (4.23)

Then the Schrödinger equation for the wave function in coordinate space is

iℏ (d/dt) ψ(x, t) = Ĥ ψ(x, t), (4.24)

and in matrix form for the formal case it is

iℏ (d/dt) (c_1(t), c_2(t))^T = Ĥ (c_1(t), c_2(t))^T = (H_11  H_12; H_21  H_22)(c_1(t), c_2(t))^T. (4.25)

Here the matrix elements of the Hamiltonian are, by definition,

H_ij = ⟨φ_i|Ĥ|φ_j⟩ = ∫dx ∫dy ⟨φ_i|x⟩⟨x|Ĥ|y⟩⟨y|φ_j⟩ = ∫dx ∫dy φ_i*(x) Ĥ_xy φ_j(y), (4.26)

where we have introduced two identities written using the completeness relation in order to express
the Hamiltonian matrix elements in terms of wave functions.
As we saw in the general theory, in order to solve the Schrödinger equation we write an eigenvalue
problem for the Hamiltonian, and in terms of it we write an ansatz for the time dependence as

(c_1(t), c_2(t))^T = e^{−iEt/ℏ} (c_1, c_2)^T. (4.27)

Here c_1, c_2 are constants and E is an eigenvalue for the eigenvalue–eigenvector problem for the
Hamiltonian, satisfying

(H_11  H_12; H_21  H_22)(c_1, c_2)^T = E (c_1, c_2)^T. (4.28)

As we know, the existence of the eigenvalue E is equivalent to the vanishing of det(Ĥ − E·1),

det (H_11 − E  H_12; H_21  H_22 − E) = 0. (4.29)

Moreover, remembering that the Hamiltonian (like all observables) must be a Hermitian operator,
Ĥ = Ĥ†, in matrix form it means that we must have

H_21 = H_12*. (4.30)

The above equation for the eigenvalues then becomes

(E − H11 )(E − H22 ) − |H12 | 2 = E 2 − E(H11 + H22 ) + H11 H22 − |H12 | 2 = 0, (4.31)

with solutions

E_± = (H_11 + H_22)/2 ± √((H_11 − H_22)²/4 + |H_12|²). (4.32)

Given these eigenvalues, the eigenstate equations are

(H_11  H_12; H_21  H_22)(c_1^±, c_2^±)^T = E_± (c_1^±, c_2^±)^T. (4.33)

Because of the eigenvalue condition, the two equations are degenerate (the determinant of the linear
equation for the variables c_1^±, c_2^± vanishes), so only one of the two equations is independent, say, the
second,

H_21 c_1^± + (H_22 − E_±)c_2^± = 0, (4.34)

while the second condition for the pair of variables is the normalization condition

|c_1^±|² + |c_2^±|² = 1, (4.35)

obtained from

⟨ψ_+|ψ_+⟩ = ⟨ψ_−|ψ_−⟩ = 1. (4.36)

The solution of the two equations is

(c_1^±, c_2^±)^T = η_±/√(1 + |H_21/(E_± − H_22)|²) · (1, H_21/(E_± − H_22))^T, (4.37)

where |η_±|² = 1, so η_± is a phase, which we will choose to be 1. We can simplify the equations using
the definitions

(H_11 + H_22)/2 ≡ Ē, (H_22 − H_11)/2 ≡ Δ, H_12 = H_21* ≡ V, 2√(Δ² + |V|²) ≡ ℏΩ. (4.38)

Then the eigenenergies are

E_± = Ē ± √(Δ² + |V|²) = Ē ± ℏΩ/2, (4.39)
2
the Hamiltonian is

Ĥ = (Ē − Δ  V; V*  Ē + Δ), (4.40)

and the eigenstates are

(c_1^±, c_2^±)^T = η_±/√(1 + |V|²/(∓Δ + √(Δ² + |V|²))²) · (1, V*/(±√(Δ² + |V|²) − Δ))^T
               = η_±/√(|V|² + (∓Δ + √(Δ² + |V|²))²) · (∓Δ + √(Δ² + |V|²), ±V*)^T. (4.41)
We choose η_± = 1, a real V (V = V* ∈ ℝ), and then we can define

(c_1^+, c_2^+)^T = (−sin θ, cos θ)^T, (c_1^−, c_2^−)^T = (cos θ, sin θ)^T. (4.42)

Note that this definition is self-consistent since c_1^− = c_2^+, which gives

V/√(V² + (−Δ + √(Δ² + V²))²) = (Δ + √(Δ² + V²))/√(V² + (Δ + √(Δ² + V²))²) ≡ cos θ, (4.43)

as we can check by squaring the equation and multiplying with the denominators.
Moreover, we obtain that

sin 2θ = 2 sin θ cos θ = −2V(Δ + √(Δ² + V²))/(V² + (Δ + √(Δ² + V²))²) = −V/√(Δ² + V²), (4.44)

which also implies that

cos 2θ = Δ/√(Δ² + V²). (4.45)

In conclusion, the eigenstates corresponding to E_± = Ē ± ℏΩ/2 are

|ψ_−⟩ = cos θ|φ_1⟩ + sin θ|φ_2⟩
|ψ_+⟩ = −sin θ|φ_1⟩ + cos θ|φ_2⟩, (4.46)

so the general time-dependent state is given by

|ψ(t)⟩ = c_− e^{−iE_−t/ℏ}|ψ_−⟩ + c_+ e^{−iE_+t/ℏ}|ψ_+⟩, (4.47)

where, since |ψ_±⟩ are orthonormal states, by multiplying with ⟨ψ_±| we obtain that the coefficients c_±
are given by

c_± = ⟨ψ_±|ψ(t = 0)⟩. (4.48)
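The formulas of this section can be sanity-checked numerically. The sketch below assumes sample values of Ē, Δ, V (and ℏ = 1), verifies that E_± satisfy the determinant condition (4.29), and checks that the θ-parametrized eigenstates (4.46) solve the eigenvalue problem:

```python
import math

Ebar, Delta, V = 1.0, 0.3, 0.4          # hypothetical real parameters
H = [[Ebar - Delta, V], [V, Ebar + Delta]]   # eq. (4.40)

root = math.sqrt(Delta ** 2 + V ** 2)
E_plus, E_minus = Ebar + root, Ebar - root   # eq. (4.39)

for E in (E_plus, E_minus):
    det = (H[0][0] - E) * (H[1][1] - E) - H[0][1] * H[1][0]
    assert abs(det) < 1e-12             # eq. (4.29)

# Mixing angle from sin 2theta = -V/sqrt(Delta^2+V^2), cos 2theta = Delta/...
theta = 0.5 * math.atan2(-V, Delta)
psi_minus = [math.cos(theta), math.sin(theta)]    # eq. (4.46)
psi_plus = [-math.sin(theta), math.cos(theta)]

def apply(H, v):
    return [H[0][0] * v[0] + H[0][1] * v[1], H[1][0] * v[0] + H[1][1] * v[1]]

# H |psi_-> = E_- |psi_->  and  H |psi_+> = E_+ |psi_+>
for v, E in ((psi_minus, E_minus), (psi_plus, E_plus)):
    Hv = apply(H, v)
    assert all(abs(Hv[i] - E * v[i]) < 1e-12 for i in range(2))
```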

4.3 Oscillations of States

Choosing as initial state an eigenstate of the diagonal part of the Hamiltonian, for instance
|φ_1⟩ = (1, 0)^T, but not of the full Hamiltonian, we will see that the system actually oscillates between
the state |φ_1⟩ and the state |φ_2⟩. This phenomenon is actually the one responsible for neutrino
oscillations, and what we have described here applies almost directly.
Consider then the state |ψ(t = 0)⟩ = |φ_1⟩ as the initial state. Inverting (4.46), we have

|φ_1⟩ = cos θ|ψ_−⟩ − sin θ|ψ_+⟩
|φ_2⟩ = sin θ|ψ_−⟩ + cos θ|ψ_+⟩, (4.49)

which means that from c_± = ⟨ψ_±|ψ(t = 0)⟩ we get

c_− = cos θ, c_+ = −sin θ. (4.50)



Substituting into the general state, we obtain

|ψ(t)⟩ = (cos²θ e^{−iE_−t/ℏ} + sin²θ e^{−iE_+t/ℏ})|φ_1⟩
       + sin θ cos θ (e^{−iE_−t/ℏ} − e^{−iE_+t/ℏ})|φ_2⟩ ≡ c_1|φ_1⟩ + c_2|φ_2⟩, (4.51)

which means that, from the point of view of the |φ_1⟩, |φ_2⟩ basis, the state oscillates between the basis
states. This is indeed the phenomenon of neutrino oscillations, if we consider the neutrino states as
|ν_1⟩ = |φ_1⟩ and |ν_2⟩ = |φ_2⟩ and that they are actually not eigenstates of the Hamiltonian.
The probability of oscillation, that is, of finding the system in the basis state |φ_2⟩, is

p_2(t) = |c_2|² = sin²θ cos²θ |e^{−iE_−t/ℏ} − e^{−iE_+t/ℏ}|² = (sin 2θ)² (1 − cos((E_+ − E_−)t/ℏ))/2
       = (V²/(Δ² + V²)) sin²(√(Δ² + V²) t/ℏ)
       = (V²/(2(Δ² + V²))) (1 − cos Ωt), (4.52)

where we have used E_+ − E_− = 2√(Δ² + V²) = ℏΩ. Then the probability of remaining in the original
state is

p_1(t) = |c_1(t)|² = 1 − p_2(t) = (2Δ² + V²)/(2(Δ² + V²)) + (V²/(2(Δ² + V²))) cos Ωt, (4.53)
as we can check explicitly. These probabilities oscillate with frequency Ω around an average, as can
be seen.
Consider next two special cases:

• First, Δ = 0, in which case Ω = 2|V|/ℏ and

p_1(t) = (1/2)(1 + cos(2|V|t/ℏ))
p_2(t) = (1/2)(1 − cos(2|V|t/ℏ)). (4.54)

In this case, the oscillations happen around the average value of 1/2, but there are times where the
system finds itself fully in the state |φ_2⟩, since p_1 = 0, p_2 = 1, and others where it is fully back at
|φ_1⟩, since p_1 = 1, p_2 = 0.
• Second, consider V = 0, in which case we are back at the spin-1/2 case with B⃗ = B_z e⃗_z, and E_− =
H_11 = E_1, E_+ = H_22 = E_2. Then there is no oscillation, since we get sin 2θ = 0, and p_2(t) = 0.
Moreover, in this case |ψ_±⟩ = |φ_{1,2}⟩ are the basis states of the matrix. But for a general initial state
|ψ(t = 0)⟩ = c_−|φ_1⟩ + c_+|φ_2⟩, with c_± arbitrary, we find

|ψ(t)⟩ = e^{−iE_1t/ℏ} c_−|φ_1⟩ + e^{−iE_2t/ℏ} c_+|φ_2⟩. (4.55)

That means, however, that the probability of the system being in state 1 is p_1(t) = |c_−|² and of its
being in state 2 is p_2(t) = |c_+|²; these probabilities are time independent, so we have no oscillations.
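The closed-form oscillation probability (4.52) can be compared against a direct evaluation of c_2(t) from (4.51); the parameter values below are illustrative and ℏ = 1:

```python
import cmath
import math

Ebar, Delta, V = 0.0, 0.3, 0.4        # hypothetical parameters, hbar = 1
root = math.sqrt(Delta ** 2 + V ** 2)
E_plus, E_minus = Ebar + root, Ebar - root
Omega = 2 * root                      # E_+ - E_- = hbar * Omega

theta = 0.5 * math.atan2(-V, Delta)   # from eqs. (4.44)-(4.45)

for t in (0.0, 0.5, 1.7, 4.0):
    # c2(t) = sin(theta) cos(theta) (e^{-i E_- t} - e^{-i E_+ t}), eq. (4.51)
    c2 = math.sin(theta) * math.cos(theta) * (
        cmath.exp(-1j * E_minus * t) - cmath.exp(-1j * E_plus * t))
    p2_direct = abs(c2) ** 2
    # Closed form, eq. (4.52)
    p2_formula = V ** 2 / (2 * (Delta ** 2 + V ** 2)) * (1 - math.cos(Omega * t))
    assert abs(p2_direct - p2_formula) < 1e-12
```

Setting Delta = 0 makes the amplitude prefactor equal to 1/2 and reproduces the full-swing oscillations of eq. (4.54), while V = 0 kills p_2(t) entirely.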

4.4 Unitary Evolution Operator

Consider the general initial state

|ψ(t = 0)⟩ = b|φ_1⟩ + a|φ_2⟩ = (b, a)^T = a(sin θ|ψ_−⟩ + cos θ|ψ_+⟩) + b(cos θ|ψ_−⟩ − sin θ|ψ_+⟩),
(4.56)

that is, with

c_− = a sin θ + b cos θ, c_+ = a cos θ − b sin θ. (4.57)

Then substituting into the general form of the time-dependent state (4.47), we obtain

|ψ(t)⟩ = c_−|ψ_−⟩e^{−iE_−t/ℏ} + c_+|ψ_+⟩e^{−iE_+t/ℏ}
      = [(a sin θ cos θ + b cos²θ)e^{−iE_−t/ℏ} + (−a sin θ cos θ + b sin²θ)e^{−iE_+t/ℏ}]|φ_1⟩ (4.58)
      + [(a sin²θ + b sin θ cos θ)e^{−iE_−t/ℏ} + (a cos²θ − b sin θ cos θ)e^{−iE_+t/ℏ}]|φ_2⟩.

This can be written in the form of a matrix U(t) acting on the initial state,

|ψ(t)⟩ = U(t)(b, a)^T = U(t)|ψ(t = 0)⟩, (4.59)

where U(t) is given by

U(t) = e^{−iĒt/ℏ} (cos²θ e^{iΩt/2} + sin²θ e^{−iΩt/2}   2i sin θ cos θ sin(Ωt/2);
                   2i sin θ cos θ sin(Ωt/2)   sin²θ e^{iΩt/2} + cos²θ e^{−iΩt/2}). (4.60)

We can check explicitly that the matrix is unitary,

U†(t)U(t) = 1̂. (4.61)

This agrees with the general theory explained before, stating that the time evolution of a state, as
well as any change of basis, must be a unitary operation in order for the sum of all probabilities to be
equal to 1 (the conservation of probability). This also means that the time evolution is a reversible
process: we can act with U −1 = U † on the final state, and obtain the initial state.
The only possible nonunitary process, in quantum theory, is the process of measurement, which is
not reversible.
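Unitarity of the matrix in eq. (4.60) can also be confirmed numerically; the sketch below assumes sample values of Ē, Δ, V and ℏ = 1:

```python
import cmath
import math

Ebar, Delta, V = 1.0, 0.3, 0.4        # hypothetical parameters, hbar = 1
Omega = 2 * math.sqrt(Delta ** 2 + V ** 2)
theta = 0.5 * math.atan2(-V, Delta)
s, c = math.sin(theta), math.cos(theta)

t = 2.3
pre = cmath.exp(-1j * Ebar * t)       # overall phase e^{-i Ebar t}
off = pre * 2j * s * c * math.sin(Omega * t / 2)
U = [[pre * (c**2 * cmath.exp(1j * Omega * t / 2)
             + s**2 * cmath.exp(-1j * Omega * t / 2)), off],
     [off, pre * (s**2 * cmath.exp(1j * Omega * t / 2)
                  + c**2 * cmath.exp(-1j * Omega * t / 2))]]

# U^dagger U = 1: the columns are orthonormal, so probability is conserved.
for i in range(2):
    for j in range(2):
        val = sum(U[k][i].conjugate() * U[k][j] for k in range(2))
        assert abs(val - (1 if i == j else 0)) < 1e-12
```

The overall phase e^{−iĒt} drops out of U†U, which is why only the relative frequency Ω appears in observable probabilities.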

4.5 Entanglement

We have described a single two-level system, for instance a spin 1/2 system, but we can also consider
several such systems interacting with each other. In the simplest case, consider just two such systems,
A and B. In this case, as we explained in the general theory, the total Hilbert space is a tensor product
of the individual Hilbert spaces, so H_AB = H_A ⊗ H_B.

But there is one possibility of interest that we will consider here, namely that the interaction
between the systems is such that the state of one system influences the state of the other, in such
a way that the state of the total system is not describable as the product of states in each system, i.e.,

|ψ_AB⟩ ≠ |ψ_A⟩ ⊗ |ψ_B⟩. (4.62)

We call such a situation an entangled state, and the phenomenon is called entanglement.
The quintessential example is one that is called “maximally entangled”, where the state of one
system completely determines the state of the other. For instance, consider a system, such as an atom
or nucleus, that has a total spin equal to zero but decays into two systems (decay products) that each
have spin 1/2 (such as an electron and a “hole”, or ionized atom in the case of an atom). Then spin
conservation means that the total spin of the decay products must still add up to zero, meaning
that if system A has Sz = +1/2 then system B has Sz = −1/2, and vice versa (if A has Sz = −1/2
then B has Sz = +1/2); thus the state of system A completely determines the state of system B. In
this case, the probabilities of the two spin situations are each equal to 1/2, so the total state of the
system is
$$|\psi_\pm\rangle_{AB} = \frac{1}{\sqrt{2}}\left(|\!\uparrow\rangle_A \otimes |\!\downarrow\rangle_B \pm |\!\downarrow\rangle_A \otimes |\!\uparrow\rangle_B\right) \equiv \frac{1}{\sqrt{2}}\left(|10\rangle \pm |01\rangle\right), \qquad (4.63)$$
where in the last equality we have introduced a new notation for the two states of the two-level system: $|0\rangle$ and $|1\rangle$. We note that, as desired, the probabilities of the two cases are $p_i = |c_i|^2 = 1/2$ and that the state of the system is normalized,
$${}_{AB}\langle\psi_\pm|\psi_\pm\rangle_{AB} = 1, \qquad (4.64)$$
as well as having orthogonal basis states, ${}_{AB}\langle\psi_+|\psi_-\rangle_{AB} = 0$.


Another way to analyze entangled states is to consider the density matrix associated with the
state. Indeed, in a more general situation we could also consider the entanglement associated with a
nontrivial density matrix, though we will not do it here. The density matrix of the above entangled state is
$$\rho_{AB} = |\psi_\pm\rangle\langle\psi_\pm| = \frac{1}{2}\left(|\!\uparrow\downarrow\rangle \pm |\!\downarrow\uparrow\rangle\right)\left(\langle\uparrow\downarrow\!| \pm \langle\downarrow\uparrow\!|\right). \qquad (4.65)$$
If we consider the trace of the density matrix operator over the Hilbert space of system B, corresponding to a sum over all the possibilities for system B (for instance, assuming that we don't know which possibility is realized), we obtain
$$\mathrm{Tr}_B\,\rho_{AB} = \mathrm{Tr}_B\,|\psi_\pm\rangle_{AB}\,{}_{AB}\langle\psi_\pm| = {}_B\langle\uparrow|\psi_\pm\rangle_{AB}\,{}_{AB}\langle\psi_\pm|\!\uparrow\rangle_B + {}_B\langle\downarrow|\psi_\pm\rangle_{AB}\,{}_{AB}\langle\psi_\pm|\!\downarrow\rangle_B = \frac{1}{2}\left(|\!\downarrow\rangle_A\,{}_A\langle\downarrow\!| + |\!\uparrow\rangle_A\,{}_A\langle\uparrow\!|\right) = \frac{1}{2}\hat{1}_A, \qquad (4.66)$$
where we have used the orthonormality of the eigenstates of the B system.
The result is that after taking the trace over the B system’s density matrix, i.e., summing over its
possibilities, we obtain a completely random system for state A: on measuring the spin we get the
probability 1/2 of obtaining spin up, and probability 1/2 of obtaining spin down. This is another way
to determine that we have maximum entanglement in the system.
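The partial trace above can be checked directly; a minimal numerical sketch (not from the text), assuming numpy:

```python
import numpy as np

# Minimal numerical sketch (not from the text): for the maximally entangled
# state |psi_+> = (|10> + |01>)/sqrt(2) of Eq. (4.63), trace out system B and
# verify that the reduced density matrix of A is (1/2) 1_A, as in Eq. (4.66).
ket0 = np.array([1.0, 0.0])                           # |0> = |up>
ket1 = np.array([0.0, 1.0])                           # |1> = |down>

psi = (np.kron(ket1, ket0) + np.kron(ket0, ket1)) / np.sqrt(2)   # |10> + |01>
rho_AB = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)  # indices (a, b, a', b')

rho_A = np.einsum('abcb->ac', rho_AB)                 # partial trace over B
assert np.allclose(rho_A, 0.5 * np.eye(2))            # completely random state A
```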
There is much more that can be said about entanglement, but we will postpone this until the second
part of the book.

4.6 Quantum Computation

We have introduced another notation for the two-level states, $|0\rangle$ and $|1\rangle$, which recalls the bits of classical computation, also denoted by states 0 and 1. This is not incidental: we will call a two-level system a "qubit", the quantum version of the computer bit.
In the entangled system, the two interacting spins (the two qubits) are now denoted by $|a\rangle \otimes |b\rangle$, where $a, b = 0$ or $1$.
On individual two-level systems, i.e., qubits, as well as on the two-qubit states, i.e., tensor product
states, in the absence of measurements the only possible operations allowed by quantum mechanics,
time evolution and changes of basis, are unitary operations, as we said earlier.
This allows one to define a quantum version of classical computation. In a classical computer, we encode the data in classical states of 0s and 1s, and act on them with "gates", which are operators with a well-defined action on the products of two-state systems (taking a given combination of 0s and 1s to another well-defined one). Any classical computation can be reduced to a product of gates.
In a quantum computer, similarly, we would encode the data in sets, or tensor products, of qubits.
All qubits are then evolved in time with unitary evolution operations (coming from Hamiltonians),
corresponding to any computation. Thus a quantum computation is a unitary evolution, which can also be reduced to a product of gates. Each gate will be a quantum unitary operator acting on a tensor product of qubits, usually taken to be an operator acting on the two-qubit state $|a\rangle \otimes |b\rangle$.
Unlike in a classical computation, however, we can now have a wave function containing components in many different individual qubit states, and the unitary evolution will act on all of them at once. This allows for a version of parallel computing that can be far more efficient than anything classical, sometimes allowing an exponentially faster solution than in the classical case. The quintessential case is an initial state of the type
$$|\psi_\pm\rangle = \frac{1}{\sqrt{2}}\left(|\!\uparrow\rangle \pm |\!\downarrow\rangle\right) = \frac{1}{\sqrt{2}}\left(|0\rangle \pm |1\rangle\right), \qquad (4.67)$$
or, in the case of a two-qubit state, an entangled state like $|\psi_\pm\rangle_{AB}$.
The downside of such a calculation is that, when we “collapse the wave function” by making
a measurement, which is the only nonunitary operation (and one which breaks the otherwise
reversibility of the quantum calculation), we obtain only probabilities for individual states; so we
need to make sure that (1) the result of the calculation can be extracted from a particular state and
that (2) the probability of obtaining such a state is high enough that we can do this with a sufficiently
small number of trials.
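As a concrete illustration of such unitary gates (standard examples, not yet defined in the text), the Hadamard gate takes $|0\rangle$ to the superposition of (4.67), and a CNOT gate acting on the result produces an entangled two-qubit state:

```python
import numpy as np

# Minimal sketch (standard gates, not from the text): gates are unitary
# matrices acting on qubits. The Hadamard gate turns |0> into the superposition
# (|0> + |1>)/sqrt(2) of Eq. (4.67); a CNOT gate then entangles two qubits.
Hgate = np.array([[1, 1], [1, -1]]) / np.sqrt(2)       # Hadamard (unitary)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                        # flips b when a = 1

ket0 = np.array([1.0, 0.0])
plus = Hgate @ ket0                                    # (|0> + |1>)/sqrt(2)
assert np.allclose(plus, np.array([1, 1]) / np.sqrt(2))

bell = CNOT @ np.kron(plus, ket0)                      # (|00> + |11>)/sqrt(2)
assert np.allclose(bell, np.array([1, 0, 0, 1]) / np.sqrt(2))
```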
There is much more information to be given about quantum computation, but we will expand upon
it in the second part of the book.

Important Concepts to Remember

• The simplest quantum mechanical Hilbert space is one with two basis elements, i.e., a "two-level" system, such as a spin 1/2 fermion with $s_z = \pm 1/2$, for instance.
• For the spin 1/2 case, the spin operators are $S_i = \frac{\hbar}{2}\sigma_i$, where $\sigma_i$ are the Pauli matrices, satisfying $\sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k$.

• Since $\Delta H_{\rm spin} = -\vec{\mu}\cdot\vec{B}$, and classically the magnetic moment is $\vec{\mu}_L = (e/2m)\vec{L}$, quantum mechanically $\vec{\mu}_S = g(e/2m)\vec{S}$, with $g \simeq 2$.
• Since $|\vec{S}| = \hbar/2$, $\Delta H_{\rm spin} = -\mu\,\vec{\sigma}\cdot\vec{B}$, with $\mu = g(e/2m)(\hbar/2)$.
• Then the eigenvalues (for a magnetic field in the $z$ direction) are $\pm\mu B_z$, and the eigenstates are $|\psi_\pm\rangle = \frac{1}{\sqrt{2}}(|\!\uparrow\rangle \pm |\!\downarrow\rangle)$.

• For a general two-state system Hamiltonian, with elements $H_{11}$, $H_{22}$, $H_{12}$ and $H_{21} = H_{12}^*$, the eigenenergies are $E_\pm = (H_{11}+H_{22})/2 \pm \hbar\Omega/2$, with $\hbar\Omega/2 = \sqrt{\Delta^2 + |V|^2}$, where $\Delta = (H_{22}-H_{11})/2$ and $V = H_{12}$.
• If the initial state is not one of the eigenstates $|\psi_\pm\rangle$, then the time-dependent state oscillates between $|\psi_+\rangle$ and $|\psi_-\rangle$ with frequency $\Omega$.
• If the original state is $|\!\uparrow\rangle$, the probability of switching spin is $p_2(t) = \frac{V^2}{2(\Delta^2 + V^2)}(1 - \cos\Omega t)$. This corresponds to the phenomenon of neutrino oscillation.
• For several two-level systems, for instance two such systems A and B, we can have states of the coupled system that are not separable, $|\psi\rangle_{AB} \neq |\psi\rangle_A \otimes |\psi\rangle_B$; these are called entangled. The system is described by a density matrix $\rho_{AB}$.
• The two-level system is the basis for the qubit, the quantum version of the bits 0, 1 of classical computation.
• In quantum computation the wave function is evolved in time during a calculation, so we have to extract the correct result from it.

Further Reading
See Preskill’s Caltech Notes on Quantum Entanglement and Computation [12]. For more on quantum
computation, see [13]–[15].

Exercises

(1) Using the Pauli matrices $\sigma_i$: $(\sigma_1, \sigma_2, \sigma_3)$, show that we can construct four matrices $\gamma_a$, $a = 1, 2, 3, 4$, as tensor products of Pauli matrices, $\gamma_a = \sigma_i \otimes \sigma_j$, such that $\gamma_a\gamma_b + \gamma_b\gamma_a = 2\delta_{ab}$. Find explicitly an example of such tensor products.
(2) Consider a system with two spin 1/2 electrons in a constant magnetic field parallel to the z
direction, Bz . Assume there is no other degree of freedom (not even momentum). Solve the
Schrödinger equation and find the eigenstates of the system.
(3) Consider the Hamiltonian for a two-level system
$$H = a_0\hat{1} + \sum_{i=1}^{3}a_i\sigma_i. \qquad (4.68)$$
Calculate its eigenstates and the associated energies.


(4) (Neutrino oscillations). Consider a two-level system with eigenstates of the Hamiltonian $|\psi_1\rangle$ and $|\psi_2\rangle$, of energies $E_1$ and $E_2$, respectively ($E_2 > E_1$), corresponding to the mass (and, of course, momentum) eigenstates of two massive neutrinos. Consider also flavor eigenstates $|\phi_1\rangle = |\nu_\mu\rangle$ and $|\phi_2\rangle = |\nu_\tau\rangle$, rotated by a mixing angle $\theta$ with respect to the mass eigenstates, and an initial muon neutrino eigenstate, $|\psi(t=0)\rangle = |\nu_\mu\rangle$, of energy $E$. If the neutrinos are ultra-relativistic ($E_1 \gg m_1$, $E_2 \gg m_2$), find the formula for the oscillation probability (the probability of finding the system in the tau neutrino flavor eigenstate) as a function of $\theta$, time, $E$ and $\Delta m^2 \equiv m_2^2 - m_1^2$.
(5) Find the unitary evolution operator corresponding to the previous case, that of neutrino
oscillations.
(6) Consider the two-qubit state
$$|\phi\rangle = C\left[|1\rangle \otimes |1\rangle + |0\rangle \otimes |1\rangle + a|1\rangle \otimes |0\rangle + |0\rangle \otimes |0\rangle\right], \qquad (4.69)$$
where $C$ is a normalization constant. Find $C$ as a function of $a$. When is the state entangled (at what values of $a$), and when is it not entangled?
(7) Calculate the density matrix of system A for the two-qubit state above (in exercise 6), when the trace is taken over system B, as a function of $a$. Find the maximum and minimum probabilities that system A is in state $|1\rangle$, independently of system B.
5 Position and Momentum and Their Bases; Canonical Quantization, and Free Particles

After analyzing the simplest discrete system, with just two states, in this chapter we move on to the simplest continuum systems, with an infinite-dimensional Hilbert space labeled by a continuous (not countable) index.
A classical particle is usually described in phase space, in terms of the continuous variables $x, p$. In quantum theory, these classical observables are promoted to quantum operators $\hat{X}, \hat{P}$, which have continuous eigenvalues. According to the general theory, each operator has a complete set of eigenstates $|x\rangle, |p\rangle$ such that
$$\hat{X}|x\rangle = x|x\rangle, \qquad \hat{P}|p\rangle = p|p\rangle. \qquad (5.1)$$
Both the basis $\{|x\rangle\}$ and the basis $\{|p\rangle\}$ are orthonormal. But the question is: how does the momentum operator $\hat{P}$ act on the $|x\rangle$ basis?

5.1 Translation Operator

To answer the above question, we first have to define the operator corresponding to translation by an infinitesimal amount $dx$, written as $\hat{T}(dx)$. It is defined as the operator that translates the eigenvalue $x$ by $dx$, so
$$\hat{T}(dx)|x\rangle = |x + dx\rangle. \qquad (5.2)$$
Since this translation operator preserves the scalar product, we have
$$\langle x|x'\rangle = \langle x + dx|x' + dx\rangle = \delta(x - x'), \qquad (5.3)$$
but on the other hand we obtain
$$\langle x|\hat{T}^\dagger(dx)\hat{T}(dx)|x'\rangle = \langle x|x'\rangle, \qquad (5.4)$$
which implies that the translation operator is unitary,
$$\hat{T}^\dagger(dx)\hat{T}(dx) = \hat{1}. \qquad (5.5)$$
Moreover, the translation operation obeys composition, so that
$$\hat{T}(dx)\hat{T}(dx') = \hat{T}(dx + dx'), \qquad (5.6)$$
and the inverse operator must be
$$\hat{T}^{-1}(dx) = \hat{T}(-dx). \qquad (5.7)$$
Finally, translation by zero must give the identity,
$$\hat{T}(dx \to 0) \to \hat{1}. \qquad (5.8)$$

In fact, we can construct the action of the translation operator even for a finite translation by $a$ in the coordinate ($x$) representation, that is, on wave functions:
$$\hat{T}^\dagger_{xx'}(a) = \hat{T}^{-1}_{xx'}(a) = \delta(x - x')\,e^{a\,d/dx'}, \qquad (5.9)$$
since
$$\langle x|\hat{T}^\dagger(a)|\psi\rangle = \langle x + a|\psi\rangle = \psi(x + a) = \int dx'\,\hat{T}^\dagger(a)_{xx'}\,\langle x'|\psi\rangle = e^{a\,d/dx}\psi(x). \qquad (5.10)$$
But, on the other hand, by the Taylor expansion,
$$\psi(x + a) = \psi(x) + a\frac{d}{dx}\psi(x) + \frac{a^2}{2}\frac{d^2}{dx^2}\psi(x) + \cdots = \sum_{n \geq 0}\frac{a^n}{n!}\frac{d^n}{dx^n}\psi(x) \equiv e^{a\,d/dx}\psi(x), \qquad (5.11)$$
which proves the above identity. Then, for an infinitesimal translation, we have
$$\hat{T}_{xx'}(dx) \simeq \left(\hat{1} - dx\frac{d}{dx}\right)_{xx'}. \qquad (5.12)$$
And, in abstract terms, the unitarity property means that
$$\hat{T}^\dagger(dx) = \hat{T}^{-1}(dx) \Rightarrow \hat{T}(dx) = \hat{1} - i\,dx\,\hat{K}, \qquad (5.13)$$
where $\hat{K}$ is Hermitian ($\hat{K}^\dagger = \hat{K}$). Indeed, in the $x$ representation, we find that
$$\hat{K}_{xx'} = \left(-i\frac{d}{dx}\right)_{xx'}, \qquad (5.14)$$
which is indeed Hermitian, as we have proven earlier.
Again in abstract terms, we find that
$$\begin{aligned}
\hat{X}\hat{T}(dx')|x\rangle &= \hat{X}|x + dx'\rangle = (x + dx')|x + dx'\rangle,\\
\hat{T}(dx')\hat{X}|x\rangle &= \hat{T}(dx')\,x|x\rangle = x|x + dx'\rangle,
\end{aligned} \qquad (5.15)$$
meaning that in general, as a condition on operators (since it is valid on an orthonormal basis of the Hilbert space), we have
$$[\hat{X}, \hat{T}(dx')] = dx'\,\hat{1} \Rightarrow [\hat{X}, \hat{K}] = i. \qquad (5.16)$$
This is of course satisfied by (5.14), its form in the coordinate representation.
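The exponential form (5.11) of a finite translation can be tested numerically; a minimal sketch (not from the text), assuming numpy and summing the Taylor series for $\psi(x) = \sin x$:

```python
import numpy as np
from math import factorial

# Minimal numerical sketch (not from the text): the finite-translation operator
# acts on wave functions as e^{a d/dx}, Eq. (5.11), i.e. as the Taylor series
# of psi(x + a). Verify this for psi(x) = sin(x) by summing the series.
def translate(psi_derivs, a, nmax=30):
    """Sum_{n<nmax} a^n/n! d^n psi/dx^n, given the nth derivative at a point."""
    return sum(a**n / factorial(n) * psi_derivs(n) for n in range(nmax))

x, a = 0.3, 0.9
# the nth derivative of sin at x cycles through sin, cos, -sin, -cos:
derivs = lambda n: np.sin(x + n * np.pi / 2)
assert np.isclose(translate(derivs, a), np.sin(x + a))   # psi(x + a)
```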


To continue, and to see that in fact K̂ is P̂ up to a real constant, we review a bit of classical
mechanics.

5.2 Momentum in Classical Mechanics as a Generator of Translations

The formal definition of dynamics in classical mechanics is given in terms of the Hamilton equations on the phase space $(q_i, p_i)$. The Hamilton equations become more formal in terms of Poisson brackets $\{,\}_{\rm P.B.}$, defined for functions $f$ and $g$ on the phase space as
$$\{f(q, p), g(q, p)\}_{\rm P.B.} = \sum_i\left(\frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i}\right). \qquad (5.17)$$
This is an antisymmetric bracket,
$$\{f, g\}_{\rm P.B.} = -\{g, f\}_{\rm P.B.}, \qquad (5.18)$$
that vanishes on (complex) numbers, which are not functions on phase space,
$$\{f(q, p), c\}_{\rm P.B.} = 0, \qquad (5.19)$$
and satisfies the Jacobi identity
$$\{\{A, B\}, C\} + \{\{B, C\}, A\} + \{\{C, A\}, B\} = 0. \qquad (5.20)$$

In terms of (5.17), the Hamilton equations
$$\frac{\partial H}{\partial p_i} = \dot{q}_i, \qquad \frac{\partial H}{\partial q_i} = -\dot{p}_i \qquad (5.21)$$
become
$$\dot{q}_i = \{q_i, H\}_{\rm P.B.}, \qquad \dot{p}_i = \{p_i, H\}_{\rm P.B.}. \qquad (5.22)$$
For an arbitrary function of phase space and time $f(q, p; t)$, we have the time evolution
$$\frac{df(q, p; t)}{dt} = \frac{\partial f}{\partial q_i}\dot{q}_i + \frac{\partial f}{\partial p_i}\dot{p}_i + \frac{\partial f}{\partial t} = \{f, H\}_{\rm P.B.} + \frac{\partial f}{\partial t}. \qquad (5.23)$$
This means that the Hamiltonian, through its Poisson brackets, is the generator of time translations (the motion of the system in time).
Next, consider an infinitesimal canonical transformation on phase space, from $(q, p)$ to $(Q, P)$, and take the active point of view, which is that this is a transformation that changes points in phase space from $(q, p)$ to $(Q, P) = (q + \delta q, p + \delta p)$. Then we have the generating function of the canonical transformation,
$$F(q, P, t) = qP + \epsilon\,G(q, P, t), \qquad (5.24)$$
where $\epsilon$ is an infinitesimal parameter. It generates the canonical transformation through the equations
$$p \equiv \frac{\partial F}{\partial q} = P + \epsilon\frac{\partial G}{\partial q}, \qquad Q \equiv \frac{\partial F}{\partial P} = q + \epsilon\frac{\partial G}{\partial P} \simeq q + \epsilon\frac{\partial G}{\partial p}, \qquad (5.25)$$
where in the last equation we used $P \simeq p$, and so the variations are
$$\delta p = -\epsilon\frac{\partial G}{\partial q} = \epsilon\{p, G\}_{\rm P.B.}, \qquad \delta q = \epsilon\frac{\partial G}{\partial p} = \epsilon\{q, G\}_{\rm P.B.}. \qquad (5.26)$$
Doing the same thing for the Hamiltonian $H(q, p)$, we obtain
$$\delta H(q, p) = \epsilon\{H(q, p), G\}_{\rm P.B.} - \epsilon\frac{\partial G}{\partial t} = -\epsilon\frac{dG}{dt}. \qquad (5.27)$$
That means that if $dG/dt = 0$, we have $\delta H = 0$. This implies that a constant of motion (a function $G(q, p)$ that is independent of time) is a generating function of canonical transformations that leaves the Hamiltonian $H$ invariant.

In particular, considering a cyclic coordinate $q_i$, i.e., one with $\partial H/\partial q_i = 0$, it follows from the Hamilton equations that the corresponding momentum is a constant of motion, $dp_i/dt = 0$. Take this constant of motion as the generating function $G$, i.e.,
$$G(q, p) = p_i. \qquad (5.28)$$
Then the canonical transformation gives
$$\delta q_j = \epsilon\,\delta_{ij}, \qquad \delta p_j = 0, \qquad (5.29)$$
which means that the momentum $p_i$ is the generator of translations in the coordinate $q_i$ canonically conjugate to it ($\delta_{{\rm can},\,p_i}q_i = \epsilon$).
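These statements can be checked with a small numerical Poisson bracket; a minimal sketch (not from the text), assuming numpy and simple central differences:

```python
import numpy as np

# Minimal numerical sketch (not from the text): the Poisson bracket (5.17) for
# one degree of freedom, evaluated by central differences. We verify {q, p} = 1
# and that G = p generates translations: delta q = eps {q, G} = eps, delta p = 0.
def poisson(f, g, q, p, h=1e-6):
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

q0, p0 = 0.4, 1.3                     # arbitrary sample phase-space point
Q = lambda q, p: q
P = lambda q, p: p

assert np.isclose(poisson(Q, P, q0, p0), 1.0)   # {q, p} = 1, cf. Eq. (5.34)
assert np.isclose(poisson(P, P, q0, p0), 0.0)   # delta p = eps {p, G = p} = 0
```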

5.3 Canonical Quantization

It follows that in the quantum case, where classical observables correspond to operators, we can equate the momentum operator $\hat{P}$ with the translation generator $\hat{K}$, at least up to a multiplicative real constant. In fact, the dimensions of $\hat{K}$ (inverse length) and $\hat{P}$ (momentum) are different, so linking them requires at least a constant with dimensions of momentum times length, or in other words the dimensions of action. One such constant is $\hbar = h/(2\pi)$, so at least up to a number, we have
$$\hat{K} = \frac{\hat{P}}{\hbar}. \qquad (5.30)$$
In fact, there is no further multiplicative number. A way to see this is to consider de Broglie's relation between the wavelength $\lambda$ and the momentum $p$ of a particle,
$$p = \frac{h}{\lambda} \Rightarrow \frac{p}{\hbar} = k = \frac{2\pi}{\lambda}, \qquad (5.31)$$
so $p/\hbar$ is the wave number, and it is natural to associate it with the translation operator (for instance on a lattice).
Finally, then, when acting on a wave function $\psi(x) = \langle x|\psi\rangle$, the momentum operator becomes
$$\hat{P}_{xx'} = \left(-i\hbar\frac{d}{dx}\right)_{xx'}. \qquad (5.32)$$
In this $x$ representation, we have the commutator
$$[\hat{X}, \hat{P}] = \left[x, -i\hbar\frac{d}{dx}\right] = i\hbar, \qquad (5.33)$$
which replaces the classical canonical Poisson bracket formula
$$\{q, p\}_{\rm P.B.} = 1. \qquad (5.34)$$

Moreover, the previous analysis applies to any coordinates $q_i$ and the momenta $p_i$ canonically conjugate to them, so we obtain the general canonical quantization conditions
$$[\hat{X}_i, \hat{P}_j] = i\hbar\,\delta_{ij}, \qquad (5.35)$$
replacing the classical canonical Poisson brackets
$$\{q_i, p_j\}_{\rm P.B.} = \delta_{ij}. \qquad (5.36)$$
In other words, in the general canonical quantization prescription, the Poisson brackets are replaced with the commutator divided by $i\hbar$,
$$\{,\}_{\rm P.B.} \to \frac{1}{i\hbar}[,]. \qquad (5.37)$$
The other canonical Poisson brackets are
$$\{q_i, q_j\}_{\rm P.B.} = \{p_i, p_j\}_{\rm P.B.} = 0, \qquad (5.38)$$
and under canonical quantization they become
$$[\hat{X}_i, \hat{X}_j] = [\hat{P}_i, \hat{P}_j] = 0. \qquad (5.39)$$
These relations are indeed satisfied: for $i = j$ they are trivial, and for $i \neq j$ the variables are independent (do not influence each other), so the corresponding operators commute.
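The canonical commutator (5.35) can be illustrated on a discretized line; a minimal sketch (not from the text), assuming numpy, exact only in the continuum limit:

```python
import numpy as np

# Minimal numerical sketch (not from the text): discretize X and P = -i hbar d/dx
# on a grid and verify [X, P] psi = i hbar psi, Eq. (5.33), acting on a smooth,
# localized wave function (up to discretization error, away from the edges).
hbar = 1.0
N, L = 2000, 40.0
x = np.linspace(-L / 2, L / 2, N)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2)                          # sample Gaussian wave function

def P(f):
    """Central-difference approximation of -i hbar d/dx."""
    return -1j * hbar * np.gradient(f, dx)

commutator = x * P(psi) - P(x * psi)             # [X, P] psi on the grid
interior = slice(10, -10)                        # exclude one-sided edge stencils
assert np.allclose(commutator[interior], 1j * hbar * psi[interior], atol=1e-3)
```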

5.4 Operators in Coordinate and Momentum Spaces

Operators in Coordinate (x) Space


The eigenstates $|x\rangle$ of the coordinate operator $\hat{X}$ are orthonormal, $\langle x|x'\rangle = \delta(x - x')$, and in the coordinate representation we use the completeness relation written in terms of them, $\hat{1} = \int dx\,|x\rangle\langle x|$.
Then the matrix element of an operator $\hat{A}$ between the states $|\psi\rangle$ and $|\chi\rangle$ is given by
$$\langle\chi|\hat{A}|\psi\rangle = \int dx\int dx'\,\langle\chi|x\rangle\langle x|\hat{A}|x'\rangle\langle x'|\psi\rangle = \int dx\,dx'\,\chi^*(x)\,A_{xx'}\,\psi(x'). \qquad (5.40)$$
For an observable corresponding to a function of position, $\hat{A} = \hat{f}(\hat{X})$, we have
$$A_{xx'} = \langle x|\hat{f}(\hat{X})|x'\rangle = f(x')\langle x|x'\rangle = f(x')\,\delta(x - x'). \qquad (5.41)$$
On the other hand, for $\hat{A} = \hat{P}$, we have
$$A_{xx'} = \left(-i\hbar\frac{d}{dx}\right)_{xx'} = -i\hbar\,\delta(x - x')\frac{d}{dx'}, \qquad (5.42)$$
as we saw in Chapter 2, so
$$\langle\chi|\hat{P}|\psi\rangle = \int dx\int dx'\,\chi^*(x)\left(-i\hbar\,\delta(x - x')\frac{d}{dx'}\right)\psi(x') = \int dx\,\chi^*(x)\left(-i\hbar\frac{d}{dx}\right)\psi(x). \qquad (5.43)$$

Operators in Momentum (p) Space


In momentum space we use the momentum eigenstates $|p\rangle$, which are also orthonormal, $\langle p|p'\rangle = \delta(p - p')$, and the identity written as a completeness relation in terms of them, $\hat{1} = \int dp\,|p\rangle\langle p|$.
Then the matrix element (5.40) of an operator $\hat{A}$ is
$$\langle\chi|\hat{A}|\psi\rangle = \int dp\int dp'\,\langle\chi|p\rangle\langle p|\hat{A}|p'\rangle\langle p'|\psi\rangle = \int dp\,dp'\,\chi^*(p)\,A_{pp'}\,\psi(p'), \qquad (5.44)$$
where $\langle p|\psi\rangle = \psi(p)$ is the wave function in momentum space.


However, to do this we have to calculate the transformation of a wave function from coordinate space to momentum space. Consider the matrix element of $\hat{P}$ between the $\langle x|$ and the $|p\rangle$ states, and introduce the completeness relation in coordinate space, giving
$$\langle x|\hat{P}|p\rangle = \int dx'\,\langle x|\hat{P}|x'\rangle\langle x'|p\rangle = \int dx'\,P_{xx'}\langle x'|p\rangle = \int dx'\left(-i\hbar\,\delta(x - x')\frac{d}{dx'}\right)\langle x'|p\rangle = -i\hbar\frac{d}{dx}\langle x|p\rangle. \qquad (5.45)$$
On the other hand, by using $\hat{P}|p\rangle = p|p\rangle$, we also obtain $p\langle x|p\rangle$, so
$$-i\hbar\frac{d}{dx}\langle x|p\rangle = p\langle x|p\rangle, \qquad (5.46)$$
with solution
$$\langle x|p\rangle = N e^{ipx/\hbar}. \qquad (5.47)$$
The normalization constant $N$ is found from the orthonormality condition, by inserting the completeness relation for $|p\rangle$,
$$\delta(x - x') = \langle x|x'\rangle = \int dp\,\langle x|p\rangle\langle p|x'\rangle. \qquad (5.48)$$
Substituting the solution above into this equation, we find
$$\delta(x - x') = N^2\int dp\,e^{ipx/\hbar}e^{-ipx'/\hbar} = N^2\,2\pi\hbar\,\delta(x - x'), \qquad (5.49)$$
meaning that $N = 1/\sqrt{2\pi\hbar}$, and so
$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\,e^{ipx/\hbar}. \qquad (5.50)$$
This means that the transformation between the $x$ basis and the $p/\hbar$ basis (which has the same dimensions) is just the Fourier transform,
$$\langle x|\psi\rangle = \int dp\,\langle x|p\rangle\langle p|\psi\rangle \Rightarrow \psi(x) = \frac{1}{\sqrt{2\pi\hbar}}\int dp\,e^{ipx/\hbar}\,\psi(p), \qquad (5.51)$$
and the inverse relation (the inverse Fourier transform) is
$$\langle p|\psi\rangle = \int dx\,\langle p|x\rangle\langle x|\psi\rangle \Rightarrow \psi(p) = \frac{1}{\sqrt{2\pi\hbar}}\int dx\,e^{-ipx/\hbar}\,\psi(x). \qquad (5.52)$$
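The Fourier-transform relation (5.52) and the unitarity of the change of basis can be checked numerically; a minimal sketch (not from the text), assuming numpy and a sample Gaussian wave function:

```python
import numpy as np

# Minimal numerical sketch (not from the text): compute psi(p) from psi(x) by a
# direct discretization of the Fourier transform (5.52) for a sample Gaussian,
# and check that the norm is preserved (the change of basis is unitary).
hbar = 1.0
x = np.linspace(-20, 20, 2001)
dx = x[1] - x[0]
psi_x = np.pi**-0.25 * np.exp(-x**2 / 2)           # normalized sample psi(x)

p = np.linspace(-8, 8, 401)
dp = p[1] - p[0]
# psi(p) = (2 pi hbar)^(-1/2) * integral dx e^{-i p x/hbar} psi(x):
psi_p = (np.exp(-1j * np.outer(p, x) / hbar) * psi_x).sum(axis=1) * dx \
        / np.sqrt(2 * np.pi * hbar)

assert np.isclose((np.abs(psi_x)**2).sum() * dx, 1.0, atol=1e-6)
assert np.isclose((np.abs(psi_p)**2).sum() * dp, 1.0, atol=1e-6)
```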

5.5 The Free Nonrelativistic Particle

The Schrödinger equation for an arbitrary time-dependent state is
$$i\hbar\frac{d}{dt}|\psi\rangle = \hat{H}|\psi\rangle. \qquad (5.53)$$
But for a free classical particle, $H(q, p) = p^2/2m$ (there is no potential), leading to the quantum Hamiltonian
$$\hat{H} = \frac{\hat{P}^2}{2m}. \qquad (5.54)$$
We will solve the Schrödinger equation by the general procedure we described: first we consider the eigenvalue problem for the Hamiltonian,
$$\hat{H}|\psi_E\rangle = E|\psi_E\rangle, \qquad (5.55)$$
and for the eigenstate we can solve the time evolution exactly,
$$|\psi_E(t)\rangle = e^{-iEt/\hbar}|\psi_E(t = 0)\rangle = e^{-iEt/\hbar}|\psi_E\rangle. \qquad (5.56)$$
In our case, the Hamiltonian eigenvalue problem is
$$\hat{H}|\psi_E\rangle = \frac{\hat{P}^2}{2m}|\psi_E\rangle = E|\psi_E\rangle \Rightarrow \left(\frac{\hat{P}^2}{2m} - E\right)|\psi_E\rangle = 0. \qquad (5.57)$$
This means that $|\psi_E\rangle = |p\rangle$ is a momentum eigenstate, with $p = \pm\sqrt{2mE}$. That in turn means that we have a degeneracy, with states
$$|E, +\rangle \equiv |p = +\sqrt{2mE}\rangle, \qquad |E, -\rangle \equiv |p = -\sqrt{2mE}\rangle. \qquad (5.58)$$
Since the system is degenerate with respect to energy, we have, in general,
$$|\psi_E(t)\rangle = e^{-iEt/\hbar}\left(\alpha|p = +\sqrt{2mE}\rangle + \beta|p = -\sqrt{2mE}\rangle\right). \qquad (5.59)$$
Then the coordinate ($x$) space wave function for a given energy is
$$\langle x|\psi_E(t)\rangle = \psi_E(x, t) = \frac{e^{-iEt/\hbar}}{\sqrt{2\pi\hbar}}\left(\alpha\,e^{+ix\sqrt{2mE}/\hbar} + \beta\,e^{-ix\sqrt{2mE}/\hbar}\right). \qquad (5.60)$$
Choosing a sign for the momentum (and so a state of given momentum), say $p > 0$, so that $\alpha = 1, \beta = 0$, means that the probability density of finding the free particle in this state at position $x$ is
$$\frac{dP(x)}{dx} = |\langle x|\psi_E(t)\rangle|^2 = \frac{1}{2\pi\hbar}, \qquad (5.61)$$
which is constant. This means that the position $x$ is completely arbitrary once we fix the momentum $p$. We will see the implications of this fact in more detail in the next chapter.
Note that the above relation is consistent with the normalization condition
$$1 = \int dx\frac{dP(x)}{dx} = \int dx\,|\langle x|\psi\rangle|^2 = \int dx\,\langle\psi|x\rangle\langle x|\psi\rangle = \langle\psi|\psi\rangle, \qquad (5.62)$$
which is therefore satisfied.
We should comment on the fact that we want the variables $x, p$ to be approximately classical, that is, defined up to some error. The position being arbitrary in the above discussion arises because having a perfectly defined momentum $p$ is an idealization. In reality, we need to consider so-called wave packets, linear combinations (superpositions) of various momentum states with varying $p$. In this case, we can approximately localize the position $x$ around an average value, perhaps a moving (time-dependent) one. We will see in the next chapter that if we center the state of given momentum $p$ on some $x$ by adding a multiplicative Gaussian profile at time $t = 0$ then, as $t$ progresses, this wave packet will spread out in space, tending towards the same constant wave function with respect to $x$.

Important Concepts to Remember

• A constant of motion is a generating function of canonical transformations that leaves the Hamiltonian invariant; in particular, the momentum $p_i$ is the generator of translations in the coordinate $q_i$ canonically conjugate to it.
• The generator of translations is $\hat{K} = \hat{P}/\hbar$, so $\hat{P}_{xx'} = (-i\hbar\,d/dx)_{xx'}$.
• The canonical quantization conditions are $[\hat{X}_i, \hat{P}_j] = i\hbar\,\delta_{ij}$ (and $[\hat{X}_i, \hat{X}_j] = [\hat{P}_i, \hat{P}_j] = 0$), replacing $\{q_i, p_j\}_{\rm P.B.} = \delta_{ij}$ (and $\{q_i, q_j\}_{\rm P.B.} = \{p_i, p_j\}_{\rm P.B.} = 0$) in classical mechanics; so canonical quantization means that we have the replacement $\{,\}_{\rm P.B.} \to (1/i\hbar)[,]$.
• In $x$ space, $\langle x|f(\hat{X})|x'\rangle = f(x')\delta(x - x')$ and $\langle\chi|\hat{P}|\psi\rangle = \int dx\,\chi^*(x)\,(-i\hbar\,d/dx)\,\psi(x)$.
• The product of coordinate and momentum space basis elements is $\langle x|p\rangle = e^{ipx/\hbar}/\sqrt{2\pi\hbar}$, so the wave function in momentum space is the Fourier transform, $\psi(p) = (1/\sqrt{2\pi\hbar})\int dx\,e^{-ipx/\hbar}\psi(x)$.
• For a free particle, the time evolution is $|\psi_E(t)\rangle = e^{-iEt/\hbar}|\psi_E\rangle$, with $E|\psi_E\rangle = (\hat{P}^2/2m)|\psi_E\rangle$, and there is a degeneracy with respect to momentum, $|E, +\rangle = |p = +\sqrt{2mE}\rangle$, $|E, -\rangle = |p = -\sqrt{2mE}\rangle$.
• The state of perfectly defined momentum $p$ is an idealization. In reality we have wave packets, states given as linear combinations of momentum states, peaked around a given classical momentum.

Further Reading
See any other fairly advanced quantum mechanics book, e.g., [2, 1, 4, 3].

Exercises

(1) Consider possible terms in the (quantum) Hamiltonian for a one-dimensional system,
$$H_1 = \int dx\,\psi^*(x)\,e^{a\,d/dx}\,\psi(x), \qquad H_2 = \int dx\,\psi^*(x)\,\frac{1}{d^2/dx^2}\,\psi(x), \qquad (5.63)$$
where $\psi(x)$ is a continuous complex function of the spatial variable $x$, and $1/(d^2/dx^2)$ is understood as the inverse of an operator, acting on the function. Can these terms be understood as coming from local interactions in $x$ (interactions defined at a single point $x$), and if so, why? Use a calculation to explain your reasoning.

(2) Consider the angular momentum of a classical particle, $\vec{L} = \vec{r}\times\vec{p}$, and a system for which it is invariant. Use it as a generating function to define the relevant canonical transformations.
(3) Consider a system with Lagrangian
$$L = \frac{\dot{q}_1^2}{2} + \dot{q}_1\dot{q}_2 + \frac{\dot{q}_2^2}{2} - k(q_1 + q_2). \qquad (5.64)$$
Write down the Hamiltonian and Poisson brackets, and canonically quantize the system.
(4) Consider a system of $N$ free particles in one spatial dimension $x$. Write down the general eigenstate and eigenenergy, its degeneracy, and its time evolution.
(5) Consider a system with Hamiltonian
$$H = \frac{p^2}{2m} + \alpha p + \beta x^2 + \gamma x^3. \qquad (5.65)$$
Write down its Schrödinger equation for wave functions in coordinate space. If $\beta = \gamma = 0$, write down its general eigenstate wave function and time evolution, for a given energy $E$.
(6) Consider a free particle in three spatial dimensions, and a system of coordinates relative to a point $O$ not on the path of the particle. Write down the integrals of motion conjugate to Cartesian, polar, and spherical coordinates respectively, and the canonical quantization for each set of coordinates.
(7) Considering the free particle in three spatial dimensions from exercise 6, what is the probability density of finding the particle at a point $\vec{r}$, given a momentum $\vec{p}$?
6 The Heisenberg Uncertainty Principle and Relations, and Gaussian Wave Packets

In this chapter, through the explicit analysis of Gaussian wave packets, we will see that the product of the variances of canonically conjugate variables has a nonzero value. We will then prove a theorem stating that such products in general obey a lower bound, called the Heisenberg uncertainty relation(s). Finally, we will show that the minimum is attained precisely by these Gaussian wave packets.

6.1 Gaussian Wave Packets

As we said in the previous chapter, free particles of a given momentum are an abstraction, and moreover imply a coordinate that is completely arbitrary. In a realistic situation, a superposition of waves, or wave packet, represents a real particle, as in Fig. 6.1. The quintessential example is the Gaussian wave packet at $t = 0$, where we multiply a free wave of given momentum by a Gaussian in position space, centered on $x = 0$ and with width $\sigma = d$, so that
$$\langle x|\psi_p(t = 0)\rangle = N e^{ipx/\hbar}e^{-x^2/2\sigma^2}. \qquad (6.1)$$
The probability density is a pure Gaussian (since the wave multiplying the Gaussian has a constant probability density):
$$\frac{dP}{dx} = |\langle x|\psi\rangle|^2 = N^2 e^{-x^2/d^2}. \qquad (6.2)$$

Imposing normalization of the probability, $\int dx\,dP/dx = 1$, we obtain
$$N^2\int_{-\infty}^{+\infty}dx\,e^{-x^2/d^2} = N^2 d\sqrt{\pi} = 1 \Rightarrow N = \frac{1}{(\pi d^2)^{1/4}}, \qquad (6.3)$$
so the Gaussian wave packet at $t = 0$ is
$$\langle x|\psi_p(t = 0)\rangle = \frac{\sqrt{2\pi\hbar}}{(\pi d^2)^{1/4}}\,\langle x|p\rangle\, e^{-x^2/2d^2}. \qquad (6.4)$$
Now we calculate the averages of the $x, x^2, p, p^2$ operators in this Gaussian wave packet state. First,
$$\langle x\rangle \equiv \langle\psi|\hat{X}|\psi\rangle = \int dx\,\langle\psi|x\rangle\langle x|\hat{X}|\psi\rangle = \int_{-\infty}^{+\infty}dx\,x\,|\psi(x)|^2 = 0, \qquad (6.5)$$

Figure 6.1 A wave packet, representing a particle. (a) A static wave packet, centered on a position in $x$ space. (b) A time-dependent wave packet, where the central position in $x$ moves with time.

by the symmetry in $x$ of the integral. The average of $x^2$ is nonzero, however:
$$\langle x^2\rangle \equiv \langle\psi|\hat{X}^2|\psi\rangle = \int_{-\infty}^{+\infty}dx\,x^2|\psi(x)|^2 = \int_{-\infty}^{+\infty}dx\,\frac{x^2}{d\sqrt{\pi}}e^{-x^2/d^2} = \frac{d^2}{\sqrt{\pi}}\,\Gamma(3/2) = \frac{d^2}{2}. \qquad (6.6)$$

Next, the average of $p$ is also nonzero,
$$\langle p\rangle \equiv \langle\psi|\hat{P}|\psi\rangle = \int dx\int dx'\,\langle\psi|x\rangle\langle x|\hat{P}|x'\rangle\langle x'|\psi\rangle = \int dx\int dx'\,\psi^*(x)\left(-i\hbar\,\delta(x - x')\frac{d}{dx'}\right)\psi(x') = -i\hbar\int_{-\infty}^{+\infty}dx\,\psi^*(x)\frac{d}{dx}\psi(x) = \int_{-\infty}^{+\infty}dx\,|\psi(x)|^2\left(p + \frac{i\hbar x}{d^2}\right) = p. \qquad (6.7)$$

And, finally, the average of $p^2$ is
$$\langle p^2\rangle \equiv \langle\psi|\hat{P}^2|\psi\rangle = (-i\hbar)^2\int dx\,\psi^*(x)\frac{d^2}{dx^2}\psi(x) = (-i\hbar)^2\int dx\,|\psi(x)|^2\left[-\frac{1}{d^2} + \frac{1}{(-i\hbar)^2}\left(p + \frac{i\hbar x}{d^2}\right)^2\right] = p^2 + \frac{\hbar^2}{d^2} - \frac{\hbar^2}{d^4}\langle x^2\rangle = p^2 + \frac{\hbar^2}{2d^2}. \qquad (6.8)$$

Now we can calculate the deviations in $x$ and $p$,
$$\langle(\Delta x)^2\rangle = \langle(x - \langle x\rangle)^2\rangle = \langle x^2\rangle - \langle x\rangle^2 = \frac{d^2}{2}, \qquad \langle(\Delta p)^2\rangle = \langle(p - \langle p\rangle)^2\rangle = \langle p^2\rangle - \langle p\rangle^2 = \frac{\hbar^2}{2d^2}. \qquad (6.9)$$
Multiplying them, we obtain
$$\langle(\Delta x)^2\rangle\langle(\Delta p)^2\rangle = \frac{\hbar^2}{4}. \qquad (6.10)$$
We will see that this actually yields the lower bound coming from the general Heisenberg uncertainty relations.

For completeness, we will write the Gaussian wave packet with momentum $p$ in momentum space (we set $a = i(p - p')d^2/\hbar$):
$$\langle p'|\psi_p\rangle = \psi_p(p') = \int dx\,\langle p'|x\rangle\langle x|\psi_p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\frac{1}{(\pi d^2)^{1/4}}\int_{-\infty}^{+\infty}dx\,e^{-ip'x/\hbar}\exp\left(+\frac{ipx}{\hbar} - \frac{x^2}{2d^2}\right) = \frac{1}{\sqrt{2\pi\hbar}}\frac{e^{+a^2/2d^2}}{(\pi d^2)^{1/4}}\int_{-\infty}^{+\infty}dx\,e^{-(x-a)^2/2d^2} = \left(\frac{d^2}{\pi\hbar^2}\right)^{1/4}e^{-(p'-p)^2d^2/2\hbar^2}. \qquad (6.11)$$
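The averages (6.5)-(6.10) can be reproduced numerically; a minimal sketch (not from the text), assuming numpy and arbitrary sample values of $d$ and the packet momentum:

```python
import numpy as np

# Minimal numerical sketch (not from the text; d and pbar are arbitrary sample
# values): evaluate <x>, <x^2>, <p>, <p^2> for the Gaussian wave packet and
# verify <(Dx)^2> = d^2/2, <(Dp)^2> = hbar^2/(2 d^2), product hbar^2/4.
hbar, d, pbar = 1.0, 1.5, 2.0
x = np.linspace(-25, 25, 20001)
dx = x[1] - x[0]
psi = (np.pi * d**2)**-0.25 * np.exp(1j * pbar * x / hbar - x**2 / (2 * d**2))

dpsi = np.gradient(psi, dx)
avg = lambda f: np.real(np.sum(np.conj(psi) * f) * dx)

var_x = avg(x**2 * psi) - avg(x * psi)**2
p_avg = np.real(np.sum(np.conj(psi) * (-1j * hbar) * dpsi) * dx)
p2_avg = np.sum(np.abs(-1j * hbar * dpsi)**2) * dx   # <p^2> = ||P psi||^2
var_p = p2_avg - p_avg**2

assert np.isclose(var_x, d**2 / 2, rtol=1e-3)
assert np.isclose(var_p, hbar**2 / (2 * d**2), rtol=1e-3)
assert np.isclose(var_x * var_p, hbar**2 / 4, rtol=1e-2)
```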

6.2 Time Evolution of Gaussian Wave Packet

We have defined the Gaussian wave packet with momentum $p_0$ at $t = 0$, but in order to find its time evolution, we need to remember the time evolution of states of definite momentum,
$$|\psi_E(t)\rangle = e^{-iEt/\hbar}|p = \sqrt{2mE}\rangle, \qquad (6.12)$$
which we could also write in the coordinate representation.
But we need to calculate the time evolution operator for the more general wave function that we have. In order to do this, we use the evolution operator formalism, where
$$|\psi(t)\rangle = \hat{U}(t)|\psi(t = 0)\rangle \qquad (6.13)$$
is seen to imply (given the relation (6.12), which defines the time evolution for $|p\rangle$ states, with $p = \sqrt{2mE}$) that
$$\hat{U}(t) = \int_{-\infty}^{+\infty}dp\,|p\rangle\langle p|\,e^{-iE(p)t/\hbar} = \int_{-\infty}^{+\infty}dp\,|p\rangle\langle p|\,e^{-ip^2t/2m\hbar}. \qquad (6.14)$$
The matrix elements of this operator in the coordinate basis $|x\rangle$ are
$$U(x, x'; t) \equiv \langle x|\hat{U}(t)|x'\rangle = \int_{-\infty}^{+\infty}dp\,\langle x|p\rangle\langle p|x'\rangle\,e^{-ip^2t/2m\hbar} = \int dp\,\frac{e^{ip(x-x')/\hbar}}{2\pi\hbar}\,e^{-ip^2t/2m\hbar} = \sqrt{\frac{m}{2\pi i\hbar t}}\,e^{im(x-x')^2/2\hbar t}. \qquad (6.15)$$
Taking the relation (6.13) in the coordinate representation, i.e., multiplying with $\langle x|$ and then introducing a completeness relation $\int dx'\,|x'\rangle\langle x'|$ on the right of the evolution operator, we get
$$\psi(x, t) = \int dx'\,U(x, x'; t)\,\psi(x', t = 0), \qquad (6.16)$$
with $U(x, x'; t)$ as above and $\psi(x', t = 0)$ the Gaussian wave packet from the previous section:
$$\psi(x', t = 0) = \frac{e^{ip_0x'/\hbar}\,e^{-x'^2/2d^2}}{\sqrt{d\sqrt{\pi}}}. \qquad (6.17)$$

On performing the Gaussian $x'$ integral, the result is
$$\langle x|\psi(t)\rangle = \frac{1}{\sqrt{\sqrt{\pi}\,(d + i\hbar t/md)}}\exp\left[-\frac{(x - p_0t/m)^2}{2d^2(1 + i\hbar t/md^2)}\right]\exp\left[\frac{ip_0}{\hbar}\left(x - \frac{p_0t}{2m}\right)\right], \qquad (6.18)$$
which implies the probability density
$$\rho(x, t) \equiv \frac{dP(x, t)}{dx} = |\psi(x, t)|^2 = \frac{1}{\sqrt{\pi}\sqrt{d^2 + \hbar^2t^2/m^2d^2}}\exp\left[-\frac{(x - p_0t/m)^2}{d^2 + \hbar^2t^2/m^2d^2}\right]. \qquad (6.19)$$
We observe two properties of this result:
• the mean value moves with momentum $p_0$, as the density is a function of $x - p_0t/m$;
• the width $\Delta x(t)$ of the Gaussian grows in time as
$$\Delta x = \frac{d}{\sqrt{2}}\sqrt{1 + \frac{\hbar^2t^2}{m^2d^4}}. \qquad (6.20)$$
That is, the initial wave function spreads out until it becomes more like the wave associated with a free particle of given momentum.

6.3 Heisenberg Uncertainty Relations

We have seen that $\Delta x$ is correlated with $\Delta p$, and also that $[\hat{X}, \hat{P}] = i\hbar$. But how general is this correlation, in the case where two operators do not commute, $[\hat{A}, \hat{B}] \neq 0$?
In the case where operators do commute, which means that the related observables are compatible, the states have simultaneous eigenvalues,
$$\hat{A}\hat{B}|a, b\rangle = \hat{A}b|a, b\rangle = ab|a, b\rangle = \hat{B}\hat{A}|a, b\rangle = \hat{B}a|a, b\rangle. \qquad (6.21)$$
In the case where they do not commute, this would be impossible (we would obtain a contradiction from the above relation). For instance, $\hat{X}$ and $\hat{P}$ are incompatible and, as we saw, if we localize the momentum ($p$ is given), then space is completely delocalized ($x$ is arbitrary); moreover, if $\Delta x \neq 0$, we also have $\Delta p \neq 0$.
In fact this is part of a general theory of noncommuting operators. We have the following:

Theorem If $[\hat{A}, \hat{B}] \neq 0$ and $\hat{A}, \hat{B}$ are associated with observables, so that they are Hermitian operators, then we have
$$\langle(\Delta A)^2\rangle\langle(\Delta B)^2\rangle \geq \frac{1}{4}\left|\langle[\hat{A}, \hat{B}]\rangle\right|^2. \qquad (6.22)$$
These relations are known as Heisenberg's uncertainty relations.

Proof First, note that if $[\hat{A}, \hat{B}] = i\hat{C}$ then $\hat{C}$ is Hermitian, $\hat{C} = \hat{C}^\dagger$. Indeed,
$$(i\hat{C})^\dagger = -i\hat{C}^\dagger = ([\hat{A}, \hat{B}])^\dagger = [\hat{B}^\dagger, \hat{A}^\dagger] = [\hat{B}, \hat{A}] = -[\hat{A}, \hat{B}] = -i\hat{C}. \qquad (6.23)$$

Also, if { Â, B̂} = D̂ then D̂ is Hermitian, D̂† = D̂:


D† = ({ Â, B̂}) † = { B̂† , † } = { B̂, Â} = { Â, B̂} = D̂. (6.24)
The quantities in the Heisenberg uncertainty relations are defined as
[ Â, B̂] ≡ ψ|( Â B̂ − B̂ Â)|ψ
(ΔA) 2  ≡ ψ|( Â − A) 2 |ψ (6.25)
(ΔB) 2  ≡ ψ|( B̂ − B) 2 |ψ.

Then note that if Â, B̂ are Hermitian, Â − ⟨A⟩ and B̂ − ⟨B⟩ are also Hermitian.
Since Â, B̂ are operators acting on a Hilbert space, consider the states |φ⟩, |χ⟩ in this Hilbert space,
which, as we saw in Chapters 1 and 2, obey the Cauchy–Schwarz inequality,

|⟨φ|χ⟩|² ≤ ‖φ‖²‖χ‖² = ⟨φ|φ⟩⟨χ|χ⟩, (6.26)

with equality only if |χ⟩ ∝ |φ⟩.
Now consider the states

|φ⟩ = (Â − ⟨A⟩)|ψ⟩, |χ⟩ = (B̂ − ⟨B⟩)|ψ⟩. (6.27)
The norms of these states are (using the fact that Â − ⟨A⟩ and B̂ − ⟨B⟩ are Hermitian)

‖φ‖² = ⟨ψ|(Â − ⟨A⟩)²|ψ⟩ = ⟨(ΔA)²⟩
‖χ‖² = ⟨ψ|(B̂ − ⟨B⟩)²|ψ⟩ = ⟨(ΔB)²⟩. (6.28)

Moreover, also using the Hermitian nature of ΔÂ and ΔB̂, we find

|⟨φ|χ⟩|² = |⟨ψ|(Â − ⟨A⟩)(B̂ − ⟨B⟩)|ψ⟩|². (6.29)
But then

(Â − ⟨A⟩)(B̂ − ⟨B⟩) = (1/2)[Â − ⟨A⟩, B̂ − ⟨B⟩] + (1/2){Â − ⟨A⟩, B̂ − ⟨B⟩}
= (1/2)[Â, B̂] + (1/2){Â − ⟨A⟩, B̂ − ⟨B⟩}, (6.30)
and, as we saw, [ Â, B̂] = iĈ with Ĉ † = Ĉ, and so the anticommutator above is a Hermitian operator.
These two operators then act as real and imaginary numbers (remembering that the eigenvalues of
a Hermitian operator are real, this is actually what we get in an eigenstate basis), so the modulus
squared equals the real (Hermitian) part squared plus the imaginary (anti-Hermitian) part squared.
In conclusion, the Cauchy–Schwarz inequality becomes

⟨(ΔA)²⟩⟨(ΔB)²⟩ ≥ |⟨ψ|ΔÂ ΔB̂|ψ⟩|² = |(1/2)⟨ψ|[Â, B̂]|ψ⟩|² + |(1/2)⟨ψ|{ΔÂ, ΔB̂}|ψ⟩|²
≥ (1/4)|⟨ψ|[Â, B̂]|ψ⟩|². (6.31)
q.e.d.

We have thus proven the Heisenberg uncertainty relations in the general case.
Specializing to the case of canonically conjugate operators, Â = q̂i and B̂ = p̂i, we obtain first

[Â, B̂] = [q̂i, p̂i] = iℏ, (6.32)

and, substituting in the Heisenberg uncertainty relations,

⟨(Δqi)²⟩⟨(Δpi)²⟩ ≥ ℏ²/4, (6.33)

or, more simply,

Δqi Δpi ≥ ℏ/2, (6.34)
in terms of the standard deviations of qi , pi . This is the original form of Heisenberg’s uncertainty
relations, in terms of the x, p observables themselves (rather than for general canonically conjugate
variables).
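The general inequality can also be checked by brute force on a finite-dimensional Hilbert space. The sketch below (dimension, random seed, and operators are all arbitrary illustrative choices) verifies ⟨(ΔA)²⟩⟨(ΔB)²⟩ ≥ (1/4)|⟨[Â, B̂]⟩|² for random Hermitian matrices and a random normalized state:

```python
import numpy as np

# Numerical check of <(dA)^2><(dB)^2> >= (1/4)|<[A,B]>|^2 on a random
# finite-dimensional example (dimension and seed are arbitrary assumptions).
rng = np.random.default_rng(1)
n = 8

def random_hermitian(n):
    """A random Hermitian matrix, standing in for an observable."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

A, B = random_hermitian(n), random_hermitian(n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)

def expval(O):
    return np.vdot(psi, O @ psi)

def variance(O):
    D = O - expval(O).real * np.eye(n)   # O - <O>, still Hermitian
    return expval(D @ D).real

lhs = variance(A) * variance(B)
rhs = 0.25 * abs(expval(A @ B - B @ A))**2
print(lhs >= rhs)
```

Running this with any seed must print True; the bound is an identity-level consequence of Cauchy–Schwarz, not a special property of particular operators.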
We also note the situation when we have equality in the uncertainty relations (i.e., when we
“saturate” the inequality). There were two inequalities used, and each comes with its own condition
for equality, so both must hold:
(1) For the first inequality, equality arises when |χ⟩ ∝ |φ⟩, so that

ΔÂ|ψ⟩ = c ΔB̂|ψ⟩. (6.35)

(2) For the second inequality, equality arises when the (average of the) anticommutator vanishes, so
that

⟨ψ|{ΔÂ, ΔB̂}|ψ⟩ = 0. (6.36)

6.4 Minimum Uncertainty Wave Packet

We now find the wave packet for the free particle that saturates the minimum of the standard
inequality.
Apply the Heisenberg uncertainty relations for Â = X̂, B̂ = P̂. Then equality arises when
(1)

(P̂ − ⟨p⟩)|ψ⟩ = c(X̂ − ⟨x⟩)|ψ⟩, (6.37)

(2)

⟨ψ|[(P̂ − ⟨p⟩)(X̂ − ⟨x⟩) + (X̂ − ⟨x⟩)(P̂ − ⟨p⟩)]|ψ⟩ = 0. (6.38)

Applying these conditions in coordinate space, i.e., by acting with ⟨x| from the left on the
conditions, we find from (1) that

(−iℏ d/dx − ⟨p⟩)ψ(x) = c(x − ⟨x⟩)ψ(x) ⇒
(d/dx) ln ψ(x) = (i/ℏ)[⟨p⟩ + c(x − ⟨x⟩)], (6.39)

with the solution

ψ(x) = ψ(x = 0) e^{i⟨p⟩x/ℏ} exp[ic(x − ⟨x⟩)²/(2ℏ)]. (6.40)

Then, multiplying condition (1) by ⟨ψ|(X̂ − ⟨x⟩), considering the Hermitian conjugate of the
resulting relation, summing the equation and its conjugate, and then using condition (2), we finally
obtain

(c + c*)⟨ψ|(X̂ − ⟨x⟩)²|ψ⟩ = 0. (6.41)

Since the expectation value cannot be zero independently of |ψ⟩, we must have a purely imaginary
coefficient,

c = i|c|, (6.42)

which means that finally the wave function is

ψ(x) = ψ(x = 0) e^{i⟨p⟩x/ℏ} exp[−|c|(x − ⟨x⟩)²/(2ℏ)]. (6.43)

This is the Gaussian wave packet we used before (in which case, indeed, we found saturation of
the Heisenberg uncertainty relations), with Gaussian variance

d² = ℏ/|c|. (6.44)
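One can confirm the saturation numerically by putting the Gaussian packet on a grid; here ℏ = 1 and the width d = 1.3 are arbitrary illustrative choices:

```python
import numpy as np

# Verify on a grid that the Gaussian packet saturates Delta x * Delta p = hbar/2.
# Units hbar = 1; the width d and the grid are illustrative assumptions.
hbar, d = 1.0, 1.3
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]

psi = np.exp(-x**2 / (2 * d**2))
psi /= np.sqrt(np.sum(psi**2) * dx)      # normalize the probability to 1

var_x = np.sum(x**2 * psi**2) * dx       # <x> = 0 by symmetry
dpsi = np.gradient(psi, dx)
var_p = hbar**2 * np.sum(dpsi**2) * dx   # <p^2> = hbar^2 * int |psi'|^2, <p> = 0

product = np.sqrt(var_x * var_p)
print(product)   # should be close to hbar/2
```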

6.5 Energy–Time Uncertainty Relation

We have found an (x, p) uncertainty relation, but we can also have an (E, t) (energy versus time)
uncertainty relation,

ΔE Δt ≥ ℏ/2. (6.45)
Since time t does not correspond to an operator in quantum theory, this relation cannot strictly
speaking be derived in the same way as the (x, p) relation. However, we can argue for it because:

• relativistically (i.e., in a relativistic theory), E and t are the zero components of the 4-vectors p^μ
and x^μ, so ΔEΔt is Δp⁰Δx⁰, meaning that by relativistic invariance we must have (6.45).
• We also know that E is the eigenvalue (observable value) for the quantum Hamiltonian Ĥ, which
as we saw generates evolution in time t. But we also saw that the canonically conjugate momentum
P̂ generates translations in x, so the same Heisenberg uncertainty relation should be valid for the
pair (E, t).

The meaning of the uncertainty relation is however the same: if for instance we know the energy
of a system with infinite precision, such as in the case when a system is confined to a discrete energy
state (without transitions: say the state is a ground state), then the time it spends there is infinite, so
Δt = ∞. Conversely, at a given time the quantum energy is arbitrary (cannot be determined). Another
way to apply this relation is to say that, for an unstable particle, the lifetime Δt of the particle and its
uncertainty in energy ΔE are related by the uncertainty relation. This is what happens for instance
for virtual particles (particles created from the vacuum for a short time) in a quantum theory: if the
particles exist for a time Δt, then their energy is not well defined but has uncertainty ΔE.
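As an order-of-magnitude illustration (the 1 eV energy width below is an assumed value, not one from the text), the relation fixes the minimal time scale associated with a given ΔE:

```python
# Minimal time scale from Delta E * Delta t >= hbar/2, for an assumed
# illustrative energy width of 1 eV.
hbar_eVs = 6.582119569e-16           # hbar in eV*s
delta_E = 1.0                        # eV (assumption)
delta_t_min = hbar_eVs / (2 * delta_E)
print(delta_t_min)                   # a few 1e-16 s
```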

Important Concepts to Remember

• In a Gaussian wave packet in x space, at t = 0, ψ(x, t = 0) = N e^{ip0 x/ℏ} e^{−x²/(2σ²)}, and we have
⟨Δx²⟩⟨Δp²⟩ = ℏ²/4.
• A Gaussian wave packet travels with momentum p0, so the x dependence of the probability density
is encoded in a function of x − t p0/m (but there is an extra t dependence), and the wave packet
spreads out in time, with Δx = (d/√2)√(1 + ℏ²t²/m²d⁴).
• Heisenberg's uncertainty relation for incompatible operators ([Â, B̂] ≠ 0) comes from
⟨(ΔA)²⟩⟨(ΔB)²⟩ ≥ (1/4)|⟨[Â, B̂]⟩|².
• The standard form of Heisenberg's uncertainty relation is Δqi Δpi ≥ ℏ/2.
• A Gaussian wave packet minimizes Heisenberg's uncertainty relation.
• The energy–time uncertainty relation, ΔEΔt ≥ ℏ/2, is not derived as above, and it applies either to
errors or to an energy difference and time of decay.

Further Reading
See any other book on quantum mechanics, for instance [2] or [1].

Exercises

(1) Consider a Gaussian wave packet in momentum space, equation (6.11). Calculate ⟨(Δx)²⟩ and
⟨(Δp)²⟩ in this p space, and check again the saturation of Heisenberg's uncertainty relations.
(2) Do the integral for the Gaussian wave packet with evolution operator, to prove (6.18), and
calculate Δx(t) for it, to prove (6.20).
(3) Can we measure simultaneously the momentum and the angular momentum in three spatial
dimensions and, if not, what are the Heisenberg uncertainty relations corresponding to them?
(4) Consider a system with Hamiltonian

H = p²/(2m) + αpx. (6.46)

Can we measure simultaneously the energy and the momentum of the system?
(5) Consider two energy levels of an atomic system, E1 = E∗ and E2 = E∗ + 0.5 eV. What is the
minimum possible decay time from E2 to E1 ?
(6) Consider a superposition of two Gaussian wave packets with the same momentum p0, initially
(at time t0) at the same position x0, but with different variances, σ1 = d1 and σ2 = d2. Calculate
⟨(Δx)²⟩ and ⟨(Δp)²⟩, and check the Heisenberg uncertainty relation, at time t0.
(7) For the situation in exercise 6, calculate the time dependence of the uncertainty in position,
Δx(t).
7 One-Dimensional Problems in a Potential V(x)

After analyzing two-state systems and free particles and wave packets, which are the simplest
systems in the discrete-Hilbert-space and continuous-Hilbert-space cases, we consider the next
simplest systems: particles in one dimension, with a standard nonrelativistic kinetic term and with a
potential V (x).

7.1 Set-Up of the Problem

We want to solve the Schrödinger equation in position space, so we multiply by ⟨x| the equation

iℏ∂t|ψ(t)⟩ = Ĥ|ψ(t)⟩, (7.1)

where the quantum Hamiltonian is

Ĥ = P̂²/(2m) + V̂(x̂). (7.2)
2m
We note that P̂ and x̂ are Hermitian operators, P̂† = P̂, x̂ † = x̂, and so is V̂ ( x̂), and therefore
the Hamiltonian Ĥ. However, note that this is actually not enough: one also needs to have the same
domain for Ĥ and Ĥ † , not just for them to act in the same way (which they do, since P̂ and x̂ do).
This is a subtlety which is irrelevant in most cases, but implies some interesting counterexamples.
The kinetic and potential terms act as follows (note that P̂² is the matrix product of two P̂s, and
involves a sum or integral over the middle indices, turning two delta functions into a single one):

⟨x| P̂²/(2m) |x′⟩ = δ(x − x′) (1/2m)(−iℏ ∂/∂x)²
⟨x| V̂(x̂) |x′⟩ = δ(x − x′) V(x). (7.3)

Then the Schrödinger equation in coordinate space becomes (introducing a completeness relation via
1̂ = ∫dx′ |x′⟩⟨x′|)

iℏ∂t⟨x|ψ(t)⟩ = ∫dx′ ⟨x|Ĥ|x′⟩⟨x′|ψ(t)⟩ ⇒
iℏ∂t ψ(x, t) = ∫dx′ H_xx′ ψ(x′, t). (7.4)

Substituting the matrix elements of the kinetic and potential operators, and doing the x′ integral, we
obtain

iℏ∂t ψ(x, t) = [−(ℏ²/2m) ∂²/∂x² + V(x)] ψ(x, t). (7.5)

This is solved, as earlier, by the separation of variables. We consider a wave function ψE(x, t) that
is an eigenfunction of the Hamiltonian, so that

iℏ∂t ψE(x, t) = EψE(x, t)
Ĥ ψE(x, t) = EψE(x, t). (7.6)

Then we have the stationary solution

ψE(x, t) = e^{−iEt/ℏ} ψE(x, t = 0) = e^{−iEt/ℏ} ψE(x). (7.7)

Note that for a stationary solution the probability density is independent of time,
dP(x, t)
ρ(x, t) ≡ = |ψ(x, t)| 2 = |ψ(x)| 2 . (7.8)
dx
The eigenfunction problem for the Hamiltonian, also called the stationary (time-independent)
Schrödinger equation, in our case is

Ĥ ψE(x) = [−(ℏ²/2m) ∂²/∂x² + V(x)] ψE(x) = EψE(x). (7.9)

Defining the rescaled variables

ε ≡ 2mE/ℏ², U(x) ≡ 2mV(x)/ℏ², (7.10)

the Schrödinger equation becomes

ψE″(x) = −(ε − U(x))ψE(x). (7.11)

This is the equation we will study in this chapter. It is a real equation (with real coefficients), so
it is enough to consider its real eigenfunctions, since complex eigenfunctions can be obtained from
them. However, for simplicity sometimes we will consider complex eigenfunctions directly.

7.2 General Properties of the Solutions

Next we make a general description of the solutions, without considering specific potentials U (x).

• The one-dimensional Schrödinger equation discussed above is a second-order linear (in the variable
ψ(x)) differential equation, similar to the classical equation for the position of a particle x(t) in a
potential. As in that case, we could give the initial conditions at some point, ψ(x 0 ) and ψ  (x 0 ), like
the values for x and p = m ẋ in the initial condition for the classical particle, and find the evolution
in x (corresponding to evolution in time for the particle). This would amount to “integrating” the
differential equation, and it would clearly lead (via integration) to a continuous solution ψ(x), so
ψ(x) ∈ C 0 (R) (the space of continuous functions).
• Besides being continuous, ψ(x) is also bounded as x → ±∞, since otherwise the probability
density |ψ(x, t)| 2 would not be normalizable to 1, as needed.
• If U (x) doesn’t have infinite discontinuities (jumps), or in another words if it doesn’t have delta
functions, then on integrating ψ  = ( −U (x))ψ we obtain that ψ  (x) is continuous, so ψ ∈ C 1 (R)
(the space of once-differentiable functions with continuous derivative). Indeed, if U (x) has a finite
jump then ψ  also has a finite jump, which means that ψ  is still continuous.

• If we are in a classically allowed region, with energy greater than the potential, ε > U(x), so
that ε − U(x) > 0, then the equation ψ″ = −(ε − U(x))ψ in the U(x) ≃ constant case has
sinusoidal/cosinusoidal solutions or, taking a complex basis, e^{ikx} and e^{−ikx} solutions, with

k = √(ε − U). (7.12)

• If on the other hand we are in a classically forbidden region, with energy smaller than the potential,
ε − U(x) < 0, the equation ψ″ = +(U(x) − ε)ψ in the U(x) ≃ constant case has exponentially
increasing and decreasing solutions, ψ ∼ e^{κx} and ∼ e^{−κx}, with

κ = √(U − ε). (7.13)

• The energy spectrum can be continuous (as for a free particle) or discrete (as for a H atom or the
two-level system studied earlier). It can also be degenerate (with two or more states with the same
energy) or nondegenerate.
• We will define lim_{x→+∞} U(x) ≡ U+ and lim_{x→−∞} U(x) ≡ U−, and will assume that U+ > U−; if not,
we can set x → −x and retrieve this situation. Then, as a function of the energy ε, we have three
possible cases:
(I) ε > U+ > U−. In this case, there are two independent solutions (sin and cos, or e^{ikx} and e^{−ikx})
at each end of the real domain. That means that we can define two solutions that are bounded
everywhere, including at both ends of the domain, for any such energy ε, which means that
the spectrum is continuous, i.e., there is no restriction on ε, and degenerate with degeneracy
2, since for each energy there are two solutions. These states are unbound states, like free
particles, with kinetic energy ε − U.
(II) U− < ε < U+. In this case ε − U(x) is negative at +∞ and positive at −∞. The solution at +∞
is exponentially increasing or decaying. Since the exponentially increasing solution is non-
normalizable, we must choose the exponentially decreasing solution. But then this solution,
when continued to −∞, will correspond to only a given linear combination of the two (sin
and cos) solutions there. That means that there is a unique solution for every energy ε in this
region, so the spectrum is still continuous but now nondegenerate, and we also have unbound
states.
(III) ε < U− < U+. In this case ε − U(x) is negative at both +∞ and −∞, which means there is a
unique solution (exponentially decaying) at both ends. Consider the unique solution at −∞,
f1, and continue it to +∞, where it will become a linear combination of the exponentially
decaying (g1) and increasing (g2) solutions, f1 = αg1 + βg2. But the coefficients α, β are
functions of the energy ε, so the condition β(ε) = 0 (so that we have a bounded solution
at +∞ also) gives a constraint on the possible energies, having as solutions a discrete (and also
nondegenerate) spectrum. Moreover, because of the exponentially decaying function at both
±∞, the eigenfunction is bounded in space to a finite region, and we say we have bound states.
The existence or not of eigenvalues for the energy depends on the details of the potential
U(x) in the finite region in the middle, as does the total number of eigenvalues, which can be
anything from 0 to infinity.

• The Wronskian theorem. This is a theorem for general linear second-order equations. Define the
Wronskian of two functions y1, y2 as

W(y1, y2) = y1 y2′ − y2 y1′, (7.14)



where y1 and y2 are solutions to the equations

y1″ + f1(x)y1 = 0 (1)
y2″ + f2(x)y2 = 0 (2). (7.15)

Here f1(x), f2(x) are real functions that are piecewise continuous in the interval x ∈ (a, b),
meaning they can have jumps at a finite number of points but are otherwise continuous. Then
the Wronskian theorem states that

W(y1, y2)|_a^b = ∫_a^b dx (f1(x) − f2(x)) y1 y2. (7.16)

Proof Multiplying equation (1) above by y2 and equation (2) by y1, and subtracting the two, we
get

y2 y1″ − y1 y2″ + (f1(x) − f2(x))y1 y2 = 0 ⇒ (d/dx)(y2 y1′ − y1 y2′) + (f1(x) − f2(x))y1 y2 = 0,
(7.17)

and by integration over x we get the result of the theorem. q.e.d.

• Applying the Wronskian theorem in our case, with f(x) = ε − U(x), for two energies ε1, ε2, we
get f1(x) − f2(x) = ε1 − ε2. Then the theorem says that

(y1 y2′ − y2 y1′)|_a^b = W(y1, y2)|_a^b = (ε1 − ε2) ∫_a^b dx y1 y2, ∀a, b. (7.18)

In particular, for ε1 = ε2, we obtain that W(y1, y2) is independent of x.


• The Wronskian theorem in our case can be used to prove rigorously that indeed if ε > U(x) for
x > x0 we have two oscillatory solutions, and if ε < U(x) and U(x) − ε ≥ M² > 0 for x > x0, there
is a unique solution going faster than or as fast as e^{−Mx} to zero at +∞, and other solutions increase
exponentially. It also follows that in a classically forbidden region (ε < U(x)), the eigenfunction
ψ(x) can have at most one zero, since it is either increasing or decreasing exponentially.
• If ε < U(x) everywhere, there is no solution, since this means that we have exponential solutions
everywhere, and then the solution would have to increase at least on one side
(+∞ or −∞), since the continuity of the derivative at possible jump points means that the increasing
or decreasing property is conserved at jumps of U(x) also. Therefore this solution would not be
normalizable, so there is no solution to the eigenvalue problem.
• This also means that if U(x) has a minimum Umin somewhere then the eigenvalues of E are
necessarily larger than Umin.
• The number of nodes (zeroes of the wave function) can be analyzed as follows. Consider two
different energies ε2 > ε1. Then the Wronskian theorem for the case where a, b are two consecutive
zeroes (or nodes) of y1, so that y1(a) = y1(b) = 0, means that

y2 y1′|_a^b = (ε2 − ε1) ∫_a^b dx y1 y2. (7.19)

But if a, b are consecutive zeroes, it means that between them, in the interval (a, b), y1 has the
same sign, say y1 > 0, which implies y1′(a) > 0, y1′(b) < 0. Thus y2 must change sign in the
interval (a, b), since otherwise the right-hand side of the Wronskian theorem has the same sign as
y2 (as ε2 − ε1 > 0), whereas the left-hand side has the opposite sign (since both y1′(b) and −y1′(a)
are negative).

That in turn means that y2 has at least one zero in between the two consecutive zeroes of y1 , a,
and b. But since both y1 and y2 also vanish asymptotically at +∞ and −∞, besides the finite zeroes
(or nodes) it follows that if y1 has n nodes (so that there are n + 1 intervals (a, b), including −∞
and +∞ as boundaries) then y2 has n + 1 nodes. In turn, that means that there are at least n − 1
nodes for the nth eigenfunction of the system.
• Finally, if we have a discrete spectrum, which implies that the eigenstates satisfy y(±∞) = 0,
then two eigenstates y1, y2 for two different eigenenergies (ε1 ≠ ε2) are orthogonal, since the
Wronskian theorem for a = −∞, b = +∞ implies that (dividing by the nonzero ε1 − ε2)

∫_{−∞}^{+∞} dx y1 y2 = 0. (7.20)
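The Wronskian theorem itself is easy to verify numerically. The sketch below uses the free case U = 0, where y_i = sin(k_i x) solves y″ + ε_i y = 0 with ε_i = k_i²; the wave numbers and the interval (a, b) are arbitrary illustrative choices:

```python
import numpy as np

# Check W(y1,y2)|_a^b = (eps1 - eps2) * int_a^b y1 y2 dx  (eq. (7.18))
# for y_i = sin(k_i x), eps_i = k_i^2 (free case, U = 0).
k1, k2 = 1.0, 1.7
a, b = 0.3, 2.1
x = np.linspace(a, b, 20001)
dx = x[1] - x[0]

y1, y2 = np.sin(k1 * x), np.sin(k2 * x)
y1p, y2p = k1 * np.cos(k1 * x), k2 * np.cos(k2 * x)

W = y1 * y2p - y2 * y1p                          # Wronskian on the grid
f = y1 * y2
integral = np.sum(0.5 * (f[1:] + f[:-1])) * dx   # trapezoid rule
lhs = W[-1] - W[0]
rhs = (k1**2 - k2**2) * integral
print(lhs, rhs)
```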

7.3 Infinitely Deep Square Well (Particle in a Box)

For an infinitely high potential barrier, i.e., if U(x) = ∞ at some x = x0, it is clear that we have
ψ(x0) = 0. Consider then the case where the potential is an infinitely deep square well (or box; see
Fig. 7.1a),

U(x) = 0, |x| < L/2
     = ∞, |x| > L/2. (7.21)

In this case, it is clear that ψ(x) = 0 for |x| ≥ L/2. Note that if U(x) = U0 for |x| < L/2, we can
make the rescaling ε − U0 → ε, and so get back to this case.
The differential equation in the only relevant region, |x| ≤ L/2, is

ψ″ + εψ = 0, (7.22)


Figure 7.1 One-dimensional potentials U(x). (a) Particle in a box (infinite square well): its potential U(x) and ground state wave function
ψ1 (x). (b) Potential step and wave components. (c) Finite square well. (d) Potential barrier and tunneling effect.

with the boundary conditions that ψ(x) = 0 at x = ±L/2. This is basically a stationary violin string
set-up, with discrete eigenmodes (harmonics).
In the relevant region II, |x| ≤ L/2, the solution is

ψ(x) = Ae^{ikx} + Be^{−ikx}, (7.23)

where

k = √(2mE/ℏ²) = √ε. (7.24)
Imposing continuity of the wave function at both ends, i.e.,
ψ(−L/2) = ψ(+L/2) = 0, (7.25)
implies the system of equations

Ae^{+ikL/2} + Be^{−ikL/2} = 0
Ae^{−ikL/2} + Be^{+ikL/2} = 0, (7.26)
which has nontrivial solutions only if the determinant of coefficients vanishes,

| e^{+ikL/2}  e^{−ikL/2} |
| e^{−ikL/2}  e^{+ikL/2} | = 0 ⇒ e^{ikL} = e^{−ikL}, (7.27)

which means that the wave numbers must be quantized,

k = kn = nπ/L ⇒ e^{ikn L} = (−1)^n. (7.28)

Consequently the eigenenergies are

En = ℏ²n²π²/(2mL²). (7.29)
Here, naively, we would have n ∈ Z, so that n = 0, ±1, ±2, ±3, . . . , but in fact n = 0 is excluded,
since that would imply that ψ(x) is constant, which would mean we don’t have ψ(±L/2) = 0, or that
the wave function vanishes.
For the eigenenergies, we have B = −Ae^{ikn L}, which means that the eigenfunctions are

ψn(x) = A(e^{ikn x} − (−1)^n e^{−ikn x}) = 2iA sin(nπx/L), n even
                                         = 2A cos(nπx/L), n odd. (7.30)
The remaining constant, A, is determined from the normalization condition for probability,

∫_{−L/2}^{+L/2} dx (dP/dx) = ∫_{−L/2}^{+L/2} dx |ψ(x)|² = 1. (7.31)

This gives

1 = 4|A|² ∫_{−L/2}^{+L/2} dx sin²(nπx/L) = 2|A|²L ⇒ |A| = 1/√(2L), (7.32)
so the eigenfunctions are

ψn(x) = √(2/L) sin(nπx/L), n even
      = √(2/L) cos(nπx/L), n odd. (7.33)

We observe that the ground state energy of the quantum system is not the classical value E = 0, but
is now

E1 = π²ℏ²/(2mL²), (7.34)

and the remaining states satisfy

En = n²E1. (7.35)
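The spectrum En = n²E1 can be reproduced by diagonalizing the discretized kinetic operator with the boundary conditions ψ(±L/2) = 0; the grid size and the units ℏ = m = L = 1 below are illustrative assumptions:

```python
import numpy as np

# Particle in a box: diagonalize -hbar^2/(2m) psi'' with Dirichlet boundaries
# and compare with E_n = n^2 * E_1, E_1 = pi^2 hbar^2/(2 m L^2).
hbar, m, L = 1.0, 1.0, 1.0
N = 1000                               # interior grid points (assumption)
h = L / (N + 1)

coeff = hbar**2 / (2 * m * h**2)
H = np.diag(np.full(N, 2 * coeff)) + np.diag(np.full(N - 1, -coeff), 1) \
    + np.diag(np.full(N - 1, -coeff), -1)
E = np.linalg.eigvalsh(H)[:4]

E1 = np.pi**2 * hbar**2 / (2 * m * L**2)
print(E / E1)    # close to [1, 4, 9, 16]
```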

The fact that the energy of the ground state is nonzero, and its order of magnitude, can be
understood from Heisenberg's uncertainty relations. Since the particle is confined to be in a box
of size L, for |x| ≤ L/2, the uncertainty in the position is Δx = L/2.
From the Heisenberg uncertainty relation we obtain

Δp ≥ (ℏ/2)/(L/2) = ℏ/L. (7.36)

Assuming equality,

p_min = Δp = ℏ/L, (7.37)

we find

E ≥ p_min²/(2m) = ℏ²/(2mL²). (7.38)
This gives the right result except for a missing factor of π 2 (so the inequality is valid, just the
possibility of equality is not).
The next observation is that in a bound state ⟨p⟩ = 0. Having a stationary state means that ⟨p⟩ is
time independent, and if the result were nonzero, it would mean that the particle translates on average
and would eventually escape from the box, but that cannot happen, since the box has infinitely high walls.
The average position can be calculated explicitly,

⟨x⟩ = ∫_{−L/2}^{+L/2} dx x|ψ(x)|² = (2/L) ∫_{−L/2}^{+L/2} dx x sin²(nπx/L) = 0 (7.39)

by the x → −x symmetry.
To calculate Δx, we find that, given ⟨x⟩ = 0, we have

⟨(Δx)²⟩n = ⟨x²⟩n = ∫_{−L/2}^{+L/2} dx x²|ψ(x)|² = (2/L) ∫_{−L/2}^{+L/2} dx x² (1 − cos(2πnx/L))/2
= L²/12 − (L²/(π³n³)) ∫_{−πn/2}^{+πn/2} dθ θ² cos 2θ = L²(1/12 − 1/(2n²π²)), (7.40)

which gives

Δx = L√(1/12 − 1/(2n²π²)). (7.41)

This tends, as n → ∞, to L/ 12, which is the classical result: indeed, the classical result for the
+L/2
average of x 2 is (1/L) −L/2 dx x 2 = L 2 /12.

We can also calculate Δp in a similar way, considering that ⟨p⟩ = 0:

⟨p²⟩n = ∫_{−L/2}^{+L/2} dx |P̂ψ(x)|² = ℏ² ∫_{−L/2}^{+L/2} dx |ψ′(x)|² = (2ℏ²n²π²/L³) ∫_{−L/2}^{+L/2} dx (1 + cos(2πnx/L))/2
= ℏ²n²π²/L² = 2mEn. (7.42)

It follows that we have the expected classical-like relation

⟨(Δp)²⟩n = ⟨p²⟩n = 2mEn. (7.43)
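Both ⟨x²⟩n and ⟨p²⟩n can be confirmed by direct quadrature against the normalized eigenfunctions; here ℏ = L = 1 and n = 2 are arbitrary choices:

```python
import numpy as np

# Check <x^2>_n = L^2 (1/12 - 1/(2 n^2 pi^2)) and <p^2>_n = hbar^2 n^2 pi^2/L^2
# for a box eigenfunction, by direct numerical integration.
hbar, L, n = 1.0, 1.0, 2                    # n even -> sine eigenfunction
x = np.linspace(-L/2, L/2, 20001)
dx = x[1] - x[0]
psi = np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

x2 = np.sum(x**2 * psi**2) * dx
p2 = hbar**2 * np.sum(np.gradient(psi, dx)**2) * dx

x2_exact = L**2 * (1.0/12.0 - 1.0/(2.0 * n**2 * np.pi**2))
p2_exact = hbar**2 * n**2 * np.pi**2 / L**2
print(x2, x2_exact, p2, p2_exact)
```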

7.4 Potential Step and Reflection and Transmission of Modes

After considering the simplest stationary case, the particle in a box, we now move on to a different
type of potential, one that allows for states at infinity, states propagating in time and reflecting and
being transmitted through a “wall”. The simplest example in this class is the potential step, with two
different values of the potential, changing at x = 0, as in Fig. 7.1b,


U(x) = U1, x < 0
     = U2, x > 0. (7.44)

Consider for concreteness that U2 > U1 , so that the step is an increase, looking a bit like a wall to an
incoming wave.

(a) The case U2 > ε > U1

We treat first the case where the energy of the particle is in between the two values of
the potential. According to the general theory described at the beginning of the chapter, we expect
a continuous and nondegenerate spectrum. For x < 0 we have ε > U1, so we have a sinusoidal
solution, whereas for x > 0 we have ε < U2, so we have an exponentially decaying solution. Thus
the solution can be parametrized as

ψ(x) = A sin(k1 x + φ), x < 0
     = Be^{−κ2 x}, x > 0, (7.45)

where k1 = √(ε − U1) and κ2 = √(U2 − ε).
We first impose the condition of continuity of ψ(x) at x = 0,

A sin φ = B, (7.46)

and thus A is a normalization constant. Next, we impose the continuity of the derivative ψ′(x) at
x = 0 or, better, of the logarithmic derivative (under the log, the derivative gets rid of otherwise
arbitrary multiplicative constants), i.e., ψ′(x)/ψ(x) at x = 0,

ψ′(0−)/ψ(0−) = ψ′(0+)/ψ(0+), (7.47)

implying a solution for φ:

−κ2 = k1 cot φ ⇒ φ = arctan(−k1/κ2). (7.48)

Substituting back, we can find the value of B/A from

B = A sin φ = −A k1/√(k1² + κ2²) = −A√((ε − U1)/(U2 − U1)). (7.49)

Then A is found from the normalization condition

∫ dx ψ*_E(x) ψ_E′(x) = δ(E − E′). (7.50)

The full time-dependent solution is

A sin(k1 x + φ) e^{−iEt/ℏ} = Ã (e^{i(k1 x − Et/ℏ + φ)} − e^{−i(k1 x + Et/ℏ + φ)}), (7.51)

which is the sum of an incident (original) traveling wave and a reflected wave, traveling in the
opposite direction. The interference of the two waves gives the resulting solution.

(b) The case ε > U2


The second case is for an energy above both values of the potential. Again, according to the
general theory described earlier, in this case we have a continuous and degenerate spectrum, with
degeneracy 2. Both for x > 0 and for x < 0 we have sinusoidal solutions and, besides the incident
wave for x < 0, we must have both a reflected wave (for x < 0) and a transmitted wave (for x > 0).
Thus the wave function (where we have taken out an overall normalization constant A and replaced
it with 1) is

ψ(x) = e^{ik1 x} + Re^{−ik1 x}, x < 0
     = Se^{ik2 x}, x > 0, (7.52)

where the wave with factor R is the reflected wave and the wave with factor S is the transmitted
wave, and k1 = √(ε − U1) > k2 = √(ε − U2). Continuity of the wave function at x = 0 gives
wave, and k1 =  − U1 > k2 =  − U2 . Continuity of the wave function at x = 0 gives

1 + R = S, (7.53)

whereas continuity of the logarithmic derivative ψ′(x)/ψ(x), also at x = 0, gives

ik1 (1 − R)/(1 + R) = ik2 ⇒ R = (k1 − k2)/(k1 + k2) ⇒ S = 1 + R = 2k1/(k1 + k2) > 1. (7.54)

Then, introducing the time dependence, we have an incident wave ∼ e^{ik1 x − iωt}, where ω = E/ℏ, a
reflected wave ∼ e^{−ik1 x − iωt}, and a transmitted wave ∼ e^{ik2 x − iωt}.
83 7 One-Dimensional Problems in Potential V(x)

7.5 Continuity Equation for Probabilities

We can define the probability current density for continuous systems from

⟨ψ(t)|ψ(t)⟩ = ∫ ρ dV, (7.55)

where the probability density is ρ = dP/dV (written for the more general case of more than one dimension,
with dV replacing dx), so we obtain in the coordinate representation

ρ = dP/dV = |ψ(r, t)|². (7.56)
Then writing the Schrödinger equation and its complex conjugate in the coordinate representation,

iℏ ∂ψ/∂t = −(ℏ²/2m)∇²ψ + Vψ
−iℏ ∂ψ*/∂t = −(ℏ²/2m)∇²ψ* + Vψ*, (7.57)

and then multiplying the first equation by ψ* and the second by ψ and subtracting the two, we obtain

iℏ(ψ*∂t ψ + ψ∂t ψ*) = −(ℏ²/2m)(ψ*∇²ψ − ψ∇²ψ*) ⇒
iℏ∂t(|ψ|²) = iℏ∂t ρ = −(ℏ²/2m)∇·(ψ*∇ψ − ψ∇ψ*), (7.58)
which can be written in terms of a continuity equation for probability flow,

∂t ρ = −∇·j, (7.59)

where the probability current density is

j ≡ (ℏ/2mi)(ψ*∇ψ − ψ∇ψ*). (7.60)
2mi
Applying this formalism to a one-dimensional problem with an incident wave ψ = Ae^{ikx}, we find
the current density

j = (ℏk/m)|A|² e_x. (7.61)
m
Even in one dimension, the vector ex makes sense, taking the values ±1 depending on the direction
of propagation.
Consider then the conservation of probability at a point of reflection and refraction (i.e.,
transmission),
jinc + jrefl = jtransmitted . (7.62)
In our case, these currents become

j_inc = (ℏk1/m) e_x
j_refl = (ℏk1/m)|R|² (−e_x) (7.63)
j_trans = (ℏk2/m)|S|² e_x,

which means that the conservation of probability becomes

k1(1 − |R|²) = k2|S|² ⇒ 1 − |R|² = (k2/k1)|S|² ≡ T, (7.64)

where the right-hand side, T, is the transmission coefficient (since |R|² is the probability of reflection
and T is the probability of transmission, so that the two sum up to 1). We find specifically

T = 4k1k2/(k1 + k2)², |R|² = (k1 − k2)²/(k1 + k2)². (7.65)
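A quick numerical check of (7.65) and of probability conservation |R|² + T = 1 (the rescaled energies below are illustrative choices):

```python
import numpy as np

# Potential step with eps > U2: R = (k1-k2)/(k1+k2), S = 2k1/(k1+k2),
# T = (k2/k1)|S|^2, and probability conservation |R|^2 + T = 1.
U1, U2, eps = 0.0, 1.0, 2.5              # illustrative rescaled values
k1, k2 = np.sqrt(eps - U1), np.sqrt(eps - U2)

R = (k1 - k2) / (k1 + k2)
S = 2 * k1 / (k1 + k2)
T = (k2 / k1) * S**2

print(R**2 + T)    # = 1
```

Note that S > 1 is not a problem: the transmitted probability is weighted by the smaller velocity factor k2/k1, so T ≤ 1 always.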

7.6 Finite Square Well Potential

We next consider a square well potential, where the potential “dips” for a short while; for generality
we consider the two sides to have different potentials, so that



U(x) = U1, x < a, (I)
     = U2, a < x < b, (II) (7.66)
     = U3, x > b, (III),

where U2 < U1 < U3 , as in Fig. 7.1c.


If ε < U2, there is no solution, since the solution must be exponentially decaying or increasing,
and this implies that at least one side of the solution will increase exponentially, making it impossible
to normalize the probability to one.

(a) The case U1 > ε > U2

In this case, according to the general analysis, the spectrum is discrete, and represents bound states.
The solution is exponentially decreasing to zero on both sides of the well, and sinusoidal inside the
well, so we write

ψ(x) = Ce^{−κ3 x}, x > b
     = B sin(k2 x + φ) = B1 sin k2 x + B2 cos k2 x, a < x < b (7.67)
     = Ae^{+κ1 x}, x < a,

where

κ1 = √(U1 − ε), k2 = √(ε − U2), κ3 = √(U3 − ε). (7.68)

Next, we impose continuity at x = a and x = b for ψ(x), obtaining the equations

ψ(a+) = ψ(a−) ⇒ Ae^{+κ1 a} = B sin(k2 a + φ)
ψ(b+) = ψ(b−) ⇒ B sin(k2 b + φ) = Ce^{−κ3 b}. (7.69)

Finally, we impose the continuity of ψ′(x) at the above points or, better still, considering that
we have already imposed the continuity of ψ(x), the continuity of the logarithmic derivative
ψ′(x)/ψ(x), giving

(ψ′/ψ)(a+) = (ψ′/ψ)(a−) ⇒ +κ1 = k2 cot(k2 a + φ)
(ψ′/ψ)(b+) = (ψ′/ψ)(b−) ⇒ k2 cot(k2 b + φ) = −κ3. (7.70)
The continuity equations for ψ can be solved for B, C as functions of A (A is then
found from the normalization condition), whereas one of the equations for the continuity of ψ′/ψ can
be solved for φ, and then the other gives an equation for the eigenenergies εn. We write these last
two equations as two equations for the same φ, whose consistency will give εn:

φ = arctan(k2/κ1) − k2 a
  = −arctan(k2/κ3) − k2 b + nπ. (7.71)
Equating the two values, we find

arctan(k2/κ3) + arctan(k2/κ1) + k2(b − a) − nπ = 0, (7.72)

which is a transcendental equation for ε, indexed by n (one solution per value of n), giving a discrete
and finite sequence εn. Defining the quantities

cos γ ≡ √((U1 − U2)/(U3 − U2)), ξ ≡ √((ε − U2)/(U1 − U2)), K = √(U1 − U2), L = b − a, (7.73)

we find the equation

arcsin ξ + arcsin(ξ cos γ) = nπ − ξKL; (7.74)

to find a solution for it, we need the condition

KL ≥ (n − 1)π + γ, (7.75)

which implies that there is a maximum value for n, given by

n ≤ n_max = 1 + ⌊(KL − γ)/π⌋. (7.76)
This means that, indeed, we have a finite spectrum, as promised. Moreover, n is the number of
nodes or zeroes of ψ (see our previous general discussion, equating this number with the index of the
eigenenergy), since we have a factor sin(k2 x + φ) in the wave function, vanishing for the nπ term.
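The counting rule (7.76) can be tested by solving the transcendental equation (7.74) numerically; in the sketch below the well parameters γ = 0.4 and KL = 7 are arbitrary illustrative choices, and the equation is solved by bisection in ξ ∈ (0, 1]:

```python
import numpy as np

# Bound-state count for the finite well, from eq. (7.74):
#   arcsin(xi) + arcsin(xi*cos(gamma)) = n*pi - xi*K*L,
# with one root xi in (0, 1] per allowed n (parameters are illustrative).
gamma, KL = 0.4, 7.0

def F(xi, n):
    """Monotonically increasing in xi; a root exists iff F(1, n) >= 0."""
    return np.arcsin(xi) + np.arcsin(xi * np.cos(gamma)) + xi * KL - n * np.pi

def root(n):
    """Bisect F(., n) on (0, 1]; None if there is no level with this n."""
    lo, hi = 1e-12, 1.0
    if F(hi, n) < 0:
        return None
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if F(mid, n) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [n for n in range(1, 20) if root(n) is not None]
n_max = 1 + int(np.floor((KL - gamma) / np.pi))   # eq. (7.76)
print(levels, n_max)
```

For these parameters the bisection finds exactly n_max levels, in agreement with the condition KL ≥ (n − 1)π + γ.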
If U1 = U3 = U and U2 = U0 (for a symmetric well), we find κ1 = κ3, which means cos γ = 1
and thus γ = 0, and if moreover L√(U − U0) = KL ≪ 1 then there is a unique solution: the condition
KL ≥ (n − 1)π + 0 can only have n = 1 as a solution, given by the transcendental equation

2 arctan(k2/κ) = π − k2 L, (7.77)

from which we find k2/κ → ∞, and the unique eigenenergy is

E = V − (1/2)KL(V − V0), (7.78)

and the angle φ ≈ π/2.

(b) The case U1 < ε < U3

In this case, when the energy is in between the two potentials at −∞ and +∞, according to the general
analysis we must find a continuous, nondegenerate spectrum. As for the finite step potential, in region
I (x < a), where ε > U(x), we write the wave function as an incident wave and a reflected wave,
while in region III (x > b), where ε < U(x), we find an exponentially decaying solution. All in all,

ψ(x) = e^{ik1 x} + e^{−ik1 x}e^{2iφ1}, x < a
     = 2Ae^{iφ1} sin(k2 x + φ2), a < x < b (7.79)
     = 2Be^{iφ1}e^{−κ3 x}, x > b.

Note that the equation is basically the previous set-up (with cos(k1 x − φ1) in region I and
sin(k2 x + φ2) in region II), multiplied by e^{iφ1}, and by redefining e^{2iφ1} ≡ R and Ã = Ae^{iφ1}, B̃ = Be^{iφ1},
we get the set-up with incident and reflected waves.
As before, A and B are obtained from the continuity equations for ψ(x) at x = a and x = b. On
the other hand, the angles φ1 and φ2 are found from the continuity of the logarithmic derivative,
ψ′(x)/ψ(x), at x = a and x = b, giving

    ik₁ (1 − e^{2i(k₁a+φ₁)})/(1 + e^{2i(k₁a+φ₁)}) = k₁ tan(k₁a + φ₁) = k₂ cot(k₂a + φ₂)    (7.80)
    k₂ cot(k₂b + φ₂) = −κ₃  ⇒  φ₂ = −k₂b − arctan (k₂/κ₃).
Substituting φ2 in the first equation, we find φ1 as well. That means that all the parameters
are fixed, but we have no new equations for the energy, meaning the energy spectrum is indeed
continuous.
(c) The case ε > U₃

Finally, consider the case when ε > U(x) everywhere, meaning that the solution is sinusoidal
everywhere, and we have two good solutions in every region, I, II, and III. This means that the
spectrum is continuous and degenerate with degeneracy 2.
As in the similar case with a single step, we write an incident and a reflected wave in region I, and
a transmitted wave in region III. In region II, we also write a combination of a forward moving and
a backward-moving wave, so

    ψ(x) = ⎧ e^{ik₁x} + R e^{−ik₁x},     x < a
           ⎨ P e^{ik₂x} + Q e^{−ik₂x},   a < x < b    (7.81)
           ⎩ S e^{ik₃x},                 x > b.
Continuity of the function ψ(x) and of its logarithmic derivative ψ′(x)/ψ(x) at x = a and x = b
allows us to fix the four constants P, Q, R, S. The four equations are

    e^{ik₁a} + R e^{−ik₁a} = P e^{ik₂a} + Q e^{−ik₂a}
    P e^{ik₂b} + Q e^{−ik₂b} = S e^{ik₃b}
    ik₁ (1 − R e^{−2ik₁a})/(1 + R e^{−2ik₁a}) = ik₂ (1 − (Q/P) e^{−2ik₂a})/(1 + (Q/P) e^{−2ik₂a})    (7.82)
    ik₂ (1 − (Q/P) e^{−2ik₂b})/(1 + (Q/P) e^{−2ik₂b}) = ik₃.
The last equation can be solved for Q/P, then the third for R, then the first for P, and finally the
second for S. The one interesting coefficient is the transmission coefficient,

    T = (k₃/k₁)|S|² = 4ηζξ² / [ξ²(η + ζ)² cos²(ξKL) + (ξ² + ηζ)² sin²(ξKL)],    (7.83)

where we have defined

    K = √(U₁ − U₂),   ξ = k₂/K,   η = k₁/K,   ζ = k₃/K.    (7.84)
The reflection coefficient is

    |R|² = 1 − T,    (7.85)

an equation that can be explicitly checked.

7.7 Penetration of a Potential Barrier and the Tunneling Effect

We end the analysis of one-dimensional problems with the problem of a potential barrier consisting
of a potential step of length L, which would be classically impenetrable if ε < U inside this middle
region.
Consider then the potential in Fig. 7.1d,

    U(x) = ⎧ 0,         x < 0
           ⎨ U₀ > 0,    0 < x < L    (7.86)
           ⎩ 0,         x > L.

This means that, in the absence of any potential barrier (for L = 0), we would have just a free
particle propagating. But, for L ≠ 0, classically the particle starting in region I (x < 0) would be
unable to reach region III (x > L) for ε < U₀, though quantum mechanically there is a nonzero
probability for the particle to “tunnel” through the barrier.
In regions I and III we write, respectively, the incident plus reflected waves and the transmitted
wave,

    ψ(x) = ⎧ e^{i√ε x} + R e^{−i√ε x},   x < 0    (7.87)
           ⎩ S e^{i√ε x},                x > L.

In region II (0 < x < L), we write a sum of exponentials, which are either real or imaginary,
depending on the energy:

    ψ(x) = ⎧ A e^{κx} + B e^{−κx},      ε < U₀    (7.88)
           ⎩ C e^{ikx} + D e^{−ikx},    ε > U₀,

where κ = √(U₀ − ε) and k = √(ε − U₀).
(a) The case ε < U₀


In this case, the continuity conditions at x = 0 and x = L for ψ(x) and ψ′(x)/ψ(x) give

    1 + R = A + B
    A e^{κL} + B e^{−κL} = S e^{i√ε L}
    i√ε (1 − R)/(1 + R) = κ (A − B)/(A + B)    (7.89)
    κ (A e^{κL} − B e^{−κL})/(A e^{κL} + B e^{−κL}) = i√ε.

Solving the equations as before (the first for R in terms of A and B, the second for S in terms of the
same, and then the last two for A and B), we find the transmission coefficient (since k₂ = k₁ = √ε)

    T = |S|² = 1 / [1 + U₀² sinh²(κL) / (4ε(U₀ − ε))] < 1,    (7.90)
and the reflection coefficient is |R|² = 1 − T as before. This is the tunneling probability, namely the
probability of passing through a classically impenetrable barrier. This is relevant for radioactivity, for
instance, where an α particle can “tunnel out” of a potential barrier binding it to a radioactive nucleus.
We see that if κL ≫ 1, we have an exponentially small probability of tunneling, as expected:

    T ≃ [16ε(U₀ − ε)/U₀²] e^{−2κL}.    (7.91)
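Formula (7.90) and its exponential approximation (7.91) can be compared directly. A short Python sketch (not from the text), in units ℏ²/2m = 1, with illustrative values for ε and U₀:

```python
import math

def T_tunnel(eps, U0, L):
    """Transmission through the square barrier for eps < U0, eq. (7.90);
    units with hbar^2/2m = 1, so kappa = sqrt(U0 - eps)."""
    kappa = math.sqrt(U0 - eps)
    return 1.0 / (1.0 + U0**2 * math.sinh(kappa * L)**2
                  / (4.0 * eps * (U0 - eps)))

eps, U0 = 1.0, 2.0
for L in (1.0, 5.0, 10.0):
    exact = T_tunnel(eps, U0, L)
    # eq. (7.91): valid when kappa*L >> 1
    approx = 16 * eps * (U0 - eps) / U0**2 * math.exp(-2 * math.sqrt(U0 - eps) * L)
    print(L, exact, approx)
```

For κL = 10 the two expressions agree to better than one part in 10⁷, confirming the exponential suppression of tunneling through a wide or high barrier.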

(b) The case ε > U₀

In this case, the particle can go “over” the barrier classically, as if it weren't there, but quantum
mechanically there will be reflection as well as transmission.
The continuity conditions at x = 0 and x = L for ψ(x) and ψ′(x)/ψ(x) give

    1 + R = C + D
    C e^{ikL} + D e^{−ikL} = S e^{i√ε L}
    i√ε (1 − R)/(1 + R) = ik (C − D)/(C + D)    (7.92)
    ik (C e^{ikL} − D e^{−ikL})/(C e^{ikL} + D e^{−ikL}) = i√ε,

which similarly lead to a transmission coefficient

    T = |S|² = 1 / [1 + U₀² sin²(kL) / (4ε(ε − U₀))] ≤ 1    (7.93)

(and a reflection coefficient |R| 2 = 1 − T); note that now the transmission coefficient oscillates
between 1 and 1/(1 + U02 /4( − U0 )) < 1 as a function of k (classically, we expect only the value
T = 1).
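The transmission resonances implicit in (7.93), where T = 1 exactly at kL = nπ, can be exhibited numerically; a sketch (not from the text), again in units ℏ²/2m = 1 with illustrative U₀ and L:

```python
import math

def T_over(eps, U0, L):
    """Transmission over the barrier for eps > U0, eq. (7.93);
    units with hbar^2/2m = 1, so k = sqrt(eps - U0)."""
    k = math.sqrt(eps - U0)
    return 1.0 / (1.0 + U0**2 * math.sin(k * L)**2
                  / (4.0 * eps * (eps - U0)))

U0, L = 2.0, 1.0
# Resonances at k*L = n*pi, i.e., eps = U0 + (n*pi/L)**2, give T = 1:
for n in (1, 2, 3):
    eps_res = U0 + (n * math.pi / L)**2
    print(n, T_over(eps_res, U0, L))
# away from a resonance, T < 1 even though the barrier is classically irrelevant:
print(T_over(2.5, U0, L))
```

This is the quantum reflection above a barrier mentioned in the text: T dips below 1 between the resonant values of k.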
Important Concepts to Remember

• For a one-dimensional particle in a potential, the wave function is ψ(x) ∈ C 0 (R) and, if the
potential has no delta functions, ψ(x) ∈ C 1 (R) and is bounded at x = ±∞.
• The wave function is of sin/cos type in a classically allowed region, and of exponentially
decaying/growing type in a classically forbidden region (being exactly sin/cos or exp only if the
potential is constant in the region).
• For energies above the values of the potential at +∞ and −∞, we have a continuous spectrum, of
unbound states of degeneracy 2, as for free particles.
• For energies between the values of the potential at +∞ and −∞, we have a continuous spectrum,
of unbound states but nondegenerate.
• For energies below the values of the potential at +∞ and −∞, we have a discrete and nondegenerate
spectrum of bound states, i.e., they are constrained to a finite region (exponentially decaying
outside it).
• There are at least n − 1 nodes (zeroes) for the nth eigenfunction of the system.
• The infinitely deep square well has eigenenergies Eₙ = ℏ²π²n²/(2mL²), with n = 1, 2, 3, etc., and
eigenfunctions sin(nπx/L) for n even and cos(nπx/L) for n odd.
• In the case of a step-type wall, we can define a transmitted wave and a reflected wave, ψ(x) =
e^{ik₁x} + R e^{−ik₁x} for x < 0 and ψ(x) = S e^{ik₂x} for x > 0.
• The continuity equation for probabilities is ∂ₜρ + ∇·⃗j = 0, where ρ = |ψ|² is the probability density
and ⃗j = (ℏ/2mi)(ψ*∇ψ − ψ∇ψ*) is the probability current.
• For both the step-function case and the case of a finite barrier, we can define the transmission
coefficient T and the reflection coefficient R for the probability, with T + R = 1.

• For a potential

barrier of length L and height U0 , for L U0 − EL  1, the tunneling probability is
T ∝ e−2 U0 −E L .

Further Reading
See [2] for more details.

Exercises

(1) Consider a potential that depends only on the radial direction r in three-dimensional space,
V(r), such that V(r → ∞) = 0, while V(0) = −V₀ < 0 and Vₘₐₓ = max_r V(r) = V₁ > 0 is
reached at r = r₁, and there is a unique solution r₀ to the equation V(r₀) = 0. Describe the
spectrum of the system in the various energy regimes.
(2) Repeat the previous exercise in the case V (r → 0) → −∞.
(3) Consider the delta function potential in one dimension,
V = −V0 δ(x). (7.94)
Calculate its bound state spectrum.
(4) Calculate ⟨xᵐ⟩ₙ and ⟨pᵐ⟩ₙ for a particle in a box (an infinitely deep square well) for arbitrary
integers n, m > 0. In which limits do we have a classical result?
(5) Consider a periodic cosinusoidal potential,


V (x) = V0 cos(ax). (7.95)
Calculate the bound state spectrum, and the number of nodes for the corresponding eigen-
states.
(6) For the finite square well potential, prove the formula for the transmission coefficient (7.83),
calculate R, and prove that |R| 2 + T = 1.
(7) Consider an asymmetric potential barrier, with

    U(x) = ⎧ 0,          x < 0
           ⎨ U₀ > U₁,    0 < x < L    (7.96)
           ⎩ U₁ > 0,     x > L.
Calculate the tunneling probability T for energy U1 < E < U0 , and for energy E > U0 .
8 The Harmonic Oscillator

We next move to the simplest one-dimensional system with a truly space-dependent potential V (x)
(as opposed to one that is piecewise constant), the harmonic oscillator. It is an excellent tool, both
for learning quantum mechanics, and for applying it. Indeed, we can say that:

• This is the most basic quantum mechanical system, the prototype that we can use for a quantum
mechanical problem. It has all the important features that allow one to understand the basics.
• If understood in detail, it can be generalized to any problem. Famously, Sidney Coleman stated
it thus: “The career of a young theoretical physicist consists of treating the harmonic oscillator in
ever-increasing levels of abstraction”. As such, while the basic treatment is described here, we will
return to the harmonic oscillator from time to time, using it as an example and application.
• Most systems can be approximated by one, or several, harmonic oscillators. Thus the harmonic
oscillator can be understood as a first approximation in a perturbation theory for a general system.

8.1 Classical Set-Up and Generalizations

The basic system is a spring of constant k, so that we have Hooke's law F = −kx, and the potential
is (since F⃗ = −∇V)

    V = ½ kx²,    (8.1)

writing k = mω², where ω is the classical angular frequency of oscillation. Then the classical
Hamiltonian is

    H = T + V = p²/2m + (mω²/2) x²,    (8.2)
so that the quantum Hamiltonian operator is
    Ĥ = P̂²/2m + (mω²/2) X̂².    (8.3)
However, rather than this exact and basic system we can consider a more general set-up: a particle
in a general potential V(x) that has a stable minimum at x = x₀ (so that V′(x₀) = 0 and V″(x₀) > 0).
Around this minimum, we can do a Taylor expansion:

    V(x) ≃ V(x₀) + V′(x₀)(x − x₀) + [V″(x₀)/2](x − x₀)² + · · ·
         = V(x₀) + [V″(x₀)/2](x − x₀)²,    (8.4)

since the linear term vanishes at the minimum. Thus, with x − x₀ ≡ y and V″(x₀) ≡ mω², we obtain
approximately the above harmonic oscillator.

For a generic multiparticle coupled system with variables x₁, . . . , xₙ and potential V(x₁, . . . , xₙ),
with a stable minimum (so the Hessian matrix is positive definite) at x⃗₀ = (x₁,₀, . . . , xₙ,₀), we obtain
similarly

    V(x₁, . . . , xₙ) ≃ V(x⃗₀) + ½ Σᵢ,ⱼ₌₁ⁿ ∂ᵢ∂ⱼV(x⃗₀)(xᵢ − xᵢ,₀)(xⱼ − xⱼ,₀).    (8.5)

By writing δxᵢ ≡ xᵢ − xᵢ,₀ and

    ∂ᵢ∂ⱼV(x⃗₀) ≡ Vᵢⱼ = Vⱼᵢ,    (8.6)

we obtain the Hamiltonian

    H = Σᵢ,ⱼ₌₁ⁿ ½ (δᵢⱼ/m) pᵢpⱼ + ½ Σᵢ,ⱼ₌₁ⁿ δxᵢ Vᵢⱼ δxⱼ.    (8.7)

This corresponds to a matrix in (i, j) space, which can be diagonalized to find the eigenmodes, i.e.,
the eigenvectors and eigenvalues of the Hessian Vᵢⱼ. In the diagonalized form, the system reduces to
a system of decoupled harmonic oscillators.
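The diagonalization just described can be sketched numerically. Below is a minimal example (not from the text): two unit masses attached to walls and to each other by unit springs, so the Hessian is a hypothetical 2×2 matrix chosen for illustration; its eigenvalues give the squared normal-mode frequencies.

```python
import numpy as np

# Two unit masses joined to walls and to each other by unit springs:
# V = x1^2/2 + x2^2/2 + (x1 - x2)^2/2, so the Hessian V_ij is
V = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
m = 1.0

# Normal modes: omega_q^2 are the eigenvalues of V_ij / m
w2, modes = np.linalg.eigh(V / m)
omegas = np.sqrt(w2)
print(omegas)            # eigenfrequencies 1 and sqrt(3)

# ground-state energy of the resulting decoupled oscillators (hbar = 1)
E0 = 0.5 * omegas.sum()
print(E0)
```

The eigenvector columns in `modes` are the normal coordinates in which the Hamiltonian (8.7) becomes a sum of independent oscillators, as in the generalization of Section 8.3.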
Finally, similarly considering a field φ(x, t) in Fourier space, it can be decomposed in an infinite
set of approximately harmonic oscillators, as in the above finite dimensional case. We will not do
this here, however, since field theory is beyond the scope of this book.
The classical equations of motion of the harmonic oscillator Hamiltonian (Hamilton's
equations) are:

    ẋ = ∂H/∂p = p/m,   ṗ = −∂H/∂x = −mω²x,    (8.8)

leading to a single equation for x(t),

    ẍ + ω²x = 0,    (8.9)

with independent solutions e^{iωt} and e^{−iωt}, or sin ωt and cos ωt, with general real solution

    x(t) = A sin(ωt + φ).    (8.10)
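As a quick cross-check (not from the text), Hamilton's equations (8.8) can be integrated numerically and compared with the closed-form solution (8.10); the values of m, ω, A, φ and the fixed-step RK4 integrator are illustrative choices.

```python
import math

m, omega = 1.0, 2.0
A, phi = 1.0, 0.3
# initial data matching x(t) = A sin(omega t + phi) at t = 0
x, p = A * math.sin(phi), m * omega * A * math.cos(phi)

def deriv(x, p):
    return p / m, -m * omega**2 * x      # (x_dot, p_dot), eq. (8.8)

dt, steps = 1e-3, 5000
for _ in range(steps):                    # classical fourth-order Runge-Kutta
    k1 = deriv(x, p)
    k2 = deriv(x + 0.5*dt*k1[0], p + 0.5*dt*k1[1])
    k3 = deriv(x + 0.5*dt*k2[0], p + 0.5*dt*k2[1])
    k4 = deriv(x + dt*k3[0], p + dt*k3[1])
    x += dt/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
    p += dt/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])

t = dt * steps
print(x, A * math.sin(omega * t + phi))   # the two agree closely
```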

8.2 Quantization in the Creation and Annihilation


Operator Formalism

We want to solve the Schrödinger equation for the quantum harmonic oscillator,

    iℏ∂ₜ|ψ⟩ = Ĥ|ψ⟩,    (8.11)

where the quantum Hamiltonian is, as we saw above,

    Ĥ = P̂²/2m + (mω²/2) X̂².    (8.12)

We do the usual separation of variables, writing the eigenvalue problem for Ĥ (the time-independent
Schrödinger equation),

    Ĥ|ψ⟩ = E|ψ⟩.    (8.13)
93 8 The Harmonic Oscillator

In order to do that, and to diagonalize the Hamiltonian, i.e., find its eigenstates, we consider a
change of quantum operators from (X̂, P̂) to complex operators (â, â†), defined by

    â = (1/√2) [√(mω/ℏ) X̂ + i P̂/√(mωℏ)] ≡ (Q̂ + iP̂)/√2
    â† = (1/√2) [√(mω/ℏ) X̂ − i P̂/√(mωℏ)] ≡ (Q̂ − iP̂)/√2,    (8.14)

with the inverse

    Q̂ = (â + â†)/√2,   P̂ = (â − â†)/(i√2).    (8.15)

Under this change, the canonical commutation relation [X̂, P̂] = iℏ1̂ becomes

    [â, â†] = ½ [Q̂ + iP̂, Q̂ − iP̂] = +i[P̂, Q̂] = 1̂,    (8.16)

and of course we also have [â, â] = [â†, â†] = 0.
The Hamiltonian then becomes

    Ĥ = (ℏω/4) [(â + â†)² − (â − â†)²] = (ℏω/2) (â â† + â†â)
      = ℏω (â†â + ½) ≡ ℏω (N̂ + ½),    (8.17)
where in the second line we have used the commutation relation [â, â†] = 1̂ and have defined the
“number operator” (we will see shortly why it is called this)

    N̂ = â†â,    (8.18)

which we can easily see is Hermitian, N̂† = â†â = N̂.


This means that the eigenstates of Ĥ are also eigenstates of N̂, so we will consider the eigenvalue
problem for the latter, with the eigenvalue called n:

    N̂|n⟩ = n|n⟩.    (8.19)

Since we have a Hilbert space, the norms of the states |n⟩ must be positive and nonzero,
⟨n|n⟩ > 0. But now consider the expectation value

    ⟨n|N̂|n⟩ = ⟨n|â†â|n⟩ = ||â|n⟩||² ≥ 0
            = n⟨n|n⟩ ≥ 0,    (8.20)

leading to the fact that the eigenvalue n is nonnegative, n ≥ 0, and moreover

    n = 0 ⇔ â|n⟩ = 0.    (8.21)

That means that the ground state of the harmonic oscillator, the state of minimum energy and
therefore also of minimum eigenvalue for N̂, has n = 0, is denoted |0⟩, and is called the
“vacuum” state. It is defined by

    â|0⟩ = 0.    (8.22)
Next we note that n (the eigenvalue of N̂) is increased by the action of â† and decreased by the
action of â, since

    [N̂, â] = â†â â − â â†â = [â†, â] â = −â
    [N̂, â†] = â†â â† − â†â†â = â† [â, â†] = â†.    (8.23)

Indeed, N̂|n⟩ = n|n⟩ then implies that

    N̂(â|n⟩) = (âN̂ − â)|n⟩ = (n − 1)(â|n⟩) ⇒ â|n⟩ ∝ |n − 1⟩
    N̂(â†|n⟩) = (â†N̂ + â†)|n⟩ = (n + 1)(â†|n⟩) ⇒ â†|n⟩ ∝ |n + 1⟩.    (8.24)

This means that â† creates a quantum of energy ℏω, since when acting on a state it increases the
energy as E → E + ℏω, and â annihilates a quantum of energy ℏω, decreasing the energy of a state
as E → E − ℏω. Thus â is called an annihilation or lowering operator, and â† is called a creation or
raising operator.
Since n can jump only by one, up or down, and n = 0 for the vacuum state |0⟩ defined by â|0⟩ = 0,
it follows that n ∈ ℕ is a natural number. The states are then indexed by this natural number n,

    |0⟩,  |1⟩ ∝ â†|0⟩,  |2⟩ ∝ â†|1⟩ ∝ (â†)²|0⟩,  . . .    (8.25)

and the energy eigenvalue for the state |n⟩ is

    Eₙ = (n + ½) ℏω.    (8.26)

We require orthonormal states, ⟨n|n⟩ = 1 and, more generally,

    ⟨n|m⟩ = δₙₘ,    (8.27)

which means that the proportionality constants for the action of â and â†, given by

    â|n⟩ = αₙ|n − 1⟩
    â†|n⟩ = βₙ|n + 1⟩,    (8.28)

can be calculated. Indeed, from the relations found previously,

    n || |n⟩ ||² = ||â|n⟩||² = |αₙ|² || |n − 1⟩ ||² ⇒ |αₙ| = √n.    (8.29)

Choosing the phase of αₙ to be unity (the trivial choice), we obtain αₙ = √n. Similarly, from the
previous relations,

    ⟨n|â â†|n⟩ = ||â†|n⟩||² = ⟨n|(â†â + 1)|n⟩ = (n + 1) || |n⟩ ||²
               = |βₙ|² || |n + 1⟩ ||² ⇒ |βₙ| = √(n + 1).    (8.30)

Then, choosing the phase of βₙ to also be unity, we obtain βₙ = √(n + 1). All in all,

    â|n⟩ = √n |n − 1⟩,   â†|n⟩ = √(n + 1) |n + 1⟩.    (8.31)
Since ⟨n|m⟩ = δₙₘ, the matrix elements of â and â† are

    ⟨m|â|n⟩ = √n δₘ,ₙ₋₁
    ⟨m|â†|n⟩ = √(n + 1) δₘ,ₙ₊₁,    (8.32)

which means the operators â and â† are represented in the basis with n = 0, 1, 2, . . . by the matrices

    â = ⎛ 0  √1   0   0  ⋯ ⎞       â† = ⎛  0   0   0   0  ⋯ ⎞
        ⎜ 0   0  √2   0  ⋯ ⎟            ⎜ √1   0   0   0  ⋯ ⎟
        ⎜ 0   0   0  √3  ⋯ ⎟            ⎜  0  √2   0   0  ⋯ ⎟    (8.33)
        ⎜ 0   0   0   0  ⋱ ⎟            ⎜  0   0  √3   0  ⋱ ⎟
        ⎝ ⋮               ⎠             ⎝  ⋮              ⎠

Moreover, we obtain the general state by repeated action with â†, using the recursion relation
n times,

    |n⟩ = (â†/√n) |n − 1⟩ = (â†â†/√(n(n − 1))) |n − 2⟩ = · · · = ((â†)ⁿ/√(n!)) |0⟩.    (8.34)

8.3 Generalization

We can easily generalize the single harmonic oscillator to the case of several oscillators, coming
either from several particles with harmonic oscillator potentials, or from diagonalizing a total
interacting Hamiltonian expanded around a stable minimum. Then we have

    H = Σ_q H_q,    (8.35)

with each H_q being an independent harmonic oscillator of frequency ω = ω_q,

    H_q = ℏω_q (â_q†â_q + ½).    (8.36)

The fact that the harmonic oscillators are independent means that we have

    [â_q, â_{q′}†] = δ_{q,q′} 1̂    (8.37)

(and the rest of the commutators vanish). Then the states of the total system are, as in the general
theory described previously, tensor products of the states for each Hamiltonian H_q, meaning that the
general state, with occupation numbers n₁, n₂, . . . , n_q, . . . in each mode, is

    |n₁, n₂, . . . , n_q, . . .⟩ = [(â₁†)^{n₁}/√(n₁!)] · · · [(â_q†)^{n_q}/√(n_q!)] · · · |0⟩,    (8.38)

where the total vacuum state is the tensor product of the vacuum states of each oscillator,

    |0⟩ ≡ |0⟩₁ ⊗ |0⟩₂ ⊗ · · ·    (8.39)


8.4 Coherent States

We can define bases other than {|n⟩} (the eigenstates of the Hamiltonian) in the Hilbert space. One such
basis is the basis of coherent states |α⟩, defined as eigenstates of the annihilation operator â,

    â|α⟩ = α|α⟩.    (8.40)

We find that the coherent states can be written in terms of the basis of |n⟩ states as

    |α⟩ = e^{α â†}|0⟩ ≡ Σ_{n≥0} (αⁿ/n!) (â†)ⁿ|0⟩.    (8.41)

Indeed, then

    â|α⟩ = [â, e^{α â†}]|0⟩ = α e^{α â†}|0⟩ = α|α⟩,    (8.42)

where in the first equality we have used â|0⟩ = 0 and in the second we have used

    [â, e^{α â†}] = Σ_{n≥0} (αⁿ/n!) {[â, â†](â†)^{n−1} + â†[â, â†](â†)^{n−2} + · · · + (â†)^{n−1}[â, â†]}
                  = α Σ_{n≥1} (α^{n−1}/(n − 1)!) (â†)^{n−1} = α e^{α â†}.    (8.43)

Similarly, we define the complex conjugate state,

    ⟨α*| ≡ ⟨0|e^{α* â},    (8.44)

obeying the complex conjugate relation

    ⟨α*|â† = ⟨α*|α*.    (8.45)

The inner product of the bra and ket states is found to be

    ⟨α*|α⟩ = ⟨0|e^{α* â}|α⟩ = e^{α*α}⟨0|α⟩ = e^{α*α},    (8.46)

since we have (using ⟨0|â† = 0)

    ⟨0|α⟩ = ⟨0|e^{α â†}|0⟩ = ⟨0|0⟩ = 1.    (8.47)

We also have a completeness relation for these bra and ket states,

    1̂ = ∫ (dα dα*/2πi) e^{−αα*} |α⟩⟨α*|,    (8.48)

but we will leave its proof as an exercise.

8.5 Solution in the Coordinate, |x⟩, Representation (Basis)

Until now, we have worked abstractly with ket states, without choosing a representation. But we now
want to obtain the probabilities of finding the harmonic oscillator in coordinate (x) space, so we need
to calculate the wave functions ψₙ(x) ≡ ⟨x|n⟩.
According to the general theory defined in the previous chapter, the Schrödinger equation
Ĥ|n⟩ = Eₙ|n⟩ becomes

    [−(ℏ²/2m) d²/dx² + ½ mω²x²] ψₙ(x) = Eₙψₙ(x),    (8.49)

by inserting the completeness relation ∫dx |x⟩⟨x| = 1̂ and multiplying with ⟨x|. Rewriting it as

    ψₙ″(x) + [εₙ − (mωx/ℏ)²] ψₙ(x) = 0,    (8.50)

where we have defined as before

    ε = (2m/ℏ²) E,    (8.51)

or, better still, defining the coordinate y = x√(mω/ℏ) and

    ε̃ = E/(ℏω),    (8.52)

we have

    ψₙ″(y) + (2ε̃ₙ − y²) ψₙ(y) = 0.    (8.53)

In order to solve this differential equation and find its eigenvalues  n , we use a method, called the
Sommerfeld method, that can be used in more general situations. We first find the behavior of the
equation, and the corresponding solutions, at the extremes of the region, y → ∞ and y → 0, and then
we factor them out of ψ(y), and write an equation for the reduced function.

(a) The limit y → ∞

In this limit, equation (8.53) becomes

    ψ″ − y²ψ = 0,    (8.54)

on neglecting the constant in the second term. The solution is of the general form

    ψ(y) = (Ayᵐ) e^{±y²/2},    (8.55)

where for completeness we have considered also the subleading polynomial factor (Ayᵐ), but in fact
this is not needed, since all we need to do is to factor out the leading behavior. Indeed, then

    ψ″(y) = Ay^{m+2} e^{±y²/2} [1 ± (2m + 1)/y² + m(m − 1)/y⁴] → y²ψ(y),    (8.56)

independently of the polynomial, in the y → ∞ limit. From the condition that ψ(y) must be a
normalizable function (since its modulus squared gives probabilities), we can only have the negative
sign in the exponent, e^{−y²/2}.

(b) The limit y → 0

In this case we can ignore the y² term with respect to the constant in the second term, obtaining

    ψ″(y) + 2ε̃ψ(y) = 0,    (8.57)

with the general solution

    ψ(y) = Ã sin(√(2ε̃) y) + B̃ cos(√(2ε̃) y).    (8.58)

However, since we are considering the y → 0 limit, we can ignore order-y² terms in the solution
(as in the equation), and write only

    ψ(y) → Ã√(2ε̃) y + B̃.    (8.59)

Putting together the two limits, we factor out the leading behaviors at infinity (e^{−y²/2}) and at zero
(just 1, a constant), redefining

    ψ(y) = e^{−y²/2} H(y),    (8.60)

where from the general (including subleading) behavior at infinity, and the behavior at zero, we know
that H(y) should be a polynomial (∼ Ayᵐ at y → ∞ and ∼ C̃y + B̃ at y → 0). We calculate first

    ψ″(y) = e^{−y²/2} [H″(y) − 2yH′(y) + (y² − 1)H(y)],    (8.61)

which means that the equation for H(y) is

    H″(y) − 2yH′(y) + (2ε̃ − 1)H(y) = 0.    (8.62)

But the good (i.e., normalizable) solution at y → ∞ is ∼ e^{−y²/2} and not ∼ e^{+y²/2}, while the solution
in general is a linear combination of the two possible solutions at zero (sin(√(2ε̃)y) ∼ √(2ε̃)y and
cos(√(2ε̃)y) ∼ 1); since we want a nonzero solution at y = 0, we get a constraint on ε̃ = ε̃ₙ, that is, a
quantization condition for the energy. Indeed, this good behavior both at y = 0 and at y = ∞ of the
solution will lead to the condition

    2ε̃ = 2E/(ℏω) = 2n + 1,   n ∈ ℕ,    (8.63)

which will mean that the Hₙ(y) are Hermite polynomials. To see this, we first write H(y) as an
infinite Taylor series,

    H(y) = Σ_{n≥0} Cₙyⁿ.    (8.64)

Then, substituting in the equation for H(y), equation (8.62), we find

    Σ_{n≥0} Cₙ [n(n − 1)y^{n−2} − 2nyⁿ + (2ε̃ − 1)yⁿ] = 0.    (8.65)

But we can shift the sums above to have the same yⁿ factor (and verify that in doing so we
don't change the n = 0 and n = 1 terms):

    Σ_{n≥0} yⁿ [(n + 2)(n + 1)Cₙ₊₂ + Cₙ(2ε̃ − 1 − 2n)] = 0.    (8.66)

Since the yⁿ are linearly independent functions, we can set all their coefficients to zero, obtaining
a recurrence relation for Cₙ,

    Cₙ₊₂ = [(2n + 1 − 2ε̃)/((n + 2)(n + 1))] Cₙ.    (8.67)
Here we use the fact that equation (8.62) is a rewriting (through the redefinition ψ(y) = e^{−y²/2}H(y)) of
(8.53), which we know has two possible behaviors at infinity, e^{±y²/2}; so (8.62) also has two possible
behaviors at infinity, one polynomial, and one ∼ e^{+y²}. Generically, then, the series H(y) = Σ_{n≥0} Cₙyⁿ
would lead, if all the Cₙ were nonzero all the way to infinity, to ∼ e^{+y²} behavior. To avoid that possibility,
we require the Taylor series to terminate at a finite n, so that H(y) is a Hermite polynomial Hₙ(y).
Given the above recurrence relation, this is only possible if Cₙ₊₂/Cₙ = 0 for some n, implying
indeed that

    ε̃ₙ = n + ½,   n ∈ ℕ.    (8.68)
For such an n, we obtain the eigenfunction

    ψₙ(x) = Aₙ e^{−mωx²/2ℏ} Hₙ(√(mω/ℏ) x),    (8.69)

where Aₙ is a normalization constant, which can be found from the condition

    ∫₋∞^{+∞} dx |ψₙ(x)|² = 1,    (8.70)

leading to

    Aₙ = [mω/(πℏ 2^{2n}(n!)²)]^{1/4}.    (8.71)
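The normalization (8.71) and the orthonormality of the ψₙ can be confirmed by quadrature; a sketch (not from the text), in units m = ω = ℏ = 1 so that x = y, using the physicists' Hermite polynomials from numpy:

```python
import numpy as np
from math import factorial, pi

# psi_n(y) = A_n e^{-y^2/2} H_n(y), with A_n from eq. (8.71) at m = omega = hbar = 1
y = np.linspace(-12.0, 12.0, 4001)
dy = y[1] - y[0]

def psi(n):
    c = np.zeros(n + 1); c[n] = 1.0
    Hn = np.polynomial.hermite.hermval(y, c)              # physicists' H_n
    An = (1.0 / (pi * 2**(2 * n) * factorial(n)**2))**0.25
    return An * np.exp(-y**2 / 2) * Hn

# Gram matrix of overlaps: should be the 5x5 identity
G = np.array([[np.sum(psi(m_) * psi(n_)) * dy for n_ in range(5)]
              for m_ in range(5)])
print(np.round(G, 8))
```

Since the integrands are smooth and decay like Gaussians, the simple Riemann sum here is accurate far beyond the displayed precision.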

8.6 Alternative to the |x⟩ Representation: Basis Change from the |n⟩ Representation

As an alternative to the above derivation, we could go from the |n⟩ to the |x⟩ basis directly.
First, we consider the ground state, defined by â|0⟩ = 0. Introducing the completeness relation
1̂ = ∫dx′ |x′⟩⟨x′| on the right-hand side of â, multiplying with ⟨x|, and defining the ground state
wave function

    ⟨x|0⟩ ≡ ψ₀(x),    (8.72)

we obtain

    0 = ∫dx′ ⟨x|â|x′⟩ ψ₀(x′).    (8.73)

Using the matrix element

    ⟨x|â|x′⟩ = (1/√2) ⟨x| [√(mω/ℏ) X̂ + i P̂/√(mωℏ)] |x′⟩ = (1/√2) δ(x − x′) [√(mω/ℏ) x + √(ℏ/mω) d/dx]
             = [δ(x − x′)/√2] (y + d/dy),    (8.74)

where we have defined y ≡ √(mω/ℏ) x as before, we find the equation

    0 = ∫dx′ [δ(x − x′)/√2] (y + d/dy) ψ₀(x′) = (1/√2) (y + d/dy) ψ₀(y).    (8.75)

Its solution is

    ψ₀(x) = A₀ e^{−mωx²/2ℏ}.    (8.76)
To proceed to the other states, we act with ⟨x| on

    |n⟩ = [(â†)ⁿ/√(n!)] |0⟩,    (8.77)

and again by inserting a completeness relation to the left of |0⟩, we find

    ψₙ(x) = (1/√n!) ∫dx′ ⟨x|(â†)ⁿ|x′⟩ ψ₀(x′) = (A₀/√n!) [(y − d/dy)/√2]ⁿ e^{−y²/2}
          = [A₀/√(n! 2ⁿ)] e^{−y²/2} Hₙ(y),    (8.78)

since the Hermite polynomials can be defined by

    Hₙ(y) ≡ e^{y²/2} (y − d/dy)ⁿ e^{−y²/2}.    (8.79)

This result matches what we found before, including the normalization constant.

8.7 Properties of Hermite Polynomials

The Hermite polynomials form an orthogonal set with respect to the weight e^{−y²}, with scalar
product

    ∫₋∞^{+∞} Hₙ(y)Hₘ(y) e^{−y²} dy = δₙₘ (2ⁿ n! √π).    (8.80)

They can be defined by

    Hₙ(x) = (−1)ⁿ e^{x²} (d/dx)ⁿ e^{−x²},    (8.81)

have parity given by the order n,

    Hₙ(−x) = (−1)ⁿ Hₙ(x),    (8.82)

and satisfy the recurrence relations

    Hₙ₊₁(x) − 2xHₙ(x) + 2nHₙ₋₁(x) = 0
    Hₙ′(x) = 2nHₙ₋₁(x).    (8.83)
They also form a complete set on the space of functions on the interval (−∞, +∞) that are square
integrable with weight e^{−x²}, L²_{e^{−x²}}(ℝ); correspondingly, the oscillator eigenfunctions satisfy

    Σ_{n≥0} ψₙ(x)ψₙ(x′) = δ(x′ − x).    (8.84)

They can be obtained by Taylor expansion of the generating function

    e^{−s²+2sz} = Σ_{n≥0} (sⁿ/n!) Hₙ(z).    (8.85)
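The generating function (8.85) can be checked numerically by truncating the sum; a sketch (not from the text), with illustrative values of s and z for which the series converges very rapidly:

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from math import factorial, exp

# e^{-s^2 + 2 s z} = sum_n (s^n/n!) H_n(z), eq. (8.85), truncated at N terms
s, z, N = 0.3, 0.7, 30
total = sum(s**n / factorial(n) * hermval(z, np.eye(N)[n]) for n in range(N))
print(total, exp(-s**2 + 2 * s * z))     # the two agree to machine precision
```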
8.8 Mathematical Digression (Appendix): Classical


Orthogonal Polynomials

A set of polynomials {Pₙ(x)} constitutes a set of classical orthogonal polynomials with weight ρ on
the interval (a, b) if

    ∫ₐᵇ ρ(x)Pₙ(x)Pₘ(x) dx = δₙₘ    (8.86)

and ρ satisfies

    [σ(x)ρ(x)]′ = τ(x)ρ(x),    (8.87)

where

    σ(x) = ⎧ (x − a)(x − b),   a, b finite
           ⎨ x − a,            b = ∞, a finite    (8.88)
           ⎩ b − x,            a = −∞, b finite,

τ(x) is a linear function, and

    ρ(x) = ⎧ (x − a)^β (b − x)^α,   α = −τ(b)/(b − a) − 1,  β = τ(a)/(b − a) − 1,   a, b finite
           ⎪ (x − a)^β e^{xτ′(x)},  β = τ(a) − 1,                                   b = ∞, a finite
           ⎨ (b − x)^α e^{−xτ′(x)}, α = −τ(b) − 1,                                  a = −∞, b finite    (8.89)
           ⎩ exp(∫τ(x)dx),                                                          (a, b) = (−∞, +∞).

Then, one can prove the limits

    lim_{x→a} xᵐσ(x)ρ(x) = 0,   lim_{x→b} xᵐσ(x)ρ(x) = 0.    (8.90)

One can take the general classical polynomials to their canonical forms, by making changes of
variables for x and Pn .

(1) The case where (a, b) are finite


In this case, we can change variables to set (a, b) → (−1, 1) and obtain

ρ(t) = (1 − t) α (1 + t) β
σ(t) = 1 − t 2 (8.91)
τ(t) = −(α + β + 2)t + β − α.
The resulting classical orthogonal polynomials are the Jacobi polynomials.
In the particular case of α = β = 0 (so that ρ(t) = 1 is trivial), we obtain for Pn (x) the Legendre
polynomials.
In the particular cases α = β = −1/2 and α = β = +1/2, we obtain the Chebyshev polynomials of
the first kind, Tₙ(x), and of the second kind, Uₙ(x), respectively. In the cases α = β = λ − 1/2, we
obtain the Gegenbauer polynomials Gₙ(x).

(2) The cases (−∞, b) and (a, +∞)


In these cases, we change variables (a, b) to (0, +∞). We obtain the Laguerre polynomials L nα (x),
with
ρ(t) = t α e−t
σ(t) = t (8.92)
τ(t) = −t + α + 1.

(3) The case (a, b) = (−∞, +∞)


In this case, we can put

    ρ(t) = e^{−t²}
    σ(t) = 1    (8.93)
    τ(t) = −2t,

and the resulting polynomials are the Hermite polynomials.

Properties of the Classical Orthogonal Polynomials


The only orthogonal polynomials whose derivatives are also orthogonal polynomials are the classical
orthogonal polynomials described in this appendix.
They satisfy the eigenvalue equation

    (d/dx) [σ(x)ρ(x) (d/dx)Pₙ(x)] + λₙ ρ(x)Pₙ(x) = 0,    (8.94)

where the eigenvalue is

    λₙ = −n [τ′(x) + ((n − 1)/2) σ″(x)].    (8.95)

They are also given by Rodrigues' formula,

    Pₙ(x) = (Aₙ/ρ(x)) (dⁿ/dxⁿ) [σⁿ(x)ρ(x)],    (8.96)

and they form a basis for the Hilbert space L²ρ(a, b), so that

    Σ_{n≥0} Pₙ(x)Pₙ(x′) = δ(x − x′).    (8.97)

They satisfy the recurrence formula

    xPₙ(x) = αₙPₙ₊₁(x) + βₙPₙ(x) + γₙPₙ₋₁(x),    (8.98)

valid for general orthogonal polynomials, and

    σ(x)Pₙ′(x) = αₙ¹Pₙ₊₁(x) + (βₙ¹ + γₙ¹x)Pₙ(x)
    Pₙ′(x) = αₙ²Pₙ₊₁(x) + (βₙ² + γₙ²x)Pₙ(x),    (8.99)

valid only for classical orthogonal polynomials.
The generating function K(x, t) can be Taylor expanded to give the classical orthogonal
polynomials,

    K(x, t) = Σ_{n≥0} (P̃ₙ(x)/n!) tⁿ,    (8.100)

and it can be calculated from the formula

    K(x, t) = (1/ρ(x)) ρ(z₁)/(1 − tσ′(z₁)),    (8.101)

where z₁ is the zero closest to x of the function

    f(z) = z − x − σ(z)t.    (8.102)

For the Legendre polynomials Pₙ(x), we have

    Pₙ(x) = [1/(2ⁿ n!)] (dⁿ/dxⁿ) [(x² − 1)ⁿ],
    1/√(1 − 2tx + t²) = Σ_{n≥0} Pₙ(x) tⁿ.    (8.103)

For the Hermite polynomials Hₙ(x), we have

    Hₙ(x) = (−1)ⁿ e^{x²} (dⁿ/dxⁿ) (e^{−x²}),
    e^{2tx−t²} = Σ_{n≥0} (tⁿ/n!) Hₙ(x).    (8.104)

For the Laguerre polynomials Lₙᵅ(x), we have

    K(x, t) = e^{tx/(t−1)} / (1 − t)^{α+1}.    (8.105)
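The Legendre generating function in (8.103) can be verified numerically in the same spirit as before; a sketch (not from the text), with illustrative x and |t| < 1:

```python
import numpy as np
from numpy.polynomial.legendre import legval

# 1/sqrt(1 - 2 t x + t^2) = sum_n P_n(x) t^n, eq. (8.103), truncated at N terms
x, t, N = 0.4, 0.5, 80
series = sum(t**n * legval(x, np.eye(N)[n]) for n in range(N))
print(series, 1.0 / np.sqrt(1 - 2 * t * x + t**2))   # agree to machine precision
```

Since |Pₙ(x)| ≤ 1 on [−1, 1], the truncation error is bounded by a geometric tail in t.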

Important Concepts to Remember

• The harmonic oscillator is an important prototype for a quantum mechanical system, which can be
used to test formalisms and to approximate systems to the quadratic order.
• One can define its quantum mechanics in terms of creation and annihilation operators ↠, â,
satisfying [â, ↠] = 1, which create and annihilate quanta of frequency ω. This is also called the
occupation number basis.

• The states of n quanta√ are created by acting with â n times on the vacuum |0, satisfying â|0 = 0,
i.e., |n = [( â ) / n!]|0; their number is measured by N = ↠â, N |n = n|n and their energy is
† n

En = (n + 1/2)ω.
• Coherent states |α⟩ are eigenstates of the annihilation operator â, â|α⟩ = α|α⟩, and are written as
|α⟩ = e^{αâ†}|0⟩.
• The solution of the harmonic oscillator in the |x (coordinate) representation is obtained by the
Sommerfeld method, by finding the behavior of the harmonic oscillator equation, and of its
solution, at the limits of the domain (here x → 0 and x → ∞), and factoring this out of the
general solution, which usually reduces the solution to a polynomial.
• In the Sommerfeld method, one usually obtains a quantization condition by imposing that the
unique normalizable solution at one end of the domain matches the unique normalizable solution
at the other end, so that an infinite series reduces to a polynomial, usually a classical orthogonal
one.
• For the harmonic oscillator, the Sommerfeld method leads to the Hermite polynomials and so to
the solution ψₙ(y) = e^{−y²/2}Hₙ(y), with y = x√(mω/ℏ).

• Classical orthogonal polynomials are polynomials that are orthogonal on an interval (a, b), with a
weight ρ(x), whose derivatives are also orthogonal polynomials.
• In the case of a, b finite, redefining the domain to (−1, 1) we have the Jacobi polynomials; in the
case of (a, +∞) or (−∞, b) we redefine the domain to (0, +∞), obtaining the Laguerre polynomials,
and in the case (−∞, +∞) we have the Hermite polynomials.

Further Reading
See [2] for more details.

Exercises

(1) Consider the Lagrangian

    L = q̇₁²/2 + q̇₂²/2 − (α₁/2) sin²(β₁q₁) − (α₂/2) sin²(β₂q₂) − (α₁₂/2) sin²(β₁₂(q₁ + q₂)).    (8.106)
Approximate the system by two harmonic oscillators, and quantize it in the creation and
annihilation operator (occupation number) representation.
(2) Consider the Hamiltonian for a (large) number N of oscillators âᵢ, âᵢ† (with [âᵢ, âⱼ†] = δᵢⱼ):

    H = Σᵢ₌₁ᴺ [âᵢ†âᵢ + α(âᵢ₊₁†âᵢ + h.c.)],    (8.107)

where â_{N+1} ≡ â₁, â_{N+1}† ≡ â₁†. Diagonalize it and find its eigenstates.


(3) Show that the completeness relation for coherent states is (8.48).
(4) Calculate (in terms of a single ket state and no operators)

    exp{i[â†â + β(â + â†)³]} |α⟩.    (8.108)

(5) Calculate ⟨x²⟩ₙ and ⟨x³⟩ₙ in the state |n⟩ of the harmonic oscillator.


(6) Use the Sommerfeld method for a particle in one dimension with a quartic potential, instead of
a quadratic one, V (x) = λx 4 . What is the resulting reduced equation, and can you describe the
quantization condition for bound states?
(7) Consider two harmonic oscillators, one with frequency ω and one with frequency 2ω. Calculate
the symmetric wave function (in x space) corresponding to the energy E = (15/2)ℏω.
9 The Heisenberg Picture and General Picture; Evolution Operator

Until now, we have considered the Schrödinger equation as acting on states |ψ_S(t)⟩, which are time
dependent, with operators that are time independent. This is known as the Schrödinger picture.
However, we can consider other “pictures” with which to describe quantum mechanics, for instance
one in which the operators are time dependent but the states are time independent; this is known as
the Heisenberg picture.

9.1 The Evolution Operator

In order to define the Heisenberg picture, we remember that in the Schrödinger picture the time
evolution was packaged in a single operator, the “time evolution” operator Û (t, t 0 ), defined in the
Schrödinger picture by
|ψ_S(t)⟩ = Û(t, t₀)|ψ_S(t₀)⟩. (9.1)
Indeed, in this usual, Schrödinger, picture we can define the evolution operator as follows. We can note that the evolution from |ψ(t₀)⟩ to |ψ(t)⟩ is a linear operation, preserving in time the linear superposition of states, so we can define this time evolution as the result of an operator Û(t, t₀) as in the relation (9.1) above. The relation must be a unitary transformation, Û†(t, t₀) = Û⁻¹(t, t₀), since it must preserve the norm of the states in time,

⟨ψ(t)|ψ(t)⟩ = ⟨ψ(t₀)|ψ(t₀)⟩, (9.2)

because the norm is associated with probabilities, and conservation of the total probability must be imposed.
We also saw that, for a conservative system, for which Ĥ ≠ Ĥ(t), on eigenstates |ψ_E(t)⟩ of the Hamiltonian,

Ĥ|ψ_E(t)⟩ = E|ψ_E(t)⟩, (9.3)

the time evolution of states is obtained as

|ψ_E(t)⟩ = e^{−iE(t−t₀)/ℏ}|ψ_E(t₀)⟩ = e^{−iĤ(t−t₀)/ℏ}|ψ_E(t₀)⟩, (9.4)

so the time evolution operator is

Û(t, t₀) = e^{−iĤ(t−t₀)/ℏ}. (9.5)
In the energy eigenstate basis, the completeness relation is written as 1̂ = Σ_{n,a} |E_{n,a}⟩⟨E_{n,a}|, so, inserting it in front of the above relation, the time evolution operator can also be written as

Û(t, t₀) = Σ_{n,a} |E_{n,a}⟩⟨E_{n,a}| e^{−iE_n(t−t₀)/ℏ}. (9.6)
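As an aside not in the text, the equivalence of (9.5) and the spectral form (9.6) is easy to check numerically for a finite-dimensional stand-in Hamiltonian; a minimal sketch, assuming NumPy and SciPy are available (ℏ = 1, and the random 4×4 Hermitian matrix is an arbitrary illustration, not a physical Hamiltonian):

```python
# Build U(t, t0) two ways for a random 4x4 Hermitian "Hamiltonian"
# (hbar = 1, t - t0 = 0.7): from the spectral sum (9.6) and as the
# matrix exponential (9.5). The two must agree, and U must be unitary.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2                       # Hermitian
t = 0.7

E, V = np.linalg.eigh(H)                       # E_n and |E_n> (columns of V)
U_spectral = sum(np.exp(-1j * E[n] * t) * np.outer(V[:, n], V[:, n].conj())
                 for n in range(4))
U_exp = expm(-1j * H * t)                      # direct exponential (9.5)
```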

Moreover, we can write a differential equation for this time evolution operator by acting on (9.5) with d/dt, obtaining

iℏ (d/dt) Û(t, t₀) = Ĥ Û(t, t₀). (9.7)

The difference is that now we can define the time evolution operator to be that found from the above differential equation, together with the boundary condition Û(t₀, t₀) = 1̂, even in the case of a nonconservative system, with Ĥ = Ĥ(t). In this latter case, we can "integrate" the equation over a very small interval t − t₀, and write

Û(t, t₀) = 1̂ − (i/ℏ) ∫_{t₀}^{t} Ĥ(t′) Û(t′, t₀) dt′. (9.8)
This is in fact equivalent to the action of the Schrödinger equation on states,

iℏ (d/dt) |ψ(t)⟩ = Ĥ|ψ(t)⟩, (9.9)

since by integrating it we obtain the same time evolution operator:

|ψ(t + dt)⟩ = (1 − (i/ℏ)Ĥ dt)|ψ(t)⟩ = Û(t + dt, t)|ψ(t)⟩ ≃ e^{−iĤ(t)dt/ℏ}|ψ(t)⟩. (9.10)
We note that in this expression, since we have a Hermitian Hamiltonian, Ĥ† = Ĥ, the evolution operator is indeed unitary, Û† = Û⁻¹, so that finite time evolution amounts to an infinite sequence of unitary infinitesimal operations, leading to a total unitary operation. The final state, after this sequence of operations, each one centered on a t_n, is

|ψ(t)⟩ = ∏_n exp(−(i/ℏ) Ĥ(t_n) dt_n) |ψ(t₀)⟩. (9.11)

We need to use this complicated formula since, in general, the Hamiltonians at different times need not commute,

[Ĥ(t₁), Ĥ(t₂)] ≠ 0, (9.12)

so if we were to write an integral ∫ Ĥ(t′)dt′ in the exponent instead of the product of infinitesimal exponentials, this would in general differ from the above product of exponentials. Therefore we have to define the integral in the exponential only together with a time-ordering operator, which puts products of operators in time order: T(Â(t₁) ⋯ Â(t_n)) = Â(t₁) ⋯ Â(t_n) if t₁ > t₂ > ⋯ > t_n. Then

|ψ(t)⟩ = T exp(−(i/ℏ) ∫_{t₀}^{t} Ĥ(t′) dt′) |ψ(t₀)⟩, (9.13)

where the time-ordered exponential is defined as

T exp(−(i/ℏ) ∫_{t₀}^{t} Ĥ(t′) dt′) ≡ lim_{N→∞} ∏_{n=0}^{N−1} exp(−(i/ℏ) Ĥ(t_n) dt_n). (9.14)
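Since this point is easy to miss, here is a numerical illustration of my own (not from the book): for a Hamiltonian that switches abruptly between two noncommuting pieces, the time-ordered product of exponentials differs from the naive exponential of ∫ Ĥ(t′)dt′ (Python with NumPy/SciPy, ℏ = 1):

```python
# Two-level system whose Hamiltonian switches from sigma_x to sigma_y at
# T/2 (hbar = 1). The time-ordered product (later times to the LEFT) is
# compared with the naive exponential of the integral of H(t).
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
T = 1.0

U_ordered = expm(-1j * sy * T / 2) @ expm(-1j * sx * T / 2)  # T-ordered
U_naive = expm(-1j * (sx + sy) * T / 2)   # would be right if [sx, sy] = 0

diff = np.linalg.norm(U_ordered - U_naive)
```

The nonzero difference is of the size suggested by the [Â, B̂]/2 term of the theorem proved below; the ordered product is still unitary, as it must be.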

Noncommuting Exponentials

For such noncommuting exponentials, we have the following theorem (valid when [Â, B̂] commutes with both Â and B̂, an assumption used in the proof below):

e^{Â+B̂} = e^{Â} e^{B̂} exp(−[Â, B̂]/2), (9.15)
108 9 Heisenberg and General Pictures; Evolution Operator

which can be proven as follows. Consider the function

f (x) = e Âx e B̂x , (9.16)

and consider its derivative,


df/dx = Â e^{Âx} e^{B̂x} + e^{Âx} B̂ e^{B̂x} = (Â + e^{Âx} B̂ e^{−Âx}) f(x). (9.17)

On the other hand, we also find that

[B̂, e^{−Âx}] = Σ_{n≥0} ((−x)ⁿ/n!) [B̂, Âⁿ] = −x Σ_{n≥1} ((−x)^{n−1}/(n−1)!) Â^{n−1} [B̂, Â] = −x e^{−Âx} [B̂, Â], (9.18)

allowing us to write

df/dx = (Â + B̂ + [Â, B̂]x) f(x). (9.19)

This differential equation can be integrated, with boundary condition f(0) = 1, to obtain

f(x) = e^{(Â+B̂)x} e^{x²[Â, B̂]/2}. (9.20)

Putting x = 1, we arrive at the stated equation.
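A quick numerical check of (9.15), not in the text: take Â and B̂ to be nilpotent 3×3 matrices whose commutator is central (the situation assumed in the theorem), and compare both sides using scipy.linalg.expm:

```python
# BCH check: A = 0.7 E_{12}, B = -1.3 E_{23} are nilpotent, and their
# commutator C = AB - BA is proportional to E_{13}, hence it commutes
# with both A and B (the hypothesis of the theorem).
import numpy as np
from scipy.linalg import expm

A = np.zeros((3, 3)); A[0, 1] = 0.7
B = np.zeros((3, 3)); B[1, 2] = -1.3
C = A @ B - B @ A                      # central commutator

lhs = expm(A + B)
rhs = expm(A) @ expm(B) @ expm(-C / 2)
```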

9.2 The Heisenberg Picture

Having defined the evolution operator, it follows that we can define time-independent states, which
by definition will be the states in the Heisenberg picture, by reversing the evolution operator, so these
states, denoted by |ψ H , will be

|ψ_H⟩ = Û⁻¹(t, t₀)|ψ_S(t)⟩ = |ψ_S(t₀)⟩. (9.21)

This change in the space of states can be thought of as a “unitary transformation”, or change
of basis in the Hilbert space, with the unitary operator U −1 (t, t 0 ) = U † (t, t 0 ). Through it, the time
dependence encoded in Û (t, t 0 ) can be put onto the operators, since the change of basis implies that
the operators are also changed from the Schrödinger picture operators ÂS to the Heisenberg picture
operators ÂH , as

ÂS → ÂH (t) = Û † (t, t 0 ) ÂS Û (t, t 0 ), (9.22)

in such a way that the matrix elements (or expectation values) are unchanged,

⟨ψ_S|Â_S|χ_S⟩ = ⟨ψ_H|Â_H|χ_H⟩. (9.23)

Note therefore that when we write Û (t, t 0 ), we mean the evolution operator in the Schrödinger
picture, ÛS (t, t 0 ).
The time evolution of the new operators is found by using the differential equation (9.7) for Û(t, t₀), and its Hermitian conjugate, in the time derivative of the definition of Â_H (acting on three terms), obtaining

iℏ (d/dt) Â_H = −Û†(t, t₀) Ĥ_S Â_S Û(t, t₀) + iℏ Û† (∂Â_S/∂t) Û + Û† Â_S Ĥ_S Û
= Û† [Â_S, Ĥ_S] Û + iℏ Û† (∂Â_S/∂t) Û. (9.24)
Defining the Hamiltonian in the Heisenberg representation,

ĤH = Û † ĤS Û, (9.25)

in such a way that

Û † [ ÂS , ĤS ]Û = [ ÂH , ĤH ], (9.26)

and if the operators have explicit time dependence, ∂Â_S/∂t ≠ 0, such that

∂Â_H/∂t = Û† (∂Â_S/∂t) Û, (9.27)

we obtain the equation for the time evolution of the Heisenberg operators,

iℏ dÂ_H/dt = [Â_H, Ĥ_H] + iℏ ∂Â_H/∂t. (9.28)
This Heisenberg equation of motion replaces the Schrödinger equation, since now operators have
time dependence, not the states (now ∂t |ψ H  = 0 always).
This equation of motion is the quantum mechanical counterpart of the classical equation in the Hamiltonian formalism, in terms of Poisson brackets,

dA_cl/dt = {A_cl, H_cl}_{P.B.} + ∂A_cl/∂t. (9.29)

This is the generalization of the Hamilton equations written with Poisson brackets,

dq_i/dt = {q_i, H}_{P.B.} = ∂H/∂p_i,
dp_i/dt = {p_i, H}_{P.B.} = −∂H/∂q_i. (9.30)
For the quantum mechanical case, the above Hamilton equations become the Heisenberg equations
of motion,
dq̂_i(t)/dt = −(i/ℏ)[q̂_i, Ĥ],
dp̂_i(t)/dt = −(i/ℏ)[p̂_i, Ĥ]. (9.31)

9.3 Application to the Harmonic Oscillator

We will use as the simplest example our standard toy model, the harmonic oscillator.
The quantum mechanical Heisenberg equations for â, ↠, taking the role of the Hamilton
equations, are
iℏ dâ/dt = [â, Ĥ] = ℏω â,
iℏ dâ†/dt = [â†, Ĥ] = −ℏω â†, (9.32)

where we have used Ĥ = ℏω(â†â + 1/2) and [â, â†] = 1.
The solution of these equations is
â(t) = â0 e−iωt
(9.33)
↠(t) = â0† e+iωt ,
which implies for the phase space variables q̂(t), p̂(t) that

q̂(t) = √(ℏ/(2mω)) (â₀† e^{iωt} + â₀ e^{−iωt}),
p̂(t) = i√(mωℏ/2) (â₀† e^{iωt} − â₀ e^{−iωt}). (9.34)
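These solutions can be verified numerically in a truncated Fock space (an illustration of mine, not the book's): because â†â remains diagonal after truncation, the relation â(t) = â₀e^{−iωt} holds exactly, matrix element by matrix element (ℏ = 1):

```python
# Truncated oscillator: a_{n,n+1} = sqrt(n+1), H = w(a†a + 1/2) is
# diagonal, U = exp(-i H t). Then U† a U = a e^{-i w t} exactly.
import numpy as np

N, w, t = 12, 2.0, 0.8
n = np.arange(N)
a = np.diag(np.sqrt(n[1:]), k=1)               # annihilation operator
U = np.diag(np.exp(-1j * w * (n + 0.5) * t))   # exp(-i H t), H diagonal

a_H = U.conj().T @ a @ U                       # Heisenberg-picture a(t)
```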

9.4 General Quantum Mechanical Pictures

The Schrödinger and Heisenberg pictures are the best known ones, but there are others. Here we
write a general analysis for such pictures.
To gain an understanding, we look to classical mechanics in the Hamiltonian and Poisson bracket
formalism, as we did above for the Heisenberg picture case.
In classical mechanics, there are canonical transformations on phase space, changing (q, p) to (q′, p′), under which functions on phase space F(q, p) and G(q, p), with Poisson brackets

{F(q, p), G(q, p)}_{P.B.} = K(q, p), (9.35)

transform to functions F′(q′, p′) and G′(q′, p′) that obey the same relation,

{F′(q′, p′), G′(q′, p′)}_{P.B.} = K′(q′, p′). (9.36)

In quantum mechanics, the role of such canonical transformations on phase space is taken by unitary transformations on the Hilbert space,

|ψ′⟩ = Û|ψ⟩, (9.37)

with inverse |ψ⟩ = Û⁻¹|ψ′⟩, and under which the operators also transform,

Â′ = ÛÂÛ⁻¹. (9.38)

Since quantization involves the replacement of {·,·}_{P.B.} with (1/iℏ)[·,·], we must obtain for the commutator a transformation law similar to that for Poisson brackets in classical mechanics. Indeed, we find

[F̂, Ĝ] → [F̂′, Ĝ′] = Û[F̂, Ĝ]Û⁻¹ = ÛK̂Û⁻¹ = K̂′. (9.39)

Consider a unitary transformation by an operator Ŵ(t) that can depend explicitly on time,

|ψ_W⟩ = Ŵ(t)|ψ⟩, Â_W = Ŵ(t)ÂŴ⁻¹(t). (9.40)
The Hamiltonian generating the new time dependence is found from the Schrödinger equation:

iℏ∂_t|ψ⟩ = Ĥ|ψ⟩ ⇒
iℏ∂_t|ψ_W⟩ = iℏ∂_t(Ŵ|ψ⟩) = iℏŴ∂_t|ψ⟩ + iℏ(∂_tŴ)|ψ⟩
= (ŴĤŴ⁻¹)|ψ_W⟩ + iℏ(∂_tŴ)Ŵ†|ψ_W⟩, (9.41)

where in the last line we have used the Schrödinger equation for |ψ⟩. Equating the final result to Ĥ′|ψ_W⟩ allows us to write Ĥ′, the Hamiltonian generating the new time evolution, as

Ĥ′ = Ĥ_W + iℏ(∂_tŴ)Ŵ†. (9.42)

The time evolution of the transformed operators Â_W = ŴÂŴ† is found as before, by acting on each term:

iℏ ∂_t Â_W = iℏ Ŵ (∂Â/∂t) Ŵ† + iℏ(∂_tŴ)ÂŴ⁻¹ + iℏ Ŵ Â ∂_tŴ⁻¹
= iℏ(∂_tÂ)_W + iℏ(∂_tŴ)Ŵ† Â_W − iℏ Â_W (∂_tŴ)Ŵ⁻¹, (9.43)

which finally becomes

iℏ ∂_t Â_W = iℏ(∂_tÂ)_W + [iℏ(∂_tŴ)Ŵ†, Â_W]. (9.44)
∂t
From the above, we can also find the new time evolution operator. Since in the Schrödinger picture we had |ψ(t)⟩ = Û(t, t₀)|ψ(t₀)⟩, in the new W picture we find

|ψ_W(t)⟩ = Ŵ(t)|ψ(t)⟩ = Ŵ(t)Û(t, t₀)Ŵ†(t₀)|ψ_W(t₀)⟩, (9.45)

from which we deduce that the time evolution operator in the W picture is

Û′(t, t₀) = Ŵ(t)Û(t, t₀)Ŵ†(t₀) ≡ Û_W. (9.46)

Indeed, note that the new evolution operator is not simply the Schrödinger operator transformed
into the W picture, because we have Ŵ (t) to the left but Ŵ † (t 0 ) to the right (at a different time t 0 ).

Application to the Heisenberg Picture

Note that we can define the Heisenberg picture by the condition that states are time independent, ∂_t|ψ_H⟩ = 0, so the new Hamiltonian must be trivial, Ĥ′ = 0. But that means that the Schrödinger Hamiltonian in the new picture is

Ĥ_H = Ĥ_S = −iℏ(∂_tŴ)Ŵ†. (9.47)

By choosing the two pictures, the W picture and the Schrödinger picture, to be equal at time t 0 , we
finally obtain

Ŵ (t) = ÛS−1 (t, t 0 ) = ÛS (t 0 , t). (9.48)



9.5 The Dirac (Interaction) Picture

The most interesting new example of a picture that remains is the Dirac, or interaction, picture.
This is defined in the case where the Hamiltonian H can be split into a free particle part Ĥ0 and an
interaction part Ĥ1 ,

Ĥ = Ĥ0 + Ĥ1 . (9.49)

We can then use just the free part, Ĥ₀, in order to go to a sort of Heisenberg picture for Ĥ₀, i.e., using a unitary transformation Ŵ(t) defined by

Ŵ(t) = Û_{S,0}⁻¹(t, t₀) = [exp(−iĤ₀(t − t₀)/ℏ)]⁻¹ = exp(+iĤ₀(t − t₀)/ℏ). (9.50)

In this case, we obtain that the interaction picture states |ψ_I(t)⟩ and operators Â_I(t) both depend on time, and their time dependences are given by

iℏ ∂_t |ψ_I(t)⟩ = Ĥ_{1,I} |ψ_I(t)⟩,
iℏ ∂_t Â_I(t) = [Â_I(t), Ĥ₀] + iℏ(∂_tÂ_S)_I. (9.51)
The evolution operator in the interaction picture is, according to the general theory,

ÛI (t, t 0 ) = Ŵ (t)Û (t, t 0 )Ŵ −1 (t 0 ), (9.52)

for the canonical transformation

Ŵ(t) = e^{iĤ₀(t−t₀)/ℏ}, (9.53)

so that Ŵ(t₀) = 1̂. In the conservative case of a time-independent Hamiltonian, Û(t, t₀) = e^{−iĤ(t−t₀)/ℏ}, so that

Û_I(t, t₀) = e^{iĤ₀(t−t₀)/ℏ} e^{−iĤ(t−t₀)/ℏ}. (9.54)

We can write a differential equation for it, using the Schrödinger picture equation derived before, iℏ (d/dt) Û(t, t₀) = Ĥ Û(t, t₀). We find

iℏ (d/dt) Û_I(t, t₀) = iℏ (d/dt)[Ŵ(t)Û(t, t₀)] = [iℏ (dŴ(t)/dt) Ŵ⁻¹(t)] Û_I(t, t₀) + Ŵ(t) iℏ (d/dt) Û(t, t₀)
= [iℏ(∂_tŴ(t))Ŵ†(t)] Û_I(t, t₀) + Ŵ(t)ĤŴ⁻¹(t) Û_I(t, t₀)
= Ĥ_I Û_I(t, t₀), (9.55)

where ĤI is the Hamiltonian giving the time evolution in the interaction picture.
113 9 Heisenberg and General Pictures; Evolution Operator

Moreover, since

Û_I(t, t₀) = e^{iĤ₀(t−t₀)/ℏ} e^{−iĤ(t−t₀)/ℏ}, (9.56)

it follows that by differentiation we get

iℏ ∂_t Û_I(t, t₀) = e^{iĤ₀(t−t₀)/ℏ} (Ĥ − Ĥ₀) e^{−iĤ(t−t₀)/ℏ}, (9.57)

where, when differentiating both factors, we have chosen to write the resulting Hamiltonian in the middle of the two exponentials. Then, using the fact that Ĥ − Ĥ₀ = Ĥ₁ in the Schrödinger picture, and that in the interaction picture we have

Ĥ_{1,I} = e^{iĤ₀(t−t₀)/ℏ} Ĥ_{1,S} e^{−iĤ₀(t−t₀)/ℏ}, (9.58)

where Ĥ_{1,I} is Ĥ₁ in the interaction picture, we derive that

iℏ ∂_t Û_I(t, t₀) = Ĥ_{1,I} e^{iĤ₀(t−t₀)/ℏ} e^{−iĤ(t−t₀)/ℏ} = Ĥ_{1,I} Û_I(t, t₀). (9.59)

Comparing the two differential equations, we see that the Hamiltonian in the interaction picture
equals Ĥ1 in the interaction picture,

ĤI = Ĥ1,I . (9.60)

We can solve the differential equation (9.59) for something like Û(t, 0) ∼ e^{−iĤ_I t/ℏ}, by analogy. More precisely, we see that we can write the expansion of the exponential, taking care to integrate only over the time-ordered product of Hamiltonians:

Û_I(t, t₀) = 1 + (−i/ℏ) ∫_{t₀}^{t} dt₁ Ĥ_{1,I}(t₁) + (−i/ℏ)² ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t₁} dt₂ Ĥ_{1,I}(t₁) Ĥ_{1,I}(t₂) + ⋯ (9.61)

Calling the three terms zeroth-order (1), first-order, and second-order, we see that in the second-
order term we are integrating only over a triangle, satisfying the condition of time ordering, t 1 > t 2
in H1,I (t 1 )H1,I (t 2 ), instead of dividing by 2, as would happen from the expansion of an exponential.
When doing this, we find that, indeed,

iℏ ∂_t (first-order) = Ĥ_{1,I}(t) (zeroth-order),
iℏ ∂_t (second-order) = Ĥ_{1,I}(t) (first-order). (9.62)

However, as we said, the integration over the triangle actually equals half the integration over the whole rectangle, ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t} dt₂. But, since in general the interaction Hamiltonians at different times don't commute,

[Ĥ_{1,I}(t), Ĥ_{1,I}(t′)] ≠ 0, (9.63)

in order to get the correct result we must include a time ordering as well, so

Û_I(t, t₀) = 1 + (−i/ℏ) ∫_{t₀}^{t} dt₁ Ĥ_{1,I}(t₁) + ((−i/ℏ)²/2!) ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t} dt₂ T{Ĥ_{1,I}(t₁) Ĥ_{1,I}(t₂)} + ⋯ (9.64)

This pattern continues at higher orders, and we find that we must in fact put the time-ordering operator in front of all the terms coming from the exponential, so

Û_I(t, t₀) = T exp(−(i/ℏ) ∫_{t₀}^{t} dt′ Ĥ_{1,I}(t′)). (9.65)

This is the same formula as we obtained in the Schrödinger picture in (9.13) but now defined
rigorously (and in the interaction picture).
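As an illustration not in the text, the truncated series (9.64) can be compared against the exact Û_I for a weakly driven two-level system, Ĥ₀ = (ω₀/2)σ_z, Ĥ₁ = gσ_x (my choice of toy model; ℏ = 1). The triangle integration, t₂ < t₁, implements the time ordering:

```python
# Second-order Dyson expansion of U_I vs. the exact interaction-picture
# evolution operator U_I = e^{i H0 t} e^{-i (H0 + H1) t}  (hbar = 1).
import numpy as np
from scipy.linalg import expm

sz = np.diag([1.0, -1.0]).astype(complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
w0, g, t = 1.0, 0.05, 1.0
H0, H1 = (w0 / 2) * sz, g * sx

def H1I(s):  # H_{1,I}(s) = e^{i H0 s} H1 e^{-i H0 s}, as in (9.58)
    return expm(1j * H0 * s) @ H1 @ expm(-1j * H0 * s)

M = 200                                   # midpoint time discretization
dt = t / M
ts = (np.arange(M) + 0.5) * dt
Hs = [H1I(s) for s in ts]

first = -1j * dt * sum(Hs)
second = sum((-1j * dt) ** 2 * Hs[i] @ Hs[j]      # triangle: t1 > t2
             for i in range(M) for j in range(i))
U_dyson = np.eye(2) + first + second

U_exact = expm(1j * H0 * t) @ expm(-1j * (H0 + H1) * t)
err = np.linalg.norm(U_dyson - U_exact)
```

With g·t small, the truncation error is of third order in g, so the two agree to high accuracy.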

Important Concepts to Remember

• The evolution operator Û(t, t₀) can be written as the solution of a Schrödinger-like equation, iℏ (d/dt) Û(t, t₀) = Ĥ Û(t, t₀), with boundary condition Û(t₀, t₀) = 1̂, leading to a solution as a time-ordered exponential, T exp(−(i/ℏ) ∫_{t₀}^{t} dt′ Ĥ(t′)).
• Quantum mechanics can be described in different pictures, the usual Schrödinger picture being the
one where operators are time independent and states are time dependent (and obey the Schrödinger
equation).
• In the Heisenberg picture, states are time independent, |ψ_H⟩ = Û⁻¹(t, t₀)|ψ_S(t)⟩ = |ψ_S(t₀)⟩, and operators are time dependent, Â_H = Û⁻¹(t, t₀) Â_S Û(t, t₀), evolving with the Hamiltonian via the Heisenberg equation of motion, iℏ dÂ_H/dt = [Â_H, Ĥ_H] + iℏ ∂_tÂ_H.
• A general picture is related to the Schrödinger picture by a canonical transformation, |ψ_W⟩ = Ŵ(t)|ψ⟩, Â_W = Ŵ(t)ÂŴ⁻¹(t), resulting in a new Hamiltonian for the time evolution of states, iℏ ∂_t|ψ_W⟩ = Ĥ′|ψ_W⟩, and a new time evolution of operators, iℏ ∂_tÂ_W = iℏ(∂_tÂ)_W + [iℏ(∂_tŴ)Ŵ⁻¹, Â_W].
• In the Dirac, or interaction, picture, Ĥ = Ĥ₀ + Ĥ₁ and Ŵ(t) = Û_{S,0}⁻¹(t, t₀), so it is a sort of Heisenberg picture for Ĥ₀ only, in which states evolve with Ĥ_{1,I}, iℏ ∂_t|ψ_I(t)⟩ = Ĥ_{1,I}|ψ_I(t)⟩ (and Ĥ_I = Ĥ_{1,I}), but operators evolve with Ĥ₀, iℏ ∂_tÂ_I(t) = [Â_I(t), Ĥ₀].
• The evolution operator in the interaction picture is also a time-ordered exponential, Û_I(t, t₀) = T exp(−(i/ℏ) ∫_{t₀}^{t} dt′ Ĥ_{1,I}(t′)).

Further Reading
See [2] and [1] for more details.

Exercises

(1) Write down the time evolution operator and the differential equation that it satisfies for a one-
dimensional particle in a potential V (x).
(2) Write down the time evolution equation for Heisenberg operators, calculating explicitly the
evolution Hamiltonian, in the case of a free particle, and then calculate explicit expressions for
x̂(t) and p̂(t).


(3) For a harmonic oscillator of frequency ω, calculate the explicit time-dependent â(t), â†(t) operators in the picture with Ŵ(t) = exp(2iω â†â (t − t₀)), and the explicit differential equations for a general operator and state.
(4) Consider a harmonic oscillator perturbed by Ĥ₁ = λ(â + â†)³. Write down the explicit evolution equations for states (the first two terms in the expansion in λ) and the â, â† operators in the interaction picture.
(5) Calculate the evolution operator for the interaction picture in the case in exercise 4.
(6) Consider a Hamiltonian Ĥ (p) only (no x dependence). What are the operators in the theory that
evolve nontrivially in the Heisenberg picture? What about in a possible interaction picture, if
Ĥ (p) can be separated into two parts?
(7) Can one have a picture in which operators don’t evolve in time other than the Schrödinger
picture?
10 The Feynman Path Integral and Propagators

Having defined the Heisenberg picture and evolution operators, we can now attack quantum
mechanics in a different way, proposed by Feynman: instead of states and operators acting on them,
we talk about something closer to the classical concept of a particle moving on a path. Instead of
having a single path, Feynman thought about the fact that in quantum mechanics, we should be able to sum over all possible paths between two points, even discontinuous ones, with the weight given by the phase e^{iS/ℏ}, S being the classical action for the particle. This resulting Feynman path integral formulation of quantum mechanics is actually equivalent to the Schrödinger formulation (or the Heisenberg, or Dirac, formulations for that matter). The path integral for a generalized coordinate q(t) is

∫ Dq(t) e^{iS[q]/ℏ}, (10.1)

and, in the case where it is possible to have a nearly classical motion, we will find that the result is approximated by

∼ e^{iS_cl[q_cl]/ℏ} × (corrections), (10.2)

where q_cl(t) is the classical path, and S_cl is the classical on-shell action (i.e., the action evaluated on the classical solution). We will derive this next.

10.1 Path Integral in Phase Space

The path integral formalism is applied to the propagator, that is, the evolution operator in the coordinate representation, defined as the product of two Heisenberg (bra and ket) states,

⟨x′, t′|x, t⟩_H ≡ U(x′, t′; x, t) = ⟨x′|Û(t′, t)|x⟩, (10.3)

where, as we showed, in the case of a conservative system Û(t′, t) = e^{−iĤ(t′−t)/ℏ}; this represents the transition amplitude between the initial and final points.
The term “propagator” is related to the fact that we can write the evolution of a state in the coordinate representation (by multiplying with ⟨x′|), and insert a completeness relation for |x⟩, to get

|ψ(t)⟩ = Û(t, t₀)|ψ(t₀)⟩ ⇒
ψ(x′, t) = ⟨x′|Û(t, t₀)|ψ(t₀)⟩ = ∫ dx U(x′, t; x, t₀) ψ(x, t₀). (10.4)
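The composition of propagators implicit in (10.4) can be checked numerically; to sidestep the oscillatory real-time integrals, the sketch below (my addition) uses the imaginary-time (Euclidean) free kernel K(x′, x; τ) = e^{−(x′−x)²/(2τ)}/√(2πτ) with ℏ = m = 1, a stand-in for the book's real-time propagator:

```python
# Semigroup/composition check for the Euclidean free kernel: integrating
# over the intermediate point x1 in K(x2, x1; tau2) K(x1, x0; tau1)
# must reproduce K(x2, x0; tau1 + tau2).
import numpy as np

def K(xp, x, tau):
    return np.exp(-(xp - x) ** 2 / (2.0 * tau)) / np.sqrt(2.0 * np.pi * tau)

x0, x2 = -0.3, 1.1
tau1, tau2 = 0.4, 0.7
x1 = np.linspace(-20.0, 20.0, 4001)     # grid for the intermediate point
dx = x1[1] - x1[0]

composed = np.sum(K(x2, x1, tau2) * K(x1, x0, tau1)) * dx
direct = K(x2, x0, tau1 + tau2)
```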

We saw earlier that the evolution operator can be written in terms of the energy eigenbasis as

Û(t, t₀) = Σ_{n,a} |E_n, a⟩⟨E_n, a| e^{−iE_n(t−t₀)/ℏ}. (10.5)

Then the propagator at equal times becomes

lim_{t′→t} U(x′, t′; x, t) = Σ_{n,a} ⟨x′|E_n, a⟩⟨E_n, a|x⟩ = ⟨x′|1̂|x⟩ = δ(x − x′). (10.6)

Moreover, the relation (10.4) implies that, since ψ(x, t) satisfies the Schrödinger equation,

(−(ℏ²/2m) d²/dx² + V(x) − iℏ ∂/∂t) ψ(x, t) = 0, (10.7)

the propagator U(x′, t′; x, t) also satisfies the Schrödinger equation for t′ > t. Then, adding the definition that

U(x′, t′; x, t) = 0, for t′ < t, (10.8)

we find that U(x′, t′; x, t) becomes a Green's function for the Schrödinger operator. Indeed, because of the discontinuity at t′ → t, we find that

(−(ℏ²/2m) d²/dx² + V(x) − iℏ ∂/∂t) U(x, t; x₀, t₀) = δ(x − x₀)δ(t − t₀). (10.9)
The Heisenberg state |x, t⟩_H is an eigenstate of the Heisenberg operator x̂_H(T) at time T = t, i.e.,

x̂_H(T = t)|x, t⟩_H = x(t)|x, t⟩_H, (10.10)

and is not an eigenstate at T ≠ t. The relation to the Schrödinger states is given by

|x, t⟩_H = e^{iĤt/ℏ}|x⟩ ⇒ |x⟩ = e^{−iĤt/ℏ}|x, t⟩_H. (10.11)

The Heisenberg operator is related to the Schrödinger operator by

X̂(t) = e^{iĤt/ℏ} X̂_S e^{−iĤt/ℏ}, (10.12)

where X̂_S|x⟩ = x|x⟩.


To write a path integral formulation for the propagator U(x′, t′; x, t), we first divide the interval (t, t′) into n + 1 intervals of length ε,

ε = (t′ − t)/(n + 1) ⇒ t₀ = t, t₁ = t + ε, . . . , t_{n+1} = t′. (10.13)

At any fixed time t_i, the set of all possible Heisenberg states {|x_i, t_i⟩ | x_i ∈ ℝ} is a complete set. The completeness relation is then

∫ dx_i |x_i, t_i⟩⟨x_i, t_i| = 1̂. (10.14)

Because of this, we can insert in U(x′, t′; x, t) at each t_i, i = 1, . . . , n, an operator 1̂ written as the completeness relation for time t_i, obtaining

U(x′, t′; x, t) = ∫ dx₁ ⋯ dx_n ⟨x′, t′|x_n, t_n⟩⟨x_n, t_n|x_{n−1}, t_{n−1}⟩ ⋯ ⟨x₁, t₁|x, t⟩. (10.15)

In the above equation, x_i = x(t_i) is a discretized path, but it is not a classical path, in that we are integrating arbitrarily and independently over x_i and x_{i+1}, which means that generically the distance between x_i and x_{i+1} is large, and does not go to zero (i.e., we do not have x_{i+1} → x_i) as ε → 0. Instead of a classical path, we get a discontinuous quantum path, as in Fig. 10.1.
Figure 10.1 For the definition of the quantum mechanical “path integral”, we integrate over all possible discretized paths, not just the
smooth ones (as would be suggested by classical paths). Indeed, we divide the path into a large number of discrete points, after
which we integrate over the positions of all these points, independently of the positions of their nearby points.

The product of the integrations at each intermediate point between x and x′ is called the path integral, or integral over all possible quantum paths, and is denoted by

∫ Dx(t) = lim_{n→∞} ∏_{i=1}^{n} ∫ dx(t_i). (10.16)

Since we can expand |x⟩ in the |p⟩ basis, and vice versa,

|x⟩ = ∫ dp |p⟩⟨p|x⟩, |p⟩ = ∫ dx |x⟩⟨x|p⟩, (10.17)

and since we have (as we saw earlier)

⟨x|p⟩ = e^{ipx/ℏ}/√(2πℏ) ⇒ ∫ dx ⟨p′|x⟩⟨x|p⟩ = ∫ dx e^{ix(p−p′)/ℏ}/(2πℏ) = δ(p − p′), (10.18)

we can write the generic scalar product of two Heisenberg states appearing in the path integral as

⟨x(t_i), t_i|x(t_{i−1}), t_{i−1}⟩_H = ⟨x(t_i)|e^{−iεĤ/ℏ}|x(t_{i−1})⟩
= ∫ dp(t_i) ⟨x(t_i)|p(t_i)⟩ ⟨p(t_i)|e^{−iεĤ/ℏ}|x(t_{i−1})⟩. (10.19)

As a technical point, we now impose that Ĥ is to be ordered so that momenta p̂ are to the left of coordinates x̂. Then, to order ε, we can write

⟨p(t_i)| exp(−(i/ℏ)εĤ) |x(t_{i−1})⟩ ≃ exp(−(i/ℏ)εH(p(t_i), x(t_{i−1}))) ⟨p(t_i)|x(t_{i−1})⟩
= exp(−(i/ℏ)εH(p(t_i), x(t_{i−1}))) e^{−ip(t_i)x(t_{i−1})/ℏ}/√(2πℏ), (10.20)

where in the first relation we acted with p̂ to the left and with x̂ to the right, and neglected commutator corrections, which are of order ε²: for instance, in Ĥ(p̂, x̂)Ĥ(p̂, x̂) we have p̂x̂p̂x̂-type terms, which would be more complicated.
All in all, using the formulas derived, we obtain

U(x′, t′; x, t) = ∫ ∏_{i=1}^{n+1} dp(t_i) ∏_{j=1}^{n} dx(t_j) ⟨x(t_{n+1})|p(t_{n+1})⟩⟨p(t_{n+1})|e^{−iεĤ/ℏ}|x(t_n)⟩ ⋯ ⟨x(t₁)|p(t₁)⟩⟨p(t₁)|e^{−iεĤ/ℏ}|x(t₀)⟩
= ∫ Dp(t) ∫ Dx(t) exp{(i/ℏ)[p(t_{n+1})(x(t_{n+1}) − x(t_n)) + ⋯ + p(t₁)(x(t₁) − x(t₀)) − ε(H(p(t_{n+1}), x(t_n)) + ⋯ + H(p(t₁), x(t₀)))]}
= ∫ Dp(t) ∫ Dx(t) exp{(i/ℏ) ∫_{t₀=t}^{t_{n+1}=t′} dt [p(t)ẋ(t) − H(p(t), x(t))]}, (10.21)

where we have made the replacement x(t_{i+1}) − x(t_i) → dt ẋ(t).
This is the path integral in phase space (in the Hamiltonian formulation). Note that this is a
physicist’s “rigorous” derivation (mathematically, of course, even the notion of path integration is not
well defined, but once that is accepted, everything is rigorous). However, as we said at the beginning,
we would be more interested in the path integral in configuration space, in terms of only x(t).
In order to obtain this, we must do a Gaussian integration over p(t). For that, we need one more technical requirement on the Hamiltonian: it must be quadratic in the momenta, H(p, x) = p²/2 + V(x) (in units with m = 1). However, before we do the integration over p(t), we must derive some Gaussian integration formulas.

10.2 Gaussian Integration

We will have a mathematical interlude at this point, in order to find out how to do the integration we must perform. The basic Gaussian integral is

I = ∫_{−∞}^{+∞} dx e^{−αx²} = √(π/α). (10.22)

Squaring it (for α = 1), we obtain

I² = ∫ dx ∫ dy e^{−(x²+y²)} = ∫₀^{2π} dφ ∫₀^{∞} r dr e^{−r²} = π, (10.23)

where in the second equality we transformed to polar coordinates in the (x, y) plane.
Generalizing this Gaussian integration formula to an n-dimensional space, we can say that

∫ dⁿx e^{−x_i A_{ij} x_j/2} = (2π)^{n/2} / √(det A). (10.24)

This formula can be proven by diagonalizing the (constant) matrix A_{ij}, in which case det A = ∏_i α_i, where α_i are the eigenvalues of the matrix A_{ij}.
Then, defining a scalar that is identified with a quadratic “action” in a discretized form,
1 T
S= x · A · x + bT · x, (10.25)
2
120 10 The Feynman Path Integral and Propagators

and defining its "classical solution",

∂S/∂x_i = 0 ⇒ x_cl = −A⁻¹ · b, (10.26)

we write the on-shell (classical) action as

S(x_cl) = −(1/2) bᵀ · A⁻¹ · b, (10.27)

and the quantum action (including fluctuations) is rewritten as

S = (1/2)(x − x_cl)ᵀ · A · (x − x_cl) − (1/2) bᵀ · A⁻¹ · b. (10.28)

Then the Gaussian integration over the vector x, shifted to x − x_cl, is

∫ dⁿx e^{−S(x)} = (2π)^{n/2} (det A)^{−1/2} e^{−S(x_cl)} = ((2π)^{n/2}/√(det A)) exp(+(1/2) bᵀ · A⁻¹ · b). (10.29)
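A one-dimensional instance of (10.29), checked by brute force (my addition, not the book's): for S(x) = ax²/2 + bx one should get √(2π/a) e^{b²/(2a)}:

```python
# Check  ∫ dx e^{-(a x^2/2 + b x)} = sqrt(2 pi / a) * e^{b^2/(2a)}
# with a dense Riemann sum over a wide interval.
import numpy as np

a, b = 1.7, 0.9
x = np.linspace(-30.0, 30.0, 60001)
dx = x[1] - x[0]

numeric = np.sum(np.exp(-(a * x ** 2 / 2 + b * x))) * dx
closed = np.sqrt(2 * np.pi / a) * np.exp(b ** 2 / (2 * a))
```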

10.3 Path Integral in Configuration Space


We are now ready to do the path integral over Dp(t), which explicitly (in the discrete version) is

∫ Dp(t) exp{(i/ℏ) ∫_t^{t′} dτ [p(τ)ẋ(τ) − (1/2)p²(τ)]}
= ∏_{i=1}^{n} ∫ (dp(τ_i)/(2πℏ)) exp{(i/ℏ) Δτ [p(τ_i)ẋ(τ_i) − (1/2)p²(τ_i)]}. (10.30)

This means that, with respect to the above formula for Gaussian integration, we have the identifications

x_i → p(τ_i), A_{ij} → (i/ℏ)Δτ δ_{ij}, b_i → −(i/ℏ)Δτ ẋ(τ_i), (10.31)

so that the result of the path integration is

∫ Dp(t) exp{(i/ℏ) ∫_t^{t′} dτ [p(τ)ẋ(τ) − (1/2)p²(τ)]} = N exp[(i/ℏ) ∫_t^{t′} dτ ẋ²(τ)/2], (10.32)

where the normalization constant N contains factors of i, 2, π, ℏ, Δτ, all constant and so irrelevant once we normalize the probability to one.
Finally, this means that the path integral for the propagator becomes a path integral only in configuration space,

U(x′, t′; x, t) = N ∫ Dx(t) exp{(i/ℏ) ∫_t^{t′} dτ [ẋ²(τ)/2 − V(x)]}
= N ∫ Dx(t) exp[(i/ℏ) ∫_t^{t′} dτ L(x(τ), ẋ(τ))]
= N ∫ Dx(t) exp[(i/ℏ) S(x)]. (10.33)

This is indeed the formula we argued for at the beginning of the chapter, now derived rigorously
from the phase space path integral, under the following two technical requirements: the Hamiltonian
is to be ordered with ps on the left and xs on the right, and moreover the Hamiltonian is only quadratic
in momenta.
This formula, besides having a simple physical interpretation that generalizes the classical path (the amplitude for the transition probability (see equation (10.3)) is the sum over all paths weighted by the phase e^{iS[x]/ℏ}), also implies a simple classical limit.
Indeed, the classical path is the extremum of the action S[x], so we expand any action around its value on the classical solution (the "on-shell value"), up to at least quadratic order (and maybe higher),

S[x] ≃ S_cl[x_cl] + (1/2)(x − x_cl) · A · (x − x_cl), (10.34)

as before. Then the first approximation to the propagator is

U(x′, t′; x, t) ≃ e^{iS_cl[x_cl]/ℏ}, (10.35)

while the second comes with a factor of 1/√(det A), as we saw from the Gaussian integration formula.
We can calculate exactly the propagator for free particles, though we will not do it here but will
leave it for later.
Also, we can show that the propagator U (x , t ; x, t), calculated as a Feynman path integral, does
also satisfy the Schrödinger equation, so it is indeed a Green’s function for the Schrödinger operator,
meaning that any wave function evolved with it will also satisfy the Schrödinger equation. Thus the
Feynman path integral representation of quantum mechanics is actually equivalent to the original
Schrödinger representation.

10.4 Path Integral over Coherent States (in “Harmonic Phase Space”)

Our prototype (toy model) for a simple quantum system is, as we have seen, the harmonic oscillator.
A more complicated system can be described as a perturbed harmonic oscillator or a system of
harmonic oscillators. Therefore it is of interest to see if there is any other way to describe the path
integral for it. In fact, there is, using the so-called “harmonic phase space” instead of configuration
space or phase space.
This space is defined using the notion of coherent states of the harmonic oscillator. For a harmonic
oscillator, the Hamiltonian can be written in terms of creation and annihilation operators as

Ĥ = ω ↠â + 1
2 . (10.36)

But, while the original intention in introducing â and ↠was to deal with eigenenergy states |n,
whose index n is changed by â and ↠(hence the name creation and annihilation), we can also
introduce eigenstates for these creation and annihilation operators, called coherent states.
We can define |α⟩, as we showed in Chapter 8, as an eigenstate of â,

|α⟩ ≡ e^{αâ†}|0⟩ ⇒ â|α⟩ = α|α⟩, (10.37)

and similarly α ∗ | as an eigenstate of ↠,


∗ â
α ∗ | ≡ 0|e α ⇒ α| ↠= α ∗ |α ∗ . (10.38)
Moreover, they satisfy a completeness relation

dαdα ∗ −αα∗
e |αα ∗ | = 1̂. (10.39)
2πi
We can compute the propagator for these states, i.e., the scalar product of Heisenberg states
constructed out of them,
U (α ∗ , t ; α, t) ≡ H α ∗ , t  |α, t H = α ∗ |Û (t , t)|α, (10.40)
−i Ĥ (t  −t)
where as always Û (t , t) = e .
On the other hand, we find that the expectation value of Ĥ in the coherent state basis is

α ∗ | Ĥ ( ↠, â)|β = H (α ∗ , β)α ∗ |β = H (α ∗ , β)e α β . (10.41)
We define the same discretization of a classical path from α to α* in n + 1 intervals, with

ε = (t′ − t)/(n + 1), t₀ = t, t₁, . . . , t_n, t_{n+1} = t′. (10.42)

We follow the same steps as in phase space, inserting a completeness relation at each intermediate point, obtaining

U(α*, t′; α, t) = ∫ ∏_i (dα(t_i) dα*(t_i)/(2πi)) e^{−α*(t_i)α(t_i)} ⟨α*(t′)|e^{−iεĤ/ℏ}|α(t_n)⟩
× ⟨α*(t_n)|e^{−iεĤ/ℏ}|α(t_{n−1})⟩ ⋯ ⟨α*(t₁)|e^{−iεĤ/ℏ}|α(t)⟩. (10.43)

But the generic matrix element is

⟨α*(t_{i+1})| exp(−(i/ℏ)εĤ) |α(t_i)⟩ = exp(−(i/ℏ)εH(α*(t_{i+1}), α(t_i))) exp[α*(t_{i+1})α(t_i)], (10.44)

if Ĥ is normal ordered (with â† to the left and â to the right).
Then we obtain for the propagator

U(α*, t′; α, t) = ∫ ∏_i (dα(t_i) dα*(t_i)/(2πi)) exp[α*(t′)α(t_n) − α*(t_n)α(t_n) + α*(t_n)α(t_{n−1}) − α*(t_{n−1})α(t_{n−1}) + ⋯ + α*(t₁)α(t) − (i/ℏ) ∫_t^{t′} dτ H(α*(τ), α(τ))]
= ∫ ∏_i (dα(t_i) dα*(t_i)/(2πi)) exp{∫_t^{t′} dτ [α̇*(τ)α(τ) − (i/ℏ) H(α*(τ), α(τ))] + α*(t)α(t)}, (10.45)

so that finally we can write it as

U(α*, t′; α, t) = ∫ D(α(τ)) D(α*(τ)) exp{∫_t^{t′} dτ [α̇*(τ)α(τ) − (i/ℏ) H(α*(τ), α(τ))] + α*(t)α(t)}. (10.46)
10.5 Correlation Functions and Their Generating Functional

We can consider a more general observable than a transition amplitude, namely one where we insert an operator $\hat X(t_a)$ between Heisenberg states,
$${}_H\langle x',t'|\hat X(t_a)|x,t\rangle_H, \qquad (10.47)$$
obtaining what is known as a correlation function, specifically a one-point function. This observable is harder to interpret, but it is of great theoretical interest, especially since it can be generalized.
To calculate it, we follow the same steps as before and discretize the path. The only constraint in doing so is to choose $t_a$ as one of the intermediate times $t_i$ in the path integral (we can choose the division into $n+1$ steps in such a way that this is true).
Then, if $t_a = t_i$, so that $x(t_a) = x(t_i)$, we find
$$\langle x_{i+1},t_{i+1}|\hat X(t_a)|x_i,t_i\rangle = x(t_a)\,\langle x_{i+1},t_{i+1}|x_i,t_i\rangle; \qquad (10.48)$$

now, following the same steps as before, we find the path integral
$$\langle x',t'|\hat X(t_a)|x,t\rangle = \int \mathcal{D}x(t)\, e^{iS[x]/\hbar}\, x(t_a). \qquad (10.49)$$

Next, we can define the two-point function (which is also a correlation function),
$$\langle x',t'|\hat X(t_b)\hat X(t_a)|x,t\rangle. \qquad (10.50)$$
If $t_a < t_b$, we can work as before: choose $t_a = t_i$ and $t_b = t_j$, where $j > i$, and find
$$\langle x_{j+1},t_{j+1}|\hat X(t_b)|x_j,t_j\rangle \cdots \langle x_{i+1},t_{i+1}|\hat X(t_a)|x_i,t_i\rangle = x(t_b)\,\langle x_{j+1},t_{j+1}|x_j,t_j\rangle \cdots x(t_a)\,\langle x_{i+1},t_{i+1}|x_i,t_i\rangle, \qquad (10.51)$$
so we obtain
$$\langle x',t'|\hat X(t_b)\hat X(t_a)|x,t\rangle = \int \mathcal{D}x(t)\, e^{iS[x]/\hbar}\, x(t_b)\,x(t_a). \qquad (10.52)$$

Conversely, the path integral gives the time-ordered product,
$$\int \mathcal{D}x(t)\, e^{iS[x]/\hbar}\, x(t_a)\,x(t_b) = \langle x',t'|T\{\hat X(t_a)\hat X(t_b)\}|x,t\rangle, \qquad (10.53)$$
where the time ordering is defined as usual,
$$T\{\hat X(t_a)\hat X(t_b)\} = \begin{cases} \hat X(t_a)\hat X(t_b), & t_a > t_b \\ \hat X(t_b)\hat X(t_a), & t_a < t_b, \end{cases} \qquad (10.54)$$
and generalized to
$$T\{\hat X(t_{a_1})\cdots\hat X(t_{a_n})\} = \hat X(t_{a_1})\cdots\hat X(t_{a_n}), \quad \text{if } t_{a_1} > t_{a_2} > \cdots > t_{a_n}. \qquad (10.55)$$

Then we can also calculate $n$-point functions (general correlation functions) as path integrals,
$$G_n(t_{a_1},\ldots,t_{a_n}) = \langle x',t'|T\{\hat X(t_{a_1})\cdots\hat X(t_{a_n})\}|x,t\rangle = \int \mathcal{D}x(t)\, e^{iS[x]/\hbar}\, x(t_{a_1})\cdots x(t_{a_n}). \qquad (10.56)$$

Finally, we can write a generating function for all the $n$-point correlation functions. Indeed, for numbers $a_n$ we can find a generating function
$$f(x) = \sum_{n\ge 0} \frac{a_n x^n}{n!}, \qquad (10.57)$$
and the coefficient $a_n$ is found from the $n$th-order derivative at zero,
$$a_n = \frac{d^n}{dx^n} f(x)\Big|_{x=0}. \qquad (10.58)$$
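The pair (10.57)-(10.58) is the finite-dimensional prototype of the generating functional below, and can be sketched in a few lines (an illustration, not from the text): represent $f$ by its power-series coefficients $c_n = a_n/n!$, so the $n$th derivative at $x = 0$ is $n!\,c_n$, recovering $a_n$.

```python
from math import factorial

def series_coeffs(a):
    # coefficients c_n = a_n/n! of f(x) = sum_n a_n x^n / n!
    return [a_n / factorial(n) for n, a_n in enumerate(a)]

def nth_derivative_at_zero(c, n):
    # d^n/dx^n [sum_k c_k x^k] at x = 0 equals n! * c_n
    return factorial(n) * c[n]

a = [1.0, 3.0, -2.0, 5.0]
c = series_coeffs(a)
recovered = [nth_derivative_at_zero(c, n) for n in range(len(a))]
```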
In the present case, we can define a generating functional $Z[J]$ for all the Green's functions $G_n$, defined as
$$Z[J] = \sum_{n\ge 0} \frac{(i/\hbar)^n}{n!} \int dt_1\cdots dt_n\, G_n(t_1,\ldots,t_n)\, J(t_1)\cdots J(t_n). \qquad (10.59)$$

Substituting the Green's functions as path integrals, we obtain
$$Z[J] = \int \mathcal{D}x(t)\, e^{iS[x]/\hbar} \sum_{n\ge 0} \frac{1}{n!}\Big(\frac{i}{\hbar}\int dt\, x(t) J(t)\Big)^n, \qquad (10.60)$$
which is seen to easily sum to
$$Z[J] = \int \mathcal{D}x(t)\, \exp\Big[\frac{i}{\hbar} S(x,J)\Big] \equiv \int \mathcal{D}x(t)\, \exp\Big\{\frac{i}{\hbar}\Big[S(x) + \int dt\, J(t)x(t)\Big]\Big\}; \qquad (10.61)$$
then the Green's functions are obtained from it via
$$\frac{\delta^n}{(i/\hbar)\delta J(t_1)\cdots(i/\hbar)\delta J(t_n)} Z[J]\Big|_{J=0} = \int \mathcal{D}x(t)\, e^{iS[x]/\hbar}\, x(t_1)\cdots x(t_n) = G_n(t_1,\ldots,t_n). \qquad (10.62)$$
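The functional-derivative rule (10.62) can be illustrated in a discretized toy model (an assumption-laden sketch, not from the text: a Euclidean-style Gaussian $Z[J] = \exp(\tfrac12 J\cdot\Delta\cdot J)$ on a grid of four "times", with functional derivatives replaced by finite differences in the discrete sources $J_i$):

```python
import numpy as np

# For Z[J] = exp(1/2 J.Delta.J), the mixed second derivative at J = 0 gives
# the two-point function G_2(t_i, t_j) = Delta_ij, mirroring (10.62).
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
Delta = M @ M.T                     # symmetric "propagator" matrix

def Z(J):
    return np.exp(0.5 * J @ Delta @ J)

def two_point(i, j, eps=1e-4):
    def at(si, sj):
        J = np.zeros(4); J[i] += si; J[j] += sj
        return Z(J)
    # central mixed second difference at J = 0
    return (at(eps, eps) - at(eps, -eps) - at(-eps, eps) + at(-eps, -eps)) / (4*eps**2)

G2 = two_point(0, 2)                # should approximate Delta[0, 2]
```

The same finite-difference trick with four sources would give the connected four-point function, the discrete analogue of Wick's theorem.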

Important Concepts to Remember

• The Feynman path integral is another way to define quantum mechanics, as a sum over all possible paths, not necessarily continuous, between initial points and endpoints.
• The path integral is $\int \mathcal{D}q(t)\exp\big[\frac{i}{\hbar}S[q(t)]\big]$, and is approximated (at a saddle point) by $\exp\big[\frac{i}{\hbar}S_{cl}[q_{cl}(t)]\big]$, with corrections coming from the Gaussian around the saddle point, as $1/\sqrt{\det A}$, where $S = S_{cl}[q_{cl}] + \frac{1}{2}(q-q_{cl})\cdot A\cdot(q-q_{cl})$.
• The path integral is derived in phase space as $U(x',t';x,t) = \int\mathcal{D}x(t)\int\mathcal{D}p(t)\exp\big[\frac{i}{\hbar}\int_t^{t'}dt\,(p\dot q - H)\big]$, under the technical requirement to have $p$ to the left of $x$ in $H$.
• From the path integral in phase space, we derive the path integral in coordinate space, $U(x',t';x,t) = N\int\mathcal{D}x(t)\exp\big[\frac{i}{\hbar}S[x(t)]\big]$, under the extra technical requirement that $H$ is quadratic in $p$, $H = p^2/2 + V(x)$.
• The best definition of the path integral is in harmonic phase space, related to coherent states of the harmonic oscillator, $\alpha, \alpha^*$, where $U(\alpha^*,t';\alpha,t) = \int\mathcal{D}\alpha^*(\tau)\int\mathcal{D}\alpha(\tau)\exp\big\{\frac{i}{\hbar}\int_t^{t'}d\tau\big[\frac{\hbar}{i}\dot\alpha^*(\tau)\alpha(\tau) - H\big] + \alpha^*(t)\alpha(t)\big\}$, and is valid for normal-ordered Hamiltonians.
• The correlation functions or $n$-point functions are expectation values of time-ordered products of position operators, rewritten as path integrals, $G_n(t_{a_1},\ldots,t_{a_n}) = \langle x',t'|T(\hat X(t_{a_1})\cdots\hat X(t_{a_n}))|x,t\rangle = \int\mathcal{D}x(t)\exp\big[\frac{i}{\hbar}S[x(t)]\big]x(t_{a_1})\cdots x(t_{a_n})$.
• The generating functional for all the correlation functions is $Z[J] = \int\mathcal{D}x(t)\exp\big\{\frac{i}{\hbar}\big[S[x] + \int dt\,J(t)x(t)\big]\big\}$, from which we obtain the correlation functions by taking derivatives.

Further Reading
See [3] and [5] for more details (the latter for details on correlation functions).

Exercises

(1) We have used the idea of Gaussian integration around the classical solution for the action to argue that the first-order result for the path integral is $\exp\big[\frac{i}{\hbar}S_{cl}[x_{cl}(t)]\big]$. However, is Gaussian integration, and more generally the path integral itself, well defined for this particular exponential? How would you expect to modify the exponential in order to make sense of the Gaussian integration?
(2) For the path integral in phase space, we have assumed that the $\hat P$s are always on the left of the $\hat X$s in the Hamiltonian. Consider a case where this is not true, for instance having an extra term in the Hamiltonian of the type $\alpha(\hat P\hat X^2 + \hat X\hat P\hat X + \hat X^2\hat P)$. Redo the calculation, and see what you obtain for the path integral in phase space.
(3) Consider the case when $H = \frac{3}{4}p^{4/3} + V(x)$ and the path integral is in phase space. Is the resulting path integral in coordinate space approximated in any way by the usual expression $\int\mathcal{D}x(t)\,e^{\frac{i}{\hbar}S[x(t)]}$?
(4) Consider the Hamiltonian, in terms of $a, a^\dagger$ (with $[a, a^\dagger] = 1$),
$$H = \hbar\omega\Big(a^\dagger a + \frac{1}{2}\Big) + \lambda(a + a^\dagger)^3. \qquad (10.63)$$
Derive the harmonic path integral in phase space for it (without calculating the path integral, which is not Gaussian).
(5) Consider a generating functional
$$Z[J(t)] = N\exp\Big[-\frac{1}{2}\int dt\int dt'\,J(t)\Delta(t,t')J(t')\Big]. \qquad (10.64)$$
Calculate the two-point, three-point, and four-point functions.


(6) Consider a harmonic oscillator, with Lagrangian $L = (\dot q^2 - \omega^2 q^2)/2$. Using a naive generalization of the Gaussian integration, show that the generating functional is of the type given in exercise 5. Write a formal expression for $\Delta(t,t')$.

(7) Consider the generating functional
$$Z[J(t)] = N\exp\Big[-\frac{1}{2}\int dt\int dt'\,J(t)\Delta(t,t')J(t') + \lambda\int dt_1\int dt_2\int dt_3\int dt_4\,J(t_1)J(t_2)J(t_3)J(t_4)\Big]. \qquad (10.65)$$
Calculate the four-point function.


11 The Classical Limit and Hamilton–Jacobi (WKB Method), the Ehrenfest Theorem

Until now, we have developed the formalism of quantum mechanics from postulates, and we
have shown how to match with experiments. We have not used classical mechanics at all, except
sometimes as a guide. But we know that the macroscopic world is approximately classical, and most
systems have a classical limit. We must therefore consider a way to obtain a classical limit, plus
quantum corrections, from the quantum formalism we have developed so far.
First we will show that the classical equations of motion play a role in the quantum theory itself,
in the form of the Ehrenfest theorem, which however does not in itself suggest a classical limit.
The classical limit is suggested by the Feynman path integral formalism, developed in the previous
chapter, for transition amplitudes, perhaps with the insertion of an operator. The action S[x] appearing
in the path integral has as a minimum the classical action, the action on the classical path, for which
$\delta S = 0 \Rightarrow x = x_{cl}(t)$; this allows us to expand the action around the classical on-shell action,
$$S \simeq S_{cl}[x_{cl}] + (\delta x)^2(\ldots) = S_{cl}[x_{cl}] + \text{fluctuations}, \qquad (11.1)$$
and integrate over the fluctuations:
$$\langle x',t'|T\hat A(\{t_i\})|x,t\rangle = \int\mathcal{D}x(t)\,e^{iS[x]/\hbar}\,A(\{t_i\}) \simeq e^{iS_{cl}[x_{cl}]/\hbar}\,A_{cl}(\{t_i\}) \times \text{fluctuations} \equiv e^{iS_{cl}[x_{cl}]/\hbar}\,A_{cl}(\{t_i\})\,e^{iS_{qu}/\hbar}. \qquad (11.2)$$
Thus the classical action has some relevance to the classical limit of quantum mechanics and, as we
will see, it is related to the classical mechanics Hamilton–Jacobi equation and formalism. But before
analyzing that, we will prove the Ehrenfest theorem.

11.1 Ehrenfest Theorem

A statement of the Ehrenfest theorem is: The classical mechanics equations of motion are valid for
the quantum average over quantum states.

Proof Consider a quantum operator $\hat A$ and its expectation value in the state $|\psi\rangle$,
$$\langle A\rangle_\psi \equiv \langle\psi|\hat A|\psi\rangle. \qquad (11.3)$$
Its time evolution is calculated as
$$\frac{d}{dt}\langle A\rangle_\psi = \frac{d\langle\psi|}{dt}\,\hat A\,|\psi\rangle + \langle\psi|\,\hat A\,\frac{d|\psi\rangle}{dt} + \Big\langle\psi\Big|\frac{\partial\hat A}{\partial t}\Big|\psi\Big\rangle. \qquad (11.4)$$
But we can use the Schrödinger equation acting on $|\psi\rangle$, namely
$$\frac{d}{dt}|\psi\rangle = \frac{1}{i\hbar}\hat H|\psi\rangle \;\Rightarrow\; \frac{d}{dt}\langle\psi| = -\frac{1}{i\hbar}\langle\psi|\hat H, \qquad (11.5)$$

on the time evolution above, and obtain
$$\frac{d}{dt}\langle A\rangle_\psi = \frac{1}{i\hbar}\langle\psi|(\hat A\hat H - \hat H\hat A)|\psi\rangle + \Big\langle\psi\Big|\frac{\partial\hat A}{\partial t}\Big|\psi\Big\rangle = \frac{1}{i\hbar}\langle[\hat A,\hat H]\rangle_\psi + \Big\langle\frac{\partial\hat A}{\partial t}\Big\rangle_\psi. \qquad (11.6)$$

This is the first version of the Ehrenfest theorem, which is the quantum mechanical equivalent of the
classical mechanics evolution equation
$$\frac{d}{dt}A = \{A, H\}_{P.B.} + \frac{\partial A}{\partial t}. \qquad (11.7)$$
Consider next a Hamiltonian on a general phase space, $H = H(q_i, p_i)$, where $p_i$ is canonically conjugate to the variable $q_i$. In this case, $\partial\hat q_i/\partial t = \partial\hat p_i/\partial t = 0$, so we obtain
$$\frac{d}{dt}\langle q_i\rangle_\psi = \frac{1}{i\hbar}\langle[\hat q_i,\hat H]\rangle_\psi, \qquad \frac{d}{dt}\langle p_i\rangle_\psi = \frac{1}{i\hbar}\langle[\hat p_i,\hat H]\rangle_\psi. \qquad (11.8)$$
On the other hand, the commutators come from Poisson brackets, in particular the canonical commutators from the canonical Poisson brackets,
$$\frac{1}{i\hbar}[\hat q_i, \hat p_i] = 1 \;\leftarrow\; \{q_i, p_i\}_{P.B.} = 1, \qquad (11.9)$$
which means that we can represent the coordinates by derivatives with respect to momenta, and the momenta by derivatives with respect to coordinates, as
$$\hat q_i = i\hbar\frac{\partial}{\partial p_i}, \qquad \hat p_i = -i\hbar\frac{\partial}{\partial q_i}, \qquad (11.10)$$
which allows us to replace them in the commutators on the right-hand side of the Ehrenfest theorem (11.6).
Better still, we have that the specific commutators in the Ehrenfest theorem come from Poisson brackets that amount to derivatives of the Hamiltonian, which can be extended to operators as
$$\frac{1}{i\hbar}[\hat p_i,\hat H] \;\leftarrow\; \{p_i,H\}_{P.B.} = -\frac{\partial H}{\partial q_i} \;\to\; -\frac{\partial\hat H}{\partial\hat q_i}, \qquad \frac{1}{i\hbar}[\hat q_i,\hat H] \;\leftarrow\; \{q_i,H\}_{P.B.} = +\frac{\partial H}{\partial p_i} \;\to\; +\frac{\partial\hat H}{\partial\hat p_i}. \qquad (11.11)$$
Finally, then, we obtain the Hamiltonian equations of motion as operator equations quantum averaged over a state $|\psi\rangle$,
$$\frac{d}{dt}\langle q_i\rangle_\psi = \Big\langle\frac{\partial H}{\partial p_i}\Big\rangle_\psi, \qquad \frac{d}{dt}\langle p_i\rangle_\psi = -\Big\langle\frac{\partial H}{\partial q_i}\Big\rangle_\psi, \qquad (11.12)$$
which is the more common way to state the Ehrenfest theorem. q.e.d.

As an example, consider the case of a coordinate $q$, and a Hamiltonian with a kinetic term and a potential term,
$$\hat H = \frac{\hat p^2}{2m} + V(\hat q). \qquad (11.13)$$
In this case, the two Hamiltonian equations of motion can be turned into a single (Newtonian) equation of motion, both for the quantum average:
$$\frac{d}{dt}\langle q\rangle_\psi = \frac{1}{m}\langle p\rangle_\psi, \qquad \frac{d}{dt}\langle p\rangle_\psi = -\langle\vec\nabla V\rangle_\psi \;\Rightarrow\; m\frac{d^2}{dt^2}\langle q\rangle_\psi = -\langle\vec\nabla V\rangle_\psi = \langle\vec F\rangle_\psi. \qquad (11.14)$$
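The first relation in (11.14) can be checked directly against a numerical wave-packet evolution. The following sketch (an illustration, not from the text; the split-step Fourier method, grid sizes, and the harmonic potential are my choices) evolves a displaced Gaussian in $V = x^2/2$ with $\hbar = m = \omega = 1$ and compares the finite difference of $\langle q\rangle$ with the midpoint value of $\langle p\rangle/m$:

```python
import numpy as np

# Split-step (Strang) evolution of a Gaussian packet in V(x) = x^2/2;
# check the Ehrenfest relation d<q>/dt = <p>/m over one time step.
N, L, dt = 1024, 40.0, 1e-3
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
k = 2*np.pi*np.fft.fftfreq(N, d=dx)
V = 0.5*x**2

psi = np.exp(-(x - 1.0)**2/2) * np.exp(0.5j*x)   # displaced packet, <p> = 0.5
psi /= np.sqrt(np.sum(np.abs(psi)**2)*dx)

def step(psi):                                   # half potential, kinetic, half potential
    psi = np.exp(-0.5j*V*dt)*psi
    psi = np.fft.ifft(np.exp(-0.5j*k**2*dt)*np.fft.fft(psi))
    return np.exp(-0.5j*V*dt)*psi

def mean_q(psi):
    return np.real(np.sum(np.conj(psi)*x*psi)*dx)

def mean_p(psi):
    dpsi = np.fft.ifft(1j*k*np.fft.fft(psi))     # spectral d(psi)/dx
    return np.real(np.sum(np.conj(psi)*(-1j)*dpsi)*dx)

q0, p0 = mean_q(psi), mean_p(psi)
psi1 = step(psi)
q1, p1 = mean_q(psi1), mean_p(psi1)
dq_dt = (q1 - q0)/dt                             # compare with midpoint <p>/m
```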

11.2 Continuity Equation for Probability

In Chapter 7, we saw that the norm of a time-dependent state amounts to an integral of a probability density,
$$\langle\psi(t)|\psi(t)\rangle = \int \rho\, dV = \int \frac{dP}{dV}\, dV, \qquad (11.15)$$
and in the coordinate representation this implies that
$$\rho = \frac{dP}{dV} = |\psi(\vec r, t)|^2. \qquad (11.16)$$
Moreover, the Schrödinger equation implies the continuity condition for probability,
$$\partial_t \rho = -\vec\nabla\cdot\vec j, \qquad (11.17)$$
where $\vec j$ is the probability current density (when we think of probability as a fluid),
$$\vec j = \frac{\hbar}{2mi}\big(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*\big). \qquad (11.18)$$
But we can expand on this formalism, and write the wave function in coordinate space as a real normalization constant times a phase,
$$\psi(\vec r, t) = A\,e^{iS(\vec r,t)/\hbar}, \qquad (11.19)$$
where $A \in \mathbb{R}$ and, from the above, we have
$$A = \sqrt{\rho(\vec r, t)}. \qquad (11.20)$$
Moreover, the probability current density is
$$\vec j = \frac{\hbar}{2mi}\,A^2\,\frac{2i}{\hbar}\vec\nabla S = \frac{\rho}{m}\vec\nabla S. \qquad (11.21)$$
Thinking of probability as a fluid, we can define its "velocity" $\vec v$ by writing the fluid equation $\vec j = \rho\vec v$, obtaining
$$\vec v = \frac{1}{m}\vec\nabla S. \qquad (11.22)$$
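The reduction of (11.18) to $\vec j = \rho\vec\nabla S/m$ is easy to see numerically. A minimal sketch (not from the text; the envelope and parameters are illustrative), in one dimension with $\hbar = m = 1$ and $S(x) = p\,x$:

```python
import numpy as np

# For psi = A(x) e^{iS}, with real A and S = p x, the current (11.18)
# should reduce to j = rho * dS/dx = rho * p, i.e. j = rho * v as in (11.22).
x = np.linspace(-5, 5, 2001)
p = 0.7
A = np.exp(-x**2/4)                      # real envelope, rho = A^2
psi = A * np.exp(1j*p*x)                 # S(x) = p x
dpsi = np.gradient(psi, x)
j = (1/(2j)) * (np.conj(psi)*dpsi - psi*np.conj(dpsi))
rho = np.abs(psi)**2
```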

Since a wave function is written in terms of the propagator (the time evolution operator in the coordinate basis), which itself is a path integral that can be expanded in terms of the classical action plus a quantum correction, as in (11.2), we obtain
$$\psi(\vec r, t) = U(\vec r, t; \vec r\,', 0)\,\psi(\vec r\,', 0) \simeq \exp\Big[\frac{i}{\hbar}\big(S_{cl}(\vec r_{cl}, t) + S_{qu}(\vec r, t)\big)\Big]\,\psi(\vec r\,', 0), \qquad (11.23)$$
which is consistent with the previous ansatz for the wave function (11.19).
Part of the time evolution of the ansatz (11.19), namely the factor $A = \sqrt{\rho}$, was defined from the continuity equation, which can now be written more explicitly,
$$\partial_t A^2 = \partial_t\rho = -\vec\nabla\cdot(\rho\vec v) = -\frac{1}{m}\vec\nabla\cdot(A^2\vec\nabla S) \;\Rightarrow\; \partial_t A = -\frac{1}{m}\Big(\vec\nabla A\cdot\vec\nabla S + \frac{1}{2}A\vec\nabla^2 S\Big). \qquad (11.24)$$
This continuity equation was derived from the Schrödinger equation, but we have used only part of it; the other part must be the equation for $S$. Indeed, from the Schrödinger equation in coordinate space,
$$\Big[-\frac{\hbar^2}{2m}\vec\nabla^2 + V(\vec r)\Big]\psi(\vec r,t) = i\hbar\,\partial_t\psi(\vec r,t), \qquad (11.25)$$
which for the ansatz becomes
$$-\frac{\hbar^2}{2m}e^{\frac{i}{\hbar}S}\Big[\vec\nabla^2 A + \frac{2i}{\hbar}\vec\nabla A\cdot\vec\nabla S + \frac{i}{\hbar}A\vec\nabla^2 S - \frac{1}{\hbar^2}A(\vec\nabla S)^2\Big] + V(\vec r)\,A\,e^{iS/\hbar} = i\hbar\,e^{\frac{i}{\hbar}S}\Big[\partial_t A + \frac{i}{\hbar}A\,\partial_t S\Big], \qquad (11.26)$$
we obtain, using the continuity equation (11.24) to cancel the terms involving $A$ between the left- and right-hand sides,
$$\frac{1}{2m}(\vec\nabla S)^2 + V(\vec r) + \partial_t S - \frac{\hbar^2}{2m}\frac{\vec\nabla^2 A}{A} = 0. \qquad (11.27)$$
We see that the quantum constant $\hbar$ appears only in the last term, the other three terms being of the same, classical, order.
That means that if we define the classical limit as the limit $\hbar \to 0$, in which for instance the canonical quantization condition $[\hat q_i, \hat p_j] = i\hbar\,\delta_{ij}$ becomes just the commutation of classical variables, $[q_i, p_j] = 0$, we obtain an equation involving just the first three terms, which turns out to be the classical Hamilton–Jacobi equation.

11.3 Review of the Hamilton–Jacobi Formalism

In order to continue, we quickly review the Hamilton–Jacobi formalism of classical mechanics. There
is a more complete and rigorous derivation for it, but here we will show a somewhat shorter version,
focusing on the physical interpretation.
If we consider a canonical transformation between two representations of classical mechanics for the same system, the classical action must change by a total derivative, given that the endpoints of the action are fixed:
$$S \to S' = S - \int dt\,\frac{d}{dt}S(x,t) \;\Rightarrow\; \int_{t_1}^{t_2} L\,dt \to \int_{t_1}^{t_2}(L\,dt - dS) = \int_{t_1}^{t_2} L\,dt - S(x(t_2),t_2) + S(x(t_1),t_1), \qquad (11.28)$$
where $S(x(t_{1,2}),t_{1,2})$ are constants and so do not change the physics.
That means that a Lagrangian depending on general variables $q_i$ changes as follows:
$$L \to L' = L - \frac{dS}{dt} = \sum_i m_i\frac{\dot q_i^2}{2} - V - \frac{\partial S}{\partial t} - \sum_i \frac{\partial S}{\partial q_i}\dot q_i = \sum_i \frac{1}{2m_i}\Big(p_i - \frac{\partial S}{\partial q_i}\Big)^2 - V - \frac{\partial S}{\partial t} - \sum_i \frac{1}{2m_i}\Big(\frac{\partial S}{\partial q_i}\Big)^2. \qquad (11.29)$$

The extremum of the new Lagrangian is obtained when we have both
$$p_i = \frac{\partial S}{\partial q_i} \qquad (11.30)$$
and
$$\frac{\partial S}{\partial t} + \sum_i \frac{1}{2m_i}\Big(\frac{\partial S}{\partial q_i}\Big)^2 + V(q,t) = 0. \qquad (11.31)$$
This latter equation (11.31) is the Hamilton–Jacobi equation, and $S(q,t)$ is Hamilton's principal function.
The usefulness of the Hamilton–Jacobi equation comes when, after the canonical transformation, $H' = 0$, which means that $L' = \sum_i P_i\dot Q_i$, and this is actually zero, since $P_i$ and $Q_i$ are constant, as we will see shortly. Thus $S$ is the action of the theory.
Since $H' = 0$, one finds that the new momenta $P_i$ are constants of motion, which will be called $\alpha_i$, and the principal function will depend on these constants as well: $S = S(t, q_i, \alpha_i)$. Then, moreover, the derivatives with respect to these constants $\alpha_i$ give the new coordinates $Q_i$ and are new constants of motion, called $\beta_i$,
$$\frac{\partial S}{\partial\alpha_i} = \beta_i = Q_i. \qquad (11.32)$$
This, together with $P_i = \alpha_i$, defines the constants of motion, also known as integrals of motion.
There is no algorithmic solution to the Hamilton–Jacobi equation, but often one can solve it by separation of variables. Indeed, if the potential $V(\vec r)$ is (explicitly) time independent, the equation admits a solution that separates the variables,
$$S(t, q_i) = f(t) + s(q_i). \qquad (11.33)$$
Then, substituting into (11.31), we find
$$\dot f(t) + \sum_i \frac{1}{2m_i}\Big(\frac{\partial s}{\partial q_i}\Big)^2 + V(q_i) = 0, \qquad (11.34)$$
which means that a function of time alone equals a function of the $q_i$ alone, allowing us to set each to a constant independently,
$$\dot f(t) = \text{constant} \equiv -E \;\Rightarrow\; f(t) = -Et + \text{constant}, \qquad (11.35)$$
f˙(t) = constant ≡ −E ⇒ f (t) = −Et + constant, (11.35)


132 11 Classical Limit, WKB Method, Ehrenfest Theorem

and substituting into (11.34) we find the time-independent Hamilton–Jacobi equation
$$\sum_i \frac{1}{2m_i}\Big(\frac{\partial s}{\partial q_i}\Big)^2 + V(q_i) = E. \qquad (11.36)$$
For Hamilton's principal function we obtain
$$S(t, q_i, E, \alpha_i) = s(q_i, E, \alpha_i) - Et, \qquad (11.37)$$
and the integrals of motion satisfy
$$\frac{\partial S}{\partial\alpha_i} = \frac{\partial s}{\partial\alpha_i} = \beta_i, \qquad \frac{\partial S}{\partial E} = \frac{\partial s(q_i, E, \alpha_i)}{\partial E} - t = t_0. \qquad (11.38)$$
We can also separate more variables, in the case where the Hamiltonian is independent of other
variables, but we will not describe that here.
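The second relation in (11.38) is how trajectories are actually extracted: the $E$-derivative of $s$ gives the time along the trajectory, up to a constant. A minimal numerical sketch (not from the text; the harmonic oscillator with $m = \omega = 1$ is my illustrative choice):

```python
import numpy as np

# For V = q^2/2, s(q, E) = int sqrt(2(E - V)) dq, so the time along the
# trajectory is ds/dE = int_0^q dq'/sqrt(2E - q'^2) = arcsin(q/sqrt(2E)),
# which inverts to the familiar q(t) = sqrt(2E) sin(t + const).
E, qf = 2.0, 1.2
q = np.linspace(0.0, qf, 400001)
f = 1.0/np.sqrt(2*E - q**2)
tof = np.sum(0.5*(f[1:] + f[:-1])*np.diff(q))   # trapezoid rule for ds/dE
exact = np.arcsin(qf/np.sqrt(2*E))
```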

11.4 The Classical Limit and the Geometrical Optics Approximation

We see then that, in the $\hbar \to 0$ limit, the quantum version of the Hamilton–Jacobi equation, (11.27), reduces to the classical Hamilton–Jacobi equation, (11.31). That means that $S(\vec x, t)$ appearing in $\psi = Ae^{iS/\hbar}$ is identified with Hamilton's principal function $S(\vec x, t)$, which in turn equals the action if $H' = 0$, and thus $L' = \sum_i P_i\dot Q_i = 0$. Indeed, then, from the path integral formalism, we have
$$\psi(\vec r, t) \simeq e^{iS_{cl}(\vec r,t)/\hbar} \times (\cdots). \qquad (11.39)$$
Considering then the time-independent (stationary) case for $S(\vec x, t)$ as in the Hamilton–Jacobi equation, we write
$$S(\vec x, t) = W(\vec r) - Et, \qquad (11.40)$$
which leads to the following equation for $W$:
$$\frac{1}{2m}(\vec\nabla W)^2 + V(\vec r) - E - \frac{\hbar^2}{2m}\frac{\vec\nabla^2 A}{A} = 0. \qquad (11.41)$$
But we still haven't defined the classical limit physically, or defined rigorously the smallness of the extra term in (11.27). Physically, we see that the limit is a generalization of the geometrical optics approximation of classical wave mechanics. In it, a wave is replaced by "paths of light rays", corresponding now to replacing the integral over all paths with just the motion of classical particles.
Using the particle–wave duality defined by de Broglie, the wave associated with a particle of momentum $p$ has wavelength $\lambda = h/p$. In the presence of a potential $V(\vec r)$, $p^2/(2m) = E - V(\vec r)$, so the position-dependent de Broglie wavelength is
$$\lambda = \frac{h}{p} = \frac{h}{\sqrt{2m(E - V(\vec r))}}. \qquad (11.42)$$
Replacing this in (11.41), we can rewrite it as
$$(\vec\nabla W)^2 = \frac{h^2}{\lambda^2}\Big[1 + \frac{\lambda^2}{(2\pi)^2}\frac{\vec\nabla^2 A}{A}\Big]. \qquad (11.43)$$

Then it is indeed clear that, neglecting the term with $(\lambda/(2\pi))^2$ as being small, we obtain the geometrical optics approximation. Indeed, the resulting equation is
$$(\vec\nabla W)^2 = \frac{h^2}{\lambda^2}, \qquad (11.44)$$
which is the equation of a wave front in geometrical optics. The surfaces of constant $W$ are surfaces of constant phase, i.e., wave fronts. If we have $V(\vec r) = 0$, $\lambda$ is constant, so the solution of the equation is
$$W = \vec p\cdot\vec r + \text{constant}, \qquad (11.45)$$
where $|\vec p| = h/\lambda$, and so wave fronts are perpendicular to the direction of $\vec p$, whereas light rays are in a direction parallel to $\vec p$.

11.5 The WKB Method

A related way to expand the wave function semiclassically is the WKB method, found by Wentzel, Kramers, and Brillouin; it is a more general method, but roughly speaking it can be understood here as an expansion in $\hbar$ and so as a "semiclassical" expansion of the wave function.
Specifically, we saw that
$$\psi(\vec r, t) = e^{iS(\vec r,t)/\hbar} = \exp\Big[\frac{i}{\hbar}\big(W(\vec r) - Et\big)\Big] \;\Rightarrow\; \psi(\vec r) = e^{iW(\vec r)/\hbar}, \qquad (11.46)$$
and moreover we write
$$W(\vec r) = s(\vec r) + \frac{\hbar}{i}T(\vec r), \qquad (11.47)$$
so that
$$\psi(\vec r) = e^{T(\vec r)}\,e^{is(\vec r)/\hbar} \equiv A\,e^{is(\vec r)/\hbar}, \qquad (11.48)$$
where we can uniquely define $s(\vec r)$ and $T(\vec r)$ by requiring that they are both even in $\hbar$. We can moreover expand them in $\hbar$, and keep only the lowest order in the expansion.
The Schrödinger equation for $\psi(\vec r, t)$ becomes an equation for $S(\vec r, t)$,
$$\frac{\partial S}{\partial t} + \sum_i \frac{1}{2m_i}\Big[\Big(\frac{\partial S}{\partial q_i}\Big)^2 + \frac{\hbar}{i}\frac{\partial^2 S}{\partial q_i^2}\Big] + V(q_i) = 0. \qquad (11.49)$$
Neglecting the term in $\hbar$, specifically because
$$\hbar\Big|\frac{\partial^2 S}{\partial q_i^2}\Big| \ll \Big(\frac{\partial S}{\partial q_i}\Big)^2, \qquad (11.50)$$
we obtain again the classical Hamilton–Jacobi equation.


As we saw, this limit is a geometrical optics approximation and, remembering that $\partial S/\partial q_i = p_i$, the above condition becomes
$$\hbar\Big|\frac{\partial p_i}{\partial q_i}\Big| \ll p_i^2 \;\Rightarrow\; \Big|\frac{\partial(\hbar/p_i)}{\partial q_i}\Big| \ll 1. \qquad (11.51)$$
Remembering that $\hbar/p_i = \bar\lambda_i$ (the reduced wavelength $\lambda_i/2\pi$), we write the condition as
$$|\vec\nabla\bar\lambda| \ll 1, \qquad (11.52)$$
or, considering the variation of $\lambda$ over a distance of order $\lambda$, so $\delta_\lambda\lambda \simeq (\vec\nabla\lambda)\,\delta x$ for $\delta x = \lambda$, we obtain the condition
$$\Big|\frac{\delta_\lambda\lambda}{\lambda}\Big| \ll 1, \qquad (11.53)$$
which is the precise form of the geometrical optics approximation.

WKB in One Dimension

The simplest application of the WKB method is to a one-dimensional system. Putting $\hbar \to 0$ in (11.41) and restricting to one dimension, we obtain
$$\frac{W'^2}{2m} \simeq E - V(x), \qquad (11.54)$$
with approximate solution
$$W(x) \simeq \pm\int^x dx'\,\sqrt{2m(E - V(x'))} \equiv \pm\int^x dx'\,p(x'), \qquad (11.55)$$
where $p(x) = \sqrt{2m(E - V(x))}$ is the (space-dependent) momentum. Then the wave function is
$$\psi(x,t) = A\exp\Big[\pm\frac{i}{\hbar}\int_{x_0}^x dx'\,p(x') - \frac{iEt}{\hbar}\Big]. \qquad (11.56)$$
The equation for $A$ is, as we have seen already, the continuity equation (coming from $\rho = A^2$), now written as
$$\frac{\partial\rho}{\partial t} + \frac{\partial}{\partial x}\Big(\frac{\rho}{m}\frac{\partial S}{\partial x}\Big) = 0. \qquad (11.57)$$
Since $\partial\rho/\partial t = 0$ (the stationary case) and $\partial S/\partial x = dW/dx$, we obtain that $\rho W'(x)$ is constant. But on the other hand, from (11.55), $\rho W' = \pm\rho\sqrt{2m(E - V(x))}$, meaning we obtain
$$A = \sqrt{\rho} = \frac{\text{constant}}{[E - V(x)]^{1/4}}. \qquad (11.58)$$
Finally, then, the WKB solution to the one-dimensional problem is
$$\psi(x,t) = \frac{\text{constant}}{[E - V(x)]^{1/4}}\exp\Big[\pm\frac{i}{\hbar}\int_{x_0}^x dx'\,p(x') - \frac{iEt}{\hbar}\Big]. \qquad (11.59)$$
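The phase integral $\int p(x)\,dx$ in (11.55) is usually evaluated numerically. A minimal sketch (an illustration, not from the text): for $V = x^2/2$ with $m = \omega = \hbar = 1$, the integral between the classical turning points $\pm a$, $a = \sqrt{2E}$, has the closed form $\pi E$, so a simple quadrature can be checked against it:

```python
import numpy as np

# Phase integral int_{-a}^{a} sqrt(2(E - x^2/2)) dx for the harmonic
# oscillator; the exact value is pi*E (for m = w = 1).
E = 2.5
a = np.sqrt(2*E)
x = np.linspace(-a, a, 200001)
p = np.sqrt(np.maximum(2*(E - 0.5*x**2), 0.0))
phase = np.sum(0.5*(p[1:] + p[:-1])*np.diff(x))   # trapezoid rule
```

This is the integral that enters Bohr–Sommerfeld-type quantization conditions later on; the square-root vanishing at the turning points makes the trapezoid rule converge a bit slowly there, hence the fine grid.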

Important Concepts to Remember

• The statement of the Ehrenfest theorem is that the classical equations of motion in the Hamiltonian formulation are valid for the quantum averages over quantum states.
• The wave function can be written as $\psi(\vec r,t) = \sqrt{\rho(\vec r,t)}\,e^{iS(\vec r,t)/\hbar}$, so that the "velocity of the probability fluid" is $\vec v = \frac{1}{m}\vec\nabla S$.
• Then, from the Schrödinger equation (and the continuity equation for probability), the function $S(\vec r,t)$ satisfies a quantum-corrected Hamilton–Jacobi equation, with a term of order $\hbar^2$.
• In the Hamilton–Jacobi formalism, $S(q,t)$ is Hamilton's principal function, and its derivatives with respect to the constants of motion $\alpha_i$, the values of $P_i$, are other constants of motion $\beta_i$, the values of $Q_i$.
• There is no algorithmic way to solve the Hamilton–Jacobi equation; one generally uses some separation of variables, at least for time, leading to the time-independent Hamilton–Jacobi equation.
• The classical limit, ignoring the $\hbar^2$ term in the Hamilton–Jacobi equation, is a generalization of the geometrical optics approximation, for the case where the de Broglie wavelength satisfies $|\delta_\lambda\lambda/\lambda| \ll 1$.
• An equivalent expression for the wave function is $\psi(\vec r,t) = \exp\big[\frac{i}{\hbar}(W(\vec r) - Et)\big]$, where $W(\vec r) = s(\vec r) - i\hbar\,T(\vec r)$.
• The WKB method in one dimension is a first-order correction to the classical solution $Ae^{is_{cl}/\hbar}$, in which $W(x) = \pm\int^x dx'\sqrt{2m(E - V(x'))} - i\hbar\ln A$ and $A$ is no longer constant but equals $\text{const}/[E - V(x)]^{1/4}$.

Further Reading
See [2] and [3] for more details.

Exercises

(1) Consider the Hamiltonian
$$H = \frac{p^2}{2} + \lambda x^4 \qquad (11.60)$$
and the observable
$$A = x^3 + p^3. \qquad (11.61)$$
Find the equation of motion for $A$ in the Hamiltonian formalism and the corresponding quantum version of this equation of motion.
(2) Consider a radial (central) potential for motion in three dimensions, V (r). Write the quantum
version of the Hamilton–Jacobi equation, and reduce it to a single equation for the radial motion
(in r).
(3) (Review from classical mechanics) Consider the classical Hamilton–Jacobi formalism, for the
case of the motion of a particle in three spatial dimensions, in a central potential V (r) = −B/r.
Solve the Hamilton–Jacobi equation for the motion of a particle coming in from infinity and
being deflected, and find the deflection angle.

 
(4) In the parametrization $\psi = A\exp\big[\frac{i}{\hbar}(W(\vec r) - Et)\big]$, what is the equation of motion for $A$? What happens to it in the geometrical optics approximation? If also $V(\vec r) = 0$, solve the equation for $A$.
(5) Consider a one-dimensional harmonic oscillator. Can one apply the WKB approximation to it?
If so, why, and when?
(6) Write down the WKB approximation for the one-dimensional potential V = −B/x, B > 0, and
find the domain of validity for it.
(7) For the case in exercise 6, write down the explicit equations for the wave function outside the
geometrical optics approximation, and find a way to introduce the next-order corrections to the
WKB approximation that are based on the geometrical optics approximation.
12 Symmetries in Quantum Mechanics I: Continuous Symmetries

We now begin the analysis of symmetries in quantum mechanics. First, in this chapter, we will
analyze simple continuous symmetries, such as translation invariance. But before that, we will review
symmetries in classical mechanics. Then we will generalize the results to quantum mechanics, and
show that the symmetries of quantum mechanics form groups. In the next chapter, we will analyze
discrete symmetries, such as parity invariance, and “internal” symmetries. After that, in the following
chapters, we will consider rotational invariance and the theory of angular momentum.

12.1 Symmetries in Classical Mechanics

Consider an infinitesimal transformation of the variables $q_i$ of a system with Lagrangian $L(q_i,\dot q_i,t)$, $q_i \to q_i + \delta q_i$, where $\delta q_i$ is a specific change. Then we have a symmetry of the system if the Lagrangian is invariant, $\delta L/\delta q_i = 0$, or more precisely, if the action is invariant,
$$\frac{\delta S}{\delta q_i} = 0. \qquad (12.1)$$
Note that the action $S = \int dt\,L$ is therefore invariant if $L$ varies by at most the total derivative of a function $f$, i.e., by $df/dt$.
We can consider two simple cases:

• We can have the invariance of $L$ under any translation by $\delta q_i$, so that $\partial L/\partial q_i = 0$. Then the Lagrange equations of motion,
$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_i} = 0, \qquad (12.2)$$
where the canonically conjugate momentum is $p_i \equiv \partial L/\partial\dot q_i$, imply that this conjugate momentum is conserved, i.e., constant in time,
$$\frac{dp_i}{dt} = 0. \qquad (12.3)$$
This is the simplest form of the Noether theorem, which states that: for any symmetry of the
theory there is a conserved charge (i.e., a charge that is constant in time). Here the conserved
charge is the canonically conjugate momentum pi . Note that usually the Noether theorem is
defined within classical field theory, which is outside our scope, but here we present the simple
(nonrelativistic) classical-mechanics form.
• We can also have a symmetry corresponding to invariance under a transformation that is continuous (so that there exists an infinitesimal form) and linear (so that it is proportional to the $q_i$ themselves), of the type
$$\delta q_i = \sum_a \epsilon^a (iT_a)_i{}^j q_j, \qquad (12.4)$$
where the $\epsilon^a$ are arbitrary (real or complex) infinitesimal parameters and $(iT_a)_i{}^j$, for $a = 1,\ldots,N$, are constant matrices called the "generators of the symmetry".
In this more general case, consider a Lagrangian that (for simplicity) has no explicit time dependence, so that $L = L(q_i, \dot q_i)$. Then the variation of the action under the symmetry is
$$\delta S = \int dt\Big[\frac{\partial L}{\partial q_i}\delta q_i + \frac{\partial L}{\partial\dot q_i}\delta\dot q_i\Big] = \int dt\Big\{\Big[\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_i}\Big]\delta q_i + \frac{d}{dt}\Big[\frac{\partial L}{\partial\dot q_i}\delta q_i\Big]\Big\}, \qquad (12.5)$$
where in the second line we have used $\delta\dot q_i = \frac{d}{dt}(\delta q_i)$ (since $[\partial_t,\delta]=0$) and partial integration.
If the transformation above is a symmetry then the action is invariant, $\delta S = 0$. Moreover, using the classical Lagrange equations of motion (so, "on-shell"), we find that
$$0 = \int_{t_1}^{t_2} dt\,\frac{d}{dt}\Big[\frac{\partial L}{\partial\dot q_i}\delta q_i\Big] = \epsilon^a\Big[\frac{\partial L}{\partial\dot q_i}(iT_a)_i{}^j q_j\Big]_{t_1}^{t_2}, \qquad (12.6)$$
which vanishes for any $\epsilon^a$, allowing us to peel off (factorize out) the $\epsilon^a$, and for any $t_1, t_2$, which means that the quantity, a priori time dependent,
$$Q_a \equiv \frac{\partial L}{\partial\dot q_i}(iT_a)_i{}^j q_j(t), \qquad (12.7)$$
is actually time independent, and is known as a "conserved charge" associated with the symmetry (12.4). This is the more common version of the Noether theorem in classical mechanics.
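A concrete instance of (12.7) can be checked numerically. The following sketch (an illustration, not from the text; the potential $V(r) = -1/r$ in two dimensions with $m = 1$ and an RK4 integrator are my choices) takes the rotation generator, the antisymmetric $2\times 2$ matrix, for which the Noether charge $Q = p_i (iT)_i{}^j q_j$ is the angular momentum $L_z = x\,p_y - y\,p_x$, and verifies that it stays constant along a trajectory:

```python
import numpy as np

# Central potential V(r) = -1/r in 2D: rotations leave L invariant, and
# the Noether charge is the angular momentum L_z = x*py - y*px.
def accel(q):
    return -q / (q @ q)**1.5                  # force for V = -1/r (m = 1)

def rk4_step(q, p, dt):
    k1q, k1p = p, accel(q)
    k2q, k2p = p + 0.5*dt*k1p, accel(q + 0.5*dt*k1q)
    k3q, k3p = p + 0.5*dt*k2p, accel(q + 0.5*dt*k2q)
    k4q, k4p = p + dt*k3p, accel(q + dt*k3q)
    return (q + dt/6*(k1q + 2*k2q + 2*k3q + k4q),
            p + dt/6*(k1p + 2*k2p + 2*k3p + k4p))

q, p = np.array([1.0, 0.0]), np.array([0.2, 0.9])
Lz0 = q[0]*p[1] - q[1]*p[0]
for _ in range(2000):
    q, p = rk4_step(q, p, 1e-3)
Lz = q[0]*p[1] - q[1]*p[0]                    # conserved along the orbit
```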
The Noether theorem allows for one more generalization, for the case where the Lagrangian is not invariant, but only the action is invariant. The Lagrangian can change by a total derivative, so
$$\delta S = \int dt\,\frac{d}{dt}(\epsilon^a\tilde Q_a) = \epsilon^a\tilde Q_a\Big|_{t_1}^{t_2}, \qquad (12.8)$$
which vanishes only because of the boundary conditions on $\tilde Q_a$ at $t_1, t_2$. Then, instead of $Q_a$, it is
$$(Q_a - \tilde Q_a)(t) \qquad (12.9)$$
that is independent of time.
We can make a number of observations about this analysis:
• Observation 1. The total new variable, after the transformation, is
$$q_i' = q_i + \delta q_i = \big[\delta_i^j + \epsilon^a(iT_a)_i{}^j\big]q_j \equiv M_i{}^j q_j, \qquad (12.10)$$
where the matrix $M_i{}^j$ defines the linear transformation of $q_i$.


• Observation 2. Consider the time translation, $t \to t + \delta t$; under it $q_i$ varies by $\delta q_i = \dot q_i\,\delta t$. If time translation is an invariance (so that the action is invariant under it), the infinitesimal conserved charge and the variation of the Lagrangian are
$$\epsilon^a Q_a = \frac{\partial L}{\partial\dot q_i}\dot q_i\,\delta t, \qquad \delta L = \partial_t L\,\delta t, \qquad (12.11)$$
and from the last relation we deduce that $\partial_t L = \partial_t\tilde Q \Rightarrow L = \tilde Q$. Substituting $p_i = \partial L/\partial\dot q_i$, we find the conservation law
$$0 = \frac{d}{dt}\Big[\frac{\partial L}{\partial\dot q_i}\dot q_i - L\Big] = \frac{d}{dt}(p_i\dot q_i - L) = \frac{d}{dt}H, \qquad (12.12)$$
meaning that the Hamiltonian, or energy, is the Noether charge associated with time translations.
• Observation 3. We can write the linear and continuous variation of the $q_i$ as a Poisson bracket with the charge. Indeed, we find that
$$Q_a = \frac{\partial L}{\partial\dot q_i}(iT_a)_i{}^j q_j = p_i(iT_a)_i{}^j q_j, \qquad (12.13)$$
so the variation of $q_i$ is the Poisson bracket
$$\delta q_k = -\{\epsilon^a Q_a, q_k\}_{P.B.} = \delta_k^i\,\epsilon^a(iT_a)_i{}^j q_j = \epsilon^a(iT_a)_k{}^j q_j, \qquad (12.14)$$
as written before. Here we have used the fundamental Poisson bracket $\{q_k, p_i\}_{P.B.} = \delta_k^i$.
• Observation 4. We can generalize the formalism to discrete symmetries (meaning that there is no infinitesimal version) by writing the finite transformation
$$q_i \to q_i' = M_i{}^j q_j, \qquad (12.15)$$
instead of $M = 1 + O(\epsilon)$. Then, if $S$ changes to $S' = S$ or, more restrictively, if $L$ changes to $L' = L$, the matrix $M$ defines the action of the symmetry on the variables $q_i$.

12.2 Symmetries in Quantum Mechanics: General Formalism

In quantum mechanics, the classical matrix $M_i{}^j$ becomes the operator $\hat M_i{}^j$ and the charge $Q_a$ becomes the charge operator $\hat Q_a$, which, as we saw, can be defined as a function of the phase space variables, now operators, $\hat q_i, \hat p_i$.
The linear transformation, written as a Poisson bracket, with the quantization rule becomes ($1/(i\hbar)$ times) the commutator,
$$\delta q_k = -\{\epsilon^a Q_a, q_k\}_{P.B.} \;\to\; -\frac{1}{i\hbar}[\epsilon^a\hat Q_a, \hat q_k]. \qquad (12.16)$$
This is a natural relation in canonical quantization.
In classical mechanics, we focused on symmetries in the Lagrangian formalism, when the action (or sometimes the Lagrangian) is invariant, $\delta S = 0$. However, in the Hamiltonian formalism, useful for quantum mechanics, the equivalent statement is the invariance of the Hamiltonian, $\delta H(q_i, p_i) = 0$. In the classical mechanics version, this means that
$$\delta H(q_i, p_i) = -\{\epsilon^a Q_a, H\}_{P.B.} = 0, \qquad (12.17)$$
which translates in quantum mechanics into
$$\frac{1}{i\hbar}[\epsilon^a\hat Q_a, \hat H] = 0 \;\Rightarrow\; [\hat Q_a, \hat H] = 0, \qquad (12.18)$$
i.e., the transformed Hamiltonian operator is the same as the original one,
$$\hat H' \equiv \hat Q_a\hat H\hat Q_a^{-1} = \hat H. \qquad (12.19)$$

More importantly, the Ehrenfest theorem, stating that the quantum averages on states should have
the same property as the classical values, must hold.
In the case of symmetry transformations, we can have two points of view.
(1) The active point of view, where a transformation changes the state of the quantum system,
$$|\psi\rangle \to |\psi'\rangle = \hat U|\psi\rangle, \qquad (12.20)$$
but not the operators.
Under this change, the energy, or quantum average of the Hamiltonian, the equivalent of the classical quantity, should be invariant. Thus
$$\langle\psi'|\hat H|\psi'\rangle = \langle\psi|\hat U^\dagger\hat H\hat U|\psi\rangle = \langle\psi|\hat H|\psi\rangle, \qquad (12.21)$$
while the norms of the states should also be invariant,
$$\langle\psi'|\psi'\rangle = \langle\psi|\hat U^\dagger\hat U|\psi\rangle = \langle\psi|\psi\rangle, \qquad (12.22)$$
implying that the operator $\hat U$ should be unitary, $\hat U^\dagger = \hat U^{-1}$. Then the invariance of the quantum average of the Hamiltonian above, for any state $|\psi\rangle$, means that the transformed operator equals the original one,
$$\hat U^{-1}\hat H\hat U = \hat H \;\Rightarrow\; [\hat U, \hat H] = 0, \qquad (12.23)$$
so that the Hamiltonian commutes with the operator $\hat U$, representing the symmetry transformation on states.
(2) The passive point of view, where the transformation acts on operators but not on states,

Â → Â′ = Û⁻¹ Â Û, |ψ⟩ → |ψ⟩. (12.24)
This point of view should lead to the same results as the active one: we can see that the
quantum average transforms in the same way.
But, the change from an active to a passive point of view is more general than its application above to invariances of the Hamiltonian. It is in fact valid simply for transformations of the states or operators, corresponding to classical transformations. By the Ehrenfest theorem, the quantum version of a classical transformation of an observable A is the transformed quantum average of the operator Â,

⟨A⟩′ = ⟨ψ| Â′ |ψ⟩ or ⟨ψ′| Â |ψ′⟩, (12.25)

where Â′ = Û⁻¹ Â Û.
Note that the invariance of the quantum Hamiltonian Ĥ under a symmetry transformation with operator Q̂a implies a degeneracy of the eigenstates of the Hamiltonian. Indeed, if |ψ⟩ is an eigenstate of Ĥ then so is Q̂a|ψ⟩, since

Ĥ (Q̂a |ψ⟩) = Q̂a (Ĥ |ψ⟩) = Eψ (Q̂a |ψ⟩). (12.26)
12.3 Example 1. Translations

A classical translation of the coordinate qi acts on classical phase space, and on the corresponding quantum averages, as

qi → qi + ai ⇒ ⟨qi⟩ → ⟨qi⟩′ = ⟨qi⟩ + ai
pi → pi ⇒ ⟨pi⟩ → ⟨pi⟩. (12.27)

But in Chapter 5 we saw that a translation by δq is represented by an operator

T̂(δq) = 1̂ − i δq K̂, (12.28)
where K̂ is Hermitian, K̂† = K̂, and acts on wave functions as

K̂ → −i d/dq. (12.29)
More precisely, since, as we saw, 1/(iℏ) appears naturally in the transition from classical mechanics, we write

T̂(ε) = 1̂ − (iε/ℏ) K̂. (12.30)
Then, indeed, for a finite transformation by a instead of ε,

T̂(a) = e^{−i(a/ℏ)K̂} ≡ Σ_{n≥0} (−ia/ℏ)ⁿ K̂ⁿ/n!. (12.31)

This is so since for wave functions, when K̂ = −i d/dq, this is just the Taylor expansion of ψ(q).
By the general definition,

T̂(a)|q⟩ = |q + a⟩, (12.32)
and this means that for the wave function in the coordinate (q) representation,

⟨q| T̂(a) |ψ⟩ = ψ(q + a) = T̂(a) ψ(q), (12.33)
as required.
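As a quick numerical illustration (not from the book), for a polynomial wave function the Taylor series Σ_n (aⁿ/n!) dⁿψ/dqⁿ terminates, so summing it reproduces the finite translation ψ(q) → ψ(q + a) exactly; a minimal Python sketch with a hypothetical ψ:

```python
import math

# Hypothetical wave function psi(q) = q^3 - 2q; its Taylor series
# terminates, so the translation operator acts on it exactly.
def psi(q):
    return q**3 - 2*q

# derivatives of psi, written out by hand (all vanish beyond the third)
derivs = [psi,
          lambda q: 3*q**2 - 2,
          lambda q: 6*q,
          lambda q: 6.0,
          lambda q: 0.0]

q, a = 1.5, 0.7
# T(a) psi(q) = sum_n (a^n/n!) d^n psi/dq^n
translated = sum(a**n * derivs[n](q) / math.factorial(n) for n in range(5))
assert abs(translated - psi(q + a)) < 1e-12  # equals psi(q + a)
```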
In the passive point of view, the action of translation on the operators q̂ is

q̂′ = T̂⁻¹(ε) q̂ T̂(ε) = q̂ + ε 1̂, (12.34)
since then, indeed, the action on the quantum averages is

⟨q⟩ψ → ⟨q⟩′ψ = ⟨q⟩ψ + ε. (12.35)
12.4 Example 2. Time Translation Invariance

As we saw, at the classical level, time translation invariance leads to a Hamiltonian that is invariant in time (conserved), dH/dt = 0, which by the Ehrenfest theorem becomes in quantum mechanics

(d/dt) ⟨Ĥ⟩ψ = 0. (12.36)

This means that if we start with a given state at t = 0 or t = τ and evolve in time by t, we should obtain the same state, up to at most a phase e^{iα(t,τ)}, for any state:

|ψ(t)⟩ = Û(t, 0)|ψ(0)⟩ = e^{iα(t,τ)} |ψ′(t + τ)⟩ = e^{iα(t,τ)} Û(t + τ, τ)|ψ′(τ)⟩, (12.37)

where |ψ′(τ)⟩ = |ψ(0)⟩ = |ψ₀⟩ is the same state.


That in turn implies that in fact we have a relation between evolution operators,

Û(t, 0) = e^{iα(t,τ)} Û(t + τ, τ), (12.38)

for any t and τ.


But, since in the infinitesimal form Û ≃ 1̂ − (i/ℏ)Ĥ dt, we obtain

1 − (i/ℏ) Ĥ(0) dt = (1 + i f(τ) dt)(1 − (i/ℏ) Ĥ(τ) dt), (12.39)
so that finally

Ĥ(τ) = Ĥ(0) + ℏ f(τ) 1̂. (12.40)

However, in any case the function f (τ) coming from the expansion of the phase eiα(t,τ) is trivial
and can be absorbed in redefinitions, so we can just drop it, resulting in the fact that the Hamiltonian
operator is time independent, as expected from the classical analysis.
Equivalently, and dropping the phase e^{iα} from the beginning, we have

|ψ(t₁ + dt)⟩ ≃ (1 − (i dt/ℏ) Ĥ(t₁)) |ψ₀⟩
|ψ(t₂ + dt)⟩ ≃ (1 − (i dt/ℏ) Ĥ(t₂)) |ψ₀⟩, (12.41)
so we obtain

dĤ/dt = 0, (12.42)
which in particular implies also that
 
(d/dt) ⟨Ĥ⟩ψ = 0, ∀ |ψ⟩, (12.43)
as needed.
12.5 Mathematical Background: Review of Basics of Group Theory

General Linear and Continuous Symmetry


In general, we can write for the transformation operator,

M̂ = 1̂ − (i/ℏ) ε^a Q̂a. (12.44)
Then the transformed operator is

q̂i′ = M̂⁻¹ q̂i M̂ = (1 + (i/ℏ) ε^a Q̂a) q̂i (1 − (i/ℏ) ε^a Q̂a) ≃ q̂i − (1/(iℏ)) [ε^a Q̂a, q̂i]. (12.45)
Thus we can represent the operator Q̂a as

Q̂a = Σ_{i,j} p̂i (iTa)ij q̂j ⇒ −(1/(iℏ)) [ε^a Q̂a, q̂k] = ε^a (iTa)kj q̂j = δq̂k. (12.46)
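The charge representation in (12.46) can be illustrated numerically (a sketch with hypothetical values, not from the book): take the antisymmetric SO(2) generator in two dimensions and check that −{εQ, qk}P.B. = ε(iT)kj qj, using {Q, qk}P.B. = −∂Q/∂pk.

```python
# iT is the real antisymmetric SO(2) generator (hypothetical choice)
iT = [[0.0, 1.0], [-1.0, 0.0]]

def Q(q, p):
    # classical charge Q = sum_ij p_i (iT)_ij q_j
    return sum(p[i] * iT[i][j] * q[j] for i in range(2) for j in range(2))

def dQ_dp(q, p, k, h=1e-6):
    # partial derivative dQ/dp_k by central differences (exact: Q is linear)
    pp, pm = list(p), list(p)
    pp[k] += h; pm[k] -= h
    return (Q(q, pp) - Q(q, pm)) / (2 * h)

q, p, eps = [0.7, -1.3], [0.4, 2.1], 0.01
for k in range(2):
    # delta q_k = -{eps Q, q_k}_P.B. = eps dQ/dp_k should equal eps (iT)_kj q_j
    delta_qk = eps * dQ_dp(q, p, k)
    expected = eps * sum(iT[k][j] * q[j] for j in range(2))
    assert abs(delta_qk - expected) < 1e-9
```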

Groups and Invariance under Groups

Symmetry transformations form a group.
We say we have a group G if we have a set of elements, together with a multiplication between
them, represented as G × G → G, such that we have the following properties:
(a) The multiplication closes in the group, i.e., ∀ f , g ∈ G, h = f · g ∈ G.
(b) The multiplication is associative, i.e., ∀ f , g, h ∈ G, ( f · g) · h = f · (g · h).
(c) There is an element e called the identity, such that e · f = f · e = f , ∀ f ∈ G.
(d) There is an element called the inverse, f −1 , associated with any f ∈ G, such that f · f −1 =
f −1 · f = e.
Indeed, in the case of symmetries qi → Mi j q j , with M ∈ G, ∀ M1 , M2 ∈ G, M1 · M2 = M3 ∈ G.
Also, matrix multiplication is associative and admits an inverse.
As examples, we can consider both time translation and translation of some qi by a, so that the operator is T(a). Then indeed

T(a₁) · T(a₂) = T(a₁ + a₂). (12.47)
Example of a Group: Z2

The simplest group is the group Z2 , which has just two elements, a = e and b.
It can be represented on R as the numbers a = e = +1 and b = −1. This implies that the
multiplication table for the group is

a · a = a, b · b = a, a · b = b · a = b. (12.48)

It also means that the group is Abelian, so that g1 · g2 = g2 · g1 , ∀ g1 , g2 ∈ G. Abelian groups are
named after the Norwegian mathematician Niels Henrik Abel, whose name is associated with the
most important prize in mathematics, the Abel prize.
To have invariance of a system under a group G in classical mechanics means that we need to
have invariance under the transformations qi → gqi , ∀g ∈ G. Thus L(gqi ) = L(qi ), or at least
S[gqi ] = S[qi ] (invariant action).

Example of an Invariant System under Z2 Acting on q

As a simple example of invariance under Z2, consider a system with one real coordinate q ∈ R, but with a potential depending only on its modulus, V = V(|q|), so that

L = m q̇²/2 − V(|q|). (12.49)
Then q′ = gq for g = a or b. For the case g = a = +1, we obtain q′ = q, so this is trivially a symmetry. For the case g = b = −1, we need L(−q) = L(q). But, indeed, q̇² = (−q̇)² and |−q| = |q|, implying L(−q) = L(q). Then also S[q] = S[−q] (the action is invariant), though the reverse is not true in general (invariance of the action doesn't imply invariance of the Lagrangian).

Example of a System with Invariance of the Action but not of the Lagrangian
We can modify the above example in such a way that the action is still invariant, S[−q] = S[q], but the Lagrangian is not. Consider then the new Lagrangian

L = m q̇²/2 − V(|q|) + α dq/dt, (12.50)
together with the boundary condition q(t₂) = q(t₁). Then we have explicitly that L(−q) ≠ L(q), since the new term is odd under q → −q, but the action is invariant, since the new term gives
α ∫_{t₁}^{t₂} dt (dq/dt) = α(q(t₂) − q(t₁)) = 0, (12.51)

because of the boundary condition.

Generalization: Cyclic Groups


The next (“cyclic”) group in terms of size is Z3, made up of three elements, {e, a, b}, that can be represented in C as complex numbers equal to the third roots of unity, i.e., x ∈ C such that x³ = 1.
We have then

e = 1 = e0 , a = e2πi/3 , b = e4πi/3 , (12.52)

which implies the multiplication table

a2 = b, b2 = a, ab = ba = e. (12.53)

This is also an Abelian group, as we can easily see.
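A minimal numerical check (not in the text) that the third roots of unity close into this multiplication table:

```python
import cmath

# the third roots of unity, as in (12.52)
e = 1 + 0j
a = cmath.exp(2j * cmath.pi / 3)
b = cmath.exp(4j * cmath.pi / 3)

# the Z_3 multiplication table (12.53), checked numerically
assert abs(a * a - b) < 1e-12      # a^2 = b
assert abs(b * b - a) < 1e-12      # b^2 = a
assert abs(a * b - e) < 1e-12      # ab = e
assert abs(a * b - b * a) < 1e-12  # Abelian
```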



As an example of a classical Lagrangian that is invariant under the above group, consider a generalization of it, with a complex variable q = q1 + iq2, q1, q2 ∈ R, and

L(q) = m |q̇|²/2 − V(q³). (12.54)
Then, under q → q′ = gq, we have |q̇′| = |g q̇| = |q̇| and

q³ → q′³ = (gq)³ = q³, (12.55)

since g³ = 1 for all g ∈ Z3.


Our final generalization is to the N-element cyclic group Z N , with elements {e, a1 , . . . , a N −1 } that
can be represented in the complex numbers as the Nth roots of unity, g, such that g N = 1, specifically

e = 1, a1 = e2πi/N , . . . , a N −1 = e2πi(N −1)/N . (12.56)

An example of a Lagrangian invariant under this group is a further generalization of the same Lagrangian as before, for a complex variable q,

L(q, q̇) = m |q̇|²/2 − V(q^N), (12.57)
since then indeed

q^N → q′^N = (gq)^N = q^N. (12.58)

ZN -Invariant System in Quantum Mechanics

We can translate the invariance of the above classical system into a quantum mechanical invariance. For that, we define the quantum Hamiltonian

Ĥ = m |q̂˙|²/2 + V(q̂^N). (12.59)

Then we can check that the Hamiltonian is invariant under ĝ ∈ Z_N, acting as q̂ → ĝ⁻¹ q̂ ĝ, since

Ĥ′ = ĝ⁻¹ Ĥ ĝ = m |ĝ⁻¹ q̂˙ ĝ|²/2 + V((ĝ⁻¹ q̂ ĝ)^N) = Ĥ. (12.60)
Representations of Groups
So far, we have defined the groups Z N by the complex numbers that define their multiplication table.
But it is important to understand that we have different kinds of representations of the group, in
terms of different kinds of objects. For a representation R, we say that the element g is represented
by DR (g).
In the case of the fundamental (defining) representation of the Z N defined above, since all the
elements g are complex numbers, we have a one-dimensional complex vector space. But for Z N with
N ≥ 3, there is (at least) one other representation, called the regular representation.

In the case of Z3, we define it in terms of 3 × 3 matrices,

        ( 1 0 0 )          ( 0 0 1 )          ( 0 1 0 )
D(e) =  ( 0 1 0 ),  D(a) = ( 1 0 0 ),  D(b) = ( 0 0 1 ),   (12.61)
        ( 0 0 1 )          ( 0 1 0 )          ( 1 0 0 )

so it permutes the three elements of the vector space between each other.
This is a three-dimensional representation, meaning that these are matrix operators acting on a three-dimensional vector space of vectors (x, y, z)ᵀ. Consider a Cartesian basis for this vector space, and denote its components according to the three elements of the group, as

|e⟩ = (1, 0, 0)ᵀ ≡ |e₁⟩, |a⟩ = (0, 1, 0)ᵀ ≡ |e₂⟩, |b⟩ = (0, 0, 1)ᵀ ≡ |e₃⟩. (12.62)
The notation has been chosen in such a way that we have

D(g₁)|g₂⟩ = |g₁g₂⟩ ≡ |h⟩, (12.63)
as we can check explicitly. Moreover, since the states are orthonormal, ⟨e_i|e_j⟩ = δ_ij, we find that the matrix elements of D(g) are given by

(D(g))_ij = ⟨e_i| D(g) |e_j⟩ = ⟨e_i| D(g)e_j⟩. (12.64)
This regular representation is for the same group as the group defined by the roots of unity, but it
is inequivalent to it, since the vector space has a different dimension (one versus three).
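The representation property D(g₁)D(g₂) = D(g₁g₂) of these permutation matrices can be verified directly; a small self-contained sketch (plain Python lists as matrices, a hypothetical but standard encoding of the Z3 table):

```python
# Regular representation of Z_3, the permutation matrices of (12.61)
D = {
    'e': [[1,0,0],[0,1,0],[0,0,1]],
    'a': [[0,0,1],[1,0,0],[0,1,0]],
    'b': [[0,1,0],[0,0,1],[1,0,0]],
}
# the Z_3 multiplication table (12.53), e.g. a*a = b
table = {('e','e'):'e', ('e','a'):'a', ('e','b'):'b',
         ('a','e'):'a', ('a','a'):'b', ('a','b'):'e',
         ('b','e'):'b', ('b','a'):'e', ('b','b'):'a'}

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# representation property: D(g1) D(g2) = D(g1 g2) for every pair
for (g1, g2), g3 in table.items():
    assert matmul(D[g1], D[g2]) == D[g3]
```

Since the matrices are exact permutation matrices, integer arithmetic suffices and the check is exact.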

Equivalent Representations
Equivalent representations means that, first, the spaces must have the same dimension, and, second, there must be a similarity transformation S, defining a change of basis, under which we get from one representation to the other, i.e.,

D(g) → D′(g) = S⁻¹ D(g) S. (12.65)

Indeed, in this case, the matrix elements are the same,

(D′(g))_ij ≡ ⟨e_i| D′(g) |e_j⟩ = ⟨e_i| S⁻¹ D(g) S |e_j⟩ = ⟨e′_i| D(g) |e′_j⟩ = (D(g))_ij, (12.66)

where in the second equality we have used the change of basis |e′_i⟩ = S|e_i⟩ and the unitarity of S, S⁻¹ = S†, which is true for Z_N and for all the classical Lie groups.

Reducible Representations

If there is an invariant subspace H of the representation vector space, i.e.,

∀ |h⟩ ∈ H, D(g)|h⟩ ∈ H, ∀ g ∈ G, (12.67)
we say that we have a reducible representation. If not, we say we have an irreducible one.

A reducible representation is always block-diagonal,

        ( D(h)   0      0    ··· )
D(g) =  (  0    D(h̃)    0        ).   (12.68)
        (  0     0     D(h̃̃)      )
        (  ···                   )
Indeed, in this case, the vector space splits as

( |h⟩  )
( |h̃⟩  ),   (12.69)
( |h̃̃⟩  )
such that, for each subspace, D(g)|h⟩ ∈ H, etc.
An irreducible representation is a representation that is not reducible.

Important Concepts to Remember

• A symmetry is a transformation on the variables of the Lagrangian that leaves the action invariant.
• The Noether theorem says that for any symmetry of the theory there is a conserved charge.
• For a continuous linear symmetry, δqi = ε^a (Ta)ij qj, the conserved charge (invariant in time) is Qa = (∂L/∂q̇i)(Ta)ij qj.
• The momenta pi are the charges associated with invariance under translations in qi , and the
Hamiltonian is the charge associated with time translation invariance.
• Linear transformations can be written as δqk = −{ε^a Qa, qk}P.B..
• Invariance in quantum mechanics means that [Q̂ a , Ĥ] = 0, corresponding to {Q a , H } P.B. = 0.
• Symmetry in quantum mechanics can be understood either as an active transformation, on states, |ψ⟩ → Û|ψ⟩, but not on operators, or as a passive transformation on operators, Â → Â′ = Û⁻¹ÂÛ, but not on states. Either way, the expectation values transform in the same manner.
• Symmetry transformations form a group.
• The simplest groups are the discrete cyclic groups Z_N; these can be represented by the Nth roots of unity. They are Abelian groups, i.e., g1 g2 = g2 g1, ∀ g1, g2 ∈ G.
• Groups have different representations; for Z N , we have the fundamental representation as Nth
roots of unity with a multiplication table, and the regular representation as real matrices acting on
the group elements as elements in an N-dimensional vector space, etc.
• Equivalent representations can be mapped to each other; reducible representations have block-
diagonal matrices and so can be reduced (split) into lower-dimensional representations; irreducible
representations cannot.

Further Reading
See [8] for more on group theory for quantum mechanics.

Exercises

(1) Consider the Lagrangian (q, q̃ ∈ C)

L = (m/2)(|q̇|² + |q̃˙|²) − V((q² + q̃²)²). (12.70)
What are its symmetries? To which representations of the group do q, q̃ belong?
(2) Calculate the Noether charge for the continuous symmetry(ies) in exercise 1, and check that
the infinitesimal variations of q, q̃ are indeed generated by the Noether charge via the Poisson
bracket with q, q̃.
(3) Consider two harmonic oscillators of the same mass and frequency, with Hamiltonian

H = (1/2m)(p₁² + p₂²) + (k/2)(x₁² + x₂²). (12.71)
Find the continuous symmetries of the system and write down the resulting conserved charges
as a function of the phase space variables. Then quantize the system, and show that the charges
do indeed commute with the quantum Hamiltonian.
(4) For the system in exercise 3, write down the equation of motion for x₁² + x₂² ≡ r² and then the corresponding Ehrenfest theorem equation and its transformed version under the continuous symmetries, in both the active and the passive sense.
(5) Consider the Lagrangian for q ∈ C

L = (m/2)|q̇|² − V(q), (12.72)
in the cases
V = 1/q³ + 1/q̄³ + 1/q⁵ + 1/q̄⁵ (I)
V = 1/|q|³ + 1/|q|⁵ (II) (12.73)
V = 1/q³ (III).
What are the symmetries in each case?
(6) Consider the matrices

     ( 1 0 0 )        ( 0 0 1 )
A =  ( 0 1 0 ),  B =  ( 0 1 0 ).   (12.74)
     ( 0 0 1 )        ( 1 0 0 )
(a) Do they form a representation of Z2 ? Why?
(b) If so, is the representation reducible?
(c) If so, is this a regular representation?
(d) If so, is it equivalent to the roots-of-unity representation?
(7) Write down the generalization of the regular representation for Z3 to the Z N case, for a cyclic
permutation by one step of the basis elements.
13 Symmetries in Quantum Mechanics II: Discrete Symmetries and Internal Symmetries

After analyzing mostly continuous spacetime symmetries (spatial and temporal translations), as well
as considering the general theory of symmetries and of discrete groups, we move on to applications
of discrete symmetries in quantum mechanics.

13.1 Discrete Symmetries: Symmetries under Discrete Groups

We start with Z2 -type symmetries (symmetries that can be described in terms of the group Z2 ). We
consider symmetries that are present at the classical level also:

• Parity symmetry P, or spatial inversion (like mirror reflection, but for all spatial coordinates instead
of just one). By definition, that means that the space gets a minus sign. But inverting the direction
of space also inverts the direction of momentum, so we have

x⃗ → x⃗′ = −x⃗, p⃗ → p⃗′ = −p⃗. (13.1)

In the presence of spin (understood as rotation around the direction of the momentum), spin does
not get inverted; see Fig. 13.1a.
• Time reversal invariance T, inverting the direction of time,

t → −t. (13.2)

Of course, unlike parity, which could be thought of as a simple change of reference frame, inverting
the direction of time is not physically possible. Moreover, we know that there are phenomena such
as the increase in entropy that define the arrow of time. But in this case, we simply mean “filming”
the evolution in time of a system and rolling it backwards. In other words, consider the dynamics
of the system and change t into −t in its equations.
• “Internal” symmetries, i.e., symmetries that are not associated with spacetime but rather with some
internal degrees of freedom. Examples are “charge conjugation”, which changes particles into
antiparticles, R-parity in supersymmetric theories, isospin parity in the Standard Model of particle
physics, etc.

Another example of a discrete symmetry is lattice translation (translation by a fixed amount a) in condensed matter, but we will not analyze it here.

150 13 Discrete Symmetries in Quantum Mechanics and Internal Symmetries

Figure 13.1 (a) Parity is understood as inversion in a mirror: momentum gets inverted, spin (rotation around the direction of the
momentum) does not. (b) Time reversal invariance: both momentum and spin get inverted.

13.2 Parity Symmetry

Corresponding to the classical parity symmetry acting on coordinates and momenta by inverting them, we defined a parity operator π̂ that acts on states and/or operators. Specifically, for eigenstates of X̂ and P̂, in the active point of view,

π̂|x⟩ = |−x⟩, π̂|p⟩ = |−p⟩. (13.3)
Equivalently, in the passive point of view, with the parity operator acting on operators,

X̂′ = π̂⁻¹ X̂ π̂ = −X̂, P̂′ = π̂⁻¹ P̂ π̂ = −P̂. (13.4)
This means that [π̂, X̂] ≠ 0, [π̂, P̂] ≠ 0.
Either way, we find the correct transformation law for the quantum averages of operators, which by the Ehrenfest theorem are the quantum equivalents of the classical transformations,

⟨x⟩′ψ = ⟨ψ| π̂⁻¹ X̂ π̂ |ψ⟩ = −⟨ψ| X̂ |ψ⟩ = −⟨x⟩ψ
⟨p⟩′ψ = ⟨ψ| π̂⁻¹ P̂ π̂ |ψ⟩ = −⟨ψ| P̂ |ψ⟩ = −⟨p⟩ψ. (13.5)
Moreover, we can find the action of parity on wave functions ψ(x) = ⟨x|ψ⟩ by introducing the completeness relation for |x′⟩ states in ⟨x|π̂|ψ⟩,

⟨x| π̂ |ψ⟩ = ∫ dx′ ⟨x| π̂ |x′⟩⟨x′|ψ⟩ = ∫ dx′ ⟨x|−x′⟩⟨x′|ψ⟩ = ⟨−x|ψ⟩ = ψ(−x), (13.6)

where we have used the orthonormality relation ⟨x′|x⟩ = δ(x − x′).

Some observations about the parity operator: carrying out the classical parity operation twice, we get back to the same situation. Correspondingly, we find that the parity operator applied twice also gives the identity,

π̂²|x⟩ = |x⟩ ⇒ π̂² = 1̂ ⇒ π̂⁻¹ = π̂. (13.7)
Since parity is a symmetry transformation, π̂ is also unitary, according to the general theory, π̂† = π̂⁻¹; together with π̂⁻¹ = π̂, this means π̂ is also Hermitian.

Since π̂² = 1̂, the eigenvalues of π̂ are ±1, so for parity eigenstates we have

π̂|ψ⟩ = ±|ψ⟩. (13.8)
Then, for the wave functions of these parity eigenstates we have

ψ(−x) = ±ψ(x), (13.9)

which are called even parity and odd parity, respectively.


But not all states are parity eigenstates. In particular, eigenstates of X̂, P̂ are not, since, as we saw, π̂|x⟩ = |−x⟩ and π̂|p⟩ = |−p⟩. Also, as we saw, [π̂, X̂] ≠ 0, [π̂, P̂] ≠ 0, so we cannot have a simultaneous eigenstate of X̂ or P̂ and of π̂.
However, we can have a Hamiltonian that is invariant under parity,

π̂ Ĥ π̂⁻¹ = Ĥ ⇒ [π̂, Ĥ] = 0, (13.10)

which means that such a Hamiltonian can have simultaneous eigenstates with π̂. Then, if ([π̂, Ĥ] = 0 and) |n⟩ is an eigenstate of Ĥ, Ĥ|n⟩ = En|n⟩, and |n⟩ is nondegenerate, it follows that |n⟩ is also a π̂ eigenstate.

Example: Harmonic Oscillator


For the harmonic oscillator, the ground state |0⟩ is parity even, since the wave function is symmetric around 0. Since

a† = α X̂ + β P̂, α, β ∈ C, (13.11)

which is parity odd (since both X̂ and P̂ are parity odd), it follows that the first excited state is also parity odd, since

|1⟩ ∝ a†|0⟩, (13.12)

and, more generally,

|n⟩ ∝ (a†)ⁿ|0⟩ ⇒ π̂|n⟩ = (−1)ⁿ|n⟩, (13.13)

so the parity of the state |n⟩ is (−1)ⁿ.

Parity Selection Rules


We can also consider selection rules, meaning rules for a matrix element to be nonzero.
If |α⟩, |β⟩ are parity eigenstates, |α⟩ with parity εα and |β⟩ with parity εβ, then (since X̂, P̂ are odd under parity)

⟨β| X̂ |α⟩ = 0 = ⟨β| P̂ |α⟩, (13.14)

unless εα = −εβ (if not, the states |β⟩ and X̂|α⟩, P̂|α⟩ are orthogonal, corresponding to different eigenvalues of the Hermitian operator π̂).
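Both statements — the (−1)ⁿ parity of the oscillator state |n⟩ and the resulting selection rule for ⟨β|X̂|α⟩ — can be checked numerically with (unnormalized) Hermite functions; a sketch with a crude quadrature and hypothetical parameters:

```python
import math

def hermite(n, x):
    # physicists' Hermite polynomials via H_{n+1} = 2x H_n - 2n H_{n-1}
    h0, h1 = 1.0, 2.0 * x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, 2.0 * x * h1 - 2.0 * k * h0
    return h1

def psi(n, x):
    # oscillator eigenfunction, up to normalization
    return hermite(n, x) * math.exp(-x * x / 2.0)

# parity of |n> is (-1)^n: psi_n(-x) = (-1)^n psi_n(x)
for n in range(6):
    for x in (0.3, 1.1, 2.4):
        assert abs(psi(n, -x) - (-1)**n * psi(n, x)) < 1e-9

def melem_x(m, n, dx=1e-3, L=8.0):
    # crude Riemann sum for <m| X |n>
    steps = int(2 * L / dx)
    return sum(psi(m, -L + i * dx) * (-L + i * dx) * psi(n, -L + i * dx)
               for i in range(steps)) * dx

assert abs(melem_x(0, 2)) < 1e-6  # same parity: selection rule forbids it
assert abs(melem_x(0, 1)) > 1e-3  # opposite parity: allowed
```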

13.3 Time Reversal Invariance, T

As we mentioned, the time reversal operator corresponds to the reversal of motion, meaning running the dynamics of the theory backwards, which includes inverting the momentum. This leads to the classical action of T,

x⃗ → x⃗, p⃗ → −p⃗. (13.15)

In classical mechanics, if x⃗(t) is a solution to the Newtonian equation of motion,

m x⃗̈ = −∇⃗ V, (13.16)

then x⃗(−t) is also a solution, since making the replacement t → −t amounts to replacing d²/dt² by d²/d(−t)² in the equation, which leaves it invariant. Defining this as a new solution x⃗′(t) ≡ x⃗(−t), we indeed find that

(d²/dt²) x⃗′(t) = (d²/dt²) x⃗(−t), (13.17)

so x⃗′(t) satisfies the same equation of motion.
On the other hand, in quantum mechanics, the Schrödinger equation,

iℏ ∂ψ/∂t = (−(ℏ²/2m)∇⃗² + V(r⃗)) ψ, (13.18)
is not invariant under time inversion, because of the presence of a single time derivative: if ψ(x⃗, t) is a solution, then ψ(x⃗, −t) is not a solution, since the time reversed equation is

−iℏ ∂ψ/∂t = (−(ℏ²/2m)∇⃗² + V(r⃗)) ψ. (13.19)
However, ψ*(x⃗, −t) is a solution to (13.19). Indeed, the complex conjugate of the Schrödinger equation is

−iℏ ∂ψ*/∂t = (−(ℏ²/2m)∇⃗² + V(r⃗)) ψ*, (13.20)

which is the same as the time reversed equation, except that ψ is replaced by ψ*.
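A quick finite-difference check (free particle, units ℏ = m = 1, hypothetical numerical values) that the complex-conjugated, time-reversed plane wave again satisfies the Schrödinger equation:

```python
import cmath

# plane wave psi(x, t) = exp(i(kx - w t)) with w = k^2/2 solves the free
# Schrodinger equation (hbar = m = 1); check that psi*(x, -t) does too
k = 1.3
w = k * k / 2.0

def psi_T(x, t):
    # psi*(x, -t) for the plane wave above
    return cmath.exp(1j * (k * x + w * t)).conjugate()

def residual(f, x, t, h=1e-4):
    # i d/dt f + (1/2) d^2/dx^2 f, by central differences; zero for a solution
    dt = (f(x, t + h) - f(x, t - h)) / (2 * h)
    dxx = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h**2
    return 1j * dt + 0.5 * dxx

assert abs(residual(psi_T, 0.7, 0.4)) < 1e-5
```

The result is a plane wave of momentum −k and the same energy, exactly the "motion run backwards".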
Then, for an energy eigenfunction, the wave function is stationary, and we have

ψ(x⃗, t) = ψn(x⃗) e^{−iEn t/ℏ} ⇒ ψ*(x⃗, −t) = ψ*n(x⃗) e^{−iEn t/ℏ}. (13.21)
Considering a fixed time, such as t = 0, we obtain

ψn(x⃗) = ⟨x⃗|n⟩ → (under T) ψ*n(x⃗) = (⟨x⃗|n⟩)* = ⟨n|x⃗⟩. (13.22)
This suggests that the T̂ operator in quantum mechanics is an antilinear and unitary operator, or anti-unitary operator, so (according to the mathematics reviewed in the first few chapters)

T̂(c|ψ⟩) = c* T̂(|ψ⟩), ∀ c ∈ C. (13.23)
By definition, in the passive point of view the action of T̂ on the operators X̂, P̂ is the equivalent of the classical action on x, p, i.e.,

T̂ X̂ T̂⁻¹ = X̂, T̂ P̂ T̂⁻¹ = −P̂. (13.24)

Therefore
T̂ 2 = 1̂ ⇒ T̂ −1 = T̂, (13.25)
though only in the case without spin. As we will see later, on states with spin, the action of T̂ is
different.
Having T as an anti-unitary operator solves another issue. Indeed, for a time-reversal invariant system, evolving a state by an infinitesimal time dt after the action of T should be equivalent to the action of T after a time evolution by −dt. Since in the infinitesimal case Û(dt, 0) ≃ 1̂ − iĤ dt/ℏ, we obtain
(1̂ − (i/ℏ) Ĥ dt) T̂|ψ⟩ = T̂ (1̂ − (i/ℏ) Ĥ(−dt)) |ψ⟩ ⇒
−i Ĥ T̂|ψ⟩ = T̂ (i Ĥ |ψ⟩). (13.26)
In the above, we have divided by dt/ℏ; both quantities in the ratio are real numbers, so the ratio can be taken out of both linear and antilinear operators.
However, we cannot divide by i, as for a linear operator, and conclude that Ĥ T̂ = −T̂ Ĥ, which would be nonsensical, since it would imply that

Ĥ T̂|En⟩ = −T̂ Ĥ|En⟩ = −En T̂|En⟩, (13.27)

which would give negative eigenvalues of the Hamiltonian, and this is physically impossible.
Instead, if we accept that T̂ is antilinear, which is consistent with the previously derived action on wave functions, ψn(x⃗) → ψ*n(x⃗) under T, then when dividing out the i through T̂ it becomes −i, so we obtain
Ĥ(T̂|ψ⟩) = T̂(Ĥ|ψ⟩), (13.28)
which can be stated as an action on the Hamiltonian,
Ĥ T̂ = T̂ Ĥ ⇒ T̂ Ĥ T̂ −1 = Ĥ, (13.29)
saying that the Hamiltonian is time-reversal invariant, which is consistent with classical physics and
our assumption.
We note that in the above analysis, we didn't need to define the action of T̂ on a "bra" state, ⟨χ|T̂, which would be hard to do since the notation of bra and ket states was defined for linear operators.
For a general observable, associated with a Hermitian operator Â, if the observable has a well-
defined action under T, we have
T̂ ÂT̂ −1 = ± Â, (13.30)
which means  is even or odd under time reversal. Then, for quantum averages, we have
⟨ψ| Â |χ⟩ = ±⟨T̂ψ| Â |T̂χ⟩, (13.31)
which is consistent with the Ehrenfest theorem.
Coming back to the action of T̂ on wave functions, we note that the antilinear property of T̂ implies that action as well: since

T̂(c|x⟩) = c* T̂|x⟩ = c*|x⟩, (13.32)
inserting the |x⟩ completeness relation in front of a state |ψ⟩,

|ψ⟩ = ∫ dx |x⟩⟨x|ψ⟩ = ∫ dx |x⟩ ψ(x), (13.33)

implies the action of T̂ on it as

T̂|ψ⟩ = ∫ dx ψ*(x) |x⟩, (13.34)

so we re-obtain the previously found action ψ(x) → ψ*(x) under T.
We note two more properties of T̂:

• If Ĥ is invariant under T̂, and the energy eigenstates |n⟩ are nondegenerate, then ⟨x|n⟩ is real, since

T : ⟨x|n⟩ → ⟨x|n⟩* = ⟨x|n⟩. (13.35)


• If a system has spin S⃗, the spin is odd under time reversal, S⃗ → −S⃗. The reason is that we can think of the spin as a sort of rotation around the axis of the momentum, and running time backwards means the rotation changes direction; see Fig. 13.1b. In quantum theory, this means that

T̂ Ŝ T̂⁻¹ = −Ŝ. (13.36)

A more precise notion has to wait for a better description of spin; here we merely note that for a state of spin j, we have

T̂²|spin j⟩ = (−1)^{2j} |spin j⟩. (13.37)

For spinless states (j = 0), we recover T̂² = 1̂.

13.4 Internal Symmetries

Having dealt with discrete spacetime symmetries, we turn to discrete internal symmetries, having to
do with internal degrees of freedom.
Consider our prototype, the Z N group, specifically Z2 , for the case where there is no spacetime
interpretation (unlike for π̂, T̂).
We considered the example of a Z_N-invariant system with complex variable q̂ and Hamiltonian

Ĥ = m |q̂˙|²/2 + V(q̂^N). (13.38)
Then, indeed,

ĝ Ĥ ĝ −1 = Ĥ, ∀g ∈ Z N . (13.39)

Since therefore [ĝ, Ĥ] = 0, we obtain a degeneracy of states, with ĝ|En⟩ having the same energy En as |En⟩ for all g ∈ Z_N, meaning that there are N related states of the same energy. However, since in this q space the g_k are represented as complex phases, g_k = e^{2πik/N}, the states all represent the same physical state.
Another example is better suited to show the relevant degeneracy: consider another representation of Z_N. In the case of Z2, we can choose the representation in a two-dimensional vector space permuting the group elements,

        ( 1 0 )          ( 0 1 )
D(e) =  ( 0 1 ),  D(b) = ( 1 0 ),   (13.40)

where e and b are elements of Z2. This representation can be used to construct a relevant case of a Z2-invariant Hamiltonian, with two real variables q1 and q2,

Ĥ = (m/2)(q̂˙₁² + q̂˙₂²) + V(q̂₁ · q̂₂). (13.41)
Indeed, the Hamiltonian is invariant under the above representation of Z2, with the only nontrivial element D(b) acting on the vector space (q1, q2) by interchanging them. Interchanging q1 and q2 leaves Ĥ invariant. Then, in this case also, |En⟩ and D(b)|En⟩ are states degenerate in energy, but now this means something nontrivial, as D(b) interchanges q1 and q2, so one obtains a different energy eigenstate.
We have given an example of Z2 invariance, where there are two group elements and correspond-
ingly two states (degeneracy 2), but for a finite discrete group with N elements we have N states, so
the degeneracy is N.

13.5 Continuous Symmetry

Above we have considered only discrete internal symmetries, but we can also have continuous
symmetries that can be both internal (considered later on in the book) and spacetime, e.g., rotations,
which will be considered in the next chapter.
As an example of continuous symmetry, consider the SO(2) rotation, with matrix

g = g(α) = (  cos α   sin α )
            ( −sin α   cos α ),   ∀ α ∈ [0, 2π]. (13.42)
This matrix acts on a two-dimensional vector (x, y)ᵀ (for a rotational symmetry in the plane defined by Cartesian coordinates x and y) or (q1, q2)ᵀ (for an internal symmetry acting on two real variables as before).
Consider the case of an internal symmetry for a system with two variables q1 and q2,

L = (m/2)(q̇₁² + q̇₂²) − V(q₁² + q₂²) = T − V ⇒
Ĥ = (m/2)(q̂˙₁² + q̂˙₂²) + V(q̂₁² + q̂₂²) = T̂ + V̂. (13.43)
But then, under the SO(2) internal symmetry, the terms in the Hamiltonian transform as

q₁² + q₂² → (q₁ cos α + q₂ sin α)² + (−q₁ sin α + q₂ cos α)² = q₁² + q₂²
q̇₁² + q̇₂² → (q̇₁ cos α + q̇₂ sin α)² + (−q̇₁ sin α + q̇₂ cos α)² = q̇₁² + q̇₂², (13.44)

which means that both T and V are independently invariant.

13.6 Lie Groups and Algebras and Their Representations

The group SO(2) considered above is an example of a Lie group, a continuous group depending on continuous parameter(s) α^a, so that g = g(α). It is also Abelian, though in fact it is the only Abelian Lie group, up to equivalences.

For a Lie group, by convention we have g(α^a = 0) = e. Then, Taylor expanding in α^a around α = 0, we have

D(g(α^a)) ≃ 1̂ + i dα^a X_a + · · · , (13.45)
which means that the constant elements X_a, called the generators of the Lie group, are found as

X_a ≡ −i ∂D(g(α^a))/∂α^a |_{α^a = 0}. (13.46)
The factor i is conventional (there is another convention without it), but it is chosen because if D(g)
is unitary, so that [D(g)]† = [D(g)]−1 , then X a is Hermitian (X a† = X a ).
Lie groups are named after Sophus Lie, who showed that the generators above can be defined
independently of the particular representation and also defined the Lie algebra, which will be
described shortly.
For a finite group transformation by a parameter α^a, we can consider infinitesimal pieces dα^a = α^a/k, with k → ∞, and then

D(g(α)) = lim_{k→∞} (1 + iα^a X_a/k)^k = exp(iα^a X_a). (13.47)
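For the simplest case of a single generator X = 1 (the U(1) case below), the limit defining the exponential can be tested numerically; a sketch with a hypothetical α:

```python
import cmath

alpha = 0.9
exact = cmath.exp(1j * alpha)

# (1 + i alpha / k)^k approaches e^{i alpha} as k grows; the error shrinks ~ 1/k
err_coarse = abs((1 + 1j * alpha / 10) ** 10 - exact)
err_fine = abs((1 + 1j * alpha / 100000) ** 100000 - exact)
assert err_fine < err_coarse  # convergence toward the exponential
assert err_fine < 1e-4
```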
The simplest example of a Lie group is the group of unit-modulus complex numbers, U(1), which is equivalent to the SO(2) group. Indeed, consider

D(g(α)) = e^{iα}, (13.48)
when acting on a complex number q = q1 + iq2. Then

q → e^{iα} q = (q1 cos α − q2 sin α) + i(q1 sin α + q2 cos α), (13.49)
meaning that we have an action

( q1 )     (  cos α   −sin α ) ( q1 )
( q2 )  →  (  sin α    cos α ) ( q2 ),   (13.50)

equivalent to the SO(2) action above by a permutation of elements.
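This equivalence between multiplication by e^{iα} and the 2 × 2 rotation action can be checked in a few lines (hypothetical numerical values):

```python
import cmath
import math

# multiply q = q1 + i q2 by a phase and compare with the rotation matrix
alpha = 0.6
q1, q2 = 1.2, -0.5
q = complex(q1, q2) * cmath.exp(1j * alpha)

# rotation-matrix action on (q1, q2)
r1 = math.cos(alpha) * q1 - math.sin(alpha) * q2
r2 = math.sin(alpha) * q1 + math.cos(alpha) * q2

assert abs(q.real - r1) < 1e-12 and abs(q.imag - r2) < 1e-12
```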

Lie Algebra
A Lie algebra is the algebra (defined by a commutator) satisfied by the generators X a , X a ∈ L(G).
The commutator acts as a product on the Lie algebra space L(G), [, ] : L(G) × L(G) → L(G).
Because of the group property, ∀ g, h ∈ G,

g = e^{iα^a X_a}, h = e^{iβ^a X_a} ⇒ g · h = e^{iγ^a X_a}. (13.51)

Then, after a somewhat simple analysis, one can show that this implies that we have the "algebra"

[X_a, X_b] = i f_ab^c X_c, (13.52)

which defines the constants f ab c , known as “structure constants”. From its definition, we find the
antisymmetry property

f ab c = − f ba c . (13.53)

In fact, the structure constants are completely antisymmetric in a, b, c, which can be shown by
considering the Jacobi identity, an identity (of the type 0 = 0) described as
[X a , [X b , X c ]] + cyclic permutations of (a, b, c) = 0, (13.54)
which implies, by calculating both the commutators in (13.54) in terms of structure constants,
f_bc^d f_ad^e + f_ab^d f_cd^e + f_ca^d f_bd^e = 0. (13.55)

Representations for Lie Groups


Lie groups can be represented as matrices: e.g., $SO(n)$ can be represented by special (of determinant 1) orthogonal matrices, $SU(N)$ by special unitary matrices, etc. This is called the fundamental representation. But there are other representations, denoted by $R$, in which the generators $T_a$ are matrices $(T_a^{(R)})_i{}^j$. Another important representation is the adjoint one, defined by
$$(T_a)_b{}^c = -i f_{ab}{}^c. \qquad (13.56)$$
It is indeed a representation, since we have
$$([T_a, T_b])_c{}^e = i f_{ab}{}^d (T_d)_c{}^e, \qquad (13.57)$$
which is true owing to the Jacobi identity, as we can easily check.
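This can be checked numerically for the case relevant below, the structure constants $f_{abc} = \epsilon_{abc}$ of $su(2)/so(3)$ (a sketch, with my own variable names; indices run 0..2 instead of 1..3):

```python
import numpy as np

# totally antisymmetric Levi-Civita symbol, the structure constants of su(2)/so(3)
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[a, c, b] = 1.0, -1.0

# adjoint representation (T_a)_{bc} = -i f_{abc}, eq. (13.56)
T = -1j * eps

# check the representation condition [T_a, T_b] = i f_{ab}^d T_d, eq. (13.57)
for a in range(3):
    for b in range(3):
        comm = T[a] @ T[b] - T[b] @ T[a]
        rhs = 1j * sum(eps[a, b, d] * T[d] for d in range(3))
        assert np.allclose(comm, rhs)
```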

Important Concepts to Remember

• Discrete symmetries are: parity, time reversal invariance, lattice translation, and internal.
• Parity eigenstates, $\hat\pi|\psi\rangle = \pm|\psi\rangle$, have even parity (the $+$ eigenvalue) or odd parity (the $-$ eigenvalue), but not all states are parity eigenstates in a parity-invariant theory.
• For the harmonic oscillator (which is parity invariant), the parity of the state $|n\rangle$ is $(-1)^n$, but $|x\rangle$ and $|p\rangle$ are not parity eigenstates.
• A selection rule for some particular symmetry is a rule for a matrix element to be nonzero on the basis of the symmetry.
• The Schrödinger equation is not time-reversal invariant; one needs to take the complex conjugate also of $\psi$. A related feature is that $\hat T$ (the time-reversal operator in quantum mechanics) is antilinear and unitary, i.e., anti-unitary.
• In a time-reversal-invariant theory, $\hat T \hat H \hat T^{-1} = \hat H$, leading to $\langle x|n\rangle \in \mathbb{R}$.
• The action of time reversal on spin is $\hat T \hat{\vec S} \hat T^{-1} = -\hat{\vec S}$, and $\hat T^2 |\text{spin } j\rangle = (-1)^{2j} |\text{spin } j\rangle$.
• Lie groups are continuous groups depending on continuous parameters, with generators Ta that are
(−i) times the derivative of the group element with respect to the parameter, at zero parameter, so
D(g(α)) = exp (iα a X a ).
• The Lie algebra is the algebra of the generators of the Lie group, $[X_a, X_b] = f_{ab}{}^c X_c$, with $f_{ab}{}^c$ the structure constants, satisfying the Jacobi identity, $f_{bc}{}^d f_{ad}{}^e + \text{cyclic permutations of } (a,b,c) = 0$.
• The group SO(2), for the rotation of two real objects, is equivalent to the group U (1) with group
element eiα rotating a complex element.
• For the classical Lie groups we have the fundamental representation, on which the generators act
as matrices on a space, and the adjoint representation, for which (Ta )b c = −i f ab c , and other
representations.

Further Reading
See [8] for more about groups and quantum mechanics.

Exercises

(1) Consider a three-dimensional harmonic oscillator (a harmonic oscillator with the same mass and
frequency in each direction). Is it parity invariant? If so, what is the parity of a generic state?
(2) Consider the probability density of the nth state of the one-dimensional harmonic oscillator. Is
it invariant under time-reversal invariance? How about under parity?
(3) Consider a system with two degrees of freedom in one dimension, $q_1$ and $q_2$, and Hamiltonian
$$H = \frac{p_1^2}{2} + \frac{p_2^2}{2} + V(|q_1|) + V(|q_2|) + V_{12}(|q_1 - q_2|). \qquad (13.58)$$
Does it have any continuous internal symmetries?
(4) Consider the algebra for three generators A, B, C,
[A, B] = C, [B, C] = A, [C, A] = B. (13.59)
Is it a Lie algebra? If so, write its adjoint representation in terms of matrices.
(5) Consider a Hamiltonian for two spins,
$$H = \alpha\left(\vec S_1^2 + \vec S_2^2\right) + \beta\, \vec S_1 \cdot \vec S_2. \qquad (13.60)$$
Is the system time-reversal invariant? What about parity invariant?
(6) What is the dimension of the adjoint representation of SU (N ), N > 2? Is this adjoint represen-
tation equivalent to the fundamental representation?
(7) Consider the following Lagrangian for $N$ degrees of freedom $q_i$,
$$L = \sum_{i=1}^N \frac{|\dot q_i|^2}{2} - V\left(\sum_{i=1}^N |q_i|^2\right). \qquad (13.61)$$
If qi ∈ R, what is the internal continuous symmetry group? What about if qi ∈ C?
14 Theory of Angular Momentum I: Operators, Algebras, Representations

In this chapter we start the analysis of angular momentum. We first analyze the rotation group SO(3)
and its equivalence to the group SU (2). We then define representations of both groups in terms of
values for angular momentum (or “spin”).

14.1 Rotational Invariance and SO(n)

A rotation is a linear transformation on a system: a transformation either of the coordinates or of the system, depending on whether one takes a passive or an active point of view, that leaves the lengths $|\Delta \vec r_{ij}| = |\vec r_i - \vec r_j|$ of objects (the distances between points) invariant.
Considering a rotation around a point $O$ with position vector $\vec r_O$, with linear transformation
$$(\vec r\,' - \vec r_O)^a = \Lambda^a{}_b (\vec r - \vec r_O)^b, \qquad (14.1)$$
where $a, b = 1, 2, 3$ (these are spatial indices), by subtracting the relation for two points $i$ and $j$ we find that indeed
$$(\vec r_i\,' - \vec r_j\,')^a = \Lambda^a{}_b (\vec r_i - \vec r_j)^b. \qquad (14.2)$$

Imposing invariance, so that $|\vec r_i\,' - \vec r_j\,'| = |\vec r_i - \vec r_j|$, we find that
$$|\vec r_i\,' - \vec r_j\,'|^2 = \Lambda^a{}_b (\vec r_i - \vec r_j)^b\, \Lambda_{ac} (\vec r_i - \vec r_j)^c = (\vec r_i - \vec r_j)^b (\Lambda^T \Lambda)_{bc} (\vec r_i - \vec r_j)^c = (\vec r_i - \vec r_j)^b \delta_{bc} (\vec r_i - \vec r_j)^c, \qquad (14.3)$$
which means that the matrix $\Lambda$ defining the linear transformation obeys
$$\Lambda^T \Lambda = 1 \;\Rightarrow\; \Lambda^T = \Lambda^{-1}, \qquad (14.4)$$

and thus is an orthogonal matrix. These orthogonal matrices form a group, since if $\Lambda_1^T = \Lambda_1^{-1}$ and $\Lambda_2^T = \Lambda_2^{-1}$, then
$$(\Lambda_1 \Lambda_2)^T = \Lambda_2^T \Lambda_1^T = \Lambda_2^{-1} \Lambda_1^{-1} = (\Lambda_1 \Lambda_2)^{-1}, \qquad (14.5)$$

so their product is also orthogonal. The group of orthogonal matrices is called the orthogonal group
and is denoted by O(3) in the case of 3 × 3 matrices acting on three-dimensional vectors (spatial
vectors).
Moreover, using the facts that $\det(A \cdot B) = \det A \cdot \det B$ and $\det A^T = \det A$, we find that
$$\det(\Lambda^T \Lambda) = \det\Lambda\, \det\Lambda^T = (\det\Lambda)^2 = \det 1 = 1, \qquad (14.6)$$

implying that $\det\Lambda = \pm 1$. But that is easily understood: $O(3)$ contains also the parity operation $P$, which as we saw acts as $P\vec r = \vec r\,' = -\vec r$, and which is not an operation continuously connected with the identity, as is needed for a Lie group.
This means that, in order to obtain a Lie group we need to eliminate P = −1 from O(3) by
imposing the special condition, det Λ = +1. This condition respects the group property since if
det Λ1 = det Λ2 = +1 then

det Λ1 Λ2 = det Λ1 det Λ2 = 1. (14.7)

We thus define the special orthogonal group, here SO(3), for 3 × 3 matrices. We have

SO(3) = O(3)/Z2 , (14.8)

where Z2 = {+1, −1}, so O(3) corresponds to two copies of the Lie group SO(3).
The analysis above trivially generalizes to n dimensions, for n×n matrices acting on n-dimensional
vectors, forming the group O(n) with Lie subgroup SO(n) = O(n)/Z2 . The Lie algebra of SO(n)
will be denoted by lower case letters, as so(n).

14.2 The Lie Groups SO(2) and SO(3)

We have that $SO(2)$, the Abelian Lie group studied before, is a subgroup: $SO(2) \subset SO(3)$. As we saw, the matrix (which we can easily see is orthogonal and special)
$$M = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad (14.9)$$
acts on two-dimensional spatial vectors $(x, y)^T$ as
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}. \qquad (14.10)$$

Then we have rotational invariance, $x'^2 + y'^2 = x^2 + y^2$, and the action of this $SO(2)$ matrix is equivalent to the action of a unitary number (a complex phase) on a complex number, i.e., to the action of
$$M = e^{i\theta} = \cos\theta + i\sin\theta \qquad (14.11)$$
on $z = x + iy$, such that
$$Mz = z' = x' + iy', \qquad (14.12)$$
as we can easily check, so $SO(2) \simeq U(1)$ (where $U(1)$ is the group of unitary numbers, i.e., phases).
This $SO(2) \simeq U(1)$ Abelian group (the unique compact connected one-dimensional Lie group) is a subgroup of any compact Lie group, in particular of any $SO(n)$ for $n \ge 2$. The reason is that we can always pick two coordinates out of the $n$ on which $SO(n)$ acts and rotate them into each other, hence defining an $SO(2) \subset SO(n)$.

Figure 14.1 Euler angles parametrizing a rotation in three-dimensional space: two angles, θ and ψ, relate to the Cartesian coordinate system, and one, φ, corresponds to a rotation around the axis itself.

The Group SO(3)


Considering now rotations in three-dimensional space, we can always pick a planar ($SO(2)$) rotation of angle $\phi$ around a fixed axis $\hat z$. Choosing the coordinate system in such a way that this axis $\hat z$ is in the third direction, $\vec e_z = (0, 0, 1)^T$, we then have
$$R_3(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (14.13)$$

However, in general the vector ẑ, in a fixed coordinate system with Cartesian directions 1̂, 2̂, 3̂, is
defined by two angles, θ and ψ. Consider θ to be the angle made by ẑ with 3̂, and project ẑ onto the
plane defined by (1̂, 2̂). Then ẑ makes an angle π/2 − θ with its projection onto the (1̂, 2̂) plane, and
the projection makes an angle ψ with 2̂, as in Fig. 14.1. The angles φ, θ, ψ are known as the Euler
angles.
To obtain the general rotation, we first rotate by ψ around 3̂, thus aligning the projection of ẑ
with 2̂, and then rotate around 1̂ by θ, aligning ẑ with 3̂, and finally rotate by φ around it. Then, the
general three-dimensional rotation parametrized by the Euler angles is

$$g(\phi, \theta, \psi) = g(\phi)g(\theta)g(\psi) = R_3(\phi)R_1(\theta)R_3(\psi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (14.14)$$

To obtain the most general rotation axis ẑ in three-dimensional space, the rotation range of ψ is
[0, 2π), and then of θ is [0, π). Finally, the rotation by φ has the standard [0, 2π) range.
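The Euler-angle construction of eq. (14.14) can be verified numerically to produce a special orthogonal matrix (a sketch, not from the book; the helper names are mine):

```python
import numpy as np

def R3(a):
    # rotation in the (1,2) plane, eq. (14.13)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def R1(a):
    # rotation in the (2,3) plane
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

phi, theta, psi = 0.4, 1.1, -2.3
g = R3(phi) @ R1(theta) @ R3(psi)   # general rotation, eq. (14.14)

assert np.allclose(g.T @ g, np.eye(3))    # orthogonal, eq. (14.4)
assert np.isclose(np.linalg.det(g), 1.0)  # special
```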

There are three U (1)  SO(2) Abelian subgroups in SO(3), corresponding to the three Euler
angles, and three ways to pick two coordinates for the rotation plane from the three coordinates.
To find the Lie algebra of SO(3), we define its three generators Ji as being associated with the three
Euler angles. Each corresponds to rotation in a plane. For instance, R3 (φ) corresponds to rotation in
the (1, 2) plane, and, for an infinitesimal φ, we obtain

$$R_3(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \simeq \begin{pmatrix} 1 & -\phi & 0 \\ \phi & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + \mathcal{O}(\phi^2) = 1 + \phi\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \mathcal{O}(\phi^2). \qquad (14.15)$$
Then the generator for this rotation around the 3-direction is

$$iT_3 = iJ_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (14.16)$$
and we can do a similar calculation for the other directions. The group elements for finite rotations
by angles θi , i = 1, 2, 3, are

$$\exp\left(i\theta^i J_i\right) \equiv \exp\left(i\epsilon_{ijk}\,\omega^{jk} J_i\right). \qquad (14.17)$$

Moreover, unitarity of the infinitesimal matrix means that
$$(1 + \theta^i\, iJ_i)^\dagger = 1 + \theta^i (iJ_i)^\dagger = (1 + \theta^i\, iJ_i)^{-1} = 1 - \theta^i\, iJ_i, \qquad (14.18)$$
and for real $iJ_i$ we find an antisymmetric matrix, as above for $iJ_3$.
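The generator extraction of eqs. (14.15)-(14.16) and the re-exponentiation limit of eq. (13.47) can be checked numerically (a sketch with my own names, avoiding any matrix-exponential library by using the defining limit directly):

```python
import numpy as np

# the generator of rotations around the 3-direction, eq. (14.16)
iJ3 = np.array([[0., -1., 0.],
                [1.,  0., 0.],
                [0.,  0., 0.]])

# re-sum the infinitesimal pieces: (1 + phi iJ3 / k)^k -> exp(phi iJ3), eq. (13.47)
phi, k = 0.9, 100000
step = np.eye(3) + (phi / k) * iJ3
R_limit = np.linalg.matrix_power(step, k)

# compare with the finite rotation R_3(phi), eq. (14.13)
c, s = np.cos(phi), np.sin(phi)
R_exact = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

assert np.allclose(R_limit, R_exact, atol=1e-4)
```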

Rotationally Invariant Systems


For a rotationally (SO(3)) invariant Lagrangian L (and thus Hamiltonian Ĥ), for instance
$$L = \sum_i \frac{m_i}{2}\, \dot{\vec r}_i^{\,2} - V(|\vec r_i - \vec r_j|) = \sum_i \frac{m_i}{2}\left(\frac{d}{dt}(\vec r_i - \vec r_O)\right)^2 - V\left((\vec r_i - \vec r_j)^2\right), \qquad (14.19)$$

the Lagrangian of the system must be in a representation of the rotation group SO(3). Here, as we can
see, we have the fundamental representation, where the rotation matrices act on the three-dimensional
position vectors r .
However, the state of the system, |ψ, need not be in the same representation, but can be in an
arbitrary representation (since it is not related to x as a vector space).
Before discussing general representations, we consider the equivalence of SO(3) and SU (2) in the
Lie algebra.

14.3 The Group SU(2) and Its Isomorphism with SO(3) Mod Z2

Consider the group of unitary matrices, i.e., the complex matrices U satisfying U † = U −1 . Then,
indeed, if U1† = U1−1 and U2† = U2−1 ,

(U1U2 ) † = U2†U1† = U2−1U1−1 = (U1U2 ) −1 , (14.20)



which means that unitarity respects the group property. The group of unitary n × n matrices is called
U (n) and acts on an n-dimensional complex vector.
For a unitary matrix U,

$$\det(U^\dagger U) = (\det U)^* \det U = \det 1 = 1 \;\Rightarrow\; |\det U| = 1 \;\Rightarrow\; \det U = e^{i\alpha}, \qquad (14.21)$$

which means that the determinant of the unitary matrix forms a U (1) group.
Imposing the special condition det U = +1, which respects the group property as we saw, we obtain
the special unitary group, SU (n). However, now (since the determinant forms a U (1) group), U (n) 
SU (n) × U (1), modulo topological issues. In fact,

U (n)  (SU (n) × U (1))/Zn . (14.22)

In Lie algebra, u(n) = su(n) × u(1). In the relevant case of n = 2, we have also

SO(3)  SU (2)/Z2 , (14.23)

which means that SU (2) winds around SO(3) twice. Indeed, −12×2 ∈ SU (2), but −13×3 is not in
SO(3).
As the SO(2) ⊂ SO(3) rotation equals U (1) ⊂ SU (2), we can identify the rotation in the plane
(1, 2) (around $\hat 3$) as an $SU(2)$ matrix with diagonal elements that have opposite phases:
$$R_3(\phi) \to \begin{pmatrix} e^{i\phi/2} & 0 \\ 0 & e^{-i\phi/2} \end{pmatrix} \equiv \tilde R_3(\phi), \qquad (14.24)$$
acting on the two-dimensional complex vector $(z_1, z_2)^T$. The matrix is unitary, since
$$\tilde R_3^\dagger = \begin{pmatrix} e^{-i\phi/2} & 0 \\ 0 & e^{i\phi/2} \end{pmatrix} = \tilde R_3^{-1}, \qquad (14.25)$$
and $\det \tilde R_3 = 1$.


The rotation around $\hat 1$ can be represented by a different unitary matrix,
$$R_1(\theta) \to \begin{pmatrix} \cos\theta/2 & i\sin\theta/2 \\ i\sin\theta/2 & \cos\theta/2 \end{pmatrix} \equiv \tilde R_1(\theta). \qquad (14.26)$$
We can then check that
$$\tilde R_1^\dagger = \begin{pmatrix} \cos\theta/2 & -i\sin\theta/2 \\ -i\sin\theta/2 & \cos\theta/2 \end{pmatrix} = \tilde R_1^{-1}, \qquad (14.27)$$
and $\det \tilde R_1 = +1$.


Then the general element of $SU(2)$ is represented via the Euler angles $\phi, \theta, \psi$ as
$$g(\phi, \theta, \psi) = \tilde R_3(\phi)\tilde R_1(\theta)\tilde R_3(\psi) = \begin{pmatrix} \cos\frac{\theta}{2}\, e^{i(\phi+\psi)/2} & i\sin\frac{\theta}{2}\, e^{i(\phi-\psi)/2} \\ i\sin\frac{\theta}{2}\, e^{-i(\phi-\psi)/2} & \cos\frac{\theta}{2}\, e^{-i(\phi+\psi)/2} \end{pmatrix}. \qquad (14.28)$$

But, as we said, $-1_{2\times 2} \in SU(2)$, so we need to obtain $\pm$ times the above generic matrix, which was defined over the original range of the ($SO(3)$) Euler angles. Since $\cos(\alpha + \pi) = -\cos\alpha$ and $\sin(\alpha + \pi) = -\sin\alpha$, we can achieve that by noting that

R̃3 (φ) R̃1 (θ + 2π) R̃3 (ψ) = − R̃3 (φ) R̃1 (θ) R̃3 (ψ), (14.29)

that is, by doubling the range of θ, thus proving the statement from before, that SU (2) winds twice
around SO(3) (it is a double cover of SO(3)).
On the other hand, that also means that the Lie algebras of SU (2) and SO(3) are the same,
su(2) = so(3), which means that their representations are equivalent.
In particular, it means that we can represent SO(3) via complex 2 × 2 matrices also! We will see
that this is the minimum representation space (of “spin j = 1/2”) that we can have for SO(3).
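The double-cover property can be seen numerically from the explicit Euler-angle matrix of eq. (14.28): shifting $\theta$ by $2\pi$ flips the overall sign, eq. (14.29). (A sketch, not from the book; the function name is mine.)

```python
import numpy as np

def su2(phi, theta, psi):
    # the general SU(2) element of eq. (14.28), built as R3~(phi) R1~(theta) R3~(psi)
    Rt3 = lambda a: np.diag([np.exp(1j * a / 2), np.exp(-1j * a / 2)])
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    Rt1 = np.array([[c, 1j * s], [1j * s, c]])
    return Rt3(phi) @ Rt1 @ Rt3(psi)

phi, theta, psi = 0.3, 1.2, 2.1
g = su2(phi, theta, psi)

assert np.allclose(g.conj().T @ g, np.eye(2))             # unitary
assert np.isclose(np.linalg.det(g), 1.0)                  # special
assert np.allclose(su2(phi, theta + 2 * np.pi, psi), -g)  # double cover, eq. (14.29)
```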

Example for SU(2)

An example of such a representation, for a system with "internal" $SU(2)$ symmetry, is a system defined by four real coordinates, $x_1, x_2, y_1, y_2 \in \mathbb{R}$, combined into two complex coordinates $q_1 = x_1 + iy_1$ and $q_2 = x_2 + iy_2$, and further into the vector space with two complex dimensions, $q = (q_1, q_2)^T$, with Lagrangian
$$L = m\frac{|\dot q|^2}{2} - V(|q|^2) \equiv \frac{m}{2}\left(|\dot q_1|^2 + |\dot q_2|^2\right) - V\left(|q_1|^2 + |q_2|^2\right). \qquad (14.30)$$
Then, for $q \to q' = gq$, we obtain
$$|q|^2 \to |q'|^2 = |gq|^2 = q^\dagger g^\dagger g q = q^\dagger q = |q|^2, \qquad |\dot q|^2 \to |\dot q'|^2 = |g\dot q|^2 = \dot q^\dagger g^\dagger g \dot q = \dot q^\dagger \dot q = |\dot q|^2. \qquad (14.31)$$

Again, just because the (classical or quantum) system is in the two-dimensional representation of
SU (2), it doesn’t mean that the state is in that representation, since the state is not related to q1 , q2 as
a vector space. States |ψ can be in a different representation of SU (2).

14.4 Generators and Lie Algebras

From the Noether theorem, we saw in Chapter 12 that, for a symmetry with generators Ta , we have
a conserved charge associated with it,
$$Q_a = \frac{\partial L}{\partial \dot q_i}(iT_a)_i{}^j q_j = p_i (iT_a)_i{}^j q_j. \qquad (14.32)$$
Considering the symmetry to be a rotation, specifically around z = x 3 , and qi = x i = (x 1 , x 2 , x 3 ),
we found that
$$iT_3 = \begin{pmatrix} 0 & -1 & 0 \\ +1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (14.33)$$

and similarly (just permuting the directions 1, 2, 3) for $iT_1, iT_2$. We find
$$Q_3 = p \cdot iT_3 \cdot q = x^1 p^2 - x^2 p^1. \qquad (14.34)$$
Moreover, for the other two charges, we find
$$Q_2 = x^3 p^1 - x^1 p^3, \qquad Q_1 = x^2 p^3 - x^3 p^2, \qquad (14.35)$$
or, written together,
$$Q_i = \sum_{j,k} \epsilon_{ijk}\, x^j p^k, \qquad (14.36)$$

using the Levi-Civita antisymmetric symbol i jk , for which 123 = +1 and which is totally antisym-
metric (so that, for instance, 132 = −123 = −1).
However, $\epsilon_{ijk}$ defines the three-dimensional vector product, so we have in fact
$$\vec Q = \vec r \times \vec p = \vec L, \qquad (14.37)$$
which is the angular momentum of a particle! Therefore the conserved charge associated with rotational invariance is the angular momentum.
Moreover, denoting iTa = i Ja , we see that Q a = L a is the charge associated with the generator
Ja , so the generators of SO(3) rotations are (associated with) angular momenta. This is similar to the
fact that the generators of translations K a are (associated with) the momenta Pa .
Classically, we can calculate the Poisson brackets of angular momenta, using the canonical expressions $\{x_i, p_j\}_{P.B.} = \delta_{ij}$ or, equivalently, the definitions of the Poisson brackets:
$$\{L_i, L_j\}_{P.B.} = \epsilon_{ikl}\epsilon_{jmn}\{x_k p_l, x_m p_n\}_{P.B.} = \epsilon_{ilk}\epsilon_{jmk}\, x_m p_l + \epsilon_{ikl}\epsilon_{jln}\, x_k p_n = -\epsilon_{ijk}\epsilon_{kmn}\, x_m p_n = -\epsilon_{ijk} L_k. \qquad (14.38)$$
This is the same algebra as the algebra of the generators $iJ_i$ of $SO(3)$ (a Lie algebra), as we can directly check:
$$[iJ_1, iJ_2] = \left[\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}\right] = -iJ_3 = -\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (14.39)$$
so we can identify the generators of rotations with the angular momenta.
Finally, we will see in the next chapter that on wave functions $\psi(\vec r)$ the rotation acts according to the operator $\hat{\vec L}$:
$$\hat L_i\, \psi(\vec x) = \sum_{jk} \epsilon_{ijk} \hat X_j \hat P_k\, \psi(\vec x) = -i\hbar \sum_{jk} \epsilon_{ijk}\, x_j \frac{\partial}{\partial x_k} \psi(\vec x). \qquad (14.40)$$

14.5 Quantum Mechanical Version

In quantum mechanics, the Poisson bracket $\{, \}_{P.B.}$ is replaced by $(1/i\hbar)[, ]$, so we obtain the algebra
$$[\hat L_i, \hat L_j] = i\hbar \sum_k \epsilon_{ijk} \hat L_k. \qquad (14.41)$$

Equivalently, and more directly, $x_i$ and $p_i$ become the operators $\hat X_i$ and $\hat P_i$, so we have
$$\hat L_1 = \hat X_2 \hat P_3 - \hat X_3 \hat P_2, \text{ etc.} \;\Rightarrow\; \hat L_i = \sum_{jk} \epsilon_{ijk} \hat X_j \hat P_k, \qquad (14.42)$$
and, using the canonical commutation relations $[\hat X_i, \hat P_j] = i\hbar\delta_{ij}$, we find again
$$[\hat L_i, \hat L_j] = i\hbar \sum_k \epsilon_{ijk} \hat L_k. \qquad (14.43)$$

Moreover, constructing the square of the total angular momentum,
$$\hat{\vec L}^2 = \hat L_1^2 + \hat L_2^2 + \hat L_3^2, \qquad (14.44)$$
we then obtain
$$[\hat L_i, \hat{\vec L}^2] = \sum_j \left(\hat L_j [\hat L_i, \hat L_j] + [\hat L_i, \hat L_j] \hat L_j\right) = i\hbar \sum_{jk} \epsilon_{ijk}\left(\hat L_j \hat L_k + \hat L_k \hat L_j\right) = 0. \qquad (14.45)$$

Thus a general "angular momentum", for either rotation $SO(3)$ generators or generators of an internal $SO(3)$, has the commutation relations
$$[\hat J_i, \hat J_j] = i\hbar \sum_k \epsilon_{ijk} \hat J_k, \qquad [\hat{\vec J}^2, \hat J_i] = 0, \qquad (14.46)$$
where the $\hat J_i$ are Hermitian.

14.6 Representations

Finally, we can construct the representations of the rotation group SU (2)  SO(3) corresponding to
the possible Hilbert spaces for the rotationally invariant systems.
We define the operators
$$J_+ = J_1 + iJ_2, \quad J_- = J_1 - iJ_2, \quad J_z = J_3, \qquad (14.47)$$
with $(J_+)^\dagger = J_-$ and $(J_-)^\dagger = J_+$. Their algebra is
$$[J_+, J_-] = 2i[J_2, J_1] = 2\hbar J_z, \quad [J_z, J_+] = \hbar J_+, \quad [J_z, J_-] = -\hbar J_-. \qquad (14.48)$$
From the fact that $[\vec J^2, J_z] = 0$, it follows that there is a complete set of simultaneous eigenstates of both $\vec J^2$ and $J_z$, denoted $|\lambda^2, m, \ldots\rangle$ by their eigenvalues,
$$\vec J^2 |\lambda^2, m, \ldots\rangle = \lambda^2\hbar^2 |\lambda^2, m, \ldots\rangle, \qquad J_z |\lambda^2, m, \ldots\rangle = m\hbar |\lambda^2, m, \ldots\rangle. \qquad (14.49)$$
But, since the $J_i$ are Hermitian, we find that
$$\langle \lambda^2, m, \ldots|\sum_i J_i^2 |\lambda^2, m, \ldots\rangle = \sum_i \left\| J_i |\lambda^2, m, \ldots\rangle \right\|^2 = \lambda^2\hbar^2 \ge 0 \;\Rightarrow\; \lambda^2 \ge 0, \qquad (14.50)$$

where we have used the fact that the norm of the eigenstates is one (or in any case, positive and
nonzero). Thus λ2 ≥ 0, so λ is real (which shows that it was a good definition).
In a similar manner, using the Hermiticity of $J_1, J_2$, we find
$$\langle \lambda^2, m, \ldots|(J_1^2 + J_2^2)|\lambda^2, m, \ldots\rangle = \langle \lambda^2, m, \ldots|(\vec J^2 - J_z^2)|\lambda^2, m, \ldots\rangle = (\lambda^2 - m^2)\hbar^2 = \left\| J_1 |\lambda^2, m, \ldots\rangle \right\|^2 + \left\| J_2 |\lambda^2, m, \ldots\rangle \right\|^2 \ge 0 \;\Rightarrow\; \lambda^2 - m^2 \ge 0. \qquad (14.51)$$
On the other hand, $J_+$ acts as a raising operator for $m$ and $J_-$ as a lowering operator, since (using $[J_z, J_\pm] = \pm\hbar J_\pm$) we find
$$J_z (J_+ |\lambda^2, m, \ldots\rangle) = J_+ (J_z |\lambda^2, m, \ldots\rangle) + \hbar J_+ |\lambda^2, m, \ldots\rangle = (m+1)\hbar\, (J_+ |\lambda^2, m, \ldots\rangle),$$
$$J_z (J_- |\lambda^2, m, \ldots\rangle) = J_- (J_z |\lambda^2, m, \ldots\rangle) - \hbar J_- |\lambda^2, m, \ldots\rangle = (m-1)\hbar\, (J_- |\lambda^2, m, \ldots\rangle). \qquad (14.52)$$
Since $m^2 \le \lambda^2$, it follows that there is a maximum value and a minimum value of $m$, such that
$$J_+ |\lambda^2, m_{max}, \ldots\rangle = 0, \qquad J_- |\lambda^2, m_{min}, \ldots\rangle = 0. \qquad (14.53)$$
Also, because the step in $m$ is 1, it follows that a representation of given $\lambda^2$ has a finite number of states. Moreover, as we have
$$J_- J_+ = (J_1 - iJ_2)(J_1 + iJ_2) = J_1^2 + J_2^2 + i[J_1, J_2] = \vec J^2 - J_z^2 - \hbar J_z \qquad (14.54)$$
and $J_- = (J_+)^\dagger$, we find that
$$\langle \lambda^2, m, \ldots| J_- J_+ |\lambda^2, m, \ldots\rangle = \langle \lambda^2, m, \ldots|(\vec J^2 - J_z^2 - \hbar J_z)|\lambda^2, m, \ldots\rangle = (\lambda^2 - m^2 - m)\hbar^2 = \left\| J_+ |\lambda^2, m, \ldots\rangle \right\|^2 \ge 0 \;\Rightarrow\; \lambda^2 - m^2 - m \ge 0. \qquad (14.55)$$
Similarly, as we have
$$J_+ J_- = (J_1 + iJ_2)(J_1 - iJ_2) = J_1^2 + J_2^2 - i[J_1, J_2] = \vec J^2 - J_z^2 + \hbar J_z \qquad (14.56)$$
and $J_+ = (J_-)^\dagger$, we obtain
$$\langle \lambda^2, m, \ldots| J_+ J_- |\lambda^2, m, \ldots\rangle = \langle \lambda^2, m, \ldots|(\vec J^2 - J_z^2 + \hbar J_z)|\lambda^2, m, \ldots\rangle = (\lambda^2 - m^2 + m)\hbar^2 = \left\| J_- |\lambda^2, m, \ldots\rangle \right\|^2 \ge 0 \;\Rightarrow\; \lambda^2 - m^2 + m \ge 0. \qquad (14.57)$$
Further, equality to zero occurs, for $m_{max}$, when the norm of the state is zero, since the state itself vanishes, i.e., when
$$J_+ |\lambda^2, m_{max}, \ldots\rangle = 0 \;\Rightarrow\; \lambda^2 - m_{max}^2 - m_{max} = 0, \qquad (14.58)$$
and for $m_{min}$ when, similarly,
$$J_- |\lambda^2, m_{min}, \ldots\rangle = 0 \;\Rightarrow\; \lambda^2 - m_{min}^2 + m_{min} = 0. \qquad (14.59)$$
However, subtracting (14.58) from (14.59), we find
$$m_{max}(m_{max} + 1) = m_{min}(m_{min} - 1), \qquad (14.60)$$

whose only solution is
$$m_{max} = -m_{min}. \qquad (14.61)$$
Indeed, there is no solution if both $m_{max}$ and $m_{min}$ are positive, or both are negative. Finally, we have
$$m_{min} \le m \le m_{max}, \qquad (14.62)$$
and since $m$ changes by $\pm 1$ only (through $J_\pm$), it follows that
$$2m_{max} = m_{max} - m_{min} = n \in \mathbb{N}, \qquad (14.63)$$
so
$$m_{max} = \frac{n}{2} \equiv j. \qquad (14.64)$$
This "quantum number" $j$ is thus a half-integer (an integer multiple of 1/2). Finally, (14.58) implies that
$$\lambda^2 = j(j+1). \qquad (14.65)$$
Assuming that there are no other quantum numbers (so we have only the angular momentum as a variable), the state is $|jm\rangle$ and
$$J_z |jm\rangle = m\hbar |jm\rangle, \qquad \vec J^2 |jm\rangle = j(j+1)\hbar^2 |jm\rangle. \qquad (14.66)$$
The interpretation is that the index $j$ labels the possible representations of $SO(3)$, or of angular momentum (since $j$ defines the total angular momentum), and $m$ labels the states within the representation.

Matrix Elements

Finally, we compute the matrix elements of the generators in the $|jm\rangle$ representation. From (14.55),
$$\left\| J_+ |jm\rangle \right\|^2 = \hbar^2 [j(j+1) - m^2 - m], \qquad (14.67)$$
whereas from (14.52),
$$J_+ |jm\rangle \propto |j, m+1\rangle, \qquad (14.68)$$
which finally gives
$$J_+ |jm\rangle = \hbar\sqrt{(j-m)(j+m+1)}\, |j, m+1\rangle \equiv \hbar\, \alpha_{j,m+1} |j, m+1\rangle, \qquad (14.69)$$
where
$$\alpha_{jm} \equiv \sqrt{j(j+1) - m(m-1)} = \sqrt{(j+m)(j-m+1)}. \qquad (14.70)$$
As this is a matrix element in $|jm\rangle$ space, we have
$$(J_+)_{jm',jm} = \hbar\, \alpha_{j,m+1}\, \delta_{m',m+1}. \qquad (14.71)$$
Similarly, from (14.57),
$$\left\| J_- |jm\rangle \right\|^2 = \hbar^2 [j(j+1) - m^2 + m], \qquad (14.72)$$
whereas from (14.52),
$$J_- |jm\rangle \propto |j, m-1\rangle, \qquad (14.73)$$
which finally gives
$$J_- |jm\rangle = \hbar\sqrt{(j+m)(j-m+1)}\, |j, m-1\rangle \equiv \hbar\, \alpha_{jm} |j, m-1\rangle. \qquad (14.74)$$
As a matrix element in $|jm\rangle$ space, this means that
$$(J_-)_{jm',jm} = \hbar\, \alpha_{jm}\, \delta_{m',m-1}. \qquad (14.75)$$
Finally, since $J_1 = (J_+ + J_-)/2$ and $J_2 = (J_+ - J_-)/(2i)$, we find also
$$(J_1)_{jm',jm} = \frac{\hbar}{2}\left(\alpha_{j,m+1}\, \delta_{m',m+1} + \alpha_{jm}\, \delta_{m',m-1}\right), \qquad (J_2)_{jm',jm} = \frac{\hbar}{2i}\left(\alpha_{j,m+1}\, \delta_{m',m+1} - \alpha_{jm}\, \delta_{m',m-1}\right). \qquad (14.76)$$
To these, we add the diagonal elements
$$(J_3)_{jm',jm} = m\hbar\, \delta_{m'm}, \qquad (\vec J^2)_{jm',jm} = j(j+1)\hbar^2\, \delta_{m'm}. \qquad (14.77)$$
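The matrix elements above can be assembled into explicit matrices for any $j$ and checked numerically (a sketch with $\hbar = 1$; the helper names are mine):

```python
import numpy as np

def spin_matrices(j):
    """J_3, J_+ and J_- in the |j, m> basis ordered m = j, j-1, ..., -j (hbar = 1)."""
    m = np.arange(j, -j - 1, -1)
    J3 = np.diag(m)
    alpha = np.sqrt((j + m) * (j - m + 1))   # alpha_{j,m} of eq. (14.70)
    Jm = np.diag(alpha[:-1], -1)             # J_- |j,m> = alpha_{j,m} |j,m-1>, eq. (14.74)
    Jp = Jm.T                                # J_+ = (J_-)^dagger
    return J3, Jp, Jm

for j in (0.5, 1.0, 1.5, 2.0):
    J3, Jp, Jm = spin_matrices(j)
    J1 = (Jp + Jm) / 2
    J2 = (Jp - Jm) / (2 * 1j)
    dim = int(2 * j + 1)
    # the algebra [J_1, J_2] = i J_3 and the Casimir J^2 = j(j+1) 1
    assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)
    assert np.allclose(J1 @ J1 + J2 @ J2 + J3 @ J3, j * (j + 1) * np.eye(dim))
```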
We now specialize to the $j = 1/2$ case (which will be described in much more detail later on, as it corresponds in particular to the electron spin 1/2 case), and note that the above matrices are now $2 \times 2$ matrices, acting on the two-dimensional vector space $|jm\rangle = |1/2, \pm 1/2\rangle$. Then, as a matrix,
$$J_3 = \hbar\begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}. \qquad (14.78)$$
This indeed matches the generators of $SU(2)$, since for infinitesimal Euler angle parameters we find
$$\tilde R_3(\phi) \simeq \begin{pmatrix} 1 + i\phi/2 & 0 \\ 0 & 1 - i\phi/2 \end{pmatrix} + \mathcal{O}(\phi^2) = 1 + i\phi\begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix} + \mathcal{O}(\phi^2) \;\Rightarrow\; J_3 = \hbar\begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix},$$
$$\tilde R_1(\theta) \simeq \begin{pmatrix} 1 & i\theta/2 \\ i\theta/2 & 1 \end{pmatrix} + \mathcal{O}(\theta^2) = 1 + i\theta\begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix} + \mathcal{O}(\theta^2) \;\Rightarrow\; J_1 = \hbar\begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}. \qquad (14.79)$$
The last formula is consistent with the matrices we found for the spin 1/2 representation, since we find
$$J_+ |1/2, -1/2\rangle = \hbar |1/2, +1/2\rangle, \quad J_+ |1/2, +1/2\rangle = 0,$$
$$J_- |1/2, +1/2\rangle = \hbar |1/2, -1/2\rangle, \quad J_- |1/2, -1/2\rangle = 0 \;\Rightarrow\; J_1 = \frac{J_+ + J_-}{2} = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad (14.80)$$

Important Concepts to Remember

• The special orthogonal group $SO(n)$ is the group of matrices that are orthogonal, $\Lambda^T = \Lambda^{-1}$, and special, $\det\Lambda = 1$, which correspond to rotations in $n$-dimensional space. Globally, $SO(n) = O(n)/Z_2$.
• The special unitary group SU (n) is the group of matrices that are unitary, U † = U −1 , and special,
det U = 1, with U (n)  (SU (n) × U (1))/Zn .
• For n = 2 we also have SO(3)  SU (2)/Z2 , so SU (2) winds twice around SO(3): SO(3) is
generated by three Euler angles for rotations around three axes (in a plane; with the z axis towards
the plane; and around the final z axis), while SU (2) has twice the range of the angles, so that
−1 ∈ SU (2).
• The charges associated with $SO(3)$ rotations of coordinates are the angular momenta $L_i = \epsilon_{ijk} x_j p_k$, which classically satisfy the Poisson brackets $\{L_i, L_j\}_{P.B.} = -\epsilon_{ijk} L_k$.
• Quantum mechanically, the $\hat L_i$ give the Lie algebra $[\hat L_i, \hat L_j] = i\hbar\epsilon_{ijk} \hat L_k$, with $[\hat{\vec L}^2, \hat L_k] = 0$, which is the Lie algebra of $SU(2)$, with generators generally denoted by $\hat J_i$.
• The representations of $SU(2)$ are indexed by a half-integer $j \in \mathbb{N}/2$ and, within a representation, the states are indexed by $m$, $-j \le m \le j$, and thus by $|jm\rangle$, with $\vec J^2 |jm\rangle = j(j+1)\hbar^2 |jm\rangle$ and $J_z |jm\rangle = m\hbar |jm\rangle$.
• The matrix elements within the representation are $J_+ |jm\rangle = \hbar\alpha_{j,m+1} |j, m+1\rangle$, $J_- |jm\rangle = \hbar\alpha_{j,m} |j, m-1\rangle$, and $J_3 |jm\rangle = m\hbar |jm\rangle$, with $\alpha_{j,m} = \sqrt{(j+m)(j-m+1)}$.
• The spin 1/2 ( j = 1/2) generators Ji correspond to the generators of SU (2) via the Euler angles.

Further Reading
See [2], [1], [3] for more information.

Exercises

(1) Write explicitly the matrices of the spin 1 representation and the adjoint representation of SU (2),
and relate them.
(2) Write explicitly the $2 \times 2$ matrices $g = e^{i\alpha^i \sigma_i/2}$, where $\sigma_i$ are the Pauli matrices, and compare
with g(θ, φ, ψ) for SO(3), to find an explicit map between α i and the Euler angles (θ, φ, ψ).
(3) For a general matrix
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad (14.81)$$

belonging to SU (2), i.e., such that A† = A−1 and det A = 1, find the Euler angles in terms of
$a, b, c, d$.
(4) Consider the complex degrees of freedom $q_1, q_2 \in \mathbb{C}$ forming a doublet $q = (q_1, q_2)^T$, and $A = \sum_{i=1}^3 a_i \sigma_i$, with $a_i \in \mathbb{C}$, where $\sigma_i$ are the Pauli matrices, and the Lagrangian
$$L = \dot q^\dagger \dot q + \det e^{\dot A} - q^\dagger A q. \qquad (14.82)$$
Show that $L$ is invariant under $SU(2)$.

(5) Write explicitly the matrices for the generators of the group SU (2) in the j = 3/2 representation,
and check that they satisfy the Lie algebra of SU (2).
(6) In the classical limit for the angular momentum, i.e., for large j, we expect to see quantum
averages J1 , J2 , J3  close to the classical values, as well as a small quantum fluctuation for
these angular momenta. Show that this is the case.
(7) Consider a three-dimensional rotationally invariant harmonic oscillator. What quantum numbers
describe a state of the oscillator, viewed as a system with a potential V = V (r)?
15 Theory of Angular Momentum II: Addition of Angular Momenta and Representations; Oscillator Model

In the previous chapter, we saw that the angular momentum of a system acts as a generator of the $SO(3)$ rotation group, and that the value of the angular momentum $j$ defines the representation of $SO(3)$ to which the states of the system belong. Classically, energy and mass are scalars under rotations, meaning they do not change, while the momentum $\vec p$ and the angular momentum $\vec J$ are vectors, and so transform under the fundamental (or defining) representation of $SO(3)$. Quantum mechanically, $\vec p$ and $\vec J$ no longer characterize states (which are in a given representation) but rather are operators; in particular they are generators of the translation and rotation ($SO(3)$) groups, respectively, and as such are in fixed representations of the groups themselves. On the other hand, states are in different representations, unrelated to those of the operators: they are in representations defined by an eigenvalue $j$:

• for $j = 0$, we have the "scalar" representation, with a single state, $|0, 0\rangle$;
• for $j = 1/2$, we have the "spinor" representation, with two states, either $|1/2, -1/2\rangle$ and $|1/2, +1/2\rangle$, or $|\downarrow\rangle$ and $|\uparrow\rangle$;
• for $j = 1$, we have the "vector" representation, with three states, $|1, -1\rangle$, $|1, 0\rangle$, and $|1, +1\rangle$;
• etc.

15.1 The Spinor Representation, j = 1/2

The simplest nontrivial representation, with angular momentum $j = 1/2$, has generators $J_i$ in this representation given by the following matrices (from the general formulas of the last chapter):
$$J_1^{1/2} = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad J_2^{1/2} = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad J_3^{1/2} = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (15.1)$$
We take out the prefactor,
$$J_i^{1/2} \equiv \frac{\hbar}{2}\, \sigma_i, \qquad (15.2)$$
thus defining the Pauli matrices $\sigma_i$,
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (15.3)$$


We can check that these matrices satisfy a number of properties:

(1) $\sigma_i^2 = 1$ (matrix normalization);
(2) $\{\sigma_i, \sigma_j\} = 0$ for $i \neq j$ (anticommutation);
(3) $\sigma_1\sigma_2 = i\sigma_3$ and cyclic permutations thereof: $\sigma_2\sigma_3 = i\sigma_1$, $\sigma_3\sigma_1 = i\sigma_2$ (commutation); since $\sigma_1\sigma_2 = -\sigma_2\sigma_1$, etc. (anticommutation), we obtain the algebra
$$[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k. \qquad (15.4)$$
Putting together properties (1) and (3), we have
$$\sigma_i\sigma_j = \delta_{ij} 1 + i\epsilon_{ijk}\sigma_k. \qquad (15.5)$$
Further properties of the Pauli matrices are as follows:

(4) $\mathrm{Tr}[\sigma_i] = 0$ (tracelessness);
(5) $\mathrm{Tr}[\sigma_i^2] = 2$ (matrix normalization);
(6) using properties (3)-(5), we find also $\mathrm{Tr}[\sigma_i\sigma_j] = 0$ for $i \neq j$, so we have a (trace orthonormalization) condition
$$\mathrm{Tr}[\sigma_i\sigma_j] = 2\delta_{ij}; \qquad (15.6)$$
(7) (completeness of $(1, \sigma_i)$) in the space of complex $2 \times 2$ matrices, we can decompose any matrix $M$ in terms of the four matrices $\sigma_i$ and $1$, as
$$M = \alpha_0 1 + \sum_{i=1,2,3} \alpha_i \sigma_i; \qquad (15.7)$$
indeed, multiplying by $1$ or $\sigma_j$ and taking the trace, and then using the trace orthonormalization condition of property (6), we find
$$\alpha_0 = \frac{1}{2}\mathrm{Tr}[M], \qquad \alpha_i = \frac{1}{2}\mathrm{Tr}[M\sigma_i]; \qquad (15.8)$$
(8) $\sigma_i^\dagger = \sigma_i$ (Hermiticity);
(9) putting together (1) and (8), we find also that $\sigma_i^\dagger = \sigma_i^{-1}$ (unitarity).
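All of these properties are easy to confirm numerically (a sketch, not from the book; variable names are mine):

```python
import numpy as np

# the Pauli matrices of eq. (15.3)
s = [np.array([[0, 1], [1, 0]], complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], complex)]
eye = np.eye(2)

for i in range(3):
    assert np.allclose(s[i] @ s[i], eye)        # (1) sigma_i^2 = 1
    assert np.isclose(np.trace(s[i]), 0)        # (4) traceless
    assert np.allclose(s[i].conj().T, s[i])     # (8) Hermitian
    for k in range(3):
        if i != k:
            # (2) anticommutation for i != k
            assert np.allclose(s[i] @ s[k] + s[k] @ s[i], np.zeros((2, 2)))
        # (6) trace orthonormalization Tr[sigma_i sigma_k] = 2 delta_{ik}
        assert np.isclose(np.trace(s[i] @ s[k]), 2 * (i == k))

assert np.allclose(s[0] @ s[1], 1j * s[2])      # (3) sigma_1 sigma_2 = i sigma_3
```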
Finally, we obtain that the spin 1/2 representation is the fundamental representation of $SU(2)$ (which is equal, at the level of the Lie algebra, to the rotation group $SO(3)$, as we saw), since the $\sigma_i = (2/\hbar)J_i^{1/2}$ are $2 \times 2$ complex Hermitian matrices and generators, so that $g = e^{i\vec\alpha \cdot \vec J/\hbar}$ is unitary: $g^\dagger = g^{-1}$. Moreover, property (4) (the tracelessness of $\sigma_i$) means that the matrices $g$ are also special. Indeed, we have
$$\det M = e^{\mathrm{Tr}\ln M}, \qquad (15.9)$$
which can be proven as follows. Diagonalize the matrix $M$, writing it as $S^{-1}\tilde M S$, with $\tilde M$ diagonal with diagonal elements $\lambda_i$, and use $\det(S^{-1}MS) = \det S^{-1} \det M \det S = \det M$ and $\mathrm{Tr}[S^{-1}MS] = \mathrm{Tr}\, M$, so that
$$\mathrm{Tr}\ln M = \sum_i \ln\lambda_i \;\Rightarrow\; \det M = \prod_i \lambda_i = e^{\sum_i \ln\lambda_i} = e^{\mathrm{Tr}\ln M}. \qquad (15.10)$$
Further, writing $M = e^A$, we obtain
$$\det e^A = e^{\mathrm{Tr}\, A}, \qquad (15.11)$$
which means that if $\mathrm{Tr}\, A = 0$ then $\det e^A = 1$. In our case, $A = i\vec\alpha \cdot \vec\sigma/2$ and $e^A = g$, so $\det g = 1$.

On the other hand, the adjoint representation of $SO(3)$ is defined by the structure constants of $SO(3)$, $f_{abc} = \epsilon_{abc}$,
$$(T_a)_{bc} = -i f_{abc} = -i\epsilon_{abc}. \qquad (15.12)$$
This means specifically that
$$T_1 = -i\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}, \quad T_2 = -i\begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad T_3 = -i\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (15.13)$$
which is the same set of matrices as $J_i^1/\hbar$ (the generators, or angular momenta, in the $j = 1$ representation), as can be checked using the explicit formulas from the last chapter.
With regard to the representations of j ≥ 3/2, we can obtain them from the previous representations
by composition, which will be discussed next.

15.2 Composition of Angular Momenta

Consider a composite system, such as for instance an atom, that contains several objects (such as electrons, etc.), each with angular momentum. Thus we have eigenstates of $\vec J_i^2$ and $J_{iz}$ for each object, but we must also have eigenstates of the system for the total angular momentum $\vec J = \sum_i \vec J_i$.

Two Angular Momenta, J = J_1 + J_2

It is in fact enough to consider only two momenta, since we can add up two momenta, then add
another to the sum of the first two, etc., until we have done the whole sum.
Since for $\vec J^2$ and $J_z$ we have eigenstates $|jm\rangle$, with $j$ defining the representation and $m$ an index for the states in the representation, for two such momenta $\vec J_1, \vec J_2$ we have a tensor product total state, written as
$$|j_1 m_1\rangle \otimes |j_2 m_2\rangle \equiv |j_1 j_2; m_1 m_2\rangle. \qquad (15.14)$$
The algebra of the angular momenta is
    [J_1i, J_1j] = iℏ ε_ijk J_1k
    [J_2i, J_2j] = iℏ ε_ijk J_2k
    [J_1i, J_2j] = 0                                     (15.15)
    [J₁², J_1z] = 0
    [J₂², J_2z] = 0.
For fixed j1 , j2 , we have that m1 , m2 must define the system, as a tensor product representation. But
this representation is in fact too large, considered as a representation of SO(3). In fact, we already
know that all the representations of SO(3) are defined by a fixed j and indexed by m values. This
means that the basis of states above, while complete in the Hilbert space, is too large, and we must
impose an extra condition that J = J 1 + J 2 is also an angular momentum,
    [J_i, J_j] = iℏ ε_ijk J_k,                           (15.16)
with quantum numbers j, m defining the representation of the total system.

We note that on the space of states |j₁ m₁⟩ ⊗ |j₂ m₂⟩ we can act with the matrices in the
corresponding representations for each of the two Hilbert subspaces, thus with

    D_R^(1)(g) ⊗ D_R^(2)(g) = D_R(e^{−i α·J₁/ℏ}) ⊗ D_R(e^{−i α·J₂/ℏ})
                            ≃ (1 − i α·J₁/ℏ) ⊗ (1 − i α·J₂/ℏ)
                            ≃ 1 ⊗ 1 − (i/ℏ) α·(J₁ ⊗ 1₂ + 1₁ ⊗ J₂)    (15.17)
                            ≡ 1 − (i/ℏ) α·J,

meaning that more precisely we have
    J = J₁ ⊗ 1₂ + 1₁ ⊗ J₂.                               (15.18)
In the basis |j₁ j₂; m₁ m₂⟩, we have the eigenvalues
    J₁² |j₁ j₂; m₁ m₂⟩ = j₁(j₁+1) ℏ² |j₁ j₂; m₁ m₂⟩
    J₂² |j₁ j₂; m₁ m₂⟩ = j₂(j₂+1) ℏ² |j₁ j₂; m₁ m₂⟩
    J_1z |j₁ j₂; m₁ m₂⟩ = m₁ ℏ |j₁ j₂; m₁ m₂⟩            (15.19)
    J_2z |j₁ j₂; m₁ m₂⟩ = m₂ ℏ |j₁ j₂; m₁ m₂⟩.
But since j₁, j₂ define the representation, they must be kept in the alternative basis as well, while we can
replace m₁, m₂ (the eigenvalues of J_1z, J_2z) with the eigenvalues of J² and J_z, represented by j and
m. Indeed, the set J₁², J₂², J², J_z is mutually commuting, so we can have simultaneous eigenvalues,
whereas adding J_1z, J_2z spoils the mutual commutativity. Indeed, we can calculate

    [J², J₁²] = [J₁² + J₂² + 2 J₁·J₂, J₁²]
              = [J₁² + J₂² + 2 J_1z J_2z + J_1+ J_2− + J_1− J_2+, J₁²] = 0
    [J², J₂²] = [J₁² + J₂² + 2 J₁·J₂, J₂²]                              (15.20)
              = [J₁² + J₂² + 2 J_1z J_2z + J_1+ J_2− + J_1− J_2+, J₂²] = 0
    [J_z, J₁²] = [J_1z + J_2z, J₁²] = 0
    [J_z, J₂²] = [J_1z + J_2z, J₂²] = 0,

but on the other hand
    [J², J_1z] ≠ 0   (since [J_1z, J_1±] ≠ 0)
    [J², J_2z] ≠ 0   (since [J_2z, J_2±] ≠ 0).           (15.21)
Therefore the new basis of states is |j₁ j₂; jm⟩, with eigenvalues
    J₁² |j₁ j₂; jm⟩ = j₁(j₁+1) ℏ² |j₁ j₂; jm⟩
    J₂² |j₁ j₂; jm⟩ = j₂(j₂+1) ℏ² |j₁ j₂; jm⟩
    J²  |j₁ j₂; jm⟩ = j(j+1) ℏ² |j₁ j₂; jm⟩              (15.22)
    J_z |j₁ j₂; jm⟩ = m ℏ |j₁ j₂; jm⟩,

and this basis is also complete in the Hilbert space of states of the system. Thus we can expand the
elements of one basis in terms of the other basis. In particular, by using the completeness relation

    1 = Σ_{m₁,m₂} |j₁ j₂; m₁ m₂⟩⟨j₁ j₂; m₁ m₂|,          (15.23)

we can write

    |j₁ j₂; jm⟩ = Σ_{m₁,m₂} |j₁ j₂; m₁ m₂⟩⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩.    (15.24)

The matrix elements  j1 j2 ; m1 m2 | j1 j2 ; jm are known as the Clebsch–Gordan coefficients.


Properties of the Clebsch–Gordan coefficients:

(1) The first property derives from acting with Jz = J1z + J2z on states, specifically taking matrix
elements between the two basis states,

    0 = ⟨j₁ j₂; m₁ m₂|(J_z − J_1z − J_2z)|j₁ j₂; jm⟩ = ℏ(m − m₁ − m₂)⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩,    (15.25)

where we have acted with Jz on the right, and with J1z , J2z on the left. The equation above means
that the Clebsch–Gordan coefficients can only be nonzero for m = m1 + m2 (otherwise they vanish,
according to the equation).
(2) The second property gives a range for the total angular momentum j:

| j1 − j2 | ≤ j ≤ j1 + j2 . (15.26)

As a first check, we note that at large angular momenta j₁, j₂ (in the classical regime), when j_i² ≃
J_i²/ℏ², (15.26) is obviously true:

    J² = (J₁ + J₂)² = J₁² + J₂² + 2 J₁·J₂  ⇒  (j₁ − j₂)² ≤ J²/ℏ² ≤ (j₁ + j₂)².    (15.27)
But (15.26) is actually true in quantum mechanics as well. We first note that the maximum projection
of Jz is mmax = m1,max + m2,max = j1 + j2 , which means that

jmax = j1 + j2 . (15.28)

The representation with this jmax has 2 jmax + 1 states, and the whole representation can be obtained
from the state with mmax = j1 + j2 by acting with J− (which lowers m by one unit). For the next highest
value of m, mmax − 1, we have more than one state: the state J− | j1 j2 jmax mmax  and another state, with
j = jmax − 1. From the latter state we generate another representation with j = jmax − 1 by acting
with J− , for a total of 2( jmax − 1) + 1 states. Then, at mmax − 2, we have two states coming from the
previous two representations, plus one more, with j = jmax − 2, etc. In total, assuming that indeed
jmin = | j1 − j2 |, and assuming for concreteness that j1 ≥ j2 , we find that the total number of states is
    Σ_{j = j₁−j₂}^{j₁+j₂} (2j+1) = [(j₁+j₂+1)(j₁+j₂) − (j₁−j₂−1)(j₁−j₂)] + 2j₂ + 1
                                 = (2j₁+1)(2j₂+1),       (15.29)

which is indeed the total number of states in the basis with j1 , j2 , m1 , m2 at fixed j1 , j2 , so we have
found all the states in that basis, confirming the correctness of the hypothesis of jmin = | j1 − j2 |.
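The state count (15.29) is simple to verify for several (j₁, j₂), including half-integer values; a short sketch using exact rational arithmetic (the helper n_states is our own, not from the text):

```python
from fractions import Fraction

def n_states(j1, j2):
    """Sum of multiplicities 2j+1 over j = |j1-j2|, ..., j1+j2 in unit steps."""
    j1, j2 = Fraction(j1), Fraction(j2)
    total, j = Fraction(0), abs(j1 - j2)
    while j <= j1 + j2:
        total += 2 * j + 1
        j += 1
    return total

# Check Sum (2j+1) = (2 j1 + 1)(2 j2 + 1), eq. (15.29)
counting_ok = all(
    n_states(j1, j2) == (2 * Fraction(j1) + 1) * (2 * Fraction(j2) + 1)
    for j1 in (Fraction(1, 2), 1, Fraction(3, 2), 2, 3)
    for j2 in (Fraction(1, 2), 1, Fraction(5, 2)))
```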

Decomposition of Product of Representations

We have thus arrived at the following logic: when we take a product of two representations of SO(3),
one of dimension j1 and the other of dimension j2 , we can decompose the resulting states into
representations of SO(3) of varying j, between | j1 − j2 | and j1 + j2 , formally:

    j₁ ⊗ j₂ = |j₁ − j₂| ⊕ (|j₁ − j₂| + 1) ⊕ · · · ⊕ (j₁ + j₂).    (15.30)

15.3 Finding the Clebsch–Gordan Coefficients

The Clebsch–Gordan coefficients are the matrix elements for a change of basis in the Hilbert
space of the system, from | j1 j2 ; m1 m2  to | j1 j2 ; jm, which by the general theory of Hilbert
spaces are unitary matrices. In general, a unitary matrix has complex elements, but since we can
multiply all the elements of both bases by arbitrary phases without changing the physics of the
system, we can choose a convention where the Clebsch–Gordan coefficients are all real (the phases
turn all complex elements into real ones). Moreover, we can also choose a convention such that
⟨j₁ j₂; m₁ = j₁, m₂ = j − j₁ | j₁ j₂; j, m = j⟩ is not only real, but positive.
Then, since we have unitary transformations on the Hilbert space, these matrices obey the orthonormality conditions

    Σ_{j,m} ⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩⟨j₁ j₂; m₁′ m₂′|j₁ j₂; jm⟩ = δ_{m₁ m₁′} δ_{m₂ m₂′}
    Σ_{m₁,m₂} ⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩⟨j₁ j₂; m₁ m₂|j₁ j₂; j′ m′⟩ = δ_{j j′} δ_{m m′}.    (15.31)

For j = j′, m = m′ in the second relation, we obtain a normalization condition for the Clebsch–Gordan coefficients.
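This normalization, and the full second relation in (15.31), can be checked with sympy's Clebsch–Gordan coefficients, which follow the same real Condon–Shortley convention; here for j₁ = j₂ = 1 (a choice of ours, for illustration):

```python
from sympy import simplify
from sympy.physics.quantum.cg import CG

j1 = j2 = 1

def cg(m1, m2, j, m):
    return CG(j1, m1, j2, m2, j, m).doit()

# Second relation of (15.31): sum over m1, m2 gives delta_{jj'} delta_{mm'}
ortho_ok = True
for j in (0, 1, 2):
    for jp in (0, 1, 2):
        for m in range(-min(j, jp), min(j, jp) + 1):
            s = sum(cg(m1, m2, j, m) * cg(m1, m2, jp, m)
                    for m1 in (-1, 0, 1) for m2 in (-1, 0, 1))
            ortho_ok = ortho_ok and simplify(s - (1 if j == jp else 0)) == 0
```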
Instead of the Clebsch–Gordan coefficients, one sometimes uses a set of symbols related to them
by constant factors, the Wigner 3j symbols, defined by

    ⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩ = (−1)^{j₁−j₂+m} √(2j+1) ( j₁  j₂   j )
                                                      ( m₁  m₂  −m ).    (15.32)
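The relation (15.32) can be verified symbolically; a sketch for j₁ = 1, j₂ = 1/2 (our choice), assuming sympy's CG and wigner_3j use the same phase conventions as here (they follow Condon–Shortley):

```python
from sympy import Rational, sqrt, simplify
from sympy.physics.quantum.cg import CG
from sympy.physics.wigner import wigner_3j

# Check eq. (15.32): CG = (-1)^{j1-j2+m} sqrt(2j+1) * 3j-symbol
j1, j2 = 1, Rational(1, 2)
rel_ok = True
for j in (Rational(1, 2), Rational(3, 2)):
    for m1 in (-1, 0, 1):
        for two_m2 in (-1, 1):
            m2 = Rational(two_m2, 2)
            m = m1 + m2
            if abs(m) > j:
                continue
            lhs = CG(j1, m1, j2, m2, j, m).doit()
            rhs = (-1)**(j1 - j2 + m) * sqrt(2*j + 1) * wigner_3j(j1, j2, j, m1, m2, -m)
            rel_ok = rel_ok and simplify(lhs - rhs) == 0
```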

Recursion Relations

We can find recursion relations between the Clebsch–Gordan coefficients by acting with J_± = J_1± + J_2±
on the basis decomposition relation, since successive actions of J± construct the j representation, of
J1± construct the j1 representation, and of J2± construct the j2 representation:

    J_± |j₁ j₂; jm⟩ = (J_1± + J_2±) Σ_{m₁,m₂} |j₁ j₂; m₁ m₂⟩⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩.    (15.33)
m1 ,m2

Using the formulas from the previous chapter for the actions of J_± on |jm⟩, we obtain the following
relation between states:

    √((j∓m)(j±m+1)) |j₁ j₂; j, m±1⟩
        = Σ_{m₁,m₂} [√((j₁∓m₁)(j₁±m₁+1)) |j₁ j₂; m₁±1, m₂⟩                 (15.34)
          + √((j₂∓m₂)(j₂±m₂+1)) |j₁ j₂; m₁, m₂±1⟩] ⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩.
Defining
    (m₁′ ≡ m₁ ± 1, m₂′ = m₂)  and  (m₁′ = m₁, m₂′ = m₂ ± 1),    (15.35)
for the two states, respectively, and multiplying from the left with ⟨j₁ j₂; m₁′ m₂′|, we find that

    √((j∓m)(j±m+1)) ⟨j₁ j₂; m₁′ m₂′|j₁ j₂; j, m±1⟩
        = √((j₁∓m₁′+1)(j₁±m₁′)) ⟨j₁ j₂; m₁′∓1, m₂′|j₁ j₂; jm⟩              (15.36)
        + √((j₂∓m₂′+1)(j₂±m₂′)) ⟨j₁ j₂; m₁′, m₂′∓1|j₁ j₂; jm⟩.

These recursion relations, together with the normalization condition and the phase and sign
convention, are enough to specify the Clebsch–Gordan coefficients completely, though we will not
give examples here.
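Although the text gives no examples, the recursion (15.36) is straightforward to check symbolically; a sketch for the upper sign with j₁ = j₂ = 1, j = 2 (our choice), using sympy's Clebsch–Gordan coefficients:

```python
from sympy import sqrt, simplify
from sympy.physics.quantum.cg import CG

def cg(m1, m2, j, m):
    return CG(1, m1, 1, m2, j, m).doit()   # j1 = j2 = 1

# Upper-sign recursion (15.36): sqrt((j-m)(j+m+1)) <m1' m2'|j, m+1>
#   = sqrt((j1-m1'+1)(j1+m1')) <m1'-1, m2'|jm>
#   + sqrt((j2-m2'+1)(j2+m2')) <m1', m2'-1|jm>
j = 2
rec_ok = True
for m in range(-2, 2):
    for m1p in (-1, 0, 1):
        m2p = m + 1 - m1p                  # coefficient vanishes unless m1'+m2' = m+1
        if abs(m2p) > 1:
            continue
        lhs = sqrt((j - m) * (j + m + 1)) * cg(m1p, m2p, j, m + 1)
        rhs = 0
        if 1 + m1p > 0:                    # skip terms whose prefactor vanishes
            rhs += sqrt((1 - m1p + 1) * (1 + m1p)) * cg(m1p - 1, m2p, j, m)
        if 1 + m2p > 0:
            rhs += sqrt((1 - m2p + 1) * (1 + m2p)) * cg(m1p, m2p - 1, j, m)
        rec_ok = rec_ok and simplify(lhs - rhs) == 0
```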

15.4 Sums of Three Angular Momenta, J₁ + J₂ + J₃: Racah Coefficients

In the case where three angular momenta are to be added, we have the following starting basis for
the Hilbert space:
    |j₁ m₁⟩ ⊗ |j₂ m₂⟩ ⊗ |j₃ m₃⟩ ≡ |j₁ j₂ j₃; m₁ m₂ m₃⟩.    (15.37)
To sum the angular momenta, and decompose these states into representations (of given j) of the
rotation group SO(3), we can first add two momenta, and then add the remaining one to the sum of
the first two. This can be done in three different ways.
(1) We can first set J 1 + J 2 = J 12 , and then set J 12 + J 3 = J . In the first step, we replace m1
and m2 with j12 and m12 , and in the second, we replace m3 and m12 with j and m (i.e., total angular
momentum and its projection onto z), for a total basis state of | j1 j2 j12 j3 ; j, m. Decomposing this
basis state in terms of the original basis, using the two steps above, leads to

    |j₁ j₂ j₁₂ j₃; jm⟩ = Σ_{m₁,m₂,m₁₂,m₃} |j₁ j₂ j₃; m₁ m₂ m₃⟩ ⟨j₁ j₂; m₁ m₂|j₁₂ m₁₂⟩ ⟨j₁₂ j₃; m₁₂ m₃|jm⟩,    (15.38)

where
    ⟨j₁ j₂; m₁ m₂|j₁₂ m₁₂⟩ ≡ ⟨j₁ j₂ j₃; m₁ m₂ m₃|j₁ j₂ j₁₂ j₃; m₁₂ m₃⟩
    ⟨j₁₂ j₃; m₁₂ m₃|jm⟩ ≡ ⟨j₁ j₂ j₁₂ j₃; m₁₂ m₃|j₁ j₂ j₁₂ j₃; jm⟩.    (15.39)

(2) Alternatively, we can first set J 2 + J 3 = J 23 , and then J 23 + J 1 = J . In the first step, we
replace m2 and m3 with j23 and m23 , and in the second, we replace m1 and m23 with j and m. The

new decomposition is

    |j₁ j₂ j₃ j₂₃; jm⟩ = Σ_{m₁,m₂,m₃,m₂₃} |j₁ j₂ j₃; m₁ m₂ m₃⟩ ⟨j₂ j₃; m₂ m₃|j₂₃ m₂₃⟩ ⟨j₁ j₂₃; m₁ m₂₃|jm⟩.    (15.40)

(3) The third possibility is to set J 1 + J 3 = J 13 , and then J 13 + J 2 = J . The corresponding last
decomposition is

    |j₁ j₃ j₁₃ j₂; jm⟩ = Σ_{m₁,m₂,m₃,m₁₃} |j₁ j₂ j₃; m₁ m₂ m₃⟩ ⟨j₁ j₃; m₁ m₃|j₁₃ m₁₃⟩ ⟨j₂ j₁₃; m₂ m₁₃|jm⟩.    (15.41)

So, all three methods are equivalent, and we should arrive at similar bases. This means that, in
particular, the bases obtained in points (1) and (2) are related by a unitary transformation. The matrix
elements between them can be written (after taking out real factors) in terms of the Racah W coefficients, or
equivalently the Wigner 6j symbols,

    ⟨j₁ j₂ j₁₂ j₃; jm|j₁ j₂ j₃ j₂₃; jm⟩ ≡ √((2j₁₂+1)(2j₂₃+1)) W(j₁, j₂, j, j₃; j₁₂, j₂₃)
                                       ≡ (−1)^{j₁+j₂+j₃+j} √((2j₁₂+1)(2j₂₃+1)) { j₁ j₂ j₁₂ }    (15.42)
                                                                               { j₃ j  j₂₃ }.
These matrices are rather complicated and tedious to obtain, though Racah showed that the W
coefficients have workable expressions.

15.5 Schwinger’s Oscillator Model

Finally, we note a very useful way to obtain the representations of the Lie algebra of SU(2) ≃
SO(3), by building its generators from the algebra of harmonic oscillators. It turns out that we need
two harmonic oscillators, one with operators a₊, a₊†, one with a₋, a₋†. The two oscillators will be
associated with positive and negative values of the J_z eigenvalue m, respectively.
We start with a single harmonic oscillator, with a, a† and number operator N = a† a. Thus its
algebra is
[a, a† ] = 1
[N, a† ] = a† aa† − a† a† a = a† [a, a† ] = +a† (15.43)
† † †
[N, a] = a aa − aa a = −[a, a ]a = −a.
The following commutation relations of N look indeed like the commutation relations
    [J_z/ℏ, J_±/ℏ] = ± J_±/ℏ,                            (15.44)
but the remaining one is not quite right, since it is
    [J_+/ℏ, J_−/ℏ] = 2 J_z/ℏ,                            (15.45)
which doesn’t match [a†, a] = −1.
This is why we actually need two oscillators in order to construct the Lie algebra, commuting one
with the other, so with simultaneous eigenvalues,
[a+ , a−† ] = [a− , a+† ] = 0. (15.46)

For this system, we can construct simultaneous eigenstates, in the usual manner, as a tensor product
of the states of the two oscillators,

    |n₊, n₋⟩ = |n₊⟩ ⊗ |n₋⟩ = [(a₊†)^{n₊} (a₋†)^{n₋} / √(n₊! n₋!)] |0, 0⟩.    (15.47)
Then we can construct the generators of the Lie algebra in terms of these two oscillators,
    J₊/ℏ ≡ a₊† a₋,
    J₋/ℏ ≡ a₋† a₊,                                       (15.48)
    2J_z/ℏ ≡ a₊† a₊ − a₋† a₋ = N₊ − N₋.
We can check that this indeed has the right commutation relations. For instance, the commutation
relation that did not work for a single oscillator now does work:

    [J₊/ℏ, J₋/ℏ] = [a₊† a₋, a₋† a₊] = a₊† [a₋, a₋†] a₊ − a₋† [a₊, a₊†] a₋ = N₊ − N₋ = 2J_z/ℏ.    (15.49)
Moreover, the action of the angular momentum operators on the basis of occupation
number states |n₊, n₋⟩ is
    (J₊/ℏ) |n₊, n₋⟩ = a₊† a₋ |n₊, n₋⟩ = √(n₋(n₊+1)) |n₊+1, n₋−1⟩
    (J₋/ℏ) |n₊, n₋⟩ = a₋† a₊ |n₊, n₋⟩ = √(n₊(n₋+1)) |n₊−1, n₋+1⟩           (15.50)
    (2J_z/ℏ) |n₊, n₋⟩ = (N₊ − N₋) |n₊, n₋⟩ = (n₊ − n₋) |n₊, n₋⟩.
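The construction (15.48)–(15.50) can be checked numerically in a truncated Fock space (the truncation d and the helper names below are our own choices, and ℏ = 1); here on the N = n₊ + n₋ = 3, i.e. j = 3/2, block:

```python
import numpy as np

d = 6                                           # Fock-space truncation (our choice)
a = np.diag(np.sqrt(np.arange(1, d)), 1)        # a|n> = sqrt(n)|n-1>
ad, I = a.T.copy(), np.eye(d)

# J+/hbar = a+^dag a-, J-/hbar = a-^dag a+, 2Jz/hbar = N+ - N-, eq. (15.48)
Jp = np.kron(ad, a)
Jm = np.kron(a, ad)
Jz2 = np.kron(ad @ a, I) - np.kron(I, ad @ a)   # this is 2 Jz / hbar

def ket(n_plus, n_minus):
    v = np.zeros(d * d)
    v[n_plus * d + n_minus] = 1.0
    return v

# Check the first line of (15.50) on the N = 3 block
action_ok = all(
    np.allclose(Jp @ ket(n, 3 - n), np.sqrt((3 - n) * (n + 1)) * ket(n + 1, 2 - n))
    for n in range(3))

# Check [J+, J-] = 2 Jz, eq. (15.49), on the same block
comm = Jp @ Jm - Jm @ Jp
comm_ok = all(np.allclose(comm @ ket(n, 3 - n), Jz2 @ ket(n, 3 - n))
              for n in range(4))
```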

We see that we can identify the occupation states for the oscillators, |n+ , n− , with the states in the
representation of angular momentum j, by making the identifications

n+ = j + m, n− = j − m. (15.51)

Defining the total occupation number, N = N+ + N− , we obtain that

n+ + n− ≡ N = 2 j. (15.52)

Substituting (15.51) in the general state |n₊, n₋⟩, we obtain the states in the spin j representation of
the SO(3) rotation group as oscillator states, with

    |jm⟩ = (a₊†)^{j+m} (a₋†)^{j−m} / √((j+m)! (j−m)!) |0⟩.    (15.53)
In this way, we can construct all the states in the representation of angular momentum j: we note
that mmax = j (so that j − m = 0) and mmin = − j (so that j + m = 0), otherwise the above state (15.53)
does not make sense. We also note that the state with m = j is

    |jj⟩ = (a₊†)^{2j} / √((2j)!) |0⟩,                    (15.54)
which can be interpreted as being composed of 2 j elements with angular momentum 1/2 each, all
with “spin up”, jz = +1/2. Thus the total angular momentum can be thought of as a sum of angular

momenta each having the minimum value, 1/2:

    j = Σ_{i=1}^{2j} (1/2)ᵢ.                             (15.55)

Thus, as we mentioned at the beginning, we can indeed construct all the representations of SO(3)
from just sums of the simplest, “spinor” representation.
In this Schwinger oscillator model, a₊† creates a spin 1/2 state with spin up, and a₋† creates one
with spin down, each state containing N = 2j quanta: j + m with spin up and j − m with spin down.
The Schwinger construction presented here for SO(3) ≃ SU(2) is actually much more general. In
fact, any Lie algebra can be constructed in terms of harmonic oscillators ai , ai† , which in turn allows
us, as above, to explicitly construct its representations. Indeed, in Lie group theory, one can show
that we can reduce the Lie algebra to a kind of product of SU (2) factors, after which we can just
follow the above analysis. We will not do this here, but the result is that this quantum mechanical
construction is useful for solving the issue of representations of general (Lie) groups.
Thus we can say that quantum mechanics helps not just with physics, but also with mathematics,
in terms of representations of Lie groups!

Important Concepts to Remember

• The angular momentum J is the generator of the SO(3) group of spatial rotations, and the values
  j of the angular momentum correspond to the representations of SO(3) to which physical states |ψ⟩
  belong.
• The spinor representation of SO(3) ≃ SU(2)/ℤ₂, j = 1/2, has generators given by the Pauli matrices,
  J_i = (ℏ/2) σ_i, and is the fundamental representation of SU(2).
• The Pauli matrices σ_i satisfy σ_i σ_j = δ_ij 1 + i ε_ijk σ_k; they are Hermitian and traceless and, since
  σ_i² = 1, also unitary.
• The matrices (1, σi ) are complete in the space of 2 × 2 matrices.
• The j = 1 representation is the adjoint representation of SO(3).
• We can compose two angular momenta to give J = J₁ + J₂, in which case the states can be
  described either as a tensor product of the original states, |j₁ m₁⟩ ⊗ |j₂ m₂⟩ ≡ |j₁ j₂; m₁ m₂⟩, or as
  states |j₁ j₂; jm⟩, with j₁, j₂ still defined, since we have two mutually commuting
  sets, (J₁², J₂², J_1z, J_2z) and (J₁², J₂², J², J_z).
• The relation between the two possible bases is given as
      |j₁ j₂; jm⟩ = Σ_{m₁,m₂} |j₁ j₂; m₁ m₂⟩⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩;
  the coefficients ⟨j₁ j₂; m₁ m₂|j₁ j₂; jm⟩ are the Clebsch–Gordan coefficients.


• In the Clebsch–Gordan coefficients we have (providing that the coefficient is nonzero) m = m1 +m2
and | j1 − j2 | ≤ j ≤ j1 + j2 , which means that we have the decomposition of the product of
representations into a sum of representations, as j1 ⊗ j2 = | j1 − j2 | ⊕ | j1 − j2 | + 1 ⊕ · · · ⊕ j1 + j2 .
• Up to a rescaling, the Clebsch–Gordan coefficients are the same as the Wigner 3j symbols,
  ( j₁  j₂   j )
  ( m₁  m₂  −m ).

• When composing three angular momenta, J₁ + J₂ + J₃ = J, we can do this by first adding 1
  and 2, or first adding 2 and 3 (or first adding 1 and 3); these are equivalent procedures, leading
  to coefficients for the transition, ⟨j₁ j₂ j₁₂ j₃; jm|j₁ j₂ j₃ j₂₃; jm⟩, which up to some rescaling are the
  Racah coefficients W(j₁, j₂, j, j₃; j₁₂, j₂₃) or the Wigner 6j symbols { j₁ j₂ j₁₂ }
                                                                        { j₃ j  j₂₃ }.
• In Schwinger’s oscillator model, we can construct the generators of SU(2) or SO(3), the angular
  momenta, in terms of two sets of oscillators a₊, a₊†, a₋, a₋†, as J₊/ℏ = a₊† a₋, J₋/ℏ = a₋† a₊, 2J_z/ℏ =
  N₊ − N₋, and the states in a representation in terms of the creation operators,
      |jm⟩ = (a₊†)^{j+m} (a₋†)^{j−m} / √((j+m)! (j−m)!) |0⟩,
so a+† creates a spin 1/2 up and a−† a spin 1/2 down.
• The Schwinger oscillator model is much more general; any Lie algebra can be represented in terms
of harmonic oscillators, and the states in a representation in terms of the creation operators acting
on a vacuum.

Further Reading
See [2], [1], [3] for more on angular momenta, and [8] for more on mathematical constructions for
SU (2) representations and for other Lie algebras.

Exercises

(1) Calculate
    Tr e^{α_i σ_i/2},                                    (15.56)
where σi are the Pauli matrices.
(2) Decompose the product of two spin 1 representations into (irreducible) representations of
SU (2), and list explicitly the relation between the basis elements for the two bases in terms
of Clebsch–Gordan coefficients.
(3) For the case in exercise 2, find all the Clebsch–Gordan coefficients for the lowest value of the
total spin j, using the recursion relations and the normalization conditions.
(4) For the sum of three spin 1 representations, in the case where the final spin is 2, write explicitly
the three decompositions of the final basis in terms of the tensor product basis (with a product
of two Clebsch–Gordan coefficients) and also write explicitly the resulting Racah coefficients.
(5) Rewrite the relation between the two bases in exercise 2 in terms of Schwinger oscillators.
(6) Consider the Lie algebra for Ji , Ki , i = 1, 2, 3,
    [J_i, J_j] = iℏ ε_ijk J_k,   [K_i, K_j] = iℏ ε_ijk J_k,   [J_i, K_j] = iℏ ε_ijk K_k.    (15.57)
Write it in terms of Schwinger oscillators, and give the general representation in terms of
Schwinger oscillators acting on a vacuum.
(7) Decompose the product of four spin 1 representations of SU (2), 1 ⊗ 1 ⊗ 1 ⊗ 1, into irreducible
SU (2) representations.
Applications of Angular Momentum
16 Theory: Tensor Operators, Wave Functions and
the Schrödinger Equation, Free Particles

After having described how to construct representations, and how to compose them (and to
decompose the products of representations into sums of representations, or J 1 + J 2 into J ), we now
apply these notions to physics. First, we learn how to write operators, associated with observables,
transforming in a given representation under rotation, i.e., “tensor operators”. Then, we find how
wave functions transform under rotations, and how the Schrödinger equation decomposes under
them. Then, we apply this formalism to the simplest case, that of free particles.

16.1 Tensor Operators

Consider classical measurable vectors, e.g., r, p, L, collectively called V_i, that transform according
to the rule

    V_i → V_i′ = Σ_j R_ij V_j,                           (16.1)

where R_ij is an SO(3) matrix, i.e., a rotation matrix in the defining j = 1, or vector, representation
(consisting of orthogonal 3 × 3 matrices).
Then, also classically, one considers tensors as objects transforming as products of the basic vector
representation,
    T_{i₁ i₂ ... iₙ} → T′_{i₁ i₂ ... iₙ} = R_{i₁ i₁′} R_{i₂ i₂′} · · · R_{iₙ iₙ′} T_{i₁′ i₂′ ... iₙ′}.    (16.2)
However, as we saw in the previous chapter, this representation, the product of vector representa-
tions, is not irreducible. It can in fact be reduced, i.e., decomposed into irreducible representations of
SO(3), which are all of angular momentum j. From the previous chapter, we can formally say that
we have

    ⊗ᵢ 1ᵢ = ⊕ⱼ j,                                        (16.3)

where j takes several values that can be calculated; or, in a different notation, where we denote the
representation by its value of j, and denote the effect of the angular momentum sum on states (tensor
product, decomposed as a sum),
    1 ⊗ 1 ⊗ · · · ⊗ 1 = · · · ⊕ (n − 1) ⊕ n.             (16.4)
Note that we have a tensor product of n quantities on the left and that on the right the possible j’s
take only integer values, since we are summing integer values (of j = 1) many times. In yet another
notation, we can denote the representation by its multiplicity (the number of states in it), for instance
j = 1 as 3, j = 2 as 5, etc., so that
184 16 Applications of Angular Momentum Theory

3 ⊗ 3 ⊗ · · · ⊗ 3 = · · · ⊕ (2(n − 1) + 1) ⊕ 2n + 1. (16.5)

This relation would be true as numbers also, i.e., 3 × 3 × · · · × 3 = · · · + (2(n − 1) + 1) + (2n + 1).
In particular, for a sum of only two vectors (giving a tensor product of representations) and thus
for the tensor Ti j , we have

1 ⊗ 1 = 0 ⊕ 1 ⊕ 2, (16.6)

meaning that, for j1 = j2 = 1, we have 0 = |1 − 1| ≤ j ≤ 1 + 1 = 2, so j = 0, 1, 2. Equivalently,


denoting the representation by its multiplicity, we have

3 ⊗ 3 = 1 ⊕ 3 ⊕ 5, (16.7)

which is also true as regular products and sums, 3 × 3 = 1 + 3 + 5 = 9, giving the total number of
states in the system.
Quantum mechanically, we have operators V̂_i corresponding to the observables V_i, for instance r̂_i,
p̂_i, L̂_i. Then the transformed operators V̂_i′ have the same transformation rules as the classical
observables. This is so since, if the states |ψ⟩ in the Hilbert space transform in some representation
with angular momentum j under a rotation with rotation matrix R,

    |ψ⟩ → |ψ′⟩ ≡ D^(j)(R)|ψ⟩,                            (16.8)

where D^(j)(R) is the rotation matrix in the (j) representation, then the quantum averages of the
operators V̂_i, being classically observable, must transform in the same way as the classical objects, i.e.,

    ⟨ψ|V̂_i|ψ⟩ → ⟨ψ|D^(j)†(R) V̂_i D^(j)(R)|ψ⟩ = Σ_k R_ik ⟨ψ|V̂_k|ψ⟩,    (16.9)

implying the transformation for operators

    V̂_i → V̂_i′ = D^(j)†(R) V̂_i D^(j)(R) = Σ_k R_ik V̂_k.    (16.10)

The infinitesimal form of the transformation law can be obtained by remembering the general form
of a rotation matrix in a representation (j), in terms of the generators J_i in this representation. In the
infinitesimal case, we have

    D^(j)(R) = exp(−(i/ℏ) α^i (J_i)_(j)) ≃ 1 − (i/ℏ) α^i (J_i)_(j),    (16.11)

and in the defining, or vector, representation, which for SO(3) happens to be also the adjoint one, we
have

    R_ij = [exp(−(i/ℏ) α^k J_k)]_ij ≃ δ_ij − (i/ℏ) α^k (J_k)_ij = δ_ij − α^k ε_kij,    (16.12)

where we have used (J_k)_ij = −iℏ f_kij = −iℏ ε_kij.
Then the transformation law D† V̂_i D = Σ_l R_il V̂_l implies

    −(i α^k/ℏ) [V̂_i, J_k]_(j) = −(i α^k/ℏ) (−iℏ ε_kil V̂_l)_(j)  ⇒
    [V̂_i, J_k]_(j) = iℏ ε_ikl (V̂_l)_(j),                (16.13)
which can be taken as an alternative definition of a vector operator.
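The simplest vector operator is J itself, for which (16.13) reduces to the angular momentum algebra; a numerical sketch in the j = 1 representation (our own matrices, in the standard |m⟩ basis, with ℏ = 1):

```python
import numpy as np

# Spin-1 angular momentum matrices (hbar = 1) in the |m> basis
s = 1 / np.sqrt(2)
J = [s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex),
     s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex),
     np.diag([1.0, 0.0, -1.0]).astype(complex)]

eps = np.zeros((3, 3, 3))
for i, k, l in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, k, l], eps[i, l, k] = 1.0, -1.0

# Vector-operator condition (16.13): [V_i, J_k] = i eps_{ikl} V_l, with V = J
vector_ok = all(
    np.allclose(J[i] @ J[k] - J[k] @ J[i],
                1j * sum(eps[i, k, l] * J[l] for l in range(3)))
    for i in range(3) for k in range(3))
```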

Thus a quantum mechanical tensor operator is an object T̂_m^(j), in a representation of integer (not
half-integer!) angular momentum j = k ∈ ℕ, where m is the standard index in the representation,
m = −j, ..., j, that therefore transforms according to the rotation matrices in the representation j
under a transformation of the states by matrices in a different representation j̃,

    D^(j̃)†(R) T̂_m^(j) D^(j̃)(R) = Σ_{m′=−j}^{j} D^(j)*_{mm′}(R) T̂_{m′}^(j),    (16.14)

where we have used D^T(R⁻¹) = D*(R) to write the complex conjugate of the rotation matrix.
The infinitesimal form of the above transformation law for tensors is similar to the one for vectors,
except that now we must use the (complex conjugate of the) matrix of the generator J_k in the j
representation,

    (J_k)*_{mm′} ≡ ⟨jm|J_k|jm′⟩* = ⟨jm′|J_k|jm⟩,         (16.15)

to obtain

    [T̂_m^(j), Ĵ_k] = −Σ_{m′} (J_k)*_{mm′} T̂_{m′}^(j),    (16.16)

or

    [Ĵ_k, T̂_m^(j)] = Σ_{m′} T̂_{m′}^(j) ⟨jm′|J_k|jm⟩.    (16.17)

16.2 Wigner–Eckart Theorem

This is a very useful theorem, which states that the matrix elements of the tensor operator T̂_m^(j)
with respect to |α j̃ m̃⟩ eigenstates (α stands for all other indices) are given by Clebsch–Gordan
coefficients times a function independent of m. Denoting the tensor operator by T̂_q^(k), in
order to match standard notation and to avoid confusing j and j̃, we have

    ⟨α′, j′m′|T̂_q^(k)|α, jm⟩ = ⟨jk; mq|jk; j′m′⟩ ⟨α′, j′‖T^(k)‖α, j⟩ / √(2j+1),    (16.18)

where the double verticals indicate that the matrix element is independent of m.

Proof. To prove the relation, we consider the action of the tensor operator on a state,

    T̂_q^(k) |α, jm⟩,                                    (16.19)

and construct a linear combination with coefficients given by the Clebsch–Gordan coefficients,

    |τ, j̃ m̃⟩ ≡ Σ_{m,q} T̂_q^(k) |α, jm⟩ ⟨jk; mq|jk; j̃ m̃⟩.    (16.20)

Using the orthogonality property of the Clebsch–Gordan coefficients, we reverse the relation
and write

    T̂_q^(k) |α, jm⟩ = Σ_{j̃,m̃} |τ, j̃ m̃⟩ ⟨jk; mq|j̃ m̃⟩.    (16.21)

Using (16.17) and (16.21), we find that Ĵ_k acts on |τ, j̃ m̃⟩ just as it acts on |j̃ m̃⟩; the proof of this is left as an
exercise. Thus

    ⟨α′ j′m′|τ, j̃ m̃⟩ ∝ δ_{j′ j̃} δ_{m′ m̃},              (16.22)

which implies the statement of the theorem. q.e.d.

As an application of the theorem, we see that the matrix elements of T̂_q^(k) are zero unless

    q = m′ − m
    |j − j′| ≤ k ≤ j + j′.                               (16.23)

This selection rule is very useful for quantum processes involving matrix elements for transition
between states, as we will see later on in the book.
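The first selection rule can be confirmed directly from the vanishing of sympy's Clebsch–Gordan coefficients (shown here for j = k = 1, j′ = 2, an arbitrary choice of ours):

```python
from sympy.physics.quantum.cg import CG

# m-selection rule of (16.23): <j k; m q|j k; j' m'> = 0 unless q = m' - m
j, k, jp = 1, 1, 2
selection_ok = all(
    CG(j, m, k, q, jp, mp).doit() == 0
    for m in range(-j, j + 1)
    for q in range(-k, k + 1)
    for mp in range(-jp, jp + 1)
    if q != mp - m)
```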

16.3 Rotations and Wave Functions

For quantum observables, one can measure the eigenvalues of the corresponding operators. But we
saw that the classical angular momentum,

    L_i = ε_ijk x_j p_k   (L = r × p),                   (16.24)

corresponds to a quantum operator

    L̂_i = ε_ijk X̂_j P̂_k,                               (16.25)

acting as a generator of the rotation group SO(3); hence its eigenvalues ℏ² j(j+1), defined by a
given j, are something that one can measure. Therefore measurements will correspond to a given j,
and so a given representation of SO(3), a quantization condition for angular momentum that will be
reflected in the wave functions.
In the coordinate basis, angular momenta have matrix elements

    ⟨x|L̂_i|x′⟩ = ε_ijk ⟨x|X̂_j P̂_k|x′⟩ = ε_ijk x_j ⟨x|P̂_k|x′⟩ = −iℏ ε_ijk x_j (∂/∂x_k) δ(x − x′).    (16.26)

When acting on a wave function in the coordinate basis |x⟩, the usual manipulations lead to

    ⟨x|L̂_i|ψ⟩ = ∫ dx′ ⟨x|L̂_i|x′⟩⟨x′|ψ⟩
              = −iℏ ε_ijk x_j (∂/∂x_k) ψ(x) = −iℏ (r × ∇)_i ψ.    (16.27)

It is useful to go to spherical coordinates r, θ, φ, defined from the Cartesian coordinates
x₁, x₂, x₃ by
    x₁ = r sin θ cos φ
    x₂ = r sin θ sin φ                                   (16.28)
    x₃ = r cos θ,

Figure 16.1 Coordinate system.

as in Fig. 16.1. The new system of coordinates has basis vectors e_r, e_θ, and e_φ, which obey the
relations

    e_r = e_θ × e_φ,   e_θ = e_φ × e_r,   e_φ = e_r × e_θ,    (16.29)

and the ∇ operator becomes, in spherical coordinates,

    ∇ = e_r ∂/∂r + e_θ (1/r) ∂/∂θ + e_φ (1/(r sin θ)) ∂/∂φ.    (16.30)

This allows us to write the form of the angular momentum operator acting on wave functions in
spherical coordinates:

    L = −iℏ r × ∇ = −iℏ (e_φ ∂/∂θ − e_θ (1/sin θ) ∂/∂φ).    (16.31)
By using the decomposition of the spherical basis vectors in the Cartesian basis,

    e_r = sin θ cos φ e₁ + sin θ sin φ e₂ + cos θ e₃
    e_θ = cos θ cos φ e₁ + cos θ sin φ e₂ − sin θ e₃     (16.32)
    e_φ = −sin φ e₁ + cos φ e₂,

we can also decompose L, written above in spherical coordinates, into L₁, L₂, L₃, as follows:

    L₁ = −iℏ (−sin φ ∂/∂θ − cos φ cot θ ∂/∂φ)
    L₂ = −iℏ (cos φ ∂/∂θ − sin φ cot θ ∂/∂φ)             (16.33)
    L₃ = −iℏ ∂/∂φ.
Then we can also construct the operators that change m,

    L̂₊ = L̂₁ + i L̂₂ = ℏ e^{iφ} (∂/∂θ + i cot θ ∂/∂φ)
    L̂₋ = L̂₁ − i L̂₂ = ℏ e^{−iφ} (−∂/∂θ + i cot θ ∂/∂φ).    (16.34)

We want to write wave functions in radial coordinates to match the symmetry of the relevant
angular momentum operators, which means that we must consider basis coordinate states of the type

|x  ≡ |r, n = |r ⊗ |n, (16.35)

where n = n (θ, φ) is a unit vector defined by the angles θ and φ, and corresponding factorized states

|ψ = |ψr  ⊗ |ψn . (16.36)

Then the wave functions factorize as well, or in other words, we can consider the separation of
variables in order to be able to act with L̂_i on them and obtain eigenvalues,

    ⟨r, n|L̂_i|ψ⟩ = ⟨r|ψ_r⟩ ⟨n|L̂_i|ψ_n⟩
                 ≡ −iℏ ψ_r(r) (e_φ ∂/∂θ − e_θ (1/sin θ) ∂/∂φ)_i ⟨n|ψ_n⟩.    (16.37)

Here we have defined the radial wave function

    ⟨r|ψ_r⟩ ≡ ψ_r(r).                                    (16.38)

We also define

n (θ, φ)|ψn  ≡ ψ(θ, φ). (16.39)

Since, as we said, we are interested in eigenfunctions of L, it means that |ψn  is a state |lm.
Moreover, we will see shortly that in fact we are interested only in l ∈ N (so m ∈ Z), in which
case the angular wave function is a spherical harmonic,

Ylm (θ, φ) ≡ n (θ, φ)|lm. (16.40)

Then, the Y_lm(θ, φ) are eigenfunctions of L² and L_z, namely

    L² Y_lm(θ, φ) = ℏ² l(l+1) Y_lm(θ, φ)
    L_z Y_lm(θ, φ) = ℏ m Y_lm(θ, φ).                     (16.41)

But, since we saw that in spherical coordinates L_z = −iℏ ∂/∂φ, we obtain that

    Y_lm(θ, φ) = e^{imφ} y_lm(θ).                        (16.42)
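The L_z eigenvalue equation in (16.41) can be checked symbolically on sympy's spherical harmonics Ynm (a sketch with ℏ = 1, for l ≤ 2):

```python
from sympy import I, diff, simplify, symbols, Ynm

theta, phi = symbols('theta phi')

# L_z = -i d/dphi (hbar = 1); check L_z Y_lm = m Y_lm, as in eq. (16.41)
eigen_ok = True
for l in range(3):
    for m in range(-l, l + 1):
        Y = Ynm(l, m, theta, phi).expand(func=True)
        eigen_ok = eigen_ok and simplify(-I * diff(Y, phi) - m * Y) == 0
```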

We will give a better definition of these spherical harmonics later; for the moment this is all that
we need to say about them.

16.4 Wave Function Transformations under Rotations

Under rotations, since the Ylm (θ, φ) describe states |ψ ∝ |lm, they must transform under the l
representation of SO(3). Under n  = Rn, these states transform under the rotation matrix D(R),

|n  = D(R)|n, (16.43)



and correspondingly wave functions transform with the rotation matrix in the l representation,
since then ⟨n′|lm⟩ = ⟨n|D(R)†|lm⟩, so

    Y_lm(n′) ≡ ⟨n′|lm⟩ = ⟨n|D(R)†|lm⟩ = Σ_{m′} D^(l)*_{mm′}(R) ⟨n|lm′⟩
             = Σ_{m′} D^(l)*_{mm′}(R) Y_{lm′}(n).        (16.44)
We now make an important observation. Since we have found that
Ylm (θ, φ) ∝ eimφ , (16.45)
it means that indeed, for periodicity of Ylm (θ, φ) we must have only m ∈ Z, so that l ∈ N, as we have
assumed before.
However, it also means that if we considered instead m ∈ Z/2, for half-integer spin, in particular
for j = 1/2, we would obtain
Ylm (θ, 2π) = eiπYlm (θ, 0) ⇒ ψ(r, θ, 2π) = −ψ(r, θ, 0), (16.46)
and only a rotation by 4π would bring us back to the original state.
Equivalently, consider the formula from Chapter 14 concerning a rotation with Euler angle φ
around the third axis, corresponding to the j = 1 representation, i.e., the vector or fundamental
representation,

    R₃(φ) = ( cos φ  −sin φ  0 )
            ( sin φ   cos φ  0 )                         (16.47)
            (   0       0    1 ),
for which R₃(2π) = 1, as expected for j = 1 (integer). On the other hand, the same rotation acting
on the space of j = 1/2 states, the “spinor” representation, i.e., acting as an SU(2) matrix, gives

    R̃₃(φ) = ( e^{iφ/2}      0     )
            (    0      e^{−iφ/2} ),                     (16.48)

so that
    R̃₃(2π) = −1  ⇒  R̃₃(4π) = 1.                         (16.49)
This condition, in general gives a good definition of a spinor: it is a representation in which only
a rotation by 4π, but not by 2π, gives back the same state. Indeed, in general we find that a rotation
by φ in a representation of angular momentum j has a group element

    g ∝ exp((i/ℏ) φ J₃^(j)).                             (16.50)

Thus, if J₃ has eigenvalue ℏj, and φ = 2π, we obtain

    g(2π) = e^{2πi j} g(0) = (−1)^{2j} g(0).             (16.51)
We might ask: is it not contradictory to have a rotation by 2π that changes the state? The answer
is that it is not, as long as the change is unobservable, which it is: we will see that a single spinor is
unobservable; only two spinors are observable, and, for a state of two spinors, nothing changes.
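As a quick numerical illustration (a minimal sketch in plain Python, with ħ set to 1; the function name is my own), we can exponentiate J3 in the j = 1/2 representation, where J3 = diag(1/2, −1/2) is diagonal, and check that a 2π rotation gives −1 while only a 4π rotation gives +1:

```python
import cmath

def g3_spinor(phi):
    # exp(i*phi*J3) for j = 1/2, with J3 = diag(1/2, -1/2) (hbar = 1);
    # the matrix is diagonal, so the exponential acts entry by entry
    return [cmath.exp(1j * phi / 2), cmath.exp(-1j * phi / 2)]

g2pi = g3_spinor(2 * cmath.pi)
g4pi = g3_spinor(4 * cmath.pi)
print(g2pi)  # both entries -1: a 2*pi rotation flips the sign of a spinor
print(g4pi)  # both entries +1: only a 4*pi rotation is the identity
```

The same two lines reproduce (16.51): the overall phase is e^{2πij} = (−1)^{2j}, so it is +1 for integer j and −1 for half-integer j.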
Finally, we note that rotation matrices in the j representation can be found using the Schwinger
method of representing generators using oscillators. We could calculate them explicitly and obtain
the so-called Wigner formula (so, in this case, the method is also known as the Wigner method),
though we will not do this here.

16.5 Free Particle in Spherical Coordinates

Our first application of the formalism that we have developed will be to a free particle in spherical
coordinates, as a warm-up for later, more involved, cases. We choose, instead of the |r⟩ = |x, y, z⟩
coordinate basis, the spherical coordinate basis |r, θ, φ⟩.
Just as we had before ket states |p⟩ = |px py pz⟩ corresponding to the eigenvalues of three
commuting observables P̂x , P̂y , P̂z , so now we choose instead three other commuting observables,
related to the rotational invariance. That means that L 2 and L z are certainly involved, but we must
choose a third. The simplest choice is the Hamiltonian Ĥ, giving the energy E.
Indeed, for a free particle, the Hamiltonian is

Ĥ = P̂²/2m, (16.52)

with eigenvalue

E = p²/2m. (16.53)

Since

L̂i = ε_ijk X̂j P̂k , (16.54)

we can check that [Ĥ, L̂z] = 0 and also [Ĥ, L̂²] = 0.


The Schrödinger equation in a coordinate basis,

⟨x|Ĥ|ψ⟩ = −(ħ²/2m) ∇²ψ(x) = Eψ(x), (16.55)

can be rewritten in terms of L̂², since (using Σ_i ε_ijk ε_ilm = δ_jl δ_km − δ_jm δ_kl , which can be proven
by analyzing all possibilities)

L̂i L̂i = ε_ijk ε_ilm X̂j P̂k X̂l P̂m = X̂j P̂k X̂j P̂k − X̂j P̂k X̂k P̂j
       = X̂j X̂j P̂k P̂k − X̂j P̂j X̂k P̂k + iħ X̂j P̂j , (16.56)

where in the second line we have used the canonical commutation relations. Then, we rewrite (16.56)
in spherical coordinates,

L̂i L̂i = −iħ r̂ p̂r − r̂ p̂r r̂ p̂r + r̂² p̂² = r̂² (p̂² − p̂r²), (16.57)

where we have used [r̂, p̂r] = iħ. We see that therefore

p̂² = p̂r² + (1/r²) L̂². (16.58)
We can also prove this directly in the coordinate basis, since then

pr = −iħ (1/r) (∂/∂r) r (16.59)

and

−L²/ħ² = Δθ,φ = (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂φ² , (16.60)

and finally

Δ = ∂²/∂r² + (2/r) ∂/∂r + (1/r²) Δθ,φ , (16.61)

leading to

p² = −ħ²Δ = −ħ²∇² = pr² + L²/r² , (16.62)

for r ≠ 0.
The Schrödinger equation becomes

(p̂²/2m) ψ(r, θ, φ) = (1/2m) ( p̂r² + L̂²/r² ) ψ(r, θ, φ) = Eψ(r, θ, φ), (16.63)

and can be solved by separation of variables, writing the wave function as a product of a radial
function and a spherical harmonic,

ψlm (r, θ, φ) = fl (r) Ylm (θ, φ), (16.64)

where the spherical harmonics are eigenfunctions of L̂² and L̂z ,

L̂² Ylm (θ, φ) = l (l + 1) ħ² Ylm (θ, φ)
L̂z Ylm (θ, φ) = mħ Ylm (θ, φ). (16.65)

The Schrödinger equation then reduces to an eigenvalue equation for the radial wave function
fl (r),

[ p̂r²/2m + ħ² l (l + 1)/(2mr²) ] fl (r) = E fl (r) ⇒
(ħ²/2m) [ −( (1/r) (∂/∂r) r )² + l (l + 1)/r² − 2mE/ħ² ] fl (r) = 0. (16.66)

Defining

k² ≡ 2mE/ħ² (16.67)

and ρ = kr, we find the equation

[ ∂²/∂ρ² + (2/ρ) ∂/∂ρ + 1 − l (l + 1)/ρ² ] fl (ρ) = 0. (16.68)

This is in fact the equation for the spherical Bessel functions jl (ρ), meaning that the solution to
the Schrödinger equation in spherical coordinates is

⟨r, θ, φ|lm⟩ = ψlm (r, θ, φ) = Ylm (θ, φ) jl (kr). (16.69)
In fact, the solution to the Schrödinger equation in Cartesian coordinates,

⟨x|k⟩ = e^{ik·r}, (16.70)

can be expanded in the basis of the solutions in spherical coordinates as

e^{ik·r} = Σ_{l=0}^{∞} Σ_{m=−l}^{l} alm (k) Ylm (θ, φ) jl (kr). (16.71)

We will continue this analysis in Chapter 19; for the moment we just note that the final result can
be rewritten as

e^{ikr cos θ} = Σ_{l=0}^{∞} (2l + 1) i^l jl (kr) Pl (cos θ)
             = 4π Σ_{l=0}^{∞} Σ_{m=−l}^{l} i^l jl (kr) Y∗lm (n1) Ylm (n2), (16.72)

where in the last form we used two directions n1 , n2 for the spherical harmonics, that have an angle
θ between them.
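The first line of this expansion is easy to verify numerically. The sketch below (plain Python; the series for jl and Bonnet's recursion for Pl are my own helper functions, not from the text) truncates the sum at l = 30, where the neglected terms are astronomically small:

```python
import cmath, math

def j_l(l, x):
    # spherical Bessel function from its (stable, fast-converging) power series:
    # j_l(x) = sum_k (-1)^k x^(l+2k) / (2^k k! (2l+2k+1)!!)
    dfact = 1.0
    for n in range(3, 2 * l + 2, 2):
        dfact *= n                      # builds (2l+1)!!
    term = x**l / dfact
    total, k = term, 1
    while abs(term) > 1e-18:
        term *= -x * x / (2 * k * (2 * (l + k) + 1))
        total += term
        k += 1
    return total

def P_l(l, w):
    # Legendre polynomial via Bonnet's recursion
    p0, p1 = 1.0, w
    if l == 0:
        return p0
    for n in range(1, l):
        p0, p1 = p1, ((2 * n + 1) * w * p1 - n * p0) / (n + 1)
    return p1

kr, w = 3.0, 0.4                        # kr and cos(theta), arbitrary test values
series = sum((2 * l + 1) * (1j)**l * j_l(l, kr) * P_l(l, w) for l in range(31))
exact = cmath.exp(1j * kr * w)
print(abs(series - exact))              # truncation error, far below 1e-10
```

The decomposition converges quickly because jl (kr) falls off like (kr)^l/(2l+1)!! once l exceeds kr, so only l ≲ kr contributes appreciably.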

Important Concepts to Remember

• The decomposition of products of spin j representations, in particular spin 1 representations, into
irreducible representations is denoted equivalently as Σ_i 1_i = j, as 1 ⊗ 1 ⊗ · · · ⊗ 1 = · · · ⊕ (n − 1) ⊕
n, or (using the dimension of the representation) as 3 ⊗ 3 ⊗ · · · ⊗ 3 = · · · ⊕ (2(n − 1) + 1) ⊕ (2n + 1),
which is also true for products of numbers.
• The classical tensor transformation law under a rotation with matrix Rij , Ti1i2...in →
T′i1i2...in = Ri1j1 Ri2j2 . . . Rinjn Tj1j2...jn , becomes the transformation law for states |ψ⟩ → |ψ′⟩ ≡
D^(j)(R)|ψ⟩ and, for operators T̂m^(j) in a representation j ∈ N, T̂m^(j) → D^†(j̃)(R) T̂m^(j) D^(j̃)(R) =
Σ_{m′=−j}^{j} D^(j)∗_{mm′}(R) T̂m′^(j), or [Ĵk , T̂m^(j)] = Σ_{m′} T̂m′^(j) ⟨jm′|Jk|jm⟩ (for vectors, [V̂i , Jk] = iħ εikj V̂j ).
• The Wigner–Eckart theorem states that the matrix elements (in |α jm⟩) of tensor operators T̂q^(k)
are given by Clebsch–Gordan coefficients for summing j + k = j′, times a function independent
of m: ⟨α′, j′m′|T̂q^(k)|α, jm⟩ = ⟨j k; mq| j k; j′m′⟩ ⟨α′ j′‖T^(k)‖α j⟩/√(2j + 1).
• For a given representation of the SO(3) rotation group, i.e., a given integer angular momentum l,
we have |ψ⟩ = |ψr⟩ ⊗ |ψn⟩, or, for wave functions, ψ(r, θ, φ) = ψr (r) Ylm (θ, φ), where Ylm (θ, φ)
is the spherical harmonic e^{imφ} ylm (θ).
• We see that single-valued spherical harmonics require l ∈ N, whereas for half-integer j we have
spinors, which under a 2π rotation acquire a minus sign, ψ(r, θ, 2π) = −ψ(r, θ, 0), and return to
themselves under a 4π rotation.
• For a free particle, since −ħ²Δθ,φ = L̂², we have a centrifugal potential ħ² l (l + 1)/2mr², so the solution
to the Schrödinger equation in spherical coordinates is Ylm (θ, φ) jl (kr) ( jl is the spherical Bessel
function), and indeed e^{ik·r} decomposes in these solutions as basis functions with coefficients
alm (k).

Further Reading
See [2], [3], [1] for more details.

Exercises

(1) Consider the symmetric traceless operator

X̂i X̂j − (1/3) X̂² δij , (16.73)
where X̂i corresponds to the position of a particle. Is it a tensor operator? Justify your answer
with a calculation.
(2) Consider a free particle. Rewrite the matrix elements

⟨E′, l′m′| P̂i P̂j − (1/3) P̂² δij |E, lm⟩ (16.74)
3
in terms of Clebsch–Gordan coefficients and other quantities.
(3) Write down the differential equation satisfied by the reduced spherical harmonic ylm (θ).
(4) Express the Schrödinger equation of the free particle in polar coordinates (r, θ, z), making use
of L z .
(5) Write down the relevant separation of variables for exercise 4.
(6) Consider two free particles, and write down the wave function (in coordinate space) of this
system (of distinguishable particles) for the case where the two particles coincide, in terms of
the total angular momentum L 1 + L 2 .
(7) Find the normalization condition for orthonormal free particle solutions Ajl (kr)Ylm (θ, φ).
17 Spin and L + S

Until now, we have assumed that J = L is an orbital angular momentum, which would mean that
states belong to representations of an arbitrary j, depending on the wave function. This reasoning
was based on the classical theory. However, in relativistic theory, SO(3) rotational invariance
is enhanced to Lorentz SO(3, 1) invariance and, further, to Poincaré invariance under the group
I SO(3, 1). The latter, however, leads to the existence of an intrinsic representation for an SO(3)
group, called “spin”, that is associated with the type of particle, i.e., a given particle has a given spin
s, that cannot be changed. Moreover, the statistics of identical particles, Bose–Einstein or Fermi–
Dirac (to be studied further later on in the book), are associated with integer spin and half-integer
spin, respectively, a statement known as the “spin–statistics theorem”, which is however beyond the
scope of this book (it is a statement in quantum field theory).
Therefore, a theory of spin arises from a relativistic theory, quantum field theory. For the electron,
the simplest theory arises from the relativistic Dirac equation, to be studied towards the end of the
book. At this point, however, we use nonrelativistic theory, which is based on classical intuition.
Therefore we can think of the spin as an intrinsic “angular momentum” related to a rotation around
an axis, i.e., to some precessing “currents”. However, unlike L, S is intrinsic, i.e., it is always the
same for a given particle, whereas L varies.
For the electron, the spin has s = 1/2, so S² = ħ² s(s + 1) = (3/4)ħ², which corresponds to a two-
dimensional Hilbert space for the simple two-state system described towards the beginning of the
book. The states, as there, are written as
|1/2, +1/2⟩ = |+⟩, |1/2, −1/2⟩ = |−⟩. (17.1)

17.1 Motivation for Spin and Interaction with Magnetic Field

To motivate the study of spin, we look at the interaction of angular momenta with a magnetic field
and in particular the Stern–Gerlach experiment described in Chapter 0.
Consider the coupling of a charged particle, say an electron, to an electromagnetic field. This
is done through “minimal coupling”, which replaces derivatives acting on states of the particle as
follows:

∂i → ∂i − i(q/ħ) Ai ⇒ ∇ → ∇ − i(q/ħ) A, (17.2)

where A is the vector potential of electromagnetism and q the electric charge of the particle. Then,
in quantum mechanics, the canonical momentum operator acting on a wave function is replaced as
follows:

p̂ = (ħ/i) ∇ → p̂ − qA. (17.3)

The Hamiltonian operator of a free particle therefore is also changed by the electromagnetic
interaction:

Ĥ = p̂²/2m → (1/2m) (p̂ − qA)² = p̂²/2m − (q/2m) (p̂·Â + Â·p̂) + q²Â²/2m. (17.4)

If the electromagnetic potential A corresponds to a constant magnetic field B, then in the Coulomb
gauge we have

∇ × A = B,  ∇ · A = 0, (17.5)

and we can choose a vector potential

A = (1/2) B × r. (17.6)
2
Indeed, using

Σ_k ε_ijk ε_klm = δ_il δ_jm − δ_im δ_jl , (17.7)

we obtain

(1/2) (∇ × (B × r))_i = (1/2) ε_ijk ε_klm ∂j (Bl x_m) = (1/2) (Bi δ_jj − Bj δ_ij) = Bi . (17.8)

Then the (quantum) interaction Hamiltonian is

Ĥint = −(q/4m) [ p̂ · (B × r̂) + (B × r̂) · p̂ ] = −(q/4m) ε_ijk [ P̂i Bj X̂k + Bi X̂j P̂k ]
     = −(q/2m) B · L̂ + constant. (17.9)
On the other hand, in general the interaction Hamiltonian for a closed current interacting with a
magnetic field is

Hint = −μ · B, (17.10)

where the magnetic moment μ of a loop of current I surrounding a disk with area A = πr² and with
unit normal vector e is

μ = I A e. (17.11)

The current through the loop is

I = qv/2πr, (17.12)

which means that the magnetic moment is

μ = (qv/2πr) πr² = (q/2m) mvr = (q/2m) L ⇒ μ = (q/2m) L. (17.13)
We see that then, substituting this μ in the interaction Hamiltonian −μ · B, we obtain the form of
the quantum Hamiltonian given in (17.9).
Moreover, the angular momentum L suffers a precession around B, situated at an angle θ to L, of

dL/dt = μ × B = (q/2m) L B sin θ e_φ , (17.14)

but on the other hand we have (since dL moves on a circle of radius L sin θ)

(1/(L sin θ)) |dL/dt| = dφ/dt ≡ ω, (17.15)

so the angular precession speed is

ω = −(q/2m) B. (17.16)
The proportionality between the magnetic moment and the angular momentum in quantum
mechanics leads to a quantized projection of the magnetic moment onto any direction Oz,

μ = (q/2m) L ⇒ μz = (q/2m) Lz , (17.17)

and, since Lz = ml ħ, we obtain

μz = (eħ/2m) ml ≡ μB ml , (17.18)

where μB is called the Bohr magneton, and ml = −l, . . . , +l.
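Plugging in SI values gives the familiar size of the Bohr magneton; the numerical constants in this sketch are assumed CODATA values quoted from standard tables:

```python
# Bohr magneton mu_B = e*hbar/(2*m_e), evaluated with SI constants;
# the three constants below are assumptions taken from standard tables
e_charge = 1.602176634e-19      # elementary charge, C (exact in SI)
hbar = 1.054571817e-34          # reduced Planck constant, J s
m_e = 9.1093837015e-31          # electron mass, kg

mu_B = e_charge * hbar / (2 * m_e)
print(mu_B)                     # ~9.274e-24 J/T
```

Per unit of ħ in Lz , the orbital magnetic moment of the electron thus comes in steps of about 9.3 × 10⁻²⁴ J/T.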
Thus, consider an electron beam traversing a region with a constant B field (in the Oz direction):
the electrons in the beam have no orbital angular momentum, and yet in the Stern–Gerlach
experiment we see a split of the beam into two, corresponding to a two-state system, therefore
indicating that the electrons are in an angular momentum j = 1/2 state. This means that there must
be an electron spin of s = 1/2, with ms = ±1/2 and Sz = ms ħ. Moreover, then we can measure Hint
and therefore μz , the projection of μ along the direction of B. We find

μz = g (eħ/4m) (2ms), (17.19)

where 2ms = ±1, so that

μz = ±g μB/2, (17.20)
and we find that g  2. In fact, from the Dirac equation we find that g = 2 exactly, whereas in QED
there are small corrections that take us slightly away from g = 2. This underscores the fact that spin
is really a different type of angular momentum, unlike the orbital angular momentum. This was to be
expected, since it is an intrinsic property of the particle.
For particles having a different spin, such as s = 1 (these are called “vector particles”, photons are
an example), we have different values for g but again the Hilbert space (the representation) can be
described in the same way as for the angular momentum states.

17.2 Spin Properties

For any particle that has both orbital angular momentum L and spin S, we can consider states
characterized by both, as well as by the total angular momentum J = L + S. If we have several
particles composing the system, for instance an atom, with the nucleus, composed of nucleons, and
the electrons around it, we can add up the angular momenta, as follows:

S_total = Σ_i S_i ,  L_total = Σ_i L_i ,  J_total = L_total + S_total . (17.21)

In the case of an atom, both the nucleons and the electrons are particles of spin s = 1/2, but we
have several possibilities for Stotal , by the general vector addition theory for angular momenta.
For one particle, we can consider eigenstates of L², Lz , together with S², Sz , as well as other
observables, with eigenvalues generically called α:

|α, lm⟩ ⊗ |s ms⟩. (17.22)

A generic state |ψ⟩ can be described in a coordinate and spin basis ⟨r, ms | by a wave function

⟨r, ms |ψ⟩ = ψ_ms (r), or ψ(r, ms). (17.23)

In the case of s = 1/2, we describe the states by wave functions ψ± (r), corresponding to
ms = ±1/2. We can represent these wave functions only in the coordinate basis ⟨r |, so states in the
spin Hilbert space are given by

⟨r |ψ⟩ = ψ+ (r)|+⟩ + ψ− (r)|−⟩ (17.24)

or

ψ = ( ψ+ (r), ψ− (r) )ᵀ, (17.25)

which is a (two-component) spinor field.
We can also represent just the spin state (in the spin Hilbert space) as

|α⟩ = |+⟩⟨+|α⟩ + |−⟩⟨−|α⟩ ≡ C+ |+⟩ + C− |−⟩ = ( C+ , C− )ᵀ, (17.26)

where we have used a completeness relation in the spin Hilbert space, |+⟩⟨+| + |−⟩⟨−| = 1, and
defined C± ≡ ⟨±|α⟩. Further, we can factorize the state of the particle with spin, ⟨r |ψ⟩ (see (17.24)),
and find

ψ+ (r) = C+ ψ(r),  ψ− (r) = C− ψ(r). (17.27)

17.3 Particle with Spin 1/2

In view of the fact that the spin and orbital angular momentum components commute,
[ L̂ i , Ŝ j ] = 0, (17.28)
just as for [J1i , J2j ] in the general theory of addition of angular momenta, it means that we can have
simultaneous eigenstates for spin and orbital angular momentum, as we have seen already.
Thus, considering a particle in a rotationally invariant system, we can use as observables L², Lz ,
S², Sz . In terms of eigenstates for these operators, we have found that we can separate variables in
the wave function for the system without spin,

ψ(r) → ψl (r) Ylm (θ, φ), (17.29)

so in the presence of spin we can write states also characterized by s, ms :

ψlm,ms (r, θ, φ) = ψl (r) Ylm (θ, φ) ( C+ , C− )ᵀ. (17.30)

To continue, we give more details on the spherical harmonics introduced in the previous chapter.
As we saw, the spherical harmonics also factorize in terms of the dependences on θ and φ:

Ylm (θ, φ) = Pl,m (cos θ) Φm (φ), (17.31)

where the normalized eigenfunctions for the φ dependence are

Φm (φ) = (1/√2π) e^{imφ}, (17.32)

and the Pl,m are the associated Legendre functions, defined for m ≥ 0 by the formula

Pl,m (w) = (−1)^m √[ ((l − m)!/(l + m)!) ((2l + 1)/2) ] (1/(2^l l!)) (1 − w²)^{m/2} (d/dw)^{l+m} (w² − 1)^l , (17.33)

which take real values; for negative values of m they are defined by imposing

Yl,−m (θ, φ) = (−1)^m Y∗lm (θ, φ). (17.34)

For m = 0, the spherical harmonics reduce to the Legendre polynomials,

Yl,0 (θ, φ) = √[(2l + 1)/4π] Pl (cos θ). (17.35)

The Legendre polynomials are defined by the relation

Pl (w) = (1/(2^l l!)) (d/dw)^l (w² − 1)^l , (17.36)

and satisfy the differential equation

(1 − w²) Pl′′(w) − 2w Pl′(w) + l (l + 1) Pl (w) = 0; (17.37)

their orthogonality condition is

∫₋₁¹ dw Pl (w) Pl′ (w) = (2/(2l + 1)) δll′ . (17.38)
For Ylm , when w = cos θ the differential equation (17.37) becomes

[ d²/dθ² + cot θ d/dθ + l (l + 1) ] Pl (cos θ) = 0. (17.39)

This equation is obtained from the eigenvalue equation

(L²/ħ²) Ylm (θ, φ) = l (l + 1) Ylm (θ, φ), (17.40)

when we take into account that (using the formulas for Lz , L+ , L− from the previous chapter)

L̂²/ħ² = (1/ħ²) [ L̂3² + (L̂+ L̂− + L̂− L̂+)/2 ]
      = −∂²/∂φ² + (1/2) { e^{iφ} ( ∂/∂θ + i cot θ ∂/∂φ ) , e^{−iφ} ( −∂/∂θ + i cot θ ∂/∂φ ) } (17.41)
      = − [ ∂²/∂θ² + cot θ ∂/∂θ + (1/sin²θ) ∂²/∂φ² ].

Other properties of the spherical harmonics are as follows. From the completeness of the |lm⟩
states in the Hilbert space for integer l, we find

Σ_{l=0}^{∞} Σ_{m=−l}^{l} |lm⟩⟨lm| = 1. (17.42)

Multiplying from the left with ⟨n| and from the right with |n′⟩, where n is a unit vector
characterized by angles θ, φ so that |n⟩ is a coordinate basis for angles, we obtain the completeness
relation for spherical harmonics,

Σ_{l=0}^{∞} Σ_{m=−l}^{l} Y∗lm (n′) Ylm (n) = δ(n − n′) = δ(cos θ − cos θ′) δ(φ − φ′). (17.43)

Also, the orthonormality of |lm⟩,

⟨lm|l′m′⟩ = δll′ δmm′ , (17.44)

leads, by the insertion of the completeness relation of |n⟩,

∫ dΩn |n⟩⟨n| = 1, (17.45)

to the orthonormality relation for spherical harmonics,

∫ dΩn Y∗l′m′ (n) Ylm (n) = δll′ δmm′ . (17.46)

Besides these relations, we also have the addition formula for spherical harmonics,

Σ_{m=−l}^{l} ⟨n|lm⟩⟨lm|n′⟩ = Σ_{m=−l}^{l} Ylm (n) Y∗lm (n′) = ((2l + 1)/4π) Pl (n · n′). (17.47)

Finally, the recursion relation for finding Yl,m−1 from Yl,m is obtained from the recursion relation
for |lm⟩, where one uses L̂− , in the ⟨n| basis, so that

⟨n|l, m − 1⟩ = ⟨n| (L̂−/ħ) |lm⟩ / √[(l + m)(l − m + 1)] , (17.48)

from which we obtain the formula

Yl,m−1 (n) = [1/√((l + m)(l − m + 1))] e^{−iφ} ( −∂/∂θ + i cot θ ∂/∂φ ) Ylm (n). (17.49)

Using it, we find the formulas already quoted, (17.31) and (17.33).
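The l = 1 case of the addition formula (17.47) is simple enough to verify numerically with the explicit harmonics Y1,0 = √(3/4π) cos θ and Y1,±1 = ∓√(3/8π) sin θ e^{±iφ}; the following sketch checks it at two arbitrary directions:

```python
import cmath, math

def Y1(m, theta, phi):
    # explicit l = 1 spherical harmonics
    if m == 0:
        return math.sqrt(3 / (4 * math.pi)) * math.cos(theta) + 0j
    sign = -1 if m == 1 else 1
    return sign * math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(1j * m * phi)

t1, p1, t2, p2 = 0.7, 1.1, 2.0, -0.4           # two arbitrary directions n, n'
lhs = sum(Y1(m, t1, p1) * Y1(m, t2, p2).conjugate() for m in (-1, 0, 1))

# n . n' expressed in spherical angles, and P_1(x) = x
ndotn = (math.sin(t1) * math.sin(t2) * math.cos(p1 - p2)
         + math.cos(t1) * math.cos(t2))
rhs = 3 / (4 * math.pi) * ndotn                # (2l+1)/(4 pi) P_l(n.n') for l = 1
print(abs(lhs - rhs))                          # ~0
```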

17.4 Rotation of Spinors with s = 1/2

We can also find the effect of rotating spinors in three-dimensional space. A rotation by an angle φ
around an arbitrary direction defined by n is generated by σ · n/2, since the quantities σ/2 (the σi are
the Pauli matrices) are the generators of SO(3) in the spinor (s = 1/2) representation. Then the group
element (acting on Hilbert spaces) is

g = exp( −i σ · n φ/2 ). (17.50)

However, since σi² = 1 and n² = 1, it follows that also (σ · n)² = 1, which means that

(σ · n)^m = 1 for m = 2k, and σ · n for m = 2k + 1. (17.51)

Then the group element for rotations in the spin 1/2 representation becomes

g = e^{−iσ·n φ/2} = Σ_{m=0}^{∞} ((−iφ/2)^m/m!) (σ · n)^m
  = Σ_{k=0}^{∞} (−iφ/2)^{2k}/(2k)! + σ · n Σ_{k=0}^{∞} (−iφ/2)^{2k+1}/(2k + 1)! (17.52)
  = cos(φ/2) 1 − i sin(φ/2) (σ · n).
Since σ · n is a vector operator, we have

σi′ = g^{−1} σi g = Σ_j Rij σj , (17.53)

where Rij is the rotation matrix (in the vector representation).
We can check that, for instance for a rotation of σ1 around the third direction, we have

σ1′ = e^{iσ3 φ/2} σ1 e^{−iσ3 φ/2} = (cos φ/2 + i sin φ/2 σ3) σ1 (cos φ/2 − i sin φ/2 σ3)
    = σ1 cos² φ/2 + sin² φ/2 σ3 σ1 σ3 + i sin φ/2 cos φ/2 [σ3 , σ1] (17.54)
    = σ1 cos φ − σ2 sin φ,

where we have used

σ1 σ3 = −σ3 σ1 = −iσ2 . (17.55)

The resulting transformation is indeed a rotation of σi , according to the vector operator formula.
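This conjugation is easy to verify with bare 2×2 matrices; the sketch below (plain Python, with a small hand-rolled matrix product rather than any library) checks (17.54) for an arbitrary angle:

```python
import cmath, math

def matmul(a, b):
    # product of two 2x2 complex matrices stored as nested lists
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

s1 = [[0, 1], [1, 0]]          # sigma_1
s2 = [[0, -1j], [1j, 0]]       # sigma_2
phi = 0.7
# e^{+i sigma3 phi/2} and e^{-i sigma3 phi/2} are diagonal matrices
gp = [[cmath.exp(1j * phi / 2), 0], [0, cmath.exp(-1j * phi / 2)]]
gm = [[cmath.exp(-1j * phi / 2), 0], [0, cmath.exp(1j * phi / 2)]]

rotated = matmul(gp, matmul(s1, gm))
expected = [[math.cos(phi) * s1[i][j] - math.sin(phi) * s2[i][j]
             for j in range(2)] for i in range(2)]
err = max(abs(rotated[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(err)   # ~0: sigma_1 rotates into sigma_1 cos(phi) - sigma_2 sin(phi)
```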
We can use the same rotation formula to construct the eigenstate of a general vector operator σ · n
with spin projection +1/2 onto the direction of n.
We start with the eigenstate of σ3 with spin projection +1/2 along the third direction, |+⟩ = (1, 0)ᵀ.
Then we rotate the system by an angle θ around direction 2 and after that by φ around direction 3.
The resulting state is

|ψ⟩ = e^{−iσ3 φ/2} e^{−iσ2 θ/2} |+⟩ = (cos φ/2 − i sin φ/2 σ3)(cos θ/2 − i sin θ/2 σ2) (1, 0)ᵀ
    = ( cos θ/2 e^{−iφ/2} , sin θ/2 e^{iφ/2} )ᵀ. (17.56)
17.5 Sum of Orbital Angular Momentum and Spin, L + S

In order to sum orbital angular momentum and spin, we follow the general theory of addition of
angular momenta, with

J = L ⊗ 1 + 1 ⊗ S. (17.57)

We can use a basis of eigenstates of commuting operators (observables) L², Lz , S², Sz ,

|lm⟩ ⊗ |s ms⟩, (17.58)

or a new basis of eigenstates of L², S², J², Jz , namely

|l s; j mj⟩. (17.59)

In the case of s = 1/2, by multiplying the state with ⟨n| we obtain

Ylm (n) ( C+ , C− )ᵀ, (17.60)

leading to states ψ_{l,1/2,j,mj} .
Whether |lm⟩ ⊗ |s ms⟩ or |l s; j mj⟩ is the more appropriate basis depends on whether or not the
interaction Hamiltonian Ĥint has spin–orbit coupling terms L · S.

17.6 Time-Reversal Operator on States with Spin

Finally, we consider the effect of the time-reversal operator T on particles with spin (before, we only
considered its effect on spinless particles).
We first note that the angular momentum, thought of as the generator of SO(3), is odd under T,

T Ji T⁻¹ = −Ji . (17.61)

We can see that this is true by, for instance, observing that group elements g ∈ SO(3) should commute
with T (rotations and time reversals are independent), even for infinitesimal g ≈ 1 − (i/ħ) α · J, so that

[e^{i α·J/ħ}, T] = 0 ⇒ T (i Ji) T⁻¹ = i Ji . (17.62)

But, since T is an anti-unitary operator (antilinear and unitary), T (i Ji) T⁻¹ = −i T Ji T⁻¹, giving the
stated relation.
Moreover, as we saw, T acts on wave functions by complex conjugation. Applying it to the spherical
harmonics, we obtain

T : Ylm (θ, φ) → Y∗lm (θ, φ) = (−1)^m Yl,−m (θ, φ). (17.63)

This means that for orbital angular momentum states,

T |lm⟩ = (−1)^m |l, −m⟩, (17.64)

at least for integer m.
But one can generalize to say (this involves a phase convention) that the same formula applies to
eigenstates of J = L + S,

T | jm⟩ = (−1)^m | j, −m⟩,  m ∈ Z. (17.65)

Then we can generalize this further to any j, integer or half-integer, by writing it as

T | jm⟩ = i^{2m} | j, −m⟩. (17.66)



Applying T twice gives

T² | jm⟩ = +| jm⟩,  j ∈ Z
         = −| jm⟩,  j ∈ Z + 1/2 (half-integer). (17.67)

Note that this is quite unexpected, since reversing time twice gives the same system, and it is not
clear why we should have a phase ±1.
The above action of T on a state with general j can be proven more rigorously and generally, but
we will not do it here.
Next, we consider the action of T on spin, the intrinsic angular momentum of a particle. From the
above analysis, S should also be odd under time reversal. This is consistent with the semiclassical
idea of an intrinsic “spinning” around the direction of the momentum. Indeed, reversing the direction
of time, this spinning would reverse as well, thus changing the sign of the spin, S → −S; thus
Sz → −Sz for any direction z.
Taking into account also the action of T on r, which is to leave it invariant (even under T,
consistent with the idea that r is not related to time evolution), and on p, p → −p (odd under
T, consistent with the idea that changing the direction of time changes the direction of motion), we
can conclude that the time-reversal operator corresponds to a rotation by π around the y axis (given
that the z axis is the direction of momentum), plus the complex conjugation operator K0 (the basic
antilinear operator),

T = e^{−iπSy/ħ} K0 . (17.68)

Indeed, we could check that the correct transformation rules are obtained, though we leave it as
an exercise. Here we just point out that a rotation by π around y, perpendicular to the momentum
(which is in the z direction), reverses any vector in the z direction.
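For s = 1/2 one can check from (17.68) that T² = −1: e^{−iπσy/2} = −iσy is a real matrix R, so T²ψ = R(Rψ∗)∗ = R²ψ = −ψ. A minimal sketch:

```python
# For spin 1/2, S_y = (hbar/2) sigma_y, so exp(-i pi S_y/hbar) = -i sigma_y,
# which is the real matrix R below; T psi = R psi* (K0 = complex conjugation)
R = [[0, -1], [1, 0]]        # -i sigma_y

def T(psi):
    conj = [psi[0].conjugate(), psi[1].conjugate()]
    return [R[0][0] * conj[0] + R[0][1] * conj[1],
            R[1][0] * conj[0] + R[1][1] * conj[1]]

psi = [0.6, 0.8j]            # an arbitrary normalized spinor
twice = T(T(psi))
print(twice)                 # equals -psi: T^2 = -1 on half-integer spin
```

By contrast, for a spinless particle T is just K0 itself, and two conjugations trivially give T² = +1, matching (17.67).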

Important Concepts to Remember

• Besides the orbital angular momentum L, there is an intrinsic spin angular momentum S, which
comes from quantum field theory (Poincaré invariant theory), and is associated with the type of
particle (thus constant for each type of particle). We have the spin–statistics theorem relating
integer spin with Bose–Einstein statistics and half-integer spin with Fermi–Dirac statistics.
• A charged particle in a magnetic field has a magnetic moment μ, such that Hint = −μ · B, and
classically μ = (q/2m) L, so quantum mechanically we expect μz = (q/2m) ħ ml . For the electron,
eħ/2m = μB is the Bohr magneton.
• For the electron, μz = g μB ms with g ≈ 2 (2 from the Dirac equation, and small corrections from
QED), and for other particles we have different values of g that still differ from 1.
• For a particle with spin, the state is |α lm⟩ ⊗ |s ms⟩, with wave function in the |r ms⟩ basis ψ_ms (r) =
ψ(r, ms), or ⟨r |ψ⟩ = ψ+ (r)|+⟩ + ψ− (r)|−⟩ = (ψ+ (r), ψ− (r))ᵀ with ψ+ (r) = C+ ψ(r), ψ− (r) = C− ψ(r), or
otherwise ψlm,ms (r, θ, φ) = ψl (r) Ylm (θ, φ) (C+ , C−)ᵀ.
• The spherical harmonics are Ylm (θ, φ) = Plm (cos θ) e^{imφ}/√2π, with Plm the associated Legendre
functions, orthonormal and complete; the spherical harmonics satisfy the addition formula
Σ_{m=−l}^{l} Ylm (n) Y∗lm (n′) = ((2l + 1)/4π) Pl (n · n′).
• A spinor (which describes a spin 1/2 particle), initially of spin +1/2 in the third direction, rotated
with n (θ, φ), becomes (cos θ/2 e^{−iφ/2} , sin θ/2 e^{+iφ/2})ᵀ.
• Under time reversal angular momentum is odd, T Ji T⁻¹ = −Ji , and states transform as T | jm⟩ =
i^{2m} | j, −m⟩.

Further Reading
See [2], [1], [3] for more details.

Exercises

(1) For an electron inside an atom, with angular momentum l = 1 (and spin 1/2), in a magnetic
field, how many energy levels (from spectral lines) do we see? Assume there is a single energy
level for l, s fixed.
(2) Write down the wave function for a free oscillating neutrino (using general parameters) in
Cartesian coordinates.
(3) Write down the wave function for a free oscillating neutrino (using general parameters)
in spherical coordinates.
(4) Prove the addition formula for spherical harmonics, (17.47).
(5) Consider an electron (with spin 1/2) in a magnetic field B. Classically, the spin has a precession
motion around B (Larmor precession). Prove this. Then calculate the quantum mechanical wave
function |ψ(t)⟩ corresponding to this classical motion.
(6) Consider an electron with orbital angular momentum l = 2. Write down its possible states in the
J basis.
(7) Prove that the time-reversal operator can be written as T = e^{−iπSy/ħ} K0 .
18 The Hydrogen Atom

In this chapter, we consider the simplest nontrivial case of a central potential, the negative 1/r
potential, valid for the hydrogen atom and hydrogen-like atoms (those with a nucleus of charge
+Ze and an electron of charge −e around it), one of the first and most important cases to be analyzed
by quantum mechanics. It is also a very simple system, serving as the central-potential counterpart
of the one-dimensional harmonic oscillator. This system teaches most of the important methods for solving
a quantum system.

18.1 Two-Body Problem: Reducing to Central Potential

The first thing we will show is that for a two-body problem with a two-body (central) potential
we can factor out the center of mass motion, and we are then left with a reduced system in a
central potential, as in classical mechanics. We will do this for a general two-body potential
V (|r1 − r2 |).
The quantum two-body Hamiltonian acting on a wave function is

Ĥ = −(ħ²/2m1) ∇1² − (ħ²/2m2) ∇2² + V (|r1 − r2 |). (18.1)

Defining

r ≡ r1 − r2 ,  R ≡ (m1 r1 + m2 r2)/(m1 + m2),
M ≡ m1 + m2 ,  1/μ ≡ 1/m1 + 1/m2 , (18.2)
where μ is known as the reduced mass, we obtain
∂i^(1,2) = (∂i^(1,2) rj) ∂rj + (∂i^(1,2) Rj) ∂Rj = ±∂ri + (m1,2/M) ∂Ri , (18.3)

leading to an alternative split of the two-body kinetic terms, into center of mass and relative motion
terms:

(1/m1) ∇1² + (1/m2) ∇2² = (1/m1) (∇r + (m1/M) ∇R)² + (1/m2) (∇r − (m2/M) ∇R)²
                         = (1/μ) ∇r² + (1/M) ∇R² . (18.4)

Then the Schrödinger equation reduces to

[ −(ħ²/2M) ∇R² − (ħ²/2μ) ∇r² + V (r) ] ψ(r, R) = Eψ(r, R). (18.5)

We can separate the variables so that the wave function becomes a function of the center of mass
position R times a function of the relative position r,

ψ(r, R) = ψ(r) φ(R). (18.6)

Then the Schrödinger equation splits into a part depending only on r, plus a part depending
only on R:

[ −(ħ²/2M) ∇R² φ(R) − ECM φ(R) ] ψ(r) + [ −(ħ²/2μ) ∇r² ψ(r) + V (r) ψ(r) ] φ(R)
= (E − ECM) ψ(r) φ(R). (18.7)

This allows us to set the two parts to zero independently:

−(ħ²/2M) ∇R² φ(R) = ECM φ(R)
[ −(ħ²/2μ) ∇r² + V (r) ] ψ(r) = (E − ECM) ψ(r). (18.8)
The first equation corresponds to the center of mass motion, and is of free particle type.
The second equation is the equation for the relative motion, with relative energy Erel = E − ECM ,
with a central potential V (r).

18.2 Hydrogenoid Atom: Set-Up of Problem

Consider a “hydrogenoid” atom, with a negative Coulomb potential corresponding to the interaction
of a nucleus of charge Q = Ze and an electron of charge q = −e around it. The reduced mass μ is
approximately equal to the electron mass me , since me ≪ mn (the nucleus mass mn ). Then

V (r) = −(|Q||q|/4πε0) (1/r) ≡ −Q̃²/r . (18.9)
In the Schrödinger equation for this relative motion with central potential V (r),

[ −(ħ²/2μ) ∇r² + V (r) ] ψ(r) = Erel ψ(r), (18.10)

where from now on we will replace Erel with E, we can separate variables further.
First, we consider the equation in spherical coordinates, ψ(r) → ψ(r, θ, φ), and then, remember-
ing that

Δr = ∇r² = ∂²/∂r² + (2/r) ∂/∂r + (1/r²) Δθ,φ , (18.11)

where the angular Laplacian equals minus the square of the angular momentum (divided by ħ²),

Δθ,φ = −L²/ħ² , (18.12)

we write

ψ(r, θ, φ) = R(r) Ylm (θ, φ). (18.13)

However, the spherical harmonics Ylm (θ, φ) are eigenfunctions of the angular Laplacian Δθ,φ , or
the angular momentum squared,

Δθ,φ Ylm (θ, φ) = −(L²/ħ²) Ylm (θ, φ) = −l (l + 1) Ylm (θ, φ), (18.14)

implying that the radial function R(r) satisfies the following radial equation:

[ −(ħ²/2μ) ( d²/dr² + (2/r) d/dr − l (l + 1)/r² ) + V (r) ] R(r) = E R(r). (18.15)

The solution of this equation will depend on the parameters E and l only, so

R(r) = RE,l (r) ⇒ ψ = ψElm (r, θ, φ) = REl (r) Ylm (θ, φ). (18.16)

We see that, with respect to the Schrödinger equation in one dimension, in the radial equation for
R(r) we have a term with a first derivative, r⁻¹ d/dr. But we can reduce (18.15) to the same type of
problem as in one dimension by redefining the radial wave function as

R(r) = χ(r)/r , (18.17)
which implies that

d²χ(r)/dr² + (2μ/ħ²) [ E − V (r) − ħ² l (l + 1)/(2μ r²) ] χ(r) = 0; (18.18)

this is indeed of the form of the one-dimensional Schrödinger equation, with an “effective potential”

Veff (r) = V (r) + (ħ²/2μ) l (l + 1)/r² . (18.19)

The only difference is that here r ≥ 0, unlike in one dimension. In our particular case the effective
potential is

Veff (r) = −Q̃²/r + (ħ²/2μ) l (l + 1)/r² , (18.20)

and at r → 0 we have Veff (r) → +∞ (for l ≥ 1), so it is as though we have an infinite “wall” at r = 0,
allowing us to have a complete analogy with the one-dimensional system. The potential is of the type
depicted in Figs. 19.1b and 19.2a, with a negative minimum at a positive value for r,

rmin = l (l + 1) ħ²/(Q̃² μ); (18.21)

the potential goes to zero at infinity, from below.
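In dimensionless units with Q̃² = ħ² = μ = 1, the minimum (18.21) sits at rmin = l(l + 1); a crude grid search (a sketch of my own, not from the text) confirms both its location and its negative depth −1/(2l(l + 1)):

```python
# effective potential in units with Qtilde^2 = hbar^2 = mu = 1:
# V_eff(r) = -1/r + l(l+1)/(2 r^2), minimized at r = l(l+1)
l = 1

def v_eff(r):
    return -1.0 / r + l * (l + 1) / (2.0 * r * r)

# brute-force grid search for the minimum over r in (0, 10]
rs = [0.001 * i for i in range(1, 10001)]
r_min = min(rs, key=v_eff)
print(r_min)            # ~2.0 = l(l+1) for l = 1
print(v_eff(r_min))     # ~-0.25 = -1/(2 l(l+1)): a negative minimum
```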

Boundary Conditions

Since at r = 0 we have an infinite “wall” for the motion, in χ(r), as in the one-dimensional case, we
have the boundary condition

χ(r = 0) = 0. (18.22)

At r → ∞, we must have at least χ(r → ∞) → 0. Actually, though, due to the condition of (three-
dimensional) normalizability, we have

∫₀^∞ |ψ|² r² dr dΩ < ∞ ⇒ ∫₀^∞ |χ(r)|² dr < ∞, (18.23)

which means that at infinity the “one-dimensional wave function” χ(r) must fall off faster than

χ(r) < A/r^{1/2} . (18.24)

Case 1: E > 0
In this case, since for r ≥ r0 we have V (r) < 0, then, according to the general analysis of one-
dimensional systems, the state behaves at infinity sinusoidally,

χ(r) ∼ e^{±ikr} , r → ∞, (18.25)

since V (r) ≈ 0 there, and thus

ħ²k²/2μ = E − V (r) ≈ E. (18.26)

This sinusoidal behavior is not square integrable at infinity, as in the one-dimensional case, and
corresponds to scattering states, to be studied in the second part of the book.

Case 2: E < 0
This is the more interesting case for us at present. By the general one-dimensional analysis we
expect to find bound states, valid for a set of discrete values for the energy, and so for a quantization
condition, obtained from the vanishing of a coefficient A of a possible solution:

A = 0 ⇒ E = En . (18.27)

18.3 Solution: Sommerfeld Polynomial Method

To solve the radial equation, we use a variant of the same method as that used in the case of the
one-dimensional harmonic oscillator to solve the equation.
First, we rescale the equation, both variables and parameters, in order to have a dimensionless
equation. Define
 
\rho = 2\kappa r, \qquad \kappa \equiv \sqrt{\frac{2\mu|E|}{\hbar^2}}, \qquad \lambda \equiv \tilde Q^2\sqrt{\frac{\mu}{2\hbar^2|E|}}, \quad (18.28)
so that the equation becomes (note that now we set E = −|E|)
 
\frac{d^2\chi(\rho)}{d\rho^2} + \left[-\frac{l(l+1)}{\rho^2} - \frac{1}{4} + \frac{\lambda}{\rho}\right]\chi = 0. \quad (18.29)

In order to solve this equation, we first find the behaviors at r → 0 and r → ∞ of the solution, and
then factor them out, leaving us with an easier (and hopefully recognizable) equation.
At r → 0, writing
\chi \sim A\rho^{\alpha}, \quad (18.30)
we find
\rho^{\alpha-2}\left[\alpha(\alpha-1) - l(l+1)\right] - \frac{1}{4}\rho^{\alpha} + \lambda\rho^{\alpha-1} = 0. \quad (18.31)
4
Setting to zero the coefficient of the leading power in order to solve the equation for the leading
behavior, we find
α(α − 1) = l (l + 1), (18.32)
which has two solutions,
\alpha = l+1 \ \text{or} \ -l \;\Rightarrow\; \chi \sim \rho^{l+1} \ \text{or} \ \rho^{-l}. \quad (18.33)
However, the behavior χ ∼ ρ^{−l} is not normalizable at zero, so this solution is excluded.
At r → ∞, from the general theory, since V (r → ∞) → 0 the solution is exponential with
parameter κ,
\chi(\rho) \sim Be^{\pm\kappa r} = Be^{\pm\rho/2}. \quad (18.34)
However, as in the one-dimensional case, the positive exponential is not normalizable but the negative one, ∼ e^{−ρ/2}, is normalizable.
According to the general procedure outlined earlier, first we factorize this behavior at infinity by
redefining the radial wave function with it:
\chi(\rho) = e^{-\rho/2}H(\rho). \quad (18.35)
Then we have
\chi' = -\frac{1}{2}e^{-\rho/2}H + e^{-\rho/2}H' \;\Rightarrow\;
\chi'' = e^{-\rho/2}\left(\frac{1}{4}H - H' + H''\right), \quad (18.36)
so the radial equation becomes
 
H'' - H' - \left[\frac{l(l+1)}{\rho^2} - \frac{\lambda}{\rho}\right]H = 0. \quad (18.37)
We next factorize the behavior at r → 0, again redefining the radial wave function:
H(\rho) = \rho^{l+1}F(\rho). \quad (18.38)
Since
H' = (l+1)\rho^{l}F + \rho^{l+1}F', \qquad
H'' = l(l+1)\rho^{l-1}F + 2(l+1)\rho^{l}F' + \rho^{l+1}F'', \quad (18.39)
the radial equation becomes finally
 
F'' + \left(\frac{2l+2}{\rho} - 1\right)F' - (l+1-\lambda)\frac{F}{\rho} = 0. \quad (18.40)
ρ ρ

But this is the equation for the confluent hypergeometric function {}_1F_1(a, b; z), namely
F'' + \left(\frac{b}{z} - 1\right)F' - \frac{a}{z}F = 0, \quad (18.41)
with general solution
F = A\,{}_1F_1(a, b; z) + B\,z^{1-b}\,{}_1F_1(a-b+1, 2-b; z). \quad (18.42)

In our case, since a = l + 1 − λ and b = 2l + 2, the general solution for the function F in the radial
equation is

F = A\,{}_1F_1(l+1-\lambda, 2l+2; \rho) + B\,\rho^{-2l-1}\,{}_1F_1(-\lambda-l, -2l; \rho). \quad (18.43)

However, since {}_1F_1 equals 1 at z = ρ = 0, the solution with coefficient B has the wrong behavior at ρ = 0, F ∼ ρ^{−2l−1}, so that χ ∼ ρ^{l+1}ρ^{−2l−1} = ρ^{−l}, which has already been excluded on physical (boundary condition) grounds.
That means that only the solution with coefficient A is good at ρ = 0. On the other hand, at ρ → ∞,

{}_1F_1(a, b; \rho) \sim \alpha\rho^{n} + \beta e^{+\rho}, \quad (18.44)

so that the leading behavior of χ is χ ∼ αe^{−ρ/2} + βe^{+ρ/2}. Again, we know that the behavior with e^{+ρ/2} is not normalizable, so must be excluded. That imposes a condition β = 0, which is a (quantization) condition on the parameters a, b (and so on l and on λ, and therefore on the energy E) of the function {}_1F_1(a, b; ρ).
The confluent hypergeometric function can be defined as an infinite series,
{}_1F_1(a, b; \rho) = \sum_{k=0}^{\infty} \frac{1}{k!}\,\frac{a(a+1)\cdots(a+k-1)}{b(b+1)\cdots(b+k-1)}\,\rho^k. \quad (18.45)

We can check this by writing the solution to this equation as a series in z (or ρ) and then finding a
recursion relation for the coefficients from the equation. Thus, write


F = \sum_{k=0}^{\infty} C_k\,\rho^k, \quad (18.46)

and substitute in the defining equation, to find



\sum_{k=0}^{\infty} \left[k(k-1)C_k\rho^{k-2} + bkC_k\rho^{k-2} - kC_k\rho^{k-1} - aC_k\rho^{k-1}\right] = 0. \quad (18.47)

We can rewrite this by redefining the sums to have the same ρ k−1 factor in all terms, in other words,
redefine k as k + 1 in the first two terms in the sum. This only affects the k = 0 term, which becomes
k = 1, but then the new k = 0 term still vanishes owing to the k prefactor. Thus we get


\sum_{k=0}^{\infty} \rho^{k-1}\left[C_{k+1}(k+1)(k+b) - C_k(a+k)\right] = 0. \quad (18.48)

This implies the recursion relation


\frac{C_{k+1}}{C_k} = \frac{a+k}{(k+1)(k+b)}, \quad (18.49)
which is indeed satisfied by (18.45).
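As a quick numerical sanity check (not in the text), the series can be built directly from the recursion (18.49) and compared against two closed forms that {}_1F_1 is known to reduce to, namely {}_1F_1(a, a; z) = e^z and {}_1F_1(1, 2; z) = (e^z − 1)/z. A minimal pure-Python sketch:

```python
import math

def hyp1f1(a, b, z, terms=60):
    # Sum the series by building C_k from the recursion (18.49),
    # C_{k+1} = C_k (a + k) / ((k + 1)(k + b)), starting from C_0 = 1.
    total, c = 0.0, 1.0
    for k in range(terms):
        total += c * z**k
        c *= (a + k) / ((k + 1) * (k + b))
    return total

z = 1.3
# Closed forms: 1F1(a, a; z) = e^z and 1F1(1, 2; z) = (e^z - 1)/z
print(hyp1f1(2.0, 2.0, z), math.exp(z))
print(hyp1f1(1.0, 2.0, z), (math.exp(z) - 1) / z)
```

The same routine terminates automatically when a is a negative integer, which is exactly the quantization mechanism discussed next.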

18.4 Confluent Hypergeometric Function and Quantization of Energy

Since (18.45) is written as an infinite sum of powers of ρ, the only way to avoid an exponential blow-up, and thus to put β = 0 in (18.44), is for the infinite series to terminate at a given k = n_r ∈ ℕ, since then C_{k+1}/C_k = 0. Then {}_1F_1 becomes a polynomial of order n_r for

a + nr = 0. (18.50)

This is a quantization condition, which implies that


a = l + 1 - \lambda = -n_r \;\Rightarrow\;
\lambda = n_r + l + 1 \equiv n = \tilde Q^2\sqrt{\frac{\mu}{2\hbar^2|E|}} = \frac{Ze^2}{4\pi\epsilon_0}\sqrt{\frac{\mu}{2\hbar^2|E|}}, \quad (18.51)
where we have defined the total energy quantum number n = nr + l + 1, with nr the radial quantum
number, which takes values 0,1,2,. . .
That means that the energy levels are quantized, as
E_n = -\frac{(Ze^2/4\pi\epsilon_0)^2\,\mu}{2\hbar^2 n^2} = -\mu c^2\,\frac{(\alpha Z)^2}{2n^2}, \quad (18.52)
where
\alpha \equiv \frac{e^2}{4\pi\epsilon_0\hbar c} \quad (18.53)
is the fine structure constant, approximately equal to 1/137. Note that, since l = 0, 1, 2, . . . and
nr = 0, 1, 2, . . ., it follows that n = 1, 2, 3, . . . Conversely, at fixed n, since l = n − nr − 1, we have
l = 0, 1, 2, . . . , n − 1.
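As a rough numerical check of (18.52) (an illustration, not from the text): approximating the reduced mass μ by the electron mass (for hydrogen they differ by about 0.05%), the ground state comes out at the familiar −13.6 eV.

```python
# Check E_n = -mu c^2 (alpha Z)^2 / (2 n^2) for hydrogen (Z = 1), with the
# approximation mu ≈ m_e (the reduced-mass correction is ~0.05% and ignored here).
alpha = 1 / 137.035999       # fine structure constant
me_c2_eV = 510998.95         # electron rest energy m_e c^2 in eV

def E_n(n, Z=1):
    return -me_c2_eV * (alpha * Z)**2 / (2 * n**2)

for n in (1, 2, 3):
    print(n, round(E_n(n), 4))   # E_1 ≈ -13.6 eV, and E_n = E_1/n^2
```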
We can also rewrite (18.52) as
E_n = -\frac{Z^2 e_0^2}{2a_0}\,\frac{1}{n^2}, \quad (18.54)
where
e_0^2 \equiv \frac{e^2}{4\pi\epsilon_0} \quad (18.55)
and we have defined the Bohr radius
a_0 \equiv \frac{\hbar^2}{\mu e_0^2} = \frac{4\pi\epsilon_0\hbar^2}{e^2\mu}. \quad (18.56)
The energy can be put into Rydberg’s form, i.e., in terms of the Rydberg constant R,

E = -\frac{Z^2 R}{n^2} \;\Rightarrow\; R = \frac{\mu e_0^4}{2\hbar^2}. \quad (18.57)
Then we have also (see (18.28))

\kappa_n = \sqrt{\frac{2\mu|E_n|}{\hbar^2}} = \frac{Z}{na_0}, \quad (18.58)

and the minimum of the effective potential is


r_{\min} = \frac{l(l+1)\hbar^2}{Ze_0^2\,\mu} = a_0\,\frac{l(l+1)}{Z} \propto a_0. \quad (18.59)
Therefore we can write the radial wave function as

R_{n,l}(r) = N_{n,l}\,e^{-\kappa_n r}(2\kappa_n r)^l\,{}_1F_1(-n+l+1, 2l+2; 2\kappa_n r), \quad (18.60)

where the normalization constant Nn,l is found to be (by integrating over the square of the wave
function)

N_{n,l} = (2\kappa_n)^{3/2}\,\frac{1}{(2l+1)!}\sqrt{\frac{(n+l)!}{2n\,(n-l-1)!}}. \quad (18.61)

18.5 Orthogonal Polynomials and Standard Averages over Wave Functions

When the confluent hypergeometric function terminates at a finite order n it becomes a polynomial, and since the constituent functions are orthonormal, it is an orthogonal polynomial. In fact, it is a classical orthogonal polynomial, the Laguerre polynomial L_n^b(z) (defined in Chapter 8), up to a constant,
{}_1F_1(-n, b+1; z) = \frac{n!\,\Gamma(b+1)}{\Gamma(b+n+1)}\,L_n^b(z). \quad (18.62)
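Relation (18.62) is easy to spot-check numerically for integer b, building both sides from their defining series; the sketch below (an illustration, not from the text) uses n = 3, b = 2 as an arbitrary sample point.

```python
import math

def hyp1f1_poly(n, b1, z):
    # 1F1(-n, b1; z): the series terminates after n + 1 terms,
    # with coefficients from the recursion C_{k+1}/C_k = (-n+k)/((k+1)(k+b1))
    total, c = 0.0, 1.0
    for k in range(n + 1):
        total += c * z**k
        c *= (-n + k) / ((k + 1) * (k + b1))
    return total

def genlaguerre(n, b, z):
    # L_n^b(z) = sum_i (-1)^i C(n+b, n-i) z^i / i!   (integer b)
    return sum((-1)**i * math.comb(n + b, n - i) * z**i / math.factorial(i)
               for i in range(n + 1))

n, b, z = 3, 2, 1.7
lhs = hyp1f1_poly(n, b + 1, z)
rhs = (math.factorial(n) * math.factorial(b) / math.factorial(b + n)) * genlaguerre(n, b, z)
print(lhs, rhs)  # the two sides should agree
```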
The Laguerre polynomials for integer parameters can be defined by
L_m^0(z) = e^z\,\frac{d^m}{dz^m}\left(e^{-z} z^m\right), \qquad
L_m^k(z) = (-)^k\,\frac{d^k}{dz^k}\,L_{m+k}^0(z), \quad (18.63)
and obey the orthonormality condition
\int_0^\infty e^{-z} z^k\,L_m^k(z)\,L_n^k(z)\,dz = \frac{(m+k)!}{m!}\,\delta_{mn}. \quad (18.64)
Thus the radial wave function can be written in terms of the Laguerre polynomials:

R_{n,l} = \tilde N_{n,l}\,e^{-\kappa_n r}(2\kappa_n r)^l\,L_{n-l-1}^{2l+1}(2\kappa_n r), \quad (18.65)

where the new normalization factor is



\tilde N_{n,l} = (2\kappa_n)^{3/2}\sqrt{\frac{(n-l-1)!}{2n\,(n+l)!}}. \quad (18.66)
We can use the square of the radial wave function, giving the probability as a function of radius,
to calculate average powers of the radius:
\langle r^k \rangle = \int_0^\infty r^2\,dr\,r^k\left[R_{n,l}(r)\right]^2, \quad (18.67)

to obtain
\langle r\rangle_{nl} = \frac{a_0}{2Z}\left[3n^2 - l(l+1)\right],
\langle r^2\rangle_{nl} = \frac{a_0^2}{2Z^2}\,n^2\left[5n^2 + 1 - 3l(l+1)\right], \quad (18.68)
\left\langle\frac{1}{r}\right\rangle_{nl} = \frac{1}{n^2}\,\frac{Z}{a_0}.
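The first of the averages in (18.68) can be checked numerically (a sketch, not from the text): build the unnormalized R_{nl} from the Laguerre series with a₀ = Z = 1, and compute ⟨r⟩ as a ratio of integrals so that the normalization constant drops out.

```python
import math

def R_unnorm(n, l, r):
    # Hydrogen radial function for a0 = Z = 1 (so kappa_n = 1/n), up to an
    # overall constant: R_{nl} ∝ e^{-r/n} (2r/n)^l L_{n-l-1}^{2l+1}(2r/n)
    x = 2.0 * r / n
    nr, b = n - l - 1, 2 * l + 1
    lag = sum((-1)**i * math.comb(nr + b, nr - i) * x**i / math.factorial(i)
              for i in range(nr + 1))
    return math.exp(-r / n) * x**l * lag

def mean_r(n, l, rmax=100.0, steps=100000):
    # <r> = ∫ r^3 R^2 dr / ∫ r^2 R^2 dr  (the normalization cancels)
    dr = rmax / steps
    num = den = 0.0
    for i in range(1, steps + 1):
        r = i * dr
        w = R_unnorm(n, l, r)**2
        num += r**3 * w
        den += r**2 * w
    return num / den

for n, l in [(1, 0), (2, 1), (3, 1)]:
    print((n, l), round(mean_r(n, l), 3), 0.5 * (3 * n**2 - l * (l + 1)))
```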

We see that the average position of the electron goes as a0 n2 /Z, increasing as n becomes larger,
so in the higher energy levels (higher n), the electron is, on average, further away from the nucleus.
Also, the average potential energy is
 
\langle V\rangle_{nl} = -\tilde Q^2\left\langle\frac{1}{r}\right\rangle_{nl} = -\frac{\tilde Q^2 Z}{n^2 a_0} = 2E_n, \quad (18.69)
as expected from the virial theorem for the Coulomb potential.
For the maximum value of l at fixed n, l_max = n − 1, the classical orbit would be the least eccentric. In this case, we find
\langle r^2\rangle_{n,n-1} = \frac{a_0^2}{Z^2}\,n^2\left(n+\frac{1}{2}\right)(n+1), \qquad
\langle r\rangle_{n,n-1} = \frac{a_0}{Z}\left(n^2 + \frac{n}{2}\right), \quad (18.70)
implying that the relative error in the position vanishes at large n,
\Delta r \equiv \sqrt{\langle r^2\rangle - \langle r\rangle^2} = \frac{na_0}{2Z}\sqrt{2n+1} = \frac{\langle r\rangle_{n,n-1}}{\sqrt{2n+1}} \;\Rightarrow\; \frac{\Delta r}{\langle r\rangle} \to 0 \ \text{as}\ n \to \infty, \quad (18.71)
so the electron becomes more classical in this limit.

Important Concepts to Remember

• The hydrogen, or hydrogenoid, atom is a very important and simple case that teaches most of the
methods involved in solving a quantum system in three spatial dimensions.
• We can reduce the motion of such an atom, from the motion of the nucleus of charge Ze and the
electron of charge −e to a free center of mass motion and a relative motion in the central potential,
with reduced mass μ.
• Separating variables so that ψ_{nlm}(r, θ, φ) = R_{nl}(r)Y_{lm}(θ, φ), and with R(r) = χ(r)/r, we get a one-dimensional Schrödinger equation for χ(r) in an effective potential
V_{\rm eff}(r) = V(r) + \frac{\hbar^2 l(l+1)}{2\mu r^2}.
• The effective potential has a wall at r = 0, goes to zero at infinity, and has a minimum at r min > 0,
leading to bound states for E < 0 and scattering states for E > 0.

• The solution of the Schrödinger equation is found by the Sommerfeld polynomial method: we
factor out the behavior at zero and at infinity, and thus obtain an equation for a confluent
hypergeometric function 1 F1 (a, b; z), with good behavior at zero; however the behavior at infinity
has two components, a polynomial and a rising exponential (giving normalizable and non-
normalizable behaviors for the wave function).
• Then we find a quantization condition on the parameters a, b by imposing that the wave function
at infinity is normalizable, or equivalently, that 1 F1 reduces to a polynomial (so in the recursion
condition for coefficients we have Ck+1 /Ck = 0 at some finite k); this turns out to be a Laguerre
classical orthogonal polynomial.
• The resulting quantization condition for the energy is
E_n = -\mu c^2\,\frac{(\alpha Z)^2}{2n^2} = -\frac{Z^2 e_0^2}{2a_0 n^2} = -\frac{RZ^2}{n^2},
with a_0 the Bohr radius, R the Rydberg constant, and n = l + n_r + 1, n_r = 0, 1, 2, . . . , so n = 1, 2, . . .
• The average position is
\langle r\rangle = \frac{3a_0}{2Z}\left[n^2 - \frac{l(l+1)}{3}\right],
and so increases for higher n, while Δr/⟨r⟩ → 0 for n → ∞.

Further Reading
See [2] for more details.

Exercises

(1) Consider the three-body problem, with the same two-body potential Vi j (r i j ) for all three pairs.
Can we separate variables to factor out anything in this general case?
(2) Set up the problem for the same hydrogenoid atom, but living in two spatial dimensions instead
of three (yet with the 1/r potential of three dimensions).
(3) Use the same Sommerfeld polynomial method to solve the problem in exercise 2.
(4) Prove that the normalization constant Nnl is given by equation (18.61).
(5) Prove the relations (18.68).
(6) Calculate the average radial momentum of the reduced orbiting particle in the state n, l of the
hydrogenoid atom.
(7) If we introduce a small constant magnetic field \vec B for the hydrogenoid atom, what happens to the energy levels E_n? Give an approximate quantitative description of how this happens, based on the quantization of energy at B = 0.
19 General Central Potential and Three-Dimensional (Isotropic) Harmonic Oscillator

After having examined the most important example of a central potential, the Coulomb potential, and the corresponding hydrogen atom, we now consider the general case of a central potential, and apply it to a few other examples: a free particle, a radial square well, and finally the three-dimensional isotropic harmonic oscillator.

19.1 General Set-Up

We have basically derived the general formulas for a central potential in the previous chapter, in
the process of applying them to the Coulomb potential, but we review them here. The Schrödinger
equation for the wave function in a general central potential,
 
\left[-\frac{\hbar^2}{2M}\nabla^2 + V(r)\right]\psi(\vec r\,) = E\psi(\vec r\,), \quad (19.1)
reduces in spherical coordinates to
\left[-\frac{\hbar^2}{2M}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} - \frac{\hat{\vec L}^2/\hbar^2}{r^2}\right) + (V(r) - E)\right]\psi(\vec r\,) = 0. \quad (19.2)

This means that we can separate the variables in the equation,

ψ(r ) = REl (r)Ylm (θ, φ), (19.3)

where Ylm are eigenfunctions of the angular momentum,


\hat{\vec L}^2\,Y_{lm}(\theta,\phi) = l(l+1)\hbar^2\,Y_{lm}(\theta,\phi), \quad (19.4)

to obtain
 
\left[\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} - \frac{l(l+1)}{r^2} + \frac{2m}{\hbar^2}(E - V(r))\right]R_{El}(r) = 0. \quad (19.5)
Further, to reduce this equation to a one-dimensional Schrödinger-type equation, we need to get rid of the (2/r)d/dr term, by defining
R_{El}(r) = \frac{\chi_{El}(r)}{r}, \quad (19.6)
obtaining
  
\left[\frac{d^2}{dr^2} + \frac{2m}{\hbar^2}\left(E - V(r) - \frac{\hbar^2 l(l+1)}{2mr^2}\right)\right]\chi_{El}(r) = 0. \quad (19.7)

We see that indeed the equation is the one-dimensional Schrödinger equation for an effective
potential
V_{\rm eff}(r) = V(r) + \frac{\hbar^2 l(l+1)}{2mr^2}. \quad (19.8)

Normalization Conditions
In order to have normalizable solutions, we need to have a finite integral over the square of the wave
function:
  
\int |\psi|^2\,d^3r = \int |\psi|^2\,r^2\,dr\,d\Omega = \int |\chi|^2\,dr\,d\Omega. \quad (19.9)

Then the condition at infinity,
\int^{\infty} |\chi(r)|^2\,dr < \infty, \quad (19.10)
implies that, as r → ∞, the function χ(r) vanishes faster than 1/r^{1/2},
|\chi(r)| < \frac{1}{r^{1/2}}. \quad (19.11)
The condition at zero,
\int_0 |\chi(r)|^2\,dr < \infty, \quad (19.12)
means that, at r → 0, the function χ(r) either vanishes, stays finite, or blows up more slowly than 1/r^{1/2},
|\chi(r)| < \frac{1}{r^{1/2}}. \quad (19.13)
In the case where we have a Laurent expansion for χ(r), this would mean that |χ(r)| ∼ r^k, k ≥ 0, at r → 0.

19.2 Types of Potentials

Case I
We will consider first the case with
\lim_{r\to\infty} V(r) = \text{constant} \equiv 0, \quad (19.14)

where we have set the constant to zero, since we can rescale the constant in the energy. This condition
is true in many cases of interest, but not in all.

Case II
In particular, the case of the three-dimensional harmonic oscillator is not of the above type, and then
it will be automatically described as having case II behavior.
Case I can be split into subcases, depending on the behavior at r → 0. These subcases will be
listed and then discussed separately.

Figure 19.1 (a) Case Ia, with α > 0. (b) Case Ia, with α < 0, if the behavior at 0 and at ∞ is the same. (c) Cases Ib with α > 0 and Ic with β > 0. (d) Cases Ib with α < 0 and Ic with β < 0. [Four panels sketching V_eff(r) against r.]

Case Ia. When
V(r \to 0) \sim \frac{\alpha}{r^s}, \quad s < 2, \quad (19.15)

the one-dimensional effective potential is dominated by the l (l + 1)/r 2 term and thus blows up
at zero,

Veff (r → 0) → +∞. (19.16)

In this case, we have an infinite “wall” at zero, as in Figs. 19.1a and b, which means that we have the
boundary condition χ(0) = 0.

Case Ib. When
V(r \to 0) \sim \frac{\alpha}{r^s}, \quad s > 2, \quad (19.17)
we obtain that the one-dimensional effective potential is dominated by the potential itself and thus
blows up at zero, with a sign depending on the sign of α, as in Figs. 19.1c and d,

Veff (r → 0) → sgn(α)∞. (19.18)

The more interesting case is, however, when α < 0, so that Veff → −∞.

Case Ic. Finally, the potential could have the same behavior as the angular momentum
(centrifugal) term,
V(r \to 0) \sim \frac{\alpha}{r^2}. \quad (19.19)

Case Ia
We consider first case Ia.
(1) E > 0. For positive energy, since we have an infinite wall at r = 0 and at r = ∞, then for
E > V (r), according to the general theory in Chapter 7, we have unbound states, with imaginary
exponentials (or sinusoidal and cosinusoidal),

\chi(r) \sim Ae^{ikr} + Be^{-ikr}, \quad (19.20)

where
k = \sqrt{\frac{2mE}{\hbar^2}}. \quad (19.21)
The function (19.20) is not square integrable at infinity. On the other hand, at zero, as we said
previously, we need χ(0) = 0, which imposes a constraint, thus selecting a single solution and
leading to a continuous nondegenerate spectrum.
Indeed, at r → 0, the equation for χ is approximately
\frac{d^2\chi}{dr^2} = \frac{l(l+1)}{r^2}\chi, \quad (19.22)
which has solutions r^α with α = l + 1 or α = −l; thus
\chi = Cr^{l+1} + Dr^{-l}. \quad (19.23)

Because of the boundary condition at zero, we need to impose D/C = 0, which is the constraint we
mentioned earlier.
(2) E < 0. For negative energy, we have bound states, if there is a region where V (r) < E < 0, by
the same general analysis from Chapter 7. At r → 0, we have the solution in (19.23) again. On the
other hand, at r → ∞ we have exponential solutions,

\chi \sim Ae^{-\kappa r} + Be^{+\kappa r}, \quad (19.24)

where
\kappa \equiv \sqrt{\frac{2m|E|}{\hbar^2}}. \quad (19.25)
Imposing D/C = 0 in (19.23) selects a single solution, but then imposing B/A = 0 gives a
quantization condition, selecting only discrete energies En .
In this case, we can extend Sommerfeld’s polynomial method. Defining ρ ≡ 2κr, we find the equation for χ,
\frac{d^2\chi}{d\rho^2} + \left[-\frac{l(l+1)}{\rho^2} - \frac{1}{4} - \frac{2m}{\hbar^2(2\kappa)^2}V(\rho)\right]\chi(\rho) = 0. \quad (19.26)
Then, at ρ → 0, the equation becomes d²χ/dρ² ≃ l(l+1)χ/ρ², which as we saw has the physical solution
\chi \sim \rho^{l+1}, \quad (19.27)
whereas at ρ → ∞ the equation is d²χ/dρ² ≃ χ/4, with physical solution
\chi \sim e^{-\rho/2}. \quad (19.28)

We then factor out the two behaviors, at zero and infinity, write

\chi = e^{-\rho/2}H(\rho), \qquad H(\rho) = \rho^{l+1}F(\rho), \quad (19.29)

and find an equation for F (ρ) that is a polynomial at zero; we impose that it is a polynomial also at
infinity, which turns into a quantization condition.
Further, in this case, with s < 2, consider a wave function that is only nonzero for r < r_0, so that Δr = r_0. Then, by Heisenberg’s uncertainty relation,
p \sim \Delta p \sim \frac{\hbar}{\Delta r} = \frac{\hbar}{r_0}. \quad (19.30)
The energy of the state with this wave function is
E \sim \frac{\hbar^2}{2mr_0^2} - \frac{\alpha}{r_0^s}, \quad (19.31)

and it is always positive for small enough r 0 . This means that we cannot have E < 0 in this case;
the bound states must start at some minimum distance from the center r = 0, and we cannot have
arbitrarily small energies: the ground state has a finite E and is at a finite r. That means that the
particle cannot fall into r = 0 quantum mechanically in this case.
On the other hand, if the potential behaves in the same way, ∼ −α/r^s with s < 2, but now at r → ∞, then for large enough r_0, with a wave function centered on such an r_0 and with Δr ≪ r_0, the energy
E \sim \frac{\hbar^2}{2m(\Delta r)^2} - \frac{\alpha}{r_0^s} \quad (19.32)
becomes negative at large enough r 0 . Thus there are bound states, stationary states of negative energy,
situated at large distances from the origin, which implies they have arbitrarily small negative energy.
In particular, this implies that there is an infinite number of (bound) energy levels, becoming infinitely
dense at E = 0.

Case Ib
In this case, if α < 0, Veff (r → 0) → −∞, which suggests that we would have a large probability
at r = 0: χ(0) need not be zero and in fact χ can even be non-normalizable. The energy of a state
bounded by r 0 → 0,
E \sim \frac{\hbar^2}{2mr_0^2} - \frac{\alpha}{r_0^s}, \quad (19.33)
is negative for small enough r 0 , meaning that there are states of arbitrarily large |E| for E < 0. The
interpretation is that the quantum particle can “fall” to r = 0, where it has E = −∞ (which is an
eigenvalue). We can see this directly too, since at r → 0 the Schrödinger equation is approximately
 
\frac{d^2\chi}{dr^2} = \left[-\frac{\alpha}{r^s} + \frac{l(l+1)}{r^2}\right]\chi(r), \quad (19.34)
with solution
\chi \sim \frac{Ae^{if(r)} + Be^{-if(r)}}{r^b}. \quad (19.35)

The second derivative is


\chi'' \sim -(f'(r))^2\chi + 0 + \frac{b(b+1)}{r^2}\chi, \quad (19.36)
and the leading term in this equation of motion implies that (f'(r))² = α/r^s; the zero in the middle term amounts to the vanishing of the imaginary part of the solution, and fixes B/A; and the matching of the subleading term gives b(b+1) = l(l+1), selecting b = l in this case, i.e., a non-normalizable solution at zero. Therefore the probability of finding the particle outside r = 0 is negligible, and the particle falls into r = 0 in quantum mechanics, as expected.
If the potential behaves in the same way, V ∼ −α/r^s but with s > 2, at r → ∞, then the energy is
E \sim \frac{\hbar^2}{2m(\Delta r)^2} - \frac{\alpha}{r_0^s}, \quad (19.37)
where s > 2 and r_0 → ∞, implying that E > 0 at large enough r_0. Thus, there is no state of arbitrarily small negative energy and large r_0. The spectrum terminates at a finite and nonzero negative energy.

Case Ic
When the potential V ≃ −α/r², with α > 0, the effective potential is
V_{\rm eff}(r) = \frac{\hbar^2 l(l+1) - 2m\alpha}{2mr^2} \equiv \frac{\hbar^2\beta}{2mr^2}. \quad (19.38)
The Schrödinger equation at r → 0 is
\frac{d^2\chi}{dr^2} = \frac{\beta}{r^2}\chi, \quad (19.39)
with solution χ ∼ r^a, which implies a(a − 1) = β, i.e.,
a_{1,2} = \frac{1 \pm \sqrt{1+4\beta}}{2}. \quad (19.40)
(1) If β > −1/4, both solutions a_{1,2} are real but, for reasons of normalization, we must choose the slowest behavior at r → 0, namely
\chi(r) \sim r^{a_1} = r^{\left(1+\sqrt{1+4\beta}\right)/2}, \quad (19.41)
for which \int_0 |\chi(r)|^2\,dr < \infty.
(2) If β < −1/4, both solutions have imaginary parts and we find
R(r) \sim \frac{1}{r^{1/2}}\left(Ae^{i\sqrt{-1-4\beta}\,\ln r} + Be^{-i\sqrt{-1-4\beta}\,\ln r}\right). \quad (19.42)
Imposing the condition that the solution must be real, we find one that has an infinite number of zeroes,
R(r) \sim \frac{1}{r^{1/2}}\cos\left(\sqrt{-1-4\beta}\,\ln r + \delta\right). \quad (19.43)
But the ground state cannot have zeroes, which means that the wave function (19.43) is wrong, and
in the ground state the particle is actually at r = 0 (one can produce an argument that is a bit more
rigorous for this statement, but the result is the same).
On the other hand, if the potential is the same, i.e., V = −α/r 2 , at r → ∞, then:

For β > −1/4, we have (at most) a finite number of negative energy levels: the solutions χ ∼ r^{a_{1,2}} are valid also for r → ∞, but these solutions have no zeroes. Thus there can be zeroes for, at most, r less than some r_0, implying a finite number of energy levels, each associated with an interval between zeroes, according to the general one-dimensional analysis.
For β < −1/4, the solutions R ∼ \cos(\sqrt{-1-4\beta}\,\ln r + \delta)/r^{1/2} are valid also at r → ∞, and have an infinite number of zeroes, thus an infinite number of energy levels.

19.3 Diatomic Molecule

Consider a molecule made up of two atoms. Reducing the Schrödinger equation to the relative motion of the atoms with reduced mass μ, we find a central potential that has a minimum at some r_0. If the molecule is stable then V''(r_0) > 0, so the minimum is stable, as in Fig. 19.2a. This means that we can expand the potential around r_0,
V(r) \simeq V(r_0) + V''(r_0)\,\frac{(r-r_0)^2}{2}, \quad (19.44)
and we define
V''(r_0) \equiv \mu\omega^2. \quad (19.45)
Also expanding the angular momentum term,
\frac{\hbar^2 l(l+1)}{2\mu r^2} \simeq \frac{\hbar^2 l(l+1)}{2\mu r_0^2} + \cdots, \quad (19.46)
and defining the moment of inertia I ≡ μr_0^2, we find the following Schrödinger equation:
\left[\frac{d^2}{dr^2} + \frac{2\mu}{\hbar^2}\left(E - V(r_0) - \frac{\hbar^2 l(l+1)}{2I} - \frac{\mu\omega^2}{2}\rho^2\right)\right]\chi(\rho) = 0, \quad (19.47)
where ρ ≡ r − r_0.
However, this is nothing other than the Schrödinger equation for a linear harmonic oscillator, with shifted energy E − V(r_0) − ℏ²l(l+1)/2I. Thus the eigenenergies are
E_n = V(r_0) + \frac{\hbar^2 l(l+1)}{2I} + \hbar\omega\left(n + \frac{1}{2}\right). \quad (19.48)

Figure 19.2 (a) Potential for a diatomic molecule; (b) spherical square well potential. [Two panels sketching V against r.]

19.4 Free Particle

We have already looked at the free particle in spherical coordinates, but we review it here for completeness. For E > 0, the Schrödinger equation in spherical coordinates reduces to
\frac{d^2\chi}{d\rho^2} + \left(\frac{1}{4} - \frac{l(l+1)}{\rho^2}\right)\chi(\rho) = 0. \quad (19.49)
The equation above has a unique solution that is everywhere bounded, given for R = χ/r by the spherical Bessel function j_l(ρ/2) = j_l(kr), meaning that the full wave function is
\psi(\vec r\,) = j_l(kr)Y_{lm}(\theta,\phi). \quad (19.50)

Then the known solution in Cartesian coordinates, e^{i\vec k\cdot\vec r}, can be expanded in terms of the spherical solutions above,
e^{i\vec k\cdot\vec r} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} a_{lm}(\vec k\,)\,j_l(kr)Y_{lm}(\theta,\phi). \quad (19.51)

Choosing the momentum \vec k to be in the direction Oz, and defining cos θ ≡ u (and since ρ/2 = kr),
e^{i\rho u/2} = e^{ikr\cos\theta} = \sum_{l=0}^{\infty} c_l\,j_l(\rho/2)P_l(u). \quad (19.52)

From the recurrence relations obtained for the spherical Bessel functions jl and the Legendre
polynomials Pl , we find that

c_l = (2l+1)\,i^l\,c_0, \quad (19.53)

and with the normalization choice c_0 = 1 we find that
e^{ikr\cos\theta} = \sum_{l=0}^{\infty} (2l+1)\,i^l\,j_l(kr)P_l(\cos\theta). \quad (19.54)
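Expansion (19.54) can be verified numerically; the sketch below (an illustration, not from the text) builds j_l from its power series and P_l from the standard three-term recursion, and compares the truncated sum with the exponential at an arbitrary sample point.

```python
import math, cmath

def dfact(k):            # double factorial k!!
    p = 1
    while k > 1:
        p, k = p * k, k - 2
    return p

def j_l(l, x, terms=40):
    # spherical Bessel function from its power series:
    # j_l(x) = x^l sum_m (-1)^m (x^2/2)^m / (m! (2l+2m+1)!!)  (fine for moderate x)
    return x**l * sum((-1)**m * (x * x / 2)**m / (math.factorial(m) * dfact(2*l + 2*m + 1))
                      for m in range(terms))

def legendre(lmax, u):
    # P_0 .. P_lmax via (l+1)P_{l+1} = (2l+1)u P_l - l P_{l-1}
    P = [1.0, u]
    for l in range(1, lmax):
        P.append(((2*l + 1) * u * P[l] - l * P[l-1]) / (l + 1))
    return P

kr, theta = 2.5, 0.8     # arbitrary sample point
u = math.cos(theta)
P = legendre(25, u)
partial = sum((2*l + 1) * (1j)**l * j_l(l, kr) * P[l] for l in range(26))
exact = cmath.exp(1j * kr * u)
print(partial, exact)    # the truncated sum should match the plane wave
```

Note how quickly the series converges: j_l(kr) is negligible once l exceeds kr by a few units, which is the basis of partial-wave truncation in scattering theory.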

19.5 Spherical Square Well

Consider the important case where the potential is piecewise constant, specifically smaller for r ≤ r_0 than outside it, as in Fig. 19.2b,
V(r) = \begin{cases} -V_0, & r \le r_0 \\ 0, & r > r_0. \end{cases} \quad (19.55)

For negative energy, −V_0 < E < 0, by the general theory we have bound states (since E > V(r) over a finite region). For r < r_0 we have a “free wave” solution like that above, R_l(r) = j_l(kr), but with
k = \sqrt{\frac{2m(E+V_0)}{\hbar^2}}. \quad (19.56)

For r > r_0, we have an exponentially decaying solution (the only one that vanishes at infinity),
\chi_l(\kappa r) \sim e^{-\kappa r}, \quad (19.57)
where
\kappa = \sqrt{\frac{2m|E|}{\hbar^2}}. \quad (19.58)
Then, the condition of continuity of the logarithmic derivative (and so of the function and of its derivative) at r = r_0,
\frac{d}{dr}\ln\frac{\chi_l(\kappa r)}{r}\bigg|_{r=r_0} = \frac{d}{dr}\ln j_l(kr)\bigg|_{r=r_0}, \quad (19.59)
gives a quantization condition that yields a finite number of energy levels E_n.
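For l = 0 the interior solution is χ ∝ sin(kr) and the exterior one is e^{−κr}, so the matching condition (19.59) reduces to the familiar k cot(kr₀) = −κ. A minimal bisection sketch (the well parameters V₀ = 10, r₀ = 1 in units ℏ = m = 1 are an arbitrary choice for illustration):

```python
import math

# l = 0: chi ∝ sin(kr) inside, e^{-kappa r} outside, so (19.59) becomes
# k cot(k r0) = -kappa, with k = sqrt(2(E + V0)), kappa = sqrt(-2E),
# in units hbar = m = 1.  Sample well (an arbitrary illustration):
V0, r0 = 10.0, 1.0

def f(E):
    k = math.sqrt(2 * (E + V0))
    kappa = math.sqrt(-2 * E)
    return k / math.tan(k * r0) + kappa

# Here sqrt(2 V0) r0 ≈ 4.47 lies between pi/2 and 3pi/2, so there is exactly
# one s-wave bound state; bracket it away from the pole of cot at k r0 = pi.
lo, hi = -9.0, -5.2      # f(lo) > 0 > f(hi)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if f(mid) > 0:
        lo = mid
    else:
        hi = mid
E0 = 0.5 * (lo + hi)
print("E0 ≈", round(E0, 3))   # ≈ -6.78 for this well
```

Counting the roots of the same condition for increasing V₀r₀² makes the finiteness of the spectrum explicit: a new s-wave level appears each time √(2mV₀) r₀/ℏ crosses an odd multiple of π/2.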

19.6 Three-Dimensional Isotropic Harmonic Oscillator: Set-Up

This is an example of Case II, where the potential grows without bound as r → ∞.
Consider a harmonic oscillator in d dimensions, with a priori different frequencies in each dimension,
\hat H = \sum_{i=1}^{d}\hat H_i, \qquad
\hat H_i = \frac{\hat p_i^2}{2m} + \frac{m\omega_i^2\hat q_i^2}{2}. \quad (19.60)
Then the Hilbert space is a tensor product of the harmonic oscillator Hilbert spaces for each
dimension,

H = H1 ⊗ · · · ⊗ H d , (19.61)

so the states (in the “Fock space”) are tensor products of states with occupation numbers n_i,
|\{n_i\}\rangle = |n_1\rangle \otimes \cdots \otimes |n_d\rangle; \quad (19.62)

in particular, the vacuum is a tensor product of the vacua for each dimension,
|0\rangle \equiv |0\rangle_1 \otimes \cdots \otimes |0\rangle_d. \quad (19.63)

For these states the Hamiltonian for each dimension is diagonal,
\hat H_i\,|n_i\rangle = \left(n_i + \frac{1}{2}\right)\hbar\omega_i\,|n_i\rangle, \quad (19.64)
so that
\hat H\,|\{n_i\}\rangle = \sum_{i=1}^{d}\hbar\omega_i\left(n_i + \frac{1}{2}\right)|\{n_i\}\rangle. \quad (19.65)

In the case of an isotropic harmonic oscillator, with the same frequency in all directions, ω_i = ω, we have
\hat H = \frac{\hat{\vec p}^{\,2}}{2m} + \frac{m\omega^2\hat{\vec r}^{\,2}}{2}. \quad (19.66)
In terms of the number operators \hat N_i = \hat a_i^\dagger\hat a_i, we have
\hat N = \sum_{i=1}^{d}\hat N_i, \quad (19.67)

so the total Hamiltonian is
\hat H = \left(\hat N + \frac{d}{2}\right)\hbar\omega, \quad (19.68)
and the eigenenergies are
\hat H\,|\{n_i\}\rangle = \left(\sum_{i=1}^{d} n_i + \frac{d}{2}\right)\hbar\omega\,|\{n_i\}\rangle = \left(n + \frac{d}{2}\right)\hbar\omega\,|\{n_i\}\rangle. \quad (19.69)
The degeneracy of these energies equals the number of different \{n_i\} that give the same n = \sum_i n_i. We are mostly interested in d = 3, in which case the degeneracy is
g_n = \frac{(n+1)(n+2)}{2}. \quad (19.70)
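The counting behind (19.70) can be checked by brute force (a small illustration, not from the text):

```python
from itertools import product

# Count the states |n1, n2, n3> with n1 + n2 + n3 = n and compare
# with the closed form g_n = (n+1)(n+2)/2.
for n in range(8):
    g = sum(1 for t in product(range(n + 1), repeat=3) if sum(t) == n)
    print(n, g, (n + 1) * (n + 2) // 2)
```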

19.7 Isotropic Three-Dimensional Harmonic Oscillator in Spherical Coordinates

Instead of solving each dimension independently, as above (each as a harmonic oscillator), we can go to spherical coordinates, noting that the central potential is
V(r) = \frac{m\omega^2 r^2}{2}. \quad (19.71)
2
Then the Sommerfeld polynomial method works as before. The Schrödinger equation becomes
\frac{d^2\chi}{dr^2} + \left(\frac{2mE}{\hbar^2} - \frac{m^2\omega^2}{\hbar^2}r^2 - \frac{l(l+1)}{r^2}\right)\chi(r) = 0. \quad (19.72)
Define
\alpha = \sqrt{\frac{m\omega}{\hbar}}, \qquad \frac{2E}{\hbar\omega} = \lambda, \qquad \xi = \alpha r, \quad (19.73)
leading to the equation
\frac{d^2\chi}{d\xi^2} + \left(\lambda - \frac{l(l+1)}{\xi^2} - \xi^2\right)\chi = 0. \quad (19.74)
Then, the equation at r → 0 or ξ → 0 is
\frac{d^2\chi}{d\xi^2} = \frac{l(l+1)}{\xi^2}\chi, \quad (19.75)

which has the solution
\chi = A\xi^{l+1} + B\xi^{-l}; \quad (19.76)
this implies that the normalizable solution has B = 0.


The equation at r → ∞ or ξ → ∞ is
\frac{d^2\chi}{d\xi^2} = \xi^2\chi, \quad (19.77)
with the leading, normalizable, solution at infinity
\chi = e^{-\xi^2/2} \times (\text{subleading terms}). \quad (19.78)

We now factor out the behaviors at infinity and zero. First, we write
\chi = e^{-\xi^2/2}H(\xi), \quad (19.79)
which implies
\chi'' = e^{-\xi^2/2}\left[H'' - 2\xi H' + (\xi^2 - 1)H\right], \quad (19.80)
so that the equation for H is (dividing by e^{-\xi^2/2})
H'' - 2\xi H' + \left(\lambda - 1 - \frac{l(l+1)}{\xi^2}\right)H = 0. \quad (19.81)
Next, we write
H = \xi^{l+1}F(\xi), \quad (19.82)
which implies
H' = \xi^{l+1}F' + (l+1)\xi^l F, \qquad
H'' = \xi^{l+1}F'' + 2(l+1)\xi^l F' + l(l+1)\xi^{l-1}F, \quad (19.83)
so that the equation for F is (dividing by ξ^{l+1})
F'' + 2\left(\frac{l+1}{\xi} - \xi\right)F' + (\lambda - 2l - 3)F = 0. \quad (19.84)

Finally, change variables by setting z = ξ², so that
\frac{dF}{d\xi} = 2\xi\frac{dF}{dz}, \qquad \frac{d^2F}{d\xi^2} = 2\frac{dF}{dz} + 4z\frac{d^2F}{dz^2}, \quad (19.85)
and we obtain the equation
z\frac{d^2F}{dz^2} + \left(l + \frac{3}{2} - z\right)\frac{dF}{dz} + \frac{\lambda - 2l - 3}{4}F = 0. \quad (19.86)
But this is the same equation as that for the confluent hypergeometric function {}_1F_1(a, b; z) from the previous chapter, with
-a = \frac{\lambda - 2l - 3}{4}, \qquad b = l + \frac{3}{2}. \quad (19.87)

Therefore, as in the previous chapter, we find the solution
F(z) = {}_1F_1\!\left(\frac{2l + 3 - 2E/(\hbar\omega)}{4},\ l + \frac{3}{2};\ z\right), \quad (19.88)
with quantization condition
\frac{2l + 3 - 2E/(\hbar\omega)}{4} = -n_r \in \mathbb{N}. \quad (19.89)
Finally then, the eigenenergies are
\frac{E_n}{\hbar\omega} = l + \frac{3}{2} + 2n_r \equiv n + \frac{3}{2}, \quad (19.90)
where we have defined n = 2nr + l. This is the same formula as that obtained from the Cartesian
form. Moreover, the degeneracy is again
g_n = \frac{(n+1)(n+2)}{2}, \quad (19.91)
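The same degeneracy can be recovered from the spherical labels: at level n the states are (n_r, l, m) with n = 2n_r + l and |m| ≤ l, i.e., l runs over n, n − 2, n − 4, . . . A quick enumeration (illustrative, not from the text):

```python
# Spherical labels: n = 2 n_r + l with n_r >= 0 and m = -l..l, so the level-n
# degeneracy is the sum of (2l + 1) over l = n, n-2, n-4, ..., which must
# reproduce g_n = (n+1)(n+2)/2 from the Cartesian counting.
for n in range(8):
    g = sum(2 * l + 1 for l in range(n + 1) if (n - l) % 2 == 0)
    print(n, g, (n + 1) * (n + 2) // 2)
```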
meaning that the possible states are indeed the same. In spherical coordinates, the radial functions are (putting together the formulas derived before)
R_{n,l}(r) = N_{n_r,l}\,e^{-\alpha^2 r^2/2}(\alpha r)^l\,{}_1F_1(-n_r, l + 3/2; \alpha^2 r^2), \quad (19.92)
where
N_{n_r,l} = \frac{\sqrt{2\alpha^3}}{\Gamma(l+3/2)}\left[\frac{\Gamma(n_r + l + 3/2)}{n_r!}\right]^{1/2}. \quad (19.93)

19.8 Isotropic Three-Dimensional Harmonic Oscillator in Cylindrical Coordinates

For completeness, we now show how to solve the same oscillator in cylindrical coordinates. Then the Hamiltonian is
\hat H = -\frac{\hbar^2}{2m_0}\left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2}\right] + \frac{m_0\omega^2}{2}(\rho^2 + z^2)
= \hat H_z + \hat H_\rho + \frac{\hat L_z^2}{2m_0\rho^2}, \quad (19.94)
where we have denoted the mass by m_0 in order not to confuse it with m, the eigenvalue of \hat L_z/\hbar. We can easily check that H, H_z, L_z form a complete set of mutually compatible observables,
0 = [ Ĥ, L̂ z ] = [ Ĥ, Ĥz ] = [ L̂ z , Ĥz ]. (19.95)
Then we separate variables by writing the wave function as
\psi_{n_z,n_\rho,|m|}(z,\phi,\rho) = Z_{n_z}(z)\,\Phi_m(\phi)\,R_{n_\rho,|m|}(\rho). \quad (19.96)
The equation of motion for Z_{n_z}(z) is that for a harmonic oscillator in one dimension,
\hat H_z Z_{n_z} = \left[-\frac{\hbar^2}{2m_0}\frac{\partial^2}{\partial z^2} + \frac{m_0\omega^2 z^2}{2}\right]Z_{n_z} = E_z Z_{n_z}, \quad (19.97)

with the standard solution
Z_{n_z} = N(\alpha)\,e^{-\alpha^2 z^2/2}H_{n_z}(\alpha z), \qquad
E_z = \hbar\omega\left(n_z + \frac{1}{2}\right). \quad (19.98)

The equation of motion for Φ_m(φ) is even simpler,
\hat L_z\,\Phi_m(\phi) = m\hbar\,\Phi_m(\phi), \quad (19.99)
so that the eigenfunction is
\Phi_m(\phi) = \frac{1}{\sqrt{2\pi}}e^{im\phi}. \quad (19.100)

Then, finally, the equation for the two-dimensional radial function (in the plane) is
\frac{d^2R}{d\rho^2} + \frac{1}{\rho}\frac{dR}{d\rho} + \left\{-\frac{m^2}{\rho^2} + \frac{2m_0}{\hbar^2}\left[E - \hbar\omega\left(n_z + \frac{1}{2}\right)\right] - \frac{m_0^2\omega^2}{\hbar^2}\rho^2\right\}R = 0. \quad (19.101)
Redefining the variable R, in order to get rid of the dR/dρ term, by setting
R = \frac{\chi(\rho)}{\sqrt{\rho}}, \quad (19.102)

so that
\frac{d^2R}{d\rho^2} + \frac{1}{\rho}\frac{dR}{d\rho} = \frac{1}{\sqrt{\rho}}\left(\frac{d^2\chi}{d\rho^2} + \frac{1}{4\rho^2}\chi\right), \quad (19.103)

we find the equation
\frac{d^2\chi}{d\rho^2} + \left\{-\frac{m^2 - 1/4}{\rho^2} + \frac{2m_0}{\hbar^2}\left[E - \hbar\omega\left(n_z + \frac{1}{2}\right)\right] - \frac{m_0^2\omega^2}{\hbar^2}\rho^2\right\}\chi = 0. \quad (19.104)

This is the same equation as (19.72), with the following changes,
E \to E - \hbar\omega\left(n_z + \frac{1}{2}\right), \qquad l(l+1) \to m^2 - \frac{1}{4}, \quad (19.105)
implying
l^2 + l - m^2 + \frac{1}{4} = 0 \;\Rightarrow\; l = -\frac{1}{2} \pm |m|, \quad (19.106)
and eigenenergies
\frac{E}{\hbar\omega} - \left(n_z + \frac{1}{2}\right) = l + \frac{3}{2} + 2n_\rho = 1 \pm |m| + 2n_\rho \;\Rightarrow\;
E = \hbar\omega\left(n_z + 2n_\rho \pm |m| + \frac{3}{2}\right) \equiv \hbar\omega\left(n + \frac{3}{2}\right). \quad (19.107)
We see that again we obtain the same spectrum, and in fact also the same degeneracy, as expected.

Important Concepts to Remember

• For motion in a central potential in three dimensions, we can reduce the Schrödinger equation to an equation in one dimension, for χ(r), in an effective potential
V_{\rm eff}(r) = V(r) + \frac{\hbar^2 l(l+1)}{2mr^2},
thus including the centrifugal term.
• For V ∼ α/r^s with s < 2, we can have bound states if there is a region where V(r) < E < 0. If V ∼ −|α|/r^s with s < 2 at r → ∞, we have an infinite number of bound states, of vanishingly small energy.
• For V ∼ −|α|/r^s with s > 2, we can have bound states of arbitrarily large |E|, meaning there is an arbitrarily large probability for the particle to be at r = 0, implying that the particle “falls into r = 0 quantum mechanically”.
• For $V \sim -|\alpha|/r^2$, with $\beta = l(l+1) - 2m|\alpha|/\hbar^2 < -1/4$, the ground state is at $r = 0$.
• The diatomic molecule has a potential with a minimum at $r_0$, approximated by a shifted harmonic oscillator, so $E = V(r_0) + \hbar^2 l(l+1)/2I + \hbar\omega(n + 1/2)$.
• The spherical square well has a finite number of bound states for E < 0.
• The three-dimensional isotropic harmonic oscillator is equivalent to three one-dimensional har-
monic oscillators, with tensor product states, giving an (n + 1)(n + 2)/2 degeneracy.
• Equivalently, in spherical coordinates, we have a central potential $m\omega^2 r^2/2$, which can be solved by the Sommerfeld polynomial method in terms of the same ${}_1F_1$, giving a quantized energy $E_n = \hbar\omega(l + 2n_r + 3/2) \equiv \hbar\omega(n + 3/2)$, with the same $(n+1)(n+2)/2$ degeneracy.
• In cylindrical coordinates, we obtain a one-dimensional harmonic oscillator in z times a one-
dimensional radial harmonic oscillator for planar motion, with the same energy levels and
degeneracy.

Further Reading
See [4] for more details on the various central potential cases and also [2].

Exercises

(1) Consider the central potential $V(r) = \frac{\alpha}{r}\cos(\beta r)$. How many bound states are there?
(2) Consider the central potential $V(r) = \frac{|\alpha|}{r^2}\ln(r/r_*)$. In what region of space are the bound states confined?
(3) Consider the attractive Yukawa central potential, $V(r) = -\frac{|\alpha|}{r}e^{-mr}$. Is the bound state spectrum finite or infinite? Can we use the same approximation as that used for the diatomic molecule, and if so, under what conditions?
(4) For the free particle in spherical coordinates, prove relation (19.53).

(5) Complete the proof of the fact that the number of bound states of the spherical square well is
finite.
(6) Show that the normalization constant for the isotropic harmonic oscillator in spherical coordi-
nates is given by (19.93).
(7) Solve the isotropic harmonic oscillator in two spatial dimensions, using polar coordinates.
20 Systems of Identical Particles

In this chapter we start considering a more physical case, with several particles. According to the general theory discussed earlier in the book, if the particles are independent, so that the observable operators for different particles commute (in particular the Hamiltonians, $\hat H = \sum_{i=1}^N \hat H_i$, with $[\hat H_i, \hat H_j] = 0$), we can write the Hilbert space as a tensor product,

H = H1 ⊗ · · · ⊗ H N . (20.1)

Correspondingly, for a state with particles at positions x 1 , x 2 , . . . , x N , we have

|ψ = |x 1  ⊗ · · · ⊗ |x N . (20.2)

We have seen something similar before, in the case where we had several systems. For instance, in the previous chapter we considered a $d$-dimensional harmonic oscillator, with the oscillation in each dimension acting as an independent harmonic oscillator, so that $\hat H = \sum_{i=1}^d \hat H_i$ and the general state in the Cartesian occupation number picture was $|n_1\rangle \otimes \cdots \otimes |n_d\rangle$.

20.1 Identical Particles: Bosons and Fermions

However, we want to ask what happens when the particles are not just similar, but identical. In
classical mechanics that is not a problem because we can always track the position and momentum of
a particle, so we can always distinguish it from another particle. In quantum mechanics, however, the
particles have probabilities of being anywhere, including at the same point, so we cannot distinguish
them by position: the particles are indistinguishable.
Consider first a state with only two particles, and specifically the state with one particle at x 1 and
one at x 2 , denoted |x 1 x 2 . Because of what we have just said, this state should be indistinguishable
from the state |x 2 x 1 , which means the two states should be proportional,

|x 2 x 1  = C12 |x 1 x 2 . (20.3)

Define the permutation operator, which exchanges particles 1 and 2, by

P̂12 |x 1 x 2  = |x 2 x 1 . (20.4)

Then the two previous equations imply that $\hat P_{12}$ must be diagonal, and its eigenvalue is $C_{12}$. Moreover, it seems obvious that we must have $\hat P_{12} = \hat P_{21}$ and $\hat P_{12}\hat P_{34} = \hat P_{34}\hat P_{12}$, i.e., the permutations are Abelian (commutative), and $\hat P_{12}^2 = 1$, meaning that two permutations will take us back to the original (unpermuted) case. This latest consideration implies that
$$C_{12}^2 = 1 \;\Rightarrow\; C_{12} = \pm 1.\qquad(20.5)$$

Now, the conditions that permutations are Abelian (commutative), and that two permutations
take us back to the original case sound harmless and obvious, but in fact they are not. These are
assumptions that are broken in some possible cases, leading to what are known as “non-Abelions”
and “anyons”, which will be analyzed at the end of this book.
For the moment however, we note that we have only two possible symmetries, associated with the
statistics describing particles: C12 = +1 leads to Bose–Einstein statistics, for bosons, while C12 = −1
leads to Fermi–Dirac statistics, for fermions.
Since

$$\hat P_{12}\, |x_1\rangle \otimes |x_2\rangle = |x_2\rangle \otimes |x_1\rangle \neq |x_1\rangle \otimes |x_2\rangle,\qquad(20.6)$$

it follows that states such as |x 1  ⊗ |x 2  are actually not satisfactory states for identical particles. We
must have either symmetric (for bosons) or antisymmetric (for fermions) combinations of the above
states and their permutations,
$$|x_1 x_2; S/A\rangle = \frac{1}{\sqrt2}\left(|x_1\rangle \otimes |x_2\rangle \pm |x_2\rangle \otimes |x_1\rangle\right),\qquad(20.7)$$

corresponding to $C_{12} = \pm1$ (bosons/fermions). The factor $1/\sqrt2$ appears because of normalization, as the two states $|x_1\rangle \otimes |x_2\rangle$ and $|x_2\rangle \otimes |x_1\rangle$ are orthogonal, so their sums or differences must have a coefficient $1/\sqrt2$.
More generally, for states $a, b$ characterizing particles 1 and 2, instead of $x_1, x_2$, for bosons/fermions we have
$$|ab\rangle = \frac{1}{\sqrt2}\left(|a\rangle \otimes |b\rangle \pm |b\rangle \otimes |a\rangle\right).\qquad(20.8)$$
In the case of fermions, when $a = b$ (the same state for particles 1 and 2), the two-particle state is
$$|aa\rangle = \frac{1}{\sqrt2}\left(|a\rangle \otimes |a\rangle - |a\rangle \otimes |a\rangle\right) = 0.\qquad(20.9)$$
This is a mathematical expression of the Pauli exclusion principle, stating that we cannot have two
fermions occupying the same state, unlike for bosons or classical particles.
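These statements are easy to verify with explicit vectors; the following numpy sketch (the three-dimensional single-particle states are arbitrary stand-ins) builds the combinations (20.7)-(20.9) and checks the eigenvalues $C_{12} = \pm1$ of the permutation operator:

```python
import numpy as np

d = 3                                      # dimension of the single-particle space
a = np.array([1.0, 0.0, 0.0])              # state |a>
b = np.array([0.0, 1.0, 0.0])              # state |b>

def P12(state):
    # Exchange the two tensor factors: component (i, j) -> (j, i)
    return state.reshape(d, d).T.reshape(d * d)

def antisym(u, v):
    # Antisymmetrized (fermionic) two-particle combination
    return (np.kron(u, v) - np.kron(v, u)) / np.sqrt(2)

sym  = (np.kron(a, b) + np.kron(b, a)) / np.sqrt(2)   # bosonic combination
asym = antisym(a, b)                                   # fermionic combination

assert np.allclose(P12(sym),  +sym)        # C12 = +1 (Bose-Einstein)
assert np.allclose(P12(asym), -asym)       # C12 = -1 (Fermi-Dirac)
assert np.allclose(antisym(a, a), 0)       # Pauli exclusion: |aa> = 0
```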
Next, consider symmetric or antisymmetric states |ψS/A in the coordinate representation, which
means that we have to take the scalar product with the states |x 1 x 2 ; S/A. The wave function is thus

ψ S/A (x 1 , x 2 ) ≡ x 1 x 2 ; S/A|ψ S/A. (20.10)

If the state |ψ S/A corresponds to eigenstates for observables Ω̂ (with eigenvalues ω), measured for
particles 1 and 2, then we must have
$$|\psi\rangle_{S,A} = |\omega_1\omega_2; S/A\rangle = \frac{1}{\sqrt2}\left(|\omega_1\rangle \otimes |\omega_2\rangle \pm |\omega_2\rangle \otimes |\omega_1\rangle\right).\qquad(20.11)$$
Then, in the antisymmetric (A) or fermion case, we have the wave function
$$\psi_A(x_1, x_2) = \frac{1}{\sqrt2}\begin{vmatrix}\psi_{\omega_1}(x_1) & \psi_{\omega_2}(x_1)\\ \psi_{\omega_1}(x_2) & \psi_{\omega_2}(x_2)\end{vmatrix} = \frac{1}{\sqrt2}\left[\psi_{\omega_1}(x_1)\psi_{\omega_2}(x_2) - \psi_{\omega_2}(x_1)\psi_{\omega_1}(x_2)\right].\qquad(20.12)$$
This is known as a Slater determinant.

20.2 Observables under Permutation

Consider (generalized) coordinates q̂1 , q̂2 for particles 1 and 2, and let an observable depending
on these coordinates be represented by the operator Â( q̂1 , q̂2 ). Then the fact that particles are
indistinguishable means that the operators Â( q̂1 , q̂2 ) and Â( q̂2 , q̂1 ) must be equal (while the states
can differ by a phase).
On the other hand, consider separate operator observables associated with the two particles, Â1
and Â2 , which generalize q̂1 , q̂2 , and consider eigenstates for the two operators, |a1  ⊗ |a2 . Then
$$\hat A_1\, |a_1\rangle \otimes |a_2\rangle = a_1\, |a_1\rangle \otimes |a_2\rangle,\qquad \hat A_2\, |a_1\rangle \otimes |a_2\rangle = a_2\, |a_1\rangle \otimes |a_2\rangle.\qquad(20.13)$$
Acting with $\hat P_{12}\hat A_1$ on the states, and introducing $1 = \hat P_{12}^{-1}\hat P_{12}$ before the states, we obtain (by calculating in two different ways)
$$\hat P_{12}\hat A_1 \hat P_{12}^{-1}\hat P_{12}\, |a_1\rangle \otimes |a_2\rangle = a_1 \hat P_{12}\, |a_1\rangle \otimes |a_2\rangle = a_1\, |a_2\rangle \otimes |a_1\rangle = \left(\hat P_{12}\hat A_1\hat P_{12}^{-1}\right) |a_2\rangle \otimes |a_1\rangle.\qquad(20.14)$$
Since this relation is valid for any states $|a_1\rangle \otimes |a_2\rangle$ (and the eigenstates of observables make a complete set in Hilbert space), we have also the operatorial equation
$$\hat P_{12}\hat A_1\hat P_{12}^{-1} = \hat A_2.\qquad(20.15)$$
In particular, the same is true when Âi = q̂i .
Applying relation (20.15) to the operator observable $\hat A(\hat q_1, \hat q_2)$, and calculating in two ways (the second using the fact that the operator for indistinguishable particles is the same when we switch them), we obtain
$$\hat A(\hat q_1, \hat q_2) = \hat A(\hat q_2, \hat q_1) = \hat A\left(\hat P_{12}\hat q_1\hat P_{12}^{-1}, \hat P_{12}\hat q_2\hat P_{12}^{-1}\right) = \hat P_{12}\,\hat A(\hat q_1, \hat q_2)\,\hat P_{12}^{-1},\qquad(20.16)$$
thus leading to
$$\hat P_{12}\,\hat A(\hat q_1, \hat q_2)\,\hat P_{12}^{-1} = \hat A(\hat q_1, \hat q_2),\qquad(20.17)$$
or [ P̂12 , Â] = 0 (multiparticle observables commute with permutations).
Another proof of the same relation is obtained by acting with $\hat A\hat P_{12}$ on states,
$$\hat A(\hat q_2, \hat q_1)\hat P_{12}\, |q_1\rangle \otimes |q_2\rangle = \hat A(\hat q_2, \hat q_1)\, |q_2\rangle \otimes |q_1\rangle = A(q_1, q_2)\, |q_2\rangle \otimes |q_1\rangle = \hat P_{12}\left(A(q_2, q_1)\, |q_1\rangle \otimes |q_2\rangle\right) = \hat P_{12}\hat A(\hat q_2, \hat q_1)\, |q_1\rangle \otimes |q_2\rangle,\qquad(20.18)$$
and since it is true for states, it is also true as an operator relation.
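Relation (20.15) can also be checked with explicit matrices; in this sketch the single-particle dimension and the random observable are arbitrary choices:

```python
import numpy as np

d = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((d, d))            # an arbitrary single-particle observable
I = np.eye(d)

A1 = np.kron(A, I)                         # A acting on particle 1
A2 = np.kron(I, A)                         # A acting on particle 2

# Matrix of P12 on C^d (x) C^d: it sends e_i (x) e_j to e_j (x) e_i
P12 = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        P12[j * d + i, i * d + j] = 1.0

assert np.allclose(P12 @ P12, np.eye(d * d))   # P12^2 = 1, so P12^{-1} = P12
assert np.allclose(P12 @ A1 @ P12, A2)         # P12 A1 P12^{-1} = A2, eq. (20.15)
```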
In particular, the Hamiltonian operator must be of this type (it must commute with $\hat P_{12}$, i.e., be symmetric). However, in general we have
$$\hat H = \frac{\hat P_1^2}{2m_1} + \frac{\hat P_2^2}{2m_2} + V(\hat X_1, \hat X_2).\qquad(20.19)$$
(a) One possibility is that the two particles don’t interact at all, corresponding to a separable
system, i.e.,
V ( X̂1 , X̂2 ) = V ( X̂1 ) + V ( X̂2 ) ⇒ Ĥ = Ĥ1 + Ĥ2 . (20.20)

In this case, the solutions to the Schrödinger equation (the states) are separable,

|ψ = |ψ1  ⊗ |ψ2 , (20.21)

leading to separable wave functions (as we saw),

ψ(x 1 , x 2 ) = ψ1 (x 1 )ψ2 (x 2 ). (20.22)

(b) Another possibility is that the potential depends only on the relative distance of the particles,

V ( X̂1 , X̂2 ) = V (| X̂1 − X̂2 |), (20.23)

which, as we saw, means that the Schrödinger equation reduces to separate equations for the center of
mass motion and for the relative motion, with the relative motion corresponding to a central potential.
In this case, we do indeed have $\hat P_{12}\hat H\hat P_{12}^{-1} = \hat H$, as the relative motion is the same when we interchange the particles.

20.3 Generalization to N Particles

In the case of $N$ particles, we must consider the behavior under a generic permutation,
$$P = \begin{pmatrix} 1 & 2 & \cdots & N\\ i_1 & i_2 & \cdots & i_N \end{pmatrix}.\qquad(20.24)$$
But consider in particular just the transposition of particles $i$ and $j$,
$$P_{ij} \equiv \begin{pmatrix} 1 & \cdots & i & \cdots & j & \cdots & N\\ 1 & \cdots & j & \cdots & i & \cdots & N \end{pmatrix}.\qquad(20.25)$$

In particular, the action of $\hat P_{ij}$ on a state is
$$\hat P_{ij}\, |12\ldots i \ldots j \ldots N\rangle = |12\ldots j \ldots i \ldots N\rangle.\qquad(20.26)$$

However, in this case, we are back to the previous case (since only two particles are interchanged,
the others are left untouched), so by the same argument we must also have

P̂i j |12 . . . i . . . j . . . N = Ci j |12 . . . i . . . j . . . N, (20.27)

and again we must have $C_{ij}^2 = 1$ (see (20.3)), so

Ci j = ±1, (20.28)

corresponding as before to either Bose–Einstein or Fermi–Dirac statistics. Moreover, the same


argument leads to the Pauli exclusion principle, saying that for fermions we can have at most one
particle per state.
The above analysis must be true for any i, j, which means that in fact, the state |12 . . . N must be
totally symmetric or antisymmetric, corresponding respectively to bosons and fermions. Thus, for any
generic permutation P, obtained as a product of transpositions Pi j , the same analysis as above must
hold. For bosons, we have sums over permutations, whereas for fermions, we sum with a sign equal

to the sign of the permutation $P$ (which is given by the product of the minus signs corresponding to each transposition that composes $P$). We say that
$$|12\ldots N\rangle = \frac{1}{N_{\rm perms}} \sum_{\text{perms. } P} \left(\mathrm{sgn}(P)\right)\, \hat P\, |1\rangle \otimes |2\rangle \otimes \cdots \otimes |N\rangle.\qquad(20.29)$$

Wave functions in the coordinate representation are defined by

ψ S/A (x 1 , . . . , x N ) = x 1 . . . x N ; S/A|ψ S/A, (20.30)

and thus have the same symmetry properties. They are totally symmetric for bosons and totally
antisymmetric for fermions.
For a state that corresponds to individual states $a_i$ for each particle $i$, we have the multiparticle state
$$|\psi\rangle_{S/A} = \frac{1}{N_{\rm perms}} \sum_{\text{perms. } P} \left(\mathrm{sgn}(P)\right)\, \hat P\, |a_1\rangle \otimes \cdots \otimes |a_N\rangle,\qquad(20.31)$$
which means that the wave functions can be written as a determinant generalizing the two-particle case,
$$\psi_{S/A}(x_1, \ldots, x_N) = \frac{1}{N_{\rm perms}} \begin{vmatrix} \psi_{a_1}(x_1) & \cdots & \psi_{a_N}(x_1)\\ \vdots & & \vdots\\ \psi_{a_1}(x_N) & \cdots & \psi_{a_N}(x_N) \end{vmatrix}.\qquad(20.32)$$
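The equivalence of the signed sum over permutations and the determinant can be checked directly; in the sketch below, $M[i, j]$ stands for $\psi_{a_j}(x_i)$, sampled as arbitrary numbers:

```python
import numpy as np
from itertools import permutations

def perm_sign(p):
    # Sign of a permutation, computed by counting inversions
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

N = 4
rng = np.random.default_rng(1)
M = rng.standard_normal((N, N))            # M[i, j] = psi_{a_j}(x_i)

# Antisymmetrized sum over permutations, as in (20.31) (fermionic case)
psi_A = sum(perm_sign(p) * np.prod([M[i, p[i]] for i in range(N)])
            for p in permutations(range(N)))
assert np.isclose(psi_A, np.linalg.det(M))  # the Slater determinant (20.32)

# Pauli exclusion: two identical orbitals (equal columns) give a vanishing state
M[:, 1] = M[:, 0]
assert np.isclose(np.linalg.det(M), 0.0)
```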

20.4 Canonical Commutation Relations

For bosons, as we saw, we had commuting fundamental (canonical) operators, so in particular
$$[\hat q_i, \hat q_j] = [\hat p_i, \hat p_j] = 0,\qquad [\hat q_i, \hat p_j] = i\hbar\,\delta_{ij}.\qquad(20.33)$$
But if we have (extended) phase space variables $\psi_i$ that are truly fermionic, i.e., anticommuting, their canonical anticommutation relations will be
$$\{\psi_i, \psi_j\} = \{p_{\psi_i}, p_{\psi_j}\} = 0,\qquad \{\psi_i, p_{\psi_j}\} = i\hbar\,\delta_{ij}.\qquad(20.34)$$

The simplest system will still be the harmonic oscillator, but now a fermionic version of it whose Hamiltonian can be written in terms of the above phase space variables,
$$\hat H = \frac{\hat p_\psi^2}{2m} + \frac{m\omega^2}{2}\hat\psi^2,\qquad(20.35)$$
which again can be written in terms of creation and annihilation operators $b, b^\dagger$, but now obeying anticommutation relations,
$$\{b, b\} = \{b^\dagger, b^\dagger\} = 0,\qquad \{b, b^\dagger\} = 1.\qquad(20.36)$$
Then the quantum Hamiltonian can be written in terms of them as
$$\hat H = \hbar\omega\,\frac{-b b^\dagger + b^\dagger b}{2} = \hbar\omega\left(b^\dagger b - \frac12\right).\qquad(20.37)$$
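The algebra (20.36)-(20.37) has a faithful two-dimensional representation, on the basis of the empty and occupied states, which can be checked directly (units $\hbar\omega = 1$ in this sketch):

```python
import numpy as np

# Basis ordering (|0>, |1>): b annihilates the occupied state |1>
b  = np.array([[0.0, 1.0],
               [0.0, 0.0]])
bd = b.T                                    # b^dagger

anti = lambda X, Y: X @ Y + Y @ X           # anticommutator {X, Y}

assert np.allclose(anti(b, b),   0)         # {b, b} = 0
assert np.allclose(anti(bd, bd), 0)         # {b†, b†} = 0
assert np.allclose(anti(b, bd), np.eye(2))  # {b, b†} = 1

# H = hbar*omega*(b† b - 1/2): the spectrum has only two levels, -1/2 and +1/2
H = bd @ b - 0.5 * np.eye(2)
assert np.allclose(np.linalg.eigvalsh(H), [-0.5, 0.5])
```

The two-level spectrum is the Pauli exclusion principle in miniature: the fermionic oscillator can hold at most one quantum.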

Observations:

• We note that there exists a unique totally symmetric or antisymmetric wave function, a fact that should be obvious since we can start from one state and symmetrize it. Also, the totally symmetric or antisymmetric state is in a given (one-dimensional) representation of the permutation group $S_N$, so is uniquely defined by group theory.
• The other important observation is that we have considered only N particles in order to (anti-)
symmetrize, but we can ask: why not consider all the particles in the Universe? The point is that
the particles considered must have some kind of interaction, even if a small one, in which case we
are forced to use the symmetrization.
But if we can ignore the coupling to an “exterior” made up of the rest of the Universe, we can
ignore symmetrization with respect to it, and consider separation of variables instead:

ψ = ψsystem ψrest . (20.38)

Then the result is the same, as if we had not considered the exterior at all, since the probability
is given by

P = |ψ| 2 = |ψsystem | 2 |ψrest | 2 , (20.39)



and by summing (integrating) over the rest of the Universe, with $\int |\psi_{\rm rest}|^2 = 1$, we obtain that

Psystem = |ψsystem | 2 , (20.40)

the same result as if we had just ignored the rest of the Universe.

20.5 Spin–Statistics Theorem

Now, a natural question can be asked: how do we know whether a particle is a boson or a fermion,
other than by measuring its statistics (i.e., its behavior under measurement)?
The answer is that the statistics obeyed by a particle is related to its spin: for integer spin the particle
is a boson, whereas for half-integer spin j ∈ (N/2)\N, the particle is a fermion. This statement goes
under the name of the spin–statistics theorem; however, it is not so much a single mathematical proof
for this statement as it is a set of different proofs, of various degrees of rigor.
Some of the more rigorous proofs are based on quantum field theory, and will not be reproduced
here, but only stated:
(1) Lorentz invariance of the S-matrix for the scattering of particles implies the need to correlate
spin with statistics, in order not to break the symmetry at the quantum level.
(2) Vacuum stability: if we consider the wrong statistics for the fields associated with particles with
spin, then we find that there are contributions of negative energy, implying an arbitrarily negative
potential. Then the vacuum is unstable, which cannot hold for a true vacuum, implying that the
assumption about the statistics was wrong.
A third proof based on quantum field theory will be sketched, being reasonably simple to
understand (though some facts will have to be taken for granted by the reader):
(3) Causality criterion: For a spacelike separation between two points, (x − y) 2 > 0, such that there
is no causal contact between the points and we can make a Lorentz transformation to a reference
system where the two points are at the same time t, the observables measured at points x and y

should be independent. But in quantum mechanics, that amounts to the operators associated with
them commuting,
[O(x), O(y)] = [O(x , t), O(y , t)] = 0. (20.41)
In quantum field theory, however, a quantum field is (up to a normalization factor $N_k$) a sum over modes described by $e^{i\vec k\cdot\vec x}$ and associated creation operators,
$$\phi(\vec x, t) = \int d^3k\,\left[N_k\, a(\vec k)\, e^{-i\omega t + i\vec k\cdot\vec x} + \bar N_k\, a^\dagger(\vec k)\, e^{+i\omega t - i\vec k\cdot\vec x}\right],\qquad(20.42)$$
where $\omega = \sqrt{\vec k^2 + m^2}$. The commutation relation for the creation and annihilation operators is
$$[a(\vec k), a^\dagger(\vec k\,')] = \delta^3(\vec k - \vec k\,'),\qquad(20.43)$$
leading to a commutation relation between the fields themselves at different points (that are spacelike separated),
$$[\phi(\vec x, t = 0), \phi(\vec y, t = 0)] = \int d^3k\, |N_k|^2 \left(e^{i\vec k\cdot(\vec x - \vec y)} - e^{-i\vec k\cdot(\vec x - \vec y)}\right) = 0,\qquad(20.44)$$

where in the last equality we have used the fact that we can make the change k → −k in the
integration. This is indeed what we expect, and since this relation is true for the basic operators
(the fields), it will also be true for composite operators (arbitrary observables) made up from them.
Note, however, that if we didn’t have commutation relations for a, a† , but anticommutation relations
instead (as for b, b† ), we would get a nonzero result, thus proving the relation between spin (zero in
this case) and statistics (bosonic, in this case). The relation can be proven case by case, though we
do not have a general proof valid for all spins.
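The mechanism behind (20.44), namely that the integrand is odd under $\vec k \to -\vec k$ for a commutator but even for an anticommutator, can be illustrated with a one-dimensional discretized mode sum; the grid, the choice $|N_k|^2 = 1$, and the separation $r$ are arbitrary choices of this sketch:

```python
import numpy as np

k = np.linspace(-10.0, 10.0, 2001)   # symmetric mode grid, |N_k|^2 = 1
r = 1.7                              # spatial separation x - y

# Commutator-like integrand: odd under k -> -k, so the symmetric sum cancels
comm = np.sum(np.exp(1j * k * r) - np.exp(-1j * k * r))
# Anticommutator-like integrand: even under k -> -k, so no cancellation
acom = np.sum(np.exp(1j * k * r) + np.exp(-1j * k * r))

assert abs(comm) < 1e-9              # vanishes, as for bosonic fields
assert abs(acom) > 1.0               # does not vanish: wrong statistics for spin 0
```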
(4) However, the simplest (yet least rigorous) proof of the spin–statistics theorem relies on just the
quantum mechanics we have defined so far (though it becomes slightly more rigorous when using a
relativistic formulation, which will not be given here).
The generator of rotations around Oz is J3 /, as we saw in Chapter 16, so the matrix element for
these rotations (with an angle θ) is
$$g = e^{i\theta J_3/\hbar}.\qquad(20.45)$$
But the interchange of two particles can be obtained as follows: we can make a rotation by π in
the plane perpendicular to Oz, and then a translation (depending on where the origin is, with respect
to the midline between the particles). This amounts to a rotation by π of each of the two particles,
which is the same as a rotation by 2π of a single particle.
However, a particle of spin j, with J3 = j (its maximum value), rotated by θ = 2π, gives
$$g(2\pi) = e^{2\pi i j} = (-1)^{2j},\qquad(20.46)$$
which is indeed the formula we were looking for: symmetric (g = +1) for j ∈ N, and antisymmetric
(g = −1) for j ∈ (N/2)\N under this exchange of particles.
For j = 1/2, and considering a single particle, we must rotate by θ = π, so that
 
$$g = e^{i\pi\sigma_3/2} = \left(\cos\frac{\pi}{2}\right)\mathbf{1} + i\left(\sin\frac{\pi}{2}\right)\sigma_3 = i\sigma_3 = \begin{pmatrix} i & 0\\ 0 & -i \end{pmatrix}.\qquad(20.47)$$
Thus, for a single particle in a state |ψ1 , we have
ĝ(π)|ψ1  = ±i|ψ1 , (20.48)

and for a two-particle state,


$$\hat g(\pi)\, |\psi_1\rangle \otimes |\psi_2\rangle = (\pm i)^2\, |\psi_1\rangle \otimes |\psi_2\rangle = -|\psi_1\rangle \otimes |\psi_2\rangle = |\psi_2\rangle \otimes |\psi_1\rangle,\qquad(20.49)$$
as promised.
Moreover, for j = (2k + 1)/2 and θ = π, we obtain

$$g^2(\pi) = g(2\pi) = (-1)^{2j},\qquad(20.50)$$

which is what we obtained from our first way of deriving the interchange factor.
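The $2\pi$-rotation sign in (20.46)-(20.47) can be verified with Pauli matrices; this sketch works in units $\hbar = 1$:

```python
import numpy as np

sigma3 = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)

def g(theta):
    # exp(i*theta*sigma3/2) = cos(theta/2) 1 + i sin(theta/2) sigma3
    a = theta / 2
    return np.cos(a) * np.eye(2) + 1j * np.sin(a) * sigma3

assert np.allclose(g(np.pi), 1j * sigma3)              # eq. (20.47)
assert np.allclose(g(2 * np.pi), -np.eye(2))           # g(2*pi) = (-1)^(2j), j = 1/2
assert np.allclose(g(np.pi) @ g(np.pi), g(2 * np.pi))  # g(pi)^2 = g(2*pi), eq. (20.50)
```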

20.6 Particles with Spin

Next, we consider what happens when the identical particles have a spin degree of freedom. Consider
two electrons, with spin s = 1/2. Then the two-particle states are characterized by the individual
positions and spin projections on Oz, with Hilbert space basis

|x 1 , ms1  ⊗ |x 2 , ms2  → |x 1 , ms1 ; x 2 , ms2 . (20.51)

If the total Hamiltonian commutes with the total spin squared, $[\hat S_{\rm tot}^2, \hat H] = 0$, i.e., if the Hamiltonian is independent of the total spin, then we can write the states as the product of separate states for position and spin,
$$|\psi\rangle = |\phi\rangle_{\rm position} \otimes |\chi\rangle_{\rm spin}.\qquad(20.52)$$

The wave functions in the coordinate–spin basis are

ψ(x 1 , x 2 ; ms1 , ms2 ) = x 1 , ms1 ; x 2 , ms2 |ψ, (20.53)

and they are separable into pure coordinate and pure spin wave functions,

ψ(x 1 , x 2 ; ms1 , ms2 ) = φ(x 1 , x 2 )χ(ms1 , ms2 ). (20.54)

Since the addition of the two angular momenta (of the electrons with spin $s = 1/2$) gives
$$\tfrac12 \otimes \tfrac12 = 1 \oplus 0,\qquad(20.55)$$

meaning the four states of the product space decompose into three states of total spin 1 (the
triplet representation), | jm = |1, −1, |1, 0, |1, 1, plus a state of total spin zero (the singlet
representation), | jm = |0, 0, it follows that we can construct spin wave functions that correspond
to these representations,
$$\chi(m_{s1}, m_{s2}) = \chi_{++};\quad \frac{1}{\sqrt2}\left(\chi_{+-} + \chi_{-+}\right);\quad \chi_{--};\quad \frac{1}{\sqrt2}\left(\chi_{+-} - \chi_{-+}\right),\qquad(20.56)$$

where the first three states are in the triplet representation, and the last in the singlet representation.
We note that indeed the triplet states are even under permutations (symmetric), as expected, since
it implies that the electron angular momenta are parallel, whereas the singlet state is odd under
permutations (antisymmetric), since it implies that the electron angular momenta are antiparallel.
Since in the case where both angular momenta are positive, we expect a permutation to give back
the same (symmetric) state (since the state is symmetric),

P̂12 χ++ = +χ++ , (20.57)

the other two states in the triplet representation are obtained by acting with the total spin annihilation
operator S− = S1,− + S2,− . Since we also have [ P̂12 , S− ] = 0, the property of being symmetric is
maintained by acting with S− .
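A short numpy check (units $\hbar = 1$) confirms that the combinations in (20.56) carry total spin 1 and 0:

```python
import numpy as np

# Spin-1/2 operators S_i = sigma_i / 2
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2)

# Total spin S = S1 + S2 on C^2 (x) C^2, and its square S^2
S  = [np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz)]
S2 = sum(Si @ Si for Si in S)

up, dn = np.array([1.0, 0.0]), np.array([0.0, 1.0])
triplet0 = (np.kron(up, dn) + np.kron(dn, up)) / np.sqrt(2)
singlet  = (np.kron(up, dn) - np.kron(dn, up)) / np.sqrt(2)

# S^2 eigenvalue s(s+1): 2 for the triplet (s = 1), 0 for the singlet (s = 0)
assert np.allclose(S2 @ triplet0, 2 * triplet0)
assert np.allclose(S2 @ singlet, 0 * singlet)
```

The remaining triplet members $\chi_{++}$ and $\chi_{--}$ can be checked the same way, consistent with obtaining them by acting with $S_-$.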
Because the wave function factorizes (into separate variables), we have that the permutation operator also factorizes,
$$\hat P_{12} = \hat P_{12}^{\rm space} \cdot \hat P_{12}^{\rm spin}.\qquad(20.58)$$

Since we only need the total permutation to be antisymmetric (as is required for a fermion),

P̂12 ψ(x 1 , x 2 ) = −ψ(x 1 , x 2 ) = ψ(x 2 , x 1 ), (20.59)

it means we have two possibilities:

• We can have a symmetric spatial wave function, φ(x 1 , x 2 ) = φ(x 2 , x 1 ), and an antisymmetric spin
wave function, χ(ms1 , ms2 ) = −χ(ms2 , ms1 ). This is the case when the spins are antiparallel, and
so corresponds to the singlet representation.
• We can have an antisymmetric spatial wave function, φ(x 1 , x 2 ) = −φ(x 2 , x 1 ), and a symmetric
spin wave function, χ(ms1 , ms2 ) = +χ(ms2 , ms1 ). This is the case when the spins are parallel, and
so corresponds to the triplet representation.

In the case of a separable Hamiltonian, Ĥ = Ĥ1 + Ĥ2 , we can further separate the spatial wave
function φ(x 1 , x 2 ), which thus has solutions of the type

φ(x 1 , x 2 ) = φ A (x 1 )φ B (x 2 ), (20.60)

where φ A (x 1 ) corresponds to Ĥ1 and φ B (x 2 ) corresponds to Ĥ2 . But then we can write down a
spatial symmetric or antisymmetric wave function,
$$\phi(x_1, x_2) = \frac{1}{\sqrt2}\left[\phi_A(x_1)\phi_B(x_2) \pm \phi_A(x_2)\phi_B(x_1)\right],\qquad(20.61)$$
corresponding to the singlet and triplet representations, respectively.
In that case, the probability density of finding one electron at $x_1$ and the other at $x_2$ is
$$|\phi(x_1, x_2)|^2 = \frac12\left[|\phi_A(x_1)|^2|\phi_B(x_2)|^2 + |\phi_B(x_1)|^2|\phi_A(x_2)|^2 \pm 2\,\mathrm{Re}\left[\phi_A(x_1)\phi_B(x_2)\phi_A^*(x_2)\phi_B^*(x_1)\right]\right].\qquad(20.62)$$
The last term in the above is called the exchange probability density. We note that for the same state, $A = B$, we find that $|\phi(x_1, x_2)|^2_{\rm triplet} = 0$ (the antisymmetric spatial wave function vanishes identically), which is a statement of the Pauli exclusion principle. On the other hand, for the singlet, the probability when $A = B$ for $x_1 = x_2$ is maximal.
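The behavior of the two spatial combinations at coincident points can be seen with two illustrative one-dimensional orbitals; the Gaussian forms below are hypothetical stand-ins for $\phi_A$, $\phi_B$:

```python
import numpy as np

phi_A = lambda x: np.exp(-x**2 / 2)             # hypothetical orbital phi_A
phi_B = lambda x: x * np.exp(-x**2 / 2)         # hypothetical orbital phi_B

def density(x1, x2, sign):
    # |phi(x1, x2)|^2 for the (anti)symmetrized spatial wave function (20.61)
    phi = (phi_A(x1) * phi_B(x2) + sign * phi_A(x2) * phi_B(x1)) / np.sqrt(2)
    return phi**2

x = np.linspace(-3, 3, 61)
direct = phi_A(x)**2 * phi_B(x)**2              # direct terms at x1 = x2 = x

# Antisymmetric combination: vanishes at coincident points (exchange hole)
assert np.allclose(density(x, x, -1), 0)
# Symmetric combination: enhanced to twice the direct term at coincident points
assert np.allclose(density(x, x, +1), 2 * direct)
```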

Important Concepts to Remember

• In quantum mechanics identical particles are indistinguishable, which means the result of acting
with the permutation operator on a state with two particles must be related to the unpermuted state
by a phase, |x 2 x 1  = C12 |x 1 x 2 .
• Excluding the nontrivial cases of anyons and nonabelions, that means we can either have Bose–
Einstein statistics (bosons) for C12 = +1, or Fermi–Dirac statistics (fermions), C12 = −1.
• For fermions, this leads to the Pauli exclusion principle, which states that we cannot have two
fermions in the same state: |x x = 0.
• Multiparticle operators commute with the permutation operator; for the Hamiltonian, it is either
separable and symmetric, or has a potential depending only on the relative distances of the particles.
• A multiparticle state is either totally symmetric or totally antisymmetric. In the antisymmetric case,
this leads to a Slater determinant for the wave function.
• Fermionic phase space variables have anticommutation relations rather than commutation rela-
tions, indicated by {, }. In particular, the fermionic harmonic oscillator has operators b, b† with
{b, b† } = 1.
• The spin–statistics theorem says that integer spin particles are bosons, and half-integer spin particles are fermions; this can be understood from the rotation matrix $g = e^{i\theta J_3/\hbar}$, implying that $g(2\pi) = (-1)^{2j}$.
• For particles with spin, the total wave function is a product of the coordinate wave function times
the spinor wave function, both of which need to be totally symmetric or totally antisymmetric; only
the product wave function is governed by the spin–statistics theorem.
• Thus a fermion has either a symmetric spatial wave function and an antisymmetric spin wave
function, or an antisymmetric spatial wave function and a symmetric spin wave function.
• For a separable Hamiltonian, the probability density has a direct term and an exchange term.

Further Reading
See [2], [1], [3] for more details.

Exercises

(1) Write down an antisymmetric (fermionic) state for three particles, the energy eigenstates
E1 , E2 , E3 , and the corresponding Slater determinant.
(2) Consider a three-body problem with two-body interactions, i.e., with the potential
V̂ = V̂int (| X̂1 − X̂2 |) + V̂int (| X̂1 − X̂3 |) + V̂int (| X̂2 − X̂3 |). (20.63)

Is this quantum potential suitable for indistinguishable particles?


(3) Consider a state with N indistinguishable particles. What is the phase that relates |12 . . . N with
|N . . . 21?
(4) Write down a general state for the fermionic harmonic oscillator. Also write down a general state for N identical fermionic harmonic oscillators.

(5) We saw in the text that [φ(x , t), φ(y , t)] = 0 for a bosonic field. Show also that
{ψ(x , t), ψ(y , t)} = 0 for a fermion (Hint: think about the fact that ψ and ψ† are conjugate for
fermions, unlike for bosons). What relation will we have for the product φ(x , t)ψ(y , t)?
(6) Consider three particles of spin 1/2. Write the possible spin wave functions for the three-particle
states, and construct the possible total wave functions, for the spatial wave function times the
spin wave function.
(7) For the case in exercise 6, write down the probability and identify the exchange terms.
21 Application of Identical Particles: He Atom (Two-Electron System) and H2 Molecule

In this chapter, we will apply the formalism of the previous chapter to two cases with two electrons:
the He atom (and helium-like atoms, with an arbitrary Z but two electrons only) and the H2 molecule.
In order to do these calculations, we will introduce approximation methods that will be formalized
and generalized later in the book.

21.1 Helium-Like Atoms

Helium-like atoms have a nucleus of charge +Ze, around which two electrons move. Of course,
in the case of helium Z = 2, but we can consider general Z. The nucleus will be considered as
approximately fixed. As in the case of the hydrogen atom, the center of mass motion factorizes, and
the relative motions are the only ones of importance: since the mass of the nucleus is much larger
than that of the electrons, for the relative motions of the electrons we can consider a fixed nucleus
situated at the origin, around which the electrons move (with reduced mass approximately equal to
the electronic mass).
The Hamiltonian for the system contains the potentials for the interactions of the two electrons with the nucleus and with each other, thus being given by
$$\hat H = \frac{\hat p_1^2}{2m} + \frac{\hat p_2^2}{2m} - \frac{Ze_0^2}{r_1} - \frac{Ze_0^2}{r_2} + \frac{e_0^2}{r_{12}} = \hat H_1 + \hat H_2 + \hat H_{12},\qquad(21.1)$$
where $e_0^2 = e^2/(4\pi\epsilon_0)$, as before.


As in the previous chapter, the state of the system factorizes into a spin state and a coordinate state, and correspondingly the wave function is the product of a wave function for positions and one for spins,
$$\psi(x_1, x_2; m_{s1}, m_{s2}) = \phi(x_1, x_2)\,\chi(m_{s1}, m_{s2}),\qquad(21.2)$$
where the spin wave function $\chi(m_{s1}, m_{s2})$ is either in the triplet representation,
$$\chi_{++} = \chi_{1+}\chi_{2+};\qquad \frac{1}{\sqrt2}\left(\chi_{+-} + \chi_{-+}\right) = \frac{1}{\sqrt2}\left(\chi_{1+}\chi_{2-} + \chi_{1-}\chi_{2+}\right);\qquad \chi_{--} = \chi_{1-}\chi_{2-},\qquad(21.3)$$
or in the singlet representation,
$$\frac{1}{\sqrt2}\left(\chi_{+-} - \chi_{-+}\right) = \frac{1}{\sqrt2}\left(\chi_{1+}\chi_{2-} - \chi_{1-}\chi_{2+}\right).\qquad(21.4)$$
The position wave function φ(x 1 , x 2 ) is either antisymmetric, corresponding to the triplet spin
wave function, or symmetric, corresponding to the singlet spin wave function. The case with triplet

spin (when the spins of the individual electrons are parallel) and antisymmetric φ(x 1 , x 2 ) is a state
that is known as “ortho” (so in this case, we have ortho-helium), while the case with singlet spin
(when the electron spins are antiparallel) and symmetric φ(x 1 , x 2 ) is a state known as “para” (so in
this case, we have para-helium). To proceed, we will make some approximations.

Approximation 1
We neglect the interaction between the electrons, $\hat H_{12} = e_0^2/r_{12}$, for the wave functions, and so consider eigenfunctions of the noninteracting Hamiltonian,
$$\hat H \simeq \hat H_1 + \hat H_2,\qquad(21.5)$$
and then, as in the noninteracting case, we can separate variables, obtaining eigenfunctions
$$\psi(\vec r_1, \vec r_2) = \psi(\vec r_1)\psi(\vec r_2).\qquad(21.6)$$
For these eigenfunctions, the eigenenergies are
$$E \simeq E^{(1)} + E^{(2)},\qquad(21.7)$$
and $(\psi(\vec r_1), E^{(1)})$, $(\psi(\vec r_2), E^{(2)})$ are solutions of a hydrogen-like (noninteracting) Schrödinger equation,
$$\left[-\frac{\hbar^2}{2m}\Delta_{\vec r_s} - \frac{Ze_0^2}{r_s}\right]\psi^{(s)}(\vec r_s) = E^{(s)}\psi^{(s)}(\vec r_s).\qquad(21.8)$$

Then the individual electron wave functions are
$$\psi^{(s)}(\vec r_s) = \psi_{n_s l_s m_s}(\vec r_s) \equiv \psi_{q_s}(\vec r_s),\qquad(21.9)$$
and their energies are
$$E^{(s)} = E_{n_s} = -\frac{m Z^2 e_0^4}{2\hbar^2}\,\frac{1}{n_s^2} = -\frac{Z^2 e_0^2}{2a_0}\,\frac{1}{n_s^2}.\qquad(21.10)$$
The corresponding symmetric (spin singlet) and antisymmetric (spin triplet) coordinate wave functions are
$$\phi_s(\vec r_1, \vec r_2) = \frac{1}{\sqrt2}\left[\psi_{q_1}(\vec r_1)\psi_{q_2}(\vec r_2) + \psi_{q_2}(\vec r_1)\psi_{q_1}(\vec r_2)\right],\qquad \phi_a(\vec r_1, \vec r_2) = \frac{1}{\sqrt2}\left[\psi_{q_1}(\vec r_1)\psi_{q_2}(\vec r_2) - \psi_{q_2}(\vec r_1)\psi_{q_1}(\vec r_2)\right].\qquad(21.11)$$

Approximation 2
We now suppose that the energy of the interaction-less case above is corrected by the interaction term $\hat H_{12}$ evaluated in the above (noninteracting, S or A) state,
$$E = E^{(1)} + E^{(2)} + \Delta E,\qquad \Delta E = \left\langle \phi_{S/A}\left|\frac{e_0^2}{r_{12}}\right|\phi_{S/A}\right\rangle = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, |\phi_{S/A}(\vec r_1, \vec r_2)|^2.\qquad(21.12)$$

The probability density in the integral can be written as
$$|\phi_{s,a}(\vec r_1, \vec r_2)|^2 = \left|\frac{\psi_{q_1}(\vec r_1)\psi_{q_2}(\vec r_2) \pm \psi_{q_2}(\vec r_1)\psi_{q_1}(\vec r_2)}{\sqrt2}\right|^2 = \frac12\left[|\psi_{q_1}(\vec r_1)|^2|\psi_{q_2}(\vec r_2)|^2 + |\psi_{q_2}(\vec r_1)|^2|\psi_{q_1}(\vec r_2)|^2 \pm 2\,\mathrm{Re}\left[\psi_{q_1}(\vec r_1)\psi_{q_1}^*(\vec r_2)\psi_{q_2}(\vec r_2)\psi_{q_2}^*(\vec r_1)\right]\right].\qquad(21.13)$$
The last term is called the exchange probability density, as we said in the previous chapter.
The integrals involving the terms above can be rewritten in a useful way. Consider the integrals of the first two terms:
$$C \equiv \frac12\int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\left[|\psi_{q_1}(\vec r_1)|^2|\psi_{q_2}(\vec r_2)|^2 + |\psi_{q_2}(\vec r_1)|^2|\psi_{q_1}(\vec r_2)|^2\right] = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, |\psi_{q_1}(\vec r_1)|^2|\psi_{q_2}(\vec r_2)|^2 = \int d^3r_1 \int d^3r_2\, \frac{\rho_{q_1}(\vec r_1)\rho_{q_2}(\vec r_2)}{4\pi\epsilon_0\, r_{12}},\qquad(21.14)$$
where in the second line we have made the redefinition $\vec r_1 \leftrightarrow \vec r_2$ in the integral of the second term, in order to show that it equals the first term, and in the third line we have defined the charge density as the charge times the probability density, $\rho_q(\vec r) = e|\psi_q(\vec r)|^2 = e_0\sqrt{4\pi\epsilon_0}\,|\psi_q(\vec r)|^2$. As we can see, this integral is a Coulomb-type integral; it is just the integral of the Coulomb potential for charge densities, hence the name $C$ for the contribution.
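For two $1s$ electrons (the case treated in the next section), this Coulomb integral has the known analytic value $C = (5Z/8)\,e_0^2/a_0$, which can be reproduced numerically; this sketch works in units $e_0 = a_0 = 1$ and uses the fact that for spherically symmetric densities the angular integration replaces $1/r_{12}$ by $1/\max(r_1, r_2)$:

```python
import numpy as np

def trapz(y, x):
    # Trapezoid rule along the last axis
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

Z = 2.0                                      # helium
r = np.linspace(1e-6, 20.0 / Z, 1500)
rho = (Z**3 / np.pi) * np.exp(-2 * Z * r)    # |psi_100(r)|^2, normalized

R1, R2 = np.meshgrid(r, r, indexing="ij")
integrand = ((4 * np.pi)**2 * R1**2 * R2**2
             * rho[:, None] * rho[None, :] / np.maximum(R1, R2))
C = trapz(trapz(integrand, r), r)

assert abs(C - 5 * Z / 8) < 5e-3             # analytic result: C = 5Z/8
```

For helium ($Z = 2$) this gives $C \simeq 1.25\, e_0^2/a_0 \simeq 34$ eV, the standard first-order repulsion correction.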
Next consider the last term in (21.13) (with $\pm 2\,\mathrm{Re}$ in front):
$$A \equiv \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\,\frac12\left[\psi_{q_1}(\vec r_1)\psi_{q_1}^*(\vec r_2)\psi_{q_2}(\vec r_2)\psi_{q_2}^*(\vec r_1) + \psi_{q_1}(\vec r_2)\psi_{q_1}^*(\vec r_1)\psi_{q_2}(\vec r_1)\psi_{q_2}^*(\vec r_2)\right] = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\,\psi_{q_1}(\vec r_1)\psi_{q_1}^*(\vec r_2)\psi_{q_2}(\vec r_2)\psi_{q_2}^*(\vec r_1),\qquad(21.15)$$
where we have used again that the two terms (which are otherwise complex conjugates of each other) are equal, as can be seen by making the redefinition $\vec r_1 \leftrightarrow \vec r_2$. This means that the term is actually real, as needed since it is a term in the energy. It also means that we can define $f(\vec r) \equiv \psi_{q_1}(\vec r)\psi_{q_2}^*(\vec r)$, and write
$$A = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, f(\vec r_1)\, f(\vec r_2).\qquad(21.16)$$
Here A is the exchange integral or exchange energy, and the initial comes from the German
“Austausch”, meaning exchange.
Finally, then, the energy is

E = E_{q_1} + E_{q_2} + C \pm A.    (21.17)
Moreover, we can show that A is not only real but positive. Indeed, the largest
contribution to the integral comes from r_{12} \to 0, when the two particles are close to each other. But A
is an integral of f(\vec r_1) f^*(\vec r_2)/r_{12}, and if \vec r_1, \vec r_2 are close together then, by the continuity of the function
f, we have f(\vec r_1) f^*(\vec r_2) \simeq |f(\vec r_1)|^2 \geq 0, so this leading contribution to A is positive.

But that means that the energy of the ortho-helium state (antisymmetric position wave function)
is smaller than the energy of the para-helium state (symmetric position wave function), meaning that
it is more stable. Thus we have a tendency to have parallel spins, i.e., triplet states, even though we
don’t have any interaction that depends on spin (this tendency follows simply from the statistics of
the fermions).
Also note that when q1 = q2 (both electrons have the same quantum numbers n, l, m), the analysis
is special (the previous analysis does not apply). Then there is no antisymmetric (ortho-helium, or
spin triplet) state, only a symmetric one (para-helium, or spin singlet), corresponding to antiparallel
spins for the electrons.
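The positivity of A and the resulting ordering of the spatially antisymmetric and symmetric energies can be illustrated with a small numerical toy model. This is a sketch under assumed, illustrative choices (1D harmonic-oscillator orbitals and a softened Coulomb repulsion), not the actual helium integrals:

```python
import math

# Toy 1D illustration: two real orthonormal "orbitals" (harmonic-oscillator
# ground and first excited states) and a softened repulsion
# v(x1,x2) = 1/(|x1-x2| + 0.5). All parameters are illustrative assumptions.

def psi0(x):
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def psi1(x):
    return math.sqrt(2.0) * x * psi0(x)  # normalized first excited state

xs = [-6.0 + 0.05 * i for i in range(241)]
dx = 0.05

def v(x1, x2):
    return 1.0 / (abs(x1 - x2) + 0.5)

C = 0.0  # direct (Coulomb) integral
A = 0.0  # exchange integral
for x1 in xs:
    for x2 in xs:
        w = v(x1, x2) * dx * dx
        C += w * psi0(x1) ** 2 * psi1(x2) ** 2
        A += w * (psi0(x1) * psi1(x1)) * (psi0(x2) * psi1(x2))

print(f"C = {C:.4f}, A = {A:.4f}")  # A > 0, so E_anti = C - A < E_sym = C + A
```

The antisymmetric spatial combination then has energy C − A < C + A, in line with the argument above.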

21.2 Ground State of the Helium (or Helium-Like) Atom

We will calculate only the ground state of the helium (or helium-like) atom, so we need only consider
the ground states of hydrogen-like atoms as individual wave functions. Then we have

ψq = ψ100 (r ) = R10 (r)Y00 (θ, φ). (21.18)

But the ground-state spherical harmonic is just a normalization constant, since the state has no
angular momentum, and this means that it is spherically symmetric,

Y_{00}(\theta, \phi) = \frac{1}{\sqrt{4\pi}},    (21.19)

so that \int d\Omega\, |Y_{00}|^2 = 1. In orbital notation, the ground state of the helium-like atom is (1s)^2, meaning
that n = 1, l = 0 (denoted as an "s" orbital) and the superscript 2 refers to there being two electrons in this
state (necessarily of opposite spins, by the Pauli exclusion principle, since they are in the same coordinate
state). The radial wave function is R_{10}(r) \sim e^{-kr}, where k = Z/a_0, and the normalization
constant for it comes from

\int_0^\infty e^{-2kr} r^2\, dr = \frac{1}{4k^3},    (21.20)
leading to

\psi_q = \left(\frac{Z}{a_0}\right)^{3/2} \frac{2}{\sqrt{4\pi}}\, e^{-Zr/a_0}.    (21.21)

Then the para (spin singlet) state of a helium-like atom is

\psi_s = \psi_q(\vec r_1)\psi_q(\vec r_2)\,\chi_{\rm singlet},
\qquad
\psi_q(\vec r_1)\psi_q(\vec r_2) = \frac{Z^3}{\pi a_0^3}\, e^{-Z(r_1 + r_2)/a_0},    (21.22)
and the electron interaction energy is

\Delta E = e_0^2 \left\langle \frac{1}{r_{12}} \right\rangle_{\psi_s}
 = \int d^3r_1 \int d^3r_2\, \frac{Z^6}{\pi^2 a_0^6}\, e^{-2k(r_1 + r_2)}\, \frac{e_0^2}{r_{12}}.    (21.23)

To do this integral, we first define

\Phi(\vec r_1) \equiv \int d^3r_2\, \frac{e^{-2kr_2}}{r_{12}},    (21.24)

and note that it has the form of a Coulomb potential at the point \vec r_1 for a distribution of charge with
density e^{-2kr_2}. It then satisfies the Poisson equation

\Delta_{r_1} \Phi(\vec r_1) = -4\pi e^{-2kr_1}.    (21.25)

Because of the spherical symmetry of the equation, \vec L^2 \Phi = 0, so we obtain just the radial equation,

\left( \frac{d^2}{dr_1^2} + \frac{2}{r_1}\frac{d}{dr_1} \right) \Phi(r_1) = -4\pi e^{-2kr_1},    (21.26)

or

\frac{d^2}{dr_1^2} \big( r_1 \Phi(r_1) \big) = -4\pi r_1 e^{-2kr_1}.    (21.27)

We see that it has a particular solution of the form r_1 \Phi(r_1) = (a r_1 + b) e^{-2kr_1} and, by substituting
it into (21.27), we obtain

\big[ (2k)^2 (a r_1 + b) - 4ka \big] e^{-2kr_1} = -4\pi r_1 e^{-2kr_1},    (21.28)

which fixes the coefficients as follows:

a = -\frac{\pi}{k^2}, \qquad b = \frac{a}{k} = -\frac{\pi}{k^3}.    (21.29)
However, the general solution of a second-order differential equation with a source (the Poisson
equation) is equal to the particular solution plus the general solution of the homogeneous (sourceless)
equation, which is r_1 \Phi(r_1) = C_1 r_1 + C_2. To fix the coefficients C_1, C_2, we impose the physical
conditions that the potential \Phi(r_1) should vanish at infinity, \Phi(r_1 \to \infty) \to 0, which gives C_1 = 0,
and that it is nonsingular at r_1 \to 0, since in the original definition (21.24) there is nothing special
about r_1 = 0. This last condition implies that the coefficient of 1/r_1 must vanish at r_1 = 0, leading to
C_2 = -b = \pi/k^3.
Finally, then, we have

\Phi(r_1) = -\pi \left[ \frac{1}{k^2}\, e^{-2kr_1} + \frac{1}{k^3}\, \frac{e^{-2kr_1}}{r_1} - \frac{1}{k^3}\, \frac{1}{r_1} \right].    (21.30)
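Equation (21.30) can be cross-checked numerically (a Python sketch; the values of k and r_1 are arbitrary test choices). The d^3r_2 integral can be reduced to radial shells, using the standard electrostatics fact that shells inside r_1 act like point charges at the origin, while shells outside contribute at their own radius:

```python
import math

k, r1 = 1.0, 1.3

def phi_closed(r):
    # the closed form (21.30)
    e = math.exp(-2 * k * r)
    return -math.pi * (e / k**2 + e / (k**3 * r) - 1.0 / (k**3 * r))

# Shell decomposition of the 3D integral of exp(-2 k r2)/r12:
n, smax = 200000, 25.0
ds = smax / n
total = 0.0
for i in range(1, n + 1):
    s = i * ds
    kernel = 1.0 / r1 if s < r1 else 1.0 / s
    total += 4 * math.pi * s * s * math.exp(-2 * k * s) * kernel * ds

print(phi_closed(r1), total)  # the two values agree
```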
Integrating over \vec r_1 as well, we get

\int \Phi(r_1) e^{-2kr_1}\, d^3r_1 = \int_0^\infty \Phi(r_1) e^{-2kr_1}\, 4\pi r_1^2\, dr_1
 = -4\pi^2 \left[ \frac{1}{k^2} \int_0^\infty e^{-4kr_1} r_1^2\, dr_1
 + \frac{1}{k^3} \int_0^\infty e^{-4kr_1} r_1\, dr_1
 - \frac{1}{k^3} \int_0^\infty e^{-2kr_1} r_1\, dr_1 \right]
 = -4\pi^2 \left[ \frac{1}{k^2}\, \frac{1}{32k^3} + \frac{1}{k^3}\, \frac{1}{16k^2} - \frac{1}{k^3}\, \frac{1}{4k^2} \right]
 = +\frac{5}{8}\, \frac{\pi^2}{k^5}.    (21.31)
Thus the correction \Delta E to the energy is

\Delta E = \frac{e_0^2 Z^6}{\pi^2 a_0^6}\, \frac{5\pi^2}{8k^5} = \frac{5}{8}\, \frac{Z e_0^2}{a_0},    (21.32)

and the total energy is the sum of two ground state energies for a hydrogen-like atom, plus the above
correction,

E = 2E_0 + \Delta E = -\frac{Z^2 e_0^2}{a_0} + \frac{5}{8}\, \frac{Z e_0^2}{a_0}
 = -\frac{Z e_0^2}{a_0} \left( Z - \frac{5}{8} \right).    (21.33)

Using Z = 2 for helium, we obtain E \simeq -74.8 eV, which may be compared with the experimental
value of E_{\rm exp} = -78.8 eV. That is quite good, but we can in fact do better.
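In numbers (a sketch; the Hartree value e_0^2/a_0 ≈ 27.211 eV is an assumed input, not taken from the text):

```python
# Numeric check of (21.33) in eV.
HARTREE_EV = 27.211                   # e0^2/a0, assumed value
Z = 2
E0_pair = -Z**2 * HARTREE_EV          # 2E0 = -Z^2 e0^2/a0
dE = (5.0 / 8.0) * Z * HARTREE_EV     # first-order repulsion (21.32)
E = E0_pair + dE                      # = -(Z e0^2/a0)(Z - 5/8)
print(f"E = {E:.1f} eV")              # about -74.8 eV, vs -78.8 eV measured
```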

21.3 Approximation 3: Variational Method, “Light Version”

We can account for the fact that, when considering the interaction of the
individual electrons with the nucleus, each electron does not see the full charge +Ze of the nucleus but,
rather, a smaller effective charge, owing to the partial screening of the nuclear charge by the charge-
density cloud created by the second electron. To find this effective charge Z_{\rm eff}, we use a variational
method (studied in detail later, in Chapter 41).
Namely, we consider a modified wave function, with Z \to Z_{\rm eff}, so that

\tilde\psi_s = \frac{Z_{\rm eff}^3}{\pi a_0^3}\, e^{-Z_{\rm eff}(r_1 + r_2)/a_0},    (21.34)
and evaluate the Hamiltonian in this effective state,

E = \Big\langle \tilde\psi_s \Big| \frac{\hat{\vec p}_1^2}{2m} + \frac{\hat{\vec p}_2^2}{2m} \Big| \tilde\psi_s \Big\rangle
 + \Big\langle \tilde\psi_s \Big| -\frac{Z e_0^2}{r_1} - \frac{Z e_0^2}{r_2} \Big| \tilde\psi_s \Big\rangle
 + \Big\langle \tilde\psi_s \Big| \frac{e_0^2}{r_{12}} \Big| \tilde\psi_s \Big\rangle.    (21.35)
But we saw in Chapter 18 that

\left\langle \frac{1}{r} \right\rangle_n = \frac{Z}{a_0 n^2}
 \;\Rightarrow\; \Big\langle \tilde\psi_s \Big| \frac{1}{r_1} \Big| \tilde\psi_s \Big\rangle = \frac{Z_{\rm eff}}{a_0},
\qquad
\left\langle \frac{\hat{\vec p}^2}{2m} \right\rangle_n = \frac{Z^2 e_0^2}{2a_0 n^2}
 \;\Rightarrow\; \Big\langle \tilde\psi_s \Big| \frac{\hat{\vec p}^2}{2m} \Big| \tilde\psi_s \Big\rangle = \frac{Z_{\rm eff}^2 e_0^2}{2a_0},    (21.36)
so that the energy functional to be minimized is

E = 2\, \frac{Z_{\rm eff}^2 e_0^2}{2a_0} - 2\, \frac{Z Z_{\rm eff}\, e_0^2}{a_0} + \frac{5}{8}\, \frac{Z_{\rm eff}\, e_0^2}{a_0}.    (21.37)
Minimizing this over Z_{\rm eff} at fixed Z, via \partial E/\partial Z_{\rm eff} = 0, we obtain

Z_{\rm eff} = Z - \frac{5}{16}.    (21.38)
Replacing it back in (21.37), we find the minimized energy

E = -\frac{e_0^2}{a_0} \left( Z - \frac{5}{16} \right)^2 = -\frac{Z_{\rm eff}^2 e_0^2}{a_0}.    (21.39)
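The minimization (21.37)–(21.39) can also be done numerically, as a check (a sketch with the same assumed Hartree value e_0^2/a_0 ≈ 27.211 eV):

```python
# Minimal numeric version of the variational step: minimize
# E(z) = z^2 - 2 Z z + (5/8) z (in units of e0^2/a0) over the effective
# charge z by a grid scan; the analytic minimum is z = Z - 5/16.

HARTREE_EV = 27.211  # e0^2/a0 in eV (assumed value)
Z = 2

def E(z):
    return z * z - 2 * Z * z + (5.0 / 8.0) * z

zs = [1.0 + 1e-4 * i for i in range(20001)]   # scan z in [1, 3]
z_best = min(zs, key=E)
E_best_eV = E(z_best) * HARTREE_EV

print(z_best, E_best_eV)  # z near 1.6875 = 2 - 5/16, E near -77.5 eV
```

This gives E ≈ −77.5 eV, noticeably closer to the experimental −78.8 eV than the first-order result.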
An equivalent way to obtain this result is to note that (as seen in Chapter 18)

\left\langle \frac{1}{r} \right\rangle_n = \frac{Z}{a_0 n^2}
 \;\Rightarrow\; \left\langle \frac{1}{r_{12}} - \frac{5}{16}\, \frac{1}{r_1} - \frac{5}{16}\, \frac{1}{r_2} \right\rangle = 0,    (21.40)

and use this to rewrite the Hamiltonian in the form

\hat H = \frac{\hat{\vec p}_1^2}{2m} + \frac{\hat{\vec p}_2^2}{2m}
 - \left( Z - \frac{5}{16} \right) \frac{e_0^2}{r_1} - \left( Z - \frac{5}{16} \right) \frac{e_0^2}{r_2}
 + e_0^2 \left( \frac{1}{r_{12}} - \frac{5}{16}\, \frac{1}{r_1} - \frac{5}{16}\, \frac{1}{r_2} \right),    (21.41)

where the term in the last line has zero quantum average, so can be dropped. But then the remaining
terms are the same as for two decoupled electrons in the potential of a nucleus with Z_{\rm eff} = Z - 5/16,
so the total energy is

E = -2\, \frac{e_0^2}{2a_0} \left( Z - \frac{5}{16} \right)^2.    (21.42)

21.4 H2 Molecule and Its Ground State

Another two-electron system is the hydrogen molecule H2 , with two protons (nuclei), called A and B,
and two electrons, called 1 and 2. The Hamiltonian corresponds to the Coulomb interactions between
the electrons and each of the nuclei, between themselves, and between the two nuclei, so is given by
\hat H = -\frac{\hbar^2}{2m} \Delta_1 - \frac{\hbar^2}{2m} \Delta_2
 - \frac{e_0^2}{r_{1A}} - \frac{e_0^2}{r_{1B}} - \frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{2B}}
 + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R},    (21.43)

where R = |\vec r_A - \vec r_B| is the distance between the nuclei.
It is clear that in the ground state, at least for R \gg a_0, one electron is near nucleus A, and the other
is near nucleus B.
Indeed, consider the opposite situation, with both electrons near one nucleus, so that we have two
hydrogen ions, H^+ and H^-. Here H^+ is just a bare proton, so it doesn't contribute in the first
approximation, whereas H^- can be thought of as a helium-like atom (with two electrons), but with
Z = 1. Then we can apply the helium-like energy formulas, obtaining that the electron energy of the
system is

E_e = -\frac{e_0^2}{a_0} \left( 1 - \frac{5}{16} \right)^2 > -\frac{e_0^2}{a_0} = 2E_{H,0},    (21.44)

which is larger than the energy of the electrons in the configuration with two neutral
hydrogen atoms.
This means that for the ground state in the R \gg a_0 case we have one electron near nucleus A and
the other near nucleus B, i.e., two neutral hydrogen atoms, H + H. For this configuration, we have
two eigenstates, depending on which electron is near which nucleus,
\psi_1 = \psi(r_{1A})\,\psi(r_{2B}),
\qquad
\psi_2 = \psi(r_{1B})\,\psi(r_{2A}).    (21.45)
But while these two states are normalized, they are not orthogonal, because they have a nonzero
product,

\langle \psi_1 | \psi_2 \rangle = \int d^3r_1\, \psi^*(r_{1A})\psi(r_{1B}) \int d^3r_2\, \psi(r_{2A})\psi^*(r_{2B}) = |S|^2,    (21.46)

where we have defined the "overlap integral"

S \equiv \int d^3r\, \psi^*(r_A)\,\psi(r_B).    (21.47)

Since we have the products

\langle \psi_1 \pm \psi_2 | \psi_1 \pm \psi_2 \rangle
 = \langle \psi_1 | \psi_1 \rangle + \langle \psi_2 | \psi_2 \rangle \pm 2 \langle \psi_1 | \psi_2 \rangle
 = 2(1 \pm |S|^2),    (21.48)

the normalized symmetric and antisymmetric coordinate wave functions for the two-electron
system are

\psi_{s/a} = \frac{\psi_1 \pm \psi_2}{\sqrt{2(1 \pm |S|^2)}}
 = \frac{\psi(r_{1A})\psi(r_{2B}) \pm \psi(r_{1B})\psi(r_{2A})}{\sqrt{2(1 \pm |S|^2)}}.    (21.49)
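For hydrogen 1s orbitals the overlap integral has the well-known closed form S(R) = e^{-R/a_0}(1 + R/a_0 + R^2/3a_0^2), quoted here for comparison; a numeric quadrature sketch (in units a_0 = 1) confirms it:

```python
import math

# Overlap of two 1s orbitals, psi(r) = e^{-r}/sqrt(pi), at separation R.
R = 2.0

def S_closed(R):
    # standard closed-form result, quoted for comparison
    return math.exp(-R) * (1 + R + R * R / 3)

# Quadrature in spherical coordinates around nucleus A; the distance to
# nucleus B is r_B = sqrt(r^2 + R^2 - 2 r R cos(theta)).
nr, nt = 400, 400
rmax = 20.0
dr, dt = rmax / nr, math.pi / nt
S = 0.0
for i in range(nr):
    r = (i + 0.5) * dr
    for j in range(nt):
        th = (j + 0.5) * dt
        rB = math.sqrt(r * r + R * R - 2 * r * R * math.cos(th))
        # (1/pi) * e^{-r} e^{-rB} * 2*pi * r^2 sin(theta)
        S += 2.0 * math.exp(-r - rB) * r * r * math.sin(th) * dr * dt

print(S, S_closed(R))  # the two values agree
```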
Then the matrix elements of the Hamiltonian are

\langle \psi_1 | \hat H | \psi_1 \rangle
 = \Big\langle \psi_1 \Big| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1A}} \Big| \psi_1 \Big\rangle
 + \Big\langle \psi_1 \Big| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2B}} \Big| \psi_1 \Big\rangle
 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(r_{1A})|^2 |\psi(r_{2B})|^2
 = 2E_0 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(r_{1A})|^2 |\psi(r_{2B})|^2,

\langle \psi_2 | \hat H | \psi_2 \rangle
 = \Big\langle \psi_2 \Big| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1B}} \Big| \psi_2 \Big\rangle
 + \Big\langle \psi_2 \Big| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2A}} \Big| \psi_2 \Big\rangle
 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{1A}} - \frac{e_0^2}{r_{2B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(r_{1B})|^2 |\psi(r_{2A})|^2
 = 2E_0 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{1A}} - \frac{e_0^2}{r_{2B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(r_{1B})|^2 |\psi(r_{2A})|^2,

\langle \psi_1 | \hat H | \psi_2 \rangle
 = \Big\langle \psi_1 \Big| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1A}} \Big| \psi_2 \Big\rangle
 + \Big\langle \psi_1 \Big| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2B}} \Big| \psi_2 \Big\rangle
 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi^*(r_{1A})\psi(r_{1B})\psi(r_{2A})\psi^*(r_{2B})
 = 2E_0 |S|^2 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi^*(r_{1A})\psi(r_{1B})\psi(r_{2A})\psi^*(r_{2B}),

\langle \psi_2 | \hat H | \psi_1 \rangle
 = \Big\langle \psi_2 \Big| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1A}} \Big| \psi_1 \Big\rangle
 + \Big\langle \psi_2 \Big| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2B}} \Big| \psi_1 \Big\rangle
 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi(r_{1A})\psi^*(r_{1B})\psi^*(r_{2A})\psi(r_{2B})
 = 2E_0 |S|^2 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi(r_{1A})\psi^*(r_{1B})\psi^*(r_{2A})\psi(r_{2B}).    (21.50)

Then, renaming the integration variables by exchanging \vec r_1 and \vec r_2, we find that

\langle \psi_1 | \hat H | \psi_1 \rangle = \langle \psi_2 | \hat H | \psi_2 \rangle,
\qquad
\langle \psi_1 | \hat H | \psi_2 \rangle = \langle \psi_2 | \hat H | \psi_1 \rangle.    (21.51)

Finally, then, the energies of the symmetric (spin singlet) and antisymmetric (spin triplet) H_2
states are

E_\pm(R) = \Big\langle \frac{\psi_1 \pm \psi_2}{\sqrt{2(1 \pm |S|^2)}} \Big| \hat H \Big| \frac{\psi_1 \pm \psi_2}{\sqrt{2(1 \pm |S|^2)}} \Big\rangle
 = \frac{1}{2(1 \pm |S|^2)} \Big[ \langle \psi_1 | \hat H | \psi_1 \rangle + \langle \psi_2 | \hat H | \psi_2 \rangle
 \pm \langle \psi_1 | \hat H | \psi_2 \rangle \pm \langle \psi_2 | \hat H | \psi_1 \rangle \Big]
 = \frac{1}{1 \pm |S|^2} \Big[ \langle \psi_1 | \hat H | \psi_1 \rangle \pm \langle \psi_2 | \hat H | \psi_1 \rangle \Big]
 = 2E_0 + \frac{C \pm A}{1 \pm |S|^2},    (21.52)
where we have defined, as in the helium case, the Coulomb and exchange integrals,

C = \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(r_{1A})|^2 |\psi(r_{2B})|^2,
A = \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi^*(r_{1A})\psi(r_{1B})\psi(r_{2A})\psi^*(r_{2B}).    (21.53)

We can calculate numerically and plot the two energy functions E_-(R) and E_+(R); one finds
that E_-(R) - 2E_0 is positive and decreases monotonically to zero at infinity, while E_+(R) - 2E_0 decreases to a
negative minimum, reached at R = R_0, stays negative, and tends to zero from below at infinity.
We also note that the van der Waals attraction between two hydrogen atoms at large distance,
falling off as 1/R^6, is obtained when calculating E_+(R) at large R, but only in second-order perturbation theory
(which will be introduced later on in the book).
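The structure of (21.52), a bonding/antibonding splitting controlled by the off-diagonal matrix element and the overlap, can be illustrated with a much simpler one-electron 1D toy model. All choices below (Gaussian orbitals, Gaussian wells, ℏ = m = 1) are illustrative assumptions, not the actual H_2 integrals:

```python
import math

R = 2.0          # "internuclear" separation

def phi(x, c):   # normalized unit-width Gaussian orbital centered at c
    return math.pi ** -0.25 * math.exp(-((x - c) ** 2) / 2)

def d2phi(x, c):  # exact second derivative of the Gaussian orbital
    return ((x - c) ** 2 - 1.0) * phi(x, c)

def V(x):        # two attractive Gaussian "wells"
    return -2.0 * (math.exp(-((x - R / 2) ** 2)) + math.exp(-((x + R / 2) ** 2)))

xs = [-10.0 + 0.01 * i for i in range(2001)]
dx = 0.01
S = H_AA = H_AB = 0.0
for x in xs:
    a, b = phi(x, R / 2), phi(x, -R / 2)
    Ha = -0.5 * d2phi(x, R / 2) + V(x) * a      # (H phi_A)(x)
    Hb = -0.5 * d2phi(x, -R / 2) + V(x) * b     # (H phi_B)(x)
    S += a * b * dx
    H_AA += a * Ha * dx
    H_AB += a * Hb * dx

E_plus = (H_AA + H_AB) / (1 + S)    # "bonding" (symmetric) level
E_minus = (H_AA - H_AB) / (1 - S)   # "antibonding" (antisymmetric) level
print(E_plus, E_minus)              # the symmetric combination is lower
```

The symmetric combination comes out below both the antisymmetric one and the single-well level, the one-electron analogue of the bound H_2 ground state.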

Important Concepts to Remember

• For helium (two electrons and a nucleus), we can have an ortho- state, for triplet spin and
antisymmetric spatial wave function, and a para- state, for singlet spin and symmetric spatial wave
function.
• As an approximation, we calculate the states and wave functions of the electrons as if there were
no interaction between the electrons, and then evaluate the interaction Hamiltonian for these states.
• We find that the extra energy depends on the probabilities of finding the electrons in one or the
other single-particle state, so it has a direct part and an exchange part.
• The direct (Coulomb) energy is

C = \int d^3r_1 \int d^3r_2\, \frac{\rho_{q_1}(\vec r_1)\rho_{q_2}(\vec r_2)}{4\pi\epsilon_0\, r_{12}},

and the exchange part of the energy is

A = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, f(\vec r_1)\, f^*(\vec r_2),

where f(\vec r) = \psi_{q_1}(\vec r)\psi_{q_2}^*(\vec r).
• Using the above approximation, the ground state energy of helium receives the correction \Delta E = (5/8) Z e_0^2/a_0, which
gives a good approximation.
• A better approximation is found using a variational method, which in this (simplified) case means
writing a modified wave function, in which we make the change Z \to Z_{\rm eff} while keeping Z in
the Hamiltonian, and then minimizing over Z_{\rm eff} at fixed Z. We find Z_{\rm eff} = Z - 5/16 and E_0 =
-Z_{\rm eff}^2 e_0^2/a_0.
• For the H_2 molecule (two protons and two electrons), if R \gg a_0, where R is the distance between
the protons, we can prove that in the minimum energy configuration each electron is around a
proton.
• One finds

E_\pm(R) = 2E_0 + \frac{C \pm A}{1 \pm |S|^2},

where C is the direct contribution, A is the exchange contribution, and
S = \int d^3r\, \psi^*(r_A)\psi(r_B) is the overlap integral.

Further Reading
See [1] for more details.

Exercises

(1) Consider a helium atom with one electron in the n = 1, l = 0 state, and another in the n = 2, l = 0
state. Calculate the energy of the ortho-helium state (the lowest energy state for these quantum
numbers), using the two approximations in the text (Approximation 1 and Approximation 2).
(2) Consider a helium atom with two electrons in the n = 2, l = 0 state. Calculate their ground state,
using the two approximations in the text.
(3) Apply the variational method (Approximation 3) to the case in exercise 1.
(4) Apply the variational method (Approximation 3) to the case in exercise 2.
(5) Argue that, for the H2 molecule, E+ (R) − 2E0 tends to infinity from below, and E− (R) tends to
infinity from above.
(6) Consider a potential H3 molecule, in an equilateral triangle of side R. Give an argument why
this molecule would be unstable.
(7) If we have four H atoms, write down the Hamiltonian, and describe two potentially stable
configurations.
22 Quantum Mechanics Interacting with Classical Electromagnetism

In this chapter we will learn how to deal with quantum mechanics in the presence of a classical elec-
tromagnetic field. We started this analysis in Chapter 17, but here we treat everything methodically.

22.1 Classical Electromagnetism plus Particle

To start with, we consider classical mechanics for a particle interacting with a classical electro-
magnetic field. We already mentioned in Chapter 17 that in quantum mechanics the electromagnetic
interaction is obtained by a "minimal coupling" that amounts to a replacement of derivatives with
derivatives plus fields,

\partial_i \to \partial_i - i\frac{q}{\hbar} A_i,
\qquad
\partial_0 \to \partial_0 - i\frac{q}{\hbar} A_0.    (22.1)

In the Lagrangian of classical mechanics, correspondingly, the replacements are

\vec p = m\vec v \to m\vec v + q\vec A,
\qquad
p_0 = mc^2 \to mc^2 - qA_0,    (22.2)

to be substituted into

L = -mc^2 + \frac{m\vec v^2}{2},    (22.3)

thus leading to (the electric potential is -A_0)

L = -mc^2 + qA_0 + \frac{(m\vec v + q\vec A)^2}{2m}.    (22.4)
Then the Lagrange equations,

\frac{d}{dt} \frac{\partial L}{\partial v_r} = \frac{\partial L}{\partial x_r},    (22.5)

become (neglecting the term quadratic in the fields, q^2\vec A^2/2m)

\frac{d}{dt} (m v_r + q A_r) = q\, \partial_r A_0 + q\, v_i\, \partial_r A_i.    (22.6)

Since we have (using the definitions of \vec E and \vec B in terms of the vector potential \vec A and the scalar
potential A_0)

\frac{d}{dt} A_r = \frac{\partial A_r}{\partial t} + v_s \partial_s A_r,
\qquad
\partial_r A_s - \partial_s A_r = \epsilon_{rsk} B_k,
\qquad
\partial_r A_0 - \partial_0 A_r = E_r,    (22.7)

we obtain

m\dot v_r = q\big[ \partial_r A_0 - \partial_0 A_r + (\partial_r A_s - \partial_s A_r) v_s \big]
 = q( E_r + \epsilon_{rsk} v_s B_k )
 \;\Rightarrow\; m\dot{\vec v} = q( \vec E + \vec v \times \vec B ).    (22.8)

This is the Lorentz force law, so the Lagrangian we wrote down was indeed correct.
From this Lagrangian, we can find the canonically conjugate momentum

p_r = \frac{\partial L}{\partial v_r} = m v_r + q A_r.    (22.9)

Then the Hamiltonian is (again neglecting the term q^2\vec A^2/2m)

H = \sum_r p_r v_r - L = \frac{m\vec v^2}{2} + mc^2 - qA_0
 = \frac{(\vec p - q\vec A)^2}{2m} - qA_0 + mc^2.    (22.10)
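The Lorentz force law (22.8) can be sanity-checked numerically (a sketch with arbitrary unit values q = m = B = 1): a particle in a constant magnetic field should move on a circle with period T = 2πm/(qB) and constant speed.

```python
import math

q = m = B = 1.0

def acc(v):
    vx, vy, vz = v
    # q (v x B)/m with B along z: (vy*B, -vx*B, 0)
    return (q * vy * B / m, -q * vx * B / m, 0.0)

def rk4_step(x, v, dt):
    def add(u, w, s): return tuple(ui + s * wi for ui, wi in zip(u, w))
    k1v = acc(v);                 k1x = v
    k2v = acc(add(v, k1v, dt/2)); k2x = add(v, k1v, dt/2)
    k3v = acc(add(v, k2v, dt/2)); k3x = add(v, k2v, dt/2)
    k4v = acc(add(v, k3v, dt));   k4x = add(v, k3v, dt)
    v2 = tuple(vi + dt/6*(a + 2*b + 2*c + d) for vi, a, b, c, d in zip(v, k1v, k2v, k3v, k4v))
    x2 = tuple(xi + dt/6*(a + 2*b + 2*c + d) for xi, a, b, c, d in zip(x, k1x, k2x, k3x, k4x))
    return x2, v2

x, v = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
T = 2 * math.pi * m / (q * B)     # cyclotron period
dt = T / 10000
for _ in range(10000):
    x, v = rk4_step(x, v, dt)

speed = math.sqrt(sum(c * c for c in v))
print(x, speed)  # back near the starting point, speed still 1
```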

22.2 Quantum Particle plus Classical Electromagnetism

Next, we consider how to use quantum mechanics for the particle, while leaving the electromag-
netism classical. This means that we replace \vec p with the quantum operator \hat{\vec p}, while the vector and
scalar potentials remain ordinary functions times the identity operator, \vec A\,\mathbb{1} and A_0\,\mathbb{1}. The Hamiltonian
operator is then

\hat H = \frac{(\hat{\vec p} - q\vec A\,\mathbb{1})^2}{2m} - qA_0\,\mathbb{1}.    (22.11)

Further, it is still the canonically conjugate momentum \vec p that is replaced by \frac{\hbar}{i}\vec\nabla, since we still
have [x_i, p_j] = i\hbar\,\delta_{ij}, where the p_j are the canonically conjugate momenta. That means that we have
the replacement

\hat{\vec p} = \frac{\hbar}{i}\vec\nabla \;\to\; \hat{\vec p} - q\vec A = \frac{\hbar}{i}\vec\nabla - q\vec A,    (22.12)

which implements the "minimal coupling" expressed in (22.1).
Then the Schrödinger equation i\hbar\,\partial_t|\psi\rangle = \hat H|\psi\rangle becomes, acting on the wave function,

i\hbar\,\partial_t \psi(t, \vec r) = \left[ \frac{(\hat{\vec p} - q\vec A\,\mathbb{1})^2}{2m} - qA_0\,\mathbb{1} \right] \psi(t, \vec r).    (22.13)

This Hamiltonian contains an interaction part, for the interaction between the particle (via its
operator \hat{\vec p}) and the classical electromagnetic field,

\hat H_{\rm int} = -\frac{q}{m}\, \hat{\vec p} \cdot \vec A.    (22.14)

As we saw in Chapter 17, in the gauge \vec\nabla \cdot \vec A = 0, and for a constant magnetic field \vec B, given that
\vec B = \vec\nabla \times \vec A, we can choose (considering also \partial_t \vec A = 0, a stationary field)

\vec A = \frac{1}{2}\, \vec B \times \vec r,    (22.15)

leading to an interaction Hamiltonian

\hat H_{\rm int} = -\frac{q}{2m}\, \vec B \cdot \hat{\vec L} = -\vec B \cdot \hat{\vec\mu},    (22.16)

where the magnetic moment can be written as an integral involving the current density \vec j, or the
current differential d\vec I = \vec j\, d^3r, as

\vec\mu = \frac{1}{2} \int d^3r\, \vec r \times \vec j = \int \frac{\vec r \times d\vec I}{2}.    (22.17)
Note that in this relation, both \vec\mu and \vec j are understood as expectation values, quantities that can
be measured in experiments, and not as quantum operators. Indeed, \vec j is constructed out of a wave
function, and so is \vec\mu.
We next show that a change of the wave function by a phase amounts to a change of gauge for the
gauge field (A_0, A_i). The redefinition of the wave function by a phase,

\psi = \psi'\, e^{-iq\Lambda/\hbar},    (22.18)

implying

\partial_t \psi = \left[ \partial_t \psi' - \frac{i}{\hbar}\, q (\partial_t \Lambda)\, \psi' \right] e^{-iq\Lambda/\hbar},
\qquad
\hat p_i \psi = \frac{\hbar}{i} \partial_i \psi = \left[ \frac{\hbar}{i} \partial_i \psi' - q (\partial_i \Lambda)\, \psi' \right] e^{-iq\Lambda/\hbar},    (22.19)
means that the Schrödinger equation for \psi, written as

\left\{ \frac{\hbar}{i} \partial_t + \frac{1}{2m} \left( \frac{\hbar}{i} \partial_i - q A_i \right)^2 - q A_0 \right\} \psi = 0,    (22.20)

can be rewritten in terms of \psi' as

\left\{ \frac{\hbar}{i} \partial_t + \frac{1}{2m} \left( \frac{\hbar}{i} \partial_i - q (A_i + \partial_i \Lambda) \right)^2 - q (A_0 + \partial_t \Lambda) \right\} \psi' = 0.    (22.21)

This is the Schrödinger equation for the gauge transformed fields

A_i' = A_i + \partial_i \Lambda, \qquad A_0' = A_0 + \partial_t \Lambda,    (22.22)

proving the statement above (22.18).
However, given the fact that

[p_r, A_s] = \frac{\hbar}{i} [\partial_r, A_s] \neq 0,    (22.23)

we have to be careful about the order of operators. Expanding the Schrödinger equation, we obtain

\frac{\hbar}{i} \partial_t \psi - \frac{\hbar^2}{2m} \Delta\psi - \frac{q\hbar}{mi}\, \vec A \cdot \vec\nabla\psi - \frac{q\hbar}{2mi}\, (\vec\nabla \cdot \vec A)\psi + \frac{q^2}{2m}\, \vec A^2 \psi - qA_0 \psi = 0,    (22.24)

with complex conjugate

-\frac{\hbar}{i} \partial_t \psi^* - \frac{\hbar^2}{2m} \Delta\psi^* + \frac{q\hbar}{mi}\, \vec A \cdot \vec\nabla\psi^* + \frac{q\hbar}{2mi}\, (\vec\nabla \cdot \vec A)\psi^* + \frac{q^2}{2m}\, \vec A^2 \psi^* - qA_0 \psi^* = 0.    (22.25)

Multiplying the Schrödinger equation (22.24) by \psi^* and its complex conjugate by \psi, and subtracting
the two, we obtain

\frac{\hbar}{i} \partial_t (|\psi|^2) - \frac{\hbar^2}{2m} (\psi^* \Delta\psi - \psi \Delta\psi^*) - \frac{q\hbar}{mi}\, \vec A \cdot (\psi^* \vec\nabla\psi + \psi \vec\nabla\psi^*) - \frac{q\hbar}{mi}\, (\vec\nabla \cdot \vec A) |\psi|^2 = 0,    (22.26)

which can be rewritten as

\partial_t (|\psi|^2) + \vec\nabla \cdot \left[ \frac{\hbar}{2mi} (\psi^* \vec\nabla\psi - \psi \vec\nabla\psi^*) - \frac{q}{m}\, \vec A |\psi|^2 \right] = 0.    (22.27)
This has the form of a continuity equation for the probability density, adding the interaction with
the electromagnetic field to the equation previously derived in Chapter 7. Indeed, defining the
probability density

\rho(\vec r, t) = |\psi(\vec r, t)|^2,    (22.28)

we see that we can define the probability current density in the presence of the vector potential \vec A as

\vec j = \frac{\hbar}{2mi} (\psi^* \vec\nabla\psi - \psi \vec\nabla\psi^*) - \frac{q}{m}\, \vec A |\psi|^2 \equiv \vec j_0 + \vec j_1,    (22.29)

so that we have the standard continuity equation

\partial_t \rho + \vec\nabla \cdot \vec j = 0.    (22.30)

We also define the current \vec j_0 as the current in the absence of \vec A,

\vec j_0 = \frac{\hbar}{2mi} (\psi^* \vec\nabla\psi - \psi \vec\nabla\psi^*),    (22.31)

as well as the new contribution \vec j_1, depending on \vec A,

\vec j_1 = -\frac{q}{m}\, \vec A |\psi|^2 = -\frac{q\rho}{m}\, \vec A.    (22.32)

Note that the probability density \rho and probability current density \vec j are related to the charge
density and current density by multiplying with the individual charge of the particle,

\rho_{\rm charge} = q\rho, \qquad \vec j_{\rm charge} = q\vec j.    (22.33)
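The interplay of (22.18), (22.22), and (22.29) can be made concrete in a 1D sketch (ℏ = m = q = 1, with arbitrary test profiles): the full current \vec j is gauge invariant, while the piece \vec j_0 alone is not.

```python
import cmath

# In 1D, j = Im(psi* dpsi/dx) - A |psi|^2. Under psi' = psi * exp(i*Lambda),
# A' = A + dLambda/dx, j is unchanged while j0 = Im(psi* dpsi/dx) shifts.

def psi(x):
    return cmath.exp(-x * x / 2 + 1j * 0.7 * x)

def Lam(x):
    return 0.2 * x * x          # gauge function, dLambda/dx = 0.4 x

A = 0.3                         # constant vector potential before the transformation

def current(f, a, x, h=1e-5):
    df = (f(x + h) - f(x - h)) / (2 * h)          # central difference
    return (f(x).conjugate() * df).imag - a * abs(f(x)) ** 2

x = 0.9
psi2 = lambda y: psi(y) * cmath.exp(1j * Lam(y))  # gauge-transformed psi
A2 = A + 0.4 * x                                  # A + dLambda/dx at this x
j_before = current(psi, A, x)
j_after = current(psi2, A2, x)
j0_before = current(psi, 0.0, x)
j0_after = current(psi2, 0.0, x)
print(j_before, j_after)    # equal: the full current is gauge invariant
print(j0_before, j0_after)  # different: j0 alone is not
```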
We can then split the (expectation value of the) magnetic moment into the contributions of each
current density individually, first the current density in the absence of \vec A, denoted \vec\mu_0,

\vec\mu_0 = \frac{1}{2} \int d^3r\, \vec r \times \vec j_{0,\rm charge}
 = \frac{q}{4m} \int d^3r \left[ \psi^* \left( \vec r \times \frac{\hbar\vec\nabla}{i} \right) \psi
 + \left( \vec r \times \frac{\hbar\vec\nabla}{i}\, \psi \right)^{\!*} \psi \right]
 = \frac{q}{4m} \Big[ \langle\psi| \big(\hat{\vec L}|\psi\rangle\big) + \big(\langle\psi|\hat{\vec L}\big) |\psi\rangle \Big]
 = \frac{q}{2m}\, \langle\psi| \hat{\vec L} |\psi\rangle,    (22.34)

where in the second equality we have used the fact that \hat{\vec L} = \hat{\vec r} \times \hat{\vec p} = \hat{\vec r} \times \frac{\hbar}{i}\vec\nabla, and in the last equality
we have used the fact that \hat{\vec L} is a self-adjoint operator, like all observables. Finally, then, since we
have a relation valid for any state |\psi\rangle, we obtain the same relation between an abstract operator (an
operator that is not in a particular representation) describing the magnetic moment \hat{\vec\mu} and the angular
momentum operator \hat{\vec L} as at the classical level,

\hat{\vec\mu} = \frac{q}{2m}\, \hat{\vec L}.    (22.35)
However, the same is not true for the new term in \vec\mu, called \vec\mu_1, that comes from the new term in
the current density, \vec j_1. For it, we obtain

\vec\mu_1 = \frac{1}{2} \int d^3r\, \vec r \times \vec j_{1,\rm charge}
 = -\frac{q^2}{2m} \int d^3r\, \psi^* (\vec r \times \vec A) \psi
 = -\frac{q^2}{2m}\, \langle\psi| (\hat{\vec r} \times \vec A) |\psi\rangle.    (22.36)

Again, the relation is valid for any state |\psi\rangle, so we can write a relation valid for abstract
operators,

\hat{\vec\mu}_1 = -\frac{q^2}{2m}\, (\hat{\vec r} \times \vec A).    (22.37)

22.3 Application to Superconductors

As an application of the previous formalism, we will consider the quantum theory of supercon-
ductors. In a superconductor there are superconducting electrons (meaning electrons that don't
experience the collisions with the nuclei that would reduce their speed), described by a wave function \psi(\vec r)
and with a local density equal to the probability density, n_s = |\psi(\vec r)|^2.
The only difference with respect to the previous analysis is that instead of the mass m of the
electrons, we have an "effective mass" m^*, an effective description of the mutual interaction of the
electrons. Indeed, in a "Fermi liquid", describing electrons in most of these materials, the effect of
the interactions between the electrons is just to replace the mass m of the free electrons (described
as a "Fermi gas") by the renormalized mass m^*, which can be 10%–50% larger. In
heavy-fermion materials it can even be the case that m^* \gg m. The electric current density of the
superconducting electrons, \vec j_s = q\vec j, is then

\vec j_s(\vec r) = -i\hbar\, \frac{q}{2m^*} (\psi^* \vec\nabla\psi - \psi \vec\nabla\psi^*) - \frac{q^2}{m^*}\, \psi^*\psi\, \vec A.    (22.38)
Deep inside the superconductor, the wave function of the superconducting electrons approaches a
constant absolute value (so that the probability density is constant) that minimizes the thermodynamic
free energy. Calling it \psi_0 (\in \mathbb{R}), we have

\psi \simeq \psi_0\, e^{i\alpha}.    (22.39)

Then the electric current density of the superconducting electrons is

\vec j_s \simeq -\frac{q^2}{m^*}\, \psi_0^2\, \vec A.    (22.40)

Further, the superconductor is described in the London–London model (named after H. London and
F. London) as a sum of two components: a normal electron component \vec j_n, which experiences collisions
with nuclei, so has a finite conductivity \sigma and current density \vec j_n = \sigma\vec E, and a superconducting
component \vec j_s as above. The normal electrons have density n_n and velocity \vec v_n, and the superconducting
ones have density n_s = |\psi|^2 \simeq \psi_0^2 and velocity \vec v_s, so we have a total density n = n_n + n_s, and

\vec j_n = -e n_n \vec v_n, \qquad \vec j_s = -e n_s \vec v_s, \qquad \vec j = \vec j_n + \vec j_s.    (22.41)

The superconducting electrons have mass m^*, and on them only the electric force -e\vec E acts, without
any collisions to slow them down, so it increases their velocity according to

-e\vec E = m^* \frac{d\vec v_s}{dt}.    (22.42)

But that in turn means that the superconducting electric current density changes as

\frac{d\vec j_s}{dt} = \frac{n_s e^2}{m^*}\, \vec E = -\frac{n_s e^2}{m^*}\, \frac{d\vec A}{dt},    (22.43)

which is the first London equation. It really should be postulated rather than derived, like the usual
Maxwell's equations, since it is based on observations and doesn't quite follow from what we already
know about (normal) electrons.
The second London equation is also postulated, though it is based on the observation we made that,
in the superconductor, \vec j_s \simeq -(q^2/m^*)\psi_0^2 \vec A, where n_s = |\psi|^2 \simeq \psi_0^2; taking the curl, we get

\vec\nabla \times \vec j_s = -\frac{n_s e^2}{m^*}\, \vec B = -\frac{n_s e^2}{m^*}\, \vec\nabla \times \vec A.    (22.44)

However, once it is postulated, we have to further postulate that \vec j_s \propto \vec A in a simply connected
superconductor (we will see in the next chapter why this is needed), in order to obtain

\vec j_s = -\frac{n_s e^2}{m^*}\, \vec A \equiv -\frac{1}{\mu_0 \lambda_L^2}\, \vec A,    (22.45)

where the last equality defines \lambda_L. On the other hand, from the Maxwell equation \vec\nabla \times \vec B = \mu_0 \vec j,
using \vec\nabla^2 \vec C = \vec\nabla(\vec\nabla \cdot \vec C) - \vec\nabla \times (\vec\nabla \times \vec C) (true for any vector \vec C) together with \vec\nabla \cdot \vec B = 0, we obtain

\vec\nabla^2 \vec B = -\vec\nabla \times (\vec\nabla \times \vec B) = -\mu_0\, \vec\nabla \times \vec j_s
 = \mu_0 \frac{n_s e^2}{m^*}\, \vec B = \frac{1}{\lambda_L^2}\, \vec B.    (22.46)

The solution of this equation is an exponentially decaying magnetic field inside the superconduc-
tor, when going in from the boundary,

\vec B(r) = \vec B(0)\, e^{-r/\lambda_L}.    (22.47)

This is the mathematical description of the observed phenomenon that superconductors expel
magnetic fields, known as the Meissner effect: we see that the magnetic field only penetrates
a distance of order \lambda_L inside the material.
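As an order-of-magnitude sketch (the carrier density n_s = 10^{28} m^{-3} and the choice m^* = m_e below are assumed typical-scale values, not from the text):

```python
import math

# London penetration depth lambda_L = sqrt(m*/(mu0 n_s e^2)).
MU0 = 4e-7 * math.pi        # T m / A
M_E = 9.109e-31             # kg  (using m* = m_e as an assumption)
E_CH = 1.602e-19            # C
n_s = 1e28                  # m^-3 (assumed carrier density)

lam = math.sqrt(M_E / (MU0 * n_s * E_CH ** 2))
print(f"lambda_L ~ {lam * 1e9:.0f} nm")   # tens of nanometers

# The field a few lambda_L inside the sample is already tiny:
print(math.exp(-5.0))  # B(5 lambda_L)/B(0), below 1%
```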

22.4 Interaction with a Plane Wave

As a first step towards quantizing the electromagnetic field, we consider a simple space dependence
for \vec A, and promote \vec x to be a quantum operator. This is a simple generalization of the previously
considered case of constant \vec B, with \vec A = (\vec B \times \vec r)/2, where we also promoted \vec r to be an operator. It is
also an obvious thing to do since, in the coordinate representation for wave functions, \vec x acts trivially
(it is replaced by its eigenvalue).
Consider a vector potential that is a plane wave, with sinusoidal form,

\vec A = 2 A^{(0)} \vec\epsilon \cos\left( \frac{\omega}{c}\, \vec n \cdot \vec x - \omega t \right),    (22.48)

where \vec\epsilon is a polarization vector, transverse to the propagation direction \vec n, so that \vec\epsilon \cdot \vec n = 0.
We can rewrite (22.48) as the sum of two exponentials,

\vec A = A^{(0)} \vec\epsilon \left[ \exp\left( i\frac{\omega}{c}\, \vec n \cdot \vec x - i\omega t \right)
 + \exp\left( -i\frac{\omega}{c}\, \vec n \cdot \vec x + i\omega t \right) \right],    (22.49)
where, as we will see in the second part of the book, the first term is associated with absorption of
light and the second with stimulated emission. This is replaced inside the interaction Hamiltonian,

\hat H_{\rm int} = -\frac{q}{m}\, \hat{\vec P} \cdot \vec A,    (22.50)

in order to calculate its matrix elements \langle n|\hat H_{\rm int}|m\rangle between eigenstates |n\rangle of the free Hamiltonian,
for which \hat H_0 |n\rangle = E_n |n\rangle. Defining the frequency of transition between two states,

\omega_{nm} = \frac{E_n - E_m}{\hbar},    (22.51)

and using the fact that for the x component of the kinetic momentum we have

\hat P_x = \frac{m}{i\hbar}\, [\hat X, \hat H_0],    (22.52)

where \hat X is the kinetic position operator, we find that

\langle n|\hat P_x|m\rangle = \frac{m}{i\hbar}\, \langle n|(\hat X \hat H_0 - \hat H_0 \hat X)|m\rangle = i m \omega_{nm}\, \langle n|\hat X|m\rangle.    (22.53)

Moreover, as a first approximation, we can replace the exponentials in the expansion of \vec A (the plane
wave) with their leading term, 1. That amounts, of course, to having just a constant \vec A; however it
must be thought of as the leading approximation to a plane wave, and gives

\langle n|\hat H_{\rm int}|m\rangle = -q A^{(0)}\, i\omega_{nm}\, \langle n|\hat X|m\rangle.    (22.54)
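Relation (22.53) can be verified explicitly for the harmonic oscillator, whose matrix elements X_{n,n±1} are standard (a sketch with ℏ = m = ω = 1):

```python
import math

# X_{n,n+1} = sqrt((n+1)/2) and P_{n,n+1} = -i sqrt((n+1)/2) (with their
# conjugate transposes) are the standard oscillator matrix elements; we
# check P_{nm} = i m omega_{nm} X_{nm} with omega_{nm} = E_n - E_m.

N = 8
X = [[0j] * N for _ in range(N)]
P = [[0j] * N for _ in range(N)]
for n in range(N - 1):
    c = math.sqrt((n + 1) / 2)
    X[n][n + 1] = X[n + 1][n] = c
    P[n][n + 1] = -1j * c
    P[n + 1][n] = 1j * c

E = [n + 0.5 for n in range(N)]   # oscillator energies
ok = all(
    abs(P[n][m] - 1j * (E[n] - E[m]) * X[n][m]) < 1e-12
    for n in range(N) for m in range(N)
)
print(ok)  # True
```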

22.5 Spin–Magnetic-Field and Spin–Orbit Interaction

Finally, we consider an interaction term in the Hamiltonian that really comes from the relativistic
theory (quantum field theory) but, like the electron spin itself, has a simple nonrelativistic
description, as an interaction between spin and orbital angular momentum.
First, we consider the interaction of the electron spin with the magnetic field. We remember that,
for an electron, the spin generates a magnetic moment with g = 2. Considering also that q = -e,
we have

\vec\mu_S = -\frac{e}{m_e}\, \vec S.    (22.55)

But the spin (intrinsic angular momentum) is given by \vec S = \hbar\vec\sigma/2, where the \sigma_i are the Pauli matrices,
acting on the two-dimensional spin vector space. As we showed earlier, the Pauli matrices satisfy

\sigma_i \sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k.    (22.56)
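The identity (22.56) is easy to verify by brute force with explicit 2×2 matrices:

```python
# Check sigma_i sigma_j = delta_ij I + i eps_ijk sigma_k element by element.
S = [
    [[0, 1], [1, 0]],            # sigma_1
    [[0, -1j], [1j, 0]],         # sigma_2
    [[1, 0], [0, -1]],           # sigma_3
]
I2 = [[1, 0], [0, 1]]

def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def eps(i, j, k):  # Levi-Civita symbol
    return {(0, 1, 2): 1, (1, 2, 0): 1, (2, 0, 1): 1,
            (2, 1, 0): -1, (0, 2, 1): -1, (1, 0, 2): -1}.get((i, j, k), 0)

ok = True
for i in range(3):
    for j in range(3):
        lhs = mul(S[i], S[j])
        rhs = [[(i == j) * I2[r][c] + 1j * sum(eps(i, j, k) * S[k][r][c] for k in range(3))
                for c in range(2)] for r in range(2)]
        ok &= all(abs(lhs[r][c] - rhs[r][c]) < 1e-12 for r in range(2) for c in range(2))
print(ok)  # True
```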

Multiplying this by \hat P_i \hat P_j, where \hat P_i is the kinetic momentum, we obtain

(\vec\sigma \cdot \hat{\vec P})(\vec\sigma \cdot \hat{\vec P}) = \hat{\vec P} \cdot \hat{\vec P} + i\vec\sigma \cdot (\hat{\vec P} \times \hat{\vec P}).    (22.57)

Now using the fact that \hat P_i is related to the conjugate momentum \hat p_i by

\hat{\vec P} = \hat{\vec p} - q\vec A = \hat{\vec p} + e\vec A,    (22.58)

we obtain (since \hat{\vec p} = \frac{\hbar}{i}\vec\nabla)

(\vec\sigma \cdot \hat{\vec P})^2 = (\hat{\vec p} + e\vec A)^2 + e\hbar\, \vec\sigma \cdot \vec B.    (22.59)

On the other hand, the Hamiltonian in the absence of \vec A can be rewritten, using (22.56) multiplied
by \hat p_i \hat p_j, as

\hat H = \frac{\hat{\vec p} \cdot \hat{\vec p}}{2m} - eA_0
 = \frac{1}{2m} \big[ \hat{\vec p} \cdot \hat{\vec p} + i\vec\sigma \cdot (\hat{\vec p} \times \hat{\vec p}) \big] - eA_0
 = \frac{(\vec\sigma \cdot \hat{\vec p})^2}{2m} - eA_0,    (22.60)

and in the last form we can again replace \hat{\vec p} by \hat{\vec P}, obtaining that the Hamiltonian is that previously
derived for interaction with a classical electromagnetic field, plus a term giving the interaction of the
spin with the magnetic field,

\hat H = \frac{(\vec\sigma \cdot \hat{\vec P})^2}{2m} - eA_0
 = \frac{(\hat{\vec p} + e\vec A)^2}{2m} + \frac{e\hbar}{2m}\, \vec\sigma \cdot \vec B - eA_0.    (22.61)
To obtain the interaction between spin and orbital angular momentum (the spin–orbit interaction),
we have to consider the relativistic Dirac equation and expand it in 1/c^2, obtaining the interaction
term

H_{\rm int, LS} = -\frac{e\hbar}{4m_0^2 c^2}\, \vec\sigma \cdot (\vec\nabla A_0 \times \hat{\vec p}).    (22.62)

We will not derive this here, though we will do so towards the end of the book, when considering
the Dirac equation. For spherically symmetric systems, we find

H_{\rm int, LS} = -\frac{e\hbar}{4m_0^2 c^2}\, \frac{1}{r} \frac{dA_0}{dr}\, \vec\sigma \cdot (\vec r \times \hat{\vec p})
 = -\frac{e}{2m_0^2 c^2}\, \frac{1}{r} \frac{dA_0}{dr}\, \vec S \cdot \vec L.    (22.63)

Important Concepts to Remember

• Minimal coupling of a particle to an electromagnetic field amounts to the replacement of the
momentum \vec p (usually both the kinetic momentum and that conjugate to \vec x) by \vec p - q\vec A \equiv \vec P,
where \vec p is now the canonically conjugate momentum, and \vec P is the kinetic momentum.
• At the quantum level for the particle (while \vec A may be kept classical), we replace the canonically
conjugate momentum \hat{\vec p} with \frac{\hbar}{i}\vec\nabla, so \hat{\vec P} = \hat{\vec p} - q\vec A\,\mathbb{1} = \frac{\hbar}{i}\vec\nabla - q\vec A\,\mathbb{1}.
• The resulting interaction Hamiltonian, in the gauge \vec\nabla \cdot \vec A = 0, for constant \vec B and stationary \vec A, is
-\vec\mu \cdot \vec B, where the magnetic moment is \vec\mu = \int \vec r \times d\vec I/2.
• A gauge transformation of A_\mu = (A_0, A_i) by \Lambda corresponds to multiplying the wave function by
the phase e^{iq\Lambda/\hbar}.
• Coupling to \vec A does not change the probability density \rho, but does change the
probability current \vec j by the addition of \vec j_1 = -\frac{q\rho}{m}\vec A, and the magnetic moment operator by the
addition of \hat{\vec\mu}_1 = -\frac{q^2}{2m}(\hat{\vec r} \times \vec A), while \vec\mu_0 = \frac{q}{2m}\langle\psi|\hat{\vec L}|\psi\rangle.
• In a superconductor, we have electrons of effective mass m^* and of two types (the London–London
model): normal, with density n_n and \vec j_n = -e n_n \vec v_n = \sigma\vec E, and superconducting, with density n_s and
\vec j_s = -e n_s \vec v_s, satisfying

\frac{d\vec j_s}{dt} = -\frac{n_s e^2}{m^*}\, \frac{d\vec A}{dt}
\quad \text{and} \quad
\vec\nabla \times \vec j_s = -\frac{n_s e^2}{m^*}\, \vec\nabla \times \vec A.

• A superconductor expels magnetic fields, which penetrate it only over a distance
\lambda_L = \sqrt{m^*/(\mu_0 n_s e^2)}, as \vec B(x) = \vec B_0\, e^{-x/\lambda_L}. This is the Meissner effect.
• For a sinusoidal electromagnetic wave, with the position variable replaced by the \hat X operator, we
find that the matrix elements of the interaction Hamiltonian are related to the transition frequency
\omega_{mn} = (E_m - E_n)/\hbar and the matrix elements of \hat X by H_{\rm int, mn} = -qA^{(0)}\, i\omega_{mn}\, X_{mn}.
• Minimal coupling in the presence of spin leads to a spin–magnetic-field interaction, and from the
Dirac equation we also find a spin–orbit interaction proportional to \frac{e}{2m_0^2 c^2}\, \frac{1}{r} \frac{dA_0}{dr}\, \vec S \cdot \vec L.

Further Reading
See [2] and [3] for more details.

Exercises


(1) The relativistic Lagrangian for a particle is L = -mc^2\sqrt{1 - \vec v^2/c^2}. Does minimal coupling at the
classical level based on it work? Justify.
(2) Write down formally the quantum-level version of exercise 1, without worrying about what \psi
means now.
(3) Consider a conductor forming a closed loop C in a magnetic field, with nonzero flux \Phi =
\int_S \vec B \cdot d\vec S through it. Show that there is a nonzero probability current around the loop, \oint_C d\vec l \cdot \vec j \neq 0.

(4) In the London–London theory, show that, by writing down the energy of the magnetic field and
the kinetic term for the superconducting electrons, we obtain the free energy

f = \frac{1}{2\mu_0} \int_0^\infty \left( \vec B^2 + \lambda_L^2 \mu_0^2 \vec j^2 \right) 2\pi r\, dr,    (22.64)

and from it we obtain the London equation \lambda_L^2 \vec\nabla^2 \vec B = \vec B.
 2B 
(5) Explain why a small permanent magnet brought down along the (vertical) axis of a supercon-
ducting ring levitates (i.e., it doesn’t fall under gravity).
(6) Consider a plane electromagnetic wave incident in a perpendicular direction onto a planar
material containing electrons bound in it. Does the wave induce (leading-order) transitions in
the electronic states?
(7) Calculate the spin–orbit interaction for a hydrogen atom in the ground state.
23 Aharonov–Bohm Effect and Berry Phase in Quantum Mechanics

In this chapter, we consider relevant geometric phases that appear in the wave function: first, in the
coupling to electromagnetic fields, the Aharonov–Bohm phase, and then, in general, the Berry phase.

23.1 Gauge Transformation in Electromagnetism


In the previous chapter, we saw that the canonically conjugate momentum p̂ is represented by (ℏ/i)∇ as usual, but when the particle couples to electromagnetic fields, the kinetic momentum is represented by
by

p̂_kin = p̂ − qA·1,   (23.1)

and is the operator appearing in the Hamiltonian,

Ĥ = p̂²_kin/(2m) − qA₀·1 + ··· = (p̂ − qA·1)²/(2m) − qA₀·1 + ···   (23.2)
But this means that the canonically conjugate momentum acts on wave functions as (a constant
times) a translation operator,

p̂ ψ = (ℏ/i) ∂ψ/∂x,   (23.3)
which in turn means that in the wave functions there is a phase factor

exp[(i/ℏ) ∫_P qA · dx],   (23.4)

where P is a path in coordinate space. Indeed, in terms of the kinetic momentum, the translational phase (going along P from x to x′, in order to find the wave function at x′ from that defined at x) is

exp[(i/ℏ) ∫_{P:x}^{x′} p · dx] = exp[(i/ℏ) ∫_{P:x}^{x′} (p_kin + qA) · dx] ≡ exp[(i/ℏ) ∫_{P:x}^{x′} p_kin · dx] e^{iδ},   (23.5)

where the kinetic momentum is taken to be gauge invariant (since, classically, it corresponds to mv ),
which means that we have a gauge-dependent phase
δ = (q/ℏ) ∫_{P:x}^{x′} A · dx.

More precisely, we saw in the previous chapter that the gauge transformation acts on the solution
of the Schrödinger equation as follows:

Aᵢ → Aᵢ′ = Aᵢ + ∂ᵢΛ,   A₀ → A₀′ = A₀ + ∂₀Λ   ⇒   ψ = ψ′ e^{−iqΛ/ℏ},   (23.6)

so that

ψ′ = e^{+iqΛ/ℏ} ψ,   (23.7)

which is consistent with the phase factor eiδ in the wave function ψ.
Another way to obtain the same result is to require on physical grounds that, under a gauge transformation on the wave function given by |ψ⟩ → |ψ′⟩ = Û|ψ⟩, we have:

• The norm of the state is invariant,

⟨ψ′|ψ′⟩ = ⟨ψ|ψ⟩ ⇒ Û† = Û⁻¹,   (23.8)

so Û is unitary.
• The expectation value of X̂ is invariant,

⟨ψ′| X̂ |ψ′⟩ = ⟨ψ| X̂ |ψ⟩ ⇒ Û† X̂ Û = X̂.   (23.9)

• The kinetic momentum is gauge invariant,

⟨ψ′| (p̂ − qA′) |ψ′⟩ = ⟨ψ| (p̂ − qA) |ψ⟩ ⇒
Û† (p̂ − q(A + ∇Λ)) Û = p̂ − qA.   (23.10)

We then see that the only solution for the unitary operator Û is

Û = e^{iqΛ/ℏ},   (23.11)

which is the same as that found before.
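The covariance requirement (23.10) can be checked directly: acting with p̂ − qA′ on Ûψ should give Û times (p̂ − qA)ψ. A minimal numerical sketch (Python, one dimension, units with ℏ = 1, with arbitrary illustrative choices for ψ, A, and Λ):

```python
import cmath
import math

hbar, q = 1.0, 0.7                  # units with hbar = 1; q is an arbitrary test charge
h = 1e-6                            # finite-difference step

psi = lambda x: cmath.exp(2.3j * x) * (1 + 0.5 * x**2)   # arbitrary test wave function
A = lambda x: 0.4 * x + 0.1 * x**3                       # arbitrary vector potential (1D)
Lam = lambda x: 0.9 * math.sin(x)                        # gauge parameter Lambda(x)

def d(f, x):                        # two-point numerical derivative
    return (f(x + h) - f(x - h)) / (2 * h)

def p_kin(Af, f, x):                # kinetic momentum (hbar/i) d/dx - q A acting on f
    return (hbar / 1j) * d(f, x) - q * Af(x) * f(x)

U = lambda x: cmath.exp(1j * q * Lam(x) / hbar)   # gauge factor multiplying psi

A_prime = lambda x: A(x) + d(Lam, x)              # A' = A + dLambda/dx
psi_prime = lambda x: U(x) * psi(x)               # psi' = U psi

x0 = 0.8
lhs = p_kin(A_prime, psi_prime, x0)               # (p - q A') psi'
rhs = U(x0) * p_kin(A, psi, x0)                   # U (p - q A) psi
print(abs(lhs - rhs))                             # ~0: the kinetic momentum is covariant
```

The difference vanishes up to finite-difference error, for any choice of the three functions.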


Yet another way to obtain the same result is to consider the path integral. In the propagator, we
have the path integral expression

U(t′, t) = ∫ Dq(t) exp(iS/ℏ) = ∫ Dq(t) exp[(i/ℏ) ∫_t^{t′} L dt].   (23.12)
But in the previous chapter, we saw that the classical Lagrangian, appearing in the path integral
exponent, is
L = (1/2)mv² + qA · v + qA₀ + ···   (23.13)
Thus, there is a phase factor

e^{iδ} = exp[(iq/ℏ) ∫_t^{t′} A · (dr/dt′′) dt′′] = exp[(iq/ℏ) ∫_P A · dr],   (23.14)

the same as before.



23.2 The Aharonov–Bohm Phase δ

Now, in general a phase factor in the state |ψ⟩ or wave function is irrelevant, since we can always redefine it by a phase factor, while keeping the probability, ∥ψ∥² = ⟨ψ|ψ⟩, invariant. However, we cannot redefine the phase difference between the phases of two different particles. This is the same as saying that the phase on a closed path (formed by two different paths between the same initial and final points, as in Fig. 23.1a) cannot be redefined away.
Consider a double slit experiment, but with an electromagnetic field in between the two paths of
the particles, as in Fig. 23.1b. That is, there is a unique source for particles (they have to be charged
particles, not photons, in order to couple with electromagnetism), and two slits in a screen, and
then we measure the interference at a single point P behind the screen. The two generic (quantum)
particle paths, going through the two slits, are called P1 and P2 . We can measure the phase difference
through the interference between the waves traveling along paths P₁ and P₂. The corresponding time-dependent states are

|ψ₁(t)⟩ = exp[(iq/ℏ) ∫_{P₁} A · dr] |ψ₁⟩
|ψ₂(t)⟩ = exp[(iq/ℏ) ∫_{P₂} A · dr] |ψ₂⟩ ≡ e^{iδ} exp[(iq/ℏ) ∫_{P₁} A · dr] |ψ₂⟩,   (23.15)

where |ψ₂⟩ ∼ |ψ₁⟩ times perhaps an extra phase factor (independent of the electromagnetic field).


Figure 23.1 (a) For two different paths in between the same initial and final points, there is a closed path C = P1 − P2 , with a magnetic
field B inside it. (b) The same for a double slit experiment; the region with B (into the page) is in between the slits. (c) The
surface S bounded by the closed path C, with the region with B inside it. There is a nonzero field A around C, even though B is 0.
(d) A physical construction: a conducting cylinder, with a field B only in the void inside the cylinder; the path C is inside the
conductor.

Then we can measure the phase difference

e^{iδ} = exp[(iq/ℏ) ∫_{P₁−P₂} A · dr] = exp[(iq/ℏ) ∮_C A · dr],   (23.16)

where C = P₁ − P₂ is a closed loop. Then, if C = ∂S is the boundary of a surface S, by Stokes' law we find

e^{iδ} = exp[(iq/ℏ) ∫_S B · dS] = exp[(iq/ℏ) Φ_m],   (23.17)

where Φ_m is the magnetic flux going through the surface S.
The implication seems to be that we are observing A, and not B itself. In favor of this interpretation, consider the case where there is no field B on the paths P₁ and P₂, and therefore none on C = P₁ − P₂ (see Fig. 23.1c), yet we observe nontrivial interference between the two paths, due to the phase δ. However, there is certainly a field A on C, so ∮_C A · dr ≠ 0. As a more precise set-up for this case, consider a cylindrical sheet of metal (a larger cylinder with a smaller cylinder in the middle), and electrons moving in the material, around the center. Consider also that there is a magnetic field parallel to the axis of the cylinder, but only in the void in the middle; see Fig. 23.1d. So there is a magnetic flux going through the surface bounded by any closed contour C in the section of the material (around the axis), and then there is also a field A on the path C itself. But there is no B on C itself, so it seems that we can measure A, not B.
However, this doesn’t mean that we are breaking gauge invariance, since, as we saw above, δ
depends only on the gauge invariant B,  through Φm , so it is a gauge invariant phase. This leads to a
sort of philosophical debate: is this nonlocality, whereby δ depends on the full closed contour C, or
the fact that δ depends on the value of A  on C, more important for the interpretation?
In the path integral formalism, it appears that the nonlocality is more important. Indeed, consider
the path integral for the wave function in the double slit experiment (or, equivalently, in the cylinder
set-up, with electron paths going either clockwise or anticlockwise around the circle, between two
points on it), and write it as a sum over paths P1 going through one slit, and paths P2 going through
the other,
ψ(r, t′) = ∫ Dq(t) exp[(i/ℏ) ∫₀^{t′} L dt] ψ(t = 0) = Σ_{paths P} exp[(iq/ℏ) ∫_P A · dr + (i/ℏ) S(A = 0)]
= ∫_{above={P₁}} Dq(t) e^{iS₀(P₁)/ℏ} exp[(iq/ℏ) ∫_{P₁} A · dr] ψ₁(r)
+ ∫_{below={P₂}} Dq(t) e^{iS₀(P₂)/ℏ} exp[(iq/ℏ) ∫_{P₂} A · dr] ψ₂(r)
= exp[(iq/ℏ) ∫_{P₁} A · dr] [ψ₁(r) + e^{iδ} ψ₂(r)].   (23.18)
Again, we obtain interference due to the phase factor eiδ but now we see explicitly that the
nonlocality associated with the contour C is the key issue.
Finally, note that if the phase δ is a multiple of 2π then the phase factor eiδ = 1 is unobservable.
This happens for a magnetic flux through the surface S bounded by C that is quantized,
Φ[C] = nΦ₀,   Φ₀ = 2πℏ/q = h/q,   (23.19)
where Φ0 is called a flux quantum.
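The size of the flux quantum, and of a typical Aharonov–Bohm phase, is easy to put in numbers; a sketch (Python) for an electron (q = e), with the solenoid field and radius chosen purely for illustration:

```python
import math

h = 6.62607015e-34        # Planck constant (J s)
e = 1.602176634e-19       # elementary charge (C)

Phi0 = h / e              # flux quantum for charge q = e
print(f"Phi0 = {Phi0:.4e} Wb")            # ~4.14e-15 Wb

# Aharonov-Bohm phase for an assumed solenoid: B = 0.1 T over a 1 micron radius
B, r = 0.1, 1e-6
Phi = B * math.pi * r**2                  # enclosed flux
delta = 2 * math.pi * Phi / Phi0          # delta = q Phi / hbar = 2 pi Phi / Phi0
print(f"Phi/Phi0 = {Phi / Phi0:.1f} flux quanta, delta = {delta:.0f} rad")
```

Even this microscopic flux is many flux quanta, so the interference fringes shift through many periods as B is varied.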

23.3 Berry Phase

The Aharonov–Bohm phase factor eiδ can be generalized to a much more important case, the
observable geometric phase, valid for any quantum mechanical system undergoing adiabatic
changes. This was derived in a seminal paper by Michael Berry in 1984.
Indeed, suppose that the Hamiltonian depends on a (set of) parameter(s) K that change slowly
in time, K = K (t). However, suppose that the change is slow enough (adiabatic) that the system,
initially in an eigenstate of the initial Hamiltonian, continues to be in an eigenstate |n(K (t)) of the
Hamiltonian for any later time t, Ĥ (t), so that
Ĥ(K(t)) |n(K(t))⟩ = Eₙ(K(t)) |n(K(t))⟩.   (23.20)
But the Schrödinger equation is (considering the state |n(K₀), t₀⟩ at time t₀)

Ĥ(K(t)) |n(K₀), t₀; t⟩ = iℏ (d/dt) |n(K₀), t₀; t⟩,   (23.21)
which means that its time-dependent solution is of the form

|ψ(t)⟩ = exp[−(i/ℏ) ∫₀ᵗ Eₙ(t′) dt′] e^{iγₙ(t)} |n(K(t))⟩.   (23.22)
This is obvious except for the phase factor eiγn (t) , but this phase factor must exist since it
compensates the fact that d/dt can also act on K (t). Indeed, taking this derivative into account in
the Schrödinger equation, we obtain
 
0 = iℏ (d/dt) [e^{iγₙ(t)} |n(K(t))⟩] = e^{iγₙ(t)} [−ℏ (dγₙ/dt) |n(K(t))⟩ + iℏ (d/dt) |n(K(t))⟩],   (23.23)
which implies that (multiplying with ⟨n(K(t))| from the left and using the normalization of the states)

dγₙ(t)/dt = i⟨n(K(t))| ∇_K |n(K(t))⟩ (dK/dt),   (23.24)
which can be integrated to give the phase γn as

γₙ = i ∫ ⟨n(K(t))| ∇_K |n(K(t))⟩ dK.   (23.25)

This phase γn is the Berry phase, or geometric phase. We can express it in a more familiar way by
defining the Berry “connection” (or, generalized gauge field)
Aₙ(K) ≡ i⟨n(K(t))| ∇_K |n(K(t))⟩.   (23.26)

Then, if K is a vector, such as a position R (more on that later), we have, more precisely (considering the factors of ℏ as well, in order to have a more precise analog of a gauge field),

Aₙ(R) ≡ iℏ ⟨n(R(t))| ∇_R |n(R(t))⟩ ⇒
γₙ = (1/ℏ) ∮ Aₙ(R) · dR.   (23.27)

Moreover, in general, as for a gauge field, we have a gauge transformation that leaves the system invariant. Indeed, as in the case of the Aharonov–Bohm phase, we realize that we can change the state |ψ(t)⟩ by an arbitrary phase e^{iχ(K)} without affecting the observables and probabilities, so

|ñ(K(t))⟩ = e^{iχ(K)} |n(K(t))⟩   (23.28)

is as good as |n(K(t))⟩ in describing the system. Therefore

i⟨ñ(K(t))| (d/dt) |ñ(K(t))⟩ = i⟨n(K(t))| (d/dt) |n(K(t))⟩ − dχ(t)/dt = [Aₙ(K) − ∇_K χ] (dK/dt)   (23.29)

also serves as a Berry connection term; thus the Berry connection transforms as

Aₙ(K) → Aₙ(K) − ∇_K χ,   (23.30)

exactly like a gauge field. In the case K = R, we have the standard vector potential transformation law,

Aₙ(R) → Aₙ(R) − ∇_R χ(R(t)).   (23.31)
R

We note that γₙ must be a phase, since it is real. Indeed, differentiating the normalization relation of the time-dependent state |n(K(t))⟩, we have

(d/dt) [⟨n(K(t))|n(K(t))⟩ = 1] ⇒
0 = ⟨(d/dt) n(K(t))|n(K(t))⟩ + ⟨n(K(t))| (d/dt) n(K(t))⟩   (23.32)
= 2 Re ⟨n(K(t))| (d/dt) n(K(t))⟩,

where we have used that

⟨(d/dt) n(K(t))|n(K(t))⟩ = ⟨n(K(t))| (d/dt) n(K(t))⟩∗,

so we obtain that the phase is real (note the extra i in the phase),

dγₙ/dt ≡ i⟨n(K(t))| (d/dt) n(K(t))⟩ ∈ ℝ.   (23.33)
But, if the Hamiltonian is periodic, i.e., it returns to the initial point after a time T,

Ĥ(K(T)) = Ĥ(K(0)),   (23.34)

then the gauge transformation must also be single valued, e^{iχ(T)} = e^{iχ(0)}, in order to have a well-defined physical system, so

χ(K(T)) = χ(K(0)) + 2πm,   (23.35)

where m ∈ ℤ. Therefore the Berry phase on a closed path C is invariant modulo 2πm, and cannot be removed by a gauge transformation:

γₙ(C) → ∮_C Aₙ(K) dK − [χ(K(T)) − χ(K(0))] = γₙ(C) − 2πm ≡ γₙ(C) (mod 2π).   (23.36)

23.4 Example: Atoms, Nuclei plus Electrons

A standard example, and one that was implicit when we said that the parameters K can be a vector R giving the position of some quantity, is the case of several atoms interacting (close by), composed of nuclei and electrons.

nuclei and electrons. The nuclei are slow variables, since the time variations of their positions are
small, whereas the electrons are fast variables since the time variations of their positions are large.
Consider for simplicity a diatomic molecule, with distance R between the nuclei (determining the relative motion of the nuclei). Factoring out the center of mass motion, the total Hamiltonian for the system then splits into a Hamiltonian for the nuclei H_N(R), a Hamiltonian for the electrons H_el(r₁, ..., r_N), and a potential depending on the positions of all of the electrons, V(R, r₁, ..., r_N),

Ĥ_tot = Ĥ_N(R) + Ĥ_el(r₁, ..., r_N) + V(R, r₁, ..., r_N).   (23.37)

This is a slight generalization of the hydrogen molecule H₂, considered in Chapter 21. There we implicitly used the adiabatic approximation R ≈ constant: the nuclei have only a small motion (relative oscillation) with respect to the distance R between them. This is called the Born–Oppenheimer approximation, after the people who considered it first.
We can then assume, adiabatically, that the positions of the nuclei are approximately constant as the distance R changes in time, as far as the electrons are concerned. Then, for the electrons, we can write down the eigenenergy equation with electron eigenstate |n(R)⟩, where R appears as just a parameter,

[H_el(r₁, ..., r_N) + V(R, r₁, ..., r_N)] |n(R)⟩ = Uₙ(R) |n(R)⟩.   (23.38)

We see then that this is of the type considered in the Berry phase above.
For the total system, the state is approximately (assuming the separation of variables) the tensor product of a state for the nuclei and the electron state,

|ψ⟩ = |ψ_N⟩ ⊗ |n(R)⟩.   (23.39)

Neglecting the back reaction of the electrons on the nuclei equation, we obtain the simple separated-variable equation

[H_N(R) + Uₙ(R)] |ψ_N⟩ = Eₙ |ψ_N⟩,   (23.40)

and the corresponding time-dependent Schrödinger equation, whose solution classically corresponds to the motion of the relative position of the nuclei, R = R(t); this could be plugged back into the adiabatic eigenenergy equation for the electrons and a Berry phase obtained.
We can ask: why is it then that Born and Oppenheimer missed this Berry phase in their analysis?
One answer is that the total Hamiltonian in this case is real, Ĥ ∈ R, so by the general theory we can
choose wave functions that are real as well (sines and cosines), so we can neglect phases; this is what
Born and Oppenheimer did and therefore missed the Berry phase.

23.5 Spin–Magnetic Field Interaction, Berry Curvature, and Berry Phase as Geometric Phase

We next aim to understand why the Berry phase is also called a geometric phase. We will
examine the property of this phase of being defined by geometry for another classic case, that
of a spin interacting with a magnetic field that depends slowly on time. Then, we can again

consider an adiabatic approximation, and write the interaction of the spin with B(t) as giving a time-dependent Hamiltonian, in which the magnetic field appears as a time-dependent parameter. The interaction term is

H_int(B) = −gμ_B S · B(t)/ℏ.   (23.41)


Then the Berry connection, defined with respect to the parameters B(t), is
A_B = i⟨n(B(t))| ∇_B |n(B(t))⟩.   (23.42)

However, associated with it, we can define a “Berry curvature”, meaning the field strength of the Berry connection (the gauge field), in the usual way. Writing it for the general case (not necessarily for K = B, but for general parameters), we have

Fₙ(K) = ∇_K × Aₙ(K),   (23.43)

which means that in components we have

Fⁿᵢⱼ = ∂ᵢAⁿⱼ − ∂ⱼAⁿᵢ ≡ (∂/∂Kᵢ)Aⁿⱼ − (∂/∂Kⱼ)Aⁿᵢ,   (23.44)

as for a regular gauge field, and then we can define the “magnetic field” Bⁿ(K) in the usual way (as for electromagnetism), by

Fⁿᵢⱼ = εᵢⱼₖ Bⁿₖ.   (23.45)

Then, moreover, using Stokes' law for this gauge field, we have (as for the Aharonov–Bohm phase)

∮_{C=∂S} Aⁿ(K) · dK = ∫_S Bⁿ(K) · d²K.   (23.46)

The field strength of the gauge field can be rewritten as

Fⁿᵢⱼ = i[∂_{Kᵢ}⟨n(K(t))| ∂_{Kⱼ} |n(K(t))⟩ − ∂_{Kⱼ}⟨n(K(t))| ∂_{Kᵢ} |n(K(t))⟩]
    = i[⟨∂_{Kᵢ} n(K(t))|∂_{Kⱼ} n(K(t))⟩ − ⟨∂_{Kⱼ} n(K(t))|∂_{Kᵢ} n(K(t))⟩].   (23.47)

Now we can insert the identity, rewritten using the completeness relation for the adiabatic states,

1 = Σₘ |m(K(t))⟩⟨m(K(t))|,   (23.48)

inside the scalar products, obtaining



Fⁿᵢⱼ = i Σ_{m≠n} [⟨∂_{Kᵢ} n(K(t))|m(K(t))⟩⟨m(K(t))| ∂_{Kⱼ} |n⟩ − ⟨∂_{Kⱼ} n(K(t))|m(K(t))⟩⟨m(K(t))| ∂_{Kᵢ} |n⟩].   (23.49)

For m ≠ n, due to the fact that Ĥ|m⟩ = Eₘ|m⟩ and ⟨n|m⟩ = 0, which is still valid when K = K(t), we take ∇_K and obtain

∇_K [⟨n(K)| Ĥ |m(K)⟩] = 0.   (23.50)



Moreover, also using the fact that ∇_K [⟨n(K(t))|m(K(t))⟩] = 0 (in the adiabatic case), which implies that

⟨∇_K n(K(t))|m(K(t))⟩ = −⟨n(K(t))|∇_K m(K(t))⟩,   (23.51)

and acting with ∇_K on both the states ⟨n| and |m⟩ and on Ĥ in (23.50), we obtain

0 = ⟨n(K(t))| (∇_K Ĥ) |m(K(t))⟩ − (Eₘ − Eₙ) ⟨n(K(t))|∇_K m(K(t))⟩.   (23.52)

This implies that

⟨n(K(t))| ∇_K |m(K(t))⟩ = ⟨n(K(t))| (∇_K Ĥ) |m(K(t))⟩ / (Eₘ − Eₙ).   (23.53)
Substituting this in (23.49), we obtain that the Berry curvature is expressed as
Fⁿᵢⱼ = i Σ_{m≠n} [ ⟨n(K(t))|(∂ᵢĤ)|m(K(t))⟩⟨m(K(t))|(∂ⱼĤ)|n(K(t))⟩ / (Eₘ − Eₙ)²
          − ⟨n(K(t))|(∂ⱼĤ)|m(K(t))⟩⟨m(K(t))|(∂ᵢĤ)|n(K(t))⟩ / (Eₘ − Eₙ)² ].   (23.54)
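For a two-level system the sum in (23.54) has a single term, so the formula is easy to check numerically. A minimal sketch (Python/NumPy) takes Ĥ = −B · σ in units where μ_B = ℏ = 1, an illustrative normalization:

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
pauli = [sx, sy, sz]

def berry_curvature(Bvec, n):
    """F^n_ij from (23.54) for H = -B.sigma, in units with mu_B = hbar = 1."""
    H = -sum(b * s for b, s in zip(Bvec, pauli))
    E, V = np.linalg.eigh(H)                 # adiabatic eigenstates as columns of V
    dH = [-s for s in pauli]                 # dH/dB_i = -sigma_i
    F = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            X = 0j
            for m in range(2):
                if m == n:
                    continue
                X += (V[:, n].conj() @ dH[i] @ V[:, m]) \
                     * (V[:, m].conj() @ dH[j] @ V[:, n]) / (E[m] - E[n])**2
            F[i, j] = (1j * (X - X.conjugate())).real   # i (X_ij - X_ji), as in (23.54)
    return F

B0 = 2.0
F = berry_curvature([0.0, 0.0, B0], n=0)     # state aligned with B (lowest energy)
print(F[0, 1])                               # F_xy = -1/(2 B^2) = -0.125
```

The curvature has magnitude 1/(2B²) and points along B/|B|, with opposite signs for the two eigenstates, matching the B̂ᵢ/B² structure of the result (23.56) up to the normalization convention chosen for m.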
Finally, we can use this formula in the case of the interaction between a spin and a time-dependent

magnetic field B(t), acting as parameters K (t). Then, we have

∂ᵢĤ = (∂/∂Bᵢ)Ĥ = −gμ_B Ŝᵢ/ℏ (= −μ_B σᵢ),   (23.55)

where in parentheses we have written an expression for the right-hand side in terms of the Pauli matrices acting on the two-dimensional Hilbert space for spin 1/2 with projection mℏ/2 (m = ±1) on the z direction, using Sᵢ = ℏσᵢ/2 and g = 2. Then, after a calculation (left as an exercise), we find

Bᵢⁿ = m B̂ᵢ/B²,   (23.56)

where B̂ᵢ stands for the i component of the unit vector B/|B|. Then the Berry phase is (using the fact that Kᵢ = Bᵢ and Stokes' law)

γₙ = (1/ℏ) ∮_{C=∂S} Aᵢⁿ dBᵢ = (1/ℏ) ∫_{S(C)} Bᵢⁿ d²Bᵢ = −m ∫_{S(C)} (B̂ᵢ/B²) d²Bᵢ = −m ∫ dΩ.   (23.57)

Here in the second equality we have used Stokes' law to convert the integral over the closed path C = ∂S in Bᵢ space into an integral over the surface S(C) bounded by C in Bᵢ space, and in the last equality we have noted that (B̂ᵢ/B²) d²Bᵢ is a solid angle element dΩ on the surface S bounded by the contour C, with respect to the origin O in B space; see Fig. 23.2. Then we see that

γn = −mΔΩ (23.58)

is (−m times) a solid angle in parameter (B) space, thus being entirely geometric in nature, as we set out to show.

Figure 23.2 A surface S in B space, bounded by a contour C and making a solid angle Ω from the point of view of the origin O.
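This solid-angle formula can be verified by explicitly transporting the aligned spin state around a cone in B space and accumulating the discrete, gauge-invariant Berry phase; a sketch (Python/NumPy), benchmarked against the standard spin-1/2 result γ = −Ω/2 for the aligned state:

```python
import numpy as np

def berry_phase_aligned(theta, nphi=4000):
    """Discrete Berry phase for the spin-1/2 state aligned with B as the field
    direction sweeps a cone of polar angle theta at fixed |B|."""
    phi = np.linspace(0.0, 2 * np.pi, nphi, endpoint=False)
    # eigenstate of n.sigma with eigenvalue +1, n = (sin t cos p, sin t sin p, cos t)
    state = np.stack([np.full_like(phi, np.cos(theta / 2)),
                      np.exp(1j * phi) * np.sin(theta / 2)], axis=1)
    # gauge-invariant discretization: gamma = -arg prod_k <n_k|n_{k+1}>
    overlaps = np.sum(state.conj() * np.roll(state, -1, axis=0), axis=1)
    return -np.angle(np.prod(overlaps))

theta = 1.0
gamma = berry_phase_aligned(theta)
Omega = 2 * np.pi * (1 - np.cos(theta))   # solid angle subtended by the cone
print(gamma, -Omega / 2)                  # the two agree (mod 2 pi)
```

The discrete product of overlaps is independent of the phase convention chosen for the states at each point, which is what makes it a convenient numerical definition of the geometric phase.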

23.6 Nonabelian Generalization

We can also define a nonabelian generalization of the Berry connection. Indeed, now suppose that
we have N degenerate eigenstates |n, with n = 1, . . . , N, of the Hamiltonian, which depends on
time-dependent parameters K (t), so we denote the eigenstates by |n, K.
Then we can still define the Abelian Berry connection as the diagonal matrix element

A(K) = i Σₙ ⟨n, K(t)| ∇_K |n, K(t)⟩,   (23.59)

but now we can also define the nonabelian Berry connection as the matrix of elements in the space of states of the same energy,

A^{(n,m)}(K) = i⟨n, K(t)| ∇_K |m, K(t)⟩,   (23.60)

which is a Hermitian N × N matrix, taking values in the Lie algebra of the unitary group U(N), so that A(K) is its trace,

A(K) = Σ_{n=1}^N A^{(n,n)}(K),   (23.61)

which is known to be the Abelian component in the decomposition U(N) ≃ (U(1) × SU(N))/Z_N.

23.7 Aharonov–Bohm Phase in Berry Form

We can rewrite the Aharonov–Bohm phase δ in the Berry phase form as follows. Consider (electron) states at a point R, translated along a curve C (for example, a curve inside a conductor) until the point r. Then the position-space wave function at the point r is

⟨r|n(R)⟩ = exp[(iq/ℏ) ∫_R^r A(r′) · dr′] ψₙ(R),   (23.62)
which leads to

⟨n(R)| ∇_R |n(R)⟩ = −(iq/ℏ) A(R),   (23.63)

meaning we have rewritten the vector potential A(R) as a Berry connection, and thus the Aharonov–Bohm phase as a Berry phase.

Important Concepts to Remember


 

• Since the canonically conjugate momentum p = p_kin + qA generates translations in x, so is equal to (ℏ/i)∂_x, we find that when we go along a path P from x to x′ we acquire a phase e^{iδ}, with δ = (q/ℏ) ∫_{P:x}^{x′} A · dx.
• In turn, this means that on a closed path C we have a gauge-invariant phase proportional to the magnetic flux, since δ = (q/ℏ) ∮_{C=∂S} A · dx = (q/ℏ) ∫_S B · dS = (q/ℏ) Φ_m. This is the Aharonov–Bohm phase.
• The Aharonov–Bohm phase would appear to mean that A is observable (we can have B = 0 on C, but A ≠ 0); the phase factor itself is unobservable (e^{iδ} = 1) for integer flux, Φₙ = nΦ₀, with Φ₀ = h/q.

• If the Hamiltonian depends on a set of parameters that change slowly in time, K = K(t), with adiabatic energy eigenstates |n(K(t))⟩, we can define a Berry connection Aₙ(K) = i⟨n(K(t))| ∇_K |n(K(t))⟩, and the states acquire an additional Berry phase e^{iγₙ}, with γₙ = ∮ Aₙ(K) · dK.
• The Berry phase is real and gauge invariant (under generalized gauge transformations of Aₙ(K)) modulo 2πm, so it is physically observable.
• For atoms (nuclei plus electrons) in the Born–Oppenheimer approximation (nuclei with small
motion), we have a Berry phase in terms of R (the nuclear separation).
• The Berry curvature, the field strength of the Berry connection, can be written in terms of the
matrix elements of the derivatives of the Hamiltonian.

• In the case of an interaction between a spin 1/2 and a time-dependent magnetic field B(t), with projection mℏ/2 of the spin on B, the Berry phase is a geometric phase in parameter (B) space, specifically −m times the solid angle swept by the moving direction B(t)/|B(t)| along a closed path.
• In the case of N degenerate eigenstates |n⟩ of the Hamiltonian, we can define a nonabelian Berry connection, A^{(n,m)}(K) = i⟨n, K(t)| ∇_K |m, K(t)⟩, an N × N matrix, while the Abelian Berry connection is its trace, Σₙ A^{(n,n)}.
• The Aharonov–Bohm phase is a particular example of the Berry phase, since it can be put in the
form of the latter.

Further Reading
See [1] and the original paper of Michael Berry [9] for more details.

Exercises

(1) Consider a figure-eight loop made of conductor, mostly in a single plane (with just enough
nonplanarity for the loop to not cross itself, but to slightly avoid it), and a constant magnetic
field perpendicular to this plane. Do we have an Aharonov–Bohm phase for the electrons in the
conducting loop?
(2) What happens to the wave function of the electrons moving around the loop in exercise 1 (or to
the interference pattern if we shoot electrons at a point on the loop in opposite directions on the
loop and measure their interference when they cross again) as we undo the figure-eight into a
circle in the same plane?
(3) Consider a system depending on N time-dependent vectors R i , i = 1, 2, . . . , N, for instance a
molecule with N nuclei with such positions. How do we construct its generic Berry phase?
(4) Consider an H₂ molecule moving slowly, in the Born–Oppenheimer approximation, in a constant electric field E. What Berry connection does one define, and how can we have a nontrivial Berry phase?
(5) Consider a molecule with fixed dipole d in a time-dependent electric field E(t).  Calculate the
Berry connection and Berry magnetic field in this case. Do we have a geometrical phase in this
case?
(6) Prove that for a spin 1/2 particle in a magnetic field the Berry curvature is given by (23.56).
(7) How would you define the Berry curvature in the nonabelian case, in such a way that it
transforms covariantly under U (N ), i.e., with U acting from the left and U −1 from the right?
24 Motion in a Magnetic Field, Hall Effect and Landau Levels

In this chapter we consider particles with and without spin, and later atoms, in a magnetic field and
calculate the resulting quantum effects. For a particle, we find that there are “Landau levels” for the
states, and for particles in a conductor these lead to a quantum version of the classical Hall effect.
Finally, for the atom, we find a first-order form for the Zeeman effect.

24.1 Spin in a Magnetic Field

As we saw in previous chapters, for the interaction of a spin with a magnetic field, we have the Hamiltonian

Ĥ(B) = Ĥ₀ − gμ_B B · S/ℏ.   (24.1)

In the case of a spin 1/2 particle, we can rewrite this as

Ĥ(B) = Ĥ₀ − μ_B B · σ.   (24.2)
For a magnetic field in the z direction,

B · σ = B_z σ₃ = diag(+B_z, −B_z),   (24.3)
which means that the time-independent Schrödinger equation Ĥ(B)|ψ⟩ = E(B)|ψ⟩ gives

E(B) = E₀ ∓ μ_B B_z,   (24.4)

which is the result of the Stern–Gerlach experiment for S_z = ±ℏ/2 (it is the splitting energy of a spin 1/2 particle as a function of S_z in a transverse magnetic field).
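The size of this splitting is worth quoting in numbers; a quick sketch (Python), with B_z = 1 T chosen for illustration:

```python
mu_B = 9.2740100783e-24    # Bohr magneton (J/T)
eV = 1.602176634e-19       # joules per electronvolt

Bz = 1.0                   # assumed field (T)
# E(B) = E0 -/+ mu_B Bz for S_z = +/- hbar/2, so the two levels are split by
splitting = 2 * mu_B * Bz
print(f"2 mu_B B = {splitting / eV * 1e6:.1f} micro-eV at B = {Bz} T")  # ~116 micro-eV
```

The splitting is tiny compared with atomic energy scales (eV), which is why Zeeman structure appears only as a fine splitting of spectral lines.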

24.2 Particle with Spin 1/2 in a Time-Dependent Magnetic Field

We consider a magnetic field B = B(t), i.e., constant in space, but time dependent. In this case, we can separate variables and write a wave function that is a tensor product of a position state and a spin state (as in Chapters 17 and 20),

|ψ⟩ = ψ(r, t) |s(t)⟩.   (24.5)
The Schrödinger equation splits into independent Schrödinger equations. The solution for |ψ is
unique, though the separation into parts is not. We can add a time dependence with a function F (t),
split between the two equations,

[Ĥ₀ − F(t)] ψ′(r, t) + (ℏ/i) (∂/∂t) ψ′(r, t) = 0
(ℏ/i) (d/dt) |s′⟩ + [−μ_B B(t) · σ + F(t)] |s′⟩ = 0.   (24.6)
The solution of the two equations factorizes the dependence on F(t),

ψ′(r, t) = ψ(r, t) exp[(i/ℏ) ∫₀ᵗ F(t′) dt′]
|s′(t)⟩ = |s(t)⟩ exp[−(i/ℏ) ∫₀ᵗ F(t′) dt′].   (24.7)
But, as we said, the total state is independent of F(t),

|ψ′⟩ = ψ′(r, t)|s′(t)⟩ = ψ(r, t)|s(t)⟩ = |ψ⟩.   (24.8)
So we can use the solution without F (t) (i.e., put F (t) = 0), which is unique.
However, as explained in the previous chapter, in this situation there is also a Berry phase that is
actually geometrical; we will not repeat the argument here.

24.3 Particle with or without Spin in a Magnetic Field: Landau Levels

We saw in the previous chapter that the coupling of a particle to a vector potential describing an electromagnetic field is given by

Ĥ = (1/2m)(p̂ − qA)² − qA₀.   (24.9)
Consider a magnetic field in the z direction, B = B e_z, with no electric field, so we can choose the gauge A₀ = 0. Then we have the vector potential

A = Bx e_y,   (24.10)

which indeed gives F_xy = ε_xyz B_z = B. Then the Hamiltonian becomes
Ĥ = (1/2m)[p̂ₓ² + (p̂_y − qBx)² + p̂_z²].   (24.11)
To solve the time-independent Schrödinger equation, we write a solution with separated variables,
where in the y, z directions we have just the free particle ansatz
ψ(x, y, z) = u_{n,p_y,p_z}(x, y, z) = Xₙ(x) e^{i(y p_y + z p_z)/ℏ}.   (24.12)
The time-independent Schrödinger equation becomes
Ĥψ = Eψ
= −(ℏ²/2m)[∂ₓ² + (∂_y − iqBx/ℏ)² + ∂_z²] ψ
= −(ℏ²/2m)[∂ₓ² + (ip_y/ℏ − iqBx/ℏ)² − p_z²/ℏ²] Xₙ(x) e^{i(y p_y + z p_z)/ℏ}.   (24.13)


If the particle also has a spin, say spin 1/2, then we saw that there is an extra term −μ_B B · σ in Ĥ, corresponding to a term ∓μ_B B in the energy. With this extra term, the equation for a particle with spin 1/2 in a magnetic field is given by

Xₙ″(x) + (2m/ℏ²)[E ± μ_B B − p_z²/(2m) − (q²B²/(2m))(x − p_y/(qB))²] Xₙ(x) = 0.   (24.14)

Then, defining

x₀ = p_y/(qB),   q²B²/m² ≡ ω²(B) ⇒ ω(B) = ω_c = |q|B/m,   (24.15)
where ω_c is called the cyclotron frequency and is the angular frequency for the motion of a particle of mass m in a field B, we find

Xₙ″(x) + (2m/ℏ²)[E ± μ_B B − p_z²/(2m) − (mω_c²/2)(x − x₀)²] Xₙ(x) = 0.   (24.16)
This is the equation for a harmonic oscillator, with coordinate x − x₀ and frequency ω_c, so the x-dependent wave function is

Xₙ(x) = Nₙ e^{−α²(x−x₀)²/2} Hₙ(α(x − x₀)),   (24.17)
and the eigenenergies are

E_{n;|p_z|,m_s} = ∓μ_B B + p_z²/(2m) + ℏω_c (n + 1/2).   (24.18)
We note that

μ_B B = |e|ℏB/(2m) = ℏω_c/2 = |m_s| ℏω_c,   (24.19)
where m_s = ±1/2 gives the values of the spin projection onto the z direction. Thus the eigenenergies can be rewritten as

E_{n;|p_z|,m_s} = p_z²/(2m) + ℏω_c (n + m_s + 1/2).   (24.20)
The parameter α in the harmonic oscillator is rewritten as

α = √(mω_c/ℏ) = √(|q|B/ℏ) ≡ 1/l,   (24.21)

where the distance l becomes l ≈ 250 Å/√(B(tesla)), and the position x₀ is

x₀(p_y) = ± l² p_y/ℏ = ± p_y/(mω_c).   (24.22)
The harmonic oscillator index n (listing the energy levels) is now called a Landau level. We note that the harmonic oscillation:
• is shifted by x₀(p_y) and rescaled by l;
• is a motion in a plane perpendicular to B, in this case the plane defined by (x, y).
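The scales ω_c and l above are easily evaluated; a sketch (Python) for an electron at an assumed field of B = 10 T, which also checks the quoted estimate l ≈ 250 Å/√(B(tesla)):

```python
import math

hbar = 1.054571817e-34    # J s
e = 1.602176634e-19       # C
m_e = 9.1093837015e-31    # electron mass (kg)

B = 10.0                                  # assumed field (T)
omega_c = e * B / m_e                     # cyclotron frequency (rad/s)
l = math.sqrt(hbar / (e * B))             # magnetic length l = 1/alpha
gap = hbar * omega_c                      # Landau level spacing

print(f"omega_c = {omega_c:.3e} rad/s")           # ~1.76e12 rad/s
print(f"l = {l * 1e10:.1f} Angstrom")             # ~81 A, i.e. ~256 A / sqrt(10)
print(f"hbar omega_c = {gap / e * 1e3:.2f} meV")  # ~1.16 meV
```

Even at a strong laboratory field the level spacing is of order 1 meV, so observing well-resolved Landau levels requires temperatures of a few kelvin or below.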

To compare the behavior obtained in quantum mechanics with that in classical mechanics, note that in classical mechanics we have a circular motion around the direction of B. The electron is under the influence of the Lorentz force F = q(E + v × B), with E = 0 and a velocity perpendicular to B. Since the Lorentz force acts as a centripetal force, or balances the centrifugal force, we obtain

F = |q|vB = F_centrifugal = mvω,   (24.23)

leading to a motion with the cyclotron angular frequency

ω = |q|B/m = ω_c.   (24.24)

The particle thus describes circles of radius

r = v/ω_c = p_y/(mω(B)) = |x₀(p_y)|.   (24.25)
Thus in the quantum case we have a shifted oscillator instead of circular motion, though the
parameters are the same as in the classical case.
We also note that the energy is independent of py , so there is a degeneracy of the levels,
corresponding to all possible py . In a physical situation, the area of the system in the (x, y) plane
is finite. Consider a rectangle in this plane, with length L x in x and L y in y, so with area S = L x L y .
Then, the momenta p_y are quantized as p_{y,m}, since we need to have periodicity of the wave function (which should vanish at the boundaries of the sample), exp(i p_{y,m} L_y/ℏ) = e^{2πim} = 1, implying

p_{y,m} = 2πmℏ/L_y.   (24.26)
But, since for each py the level is shifted by x 0 (py ), there is a maximum value for py , py,max , such
that we shift all the way to the end of the system, x 0 (py,max ) = L x ,
p_{y,max}/(qB) = 2πm_max ℏ/(qB L_y) = L_x,   (24.27)

which implies that there is a maximum number of states (degeneracy) in a Landau level,

N_max = m_max = qB L_y L_x/h = L_x L_y/(2πl²).   (24.28)
Since S = L x L y , the magnetic flux is Φ = BL x L y , and thus we have
N_max = Φ/Φ₀,   (24.29)
where, as before, Φ0 = h/e is the flux quantum. Therefore the maximum number of electron states in
a Landau level is independent of the level n. And the maximum number of electron states per unit
area L_x L_y is

n_B = N_max/(L_x L_y) = 1/(2πl²) = eB/h.   (24.30)
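Equations (24.29) and (24.30) can be checked numerically; a sketch (Python), with the field and sample size assumed for illustration:

```python
h = 6.62607015e-34        # Planck constant (J s)
e = 1.602176634e-19       # elementary charge (C)

B = 10.0                  # assumed field (T)
Lx, Ly = 1e-3, 1e-3       # assumed 1 mm x 1 mm sample

n_B = e * B / h           # states per unit area in one Landau level (24.30)
N_max = n_B * Lx * Ly     # degeneracy of each Landau level
Phi = B * Lx * Ly         # flux through the sample

print(f"n_B = {n_B:.3e} m^-2")                                   # ~2.4e15 m^-2
print(f"N_max = {N_max:.3e} = Phi/Phi0 = {Phi / (h / e):.3e}")   # equal, as in (24.29)
```

Each Landau level thus holds one state per flux quantum threading the sample, independently of the level index n.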

24.4 The Integer Quantum Hall Effect (IQHE)

The Landau levels have a very important application to the physics of conducting materials, the Hall
effect.


Figure 24.1 The classical Hall effect: a current in the x direction and magnetic field in the z direction result in a Hall potential in
the y direction.

The classical Hall effect is as follows. Consider a quasi-two-dimensional sample of conducting material of size L_x = L and L_y = W, and a magnetic field B in the z direction. Add a voltage V_L in the x direction, implying an electric field E_x, so that V_L = L E_x; see Fig. 24.1.
Conduction electrons move sideways in the material under the Lorentz force until there is a compensating electric field E_y, generating a "Hall voltage" V_H = L_y E_y. The equilibrium of the Hall voltage generating E_y with the Lorentz force gives, from F = −e(E + v × B) = 0,

E_y = v_x B_z.   (24.31)

This is the classical Hall effect, which we will quantify next.


Conduction electrons move in the material by accelerating between collisions, which stop them
and bring the velocity down to zero. The time between collisions is the “relaxation time” τ. This
means that the average velocity in the conduction direction x is
v_x = aτ = −eE_x τ/m.   (24.32)
Then, considering the current density per unit length j x (note that we have a quasi-two-dimensional
sample, with a transverse area that is quasi-one-dimensional, so the appropriate density is given per
unit length in the y direction instead of per unit area) generated by a number of electrons per unit
area ne , we have

j_x = −n_e e v_x = e²τn_e E_x/m,   (24.33)
which means that the compensating electric field in the y direction is
E_y = v_x B_z = −(eB/m) τ E_x = −ω_c τ E_x.   (24.34)
We define the transverse, or Hall resistance RH as the ratio of the Hall voltage VH (in the y direction)
and the applied current I (in the x direction),

R_H = V_H/I = (L_y E_y)/(L_y j_x) = −mω_c τ/(e²τn_e) = −mω_c/(n_e e²) = −B/(n_e e).   (24.35)

Here the minus sign is conventional, to emphasize that the Hall voltage is a compensating effect that
opposes the applied current.
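The Drude chain of relations (24.32)–(24.35) is easy to check numerically. A minimal sketch follows, in which the carrier density n_e, relaxation time τ, field B, and applied field E_x are illustrative assumed values, not numbers from the text:

```python
# Classical (Drude) Hall response, following Eqs. (24.32)-(24.35).
# e, m are the CODATA electron charge and mass; the arguments passed
# below (Ex, B, tau, n_e) are illustrative values only.
e = 1.602176634e-19    # electron charge [C]
m = 9.1093837015e-31   # electron mass [kg]

def drude_hall(Ex, B, tau, n_e):
    """Return (v_x, E_y, R_H) for a quasi-2D sample in the Drude model."""
    vx = -e * Ex * tau / m          # Eq. (24.32): drift between collisions
    omega_c = e * B / m             # cyclotron frequency
    Ey = -omega_c * tau * Ex        # Eq. (24.34): compensating Hall field
    R_H = -B / (n_e * e)            # Eq. (24.35): independent of tau
    return vx, Ey, R_H

vx, Ey, R_H = drude_hall(Ex=100.0, B=1.0, tau=1e-14, n_e=1e15)
# R_H here is about -6.2 kOhm; note it depends only on B and n_e.
```

The cancellation of τ in R_H is the reason the classical Hall resistance probes the carrier density directly.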

Figure 24.2 R_H, or the Hall resistivity ρ_xy, versus magnetic field B has plateaux, while ρ_xx → 0 on the plateaux.

The Hall conductance is defined as the inverse of the resistance,


σ_H = 1/R_H = −n_e e/B.   (24.36)
However, in 1980, Klaus von Klitzing measured experimentally the Hall effect in a two-
dimensional electron gas confined at the interface between an oxide, SiO2 , and the Si semiconductor
MOSFET (metal-oxide-semiconductor field-effect transistor, a standard electronic device). He found
that in this case, the B dependence of the Hall voltage is not linear, as suggested by the above
relation, but rather has plateaux, whose position is independent of the sample (i.e., is universal), as
in Fig. 24.2. He also found that actually the Hall conductance σ H is quantized as
σ_H = −n e²/h = nσ₀.   (24.37)
h
Moreover, von Klitzing found that at the plateaux the longitudinal (normal) resistance vanishes,
RL = VL /I = 0. This behavior appears in various (2 + 1)-dimensional strongly coupled
systems (systems where the effective interactions between the degrees of freedom are strong).
The above phenomenon is called the integer quantum Hall effect, or IQHE, and von Klitzing got
a Nobel Prize for it in 1985 (which is a record time between research and Nobel Prize). Actually, in
1982 Daniel Tsui and Horst Störmer had found that, at much higher B and much lower temperature T,
for similar systems we have a fractional quantum Hall effect (FQHE), with
σ_H = ν e²/h,   ν = k/(2p + 1).   (24.38)
The FQHE was (partially) explained by Robert Laughlin in terms of a “Laughlin wave function”,
but we will not address it here. For this explanation, he (together with Tsui and Störmer) also got a
Nobel Prize, in 1998.
Coming back to the IQHE, and applying the Landau level theory, if we have a maximum number
n B of states per unit area in a Landau level, we can have n fully occupied levels, so the number of
(conducting) electron states per unit area is given by
ν = n_e/n_B = n ∈ ℕ.   (24.39)
Then the Hall conductance is, as given earlier,
σ_H = −n_e e/B = −n n_B e/B = −n e²/h.   (24.40)

24.5 Alternative Derivation of the IQHE

We will now consider an alternative derivation of the IQHE, which also explains an important
property of the quantum Hall effect, the fact that at the plateaux we have RL = 0 (zero normal,
or longitudinal, resistance).
As we said, at equilibrium the Lorentz force vanishes, v × B = −E, so

v_i B_k ε_{ijk} = E_j.   (24.41)

If we take B to be in the z direction, and k = z, we find


v_i = ε_{ij} E_j/B_z.   (24.42)
If we have n fully occupied Landau levels, each with N_max states in them, so that Q = nN_max, then for electrons (q = −e) we find the current density

j_i = (eQ/(L_x L_y)) v_i = (eQ/(B L_x L_y)) ε_{ij} E_j ≡ σ_{ij} E_j,   (24.43)
where we have defined a conductance tensor by the standard relation ji = σi j E j . Then we have
σ_{ij} = (eQ/Φ) ε_{ij} = (e·nN_max/(N_max h/e)) ε_{ij} = n(e²/h) ε_{ij} = σ_H ε_{ij},   (24.44)
so the conductance matrix is
 
σ = (   0        ne²/h
      −ne²/h       0   ).   (24.45)
We see that σxx , the diagonal or normal (longitudinal) conductance, vanishes. If there were no off-
diagonal (Hall) conductance, but only a longitudinal conductance, σxx = 0 would imply infinite
resistance, Rxx = ∞. However, now σ is a matrix, so has to be inverted as a matrix, giving the
resistance matrix
 
R = σ⁻¹ = (   0       −1/σ_xy
            1/σ_xy       0   ),   (24.46)
and we see that now in fact Rxx = 0! This means that we have zero longitudinal resistance on the
plateaux, the other important experimental observation about the integer quantum Hall effect.
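The matrix argument above is easy to verify explicitly; a small numpy sketch (the choice n = 3 filled levels is an arbitrary illustration):

```python
import numpy as np

# Inverting the conductance matrix of Eq. (24.45): when sigma_xx = 0 but
# sigma_xy != 0, the longitudinal resistance R_xx of Eq. (24.46) vanishes too.
e = 1.602176634e-19
h = 6.62607015e-34
n = 3                              # illustrative number of filled levels
s = n * e**2 / h                   # |sigma_xy| = n e^2/h

sigma = np.array([[0.0,  s],
                  [-s, 0.0]])      # Eq. (24.45)
R = np.linalg.inv(sigma)           # resistance matrix, Eq. (24.46)

assert np.allclose(R, [[0.0, -1.0 / s], [1.0 / s, 0.0]])   # R_xx = 0
```

The point is that σ_xx = 0 would mean infinite resistance only if σ were a number; as a matrix, its inverse has vanishing diagonal as well.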

24.6 An Atom in a Magnetic Field and the Landé g-Factor

Consider now the next case in terms of complexity of the system, a multi-electron atom in a magnetic field. The electrons have spins that add up to a total electronic spin S = Σ_{i=1}^N s_i (for N electrons), and these electrons interact with B through both the vector potential A and the spin–magnetic field interaction B·S. Specifically, the Hamiltonian will be

Ĥ = (1/2m) Σ_{i=1}^N (p̂_i + |e|A(r_i))² + U(r_1, . . . , r_N) + 2μ_B B·S/ℏ,   (24.47)

where U (r 1 , . . . , r N ) is the interaction potential between the electrons and nucleus and between the
electrons themselves.
Expanding, we get

Ĥ = Ĥ(A = 0) + (|e|/m) Σ_{i=1}^N A(r_i)·p̂_i + (e²/2m) Σ_{i=1}^N A²(r_i) + (|e|/m) B·S.   (24.48)

If the magnetic field is constant, we can choose a gauge where the vector potential is A = (B × r)/2, so the Hamiltonian is then rewritten as

Ĥ = Ĥ(A = 0) + (|e|/2m) B · Σ_{i=1}^N (r_i × p̂_i) + (e²/8m) Σ_{i=1}^N (B × r_i)² + (|e|/m) B·S
  = Ĥ(A = 0) + (μ_B/ℏ)(L + 2S)·B + (e²/8m) Σ_{i=1}^N (r_i × B)²,   (24.49)

where we have used the individual orbital angular momenta L_i = r_i × p_i and the total orbital angular momentum L = Σ_{i=1}^N L_i. The last term is independent of L and S, while the middle term is the interaction of B with the total magnetic moment of the atom,

μ_tot = μ_B (L + 2S)/ℏ.   (24.50)
For the electrons, we can define the total angular momentum

J = L + S,   (24.51)

and similarly for the projections onto the z direction, J_z = L_z + S_z. The projection J_z is defined by an eigenvalue M_J, so the average for a state is ⟨J_z⟩ = M_J ℏ. Moreover, we also have ⟨L_z⟩ = M_L ℏ and ⟨S_z⟩ = M_S ℏ.
Then the energy split due to the angular momenta is

ΔE = (μ_B B/ℏ)(⟨L_z⟩ + 2⟨S_z⟩) = (μ_B B/ℏ)(⟨J_z⟩ + ⟨S_z⟩),   (24.52)

and we can write it in terms of a total Landé, or gyromagnetic, factor g for the whole atom, as

ΔE = g μ_B B M_J.   (24.53)
This is the Zeeman effect, of the splitting of the atomic levels under a magnetic field. Note that
there are further terms, coming from the nonrelativistic expansion for the Dirac equation, including
the spin–orbit coupling, but we will not consider them here.
In order to calculate g, we first square the relation J = L + S, to obtain

2L·S = J² − L² − S².   (24.54)

Then, since S·J = L·S + S², we take the quantum average over a state of given J, L, S, for which ⟨J²⟩ = J(J + 1)ℏ², ⟨L²⟩ = L(L + 1)ℏ² and ⟨S²⟩ = S(S + 1)ℏ², to obtain

⟨S·J⟩ = (ℏ²/2)[J(J + 1) − L(L + 1) + S(S + 1)].   (24.55)
2
Next, we note that, for a spin–orbit coupling that aligns the spin and the total angular momentum, only the projection of S along J survives on average, so we obtain

⟨S_z⟩ = (⟨J·S⟩/⟨J²⟩) ⟨J_z⟩ = (M_J ℏ/(2J(J + 1))) [J(J + 1) − L(L + 1) + S(S + 1)].   (24.56)

We finally write

g M_J ℏ = ⟨J_z⟩ + ⟨S_z⟩ = M_J ℏ [1 + (J(J + 1) − L(L + 1) + S(S + 1))/(2J(J + 1))],   (24.57)

allowing us to write the Landé factor as

g = 1 + (J(J + 1) − L(L + 1) + S(S + 1))/(2J(J + 1)).   (24.58)

Important Concepts to Remember

• For a quantum particle (perhaps with spin 1/2) in a constant magnetic field, we find that it behaves as a shifted and rescaled harmonic oscillator with cyclotron frequency ω_c = |q|B/m in a direction perpendicular to B, leading to Landau levels n, E_n = p_z²/(2m) + ℏω_c(n + m_s + 1/2).
• Classically, we have a motion at ω_c on a circle of radius r = p_y/(mω_c) (with momentum perpendicular to the radial direction), while quantum mechanically we have motion in a direction x in the same plane as the circle, shifted by x₀(p_y) = r (the x direction is perpendicular to the p_y direction), as that of a one-dimensional harmonic oscillator with the same ω_c, and rescaled by l = √(ℏ/(|q|B)).
• In a physical situation, for each Landau level we have a degeneracy determined by the quantized
momentum py , which is bounded because it leads to a shift in the perpendicular direction. This
leads to a maximum number of states (degeneracy) of each Landau level equal to Φ/Φ0 , where
Φ0 = h/e is the flux quantum, and to n B = eB/h states per unit area in a Landau level.
• The classical Hall effect is as follows: if we have a B field in the z direction, then an E field in the
y direction is correlated with the current in the x direction (either the current induces Ey , i.e., the
Hall voltage VH , or Ey induces Ix ), leading to a Hall conductivity σ H = −ne e/B.
• The integer quantum Hall effect (IQHE) is the phenomenon whereby, for certain (2 + 1)-
dimensional systems, the relation between VH and B (which classically is VH = I/σ H ∝ B) is
not linear, but has plateaux, whose position is universal, and we also have an integer conductivity, σ_H = −ne²/h, and R_L = 0 (zero longitudinal resistance).
• The fractional quantum Hall effect (FQHE) is the phenomenon whereby, at much higher B fields and lower temperature T, the conductivity is fractional, σ_H = νe²/h, with ν = k/(2p + 1), (partially) explained by Laughlin in terms of a Laughlin quantum wave function.
• The Hall conductance σ H of IQHE comes from n fully occupied Landau levels, with ne /n B = n,
which also means σi j = σ H i j , such that σxx = 0, yet because R = σ−1 is a matrix inverse we also
have Rxx = 0 (zero longitudinal resistance).
• For a multi-electron atom, the Landé g factor is found from writing ΔE = (μ_B B/ℏ)(⟨J_z⟩ + ⟨S_z⟩) as (μ_B B/ℏ) g⟨J_z⟩.

Further Reading
See for instance [6] for more details on the Hall effects (IQHE and FQHE), and [2] for more on the
Landé g factor.

Exercises

(1) Consider a stationary gauge transformation Λ(x, y) in the plane transverse to a magnetic field
Bz . Calculate the effect it has on the Schrödinger equation and its solutions.
(2) For the case in exercise 1, specialize to an infinitesimal gauge transformation that rotates A in its plane (transverse to B), and interpret this in terms of the classical motion.
(3) For a (2 + 1)-dimensional system of sides L x and L y under the IQHE, calculate the length of the
plateau in RH as B is increased.
(4) In the FQHE, how many occupied states can there be in a Landau level? What do you deduce
about the description of the states of the electron?
(5) In some more complicated materials, we can have both a Hall conductivity and a longitudinal,
usually isotropic, conductivity. Calculate the resistivity matrix Ri j . Consider a transformation
that acts by exchanging the values of the electric field components Ei and those of the current
densities ji , and write it as a transformation on the complex value σ ≡ σxy + iσxx .
(6) For a multi-electron atom, with total angular momentum L > 1 and total spin 1, what are the
possible values of the Landé g factor? Specialize to the classical limit of large L.
(7) Consider a multi-electron atom, with spin S, orbital angular momentum L, and total angular
momentum J in a slowly varying magnetic field B(t) moving on a closed curve C. Calculate the
Berry phase of the quantum state of the atom.
25 The WKB; a Semiclassical Approximation

We now go back to the WKB method, which first appeared in Chapter 11, where it was related to the
classical limit as an expansion around the Hamilton–Jacobi formulation of classical theory. The WKB
method is a method of approximation, and as such we will also go back to it in Part II of the book,
but here we make a second pass at it since it will lead, in the next chapter, to the Bohr–Sommerfeld
quantization, which was one of the first formulations of quantum mechanics.

25.1 Review and Generalization

The wave function solution of the Schrödinger equation was written as

ψ(r, t) = e^{iS(r,t)/ℏ},   S(r, t) = W(r) − Et,   (25.1)

which means that the time-independent wave function is

ψ(r) = e^{iW(r)/ℏ}.   (25.2)

Moreover, we wrote W(r) as

W(r) = s(r) + (ℏ/i) T(r),   (25.3)

where s(r) and T(r) are uniquely defined as even functions of ℏ. Then

ψ(r) = e^{T(r)} e^{is(r)/ℏ} ≡ A e^{is(r)/ℏ}.   (25.4)
We can expand the even functions s, T in ℏ,

s = s₀ + (ℏ/i)² s₂ + (ℏ/i)⁴ s₄ + · · ·
T = t₀ + (ℏ/i)² t₂ + (ℏ/i)⁴ t₄ + · · · .   (25.5)

If we define t_n = s_{n+1}, it means that W(r) has an expansion in all powers of ℏ/i, with coefficients s_n.
Then, in one spatial dimension, the Schrödinger equation becomes

(dW/dx)² − 2m(E − V(x)) + (ℏ/i) d²W/dx² = 0,   (25.6)

and expanding it in powers of ℏ/i we obtain

[Σ_{n≥0} (ds_n/dx)(ℏ/i)ⁿ]² − 2m(E − V(x)) + Σ_{n≥0} (d²s_n/dx²)(ℏ/i)^{n+1} = 0.   (25.7)
283 25 The WKB and Semiclassical Approximation

We can set to zero the coefficients of each power of ℏ/i, obtaining an infinite set of coupled equations for the s_n.
Then, as we have seen, keeping only s₀ and s₁ = t₀, the WKB solution in one dimension is

ψ(x) = (const./√p(x)) exp[±(i/ℏ) ∫_{x₀}^x dx′ p(x′) − iEt/ℏ],   (25.8)

where, if E > V(x),

p(x) = √(2m(E − V(x))).   (25.9)

However, for the solution in the classically forbidden region, E < V(x), if we write

p(x) = iχ(x) = i√(2m(V(x) − E)),   (25.10)

then the solution is

ψ(x, t) = (const./√χ(x)) exp[±(1/ℏ) ∫_{x₀}^x dx′ χ(x′) − iEt/ℏ].   (25.11)
The condition for the validity of this WKB approximation is |δ_λ λ/λ| ≪ 1, or

|δ_λ λ| ≪ λ = ℏ/√(2m|E − V(x)|)  ⇔  ℏ/√(2m|E − V(x)|) ≪ 2|E − V(x)|/|dV/dx|,   (25.12)

where the last expression is the characteristic distance for the variation of the potential.
However, we note that, while valid almost everywhere, the above condition certainly fails near the turning points of the potential, where E = V(x). Indeed, there we have ψ → ∞, and

ℏ/√(2m|V(x) − E|) → ∞,  while  2|E − V(x)|/|dV/dx| → 0.   (25.13)
Instead, we must use a different approximation.

25.2 Approximation and Connection Formulas at Turning Points

We consider first the case when V (x) decreases through E at x = x 1 , so the barrier (the classically
forbidden region) is on the left-hand side of the potential diagram.
Then we can approximate the function

f(x) = (2m/ℏ²)(E − V(x))   (25.14)

by a Taylor expansion to first order, i.e., a linear approximation,

f(x) ≈ f′(x₁)(x − x₁) = −(2m/ℏ²) V′(x₁)(x − x₁).   (25.15)
This can then be analytically continued through the complex-x plane, in order to avoid the point
x 1 , where f (x 1 ) = 0. One can make a rigorous analysis in this way, but it is rather complicated.
Instead, there is a much simpler shortcut, where we replace the function f(x) with a step function:

f(x) = −α² for x < x₁,   +α² for x > x₁.   (25.16)

This function is discontinuous, but that is not a problem as we have already dealt with step functions
in the general chapter on one-dimensional problems.
With this f(x), the time-independent Schrödinger equation becomes

ψ″₋ − α²ψ₋ = 0,  x < x₁,
ψ″₊ + α²ψ₊ = 0,  x > x₁.   (25.17)
The solutions ψ± on the different segments can be glued using the usual joining conditions, of
continuity at x 1 for ψ(x) and ψ  (x).
The solution in the region x < x₁ that goes to zero at x → −∞ is e^{α(x−x₁)}; introducing a factor 1/√α for consistency with the WKB approximation, we obtain

ψ₋(x) = (1/√α) e^{α(x−x₁)} = (1/√α) e^{−∫_x^{x₁} dx′ α},   (25.18)

so for this solution, we have

ψ₋(x₁) = 1/√α,   ψ′₋(x₁) = √α.   (25.19)
On the other hand, the general solution in the region x > x₁ is

ψ₊(x) = (1/√α) A sin[α(x − x₁) + δ],   (25.20)

so for this solution, we have

ψ₊(x₁) = A sin δ/√α,   ψ′₊(x₁) = A√α cos δ.   (25.21)

The joining conditions are

ψ₊(x₁) = ψ₋(x₁) ⇒ A sin δ = 1
ψ′₊(x₁) = ψ′₋(x₁) ⇒ A cos δ = 1.   (25.22)

The ratio of the conditions gives tan δ = 1, so δ = π/4, and thus we obtain A = √2.
Then, since

α = √((2m/ℏ²)|E − V(x)|),   (25.23)

we obtain the condition for replacement (when going from the left to the right of the point) at the turning point:

(1/[V(x) − E]^{1/4}) exp[−(1/ℏ) ∫_x^{x₁} dx′ √(2m[V(x′) − E])]
  → (√2/[E − V(x)]^{1/4}) sin[(1/ℏ) ∫_{x₁}^x dx′ √(2m[E − V(x′)]) + π/4]   (25.24)
  = (√2/[E − V(x)]^{1/4}) cos[(1/ℏ) ∫_{x₁}^x dx′ √(2m[E − V(x′)]) − π/4].
Similarly, for the opposite turning point, for an increasing V(x) (thus for the barrier, or classically forbidden region, on the right-hand side), we obtain the condition for replacement (from the right to the left of x₂),

(1/[V(x) − E]^{1/4}) exp[−(1/ℏ) ∫_{x₂}^x dx′ √(2m[V(x′) − E])]
  → (√2/[E − V(x)]^{1/4}) sin[(1/ℏ) ∫_x^{x₂} dx′ √(2m[E − V(x′)]) + π/4]   (25.25)
  = (√2/[E − V(x)]^{1/4}) cos[−(1/ℏ) ∫_x^{x₂} dx′ √(2m[E − V(x′)]) + π/4].

25.3 Application: Potential Barrier

Consider a potential well with depth −V0 until x = x 1 , after which V (x) jumps up to +Vmax , and then
decreases monotonically to zero at infinity, as in Fig. 25.1. The relevant case is that of the potential in
nuclear α-decay, meaning nuclear fragmentation. Then the energy E of the decaying nucleus is higher
than zero, representing the energy of the fragmented pieces. The potential well at x < x 1 corresponds
to the nuclear force, while the monotonically decreasing barrier for x > x 1 is a Coulomb potential.
We define region I for x < x₁; setting E = V(x) at x = x₂ > x₁, we have region II for x₁ ≤ x ≤ x₂, and region III for x > x₂. We further write, as usual,

√(2m(E − V(x))/ℏ²) ≡ k(x),  x ∈ III
√(2m(V(x) − E)/ℏ²) ≡ κ(x),  x ∈ II   (25.26)
√(2m(E + V₀)/ℏ²) ≡ k,  x ∈ I.
Then the wave function in region III is

ψ_III(x) = (√2/√k(x)) cos[∫_{x₂}^x k(x′) dx′ − π/4]
  = (1/√k(x)) { exp[i ∫_{x₂}^x k(x′) dx′ − iπ/4] + exp[−i ∫_{x₂}^x k(x′) dx′ + iπ/4] }.   (25.27)
Figure 25.1 Potential barrier for α decay.



According to the previous WKB rules for continuity at the turning point, we have for the wave function in region II (except that the wave function should now be increasing away from the transition point, not decreasing)

ψ_II(x) = (1/√κ(x)) exp[+∫_x^{x₂} dx′ κ(x′)]
  = (1/√κ(x)) e^τ exp[−∫_{x₁}^x dx′ κ(x′)],   (25.28)

where we have defined

τ = ∫_{x₁}^{x₂} dx′ κ(x′).   (25.29)

Finally, in region I, where the potential is constant, V(x) = −V₀, we can solve the Schrödinger equation exactly (rigorously, without a WKB approximation):

ψ_I(x) = A sin[k(x − x₁) + β]
  = A (e^{i[k(x−x₁)+β]} − e^{−i[k(x−x₁)+β]})/(2i).   (25.30)
The conditions for the continuity of ψ(x) and ψ′(x) at x = x₁ are

ψ_I(x₁) = ψ_II(x₁) ⇒ (1/√κ(x₁)) e^τ = A sin β
ψ′_I(x₁) = ψ′_II(x₁) ⇒ k cot β = −κ(x₁).   (25.31)

But, given that we have written the wave functions ψ_I and ψ_III in terms of exponentials e^{iθ(x)} in (25.30) and (25.27), we can use the general formulas from Chapter 7 for the incident and transmitted currents,

j_inc = (ℏk/m) |A/2|² e_x
j_trans = (ℏk(x)/m) (1/√k(x))² e_x = (ℏ/m) e_x.   (25.32)

Then the transmission coefficient is

T = |j_trans|/|j_inc| = 4/(k|A|²).   (25.33)
But from (25.31), we obtain

sin²β = 1/(1 + cot²β) = 1/(1 + κ²(x₁)/k²) ⇒
|A|² = e^{2τ}/(κ(x₁) sin²β) = (e^{2τ}/κ(x₁)) (1 + κ²(x₁)/k²),   (25.34)

so the transmission coefficient becomes

T = 4e^{−2τ} kκ(x₁)/(k² + κ²(x₁)) = 4e^{−2τ} √((V_max − E)(E + V₀))/(V_max + V₀).   (25.35)
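The dominant suppression factor e^{−2τ} is easily evaluated numerically. A sketch in units ℏ = 2m = 1 (so that κ = √(V − E)), for an illustrative Coulomb-like barrier V(x) = V_max x₁/x; the barrier shape and all parameter values are assumptions for the example, not numbers from the text:

```python
import math

# WKB barrier-penetration integral tau = Integral_{x1}^{x2} kappa(x) dx
# (Eq. 25.29), for a model alpha-decay barrier V(x) = Vmax*x1/x at x > x1.
# Units hbar = 2m = 1, so kappa = sqrt(V - E); all values are illustrative.
def tau_wkb(V, E, x1, x2, n=10_000):
    """Midpoint-rule estimate of Integral sqrt(V(x) - E) dx."""
    dx = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * dx
        total += math.sqrt(max(V(x) - E, 0.0)) * dx
    return total

Vmax, x1, E = 10.0, 1.0, 2.0
V = lambda x: Vmax * x1 / x        # Coulomb-like tail, V(x1) = Vmax
x2 = Vmax * x1 / E                 # outer turning point, V(x2) = E
tau = tau_wkb(V, E, x1, x2)
T_suppression = math.exp(-2.0 * tau)   # dominant factor in Eq. (25.35)
```

The exponential sensitivity of T_suppression to E is what produces the enormous spread of α-decay lifetimes.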

25.4 The WKB Approximation in the Path Integral

The wave function is written in terms of the propagator (evolution operator),

ψ(x, t) = U(t, t₀) ψ(x, t₀),   (25.36)

and the propagator can be written as a path integral,

U(t, t₀) = ∫ Dx(t) e^{iS[x(t)]/ℏ}.   (25.37)

In terms of the path integral, the expansion in ℏ of the exponent in U(t, t₀) = e^{iW/ℏ} is related to the expansion of the action around its classical value. To a first approximation, the action is equal to the classical value plus a quadratic fluctuation around it,

S ≈ S_cl[x_cl(t)] + ∫₀ᵗ dt′ ((x(t′) − x_cl(t′))²/2) (δ²S/δx²)[x_cl],   (25.38)

which corresponds to the expansion W(x) ≈ s₀(x) + (ℏ/i) s₁(x), which leads to ψ(x) = e^{s₁} e^{is₀/ℏ}. The classical part does not need to be integrated (since it is constant), while the path integration acts on the quadratic fluctuation part.
Since (putting t₀ = 0, and using the fact that L = pẋ − H)

S_cl = ∫₀ᵗ dt′ [p(x)(dx/dt′) − E] = ∫_{x₀}^{x_f} dx p(x) − Et,   (25.39)

and since

S = ∫₀ᵗ dt (mẋ²/2 − V(x)) ⇒ p²(x)/(2m) + V(x) = E,   (25.40)

we find that the zeroth-order term in the propagator becomes

exp[(i/ℏ) S_cl] = exp[(i/ℏ) ∫ dx √(2m(E − V(x))) − (i/ℏ) Et],   (25.41)
which is the zeroth-order term (e^{is₀/ℏ}) in the WKB approximation.
On the other hand, the quadratic fluctuation part of S can be path integrated, as it is of the Gaussian integral type,

∫_{−∞}^{+∞} dx e^{−αx²} = √(π/α).   (25.42)
Since we can write

S ≈ S_cl[x_cl] + (1/2) ∫₀ᵗ dt (x(t) − x_cl(t)) [−m d²/dt² − V″(x_cl)] (x(t) − x_cl(t)),   (25.43)

the Gaussian integration over the quadratic fluctuation part gives the prefactor

e^{s₁} = { det[−m d²/dt² − V″(x)] }^{−1/2} ≡ [det A]^{−1/2}.   (25.44)

Now define V″(x) ≡ mω², and factor out the m prefactor from the operator A, since it gives an overall constant (independent of ω).

To calculate the determinant of the operator A, we must choose a basis of eigenfunctions for it, as follows.
In general, it is hard to obtain the WKB approximation term s₁ from the path integral calculation, and so from [det A]^{−1/2}. Instead, we can obtain the eigenenergies, e.g., for periodic paths, as in the harmonic oscillator case. In this case we choose x(t_f) = x(t_i) = x₀ and then we integrate over it, ∫dx₀. On these periodic paths, with period T = t_f − t_i, a good (complete) basis of eigenfunctions is sin(nπt/T). For this basis we have

(−d²/dt² − ω²) sin(nπt/T) = (n²π²/T² − ω²) sin(nπt/T).   (25.45)
Then, the determinant of the operator equals the product of the eigenvalues,

det O ≡ det(A/m) = ∏_{n=1}^∞ (n²π²/T² − ω²) ≡ K(T) ∏_{n=1}^∞ (1 − ω²T²/(n²π²)) = K(T) sin(ωT)/(ωT),   (25.46)

where we have factored out another kinematic, i.e., ω-independent, factor, K(T).
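The product formula used here is Euler's product for the sine, which can be checked numerically; the truncation order below is an arbitrary numerical choice:

```python
import math

# Check of the eigenvalue product in Eq. (25.46):
# prod_{n>=1} (1 - (wT)^2/(n^2 pi^2)) = sin(wT)/(wT)  (Euler's sine product).
def truncated_product(wT, n_max=100_000):
    prod = 1.0
    for n in range(1, n_max + 1):
        prod *= 1.0 - (wT / (n * math.pi)) ** 2
    return prod

wT = 1.3                       # arbitrary test value of omega*T
lhs = truncated_product(wT)
rhs = math.sin(wT) / wT
# lhs and rhs agree up to the O(1/n_max) truncation error of the product.
```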


Now one can do the integral over x₀ for the combined classical and first quantum terms. After a calculation that will not be repeated here (it can be found for instance in [10], Chapter 6.1), we obtain

∫ dx₀ e^{iS_cl/ℏ} (1/√(det O)) = K̃(T) Σ_{n=0}^∞ e^{−i(n+1/2)ωT} = K̃(T) Σ_{n=0}^∞ e^{−iE_nT/ℏ},   (25.47)

where K̃(T) is another constant that is independent of the dynamics (i.e., of ω). The above result is the correct propagator for the harmonic oscillator.

Important Concepts to Remember

 
• In the expansion ψ(r, t) = exp[(i/ℏ)(W(r) − Et)], we write W(r) = s(r) + (ℏ/i) T(r) and expand s(r) and T(r) in (ℏ/i)², obtaining a series expansion of the Schrödinger equation that corresponds to quantum corrections to the Hamilton–Jacobi equation.
• The WKB solution (the solution in the WKB approximation) in one dimension is still formally a solution in the classically forbidden region E − V(x) < 0, obtained by writing p(x) = iχ(x) (where p(x) = √(2m(E − V(x)))), but the solution (the WKB approximation) is not valid near the turning points in the potential, E = V(x).
• At the turning points in the potential, we can write connection formulas, which connect a solution in terms of p(x) into a solution in terms of χ(x).
• In the path integral, the WKB approximation amounts to the quadratic approximation for the action S in the exponent e^{iS/ℏ} around its classical (minimum) value S_cl. We are left with an integral over x₀, in the case of periodic paths that begin and end at the same x, x_f = x_i = x₀, as for the harmonic oscillator, obtaining Σ_{n=0}^∞ e^{−iE_nT/ℏ}, where E_n are the eigenenergies of the harmonic oscillator.

Further Reading
See [3], [2] for more details on WKB in one dimension, and [10] for more details on WKB in the
path integral.

Exercises

(1) Write down the equation for s2 in terms of s0 and s1 = t 0 , and substitute into it the values of s0
and s1 from the WKB solution in one dimension.
(2) Write down the connection formulas at the turning points for a harmonic oscillator in one
dimension.
(3) Write down the connection formulas at the turning points for a potential V = V0 cos(ax) (V0 > 0),
for E < V0 .
(4) Consider the potential barrier given by V_eff(r) for the radial motion in a potential V = −|α|/r³, with a positive energy smaller than the barrier. Write down the connection formulas for the turning points in this case.
(5) In the case in exercise 4, calculate the transmission coefficient.
(6) Consider the potential V = −V0 cos(ax) (V0 > 0) with a small initial condition (x 0 < π/(2a)).
Calculate the WKB approximation in the path integral.
(7) Can we truncate the sum over n in the propagator (25.47) to a finite order, as an approximation?
26 Bohr–Sommerfeld Quantization

In this chapter we use the WKB method on bound states to derive a modified version of the
quantization condition put forward initially by Bohr and Sommerfeld, which was one of the first
calculational methods of quantum mechanics. We then apply this modified Bohr–Sommerfeld
quantization condition to a set of examples, including the harmonic oscillator and the hydrogen
atom, in order to derive the eigenenergies and wave functions for the states.

26.1 Bohr–Sommerfeld Quantization Condition

We consider a potential well, where for x < x 1 we have V (x) > E, called region I (classically
forbidden), for x 1 ≤ x ≤ x 2 we have V (x) ≤ E, called region II (the classically allowed region, the
potential well), and for x > x 2 we also have V (x) > E, called region III, as in Fig. 26.1.
We can now use the connection formulas at the turning points x 1 , x 2 derived in the previous
chapter. In region I, we have the WKB solution
ψ_I(x) = (1/[2m(V(x) − E)]^{1/4}) exp[−(1/ℏ) ∫_x^{x₁} dx′ √(2m(V(x′) − E))],   (26.1)

that transitions into the solution in region II,

ψ_II(x) = (√2/[2m(E − V(x))]^{1/4}) cos[(1/ℏ) ∫_{x₁}^x dx′ √(2m(E − V(x′))) − π/4].   (26.2)
On the other hand, the correct WKB solution in region III is

ψ_III(x) = (1/[2m(V(x) − E)]^{1/4}) exp[−(1/ℏ) ∫_{x₂}^x dx′ √(2m(V(x′) − E))],   (26.3)

which transitions into the solution in region II,

ψ̃_II(x) = (√2/[2m(E − V(x))]^{1/4}) cos[−(1/ℏ) ∫_x^{x₂} dx′ √(2m(E − V(x′))) + π/4].   (26.4)

In order for the two solutions in region II to be the same, ψ_II(x) = ψ̃_II(x), we must have the quantization condition

(1/ℏ) ∫_{x₁}^{x₂} dx′ √(2m(E − V(x′))) − π/2 = nπ.   (26.5)

291 26 Bohr–Sommerfeld Quantization

Figure 26.1 Potential regions.

Note that for odd n, the argument of the cos in ψ_II is minus the argument in ψ̃_II (which still gives the same function), whereas for even n, it is the same argument. Then the quantization condition becomes

∫_{x₁}^{x₂} dx′ √(2m(E − V(x′))) = (n + 1/2) πℏ.   (26.6)

Considering the integral over a closed path, ∫_{x₁}^{x₂} + ∫_{x₂}^{x₁} = ∮_C, corresponding to a full oscillation between x₁ and x₂, and generalizing x to a general variable q, and p(x) to its canonically conjugate momentum p, we obtain

∮_C dx p(x) = ∮_C p dq = (n + 1/2) 2πℏ = (n + 1/2) h,   (26.7)

where n = 0, 1, 2, . . .
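The quantization condition (26.7) can be turned directly into a numerical eigenvalue solver: since the action integral grows monotonically with E, one can bisect on ∫_{x₁}^{x₂} √(2m(E − V)) dx = (n + 1/2)πℏ. A sketch in units ℏ = m = 1; the quartic potential V = x⁴ is an arbitrary illustration, not an example from the text:

```python
import math

# Bohr-Sommerfeld solver: bisection on the condition
# Integral_{x1}^{x2} sqrt(2m(E - V(x))) dx = (n + 1/2) pi hbar  (hbar = m = 1).
def action_integral(V, E, x1, x2, n=20_000):
    """Midpoint-rule estimate of Integral sqrt(2(E - V(x))) dx."""
    dx = (x2 - x1) / n
    s = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * dx
        s += math.sqrt(max(2.0 * (E - V(x)), 0.0)) * dx
    return s

def bohr_sommerfeld_energy(V, turning_points, level, E_lo, E_hi):
    """Bisect I(E) - (level + 1/2) pi = 0; I(E) is monotonic in E."""
    target = (level + 0.5) * math.pi
    for _ in range(60):
        E = 0.5 * (E_lo + E_hi)
        x1, x2 = turning_points(E)
        if action_integral(V, E, x1, x2) < target:
            E_lo = E
        else:
            E_hi = E
    return 0.5 * (E_lo + E_hi)

V = lambda x: x**4                          # illustrative quartic potential
turning = lambda E: (-E**0.25, E**0.25)     # V(x) = E at x = +-E^(1/4)
E0 = bohr_sommerfeld_energy(V, turning, 0, 0.01, 10.0)
```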
We note that, except for the constant (1/2 instead of 1), this is the same as the original Bohr–Sommerfeld quantization condition, which was postulated (not derived),

∮_C p dq = (n + 1) h,   (26.8)

where again n = 0, 1, 2, . . . We have thus obtained an improved version of Bohr–Sommerfeld


quantization that includes the classical and first semiclassical terms, and thus is expected to give the
correct result either in very simple cases (such as the harmonic oscillator and the hydrogen atom), or
for large quantum numbers n, when we are closer to classicality.
We will thus apply it to find the eigenenergies En and eigenfunctions ψ n , and we expect in general
to find very good agreement at large n.
Finally, another observation is that the WKB wave function has n nodes for the quantum number n, just as in the general (exact) analysis of one-dimensional problems. To see this, note the form of the function ψ_II(x) in the classically allowed region: it is written in terms of a cos whose argument advances by nπ across the well, therefore having n nodes for the quantum number n.

26.2 Example 1: Parity-Even Linear Potential

We consider the potential

V (x) = k |x|, (26.9)



which applies for instance, to the q q̄ potential in QCD: when pulling the quark and antiquark apart,
there is a constant force pulling them back together. This example can also be solved exactly, so we
can compare how well the result works.
The turning points are given by

E − V(x_{1,2}) = 0 ⇒ x_{1,2} = ∓E/k.   (26.10)
Then, integrating between them and noting that x₁ = −x₂ for an even integrand, we obtain

∫_{x₁}^{x₂} dx′ √(2m(E − V(x′))) = 2 ∫₀^{x₂} dx′ √(2m(E − kx′))
  = 2 (E/k) √(2mE) ∫₀¹ dy √(1 − y)   (26.11)
  = (4E/3k) √(2mE),

where we have used ∫₀¹ dy √(1 − y) = 2/3. Since this integral equals (n + 1/2)πℏ by the quantization condition, we obtain the eigenenergies

E_n = [3πℏk(n + 1/2)/(4√(2m))]^{2/3}.   (26.12)
But in this case, the exact eigenenergies are known, and are given by

E_n = λ_n [ℏk/√(2m)]^{2/3},   (26.13)

where λ_n are the zeroes of the Airy function,

Ai(−λ_n) = 0.   (26.14)

Then we find excellent agreement, starting at n = 1, where the error is about 1%, after which it gets better. The only poor agreement is for the ground state, E₀ (for n = 0). So in this case, the eigenenergies get better at large n, as expected. Also as expected, the wave functions of these states also improve at large n, except near the turning points, where we must use the approximation from the previous chapter.
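This comparison is easy to reproduce. In the natural units of Eq. (26.13), where the common factor (ℏk/√(2m))^{2/3} is set to 1, the WKB prediction for level n is (3π(n + 1/2)/4)^{2/3}, to be compared with the exact dimensionless eigenvalue λ_n; for n = 1 this is the first zero of the Airy function, Ai(−2.33811...) = 0, quoted here from standard tables:

```python
import math

# WKB (Eq. 26.12) vs exact (Eq. 26.13) levels for V = k|x|, in units where
# the overall factor (hbar*k/sqrt(2m))^(2/3) equals 1.
def wkb_level(n):
    return (3.0 * math.pi * (n + 0.5) / 4.0) ** (2.0 / 3.0)

lambda_1 = 2.33810741        # first zero of Ai(-x), from standard tables
rel_error = abs(wkb_level(1) - lambda_1) / lambda_1
# rel_error is below 1%, matching the statement in the text.
```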

26.3 Example 2: Harmonic Oscillator

Consider next the harmonic oscillator potential,

V(x) = mω²x²/2.   (26.15)

In this case, the integral appearing in the quantization condition is (note that now we also have x₁ = −x₂, and at the turning points we have 2mE − m²ω²x²_{1,2} = 0)

I = ∫_{x₁}^{x₂} dx′ √(2m(E − V(x′))) = ∫_{x₁}^{x₂} dx′ √(2mE − m²ω²x′²)
  = [x′ √(2mE − m²ω²x′²)]_{x₁}^{x₂} + ∫_{x₁}^{x₂} dx′ m²ω²x′²/√(2mE − m²ω²x′²)   (26.16)
  = −∫_{x₁}^{x₂} dx′ √(2mE − m²ω²x′²) + 2mE ∫_{x₁}^{x₂} dx′/√(2mE − m²ω²x′²),

which leads to

2I = 2mE ∫_{x₁}^{x₂} dx′/√(2mE − m²ω²x′²) = 2mE · π/(mω) = 2πE/ω.   (26.17)

Therefore the quantization condition is

E/(ℏω) = n + 1/2 ⇒ E = ℏω(n + 1/2).   (26.18)

This is in fact the exact quantization condition for the harmonic oscillator, so it works at all n.
This result is consistent with the path integral result from the end of the previous chapter, where we saw that the WKB propagator contained the exact eigenenergies for the harmonic oscillator,

U(T) = Σ_{n=0}^∞ e^{−iE_nT/ℏ}.   (26.19)
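The key step (26.16)–(26.17), that the action integral for the oscillator equals πE/ω, can be verified numerically; a sketch in units m = ω = 1:

```python
import math

# Numerical check that for V = x^2/2 (m = omega = 1) the action integral
# I(E) = Integral sqrt(2E - x^2) dx between the turning points equals pi*E,
# so that I = (n + 1/2) pi hbar reproduces E_n = hbar*omega*(n + 1/2) exactly.
def ho_action(E, n=200_000):
    """Midpoint-rule estimate of Integral sqrt(2E - x^2) dx, |x| < sqrt(2E)."""
    x2 = math.sqrt(2.0 * E)
    dx = 2.0 * x2 / n
    s = 0.0
    for i in range(n):
        x = -x2 + (i + 0.5) * dx
        s += math.sqrt(max(2.0 * E - x * x, 0.0)) * dx
    return s

E = 2.5                 # e.g. the n = 2 level, in units hbar = omega = 1
assert abs(ho_action(E) - math.pi * E) < 1e-3
```

Geometrically, the integral is the area of a semicircle of radius √(2E), which is (π/2)(2E) = πE.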

Wave Functions
The wave functions in this case are not satisfactory at intermediate values of x, but become better at large x (x → ±∞), which means in the classically forbidden region. In this classically forbidden region, the WKB wave function is (for instance, for x > x₂, x → ∞)

ψ(x) = (1/[m²ω²x² − 2mE]^{1/4}) exp[−(1/ℏ) ∫_{x₂}^x dx′ √(m²ω²x′² − 2mE)].   (26.20)

However, at large x,

√(m²ω²x² − 2mE) = mωx √(1 − 2E/(mω²x²)) ≈ mωx − E/(ωx),   (26.21)

so the exponent in the wave function is

−(1/ℏ) ∫_{x₂}^x dx′ √(m²ω²x′² − 2mE) ≈ const − mωx²/(2ℏ) + (E/(ℏω)) ln x + O(1/x²).   (26.22)

Since E/(ℏω) = n + 1/2, we obtain the approximate wave function

ψ(x) ≈ (const./[m²ω²x² − 2mE]^{1/4}) exp[−mωx²/(2ℏ) + (n + 1/2) ln x]
  ≈ (const./√(mωx)) x^{n+1/2} exp[−mωx²/(2ℏ)] = const × xⁿ exp[−mωx²/(2ℏ)].   (26.23)

But these are indeed the leading (exponential) and first subleading (power law) terms in the exact wave functions, since x^n is the leading term in the Hermite polynomial H_n(x).

26.4 Example 3: Motion in a Central Potential

Consider a central potential in three dimensions, V = V(r). Then the time-independent Schrödinger equation is

-\hbar^2 \Delta\psi - 2m(E - V(r))\psi = 0 .   (26.24)

Writing the time-independent wave function as the usual exponential (since S(\vec r, t) = W(\vec r) - Et),

\psi(\vec r) = e^{iW(\vec r)/\hbar} ,   (26.25)

we obtain an equation for W that is a quantum-corrected version of the Hamilton–Jacobi equation,

(\vec\nabla W)^2 - 2m(E - V(r)) + \frac{\hbar}{i} \Delta W = 0 .   (26.26)

But, in spherical coordinates, we have

(\vec\nabla W)^2 = \left( \frac{\partial W}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial W}{\partial\theta} \right)^2 + \frac{1}{r^2\sin^2\theta} \left( \frac{\partial W}{\partial\phi} \right)^2
\Delta W = \frac{\partial^2 W}{\partial r^2} + \frac{2}{r} \frac{\partial W}{\partial r} + \frac{1}{r^2} \left( \frac{\partial^2 W}{\partial\theta^2} + \cot\theta\, \frac{\partial W}{\partial\theta} + \frac{1}{\sin^2\theta} \frac{\partial^2 W}{\partial\phi^2} \right) .   (26.27)

Then, as we saw in Chapter 11, expanding W in \hbar in the usual way,

W = s_0 + \frac{\hbar}{i} s_1 + \left( \frac{\hbar}{i} \right)^2 s_2 + \cdots ,   (26.28)

and substituting in the Schrödinger equation, we obtain an equation that is also an infinite series in \hbar, where setting to zero each coefficient leads to an independent equation. Keeping only the leading and first subleading terms, we obtain

\left( \frac{\partial s_0}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial s_0}{\partial\theta} \right)^2 + \frac{1}{r^2\sin^2\theta} \left( \frac{\partial s_0}{\partial\phi} \right)^2 - 2m(E - V(r))
+ \frac{\hbar}{i} \left[ 2 \frac{\partial s_0}{\partial r} \frac{\partial s_1}{\partial r} + \frac{2}{r^2} \frac{\partial s_0}{\partial\theta} \frac{\partial s_1}{\partial\theta} + \frac{2}{r^2\sin^2\theta} \frac{\partial s_0}{\partial\phi} \frac{\partial s_1}{\partial\phi} + \Delta s_0 \right] = 0 .   (26.29)
The leading (order-one) term in the equation is just the Hamilton–Jacobi equation,

\left( \frac{\partial s_0}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial s_0}{\partial\theta} \right)^2 + \frac{1}{r^2\sin^2\theta} \left( \frac{\partial s_0}{\partial\phi} \right)^2 - 2m(E - V(r)) = 0 .   (26.30)

We separate variables, as usual in the Hamilton–Jacobi equation, by writing in spherical coordinates

s_0(r, \theta, \phi) = R_0(r) + \Theta_0(\theta) + \Phi_0(\phi) ,   (26.31)



which leads to three independent equations, one for each separated function,

\left( \frac{d\Phi_0}{d\phi} \right)^2 = L_3^2
\left( \frac{d\Theta_0}{d\theta} \right)^2 + \frac{L_3^2}{\sin^2\theta} = L^2   (26.32)
\left( \frac{dR_0}{dr} \right)^2 + \frac{L^2}{r^2} - 2m[E - V(r)] = 0 .
The first equation is solved directly as

\Phi_0(\phi) = L_3 \phi ,   (26.33)

the second is integrated as

\Theta_0(\theta) = \int d\theta \sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}} ,   (26.34)

and the third is also integrated as

R_0(r) = \int dr \sqrt{2m(E - V(r)) - \frac{L^2}{r^2}} .   (26.35)
The equation for s_1 is obtained by setting to zero the coefficient of \hbar/i in (26.29), which, after substituting the separated s_0(r, \theta, \phi), gives

2 \frac{dR_0}{dr} \frac{\partial s_1}{\partial r} + \frac{2}{r^2} \frac{d\Theta_0}{d\theta} \frac{\partial s_1}{\partial\theta} + \frac{2}{r^2\sin^2\theta} \frac{d\Phi_0}{d\phi} \frac{\partial s_1}{\partial\phi} + \Delta s_0 = 0 .   (26.36)

We can separate variables also in s_1,

s_1 = R_1(r) + \Theta_1(\theta) + \Phi_1(\phi) .   (26.37)

Since \Delta s_0 contains also d^2\Phi_0/d\phi^2 = 0, we do not have any \phi dependence in the equation for s_1, which means that we can choose to put \Phi_1(\phi) = 0, in which case the equation becomes

2 \frac{dR_0}{dr} \frac{dR_1}{dr} + \frac{2}{r^2} \frac{d\Theta_0}{d\theta} \frac{d\Theta_1}{d\theta} + \frac{d^2 R_0}{dr^2} + \frac{2}{r} \frac{dR_0}{dr} + \frac{1}{r^2} \left( \frac{d^2\Theta_0}{d\theta^2} + \cot\theta\, \frac{d\Theta_0}{d\theta} \right) = 0 .   (26.38)

But the separation of variables gives the equations

2 \frac{dR_0}{dr} \frac{dR_1}{dr} + \frac{d^2 R_0}{dr^2} + \frac{2}{r} \frac{dR_0}{dr} = 0
2 \frac{d\Theta_0}{d\theta} \frac{d\Theta_1}{d\theta} + \frac{d^2\Theta_0}{d\theta^2} + \cot\theta\, \frac{d\Theta_0}{d\theta} = 0 ,   (26.39)
or equivalently (denoting by a prime the derivative with respect to the coordinate, no matter what that is)

\Theta_1'(\theta) + \frac{1}{2} \left( \frac{\Theta_0''(\theta)}{\Theta_0'(\theta)} + \cot\theta \right) = 0
R_1'(r) + \frac{1}{2} \left( \frac{R_0''(r)}{R_0'(r)} + \frac{2}{r} \right) = 0 .   (26.40)

The solutions of these equations are given by

\Theta_1(\theta) = {\rm const} - \frac{1}{2} \ln\left[ \Theta_0'(\theta) \sin\theta \right]
R_1(r) = {\rm const} - \frac{1}{2} \ln\left[ r^2 R_0'(r) \right] ,   (26.41)

which implies for the WKB wave function the solution

\psi(r, \theta, \phi) = \frac{e^{iR_0(r)/\hbar}}{r \sqrt{R_0'(r)}} \, \frac{e^{i\Theta_0(\theta)/\hbar}}{\sqrt{\Theta_0'(\theta) \sin\theta}} \, e^{iL_3\phi/\hbar} .   (26.42)

The requirement for 2π periodicity in the angle \phi means that we need to have

L_3 = n_\phi \hbar .   (26.43)

Substituting the integral formula for \Theta_0(\theta) into the WKB wave function, the \theta dependence in it is

\frac{1}{[L^2\sin^2\theta - L_3^2]^{1/4}} \exp\left( \pm\frac{i}{\hbar} \int d\theta \sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}} \right) .   (26.44)
Considering it as a WKB wave function in one dimension, for which we have the WKB quantization condition, or, equivalently, taking a Bohr–Sommerfeld quantization condition modified by the addition of 1/2, we obtain the condition

\int_{\theta_1}^{\theta_2} d\theta \sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}} = \left( n_\theta + \frac{1}{2} \right) \pi\hbar ,   (26.45)

where \theta_{1,2} are the turning points,

|\sin\theta_{1,2}| = \frac{|L_3|}{L} .   (26.46)
To obtain the quantization condition, we will use an integral that we can calculate:

I(a, b, c) = \int_{x_1}^{x_2} \frac{dx}{\sqrt{a + 2bx - c^2 x^2}} ,   (26.47)

where x_{1,2} are the zeroes of the square root in the denominator (the turning points). It can be rewritten by shifting the integration variable x in such a way that the endpoints x_{1,2} become symmetric with respect to the origin, namely \pm|d|/|c|, with d^2 = a + b^2/c^2 (obtained by completing the square),

I(a, b, c) = \int_{-|d|/|c|}^{+|d|/|c|} \frac{dx}{\sqrt{d^2 - c^2 x^2}} = \int_{-\pi/2}^{+\pi/2} d\theta \frac{|d/c| \cos\theta}{\sqrt{d^2(1 - \sin^2\theta)}} = \int_{-\pi/2}^{+\pi/2} \frac{d\theta}{|c|} = \frac{\pi}{|c|} ,   (26.48)

where we have used the change of variables x = |d/c| \sin\theta, meaning that the integral from -|d|/|c| to +|d|/|c| becomes an integral from -\pi/2 to +\pi/2.
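The formula can be verified numerically; in the sketch below the values a = 3, b = 1, c = 2 are arbitrary, and the midpoint rule is a simple (not optimal) choice for the integrable endpoint singularities:

```python
import math

def I_numeric(a, b, c, steps=400000):
    # midpoint-rule evaluation of int dx / sqrt(a + 2 b x - c^2 x^2) between the roots
    disc = math.sqrt(b * b + a * c * c)
    x1, x2 = (b - disc) / (c * c), (b + disc) / (c * c)
    h = (x2 - x1) / steps
    return h * sum(1.0 / math.sqrt(a + 2 * b * (x1 + (i + 0.5) * h)
                                   - c * c * (x1 + (i + 0.5) * h) ** 2)
                   for i in range(steps))

val = I_numeric(3.0, 1.0, 2.0)
print(val, math.pi / 2)        # the integral equals pi / |c|, independent of a and b
```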

Then we can calculate the integral in the quantization condition, as

\int_{\theta_1}^{\theta_2} d\theta \sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}} = \int_{\theta_1}^{\theta_2} \frac{L^2\, d\theta}{\sqrt{L^2 - L_3^2/\sin^2\theta}} - \int_{\theta_1}^{\theta_2} \frac{d\theta}{\sin^2\theta} \frac{L_3^2}{\sqrt{L^2 - L_3^2/\sin^2\theta}}
= -\int_{\theta_1}^{\theta_2} \frac{L^2\, d(\cos\theta)}{\sqrt{L^2 - L_3^2 - L^2\cos^2\theta}} + \int_{\theta_1}^{\theta_2} \frac{L_3^2\, d(\cot\theta)}{\sqrt{L^2 - L_3^2 - L_3^2\cot^2\theta}}
= \pi L - \pi|L_3| ,   (26.49)

where we have used d(\cot\theta) = -d\theta/\sin^2\theta, and then the formula (26.48) in both terms (i.e., for the terms in \cos\theta and in \cot\theta).
Finally then, the quantization condition is

L - |L_3| = \hbar \left( n_\theta + \frac{1}{2} \right) ,   (26.50)

and, since L_3 = n_\phi \hbar, we find

L = \hbar \left( n_\theta + |n_\phi| + \frac{1}{2} \right) \equiv \hbar \left( l + \frac{1}{2} \right) ,   (26.51)

where l \geq |n_\phi|. But then we have, in the WKB approximation,

L^2 = \hbar^2 \left( l + \frac{1}{2} \right)^2 ,   (26.52)

which can be compared with the exact solution,

L^2 = \hbar^2\, l(l+1) = \hbar^2 \left[ \left( l + \frac{1}{2} \right)^2 - \frac{1}{4} \right] .   (26.53)
Note that, strictly speaking, since \theta \in [0, \pi] we cannot use the previous WKB analysis, which was derived on the whole real line \mathbb{R}. But the condition we used was that the function does not blow up at \theta = 0 or \pi, and that is indeed true. Using the connection formulas at a turning point, for 0 \leq \theta \leq \theta_1, we have the angular wave function

\frac{1}{[L_3^2 - L^2\sin^2\theta]^{1/4}} \exp\left( -\frac{1}{\hbar} \int_\theta^{\theta_1} d\theta' \sqrt{\frac{L_3^2}{\sin^2\theta'} - L^2} \right) .   (26.54)
At \theta \to 0, this becomes

\frac{1}{\sqrt{|L_3|}} \exp\left( -\frac{|L_3|}{\hbar} \int \frac{d\theta}{\sin\theta} \right) = \frac{1}{\sqrt{|L_3|}} \exp\left( |n_\phi| \ln\tan\frac{\theta}{2} \right) = \frac{1}{\sqrt{|L_3|}} \left( \tan\frac{\theta}{2} \right)^{|n_\phi|} \to 0 ,   (26.55)

so it means we actually have the correct boundary conditions, despite this not being obvious a priori.

Radial Wave Function

Finally we move on to the radial wave function in \psi(r, \theta, \phi), which is

\frac{1}{r \left[ 2m(E - V(r)) - L^2/r^2 \right]^{1/4}} \exp\left( \frac{i}{\hbar} \int_{r_1}^{r} dr' \sqrt{2m(E - V(r')) - \frac{L^2}{r'^2}} \right) .   (26.56)

However, the radial quantization condition,

\int_{r_1}^{r_2} dr \sqrt{2m(E - V(r)) - \frac{L^2}{r^2}} = \pi\hbar \left( n_r + \frac{1}{2} \right) ,   (26.57)
depends on the potential V (r), so we must choose a potential in order to obtain a WKB result and
compare with the exact result.

26.5 Example: Coulomb Potential (Hydrogenoid Atom)

The potential for the hydrogenoid atom is the usual

V(r) = -\frac{Z e_0^2}{r} ,   (26.58)

where e_0^2 = e^2/(4\pi\epsilon_0). Also, the energy of bound states is negative, so E = -|E|. Then the quantization condition is written as

J = \int_{r_1}^{r_2} dr \sqrt{-2m|E| + \frac{2mZe_0^2}{r} - \frac{L^2}{r^2}} = \pi\hbar \left( n_r + \frac{1}{2} \right) ,   (26.59)
and the integral J is calculated by integrating by parts,

J = \left. r \sqrt{-2m|E| + \frac{2mZe_0^2}{r} - \frac{L^2}{r^2}} \right|_{r_1}^{r_2} - \int_{r_1}^{r_2} r\, dr\, \frac{-2mZe_0^2/r^2 + 2L^2/r^3}{2\sqrt{-2m|E| + 2mZe_0^2/r - L^2/r^2}}
= -\int_{r_1}^{r_2} dr\, \frac{-mZe_0^2/r + L^2/r^2}{\sqrt{-2m|E| + 2mZe_0^2/r - L^2/r^2}}
= \int_{r_1}^{r_2} dr\, \frac{mZe_0^2}{\sqrt{-L^2 + 2mZe_0^2 r - 2m|E| r^2}} - \int_{r_1}^{r_2} \frac{dr}{r^2}\, \frac{L^2}{\sqrt{-2m|E| + 2mZe_0^2/r - L^2/r^2}}   (26.60)
= \frac{mZe_0^2}{\sqrt{2m|E|}}\, \pi - L^2 \int_{z_2}^{z_1} \frac{dz}{\sqrt{-2m|E| + 2mZe_0^2 z - L^2 z^2}}
= \frac{mZe_0^2}{\sqrt{2m|E|}}\, \pi - \pi L ,
where in the fourth equality we have used the integral (26.48) in the first term and defined z = 1/r (so z_{1,2} = 1/r_{1,2}) in the second, and in the last line we have used the integral (26.48) in the second term. Then, the WKB quantization means that

J = \pi\hbar \left( n_r + \frac{1}{2} \right) ,   (26.61)
and, using the fact that L = \hbar(l + 1/2), as we have seen before, we obtain the eigenenergies

E = -|E| = -\frac{mZ^2 e_0^4}{2\hbar^2 (n_r + l + 1)^2} ,   (26.62)

which are in fact the exact eigenenergies (at all n = n_r + l + 1, independently of whether n is large or small).
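Putting SI numbers into (26.62) reproduces the familiar hydrogen spectrum, with the degeneracy in n = n_r + l + 1 manifest. The constants below are standard CODATA values, inserted here only for illustration:

```python
import math

hbar = 1.054571817e-34      # J s
m_e  = 9.1093837015e-31     # kg
e    = 1.602176634e-19      # C
eps0 = 8.8541878128e-12     # F / m

e0_sq = e ** 2 / (4 * math.pi * eps0)   # e_0^2 in the text's notation

def E_bound(n_r, l, Z=1):
    # eigenenergy (26.62); exact for the Coulomb potential
    n = n_r + l + 1
    return -m_e * Z ** 2 * e0_sq ** 2 / (2 * hbar ** 2 * n ** 2)

for n_r, l in [(0, 0), (1, 0), (0, 1)]:
    print(n_r, l, E_bound(n_r, l) / e)   # in eV: about -13.6, -3.4, -3.4
```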
For the wave functions, we have the same observation that we made for the \theta dependence: the radial direction covers only half the real line, 0 \leq r < \infty, so technically speaking the WKB analysis would not apply. Nevertheless, in the classically forbidden region 0 \leq r \leq r_1, we have the wave function

\psi(r) \simeq \frac{1}{r\, [L^2/r^2]^{1/4}} \exp\left( -\frac{L}{\hbar} \int_r^{r_1} \frac{dr'}{r'} \right) = \frac{1}{\sqrt{Lr}} \exp\left( -(l + 1/2) \ln\frac{r_1}{r} \right) = \frac{(r/r_1)^{l+1/2}}{\sqrt{Lr}} \sim r^l \to 0 ,   (26.63)
where we have used the fact that L/\hbar = l + 1/2. So, as before, this is the correct boundary condition at r = 0 (finiteness of the wave function), and the WKB analysis actually applies even though the coordinate does not extend to r = -\infty.

Important Concepts to Remember

• The Bohr–Sommerfeld quantization condition (the version modified by 1/2 from its original form) is obtained by applying the WKB approximation, and the resulting turning-point connection formulas, to a closed path, \oint_C = \int_{x_1}^{x_2} + \int_{x_2}^{x_1}. This results in \oint_C dx\, p(x) = \oint_C dq\, p = (n + 1/2)\, 2\pi\hbar = (n + 1/2) h, with n = 0, 1, 2, \ldots (compared with the original, for which \oint dq\, p = (n + 1) h).
• Applying Bohr–Sommerfeld quantization to some simple cases, we can either get an exact result
or obtain the exact result at large n (in the classical limit).
• For V (x) = k |x|, we get a wrong result for the energies only for n = 0; from n = 1 onward we get
less than 1% error.
• For the harmonic oscillator, we get the correct eigenenergies, but only the leading result at (small
x and) large x for the wave functions is correct.
• For the motion in a central potential, writing as usual \psi = e^{iW/\hbar} (where S(\vec r, t) = W(\vec r) - Et) and expanding W in powers of \hbar/i, we get the quantum-corrected Hamilton–Jacobi equation, which can be solved by the separation of variables, s_0(r, \theta, \phi) = R_0(r) + \Theta_0(\theta) + \Phi_0(\phi), s_1 = R_1(r) + \Theta_1(\theta) + \Phi_1(\phi), etc.
• One thus obtains a WKB solution (by directly solving for R_0, \Theta_0, \Phi_0 and then R_1, \Theta_1, \Phi_1 in terms of them); the Bohr–Sommerfeld quantization conditions for \theta \in [0, \pi] and r \in [0, \infty) (given that L_3 = n_\phi \hbar) lead to L = \hbar(l + 1/2) with l = n_\theta + |n_\phi| and another quantization condition, depending on V(r).
• For the hydrogenoid atom, the n_r quantization (for R(r)) gives the correct eigenenergies and the correct boundary conditions for the wave function at \theta = 0 and r = 0, though not the exact wave function.

Further Reading
See [2] for more details.

Exercises

(1) Consider the Lagrangian

L = \frac{\dot q^4}{4} + \frac{\alpha \dot q^2}{2} - \lambda q^4 .   (26.64)
Write down the explicit (improved) Bohr–Sommerfeld quantization condition for periodic
motion in this Lagrangian.
(2) For the potential V = k |x| in Bohr–Sommerfeld quantization, calculate the leading and first
subleading terms for the eigenfunctions at large x and small x.
(3) Apply the Bohr–Sommerfeld quantization method to a particle in a box (an infinitely deep square well) to find the eigenenergies and eigenfunctions, and compare with the exact results.
(4) For the wave function of the harmonic oscillator in the Bohr–Sommerfeld quantization method,
what happens in the classical, large-n, limit? Is there any sense in which we can consider that
the result matches the exact result?
(5) For the case of motion in a central potential, write down the equations for R2 and Θ2 .
(6) Calculate the Bohr–Sommerfeld quantization condition for R(r) for motion in the central
potential V = −α/r 2 .
(7) For the hydrogenoid atom in Bohr–Sommerfeld quantization, compare the wave function
as r → ∞ (the leading and first subleading terms) with the exact wave function. Do they
match? Is there any sense in which the wave function matches the exact one in the classical
limit n → ∞?
27 Dirac Quantization Condition and Magnetic Monopoles

In this chapter we consider a quantization condition found by Dirac that gives an argument for the
existence of magnetic monopoles. The quantization condition is found in several different ways,
including a semiclassical way. We also give an argument for the existence of magnetic monopoles
based on a duality symmetry of the Maxwell equations.

27.1 Dirac Monopoles from Maxwell Duality

The Maxwell equations in vacuum are

\vec\nabla \times \vec E = -\frac{\partial \vec B}{\partial t} , \quad \vec\nabla \cdot \vec E = 0
\vec\nabla \times \vec B = +\frac{1}{c^2} \frac{\partial \vec E}{\partial t} , \quad \vec\nabla \cdot \vec B = 0 ,   (27.1)

or, in relativistically covariant form, writing

E_i = E^i = F^{0i} = -F_{0i}
B_i = B^i = \frac{1}{2} \epsilon^{ijk} F_{jk} ,   (27.2)
the second and third Maxwell equations become the equation of motion

\partial_\mu F^{\mu\nu} = 0 ,   (27.3)

while the first and the fourth Maxwell equations become the Bianchi identities,

\partial_{[\mu} F_{\nu\rho]} = 0 ,   (27.4)

where, because of the total antisymmetrization, this is equivalent to

F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu ,   (27.5)

with A_\mu = (-\phi, \vec A) or A^\mu = (\phi, \vec A).



It can be seen that the Maxwell equations have a symmetry, called Maxwell duality, on making the exchanges

\vec E \to \vec B , \quad \vec B \to -\vec E ,   (27.6)

or, in relativistic form,

F_{\mu\nu} \to {\ast F}_{\mu\nu} \equiv \frac{1}{2} \epsilon_{\mu\nu\rho\sigma} F^{\rho\sigma} ,   (27.7)
302 27 Dirac Quantization Condition, Magnetic Monopoles

which takes \partial_\mu F^{\mu\nu} = 0 into \partial_{[\mu} F_{\nu\rho]} = 0. Note that \ast^2 = -1, so (as we can also see from the vector components) applying the duality symmetry twice gives -1, i.e., an overall sign flip of the fields.
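The statement \ast^2 = -1 can be made explicit with the Levi-Civita contraction identity in Minkowski signature, \epsilon_{\mu\nu\rho\sigma}\epsilon^{\rho\sigma\alpha\beta} = -2(\delta_\mu^{\;\alpha}\delta_\nu^{\;\beta} - \delta_\mu^{\;\beta}\delta_\nu^{\;\alpha}), where the minus sign traces back to \det\eta = -1:

```latex
(\ast\ast F)_{\mu\nu}
= \frac{1}{4}\,\epsilon_{\mu\nu\rho\sigma}\,\epsilon^{\rho\sigma\alpha\beta}F_{\alpha\beta}
= -\frac{1}{2}\left(\delta_\mu^{\;\alpha}\delta_\nu^{\;\beta}
                    - \delta_\mu^{\;\beta}\delta_\nu^{\;\alpha}\right)F_{\alpha\beta}
= -F_{\mu\nu}.
```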
Adding sources means adding terms on the right-hand side of the second and third Maxwell equations,

\vec\nabla \cdot \vec E = \frac{\rho}{\epsilon_0} , \quad \vec\nabla \times \vec B - \frac{1}{c^2} \frac{\partial \vec E}{\partial t} = \mu_0 \vec j ,   (27.8)

or, in relativistic notation, the equation of motion acquires a source term,

\partial_\mu F^{\mu\nu} + j^\nu = 0 .   (27.9)

Here the 4-current is

j^\mu = \left( \frac{\rho}{\epsilon_0}, \mu_0 \vec j \right) .   (27.10)
But now we have a problem, since there is no magnetic source, so Maxwell duality is broken. We note, however, that it would be fixed if we also introduce a magnetic 4-current, made up of a magnetic charge density and current,

k^\mu = \left( \mu_0 \rho_m, \frac{\vec j_m}{\epsilon_0} \right) .   (27.11)

Then the Bianchi identity has a source term,

\frac{1}{2} \epsilon^{\mu\nu\rho\sigma} \partial_\mu F_{\rho\sigma} + k^\nu = \partial_\mu {\ast F}^{\mu\nu} + k^\nu = 0 .   (27.12)

In terms of 3-vectors, we have the modified Maxwell equations

\vec\nabla \cdot \vec B = \mu_0 \rho_m , \quad \vec\nabla \times \vec E + \frac{\partial \vec B}{\partial t} = -\frac{\vec j_m}{\epsilon_0} .   (27.13)
Thus we again have Maxwell duality, if we extend it to act on sources as well, as

j^\mu \leftrightarrow k^\mu , \quad \rho_e \to \rho_m , \quad q = \int d^3x\, \rho_e \;\leftrightarrow\; g = \int d^3x\, \rho_m .   (27.14)

Considering a pointlike electric source, an electron with \rho_e = q\, \delta^3(\vec x), we have the Maxwell equation

\vec\nabla \cdot \vec E = \frac{q}{\epsilon_0} \delta^3(\vec x) .   (27.15)

But applying Maxwell duality to it, we find a pointlike magnetic source with \rho_m = g\, \delta^3(\vec x), and a Maxwell equation

\vec\nabla \cdot \vec B = \mu_0 g\, \delta^3(\vec x) .   (27.16)

Then, using a magnetic Gauss's law, namely integrating over a sphere, or rather a ball, centered on the charge, and converting the integral over the volume to an integral over the 2-sphere of radius R, S_R^2, we obtain

\int_{B_R} d^3x\, \vec\nabla \cdot \vec B = \oint_{S_R^2} \vec B \cdot d\vec S = \mu_0 g ,   (27.17)

and, since the \vec B field must be radial, we finally obtain the magnetic field of the pointlike magnetic source,

\vec B = \frac{\mu_0 g}{4\pi} \frac{\hat r}{r^2} .   (27.18)
This magnetic pointlike source is called a "Dirac magnetic monopole". Usually, a magnetic field has only a dipole mode, or higher (quadrupole, etc.); the reason is that we have not yet found a magnetic monopole in nature. If there were one, we would say that the magnetic field also starts at the monopole, just as an electric field starts at an electric charge.
In the presence of magnetic sources, we can also act with Maxwell duality on the Lorentz force law, in order to find the full law, at q \neq 0, g \neq 0. Since \vec B \to -\vec E, \vec E \to \vec B, q \to g, we find the general law

\vec F = q(\vec E + \vec v \times \vec B) + g(\vec B - \vec v \times \vec E) .   (27.19)

In relativistic notation, since F^{\mu\nu} \to {\ast F}^{\mu\nu}, we have

\frac{dp^\mu}{d\tau} = \left( q F^{\mu\nu} + g\, {\ast F}^{\mu\nu} \right) u_\nu .   (27.20)
We also note that, for a Dirac magnetic monopole, the energy density diverges at r \to 0, since the magnetic field has energy density

\mathcal{E} = \frac{\vec B^2}{2\mu_0} = \frac{g^2 \mu_0}{32\pi^2} \frac{1}{r^4} \to \infty .   (27.21)

Moreover, also the total energy of the magnetic field of the Dirac magnetic monopole diverges,

E = \int d^3x\, \mathcal{E} = \int 4\pi r^2 dr\, \mathcal{E} \sim \left. \frac{g^2\mu_0}{8\pi} \frac{1}{r} \right|_{r \to 0} \to \infty .   (27.22)
But this is just the same as in the case of the electron. For the electron, the particle is fundamental,
but there are quantum corrections to the field at r → 0 which cut off the divergence in E. For the
monopole, the particle is usually a large-r approximation of a nonabelian monopole (i.e., a ’t Hooft
monopole), which also cuts off the divergence in energy, though in a different way.

27.2 Dirac Quantization Condition from Semiclassical Nonrelativistic Considerations

The existence of the Dirac monopole implies, at the quantum level, an important quantization
condition known as the Dirac quantization condition. It can be derived in different ways, but (given
the analysis in the previous chapters) we will start with a semiclassical nonrelativistic method.

Consider a nonrelativistic system of an electric charge q in the magnetic field of a magnetic charge g. Indeed, the magnetic charge has the fields

\vec E_{\rm mon} = 0 , \quad \vec B_{\rm mon} = \frac{\mu_0 g}{4\pi} \frac{\hat r}{r^2} .   (27.23)

Then the Lorentz force on the electric charge is

\dot{\vec p} = m\ddot{\vec r} = q\, \dot{\vec r} \times \vec B = \frac{\mu_0 qg}{4\pi} \frac{\dot{\vec r} \times \vec r}{r^3} .   (27.24)
The time derivative of the orbital angular momentum is

\frac{d\vec L}{dt} = \frac{d}{dt}(m \vec r \times \dot{\vec r}) = m\, \vec r \times \ddot{\vec r} = \frac{\mu_0 qg}{4\pi r^3}\, \vec r \times (\dot{\vec r} \times \vec r) = \frac{d}{dt} \left( \frac{\mu_0 qg}{4\pi} \frac{\vec r}{r} \right) ,   (27.25)

where we have used

\frac{1}{r^3} [\vec r \times (\dot{\vec r} \times \vec r)]_i = \frac{1}{r^3} \epsilon_{ijk} x_j\, \epsilon_{klm} v_l x_m = \frac{1}{r^3} (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\, x_j v_l x_m
= \frac{v_i}{r} - \frac{x_i}{r^3} (\vec r \cdot \dot{\vec r}) = \left[ \frac{\dot{\vec r}}{r} - \frac{\vec r\, (\vec r \cdot \dot{\vec r})}{r^3} \right]_i = \frac{d}{dt}\left( \frac{\vec r}{r} \right)_i .   (27.26)
We finally obtain the conservation law

\frac{d}{dt} \left( \vec L - \frac{\mu_0 qg}{4\pi} \frac{\vec r}{r} \right) \equiv \frac{d\vec J}{dt} = 0 .   (27.27)
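The conservation law can also be checked by integrating the equation of motion directly; the sketch below uses illustrative units m = \mu_0 qg/4\pi = 1, arbitrary initial data, and a simple midpoint (RK2) integrator, none of which come from the text:

```python
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def accel(r, v):
    # m r'' = (mu0 q g / 4 pi) (r' x r) / r^3, in units m = mu0 q g / 4 pi = 1
    rn = math.sqrt(sum(x * x for x in r))
    return [c / rn ** 3 for c in cross(v, r)]

def J_vec(r, v):
    # J = L - rhat = (r x v) - r / |r|, as in the conservation law above
    rn = math.sqrt(sum(x * x for x in r))
    L = cross(r, v)
    return [L[i] - r[i] / rn for i in range(3)]

r, v, dt = [1.0, 0.0, 0.0], [0.0, 1.0, 0.3], 1e-4
J0 = J_vec(r, v)
for _ in range(100000):                      # midpoint (RK2) steps, up to t = 10
    a1 = accel(r, v)
    rm = [r[i] + 0.5 * dt * v[i] for i in range(3)]
    vm = [v[i] + 0.5 * dt * a1[i] for i in range(3)]
    a2 = accel(rm, vm)
    r = [r[i] + dt * vm[i] for i in range(3)]
    v = [v[i] + dt * a2[i] for i in range(3)]
drift = max(abs(a - b) for a, b in zip(J_vec(r, v), J0))
print(drift)                                 # tiny: J is conserved along the motion
```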
Since the first term is the orbital angular momentum, the full expression (the one that is conserved) must be a total angular momentum, which is why we called it \vec J. But the total angular momentum is quantized in units of \hbar/2 so, now considering the opposite regime, when \vec L = 0, we obtain the quantization condition

\frac{\mu_0 qg}{4\pi} = \frac{N\hbar}{2} \;\Rightarrow\; \mu_0 qg = 2\pi N\hbar = N h .   (27.28)
This is the Dirac quantization condition.
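Putting SI numbers into the condition (with q = e and N = 1; the constants below are standard CODATA values, used only for illustration) gives the minimal magnetic charge, and the well-known statement that the monopole coupling is strong, \alpha_m = 1/(4\alpha):

```python
import math

h    = 6.62607015e-34       # J s
hbar = h / (2 * math.pi)
e    = 1.602176634e-19      # C
c    = 299792458.0          # m / s
mu0  = 1.25663706212e-6     # N / A^2
eps0 = 1.0 / (mu0 * c ** 2)

g = h / (mu0 * e)           # minimal Dirac charge in this convention, ~3.3e-9 A m
alpha   = e ** 2 / (4 * math.pi * eps0 * hbar * c)    # ~1/137.036
alpha_m = mu0 * g ** 2 / (4 * math.pi * hbar * c)     # magnetic analogue
print(g, alpha_m * alpha)   # the product is exactly 1/4
```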
Note then that, if there is a single magnetic charge g anywhere in the Universe, the Dirac
quantization condition means that electric charge is quantized, which we know experimentally to
be true. The above argument is the only available theoretical explanation for the quantization of the
electric charge, which is a strong indirect argument for the existence of magnetic charges somewhere
in our Universe even if we have not observed them experimentally yet.

27.3 Contradiction with the Gauge Field

When we wrote down the duality-invariant Maxwell equations, with both electric and magnetic sources, by introducing the magnetic sources we obtained a contradiction with the existence of the gauge field A_\mu = (-\phi, \vec A). Indeed, since we define \vec B = \vec\nabla \times \vec A, we can use Gauss's law, or rather its generalization, Stokes' law, twice, and find

\mu_0 Q_m = \int_{M^3} d^3x\, \vec\nabla \cdot \vec B = \oint_{\partial M^3 = S^2} \vec B \cdot d\vec S = \oint_{S^2\, {\rm (closed)}} (\vec\nabla \times \vec A) \cdot d\vec S = \oint_{\partial S^2\, {\rm (closed)}} \vec A \cdot d\vec l = 0 .   (27.29)

Indeed, note that the 2-sphere S^2 is a closed surface, so it has no boundary, which means that the "line integral over the boundary" is zero. But this relation is a contradiction, since it implies that the magnetic charge must be zero.
In relativistically covariant notation the contradiction is easier to understand, since F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu means that \partial_{[\mu} F_{\nu\rho]} = 0, but this is in contradiction with the Bianchi identity with sources,

\frac{1}{2} \epsilon^{\mu\nu\rho\sigma} \partial_\mu F_{\rho\sigma} + k^\nu = 0 .   (27.30)
2
The solution of this contradiction is that F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu or, in the \phi = 0 gauge, \vec B = \vec\nabla \times \vec A, is valid only on patches. That is, it is valid only on open surfaces that do not intersect the magnetic charges, so it is not valid on the closed surfaces S^2 that surround the magnetic charge.

27.4 Patches and Magnetic Charge from Transition Functions

In order to use patches for the closed surface S^2, we divide it and similar closed surfaces into two overlapping patches O_\alpha and O_\beta, such that O_{\alpha\beta} = O_\alpha \cap O_\beta and O_\alpha \cup O_\beta = S^2, as in Fig. 27.1a. Then, we define A_\mu on each patch, i.e., A_\mu^{(\alpha)} and A_\mu^{(\beta)}, such that

F_{\mu\nu} = \partial_\mu A_\nu^{(\alpha)} - \partial_\nu A_\mu^{(\alpha)} = \partial_\mu A_\nu^{(\beta)} - \partial_\nu A_\mu^{(\beta)} .   (27.31)

But this means that on the common patch O_{\alpha\beta}, the two gauge fields must differ by a gauge transformation,

A_\mu^{(\alpha)} = A_\mu^{(\beta)} + \partial_\mu \lambda^{(\alpha\beta)} ,   (27.32)

where the gauge parameter \lambda^{(\alpha\beta)} is called a transition function.


In the A_0 = 0 gauge, we have

\vec B = \vec\nabla \times \vec A^{(\alpha)} = \vec\nabla \times \vec A^{(\beta)} ,   (27.33)

so that the vector potentials differ by a gauge transformation,

\vec A^{(\alpha)} = \vec A^{(\beta)} + \vec\nabla \lambda^{(\alpha\beta)} .   (27.34)

We can consider explicit patches as in Fig. 27.1c, i.e.,

(\alpha): S^2 minus the north pole, \theta = \pi
(\beta): S^2 minus the south pole, \theta = 0 .   (27.35)

In the gauge A_0 = 0, on the patch (\alpha) the vector potential is

\vec A^{(\alpha)} = \frac{\mu_0 g}{4\pi r} \frac{(-1 + \cos\theta)}{\sin\theta}\, \vec e_\phi ,   (27.36)

where \vec e_\phi is the unit vector in the direction of increasing \phi, in Cartesian coordinates:

\vec e_\phi = (-\sin\phi, \cos\phi, 0) .   (27.37)

The formula for \vec A^{(\alpha)} is singular at \sin\theta = 0 but \cos\theta \neq 1, which means at \theta = \pi. Thus, indeed, \vec A^{(\alpha)} is defined only on (\alpha).
306 27 Dirac Quantization Condition, Magnetic Monopoles

Figure 27.1 (a) Two overlapping patches O_\alpha and O_\beta for the 2-sphere S^2, with common patch O_{\alpha\beta}. (b) The case when the overlap O_{\alpha\beta} is an equator, C. (c) The case when O_\alpha = S^2 less the north pole, and O_\beta = S^2 less the south pole. The Dirac string singularity is a line going to infinity either at the north or the south pole.

We can check that the formula for \vec A^{(\alpha)} is correct, since we have, in spherical coordinates (note that we have replaced \theta with \pi - \theta with respect to the usual definition),

\vec\nabla \times \vec A = \frac{1}{r\sin\theta} \left[ -\frac{\partial}{\partial\theta}(A_\phi \sin\theta) + \frac{\partial A_\theta}{\partial\phi} \right] \hat r + \frac{1}{r} \left[ \frac{1}{\sin\theta} \frac{\partial A_r}{\partial\phi} - \frac{\partial}{\partial r}(r A_\phi) \right] \vec e_\theta
+ \frac{1}{r} \left[ \frac{\partial A_r}{\partial\theta} - \frac{\partial}{\partial r}(r A_\theta) \right] \vec e_\phi .   (27.38)
On the patch (\beta), the vector potential (again in the gauge A_0 = 0) is

\vec A^{(\beta)} = \frac{\mu_0 g}{4\pi r} \frac{(+1 + \cos\theta)}{\sin\theta}\, \vec e_\phi .   (27.39)

This potential is singular at \sin\theta = 0 but \cos\theta \neq -1, so at \theta = 0. Therefore it is indeed defined only on (\beta).
Then, moreover, the difference between the two potentials is a gauge transformation,

\vec A^{(\alpha)} - \vec A^{(\beta)} = -\frac{\mu_0 g}{2\pi} \frac{1}{r\sin\theta}\, \vec e_\phi = \vec\nabla \lambda^{(\alpha\beta)} ,   (27.40)

where the transition function is

\lambda^{(\alpha\beta)} = -\frac{\mu_0 g}{2\pi} \phi .   (27.41)

This is correct since in spherical coordinates

\vec\nabla = \hat r\, \frac{\partial}{\partial r} + \vec e_\theta\, \frac{1}{r} \frac{\partial}{\partial\theta} + \vec e_\phi\, \frac{1}{r\sin\theta} \frac{\partial}{\partial\phi} .   (27.42)

Then the gauge transformation \vec\nabla \lambda^{(\alpha\beta)} is single valued around the circle parametrized by \phi \in [0, 2\pi], but the gauge parameter \lambda^{(\alpha\beta)} is not. Indeed, around it we have

\lambda^{(\alpha\beta)}(\phi = 2\pi) - \lambda^{(\alpha\beta)}(\phi = 0) = -\mu_0 g .   (27.43)

In general, consider a common one-dimensional patch that is an equator: O_{\alpha\beta} = C, O_C^\alpha \cup O_C^\beta = S^2, and O_C^\alpha \cap O_C^\beta = C, as in Fig. 27.1b. Then we can again use Stokes' law (or the generalized Gauss's law) twice, but now on the sum of patches, so

\mu_0 Q_m = \oint_{S^2} \vec B \cdot d\vec S = \int_{O_C^\alpha} (\vec\nabla \times \vec A^{(\alpha)}) \cdot d\vec S + \int_{O_C^\beta} (\vec\nabla \times \vec A^{(\beta)}) \cdot d\vec S
= \oint_{\partial O_C^\alpha = +C} \vec A^{(\alpha)} \cdot d\vec l + \oint_{\partial O_C^\beta = -C} \vec A^{(\beta)} \cdot d\vec l = \oint_C (\vec A^{(\alpha)} - \vec A^{(\beta)}) \cdot d\vec l .   (27.44)

Here we have used the fact that the outward normal of O_C^\alpha is associated, via the right-hand rule, with one orientation of C, while the outward normal of O_C^\beta is associated with the opposite orientation of C; hence, when using the same orientation (i.e., the same integral) on C, we obtain the difference of the integrands.
Finally, we obtain that the magnetic charge is the integral of the gauge transformation, or the non-single-valuedness of the gauge parameter,

\mu_0 Q_m = \oint_C (\vec\nabla \lambda^{(\alpha\beta)}) \cdot d\vec l = \lambda^{(\alpha\beta)}(\phi = 2\pi) - \lambda^{(\alpha\beta)}(\phi = 0) .   (27.45)
C

27.5 Dirac Quantization from Topology and Wave Functions

There is an alternative derivation of the Dirac quantization condition that builds upon the formalism of patches described above.
If there are electrically charged particles in the system then we have complex fields, in particular complex wave functions, that are minimally coupled to the electromagnetic potential \vec A through the electric charge q, as we saw in Chapter 23,

\vec D\psi = \left( \vec\nabla - \frac{iq}{\hbar} \vec A \right) \psi .   (27.46)

But we saw that gauge transformations with parameters \lambda act on the wave functions as

\psi' = \psi \exp\left( -\frac{iq\lambda}{\hbar} \right) .   (27.47)
We must then impose that the gauge transformation on the wave function is single-valued around the circle C in Fig. 27.1, i.e., that

\exp\left( -\frac{iq}{\hbar} \lambda^{(\alpha\beta)}(\phi = 2\pi) \right) = \exp\left( -\frac{iq}{\hbar} \lambda^{(\alpha\beta)}(\phi = 0) \right) ,   (27.48)

which means that we must have

\frac{q}{\hbar} \left[ \lambda^{(\alpha\beta)}(\phi = 2\pi) - \lambda^{(\alpha\beta)}(\phi = 0) \right] = 2\pi n .   (27.49)

Substituting the non-single-valuedness of \lambda (for a magnetic charge Q_m = g), we obtain

\frac{2\pi n\hbar}{q} = \mu_0 g \;\Rightarrow\; \mu_0 gq = 2\pi n\hbar = hn ,   (27.50)

which is again the Dirac quantization condition.

27.6 Dirac String Singularity and Obtaining the Dirac Quantization Condition from It

There is yet another way to derive the Dirac quantization condition, which will expose another important feature in the presence of magnetic monopoles: the Dirac string.
For the Dirac monopole, we found that (on both patches) the vector potential \vec A is defined everywhere except on a line. In the case of \vec A^{(\alpha)}, there was a singularity at the north pole \theta = \pi on S^2, which is a "string" at \theta = \pi extending from the monopole (r = 0) to infinity (r = \infty). Similarly, \vec A^{(\beta)} was defined everywhere except on the "string" at the south pole \theta = 0, extending from the monopole to infinity; see Fig. 27.1c.
This singularity is known as the Dirac string singularity. We see that by gauge transformations (like that taking us between \vec A^{(\alpha)} and \vec A^{(\beta)}) we can move the Dirac string singularity around, which means that it is not physical.
Dirac’s interpretation of this singularity was that we can find a regular, i.e., nonsingular, magnetic
field B reg , that can be written in terms of an everywhere-defined gauge field (vector potential) A.  This
magnetic field is formed by the monopole field B mon plus the field of the Dirac string singularity,
string = μ0 gθ(−z)δ(x)δ(y) ẑ,
B (27.51)
so that
reg = B
B mon + B
string . (27.52)

The above field B string is for the (β) patch, when the singularity is on the south pole, from r = 0 to
r = ∞, i.e, at z < 0, hence we have the Heaviside function θ(−z). The regular field B reg obeys the
usual rules, so that

 ·B
reg = 0 ⇒ B
reg = ∇
 × A.
 (27.53)
The Dirac quantization condition now arises by imposing that the Aharonov–Bohm effect from a contour encircling the Dirac string is trivial, as it should be for an unphysical singularity: this means that the Dirac string must be unobservable.
Since the Dirac string is associated with a vector potential \vec A generating the delta function magnetic field \vec B_{\rm string}, we have

\oint_{C = \partial S} \vec A \cdot d\vec l = \int_S \vec B \cdot d\vec S \neq 0 ,   (27.54)

yet it must be unobservable in an Aharonov–Bohm effect. Therefore, when going around a circle encircling the Dirac string, the change in the wave function, its "monodromy",

\psi \to \exp\left( \frac{iq}{\hbar} \oint_{C = \partial S} \vec A \cdot d\vec l \right) \psi ,   (27.55)

must be trivial, i.e.,

\exp\left( \frac{iq}{\hbar} \oint_{C = \partial S} \vec A_{\rm string} \cdot d\vec l \right) = \exp\left( \frac{iq}{\hbar} \int_S \vec B_{\rm string} \cdot d\vec S \right) = 1 .   (27.56)

This means that e^{i\mu_0 qg/\hbar} = 1, so we obtain again the Dirac quantization condition,

\mu_0 qg = 2\pi N\hbar = N h .   (27.57)

Important Concepts to Remember

• Maxwell duality refers to the invariance of the Maxwell equations without sources under \vec E \to \vec B, \vec B \to -\vec E. If we add a magnetic current source k^\mu = (\mu_0 \rho_m, \vec j_m/\epsilon_0), similar to the electric current source j^\mu = (\rho_e/\epsilon_0, \mu_0 \vec j_e), we can have Maxwell duality even with sources.
• A magnetic current source of delta function type (like the electron in relation to the electric current) gives a Dirac magnetic monopole,

\vec B = \frac{\mu_0 g}{4\pi} \frac{\hat r}{r^2} .

• We can obtain the Dirac quantization condition \mu_0 qg = 2\pi N\hbar = hN from the conservation of the total angular momentum, such that N is an angular momentum quantum number.
• If there is a single magnetic monopole with charge g in the Universe, we obtain the quantization
of electric charge, which is an experimental fact but otherwise theoretically unexplained. This is
good indirect evidence for the existence of magnetic monopoles.
• In the presence of magnetic charges, Fμν = ∂μ Aν − ∂ν Aμ cannot be valid everywhere (in particular,
not on a closed surface enclosing the charge), but must be true only on patches.
• The magnetic charge is equal to the monodromy on C (the difference in the field when going around
a closed loop) of the transition function (the gauge parameter for a transformation) between two
patches with C as a common loop.
• Dirac quantization can also be obtained from the single-valuedness of the gauge transformation
defined by the transition function on the wave function minimally coupled to the electromagnetic
potential.
• The Dirac string singularity is a spurious singularity in the vector potential \vec A of a monopole, starting at the monopole and going to infinity. It can be moved around by gauge transformations. The Dirac quantization condition also arises from the condition that the Aharonov–Bohm effect from a contour encircling the Dirac string is unobservable (trivial).

Further Reading
See [7] for more details.

Exercises

(1) Can Maxwell duality be derived from the classical action for electromagnetism (plus electric
and magnetic sources)? Is there a simple way to generalize this duality to the quantum theory, if
there are particles minimally coupled to the electromagnetic fields? Explain.
(2) Consider an electron and a monopole at the same point, at a distance R from a perfectly conducting infinite plane. What are \vec E and \vec B on the plane at the minimum-distance point from the charges?
(3) In the presence of so-called dyons, particles that carry both electric and magnetic charges, the Dirac quantization condition for a particle with charges (q_1, g_1) and another particle with charges (q_2, g_2) is generalized to the Dirac–Schwinger–Zwanziger quantization condition

q_1 g_2 - q_2 g_1 = 2\pi N \hbar , \quad N \in \mathbb{N} .   (27.58)

Prove this relation using a generalization of the argument for the quantization of the total angular momentum of a system of two particles.
(4) Consider two dyons satisfying the Dirac–Schwinger–Zwanziger quantization condition given in
exercise 3, the minimum value for the integer, N = 1, occurring when one of the dyons has
q1 = e, g1 = h/e. Calculate the total relative force between the dyons.
(5) Suppose that the magnetic field for a particle at the point \vec 0 is a delta function,

\vec B(\vec x) = \frac{\hbar}{e} \delta(\vec x) .   (27.59)

Take two such particles and rotate one around the other. Calculate the Aharonov–Bohm phase \exp\left( \frac{iq}{\hbar} \oint_C \vec A \cdot d\vec l \right) of the moving particle. Since the particles are identical, how would you interpret this result?
(6) Consider two monopoles of opposite charge (monopole and antimonopole) situated at a distance
2R from each other, and a sphere of radius 2R centered on the midpoint between the monopole
and antimonopole. Is there a unique vector potential A  that is valid over the whole sphere?
(7) Suppose in our Universe there is only a monopole and antimonopole pair, situated at a distance d from each other that is much smaller than the distance to any other atom in the Universe. Can we still infer that the electron charge is quantized? Explain.
28 Path Integrals II: Imaginary Time and Fermionic Path Integral

In this chapter we will extend the quantum mechanical formalism of path integrals, in order to
describe finite temperature, and fermions. We will first study path integrals in Euclidean space,
which, as we will see, are better defined than those in Minkowski space, and also give their relation
to statistical mechanics at finite temperature. Moreover, the usual case is defined as a limit of the
Euclidean space path integral. Finally, we will also define path integrals for fermionic variables.

28.1 The Forced Harmonic Oscillator

In Chapter 10, we obtained the path integral for the transition amplitude (for the propagator)

M(q', t'; q, t) \equiv {}_H\langle q', t'|q, t\rangle_H = \langle q'| \exp\left( -\frac{i}{\hbar} \hat H (t' - t) \right) |q\rangle
= \int \mathcal{D}p(t)\, \mathcal{D}q(t) \exp\left( \frac{i}{\hbar} \int_{t_i}^{t_f} dt\, [p(t)\dot q(t) - H(q(t), p(t))] \right)   (28.1)

and, for a Hamiltonian quadratic in momenta, H = p^2/2 + V(q), we obtained the path integral in terms of the Lagrangian,

M(q', t'; q, t) = N \int \mathcal{D}q \exp\left( \frac{i}{\hbar} S[q] \right) .   (28.2)
We also saw that we can define N-point correlation functions, and calculate them in terms of path integrals, as
$$G_N(\bar t_1,\ldots,\bar t_N) = {}_H\langle q',t'|T\{\hat q(\bar t_1)\cdots\hat q(\bar t_N)\}|q,t\rangle_H = \int_{q(t)=q,\,q(t')=q'}\mathcal{D}q(t)\,\exp\left(\frac{i}{\hbar}S[q]\right)q(\bar t_1)\cdots q(\bar t_N), \qquad (28.3)$$
where we have the boundary conditions $q(t)=q$, $q(t')=q'$. These correlation functions are obtained from their generating functional, which is
$$Z[J] = \int_{q(t)=q,\,q(t')=q'}\mathcal{D}q\,\exp\left(\frac{i}{\hbar}S[q;J]\right) = \int_{q(t)=q,\,q(t')=q'}\mathcal{D}q\,\exp\left(\frac{i}{\hbar}S[q] + \frac{i}{\hbar}\int dt\,J(t)q(t)\right), \qquad (28.4)$$
by means of multiple derivatives at zero source,
$$G_N(\bar t_1,\ldots,\bar t_N) = \left.\frac{\hbar}{i}\frac{\delta}{\delta J(\bar t_1)}\cdots\frac{\hbar}{i}\frac{\delta}{\delta J(\bar t_N)}\,Z[J]\right|_{J=0}. \qquad (28.5)$$
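In zero dimensions, the generating-functional machinery of (28.4) and (28.5) collapses to an ordinary integral, which makes it easy to check that derivatives with respect to the source produce correlation functions. A minimal sketch (the Euclidean-style Gaussian weight and all names here are my own illustrative choices, not the book's):

```python
import numpy as np

# Zero-dimensional toy generating functional: Z[J] = \int dq e^{-q^2/2 + J q}.
# Differentiating twice with respect to J at J = 0 gives the "two-point function"
# <q^2>, mirroring the functional derivatives in (28.5).

def Z(J):
    q = np.linspace(-10.0, 10.0, 200001)
    return np.trapz(np.exp(-0.5 * q**2 + J * q), q)

h = 1e-3  # step for a central finite difference in the source J
G2 = (Z(h) - 2.0 * Z(0.0) + Z(-h)) / h**2 / Z(0.0)
print(G2)  # close to <q^2> = 1 for the unit Gaussian
```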

But, while Z[J], called the partition function, is just a generating functional (a mathematical construct), we note that J(t) can be interpreted as a source term for the classical variable q(t). In the case of a harmonic oscillator, this would be an external driving force, giving rise to a forced (driven) harmonic oscillator, as seen in the equation of motion,
$$0 = \frac{\delta S[q;J]}{\delta q(t)} = \frac{\delta S[q]}{\delta q(t)} + J(t), \qquad (28.6)$$
so Z[J] and its derivatives make sense also at nonzero J (t).
The action of the driven harmonic oscillator is then
$$S[q;J] = \int dt\left[\frac{1}{2}\dot q^2 - \frac{\omega^2}{2}q^2 + J(t)q(t)\right]. \qquad (28.7)$$
In this case, the path integral is still Gaussian, since we are just adding a linear term to the quadratic one. But, in order to calculate it, we still have one issue: we need to understand the boundary conditions for q(t). In the absence of boundary terms, we can partially integrate and obtain
$$S[q;J] = \int dt\left[-\frac{1}{2}q(t)\left(\frac{d^2}{dt^2}+\omega^2\right)q(t) + J(t)q(t)\right]. \qquad (28.8)$$
This is schematically of the type
$$\frac{i}{\hbar}S = -\frac{1}{2\hbar}\,q\cdot\Delta^{-1}\cdot q + \frac{i}{\hbar}\,J\cdot q, \qquad (28.9)$$
where
$$\Delta^{-1}q(t) \equiv i\left(\frac{d^2}{dt^2}+\omega^2\right)q(t). \qquad (28.10)$$
In this case, the Gaussian path integral can be calculated as
$$Z[J] = N\int\mathcal{D}q\,\exp\left(-\frac{1}{2\hbar}\,q\cdot\Delta^{-1}\cdot q + \frac{i}{\hbar}\,J\cdot q\right) \equiv \int d^nx\,\exp\left(-\frac{1}{2}x^T\cdot A\cdot x + b^T\cdot x\right) = (2\pi)^{n/2}(\det A)^{-1/2}\exp\left(\frac{1}{2}b^T\cdot A^{-1}\cdot b\right), \qquad (28.11)$$
where $b = \frac{i}{\hbar}J(t)$ and $A = \Delta^{-1}/\hbar$, so we finally obtain
$$Z[J] = N'\exp\left(-\frac{1}{2\hbar}\,J\cdot\Delta\cdot J\right), \qquad (28.12)$$
where $N'$ contains constants (expressions independent of J), including $(\det\Delta)^{-1/2}$.
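The finite-dimensional Gaussian formula quoted in (28.11) is easy to verify numerically; here is a quick sketch for n = 2, with an arbitrarily chosen positive definite A and real b (the specific numbers are purely for illustration):

```python
import numpy as np

# Verify  \int d^n x exp(-x^T A x / 2 + b^T x)
#        = (2 pi)^{n/2} (det A)^{-1/2} exp(b^T A^{-1} b / 2)   for n = 2.
A = np.array([[2.0, 0.5], [0.5, 1.0]])   # positive definite
b = np.array([0.3, -0.2])

x = np.linspace(-10, 10, 1201)
X, Y = np.meshgrid(x, x, indexing="ij")
quad_form = A[0, 0]*X**2 + 2*A[0, 1]*X*Y + A[1, 1]*Y**2
integrand = np.exp(-0.5 * quad_form + b[0]*X + b[1]*Y)
lhs = np.trapz(np.trapz(integrand, x, axis=1), x)

rhs = (2*np.pi) / np.sqrt(np.linalg.det(A)) * np.exp(0.5 * b @ np.linalg.solve(A, b))
print(lhs, rhs)
```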


In the above, Δ is called the propagator, which seems unrelated to the previous notion of a propagator, as $U(t,t')$ relating $|\psi(t')\rangle$ to $|\psi(t)\rangle$. In the case of the driven harmonic oscillator, the propagator is actually related to the two-point correlation function since then, from (28.12), we have
$$G_2(t_1,t_2) = -\hbar^2\left.\frac{\delta^2 Z[J]}{\delta J(t_1)\delta J(t_2)}\right|_{J=0} = \hbar\left.\frac{\delta}{\delta J(t_1)}\left[(J\cdot\Delta)(t_2)\exp\left(-\frac{1}{2\hbar}J\cdot\Delta\cdot J\right)\right]\right|_{J=0} = \hbar\,\Delta(t_1,t_2). \qquad (28.13)$$
Also from (28.12), we find
$$\Delta(t,t') = i\int\frac{dp}{2\pi}\,\frac{e^{-ip(t-t')}}{p^2-\omega^2}, \qquad (28.14)$$
since then
$$\Delta^{-1}\Delta(t,t') = i\left(\frac{d^2}{dt^2}+\omega^2\right)i\int\frac{dp}{2\pi}\,\frac{e^{-ip(t-t')}}{p^2-\omega^2} = -\int\frac{dp}{2\pi}\,\frac{-p^2+\omega^2}{p^2-\omega^2}\,e^{-ip(t-t')} = \int\frac{dp}{2\pi}\,e^{-ip(t-t')} = \delta(t-t'). \qquad (28.15)$$
However, the expression we have found for Δ is ill defined, since we have a singularity at $p^2=\omega^2$, where the denominator vanishes, and we must somehow avoid this singularity. Its avoidance is related to the question of how to invert $\Delta^{-1}$: this depends, as for any operator, on the space of functions on which the operator acts. In quantum mechanics this is the Hilbert space of states, which in this continuous case must be better defined. The relevant issue is that the Hilbert space for $\Delta^{-1}$ has zero modes, for which
$$\Delta^{-1}q_0(t) = i\left(\frac{d^2}{dt^2}+\omega^2\right)q_0(t) = 0. \qquad (28.16)$$
Having zero modes (eigenstates of zero eigenvalue), the operator on the full Hilbert space is not invertible (think of a matrix with zero eigenvalues, therefore with zero determinant). Moreover, these are not some pathological states but rather are the solutions of the classical equations of motion of the harmonic oscillator. To obtain an invertible operator, we must therefore find a way to exclude these classical states from the Hilbert space, by imposing some boundary conditions that contradict them.
We will find that the correct result is described in terms of an integral over slightly complex momenta p, avoiding the pole slightly in the complex plane, according to the formula
$$\Delta_F(t,t') = i\int\frac{dp}{2\pi}\,\frac{e^{-ip(t-t')}}{p^2-\omega^2+i\epsilon}. \qquad (28.17)$$
Here F stands for Feynman; this is the Feynman propagator.
If an operator A has eigenstates $|q\rangle$ with eigenvalues $a_q$,
$$A|q\rangle = a_q|q\rangle, \qquad (28.18)$$
and the states are orthonormal,
$$\langle q|q'\rangle = \delta_{qq'}, \qquad (28.19)$$
then the operator can be written as
$$A = \sum_q a_q\,|q\rangle\langle q|. \qquad (28.20)$$
Thus the inverse operator is
$$A^{-1} = \sum_q \frac{1}{a_q}\,|q\rangle\langle q|. \qquad (28.21)$$
In our case, if $\Delta_F^{-1}$ has orthonormal eigenfunctions $\{q_i(t)\}$,
$$\int dt\,q_i^*(t)q_j(t) = \delta_{ij}, \qquad (28.22)$$
with eigenvalues $\lambda_i$, then
$$\Delta_F^{-1}(t,t') = \sum_i \lambda_i\,q_i(t)q_i^*(t'), \qquad (28.23)$$
so the inverse is
$$\Delta_F(t,t') = \sum_i \frac{1}{\lambda_i}\,q_i(t)q_i^*(t'). \qquad (28.24)$$
In order to get (28.17), we can make the identifications $q_i(t)\sim e^{-ipt}$, $\sum_i \sim \int dp/2\pi$, and $\lambda_i\sim (p^2-\omega^2+i\epsilon)/i$.
We can be more precise than the above, however. Since in $\Delta_F(t,t')$ there are poles in the complex plane at $p = \pm(\omega-i\epsilon)$, we can calculate the integral with the residue theorem in the complex plane. In order to do that, we must form a closed contour from the integral over the real line, by adding a piece of contour whose integral vanishes. Such a contour is a semicircle at infinity, provided the factor $e^{-ip(t-t')}$ is exponentially small.
Thus if $t > t'$, by considering $\mathrm{Im}(p) < 0$, we obtain a factor that vanishes exponentially, $e^{-|\mathrm{Im}(p)|(t-t')}\to 0$, on the semicircle at infinity in the lower half of the complex plane. We can thus add, without affecting the result, the integral over this contour, obtaining a closed total contour C such that the pole $p_1 = +(\omega-i\epsilon)$ is inside C, as in Fig. 28.1. By the residue theorem, we obtain
$$\Delta_F(t,t') = \frac{1}{2\omega}\,e^{-i\omega(t-t')}. \qquad (28.25)$$
Figure 28.1: The contour for the Feynman propagator avoids −ω from below and +ω from above. It is closed in the lower half-plane (see the dotted contour "at infinity") for t > t'.

Similarly, if $t < t'$, by considering $\mathrm{Im}(p) > 0$, we obtain an exponentially vanishing factor $e^{-|\mathrm{Im}(p)|(t'-t)}\to 0$ on the semicircle at infinity in the upper half of the complex plane. Adding again, without affecting the result, the integral over this contour, we obtain a closed total contour C that encircles the pole $p_2 = -(\omega-i\epsilon)$, so by the residue theorem we get
$$\Delta_F(t,t') = \frac{1}{2\omega}\,e^{-i\omega(t'-t)}. \qquad (28.26)$$
Putting together the two cases, we have, at general t, t',
$$\Delta_F(t,t') = \frac{1}{2\omega}\,e^{-i\omega|t-t'|}. \qquad (28.27)$$

Now we can calculate the boundary conditions, since from (28.24) we see that ΔF has the same
boundary conditions for t as q(t). Since at t → +∞, ΔF ∼ e−iωt and at t → −∞, ΔF ∼ e+iωt , it
follows that, in order to obtain (28.17) we need to impose the boundary conditions
q(t) ∼ e−iωt , t → +∞
(28.28)
q(t) ∼ e+iωt , t → −∞.
We can use these boundary conditions to calculate the Gaussian path integral more precisely, and find the boundary-term correction to the partition function,
$$Z[J] = N'\exp\left(-\frac{1}{2\hbar}\,J\cdot\Delta\cdot J + \frac{i}{\hbar}\,q_{\rm cl}(t)\cdot J\right), \qquad (28.29)$$
though we will not show the details. The same result can also be obtained, a bit more rigorously,
from the harmonic phase-space path integral of Chapter 10. Moreover, the harmonic phase-space
path integral gives the same boundary conditions for the functions qi (t).

28.2 Wick Rotation to Euclidean Time and Connection with the Statistical Mechanics Partition Function

We have shown above how to calculate the partition function in the case of a quadratic action, by doing a kind of Gaussian integral. However, the calculation is not very rigorous, since the Gaussian is of an imaginary kind, while the result is strictly speaking only correct in the real case, $\int_{-\infty}^{+\infty}e^{-\alpha x^2}$. Indeed, in the case of an imaginary integral, $\int dx\,e^{-i\alpha x^2}$, the integrand oscillates rapidly, which makes it impossible for it to converge at infinity. If we take a cut-off for the integral at a large Λ, the difference between $\int_{-\Lambda}^{+\Lambda}dx\,e^{-i\alpha x^2}$ and $\int_{-\Lambda-C}^{+\Lambda+C}dx\,e^{-i\alpha x^2}$ remains finite and oscillatory for finite C, so we cannot even define the limit properly when $\Lambda\to\infty$.
A simple solution would be to replace $e^{iS/\hbar}$ with $e^{-S/\hbar}$, in which case the exponential at infinity would be decaying instead of oscillatory, and we could define the integral $\int_{-\infty}^{+\infty}dx$ without any problem. This is what happens when we go to Euclidean time (i.e., work in Euclidean spacetime). For a Hamiltonian $\hat H$ that is independent of time and has a complete set $\{|n\rangle\}$ of eigenstates (so that $1 = \sum_n|n\rangle\langle n|$), with positive eigenenergies $E_n > 0$, we can write
$${}_H\langle q',t'|q,t\rangle_H = \langle q'|e^{-i\hat H(t'-t)/\hbar}|q\rangle = \sum_n\sum_m\langle q'|n\rangle\langle n|e^{-i\hat H(t'-t)/\hbar}|m\rangle\langle m|q\rangle = \sum_n\langle q'|n\rangle\langle n|q\rangle\,e^{-iE_n(t'-t)/\hbar} = \sum_n\psi_n(q')\psi_n^*(q)\,e^{-iE_n(t'-t)/\hbar}. \qquad (28.30)$$
In the second line, we have used the fact that the matrix element on energy eigenstates is
$$\langle n|e^{-i\hat H(t'-t)/\hbar}|m\rangle = \delta_{nm}\,e^{-iE_n(t'-t)/\hbar}, \qquad (28.31)$$
as well as the definition $\psi_n(q)\equiv\langle q|n\rangle$.


At this point, we can make the required analytical continuation to Euclidean time, also known as a "Wick rotation", where we just make the replacement $\Delta t\to -i\hbar\beta$. We then obtain
$$\langle q',\beta|q,0\rangle = \sum_n \psi_n(q')\psi_n^*(q)\,e^{-\beta E_n}. \qquad (28.32)$$
If moreover $q = q'$ and we integrate over it, we obtain an expression for the statistical mechanics partition function of a quantum system at temperature T, with $k_BT = 1/\beta$. Indeed, then
$$\int dq\,\langle q,\beta|q,0\rangle = \int dq\sum_n|\psi_n(q)|^2\,e^{-\beta E_n} = \mathrm{Tr}[e^{-\beta\hat H}] \equiv Z[\beta], \qquad (28.33)$$
where we sum over the Boltzmann factors $e^{-\beta E_n}$, weighted by the probability density $|\psi_n(q)|^2$ of the state, and then sum over q and n. The procedure above implies that we are considering closed (periodic) paths in Euclidean time, of period $\hbar\beta = \hbar/k_BT$, with
$$q' = q(t_E=\hbar\beta) = q(t_E=0) = q. \qquad (28.34)$$

For a Lagrangian in the usual (Minkowski) spacetime, with the canonical kinetic term,
$$L(q,\dot q) = \frac{1}{2}\left(\frac{dq}{dt}\right)^2 - V(q), \qquad (28.35)$$
the action can be rewritten in terms of the Euclidean time $t_E$ ($t = -it_E$) as
$$iS[q] = i\int_0^{t_E=\hbar\beta}(-i\,dt_E)\left[\frac{1}{2}\left(\frac{dq}{d(-it_E)}\right)^2 - V(q)\right] \equiv -S_E[q], \qquad (28.36)$$
where we have defined the Euclidean action $S_E[q]$, obtaining
$$S_E[q] = \int_0^{\hbar\beta}dt_E\left[\frac{1}{2}\left(\frac{dq}{dt_E}\right)^2 + V(q)\right] = \int_0^{\hbar\beta}dt_E\,L_E(q,\dot q). \qquad (28.37)$$

In this way, we obtain the Feynman–Kac formula, relating the statistical mechanics partition function to the partition function from the path integral in Euclidean space,
$$Z(\beta) = \mathrm{Tr}[e^{-\beta\hat H}] = \int_{q(t_E+\hbar\beta)=q(t_E)}\mathcal{D}q\,\exp\left(-\frac{1}{\hbar}S_E[q]\right). \qquad (28.38)$$

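The Feynman–Kac formula can be tested on the (unforced) harmonic oscillator, where both sides are computable: with ℏ = 1, the trace gives $Z(\beta) = \sum_n e^{-\beta(n+1/2)\omega} = 1/(2\sinh(\beta\omega/2))$, while the periodic Euclidean path integral can be discretized into an ordinary Gaussian integral. A sketch of this check (the discretization and normalization conventions are my own illustrative choices):

```python
import numpy as np

# Discretize S_E = \int_0^beta dt_E [ (dq/dt_E)^2/2 + omega^2 q^2/2 ]  (hbar = m = 1)
# on N periodic sites, q_N = q_0, lattice spacing a = beta/N.  With the free-particle
# measure prod_k dq_k / sqrt(2 pi a), the Gaussian integral is Z = det(a M)^{-1/2},
# where M is the cyclic "kinetic + mass" matrix of the quadratic form S_E = q^T M q / 2.
beta, omega, N = 2.0, 1.0, 200
a = beta / N

M = np.zeros((N, N))
for k in range(N):
    M[k, k] = 2.0 / a + a * omega**2
    M[k, (k + 1) % N] = M[k, (k - 1) % N] = -1.0 / a

sign, logdet = np.linalg.slogdet(a * M)   # slogdet avoids overflow for large N
Z_lattice = np.exp(-0.5 * logdet)
Z_exact = 1.0 / (2.0 * np.sinh(beta * omega / 2.0))
print(Z_lattice, Z_exact)
```

The discretization error is O(a²), so already at N = 200 the two numbers agree to several digits.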
We can also introduce nonzero sources J(t) into the Euclidean-time partition function,
$$Z[\beta,J] = \int\mathcal{D}q\,\exp\left(-\frac{1}{\hbar}S_E[q] + \frac{1}{\hbar}\int_0^{\hbar\beta}d\tau\,J_E(\tau)q_E(\tau)\right), \qquad (28.39)$$
where the source term is the analytical continuation of the Minkowski-time source term,
$$i\int dt\,J(t)q(t) = i\int d(-it_E)\,J(-it_E)q(-it_E) = \int dt_E\,J_E(t_E)q_E(t_E). \qquad (28.40)$$
From the partition function (28.39) we can calculate correlation functions as before, for instance the two-point function,
$$\frac{\hbar^2}{Z(\beta)}\left.\frac{\delta^2 Z[\beta;J]}{\delta J(\tau_1)\delta J(\tau_2)}\right|_{J=0} = \frac{1}{Z(\beta)}\int\mathcal{D}q(\tau)\,q(\tau_1)q(\tau_2)\,e^{-S_E(\beta)/\hbar}, \qquad (28.41)$$
which is now also a statistical mechanics trace,
$$\frac{1}{Z(\beta)}\,\mathrm{Tr}\left[e^{-\beta\hat H}\,T\big(\hat q(-i\tau_1)\hat q(-i\tau_2)\big)\right], \qquad (28.42)$$
where the Heisenberg operators in Minkowski time become, under the analytical continuation to Euclidean time,
$$\hat q(t) = e^{i\hat Ht/\hbar}\,\hat q\,e^{-i\hat Ht/\hbar} \;\Rightarrow\; \hat q(-i\tau) = e^{\hat H\tau/\hbar}\,\hat q\,e^{-\hat H\tau/\hbar}. \qquad (28.43)$$
However, the path integrals are still taken over periodic Euclidean paths.
To go back to the regular case in Minkowski space from the Euclidean formulation, we must both undo the Wick rotation and take the limit β → ∞ (infinite periodicity), corresponding to T → 0 in statistical mechanics (zero temperature). In this limit, only the vacuum contribution, with energy $E_0$, remains inside the trace, as
$$\sum_n\psi_n(q')\psi_n^*(q)\,e^{-\beta E_n} \to \psi_0(q')\psi_0^*(q)\,e^{-\beta E_0}. \qquad (28.44)$$

Harmonic Oscillator Example


We will apply the Euclidean-time formalism to the simplest example: the forced harmonic oscillator. Its Euclidean partition function is
$$Z_E[J] = \int\mathcal{D}q\,\exp\left\{-\frac{1}{2\hbar}\int dt\left[\left(\frac{dq}{dt}\right)^2+\omega^2q^2\right] + \frac{1}{\hbar}\int dt\,J(t)q(t)\right\}$$
$$= \int\mathcal{D}q\,\exp\left\{-\frac{1}{2\hbar}\int dt\,q(t)\left(-\frac{d^2}{dt^2}+\omega^2\right)q(t) + \frac{1}{\hbar}\int dt\,J(t)q(t)\right\} = N\exp\left\{\frac{1}{2\hbar}\int dt\int dt'\,J(t)\Delta_E(t,t')J(t')\right\}, \qquad (28.45)$$
where in the second line we have used partial integration, now without boundary terms, since the Euclidean-time path integral is over periodic paths (without boundary), and in the third line the Gaussian integration is now well defined, as it is a real integral, of the type $\int dx\,e^{-S(x)}$, and not an oscillatory imaginary one. The Euclidean-time propagator is defined as before through a Fourier transform in (Euclidean) energy $E_E$,
$$\Delta_E(t,t') = \left(-\frac{d^2}{dt^2}+\omega^2\right)^{-1}(t,t') = \int\frac{dE_E}{2\pi}\,\frac{e^{-iE_E(t-t')}}{E_E^2+\omega^2}. \qquad (28.46)$$

We finally note that there are no poles in this expression since, for real $E_E$, $E_E^2+\omega^2 > 0$. We have thus solved all three problems that we found with the result in Minkowski space: there are no boundary terms, the Gaussian integration is well defined, and there are no poles in the expression for the propagator.
When Wick-rotating back to Minkowski time, via $Et = E_Et_E$, so that (since $t = -it_E$) $E_E = -iE$, we can consider the procedure as a rotation in the complex E plane by π/2. However, for a Wick rotation by the full π/2 ($E_E = -iE$), we would be back to having poles in the integrand for $\Delta(t,t')$, since $E_E^2+\omega^2 \to -E^2+\omega^2$. To avoid obtaining a pole, we must instead do a rotation by $\pi/2-\epsilon$, so
$$E_E \to e^{-i(\pi/2-\epsilon)}E = -i(E+i\epsilon'). \qquad (28.47)$$
Then the propagator becomes
$$\Delta_E(t_E=it) = -i\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\,\frac{e^{-iEt}}{-E^2+\omega^2-i\epsilon} = \Delta_F(t), \qquad (28.48)$$
so it turns exactly into the Feynman propagator from before.

28.3 Fermionic Path Integral

Until now, we have considered path integrals appropriate for bosons, as in the case of the particle with position x(t), but that is not the only possibility. Indeed, we have seen that the path integral for a harmonic oscillator is best defined as a harmonic phase-space path integral, in terms of $a, a^\dagger$ instead of q, p.
However, for fermions, we saw in Chapter 20 that we have anticommuting variables, with canonical quantization conditions
$$\{\psi_i,\psi_j\} = \{p_{\psi_i},p_{\psi_j}\} = 0, \qquad \{\psi_i,p_{\psi_j}\} = i\hbar\,\delta_{ij}, \qquad (28.49)$$
and we can define fermionic raising and lowering operators $b, b^\dagger$, satisfying the anticommutation relations
$$\{b,b\} = \{b^\dagger,b^\dagger\} = 0, \qquad \{b,b^\dagger\} = 1. \qquad (28.50)$$

In this case, we can define a sort of "classical limit" for the fermions, by taking ℏ → 0. This is not quite a classical limit since, strictly speaking, there is no classical fermion: for fermions there is a single particle per state, and we need a (macroscopically) large number of particles in the same state to obtain a classical limit. But one defines the limit in the abstract, and leaves the interpretation for later. In this case, remembering that (like $a,a^\dagger$) we have defined $b,b^\dagger$ in terms of q, p with coefficients that have $1/\sqrt{\hbar}$ in front, so that in fact $b = \tilde b/\sqrt{\hbar}$, $b^\dagger = \tilde b^\dagger/\sqrt{\hbar}$, we then have
$$\{\tilde b,\tilde b^\dagger\} = \hbar \to 0. \qquad (28.51)$$
Together with the fact that $\{b,b\} = \{b^\dagger,b^\dagger\} = 0$, we see that the classical limit does not give the usual commuting functions but rather a Grassmann algebra, for anticommuting objects, with complex coefficients. This is defined in terms of regular, commuting objects (complex numbers), called "even" (or bosonic) elements, and anticommuting objects such as $b,b^\dagger$, the "odd" (or fermionic) part of the algebra. We then have the usual product relations, bose × bose = bose, fermi × fermi = bose, and bose × fermi = fermi × bose = fermi. In particular, the product of two odd objects is even, and so commutes: for example, $[bb^\dagger,bb^\dagger] = 0$.
Having defined the Grassmann algebra for fermionic objects, we must next define the path integral
over it.

Definitions
We consider a Grassmann algebra with N objects $x_i$, $i=1,\ldots,N$, $\{x_i,x_j\}=0$, together with the identity 1, which commutes with the rest, $[x_i,1]=0$. As we said, it is an algebra over the complex numbers, which means that the coefficients $c\in\mathbb{C}$. Since $(x_i)^2=0$, the Taylor expansion in these $x_i$ ends after N terms,
$$f(\{x_i\}) = f^{(0)} + \sum_i f_i^{(1)}x_i + \sum_{i<j} f_{ij}^{(2)}x_ix_j + \sum_{i<j<k} f_{ijk}^{(3)}x_ix_jx_k + \cdots + f_{12\cdots N}^{(N)}x_1\cdots x_N. \qquad (28.52)$$
Since we cannot add an even element to an odd element, the functions are either even or odd. But, in order for the functions to be nontrivial, we must consider as $x_i$ only a subset of the anticommuting elements, and we can use the rest for the coefficients $f^{(k)}_{i_1\ldots i_k}$. For instance, for N = 1, $f(x) = a+bx$, which means that (if the function f(x) is even) a is even and b is odd; but then we can have $b = cy$, where $c\in\mathbb{C}$ and y is another odd element, so that $f(x) = a + cyx$. More generally, one considers an even number of odd elements, using half of them for the Taylor expansion (as the $x_i$) and the other half as coefficients.
We define differentiation using the same basic relation as in the commuting case,
$$\frac{d}{dx_i}x_j = \delta_{ij}. \qquad (28.53)$$
The only difference is in the differentiation of a product of elements, since the derivative is also anticommuting (is fermionic), so we get extra minus signs compared with the commuting (bosonic) case,
$$\frac{\partial}{\partial x_i}(x_j\ldots) = \delta_{ij}(\ldots) - x_j\frac{\partial}{\partial x_i}(\ldots). \qquad (28.54)$$
Then for even functions f(x), g(x),
$$\frac{\partial}{\partial x_i}\big(f(x)g(x)\big) = \left(\frac{\partial}{\partial x_i}f(x)\right)g(x) + f(x)\left(\frac{\partial}{\partial x_i}g(x)\right), \qquad (28.55)$$
whereas for an odd function f(x) and arbitrary g(x), we have
$$\frac{\partial}{\partial x_i}\big(f(x)g(x)\big) = \left(\frac{\partial}{\partial x_i}f(x)\right)g(x) - f(x)\left(\frac{\partial}{\partial x_i}g(x)\right). \qquad (28.56)$$
Next we need to define integration. However, on a Grassmann algebra we cannot define integration in the usual Riemann or Lebesgue sense, since there is no "Riemann sum" or nontrivial measure. This means that we cannot define a definite integral (with variable limits of integration), but only an indefinite integral.
For a single odd element x, as the basis elements of the Grassmann algebra are 1 and x, we must define the indefinite integral over them. The result of the integral for both elements must be a c-number, whereas dx must also be odd (fermionic), which defines them rather uniquely as
$$\int dx\,1 = 0, \qquad \int dx\,x = 1. \qquad (28.57)$$
For several odd elements $x_i$, owing to the anticommuting nature of the elements we have (for $i\neq j$)
$$\{dx_i,dx_j\} = 0, \qquad \{x_i,dx_j\} = 0. \qquad (28.58)$$
But then we note that integration is the same as differentiation, as the rules for both are the same. For instance, we also obtain translational invariance for the integral,
$$\int dx\,f(x+a) = \int dx\,[f_0 + f_1(x+a)] = \int dx\,f_1x = \int dx\,f(x). \qquad (28.59)$$

The delta function on the Grassmann algebra must be defined as well. We will prove that δ(x) = x. Indeed, for a single x, it is enough to prove the relation by integrating it against a general function, $f(x) = f_0 + f_1x$. We obtain
$$\int dx\,\delta(x-y)f(x) = \int dx\,(x-y)(f_0+f_1x) = \int dx\,(xf_0-yf_1x) = f_0 + y\int dx\,f_1x = f_0 - yf_1 = f_0 + f_1y = f(y), \qquad (28.60)$$
which is the correct result, proving that indeed δ(x) = x.


Changing the integration variable by rescaling with a complex number a, setting y = ax, we obtain
$$1 = \int dx\,x = \int dy\,y = a\int dy\,x \;\Rightarrow\; dy = \frac{1}{a}\,dx, \qquad (28.61)$$
so Grassmann measures scale inversely to the variables.

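These rules are concrete enough to implement directly. The following sketch (the encoding and sign conventions are my own; the integral convention used is "commute the variable to the front, then strip it") verifies the basic rules (28.57), the translational invariance (28.59), and the δ(x) = x property (28.60):

```python
# Minimal Grassmann algebra: an element is a dict {sorted tuple of generator labels: coeff}.

def gmul(f, g):
    """Product in the Grassmann algebra, with anticommutation signs."""
    out = {}
    for kf, cf in f.items():
        for kg, cg in g.items():
            if set(kf) & set(kg):
                continue              # x_i^2 = 0 kills repeated generators
            merged = list(kf + kg)
            sign = 1                  # parity of the sort that orders the labels
            for i in range(len(merged)):
                for j in range(i + 1, len(merged)):
                    if merged[i] > merged[j]:
                        sign = -sign
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + sign * cf * cg
    return {k: c for k, c in out.items() if c != 0}

def berezin(f, i):
    """\\int dx_i f: keep terms containing x_i, commute it to the front, strip it."""
    out = {}
    for k, c in f.items():
        if i in k:
            pos = k.index(i)
            key = k[:pos] + k[pos + 1:]
            out[key] = out.get(key, 0) + ((-1) ** pos) * c
    return {k: c for k, c in out.items() if c != 0}

one = {(): 1}
x = {(1,): 1}      # integration variable
y = {(2,): 1}      # a second odd element (used both as "a" and as "y")
f1 = {(3,): 1}     # an odd coefficient, so that f(x) = 1 + f1*x is even

f = {**one, **gmul(f1, x)}                    # f(x) = f0 + f1 x, with f0 = 1
print(berezin(one, 1))                        # {}       : \int dx 1 = 0
print(berezin(x, 1))                          # {(): 1}  : \int dx x = 1

# translation invariance (28.59): \int dx f(x + a) = \int dx f(x), with a odd
f_shift = {**one, **gmul(f1, {(1,): 1, (2,): 1})}
print(berezin(f_shift, 1) == berezin(f, 1))   # True

# delta function (28.60): \int dx (x - y) f(x) = f(y)
lhs = berezin(gmul({(1,): 1, (2,): -1}, f), 1)
rhs = {**one, **gmul(f1, y)}                  # f(y) = f0 + f1 y
print(lhs == rhs)                             # True
```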
28.4 Gaussian Integration over the Grassmann Algebra

We will consider a real n×n antisymmetric matrix A, such that $A^2$ has negative, nonzero eigenvalues. We must then have n = 2m, and we obtain, for the real Gaussian integration,
$$\int d^nx\,\exp\left(x^TAx\right) = 2^m\sqrt{\det A}, \qquad (28.62)$$
though we will not prove it here. Instead, it is simpler to consider the case of a complex Gaussian integration. Consider independent $x_i, y_i$ (such that $y_i\neq x_i^*$), in which case we have
$$\int d^nx\,d^ny\,\exp\left(y^TAx\right) = \det A. \qquad (28.63)$$
To prove this relation, we need to realize that in the exponential (written as an infinite sum of powers) only the term of order n contributes: since $(x_i)^2=0$, terms with more than n factors of $y^TAx$ have at least one $x_i$ repeated, whereas terms with fewer than n factors of $y^TAx$ give zero since at least one $x_i$ will be missing, giving $\int dx_i\,1 = 0$.
Then the only nonzero terms are of order n, and the only difference between them is in the permutations Q of the $x_i$ and P of the $y_i$; the integrand gives
$$\frac{1}{n!}\sum_P\sum_Q\left(y_{P(1)}A_{P(1)Q(1)}x_{Q(1)}\right)\cdots\left(y_{P(n)}A_{P(n)Q(n)}x_{Q(n)}\right) = \frac{1}{n!}\sum_P\sum_Q y_1A_{1\,QP^{-1}(1)}x_{QP^{-1}(1)}\cdots y_nA_{n\,QP^{-1}(n)}x_{QP^{-1}(n)}$$
$$= \sum_{Q'} y_1A_{1Q'(1)}x_{Q'(1)}\cdots y_nA_{nQ'(n)}x_{Q'(n)} = (y_1\cdots y_n)(x_1\cdots x_n)\,\epsilon\sum_Q\epsilon_Q\,A_{1Q(1)}\cdots A_{nQ(n)} = \epsilon\,(y_1\cdots y_n)(x_1\cdots x_n)\det A, \qquad (28.64)$$
where in the second expression we have reordered the (even) factors yAx so that the y's appear as $y_1,\ldots,y_n$; in the third we have defined $Q' = QP^{-1}$, the sum over P then canceling the 1/n!; and in the fourth we have ordered the y's in front, in the correct order $(y_1\cdots y_n)$, followed by the x's in the correct order $(x_1\cdots x_n)$, generating a constant sign ε that depends on n, times a sign $\epsilon_Q$ that depends on the permutation Q.

28.5 Path Integral for the Fermionic Harmonic Oscillator

Consider the fermionic harmonic oscillator Hamiltonian,
$$\hat H_F = \omega\left(\hat b^\dagger\hat b - \frac{1}{2}\right). \qquad (28.65)$$
Analogously to the bosonic case, we define coherent fermionic ket states,
$$|\psi\rangle \equiv e^{\hat b^\dagger\psi}|0\rangle = (1+\hat b^\dagger\psi)|0\rangle = (1-\psi\hat b^\dagger)|0\rangle, \qquad (28.66)$$
that have eigenvalue ψ under $\hat b$,
$$\hat b|\psi\rangle = \psi|0\rangle = \psi(1-\psi\hat b^\dagger)|0\rangle = \psi|\psi\rangle, \qquad (28.67)$$
as well as corresponding bra states, $\langle\psi^*|$. Together, they satisfy the completeness relation
$$1 = \int d\bar\psi\,d\psi\,|\psi\rangle\langle\psi^*|\,e^{-\bar\psi\psi}. \qquad (28.68)$$

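Since the fermionic Fock space is two dimensional, the relations (28.50) and the spectrum of (28.65) can be checked with explicit 2×2 matrices; an elementary sketch (the matrix conventions are mine):

```python
import numpy as np

# Basis ordered as (|0>, |1>): b annihilates |0> and maps |1> to |0>.
b = np.array([[0.0, 1.0],
              [0.0, 0.0]])
bdag = b.T

assert np.allclose(b @ b, 0)                         # {b, b} = 0
assert np.allclose(b @ bdag + bdag @ b, np.eye(2))   # {b, b^dag} = 1

omega = 2.0
H_F = omega * (bdag @ b - 0.5 * np.eye(2))
print(np.linalg.eigvalsh(H_F))   # the two levels -omega/2 and +omega/2
```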
Following the same steps as in the bosonic case, we find the fermionic transition amplitude as a path integral over the coherent (harmonic) phase space:
$$\langle\bar\psi,t'|\psi,t\rangle = \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\exp\left\{\frac{i}{\hbar}\int_t^{t'}d\tau\left[-i\hbar\,\partial_\tau\bar\psi(\tau)\psi(\tau) - H\right] + \bar\psi(t')\psi(t')\right\}. \qquad (28.69)$$
Doing a partial integration in τ (without boundary terms), and introducing sources $\bar\eta,\eta$ for $\psi,\bar\psi$, in order to obtain the partition function $Z[\eta,\bar\eta]$, the generating functional of correlation functions, we obtain (this time writing the source term without an ℏ, since the kinetic term does not have it explicitly either)
$$Z[\eta,\bar\eta] = \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\exp\left\{i\int dt\left[\bar\psi(i\partial_t-\omega)\psi + \bar\psi\eta + \bar\eta\psi\right]\right\}. \qquad (28.70)$$

Doing the Gaussian integration in the transition amplitude, we find
$$\langle\bar\psi,t'|\psi,t\rangle = N\exp\left\{-\int_{-\infty}^{+\infty}d\tau\int_\tau^{+\infty}d\tau'\,e^{i\omega(\tau-\tau')}\,\bar\eta(\tau')\eta(\tau)\right\} = N\exp\left\{-\int_{-\infty}^{+\infty}d\tau\int_{-\infty}^{+\infty}d\tau'\,\bar\eta(\tau')D_F(\tau',\tau)\eta(\tau)\right\}, \qquad (28.71)$$
where we have used the formula for the (fermionic) Feynman propagator,
$$D_F(\tau,\tau') \equiv \big(-i(i\partial_\tau-\omega)\big)^{-1}(\tau,\tau') = i\int\frac{dE}{2\pi}\,\frac{e^{-iE(\tau-\tau')}}{E-\omega+i\epsilon} = \theta(\tau-\tau')\,e^{-i\omega(\tau-\tau')}. \qquad (28.72)$$
The fermionic path integral can be defined better in Euclidean time, similarly to the bosonic case,
but we will not do it here.

Important Concepts to Remember

• The partition function Z[J] is a generating functional, but J(t) can be understood as more than a mathematical artifact: as a source for the classical variable q(t). In the case of a harmonic oscillator, we obtain the driven harmonic oscillator.
• The kinetic operator $i(d^2/dt^2+\omega^2)$ is inverted to the propagator $\Delta(t,t')$ via a Fourier transform, as $i\int\frac{dp}{2\pi}\frac{e^{-ip(t-t')}}{p^2-\omega^2}$, but this is not invertible since it has zero modes, the classical solutions. The correct formula is then the Feynman propagator, avoiding the singularity in the complex plane by making the replacement $\omega^2\to\omega^2-i\epsilon$.
• The Feynman propagator ends up being $\Delta_F(t,t') = \frac{1}{2\omega}e^{-i\omega|t-t'|}$, which is consistent with a space of eigenfunctions (on which we invert the kinetic operator) with boundary conditions $q(t)\sim e^{+i\omega t}$ for $t\to-\infty$ and $q(t)\sim e^{-i\omega t}$ for $t\to+\infty$. Then $Z[J] = N'\exp\left(-\frac{1}{2\hbar}J\cdot\Delta\cdot J + \frac{i}{\hbar}q_{\rm cl}(t)\cdot J\right)$.
• In order to do the Gaussian integration correctly (since the integrand is complex, $e^{iS/\hbar}$, and is thus highly oscillatory), one does a Wick rotation to imaginary time (analytical continuation to Euclidean time), such that $e^{iS/\hbar}\to e^{-S_E/\hbar}$, where $S_E = \int dt_E\left[\frac{1}{2}\left(\frac{dq}{dt_E}\right)^2 + V(q)\right]$ is the Euclidean action.
• From the Wick rotation of the quantum amplitude, written as a path integral, for a periodic path q = q' and integration over q, we obtain the Feynman–Kac formula, relating it to the statistical mechanics partition function, $Z[\beta] = \mathrm{Tr}[e^{-\beta\hat H}] = \int_{q(t_E+\hbar\beta)=q(t_E)}\mathcal{D}q\,e^{-S_E[q]/\hbar}$.
• We can introduce sources also in the Euclidean formulation, and calculate n-point functions of $\hat q(-i\tau) = e^{\hat H\tau/\hbar}\hat q\,e^{-\hat H\tau/\hbar}$ via derivatives.
• For the harmonic oscillator, besides the correct Gaussian integration with $e^{-S_E/\hbar}$, we have no boundary terms (owing to the periodic paths in the path integral), and moreover the propagator is well defined, as its Fourier transform is $1/(E_E^2+\omega^2)$.
• Since the formal classical limit, ℏ → 0, of the fermionic harmonic oscillator leads to $\{\tilde b,\tilde b^\dagger\} = \hbar\to 0$, that is, to anticommuting variables, we have a Grassmann algebra, over which we need to define a path integral.
• The functions over the Grassmann algebra are at most linear in each independent element $x_i$, but the coefficients are also Grassmann objects (so that half the Grassmann objects are coordinates, half are coefficients).
• We can define an indefinite integral $\int dx$ that is the same as the derivative. Gaussian integration gives $\int d^nx\,d^ny\,\exp(y^TAx) = \det A$.
• The path integral for the transition amplitude of the fermionic harmonic oscillator gives $\exp\left(-\bar\eta\cdot D_F\cdot\eta\right)$, where in Fourier space we have $D_F = i/(E-\omega+i\epsilon)$.

Further Reading
See [5] for more details.

Exercises

(1) Consider the propagator obtained by the replacement $\omega^2\to\omega^2+i\epsilon$ ("anti-Feynman") in the Fourier transform. Calculate the integral, and find the corresponding boundary conditions for the (Hilbert space of) functions over which we invert the kinetic operator.
(2) Repeat the above exercise for the case where both the $p = \pm\omega$ poles in the integral are avoided in the complex plane from above (via a complex integration contour that passes slightly above the real line at the level of both poles).
(3) Prove that
$$\left(-\frac{d^2}{d\tau^2}+\omega^2\right)K(\tau,\tau') = \delta(\tau-\tau'), \qquad (28.73)$$
where $K(\tau,\tau') = \Delta_{\rm free}(\tau-\tau')$ and $\Delta(\tau-\beta) = \Delta(\tau)$, has a unique solution: if $\tau\in[0,\beta]$, the solution is
$$\Delta_{\rm free}(\tau) = \frac{1}{2\omega}\left[(1+n(\omega))e^{-\omega\tau} + n(\omega)e^{\omega\tau}\right], \qquad (28.74)$$
where
$$n(\omega) = \frac{1}{e^{\beta|\omega|}-1}. \qquad (28.75)$$
(4) Wick-rotate the formula
$$I(E_E) = \int\frac{dE_{1,E}}{2\pi}\frac{dE_{2,E}}{2\pi}\,\frac{1}{(E_{1,E}^2+\omega_1^2)(E_{2,E}^2+\omega_2^2)\left[(E_{1,E}+E_{2,E}-E_E)^2+\omega_3^2\right]}. \qquad (28.76)$$
(5) For the driven harmonic oscillator, Wick-rotated to Euclidean space, calculate the four-point
function G4,E (t 1 , t 2 , t 3 , t 4 ).
(6) Consider Grassmann variables $\theta^\alpha$, α = 1, 2, and the even function
$$\Phi(x,\theta) = \phi(x) + 2\theta^\alpha\psi_\alpha(x) + \theta^2F(x), \qquad (28.77)$$
where $\theta^2 = \epsilon_{\alpha\beta}\theta^\alpha\theta^\beta$. Calculate
$$\int d^2\theta\,\left(a_1\Phi + a_2\Phi^2 + a_3\Phi^3\right), \qquad (28.78)$$
where
$$\int d^2\theta \equiv -\frac{1}{4}\int d\theta^\alpha\,d\theta^\beta\,\epsilon_{\alpha\beta}. \qquad (28.79)$$
(7) Fill in the details omitted in the text for the calculation of
$$Z[\bar\eta,\eta] = Z[0,0]\exp\left(-i\int d\tau\,d\tau'\,\bar\eta(\tau)D_F(\tau,\tau')\eta(\tau')\right). \qquad (28.80)$$
29 General Theory of Quantization of Classical Mechanics and (Dirac) Quantization of Constrained Systems

In this chapter, we will analyze the question of how to quantize general classical mechanics theories in the presence of constraints. The answer was given by Dirac in his famous book Lectures on Quantum Mechanics [11], published in 1964 and based on four lectures given at Yeshiva University. It is probably the most influential book in theoretical physics, relative to its number of pages. His presentation was so well conceived that there is little left to change even more than 50 years afterwards, so my presentation is largely based on his original one.
There were several potential issues that Dirac's picture needed to solve.

• One is the presence of physical constraints, such as, for instance, in the case of motion on a circle. One solution is to solve the classical motion in terms of independent variables, but that may not always be available or practical. Instead, we can still quantize in terms of the original variables, without solving the constraints.
• Sometimes there are no independent variables: indeed, we will see that constraints can involve spatial and/or momentum variables, $q_i$ and $p_i$, so constraints that involve both at the same time are harder to understand.
• Finally, there is the existence of gauge invariance symmetries, i.e., local redundancies in the description. For instance, in the case of electromagnetism, there is the invariance $\delta A_\mu = \partial_\mu\lambda$. In this case we can fix it by a gauge choice that acts as a constraint on the $A_\mu$ (phase space) variables. The constraints are defined on phase space, i.e., $\phi(p,q) = 0$, which means that we must use the Hamiltonian formalism and find a specific form for the Hamiltonian. In the Lagrangian formalism, we can introduce constraints by Lagrange multipliers, $L\to L+\lambda\phi$. In the Hamiltonian formalism we should then also add the constraints to H, but this will happen in a different way.

29.1 Hamiltonian Formalism

For completeness, we start with a review of the Hamiltonian formalism. From a Lagrangian $L(q_n,\dot q_n)$, we define the canonically conjugate momenta
$$p_n = \frac{\partial L}{\partial\dot q_n}, \qquad (29.1)$$
after which we define the (naive, see later) Hamiltonian
$$H = \sum_{n=1}^N p_n\dot q_n - L = H(\{q_n\},\{p_n\}), \qquad (29.2)$$

where we replace the q̇n with their expressions in terms of pn in order to find H ({qn }, {pn }). The
Hamiltonian equations of motion are
∂H ∂H
q̇n = , ṗn = − . (29.3)
∂pn ∂qn
We can rewrite the Hamiltonian formalism in a way appropriate for quantization by introducing
Poisson brackets for two functions of phase space f (q, p) and g(q, p) as
  ∂ f ∂g ∂ f ∂g

{ f , g} P.B. = − . (29.4)
n
∂qn ∂pn ∂pn ∂qn

In terms of them, the Hamiltonian equations of motion become


q̇n = {qn , H } P.B. , ṗn = {pn , H } P.B. , (29.5)

or, more generally, for an arbitrary function of phase space g(q, p),
ġ = {g, H } P.B. . (29.6)
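The bracket (29.4) and the evolution equations (29.5) can be checked symbolically; a small sketch for one degree of freedom, with the harmonic oscillator Hamiltonian as an arbitrary test case:

```python
import sympy as sp

q, p, omega = sp.symbols('q p omega')

def pb(f, g):
    """Poisson bracket (29.4) for a single pair (q, p)."""
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

H = p**2 / 2 + omega**2 * q**2 / 2
print(pb(q, H))   # p : this is qdot, per (29.5)
print(pb(p, H))   # -omega**2*q : this is pdot
print(pb(q, p))   # 1 : the canonical bracket {q, p}
```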

29.2 Constraints in the Hamiltonian Formalism: Primary and Secondary Constraints, and First and Second Class Constraints

We start by assuming a set of M constraints on phase space,
$$\phi_m(q,p) = 0, \qquad m = 1,\ldots,M. \qquad (29.7)$$
We will call them primary constraints; we will see shortly why. In order to understand the constraints, we give two examples.

Example 1 Consider the Lagrangian
$$L = q\dot q - \alpha q^2. \qquad (29.8)$$
Then the momentum p canonically conjugate to q is
$$p = \frac{\partial L}{\partial\dot q} = q, \qquad (29.9)$$
which means there is a constraint relating momenta and coordinates,
$$p - q = 0. \qquad (29.10)$$

Example 2 Consider the Lagrangian
$$L = \frac{1}{2}\dot q_2^2 - \alpha q_2^2 - \beta q_1q_2^2. \qquad (29.11)$$
In this case, the momentum canonically conjugate to $q_2$ is $p_2 = \dot q_2$, which is standard, but the momentum canonically conjugate to $q_1$ is
$$p_1 = 0, \qquad (29.12)$$
which is a constraint on phase space.
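Both primary constraints can be read off mechanically from $p_n = \partial L/\partial\dot q_n$; a symbolic sketch of the two examples (the variable names are mine):

```python
import sympy as sp

alpha, beta = sp.symbols('alpha beta')

# Example 1: L = q qdot - alpha q^2
q, qdot = sp.symbols('q qdot')
L1 = q * qdot - alpha * q**2
p = sp.diff(L1, qdot)
print(p)                    # q : the momentum is independent of qdot, so p - q = 0

# Example 2: L = qdot2^2/2 - alpha q2^2 - beta q1 q2^2
q1, q2, qdot1, qdot2 = sp.symbols('q1 q2 qdot1 qdot2')
L2 = qdot2**2 / 2 - alpha * q2**2 - beta * q1 * q2**2
print(sp.diff(L2, qdot1))   # 0 : the primary constraint p1 = 0
print(sp.diff(L2, qdot2))   # qdot2 : p2 = qdot2 is invertible, no constraint
```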


We define ≈ as an equality that holds only after using the constraints $\phi_m = 0$; it is called a weak equality. It means that we use the constraints only at the end of the calculation, after for instance evaluating all the Poisson brackets. Indeed, the Poisson brackets are defined by assuming that the $q_n, p_n$ are independent variables with nonzero brackets. By definition, we have
$$\phi_m \approx 0, \qquad (29.13)$$
so the constraints are not thought of as being identically zero, only weakly zero. Since $\phi_m\approx 0$, from the point of view of time evolution there is no difference between the Hamiltonian H and $H + u_m\phi_m$. The equations of motion associated with the latter are
$$\dot q_n = \frac{\partial H}{\partial p_n} + u_m\frac{\partial\phi_m}{\partial p_n} \approx \{q_n,H+u_m\phi_m\}_{P.B.}, \qquad \dot p_n = -\frac{\partial H}{\partial q_n} - u_m\frac{\partial\phi_m}{\partial q_n} \approx \{p_n,H+u_m\phi_m\}_{P.B.}, \qquad (29.14)$$
where in the last ≈ in both equations we have taken into account that the terms with $\partial u_m/\partial p_n$ and $\partial u_m/\partial q_n$ are multiplied by $\phi_m$, and so are ≈ 0.
For a general function g(q,p), then, we obtain the new equation of motion
$$\dot g = \{g,H_T\}_{P.B.}, \qquad (29.15)$$
where
$$H_T = H + u_m\phi_m \qquad (29.16)$$
is called the total Hamiltonian.
We can apply the above time evolution to the primary constraints $\phi_m$ themselves. But if the $\phi_m$ are good constraints, they must be respected by the time evolution as well, staying on the constraint surface $\phi_m\approx 0$, so that $\dot\phi_m\approx 0$. This gives the equation
$$\{\phi_n,H_T\}_{P.B.} \approx \{\phi_n,H\}_{P.B.} + u_m\{\phi_n,\phi_m\}_{P.B.} \approx 0. \qquad (29.17)$$
In general, this equation produces other, potentially independent, constraints, called secondary constraints. We then repeat the process, finding the equations for the time evolution of these new constraints, etc., until no more new constraints are found. This gives the full set of secondary constraints,
$$\phi_k, \qquad k = M+1,\ldots,M+K. \qquad (29.18)$$
Together, the primary and secondary constraints are written as
$$\phi_j \approx 0, \qquad j = 1,\ldots,M+K. \qquad (29.19)$$
We will consider them together, since the difference between the two is not really important.
phase space R(q, p) first class, if
{R, φ j } P.B. ≈ 0, ∀j = 1, . . . , M + K, (29.20)
and second class if there exists at least one $\phi_j$ such that $\{R, \phi_j\}_{P.B.}$ is not $\approx 0$. Considering the
case where $R$ is a constraint $\phi_{j'}$ itself, we can define first-class and second-class constraints. The
separation of constraints into first class and second class is of relevance, unlike that between primary
and secondary.

For the full set of constraints, we can write the equations for time evolution in the form

φ̇ j = {φ j , HT } P.B. ≈ {φ j , H } P.B. + um {φ j , φ m } P.B. ≈ 0, j = 1, . . . , M + K. (29.21)

In view of the previous definition, the equation also says that $H_T$ is first class. The equations
(29.21) comprise $M + K$ equations for the $M$ coefficients $u_m$, which is an overconstrained system.
However, on physical grounds, since there must exist a true time evolution, we must have at least one
solution, call it $U_m$. If $U_m$ is a particular solution of (29.21), then

um = Um + va Vam (29.22)

is the general solution, where $V_{am}$ solves the homogeneous equation,

Vam {φ j , φ m } P.B. ≈ 0. (29.23)

Then we can split the first-class Hamiltonian HT as

$H_T = H + U_m\phi_m + v_a V_{am}\phi_m = H' + v_a\phi_a,$ (29.24)

where $H' = H + U_m\phi_m$ is also first class, since $U_m$ is a particular solution of (29.21) for $u_m$, and

φ a = Vam φ m (29.25)

are first-class primary constraints, since the φ m are primary, and Vam {φ j , φ m } P.B. ≈ 0 implies that
Vam φ m are first class.

Theorem If $R$ and $S$ are first class, i.e., $\{R, \phi_j\}_{P.B.} = r_{jj'}\phi_{j'}$ and $\{S, \phi_j\}_{P.B.} = s_{jj'}\phi_{j'}$, then $\{R, S\}_{P.B.}$
is first class.

Proof Consider the Jacobi identity, which is an identity (so equivalent to 0 = 0 as can be seen by
writing all the terms explicitly), derived from the antisymmetry of the Poisson brackets, namely

{{R, S} P.B. , P} P.B. + {{P, R} P.B. , S} P.B. + {{S, P} P.B. , R} P.B. = 0. (29.26)

The power of the Jacobi identity comes from using it without having to calculate all the Poisson
brackets explicitly. The case of interest is $P = \phi_j$, in which case we obtain

$\{\{R, S\}_{P.B.}, \phi_j\}_{P.B.} = \{\{R, \phi_j\}_{P.B.}, S\}_{P.B.} - \{\{S, \phi_j\}_{P.B.}, R\}_{P.B.}$
$= \{r_{jj'}\phi_{j'}, S\}_{P.B.} - \{s_{jj'}\phi_{j'}, R\}_{P.B.}$
$= -r_{jj'}s_{j'j''}\phi_{j''} + \{r_{jj'}, S\}_{P.B.}\phi_{j'} + s_{jj'}r_{j'j''}\phi_{j''} - \{s_{jj'}, R\}_{P.B.}\phi_{j'} \approx 0.$ (29.27)
But this is then the definition of {R, S} P.B. being first class. q.e.d.
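Since the proof rests entirely on the Jacobi identity, it is worth verifying the identity mechanically. A small sympy sketch for one canonical pair, with three polynomial phase-space functions chosen arbitrarily here:

```python
import sympy as sp

q, p = sp.symbols('q p')

def pb(f, g):
    """Poisson bracket for a single canonical pair (q, p)."""
    return sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)

# three arbitrary phase-space functions (illustrative choices)
R, S, P = q**2*p, p**3 + q, q*p - q**3

# Jacobi identity, eq. (29.26): the cyclic sum vanishes identically
jacobi = pb(pb(R, S), P) + pb(pb(P, R), S) + pb(pb(S, P), R)
assert sp.expand(jacobi) == 0
```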

Adding the first-class secondary constraints, denoted $\phi_{a'}$, to the Hamiltonian (since, as we said,
there is no difference between primary and secondary, only between first-class and second-class), we
obtain the extended Hamiltonian

$H_E = H_T + v_{a'}\phi_{a'}.$ (29.28)

To summarize, the first-class constraints generate motion tangent to the constraint hypersurface
(so can be added to the Hamiltonian), whereas the second-class constraints generate motion away
from it, so cannot be added.

29.3 Quantization and Dirac Brackets

The standard quantization procedure is to replace the Poisson brackets $\{,\}_{P.B.}$ with $\frac{1}{i\hbar}[,]$. But the
presence of constraints introduces a subtlety.
The simplest way to deal with the constraints is to introduce them as an operator constraint (since
f (q, p) becomes fˆ( q̂, p̂)) acting on physical states |ψ,

φ̂ j |ψ = 0. (29.29)

However, that leads to a potential problem, since acting twice with φ̂ j and taking the commutator
leads to

$[\hat\phi_j, \hat\phi_{j'}]|\psi\rangle = 0.$ (29.30)

In quantum mechanics, for this to be zero, considering that φ j is a full set of constraints, the
commutator must be proportional to the constraints themselves, i.e., the constraint operators must
satisfy an algebra,

$[\hat\phi_j, \hat\phi_{j'}] = c_{jj'}{}^{j''}\hat\phi_{j''}.$ (29.31)

Note that $c_{jj'}{}^{j''}$ could be an operator itself, and since operators do not in general commute, we
must write it to the left of $\hat\phi_{j''}$, so that this vanishes when acting on $|\psi\rangle$.
Since time evolution should be a constraint too (it should keep us within the constraint
hypersurface), we must also have

$[\hat\phi_j, \hat H] = b_{jj'}\hat\phi_{j'}.$ (29.32)

If there are only first-class constraints, there is no problem since then the first-class condition is

$\{\phi_j, \phi_{j'}\}_{P.B.} \approx 0 \;\Rightarrow\; \{\phi_j, \phi_{j'}\}_{P.B.} = c_{jj'}{}^{j''}\phi_{j''},$ (29.33)

which in quantum theory turns into (29.31).


However, if there is a second-class constraint, we do have a problem, since then $\{\phi_j, \phi_{j'}\}_{P.B.}$ is
not $\approx 0$.

Example
Consider an example, to help us understand the issue and a possible solution. Consider a system with
N degrees of freedom, and constraints $q_1 \approx 0$ and $p_1 \approx 0$. But then $\{q_1, p_1\}_{P.B.} = 1 \neq 0$, so the
constraints are not first class. We quantize by imposing on physical states |ψ that

q̂1 |ψ = 0, p̂1 |ψ = 0. (29.34)

However, taking the commutator results in $[\hat q_1, \hat p_1]|\psi\rangle = 0$, which is in contradiction with the
quantization of the Poisson bracket $\{q_1, p_1\}_{P.B.} = 1$, which leads to $[\hat q_1, \hat p_1] = i\hbar$.
In conclusion, either $\hat\phi_j|\psi\rangle = 0$ is not a good way to impose the constraint or the quantization
prescription of replacing the Poisson bracket $\{,\}_{P.B.}$ with the commutator $\frac{1}{i\hbar}[,]$ does not work. In
fact, it turns out to be the latter. We need to modify the Poisson bracket to a new bracket.

In the case above, the solution is simple: we just remove (q1 , p1 ) from the phase space, so the
modified Poisson bracket is now
$\{f, g\}_{P.B.} = \sum_{n=2}^{N}\left(\frac{\partial f}{\partial q_n}\frac{\partial g}{\partial p_n} - \frac{\partial f}{\partial p_n}\frac{\partial g}{\partial q_n}\right).$ (29.35)
But we must find a way to extend this procedure to the general case, by introducing what is known
as a Dirac bracket. Consider independent second-class constraints by taking out linear combinations
that are first class and leaving only constraints all of whose linear combinations are second class, and
call them χs .
Define the inverse matrix $c_{ss'}$ of the Poisson brackets of the second-class constraints $\chi_s$,
$c_{ss'}\{\chi_{s'}, \chi_{s''}\}_{P.B.} = \delta_{ss''}.$ (29.36)
Then we can define the Dirac brackets of two functions of phase space coordinates as
$[f, g]_{D.B.} = \{f, g\}_{P.B.} - \{f, \chi_s\}_{P.B.}\, c_{ss'}\, \{\chi_{s'}, g\}_{P.B.}.$ (29.37)
If we use Dirac brackets instead of Poisson brackets, the time evolution is not modified, since
$[g, H_T]_{D.B.} = \{g, H_T\}_{P.B.} - \{g, \chi_s\}_{P.B.}\, c_{ss'}\, \{\chi_{s'}, H_T\}_{P.B.} \approx \{g, H_T\}_{P.B.},$ (29.38)
where we have used that HT is first class, hence {χs , HT } P.B. ≈ 0.
However, using Dirac brackets instead of Poisson brackets leads to the bracket of any function of
phase space coordinates with a second-class constraint being equal to zero strongly,
$[f, \chi_{s''}]_{D.B.} = \{f, \chi_{s''}\}_{P.B.} - \{f, \chi_s\}_{P.B.}\, c_{ss'}\, \{\chi_{s'}, \chi_{s''}\}_{P.B.} = 0,$ (29.39)
where we have used the definition of $c_{ss'}$, $c_{ss'}\{\chi_{s'}, \chi_{s''}\}_{P.B.} = \delta_{ss''}$. Since the equality of the bracket
to zero is exact (strong), we can set $\chi_s$ to zero strongly if we use the Dirac bracket.
In quantum mechanics, we can still impose the constraints on states, $\hat\chi_s|\psi\rangle = 0$, and not obtain any
more contradictions, by replacing the Dirac bracket with $\frac{1}{i\hbar}[,]$. Indeed, then, for the commutator of
second-class constraints with any function f of phase space coordinates, we obtain zero when acting
on states, so without any more contradictions,
$[\hat f, \hat\chi_s]|\psi\rangle = 0.$ (29.40)
To understand this better, consider the previous example, of constraints q1 ≈ 0, p1 ≈ 0, and use
this general formalism to see that indeed the Dirac bracket amounts to the same solution as the one
we found. Since {q1 , p1 } P.B. = 1, it means that both constraints are second class, so we need to use
Dirac brackets. Define χ1 = q1 and χ2 = p1 . Then {χ1 , χ2 } P.B. = 1, which implies
$c_{s1}\{\chi_1, \chi_2\}_{P.B.} = \delta_{s2} \;\Rightarrow\; c_{21} = 1,$
$c_{s2}\{\chi_2, \chi_1\}_{P.B.} = \delta_{s1} \;\Rightarrow\; c_{12} = -1.$ (29.41)
Then the Dirac bracket is
$[f, g]_{D.B.} = \{f, g\}_{P.B.} - \{f, \chi_2\}_{P.B.}\{\chi_1, g\}_{P.B.} + \{f, \chi_1\}_{P.B.}\{\chi_2, g\}_{P.B.}$
$= \{f, g\}_{P.B.} - \frac{\partial f}{\partial q_1}\frac{\partial g}{\partial p_1} + \frac{\partial f}{\partial p_1}\frac{\partial g}{\partial q_1}$
$= \sum_{n=2}^{N}\left(\frac{\partial f}{\partial q_n}\frac{\partial g}{\partial p_n} - \frac{\partial f}{\partial p_n}\frac{\partial g}{\partial q_n}\right),$ (29.42)

where in the second equality we have substituted χ1 = q1 and χ2 = p1 , and then calculated the
Poisson brackets.
Since it is this Dirac bracket that quantizes to $\frac{1}{i\hbar}[,]$, in quantum mechanics $[\hat q_n, \hat p_m] = i\hbar\,\delta_{nm}$ only
for $n, m = 2, \ldots, N$; in particular $[\hat q_1, \hat p_1] = 0$. Then imposing

q̂1 |ψ = 0 = p̂1 |ψ (29.43)

leads to no contradiction.
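The same conclusion can be reached mechanically. A sympy sketch (my notation) that builds $c_{ss'}$ as the inverse of the matrix of constraint brackets, as in (29.36), and checks both the reduced bracket (29.35) and the strong vanishing (29.39), for N = 2:

```python
import sympy as sp

q1, p1, q2, p2 = sp.symbols('q1 p1 q2 p2')
pairs = [(q1, p1), (q2, p2)]

def pb(f, g):
    """Full Poisson bracket over both canonical pairs."""
    return sum(sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)
               for q, p in pairs)

chi = [q1, p1]                                  # second-class constraints
M = sp.Matrix(2, 2, lambda s, t: pb(chi[s], chi[t]))
c = M.inv()                                     # c_{ss'}, eq. (29.36)

def db(f, g):                                   # Dirac bracket, eq. (29.37)
    return sp.expand(pb(f, g) - sum(pb(f, chi[s])*c[s, t]*pb(chi[t], g)
                                    for s in range(2) for t in range(2)))

f = q1*p1 + q2**2*p2                            # an arbitrary test function
assert db(f, q1) == 0 and db(f, p1) == 0        # strongly zero, eq. (29.39)
assert db(q1, p1) == 0 and db(q2, p2) == 1      # only (q2, p2) survives
```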
Next, we consider in detail two examples that have complementary features and that were
described in the general formalism.

Example 1
The first example is the first example from before, with $L = q\dot q - \alpha q^2$. The example is a bit
pathological, but it has some of the features of the case of (Majorana, i.e., real) massive fermion(s)
ψi , with Lagrangian
$L = \frac{1}{2}\psi_i\dot\psi_i - \frac{m}{2}\psi_i\sigma_{ij}\psi_j,$
and where the ψi are anticommuting and there is an antisymmetric matrix σi j , so that the mass term
is nonzero. However, we will not treat Majorana fermions here, since they have some subtleties.
The equations of motion of the Lagrangian are
$\frac{d}{dt}\frac{\partial L}{\partial\dot q} - \frac{\partial L}{\partial q} = 0 \;\Rightarrow\; \dot q + 2\alpha q - \dot q = 0 \;\Rightarrow\; q = 0.$ (29.44)
However, in the Hamiltonian formalism, the situation is more interesting. First, the Hamiltonian is

H = pq̇ − L = αq2 , (29.45)

since as we saw, p = q. This is actually a (primary) constraint, so we have

φ1 = p − q ≈ 0. (29.46)

But we have to consider the time evolution of this primary constraint, and set it to zero weakly,
generating a secondary constraint φ2 ,

{φ1 , H } P.B. = {p − q, H } P.B. = −2αq ≈ 0 ⇒ φ2 = q ≈ 0. (29.47)

The bracket of φ2 = q with H is zero, so there are no more secondary constraints.


The equations of motion of the Hamiltonian are

q̇ = 0, ṗ = −2αq. (29.48)

Note that, since p = q, we have that the second equation is q̇ = −2αq, which because of the first
equation reduces to q = 0, as in the Lagrangian formalism.
Calculating the Poisson brackets of the constraints, we get

$\{\phi_1, \phi_1\}_{P.B.} = \{p - q, p - q\}_{P.B.} = 0 = \{q, q\}_{P.B.} = \{\phi_2, \phi_2\}_{P.B.},$
$\{\phi_1, \phi_2\}_{P.B.} = \{p - q, q\}_{P.B.} = -1 \neq 0,$ (29.49)

so both φ1 and φ2 are second class, and there are no first-class constraints. On the other hand, φ1 is
a primary constraint and φ2 is a secondary constraint. The total Hamiltonian is obtained by adding
the primary constraints, so

HT = H + um φ m = αq2 + u(p − q). (29.50)

For this, we have two conditions: {φ j , HT } P.B. ≈ 0, for j = 1, 2. The first is

{p − q, αq2 } + u{p − q, p − q} ≈ 0, (29.51)

but since the first bracket is −2αq ≈ 0 (it is proportional to φ2 ), and the second is identically zero,
the equation is satisfied. The second condition is

{q, αq2 } P.B. + u{q, p − q} P.B. ≈ 0, (29.52)

which becomes (since the first bracket is zero, and the second is one)

u ≈ 0. (29.53)

This is not surprising, since HT is first class, and there are no first-class constraints. Further,
u = U + va φ a , where φ a are the first-class constraints, of which there are none, so u = U and
U is a particular solution; but since there is a particular solution u = 0, as we saw, we can put U = 0
as well. Then finally

$H_T = H' = H = H_E = \alpha q^2.$ (29.54)

To obtain the Dirac brackets, note first that all the constraints are second class, so χ1 = φ1 = p − q,
and χ2 = φ2 = q. Since

{χ1 , χ2 } P.B. = −1 = −{χ2 , χ1 } P.B. , (29.55)

we have c12 = 1, c21 = −1. Then the Dirac brackets are


$[f, g]_{D.B.} = \{f, g\}_{P.B.} + \{f, \chi_2\}_{P.B.}\{\chi_1, g\}_{P.B.} - \{f, \chi_1\}_{P.B.}\{\chi_2, g\}_{P.B.}$
$= \left(\frac{\partial f}{\partial q}\frac{\partial g}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial g}{\partial q}\right) + \left(\frac{\partial f}{\partial p}\frac{\partial g}{\partial q} + \frac{\partial f}{\partial p}\frac{\partial g}{\partial p}\right) - \left(\frac{\partial f}{\partial q}\frac{\partial g}{\partial p} + \frac{\partial f}{\partial p}\frac{\partial g}{\partial p}\right) = 0,$ (29.56)
where in the second equality we substituted $\chi_1 = p - q$ and $\chi_2 = q$, and then evaluated the Poisson
brackets.
Since the Dirac bracket is zero, there is nothing to quantize.
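The full chain of this example (secondary constraint, second-class check, vanishing Dirac bracket) can be verified with sympy; the undetermined functions f and g below stand for arbitrary phase-space functions:

```python
import sympy as sp

q, p, alpha = sp.symbols('q p alpha')

def pb(f, g):
    return sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)

H = alpha*q**2
phi1 = p - q                          # primary constraint, eq. (29.46)
assert pb(phi1, H) == -2*alpha*q      # -> secondary constraint phi2 = q
phi2 = q
assert pb(phi2, H) == 0               # no further constraints
assert pb(phi1, phi2) == -1           # both constraints are second class

chi = [phi1, phi2]
c = sp.Matrix(2, 2, lambda s, t: pb(chi[s], chi[t])).inv()
f, g = sp.Function('f')(q, p), sp.Function('g')(q, p)
db = pb(f, g) - sum(pb(f, chi[s])*c[s, t]*pb(chi[t], g)
                    for s in range(2) for t in range(2))
assert sp.expand(db) == 0             # the Dirac bracket vanishes identically
```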

29.4 Example: Electromagnetic Field

As a second example, consider the electromagnetic field Aμ , but without a gauge choice, so
constraints will appear only from the form of the action itself. It is a field, so Aμ = Aμ (x , t), and
in terms of quantum mechanics, we can think of x as an extra index i; otherwise we have the same
classical story. However, we will not go into the quantization details, since they are in the domain of
quantum field theory, beyond the scope of this book. Instead, we will just use it as an example of the
constraint formalism in classical physics.

The action for electromagnetism is
$S = -\int d^3x\, dt\, \frac{F_{\mu\nu}F^{\mu\nu}}{4} = \int d^3x\, dt\left(-\frac{1}{4}F_{ij}F^{ij} - \frac{1}{2}F_{0i}F^{0i}\right),$ (29.57)
where Fμν = ∂μ Aν − ∂ν Aμ , and Aμ is the field variable. For an arbitrary function f , we denote
derivatives with a comma, ∂μ f = f , μ . Then the momentum canonically conjugate to Aμ is
$P^\mu(\vec x) \equiv \frac{\delta L}{\delta A_{\mu,0}(\vec x)} = F^{\mu 0}(\vec x),$ (29.58)
where the functional derivative $\delta$ amounts to a regular partial derivative after removing the integral
over $\vec x$ (since $\vec x$ is like an index $i$, and this index sits on the variable $A_\mu(\vec x)$, so the integral stands
for $\sum_i$). But since $F^{\mu\nu}$ is antisymmetric we have $F^{00} = 0$, so $P^0 = 0$, which is a (primary) constraint,
P0 ≈ 0. (29.59)
Since x is an index i like μ, the fundamental Poisson brackets for Aμ and P ν are
$\{A_\mu(\vec x, t), P^\nu(\vec x', t)\}_{P.B.} = \delta^\nu_\mu\,\delta^3(\vec x - \vec x').$ (29.60)
The Hamiltonian is
   
$H = \int d^3x\, P^\mu A_{\mu,0} - L = \int d^3x\left(F^{i0}A_{i,0} + \frac{1}{4}F_{ij}F^{ij} + \frac{1}{2}F_{i0}F^{i0}\right)$
$= \int d^3x\left(F^{i0}A_{0,i} + \frac{1}{4}F_{ij}F^{ij} - \frac{1}{2}F_{i0}F^{i0}\right)$
$= \int d^3x\left(\frac{1}{4}F_{ij}F^{ij} + \frac{1}{2}P^iP^i - A_0 P^i{}_{,i}\right),$ (29.61)
where in the second equality we have used $F_{i0} = A_{0,i} - A_{i,0}$, and in the third one, we have used
$P^i = F^{0i}$ and partial integration.
The secondary constraints are obtained from the time evolution,
$\{P^0, H\}_{P.B.} = P^i{}_{,i} \approx 0,$ (29.62)
where we have partially integrated under the integral sign in the Hamiltonian. We see that we obtain
a secondary constraint. Taking one more time evolution with H,
$\{P^i{}_{,i}, H\}_{P.B.} = 0,$ (29.63)
we obtain no new secondary constraints. We find that both constraints (both the primary one, and the
secondary one) are first class, since the Poisson brackets of the constraints vanish,
$\{P^0(\vec x, t), P^0(\vec x', t)\}_{P.B.} = 0 = \{P^0(\vec x, t), P^i{}_{,i}(\vec x', t)\}_{P.B.} = \{P^i{}_{,i}(\vec x, t), P^j{}_{,j}(\vec x', t)\}_{P.B.}.$ (29.64)
Thus there are no Dirac brackets, and we can use Poisson brackets.
The constraints are all first class, but they divide into primary and secondary,
$\phi_m{:}\;\phi_1 = P^0, \qquad \phi_k{:}\;\phi_2 = P^i{}_{,i}.$ (29.65)
Then the first-class Hamiltonian $H'$ is
$H' = H + U_m\phi_m = H + \int d^3x\, U P^0,$ (29.66)
but we can choose $U = 0$ as a solution, and thus $H' = H$.



To obtain the total Hamiltonian, we add the primary first-class constraints with coefficients, $v_a\phi_a$,
namely $P^0$, so
$H_T = H' + \int d^3x\, v(\vec x)P^0(\vec x) = \int d^3x\left(\frac{1}{4}F_{ij}F^{ij} + \frac{1}{2}P^iP^i\right) + \int d^3x\left(-A_0 P^i{}_{,i} + vP^0\right).$ (29.67)

Note that since Pi = F 0i = E i , φ2 = Pi ,i = ∇


 ·E
 is a Gaussian constraint, and that, in HT , −A0 is its
Lagrange multiplier.
To obtain the extended Hamiltonian HE , we add the secondary first-class constraint Pi ,i with an
arbitrary coefficient,

$H_E = H_T + \int d^3x\, u(\vec x)\, P^i{}_{,i}(\vec x).$ (29.68)

However, we can put $v = 0$ in $H_T$, since the only purpose of the $vP^0$ term is to generate the time
evolution of $A_0$, as $\dot A_0 = v$. But that means we can also get rid of $A_0$ by redefining $u$, as $u' = u - A_0$.
Then the extended Hamiltonian is
   
$H_E = \int d^3x\left(\frac{1}{4}F_{ij}F^{ij} + \frac{1}{2}P^iP^i\right) + \int d^3x\, u'(\vec x)\, P^i{}_{,i}(\vec x).$ (29.69)
4 2
In this case, in quantum mechanics there is nothing new since the Dirac brackets equal the Poisson
brackets, but there are still subtleties due to gauge invariance, which will not be addressed here as
they relate to quantum field theory.

Important Concepts to Remember

• On a system in Hamiltonian formalism we can impose some primary constraints φ m (q, p) and
consider weak equality ≈ as an equality that holds only after using the constraints.
• Time evolution is weakly unchanged by adding the constraints to the Hamiltonian, H → HT = H +
um φ m , and from the time evolution of the primary constraints (and further secondary constraints)
we get secondary constraints.
• The set of all (primary and secondary) constraints is divided into first-class constraints $\phi_{j'}$, those that have
Poisson brackets weakly zero with all constraints, $\{\phi_{j'}, \phi_j\}_{P.B.} \approx 0$, so $= c_{j'j}{}^k\phi_k$, and second-class
constraints, those for which there is at least one $\phi_j$ such that $\{\phi_{j'}, \phi_j\}_{P.B.} \neq c_{j'j}{}^k\phi_k$ (more generally, any
function of phase space coordinates can be first or second class).
• The time evolution equations of the constraints are $\{\phi_j, H\}_{P.B.} + u_m\{\phi_j, \phi_m\}_{P.B.} \approx 0$ and have a
particular solution $U_m$ and solutions $V_{am}$ of the homogeneous equation, so $u_m = U_m + v_a V_{am}$ is the
general solution.
• The first-class total Hamiltonian is $H_T = H' + v_a\phi_a$, where $H' = H + U_m\phi_m$ is first class and $\phi_a = V_{am}\phi_m$
are first-class primary constraints, and the extended Hamiltonian adds also the first-class secondary
constraints, $H_E = H_T + v_{a'}\phi_{a'}$.
• In order to quantize, we can only introduce operator constraints acting on states, $\hat\phi_j|\psi\rangle = 0$, if we
use Dirac brackets instead of Poisson brackets when quantizing, $[,]_{D.B.} \to \frac{1}{i\hbar}[,]$.
• Dirac brackets are obtained by removing the independent second-class constraints, which would
introduce contradictions (such as $0 = (\{\phi_m, \phi_j\}_{P.B.})_{\rm quantized}|\psi\rangle \sim {\rm const.}|\psi\rangle$), via
$[f, g]_{D.B.} = \{f, g\}_{P.B.} - \{f, \chi_s\}_{P.B.}\,(\{\chi, \chi\}_{P.B.})^{-1}_{ss'}\,\{\chi_{s'}, g\}_{P.B.}$.

• The Dirac brackets of functions of phase space with second-class constraints are equal to zero strongly,
meaning there are no contradictions coming from imposing the second-class constraints as operators
acting on states during quantization, since now their commutator with anything vanishes.
• For electromagnetism we have only first-class constraints, $P^0 \approx 0$ being the primary one and
$P^i{}_{,i} \approx 0$ the unique secondary one. In the extended Hamiltonian $H_E = H + \int\left[(u - A_0)P^i{}_{,i} + vP^0\right]$,
we can put $v = 0$ and redefine $u' = u - A_0$. Since $P^i = E^i$, $P^i{}_{,i} = \vec\nabla\cdot\vec E$ is the Gaussian constraint
and, in $H_T$, $-A_0$ is its Lagrange multiplier.

Further Reading
See Dirac’s book [11] for more details.

Exercises

(1) Consider the Lagrangian


$L = \frac{1}{2}\left(\dot q_1^2 + \dot q_2^2\right) - \alpha\left(q_1^2 + q_2^2\right)^2,$ (29.70)
and the (primary) constraint q1 + βq2 = 0. Calculate the secondary constraints, and find out
which of the constraints is first class and which is second class.
(2) Consider a particle of mass m in an (approximately) constant gravitational field (such as that
of the Earth), but constrained to move on a circle within a vertical plane. Using Cartesian
coordinates and the constraint formalism, write down the Lagrangian and the primary constraint
and calculate the secondary constraints, and see which is first class, and which is second class.
(3) Consider the Lagrangian
$L = \frac{1}{2}\dot q_1^2 + \alpha\dot q_2 q_1 - \beta q_1 q_2^2.$ (29.71)
Find the constraints of the system, write down the total Hamiltonian, and solve for Um , va .
(4) Use Dirac quantization to quantize the system in exercise 3.
(5) Consider the Born–Infeld action for nonlinear electromagnetism,
$S = -L^{-4}\int d^3x\, dt\left[\sqrt{1 + L^4\,\frac{F_{\mu\nu}F^{\mu\nu}}{2} - L^8\left(\frac{1}{8}\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}\right)^2} - 1\right],$ (29.72)
where $L$ is a constant length.
Calculate the Hamiltonians H, H , HT , HE in a similar way to the Maxwell electromagnetism
case in the text, and find the Dirac brackets.
(6) Consider the action for a (Dirac) spinor,

S= dt(iψ∗ ψ̇), (29.73)

written in terms of independent variables ψ and ψ∗ . Calculate the primary constraints and the
total Hamiltonian HT . Check that there are no secondary constraints, and then from
{φ m , HT } P.B. ≈ 0 (29.74)

solve for $U_m$, $v_A$. Note that classical fermions are anticommuting, so we defined $p$ by taking the
derivatives from the left, for example $\frac{\partial}{\partial\psi}(\psi\chi) = \chi$, so that $\{p_A, q_B\} = -\delta^B_A$. In general, we
must define
$\{f, g\}_{P.B.} = -\frac{\partial f}{\partial p_\alpha}\frac{\partial}{\partial q_\alpha}g + (-)^{fg}\frac{\partial g}{\partial p_\alpha}\frac{\partial}{\partial q_\alpha}f,$ (29.75)
where $\partial f/\partial p_\alpha$ is the right derivative, for example $\frac{\partial}{\partial\psi}(\chi\psi) = \chi$, and $(-)^{fg} = -1$ if $f$ and $g$ are
both fermions and $+1$ otherwise (if $f$ and/or $g$ is bosonic). This bracket is antisymmetric if $f$
and $g$ are bose–bose or bose–fermi, and symmetric if $f$ and $g$ are both fermionic.
(7) (Continuation of exercise 6) Show that all constraints are second class, thus writing also H  and
HE . Write down the Dirac brackets, and find a (potentially!) new expression for HT and the
resulting Dirac quantization relations.
PART IIa

ADVANCED FOUNDATIONS
30 Quantum Entanglement and the EPR Paradox

In Part IIa, we will study advanced topics related to the foundations of quantum mechanics. In this
chapter, we will start by understanding entanglement and associated notions, and by explaining the
EPR paradox for entangled states.
Entanglement is a notion related to a total system composed of two subsystems, A and B, not
necessarily with any physical division, though as we shall see, the case of a macroscopic distance
between the subsystems is quite important. We consider not only two subsystems, but also independent
observers for each, one called "Alice" for A and the other "Bob" for B. The total Hilbert space is the
product of Hilbert spaces, $\mathcal H = \mathcal H_A \otimes \mathcal H_B$.
But if system A is correlated, or “entangled”, with system B, in the quantum state of the total
system, we say that we have an “entangled state”. In particular, this means that we cannot write
the state of the total system as a product state of the states in each subsystem. To understand these
concepts better, we consider the simplest system, a spin 1/2 or two-state system.

30.1 Entanglement: Spin 1/2 System

The spin 1/2 system is the standard example used, since it contains all necessary ingredients, besides
being the simplest system.
Each spin 1/2 system has two states, denoted by |↑ and |↓. Consider two such systems, A and B.
Then the general entangled states in total Hilbert space H = H A ⊗ HB are
$|\psi\rangle = a|{\uparrow_A\uparrow_B}\rangle + b|{\downarrow_A\downarrow_B}\rangle,$
$|\phi\rangle = a'|{\uparrow_A\downarrow_B}\rangle + b'|{\downarrow_A\uparrow_B}\rangle,$ (30.1)
with $a, b, a', b' \neq 0$, where the spins in $|\psi\rangle$ are correlated, and in $|\phi\rangle$ are anticorrelated. The standard
examples of these entangled states are
$|\psi_{1,2}\rangle = \frac{1}{\sqrt 2}\left(|{\uparrow_A\uparrow_B}\rangle \pm |{\downarrow_A\downarrow_B}\rangle\right),$
$|\phi_{1,2}\rangle = \frac{1}{\sqrt 2}\left(|{\uparrow_A\downarrow_B}\rangle \pm |{\downarrow_A\uparrow_B}\rangle\right).$ (30.2)
2
In both the $|\psi\rangle$ and $|\phi\rangle$ states, if $a = \pm b$ and $a' = \pm b'$ we have the same probabilities for the two
cases, since
$P_{\uparrow\uparrow}(\psi) = |a|^2,\quad P_{\downarrow\downarrow}(\psi) = |b|^2,$
$P_{\uparrow\downarrow}(\phi) = |a'|^2,\quad P_{\downarrow\uparrow}(\phi) = |b'|^2.$ (30.3)


On the other hand, the nonentangled states in $\mathcal H$ are the states with $a = 0$ or $b = 0$, or with $a' = 0$
or $b' = 0$, i.e.,
$|{\uparrow}\rangle_A \otimes |{\uparrow}\rangle_B,\quad |{\uparrow}\rangle_A \otimes |{\downarrow}\rangle_B,\quad |{\downarrow}\rangle_A \otimes |{\uparrow}\rangle_B,\quad |{\downarrow}\rangle_A \otimes |{\downarrow}\rangle_B,$ (30.4)
where we have written the tensor product explicitly in order to show that the states are generic (tensor
product) states,
|ψ = |ψ A ⊗ |ψ B . (30.5)
In fact, for any |ψ A state for A and |ψ B  state for B (including states that are linear combinations
of the basis A or B states), the states are nonentangled.
The relevant point about an entangled state is best explained using $|\psi_{1,2}\rangle$ and $|\phi_{1,2}\rangle$: if, say, Alice
measures the spin in A (the result is random, obtained with probability 1/2 to be either up or
down), then she also knows what Bob will measure afterwards, if he measures the spin along the
same direction. That is so since measuring the spin of A, say to be in $|{\uparrow}\rangle_A$, collapses the state $|\psi_{1,2}\rangle$
to $|{\uparrow_A\uparrow_B}\rangle$, which has spin up for B.
In order to understand further the effects of entanglement on a subsystem, consider a Hermitian
operator acting only on A, MA ⊗1B . Then the expectation value of the operator (the average measured
value) in the state |ψ is
MA = ψ|MA ⊗ 1B |ψ = |a| 2 ↑|MA |↑ A + |b| 2 ↓|MA |↓ A. (30.6)
In terms of the total Hilbert space H, this entangled state is pure, with density matrix

ρ tot = |ψψ| = (a|↑ A↑B  + b|↓ A↓B ) a∗ ↑ A↑B | + b∗ ↓ A↓B | . (30.7)
Yet in terms of the Hilbert space of A only, the entangled state |ψ is mixed, with density matrix
ρ A = |a| 2 |↑↑| + |b| 2 |↓↓|, (30.8)
since the expectation value can be rewritten in the density matrix formalism as
MA = Tr(MA ρ A ) = ↑|MA ρ A |↑ + ↓|MA ρ A |↓, (30.9)
which is equal to the result in (30.6).
But then the density matrix of A, ρ A, is obtained as the trace over only the degrees of freedom of
B of the total density matrix ρ tot ,
$\rho_A = \mathrm{Tr}_B\,\rho_{tot} \equiv \langle{\uparrow_B}|\rho_{tot}|{\uparrow_B}\rangle + \langle{\downarrow_B}|\rho_{tot}|{\downarrow_B}\rangle$
$= \langle{\uparrow_B}|\left(a|{\uparrow_A\uparrow_B}\rangle + b|{\downarrow_A\downarrow_B}\rangle\right)\left(a^*\langle{\uparrow_A\uparrow_B}| + b^*\langle{\downarrow_A\downarrow_B}|\right)|{\uparrow_B}\rangle$
$+ \langle{\downarrow_B}|\left(a|{\uparrow_A\uparrow_B}\rangle + b|{\downarrow_A\downarrow_B}\rangle\right)\left(a^*\langle{\uparrow_A\uparrow_B}| + b^*\langle{\downarrow_A\downarrow_B}|\right)|{\downarrow_B}\rangle$
$= |a|^2|{\uparrow_A}\rangle\langle{\uparrow_A}| + |b|^2|{\downarrow_A}\rangle\langle{\downarrow_A}|,$ (30.10)
which is exactly the mixed state ρ A from before. Therefore finding the trace over a subsystem of an
entangled state leads to a mixed state, at least in this example.
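Numerically, the partial trace is just a reshape plus a trace over the B indices. A numpy sketch (the amplitudes a, b are chosen here only for illustration) for the state $a|{\uparrow_A\uparrow_B}\rangle + b|{\downarrow_A\downarrow_B}\rangle$:

```python
import numpy as np

a, b = 1/np.sqrt(3), np.sqrt(2/3)            # |a|^2 + |b|^2 = 1
psi = np.array([a, 0, 0, b])                 # basis order: uu, ud, du, dd
rho_tot = np.outer(psi, psi.conj())          # pure-state density matrix

# reshape to indices (i_A, i_B, j_A, j_B) and trace over the B indices
rho_A = np.trace(rho_tot.reshape(2, 2, 2, 2), axis1=1, axis2=3)

assert np.allclose(rho_A, np.diag([a**2, b**2]))   # mixed, as in eq. (30.8)
purity = np.trace(rho_A @ rho_A).real
assert purity < 1                            # Tr rho_A^2 < 1: not a pure state
```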
On the other hand, if the state is nonentangled, i.e., separable,
|ψ = |ψ A|ψ B , (30.11)
where |ψ A = a|↑ A + b|↓ A and |ψ B  = c|↑B  + d|↓B , so that
|ψ = (a|↑ A + b|↓ A) ⊗ (c|↑B  + d|↓B ) , (30.12)

then finding the trace over the subsystem B leads to a pure state instead, since

$\mathrm{Tr}_B\,\rho_{tot} = \mathrm{Tr}_B|\psi\rangle\langle\psi|$
$= \langle{\uparrow_B}|\left(a|{\uparrow}\rangle_A + b|{\downarrow}\rangle_A\right)\otimes\left(c|{\uparrow}\rangle_B + d|{\downarrow}\rangle_B\right)\left(a^*\langle{\uparrow_A}| + b^*\langle{\downarrow_A}|\right)\otimes\left(c^*\langle{\uparrow_B}| + d^*\langle{\downarrow_B}|\right)|{\uparrow_B}\rangle$
$+ \langle{\downarrow_B}|\left(a|{\uparrow}\rangle_A + b|{\downarrow}\rangle_A\right)\otimes\left(c|{\uparrow}\rangle_B + d|{\downarrow}\rangle_B\right)\left(a^*\langle{\uparrow_A}| + b^*\langle{\downarrow_A}|\right)\otimes\left(c^*\langle{\uparrow_B}| + d^*\langle{\downarrow_B}|\right)|{\downarrow_B}\rangle$
$= |c|^2\left(a|{\uparrow}\rangle_A + b|{\downarrow}\rangle_A\right)\left(a^*\langle{\uparrow_A}| + b^*\langle{\downarrow_A}|\right) + |d|^2\left(a|{\uparrow}\rangle_A + b|{\downarrow}\rangle_A\right)\left(a^*\langle{\uparrow_A}| + b^*\langle{\downarrow_A}|\right)$
$= (|c|^2 + |d|^2)|\psi_A\rangle\langle\psi_A| \equiv \rho_A.$ (30.13)

Here we have used the (probability) normalization condition |c| 2 + |d| 2 = 1.

30.2 Entanglement: The General Case

Now we consider a general system for A and B, with orthonormal basis {|i} for A and orthonormal
basis {|m} for B. Then a general entangled state is written as
   
$|\psi\rangle_{AB} = \sum_{i,m} C_{im}|i\rangle_A \otimes |m\rangle_B = \sum_i |i\rangle_A \otimes \Big(\sum_m C_{im}|m\rangle_B\Big) \equiv \sum_i |i\rangle_A \otimes |\tilde i\rangle_B,$ (30.14)

where we define a new basis for B, by linear combinations



$|\tilde i\rangle_B = \sum_m C_{im}|m\rangle_B,$ (30.15)

but which is not necessarily orthonormal at this point. Choose however the basis $|i\rangle$ such that it also
diagonalizes the density matrix of A obtained by taking the trace over the subsystem B,
$\rho_A = \sum_i p_i |i\rangle\langle i|,$ (30.16)

where pi is the probability for state |i in the ensemble.


Then we can calculate $\rho_A$ from (30.14) as
$\rho_A = \mathrm{Tr}_B|\psi\rangle_{AB}\,{}_{AB}\langle\psi| = \mathrm{Tr}_B\Big[\sum_{i,j}\big(|i\rangle_A\,{}_A\langle j|\big) \otimes |\tilde i\rangle_B\,{}_B\langle\tilde j|\Big].$ (30.17)
To compute it further, note that

$\mathrm{Tr}_B|\tilde i\rangle_B\,{}_B\langle\tilde j| = \sum_k {}_B\langle k|\tilde i\rangle_B\; {}_B\langle\tilde j|k\rangle_B = \langle\tilde j|\tilde i\rangle,$ (30.18)
where we have used $\sum_k |k\rangle\langle k| = 1$. Using this relation, we finally obtain
$\rho_A = \sum_{i,j} {}_B\langle\tilde j|\tilde i\rangle_B\, |i\rangle_A\,{}_A\langle j|.$ (30.19)
We can now compare this formula with (30.16), and equating the terms implies
$\,{}_B\langle\tilde j|\tilde i\rangle_B = p_i\,\delta_{ij},$ (30.20)

which means that the states
$|i'\rangle_B = \frac{1}{\sqrt{p_i}}|\tilde i\rangle_B$ (30.21)
form an orthonormal set as well, just as the $|m\rangle_B$ do. Then, finally, we can rewrite the pure state
$|\psi\rangle_{AB}$ in the form
$|\psi\rangle_{AB} = \sum_i \sqrt{p_i}\,|i\rangle_A \otimes |i'\rangle_B,$ (30.22)

known as the Schmidt decomposition of a bipartite pure state |ψ AB .


This decomposition can be performed for any bipartite pure state $|\psi\rangle_{AB}$; however, the bases $|i\rangle_A$
and $|i'\rangle_B$ do depend on the explicit form of $|\psi\rangle$.
The density matrix for the B subsystem is obtained in the same way as that for A,
$\rho_B \equiv \mathrm{Tr}_A\,\rho_{tot} = \mathrm{Tr}_A|\psi\rangle_{AB}\,{}_{AB}\langle\psi| = \sum_{i,j,k} {}_A\langle k|\big(\sqrt{p_i}\,|i\rangle_A \otimes |i'\rangle_B\big)\big(\sqrt{p_j}\;{}_B\langle j'| \otimes {}_A\langle j|\big)|k\rangle_A = \sum_i p_i |i'\rangle_B\,{}_B\langle i'|.$ (30.23)

We see then that ρ A and ρ B have the same set of nonzero eigenvalues pi .
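In components, the Schmidt decomposition is the singular value decomposition of the coefficient matrix $C_{im}$, and the $p_i$ are its squared singular values. A numpy sketch with a randomly chosen C (the dimensions 2 and 3 and the seed are my choices):

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.normal(size=(2, 3)) + 1j*rng.normal(size=(2, 3))
C /= np.linalg.norm(C)                       # normalize the state |psi>

U, s, Vh = np.linalg.svd(C)                  # C = U diag(s) Vh
p = s**2                                     # Schmidt probabilities p_i
assert np.isclose(p.sum(), 1.0)

rho_A = C @ C.conj().T                       # Tr_B |psi><psi|
rho_B = C.T @ C.conj()                       # Tr_A |psi><psi|
# rho_A and rho_B share the same nonzero eigenvalues p_i
eA = np.sort(np.linalg.eigvalsh(rho_A))[::-1]
eB = np.sort(np.linalg.eigvalsh(rho_B))[::-1]
assert np.allclose(eA, p) and np.allclose(eB[:2], p)
```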

30.3 Entanglement: Careful Definition

We must now define more precisely the entanglement of a state |ψ AB . Define the number of nonzero
eigenvalues pi in ρ A (or, equivalently, in ρ B ) as the Schmidt number n. If n > 1 then the state is called
entangled, or nonseparable, whereas if n = 1 then the state is called unentangled, or separable.
For the separable state

|ψ = |ψ A ⊗ |ψ B , (30.24)

the density matrices of the subsystems are still of the pure state type,

ρ A = |ψ Aψ A |, ρ B = |ψ B ψ B |. (30.25)

Returning to the spin 1/2 example, the entangled state has quantum correlations, since for instance
in |ψ states, |↑ A implies |↑B and |↓ A implies |↓B . But these quantum correlations cannot be
created locally, in A or B independently, but rather have to involve interactions between A and B.
On the other hand, |↑ A ⊗|↑B can be prepared by communicating classically (and separately) between
A and B.
In terms of a quantum transformation, however, we can entangle states by a unitary transformation
UAB in H that is not of the product type UA ⊗UB , with UA and UB independent unitary transformations
in H A and HB , respectively. Indeed, we can act on the separable states |1 = |↑ A↑B  and |2 = |↓ A↓B 
by rotating them,
     
$U_{AB}\begin{pmatrix}|1\rangle\\ |2\rangle\end{pmatrix} = \frac{1}{\sqrt 2}\begin{pmatrix}|1\rangle + |2\rangle\\ |1\rangle - |2\rangle\end{pmatrix} = \begin{pmatrix}|\psi_1\rangle\\ |\psi_2\rangle\end{pmatrix},$ (30.26)

in the same way that we act on individual up and down states in each subsystem to obtain
$U_A\begin{pmatrix}|{\uparrow}\rangle_A\\ |{\downarrow}\rangle_A\end{pmatrix} = \frac{1}{\sqrt 2}\begin{pmatrix}|{\uparrow}\rangle_A + |{\downarrow}\rangle_A\\ |{\uparrow}\rangle_A - |{\downarrow}\rangle_A\end{pmatrix},\qquad
U_B\begin{pmatrix}|{\uparrow}\rangle_B\\ |{\downarrow}\rangle_B\end{pmatrix} = \frac{1}{\sqrt 2}\begin{pmatrix}|{\uparrow}\rangle_B + |{\downarrow}\rangle_B\\ |{\uparrow}\rangle_B - |{\downarrow}\rangle_B\end{pmatrix}.$ (30.27)
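As a concrete illustration (using a Hadamard-plus-CNOT circuit, a standard choice of mine rather than the rotation above), a non-product $U_{AB}$ entangles $|{\uparrow_A\uparrow_B}\rangle$, while a product unitary leaves it separable:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)          # Hadamard on A
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]])         # flips B if A is down
U_AB = CNOT @ np.kron(H, np.eye(2))                   # not of product form

up_up = np.array([1, 0, 0, 0])                        # |up_A up_B>
psi1 = U_AB @ up_up                                   # -> the Bell state
assert np.allclose(psi1, np.array([1, 0, 0, 1]) / np.sqrt(2))

# a product unitary U_A x U_B leaves the state separable (Schmidt number 1)
prod = np.kron(H, H) @ up_up
s = np.linalg.svd(prod.reshape(2, 2), compute_uv=False)
assert np.isclose(s[1], 0.0)                          # second singular value 0
```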
The maximally entangled states are $|\psi_{1,2}\rangle$ or $|\phi_{1,2}\rangle$, in the following sense. We can calculate the
density matrix $\rho_A$, and find
$\rho_A = \mathrm{Tr}_B|\psi_{1,2}\rangle\langle\psi_{1,2}| = |a|^2|{\uparrow_A}\rangle\langle{\uparrow_A}| + |b|^2|{\downarrow_A}\rangle\langle{\downarrow_A}| = \frac{1}{2}\left(|{\uparrow_A}\rangle\langle{\uparrow_A}| + |{\downarrow_A}\rangle\langle{\downarrow_A}|\right),$ (30.28)
and similarly for the $|\phi\rangle$ states,
$\rho_A = \mathrm{Tr}_B|\phi_{1,2}\rangle\langle\phi_{1,2}| = |a'|^2|{\uparrow_A}\rangle\langle{\uparrow_A}| + |b'|^2|{\downarrow_A}\rangle\langle{\downarrow_A}| = \frac{1}{2}\left(|{\uparrow_A}\rangle\langle{\uparrow_A}| + |{\downarrow_A}\rangle\langle{\downarrow_A}|\right),$ (30.29)
and moreover $\rho_A = \rho_B$ for all these pure states, which means that
$\rho_A = \rho_B = \frac{1}{d}\mathbb 1,$ (30.30)
where $d$ is the number of basis states in the Hilbert space of A or B (the dimension of the Hilbert
space); here $d = 2$.
Obtaining the above density matrices for the subsystems A or B, with equal probabilities in all
states, is the definition of a maximally entangled state: namely, we find a maximally mixed state by
taking the trace over a subsystem B.
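This definition is easy to confirm for the Bell state $|\psi_1\rangle$; a short numpy check:

```python
import numpy as np

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)            # (|uu> + |dd>)/sqrt(2)
rho = np.outer(bell, bell).reshape(2, 2, 2, 2)        # (i_A,i_B,j_A,j_B)
rho_A = np.trace(rho, axis1=1, axis2=3)               # trace over B
rho_B = np.trace(rho, axis1=0, axis2=2)               # trace over A
d = 2
assert np.allclose(rho_A, np.eye(d)/d)                # maximally mixed
assert np.allclose(rho_B, np.eye(d)/d)
```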
The Schmidt basis decomposition is not unique, and, in order to define it uniquely, we would
need more information. In the case of a maximally entangled state, with $\rho_A = \rho_B = (1/d)\mathbb 1$, with
probabilities $p_i = 1/d$ in each state, rotating the basis states $|i\rangle$ and $|j\rangle$ with a unitary matrix $U_{ij}$ in
the Schmidt basis decomposition,
$|\psi'\rangle_{AB} = \frac{1}{\sqrt d}\sum_{i,j}|i\rangle_A U_{ij}|j\rangle_B,$ (30.31)
gives the same $\rho_A = \rho_B$ density matrices. Moreover, rotating $|i\rangle$ and $|j\rangle$ with complex conjugate (so
opposite) unitary matrices,

$|i\rangle_A = \sum_a |a\rangle_A V_{ai}, \qquad |i'\rangle_B = \sum_b |b\rangle_B V^*_{bi},$ (30.32)

leads to the same state in the total Hilbert space,

$|\psi\rangle_{AB} = \frac{1}{\sqrt d}\sum_i |i\rangle_A \otimes |i'\rangle_B = \frac{1}{\sqrt d}\sum_{i,a,b} |a\rangle_A V_{ai} \otimes |b\rangle_B V^\dagger_{ib} = \frac{1}{\sqrt d}\sum_a |a\rangle_A \otimes |a'\rangle_B.$ (30.33)

30.4 Entanglement Entropy

Entanglement between the two subsystems of a system is associated with a kind of entropic measure,
corresponding to the classical notion of entropy as relating to many possible outcomes, with various

probabilities. The entropy in the classical case of various states i, arising with classical probabilities
pi , was defined by Gibbs as

$S = -k_B\sum_i p_i\ln p_i.$ (30.34)

In this formula for the Gibbs entropy, the probabilities are distributed according to the Boltzmann
distribution,
 
$p_i = \frac{1}{Z}\exp\left(-\frac{\epsilon_i}{k_B T}\right),$ (30.35)
for energies $\epsilon_i$ of the states, so
     
$S = -k_B\sum_i p_i\left(-\frac{\epsilon_i}{k_B T} - \ln Z\right) = \sum_i p_i\frac{\epsilon_i}{T} + k_B\ln Z,$ (30.36)

which varies as
$\delta S = \sum_i \frac{\delta p_i\,\epsilon_i}{T} + k_B\ln Z\sum_i \delta p_i = \sum_i \frac{\delta p_i\,\epsilon_i}{T},$ (30.37)

since $\sum_i p_i = 1$ is preserved by the variation, and this implies that $\sum_i \delta p_i = 0$.


In the case of a quantum system, we can define an entropy that extends the classical Gibbs entropy,
the von Neumann entropy (for k B = 1)

S = − Tr ρ ln ρ, (30.38)

where ρ is the density matrix, and ln ρ is understood as the infinite Taylor series applied for a matrix
(the same as for the exponential function). For a density matrix that is diagonal,

$\rho = \sum_i p_i|i\rangle\langle i|,$ (30.39)

the von Neumann entropy reduces to the Gibbs formula,



$S = -\sum_i p_i\ln p_i.$ (30.40)

In particular, for a pure state |ψ (a single quantum state), thus for

ρ = |ψψ|, (30.41)

the probabilities are pi = δi1 , so the entropy vanishes, S = 0, as expected. This is the minimum value
of S. On the other hand, for the density matrix corresponding to the most entropic situation, with all
states having equal probabilities,
$$\rho = \frac{1}{d}\,\mathbb{1}, \qquad (30.42)$$
we obtain the maximum value of the entropy,
$$S_{\max} = -\operatorname{Tr}\rho\ln\rho = -d\cdot\frac{1}{d}\ln\frac{1}{d} = \ln d. \qquad (30.43)$$
In particular, for a two-state system, with d = 2, we have S_max = ln 2 ≈ 0.69.
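These limiting values are easy to confirm numerically. Here is a minimal sketch (assuming numpy) that evaluates S = −Tr ρ ln ρ from the eigenvalues of ρ, for a pure state and for the maximally mixed two-state density matrix:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S = -Tr(rho ln rho), evaluated from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                   # use the convention 0 ln 0 = 0
    return float(-np.sum(p * np.log(p)))

psi = np.array([1.0, 0.0])             # a pure state |psi>
rho_pure = np.outer(psi, psi.conj())   # rho = |psi><psi|, eq. (30.41)
rho_mixed = np.eye(2) / 2              # rho = (1/d) 1 with d = 2, eq. (30.42)

print(von_neumann_entropy(rho_pure))   # 0
print(von_neumann_entropy(rho_mixed))  # ln 2 = 0.693...
```

Working with the eigenvalues sidesteps the matrix Taylor series for ln ρ, since ρ is Hermitian and positive semidefinite.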

We can now specialize to the case when we have a total system that can be divided into two
systems A and B, H = H A ⊗ HB , whether the division is physical (some sort of separation, either a
wall, or a distance) or not. Then we define as before
ρ A = TrB ρ tot , ρ B = Tr A ρ tot , (30.44)
and we define the entanglement entropy of A in the system as the von Neumann entropy of ρ A,
S A = − Tr ρ A ln ρ A. (30.45)
This, however, is not the usual extensive entropy, since, for instance, for a pure state |ψ⟩_AB, with total density matrix ρ_tot = |ψ⟩_AB AB⟨ψ|, the entanglement entropy is the same for A and B,
$$\rho_A = \rho_B \;\Rightarrow\; S_A = S_B, \qquad (30.46)$$
contradicting the extensive property.
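The equality S_A = S_B for a pure bipartite state can be checked directly. The sketch below (assuming numpy) encodes |ψ⟩_AB through its coefficient matrix c_ij, so that ρ_A = c c† and ρ_B = cᵀ c*, and evaluates both entropies for a maximally entangled two-qubit state:

```python
import numpy as np

def entropy(rho):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

# |psi>_AB = (|00> + |11>)/sqrt 2, written as a coefficient matrix c[i, j]
c = np.eye(2) / np.sqrt(2)
rho_A = c @ c.conj().T    # Tr_B |psi><psi|
rho_B = c.T @ c.conj()    # Tr_A |psi><psi|

S_A, S_B = entropy(rho_A), entropy(rho_B)
print(S_A, S_B)           # both equal ln 2
```

The coefficient-matrix trick works because ρ_A has matrix elements Σ_j c_ij c*_{i'j}, which is exactly (c c†)_{ii'}.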
Going back to the generic von Neumann entropy, it has the following properties:
• It is invariant under unitary transformations, |i⟩ → U|i⟩, implying ρ → UρU† = UρU⁻¹. Indeed, then
$$S(\rho) \to -\operatorname{Tr}\left[U\rho U^{-1}\ln(U\rho U^{-1})\right] = S(\rho), \qquad (30.47)$$
where we have used the fact that the natural log is written as an infinite Taylor series, leading to ln(UρU⁻¹) = U (ln ρ) U⁻¹.
• It is additive for independent systems ρ_A, ρ_B (it is not additive for interacting subsystems of a larger system, as we have seen):
$$S(\rho_A \otimes \rho_B) = S(\rho_A) + S(\rho_B). \qquad (30.48)$$
• It has strong subadditivity. Considering three systems A, B, C, we have
S ABC + SB ≤ S AB + SBC . (30.49)
This is less obvious, and the proof of this property is quite involved. We also obtain that
S A + SC ≤ S AB + SBC . (30.50)
Putting B = 0 in the first relation, we obtain the standard subadditivity relation,
S AC ≤ S A + SC . (30.51)
Since the entanglement entropy is the von Neumann entropy of a subsystem, it also satisfies the
strong subadditivity inequality, as the inequality does not depend on S A being calculated from a
subsystem of a system.
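Though the proof is involved, the inequality itself is easy to test numerically. The sketch below (assuming numpy) draws a random mixed three-qubit density matrix, computes the reduced density matrices by partial trace, and checks S_ABC + S_B ≤ S_AB + S_BC:

```python
import numpy as np

def entropy(rho):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

def reduced(rho, keep, dims=(2, 2, 2)):
    """Partial trace of rho onto the subsystems listed in `keep`."""
    n = len(dims)
    r = rho.reshape(dims + dims)
    m = n
    # trace out unwanted subsystems, from the highest index down
    for ax in sorted(set(range(n)) - set(keep), reverse=True):
        r = np.trace(r, axis1=ax, axis2=ax + m)
        m -= 1
    d = int(np.prod([dims[k] for k in keep]))
    return r.reshape(d, d)

rng = np.random.default_rng(0)
G = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
rho = G @ G.conj().T
rho /= np.trace(rho).real                 # a random mixed state on A x B x C

S_ABC = entropy(rho)
S_B = entropy(reduced(rho, [1]))
S_AB = entropy(reduced(rho, [0, 1]))
S_BC = entropy(reduced(rho, [1, 2]))
print(S_ABC + S_B <= S_AB + S_BC + 1e-9)  # True for this (and any) state
```

Rerunning with other seeds gives other random states; strong subadditivity holds for all of them, as the theorem guarantees.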

30.5 The EPR Paradox and Hidden Variables

The EPR paradox was put forward in the Einstein–Podolsky–Rosen 1935 paper, which described a
“paradoxical situation”, with correlations over potentially infinite distances. The original paper dealt
with measurements of p and x, but the “paradox” is easier to understand in the situation of measuring
spins 1/2, in line with what we have already done in this chapter. This description was put forward
by David Bohm.

Figure 30.1 The EPR paradox from entanglement. What is different in quantum mechanics is that the measurement of the spin can be in any direction. “OR” means that the upper and lower parts of the figure are alternatives. The encircled cross indicates that the spins are connected via a tensor product.

We consider a physical situation in which a particle of total spin 0 decays (or disintegrates) into
two particles of spin 1/2, in which case the total final spin, the vector sum of the two spins 1/2, must
vanish. If the particles continue to move apart, even after they are at a large (macroscopic) distance,
we still need to have the total spin zero, since the total state is entangled. An example of the above
decay is the decay of the spin zero η particle into two oppositely charged muons,

η → μ+ + μ− . (30.52)

The result of the decay is then a state of the type |φ⟩,
$$|\phi_{1,2}\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow}_A{\downarrow}_B\rangle \pm |{\downarrow}_A{\uparrow}_B\rangle\right), \qquad (30.53)$$
which means that if Alice measures the spin projection, she knows for sure what Bob would measure
if later he measured the spin projection in the same direction; see Fig. 30.1.
However, if this were all there was to say, the situation would not be quantum mechanical in nature, since in a classical measurement we could get the same thing. Consider a measurement
that corresponds to Alice picking out at random a ball from a hat containing a black ball (spin up) and a white ball (spin down). Once Alice sees (measures) which ball (spin) she has, she knows what Bob would measure when he subsequently pulls the remaining ball (spin) from the hat.
What is different about quantum mechanics, though, is that we have total spin zero for the spin projection in any direction, for instance in the x and z directions. But, on the other hand, the states of given spin in x can be decomposed as linear combinations of the states of given spin in z,
$$|S_x; \pm\rangle = \frac{1}{\sqrt{2}}\left(|S_z; +\rangle \pm |S_z; -\rangle\right), \qquad (30.54)$$
with inverse
$$|S_z; \pm\rangle = \frac{1}{\sqrt{2}}\left(|S_x; +\rangle \pm |S_x; -\rangle\right). \qquad (30.55)$$
Then the entangled spin-singlet total state for spin measured along the z direction,
$$|\phi\rangle_{AB} = \frac{1}{\sqrt{2}}\left(|S_z; +\rangle_A \otimes |S_z; -\rangle_B - |S_z; -\rangle_A \otimes |S_z; +\rangle_B\right), \qquad (30.56)$$

can be rewritten in terms of the spin-singlet total state for spin measured along the x direction,
$$|\phi\rangle_{AB} = -\frac{1}{\sqrt{2}}\left(|S_x; +\rangle_A \otimes |S_x; -\rangle_B - |S_x; -\rangle_A \otimes |S_x; +\rangle_B\right). \qquad (30.57)$$
The minus sign in front is irrelevant, since it represents the same state. Then the expression above
shows the ambiguity of the Schmidt basis for a maximally entangled state such as |φ AB . And the
physical consequence is that, unlike in the classical case, Alice and Bob can measure the spin in
different directions, and still obtain probabilistic results. For instance, Alice can measure the spin on
z, and afterwards Bob can measure the spin on x, in which case Bob will obtain spin up or down with
the same probability 1/2. This is then definitely not a classical result!
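The basis ambiguity can be verified directly. The sketch below (assuming numpy) builds the singlet out of the z-basis states and out of the x-basis states defined in (30.54), and confirms that the two expressions agree up to the irrelevant overall sign in (30.57):

```python
import numpy as np

up, dn = np.array([1.0, 0]), np.array([0, 1.0])          # |Sz;+>, |Sz;->
xp, xm = (up + dn) / np.sqrt(2), (up - dn) / np.sqrt(2)  # |Sx;+>, |Sx;->

def singlet(plus, minus):
    """(|+>_A |->_B - |->_A |+>_B)/sqrt 2 for a given basis."""
    return (np.kron(plus, minus) - np.kron(minus, plus)) / np.sqrt(2)

phi_z = singlet(up, dn)   # eq. (30.56)
phi_x = singlet(xp, xm)   # eq. (30.57), up to the overall minus sign
print(np.allclose(phi_z, -phi_x))  # True: the same physical state
```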
The measurement by Alice of a spin in A, which dictates what Bob would find if he then measures
the same spin in the B system, seems to suggest an instantaneous (with v > c) travel of information,
so is the information transmitted at superluminal speeds? The answer is NO. The point is that it is
not a message (information) being transmitted, since we have no way of determining whether Bob
actually made that measurement (which would amount to information about Bob’s actions). It would
seem as though we would need something like a simultaneous measurement on x and z, which is not
quite possible.
To generalize the previous case, consider an arbitrary direction n; changing |φ⟩ states to |ψ⟩ states for simplicity, we find, by changing the states on A with the unitary matrix V, and the states on B with V†,
$$|\psi\rangle_{AB} = \frac{1}{\sqrt{2}}\left(|S_z; +\rangle_A \otimes |S_z; +\rangle_B + |S_z; -\rangle_A \otimes |S_z; -\rangle_B\right) = \frac{1}{\sqrt{2}}\left(|S_n; +\rangle_A \otimes |S_n; +\rangle_B + |S_n; -\rangle_A \otimes |S_n; -\rangle_B\right). \qquad (30.58)$$
Given this transformation of the entangled state between the directions of spin measured, we might
hope that we could use the relation to send signals, with the message encoded in whichever direction
has been used for measurement by Bob (for instance, x or z). However, in fact there is no way to do
this, as we can check. The only way to do it is to use extra information, which needs to be passed
between Alice and Bob classically, so it must travel at subluminal speeds.
In conclusion, entanglement does not imply a violation of causality, as information cannot be
exchanged at superluminal speeds. In this sense, the EPR experiment is not paradoxical. Einstein
knew this, as shown in the EPR paper. However, he felt that there was a paradox nevertheless, since
he defined a theory as something with a complete description of physical reality, which would need
to satisfy the stronger condition later called Einstein locality, or local realism, namely that:

An action on A must not modify the description of B.


Clearly the EPR experiment involves a paradox, as it breaks the Einstein locality criterion: while
information does not travel superluminally, measurement of A changes the description of B by
collapsing the state of the system to a state of given spin for B.
Einstein then devised possible descriptions of the EPR paradox experiment that mimic quantum
mechanics but satisfy Einstein locality, by having some classical description, with some hidden
variables (that cannot be known by experiment). The description is classical, but the appearance
of quantum probabilities is due to our not knowing some degrees of freedom, the hidden variables,
which can for instance be some λ ∈ (0, 1). It seemed as though this was a solution, and quantum
mechanics would not be needed.

But Bell showed that we can distinguish experimentally between a hidden variable theory
and quantum mechanics, by calculating the satisfaction or not of some inequalities, called Bell’s
inequalities, which will be studied in the next chapter.

Important Concepts to Remember

• In the case of two (sub)systems A and B, a separable state is a state of tensor product form, |ψ⟩_A ⊗ |χ⟩_B, while an entangled state is a state in H_A ⊗ H_B that cannot be written as such a product (is not separable).
• The formal definition of entanglement comes from the Schmidt decomposition of the bipartite state |ψ⟩_AB, as Σ_i √p_i |i⟩_A ⊗ |i⟩_B. If the Schmidt number n (the number of nonzero eigenvalues p_i) is > 1, we have an entangled state, and if n = 1, a separable state.
• We can entangle states by a unitary transformation UAB that is not a product of unitaries UA ⊗ UB .
• A maximally entangled state is a state that, when traced over either A or B, leads to a constant (and equal) density matrix, ρ_A = ρ_B = (1/d)𝟙.
• The quantum mechanical von Neumann entropy, S_vN = −Tr ρ ln ρ, extends the classical Gibbs entropy S = −k_B Σ_i p_i ln p_i.
• For a system formed by two subsystems, we define the entanglement entropy as the von Neumann entropy of A, S_E = −Tr ρ_A ln ρ_A, where ρ_A = Tr_B ρ_tot and ρ_B = Tr_A ρ_tot.
• The EPR “paradox” arises when we make a measurement in A of an entangled state, thus modifying
what B can see. In the quantum spin 1/2 case, unlike the classical case, knowing the spin in
one direction means a given state for the spin in another direction, thus still implies something
nontrivial for a different measurement.
• In the EPR experiment information does not travel at superluminal speeds, but a measurement of
A changes the description of B, thus we violate Einstein locality, or local realism.

Further Reading
See Preskill’s lecture notes on quantum information [12].

Exercises

(1) Consider the state |χ^a_{1,2}⟩ = |ψ_{1,2}⟩ + a|φ_{1,2}⟩, for an arbitrary a ∈ ℝ. When is it separable? Calculate ρ_A and ρ_B to check that you find the correct results.
(2) Apply Schmidt decomposition to the general case in exercise 1.
(3) Consider the same general state as that in exercise 1, in the generic entangled case. What is the unitary matrix that disentangles it?
(4) Consider the density matrix
$$\rho = C\left(|1\rangle\langle 2| + |2\rangle\langle 1| + a|2\rangle\langle 2| + |2\rangle\langle 3| + |3\rangle\langle 2|\right), \qquad (30.59)$$
where |1⟩, |2⟩, |3⟩ are orthonormal states. Calculate C, and thus the von Neumann entropy of this mixed state.

(5) For the generic state in exercise 1, calculate the entanglement entropy.
(6) Consider three spin 1/2 systems, A, B, and C, and a state
$$C\left(|{\uparrow\uparrow\uparrow}\rangle + |{\downarrow\uparrow\downarrow}\rangle + |{\uparrow\downarrow\uparrow}\rangle\right). \qquad (30.60)$$
Check that the strong subadditivity condition is satisfied.
(7) Consider an EPR state |φ⟩_AB; Alice measures the spin on z, then Bob measures it on x, and
then Alice measures it again on z. Classify the possible answers for the second measurement
of Alice, with their probabilities, depending on the initial measurement of Alice (and without
knowledge of the measurement of Bob). What do you deduce?
31 The Interpretation of Quantum Mechanics and Bell's Inequalities

In this chapter we describe Bell's inequalities, which apply to hidden variable theories in EPR paradox-type situations; the inequalities are violated by the predictions of quantum mechanics, thus providing an experimental test, which in all cases has confirmed quantum mechanics as the correct description. Finally, we describe some of the leading interpretations of quantum mechanics.

31.1 Bell’s Original Inequality

As we described at the end of the previous chapter, Einstein considered a deterministic theory, a
“hidden variable” theory, in which there are some variables that cannot be observed that generate
the observed statistical results in an a priori deterministic model. The assumption is that the hidden
variable theory will reproduce the quantum mechanical results perfectly. But, in his seminal paper,
John Bell showed that such a hidden variable model would lead to an inequality that could be violated
by quantum mechanics, and thus could be experimentally distinguished from it.
Consider measuring the reduced spin, σ ≡ 2S/ℏ, in two different directions, defined by unit vectors a and b, so we have (σ · a) for system A, i.e., measured by Alice, and (σ · b) for system B, i.e., measured by Bob.
According to the hidden variable model, with hidden variable λ, Alice measures A(a; λ) and Bob
measures B(b; λ), obtaining possible values A = ±1 and B = ±1.
In the case of Bell’s original inequality, applied to David Bohm’s (now standard) version of the
EPR paradox, with a system of total spin 0, meaning that if a = b (Alice and Bob measure spin in
the same direction) then A = −B (and if a = −b, then A = B), we thus get

A(a; λ) = −B(a; λ). (31.1)

The hidden variable λ appears (since we cannot measure it) as a statistical ensemble, with probability distribution ρ(λ), normalized to one,
$$\int_\Lambda d\lambda\,\rho(\lambda) = 1. \qquad (31.2)$$

Then define, in this hidden variable theory, the correlation function of the measurements of Alice and Bob,
$$C_{\rm hid}(\vec a, \vec b) = \int_\Lambda d\lambda\,\rho(\lambda)\, A(\vec a; \lambda) B(\vec b; \lambda). \qquad (31.3)$$


Given (31.1), we rewrite this as
$$C_{\rm hid}(\vec a, \vec b) = -\int_\Lambda d\lambda\,\rho(\lambda)\, A(\vec a; \lambda) A(\vec b; \lambda). \qquad (31.4)$$

In order to obtain a relevant inequality, we consider a situation involving spin measurements in three directions a, b, c. We then calculate
$$C_{\rm hid}(\vec a, \vec b) - C_{\rm hid}(\vec a, \vec c) = -\int_\Lambda d\lambda\,\rho(\lambda)\left[A(\vec a;\lambda)A(\vec b;\lambda) - A(\vec a;\lambda)A(\vec c;\lambda)\right] = \int_\Lambda d\lambda\,\rho(\lambda)\, A(\vec a;\lambda)A(\vec b;\lambda)\left[A(\vec b;\lambda)A(\vec c;\lambda) - 1\right], \qquad (31.5)$$
where in the second equality we have used the fact that A = ±1, so [A(b; λ)]² = 1, which we introduced in the second term.
Since A = ±1, this means that also A(a; λ)A(b; λ) = ±1. Then we obtain an inequality,
$$|C_{\rm hid}(\vec a, \vec b) - C_{\rm hid}(\vec a, \vec c)| = \left|\int_\Lambda d\lambda\,\rho(\lambda)\, A(\vec a;\lambda)A(\vec b;\lambda)\left[A(\vec b;\lambda)A(\vec c;\lambda) - 1\right]\right| \le \int_\Lambda d\lambda\,\rho(\lambda)\left|A(\vec b;\lambda)A(\vec c;\lambda) - 1\right|\left|A(\vec a;\lambda)A(\vec b;\lambda)\right| = \int_\Lambda d\lambda\,\rho(\lambda)\left|A(\vec b;\lambda)A(\vec c;\lambda) - 1\right|. \qquad (31.6)$$
But ρ(λ) ≥ 0 and (since A = ±1)
$$1 - A(\vec b;\lambda)A(\vec c;\lambda) \ge 0, \qquad (31.7)$$
which means we obtain
$$|C_{\rm hid}(\vec a, \vec b) - C_{\rm hid}(\vec a, \vec c)| \le \int_\Lambda d\lambda\,\rho(\lambda)\left[1 - A(\vec b;\lambda)A(\vec c;\lambda)\right] = 1 + C_{\rm hid}(\vec b, \vec c), \qquad (31.8)$$
where in the second equality we used the normalization ∫_Λ dλ ρ(λ) = 1 and the definition of C_hid. The resulting inequality is called Bell's (original) inequality.
On the other hand, in quantum mechanics, we define the corresponding correlation function of the measurements of Alice and Bob. Since in this case A = (σ · a)^(A) and B = (σ · b)^(B), we obtain the expectation value of the product AB, in a state |ψ⟩ of the total Hilbert space H = H_A ⊗ H_B,
$$C_{\rm QM}(\vec a, \vec b) = \langle\psi|(\vec\sigma\cdot\vec b)^{(B)}(\vec\sigma\cdot\vec a)^{(A)}|\psi\rangle. \qquad (31.9)$$

This correlation function will violate the Bell inequality, showing that quantum mechanics is
different from the hidden variable theory.
We choose an entangled state of total spin 0, meaning a |φ⟩ state, in the nomenclature of the previous chapter, and one that is antisymmetric in the opposite spins for A and B, uniquely identifying the state as |φ₂⟩,
$$|\psi\rangle \equiv |\phi_-\rangle = |\phi_2\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow}_A{\downarrow}_B\rangle - |{\downarrow}_A{\uparrow}_B\rangle\right). \qquad (31.10)$$

But, as we have seen, for such a state the density matrix obtained for the subsystem A by taking the trace over B is ρ_A = (1/2)𝟙. Then we obtain for the quantum mechanical correlation function
$$C_{\rm QM}(\vec a, \vec b) = \langle\psi|(\vec\sigma\cdot\vec b)^{(B)}(\vec\sigma\cdot\vec a)^{(A)}|\psi\rangle = -\langle\psi|(\vec\sigma\cdot\vec b)^{(A)}(\vec\sigma\cdot\vec a)^{(A)}|\psi\rangle = -a_i b_j \langle\psi|\sigma_j^{(A)}\sigma_i^{(A)}|\psi\rangle = -a_i b_j \operatorname{Tr}_A\left(\rho_A\,\sigma_j^{(A)}\sigma_i^{(A)}\right) = -\vec a\cdot\vec b = -\cos\theta(\vec a, \vec b). \qquad (31.11)$$
In the fourth equality, we have used the fact that, since there is no operator acting on the B subsystem, the correlation function equals the expectation value in A for the density matrix of the subsystem A. In the fifth equality we have used the fact that
$$\operatorname{Tr}_A\left(\rho_A\,\sigma_j^{(A)}\sigma_i^{(A)}\right) = \frac{1}{2}\operatorname{Tr}_A\left(\sigma_j^{(A)}\sigma_i^{(A)}\right) = \delta_{ij}, \qquad (31.12)$$
and in the last equality we have defined θ(a, b) as the angle between the two unit vectors.
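The result C_QM(a, b) = −cos θ(a, b) can be reproduced numerically. The following sketch (assuming numpy, with both directions taken in the x-z plane for simplicity) evaluates the expectation value in the singlet state directly:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def spin_op(n):
    return n[0] * sx + n[1] * sy + n[2] * sz               # sigma . n

def unit(theta):
    return np.array([np.sin(theta), 0.0, np.cos(theta)])   # unit vector in the x-z plane

phi_minus = np.array([0, 1, -1, 0]) / np.sqrt(2)           # (|ud> - |du>)/sqrt 2

def C_QM(a, b):
    op = np.kron(spin_op(a), spin_op(b))                   # (sigma.a)^(A) (sigma.b)^(B)
    return float(np.real(phi_minus.conj() @ op @ phi_minus))

ta, tb = 0.3, 1.1
print(np.isclose(C_QM(unit(ta), unit(tb)), -np.cos(tb - ta)))  # True
```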
In order to easily see that this correlation function violates the Bell inequality, we consider the simpler case of small angles. First, if
$$|\theta(\vec b, \vec c)| \ll 1, \qquad (31.13)$$
then the right-hand side of the Bell inequality in the quantum mechanics case is
$$1 + C_{\rm QM}(\vec b, \vec c) = 1 - \cos\theta(\vec b, \vec c) \simeq \frac{1}{2}\theta^2(\vec b, \vec c). \qquad (31.14)$$
Then, if
$$|\theta(\vec a, \vec b)| \ll 1, \qquad |\theta(\vec a, \vec c)| \ll 1 \qquad (31.15)$$
as well, we obtain that the quantum mechanical version of the Bell inequality becomes
$$\frac{1}{2}\left|\theta^2(\vec a, \vec b) - \theta^2(\vec a, \vec c)\right| \le \frac{1}{2}\theta^2(\vec b, \vec c). \qquad (31.16)$$

Figure 31.1 Set-up for violation of the Bell inequalities: coplanar unit vectors a, b, c, with the angles θ(a, b), θ(b, c), and θ(a, c) between them.



However, this relation can be easily violated. Consider coplanar unit vectors a, b, c; see Fig. 31.1. Then θ(a, c) = θ(a, b) + θ(b, c), so the inequality becomes
$$\left|\left(\theta(\vec a, \vec b) + \theta(\vec b, \vec c)\right)^2 - \theta^2(\vec a, \vec b)\right| = \theta^2(\vec b, \vec c) + 2\theta(\vec b, \vec c)\theta(\vec a, \vec b) \le \theta^2(\vec b, \vec c), \qquad (31.17)$$
which is clearly violated, since the left-hand side exceeds the right-hand side by 2θ(b, c)θ(a, b) > 0.
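The violation can also be confirmed with a few lines of arithmetic (a sketch assuming numpy), using C(a, b) = −cos θ(a, b) for coplanar directions at small angles:

```python
import numpy as np

C = lambda t1, t2: -np.cos(t2 - t1)  # singlet correlation, eq. (31.11)

ta, tb, tc = 0.0, 0.3, 0.6           # coplanar, so theta(a,c) = theta(a,b) + theta(b,c)
lhs = abs(C(ta, tb) - C(ta, tc))     # |C(a,b) - C(a,c)|
rhs = 1 + C(tb, tc)                  # 1 + C(b,c)
print(lhs, rhs, lhs <= rhs)          # lhs > rhs: Bell's inequality fails
```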


In principle, this Bell inequality can be tested experimentally. Alice measures A and gets ±1, and Bob measures B and gets ±1. Since the correlation function is a sum of products AB weighted by probabilities (so a sum over events, divided by the total number of events), i.e., a statistical expectation value, the outcomes A = +, B = + and A = −, B = − both contribute with a plus sign to the sum, whereas A = +, B = − and A = −, B = + both contribute with a minus sign, so the experimental correlation function is (here the Ns are numbers of events)
$$C_{\rm exp}(\vec a, \vec b) = \frac{N_{++} + N_{--} - N_{+-} - N_{-+}}{N_{++} + N_{--} + N_{+-} + N_{-+}}. \qquad (31.18)$$
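As an illustration of how C_exp converges to the quantum prediction −cos θ(a, b), the following sketch (assuming numpy) draws simulated event counts from the singlet joint probabilities, P(same signs) = (1 − cos θ)/4 and P(opposite signs) = (1 + cos θ)/4 for each outcome, and evaluates (31.18):

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 0.8                     # angle between Alice's and Bob's measurement axes

# singlet joint probabilities for the outcomes (++, --, +-, -+)
p = np.array([1 - np.cos(theta), 1 - np.cos(theta),
              1 + np.cos(theta), 1 + np.cos(theta)]) / 4

N = rng.multinomial(100_000, p)                  # simulated event counts
C_exp = (N[0] + N[1] - N[2] - N[3]) / N.sum()    # eq. (31.18)
print(C_exp, -np.cos(theta))                     # the two agree to about 1%
```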

We could then see if the Bell inequality is experimentally violated or not. Unfortunately, the case
of perfectly anti-aligned spins for A and B is hard to measure, so we also need other Bell inequalities
that are easier to measure.
Before that, however, we will derive other, even simpler to understand, Bell inequalities, though
they are less relevant for the possible hidden variable theories (they eliminate fewer hidden variable
theories).

31.2 Bell–Wigner Inequalities

We now consider the Bell inequalities in a simple model by Wigner from 1970 [17]. This is also
described in Modern Quantum Mechanics by Sakurai and Tuan [1], and a variant of the same was
presented by Preskill in his lecture notes [12].
Consider the case of a hidden variable class of models, considered by Wigner, in which we have classically random emitters of spins, which emit spins of given values in several directions at the same time (though it is impossible to measure the spin in several directions simultaneously, consistent with the experimental results that agree with quantum mechanics). Suppose that in the direction of unit vector a we get the possible values σ_a = ±1, and in the direction of unit vector b we get the possible values σ_b = ±1. All are pre-ordained, but not knowable.
For the EPR experiment, Alice and Bob must always measure opposite spins in a given direction.
Then consider states that have given spins in two directions, (a σa , bσb ), or in three directions
(a σa , bσb , c σc ), etc. The EPR assumption means that Alice and Bob measure totally opposite states.
For spins in three directions, we obtain Table 31.1 for the various possibilities, each with an associated number N_i of events, thus with probability
$$P_i = \frac{N_i}{\sum_k N_k}. \qquad (31.19)$$

We denote by P(aσ_a, bσ_b) the probability for Alice to measure σ_a on a and for Bob to measure σ_b on b.

Table 31.1 Possible measurements by Alice and Bob, and the corresponding numbers of events.

Number of events | Alice (system A): σ_a σ_b σ_c | Bob (system B): σ_a σ_b σ_c
N1               | + + +                         | − − −
N2               | + + −                         | − − +
N3               | + − +                         | − + −
N4               | + − −                         | − + +
N5               | − + +                         | + − −
N6               | − + −                         | + − +
N7               | − − +                         | + + −
N8               | − − −                         | + + +

Then, by inspection of Table 31.1, we see that P(a+, b+) corresponds to N3 and N4, P(a+, c+) corresponds to N2 and N4, and P(c+, b+) corresponds to N3 and N7, so we obtain
$$P(\vec a+, \vec b+) = P_3 + P_4 = \frac{N_3 + N_4}{\sum_k N_k}, \qquad P(\vec a+, \vec c+) = P_2 + P_4 = \frac{N_2 + N_4}{\sum_k N_k}, \qquad P(\vec c+, \vec b+) = P_3 + P_7 = \frac{N_3 + N_7}{\sum_k N_k}. \qquad (31.20)$$

But then, since

N3 + N4 ≤ (N2 + N4 ) + (N3 + N7 ), (31.21)

it means that we obtain the inequality

P(a+, b+) ≤ P(a+, c+) + P(c+, b+), (31.22)

which is one form of the Bell–Wigner inequality.


Other inequalities are possible. One example is obtained by defining the probability to obtain the same result for the same observer (say, Alice) in measurements along two directions i, j chosen among a, b, c, the probability being denoted by P_same(i, j). Note that this probability corresponds to having opposite spins (as in the EPR paradox) measured by Alice and Bob on the same pair of directions. Then, again by a simple inspection of Table 31.1 (σ_a = σ_b in rows 1, 2, 7, 8; σ_a = σ_c in rows 1, 3, 6, 8; σ_b = σ_c in rows 1, 4, 5, 8), we obtain that
$$P_{\rm same}(\vec a, \vec b) = \frac{N_1 + N_2 + N_7 + N_8}{\sum_k N_k}, \qquad P_{\rm same}(\vec a, \vec c) = \frac{N_1 + N_3 + N_6 + N_8}{\sum_k N_k}, \qquad P_{\rm same}(\vec b, \vec c) = \frac{N_1 + N_4 + N_5 + N_8}{\sum_k N_k}. \qquad (31.23)$$

Summing the three probabilities, we obtain
$$P_{\rm same}(\vec a, \vec b) + P_{\rm same}(\vec a, \vec c) + P_{\rm same}(\vec b, \vec c) = \frac{\sum_k N_k + 2N_1 + 2N_8}{\sum_k N_k} \ge 1. \qquad (31.24)$$
This is the new type of Bell–Wigner inequality.


We next consider the results corresponding to the Bell–Wigner inequalities in quantum mechanics. In quantum mechanics, in order to calculate the probability P(a+, b+), we need to insert the projection operator along the spin up or spin down direction on n = a, b, i.e.,
$$E(\vec n, \pm) = \frac{\mathbb{1} \pm \vec n\cdot\vec\sigma}{2}. \qquad (31.25)$$
Specifically, we obtain
$$P(\vec a+, \vec b+) = \langle\phi_-|E^{(A)}(\vec a, +)\, E^{(B)}(\vec b, +)|\phi_-\rangle. \qquad (31.26)$$

But, substituting the expression for E(n, ±), and using the facts that
$$\langle\phi_-|\mathbb{1}|\phi_-\rangle = 1, \qquad \langle\phi_-|(\vec n\cdot\vec\sigma)^{(A,B)}|\phi_-\rangle = 0, \qquad \langle\phi_-|(\vec\sigma\cdot\vec a)^{(A)}(\vec\sigma\cdot\vec b)^{(B)}|\phi_-\rangle = -\cos\theta(\vec a, \vec b), \qquad (31.27)$$
we get
$$P(\vec a+, \vec b+) = \frac{1}{4}\left(1 - \cos\theta(\vec a, \vec b)\right) = \frac{1}{2}\sin^2\frac{\theta(\vec a, \vec b)}{2}. \qquad (31.28)$$
Similarly, we obtain for the other three possibilities for measurement,
$$P(\vec a-, \vec b-) = \langle\phi_-|E^{(A)}(\vec a, -)\, E^{(B)}(\vec b, -)|\phi_-\rangle = \frac{1}{4}\left(1 - \cos\theta(\vec a, \vec b)\right),$$
$$P(\vec a+, \vec b-) = \langle\phi_-|E^{(A)}(\vec a, +)\, E^{(B)}(\vec b, -)|\phi_-\rangle = \frac{1}{4}\left(1 + \cos\theta(\vec a, \vec b)\right), \qquad (31.29)$$
$$P(\vec a-, \vec b+) = \langle\phi_-|E^{(A)}(\vec a, -)\, E^{(B)}(\vec b, +)|\phi_-\rangle = \frac{1}{4}\left(1 + \cos\theta(\vec a, \vec b)\right).$$
Then, we see that indeed we have probabilities normalized to one, since

P(a+, b+) + P(a−, b−) + P(a+, b−) + P(a−, b+) = 1. (31.30)

The (first) Bell–Wigner inequality in the quantum mechanical version becomes
$$\frac{1}{4}\left(1 - \cos\theta(\vec a, \vec c) + 1 - \cos\theta(\vec c, \vec b)\right) \ge \frac{1}{4}\left(1 - \cos\theta(\vec a, \vec b)\right) \;\Rightarrow\; \frac{1}{2}\left(\sin^2\frac{\theta(\vec a, \vec c)}{2} + \sin^2\frac{\theta(\vec c, \vec b)}{2}\right) \ge \frac{1}{2}\sin^2\frac{\theta(\vec a, \vec b)}{2}. \qquad (31.31)$$

But this inequality is easily violated if a, c, b are coplanar and ordered in this way, so
$$\theta(\vec a, \vec b) = \theta(\vec a, \vec c) + \theta(\vec c, \vec b) \qquad (31.32)$$
is small, as before, so that
$$\theta^2(\vec a, \vec b) = \left(\theta(\vec a, \vec c) + \theta(\vec c, \vec b)\right)^2 = \theta^2(\vec a, \vec c) + \theta^2(\vec c, \vec b) + 2\theta(\vec a, \vec c)\theta(\vec c, \vec b) > \theta^2(\vec a, \vec c) + \theta^2(\vec c, \vec b). \qquad (31.33)$$
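Numerically (a sketch assuming numpy), with P(a+, b+) = (1/2) sin²(θ/2) from (31.28) and small coplanar angles:

```python
import numpy as np

P = lambda th: 0.5 * np.sin(th / 2) ** 2   # P(+,+) for relative angle th, eq. (31.28)

t_ac = t_cb = 0.3                           # coplanar, so theta(a,b) = 0.6
lhs, rhs = P(t_ac + t_cb), P(t_ac) + P(t_cb)
print(lhs <= rhs)                           # False: the Bell-Wigner inequality is violated
```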

As in the case of the other Bell–Wigner inequality, we first note that in quantum mechanics we have
$$P_{\rm same}(\vec a, \vec b) = P(\vec a+, \vec b+) + P(\vec a-, \vec b-) = \frac{1}{2}\left(1 - \cos\theta(\vec a, \vec b)\right) = \sin^2\frac{\theta(\vec a, \vec b)}{2}. \qquad (31.34)$$
Then the left-hand side of the Bell–Wigner inequality is
$$P_{\rm same}(\vec a, \vec b) + P_{\rm same}(\vec b, \vec c) + P_{\rm same}(\vec a, \vec c) = \sin^2\frac{\theta(\vec a, \vec b)}{2} + \sin^2\frac{\theta(\vec b, \vec c)}{2} + \sin^2\frac{\theta(\vec a, \vec c)}{2}, \qquad (31.35)$$
which can easily be arranged to be less than 1.
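For instance (a sketch assuming numpy), with θ(a, b) = θ(b, c) = 0.3 and coplanar directions, so θ(a, c) = 0.6:

```python
import numpy as np

Psame = lambda th: np.sin(th / 2) ** 2      # eq. (31.34)

t_ab = t_bc = 0.3
total = Psame(t_ab) + Psame(t_bc) + Psame(t_ab + t_bc)
print(total)   # about 0.13, well below the classical lower bound of 1
```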
However, again, as in the case of the original Bell’s inequality, it is experimentally difficult
to arrange since we again have opposite spins for Alice and Bob. This means that we need a
generalization of the Bell inequality that can be tested experimentally.

31.3 CHSH Inequality (or Bell–CHSH Inequality)

A generalization of the Bell inequality that can be tested experimentally is provided by the inequality
derived by John Clauser, Michael Horne, Abner Shimony and R. A. Holt: the CHSH inequality.
We consider the situation where Alice can measure two variables, a and a′, that take values in {±1}, and Bob measures another two variables, b and b′, that also take values in {±1}. In a hidden variable theory, depending on the hidden variables λ, the measurements of Alice are A(a, λ) ≡ A and A′(a′, λ) ≡ A′, and the measurements of Bob are B(b, λ) ≡ B and B′(b′, λ) ≡ B′. Note that a, a′, b, b′ can be any variables, not necessarily spins in some direction.
We define the correlation function in the hidden variable theory as before, by
$$C_{\rm hid}(a, b) = \int_\Lambda d\lambda\,\rho(\lambda)\, A(a, \lambda) B(b, \lambda) \equiv \langle AB\rangle, \qquad (31.36)$$
where the final notation is understood as the statistical (classical) average. Note that this definition involves no assumption that B(a, λ) = −A(a, λ), as for spins in the EPR experiment.
Then, since A, A′ = ±1, it follows that either
$$A + A' = 0,\; A - A' = \pm 2, \qquad \text{or} \qquad A - A' = 0,\; A + A' = \pm 2. \qquad (31.37)$$
This implies that we have
$$(A + A')B + (A - A')B' = AB + A'B + AB' - A'B' \equiv M = \pm 2. \qquad (31.38)$$

But then we obtain an inequality by using the statistical average and the modulus,
$$\left|\int_\Lambda d\lambda\,\rho(\lambda) M(\lambda)\right| = |\langle M\rangle| \le \langle|M|\rangle = \int_\Lambda d\lambda\,\rho(\lambda)|M(\lambda)| = 2\int_\Lambda d\lambda\,\rho(\lambda) = 2, \qquad (31.39)$$
where in the last equality we have used the normalization of the probability distribution ρ(λ).

Now replacing M by its definition in (31.38), we obtain
$$|\langle AB\rangle + \langle A'B\rangle + \langle AB'\rangle - \langle A'B'\rangle| \le 2 \;\Rightarrow\; |C_{\rm hid}(a, b) + C_{\rm hid}(a', b) + C_{\rm hid}(a, b') - C_{\rm hid}(a', b')| \le 2. \qquad (31.40)$$
This is the CHSH, or Bell–CHSH, inequality.


To calculate its equivalent in quantum mechanics, we choose the variables to be spins in various directions,
$$a = \vec\sigma^{(A)}\cdot\vec a, \qquad a' = \vec\sigma^{(A)}\cdot\vec a', \qquad b = \vec\sigma^{(B)}\cdot\vec b, \qquad b' = \vec\sigma^{(B)}\cdot\vec b'. \qquad (31.41)$$

But since, as we saw, we have
$$\langle\phi_-|(\vec\sigma^{(A)}\cdot\vec a)(\vec\sigma^{(B)}\cdot\vec b)|\phi_-\rangle = -\vec a\cdot\vec b = -\cos\theta(\vec a, \vec b), \qquad (31.42)$$
we can choose coplanar unit vectors, arranged in rotational order (with a common origin) a′, b, a, b′, at successive π/4 intervals of angle. Then
$$\langle AB\rangle = \langle A'B\rangle = \langle AB'\rangle = -\cos\frac{\pi}{4} = -\frac{1}{\sqrt{2}}, \qquad \langle A'B'\rangle = -\cos\frac{3\pi}{4} = \frac{1}{\sqrt{2}}. \qquad (31.43)$$
Thus the equivalent of the CHSH combination in quantum mechanics is
$$|\langle AB\rangle + \langle A'B\rangle + \langle AB'\rangle - \langle A'B'\rangle| = 4\times\frac{1}{\sqrt{2}} = 2\sqrt{2} > 2, \qquad (31.44)$$
so the inequality is clearly violated.
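This violation is easy to reproduce numerically (a sketch assuming numpy), using the singlet correlation −cos θ and the four coplanar directions at successive π/4 angles:

```python
import numpy as np

C = lambda t1, t2: -np.cos(t2 - t1)    # singlet correlation for measurement angles t1, t2

# coplanar directions a', b, a, b' at successive pi/4 intervals
t_ap, t_b, t_a, t_bp = 0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4

chsh = abs(C(t_a, t_b) + C(t_ap, t_b) + C(t_a, t_bp) - C(t_ap, t_bp))
print(chsh)   # 2*sqrt(2) = 2.828..., exceeding the classical bound of 2
```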
In fact, one can show that this is the maximal violation of the classical CHSH inequality.

31.4 Interpretations of Quantum Mechanics

We end this chapter with a short analysis of the main possible interpretations of quantum mechanics. There are very many such interpretations, which is proof that, more than a hundred years after its beginnings, quantum mechanics is still a mystery to us; however, most of these possibilities are shaky and less standard, so we will stay with the leading ones.

The Standard, “Copenhagen”, Interpretation


This is the most common interpretation of quantum mechanics, the one that has been implicit in
most of the book so far. It was developed by Niels Bohr and Werner Heisenberg around 1927
in Copenhagen, extending the previous work of Max Born. In this view of quantum mechanics,
we have a probabilistic interpretation for the outcome of measurements based on wave functions
evolving in time, i.e., we simply have probabilities of obtaining various results. On the other hand,
questions about non-measured things, such as the paths previous to one measurement experiment, are
meaningless. We have only a wave function and measurements that have meaning. In particular, the
interaction of a system with the observer in obtaining a measurement collapses the wave function.

The “Many Worlds” Interpretation


In this interpretation, there is no wave function collapse associated with measurements. Mea-
surements occur via “decoherence”, where the system “interacts with itself” without the need
of an observer, and moreover at each measurement, the Universe splits into multiple, mutually
unobservable Universes, or “alternative histories”. These splittings occur at each moment in time
corresponding to some measurement, leading to a “Multiverse”, a distribution of Universes with
slightly different histories at each point, splitting further as time goes by.
Note that this interpretation of quantum mechanics is the basis for a story device in science fiction movies and books that is quite common, yet gets things mostly wrong. In the sci-fi version one can travel between Universes in the Multiverse, which can never happen: an essential part of this interpretation is that the Universes are mutually unobservable. Also, in the sci-fi version the people are the same while the events are slightly different, whereas in the actual interpretation there is no distinction between sentient people and objects or events with respect to branching: everything is slightly different, so the people should be too.

The “Consistent Histories” Interpretation


This is an interpretation that is somewhat related to the formulation of quantum mechanics in terms
of path integrals, which are sums over quantum (generically nonsmooth) histories for the propagator.
In this interpretation, we view the wave function as a sum over consistent histories for the particles,
with some probabilities.
This interpretation is useful for dealing with the problem of quantum cosmology. Quantum
cosmology refers to the quantum mechanical evolution of the whole Universe near the Big Bang
(the time origin of the Universe), when all things are quantum mechanical, and the whole Universe is
one big system. In that case, we have a problem with the standard Copenhagen interpretation, since
there is a single Universe, there is no “ensemble of Universes” to measure, and there is only one
experiment, the evolution of our Universe. Thus we have to find a way to deal with the quantum
mechanics of a single experiment.
Of course, the many worlds interpretation could also be useful for quantum cosmology, since in it we imagine a Multiverse (a collection of alternative Universes) instead of ensembles, even in regular experiments on Earth, and so even more for quantum cosmology. The unobservability of the Multiverse is part of the point: the ensemble is always an ensemble of Universes, not of successive experiments.

Ensemble Interpretation
The ensemble interpretation is the most minimalist, even more so than the Copenhagen interpretation.
The quantum mechanics probabilistic interpretation is valid only for ensembles, not for single
particles, or single Universes (so it definitely tells us we cannot apply quantum mechanics to
cosmology, i.e., to the Universe itself). This is very agnostic, and restrictive, so it is perhaps going
too far.
There are many other interpretations of quantum mechanics, but these are less standard, and more
controversial, so we will not mention them here.

Important Concepts to Remember

• All or most of the variations of Bell's inequalities refer to measurements by Alice and Bob, in the EPR paradox, relative to three directions a, b, c; in a classical, hidden variable theory we obtain an inequality that is violated by its quantum mechanical analog.
• The original Bell inequality is hard to measure, since it requires the measurement of exactly anti-
aligned spins for Alice and Bob (assuming that the total spin is zero). The Wigner model, in which
we have preordained numbers for each classical possibility, leading to Bell-Wigner inequalities, is
also hard to measure, for the same reason.
• The Bell–CHSH inequalities are better experimentally, since we do not assume total spin zero, and
we consider four arbitrary directions of measurement.
• The Copenhagen interpretation of quantum mechanics is the most common: it is a probabilistic
interpretation based on wave functions and measurements (involving interaction with an observer
and so leading to collapse of the wave function).
• In the many worlds interpretation there is no collapse of the wave function, and at each (self-)
interaction, the Universe splits into multiple, mutually unobservable Universes, corresponding to
each possibility. In the consistent histories interpretation, the wave function arises as the sum over
all consistent histories (as in path integral), and allows us to deal with the quantum cosmology of
a single Universe (i.e., a single experiment). In the ensemble interpretation, the most restrictive one,
we can deal only with ensembles, not with a single particle or Universe (it is the most agnostic
description, i.e., the one that commits to the least about what we cannot know).

Further Reading
For Bell’s inequalities, see Preskill’s lecture notes on quantum information [12], as well as the
analysis of Sakurai and Tuan in [1]. The original articles cited here are [16] and [17].

Exercises

(1) Explore whether the original Bell inequality can be violated at large angles as well.
(2) Analyzing Table 1, find another example of a Bell–Wigner inequality that is violated in quantum
mechanics.
(3) Find an example of (angles corresponding to) another violation of the Bell–CHSH inequality by
quantum mechanics, which is not maximal but is more generic.
(4) Instead of the Bell–CHSH inequality in the text, consider an inequality obtained in the same
way from M̃ = (A + A′)B′ + (A − A′)B instead of M. Is it still violated by quantum mechanics?
(5) If we consider five measurements instead of the four in the Bell–CHSH inequality, namely
a, a′, a′′ for Alice and b, b′ for Bob, can we extend the Bell–CHSH inequality to include this
case, such that the inequality is still violated by quantum mechanics?

(6) In quantum cosmology, one can define a “wave function of the Universe” Ψ[a(t)], whose
variable is the expanding scale factor of the Universe, a(t), and which satisfies a general
relativity (Einstein) version of the Schrödinger equation, called the Wheeler–DeWitt equation.
Yet in the Copenhagen interpretation of quantum mechanics, there is only one “experiment”
(our expanding Universe, since our Big Bang) that we can see, and no “outside observer”. How
would you slightly extend this interpretation to make sense of the results of the Wheeler–DeWitt
equation (there is much debate, so there is no unique good answer to this question at present)?
(7) In some TV shows, instigated by the “many worlds interpretation” of quantum mechanics, one
sees an “evil parallel Universe”, where things have gone very bad in some sense, and the same
characters behave in an evil way. Leaving aside philosophical speculations on good and evil,
and the possibility of accessing (even just to see) this Universe, explain why this is inconsistent
with the “many worlds interpretation” of quantum mechanics.
32 Quantum Statistical Mechanics and “Tracing” over a Subspace

In this chapter, we will describe the basics of quantum statistical mechanics as a natural extension of
the most general formalism, for a density matrix (i.e., a mixed state) instead of a pure state. A density
matrix is a classical distribution of quantum states, and, as such, it allows for a satisfactory statistical
mechanics interpretation, since classical statistical mechanics deals with a classical distribution of
classical states. On the other hand, we have also seen that a nontrivial ρ̂ appears when we take the
trace over a subspace in a total space.
Thus we can interpret mixed states in two possible ways:

• Perhaps we can ignore (meaning, we cannot measure) some part of a Hilbert space for a total
isolated system: we have a “macroscopic” description only, not a microscopic description. This is
in fact at the core of the statistical mechanics interpretation, even in the classical case: statistics
comes from ignorance of (or “averaging over”) the unknown parts of the state of the total system.
• Alternatively, perhaps we have a system in contact with a reservoir, schematically denoted as S ∪R.
Then, in a similar manner, there is only a total state for the system plus reservoir, S ∪ R, but not
for S or R independently, since there is no such thing as a state, or wave function for only a part
of a system, except for very special states (product, or separable, states).

32.1 Density Matrix and Statistical Operator

Either way (in both of the above interpretations), the system S is in a state chosen randomly from
the set {|ψ j }, with classical probability w j , which is the definition of a mixed state.
Consider a complete and orthonormal set {|ψ_α⟩} of eigenstates of a complete set of compatible
observables. That means that for any state of the system, we can decompose it in the |ψ_α⟩ states,

|ψ⟩ = Σ_α c_α |ψ_α⟩.    (32.1)

Multiplying by ⟨ψ_β| from the left, we obtain as usual c_α = ⟨ψ_α|ψ⟩. Applying the decomposition to
the states in the mixed-state set, we have

|ψ_j⟩ = Σ_α c_α^{(j)} |ψ_α⟩.    (32.2)

362 32 Quantum Statistical Mechanics, Tracing over a Subspace

The expectation value of an observable A in the state |ψ⟩ is found to be

⟨A⟩_ψ = ⟨ψ|Â|ψ⟩ = Σ_{α,β} ⟨ψ|ψ_α⟩⟨ψ_α|Â|ψ_β⟩⟨ψ_β|ψ⟩ = Σ_{α,β} c_α* c_β A_{αβ},    (32.3)

where we have inserted two completeness relations 1 = Σ_α |ψ_α⟩⟨ψ_α|.


But then, for the classical statistics of the quantum states |ψ_j⟩ with probabilities w_j, we can
calculate the average value of an observable A by taking first a quantum average, and then the
classical statistical average over states:

⟨⟨A⟩⟩ ≡ ⟨⟨A⟩_qu⟩_stat = Σ_j w_j ⟨A⟩_{ψ_j} = Σ_{α,β} ( Σ_j w_j c_α^{(j)*} c_β^{(j)} ) A_{αβ} = Σ_{α,β} ρ_{βα} A_{αβ} = Tr(ρ̂Â).    (32.4)

Here we have defined the density matrix

ρ_{βα} = Σ_j w_j c_β^{(j)} c_α^{(j)*} = Σ_j w_j ⟨ψ_β|ψ_j⟩⟨ψ_j|ψ_α⟩ = ⟨ψ_β| ( Σ_j w_j |ψ_j⟩⟨ψ_j| ) |ψ_α⟩ ≡ ⟨ψ_β|ρ̂|ψ_α⟩,    (32.5)

where

ρ̂ = Σ_j w_j |ψ_j⟩⟨ψ_j|    (32.6)

is an operator that, at least in the context of quantum statistical mechanics, will be called the statistical
operator.
The normalization of the statistical operator and its associated density matrix is found as follows:

Tr ρ̂ = Σ_α Σ_j w_j ⟨ψ_α|ψ_j⟩⟨ψ_j|ψ_α⟩ = Σ_j w_j ⟨ψ_j| ( Σ_α |ψ_α⟩⟨ψ_α| ) |ψ_j⟩ = Σ_j w_j = 1,    (32.7)

where in the second equality we have used the completeness relation of the |ψ_α⟩, and in the third
the normalization ⟨ψ_j|ψ_j⟩ = 1, together with Σ_j w_j = 1.
Considering the statistical operator for a pure state |ψ⟩ as a particular example,

ρ̂ = |ψ⟩⟨ψ|,    (32.8)

the density matrix is

ρ_{βα} = ⟨ψ_β|ρ̂|ψ_α⟩ = c_α* c_β.    (32.9)
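As a concrete numerical check of ⟨A⟩ = Tr(ρ̂Â) (a minimal Python sketch, not part of the text; the chosen spin-1/2 states and weights are arbitrary):

```python
import numpy as np

# Observable: Pauli sigma_z for a spin-1/2 system
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Two pure states: |up> and |+> = (|up> + |down>)/sqrt(2)
up = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

# Mixed state with classical weights w_1 = 0.7, w_2 = 0.3, as in eq. (32.6)
rho = 0.7 * np.outer(up, up.conj()) + 0.3 * np.outer(plus, plus.conj())

# Normalization Tr rho = 1, eq. (32.7)
print(np.trace(rho).real)          # 1.0

# Tr(rho A) agrees with the weighted sum of quantum averages, eq. (32.4)
avg = np.trace(rho @ sz).real
weighted = 0.7 * (up.conj() @ sz @ up).real + 0.3 * (plus.conj() @ sz @ plus).real
print(avg, weighted)               # both 0.7
```

Note that ρ̂ here is not a projector (Tr ρ̂² < 1), which distinguishes the mixed state from the pure-state case of eq. (32.8).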



The formalism of mixed states applies to states that are not completely known, so that there are
classical probabilities w j , in which case the mixed states, defined by John von Neumann, appear in a
macroscopic description.
Indeed, in Chapter 30 we saw that, for a pure total state, either a system plus reservoir, S ∪ R, or
an observed state plus an unobservable state, we sum over diagonal components in the unobservable
Hilbert space B, i.e., we take the trace, with the result

ρ_B = Tr_A ρ_tot = Tr_A |ψ⟩_AB AB⟨ψ| = Σ_i p_i |i⟩_B B⟨i|
ρ_A = Tr_B ρ_tot = Σ_i p_i |i⟩_A A⟨i|.    (32.10)

Note that in this case, the resulting state for a reduced system (i.e., either system A or system B
but not both) is not of the type Σ_j w_j |ψ_j⟩; it simply is not a (pure) state |ψ′⟩ at all!
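This can be made concrete with a small numerical sketch (Python with NumPy; the state and its weights are invented for illustration): tracing a pure entangled state over B leaves a genuinely mixed ρ_A.

```python
import numpy as np

# Pure entangled state |psi> = sqrt(0.25)|0>_A|0>_B + sqrt(0.75)|1>_A|1>_B
psi = np.zeros((2, 2))              # psi[a, b]: amplitude of |a>_A (x) |b>_B
psi[0, 0], psi[1, 1] = np.sqrt(0.25), np.sqrt(0.75)

rho_tot = np.einsum('ab,cd->abcd', psi, psi)    # |psi><psi|
rho_A = np.einsum('abcb->ac', rho_tot)          # Tr_B, as in eq. (32.10)

print(rho_A)                        # diag(0.25, 0.75): a mixed state
purity = np.trace(rho_A @ rho_A).real
print(purity)                       # 0.625 < 1, so rho_A is not a pure state
```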

32.2 Review of Classical Statistics

Before we turn to the description of quantum statistical mechanics, we review the classical version,
in order to see how it can be generalized.
We start with the definition of a statistical ensemble, due to Boltzmann and Gibbs. It is a collection
of systems identical to the real system S and subject to the same conditions as the real one, independent of each
other, with each system being in any one of the possible states (those compatible with the external
conditions on the system).
We also define the distribution function of the statistical ensemble,

ρ = dN / (N dΓ),    (32.11)
where N refers to the number of systems in the ensemble and dΓ is the differential of the total
number of states with energy ≤ E,

Γ(E) = ∫_{H≤E} dΓ.    (32.12)

We also define the number of states in a region between E and E + ΔE,

Ω(E, ΔE) = ∫_{E≤H≤E+ΔE} dΓ,    (32.13)

and the energy distribution,

ω(E) = ∂Γ(E)/∂E,    (32.14)
so that

Ω(E, ΔE) ≃ ω(E)ΔE.    (32.15)


364 32 Quantum Statistical Mechanics, Tracing over a Subspace

All quantities, including Γ and ρ, are functions of the phase space (p, q) of the total number of
particles (so there is a very large number of variables, giving a very large dimension of the phase
space), so we have ρ(p, q) and dΓ(p, q). Then the ensemble average of an observable A is

⟨A⟩ = ∫_Γ A ρ dΓ.    (32.16)

The Ergodic Hypothesis


The ergodic hypothesis is due to Boltzmann. It states that for an isolated system, a point in the phase
space of the system goes through all the points in the phase space on an isoenergetic (E = constant)
hypersurface. But it is not valid.
A better variant is the quasi-ergodic hypothesis: that a point in the phase space of the system
goes arbitrarily close to every point on the isoenergetic hypersurface. However, it is not valid and/or
provable in general.
That means that we need a set of postulates about observables in the system, to replace any need
for explanations or proofs.

Postulate 1
The first such postulate states that the observed expectation value of a quantity A equals the temporal
average,

A_obs = Ā = lim_{τ→∞} (1/τ) ∫_0^τ A(t) dt.    (32.17)

Postulate 2
The second postulate, called the ergodic postulate, or Gibbs–Tollman postulate, states that the
statistical average (the average over the ensemble) equals the time average,

⟨A⟩ = Ā.    (32.18)

Postulate 3
The third postulate, of a priori equal probabilities, known as the Tollman postulate, says that for an
isolated system, the probability density in phase space is constant for all the accessible states (i.e.,
those consistent with the external boundary conditions). Mathematically, we write


ρ = { constant, in D
      0, outside it.    (32.19)
365 32 Quantum Statistical Mechanics, Tracing over a Subspace

32.3 Defining Quantum Statistics

In the quantum mechanical case, we use classical statistics over quantum states, plus quantum
statistics over a state. Moreover, states of the system are now (generically) discrete, labeled by an
index n for the energy plus a degeneracy gn for different states of the same energy.
Now we define the total number of states with energy less than E, similarly to the classical case,

Γ(E) = Σ_{n, g_n (E_n<E)} 1 = Σ_{n (E_n<E)} g_n,    (32.20)

and then define Ω(E, ΔE) and ω(E) as in the classical case.
We now define postulates for quantum statistical mechanics that are equivalent to the classical
ones above in the appropriate limit.

Postulate 1
This is the same as before, equating the observed value and the temporal average,

A_obs = Ā(t).    (32.21)

Postulate 2
This is also the same as before, equating the statistical average (over the ensemble) with the time
average (or the observed value), but now the statistical average is different,

A_obs = ⟨A⟩_stat = Tr(ρ̂Â).    (32.22)

Postulate 3
This is a postulate that the amplitudes are a priori equal and the phases are a priori random. Then the
amplitudes of the c_α^{(j)} (for an expansion in the basis |ψ_α⟩) are a priori equal, and their phases
are a priori random,

c_α^{(j)} = r^{(j)} e^{iφ_α^{(j)}},    (32.23)

where r^{(j)} is independent of α and the φ_α^{(j)} are random as a function of α. Defining the classical
ensemble average (denoted by an overbar) as an average over the states j, the conditions are

r̄² = ρ_0,   cos(φ_α − φ_β)‾ = sin(φ_α − φ_β)‾ = 0.    (32.24)

Then the density matrix in the basis |ψ_α⟩ is constant and diagonal,

ρ_αβ = { ρ_0 δ_αβ, for (α, β) in D
         0, for (α, β) outside D.    (32.25)

But on the other hand, the diagonal elements are

ρ_αα = ρ_0 = Σ_j |c_α^{(j)}|² w_j = w_α,    (32.26)

where w_α is the probability of the basis state α; we have used that the sum over α of the above quantity
should be equal to 1, hence we can identify it with w_α.
The properties of the statistical operator ρ̂ are as follows.
(1) It is a Hermitian operator (which we have seen implicitly).
(2) It is normalized by Tr ρ̂ = 1.
(3) The eigenvalues of ρ̂ are positive semi-definite, i.e., ≥ 0. This must be so, since they represent
probabilities.
(4) ρ̂ is bounded, meaning that |ρ αβ | ≤ 1.
Further, to determine ρ̂ for an isolated system (for which case we can prove the following
statement) and in general (for which it needs to be postulated), we have the Liouville–von Neumann
equation,

iℏ ∂ρ̂/∂t = [Ĥ, ρ̂].    (32.27)
For an isolated system, this is proven as follows. Substituting the diagonal form for ρ̂, we find

iℏ ∂_t ( Σ_j w_j |ψ_j⟩⟨ψ_j| ) = Σ_j w_j [ (iℏ ∂_t |ψ_j⟩)⟨ψ_j| + |ψ_j⟩(iℏ ∂_t ⟨ψ_j|) ]
= Σ_j w_j ( Ĥ|ψ_j⟩⟨ψ_j| − |ψ_j⟩⟨ψ_j|Ĥ )
= Ĥρ̂ − ρ̂Ĥ = [Ĥ, ρ̂],    (32.28)

where in the second equality we have used the Schrödinger equation iℏ ∂_t |ψ_j⟩ = Ĥ|ψ_j⟩ and its
Hermitian conjugate, iℏ ∂_t ⟨ψ_j| = −⟨ψ_j|Ĥ.
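The Liouville–von Neumann equation can also be checked numerically for a small isolated system (a Python sketch with an arbitrarily chosen 2×2 Hamiltonian, setting ℏ = 1):

```python
import numpy as np

# An arbitrary Hermitian Hamiltonian and an initial mixed state (hbar = 1)
H = np.array([[1.0, 0.5], [0.5, -1.0]], dtype=complex)
rho0 = np.diag([0.6, 0.4]).astype(complex)

def U(t):
    # exp(-i H t) built from the spectral decomposition of the Hermitian H
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T

def rho(t):
    # Unitary evolution of the statistical operator
    return U(t) @ rho0 @ U(t).conj().T

# Finite-difference check of i d(rho)/dt = [H, rho] at t = 0
dt = 1e-6
lhs = 1j * (rho(dt) - rho(-dt)) / (2 * dt)
rhs = H @ rho0 - rho0 @ H
print(np.max(np.abs(lhs - rhs)))    # close to zero: the two sides agree
```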

Statistical Ensemble at Equilibrium


In the quantum case, a statistical ensemble at equilibrium gives observables that are time
independent, so

A_obs = ⟨A⟩ = Tr[ρ̂Â]    (32.29)

is time independent, implying that ρ is time independent. But, by the Liouville–von Neumann
equation, we have
∂ρ̂/∂t = 0 ⇒ [ρ̂, Ĥ] = 0.    (32.30)
Since these two Hermitian operators commute, it means that we can always define them (and measure
them) at the same time. But then, classically, ergodicity would mean that the distribution depends on
the energy only,

ρ(p, q) = ρ(H(p, q)). (32.31)

The same statement in quantum mechanics is that now the statistical operator is a function of the
Hamiltonian,

ρ̂ = ρ( Ĥ). (32.32)

We can now define the (quantum version of ) ensembles and their associated distribution.

Quasi-Microcanonical Ensemble (Distribution)


In this case, we define the ensemble by saying that it is composed of isolated systems each with
energy in a very small region E  ∈ (E, E + ΔE).
Since the statistical operator is a function of the Hamiltonian, ρ̂ = ρ( Ĥ), for the eigenvalues
(corresponding to eigenstates of energy) we have

ρ nm = ρ(En )δ nm . (32.33)

Defining a domain

D = {n|E < En < E + ΔE}, (32.34)

and a function

Δ_a(x) = { 1, x ∈ D
           0, x ∉ D,    (32.35)

the quasi-microcanonical distribution is


ρ(E_n) = { constant, in D
           0, outside it.    (32.36)

This can be rewritten as


ρ_nm = [Δ_ΔE(E_n − E) / Ω(E, ΔE)] δ_nm,    (32.37)

now with the correct normalization for the matrix (so that Σ_n ρ_nn = 1). This can be extended in the
quantum case to the formula

ρ̂ = Δ_ΔE(Ĥ − E·1) / Ω(E, ΔE),    (32.38)

where Δ_ΔE(Ĥ − E·1) is a projector onto the domain D.


Then the expectation value of an observable A is given by

⟨A⟩ = [1/Ω(E, ΔE)] Σ_{n∈D} ⟨n|Â|n⟩.    (32.39)

We can now add another postulate to the axiomatic system for statistical mechanics:

Postulate 4: the Boltzmann formula


We define the entropy of an isolated system in terms of the number of states in the domain D (of
energy in (E, E + ΔE)), Ω, as

S = k B ln Ω. (32.40)

This defines the statistical entropy, and can be “heuristically proven”, meaning it is not quite a proof,
but a strong argument. The formula for the entropy S should satisfy the condition that, if the system
is made of two subsystems, S = S a ∪ S b , then

ω(E) ≃ ω^{(a)}(E^{(a)}) ω^{(b)}(E^{(b)}) ΔE.    (32.41)

In the thermodynamical limit, of an infinite number of particles, volume, and energy, with fixed
ratios,

N → ∞, V → ∞, E → ∞,  E/N = fixed,  V/N = fixed,    (32.42)
we have other formulas for the entropy that are equivalent to the above,

S = k B ln ω, S = k B ln Γ, S = −k B ln ρ. (32.43)

These formulas are based on the fact that, for a very large dimension of the phase space, N →
∞, we have ωΔE ≃ Γ ≃ Ω. Moreover, the last relation, based on the distribution ρ, becomes in quantum
mechanics (using the standard averaging with ρ̂)

S = −k B Tr[ρ̂ ln ρ̂], (32.44)

which is the von Neumann entropy, generalizing the Gibbs entropy from Chapter 30. We also note
that the entropy is only additive in the thermodynamic limit.
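As a small check of the von Neumann formula (a Python sketch with k_B = 1; for a ρ̂ that is already diagonalized, the trace reduces to a sum over its eigenvalues):

```python
import math

def von_neumann_entropy(eigs):
    # S = -Tr(rho ln rho) = -sum_i p_i ln p_i for eigenvalues p_i (k_B = 1)
    return -sum(p * math.log(p) for p in eigs if p > 0)

print(von_neumann_entropy([1.0]))        # 0.0: a pure state
print(von_neumann_entropy([0.5, 0.5]))   # ln 2 ~ 0.693: maximally mixed qubit
```

The convention 0 ln 0 = 0 is implemented by skipping vanishing eigenvalues.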

Canonical Ensemble (Distribution)


The (quasi-)microcanonical distribution, discussed above, is rather hard to use; the simplest and
most standard one is the canonical distribution, in which the system is connected to a reservoir
at temperature T, denoted S ∪ R_T.
Then, since the variation of the entropy is related to the heat (energy variation) divided by the temperature,
we obtain

δS = δQ/T = −k_B δ(ln ρ).    (32.45)
This can be satisfied by a Maxwell-type distribution in the thermodynamic limit,

ρ ∝ e−H/k B T . (32.46)

Including normalization, the formula is

ρ = (1/Z) e^{−H/k_B T},    (32.47)
where we define β = 1/(k_B T) and the partition function Z is the sum of the exponential factors,

Z = Σ_E e^{−βE}.    (32.48)

In quantum mechanics, the formula becomes more useful, since the statistical operator,

ρ̂ = ρ(Ĥ) = (1/Z(β, V, N)) e^{−βĤ},    (32.49)

becomes discrete for states,

ρ_n(E_n) = e^{−βE_n} / Z(β, V, N).    (32.50)

The partition function also becomes discrete,

Z(β, V, N) = Σ_n g_n e^{−βE_n},    (32.51)

such that ρ_n is normalized to one,

Σ_n g_n ρ_n = 1.    (32.52)

In this ensemble, we define the internal energy as the average of the energies,

U ≡ ⟨E⟩ = Σ_n g_n ρ_n E_n = Σ_n g_n E_n e^{−βE_n} / Z = −(1/Z) (∂/∂β) Σ_n g_n e^{−βE_n} = −(∂/∂β) ln Z.    (32.53)

But since in thermodynamics

U = (∂/∂β)[βF(T, V, N)] = F + β ∂F/∂β,    (32.54)

we can identify the free energy F (the thermodynamic potential in this case) as

F(T, V, N) = −k_B T ln Z.    (32.55)
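These relations are easy to verify numerically for a two-level system (a Python sketch with k_B = 1; the level spacing and β are arbitrary choices):

```python
import math

def Z(beta, levels):
    # Canonical partition function, eq. (32.48)
    return sum(math.exp(-beta * E) for E in levels)

levels = [0.0, 1.0]      # a two-level system with unit gap
beta = 2.0

# U = -d(ln Z)/d(beta), eq. (32.53), via a central finite difference
h = 1e-6
U = -(math.log(Z(beta + h, levels)) - math.log(Z(beta - h, levels))) / (2 * h)
exact = 1.0 / (math.exp(beta) + 1.0)            # closed form for the unit gap
F = -(1.0 / beta) * math.log(Z(beta, levels))   # free energy, eq. (32.55)

print(abs(U - exact))    # close to zero: the derivative formula reproduces <E>
print(F)                 # negative, as expected for Z > 1
```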

Grand Canonical Ensemble (Distribution)


Another relevant ensemble is the grand canonical one, in which case there is a reservoir of heat and
particles in contact with our system, keeping fixed the system’s temperature T and chemical potential
μ, denoted S ∪ R_{T,μ}. Then the classical distribution is

ρ = (1/Z(β, βμ)) e^{−β(H−μN)},    (32.56)

which at the quantum level leads to the operator

ρ̂ = ρ(Ĥ, N̂) = (1/Z(β, βμ)) e^{−β(Ĥ−μN̂)},    (32.57)

with eigenvalues

ρ_n = (1/Z(β, βμ)) e^{−β(E_n−μN)}.    (32.58)

Then the partition function is

Z(β, βμ) = Σ_n g_n e^{−β(E_n−μN)},    (32.59)

and the thermodynamic potential is now

Ω(T, V, μ) = −k_B T ln Z(β, βμ).    (32.60)

The total average energy and number of particles can be calculated from the thermodynamic
potential, as

U = ⟨H⟩ = −(∂ ln Z/∂β)_{βμ},   ⟨N⟩ = (∂ ln Z/∂(βμ))_β.    (32.61)
We could define other ensembles, but the general procedure should be clear by now.
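As an illustration of eq. (32.61) (a Python sketch for the simplest possible case, a single mode occupied by at most one fermion; the numerical parameters are arbitrary):

```python
import math

eps, beta, mu = 1.0, 2.0, 0.3   # mode energy, inverse temperature, chem. potential

def lnZ(beta_mu):
    # Grand canonical ln Z for one fermionic mode: Z = 1 + e^{-beta*eps + beta*mu}
    return math.log(1.0 + math.exp(-beta * eps + beta_mu))

# <N> = d(ln Z)/d(beta*mu) at fixed beta, eq. (32.61), by finite difference
h = 1e-6
N = (lnZ(beta * mu + h) - lnZ(beta * mu - h)) / (2 * h)

# This reproduces the Fermi-Dirac occupation derived in the next section
f_FD = 1.0 / (math.exp(beta * (eps - mu)) + 1.0)
print(abs(N - f_FD))     # close to zero
```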

32.4 Bose–Einstein and Fermi–Dirac Distributions

Consider next systems consisting of identical, indistinguishable quantum particles, bosons or
fermions, which must therefore satisfy either Bose–Einstein or Fermi–Dirac statistics.
In the case of Bose–Einstein statistics, in each state we can have an arbitrary number of particles
nα = 0, 1, 2, . . . , ∞.
In the case of Fermi–Dirac statistics, because of the Pauli exclusion principle we can only have
nα = 0 or 1 (the state is either occupied or not).
Then in both cases (considered together), the energy splits over the energy states, as does the total
number of particles,

E = Σ_α ε_α n_α,   N = Σ_α n_α.    (32.62)

Moreover, the partition function then also factorizes,

Z = Π_α Z_α,    (32.63)

where

Z_α = Σ_{n_α} e^{−βn_α(ε_α−μ)}.    (32.64)

(1) In the case of Bose–Einstein statistics, summing over n_α = 0, 1, . . . , ∞, we get

Z_α = 1 / (1 − e^{−β(ε_α−μ)}),    (32.65)

so the grand canonical potential is

Ω = −k_B T ln Z = −Σ_α k_B T ln Z_α.    (32.66)

Then we obtain the average number of particles

⟨N⟩ = −∂Ω/∂μ = Σ_α ⟨n_α⟩,    (32.67)

where the Bose–Einstein distribution function is

⟨n_α⟩ = 1 / (e^{β(ε_α−μ)} − 1) ≡ f_BE(ε_α).    (32.68)

(2) In the case of Fermi–Dirac statistics, summing over n_α = 0, 1, we have

Z_α = 1 + e^{−β(ε_α−μ)},    (32.69)

from which we similarly obtain the Fermi–Dirac distribution function,

⟨n_α⟩ = 1 / (e^{β(ε_α−μ)} + 1) ≡ f_FD(ε_α).    (32.70)
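Both distributions follow from elementary sums over n_α, which can be checked numerically (a Python sketch with arbitrary β, ε, μ, taking μ < ε so that the bosonic sum converges):

```python
import math

beta, eps, mu = 1.5, 1.0, 0.2
x = beta * (eps - mu)

# Bose-Einstein: truncate the geometric sums over n = 0, 1, 2, ...
Z_BE = sum(math.exp(-n * x) for n in range(2000))
n_BE = sum(n * math.exp(-n * x) for n in range(2000)) / Z_BE
print(abs(n_BE - 1.0 / (math.exp(x) - 1.0)))   # tiny: matches eq. (32.68)

# Fermi-Dirac: only n = 0 and n = 1 contribute
Z_FD = 1.0 + math.exp(-x)
n_FD = math.exp(-x) / Z_FD
print(abs(n_FD - 1.0 / (math.exp(x) + 1.0)))   # zero up to rounding: eq. (32.70)
```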

32.5 Entanglement Entropy

In Chapter 30, we also described a different kind of entropy, associated with two subsystems of a
system, S = A ∪ B (so that H_tot = H_A ⊗ H_B). There, we defined the reduced density matrix

ρ̂_A = Tr_B ρ̂_tot,    (32.71)

both for a pure state ρ̂_tot (= |ψ⟩⟨ψ|) and for a mixed state. Then the von Neumann entropy of ρ̂_A is
defined as the entanglement entropy,

S_A = −k_B Tr_A(ρ̂_A ln ρ̂_A).    (32.72)
But we note that, if we have a finite temperature T, we can have a thermal total density matrix,
ρ̂ tot = ρ̂ tot,thermal = e−β Ĥ . (32.73)
If moreover the second subsystem vanishes, so that B = {0} and S = A, then the thermal
entanglement entropy equals the usual (von Neumann) thermal entropy.
But we can also choose a more general situation, with S = A ∪ B at finite temperature, in which
case we say we have a thermal entanglement entropy.
We can also generalize the entanglement entropy by means of an integer n, obtaining the Renyi
entropy,

S_n(ρ) = (1/(1−n)) ln( Σ_{i=1}^N p_i^n ) = (1/(1−n)) ln[ Tr(ρ̂^n) ].    (32.74)

This quantity is simpler to define, since it does not contain the log of a matrix (which is tricky in
general), yet we can obtain the entanglement entropy as a limit: one can prove that

S(ρ) = lim_{n→1} S_n(ρ).    (32.75)
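This limit is easy to check numerically (a Python sketch with an arbitrarily chosen eigenvalue spectrum; k_B = 1):

```python
import math

p = [0.1, 0.2, 0.3, 0.4]    # eigenvalues of some density matrix

def renyi(n):
    # S_n = ln(sum_i p_i^n)/(1 - n), eq. (32.74)
    return math.log(sum(pi ** n for pi in p)) / (1.0 - n)

S_vN = -sum(pi * math.log(pi) for pi in p)   # von Neumann entropy
print(abs(renyi(1.000001) - S_vN))           # small: S_n -> S as n -> 1
print(renyi(2.0))                            # the Renyi-2 entropy, below S_vN
```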

In both cases, we can consider the thermal density matrix,

ρ̂ = e^{−βĤ} / Z,    (32.76)

which is the state of highest entropy (at fixed average energy).
We can describe entanglement as the result of an entanglement Hamiltonian, which is different
from the actual Hamiltonian of the system (as far as the dynamics is concerned). Consider a variation
in the entanglement entropy due to a variation in the density matrix,

δS(ρ) = −δ Tr[ρ̂ log ρ̂] = −Tr[δρ̂ log ρ̂] − Tr[ρ̂ δ(log ρ̂)] = −Tr[δρ̂ log ρ̂] − Tr δρ̂.    (32.77)

Defining the entanglement Hamiltonian as

Ĥ_E = −log ρ̂ ⇒ ρ̂ = e^{−Ĥ_E},    (32.78)

it follows (using Tr δρ̂ = δ(Tr ρ̂) = 0, since Tr ρ̂ = 1) that we have

δS(ρ̂) = δ Tr[ρ̂ Ĥ_E] = δ⟨Ĥ_E⟩.    (32.79)

However, if ρ̂ describes a thermal (mixed) state,

ρ̂ = e^{−βĤ} / Z ⇒ δS = β δ⟨Ĥ⟩ ⇒ dE = T dS.    (32.80)
Thus, indeed, for the entanglement Hamiltonian, we have the expected thermodynamic relation
for β = 1 (unit temperature).
For this same thermal state, we can “purify” it, meaning we can define a pure state in a product
Hilbert space H = H_A ⊗ H_B such that the reduced density matrix is thermal, and its entanglement
entropy equals the von Neumann entropy of the thermal state. The pure state in H is called the thermofield
double state, and is defined as

|ψ⟩ = (1/√Z) Σ_i e^{−βE_i/2} |i⟩_A ⊗ |i⟩_B,    (32.81)

where the |i⟩ are the eigenstates of the Hamiltonian. We can easily check that taking the trace over system
B gives the thermal density matrix.
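The check can be done explicitly for a two-level Hamiltonian (a Python sketch; the spectrum and β are arbitrary choices):

```python
import numpy as np

E = np.array([0.0, 1.0])     # two energy eigenvalues
beta = 1.0
w = np.exp(-beta * E)
Z = w.sum()

# Thermofield double state, eq. (32.81): psi[i, j] = delta_ij sqrt(e^{-beta E_i}/Z)
psi = np.diag(np.sqrt(w / Z))

# Tr_B |psi><psi| : (rho_A)_{ac} = sum_b psi[a, b] psi[c, b]
rho_A = psi @ psi.T
rho_thermal = np.diag(w / Z)
print(np.allclose(rho_A, rho_thermal))   # True: the reduced state is thermal
```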
We note that the entanglement entropy is hard to measure, or calculate, yet is interesting and has
been the subject of much research and many developments.

Important Concepts to Remember

• The density matrix, the matrix element of the statistical operator ρ̂, describes a mixed state. It arises
from taking the trace over another system, in contact with this one, or from contact with a reservoir
(which amounts to the same, except we don’t know the description of the reservoir states).
• Classical statistics is based on four postulates, replacing the ergodic hypothesis, which is incorrect:
(1) A_obs = Ā(t); (2) ⟨A⟩ = Ā; (3) for an isolated system, ρ is constant inside the domain D and 0
outside it; (4) the Boltzmann formula, S = k_B ln Ω, which is only heuristically proven.
• Quantum statistics is based on four postulates, replacing the above, with ρ replaced by ρ̂: (1)
A_obs = Ā(t); (2) A_obs = ⟨A⟩_stat = Tr[ρ̂Â]; (3) the amplitudes of the c_α^{(j)} are a priori equal
and their phases a priori random; (4) the von Neumann formula, S = −k_B Tr[ρ̂ ln ρ̂].
• The statistical operator obeys the Liouville–von Neumann equation, iℏ ∂ρ̂/∂t = [Ĥ, ρ̂], proven for
an isolated system and postulated in general.
• At equilibrium, we have time independence, so [ Ĥ, ρ̂] = 0; so classically ρ = ρ(H(p, q)) and
quantum mechanically ρ̂ = ρ̂( Ĥ).
• The Boltzmann formula is equivalent, but only in the thermodynamic limit, to S = k B ln ω, S =
k B ln Γ, S = −k B ln ρ, with the latter motivating the von Neumann formula, S = −k B Tr[ρ̂ ln ρ̂],
in the quantum case.

• In the canonical distribution (ensemble), with S ∪ R_T, ρ = Z^{−1} e^{−βH}, with the partition function
Z = Σ_E e^{−βE} classically, while ρ̂ = Z^{−1}(β, V, N) e^{−βĤ} and Z(β, V, N) = Σ_n g_n e^{−βE_n} quantum
mechanically, with free energy F(T, V, N) = −k_B T ln Z.
• In the grand canonical distribution (ensemble), with S ∪ R_{T,μ}, classically ρ = Z^{−1} e^{−β(H−μN)}
and quantum mechanically ρ̂ = Z^{−1}(β, βμ) e^{−β(Ĥ−μN̂)}, with Z(β, βμ) = Σ_n g_n e^{−β(E_n−μN)} and
thermodynamic potential Ω(T, V, μ) = −k_B T ln Z(β, βμ).
• From the Bose–Einstein and Fermi–Dirac statistics, we obtain the corresponding distributions,
⟨n_α⟩ = f_BE(ε_α) = 1/(e^{β(ε_α−μ)} − 1) and ⟨n_α⟩ = f_FD(ε_α) = 1/(e^{β(ε_α−μ)} + 1).
• One can define the entanglement entropy as before, S_A = −k_B Tr_A(ρ̂_A ln ρ̂_A) with ρ̂_A = Tr_B ρ̂_tot,
but we can define it also in the case where the total system is at finite temperature, S = A ∪ B in
contact with R_T, giving the thermal entanglement entropy. We can also define the Renyi entropy,
S_n = (1 − n)^{−1} ln[Tr(ρ̂^n)], such that the entanglement entropy is obtained in the n → 1 limit.
• For entanglement, we can define the entanglement Hamiltonian via ρ̂ = e^{−Ĥ_E}, so that δS(ρ) =
δ⟨Ĥ_E⟩, similar to the thermodynamic relation δS = β δ⟨Ĥ⟩, or dE = T dS.
• We can “purify” any thermal state by writing it as the reduced density matrix, obtained by tracing over
a second copy of the system, of a pure state called the thermofield double state,
|ψ_TFD⟩ = (1/√Z) Σ_i e^{−βE_i/2} |i⟩_A ⊗ |i⟩_B.

Further Reading
See [2, 1, 3] and any advanced (quantum) statistical mechanics book.

Exercises

(1) Consider the bipartite state in H A ⊗ HB


|ψ⟩ = (1/2) [ |1⟩ ⊗ |1⟩ + |2⟩ ⊗ |2⟩ + |3⟩ ⊗ |4⟩ + |2⟩ ⊗ |3⟩ ],    (32.82)

where |1⟩, |2⟩, |3⟩, |4⟩ are orthonormal states.
Calculate the mixed state obtained by taking the trace over system A, or over system B.
(2) If the states |1⟩, |2⟩, |3⟩, |4⟩, |5⟩ are (orthonormal) eigenstates of the Hamiltonian with energies
E_1, E_2, E_3, E_4, and E_5, respectively, and at time t = 0 the statistical operator is

ρ̂ = |1⟩⟨1| + |2⟩⟨3| + |3⟩⟨4| + |4⟩⟨5|,    (32.83)

then find the time evolution of ρ̂ at small times.


(3) Consider a thermodynamic system of N (of the order of the Avogadro number N A) spins 1/2.
Calculate their classical entropy, for a (quasi-)microcanonical ensemble. Describe a quantum
statistical operator ρ̂ for the same system now using quantum mechanics, that gives the same
von Neumann entropy.
(4) Consider N harmonic oscillators of frequencies ωi , i = 1, . . . , N, connected to a reservoir of
temperature. Calculate (in the canonical ensemble) the free energy and the heat capacity CV .

(5) Consider the harmonic oscillators from exercise 4, connected to a reservoir of temperature T and
chemical potential μ. Calculate (in the grand canonical ensemble) the thermodynamic potential
Ω and the heat capacity CV .
(6) Consider a distribution of free, nondegenerate, relativistic fermionic particles of mass m, of
arbitrarily large three-dimensional momentum. Calculate the number density and energy density
as a function of temperature. Write down an explicit analytical form at large temperatures.
(7) Consider a system A ∪ B, with total Hamiltonian diagonalized by the states |1⟩_A ⊗ |1⟩_B,
|2⟩_A ⊗ |2⟩_B, |3⟩_A ⊗ |4⟩_B, |4⟩_A ⊗ |3⟩_B, of energies E_1, E_2, E_3, E_4, respectively.
Calculate the thermal entanglement entropy at temperature T.
33 Elements of Quantum Information and Quantum Computing

In this chapter, we introduce the ideas of quantum information theory and ways to do computing in
quantum mechanics.

33.1 Classical Computation and Shannon Theory

Before we turn to quantum theory, we start with a review of classical computation theory, in order to
generalize it to the quantum case.
To define classical information, we need to quantify the information and redundancy in the
transmission of some message. Generically, a message amounts to a string of n letters, chosen from
an alphabet of k letters,

{a1 , . . . , ak }. (33.1)

Each letter a_x occurs with an a priori probability p(a_x), such that Σ_x p(a_x) = 1. For instance,
in the English language there are 26 letters, the most frequent being e, with approximately 12.7%
frequency so p(e) = 0.127, and then t, with 9.3% frequency, so p(t) = 0.093, etc.
However, as we are mostly interested in computers and their working, it is worth considering
binary messages, where the alphabet is composed of 0 and 1 only. In that case, we denote

p(1) = p, p(0) = 1 − p. (33.2)

We can assume that at large n, a message will have approximately pn 1’s and (1 − p)n 0’s. Then
the number of distinct binary strings of length n that can be sent (i.e., messages of n bits) corresponds
to the number of ways in which we can pick np letters out of the n in order to put 1’s in them. Thus,
the number of distinct possible messages is of the order of

C(n, np) = n! / [(np)! (n(1−p))!] ≃ 2^{nH(p)},    (33.3)

where we have used the Stirling approximation at large N, N! ≃ √(2πN) N^N e^{−N}, which results in a
quantity

H(p) ≃ −p log₂ p − (1 − p) log₂(1 − p).    (33.4)

We can extend this analysis to the case of an alphabet of an arbitrary length, where the letter a x is
represented by the label x, with probability p(x). Then the number of distinct strings of length n is
of the order of the number of ways in which we can pick groups of p(x)n out of n, namely

n! / Π_x (np(x))! ∼ 2^{nH(X)},    (33.5)
376 33 Quantum Information and Quantum Computing

where again we have used the Stirling approximation formula to find an H(X) (depending on the set
of x’s and their probabilities, defining an ensemble called X), as

H(X) ≃ −Σ_{x=1}^k p(x) log₂ p(x).    (33.6)

This is called the Shannon entropy. Note that we have used logarithms to the base 2, in order to
write the number as an exponent of 2. But we could have used any other number (including e, leading
to ln) instead. Note that log2 q = log2 n logn q.
This Shannon entropy is then a measure of the total redundancy in the message, and is the way to
encode the message in bits: we need nH (X ) bits to be able to put any message into them (p bits lead
to 2 p positions).
Another way to compute the Shannon entropy is as follows. A message is a string of letters of size
n, x_1 . . . x_n. Then the a priori probability for this string is

P(x_1, . . . , x_n) = p(x_1) · · · p(x_n),    (33.7)

with

log₂ P = Σ_{i=1}^n log₂ p(x_i).    (33.8)

But this means that we can obtain, in the large-n limit, if there are p(x)n instances of each letter a_x,

−(1/n) log₂ P(x_1, . . . , x_n) ∼ −(1/n) Σ_x np(x) log₂ p(x) = −Σ_x p(x) log₂ p(x) = H(X).    (33.9)

So the Shannon entropy is defined by the probability of the string being 2−nH (X) . Thus the optimal
code compresses a letter onto H (X ) bits, so that a message of length n is compressed into nH (X )
bits, with 2nH (X) states. This H (X ) depends on the ensemble X, defined by the probabilities p(x).
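A short numerical sketch (Python; the parameters n and p are arbitrary choices) confirms both the entropy formula and the counting estimate 2^{nH}:

```python
import math

def H_bits(probs):
    # Shannon entropy in bits, eq. (33.6)
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(H_bits([0.5, 0.5]))   # 1.0: a fair coin carries one bit
print(H_bits([1.0]))        # 0.0: a certain letter carries no information

# log2 of the binomial coefficient C(n, np) is close to n H(p), eq. (33.3)
n, p = 1000, 0.1
k = round(n * p)
log2_binom = (math.lgamma(n + 1) - math.lgamma(k + 1)
              - math.lgamma(n - k + 1)) / math.log(2)
print(log2_binom / (n * H_bits([p, 1 - p])))   # close to 1
```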

Mutual Information
We can ask, how correlated are two different messages, (x 1 , . . . , x n ) ≡ x and (y1 , . . . , yn ) ≡ y,
drawn from different ensembles, X and Y ? The measure of this correlation is the mutual information
I (X, Y ).
If p(x, y) denotes the probability that both messages will occur, then the probability of message x,
given that message y has occurred, is
p(x, y)
p(x|y) = . (33.10)
p(y)
Then we can define the conditional Shannon entropy as the entropy defined from p(x|y), namely

H(X|Y) = ⟨−log₂ p(x|y)⟩ = ⟨−log₂ p(x, y)⟩ + ⟨log₂ p(y)⟩ = H(X, Y) − H(Y),    (33.11)

where H(X, Y) is the Shannon entropy of the joint distribution p(x, y).
Then we define

I (X; Y ) ≡ H (X ) − H (X |Y ) = H (X ) + H (Y ) − H (X, Y ) (33.12)



called the mutual information. We can see that it is non-negative, I(X; Y) ≥ 0, since H(X|Y)
contains less information than H(X).
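A minimal numerical sketch (Python; the joint distribution is invented for illustration) computes I(X; Y) from eq. (33.12):

```python
import math

def H_bits(probs):
    # Shannon entropy in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint distribution p(x, y) for two correlated binary messages
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = [sum(v for (x, _), v in joint.items() if x == b) for b in (0, 1)]
py = [sum(v for (_, y), v in joint.items() if y == b) for b in (0, 1)]

# I(X;Y) = H(X) + H(Y) - H(X,Y), eq. (33.12)
I_XY = H_bits(px) + H_bits(py) - H_bits(list(joint.values()))
print(round(I_XY, 4))    # 0.2781 bits: positive, so the messages are correlated
```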

33.2 Quantum Information and Computation, and von Neumann Entropy

We are now ready to generalize to the quantum case.


In quantum mechanics, generically we have a mixed state, defined by a density matrix

ρ = Σ_x p_x ρ_x,    (33.13)

where p_x is a classical probability. As we saw, we can always diagonalize the matrix in an
orthonormal basis |i⟩, where

ρ = Σ_i p_i |i⟩⟨i|.    (33.14)

Then log ρ is represented by its diagonal eigenvalues, log p_i.

Thus we have an analog of the Shannon entropy at the quantum level,

⟨−log ρ⟩ = −Tr(ρ log ρ) ≡ S(A) = H(i, p_i),    (33.15)
where A is an ensemble defining system A. This is the von Neumann entropy, and we see that it
equals the Shannon entropy of the classical distribution of the states (i, pi ).
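A minimal numerical sketch (ours, using base-2 logarithms as in the Shannon case): compute S(ρ) from the eigenvalues of ρ, and check that a pure state has zero entropy while the maximally mixed qubit has one bit.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), via the eigenvalues p_i of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]           # by convention 0 log 0 = 0
    return float(-np.sum(p * np.log2(p)))

# A pure state rho = |psi><psi| has S = 0.
psi = np.array([1, 1j]) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())

# The maximally mixed qubit rho = (1/2) diag(1, 1) has S = 1 bit.
rho_mixed = np.eye(2) / 2
```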
We can also define the mutual information in the same way as in the classical case, by
I ( A; B) = S( A) + S(B) − S( A + B). (33.16)
The von Neumann entropy satisfies several properties. The most relevant ones are:
(1) The entropy vanishes for a pure state,

S(ρ = |ψ⟩⟨ψ|) = 0.     (33.17)

This is consistent, since the entropy of a single state should be zero.
(2) The entropy is invariant under unitary transformations, since

S(UρU⁻¹) = − Tr[UρU⁻¹ log(UρU⁻¹)] = − Tr[ρ log ρ] = S(ρ).     (33.18)

(3) Concavity of the entropy: if p_1, p_2, . . . , p_n ≥ 0 and Σ_i p_i = 1, then

S(Σ_i p_i ρ_i) ≥ Σ_i p_i S(ρ_i),     (33.19)

which follows from the same property of the negative log function.
(4) Subadditivity of the entropy: for a bipartite system AB,
S( A + B) ≤ S( A) + S(B). (33.20)
As before,
ρ A = TrB ρ AB , ρ B = Tr A ρ AB , (33.21)

so we have
S(ρ AB ) ≤ S(ρ A ) + S(ρ B ). (33.22)
This describes the fact that there is nontrivial information in AB encoded in the correlations
of A and B, which is a reasonable assumption.
(5) Strong subadditivity of the entropy: for a tripartite system ABC, we have the inequalities
S(A + B + C) + S(B) ≤ S(A + B) + S(B + C),
S(A + C) ≤ S(A + B) + S(B + C).     (33.23)
This is rather difficult to prove in the quantum case of the von Neumann entropy. The name
refers to the fact that subadditivity is obtained from it in the special case of B = 0.
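Subadditivity is easy to test numerically. The following Python sketch (our own illustration) draws a random two-qubit density matrix, forms ρ_A and ρ_B by partial trace as in (33.21), and also exhibits the strict inequality for a maximally entangled (Bell) pure state.

```python
import numpy as np

def S(rho):
    """von Neumann entropy in bits, from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

rng = np.random.default_rng(0)
# A random two-qubit density matrix: A A†, normalized, is positive with trace 1.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho_AB = A @ A.conj().T
rho_AB /= np.trace(rho_AB).real

# Partial traces rho_A = Tr_B rho_AB and rho_B = Tr_A rho_AB, as in (33.21).
r = rho_AB.reshape(2, 2, 2, 2)        # indices (a, b, a', b')
rho_A = np.einsum('abcb->ac', r)
rho_B = np.einsum('abad->bd', r)

# For an entangled pure state (a Bell state) the inequality is strict:
# S(AB) = 0 while S(A) = S(B) = 1.
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_bell = np.outer(bell, bell)
rho_bell_A = np.einsum('abcb->ac', rho_bell.reshape(2, 2, 2, 2))
```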

33.3 Quantum Computation

Having seen that we can define quantum information, it follows that we can store information in
quantum systems, and do computations with it.
Instead of information encoded in classical bits of 0 and 1, we can now encode information in the quantum bits, or qubits, of two possible states, for instance the states |+⟩ and |−⟩ of a two-state system (such as a spin 1/2 system).
Then for n qubits, we have states in the product Hilbert space H = H_{A_1} ⊗ · · · ⊗ H_{A_n}, with basis

|ψ_{±,±,...,±}⟩ ≡ |±⟩ ⊗ |±⟩ ⊗ · · · ⊗ |±⟩.     (33.24)
A more useful set of states is the set of entangled states in H, since we can use entanglement to
our advantage in order to do things that are not possible with a classical computer.
In order to do computations on a state, we must pass it through a circuit that is constructed out of
basic gates, which are (linear) transformations that act on two or more bits simultaneously, among
the n bits of the original message (state). A minimal gate is then an action on two bits x and y, acting on (x, y) as a column vector,

(x, y)ᵀ → (x′, y′)ᵀ = M (x, y)ᵀ.     (33.25)
Classically, the action of a gate is on bits x, y = 0 or 1, and a circuit is a product of gates M1 · · · Mp .
In quantum mechanics, quantum gates are objects that act linearly on the quantum states. That
means that quantum gates are unitary transformations Ui , acting on a fixed number of qubits at a
time (a minimum of 2) out of the general n qubits of the message. A quantum circuit is, as before, a
product of the individual gates (see Fig. 33.1),
U = U1 · · · Un . (33.26)
A quantum computation is the action of a circuit on a quantum state, according to the relation
|ψ⟩ → |ψ′⟩ = U|ψ⟩.     (33.27)
At the end of the computation, we make a measurement, in order to have a classical result. This
means that we select a state in each of the basis sets (perhaps in a subset of qubits).
Note that it is of great use to consider entangled qubits, of the type |φ_±⟩, |ψ_±⟩, in a product system A ∪ B, instead of the nonentangled qubit states |±⟩, |±⟩.
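As an illustration of a very small quantum circuit, the following Python sketch (the specific gates, Hadamard and CNOT, are our illustrative choice, not prescribed by the text) builds U as a product of gates, acts on |00⟩ to produce an entangled state, and reads off the final measurement probabilities.

```python
import numpy as np

# Two elementary gates: Hadamard on qubit 1, and CNOT on the pair.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

# The circuit is the product of the individual gates, as in (33.26).
U = CNOT @ np.kron(H, np.eye(2))

psi_in = np.array([1.0, 0, 0, 0])   # |00>
psi_out = U @ psi_in                # the entangled state (|00> + |11>)/sqrt(2)

# Final measurement: Born-rule probabilities in the computational basis.
probs = np.abs(psi_out) ** 2
```

The output state is entangled, so the two measurement outcomes 00 and 11 are perfectly correlated, each with probability 1/2.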

Figure 33.1 A quantum circuit doing a computation.

We can ask, why is it that we bother with quantum computation instead of the standard classical
computation? One answer is that, with the increase in density of processors, and the decrease of the
size onto which we can store a bit, eventually we will reach a size of the basic circuitry of the order
of the quantum scale, meaning that quantum computation must become relevant then. This is a
practical constructional reason.
But a better, theoretical, reason is that we can do more with a quantum computer than with a
classical one. Can we do calculations that are not possible on a classical computer? No, that would
be a contradiction, since we can simulate a quantum computer on a classical one.
However, we can do the same calculation faster on a quantum computer, meaning that the way in
which they scale with the size of the problem is different. Generally, it is thought that if a specific
computation can be done in a time that is a polynomial (approximately a power law) of the size of
the data, i.e., that “it can be solved in polynomial time”, then the problem is solvable by classical
computers (since we will solve it in a reasonable time). These problems are in a set called P.
Nevertheless there are very important problems that are not in P, meaning that they involve a
time that grows faster than a power law with the size of the problem. Then the problem is thought
effectively unsolvable by a classical computer, since we can easily choose a problem size that leads
to an unreasonable time to solve it.
There is a standard example, namely the problem of the factorization of large numbers. For a large number N, with k digits, we can write it as a product of n factors, N = n_1 · · · n_n. Then finding the factors n_i is a problem that is not in P, the time to find {n_i} being related to the size k by ∼ exp(f(k)).
However, if we have a solution, we can check that it is correct (by multiplying the factors ni ) in a
polynomial time, so the check of the solution is in P.
This problem is related to cryptography: cryptography (the encryption of messages everywhere,
including banking) is based on the existence of a key that can be related to the factors in a very large
number. Then solving the factorization problem in polynomial time will result in a way to break
encryptions in polynomial time.
Coming back to quantum computation, we note that a quantum gate is a unitary transformation,
and as such is by definition reversible. But a classical computation is in general not reversible. That
means that a classical reversible computer is mapped to a special case of a quantum computer.
To understand further the differences between a classical and a quantum computer, we note that
quantum computers can simulate probabilistic classical computers. Conversely, a classical computer
can simulate a quantum computer. Then, why is it that a quantum computer is better? The reason
is that the simulation takes increasingly large times, becoming quickly prohibitive. Indeed, we saw

that there is no difference between the problems that can be solved by a classical and a quantum
computer, only in the times it takes to solve the problems on the two computers.
The gain in time efficiency when going to a quantum computer is offset somewhat by the fact
that, since the computation is a unitary action on a state, followed by projection onto other states, the
quantum computer gives only probabilities for the measurement. That means that the result is given
as a probability distribution, so we have errors and error bars.
But there is no problem with that, since as we said, we are mostly interested in solving a given
problem that is not in P in polynomial time. And once we have a solution, we can check it in
polynomial time, so the probability distribution of results is enough.
What is the root of the effectiveness of quantum computation as against classical computation? The
important point is “quantum parallelism”: for instance, in the case of a single qubit, we can choose the input to be a superposition of the basis states |↑⟩ and |↓⟩ (or of other |ψ⟩ states). Then the computer does the computation on the |↑⟩ and |↓⟩ states in parallel, in effect doing two computations in the time
it takes a computer to do one. Of course, then we must go back and repeat the computation several
times, in order to get probabilities, and then we also must check the result (in polynomial time). But
the end result is still an improvement over polynomial time, owing to the large number of parallel
computations.
One potential problem is the existence of errors that appear due to interaction with the environ-
ment, which can lead to a change, or even a collapse, of the wave function via a classical measure-
ment. Indeed, it is to be understood that, at least in some cases, measurements and classicalization (the
transition from a quantum state to a classical one) are the result of interaction with the environment,
through decoherence (a generalization of the decoherence of an electromagnetic wave, light, to the
decoherence of the probability wave, the wave function). This effect will be studied in more detail
later on, but here we just state it without details.
However, errors that appear can be partially corrected, via quantum error correction (quantum)
algorithms, which will not be described here. So, one can delay for a long time the decoherence and
loss of quantum information, giving enough time to do the quantum calculation.
The end result is quantum supremacy, the ability of a quantum computer to solve quickly problems
that a classical computer can only do slowly. Experimentally, at the time of writing it has not been
obtained yet but it is believed theoretically to exist.

33.4 Quantum Cryptography, No-Cloning, and Teleportation

We have seen that we can use quantum computation to decrypt standard classical encryption, which
is based on factorization of a large number. But is the reverse also possible, namely can we use
quantum phenomena to generate a new kind of encryption? The answer is yes, and this entails a way
to send a quantum key securely from one person to another, as we shall see.
But then another question arises: can we copy a quantum state that we possess in order to send
a copy of it to some other person? The answer to that is actually NO, expressed in the form of a
principle, the no-cloning principle:
It is not possible to copy the state |ψ⟩ of a (sub-)system by any quantum mechanical (or classical) process.

Such quantum processes would be unitary transformations acting on the state of the total system
(comprising Alice, from whom we want to copy the state, and Bob, to whom we wish to send

Figure 33.2 Principle of quantum teleportation.

the copy), as well as measurements collapsing the state of the (sub-)system. This is a principle,
not a theorem, so it cannot be proven in any generality; one can only test that it cannot be done in
any specific case that we consider.
Given this fact, it seems strange that we said we can send a state (a “key”) from Alice to Bob.
The process by which we send it is called “teleportation”, after the name invented in science fiction
(made popular mostly by the original “Star Trek” TV show) for a machine that sends something from
point A to point B by erasing it from point A and making it appear at point B.
This “quantum teleportation” is similar to this general idea, since we need to erase the state at
the original place (Alice), before (or rather, at the same time) recreating it at point B (Bob). We can
“teleport” the state from Alice to Bob by using entanglement (in the EPR style) and measurement.
The procedure is as follows (see Fig. 33.2). We define a general state in the two-dimensional qubit Hilbert space,

|ψ⟩ = a|↑⟩ + b|↓⟩,     (33.28)

and we want to send it from another location, C, to B, by use of entanglement with A. Defining as before

|φ_±⟩ = (1/√2)(|↑↑⟩ ± |↓↓⟩),
|ψ_±⟩ = (1/√2)(|↑↓⟩ ± |↓↑⟩),     (33.29)

we can show that we have the relation

|ψ⟩_C |φ_+⟩_AB = (1/2)[ |φ_+⟩_AC |ψ⟩_B + |ψ_+⟩_AC σ₁|ψ⟩_B + |ψ_−⟩_AC (−iσ₂)|ψ⟩_B + |φ_−⟩_AC σ₃|ψ⟩_B ].     (33.30)

We leave the proof of this relation as an exercise (by simple substitution of the definitions of states and matrices in terms of the basis).
We can perform then a “Bell measurement” on the system AC, meaning that we measure onto an EPR-style state; i.e., we project onto one of the states |φ_±⟩, |ψ_±⟩. Then Alice can tell Bob classically (over a classical channel, like a telephone) what kind of measurement she did (what state she projected onto). Depending on the state, all Bob has to do is to apply one of the operators 1, σ₁, (−iσ₂), or σ₃ (for |φ_+⟩, |ψ_+⟩, |ψ_−⟩, |φ_−⟩, respectively), in order to obtain the state |ψ⟩. Thus indeed, the state |ψ⟩ has been teleported from C to B, having been erased at C (by the measurement), but created at B (by the measurement plus operation).
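The teleportation relation can be verified numerically. The following Python sketch (with our own ordering and sign conventions, which may differ from the book's by overall phases) projects the AC pair onto each Bell state and applies the corresponding correction on B, recovering |ψ⟩ with the overall amplitude 1/2 of (33.30).

```python
import numpy as np

# Pauli matrices and the correction operators for phi+, psi+, psi-, phi-.
s1 = np.array([[0, 1], [1, 0]])
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]])
ops = [np.eye(2), s1, -1j * s2, s3]

up, dn = np.array([1.0, 0.0]), np.array([0.0, 1.0])
phi_p = (np.kron(up, up) + np.kron(dn, dn)) / np.sqrt(2)
phi_m = (np.kron(up, up) - np.kron(dn, dn)) / np.sqrt(2)
psi_p = (np.kron(up, dn) + np.kron(dn, up)) / np.sqrt(2)
psi_m = (np.kron(up, dn) - np.kron(dn, up)) / np.sqrt(2)
bells = [phi_p, psi_p, psi_m, phi_m]

# Unknown state at C; the entangled pair phi+ is shared between A and B.
a, b = 0.6, 0.8j
psi = a * up + b * dn
full = np.einsum('c,ab->cab', psi, phi_p.reshape(2, 2))   # state on (C, A, B)

results = []
for bell, op in zip(bells, ops):
    # Bell measurement on (A, C): project, leaving an (unnormalized) B state.
    B_state = np.einsum('ac,cab->b', bell.reshape(2, 2).conj(), full)
    results.append(op @ B_state)   # Bob's correction
```

Each of the four measurement outcomes, after Bob's correction, yields the same state ψ (here with amplitude 1/2, since each Bell outcome occurs with probability 1/4).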

Based on these ideas of teleportation via entanglement, one can create a quantum cryptographic
system that is invulnerable to attack. Indeed, for cryptography, we need to send a private key that
is used to decode a message. In quantum mechanics this means that we need to send a quantum
state (the key), after which the message is encoded in the correlations of a transmitted string and
the private key. Then we can construct a quantum key distribution (a way to send the key) that is
invulnerable to attack, since on the one hand the key is a state and, on the other, even knowing the
key state is useless since the information is in the correlation of the string with the key, so having the
string is irrelevant if it is not correlated to the key that you have.

Important Concepts to Remember

• In classical information theory, the Shannon entropy H(p) = −Σ_i p_i log₂ p_i is defined by the number of n-bit messages being 2^{nH(p)}, or, put in another way, the probability for the occurrence of a single string being 2^{−nH(p)}.
• In quantum computation, the analog of the Shannon entropy is the von Neumann entropy.
• In classical computation, a circuit is a product of gates acting on at least two bits. A quantum
circuit is a product of gates (unitary transformations), acting on at least two qubits, and a
quantum computation is the action of the circuit on a quantum state. At the end, we have made
a measurement.
• A quantum computer can calculate faster: some problems believed not to be in P (such as factoring, for which no polynomial-time classical algorithm is known) can be solved in polynomial time by a quantum computer. This is quantum supremacy, and would lead to the breaking of standard classical encryption.
• This gain in efficiency is due partly to using entangled states and, as a result, to quantum
parallelism: calculations are done in parallel, and then measurements are made and the result
checked (in polynomial time).
• In quantum mechanics we have the no-cloning principle: we cannot copy a quantum state.
• We can have quantum teleportation, though: we erase the state from system A, and give it to system
B. This leads to quantum cryptography.

Further Reading
See Preskill’s lecture notes on quantum information [12].

Exercises

(1) Show the details of proving the formula (33.5) for the Shannon entropy in (33.6).
(2) Show that the mutual information is zero if and only if, in the probability of occurrence of the
message x given the message y, p(x, y), the presence of the message y is irrelevant (so we have
independence of y).
(3) Show the details of the proof of the concavity of the von Neumann entropy.
(4) Give simple examples of when the von Neumann entropy satisfies S( A + B) < S( A) + S(B),
and when it satisfies S( A + B) = S( A) + S(B).

(5) Give an example of a finite quantum circuit made up of a very large number of infinitesimal
unitary gates acting on the same two qubits (and calculate the circuit).
(6) Prove the quantum teleportation relation (33.30).
(7) If a quantum computer can factorize a large number in polynomial time, thus breaking banking
and internet security encryption based on the same, does a quantum computer mean the end of
banking safety? Explain. If not, what problems do you see that need solving?
34 Quantum Complexity and Quantum Chaos

In this chapter we will define the notion of quantum complexity, related to quantum computations,
and how to calculate it and its properties. Then, we will describe a quantum version of classical
chaos, and the quantities that define it, and we will also show how to calculate them. The properties
described here are general for quantum theories, and they help us to understand how to connect to
general classical concepts.

34.1 Classical Computation Complexity

Before we consider quantum complexity, we define the easier classical complexity. In computer
science, data is encoded in a set of bits, and a computation is an action on (a function of) initial data
as bits. This function (the computation) is built out of building blocks called gates. A gate is a basic
function that acts on several bits, and spits out a result that is one or more other bits. An example is
the NOT gate, which acts on a single bit, and produces another, namely the opposite bit. Thus we can
define the gate as:
NOT : 0 → 1,   NOT : 1 → 0.     (34.1)

An example of a gate that acts on two bits, and gives one bit, is the gate AND, which acts as follows:

AND : (0, 0) → 0
      (1, 0) → 0
      (0, 1) → 0     (34.2)
      (1, 1) → 1.

Another one is the gate OR, which acts as follows:

OR : (0, 0) → 0
     (1, 0) → 1
     (0, 1) → 1     (34.3)
     (1, 1) → 1.
We say that a set of gates is a universal set if any general computation, i.e., any function of all
possible input data, can be built out of a combination of, including repetitions, the set of gates (of
course, any particular computation, from one input to one output, can be formed out of a smaller set
of gates or even a single gate).
For example, a universal set of gates is {NOT, AND, OR, INPUT}, where INPUT(x_i) inputs the ith bit (this allows us to recover the bits lost when acting with other gates, which turn two bits into one).

Another example is the set {NOT, AND, OR, XOR}, where now we keep the first bit in the result, and the second bit is the result of the computation, so that all the gates map two bits to two bits, where XOR is defined as

XOR : (0, 0) → 0
      (1, 0) → 1
      (0, 1) → 1     (34.4)
      (1, 1) → 0.

Then we can define the notion of classical complexity, as the minimal number of gates that are
needed in order to define a computation.
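As a tiny illustration (our own example, not the book's), XOR can be assembled from the universal set {NOT, AND, OR}, which immediately bounds its complexity in this gate set.

```python
# The universal set {NOT, AND, OR} acting on bits 0/1 (INPUT is trivial here).
NOT = lambda x: 1 - x
AND = lambda x, y: x & y
OR = lambda x, y: x | y

# XOR assembled from 4 gates of the universal set; hence the complexity of
# XOR in this gate set is at most 4.
def XOR(x, y):
    return AND(OR(x, y), NOT(AND(x, y)))
```

The true complexity could be smaller; a circuit only gives an upper bound, and minimality must be argued separately.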

34.2 Quantum Computation and Complexity

In the quantum case, data is encoded in qubits, meaning as an action on a general qubit state

c_0|0⟩ + c_1|1⟩.     (34.5)

Similarly to the classical case, we can define a quantum computation as a function of initial data on qubits. But quantum computations are unitary transformations on the state, |ψ⟩ → U|ψ⟩. A quantum computation can also be built from basic building blocks, i.e., quantum gates, which however now are defined as unitary matrices acting on the qubits.
For example, the quantum analog of the NOT gate is the matrix

X = ( 0  1
      1  0 ).     (34.6)

Here the matrix acts on the column vector (|0⟩, |1⟩)ᵀ.
Then we can define the Hadamard gate,

H = (1/√2) ( 1   1
             1  −1 ),     (34.7)

which therefore acts on a general state as follows:

H(c_0|0⟩ + c_1|1⟩) = (1/√2)(c_0 + c_1)|0⟩ + (1/√2)(c_0 − c_1)|1⟩,     (34.8)

and is thought of as the “square root of the quantum NOT gate”.
Another gate is the phase gate,

P = ( 1  0
      0  i );     (34.9)

more generally, e^{iφ} replaces i.
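These single-qubit gates are easy to write down and check numerically; the following Python sketch (the state amplitudes are arbitrary illustrative choices) verifies unitarity and the Hadamard action (34.8).

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])                   # quantum NOT, (34.6)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)     # Hadamard, (34.7)
P = np.array([[1, 0], [0, 1j]])                  # phase gate, (34.9)

# An arbitrary (illustrative) qubit state c0|0> + c1|1>.
c0, c1 = 0.6, 0.8
state = np.array([c0, c1])
h_state = H @ state     # ((c0 + c1)|0> + (c0 - c1)|1>)/sqrt(2), as in (34.8)
```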


We can define as in the classical case a universal set of gates as a set of gates into which we
can decompose any quantum computation on any initial data. A universal set is composed of the

Hadamard gate, the phase gate, and the “Toffoli gate”, defined as the matrix (acting on three qubits, with basis |000⟩, |001⟩, |010⟩, |011⟩, |100⟩, |101⟩, |110⟩, |111⟩)

T = ( 1 0 0 0 0 0 0 0
      0 1 0 0 0 0 0 0
      0 0 1 0 0 0 0 0
      0 0 0 1 0 0 0 0
      0 0 0 0 1 0 0 0     (34.10)
      0 0 0 0 0 1 0 0
      0 0 0 0 0 0 0 1
      0 0 0 0 0 0 1 0 ).
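A quick numerical check (our own sketch) that the matrix T is unitary and acts as a controlled-controlled-NOT, flipping the third qubit exactly when the first two are 1:

```python
import numpy as np

# Toffoli gate: the identity except that it swaps |110> and |111>,
# i.e., it flips the third qubit iff the first two qubits are both 1 (CCNOT).
T = np.eye(8)
T[[6, 7]] = T[[7, 6]]

# Check the action on every basis state |abc> against the CCNOT rule.
outs = []
for i in range(8):
    a, b, c = (i >> 2) & 1, (i >> 1) & 1, i & 1
    j = (a << 2) | (b << 1) | (c ^ (a & b))   # expected image of |abc>
    outs.append(int(np.argmax(T @ np.eye(8)[i])) == j)
```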

Since quantum computations are merely unitary transformations on states, if we define a given reference state |ψ_R⟩, we can define a notion of the quantum complexity of any state |ψ_T⟩ as the quantum complexity of the computation from |ψ_R⟩ to |ψ_T⟩, namely of the unitary matrix U defining

|ψ_T⟩ = U|ψ_R⟩.     (34.11)

The minimum number of gates in the universal set that is required in order to build the computation, i.e., the matrix U, is called the quantum complexity.
But there is another catch now, with respect to the classical case. Since unitary transformations are continuous, whereas the product of gates is a discrete operation, we will rarely be able to obtain exactly the state |ψ_T⟩ by a product of a finite number of gates. Instead, we must define the equality of the state made from gates with the actual state as being only up to a tolerance ε, i.e.,

‖ |ψ_T⟩ − U|ψ_R⟩ ‖² ≤ ε.     (34.12)

As before, there is no unique circuit that can give the wanted result. Instead, the complexity C(U) is the minimum number of gates required to build U up to a desired tolerance, over the set of quantum circuits giving U. Since U is an action on n qubits, it is a 2^n × 2^n matrix. One can show that C(U) ≥ 4^n for most cases, putting a lower bound on the complexity.
One can also have an upper bound, given the Solovay–Kitaev theorem, which states that, given a universal set of gates G and a target unitary matrix U, the number N of gates needed to obtain equality up to the tolerance ε > 0 is

N = O(log^c(1/ε)),  where 1 ≤ c ≤ 2.     (34.13)

34.3 The Nielsen Approach to Quantum Complexity

There is a geometrical approach to quantum complexity pioneered by M.A. Nielsen in [20], which
was generalized to the quantum field theory case in [19].
In it, we can define the quantum unitary matrix U as a path-ordered exponential of a “Hamiltonian” H(s), as

U = P exp( −i ∫_0^1 H(s) ds ),     (34.14)

where the “Hamiltonian” is decomposed in the basic building blocks acting on qubits, the generalized Pauli matrices σ_I, understood as tensor products of the Pauli matrices σ_i acting on each qubit:

H(s) = Σ_I Y^I(s) σ_I.     (34.15)

Here the function Y^I(s) is called the control function.
We can define things in a bit more generality, by not requiring an action on qubits, but rather on general states (since, in any case, we define complexity in terms of the relation of a general state to a reference state, neither of which needs to be a multiple-qubit state). Then we can replace the Pauli matrices σ_I with a general basis O_I of Hermitian generators. Furthermore, we can also define an s-dependent matrix by integrating only up to s,

U(s) = P exp( −i ∫_0^s Y^I(s′) O_I ds′ ).     (34.16)

It satisfies a Schrödinger equation for evolution in s,

i dU(s)/ds = Σ_I Y^I(s) O_I U(s),     (34.17)

and can be solved, with boundary conditions U(s = 0) = 1 and U(s = 1) = U.
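The construction can be illustrated numerically: the path-ordered exponential is the s-ordered product of infinitesimal evolutions, i.e., the solution of the Schrödinger equation above. The following Python sketch does this for a single qubit, with a hypothetical control function Y^I(s) chosen purely for illustration.

```python
import numpy as np

# Pauli basis O_I for a single qubit (a toy instance of the construction).
O = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], dtype=complex)]

def Y(s):
    """A hypothetical control function Y^I(s), purely for illustration."""
    return [np.cos(np.pi * s), 0.3, 0.5 * s]

# The path-ordered exponential as an s-ordered product of small exact steps
# e^{-i H(s) ds}: a numerical solution of i dU/ds = H(s) U(s) with U(0) = 1.
N = 2000
ds = 1.0 / N
U = np.eye(2, dtype=complex)
for k in range(N):
    s = (k + 0.5) * ds
    Hs = sum(y * o for y, o in zip(Y(s), O))      # H(s) = sum_I Y^I(s) O_I
    w, V = np.linalg.eigh(Hs)                      # exact step via eigenbasis
    U = (V @ np.diag(np.exp(-1j * w * ds)) @ V.conj().T) @ U
```

Note that the steps are multiplied on the left in order of increasing s, which is exactly what the path ordering P prescribes.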
In order to define a quantum complexity, we need to define some path between these two extremes
(i.e., boundary conditions) that can be minimized according to some rules. For this, we define a
cost function, or Finsler function, F (Y I (s), ∂s Y I (s)), that can be used for the minimization. It must
satisfy the following conditions:
(1) Positivity: F ≥ 0, and F = 0 ⇔ Y^I = 0.
(2) Continuity and smoothness: F ∈ C^∞.
(3) Homogeneity: F(λY^I) = λF(Y^I).
(4) Triangle inequality: F(Y^I + Y′^I) ≤ F(Y^I) + F(Y′^I).
The simplest example is the function F_2,

F_2(Y^I) ≡ √( Σ_{IJ} δ_{IJ} Y^I (Y^J)* ).     (34.18)

It gives rise to a Riemannian geometry on the space of control functions.


Given a cost function, we can define the distance functional, or “circuit depth”,

D[U] = ∫_0^1 ds F(Y^I(s), ∂_s Y^I(s)).     (34.19)

Then its minimum, D_min(U), gives the quantum complexity.
Moreover, using a theorem by Nielsen, we find that the complexity is also found from a geodesic in the Riemannian space, with metric

g_μν (dx^μ/ds)(dx^ν/ds) = Σ_I |Y^I(s)|².     (34.20)

In [21], it was found that the quantum complexity of k qubits is related to the entropy of a classical
system with 2k degrees of freedom, and one can find from it a (thermodynamic) second law of
quantum complexity.

Figure 34.1 General behavior of the complexity C(t) with time. At t ∼ e^K there is flattening, and at t ∼ e^{e^K}, quantum recurrences.

For the unitary operator U(t) = e^{−iHt}, we can define the time evolution of the complexity C(t) and find that it increases linearly,

C(t) = Kt,     (34.21)

until a time of the order of ∼ e^K (with K the slope of the linear relation). After that, it remains flat at C_max, with only fluctuations around it. Then, at much larger times, around e^{e^K}, we obtain “quantum recurrences”, where the state returns to previously passed values and C(t) has large downwards fluctuations, until eventually it dips to zero, and then comes back up, as in Fig. 34.1. That this picture is quite general is a conjecture, but there is a lot of evidence for it.
This behavior, for k qubits, is similar to that for the entropy of 2k degrees of freedom.
The complexity in quantum field theories also has another, more concrete (both more precise
and easily calculable) geometrical interpretation, in terms of the relation of quantum field theory
to gravity, the AdS/CFT correspondence (or gauge/gravity duality). In it, the complexity can be
calculated as an on-shell action for a classical gravity theory, or the volume of a geometrical
hypersurface. We will, however, not explain these advanced concepts here.

34.4 Quantum Chaos

Classically, chaotic behavior is defined as the divergence of two nearby paths in phase space according to an exponential law in time, defined by the Lyapunov exponent λ_I,

δq(t) ∼ e^{λ_I t}.     (34.22)
An exact analog for this in quantum mechanics is not well defined.
However, recently a very concrete measure of quantum chaos was proposed, in [18]. It relates to
the “out of time ordered correlator (OTOC)”.
To define this quantity, consider a state |ψ and a Hamiltonian Ĥ, for some general quantum
system with Hilbert space H. Consider also two operators, V and W , that can be defined as some
local perturbations at different positions. For this, suppose that H defines something like a “spin
chain”, a discrete set of sites with two-state Hilbert spaces (“spins”) at each site, with V and W

acting at different sites, separated by a distance x. Note that this description is equivalent to the
description of quantum data in terms of (spatially ordered) qubits.
Now apply V to the state |ψ⟩, evolve in time with Ĥ for a time t, then apply W and evolve backwards in time for −t, generating the state

|ψ₁⟩ = e^{iHt/ℏ} W e^{−iHt/ℏ} V |ψ⟩.     (34.23)

Alternatively, evolve in time with Ĥ for a time t, then apply W, then evolve for a time −t, then apply V, obtaining

|ψ₂⟩ = V e^{iHt/ℏ} W e^{−iHt/ℏ} |ψ⟩.     (34.24)

Then the overlap of the two states is the OTOC,

F(t) = ⟨ψ₂|ψ₁⟩ = ⟨ψ| e^{iHt/ℏ} W† e^{−iHt/ℏ} V† e^{iHt/ℏ} W e^{−iHt/ℏ} V |ψ⟩.     (34.25)

The Heisenberg operator

W(t) = e^{iHt/ℏ} W e^{−iHt/ℏ} = Σ_{n=0}^{∞} ((it/ℏ)^n/n!) [H, . . . , [H, W] . . . ]     (34.26)

is defined as an analog of the classical butterfly effect, when we go forwards in time, then act, and then go backwards in time, or, equivalently, as a sum of arbitrary nested commutators. Then if the operator W is local (defined at a single site) initially, for W(t) it amounts to a “spreading out”: the operator acts over a distance that increases in time around the initial site.
In terms of W(t), the OTOC is

F(t) = ⟨W†(t) V† W(t) V⟩.     (34.27)

Note that here the spin chain gives another way to think of a qubit model for the data in a quantum
computation. In that case, V and W are expressed as sums of products of basis operators, which are
Pauli matrices σI .
Another relevant quantity for quantum chaos is the expectation value of the commutator of W(t) and V, a more usual kind of correlator,

C(t) = ⟨[W(t), V]† [W(t), V]⟩
     = ⟨ψ| V†W(t)†W(t)V + W(t)†V†VW(t) − V†W(t)†VW(t) − W(t)†V†W(t)V |ψ⟩.     (34.28)

Then note that, for unitary V, W (V† = V⁻¹, W† = W⁻¹), we obtain

C(t) = 2(1 − Re F(t)).     (34.29)

We can also, more relevantly, generalize the above to finite temperature T, with β = 1/(k_B T). Then the expectation value is defined using the thermal density matrix,

⟨. . .⟩ = (1/Z) Tr(e^{−βH} . . .).     (34.30)
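The relation C(t) = 2(1 − Re F(t)) for unitary V, W can be checked numerically. The sketch below is our own toy example, with ℏ = 1, a random Hermitian matrix standing in for the Hamiltonian and Haar-random unitaries standing in for the local operators, evaluated in a random pure state.

```python
import numpy as np

rng = np.random.default_rng(1)

def haar_unitary(n):
    """A random unitary matrix (QR of a complex Gaussian, phase-fixed)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

dim = 8                                      # e.g., three spins
Hm = rng.normal(size=(dim, dim))
Hm = (Hm + Hm.T) / 2                         # a random Hermitian "Hamiltonian"
V, W = haar_unitary(dim), haar_unitary(dim)  # unitary perturbation operators
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)                   # a random pure state

t = 0.7
w_e, w_v = np.linalg.eigh(Hm)
Ut = w_v @ np.diag(np.exp(-1j * w_e * t)) @ w_v.conj().T   # e^{-iHt}, hbar = 1
Wt = Ut.conj().T @ W @ Ut                                  # Heisenberg W(t)

F = psi.conj() @ (Wt.conj().T @ V.conj().T @ Wt @ V @ psi)   # OTOC, (34.27)
comm = Wt @ V - V @ Wt
C = (psi.conj() @ (comm.conj().T @ comm @ psi)).real         # (34.28)
```

Expanding the commutator squared term by term reproduces the four terms of (34.28); the first two collapse to 1 for unitary operators, leaving C = 2 − 2 Re F.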
For a chaotic system and two local operators V and W , the spread in time of the action in W (t) is
ballistic (unimpeded), with a “butterfly velocity” vB which can be thought of as a kind of “emergent

speed of information propagation”. Then, if x denotes the distance between the points at which V
and W act, we have, at fixed t,
F(t) ≃ 1,  for x ≫ v_B t,
F(t) ≃ 0,  for x ≪ v_B t.     (34.31)
On the other hand, as a function of time, C(t) ≃ 0 at small times and C(t) ≃ 2 at large times. This means that the correlation of the operators at large times is perfect, and at small times is negligible. Moreover, for the butterfly effect, we have

C(t) ∼ 2⟨VV⟩⟨WW⟩     (34.32)

at large t.
For a local chaotic system, C(t) becomes of order 1 at the so-called “scrambling (delocalization) time” t_*,

C(t) ∼ 1  for t ∼ t_*.     (34.33)
In terms of F(t), this corresponds to the transition between 1 and 0, so we obtain

t_* ∼ x/v_B.     (34.34)

There is another relevant time scale, the much smaller “dissipation time” t_d (or “collision time”, if there is a quasiparticle picture for dissipation through collisions), where t_d ≪ t_*. At small times, but larger than t_d, the two-point function of operators decays exponentially, so

⟨V(0)V(t)⟩ ∼ e^{−t/t_d}.     (34.35)

We also have

⟨V(0)V(0)W(t)W(t)⟩ ∼ ⟨VV⟩⟨WW⟩ + O(e^{−t/t_d}).     (34.36)

For strongly coupled systems, the dissipation time is t_d ∼ βℏ = ℏ/(k_B T).
In systems at nonzero temperature T and with a large number of degrees of freedom, we obtain

F(t) = 1 − ε e^{λ_L t},     (34.37)

where λ_L is called the (quantum) Lyapunov exponent, defining the quantum counterpart to the classical chaotic behavior. Both λ_L and ε depend on the system. From the relation between C(t) and the OTOC F(t), we see that

t_* ∼ 1/λ_L.     (34.38)

The Lyapunov exponent is conjectured [18] to satisfy (for most quantum systems, at least) a bound,

λ_L ≤ 2π/(βℏ) = 2πk_B T/ℏ.     (34.39)

The bound is saturated for black holes.
Moreover, there is a conjectured form of the spatial spread in C(t) [22],

C_x(t) ≡ C(x, t) ∼ exp( −λ_L (x − v_B t)^{1+p}/(c^{1+p} t^p) ),     (34.40)

where, in C_x(t), W acts at x = 0 and V = V_x acts at x, and p is a universality class parameter that varies between 0 and 1, at least in known cases.

Important Concepts to Remember

• The classical computation complexity is the minimal number of gates necessary to do a computation.
• The quantum complexity is the minimal number of quantum gates in a universal set necessary to move from a given reference state |ψ_R⟩ to the state we want, |ψ_T⟩, as |ψ_T⟩ = U|ψ_R⟩, more precisely up to a tolerance ε in norm.
• In the case of n qubits, U is 2^n × 2^n, so C(U) ≥ 4^n; for a tolerance ε, the number of gates is O(log^c(1/ε)), with 1 ≤ c ≤ 2.
• In Nielsen’s approach to quantum computation, we build path-dependent unitary matrices as U(s) = P exp(−i ∫_0^s Y^I(s′) O_I ds′), with O_I a basis of Hermitian generators and Y^I(s) the control function making up the “Hamiltonian” H(s) = Σ_I Y^I(s) O_I for a Schrödinger equation, with boundary conditions U(0) = 1 and U(1) = U.
• One minimizes the integral over s of a cost (Finsler) function F(Y^I, ∂_s Y^I), called a distance functional or circuit depth, to give the quantum complexity at the minimum.
• The quantum complexity increases linearly with time, C = Kt, until a time t ∼ e^K, then it becomes flat, and then at e^{e^K} we have quantum recurrences.
• The measure of classical chaos is the Lyapunov exponent, from δq(t) ∼ e^{λ_I t}, whereas the measure of quantum chaos comes from the out of time ordered correlator (OTOC), F(t) ≡ ⟨ψ|e^{iHt/ℏ} W† e^{−iHt/ℏ} V† e^{iHt/ℏ} W e^{−iHt/ℏ} V|ψ⟩, and from C(t) ≡ ⟨ψ|[W(t), V]† [W(t), V]|ψ⟩, with W(t) = e^{iHt/ℏ} W e^{−iHt/ℏ}. For unitary V, W, C(t) = 2(1 − Re F(t)).
• For a chaotic system, the spread in time of the action of W(t) is very fast, with “butterfly velocity” v_B (the emergent speed of information propagation), such that the OTOC is F(t) ≃ 1 for x ≫ v_B t, and F(t) ≃ 0 for x ≪ v_B t.
• Also for a chaotic system, C(t) factorizes at large times, C(t) ≃ 2⟨VV⟩⟨WW⟩, and becomes (from 0 at t = 0) of order 1 at the scrambling time t_* ∼ x/v_B.
• For a chaotic system at nonzero temperature T and with a large number of degrees of freedom, the OTOC is F(t) = 1 − ε e^{λ_L t}, with λ_L ∼ 1/t_* the quantum Lyapunov exponent. It is conjectured to satisfy the bound λ_L ≤ 2πk_B T/ℏ, which is saturated by black holes.
Further Reading

For classical and quantum complexity, see Preskill’s lecture notes on quantum information [12], as well as [19, 20, 21]. For quantum chaos, see [18, 22].

Exercises

(1) Consider the following classical computation on six-qubit space, (1, 0, 1, 0, 1, 0) → (0, 0, 1, 1, 1, 1). Choose your favorite universal set of gates, and find two examples of circuits (products of gates) giving this computation. Thereby deduce a bound on the complexity of the computation.

(2) Consider the following quantum computation on five-qubit space, from reference state |ψR⟩ to final state |ψT⟩,

|ψR⟩ = ( (|0⟩ + |1⟩)/√2, (|0⟩ + |1⟩)/√2, (|0⟩ + |1⟩)/√2, (|0⟩ + |1⟩)/√2, (|0⟩ + |1⟩)/√2 ) →
|ψT⟩ = ( (|0⟩ − |1⟩)/√2, (|0⟩ + |1⟩)/√2, (|0⟩ − |1⟩)/√2, (|0⟩ + |1⟩)/√2, (|0⟩ − |1⟩)/√2 ),    (34.41)

with tolerance ε = 1/10. Using the universal set of gates in the text, find one example of a circuit (product of gates) giving this computation, and thereby deduce a bound on the complexity of the computation.
(3) Consider a quantum computation on four-qubit space in the Nielsen approach. Find a basis OI
for the computation, and write down explicitly the Schrödinger equation for U (s) in this basis.
(4) Consider the function F(Y^I) = ΣI |Y^I|. Is it a good cost function (Finsler function)? Why?
(5) Consider a Heisenberg spin chain Hamiltonian, H = −J Σi σ⃗i · σ⃗i+1, where σ⃗i are the Pauli matrices at site i on a spin chain or, equivalently, on the data for a quantum computation, and the operator W = σ¹k, where k is a fixed site. Show that, indeed, W(t) spreads along the spin chain away from k as time evolves.
(6) Is there a classical limit of the measure of quantum chaos described here? How would you
compare the classical and quantum chaos descriptions?
(7) The conjectured bound (34.39) implies that at zero temperature λ L → 0. Does that imply that
the scrambling time t ∗ → ∞? Explain.
35 Quantum Decoherence and Quantum Thermalization

In this chapter, we will study the transition from quantum to classical, i.e., classicalization. Usually,
when talking about a system interacting with an environment, we talk about decoherence, which
will be studied first. In the case of an isolated system, we can also talk about decoherence since the
mechanism is the same, but more commonly it is called thermalization, and as such one considers a
separate analysis, which we will study next. Finally, we will describe a Bogoliubov transformation
on a (system of) harmonic oscillator(s) that describes particle production. We can obtain a thermal
particle production, and thus a finite temperature, in certain physically relevant cases.
35.1 Decoherence
In quantum mechanics, an obvious question that should have been asked from the beginning is: how
do we transition to a classical state? Even if it is in terms of just increasing the system size it is a
relevant question, since we mostly observe classical systems but, at the microscopic level, everything
is quantum. However, the transition from a quantum state to a classical state is, for an isolated system,
not a unitary transformation, so we need some other ingredient to describe it. A related issue is the
question of measurement: in it, there is an interaction of a quantum system with a classical observer or
apparatus, which changes the state of the quantum system in a nonunitary way. For the measurement
issue, however, as we saw, this is connected with the interpretation of quantum mechanics. Since this
is a controversial issue, we will not delve into it here even though decoherence theory has something
to say about it.
The minimalist point of view, then, is to restrict ourselves to a description of the interaction of
a system S with an environment E which will lead to decoherence of the state of the system S,
effectively leading to classical states. Even so, the field was regarded as somewhat disreputable, in as
much as it was perceived as dealing with mostly philosophical issues (a viewpoint perhaps influenced
by the connection with the interpretation of quantum mechanics alluded to above). This was the case
until decoherence was experimentally verified, in particular by the work of French physicist Serge
Haroche, and others, in 1996 and afterwards. As a result, Haroche gained a Nobel prize in 2012, for
the “experimental manipulation of individual quantum systems”.
35.2 Schrödinger’s Cat
Erwin Schrödinger famously proposed a Gedanken experiment (thought experiment), the Schrödinger’s cat paradox, to sharpen the problem with the interaction between classical and

quantum, and (as he thought) with the standard Copenhagen interpretation of quantum mechanics.
Consider an isolated system, a “black box”, inside which we have a quantum trigger, for instance
a Geiger counter, triggered by an individual radioactive decay, giving a classical effect, that of
producing a killing procedure (spilling some poison, or shooting a pistol, etc.) at some time t on a
cat inside the same box. If the quantum state of the radioactive atom is described as (since the atom
decay has some probability per unit time, which can be integrated over some time to give 1/2)
|atom⟩ = (1/√2)(|decay⟩ + |no decay⟩),    (35.1)

then it seems to imply that the quantum state of the Schrödinger cat is

|S.cat⟩ = (1/√2)(|dead⟩ + |alive⟩).    (35.2)
This is clearly paradoxical, since a cat is either dead or alive, not in between. A question
arises: why did Schrödinger consider a cat in the thought experiment? One answer is that the cat is
macroscopic (so naturally classical) and alive. But is it an observer? If an observer is something
related to consciousness, the experiment sharpens the issue, as the cat is not human but most people
would assume it has consciousness. Schrödinger’s point was that this question, in his view, makes
the standard Copenhagen interpretation ludicrous.
But there is an obvious problem with the set-up of the Gedanken experiment: there are nonlocal
correlations over macroscopic distances, with classical interactions in between the two points.
The solution to this issue is a solution to the paradox: we must eliminate these nonlocal “cat states”
from the theory by decoherence, the loss of quantum coherence, which arises due to the interaction
of the quantum state with the “environment” inside the box, meaning the classical cat and killing
apparatus.
The issue of elimination of cat states through decoherence is also relevant for quantum computa-
tions. In a quantum computation, there are delicate superpositions of states of a large quantum system,
the system of many qubits of information. As such, the quantum state will decay very rapidly (like
a cat state of the macroscopic cat) through decoherence. Then the quantum information would be
lost quickly. However, as we mentioned earlier, there are quantum error-correcting codes that delay
decoherence for a long time.
35.3 Qualitative Decoherence
We first present the qualitative picture of decoherence, involving several points, as below.
1. What happens, very fast, is that a correlation between the |S.cat state and the environment
forms, in such a way that the |S.cat state becomes inaccessible. We can say that, in effect, “the
environment measures the cat”, collapsing its state to a |dead or |alive one.
2. Then decoherence represents a (unitary, from the point of view of a larger system, of which our
system is a subsystem) transition to entangled system + environment states.
3. A general state |ψ of the quantum system can be expanded in different bases, such that the
various basis elements interact with the environment in element-specific ways. Then the (unitary)
evolution leads to a situation with no interaction between basis elements. That means that we lose
phase coherence between these elements, i.e., we “decohere”.

4. The environment selects only a particular basis of elements of the quantum system, one
that interacts with the environment in a canonical way. We have then an “environment-induced
superselection”, or Einselection.
The theory of decoherence was initiated by H.D. Zeh in 1970 and developed by many people,
though notably among them W.H. Zurek.
35.4 Quantitative Decoherence

Toy Model
To start making decoherence more quantitative, we begin with a three-qubit model, one qubit for each of the three elements: the system S, the environment E, and the apparatus A. The two states of the system are denoted by |↑⟩ and |↓⟩, those of the apparatus by |A0⟩ and |A1⟩, and those of the environment by |ε0⟩ and |ε1⟩.
Consider a measurement, which is thought of as an interaction between the system and the
apparatus, that leads to a (unitary) change of the total quantum system, with Hilbert space HS ⊗ HA .
Specifically, we start with the apparatus in the | A0  state, so we have
|↑⟩ ⊗ |A0⟩ → |↑⟩ ⊗ |A1⟩,
|↓⟩ ⊗ |A0⟩ → |↓⟩ ⊗ |A0⟩.    (35.3)

Here the apparatus states are orthogonal, ⟨A0|A1⟩ = 0.
On a general quantum state, the measurement means the evolution

(α|↑⟩ + β|↓⟩) ⊗ |A0⟩ → α|↑⟩ ⊗ |A1⟩ + β|↓⟩ ⊗ |A0⟩ ≡ |Φ⟩.    (35.4)
Then the final state (after the measurement), |Φ⟩, can be expanded in different bases, but the environment has not entered the discussion yet, so neither basis is preferred.
Next we must perform a premeasurement, which is an evolution of the system, apparatus, and environment that aligns all of them,

(α|↑⟩ ⊗ |A1⟩ + β|↓⟩ ⊗ |A0⟩) ⊗ |ε0⟩ → α|↑⟩ ⊗ |A1⟩ ⊗ |ε1⟩ + β|↓⟩ ⊗ |A0⟩ ⊗ |ε0⟩ = |ψ⟩.    (35.5)

An important assumption of the decoherence model is

⟨ε0|ε1⟩ = 0,    (35.6)

which means that the orthogonal states of the environment are correlated with the apparatus in the basis in which the premeasurement was carried out.
Then the interaction with the environment, when we take the trace over the environment (since we
don’t know it), leads to a transition from a pure state to a mixed state that looks classical, if the above
condition holds, eliminating interference terms.
At the beginning, the state in S ⊗ A is pure, as is the state in E, so taking the trace over E is
trivial, and does not change the pure state of the system. But after the premeasurement, we obtain a
ρAS = TrE |ψ⟩⟨ψ| = |α|² |↑⟩⟨↑| ⊗ |A1⟩⟨A1| + |β|² |↓⟩⟨↓| ⊗ |A0⟩⟨A0|,    (35.7)
which is a classical state, since it means that with probability |α|² the apparatus shows 1 and the system is in the state |↑⟩, and with probability |β|² the apparatus shows 0 and the system is in the state |↓⟩; the probabilities add without quantum interference terms.
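The elimination of the interference terms in (35.7) can be checked directly. Below is a minimal numerical sketch (not from the text), with arbitrary real values for α, β and orthogonal environment states:

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
A0, A1 = up, down      # orthogonal apparatus states
e0, e1 = up, down      # orthogonal environment states

alpha, beta = 0.6, 0.8  # |alpha|^2 + |beta|^2 = 1

# Premeasured state (35.5): |psi> = alpha |up, A1, e1> + beta |down, A0, e0>
psi = alpha * np.kron(up, np.kron(A1, e1)) + beta * np.kron(down, np.kron(A0, e0))
rho = np.outer(psi, psi.conj())                     # pure-state density matrix (8 x 8)

# Partial trace over the environment (the last tensor factor)
rho_AS = rho.reshape(4, 2, 4, 2).trace(axis1=1, axis2=3)

# Decohered mixture (35.7): no off-diagonal (interference) terms survive
P1 = np.outer(np.kron(up, A1), np.kron(up, A1).conj())
P0 = np.outer(np.kron(down, A0), np.kron(down, A0).conj())
print(np.allclose(rho_AS, alpha**2 * P1 + beta**2 * P0))   # True
```

If instead we chose ⟨ε0|ε1⟩ ≠ 0, off-diagonal terms proportional to αβ*⟨ε0|ε1⟩ would survive the trace.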

General Model
We can consider next a more general model. We define an “einselected” basis, namely an environment-induced selected basis |i⟩. Expand a general state of the system S in it,

|ψ⟩ = Σi |i⟩⟨i|ψ⟩.    (35.8)

Start with the environment E in the state |ε⟩, so with a state for the total S−E system of

|ψ⟩ ⊗ |ε⟩ = Σi |i⟩ ⊗ |ε⟩ ⟨i|ψ⟩.    (35.9)

Then the decoherence is a continuous phenomenon, on a scale between: total absorption of the
quantum state by the system (as when a photon is absorbed by a cavity in which it resonates); and the
case when the system is not disturbed at all by the environment (though the environment is changed
by the system).
(1) Considering first the case of total absorption by the environment, this corresponds to an evolution of the special (einselected) basis elements as

|i⟩ ⊗ |ε⟩ → |εi⟩,    (35.10)

leading to an evolution of a general state as

|ψ⟩ ⊗ |ε⟩ → Σi |εi⟩ ⟨i|ψ⟩.    (35.11)

In order to have unitary evolution for the total S−E system, thus preserving probabilities and state products, we must have

⟨εi|εj⟩ = δij,    (35.12)

since the einselected basis is orthonormal, ⟨i|j⟩ = δij, and ⟨ε|ε⟩ = 1.
(2) Considering next the case where the system is not disturbed by the environment, the evolution changes the environment by correlating its states with the einselected basis. Thus the action on the einselected basis is

|i⟩ ⊗ |ε⟩ → |i⟩ ⊗ |εi⟩,    (35.13)

leading to an action on a general state as

|ψ⟩ ⊗ |ε⟩ → Σi |i⟩ ⊗ |εi⟩ ⟨i|ψ⟩.    (35.14)

In this case, unitarity again requires the conservation of probabilities, thus of state products, so

⟨i|j⟩ ⟨εi|εj⟩ = δij ⟨εi|εj⟩ = δij.    (35.15)

But if we also require decoherence, meaning einselection in this case, we need

⟨εi|εj⟩ = δij.    (35.16)
To consider the effect of decoherence, we use the density matrix formalism. Before the premeasurement, we start in the pure state |ψ⟩ ⊗ |ε⟩, which means the density matrix

ρ0 = |ψ⟩⟨ψ| ⊗ |ε⟩⟨ε|.    (35.17)

Then the reduced matrix for the system S only is

ρ0,S = TrE ρ0 = |ψ⟩⟨ψ|,    (35.18)

which indeed represents a pure state.
In this case, the expectation value in the |φ⟩ state, meaning the probability for being in the |φ⟩ state, is

⟨φ|ρ0,S|φ⟩ = |⟨ψ|φ⟩|² = Σi |⟨ψ|i⟩⟨i|φ⟩|² + Σ{i,j; i≠j} ⟨ψ|i⟩⟨i|φ⟩⟨φ|j⟩⟨j|ψ⟩.    (35.19)

The last term contains the interference terms characteristic of a quantum state.
After the decoherence (evolution), we end up with the density matrix

ρS = TrE ρ = Σ{i,j,k} |i⟩⟨j| ⟨εk|εi⟩⟨εj|εk⟩ ⟨i|ψ⟩⟨ψ|j⟩ = Σi |⟨ψ|i⟩|² |i⟩⟨i|,    (35.20)

which instead represents a mixed state, of a classical nature. Indeed, the expectation value in a |φ⟩ state (and so the probability for the system to go to the |φ⟩ state) is

⟨φ|ρS|φ⟩ = Σi |⟨ψ|i⟩⟨i|φ⟩|²,    (35.21)

without the interference terms, so this is a classical state.
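The einselection mechanism can also be illustrated numerically. The following sketch (not from the text; taking the einselected basis and the environment states |εi⟩ to be computational bases is an assumption of the sketch) implements the evolution (35.14), traces over the environment, and confirms that the |φ⟩-probabilities lose their interference terms as in (35.21).

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4    # system dimension; einselected basis |i> taken as the computational basis

def rand_state(n):
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    return v / np.linalg.norm(v)

psi, phi = rand_state(d), rand_state(d)

# Evolution (35.14): |psi> x |eps> -> sum_i |i> x |eps_i> <i|psi>, with <eps_i|eps_j> = delta_ij.
# Taking |eps_i> as the environment computational basis, the total state has components:
total = np.zeros((d, d), dtype=complex)
for i in range(d):
    total[i, i] = psi[i]          # (system index, environment index)

rho = np.outer(total.ravel(), total.ravel().conj()).reshape(d, d, d, d)
rho_S = rho.trace(axis1=1, axis2=3)     # trace over the environment, as in (35.20)

# The reduced state is the diagonal mixture sum_i |<i|psi>|^2 |i><i|
print(np.allclose(rho_S, np.diag(np.abs(psi) ** 2)))      # True

# Probability of |phi>: no interference terms, as in (35.21)
p_decohered = (phi.conj() @ rho_S @ phi).real
p_pure = abs(psi.conj() @ phi) ** 2      # |<psi|phi>|^2, with interference terms
print(p_decohered, p_pure)               # generally different
```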
35.5 Qualitative Thermalization
We next move to the issue of thermalization, the evolution of a large isolated quantum system to a
thermal state, i.e., a classical statistical ensemble; we describe it qualitatively first. An essential factor
is the large size of the system, in the thermodynamic limit, and the interaction between subsystems.
In thermalization, the end result is the approach to the Gibbs ensemble, characterized by the classical density matrix

ρ̂A = e^{−βĤ}/Z.    (35.22)

The effective temperature at the end of thermalization is calculated such that the average energy before and after thermalization is the same. Before, we have a regular quantum average in the initial state |ψ(0)⟩, and afterwards a quantum average in the Gibbs ensemble, so

⟨ψ(0)|Ĥ|ψ(0)⟩ = Tr(Ĥe^{−βĤ})/Z.    (35.23)
In classical statistical mechanics, reaching thermodynamic equilibrium is guaranteed by the ergodic hypothesis, which says that we get arbitrarily close to all the points in phase space (although, as we noted before, it is strictly speaking – mathematically – not always correct). The latter is a consequence of the classical chaos of the system, since small variations in the initial conditions map out most of the phase space. Classical thermalization then means that an atypical initial state of the system goes over to the thermal ensemble at large times, t → ∞.
In quantum mechanics, instead of the ergodic hypothesis we have the eigenstate thermalization hypothesis (ETH), proposed independently by J.M. Deutsch in 1991 and M. Srednicki in 1994, the latter being the standard version.
The hypothesis says that the expectation value of a few-body observable Â, ⟨m|Â|m⟩, in an eigenstate |m⟩ of the Hamiltonian with energy Em of a large, interacting many-body system equals the thermal (quasi-)microcanonical average, A_microcanonical(Em), and so depends only on the mean energy Em.
This ETH is a hypothesis, meaning that it is not always satisfied. In particular, a system with “many-body localization” (MBL) does not satisfy it [24].
Also, integrable systems do not satisfy it, since these are systems with an infinite number of conservation laws. These are the analogs of non-chaotic systems in the classical version, so we expect non-ergodic behavior. However, such systems are described by a generalized Gibbs ensemble (GGE), in which a combination of conserved quantities appears in the exponent, exp(−Σi βi Pi), not just the energy E (as in e^{−βE} ≡ e^{−E/kB T}).
The bottom line of the ETH is that the knowledge of a single many-body eigenstate is sufficient to
compute thermal averages: it can be any eigenstate in the (quasi-)microcanonical ensemble.
35.6 Quantitative Thermalization
We can make a more quantitative analysis as follows. Consider the initial (many-body) state expanded in the eigenbasis |m⟩ of the Hamiltonian Ĥ,

|ψ(0)⟩ = Σm Cm |m⟩.    (35.24)

Its time evolution is

|ψ(t)⟩ = e^{−iĤt/ℏ}|ψ(0)⟩ = Σm Cm e^{−iEm t/ℏ} |m⟩.    (35.25)

The expectation value of an observable Â in the time-dependent state is

⟨Â(t)⟩ = ⟨ψ(t)|Â|ψ(t)⟩ = Σ{m,n} C*m Cn e^{i(Em−En)t/ℏ} Amn
= Σm |Cm|² Amm + Σ{m≠n} C*m Cn e^{i(Em−En)t/ℏ} Amn,    (35.26)

where Amn ≡ ⟨m|Â|n⟩.
Then the eigenstate thermalization hypothesis (ETH) states that the time average equals the average in the diagonal ensemble,
Ā = lim{T→∞} (1/T) ∫₀^T dt ⟨A(t)⟩ = Σm |Cm|² Amm = Tr(ρd Â),    (35.27)

where ρd is the diagonal ensemble. For an energy shell in the microcanonical ensemble we have

⟨E²⟩ψ(0) − ⟨E⟩²ψ(0) ≃ 0.    (35.28)

More precisely, in order for (35.27) to hold, we have the proposed relation

Amn = A(Ē)δmn + e^{−S(Ē)/2} fA(Ē, ω)Rmn,    (35.29)

where Ē = (Em + En)/2, ω = (En − Em)/2, S(Ē) is the entropy, fA is an arbitrary function, and Rmn is a random variable with zero mean and unit variance. This relation was formalized by Srednicki.
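The statement (35.27), that the infinite-time average equals the diagonal-ensemble average, can be illustrated on a small random system. The sketch below (not from the text; the matrices, state, and time window are arbitrary choices, with ℏ = 1) compares a brute-force time average with Σm |Cm|² Amm.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 6
M = rng.standard_normal((d, d)); H = (M + M.T) / 2        # random symmetric "Hamiltonian"
E, Vv = np.linalg.eigh(H)                                  # generically nondegenerate

B = rng.standard_normal((d, d)); Aop = (B + B.T) / 2       # a Hermitian observable
A_eig = Vv.T @ Aop @ Vv                                    # A_mn = <m|A|n>

psi0 = rng.standard_normal(d); psi0 /= np.linalg.norm(psi0)
C = Vv.T @ psi0                                            # C_m = <m|psi(0)>

# Time average of <A(t)> over a long window (hbar = 1), sampled densely
ts = np.linspace(0.0, 20000.0, 200001)
ct = C * np.exp(-1j * np.outer(ts, E))                     # coefficients C_m e^{-i E_m t}
Abar_time = np.einsum('ti,ij,tj->t', ct.conj(), A_eig, ct).real.mean()

# Diagonal-ensemble prediction (35.27): sum_m |C_m|^2 A_mm
Abar_diag = np.sum(np.abs(C) ** 2 * np.diag(A_eig))

print(Abar_time, Abar_diag)   # the two agree up to a small dephasing residue
```

The off-diagonal terms in (35.26) are suppressed as 1/(ωT) by the averaging window, so the residual difference shrinks with T.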
Then the diagonal ensemble ρd is a generalized Gibbs ensemble (GGE) if we take all the projection operators P̂m = |m⟩⟨m| as quantum operators corresponding to classical integrals of motion. Indeed, we find

ρ̂d = exp[−Σ{m=1}^D λm P̂m],    (35.30)

where D is the dimension of the Hilbert space and

λm = −ln(|Cm|²).    (35.31)

Indeed, to prove that this gives ρd, we expand the exponential as a sum of powers and use P̂m P̂m = |m⟩⟨m|m⟩⟨m| = P̂m. We find

ρ̂d = 1 + Σ{m=1}^D (e^{−λm} − 1)P̂m = 1 + Σ{m=1}^D (|Cm|² − 1)P̂m = Σ{m=1}^D |Cm|² |m⟩⟨m|,    (35.32)

where in the last equality we used P̂m P̂n = 0 for m ≠ n and Σ{m=1}^D P̂m = 1.
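The identity (35.30)–(35.32) is a simple linear-algebra fact that can be verified directly. In the sketch below (not from the text), the projectors P̂m are built on a random orthonormal basis and the matrix exponential is evaluated through the spectral decomposition:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 5
C = rng.standard_normal(d) + 1j * rng.standard_normal(d)
C /= np.linalg.norm(C)                       # expansion coefficients C_m, all nonzero here

lam = -np.log(np.abs(C) ** 2)                # lambda_m = -ln |C_m|^2, Eq. (35.31)

# Projectors P_m = |m><m| onto a random orthonormal basis
Q, _ = np.linalg.qr(rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d)))
P = [np.outer(Q[:, m], Q[:, m].conj()) for m in range(d)]

K = sum(lam[m] * P[m] for m in range(d))     # sum_m lambda_m P_m (Hermitian)
w, U = np.linalg.eigh(K)
rho_d = U @ np.diag(np.exp(-w)) @ U.conj().T # matrix exponential exp(-K) via eigendecomposition

expected = sum(np.abs(C[m]) ** 2 * P[m] for m in range(d))   # Eq. (35.32)
print(np.allclose(rho_d, expected), np.isclose(np.trace(rho_d).real, 1.0))  # True True
```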
As an example of the ETH, Srednicki used Berry’s conjecture to prove (something like) the ETH in this case. Berry’s conjecture says that the energy eigenfunctions are superpositions of plane waves in momentum space, with random phases and Gaussian random amplitudes.
Expanding the energy eigenfunction in position space in terms of momentum space,

ψn(x⃗) ≡ ⟨x⃗|n⟩ = ∫ d^{3N}p An(P⃗) e^{iP⃗·X⃗/ℏ} δ(P⃗² − 2mEn),    (35.33)

where An(P⃗) ≡ ⟨P⃗|n⟩ are Gaussian random variables, meaning that their two-point correlation in the eigenstate ensemble (EE) is

⟨Am(P⃗)An(P⃗′)⟩EE = δmn δ^{3N}(P⃗ + P⃗′)/δ(P⃗² − P⃗′²).    (35.34)

Then, defining the integral over all momenta except one,

φmn(p⃗1) = ∫ d³p2 · · · d³pN ψ*m(P⃗)ψn(P⃗),    (35.35)
we find that the diagonal elements, averaged in the eigenstate ensemble, give the Boltzmann thermalized distribution,

⟨φnn(p⃗1)⟩EE = (2πmkB Tn)^{−3/2} exp(−p⃗1²/(2mkB Tn)),    (35.36)

and its variance, Δφmm, goes to zero as N → ∞. Thus we have found that the ETH is obtained from the Berry conjecture.
35.7 Bogoliubov Transformation and Appearance of Temperature
There is another way to obtain a thermal quantum distribution from a T = 0 distribution. It is found
in black holes, accelerated mirrors giving “Rindler spaces”, and the expanding universe cosmology,
all of which generate a temperature from the existence of “horizons” that cannot be penetrated by
information.
The above-mentioned way is a Bogoliubov transformation,

b = αa + βa†,
b† = β*a + α*a†,    (35.37)

on the operators a, a† of a harmonic oscillator, thus redefining the oscillator and, most importantly, its states. One must impose the normalization condition

|α|² − |β|² = 1,    (35.38)

which means that we can find the inverse transformation

a = α*b − βb†.    (35.39)
Moreover, the commutation relations for the a, a† operators,

[a, a] = [a†, a†] = 0,  [a, a†] = 1,    (35.40)

are then preserved:

[b, b] = [αa + βa†, αa + βa†] = 0,
[b†, b†] = [β*a + α*a†, β*a + α*a†] = 0,    (35.41)
[b, b†] = [αa + βa†, β*a + α*a†] = |α|² − |β|² = 1.

The Hamiltonian

H = (aa† + a†a)/2    (35.42)

is not preserved, though, since

(bb† + b†b)/2 = αβ* a² + α*β (a†)² + (|α|² + |β|²)(aa† + a†a)/2,    (35.43)

which equals H only for β = 0.
Defining the vacuum state |0⟩a of the original oscillator as being annihilated by a,

a|0⟩a = 0,    (35.44)

in terms of the redefined oscillator it is nontrivial,

b|0⟩a ≠ 0.    (35.45)
Moreover, the new vacuum state, defined as before by

b|0⟩b = 0,    (35.46)

is a nontrivial function of the original variables, a coherent-type (squeezed) state,

|0⟩b = exp[−(β/2α)(a†)²] |0⟩a.    (35.47)

Indeed,

b|0⟩b = (αa + βa†) exp[−(β/2α)(a†)²] |0⟩a,    (35.48)

but we have the commutator

[a, exp(−(β/2α)(a†)²)] = Σn (1/n!)(−β/2α)^n [a, (a†)^{2n}] = −(β/α) a† exp[−(β/2α)(a†)²],    (35.49)

where we have expanded the exponential as an infinite sum of powers and used [a, (a†)^{2n}] = 2n(a†)^{2n−1}. Then we obtain

b|0⟩b = exp[−(β/2α)(a†)²](αa + βa†)|0⟩a + α(−(β/α)a†) exp[−(β/2α)(a†)²]|0⟩a = 0,    (35.50)

where we have used a|0⟩a = 0.
Finally, consider the vacuum state of the original oscillator, |0⟩a, propagate it in time, and use it to evaluate the number operator of the redefined oscillator in this state. We find

a⟨0| b†b |0⟩a = a⟨0|(β*a + α*a†)(αa + βa†)|0⟩a = |β|² a⟨0| aa† |0⟩a = |β|².    (35.51)

The interpretation of this is that there is a number |β|² of b particles in the state |0⟩a, i.e., in the vacuum state for a, a phenomenon known as particle creation.
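These statements can be checked on a truncated Fock space. The sketch below (not from the text; the truncation N and squeezing parameter r are arbitrary choices) verifies the commutator, the particle number ⟨0a|b†b|0a⟩ = |β|², and that the state (35.47) is indeed annihilated by b; note the factor β/(2α) in the exponent, which follows from [a, (a†)^{2n}] = 2n(a†)^{2n−1}.

```python
import math
import numpy as np

N = 60                                     # Fock-space truncation (choice for this sketch)
n = np.arange(N)
a = np.diag(np.sqrt(n[1:]), k=1)           # a|n> = sqrt(n)|n-1>
ad = a.conj().T

r = 0.4                                    # arbitrary squeezing parameter
alpha, beta = np.cosh(r), np.sinh(r)       # real solution of |alpha|^2 - |beta|^2 = 1, (35.38)
b = alpha * a + beta * ad
bd = b.conj().T

# [b, b^dag] = 1 away from the truncation edge
comm = b @ bd - bd @ b
assert np.allclose(comm[:N // 2, :N // 2], np.eye(N // 2))

# <0_a| b^dag b |0_a> = |beta|^2, Eq. (35.51)
vac = np.zeros(N); vac[0] = 1.0
print(np.isclose(vac @ bd @ b @ vac, beta ** 2))       # True

# The b-vacuum (35.47): exp(-(beta/2 alpha)(a^dag)^2)|0_a> is annihilated by b
c = beta / (2 * alpha)
expop = sum(np.linalg.matrix_power(-c * (ad @ ad), k) / math.factorial(k) for k in range(20))
vac_b = expop @ vac
print(np.linalg.norm(b @ vac_b) < 1e-5 * np.linalg.norm(vac_b))   # True
```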
The particles thus created can be distributed thermally in a specific case, relevant to black holes, accelerated mirrors, and cosmologies (all having horizons, “walls” that cannot be penetrated by information, and all implying a nonzero temperature).
We consider a “field” (the calculation is more sensible in full quantum field theory) that is a sum of harmonic oscillators i, with some nontrivial functions for the coefficients fi and fi*,

φ = Σi (fi ai + fi* ai†),    (35.52)

where the expansion is valid in some region A of space. Expanding the same “field” in terms of the basis bi, bi† in a different region B,

φ = Σi (pi bi + pi* bi†),    (35.53)

with a relation between the coefficients of the form

pi = Σj (αij fj + βij fj*),    (35.54)

we also obtain a relation between oscillator operators, since then



φ= f j (α i j bi + β∗i j b†i ) + f j∗ (βi j bi + α ∗i j b†i ), (35.55)
i,j

implying

a j = α i j bi + β∗i j b†i . (35.56)

If the functions fi in region A turn, in region B, into

pj ∼ exp(−εj/(2kB T)),    (35.57)

where εj are the single-particle energies in region B, then we obtain a Boltzmann-distributed number density,

a⟨0| bi†bi |0⟩a = Σj |βij|² ∼ exp(−εi/(kB T)),    (35.58)

which is thermally distributed.
Important Concepts to Remember

• Decoherence is understood as the interaction of a quantum system with an environment, leading to classical states as time evolves.
• The Schrödinger’s cat Gedanken experiment consists of a cat in a black box, which may be killed via a quantum trigger such as a radioactive decay, leading to quantum states |S.cat⟩ = (1/√2)(|dead⟩ + |alive⟩). But these states (and analogous states in quantum computation) have nonlocal correlations over large distances, and must be eliminated from the theory via decoherence.
• Decoherence represents a transition to an entangled system + environment state, with no interaction between the basis elements, where the environment selects the basis: this is environment-induced superselection, or einselection (“the environment measures the cat”).
• Thermalization represents a (large) isolated quantum system evolving to a thermal state, with a classical ensemble.
• Thermalization is based on the eigenstate thermalization hypothesis (ETH), replacing the classical ergodic hypothesis and stating that for a few-body observable Â in a large, interacting many-body system, ⟨m|Â|m⟩ = A_microcanonical(Em).
• The ETH implies a generalized Gibbs ensemble (GGE), ρd = exp[−Σm λm |m⟩⟨m|], with λm = −ln |Cm|², and Cm the coefficients of the expansion of the initial many-body state in the eigenstates |m⟩ of the Hamiltonian, only if all the P̂m = |m⟩⟨m| are integrals of motion.
• A Bogoliubov transformation is a transformation on a, a† for the harmonic oscillator, mixing them into b = αa + βa†, which preserves the commutation relations [a, a†] = [b, b†] = 1 but takes the vacuum, a|0⟩a = 0, into a nonvacuum, b|0⟩a ≠ 0.
• In the case of horizons, coming from black holes, accelerated mirrors (“Rindler spaces”), and cosmology, this transformation leads to particle creation, a⟨0|bi†bi|0⟩a = Σj |βij|² ∼ exp(−εi/kB T), a quantum thermal distribution.
Further Reading

For decoherence, see Zurek’s review [25] and Preskill’s lecture notes on quantum information [12]. For thermalization, see the original papers [23] and [26], and the review [24].
Exercises

(1) In the Schrödinger’s cat Gedanken experiment, ignoring the issue of observers (which, admittedly, was what concerned Schrödinger), where would you simplify it, by eliminating pieces of its construction? Then think about how this relates to the quantum computation issue alluded to in the text.
(2) At the end of decoherence, from the point of view of the system S, we no longer have a pure
quantum state. But how is this result consistent with a classical picture (for instance, if we
consider the Schrödinger cat Gedanken experiment)?
(3) In the quantitative decoherence general model, calculate what happens to ⟨φ|ρS|φ⟩ if we do not have ⟨εi|εj⟩ = δij, and explain why that would be unsatisfactory.
(4) Give an example of an integrable quantum mechanical model, and explain (without any
calculation) why this is not expected to thermalize. One way to define an integrable quantum
mechanical model is by saying that the three-point and higher-point scatterings can be reduced to
just two-point scatterings and by satisfying the Yang–Baxter equations, which relate the possible
orders in which we have two-point scatterings.
(5) If the dimension of the Hilbert space D in (35.30) is infinite, is the formula still true, and is this
still a GGE? Can we deduce something about the system’s behavior then?
(6) Diagonalize the Hamiltonian

H = Σ{i=1}^{L} { (ai† ai + ai ai†)/2 + λ[(ai + ai†) − (ai+1 + a†i+1)]² },    (35.59)

where ai, ai† are harmonic-oscillator creation/annihilation operators.
(7) The thermal particle creation in (35.58) arises from a state that was in a vacuum in region A.
Considering energy conservation, what can we deduce about the space? Can this process happen
in Minkowski space? Why?
PART IIb

APPROXIMATION METHODS

36 Time-Independent (Stationary) Perturbation Theory: Nondegenerate, Degenerate, and Formal Cases
In this Part IIb, we will describe approximation methods for the calculation of eigenenergies and eigenstates. In this chapter, we will start with time-independent perturbation theory. We will first treat the case of nondegenerate energy eigenstates, which is simpler. Then we will consider the degenerate case, and finally a formal approach for describing perturbation theory. We will end with an example that contains all the relevant features described here.
36.1 Set-Up of the Problem: Time-Independent Perturbation Theory
The case of interest is when the Hamiltonian contains a simple “free” part Ĥ0, which has large eigenvalues, and a (complicated) part Ĥ1 that has small eigenvalues and is proportional to a small parameter λ, so in effect we can write

Ĥ = Ĥ(λ) = Ĥ0 + λĤ1,    (36.1)

where λ ≪ 1. We want to consider a perturbation theory in λ, in which both eigenenergies and eigenstates are expanded.
Consider in general an eigenvalue problem for Ĥ0 with degeneracy parameter α for the energies En, so

Ĥ0 |n(0), α⟩ = En(0) |n(0), α⟩.    (36.2)

Then the eigenvalue problem for Ĥ(λ) is

Ĥ(λ)|n, α; λ⟩ = En(λ)|n, α; λ⟩.    (36.3)

We want to calculate the eigenenergies En(λ) and eigenstates |n, α; λ⟩ in perturbation theory in λ,

En(λ) = En(0) + λEn(1) + λ²En(2) + · · ·
|n, α; λ⟩ = |ñ(0), α⟩ + λ|ñ(1), α⟩ + λ²|ñ(2), α⟩ + · · · ,    (36.4)

where we have denoted the eigenstate at λ = 0 by |ñ(0), α⟩ instead of |n(0), α⟩, since in principle it can be a different state. The basis for Ĥ can be defined in terms of eigenvalues of some operator(s) that commute with Ĥ, which can be different from the operator(s) that commute with Ĥ0 (say, perhaps L², Lz for Ĥ0, but S², Sz for Ĥ, so a basis with En(0), l, m for Ĥ0, but with En, j, mj for Ĥ).
We use two reasonable assumptions for the perturbation theory:
• that the zeroth-order energy is En(0), namely En(λ = 0) = En(0), which is an assumption on the smoothness of the λ → 0 limit;
• that the |n(0), α⟩ form a complete set for the full Hamiltonian Ĥ(λ), not just for Ĥ0, so that Σ{n,α} |n(0), α⟩⟨n(0), α| = 1. This is less obvious, but it is also related to the smoothness of the λ → 0 limit.
Then, to define the perturbation theory, we substitute (36.4) into (36.3) and solve the equation order by order in λ.
At zeroth order in λ (the constant part of the equation), we obtain

(Ĥ0 − En(0))|ñ(0), α⟩ = 0,    (36.5)

which is just the unperturbed Schrödinger equation, so it is satisfied.
At first order in λ (the linear part of the equation), we obtain

(Ĥ0 − En(0))|ñ(1), α⟩ + (Ĥ1 − En(1))|ñ(0), α⟩ = 0.    (36.6)

At second order in λ (the quadratic part of the equation), we obtain

(Ĥ0 − En(0))|ñ(2), α⟩ + (Ĥ1 − En(1))|ñ(1), α⟩ − En(2)|ñ(0), α⟩ = 0.    (36.7)

Finally, at kth order in λ (for k > 2, the term with λ^k in the equation), we obtain

(Ĥ0 − En(0))|ñ(k), α⟩ + (Ĥ1 − En(1))|ñ(k−1), α⟩ − En(2)|ñ(k−2), α⟩ − · · · − En(k)|ñ(0), α⟩ = 0.    (36.8)
36.2 The Nondegenerate Case
We start with the simpler nondegenerate case, when there is no index α, so |ñ, α⟩ is replaced by just |n⟩. The eigenstates of Ĥ0 are assumed to be orthonormal,

⟨n(0)|m(0)⟩ = δmn.    (36.9)

The zeroth-order equation is satisfied, as we have seen.
The first-order equation now becomes simply

(Ĥ0 − En(0))|n(1)⟩ + (Ĥ1 − En(1))|n(0)⟩ = 0.    (36.10)

We multiply it with the bra of the free Hamiltonian, ⟨m(0)|, from the left.
In the case m = n, we obtain

⟨n(0)|Ĥ0|n(1)⟩ − En(0)⟨n(0)|n(1)⟩ + ⟨n(0)|Ĥ1|n(0)⟩ − En(1) = 0.    (36.11)

Acting with Ĥ0 on the left (on the bra) in the first term, we see that the first and second terms cancel, and we are left with a formula for the first correction to the energy,

En(1) = ⟨n(0)|Ĥ1|n(0)⟩ ≡ H1,nn.    (36.12)

In the case m ≠ n, we obtain

⟨m(0)|Ĥ0|n(1)⟩ − En(0)⟨m(0)|n(1)⟩ + ⟨m(0)|Ĥ1|n(0)⟩ = 0.    (36.13)

Again acting with Ĥ0 on the left in the first term, we now obtain

(En(0) − Em(0))⟨m(0)|n(1)⟩ = ⟨m(0)|Ĥ1|n(0)⟩ ≡ H1,mn.    (36.14)

But we can now use the second assumption, the completeness of the states |n(0)⟩ in the Hilbert space of the full Hamiltonian Ĥ(λ), in order to expand |n(1)⟩ in the |m(0)⟩,

|n(1)⟩ = Σm Cm^{n(1)} |m(0)⟩.    (36.15)
Multiplying from the left with ⟨p(0)| gives Cp^{n(1)} = ⟨p(0)|n(1)⟩, so substituting this into the first-order equation for m ≠ n, we get

Cm^{n(1)} = H1,mn / (En(0) − Em(0)).    (36.16)

Note that this equation is valid only for m ≠ n. At nonzero λ, we will assume that there is no contribution to |n(1)⟩ from |n(0)⟩ itself, ⟨n(0)|n(1)⟩ = 0, which is a consistent choice, amounting to just a normalization condition. Then, substituting Cm^{n(1)} into the expansion of |n(1)⟩, we obtain

|n(1)⟩ = Σ{m≠n} [H1,mn / (En(0) − Em(0))] |m(0)⟩.    (36.17)

The second-order equation is now simply

( Ĥ0 − En(0) )|n(2)⟩ + ( Ĥ1 − En(1) )|n(1)⟩ − En(2) |n(0)⟩ = 0. (36.18)

Again we multiply it with the free bra ⟨m(0)| from the left.
In the m = n case, we obtain

⟨n(0)|Ĥ0|n(2)⟩ − En(0)⟨n(0)|n(2)⟩ + ⟨n(0)|Ĥ1|n(1)⟩ − En(2) = 0, (36.19)

and, by acting with Ĥ0 on the left in the first term, we see that the first two terms cancel. But we can
use the first-order result (the expansion of |n(1)⟩) to calculate
⟨n(0)|Ĥ1|n(1)⟩ = Σ_{m≠n} ⟨n(0)|Ĥ1|m(0)⟩ H1,mn / (En(0) − Em(0)). (36.20)

Substituting in the second-order equation, we obtain


En(2) = Σ_{m≠n} H1,nm H1,mn / (En(0) − Em(0)) = Σ_{m≠n} |H1,nm|² / (En(0) − Em(0)). (36.21)

In the m ≠ n case, we obtain
⟨m(0)|( Ĥ0 − En(0) )|n(2)⟩ + ⟨m(0)|( Ĥ1 − En(1) )|n(1)⟩
= (Em(0) − En(0)) ⟨m(0)|n(2)⟩ + ⟨m(0)|Ĥ1|n(1)⟩ − En(1) ⟨m(0)|n(1)⟩ = 0. (36.22)

Now we can use the first-order result, with En(1) = H1,nn and |n(1)⟩ = Σ_m C_m^{n(1)} |m(0)⟩, and obtain
(Em(0) − En(0)) ⟨m(0)|n(2)⟩ + Σ_{p≠n} ⟨m(0)|Ĥ1|p(0)⟩ H1,pn / (En(0) − Ep(0)) − H1,nn H1,mn / (En(0) − Em(0)) = 0. (36.23)

Again we expand in the complete set of zeroth-order states, as at first order:
|n(2)⟩ = Σ_m C_m^{n(2)} |m(0)⟩. (36.24)

Then, we find C_m^{n(2)} = ⟨m(0)|n(2)⟩, so
C_m^{n(2)} = ⟨m(0)|n(2)⟩ = [1/(En(0) − Em(0))] [ Σ_{p≠n} H1,mp H1,pn / (En(0) − Ep(0)) − H1,nn H1,mn / (En(0) − Em(0)) ]. (36.25)
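As a quick numerical sanity check (a sketch, not from the book; the matrix, seed, and tolerances are assumptions), the first- and second-order formulas (36.12), (36.16), (36.21) can be compared against exact diagonalization of Ĥ0 + λĤ1 for a small Hermitian matrix with a nondegenerate spectrum:

```python
import numpy as np

# Sketch (assumed example): verify E_n^(1) = H1_nn and the second-order sum
# E_n^(2) = sum_{m != n} |H1_nm|^2/(E_n^(0) - E_m^(0)) against exact
# diagonalization of H0 + lam*H1, for a nondegenerate H0.
rng = np.random.default_rng(0)
N, lam = 6, 1e-3
H0 = np.diag(np.arange(1.0, N + 1))             # E_n^(0) = 1, 2, ..., N
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
H1 = (A + A.conj().T) / 2                        # Hermitian perturbation

E0 = np.diag(H0).real
E_exact = np.linalg.eigvalsh(H0 + lam * H1)      # sorted, matches E0's order
for n in range(N):
    E1 = H1[n, n].real                           # first-order correction
    E2 = sum(abs(H1[n, m])**2 / (E0[n] - E0[m])  # second-order correction
             for m in range(N) if m != n)
    assert abs(E0[n] + lam*E1 + lam**2*E2 - E_exact[n]) < 1e-6   # O(lam^3)
print("nondegenerate perturbation theory checked to O(lam^3)")
```

The residual error scales as λ³, consistent with truncating the expansion at second order.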

36.3 The Degenerate Case

We next consider the degenerate case, where, as we said, in general we expect the basis at λ → 0
for the degenerate Hilbert space Hn(0) of given unperturbed energy to be different from the basis used
for Ĥ0 ,
|ñ, α; λ⟩ → |ñ(0), α⟩ ≡ Σ_β C_{n,αβ}^{(0)} |n(0), β⟩ as λ → 0. (36.26)

First Order

For the first perturbation order, in the m = n case, multiply (36.6) from the left with the bra ⟨ñ(0), β|,
obtaining
⟨ñ(0), β|Ĥ0|ñ(1), α⟩ − En(0) ⟨ñ(0), β|ñ(1), α⟩ + ⟨ñ(0), β|Ĥ1|ñ(0), α⟩ − En(1) δαβ = 0. (36.27)

Acting with Ĥ0 from the left on the first term leads to the cancellation of the first two terms, giving
H1,nβα ≡ ⟨ñ(0), β|Ĥ1|ñ(0), α⟩ = En(1) δαβ. (36.28)

However, we see that this equation does not have a solution if H1,nβα ≠ 0 for α ≠ β. In order for it
to have a solution, we consider states that are linear combinations of the basis states,
|ψn⟩ = Σ_{α=1}^{g_n} C_{nα} |ñ, α⟩, (36.29)

where both |ψn⟩ and |ñ, α⟩ admit expansions in λ or, more precisely, the above relation is true at
each order in λ,
|ψn(s)⟩ = Σ_{α=1}^{g_n} C_{nα}^{(s)} |ñ(s), α⟩. (36.30)

Then we replace |ñ(s), α⟩ with |ψn(s)⟩ in the original perturbation theory equations, starting with
(36.6). Then, again multiplying it with ⟨ñ(0), β|, we obtain
Σ_{α=1}^{g_n} H1,nβα C_{nα}^{(0)} = En(1) C_{nβ}^{(0)}. (36.31)

This is a matrix diagonalization equation, H1 · C = E C, so the possible eigenvalues for En(1) are
solutions to the secular equation
det(H1,nβα − En(1) δβα) = 0. (36.32)
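A minimal numerical illustration of this (an assumed setup, not the book's example): the first-order splittings of a degenerate level are the eigenvalues of Ĥ1 restricted to the degenerate subspace, per the secular equation (36.32):

```python
import numpy as np

# Sketch (assumed example): a g-fold degenerate level of H0; its first-order
# splittings come from diagonalizing H1 in the degenerate block, eq. (36.32).
rng = np.random.default_rng(1)
g, lam = 3, 1e-4
H0 = np.diag([2.0, 2.0, 2.0, 5.0, 7.0])          # triple degeneracy at E = 2
N = H0.shape[0]
A = rng.standard_normal((N, N))
H1 = (A + A.T) / 2                               # real symmetric perturbation

E1 = np.linalg.eigvalsh(H1[:g, :g])              # solutions of the secular eq.
E_exact = np.linalg.eigvalsh(H0 + lam * H1)[:g]  # the split copies of E = 2
assert np.allclose(E_exact, 2.0 + lam * E1, atol=1e-6)
print("first-order split levels:", 2.0 + lam * E1)
```

Note that diagonalizing the g × g block at once also selects the correct zeroth-order basis, which is the content of eq. (36.31).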

Next we consider the m ≠ n case. Again replacing |ñ(s), α⟩ with |ψn(s)⟩ in (36.6), but multiplying
from the left with ⟨m̃(0), β|, we obtain
Σ_α C_{nα}^{(1)} ⟨m̃(0), β|Ĥ0|ñ(1), α⟩ − En(0) Σ_α C_{nα}^{(1)} ⟨m̃(0), β|ñ(1), α⟩ + Σ_α C_{nα}^{(0)} ⟨m̃(0), β|Ĥ1|ñ(0), α⟩ = 0. (36.33)

Acting with Ĥ0 from the left on the first term, we obtain
(Em(0) − En(0)) Σ_α C_{nα}^{(1)} ⟨m̃(0), β|ñ(1), α⟩ + Σ_α C_{nα}^{(0)} H1,mβ,nα = 0, (36.34)

where we have defined

H1,mβ,nα ≡ ⟨m̃(0), β|Ĥ1|ñ(0), α⟩ (36.35)

and, expanding |ψn(1)⟩ in the complete set of Ĥ0 eigenstates {|m̃(0), β⟩},
|ψn(1)⟩ = Σ_α C_{nα}^{(1)} |ñ(1), α⟩ = Σ_{m,β} C̃_{mβ}^{(1)} |m̃(0), β⟩, (36.36)

we first find (by multiplying with ⟨m̃(0), β|) that
Σ_α C_{nα}^{(1)} ⟨m̃(0), β|ñ(1), α⟩ = C̃_{mβ}^{(1)}, (36.37)

in which case the perturbation equation is solved by
C̃_{mβ}^{(1)} = Σ_α H1,mβ,nα C_{nα}^{(0)} / (En(0) − Em(0)). (36.38)

Substituting back into the definition of |ψn(1)⟩, we find
|ψn(1)⟩ = Σ_α C_{nα}^{(1)} |ñ(1), α⟩ = Σ_{m≠n,β} [ Σ_α H1,mβ,nα C_{nα}^{(0)} / (En(0) − Em(0)) ] |m̃(0), β⟩. (36.39)

Second Order
For the second perturbation order, replacing |ñ(s), α⟩ with |ψn(s)⟩ in (36.7), we find
( Ĥ0 − En(0) ) Σ_α C_{nα}^{(2)} |ñ(2), α⟩ + ( Ĥ1 − En(1) ) Σ_α C_{nα}^{(1)} |ñ(1), α⟩ − En(2) Σ_α C_{nα}^{(0)} |ñ(0), α⟩ = 0. (36.40)

The m = n case. Multiplying from the left with the bra ⟨ñ(0), β|, we find
Σ_α C_{nα}^{(2)} ⟨ñ(0), β|Ĥ0|ñ(2), α⟩ − En(0) Σ_α C_{nα}^{(2)} ⟨ñ(0), β|ñ(2), α⟩
+ Σ_α C_{nα}^{(1)} ⟨ñ(0), β|Ĥ1|ñ(1), α⟩ − En(1) Σ_α C_{nα}^{(1)} ⟨ñ(0), β|ñ(1), α⟩ − En(2) C_{nβ}^{(0)} = 0. (36.41)

Acting with Ĥ0 from the left on the first term, the first two terms cancel, and we are left with
Σ_α C_{nα}^{(1)} ⟨ñ(0), β|Ĥ1|ñ(1), α⟩ − En(1) Σ_α C_{nα}^{(1)} ⟨ñ(0), β|ñ(1), α⟩ = En(2) C_{nβ}^{(0)}. (36.42)

However, from the first-order analysis, replacing the form of |ψn(1)⟩, we have
Σ_α C_{nα}^{(1)} ⟨ñ(0), β|Ĥ1|ñ(1), α⟩ = ⟨ñ(0), β|Ĥ1|ψn(1)⟩ = Σ_α C_{nα}^{(0)} Σ_{p≠n,γ} H1,nβ,pγ H1,pγ,nα / (En(0) − Ep(0)),
Σ_α C_{nα}^{(1)} ⟨ñ(0), β|ñ(1), α⟩ = ⟨ñ(0), β|ψn(1)⟩ = 0. (36.43)

Using these identities, we find that the second-order equation for m = n becomes
En(2) C_{nβ}^{(0)} = ⟨ñ(0), β|Ĥ1|ψn(1)⟩ = Σ_α C_{nα}^{(0)} Σ_{p≠n,γ} H1,nβ,pγ H1,pγ,nα / (En(0) − Ep(0)). (36.44)

Then, defining the matrix elements of an abstract operator K̂(2) by
⟨ñ(0), β|K̂(2)|ñ(0), α⟩ ≡ Σ_{p≠n,γ} ⟨ñ(0), β|Ĥ1|p̃(0), γ⟩⟨p̃(0), γ|Ĥ1|ñ(0), α⟩ / (En(0) − Ep(0)), (36.45)

the equation for En(2) again becomes an eigenvalue problem for K̂(2),
Σ_α ⟨ñ(0), β|K̂(2)|ñ(0), α⟩ C_{nα}^{(0)} = En(2) C_{nβ}^{(0)}. (36.46)

For the case m ≠ n we could again find the second-order wave function, but the analysis is more
complicated and will be skipped.

36.4 General Form of Solution (to All Orders)

Now we will show the main points of a different approach, which works to all orders in perturbation
theory. Consider the eigenstate of the total Hamiltonian Ĥ, |ñ, α; λ⟩, and denote it by |n, α⟩.
Define the projector onto the Hilbert subspace of given energy En (the space of |n, α states for
any α),

P̂n = Σ_α |n, α⟩⟨n, α|. (36.47)
α

Since it is a projector, it satisfies

P̂n² = P̂n, and P̂n P̂m = 0 for n ≠ m ⇒ P̂n P̂m = δmn P̂n,
Σ_n P̂n = 1, (36.48)

as we can check.
Then the Schrödinger equation Ĥ |n, α = En |n, α becomes

Ĥ P̂n = En P̂n . (36.49)

Define the resolvent of this equation,


Ĝ(z) ≡ (z − Ĥ)^{−1} = 1/(z − Ĥ), (36.50)
where z is a complex variable and the fraction notation will be used for convenience; the resolvent can
be thought of as a geometric series in Ĥ/z,
Ĝ(z) = (1/z) Σ_{n=0}^{∞} ( Ĥ/z )^n. (36.51)

Then, using the Schrödinger equation, we obtain

Ĝ(z) P̂n = P̂n / (z − En). (36.52)

Summing over n, and using the fact that Σ_n P̂n = 1, we obtain
Ĝ(z) = Σ_n P̂n / (z − En) = Σ_n Σ_α |n, α⟩⟨n, α| / (z − En). (36.53)

This means that we can use the residue theorem in the complex plane to identify a projector P̂n :

P̂n = (1/2πi) ∮_{Γn} Ĝ(z) dz, (36.54)
where Γn is a contour in the complex z plane around the real pole at z = En .
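The contour formula (36.54) is easy to check numerically. The following sketch (with an assumed 2 × 2 Hamiltonian, not from the book) approximates the contour integral by a discretized circle around one eigenvalue and recovers the corresponding projector:

```python
import numpy as np

# Sketch (assumed example): P_n = (1/2πi) ∮_{Γn} G(z) dz, eq. (36.54),
# evaluated on a small circle around E_n and compared with |n><n|.
H = np.array([[1.0, 0.3], [0.3, 2.0]])
evals, evecs = np.linalg.eigh(H)
n, r = 0, 0.2                                    # circle of radius r around E_0
theta = np.linspace(0, 2*np.pi, 400, endpoint=False)
zs = evals[n] + r * np.exp(1j * theta)
dz = 1j * r * np.exp(1j * theta) * (theta[1] - theta[0])
P = sum(np.linalg.inv(z * np.eye(2) - H) * w     # G(z) = (z - H)^(-1)
        for z, w in zip(zs, dz)) / (2j * np.pi)
P_expected = np.outer(evecs[:, n], evecs[:, n])  # projector onto |n>
assert np.allclose(P, P_expected, atol=1e-8)
print("projector recovered from the resolvent contour integral")
```

The radius must be small enough that the circle encloses only the pole at E_n, exactly as for the contour Γn in the text.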
Now we remember that Ĥ = Ĥ0 + λ Ĥ1 , which means that
Ĝ(z) = 1/(z − Ĥ0 − λĤ1), (36.55)

and define the same resolvent for the free theory (just with Ĥ0 ),
Ĝ0(z) = 1/(z − Ĥ0). (36.56)
Then we can write the identities
Ĝ(z) = 1/(z − Ĥ0 − λĤ1) = [1/(z − Ĥ0)] [(z − Ĥ0 − λĤ1) + λĤ1] [1/(z − Ĥ0 − λĤ1)]
= 1/(z − Ĥ0) + [1/(z − Ĥ0)] λĤ1 [1/(z − Ĥ0 − λĤ1)] (36.57)
= Ĝ0(z) [1 + λĤ1 Ĝ(z)].

We have thus obtained a self-consistent equation tailor-made for perturbation theory,

Ĝ(z) = Ĝ0(z) [1 + λĤ1 Ĝ(z)], (36.58)

since the term with Ĝ(z) on the right-hand side is proportional to λ.


This means that we can iterate the equation, by putting Ĝ = Ĝ0 on the right-hand side (so Ĝ (0) =
Ĝ0 ), and then we have Ĝ up to the first order on the left-hand side. Next we put this on the right-
hand side of the equation, obtaining Ĝ up to the second order on the left-hand side. We continue this
procedure ad infinitum, thus obtaining the perturbative solution to all orders in perturbation theory,


Ĝ = Σ_{n=0}^{∞} λ^n Ĝ0 ( Ĥ1 Ĝ0 )^n. (36.59)

Define a contour Γ̃n in the complex plane that encircles both En(0) (the eigenvalue of Ĥ0) and the exact
eigenvalue En. Then we have

P̂n = (1/2πi) ∮_{Γ̃n} Ĝ(z) dz (36.60)

as before, but now we can also use the solution (36.59) inside the equation, and then use the same
equation for P̂n,0 in terms of Ĝ0 to find



P̂n = P̂n,0 + Σ_{k=1}^{∞} λ^k A^{(k)}, (36.61)
where
A^{(k)} = (1/2πi) ∮_{Γ̃n} Ĝ0 ( Ĥ1 Ĝ0 )^k dz. (36.62)

One can also find expressions for the expansions of P̂n and Ĥ P̂n in powers of λ from the above, but we
will not do that here.
Then, acting with Ĥ P̂n on the eigenstates of Ĥ0 , | ñ (0) , α, the operator P̂n projects them onto the
states of energy En for Ĥ, so we can act with Ĥ, and obtain

Ĥ P̂n | ñ (0) , α = En P̂n | ñ (0) , α. (36.63)

But then, without altering anything, we can put P̂n,0 in front of | ñ (0) , α and multiply the equation
from the left by P̂n,0 .
Defining the following operators, which are Hermitian, as we can easily check,

Ĥn = P̂n,0 Ĥ P̂n P̂n,0 , K̂n = P̂n,0 P̂n P̂n,0 , (36.64)

the resulting equation becomes

Ĥn | ñ (0) , α = En K̂n | ñ (0) , α. (36.65)

In order for this equation to have a solution, the energy En must satisfy a secular equation,

det( Ĥn − En K̂n ) = 0. (36.66)

The expansion in λ of Ĥn and K̂n can be derived from the previous expansion of Ĥ P̂n and P̂n ,
resulting in the eigenvalue equation for En in a λ expansion. We will not continue it here, however.

36.5 Example: Stark Effect in Hydrogenoid Atom

Consider the Stark effect, which is the interaction of a system having electric dipole moment d with
a constant electric field E in the z direction, with interaction Hamiltonian

Ĥ1 = dEz = dEr cos θ. (36.67)

Moreover, suppose that the unperturbed Hamiltonian Ĥ0 is that of the hydrogenoid atom. For
perturbation theory to work, we need the interaction energy to be much smaller than the nuclear
Coulomb energy, E_int ≪ E_nucleus.

We first apply perturbation theory to the ground state, with energy quantum number n = 1, starting
with the first order. For n = 1, g1 = 1, and the only state (nlm) is (100). But since z = r cos θ is
directional, whereas the (100) state is spherically symmetric, we have that
(H1 )100,100 = 0. (36.68)
This means that for the first-order correction to the ground state energy E1 we have
E1(1) = (H1 )100,100 = 0. (36.69)
The first nonzero contribution is from the second order, which in this case is found from the general
expression
E1(2) = Σ_{(nlm)≠(100)} |H1,100,nlm|² / (E1 − En). (36.70)

We cannot find a closed form for the solution; instead we must calculate term by term in (nlm) and
then sum, so we will not go further in evaluation of this formula.
Next we apply perturbation theory to the first excited state, with n = 2. In this case g2 = 22 = 4.
The four states in this Hilbert subspace are |2, α⟩ = |2lm⟩, specifically denoted by
|2; 1⟩ ≡ |2, 0, 0⟩, |2; 2⟩ ≡ |2, 1, 0⟩, |2; 3⟩ ≡ |2, 1, 1⟩, |2; 4⟩ ≡ |2, 1, −1⟩. (36.71)
Then the matrix elements of the perturbation are
⟨nl1m1|Ĥ1|nl2m2⟩ = dE ⟨l1m1| cos θ |l2m2⟩ r_{nl1,nl2}. (36.72)
In order for the result to be nonzero, we must have m1 = m2 and l1 = l2 ± 1. Moreover, we find that
⟨l, m| cos θ |l − 1, m⟩ = ⟨l − 1, m| cos θ |l, m⟩ = √[(l² − m²) / (4l² − 1)], (36.73)
which we leave as an exercise for the reader to prove.
Therefore in our case (with states 1, 2, 3, 4) the only nonzero matrix element is H1,12 ≠ 0, with
⟨2; 1| cos θ |2; 2⟩ = 1/√3. (36.74)
Then the matrix element is
H1,12 = −dE r_{20,21}/√3 = −3dE a0/Z. (36.75)
Next we need to diagonalize the matrix
⎛  0      H1,12   0   0 ⎞            ⎛ 0  1  0  0 ⎞
⎜ H1,12    0      0   0 ⎟  = H1,12 · ⎜ 1  0  0  0 ⎟ . (36.76)
⎜  0       0      0   0 ⎟            ⎜ 0  0  0  0 ⎟
⎝  0       0      0   0 ⎠            ⎝ 0  0  0  0 ⎠

The secular equation for the eigenvalues (writing λ for the eigenvalue E2(1)) is
λ² · | −λ      H1,12 |
     | H1,12   −λ    |  = 0 ⇒ λ²(λ² − H1,12²) = 0, (36.77)

with solutions

E2(1) = H1,12 (0, 0, +1, −1). (36.78)

The eigenstates are also found from the diagonalization, namely

|ψ3⟩ = |2; 3⟩ = |2, 1, 1⟩,
|ψ4⟩ = |2; 4⟩ = |2, 1, −1⟩,
|ψ1⟩ = (1/√2)(|2; 1⟩ + |2; 2⟩) = (1/√2)(|2, 0, 0⟩ + |2, 1, 0⟩), (36.79)
|ψ2⟩ = (1/√2)(|2; 1⟩ − |2; 2⟩) = (1/√2)(|2, 0, 0⟩ − |2, 1, 0⟩).
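The diagonalization above is easy to confirm numerically; in this sketch (not from the book), h stands for the matrix element H1,12 in arbitrary units:

```python
import numpy as np

# Sketch: diagonalize the n = 2 Stark matrix (36.76) in the basis
# {|2;1>, |2;2>, |2;3>, |2;4>} = {|200>, |210>, |211>, |21-1>}.
h = 1.0                                          # plays the role of H1,12
H1 = h * np.array([[0., 1., 0., 0.],
                   [1., 0., 0., 0.],
                   [0., 0., 0., 0.],
                   [0., 0., 0., 0.]])
evals, evecs = np.linalg.eigh(H1)
assert np.allclose(evals, [-h, 0., 0., h])       # E2^(1) = 0, 0, +/- H1,12
v = evecs[:, 3]                                  # eigenvector for +h
assert np.allclose(np.abs(v), [1/np.sqrt(2), 1/np.sqrt(2), 0., 0.])
print("Stark splittings (in units of H1,12):", evals)
```

The ±h eigenvectors are (|2; 1⟩ ± |2; 2⟩)/√2, matching (36.79).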

Important Concepts to Remember

• Time-independent perturbation theory arises for a time-independent Hamiltonian Ĥ (λ) = Ĥ0 +


λ Ĥ1 , where Ĥ0 is the “free” part, with large eigenvalues, and λ Ĥ1 is a perturbation, with small
eigenvalues.
• In the nondegenerate case, to first order we find En(1) = H1,nn and C_m^{n(1)} = H1,mn/(En(0) − Em(0)); at second order, En(2) = Σ_{m≠n} |H1,nm|²/(En(0) − Em(0)).

• In the degenerate case, the basis |ñ(0), α⟩ at λ → 0 is generically different from the free basis |n(0), α⟩. Moreover, the states are combinations of the basis states, |ψn⟩ = Σ_α C_{nα} |ñ, α⟩.
• The equation for En(1) is a secular equation in the basis states, det(H1,nβα − En(1) δβα) = 0, which solves Σ_α H1,nβα C_{nα}^{(0)} = En(1) C_{nβ}^{(0)}. Then C̃_{mβ}^{(1)} = Σ_α H1,mβ,nα C_{nα}^{(0)}/(En(0) − Em(0)).
• In the general case of perturbation theory, defining the resolvent
Ĝ(z) = 1/(z − Ĥ) = Σ_n Σ_α |n, α⟩⟨n, α| / (z − En)
and the free resolvent,

Ĝ0(z) = 1/(z − Ĥ0),

we have the self-consistent equation Ĝ(z) = Ĝ0 (z)(1 + λ Ĥ1 Ĝ(z)), solved by perturbation theory
via iteration.
• The Stark effect for a hydrogenoid atom, H1 = dEz, is trivial in first-order perturbation theory, and
is nontrivial at second order.

Further Reading
See [2], [3], [1].

Exercises

(1) Consider a one-dimensional harmonic oscillator perturbed by a linear potential, λ Ĥ1 = λx.
Calculate the first-order perturbation theory corrections to the energy and wave functions, and
compare with the exact solution for Ĥ0 + λ Ĥ1 (which is trivial to find in this case).
(2) Consider a hydrogenoid atom perturbed by a decaying exponential potential λ Ĥ1 = λe−μr .
Calculate the first-order perturbation theory corrections to the energy and ground state wave
function.
(3) Write down an explicit formula for the second-order perturbation contribution to the ground
state energy in exercise 2, without calculating the terms and summing them.
(4) Calculate the first-order perturbation contribution to the energy of the first excited state of the
hydrogenoid atom in exercise 2.
(5) Formally, to obtain the perturbative solution (36.59), we did not need the self-consistent equation
(36.58), we just needed to expand the definition of G(z) in λ: show this.
(6) Prove the relation (36.73), used in the analysis of the Stark effect.
(7) Use first-order perturbation theory to find the Stark effect splittings for the second excited state
of the hydrogenoid atom, n = 3.
37 Time-Dependent Perturbation Theory: First Order

In this chapter, we will consider the time dependence of the Schrödinger equation, and solve it in
perturbation theory. We will concentrate on the first order, leaving the second-order case and the
general expansion for the next chapter.
Consider then the time-dependent Hamiltonian Ĥ split into a nonperturbed, time-independent part
Ĥ0 and a time-dependent perturbation,

Ĥ (t) = Ĥ0 + λ Ĥ1 (t). (37.1)

As before, for this to be a satisfactory perturbation theory, we need λ ≪ 1 or, rather, that the matrix
elements of λĤ1(t) are much smaller than the matrix elements of Ĥ0.
Thus, we will attempt to solve the time-dependent Schrödinger equation,

iℏ ∂t |ψ(t)⟩ = Ĥ(t)|ψ(t)⟩, (37.2)

as an expansion in λ.

37.1 Evolution Operator

We defined the evolution operator Û (t 2 , t 1 ) by

|ψ(t2)⟩ = Û(t2, t1)|ψ(t1)⟩, (37.3)

which implies that the Schrödinger equation is now

iℏ ∂t Û(t, t0) = Ĥ(t)Û(t, t0). (37.4)

If the Hamiltonian is time independent, Ĥ(t) = Ĥ, we obtain Û(t, t0) = e^{−iĤ(t−t0)/ℏ}. Defining the
eigenstates of Ĥ as |ψn⟩, with eigenvalues En, and if the state at t = 0 is
|ψ(t = 0)⟩ = Σ_n a_n |ψn(0)⟩, (37.5)
then the state at time t is
|ψ(t)⟩ = Σ_n a_n e^{−iEn t/ℏ} |ψn(0)⟩. (37.6)

However, in general we have Ĥ(t) (a time-dependent Hamiltonian), so choosing at an initial time
t_i a state among the eigenstates |ψn⟩, namely
|ψ(t = t_i)⟩ = |ψ_i⟩, (37.7)



we find that, at a final time, the state has become a superposition of all the |ψn⟩ states,
|ψ(t = t_f)⟩ = Û(t_f, t_i)|ψ_i⟩ = Σ_n a_n(t_f) |ψn⟩. (37.8)

Multiplying from the left with the bra ⟨ψm|, we find that
a_m(t_f) = ⟨ψm|Û(t_f, t_i)|ψ_i⟩ ≡ U_{mi}(t_f, t_i). (37.9)
Then the transition probability (during the period from t_i to t_f) between |ψ_i⟩ and |ψ_f⟩ is
P_{fi} = |U_{fi}(t_f, t_i)|². (37.10)

37.2 Method of Variation of Constants

Consider the eigenvalue problem for the unperturbed Hamiltonian Ĥ0 ,


Ĥ0|ψn⟩ = En|ψn⟩, (37.11)
and an initial condition
|ψ⟩ = Σ_n c_n |ψn⟩. (37.12)

As in the previous analysis of the evolution operator, the time-dependent interaction Ĥ1 (t) can
be used to calculate probabilities for transitions between the states |ψ n . For the solution of the
time-dependent Schrödinger equation for Ĥ (t), we write an ansatz by lifting to time dependence the
constants cn , so that

|ψ(t)⟩ = Σ_n c_n(t) |ψn⟩. (37.13)

Comparing with (37.8), we see that


an (t) = cn (t) = Uni (t, t i ). (37.14)
Substituting this ansatz in the Schrödinger equation, we obtain
 
iℏ Σ_n ċ_n(t) |ψn⟩ = Ĥ(t) Σ_n c_n(t) |ψn⟩. (37.15)

Multiplying from the left with the bra ⟨ψm|, we obtain the system of equations
iℏ ċ_m(t) = Σ_n c_n(t) ⟨ψm|Ĥ(t)|ψn⟩ ≡ Σ_n c_n(t) H_{mn}(t). (37.16)

This system of equations is equivalent to the original Schrödinger equation.


Again, we must use the assumption that {|ψn⟩} is a complete basis for the full Ĥ(t) as well. Then,
we expand the c_n(t) coefficients in λ,
c_n = c_n(t, λ) = Σ_{s=0}^{∞} c_n^{(s)} λ^s. (37.17)

Taking matrix elements of Ĥ (t) = Ĥ0 + λ Ĥ1 (t) in the basis |ψ n  of Ĥ0 , we obtain
Hmn (t) = En δ mn + λH1,mn (t). (37.18)

Moreover, we can factorize the time dependence of the unperturbed Hamiltonian from the coefficients
cn (t),

c_n(t) = b_n(t) e^{−iEn t/ℏ}, (37.19)

and therefore find an expansion in λ of bn (t) as well,




b_n(t) = Σ_{s=0}^{∞} b_n^{(s)} λ^s. (37.20)

Then the Schrödinger equation for c_n(t) becomes one for b_n(t) that is simpler,
iℏ ḃ_m(t) = λ Σ_n H1,mn(t) e^{i(Em−En)t/ℏ} b_n(t) = λ Σ_n H1,mn(t) e^{iωmn t} b_n(t), (37.21)

where we define the transition frequency


ω_{mn} ≡ (E_m − E_n)/ℏ. (37.22)
The boundary conditions for the equations for bn (t) are that bn(0) is constant, and defines the initial
(unperturbed) wave function. To solve the equations, we identify the coefficient of λ s+1 on both sides
of the equation, resulting in

iℏ ḃ_m^{(s+1)}(t) = Σ_n H1,mn(t) b_n^{(s)}(t) e^{iωmn t}. (37.23)

This results in an iterative solution of the equation, given the boundary conditions. Put bn(0) on the
right-hand side, then find bn(1) from the left-hand side. Then input bn(1) on the right-hand side, and find
bn(2) from the left-hand side, etc.
Consider the boundary conditions given by an initial state |ψi , so that

b_n^{(0)}(t) = δ_{ni},
b_n^{(s)}(t = 0) = 0, s > 0. (37.24)

The second condition means that there is no component for n ≠ i at all orders in λ, at the initial time.
Then, the equation at first order in λ is
iℏ ḃ_m^{(1)}(t) = H1,mi(t) e^{iωmi t}, (37.25)

with solution
iℏ b_m^{(1)}(τ) = ∫_0^τ dt e^{iωmi t} H1,mi(t). (37.26)

Finally, the time-dependent state to first order is



|ψ(t)⟩ = Σ_n (δ_{ni} + b_n^{(1)}(t)) e^{−iEn t/ℏ} |ψn⟩. (37.27)

Then the transition probability for the system to go from a state with energy Ei to a state with energy
En during the time t is

Pni (t) = |bn(1) (t)| 2 . (37.28)
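The coupled equations (37.21) can also be integrated exactly for a simple model and compared with the first-order result (37.26). The following sketch (an assumed two-level system with a Gaussian pulse, not an example from the book) does this with a fourth-order Runge–Kutta integrator:

```python
import numpy as np

# Sketch (assumed two-level model): integrate the exact coupled equations
# (37.21) for a Gaussian pulse H1_21(t) = V exp(-t^2/T^2) and compare
# |b_2(∞)|^2 with the first-order result (37.26),
# b_2^(1) = (1/(i hbar)) ∫ dt e^{i w21 t} H1_21(t).
hbar, w21, V, T = 1.0, 1.5, 0.02, 2.0

def rhs(t, b):
    h = V * np.exp(-t**2 / T**2)                 # pulse envelope
    return np.array([h * np.exp(-1j * w21 * t) * b[1],
                     h * np.exp(+1j * w21 * t) * b[0]]) / (1j * hbar)

b = np.array([1.0 + 0j, 0.0 + 0j])               # start in state 1
t, dt = -20.0, 1e-3
while t < 20.0:                                  # fourth-order Runge-Kutta
    k1 = rhs(t, b); k2 = rhs(t + dt/2, b + dt*k1/2)
    k3 = rhs(t + dt/2, b + dt*k2/2); k4 = rhs(t + dt, b + dt*k3)
    b = b + dt * (k1 + 2*k2 + 2*k3 + k4) / 6
    t += dt

# first order: the Gaussian integral ∫ e^{i w t - t^2/T^2} dt = T√π e^{-(wT/2)^2}
b2_first = V * T * np.sqrt(np.pi) * np.exp(-(w21 * T / 2)**2) / (1j * hbar)
assert abs(abs(b[1])**2 - abs(b2_first)**2) < 5e-6
print("P(1 -> 2) =", abs(b[1])**2)
```

For the weak pulse assumed here the exact and first-order probabilities agree to the expected relative accuracy of order (V T)².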



37.3 A Time-Independent Perturbation Being Turned On

Consider a time-independent perturbation turned on at time t = 0, so that


H1(t) = 0 for t < 0,   H1(t) = H1 for t ≥ 0. (37.29)
Later we will see that this is an example of the “sudden approximation”, and that it is a sensible
thing to consider.
Then, during the period of integration, we have a time-independent H1 or, more precisely, we have
a time-independent matrix element H1,mi . Thus we obtain
iℏ b_m^{(1)}(τ) = H1,mi ∫_0^τ dt e^{iωmi t} = −i H1,mi (e^{iωmi τ} − 1)/ω_{mi}. (37.30)
The probability of transition during the time τ is
P_{mi}(τ) = |b_m^{(1)}(τ)|² = [|H1,mi|²/(ℏ²ω_{mi}²)] |e^{iωmi τ} − 1|² = [4|H1,mi|²/(E_m − E_i)²] sin²(ω_{mi}τ/2), (37.31)
where we have used that ℏω_{mi} = E_m − E_i and |e^{iωmi τ} − 1|² = 2(1 − cos ω_{mi}τ) = 4 sin²(ω_{mi}τ/2).

However, since we have the limit formula
lim_{a→∞} (1/π) sin²(ax)/(ax²) = δ(x), (37.32)
which we leave as an exercise for the reader to prove, then it means that at large times

|b_m^{(1)}(τ)|² = 4π|H1,mi|² (τ/2ℏ) (1/π) sin²[(E_f − E_i)τ/2ℏ] / [(τ/2ℏ)(E_f − E_i)²]
→ (2π/ℏ) τ |H1,mi|² δ(E_f − E_i), as τ → ∞. (37.33)

Thus the probability per unit time at large times becomes
dP_{fi}/dt = (2π/ℏ) |H1,mi|² δ(E_f − E_i). (37.34)
This means that at large times, we can only have (i.e., there is a nonzero probability for) the same
energy for the initial and final states of the transition.
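The sin² formula (37.31) can be checked against the exactly solvable two-level system. In this sketch (an assumed example, not from the book) the exact Rabi evolution and the first-order formula agree at weak coupling:

```python
import numpy as np

# Sketch (assumed two-level example): compare the exact transition probability
# for a constant perturbation switched on at t = 0 with the first-order
# result (37.31), P_mi = 4|H1_mi|^2 sin^2(w_mi t/2)/(E_m - E_i)^2.
hbar = 1.0
Ei, Em, V = 0.0, 1.0, 0.02                       # weak coupling H1_mi = V
H = np.array([[Ei, V], [V, Em]])
evals, U = np.linalg.eigh(H)
for t in [1.0, 3.0, 10.0]:
    # exact evolution: psi(t) = U e^{-i E t/hbar} U^T psi(0)
    psi = U @ (np.exp(-1j * evals * t / hbar) * (U.T @ np.array([1.0, 0.0])))
    P_exact = abs(psi[1])**2
    w = (Em - Ei) / hbar
    P_first = 4 * V**2 * np.sin(w * t / 2)**2 / (Em - Ei)**2
    assert abs(P_exact - P_first) < 1e-4         # they agree at O(V^2)
print("first-order formula matches the exact two-level evolution")
```

At longer times the first-order probability oscillates but stays small, consistent with observation (1) below about the domain of validity of the expansion.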

37.4 Continuous Spectrum and Fermi’s Golden Rule

One important particular case is when the final state m = f belongs to the continuous spectrum of the
Hamiltonian. For one such possibility, the initial state is in the discrete spectrum, which happens to
overlap (in terms of energy, but not in position in space) with the continuous spectrum. For instance,
we can have bound electrons in atoms, that can escape and transition to the continuous spectrum as
free electrons, a process of “self-ionization”.

We consider orthonormal final states |ψ_f⟩, where f = {f_1, . . . , f_r} is a set of quantum numbers
(parameters), of which at least one, say f_1, is continuous, and perhaps there are also discrete
parameters, so
⟨ψ_f|ψ_{f'}⟩ = “δ(f − f')” ≡ δ(f_1 − f_1') δ(f_2 − f_2') · · · δ_{f_{r−1} f'_{r−1}} δ_{f_r f'_r}. (37.35)
Then in the general expansion
|ψ⟩ = Σ_n c_n(t)|ψn⟩, (37.36)
the index n runs over discrete values n', and continuous + discrete ones f, so that we have
⟨ψ|ψ⟩ = Σ_{n'} |c_{n'}(t)|² + Σ_{f_{r−1}, f_r} ∫ df_1 df_2 · · · |c_f(t)|²
= Σ_{n'} |b_{n'}(t)|² + Σ_{f_{r−1}, f_r} ∫ df_1 df_2 · · · |b_f(t)|². (37.37)

But we will have a delta function in energy, δ(E_f − E_i), that must be integrated over E, which
means that we must replace one of the continuous final variables, say f_1, by the energy E. Note that
all the f_k might depend on E, and E might depend on all the f_k, but still effectively only f_1 matters.
Indeed, when doing this we will obtain the Jacobian of the transformation,
J = | det ⎛ ∂E/∂f_1  ∂E/∂f_2  · · · ⎞ |
    |     ⎜    0        1     · · · ⎟ |  = |∂E/∂f_1|, (37.38)
    |     ⎝    0        0    1 · · ·⎠ |
which in turn means that we have a “density of states” of energy in a window of dE around E_f,
ρ(E, f_2, . . .) = 1/J = 1/|∂E/∂f_1|. (37.39)
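As a concrete example (a sketch with assumed values, not from the book), for a free particle with f_1 = p and E = p²/2m, the density of states is ρ = 1/|dE/dp| = m/p:

```python
# Sketch (assumed example): density of states rho = 1/|dE/df1|, eq. (37.39),
# for a free particle with f1 = p and E(p) = p^2/(2m), giving rho = m/p.
m, p, dp = 1.0, 2.0, 1e-6
dEdp = ((p + dp)**2 / (2*m) - (p - dp)**2 / (2*m)) / (2 * dp)  # = p/m
rho = 1.0 / abs(dEdp)
assert abs(rho - m / p) < 1e-9
print("rho(E) =", rho)
```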
Then we can rewrite the sum of the probability over the final states as
   
Σ_{f_{r−1}, f_r} ∫ df_1 df_2 · · · |b_f(t)|² = Σ_{f_{r−1}, f_r} ∫ dE df_2 · · · ρ(E, f_2, . . .) |b_f(t)|². (37.40)

Substituting the transition probability over a time t in (37.34), and summing over the final states,
with the transformation to E, we get
 

ΔP_{fi}(t) = (2π/ℏ) t Σ_{f_r} ∫ dE_f df_2 · · · |H1,fi|² δ(E_f − E_i) ρ(E_f, f_2, . . .). (37.41)

Thus the transition probability density (in time) is given by


 
dP(i → f)/dt = (2π/ℏ) |H1,fi|² Σ_{f_r} ∫ df_2 · · · ρ(E_i, f_2, . . . , f_r), (37.42)

which is Fermi’s golden rule for transition processes, and has applications in many domains.
There are two observations that we need to make:
(1) The perturbation expansion is valid only for small times t, meaning when P(t) is small (≪ 1).
Indeed, otherwise, eventually P(t) would increase beyond 1 as time passes, which would be a
contradiction.

(2) In order to have a well-defined probability, we need not only E f to be near Ei but also that all the
relevant f states are “near” each other. For instance, if there is a momentum direction characterizing
the final state, say f 2 = p f /p f , then we need the direction to be in a solid angle around a central
value, so that d f 2 corresponds to the solid angle.

37.5 Application to Scattering in a Collision

One important application of the Fermi golden rule, which also takes advantage of the second
observation above, is the case of quantum scattering in collisions. In this case, we need to also
generalize to the case where the initial state is in a continuous spectrum.
The standard quantum mechanical scattering process is the case of a monoenergetic (a given Ei )
parallel beam of incoming particles of momentum p 0 , with direction n0 = p 0 /p0 and with flux
J , incident on a transverse “foil” of target material and scattered at an arbitrary angle θ from the
incoming axis, with arbitrary outgoing direction n = p /p. Then we have f 2 = n, and d f 2 = dΩ
(where dΩ is an infinitesimal solid angle around n); see Fig. 37.1.
We define the differential scattering cross section in terms of the number of particles incident on a
surface element dS⊥ perpendicular to the beam as
dσ = (No. of particles scattered into dΩ in dt) / (No. of incident particles through dS⊥ in dt)
= (prob. of scattering into dΩ in dt) / (prob. of incidence through dS⊥ in dt) (37.43)
= (dP_{i→f}/dt) / J.
The current of the probability density is

J = n_0 · j_0 = |N_0|² v_0, (37.44)
where N_0 is the normalization factor for the incident (free) wave function
ψ_0 = N_0 exp[(i/ℏ)(p_0 · r − Et)] (37.45)
and p_0 = m v_0.

Figure 37.1 Scattering from a target foil (incoming momentum p_0, outgoing momentum p into the solid angle dΩ).



The outgoing (free) wave function for particles of a given energy E and direction n is
ψ_{E,n}(r, t) = [√(m_0 p)/(2πℏ)^{3/2}] exp[(i/ℏ)(p · r − Et)], (37.46)
which is normalized in such a way that it obeys a normalization condition over E and Ω,
∫_0^∞ dE ∫ dΩ ψ*_{E,n}(r, t) ψ_{E,n}(r', t) = δ(r − r'). (37.47)

Indeed, in Chapter 5 we saw that the wave function for a free particle in one dimension, normalized
over momentum p, is
ψ_{E,p}(x, t) = [1/√(2πℏ)] exp[(i/ℏ)(px − Et)], (37.48)
and therefore the wave function for a three-dimensional free particle, normalized over momentum p, is
ψ_{E,p}(r, t) = [1/(2πℏ)^{3/2}] exp[(i/ℏ)(p · r − Et)]. (37.49)
But, since d³p = dp p² dΩ = dE m_0 p dΩ, we have
|ψ_{E,n}|² = m_0 p |ψ_{E,p}|². (37.50)

To apply perturbation theory for this scattering, we need the kinetic energy to be much larger than
the interaction energy of the incoming particles with the target, V (r ),
H_0 = p²/2m ≫ H_1 = H_int = V(r). (37.51)
If this condition is satisfied, we can use Fermi’s golden rule, for the case when f 1 = E f , so that
ρ = 1/J = 1, and f 2 = n, so that d f 2 = dΩ, obtaining
dσ = [dP(i → f)/dt] / (N_0² v_0) = [1/(N_0² v_0)] (2π/ℏ) |V(r)_{fi}|² dΩ, (37.52)
where v0 = p0 /m0 . The matrix element of the interaction Hamiltonian V (r ) is

V(r)_{fi} = ∫ d³r ψ*_{E,n}(r) V(r) ψ_{E,n_0}(r)
= N_0 [√(m_0 p)/(2πℏ)^{3/2}] ∫ d³r exp[−(i/ℏ)(p − p_0) · r] V(r) (37.53)
= N_0 √(m_0 p) V(p − p_0),
where we have defined the Fourier transform of the potential,
V(p) = [1/(2πℏ)^{3/2}] ∫ d³r exp[−(i/ℏ) p · r] V(r). (37.54)
Then we obtain, for the differential cross section per unit solid angle,
dσ/dΩ = [2π/(ℏv_0)] m_0 p |V(p − p_0)|² = (2πm_0²/ℏ) |V(p − p_0)|², (37.55)
which is the Born approximation, giving the Born formula for elastic scattering.

Coulomb Potential
The most important application of the Born formula is for the case of a Coulomb potential, obtained
as a limit of the Yukawa potential (with an exponential, coming from the exchange of a massive
particle, the pion, instead of a massless one, the photon),
 
V(r) = zZ|e_0|²/r = lim_{λ→0} (zZ|e_0|²/r) e^{−λr}. (37.56)
The Fourier transform of the Yukawa potential, in the limit, is
V(p) = lim_{λ→0} [2zZe_0²/√(2πℏ³)] · 1/(p²/ℏ² + λ²). (37.57)

The proof of this formula is left as an exercise.


Then, applying the Born formula to the case of the scattering of α particles in a Coulomb potential
of a hydrogenoid atom with charge Z, with z = 2, gives

dσ/dΩ = [4m_0 Ze_0²/(p − p_0)²]² = (2Ze_0²/4E_0)² · 1/sin⁴(θ/2), (37.58)
where we have used the fact that (p − p_0)² = 2p²(1 − cos θ) = 4p² sin²(θ/2), when p = p_0 (by
energy conservation). This is the famous Rutherford formula, which can also be obtained classically
(as Rutherford did) and was used to show that the target is made of atoms in which there are electrons
around a nuclear core of charge +Z |e| (i.e., the target is not made of atoms as solid neutral balls).
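The radial integral behind the Yukawa Fourier transform can be verified numerically; this sketch (not from the book; grid parameters are assumptions) checks the momentum-space factor that enters (37.57) in units where ℏ = 1:

```python
import numpy as np

# Sketch (assumed example): check the 3d Fourier transform underlying (37.57),
# ∫ d^3r e^{-i k·r} e^{-lam r}/r = 4π/(k^2 + lam^2), by reducing it to the
# radial integral (4π/k) ∫_0^∞ dr e^{-lam r} sin(k r).
k, lam = 1.3, 0.5
r = np.linspace(0.0, 80.0, 400001)
f = np.exp(-lam * r) * np.sin(k * r)
dr = r[1] - r[0]
integral = (f.sum() - f[0]/2 - f[-1]/2) * dr     # trapezoid rule
I = 4 * np.pi / k * integral
assert abs(I - 4 * np.pi / (k**2 + lam**2)) < 1e-6
print("Yukawa Fourier transform checked:", I)
```

Taking λ → 0 in 4π/(k² + λ²) then gives the 1/k² behavior that produces the 1/sin⁴(θ/2) of the Rutherford formula.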

37.6 Sudden versus Adiabatic Approximation

There are two relevant approximations that are almost always considered when calculating the time-
dependent perturbation theory.
The sudden approximation corresponds to having a sudden change in the Hamiltonian, by
introducing the perturbation Ĥ1 over a time Δt = 2ε → 0. It is implicit in the time-independent
perturbation analyzed before, where until t_0 − ε we had Ĥ1(t) = 0 and after t_0 + ε we had Ĥ1(t) = Ĥ1
(time independent).
Then, in this approximation we can integrate the Schrödinger equation over the time period of the
change,
|ψ(+ε)⟩ − |ψ(−ε)⟩ = −(i/ℏ) ∫_{−ε}^{+ε} dt Ĥ(t)|ψ(t)⟩ → 0. (37.59)
Indeed, since the integrand is finite and ε → 0, the integral vanishes. Then we obtain

|ψ(0+)⟩ = |ψ(0−)⟩, (37.60)

or that the state does not change over the sudden transition that introduces the perturbation
Hamiltonian.
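A standard illustration of the sudden approximation (a sketch with assumed parameters, not an example from the book) is a harmonic oscillator whose frequency jumps at t = 0: the state is unchanged, so survival probabilities are just overlaps of old and new eigenstates:

```python
import numpy as np

# Sketch (assumed example): sudden approximation for an oscillator whose
# frequency jumps w1 -> w2 at t = 0. Since the state does not change, the
# probability of remaining in the ground state is the Gaussian overlap
# |<0_new|0_old>|^2 = 2 sqrt(w1 w2)/(w1 + w2)  (with hbar = m = 1).
w1, w2 = 1.0, 2.5
x = np.linspace(-12.0, 12.0, 4001)
psi = lambda w: (w / np.pi)**0.25 * np.exp(-w * x**2 / 2)   # ground state
dx = x[1] - x[0]
P_stay = (np.sum(psi(w1) * psi(w2)) * dx)**2
assert abs(P_stay - 2 * np.sqrt(w1 * w2) / (w1 + w2)) < 1e-8
print("P(remain in ground state) =", P_stay)
```

The missing probability goes into excited states of the new Hamiltonian, populated by the sudden change.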

Adiabatic Perturbation
The opposite situation occurs when we introduce the perturbation very slowly, or adiabatically,
meaning that we transition from the initial state |n (0) , an eigenstate of Ĥ0 , to the state |n, an
eigenstate of the full Hamiltonian Ĥ, which corresponds to |n (0)  as λ → 0.
The way in which we introduce the interaction Ĥ1 is to factorize the time dependence as follows:
Ĥ1 (t) = e γt Ĥ1 . (37.61)
As we see, at t = 0 the perturbation is fully introduced ( Ĥ1(t = 0) = Ĥ1 ), and in fact we can
start at t_0 → −∞, when Ĥ1(t) → 0.
This formalism connects the time-dependent and time-independent perturbation theories, as we
will now show.
From (37.26), substituting the above Ĥ1 (t), we obtain
iℏ b_m^{(1)}(τ) = ∫_{−∞}^{τ} dt e^{iωmi t} e^{γt} H1,mi = H1,mi e^{iωmi τ + γτ} / (iω_{mi} + γ). (37.62)
Then the transition probability is
|b_m^{(1)}(τ)|² = |H1,mi|² e^{2γτ} / [ℏ²(ω_{mi}² + γ²)]. (37.63)
Moreover, the probability density in time is
(d/dt)|b_m^{(1)}(t)|² = 2|H1,mi|² γ e^{2γt} / [ℏ²(ω_{mi}² + γ²)]. (37.64)
But, by smoothly removing the perturbation, i.e., taking γ → 0, so that
lim_{γ→0} (1/π) γ/(γ² + ω_{mi}²) = δ(ω_{mi}) = ℏ δ(E_m − E_i), (37.65)

we obtain
(d/dt)|b_m^{(1)}(t)|² = (2π/ℏ)|H1,mi|² δ(E_m − E_i), (37.66)
which is Fermi’s golden rule, as expected for a time-dependent perturbation (introduced suddenly, at
a very early time, t 0 → −∞).
On the other hand, we can also obtain the time-independent perturbation formula. Indeed, first
using the time-dependent formula, we have
c_m(t) = ⟨m^{(0)}|ψ(t)⟩ = b_m(t) e^{−iEm t/ℏ}. (37.67)
Considering a negative initial time, t_0 < 0, and vanishing final time, t = 0, and restricting ourselves
to first-order perturbation theory, we obtain
C_m(t = 0) = −(i/ℏ) ∫_{t_0}^{0} dt' e^{iωmi t'} H1,mi(t') = −(i/ℏ) H1,mi ∫_{t_0}^{0} dt' e^{(iωmi + γ)t'}
= −(i/ℏ) H1,mi (1 − e^{γt_0 + iωmi t_0}) / (iω_{mi} + γ). (37.68)

Then, as we take t_0 → −∞ to introduce the perturbation adiabatically, we obtain
C_m(t = 0) → −H1,mi/(ℏω_{mi}) = H1,mi/(E_i − E_m), (37.69)
which is the time-independent perturbation formula.
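The limit (37.69) can be checked directly by doing the switching integral numerically; in this sketch (assumed values of ω_mi, H1,mi, and γ, not from the book) the deviation from the time-independent result is of order γ/ω_mi, as expected:

```python
import numpy as np

# Sketch (assumed values): check (37.68)-(37.69). With adiabatic switching
# H1(t) = e^{gamma t} H1 and t0 -> -infinity, the first-order coefficient
# tends to H1_mi/(E_i - E_m) = -H1_mi/(hbar w_mi).
hbar, w_mi, H1mi, gamma = 1.0, 2.0, 0.1, 1e-2
t = np.linspace(-2500.0, 0.0, 1000001)           # gamma*|t0| = 25 >> 1
f = np.exp((1j * w_mi + gamma) * t)
dt = t[1] - t[0]
integral = (f.sum() - f[0]/2 - f[-1]/2) * dt     # trapezoid rule
c = -1j / hbar * H1mi * integral                 # eq. (37.68), t0 -> -inf
target = -H1mi / (hbar * w_mi)                   # = H1_mi/(E_i - E_m)
assert abs(c - target) < 1e-2 * abs(target)      # O(gamma/w_mi) deviation
print("adiabatic limit:", c, "vs time-independent result:", target)
```

Shrinking γ (while extending the integration range so γ|t_0| stays large) pushes the result arbitrarily close to the time-independent formula.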

Important Concepts to Remember

• Time-dependent perturbation theory corresponds to Ĥ(t) = Ĥ0 + λĤ1(t), and can be solved with the method of variation of constants, |ψ(t)⟩ = Σ_n c_n(t)|ψn⟩, where the |ψn⟩ are the eigenstates of Ĥ0.
• We expand the coefficients in λ, after factoring out the Ĥ0 time dependence: c_n(t) = e^{−iEn t/ℏ} Σ_s b_n^{(s)} λ^s.
• At first order, we find b_n^{(1)}(t) = (1/iℏ) ∫_0^t dt' e^{iωni t'} H1,ni(t') and
|ψ(t)⟩ = Σ_n (δ_{ni} + b_n^{(1)}(t)) e^{−iEn t/ℏ} |ψn⟩.

• In the sudden approximation (Ĥ₁ turned on at t = 0), we find
dP_{if}/dτ = d|b_m^{(1)}(τ)|²/dτ = (2π/ℏ)|H_{1,mi}|² δ(E_f − E_i)
at large times.
• For a transition to a continuous spectrum, we find Fermi's golden rule,
dP(i → f)(t)/dt = (2π/ℏ)|H_{1,fi}|² ∫df₂ ⋯ df_r ρ(E_i, f₂, …, f_r),
where all relevant f states need to be "near each other".
• We can apply Fermi's golden rule to elastic scattering in a collision, and obtain the Born formula (Born approximation),
dσ/dΩ = (2πm₀²/ℏ)|V(p − p₀)|²,
which for the hydrogenoid atom gives the Rutherford formula,
dσ/dΩ = [2Ze₀²/(4E₀)]² (1/sin⁴(θ/2)).

• The adiabatic approximation, with Ĥ₁ introduced as e^{γt}Ĥ₁ for t < 0, allows us to connect the time-dependent and time-independent formalisms.

Further Reading
See [2], [1].

Exercises

(1) Consider Ĥ0 corresponding to the hydrogen atom, with the initial state i being the ground state,
and λH1,mi = λe−mA+λ̃t , for t < 0, where A is a constant. Calculate the probability of transition
to the mth energy eigenstate as a function of time.
(2) Consider Ĥ0 corresponding to the hydrogen atom, the initial state i being the ground state, and
Ĥ1 = Kr, with K constant, a perturbation introduced suddenly at t = 0. Calculate the transition
probability to the state n = 2, l = 0 as a function of time. What happens at large times?
(3) Prove the formula (37.32) for the limit giving the delta function.
(4) Consider an electron in an atom, with a potential that can be approximated by a (radial) square
well of radius r 0 above zero, in its ground state i = 0. Introduce a small perturbation Hmi = H =
constant. Calculate the rate for the electron to transition out of the square well and into the free
space beyond.
(5) Calculate the differential cross section for scattering from a delta function potential,
V = V0 δ(r ).
(6) Prove the formula (37.57).
(7) Consider the adiabatic introduction of the perturbation Ĥ₁ with
Ĥ₁(t) = Ĥ₁/(1 + γ²t²),
for t < 0. Calculate the transition rate.
38 Time-Dependent Perturbation Theory: Second and All Orders

In this chapter, we continue with time-dependent perturbation theory, first with the second order, as
applied to finding the Breit–Wigner distribution for a transition with an energy shift and decay width,
and then with an all-orders formalism.

38.1 Second-Order Perturbation Theory and Breit–Wigner Distribution (Energy Shift and Decay Width)

As we saw in the previous chapter, the general recursion relation between the (s+1)th order and the sth order in perturbation theory is
iℏ ḃ_m^{(s+1)}(t) = Σ_n H_{1,mn}(t) b_n^{(s)}(t) e^{iω_{mn}t}.  (38.1)

We are interested in the s = 1, time-independent (sudden approximation) case, which gives
iℏ ḃ_m^{(2)}(t) = Σ_n H_{1,mn} b_n^{(1)}(t) e^{iω_{mn}t}.  (38.2)

Specializing to m = i, we get
iℏ ḃ_i^{(2)}(t) = Σ_n H_{1,in} b_n^{(1)}(t) e^{iω_{in}t},  (38.3)

but we saw in the previous chapter that
b_n^{(1)}(t) = (1/iℏ) ∫_0^t dt′ e^{iω_{ni}t′} H_{1,ni}(t′) b_i^{(0)}(t′),  (38.4)
for n ≠ i, where, however, b_i^{(0)}(t′) = 1 (more precisely, b_m^{(0)}(t′) = δ_{mi}). On the other hand, b_i^{(1)}(t) = 0, so the time dependence of the initial state only appears from the second order (i.e., in b_i^{(2)}(t)) onwards.
Considering the case of H_{1,in} ≠ 0 in the sudden approximation (H_{1,in} constant in time), and substituting b_n^{(1)} in the formula for b_i^{(2)}, we obtain (note that ω_{in} = −ω_{ni})
b_i^{(2)}(t) = −Σ_n (|H_{1,in}|²/ℏ²) ∫_0^t dt′ e^{iω_{in}t′} ∫_0^{t′} dt′′ e^{iω_{ni}t′′} b_i^{(0)} = Σ_n [i|H_{1,in}|²/(ℏ²ω_{ni})] ∫_0^t dt′ (1 − e^{−iω_{ni}t′}) b_i^{(0)}.  (38.5)
429
430 38 Higher-Order Time-Dependent Perturbation Theory

Then we find a decay formula for the initial state, where b_i(t) ≈ b_i^{(0)} (= 1) and ḃ_i(t) ≈ ḃ_i^{(2)}:
ḃ_i(t)/b_i(t) = Σ_n (i|H_{1,in}|²/ℏ) (1 − e^{−i(E_n−E_i)t/ℏ})/(E_n − E_i).  (38.6)

To continue, we use the formula
lim_{a→∞} (1 − e^{−iax})/x = iπδ(x) + P(1/x),  (38.7)
understood as a relation between distributions, which implies that (if y < 0, z > 0)
lim_{a→∞} ∫_y^z dx f(x)(1 − e^{−iax})/x = iπf(0) + lim_{ε→0}[∫_y^{−ε} dx f(x)/x + ∫_{+ε}^z dx f(x)/x] = iπf(0) + P∫_y^z dx f(x)/x.  (38.8)
Here we have defined the principal part P of an integral as the integral minus its pole,
P∫_y^z dx f(x)/x = lim_{ε→0}[∫_y^{−ε} dx f(x)/x + ∫_{+ε}^z dx f(x)/x].  (38.9)
To prove the above relation, valid for distributions, we first note that
(1 − e^{−iax})/x = i sin(ax)/x + (1 − cos ax)/x,  (38.10)
and, from the previous chapter, we know that if a → ∞ then the first term goes to iπδ(x). Moreover,
∫_y^z dx f(x)(1 − cos ax)/x = [∫_y^{−ε} + ∫_{−ε}^{+ε} + ∫_{+ε}^z] dx f(x)(1 − cos ax)/x,  (38.11)
and, since around x = 0 we have (1 − cos ax)/x ≈ a²x/2, we have that
∫_{−ε}^{+ε} dx f(x)(1 − cos ax)/x → 0,  (38.12)
which leaves only the principal part of the integral. Moreover, for the principal part (excluding the region near x = 0), we have
lim_{a→∞} ∫ dx f(x) cos(ax)/x = 0,  (38.13)
since the integrand oscillates very fast, averaging to zero. Then finally we find
lim_{a→∞} ∫_y^z dx f(x)(1 − e^{−iax})/x = iπf(0) + P∫_y^z dx f(x)/x.  (38.14)
q.e.d.
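The lemma can also be illustrated numerically. The sketch below (not from the book's text; the Gaussian test function is an arbitrary choice) checks that the imaginary part of the left-hand side of (38.14), which comes from sin(ax)/x → πδ(x), indeed tends to πf(0):

```python
import numpy as np

# Check that the integral of f(x)*sin(a x)/x approaches pi*f(0) as a grows,
# the delta-function piece of (38.14).
f = lambda x: np.exp(-x**2)        # test function, f(0) = 1
x = np.linspace(-40.0, 40.0, 8_000_001)
for a in (5.0, 50.0, 500.0):
    # sin(a x)/x written via np.sinc (= sin(pi t)/(pi t)) to handle x = 0
    val = np.trapz(f(x)*a*np.sinc(a*x/np.pi), x)
    print(a, val)                  # approaches pi = 3.14159...
```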

Substituting into (38.6), with a = t/ℏ and x = E_n − E_i, we get
ḃ_i(t)/b_i(t) ≈ Σ_n (i/ℏ)|H_{1,in}|² [iπδ(E_n − E_i) + P(1/(E_n − E_i))].  (38.15)

More precisely, we have in Σ_n a sum over final states, so
ḃ_i(t)/b_i(t) ≈ Σ_{f_n} ∫df₂ ⋯ ∫dE_n ρ(E_n, f₂, …, f_n) (1/ℏ)|H_{1,in}|² [−πδ(E_n − E_i) + iP(1/(E_n − E_i))]
= −(π/ℏ) Σ_{f_n} ∫df₂ ⋯ ρ(E_i, f₂, …, f_n)|H_{1,in}|²
+ (i/ℏ) Σ_{f_n} P∫dE_n ∫df₂ ⋯ ρ(E_n, f₂, …, f_n) |H_{1,ni}|²/(E_n − E_i).  (38.16)

The term on the second line is defined to be ≡ −Γ/2 and the term on the third line is defined to be ≡ −(i/ℏ)ΔE_i, and in total the result is defined to be equal to
−γ ≡ −[Γ/2 + (i/ℏ)ΔE_i].  (38.17)
Thus
Γ = (2π/ℏ) Σ_{f_n} ∫df₂ ⋯ |H_{1,ni}|² ρ(E_i, f₂, …, f_n),
ΔE_i = −Σ_{f_n} P∫dE_n ∫df₂ ⋯ ρ(E_n, f₂, …, f_n) |H_{1,ni}|²/(E_n − E_i),  (38.18)
and the decay of the initial state is given by
ḃ_i(t)/b_i(t) ≈ −γ ⇒ b_i(t) = e^{−γt} = e^{−Γt/2} e^{−iΔE_i t/ℏ}.  (38.19)
Note that the coefficient b_i(t) calculated here, ≈ 1 − (Γ/2)t − (i/ℏ)ΔE_i t + ⋯, amounts to a sum of the zeroth-order b_i^{(0)} = 1 and the second-order one, in −Γ/2 − (i/ℏ)ΔE_i, plus an infinite series of contributions from higher orders.
Thus the coefficient of the initial eigenstate of Ĥ₀ is
c_i(t) = b_i(t) exp(−(i/ℏ)E_i t) ≈ e^{−Γt/2} e^{−i(E_i + ΔE_i)t/ℏ},  (38.20)
which means that Γ is a "decay width", whereas ΔE_i is an energy shift.
Indeed, the probability of finding the particle in the same state as the initial state is
P_i(t) = |c_i(t)|² ≈ e^{−Γt},  (38.21)
so Γ is indeed a decay width. Moreover, the lifetime of the particle is given by
⟨t⟩ = ∫_0^∞ dt t(−dP/dt) = ∫_0^∞ dt t(−d|c_i|²/dt) = ∫_0^∞ dt tΓe^{−Γt} = 1/Γ.  (38.22)
Since we found that (38.19) contains zeroth-order plus second-order contributions, substituting it into the right-hand side of (38.1) we obtain a sum of first-order and third-order terms for b_m. Moreover, the terms in the sum with n ≠ i are negligible compared with the one with n = i, so only the n = i term remains, and we get
iℏ ḃ_m^{(1)+(3)}(t) = H_{1,mi} b_i^{(0)+(2)}(t) e^{iω_{mi}t} = H_{1,mi} e^{−γt} e^{iω_{mi}t}.  (38.23)

Figure 38.1 The Breit–Wigner distribution is a Lorentzian curve (probability P_m versus energy E_m, peaked at E_m = E_i + ΔE_i).

Integrating this equation, we obtain
b_m^{(1)+(3)}(τ) = (1/iℏ) H_{1,mi} ∫_0^τ dt e^{(iω_{mi}−γ)t} = −H_{1,mi} (e^{i(E_m−E_i)τ/ℏ} e^{−γτ} − 1)/(E_m − E_i + iℏγ).  (38.24)

At large times, τ → ∞, e^{−γτ} → 0, so we obtain
b_m^{(1)+(3)}(τ → ∞) → H_{1,mi}/(E_m − E_i + iℏγ) = H_{1,mi}/(E_m − E_i − ΔE_i + iℏΓ/2).  (38.25)

Therefore the probability of finding the state in the mth eigenstate at large times is
|c_m|² = |b_m^{(1)+(3)}|² = |H_{1,mi}|²/[(E_m − E_i − ΔE_i)² + (ℏΓ/2)²],  (38.26)
which is the Breit–Wigner distribution in energies, a Lorentzian curve, as in Fig. 38.1. Indeed, here ℏΓ is the full width at half maximum of the distribution.
Notice also that this distribution implies the energy–time uncertainty relation: the uncertainty in time is the lifetime, Δt ∼ 1/Γ, while the uncertainty in energy is the width, ΔE ∼ ℏΓ, so
ΔtΔE ∼ ℏ.  (38.27)
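The width statement can be checked directly. An illustrative sketch (not from the book's text; the sample values of Γ and of the peak position are arbitrary) verifying that the Lorentzian line shape of (38.26) has full width at half maximum ℏΓ:

```python
import numpy as np

# Breit-Wigner/Lorentzian line shape |c_m|^2 ~ 1/[(E - E_peak)^2 + (hbar*Gamma/2)^2];
# locate the points at half maximum and measure the full width.
hbar, Gamma, E_peak = 1.0, 0.8, 3.0   # E_peak plays the role of E_i + Delta E_i

def line(E):
    return 1.0/((E - E_peak)**2 + (hbar*Gamma/2)**2)

E = np.linspace(E_peak - 10, E_peak + 10, 4_000_001)
above = E[line(E) >= line(E_peak)/2]
fwhm = above[-1] - above[0]
print(fwhm, hbar*Gamma)
```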

38.2 General Perturbation

The general theory of perturbations (to all orders) is best described in the interaction picture, defined
in Chapter 9. We review it here.
As before, we consider a nonperturbed Hamiltonian Ĥ0 and an interaction Ĥ1 , so Ĥ = Ĥ0 + Ĥ1 .
Then we go to the Heisenberg picture with the free Hamiltonian Ĥ0 only, so we change states and
operators with respect to the Schrödinger picture,

|ψ_W⟩ = Ŵ|ψ⟩, Â_W = ŴÂŴ^{−1},  (38.28)

where
Ŵ(t) = (Û_S^{(0)}(t, t₀))^{−1} = [exp(−(i/ℏ)Ĥ₀(t − t₀))]^{−1} = exp(+(i/ℏ)Ĥ₀(t − t₀)),  (38.29)
and Û_S^{(0)}(t, t₀) is the Schrödinger-picture evolution operator for propagation with Ĥ₀. Then
|ψ_I(t)⟩ = exp((i/ℏ)Ĥ₀(t − t₀))|ψ_S(t)⟩,
Â_I = exp((i/ℏ)Ĥ₀(t − t₀)) Â_S exp(−(i/ℏ)Ĥ₀(t − t₀)).  (38.30)
Thus, now both the state |ψ_I(t)⟩ and the operator Â_I(t) are time dependent, with
iℏ ∂_t|ψ_I(t)⟩ = Ĥ_{1,I}|ψ_I(t)⟩,
iℏ ∂_t Â_I(t) = [Â_I(t), Ĥ_{0,I}],  (38.31)
where the index I means that the operator is in the interaction picture.
The evolution in the interaction picture is given by
|ψ_I(t)⟩ = Û_I(t, t₀)|ψ_I(t₀)⟩ = Û_I(t, t₀)|ψ_S(t₀)⟩,  (38.32)
since all the pictures must give the same result at time t₀. Then we find
Û_I(t, t₀) = Ŵ(t)Û_S(t, t₀)Ŵ^{−1}(t₀) = exp((i/ℏ)Ĥ₀(t − t₀)) Û_S(t, t₀),  (38.33)
where we have used the fact that Ŵ(t₀) = 1. This interaction-picture operator then satisfies the same equation as |ψ_I(t)⟩, namely
iℏ ∂_t Û_I(t, t₀) = Ĥ_{1,I} Û_I(t, t₀).  (38.34)
This equation is solved perturbatively, as
Û_I(t, t₀) = 1 + (−i/ℏ)∫_{t₀}^t dt₁ H_{1,I}(t₁) + (−i/ℏ)²∫_{t₀}^t dt₁ ∫_{t₀}^{t₁} dt₂ H_{1,I}(t₁)H_{1,I}(t₂) + ⋯
= 1 + (−i/ℏ)∫_{t₀}^t dt₁ H_{1,I}(t₁) + (1/2!)(−i/ℏ)²∫_{t₀}^t dt₁ ∫_{t₀}^t dt₂ T[H_{1,I}(t₁)H_{1,I}(t₂)] + ⋯
= T exp(−(i/ℏ)∫_{t₀}^t dt′ H_{1,I}(t′)),  (38.35)
where T is a time-ordering operator.
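The time-ordered exponential (38.35) can be illustrated numerically. The sketch below (not from the book's text) takes a two-level system with ℏ = 1 and an assumed sample interaction H_{1,I}(t) = f(t)σ_x, and builds Û_I as a time-ordered product of short-time factors, which solves i dÛ/dt = H_{1,I}Û:

```python
import numpy as np

# Time-ordered product of infinitesimal factors U -> (1 - i f(t) sigma_x dt) U.
sx = np.array([[0., 1.], [1., 0.]])
f = lambda t: np.cos(3.0*t)

def U_product(t0, t1, n=50_000):
    dt = (t1 - t0)/n
    U = np.eye(2, dtype=complex)
    for k in range(n):
        t = t0 + (k + 0.5)*dt
        U = (np.eye(2) - 1j*f(t)*sx*dt) @ U   # later times act on the left
    return U

# Here [H1,I(t), H1,I(t')] = 0, so the T-exponential reduces to an ordinary
# exponential with phase phi = integral of f from 0 to 1 = sin(3)/3:
phi = np.sin(3.0)/3.0
U_exact = np.cos(phi)*np.eye(2) - 1j*np.sin(phi)*sx
print(np.abs(U_product(0.0, 1.0) - U_exact).max())   # small
```

For a noncommuting H_{1,I}(t) the product form (with later times always on the left) is exactly what the T symbol encodes.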

38.3 Finding the Probability Coefficients b_n(t)

Consider the initial interaction-picture state (which equals the Schrödinger-picture state) to be an eigenket of Ĥ₀ with index i, as before, so that
|ψ_I(t)⟩ = Û_I(t, t₀)|ψ_I(t₀)⟩ = Û_I(t, t₀)|ψ_i⟩.  (38.36)

Then the Schrödinger-picture state is
|ψ_S(t)⟩ = exp(−(i/ℏ)Ĥ₀(t − t₀))|ψ_I(t)⟩ = exp(−(i/ℏ)Ĥ₀(t − t₀)) Û_I(t, t₀)|ψ_i⟩.  (38.37)
In order to calculate transition probabilities, we insert on the left the identity expanded in a complete set of eigenkets of Ĥ₀, 1 = Σ_n |ψ_n⟩⟨ψ_n|. We obtain
|ψ_S(t)⟩ = Σ_n |ψ_n⟩⟨ψ_n| exp(−(i/ℏ)Ĥ₀(t − t₀)) Û_I(t, t₀)|ψ_i⟩ = Σ_n exp(−(i/ℏ)E_n(t − t₀)) |ψ_n⟩⟨ψ_n|Û_I(t, t₀)|ψ_i⟩,  (38.38)
where we have acted with exp(−(i/ℏ)Ĥ₀(t − t₀)) on the left, on ⟨ψ_n|. Identifying the result with the general expansion in terms of the b_n(t) coefficients, we find that
general expansion in terms of bn (t) coefficients, we find that
bn (t) = ψ n |ÛI (t, t 0 )|ψi 
  ∞   t  t  
(−i/) s   
= ψ n  dt 1 · · · dt s T H1,I (t 1 ) . . . H1,I (t s ) ψi
 s=0 s! t0 t0  (38.39)
 (−i/)
∞  
s t t
! "
= δ ni + dt 1 · · · dt s ψ n |T H1,I (t 1 ) . . . H1,I (t s ) |ψi .
s=1
s! t0 t0

Now insert the identity written as a complete set over m_k, 1 = Σ_{m_k}|ψ_{m_k}⟩⟨ψ_{m_k}|, after each Hamiltonian at t_k. Then, in
H_{1,I}(t_k) = exp((i/ℏ)Ĥ₀(t_k − t₀)) H_{1,S}(t_k) exp(−(i/ℏ)Ĥ₀(t_k − t₀)),  (38.40)
we can act with the operator on the left on the neighboring ⟨ψ_{m_{k−1}}| and with the operator on the right on the neighboring |ψ_{m_k}⟩, obtaining
b_n(t) = δ_{ni} + Σ_{s=1}^∞ [(−i/ℏ)^s/s!] ∫_{t₀}^t dt₁ ⋯ ∫_{t₀}^t dt_s
× Σ_{{m_k}} T{ exp((i/ℏ)E_n(t₁ − t₀)) H_{1,S,nm₁}(t₁) exp(−(i/ℏ)E_{m₁}(t₁ − t₀))
× exp(+(i/ℏ)E_{m₁}(t₂ − t₀)) H_{1,S,m₁m₂}(t₂) exp(−(i/ℏ)E_{m₂}(t₂ − t₀)) ⋯
× exp(+(i/ℏ)E_{m_{s−1}}(t_s − t₀)) H_{1,S,m_{s−1}i}(t_s) exp(−(i/ℏ)E_i(t_s − t₀)) }.  (38.41)
This expansion can be identified with the expansion of the coefficients in λ,
b_n(t) = Σ_{s=0}^∞ b_n^{(s)}(t),  (38.42)
with b_n^{(0)}(t) = δ_{ni}.
We have therefore re-obtained the zeroth-order, first-order, and second-order terms, and generalized to all orders, as promised.

Important Concepts to Remember

• In the second order of time-dependent perturbation theory (more precisely, summing the zeroth-order, second-order, and higher-order contributions), we find, for the initial-state coefficient, c_i(t) = b_i(t) exp(−(i/ℏ)E_i t) = e^{−Γt/2} exp(−(i/ℏ)(E_i + ΔE_i)t), where Γ is the decay width and ΔE_i is the energy shift.
• This leads to the decay of the original state, P_i(t) ≈ e^{−Γt}, and to the Breit–Wigner distribution for the other states, |c_m^{(1)+(3)}|² = |b_m^{(1)+(3)}|² = |H_{1,mi}|²/[(E_m − E_i − ΔE_i)² + (ℏΓ/2)²].
• For a general perturbation, we can use the formalism of the evolution operator in the interaction picture, Û_I(t, t₀) = T exp(−(i/ℏ)∫_{t₀}^t dt′ H_{1,I}(t′)), and note that b_n(t) = ⟨ψ_n|Û_I(t, t₀)|ψ_i⟩.

Further Reading
See [1] and [2].

Exercises

(1) In general, can we have a zero-energy shift if we have a nonzero decay width (finite lifetime)?
How about a zero decay width (infinite lifetime) for a nonzero energy shift?
(2) In a transition, if the density of final states ρ(En ) is an increasing exponential, ρ(En ) = Ae αEn ,
α > 0, what does the (physical) condition of a finite energy shift imply for the transition
Hamiltonian matrix H1,in , if Ei > 0?
(3) If Hni is independent of En = Ei , what physical condition can we impose on ρ(En )?
(4) Argue for the energy–time uncertainty relation, ΔEΔt ∼ , on the basis of the Breit–Wigner
distribution.
(5) Calculate the interaction picture operator ÛI (t, t 0 ) and bn (t) for the interaction in the sudden
approximation.
(6) Show that the first-order formula for time-dependent perturbation theory can be recovered from
(38.41).
(7) Show that the second-order formula for time-dependent perturbation theory can be recovered
from (38.41).
39 Application: Interaction with (Classical) Electromagnetic Field, Absorption, Photoelectric and Zeeman Effects

In this chapter we will apply the formalisms of time-independent and time-dependent perturbation
theory to the case of the interaction of electrons and atoms with an electromagnetic field. In particular,
we will consider the Zeeman effect for bound electrons, and the absorption and photoelectric effects
for atoms.

39.1 Particles and Atoms in an Electromagnetic Field

In Chapter 22, we described the interaction of a quantum mechanical particle with a classical
electromagnetic field. We found that the interaction Hamiltonian is
Ĥ = (1/2m)(p − qA)² + V′(r) − qA₀ = p²/2m − (q/m)A·p − qA₀(r) + V′(r) + q²A²/2m.  (39.1)
Here V′(r) is a non-electromagnetic part of the potential (more precisely, not due to the external electromagnetic field), so
−qA₀(r) + V′(r) = V(r),  (39.2)
and p = (ℏ/i)∇, as usual, so, in the gauge ∇·A = 0, we have
Ĥ = −(ℏ²/2m)∇² + V′(r) + (iqℏ/m)A·∇ − qA₀ + (q²/2m)A².  (39.3)
Otherwise, there would be an extra term +(iqℏ/2m)(∇·A).
Moreover, we can ignore the last term in Ĥ. In a hydrogenoid atom, where V′(r) is the interaction of the electron with the nucleus, we have
Ĥ₀ = −(ℏ²/2m)∇² + V′(r),  (39.4)
and the other terms make up Ĥ₁.
If A, A₀ constitute a radiation field, we can also put A₀ = 0 as a gauge choice, obtaining the radiation gauge. Neglecting the A² term (as being small, of second order), we obtain
Ĥ₁ = (iqℏ/m)A·∇.  (39.5)
436
437 39 Applications

39.2 Zeeman and Paschen–Back Effects

Consider an atom with strong spin–orbit coupling in a magnetic field. In Chapter 24, we found that in the presence of a magnetic field in the z direction, B = Bẑ, the interaction Hamiltonian between the electrons in the atom and the external magnetic field is
Ĥ_B-int = −μ_tot·B = −(μ_B/ℏ)(L + 2S)·B = −(μ_B B/ℏ)(L_z + 2S_z) = −(μ_B B/ℏ)(J_z + S_z).  (39.6)

When continuing the analysis, the implicit assumption (not expressed) was that this Ĥ_B-int is a perturbation. Then we can use time-independent perturbation theory at first order, which does not change the states, so that
|n^{(0)}⟩ = |E_n J M⟩.  (39.7)
Therefore, the energy shift in first-order perturbation theory becomes
ΔE_n^{(1)} = H_{1,nn} = ⟨n^{(0)}|Ĥ₁|n^{(0)}⟩ = −(μ_B B/ℏ)⟨E_n J M|J_z + S_z|E_n J M⟩.  (39.8)
But, by the Wigner–Eckart theorem (from Chapter 16), S is a vector operator like J, so they are proportional, and we have
⟨S_z⟩ = (⟨J·S⟩/⟨J²⟩)⟨J_z⟩.  (39.9)
Using this relation, we found that
⟨E_n J M|J_z + S_z|E_n J M′⟩ = gMℏ δ_{MM′},  (39.10)
where g is the Landé g factor, calculated from the above relation. Then, finally, we found
ΔE_n^{(1)} = −gMμ_B B.  (39.11)
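As a small helper (not from the book's text), the Landé g factor implied by the projection (39.9)–(39.10) can be evaluated with the standard closed form g = 1 + [j(j+1) + s(s+1) − l(l+1)]/(2j(j+1)), which is an assumption here since this section only quotes g implicitly:

```python
# Lande g factor and the resulting linear Zeeman shifts Delta E = -g M muB B.
def lande_g(j, l, s):
    return 1.0 + (j*(j + 1) + s*(s + 1) - l*(l + 1)) / (2.0*j*(j + 1))

def zeeman_shifts(j, l, s, muB_B):
    """First-order shifts -g*M*muB_B for M = -j, ..., +j."""
    g = lande_g(j, l, s)
    Ms = [-j + k for k in range(int(round(2*j)) + 1)]
    return [-g*M*muB_B for M in Ms]

print(lande_g(0.5, 0, 0.5))   # s_1/2 electron: g = 2
print(lande_g(0.5, 1, 0.5))   # p_1/2 electron: g = 2/3
```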

We must make one observation, however. This linear Zeeman effect in a hydrogenoid atom is valid only in the case j ≠ 0, so either l ≠ 0 or s ≠ 0. But, if l = s = 0 (so j = 0), then we need to consider the second-order term in Ĥ,
Ĥ₂ = e²A²/(2m) = mω_L²r²sin²θ/2,  (39.12)
where
ω_L = eB/(2m)  (39.13)
is the Larmor frequency. Then the energy shift in first-order perturbation theory, using this Ĥ₂, is
ΔE_n^{(1)} = (m/2)(eB/(2m))²(a₀/Z)²(1/3)n²(5n²+1),  (39.14)
where we have used the fact that, for a hydrogenoid atom,
⟨r²sin²θ⟩_n = (a₀/Z)²(1/3)n²(5n²+1).  (39.15)

Paschen–Back Effect
This effect occurs in the opposite limit to the (linear) Zeeman effect, in which the LS (spin–orbit) coupling (described in Chapter 22) is ≪ μ_B B. This means that the spin–orbit interaction can be treated perturbatively, and does not couple L with S into J in this limit. Then the correct quantum states to consider are |E_n L S M_L M_S⟩, instead of |E_n L S J M_J⟩. Therefore the energy shift in first-order time-independent perturbation theory is
ΔE_n^{(1)} = ⟨E_n L S M_L M_S|Ĥ₁|E_n L S M_L M_S⟩ = −μ_B B(M_L + 2M_S).  (39.16)
However, we note that this Ĥ₁ is actually diagonal in this basis, so it is really a part of Ĥ₀. Thus here the perturbation Ĥ₁ is the spin–orbit coupling, which is a small effect.

39.3 Electromagnetic Radiation and Selection Rules

Before considering the absorption and emission of photons, we will consider the emission of classical
electromagnetic radiation from an electric current, calculate the interaction Hamiltonian, and find
the resulting selection rules for the transition between initial and final states interacting with the
electromagnetic radiation.
We consider the same radiation gauge, A₀ = 0 and ∇·A = 0. Then the interaction Hamiltonian of the classical electromagnetic radiation with an electric current j is
H_i = ∫d³r j·A.  (39.17)
In the radiation gauge, the equation of motion is the wave equation □A = 0, with traveling-wave solution (a retarded wave, with delayed time)
A = A(t − r·n/c),  (39.18)
where n is the direction of propagation,
n = k/k = ck/ω_k.  (39.19)


The Fourier-transformed wave in the delayed time is A(ω), and satisfies the condition
n·A(ω) = 0,  (39.20)
which comes from the gauge condition ∇·A = 0.
The Fourier-transformed vector potential is written in terms of a polarization vector ε(k), as
A(ω) = ε(k)A(ω) = A₁(ω)ε₁ + A₂(ω)ε₂,  (39.21)

where ε₁ and ε₂ are unit vectors perpendicular to n and to each other, and
ε(ω) = (ε₁ + ae^{iθ}ε₂)/√(1 + a²), A(ω) = √(1 + a²) A₁(ω), A₂(ω)/A₁(ω) = ae^{iθ}.  (39.22)
However, usually we consider a monochromatic wave, i.e., a wave with a single ω, so that instead of a Fourier-transformed wave, we have simply
A(r, t) = A₀ ε(k) 2cos(ωn·x/c − ωt) = A₀ ε(k)[e^{i(k·x−ωt)} + e^{−i(k·x−ωt)}],  (39.23)
where k = ωn/c. Then, the interaction Hamiltonian between the radiation and the electric current presented by an atom, valid for both emission and absorption of electromagnetic radiation, is
H_i = A₀ ∫d³r ε(k)·j [e^{i(k·x−ωt)} + e^{−i(k·x−ωt)}].  (39.24)
We note that the current conservation law is
∂ρ/∂t + ∇·j = 0.  (39.25)
Next, we can proceed either classically or quantum mechanically, and obtain basically the
same result.
Classically, we consider that the electric current generating or absorbing the radiation oscillates in synchronization with it, meaning that the charge density ρ = ρ(ω) oscillates with the same monochromatic frequency ω as the wave, so that
ρ ∝ e^{iωt} ⇒ ∂ρ/∂t = iωρ.  (39.26)
Quantum mechanically, the charge density is an operator, ρ = ρ̂, and its oscillation in synchronization with the radiation induces a transition between atomic states, |ψ_n⟩ → |ψ_m⟩. Then, using the Heisenberg equation of motion for the operator ρ̂, we obtain
⟨ψ_m|∂ρ̂/∂t|ψ_n⟩ = (i/ℏ)⟨ψ_m|[Ĥ_atom, ρ̂]|ψ_n⟩ = (i/ℏ)(E_m − E_n)⟨ψ_m|ρ̂|ψ_n⟩ = iω_mn⟨ψ_m|ρ̂|ψ_n⟩.  (39.27)
This is essentially the same result as in the classical case, except that it is an expectation value, between two states, and the frequency ω is generically quantized, ω = ω_mn.
Thus, either way we obtain
iωρ + ∇·j = 0,  (39.28)
meaning we can replace ∇·j with −iωρ. But we can also use the relation
∇·[(ε·r)j] = ∂_i[ε_j x_j j_i] = δ_{ij}ε_j j_i + ε_j x_j ∂_i j_i = ε·j + (ε·r)∇·j = ε·j − iω(ε·r)ρ.  (39.29)

Integrating over space, the left-hand side gives a boundary term, which is assumed to vanish, so we obtain
∫d³r ε·j = iω ∫d³r ρ(ε·r).  (39.30)
We can now consider the term with e^{ik·x} in Ĥ_i, namely the first term, and expand it as follows:
e^{ik·x} ≈ 1 + ik·x + ⋯.  (39.31)
Considering only the zeroth-order term, i.e., 1, we have the interaction Hamiltonian at leading order,
H_i^{(1)} = A₀ ∫d³r ε(k)·j = iωA₀ ε(k)·∫d³r ρr,  (39.32)
where we have used the relation deduced above. But the integral
P ≡ ∫d³r ρr  (39.33)
is the electric dipole moment, so
H_i^{(1)} = iωA₀ ε(k)·P  (39.34)
gives rise to electric dipole radiation.


Next, consider the first-order term in ei k ·x , resulting in the next-to-leading order (NLO) Hamilto-


nian,

Hi(2) = A0 d 3 r ( (k) · j)(k · r ). (39.35)

There is now a similar relation to (39.29) that we can use to rewrite the NLO interacting Hamiltonian,


 · [( · r )(k · r ) j] = ∂i [ j x j k k x k ji ]

= δi j  j k k x k ji +  j x j δik k k ji +  j x j k k x k ∂i ji
(39.36)
= (k · r )( · j) + ( · r )(k · j) + ( · r )(k · r ) ∇
 · j

= (k · r )( · j) + ( · r )(k · j) − iωρ( · r )(k · r ),

where in the last line we used current conservation. Then, integrating over r as before, the left-hand
side gives a boundary term, assumed to vanish, so that
  
d 3 r (k · r )( · j) + ( · r )(k · j) − iωρ( · r )(k · r ) = 0. (39.37)

But we still need to rewrite the middle term in (39.36). For that, we need to use yet another relation,

( × k) · (r × j) = i jk i k j lmk x l jm


= (δil δ jm − δim δ jl )i k j x l jm (39.38)
= ( · r )(k · j) − ( · j)(k · r ).

This allows us to rewrite the middle term, resulting in a relation between integral quantities,
∫d³r (k·r)(ε·j) = (iω/2)∫d³r ρ(ε·r)(k·r) − (ε×k)·∫d³r (r×j)/2,  (39.39)
so that the interaction Hamiltonian is
H_i^{(2)} = A₀[(iω/2)ε_i k_j Q_{ij} − (ε×k)·μ].  (39.40)
Here we have defined the electric quadrupole moment
Q_{ij} ≡ ∫d³r ρ x_i x_j,  (39.41)
and the magnetic dipole moment
μ ≡ ∫d³r (r×j)/2.  (39.42)
In general, at lth subleading order, we have an electric 2^{l+1}-pole perturbation and a magnetic 2^l-pole perturbation. These are all tensors of SO(3) ≃ SU(2), which means that they are quantities with a given angular momentum.
The electric dipole P_i is a vector, the magnetic dipole μ_i is also a vector (both with j = 1), while the electric quadrupole Q_{ij} is a two-index symmetric tensor (with j = 2). In general, for higher-order terms, we also obtain tensors, T_q^{(k)}. In quantum theory, these become quantum operators, the same as those analyzed for the Wigner–Eckart theorem in Chapter 16. The theorem states that
⟨α′, j′, m′|T̂_q^{(k)}|α, j, m⟩ = ⟨j, k; m, q|j, k; j′, m′⟩ ⟨α′, j′‖T^{(k)}‖α, j⟩/√(2j + 1),  (39.43)
and the matrix elements are nonzero only for
q = m′ − m, |j − j′| ≤ k ≤ j + j′.  (39.44)
These are then selection rules for the matrix element to be nonzero, resulting in a nonzero transition probability via first-order perturbation theory.
In the case of the leading, electric dipole, approximation, we have a vector operator P_i, with k = 1, so q = m′ − m = 0, ±1; but q = 0 is only possible when P is parallel to ε, which means a longitudinal mode (which is unphysical, in the radiation region). Otherwise, we have
m′ − m = ±1, |j − j′| ≤ 1 ≤ j + j′,  (39.45)
which means that
j − j′ ≤ 1 ⇒ j′ ≥ j − 1, j′ − j ≤ 1 ⇒ j′ ≤ j + 1,  (39.46)
finally resulting in
j′ = j − 1, j, j + 1, and m′ = m ± 1.  (39.47)
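The dipole selection rules (39.45)–(39.47) can be packaged as a tiny predicate. The sketch below is not from the book's text; it simply encodes the two conditions derived above:

```python
# Electric dipole selection rules: a transition (j, m) -> (j', m') is allowed
# when m' - m = +/-1 and |j - j'| <= 1 <= j + j' (transverse radiation
# excludes m' = m, the longitudinal case discussed above).
def dipole_allowed(j, m, jp, mp):
    return abs(mp - m) == 1 and abs(j - jp) <= 1 <= j + jp

print(dipole_allowed(1, 0, 2, 1))   # True:  j' = j + 1, m' = m + 1
print(dipole_allowed(1, 0, 1, 0))   # False: m' = m (longitudinal mode)
print(dipole_allowed(0, 0, 0, 1))   # False: j = j' = 0 fails 1 <= j + j'
```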

39.4 Absorption of Photon Energy by Atom

The direct application of the formalism is to the absorption of electromagnetic energy.
The first way to derive the transition amplitude is to use the interaction in the particle picture,
Ĥ₁ = −(q/m)A·p = −(q/m)A₀ ε(k) e^{ik·x}·p,  (39.48)
where the momentum operator is understood as (ℏ/i)∇ acting on a wave function. In the electric dipole approximation, the exponential is replaced by 1, so
Ĥ₁ = −(q/m)A₀ ε(k)·p.  (39.49)
The matrix element between an initial and a final state is
H_{1,fi} = ⟨ψ_f|Ĥ₁|ψ_i⟩ = −(q/m)A₀ ε(k)·⟨ψ_f|p|ψ_i⟩ = −(q/m)A₀ ε(k)·∫d³r ψ_f*(r)(ℏ/i)∇ψ_i(r).  (39.50)

But, since for atoms we have Ĥ₀ = p²/2m + V(r),
p = (m/iℏ)[r, Ĥ₀].  (39.51)
Then we can replace p in the matrix element, and find
H_{1,fi} = −(q/m)A₀ ε(k)·⟨ψ_f|p|ψ_i⟩ = −(q/iℏ)A₀ ε(k)·⟨ψ_f|[r, Ĥ₀]|ψ_i⟩
= iqA₀ [(E_i − E_f)/ℏ] ⟨ψ_f|ε(k)·r|ψ_i⟩ = iqA₀ ω_if ∫d³r ψ_f*(r) ε(k)·r ψ_i(r).  (39.52)

But, since ρ is the electric charge density, it equals q times the probability density ψ*ψ, which means that the matrix element of ρ, ρ_fi, is given by
ρ_fi(r) = qψ_f*(r)ψ_i(r),  (39.53)
so that the interaction Hamiltonian can be rewritten as
H_{1,fi} = iA₀ω_if ε·∫d³r ρ_fi r,  (39.54)
which is the same formula as that obtained from (39.32) (which is true in the electric dipole approximation), taken between the initial and final states.
We must also give a better definition of the electric dipole approximation for hydrogenoid atoms. In this case, the transition frequency ω_1f is of the order of the ground-state energy (corresponding to the initial state being the ground state), so
ℏω_1f ∼ E₁ ∼ Ze₀²/r₁ ∼ Ze₀²/(a₀/Z),  (39.55)
where the ground-state radius is r₁ ∼ a₀/Z. Moreover, e₀²/(ℏc) ∼ 1/137, so the wavelength λ associated with ω is given by
λ/(2π) = c/ω ∼ ℏca₀/(Z²e₀²) ∼ (137/Z)(a₀/Z).  (39.56)
But we require that the wavelength be much greater than the size of the atom, λ ≫ 2πr₁, in order to use the dipole approximation, so
r₁/(λ/(2π)) ∼ (a₀/Z)/(λ/(2π)) ∼ Z/137 ≪ 1,  (39.57)
which is valid only if
Z ≪ 137,  (39.58)
i.e., for light hydrogenoid atoms.
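The estimate (39.55)–(39.58) reduces to a one-line order-of-magnitude statement, sketched numerically below (not from the book's text; the exact value of α is used only for illustration):

```python
# Dipole-approximation criterion: atom size r1 ~ a0/Z over reduced wavelength
# lambda/2pi ~ (137/Z)(a0/Z) gives a ratio ~ Z/137 = Z*alpha, from (39.57).
alpha = 1/137.035999          # fine-structure constant
def size_over_wavelength(Z):
    return Z*alpha

for Z in (1, 10, 92):
    print(Z, size_over_wavelength(Z))
# Small for hydrogen, but no longer small for heavy hydrogenoid atoms.
```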


Next, we use Fermi's golden rule for absorption, where the initial state has energy E_i for the atom and a photon of energy ℏω is to be absorbed by the atom, so the delta function for energy is δ(E_f − E_i − ℏω), giving the probability density (in time)
dP/dt = (2π/ℏ)|H_{1,fi}|² Σ_{f_r} ∫df₂ ⋯ ρ(E_f, f₂, …, f_r) dE_f δ(E_f − E_i − ℏω).  (39.59)
But we can use (39.50) to replace the matrix element in the above, obtaining
dP/dt = (2π/ℏ)(q²/m²)|A₀|² |⟨ψ_f|ε(k)·p|ψ_i⟩|² Σ_{f_r} ∫df₂ ⋯ ρ(E_f, f₂, …, f_r) dE_f δ(E_f − E_i − ℏω).  (39.60)

The absorption cross section is defined as the ratio of the absorbed energy per unit time to the incident flux (the incident energy per unit time and unit transverse area),
σ_abs = [dE/dt(abs)]/[dE_inc/(dt dS)] = ℏω[dP/dt(abs)]/F,  (39.61)
where the absorbed energy per unit time equals the photon energy ℏω times the probability per unit time, and the incident flux F is written in terms of the incident energy density E as F = Ec, so
F = dE_inc/(dt dS) = (c/2)(ε₀|E|² + |B|²/μ₀),  (39.62)
and the electric and magnetic fields are written in terms of the traveling-wave potential A as
B = ∇×A(r, t) = ik×ε(k) A₀ e^{i(k·x−ωt)},
E = −∂_t A(r, t) = iω ε(k) A₀ e^{i(k·x−ωt)},  (39.63)
so
F = ck²|A₀|²/(2π) = ω²|A₀|²/(2πc).  (39.64)

Thus the absorption cross section is
σ_abs = [4παℏ/(m²ω_fi)] |⟨ψ_f|ε(k)·p|ψ_i⟩|² Σ_{f_r} ∫df₂ ⋯ ρ(E_f, f₂, …, f_r) dE_f δ(E_f − E_i − ℏω),  (39.65)
where we have used α = q²/(ℏc).
But we can also relate the p matrix elements to those of −imω_fi r, as in (39.52), obtaining
σ_abs = 4παℏω_fi |⟨ψ_f|ε(k)·r|ψ_i⟩|² Σ_{f_r} ∫df₂ ⋯ ρ(E_f, f₂, …, f_r) dE_f δ(E_f − E_i − ℏω).  (39.66)

39.5 Photoelectric Effect

The photoelectric effect corresponds to an ionized final state of the atom, so the process is the absorption of a photon by the atom and the emission of a free electron, in a direction n, with momentum p. Then the final state is |ψ_f⟩ = |k_f⟩, a free electron with energy
E = ℏ²k_f²/(2m_e).  (39.67)
Thus the matrix element is
⟨k_f|ε·p e^{ik·x}|ψ_i⟩ = (ℏ/i) ε·∫d³r ψ_{k_f}*(r) e^{ik·r} ∇ψ_i(r),  (39.68)
where ψ_i(r) corresponds to an electron inside the hydrogenoid atom, whereas ψ_{k_f}(r) is a free wave function, normalized over energy.
In the electric dipole approximation we replace e^{ik·x} with 1, so we then obtain a matrix element proportional to
⟨ψ_f|ε·p|ψ_i⟩ = imω_fi ⟨ψ_f|ε·r|ψ_i⟩.  (39.69)
As in Chapter 37, where we analyzed the scattering-wave collision, we have f₁ = E, f₂ = n, so df₂ = dΩ. The differential cross section is then
dσ/dΩ = [ℏω dP/dt(abs)/F] dE_f ρ(E_f) δ(E_f − E_i − ℏω),  (39.70)
but if the state ψ_f is normalized in energy, as in Chapter 37, then ρ(E_f) = 1. Therefore we finally obtain
dσ/dΩ = [4παℏω_fi/(mω_fi)²] |⟨ψ_f|ε·p|ψ_i⟩|².  (39.71)
In this formula, we would need to calculate the matrix element for a given state |ψ_i⟩ of the hydrogenoid atom. Since the differential cross section depends on |ψ_i⟩, we will not compute it further here.

Important Concepts to Remember

• The linear Zeeman effect corresponds to the linear part of the interaction Hamiltonian for interaction with electromagnetism, Ĥ₁ = (iqℏ/m)A·∇, plus the interaction with the spin, leading to Ĥ₁ = −(μ_B/ℏ)(J + S)·B and energy shift ΔE = −gMμ_B B. For l = s = 0, we must consider the quadratic part of Ĥ₁ instead.
• The Paschen–Back effect corresponds to the opposite limit, in which the LS coupling is ≪ μ_B B, so the correct quantum states are |E_n L S M_L M_S⟩, and ΔE_n = −μ_B B(M_L + 2M_S).
• For electromagnetic radiation, A = A(t − n·r/c), or in particular A = A₀ε(k)[e^{i(k·x−ωt)} + e^{−i(k·x−ωt)}], in which case the charge density ρ also oscillates with ω, and iωρ + ∇·j = 0.
• When expanding e^{ik·x} into 1 + ik·x + ⋯, the first-order interaction Hamiltonian depends on the electric dipole moment P as H_i^{(1)} = iωA₀ ε(k)·P, and the second-order interaction Hamiltonian depends on the electric quadrupole moment Q_{ij} and the magnetic dipole moment μ as H_i^{(2)} = A₀[(iω/2)ε_i k_j Q_{ij} − ϵ_{ijk}ε_i k_j μ_k].
• The Wigner–Eckart theorem leads to selection rules for the matrix elements for electromagnetic transition probabilities. In particular, the leading electric dipole approximation leads to m′ = m ± 1 and j′ = j − 1, j, j + 1.
• For absorption of a photon by an atom, we obtain the matrix element H_{1,fi} = iA₀ω_if ε·∫d³r ρ_fi r, and we can use it in Fermi's golden rule. For a hydrogenoid atom, this electric dipole approximation is valid for Z ≪ 137 = 1/α.
• For the photoelectric effect, we apply the formalism for absorption of a photon by an atom, with the final state an emitted free electron.

Further Reading
See [2], [1], [3], [4].

Exercises

(1) Consider the first relativistic correction to the energy of a particle, and introduce the coupling to the electromagnetic field. Write down the resulting extra terms that are linear and quadratic in A.
(2) For the terms considered in exercise 1, find expressions for the first relativistic corrections to the
linear and quadratic Zeeman effects (ignore any extra couplings to spin).
(3) Show that the splittings of energy levels according to the Zeeman or Paschen–Back effects lead
to the same number of lines.
(4) Calculate the transition element H1, f i for the interaction of a hydrogenoid atom with a
monochromatic electromagnetic wave, in the electric dipole approximation, from the n = 1, l =
0 state to the n = 2, l = 1 state.
446 39 Applications

(5) Consider a hydrogenoid atom in the state n = 3, l = 2, m = 0. What are the possible transitions
induced by a monochromatic electromagnetic wave, to both first and second order?
(6) Calculate the absorption cross section for the transition in exercise 4.
(7) Calculate the differential cross section for the photoelectric effect for a monochromatic
electromagnetic wave of energy of 14 eV incident on a hydrogen atom in the ground state.
40 WKB Methods and Extensions: State Transitions and Euclidean Path Integrals (Instantons)

In this chapter we pick up the previously described WKB (semiclassical) methods and extend them to transitions between states in (time-dependent) perturbation theory, and we define "instanton" calculations in a Euclidean version of the path integral formalism.

40.1 Review of the WKB Method

The Schrödinger equation in one dimension, in the presence of a potential V(x), with the wave function ansatz

\psi(x, t) = \exp\left[\frac{i}{\hbar}\left(W(x) - Et\right)\right],   (40.1)

is

\left(\frac{dW}{dx}\right)^2 - 2m(E - V(x)) + \frac{\hbar}{i}\frac{d^2W}{dx^2} = 0,   (40.2)
and has the quasiclassical solution

\psi(x) = \frac{\mathrm{const}}{\sqrt{p(x)}}\exp\left[\pm\frac{i}{\hbar}\int_{x_0}^{x} dx'\,p(x') - \frac{i}{\hbar}Et\right]   (40.3)

for E > V(x), where

p(x) = \sqrt{2m(E - V(x))},   (40.4)

and the solution

\psi(x) = \frac{\mathrm{const}}{\sqrt{\chi(x)}}\exp\left[\pm\frac{1}{\hbar}\int_{x_0}^{x} dx'\,\chi(x') - \frac{i}{\hbar}Et\right]   (40.5)

for E < V(x), where

\chi(x) = \sqrt{2m(V(x) - E)}.   (40.6)

These solutions are valid only asymptotically, meaning not very close to the "turning points" x_i, where E − V(x_i) = 0.
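As a small numeric sketch of these definitions (my own example, not from the text; units ħ = m = ω = 1, with the harmonic potential as an illustrative choice), one can evaluate the classical momentum p(x) of (40.4) and the WKB phase accumulated between the turning points for the ground state energy E = 1/2, where the turning points sit at x = ±1:

```python
import numpy as np
from scipy.integrate import quad

hbar = 1.0  # units where hbar = m = omega = 1

def V(x):
    return 0.5 * x**2          # harmonic potential (illustrative choice)

E = 0.5                         # ground state energy in these units

def p(x):
    return np.sqrt(2.0 * (E - V(x)))   # classical momentum, eq. (40.4) with m = 1

# WKB phase accumulated between the turning points x = -1 and x = +1
phase, _ = quad(p, -1.0, 1.0)
print(phase)
```

For E_n = n + 1/2 the Bohr–Sommerfeld rule (recalled later in this chapter) gives \int_{x_-}^{x_+} p\,dx = (n + 1/2)\pi\hbar, so for n = 0 the phase integral above equals π/2.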
When going through these turning points, we match one type of solution to the other. If we have a decreasing potential V(x), and we transition from left to right (from higher V to lower V), we are translating

\frac{1}{[V(x) - E]^{1/4}}\exp\left(-\frac{1}{\hbar}\int_{x}^{x_1} dx'\,\sqrt{2m(V(x') - E)}\right)   (40.7)
448 40 WKB Methods and Extensions, Transitions, Instantons

into

\frac{\sqrt{2}}{[E - V(x)]^{1/4}}\sin\left(\frac{1}{\hbar}\int_{x_1}^{x} dx'\,\sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right) = \frac{\sqrt{2}}{[E - V(x)]^{1/4}}\cos\left(\frac{1}{\hbar}\int_{x_1}^{x} dx'\,\sqrt{2m(E - V(x'))} - \frac{\pi}{4}\right).   (40.8)

On the other hand, for an increasing potential, and a transition from right to left (from higher V to lower V), we need to translate

\frac{1}{[V(x) - E]^{1/4}}\exp\left(-\frac{1}{\hbar}\int_{x_2}^{x} dx'\,\sqrt{2m(V(x') - E)}\right)   (40.9)

into

\frac{\sqrt{2}}{[E - V(x)]^{1/4}}\sin\left(\frac{1}{\hbar}\int_{x}^{x_2} dx'\,\sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right) = \frac{\sqrt{2}}{[E - V(x)]^{1/4}}\cos\left(\frac{1}{\hbar}\int_{x}^{x_2} dx'\,\sqrt{2m(E - V(x'))} - \frac{\pi}{4}\right).   (40.10)
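The matching across a turning point is controlled by the exact solution of the linear-potential problem, the Airy function. As a hedged numeric check (my own example; coefficient conventions differ between treatments, so only the envelopes and the π/4 phase are compared), take ħ = 2m = 1 and V(x) = x with E = 0, so that the Schrödinger equation becomes ψ'' = xψ, solved exactly by Ai(x); the decaying WKB form on the forbidden side and the oscillatory form with the π/4 phase on the allowed side both track Ai:

```python
import numpy as np
from scipy.special import airy

# With hbar = 2m = 1 and V(x) = x, E = 0: psi'' = x psi, exact solution Ai(x).
def ai(x):
    return airy(x)[0]   # airy returns (Ai, Ai', Bi, Bi')

# Forbidden side (x > 0): chi(x) = sqrt(x), phase integral (2/3) x^(3/2);
# standard Airy asymptotics carry the coefficient 1/(2 sqrt(pi))
x = 3.0
wkb_forbidden = np.exp(-(2.0 / 3.0) * x**1.5) / (2.0 * np.sqrt(np.pi) * x**0.25)

# Allowed side (x = -z < 0): p = sqrt(z), phase (2/3) z^(3/2), shifted by pi/4
z = 5.0
wkb_allowed = np.sin((2.0 / 3.0) * z**1.5 + np.pi / 4.0) / (np.sqrt(np.pi) * z**0.25)

print(ai(3.0), wkb_forbidden)   # agree to a few percent
print(ai(-5.0), wkb_allowed)    # agree to a few percent
```

The agreement improves as one moves further from the turning point, consistent with the statement that the WKB forms are only asymptotic.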
In the path integral formalism, the propagator

U(t, t_0) = \int Dx(t)\exp\left(\frac{i}{\hbar}S[x(t)]\right),   (40.11)

with boundary conditions x(0) = x_0 and x(t) = x, is approximated in the "saddle point approximation" by the action of the classical solution times a Gaussian integration for the quadratic fluctuation, giving a quasiclassical determinant,

U(t, t_0) \simeq \exp\left(\frac{i}{\hbar}S[x_{cl}(t)]\right)\left[\det(\text{kinetic operator})\right]^{-1/2} = \exp\left(\frac{i}{\hbar}\int_{x_0}^{x} dx'\,\sqrt{2m(E - V(x'))} - \frac{i}{\hbar}Et\right) \times \left[\det\left(-m\frac{d^2}{dt^2} - (V(x) - E)\right)\right]^{-1/2}.   (40.12)

After integrating over the initial position x_0, made to be identical with the final position x, we obtain the correct propagator U(t, t_0) for the harmonic oscillator case.

40.2 WKB Matrix Elements

We want to calculate matrix elements

A_{12} \equiv \langle\psi_1|\hat A|\psi_2\rangle,   (40.13)

where |ψ_1⟩, |ψ_2⟩ are eigenstates of the Hamiltonian Ĥ with energies E_1, E_2. Assume for concreteness that E_2 > E_1. The analysis below follows the work of Landau (described in the book by Landau and Lifshitz).

We consider only operators Â that are functions in the x representation, as for instance a potential V(x). Then

A_{12} = \int dx\int dy\,\langle\psi_1|x\rangle\langle x|\hat A|y\rangle\langle y|\psi_2\rangle = \int dx\int dy\,\psi_1^*(x)\,A_{xy}\,\psi_2(y),   (40.14)

and A_{xy} = \delta(x - y)f(x). Assuming also that ψ_1(x) and ψ_2(x) are real, we have

A_{12} = f_{12} = \int dx\,\psi_1(x)f(x)\psi_2(x).   (40.15)

Consider a potential V(x) that is decreasing, and consider a turning point a_1 for E_1 such that V(a_1) = E_1, and a turning point a_2 for E_2 such that V(a_2) = E_2, implying a_1 > a_2. Define further

p_1(x) = p_{E_1}(x),\quad \chi_1(x) = \chi_{E_1}(x),\qquad p_2(x) = p_{E_2}(x),\quad \chi_2(x) = \chi_{E_2}(x).   (40.16)

Then, according to the general theory reviewed before, away from the turning point a_1 the wave function ψ_1(x) corresponding to E_1 is

\psi_1 = \frac{C_1}{\sqrt{2}\sqrt{\chi_1(x)}}\exp\left(-\frac{1}{\hbar}\int_{x}^{a_1}\chi_1(x')dx'\right),\quad x < a_1,
\psi_1 = \frac{C_1}{\sqrt{p_1(x)}}\cos\left(\frac{1}{\hbar}\int_{a_1}^{x}p_1(x')dx' - \frac{\pi}{4}\right),\quad x > a_1,   (40.17)

and similarly for ψ_2(x).
However, we cannot use both these solutions to calculate f_{12}. First, the region near the turning points gives a large contribution to the integral, since the wave function blows up there; and second, the turning point of one state is not a turning point for the other, so each encroaches on the region that was good from the point of view of the other.
Instead, we write ψ_2 = ψ_2^+ + ψ_2^-, where ψ_2^+ is complex and ψ_2^- = (ψ_2^+)^*. That is, for the solution at x > a_2 we write 2cos[...] as e^{i[...]} + e^{-i[...]}, a sum of complex conjugate terms, corresponding to ψ_2^+ and ψ_2^-.
But now we can take advantage of an alternative correspondence rule through the turning points, for complex exponential solutions, namely (from x > a_2 to x < a_2)

\frac{C}{\sqrt{p(x)}}\exp\left(\frac{i}{\hbar}\int_{a_2}^{x}p(x')dx' + i\frac{\pi}{4}\right) \to \frac{C}{\sqrt{|p(x)|}}\exp\left(\frac{1}{\hbar}\left|\int_{a_2}^{x}p(x')dx'\right|\right),   (40.18)

thus defining ψ_2^+ also for x < a_2. We obtain (note that −i = e^{−iπ/2})

\psi_2^+ = \frac{-iC_2}{\sqrt{2}\sqrt{\chi_2(x)}}\exp\left(\frac{1}{\hbar}\left|\int_{a_2}^{x}p_2(x')dx'\right|\right),\quad x < a_2,
\psi_2^+ = \frac{C_2}{2\sqrt{p_2(x)}}\exp\left(\frac{i}{\hbar}\int_{a_2}^{x}p_2(x')dx' - i\frac{\pi}{4}\right),\quad x > a_2.   (40.19)
Having defined ψ_2^+, and ψ_2^- = (ψ_2^+)^*, we can also split f_{12} similarly, as f_{12} = f_{12}^+ + f_{12}^-, with

f_{12}^+ = \int_{-\infty}^{+\infty}\psi_1(x)f(x)\psi_2^+(x)\,dx.   (40.20)

Next, we consider moving x from the real line onto the complex plane, specifically the upper half-
plane (the positive imaginary part). In that case, if the exponent is a function g(x) that maps the
upper half-plane to itself, eig(x) → e−Img(x) goes to zero at infinity, so is better defined near |x| → ∞.
Then, we consider an integration contour C for x in the upper half-plane that is slightly above the
real line, thus avoiding the turning points at real a1 , a2 and so being well defined. For that, however,
we must define, in the whole half-plane but away from a1 , a2 ,
  x 
C1 1
ψ1 (x) = √ √ exp χ1 (x  )dx 
2 χ1 (x)  a1
  x  (40.21)
+ −iC2 1  
ψ2 (x) = √ √ exp − χ2 (x )dx .
2 χ2 (x)  a2
+
Then we can calculate the amplitude f 12 as
   x  x 
+ −iC1 C2 f (x) 1
f 12 = dx √ exp dx  χ1 (x  ) − dx  χ2 (x  ) . (40.22)
2 C χ1 (x)χ2 (x)  a1 a2

The exponent in the above formula has an extremum x_0, i.e., a zero derivative with respect to x, only for V(x_0) = ∞ if E_1 ≠ E_2. Indeed,

\frac{d}{dx}\left(\int_{a_1}^{x}dx'\,\chi_1(x') - \int_{a_2}^{x}dx'\,\chi_2(x')\right) = 0 \;\Rightarrow\; \sqrt{2m(V(x) - E_1)} - \sqrt{2m(V(x) - E_2)} \simeq \frac{m(E_2 - E_1)}{\sqrt{2m(V(x) - E_1)}} = 0   (40.23)

has only the solution V(x) → ∞.
Assume that there is a single pole of V(x) in the upper half-plane, i.e., V(x_0) → ∞. Then the integral over the contour C (slightly above the real line) is equal to the integral over a contour C′ that goes vertically down from infinity to x_0, encircles it from below (very close to it), then goes vertically back up again. Indeed, we can add for free two quarter-circles at infinity in the upper half-plane (since, as we just said, the contribution at infinity vanishes there) that connect C with −C′ at infinity, and there is no pole inside the resulting total contour, as in Fig. 40.1. We can thus use the residue theorem of complex analysis to say that the integral over the whole contour is zero (the integral equals the sum of residues at the poles inside the contour), or, since the added integral at infinity vanishes anyway, that the integral over C is equal to the integral over C′.

Figure 40.1 Integration over the contour C, slightly above the real axis in x, can be reduced to integration over the contour C′, encircling x_0, down and up from infinity, indicated by the dashed curve.

Then f_{12}^+ is found to be the same as (40.22), but integrated over C′, and can thus be approximated (since the further up we go in the imaginary direction, the more we add to the exponent in e^{−Im[...]}, making the contribution exponentially small) by the value of the exponential near the extremum x_0. Moreover, then

f_{12} = 2\,\mathrm{Re}\,f_{12}^+ \propto 2\,\mathrm{Im}\exp[\ldots],   (40.24)

so that finally we have the approximation (note that f(x_0) gives a subleading factor, which can be neglected compared with the exponential)

f_{12} \sim \exp\left[-\frac{1}{\hbar}\,\mathrm{Im}\left(\int^{x_0}dx'\,p_2(x') - \int^{x_0}dx'\,p_1(x')\right)\right].   (40.25)
If the two energies E_1 and E_2 are close to each other, i.e.,

E_{1,2} \equiv E \mp \frac{1}{2}\hbar\omega_{21},   (40.26)

with ħω_{21} ≪ E, we can also make the approximations

p_2(x) \simeq p_E(x) + \frac{\partial p_E(x)}{\partial E}\frac{\hbar\omega_{21}}{2},\qquad p_1(x) \simeq p_E(x) - \frac{\partial p_E(x)}{\partial E}\frac{\hbar\omega_{21}}{2}.   (40.27)

Then, since

\frac{\partial p_E(x)}{\partial E} = \sqrt{\frac{m}{2(E - V(x))}},   (40.28)

we find that the transition amplitude matrix element is approximated by

f_{12} \sim \exp\left(-\omega_{21}\,\mathrm{Im}\int^{x_0}dx\,\sqrt{\frac{m}{2(E - V(x))}}\right) = e^{-\omega_{21}\,\mathrm{Im}\,\tau},   (40.29)

where we have defined the complex time

\tau \equiv \int^{x_0}\frac{dx}{v(x)},   (40.30)

and where we have also defined the complex velocity

v(x) \equiv \sqrt{\frac{2(E - V(x))}{m}}.   (40.31)
Thus the transition amplitude (a matrix element) is approximated by a decay factor in complex time, found from an extremum over the complex plane of the action, i.e., of the exponent of the WKB approximation.
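As a numeric illustration of (40.29)–(40.31) (my own example, not from the text; units ħ = m = 1 and parameter values are arbitrary choices), take V(x) = −A/(x² + a²), which has V → ∞ at the pole x₀ = ia in the upper half-plane. On the segment of the contour running up the imaginary axis, x = isa with 0 ≤ s < 1, the combination E − V stays real and positive, so dx/v(x) is purely imaginary there and Im τ reduces to a one-dimensional real integral (the real-axis part of the path contributes only to Re τ):

```python
import numpy as np
from scipy.integrate import quad

# Units hbar = m = 1; A, a, E are illustrative choices
A, a, E = 1.0, 1.0, -0.1

# On x = i s a: x^2 + a^2 = a^2 (1 - s^2) > 0, so V is real and
# E - V = E + A / (a^2 (1 - s^2)) > 0; dx = i a ds is purely imaginary,
# hence dx / v(x) contributes only to Im(tau).
def integrand(s):
    v = np.sqrt(2.0 * (E + A / (a**2 * (1.0 - s**2))))
    return a / v

im_tau, _ = quad(integrand, 0.0, 1.0)
print(im_tau)   # finite: the integrand vanishes like sqrt(1 - s^2) at the pole
```

The resulting f₁₂ ∼ e^{−ω₂₁ Im τ} then decays exponentially with the level spacing ω₂₁, as the text states.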

40.3 Application to Transition Probabilities

The transition probability relations contain, as we saw in previous chapters, the modulus squared of transition amplitudes for the interaction Hamiltonian,

|\langle\psi_m|\hat H_1|\psi_i\rangle|^2,   (40.32)

and usually the interaction Hamiltonian Ĥ_1 is a certain function of x, f(x).

Then, time-dependent perturbation theory, in the sudden approximation (the interaction Hamiltonian jumps from 0 to a constant at t_0), gives

|b_m(t)|^2 = \frac{2\pi}{\hbar}\,t\,|\langle\psi_m|\hat H_1|\psi_i\rangle|^2\,\delta(E_f - E_i),   (40.33)

or Fermi's golden rule,

P(i \to m) = \frac{2\pi}{\hbar}\,t\int df_2\cdots\int df_r\,\rho(E_i, f_2, \ldots, f_r)\,|\langle\psi_m|\hat H_1|\psi_i\rangle|^2 \sim |\langle\psi_m|\hat H_1|\psi_i\rangle|^2.   (40.34)

But then we have, for the amplitude,

\langle\psi_m|\hat H_1|\psi_i\rangle \sim \exp\left[-\frac{1}{\hbar}\,\mathrm{Im}\left(\int_{x_1}^{x_0}dx'\,p_m(x') - \int_{x_2}^{x_0}dx'\,p_i(x')\right)\right].   (40.35)

Since, however, the integral in the exponent is, according to the Bohr–Sommerfeld quantization discussed in Chapter 26, the action,

\int_{x_1}^{x_0}dx\,p(x) = \text{action} = S(x_1, x_0),   (40.36)

we obtain that the leading factor in the transition probability is

P(i \to m) \sim \exp\left(-\frac{2}{\hbar}\,\mathrm{Im}\left[S(x_1, x_0) + S(x_0, x_2)\right]\right).   (40.37)
This means that we have an action in the exponent for a path in the complex plane, in terms of the (complex) time τ and (complex) position x, together giving the (complex) path x(τ), which starts at x_1, goes to x_0 (the extremum of the action S in complex time), and then goes on to x_2. Here x_1 and x_2 are the positions corresponding to the two states ψ_i and ψ_m for which we are calculating the transition probability.

40.4 Instantons for Transitions between Two Minima

The previous explanation motivates a more precise analysis for the case of a transition between two
local minima. For this, however, we need to define path integrals in Euclidean space first.

Path Integrals in Euclidean Space

In Chapter 28 we introduced path integrals in Euclidean (imaginary) time, though the motivation
there was to connect with statistical mechanics. Here, the main motivation is to have a formalism
that allows for a natural “classical path” in imaginary time.
Thus we consider the propagator written as a path integral,

U(t, t_0) = \int_{x_i = x(t_i)}^{x_f = x(t_f)}Dx(t)\,e^{iS[x(t)]/\hbar},   (40.38)

and it can be Wick rotated to Euclidean (imaginary) time by setting t_E = it (so we are considering t = −it_E, where now t_E ∈ ℝ), which changes the Minkowski metric of spacetime to the Euclidean one,

-c^2dt^2 + d\vec{x}^2 = +c^2dt_E^2 + d\vec{x}^2.   (40.39)

We define the exponent in the path integral, iS[x], as −S_E[x],

iS[x] = i\int(-i\,dt_E)\left[\frac{1}{2}\left(\frac{dx}{d(-it_E)}\right)^2 - V(x)\right] \equiv -S_E[x],   (40.40)

which defines the Euclidean action as

S_E[x] = \int dt_E\left[\frac{1}{2}\left(\frac{dx}{dt_E}\right)^2 + V(x)\right] = \int dt_E\,L_E(x, \dot{x}).   (40.41)

This allows the propagator to be better defined, as

U(t, t_0) = \int Dx(t_E)\exp\left(-\frac{1}{\hbar}S_E[x(t_E)]\right),   (40.42)

but more importantly, it now takes real values for real Euclidean time t_E instead of real normal time t.
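A minimal numeric sketch of (40.42) (my own construction, with ħ = m = ω = 1 and the harmonic oscillator as an illustrative choice): discretize Euclidean time in steps dτ and approximate the short-time Euclidean propagator by the symmetric (Trotter-split) kernel K(x', x) = (2π dτ)^{−1/2} exp[−(x'−x)²/(2dτ) − dτ(V(x')+V(x))/2]. On a grid, the largest eigenvalue of this kernel gives the ground state energy through λ_max ≈ e^{−E₀ dτ}:

```python
import numpy as np

# Units hbar = m = omega = 1; harmonic oscillator as an illustrative choice
def V(x):
    return 0.5 * x**2

dtau = 0.1
x = np.linspace(-6.0, 6.0, 241)          # spatial grid
dx = x[1] - x[0]

# Symmetric-split Euclidean short-time kernel, including the dx integration weight
X1, X2 = np.meshgrid(x, x)
K = (np.sqrt(1.0 / (2.0 * np.pi * dtau))
     * np.exp(-(X1 - X2)**2 / (2.0 * dtau)
              - dtau * (V(X1) + V(X2)) / 2.0)) * dx

lam = np.linalg.eigvalsh(K).max()        # largest eigenvalue of the symmetric kernel
E0 = -np.log(lam) / dtau
print(E0)                                 # close to the exact value 1/2
```

The Trotter error is O(dτ²), so already at dτ = 0.1 the estimate is within a fraction of a percent of E₀ = ħω/2 = 1/2 in these units.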
Assume that V(x) has two minima, x_1 and x_2, with the same value V_0, thus V'(x_{1,2}) = 0 with V''(x_{1,2}) > 0, which necessarily means that there is a local maximum x_0 in between them, with V'(x_0) = 0 and V''(x_0) < 0.
These extrema of the potential are then also extrema of the Euclidean action S_E[x]: namely, x(t) = x_1 and x(t) = x_2 satisfy the classical equations of motion for the Euclidean action,

\frac{\delta S_E}{\delta x(t_E)} = -\frac{d^2x}{dt_E^2} + \frac{dV}{dx} = 0.   (40.43)
Compared with the classical equations of motion for the normal action,

\frac{d^2x}{dt^2} + \frac{dV}{dx} = 0,   (40.44)

we see that we effectively have an inverted potential,

V_E(x) = -V(x).   (40.45)

Consider then the classical solution in this inverted potential V_E(x), i.e., the classical solution of the Euclidean action with the given boundary conditions; it is called an instanton. Note that V_E(x) has two maxima, at x_1 and x_2, and a local minimum x_0 in between them, as in Fig. 40.2.

Figure 40.2 Instanton solution in between the two maxima x_1, x_2 of V_E, passing through the local minimum x_0 (equivalently, two minima of V, with the solution passing through the local maximum x_0).

Then we can consider an instanton solution that asymptotically goes to x_1 at t_i → −∞ and to x_2 at t_f → +∞, passing through x_0 at a finite time. Physically, this motion in the inverted potential V_E(x) corresponds to a particle initially (at t_i → −∞) sitting at the maximum (an unstable point) at x_1 and receiving an infinitesimal kick, after which it goes down the hill, through the local minimum at x_0, and on to the maximum at x_2, reached asymptotically as t_f → +∞.
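For the double-well example V(x) = α(x² − a²)² (anticipating exercise 5; units ħ = m = 1 and the parameter values are my own choices), the instanton has a closed form, x_cl(t_E) = a tanh(a√(2α) t_E), and, because the zero-"energy" motion in −V gives ẋ = √(2V), its Euclidean action is S_E = ∫√(2V) dx = 4√(2α) a³/3. Both facts are easy to check numerically:

```python
import numpy as np
from scipy.integrate import quad

alpha, a = 1.0, 1.0                      # illustrative parameters, units hbar = m = 1

def V(x):
    return alpha * (x**2 - a**2)**2      # double well with minima at x = -a, +a

def x_cl(t):
    return a * np.tanh(a * np.sqrt(2.0 * alpha) * t)   # instanton solution

# Check the Euclidean EOM x'' = dV/dx by central finite differences
t, h = 0.3, 1e-5
acc = (x_cl(t + h) - 2.0 * x_cl(t) + x_cl(t - h)) / h**2
dVdx = 4.0 * alpha * x_cl(t) * (x_cl(t)**2 - a**2)
print(acc, dVdx)                         # equal up to finite-difference error

# Zero-"energy" motion in -V gives S_E = int sqrt(2 V(x)) dx from -a to a
S_num, _ = quad(lambda x: np.sqrt(2.0 * V(x)), -a, a)
S_exact = 4.0 * np.sqrt(2.0 * alpha) * a**3 / 3.0
print(S_num, S_exact)
```

This S_E is exactly what enters the exponent of the transition probability discussed below.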
If t_i, t_f ∈ ℝ are asymptotically at infinity, they correspond to t_{E,i}, t_{E,f} ∈ iℝ, which is not what we want. However, in the complex t plane this corresponds to integration over the contour C and, as we saw, this is equivalent to integration over the contour C′, where x_0 is now, rather more generally, not a pole of the potential but the extremum of the action appearing in the exponent, situated in between the initial and final points, meaning the same x_0 as was defined just above. Then x(t) is the same complex path as that described in the WKB matrix element analysis.
In this case, we can now use the general formulas of Chapter 38 relating transition amplitudes to the propagator,

b_n(t) = \langle\psi_n|U_I(t, t_0)|\psi_i\rangle.   (40.46)

Here we take |ψ_i⟩ to be the quantum state corresponding to the position x_1 at t_i → −∞ and |ψ_m⟩ the quantum state corresponding to the position x_2 at t_f → +∞. Then we obtain for the transition amplitude coefficient

b_m(t) \sim \int Dx(t)\exp\left(\frac{i}{\hbar}S[x(t)]\right) = \int Dx(t_E)\exp\left(-\frac{1}{\hbar}S_E[x(t_E)]\right).   (40.47)

A saddle point approximation for the path integral gives just the exponent corresponding to the minimum of the Euclidean action S_E, meaning the Euclidean action evaluated on the classical solution (which extremizes S_E), i.e., the instanton; thus

b_m(t) \sim \exp\left(-\frac{1}{\hbar}S_E[x_{cl}(t_E)]\right) = \exp\left(-\frac{1}{\hbar}\left(S_E[x_1, x_0] + S_E[x_0, x_2]\right)\right) = \exp\left(\frac{i}{\hbar}\left(S[x_1, x_0] + S[x_0, x_2]\right)\right),   (40.48)

where we have split the classical instanton solution into two halves, one going from x_1 to x_0 and the other from x_0 to x_2, and we have gone back to the normal action by setting −S_E = iS. Finally, then, the probability of transition is

P(i \to m) = |b_m(t)|^2 \sim \exp\left(-\frac{2}{\hbar}\,\mathrm{Im}\left(S[x_1, x_0] + S[x_0, x_2]\right)\right),   (40.49)

which is the same formula as that obtained above from the WKB method for matrix elements.
We can also consider the next correction to this approximate formula, a further "quasiclassical approximation" around the classical Euclidean value. This corresponds to considering fluctuations around the classical path in complex space, i.e., around the instanton. We write the quadratic fluctuations around the instanton as follows:

S_E \simeq S_E[x_{cl}(t_E)] + \frac{1}{2}\frac{\delta^2S_E}{\delta x^2(t_E)}[x_{cl}(t_E)]\,(x - x_{cl}(t_E))^2,   (40.50)

and then, doing the Gaussian integration over x, we obtain the subleading fluctuation determinant contribution as well,

e^{-S_{E,0}/\hbar}\left\{\det\left[-\frac{d^2}{dt_E^2} + V''(x_{cl}(t_E))\right]\right\}^{-1/2}.   (40.51)

We can also consider other contributions to the path integral, such as multi-instanton solutions and
fluctuations around them.
In the simple case considered here, with only two minima of the potential, we only have an
instanton solution, going from x 1 to x 2 in Euclidean time, and an anti-instanton solution, going from
x 2 to x 1 . We could have a quasi-solution that goes from x 1 to x 2 in a large, but not infinite time, and
then back to x 1 in the same large time, which would be an instanton–anti-instanton solution.
However, if instead of the potential with two minima we had a periodic potential, with an infinite
number of minima, we could consider a multi-instanton (quasi-)solution, where we go from one
minimum to the next, then to the next, etc. In this case, in the path integral we must sum over all the
possible multi-instantons, and the fluctuations around them, but we will not explore this possibility
further here.

Important Concepts to Remember

• We can use the WKB method to calculate matrix elements f_{12} ≡ ⟨ψ_1|Â|ψ_2⟩ between two eigenstates of the Hamiltonian, by first going to the complex plane (more specifically, the upper half-plane) for x; consequently, for the exponent in the WKB approximation for f_{12}, we have e^{ig(x)} → e^{−Im g(x)}.
• Then, minimizing this exponent over the complex plane, we obtain

f_{12} \sim \exp\left(-\frac{E_2 - E_1}{\hbar}\,\mathrm{Im}\,\tau\right),

with τ = \int^{x_0} dx/v(x), where x_0 is the (unique) point in the upper half-plane where V(x_0) → ∞, the integral only goes near it, and v(x) = \sqrt{2(E - V(x))/m}.
• Using Bohr–Sommerfeld quantization, the transition probability is found as P(i → m) ∼ \exp\left(-\frac{2}{\hbar}\,\mathrm{Im}[S(x_1, x_0) + S(x_0, x_2)]\right), for a path in the complex plane in complex x and τ, from x_1 to x_2 via x_0, the extremum of the action in the complex plane.
• In a more precise formalism, we define path integrals in Euclidean (imaginary) time, with propagator U(t, t_0) = \int Dx(t_E)\exp\left(-\frac{1}{\hbar}S_E(x(t_E))\right).
• Motion in Euclidean time is motion in the inverted potential, V_E(x) = −V(x), so motion between two minima of V(x) via a maximum x_0 in between them becomes motion between two maxima of V_E(x) via a minimum x_0 in between them, called an instanton.
• Since b_m(t) = ⟨ψ_m|U(t, t_0)|ψ_0⟩, the probability becomes the modulus squared of the Euclidean path integral for the instanton, which at first order gives the previous formula, P(i → m) ∼ \exp\left(-\frac{2}{\hbar}S_E[x_{cl}(t_E)]\right), in terms of the instanton action. Fluctuations around the instanton, as well as multi-instanton contributions, give corrections to P(i → m).

Further Reading
For more on WKB methods for matrix elements, see [4], and for more on instantons, in both quantum
mechanics and quantum field theory, see [10].

Exercises

(1) Use the WKB method, and Bohr–Sommerfeld quantization, to estimate the matrix element ⟨2|exp(−a∂_x)|1⟩ for a harmonic oscillator.
(2) Consider the potential V = −A/(x² + a²), and states |1⟩, |2⟩, the first two eigenstates in this potential. Using the methods in the text, estimate the matrix element f_{12} ≡ ⟨1|x|2⟩.
(3) Can the instanton be associated with the motion of a real particle, albeit nonclassical (perhaps a
quantum path)?
(4) Consider a theory with only two minima of the potential, at x 1 and x 2 . Sum the series of
noninteracting multi-instantons that start at x 1 and end at x 2 , having been through a continuous
motion.
(5) Consider the Higgs-type potential V = α(x 2 − a2 ) 2 , and a transition between the two vacua
(minima of the potential) x 1 = −a and x 2 = +a. Calculate the transition probability in the
classical one-instanton approximation.
(6) Consider the periodic potential V = A cos2 (ax), and a transition between the first two
vacua (minima) of x > 0. Calculate the transition probability in the classical one-instanton
approximation.
(7) Calculate the (formal) fluctuation correction to the instanton calculation in exercise 6.
41 Variational Methods

In this chapter, we will study another type of approximation method: variational methods. They give an approximation to eigenenergies and eigenstates by considering variations over a certain subset of functions representing the states.

41.1 First Form of the Variational Method

The first form of the variational method applies to finding the energy of the ground state of a system.

Theorem This method starts from a simple theorem, expressed as follows. The energy of a state is always equal to or larger than that of the ground state:

E_\psi \equiv \frac{\langle\psi|\hat H|\psi\rangle}{\langle\psi|\psi\rangle} \geq E_0,\quad \forall\,|\psi\rangle.   (41.1)

Proof Consider an orthonormal basis of exact eigenstates |n⟩ of the Hamiltonian Ĥ,

\hat H|n\rangle = E_n|n\rangle,   (41.2)

among which E_0 is the smallest, i.e., the ground state energy.
Since this is a basis, it is a complete set,

\sum_n |n\rangle\langle n| = 1.   (41.3)

Then expand the state |ψ⟩ in this basis,

|\psi\rangle = \sum_n c_n|n\rangle = \sum_n |n\rangle\langle n|\psi\rangle.   (41.4)

Since the states |n⟩ are an orthonormal set of eigenstates,

\langle m|\hat H|n\rangle = E_n\delta_{mn}.   (41.5)

Then it follows that

E_\psi = \frac{\sum_{m,n}\langle\psi|m\rangle\langle m|\hat H|n\rangle\langle n|\psi\rangle}{\sum_n\langle\psi|n\rangle\langle n|\psi\rangle} = \frac{\sum_n|\langle n|\psi\rangle|^2E_n}{\sum_n|\langle n|\psi\rangle|^2} \geq \frac{\sum_n|\langle n|\psi\rangle|^2E_0}{\sum_n|\langle n|\psi\rangle|^2} = E_0.   (41.6)

q.e.d.
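The theorem is easy to confirm numerically (a toy check of my own, using a random Hermitian matrix as a stand-in for Ĥ): every Rayleigh quotient lies at or above the smallest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random Hermitian "Hamiltonian" on a 6-dimensional Hilbert space
M = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
H = (M + M.conj().T) / 2.0
E0 = np.linalg.eigvalsh(H).min()         # exact ground state energy

# Rayleigh quotients E_psi of random (unnormalized) states
quotients = []
for _ in range(1000):
    psi = rng.normal(size=6) + 1j * rng.normal(size=6)
    quotients.append((psi.conj() @ H @ psi).real / (psi.conj() @ psi).real)

print(min(quotients), E0)                # the minimum quotient is still >= E0
```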
458 41 Variational Methods

From this theorem, it follows that we can choose any basis |α⟩ for the Hilbert space (not just an eigenbasis of Ĥ), expand in it,

|\psi\rangle = \sum_\alpha a_\alpha|\alpha\rangle = \sum_\alpha |\alpha\rangle\langle\alpha|\psi\rangle,   (41.7)

and then the ground state energy can be found as a minimum over the whole space of {a_α} coefficients,

E_\psi = \frac{\sum_{\alpha,\beta}a_\alpha^*a_\beta H_{\alpha\beta}}{\sum_\alpha a_\alpha^*a_\alpha} \geq E_0 = \frac{\langle 0|\hat H|0\rangle}{\langle 0|0\rangle}.   (41.8)

The minimum conditions,

\frac{\partial E_\psi}{\partial a_\alpha} = 0,\quad \forall\alpha,   (41.9)

will then select the ground state: E_{\psi,min} = E_0 \Rightarrow |\psi\rangle = |0\rangle.

41.2 Ritz Variational Method

However, the above is in general impractical, since in most cases the Hilbert space is either infinite dimensional or of very high dimension, and moreover one has to orthonormalize the basis states, making full minimization extremely hard if not impossible.
One particular variational method that is practical is the Ritz method, which amounts to considering a subset of states {|ψ_i⟩}_i, not even necessarily an orthonormal subset (a sub-basis), and defining

|\psi\rangle \equiv \sum_i c_i|\psi_i\rangle,   (41.10)

where the c_i are arbitrary complex coefficients.
Then we also define

E_\psi = \frac{\sum_{i,k}c_k^*c_i\langle\psi_k|\hat H|\psi_i\rangle}{\sum_{i,k}c_k^*c_i\langle\psi_k|\psi_i\rangle} = \frac{\sum_{i,k}c_k^*c_iH_{ki}}{\sum_{i,k}c_k^*c_i\Delta_{ki}},   (41.11)

where we have denoted

\langle\psi_k|\hat H|\psi_i\rangle = H_{ki},\qquad \langle\psi_k|\psi_i\rangle = \Delta_{ki}.   (41.12)

Treat the c_k and c_k^* as independent variables (this is always possible, being equivalent to writing c_i = a_i + ib_i with a_i and b_i independent real variables). Then the stationarity (minimum) equations, with respect to the variables c_k^*, are

\frac{\partial E_\psi}{\partial c_k^*} = 0,   (41.13)

which gives

\frac{1}{\sum_{i,k}c_k^*c_i\Delta_{ki}}\left[\sum_i c_iH_{ki} - E_\psi\sum_i c_i\Delta_{ki}\right] = 0,   (41.14)

or equivalently

\sum_i c_i\left(H_{ki} - E_\psi\Delta_{ki}\right) = 0,   (41.15)

where E_ψ is E at the minimum, leading to an eigenvalue equation for E,

\det(H_{ki} - E\Delta_{ki}) = 0.   (41.16)
Lemma The good thing about this method is that, even if we incur a significant error in finding the correct state (in this case |0⟩), say

|\psi\rangle - |0\rangle \sim O(\epsilon),   (41.17)

the error in finding the correct energy is much smaller,

E - E_0 \sim O(\epsilon^2).   (41.18)

Proof We first trivially rewrite the form of the energy of the state as

E_\psi = \frac{\sum_n|\langle n|\psi\rangle|^2(E_n - E_0)}{\sum_n|\langle n|\psi\rangle|^2} + E_0.   (41.19)

We also use normalized states,

|\tilde\psi\rangle \equiv \frac{|\psi\rangle}{\sqrt{\langle\psi|\psi\rangle}} = \frac{|\psi\rangle}{\sqrt{\sum_n|\langle n|\psi\rangle|^2}}.   (41.20)

Then, if we have

\langle n|\tilde\psi\rangle \sim O(\epsilon)   (41.21)

for n ≠ 0, then

E_\psi - E_0 = \sum_n|\langle n|\tilde\psi\rangle|^2(E_n - E_0) \sim O(\epsilon^2).   (41.22)

q.e.d.
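The lemma is transparent in a two-level toy model (my own example): take E₀ = 0, E₁ = 1 and the trial state |ψ⟩ = |0⟩ + ε|1⟩, so the state error is O(ε) while the Rayleigh quotient is E_ψ = ε²/(1 + ε²) = O(ε²):

```python
# Two-level check that an O(eps) state error gives an O(eps^2) energy error
E0, E1 = 0.0, 1.0

for eps in (1e-1, 1e-2, 1e-3):
    # |psi> = |0> + eps |1> (unnormalized); Rayleigh quotient:
    E_psi = (E0 + eps**2 * E1) / (1.0 + eps**2)
    print(eps, E_psi)        # E_psi tracks eps^2, not eps
```

Shrinking ε by a factor of 10 shrinks the energy error by a factor of 100, which is the content of (41.18).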

41.3 Practical Variational Method

An even more practical version of the variational method is written explicitly in terms of wave functions, as opposed to states.
Consider a wave function depending on parameters a_1, ..., a_n,

\psi(\vec x) \equiv \langle\vec x|\psi\rangle = \psi(\vec x; a_1, \ldots, a_n).   (41.23)

Then minimize the energy of the state E_ψ over the parameters a_1, ..., a_n (instead of over coefficients of states or, given functions in an expansion, over parameters of a function):

\frac{\partial E_\psi}{\partial a_i} = 0,\quad \forall i.   (41.24)

Thus, as in the case of the Ritz method, we obtain an approximation for the eigenenergy that is better, namely O(ε²), than the approximation for the eigenstates, which is O(ε).
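A quick sketch of the practical method (my own example, units ħ = m = 1): for the anharmonic Hamiltonian H = p²/2 + x⁴ with the normalized Gaussian trial ψ(x; a) ∝ e^{−ax²/2}, standard Gaussian moments give ⟨p²⟩/2 = a/4 and ⟨x⁴⟩ = 3/(4a²), so E(a) = a/4 + 3/(4a²), which is then minimized over the single parameter a:

```python
from scipy.optimize import minimize_scalar

# Trial psi ~ exp(-a x^2 / 2):  <p^2>/2 = a/4  and  <x^4> = 3/(4 a^2)
def E(a):
    return a / 4.0 + 3.0 / (4.0 * a**2)

res = minimize_scalar(E, bounds=(0.1, 10.0), method="bounded")
a_min, E_min = res.x, res.fun
print(a_min, E_min)   # a_min = 6**(1/3) ~ 1.82, E_min ~ 0.68
```

By the theorem above, E_min is an upper bound on the true ground state energy of this quartic oscillator, and it is already close, even though a single Gaussian is a fairly crude trial function.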

41.4 General Method

We can now obtain the most general form of the variational method, which is in fact valid for any Hermitian operator Â, even though it is generally used for Ĥ. As before, we define

E_\psi = \langle\tilde\psi|\hat H|\tilde\psi\rangle = \frac{\langle\psi|\hat H|\psi\rangle}{\langle\psi|\psi\rangle}.   (41.25)
This general method is also based on a theorem.

Theorem With the above definitions, and using normalized states |ψ̃⟩, the theorem is written formally as

\frac{\delta E_\psi}{\delta|\tilde\psi\rangle} = 0 \Leftrightarrow |\tilde\psi\rangle = |n\rangle,   (41.26)

i.e., the energy of the state is stationary (meaning there are no O(ε) terms in it, only O(ε²) terms) if and only if the normalized state is an eigenstate of Ĥ, with energy E = E_n = E_ψ. Note that this is true for all the eigenstates, not just for the ground state, though note that the condition is one of stationarity, not of a minimum (more on that later).

Proof in a basis One way to prove it is to consider a basis, i.e., a complete orthonormal set of states |r⟩, and expand the normalized state in it,

|\tilde\psi\rangle = \sum_r c_r|r\rangle.   (41.27)

We want to keep the normalization condition fixed, i.e.,

\langle\tilde\psi|\tilde\psi\rangle = 1 \Leftrightarrow \sum_r c_r^*c_r = 1.   (41.28)

This means that we need to minimize E_ψ, understood as an action, while keeping the constraint, which is added to the "action" via a Lagrange multiplier called E,

\delta\left[E_\psi - E\left(\sum_r c_r^*c_r - 1\right)\right] = 0.   (41.29)

We rewrite it as

\delta\left[\sum_{r,s}c_r^*H_{rs}c_s - E\sum_{r,s}c_r^*\delta_{rs}c_s + E\right] = 0 \Rightarrow \delta\left[\sum_{r,s}c_r^*(H_{rs} - E\delta_{rs})c_s\right] = 0.   (41.30)

Now we treat δc_r^* and δc_s as arbitrary and independent variations, just as in the Ritz variational method. Thus we rewrite the above condition as

\sum_{r,s}\delta c_r^*(H_{rs} - E\delta_{rs})c_s + \sum_{r,s}c_r^*(H_{rs} - E\delta_{rs})\delta c_s = 0,   (41.31)

but since the variations are arbitrary and independent, we obtain two equations,

\sum_s(H_{rs} - E\delta_{rs})c_s = 0,\qquad \sum_r c_r^*(H_{rs} - E\delta_{rs}) = 0.   (41.32)

However, since Ĥ is Hermitian, H_{rs} = H_{sr}^*, and moreover the eigenenergies are real, E = E^*, the second equation becomes

\sum_r(H_{sr} - E^*\delta_{sr})c_r^* = 0,   (41.33)

which is just the complex conjugate of the first. But the first is just the eigenstate condition,

\sum_s H_{rs}c_s = Ec_r.   (41.34)

q.e.d.
We also note that here for the first time we used hermiticity, and this is the only property of Ĥ that was used, which means that indeed the result is valid for any Hermitian operator.

Proof in the general case In fact, we don't need to expand in a basis in order to prove the theorem. We vary the energy of the state, with the normalization condition imposed with a Lagrange multiplier E, as before (but without expanding in a basis),

\delta\left[\langle\tilde\psi|\hat H|\tilde\psi\rangle - E\langle\tilde\psi|\tilde\psi\rangle + E\right] = 0.   (41.35)

Then the variation becomes

\delta\left[\langle\tilde\psi|(\hat H - E\,\mathbb{1})|\tilde\psi\rangle\right] = 0 \Rightarrow \langle\delta\tilde\psi|(\hat H - E\,\mathbb{1})|\tilde\psi\rangle + \langle\tilde\psi|(\hat H - E\,\mathbb{1})|\delta\tilde\psi\rangle = 0.   (41.36)

But then, if we treat ⟨δψ̃| and |δψ̃⟩ as arbitrary and independent variations, we obtain

(\hat H - E\,\mathbb{1})|\tilde\psi\rangle = 0,\qquad \langle\tilde\psi|(\hat H - E\,\mathbb{1}) = 0 \Rightarrow (\hat H^\dagger - E^*\,\mathbb{1})|\tilde\psi\rangle = 0.   (41.37)

Since Ĥ is Hermitian, Ĥ† = Ĥ and E^* = E, so we obtain the same equation as the first, and moreover it is the eigenstate equation,

\hat H|\tilde\psi\rangle = E|\tilde\psi\rangle.   (41.38)

q.e.d.

Moreover, again we can show that the energy error is O(ε²). Consider an error |δψ̃_n⟩, and divide it into a part parallel to |n⟩ and a part |δψ̃'⟩ orthogonal to it,

|\tilde\psi_n\rangle = |n\rangle + |\delta\tilde\psi_n\rangle = a|n\rangle + b|\delta\tilde\psi'\rangle,   (41.39)

where ⟨n|δψ̃'⟩ = 0. Then normalization means that

\langle\tilde\psi_n|\tilde\psi_n\rangle = 1 \Rightarrow a^2 + b^2 = 1.   (41.40)

But then, if b ∼ O(ε), a ∼ 1 − O(ε²). That leads to the following error in the energy:

E_\psi = \left(a^*\langle n| + b^*\langle\delta\tilde\psi'|\right)\hat H\left(a|n\rangle + b|\delta\tilde\psi'\rangle\right) = |a|^2E_n + |b|^2\langle\delta\tilde\psi'|\hat H|\delta\tilde\psi'\rangle \simeq E_n + O(\epsilon^2),   (41.41)

where in the second equality we have used Ĥ|n⟩ = E_n|n⟩ and ⟨δψ̃'|n⟩ = 0, and in the last equality we have used |a|² = 1 − O(ε²) and b ∼ O(ε).
Note however that now the sign of the ε² correction is arbitrary, meaning that E_ψ can reach either a minimum or a maximum. Only for E_n = E_0 (the ground state energy) do we always have a minimum (since the sign of E_n − E_0 is always positive).

41.5 Applications

We now apply the variational methods to some relevant cases, first pointing out the pros and cons of the variational method.
Pro: the method gives a very good approximation to the energy, with O(ε²) error, even when the O(ε) error in the state (or wave function) is relatively large.
Con: the method does not give us the error itself, since all we know is that we are minimizing over some set of functions, not how these functions relate to the exact ones (i.e., what the value of ε is).
But we can improve on the con as follows:
(1) We can choose a wave function for |ψ⟩ that has the same number of nodes as the state |n⟩ which we want to reproduce (which is known from general theory, without knowing the state itself).
(2) We can impose the same asymptotic behavior at the extremes, r = 0 and/or r = ∞ in three dimensions.
(3) We can use eigenfunctions of operators that commute with Ĥ, e.g., the angular momentum \hat{\vec L}.

The Hydrogenoid Atom


In this case, we actually know exactly the wave functions of the eigenstates, but we can pretend we don't, and try to use the variational method.
Using point (3) above, we can write the ansatz for the wave function in three dimensions,

\Phi = R(r)Y_{lm}(\theta, \phi).   (41.42)



Then we use point (2) above: at r = 0, R(r) → const, whereas as r → ∞, R(r) → 0 very fast. For the ground state energy E_0, with spherical symmetry, we can then try (using the practical variational method) the wave function

R(r) = Ae^{-r/a},   (41.43)

depending on the parameters A and a, and minimize the energy E_ψ over them. At the minimum, we find a = a_0 and A = a_0^{-3/2}, leading to the ground state wave function |ψ_0⟩ and the corresponding ground state energy E_0. Thus in this way we find the exact energy E_0.
But we could also try another function satisfying the same boundary conditions at r = 0 and r = ∞,

R(r) = \frac{A}{b^2 + r^2},   (41.44)

depending on the parameters A and b, and minimize the energy over them. At the minimum, we find E_min = −0.81|E_0|, which is pretty close to the true value, even though the wave function is way off.
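The −0.81|E₀| figure is easy to reproduce numerically (my own check, in atomic units ħ = m = e = 1, where the exact value is E₀ = −1/2): with R(r) = 1/(b² + r²), the kinetic, potential, and norm expectation values reduce to one-dimensional radial integrals, and minimizing their ratio over b gives E_min = −4/π² ≈ −0.405 = −0.81|E₀|, at b = π/4:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

# Atomic units: H = -(1/2) laplacian - 1/r, exact E_0 = -1/2
def E(b):
    R  = lambda r: 1.0 / (r**2 + b**2)
    dR = lambda r: -2.0 * r / (r**2 + b**2)**2
    kin,  _ = quad(lambda r: 0.5 * dR(r)**2 * r**2, 0.0, np.inf)
    pot,  _ = quad(lambda r: -R(r)**2 * r, 0.0, np.inf)
    norm, _ = quad(lambda r: R(r)**2 * r**2, 0.0, np.inf)
    return (kin + pot) / norm            # the common 4*pi factors cancel

res = minimize_scalar(E, bounds=(0.1, 5.0), method="bounded")
b_min, E_min = res.x, res.fun
print(b_min, E_min)   # b ~ pi/4, E_min = -4/pi^2 ~ -0.405 = -0.81 |E_0|
```

Despite the trial function having the wrong (power-law) tail, the variational energy misses by only 19 percent, illustrating the O(ε) versus O(ε²) statement above.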
For the energy of an excited state En , n > 0, we must try wave functions that have the correct
symmetry properties, the right number of nodes, etc. Otherwise, if we know the energy levels Em
and states for m < n, we then try functions that are orthogonal to them, leading to En > Em .

Previously Studied Example: the Helium Atom (or Helium-Like Atom)

Helium-like atoms were studied in Chapter 21, where the variational method was used without much explanation, so here we return to it and streamline the presentation.
Consider a trial wave function for the electrons in a helium-like atom that is just a product of the states for a single electron, i.e., the zeroth-order approximation in the perturbation theory for the interaction between the electrons,

\Phi(\vec r_1, \vec r_2; a) = f(r_1; a)f(r_2; a) = \frac{1}{\pi a^3}\exp\left(-\frac{r_1 + r_2}{a}\right),   (41.45)

but where a is now a free parameter, and not the value a = a_0/Z that would be appropriate for a helium-like atom of charge Z. Equivalently, write

a = \frac{a_0}{Z'},   (41.46)

where Z' is an arbitrary effective charge felt by the electrons, not equal to Z.
Then, as shown in Chapter 21, the first-order (time-independent) perturbation theory correction, the interaction potential (between the electrons) V_{12} averaged in the zeroth-order state |Φ⟩, only now with charge Z' instead of Z, gives

\langle\Phi|\hat V_{12}|\Phi\rangle = \frac{5}{4}Z'|E_0|,   (41.47)

where |E_0| = e_0^2/(2a_0) is the magnitude of the ground state energy of the hydrogen atom (the energy of a single electron in the field of a nucleus of charge one).

Then the energy functional of the helium-like atom is twice the energy of an electron in the field of a nucleus of charge $Z'$, plus the above interaction energy, so
$$E(a) = 2|E_0| \left[ Z'^2 - 2Z' \left( Z - \frac{5}{16} \right) \right]. \qquad (41.48)$$
Minimizing this over a, i.e., finding δE(a)/δa = 0, or equivalently minimizing over $Z'$ by finding $\delta E/\delta Z' = 0$, we obtain
$$Z' = Z - \frac{5}{16}, \qquad (41.49)$$
and the energy at this minimum is
$$E_{\Phi,\rm min} = -2 \left( Z - \frac{5}{16} \right)^2 |E_0|. \qquad (41.50)$$
This is a better approximation than the first-order (time-independent) perturbation theory result,
which is
$$\langle \Phi | \hat V_{12} | \Phi \rangle = \frac{5}{4} Z |E_0|, \qquad (41.51)$$
leading to the total energy
$$E^{(0+1)} = -2|E_0| Z^2 + \frac{5}{4} Z |E_0| = E_{\Phi,\rm min} + \frac{25}{128} |E_0|, \qquad (41.52)$$
which is larger (and therefore a worse approximation) than EΦ,min .
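The numbers for helium itself (Z = 2) are easy to tabulate; the short Python sketch below evaluates the formulas above with $|E_0| = 13.606$ eV (the measured helium ground state energy, about $-79.0$ eV, is quoted for comparison only):

```python
# Helium-like atom with Z = 2: variational minimum vs. first-order
# perturbation theory, using |E0| = 13.606 eV (hydrogen ground state).
E0 = 13.606
Z = 2

Z_eff  = Z - 5.0 / 16.0                      # Eq. (41.49)
E_var  = -2.0 * Z_eff**2 * E0                # Eq. (41.50)
E_pert = -2.0 * E0 * Z**2 + 1.25 * Z * E0    # Eq. (41.52)

print(Z_eff, E_var, E_pert)
# Z_eff = 1.6875, E_var ~ -77.5 eV, E_pert ~ -74.8 eV; the variational value
# is the lower (better) estimate, closer to the measured -79.0 eV.
```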
The Ritz method will not be exemplified here, but rather will be revisited in the next chapter, where
it will be used to work with atomic and molecular orbitals.

Important Concepts to Remember

• The energy of any state is always at least the ground state energy, so we can find the ground state by minimizing the energy $\sum_{\alpha,\beta} a_\alpha^* a_\beta H_{\alpha\beta} / \sum_\alpha |a_\alpha|^2$ over states expanded in an arbitrary basis, $|\psi\rangle = \sum_\alpha a_\alpha |\alpha\rangle$.
• In the Ritz variational method, we minimize over a subset of states in the Hilbert space, not necessarily orthonormal: $E_\psi = \sum_{k,i} c_k^* c_i H_{ki} / \sum_{k,i} c_k^* c_i \Delta_{ki}$, $\Delta_{ki} = \langle \psi_k | \psi_i \rangle$, with $|\psi\rangle = \sum_i c_i |\psi_i\rangle$. If the error in the state is $|\psi\rangle - |0\rangle = O(\epsilon)$, then the error in the energy is $E_\psi - E_0 = O(\epsilon^2)$.

• In a practical variational method, one chooses wave functions depending on parameters, $\psi(\vec x) = \langle \vec x | \psi \rangle = \psi(\vec x; a_1, \ldots, a_n)$, and minimizes the energy $E_\psi$ over the parameters, again finding a better error in the energy, $E_\psi - E_0 = O(\epsilon^2)$, than in the state, $\psi - \psi_0 = O(\epsilon)$.
• The most general variational method (valid for any operator A, not only for H) makes the energy stationary (no $O(\epsilon)$ terms, only $O(\epsilon^2)$ terms) over normalized states: $\delta E_\psi / \delta |\psi\rangle = 0$ if and only if we have an eigenstate, $|\psi\rangle = |n\rangle$, for some n. For the ground state, stationarity implies a minimum.
• Variational methods improve the error, though we do not then know its value; but we can improve
them by choosing wave functions with the correct number of nodes and behavior at infinity or zero,
and that are eigenfunctions of operators that commute with Ĥ.
• For a helium-like atom, the variational method is better than first-order time-independent perturba-
tion theory.
Further Reading
See [2] and [1].

Exercises

(1) Consider a two-level system, with basis $|1\rangle$, $|2\rangle$, and in this basis, a Hamiltonian with elements $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$. Use the first form of the variational method to find the ground state, and then check by finding the exact eigenstates.
(2) Use the Ritz variational method for the harmonic oscillator, with trial wave functions $\psi_1(x) = e^{-y^2/2}$, $\psi_2(x) = e^{-y^2}$, $\psi_3(x) = e^{-2y^2}$, where $y = x\sqrt{m\omega/\hbar}$, in order to find the best approximation to the ground state energy.
(3) Use the practical variational method for the same harmonic oscillator ground state energy, with trial wave function $\psi_a(x) = e^{-ay^2}$.

(4) Repeat exercise 3 for the third state (second excited state) $|3\rangle$ of the harmonic oscillator, namely, for $\psi_a(x) = y^2 e^{-ay^2}$.
(5) Consider the potential $V = kx^2 + \alpha |x|^3$, and a trial wave function $\psi(x; a, b) = |y|^a e^{-by^2}$. Find an estimate of the ground state energy (write down the equations for the minimum).
(6) Fill in the details in the text for minimization with the trial wave function R(r) = Ae−r/a for the
hydrogenoid atom.
(7) Fill in the details in the text for minimization with the trial wave function $R(r) = A/(r^2 + b^2)$.
PART IIc

ATOMIC AND NUCLEAR QUANTUM MECHANICS
42 Atoms and Molecules, Orbitals and Chemical Bonds: Quantum Chemistry

In this chapter, we will build up from hydrogenoid atoms to multi-electron atoms, and onward to
molecules, by defining atomic orbitals, molecular orbitals, the hybridization of orbitals, and the
resulting chemical bonds, defining the field of quantum chemistry.

42.1 Hydrogenoid Atoms (Ions)

We start with a quick review of the hydrogenoid atoms, i.e., ions with a nucleus of charge Z and only
one electron. The electronic quantum numbers are (n, l, m, ms ), where

l = 0, 1, . . . , n − 1, m = −l, −l + 1, . . . , l − 1, l, ms = ±1/2. (42.1)

There is a spin–orbit (l · s) coupling, meaning that there is such a term in the interaction
Hamiltonian, where l is the angular momentum and s is the spin angular momentum. Then, at least if
the interaction energy is sufficiently large, we must add the angular momenta to give the total angular
momentum, j = l +s, which means that in terms of quantum numbers we have j = l ± 1/2. Moreover,
the values of $m_j$ are $-j, -j+1, \ldots, j-1, j$. We can then use the quantum numbers $(n, l, j, m_j)$, which
are more appropriate if the spin–orbit interaction energy is large enough.
The wave functions depend on the position r and the quantum numbers, and can be expanded as
products:

ψ nlm (r, θ, φ, ms ) = Rnl (r)Ylm (θ, φ)χ(ms ). (42.2)

Since $Y_{lm}(\theta, \phi) = P_{lm}(\cos\theta) e^{im\phi}$, the probability is
$$P_{nlmm_s}(\vec r) = |\psi_{nlm}(r, \theta, \phi, m_s)|^2 = |R_{nl}(r) P_{lm}(\cos\theta)|^2 |\chi(m_s)|^2, \qquad (42.3)$$

which means that it actually depends on r, θ and (n, l, m, ms ). This probability profile, more precisely
the shape that contains most of it (say, 99% for instance), is defined to be an orbital, for a given
(n, l, m, ms ) as above or a given (n, l, j, m j ).
For the orbitals, we have the spectroscopic notation, where we denote an orbital by nl d , where d
is the multiplicity of the state. More precisely, instead of the value of l, we use the notation where
l = 0 is called s, l = 1 is called p, l = 2 is d and l = 3 is f . From l = 4 onwards, we have alphabetic
notation, so l = 4, 5, 6, 7 are denoted by g, h, i, k, . . . (the letter j is skipped). The l = 0 or s orbital has spherical symmetry,
so no θ dependence at all; the l = 1 or p orbital has axial (or dipole) symmetry, meaning one axis;
the l = 2 or d orbital has quadrupole symmetry (or 2 axes); etc.
470 42 Quantum Chemistry

The above-defined hydrogenoid orbitals, i.e., atomic orbitals for hydrogenoid atoms, meaning the
single-electron wave functions, are specifically (here ρ = r/a0 as always)

$$
\begin{aligned}
1s:&\quad \phi_{100}(\rho,\theta,\phi) = C\, e^{-\rho/2}\\
2s:&\quad \phi_{200}(\rho,\theta,\phi) = \frac{C}{\sqrt{32}}\,(2-\rho)\,e^{-\rho/2}\\
2p:&\quad \phi_{21i}(\rho,\theta,\phi) = \frac{C}{\sqrt{32}}\,\rho\,e^{-\rho/2}
\begin{pmatrix}\cos\theta\\ \sin\theta\cos\phi\\ \sin\theta\sin\phi\end{pmatrix}
\propto \begin{pmatrix} Y_{1,0}\\ \frac{Y_{1,-1}-Y_{1,1}}{\sqrt{2}}\\ i\,\frac{Y_{1,-1}+Y_{1,1}}{\sqrt{2}}\end{pmatrix}
\propto \frac{x_i}{r}\\
3s:&\quad \phi_{300}(\rho,\theta,\phi) = \frac{C}{\sqrt{972}}\,(6-6\rho+\rho^2)\,e^{-\rho/2}\\
3p:&\quad \phi_{31i}(\rho,\theta,\phi) = \frac{C}{\sqrt{972}}\,(4\rho-\rho^2)\,e^{-\rho/2}
\begin{pmatrix}\cos\theta\\ \sin\theta\cos\phi\\ \sin\theta\sin\phi\end{pmatrix}
\propto \begin{pmatrix} Y_{1,0}\\ \frac{Y_{1,-1}-Y_{1,1}}{\sqrt{2}}\\ i\,\frac{Y_{1,-1}+Y_{1,1}}{\sqrt{2}}\end{pmatrix}
\propto \frac{x_i}{r}.
\end{aligned}
\qquad (42.4)
$$

We make two observations:

• For the s orbitals, $\phi(\rho = 0) \neq 0$, so there is a direct interaction between the electrons and the nucleus, called a Fermi contact interaction.
• As we saw before, the average value of the radius is
$$\langle r \rangle = \frac{a_0}{2Z}\left[ 3n^2 - l(l+1) \right], \qquad (42.5)$$
so it decreases with the charge Z of the nucleus, meaning that the hydrogenoid atom gets
compressed for higher Z, rather than growing.
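As an illustration of this compression, the $\langle r \rangle$ formula can be evaluated directly; a minimal Python sketch (with $a_0$ set to 1):

```python
# Mean radius <r> = (a0 / 2Z) [3 n^2 - l(l+1)] for hydrogenoid orbitals,
# in units of the Bohr radius a0.
def mean_radius(n, l, Z=1, a0=1.0):
    return a0 / (2.0 * Z) * (3 * n**2 - l * (l + 1))

print(mean_radius(1, 0))        # 1s: 1.5 a0
print(mean_radius(2, 1))        # 2p: 5.0 a0
print(mean_radius(1, 0, Z=2))   # 1s of He+: 0.75 a0, compressed for higher Z
```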

42.2 Multi-Electron Atoms and Shells

We next move on to multi-electron atoms. In this case,

• the other electrons screen the nucleus somewhat by an amount σ; they are described in terms of a
hydrogenoid atom, but with an effective charge,

Zeff = Z − σ. (42.6)

• due to the other electrons, which are not spherically distributed (unless we only have s, i.e.,
spherically symmetric, orbitals for the other electrons), the potential seen by an electron is no
longer central.

We have a shell model, where we fill up atomic orbitals, which are extensions of the hydrogenoid
orbitals, with electrons, in order of increasing energy. We call a shell the set of energy levels with a
given n, and a sub-shell the set of energy levels of a given n and l; see Fig. 42.1.

3p

3s
2p
2s
1s
Figure 42.1 The first sub-shells in the shell model, ignoring the spin (ms ) degree of freedom.

For a sub-shell, of given n, l and varying (m, ms ) or ( j, m j ), if it is full, meaning all the
energy levels in the sub-shell are occupied with electrons, then we have a spherical distribution
of probability, and the total angular momenta vanish, $\vec L_{\rm tot} = 0 = \vec S_{\rm tot}$.

Hund Rules for the Atomic Orbitals:


(1) The electrons avoid being in the same orbitals if there is more than one orbital of equal l.
(2) Two electrons on equivalent (same value of l), but different, orbitals, have parallel (↑↑) spins in
the ground state of the atom.
If we have a fully filled shell (all the energy levels of a given n are occupied by electrons), then for
all calculations referring to further electrons, it is as if we have a hydrogenoid atom, with an effective
charge Zeff = Z − σ.

42.3 Couplings of Angular Momenta

In an atom, in general we have many electrons, each with an orbital angular momentum l i and a
spin si . We have various possibilities for coupling of these angular momenta, but we have two ideal
situations:

(a) LS coupling (normal, or Russell–Saunders). In this case, we first couple the orbital angular momenta to each other, and the spins to each other, and then we couple the resulting $\vec L$ and $\vec S$:
$$\vec L = \sum_i \vec l_i, \qquad \vec S = \sum_i \vec s_i, \qquad \vec J = \vec L + \vec S. \qquad (42.7)$$

This means that the possible values of J are

|L − S| ≤ J ≤ L + S. (42.8)

The notation in this coupling is 2S+1 L J , or rather n2S+1 L J , where usually 2S + 1 is the multiplicity
of states of given L, S. However, if S > L then the multiplicity is 2L + 1, replacing 2S + 1 in the
notation, and we say that the multiplicity is not fully developed. An example of the notation is the

$^2P_{3/2}$ state, which is read as "doublet P three halves". This LS coupling is predominant in the light atoms: electrostatic forces between the electrons couple the $\vec l_i$ into $\vec L$ and the $\vec s_i$ into $\vec S$, and then the smaller magnetic spin–orbit coupling couples $\vec L$ and $\vec S$ into $\vec J$.
(b) jj coupling (spin–orbit). In this other idealized case, we first couple the orbital and spin
angular momenta of each electron into their total angular momentum, and then the total angular
momenta of the electrons to each other:

$$\vec j_i = \vec l_i + \vec s_i, \qquad \vec J = \sum_i \vec j_i. \qquad (42.9)$$

This case is predominantly used for heavy atoms, with large Z. In it, the magnetic spin–orbit coupling
happens first, since it is due to the nuclear charge Z, which is large. Only after that do we couple the
ji to each other by the electrostatic forces between the electrons, which are Z-independent.
In general, a particular coupling lies somewhere in between the two idealized cases above. But
the particular type of coupling does not modify the total number of levels, nor J, but rather changes
the energy gaps between the various levels. This splitting of energy levels due to the coupling of the
angular momenta of the electrons is called the “fine structure”.
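The statement that the coupling scheme changes the gaps but not the number of levels can be checked by brute-force enumeration. A Python sketch for two inequivalent electrons (here a p and a d electron, an example choice for which Pauli restrictions remove no states):

```python
# Level counting in the two idealized coupling schemes, for two inequivalent
# electrons with l1 = 1, l2 = 2. Both schemes give the same number of
# J levels and of total states.
from fractions import Fraction

def couple(a, b):
    """Allowed total angular momenta |a - b|, |a - b| + 1, ..., a + b."""
    j, out = abs(a - b), []
    while j <= a + b:
        out.append(j)
        j += 1
    return out

half = Fraction(1, 2)
l1, l2 = 1, 2

# LS: (l1,l2) -> L, (s1,s2) -> S, then (L,S) -> J
levels_LS = [J for L in couple(l1, l2) for S in couple(half, half)
             for J in couple(L, S)]
# jj: (l_i,s_i) -> j_i, then (j1,j2) -> J
levels_jj = [J for j1 in couple(l1, half) for j2 in couple(l2, half)
             for J in couple(j1, j2)]

count = lambda levels: sum(2 * J + 1 for J in levels)
print(len(levels_LS), len(levels_jj))      # 12 12
print(count(levels_LS), count(levels_jj))  # 60 60
```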
But there is also a “hyperfine structure” due to the nuclear angular momentum I, more precisely
its interaction with $\vec J$, leading to the total atomic angular momentum,
$$\vec F = \vec J + \vec I. \qquad (42.10)$$

42.4 Methods of Quantitative Approximation for Energy Levels

There is an approximation method of introducing a self-consistent field, called the Hartree–Fock method, but we will leave it for later, when analyzing multiparticle states in some generality.

Method of Atomic Orbitals


The method we will describe here is an application of the Ritz variational method to atoms and their
orbitals, called the atomic orbital method.
It is a way to introduce some interaction, usually between electrons. We split the Hamiltonian into
free plus interaction parts,
Ĥ = Ĥ0 + Ĥint . (42.11)
We use orthonormal eigenstates of Ĥ0 , called |Φk , which we can usually calculate (we choose Ĥ0
such that we can). If we consider all of them, for k = 1 to ∞, we obtain a basis of the Hilbert space
of the system. But, rather than do that, we restrict to n terms only, for the expansion of a trial state,
$$|\psi_a\rangle = \sum_{k=1}^n c_k |\Phi_k\rangle, \qquad (42.12)$$

and use the full Hamiltonian Ĥ on it. Then, from the Ritz variational method, we want to obtain the
eigenstates and eigenenergies for this system,

n
ck ( Ĥ − E)|Φk  = 0. (42.13)
k=1

Multiplying with $\langle \Phi_i |$ from the left, and defining $H_{ik} = \langle \Phi_i | \hat H | \Phi_k \rangle$, we have
$$\sum_{k=1}^n c_k (H_{ik} - E\delta_{ik}) = 0. \qquad (42.14)$$

Then we can find the secular equation for the eigenenergies,

det(Hik − Eδik ) = 0. (42.15)

From it, we obtain n approximate values E1 , . . . , En for the eigenvalues of Ĥ.
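In practice the secular equation is solved by diagonalizing the truncated matrix $H_{ik}$. A Python sketch with an arbitrary illustrative 3 × 3 Hermitian matrix (not computed for any real atom):

```python
# Truncated (Ritz) secular problem: diagonalize H_ik = <Phi_i|H|Phi_k> in a
# finite basis. The 3x3 matrix below is an arbitrary illustrative example.
import numpy as np

H = np.array([[-1.0,  0.2,  0.0],
              [ 0.2, -0.5,  0.1],
              [ 0.0,  0.1,  0.3]])

E = np.linalg.eigvalsh(H)   # roots of det(H - E 1) = 0, in increasing order
print(E)                    # three approximate eigenenergies E1, E2, E3
```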

42.5 Molecules and Chemical Bonds

We next move on to molecules, which are several atoms bound together. The chemical bonds between
the atoms are mostly between two atoms within a molecule, though there are exceptions such as the
aromatic bonds, for instance in C6 H6 , explained briefly at the end of the chapter.
Chemical bonds usually fall into one of three categories:

• Electrovalent, or ionic, bonds, leading to a heteropolar molecule (meaning it has two different
poles). In it, the electrons move between the two atoms, so one atom loses an electron (or more,
or less; we are talking about probabilities in quantum mechanics, so fractional parts make sense)
and the other gains it. For instance, in the ionic bond between an alkaline element (an element that
has a single electron outside filled shells) and a halogen (an element for which a single electron is
needed to have only filled shells), the alkaline element loses one electron, and the halogen gains it.
The standard example of this is the ionic salt molecule, Na+ Cl− ; see Fig. 42.2a.
• Metallic bonds, in which the atoms have a crystalline structure, in which some electrons are
delocalized within the crystalline structure. This means that the very many equal energy levels of
the individual atoms split infinitesimally, creating almost continuous “bands” of electrons instead
of discrete levels; see Fig. 42.2b.
• Covalent bonds, between neutral atoms, leading to a homopolar molecule (meaning with two like
poles). The standard example is the H2 molecule; see Fig. 42.2c.

A molecule is a stable structure, meaning it is a minimum of the energy (be it classical or quantum),
associated with a given spatial structure. That in turn means that the nuclei are approximately fixed,
and only the electrons move (or, rather, move fast compared with the nuclei).

(a) (b) (c)


+ −
Figure 42.2 (a) Ionic bond: Na Cl . (b) Metallic bond: energy bands. (c) Covalent bond: H2 .

42.6 Adiabatic Approximation and Hierarchy of Scales

Since electrons are much lighter than nuclei,
$$\frac{m_e}{M_N} \ll 1, \qquad (42.16)$$
as stated above we can consider that the nuclei are approximately fixed to their stable structure, and
the electrons move fast within it, i.e., they “follow” the nuclei adiabatically; so, at all times we are
able to treat the nuclei positions as mere parameters as far as the electrons are concerned.
Consider d as the characteristic distance between two atoms, or rather two nuclei, in the molecule.
Then the electrons are delocalized over a distance $\Delta x \sim d$, meaning the electron momentum is
$$p_e \sim \frac{\hbar}{d}. \qquad (42.17)$$
Then the energy of an electron is about twice the kinetic energy, giving
$$E_e \sim 2T_e \sim \frac{p^2}{m} \sim \frac{\hbar^2}{md^2}. \qquad (42.18)$$
On the other hand, the rotational energy associated with the molecule, or rather with the nuclei that compose it, is
$$E_{\rm rot} \sim \frac{L^2}{I} \sim \frac{\hbar^2}{Md^2}, \qquad (42.19)$$
where I is the moment of inertia of the nuclei. Thus
$$E_e \gg E_{\rm rot}. \qquad (42.20)$$

Moreover, there is also a vibrational energy, associated with the vibrations of the nuclei around the stable structure. This means that the potential energy, approximately that of a harmonic oscillator,
$$E_{\rm pot} \sim \frac{M\omega^2 d^2}{2}, \qquad (42.21)$$
should be balanced by the kinetic energy of an electron,
$$T_e \sim \frac{\hbar^2}{2md^2}, \qquad (42.22)$$
so that
$$\omega \sim \frac{\hbar}{d^2 \sqrt{Mm}} \;\Rightarrow\; E_{\rm vibr} \sim \hbar\omega \sim \frac{\hbar^2}{d^2 \sqrt{Mm}}. \qquad (42.23)$$
Then finally the hierarchy of energy scales for electronic, vibrational, and rotational modes is
$$E_{\rm rot} \ll E_{\rm vibr} \ll E_e. \qquad (42.24)$$

While Ee is an atomic quantity, the rotational and vibrational modes are intrinsically molecular,
have no atomic analogue, and are small. Concretely, the rotational energy Erot ∼ k B T ∼ 25 meV, the
vibrational energy Evibr ∼ 0.5 eV, and the electronic energy is of a few eV.
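The scaling behind this hierarchy, $E_{\rm rot} : E_{\rm vibr} : E_e \sim (m/M) : \sqrt{m/M} : 1$, can be made concrete with rough, assumed numbers ($m/M \sim 1/4000$ and $E_e \sim 4$ eV are illustrative only):

```python
# Hierarchy E_rot : E_vibr : E_e ~ (m/M) : sqrt(m/M) : 1, with assumed,
# illustrative numbers: E_e ~ 4 eV and m/M ~ 1/4000.
import math

E_e   = 4.0            # eV, electronic scale (assumed)
ratio = 1.0 / 4000.0   # m_e / M_N (assumed)

E_vibr = E_e * math.sqrt(ratio)   # ~ 0.06 eV
E_rot  = E_e * ratio              # ~ 1 meV
print(E_rot, E_vibr, E_e)         # each scale well separated from the next
```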

The Schrödinger equation for a molecule is written as
$$\left[ -\sum_\alpha \frac{\hbar^2}{2M_\alpha} \Delta_\alpha - \sum_i \frac{\hbar^2}{2m_i} \Delta_i + \sum_{\alpha,i} V_{\alpha i} + \sum_{\alpha,\beta} V_{\alpha\beta} + \sum_{i,j} V_{ij} \right] \psi_{n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) = E_n \, \psi_{n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right). \qquad (42.25)$$


Here $m_i$ are the electron masses and $\vec r_i$ their positions, $M_\alpha$ are the masses of the nuclei and $\vec R_\alpha$ their positions, n is a generic index for energy states, whereas {s} is a degeneracy index. Moreover, $T_N$ is the nuclear kinetic energy and $T_e$ the electronic kinetic energy, where
$$T_N = -\sum_\alpha \frac{\hbar^2}{2M_\alpha} \Delta_\alpha, \qquad T_e = -\sum_i \frac{\hbar^2}{2m_i} \Delta_i, \qquad (42.26)$$

and V is the total potential energy, composed of a nuclear–electronic part $V_{Ne}$, a nucleus–nucleus part $V_{NN}$, and an electron–electron part $V_{ee}$, where
$$V_{Ne} = \sum_{\alpha,i} V_{\alpha i}, \qquad V_{NN} = \sum_{\alpha,\beta} V_{\alpha\beta}, \qquad V_{ee} = \sum_{i,j} V_{ij}, \qquad V = V_{Ne} + V_{NN} + V_{ee}. \qquad (42.27)$$

42.7 Details of the Adiabatic Approximation

We can consider a perturbation in which the unperturbed part Ĥ0 is the purely electronic part,

Ĥ0 = T̂e + V̂ , (42.28)

neglecting TN , so considering fixed nuclei. We can perhaps also neglect VN N , which would mean
neglecting the vibrational and rotational modes, thus assuming fixed nuclei at the positions in the stable structure. Then the $\vec R_\alpha$ are just parameters, as in the practical variational method, and we can minimize over them.
Further, writing

Ĥ = Ĥ0 + T̂N , (42.29)

we can write down a solution that is not fully separated as an eigenfunction for the Schrödinger
equation for Ĥ,

$$\psi_{n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) = \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) \psi_N\left( \{\vec R_\alpha\} \right), \qquad (42.30)$$

where the ψe are eigenfunctions of Ĥ0 ,

$$\hat H_0 \, \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) = E_n^{(0)} \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right), \qquad (42.31)$$
that depend on the positions $\vec R_\alpha$ of the nuclei only as parameters.


On the other hand, T̂N acts on both ψe and ψ N , so we can write
 
$$\hat T_N (+\hat V_{NN}) \left[ \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) \psi_N\left( \{\vec R_\alpha\} \right) \right] = E_{N,n} \, \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) \psi_N\left( \{\vec R_\alpha\} \right), \qquad (42.32)$$

which is multiplied by the conjugate electron wave function $\psi_e^*$ and integrated, leading to the kinetic energy plus an extra part,
$$\int \prod_i d^3 r_i \, \psi_{e,n\{s\}}^*\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) \hat T_N (+\hat V_{NN}) \left[ \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) \psi_N\left( \{\vec R_\alpha\} \right) \right] = (T_N + \Delta E_N) \, \psi_N\left( \{\vec R_\alpha\} \right). \qquad (42.33)$$


Then the free (electronic) Schrödinger equation is written, more explicitly, as
$$\left[ \hat T_e + \hat V_{Ne} + \hat V_{ee} (+\hat V_{NN}) \right] \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right) = E_{e,n} \, \psi_{e,n\{s\}}\left( \{\vec r_i\}, \{\vec R_\alpha\} \right). \qquad (42.34)$$
The total energies split into electronic and nuclear parts,
$$E_n = E_{N,n} + E_{e,n} \equiv E_{N,n} + E_n^{(0)}. \qquad (42.35)$$
The quantized vibrational energy is
$$E_{\rm vibr} = \left( v + \frac{1}{2} \right) \hbar \omega_0, \qquad (42.36)$$
and the quantized rotational energy is
$$E_{\rm rot} = \frac{\vec J^2}{2I} = \frac{\hbar^2 J(J+1)}{2I}. \qquad (42.37)$$
In order to find the stable structure, i.e., the molecular shape, we can apply the practical variational
method for $\psi_{n\{s\}} = \psi_{e,n\{s\}} \psi_N$ and minimize the energy over the $\{\vec R_\alpha\}$ as parameters.
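The two molecular spectra above are simple to generate; a Python sketch with illustrative values of $\omega_0$ and I (and $\hbar = 1$):

```python
# Vibrational levels (v + 1/2) hbar w0 and rotational levels
# hbar^2 J(J+1)/(2I), in units with hbar = 1; w0 and I are illustrative.
w0, I = 1.0, 50.0

E_vibr = [(v + 0.5) * w0 for v in range(4)]           # evenly spaced ladder
E_rot  = [J * (J + 1) / (2.0 * I) for J in range(4)]  # gaps grow linearly in J

print(E_vibr)   # [0.5, 1.5, 2.5, 3.5]
print(E_rot)    # [0.0, 0.02, 0.06, 0.12]
```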

42.8 Method of Molecular Orbitals

This method is an application of the Ritz variational method, similar to the method of atomic orbitals.
We use the interaction between the electrons as a perturbation, Ĥ1 = Ĥee .
We first find an electronic wave function φ ai (r i ) that is decoupled, in which one electron is in the
field of all the nuclei, so the Schrödinger equation for it is
$$\left[ -\frac{\hbar^2}{2m} \Delta_i - \sum_{\alpha=1}^N \frac{Z_\alpha e^2}{4\pi\epsilon_0 r_{\alpha i}} \right] \phi_{a_i}(\vec r_i) = E_{a_i}^{(i)} \phi_{a_i}(\vec r_i), \qquad (42.38)$$
where $i = 1, \ldots, n_e$ is an index for the electrons.


Next, we write a molecular orbital as a product of individual wave functions $\phi_{a_i}(\vec r_i)$ for the electrons,
$$\phi_r(\vec r_1, \ldots, \vec r_{n_e}) \equiv \phi_{a_1}(\vec r_1) \cdots \phi_{a_{n_e}}(\vec r_{n_e}), \qquad (42.39)$$
where the total index r is made up of the $\{a_i\}$ indices, and we use the practical variational method and minimize over the parameters $\{\vec R_\alpha\}$.
Finally, we write the wave functions of the molecule as linear combinations of these molecular orbitals,
$$\psi_e\left( \{\vec r_i\}; \{\vec R_\alpha\} \right) = \sum_r c_r \, \phi_r\left( \{\vec r_i\}; \{\vec R_\alpha\} \right), \qquad (42.40)$$

and use the Ritz variational method: we restrict to a finite subset, r = 1, . . . , k. Then we obtain the eigenstate–eigenvalue equation,
$$\sum_{r=1}^k c_r (H_{sr} - E_e \delta_{rs}) = 0, \qquad (42.41)$$
leading to the secular equation for the electronic energy,
$$\det(H_{sr} - E_e \delta_{sr}) = 0. \qquad (42.42)$$

42.9 The LCAO Method

A variant of the above method is the linear combination of atomic orbitals (LCAO) method.
In order to find the decoupled electronic wave function φ ai (r i ), we use a type of Ritz method.
We consider atomic orbitals for different nuclei inside the molecule, but the same electron, meaning the same position $\vec r_i$: $\chi_r(\vec r_i; \vec R_r)$, where $\vec R_r$ is a parameter, representing the position of the rth nucleus. Then we make linear combinations of the orbitals for each nucleus, one combination per electron,
$$\phi_{a_i i}(\vec r_i) = \sum_{r=1}^N c_{ir} \, \chi_r\left( \vec r_i; \vec R_r \right), \qquad (42.43)$$

where N is the number of nuclei in the molecule, i is an index for the electron, and ai is the particular
molecular orbital.
As an example, we can consider the NH molecular orbital, where the N nucleus has position $\vec R_1$ and atomic orbital P (L = 1), and the H nucleus has position $\vec R_2$ and atomic orbital S (L = 0).
We minimize over cir (in the Ritz-like variational method) to find the molecular orbitals φ ai i . The
total energy is
$$E = \sum_{i=1}^{n_e} E_i, \qquad (42.44)$$

where the individual electron energies are
$$E_i = \frac{\sum_{r,s} c_{ir}^* H_{rs} c_{is}}{\sum_{r,s} c_{ir}^* \Delta_{rs} c_{is}}, \qquad (42.45)$$
and, as before, we use the notation
$$H_{rs} = \langle \chi_r | \hat H | \chi_s \rangle, \qquad \Delta_{rs} = \langle \chi_r | \chi_s \rangle. \qquad (42.46)$$
The eigenenergy–eigenstate equation is
$$\sum_{r=1}^N c_{ir} (H_{rs} - E_i \Delta_{rs}) = 0, \qquad (42.47)$$
leading to a secular equation for the energies $E_i$,
$$\det(H_{rs} - E_i \Delta_{rs}) = 0. \qquad (42.48)$$
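Because the basis is not orthonormal ($\Delta_{rs} \neq \delta_{rs}$), this is a generalized eigenvalue problem. A Python sketch with illustrative 2 × 2 matrices (assuming SciPy is available):

```python
# LCAO secular problem det(H - E Delta) = 0 in a non-orthogonal basis:
# a generalized eigenvalue problem. Illustrative 2x2 matrices.
import numpy as np
from scipy.linalg import eigh

H = np.array([[-1.00, -0.40],
              [-0.40, -1.00]])    # H_rs = <chi_r|H|chi_s>
Delta = np.array([[1.00, 0.25],
                  [0.25, 1.00]])  # overlap Delta_rs = <chi_r|chi_s>

E, C = eigh(H, Delta)             # solves H c = E Delta c
print(E)                          # [(H11 + H12)/(1 + S), (H11 - H12)/(1 - S)]
```

For these degenerate diagonal entries the two roots are $(H_{11} \pm H_{12})/(1 \pm S)$, anticipating Eq. (42.53) below.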

42.10 Application: The LCAO Method for the Diatomic Molecule

Consider two nuclei only, with distance R  between them, and electrons with positions r 1 with respect
to nucleus 1, and r 2 with respect to nucleus 2, and generic position (with respect to some arbitrary
origin) r .
Then the molecular orbital is a linear combination of the atomic orbitals (MO = LCAO), meaning
$$\phi(\vec r) = A\,\phi_1(\vec r_1) + B\,\phi_2(\vec r_2). \qquad (42.49)$$
The energy in this state is
$$E_\phi = \frac{A^2 H_{11} + B^2 H_{22} + 2AB H_{12}}{A^2 + B^2 + 2AB S}, \qquad (42.50)$$
where
$$S = \langle \phi_1 | \phi_2 \rangle = \int \phi_1^* \phi_2. \qquad (42.51)$$

We minimize it with respect to A and B,
$$\frac{\partial E_\phi}{\partial A} = \frac{\partial E_\phi}{\partial B} = 0. \qquad (42.52)$$
If the two atomic levels are equal, $H_{11} = H_{22}$, then at the minimum we find $A = \pm B$ and
$$E_{1,2} = \frac{H_{11} \pm H_{12}}{1 \pm S}, \qquad (42.53)$$
meaning the level splits in two.
meaning the level splits in two.
If $H_{11} \neq H_{22}$, but S is negligible ($S \ll 1$), then the split energy levels are
$$E_{1,2} = \frac{H_{11} + H_{22}}{2} \pm \frac{H_{11} - H_{22}}{2} \sqrt{1 + \left( \frac{2H_{12}}{H_{11} - H_{22}} \right)^2}. \qquad (42.54)$$
We note that $E_1 < H_{11}, H_{22}$ and $E_2 > H_{11}, H_{22}$, and now $|B| \neq |A|$.
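This split-level formula is just the diagonalization of a 2 × 2 matrix with the overlap neglected, which can be verified directly (the numbers below are illustrative):

```python
# Direct check of the split-level formula (overlap S neglected): it must
# reproduce the exact eigenvalues of the 2x2 Hamiltonian.
import numpy as np

H11, H22, H12 = -2.0, -1.0, 0.3

mean = 0.5 * (H11 + H22)
disc = 0.5 * abs(H11 - H22) * np.sqrt(1.0 + (2.0 * H12 / (H11 - H22))**2)
E_formula = np.array([mean - disc, mean + disc])

E_exact = np.linalg.eigvalsh(np.array([[H11, H12], [H12, H22]]))
print(E_formula, E_exact)   # identical; E1 < min(H11, H22), E2 > max(H11, H22)
```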

42.11 Chemical Bonds

Chemical bonds fall into two cases:


• Purely covalent bond, as in a homopolar molecule. It corresponds to | A| = |B| (A = ±B), and so to
the first case above, with the MO being the symmetric LCAO,
φ = a(φ1 ± φ2 ). (42.55)
In it, we see that the electron is evenly distributed between the two nuclei (or, rather, atoms).
• An ionic bond, as in a heteropolar molecule. It corresponds to $|A| \neq |B|$, which is the second case above, with
$$\phi = a(\phi_1 + \lambda \phi_2), \qquad (42.56)$$
but where $\lambda \neq \pm 1$, so the electron is found mostly near one or the other nucleus.

(a) (b) (c)

(d) (e)

(g) (h)

(f)

Figure 42.3 (a) s orbital. (b) p orbital. (c) σ bond in C–C between two 2p orbitals or spx orbitals. (d) π bond in C–C between two 2p orbitals
or spx orbitals. (e) Hybridization: spx orbitals (sp, sp2 , sp3 ). (f) C2 H6 , C2 H4 and C2 H2 structure, orbitals, and bonds. (g) Aromatic
bond in C6 H6 , the naive structure and the correct structure. (h) Aromatic bond in C6 H6 , the p orbitals that make up the
aromatic bond.

More precisely, describing the bonds between two atoms in a molecule (i.e., the chemical bonds), we first note that there is a preferred axis uniting the two nuclei. This means that $\vec L_{\rm tot}$ is not conserved; only its projection onto the axis, $L_{z,\rm tot}$, is conserved. The values are, as usual, $L_z = 0, 1, 2, \ldots$,
denoted by σ, π, δ, . . . instead of s, p, d, f , . . .. These are molecular orbitals, which only have one
axis of symmetry, not the full rotational invariance. See Fig. 42.3a, b, c, d for the spatial form of the
s, p atomic orbitals and σ, π bonds, respectively (exemplified for carbon).

Hybridization
A very important example is the carbon atom, which has four free electrons (outside full shells), two
electrons in S states, with opposite spins ↑↓, and two electrons in P states. Specifically, the electron structure is 1s(↑↓), 2s(↑↓), 2p(↑), 2p(↑), 2p( ), with the last p orbital empty.
But in molecules that include a carbon atom, the split in energy between the S and the P states is
much smaller than the binding energy of the molecule. That means that the relevant atomic orbitals
of carbon are “hybridized”, or effective, orbitals that are combinations of the S and P orbitals.

The particular hybridization depends on the (symmetry of the) molecule under study. In Fig. 42.3e
we see the spatial form of the hybridized spx orbitals (with x = 1, 2, 3).
We now consider the CH4 (methane) molecule. In it, each hydrogen has a bond with the carbon,
all these bonds being equivalent. This means that we need to hybridize all four orbitals of carbon,
obtaining sp3 orbitals instead of the s and p orbitals.
The sp³ hybridization defines a linear-combination wave function,
$$\phi_{sp^3} = c_1 \phi_s + c_2 \phi_{p_x} + c_3 \phi_{p_y} + c_4 \phi_{p_z} = \frac{1}{\sqrt{1+c^2}} \left[ \phi_s + c \left( u_1 \phi_{p_x} + u_2 \phi_{p_y} + u_3 \phi_{p_z} \right) \right]. \qquad (42.57)$$
We have to define four sp³ orbitals replacing the s and three p orbitals, which means that there are four different $\vec u$ vectors, $\vec u^{(1)}, \vec u^{(2)}, \vec u^{(3)}, \vec u^{(4)}$. The orbitals have to be equivalent, so we need to have the same c for all of them. Moreover, the states need to be orthonormal, so
$$\langle \phi_{sp^3}^{(1)} | \phi_{sp^3}^{(2)} \rangle = \frac{1}{1+c^2} \left[ 1 + c^2 \, \vec u^{(1)} \cdot \vec u^{(2)} \right] = 0, \quad \text{etc.} \qquad (42.58)$$
But $\vec u^{(i)} \cdot \vec u^{(j)} = \cos\theta_{ij}$, thus the equation becomes
$$c^2 \cos\theta_{ij} = -1, \qquad (42.59)$$
and since the angles between any two of the four vectors must be the same, the vectors end on the vertices of a regular tetrahedron, meaning
$$\cos\theta_{ij} = -\frac{1}{3} \;\Rightarrow\; c^2 = 3. \qquad (42.60)$$
Each sp3 orbital will join with a hydrogen atom and orbital, forming the CH4 molecule.
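The tetrahedral geometry and the orthonormality of the four hybrids can be verified numerically; a Python sketch using the standard alternate-cube-vertex directions for the $\vec u$ vectors:

```python
# sp3 hybrids: four unit vectors to alternate cube vertices form a regular
# tetrahedron (pairwise cosines -1/3), and with c^2 = 3 the four hybrids
# (phi_s + c u.phi_p)/sqrt(1 + c^2) are orthonormal in the (s, px, py, pz) basis.
import numpy as np

u = np.array([[ 1,  1,  1],
              [ 1, -1, -1],
              [-1,  1, -1],
              [-1, -1,  1]]) / np.sqrt(3.0)

c = np.sqrt(3.0)
coeffs = np.hstack([np.ones((4, 1)), c * u]) / np.sqrt(1.0 + c**2)

print(np.round(u @ u.T, 3))            # off-diagonal entries: -1/3
print(np.round(coeffs @ coeffs.T, 3))  # identity: orthonormal hybrids
```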
Next we consider the C2 H6 (ethane) molecule, H3 C–CH3 . In it, each of the two carbons has three
bonds, each with a hydrogen atom, and there is one bond between the two carbons. In it, the same sp3
hybridization works, and three such orbitals are used to bond with the hydrogen atoms. The last sp3
orbital of each of the two carbons joins to form a σ (l z = 0) bond. In it, the spins of the two electrons
(from the two carbons) are antiparallel, ↑↓, giving a bond (i.e., negative energy). See Fig. 42.3f for
the structure, orbitals, and bonds of C2 H6 .
Next we consider the C2 H4 (ethylene) molecule, H2 C=CH2 . In it, for each carbon atom, the
s orbital and two of the p orbitals hybridize into three sp2 orbitals, leaving a single p orbital
unhybridized. The two bonds between the carbons are different. There is a σ bond (L z = 0), with
the two sp2 orbitals parallel to the axis between the atoms and with the electrons in it antiparallel, as
before. The two unhybridized p orbitals, each with wave function
$$\psi_0 = \vec u^0 \cdot \vec \phi = u_1^0 \phi_{p_x} + u_2^0 \phi_{p_y} + u_3^0 \phi_{p_z}, \qquad (42.61)$$
join into a π ($L_z = 1$) orbital. The p orbitals are transverse to the axis between the carbons, and the wave function is normal to the sp² hybridized orbitals.
Next we consider the C2 H2 (acetylene) molecule, HC≡CH. In it, for each carbon atom, the s orbital
and one p orbital hybridize into two sp orbitals, while the remaining two p’s remain untouched. One
sp orbital joins with a hydrogen, while the other joins with the orbital of the other carbon and forms
a σ bond. The two remaining untouched p orbitals of each carbon join with the one from the other
atom, and form two π bonds. In all, we have one σ and two π bonds between the carbons. See
Fig. 42.3f for the structure, orbitals, and bonds of C2 H4 and C2 H2 .

Finally, we consider the C6 H6 (benzene) molecule, which corresponds to a complex bond. The
carbons sit at the vertices of a regular hexagon, and bond with one hydrogen each, and have a σ
bond with each of the two immediate neighbors, leaving one electron unbonded. Therefore, one s
orbital and two p orbitals hybridize into three sp2 orbitals, one joining with the hydrogen and the
other two joining with the immediate neighbors to make two σ bonds. The remaining electrons from
each carbon pool together to make a circular motion around the hexagon that is Bohr quantized (like
in a hydrogenoid atom) on the circle, meaning we have delocalized bound electrons. We represent
these electrons by a circle inside the hexagon. In Fig. 42.3g we represent the naive structure and the
actual structure (with the aromatic bond) of C6 H6 , and in Fig. 42.3h we represent the creation of the
aromatic bond out of p orbitals.

Important Concepts to Remember

• In a hydrogenoid atom, the states are described by (n, l, m, ms ) or (because of the spin–orbit
interaction) as (n, l, j, m j ), and the spectroscopic notation nl d .
• In shell models, we fill atomic orbitals, extensions of the hydrogenoid orbitals; the Hund rules
state that electrons avoid being in the same orbitals if other orbitals of same l are available, and
that electrons in equivalent, but different orbitals, have parallel spins in the ground state.
• In LS coupling (normal, or Russell–Saunders), first the $\vec l_i$ couple into $\vec L$ and the $\vec s_i$ into $\vec S$, then $\vec L$ and $\vec S$ couple into $\vec J$; in jj, or spin–orbit, coupling, first each $\vec l_i$ couples with $\vec s_i$ into $\vec j_i$, then the $\vec j_i$ couple into $\vec J$; most atoms are in between these two extremes. Note that the coupling doesn't change the number of energy levels, only the energy gaps.
• In the method of atomic orbitals, we use a finite number of unperturbed eigenfunctions for the total
Hamiltonian, and minimize.
• Chemical bond types are electrovalent or ionic, metallic (electrons delocalized within the crys-
talline structure), and covalent, between neutral atoms.
• Energy scales in molecules are $E_{\rm rot} \ll E_{\rm vibr} \ll E_e$, with usually $E_{\rm rot} \sim 25$ meV, $E_{\rm vibr} \sim 0.5$ eV and $E_e \sim$ a few eV.
• A molecular orbital is a product of individual electronic orbitals,

φr (r 1 , . . . , r ne ) = φ a1 (r 1 ) · · · φ an e (r ne ),

and in the method of molecular orbitals one uses linear combinations of these, and the Ritz
variational method for them.
• In the LCAO method, one defines a molecular orbital as linear combinations of atomic orbitals (for
a single electron with respect to each of the nuclei) and uses a Ritz-like variational method for the
linear coefficients.
• In molecules, atomic orbitals hybridize when the difference in energy between them is smaller than
the binding energy of the molecule: e.g., sp3 in CH4 and C2 H6 , sp2 in C2 H4 , sp in C2 H2 .
• Hybridized orbitals can join to form bonds: σ ($L_z = 0$) bonds, from two such orbitals (sp³, sp², or sp) that are parallel to the common axis and have electrons of opposite spin, while 2p orbitals can form π ($L_z = 1$) bonds, etc. One particular example is the aromatic bond of C₆H₆, with the electrons delocalized and quantized on a circle.

Further Reading
See [2].

Exercises

(1) Write down explicitly the 4s and 4p atomic orbital wave functions.
(2) Use the shell model and the Hund rules to show how the orbitals are populated with electrons in
the elements C, O, and Mg.
(3) Consider the element O (oxygen). Write down the LS and jj couplings for the electrons, and
show explicitly that the splitting of energy levels is the same for each type of coupling.
(4) Consider the element O. Write down explicitly the method of atomic orbitals for it (for electron–
electron interactions), using the basis of only the filled orbitals.
(5) Write down explicitly the method of molecular orbitals for the O2 molecule.
(6) Write down explicitly the LCAO method for the O2 molecule, applied to each of the atomic
orbitals in O.
(7) Write down the Bohr quantization for the common electron in benzene, and find the energy
levels as a function of the “radius” of the benzene molecule (the distance between the center and
a C atom).
43 Nuclear Liquid Droplet and Shell Models

In this chapter, we study nuclear models and approximations, the counterpart to the analysis of atoms
in the previous chapter.

43.1 Nuclear Data and Droplet Model

In the nucleus we have Z protons (where Z is the electric charge of the nucleus) and N neutrons. Both
of them are nucleons; the total number of nucleons is known as the mass number A, so A = N + Z.
The proton and neutron masses are approximately the same, m_p ≃ m_n, together called m_N, and the
mass number A is then approximately the total nucleus mass divided by m_N. More precisely,

M_nucleus ≃ A m_p (1 − 0.8/100),     (43.1)

which means that the binding energy per nucleon is very small (about 0.8% of the nucleon rest energy,
or roughly 8 MeV) and is approximately constant as A increases.
Moreover, for most nuclei, we have the radius R versus A law:

R = r 0 A1/3 , (43.2)

where r 0  1.3 × 10−15 m = 1.3 fm (1 fm = 10−15 m is a femtometer, or one fermi). This law and the
previous relation (M ∝ A) imply that both the energy and the volume of the nucleus are proportional
to A, which is the property of a liquid. We say then that we have “nuclear matter”, forming something
like a droplet of liquid, hence we have a “liquid droplet model”.
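As a quick numerical check of the droplet picture, the nucleon number density implied by R = r0 A^{1/3} is the same for every nucleus (a minimal sketch; the r0 value is the one quoted below):

```python
import math

r0 = 1.3e-15  # m, from R = r0 * A**(1/3)

def density(A):
    """Nucleon number density (nucleons/m^3) for mass number A."""
    R = r0 * A ** (1.0 / 3.0)
    return A / (4.0 / 3.0 * math.pi * R ** 3)

# He-4, O-16, Sn-120 and Pb-208 all give the same density,
# rho = 3 / (4 pi r0^3), independent of A -- the "liquid" property.
rhos = [density(A) for A in (4, 16, 120, 208)]
assert all(abs(r / rhos[0] - 1) < 1e-12 for r in rhos)
print(f"nuclear matter density ~ {rhos[0]:.2e} nucleons/m^3")
```

Multiplying by m_N ≈ 1.67 × 10⁻²⁷ kg gives the familiar nuclear matter mass density of order 10¹⁷ kg/m³.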
In the ground state, we have a static, spherically symmetric fluid droplet. In an excited state, we
can have waves propagating in the fluid, deformations of the droplet that make it nonspherical, etc.
Inside this fluid, the individual nucleons are like particles of matter, and their motion is either thermal
or Brownian.
However, the law E ∝ A is only approximate (we mentioned it is valid for most nuclei; in fact,
there are fluctuations around it for some cases). In fact, there is some structure besides linearity:
there are nuclei that have better stability, meaning higher binding energy. The nuclei for which this
happens have either N or Z as one of the “magic numbers”,

2, 8, 20, 28, 50, 82, 126. (43.3)

For instance, tin (chemically Sn, for Stannum in Latin) has Z = 50, and it has 10 stable isotopes,
the most of any element.
Also, if both N and Z are magic numbers, a situation called “double magic”, we have elements that
are even more stable. One example of this is helium-4, He4 , with N = Z = 2; others are oxygen-16,
O16 , with N = Z = 8, and lead-208, Pb208 , with Z = 82 and N = 126.

The situation above, with magic numbers for Z and/or N, is similar to the stability of atomic inert
gases, which have full atomic shells, in the shell model of the atom. In fact, more quantitatively, for
electrons in a hydrogenoid atom, with a Coulomb potential, the degeneracy of a shell of given n, for
electrons characterized by (n, l, m, ms ), is (ms takes two values and m takes 2l + 1 values, while l
goes from 0 to n − 1)


d_n = Σ_{l=0}^{n−1} 2(2l + 1) = 2n²,     (43.4)

giving, for n = 1, 2, 3, 4, 5, 6,

2, 8, 18, 32, 50, 72. (43.5)

However, the magic number is supposed to be the total number of electrons in all the full
shells, so

Σ_{m=1}^{n} d_m = Σ_{m=1}^{n} 2m² = n(n + 1)(2n + 1)/3,     (43.6)

giving, for n = 1, 2, 3, 4, 5, 6,

2, 10, 28, 60, 110, 182, . . . (43.7)
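These countings are easy to verify in a few lines (a minimal sketch reproducing the two lists above):

```python
# Atomic shells: d_n = sum_{l=0}^{n-1} 2(2l+1) = 2 n^2, and the candidate
# magic numbers are the cumulative sums n(n+1)(2n+1)/3.
def d(n):
    return sum(2 * (2 * l + 1) for l in range(n))

degs = [d(n) for n in range(1, 7)]
cums = [sum(degs[:n]) for n in range(1, 7)]

assert degs == [2, 8, 18, 32, 50, 72]       # 2 n^2, eq. (43.5)
assert cums == [2, 10, 28, 60, 110, 182]    # n(n+1)(2n+1)/3, eq. (43.7)
assert all(c == n * (n + 1) * (2 * n + 1) // 3
           for n, c in enumerate(cums, start=1))
```

Of these cumulative totals, only 2 and 28 coincide with the nuclear magic numbers, so the Coulomb-like shell structure cannot be the whole story.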

43.2 Shell Models 1: Single-Particle Shell Models

From the previous analysis it follows that one possibility for a nuclear model is to consider a shell
model similar to that for an atom, but with another central potential V (r) rather than the Coulomb
potential, and where the shell corresponds to a given energy (principal quantum number n).
The wave function for a central potential is, as we saw in Chapter 19,
ψ_nlm(r) = [χ_nl(r)/r] Y_lm(θ, φ).     (43.8)
r
The form of the reduced radial wave function χnl (r) is found from the Schrödinger equation, for
which we need the potential V (r). The true potential is something like a rounded-out finite square
well, with a depth of V0 and width R (the maximum radius for the well) and 0 at infinity, as in
Fig. 43.1a.
Instead of the true potential, we must use an analytic form that we can solve. A first approximation
is a spherical square well, of depth V0 and radius R. The potential is V = −V0 for 0 ≤ r ≤ R and, in
order to be closer to reality, we would also need V = 0 for r > R; see Fig. 43.1b. A second possible
approximation is a spherical (three-dimensional) harmonic oscillator, starting at V (r = 0) = −V0 ,
and reaching V = 0 at r = R. To be closer to reality, again we would need to put V = 0 for r > R;
see Fig. 43.1c.
Since the potential is central, for r ∈ (0, ∞), and not one-dimensional (for x ∈ (−∞, +∞)),
the conditions that one needs to impose are finiteness and integrability at 0 and ∞, as we saw in
Chapter 19.


Figure 43.1 (a) Real nuclear potential. (b) Approximation 1: spherical square well. (c) Approximation 2: three-dimensional harmonic
oscillator.

Approximation 1 (Spherical Square Well)

The spherical square well has been analyzed in Chapter 19 already, where we found that, for
r < R ≡ r0 ,
χ_nl(r)/r = N_nl j_l(kr),     (43.9)
where

k = √[2m(E + V0)/ℏ²].     (43.10)
For r > R, the correct thing to do would be to put V = 0, but the simpler thing to do is to consider
V = ∞, for an infinitely deep square well. If we are interested only in the low-lying energy states (for
V0 − |E| ≪ V0), which we are, since there are only seven magic numbers, so presumably we go up to
at most n = 7, the difference in the calculation required is minimal.
Then the wave function should vanish for r ≥ R, meaning that we need to have the quantization
condition

j_l(kR) = 0  ⇒  k_n R = j^0_{n,l},     (43.11)

where j^0_{n,l} is the nth zero of j_l. It then also follows that

N_nl² = (2/R³) · 1/[j_{l+1}(j^0_{n,l})]².     (43.12)

If we consider, as is more appropriate, V = 0 for r > R, the analysis of the boundary conditions,
giving the quantization condition and normalization, is more complicated.
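For the infinitely deep well, the quantization condition j_l(kR) = 0 can be checked numerically; the sketch below (using SciPy; the labels (n_r + 1)l are the standard well notation) finds the zeros j^0_{n,l} by bisection and orders the levels, whose energies scale as (j^0_{n,l})²:

```python
import numpy as np
from scipy.special import spherical_jn
from scipy.optimize import brentq

def jl_zeros(l, count):
    """First `count` positive zeros j0_{n,l} of the spherical Bessel j_l."""
    x = np.linspace(0.1, 25.0, 5000)
    y = spherical_jn(l, x)
    zs = [brentq(lambda t: spherical_jn(l, t), a, b)
          for a, b, ya, yb in zip(x[:-1], x[1:], y[:-1], y[1:])
          if ya * yb < 0]
    return zs[:count]

# j_0(x) = sin(x)/x, so its zeros are n*pi
assert np.allclose(jl_zeros(0, 3), [np.pi, 2 * np.pi, 3 * np.pi])

# Infinite-well energies scale as (j0_{n,l})^2; sorting them gives the
# level ordering of the spherical square well.
levels = sorted((z ** 2, f"{n + 1}{'spdfgh'[l]}")
                for l in range(6)
                for n, z in enumerate(jl_zeros(l, 2)))
print([name for _, name in levels[:6]])  # -> ['1s', '1p', '1d', '2s', '1f', '2p']
```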

Approximation 2 (Spherical Harmonic Oscillator)

A better approximation is, however, the spherical harmonic oscillator, starting at V = −V0 , in which
case the potential for the nucleus is
V_N(r) = −V0[1 − (r/R)²] = −V0 + (1/2)mω²r²,     (43.13)

where we have defined


V0/R² ≡ (1/2)mω²,     (43.14)
and m is the nucleon mass, m = m N  mn  m p .
Again, really we should stop the increase of V at r = R, when V = 0, and put V = 0 for r > R
instead. However, if we are interested only in the low lying states (and we saw that we have only
seven magic numbers, so we should stop at most at n = 7), the difference is minimal.
The solution of the radial Schrödinger equation is
χ_nl(r)/r = N_{n,l} e^{−α²r²/2} (αr)^l ₁F₁(−n_r, l + 3/2; (αr)²),     (43.15)

where

₁F₁(−n_r, l + 3/2; (αr)²) ∝ L^{l+1/2}_{n_r}(α²r²)     (43.16)

are associated Laguerre polynomials,

α = √(mω/ℏ),     (43.17)

the normalization constant is

N_{n,l} = [α^{3/2}/Γ(l + 3/2)] √[2Γ(n_r + l + 3/2)/n_r!],     (43.18)

and n_r = 0, 1, 2, . . . is the number of nodes of the radial wave function (not counting the one at infinity). The
energy (principal) quantum number is, as we saw,

n = 2nr + l = 0, 1, 2, . . . (43.19)

From the quantization condition, we found the eigenenergies


 
E_n = ℏω(n + 3/2),     (43.20)

measured from the bottom of the well at −V0.
The degeneracy of the above energy levels comes from the two values of the spin projection ms and
sum over l from 0 to n,

d_n = 2 Σ_{l=0}^{n} (l + 1) = 2 · (n + 1)(n + 2)/2 = (n + 1)(n + 2).     (43.21)

But the number in which we are interested, to be tested against the magic numbers, is the total
number of states in the fully filled shells, i.e.,

Σ_{m=0}^{n} d_m = Σ_{m=0}^{n} (m² + 3m + 2) = n(n + 1)(2n + 1)/6 + 3n(n + 1)/2 + 2(n + 1)
               = (n + 1)(n + 2)(n + 3)/3,     (43.22)
which gives, for n = 0, 1, 2, 3, 4, 5, 6, 7,

2, 8, 20, 40, 70, 112, 168, 240, . . . (43.23)

Out of these, the first three numbers match the magic numbers, although the others do not.
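A quick check of this counting (a minimal sketch reproducing (43.23) and comparing with the magic numbers):

```python
# 3D harmonic oscillator: level n = 2 n_r + l has degeneracy
# d_n = (n+1)(n+2) including spin; the candidates for the magic numbers
# are the cumulative sums (n+1)(n+2)(n+3)/3.
d_ho = [(n + 1) * (n + 2) for n in range(8)]
cum_ho = [sum(d_ho[:n + 1]) for n in range(8)]

assert d_ho == [2, 6, 12, 20, 30, 42, 56, 72]
assert cum_ho == [2, 8, 20, 40, 70, 112, 168, 240]
assert all(c == (n + 1) * (n + 2) * (n + 3) // 3
           for n, c in enumerate(cum_ho))

magic = [2, 8, 20, 28, 50, 82, 126]
assert [m for m in magic if m in cum_ho] == [2, 8, 20]  # first three only
```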

For completeness, note that

d n = (n + 1)(n + 2) = 2, 6, 12, 20, 30, 42, 56, 72, . . . (43.24)

Since only the first three numbers match, we need a better theory but it is clear what it should be,
since the same thing happens for the atomic energy levels. At high Z, as for the atomic levels, the
magnetic spin–orbit coupling becomes large (whereas the electrostatic energy responsible for energy
levels does not increase).

43.3 Spin–Orbit Interaction Correction

In order to understand the nuclear spin–orbit interaction, we review the atomic spin–orbit interaction
first. The interaction term in the Hamiltonian comes from the Dirac equation, in the expansion around
the nonrelativistic result (see Chapter 54), and one finds
H_spin–orbit = [e/(2m0²c²)] (1/r)(dA⁰/dr) s · l.     (43.25)
But then

eA0 = −Ve (r) (43.26)

is the Coulomb potential, so the spin–orbit interaction Hamiltonian is rewritten as


1 dVe (r)
Hspin–orbit = −const × (2s · l ). (43.27)
r dr
We will assume that the same formula still holds for the nucleus, except that instead of the Coulomb
potential Ve (r) we use the nuclear potential (the potential for the nucleus states) VN (r).
We will use the second approximation, the spherically symmetric harmonic oscillator, which gives
better results, and obtain that
(1/r)(dV_N(r)/dr) = mω² = 2V0/R² = const.     (43.28)
We also find that V0  Ebinding ∝ A increases for larger nuclei.
Moreover,
(2s · l) = (l + s)² − l² − s² = j² − l² − s²
         = ℏ²[j(j + 1) − l(l + 1) − s(s + 1)].     (43.29)
Since all the nucleons are spin 1/2 fermions, just like electrons, s = 1/2 implies that j = l ± 1/2. In
this case, for the two values we find

(2s · l)/ℏ² = (l + 1/2)(l + 3/2) − l(l + 1) − 3/4 = l,          for j = l + 1/2
            = (l − 1/2)(l + 1/2) − l(l + 1) − 3/4 = −l − 1,     for j = l − 1/2.     (43.30)
Then the spin–orbit interaction Hamiltonian is
     
H_spin–orbit = −(const ℏ²) × { l, for j = l + 1/2 (shift < 0); −l − 1, for j = l − 1/2 (shift > 0) }.     (43.31)

This means that the j = l − 1/2 states are shifted upwards in energy, and are shifted more as A
increases (so as either N or Z increases), since the constant is roughly proportional to A. Similarly,
the j = l + 1/2 states are shifted downwards in energy, and shifted more as A increases. Moreover,
the shift is also proportional to l for the first case, and to l + 1 in the second, so the shift is smaller
for small l and larger for large l.
To describe the nuclear states, we use a variant of the spectroscopic notation, with
n_r + 1 = (n − l)/2 + 1,     (43.32)

the number of nodes (including the one at infinity), appearing before the letter for l, so that the sub-shells are labeled

(n_r + 1)l_j.     (43.33)

For the first three shells, which as we saw already match the magic numbers, without considering
the spin–orbit shifts, we have the sub-shells
n = 0 : 1s1/2
n = 1 : 1p3/2 , 1p1/2 (43.34)
n = 2 : 1d 5/2 , 2s1/2 , 1d 3/2 ,
in the exact order of increasing energy written above, that is

(1s1/2 )n=0 , (1p3/2 , 1p1/2 )n=1 , (1d 5/2 , 2s1/2 , 1d 3/2 )n=2 . (43.35)

For the next shells, we have the following sub-shells over nr , l:


n = 3 : 1 f , 2p
n = 4 : 1g, 2d, 3s
(43.36)
n = 5 : 1h, 2 f , 3p
n = 6 : 1i, 2g, 3d, 4s,
where the n = 3 shell splits over j into

1 f 7/2 , 1 f 5/2 , 2p3/2 , 2p1/2 , (43.37)

but the first sub-shell, 1 f 7/2 , for n = 3, l = 3, has a spin–orbit interaction term that is sufficiently
large to separate it downwards in energy from the n = 3 shell, but not enough for it to join the n = 2
shell, meaning that it acts as a separate shell all by itself. Its degeneracy is 2 j + 1 = 8.
Next, for n = 4, the shell splits into

1g9/2 , 1g7/2 , 2d5/2 , 2d3/2 , 3s1/2 .     (43.38)

But now 1g9/2 has n = l = 4 and the spin–orbit interaction is large enough to take it down into the
n = 3 shell.
Finally, for n = 5, the shell splits into

1h11/2 , 1h9/2 , 2 f 7/2 , 2 f 5/2 , 3p3/2 , 3p1/2 . (43.39)

The sub-shell 1h11/2 has n = l = 5 and the spin–orbit interaction is large enough to take it down
into the n = 4 shell. Moreover, from the n = 6 shell, the element with n = l = 6, namely 1i 13/2 ,
also has a spin–orbit interaction large enough to take it down into the n = 5 shell. Then, finally, the

sub-shells, ordered by increasing energy into shells (except the n = 0, 1, 2 shells, which are already
matched), are as follows (where we also give the degeneracy d n of the shell and the magic number
k_n = Σ_m d_m over the filled shells):

(1f7/2)n=3 :                                            d_n = 8,   k_n = 28
(2p3/2 1f5/2 2p1/2)n=3 (1g9/2)n=4 :                     d_n = 22,  k_n = 50
(2d5/2 1g7/2 2d3/2)n=4 (1h11/2)n=5 (3s1/2)n=4 :         d_n = 32,  k_n = 82
(2f7/2 1h9/2 2f5/2 3p3/2)n=5 (1i13/2)n=6 (3p1/2)n=5 :   d_n = 44,  k_n = 126.     (43.40)
We see that we have managed to reproduce all the magic numbers.
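The bookkeeping can be verified by summing the capacities 2j + 1 of the sub-shells in the filling order assembled above (a sketch; the n = 5 labels 2f5/2 and 3p3/2 are taken from the splitting of the n = 5 shell into 1h, 2f, 3p):

```python
# Sub-shell filling order with the spin-orbit correction; each sub-shell
# nl_j holds 2j+1 nucleons, and the running totals at the shell closures
# reproduce all seven magic numbers.
shells = [
    ["1s1/2"],                                              # n = 0
    ["1p3/2", "1p1/2"],                                     # n = 1
    ["1d5/2", "2s1/2", "1d3/2"],                            # n = 2
    ["1f7/2"],                                              # a shell by itself
    ["2p3/2", "1f5/2", "2p1/2", "1g9/2"],
    ["2d5/2", "1g7/2", "2d3/2", "1h11/2", "3s1/2"],
    ["2f7/2", "1h9/2", "2f5/2", "3p3/2", "1i13/2", "3p1/2"],
]

def capacity(sub):
    """2j + 1 for a label like '1f7/2' (here j = 7/2)."""
    num = int(sub[2:].split("/")[0])
    return num + 1  # 2j + 1 = num + 1

totals, running = [], 0
for shell in shells:
    running += sum(capacity(s) for s in shell)
    totals.append(running)

assert totals == [2, 8, 20, 28, 50, 82, 126]  # the magic numbers
```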

43.4 Many-Particle Shell Models

Until now we have considered only single-particle shell models, where the contribution from the
other nucleons is assumed to be included in the one-particle potential VN (r), but in reality we must
consider the interactions of the nucleons, at least for those outside the fully filled shells, called (as
in the atomic case) valence particles. As for atoms, the fully filled shells correspond to an inert core
(spherically symmetric, without multipole interactions), contributing only through a “screening” of
the nuclear charge and thus modifying V0 only, but not creating a multiparticle potential.
The simplest correction to the above single-particle shell models is to consider a two-particle
potential for the valence particles. Denoting by r 1 , r 2 the positions of the two valence particles
relative to the nuclear center, and θ12 the angle between them, the potential is
V = V (r 1 , r 2 ) = V (r 1 , r 2 , cos θ12 ). (43.41)
Moreover, we can expand it in terms of Legendre polynomials for the angular dependence,

V = Σ_k f_k(r1, r2) P_k(cos θ12).     (43.42)

Conversely (using the orthonormality of the Legendre polynomials), we have



f_k = [(2k + 1)/2] ∫_{−1}^{1} P_k(cos θ12) V(r1, r2, cos θ12) d cos θ12.     (43.43)
The simplest model for the two-particle potential is a contact, delta function, potential,
V = −gδ³(r1 − r2) = −gδ(1 − cos θ12) δ(r1 − r2)/(2πr1r2),     (43.44)

where 2πr1r2 is the Jacobian factor for the transformation from Cartesian to radial coordinates
(after averaging over the azimuthal angle). In this case, we obtain

f_k(r1, r2) = −(g/4π)(2k + 1) δ(r1 − r2)/(r1r2).     (43.45)

We consider states that are tensor products of the single-particle states (in zeroth-order perturbation
theory for the interaction potential V , in which we have only single-particle potentials and shells),
defined in terms of the quantum numbers (nl jm) for each particle (the spin–orbit interaction for each
valence nucleon produces the coupling ji = l i + s i which, as for atomic j j coupling, happens at
large Z, for several fully filled shells since then the spin–orbit interaction is larger), so

|φ_{n1 l1 j1 m1}⟩ ⊗ |φ_{n2 l2 j2 m2}⟩.     (43.46)

We want to calculate the interaction energy coming out of this two-particle interaction potential
using first-order time-independent perturbation theory, so we will evaluate the potential V̂ for the
zeroth-order states.
The total angular momentum couples the two individual ones, j1 + j2 = J, so we replace |j1 m1 j2 m2⟩
by | j1 j2 J M via Clebsch–Gordan coefficients. We could continue with a full description, but we will
just show a simple application instead.
Consider the case when j1 = j2 = j, so that there are two identical particles in the same shell.
Then these particles can “pair up” into parallel but opposite (“antiparallel”) angular momenta, ↑↓,
i.e., J = 0 = M. In this case, the matrix element in the correct states |j1 j2 J M⟩ = |jj00⟩ can be
trivially related to the matrix elements for the |↑ ⊗ |↓ states (products of one-particle states), and
then the sum over m j gives the degeneracy d j = 2 j + 1, so

⟨jj00|V̂|jj00⟩ = (2j + 1) ∫ |φ_{n1 l1}(r1)|² |φ_{n2 l2}(r2)|² f_k(r1, r2) dr1 dr2.     (43.47)

Substituting (43.45), we obtain



⟨jj00|V̂|jj00⟩ = −(2j + 1)(g/8π) ∫ |φ_{n1 l1}(r)|² |φ_{n2 l2}(r)|² dr/r².     (43.48)
This is the pairing energy in first-order time-independent perturbation theory for the two-particle
shell model.
The next logical step would be to continue on to multiparticle shell models. But this leads to a
self-consistent solution, “Hartree–Fock approximation”, and will be treated in Chapter 57, in Part IIe ,
concerning multiparticle calculations.

Important Concepts to Remember

• A nucleus, defined by N and Z (and A = N + Z), can be roughly approximated as a liquid droplet
of nuclear matter, since its radius is R = r0 A^{1/3} and M_nucleus/(m_p A) ≃ const < 1.
• However, we have magic numbers of stability in N or Z, 2,8,20,28,50,82,126, explained by analogy
with the atomic shell model as being due to having a full nuclear shell (so the magic number is
Σ_m d_m), for the quantum numbers arising from quantization in the nuclear potential.
• The nuclear potential can be approximated either as a spherical square well or as a spherical (three
dimensional) harmonic oscillator starting at −V0 , naturally with V = 0 for r > R, but in practice
V = ∞ for r > R gives good results for low quantum numbers.
• For the spherical harmonic oscillator with V = 0 for r > R, we find the first three magic numbers,
but the rest do not match. For them, we need to take into account the spin–orbit interaction
correction, and make some assumptions about its parameters.

• To obtain a better result for the energy levels, one has to consider many-particle interactions. A two-
particle delta function interaction gives the pairing energy, but better corrections are obtained via
the self-consistent Hartree–Fock approximation.

Further Reading
See more about nuclear models in the book [27].

Exercises

(1) If a nucleus is spinning around a given axis, assuming the liquid droplet model how would you
modify the law R = r 0 A1/3 from the static case?
(2) Calculate the eccentricity, ΔR/R, of the spinning nucleus in exercise 1, in a classical physics
approximation.
(3) If the (effective) central potential is replaced with an (effective) azimuthal potential, V (r, z),
depending independently on the polar radius r in a plane and on the height z along the axis
transverse to the plane, is it possible to have a shell model? Why? If so, consider the harmonic
oscillator potential with different constants in the direction z and in the polar radial direction r,
and find the (first few) magic numbers.
(4) In the case of a spherical square well with V = 0 for r ≥ R, write down the solutions in
the regions r ≤ R and r ≥ R, the gluing (continuity) conditions at r = R, and the resulting
quantization condition.
(5) Consider the nuclear central potential (R2 ≫ R1, but not by too large a factor)

V_N(r) = −V0 (1 − r²/R1²) e^{−r/R2}.     (43.49)
What can you (qualitatively) infer about the modification of the energy levels with respect to
the levels of the spherical harmonic oscillator approximation in the text?
(6) In the case in exercise 5, calculate the spin–orbit interaction correction. What can you
(qualitatively) infer about the modification of the various states from the uncorrected case?
(7) Calculate the pairing energy for the two-particle delta function potential, for two nucleons in the
ground state, 1s1/2 .
44 Interaction of Atoms with Electromagnetic Radiation: Transitions and Lasers

In this chapter, we first study the exact time-dependent solution for a two-level system in a harmonic
potential, after which we study the general first-order perturbation theory in a harmonic potential for
interaction with an external electromagnetic field. Finally, we consider a quantized electromagnetic
field, and the interaction with the resulting photons, and derive the Planck formula that started
quantum mechanics.

44.1 Two-Level System for Time-Dependent Transitions

We start with the generic two-level system considered in Chapter 4, but now with a time-dependent
potential for transition, namely
      
iℏ (d/dt)(c1(t), c2(t))ᵀ = Ĥ (c1(t), c2(t))ᵀ = [[H11, H12], [H21, H22]] (c1(t), c2(t))ᵀ,     (44.1)

where H12 = V(t) and H21 = H12* = V*(t), since the Hamiltonian is Hermitian. Moreover, as before,
E = (H11 + H22)/2,  Δ = (H22 − H11)/2,     (44.2)
where the nonperturbed energies (at V (t) = 0) are
H11 = E1(0) = E − Δ
(44.3)
H22 = E2(0) = E + Δ > E1(0) .
A useful formalism for describing the evolution of the states is in terms of the density matrix,
   
ρ = |ψ(t)⟩⟨ψ(t)| = [[|c1(t)|², c1(t)c2*(t)], [c2(t)c1*(t), |c2(t)|²]].     (44.4)
Define first the difference in occupation numbers (probabilities) in the two states,
N (t) = ρ 22 − ρ 11 = |c2 (t)| 2 − |c1 (t)| 2 = 2|c2 (t)| 2 − 1, (44.5)
where we have used that |c1 (t)| 2 + |c2 (t)| 2 = 1, since we only have these two states, so the probability
for the system to be in either state is one.
Next, we note that ρ 21 = ρ ∗12 , and then define
ρ21 = c2(t)c1*(t) = [Q(t) + iP(t)]/2,     (44.6)

so that

Q = ρ21 + ρ21* = ρ21 + ρ12,  P = (ρ21 − ρ21*)/i = (ρ21 − ρ12)/i.     (44.7)

Then the Schrödinger equation is equivalent to the Bloch equations


(d/dt) N(t) = −(1/iℏ)[Q(t)(V(t) − V*(t)) + iP(t)(V(t) + V*(t))]
(d/dt) Q(t) = ω0 P(t) + (1/iℏ) N(t)(V(t) − V*(t))     (44.8)
(d/dt) P(t) = −ω0 Q(t) + (1/ℏ) N(t)(V(t) + V*(t)),
where we have defined
ω0 ≡ 2Δ/ℏ = (E2 − E1)/ℏ,     (44.9)
previously called ω21 .
The most relevant case, which can be solved exactly, is that of a harmonic potential,
V (t) = V0 eiωt . (44.10)
We can then check that the solution for N (t) is (the method for finding it is somewhat lengthy)
 
N(t) = N(0) − [2V0/(ℏΩ)] P(0) sin Ωt + [2V0/(ℏΩ)]{N(0)[2V0/(ℏΩ)] − Q(0)(ω0 − ω)/Ω}(cos Ωt − 1),     (44.11)

where

Ω ≡ √[(ω − ω0)² + (2V0/ℏ)²].     (44.12)
The solution for P(t), Q(t) is given as
Q(t) = Q′(t) cos ωt + P′(t) sin ωt
P(t) = P′(t) cos ωt − Q′(t) sin ωt
P′(t) = P(0) cos Ωt + {N(0)[2V0/(ℏΩ)] − Q(0)(ω0 − ω)/Ω} sin Ωt     (44.13)
Q′(t) = Q(0) + [(ω0 − ω)/Ω] P(0) sin Ωt − {N(0)[2V0/(ℏΩ)] − Q(0)(ω0 − ω)/Ω}[(ω0 − ω)/Ω](cos Ωt − 1).
Considering that N (t) = 2|c2 (t)| 2 − 1, and that the most relevant case is when all particles are in
the ground state initially, at t = 0, we have |c2 (0)| 2 = 0, which means N (0) = −1. Moreover, in this
case c2 (0) = 0, so c2∗ (0) = 0 also, and thus Q(0) = P(0) = 0. In this case we obtain
Q(t) = [2(ω0 − ω)V0/(ℏΩ²)](1 − cos Ωt) cos ωt − [2V0/(ℏΩ)] sin Ωt sin ωt
P(t) = −[2V0/(ℏΩ)] sin Ωt cos ωt − [2(ω0 − ω)V0/(ℏΩ²)](1 − cos Ωt) sin ωt     (44.14)
N(t) = −1 + [2V0/(ℏΩ)]²(1 − cos Ωt) = −1 + 2[2V0/(ℏΩ)]² sin²(Ωt/2).
Therefore we obtain that the probability for the system to be in the upper energy level is
P2(t) = |c2(t)|² = [1 + N(t)]/2 = [2V0/(ℏΩ)]² sin²(Ωt/2),     (44.15)
which is known as the Rabi formula. We also have P1 (t) = 1 − P2 (t).
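The Rabi formula can be checked against a direct numerical integration of the two-level Schrödinger equation (a minimal sketch with ℏ = 1 and arbitrary illustrative values of E1, E2, V0, ω):

```python
import numpy as np

# i dc/dt = H(t) c with H12 = V0 exp(i w t), H21 = conj(H12), hbar = 1
E1, E2, V0, w = 0.0, 1.0, 0.08, 1.1
w0 = E2 - E1
Omega = np.sqrt((w - w0) ** 2 + (2 * V0) ** 2)

def rhs(t, c):
    V = V0 * np.exp(1j * w * t)
    return -1j * np.array([E1 * c[0] + V * c[1],
                           np.conj(V) * c[0] + E2 * c[1]])

# fixed-step RK4, starting in the lower state c = (1, 0)
c, dt = np.array([1.0 + 0j, 0.0 + 0j]), 1e-3
ts = np.arange(0.0, 40.0, dt)
P2 = []
for t in ts:
    P2.append(abs(c[1]) ** 2)
    k1 = rhs(t, c)
    k2 = rhs(t + dt / 2, c + dt / 2 * k1)
    k3 = rhs(t + dt / 2, c + dt / 2 * k2)
    k4 = rhs(t + dt, c + dt * k3)
    c = c + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# compare with P2(t) = (2 V0 / Omega)^2 sin^2(Omega t / 2)
rabi = (2 * V0 / Omega) ** 2 * np.sin(Omega * ts / 2) ** 2
assert np.max(np.abs(np.array(P2) - rabi)) < 1e-6
```

At resonance (ω = ω0) the amplitude (2V0/Ω)² becomes 1 and the system cycles completely between the two levels.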

Figure 44.1 Probability |c2 (t)| 2 as a function of time t, showing the absorption and emission parts of the cycle.

From the form of Ω, we see that the amplitude is a maximum when Ω is a minimum, which
happens when
ω = ω0 = (E2 − E1)/ℏ,     (44.16)

which is known as the resonance condition. At resonance,

Ω = Ω0 = 2V0/ℏ.     (44.17)

Note also that the amplitude is Lorentzian in shape, since
[2V0/(ℏΩ)]² = (2V0)²/[(2V0)² + ℏ²(ω − ω0)²],     (44.18)
and it becomes 1 at the resonance. Moreover, when averaging over time, we obtain
⟨sin²(Ωt/2)⟩ = 1/2,     (44.19)
so P2  = 1/2 at resonance, and is smaller otherwise.
At resonance then, we have a cycling between the two energy levels. From t = 0 until Ωt/2 = π/2,
the system absorbs energy from the potential V (t), since E2 > E1 , and the probability P2 (t) = |c2 (t)| 2
increases from 0 to a maximum during that time. Then, from Ωt/2 = π/2 to π, P2(t) decreases, so the system
is losing energy, by emission of radiation. The cycles of absorption and emission, each lasting a time
Δt = π/Ω, repeat ad infinitum; see Fig. 44.1.
The above two-level system in a harmonic potential is relevant for the case of the laser, which
actually is the acronym for “light amplification by stimulated emission of radiation” (LASER). In
this case, there is radiation in a resonant cavity, with walls made of the atom with two levels. Then
the field gives energy to the atom through absorption, followed by stimulated emission, after which
the radiation gets reflected and comes back and gets absorbed, etc. The light is amplified coherently
and is then led out of the cavity as a laser.

44.2 First-Order Perturbation Theory for Harmonic Potential

The above case was exact, but the two-level system itself is an approximation. In reality, the atom is a
multi-level system and interacts with the electromagnetic field, oscillating at some frequency, which
generates transitions between the states of the system (so the interaction with the electromagnetic

field couples the states indirectly). Thus we can consider the generic interaction operator as the sum
of two terms (so that it is Hermitian),
Ĥ1 (t) = V̂ (t) = V̂0 eiωt + V̂0† e−iωt , (44.20)
for t > 0 (so, in a sudden approximation). For example, we found in Chapter 39 that the interaction
Hamiltonian with an electromagnetic field is
   
d 3 x( (k) · j) ei(k ·x −ωt) + e−i(k ·x −ωt) ,
 
Ĥ1 = d x A · j = A0
3  
(44.21)

and we took the matter current j to have matrix elements.


For comparison, for absorption from a constant field, we have the interaction Hamiltonian matrix
elements

H1,fi = iqA0ω0 ∫d³r ψf*(r)(ε(k) · r)ψi(r).     (44.22)

Then, using the first-order time-dependent perturbation theory formula from Chapter 37, integrated
over time, we have
c_n^{(1)}(t) = (1/iℏ) ∫_0^t dt′ H1,ni e^{iω_ni t′}.     (44.23)
Substituting H1 (t), we find
c_n^{(1)}(t) = (1/iℏ) ∫_0^t dt′ (V0,ni e^{iωt′} + V†0,ni e^{−iωt′}) e^{iω_ni t′}
            = (1/ℏ)[V0,ni (1 − e^{i(ω+ω_ni)t})/(ω + ω_ni) + V†0,ni (1 − e^{i(ω_ni−ω)t})/(ω_ni − ω)].     (44.24)
Then, as in Chapter 37, we continue on to find the Fermi golden rule, but now with the replacement
ℏω_ni = E_n − E_i → ℏω_ni ± ℏω = E_n − E_i ± ℏω,     (44.25)
where the ± refers to stimulated emission or absorption.
The modified Fermi golden rule is then
  
dP_ni/dt = (2π/ℏ){|V0,ni|² ; |V†0,ni|²} ∫dE ··· ρ(E_n, . . .) δ(E_n − E_i ± ℏω).     (44.26)
This formula could have been obtained intuitively, for the stimulated emission or absorption of a
photon with energy ω.
We note that the Hamiltonian Ĥ is Hermitian, so |V†0,ni|² = |V0,ni|², which comes from ⟨i|V0†|n⟩ = ⟨n|V0|i⟩*.

44.3 The Case of Quantized Radiation

To improve upon the previous analysis, we must also quantize the electromagnetic field. Strictly
speaking, this should be done in the relativistically invariant quantum electrodynamics (QED)
formalism, which is beyond the scope of this book, but we can make some shortcuts and “guess”
the result.

Briefly, radiation is made up of quanta of a given ω, just as for a harmonic oscillator, where
we saw that E = (N + 1/2)ℏω is built up from the vacuum by adding N quanta of energy ℏω.
Now, the vector potential A contains a polarization vector ε_j(k) (where j is the index for the two
polarizations transverse to the momentum), a wave factor e^{i(k·r−ωt)}, and a normalization constant K,
plus a Hermitian conjugate term to ensure the reality of the potential (all of which are present in the
classical theory), but now the wave factor is multiplied, as for a harmonic oscillator, by a creation or
annihilation operator for respectively creating or annihilating a photon. All in all,
  
A(r, t) = K Σ_{k,j} ε_j(k)[a_j(k)e^{i(k·r−ωt)} + a_j†(k)e^{−i(k·r−ωt)}].     (44.27)

As we saw when discussing the harmonic oscillator, the only nonzero matrix elements of the
creation and annihilation operators are

⟨N − 1|a|N⟩ = √N,  ⟨N + 1|a†|N⟩ = √(N + 1).     (44.28)
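These matrix elements are easy to verify on a truncated Fock basis (a small sketch; the dimension is arbitrary):

```python
import numpy as np

# Annihilation operator a on the truncated Fock basis {|0>, ..., |dim-1>}
dim = 8
a = np.diag(np.sqrt(np.arange(1.0, dim)), k=1)  # a|N> = sqrt(N) |N-1>
adag = a.conj().T

N = 5
assert np.isclose(a[N - 1, N], np.sqrt(N))         # <N-1| a |N> = sqrt(N)
assert np.isclose(adag[N + 1, N], np.sqrt(N + 1))  # <N+1| a+ |N> = sqrt(N+1)

# The number operator a+ a is diagonal with eigenvalues N = 0, ..., dim-1
assert np.allclose(adag @ a, np.diag(np.arange(dim)))
```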
We see that, with respect to the classical analysis of the electromagnetic field given above, we replace
A0 with K a j (k) in one term and with K a†j (k) in the other, which are operators in both cases. This
means that now we need to consider states not only for the atom but for the radiation itself, and now
the total Hilbert space is a product Htotal = Hatom ⊗ Hradiation . Then the initial and final states become

|I⟩ = |i⟩ ⊗ |ĩ⟩,  |F⟩ = |f⟩ ⊗ |f̃⟩,     (44.29)

where |i⟩, |f⟩ are atomic states and |ĩ⟩, |f̃⟩ are radiation states.
Therefore now the matrix elements of the interaction potential are

V0,FI = K (a_j(k))_{f̃ĩ} M_fi,     (44.30)

where

M_fi = Σ_{k,j} ∫d³r ε_j(k) · (j)_fi e^{ik·r}.     (44.31)

But, as we just saw, the only nonzero matrix element (a_j(k))_{f̃ĩ} is for |ĩ⟩ = |N⟩ and |f̃⟩ = |N − 1⟩, in
which case the matrix element is √(N_j(k)). As we can see, in this situation one photon is absorbed,
so the term with V0,FI is for absorption, as in the classical radiation case.
Similarly,

V†0,FI = K (a_j†(k))_{f̃ĩ} M*_fi     (44.32)

is only nonzero if |ĩ⟩ = |N⟩ and |f̃⟩ = |N + 1⟩, in which case the matrix element is √(N_j(k) + 1). This
is then the situation of emission of a photon into the radiation field, so the V†0,FI term corresponds to
emission, again as in the classical radiation case.
Thus, the transition rate in this quantum radiation case is (from the Fermi golden rule)


dP(I → F)/dt = (2π/ℏ) K²|M_fi|² Σ_{k,j} N_j(k) ∫dE ··· δ(E_f − E_i − ℏω)          (abs.)
             = (2π/ℏ) K²|M_fi|² Σ_{k,j} [N_j(k) + 1] ∫dE ··· δ(E_f − E_i + ℏω)    (em.)     (44.33)

for the absorption and emission cases, respectively.



We note that the transition rate in the emission case is proportional to N_j(k) + 1, so it can be split
into two terms, a term proportional to N j (k), which is only nonzero when there are already photons
in the radiation field, so this is a stimulated emission term (emission stimulated by photons), whereas
the term equal to 1 is independent of the existence of photons, so it is a spontaneous emission term.
We also note that the absorption rate equals the stimulated emission rate, as in the previous
(classical) case. Only the spontaneous emission is different. In the resulting spontaneous emission
rate, the right-hand side is called the “Einstein coefficient”.
We can turn the sum over k into an integral,
  
∫d³k = ∫k² dk dΩ = ∫(ω²dω/c³) dΩ.     (44.34)
Moreover, the normalization constant K is proportional to 1/√ω, so we define K = K′/√ω.
Then we finally obtain, for the emission rate in the cases of stimulated or spontaneous emission,
  
dP_ni/dt = (2π/ℏ)(ω²/c³)|M_ni|² K′² ∫dω dΩ {N_j(k) ; 1} δ(E_f − E_i + ℏω),     (44.35)

44.4 Planck Formula

We can use the above formalism to calculate the Planck formula for thermal radiation, which was
what started quantum theory (of course, Planck did not use this derivation, but a much simplified one
with only a single assumption, that energy comes in quanta hν, but here we present the correct proof
in quantum mechanics).
We start from the observation that, for a given momentum k (thus angular frequency ω for the
radiation), we have the following ratio of the absorption to spontaneous emission rates:
dP_abs.(k)/dP_sp.em.(k) = N(k).     (44.36)
Next, we calculate the energy absorbed in an interval dω of angular frequency and in a solid angle
dΩ. Since we are absorbing photons, we have a factor ℏω for the energy of the photon times a factor
N(k) for how many photons there are in the field. Further, in the continuum, we have

(1/V) Σ_k → [1/(2π)³] ∫d³k = [1/(2π)³] ∫ω²dω dΩ/c³,     (44.37)

where the correspondence comes from k x = 2πn x /L x , so dk x = 2π/L x , and a product over dk x ,
dk y , dk z . Then we get

dE(k) = ℏω N(k) ω²dω dΩ/[(2π)³c³].     (44.38)
This allows us to rewrite
dP_abs(k)/dP_sp.em.(k) = N(k) = [dE(ω)/(dω dΩ)] (2πc)³/(ℏω³).     (44.39)
Consider now a two-level system, with lower level E1 and upper level E2 , in equilibrium with the
radiation. The probability that the system in E2 changes by dP2 via absorption, taking the atom from

the E1 state to the E2 state, with rate dPabs. , is equal to dPabs. times the probability P1 of finding the
atom in state E1 ,
dP2 = P1 dPabs. (44.40)
On the other hand, the probability of finding the system in E1 increases owing to emission from
the state E2 , with rate dPem times the probability P2 of finding the atom in state E2 , so
dP1 = P2 dPem. (44.41)
But the emission is either stimulated or spontaneous, and the stimulated rate equals the absorption
rate, so
dPem = dPst.em. + dPsp.em. = dPabs. + dPsp.em. , (44.42)
so
dP1 = P2 (dPabs + dPsp.em. ). (44.43)
But, at equilibrium (and since we only have two levels for the atom in this picture),
dP2 = dP1 ⇒ P1 dPabs. = P2 (dPabs. + dPsp.em. ). (44.44)
Then we obtain

$$
\frac{dP_{abs.}}{dP_{sp.em.}} = \frac{P_2}{P_1 - P_2} = \frac{1}{P_1/P_2 - 1}. \qquad (44.45)
$$
However, at equilibrium, for a temperature T, the Boltzmann distribution of the atomic states is
$$
P_2 = \frac{1}{Z}\,e^{-E_2/k_BT}, \qquad P_1 = \frac{1}{Z}\,e^{-E_1/k_BT}, \qquad (44.46)
$$
so we obtain

$$
\frac{dE(\omega)}{d\omega\,d\Omega}\,\frac{(2\pi c)^3}{\hbar\omega^3} = \frac{dP_{abs.}}{dP_{sp.em.}} = \frac{1}{e^{(E_2-E_1)/k_BT} - 1}. \qquad (44.47)
$$

Finally, after integrating over $d\Omega$ and getting $4\pi$, we have

$$
\frac{dE(\omega)}{d\omega} = 4\pi\,\frac{\hbar\omega^3}{(2\pi c)^3}\,\frac{1}{e^{(E_2-E_1)/k_BT} - 1}. \qquad (44.48)
$$
Alternatively, using the frequency $\nu = \omega/(2\pi)$, given that $E_2 - E_1 = \hbar\omega$, and including an extra factor of 2 from the two polarizations of light, we obtain the Planck formula

$$
\frac{dE}{d\nu} = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{e^{h\nu/k_BT} - 1}. \qquad (44.49)
$$
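As a check, (44.49) can be evaluated numerically: for $h\nu \ll k_BT$ it must reduce to the classical Rayleigh–Jeans result $8\pi\nu^2 k_BT/c^3$, while at high frequency the exponential (Wien) suppression sets in. A minimal sketch (the function names are ours, not from the text):

```python
import math

# Numerical check of the Planck formula (44.49). Function names are ours;
# physical constants are the SI defined values.
h = 6.62607015e-34    # Planck constant, J s
kB = 1.380649e-23     # Boltzmann constant, J/K
c = 2.99792458e8      # speed of light, m/s

def planck_dE_dnu(nu, T):
    """Spectral energy density per unit volume, Eq. (44.49)."""
    return (8 * math.pi * nu**2 / c**3) * h * nu / math.expm1(h * nu / (kB * T))

def rayleigh_jeans(nu, T):
    """Classical limit 8 pi nu^2 kB T / c^3, valid for h nu << kB T."""
    return 8 * math.pi * nu**2 * kB * T / c**3

T = 300.0
print(planck_dE_dnu(1e9, T) / rayleigh_jeans(1e9, T))   # -> close to 1
print(planck_dE_dnu(5e14, T))                           # Wien-suppressed, tiny
```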

Important Concepts to Remember

• For a two-level system with a time-dependent interaction between states, $H_{12} = V(t)$, specifically for the harmonic $V(t) = V_0 e^{i\omega t}$, we find the Rabi formula, with an oscillating probability of being in the upper state, $|P_2(t)|^2 \propto \sin^2(\Omega t/2)$, and a Lorentzian-shaped amplitude with a maximum at resonance, $\omega = \omega_0 = (E_2 - E_1)/\hbar$.
499 44 Interaction of Atoms with Electromagnetic Radiation

• A laser refers to radiation in a resonant cavity, with two-level atoms providing coherent amplification at resonance; the amplified radiation is then let out of the cavity as the laser beam.
• Treating the atoms quantum mechanically and using Fermi's golden rule, we obtain stimulated emission and absorption; when we also quantize the radiation, the previous results become proportional to the number of photons, but in addition we obtain the possibility of spontaneous emission, which is not.
• The Planck formula is obtained from the energy balance for emission and absorption of photons,
and the Boltzmann distribution for the atoms.

Further Reading
Read more about the interaction between matter and electromagnetism in [1] and [2].

Exercises

(1) Show that (44.11) is the solution to the Bloch equations (44.8).
(2) Show that P(t), Q(t) in (44.13) are the corresponding solutions to the Bloch equations.
(3) Consider an exponentially decaying and oscillating potential,
V̂ (t) = (V̂0 eiωt + V̂0 e−iωt )e−γt
for the interaction with classical radiation. What is the equivalent of Fermi’s golden rule in this
case?
(4) Find a limit in which the first-order perturbation theory for a harmonic potential is consistent
with the exact, but two-level, calculation of Section 44.1 (giving the same result).
(5) Consider a possible quantum term, of the type $\alpha \vec A^2$, for light–light interaction in the case of quantized radiation. Analyze and interpret the resulting terms in the potential $V(t)$ in terms of photons, as was done in the text for the linear term.
(6) If there are no photons (no radiation field), N j (k) = 0, how do we interpret the rate of
spontaneous emission (what is the physical situation it describes, and what are its limitations)?
(7) In deriving the Planck formula, we used the classical Boltzmann distribution for the atoms. Why
is this allowed, given that we are considering quantum interactions?
PART IId

SCATTERING THEORY
45 One-Dimensional Scattering, Transfer and S-Matrices

In Part IId , we give an introduction to the rather large field of scattering theory. In this chapter, we
revisit one-dimensional scattering in a potential, restating it in a form that can be generalized to three
dimensions and introducing various useful quantities along the way such as transfer and S-matrices.
The one-dimensional case is interesting in several ways; an application we will do at the end of
the chapter is for discrete one-dimensional systems, or “spin chains” (where the continuous line is
replaced by a series of sites or points).
We start with a short review of the set-up for one-dimensional systems from Chapter 7. The one-dimensional Schrödinger equation is

$$
-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi(x) = E\psi(x), \qquad (45.1)
$$

and can be rewritten as

$$
\psi''(x) = -(\epsilon - U(x))\,\psi(x), \qquad (45.2)
$$

where

$$
U(x) \equiv \frac{2mV(x)}{\hbar^2}, \qquad \epsilon \equiv \frac{2mE}{\hbar^2} \equiv k^2. \qquad (45.3)
$$

In Chapter 7, we presented a more general Wronskian theorem but, for applications here, we consider two solutions $\psi_1(x)$, $\psi_2(x)$ of the same Schrödinger equation (meaning, with the same potential and energy). Then their Wronskian,

$$
W(\psi_1, \psi_2) = \psi_1\psi_2' - \psi_2\psi_1', \qquad (45.4)
$$

is independent of x,

$$
\frac{dW}{dx} = 0. \qquad (45.5)
$$

Moreover, $W \neq 0$ implies that the two solutions are linearly independent, in which case an arbitrary solution can be written as a linear combination of the two,

$$
\psi(x) = a\psi_1(x) + b\psi_2(x). \qquad (45.6)
$$

If we give the initial condition data at $x_0$, namely $\psi(x_0)$, $\psi'(x_0)$, then the determinant of the linear equations for a and b,

$$
\psi(x_0) = a\psi_1(x_0) + b\psi_2(x_0), \qquad \psi'(x_0) = a\psi_1'(x_0) + b\psi_2'(x_0), \qquad (45.7)
$$

is W.
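The constancy of the Wronskian (45.5) is easy to verify numerically: integrate two solutions of $\psi'' = -(\epsilon - U(x))\psi$ with independent initial data and evaluate (45.4) along the way. A self-contained sketch (the sample potential, grid, and names are our choices, not from the text):

```python
import math

# Numerical illustration (ours): for two solutions of psi'' = -(eps - U(x)) psi
# with the same eps, the Wronskian W = psi1 psi2' - psi2 psi1' of (45.4) is
# constant in x, Eq. (45.5).
def U(x):
    return 2.0 * math.exp(-x * x)

def rk4_solve(psi0, dpsi0, eps, x0=-5.0, x1=5.0, n=2000):
    """Integrate psi'' = -(eps - U) psi with classical Runge-Kutta 4,
    recording (psi, psi') after every step."""
    f = lambda x, p, dp: (dp, -(eps - U(x)) * p)
    h = (x1 - x0) / n
    x, p, dp = x0, psi0, dpsi0
    out = []
    for _ in range(n):
        k1 = f(x, p, dp)
        k2 = f(x + h/2, p + h/2*k1[0], dp + h/2*k1[1])
        k3 = f(x + h/2, p + h/2*k2[0], dp + h/2*k2[1])
        k4 = f(x + h, p + h*k3[0], dp + h*k3[1])
        p += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        dp += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
        x += h
        out.append((p, dp))
    return out

eps = 1.0                         # eps = k^2 in the notation of (45.3)
sol1 = rk4_solve(1.0, 0.0, eps)   # two linearly independent initial conditions
sol2 = rk4_solve(0.0, 1.0, eps)
W = [p1*d2 - p2*d1 for (p1, d1), (p2, d2) in zip(sol1, sol2)]
print(min(W), max(W))   # both stay at the initial value W = 1
```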

Figure 45.1 Generic scattering situation.

45.1 Asymptotics and Integral Equations

Having in mind applications to 3-dimensional systems, we assume that the potential V (x) is bounded
in space (is “local”, or has a finite domain), as in Fig. 45.1. More precisely, we impose
$$
\int_{-\infty}^{+\infty} dx\,|V(x)| < \infty. \qquad (45.8)
$$

Under this assumption, it follows that at x → ±∞ we have a free particle solution, meaning a
linear combination of e±ik x , which defines this as a solution with momentum k. Then we must have
$$
\psi(x \to \pm\infty) = a_\pm e^{ikx} + b_\pm e^{-ikx} \;\Rightarrow\;
\psi'(x \to \pm\infty) = ik\,a_\pm e^{ikx} - ik\,b_\pm e^{-ikx}. \qquad (45.9)
$$
Generalizing this behavior to the domain of the potential (the "inside") means that a and b must become functions,

$$
\psi(x) = a(x)e^{ikx} + b(x)e^{-ikx}, \qquad \psi'(x) = ik\,a(x)e^{ikx} - ik\,b(x)e^{-ikx}. \qquad (45.10)
$$

This ansatz is by definition good, since we are writing two arbitrary functions $\psi(x)$ and $\psi'(x)$ in terms of two other arbitrary functions $a(x)$ and $b(x)$. The Schrödinger equation (a second-order differential equation) is equivalent to two first-order differential equations,

$$
\frac{d}{dx}\psi(x) = \psi'(x), \qquad \frac{d}{dx}\psi'(x) = (U(x) - k^2)\,\psi(x). \qquad (45.11)
$$

Substituting the ansatz for $\psi(x)$ and $\psi'(x)$ into the first equation above, we obtain

$$
a'(x)e^{ikx} + b'(x)e^{-ikx} = 0, \qquad (45.12)
$$

while substituting it into the second equation, we obtain

$$
ik\,a'(x)e^{ikx} - ik\,b'(x)e^{-ikx} = U(x)\psi(x). \qquad (45.13)
$$

In this way we have obtained two equations for the unknowns $a'(x)$ and $b'(x)$. Adding the first equation to the second divided by ik, we obtain

$$
a'(x) = \frac{U(x)\psi(x)}{2ik}\,e^{-ikx} = \frac{U(x)}{2ik}\left[a(x) + b(x)e^{-2ikx}\right]. \qquad (45.14)
$$
Replacing the sum with the difference, we obtain

$$
b'(x) = -\frac{U(x)\psi(x)}{2ik}\,e^{ikx} = -\frac{U(x)}{2ik}\left[b(x) + a(x)e^{2ikx}\right]. \qquad (45.15)
$$
We define $\psi_1(x)$ as the solution that has only the term $e^{ikx}$ at $-\infty$, so $a(-\infty) = 1$, $b(-\infty) = 0$. Then, integrating the equations for $a'(x)$, $b'(x)$ from $-\infty$ to x, we obtain

$$
a(x) = 1 + \frac{1}{2ik}\int_{-\infty}^{x} dx'\,e^{-ikx'}U(x')\psi_1(x'), \qquad
b(x) = -\frac{1}{2ik}\int_{-\infty}^{x} dx'\,e^{ikx'}U(x')\psi_1(x'). \qquad (45.16)
$$
Substituting these into (45.10), we obtain

$$
\psi_1(x) = e^{ikx} + \frac{1}{2ik}\int_{-\infty}^{x} dx'\left[e^{ik(x-x')} - e^{-ik(x-x')}\right]U(x')\psi_1(x')
= e^{ikx} + \frac{1}{k}\int_{-\infty}^{x} dx'\,\sin[k(x-x')]\,U(x')\psi_1(x'),
$$
$$
\psi_1'(x) = ik\,e^{ikx} + \frac{1}{2}\int_{-\infty}^{x} dx'\left[e^{ik(x-x')} + e^{-ik(x-x')}\right]U(x')\psi_1(x')
= ik\,e^{ikx} + \int_{-\infty}^{x} dx'\,\cos[k(x-x')]\,U(x')\psi_1(x'). \qquad (45.17)
$$

We have obtained an integral equation for $\psi_1(x)$. It can be solved iteratively, as an infinite perturbative sum,

$$
\psi_1(x) = \sum_{n=0}^{\infty} \psi_1^{(n)}(x), \qquad (45.18)
$$

where the zeroth-order term is a free wave,

$$
\psi_1^{(0)}(x) = e^{ikx}. \qquad (45.19)
$$

Then, we substitute it into the right-hand side of the equation, thus defining the first-order term $\psi_1^{(1)}(x)$, then repeat the procedure to find $\psi_1^{(2)}$, etc. In general, then,

$$
\psi_1^{(n+1)}(x) = \frac{1}{k}\int_{-\infty}^{x} dx'\,\sin[k(x-x')]\,U(x')\psi_1^{(n)}(x'). \qquad (45.20)
$$

One can show that the series converges uniformly (though we will not give the proof here) if

$$
\frac{1}{k}\int_{-\infty}^{+\infty} dx'\,|U(x')| < 1. \qquad (45.21)
$$

But since the integral is finite by assumption, the relation is true for sufficiently high k, $k > k_0$, with $k_0$ the integral in (45.21). Conversely, though, it is not true for sufficiently small k, $k < k_0$.
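The iteration (45.20) is easy to carry out numerically. Below is a minimal pure-Python sketch (ours, not from the text), for a sample Gaussian potential chosen weak enough that the bound (45.21) holds; splitting the kernel as $\sin[k(x-x')] = \sin kx\cos kx' - \cos kx\sin kx'$ lets each order be built in a single cumulative pass.

```python
import math, cmath

# Iterate (45.20) on a grid. The potential, grid, and names are sample choices (ours).
k = 3.0
N, x0, x1 = 1200, -6.0, 6.0
h = (x1 - x0) / N
xs = [x0 + i * h for i in range(N + 1)]
U = [0.5 * math.exp(-x * x) for x in xs]   # (1/k) int |U| dx ~ 0.3 < 1: converges

def next_order(prev):
    # sin[k(x-x')] = sin(kx)cos(kx') - cos(kx)sin(kx'), so the x' integral is
    # two cumulative (trapezoid) sums, one pass per order.
    out = [0j] * (N + 1)
    C = S = 0j
    fc0 = math.cos(k * xs[0]) * U[0] * prev[0]
    fs0 = math.sin(k * xs[0]) * U[0] * prev[0]
    for i in range(1, N + 1):
        fc = math.cos(k * xs[i]) * U[i] * prev[i]
        fs = math.sin(k * xs[i]) * U[i] * prev[i]
        C += 0.5 * h * (fc0 + fc)
        S += 0.5 * h * (fs0 + fs)
        fc0, fs0 = fc, fs
        out[i] = (math.sin(k * xs[i]) * C - math.cos(k * xs[i]) * S) / k
    return out

term = [cmath.exp(1j * k * x) for x in xs]   # zeroth order, Eq. (45.19)
psi1, norms = term[:], []
for n in range(4):
    term = next_order(term)
    psi1 = [a + b for a, b in zip(psi1, term)]
    norms.append(max(abs(t) for t in term))
print(norms)   # successive orders shrink by roughly (1/k) int |U| dx' each time
```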
We can make a similar analysis for the solution $\psi_2(x)$ that is pure $e^{ikx}$ at $+\infty$, so that $a(+\infty) = 1$, $b(+\infty) = 0$. In this case, integrating the differential equations for $a'(x)$ and $b'(x)$, we obtain

$$
a(x) = 1 - \frac{1}{2ik}\int_{x}^{+\infty} dx'\,e^{-ikx'}U(x')\psi_2(x'), \qquad
b(x) = +\frac{1}{2ik}\int_{x}^{+\infty} dx'\,e^{ikx'}U(x')\psi_2(x'). \qquad (45.22)
$$
Note the change of sign in front of the integral, since now x is the lower limit instead of the upper limit. Substituting these expressions for $a(x)$, $b(x)$ into (45.10), we obtain

$$
\psi_2(x) = e^{ikx} - \frac{1}{2ik}\int_{x}^{+\infty} dx'\left[e^{ik(x-x')} - e^{-ik(x-x')}\right]U(x')\psi_2(x')
= e^{ikx} - \frac{1}{k}\int_{x}^{+\infty} dx'\,\sin[k(x-x')]\,U(x')\psi_2(x'),
$$
$$
\psi_2'(x) = ik\,e^{ikx} - \frac{1}{2}\int_{x}^{+\infty} dx'\left[e^{ik(x-x')} + e^{-ik(x-x')}\right]U(x')\psi_2(x')
= ik\,e^{ikx} - \int_{x}^{+\infty} dx'\,\cos[k(x-x')]\,U(x')\psi_2(x'). \qquad (45.23)
$$

45.2 Green’s Functions

We can rewrite the above integral equations as

$$
\psi_1(x) = e^{ikx} + \int_{-\infty}^{+\infty} dx'\,G_1(x,x')\,U(x')\,\psi_1(x'), \qquad
\psi_2(x) = e^{ikx} + \int_{-\infty}^{+\infty} dx'\,G_2(x,x')\,U(x')\,\psi_2(x'), \qquad (45.24)
$$

where we have defined the Green's functions

$$
G_1(x,x') = \frac{\sin[k(x-x')]}{k}\,\theta(x-x'), \qquad
G_2(x,x') = -\frac{\sin[k(x-x')]}{k}\,\theta(x'-x), \qquad (45.25)
$$

where $\theta(x)$ is the Heaviside function. As a reminder, $\theta(x) = 1$ for $x \ge 0$ and $\theta(x) = 0$ for $x < 0$. Also, we remember that $(d/dx)\theta(x) = \delta(x)$ as a distribution.
These Green’s functions have a number of properties:

• $G_1$, $G_2$ are continuous and equal to zero at $x = x'$, since there the prefactor of the Heaviside function vanishes, $\sin[k(x-x')] = 0$.
• The derivatives of $G_1$, $G_2$ are discontinuous, however. Indeed,

$$
\partial_x G_1(x,x') = \cos[k(x-x')]\,\theta(x-x') + \frac{\sin[k(x-x')]}{k}\,\theta'(x-x')
= \cos[k(x-x')]\,\theta(x-x'). \qquad (45.26)
$$

In the second equality above we have used that $\theta'(x-x') = \delta(x-x')$, which is zero everywhere except at $x = x'$; however, there the prefactor vanishes, so the whole second term does too (we explained this in Chapter 2). Then we also find

$$
\partial_x G_2(x,x') = -\cos[k(x-x')]\,\theta(x'-x). \qquad (45.27)
$$

• These $G_1$, $G_2$ are indeed Green's functions for the free Schrödinger equation. Indeed, calculating the second derivative, we obtain

$$
\partial_x^2 G_1(x,x') = -k\sin[k(x-x')]\,\theta(x-x') + \cos[k(x-x')]\,\theta'(x-x')
= -k^2 G_1(x,x') + \delta(x-x'). \qquad (45.28)
$$

Here we have used the fact that we can replace the function multiplying $\theta'(x-x') = \delta(x-x')$ with its value at $x = x'$, which is 1 in this case. Then it follows that $G_1(x,x')$ solves the Green's function equation for the Schrödinger operator in the free case,

$$
(\partial_x^2 + k^2)\,G_1(x,x') = \delta(x-x'). \qquad (45.29)
$$
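The distributional statements above can be checked with finite differences: $G_1$ is continuous at $x = x'$, its first derivative jumps by exactly 1 there (the source of the $\delta$ in (45.29)), and away from $x = x'$ it obeys the homogeneous equation. A quick sketch (the sample k and x' are ours):

```python
import math

# Finite-difference check (ours) of the properties of
# G1(x, x') = sin[k(x - x')]/k * theta(x - x').
k, xp = 2.0, 0.3

def G1(x):
    return math.sin(k * (x - xp)) / k if x >= xp else 0.0

eps = 1e-7
cont = G1(xp + eps) - G1(xp - eps)                  # continuity at x = x'
d_right = (G1(xp + 2*eps) - G1(xp + eps)) / eps     # slope just above x'
d_left = (G1(xp - eps) - G1(xp - 2*eps)) / eps      # slope just below x'

x = 1.0                                              # a point away from x'
d2 = (G1(x + 1e-4) - 2*G1(x) + G1(x - 1e-4)) / 1e-8  # second derivative
print(cont, d_right - d_left, d2 + k*k*G1(x))
# ~0 (continuous), ~1 (unit derivative jump: the delta), ~0 (homogeneous eq.)
```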

We can rewrite the full Schrödinger equation as

$$
\left(\frac{d^2}{dx^2} + k^2\right)\psi(x) = U(x)\psi(x) \equiv f(x), \qquad (45.30)
$$

and if we consider $f(x)$ as a given function (which, in fact, it is not, as we want to solve for $\psi(x)$), we can formally solve for $\psi(x)$ by convoluting the Green's function with f. Indeed, multiplying the Green's function equation with $f(x')$ and integrating between $-\infty$ and x, we obtain

$$
\int_{-\infty}^{x} dx'\,f(x')\left(\frac{d^2}{dx^2} + k^2\right)G_1(x,x')
= \int_{-\infty}^{x} dx'\,\delta(x-x')\,f(x') = f(x), \qquad (45.31)
$$

or, more generally,

$$
\left(\frac{d^2}{dx^2} + k^2\right)\left[\int_{-\infty}^{x} dx'\,G(x,x')\,f(x') + \psi_0(x)\right] = f(x), \qquad (45.32)
$$

where $\psi_0(x)$ is a general solution of the free (homogeneous) equation, $(d^2/dx^2 + k^2)\psi_0(x) = 0$. Then it follows that the formal solution is

$$
\psi(x) = \psi_0(x) + \int_{-\infty}^{x} dx'\,G_1(x,x')\,f(x'). \qquad (45.33)
$$

However, as we noted, in reality $f(x)$ is not given but is equal to $U(x)\psi(x)$. Replacing it in the formal solution above, we actually obtain the integral equations.
We note also that a general Green's function is a sum of the above special Green's function $G_1(x,x')$ and a general solution $g(x,x')$ of the homogeneous equation

$$
\left(\frac{d^2}{dx^2} + k^2\right)g(x,x') = 0, \qquad (45.34)
$$

so that

$$
\tilde G_1(x,x') = G_1(x,x') + g(x,x'). \qquad (45.35)
$$

45.3 Relations between Abstract States and Lippmann–Schwinger Equation

Until now, we have worked in the x (space) representation. But we can use an abstract approach as well, without choosing a representation. In fact, the formal solution above can be written as

$$
|\psi\rangle = |\psi_0\rangle + \hat G_0 |f\rangle, \qquad (45.36)
$$

where $\hat G_0$ stands in for $G_1(x,x')$ without introducing a given representation, and the index 0 is there to show that the Green's operator is for the free Schrödinger operator $\hat D_0$, equal to $d^2/dx^2 + k^2$ in the x representation. In other words, we are solving

$$
\hat D_0|\psi\rangle = |f\rangle = \hat U|\psi\rangle. \qquad (45.37)
$$

Equivalently, by multiplying with $\hbar^2/(2m)$, we can solve

$$
(-\hat H_0 + E)|\psi\rangle = \hat V|\psi\rangle, \qquad (45.38)
$$

where $\hat H_0$ is the Hamiltonian of the free particle. Then the Green's function equation is

$$
(E - \hat H_0)\,\hat G_0 = \frac{\hbar^2}{2m}\,\mathbf{1}, \qquad (45.39)
$$

where $\mathbf{1}$ is the identity operator, so that $\langle x|\mathbf{1}|x'\rangle = \delta(x-x')$. Finally, this means that we can solve the Green's function equation formally as

$$
\hat G_0 = \frac{\hbar^2}{2m}\,\frac{1}{E - \hat H_0}. \qquad (45.40)
$$

However, we see that this operator is not well defined when acting on eigenvectors $|k\rangle$ of the free Hamiltonian, since $(E - \hat H_0)|k\rangle = 0$. To make it well defined, we must make the energy slightly complex, $E \to E + i\epsilon$ (we can also add $-i\epsilon$ instead, but that gives another, less useful, Green's function), so that we obtain

$$
\hat G_0 = \lim_{\epsilon\to 0}\,\frac{\hbar^2}{2m}\,\frac{1}{E + i\epsilon - \hat H_0}. \qquad (45.41)
$$

The integral equation is written formally as

$$
|\psi\rangle = |k\rangle + \hat G_0\hat V|\psi\rangle = |k\rangle + \frac{1}{E + i\epsilon - \hat H_0}\,\hat V|\psi\rangle, \qquad (45.42)
$$

and is known as the Lippmann–Schwinger equation. To see that this solves the Schrödinger equation formally, we multiply by $E - \hat H_0$, obtaining (neglecting $i\epsilon$ for simplicity)

$$
(E - \hat H_0)|\psi\rangle = (E - \hat H_0)|k\rangle + \hat V|\psi\rangle = \hat V|\psi\rangle \;\Rightarrow\;
(\hat H_0 + \hat V)|\psi\rangle = E|\psi\rangle. \qquad (45.43)
$$

An iterative solution is obtained as before, by substituting the nth-order solution into the right-hand side of the Lippmann–Schwinger equation, to obtain the (n+1)th-order solution,

$$
|\psi\rangle = |k\rangle + \hat G_0\hat V|k\rangle + (\hat G_0\hat V)^2|k\rangle + \cdots + (\hat G_0\hat V)^n|k\rangle + \cdots
= |\psi^{(0)}\rangle + |\psi^{(1)}\rangle + |\psi^{(2)}\rangle + \cdots + |\psi^{(n)}\rangle + \cdots. \qquad (45.44)
$$
We can obtain this same perturbative solution in a different way. Substitute $|\psi\rangle = |k\rangle + |\phi\rangle$ into the full Schrödinger equation $(E - \hat H_0 - \hat V)|\psi\rangle = 0$. Using the free Schrödinger equation, we obtain

$$
(E - \hat H_0 - \hat V)|\phi\rangle = \hat V|k\rangle. \qquad (45.45)
$$

This is solved by

$$
|\phi\rangle = \frac{1}{E - \hat H_0 - \hat V}\,\hat V|k\rangle
= (\hat G_0 + \hat G_0\hat V\hat G_0 + \hat G_0\hat V\hat G_0\hat V\hat G_0 + \cdots)\hat V|k\rangle, \qquad (45.46)
$$

where we have defined

$$
\hat G = \frac{1}{E - \hat H_0 - \hat V} = \frac{1}{\hat G_0^{-1} - \hat V}, \qquad (45.47)
$$

and used the following relation for two arbitrary operators $\hat A$ and $\hat B$,

$$
\frac{1}{\hat A - \hat B} = \frac{1}{\hat A} + \frac{1}{\hat A}\hat B\frac{1}{\hat A}
+ \frac{1}{\hat A}\hat B\frac{1}{\hat A}\hat B\frac{1}{\hat A} + \cdots, \qquad (45.48)
$$

which we leave as a simple exercise to prove.
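The operator identity (45.48) can be tested on finite matrices, where $1/\hat A$ is an honest inverse; with $\hat B$ small the series converges geometrically. A small sketch with hand-rolled 2×2 matrix algebra (the sample matrices are arbitrary choices of ours):

```python
# Finite-dimensional check (ours) of (45.48): 1/(A - B) equals the geometric
# series 1/A + (1/A)B(1/A) + ..., for sample 2x2 real matrices with B small.
def mat_mul(X, Y):
    return [[sum(X[i][m] * Y[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def mat_inv(X):
    det = X[0][0]*X[1][1] - X[0][1]*X[1][0]
    return [[X[1][1]/det, -X[0][1]/det], [-X[1][0]/det, X[0][0]/det]]

A = [[2.0, 0.3], [0.1, 3.0]]
B = [[0.2, 0.05], [0.04, 0.1]]
exact = mat_inv([[A[i][j] - B[i][j] for j in range(2)] for i in range(2)])

Ainv = mat_inv(A)
series, term = [row[:] for row in Ainv], Ainv
for _ in range(40):                       # next term: (previous term) B (1/A)
    term = mat_mul(mat_mul(term, B), Ainv)
    series = [[series[i][j] + term[i][j] for j in range(2)] for i in range(2)]

err = max(abs(series[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(err)   # essentially machine precision
```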
We note that, indeed, we have obtained the same perturbative solution of the full Schrödinger equation. We can now define the operator $\hat T$ by factorizing $\hat G_0$ on the right-hand side of the perturbative solution, obtaining

$$
\hat G_0(\hat V + \hat V\hat G_0\hat V + \hat V\hat G_0\hat V\hat G_0\hat V + \cdots) \equiv \hat G_0\hat T. \qquad (45.49)
$$

But then we notice that we can again obtain the $\hat T$ operator in its definition, as

$$
\hat T = \hat V + \hat V\hat G_0(\hat V + \hat V\hat G_0\hat V + \cdots) = \hat V + \hat V\hat G_0\hat T. \qquad (45.50)
$$

This equation for $\hat T$ is in fact an equivalent form of the Lippmann–Schwinger equation.

45.4 Physical Interpretation of Scattering Solution

We have defined $\psi_1(x)$ as constituting a pure solution at $x \to -\infty$,

$$
\psi_1(x) \sim e^{ikx} \;\Rightarrow\; \psi_1^*(x) \sim e^{-ikx}, \qquad (45.51)
$$

while $\psi_2(x)$ has a pure solution at $x \to +\infty$,

$$
\psi_2(x) \sim e^{ikx} \;\Rightarrow\; \psi_2^*(x) \sim e^{-ikx}. \qquad (45.52)
$$

Thus at $x \to -\infty$ the two solutions $\psi_1(x)$ and $\psi_1^*(x)$ are linearly independent, with Wronskian

$$
W_1(\psi_1, \psi_1^*) = \psi_1^*\psi_1' - \psi_1\psi_1^{*\prime} = 2ik \neq 0, \qquad (45.53)
$$

so that $(\psi_1(x), \psi_1^*(x))$ form a basis in the space of eigenfunctions near $-\infty$. Similarly, at $x \to +\infty$, the two solutions $\psi_2(x)$ and $\psi_2^*(x)$ are linearly independent, with Wronskian

$$
W_2(\psi_2, \psi_2^*) = \psi_2^*\psi_2' - \psi_2\psi_2^{*\prime} = 2ik \neq 0, \qquad (45.54)
$$

meaning that $(\psi_2(x), \psi_2^*(x))$ is a basis in the space of eigenfunctions near $+\infty$.
Therefore we can expand the basis elements near $-\infty$ in terms of the basis elements near $+\infty$,

$$
\psi_1(x) = \alpha(k)\psi_2(x) + \beta(k)\psi_2^*(x) \;\Rightarrow\;
\psi_1^*(x) = \beta^*(k)\psi_2(x) + \alpha^*(k)\psi_2^*(x). \qquad (45.55)
$$

Applying this relation near $+\infty$, where $\psi_2(x) \sim e^{ikx}$, we obtain

$$
\psi_1(x\to+\infty) \to \alpha(k)e^{ikx} + \beta(k)e^{-ikx}, \qquad
\psi_1^*(x\to+\infty) \to \beta^*(k)e^{ikx} + \alpha^*(k)e^{-ikx}. \qquad (45.56)
$$
Substituting the relation between basis elements into the Wronskian $W_1$, we obtain

$$
W_1 = \psi_1^*\psi_1' - \psi_1\psi_1^{*\prime}
= \left[|\alpha(k)|^2 - |\beta(k)|^2\right](\psi_2^*\psi_2' - \psi_2\psi_2^{*\prime})
= \left[|\alpha(k)|^2 - |\beta(k)|^2\right]W_2. \qquad (45.57)
$$

However, since $W_1$ and $W_2$ are independent of x, and we have calculated that $W_1 = 2ik = W_2$, it follows that

$$
|\alpha(k)|^2 - |\beta(k)|^2 = 1. \qquad (45.58)
$$

For a generic eigenfunction, expanded in the basis at $-\infty$,

$$
\psi(x) = a\psi_1(x) + b\psi_1^*(x), \qquad (45.59)
$$

which means that, near $x \to -\infty$,

$$
\psi(x) \to a e^{ikx} + b e^{-ikx}, \qquad (45.60)
$$

we can use the transformation law between bases, and apply it near $x \to +\infty$, to obtain

$$
\psi(x) \to (a\alpha(k) + b\beta^*(k))\,e^{+ikx} + (a\beta(k) + b\alpha^*(k))\,e^{-ikx}
\equiv a'e^{+ikx} + b'e^{-ikx}. \qquad (45.61)
$$

It follows that in effect we are acting with a matrix on the state (a, b), namely

$$
\begin{pmatrix} a \\ b \end{pmatrix} \to \begin{pmatrix} a' \\ b' \end{pmatrix}
= K \begin{pmatrix} a \\ b \end{pmatrix}, \qquad (45.62)
$$

where we define the transfer matrix K:

$$
K \equiv \begin{pmatrix} \alpha(k) & \beta^*(k) \\ \beta(k) & \alpha^*(k) \end{pmatrix}. \qquad (45.63)
$$
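The entries $\alpha(k)$, $\beta(k)$ of K can be produced numerically for any localized potential: integrate the solution $\psi_1$ (pure $e^{ikx}$ on the far left) across the potential and read off $a(x)$, $b(x)$ from (45.10) on the far right; the constraint (45.58) then serves as a consistency check. A sketch (the square-barrier parameters and names are our choices):

```python
import cmath

# Numerical sketch (ours): obtain alpha(k), beta(k) of the transfer matrix K in
# (45.63) for a sample square barrier by integrating psi_1 across the potential,
# then check |alpha|^2 - |beta|^2 = 1, Eq. (45.58).
k = 1.5
def U(x):
    return 1.0 if abs(x) <= 0.5 else 0.0    # rescaled potential U = 2mV/hbar^2

def f(x, psi, dpsi):
    return dpsi, (U(x) - k*k) * psi          # psi'' = (U - k^2) psi

X, n = 3.0, 6000
h = 2*X / n
x = -X
psi, dpsi = cmath.exp(1j*k*x), 1j*k*cmath.exp(1j*k*x)   # a = 1, b = 0 at -X
for _ in range(n):                                       # complex RK4 step
    k1 = f(x, psi, dpsi)
    k2 = f(x + h/2, psi + h/2*k1[0], dpsi + h/2*k1[1])
    k3 = f(x + h/2, psi + h/2*k2[0], dpsi + h/2*k2[1])
    k4 = f(x + h, psi + h*k3[0], dpsi + h*k3[1])
    psi += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
    dpsi += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    x += h

# invert (45.10) on the far right to read off alpha = a(+X), beta = b(+X)
alpha = 0.5 * (psi + dpsi/(1j*k)) * cmath.exp(-1j*k*x)
beta = 0.5 * (psi - dpsi/(1j*k)) * cmath.exp(1j*k*x)
print(abs(alpha)**2 - abs(beta)**2)   # 1, up to integration error
```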
Outside the domain of the potential, the time-dependent solution of the Schrödinger equation is

$$
\psi_k(x,t) = \psi_k(x)\,e^{-iEt/\hbar}, \qquad (45.64)
$$

and it corresponds to a propagating wave. For $\psi_k(x) \sim e^{ikx}$, we have a wave propagating to the right, while for $\psi_k(x) \sim e^{-ikx}$, we have a wave propagating to the left.
We can now connect with the analysis in Chapter 7. Indeed, there is a solution

$$
\psi_+(x) = \psi_1(x) - \frac{\beta}{\alpha^*}\,\psi_1^*(x) \qquad (45.65)
$$

that behaves as

$$
\psi_+(x) \sim \frac{1}{\alpha^*}\,e^{ikx} \qquad (45.66)
$$

at $x \to +\infty$, and as

$$
\psi_+(x) \sim e^{ikx} - \frac{\beta}{\alpha^*}\,e^{-ikx} \qquad (45.67)
$$

at $x \to -\infty$. This is a wave propagating from the left ($\propto e^{ikx}$) that, after encountering the potential domain, has a reflected part $\propto e^{-ikx}$ and a transmitted part $\propto e^{ikx}$.
Similarly, there is a solution

$$
\psi_-(x) = \frac{1}{\alpha^*}\,\psi_1^*(x) \qquad (45.68)
$$
that behaves as

$$
\psi_-(x) \sim \frac{1}{\alpha^*}\,e^{-ikx} \qquad (45.69)
$$

at $x \to -\infty$, and as

$$
\psi_-(x) \sim \frac{\beta^*}{\alpha^*}\,e^{ikx} + e^{-ikx} \qquad (45.70)
$$

at $x \to +\infty$. This is a wave propagating from the right ($\propto e^{-ikx}$) that, after encountering the potential domain, has a reflected part $\propto e^{ikx}$ and a transmitted part $\propto e^{-ikx}$.
We define a wave as being an "in" wave if it is going towards the potential, and as an "out" wave if it is going away from the potential.

45.5 S-Matrix and T-Matrix

Define abstract states $|k\pm\rangle$ by

$$
e^{ikx} = \langle x|k+\rangle, \qquad e^{-ikx} = \langle x|k-\rangle. \qquad (45.71)
$$

Then the transmission (t) and reflection (r) coefficients in the waves are

$$
t_+ = \frac{1}{\alpha^*}, \quad r_+ = -\frac{\beta}{\alpha^*}, \qquad
t_- = \frac{1}{\alpha^*}, \quad r_- = \frac{\beta^*}{\alpha^*}, \qquad (45.72)
$$

which are normalized correctly,

$$
|t_+|^2 + |r_+|^2 = |t_-|^2 + |r_-|^2 = 1. \qquad (45.73)
$$

Then the free "in" states turn into a combination of free "out" states, as follows:

$$
|k+\rangle_{in} \to t_+|k+\rangle_{out} + r_+|k-\rangle_{out}, \qquad
|k-\rangle_{in} \to t_-|k-\rangle_{out} + r_-|k+\rangle_{out}. \qquad (45.74)
$$

This can be written as a matrix action,

$$
\begin{pmatrix} |k+\rangle_{in} \\ |k-\rangle_{in} \end{pmatrix} \to
\begin{pmatrix} t_+ & r_+ \\ r_- & t_- \end{pmatrix}
\begin{pmatrix} |k+\rangle_{out} \\ |k-\rangle_{out} \end{pmatrix}
= \hat S^T \begin{pmatrix} |k+\rangle_{out} \\ |k-\rangle_{out} \end{pmatrix}, \qquad (45.75)
$$

where we have defined the matrix of amplitudes between in and out free states,

$$
\hat S = \begin{pmatrix} \langle k+|\hat S|k+\rangle & \langle k+|\hat S|k-\rangle \\
\langle k-|\hat S|k+\rangle & \langle k-|\hat S|k-\rangle \end{pmatrix}
= \begin{pmatrix} t_+ & r_- \\ r_+ & t_- \end{pmatrix}
= \begin{pmatrix} 1/\alpha^* & \beta^*/\alpha^* \\ -\beta/\alpha^* & 1/\alpha^* \end{pmatrix}, \qquad (45.76)
$$

called the S-matrix. It is a unitary matrix, since, as we can easily check,

$$
\hat S\hat S^\dagger = 1. \qquad (45.77)
$$

Thus we can write it in terms of a Hermitian ("real") matrix $\hat\Delta$, as

$$
\hat S = e^{i\hat\Delta}. \qquad (45.78)
$$
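Unitarity (45.77) follows from $|\alpha|^2 - |\beta|^2 = 1$ alone, which can be confirmed numerically for arbitrary $\alpha$, $\beta$ obeying the constraint. A quick check (the hyperbolic parametrization and sample parameters are our choices):

```python
import cmath, math

# Check (ours): for any alpha, beta with |alpha|^2 - |beta|^2 = 1, the matrix
# in (45.76) is unitary. The hyperbolic parametrization enforces the constraint.
t, phi, chi = 0.7, 0.4, 1.1                 # arbitrary sample parameters
alpha = math.cosh(t) * cmath.exp(1j*phi)
beta = math.sinh(t) * cmath.exp(1j*chi)

ac = alpha.conjugate()
S = [[1/ac, beta.conjugate()/ac],
     [-beta/ac, 1/ac]]
SSd = [[sum(S[i][m] * S[j][m].conjugate() for m in range(2)) for j in range(2)]
       for i in range(2)]                    # S S^dagger
print(SSd)   # the 2x2 identity, verifying (45.77)
```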
Equivalently, we can define an S-operator as the evolution operator between $-\infty$ and $+\infty$,

$$
\hat S = \hat U(+\infty, -\infty) = \lim_{t_2\to+\infty,\,t_1\to-\infty} \hat U(t_2, t_1), \qquad (45.79)
$$

and the S-matrix as the matrix whose elements are the matrix elements of the $\hat S$ operator between the free states $|k\pm\rangle$. This definition also gives a transition between in states (at $t \to -\infty$) and out states (at $t \to +\infty$).
We can define the T-matrix from the previously defined T-operator $\hat T$. The perturbative solution of the Schrödinger equation was written as

$$
|\psi\rangle = |k\rangle + |\phi\rangle = |k\rangle + \hat G_0\hat T|k\rangle = (1 + \hat G_0\hat T)|k\rangle. \qquad (45.80)
$$

Acting with $\hat V$ from the left, we obtain

$$
\hat V|\psi\rangle = (\hat V + \hat V\hat G_0\hat T)|k\rangle = \hat T|k\rangle, \qquad (45.81)
$$

where in the last equality we have used the Lippmann–Schwinger equation for $\hat T$, (45.50). This then gives an alternative form of the Lippmann–Schwinger equation,

$$
|\psi\rangle = |k\rangle + \hat G_0\hat V|\psi\rangle. \qquad (45.82)
$$

Moreover, considering the two $|k\pm\rangle$ free-wave solutions, we find

$$
|\psi\pm\rangle = |k\pm\rangle + \hat G_0\hat V|\psi\pm\rangle = |k\pm\rangle + \hat G_0\hat T|k\pm\rangle. \qquad (45.83)
$$

This equation also defines an S-matrix, though one in which we have separated an identity term 1, giving the first, free-wave, part of the state.

45.6 Application: Spin Chains

As an important application of the one-dimensional scattering formalism, consider a useful one-dimensional system: a discrete line, made up of a set of points ("sites") $i = 1, 2, \ldots, L$, where each site carries a "spin" variable. In the simplest case, of $s = 1/2$, the spin variable can have two states, $|\uparrow\rangle$ and $|\downarrow\rangle$. We can define the state $|x_1, x_2, \ldots, x_k\rangle$ as the state with $|\uparrow\rangle$ at sites $x_1, \ldots, x_k \in \{1, 2, \ldots, L\}$ and with $|\downarrow\rangle$ at the rest of the sites; see Fig. 45.2.
Further, we can define a state with one excitation, one spin up moving along a sea of spins down (called a "magnon"), with momentum p,

$$
|\psi(p)\rangle = \sum_{x=1}^{L} e^{ipx}\,|x\rangle \equiv \sum_{x=1}^{L} \psi_p(x)\,|x\rangle. \qquad (45.84)
$$

Figure 45.2 Spin chain.


Similarly, the state with two excitations (two magnons) is defined in terms of a two-particle wave function as

$$
|\psi(p_1, p_2)\rangle = \sum_{1\le x_1 < x_2 \le L} \psi(x_1, x_2)\,|x_1, x_2\rangle. \qquad (45.85)
$$

Next, we can define an ansatz for this wave function $\psi(x_1, x_2)$, known as the Bethe ansatz (as defined by Hans Bethe in 1932),

$$
\psi(x_1, x_2) = \psi_{p_1}(x_1)\psi_{p_2}(x_2) + S(p_2, p_1)\,\psi_{p_2}(x_1)\psi_{p_1}(x_2)
= e^{i(p_1x_1 + p_2x_2)} + S(p_2, p_1)\,e^{i(p_2x_1 + p_1x_2)}, \qquad (45.86)
$$

where the first term is the free, "in", part, and the second term is the scattered, "out", part, while $S(p_2, p_1)$ is an S-matrix.
We note that $\psi(x_1, x_2)$ is defined only for $x_1 < x_2$. However, we can define $\psi(x_2, x_1)$ by periodicity, as (since $x_2 < x_1 + L$)

$$
\psi(x_2, x_1 + L) = \psi(x_1, x_2). \qquad (45.87)
$$

That gives the condition

$$
e^{ip_1L} = S(p_1, p_2), \qquad (45.88)
$$

which in turn gives

$$
e^{ip_2L} = S(p_2, p_1), \qquad (45.89)
$$

meaning that the S-matrix is a phase (a 1 × 1 unitary matrix). These two equations are called the Bethe ansatz equations, and from them, and their generalizations in the case of several magnons, we obtain the sets of possible p's (in general complex), or Bethe roots (strictly speaking, these are a given function of the p's).
From these two equations, we obtain

$$
S(p_1, p_2)\,S(p_2, p_1) = 1 = e^{i(p_1 + p_2)L}, \qquad (45.90)
$$

which gives the quantization condition

$$
p_1 + p_2 = \frac{2\pi}{L}\,n. \qquad (45.91)
$$

In general, we can define the S-matrix between two momenta as the phase

$$
S(p_i, p_j) = e^{i\delta_{ij}}. \qquad (45.92)
$$

In terms of it, we can define the Bethe ansatz for M magnons,

$$
\psi(x_1, \ldots, x_M) = \sum_{P\in \mathrm{Perm}(M)}
\exp\left[i\sum_{i=1}^{M} p_{P(i)}x_i + \frac{i}{2}\sum_{i<j}\delta_{P(i)P(j)}\right]. \qquad (45.93)
$$
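Both statements, that S is a pure phase and that $S(p_1,p_2)S(p_2,p_1) = 1$, can be verified numerically with the explicit two-magnon S-matrix of the Heisenberg chain, quoted in exercise 6 at the end of this chapter (the sample momenta are our choices):

```python
import math

# Check (ours) of S(p1,p2) S(p2,p1) = 1 in (45.90), using the two-magnon
# S-matrix of the Heisenberg chain quoted in exercise 6:
# S(p1,p2) = (phi(p1) - phi(p2) + i)/(phi(p1) - phi(p2) - i), phi(p) = (1/2) cot(p/2).
def phi(p):
    return 0.5 / math.tan(p / 2)

def S(p1, p2):
    d = phi(p1) - phi(p2)
    return (d + 1j) / (d - 1j)

p1, p2 = 0.7, 1.9                # sample real momenta
print(abs(S(p1, p2)))            # -> 1: the S-matrix is a pure phase
print(S(p1, p2) * S(p2, p1))     # -> 1: consistent with (45.90)
```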

The Hamiltonian on the spin chain is an operator acting on the sites. The standard example, for which Bethe actually invented his ansatz, is the Heisenberg spin 1/2 Hamiltonian,

$$
H = 2J\sum_{j=1}^{L}\left(P_{j,j+1} - 1\right), \qquad (45.94)
$$

where J is a constant and $P_{j,j+1}$ is the operator that permutes the states at sites j and j+1. One can show that, in general, when acting on the spin 1/2 Hilbert spaces,

$$
P_{ij} = \frac{1}{2} + \frac{1}{2}\,\vec\sigma_i\cdot\vec\sigma_j, \qquad (45.95)
$$

where $\vec\sigma_i$ are the Pauli matrices at site i. Then the Hamiltonian can be rewritten as

$$
H = J\sum_{j=1}^{L}\left(\vec\sigma_j\cdot\vec\sigma_{j+1} - 1\right). \qquad (45.96)
$$

Since we have

$$
(P_{j,j+1} - 1)|j\rangle = |j+1\rangle - |j\rangle, \qquad
(P_{j,j+1} - 1)|j+1\rangle = |j\rangle - |j+1\rangle, \qquad (45.97)
$$
the Heisenberg Hamiltonian acting on the one-magnon state gives

$$
\hat H|\psi(p)\rangle = 2J\sum_{x=1}^{L} e^{ipx}\left(|x-1\rangle + |x+1\rangle - 2|x\rangle\right)
= 2J\left(e^{ip} + e^{-ip} - 2\right)\sum_{x=1}^{L} e^{ipx}|x\rangle
= -8J\sin^2\frac{p}{2}\,|\psi(p)\rangle \equiv E|\psi(p)\rangle. \qquad (45.98)
$$

As can be seen, the one-magnon state is automatically an eigenstate of the Heisenberg Hamiltonian.
We could solve for eigenstates and eigenenergies in the case of several magnons, but we will not do it here.
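The eigenvalue computation (45.98) can be confirmed directly by applying the one-magnon restriction of the Heisenberg Hamiltonian to a plane-wave state on a small periodic chain (the chain length, coupling, and momentum below are sample values of ours):

```python
import cmath, math

# Direct check (ours) of (45.98): on a periodic chain of L sites, the plane-wave
# magnon is an eigenstate of the one-magnon restriction of H = 2J sum_j (P_{j,j+1}-1),
# with E(p) = -8 J sin^2(p/2) and p = 2 pi n / L.
L, J, n = 12, 1.0, 3
p = 2 * math.pi * n / L
psi = [cmath.exp(1j * p * x) for x in range(L)]

# (H psi)(x) = 2J [psi(x-1) + psi(x+1) - 2 psi(x)], periodic boundary conditions
Hpsi = [2*J*(psi[(x-1) % L] + psi[(x+1) % L] - 2*psi[x]) for x in range(L)]

E = -8 * J * math.sin(p/2)**2
err = max(abs(Hpsi[x] - E * psi[x]) for x in range(L))
print(E, err)   # eigenvalue -8J sin^2(p/2), with err ~ 0
```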

Important Concepts to Remember

• Defining $\psi_1(x)$ as the solution that has only the term $e^{ikx}$ at $x = -\infty$ and $\psi_2(x)$ as the solution that has only the term $e^{ikx}$ at $x = +\infty$, we have the equations $\psi_{1,2}(x) = e^{ikx} + \int dx'\,G_{1,2}(x,x')U(x')\psi_{1,2}(x')$, with $G_{1,2}$ Green's functions and U the rescaled potential.
• In the abstract case, we have the Lippmann–Schwinger equation $|\psi\rangle = |k\rangle + \hat G_0\hat V|\psi\rangle$, where $\hat G_0 = 1/(E + i\epsilon - \hat H_0)$ is the free Green's function and $|k\rangle$ is a free-wave state; the equation can be solved by iteration.
• Alternatively, we have $|\psi\rangle = |k\rangle + \hat G\hat V|k\rangle$, where $\hat G = 1/(E + i\epsilon - \hat H)$, with $\hat H = \hat H_0 + \hat V$. Also, $\hat G\hat V \equiv \hat G_0\hat T$, implying $\hat T = \hat V + \hat V\hat G_0\hat T$.
• Expanding $\psi_1$ in $\psi_2$, $\psi_2^*$ as $\psi_1 = \alpha\psi_2 + \beta\psi_2^*$, a generic wave function is $\psi = a\psi_1 + b\psi_1^*$, which near $x = -\infty$ is $\sim ae^{ikx} + be^{-ikx}$ and near $x = +\infty$ is $\sim a'e^{ikx} + b'e^{-ikx}$, with $(a', b')$ related to $(a, b)$ by the transfer matrix K.
• There exist wave functions that correspond to a wave coming from the left and being reflected
and transmitted, or to a wave coming from the right and being reflected and transmitted. Also, we
have “in” waves (moving towards the potential domain) and “out” waves (moving away from the
potential domain).
• The S-matrix relates out and in waves, is unitary, and corresponds to the matrix elements in free
|k states of the S-operator, Ŝ = Û (+∞, −∞).
• Spin chains have spin variables at sites along a one-dimensional chain, and have magnon
excitations.
• The Bethe ansatz for two magnons is $\psi(x_1, x_2) = e^{ip_1x_1 + ip_2x_2} + S(p_2, p_1)e^{ip_2x_1 + ip_1x_2}$, where $S(p_2, p_1)$ is the S-matrix, and the standard example for it is the Heisenberg spin 1/2 Hamiltonian $H = 2J\sum_i\left[P_{i,i+1} - 1\right]$.

Further Reading
See [2] for one-dimensional scattering, [2] and [1] for the Lippmann–Schwinger formalism, and [6] for spin chains.

Exercises

(1) Consider the potential $V(x) = V_0 e^{-\alpha x^2}$. Calculate perturbatively the first two terms in the solutions $\psi_1(x)$ ($\sim e^{ikx}$ at $x = -\infty$, incoming) and $\psi_2(x)$ ($\sim e^{ikx}$ at $x = +\infty$, outgoing), if $E \le V_0$.
(2) Show that $G_2(x, x')$ is also a Green's function for the Schrödinger operator, as is $G_1(x, x')$, and that their difference is a solution of the homogeneous Schrödinger equation.
(3) Write down the Lippmann–Schwinger equation in the momentum representation, and the
corresponding first few terms in its perturbative solution.
(4) Consider a barrier potential, V = V0 for |x| ≤ L/2 and V = 0 for |x| > L/2, and an energy
E ≤ V0 . Calculate ψ1 (x), ψ2 (x), and the corresponding transfer matrix K.
(5) In the case in exercise 4, calculate the S-matrix and the Hermitian matrix Δ̂, with Ŝ = eiΔ̂ .
(6) Substitute the two-magnon ansatz into the Schrödinger equation for the Heisenberg spin 1/2 spin chain, and find that the energy of the two-magnon state is just the sum of the energies of the single magnons, and that

$$
S(p_1, p_2) = \frac{\phi(p_1) - \phi(p_2) + i}{\phi(p_1) - \phi(p_2) - i}, \qquad
\phi(p) \equiv \frac{1}{2}\cot\frac{p}{2}. \qquad (45.99)
$$
(7) Solve the Bethe ansatz equations for two magnons for the case n = 0 in the quantization
condition (45.91), and write down the corresponding explicit wave functions.
46 Three-Dimensional Lippmann–Schwinger Equation, Scattering Amplitudes and Cross Sections

In this chapter, we start the analysis of three-dimensional scattering. We will consider (except in the
last chapter of this part) only elastic scattering, meaning the colliding particles do not change their
states during the collision. As we have already seen, this can be reduced to scattering off a potential,
where the scattering is in terms of the relative motion.
We begin with a quick review of motion in a potential.

46.1 Potential for Relative Motion

Since we are considering elastic scattering, it is enough to consider only two colliding particles, A and B. In that case, the Hamiltonian for the system is

$$
\hat H = \hat H_A + \hat H_B + \hat V
= -\frac{\hbar^2}{2m_A}\Delta_A - \frac{\hbar^2}{2m_B}\Delta_B + \hat V(\vec r_A - \vec r_B). \qquad (46.1)
$$

Defining the relative position $\vec r = \vec r_A - \vec r_B$, the center of mass position $\vec R$, the reduced mass $m = m_Am_B/(m_A + m_B)$, and the total mass $M = m_A + m_B$, we rewrite the above equation as

$$
\hat H = \hat H_{CM} + \hat H_{rel}
= -\frac{\hbar^2}{2M}\Delta_{\vec R} - \frac{\hbar^2}{2m}\Delta_{\vec r} + V(\vec r). \qquad (46.2)
$$

Then the center of mass and relative variables separate in the solution. The center of mass Schrödinger equation is trivial, and the relative-motion Schrödinger equation is

$$
\left[-\frac{\hbar^2}{2m}\Delta_{\vec r} + \hat V(\vec r)\right]\psi = E\psi. \qquad (46.3)
$$

In the absence of a potential (for instance, when we are far from the domain of the potential), we have a free-wave solution,

$$
\langle\vec r|\vec p\,\rangle = \phi_{\vec p}(\vec r) = \frac{1}{(2\pi\hbar)^{3/2}}\,e^{i\vec p\cdot\vec r/\hbar}. \qquad (46.4)
$$

Defining the wave vector $\vec k = \vec p/\hbar$, we can expand the exponential in terms of spherical harmonics $Y_{lm}(\theta, \phi)$,

$$
e^{i\vec k\cdot\vec r} = \sum_{l,m} A_{lm}(\hat n_k)\,j_l(kr)\,Y_{lm}(\hat n_r), \qquad (46.5)
$$

where the unit vector in the momentum direction is $\hat n_k = \vec k/k$ and in the position direction is $\hat n_r = \vec r/r$, and the coefficients are

$$
A_{lm}(\hat n_k) = 4\pi i^l\,Y^*_{lm}(\hat n_k). \qquad (46.6)
$$

The position wave function at given l, m is then

$$
u_{lm}(k; \hat n_r) = j_l(kr)\,Y_{lm}(\hat n_r). \qquad (46.7)
$$
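Summing (46.5)–(46.6) over m reduces the expansion to $e^{i\vec k\cdot\vec r} = \sum_l (2l+1)i^l j_l(kr)P_l(\cos\theta)$, so projecting the plane wave onto a Legendre polynomial must give $2i^l j_l(kr)$. A numerical check for l = 0, 1 (the value of kr and grid size are our sample choices):

```python
import cmath, math

# Check (ours): the Legendre projection int_{-1}^{1} P_l(x) e^{i kr x} dx must
# equal 2 i^l j_l(kr), consistent with (46.5)-(46.6) summed over m.
kr = 2.7

def project(l, npts=20000):
    """Trapezoid estimate of the projection integral, for P_0 = 1, P_1 = x."""
    P = [lambda x: 1.0, lambda x: x][l]
    h = 2.0 / npts
    s = 0.5 * (P(-1.0) * cmath.exp(-1j*kr) + P(1.0) * cmath.exp(1j*kr))
    for i in range(1, npts):
        x = -1.0 + i*h
        s += P(x) * cmath.exp(1j*kr*x)
    return h * s

j0 = math.sin(kr) / kr                          # spherical Bessel functions
j1 = math.sin(kr) / kr**2 - math.cos(kr) / kr
print(abs(project(0) - 2*j0))    # ~0: l = 0 term
print(abs(project(1) - 2j*j1))   # ~0: l = 1 term (note the factor i^l)
```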

46.2 Behavior at Infinity

To describe scattering in a potential, we need to understand first the boundary conditions, just as
we did in the one-dimensional case, and in the case of eigenfunctions in a central potential, in
Chapters 18 and 19. We saw there that the boundary conditions at r = ∞ and r = 0 restrict the
wave function, in a similar way to that discussed in Chapter 7, in the one-dimensional case. If we
have negative energy, E < 0, we saw that we obtain bound states, and that leads to the quantization
of the energy levels. Indeed, we have two normalizability conditions, one at r = 0 and one at r = ∞,
for the two independent solutions of the second-order differential equation. On the other hand, if we
have positive energy, E > 0, we have asymptotic states (states that “escape” to infinity), and there
are no quantization conditions. At r → ∞, both these states, with behavior r R(r) = χ(r) ∼ e±ikr
(as we saw) are well behaved. As we will see, these two behaviors lead to outgoing and incoming
waves, respectively, for scattering states |ψ± (as opposed to bound states, like those in which we
were interested in Chapters 18 and 19). In conclusion, then, the boundary conditions at infinity imply
the scattering solution that we want.
For the time being, we will consider a potential with a finite range, i.e., a potential V(r) (for a spherically symmetric case) that decays faster than 1/r at $r \to \infty$.
In that case, at infinity, the free particle (which experiences no potential) is a good approximation. The Schrödinger equation there (with $E = \hbar^2k^2/(2m)$) is

$$
\frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r}
+ \frac{1}{r^2}\Delta_{\theta,\phi}\psi + k^2\psi = 0. \qquad (46.8)
$$

It can be rewritten as

$$
\frac{\partial^2}{\partial r^2}(r\psi) + \frac{1}{r}\Delta_{\theta,\phi}\psi + k^2(r\psi) = 0. \qquad (46.9)
$$

But, since we are at $r \to \infty$, the middle term is O(1/r), so it goes to zero when compared with the other two terms and can be neglected. Then the equation at infinity is

$$
\frac{\partial^2}{\partial r^2}(r\psi) + k^2(r\psi) = 0, \qquad (46.10)
$$

which has the solutions

$$
r\psi = Ae^{\pm ikr}. \qquad (46.11)
$$

To be more precise, the prefactors A also have some dependence on the direction $\hat n_r$,

$$
r\psi = A(k, \pm\hat n_r)\,e^{\pm ikr}, \qquad (46.12)
$$

which means that we have an "out" (outgoing) solution

$$
\psi_1 = A(k, \hat n_r)\,\frac{e^{ikr}}{r} \qquad (46.13)
$$

and an "in" (incoming) solution

$$
\psi_2 = B(k, -\hat n_r)\,\frac{e^{-ikr}}{r}, \qquad (46.14)
$$

where the minus sign in the dependence of B is conventional, to remind us that the wave goes towards r = 0.
The most general solution then is a combination of both: $\psi = \psi_1 + \psi_2$. In particular, we show that the free wave is also of this form, at $r \to \infty$, with given A and B,

$$
e^{i\vec k\cdot\vec r} \simeq \frac{2\pi}{ik}\left[\delta(\hat n_k - \hat n_r)\,\frac{e^{ikr}}{r}
- \delta(\hat n_k + \hat n_r)\,\frac{e^{-ikr}}{r}\right], \qquad (46.15)
$$

so that

$$
A(k, \hat n_r) = \frac{2\pi}{ik}\,\delta(\hat n_k - \hat n_r), \qquad
B(k, -\hat n_r) = -\frac{2\pi}{ik}\,\delta(\hat n_k + \hat n_r). \qquad (46.16)
$$
Proof To prove this relation, we calculate the integral of the wave, on a sphere multiplied by an
arbitrary function,
  π  2π
k ·
I= 2
d nr f (nr )e i r
= sin θdθ dφeikr cos θ f (θ, φ). (46.17)
0 0

Defining x = cos θ, we calculate
$$\begin{aligned} I &= \int_0^{2\pi} d\phi \int_{-1}^{1} dx\, e^{ikrx}\, f(x, \phi) = \int_0^{2\pi} \frac{d\phi}{ikr} \int_{-1}^{1} f(x, \phi)\, d\big(e^{ikrx}\big) \\ &= \int_0^{2\pi} \frac{d\phi}{ikr}\left[f(1, \phi)\, e^{ikr} - f(-1, \phi)\, e^{-ikr} - \int_{-1}^{1} e^{ikrx}\,\frac{df(x, \phi)}{dx}\, dx\right], \end{aligned} \quad (46.18)$$
where in the second line we have integrated by parts. Since at cos θ = 1 we have n_r = n_k, and at
cos θ = −1 we have n_r = −n_k, it follows that f(±1, φ) = f(±n_k). Moreover, at kr ≫ 1, the remaining
integrand e^{ikrx} is highly oscillatory, and therefore the last integral is close to zero, or rather it is much
smaller than the first two terms. These remaining terms are independent of φ, since n_k is a fixed direction
in terms of \vec r, which means that we can trivially perform $\int_0^{2\pi} d\phi = 2\pi$. Then, finally, we have
 
$$I \simeq \frac{2\pi}{ik}\left[f(\vec n_k)\,\frac{e^{ikr}}{r} - f(-\vec n_k)\,\frac{e^{-ikr}}{r}\right] \quad (46.19)$$

when kr ≫ 1. We note then that this integral I gives the same result as is obtained by replacing e^{i\vec k\cdot\vec r} with
$$\frac{2\pi}{ik}\left[\delta(\vec n_k - \vec n_r)\,\frac{e^{ikr}}{r} - \delta(\vec n_k + \vec n_r)\,\frac{e^{-ikr}}{r}\right] \quad (46.20)$$
in the original integral, thus proving the relation. q.e.d.
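This stationary-phase behavior is easy to check numerically. The sketch below (the test function f and the value of kr are arbitrary choices, not from the text) compares the exact angular integral with the asymptotic formula (46.19):

```python
import numpy as np
from scipy.integrate import quad

# For a phi-independent test function f(x), x = cos(theta), the integral is
# I = 2*pi * int_{-1}^{1} e^{i k r x} f(x) dx.
f = lambda x: 1.0 + x + x**2     # arbitrary smooth test function
kr = 400.0                       # "large" k*r

re = quad(lambda x: np.cos(kr * x) * f(x), -1, 1, limit=400)[0]
im = quad(lambda x: np.sin(kr * x) * f(x), -1, 1, limit=400)[0]
I_exact = 2 * np.pi * (re + 1j * im)

# Asymptotic formula (46.19): only x = +1 and x = -1 contribute,
# I ~ (2*pi/(i*k*r)) * [f(+1) e^{i k r} - f(-1) e^{-i k r}].
I_asym = (2 * np.pi / (1j * kr)) * (f(1.0) * np.exp(1j * kr)
                                    - f(-1.0) * np.exp(-1j * kr))

print(abs(I_exact - I_asym) / abs(I_asym))  # relative error of order 1/(kr)
```

The neglected integrated-by-parts term is down by another power of kr, so the agreement improves as kr grows.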

46.3 Scattering Solution, Cross Sections, and S-Matrix

After seeing the behavior at infinity of the solution of the Schrödinger equation for a finite-range
potential, we need to construct a physical set-up that will describe the scattering of particles, from
the stationary (scattered wave) point of view.
Figure 46.1 Generic scattering situation in three dimensions with a potential V (the arbitrary shape with solid lines).

Certainly at r → ∞ (away from the potential), we want to have an incident free-particle wave,
e^{i\vec k\cdot\vec r}. We have already seen that at infinity this wave decomposes into an in part and an out part,
as for any wave. Yet, since we are in the stationary-wave point of view (with the time-independent
Schrödinger equation), the wave should have a scattered component, but that component must only
be an outgoing wave (a diverging spherical wave), since it is generated at the moment of scattering;
see Fig. 46.1. Then we have the following ansatz for the stationary wave function at r → ∞:
$$u_k^+(\vec r) \simeq e^{i\vec k\cdot\vec r} + f_k(\vec n_r)\,\frac{e^{ikr}}{r} = u_{\rm inc}(\vec r) + u_{\rm scatt}(\vec r). \quad (46.21)$$
We saw earlier that the probability current is
$$\vec j = \frac{\hbar}{2im}\left(\psi^*\,\vec\nabla\psi - \psi\,\vec\nabla\psi^*\right). \quad (46.22)$$
2im
When substituting the above wave function, we obtain an incident current, a scattered current, and
an interference term,
$$\vec j_k^{\,+} \simeq \vec j_{\rm inc} + \vec j_{\rm scatt} + \vec j_{\rm interf} \quad (46.23)$$
when r → ∞. Specifically, we find

$$\vec j_{\rm inc} = \frac{\hbar \vec k}{m}, \qquad \vec j_{\rm scatt} = \frac{\hbar k}{m}\,\frac{|f|^2}{r^2}\,\vec n_r, \quad (46.24)$$
where \vec n_r = \vec r/r. We note that |u_{\rm scatt}|² = |f|²/r² is the scattered probability density. Then the interference current term is
$$\vec j_{\rm interf} = -\frac{\hbar k}{m}\,\frac{4\pi}{k r^2}\,(\operatorname{Im} f)\,\delta(\vec n_k - \vec n_r), \quad (46.25)$$
which means that we can neglect it except in the forward direction.
We saw in Chapter 37 that the definition of the differential cross section is
$$d\sigma = \frac{dP_{\rm scatt}(d\Omega, dt)/dt}{j_{\rm inc}}, \quad (46.26)$$
and the incident current is j_{\rm inc} = ℏk/m, while the probability flow rate through the surface dS = r²dΩ is
$$dP_{\rm scatt}(d\Omega, dt)/dt = j_{\rm scatt}\, dS = j_{\rm scatt}\, r^2 d\Omega = \frac{\hbar k}{m}\,|f|^2\, d\Omega. \quad (46.27)$$
Thus we obtain
$$\frac{d\sigma}{d\Omega} = |f|^2. \quad (46.28)$$
The conservation of probability equation is

$$\vec\nabla\cdot\vec j + \frac{\partial |\psi|^2}{\partial t} = 0. \quad (46.29)$$
In the stationary (time-independent) case we have ∂|ψ| 2 /∂t = 0, so after integrating over a volume
V bounded by Σ∞ , we obtain
   

$$0 = \int_V d^3 r\, \vec\nabla\cdot\vec j = \oint_{\Sigma_\infty} d\vec S\cdot\vec j = \oint_{\Sigma_\infty} dS\, j_r = \frac{\hbar}{2im}\oint_{\Sigma_\infty} r^2 d\Omega\left(\psi^*\,\partial_r\psi - \psi\,\partial_r\psi^*\right). \quad (46.30)$$
We can choose the whole of the Σ∞ boundary to be within the r → ∞ limit, in which case we can
replace ψ with its general asymptotic form in terms of e^{±ikr}/r with coefficients A and B. Then the
conservation of probability becomes
$$\frac{\hbar}{2im}\oint d\Omega\, r^2\,\frac{2ik}{r^2}\left[A^* A - B^* B\right] = 0 \;\Rightarrow\; \oint d\Omega\, |A|^2 = \oint d\Omega\, |B|^2. \quad (46.31)$$

This means that knowing one of the coefficients A, B implies a value for the average of the other, and
thus both coefficients must be nonzero. Moreover, it means that there exists a linear relation between
them. Indeed, if we have several waves expanded in this way,
$$\psi_n \simeq A_n\,\frac{e^{ikr}}{r} + B_n\,\frac{e^{-ikr}}{r}, \quad (46.32)$$
then a linear combination of the waves means linear combinations of the coefficients,
$$\sum_n C_n \psi_n \simeq \left(\sum_n C_n A_n\right)\frac{e^{ikr}}{r} + \left(\sum_n C_n B_n\right)\frac{e^{-ikr}}{r}, \quad (46.33)$$

which in turn implies that


$$\oint d\Omega\,\left|\sum_n C_n A_n\right|^2 = \oint d\Omega\,\left|\sum_n C_n B_n\right|^2. \quad (46.34)$$

This is only possible if the out coefficient is an operator times the in coefficient,

A = Ŝ · B, (46.35)

which more precisely leads to




$$A(\vec n\,') = \int d^2 n\; S(\vec n\,', \vec n)\, B(\vec n), \quad (46.36)$$
where we have generalized from \vec n\,' = \vec n_r, \vec n = −\vec n_r to general \vec n, \vec n\,'.


 
However, to satisfy the law of conservation of probability, ∮dΩ|A|² = ∮dΩ|B|², we also require
that Ŝ is a unitary operator, Ŝ † = Ŝ −1 (obtained by substituting A = Ŝ · B into the conservation law).
This Ŝ refers to the S-operator, or S-matrix.
We note that this is a generalization of the one-dimensional case, where there are only two states
|k±⟩, to the case of an infinity of states, called “channels”, for the in and out states.

46.4 S-Matrix and Optical Theorem

We now expand the free wave e^{i\vec k\cdot\vec r} in the scattering stationary wave (46.21) into in and out
components, changing the notation as follows: f_k(\vec n_r) = f_k(\vec n_r, \vec n_k) and u_k^+(\vec r) = u_k^+(\vec n_r, \vec n_k; r), in
order to emphasize the dependence on angles. We obtain
$$u_k^+(\vec n_r, \vec n_k; r) \simeq \left[\frac{2\pi}{ik}\,\delta(\vec n_r - \vec n_k) + f_k(\vec n_r, \vec n_k)\right]\frac{e^{ikr}}{r} + \frac{2\pi}{ik}\,\delta(-\vec n_r - \vec n_k)\,\frac{e^{-ikr}}{r}. \quad (46.37)$$
We generalize nr to n  and −nr to n as before, leading to

$$A(\vec n\,') = \frac{2\pi}{ik}\,\delta(\vec n\,' - \vec n_k) + f_k(\vec n\,', \vec n_k), \qquad B(\vec n) = \frac{2\pi}{ik}\,\delta(\vec n - \vec n_k). \quad (46.38)$$
Then the S-matrix relation A = Ŝ · B becomes
$$A(\vec n\,') = \frac{2\pi}{ik}\,\delta(\vec n\,' - \vec n_k) + f_k(\vec n\,', \vec n_k) = \int d^2 n\; S(\vec n\,', \vec n)\,\frac{2\pi}{ik}\,\delta(\vec n - \vec n_k) = \frac{2\pi}{ik}\, S(\vec n\,', \vec n_k). \quad (46.39)$$
Therefore the S-matrix contains a trivial (identity) part, plus a nontrivial part,
$$S(\vec n_r, \vec n_k) = \delta(\vec n_r - \vec n_k) + \frac{ik}{2\pi}\, f_k(\vec n_r, \vec n_k), \quad (46.40)$$

or, in operator terms,
$$\hat S = 1 + \frac{ik}{2\pi}\,\hat f. \quad (46.41)$$

Here S and f are amplitudes for scattering, with S containing also the trivial (not the interaction)
part. We will see that fˆ is related to the T-operator defined in the previous chapter.
Since Ŝ is a unitary operator, as we saw before, meaning that it conserves probability as it relates
the in and out states, we find
  
$$1 = \hat S \hat S^\dagger = \left(1 + \frac{ik}{2\pi}\,\hat f\right)\left(1 - \frac{ik}{2\pi}\,\hat f^\dagger\right) = 1 + \frac{ik}{2\pi}\left(\hat f - \hat f^\dagger\right) + \left(\frac{k}{2\pi}\right)^2 \hat f\hat f^\dagger. \quad (46.42)$$
Finally, we obtain
$$\frac{\hat f - \hat f^\dagger}{2i} = \frac{k}{4\pi}\,\hat f\hat f^\dagger. \quad (46.43)$$
Changing from operators to matrices,

fˆ = f (n , nk ) ⇒ fˆ† = f ∗ (nk , n  ), (46.44)

where \vec n\,' = \vec n_r, and considering forward scattering, so that \vec n_r = \vec n_k, we obtain
$$\operatorname{Im} f(\vec n_k, \vec n_k) = \frac{k}{4\pi}\int d^2 n_r\, |f(\vec n_r, \vec n_k)|^2 = \frac{k}{4\pi}\,\sigma_{\rm tot}, \quad (46.45)$$
where in the last equality we have used the fact that d 2nr = dΩ and dσ/dΩ = | f | 2 .
The above form is the most common (though not the most general) form of the optical theorem
for scattering.
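The theorem can be checked numerically on a manifestly unitary amplitude. The sketch below uses the partial-wave form f(θ) = (1/k) Σ_l (2l+1) e^{iδ_l} sin δ_l P_l(cos θ) (anticipating Chapter 48; the wave number and phase shifts are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_legendre

k = 1.3                      # wave number (arbitrary)
deltas = [0.7, 0.4, 0.1]     # phase shifts for l = 0, 1, 2 (arbitrary)

def f(costh):
    # unitary amplitude: each partial wave has a_l = e^{i delta_l} sin(delta_l)/k
    return sum((2 * l + 1) * np.exp(1j * d) * np.sin(d) / k
               * eval_legendre(l, costh)
               for l, d in enumerate(deltas))

# sigma_tot = int |f|^2 dOmega = 2*pi * int_{-1}^{1} |f|^2 d(cos theta)
sigma_tot = 2 * np.pi * quad(lambda x: abs(f(x))**2, -1, 1)[0]

# optical theorem (46.45): Im f(forward) = (k/4pi) * sigma_tot
print(f(1.0).imag, k / (4 * np.pi) * sigma_tot)  # the two numbers agree
```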

46.5 Green’s Functions and Lippmann–Schwinger Equation

We now turn to equations for the wave functions, specifically the generalization to three dimensions
of the Lippmann–Schwinger equation. Now the relations we will prove are valid even for the case of
a potential with an infinite range, as in the Coulomb case.
We trivially generalize the one-dimensional case from the previous chapter to define the Green's
function in an abstract form,
$$(E - \hat H_0)\,\hat G_0 = \frac{\hbar^2}{2m}. \quad (46.46)$$
However, the coordinate representation is also trivially generalized, since
$$\langle \vec r\,|\hat G_0|\vec r\,'\rangle = G_0(\vec r, \vec r\,'), \qquad \langle \vec r\,|1|\vec r\,'\rangle = \delta^3(\vec r - \vec r\,'). \quad (46.47)$$

Continuing to use the abstract form, we find
$$\hat G_0 = \frac{\hbar^2}{2m}\,\frac{1}{E - \hat H_0}, \quad (46.48)$$
or, more precisely, since we need to avoid the singularities appearing at the solutions of the (free)
Schrödinger equation, (E − Ĥ₀)|ψ_k⟩ = 0, we need to add ±iε to the energy, leading to two Green's
functions,
$$\hat G_0^{(\pm)} = \lim_{\epsilon\to 0}\,\frac{\hbar^2}{2m}\,\frac{1}{E \pm i\epsilon - \hat H_0}. \quad (46.49)$$

Also as in the one-dimensional case (trivially generalized in the abstract form), we find the
Lippmann–Schwinger equation,
$$|\psi_\pm\rangle = |\vec k\rangle + \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat V|\psi_\pm\rangle, \quad (46.50)$$
where |\vec k⟩ is the free-particle state of wave vector \vec k, (E − Ĥ₀)|\vec k⟩ = 0. Note here that we have defined
|ψ±⟩ as the state corresponding to Ĝ₀^{(±)}, though we will soon see that it also corresponds to having an
outgoing or incoming scattered wave, meaning that ψ₁ = ψ₊, ψ₂ = ψ₋ are the solutions at infinity.
Proof To prove the Lippmann–Schwinger equation, we multiply it by (E ± iε − Ĥ₀), obtaining
$$(E \pm i\epsilon - \hat H_0)|\psi_\pm\rangle = (E \pm i\epsilon - \hat H_0)|\vec k\rangle + (E \pm i\epsilon - \hat H_0)\,\hat G_0\,\frac{2m}{\hbar^2}\hat V|\psi_\pm\rangle \;\Rightarrow\; (E \pm i\epsilon - \hat H_0 - \hat V)|\psi_\pm\rangle = 0, \quad (46.51)$$
where we have used the fact that the first term on the right-hand side vanishes (as ε → 0) by the free-particle
Schrödinger equation, and in the second term we have used the definition of Ĝ₀ to write it as V̂|ψ±⟩.
q.e.d.

Note that we can define Ĝ₀ fully on the complex plane, not just an infinitesimal distance away
from the real line,
$$\hat G_0(z) = \frac{\hbar^2}{2m}\,\frac{1}{z - \hat H_0}, \quad (46.52)$$
and then the Lippmann–Schwinger equation in this more general case is
$$|\psi(z)\rangle = |\vec k_z\rangle + \frac{2m}{\hbar^2}\,\hat G_0(z)\hat V|\psi(z)\rangle. \quad (46.53)$$
Then in coordinate space, i.e., multiplying with ⟨\vec r|, we get
$$\psi_z(\vec r) = \phi_z(\vec r) + \frac{2m}{\hbar^2}\int d^3 r'\, G_0(z; \vec r, \vec r\,')\, V(\vec r\,')\,\psi_z(\vec r\,'). \quad (46.54)$$
We now go back and specialize to z = E ± iε, which means that
$$\psi_\pm(\vec r) = \phi_E(\vec r) + \frac{2m}{\hbar^2}\int d^3 r'\, G_0^{(\pm)}(\vec r, \vec r\,')\, V(\vec r\,')\,\psi_\pm(\vec r\,'). \quad (46.55)$$
The (free) Green's function in coordinate space,
$$\frac{2m}{\hbar^2}\,G_0^{(\pm)}(\vec r, \vec r\,') = \left\langle \vec r\,\middle|\,\frac{1}{E \pm i\epsilon - \hat H_0}\,\middle|\,\vec r\,'\right\rangle, \quad (46.56)$$
is not diagonal, but in momentum space it is diagonal, since
$$\frac{2m}{\hbar^2}\,G_0^{(\pm)}(\vec p, \vec p\,') = \left\langle \vec p\,\middle|\,\frac{1}{E \pm i\epsilon - \hat H_0}\,\middle|\,\vec p\,'\right\rangle = \frac{\langle \vec p|\vec p\,'\rangle}{E \pm i\epsilon - p^2/2m} = \frac{\delta^3(\vec p - \vec p\,')}{E \pm i\epsilon - p^2/2m}. \quad (46.57)$$

This means that we can obtain a simpler form for the Lippmann–Schwinger equation in momentum
space,
$$\psi_\pm(\vec p) = \phi_E(\vec p) + \frac{2m}{\hbar^2}\int d^3 p'\, G_0^{(\pm)}(\vec p, \vec p\,')\, V(\vec p\,', \psi_\pm) = \phi_E(\vec p) + \frac{1}{E \pm i\epsilon - \frac{p^2}{2m}}\, V(\vec p, \psi_\pm), \quad (46.58)$$
where
$$V(\vec p, \psi_\pm) \equiv \langle \vec p\,|\hat V|\psi_\pm\rangle. \quad (46.59)$$


To obtain an explicit form for the Lippmann–Schwinger equation in coordinate space, we must
first find the (free) Green's function in coordinate space. We insert momentum-space completeness
relations in order to relate it to the momentum-space Green's function, obtaining
$$\begin{aligned} G_0^{(\pm)}(\vec r, \vec r\,') &= \int d^3 p \int d^3 p'\, \langle \vec r|\vec p\rangle\,\langle \vec p|\hat G_0^{(\pm)}|\vec p\,'\rangle\,\langle \vec p\,'|\vec r\,'\rangle \\ &= \frac{\hbar^2}{2m}\int d^3 p \int d^3 p'\, \frac{e^{i\vec p\cdot\vec r/\hbar}}{(2\pi\hbar)^{3/2}}\,\frac{\delta^3(\vec p - \vec p\,')}{E \pm i\epsilon - p^2/2m}\,\frac{e^{-i\vec p\,'\cdot\vec r\,'/\hbar}}{(2\pi\hbar)^{3/2}} \\ &= \frac{\hbar^2}{2m}\int \frac{d^3 p}{(2\pi\hbar)^3}\,\frac{e^{i\vec p\cdot(\vec r - \vec r\,')/\hbar}}{E \pm i\epsilon - p^2/2m} \\ &= \frac{1}{(2\pi)^3}\int_0^\infty q^2 dq \int_0^{2\pi} d\phi \int_0^\pi d\theta\,\sin\theta\,\frac{e^{iq|\vec r - \vec r\,'|\cos\theta}}{k^2 - q^2 \pm i\epsilon}, \quad (46.60) \end{aligned}$$
where the θ, φ angles refer to the \vec r − \vec r\,' vector relative to the \vec p vector, E = ℏ²k²/(2m), and \vec p = ℏ\vec q.
Integrating using $\int_0^{2\pi} d\phi = 2\pi$ and $\int_0^\pi d\theta\,\sin\theta = \int_{-1}^{1} d(\cos\theta)$, we find
$$\begin{aligned} G_0^{(\pm)}(\vec r, \vec r\,') &= \frac{1}{4\pi^2}\int_0^\infty q^2 dq\,\frac{e^{iq|\vec r - \vec r\,'|} - e^{-iq|\vec r - \vec r\,'|}}{iq|\vec r - \vec r\,'|\,(k^2 - q^2 \pm i\epsilon)} \\ &= -\frac{1}{8\pi^2 i|\vec r - \vec r\,'|}\int_{-\infty}^{+\infty} q\, dq\,\frac{e^{iq|\vec r - \vec r\,'|} - e^{-iq|\vec r - \vec r\,'|}}{q^2 - k^2 \mp i\epsilon}, \quad (46.61) \end{aligned}$$
where in the last equality we have used the fact that the integrand is even in q, so $\frac{1}{2}\int_{-\infty}^{+\infty} q\, dq\,(\cdots) = \int_0^\infty q\, dq\,(\cdots)$. The integrand has poles at
$$q = \pm\sqrt{k^2 \pm i\epsilon} \simeq \pm k\,(1 \pm i\epsilon'). \quad (46.62)$$
To calculate the above integral, we use the standard residue theorem on the complex plane. We first
extend the integral over the complex plane, and then close the contour of integration over the real line

with a semicircle at infinity in the upper half-plane for the term with eiq |r −r | , since if |q| → ∞ in the

upper half-plane, it becomes ∼ e−(Im q) |r −r | → 0. We close the contour with a semicircle at infinity

in the lower half plane for the term with e−iq |r −r | , since if |q| → ∞ in the lower half-plane, the

term becomes ∼ e+(Im q) |r −r | → 0. This means that addition of the semicircles does not change the
final result; however, now that the contour is closed we can use the residue theorem, and find that the
integral is equal to 2πi times the residue(s) inside the contour, with a plus sign for counterclockwise
contour integration and a minus for clockwise contour integration.
This means that the integral over the real line, divided by 2πi, gives, in the cases of G₀^{(±)}, the following
results:
$$\begin{aligned} G_0^{(+)} &\to \frac{k}{2k}\, e^{ik|\vec r - \vec r\,'|} - \frac{(-k)}{2(-k)}\,(-)\, e^{-i(-k)|\vec r - \vec r\,'|} = e^{ik|\vec r - \vec r\,'|} \\ G_0^{(-)} &\to \frac{-k}{-2k}\, e^{i(-k)|\vec r - \vec r\,'|} - \frac{k}{2k}\,(-)\, e^{-ik|\vec r - \vec r\,'|} = e^{-ik|\vec r - \vec r\,'|}, \quad (46.63) \end{aligned}$$
where in the first line the first term comes from the +k + i residue and has a counterclockwise
contour and the second term comes from the −k − i residue and has a clockwise contour; in the
second line the first term comes from the −k + i residue and has a counterclockwise contour and the
second term comes from the +k − i residue and has a clockwise contour, as in Fig. 46.2.
Then we find the coordinate-space Green's function
$$G_0^{(\pm)}(\vec r, \vec r\,') = -\frac{1}{4\pi|\vec r - \vec r\,'|}\, e^{\pm ik|\vec r - \vec r\,'|}. \quad (46.64)$$
Figure 46.2 Complex poles relevant for G₀^{(+)} and G₀^{(−)}.

We note that this is the same Green’s function as for the equation
$$(\Delta + k^2)\, G = \delta^3(\vec r - \vec r\,'). \quad (46.65)$$
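Away from the origin (where the delta function plays no role), one can verify this directly: writing u(R) = R G₀^{(+)}(R), the equation reduces to u'' + k²u = 0. A minimal finite-difference sketch (step size and parameter values are arbitrary choices):

```python
import numpy as np

k = 2.0
R = np.linspace(0.5, 3.0, 2001)   # stay away from R = 0
h = R[1] - R[0]

# u(R) = R * G0^(+)(R) = -exp(i k R)/(4 pi); then u'' + k^2 u = 0 for R > 0
u = -np.exp(1j * k * R) / (4 * np.pi)
u_dd = (u[2:] - 2 * u[1:-1] + u[:-2]) / h**2   # centered second derivative

residual = u_dd + k**2 * u[1:-1]
print(np.max(np.abs(residual)))   # O(h^2), i.e., tiny
```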
Now we are ready to find an explicit form for the coordinate-space Lippmann–Schwinger equation.
Substituting G₀^{(±)}(\vec r, \vec r\,') into (46.55) and substituting φ_E(\vec r) = e^{i\vec k\cdot\vec r}, we find
$$\psi_\pm(\vec r) = e^{i\vec k\cdot\vec r} - \frac{2m}{4\pi\hbar^2}\int d^3 r'\,\frac{e^{\pm ik|\vec r - \vec r\,'|}}{|\vec r - \vec r\,'|}\, V(\vec r\,')\,\psi_\pm(\vec r\,'). \quad (46.66)$$
Specializing to a potential of finite range, so that \vec r\,' belongs to the finite volume V that is the
domain of the potential, it follows that for r → ∞ we have
$$|\vec r - \vec r\,'| \simeq \sqrt{r^2 - 2rr'\cos\alpha} \simeq r\left(1 - \frac{r'}{r}\cos\alpha\right) = r - \vec n_r\cdot\vec r\,', \quad (46.67)$$
where α is the angle between \vec r and \vec r\,', and \vec n_r = \vec r/r. Then at r → ∞ we keep in the exponential
both the infinite phase and the finite phase (the first two terms in the expansion), since the latter is
integrated over,
$$e^{\pm ik|\vec r - \vec r\,'|} \simeq e^{\pm ikr}\, e^{\mp ik\vec n_r\cdot\vec r\,'}, \quad (46.68)$$

and in the denominator we can replace |\vec r − \vec r\,'| with the leading term r, since we are only interested
in the leading behavior of the integral.
Then we obtain

$$\psi_\pm(\vec r) \simeq e^{i\vec k\cdot\vec r} - \frac{e^{\pm ikr}}{r}\,\frac{2m}{4\pi\hbar^2}\int d^3 r'\, e^{\mp i\vec k'\cdot\vec r\,'}\, V(\vec r\,')\,\psi_\pm(\vec r\,'), \quad (46.69)$$
where we have defined
$$\vec k' \equiv k\,\vec n_r. \quad (46.70)$$
We note that we have reached the general scattering wave function in the stationary point of view
if we make the identification
$$f_k^\pm(\vec n_r, \vec n_k) \equiv f^{(\pm)}(\vec k', \vec k) \equiv -\frac{2m}{4\pi\hbar^2}\int d^3 r'\, e^{\mp i\vec k'\cdot\vec r\,'}\, V(\vec r\,')\,\psi_\pm(\vec r\,'), \quad (46.71)$$
corresponding to an outgoing or incoming wave ψ±: that is, the state |ψ±⟩, previously defined
by the term ±iε in the Green's function, actually corresponds to the scattered wave-function term
e^{±ikr} f^{(±)}/r. Then it follows that, since in most cases we are interested only in the outgoing case
|ψ₊⟩, we are more interested in the Green's function G₀^{(+)}.

Important Concepts to Remember

• For three-dimensional scattering, at r → ∞ we have two asymptotic solutions: outgoing or “out”,
ψ₁ = A(k, \vec n_r) e^{ikr}/r, and incoming or “in”, ψ₂ = B(k, −\vec n_r) e^{−ikr}/r.
• A free wave e^{i\vec k\cdot\vec r} is expanded at infinity in the in and out waves, with A(k, \vec n_r) = (2π/ik)δ(\vec n_k − \vec n_r)
and B(k, −\vec n_r) = −(2π/ik)δ(\vec n_k + \vec n_r).
• In the case of scattering in a potential, we have at infinity the stationary wave-function ansatz
u_k^+(\vec r) = e^{i\vec k\cdot\vec r} + f_k(\vec n_r) e^{ikr}/r = u_inc(\vec r) + u_scatt(\vec r), and differential cross section dσ/dΩ = |f|².
• For a generic in plus out wave, ψ ≃ A e^{ikr}/r + B e^{−ikr}/r, we have A = Ŝ · B, or A(\vec n\,') =
∫ d²n S(\vec n\,', \vec n) B(\vec n), where Ŝ is the unitary S-matrix or S-operator.
• Then we have Ŝ = 1 + (ik/2π) f̂, or S(\vec n_r, \vec n_k) = δ(\vec n_r − \vec n_k) + (ik/2π) f(\vec n_r, \vec n_k), and the optical
theorem, Im f(\vec n_k, \vec n_k) = (k/4π) ∫ d²n_r |f|² = (k/4π)σ_tot.
• The Green's functions and Lippmann–Schwinger equations in the abstract form are trivially
generalized from one dimension, and we can further generalize to the complex plane,
$$\hat G_0(z) = \frac{\hbar^2}{2m}\,\frac{1}{z - \hat H_0}.$$
• In the coordinate representation, the Lippmann–Schwinger equation becomes
$$\psi_\pm(\vec r) = \phi_E(\vec r) + \frac{2m}{\hbar^2}\int d^3 r'\, G_0^{(\pm)}(\vec r, \vec r\,')\, V(\vec r\,')\,\psi_\pm(\vec r\,'),$$
with
$$G_0^{(\pm)}(\vec r, \vec r\,') = -\frac{1}{4\pi|\vec r - \vec r\,'|}\, e^{\pm ik|\vec r - \vec r\,'|}.$$

Further Reading
See [2] and [1].

Exercises

(1) Consider a central potential of the spherical-step type, V = V0 > 0 for r ≤ R and V = 0 for
r > R, and a solution with energy E > V0 and given angular momentum l > 0. If the solution is
assumed to be square integrable at r = 0, calculate the coefficients A and B at infinity.
(2) The decomposition of the free wave at infinity, (46.15), seems counterintuitive since the left-
hand side is certainly nonzero (in fact, naively, it is of order 1!) if k is other than parallel or
antiparallel to r . In what sense should we understand this relation, then?
(3) If the f k (nr ) in (46.21) were independent of nr , would that contradict unitarity or not?
(4) For the case in exercise 1, calculate the total cross section and the S-matrix.
(5) Consider the wave ψ = Aeikr/r + Be−ikr/r , with A, B = (2π/ik)δ(nk ± nr ) + a, b, where a, b
are real constants. Can it be understood as a scattering solution? If so, calculate the total cross
section. What if a, b are imaginary constants?
(6) Calculate the generalization of G±0 (r , r  ) on the complex plane for energy, G0 (z; r , r  ).
(7) Calculate the equivalent of (46.69) for the generalization G0 (z; r , r  ) in exercise 6.
47 Born Approximation and Series, S-Matrix and T-Matrix

In this chapter, we define the Born approximation and series of higher-order approximations, and
connect with the time-dependent point of view for scattering, where we define the S- and T-matrices.

47.1 Born Approximation and Series

To solve the Lippmann–Schwinger integral equation, we use the same procedure as in the one-
dimensional case. We define the zeroth-order solution, namely the free (noninteracting) wave,
$$\psi_\pm^{(0)}(\vec r) = \phi_E(\vec r) = e^{i\vec k\cdot\vec r}. \quad (47.1)$$

Then we substitute this solution into the right-hand side of the Lippmann–Schwinger equation to find
the first-order interacting solution,
$$f^{(1,\pm)}(\vec k', \vec k) = -\frac{2m}{4\pi\hbar^2}\int d^3 r'\, e^{i\vec r\,'\cdot(\vec k \mp \vec k')}\, V(\vec r\,') = -\frac{2m}{4\pi\hbar^2}\, V(\vec k' \mp \vec k), \quad (47.2)$$

where we have defined the Fourier-transformed potential V (q ), where q = k  − k is the momentum
transfer. That means that the first-order term in the wave function solution is

$$\psi_\pm^{(1)}(\vec r) = -\frac{2m}{4\pi\hbar^2}\,\frac{e^{\pm ikr}}{r}\int d^3 r'\, e^{i\vec r\,'\cdot(\vec k \mp \vec k')}\, V(\vec r\,') = \frac{2m}{\hbar^2}\int d^3 r'\, G_0^{(\pm)}(\vec r, \vec r\,')\, e^{i\vec k\cdot\vec r\,'}\, V(\vec r\,'). \quad (47.3)$$

We can then put this into the right-hand side of the Lippmann–Schwinger equation to find the second-
order term in the wave function solution, etc.
Generally, though restricting to the physical + solution (and dropping the index when doing so),
we find the recursion relation for the (n + 1)th-order term,

$$\psi^{(n+1)}(\vec r) = \frac{2m}{\hbar^2}\int d^3 r'\, G_0(\vec r, \vec r\,')\,\psi^{(n)}(\vec r\,')\, V(\vec r\,'). \quad (47.4)$$


This recursion relation defines the Born series. The first-order term is the Born approximation.
We apply the Born approximation to a spherically symmetric potential, V(\vec r) = V(r). Since
|\vec k'| = |\vec k| = k, and defining θ as the angle between \vec n_r and \vec n_k, we have (see Fig. 47.1)
$$|\vec k' - \vec k| = 2k\,\sin\frac{\theta}{2} = q. \quad (47.5)$$
Figure 47.1 Geometry of scattering.
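Equation (47.5) is pure geometry: for |k′| = |k| = k, one has |k′ − k|² = 2k²(1 − cos θ) = 4k² sin²(θ/2). A one-line numerical sketch (the vectors below are arbitrary choices):

```python
import numpy as np

k = 2.5                                    # common magnitude of k and k'
theta = 1.1                                # scattering angle (arbitrary)
k_in = np.array([0.0, 0.0, k])
k_out = k * np.array([np.sin(theta), 0.0, np.cos(theta)])  # same magnitude

q = np.linalg.norm(k_out - k_in)           # momentum transfer
print(q, 2 * k * np.sin(theta / 2))        # the two agree
```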

The first-order solution is then
$$\begin{aligned} f^{(1)} &= -\frac{2m}{4\pi\hbar^2}\int_0^\infty r'^2 dr' \int_0^{2\pi} d\phi \int_{-1}^{1} d\cos\theta'\, e^{iqr'\cos\theta'}\, V(r') \\ &= -\frac{2m}{\hbar^2}\int_0^\infty r'^2 dr'\,\frac{e^{iqr'} - e^{-iqr'}}{2iqr'}\, V(r') \\ &= -\frac{2m}{4\pi\hbar^2}\,\frac{4\pi}{q}\int_0^\infty r'\, dr'\, V(r')\,\sin(qr') \\ &= -\frac{2m}{4\pi\hbar^2}\, V(q), \quad (47.6) \end{aligned}$$

where in the last line we have defined V (q), which matches the first-order term (47.2), specialized
to the spherically symmetric case, as well as the same formula in the derivation of the Born
approximation through the Fermi golden rule, in Chapter 37. However, note that there we had a
different normalization, with a 1/(2π) 3/2 factor.
We will take an example, the Yukawa potential
$$V = V_0\,\frac{e^{-\mu r}}{r}, \quad (47.7)$$

as in Chapter 37. There, we left as an exercise the proof of
$$V(q) = \frac{4\pi V_0}{\mu^2 + q^2}, \quad (47.8)$$

which leads to the Born approximation for the scattering amplitude,
$$f^{(1)} = -\frac{2m V_0}{\hbar^2}\,\frac{1}{\mu^2 + q^2}. \quad (47.9)$$
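The Fourier transform (47.8) can be checked numerically from the radial formula in (47.6), V(q) = (4π/q) ∫₀^∞ r′ dr′ V(r′) sin(qr′). A sketch (the parameter values are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad

V0, mu, q = 1.0, 1.5, 2.0   # arbitrary Yukawa strength, inverse range, momentum transfer

# radial Fourier transform of V(r) = V0 * exp(-mu r)/r
integrand = lambda r: V0 * np.exp(-mu * r) * np.sin(q * r)
Vq_numeric = (4 * np.pi / q) * quad(integrand, 0, np.inf, limit=200)[0]

Vq_closed = 4 * np.pi * V0 / (mu**2 + q**2)   # eq. (47.8)
print(Vq_numeric, Vq_closed)                  # agree
```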

Then the Born approximation for the differential cross section,
$$\frac{d\sigma}{d\Omega} = |f|^2 = \left(\frac{m}{2\pi\hbar^2}\right)^2 |V(q)|^2, \quad (47.10)$$

matches the Chapter 37 calculation using Fermi’s golden rule.



47.2 Time-Dependent Scattering Point of View

From the Fermi’s golden rule calculation of the Born approximation, which used first-order time-
dependent perturbation theory, we realize that we can use a time-dependent point of view for
scattering.
We review this calculation (though within our new formalism) for completeness. We define the
S-matrix as in the one-dimensional case in Chapter 45,

$$\hat S = \hat U_I(+\infty, -\infty), \quad (47.11)$$

which however is still the same matrix as that defined in the previous chapter, Aout = Ŝ · Bin , and then
the probability of transition is

$$\text{Probab.}(\vec p_{\rm in} \to d\Omega \text{ around } \vec p_f) = d\Omega\,|\langle \vec p_f|\hat S|\vec p_i\rangle|^2. \quad (47.12)$$

However, in time-dependent perturbation theory, at first order,
$$\hat U_I(+\infty, -\infty) = 1 - \frac{i}{\hbar}\int_{-\infty}^{+\infty} dt\, V_I(t), \quad (47.13)$$
as we saw in Chapter 38. Then we have
$$\frac{dP}{dt} = \frac{2\pi}{\hbar}\,|\langle \vec p_f|\hat V|\vec p_i\rangle|^2\,\rho(E_i, \vec n)\, d\Omega, \quad (47.14)$$
and the analysis in Chapter 37 follows.
In the case of scattering in a Coulomb potential, obtained as the μ → 0 limit of the Yukawa
potential, so that
$$V(q) = \frac{4\pi V_0}{q^2}, \quad (47.15)$$
we obtain the differential cross section
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2 \left(\frac{4\pi V_0}{q^2}\right)^2 = \frac{4m^2 V_0^2}{\hbar^4 q^4} = \frac{m^2 V_0^2}{4\hbar^4 k^4 \sin^4(\theta/2)}, \quad (47.16)$$
which is the Rutherford formula for V₀ = Zze₀², derived classically by Rutherford (so at first order
in the Born approximation, the quantum mechanical result does not improve upon the classical one).

47.3 Higher-Order Terms and Abstract States, S- and T-Matrices

We saw in the previous chapter that the differential cross section is



$$\frac{d\sigma}{d\Omega} = |f^+|^2 \quad (47.17)$$

and, from the Lippmann–Schwinger equation,

$$f^+(\vec k', \vec k) = -\frac{2m}{4\pi\hbar^2}\int d^3 r'\, e^{-i\vec k'\cdot\vec r\,'}\, V(\vec r\,')\,\psi_+(\vec r\,'). \quad (47.18)$$
In terms of abstract operators and states, we have
$$f^+(\vec k', \vec k) = -\frac{m}{2\pi\hbar^2}\,\langle \vec k'|\hat V|\psi_+\rangle, \quad (47.19)$$
so
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2 \left|\langle \vec k'|\hat V|\psi_+\rangle\right|^2, \quad (47.20)$$
where ⟨\vec k'| = ⟨\vec p_f| is a final momentum state, so the formula above generalizes the Born approximation
formula (47.10), which can be written as
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2 \left|\langle \vec p_f|\hat V|\vec p_i\rangle\right|^2. \quad (47.21)$$
However, ⟨\vec k'|V̂|ψ₊⟩ is not a matrix element in the Hilbert space basis, since we have two different
kinds of states on the left and on the right. In order to have a matrix element in the Hilbert space basis,
we need to define, as in the one-dimensional case in Chapter 45, the operator T̂:
$$\hat V|\psi_+\rangle \equiv \hat T|\vec k\rangle, \quad (47.22)$$
i.e., we need to replace the action on an interacting state by an action on the free state |\vec k⟩. Then
$$\langle \vec k'|\hat V|\psi_+\rangle = \langle \vec k'|\hat T|\vec k\rangle \quad (47.23)$$
is a matrix element in the Hilbert space basis.
But since the density of states is, as we saw in Chapter 37,
$$\rho(E) = \frac{mk}{(2\pi)^3\hbar^2}, \quad (47.24)$$
and since the velocity is v = ℏk/m, we can rewrite the differential cross section in terms of the
transition matrix, or T-matrix (the matrix element of the T-operator),
$$\frac{d\sigma}{d\Omega} = \frac{2\pi}{\hbar v}\,\rho(E)\,|\langle \vec k'|\hat V|\psi_+\rangle|^2 = \frac{2\pi}{\hbar v}\,\rho(E)\,|\langle \vec k'|\hat T|\vec k\rangle|^2. \quad (47.25)$$
Then we can identify the amplitude for scattering, f⁺, with the T-matrix, up to a constant,
$$f^+(\vec k', \vec k) = -\frac{m}{2\pi\hbar^2}\,\langle \vec k'|\hat T|\vec k\rangle, \quad (47.26)$$
or, for operators,
$$\hat f = -\frac{m}{2\pi\hbar^2}\,\hat T. \quad (47.27)$$
2π2
Moreover, from (46.41), we can relate the S-operator to the T-operator (and the S-matrix to the
T-matrix), since
$$\hat S = 1 + \frac{ik}{2\pi}\,\hat f = 1 - \frac{imk}{(2\pi)^2\hbar^2}\,\hat T. \quad (47.28)$$
As in the one-dimensional case (a trivial generalization), the abstract Lippmann–Schwinger
equation, and its “solution” through the T-operator, is
$$|\psi_\pm\rangle = |\vec k\rangle + \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat V|\psi_\pm\rangle = |\vec k\rangle + \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat T|\vec k\rangle. \quad (47.29)$$
Also as in the one-dimensional case, the full Born series is defined by the recursion relation
$$|\psi_\pm^{(n+1)}\rangle = \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat V|\psi_\pm^{(n)}\rangle. \quad (47.30)$$
Absorbing 2m/ℏ² into Ĝ₀^{(±)} for simplicity, the full Born series becomes
$$|\psi_\pm\rangle = |\vec k\rangle + \hat G_0^{(\pm)}\hat V|\vec k\rangle + (\hat G_0^{(\pm)}\hat V)^2|\vec k\rangle + \cdots = |\psi_\pm^{(0)}\rangle + |\psi_\pm^{(1)}\rangle + |\psi_\pm^{(2)}\rangle + \cdots = |\vec k\rangle + \hat G_0^{(\pm)}\left(\hat V + \hat V\hat G_0^{(\pm)}\hat V + \cdots\right)|\vec k\rangle. \quad (47.31)$$

Comparing with the Lippmann–Schwinger equation in terms of T̂, we obtain
$$\hat T = \hat V + \hat V\hat G_0\hat V + \hat V\hat G_0\hat V\hat G_0\hat V + \cdots = \hat V + \hat V\hat G_0\hat T. \quad (47.32)$$
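The series (47.32) is a geometric one, T̂ = V̂(1 − Ĝ₀V̂)⁻¹, and the resummation can be checked with finite matrices standing in for the operators (the matrices below are random stand-ins, scaled small enough for convergence; the 2m/ℏ² factor is absorbed into Ĝ₀ as above):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
V = 0.1 * rng.standard_normal((n, n))    # stand-in for the potential operator
G0 = 0.1 * rng.standard_normal((n, n))   # stand-in for the free Green's function

# truncated Born series: T = V + V G0 V + V G0 V G0 V + ...
T_series = np.zeros_like(V)
term = V.copy()
for _ in range(60):
    T_series += term
    term = V @ G0 @ term     # each iteration adds one more (V G0) insertion

T_closed = V @ np.linalg.inv(np.eye(n) - G0 @ V)   # resummed geometric series
print(np.max(np.abs(T_series - T_closed)))          # essentially zero
```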

The physical interpretation of the full Born series is obtained once we go to coordinate space, since
then the nth-order term is
 
$$\int d^3 r_1 \cdots d^3 r_n\; G_0(\vec r, \vec r_1)V(\vec r_1)\, G_0(\vec r_1, \vec r_2)V(\vec r_2)\cdots G_0(\vec r_{n-1}, \vec r_n)V(\vec r_n)\,\phi(\vec r_n). \quad (47.33)$$

Then we see that the incoming wave hits the potential at \vec r_n (the last integration variable, the
variable of the free wave function φ(\vec r_n)), interacts with the potential, and then propagates with the
propagator (Green's function) G₀ until \vec r_{n−1}, where it interacts again with V(\vec r_{n−1}), etc., until after the
last propagation we reach the point where we are “measuring” the wave function, \vec r.
We can also define the full Green's function,
$$\hat G \equiv \frac{1}{E - \hat H_0 - \hat V} = \frac{1}{\hat G_0^{-1} - \hat V} = \hat G_0 + \hat G_0\hat V\hat G_0 + \hat G_0\hat V\hat G_0\hat V\hat G_0 + \cdots = \hat G_0\left(1 + \hat T\hat G_0\right). \quad (47.34)$$

Acting with Ĝ⁻¹ on the difference between the full state and the free state, we get
$$(E - \hat H_0 - \hat V)\left(|\psi\rangle - |\vec k\rangle\right) = \hat V|\vec k\rangle, \quad (47.35)$$
where we have used the full Schrödinger equation for |ψ⟩ and the free Schrödinger equation for |\vec k⟩.
Then we have
$$|\psi\rangle - |\vec k\rangle = \frac{1}{E - \hat H_0 - \hat V}\,\hat V|\vec k\rangle = \hat G\hat V|\vec k\rangle = \hat G_0\left(1 + \hat T\hat G_0\right)\hat V|\vec k\rangle, \quad (47.36)$$
which defines the full Born series once again.
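The relation Ĝ = Ĝ₀(1 + T̂Ĝ₀) is an exact operator identity, so it can also be checked with finite matrices without any convergence issue (Ĥ₀, V̂, and the complex energy below are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
H0 = np.diag(rng.uniform(0.0, 1.0, n))     # stand-in free Hamiltonian
V = 0.05 * rng.standard_normal((n, n))
V = (V + V.T) / 2                          # Hermitian stand-in potential
z = 0.8 + 0.3j                             # complex energy E + i*eps

I = np.eye(n)
G0 = np.linalg.inv(z * I - H0)             # free Green's function
G = np.linalg.inv(z * I - H0 - V)          # full Green's function
T = V @ np.linalg.inv(I - G0 @ V)          # resummed T-operator

print(np.max(np.abs(G - G0 @ (I + T @ G0))))   # essentially zero
```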

47.4 Validity of the Born Approximation

Until now we have assumed that the Born approximation is valid, but we have not stated a condition
for this to be true.
We require that, for \vec r = 0, meaning at the maximum of the interaction (where the potential is
centered), the correction term is small with respect to the free term, which there equals 1; thus
$$|\psi_\pm^{(1)}(0)| = \frac{2m}{\hbar^2}\left|\int d^3 r'\, G_0(0, \vec r\,')\, e^{i\vec k\cdot\vec r\,'}\, V(\vec r\,')\right| = \frac{2m}{4\pi\hbar^2}\left|\int d^3 r'\,\frac{e^{ikr'}}{r'}\, V(\vec r\,')\, e^{i\vec k\cdot\vec r\,'}\right| \ll 1. \quad (47.37)$$
Note that the second equality is exact, obtained by just substituting G₀(0, \vec r\,') into the first expression.
Note that in the second line we have an exact equality, by just substituting G0 (0, r  ) into the first line.
For the particular case of the Yukawa potential,
$$V(r') = V_0\,\frac{e^{-\mu r'}}{r'}, \quad (47.38)$$
and at low energy, kr' ≪ 1, we obtain the condition
$$\frac{2m}{4\pi\hbar^2}\, V_0 \int d^3 r'\,\frac{e^{-\mu r'}}{r'^2} = \frac{2m V_0}{\hbar^2\mu} \ll 1, \quad (47.39)$$
where we have used $d^3 r'/r'^2 = dr'\, d\Omega$, integrated over r' using $\int_0^\infty dr'\, e^{-\mu r'} = 1/\mu$, and used $\int d\Omega = 4\pi$.
On the other hand, for a potential where V ∼ V₀ within a range r ≤ r₀, and at low energy kr' ≪ 1,
we have
$$\int d^3 r'\,\frac{V(r')}{r'} \simeq 4\pi\,\frac{V_0}{2}\, r_0^2, \quad (47.40)$$
which means that the Born approximation validity condition is
$$\frac{m V_0}{\hbar^2}\, r_0^2 \ll 1. \quad (47.41)$$
2 0
One can make a more precise analysis of the validity condition, in particular one that interpolates
to high energy, but we will not do so here. We just note that the region of validity of the Born
approximation is not in general the low-energy one, and is also different from that of the WKB
approximation.

Important Concepts to Remember

• The Born approximation is the first-order approximation to the Lippmann–Schwinger equation,
where we put the zeroth-order solution e^{i\vec k\cdot\vec r} on the right-hand side, and the Born series is obtained
from the successive terms of the iteration.
• The Born approximation gives the same result as a reinterpretation of the scattering as a time-dependent
process, with Fermi's golden rule at first order, giving dP/dt = (2π/ℏ)|⟨\vec p_f|V|\vec p_i⟩|²ρ dΩ.
• The Lippmann–Schwinger equation implies
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2 |\langle \vec k'|\hat V|\psi_+\rangle|^2,$$
with Born approximation
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2 |\langle \vec p_f|\hat V|\vec p_i\rangle|^2.$$
• One can define the T-matrix by V̂|ψ₊⟩ = T̂|\vec k⟩, so that the action on the interacting state is replaced
by the action of the T-operator on a free state, and then
$$\hat f = -\frac{m}{2\pi\hbar^2}\,\hat T.$$
• The Born approximation is valid in a region different from that of the WKB approximation, nor is
it simply the low-energy region. At low energy, for a Yukawa potential V₀e^{−μr}/r, the condition is
mV₀/(μℏ²) ≪ 1, and for a constant potential, V ∼ V₀ within a range r₀, it is mV₀r₀²/ℏ² ≪ 1.

Further Reading
See [2] and [1].

Exercises

(1) Calculate the first two terms in the Born series for a delta function potential, V (r) = −V0 δ3 (r ).
(2) Calculate the differential cross section for scattering in the Born approximation for a potential
V (r) = A/r 2 .
(3) Describe physically how it is possible that the Born approximation to quantum mechanical
scattering in a Coulomb potential gives the classical-scattering Rutherford formula.
(4) Calculate the first two terms in the Born series for a potential V (r) = A/(r 2 + a2 ), with
a = constant.
(5) Write down explicitly the Lippmann–Schwinger equation for the T-matrix in the Yukawa case
of V (r) = V0 e−μr /r.
(6) Write down and solve the Lippmann–Schwinger equation for the T-matrix in the case of the
delta function potential V = −V0 δ3 (r ).
(7) Is there a domain of validity of the Born approximation in the case of the Coulomb potential?
48 Partial Wave Expansion, Phase Shift Method, and Scattering Length

In this chapter we will consider spherically symmetric potentials, in which case we need to
describe states and wave functions of a given l. To do that, we define an expansion in angular
momentum l, the partial wave expansion, and the associated notions and methods of phase shift
and scattering length.

48.1 The Partial Wave Expansion

In the spherically symmetric case, the complete set of mutually commuting variables is {H, L², L_z},
so we have basis states that are eigenstates of those operators, |Elm⟩, and which are orthonormal
(including in the continuous energy variable),
$$\langle E'l'm'|Elm\rangle = \delta_{ll'}\,\delta_{mm'}\,\delta(E - E'). \quad (48.1)$$
In coordinate space, the wave function of a basis state is
$$\langle \vec r|Elm\rangle = u_{Elm}(\vec r) = R_{El}(r)\, Y_{lm}(\vec n_r). \quad (48.2)$$
We saw in Chapters 18 and 19 that, for a free particle, the solution for the radial wave function is
$$R_{El}(r) = C_l\, j_l(kr), \quad (48.3)$$

where jl is the spherical Bessel function. Moreover, the plane wave solution, ei k ·r , where k = p /,


is expanded in the above basis as




ei k ·r = eikr cos θ =

(2l + 1)i l jl (kr)Pl (cos θ)
l=0
(48.4)

∞ 
l
= alm jl (kr)Ylm (θ, φ),
l=0 m=−l

where (2l + 1)i^l ≡ a_l; the first sum is expanded in terms of only θ, since the expanded function
is a function only of cos θ, whereas the second sum is the general expansion of a wave function
in the ⟨\vec r|Elm⟩ basis. The relation between the two expansions is provided by the equality (from
Chapter 17)
$$P_l(\cos\theta) = \sqrt{\frac{4\pi}{2l+1}}\; Y_{l,0}(\theta, \phi), \quad (48.5)$$
which means that
$$a_{lm} = \sqrt{\frac{4\pi}{2l+1}}\;\delta_{m,0}\; a_l. \quad (48.6)$$

The expansion in terms of θ (in the first sum) follows from the relation
$$j_l(kr) = \frac{1}{2 i^l}\int_{-1}^{+1} d(\cos\theta)\; e^{ikr\cos\theta}\, P_l(\cos\theta). \quad (48.7)$$
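Both the expansion (48.4) and the projection (48.7) are easy to confirm numerically with scipy's spherical Bessel and Legendre functions (the values of kr, cos θ, l, and the truncation l_max below are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import spherical_jn, eval_legendre

kr, x = 10.0, 0.3        # k*r and cos(theta), arbitrary
lmax = 40                # truncation; needs l_max somewhat above kr

# truncated partial-wave sum for the plane wave, eq. (48.4)
s = sum((2 * l + 1) * 1j**l * spherical_jn(l, kr) * eval_legendre(l, x)
        for l in range(lmax + 1))
print(abs(s - np.exp(1j * kr * x)))   # tiny truncation error

# projection formula, eq. (48.7), for a single l
l = 3
re = quad(lambda t: np.cos(kr * t) * eval_legendre(l, t), -1, 1, limit=200)[0]
im = quad(lambda t: np.sin(kr * t) * eval_legendre(l, t), -1, 1, limit=200)[0]
jl = (re + 1j * im) / (2 * 1j**l)
print(jl.real, spherical_jn(l, kr))   # agree; imaginary part ~ 0
```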
Since at large values of the argument the spherical Bessel function becomes
$$j_l(kr) \simeq \frac{e^{ikr}\, i^{-l} - e^{-ikr}\, i^{l}}{2ikr}, \quad (48.8)$$
this means that, at r → ∞, the free plane wave becomes
$$e^{i\vec k\cdot\vec r} \simeq \frac{e^{ikr}}{r}\,\frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\, P_l(\cos\theta) - \frac{e^{-ikr}}{r}\,\frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\, i^{2l}\, P_l(\cos\theta). \quad (48.9)$$
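The asymptotic form (48.8) is equivalent to j_l(kr) ≃ sin(kr − lπ/2)/(kr), which can be compared against scipy at large argument (the values below are arbitrary choices):

```python
import numpy as np
from scipy.special import spherical_jn

kr = 300.0
for l in range(4):
    exact = spherical_jn(l, kr)
    # (e^{ikr} i^{-l} - e^{-ikr} i^{l})/(2 i kr) = sin(kr - l*pi/2)/(kr)
    asym = np.sin(kr - l * np.pi / 2) / kr
    print(l, exact, asym)   # agree to O(1/(kr)^2)
```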

Then, in terms of the general expansion at infinity, the coefficients of the free plane wave are
$$\begin{aligned} A(k, \vec n_r) &= \frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\, P_l(\cos\theta) = \frac{2\pi}{ik}\,\delta(\vec n_k - \vec n_r) \\ B(k, -\vec n_r) &= -\frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\, i^{2l}\, P_l(\cos\theta) = -\frac{2\pi}{ik}\,\delta(\vec n_k + \vec n_r). \quad (48.10) \end{aligned}$$
Then, when we have scattering in a central (spherically symmetric) potential, we saw that we can
make the replacement
$$A(k, \vec n_r) = A_k(\vec n\,') = \frac{2\pi}{ik}\,\delta(\vec n\,' - \vec n_k) = \frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\, P_l(\cos\theta) \;\to\; \frac{2\pi}{ik}\,\delta(\vec n\,' - \vec n_k) + f_k(\vec n\,', \vec n_k), \quad (48.11)$$
where \vec k' = k\vec n\,' generalizes k\vec n_r.
where k  = kn  generalizes knr .
However, given the expansion in cos θ of the coefficient A before the replacement, the added
term should also have such an expansion, in which 1/(2ik) is replaced by a general coefficient
a_l(k); i.e., we define
$$f_k(\vec n\,', \vec n_k) = f(\theta) = \sum_{l=0}^{\infty}(2l+1)\, a_l(k)\, P_l(\cos\theta), \quad (48.12)$$
where θ is the angle between \vec n\,' and \vec n_k.


Here a_l(k) is called the lth partial wave amplitude. Given the above expansion, the scattering
solution for a potential with a finite range, i.e., a free plane wave plus a diverging spherical wave,
expands into partial waves as follows:
$$\psi_+(\vec r) = e^{i\vec k\cdot\vec r} + f_k(\vec n_r, \vec n_k)\,\frac{e^{ikr}}{r} \simeq \frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\, P_l(\cos\theta)\left\{\frac{e^{ikr}}{r}\big(1 + 2ik\, a_l(k)\big) - \frac{e^{-ikr}}{r}\, i^{2l}\right\}. \quad (48.13)$$
48.2 Phase Shifts

The boundary conditions for the wave function in the spherically symmetric scattering case are
imposed on R_{El}(r), and define the abstract state |Elm⟩ in coordinate space. But the physical situation
we have now, unlike the previous derivations (in Chapters 18 and 19) of bound states in the case of,
say, the hydrogen atom, is that of a scattering solution, which means an incoming plane wave plus
an outgoing (or incoming) spherical wave giving the state |ψ±⟩. So, really, it is more useful to write
|Elm±⟩ for the abstract state and R_{El}^{±}(r) for the two outgoing or incoming solutions. If we
do not use the ± index, it means that we are considering a general linear combination of the two
solutions.
Then the general solution of the Schrödinger equation |ψ is expanded in terms of some basis
|Elm (a linear combination of the |Elm± states) as

∞ 
l 
∞ 
l
ψk (r ) = Klm REl (r)Ylm (θ, φ) = Klm uElm (r ), (48.14)
l=0 m=−l l=0 m=−l
±
where REl (r) is a linear combination of REl (r).
At $r\to\infty$, we have the Helmholtz equation
$$(\Delta + k^2)R = 0,\qquad (48.15)$$
so its general solution is either a linear combination of the spherical Bessel function $j_l(kr)$ and the spherical Neumann function $n_l(kr)$ (also denoted $y_l(kr)$), or a linear combination of the spherical Hankel functions of the first and second kind, $h_l^{(1)}(kr)$ and $h_l^{(2)}(kr)$,
$$R_{El}(r) \sim C_l\,j_l(kr) + D_l\,n_l(kr) = A_l\,h_l^{(1)}(kr) + B_l\,h_l^{(2)}(kr),\qquad (48.16)$$
where, since $h_l^{(1,2)} = j_l \pm i n_l$, we have
$$C_l = A_l + B_l,\qquad D_l = iA_l - iB_l.\qquad (48.17)$$
But, at large values of the argument, the spherical Hankel functions behave as
$$h_l^{(1)}(kr) \sim \frac{e^{ikr}\,i^{-l}}{ikr},\qquad h_l^{(2)}(kr) \sim -\frac{e^{-ikr}\,i^{l}}{ikr},\qquad (48.18)$$
which means that if $B_l = A_l^*$ then, at $r\to\infty$, $R_{El}(r)$ is real (since $h_l^{(1)*} = h_l^{(2)}$). Moreover, if $A_l$ is also real, the solution has only a $j_l(kr)$ component, as for the free wave, but that is a very special case.

Then the real solution for the radial wave function is
$$R_{El}(r) \sim \frac{A_l\,e^{i(kr-l\pi/2)} - A_l^*\,e^{-i(kr-l\pi/2)}}{ikr},\qquad (48.19)$$
and is therefore the sum $R_{El}(r) = R^{(+)}_{El}(r) + R^{(-)}_{El}(r)$, where the first term contains $e^{ikr}$ and the second $e^{-ikr}$.

We define the phase of $A_l$ as $e^{i\delta_l}$, i.e.,
$$A_l = A_l^0\,e^{i\delta_l} \;\Rightarrow\; A_l^* = A_l^0\,e^{-i\delta_l}.\qquad (48.20)$$

Then $\delta_l$ is called the phase shift, and the real solution for the wave function becomes
$$R_{El}(r) \sim \frac{2A_l^0}{kr}\sin\left(kr - \frac{l\pi}{2} + \delta_l\right).\qquad (48.21)$$
Since
$$C_l = A_l + A_l^* = 2A_l^0\cos\delta_l,\qquad D_l = iA_l - iA_l^* = -2A_l^0\sin\delta_l,\qquad (48.22)$$
it follows that we obtain the phase shift from the expansion in terms of $j_l$ and $n_l$,
$$\tan\delta_l = -\frac{D_l}{C_l}.\qquad (48.23)$$
We then define the wave function solution of the Schrödinger equation,
$$\psi_k(r,\theta) = \sum_{l=0}^{\infty} K_l\,R_{El}(r)P_l(\cos\theta),\qquad (48.24)$$
which is a particular linear combination of the $|Elm\rangle$ solutions, and is also of the scattering-solution type (since the partial wave expansion (48.12) is of the same type). This linear combination of solutions, just like the free plane wave solution (48.4), is a particular linear combination of $j_l(kr)Y_{lm}(\theta,\phi)$, with $a_{lm}\propto\delta_{m,0}$. Here $\sum_{m=-l}^{l}K_{lm}Y_{lm}(\theta,\phi)$ reduces to $K_l P_l(\cos\theta)$ when $K_{lm}\propto\delta_{m,0}$.
We can define outgoing and incoming solutions $|\psi\pm\rangle$ from the decomposition of the radial wave functions (48.19), as
$$|\psi+\rangle \equiv |k+\rangle = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}\,e^{i\delta_l}|Elm+\rangle = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_l^0 K_{lm}\,e^{i\delta_l}|Elm+\rangle,$$
$$|\psi-\rangle \equiv |k-\rangle = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}\,e^{-i\delta_l}|Elm-\rangle = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_l^0 K_{lm}\,e^{-i\delta_l}|Elm-\rangle,\qquad (48.25)$$
where $A_{lm}\equiv A_l^0 K_{lm}$, and the states $|Elm\pm\rangle$ are defined with real coefficients (times $e^{\pm ikr}$) at infinity.

Then the asymptotics at $r\to\infty$ of the linear combination solution defined above is
$$\psi_k(\vec r) \simeq \sum_{l=0}^{\infty} A_l^0\,P_l(\cos\theta)\,\frac{e^{i(kr-l\pi/2+\delta_l)} - e^{-i(kr-l\pi/2+\delta_l)}}{2ikr},\qquad (48.26)$$
where we have absorbed $K_l$ into $A_l^0$. We note that the above becomes
$$\langle\vec r\,|\left(|k+\rangle - |k-\rangle\right).\qquad (48.27)$$

The $|k\pm\rangle$ are orthonormal,
$$\langle\vec k\,'+|\vec k+\rangle = \delta^3(\vec k\,' - \vec k) = \langle\vec k\,'-|\vec k-\rangle,\qquad (48.28)$$
as are the spherical states $|Elm\rangle$,
$$\langle E'l'm'|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E - E').\qquad (48.29)$$



However, the product of the in and out states defined in (46.69) is nontrivial,
$$\langle\vec k\,'-|\vec k+\rangle = \int d^3r\,\psi^{(-)*}_{k'}(\vec r)\psi^{(+)}_{k}(\vec r) = \delta(k'-k)\,\delta^2(\vec n'-\vec n_k) + \frac{ik}{2\pi}\,\delta(k'-k)\,f_k(\vec n',\vec n_k)$$
$$= \delta(k'-k)\langle\psi'-|\psi+\rangle = \langle\vec k\,'|\hat S|\vec k\rangle = \delta(k'-k)\,S(\vec n',\vec n_k),\qquad (48.30)$$
where, compared with the in and out states defined before in (48.25), we have used states with an extra $k$ modulus tensored in, $|\vec k\pm\rangle = |\psi\pm\rangle\otimes|k\rangle$, so that in the product we have an extra $\langle k'|k\rangle = \delta(k'-k)$.
Now we need to put the solution (48.24) with asymptotics (48.26) into a scattering form. In that form, the incoming wave part ($\propto e^{-ikr}/r$) comes only from the plane wave ($e^{i\vec k\cdot\vec r}$) part, so we can identify $A_l^0$ by comparing with the plane wave asymptotics in (48.9), obtaining
$$B(k,\vec n_r) = \sum_{l=0}^{\infty}(2l+1)P_l(\cos\theta)\,\frac{i^{2l}}{2ik} = \sum_{l=0}^{\infty}A_l^0\,P_l(\cos\theta)\,\frac{i^l\,e^{-i\delta_l(k)}}{2ik},\qquad (48.31)$$
which means that $A_l^0$ is given by
$$A_l^0 = (2l+1)\,e^{i(\delta_l + l\pi/2)}.\qquad (48.32)$$
Substituting back into the full $\psi_k(\vec r)$ in (48.24), we obtain first
$$\psi_k(r,\theta) = \sum_{l=0}^{\infty} K_l\,R_{El}(r)P_l(\cos\theta).\qquad (48.33)$$
But $K_l$ was absorbed into $A_l^0$ so, up to an irrelevant normalization, we can identify them, implying
$$\psi_k(r,\theta) = \sum_{l=0}^{\infty}(2l+1)\,e^{i(\delta_l+l\pi/2)}\,R_{El}(r)P_l(\cos\theta).\qquad (48.34)$$

Then, substituting $A_l^0$ also in asymptotic ($r\to\infty$) form into (48.26), we obtain
$$\psi_k(\vec r) \simeq \frac{1}{2ikr}\sum_{l=0}^{\infty}(2l+1)P_l(\cos\theta)\left[e^{ikr}e^{2i\delta_l} - e^{-i(kr-l\pi)}\right] = e^{i\vec k\cdot\vec r} + \frac{e^{ikr}}{r}\sum_{l=0}^{\infty}(2l+1)P_l(\cos\theta)\,\frac{e^{2i\delta_l}-1}{2ik},\qquad (48.35)$$
where in the last line we have recreated $e^{i\vec k\cdot\vec r}$ from the $\cos\theta$ expansion. Then we find
$$f_k(\vec n',\vec n_k) = \sum_{l=0}^{\infty}(2l+1)P_l(\cos\theta)\,\frac{e^{2i\delta_l}-1}{2ik},\qquad (48.36)$$
and, now that we have the asymptotic scattering form, we can identify $a_l(k)$ as
$$a_l(k) = \frac{e^{2i\delta_l(k)}-1}{2ik} = \frac{e^{i\delta_l(k)}\sin\delta_l(k)}{k} = \frac{1}{k\cot\delta_l(k) - ik}.\qquad (48.37)$$
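The three forms of $a_l(k)$ in (48.37), and the relation $S_l = 1 + 2ika_l = e^{2i\delta_l}$ below, are elementary identities that can be checked numerically; here is a minimal sketch (the values of $k$ and $\delta_l$ are made up):

```python
import cmath
import math

def a_l(delta, k):
    # partial wave amplitude, eq. (48.37): (e^{2i delta} - 1)/(2ik)
    return (cmath.exp(2j * delta) - 1) / (2j * k)

k, delta = 1.7, 0.6
a1 = a_l(delta, k)
a2 = cmath.exp(1j * delta) * math.sin(delta) / k     # e^{i delta} sin(delta)/k
a3 = 1 / (k / math.tan(delta) - 1j * k)              # 1/(k cot(delta) - ik)
assert abs(a1 - a2) < 1e-12 and abs(a1 - a3) < 1e-12

# S_l = 1 + 2ik a_l = e^{2i delta_l}, eq. (48.39), lies on the unit circle
S = 1 + 2j * k * a1
assert abs(S - cmath.exp(2j * delta)) < 1e-12
assert abs(abs(S) - 1) < 1e-12
```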
Moreover, with $\vec n' = \vec n_r$ and $\vec n_r\cdot\vec n_k = \cos\theta$, we also have an expansion in spherical harmonics,
$$f_k(\vec n_r,\vec n_k) = \sum_{l=0}^{\infty}(2l+1)P_l(\vec n_r\cdot\vec n_k)\,a_l(k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}4\pi\,a_l(k)\,Y^*_{lm}(\vec n_k)Y_{lm}(\vec n_r).\qquad (48.38)$$

Considering the scattering solution in (48.13), and comparing with the free case in (48.9), we see that the diverging wave ($\propto e^{ikr}/r$) is just multiplied by the factor
$$1 + 2ika_l(k) = e^{2i\delta_l(k)} \equiv S_l(k),\qquad (48.39)$$
called the $l$th partial wave S-matrix element.


Indeed, since the S-operator is related to the f-operator by
$$\hat S = 1 + \frac{ik}{2\pi}\hat f,\qquad (48.40)$$
so that, by multiplication with $\langle\vec n_r|$ from the left and with $|\vec n_k\rangle$ from the right, we have
$$S(\vec n_r,\vec n_k) = \delta^2(\vec n_r - \vec n_k) + \frac{ik}{2\pi}f_k(\vec n_r,\vec n_k),\qquad (48.41)$$
which expands into
$$S(\vec n_r,\vec n_k) = \frac{1}{4\pi}\sum_{l=0}^{\infty}(2l+1)P_l(\cos\theta) + \frac{1}{4\pi}\sum_{l=0}^{\infty}(2l+1)P_l(\cos\theta)\,2ika_l(k),\qquad (48.42)$$
it follows that
$$S_l = 1 + 2ika_l(k) = e^{2i\delta_l(k)}.\qquad (48.43)$$

48.3 T-Matrix Element

The f-operator is related to the T-operator by
$$\hat f = -\frac{m}{2\pi\hbar^2}\hat T,\qquad (48.44)$$
which means that their matrix elements are related too, by
$$f(\vec k\,',\vec k) = \langle\vec k\,'|\hat f|\vec k\rangle = -\frac{m}{2\pi\hbar^2}\langle\vec k\,'|\hat T|\vec k\rangle.\qquad (48.45)$$
Then the T-matrix element is
$$\langle\vec k\,'|\hat T|\vec k\rangle = \langle\vec k\,'|\hat V|\vec k+\rangle = -\frac{2\pi\hbar^2}{m}\,\delta(k-k')\sum_{l=0}^{\infty}(2l+1)a_l(k)P_l(\vec n'\cdot\vec n_k) \equiv \delta(k-k')\,T_k(\vec n',\vec n_k),\qquad (48.46)$$
where again the $\delta(k-k')$ factor appears because we have modified the states, $|\vec k+\rangle$ from $|\psi+\rangle$.

Taking into consideration (48.30) and the above T-matrix, we obtain for the S-matrix
$$\langle\vec k\,'-|\vec k+\rangle = \langle\vec k\,'|\hat S|\vec k\rangle = \delta^3(\vec k - \vec k\,') - \frac{imk}{(2\pi)^2\hbar^2}\,\delta(k'-k)\,T_k(\vec n',\vec n_k).\qquad (48.47)$$

Then the S- and T-operators are related by
$$\hat S = 1 - \frac{imk}{(2\pi)^2\hbar^2}\,\delta(k'-k)\hat T_k = 1 - \frac{ik^2}{(2\pi)^2}\,\delta(E'-E)\hat T_k.\qquad (48.48)$$
The S-matrix elements in the spherical, $|Elm\rangle$, basis are given by
$$\langle E'l'm'|\hat S|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E-E')\,S_l,\qquad (48.49)$$
because of the $\cos\theta$ expansion of the S-matrix $S(\vec n_r,\vec n_k)$ in (48.42).

But then the T-matrix element in the same basis is
$$\langle E'l'm'|\hat T|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E-E')\,T_l,\qquad (48.50)$$
or, taking out the $\delta(E-E')$ factor,
$$\langle E'l'm'|\hat T_k|Elm\rangle = \delta_{ll'}\delta_{mm'}\,T_l.\qquad (48.51)$$
Now multiplying (48.48) by $\langle E'l'm'|$ from the left and by $|Elm\rangle$ from the right, we find¹
$$S_l = 1 - 2\pi i\,T_l,\qquad (48.52)$$
which means that $T_l$ is related to $a_l$ by
$$T_l(k) = -\frac{k}{\pi}\,a_l(k).\qquad (48.53)$$
The matrix element relation generalizes to an operatorial relation,
$$\hat S = 1 - 2\pi i\,\delta(E-E')\,\hat T.\qquad (48.54)$$

48.4 Scattering Length

At low energies, $k\to 0$, we will see later that, for a finite range, $\delta_l(k)\propto k^{2l+1}\to 0$; moreover, only $l=0$ contributes, so $\delta_0(k)\to 0$ as well, but it dominates the other $\delta_l(k)$.

But then the $a_l(k)$ are given by
$$a_l(k) \simeq \frac{\sin\delta_l(k)}{k} \simeq \frac{\tan\delta_l(k)}{k} \simeq \frac{\delta_l(k)}{k}.\qquad (48.55)$$
The leading contribution is then
$$a_0(k) \simeq \frac{\delta_0(k)}{k},\qquad (48.56)$$
but it is actually negative, so we define the finite and positive quantity
$$a \equiv -\lim_{k\to 0}a_0(k) = -\lim_{k\to 0}\frac{\delta_0(k)}{k},\qquad (48.57)$$
called the scattering length.

¹ There is an extra $(2\pi)^3$, coming from the now different normalization of the states (with a $1/(2\pi)^{3/2}$ factor), and there is also an extra $1/k^2$ factor, which appears because the states are defined as $\langle k\,\vec n_k|$ rather than $\langle\vec n_k|$, giving a factor of $1/k$ in the normalization. We take this change of normalization into account because the relation between $S$ and $T$ is usually defined as in the following.

48.5 Jost Functions, Wronskians, and the Levinson Theorem

We have seen before that, for a finite-range potential, i.e., for a potential decaying faster than the Coulomb potential,
$$\lim_{r\to\infty} rV(r) = 0,\qquad (48.58)$$
the radial wave function solution
$$\frac{\chi_{kl}(r)}{r} = R_{kl}(r) \sim \frac{2A_l^0}{kr}\sin\left(kr - \frac{l\pi}{2} + \delta_l\right)\qquad (48.59)$$
is a real solution, meaning it contains both an outgoing and an incoming part: $R^+_{kl}(r) + R^-_{kl}(r)$.

If we also impose
$$\lim_{r\to 0} r^2 V(r) = 0,\qquad (48.60)$$
it means that there is a discrete spectrum for $E<0$ (since we are now imposing two normalizability conditions, at infinity and at zero, on two independent solutions of the Schrödinger equation), but the spectrum is still continuous for $E\geq 0$.

The behavior of $\chi_{kl}(r)$ at $r\to 0$ is (as we saw in Chapter 19) $\sim r^{-l}$ or $\sim r^{l+1}$, so that $R_{kl}(r)\sim r^{-l-1}$ or $\sim r^{l}$.

We then define the regular (normalizable-at-zero) solution $\chi_l = \phi_l$, which means the solution that has the behavior $\sim r^{l+1}$. Moreover, we choose the normalization constant such that
$$\lim_{r\to 0} r^{-l-1}\phi_l = 1.\qquad (48.61)$$
Note that this result is valid whether the energy is positive (a continuous-spectrum, scattering solution) or negative (a discrete-spectrum, bound-state solution). The physically normalized solution is multiplied by a constant $N_{kl}$,
$$\chi_{kl} = N_{kl}\,\phi_l(k;r).\qquad (48.62)$$
We then extend $\phi_l(k;r)$ to the full complex $k$ plane using analytic continuation, which means that $\phi_l$ must be an analytic function of $k$. Its properties are as follows.

(1) $\phi_l(-k;r) = \phi_l(k;r)$, which is part of the definition of the function: on the real axis, $k<0$ is defined from the physical case, $k>0$.
(2) $\phi_l(k;r) = \phi_l^*(k^*;r)$, which is a result of the analyticity imposed on $\phi_l$.
(3) If $k\in\mathbb{R}$ then $\phi_l$ is an eigenfunction of the Hamiltonian, but if $k$ is not real (so that $E^*\neq E$), then $\phi_l$ is not an eigenfunction of $H$.

Jost Functions

For $k$ real, we define uniquely the Schrödinger equation solutions $f_l^\pm$ that behave at $r\to\infty$ as incoming or outgoing, respectively, i.e.,
$$\lim_{r\to\infty} e^{\pm ikr}\,f_l^\pm(k;r) = 1,\qquad (48.63)$$
where again we choose the normalization constant such that we have 1 on the right-hand side. Therefore
$$\chi_l = f_l^\pm \simeq e^{\mp ikr} \;\Rightarrow\; \psi \propto \frac{e^{\mp ikr}}{r}.\qquad (48.64)$$
Then, at $r\to 0$, the solutions $f_l^\pm$ are a linear combination of regular ($\sim r^{l+1}$) and irregular, or non-normalizable ($\sim r^{-l}$), solutions, with the irregular solution dominating. This means that, at $r\to 0$,
$$f_l^\pm \sim C_l^\pm\,r^{-l},\qquad (48.65)$$
where $C_l^\pm$ is well defined (the subleading, regular, component is $\sim D_l^\pm r^{l+1}$).

When $k\in\mathbb{C}$, we have that $f_l^+$ is well defined for $\mathrm{Im}\,k < 0$ and $f_l^-$ is well defined for $\mathrm{Im}\,k > 0$. This implies that, at $r\to\infty$,
$$|f_l^\pm(k;r)| \propto e^{\pm(\mathrm{Im}\,k)r} \to 0.\qquad (48.66)$$
In one dimension, and therefore also in three dimensions but for the radial direction, we have the Wronskian theorem. We need only apply it for the same potential $V(r)$ and the same energy, in which case the Wronskian of two solutions $\psi_1$ and $\psi_2$,
$$W(\psi_1,\psi_2) = \psi_1\psi_2' - \psi_2\psi_1',\qquad (48.67)$$
is constant as a function of $r$, so
$$\frac{dW(\psi_1,\psi_2)}{dr} = 0.\qquad (48.68)$$
Then in particular we can calculate it at $r\to\infty$. Choosing the two solutions to be $f_l^\pm$, in which case $f_l^\pm\simeq e^{\mp ikr}$, we find
$$W(f_l^+,f_l^-) = 2ik.\qquad (48.69)$$
Thus, if for all $k$ there are three solutions, $\phi_l$, $f_l^+$, $f_l^-$, they are linearly dependent, so $\phi_l$ is a linear combination of $f_l^+$ and $f_l^-$:
$$\phi_l = C_1 f_l^+ + C_2 f_l^-.\qquad (48.70)$$
In this case, we define the Jost functions $F^\pm(k)$ as
$$F^\pm(k) \equiv W(f_l^\pm,\phi_l).\qquad (48.71)$$
We can calculate the Wronskian (which is $r$-independent again, as we have the same potential and the same energy) either at infinity or at zero. We find (from the Wronskian at infinity)
$$\phi_l = \frac{1}{2ik}\left[-F_l^-(k)\,f_l^+(k;r) + F_l^+(k)\,f_l^-(k;r)\right],\qquad (48.72)$$
where (from the Wronskian at zero)
$$F_l^\pm(k) = (2l+1)\,C_l^\pm = (2l+1)\lim_{r\to 0} r^l\,f_l^\pm(k;r),\qquad (48.73)$$
so, at $r\to 0$,
$$f_l^\pm(k;r) \simeq \frac{F_l^\pm(k)}{(2l+1)\,r^l}.\qquad (48.74)$$

The properties of $f_l^\pm$ and of the Jost functions $F_l^\pm$ are:

(1) $f_l^-(k;r) = f_l^+(-k;r)$, since at infinity we have $f_l^\pm \simeq e^{\mp ikr}$.
(2) $[f_l^\pm(-k^*;r)]^* = f_l^\pm(k;r)$, again proven from the behavior at infinity and analyticity.
(3) From the two previous properties, we also obtain $f_l^\pm(k;r) = [f_l^\mp(k^*;r)]^*$. In particular, if $k\in\mathbb{R}$, then $f_l^- = (f_l^+)^*$, so
$$F_l^\pm(-k) = F_l^\mp(k).\qquad (48.75)$$
(4) This is generalized through analyticity to the complex plane relation
$$[F_l^\pm(-k^*)]^* = F_l^\pm(k).\qquad (48.76)$$
But if we go back to $k\in\mathbb{R}$, this reduces to a more general relation than the previous one,
$$F_l^+(-k) = (F_l^-(k))^*.\qquad (48.77)$$
As $\phi_l \propto \chi_l$, from the expansion of $\phi_l$ into $f_l^\pm$, we obtain
$$S_l(k) = e^{2i\delta_l(k)} = \frac{F_l^+(k)}{F_l^-(k)}\,e^{il\pi} = \frac{F_l^+(k)}{F_l^+(-k)}\,e^{il\pi}.\qquad (48.78)$$
Substituting this into the behavior at $r\to\infty$ of (48.72), we find
$$\phi_l(k;r) \sim \frac{F_l^-(k)}{k}\,e^{i\delta_l(k)}\,e^{-il\pi/2}\sin\left(kr - \frac{l\pi}{2} + \delta_l\right).\qquad (48.79)$$
If $k\in\mathbb{R}$, then
$$F_l^+(k) = |F_l^+(k)|\,e^{i\alpha_l},\qquad F_l^-(k) = |F_l^+(k)|\,e^{-i\alpha_l}.\qquad (48.80)$$
This, together with the relation $e^{2i\delta_l} = F_l^+(k)e^{il\pi}/F_l^-(k)$, implies that
$$\delta_l = \alpha_l + \frac{l\pi}{2}\ (\mathrm{mod}\ \pi),\qquad \delta_l(-k) = -\delta_l(k).\qquad (48.81)$$
In this analysis of Jost functions, we have assumed that k ∈ C, though without much justification.
We will come back to this in Chapter 50, where we will define more thoroughly the analysis for
complex k.

Levinson Theorem

This theorem was proven in 1949 by Levinson, but here we will just present a statement of it, without a proof. We will consider a not very rigorous proof later on.

(a) The first statement regarding $\delta_l(k)$, specifically for the difference between low energy and high energy, is:
$$\delta_0(0) - \delta_0(\infty) = \begin{cases} n_b^0\,\pi & \text{if } F_0^+(0)\neq 0,\\ (n_b^0 + 1/2)\,\pi & \text{if } F_0^+(0) = 0.\end{cases}\qquad (48.82)$$
Note that $F_0^+(0)\neq 0$ means that $\phi_l$ contains both $f_l^+$ and $f_l^-$ components, whereas the $F_0^+(0)=0$ condition means that $\phi_l\propto f_l^+$, giving a quantization condition (but not two, as in the case of obtaining a discrete level, i.e., a bound state), meaning an extra $1/2$ term is added to $n_b^0$. Here $n_b^0$ is the number of energy levels (number of bound states) of given angular momentum $l=0$, so it is simple to generalize to $n_b^l$ for angular momentum $l$. The assumption of this statement is that if $F_0^+(0)=0$ then $F_0^+(k)\sim ak$ as $k\to 0$.

(b) The second statement is:
$$\delta_l(0) - \delta_l(\infty) = n_b^l\,\pi \quad\text{if } F_l^+(0) = 0 \text{ and } F_l^+(k)\sim Ak^2.\qquad (48.83)$$
In both cases, an extra assumption is that the potential has not just finite range but also a faster vanishing at infinity,
$$\lim_{r\to\infty} r^3 V(r) = 0.\qquad (48.84)$$

Important Concepts to Remember


• The partial wave expansion is $f_k(\vec n',\vec n_k) = \sum_{l=0}^{\infty}(2l+1)a_l(k)P_l(\cos\theta)$, in terms of the $l$th partial wave amplitude $a_l(k)$, and follows the same expansion as that of the other term, $(2\pi/ik)\delta^2(\vec n'-\vec n_k)$, in $A_k(\vec n')$, which has $1/(2ik)$ instead of $a_l(k)$.
• For a central potential going to 0 at infinity, the real radial wave function at infinity is $R_{El}(r) = [A_l e^{i(kr-l\pi/2)} - A_l^* e^{-i(kr-l\pi/2)}]/(ikr)$, with $A_l = A_l^0 e^{i\delta_l}$, or $R_{El}(r) = (2A_l^0/kr)\sin(kr - l\pi/2 + \delta_l)$, with $\delta_l$ the phase shift.
• The partial wave amplitude is related to the phase shift by $a_l(k) = (e^{2i\delta_l(k)}-1)/(2ik) = [e^{i\delta_l(k)}\sin\delta_l(k)]/k = [k\cot\delta_l(k) - ik]^{-1}$, and to the $l$th partial wave S-matrix element by $S_l(k) = e^{2i\delta_l(k)} = 1 + 2ika_l(k)$, for the same partial wave expansion of $S(\vec n_r,\vec n_k)$.
• The S-matrix and T-matrix are related by $S_l = 1 - 2\pi i T_l$ and $\hat S = 1 - 2\pi i\,\delta(E-E')\hat T$.
• At low energies, $k\to 0$, $a_l(k)\simeq\delta_l(k)/k$, and $a\equiv-\lim_{k\to 0}a_0(k) > 0$ is the scattering length.
• We can define radial wave solutions $R_{kl}(r)$ analytically continued to the complex $k$ plane. Defining, for real $k$, $\phi_l$ via $\lim_{r\to 0}r^{-l-1}\phi_l(r) = 1$ and $f_l^\pm(k;r)$ via $\lim_{r\to\infty}e^{\pm ikr}f_l^\pm(k;r) = 1$, we define the Jost functions as the Wronskians $F^\pm(k) = W(f_l^\pm,\phi_l)$.
• The Levinson theorem states that $\delta_0(0) - \delta_0(\infty) = n_b^0\pi$ if $F_0^+(0)\neq 0$, where $n_b^l$ is the number of bound states with angular momentum $l$, and $\delta_0(0) - \delta_0(\infty) = (n_b^0+1/2)\pi$ if $F_0^+(0) = 0$, while $\delta_l(0) - \delta_l(\infty) = n_b^l\pi$ if $F_l^+(0) = 0$ and $F_l^+(k)\sim Ak^2$.

Further Reading
For partial waves, phase shifts, and scattering length, see [1] and [2]. For the Levinson theorem, see
Levinson’s paper [28].

Exercises

(1) Consider scattering onto a delta function potential, V = −V0 δ3 (r ), in the Born approximation.
Calculate the partial wave amplitudes al (k).

(2) Consider a spherical well potential V = −V0 for r ≤ R, and V = 0 for r > R, in the Born
approximation, and waves with E > 0. Calculate the phase shifts δl (k) for scattering.
(3) In the case in exercise 2, calculate Sl (k) and the differential cross section.
(4) If δl (k) is real, what do you deduce about Tl (k)?
(5) Calculate the scattering length for the case in exercise 1.
(6) Is relation (48.72) well defined for complex k? Why?
(7) If $\lim_{k\to 0}F_l^+(k)/k^2$ is constant for $l$ even, find the behavior with $k$, as $k\to 0$ and $r\to\infty$, of $\phi_l(k;r)$ for even $l$.
49 Unitarity, Optics, and the Optical Theorem

In this chapter, we will revisit the issue of unitarity and the optical theorem for scattering. An example of the partial wave formalism, for a hard sphere, puts us on the track of viewing quantum mechanical scattering as optics.

49.1 Unitarity: Review and Reanalysis

To understand the application of unitarity, we first remember what unitarity means: in quantum mechanics, the conservation of probability implies that the time evolution is unitary, namely that the evolution operator is unitary, $\hat U^\dagger = \hat U^{-1}$.

But the S-matrix is related to the evolution operator by $\hat S = \hat U_I(+\infty,-\infty)$, so $\hat S^\dagger = \hat S^{-1}$. Then unitarity of the S-matrix is equivalent to the conservation of probability.

On the other hand, in the $|Elm\rangle$ basis, which means in the partial wave formalism, the S-operator $\hat S$ is (as we saw) represented by $S_l = e^{2i\delta_l}$, and so
$$\langle E'l'm'|\hat S|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E-E')\,S_l.\qquad (49.1)$$
The diagonalization comes from the fact that the time evolution operator $\hat S = \hat U_I(+\infty,-\infty)$ has common eigenfunctions with $\hat H$ (with eigenvalue $E$), $\hat{\vec L}^2$ (with eigenvalue related to $l$), and $\hat L_z$ (with eigenvalue $m$). This follows from the fact that the time evolution leaves invariant the energy $E$, the angular momentum $\vec L^2$, and the angular momentum projection $L_z$. Classically, this means that
$$\partial_t H = \partial_t \vec L^2 = \partial_t L_z = 0,\qquad (49.2)$$
while in quantum mechanics, invariance under time translation means that the commutator with $\hat H$ vanishes, $\partial_t\cdots = 0 \to [\hat H,\ldots] = 0$ (the generator of time translations is $\hat H$). The time evolution invariance is then
$$[\hat H,\hat H] = [\hat H,\hat{\vec L}^2] = [\hat H,\hat L_z] = 0,\qquad (49.3)$$
which is true in a spherically symmetric system.

But then unitarity, $\hat S^\dagger = \hat S^{-1}$, implies $S_l^* = S_l^{-1}$ and, since $S_l = e^{2i\delta_l(k)}$, this means that $\delta_l^*(k) = \delta_l(k)$, so that the phase shift is real. But this was what we (implicitly) assumed when defining the phase shift.

Equivalently, the unitarity condition is $|S_l(k)| = 1$, which means that the only change in the outgoing wave with respect to the incoming one is a phase, not an attenuation (by $e^{-|\mathrm{Im}\,\delta_l|}$), which would lead to probability decay (in a physical case, probability decay could only mean that the system is not complete, and the probability "leaks" somewhere else, into another system coupled to the one we are analyzing).

Since
$$ka_l = \frac{S_l - 1}{2i} = \frac{i}{2} - \frac{i}{2}\,e^{2i\delta_l} = \frac{i}{2} + \frac{1}{2}\,e^{2i\delta_l - i\pi/2},\qquad (49.4)$$
this means that $ka_l$ traces out a circle in the complex plane, centered at $i/2$, of radius $1/2$, called the "unitarity circle".
49.2 Application to Cross Sections

To see the effect of the unitarity analysis on physical measurements, we need to look at cross sections. As we saw, in the partial wave formalism,
$$\frac{d\sigma}{d\Omega} = |f_k(\theta)|^2.\qquad (49.5)$$
But since (see (48.12))
$$f_k(\theta) = \sum_{l=0}^{\infty}(2l+1)P_l(\cos\theta)\,a_l(k),\qquad a_l(k) = \frac{e^{2i\delta_l}-1}{2ik} = \frac{S_l - 1}{2ik},\qquad (49.6)$$
it follows that
$$\frac{d\sigma}{d\Omega} = \frac{1}{4k^2}\sum_{l=0}^{\infty}\sum_{l'=0}^{\infty}(2l+1)(2l'+1)\left(e^{2i\delta_l(k)}-1\right)\left(e^{-2i\delta_{l'}(k)}-1\right)P_l(\cos\theta)P_{l'}(\cos\theta),\qquad (49.7)$$
where we have used the fact that $P_l(\cos\theta)$ is real. Then the total cross section is found by integrating over $d\Omega$,
$$\sigma_{\rm tot}(k) = \int d\Omega\,\frac{d\sigma}{d\Omega} = \frac{1}{4k^2}\int_0^{2\pi}d\phi\sum_{l=0}^{\infty}\sum_{l'=0}^{\infty}\int_{-1}^{+1}d(\cos\theta)\,P_l(\cos\theta)P_{l'}(\cos\theta)\times(2l+1)(2l'+1)\left(e^{2i\delta_l(k)}-1\right)\left(e^{-2i\delta_{l'}(k)}-1\right).\qquad (49.8)$$
Now we use the orthogonality condition of the Legendre polynomials (see (17.38)),
$$\int_{-1}^{+1}d(\cos\theta)\,P_l(\cos\theta)P_{l'}(\cos\theta) = \frac{2\delta_{ll'}}{2l+1},\qquad (49.9)$$
so that we obtain
$$\sigma_{\rm tot}(k) = \frac{\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\left|e^{2i\delta_l(k)}-1\right|^2 = \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\sin^2\delta_l(k).\qquad (49.10)$$
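The collapse of the double sum in (49.8) into the single sum (49.10) via Legendre orthogonality can be verified by brute force: integrate $|f_k(\theta)|^2$ over the solid angle numerically and compare with the partial wave sum. A sketch using only the standard library (the value of $k$ and the four phase shifts are made up):

```python
import cmath
import math

def legendre(l, x):
    # P_l(x) by the Bonnet recursion
    p0, p1 = 1.0, x
    if l == 0:
        return p0
    for n in range(1, l):
        p0, p1 = p1, ((2 * n + 1) * x * p1 - n * p0) / (n + 1)
    return p1

k = 1.3
deltas = [0.8, 0.4, 0.1, 0.02]      # made-up real phase shifts for l = 0..3

def f(costheta):
    # f_k(theta) = sum_l (2l+1) a_l P_l, with a_l = e^{i d} sin(d)/k
    return sum((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) / k
               * legendre(l, costheta) for l, d in enumerate(deltas))

# sigma_tot = 2 pi * integral over cos(theta) of |f|^2 (midpoint rule)
N = 20000
sigma_numeric = sum(abs(f(-1 + (i + 0.5) * 2 / N)) ** 2 * (2 / N)
                    for i in range(N)) * 2 * math.pi

# partial wave formula (49.10)
sigma_partial = sum(4 * math.pi / k**2 * (2 * l + 1) * math.sin(d) ** 2
                    for l, d in enumerate(deltas))
assert abs(sigma_numeric - sigma_partial) < 1e-5 * sigma_partial
```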

This means we can define a cross section $\sigma_l$ through
$$\sigma_{\rm tot}(k) = \sum_{l=0}^{\infty}\sigma_l(k),\qquad (49.11)$$
and, identifying the two expansions, we obtain
$$\sigma_l(k) = \frac{4\pi}{k^2}(2l+1)\sin^2\delta_l(k).\qquad (49.12)$$
A more general formula applies also to the case when unitarity is "violated", meaning that $\delta_l(k)$ is complex (which implies that the system under analysis interacts with other systems, having a "probability leak", so the scattering is "inelastic"). This formula is
$$\sigma_l(k) = \frac{\pi}{k^2}(2l+1)\left|e^{2i\delta_l(k)}-1\right|^2.\qquad (49.13)$$
In the case of unitary evolution ($\delta_l(k)\in\mathbb{R}$), since $\sin^2\delta_l(k)\leq 1$ we have a "unitarity bound",
$$\sigma_l(k) \leq \frac{4\pi}{k^2}(2l+1) \equiv \sigma_l^{\rm max}.\qquad (49.14)$$
We will come back to this bound later but, for the moment, we just note that the bound is saturated at $\delta_l = \pi/2$, which means that when $\delta_l\simeq\pi/2$, the $l$th partial wave has a maximal effect.

However, we need to calculate $\delta_l(k)$. To do that, we note that $a_l(k)$ is found from (49.6) by integration with $P_l(\cos\theta)\,d\cos\theta$, so
$$\int_{-1}^{1}d(\cos\theta)\,P_l(\cos\theta)f_k(\theta) = 2a_l(k) = \frac{e^{2i\delta_l(k)}-1}{ik} = \frac{2e^{i\delta_l(k)}\sin\delta_l}{k}.\qquad (49.15)$$
But the Lippmann–Schwinger equation implies (see (46.71))
$$f_k(\theta) = -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{\mp i\vec k\,'\cdot\vec r\,'}V(r')\,\psi_\pm(k,\vec r\,'),\qquad (49.16)$$
where we have substituted $G_0^{(+)}(\vec r,\vec r\,')$, and in the spherically symmetric case the $l$-expansion of the wave function is
$$\psi_+(k;r,\theta) = \sum_{l=0}^{\infty}(2l+1)\,e^{i(\delta_l+l\pi/2)}\,R_l(k;r)P_l(\cos\theta).\qquad (49.17)$$
Instead of continuing like this, we can use a spherical expansion of the Green's function,
$$G_0^{(+)}(\vec r,\vec r\,') = -ik\sum_{l,m}Y^*_{lm}(\vec n_{r'})Y_{lm}(\vec n_r)\,j_l(kr_<)\,h_l^{(1)}(kr_>),\qquad (49.18)$$
which implies, via the $l$-expansion of the wave function, the Lippmann–Schwinger equation for the radial wave function,
$$R_l(k;r) = j_l(kr) - \frac{2mik}{\hbar^2}\int_0^\infty dr'\,r'^2\,j_l(kr_<)\,h_l^{(1)}(kr_>)\,V(r')\,R_l(k;r').\qquad (49.19)$$
Since the first term is the expansion of the plane wave, the second term is the expansion of $f_k(\theta)$, which gives $a_l(k)$, so that
$$a_l(k) = \frac{e^{i\delta_l(k)}\sin\delta_l(k)}{k} = -\frac{2m}{\hbar^2}\int_0^\infty r^2dr\,V(r)\,j_l(kr)\,R_l(k;r).\qquad (49.20)$$

49.3 Radial Green’s Functions

We can also expand the Green's functions and the Lippmann–Schwinger equation using a partial wave expansion.

The Green's function partial wave expansion is
$$G(z;\vec r,\vec r\,') = G(z;r,r',\theta) = \sum_{l=0}^{\infty}\frac{2l+1}{4\pi}\,\tilde g_l(z;r,r')\,P_l(\vec n_r\cdot\vec n_{r'}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\tilde g_l(z;r,r')\,Y_{lm}(\vec n_r)Y^*_{lm}(\vec n_{r'}),\qquad (49.21)$$
where $\tilde g_l(z;r,r')$ is known as the radial Green's function. It satisfies
$$\left[z + \frac{\hbar^2}{2m}\left(\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) - \frac{l(l+1)}{r^2}\right) - V(r)\right]\tilde g_l(z;r,r') = \frac{\delta(r-r')}{r^2},\qquad (49.22)$$
and has the spectral decomposition
$$\tilde g_l(z;r,r') = \sum_n\frac{R_{nl}(r)R^*_{nl}(r')}{z - E_n},\qquad (49.23)$$
representing in radial space the general formula
$$\hat G(z) = \sum_n\frac{|n\rangle\langle n|}{z - E_n}.\qquad (49.24)$$
In the free particle case, the radial Green's function is
$$\tilde g_l^{(+)0}(E;r,r') = -\frac{2miq}{\hbar^2}\,j_l(qr_<)\,h_l^{(1)}(qr_>),\qquad (49.25)$$
where
$$E = \frac{\hbar^2q^2}{2m}\qquad (49.26)$$
and $r_< = \min(r,r')$, $r_> = \max(r,r')$. It appears in the radial Lippmann–Schwinger equation,
$$R_l(k;r) = j_l(kr) + \int dr'\,r'^2\,\tilde g_l^{(+)0}(E;r,r')\,V(r')\,R_l(k;r'),\qquad (49.27)$$
which leads to equation (49.19). It also leads to the Lippmann–Schwinger equation for the full Green's function $\tilde g_l$,
$$\tilde g_l(z;r,r') = \tilde g_l^{(+)0}(z;r,r') + \int dr''\,r''^2\,\tilde g_l^{(+)0}(z;r,r'')\,V(r'')\,\tilde g_l(z;r'',r').\qquad (49.28)$$
As an example, consider the radial delta function potential,
$$V(r) = -\lambda\,\delta(r-a).\qquad (49.29)$$
Then the Lippmann–Schwinger equation for the full radial Green's function has the solution
$$\tilde g_l(z;r,r') = \tilde g_l^{(+)0}(z;r,r') - \lambda a^2\,\frac{\tilde g_l^{(+)0}(z;r,a)\,\tilde g_l^{(+)0}(z;a,r')}{1 + \lambda a^2\,\tilde g_l^{(+)0}(z;a,a)}.\qquad (49.30)$$

This means that the radial Green's function gives the radial wave function
$$R_l(k;r) = j_l(kr) - \lambda a^2\,\frac{\tilde g_l^{(+)0}(E;r,a)\,j_l(ka)}{1 + \lambda a^2\,\tilde g_l^{(+)0}(E;a,a)}.\qquad (49.31)$$
The Jost solutions are
$$f_l^\mp(k;r) = \pm ikr\,h_l^{(1,2)}(kr).\qquad (49.32)$$
The details are left as an exercise.
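As a partial numerical cross-check of (49.25), (49.31), and (49.20) for the delta-shell potential (not the full exercise), one can compute the $l=0$ amplitude and verify that the resulting S-matrix element is exactly unitary. A sketch assuming units $\hbar = m = 1$, with made-up values of the shell strength, radius, and momentum:

```python
import cmath
import math

lam, a, k = 0.7, 1.5, 0.9          # made-up shell strength, radius, momentum

j0 = lambda x: math.sin(x) / x
n0 = lambda x: -math.cos(x) / x
h0 = lambda x: -1j * cmath.exp(1j * x) / x    # h_0^{(1)} = j_0 + i n_0

# free radial Green's function at the shell, eq. (49.25) with q = k, hbar = m = 1
g0aa = -2j * k * j0(k * a) * h0(k * a)
# R_0(k; a) from eq. (49.31) evaluated at r = a
R0a = j0(k * a) / (1 + lam * a**2 * g0aa)
# partial wave amplitude from eq. (49.20) with V = -lam delta(r - a)
a0 = 2 * lam * a**2 * j0(k * a) * R0a

# the resulting S-matrix element must be unitary, |S_0| = 1
S0 = 1 + 2j * k * a0
assert abs(abs(S0) - 1) < 1e-12

# and tan(delta_0) agrees with the closed form u/v obtained by hand
u = 2 * lam * a**2 * k * j0(k * a) ** 2
v = 1 + 2 * lam * a**2 * k * j0(k * a) * n0(k * a)
delta0 = cmath.phase(S0) / 2
assert abs(math.tan(delta0) - u / v) < 1e-12
```

The closed form $u/v$ follows by inserting $h_0^{(1)} = j_0 + i n_0$ into the denominator $1 + \lambda a^2\tilde g_0^{(+)0}(a,a) = v - iu$, so that $S_0 = (v+iu)/(v-iu)$, manifestly of unit modulus.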

49.4 Optical Theorem

We have already proven a version of the optical theorem in Chapter 46, but here we will revisit it. First, we review the proof given in Chapter 46. Unitarity means $\hat S\hat S^\dagger = 1$, but since
$$\hat S = 1 + \frac{ik}{2\pi}\hat f,\qquad (49.33)$$
it follows that
$$\frac{\hat f - \hat f^\dagger}{2i} = \frac{k}{4\pi}\,\hat f\hat f^\dagger.\qquad (49.34)$$
When we represent this abstract statement in a basis, and take $\vec n' = \vec n_k$, we obtain the total cross section, since
$$\mathrm{Im}\,f(\vec n_k,\vec n_k) = \frac{k}{4\pi}\int d^2n_r\,|f(\vec n_r,\vec n_k)|^2 = \frac{k}{4\pi}\,\sigma_{\rm tot},\qquad (49.35)$$
and on the left-hand side we have the forward direction,
$$f(\theta = 0) = f(\vec n_k,\vec n_k).\qquad (49.36)$$
But we want to describe this proof in physical terms, which means in terms of the interaction with a potential $V$. To do that, we relate the above calculation to the T-matrix, by
$$\hat f = -\frac{m}{2\pi\hbar^2}\hat T.\qquad (49.37)$$
Then, taking the matrix element in the $|\vec k\rangle$ basis, we obtain
$$f(\theta = 0) = f(\vec n_k,\vec n_k) = -\frac{m}{2\pi\hbar^2}\langle\vec k|\hat T|\vec k\rangle.\qquad (49.38)$$
But the T-matrix element is related to the potential $\hat V$ in between two states, so
$$\mathrm{Im}\langle\vec k|\hat T|\vec k\rangle = \mathrm{Im}\langle\vec k|\hat V|\psi+\rangle.\qquad (49.39)$$
From the Lippmann–Schwinger equation, we obtain
$$\langle\vec k|\hat V|\psi+\rangle = \langle\psi+|\hat V|\psi+\rangle - \left\langle\psi+\left|\hat V\,\frac{1}{E - \hat H_0 - i\epsilon}\,\hat V\right|\psi+\right\rangle.\qquad (49.40)$$
On the other hand, from the simple complex analysis formula extended to operators,
$$\hat G_0^{(+)} = \frac{1}{E - \hat H_0 - i\epsilon} = P\,\frac{1}{E - \hat H_0} + i\pi\,\delta(E - \hat H_0),\qquad (49.41)$$

where $P$ refers to the principal part. But when taking the imaginary part, we obtain a zero if we have a diagonal matrix element of a Hermitian ("real") operator, so
$$\mathrm{Im}\langle\psi+|\hat V|\psi+\rangle = 0,\qquad \mathrm{Im}\left\langle\psi+\left|\hat V\,P\frac{1}{E-\hat H_0}\,\hat V\right|\psi+\right\rangle = 0.\qquad (49.42)$$
Then out of the three terms in the imaginary part of the T-matrix element, only one contributes (and is actually purely imaginary), and we obtain
$$\mathrm{Im}\langle\vec k|\hat V|\psi+\rangle = -\pi\langle\psi+|\hat V\,\delta(E-\hat H_0)\,\hat V|\psi+\rangle = -\pi\langle\vec k|\hat T\,\delta(E-\hat H_0)\,\hat T|\vec k\rangle$$
$$= -\pi\int d^3\tilde k'\,\left|\langle\vec k|\hat T|\vec{\tilde k}'\rangle\right|^2\delta\left(E - \frac{\hbar^2\tilde k'^2}{2m}\right) = -\frac{\pi m}{\hbar^2 k}\int d\Omega\,k^2\left|\langle\vec k|\hat T|\vec k\,'\rangle\right|^2,\qquad (49.43)$$
where in the last step we have used
$$\delta\left(E - \frac{\hbar^2\tilde k'^2}{2m}\right) = \frac{\delta(k - \tilde k')}{\hbar^2\tilde k'/m}.\qquad (49.44)$$
Then also (since $d^2n' = d^2n_k = d\Omega$)
$$\int d^3\tilde k'\,\delta(k - \tilde k')\,(\cdots) = \int d^2n'\,k^2\,(\cdots)\Big|_{|\vec{\tilde k}'\rangle\to|\vec k\,'\rangle} = \int d\Omega\,k^2\,(\cdots)\Big|_{|\vec{\tilde k}'\rangle\to|\vec k\,'\rangle}.\qquad (49.45)$$

Finally, then, we obtain
$$\mathrm{Im}\,f(\theta=0) = -\frac{m}{2\pi\hbar^2}\,\mathrm{Im}\langle\vec k|\hat V|\psi+\rangle = \frac{m^2k}{2\hbar^4}\int d\Omega\,\left|\langle\vec k|\hat T|\vec k\,'\rangle\right|^2$$
$$= \frac{k}{4\pi}\int d\Omega\left|-\frac{m}{2\pi\hbar^2}\langle\vec k|\hat T|\vec k\,'\rangle\right|^2 = \frac{k}{4\pi}\int d\Omega\,|f(\theta)|^2 = \frac{k}{4\pi}\,\sigma_{\rm tot}.\qquad (49.46)$$

This completes the physical proof of the optical theorem. While the result is the same, and we have taken a longer route to prove it, we have still gained something. The use of the Lippmann–Schwinger equation, which can be solved perturbatively through the Born series, means that we can define the perturbative series on both sides of the equation for the optical theorem, and consider a term-by-term analysis (though we will not do so here). Moreover, in the previous proof we assumed that the S-matrix is unitary, which means assuming that the quantum mechanical construction of scattering is fully consistent (and we had not proved this until now), whereas here the proof is explicitly written in terms of the potential.

49.5 Scattering on a Potential with a Finite Range

We now give an example of how to calculate $\delta_l$ and solve the partial wave problem. Consider the case of a potential that vanishes identically outside a spherical shell,
$$V(r) = 0 \quad\text{for } r\geq r_0.\qquad (49.47)$$
In such a case (the case of a potential with a strict finite range), outside the spherical shell the solution of the Schrödinger equation is a free spherical wave.

Throughout space (both outside and inside the spherical shell), we found (see (48.34)) that the wave function solution for a spherically symmetric potential is
$$\psi_k(r,\theta) = \sum_{l=0}^{\infty}(2l+1)\,e^{i(\delta_l+l\pi/2)}\,\tilde R_{El}(r)P_l(\cos\theta) = \sum_{l=0}^{\infty}R_{El}(r)P_l(\cos\theta).\qquad (49.48)$$
Furthermore, we found an asymptotic solution, (48.16), for which the potential was vanishing (the interaction was turning off). But in our case the potential is exactly zero outside the shell, so the solution is exact,
$$R_{El}(r) = C_l\,j_l(kr) + D_l\,n_l(kr),\qquad (49.49)$$
where
$$C_l = A_l + A_l^* = 2A_l^0\cos\delta_l,\qquad D_l = iA_l - iA_l^* = -2A_l^0\sin\delta_l,\qquad (49.50)$$
and
$$A_l^0 = \frac{R_{El}(r)}{\tilde R_{El}(r)} = (2l+1)\,e^{i(\delta_l+l\pi/2)}.\qquad (49.51)$$
In order to find the wave function, as we saw in the one-dimensional case, at the place where the potential jumps we need to impose continuity of the logarithmic derivative of the wave function. Calculating it at $r_0 + \epsilon$, so we can use the above formulas, we obtain
$$q_{0l} \equiv \frac{r}{R_{El}(r)}\frac{dR_{El}(r)}{dr}\bigg|_{r=r_0} = kr_0\,\frac{j_l'(kr_0)\cos\delta_l - n_l'(kr_0)\sin\delta_l}{j_l(kr_0)\cos\delta_l - n_l(kr_0)\sin\delta_l}.\qquad (49.52)$$
Then we calculate $q_{0l}$ at $r_0 - \epsilon$, in terms of the wave function solution at $r\leq r_0$; equating it to the above gives the full solution, in terms of the calculated $\delta_l$. Indeed, inverting the formula, we find
$$\tan\delta_l = \frac{kr_0\,j_l'(kr_0) - q_{0l}\,j_l(kr_0)}{kr_0\,n_l'(kr_0) - q_{0l}\,n_l(kr_0)}.\qquad (49.53)$$
To find the solution inside the shell, we must impose regularity (normalizability) of the solution at $r = 0$, which, as we saw in Chapter 19, means that $\chi_{El}(r=0) = rR_{El}(r)|_{r=0} = 0$.
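The matching recipe (49.52)–(49.53) can be exercised for the $l=0$ spherical well, where the regular interior solution is $\chi = \sin(Kr)$, $K^2 = k^2 + 2mV_0/\hbar^2$, so $q_{00} = Kr_0\cot(Kr_0) - 1$. A sketch with $\hbar = m = 1$ (assumed units) and made-up well parameters, checked against the standard closed form $\delta_0 = \arctan[(k/K)\tan(Kr_0)] - kr_0$ (mod $\pi$):

```python
import math

V0, r0, k = 2.0, 1.0, 0.8                    # made-up depth, range, momentum
K = math.sqrt(k**2 + 2 * V0)                 # interior momentum, hbar = m = 1

# interior log-derivative: chi = sin(K r) => R = sin(K r)/r and
# q_00 = r R'/R at r0:
q00 = K * r0 / math.tan(K * r0) - 1

# l = 0 spherical Bessel functions and derivatives
j0 = lambda x: math.sin(x) / x
j0p = lambda x: (x * math.cos(x) - math.sin(x)) / x**2
n0 = lambda x: -math.cos(x) / x
n0p = lambda x: (x * math.sin(x) + math.cos(x)) / x**2

x = k * r0
tan_d0 = (x * j0p(x) - q00 * j0(x)) / (x * n0p(x) - q00 * n0(x))  # eq. (49.53)

# compare with the closed form (same result mod pi, so compare tangents)
d0_closed = math.atan(k * math.tan(K * r0) / K) - k * r0
assert abs(tan_d0 - math.tan(d0_closed)) < 1e-10
```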

49.6 Hard Sphere Scattering

The simplest example of the calculation inside the radius $r = r_0$ is a "hard sphere", with infinite potential, $V = \infty$, for $r\leq r_0$.

Then the simplest boundary condition at $r = r_0$ is that the radial wave function vanishes, $R_{El}(r_0) = 0$, which means
$$j_l(kr_0)\cos\delta_l - n_l(kr_0)\sin\delta_l = 0,\qquad (49.54)$$
leading to
$$\tan\delta_l = \frac{j_l(kr_0)}{n_l(kr_0)}.\qquad (49.55)$$
Equivalently, we can take the limit $q_{0l}\to\infty$ (since the denominator in the definition of $q_{0l}$ contains $R_{El}(r_0) = 0$, as the wave function vanishes when it hits an infinite wall) in the general formula for $V(r < r_0)$, and we obtain the same result.

For $l = 0$, and at $r\geq r_0$, we find
$$\tan\delta_0 = \frac{j_0(kr_0)}{n_0(kr_0)} = \frac{\sin(kr_0)/(kr_0)}{-\cos(kr_0)/(kr_0)} = -\tan(kr_0) = \tan(-kr_0).\qquad (49.56)$$
Then we identify the arguments of the tangent, since both of them are bounded,
$$\delta_0 = -kr_0.\qquad (49.57)$$
The radial wave function (for $r\geq r_0$) is
$$R_{El}(r) = 2A_l^0\left(j_l(kr)\cos\delta_l - n_l(kr)\sin\delta_l\right),\qquad (49.58)$$
which means that for $l = 0$ we obtain
$$R_{E0}(r) = 2A_0^0\left(\frac{\sin kr}{kr}\cos\delta_0 + \frac{\cos kr}{kr}\sin\delta_0\right) = \frac{2A_0^0}{kr}\sin(kr + \delta_0) = \frac{2A_0^0}{kr}\sin[k(r - r_0)],\qquad (49.59)$$
a sinusoid shifted by $r_0$, i.e., starting at $r_0$ instead of zero.

Then we calculate the scattering length,
$$a = -\lim_{k\to 0}\frac{\delta_0(k)}{k} = r_0.\qquad (49.60)$$
Note that here we do not actually need to take the low-energy limit, since $\delta_0(k)/k = -r_0$ at all $k$.
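The hard-sphere results (49.55), (49.57), and (49.60) can be confirmed numerically; a minimal sketch (the radius and the sampled momenta are made up):

```python
import math

r0 = 1.0                                  # hard sphere radius (made-up value)
j0 = lambda x: math.sin(x) / x
n0 = lambda x: -math.cos(x) / x

# eq. (49.55): tan(delta_0) = j_0(k r0)/n_0(k r0) = -tan(k r0),
# i.e. delta_0 = -k r0, eq. (49.57)
for k in (0.3, 1.0, 2.5):
    tan_d0 = j0(k * r0) / n0(k * r0)
    assert abs(tan_d0 - math.tan(-k * r0)) < 1e-12

# scattering length a = -lim_{k->0} delta_0(k)/k = r0, eq. (49.60)
k = 1e-4
d0 = math.atan(j0(k * r0) / n0(k * r0))
a = -d0 / k
assert abs(a - r0) < 1e-6
```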

49.7 Low-Energy Limit

We now consider the low-energy limit for the hard sphere potential. For Bessel functions at small
argument, x = kr 0  1,
(kr 0 ) l
jl (kr 0 ) 
(2l + 1)! !
(49.61)
(2l − 1)! !
nl (kr 0 )  − .
(kr 0 ) l+1
555 49 Unitarity, Optics, and the Optical Theorem

In the k → 0 limit, we saw that δl  tan δl , so


(kr 0 ) 2l+1
δl (k) = tan δl (k) = . (49.62)
(2l + 1)! ! (2l − 1)! !
For the partial wave amplitude, in the low-energy limit, we obtain
δl (k)
al (k)  ∝ k 2l . (49.63)
k
This means that δ_0 dominates strongly, giving an isotropic (l = 0) result. The lth partial wave cross section is

σ_l = (4π/k²)(2l + 1) sin² δ_l(k) ∝ k^{4l},  (49.64)
which means all these cross sections go to zero, except σ_0, which stays constant. Then

σ_0 = 4π lim_{k→0} sin² δ_0(k)/k² = 4πa²,  a = −lim_{k→0} δ_0(k)/k = r_0,  (49.65)
so the total cross section is

σ_tot ≃ σ_0 ≃ 4πa² = 4πr_0²,  (49.66)

which is four times the geometrical cross section for radius r_0, πr_0².
Moreover, the differential cross section is also constant,

dσ/dΩ = |f|² ≃ |a_0(k) P_0(cos θ)|² = |lim_{k→0} δ_0(k)/k|² = a² = r_0²,  (49.67)

where we have used P_0(cos θ) = 1. This is consistent with the previous result, as the integration over the solid angle just gives a factor equal to the total solid angle, 4π.
Since we have four times the geometrical cross section (which would be the classical result), we
have to ask, why do we have a larger result? One answer is that at low energy the wavelengths of
the particles are larger than the range of the potential, leading to an extreme quantum regime, the
opposite of the classical limit.
On the other hand, at high energy, kr_0 ≫ 1, we still find a result higher than the classical one, even though now the wavelengths are going to zero. The result is (the derivation can be found in [1])

σ_tot = 2πr_0² = σ_reflection + σ_shadow,  (49.68)

where σ_reflection = πr_0² is the classical result, and σ_shadow is the Fraunhofer diffraction result in optics.
This is consistent with the geometric optics approximation of the quantum mechanical wave function
(semiclassical), which is a short-wavelength approximation.
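Both limits are easy to check numerically by summing the partial wave series with the exact hard-sphere phase shifts. A minimal sketch (the numerical values of k, r_0 and the cutoff lmax are arbitrary choices; the high-energy limit converges slowly, with corrections that vanish only as kr_0 → ∞):

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def sigma_tot(k, r0, lmax):
    # sum of sigma_l = (4 pi/k^2)(2l+1) sin^2(delta_l), with tan(delta_l) = j_l/n_l
    l = np.arange(lmax + 1)
    with np.errstate(all='ignore'):   # for l >> k r0, j_l/n_l underflows harmlessly to 0
        delta = np.arctan(spherical_jn(l, k * r0) / spherical_yn(l, k * r0))
    return (4 * np.pi / k**2) * np.sum((2 * l + 1) * np.sin(delta) ** 2)

r0 = 1.0
geo = np.pi * r0**2
low = sigma_tot(0.01, r0, lmax=40) / geo    # -> 4 (extreme quantum regime)
high = sigma_tot(50.0, r0, lmax=120) / geo  # -> 2 (reflection + shadow), slowly
print(low, high)
```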

Important Concepts to Remember

• Unitarity means that S_l* = S_l^{−1}, so that δ_l is real, |S_l| = 1, and ka_l = (S_l − 1)/(2i), meaning that ka_l maps out a circle in the complex plane called the unitarity circle.
• The total cross section expands as σ_tot = Σ_l σ_l, with σ_l = (4π/k²)(2l + 1) sin² δ_l, so σ_l ≤ (4π/k²)(2l + 1), called the unitarity bound.

• We can write a Lippmann–Schwinger equation for the radial wave function R_l(k; r),

R_l(k; r) = j_l(kr) − (2mik/ℏ²) ∫_0^∞ dr′ r′² j_l(kr_<) h_l^{(1)}(kr_>) V(r′) R_l(k; r′),

and the Green's function also has a decomposition in spherical harmonics Y_lm(n_r) Y*_lm(n_{r′}), with the coefficients of the radial Green's function

g_l(k; r, r′) = Σ_n R_nl(r) R_nl(r′)/(z − E_n),

satisfying a similar L–S equation.
• The optical theorem can be proven physically, without assuming Ŝ Ŝ † = 1, and in the proof we use
the Lippmann–Schwinger equation, which can be solved perturbatively via the Born series so that
we can reduce it to a term-by-term equation.
• When scattering on a hard sphere of radius r_0, the scattering length (without actually taking a low-energy limit) is a = r_0, and the low-energy differential cross section is constant, dσ/dΩ = r_0², so we get four times the geometric cross section; at high energy we instead get twice the geometric cross section, owing to a Fraunhofer diffraction (shadow) contribution.

Further Reading
See [2] and [1] for more details.

Exercises

(1) Use the Lippmann–Schwinger equation for ψ to write an expression for σl in terms of integrals
involving the potential V (r) and the wave function.
(2) Show the details of going from (49.19) to (49.20).
(3) In the case of the radial delta function potential, prove that the radial wave function Rl (k; r) is
(49.31) and the Jost solutions are (49.32).
(4) Expand the optical theorem first in angular momentum l and then in the Born series, for both f(θ = 0) and ⟨k|T̂|k⟩.
(5) For a potential with finite range, consider V = V0 > 0 inside r < r 0 , and find the solution in this
region, in the case E > V0 , imposing normalizability at r = 0.
(6) For the hard sphere, calculate σl at general k.
(7) Calculate the high-energy limit (k → ∞) for the partial wave cross section σl for a hard sphere,
and their relative weight in σtot .
50 Low-Energy and Bound States, Analytical Properties of Scattering Amplitudes

In this chapter we will consider the relation between low-energy scattering (meaning E > 0) and
bound states (with E < 0), and we will find that, on considering complex values of k (or E), the
same result can be obtained by analysis on the complex plane. This leads us to consider more general
analytical properties of the scattering amplitudes.

50.1 Low-Energy Scattering

We have seen that the phase shift δ_l is calculated from the potential via

ka_l(k) = e^{iδ_l} sin δ_l = −(2mk/ℏ²) ∫_0^∞ dr r² V(r) j_l(kr) R_l(k; r).  (50.1)

Moreover, at low energy, k → 0 implies δ_l → 0, and we have

ka_l(k) ≃ δ_l ∝ k^{2l+1} → 0,  (50.2)

called threshold behavior, which means that the l = 0 mode dominates the phase shifts.
Physically, we can understand the dominance of l = 0 from the effective potential

V_eff(r) = V(r) + ℏ² l(l + 1)/(2mr²).  (50.3)

At low energy, since the second (centrifugal) term in V_eff blows up for r → 0 if l ≠ 0, the particle never comes close to the origin: for l ≠ 0 the centrifugal barrier dominates the effective potential there. This means that only the l = 0 term contributes, for which V_eff = V(r).

Step Potential
We next consider the simplest finite-range potential, the one that is constant inside its range,

V = V_0 for r < r_0,  V = 0 for r ≥ r_0.  (50.4)

If V_0 > 0 we have a repulsive potential, and if V_0 < 0 we have an attractive one.


At low energy, kr_0 ≪ 1, l = 0 dominates and equation (48.34) implies

ψ_k(r, θ) ≃ e^{iδ_0} R_{E,l=0} = e^{iδ_0} (j_0(kr) cos δ_0 − n_0(kr) sin δ_0) ≃ e^{iδ_0} sin(kr + δ_0)/(kr),  (50.5)

where in the first equality we have used the dominance of l = 0, in the second P_0(cos θ) = 1, and in the third r → ∞.

For the reduced radial wave function, we obtain (for r ≥ r_0)

χ(r) = r R_{E,l=0}(r) = const × sin(kr + δ_0),  (50.6)

where we have chosen a different normalization, such that the constant is 1.
Within the potential range, we impose regularity at r → 0, χ(0) = 0. Since V = V_0 is constant over the range, the wave function is a free particle one, but with energy shifted by V_0,

E − V_0 = ℏ²k̃²/(2m),  (50.7)

so the real solution for l = 0 (the spherically symmetric case) is

χ(r) = N (e^{ik̃r} − e^{−ik̃r})/(2i) = N sin k̃r,  (50.8)
where N is a normalization constant. Of course, this is only true if E > V_0. If E < V_0, we write k̃ = iκ, so

V_0 − E = ℏ²κ²/(2m),  (50.9)

leading to the wave function

χ(r) = (N/i)(e^{κr} − e^{−κr})/2 = Ñ sinh κr.  (50.10)
We need to join the inside and outside solutions at r = r 0 .
We saw in the previous chapter that among the lth partial wave cross sections

σ_l = (4π/k²)(2l + 1) sin² δ_l,  (50.11)

at low energy the l = 0 term dominates,

σ_0 = 4π sin² δ_0(k)/k² → 4πa².  (50.12)
However, the dominance of σ_0 requires only that sin δ_0 → 0 as k → 0; this can happen not only at δ_0 → 0 but also at δ_0 = π.
If −V_0 = |V_0| is large then k̃ is large (even though k is small). Thus, eventually we reach k̃r_0 = π, so

sin k̃r_0 = sin π = 0,  (50.13)

even though kr_0 ≃ 0. But the continuity of the inside and outside solutions means that

sin(kr_0 + δ_0) = N sin(k̃r_0) = 0,  (50.14)

so δ_0 = π, since δ_0 has increased continuously as |V_0| was increased from 0, which corresponds to the free particle (for which δ_0 = 0).
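The matching across r_0 can be carried out explicitly for this step well: continuity of χ′/χ at r_0 gives k̃ cot(k̃r_0) = k cot(kr_0 + δ_0), i.e., δ_0 = −kr_0 + arctan[(k/k̃) tan(k̃r_0)] (mod π). A hedged numerical sketch (units ℏ = m = 1, and the well parameters are arbitrary choices) showing the sign flip of the scattering length as the well deepens through the first bound-state threshold k̃_0 r_0 = π/2:

```python
import numpy as np

hbar = m = 1.0

def delta0_well(k, V0abs, r0):
    # attractive well V = -|V0| for r < r0: inside wave number ktil = sqrt(k^2 + 2m|V0|/hbar^2)
    ktil = np.sqrt(k**2 + 2 * m * V0abs / hbar**2)
    # chi'/chi continuity at r0: ktil*cot(ktil*r0) = k*cot(k*r0 + delta0), mod pi
    return -k * r0 + np.arctan(k * np.tan(ktil * r0) / ktil)

def scatt_length(V0abs, r0=1.0, k=1e-6):
    # a = -lim_{k->0} delta_0(k)/k
    return -delta0_well(k, V0abs, r0) / k

a_below = scatt_length(1.5**2 / 2)   # ktil0*r0 = 1.50 < pi/2: a < 0, no bound state yet
a_above = scatt_length(1.65**2 / 2)  # ktil0*r0 = 1.65 > pi/2: a > 0, one shallow bound state
print(a_below, a_above)
```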

50.2 Relation to Bound States

The outside (r ≥ r_0) wave function solution for k → 0 can be rewritten as

χ(r) = sin(kr + δ_0) = sin[k(r + δ_0/k)].  (50.15)

However, since we are in the regime E ≃ 0, l = 0, r > r_0 so that V = 0, the Schrödinger equation is

d²(rR)/dr² = d²χ/dr² = 0,  (50.16)

with solution

χ ≃ C(r − a),  (50.17)
where C, a are constants. But matching with the first form in the k → 0 limit, we obtain

a = −lim_{k→0} δ_0/k,  (50.18)
meaning a is the scattering length.
But the condition of matching the inside and outside solutions means the logarithmic derivative must be continuous at r_0. We define it slightly differently, in terms of χ(r), as

q̃_0 ≡ χ′(r_0)/χ(r_0).  (50.19)
In terms of the outside function, its value at arbitrary r is

q̃ = k cos(kr + δ_0)/sin(kr + δ_0) = k cot[k(r + δ_0/k)] ≃ 1/(r + δ_0/k) = 1/(r − a).  (50.20)
If we choose r = 0, even though this is outside the domain of validity of the function (r ≥ r_0), we obtain

q̃_outside(r → 0) = k cot δ_0 = −1/a,  (50.21)
consistent with the definition of the scattering length. Its significance follows from the fact that q̃ is continuous at r_0: we have

χ_outside(r) ≃ sin[k(r − a)] = 0  (50.22)

for r = a, regardless of whether r = a ≥ r_0 or not, which means that the scattering length r = a is the intercept of the outside wave function; see Fig. 50.1.
If δ_0 < 0, coming from the repulsive potential V_0 > 0, then a > 0. If δ_0 > 0, coming from the attractive potential V_0 < 0, then a < 0. However, in the attractive-potential case V_0 < 0, if |V_0| is large, meaning large k̃, then the wave function oscillates with a shorter period inside the well, so it turns down faster at r > r_0, and its first zero after r_0 corresponds to a scattering length of the opposite sign, a > 0. Indeed, the low-energy (E ≃ 0) outside solution at r > r_0 is

χ ≃ C(r − a).  (50.23)

Figure 50.1 Wave function and scattering length: χ(r) versus r, with the intercept of the outside solution at r = a (marks at −a, r_0, +a).

This is a scattering solution for E > 0 (so E = 0+), though it can also be written as

χ ≃ B e^{−κr},  (50.24)

where κ ≃ 0 (very small) and

B = −Ca,  −Bκ = C < 0.  (50.25)

After this rewriting, the wave function also looks like a bound state (E < 0), though very slightly so, E = 0−. In both the scattering (E = 0+) and bound state (E = 0−) solutions,

ℏ²κ̃²/(2m) = E − V_0 ≃ |V_0|.  (50.26)

Therefore in both cases the inside wave function is

χ ≃ sin κ̃r.  (50.27)

Then the logarithmic derivative of the E = 0− (bound-state-like) solution at r_0 is

q̃_0 = −κ e^{−κr_0}/e^{−κr_0} = −κ = κ̃ cot κ̃r_0,  (50.28)

where the last equality is continuity with the inside solution (the same in both cases), to be equated to the outside value of the E = 0+ (scattering) solution,

q̃_0^outside = 1/(r_0 − a) ≃ −1/a,  (50.29)

in this larger-scattering-length case, r_0 ≪ a. Thus κ ≃ 1/a, and the binding energy of the bound state is obtained by equating the cases E = 0+ and E = 0−,

I = −E_bound state = ℏ²κ²/(2m) = ℏ²/(2ma²).  (50.30)

This is a relation between the binding energy I of a mode close to zero and the (large value of the) scattering length a for scattering on an attractive potential of large depth.
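This approximate relation can be tested against the exact square-well bound state. For the well of the previous subsection the bound-state condition is κ̃ cot(κ̃r_0) = −κ with κ̃² = 2m|V_0|/ℏ² − κ², while the zero-energy scattering length is a = r_0 − tan(k̃_0 r_0)/k̃_0, k̃_0 = √(2m|V_0|)/ℏ. A rough numerical check (ℏ = m = r_0 = 1; the depth is an arbitrary choice just above the first binding threshold), using SciPy's brentq root finder:

```python
import numpy as np
from scipy.optimize import brentq

V0 = 1.65**2 / 2            # ktil0 = 1.65, slightly above pi/2: one shallow bound state
ktil0 = np.sqrt(2 * V0)

a = 1.0 - np.tan(ktil0) / ktil0     # zero-energy scattering length (large, positive)

# bound-state condition ktil*cot(ktil) + kappa = 0, with ktil = sqrt(2 V0 - kappa^2)
f = lambda kap: np.sqrt(2 * V0 - kap**2) / np.tan(np.sqrt(2 * V0 - kap**2)) + kap
kappa = brentq(f, 1e-3, 0.5)

print(kappa, 1 / a)         # kappa ~ 1/a, i.e. I = hbar^2 kappa^2/(2m) ~ hbar^2/(2 m a^2)
```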

50.3 Bound State from Complex Poles of Sl (k)

The partial wave S-matrix S_l = e^{2iδ_l(k)} appears in the ratio of the e^{ikr}/r and e^{−ikr}/r modes of the real wave function solution. From (48.19), the radial wave function is

R_El(r) ≃ A_l^0 [e^{iδ_l} e^{i(kr−lπ/2)}/r − e^{−iδ_l} e^{−i(kr−lπ/2)}/r]
        = Ã_l^0 [e^{i(kr−lπ/2)}/r − (1/e^{2iδ_l(k)}) e^{−i(kr−lπ/2)}/r],  (50.31)

where e^{2iδ_l(k)} = S_l(k) appears in the denominator of the incoming spherical wave, and we have redefined A_l^0 by a factor e^{iδ_l(k)}.

Dropping A_l^0, and putting l = 0 as the most relevant case (though the calculation is actually valid for any l), from (48.34) we obtain

ψ_k(r, θ) ≃ e^{iδ_0} R_{E,l=0} = A_0^0 [S_0(k) e^{ikr}/r − e^{−ikr}/r]
          = Ã_0^0 [e^{ikr}/r − (1/S_0(k)) e^{−ikr}/r].  (50.32)
At this point we generalize to complex k, in order to connect to the description of bound states. Writing k = iκ, now with κ > 0, the scattering solution transforms into

ψ_k(r, θ) = Ã_0^0 [e^{−κr}/r − (1/S_0(iκ)) e^{+κr}/r],  (50.33)
which matches the bound state wave function solution provided that the second term vanishes, which
means S0 (iκ) → ∞ for κ ∈ R+ . Therefore the function S0 (k) has a pole on the imaginary line in the
upper half-plane k = iκ.
This means that, starting from scattering with k ∈ R, we generalize to the complex plane, k ∈ C,
and then the poles k = iκ correspond to bound states.
If there are no singularities other than a single bound state k = iκ, with κ > 0, the s-wave S-matrix S_0(k) can be constructed explicitly. This assumes that there are no singularities even at infinity; otherwise, we need to restrict to k not too large (for a physical case, S_0(k) → 0 at infinity).
Besides the pole condition for S_0(k), we have also the following conditions: (1) |S_0(k)| = 1 for k ∈ R+, derived from unitarity, and (2) S_0 = 1 for k = 0, since then S_l = e^{2iδ_l} ≃ e^{iCk^{2l+1}} → 1.
The simplest solution for the conditions on S_0(k) is

S_0(k) = −(k + iκ)/(k − iκ) = e^{2iδ_0(k)}.  (50.34)
This means that the partial wave amplitude is

a_{l=0}(k) = (S_{l=0} − 1)/(2ik) = −1/[i(k − iκ)] = 1/(−κ − ik).  (50.35)
Equating it with the definition

a_{l=0}(k) = 1/[k cot δ_{l=0}(k) − ik]  (50.36)

gives

k cot δ_{l=0}(k) = −κ ⇒ −1/a = lim_{k→0} k cot δ_{l=0}(k) = −κ,  (50.37)
the same relation between bound states and scattering length as we had before.
Thus extending δl (k) and Sl (k) to the complex-k plane is useful for obtaining relations between
different physical regimes.
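The single-pole model (50.34) can be verified directly: on the real axis it is unimodular, it tends to 1 at k = 0, and its partial wave amplitude reproduces k cot δ_0 = −κ. A quick numerical sketch (κ = 0.8 is an arbitrary choice):

```python
import numpy as np

kappa = 0.8

def S0(k):
    # single-bound-state model, eq. (50.34): S0(k) = -(k + i kappa)/(k - i kappa)
    return -(k + 1j * kappa) / (k - 1j * kappa)

k = np.linspace(1e-4, 10, 101)
unimodular = np.allclose(np.abs(S0(k)), 1)     # |S0| = 1 on the real axis (unitarity)
a0 = (S0(k) - 1) / (2j * k)                    # partial wave amplitude (50.35)
kcot = (1 / a0 + 1j * k).real                  # = k cot(delta_0)
print(unimodular, S0(1e-9).real, kcot[:3])     # S0(0) -> 1, and k cot(delta_0) = -kappa
```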

50.4 Analytical Properties of the Scattering Amplitudes

Therefore, we will set up the analytical properties of the functions δl (k) and Sl (k), extended over
the complex-k plane. If k ∈ C, the energy is also complex, E ∈ C. The exceptions are k ∈ R, leading
to E ∈ R+ and k ∈ iR (imaginary k), leading to E ∈ R− . In general, since

E = ℏ²k²/(2m),  (50.38)

we obtain

Re E = (ℏ²/2m)[(Re k)² − (Im k)²],  Im E = (ℏ²/2m) 2 Re k Im k.  (50.39)
Thus if we consider k over the upper half-plane (Im k > 0), it sweeps all of the complex E plane (both
the real and imaginary parts reach arbitrarily large positive and large negative values). However, this
implies that this complex E plane is not a regular plane (over which functions are single valued), but
rather a Riemann sheet of the domain of E, called the physical sheet.
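The map (50.38) from the upper half k-plane onto the full E plane is easy to visualize numerically; a small sketch (units ℏ = m = 1, sample points arbitrary):

```python
import numpy as np

hbar = m = 1.0
E = lambda k: hbar**2 * k**2 / (2 * m)

# sample points with Im k > 0: Re E and Im E each take both signs,
# so the upper half k-plane sweeps the whole complex E plane
ks = np.array([2 + 0.1j, -2 + 0.1j, 0.1 + 2j, 1 + 1j])
Es = E(ks)
print(np.round(Es, 3))

# consistency with eq. (50.39)
re_check = np.allclose(Es.real, (hbar**2 / (2 * m)) * (ks.real**2 - ks.imag**2))
im_check = np.allclose(Es.imag, (hbar**2 / (2 * m)) * 2 * ks.real * ks.imag)
print(re_check, im_check)
```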
Indeed, when r → ∞, the reduced radial wave function in the scattering regime, with k ∈ R, is

χ(r) ≃ A(E) e^{ikr} + B(E) e^{−ikr},  (50.40)

where

−A(E)/B(E) = e^{2iδ_0(E)} = S_0(E).  (50.41)

But if k = iκ instead, meaning

E = −ℏ²κ²/(2m) < 0,  (50.42)
then a real χ(r) for E < 0 means real A(E), B(E).
The relation between E > 0 and E < 0 is by analytical continuation through the upper half-plane
in k (Im k > 0), so k = iκ. But at an arbitrary point on the upper half of the k plane, we have

A(E*) = A*(E),  B(E*) = B*(E).  (50.43)

Indeed, Im E does change sign, but, since Im k > 0 is fixed, in fact only Re k actually changes sign, and

e^{ikr} = e^{i Re k r} e^{−Im k r} ↔ e^{−i Re k r} e^{−Im k r},  (50.44)

so on the imaginary axis the outgoing wave function e^{ikr} becomes the bound state wave function e^{−κr}, and the coefficient is its complex conjugate, so

A(E*) = A*(E),  (50.45)

and similarly for B(E*) = B*(E).


Then the physical Riemann sheet for E is defined by

Im k = Re κ > 0, (50.46)

and we need to consider a cut for E ∈ R+ (from E = 0 to infinity), taking us through to a different
Riemann sheet. The analyticity conditions A(E ∗ ) = A∗ (E) and B(E ∗ ) = B∗ (E) also define the E
plane, or physical Riemann sheet.

50.5 Jost Functions Revisited

To continue the analysis of analyticity properties, we go back to the Jost functions

F_l^±(k) = W(f_l^±, φ_l),  (50.47)

where the regular solution is

φ_l = (1/2ik)[−F_l^−(k) f_l^+(k; r) + F_l^+(k) f_l^−(k; r)],  (50.48)

and the Jost solutions are defined by

lim_{r→∞} e^{±ikr} f_l^±(k; r) = 1.  (50.49)

But then the coefficients B and −A are matched to F_l^±(k),

lim_{r→∞} f_l^+(k; r) = e^{−ikr} ⇒ F_l^+(k) = B(E),
lim_{r→∞} f_l^−(k; r) = e^{+ikr} ⇒ −F_l^−(k) = A(E).  (50.50)

Then, since k → −k* means that Re k changes sign but Im k does not, this corresponds to E → E*, so from the analyticity of A and B we obtain

F_l^±(−k*) = [F_l^±(k)]*,  (50.51)
which is indeed one of the Jost functions' properties. The other relevant one is

F_l^±(−k) = F_l^∓(k),  (50.52)

implying also

F_l^±(k*) = [F_l^∓(k)]*.  (50.53)
Then the lth partial wave S-matrix is written in terms of the Jost functions as

S_l(k) = e^{2iδ_l(k)} = [F_l^+(k)/F_l^−(k)] e^{ilπ} = [F_l^+(k)/F_l^+(−k)] e^{ilπ}.  (50.54)
This means that the analytical properties of Sl (k) can be derived from the analytical properties of the
Jost functions Fl± (k).
Indeed, we have a theorem: the zeroes of the Jost function F_l^+(k) in the lower half-plane (meaning Im k < 0) lie on the imaginary line iR, correspond to bound states, and are simple zeroes (with multiplicity one).
To prove this, we see that F_l^+(−k) = 0 implies S_l(k) → ∞, meaning a pole of S_l(k). But we have already seen that a pole of S_0(k) is a bound state k = iκ_n. Moreover, S_l(k) → ∞ ⇔ F_l^+(−k) = 0, so F_l^+(k) has zeroes at k = −iκ_n corresponding to bound states. q.e.d.
We then also easily obtain that S_l(k) has zeroes at k = −iκ_n, besides its poles at k = +iκ_n; see Fig. 51.1 in the next chapter.
For bound states of Fl− (k), the same analysis follows, except now we have zeroes in the upper
half-plane (Im k > 0), since
Fl− (k) = Fl+ (−k). (50.55)

Because F_l^+(k) = W(f_l^+, φ_l), the reduced radial wave function for a physical bound state is just the analytically continued regular solution with a normalization constant in front,

χ_l(k_n = −iκ_n; r) = N_nl φ_l(k_n = −iκ_n; r),  (50.56)

where

N_nl² = −4κ_n² / [F_l^−(−iκ_n) (dF_l^+(−iκ)/dκ)|_{κ=κ_n}].  (50.57)

The proof of the normalization constant is left as an exercise.


Since

S_l(k) = e^{2iδ_l(k)} = [F_l^+(k)/F_l^−(k)] e^{ilπ},  (50.58)

the analytical property F_l^±(k*) = [F_l^∓(k)]* implies that

S_l(k*) = [F_l^+(k*)/F_l^−(k*)] e^{ilπ} = {[F_l^−(k)]*/[F_l^+(k)]*} e^{ilπ} ⇒ S_l*(k*) = [F_l^−(k)/F_l^+(k)] e^{−ilπ} = S_l^{−1}(k),  (50.59)

so S_l(k) satisfies

S_l(k) S_l*(k*) = 1.  (50.60)

On the other hand, the analytical property F_l^±(−k) = F_l^∓(k) implies that

S_l(−k) = [F_l^+(−k)/F_l^−(−k)] e^{ilπ} = [F_l^−(k)/F_l^+(k)] e^{ilπ},  (50.61)

so S_l(k) satisfies

S_l(−k) S_l(k) = e^{2ilπ},  (50.62)

where the right-hand side equals 1 if l is integer.
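Both functional relations can be checked at generic complex k. For the single-pole model S_0(k) = −(k + iκ)/(k − iκ) of the previous sections (l = 0, so e^{2ilπ} = 1), a quick sketch (κ and the random sample are arbitrary choices):

```python
import numpy as np

kappa = 0.8
S0 = lambda k: -(k + 1j * kappa) / (k - 1j * kappa)

rng = np.random.default_rng(1)
k = rng.normal(size=20) + 1j * rng.normal(size=20)  # generic complex k (away from the pole)

rel1 = np.allclose(S0(k) * np.conj(S0(np.conj(k))), 1)  # S_l(k) S_l*(k*) = 1, eq. (50.60)
rel2 = np.allclose(S0(-k) * S0(k), 1)                   # S_l(-k) S_l(k) = e^{2 i l pi} = 1
print(rel1, rel2)
```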

Important Concepts to Remember

• For scattering at low energy, a_0 and σ_0 dominate; this is known as threshold behavior.
• For a finite-range negative step potential, V = V_0 < 0 for r ≤ r_0, we can relate the scattering (E = 0+) and bound state (E = 0−) solutions by −E_bound state ≡ ℏ²κ²/(2m) = ℏ²/(2ma²) (so κ = 1/a), with a the (large) scattering length.
• Generalizing S_l(k) to the complex k plane, the poles k = iκ of S_0(k) correspond to bound states. For a single pole, S_0(k) = −(k + iκ)/(k − iκ), giving a = 1/κ.
• Expanding the reduced radial wave function at infinity, χ(r) ≃ A(E)e^{ikr} + B(E)e^{−ikr}, and extending to k ∈ C, the coefficients are Jost functions, B(E) = F_l^+(E) and A(E) = −F_l^−(E), and S_l(k) = [F_l^+(k)/F_l^−(k)] e^{ilπ}, with S_l(−k)S_l(k) = e^{2ilπ}.

Further Reading
For more on the analytical properties of wave functions, see [4].

Exercises

(1) For a step potential, with energy E < V0 , connect the solution at r < r 0 with the solution at
r > r 0 (in the general case).
(2) Find the first correction to the approximate relation (50.30) between the binding energy I of the
bound state close to zero and the large scattering length a.
(3) Find the equivalent of the relation replacing (50.30) if we still have r 0  a, but E = V0 /2 > 0
and very small (kr 0  1).
(4) Extend the analysis in complex space leading to Sl (k), al (k), δl (k) and coming from the
existence of a bound state, from the similar analysis for l = 0.
(5) In the complex k plane do we still have real δl (k)?
(6) Write down the Jost functions Fl+ (k) and Fl− (k) in the case of a single bound state.
(7) Derive the normalization constant (50.57).
51 Resonances and Scattering, Complex k and l

In this chapter we consider resonances in scattering and their interpretation and analytical properties
in the complex k plane, where k is the wave number. We end with a few remarks about continuing l
into the complex plane.

51.1 Poles and Zeroes in the Partial Wave S-Matrix Sl (k)

We saw that k_n = −iκ_n is a zero of F_l^+(k), corresponding to a bound state. Then k_n = +iκ_n is a pole of S_l(k).
But then the question arises: can one not have poles outside the imaginary axis in the complex k plane? If there are extra poles, they cannot be on the real axis, since for k ∈ R, |S_0(k)| = 1, whereas we want S_0(k) = ∞. Thus, we require

k_pole = k_1 + ik_2.  (51.1)

Since S_0(k) = ∞, from the previous chapter we know that this corresponds to having only the "out" wave at r → ∞,

χ(r) ∼ e^{ikr} ⇒ χ* ∼ e^{−ik*r}.  (51.2)

Then the Schrödinger equations for χ and for χ* are

d²χ/dr² + k²χ − V(r)χ = 0,
d²χ*/dr² + k*²χ* − V(r)χ* = 0.  (51.3)

Multiplying the first equation by χ* and subtracting the second, multiplied by χ, we obtain

d/dr [χ* dχ/dr − χ dχ*/dr] + (k² − k*²) χ*χ = 0.  (51.4)

Integrating ∫_0^r dr′, we obtain

[χ* dχ/dr − χ dχ*/dr]_0^r + (k² − k*²) ∫_0^r dr′ |χ(r′)|² = 0.  (51.5)

Regularity of the solution at r = 0 means that χ(0) = χ*(0) = 0. Since r → ∞, we can use χ(r) ∼ e^{ikr}, χ*(r) ∼ e^{−ik*r}, obtaining

(k + k*) [i e^{−2r Im k} + 2i Im k ∫_0^r dr′ |χ(r′)|²] = 0.  (51.6)

Then we can have either k + k* = 0, which means that the poles lie on the imaginary line (the bound states), or

Im k = −e^{−2r Im k} / (2 ∫_0^r dr′ |χ(r′)|²) < 0.  (51.7)

This would mean that there are poles in the lower half-plane, k = k_1 − ik_2, where k_2 ∈ R+.
We have analyzed the poles of S_0(k) because we started with (50.32), but that was actually valid for any l, so all the analysis above can be generalized to S_l(k) for any l. Since

S_l(k) = [F_l^+(k)/F_l^−(k)] e^{ilπ} = [F_l^+(k)/F_l^+(−k)] e^{ilπ},  (51.8)

a pole of S_l(k) at k = k_1 − ik_2 ⇔ a zero of F_l^+(−k) = F_l^−(k) there ⇔ a zero of [F_l^−(k)]* = F_l^+(k*) at the conjugate point ⇔ a zero of S_l(k) at

k_zero = k_1 + ik_2.  (51.9)

Therefore for every pole in the lower half-plane, we have a corresponding zero in the upper half-plane at the complex conjugate point.
However, we also have F_l^±(−k*) = [F_l^±(k)]*. Thus if k = k_1 + ik_2 is a zero of F_l^+(k), then so is −k* = −k_1 + ik_2; correspondingly, F_l^−(k) = F_l^+(−k) has zeroes at k = ±k_1 − ik_2, which are poles of S_l(k).
To summarize, we have (see Fig. 51.1)

zeroes of F_l^−(k): k = ±k_1 − ik_2 ↔ poles of S_l(k),
zeroes of F_l^+(k): k = ±k_1 + ik_2 ↔ zeroes of S_l(k).  (51.10)

However, in the physical case, k is real. If the poles and zeroes are close to the real line, i.e., if k_2 ≪ k_1, we can approximate the form of S_l(k). Then

S_l(k) = e^{2iδ_l(k)} = S̃_l(k) (k − k_zero)/(k − k_pole) = S̃_l(E) (E − E_zero)/(E − E_pole),  (51.11)

where S̃l (k) varies very little. For physical values of k, we have two cases. For bound states, k
is on the imaginary axis, and if also l = 0 then S̃0 (k) = −1. But for the poles or zeroes near the

Figure 51.1 Zeroes and poles, in the complex k plane, of (a) S_l(k); (b) F_l^+(k); (c) F_l^−(k). Zeroes are denoted by a circle, and poles by a cross.

real axis, S̃_l(k) ≃ 1 since this corresponds to scattering, and if k is far from the pole or zero, then S_l(k) ≃ S̃_l(k) ≃ 1 is close to the unperturbed case. Then

S_l(k) ≃ (k − k_1 − ik_2)/(k − k_1 + ik_2).  (51.12)

This S_l(k) is valid close to the pole or zero.

51.2 Breit–Wigner Resonance

From the formula S_l(k) = e^{2iδ_l(k)} above, we can calculate the partial wave amplitude by substituting it into the general formula,

a_l(k) = (S_l(k) − 1)/(2ik) ≃ (1/k) (−k_2)/[(k − k_1) + ik_2] = 1/[k(k − k_1)/(−k_2) − ik]
       = e^{iδ_l(k)} sin δ_l(k)/k = 1/[k cot δ_l(k) − ik].  (51.13)

Identifying the term with cot δ_l with the similar term in the first line, we obtain

cot δ_l(k) = (k − k_1)/(−k_2).  (51.14)
Moreover, the lth partial wave cross section is

σ_l(k) = (4π/k²)(2l + 1) sin² δ_l(k) = 4π(2l + 1)|a_l(k)|² ≃ (4π/k²)(2l + 1) k_2²/[(k − k_1)² + k_2²].  (51.15)
But this formula has a maximum at k = k1 , which is called a resonance.
We have seen that the zero of S_l(k) is at k_zero = k_1 + ik_2, and correspondingly there is a pole at k_pole = k_1 − ik_2. Then

E_zero/pole = (ℏ²/2m) k²_zero/pole = (ℏ²/2m)(k_1² − k_2² ± 2ik_1k_2) = E_res ± iΓ/2,  (51.16)

where we have defined the resonance energy E_res and the width Γ by

E_res = (ℏ²/2m)(k_1² − k_2²),  Γ/2 = (ℏ²/2m) 2k_1k_2.  (51.17)
If k is close to k_1 (k ≃ k_1), and correspondingly the real energy E is close to the real energy E_1, related by

E = ℏ²k²/(2m) ∈ R,  E_1 = ℏ²k_1²/(2m) ∈ R,  (51.18)

we obtain

E − E_1 = (ℏ²/2m)(k + k_1)(k − k_1) ≃ (ℏ²/2m) 2k_1(k − k_1),  (51.19)

as well as (since k_1 ≫ k_2)

E_res ≃ E_1.  (51.20)

Figure 51.2 Breit–Wigner resonance: the cross section σ_l(E), peaking at E = E_1.

Then from (51.14) we have

tan δ_l(k) = −k_2/(k − k_1) = (−Γ/2)/(E − E_1),  (51.21)

which in turn means that

S_l(k) = e^{2iδ_l(k)} = [1 + i tan δ_l(k)]/[1 − i tan δ_l(k)] = (E − E_1 − iΓ/2)/(E − E_1 + iΓ/2).  (51.22)
Moreover, now we have

(k − k_1)² = (E − E_1)²/[(ℏ²/2m) 2k_1]²,  k_2² = (Γ/2)²/[(ℏ²/2m) 2k_1]²,  (51.23)

which we substitute in σ_l(k) to obtain

σ_l(k) ≃ (4π/k²)(2l + 1) (Γ/2)²/[(E − E_1)² + (Γ/2)²].  (51.24)
This is the Breit–Wigner formula for a resonance in energy, giving a bell-like shape for the cross section; see Fig. 51.2. The maximum of σ_l(k) is at E = E_1 ≃ E_res, which is therefore the resonance energy. On the other hand, Γ is the full width at half maximum for σ_l: indeed, if E − E_1 = ±Γ/2, then σ_l(E)/σ_l(E_1) = 1/2.
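The lineshape and the width are immediate to verify numerically; a minimal sketch (the values of E_1, Γ, k and l are arbitrary choices):

```python
import numpy as np

E1, Gamma, k, l = 2.0, 0.3, 1.0, 0

def sigma_bw(E):
    # Breit-Wigner partial wave cross section, eq. (51.24)
    return (4 * np.pi / k**2) * (2 * l + 1) * (Gamma / 2)**2 / ((E - E1)**2 + (Gamma / 2)**2)

def S_res(E):
    # resonant S-matrix, eq. (51.22): unimodular for real E
    return (E - E1 - 1j * Gamma / 2) / (E - E1 + 1j * Gamma / 2)

peak = sigma_bw(E1)                                 # maximum at E = E1
print(sigma_bw(E1 + Gamma / 2) / peak,
      sigma_bw(E1 - Gamma / 2) / peak)              # both 1/2: Gamma is the FWHM
print(abs(S_res(2.7)))                              # |S| = 1 on the real axis
```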
We can write a slightly more general formula. Until now, we have assumed that we can directly invert (51.14), so that

δ_l(k) = tan^{−1}[(−Γ/2)/(E − E_1)],  (51.25)

and replace it in S_l(k) = e^{2iδ_l(k)} to find the Breit–Wigner formula for σ_l(k). But it could be that there is a "background" value (a constant) for δ_l, called ξ_l, i.e.,

δ_l(k) → δ̃_l(k) = δ_l(k) + ξ_l = tan^{−1}[(−Γ/2)/(E − E_res)] + ξ_l.  (51.26)
If ξ_l = nπ, that does not change the value of tan δ̃_l(k), but otherwise it does. Then, defining

tan δ_l = (−Γ/2)/(E − E_1) ≡ −1/ε,  tan ξ_l ≡ −1/q,  (51.27)

from the partial wave S-matrix

S_l = e^{2iδ_l} → e^{2iδ̃_l} = e^{2iδ_l} e^{2iξ_l},  (51.28)

we obtain the partial wave cross section

σ_l = (4π/k²)(2l + 1) (q + ε)²/[(1 + q²)(1 + ε²)].  (51.29)
In the normal case ξ_l = 0, implying q → ∞, we obtain

σ_l = (4π/k²)(2l + 1) 1/(1 + ε²) = (4π/k²)(2l + 1) (Γ/2)²/[(E − E_res)² + (Γ/2)²].  (51.30)

In the extreme non-normal case, q = 0 ↔ ξ_l = π/2, we obtain a new formula,

σ̃_l = (4π/k²)(2l + 1) ε²/(1 + ε²) = (4π/k²)(2l + 1) (E − E_res)²/[(E − E_res)² + (Γ/2)²].  (51.31)
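The interpolation between the two extreme cases is transparent in the variable ε. A short sketch of the profile (q + ε)²/[(1 + q²)(1 + ε²)] (reminiscent of the Fano lineshape; the numerical q values are arbitrary choices):

```python
import numpy as np

def profile(eps, q):
    # sigma_l in units of (4 pi/k^2)(2l + 1), eq. (51.29)
    return (q + eps)**2 / ((1 + q**2) * (1 + eps**2))

eps = np.linspace(-10, 10, 2001)
bw_limit = np.allclose(profile(eps, 1e8), 1 / (1 + eps**2), atol=1e-6)  # q -> inf: (51.30)
dip_limit = np.allclose(profile(eps, 0.0), eps**2 / (1 + eps**2))       # q = 0: (51.31)
print(bw_limit, dip_limit)   # the q = 0 profile vanishes exactly at resonance (eps = 0)
```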
A final observation is that

cot δ_l ≃ (E − E_res)/(−Γ/2)  (51.32)

means that at resonance cot δ_l = 0, and a little away from it, the above formula gives the first term in the Taylor expansion.

51.3 Physical Interpretation and Its Proof

Now that we have defined the poles of Sl (k) and the related Breit–Wigner resonance formulas
analytically, we need to understand the physical interpretation of this mathematical result. The first
observation is that the poles are near the real axis, and they influence the behavior on the real
axis, where physical scattering is located. But how are these poles created, or equivalently, what
do the poles represent? The poles that we considered previously were bound states. But in bound
states the system stays bound for an infinite amount of time, i.e., there is an infinite lifetime. The
infinite lifetime refers to the particle being inside a potential well, with the energy of the bound state
satisfying En < V on both sides of the potential well, and with En < 0, so that there are no asymptotic
states (the particle cannot escape to infinity).
Bound states are also stationary, which means that their time dependence is just a phase,

ψ(r, t) ∝ e^{−iE_n t/ℏ},  (51.33)

where E_n = −|E_n|. Since this corresponds to an imaginary-k pole, k_n = iκ_n, the "out" wave turns into a decaying exponential, e^{ik_n r} = e^{−κ_n r}, but under the transition the energy stays real, just changing sign,

E_n = E(k_n) = ℏ²k_n²/(2m) = −ℏ²κ_n²/(2m) = −|E_n|.  (51.34)
Turning to the resonance scattering, k  kpole = k1 − ik2 , we are still close to the real line, so
actually k ∼ k1 , meaning the “out” wave stays out, eikr  eik1 r . On the other hand, the energy now

Figure 51.3 Scattering by entering, and then tunneling out of, a potential barrier around a quasi-bound state: V_eff(r) versus r, with quasi-bound energy E_n.

becomes complex, E  Epole = Eres − iΓ/2, leading to a time dependence of the wave function that is
not just a phase, but also a decaying exponential,

ψ(r, t) ∝ e−iEt/  e−iEres t/ e−Γt/2 . (51.35)

Then the probability density decays in time,

|ψ(r, t)| 2 ∝ e−Γt/ , (51.36)

with a finite lifetime



τ= (51.37)
Γ
for the state. This means we now have a quasi-bound state. With respect to the (true) bound state,
we still have En < V on both sides of the potential well, but now En > 0, which means that there are
asymptotics at infinity. This means that the well has a potential barrier on the side at large r, and we
can tunnel out of it, see Fig. 51.3.
In conclusion, the physical picture is the following: we have a particle coming in from infinity, it
gets captured into the quasi-bound state, and then it tunnels out of the potential well, or “leaks out”.
To prove this picture, we consider the radial wave function extended on the complex k plane,

R_l(k; r) = C [S_l(k) e^{ikr}/r − e^{−ikr}/r] ≃ C [(k − k_1 − ik_2)/(k − k_1 + ik_2) e^{ikr}/r − e^{−ikr}/r].  (51.38)

Choosing the constant

C = 1/(k − k_1 − ik_2),  (51.39)

the reduced radial wave function becomes

χ_l(k; r) ≃ e^{ikr}/(k − k_1 + ik_2) − e^{−ikr}/(k − k_1 − ik_2).  (51.40)
From it, we construct a wave packet by integrating χ_l, multiplied by the stationary phase, near the pole,

ψ(r, t) = ∫ dk χ_l(k; r) e^{−iEt/ℏ}.  (51.41)

Since we are integrating near the pole, we write k = k_1 + δk and integrate over δk, with dk = dδk. In this case, the energy is

E = ℏ²k²/(2m) ≃ ℏ²k_1²/(2m) + (ℏ²k_1/m) δk = E_1 + ℏv_1 δk,  (51.42)

where we have defined the velocity of the pole,

v_1 ≡ ℏk_1/m.  (51.43)
Then the reduced radial wave function is

χ_l(k; r) ≃ e^{ik_1r} e^{iδk r}/(δk + ik_2) − e^{−ik_1r} e^{−iδk r}/(δk − ik_2)  (51.44)

and the wave packet is

ψ(r, t) = e^{i(k_1r−E_1t/ℏ)} ∫ dδk e^{iδk(r−v_1t)}/(δk + ik_2) − e^{−i(k_1r+E_1t/ℏ)} ∫ dδk e^{−iδk(r+v_1t)}/(δk − ik_2).  (51.45)
Since most of the integral comes from the region near the pole (δk ∼ 0), we can extend the integration region to the whole real line, ∫_{−∞}^{+∞} dδk, with minimal change. To calculate this integral over the real line, we must close the contour of integration in the complex plane, with a half-circle at infinity.
In the first integral:
If r − v_1t > 0 then in order for the contribution of the half-circle to vanish, we need to have Im δk > 0 (so that the exponential is decaying), meaning the integration is in the upper half-plane. Then we have a closed contour C with a counterclockwise integration, meaning (1/2πi) ∮_C gives the sum of residues inside C, i.e., the residues in the upper half-plane, of which there are none.
If r − v_1t < 0 then for the contribution of the half-circle to vanish, we need to have Im δk < 0, i.e., integration in the lower half-plane. But now we have a closed contour C with clockwise integration, so −(1/2πi) ∮_C gives the sum of residues in the lower half-plane. That means the residue at δk = −ik_2, giving finally

∫ dδk e^{iδk(r−v_1t)}/(δk + ik_2) = −2πi Res(δk = −ik_2) θ(v_1t − r) = −2πi e^{k_2(r−v_1t)} θ(v_1t − r).  (51.46)
In the second integral:
If r + v_1t > 0 then for the contribution of the half-circle to vanish, we need Im δk < 0, which means the integration is in the lower half-plane. This gives a clockwise integration, and −(1/2πi) ∮_C gives the residues in the lower half-plane, of which there are none.
If r + v_1t < 0 then we need Im δk > 0, so the integration is in the upper half-plane. This gives a counterclockwise integration, with (1/2πi) ∮_C giving the sum of the residues in the upper half-plane, meaning δk = ik_2, giving finally

∫ dδk e^{−iδk(r+v_1t)}/(δk − ik_2) = 2πi Res(δk = ik_2) θ(−v_1t − r) = 2πi e^{k_2(r+v_1t)} θ(−v_1t − r).  (51.47)
Then the wave packet time dependence splits into three regions:
If t < −r/v_1, we have the in wave. This time region corresponds to the incoming wave propagating towards the center, since at r → ∞ we have

ψ(r, t) ≃ −2πi e^{−i(k_1r+E_1t/ℏ)} e^{k_2(r+v_1t)},  (51.48)

which means that the wave comes in from r = ∞ with velocity −v_1.

If −r/v_1 < t < r/v_1, in the asymptotic region r → ∞ we have

ψ(r, t) ≃ 0,  (51.49)

which means the wave is concentrated near r = 0. Thus the particle has been captured in a metastable or quasi-bound state.
If t > r/v_1, we have the out wave. This region corresponds to the outgoing wave propagating from the center, since at r → ∞ we have

ψ(r, t) ≃ −2πi e^{i(k_1r−E_1t/ℏ)} e^{k_2(r−v_1t)},  (51.50)
where the last exponential is decaying, since r − v1 t < 0. Since
Γ 2
= 2k1 k2 = v1 k2 , (51.51)
2 2m
we obtain that the probability density at fixed r → ∞ in this region is
|ψ(r, t)| 2 ∝ e−2k2 v1 t = e−Γt/ = e−t/τ , (51.52)
which is what we wanted to prove. Indeed then, at large times, the wave function implies that there
is a probability of tunneling out of the metastable state.

51.4 Review of Levinson Theorem

Now that we have understood how to obtain $\delta_l(k)$, we come back to the Levinson theorem, to present a simple proof. The theorem can be stated as
$$\delta_l(0) - \delta_l(\infty) = n_l^b\pi + \phi, \qquad (51.53)$$
where $\phi = 0$ or $\pi/2$.
We start with a small potential, namely a potential that is small with respect to the energy associated with a de Broglie wavelength of the order of the range of the potential,
$$|V(r)| \ll \frac{\hbar^2}{2mr_0^2}. \qquad (51.54)$$
In this case, the Born approximation is valid at all energies, including $E = \infty$, so $\delta_l(E) \ll 1$, and $\delta_l(\infty) \simeq 0$. On the other hand, $\delta_l(0) = \phi$ (which is also zero, except for a background value). Since the potential is shallow, we have no bound states, $n_l^b = 0$. Then consider $E = \epsilon$, a small and fixed energy, and follow $\delta_l(\epsilon) - \delta_l(\infty)$ as $V(r)$ deepens. Each new bound state of energy $\epsilon$ that appears as $V(r)$ deepens has a radial wave function with a new node relative to the previous bound state of energy $\epsilon$ (since the bound state of energy $\epsilon$ has the largest energy among the bound states, i.e., it is $E_{n_{\max}}$, for which we have $n_{\max} + 1$ total nodes). But since $E = \epsilon$ is small, this last node is at large $r$ (since the periodicity is inversely proportional to $k$), for which the wave function is
$$R_{El}(r) \sim \sin\left(kr - \frac{l\pi}{2} + \delta_l\right), \qquad (51.55)$$
which means that $kr - l\pi/2 + \delta_l$ changes by $\pi$ at the new node's appearance, so $\delta_l(\epsilon)$ changes by $\pi$ each time a new node appears at fixed $r$. That means that
$$\delta_l(\epsilon \to 0) - \delta_l(\infty) = n_l^b\pi + \phi, \qquad (51.56)$$
and after we take the limit $\epsilon \to 0$, we obtain the Levinson theorem. q.e.d.
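The proof can be illustrated numerically. The sketch below is not from the text: it uses an attractive square well, whose s-wave S-matrix is known in closed form, with a hypothetical depth chosen so that there is exactly one s-wave bound state (units $\hbar = m = 1$):

```python
import numpy as np

# s-wave Levinson check for an attractive square well, V = -V0 for r < r0.
# Matching at r0 gives, with k' = sqrt(k^2 + 2*V0),
#   S0(k) = e^{-2ikr0} (k' cos(k'r0) + ik sin(k'r0)) / (k' cos(k'r0) - ik sin(k'r0)).
V0, r0 = 8.0, 1.0     # sqrt(2*V0)*r0 = 4 lies in (pi/2, 3pi/2): one bound state

k = np.linspace(20.0, 1e-3, 40000)       # descend from high k, where delta is small
kp = np.sqrt(k**2 + 2.0 * V0)
S = np.exp(-2j * k * r0) * (kp * np.cos(kp * r0) + 1j * k * np.sin(kp * r0)) \
                         / (kp * np.cos(kp * r0) - 1j * k * np.sin(kp * r0))
delta = 0.5 * np.unwrap(np.angle(S))     # continuous branch of delta_0(k)

print(delta[-1] / np.pi)   # delta_0(0) - delta_0(inf) ≈ pi: n_b = 1
```

The continuous branch is fixed by starting the phase unwrapping at high $k$, where $\delta_0 \to 0$, so that $\delta_0(k \to 0)/\pi$ directly counts the bound states.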
51.5 Complex Angular Momentum

Until now, we have learned new information from considering k in the complex plane (and so also
the energy E in the complex plane), but we have kept the angular momentum as a natural number,
l ∈ N.
However, we can also learn from an extension to a complex angular momentum, l ∈ C, where now
the energy is kept real, E ∈ R. The result is called Regge theory.
The reduced radial wave function in the bound state case is
$$\chi_{El}(r) = rR_{El}(r) = A(l,E)e^{-\kappa r} + B(l,E)e^{+\kappa r}, \qquad (51.57)$$
where $\kappa = \sqrt{-2mE}/\hbar$. We can relate it to the scattering solution through analytical continuation through the lower half-plane (different from the upper half-plane relevant to complex $k$). The scattering solution is (the proof is just a modification of the continuation of the energy dependence through the upper half-plane)
$$\chi_{El}(r) = A^*(l,E)e^{-ikr} + B^*(l,E)e^{ikr}. \qquad (51.58)$$
This means that
$$A^*(l,E) = -F_l^-(k) = -(F_l^+(k))^*, \qquad B^*(l,E) = F_l^+(k) = (F_l^-(k))^*. \qquad (51.59)$$
If $l \in \mathbb{R}$ and $E > 0$, then, identifying the scattering $\chi_{El}$ with the one analytically continued from the bound state, we obtain an analyticity constraint on $A(l,E)$, namely
$$A(l,E) = B^*(l,E). \qquad (51.60)$$
But analytically continuing to the complex plane, $l \in \mathbb{C}$, while still keeping $E \in \mathbb{R}^+$, we obtain
$$A(l^*,E) = B^*(l,E). \qquad (51.61)$$
The general solution of the Schrödinger equation, even in the case of complex $l$, has the following behavior near $r \to 0$:
$$R_{El}(r) \sim \alpha r^l + \beta r^{-l-1}. \qquad (51.62)$$
In the case of $l \in \mathbb{N}$, we need to put $\beta = 0$ so that we have only $R_{El} \sim r^l$. If $l \in \mathbb{C}$, we have an alternative: we can still define $R_{El}$ as the solution behaving like $r^l$, provided this is the subdominant behavior near $r \to 0$, $r^l \ll r^{-l-1}$, which means $\mathrm{Re}\,l \geq \mathrm{Re}(-l-1)$, so
$$\mathrm{Re}(2l + 1) > 0. \qquad (51.63)$$

Since
$$S(l,E) = e^{2i\delta(l,E)} = e^{i\pi l}\frac{A(l,E)}{B(l,E)}, \qquad (51.64)$$
we can generalize
$$\frac{A(E)}{B(E)} = -\frac{F_l^+(k)}{F_l^-(k)} \;\to\; -\frac{F^+(l,k)}{F^-(l,k)} = \frac{A(l,E)}{B(l,E)}. \qquad (51.65)$$
Then we obtain
$$S^*(l,E) = e^{-i\pi l^*}\frac{A^*(l,E)}{B^*(l,E)}, \qquad S(l^*,E) = e^{+i\pi l^*}\frac{A(l^*,E)}{B(l^*,E)}, \qquad (51.66)$$
leading to the analyticity condition
$$S^*(l,E)\,S(l^*,E) = 1. \qquad (51.67)$$
The poles of $S(l,E)$ are then the zeroes of $B(l,E)$ in the complex $l$ plane, called Regge poles. These line up as points on curves
$$l = \alpha_i(E) \qquad (51.68)$$
called Regge trajectories.
We will not continue with the analysis of Regge theory, since it is beyond the scope of this book.


We will not continue with the analysis of Regge theory, since it is beyond the scope of this book.

Important Concepts to Remember

• For $S_l(k)$ we have poles in the lower half-plane, $k_{\rm pole} = k_1 - ik_2$, and for each such pole a corresponding zero in the upper half-plane, $k_{\rm zero} = k_1 + ik_2$.
• For poles close to $\mathbb{R}$, with $k_2 \ll k_1$, near the pole
$$S_l(k) \simeq \frac{k - k_1 - ik_2}{k - k_1 + ik_2},$$
and we have a resonance,
$$\sigma_l = \frac{4\pi}{k^2}(2l+1)\frac{k_2^2}{(k - k_1)^2 + k_2^2},$$
with $E_{\rm zero/pole} = E_{\rm res} \pm i\frac{\Gamma}{2}$, and with $\tan\delta_l = \frac{-\Gamma/2}{E - E_1}$.
• The resonance satisfies the Breit–Wigner formula,
$$\sigma_l = \frac{4\pi}{k^2}(2l+1)\frac{(\Gamma/2)^2}{(E - E_1)^2 + (\Gamma/2)^2},$$
though with a background value for $\delta_l$, $\delta_l \to \delta_l + \xi_l$, we can change it.
• The resonance has the physical interpretation that there is a quasi-bound state, with a potential barrier but with $E > 0$, so we have an in wave for a particle going towards the center of the potential, then a metastable (quasi-bound) state, then an out wave for a particle tunneling out (decaying) with probability $\propto e^{-t/\tau}$, $\tau = \hbar/\Gamma$.
• We can also consider complex angular momentum $l$, with real energy, leading to Regge theory, in which case $S(l,E)$ has poles at points $l = \alpha_i(E)$ forming curves known as Regge trajectories.
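As a quick numerical illustration of the Breit–Wigner shape in the recap above (with hypothetical resonance parameters $E_1$ and $\Gamma$, not tied to any specific potential):

```python
import numpy as np

# Breit-Wigner shape: peak at E1, full width at half maximum equal to Gamma.
E1, Gamma = 5.0, 0.4

E = np.linspace(0.0, 10.0, 200001)
shape = (Gamma / 2)**2 / ((E - E1)**2 + (Gamma / 2)**2)

peak = E[np.argmax(shape)]
half = E[shape >= 0.5]          # region above half maximum
fwhm = half[-1] - half[0]
print(peak, fwhm)               # peak ≈ E1 = 5.0, FWHM ≈ Gamma = 0.4
```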

Further Reading
For more on the analytical properties of wave functions, see [4]. See also [2] and [1].

Exercises

(1) Write down the approximate value for the partial wave S-matrix Sl (k) for the case of two
resonances and two bound states.
(2) Consider the case where among the partial wave cross sections σl only σ1 is at (or very near) a
resonance. What can you say about σtot ?
(3) Consider the case where Sl (k) has a single pole (resonance), but k2 is comparable with k1 .
Calculate tan δl and the equivalent of the Breit–Wigner formula in this more general case, σl (E).
(4) If we have two resonances (complex poles) for σl close to each other, and we are at resonance
on the real k line, write down the wave function in terms of the physical interpretation of time-
dependent scattering with in and out waves, as in the text.
(5) In the case in exercise 4, calculate the asymptotic wave function in the metastable region of
time, and in the out region.
(6) In the case of the Levinson theorem, for nlb , do we need to count only the poles of Sl (k) that are
exactly on the imaginary line, or can they be slightly away from the imaginary line?
(7) Consider that Sl (k) for all l ∈ N has a single resonance pole, very close to the real line. If we
consider now complex momentum instead, what can we learn about the Regge trajectories?
52 The Semiclassical WKB and Eikonal Approximations for Scattering

In this chapter we study semiclassical approximations for scattering. We start with the WKB
approximation, extended to three dimensions but then reduced to effectively one dimension. Then we
give an alternative approach, the eikonal approximation, where the geometrical optics approximation
is further refined to a straight line. We end with a special application, the Coulomb potential
scattering, where the semiclassical approximation is actually exact.

52.1 WKB Review for One-Dimensional Systems

Before the extension of the WKB analysis to three dimensions, we review the one-dimensional case.
Constructing the semiclassical wave function starts with the redefinition of the time-dependent but stationary wave function,
$$\psi(\vec r, t) = e^{iS(\vec r,t)/\hbar} = \psi(\vec r)\,e^{-iEt/\hbar}, \qquad (52.1)$$
where the exponent separates the time dependence as
$$S(\vec r, t) = s(\vec r) - Et. \qquad (52.2)$$
Then the time-independent wave function is
$$\psi(\vec r) = e^{is(\vec r)/\hbar}. \qquad (52.3)$$
This already shows that we can apply this formalism to scattering, if $\psi(\vec r)$ corresponds to the scattering wave function $\psi^{(+)}(\vec r)$. Note that we applied the WKB method before to bound states, leading to Bohr–Sommerfeld quantization.
The expansion of the function $s(\vec r)$ in $\hbar$,
$$s(\vec r) = s_0(\vec r) + \hbar s_1(\vec r) + \hbar^2 s_2(\vec r) + \cdots \qquad (52.4)$$
leads to the semiclassical approximation if we keep only the first two terms, $s_0$ (classical) and $s_1$ (semiclassical).
The zeroth-order term is the classical on-shell action between the initial condition $\vec r_0(t_0)$ and the final point $\vec r(t)$,
$$s_0(\vec r) = S_0[\vec r_0(t_0) \to \vec r(t)]. \qquad (52.5)$$
This arises as the extremum of the path integral for the propagator,
$$U(t, t_0) = \int \mathcal{D}\vec r(t)\,\exp\left\{\frac{i}{\hbar}S[\vec r_0(t_0) \to \vec r(t)]\right\}. \qquad (52.6)$$

One Dimension
Specializing to one dimension, the Schrödinger equation implies the equation for $s(x)$,
$$\left(\frac{ds(x)}{dx}\right)^2 - 2m(E - V(x)) + \frac{\hbar}{i}\frac{d^2}{dx^2}s(x) = 0. \qquad (52.7)$$
It is the quantum-corrected version of the Hamilton–Jacobi equation for the classical action.
The solution of the equation to first order in $\hbar$ (i.e., $s_0 + \hbar s_1$) gives, for the wave function in the classically allowed region $E > V(x)$,
$$\psi(x) = \frac{1}{[2m(E - V(x))]^{1/4}}\exp\left(\pm\frac{i}{\hbar}\int_{x_0}^x dx'\sqrt{2m(E - V(x'))}\right), \qquad (52.8)$$
where the exponent is $\frac{i}{\hbar}s_0(x)$, and $s_0(x)$ is the on-shell action (the classical action on the classical trajectory).
We use connection formulas to transition from a solution in a classically allowed region to a solution in a forbidden one (and vice versa), through the classical turning point of the trajectory.
The transition formula from the classically allowed region $x > x_1$ to the classically forbidden region $x < x_1$ is
$$\frac{2}{[2m(E - V(x))]^{1/4}}\sin\left(\frac{1}{\hbar}\int_{x_1}^x dx'\sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right), \quad x > x_1$$
$$\to \frac{1}{[2m(V(x) - E)]^{1/4}}\exp\left(-\frac{1}{\hbar}\int_x^{x_1}dx'\sqrt{2m(V(x') - E)}\right), \quad x < x_1. \qquad (52.9)$$
For the case where the classically allowed region is to the left, $x < x_2$, and the forbidden region is to the right, $x > x_2$, we have
$$\frac{1}{[2m(V(x) - E)]^{1/4}}\exp\left(-\frac{1}{\hbar}\int_{x_2}^x dx'\sqrt{2m(V(x') - E)}\right), \quad x > x_2$$
$$\to \frac{2}{[2m(E - V(x))]^{1/4}}\sin\left(\frac{1}{\hbar}\int_x^{x_2}dx'\sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right), \quad x < x_2. \qquad (52.10)$$

52.2 Three-Dimensional Scattering in the WKB Approximation

We now apply the formalism to three-dimensional scattering. Then the quantum-corrected Hamilton–Jacobi equation arises from the Schrödinger equation, in terms of $s(\vec r)$,
$$\frac{(\vec\nabla s)^2}{2m} + V(\vec r) - E - \frac{i\hbar}{2m}\Delta s = 0. \qquad (52.11)$$
But we are interested in the $\hbar$-expansion of $s(\vec r)$, whose zeroth-order term is $s_0(\vec r)$. To obtain the equation for $s_0$ we drop the last term in (52.11), the only one depending on $\hbar$, and obtain the classical Hamilton–Jacobi equation,
$$\frac{(\vec\nabla s_0)^2}{2m} + V(\vec r) = E \equiv \frac{\hbar^2 k^2}{2m}, \qquad (52.12)$$
which defines $k$.
Then we reduce the system in the spherically symmetric case to one dimension, namely to the radial direction. We first reduce to the radial wave function by writing
$$\psi(\vec r) = R(r)Y_{lm}(\vec n_r), \qquad (52.13)$$
and then to the reduced radial wave function $\chi(r) = rR(r)$. But the radial variable has as domain half the real line, $r \in (0,\infty)$, so we have to redefine the variable to $x = \ln(kr)$, so that $x \in (-\infty, +\infty)$.
The equation for $\chi(r)$ is (19.7), namely
$$\frac{d^2}{dr^2}\chi(r) + \frac{2m}{\hbar^2}[E - V_{\rm eff}(r)]\chi(r) = 0, \qquad (52.14)$$
where the effective potential has the centrifugal term added,
$$V_{\rm eff}(r) = V(r) + \frac{\hbar^2 l(l+1)}{2mr^2}. \qquad (52.15)$$
In terms of $x = \ln(kr)$, meaning $dr = e^x dx/k$, we obtain
$$\frac{d^2}{dx^2}\chi(x) - \frac{d}{dx}\chi(x) + \frac{2m}{\hbar^2}\frac{e^{2x}}{k^2}[E - V_{\rm eff}(x)]\chi(x) = 0. \qquad (52.16)$$
But to get rid of the term with the first derivative, we need to further set
$$\chi(x) = e^{x/2}W(x). \qquad (52.17)$$
Then the equation of motion becomes
$$\frac{d^2}{dx^2}W(x) + Q^2(x)W(x) = 0, \qquad (52.18)$$
where
$$Q^2(x) \equiv \frac{2m}{\hbar^2}r^2[E - V_{\rm eff}(r)] - \frac{1}{4} = r^2\left[\frac{2m}{\hbar^2}(E - V(r)) - \frac{(l+1/2)^2}{r^2}\right], \qquad (52.19)$$
so that $Q^2$ takes the place of $E - V(x)$ in the one-dimensional case.
We can easily see that if $E > 0$ (meaning, for scattering solutions) and $V(r) \to 0$ for $r \to \infty$, then $Q^2(x) \to +\infty$ at $r \to \infty$ ($x \to +\infty$). Then, if $V(r)$ doesn't blow up faster than $1/r^2$ at $r \to 0$, $Q^2 < 0$ at $r \to 0$ ($x \to -\infty$), so there is a single turning point $x_0$, at which $Q^2(x_0) = 0$. Since $Q^2$ represents $E - V(x)$ in the one-dimensional case, we consider $-Q^2$ as $V(x) - E$. At infinity, $-Q^2 \to -\infty$ and at $r = 0$, $-Q^2 > 0$, with a classical turning point, $E - V(x) = 0$, in the middle.
The effective one-dimensional quantum-corrected Hamilton–Jacobi equation is obtained by further redefining
$$W(x) = e^{i\tilde s(x)/\hbar}, \qquad (52.20)$$
which means
$$\psi(\vec r) = e^{is(\vec r)/\hbar} = \sqrt{\frac{k}{r}}\,e^{i\tilde s(x)/\hbar}\,Y_{lm}(\vec n_r). \qquad (52.21)$$
The quantum-corrected Hamilton–Jacobi equation is
$$\left(\frac{d\tilde s}{dx}\right)^2 - \hbar^2 Q^2(x) = i\hbar\frac{d^2\tilde s(x)}{dx^2}. \qquad (52.22)$$
Expanding in $\hbar$,
$$\tilde s(x) = \tilde s_0(x) + \hbar\tilde s_1(x) + \hbar^2\tilde s_2(x) + \cdots \qquad (52.23)$$
the equation for the leading term $\tilde s_0(x)$ is obtained by dropping the right-hand side, which is proportional to $\hbar$, leaving just the classical Hamilton–Jacobi equation, with solution
$$\tilde s_0(x) = \pm\int_{x_0}^x dx'\,Q(x'). \qquad (52.24)$$
The semiclassical correction corresponds to $\tilde s_1(x)$, as usual:
$$\tilde s_1(x) = A + \frac{i}{2}\ln|Q(x)|, \qquad (52.25)$$
where $A$ is a constant, so we have the usual one-dimensional WKB solution in terms of $W(x)$,
$$W_{\rm WKB}(x) = \frac{1}{\sqrt{k(x)}}\left[C_1\exp\left(i\int_{x_0}^x dx'\,k(x')\right) + C_2\exp\left(-i\int_{x_0}^x dx'\,k(x')\right)\right], \qquad (52.26)$$
where $Q^2(x) \equiv k^2(x) > 0$, so this is in the classically allowed region, including the asymptotic region $r \to \infty$ ($x \to +\infty$).
For the classically forbidden region, $Q^2(x) \equiv -\kappa^2(x) < 0$, inside the range of the potential, we find
$$W_{\rm WKB}(x) = \frac{1}{\sqrt{\kappa(x)}}\left[D_1\exp\left(-\int_{x_0}^x dx'\,\kappa(x')\right) + D_2\exp\left(+\int_{x_0}^x dx'\,\kappa(x')\right)\right]. \qquad (52.27)$$
At the effective one-dimensional turning point of the classical trajectory, $Q^2(x) = 0$, we have the continuity formula, transitioning from the inside region (inside the range of the potential) to the outside region (including the asymptotic region),
$$W_{\rm WKB}(x) = \frac{1}{\sqrt{\kappa(x)}}\exp\left(-\int_x^{x_0}dx'\,\kappa(x')\right) \to \frac{2}{\sqrt{k(x)}}\sin\left(\frac{\pi}{4} + \int_{x_0}^x dx'\,k(x')\right). \qquad (52.28)$$
To describe scattering, we need to consider physical solutions in the $r \to \infty$ limit, which are in the classically allowed region. For the reduced radial wave function $\chi(r) = \sqrt{kr}\,W$, we go back to the $r$ dependence, $dx = dr/r$, implying $k(x)\,dx = [Q(r)/r]\,dr$, obtaining
$$\chi_{\rm WKB}(r) = \left[1 - \frac{2mV(r)}{\hbar^2 k^2} - \frac{(l+1/2)^2}{r^2k^2}\right]^{-1/4}\sin\left[\frac{\pi}{4} + \int_{r_0}^r dr'\sqrt{k^2 - \frac{2m}{\hbar^2}V(r') - \frac{(l+1/2)^2}{r'^2}}\,\right]. \qquad (52.29)$$
But at $r \to \infty$, for which the integral in the exponent is extended to $\int_{r_0}^\infty$, the argument of the sine is $kr - l\pi/2 + \delta_l^{\rm WKB}$, leading to a phase shift in the WKB approximation,
$$\delta_l^{\rm WKB}(k) = \frac{\pi}{4} + \frac{l\pi}{2} - kr_0 + \int_{r_0}^\infty dr\left[\sqrt{k^2 - \frac{2m}{\hbar^2}V(r) - \frac{(l+1/2)^2}{r^2}} - k\right]. \qquad (52.30)$$
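Formula (52.30) is straightforward to evaluate numerically. The sketch below (units $\hbar = m = 1$, with a hypothetical attractive Gaussian potential) also checks the free case, $V = 0$, where the turning point is $r_0 = (l+1/2)/k$ and the phase shift vanishes exactly:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

# WKB phase shift (52.30); V is a callable potential, hbar = m = 1.
def delta_wkb(l, k, V, rmax=100.0):
    a = l + 0.5
    Q2 = lambda r: k**2 - 2.0 * V(r) - a**2 / r**2
    r0 = brentq(Q2, 1e-9, rmax)                     # classical turning point
    integrand = lambda r: np.sqrt(max(Q2(r), 0.0)) - k
    I, _ = quad(integrand, r0, np.inf, limit=400)
    return np.pi / 4 + l * np.pi / 2 - k * r0 + I

free = delta_wkb(1, 2.0, lambda r: 0.0)             # exactly zero analytically
attractive = delta_wkb(1, 2.0, lambda r: -2.0 * np.exp(-r**2))
print(free, attractive)   # free ≈ 0; attractive potential gives delta > 0
```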
52.3 The Eikonal Approximation

We now consider another version of the WKB approximation, called the eikonal approximation. Its region of validity is a subset of that of the WKB approximation, namely the region where $V(x)$ varies little over the de Broglie wavelength $\lambda$. This implies the WKB condition,
$$\frac{\delta\lambda}{\lambda} \ll 1. \qquad (52.31)$$
We further refine the domain of the approximation by requiring that $E \gg |V|$ (the actual value of the potential, not just its variation, is small with respect to the energy). This means we are at high energy, which is different from the case of the Born approximation, which is valid at both low and high energies.
However, as a WKB approximation, we are within geometrical optics. That means that a wave is
replaced by a light ray path, i.e., the integral over all paths (the path integral) is restricted to just the
classical path. Indeed, that restricts s(r ) to the classical on-shell action s0 (r ), where x(t) is on the
classical trajectory, which solves the classical Hamilton–Jacobi equation.
Instead of reducing the three-dimensional problem to a one-dimensional problem as in the WKB
method above, we now make a further approximation: that the classical trajectory is approximately a
straight path, which is true at high energies, since then there is only a small deflection. This makes the
problem calculable, since to actually find the classical trajectory as a solution of the Hamilton–Jacobi
equation is potentially very difficult.
The approximation defined above is called the eikonal approximation, coming from the Greek
“eikon”, meaning icon or image, since in the geometric optics case the light rays remaining straight
through the interaction means that we obtain an undistorted “image” of an emitting object, after the
interaction.
Since the classical trajectory is a straight line, we can define it as the Oz axis, where the origin O is
the projection onto the axis of the central point of the spherically symmetric potential; see Fig. 52.1.
The distance from the central point of the potential to its projection $O$ is the impact parameter $b$. Since $r = |\vec r| = \sqrt{b^2 + z^2}$, the Hamilton–Jacobi equation reduces to a one-dimensional equation for the parameter $z$ of the classical trajectory, giving
$$\left(\frac{1}{\hbar}\frac{ds_0}{dz}\right)^2 = k^2 - \frac{2m}{\hbar^2}V(r) = k^2 - \frac{2m}{\hbar^2}V(\sqrt{b^2 + z^2}). \qquad (52.32)$$
It integrates to
$$\frac{s_0}{\hbar} = \int_{-\infty}^z dz'\sqrt{k^2 - \frac{2m}{\hbar^2}V(\sqrt{b^2 + z'^2})} + C, \qquad (52.33)$$

Figure 52.1 Classical trajectory in the eikonal approximation, and the impact parameter.
where $C$ is a constant, which we can choose such that $s_0/\hbar \to kz$ when we turn off the potential, $V \to 0$. Then, since
$$\sqrt{k^2 - \frac{2mV}{\hbar^2}} = k\sqrt{1 - \frac{V}{E}} \simeq k\left(1 - \frac{V}{2E}\right) = k - \frac{mV}{\hbar^2 k}, \qquad (52.34)$$
we find
$$\frac{s_0}{\hbar} = kz + \int_{-\infty}^z dz'\left[\sqrt{k^2 - \frac{2m}{\hbar^2}V(\sqrt{b^2 + z'^2})} - k\right] \simeq kz - \frac{m}{\hbar^2 k}\int_{-\infty}^z dz'\,V(\sqrt{b^2 + z'^2}). \qquad (52.35)$$

Then we obtain a scattering wave function for the argument $\vec r = \vec b + z\vec e_z$ in the eikonal approximation,
$$\psi_+(\vec r) = \psi(\vec b + z\vec e_z) = e^{ikz}\exp\left(-\frac{im}{\hbar^2 k}\int_{-\infty}^z dz'\,V(\sqrt{b^2 + z'^2})\right), \qquad (52.36)$$
where $e^{ikz} = e^{i\vec k\cdot\vec r}$.




To obtain the scattering amplitude $f$ we substitute the above $\psi_+$ into the right-hand side of (46.71), obtained by substituting the free Green's function $G_0^{(+)}(\vec r, \vec r\,')$ in the Lippmann–Schwinger equation, namely
$$f_k^{(+)}(\vec n_r, \vec n_k) = f^{(+)}(\vec k', \vec k) = -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{-i\vec k'\cdot\vec r\,'}V(r')\psi_+(\vec r\,') = -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{-i\vec k'\cdot\vec r\,'}V(\sqrt{b^2 + z'^2})\,e^{i\vec k\cdot\vec r\,'}\exp\left[-\frac{im}{\hbar^2 k}\int_{-\infty}^{z'}d\tilde z\,V(\sqrt{b^2 + \tilde z^2})\right], \qquad (52.37)$$
where, if we replace the last exponential $\exp[\ldots]$ by 1, we obtain the Born amplitude $f^{(1)}(\vec k', \vec k)$. This means that the eikonal approximation resums some interactions (each carrying a power of the potential $V$), replacing $V \times 1$ with $V \times \exp[\ldots V] \sim V \times \sum_n(\ldots V)^n$.
To do the integral, we use cylindrical coordinates, $z$ and $b$, together with the angle $\phi$ that rotates the trajectory around the center of the potential, so $d^3r' = dz\,db\,b\,d\phi$. The deflection turns $\vec k$ into $\vec k'$, but the deflection angle $\theta$ is very small and the modulus of $\vec k$ is unchanged, which means that $\vec k - \vec k'$ is perpendicular to $\vec k$, which is proportional to $\vec e_z$. Thus, since $(\vec k - \vec k')\cdot\vec e_z = 0$, we find
$$(\vec k - \vec k')\cdot\vec r\,' = (\vec k - \vec k')\cdot(\vec b + z'\vec e_z) = (\vec k - \vec k')\cdot\vec b. \qquad (52.38)$$
However, $\vec k - \vec k'$ has modulus $k\theta$ (since $\theta \ll 1$), and its direction makes an angle $\phi$ with the impact parameter vector $\vec b$ (see Fig. 52.2), so finally
$$(\vec k - \vec k')\cdot\vec r\,' \simeq k\theta\,(b\cos\phi). \qquad (52.39)$$
Then the scattering amplitude is
$$f^{(+)}(\vec k', \vec k) = -\frac{2m}{4\pi\hbar^2}\int_0^\infty db\,b\int_0^{2\pi}d\phi\,e^{-ikb\theta\cos\phi}\int_{-\infty}^{+\infty}dz\,V\exp\left(-\frac{im}{\hbar^2 k}\int_{-\infty}^z dz'\,V\right). \qquad (52.40)$$

k
θ k−k
b k
θ

Figure 52.2 Geometry of scattering by a small angle θ, with impact parameter b.

Using the formulas
$$\int_0^{2\pi}d\phi\,e^{-ikb\theta\cos\phi} = 2\pi J_0(kb\theta),$$
$$\int_{-\infty}^{+\infty}dz\,V\exp\left(-\frac{im}{\hbar^2 k}\int_{-\infty}^z dz'\,V\right) = \int_{-\infty}^{+\infty}dz\,\frac{d}{dz}\left[\frac{i\hbar^2 k}{m}\exp\left(-\frac{im}{\hbar^2 k}\int_{-\infty}^z dz'\,V\right)\right], \qquad (52.41)$$
we obtain
$$f^{(+)}(\vec k', \vec k) = -ik\int_0^\infty db\,b\,J_0(kb\theta)\left(e^{2i\Delta(b)} - 1\right), \qquad (52.42)$$
where
$$\Delta(b) \equiv -\frac{m}{2\hbar^2 k}\int_{-\infty}^{+\infty}dz\,V(\sqrt{b^2 + z^2}). \qquad (52.43)$$
If we have a finite-range potential, with range $r_0$, then for $b > r_0$ we have $\Delta(b) = 0$, so the integrand vanishes, leading to
$$f^{(+)}(\theta) = -ik\int_0^{r_0}db\,b\,J_0(kb\theta)\left(e^{2i\Delta(b)} - 1\right). \qquad (52.44)$$
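For a potential whose $z$-integral is elementary, (52.43) can be checked directly; a sketch with a hypothetical Gaussian potential (units $\hbar = m = 1$):

```python
import numpy as np
from scipy.integrate import quad

# Eikonal phase (52.43) for V(r) = V0*exp(-r^2/a^2); the z-integral is a
# Gaussian integral, giving Delta(b) = -(V0*a*sqrt(pi)/(2*k))*exp(-b^2/a^2).
V0, a, k = 0.3, 1.0, 5.0

def Delta(b):
    V = lambda z: V0 * np.exp(-(b**2 + z**2) / a**2)
    return -quad(V, -np.inf, np.inf)[0] / (2.0 * k)

b = 0.7
closed_form = -(V0 * a * np.sqrt(np.pi) / (2.0 * k)) * np.exp(-b**2 / a**2)
print(Delta(b), closed_form)   # the two agree
```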

This amplitude relates to the optical theorem in an interesting way. To see this, we take the imaginary part of the forward amplitude, obtaining
$$\mathrm{Im}\,f^{(+)}(\theta = 0) = -k\,\mathrm{Re}\int_0^{r_0}db\,b\,J_0(0)\left(e^{2i\Delta(b)} - 1\right) = 2k\int_0^{r_0}db\,b\sin^2\Delta(b). \qquad (52.45)$$
For the optical theorem to be true, the total cross section must be
$$\sigma_{\rm tot} = 8\pi\int_0^{r_0}db\,b\sin^2\Delta(b). \qquad (52.46)$$
This can be obtained from the partial wave expansion.


To see that, we first note that the eikonal approximation is a high-energy approximation, so the maximum angular momentum is large. The angular momentum is $L = bp$, and since $p = \hbar k$, we find $l = bk$. Then $l_{\max} = kb_{\max} = kr_0 \gg 1$, and thus the sum over $l$ becomes an integral over $kb$,
$$\sum_{l=0}^{l_{\max}} \leftrightarrow \int_0^{r_0}k\,db, \qquad (52.47)$$
and the phase shift turns into $\Delta(b)$,
$$\delta_l(k) \leftrightarrow \Delta(b)\big|_{b = l/k}. \qquad (52.48)$$
Thus if $b > b_{\max} = r_0$, then $\delta_l = 0 \leftrightarrow \Delta(b) = 0$. Since also $l \gg 1$ and $\theta \ll 1$, we have
$$P_l(\cos\theta) \simeq J_0(l\theta) = J_0(kb\theta), \qquad (52.49)$$
and also $2l + 1 \simeq 2l \leftrightarrow 2kb$. Then the amplitude, from the partial wave expansion formula, is, since $a_l(k) = (e^{2i\delta_l(k)} - 1)/(2ik)$,
$$f(\theta) \leftrightarrow k\int_0^{r_0}db\,\frac{2kb}{2ik}\left(e^{2i\Delta(b)} - 1\right)J_0(kb\theta) = -ik\int_0^{r_0}db\,b\,J_0(kb\theta)\left(e^{2i\Delta(b)} - 1\right), \qquad (52.50)$$
which is the eikonal approximation formula, obtained from the partial wave expansion.
The total cross section is now obtained similarly,
$$\sigma_{\rm tot} = \frac{4\pi}{k^2}\sum_{l=0}^{l_{\max}}(2l+1)\sin^2\delta_l(k) \leftrightarrow \frac{4\pi}{k^2}\int_0^{r_0}k\,db\,2bk\sin^2\Delta(b) = 8\pi\int_0^{r_0}db\,b\sin^2\Delta(b), \qquad (52.51)$$
and this formula is consistent with the optical theorem.
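The dictionary (52.47)–(52.51) can be verified numerically. A sketch with a hypothetical weak eikonal phase at high energy (units $\hbar = m = 1$), using $b = (l+1/2)/k$ so that $2l+1 = 2kb$ exactly:

```python
import numpy as np
from scipy.integrate import quad

# Compare the partial-wave sum with its eikonal limit for a hypothetical
# weak phase Delta(b) = -0.2*exp(-b^2), at high energy k = 20.
k = 20.0
Delta = lambda b: -0.2 * np.exp(-b**2)

l = np.arange(0, 400)
b_l = (l + 0.5) / k                       # midpoint rule: 2l + 1 = 2*k*b_l
sigma_pw = (4 * np.pi / k**2) * np.sum((2 * l + 1) * np.sin(Delta(b_l))**2)

sigma_eik = 8 * np.pi * quad(lambda b: b * np.sin(Delta(b))**2, 0, np.inf)[0]

print(sigma_pw, sigma_eik)    # agree to high accuracy at large k
```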

52.4 Coulomb Scattering and the Semiclassical Approximation

We will try to apply the formalism of the semiclassical WKB approximation to the case of Coulomb
scattering.
Since
$$\sqrt{k^2 - \frac{2mV(r)}{\hbar^2} - \frac{(l+1/2)^2}{r^2}} \simeq k\left[1 - \frac{V(r)}{2E} - \frac{\hbar^2(l+1/2)^2}{4mEr^2}\right], \qquad (52.52)$$
we find
$$\int_{r_0}^\infty dr\left[\sqrt{k^2 - \frac{2mV(r)}{\hbar^2} - \frac{(l+1/2)^2}{r^2}} - k\right] \simeq -\frac{k}{2E}\int_{r_0}^\infty dr\,V(r) - \frac{\hbar^2 k(l+1/2)^2}{4mE}\int_{r_0}^\infty \frac{dr}{r^2}. \qquad (52.53)$$
But the integral $\int_{r_0}^\infty dr\,V(r)$ does not make sense in the Coulomb case $V(r) = A/r$ (it diverges logarithmically at $r \to \infty$); it converges only if $V = A/r^\alpha$ with $\alpha > 1$.
Instead of using the WKB approximation for the radial reduction, for the Coulomb potential $V = e_0^2/r$ (for the interaction of two electrons, i.e., negative charges, unlike in the hydrogenoid atom), we start directly from the Schrödinger equation,
$$\Delta\psi + \left[k^2 - \frac{2me_0^2}{\hbar^2 r}\right]\psi = 0. \qquad (52.54)$$
Making the change of variables
$$\psi(\vec r) = e^{ikr}F(\vec r), \qquad (52.55)$$
the Schrödinger equation becomes
$$\Delta F + 2ik\frac{\vec r}{r}\cdot\vec\nabla F + \left(2ik - \frac{2me_0^2}{\hbar^2}\right)\frac{1}{r}F = 0. \qquad (52.56)$$
However, instead of reducing the problem to the radial coordinate, we will use Cartesian coordinates $(x, y, z)$ to reduce the problem to the variable $u = k(r - z)$. Therefore our ansatz for the scattering solution of the Schrödinger equation is $F = F(u)$. Then we find
$$\Delta F = \frac{2ku}{r}\frac{d^2F}{du^2} + \frac{2k}{r}\frac{dF}{du}, \qquad \frac{\vec r}{r}\cdot\vec\nabla F = k\left(1 - \frac{z}{r}\right)\frac{dF}{du} = \frac{u}{r}\frac{dF}{du}. \qquad (52.57)$$
This reduces the Schrödinger equation to
$$\frac{2k}{r}\,u\frac{d^2F}{du^2} + \frac{2k}{r}(1 + iu)\frac{dF}{du} + \frac{2k}{r}\left(i - \frac{me_0^2}{\hbar^2 k}\right)F = 0. \qquad (52.58)$$
Defining $v = -iu$, the equation becomes
$$v\frac{d^2F}{dv^2} + (1 - v)\frac{dF}{dv} - \left(1 + i\frac{me_0^2}{\hbar^2 k}\right)F = 0, \qquad (52.59)$$
which is the equation for the confluent hypergeometric function $_1F_1(a, b; v)$ (see (18.41)),
$$v\frac{d^2F}{dv^2} + (b - v)\frac{dF}{dv} - aF = 0; \qquad (52.60)$$
this means that the solution has $b = 1$, $a = 1 + i\alpha$, where
$$\alpha = \frac{me_0^2}{\hbar^2 k}. \qquad (52.61)$$
But the confluent hypergeometric function of an imaginary variable has the asymptotics
$$_1F_1(a, b; -iu) \simeq \frac{\Gamma(b)}{\Gamma(a)}e^{i\pi(b-a)/2}\frac{e^{-iu}}{u^{b-a}} + \frac{\Gamma(b)}{\Gamma(b-a)}e^{-i\pi a/2}\frac{1}{u^a}. \qquad (52.62)$$
In our case, we have
$$_1F_1(1 + i\alpha, 1; -iu) \simeq \frac{e^{\pi\alpha/2}}{\Gamma(1 + i\alpha)}\frac{e^{-iu}}{u^{-i\alpha}} + \frac{1}{\Gamma(-i\alpha)}\frac{e^{-i\pi(1+i\alpha)/2}}{u^{1+i\alpha}} = \frac{e^{\pi\alpha/2}}{\Gamma(1 + i\alpha)}\left[e^{-iu + i\alpha\ln u} + e^{-i\pi/2}\frac{\Gamma(1 + i\alpha)}{\Gamma(-i\alpha)}\frac{e^{-i\alpha\ln u}}{u}\right], \qquad (52.63)$$
where we have used $1/u^{-i\alpha} = e^{+i\alpha\ln u}$ and $1/u^{1+i\alpha} = u^{-1}e^{-i\alpha\ln u}$.
Therefore the full wave function is (since $\psi = e^{ikr}\,_1F_1(\ldots)$)
$$\psi = \frac{e^{\pi\alpha/2}}{\Gamma(1 + i\alpha)}\left(e^{ikz + i\alpha\ln[k(r-z)]} - i\,\frac{\Gamma(1 + i\alpha)}{\Gamma(-i\alpha)}\,\frac{e^{ikr - i\alpha\ln[k(r-z)]}}{k(r-z)}\right). \qquad (52.64)$$
In the first term, $e^{ikz} = e^{i\vec k\cdot\vec r}$ is the incoming plane wave, and there is an extra slowly varying phase.

In the second term, using $\Gamma(1 + i\alpha) = i\alpha\Gamma(i\alpha)$ and $r - z = r(1 - \cos\theta) = 2r\sin^2(\theta/2)$ and factorizing $e^{ikr}/r$, we obtain the scattering amplitude
$$f(\theta) = \frac{\Gamma(i\alpha)}{\Gamma(-i\alpha)}\frac{\alpha}{2k\sin^2(\theta/2)}\,e^{-i\alpha\ln[2kr\sin^2(\theta/2)]}. \qquad (52.65)$$
However, since $[\Gamma(i\alpha)]^* = \Gamma(-i\alpha)$, the first factor is a phase, and the differential cross section is
$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2 = \left(\frac{\alpha}{2k\sin^2(\theta/2)}\right)^2 = \left(\frac{e_0^2 m}{2\hbar^2 k^2\sin^2(\theta/2)}\right)^2 = \frac{e_0^4}{4m^2v^4}\frac{1}{\sin^4(\theta/2)}. \qquad (52.66)$$
This is nothing other than the classical Rutherford formula, even though we have obtained an exact formula for (nonrelativistic) quantum scattering, since we haven't used any approximation. We have seen before that in the Born approximation we can find the classical result. So any semiclassical corrections to the classical result vanish, and in fact all quantum corrections vanish. Thus, Coulomb scattering is a very special case.
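The remark that the Born approximation already yields the classical result can be made concrete: the first Born amplitude for a screened Coulomb (Yukawa) potential $V = e_0^2 e^{-\mu r}/r$ is $f^{(1)} = -2me_0^2/[\hbar^2(q^2 + \mu^2)]$ with $q = 2k\sin(\theta/2)$, and as $\mu \to 0$ its square reproduces (52.66). A numerical sketch ($\hbar = 1$, illustrative parameters):

```python
import numpy as np

# |f_Born|^2 for a weakly screened Coulomb potential vs. Rutherford (52.66).
m, e0sq, k, theta = 1.0, 1.0, 3.0, 0.8
q = 2 * k * np.sin(theta / 2)
alpha = m * e0sq / k                    # hbar = 1
rutherford = (alpha / (2 * k * np.sin(theta / 2)**2))**2

for mu in (1.0, 0.1, 0.001):
    born = (2 * m * e0sq / (q**2 + mu**2))**2
    ratio = born / rutherford
    print(mu, ratio)                    # ratio -> 1 as mu -> 0
```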

Important Concepts to Remember

• We can use the WKB approximation also for three-dimensional scattering, using as before the transition formulas from the classically allowed region to the classically forbidden region, and reducing the three-dimensional Schrödinger equation to a one-dimensional Schrödinger equation with domain $\mathbb{R}$, by writing $x = \ln(kr)$, $\chi(x) = e^{x/2}W(x)$, and the equation $\frac{d^2}{dx^2}W(x) + Q^2(x)W(x) = 0$.
• The WKB approximation for $W(x) = \exp\left(\frac{i}{\hbar}\tilde s(x)\right)$ is obtained from the quantum-corrected Hamilton–Jacobi equation for $\tilde s(x)$, leading to an expression for $\chi_{\rm WKB}(r)$ that at infinity goes like $\sin(kr - l\pi/2 + \delta_l^{\rm WKB})$, giving a formula for the phase shift in the WKB approximation.
• The WKB approximation is a geometrical optics approximation, of classical ray paths, but one can add to it an eikonal approximation, of straight ray paths, since the classical paths (solutions of the Hamilton–Jacobi equation) are difficult to obtain.
• The eikonal approximation, a high-energy approximation, resums some Born series terms, and gives, for a finite-range potential with range $r_0$, $f^+ = -ik\int_0^{r_0}db\,b\,J_0(kb\theta)[e^{2i\Delta(b)} - 1]$, with $\Delta(b) = -(m/2\hbar^2 k)\int_{-\infty}^{+\infty}dz\,V(\sqrt{b^2 + z^2})$ and $\sigma_{\rm tot} = 8\pi\int_0^{r_0}db\,b\sin^2\Delta(b)$.
• The exact Coulomb scattering calculation reproduces the Rutherford formula, so there are no WKB approximation corrections, in fact no quantum corrections of any kind, in this nonrelativistic case.

Further Reading
See [2] and [1].
Exercises

(1) Consider a one-dimensional system with potential V (x) = V0 /(x 2 + a2 ). Write the WKB
approximation for the wave function for scattering, for energy 0 < E < V0 /a2 .
(2) Consider a three-dimensional system with central potential V (r) = V0 /(r 2 + a2 ). Write the WKB
approximation for the wave function for scattering of given angular momentum l, for energy
0 < E < V0 /a2 .
(3) For the case in exercise 2, calculate the cross section as a formal sum over l.
(4) For the same potential as above, V (r) = V0 /(r 2 + a2 ), calculate Δ(b) for the eikonal
approximation. If the potential turns off (V (r) = 0) for r ≥ r 0 , calculate Δ(b) and, if r 0 is
large, calculate an approximate value for the scattering amplitude in the eikonal approximation.
(5) For a Yukawa potential, V (r) = V0 e−μr /r, write an approximate value for the cross section in
the eikonal approximation.
(6) If we have a Coulomb potential that turns off (V (r) = 0) for a large r ≥ r 0 , calculate the total
(approximate) cross section in the eikonal approximation, and compare with the exact value
from the Rutherford formula.
(7) Are there any resonances for Coulomb scattering? Check that the analytical properties of Sl (k)
in this case are satisfied.
53 Inelastic Scattering

In this chapter we study inelastic scattering, which means there are energy, momentum, and maybe
even particles leaking out of the system into another system (or “sink”). A standard example is the
scattering of a fundamental particle (such as an electron) off a compound particle (such as an atom),
which can absorb energy and get excited to a different state. After analyzing this from the point of
view of an unknown other system, and that of a known other system, we generalize to scattering in
the multi-channel case, and also the scattering of identical particles.

53.1 Generalizing Elastic Scattering from Unitarity Loss

In the first generalization we abandon unitarity, which means that there is a probability leak into an unknown other system. We previewed this in Chapter 49. Then, everything follows via the partial wave expansion, but the S-matrix is not unitary, $\hat S^\dagger\hat S \neq 1$. In terms of partial waves,
$$S_l^* S_l \neq 1, \qquad (53.1)$$
which means that $S_l$ is not a phase factor. Or, defining in the same way $S_l = e^{2i\delta_l}$, it means that $\delta^* \neq \delta$, i.e., $\delta$ has a nonzero imaginary part.
Since $S_l$ is the ratio of the "out" wave to the "in" wave,
$$\psi \sim S_l(k)\frac{e^{ikr}}{r} + \frac{e^{-ikr}}{r}, \qquad (53.2)$$
and the probability leaks into something else between the in and the out wave functions, we must have $|S_l(k)| < 1$, or
$$\mathrm{Im}\,\delta_l \geq 0 \;\Rightarrow\; |S_l| = e^{-2\,\mathrm{Im}\,\delta_l(k)} \leq 1. \qquad (53.3)$$
In the eikonal approximation,
$$\delta_l(k) \leftrightarrow \Delta(b)\big|_{b = l/k}, \qquad S_l(k) = e^{2i\delta_l(k)} \leftrightarrow e^{2i\Delta(b)}, \qquad (53.4)$$
where $\mathrm{Im}\,\Delta(b) \geq 0$.
where Im Δ(b) ≥ 0.
The total cross section expands into partial waves,
$$\sigma_{\rm tot} = \sum_{l=0}^{l_{\max}}\sigma_l, \qquad (53.5)$$
and for $\sigma_l$ we have a more general formula, previewed in (49.13),
$$\sigma_l = \frac{\pi}{k^2}(2l+1)\left|e^{2i\delta_l(k)} - 1\right|^2 = \frac{\pi}{k^2}(2l+1)\left|S_l(k) - 1\right|^2, \qquad (53.6)$$
corresponding in the eikonal approximation to
$$\sigma_l \leftrightarrow \frac{\pi}{k^2}\,2kb\left|e^{2i\Delta(b)} - 1\right|^2. \qquad (53.7)$$
The sum over $l$ turns into an integral, $\sum_{l=0}^{l_{\max}} \to \int_0^{r_0}k\,db$, so
$$\sigma_{\rm tot} = 2\pi\int_0^{r_0}db\,b\left|e^{2i\Delta(b)} - 1\right|^2. \qquad (53.8)$$

We can then define a maximally inelastic scattering, the "black disk" eikonal, which means that inside a disk of radius $r_0$ we have complete absorption, $S_l = 0$, i.e., a "black" region. This is achieved by setting
$$\mathrm{Im}\,\Delta(b) = +\infty, \quad \text{for } b < r_0, \qquad (53.9)$$
and, as always, $S_l = 1$ (there is no interaction or deflection), or $\Delta(b) = 0$, for $b > r_0$.
Then the total cross section of the black disk eikonal is
$$\sigma_{\rm tot} = \pi r_0^2, \qquad (53.10)$$
which is just the classical, geometrical, cross section. Note that this is the total inelastic cross section.
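This geometric result can be cross-checked against the partial-wave formulas (53.5)–(53.6); a minimal sketch with illustrative values of $k$ and $r_0$ ($\hbar = m = 1$):

```python
import numpy as np

# Black disk: S_l = 0 for l < k*r0 and S_l = 1 beyond, so the partial-wave
# sum (53.5)-(53.6) reproduces pi*r0^2 at large k*r0.
k, r0 = 50.0, 2.0
lmax = int(k * r0)             # l = k*b runs up to k*r0

l = np.arange(lmax)
S_l = 0.0                      # full absorption inside the disk
sigma_sum = (np.pi / k**2) * np.sum((2 * l + 1) * abs(S_l - 1)**2)

print(sigma_sum, np.pi * r0**2)   # they agree
```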
On the other hand, we can construct the amplitude as we did in the previous chapter. Defining $\vec q = \vec k' - \vec k$, with modulus $q = |\vec q| \simeq k\theta$, we write $-i(k\theta)b\cos\phi$ as $-i\vec q\cdot\vec b$, where the vector parameter $\vec b$ lies in the two-dimensional plane transverse to the initial direction $\vec k$. Then the eikonal amplitude becomes
$$f(\theta, r_0) = -\frac{ik}{2\pi}\int_0^{r_0}db\,b\int_0^{2\pi}d\phi\,e^{-i\vec q\cdot\vec b}\left(e^{2i\Delta(b)} - 1\right), \qquad (53.11)$$
where
$$\int_0^{2\pi}d\phi\,e^{-i\vec q\cdot\vec b} = 2\pi J_0(qb) = 2\pi J_0(k\theta b). \qquad (53.12)$$
Taking the imaginary part of the forward ($\theta = 0$) amplitude for the black disk eikonal ($e^{2i\Delta(b\le r_0)} = 0$), we obtain
$$\mathrm{Im}\,f(0, r_0) = k\int_0^{r_0}db\,b = \frac{kr_0^2}{2}. \qquad (53.13)$$
If the optical theorem is to be satisfied, the total cross section must be
$$\sigma_{\rm tot} = \frac{4\pi}{k}\,\mathrm{Im}\,f = 2\pi r_0^2. \qquad (53.14)$$
However, if we consider only inelastic scattering then unitarity is violated, as we said, since $|S_l| < 1$, and thus the optical theorem (which follows from unitarity) is also violated; the total cross section from the optical theorem is larger. In fact, the eikonal approximation holds at high energy, where we have already seen that the result is $2\pi r_0^2$, comprising the geometrical cross section and the shadow one,
$$\sigma_{\rm tot} = 2\pi r_0^2 = \pi r_0^2 + \pi r_0^2 = \sigma_{\rm geometric} + \sigma_{\rm shadow}. \qquad (53.15)$$
The inelastic cross section is the geometric part of the above.


The eikonal amplitude can be further rewritten as


 r0  2π
ik
dφe−iq ·b (e2iΔ(b) − 1)

f (θ, r 0 ) = − db b
2π 0 0
 (53.16)
ik
d 2b e−iq ·b (e2iΔ(b) − 1).

=−

Calculating the black disk (e2iΔ(b ≤r0 ) = 0) eikonal amplitude at arbitrary θ, we obtain
 r0
ik
f (θ, r 0 ) = db b2π J0 (qb)
2π 0
 qr0 (53.17)
ik ikr 0
= 2 dx x J0 (x) = J1 (qr 0 ),
q 0 q
a
where we have used 0 dx x J0 (x) = a J1 (a). Taking the limit q → 0 by using the limit J1 (x)  x/2
for x → 0, we obtain
ikr 02
f (θ → 0, r 0 ) = . (53.18)
2
In order to obtain the correct inelastic cross section, we use the optical theorem but divide by an extra
factor of 2, since the optical theorem gives the total cross section, which is in equal parts inelastic and shadow,
as explained above:
\[ \sigma_{tot,inelastic} = \frac{1}{2}\,\frac{4\pi}{k}\,\mathrm{Im}\, f(0, r_0) = \pi r_0^2. \tag{53.19} \]
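These black-disk results can be checked independently in the partial-wave representation. A minimal numerical sketch, assuming the standard partial-wave formulas σ_inel = (π/k²) Σ_l (2l+1)(1 − |S_l|²) and σ_tot = (2π/k²) Σ_l (2l+1)(1 − Re S_l) (standard scattering-theory results, not derived in this section), with the black disk modeled as S_l = 0 for l ≤ kr_0:

```python
import numpy as np

# Black disk in the partial-wave picture: full absorption (S_l = 0)
# for impact parameters b = l/k below r0, i.e., for l <= k*r0.
k, r0 = 200.0, 1.0                      # high energy, k*r0 >> 1
l = np.arange(int(k * r0) + 1)
S = np.zeros(len(l))                    # S_l = 0: fully absorbed waves

# sigma_inel -> pi r0^2 (geometric), sigma_tot -> 2 pi r0^2
sigma_inel = (np.pi / k**2) * np.sum((2 * l + 1) * (1 - np.abs(S)**2))
sigma_tot = (2 * np.pi / k**2) * np.sum((2 * l + 1) * (1 - S))

print(sigma_inel / (np.pi * r0**2))     # close to 1
print(sigma_tot / (np.pi * r0**2))      # close to 2
```

At large kr_0 the sums approach the geometric and total values, reproducing (53.15) and (53.19).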

53.2 Inelastic Scattering Due to Target Structure

One possible way to have inelastic scattering is for the target to have structure, leading to excited
states that can be reached by absorbing energy from the projectile.
That is, the projectile is particle A, and the target is system B, with internal structure. The
standard example is the scattering of an electron (particle A) off an atom (system B) with excited
internal states. Then the state of the combined system AB (electron plus atom) is the product of the
momentum state of the particle A and the state of the target B,

\[ |\vec k, n_0\rangle = |\vec k\rangle_A \otimes |n_0\rangle_B, \tag{53.20} \]

with the wave function factorizing into a free particle wave function for A times a multiparticle wave
function for the N components of B,

\[ \psi \sim e^{i\vec k\cdot\vec x}\,\psi_{n_0}(\vec r_1, \ldots, \vec r_N). \tag{53.21} \]

In the case of the atom we have N = Z, the electric charge number, which is the same as the number
of electrons in the atom.
The transition between the initial state and the final state is

\[ |\vec k, n_0\rangle = |\vec k\rangle_A \otimes |n_0\rangle_B \;\to\; |\vec k', n\rangle = |\vec k'\rangle_A \otimes |n\rangle_B. \tag{53.22} \]



The final state wave function is
\[ \psi \sim e^{i\vec k'\cdot\vec x}\,\psi_n(\vec r_1, \ldots, \vec r_N). \tag{53.23} \]

The elastic case corresponds to n = n0 , otherwise we have inelastic scattering.


In Chapter 46, we found the differential cross section as
\[ \frac{d\sigma}{d\Omega} = \frac{r^2\,|j_{scatt}|}{|j_{inc}|} = \frac{r^2\,(\hbar k/m)(|f|^2/r^2)}{\hbar k/m} = |f|^2. \tag{53.24} \]

We want to generalize the final formula by calculating the change in the initial and scattered currents.
In the scattered current the energy, and therefore the wave vector k, of the projectile is changed by
exciting the internal state of the target, so k′ ≠ k. But the projectile itself does not change, so m stays
fixed. The more general formula is then
\[ \frac{d\sigma}{d\Omega}(|n_0\rangle \to |n\rangle) = \frac{r^2\, j_{scatt}}{j_{inc}} = \frac{k'}{k}\,|f|^2. \tag{53.25} \]

If we restrict ourselves to the Born approximation, namely first-order time-dependent perturbation
theory (Fermi's golden rule), calculated in Chapter 47 as
\[ f^{(1)}(\vec k', n; \vec k, n_0) = \frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0\rangle, \tag{53.26} \]
this leads to the first-order differential cross section,
\[ \frac{d\sigma^{(1)}}{d\Omega} = \frac{k'}{k}\left|\frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0\rangle\right|^2. \tag{53.27} \]

In general (outside the Born approximation), the amplitude is written as
\[ f(\vec k', n; \vec k, n_0) = \frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0+\rangle, \tag{53.28} \]
and the differential cross section is
\[ \frac{d\sigma}{d\Omega} = \frac{k'}{k}\left|\frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0+\rangle\right|^2. \tag{53.29} \]

The potential contains the interaction of the projectile A with all the components of the target B.
In the particular case of the atom, we have the interaction of the incoming electron with the nucleus
of charge +Ze situated at r⃗ = 0, and the N = Z electrons, situated at r⃗_i,
\[ V = -\frac{Ze_0^2}{r} + \sum_{i=1}^{N}\frac{e_0^2}{|\vec r - \vec r_i|}. \tag{53.30} \]

In the Born approximation, the amplitude for the elastic scattering becomes explicitly
\[ f^{(1,+)}(\vec k', \vec k) = \frac{2m}{4\pi\hbar^2}\int d^3r'\, e^{i\vec r'\cdot(\vec k - \vec k')}\, V(\vec r'), \tag{53.31} \]
where k⃗ − k⃗′ = −q⃗. In the inelastic generalization, this becomes



\[ \begin{aligned} f^{(1,+)}(\vec k', n; \vec k, n_0) &= \frac{2m}{4\pi\hbar^2}\int d^3r'\, e^{-i\vec q\cdot\vec r'}\,\langle n|V(\vec r')|n_0\rangle \\ &= \frac{2m}{4\pi\hbar^2}\int d^3r'\, e^{-i\vec q\cdot\vec r'} \int \prod_{i=1}^{N} d^3r_i\; \psi_n^*(\vec r_1,\ldots,\vec r_N)\, V(\vec r', \vec r_1,\ldots,\vec r_N)\, \psi_{n_0}(\vec r_1,\ldots,\vec r_N). \end{aligned} \tag{53.32} \]

But we have that
\[ \sum_{i=1}^{N}\int d^3r'\,\frac{e^{-i\vec q\cdot\vec r'}}{|\vec r' - \vec r_i|} = \sum_{i=1}^{N}\int d^3r\,\frac{e^{-i\vec q\cdot(\vec r + \vec r_i)}}{r} = \frac{4\pi}{q^2}\sum_{i=1}^{N} e^{-i\vec q\cdot\vec r_i} = \frac{4\pi}{q^2}\int d^3r\,\rho(\vec r)\, e^{-i\vec q\cdot\vec r}, \tag{53.33} \]

where the target's electron density is
\[ \rho(\vec r) = \sum_{i=1}^{N}\delta^3(\vec r - \vec r_i). \tag{53.34} \]

We define the form factors (in momentum space)
\[ F_{n,n_0}(\vec q) = \frac{1}{Z}\,\langle n|\sum_{i=1}^{N} e^{-i\vec q\cdot\vec r_i}|n_0\rangle. \tag{53.35} \]

Then the integral in the amplitude is
\[ \int d^3r'\, e^{-i\vec q\cdot\vec r'}\,\langle n|\left(-\frac{Ze_0^2}{r'} + \sum_{i=1}^{N}\frac{e_0^2}{|\vec r' - \vec r_i|}\right)|n_0\rangle = Ze_0^2\,\frac{4\pi}{q^2}\left(-\delta_{n,n_0} + F_{n,n_0}(\vec q)\right), \tag{53.36} \]

leading to the differential cross section
\[ \frac{d\sigma}{d\Omega} = \frac{k'}{k}\,\frac{4m^2 Z^2 e_0^4}{\hbar^4 q^4}\left|-\delta_{n,n_0} + F_{n,n_0}(\vec q)\right|^2 = \frac{k'}{k}\,\frac{4Z^2}{a_0^2 q^4}\left|-\delta_{n,n_0} + F_{n,n_0}(\vec q)\right|^2. \tag{53.37} \]

53.3 General Theory of Collisions: Inelastic Case and Multi-Channel Scattering

After considering a simplified treatment of inelastic scattering, from the point of view of a
fundamental projectile, we consider the general theory of collisions. In this general case, both
projectile and target can have structure, as for instance in atom-on-atom collisions, or nucleus-on-
nucleus collisions, as in the RHIC (Relativistic Heavy Ion Collider) and LHC (Large Hadron
Collider) experiments.

In the most general case, then, consider N fundamental particles (particles that cannot be separated
into sub-parts, at least not at the available energies), such as electrons and nucleons. They can
form different fragments, for instance nuclei or atoms, in different ways, if there are bound states
(if there is a nucleus, or an atom, in the given conditions). Examples include nuclear reactions through
scattering, in which we collide two nuclei, but after the scattering interaction other nuclei appear.
A division of all the N fundamental particles into fragments is called a channel. We note that
changing the state or the quantum numbers (for instance |n_0⟩ to |n⟩ for an atom) of a fragment
remains within the channel (though we can formally consider it as being in a different channel, if we
want); the term channel is reserved for different combinations into fragments. For scattering, we will
have both an "in" channel (before the collision) and an "out" channel (after the collision).
Also, if conservation of energy, momentum, angular momentum, etc., impedes the appearance in
the final state of a fragment, related to a given channel, then we say that the "channel is closed".
The energy E_i of fragment i is given by the kinetic energy ε_i of the center of mass motion of the
fragment, minus the binding energy w_i,
\[ E_i = \epsilon_i - w_i. \tag{53.38} \]
The simplest nontrivial example of channels corresponds to a system of N = 3 fundamental
particles, called, say, a, b, c. Then the three two-particle bound states correspond to fragments
(ab), (bc), and (ac). To admit the bound states of the fragments, we need to have the binding energies
w_{ab}, w_{bc}, w_{ac} > 0.
Then there are four channels, with the total energy divided among the various fragments,
\[ \begin{aligned} &\mathrm{I},\;\; a + (bc): && E = \epsilon_a + \epsilon_{bc} - w_{bc} \\ &\mathrm{II},\;\; b + (ac): && E = \epsilon_b + \epsilon_{ac} - w_{ac} \\ &\mathrm{III},\;\; c + (ab): && E = \epsilon_c + \epsilon_{ab} - w_{ab} \\ &\mathrm{IV},\;\; a + b + c: && E = \epsilon_a + \epsilon_b + \epsilon_c. \end{aligned} \tag{53.39} \]
Putting the fragments in the order of their binding energies, w_{bc} > w_{ac} > w_{ab} > 0, and taking
the middle channel, II, as the in channel, we will analyze the case of the out channel that is the least
bound, III, i.e., c + (ab).
Energy conservation for the collision, between the in and the out channels, gives
\[ \epsilon_b + \epsilon_{ac} - w_{ac} = \epsilon_c' + \epsilon_{ab}' - w_{ab}. \tag{53.40} \]
Then we obtain
\[ \epsilon_b + \epsilon_{ac} = w_{ac} - w_{ab} + \epsilon_c' + \epsilon_{ab}' > w_{ac} - w_{ab} > 0. \tag{53.41} \]
This is the condition for the c + (ab) channel to be open. If the total incoming kinetic energy,
ε_b + ε_{ac}, is less than w_{ac} − w_{ab}, we have a closed channel.
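The threshold logic can be condensed into a one-line check (a sketch; the function name and the sample binding energies are invented for illustration):

```python
# Condition (53.41): the out channel c + (ab) is open iff the total incoming
# kinetic energy exceeds the difference of binding energies w_ac - w_ab.
def channel_is_open(kinetic_in, w_in, w_out):
    """kinetic_in: eps_b + eps_ac; w_in: binding of the in-channel fragment
    (w_ac); w_out: binding of the out-channel fragment (w_ab)."""
    return kinetic_in > w_in - w_out

w_ac, w_ab = 5.0, 2.0      # illustrative values with w_ac > w_ab > 0
print(channel_is_open(4.0, w_ac, w_ab))   # True: 4.0 > 3.0, channel open
print(channel_is_open(2.5, w_ac, w_ab))   # False: below threshold, closed
```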

53.4 General Theory of Collisions

Before considering channels further, we describe a general, abstract-state, model of scattering.


Suppose that we have orthonormal states |Eα⟩,
\[ \langle E\alpha|E'\alpha'\rangle = \delta(E - E')\,\delta_{\alpha\alpha'}, \tag{53.42} \]

of eigenvalue energy E of some unperturbed Hamiltonian Ĥ_0,
\[ \hat H_0|E\alpha\rangle = E|E\alpha\rangle. \tag{53.43} \]
The scattering states are
\[ |E\alpha\pm\rangle = |E\alpha\rangle + \hat G^\pm(E)\hat V|E\alpha\rangle, \tag{53.44} \]
where Ĝ^±(E) are the full Green's functions. The states have the same norm as the free states,
\[ \langle E\alpha\pm|E'\alpha'\pm\rangle = \langle E\alpha|E'\alpha'\rangle, \tag{53.45} \]
and they obey the Lippmann–Schwinger equation
\[ |E\alpha\pm\rangle = |E\alpha\rangle + \hat G_0^\pm(E)\hat V|E\alpha\pm\rangle. \tag{53.46} \]
The full Green's function also satisfies the Lippmann–Schwinger equation:
\[ \hat G^\pm = \hat G_0^\pm + \hat G_0^\pm \hat V \hat G^\pm. \tag{53.47} \]
The S-matrix is defined by
\[ \langle E\alpha-|E'\alpha'+\rangle = \langle E\alpha|\hat S|E'\alpha'\rangle = \delta(E - E')\,\delta_{\alpha\alpha'} - 2\pi i\,\delta(E - E')\,T_{\alpha\alpha'}(E), \tag{53.48} \]
where the T-matrix is
\[ T_{\alpha\alpha'}(E) = \langle E\alpha-|\hat V|E'\alpha'\rangle\big|_{E=E'} = \langle E\alpha|\hat V|E'\alpha'+\rangle\big|_{E=E'}. \tag{53.49} \]
In a spherical basis, this S-matrix reduces to the partial wave S-matrix,
\[ S_{Elm,E'l'm'} = \tilde S_l(E)\,\delta(E - E')\,\delta_{ll'}\,\delta_{mm'}, \tag{53.50} \]
leading to
\[ S(k_1, k_2) = \frac{\pi}{2k_1^2}\, e^{2i\delta_l}\,\delta(k_1 - k_2)\,\delta_{l_1 l_2}\,\delta_{m_1 m_2}. \tag{53.51} \]
Define the operators Ω^{(±)} that create the scattering states from the free states,
\[ \Omega^{(\pm)}|E\alpha\rangle = |E\alpha\pm\rangle, \tag{53.52} \]
specifically
\[ \Omega^{(\pm)} = \lim_{t\to\mp\infty} U_I(0, t), \tag{53.53} \]
and giving the S-operator as
\[ \hat S = \Omega^{(-)\dagger}\,\Omega^{(+)}. \tag{53.54} \]
If we specialize to a single channel, yet with target structure (electron–atom scattering), the initial
(free) state is
\[ |E\alpha\rangle = |E_A, n_A\rangle \otimes |\epsilon_{B_0}, \beta_0\rangle, \tag{53.55} \]
and the final state is
\[ |E_A', n_A'\rangle \otimes |\epsilon_B, \beta\rangle. \tag{53.56} \]
Conservation of energy leads to
\[ E = E_A + \epsilon_{B_0} = E_A' + \epsilon_B. \tag{53.57} \]

The Lippmann–Schwinger equation is
\[ |E\alpha+\rangle = |E; n_A, \beta_0+\rangle = |E_A, n_A\rangle \otimes |\epsilon_{B_0}, \beta_0\rangle + \hat G_0^+(E)\hat V|E; n_A, \beta_0+\rangle. \tag{53.58} \]
The corresponding scattering wave function, for projectile position r⃗ and target center of mass
position R⃗, as r → ∞, is
\[ \langle\vec r|\otimes\langle\vec R|\;|E\alpha+\rangle = v^+(\vec r, \vec R) \sim e^{i\vec k\cdot\vec r}\,u_{n_0}(\vec R) + \sum_n f_{nn_0}(\vec k', \vec k)\,u_n(\vec R)\,\frac{e^{ik'r}}{r}. \tag{53.59} \]

We expand the scattering solution in terms of the target states,
\[ v^+(\vec r, \vec R) = \sum_n F_n^+(\vec r)\,u_n(\vec R). \tag{53.60} \]

Then at infinity for the projectile, r → ∞, we find
\[ F_n^+(\vec r) \sim e^{i\vec k\cdot\vec r}\,\delta_{nn_0} + f_{nn_0}(\vec k', \vec k)\,\frac{e^{ik'r}}{r} \tag{53.61} \]
for an open channel, and
\[ F_n^+(\vec r) \sim 0 \tag{53.62} \]
for a closed channel.


Then the differential cross section becomes
\[ \frac{d\sigma_{\epsilon_{B_0},\beta_0 \to \epsilon_B,\beta}}{d\Omega_a} = \frac{p_a}{p_A}\,|f|^2, \tag{53.63} \]
which splits between the elastic (n = n_0) and inelastic (n ≠ n_0) cases,
\[ \frac{d\sigma_{n_0}^{elastic}}{d\Omega} = |f_{n_0 n_0}|^2, \qquad \frac{d\sigma_n^{inelastic}}{d\Omega} = \frac{k'}{k}\,|f_{nn_0}|^2. \tag{53.64} \]

53.5 Multi-Channel Analysis

We now generalize to the multi-channel case. As we said, the presence of different channels means
dividing the fundamental particles and their interactions into fragments. Denoting a channel by Γ, we
have
\[ \hat H = \hat H_0^\Gamma + \hat V^\Gamma. \tag{53.65} \]
The free states |Eα; Γ⟩ correspond to eigenvalues of the free Hamiltonian of the channel,
\[ \hat H_0^\Gamma|E\alpha; \Gamma\rangle = E|E\alpha; \Gamma\rangle. \tag{53.66} \]

These states are orthonormal within the channel,
\[ \langle E\alpha; \Gamma|E'\alpha'; \Gamma\rangle = \delta(E - E')\,\delta_{\alpha\alpha'}, \tag{53.67} \]



but there is a nonzero overlap between channels,
\[ \langle E\alpha; \Gamma|E'\alpha'; \Gamma'\rangle \neq 0 \tag{53.68} \]
for Γ ≠ Γ′.
The scattering states are defined as
\[ |E\alpha; \Gamma\pm\rangle \equiv |E\alpha; \Gamma\rangle + \hat G^\pm(E)\hat V^\Gamma|E\alpha; \Gamma\rangle \tag{53.69} \]
and satisfy the Lippmann–Schwinger equation,
\[ |E\alpha; \Gamma\pm\rangle = |E\alpha; \Gamma\rangle + \hat G_0^\pm(E)\hat V^\Gamma|E\alpha; \Gamma\pm\rangle. \tag{53.70} \]
The orthogonality conditions of the states are as follows:
(1) The scattering states in different channels are orthogonal,
\[ \langle E\alpha\Gamma\pm|E'\alpha'\Gamma'\pm\rangle = 0 \tag{53.71} \]
for Γ ≠ Γ′.
(2) The free states within a single channel are not complete,
\[ \sum_\alpha \int_0^\infty dE\,|E\alpha\Gamma\rangle\langle E\alpha\Gamma| \neq 1, \tag{53.72} \]
while the scattering states of a single channel give just the projector onto the scattering states of the
channel,
\[ \sum_\alpha \int_0^\infty dE\,|E\alpha\Gamma\pm\rangle\langle E\alpha\Gamma\pm| \equiv P^{\Gamma\pm}. \tag{53.73} \]

(3) The sum of the projectors onto different channels, added to the projector Λ_b onto bound states,
is complete,
\[ \sum_\Gamma P^{\Gamma\pm} + \Lambda_b = 1. \tag{53.74} \]

The operators turning free states into scattering states are defined for each channel, by projecting
onto the (free) states of the channel with Λ^Γ:
\[ \Omega_\Gamma^{(\pm)} = \hat U^\Gamma(0, \mp\infty)\,\Lambda^\Gamma. \tag{53.75} \]
Within the channel these operators are unitary,
\[ [\Omega_\Gamma^{(\pm)}]^\dagger\,\Omega_\Gamma^{(\pm)} = \Lambda^\Gamma. \tag{53.76} \]
The S-operator is defined for arbitrary in and out channels,
\[ S^{\Gamma'\Gamma} = \Omega_{\Gamma'}^{(-)\dagger}\,\Omega_\Gamma^{(+)}. \tag{53.77} \]
The S-matrix is
\[ \langle E'\alpha'\Gamma'|S^{\Gamma'\Gamma}|E\alpha\Gamma\rangle = \langle E'\alpha'\Gamma'-|E\alpha\Gamma+\rangle = \delta(E - E')\,\delta_{\alpha\alpha'}\,\delta_{\Gamma\Gamma'} - 2\pi i\,\delta(E - E')\,T_{\alpha'\Gamma'\,\alpha\Gamma}(E), \tag{53.78} \]
where the T-matrix is given by
\[ T_{\alpha'\Gamma'\,\alpha\Gamma}(E) = \langle E\alpha'\Gamma'|\hat V^{\Gamma'}|E\alpha\Gamma+\rangle = \langle E\alpha'\Gamma'-|\hat V^{\Gamma}|E\alpha\Gamma\rangle. \tag{53.79} \]

If both the in and the out channels have two fragments, a small one, A (the projectile), transformed
into A′, and a large one, B (the target), transformed into B′, the inelastic differential cross section has
an extra factor coming from the mass of the projectile,
\[ \frac{d\sigma^{inelastic}}{d\Omega} = \frac{k'}{k}\,\frac{m_{A'}}{m_A}\,|f_{\alpha',\Gamma';\alpha,\Gamma}|^2. \tag{53.80} \]

53.6 Scattering of Identical Particles

The relevant case of electron scattering off an atom, which itself has N electrons, reveals a new
feature, namely the scattering of identical particles (the scattering electron versus the atom's
electrons). But identical particles need to be symmetrized, as we have seen: their wave functions
are either totally symmetric (S) or totally antisymmetric (A).
We assume that the target (the atom, containing electrons) is already symmetrized,
\[ \hat H_0\left[\hat P|E\alpha\rangle\right] = E\,\hat P|E\alpha\rangle, \tag{53.81} \]
where P̂ is the symmetrization operator.


But then we add to the system the projectile A, which is also an electron, so we need to consider
states that are symmetric or antisymmetric in all the N + 1 electrons. That means we need to
symmetrize j = 1, corresponding to the projectile, with j = 2, . . . , N +1, for the electrons of the atom.
The symmetrized system is obtained by the action of the operator Λ, symmetrizing j = 1 with all
the other j's,
\[ \Lambda|E\alpha\rangle = \frac{1}{N+1}\left[|E\alpha\rangle + \sum_{j=2}^{N+1} P_{1j}|E\alpha\rangle\right]. \tag{53.82} \]

The correctly normalized symmetric/antisymmetric states are then
\[ |E\alpha\rangle_{S/A} = \sqrt{N+1}\,\Lambda|E\alpha\rangle = \frac{1}{\sqrt{N+1}}\left[|E\alpha\pm\rangle \pm \sum_{j=2}^{N+1} P_{1j}|E\alpha\pm\rangle\right]. \tag{53.83} \]

The S-matrix is
\[ S_{E'\alpha', E\alpha} = \langle E'\alpha'-|E\alpha+\rangle \pm N\,\langle E'\alpha'-|P_{12}|E\alpha+\rangle = \delta(E - E')\,\delta_{\alpha\alpha'} - 2\pi i\,\delta(E - E')\,T_{\alpha'\alpha}(E), \tag{53.84} \]
where the T-matrix is given by
\[ T_{\alpha'\alpha}(E) = \langle E'\alpha'-|\hat V|E\alpha\rangle \pm N\,\langle E'\alpha'-|P_{12}\hat V|E\alpha\rangle = \langle E'\alpha'|\hat V|E\alpha+\rangle \pm N\,\langle E'\alpha'|P_{12}\hat V|E\alpha+\rangle. \tag{53.85} \]

It splits into a direct term (index d) and an exchange term (index exch),
\[ T_{\alpha'\alpha}(E) = T_{\alpha'\alpha}^d(E) \pm N\,T_{\alpha'\alpha}^{exch}(E). \tag{53.86} \]

Applying this to the simplest system, electron scattering off a hydrogen atom (N = 1), we replace
\[ T_{\alpha'\alpha}^d(E) \to t^d_{\vec p', n, l, m_l;\,\vec p}\;\delta_{m_1'm_1}\,\delta_{m_2'm_2}, \qquad T_{\alpha'\alpha}^{exch}(E) \to t^{exch}_{\vec p', n, l, m_l;\,\vec p}\;\delta_{m_1'm_2}\,\delta_{m_2'm_1}. \tag{53.87} \]

Then the differential cross section is
\[ \frac{d\sigma_{m_1'm_2';\,m_1 m_2}}{d\Omega} = (\ldots)\,\left|t^d\,\delta_{m_1'm_1}\delta_{m_2'm_2} \pm t^{exch}\,\delta_{m_1'm_2}\delta_{m_2'm_1}\right|^2. \tag{53.88} \]
Considering m_i = ± (the states of the spin projection), we obtain
\[ \begin{aligned} \frac{d\sigma_{++,++}}{d\Omega} &= \frac{d\sigma_{--,--}}{d\Omega} = |t^d - t^{exch}|^2 \\ \frac{d\sigma_{+-,+-}}{d\Omega} &= \frac{d\sigma_{-+,-+}}{d\Omega} = |t^d|^2 \\ \frac{d\sigma_{+-,-+}}{d\Omega} &= \frac{d\sigma_{-+,+-}}{d\Omega} = |t^{exch}|^2, \end{aligned} \tag{53.89} \]
and the rest of the scattering cross sections vanish.
That means that the unpolarized differential cross section is
\[ \left(\frac{d\sigma}{d\Omega}\right)^{unpolarized} = \frac{3}{4}\,|t^d - t^{exch}|^2 + \frac{1}{4}\,|t^d + t^{exch}|^2. \tag{53.90} \]
To calculate the transition amplitude, we use the Born approximation for |Eα±⟩ → |Eα⟩, and
V̂ = V̂_1 + V̂_{12}, giving
\[ T_{\alpha'\alpha}(E) = \langle E'\alpha'|(\hat V_1 + \hat V_{12})|E\alpha\rangle \pm N\,\langle E'\alpha'|P_{12}(\hat V_1 + \hat V_{12})|E\alpha\rangle. \tag{53.91} \]

Important Concepts to Remember

• Inelastic scattering means scattering when energy, momentum, and/or particles can leak into a
“sink”, namely the case in which one or both of the particles (projectile and target) have internal
structure and can become excited.
• In inelastic scattering Ŝ is not unitary, so Sl is not a phase, i.e., δl is not real: |Sl | < 1, or Im δl > 0.
• For a "black disk" eikonal, Im Δ(b) = +∞ for b < r_0 and Δ = 0 for b > r_0, so the total inelastic
cross section is σ_{inelastic} = σ_{geometrical} = πr_0², while the total cross section is 2πr_0².
• In inelastic scattering with a target structure (so that the target can absorb energy and momentum),
\[ \frac{d\sigma}{d\Omega}(|n_0\rangle \to |n\rangle) = \frac{k'}{k}\,|f|^2. \]
For an atom with N electrons,
\[ \frac{d\sigma}{d\Omega} = \frac{k'}{k}\,\frac{4Z^2}{a_0^2 q^4}\left|-\delta_{n,n_0} + F_{n,n_0}(\vec q)\right|^2, \]
with F_{n,n_0}(q⃗) the form factors.
• In general, both projectile and target can have structure, and that structure can change during
the collision (as in nuclear reactions occurring through collision), so we have different divisions,
named channels, of the total number of fundamental particles into fragments.

• In the case of several channels Γ (different ones for initial and final states), we have
\[ \frac{d\sigma}{d\Omega} = \frac{k'}{k}\,\frac{m_{A'}}{m_A}\,|f_{\alpha',\Gamma';\alpha,\Gamma}|^2. \]
• When scattering involves identical fundamental particles, organized into fragments, as for instance
an electron scattering off an atom with electrons, the total state needs to be (anti)symmetrized,
leading to a direct term and an exchange term in the amplitude, so dσ/dΩ ∼ |t^d δ⋯ ± t^{exch} δ⋯|².

Further Reading
See [2] and [1].

Exercises

(1) Consider the mapping of the black disk eikonal into the general, δl (k), representation. Calculate
the total inelastic cross section in this representation.
(2) Calculate the differential cross section dσ(θ)/dΩ for the black disk eikonal, integrate it, and
compare with the total inelastic cross section.
(3) Consider the scattering of a projectile A (nonidentical to the target components) off a
hydrogenoid atom in the Born approximation, involving a jump of the electron from the ground
state to the first excited state. Calculate the differential cross section.
(4) Consider a system of N = 4 fundamental particles. Write down the channels for the scattering
of the various fragments. Taking the in channel as one where each of the two fragments has
two particles, find the condition for the out channel to be one in which the fragments have,
respectively, one and three particles.
(5) Describe the general theory of scattering (as in the text) for the case when the target is a single
fundamental particle (and the projectile is composite), so both in and out channels have a single-
particle target.
(6) How do you write the inelastic differential cross section for the case where the in channel has two
parallel projectiles, and the out channel has a single outgoing projectile (fragment) of general
composition (neither of the initial projectiles, nor their sum)?
(7) Specialize the scattering of the identical particle formalism for the case of an electron scattering
off a helium atom to describe the differential cross sections of various spin projections in terms
of direct and exchange T-matrices.
PART IIe

MANY PARTICLES
54 The Dirac Equation

In this chapter we consider relativistic corrections to the electron (fermion) theory, in the form
of the Dirac equation. The correct treatment of the Dirac equation is in quantum field theory,
but here we will deal with only the first relativistic corrections, in which case we can ignore
all quantum field corrections. In some treatments in the literature, one talks about building
a relativistic quantum mechanics. However, there is no relativistic quantum mechanics; joining
relativity with quantum mechanics leads to quantum field theory, as we will show.

54.1 Naive Treatment

In basic nonrelativistic quantum mechanics we have the Schrödinger equation, which in the free
case is
\[ -i\hbar\,\partial_t\psi - \frac{\hbar^2}{2m}\Delta\psi = 0, \tag{54.1} \]
and is invariant under the Galilei transformations,
\[ \vec x\,' = \vec x - \vec v t. \tag{54.2} \]

In the momentum representation, the free Schrödinger equation is
\[ i\hbar\,\partial_t\psi = H\psi = \frac{\vec p^{\,2}}{2m}\psi, \tag{54.3} \]
which is the nonrelativistic energy relation.
We would like to build a replacement for the Schrödinger equation that is invariant under the
Lorentz transformations. We will replace the energy relation with a relativistic one. Acting twice with
the same operator on the wave function we obtain
\[ -\hbar^2\partial_t^2\psi = H^2\psi, \tag{54.4} \]
where
\[ E^2 \to H^2 = \vec p^{\,2}c^2 + m^2c^4. \tag{54.5} \]

Then we have a replacement of the Schrödinger equation defined by
\[ \begin{aligned} &-\hbar^2\partial_t^2\psi = (\vec p^{\,2}c^2 + m^2c^4)\psi \;\Rightarrow\; \left[-\hbar^2\partial_t^2 + \hbar^2c^2\vec\nabla^2 - m^2c^4\right]\psi = 0 \;\Rightarrow \\ &\left[\vec\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \left(\frac{mc}{\hbar}\right)^2\right]\psi = 0. \end{aligned} \tag{54.6} \]

This is the Klein–Gordon equation, but it is not what we wanted. The equation acts on a wave
function ψ that is in a scalar representation of the Lorentz group. In fact, we need to find an equation
acting on the spinor representation (spin 1/2, for the electron) of the Lorentz group.
Moreover, we need to have a first-order equation in ∂_t, like the usual Schrödinger equation, and
correspondingly (because of Lorentz invariance) also first order in p⃗ = (ħ/i)∇⃗.
Putting together these two requirements, we find that we need to take the "square root" of the
Klein–Gordon equation, in terms of matrices (the coefficients are matrices). Then we have
\[ c^2\vec p^{\,2} + (mc^2)^2 = (c\vec\alpha\cdot\vec p + \beta mc^2)^2 \equiv H^2, \tag{54.7} \]

meaning that the Hamiltonian is
\[ H = c\vec\alpha\cdot\vec p + \beta mc^2, \tag{54.8} \]
with matrix coefficients α⃗ and β satisfying (from the defining equation, by equating terms with p_ip_j,
or p_i, or no p_i at all, on both sides of it)
\[ \begin{aligned} &(\alpha_i)^2 = \beta^2 = 1 \\ &\alpha_i\alpha_j + \alpha_j\alpha_i = \{\alpha_i, \alpha_j\} = 0, \quad i \neq j \\ &\alpha_i\beta + \beta\alpha_i = \{\alpha_i, \beta\} = 0. \end{aligned} \tag{54.9} \]
Moreover, since the Hamiltonian is Hermitian, the coefficient matrices are also Hermitian, so
α_i† = α_i and β† = β. They are also traceless and have eigenvalues ±1 (since α_i² = β² = 1). The
matrices must be 4 × 4, which we can argue as follows. The dimension must be even, since the
functions on which we act with them must eventually depend on spin, which takes two values (±1/2).
For 2 × 2 matrices, the complete set of matrices is (1, σ⃗), so the full set of independent mutually
anticommuting matrices comprises only the three Pauli matrices σ_i; yet we need four, for the α_i
and β. That leaves the next even dimension up, namely 4 × 4 matrices.
The matrices are also not unique; after a unitary transformation, with matrix S (S† = S⁻¹), we have
another solution for the matrices:
\[ \vec\alpha \to S^{-1}\vec\alpha S, \qquad \beta \to S^{-1}\beta S. \tag{54.10} \]

The Dirac equation, replacing the Schrödinger equation, is then
\[ i\hbar\frac{\partial}{\partial t}\psi = (c\vec\alpha\cdot\vec p + \beta mc^2)\psi = \left(\frac{\hbar c}{i}\vec\alpha\cdot\vec\nabla + \beta mc^2\right)\psi. \tag{54.11} \]
One useful choice for the matrices α⃗, β (reachable from another by a unitary transformation) is
\[ \vec\alpha = \begin{pmatrix} 0 & \vec\sigma \\ \vec\sigma & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} \mathbf 1 & 0 \\ 0 & -\mathbf 1 \end{pmatrix}. \tag{54.12} \]
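The relations (54.9) are straightforward to verify numerically for the representation (54.12); a quick check (not part of the derivation):

```python
import numpy as np

# Verify the algebra (54.9) for the 4x4 representation (54.12).
s = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]])]            # Pauli matrices
Z = np.zeros((2, 2))
alpha = [np.block([[Z, si], [si, Z]]) for si in s]
beta = np.block([[np.eye(2), Z], [Z, -np.eye(2)]])

acomm = lambda A, B: A @ B + B @ A
for i in range(3):
    assert np.allclose(alpha[i] @ alpha[i], np.eye(4))    # (alpha_i)^2 = 1
    assert np.allclose(acomm(alpha[i], beta), 0)          # {alpha_i, beta} = 0
    for j in range(i + 1, 3):
        assert np.allclose(acomm(alpha[i], alpha[j]), 0)  # {alpha_i, alpha_j} = 0
assert np.allclose(beta @ beta, np.eye(4))                # beta^2 = 1
print("algebra (54.9) verified")
```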

Since the α , β coefficients are 4 × 4 matrices, they act on four-dimensional column vectors ψ, not
just on a single wave function. However, we have already encountered a two-component vector
as the spin 1/2 wave function (ms = +1/2 and ms = −1/2 components). Here we have two two-
component vectors, which actually correspond to both the electron e− and its antiparticle e+ (though
the separation of the two is related to some specific choice for α , β).
The correct treatment for producing the Dirac equation and its solutions is in quantum field theory.
In the literature, there are statements about the existence of a relativistic quantum mechanics, but
that is misleading; the correct relativistic treatment, joining relativity with quantum mechanics, is

quantum field theory. The fact that a strictly relativistic quantum mechanics does not exist can be
understood from three points of view:
(1) For any particle, there is an antiparticle. For spin 1/2 particles, the two are different (for real
fields, such as real scalars, the particles are their own antiparticles). Thus if the energy is E > m p c2 +
m p̄ c2 , we can create a particle and antiparticle pair, which means that the number of particles is not
conserved in quantum field theory, which describes real situations. But in usual quantum mechanics,
particles follow quantum paths that never end or begin, so the number of particles is conserved.
(2) Even if E < m_p c² + m_p̄ c², we can create a particle–antiparticle pair for a short time. Indeed, the
Heisenberg uncertainty principle for E and t implies that ΔE·Δt ∼ ħ, so for low enough Δt, we
can have E + ΔE > m_p c² + m_p̄ c². Therefore we can always create a virtual pair of particles. There
are no asymptotic particles (at large space and time separation) owing to energy and momentum
conservation, but we have quantum fluctuations. Thus, even for low energies, we can always have
quantum paths of particles that begin and end, violating the conservation of particle number.
(3) In the usual quantum mechanics, we also violate causality, even if we have Lorentz invariance
(a relativistic theory), with E = √(p⃗²c² + m²c⁴). We will show that under a time evolution that violates
causality we still have a nontrivial result, contrary to what we should have in a good theory.
The propagator matrix element between x⃗_0 at time zero and x⃗ at time t is
\[ \begin{aligned} U(t) &= \langle\vec x|e^{-i\hat Ht/\hbar}|\vec x_0\rangle = \int d^3p\,d^3p'\,\langle\vec x|\vec p\rangle\langle\vec p|e^{-i\hat Ht/\hbar}|\vec p\,'\rangle\langle\vec p\,'|\vec x_0\rangle \\ &= \frac{1}{(2\pi\hbar)^3}\int d^3p\,\exp\left(-\frac{i}{\hbar}t\sqrt{p^2c^2 + m^2c^4}\right)e^{i\vec p\cdot(\vec x - \vec x_0)/\hbar}. \end{aligned} \tag{54.13} \]
But
\[ \int d^3p\, e^{i\vec p\cdot\vec x/\hbar} = \int_0^\infty p^2dp\;2\pi\int_{-1}^{+1}d(\cos\theta)\,e^{ipx\cos\theta/\hbar} = \int_0^\infty p^2dp\,\frac{2\pi\hbar}{ipx}\left(e^{ipx/\hbar} - e^{-ipx/\hbar}\right) = \int_0^\infty p^2dp\,\frac{4\pi\hbar}{px}\sin\left(\frac{px}{\hbar}\right). \tag{54.14} \]
Then the propagator matrix element is
\[ U(t) = \frac{1}{2\pi^2\hbar^2\,|\vec x - \vec x_0|}\int_0^\infty p\,dp\,\sin\left(\frac{p\,|\vec x - \vec x_0|}{\hbar}\right)\exp\left(-\frac{i}{\hbar}t\sqrt{p^2c^2 + m^2c^4}\right). \tag{54.15} \]
To approximate it, we use the saddle point approximation of the integral, by Taylor expanding
the exponent around its extremum,
\[ I = \int dx\,e^{f(x)} \simeq e^{f(x_0)}\int d\delta x\,e^{f''(x_0)\delta x^2/2} = e^{f(x_0)}\sqrt{\frac{2\pi}{-f''(x_0)}}. \tag{54.16} \]

If we consider a separation far outside the causal cone, x² ≫ c²t², the saddle point (extremum)
condition implies
\[ \frac{1}{\hbar}\frac{d}{dp}\left(ipx - it\sqrt{p^2c^2 + m^2c^4}\right) = 0 \;\Rightarrow\; x = \frac{tpc^2}{\sqrt{p^2c^2 + m^2c^4}} \;\Rightarrow\; p = p_0 = \frac{imcx}{\sqrt{x^2 - c^2t^2}}. \tag{54.17} \]

Thus the saddle point evaluation of the propagator matrix element is
\[ U(t) \propto \exp\left[\frac{i}{\hbar}\left(p_0x - t\sqrt{p_0^2c^2 + m^2c^4}\right)\right] \sim \exp\left(-\frac{mc}{\hbar}\sqrt{x^2 - c^2t^2}\right) \neq 0. \tag{54.18} \]
This means that there is a breakdown of causality even well outside the causal cone, albeit with an
exponentially small value. In quantum field theory causality is recovered, but we will not explain
how, since it requires information beyond the scope of this book.

54.2 Relativistic Dirac Equation

Our previous treatment of the Dirac equation was based on the idea of a relativistic version of
quantum mechanics. But, as we have just seen, that is not consistent, so we need a derivation based
on quantum field theory.
We need to consider a field associated with spin 1/2 particles. These are "spinor" fields, whose
Hilbert space is acted upon by the so-called gamma matrices γ^μ. These are objects satisfying
the "Clifford algebra": they square to plus or minus the identity, and anticommute among
each other, just like α⃗ and β. Together, these conditions make up the Clifford algebra
\[ \{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}\,\mathbf 1, \tag{54.19} \]
where g μν is the metric, in our case the Minkowski metric g μν = η μν = diag(−1, +1, +1, +1). If
we replace γ0 with iγ0 , the Clifford algebra will have the Euclidean metric g̃ μν = δ μν . In three
dimensions, the Pauli matrices σi satisfy this Euclidean Clifford algebra.
The Hilbert space for the Clifford algebra has a basis that is changed by a unitary transformation,
ψ → Sψ, which amounts to a transformation on the gamma matrices, γ μ → Sγ μ S −1 , meaning the
gamma matrices are not unique, just like α  and β.
A choice of γ μ corresponds to a representation of the Clifford algebra. A useful representation is
the Weyl representation,
   
\[ \gamma^0 = -i\begin{pmatrix} 0 & \mathbf 1 \\ \mathbf 1 & 0 \end{pmatrix}, \qquad \gamma^i = -i\begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix}, \tag{54.20} \]
satisfying (γ⁰)² = −1, (γⁱ)² = 1. Together, the gamma matrices in the Weyl representation are
written as
\[ \gamma^\mu = -i\begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}, \tag{54.21} \]
where
\[ \sigma^\mu = (\mathbf 1, \sigma_i), \qquad \bar\sigma^\mu = (\mathbf 1, -\sigma_i). \tag{54.22} \]
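The Weyl matrices (54.20) can be checked against the Clifford algebra (54.19), with g^{μν} = η^{μν} = diag(−1, +1, +1, +1); a numerical sketch:

```python
import numpy as np

# Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} 1 for the Weyl
# representation (54.20), with eta = diag(-1, +1, +1, +1).
s = [np.eye(2), np.array([[0, 1], [1, 0]]),
     np.array([[0, -1j], [1j, 0]]), np.array([[1, 0], [0, -1]])]
Z = np.zeros((2, 2))
gamma = [-1j * np.block([[Z, s[0]], [s[0], Z]])]                 # gamma^0
gamma += [-1j * np.block([[Z, si], [-si, Z]]) for si in s[1:]]   # gamma^i
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

for mu in range(4):
    for nu in range(4):
        acomm = gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu]
        assert np.allclose(acomm, 2 * eta[mu, nu] * np.eye(4))
print("Clifford algebra verified")
```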
Then the Dirac equation is the Lorentz-invariant 4 × 4 matrix equation that is linear in ∂_t and has
rest energy mc². That uniquely gives, in the "theorist's units" with c = ħ = 1, the equation
\[ (\gamma^\mu\partial_\mu + m)\psi = 0. \tag{54.23} \]
Reinstating ħ and c, we find
\[ \frac{\hbar}{c}\gamma^0\frac{\partial}{\partial t}\psi + \hbar\gamma^i\partial_i\psi + mc\,\psi = 0. \tag{54.24} \]

Multiplying by iγ⁰c, we obtain
\[ i\hbar\frac{\partial}{\partial t}\psi = i\hbar c\,\gamma^0\gamma^i\partial_i\psi + imc^2\gamma^0\psi, \tag{54.25} \]
which means that we have
\[ \vec\alpha = -\gamma^0\vec\gamma, \qquad \beta = i\gamma^0. \tag{54.26} \]

54.3 Interaction with Electromagnetic Field

To describe the interaction of the spin 1/2 fermions (electrons) with the electromagnetic field, we use
minimal coupling, replacing p⃗ with p⃗ − qA⃗, and H with H + qφ.
Then the Hamiltonian with the interaction with the electromagnetic field is
\[ H = c\vec\alpha\cdot(\vec p - q\vec A) + \beta mc^2 + q\phi. \tag{54.27} \]

Substituting into the Schrödinger equation
\[ i\hbar\frac{\partial}{\partial t}\psi = \hat H\psi, \tag{54.28} \]
we obtain the Dirac equation including the interaction with the electromagnetic field,
\[ \left[\frac{1}{i}\frac{\hbar}{c}\frac{\partial}{\partial t} + \frac{q}{c}\phi + \vec\alpha\cdot\left(\frac{\hbar}{i}\vec\nabla - q\vec A\right) + \beta mc\right]\psi = 0. \tag{54.29} \]
In the explicitly relativistic invariant notation, we have
\[ \left[\gamma^\mu\left(\frac{\hbar}{i}\partial_\mu - qA_\mu\right) + mc\right]\psi = 0. \tag{54.30} \]

54.4 Weakly Relativistic Limit

The main reason to consider in this book on quantum mechanics the Dirac equation, which is really
part of a quantum field theory treatment, is to calculate the first nontrivial relativistic corrections to
the nonrelativistic result. This is usually called the weakly relativistic limit.

Interaction with a Magnetic Field

The first application is to the case of φ = 0, leading to interaction with just a magnetic field. In the
stationary case, ψ(t) = ψe^{−iEt/ħ}, the Dirac equation for interaction with A⃗ reduces to
\[ E\psi = \hat H\psi = (c\vec\alpha\cdot\vec p_{kin} + \beta mc^2)\psi, \tag{54.31} \]
where the kinetic momentum is
\[ \vec p_{kin} = \vec p - q\vec A. \tag{54.32} \]

Dividing the four-component vector ψ into two two-component vectors χ and φ, we find the matrix equation
(using α⃗ and β in (54.12))
\[ \begin{pmatrix} E - mc^2 & -c\vec\sigma\cdot\vec p_{kin} \\ -c\vec\sigma\cdot\vec p_{kin} & E + mc^2 \end{pmatrix}\begin{pmatrix} \chi \\ \phi \end{pmatrix} = 0, \tag{54.33} \]
which divides into two equations,
\[ \begin{aligned} (E - mc^2)\chi &= c\vec\sigma\cdot\vec p_{kin}\,\phi \\ (E + mc^2)\phi &= c\vec\sigma\cdot\vec p_{kin}\,\chi. \end{aligned} \tag{54.34} \]

In the nonrelativistic limit, σ⃗·p⃗ ∼ mv, and E + mc² ≃ 2mc², meaning that (from the second
relation)
\[ \frac{\phi}{\chi} \sim \frac{mvc}{2mc^2} = \frac{1}{2}\frac{v}{c} \ll 1. \tag{54.35} \]
Thus the two-component vector φ is small, while χ is large.


Substituting
\[ \phi \simeq \frac{\vec\sigma\cdot\vec p_{kin}}{2mc}\,\chi, \tag{54.36} \]
obtained from the second Dirac equation, into the first equation, we obtain
\[ (E - mc^2)\chi = c\vec\sigma\cdot\vec p_{kin}\,\phi = \frac{(\vec\sigma\cdot\vec p_{kin})^2}{2m}\,\chi, \tag{54.37} \]
which is called the Pauli equation, for the large two-component vector χ.
Using the relation between Pauli matrices,
\[ \sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k, \tag{54.38} \]

we find
\[ \frac{1}{2m}\,\sigma_i\sigma_j\left(p_i - qA_i\right)\left(p_j - qA_j\right) = \frac{(\vec p - q\vec A)^2}{2m} + \frac{i}{2m}\,\epsilon_{ijk}\left(\frac{\hbar}{i}\partial_i - qA_i\right)\left(\frac{\hbar}{i}\partial_j - qA_j\right)\sigma_k = \frac{(\vec p - q\vec A)^2}{2m} - \frac{q\hbar}{2m}\,\vec\sigma\cdot\vec B, \tag{54.39} \]
where we have used ε_{ijk}∂_iA_j = B_k (the terms ε_{ijk}A_j∂_i cancel by antisymmetry).
Then the Dirac equation for χ including the interaction with A⃗ (the Pauli equation) is
\[ (E - mc^2)\chi = \left[\frac{(\vec p - q\vec A)^2}{2m} - \frac{q\hbar}{2m}\,\vec\sigma\cdot\vec B\right]\chi. \tag{54.40} \]

Here E − mc² is the energy appearing in the nonrelativistic Schrödinger equation, and the σ⃗·B⃗ term
is the spin interaction with the magnetic field, with the Landé factor g = 2, as we have seen before.
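The algebraic ingredient behind the σ⃗·B⃗ term, the Pauli identity (54.38), can also be verified numerically; a quick check:

```python
import numpy as np

# Check sigma_i sigma_j = delta_ij + i eps_ijk sigma_k, eq. (54.38).
s = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]])]
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

for i in range(3):
    for j in range(3):
        rhs = (i == j) * np.eye(2) + 1j * sum(eps[i, j, k] * s[k] for k in range(3))
        assert np.allclose(s[i] @ s[j], rhs)
print("(54.38) holds for all i, j")
```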

Interaction with an Electric Field


Interaction with a central electric potential φ(r),
\[ V(r) = e\phi(r), \tag{54.41} \]
gives two coupled Dirac equations for χ and φ with an extra term proportional to the identity,
\[ \begin{aligned} (E - V - mc^2)\chi &= c\vec\sigma\cdot\vec p_{kin}\,\phi \\ (E - V + mc^2)\phi &= c\vec\sigma\cdot\vec p_{kin}\,\chi. \end{aligned} \tag{54.42} \]
Solving the second equation as before,
\[ \phi = \frac{c\,\vec\sigma\cdot\vec p_{kin}}{E - V + mc^2}\,\chi, \tag{54.43} \]
and substituting into the first equation, we get an equation for the large two-component vector χ,
\[ (E - mc^2 - V)\chi = c^2\,(\vec\sigma\cdot\vec p_{kin})\,\frac{1}{E - V + mc^2}\,(\vec\sigma\cdot\vec p_{kin})\,\chi. \tag{54.44} \]
This equation is now expanded in the small parameter
\[ \frac{E - mc^2 - V}{2mc^2} \ll 1. \tag{54.45} \]
The zeroth-order term is
\[ (E - mc^2 - V)\chi \simeq c^2\,\frac{(\vec\sigma\cdot\vec p_{kin})^2}{2mc^2}\,\chi, \tag{54.46} \]
which becomes (moving V to the right-hand side) the usual interaction with a magnetic field and an
electric potential (the Pauli equation with V in it)
\[ (E - mc^2)\chi = \left[\frac{(\vec\sigma\cdot\vec p_{kin})^2}{2m} + V\right]\chi. \tag{54.47} \]
Next, we consider the first-order corrections in the expansion parameter,
\[ \begin{aligned} (E - mc^2 - V)\chi &= \frac{c^2}{2mc^2}\,(\vec\sigma\cdot\vec p_{kin})\left(1 + \frac{E - mc^2 - V}{2mc^2}\right)^{-1}(\vec\sigma\cdot\vec p_{kin})\,\chi \\ &\simeq \left[\frac{(\vec\sigma\cdot\vec p_{kin})^2}{2m} - (\vec\sigma\cdot\vec p_{kin})\,\frac{E - mc^2 - V}{4m^2c^2}\,(\vec\sigma\cdot\vec p_{kin})\right]\chi. \end{aligned} \tag{54.48} \]

At this point, we want to consider the pure electric case, with A⃗ = 0, and p⃗_{kin} = p⃗. Then, using
\[ (E - mc^2 - V)\,\vec\sigma\cdot\vec p\,\chi = \vec\sigma\cdot\vec p\,(E - mc^2 - V)\chi + \vec\sigma\cdot[E - mc^2 - V, \vec p\,]\chi, \tag{54.49} \]
we find
\[ (E - mc^2)\chi = \left\{\frac{\vec p^{\,2}}{2m} + V - \frac{\vec p^{\,4}}{8m^3c^2} - \frac{(\vec\sigma\cdot\vec p)(\vec\sigma\cdot[\vec p, V])}{4m^2c^2}\right\}\chi. \tag{54.50} \]
The first two terms on the right-hand side are the nonrelativistic result, while the third term is the
relativistic correction to the energy coming from the expansion of E = √(p²c² + m²c⁴),
\[ E \simeq mc^2 + \frac{p^2}{2m} - \frac{p^4}{8m^3c^2}. \tag{54.51} \]
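The expansion (54.51) can be tested numerically; a sketch in units where m = c = 1 (the first neglected term of the series is +p⁶/16 in these units):

```python
import numpy as np

# Compare E = sqrt(p^2 c^2 + m^2 c^4) with the truncation (54.51).
m, c = 1.0, 1.0
for p in (0.01, 0.05, 0.1):
    exact = np.sqrt(p**2 * c**2 + m**2 * c**4)
    approx = m * c**2 + p**2 / (2 * m) - p**4 / (8 * m**3 * c**2)
    # the residual is the next term of the series, O(p^6)
    print(p, exact - approx)
```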

Decomposing σ_iσ_j = δ_{ij} + iε_{ijk}σ_k, we obtain the corrected Schrödinger equation
\[ (E - mc^2)\chi = \left\{\frac{\vec p^{\,2}}{2m} + V - \frac{\vec p^{\,4}}{8m^3c^2} - i\,\frac{\vec\sigma\cdot(\vec p\times[\vec p, V])}{4m^2c^2} - \frac{\vec p\cdot[\vec p, V]}{4m^2c^2}\right\}\chi, \tag{54.52} \]
where there are three relativistic corrections to the Hamiltonian. The second of these is
\[ -i\,\frac{\vec\sigma\cdot(\vec p\times[\vec p, V])}{4m^2c^2} = -i\,\frac{\vec\sigma\cdot(\vec p\times[-i\hbar\vec\nabla, V])}{4m^2c^2}, \tag{54.53} \]
where
\[ [-i\hbar\vec\nabla, V] = -i\hbar(\vec\nabla V) = -i\hbar\,\frac{\vec r}{r}\frac{dV}{dr}. \tag{54.54} \]
However, −p⃗ × r⃗ = L⃗ is the angular momentum and ħσ⃗/2 = S⃗ is the spin, so the second relativistic
correction is
\[ \frac{1}{2m^2c^2}\,(\vec S\cdot\vec L)\,\frac{1}{r}\frac{dV}{dr}, \tag{54.55} \]
which is the spin–orbit interaction, giving the Thomas precession.
Finally, however, the last term gives the so-called Darwin term. But the term is not Hermitian,
\[ (\vec p\cdot[\vec p, V])^\dagger = [V, \vec p\,]\cdot\vec p = -\vec p\cdot[\vec p, V]. \tag{54.56} \]
This means that the conservation of probability is broken (the conservation of probability implies
unitary evolution, through a Hermitian Hamiltonian). We need an extra term in the Hamiltonian to
compensate for the probability loss and restore Hermiticity.
The probability loss is due to the fact that we dropped the small two-component vector φ. Indeed,
the normalization of probability is given by
ψ|ψ = χ|χ + φ|φ = 1. (54.57)
Therefore we must replace the resulting norm of \phi with a term involving only \chi. Using (54.43), approximated as

|\phi\rangle \simeq \frac{\vec\sigma\cdot\vec{p}}{2mc}\,|\chi\rangle, \qquad (54.58)

to relate the two spinors, we obtain

\langle\phi|\phi\rangle = \left\langle\chi\left|\left(\frac{\vec\sigma\cdot\vec{p}}{2mc}\right)^2\right|\chi\right\rangle \qquad (54.59)

to be added to \langle\chi|\chi\rangle, i.e., to replace it by

\left\langle\chi\left|\,1 + \left(\frac{\vec\sigma\cdot\vec{p}}{2mc}\right)^2\right|\chi\right\rangle. \qquad (54.60)

Thus we are replacing |\chi\rangle by

|\tilde\chi\rangle = \sqrt{1 + \left(\frac{\vec\sigma\cdot\vec{p}}{2mc}\right)^2}\,|\chi\rangle \simeq \left(1 + \frac{(\vec\sigma\cdot\vec{p})^2}{8m^2c^2}\right)|\chi\rangle, \qquad (54.61)

inverted as

|\chi\rangle = \left(1 - \frac{(\vec\sigma\cdot\vec{p})^2}{8m^2c^2}\right)|\tilde\chi\rangle. \qquad (54.62)

Evaluating the energy in the Schrödinger equation E - mc^2 in the |\tilde\chi\rangle state, we obtain (since (\vec\sigma\cdot\vec{p})^2 = \vec{p}^{\,2})

(E - mc^2)|\tilde\chi\rangle = \left(1 + \frac{\vec{p}\cdot\vec{p}}{8m^2c^2}\right)(E - mc^2)|\chi\rangle
= \left(1 + \frac{\vec{p}\cdot\vec{p}}{8m^2c^2}\right)\hat{H}\left(1 - \frac{\vec{p}\cdot\vec{p}}{8m^2c^2}\right)|\tilde\chi\rangle \qquad (54.63)
= \left(\hat{H} + \frac{[\vec{p}\cdot\vec{p}, \hat{H}]}{8m^2c^2}\right)|\tilde\chi\rangle,

where in the second equality we have used (E - mc^2)|\chi\rangle = \hat{H}|\chi\rangle, and replaced |\chi\rangle with its expression in terms of |\tilde\chi\rangle. The only term with a nontrivial commutator with \vec{p}\cdot\vec{p} is V; the rest of the terms in \hat{H} commute trivially. Therefore we have an extra term in the Hamiltonian:

(E - mc^2)|\tilde\chi\rangle = \left(\hat{H} + \frac{[\vec{p}\cdot\vec{p}, V]}{8m^2c^2}\right)|\tilde\chi\rangle. \qquad (54.64)
Then the Darwin term, including the last relativistic correction and the term coming from the normalization, is

H_D = \frac{1}{8m^2c^2}\left(-2\,\vec{p}\cdot[\vec{p}, V] + [\vec{p}\cdot\vec{p}, V]\right) = -\frac{1}{8m^2c^2}\,[p_i, [p_i, V]] = +\frac{\hbar^2}{8m^2c^2}\,\Delta V, \qquad (54.65)

with an implicit sum over i in the double commutator.
8m2 c2 8m2 c2 8m2 c2

54.5 Correction to the Energy of Hydrogenoid Atoms

The relativistic correction to the Schrödinger equation is given by

H' = -\frac{(\vec{p}^{\,2})^2}{8m^3c^2} + \frac{1}{2m^2c^2}\,(\vec{L}\cdot\vec{S})\,\frac{1}{r}\frac{dV}{dr} + \frac{\hbar^2}{8m^2c^2}\,\Delta V = H_1 + H_2 + H_3, \qquad (54.66)

where H_1 is the relativistic correction to the energy of a particle, H_2 is the spin–orbit coupling, and H_3 is the Darwin term. We will apply this to the hydrogenoid atom, with

V = -\frac{Ze_0^2}{r}. \qquad (54.67)

Since we have spin–orbit coupling, it is convenient to add the orbital and spin angular momenta together to give the total angular momentum, \vec{L} + \vec{S} = \vec{J}. The eigenstates in the central potential are then eigenstates |nljm_j\rangle of the complete set (\hat{H}, \vec{L}^2, \vec{J}^2, J_z). In the coordinate representation,

\langle\vec{r}|nljm_j\rangle = R_{nl}(r)\,\langle\theta,\phi|ljm_j\rangle = R_{nl}(r)\left[C_{ljm_j}\,Y_l^{m_j-1/2}(\theta,\phi)\,|\xi\rangle + D_{ljm_j}\,Y_l^{m_j+1/2}(\theta,\phi)\,|\eta\rangle\right], \qquad (54.68)

where |\xi\rangle = |{\uparrow}\rangle and |\eta\rangle = |{\downarrow}\rangle.

The first-order time-independent perturbation theory result for the energy is

W^{(1)} = \langle H'\rangle_{nljm_j}, \qquad (54.69)

where the matrix element is diagonal,

\langle nljm_j|H'|nl'j'm'_j\rangle = \delta_{ll'}\,\delta_{jj'}\,\delta_{m_jm'_j}\,\langle H'\rangle_{nljm_j}. \qquad (54.70)

The corrected energy is

W_{nj} = E_n + W^{(1)}_{nj}. \qquad (54.71)

Using the matrix elements of powers of momentum and radius, one finds for the first term in (54.66)

\langle H_1\rangle_{nljm_j} = -\frac{1}{2}\,mc^2\,\frac{(\alpha Z)^4}{n^3}\left(\frac{1}{l+1/2} - \frac{3}{4n}\right), \qquad (54.72)

for the spin–orbit coupling

\langle H_2\rangle_{nljm_j} = \begin{cases} \pm\dfrac{1}{4}\,mc^2\,\dfrac{(\alpha Z)^4}{n^3}\,\dfrac{1}{(l+1/2)(l+1/2\pm 1/2)}, & j = l \pm 1/2,\; l \neq 0, \\ 0, & l = 0, \end{cases} \qquad (54.73)

and for the Darwin term

\langle H_3\rangle_{nljm_j} = \frac{1}{2}\,mc^2\,\frac{(\alpha Z)^4}{n^3}\,\delta_{l0}. \qquad (54.74)

Since at l \gg 1 we have

\frac{1}{l+1/2\pm 1/2} \simeq \frac{1}{l+1/2}\left(1 \mp \frac{1}{2(l+1/2)}\right), \qquad (54.75)

this means that at large l or j, we find the total relativistic correction

W^{(1)}_{nj} = -\frac{1}{2}\,mc^2\,\frac{(\alpha Z)^4}{n^3}\left(\frac{1}{j+1/2} - \frac{3}{4n}\right). \qquad (54.76)
2 n3 j + 1/2 4n

Important Concepts to Remember

• The Dirac equation comes from the quantum field theory treatment of the electron, though it is
often mistakenly described as a relativistic version of quantum mechanics.
• From the Hamiltonian becoming the relativistic energy, we get the Klein–Gordon equation, \left[\vec\nabla^2 - c^{-2}\partial_t^2 - (mc/\hbar)^2\right]\psi = 0, and the Dirac equation is a sort of matrix square root of it, i\hbar\partial_t\psi = (-i\hbar c\,\vec\alpha\cdot\vec\nabla + \beta mc^2)\psi.
• There is no relativistic quantum mechanics, since: the particle number is not conserved, owing at least to particle–antiparticle annihilation (and more); for a short time we can have virtual particles that can also change the particle number; and in quantum mechanics we can violate causality.
• The relativistic form of the Dirac equation is (\gamma^\mu\partial_\mu + mc/\hbar)\psi = 0 (where \partial_t has a factor 1/c in front), with \gamma^\mu the gamma matrices satisfying the Clifford algebra \{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}, \psi a spinor field, and \alpha^i = -\gamma^0\gamma^i and \beta = i\gamma^0.
• The coupling to electromagnetism is via the minimal coupling, \vec{p} \to \vec{p} - q\vec{A}, relativistically \frac{\hbar}{i}\partial_\mu \to \frac{\hbar}{i}\partial_\mu - qA_\mu.
• The resulting coupling to a magnetic field is \left[(\vec{p} - q\vec{A})^2/2m - (q\hbar/2m)\,\vec\sigma\cdot\vec{B}\right], and the coupling to an electric field is the spin–orbit interaction (giving Thomas precession),

\frac{1}{2m^2c^2}\,(\vec{S}\cdot\vec{L})\,\frac{1}{r}\frac{dV}{dr},

and the Darwin term,

\frac{\hbar^2}{8m^2c^2}\,\Delta V.

• The relativistic corrections to the hydrogenoid atom come from the first relativistic correction to the energy, -\vec{p}^{\,4}/(8m^3c^2), the spin–orbit interaction, and the Darwin term, leading to

-\frac{mc^2}{2}\,\frac{(\alpha Z)^4}{n^3}\left(\frac{1}{j+1/2} - \frac{3}{4n}\right).
2 n3

Further Reading
See [2], [1] and [3].

Exercises

(1) Does a wave function satisfy the Klein–Gordon equation? What can you deduce about the sign of
the rest energy mc2 in the Dirac equation?
(2) Find the Dirac equation in 1+1 dimensions (one space dimension) and find a representation for
the matrices involved.
(3) Show that if we consider γ5 = iγ0 γ1 γ2 γ3 together with γ0 and γi , they form a Clifford algebra
in five dimensions.
(4) Write down the Klein–Gordon equation with coupling to electromagnetism.
(5) Consider a shell model for a nucleus, with the potential for a nucleon being approximated by the
Yukawa potential. Calculate the first relativistic correction to the Hamiltonian for the nucleon.
(6) Check the missing steps in the relativistic corrections to the hydrogenoid atom.
(7) Calculate the relativistic corrections to a hydrogenoid atom in a magnetic field.
55 Multiparticle States in Atoms and Condensed Matter: Schrödinger versus Occupation Number

In this chapter we return to multiparticle states, from the point of view of atomic physics and
condensed matter physics, with a large number of identical particles. After reviewing the Schrödinger
representation and the approximations needed, we lay the foundations of the occupation number
picture, which will be continued in the next chapter in terms of Fock states and second quantization.

55.1 Schrödinger Picture Multiparticle Review

The abstract-state Schrödinger equations, in the time-dependent and time-independent varieties, are

i\hbar\,\partial_t|\psi\rangle = \hat{H}|\psi\rangle \;\Rightarrow\; \hat{H}|\psi\rangle = E|\psi\rangle. \qquad (55.1)

This equation applies to both single-particle and multiparticle states. Applying it to N identical particles, the coordinate basis is |\vec{r}_1 \ldots \vec{r}_N\rangle, and the basis has exchange symmetry under exchange of \vec{r}_i with \vec{r}_j (particle i with particle j),

|\vec{r}_1 \ldots \vec{r}_i \ldots \vec{r}_j \ldots \vec{r}_N\rangle = \pm\,|\vec{r}_1 \ldots \vec{r}_j \ldots \vec{r}_i \ldots \vec{r}_N\rangle, \qquad (55.2)

where the ± correspond to Bose–Einstein and Fermi–Dirac statistics, respectively. If the particles are
all different, we have no symmetry. If only some of them are identical, the exchange symmetry is
valid for them only.
The completeness relation on the N-particle Hilbert space is

\int d^3r_1 \cdots \int d^3r_N\, |\vec{r}_1 \ldots \vec{r}_N\rangle\langle\vec{r}_1 \ldots \vec{r}_N| = 1. \qquad (55.3)

We can use, alternatively, the unsymmetrized basis |\vec{r}_1\rangle \otimes \cdots \otimes |\vec{r}_N\rangle in the identical-particle case also (since in the case of all-different particles it is a basis by definition), with completeness relation

\int d^3r_1 \cdots \int d^3r_N\, |\vec{r}_1\rangle\langle\vec{r}_1| \otimes \cdots \otimes |\vec{r}_N\rangle\langle\vec{r}_N| = 1. \qquad (55.4)

Inserting the symmetrized basis, we find the Schrödinger equation acting on wave functions,

i\hbar\,\partial_t\langle\vec{r}_1 \ldots \vec{r}_N|\psi\rangle = \int d^3r'_1 \cdots \int d^3r'_N\, \langle\vec{r}_1 \ldots \vec{r}_N|\hat{H}|\vec{r}^{\,\prime}_1 \ldots \vec{r}^{\,\prime}_N\rangle\langle\vec{r}^{\,\prime}_1 \ldots \vec{r}^{\,\prime}_N|\psi\rangle, \qquad (55.5)

or

i\hbar\,\partial_t\psi_N(\vec{r}_1, \ldots, \vec{r}_N) = H_{\vec{r}_1,\ldots,\vec{r}_N}\,\psi_N(\vec{r}_1, \ldots, \vec{r}_N) \;\Rightarrow\; H_{\vec{r}_1,\ldots,\vec{r}_N}\,\psi_N(\vec{r}_1, \ldots, \vec{r}_N) = E\,\psi_N(\vec{r}_1, \ldots, \vec{r}_N). \qquad (55.6)

The symmetry properties of N independent but identical particles imply the general invariant states

|12\ldots N\rangle = \frac{1}{N_{\rm perms}}\sum_{{\rm perms}\;P} {\rm sgn}(P)\,\hat{P}\,|1\rangle \otimes |2\rangle \otimes \cdots \otimes |N\rangle, \qquad (55.7)

where in the Bose–Einstein case sgn(P) is replaced by 1, leading to the wave function \psi_{S/A}(\vec{r}_1, \ldots, \vec{r}_N).

In the Fermi–Dirac (antisymmetric) case, we have

\psi_A(\vec{r}_1, \ldots, \vec{r}_N) = \begin{vmatrix} \psi_{a_1}(\vec{r}_1) & \cdots & \psi_{a_N}(\vec{r}_1) \\ \vdots & & \vdots \\ \psi_{a_1}(\vec{r}_N) & \cdots & \psi_{a_N}(\vec{r}_N) \end{vmatrix}, \qquad (55.8)

known as the Slater determinant.


The symmetrized wave function satisfies

\psi_{N,S/A}(\vec{r}_1, \ldots, \vec{r}_i, \ldots, \vec{r}_j, \ldots, \vec{r}_N) = \pm\,\psi_{N,S/A}(\vec{r}_1, \ldots, \vec{r}_j, \ldots, \vec{r}_i, \ldots, \vec{r}_N). \qquad (55.9)
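The antisymmetry of the Slater determinant is automatic, since swapping two positions swaps two rows. A small self-contained sketch (not from the text; the one-dimensional toy orbitals \phi_a(r) = r^a are purely illustrative) makes this concrete by implementing the signed sum over permutations directly:

```python
# Unnormalized Slater determinant for 3 "fermions" with toy orbitals phi_a(r) = r^a,
# built as the signed sum over permutations; swapping two particles flips the sign.
import itertools

def slater(orbitals, positions):
    """Sum over permutations P of sgn(P) * prod_i phi_{P(i)}(r_i)."""
    n = len(positions)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        # parity of the permutation via inversion count
        sign = (-1) ** sum(1 for i in range(n)
                           for j in range(i + 1, n) if perm[i] > perm[j])
        term = 1.0
        for i in range(n):
            term *= orbitals[perm[i]](positions[i])
        total += sign * term
    return total

orbs = [lambda r, a=a: r**a for a in range(3)]   # toy orbitals phi_a(r) = r^a
r = [0.3, 1.1, 2.4]
psi = slater(orbs, r)
psi_swapped = slater(orbs, [r[1], r[0], r[2]])   # exchange particles 1 and 2
assert abs(psi + psi_swapped) < 1e-12            # antisymmetry under exchange
```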

We consider a Hamiltonian containing a free part that acts on a single particle, namely the kinetic part for each particle, and a potential that acts on several particles,

\hat{H} = \hat{H}_0 + \hat{V}(\vec{r}_1, \ldots, \vec{r}_N),
\hat{H}_0 = \hat{T} = \sum_{i=1}^N \frac{\vec{p}_i^{\,2}}{2m_i}, \qquad (55.10)
\hat{V}(\vec{r}_1, \ldots, \vec{r}_N) = \sum_{i=1}^N \hat{V}(\vec{r}_i) + \sum_{i,j=1}^N \hat{V}(\vec{r}_i, \vec{r}_j) + \sum_{i,j,k=1}^N \hat{V}(\vec{r}_i, \vec{r}_j, \vec{r}_k) + \cdots

Here the one-particle potential \sum_{i=1}^N \hat{V}(\vec{r}_i) can be a Coulomb potential from a fixed source, in which case it could be added to the free one-particle Hamiltonian \hat{H}_0, while the three-particle potential \hat{V}(\vec{r}_i, \vec{r}_j, \vec{r}_k) and any higher N-particle potentials come from quantum field theory corrections to the classical potential.

The only relevant term in the potential is then the two-particle potential \hat{V}(\vec{r}_i, \vec{r}_j) = \hat{V}(|\vec{r}_i - \vec{r}_j|) (the Coulomb potential between the particles, for instance).

Figure 55.1 (a) The two-point potential can come from a quantum mechanical Feynman diagram, but without loops (since it is classical). (b) The three-point potential can only come from loop corrections, and so from quantum field theory.

55.2 Approximation Methods

The multiparticle Schrödinger equation is hard to solve exactly, so in the cases relevant for atomic or molecular physics and condensed matter physics, one uses approximation methods to get a workable wave function.

A useful example for this chapter is the LCAO (linear combination of atomic orbitals) approximation to the multi-electron wave function for an atom or molecule. Consider the basis of factorized wave functions

\phi_r(\vec{r}_1, \ldots, \vec{r}_N) \equiv \phi_{a_1}(\vec{r}_1) \cdots \phi_{a_N}(\vec{r}_N), \qquad (55.11)

where r is an index encompassing the relevant subset of all the a_1, \ldots, a_N, or its symmetrized alternative (in the fermionic case, a Slater determinant).

In the case of the molecular wave function, we must also add a dependence on the positions \vec{R}_\alpha of the nuclei in the molecule, giving \phi_r(\{\vec{r}_i\}, \{\vec{R}_\alpha\}).

Then the wave function is expanded in a finite subset of the total set of basis states,

\psi(\{\vec{r}_i\}, \{\vec{R}_\alpha\}) = \sum_r C_r\,\phi_r(\{\vec{r}_i\}, \{\vec{R}_\alpha\}), \qquad (55.12)

and we minimize over the C_r.
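Minimizing \langle\psi|\hat{H}|\psi\rangle over the C_r at fixed normalization leads to a generalized eigenvalue problem HC = ESC, with H_{rs} = \langle\phi_r|\hat{H}|\phi_s\rangle and overlap matrix S_{rs} = \langle\phi_r|\phi_s\rangle for non-orthogonal orbitals. A minimal numerical sketch (the matrix elements below are made-up numbers for a hypothetical two-orbital basis, just to show the mechanics):

```python
# Variational LCAO as a generalized eigenproblem  H C = E S C.
import numpy as np
from scipy.linalg import eigh

H = np.array([[-1.0, -0.4],
              [-0.4, -0.8]])        # hypothetical Hamiltonian matrix elements
S = np.array([[ 1.0,  0.3],
              [ 0.3,  1.0]])        # overlap of non-orthogonal orbitals

E, C = eigh(H, S)                    # generalized symmetric eigenproblem
E0, C0 = E[0], C[:, 0]               # variational ground state in this basis

# the minimizer satisfies the stationarity condition of the Rayleigh quotient
assert abs(C0 @ H @ C0 - E0 * (C0 @ S @ C0)) < 1e-10
```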

55.3 Transition to Occupation Number

The approximation method amounts to using a subset of the complete set of states, but if we keep all the states, we will get an exact statement. The basis wave functions are denoted

\phi_{a_1\ldots a_N}(\vec{r}_1, \ldots, \vec{r}_N) = \phi_{a_1}(\vec{r}_1)\,\phi_{a_2}(\vec{r}_2) \cdots \phi_{a_N}(\vec{r}_N). \qquad (55.13)

The exact expansion in this basis has coefficients for all the basis states, i.e.,

\psi(\vec{r}_1, \ldots, \vec{r}_N; t) = \sum C_{a_1\ldots a_N}(t)\,\phi_{a_1\ldots a_N}(\vec{r}_1, \ldots, \vec{r}_N). \qquad (55.14)

In the LCAO approximation, one takes a subset of states and minimizes the Schrödinger equation over the coefficients.

Alternatively, to obtain the same expansion, we can use successive expansions in each particle. That is, first expand in wave functions of \vec{r}_1,

\psi(\vec{r}_1, \ldots, \vec{r}_N; t) = \sum_{a_1} C_{a_1}(\vec{r}_2, \ldots, \vec{r}_N; t)\,\phi_{a_1}(\vec{r}_1), \qquad (55.15)

and then the coefficients are expanded in the wave functions for \vec{r}_2,

C_{a_1}(\vec{r}_2, \ldots, \vec{r}_N; t) = \sum_{a_2} C_{a_1a_2}(\vec{r}_3, \ldots, \vec{r}_N; t)\,\phi_{a_2}(\vec{r}_2), \qquad (55.16)

etc., until we are back to

\psi(\vec{r}_1, \ldots, \vec{r}_N; t) = \sum_{a_1,\ldots,a_N} C_{a_1\cdots a_N}(t)\,\phi_{a_1}(\vec{r}_1) \cdots \phi_{a_N}(\vec{r}_N). \qquad (55.17)

The symmetry properties of the wave function are transmitted to symmetry properties of the coefficients,

C_{a_1\ldots a_i\ldots a_j\ldots a_N}(t) = \pm\,C_{a_1\ldots a_j\ldots a_i\ldots a_N}(t). \qquad (55.18)

A relevant case occurs when the indices a_i index energies, so that a_i \to E_i. The energies can even be approximately continuous, though not exactly continuous. In this case, we will not consider degeneracies: the energy indexes the available states. From now on we will replace a_i with E_i everywhere.

Note then that E_i takes values in the set \{1, 2, 3, \ldots, \infty\}, ordered by increasing energy.

In the following we will mostly follow Bose–Einstein (boson) statistics, while Fermi–Dirac (fermion) statistics will be treated later.

Then, if index 1 (the lowest-energy state) occurs n_1 times among the row of indexes (E_1, \ldots, E_N), index 2 occurs n_2 times in the same row of indexes, etc., until \infty occurs n_\infty times in the row, we have that the total number of occurrences of all values equals N,

\sum_{i=1}^\infty n_i = N. \qquad (55.19)

In that case, the coefficients can have the dependence on (E_1, \ldots, E_N) replaced with dependence on the occupation numbers n_1, n_2, \ldots, n_\infty. The numerical equality is then related to a different functional dependence, \tilde{C} instead of C,

C_{E_1\ldots E_N}(t) = \tilde{C}_{n_1\ldots n_\infty}(t). \qquad (55.20)
Now we can translate the normalization condition in the indices E_i,

\sum_{E_1\ldots E_N} |C_{E_1\ldots E_N}(t)|^2 = 1, \qquad (55.21)

into a normalization condition in the indices n_i. But the sum over the E_i equals the sum over the n_i, with a combinatorial factor for choosing n_1 objects of type 1, n_2 of type 2, etc., out of a total of N,

C^N_{n_1\ldots n_\infty} = \frac{N!}{n_1!\cdots n_\infty!}. \qquad (55.22)

The normalization condition in \tilde{C} is now

1 = \sum_{n_1,\ldots,n_\infty} |\tilde{C}_{n_1\ldots n_\infty}(t)|^2\, C^N_{n_1\ldots n_\infty} = \sum_{n_1,\ldots,n_\infty} \left|\tilde{C}_{n_1\cdots n_\infty}(t)\,\sqrt{\frac{N!}{n_1!\cdots n_\infty!}}\right|^2. \qquad (55.23)

This then allows for the definition of new coefficients, appropriately normalized:

C'_{n_1\ldots n_\infty}(t) = \sqrt{\frac{N!}{n_1!\cdots n_\infty!}}\;\tilde{C}_{n_1\ldots n_\infty}(t). \qquad (55.24)
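The combinatorial factor (55.22) can be checked by brute force (not from the text; the sizes N = 4 particles and M = 3 levels are toy choices): enumerate all ordered index tuples (E_1, \ldots, E_N) and count how many share the same occupation numbers:

```python
# Count ordered tuples (E_1,...,E_N) per occupation-number configuration and
# compare with the multinomial N! / (n_1! ... n_M!).
import itertools, math
from collections import Counter

N, M = 4, 3                        # 4 particles, 3 available levels (toy sizes)
counts = Counter()
for tup in itertools.product(range(M), repeat=N):   # all ordered index tuples
    occ = tuple(tup.count(level) for level in range(M))
    counts[occ] += 1

for occ, c in counts.items():
    multinomial = math.factorial(N) // math.prod(math.factorial(n) for n in occ)
    assert c == multinomial
```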

55.4 Schrödinger Equation in Occupation Number Space

We now consider the Schrödinger equation with a Hamiltonian that contains a kinetic term that is a one-particle operator, perhaps including also the one-particle potential V(\vec{r}_i), and a potential that is a two-particle operator,

\hat{H} = \sum_{i=1}^N T(\vec{r}_i) + \sum_{i,j=1}^N \hat{V}(\vec{r}_i, \vec{r}_j) + \cdots \qquad (55.25)

We set to zero the three-particle and higher parts of the potential (since they are small, and come
from quantum field theory corrections to the classical potential).
We will transition from the Schrödinger equation in the coordinate representation to a representa-
tion in terms of occupation numbers.

One-Particle Operator Acting on Wave Functions


We start with the action of the (generalized) kinetic operator, meaning the one-particle part of the Hamiltonian. That means we calculate

\sum_{i=1}^N T(\vec{r}_i)\,\psi(\vec{r}_1, \ldots, \vec{r}_N; t) = \sum_{i=1}^N \sum_{E_1\ldots E_N} C_{E_1\ldots E_N}(t)\,T(\vec{r}_i)\,\phi_{E_1}(\vec{r}_1) \cdots \phi_{E_N}(\vec{r}_N). \qquad (55.26)

More precisely, we want to peel off the extra wave functions (N - 1 of them) on the right-hand side. To do that, we calculate

T_{(1)} \equiv \int d^3r_1 \cdots \int d^3r_N\,\phi^*_{E_1}(\vec{r}_1) \cdots \phi^*_{E_N}(\vec{r}_N)\,\sum_{i=1}^N T(\vec{r}_i)\,\psi(\vec{r}_1, \ldots, \vec{r}_N; t). \qquad (55.27)

Using the orthonormality of the wave functions,

\int d^3r_j\,\phi^*_{E_j}(\vec{r}_j)\,\phi_{E'_j}(\vec{r}_j) = \delta_{E_jE'_j}, \qquad (55.28)

for all the wave functions except \phi_{E_i}(\vec{r}_i), on which T(\vec{r}_i) acts, we obtain

T_{(1)} = \sum_{i=1}^N \sum_{E'_i} \int d^3r_i\,\phi^*_{E_i}(\vec{r}_i)\,T(\vec{r}_i)\,\phi_{E'_i}(\vec{r}_i)\; C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_N}(t)
= \sum_{i=1}^N \sum_{E'_i} \langle\phi_{E_i}|\hat{T}_i|\phi_{E'_i}\rangle\, C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_N}(t). \qquad (55.29)

This means that in the coefficients, we replace E_i with E'_i and sum over it.

Next we change to the occupation number indexes:

C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_N}(t) = \tilde{C}_{n_1\ldots,n_{E_i}-1,\ldots,n_{E'_i}+1,\ldots,n_\infty}
= \sqrt{\frac{n_1!\cdots n_\infty!}{N!}}\; C'_{n_1\ldots,n_{E_i}-1,\ldots,n_{E'_i}+1,\ldots,n_\infty}. \qquad (55.30)

In the first equality we note that we have removed one occupied state from the value E_i and added it at E'_i. In the second equality, we have just redefined the coefficients in terms of correctly normalized ones.

When going to occupation number space, the sum over particles translates into a sum over states, the multiplicity being the occupation numbers,

\sum_{i=1}^N \langle\phi_{E_i}| = \sum_{E_i} n_{E_i}\,\langle\phi_{E_i}|. \qquad (55.31)

At this point, we can simplify the notation, and replace |\phi_{E_i}\rangle with |i\rangle and n_{E_i} with n_i everywhere. Then we get the simple one-particle operator action

T_{(1)} = \sum_{i,i'=1}^{\infty} \langle i|\hat{T}|i'\rangle\, n_i\,\sqrt{\frac{n_1!\cdots n_\infty!}{N!}}\; C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_\infty}(t). \qquad (55.32)

Two-Particle Operator Acting on Wave Functions


Next, we consider the action of the two-particle operator, i.e., the two-particle potential

\sum_{i>j=1}^N V(\vec{r}_i, \vec{r}_j) = \frac{1}{2}\sum_{i,j=1,\,i\neq j}^N V(\vec{r}_i, \vec{r}_j), \qquad (55.33)

and define the same kind of coefficient coming from it as for T_{(1)},

V_{(2)} = \int d^3r_1 \cdots \int d^3r_N\,\phi^*_{E_1}(\vec{r}_1) \cdots \phi^*_{E_N}(\vec{r}_N)\,\frac{1}{2}\sum_{i,j=1,\,i\neq j}^N V(\vec{r}_i, \vec{r}_j)\,\psi(\vec{r}_1, \ldots, \vec{r}_N; t). \qquad (55.34)

Again using the orthonormality of the one-particle wave functions, except for those involving \vec{r}_i, \vec{r}_j, we find

V_{(2)} = \frac{1}{2}\sum_{i,j=1,\,i\neq j}^N \sum_{E'_iE'_j} \int d^3r_i \int d^3r_j\,\phi^*_{E_i}(\vec{r}_i)\,\phi^*_{E_j}(\vec{r}_j)\,V(\vec{r}_i, \vec{r}_j)\,\phi_{E'_i}(\vec{r}_i)\,\phi_{E'_j}(\vec{r}_j)
\times C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_{j-1}E'_jE_{j+1}\ldots E_N}(t) \qquad (55.35)
= \frac{1}{2}\sum_{i,j=1,\,i\neq j}^N \sum_{E'_iE'_j} \langle\phi_{E_i}\phi_{E_j}|\hat{V}|\phi_{E'_i}\phi_{E'_j}\rangle\, C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_{j-1}E'_jE_{j+1}\ldots E_N}(t).

We then change to the occupation number representation, now with two changes in the indices,

C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_{j-1}E'_jE_{j+1}\ldots E_N}(t) = \tilde{C}_{n_1\ldots,n_{E_i}-1,\ldots,n_{E'_i}+1,\ldots,n_{E_j}-1,\ldots,n_{E'_j}+1,\ldots,n_\infty}
= \sqrt{\frac{n_1!\cdots n_\infty!}{N!}}\; C'_{n_1\ldots,n_{E_i}-1,\ldots,n_{E'_i}+1,\ldots,n_{E_j}-1,\ldots,n_{E'_j}+1,\ldots,n_\infty}, \qquad (55.36)

where the factorials in the square root have the same occupation numbers as the coefficients C'.

Now, besides replacing the single sum over particles with a sum over states that includes the multiplicity of occupation numbers,

\sum_{i=1}^N \langle\phi_{E_i}| = \sum_{E_i} n_{E_i}\,\langle\phi_{E_i}|, \qquad (55.37)

we have to account for the double sum, in which, if two energies are equal, the second sum has one state less,

\sum_{i=1}^N \sum_{j=1}^N \langle\phi_{E_i}\phi_{E_j}| = \sum_{E_j\neq E_i} \sum_{E_i} n_{E_i}n_{E_j}\,\langle\phi_{E_i}\phi_{E_j}| + \sum_{E_i} n_{E_i}(n_{E_i}-1)\,\langle\phi_{E_i}\phi_{E_i}|. \qquad (55.38)

Relabeling |\phi_{E_i}\rangle as |i\rangle and n_{E_i} = n_i as before, we find the two-particle potential operator:

V_{(2)} = \sum_{E_iE_jE'_iE'_j} \frac{1}{2}\,n_{E_i}(n_{E_j} - \delta_{E_iE_j})\,\langle\phi_{E_i}\phi_{E_j}|\hat{V}_{(2)}|\phi_{E'_i}\phi_{E'_j}\rangle
\times \sqrt{\frac{n_1!\cdots n_\infty!}{N!}}\; C'_{n_1\ldots,n_{E_i}-1,\ldots,n_{E'_i}+1,\ldots,n_{E_j}-1,\ldots,n_{E'_j}+1,\ldots,n_\infty}(t) \qquad (55.39)
= \sum_{i,j,i',j'} \frac{1}{2}\,n_i(n_j - \delta_{ij})\,\langle ij|\hat{V}_{(2)}|i'j'\rangle
\times \sqrt{\frac{n_1!\cdots n_\infty!}{N!}}\; C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_j-1,\ldots,n_{j'}+1,\ldots,n_\infty}(t),

where again the factorials inside the square root have the same occupation numbers as the coefficients C'.
Then, the Schrödinger equation, multiplied by the wave functions and integrated over position, gives

\int d^3r_1 \int d^3r_2 \cdots \int d^3r_N\,\phi^*_{E_1}(\vec{r}_1) \cdots \phi^*_{E_N}(\vec{r}_N)\,\times\, i\hbar\,\partial_t\psi(\vec{r}_1, \ldots, \vec{r}_N; t) = T_{(1)} + V_{(2)}. \qquad (55.40)

Since the left-hand side gives i\hbar\,\partial_t C_{E_1\ldots E_N}(t), we obtain the Schrödinger equation for the coefficients, first in terms of the original C_{E_1\ldots E_N}(t) coefficients,

i\hbar\,\partial_t C_{E_1\ldots E_N}(t) = \sum_{i=1}^N \sum_{E'_i} \langle\phi_{E_i}|\hat{T}_i|\phi_{E'_i}\rangle\, C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_N}(t)
+ \frac{1}{2}\sum_{i,j=1,\,i\neq j}^N \sum_{E'_iE'_j} \langle\phi_{E_i}\phi_{E_j}|\hat{V}|\phi_{E'_i}\phi_{E'_j}\rangle\, C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_{j-1}E'_jE_{j+1}\ldots E_N}(t), \qquad (55.41)
and finally in terms of the C'_{n_1\ldots n_\infty}(t) coefficients,

i\hbar\,\sqrt{\frac{n_1!\cdots n_\infty!}{N!}}\;\partial_t C'_{n_1\ldots n_\infty}(t) = \sum_{i,i'} \langle i|\hat{T}|i'\rangle\, n_i\,\sqrt{\frac{n_1!\cdots(n_i-1)!\cdots(n_{i'}+1)!\cdots n_\infty!}{N!}}
\times C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_\infty}(t)
+ \sum_{i\neq j,\,i',j'} \frac{1}{2}\,n_in_j\,\langle ij|\hat{V}_{(2)}|i'j'\rangle
\times \sqrt{\frac{n_1!\cdots(n_i-1)!\cdots(n_{i'}+1)!\cdots(n_j-1)!\cdots(n_{j'}+1)!\cdots n_\infty!}{N!}}
\times C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_j-1,\ldots,n_{j'}+1,\ldots,n_\infty}(t)
+ \sum_{i=j,\,i',j'} \frac{1}{2}\,n_i(n_i-1)\,\langle ij|\hat{V}_{(2)}|i'j'\rangle
\times \sqrt{\frac{n_1!\cdots(n_i-2)!\cdots(n_{i'}+1)!\cdots(n_{j'}+1)!\cdots n_\infty!}{N!}}
\times C'_{n_1\ldots,n_i-2,\ldots,n_{i'}+1,\ldots,n_{j'}+1,\ldots,n_\infty}(t). \qquad (55.42)

55.5 Analysis for Fermions

In the analysis before, we were considering bosonic particles. To analyze the case of fermions, we must replace the factorized basis (products of the one-particle wave functions) with the Slater determinant, with combinatorial coefficient

[C^N_{n_1\ldots n_\infty}]^{-1/2} \begin{vmatrix} \phi_{E_1}(\vec{r}_1) & \cdots & \phi_{E_N}(\vec{r}_1) \\ \vdots & & \vdots \\ \phi_{E_1}(\vec{r}_N) & \cdots & \phi_{E_N}(\vec{r}_N) \end{vmatrix}. \qquad (55.43)

But this combinatorial coefficient is the same as before,

[C^N_{n_1\ldots n_\infty}]^{-1/2} = \sqrt{\frac{n_1!\cdots n_\infty!}{N!}} = \frac{1}{\sqrt{N!}}, \qquad (55.44)

since the occupation numbers n_i for fermions are either 0 or 1, in both cases giving n_i! = 1.

The one-particle and two-particle operators act on particles with the same coefficients C_{E_1\ldots E_N}(t), where all the E_i are different (since for fermions we have at most one particle per state). Then the coefficients can be rewritten in terms of \tilde{C}_{n_1\ldots n_\infty}(t) and C'_{n_1\ldots n_\infty}(t),

C_{E_1\ldots E_N}(t) = \tilde{C}_{n_1\ldots n_\infty}(t) = \frac{1}{\sqrt{N!}}\,C'_{n_1\ldots n_\infty}(t). \qquad (55.45)

The one-particle operator acting on the Slater determinant gives

\sum \langle\phi_{E_i}|\hat{T}_i|\phi_{E'_i}\rangle\, C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_N}(t), \qquad (55.46)

where the term with coefficient C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_N}(t) acts by replacing the Slater determinant with

\begin{vmatrix} \phi_{E_1}(\vec{r}_1) & \cdots & \phi_{E'_i}(\vec{r}_1) & \cdots & \phi_{E_N}(\vec{r}_1) \\ \vdots & & & & \vdots \\ \phi_{E_1}(\vec{r}_N) & \cdots & \phi_{E'_i}(\vec{r}_N) & \cdots & \phi_{E_N}(\vec{r}_N) \end{vmatrix}. \qquad (55.47)

Important Concepts to Remember

• The multiparticle Schrödinger equation has one-particle, two-body, three-body, etc., potentials,
where only the one-particle and two-body potentials have a classical counterpart, the others coming
from quantum field theory.
• We can expand the multiparticle wave function in a basis of factorized wave functions,
φr (r 1 , . . . , r N ) = φ a1 (r 1 ) · · · φ a N (r N ), and in the LCAO approximation we keep only a finite
number of these basis functions.
• The coefficients Ca1 ...a N (t) have the correct symmetry properties (for bosons or fermions). It is
useful to consider ai ≡ Ei (energy eigenstates), and transition to occupation number states for all
the energy eigenstates, CE1 ...E N (t) ≡ C̃n1 ...n∞ (t).

• The one-particle (kinetic and potential) operator in the occupation number picture is \sum_i \sum_{E'_i} \langle\phi_{E_i}|\hat{T}_i|\phi_{E'_i}\rangle\, C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_N}(t), and the two-particle operator is \frac{1}{2}\sum_{i\neq j}\sum_{E'_i,E'_j} \langle\phi_{E_i}\phi_{E_j}|\hat{V}|\phi_{E'_i}\phi_{E'_j}\rangle\, C_{E_1\ldots E_{i-1}E'_iE_{i+1}\ldots E_{j-1}E'_jE_{j+1}\ldots E_N}(t).
• We can write a Schrödinger equation for the coefficients CE1 ...E N (t) ≡ C̃n1 ...n∞ (t).
• For fermions the coefficients multiply changed Slater determinants.

Further Reading
More details are given in the book [27].

Exercises

(1) How do translational invariance and rotational symmetry constrain the three-body potential coming from quantum field theory?
(2) Consider the LCAO approximation for an atom with three electrons. Write the expansion,
restricting yourself to four basis elements.
(3) Write the coefficients of the expansion in exercise 2 as coefficients in terms of energy,
CE1 ... , show explicitly their symmetry properties, and then rewrite them as coefficients in the
occupation number picture.
(4) Consider five bosons with a one-body potential that can be approximated as harmonic oscillator.
Write the wave function expansion in terms of harmonic oscillator states with n ≤ 3.
(5) For the case in exercise 4, write the one-particle (kinetic) operator in the Hamiltonian, in the
occupation number basis.
(6) For the case in exercise 4, consider a two-particle potential of the Coulomb (∝ 1/r) type. Write
explicitly the first six nontrivial terms (those having an occupation number change in the two-
particle operator in the Hamiltonian), in the occupation number basis.
(7) Consider three fermions with a one-body potential that can be approximated as a harmonic
oscillator potential. Write the first four one-particle terms in the Hamiltonian that acts on the
Slater determinant (for a wave function with arbitrary coefficients).
56 Fock Space Calculation with Field Operators

In this chapter we continue with the occupation number picture, and build the Fock space corresponding to it. Then we build field operators, and operators acting on them. This is the beginning of the quantum field theory formalism, though in its nonrelativistic version, with v \ll c and no antiparticles. This is a whole field of study, many-body physics, of which we describe only the foundation.

56.1 Creation and Annihilation Operators

Assume, as in the previous chapter, that there are one-particle states, which means that there are (quasi-)particles. The states are indexed as (1, 2, \ldots, \infty), and each multiparticle state has occupation numbers (n_1, n_2, \ldots, n_\infty) for the one-particle states. Consider basis vectors for the multiparticle Hilbert space that are tensor products of each single-particle state,

|n_1 n_2 \ldots n_\infty\rangle = |n_1\rangle \otimes |n_2\rangle \otimes \cdots \otimes |n_\infty\rangle. \qquad (56.1)

In the coordinate representation for the n-particle Hilbert space, we have

\langle\vec{r}_1, \ldots, \vec{r}_{n_1}|n_1\rangle = \phi_1(\vec{r}_1) \cdots \phi_1(\vec{r}_{n_1}),
\langle\vec{r}_{n_1+1}, \ldots, \vec{r}_{n_1+n_2}|n_2\rangle = \phi_2(\vec{r}_{n_1+1}) \cdots \phi_2(\vec{r}_{n_1+n_2}), \quad {\rm etc.} \qquad (56.2)
The occupation number basis is complete in the multiparticle Hilbert space, so

\sum_{n_1\ldots n_\infty} |n_1 \ldots n_\infty\rangle\langle n_1 \ldots n_\infty| = 1. \qquad (56.3)

We now define creation and annihilation operators, first in the bosonic case, b_i^\dagger, b_i. These operators act in the same way as for the harmonic oscillator, just on the occupation number states. This means that on the single-particle state i, with occupation number n_i, the actions of b_i and b_i^\dagger are

b_i|n_i\rangle = \sqrt{n_i}\,|n_i - 1\rangle,
b_i^\dagger|n_i\rangle = \sqrt{n_i + 1}\,|n_i + 1\rangle. \qquad (56.4)

We can also define the number operator in the state i,

N_i = b_i^\dagger b_i \;\Rightarrow\; N_i|n_i\rangle = b_i^\dagger b_i|n_i\rangle = n_i|n_i\rangle. \qquad (56.5)

For the states with n_i = 0 or n_i = 1, we have, in particular,

b|0\rangle = 0, \quad b|1\rangle = |0\rangle,
b^\dagger|0\rangle = |1\rangle, \quad b^\dagger|1\rangle = \sqrt{2}\,|2\rangle. \qquad (56.6)

The creation and annihilation operators satisfy the independent harmonic oscillator commutation rules,

[b_i, b_j^\dagger] = \delta_{ij},
[b_i, b_j] = [b_i^\dagger, b_j^\dagger] = 0. \qquad (56.7)
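These relations are easy to verify with truncated matrix representations (a numerical sketch, not from the text; the cutoff n_max is an artifact of truncation, and the commutator fails only in the last diagonal entry):

```python
# Truncated matrix representation of b, b+: (b)_{m,n} = sqrt(n) delta_{m,n-1}.
import numpy as np

n_max = 6
b = np.diag(np.sqrt(np.arange(1, n_max)), k=1)   # b |n> = sqrt(n) |n-1>
bdag = b.T                                        # real matrices: dagger = transpose

comm = b @ bdag - bdag @ b
assert np.allclose(comm[:-1, :-1], np.eye(n_max - 1))   # [b, b+] = 1 below cutoff

N = bdag @ b                                      # number operator b+ b
assert np.allclose(np.diag(N), np.arange(n_max))  # N |n> = n |n>
```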

We can then finally extend, trivially, the actions of bi and b†i to the full multiparticle Hilbert space,
with its basis |n1 . . . ni . . . n∞ .

56.2 Occupation Number Representation for Fermions

We can define similarly the occupation number states (and representation) for fermions. The fermionic creation and annihilation operators a_i^\dagger, a_i for fermionic harmonic oscillators were defined in Chapters 20 and 28 via their anticommutation rules, owing to the antisymmetric statistics. The anticommutators are

\{a_i, a_j^\dagger\} = \delta_{ij},
\{a_i, a_j\} = \{a_i^\dagger, a_j^\dagger\} = 0. \qquad (56.8)

We can import the anticommutators from the fermionic harmonic oscillator to the single-particle states i.

Since we have fermions, the single-particle states can be either empty (unoccupied) or full (occupied), so the states are |0\rangle or |1\rangle. Then we must solve the anticommutation rules for the actions of the operators a_i and a_i^\dagger on the states |0\rangle and |1\rangle. The solution is

a|0\rangle = 0, \quad a|1\rangle = |0\rangle,
a^\dagger|0\rangle = |1\rangle, \quad a^\dagger|1\rangle = 0. \qquad (56.9)

With respect to the bosonic case, we only need to change the last action: since there is no |2\rangle state, it must be replaced with 0, i.e., |2\rangle \equiv 0.

In terms of the occupation numbers n_i = 0, 1, the relations above can be summarized as

a_i|n_i\rangle = n_i\,|n_i - 1\rangle,
a_i^\dagger|n_i\rangle = (1 - n_i)\,|n_i + 1\rangle. \qquad (56.10)

Equivalently, we can just use the same definition as for the bosonic case, but with the identification |2\rangle = 0.
We can also define fermionic number operators,

N_i = a_i^\dagger a_i \;\Rightarrow\; N_i|n_i\rangle = a_i^\dagger a_i|n_i\rangle = n_i|n_i\rangle, \qquad (56.11)

for n_i = 0, 1.
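In the two-dimensional basis (|0\rangle, |1\rangle) these relations are simple 2x2 matrix statements, checked directly below (a sketch, not from the text; the basis ordering is a choice made here):

```python
# Single-mode fermionic operators as 2x2 matrices in the basis (|0>, |1>).
import numpy as np

a = np.array([[0.0, 1.0],
              [0.0, 0.0]])        # a|1> = |0>, a|0> = 0
adag = a.T                        # a+|0> = |1>, a+|1> = 0

assert np.allclose(a @ adag + adag @ a, np.eye(2))   # {a, a+} = 1
assert np.allclose(a @ a, 0) and np.allclose(adag @ adag, 0)  # nilpotency
assert np.allclose(np.diag(adag @ a), [0.0, 1.0])    # number operator N|n> = n|n>
```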
Finally, the extension of the actions of (ai† , ai ) on the basis |n1 . . . n∞  of the full multiparticle
Hilbert space will be explained later, in the definition of Fock space.

56.3 Schrödinger Equation on Occupation Number States

Define the abstract state

|\psi(t)\rangle = \sum_{n_1\ldots n_\infty} C_{n_1\ldots n_\infty}(t)\,|n_1 n_2 \ldots n_\infty\rangle, \qquad (56.12)

with a general wave function, i.e., coefficient C_{n_1\ldots n_\infty}.


Going back to the Schrödinger equation for the coefficients of occupation number states, (55.42), rewriting it by dividing by the square root of the factorials on the left-hand side and isolating the i = i' kinetic term, for which \sqrt{n_i}\sqrt{(n_i - 1) + 1} = n_i, we obtain

i\hbar\,\partial_t C'_{n_1\ldots n_\infty}(t) = \sum_i \langle i|\hat{T}|i\rangle\, n_i\, C'_{n_1\ldots n_i\ldots n_\infty}(t)
+ \sum_{i\neq i'} \langle i|\hat{T}|i'\rangle\,\sqrt{n_i}\,\sqrt{n_{i'} + 1}\; C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_\infty}(t)
+ \sum_{i\neq j,\,i',j'} \frac{1}{2}\,\langle ij|\hat{V}_{(2)}|i'j'\rangle\,\sqrt{n_in_j}\,\sqrt{n_{i'} + 1}\,\sqrt{n_{j'} + 1} \qquad (56.13)
\times C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_j-1,\ldots,n_{j'}+1,\ldots,n_\infty}(t)
+ \sum_{i=j,\,i',j'} \frac{1}{2}\,\langle ii|\hat{V}_{(2)}|i'j'\rangle\,\sqrt{n_i(n_i - 1)}\,\sqrt{n_{i'} + 1}\,\sqrt{n_{j'} + 1}
\times C'_{n_1\ldots,n_i-2,\ldots,n_{i'}+1,\ldots,n_{j'}+1,\ldots,n_\infty}(t).

Multiplying by the occupation number states, we obtain the Schrödinger equation for general states,

i\hbar\,\partial_t|\psi(t)\rangle = \sum_{n_1\ldots n_\infty;\;\sum_i n_i = N} i\hbar\,\partial_t C'_{n_1\ldots n_\infty}(t)\,|n_1 n_2 \ldots n_\infty\rangle
= \sum_{n_1\ldots n_\infty;\;\sum_i n_i = N} \Big\{ \sum_i \langle i|\hat{T}|i\rangle\, C'_{n_1\ldots n_i\ldots n_\infty}(t)\, n_i\,|n_1 \ldots n_i \ldots n_\infty\rangle
+ \sum_{i\neq i'} \langle i|\hat{T}|i'\rangle\, C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_\infty}(t)\,\sqrt{n_i}\,\sqrt{n_{i'} + 1}\,|n_1 \ldots n_i, \ldots, n_{i'}, \ldots, n_\infty\rangle
+ \sum_{i\neq j,\,i',j'} \frac{1}{2}\,\langle ij|\hat{V}_{(2)}|i'j'\rangle\, C'_{n_1\ldots,n_i-1,\ldots,n_{i'}+1,\ldots,n_j-1,\ldots,n_{j'}+1,\ldots,n_\infty}(t) \qquad (56.14)
\times \sqrt{n_in_j}\,\sqrt{n_{i'} + 1}\,\sqrt{n_{j'} + 1}\,|n_1 \ldots n_i, \ldots, n_{i'}, \ldots, n_j, \ldots, n_{j'}, \ldots, n_\infty\rangle
+ \sum_{i=j,\,i',j'} \frac{1}{2}\,\langle ii|\hat{V}_{(2)}|i'j'\rangle\, C'_{n_1\ldots,n_i-2,\ldots,n_{i'}+1,\ldots,n_{j'}+1,\ldots,n_\infty}(t)
\times \sqrt{n_i(n_i - 1)}\,\sqrt{n_{i'} + 1}\,\sqrt{n_{j'} + 1}\,|n_1 \ldots n_i, \ldots, n_{i'}, \ldots, n_{j'}, \ldots, n_\infty\rangle \Big\}.

Redefine in the second term (in the first term no redefinition is needed)

n_i - 1 = \tilde{n}_i, \quad n_{i'} + 1 = \tilde{n}_{i'}. \qquad (56.15)



In the third term, we redefine

n_i - 1 = \tilde{n}_i, \quad n_j - 1 = \tilde{n}_j,
n_{i'} + 1 = \tilde{n}_{i'}, \quad n_{j'} + 1 = \tilde{n}_{j'}, \qquad (56.16)

and in the fourth term, we redefine

n_i - 2 = \tilde{n}_i, \quad n_{i'} + 1 = \tilde{n}_{i'}, \quad n_{j'} + 1 = \tilde{n}_{j'}. \qquad (56.17)

Then the states and prefactors multiplying the matrix elements and coefficients C' become, under the redefinitions,

n_i\,|n_1 \ldots n_i \ldots n_\infty\rangle = b_i^\dagger b_i\,|n_1 \ldots n_i \ldots n_\infty\rangle,
\sqrt{\tilde{n}_i + 1}\,\sqrt{\tilde{n}_{i'}}\,|n_1 \ldots, \tilde{n}_i + 1, \ldots, \tilde{n}_{i'} - 1, \ldots, n_\infty\rangle = b_i^\dagger b_{i'}\,|n_1 \ldots \tilde{n}_i \ldots \tilde{n}_{i'} \ldots n_\infty\rangle,
\sqrt{\tilde{n}_i + 1}\,\sqrt{\tilde{n}_{i'}}\,\sqrt{\tilde{n}_j + 1}\,\sqrt{\tilde{n}_{j'}}\,|n_1 \ldots, \tilde{n}_i + 1, \ldots, \tilde{n}_{i'} - 1, \ldots, \tilde{n}_j + 1, \ldots, \tilde{n}_{j'} - 1, \ldots, n_\infty\rangle
= b_i^\dagger b_{i'} b_j^\dagger b_{j'}\,|n_1 \ldots \tilde{n}_i \ldots \tilde{n}_{i'} \ldots \tilde{n}_j \ldots \tilde{n}_{j'} \ldots n_\infty\rangle, \qquad (56.18)
\sqrt{(\tilde{n}_i + 1)(\tilde{n}_i + 2)}\,\sqrt{\tilde{n}_{i'}}\,\sqrt{\tilde{n}_{j'}}\,|n_1 \ldots, \tilde{n}_i + 2, \ldots, \tilde{n}_{i'} - 1, \ldots, \tilde{n}_{j'} - 1, \ldots, n_\infty\rangle
= b_i^\dagger b_i^\dagger b_{i'} b_{j'}\,|n_1 \ldots \tilde{n}_i \ldots \tilde{n}_{i'} \ldots \tilde{n}_{j'} \ldots n_\infty\rangle,

and we can replace the sum over n_i with a sum over \tilde{n}_i, etc., since the sum is unchanged, the prefactor being zero for the extra terms in the sum. Then, in all the terms, the original basis vectors are reobtained, and from them the full |\psi(t)\rangle.

Finally then, the Schrödinger equation for general states is

i\hbar\,\partial_t|\psi(t)\rangle = \sum_i \langle i|\hat{T}|i\rangle\, b_i^\dagger b_i\,|\psi(t)\rangle
+ \sum_{i\neq i'} \langle i|\hat{T}|i'\rangle\, b_i^\dagger b_{i'}\,|\psi(t)\rangle
+ \sum_{i\neq j,\,i',j'} \frac{1}{2}\,\langle ij|\hat{V}_{(2)}|i'j'\rangle\, b_i^\dagger b_j^\dagger b_{i'} b_{j'}\,|\psi(t)\rangle \qquad (56.19)
+ \sum_{i=j,\,i',j'} \frac{1}{2}\,\langle ii|\hat{V}_{(2)}|i'j'\rangle\, b_i^\dagger b_i^\dagger b_{i'} b_{j'}\,|\psi(t)\rangle.

The right-hand side of the equation can be identified with \hat{H}|\psi(t)\rangle in occupation number states, which means that the Hamiltonian acting on the occupation number states is

\hat{H} = \sum_{i,i'} b_i^\dagger\,\langle i|\hat{T}|i'\rangle\, b_{i'} + \frac{1}{2}\sum_{i,j,i',j'} b_i^\dagger b_j^\dagger\,\langle ij|\hat{V}_{(2)}|i'j'\rangle\, b_{i'} b_{j'}. \qquad (56.20)
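A quick consistency check on the one-particle piece of this Hamiltonian (a sketch, not from the text; the Hermitian matrix elements t_{ii'} for two bosonic modes are made-up numbers, and the Fock space is truncated): the resulting \hat{H} is Hermitian and commutes with the total number operator, as it must for an operator that only moves particles between states:

```python
# Two bosonic modes on a truncated Fock space: H = sum t_{i i'} b_i+ b_{i'}
# is Hermitian and conserves the total particle number.
import numpy as np

n_max = 4
b1m = np.diag(np.sqrt(np.arange(1, n_max)), k=1)   # single-mode annihilation
I = np.eye(n_max)
b = [np.kron(b1m, I), np.kron(I, b1m)]              # b_1, b_2 on the two-mode space

t = np.array([[1.0, 0.3], [0.3, 2.0]])              # hypothetical <i|T|i'> (Hermitian)
H = sum(t[i, ip] * b[i].conj().T @ b[ip] for i in range(2) for ip in range(2))
N = sum(bi.conj().T @ bi for bi in b)               # total number operator

assert np.allclose(H, H.conj().T)                   # Hermitian Hamiltonian
assert np.allclose(H @ N - N @ H, 0)                # [H, N] = 0
```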

56.4 Fock Space

We can define the total number operator,

\hat{N} = \sum_i \hat{N}_i = \sum_i b_i^\dagger b_i, \qquad (56.21)

which has as eigenvalue the total number of particles/occupied states,

N = \sum_i n_i. \qquad (56.22)

We could consider the space to be of given total number N, since in (nonrelativistic) quantum mechanics the number of particles does not change. However, as the individual operators b_i, b_i^\dagger change the total number to N - 1 and N + 1, respectively, in order to define their action we must consider the total Hilbert space of all values of N, called Fock space.

In this space, the normalized basis states are

|n_1 \ldots n_\infty\rangle = \frac{(b_1^\dagger)^{n_1}}{\sqrt{n_1!}} \cdots \frac{(b_\infty^\dagger)^{n_\infty}}{\sqrt{n_\infty!}}\,|0\rangle. \qquad (56.23)

Indeed, the normalization constant C for the state

|n\rangle = C\,(b^\dagger)^n|0\rangle \qquad (56.24)

comes from the normalization condition

1 = \langle n|n\rangle = |C|^2\,\langle 0|b^n(b^\dagger)^n|0\rangle = |C|^2\,n\,\langle 0|b^{n-1}(b^\dagger)^{n-1}|0\rangle = \cdots = |C|^2\,n!\,\langle 0|0\rangle = |C|^2\,n!, \qquad (56.25)

where in the iteration step we have commuted the last b past the n b^\dagger's, leading to n\,(b^\dagger)^{n-1}.
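This normalization can also be checked numerically with the truncated matrices for b^\dagger (a sketch, not from the text; the cutoff is an assumption of the toy model), building |n\rangle = (b^\dagger)^n/\sqrt{n!}\,|0\rangle below the cutoff:

```python
# Check <n|n> = 1 for |n> = (b+)^n / sqrt(n!) |0> on a truncated space.
import numpy as np, math

n_max = 7
b = np.diag(np.sqrt(np.arange(1, n_max)), k=1)
bdag = b.T
vac = np.zeros(n_max); vac[0] = 1.0               # the vacuum |0>

for n in range(n_max):
    state = np.linalg.matrix_power(bdag, n) @ vac / math.sqrt(math.factorial(n))
    assert abs(state @ state - 1.0) < 1e-12       # unit norm for every n < cutoff
```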

56.5 Fock Space for Fermions

For fermions, the definition of the occupied states |n\rangle is

|n\rangle = (a^\dagger)^n|0\rangle, \qquad (56.26)

where

a|n\rangle = \sqrt{n}\,|n - 1\rangle, \quad a^\dagger|n\rangle = \sqrt{n + 1}\,|n + 1\rangle, \qquad (56.27)

and where n = 0 or 1 only and |2\rangle \equiv 0. Note that then we also have n! = 1.

The multiparticle states are defined in the same way,

|n_1 \ldots n_\infty\rangle = (a_1^\dagger)^{n_1} \cdots (a_i^\dagger)^{n_i} \cdots (a_\infty^\dagger)^{n_\infty}\,|0\rangle. \qquad (56.28)

However, because of the anticommuting operators for i \neq j,

\{a_i, a_j^\dagger\} = 0, \qquad (56.29)

the action of the annihilation operator a_i on the Fock space states differs from its action on just a single state by a sign: before reaching (a_i^\dagger)^{n_i} in this string, we have to anticommute past the rest of the creation operators, creating a sign,

a_i|n_1 \ldots n_i \ldots n_\infty\rangle = (-1)^S\,n_i\,|n_1 \ldots, n_i - 1, \ldots, n_\infty\rangle, \qquad (56.30)

where

S = n_1 + n_2 + \cdots + n_{i-1}. \qquad (56.31)

Similarly, the creation operator acts on the Fock space producing the same sign,

$$
a_i^\dagger|n_1 \ldots n_i \ldots n_\infty\rangle = (-1)^S\sqrt{n_i+1}\,|n_1 \ldots n_i + 1 \ldots n_\infty\rangle. \qquad(56.32)
$$

Finally, the number operator acts without a sign,

$$
N_i|n_1 \ldots n_i \ldots n_\infty\rangle = a_i^\dagger a_i|n_1 \ldots n_i \ldots n_\infty\rangle = n_i|n_1 \ldots n_i \ldots n_\infty\rangle. \qquad(56.33)
$$
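The string sign $(-1)^S$ can be implemented directly on occupation tuples; the sketch below (an illustration, not the book's code) verifies that the resulting operators anticommute for different modes, as required by (56.29):

```python
def apply_a(i, occ):
    """a_i on |n_1 ... n_M>, eq. (56.30): returns (coefficient, new occupations)."""
    if occ[i] == 0:
        return 0, occ                      # a_i annihilates an empty mode
    sign = (-1) ** sum(occ[:i])            # S = n_1 + ... + n_{i-1}, eq. (56.31)
    return sign, occ[:i] + (0,) + occ[i + 1:]

def apply_adag(i, occ):
    """a_i^dagger, eq. (56.32), with the same string sign."""
    if occ[i] == 1:
        return 0, occ                      # Pauli exclusion: |2> = 0
    sign = (-1) ** sum(occ[:i])
    return sign, occ[:i] + (1,) + occ[i + 1:]

# check the anticommutator {a_0, a_1^dagger} = 0 on the state |1 0 1>
occ = (1, 0, 1)
s1, o1 = apply_adag(1, occ); s2, o2 = apply_a(0, o1)     # a_0 a_1^dagger |occ>
t1, p1 = apply_a(0, occ);    t2, p2 = apply_adag(1, p1)  # a_1^dagger a_0 |occ>
assert o2 == p2 and s1 * s2 + t1 * t2 == 0
```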

56.6 Schrödinger Equations for Fermions, and Generalizations

The analysis of the Schrödinger equation for fermions follows that for bosons, since we have the same definition of the action of the creation and annihilation operators, but with the addition of the condition $|2\rangle = 0$ and with the sign $(-1)^S$ in intermediate equations. But at the end, when $|\psi(t)\rangle$ is re-formed, the sign disappears (we leave the details as an exercise).
Then the Schrödinger equation for fermions has the same form as for bosons,

$$
i\hbar\,\partial_t|\psi(t)\rangle = \hat H|\psi(t)\rangle, \qquad(56.34)
$$

where the Hamiltonian in the occupation number space is

$$
\hat H = \sum_{i,i'} a_i^\dagger\,\langle i|\hat T|i'\rangle\, a_{i'}
+ \frac12\sum_{i,j,i',j'} a_i^\dagger a_j^\dagger\,\langle ij|\hat V_{(2)}|i'j'\rangle\, a_{i'} a_{j'}. \qquad(56.35)
$$

We can then use a formalism where we treat both bosons and fermions together, with annihilation
operator ci = (bi , ai ) according to whether we have bosons or fermions, and similarly for the creation
operators.
Then the more general one-particle operator

$$
\hat A = \sum_{i=1}^N \hat A^{(i)} \qquad(56.36)
$$

becomes, in the occupation number picture,

$$
\hat A = \sum_{i,i'} c_i^\dagger\,\langle i|\hat A|i'\rangle\, c_{i'}. \qquad(56.37)
$$

Similarly, the more general two-particle operator

$$
\hat B = \frac12\sum_{i,j=1}^N \hat B^{(ij)} \qquad(56.38)
$$

becomes, in the occupation number picture,

$$
\hat B = \frac12\sum_{i,j,i',j'} c_i^\dagger c_j^\dagger\,\langle ij|\hat B|i'j'\rangle\, c_{i'} c_{j'}. \qquad(56.39)
$$

56.7 Field Operators

The occupation number representation leads to so-called second quantization, which is regarded as the quantization of the wave function: instead of taking complex-number values, the wave functions take operator values. However, this is actually a misnomer, since the correct procedure is to build quantum field theory, where the wave function is replaced by a quantum field. But here we are dealing with nonrelativistic quantum mechanics, so v ≪ c, and there are no antiparticles in the theory, only particles. This means that we have annihilation and creation operators only for particles, $c_i$ and $c_i^\dagger$, but not the corresponding operators for antiparticles, as in relativistic quantum field theory.
We build the field operators as the product of the annihilation or creation operators with the single-particle wave functions $\phi_i(\vec r)$, i.e.,

$$
\hat\psi(\vec r) = \sum_i \phi_i(\vec r)\, c_i \;\Rightarrow\;
\hat\psi^\dagger(\vec r) = \sum_i \phi_i^*(\vec r)\, c_i^\dagger. \qquad(56.40)
$$

More precisely, the above definitions are only for the boson case, $c_i = b_i$.
In the case of fermions, there is an extra index for the spin, $\sigma = \pm 1/2$ or $1, 2$,

$$
\hat\psi_\sigma(\vec r) = \sum_i \phi_{i\sigma}(\vec r)\, a_{i,\sigma} \;\Rightarrow\;
\hat\psi_\sigma^\dagger(\vec r) = \sum_i \phi_{i\sigma}^*(\vec r)\, a_{i,\sigma}^\dagger. \qquad(56.41)
$$

We can also write field operators involving spin wave functions,

$$
\hat\psi(\vec r, s) = \sum_{\sigma=1,2} \hat\psi_\sigma(\vec r)\,\chi_\sigma(s), \qquad
\hat\psi^\dagger(\vec r, s) = \sum_{\sigma=1,2} \hat\psi_\sigma^\dagger(\vec r)\,\chi_\sigma(s). \qquad(56.42)
$$

The field operators are operators in the occupation number Hilbert space, through $c_i, c_i^\dagger$, and, as such, they obey commutation and anticommutation relations derived from those for $c_i$ and $c_i^\dagger$,

$$
[\hat\psi_\sigma(\vec r), \hat\psi_{\sigma'}^\dagger(\vec r\,')]_\mp = \delta_{\sigma\sigma'}\,\delta^3(\vec r - \vec r\,'), \qquad
[\hat\psi_\sigma(\vec r), \hat\psi_{\sigma'}(\vec r\,')]_\mp = [\hat\psi_\sigma^\dagger(\vec r), \hat\psi_{\sigma'}^\dagger(\vec r\,')]_\mp = 0. \qquad(56.43)
$$

In the above, we have used the same indices σ as for fermions. For bosons, we just drop all the σ indices.
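The anticommutator in (56.43) has a simple discrete analog that can be verified with small matrices; the sketch below (an illustration under stated assumptions: three modes, a Jordan–Wigner matrix representation, and random orthonormal columns standing in for the $\phi_i$) checks that $\{\hat\psi(x_m), \hat\psi^\dagger(x_n)\} = \delta_{mn}$:

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
sm = np.array([[0.0, 1.0], [0.0, 0.0]])       # single-mode annihilation on span{|0>, |1>}

def jw_a(i, M):
    """Jordan-Wigner matrix for a_i acting on M fermionic modes (the Z-string carries the sign)."""
    out = np.array([[1.0]])
    for op in [Z] * i + [sm] + [I2] * (M - i - 1):
        out = np.kron(out, op)
    return out

M = 3
a = [jw_a(i, M) for i in range(M)]
# discrete orthonormal "wave functions" phi_i(x_m): columns of an orthogonal matrix
Q, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(M, M)))
psi = [sum(Q[m, i] * a[i] for i in range(M)) for m in range(M)]      # psi(x_m)

for m in range(M):
    for nn in range(M):
        anti = psi[m] @ psi[nn].T + psi[nn].T @ psi[m]
        # {psi(x_m), psi^dagger(x_n)} = delta_{mn}: the discrete analog of delta^3(r - r')
        assert np.allclose(anti, (1.0 if m == nn else 0.0) * np.eye(2**M))
```

Orthonormality of the columns of Q plays the role of the completeness of the $\phi_i(\vec r)$.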
Using the orthonormality and completeness of the single-particle wave functions $\phi_{i\sigma}(\vec r)$, we can rewrite the result of acting with one-particle operators in the occupation number picture as an integral with the field operators,

$$
\hat A = \sum_{i,i'} c_i^\dagger\,\langle i|\hat A|i'\rangle\, c_{i'}
= \sum_\sigma \int d^3 r\; \hat\psi_\sigma^\dagger(\vec r)\, A(\vec r)\, \hat\psi_\sigma(\vec r). \qquad(56.44)
$$

Similarly, the two-particle operators in the occupation number picture also become an integral with field operators,

$$
\hat B = \frac12\sum_{i,j,i',j'} c_i^\dagger c_j^\dagger\,\langle ij|\hat B|i'j'\rangle\, c_{i'} c_{j'}
= \frac12\sum_{\sigma,\sigma'} \int d^3 r \int d^3 r'\; \hat\psi_\sigma^\dagger(\vec r)\,\hat\psi_{\sigma'}^\dagger(\vec r\,')\, B(\vec r, \vec r\,')\, \hat\psi_{\sigma'}(\vec r\,')\,\hat\psi_\sigma(\vec r). \qquad(56.45)
$$

We could further generalize to operators that are nontrivial in their spin action, $\hat A_{\sigma\sigma'}$, $\hat B_{\sigma_1\sigma_2,\sigma_1'\sigma_2'}$, so that the relation becomes a matrix relation in σ space.
Remaining in the case of trivial dependence on spin, the Hamiltonian becomes

$$
\hat H = \sum_\sigma \int d^3 r\; \hat\psi_\sigma^\dagger(\vec r)\, T(\vec r)\, \hat\psi_\sigma(\vec r)
+ \frac12\sum_{\sigma,\sigma'} \int d^3 r \int d^3 r'\; \hat\psi_\sigma^\dagger(\vec r)\,\hat\psi_{\sigma'}^\dagger(\vec r\,')\, V(\vec r, \vec r\,')\, \hat\psi_{\sigma'}(\vec r\,')\,\hat\psi_\sigma(\vec r). \qquad(56.46)
$$

56.8 Example of Interaction: Coulomb Potential, as a Limit of the Yukawa Potential, for Spin 1/2 Fermions

Consider the example of the Yukawa interaction,

$$
V(\vec r, \vec r\,') = e_0^2\,\frac{e^{-\mu|\vec r - \vec r\,'|}}{|\vec r - \vec r\,'|}, \qquad(56.47)
$$

which leads to the Coulomb potential in the limit μ → 0.
Consider then the state labeled by $\vec k = \vec p/\hbar$ and spin σ, with spin wave functions

$$
\chi_\sigma = \begin{pmatrix}1\\0\end{pmatrix} \quad\text{or}\quad \begin{pmatrix}0\\1\end{pmatrix}. \qquad(56.48)
$$

Then the kinetic energy in these states in momentum space is

$$
\langle \vec k_1\sigma_1|T|\vec k_1'\sigma_1'\rangle = \frac{\hbar^2 k_1^2}{2m}\,\delta_{\sigma_1\sigma_1'}\,\delta_{\vec k_1,\vec k_1'}. \qquad(56.49)
$$

The potential in the same states is

$$
\langle \vec k_1\sigma_1, \vec k_2\sigma_2|\hat V|\vec k_1'\sigma_1', \vec k_2'\sigma_2'\rangle
= \frac{e_0^2}{V}\,\delta_{\sigma_1\sigma_1'}\,\delta_{\sigma_2\sigma_2'}\,\delta_{\vec k_1+\vec k_2,\vec k_1'+\vec k_2'}\,\frac{4\pi}{(\vec k_1 - \vec k_1')^2 + \mu^2}. \qquad(56.50)
$$
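The factor $4\pi/(q^2+\mu^2)$ is the three-dimensional Fourier transform of $e^{-\mu r}/r$; a quick numerical check (an illustrative sketch; the grid cutoffs are arbitrary choices) confirms it and its Coulomb limit:

```python
import numpy as np

def yukawa_ft(q, mu, rmax=200.0, n=200001):
    """Trapezoid evaluation of (4 pi / q) * Int_0^rmax sin(q r) e^{-mu r} dr,
    the 3D Fourier transform of e^{-mu r}/r after the angular integrals are done."""
    r = np.linspace(0.0, rmax, n)
    f = np.sin(q * r) * np.exp(-mu * r)
    integral = np.sum(f[:-1] + f[1:]) * (r[1] - r[0]) / 2
    return 4 * np.pi * integral / q

q = 1.3
for mu in (0.7, 0.1):                         # closed form: 4 pi / (q^2 + mu^2)
    assert np.isclose(yukawa_ft(q, mu), 4 * np.pi / (q**2 + mu**2), rtol=1e-3)
# as mu -> 0 this approaches 4 pi / q^2, the Coulomb matrix element used below
```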

In terms of the matrix elements, the Hamiltonian is

$$
\hat H = \sum_{\vec k_1\sigma_1,\, \vec k_1'\sigma_1'} \langle \vec k_1\sigma_1|T|\vec k_1'\sigma_1'\rangle\, a_{\vec k_1\sigma_1}^\dagger a_{\vec k_1'\sigma_1'}
+ \frac12\sum_{\substack{\vec k_1\sigma_1,\, \vec k_1'\sigma_1'\\ \vec k_2\sigma_2,\, \vec k_2'\sigma_2'}} a_{\vec k_1\sigma_1}^\dagger a_{\vec k_2\sigma_2}^\dagger\,\langle \vec k_1\sigma_1, \vec k_2\sigma_2|\hat V|\vec k_1'\sigma_1', \vec k_2'\sigma_2'\rangle\, a_{\vec k_1'\sigma_1'} a_{\vec k_2'\sigma_2'}. \qquad(56.51)
$$

Taking the limit μ → 0 and substituting the matrix element, with the replacements $\vec k_1 \to \vec k$, $\vec k_2 \to \vec k'$, and $\vec k_1 - \vec k_1' \to \vec q$, we get

$$
\hat H = \sum_{\vec k,\sigma} \frac{\hbar^2 k^2}{2m}\, a_{\vec k\sigma}^\dagger a_{\vec k\sigma}
+ \frac{e_0^2}{2V}\sum_{\vec k, \vec k', \vec q}\sum_{\sigma,\sigma'} \frac{4\pi}{q^2}\, a_{\vec k+\vec q,\sigma}^\dagger\, a_{\vec k'-\vec q,\sigma'}^\dagger\, a_{\vec k',\sigma'}\, a_{\vec k,\sigma}. \qquad(56.52)
$$

Important Concepts to Remember

• If we have one-particle states in a multiparticle system, in the occupation number picture we can define creation and annihilation operators $b_i^\dagger, b_i$ that act as for a harmonic oscillator.
• Similarly, for fermions, in the occupation number (equal to 0 or 1) picture, we can define creation and annihilation operators $a_i^\dagger, a_i$ that act as for a fermionic harmonic oscillator.
• In terms of $b_i, b_i^\dagger$, the occupation number picture Hamiltonian is $\hat H = \sum_{i,i'} b_i^\dagger \langle i|\hat T|i'\rangle b_{i'} + \frac12\sum_{i,j,i',j'} b_i^\dagger b_j^\dagger \langle ij|\hat V_{(2)}|i'j'\rangle b_{i'} b_{j'}$.
• Fock space is the Hilbert space in the occupation number picture, for all possible values of the total number N.
• In the fermionic Fock space, $a_i|n_1 \ldots n_\infty\rangle = (-1)^S \sqrt{n_i}\,|n_1 \ldots n_{i-1}, n_i - 1, n_{i+1} \ldots n_\infty\rangle$, with $S = n_1 + \cdots + n_{i-1}$, and the same sign appears for $a_i^\dagger$.
• By analogy with (relativistic) quantum field theory, in fact as a nonrelativistic version of the same, we can construct field operators $\hat\psi(\vec r) = \sum_i \phi_i(\vec r)\,\hat c_i$, with $c_i = (b_i, a_i)$; for $c_i = a_i$ we actually also have a spin index, so that $\hat\psi_\sigma(\vec r) = \sum_i \phi_{i,\sigma}(\vec r)\,a_{i,\sigma}$, and we can write $\hat\psi(\vec r; s) = \sum_\sigma \hat\psi_\sigma(\vec r)\,\chi_\sigma(s)$.
• For the Coulomb potential, as a limit of the Yukawa potential, we find
$$
\hat H = \sum_{\vec k,\sigma} \frac{\hbar^2 k^2}{2m}\, \hat a_{\vec k,\sigma}^\dagger \hat a_{\vec k,\sigma}
+ \frac{e_0^2}{2V}\sum_{\vec k, \vec k', \vec q}\sum_{\sigma,\sigma'} \frac{4\pi}{q^2}\, \hat a_{\vec k+\vec q,\sigma}^\dagger\, \hat a_{\vec k'-\vec q,\sigma'}^\dagger\, \hat a_{\vec k',\sigma'}\, \hat a_{\vec k,\sigma}.
$$

Further Reading
More details can be found in the book [27].

Exercises

(1) Find the eigenvalue of the operator $e^{\alpha b}$ on single-particle states.
(2) If the Hamiltonian acting on the occupation number states of a system can be written as $\hat H = \sum_i t_i\, b_i^\dagger b_i + V\sum_{i<j} b_i^\dagger b_j^\dagger b_{i+1} b_{j-1}$, how would you interpret this physically?
(3) Consider a system with seven one-particle states, and three bosons in the system. How many states are there in the Fock space? How many multiparticle states are possible?
(4) Find the eigenvalue of the fermionic operator $\exp[(\sum_i a_i)\alpha]$ in the fermionic Fock space.
(5) Show the details of the proof of the Schrödinger equation for the occupation number states for fermions, and show the resulting absence of a sign.
(6) Calculate the occupation number Hamiltonian for a delta function two-body interaction $V(\vec r_i - \vec r_j) = V_0\,\delta(\vec r_i - \vec r_j)$.
(7) Explain physically the Coulomb occupation number Hamiltonian (56.52) in terms of Feynman diagrams.
57 The Hartree–Fock Approximation and Other Occupation Number Approximations

In this chapter, we will consider a self-consistent approximation method for multiparticle systems,
the Hartree–Fock approximation. It arises from the multiparticle picture, either the original one or,
better, the occupation number picture. We will make connections with the original derivation of
Hartree and Fock. Then we sketch the general ideas of the other approximations in the occupation
number picture: the usual perturbation theory in the style of nonrelativistic quantum field theory, and
the self-consistent Bethe–Salpeter equation.

57.1 Set-Up of the Problem

We would like to solve perturbatively for the energies of multiparticle states. The most relevant case
is that of multi-electron states, meaning we have fermionic one-particle states.
The perturbative solution is via a self-consistent equation, where the Hamiltonian contains field
operators for the rest of the fermions (electrons) other than the one whose interaction we are
analyzing.
We will start directly with the occupation number Hamiltonian derived in the previous chapter,
$$
\hat H = \hat H_0 + \hat H_1, \qquad
\hat H_0 = \sum_{i,i'} \langle i|\hat T_{(1)}|i'\rangle\, a_i^\dagger a_{i'}, \qquad
\hat H_1 = -\frac12\sum_{i,j,i',j'} a_i^\dagger a_j^\dagger\,\langle ij|\hat V_{(2)}|i'j'\rangle\, a_{i'} a_{j'}. \qquad(57.1)
$$

Note that the minus sign in front of the expression for $\hat H_1$ is there because the particles are fermions (the general formula for bosons or fermions with the above ordering has ± in front).
Because of the Fermi–Dirac antisymmetric statistics, we can rewrite $\hat H_1$ equivalently by reordering $a_{i'} a_{j'}$ as $-a_{j'} a_{i'}$, followed by a redefinition of the summation variables, $i'$ as $j'$ and $j'$ as $i'$. Averaging over the two equivalent forms, we substitute in $\hat H_1$

$$
\langle ij|\hat V_{(2)}|i'j'\rangle \;\to\; \frac12\left(\langle ij|\hat V_{(2)}|i'j'\rangle - \langle ij|\hat V_{(2)}|j'i'\rangle\right). \qquad(57.2)
$$
Equivalently, consider in the original formulation the basis N-particle state

$$
|\psi\rangle = |E_1\rangle_1 \otimes |E_2\rangle_2 \otimes \cdots \otimes |E_N\rangle_N, \qquad(57.3)
$$

and antisymmetrize it to obtain the Slater determinant,

$$
|\bar\psi\rangle = \frac{1}{\sqrt{N!}}\sum_{\text{perms. } P} (-1)^P\, P|\psi\rangle. \qquad(57.4)
$$
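In position space the antisymmetrized state is the familiar determinant $\det[\phi_i(x_j)]/\sqrt{N!}$; the sketch below (an illustration, with three hypothetical orbitals chosen only for concreteness) checks the defining antisymmetry:

```python
import numpy as np
from math import factorial

# three hypothetical single-particle orbitals (any linearly independent set illustrates the point)
orbitals = [lambda x: np.exp(-x**2),
            lambda x: x * np.exp(-x**2),
            lambda x: (2 * x**2 - 1) * np.exp(-x**2)]

def slater(xs):
    """Antisymmetrized wave function det[phi_i(x_j)] / sqrt(N!), cf. eq. (57.4)."""
    N = len(xs)
    M = np.array([[orbitals[i](xs[j]) for j in range(N)] for i in range(N)])
    return np.linalg.det(M) / np.sqrt(factorial(N))

x = [0.3, -0.7, 1.1]
# exchanging any two particles flips the overall sign, as required for fermions
assert np.isclose(slater([x[1], x[0], x[2]]), -slater(x))
assert np.isclose(slater([x[0], x[2], x[1]]), -slater(x))
```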
633 57 The Hartree–Fock Approximation

We consider the average of $\hat H_1$ in the Slater determinant state $|\bar\psi\rangle$. Then we define $\hat V^{(ij)}$ as the two-particle interaction potential, acting on the left on the ith single-particle Hilbert space, and on the right on the jth single-particle Hilbert space, so that

$$
\langle\bar\psi|\hat H_1|\bar\psi\rangle = \frac12\sum_{i,j} \langle\bar\psi|\hat V^{(ij)}|\bar\psi\rangle. \qquad(57.5)
$$

Then, for $|\bar\psi\rangle$ containing states with $E_i$ and $E_j$, this results (after using the orthonormality of the rest of the one-particle states in $|\psi\rangle$ and $|\bar\psi\rangle$, $\langle i|j\rangle = \delta_{ij}$) in a matrix element for two-particle states,

$$
\langle\bar\psi|\hat H_1|\bar\psi\rangle = \frac12\sum_{i,j}\left(\langle E_i E_j|\hat V^{(ij)}|E_i E_j\rangle - \langle E_i E_j|\hat V^{(ij)}|E_j E_i\rangle\right). \qquad(57.6)
$$

This is the same result as that from the occupation number method above. On the other hand, transitioning from the above original representation result, by writing

$$
|\psi\rangle = \ldots a_i^\dagger \ldots a_j^\dagger \ldots |0\rangle, \qquad(57.7)
$$

means that we can relate the N-particle matrix element to the two-particle one in the two-particle operator,

$$
\langle\bar\psi|\hat V^{(ij)}|\bar\psi\rangle = \frac12\left(\langle ij|\hat V_{(2)}|ij\rangle - \langle ij|\hat V_{(2)}|ji\rangle\right). \qquad(57.8)
$$

57.2 Derivation of the Hartree–Fock Equation

We want to solve the time-independent Schrödinger equation, $\hat H|\psi\rangle = E|\psi\rangle$, which means that we need to consider

$$
\langle\psi|(\hat H - E)|\psi\rangle = 0. \qquad(57.9)
$$

Varying the matrix equation, we obtain

$$
\langle\delta\psi|(\hat H - E)|\psi\rangle = 0. \qquad(57.10)
$$

If the state is

$$
|\psi\rangle = a_{i_1}^\dagger a_{i_2}^\dagger \cdots a_{i_N}^\dagger|0\rangle, \qquad(57.11)
$$

then its variation is

$$
|\delta\psi\rangle = \delta a_{i_1}^\dagger a_{i_2}^\dagger \cdots a_{i_N}^\dagger|0\rangle + a_{i_1}^\dagger\, \delta a_{i_2}^\dagger \cdots a_{i_N}^\dagger|0\rangle + \cdots + a_{i_1}^\dagger \cdots a_{i_{N-1}}^\dagger\, \delta a_{i_N}^\dagger|0\rangle. \qquad(57.12)
$$

The variation occurs through a unitary transformation of the single-particle states by $\hat U$, with $\hat U\hat U^\dagger = 1$, such that

$$
\hat U \simeq 1 + i\hat K \;\Rightarrow\; \hat K^\dagger = \hat K. \qquad(57.13)
$$

The transformation $\hat U$ acts on the states, so on the annihilation and creation operators, mixing the various i's, so that

$$
a_i \to U_{ij}\, a_j, \qquad a_i^\dagger \to U_{ij}^\dagger\, a_j^\dagger. \qquad(57.14)
$$



For an infinitesimal transformation,

$$
\delta a_i^\dagger = -i\sum_j K_{ji}\, a_j^\dagger. \qquad(57.15)
$$
j

We can have two types of transformation:

(a) If $\delta a_i^\dagger \propto a_i^\dagger$ only, it means that $|\delta\psi\rangle \propto |\psi\rangle$, so (57.10) becomes just the usual Schrödinger equation,

$$
\langle\psi|\hat H|\psi\rangle = E\,\langle\psi|\psi\rangle. \qquad(57.16)
$$

(b) If $\delta a_i^\dagger$ contains other $a_j^\dagger$, meaning $K_{ji} \neq 0$ for $i \neq j$, then $\langle\delta\psi|\psi\rangle = 0$ and we obtain

$$
\langle\delta\psi|\hat H|\psi\rangle = 0. \qquad(57.17)
$$

The interpretation of this result is that, in the sector of states of the form (57.11) in which a single creation operator $a_i^\dagger$ changes from i to j (in each term in $|\delta\psi\rangle$ there is exactly one $\delta a_i^\dagger \propto \sum_j K_{ji} a_j^\dagger$, the other operators being untouched), the matrix element of $\hat H$ vanishes.
In particular, for the relevant combinations of $(ij, i'j')$,

$$
\langle\delta\psi|\hat H_{(2)}|\psi\rangle = \langle 0|\ldots a_i \ldots a_j \ldots \hat H_{(2)} \ldots a_{i'}^\dagger \ldots a_{j'}^\dagger \ldots|0\rangle = \langle ij|\hat H_{(2)}|i'j'\rangle. \qquad(57.18)
$$

This means that the terms in the occupation number Hamiltonian $\hat H$ that have only one $a_i^\dagger$-change, or equivalently (by the previous equation) whose $\langle ij|\hat H_{(2)}|i'j'\rangle$ has one index on the left equal to one on the right, must vanish. These terms are

$$
\begin{aligned}
&-\frac12\sum_{i,j,j'} \langle ij|\hat V_{(2)}|ij'\rangle\, a_i^\dagger a_j^\dagger a_i a_{j'}
-\frac12\sum_{i,j,i'} \langle ij|\hat V_{(2)}|i'i\rangle\, a_i^\dagger a_j^\dagger a_{i'} a_i\\
&\quad-\frac12\sum_{i,j,j'} \langle ij|\hat V_{(2)}|jj'\rangle\, a_i^\dagger a_j^\dagger a_j a_{j'}
-\frac12\sum_{i,j,i'} \langle ij|\hat V_{(2)}|i'j\rangle\, a_i^\dagger a_j^\dagger a_{i'} a_j\\
&= -\sum_{k,i,i'}\left(-\frac12\langle ki|\hat V_{(2)}|ki'\rangle + \frac12\langle ki|\hat V_{(2)}|i'k\rangle
+ \frac12\langle ik|\hat V_{(2)}|ki'\rangle - \frac12\langle ik|\hat V_{(2)}|i'k\rangle\right) a_i^\dagger a_k^\dagger a_k a_{i'}\\
&= -\sum_{k,i,i'}\left(-\frac12\langle ik|\hat V_{(2)}|i'k\rangle + \frac12\langle ik|\hat V_{(2)}|ki'\rangle
+ \frac12\langle ik|\hat V_{(2)}|ki'\rangle - \frac12\langle ik|\hat V_{(2)}|i'k\rangle\right) a_i^\dagger a_k^\dagger a_k a_{i'},
\end{aligned} \qquad(57.19)
$$

where in the first equality we have commuted $a^\dagger$ past $a^\dagger$, or a past a (with different indices), and then relabeled the summation indices as the common set $(k, i, i')$, and in the second equality we have used $\langle ij|\hat V_{(2)}|kl\rangle = \langle ji|\hat V_{(2)}|lk\rangle$ in the first two terms.
Then the relevant terms in the Hamiltonian are

$$
\sum_{i,i'} a_i^\dagger\left[\sum_k a_k^\dagger a_k\,\langle ik|\hat V_{(2)}|i'k\rangle - \sum_k a_k^\dagger a_k\,\langle ik|\hat V_{(2)}|ki'\rangle\right] a_{i'}, \qquad(57.20)
$$

and since $a_k^\dagger a_k = N_k$ is the occupation number in mode k, which is 1 for an occupied state and 0 for an unoccupied state, we can replace it by changing the sum over all k into a sum over occupied states k,

$$
\sum_{i,i'} a_i^\dagger\left[\sum_{k\ \text{occ.}} \langle ik|\hat V_{(2)}|i'k\rangle - \sum_{k\ \text{occ.}} \langle ik|\hat V_{(2)}|ki'\rangle\right] a_{i'}
\equiv \sum_{i,i'} a_i^\dagger\,\langle i|\hat V_{\rm H\text{-}F}|i'\rangle\, a_{i'}, \qquad(57.21)
$$

where we have defined the matrix element of the Hartree–Fock potential $\hat V_{\rm H\text{-}F}$,

$$
\langle i|\hat V_{\rm H\text{-}F}|i'\rangle = \sum_{k\ \text{occ.}} \langle ik|\hat V_{(2)}|i'k\rangle - \sum_{k\ \text{occ.}} \langle ik|\hat V_{(2)}|ki'\rangle. \qquad(57.22)
$$

Thus the condition for the terms in the Hamiltonian with only one transition of $a_i^\dagger$ to vanish is

$$
\langle i|\hat T_{(1)}|j\rangle + \langle i|\hat V_{\rm H\text{-}F}|j\rangle = 0, \quad \text{for } i \neq j. \qquad(57.23)
$$

Therefore $\hat T_{(1)} + \hat V_{\rm H\text{-}F}$ is diagonal, or in other words the one-particle states are eigenstates for it,

$$
(\hat T_{(1)} + \hat V_{\rm H\text{-}F})|i\rangle = E_i|i\rangle. \qquad(57.24)
$$

This is the Hartree–Fock equation for the single-particle Hamiltonian ($\hat V_{\rm H\text{-}F}$ is a single-particle operator), which is to be solved iteratively for the eigenstates $|i\rangle$ and the eigenenergies $E_i$.
So the new Hamiltonian in occupation number space is

$$
\hat H_{\rm H\text{-}F} = \sum_{i,i'} \langle i|\hat T_{(1)}|i'\rangle\, a_i^\dagger a_{i'} + \sum_{i,i'} \langle i|\hat V_{\rm H\text{-}F}|i'\rangle\, a_i^\dagger a_{i'}. \qquad(57.25)
$$
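The iterative solution of (57.24) can be sketched as a small self-consistent-field loop (an illustrative sketch, not the book's procedure verbatim; the basis size, occupation, one-body levels, and the weak random two-body tensor are all assumed inputs):

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_occ = 6, 2                               # basis size and number of occupied orbitals

# hypothetical inputs: well-separated one-body levels and a weak two-body tensor
T = np.diag(np.arange(1.0, n + 1))            # <i|T|i'> in a fixed basis
V = 0.05 * rng.normal(size=(n, n, n, n))      # V[p, r, q, s] ~ <pr|V_(2)|qs>
V = (V + V.transpose(1, 0, 3, 2)) / 2         # impose <pr|V|qs> = <rp|V|sq>
V = (V + V.transpose(2, 3, 0, 1)) / 2         # and (real) hermiticity

D = np.zeros((n, n))                          # density matrix of occupied orbitals
converged = False
for it in range(500):
    direct = np.einsum('prqs,sr->pq', V, D)   # Hartree term:  sum_{k occ} <ik|V|i'k>
    exch = np.einsum('prsq,sr->pq', V, D)     # Fock term:     sum_{k occ} <ik|V|ki'>
    eps, C = np.linalg.eigh(T + direct - exch)     # diagonalize T + V_HF, as in (57.24)
    D_new = C[:, :n_occ] @ C[:, :n_occ].T     # refill the lowest n_occ orbitals
    if np.linalg.norm(D_new - D) < 1e-10:
        converged = True
        break
    D = D_new
```

Each pass rebuilds $V_{\rm H\text{-}F}$ from the current occupied orbitals and rediagonalizes, until the density matrix stops changing.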

57.3 Hartree and Fock Terms

Writing the Schrödinger equation for wave functions, we obtain

$$
\left[-\frac{\hbar^2}{2m}\Delta_r + V^{(1)}(\vec r)\right]\phi_i(\vec r) + V_{\rm H\text{-}F}(\vec r)\,\phi_i(\vec r)
- \int d^3 r'\, V_{\rm H\text{-}F}^{(2,\rm exch)}(\vec r, \vec r\,')\,\phi_i(\vec r\,') = E_i\,\phi_i(\vec r). \qquad(57.26)
$$

In this equation, Hartree considered the first term in $V_{\rm H\text{-}F}$, and Fock added the second term.
The equation is obtained as follows.
The first term,

$$
\sum_{k\ \text{occ.}} \langle ik|\hat V_{(2)}|i'k\rangle = \langle i|\hat V_{\rm H\text{-}F}^{(1)}|i'\rangle, \qquad(57.27)
$$

becomes, in wave functions, on introducing the completeness relation for $|\vec r\rangle$,

$$
\int d^3 r\,\phi_i^*(\vec r)\, V_{\rm H\text{-}F}^{(1)}(\vec r)\,\phi_{i'}(\vec r)
= \int d^3 r\,\phi_i^*(\vec r)\left[\sum_{k\ \text{occ.}}\int d^3 r'\,\phi_k^*(\vec r\,')\, V_{(2)}(\vec r, \vec r\,')\,\phi_k(\vec r\,')\right]\phi_{i'}(\vec r), \qquad(57.28)
$$

which defines the Hartree potential,

$$
V_H(\vec r) = V_{\rm H\text{-}F}^{(1)}(\vec r) = \sum_{k\ \text{occ.}}\int d^3 r'\,|\phi_k(\vec r\,')|^2\, V_{(2)}(\vec r, \vec r\,') = \int d^3 r'\,\rho(\vec r\,')\, V_{(2)}(\vec r, \vec r\,'). \qquad(57.29)
$$

We have defined the probability density $\rho(\vec r) = \sum_{k\ \text{occ.}} |\phi_k(\vec r)|^2$ of the electrons. Multiplying by e, $\rho_e = e\rho$, gives the electron charge density, which means that this first term has a classical interpretation.
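The classical character of the Hartree term can be made concrete with a one-dimensional toy discretization of (57.29) (an illustrative sketch: the Gaussian density and the softened kernel are assumptions, not realistic atomic data):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 401)
dx = x[1] - x[0]
rho = np.exp(-x**2) / np.sqrt(np.pi)              # model density, normalized to one particle

def V2(x1, x2, a=1.0):
    """Softened 1D stand-in for the Coulomb kernel e_0^2 / |r - r'| (a is a regulator)."""
    return 1.0 / np.sqrt((x1 - x2)**2 + a**2)

# V_H(x) = Int dx' rho(x') V(x, x'), discretized as a matrix-vector product
VH = (V2(x[:, None], x[None, :]) @ rho) * dx

# far from the charge cloud, V_H approaches (total charge)/|x|: a classical point-charge field
assert np.isclose(VH[-1], 1.0 / abs(x[-1]), rtol=0.02)
```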
The second term is called the exchange term, since it is a purely quantum term, without a classical analog. In it,

$$
\sum_{k\ \text{occ.}} \langle ik|\hat V_{(2)}|ki'\rangle = \langle i|\hat V_{\rm H\text{-}F}^{(2)}|i'\rangle \qquad(57.30)
$$

becomes, in wave functions,

$$
\int d^3 r\,\phi_i^*(\vec r)\, V_{\rm H\text{-}F}^{(2)}\,\phi_{i'}(\vec r)
= \int d^3 r\,\phi_i^*(\vec r)\sum_{k\ \text{occ.}}\int d^3 r'\,\phi_k^*(\vec r\,')\, V_{(2)}(\vec r, \vec r\,')\,\phi_{i'}(\vec r\,')\,\phi_k(\vec r), \qquad(57.31)
$$

which defines the Fock "exchange" potential acting on a wave function,

$$
V_F\,\phi_i(\vec r) = V_{\rm H\text{-}F}^{(2)}\,\phi_i(\vec r)
= \int d^3 r'\left[\sum_{k\ \text{occ.}}\phi_k^*(\vec r\,')\,\phi_k(\vec r)\right] V_{(2)}(\vec r, \vec r\,')\,\phi_i(\vec r\,')
\equiv \int d^3 r'\, V_{\rm H\text{-}F}^{(2,\rm exch)}(\vec r, \vec r\,')\,\phi_i(\vec r\,'). \qquad(57.32)
$$

Here we have defined the two-particle exchange term of the potential,

$$
V_{\rm H\text{-}F}^{(2,\rm exch)}(\vec r, \vec r\,') = \sum_{k\ \text{occ.}}\phi_k^*(\vec r\,')\,\phi_k(\vec r)\, V_{(2)}(\vec r, \vec r\,'). \qquad(57.33)
$$

This completes our definition of the Hartree–Fock equation for wave functions.
We now specialize to the case of the Coulomb interaction,

$$
V = \frac{e_0^2}{|\vec r - \vec r\,'|}. \qquad(57.34)
$$

Then, keeping only the first of the Hartree–Fock terms, we obtain the Hartree equation,

$$
\left[-\frac{\hbar^2}{2m}\Delta + V(\vec r) + V_H(\vec r)\right]\phi_i(\vec r) = E_i\,\phi_i(\vec r). \qquad(57.35)
$$

This has an interpretation as a semiclassical potential, in which the Hartree term $V_H$ is a classical potential with a quantum charge density,

$$
V_H(\vec r) = \int d^3 r'\,\rho(\vec r\,')\, V(\vec r - \vec r\,') = \int d^3 r'\,\rho_e(\vec r\,')\,\frac{e_0}{|\vec r - \vec r\,'|}, \qquad(57.36)
$$

where $\rho_e(\vec r) = e_0\,\rho(\vec r)$ is the electric charge density.
On the other hand, the second term, the Fock term, is due to the Coulomb exchange potential

$$
V_{\rm H\text{-}F}^{(2,\rm exch)}(\vec r, \vec r\,') = e_0\sum_{k\ \text{occ.}}\phi_k^*(\vec r\,')\,\phi_k(\vec r)\,\frac{e_0}{|\vec r - \vec r\,'|}, \qquad(57.37)
$$

leading to

$$
V_F\,\phi_i(\vec r) = \int d^3 r'\, V_{\rm H\text{-}F}^{(2,\rm exch)}(\vec r, \vec r\,')\,\phi_i(\vec r\,'). \qquad(57.38)
$$

57.4 Other Approximations in the Occupation Number Picture

Besides the Hartree–Fock approximation (solving the Hartree–Fock equations iteratively), we can also employ perturbation theory, but not for single-particle wave functions. Instead, we use a perturbation theory in terms of the field operators of the occupation number picture. It is in fact the nonrelativistic version of the formalism of quantum field theory. Here we will give just a rough sketch, an outline, of the method.
In order to define perturbation theory, we use the Dirac picture with respect to the free Hamiltonian, in the occupation number version,

$$
\hat H_0 = \sum_i \hbar\omega_i\, \hat c_i^\dagger \hat c_i, \qquad(57.39)
$$

where

$$
\hbar\omega_i\,\delta_{ii'} = \langle i|\hat T_{(1)}|i'\rangle, \qquad(57.40)
$$

and $\hat T_{(1)}$ contains the kinetic energy and the one-particle potential.
The operators in the Dirac (interaction) picture are (in terms of the Schrödinger picture operator $\hat A_S$)

$$
\hat A_I(t) = e^{i\hat H_0 t/\hbar}\,\hat A_S\, e^{-i\hat H_0 t/\hbar}, \qquad(57.41)
$$

while the operators in the Heisenberg picture are

$$
\hat A_H(t) = e^{i\hat H t/\hbar}\,\hat A_S\, e^{-i\hat H t/\hbar}. \qquad(57.42)
$$

Then we can define the creation and annihilation operators in the Dirac picture, $\hat c_{i,I}(t), \hat c_{i,I}^\dagger(t)$.
A crucial object in the theory is the Green's function, defined as

$$
G_{\sigma\sigma'}(\vec r, t; \vec r\,', t') \equiv -i\,\frac{\langle\psi_0|T\{\hat\psi_{H,\sigma}(\vec r, t)\,\hat\psi_{H,\sigma'}^\dagger(\vec r\,', t')\}|\psi_0\rangle}{\langle\psi_0|\psi_0\rangle}. \qquad(57.43)
$$

This will have an expansion similar to the single-particle Green's function, generalizing

$$
\hat G(z) \sim \sum_i \frac{|i\rangle\langle i|}{z - E_i}, \qquad(57.44)
$$

implying that the Green's function has poles at the energies of the bound states $E_i$.
Moreover, the Green's function is part of a more general correlation function of generic operators $\hat A, \hat B$ in the Heisenberg picture; for the ground state wave function $|\psi_0\rangle$ we have

$$
\frac{\langle\psi_0|T\{\hat A_H(t)\,\hat B_H(t')\}|\psi_0\rangle}{\langle\psi_0|\psi_0\rangle}. \qquad(57.45)
$$

The "Feynman theorem" relates these correlation functions to those of the interaction picture operators $\hat A_I(t), \hat B_I(t)$ and $|\phi_0\rangle$, the ground state of $\hat H_0$.
Then the Green's functions are related to observables. For one-particle operators,

$$
\hat A_{\sigma\sigma'} = \int d^3 r\;\hat a_{\sigma\sigma'}(\vec r), \qquad(57.46)
$$

Figure 57.1 (a) Hartree–Fock equation for Green's functions. (b) Bethe–Salpeter equation for four-point Green's functions.

the average in the ground state is

$$
\langle a(\vec r)\rangle \equiv \langle\psi_0|\hat a_{\sigma\sigma'}|\psi_0\rangle
= -i\lim_{\vec r\,'\to\vec r,\; t'\to t}\sum_{\sigma,\sigma'} A_{\sigma,\sigma'}\, G_{\sigma',\sigma}(\vec r, t; \vec r\,', t')
= -i\,{\rm Tr}[AG]. \qquad(57.47)
$$

Similarly, the average in the ground state of a two-particle operator is

$$
\langle\hat B\rangle = -\frac12\sum_{\sigma_1\sigma_2\sigma_1'\sigma_2'}\int d^3 r \int d^3 r'\; B_{\sigma_1\sigma_1'\sigma_2\sigma_2'}(\vec r, \vec r\,')\,
\lim_{t'\to t} G_{\sigma_2\sigma_2',\sigma_1\sigma_1'}(\vec r, t; \vec r\,', t'). \qquad(57.48)
$$

Hartree–Fock versus Bethe–Salpeter Equation

We can consider a self-consistent equation for the standard perturbation theory approximation, the Hartree–Fock equation for the Green's functions,

$$
G^{(2)} \simeq G_0^{(2)} + G_0^{(2)} \cdot V_{\rm H\text{-}F}^{(2)} \cdot G^{(2)}, \qquad(57.49)
$$

which is depicted pictorially in Fig. 57.1a. As can be seen, the equation is for two-point Green's functions, i.e., propagators.
Alternatively, we can write a Bethe–Salpeter equation, or Dyson equation, for the four-point Green's functions $G^{(4)}$. This equation for $G^{(4)}$ is

$$
G^{(4)} = G_{0,1}^{(2)} G_{0,2}^{(2)} + K_{\rm B\text{-}S}^{(4)} \cdot G_{0,1}^{(2)} G_{0,2}^{(2)} \cdot G^{(4)}, \qquad(57.50)
$$

depicted pictorially in Fig. 57.1b.
However, as we saw, the Green's function has poles at the corrected bound state energies $E_i$, and near the pole at i the Green's function becomes

$$
\hat G(z) \simeq \frac{|i\rangle\langle i|}{z - E_i}, \qquad(57.51)
$$

which can be generalized to the case of several particles, in particular to the propagation of two particles, described by the four-point Green's function $\hat G^{(4)}$. Near a pole it has a similar form,

$$
\hat G^{(4)}(E) \sim \frac{|(ij)\rangle\langle(ij)|}{E - E_i}, \qquad(57.52)
$$

which means that the Bethe–Salpeter equation for the two-particle wave function $|(ij)\rangle$ is

$$
|(ij)\rangle \simeq K_{\rm B\text{-}S}^{(4)} \cdot G_{0,1}^{(2)} G_{0,2}^{(2)}\, |(ij)\rangle. \qquad(57.53)
$$

Important Concepts to Remember

• The Hartree–Fock approximation is an approximation for the energy of an electron in a multi-electron state, using the occupation number picture, via a self-consistent equation with field operators for the other electrons.
• The Hartree–Fock potential $\hat V_{\rm H\text{-}F}$ is defined by $\langle i|\hat V_{\rm H\text{-}F}|i'\rangle = \sum_{k\ \rm occ.}\langle ik|\hat V_{(2)}|i'k\rangle - \sum_{k\ \rm occ.}\langle ik|\hat V_{(2)}|ki'\rangle$.
• The Hartree–Fock equation states that the Hamiltonian is then diagonal, so $(\hat T_{(1)} + \hat V_{\rm H\text{-}F})|i\rangle = E_i|i\rangle$, and the mean-field Hartree–Fock Hamiltonian in occupation number space is $\hat H_{\rm H\text{-}F} = \sum_{i,i'}\langle i|\hat T_{(1)}|i'\rangle a_i^\dagger a_{i'} + \sum_{i,i'}\langle i|\hat V_{\rm H\text{-}F}|i'\rangle a_i^\dagger a_{i'}$.
• On wave functions, from $\hat V_{\rm H\text{-}F}$ we obtain two terms: the direct term, or Hartree potential, $V_{\rm H\text{-}F}^{(1)}\phi_i(\vec r)$, with $V_H(\vec r) = V_{\rm H\text{-}F}^{(1)}(\vec r) = \sum_{k\ \rm occ.}\int d^3 r'\,|\phi_k(\vec r\,')|^2 V_{(2)}(\vec r,\vec r\,') = \int d^3 r'\,\rho(\vec r\,')V_{(2)}(\vec r,\vec r\,')$, and the exchange, or Fock, term, $V_F\phi_i(\vec r) = V_{\rm H\text{-}F}^{(2)}\phi_i(\vec r) = \int d^3 r'\,V_{\rm H\text{-}F}^{(2,\rm exch)}(\vec r,\vec r\,')\phi_i(\vec r\,')$, with $V_{\rm H\text{-}F}^{(2,\rm exch)}(\vec r,\vec r\,') = \sum_{k\ \rm occ.}\phi_k^*(\vec r\,')\phi_k(\vec r)V_{(2)}(\vec r,\vec r\,')$.
• The Hartree equation contains only the Hartree potential, as a semiclassical contribution from the charge of the other electrons, $\int d^3 r'\,\rho(\vec r\,')V(\vec r - \vec r\,')$, while the Fock term contains the purely quantum exchange term.
• In the occupation number picture, we can also introduce a nonrelativistic version of quantum field theory, via field operators. Interaction picture operators, $\hat A_I(t) = e^{i\hat H_0 t/\hbar}\hat A_S e^{-i\hat H_0 t/\hbar}$, are used for perturbation theory, and Heisenberg picture operators, $\hat A_H(t) = e^{i\hat H t/\hbar}\hat A_S e^{-i\hat H t/\hbar}$, are used for definitions.
• Green's functions
$$
G_{\sigma\sigma'}(\vec r, t; \vec r\,', t') \equiv -i\,\frac{\langle\psi_0|T\{\hat\psi_{H,\sigma}(\vec r, t)\hat\psi_{H,\sigma'}^\dagger(\vec r\,', t')\}|\psi_0\rangle}{\langle\psi_0|\psi_0\rangle}
$$
and more general correlation functions
$$
\frac{\langle\psi_0|T\{\hat A_H(t)\hat B_H(t')\}|\psi_0\rangle}{\langle\psi_0|\psi_0\rangle}
$$
can be expanded, via the Feynman theorem, using the perturbation theory of interaction picture operators; one-particle operators $\hat A = \int d^3 r\,\hat a_{\sigma\sigma'}(\vec r)$ have a ground-state average $\langle a\rangle = -i\,{\rm Tr}[AG]$, and similar formulas hold for operators involving higher numbers of particles.
• One can write a Hartree–Fock equation for (two-point) Green's functions, $G^{(2)} \simeq G_0^{(2)} + G_0^{(2)}\cdot V_{\rm H\text{-}F}^{(2)}\cdot G^{(2)}$, or alternatively a Dyson equation, or version of the Bethe–Salpeter equation, for the four-point Green's function, $G^{(4)} = G_{0,1}^{(2)} G_{0,2}^{(2)} + K_{\rm B\text{-}S}^{(4)}\cdot G_{0,1}^{(2)} G_{0,2}^{(2)}\cdot G^{(4)}$.
Further Reading
More details are given in the books [2] and [27].

Exercises

(1) Consider bosons instead of fermions, and set up the corresponding problem for a self-consistent
approximation.
(2) Continue with this method to find the equivalent of the Hartree–Fock potential V̂H–F for bosons.
(3) Given the Hartree–Fock equation, rewrite the Hartree–Fock Hamiltonian.
(4) Calculate the Hartree potential for the helium atom in its ground state.
(5) Calculate the two-particle exchange term in the Hartree–Fock approximation for the lithium
atom in its ground state.
(6) Write explicitly the Hartree–Fock equation for the 2-point Green’s function, in the coordinate
(r and spin σ) representation.
(7) Write explicitly, in the coordinate (r and spin σ) representation, the Dyson or Bethe–Salpeter
equation.
58 Nonstandard Statistics: Anyons and Nonabelions

In this chapter we consider statistics other than Bose–Einstein and Fermi–Dirac, namely, statistics
involving multiplication with an abelian phase eiα (different from ±1), called anyonic statistics,
or with a nonabelian factor U, called nonabelian anyonic statistics. The corresponding particles,
anyons and nonabelions, live in 2+1 dimensions and are associated with the abelian and nonabelian
Berry phase and connection. Therefore, we will review these concepts first. The implementation in
physical materials of the new statistics is associated with the Chern–Simons action, which we will
also introduce.

58.1 Review of Statistics and Berry Phase

In Chapter 20 we considered the permutation of particles within a quantum state,

$$
|\psi\rangle = |x_1 x_2\rangle \;\to\; \hat P_{12}|\psi\rangle = |x_2 x_1\rangle. \qquad(58.1)
$$

We found that $\hat P_{12}$ has eigenvalues $C_{12}$. Then we imposed that applying the interchange operator twice returns the system to the initial state, $\hat P_{12}^2 = 1$, implying that the eigenvalue is $C_{12} = \pm 1$. This behavior was associated with Bose–Einstein (bosons) and Fermi–Dirac (fermions) statistics. But it is not necessary to have $\hat P_{12}^2 = 1$. In fact, we can "interchange" particles continuously, through a path around each other, leading to a Berry phase, as considered in Chapter 20.
If the Hamiltonian is time dependent, $\hat H = \hat H(t)$, via the adiabatic time dependence of parameters $\vec K = \vec K(t)$, the time dependence of the state that at t = 0 is $|n\rangle$ and has energy $E_n$ acquires an extra phase, called the Berry phase $\gamma_n$, so that

$$
|\psi(t)\rangle = \exp\left(-\frac{i}{\hbar}\int_0^t E_n(t')\,dt'\right) e^{i\gamma_n(t)}\,|n(\vec K(t))\rangle. \qquad(58.2)
$$

Here the Berry connection is

$$
\vec A_n(\vec K) = i\,\langle n(\vec K(t))|\vec\nabla_K|n(\vec K(t))\rangle. \qquad(58.3)
$$

If $\vec K$ corresponds to a rescaled position $\vec R/(\hbar c)$, then the Berry phase is

$$
\gamma_n = \oint_C \frac{\vec A_n(\vec R)}{\hbar c}\cdot d\vec R, \qquad(58.4)
$$

and is an Aharonov–Bohm-like phase.
We saw that the Berry phase on a closed contour C is invariant modulo 2πm, and so cannot be removed by a gauge transformation, since under it

$$
\gamma_n(C) \;\to\; \gamma_n(C) + 2\pi m. \qquad(58.5)
$$
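The Berry phase can be evaluated numerically for the standard spin-1/2 example (an illustrative sketch, not from the book; the discretized, gauge-invariant product formula $\gamma = -\,{\rm Im}\ln\prod_j \langle n_j|n_{j+1}\rangle$ replaces the line integral over the connection):

```python
import numpy as np

theta0, N = 0.8, 4000                         # cone angle of the loop; discretization steps

def nstate(phi):
    """Spin-1/2 eigenstate |n(K)> for field direction (theta0, phi)."""
    return np.array([np.cos(theta0 / 2), np.exp(1j * phi) * np.sin(theta0 / 2)])

phis = np.linspace(0.0, 2 * np.pi, N + 1)     # closed loop in parameter space
prod = 1.0 + 0.0j
for j in range(N):
    prod *= nstate(phis[j]).conj() @ nstate(phis[j + 1])
gamma = -np.angle(prod)                       # gauge-invariant discrete Berry phase

# for spin 1/2, |gamma| equals half the solid angle swept: pi (1 - cos theta0)
assert np.isclose(abs(gamma), np.pi * (1 - np.cos(theta0)), atol=1e-4)
```

Because only the product of overlaps around the closed loop enters, rephasing any individual $|n_j\rangle$ drops out: this is the gauge invariance modulo 2πm stated in (58.5).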


642 58 Nonstandard Statistics: Anyons and Nonabelions

We also defined a nonabelian generalization, in the case when the state with $E_n$ has degeneracy N, with index $|a\rangle$, $a = 1, \ldots, N$. Then the adiabatic Hamiltonian $\hat H(\vec K(t))$ has states $|n, \vec K\rangle$ depending on time through the parameters. In this case, the nonabelian Berry connection is

$$
\vec A_{ab}(\vec K) = i\,\langle a, \vec K(t)|\vec\nabla_K|b, \vec K(t)\rangle, \qquad(58.6)
$$

with values in the adjoint representation of the group U(N) of unitary transformations. The nonabelian Berry phase is

$$
\gamma^{(ab)} = \oint_C \vec A^{(ab)}(\vec K)\cdot d\vec K. \qquad(58.7)
$$

The Berry phase factor is path ordered,

$$
P\, e^{i\gamma^{(ab)}}, \qquad(58.8)
$$

implying that the expansion of the exponential in powers is ordered by the value of $\vec K$ along C: the later in the path, the further to the right.

58.2 Anyons in 2+1 Dimensions: Explicit Construction

We now construct explicitly an anyon system, in which the particles are neither bosons nor fermions, and under the permutation of two particles we obtain a phase $e^{i\alpha}$ different from ±1. The possibility of an abelian phase $e^{i\alpha}$ is restricted to 2+1 dimensions, which means that the system has an infinitesimal extent in the third spatial dimension.
Consider the following construction of an anyon system. We start with a system of N particles, for instance electrons, interacting with an electromagnetic field $A_\mu = (A_0, \vec A)$, since they are charged. The Hamiltonian is

$$
\hat H = \sum_{i=1}^N \frac{|\vec p_i - q\vec A(\vec r_i)|^2}{2m} + \sum_{i<j} v(\vec r_i - \vec r_j) + \sum_{i=1}^N q\, A_0(\vec r_i), \qquad(58.9)
$$

where $\vec r_i$ are the positions of the particles, and $v(\vec r_i - \vec r_j)$ is a two-particle potential.
The multiparticle wave function $\psi(\vec r_1, \ldots, \vec r_N)$ obeys the Schrödinger equation,

$$
\hat H\,\psi(\vec r_1, \ldots, \vec r_N) = E\,\psi(\vec r_1, \ldots, \vec r_N). \qquad(58.10)
$$

Now we make a canonical transformation (a unitary transformation) of the wave function, from ψ to φ = Uψ, as follows:

$$
\phi(\vec r_1, \ldots, \vec r_N) = U\,\psi(\vec r_1, \ldots, \vec r_N)
= \exp\left\{-i\,\frac{\theta}{\pi}\sum_{i<j}\alpha(\vec r_i - \vec r_j)\right\}\psi(\vec r_1, \ldots, \vec r_N), \qquad(58.11)
$$

where $\alpha(\vec r_i - \vec r_j)$ is the angle made by the relative separation $\vec r_i - \vec r_j$ with respect to a fixed axis. Under U the Hamiltonian transforms as $\hat H \to U^{-1}\hat H U$. We can easily check that

$$
U^{-1}\left(\vec p_i - q\vec A(\vec r_i)\right)U = \vec p_i - q\vec A(\vec r_i) - q\vec a(\vec r_i), \qquad(58.12)
$$

where we have defined an emergent (or statistical) gauge field $\vec a$ by

$$
q\,\vec a(\vec r_i) = \frac{\theta}{\pi}\sum_{j\neq i}\vec\nabla_i\,\alpha(\vec r_i - \vec r_j) = \frac{\theta}{\pi}\sum_{j\neq i}\frac{\hat z\times(\vec r_i - \vec r_j)}{|\vec r_i - \vec r_j|^2}. \qquad(58.13)
$$

Here $\hat z$ is the unit vector in the direction perpendicular to the two-dimensional material (the third spatial direction).
Note that, since $\alpha(\vec r_i - \vec r_j)$ is the angle of the relative separation with the fixed axis,

$$
\alpha(\vec r_i - \vec r_j) = \alpha(\vec r_j - \vec r_i) + \pi. \qquad(58.14)
$$

Substituting this in the definition of φ = Uψ, when we exchange the two particles i and j, after the canonical transformation by U we obtain an extra $e^{i\theta}$, besides the original ±1 valid for ψ:

$$
\phi(\ldots, \vec r_j, \ldots, \vec r_i, \ldots) = \pm e^{i\theta}\,\phi(\ldots, \vec r_i, \ldots, \vec r_j, \ldots). \qquad(58.15)
$$

Then, if θ = (2k+1)π, the statistics is changed from Bose–Einstein to Fermi–Dirac and from Fermi–Dirac to Bose–Einstein and, for more general θ, we change to fractional, or anyonic, statistics. The field $a_\mu$ is called the statistical gauge field, since it changes the statistics.
In the new representation, the Hamiltonian is

$$
\hat H = \sum_{i=1}^N \frac{|\vec p_i - q\vec A(\vec r_i) - q\vec a(\vec r_i)|^2}{2m} + \sum_{i<j} v(\vec r_i - \vec r_j) + \sum_{i=1}^N q\, A_0(\vec r_i), \qquad(58.16)
$$
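The angle identity (58.14) and the resulting exchange phase are easy to verify numerically (an illustrative sketch; the separation vector and the value of θ are arbitrary choices):

```python
import numpy as np

def alpha(r):
    """Angle of the relative separation r with a fixed axis, taken in [0, 2 pi)."""
    return np.arctan2(r[1], r[0]) % (2 * np.pi)

r = np.array([0.6, -1.3])
# eq. (58.14): exchanging the two particles shifts the relative angle by pi (mod 2 pi)
assert np.isclose((alpha(-r) - alpha(r)) % (2 * np.pi), np.pi)

# the U factor for this pair, exp[-i (theta/pi) alpha], then changes by e^{+- i theta}
theta = 0.4 * np.pi
u = lambda a: np.exp(-1j * (theta / np.pi) * a)
phase = u(alpha(-r)) / u(alpha(r))
assert np.isclose(abs(np.angle(phase)), theta)
```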

where $\vec a$ is added to $\vec A$, and its field strength (the corresponding magnetic field) is

$$
f_{12}(\vec r_i) \equiv b(\vec r_i) = \vec\nabla\times\vec a(\vec r_i) = \frac{2\theta}{q}\sum_{j\neq i}\delta(\vec r_i - \vec r_j) = \frac{2\theta}{q^2}\,\rho_{\rm charge}(\vec r_i), \qquad(58.17)
$$

where $\rho_{\rm charge}(\vec r_i) = q\sum_{j\neq i}\delta(\vec r_i - \vec r_j)$ is the charge density of the anyon system. Note that in 2+1 dimensions the magnetic field is a scalar B, not a vector, and the magnetic flux associated with it is just $\Phi = \int B\,dS$.
Since $\vec A \to \vec A + \vec a$ after the canonical (unitary) transformation, if $B = F_{12}(\vec r) = 0$ before the transformation, then the total magnetic field after it is

$$
\tilde B = \tilde F_{12} = \vec\nabla\times(\vec A + \vec a)(\vec r_i) = \frac{2\theta}{q}\sum_{j\neq i}\delta(\vec r_i - \vec r_j). \qquad(58.18)
$$

This means that we have a delta function magnetic flux associated with the particle (at its position),

$$
\Phi = \int_S \tilde F_{12}\,dS = \int_S f_{12}\,dS = \frac{2\theta}{q}, \qquad(58.19)
$$

where S is a small surface around the particle. Therefore, we have attached a magnetic flux at the position of the particle, turning it into an anyon.
The explicit construction above was through the canonical (unitary) transformation, leading to the magnetic field b for the statistical gauge field, but we can define such a delta function B for any gauge field, including the electromagnetic field itself, and still obtain anyon behavior.
Indeed, the Aharonov–Bohm phase obtained when one of the anyons is interchanged with another, by continuously moving it around the other, is

$$
\exp\left[iq\oint_C \vec A\cdot d\vec r\right] = \exp\left[iq\,\Phi[C]\right] = e^{2i\theta}. \qquad(58.20)
$$

However, rotating one anyon around another returns it to its original position, and therefore corresponds to two interchanges. That means that the factor for a single interchange is

$$
C_{12} = e^{i\theta}. \qquad(58.21)
$$

Moreover, the same argument can be used for the more general Berry phase γ, associated with a Berry connection $\vec A$. Then the Berry phase for one particle going around the other in parameter space gives

$$
e^{2i\theta} = e^{i\gamma} \;\Rightarrow\; \theta = \frac{\gamma}{2}. \qquad(58.22)
$$
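The flux attachment behind (58.20) can be checked by integrating the statistical gauge field of (58.13) around a loop enclosing the other particle (an illustrative sketch; the particle position, loop radius, and θ are arbitrary choices):

```python
import numpy as np

theta = 0.7                                   # statistics parameter
r_j = np.array([0.2, -0.1])                   # position of the other anyon (arbitrary)

N = 20000
ts = np.linspace(0, 2 * np.pi, N + 1)
pts = np.stack([1.5 * np.cos(ts), 1.5 * np.sin(ts)], axis=1)   # loop enclosing r_j

# q a(r) from eq. (58.13): (theta/pi) z_hat x (r - r_j) / |r - r_j|^2, midpoint line integral
mids = (pts[:-1] + pts[1:]) / 2 - r_j
qa = (theta / np.pi) * np.stack([-mids[:, 1], mids[:, 0]], axis=1) / (mids**2).sum(1, keepdims=True)
integral = np.sum(qa * (pts[1:] - pts[:-1]))

# a full loop (= two interchanges) picks up 2 theta, giving the factor e^{2 i theta} of (58.20)
assert np.isclose(integral, 2 * theta, atol=1e-6)
```

The result is independent of the loop's shape, since the statistical field is locally a pure gradient away from the particle: only the winding number matters.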

58.3 Chern–Simons Action

We have obtained the relation between the magnetic field (here the total magnetic field) and the
electric charge density,
$$F_{12} = \frac{2\theta\hbar}{q^2}\, \rho_{\rm charge} \equiv \frac{2\theta\hbar}{q^2}\, J_0, \qquad (58.23)$$
but $\rho_{\rm charge}$ is the zeroth component of the relativistic current $J^\mu$ (in 2+1 dimensions), $\rho_{\rm charge} = J^0$.
The above equation is the (12) component of a relativistically covariant equation,

$$F_{\mu\nu} = \frac{2\pi}{k}\, \epsilon_{\mu\nu\rho} J^\rho, \qquad (58.24)$$

with a “Chern–Simons quantization level” k, where

$$k = \frac{\pi q^2}{\theta\hbar}, \qquad (58.25)$$

and $\epsilon_{\mu\nu\rho}$ is the totally antisymmetric Levi–Civita tensor, with $\epsilon_{012} = +1$.
The action from which this relativistically covariant equation of motion comes is

$$\begin{aligned}
S_{\rm CS} &= \frac{k}{4\pi} \int_M d^{2+1}x\, \epsilon^{\mu\nu\rho} A_\mu \partial_\nu A_\rho \\
&= \frac{k}{4\pi} \int_M d^{2+1}x\, \big[ -A_1 \dot A_2 + A_2 \dot A_1 + A_0 \partial_1 A_2 - A_2 \partial_1 A_0 - A_0 \partial_2 A_1 + A_1 \partial_2 A_0 \big] \\
&= \frac{k}{2\pi} \int_M d^{2+1}x\, \big[ A_2 \dot A_1 + A_0 (\partial_1 A_2 - \partial_2 A_1) \big] \\
&= \frac{k}{2\pi} \int_M d^{2+1}x\, \big[ A_2 \dot A_1 + A_0 F_{12} \big],
\end{aligned} \qquad (58.26)$$

where we have used partial integration in the third equality, assuming that the boundary term at infinity vanishes.
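The partial integration above can be verified symbolically; the following sketch (using sympy, with arbitrary functions $A_\mu(t,x,y)$, not code from the book) checks that $\epsilon^{\mu\nu\rho} A_\mu \partial_\nu A_\rho$ differs from twice the final integrand only by total derivatives, which integrate to the dropped boundary terms.

```python
import sympy as sp

# Symbolic check of the partial integration in (58.26): the difference
# between eps^{mu nu rho} A_mu d_nu A_rho and twice the final integrand
# is a sum of total derivatives.
t, x, y = sp.symbols('t x y')
A = [sp.Function(f'A{i}')(t, x, y) for i in range(3)]
coords = (t, x, y)  # indices (0, 1, 2)

lhs = sum(sp.LeviCivita(m, n, p) * A[m] * sp.diff(A[p], coords[n])
          for m in range(3) for n in range(3) for p in range(3))

# integrand of the last line, times 2 (the prefactor changes k/4pi -> k/2pi)
target = 2*A[2]*sp.diff(A[1], t) + 2*A[0]*(sp.diff(A[2], x) - sp.diff(A[1], y))

# the total-derivative terms dropped at the boundary
total_der = (-sp.diff(A[1]*A[2], t) - sp.diff(A[0]*A[2], x)
             + sp.diff(A[0]*A[1], y))

print(sp.expand(lhs - target - total_der))  # 0
```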
Adding a source term,

$$S_{\rm source} = \int d^{2+1}x\, J^\mu A_\mu = \int d^{2+1}x\, \big[ \vec J \cdot \vec A + J^0 A_0 \big], \qquad (58.27)$$
the total action,

$$S = S_{\rm CS} + S_{\rm source} = \int d^{2+1}x \left[ \frac{k}{4\pi} \epsilon^{\mu\nu\rho} A_\mu \partial_\nu A_\rho + J^\mu A_\mu \right], \qquad (58.28)$$
has the equations of motion

$$\frac{k}{4\pi}\, \epsilon^{\mu\nu\rho} F_{\nu\rho} = J^\mu \;\Rightarrow\; F_{\mu\nu} = \frac{2\pi}{k}\, \epsilon_{\mu\nu\rho} J^\rho, \qquad (58.29)$$

as required.
Alternatively, the equation of motion for $A_0$ (considering $\vec A$ as fixed) is just the relation (without the relativistically covariant generalization)

$$F_{12} = \frac{2\theta\hbar}{q^2}\, J^0. \qquad (58.30)$$
Moreover, the Chern–Simons level k is quantized. Its quantization is related to Dirac quantization,
as we will now show.
The gauge transformation is in general

$$\delta \vec A = \vec\nabla \lambda, \qquad \delta A_0 = \dot\lambda, \qquad (58.31)$$
but we restrict to transformations that depend only on time, λ = λ(t). Then the Chern–Simons action transforms as

$$\delta S_{\rm CS} = \frac{k}{2\pi} \int_M d^{2+1}x\, \dot\lambda\, F_{12} = \frac{k}{2\pi} \int_{t_1}^{t_2} dt \int_M dS\, F_{12}\, \dot\lambda = \left[ \frac{\int_M B\, dS}{2\pi} \right] k\, \big( \lambda(t_2) - \lambda(t_1) \big). \qquad (58.32)$$
We now consider periodicity in time, which is related to finite temperature as we saw in Chapter 28
(the relation to finite temperature appearing after the Wick rotation to imaginary, or Euclidean, time),
$t_2 - t_1$ being the period. Since λ is the gauge parameter, under the gauge transformation the wave
function of the anyon changes by (see Dirac quantization in Chapter 27)
$$\psi \to e^{iq\lambda/\hbar}\, \psi, \qquad (58.33)$$
and in order for the gauge transformation to be well defined as we go around the periodicity cycle,
we need to have the periodicity
$$\lambda(t_2) - \lambda(t_1) = \frac{2\pi\hbar}{q}\, m, \qquad m \in \mathbb{Z}. \qquad (58.34)$$
Substituting this in the variation of the Chern–Simons action $\delta S_{\rm CS}$, together with the fact that the magnetic flux is quantized in units of $\Phi_0 = h/q$, as seen in Chapter 24 concerning the Hall effect,

$$\frac{\int B\, dS}{2\pi} = \frac{\Phi}{2\pi} = \frac{p}{2\pi} \frac{h}{q} = p\, \frac{\hbar}{q}, \qquad p \in \mathbb{Z}, \qquad (58.35)$$
and defining the product of the integers m and p as an integer n, we obtain

$$\delta S_{\rm CS} = \frac{2\pi\hbar^2}{q^2}\, k n. \qquad (58.36)$$
Since the action appears in the path integral for quantum calculations in the form $e^{iS/\hbar}$, and therefore the variation under the gauge transformation of $e^{iS/\hbar}$ must be trivial, $e^{i\delta S/\hbar} = 1$, we obtain

$$\delta S_{\rm CS} = 2\pi N \hbar \;\Rightarrow\; \frac{\hbar}{q^2}\, k \in \mathbb{Z}. \qquad (58.37)$$
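The counting can be illustrated with a short script (a sketch with $\hbar = q = 1$; the helper name and numerical values are ours, not the book's): the gauge variation $e^{i\delta S_{\rm CS}/\hbar}$, with $\delta S_{\rm CS} = 2\pi\hbar^2 k n/q^2$, is trivial for every winding number n exactly when the level is quantized as above.

```python
import cmath
import math

# Sketch of the Dirac quantization of the Chern-Simons level, with
# hbar = q = 1 for illustration: exp(i*deltaS_CS/hbar) must equal 1 for
# every integer winding n, which holds exactly when hbar*k/q^2 is integer.
hbar, q = 1.0, 1.0

def variation_trivial(k, n_max=5):
    return all(
        abs(cmath.exp(1j * (2*math.pi * hbar**2 * k * n / q**2) / hbar) - 1) < 1e-9
        for n in range(1, n_max + 1)
    )

print(variation_trivial(3))    # True: allowed level
print(variation_trivial(1.5))  # False: e^{i deltaS/hbar} != 1 already for n = 1
```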
It is usual to rescale the gauge field by the charge, $A_\mu \to A_\mu/q$, such that the electromagnetic kinetic term has $1/q^2$ in front,

$$S_{\rm em,kin} = -\frac{1}{4q^2} \int F_{\mu\nu} F^{\mu\nu}; \qquad (58.38)$$

then the quantization condition is simply that k is an integer in units of $1/\hbar$, or $\hbar k \in \mathbb{Z}$.
Then the Chern–Simons action $S_{\rm CS}$ generates anyons, and

$$\theta = \frac{\pi}{\hbar k}, \qquad \hbar k \in \mathbb{Z}. \qquad (58.39)$$
58.4 Example: Fractional Quantum Hall Effect (FQHE)
The Hall effect refers to the current density $\vec j$ ($j_y$) induced in a direction perpendicular to the applied electric field $\vec E$ ($E_x$), in the presence of a constant magnetic field $\vec B$ ($B_z$), with $j_y = \sigma_H E_x$. Then the integer quantum Hall effect (IQHE) refers to a quantized value of the Hall conductivity, $\sigma_H = n\sigma_0$, whereas the fractional quantum Hall effect (FQHE) refers to fractional values of the same, $\sigma_H = (q/r)\sigma_0$, with $q, r \in \mathbb{Z}$.
We will not consider further the theory of the FQHE, first encountered in Chapter 24, but will just say that for the particular case where

$$\sigma_H = \frac{1}{r}\, \sigma_0, \qquad (58.40)$$
the action that describes both the Hall effect on the conductivity, and the particles responsible for this effect, is

$$S_{\rm eff} = \int_M d^{2+1}x \left[ \frac{1}{2\pi} \epsilon^{\mu\nu\rho} A_\mu \partial_\nu a_\rho - \frac{r}{4\pi} \epsilon^{\mu\nu\rho} a_\mu \partial_\nu a_\rho \right]. \qquad (58.41)$$
The classical equations of motion of the action (for $a_\mu$) are

$$f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu = \frac{1}{r}\, F_{\mu\nu} = \frac{1}{r} (\partial_\mu A_\nu - \partial_\nu A_\mu) \;\Rightarrow\; a_\mu = \frac{1}{r}\, A_\mu. \qquad (58.42)$$
Then the on-shell action (solved for $a_\mu$) in terms of the electromagnetic field $A_\mu$ is

$$S_{\rm eff} = \frac{1}{r} \frac{1}{4\pi} \int_M d^{2+1}x\, \epsilon^{\mu\nu\rho} A_\mu \partial_\nu A_\rho, \qquad (58.43)$$

and from it we obtain the fractional Hall conductivity $\sigma_H = \sigma_0/r$ (though we will not consider it further here).
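The on-shell coefficient can be checked by tracking the scalar prefactors alone, since with $a_\mu = A_\mu/r$ both terms of (58.41) become proportional to the same structure $\epsilon^{\mu\nu\rho} A_\mu \partial_\nu A_\rho$. A minimal sympy sketch:

```python
import sympy as sp

# Coefficient check for (58.43): substituting a_mu = A_mu/r into (58.41),
# eps A da -> (1/r) eps A dA and eps a da -> (1/r^2) eps A dA, so only the
# scalar prefactors need to be summed.
r = sp.symbols('r', positive=True)
mixed_term = 1/(2*sp.pi)   # coefficient of eps A da in (58.41)
self_term = -r/(4*sp.pi)   # coefficient of eps a da in (58.41)

on_shell = mixed_term/r + self_term/r**2
print(sp.simplify(on_shell - 1/(4*sp.pi*r)))  # 0, i.e. the 1/(4 pi r) of (58.43)
```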
But if we do not solve for $a_\mu$, we obtain the kinetic term for the statistical field interacting with anyons. Indeed, the source term for electromagnetism, in the form of charged particles, $\int d^{2+1}x\, e J^\mu A_\mu$, of charge e, implies that there should also be a source term for the statistical gauge field $a_\mu$, in the form of “quasiparticles” of charge q under $a_\mu$,

$$\int d^{2+1}x\, q J^\mu a_\mu = q \int dt\, a_0(\vec x, t). \qquad (58.44)$$
Adding this source term to $S_{\rm eff}$, the equation of motion for $a_\mu$ becomes

$$\frac{F_{12}}{2\pi} - \frac{r f_{12}}{2\pi} + q\, \delta(\vec x - \vec x(t)) = 0. \qquad (58.45)$$
This implies that we have a delta function flux either in F12 (the electromagnetic field) or in f 12 (the
statistical field). But having a delta function flux in the electromagnetic field is an idealization that
does not sit well with reality, since the flux lines going up through the third direction should return
back down somewhere else. Instead, we can have a delta function flux in the statistical gauge field,
that is,

$$\frac{f_{12}}{2\pi} = \frac{q}{r}\, \delta(\vec x - \vec x(t)). \qquad (58.46)$$
Comparing this with the relation between the magnetic field and the charge density for anyons, we obtain that the anyon statistics parameter θ is

$$\theta = q^2\, \frac{\pi}{r}, \qquad (58.47)$$

and, as we said, the factor $q^2$ can be absorbed into a redefinition of the gauge fields.

58.5 Nonabelian Statistics
Until now we have considered an abelian phase statistics, but we can also have a nonabelian statistics. Moore and Read have shown that, even in the FQHE, we can have nonabelian statistics.
That means that under the interchange of particles, the possibilities for the change in the wave function do not belong to (representations of) the permutation group $S_N$ but rather to the braid group $B_N$. We can think of the anyons as abelian representations of the braid group $B_N$. But there are nonabelian representations of the braid group, and the particles charged under these representations are called nonabelian anyons, or nonabelions.
Under the interchange (braid), the wave function changes as

$$\psi_{p:\{i_1,\ldots,i_r,\ldots,i_s,\ldots,i_N\}}(z_1, \ldots, z_{i_r}, \ldots, z_{i_s}, \ldots, z_N) = \sum_q B_{pq}[i_1, \ldots, i_N]\, \psi_{q:\{i_1,\ldots,i_N\}}(z_1, \ldots, z_N), \qquad (58.48)$$

where p is a shorthand for the set of indices $\{i_1, \ldots, i_N\}$, and $z \in \mathbb{C}$ is the two-dimensional spatial coordinate.
The Moore–Read wave function for N “electrons without spin” is

$$\psi_{\rm Pf}(z_1, \ldots, z_N) = {\rm Pfaff}\left( \frac{1}{z_i - z_j} \right) \prod_{i<j} (z_i - z_j)^m \exp\left( -\frac{1}{4} \sum_j |z_j|^2 \right), \qquad (58.49)$$

where $z_i$ is the position of a “vortex” and the Pfaffian is a kind of “square root” of the determinant, when N is even (there is no space to explain this further). It is rather hard to show that a bound state of four excitations obeys nonabelian statistics. Equivalently, and also hard to show, we have a nonabelian Berry connection, leading to a nonabelian Berry phase, also with nonabelian statistics.

Figure 58.1 A basic braiding move.
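The Pfaffian statement can be illustrated concretely (a sketch, not from the text): for a real antisymmetric matrix of even size, ${\rm Pf}(M)^2 = \det M$; for a 4×4 matrix the Pfaffian is the standard three-term expression.

```python
import numpy as np

# Sketch: the Pfaffian as a "square root" of the determinant. For a real
# antisymmetric 4x4 matrix, Pf(M) = m01*m23 - m02*m13 + m03*m12, and
# Pf(M)^2 equals det(M).
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
M = B - B.T  # antisymmetric

pf = M[0, 1]*M[2, 3] - M[0, 2]*M[1, 3] + M[0, 3]*M[1, 2]
print(np.isclose(pf**2, np.linalg.det(M)))  # True
```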
An example of the action of the nonabelian braid on bound states of excitations is as follows. Consider oscillators with annihilation and creation operators $c_i$ and $c_i^\dagger$, with anticommutation relations

$$\{c_i, c_j^\dagger\} = \delta_{ij}, \qquad \{c_i, c_j\} = \{c_i^\dagger, c_j^\dagger\} = 0. \qquad (58.50)$$

Then define the Majorana fermion zero modes (modes of zero energy) $\gamma_i$ as

$$\gamma_i = c_i + c_i^\dagger, \qquad (58.51)$$

satisfying the Clifford algebra

$$\{\gamma_i, \gamma_j\} = \delta_{ij}, \qquad i = 1, \ldots, 2n. \qquad (58.52)$$

Then the complex fermion made up from the Majorana fermions is

$$\psi_k = \frac{\gamma_{2k+1} + i\gamma_{2k}}{2}, \qquad k = 1, \ldots, n, \qquad (58.53)$$
and has a $2^n$-dimensional Hilbert space. For each vortex, we have one Majorana zero mode, but the relevant excitations forming the nonabelion are nonlocal (they come from several vortices).
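A minimal sketch of this construction (our code, using the common normalization $\{\gamma_a, \gamma_b\} = 2\delta_{ab}$, which differs from (58.52) by a factor of 2): build the $c_i$ by a Jordan–Wigner construction for n = 2 modes, form two Majorana operators per mode, and check the Clifford algebra.

```python
import numpy as np

# Sketch: Majorana operators from n = 2 fermionic modes via Jordan-Wigner.
# Normalization here: {gamma_a, gamma_b} = 2 delta_ab (the text's (58.52)
# differs by a factor of 2).
I2 = np.eye(2)
Z = np.diag([1.0, -1.0])     # Jordan-Wigner string
sm = np.array([[0.0, 1.0],
               [0.0, 0.0]])  # single-mode annihilation operator

c = [np.kron(sm, I2), np.kron(Z, sm)]  # c_1, c_2 on the 4-dim Fock space
gammas = []
for ci in c:
    gammas += [ci + ci.conj().T, 1j*(ci.conj().T - ci)]  # two Majoranas per mode

for a, ga in enumerate(gammas):
    for b, gb in enumerate(gammas):
        assert np.allclose(ga @ gb + gb @ ga, 2*np.eye(4)*(a == b))
print("Clifford algebra verified for 2n = 4 Majorana modes")
```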
An example of braiding, in fact the one that generates the braid group, is where the ith vortex is braided with the (i + 1)th vortex, in an anticlockwise direction:

$$R: \{\gamma_i \to \gamma_{i+1},\; \gamma_{i+1} \to -\gamma_i,\; \gamma_j \to \gamma_j,\; j \neq i, i+1\}. \qquad (58.54)$$
A graphical representation of a basic braiding move is given in Fig. 58.1.
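The braid generator (58.54) can be represented on the Majorana labels as a signed permutation matrix (a sketch; the matrix convention is ours). A single interchange squared is not the identity, unlike an ordinary permutation, while four applications return to the identity, and adjacent generators satisfy the braid relation.

```python
import numpy as np

# Braid generator R_i of (58.54) on the labels of 2n = 4 Majorana zero
# modes: gamma_i -> gamma_{i+1}, gamma_{i+1} -> -gamma_i, as a signed
# permutation matrix acting on the vector of gamma's.
def R(i, n=4):
    m = np.eye(n)
    m[i, i] = m[i+1, i+1] = 0.0
    m[i, i+1] = 1.0    # gamma_i -> gamma_{i+1}
    m[i+1, i] = -1.0   # gamma_{i+1} -> -gamma_i
    return m

r = R(0)
print(np.allclose(r @ r, np.eye(4)))          # False: interchange^2 != 1
print(np.allclose(r @ r @ r @ r, np.eye(4)))  # True: R^4 = 1
print(np.allclose(R(0) @ R(1) @ R(0),
                  R(1) @ R(0) @ R(1)))        # True: braid relation
```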
Important Concepts to Remember
• Instead of the Bose–Einstein and Fermi–Dirac statistics, in 2+1 dimensions we can also have particles whose statistics are defined by interchange by a phase $e^{i\theta}$, called anyons, or by a nonabelian matrix, called nonabelions.
• We can construct anyons with $e^{i\theta}$ by a unitary transformation from the wave function of a system of ordinary particles (for instance electrons), obtaining an extra coupling to a statistical gauge field $q a_\mu$, with $q\vec a = \frac{\theta\hbar}{\pi} \sum_{j\neq i} \vec\nabla_i \alpha(\vec r_i - \vec r_j)$.
• To construct the anyon we attach a flux at the position of the particle, $\tilde B = \tilde F_{12} = \vec\nabla \times (\vec A + \vec a)(\vec r_i) = \frac{2\theta\hbar}{q} \sum_{j\neq i} \delta(\vec r_i - \vec r_j)$.
• The constructed anyon satisfies the equation of motion for the Chern–Simons action with a source, $\int d^{2+1}x \left[ \frac{k}{4\pi} \epsilon^{\mu\nu\rho} A_\mu \partial_\nu A_\rho + J^\mu A_\mu \right]$, with quantized level k, quantized via Dirac quantization.
• Anyons and the associated Chern–Simons action appear also in the theoretical description of the fractional quantum Hall effect (FQHE), for $\sigma_H = \sigma_0/r$.
• A nonabelian statistics is also possible, with interchange under the braiding group, exemplified on
Majorana fermion zero modes, as found from the Moore–Read wave function.
Further Reading
More details about the FQHE and nonabelian statistics can be found in the review by David Tong
[29], and the Moore–Read paper [30].
Exercises
(1) The nonabelian Berry phase, by its very definition, changes the wave function, even though we
return to the initial state in terms of K (t) (in the abelian case, the wave function changes by a
phase, so |ψ| 2 doesn’t change). Can you give an example where this is consistent?
(2) Show the steps in the proof of the relation (58.12), for constructing an anyon from a statistical
gauge field.
(3) Consider the (01) component of the equation of motion $F_{\mu\nu} = \frac{2\pi}{k} \epsilon_{\mu\nu\rho} J^\rho$, coming from the relativistic Chern–Simons action with a source. What physical property does it describe? Note that this is now related to the existence of anyons.
(4) Show that the Chern–Simons action (without a source) is equal to a (3 + 1)-dimensional action
of a gauge invariant, topological (metric independent, and with discrete values) type.
(5) When describing the FQHE in the text, we gave the option of solving for the statistical gauge
field aμ and obtaining a Chern–Simons-like action for the electromagnetic field Aμ , or of adding
a source term and obtaining an anyon. But why did we not consider the equation of motion for
Aμ , either in the first or in the second case?
(6) When discussing the FQHE anyon as a statistical gauge delta function flux added to the
quasiparticles, we argued that for the electromagnetic field Fμν to be a delta function is
unphysical because of flux conservation. But why then is it OK for the statistical f μν to be a
delta function?
(7) Show that the factor $(z_i - z_j)^m$ in the Moore–Read wave function (also present in the phenomenological “Laughlin wave function” for a theoretical description of the FQHE) implies there are m “vortices” at each position. A vortex ansatz is $f(r)e^{i\alpha}$, where r, α are the radius and the polar angle in the plane, respectively, and one can impose a consistency condition on the ansatz (and so find it).
References
[1] J.J. Sakurai and San Fu Tuan, Modern Quantum Mechanics, revised edition, Addison–Wesley,
1994.
[2] Albert Messiah, Quantum Mechanics, Dover Publications, 2014.
[3] R. Shankar, Principles of Quantum Mechanics, second edition, Springer, 1994.
[4] L.D. Landau and E.M. Lifshitz, Quantum Mechanics (Non-Relativistic Theory), Butterworth-Heinemann, 1981.
[5] H. Nastase, Introduction to Quantum Field Theory, Cambridge University Press, 2019.
[6] H. Nastase, String Theory Methods for Condensed Matter Physics, Cambridge University Press,
2017.
[7] H. Nastase, Classical Field Theory, Cambridge University Press, 2019.
[8] Howard Georgi, Lie Algebras in Particle Physics: From Isospin to Unified Theories, Westview
Press, 1999.
[9] Michael V. Berry, “Quantal phase factors accompanying adiabatic changes”, Proc. Roy. Soc.
Lond. A 392 (1984) 45.
[10] R. Rajaraman, Solitons and Instantons: An Introduction to Solitons and Instantons in Quantum
Field Theory, first edition, North-Holland Personal Library, Vol. 15, 1987.
[11] Paul A.M. Dirac, Lectures in Quantum Mechanics, Dover Publications, 2001.
[12] John Preskill, Caltech Lecture Notes on Quantum Information, http://theory.caltech.edu/~preskill/ph219/.
[13] E. Knill, R. Laflamme, H. Barnum, et al., “Introduction to quantum information processing”,
arXiv:quant-ph/0207171 [quant-ph].
[14] Richard Jozsa, “An introduction to measurement-based quantum computation”, arXiv:quant-
ph/0508124 [quant-ph].
[15] Daniel Gottesman, “An introduction to quantum error correction and fault-tolerant quantum computation”, arXiv:0904.2557 [quant-ph].
[16] J.S. Bell, “On the Einstein–Podolsky–Rosen paradox”, Physics 1 (3) (1964) 195.
[17] E.P. Wigner, American Journal of Physics 38 (1970) 1005.
[18] J. Maldacena, S.H. Shenker, and D. Stanford, “A bound on chaos”, J. High-Energy Phys. 08
(2016) 106, arXiv:1503.01409 [hep-th].
[19] R. Jefferson and R.C. Myers, “Circuit complexity in quantum field theory”, J. High-Energy
Phys. 10 (2017) 107, arXiv:1707.08570 [hep-th].
[20] M.A. Nielsen, “A geometric approach to quantum circuit lower bounds”, Quantum Inform.
Comput. 6 (2006) 213 [quant-ph/0502070].
[21] A.R. Brown and L. Susskind, “Second law of quantum complexity”, Phys. Rev. D 97 (8) (2018)
086015, arXiv:1701.01107 [hep-th].
[22] S. Xu and B. Swingle, “Accessing scrambling using matrix product operators”, Nature Phys.
16 (2) (2019) 199, arXiv:1802.00801 [quant-ph].
[23] M. Srednicki, “Chaos and quantum thermalization”, cond-mat/9403051.
[24] R. Nandkishore and D. A. Huse, “Many-body localization and thermalization in quantum
statistical mechanics”, Ann. Rev. Cond. Matter Phys. 6 (2015) 15.
[25] W.H. Zurek, “Decoherence, Einselection, and the quantum origins of the classical”, Rev. Mod.
Phys. 75 (2003) 715, arXiv:quant-ph/0105127 [quant-ph].
[26] Marcos Rigol, Vanja Dunjko, and Maxim Olshanii, “Thermalization and its mechanism for
generic isolated quantum systems”, Nature 452 (2008) 854, arXiv:0708.1324 [cond-mat].
[27] A.L. Fetter and J.D. Walecka, Quantum Theory of Many-Particle Systems, Dover, 2003.
[28] N. Levinson, “Determination of the potential from the asymptotic phase”, Phys. Rev. 75 (1949)
1445.
[29] David Tong, “Lectures on the quantum Hall effect”, arXiv:1606.06687 [hep-th].
[30] G.W. Moore and N. Read, “Nonabelions in the fractional quantum Hall effect”, Nucl. Phys. B
360 (1991) 362.
Index

adiabatic approximation, 426 canonical quantization, 60


Aharonov–Bohm effect, 260, 308 conditions, 60
Aharonov–Bohm phase, 641 Cauchy–Schwarz inequality, 70
angular momenta, central potential, 214
composition, 174 cases, 215
angular momentum, 137, 159 channel, scattering, 593
complex, 566, 574 chaos, quantum, 384, 388
total, 194 Chebyshev polynomials, 102
antilinear operator, 15, 152, 153 chemical bond, 473
anyons, 641, 642 Chern–Simons
arrow of time, 149 action, 644
atomic orbital method, 472 level, 644
Clebsch–Gordan coefficients, 176, 177, 185
Balmer relation, 5 Clifford algebra, 606, 648
Banach space, 24 coherent states, 96
Bell inequalities, 348 completeness relation, 15
Bell inequality, complexity
original, 350 classical, 384
Bell–CHSH inequality, 356 quantum, 384, 385
Bell–Wigner inequalities, 353 quantum, Nielsen, 386
Berry connection, 264, 641 Compton effect, 4
nonabelian, 269 computation, quantum, 54
Berry curvature, 267 conductance tensor, 278
Berry phase, 260, 264, 641 confluent hypergeometric function, 209
Bessel function, constraints
spherical, 537 first-class, 326, 327
Bethe ansatz, 513 primary, 326
equations, 513 second-class, 326, 327
Bethe roots, 513 secondary, 326
Bethe–Salpeter equation, 632, 638 continuity equation, probability, 83
blackbody radiation, 3 correlation function, 123
Bloch equations, 493 n-point function, 123
Bogoliubov transformation, 393, 400 correspondence principle, 7
Bohr magneton, 46, 196 cost (Finsler) function, 387
Bohr radius, 210 Coulomb scattering, WKB approximation, 584
Bohr–Sommerfeld quantization, 8, 282, 290 creation/annihilation operators, 93
Boltzmann distribution, 344 cross section, 516
Born approximation, 528 cryptography, quantum, 380
elastic scattering, 424 cyclotron frequency, 274
Born series, 528
Born–Oppenheimer approximation, 266 Darwin term, 610
Bose–Einstein statistics, 230 Davisson–Germer experiment, 8
boson, 229 de Broglie particle–wave duality, 132
bound states, 557, 558 de Broglie wavelength
braid group, 647 definition, 8, 132
Breit–Wigner distribution, 429, 432 decay width, 431
Breit–Wigner formula, 569 decoherence, 380
butterfly velocity, 389 quantum, 393
delta function, Dirac, 26, 27 field operators, 623, 628
density matrix, 42, 361, 362 fine structure, 472
diatomic molecule, 220 flux quantum, 263, 275
dipole moment Fock space, 623, 626
electric, 440 form factors, 592
magnetic, 441 Fredholm theorem, 31
Dirac bra-ket notation, 12 free particle, 63
Dirac brackets, 329 spherical coordinates, 190
Dirac equation, 257, 603
Dirac quantization condition, 301, 303 gamma matrices, 606
Dirac quantization, constrained systems, 325 gate
Dirac string, 308 AND, 384
dissipation time, 390 INPUT, 384
distribution, 27 NOT, 384
Bose–Einstein, 370 OR, 384
Fermi–Dirac, 370 quantum Hadamard, 385
Maxwell, 368 quantum NOT, 385
double-slit experiment, 6 quantum phase, 385
dual space, 17 quantum Toffoli, 386
Dyson equation, 638 XOR, 385
gauge field, statistical, 643
effective mass of electron, 254 Gaussian integration formula, 119
effective potential, radial Schrödinger, 206 Gaussian integration in Grassmann algebra, 320
Ehrenfest theorem, 127, 140, 153 Gegenbauer polynomials, 102
eigenstate thermalization hypothesis (ETH), generalized Gibbs ensemble (GGE), 398
398 generating functional, 123, 124
eigenvalue problem, 19 generating function, canonical transformation,
eikonal approximation, 577, 581 59
einselection, 395 geometric phase, 264
Einstein coefficient, 497 geometrical optics, 581
Einstein locality, 347 geometrical optics approximation, 132
energy shift, 431 Gram–Schmidt theorem, 14
ensemble Grassmann algebra, 318
canonical, 368 Green’s function, 506
grand canonical, 369 Green’s function, radial, 550
quantum, 42 group
quasi-microcanonical, 367 abelian, 144
statistical, 363 cyclic, 144
entangled state, 339 definition, 143
maximally, 53, 343
entanglement, 52, 339 H2 molecule, 240, 246
entropy Hall effect, 272
entanglement, 343, 345, 371 Hall voltage, 276
Gibbs, 344 Hamilton–Jacobi
Renyi, 371 equation, 130, 131, 578
statistical, 368 formalism, 127, 130, 282
thermal entanglement, 371 Hamiltonian
von Neumann, 344, 368, 377 extended, 328
EPR paradox, 339, 345, 350 total, 327
ergodic hypothesis, 364, 398 Hamilton’s principal function, 131
Euler angles, 161 Hankel function, spherical, 537
exchange density probability, 237, 242 harmonic oscillator, 91
exchange integral, 242 fermionic, 233
forced, 311
Fermi golden rule, 422, 495, 529 isotropic, 214, 222
Fermi–Dirac statistics, 230 isotropic, cylindrical coordinates, 225
fermion, 229 isotropic, spherical coordinates, 223
Feynman–Kac formula, 316 Hartree–Fock approximation, 490, 632
Feynman theorem, 637 Hartree–Fock method, 472
Hartree–Fock potential, 635 Majorana fermion, 648
Heisenberg spin 1/2 Hamiltonian, 513 many-body localization (MBL), 398
helium atom, 240 many-body physics, 623
Helmholtz equation, 537 matrix mechanics, 6
Hermite polynomials, 98, 102 Maxwell duality, 301
hidden variables, 345, 350 measurement, quantum, 41
Hilbert space, 21 Meissner effect, 255
Hilbert–Schmidt theorem, 31 metric space, complete (Cauchy), 24
Hund rules, 471 minimal coupling, 194, 250
hybridization, 479 mixed state, 361
hydrogen atom, 204 monodromy, 308
hyperfine structure, 472 monopole, ’t Hooft, 303
Moore–Read wave function, 647
identical particles, 229 multiparticle states, 614
inelastic scattering, 588 multiplication table, 143
instanton, 447, 453 mutual information, 376
integral
Lebesgue, 26 Neumann function, spherical, 537
Riemann, 26 no-cloning principle, 380
interaction with magnetic field, 194 Noether theorem, 137
nonabelions, 641, 647
Jacobi identity, 157, 328 number operator, 93
Jacobi polynomials, 101
Jost function, 542, 563 occupation number, 617
picture, 614
Klein–Gordon equation, 604 on-shell value, 121
operator
Laguerre polynomials, 102, 211 adjoint, 17
Landau levels, 272, 273 Hermitian, 19
Landé g factor, 46, 278, 437 in infinite dimensions, 30
Larmor frequency, 437 matrix representation, 16
Larmor precession, 203 unitary, 18
laser, 492 optical theorem, 521, 547, 551
Laughlin wave function, 277 orbital, 469
LCAO approximation, 616 molecular, 476, 479
LCAO method, 477 ortho-helium, 243
Legendre functions, associated, 198 orthogonal polynomials, classical, 101, 211
Legendre polynomials, 101, 198 orthonormality, 13
Levi–Civita tensor, 46 oscillation states, 50
Levinson theorem, 542, 544, 573 oscillations, neutrino, 50
Lie algebra, 155, 164 OTOC, 388
Lie group, 155
classical, 146 pairing energy, 490
linear operator, 15 para-helium, 243
Liouville–von Neumann equation, 366 parity, 137, 149
Lippman–Schwinger equation, 507 partial wave amplitude, 536
three-dimensional, 516 partial wave expansion, 535
liquid droplet model, nuclear, 483 particle creation, 401
local realism, 347 particle–wave complementarity, 6
London equations, 255 partition function, connection with statistical mechanics, 315
London–London model, 255 thermodynamical, 368
Lorentz force, 251 Paschen–Back effect, 437
LS coupling (Russell–Saunders), 471 path integral, 118
Lyapunov exponent, 388 configuration space, 120
quantum, 390 fermionic, 311
Feynman, 116
magnetic moment, 195 harmonic phase space, 121
magnetic monopole, 301 imaginary time, 311
Dirac, 303 phase space, 119
magnon, 512 Pauli exclusion principle, 230
Pauli matrices, 46, 172 analytical properties, 557
permutation operator, 229 scattering length, 535, 541
perturbation theory scattering, one-dimensional, 503
stationary (time-independent), 407 Schmidt decomposition, 342
time-dependent, 418 Schrödinger equation, 8, 36
time-dependent, second-order, 429 Schrödinger’s cat, 393
Pfaffian, 647 Schwinger oscillator model, 179
phase shift, 535, 537 scrambling time, 390
WKB approximation, 580 selection rules, 151, 438, 441
photoelectric effect, 436, 444 semiclassical approximation, 282
photon in photoelectric effect, 4 separable state, 340
picture Shannon entropy, 376
Dirac (interaction), 112 conditional, 376
Heisenberg, 106, 108 Shannon theory, 375
Schrödinger, 106 shell model, 470
Planck constant, definition, 4 nuclear, 483
Planck formula, 497 Slater determinant, 230, 615, 632
postulates of quantum mechanics, 35 SO(n), 159
premeasurement, 395 Sommerfeld polynomial method, 97, 207, 217
projector, 39 spectroscopic notation, 469
propagator, 116 spherical harmonics, 188
Feynman, 313 addition formula, 199
free correlator, 312 spin, 194
spin chains, 512
quadrupole moment, electric, 441 spin–orbit coupling, 472
quantum chemistry, 469 spin–orbit interaction, 257, 610
quantum computation, 378 spin–statistics theorem, 234
quantum computing, 375 spinor rotation, 199
quantum Hall effect square well
fractional (FQHE), 277, 646 finite, 84
integer (IQHE), 275 infinitely deep, 78
quantum information, 375 Stark effect, 414
quantum supremacy, 380 state
quantum teleportation, 381 mixed, 43
qubit, 54, 378 pure, 43
statistical operator, 362
Rabi formula, 493 statistics, nonabelian, 647
Racah coefficients, 178 Stefan–Boltzmann constant, 4
reflection coefficient, 87 Stern–Gerlach experiment, 5, 45, 194, 272
Regge pole, 575 Stirling approximation, 376
Regge theory, 574 SU (2), 162
representation structure constants, 156
adjoint, 157 sudden approximation, 421, 425
fundamental, 145 superconductor, 254
reducible, 146 symmetry
regular, 145 active, 140
resonance, Breit–Wigner, 568 continuous, 137, 143
resonances, 566 discrete, 149
Riemann sheet, energy domain, 562 internal, 137, 149, 154
Rindler space, 400 passive, 140
Rodrigues, formula, 102
rotation matrix in the ( j) representation, T-matrix, 528, 531
184 tensor operators, 183
Rutherford formula, 425, 530 tensor product, 21
Rydberg constant, 5, 210 thermalization, quantum, 393, 397
thermofield double state, 372
S-matrix, 503, 521, 528 Thomas precession, 610
S-matrix element partial wave, 540 threshold behavior, 557
scalar product, 13 time-ordering operator, 107
scattering amplitude, 516 time-reversal invariance, 149, 152
time-reversal operator vector space, 12
states with spin, 201 topological, 24
trace, 21 vortex, 647
transfer matrix, 503, 510
transition frequency, 420 wave function
transition functions, 305 collapse, 54
translation operator, 57 definition, 8
translation, lattice, 149 wave packet, 64
transmission coefficient, 84, 87 Gaussian, 66, 72
tunneling effect, 87 Wick rotation, 315
tunneling probability, 88 Wigner 3j symbol, 177
two-body problem, 204 Wigner 6j symbol, 179
two-level system, 45 Wigner–Eckhart theorem, 185, 437
WKB approximation, 282
uncertainty principle, Heisenberg, 66 path integral, 287
uncertainty relation, energy–time, 72 scattering, 577
uncertainty relations, Heisenberg, 69, 218 WKB method, 127, 133
unitarity, 547 state transitions, 447
unitarity bound, 549 transition formulas, 284
WKB solution
van der Waals interaction, 248 in one dimension, 283
variance, 66 Wronskian, 542
Gaussian, 72 Wronskian theorem, 76, 503
variational method, 245, 457
Ritz, 458 Zeeman effect, 272, 279, 436
