Alessandro De Angelis
Mário João Martins Pimenta
Introduction to Particle and Astroparticle Physics
Questions to the Universe
Undergraduate Lecture Notes in Physics
Undergraduate Lecture Notes in Physics (ULNP) publishes authoritative texts covering
topics throughout pure and applied physics. Each title in the series is suitable as a basis for
undergraduate instruction, typically containing practice problems, worked examples, chapter
summaries, and suggestions for further reading.
The purpose of ULNP is to provide intriguing, absorbing books that will continue to be the
reader’s preferred reference throughout their academic career.
Series editors
Neil Ashby
Professor Emeritus, University of Colorado, Boulder, CO, USA
William Brantley
Professor, Furman University, Greenville, SC, USA
Matthew Deady
Professor, Bard College Physics Program, Annandale-on-Hudson, NY, USA
Michael Fowler
Professor, University of Virginia, Charlottesville, VA, USA
Morten Hjorth-Jensen
Professor, University of Oslo, Oslo, Norway
Michael Inglis
Professor, SUNY Suffolk County Community College, Long Island, NY, USA
Heinz Klose
Professor Emeritus, Humboldt University Berlin, Germany
Helmy Sherif
Professor, University of Alberta, Edmonton, AB, Canada
Alessandro De Angelis
Department of Mathematics, Physics and Computer Science, University of Udine and INFN Padova, Padova, Italy

Mário João Martins Pimenta
Laboratório de Instrumentação e Física de Partículas, University of Lisboa, Lisboa, Portugal
Preface
This book introduces particle physics, astrophysics, and cosmology starting from experiment. It provides a unified view of these fields, which is needed to answer our questions about the Universe in the best way—a unified view that has somehow been lost in recent years owing to increasing specialization.
Particle physics has recently seen the incredible success of the so-called “standard model.” A 50-year-long search for the missing ingredient of the model, the Higgs particle, has just been concluded successfully, and some scientists claim that we are close to the limit of the physics humans may know. Astrophysics and cosmology, too, have shown an impressive evolution, driven by experiments and complemented by theories and models. We have nowadays a “standard model of cosmology” which can describe the evolution of the Universe from a tiny time after its birth to any foreseeable future. The situation is similar to that at the end of the nineteenth century, after the formulation of the Maxwell equations—and we know how the story went.
As at the end of the nineteenth century, there are some clouds that might hide a
new revolution in physics. The main cloud is that experiments indicate that we are
still missing the description of the main ingredients of the Universe from the point
of view of its energy budget. We believe one of these ingredients to be a new
particle, of which we know very little; and the other to be a new form of energy.
The same experiments indicating the need for these new ingredients are probably
not powerful enough to unveil them, and we must invent new experiments to do it.
The scientists who will solve this puzzle will base their project on a unified vision of physics, and this book helps to provide such a vision.
This book is addressed primarily to advanced undergraduate or beginning graduate students, since the reader is only assumed to know quantum physics and “classical” physics, in particular electromagnetism and analytical mechanics, at an introductory level; but it can also be useful to graduate students and postdoctoral researchers involved in high-energy physics or astrophysics research, and to senior particle and astroparticle physicists as a reference book.
Exercises at the end of each chapter help the reader review material from the chapter itself and synthesize concepts from several chapters. A “further reading” list is also provided for readers who want to explore particular topics in more detail.
Our experience is based on research both at artificial particle accelerators (in our younger years) and, since the late 1990s, in astroparticle physics. We have worked as professors for more than twenty years, teaching courses on particle and/or astroparticle physics at undergraduate and graduate levels. We have spent long periods in several research institutions outside our countries, also teaching there and gaining experience with students from different backgrounds.
This book contains broad and interdisciplinary material, which is appropriate for a reference book but can be too much for a textbook. In order to give coherence to the material for a course, one can think of at least three paths through the manuscript:
• For an “old-style” one-semester course on particle physics for students with a
good mathematical background, one could select Chaps. 1–6, part of 7, and
possibly (part of) 9.
• For a basic particle physics course centered on astroparticle physics, one could
instead use Chaps. 1, 2 (possibly excluding Sect. 2.10.7), 3, 4 (excluding Sect. 4.4),
Sects. 5.1, 5.2, 5.4 (excluding Sect. 5.4.1), 5.5 (possibly excluding Sect. 5.5.4), 5.6,
5.7, 6.1, 8.1, 8.4, part of Chap. 10, and if possible Chap. 11.
• A specialized half-semester course in high-energy astroparticle physics could be
based on Sects. 4.3.2, 4.5 and Chaps. 8, 10, 11; if needed, an introduction to
experimental techniques could be given based on Sects. 4.1 and 4.2.
This book would not have been possible without the help of friends and colleagues; we mention here (in alphabetical order) Pedro Abreu, Sofia Andringa,
Stefano Ansoldi, Pedro Assis, Liliana Apolinario, Luca Baldini, Fernando Barão,
Sandro Bettini, Barbara Biasuzzi, Giovanni Busetto, Per Carlson, Nuno Castro,
Julian Chela-Flores, Ruben Conceiçao, Jim Cronin, Michela De Maria, Tristano di
Girolamo, Jorge Dias de Deus, Anna Driutti, Catarina Espírito Santo, Fernando
Ferroni, Giorgio Galanti, Riccardo Giannitrapani, Marco Laveder, Francesco
Longo, José Maneira, Oriana Mansutti, Mauro Mezzetto, Alessandro Pascolini,
Gianni Pauletta, Elena Pavan, Massimo Persic, Ignasi Reichardt, Jorge Romao,
Marco Roncadelli, Sara Salvador, Ron Shellard, Franco Simonetto, Fabrizio
Tavecchio, Bernardo Tomé, Ezio Torassa, Andrea Turcati, Robert Wagner, Alan
Watson, and Jeff Wyss. We also thank all our students, who have patiently listened to us and discussed with us throughout the past years.
Contents

5.7 The Particle Data Group and the Particle Data Book
5.7.1 PDG: Estimates of Physical Quantities
5.7.2 Averaging Procedures by the PDG
Index
Acronyms
BH Black hole
BL Lac BL Lacertae (an active galactic nucleus)
BNL Brookhaven national laboratory (in Long Island, NY)
Borexino Boron solar neutrino experiment (at the LNGS)
BR Branching Ratio (in a decay process)
CANGAROO Collaboration of Australia and Nippon (Japan) for a gamma-ray
observatory in the outback (Cherenkov observatory)
CAST CERN axion search telescope (experiment at CERN)
CDF Collider detector at Fermilab (experiment)
CERN European Organization for Nuclear Research, also European
laboratory for particle physics
CGC Color glass condensate
CGRO Compton gamma-ray observatory (orbiting the Earth)
cgs centimeter, gram, second (system of units)
CKM Cabibbo, Kobayashi, Maskawa (matrix mixing the quark flavors)
CMB Cosmic microwave background (radiation)
CMS Compact muon solenoid (experiment at CERN)
COBE Cosmic background explorer (satellite orbiting the Earth)
CoGeNT Coherent germanium neutrino telescope (experiment in the US)
COUPP Chicagoland observatory for underground particle physics
(experiment at Fermilab)
CP Charge conjugation × Parity (product of symmetry operators)
CPT Charge conjugation × Parity × Time reversal (product of
symmetry operators)
CR Cosmic rays
CRESST Cryogenic rare event search with superconducting thermometers
(experiment at LNGS)
CTA Cherenkov telescope array (an international gamma-ray detector)
CUORE Cryogenic underground observatory for rare events (experiment
at LNGS)
D0 Experiment at Fermilab
DASI Degree angular scale interferometer
DAMA Dark matter experiment (at LNGS)
DAMPE Dark matter particle explorer (astrophysical space observatory)
DAQ Data acquisition (electronics system)
DARMa De Angelis, Roncadelli, Mansutti (model of axion-photon
mixing)
DAS Data acquisition system
DELPHI Detector with lepton, photon, and hadron identification (exper-
iment at the CERN’s LEP)
DESY Deutsches Elektronen-Synchrotron (laboratory in Germany)
DM Dark matter
DNA Deoxyribonucleic acid (the genetic basis of life)
DONUT Direct observation of the ντ (experiment at Fermilab)
HE High energy
HEGRA High-energy gamma-ray astronomy (Cherenkov experiment in
La Palma)
HERA Hadron-Elektron-Ringanlage (particle accelerator at DESY)
H.E.S.S. High-energy stereoscopic system (Cherenkov experiment in
Namibia)
HPD Hybrid photon detector
HST Hubble space telescope (orbiting the Earth)
IACT Imaging atmospheric Cherenkov telescope
IBL Intermediate-energy peaked BL Lac
IC Inverse Compton scattering (mechanism for the production of
HE gamma-rays)
ICRR Institute for Cosmic Ray Research (at the University of Tokyo,
Japan)
IceCube Neutrino observatory in Antarctica
IDPASC International doctorate on particle and astroparticle physics,
astrophysics, and cosmology (doctoral network)
IMB Irvine, Michigan, Brookhaven (experiment in the US)
INFN Istituto Nazionale di Fisica Nucleare (in Italy)
IR Infrared (radiation)
IRB Infrared Background (photons)
ISS International Space Station
IST Instituto Superior Técnico (at the University of Lisboa, Portugal)
JEM Japanese experimental module (onboard the ISS)
JEM-EUSO Extreme universe space observatory at JEM
K2K KEK to Kamioka experiment (Japan)
Kamiokande Kamioka neutrino detector (experiment in Japan)
KamLAND Kamioka liquid scintillator antineutrino detector (experiment in
Japan)
KASCADE Karlsruhe shower and cosmic array detector (experiment in
Germany)
KATRIN Karlsruhe tritium neutrino experiment (in Germany)
KEK High-energy accelerator in Japan
Kepler Mission to search for extraterrestrial planets (NASA)
KM Parametrization of the CKM matrix in the original paper by
Kobayashi and Maskawa
KM3NeT Kilometer cube neutrino telescope (experiment in the
Mediterranean Sea)
KTeV Experiment at Fermilab
L3 LEP 3rd (experiment at CERN)
LAr Liquid argon
LAT Large area telescope (detector in the Fermi Satellite)
LBL Low-energy peaked BL Lac
ΛCDM Lambda and Cold Dark Matter (model with cosmological
constant Λ)
UV Ultraviolet (radiation)
V-A Vector minus axial-vector relational aspect of a theory
VCV Véron-Cetty Véron (catalogue of galaxies with active galactic
nuclei)
VERITAS Very energetic radiation imaging telescope array system
(Cherenkov experiment in the US)
VHE Very-high-energy (cosmic rays)
VIRGO Italian-French laser interferometer collaboration at EGO
(experiment in Italy)
VLBA Very long baseline array (of radio telescopes, in the US)
WA# West area # (experiment at CERN, # standing for its number)
WBF Weak boson fusion (electroweak process)
WHIPPLE Cherenkov telescope (in Arizona)
WIMP Weakly interacting massive particle
WMAP Wilkinson microwave anisotropy probe (satellite orbiting the
Earth)
XCOM Photon cross-sections database by NIST
XTR X-ray transition radiation
About the Authors
Chapter 1
Understanding the Universe: Cosmology,
Astrophysics, Particles, and Their
Interactions
The Universe around us, the objects surrounding us, display an enormous diversity. Is this diversity built on small hidden structures? This question started out, as often happens, as a philosophical one, only to become, several thousand years later, a scientific one. In the sixth and fifth centuries BC in India and Greece the atomic concept was proposed: matter was formed by small, invisible, indivisible, and eternal particles: the atoms—a word invented by Leucippus (460 BC) and made popular by his disciple Democritus. In the late eighteenth and early nineteenth century, chemistry finally gave atomism the status of a scientific theory (mass conservation law, Lavoisier 1789; ideal gas laws, Gay-Lussac 1802; law of multiple proportions, Dalton 1805), which was strongly reinforced by the establishment of the periodic table of the elements by Mendeleev in 1869—the chemical properties of an element depend on a “magic” number, its atomic number.
If atoms did exist, their shape and structure were still to be discovered. For Dalton, who lived before the formalization of electromagnetism, atoms had to be able to establish mechanical links with each other. After Maxwell (who formulated the electromagnetic field equations) and J.J. Thomson (who discovered the electron), the binding force was supposed to be the electric one, and in atoms an equal number of positive and negative electric charges had to be accommodated in stable configurations. Several solutions were proposed (Fig. 1.1), from the association of small electric dipoles by Philipp Lenard (1903) to the Saturnian model of Hantaro Nagaoka (1904), where the positive charges were surrounded by the negative ones like the planet Saturn and its rings. In the Anglo-Saxon world the most popular model was, however, the so-called
© Springer-Verlag Italia 2015
A. De Angelis and M.J.M. Pimenta, Introduction to Particle
and Astroparticle Physics, Undergraduate Lecture Notes in Physics,
DOI 10.1007/978-88-470-2688-9_1
Fig. 1.1 Sketch of the atom according to atomic models by several scientists in the early twentieth
century: from left to right, the Lenard model, the Nagaoka model, the Thomson model, and the
Bohr model with the constraints from the Rutherford experiment. Source http://skullsinthestars.com/2008/05/27/the-gallery-of-failed-atomic-models-1903-1913/
plum pudding model of Thomson (1904), where the negative charges, the electrons,
were immersed in a “soup” of positive charges. This model was clearly ruled out by Rutherford, who demonstrated at the beginning of the twentieth century that the positive charge had to be concentrated in a very small nucleus.
Natural radioactivity was the first tool used to investigate the intimate structure of matter; people then needed higher energy particles to access smaller distance scales. These particles came again from natural sources: it was discovered at the beginning of the twentieth century that the Earth is bombarded by very high energy particles coming from extraterrestrial sources. These particles were named “cosmic rays.” A rich and unpredicted spectrum of new particles was discovered. Particle physics, the study of the elementary structure of matter, also called “high energy physics,” was born.
1.1 Particle and Astroparticle Physics
High energy physics is somehow synonymous with fundamental physics. The reason is that, due to Heisenberg’s1 uncertainty principle, the minimum scale of distance Δx we can sample is inversely proportional to the momentum p (which approximately equals the ratio of the energy E to the speed of light c for large energies) of the probe we are using for the investigation itself:

Δx ≃ ℏ/p ,

where ℏ is the reduced Planck2 constant.
1 Werner Heisenberg (1901–1976) was a German theoretical physicist; he was awarded the Nobel Prize in Physics for 1932 “for the creation of quantum mechanics.” He also contributed to the theories of hydrodynamics, ferromagnetism, cosmic rays, and subatomic physics. During World War II he worked on atomic research, and after the end of the war he was arrested, then rehabilitated. He finally organized the Max Planck Institute for Physics, which is now in Munich and named after him.
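As a rough numerical illustration of the relation Δx ≃ ℏ/p (a sketch, not from the book; the conversion constant ℏc ≈ 0.19733 GeV·fm is a standard value):

```python
# Distance scale probed by a particle of momentum p, via Δx ≈ ħ/p.
# Standard conversion constant ħc ≈ 0.19733 GeV·fm (an assumed value, not from the text).
HBAR_C_GEV_FM = 0.19733  # GeV·fm

def probed_distance_fm(p_gev: float) -> float:
    """Return Δx in femtometres for a probe of momentum p (in GeV/c)."""
    return HBAR_C_GEV_FM / p_gev

for p in (0.001, 1.0, 1000.0):
    print(f"p = {p:8g} GeV/c  ->  Δx ≈ {probed_distance_fm(p):.3g} fm")
```

At p ≈ 1 GeV/c the probed scale is about 0.2 fm, i.e., smaller than a nucleon; this is why ever higher energies are needed to explore ever smaller structures.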
1.2 Particles and Fields
The paradigm which is currently accepted by most researchers, and which is at the
basis of the standard model, is that there is a set of elementary particles constituting
matter. From a philosophical point of view, even the very issue of the existence of
2 Max Planck (1858–1947) was the originator of quantum theory, and deeply influenced the human understanding of atomic and subatomic processes. A professor in Berlin, he was awarded the Nobel Prize in 1918 “in recognition of the services he rendered to the advancement of Physics by his discovery of energy quanta.” Politically aligned with the German nationalistic positions during World War I, Planck later opposed Nazism. Planck’s son Erwin was arrested for his part in the attempt to assassinate Hitler, and died at the hands of the Gestapo.
elementary particles is far from being established: the concept of elementarity may
just depend on the energy scale at which matter is investigated—i.e., ultimately, on
the experiment itself. And since we use finite energies, a limit exists to the scale
one can probe. The mathematical description of particles, in the modern quantum
mechanical view, is that of fields, i.e., of complex amplitudes associated to points in
spacetime, to which a local probability can be associated.
Interactions between elementary particles are described by fields representing the
forces; in the quantum theory of fields, these fields can be also seen as particles by
themselves. In classical mechanics fields were just a mathematical abstraction; the real things were the forces. The paradigmatic example was Newton’s3 instantaneous
and universal gravitation law. Later, Maxwell gave to the electromagnetic field the
status of a physical entity: it transports energy and momentum in the form of elec-
tromagnetic waves and propagates at a finite velocity—the speed of light. Then,
Einstein4 explained the photoelectric effect postulating the existence of photons—
the interaction of the electromagnetic waves with free electrons, as discovered by
Compton, was equivalent to elastic collisions between two particles: the photon and
the electron. Finally with quantum mechanics the wave-particle duality was extended
to all “field” and “matter” particles.
Field particles and matter particles have different behaviors. Whereas matter parti-
cles comply with the Pauli5 exclusion principle—only one single particle can occupy
a given quantum state (matter particles obey Fermi-Dirac statistics, and are called
“fermions”)—there is no limit to the number of identical and indistinguishable field
4 Albert Einstein (1879–1955) deeply changed our representation of the Universe, and our concepts of space and time. Although he is best known by the
general public for his theories of relativity and for his mass-energy equivalence formula E = mc2
(the main articles on the special theory of relativity and the E = mc2 articles were published in
1905), he received the 1921 Nobel Prize in Physics “especially for his discovery of the law of
the photoelectric effect” (also published in 1905), which was fundamental for establishing quan-
tum theory. The young Einstein noticed that Newtonian mechanics could not reconcile the laws of
dynamics with the laws of electromagnetism; this led to the development of his special theory of
relativity. He realized, however, that the principle of relativity could also be extended to accelerated
frames of reference when one was including gravitational fields, which led to his general theory
of relativity (1916). Professor in Berlin, he moved to the United States when Adolf Hitler came
to power in 1933, becoming a US citizen in 1940. During World War II, he cooperated with the
Manhattan Project, which led to the atomic bomb. Later, however, he took a position against nuclear
weapons. In the US, Einstein was affiliated to the Institute for Advanced Study in Princeton.
5 Wolfgang Ernst (the famous physicist Ernst Mach was his godfather) Pauli (Vienna, Austria,
1900—Zurich, Switzerland, 1958) was awarded the 1945 Nobel prize in physics “for the discovery of
the exclusion principle, also called the Pauli principle.” He also predicted the existence of neutrinos.
Professor in ETH Zurich and in Princeton, he had a rich exchange of letters with psychologist Carl
Gustav Jung. According to anecdotes, Pauli was a very bad experimentalist, and the ability to break
experimental equipment simply by being in the vicinity was called the “Pauli effect”.
particles that can occupy the same quantum state (field particles obey Bose–Einstein statistics, and are called “bosons”). Lasers (coherent streams of photons) and the electronic structure of atoms are thus justified. The spin of a particle and the statistics it obeys are connected by the spin-statistics theorem: according to this highly nontrivial theorem, demonstrated by Fierz (1939) and Pauli (1940), fermions have half-integer spins, whereas bosons have integer spins.

Fig. 1.2 Presently observed elementary particles. Fermions are listed in the first three columns; gauge bosons are listed in the fourth column. Adapted from MissMJ [CC BY 3.0 (http://creativecommons.org/licenses/by/3.0)], via Wikimedia Commons
At the present energy scales and to our current knowledge, there are 12 elementary “matter” particles; they all have spin 1/2, and hence they are fermions. The 12 “matter particles” currently known can be divided into two big families: 6 leptons (e.g., the electron, of charge −e, and the neutrino, which is neutral), and 6 quarks (a state of 3 bound quarks constitutes a nucleon, like the proton or the neutron). Each big family can be divided into three generations of two particles each; generations have similar properties—but different masses. This is summarized in Fig. 1.2. A good scale for masses is one GeV/c², the proton mass being about 0.938 GeV/c². Quarks have fractional charges with respect to the absolute value of the electron charge, e: +2/3 e for the up, charm, and top quarks, and −1/3 e for the down, strange, and bottom quarks. Quark names are just fantasy names.
The material constituting Earth can be basically explained by only three particles:
the electron, the up quark, and the down quark (the proton being made of two up
6 1 Understanding the Universe: Cosmology, Astrophysics …
quarks and one down, uud, and the neutron by one up and two down, udd). To each
known particle there is an antiparticle (antimatter) counterpart, with the same mass
and opposite charge quantum numbers.
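The charge bookkeeping of the previous paragraphs can be checked in a few lines (a sketch; quark content and charges are as quoted in the text):

```python
from fractions import Fraction

# Quark electric charges in units of e, as given in the text:
# up-type quarks carry +2/3, down-type quarks carry -1/3.
CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}

def hadron_charge(quarks: str) -> Fraction:
    """Total electric charge (in units of e) of a bound state of quarks."""
    return sum((CHARGE[q] for q in quarks), Fraction(0))

print("proton (uud): ", hadron_charge("uud"))   # 1
print("neutron (udd):", hadron_charge("udd"))   # 0
```

Exact rational arithmetic (`fractions.Fraction`) avoids floating-point noise and makes the cancellation of the fractional charges explicit.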
At the current energy scales of the Universe particles interact via four funda-
mental interactions. There are indications that this view is related to the present-day
energy of the Universe: at higher energies—i.e., earlier epochs—some interactions
would “unify” and the picture would become simpler. In fact, theorists think that
these interactions might be the remnant of one single interaction that would occur
at extreme energies—e.g., the energies typical of the beginning of the Universe. In increasing order of strength:
1. The gravitational interaction, acting between whatever pair of bodies and domi-
nant at macroscopic scales.
2. The weak interaction, also affecting all matter particles (with certain selection
rules) and responsible, for instance, for the beta decay and thus for the energy
production in the Sun.
3. The electromagnetic interaction, acting between pairs of electrically charged par-
ticles (i.e., all the matter particles, excluding neutrinos).
4. The color force, acting among quarks. The strong interaction,6 responsible for
binding the atomic nuclei (it ensures electromagnetic repulsion among protons in
nuclei does not break them up) and for the interaction of cosmic protons with the
atmosphere, is just a van der Waals shadow of the very strong interaction between
quarks.
The relative intensity of such interactions spans many orders of magnitude. In a ²H atom, on a scale where the intensity of strong interactions between the nucleons is 1, the intensity of electromagnetic interactions between electrons and the nucleus is 10⁻⁵, the intensity of weak interactions is 10⁻¹³, and the intensity of gravitational interactions between the electron and the nucleus is 10⁻⁴⁵. In the quantum mechanical view of interaction, interaction itself is mediated by quanta of the force field.
6 This kind of interaction was first conjectured and named by Isaac Newton at the end of the
seventeenth century: “There are therefore agents in Nature able to make the particles of bodies stick
together by very strong attractions. And it is the business of experimental philosophy to find them
out. Now the smallest particles of matter may cohere by the strongest attractions, and compose
bigger particles of weaker virtue; and many of these may cohere and compose bigger particles
whose virtue is still weaker, and so on for divers successions, until the progression ends in the
biggest particles on which the operations in chemistry, and the colors of natural bodies depend.” (I.
Newton, Opticks)
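The 45-order-of-magnitude hierarchy quoted above can be tabulated in a few lines (the numbers are copied from the text; the normalization to the strong interaction is the one used there):

```python
import math

# Relative interaction intensities in a 2H atom, as quoted in the text
# (strong interaction between the nucleons normalized to 1).
STRENGTHS = {
    "strong": 1.0,
    "electromagnetic": 1e-5,
    "weak": 1e-13,
    "gravitational": 1e-45,
}

for name, s in STRENGTHS.items():
    print(f"{name:16s} 10^{math.log10(s):+.0f}")
```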
1.3 A Quick Look at the Universe
The origin and destiny of the Universe is, for most researchers, the fundamental question. Many answers were provided over the ages, a few of them built on scientific observations and reasoning. Over the last century enormous theoretical and experimental breakthroughs have occurred: less than a century ago, people believed that the Milky Way, our own galaxy, was the only galaxy in the Universe; now we know that there are some 10¹¹ galaxies within the observable Universe, each containing some 10¹¹ stars. Most of them are so far away that we cannot even hope to explore them.
Let us start an imaginary trip across the Universe from the Earth. The Earth, which
has a radius of about 6400 km, is one of the planets orbiting around the Sun. The
latter is a star with a mass of about 2 × 10³⁰ kg located at a distance from us of about
150 million km (i.e., 500 light seconds). We call the average Earth-Sun distance the
astronomical unit, AU. The ensemble of planets orbiting the Sun is called the Solar
System. Out to the aphelion of the orbit of the farthest known planet, Neptune, the Solar System has a diameter of 9 billion km (about 10 light hours, or 60 AU).
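A quick consistency check of these scales in light-travel time (a sketch; the speed of light and the AU are standard values, not taken from the book):

```python
# Light-travel times for the solar-system scales quoted in the text.
C_KM_S = 299792.458   # speed of light, km/s
AU_KM = 1.495979e8    # astronomical unit, km (standard value)

print(f"Sun-Earth: {AU_KM / C_KM_S:.0f} light seconds")

diameter_km = 60 * AU_KM  # the quoted ~9 billion km Solar System diameter
print(f"Solar System: {diameter_km / C_KM_S / 3600:.1f} light hours")
```

The Sun–Earth distance indeed comes out at about 500 light seconds, and 60 AU corresponds to somewhat more than 8 light hours, consistent with the order of magnitude quoted in the text.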
The Milky Way (Fig. 1.3) is the galaxy that contains our Solar System. Its name
“milky” is derived from its appearance as a dim glowing band arching across the night
sky in which the naked eye cannot distinguish individual stars. The ancient Romans named it “via lactea,” which literally corresponds to the present name (lac being the Latin word for milk); the term “galaxy,” too, descends from a Greek word indicating
milk. Seen from Earth with the unaided eye, the Milky Way appears as a band because
its disk-shaped structure is viewed edge-on from the periphery of the Galaxy itself.
Galilei7 first resolved such band of light into individual stars with his telescope, in
1610.
The Milky Way is a spiral galaxy some 100 000 light-years (ly) across. The Solar
System is located within the disk, about 30 000 light-years away from the Galactic
Center, in the so-called Orion arm. The stars in the inner 10 000 light-years form
a bulge and one or more bars that radiate from the bulge. The very center of the
Galaxy, in the constellation of Sagittarius, hosts a supermassive black hole of some
4 million solar masses (this value has been determined precisely by studying the
orbits of nearby stars).
The Milky Way is a relatively large galaxy. Teaming up with a similar-sized partner
(called the Andromeda galaxy), it has gravitationally trapped many smaller galaxies:
together, they all constitute the so-called Local Group. The Local Group comprises
more than 50 galaxies, including numerous dwarf galaxies—some are just spherical
collections of hundreds of stars that are called globular clusters. Its gravitational
center is located somewhere between the Milky Way and the Andromeda galaxies.
The Local Group covers a diameter of 10 million light-years, or 10 Mly (i.e., 3.1 megaparsec,8 Mpc); it has a total mass of about 10¹² solar masses.
Galaxies are not uniformly distributed; most of them are arranged into groups
(containing some dozens of galaxies) and clusters (up to several thousand galaxies);
groups and clusters and additional isolated galaxies form even larger structures called
superclusters that may span up to 100 Mly.
This is how far our observations can go.
In 1929 the American astronomer Edwin Hubble, studying the emission of radiation from galaxies, compared their speed (calculated from the Doppler shift of their emission lines) with their distance (Fig. 1.4), and discovered that objects in the Universe move away from us with velocity
7 Galileo Galilei (1564–1642) was an Italian physicist, mathematician, astronomer, and philosopher
who deeply influenced scientific thoughts down to the present days. He first formulated some of the
fundamental laws of mechanics, like the principle of inertia and the law of accelerated motion; he
formally proposed, with some influence from previous works by Giordano Bruno, the principle of
relativity. Galilei was professor in Padua, nominated by the Republic of Venezia, and astronomer
in Firenze. He built the first practical telescope (using lenses) and using this instrument he could
perform astronomical observations which supported Copernicanism; in particular he discovered the
phases of Venus, the four largest satellites of Jupiter (named the Galilean moons in his honor), and
he observed and analyzed sunspots. Galilei also made major discoveries in military science and
technology. He came into conflict with the Catholic Church for his support of Copernican theories. In 1616 the Inquisition declared heliocentrism to be heretical, and Galilei was ordered to refrain from teaching heliocentric ideas; in the same year Galilei argued that tides were additional evidence for the motion of the Earth. In 1633 the Roman Inquisition found Galilei suspect of heresy, sentencing him to indefinite imprisonment; he was kept under house arrest in Arcetri near Firenze until his death.
8 The parsec (symbol: pc, and meaning “parallax of one arcsecond”) is often used in astronomy to
measure distances to objects outside the Solar System. It is defined as the length of the longer leg
of a right triangle, whose shorter leg corresponds to one astronomical unit, and the subtended angle
of the vertex opposite to that leg is one arcsecond. It corresponds to approximately 3 × 10¹⁶ m, or about 3.26 light-years. Proxima Centauri, the nearest star, is about 1.3 pc from the Sun.
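The definition in this footnote can be verified numerically (a sketch; the AU and light-year values are standard constants, not from the book):

```python
import math

# 1 parsec = the longer leg of a right triangle whose shorter leg is 1 AU
# and whose opposite vertex angle is 1 arcsecond.
AU_M = 1.495978707e11            # astronomical unit, metres (standard value)
LY_M = 9.4607e15                 # light-year, metres (standard value)
arcsec = math.radians(1 / 3600)  # one arcsecond, in radians

pc_m = AU_M / math.tan(arcsec)
print(f"1 pc ≈ {pc_m:.4e} m ≈ {pc_m / LY_M:.2f} ly")
```

For such a tiny angle, `tan(arcsec)` is indistinguishable from `arcsec` itself, so the small-angle approximation pc ≈ AU/θ gives the same answer.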
10 1 Understanding the Universe: Cosmology, Astrophysics …
Fig. 1.4 Redshift of emission spectrum of stars and galaxies at different distances. A star in our
galaxy is shown at the bottom left with its spectrum on the bottom right. The spectrum shows the
dark absorption lines, which can be used to identify the chemical elements involved. The other three
spectra and pictures from bottom to top show a nearby galaxy, a medium distance galaxy, and a
distant galaxy. Using the redshift we can calculate the relative radial velocity between these objects
and the Earth. From http://www.indiana.edu/~geol105
v = H0 d , (1.1)
where d is the distance between the objects, and H0 is a parameter called the Hubble constant (whose value is known today to be about 68 km s⁻¹ Mpc⁻¹, i.e., 21 km s⁻¹ Mly⁻¹). The above relation is called Hubble’s law (Fig. 1.5). Note that at that time galaxies beyond the Milky Way had just been discovered.
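Hubble's law and the quoted value of H0 translate into numbers as follows (a sketch; the unit-conversion constants are standard values, not from the book):

```python
# v = H0 * d, with H0 = 68 km/s/Mpc as quoted in the text.
H0 = 68.0                 # km s^-1 Mpc^-1
KM_PER_MPC = 3.0857e19    # kilometres in one megaparsec (standard value)
SEC_PER_YEAR = 3.156e7    # seconds in one year

for d_mpc in (1.0, 10.0, 100.0):
    print(f"d = {d_mpc:6.1f} Mpc  ->  v = {H0 * d_mpc:7.0f} km/s")

# 1/H0 has the dimension of a time: the naive expansion age of the Universe.
hubble_time_yr = KM_PER_MPC / H0 / SEC_PER_YEAR
print(f"1/H0 ≈ {hubble_time_yr:.1e} years")
```

The inverse of H0 comes out at about 1.4 × 10¹⁰ years, which is the "about 14 billion years" quoted below.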
The Hubble law means that sources at cosmological distances (where local
motions, often resulting from galaxies being in gravitationally bound states, are
negligible) are observed to move away at speeds that are proportionally higher for
larger distances. The Hubble constant describes the rate of increase of recession
velocities for increasing distance. The Doppler redshift

z = λ′/λ − 1 ,

where λ is the wavelength at emission and λ′ is the observed one, can thus also be used as a metric of the distance of objects. To give an idea of what H0 means, the speed of revolution of the Earth around the Sun is about 30 km/s. Andromeda, the large galaxy closest to the Milky Way, is at a distance of about 2.5 Mly from us; however, we and Andromeda are actually approaching each other: this is an example of the effect of local motions.
Dimensionally, we note that H0 is the inverse of a time: H0 ≃ (14 × 10^9 years)^−1.
A simple interpretation of the Hubble law is that, if the Universe had always been
expanding at a constant rate, about 14 billion years ago its volume was zero—naively,
1.3 A Quick Look at the Universe 11
Fig. 1.5 Experimental plot of the relative velocity (in km/s) of known astrophysical objects as a
function of distance from Earth (in Mpc). Several methods are used to determine the distances.
Distances up to hundreds of parsecs are measured using stellar parallax (i.e., the difference between
the angular positions from the Earth with a time delay of 6 months). Distances up to 50 Mpc are
measured using Cepheids, periodically pulsating stars for which the luminosity is related to the
pulsation period (the distance can thus be inferred by comparing the intrinsic luminosity with the
apparent luminosity). Finally, distances from 1 to 1000 Mpc can be measured with another type of
standard candle, Type Ia supernovae, a class of remnants of imploded stars. In between (from 15 to
200 Mpc), the Tully-Fisher relation, an empirical relationship between the intrinsic luminosity of
a spiral galaxy and the width of its emission lines (a measure of its rotation velocity), can be used.
The methods, having large superposition regions, can be cross-calibrated. The line is a Hubble-law
fit to the data. From A. G. Riess, W. H. Press and R. P. Kirshner, Astrophys. J. 473 (1996) 88
we can think that it exploded through a quantum singularity, such an explosion being
usually called the “Big Bang.” This age is consistent with present estimates of the
age of the Universe within gravitational theories, which we shall discuss later in this
book, and slightly larger than the age of the oldest stars, which can be measured from
the presence of heavy nuclei. The picture looks consistent.
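The naive age estimate quoted above follows directly by inverting H0; a minimal sketch (the Mpc-to-meter and seconds-per-year conversions are standard constants, not from the text):

```python
# Naive "age of the Universe" from Hubble's law: with constant expansion, t = 1/H0.
H0_km_s_Mpc = 68.0           # Hubble constant from the text, km/s/Mpc
Mpc_m = 3.0857e22            # one megaparsec in meters
s_per_year = 3.156e7

H0_SI = H0_km_s_Mpc * 1e3 / Mpc_m      # H0 in 1/s
t_hubble = 1.0 / H0_SI / s_per_year    # Hubble time in years
print(f"1/H0 = {t_hubble:.2e} years")  # ~1.4e10, i.e., about 14 billion years
```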
The adiabatic expansion of the Universe entails a freezing with expansion time t,
which in the nowadays quiet Universe can be summarized as a law for the evolution
of the temperature T ,
T ≃ 30000 K / √(t (years)) .
The present temperature is slightly less than 3 K, and can be measured from the
spectrum of the blackbody (microwave) radiation (the so-called cosmic microwave
background, or CMB, permeating the Universe). The formula implies also that study-
ing the ancient Universe in some sense means exploring the high energy world:
subatomic physics and astrophysics are naturally connected.
At epochs corresponding to fractions of a second after the Big Bang, tiny quantum
fluctuations in the distribution of cosmic matter led to galaxy formation. Density
fluctuations grew with time into proto-structures which, after accreting enough mass
from their surroundings, overcame the pull of the expanding universe and after the
end of an initial era dominated by radiation collapsed into bound, stable structures.
The average density of such structures was reminiscent of the average density of
the Universe when they broke away from the Hubble expansion: so, earlier-forming
structures have a higher mean density than later-forming structures. Proto-galaxies
were initially dark. Only later, when enough gas had fallen into their potential well,
stars started to form—again, by gravitational instability in the gas—and shine due to
the nuclear fusion processes activated by the high temperatures caused by gravitational
forces. The big picture of the process of galaxy formation is probably understood
by now, but the details are not. The morphological difference between disk (i.e.,
spiral) galaxies and spheroidal (i.e., elliptical) galaxies is interpreted as due to
the competition between the characteristic timescale of the infall of gas into the
protogalaxy’s gravitational well and the timescale of star formation: if the latter is
shorter than the former, a spheroidal (i.e., three-dimensional) galaxy likely forms;
if it is longer, a disk (i.e., two-dimensional) galaxy forms. A disk galaxy is rotation
supported, whereas a spheroidal galaxy is pressure supported—stars behaving in this
case like gas molecules. It is conjectured that the velocity dispersion (∼200 km/s)
among proto-galaxies in the early universe may have triggered the rotational motion of
disk galaxies, random in direction from galaxy to galaxy but ordered within each individual galaxy.
Stars also formed by gravitational instabilities of the gas. For given conditions
of density and temperature, gas (mostly hydrogen and helium) clouds collapse and,
if their mass is suitable, eventually form stars. Stellar masses are limited by the
conditions that (i) nuclear reactions can switch on in the stellar core (>0.1 solar
masses), and (ii) the radiation drag of the produced luminosity on the plasma does
not disrupt the star’s structure (<100 solar masses). For a star of the mass of the Sun,
formation takes 50 million years—the total lifetime is about 11 billion years before
collapsing to a “white dwarf,” and in the case of our Sun some 4.5 billion years are
already gone.
Stars span a wide range of luminosities and colors, and can be classified according
to these characteristics. The smallest stars, known as red dwarfs, may contain as little
as 10 % of the mass of the Sun and emit only 0.01 % as much energy, having typical
surface temperatures of 3000 K. Red dwarfs are by far the most numerous stars in the
Universe and have lifetimes of tens of billions of years, much larger than the age of
the Universe. On the other hand, the most massive stars, known as hypergiants, may
be 100 or more times more massive than the Sun, and have surface temperatures of
more than 30 000 K. Hypergiants emit hundreds of thousands of times more energy
than the Sun, but have lifetimes of only a few million years. They are thus extremely
rare today and the Milky Way galaxy contains only a handful of them.
Luminosity and temperature of a star are in general linked. In the temperature-
luminosity plane, stars populate a locus that can be described (in log scale) as a
straight line: this is called the main sequence of stars. Our Sun is also found there—
corresponding to very average temperature and luminosity.
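The link between mass and lifetime sketched in the preceding paragraphs can be made semi-quantitative with a toy scaling: fuel is proportional to M, burn rate to L, and an assumed empirical mass-luminosity relation L ∝ M^3.5 (the exponent and the 10 Gyr solar lifetime are assumptions of this sketch, not statements of the text). It reproduces the trend, long-lived red dwarfs and short-lived massive stars, only qualitatively:

```python
# Toy main-sequence lifetime estimate: tau ∝ M / L with L ∝ M^3.5 (assumed).
def lifetime_gyr(mass_solar, tau_sun_gyr=10.0, alpha=3.5):
    """Lifetime in Gyr, scaled to ~10 Gyr for one solar mass."""
    return tau_sun_gyr * mass_solar ** (1.0 - alpha)

for m in (0.1, 1.0, 100.0):
    print(f"M = {m:6.1f} M_sun -> ~{lifetime_gyr(m):.3g} Gyr")
```

The crude power law overshoots at both extremes, but it makes clear why hypergiants burn out in a few million years while red dwarfs outlive the present age of the Universe.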
Fig. 1.6 Binding energy per nucleon for stable atoms. Iron (⁵⁶Fe) is the stable element for which the
binding energy per nucleon is the largest (about 8.8 MeV); it is thus the natural endpoint of processes
of fusion of lighter elements, and of fission of heavier elements (although ⁵⁸Fe and ⁶²Ni have a
slightly higher binding energy per nucleon, by less than 0.05 %, they are subject to nuclear photodisintegration).
From http://hyperphysics.phy-astr.gsu.edu/
The fate of a star depends on its mass. The heavier the star, the larger its
gravitational energy, and the more effective the nuclear processes powering it. In
average stars like the Sun, the outer layers are supported against gravity until the
stellar core stops producing fusion energy; then the star collapses into a “white
dwarf”—an Earth-sized object. Main-sequence stars with mass over 8 solar masses
die in a very energetic explosion called a supernova. In a supernova, the star's core,
made of iron (which, being the most stable nucleus, i.e., the one whose binding energy per
nucleon is maximum, is the endpoint of nuclear fusion processes, Fig. 1.6) collapses
and the released gravitational energy goes into heating the overlying mass layers
which, in an attempt to dissipate the sudden excess heat by increasing the star’s
radiating surface, expand at high speed (10 000 km/s and more) to the point that the
star gets quickly disrupted—i.e., explodes. Supernovae release an enormous amount
of energy, about 10^46 J—mostly in neutrinos from the nuclear processes occurring in
the core, and just 1 % in kinetic energy of the ejecta—in a few tens of seconds.9 For
a period of days to weeks, a supernova may outshine its entire host galaxy. Since the
energy of the explosion is large enough to generate hadronic interactions, basically any
element and many subatomic particles are produced in these explosions. On average,
in a typical galaxy (e.g., the Milky Way) supernova explosions occur just once or
twice per century. Supernovae leave behind neutron stars or black holes.
9 Note that astrophysicists frequently use as a unit of energy the old "cgs" (centimeter-gram-second) unit, the erg: 1 erg = 10^−7 J.
The heavier the star, the more effective the fusion processes, and the shorter the lifetime.
We need a star like our Sun, with a lifetime of about ten billion years, both to
give life enough time to develop and to guarantee large enough temperatures for
humans. The Solar System is estimated to be some 4.6 billion years old and to have
started from a molecular cloud. Most of the collapsing mass collected in the center,
forming the Sun, while the rest flattened into a disk out of which the planets formed.
The Sun is too young to have created heavy elements in such abundance as to account for
carbon-based life on Earth. The carbon, nitrogen, and oxygen atoms in our bodies,
as well as atoms of all other heavy elements, were created in previous generations
of stars somewhere in the Universe.
The study of stellar motions in galaxies indicates the presence of a large amount of
unseen mass in the Universe. This mass seems to be of a kind presently unknown to
us; it neither emits nor absorbs electromagnetic radiation (including visible light) at
any significant level. We call it dark matter: its abundance in the Universe amounts to
an order of magnitude more than the conventional matter we are made of. Dark matter
represents one of the greatest current mysteries of astroparticle physics. Indications
exist also of a further form of energy, which we call dark energy. Dark energy
contributes to the total energy budget of the Universe three times more than dark
matter.
In summary, we live in a world that is mostly unknown even from the point of view
of the nature of its main constituents (Fig. 1.7). The evolution of the Universe and our
everyday life depend on this unknown external world. First of all, the ultimate destiny
of the Universe—a perpetual expansion or a recollapse—depends on the amount of
all the matter in the Universe. Moreover, every second high-energy particles (i.e.,
above 1 GeV) of extraterrestrial origin pass through each square centimeter on the
Earth, and they are messengers from regions where highly energetic phenomena take
place that we cannot directly explore. These are the so-called cosmic rays, discovered
at the beginning of the twentieth century (see Chap. 3). It is natural to try to use these
messengers in order to obtain information on the highest energy events occurring in
the Universe.
1.4 Cosmic Rays 15
The distribution in energy (the so-called spectrum) of cosmic rays is quite well
described by a power law E^−p, with p a positive number (Fig. 1.8). The spectral
index p is around 3 on average. After the low energy region dominated by cosmic
rays from the Sun (the solar wind), the spectrum becomes steeper for energy values
of less than ∼1000 TeV (150 times the maximum energy foreseen for the beams of
the LHC collider at CERN): this is the energy region that we know to be dominated
Fig. 1.8 The energy spectrum (number of incident particles per unit of energy, per second, per unit
area, and per unit of solid angle) of the primary cosmic rays. The vertical band on the left indicates
the energy region in which the emission from the Sun is supposed to be dominant; the central band
the region in which most of the emission is presumably of galactic origin; the band on the right the
region of extragalactic origin. By Sven Lafebre (own work) [GFDL http://www.gnu.org/copyleft/
fdl.html], via Wikimedia Commons.
by cosmic rays produced by astrophysical sources in our galaxy, the Milky Way. For
higher energies a further steepening occurs, the point at which this change of slope
takes place being called the “knee”; we believe that the region above this energy is
dominated by cosmic rays produced by extragalactic sources, mostly supermassive
black holes growing at the centers of other galaxies. For even higher energies (more
than one million TeV) the cosmic-ray spectrum becomes less steep, resulting in
another change of slope, called the “ankle.” About possible sources at such large
energies, we have no clues. Finally, at the highest energies in the figure a drastic
suppression is present—as expected from the interaction of long-traveling particles
with the cosmic microwave background, remnant of the origin of the Universe.10
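Since the spectrum falls as E^−p with p ≈ 3, the event rate above a threshold drops dramatically with energy: the integral flux above E scales as E^(1−p). A minimal sketch of this scaling:

```python
# Integral flux above a threshold for a power-law spectrum dN/dE ∝ E^-p.
p = 3.0   # spectral index quoted in the text (average value)

def relative_rate(E_threshold, E_ref=1.0):
    """Rate above E_threshold relative to the rate above E_ref."""
    return (E_threshold / E_ref) ** (1.0 - p)

# Raising the energy threshold by a factor 1000 (e.g., 1 TeV -> 1000 TeV)
# suppresses the rate by a factor of one million:
print(relative_rate(1000.0))
```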
The majority of high-energy particles in cosmic rays are protons (hydrogen
nuclei); about 10 % are helium nuclei (nuclear physicists usually call them alpha par-
ticles), and 1 % are neutrons or nuclei of heavier elements. Together, these account
for 99 % of the cosmic rays, and electrons and photons make up the remaining 1 %.
The number of neutrinos is estimated to be comparable to that of high-energy pho-
tons, but it is very high at low energy because of the nuclear processes that occur in
the Sun: such processes involve a large production of neutrinos.
Cosmic rays hitting the atmosphere (called primary cosmic rays) generally pro-
duce secondary particles that can reach the Earth’s surface, through multiplicative
showers.
About once per minute, a single subatomic particle enters the Earth’s atmosphere
with an energy larger than 10 J. Somewhere in the Universe there are accelerators
that can impart to single protons energies 100 million times larger than the energy
reached by the most powerful accelerators on Earth. It is thought that the ultimate
engine of the acceleration of cosmic rays is gravity. In gigantic gravitational collapses,
such as those occurring in supernovae (stars imploding at the end of their lives, see
Fig. 1.9, left) and in the accretion of supermassive black holes (equivalent to millions
to billions of solar masses) at the expense of the surrounding matter (Fig. 1.9, right),
part of the potential gravitational energy is transformed, through not fully understood
mechanisms, into kinetic energy of the particles.
The reason why the maximum energy attained by human-made accelerators can-
not compete with the still mysterious cosmic accelerators with the presently known
acceleration technologies is simple. The most efficient way to accelerate particles
requires their confinement within a radius R by a magnetic field B, and the final
energy is proportional to the product R × B. On Earth, it is difficult to imagine rea-
10 A theoretical upper limit on the energy of cosmic rays from distant sources was computed in 1966
by Greisen, Kuzmin, and Zatsepin, and it is called today the GZK cutoff. Protons with energies above
a threshold of about 10^20 eV suffer a resonant interaction with the cosmic microwave background
photons to produce pions through the formation of a short-lived particle (resonance) called Δ:
p + γ → Δ → N + π. This continues until their energy falls below the production threshold.
Because of the mean path associated with the interaction, extragalactic cosmic rays from distances
larger than 50 Mpc from the Earth and with energies greater than this threshold energy should be
strongly suppressed on Earth, and there are no known sources within this distance that could produce
them. A similar effect (nuclear photodisintegration) limits the mean free path for the propagation
of nuclei heavier than the proton.
Fig. 1.9 Left The remnant of the supernova in the Crab region (Crab nebula), a powerful gamma
emitter in our galaxy. The supernova exploded in 1054 and the phenomenon was recorded by
Chinese astronomers. For 40 years, until 2010, most astronomers regarded the Crab as a standard
candle for high energy photon emission, but recently it was discovered that the Crab Nebula
flickers from time to time. Nevertheless, most plots of sensitivity of detectors refer to a "standard Crab" as
a reference unit. The vortex around the center is visible; a neutron star rapidly rotating (with a
period of around 30 ms) and emitting pulsed gamma-ray streams (pulsar) powers the system. Some
supernova remnants, seen by Earth, have an apparent dimension of a few tenths of degree—about
the dimension of the Moon. Right A supermassive black hole accretes swallowing neighboring
stellar bodies and molecular clouds, and emits jets of charged particles and gamma rays. Credits:
NASA
sonable confinement radii greater than one hundred kilometers, and magnetic fields
stronger than 10 tesla (i.e., one hundred thousand times the Earth’s magnetic field).
This combination can provide energies of a few tens of TeV, such as those of the
LHC accelerator at CERN. In nature accelerators with much larger radii exist, such
as supernova remnants (light-years) and active galactic nuclei (tens of thousands of
light-years). Of course human-made accelerators have important advantages, such as the
flux and the possibility of knowing the initial conditions (cosmic-ray researchers do
not know a priori the initial conditions of their phenomena).
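The R × B scaling can be made quantitative with the standard magnetic-rigidity relation E ≈ 0.3 z B R (E in GeV, B in tesla, R in meters), a textbook formula not spelled out in this excerpt; the LHC numbers below (B ≈ 8.3 T, bending radius ≈ 2.8 km) are illustrative assumptions:

```python
# Maximum energy reachable by magnetic confinement of an ultrarelativistic
# particle of charge z: E ≈ 0.3 * z * B[T] * R[m], in GeV.
def e_max_gev(B_tesla, R_m, z=1):
    return 0.3 * z * B_tesla * R_m

# LHC-like parameters: ~8.3 T dipoles, ~2.8 km bending radius.
print(f"LHC-like: {e_max_gev(8.3, 2.8e3) / 1e3:.1f} TeV")   # ~7 TeV
```

Doubling the charge z, the field, or the radius doubles the attainable energy, which is why cosmic accelerators with light-year-scale radii win so easily over terrestrial ones.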
Among cosmic rays, photons are particularly important. The gamma photons
(called gamma rays for historical reasons) are photons of very high energy, and
occupy the most energetic part of the light spectrum; being neutral they can travel
long distances without being deflected by galactic and extragalactic magnetic fields,
hence they allow us to directly study their emission sources. These facts are now
pushing us to study in particular the high-energy gamma rays and cosmic rays of
hundreds of millions of TeV. However, gamma rays are less numerous than charged
cosmic rays of the same energy, and the energy spectrum of charged cosmic rays
is such that particles of hundreds of millions of TeV are very rare. The task of
Fig. 1.10 Map of the emitters of photons above 100 GeV in the Universe, in galactic coordinates
(from the TeVCAT catalog). The sources are indicated as circles—the colors represent different
kinds of emitters which will be explained in Chap. 10. From http://tevcat.uchicago.edu/
11 Usually the planar representations of maps of the Universe are done in galactic coordinates. To
understand what this means, let us start from a celestial coordinate system in spherical coordinates,
in which the Sun is at the center, the primary direction is the one joining the Sun with the center
of the Milky Way, and the galactic plane is the fundamental plane. Coordinates are positive toward
North and East in the fundamental plane.
We define as galactic longitude (l or λ) the angle between the projection of the object in the
galactic plane and the primary direction. Latitude (symbol b or φ) is the angular distance between
the object and the galactic plane. For example, the North galactic pole has a latitude of +90°.
Plots in galactic coordinates are then projected onto a plane, typically using an elliptical (Mollweide)
projection preserving areas. This projection transforms latitude and longitude to plane coordinates
x and y via the equations (angles are expressed in radians):

x = (2√2/π) R l cos θ

y = √2 R sin θ ,

where θ is an auxiliary angle defined by 2θ + sin 2θ = π sin b, and R is the radius of the sphere to
be projected. The map has area 4πR², obviously equal to the surface area of the generating globe.
The x-coordinate has a range [−2R√2, 2R√2], and the y-coordinate has a range [−R√2, R√2].
The Galactic center is located at (0,0).
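A sketch of the projection in Python, solving for the auxiliary Mollweide angle θ (defined by 2θ + sin 2θ = π sin b) with Newton's method:

```python
import math

# Mollweide projection of galactic coordinates (l, b) onto the plane (x, y).
def mollweide(l_rad, b_rad, R=1.0):
    theta = b_rad                               # starting guess
    for _ in range(50):                         # Newton iteration
        f = 2 * theta + math.sin(2 * theta) - math.pi * math.sin(b_rad)
        df = 2 + 2 * math.cos(2 * theta)
        if abs(df) < 1e-12:                     # at the poles theta = ±π/2 exactly
            break
        theta -= f / df
    x = (2 * math.sqrt(2) / math.pi) * R * l_rad * math.cos(theta)
    y = math.sqrt(2) * R * math.sin(theta)
    return x, y

print(mollweide(0.0, 0.0))                      # Galactic center -> (0.0, 0.0)
print(mollweide(math.pi, 0.0))                  # map edge: x = 2R√2
```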
Further Reading
Exercises
1. The Universe. Find a dark place close to where you live, and go there at night.
Try to locate the Milky Way and the Galactic center. Comment on your success
(or failure).
2. Telescopes. Research the differences between Newtonian and Galilean telescopes, and discuss them.
3. Size of an atom. Explain how you could estimate the order of magnitude of the size
of an atom using a drop of oil. Perform the experiment and check the result.
4. Thomson atom. Consider the Thomson atom model applied to a helium atom (the
two electrons are in equilibrium inside a homogeneous positive charged sphere
of radius r ∼ 10−10 m).
(a) Determine the distance of the electrons to the center of the sphere.
(b) Determine the vibration frequency of the electrons in this model and compare
it to the first line of the spectrum of hydrogen.
5. Atom as a box. Consider a simplified model where the hydrogen atom is described
by a one dimensional box of length r with the proton at its center and where the
electron is free to move around. Compute, taking into account the Heisenberg
uncertainty principle, the total energy of the electron as a function of r and deter-
mine the value of r for which this energy is minimized.
6. Cosmic ray fluxes and wavelength. The most energetic particles ever observed
at Earth are cosmic rays. Estimate the number of such events with
an energy between 3 × 10^18 and 10^19 eV that may be detected in one year by an
experiment with a footprint of 1000 km^2. Evaluate the structure scale that can be
probed by such particles.
7. Energy from cosmic rays: Nikola Tesla’s “free” energy generator. “This new
power for the driving of the world’s machinery will be derived from the energy
which operates the universe, the cosmic energy, whose central source for the Earth
is the Sun and which is everywhere present in unlimited quantities.” Immediately
This chapter introduces the basics of the techniques for the study
of the intimate structure of matter, described in a historical
context. After reading this chapter, you should understand the
basic tools which lead to the investigation and the description of
the subatomic structure, and you should be able to compute the
interaction probabilities of particles. A short reminder of the
concepts of special relativity needed to understand astroparticle
physics is also provided.
In the second half of the nineteenth century, the work by Mendeleev1 on the periodic
table of the elements provided the paradigm that paved the way to the experimental
demonstration of the atomic structure. The periodic table is an arrangement of the
chemical elements. Mendeleev realized that the physical and chemical properties of
elements are related to their atomic mass in a quasi-periodic way. He ordered the 63
elements known at his time according to their atomic mass and arranged them in a
table so that elements with similar properties would be in the same column. Figure 2.1
shows this arrangement. Hydrogen, the lightest element, is isolated in the first row
of the table. The following light elements are then arranged in octets. Mendeleev
found some gaps in his table, and predicted that elements then unknown would be
discovered which would fill these gaps: his predictions were successful.
Mendeleev’s periodic table has been expanded and refined with the discovery of
new elements and a better theoretical understanding of chemistry. The most important
1 Dmitri Mendeleev (1834–1907) was a Russian chemist born in Tobolsk, Siberia. He studied
science in St. Petersburg, where he graduated in 1856 and became full professor in 1863. Mendeleev
is best known for his work on the periodic table, published in Principles of Chemistry in 1869, but
also, according to a myth popular in Russia, for establishing that the minimum alcoholic fraction
of vodka should be 40 %—this requirement was easy to verify, as this is the minimum content at
which an alcoholic solution can be ignited at room temperature.
© Springer-Verlag Italia 2015 21
A. De Angelis and M.J.M. Pimenta, Introduction to Particle
and Astroparticle Physics, Undergraduate Lecture Notes in Physics,
DOI 10.1007/978-88-470-2688-9_2
22 2 The Birth and the Basics of Particle Physics
Fig. 2.1 Mendeleev’s periodic table as published in Annalen der Chemie 1872 [public domain].
The noble gases had not yet been discovered, and are thus not displayed
modification was the use of atomic number (the number of electrons, which indeed
characterizes an element) instead of atomic mass to order the elements. Since atoms
are neutral, the same number of positive charges (protons) should be present. Starting
from the element with atomic number 3, it had been conjectured by Mendeleev that
electrons are arranged in shells. The n-th shell is complete with 2n^2 electrons and
the external shell alone dictates the chemical properties of an element. As we know,
the quantum mechanical view is more complete but not as simple.
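The 2n^2 rule quoted above can be checked in one line; a trivial sketch:

```python
# Capacity of the n-th electron shell according to the 2n^2 rule.
capacities = [2 * n * n for n in range(1, 5)]
print(capacities)   # [2, 8, 18, 32]
```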
The present form of the periodic table (Appendix A) is a grid of elements with
18 columns and 7 rows, with an additional double row of elements. The rows are
called periods; the columns, which define the chemical properties, are called groups;
examples of groups are “halogens” and “noble gases.”
Thanks to Mendeleev’s table, a solid conjecture was formulated that atoms are
composite states including protons and electrons, electrons being loosely bound. But
how to understand experimentally the inner structure of the atom, i.e., how were
protons and electrons arranged inside the atom? Were electrons “orbiting” around a
positive nucleus, or were both protons and electrons embedded in a “plum pudding,”
with electrons (the “plums”) more loosely bound? A technique invented around 1900
to answer this question pioneered the full history of particle physics.
Collide a beam of particles with a target, observe what comes out, and try to infer
the properties of the interacting objects and/or of the relevant interaction force. This
is the paradigm of particle physics experiments. The first experiment was conducted
2.2 The Rutherford Experiment 23
Fig. 2.2 Left Sketch of the Rutherford experiment (by Kurzon [own work, CC BY-SA 3.0], via
Wikimedia Commons). Right Trajectories of the α particles
where ε0 is the vacuum dielectric constant, Q1 and Q2 are the charges of the beam
particle and of the target particle, and E0 is the kinetic energy of the beam particle.
2 Ernest Rutherford (1871–1937) was a New Zealand-born physicist. In early works at McGill
University in Canada, he proved that radioactivity involved the transmutation of one chemical
element into another; he differentiated and named the α (helium nuclei) and β (electrons) radiations.
In 1907, Rutherford moved to Manchester, UK, where he discovered (and named) the proton. In 1908
he won the Nobel Prize in Chemistry “for his investigations into the disintegration of the elements,
and the chemistry of radioactive substances.” He became director of the Cavendish Laboratory at
Cambridge University in 1919. Under his leadership, the neutron was discovered by James Chadwick
in 1932. Also in 1932, his students John Cockcroft and Ernest Walton split for the first time the
atom with a beam of particles. Rutherford was buried near Newton in Westminster Abbey, London.
The chemical element rutherfordium—atomic number 104—was named after him in 1997.
If the number of beam particles per unit of transverse area n beam is not a function
of the transverse coordinates b and φ (the beam is uniform and wide with respect to
the target size), the differential number of particles as a function of b is
dN/db = 2π b n_beam . (2.2)
the well-known Rutherford formula. This equation explained the observation of scattering
at large angles and became the paradigm for the scattering of particles off nuclei.
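The Rutherford formula itself is not written out in this excerpt; a standard form for point charges is dσ/dΩ = (Z1 Z2 α ℏc / 4E0)^2 / sin^4(θ/2). The sketch below assumes this form, with an illustrative 5.5 MeV α beam on gold (the beam energy and target are assumptions, not from the text):

```python
import math

# Rutherford differential cross-section, natural units, point charges:
# dσ/dΩ = (Z1 Z2 α ħc / 4E0)^2 / sin^4(θ/2).
ALPHA = 1.0 / 137.036        # fine-structure constant
HBARC = 197.327              # ħc in MeV·fm

def rutherford(theta_rad, E0_mev, Z1=2, Z2=79):
    """dσ/dΩ in fm^2/sr for a beam of kinetic energy E0_mev."""
    a = Z1 * Z2 * ALPHA * HBARC / (4.0 * E0_mev)
    return a * a / math.sin(theta_rad / 2.0) ** 4

# α particles of 5.5 MeV (a typical radioactive-source energy) on gold:
r90 = rutherford(math.radians(90), 5.5)
r30 = rutherford(math.radians(30), 5.5)
print(f"ratio 30°/90° = {r30 / r90:.1f}")   # strong forward peaking
```

The 1/sin^4(θ/2) factor is the key feature: most α particles barely deviate, while the rare large-angle scatters revealed the pointlike nucleus.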
Beta radioactivity, the spontaneous emission of electrons by some atoms, was dis-
covered by Ernest Rutherford just a few years after the discovery by Henri Becquerel
that uranium was able to impress photographic plates wrapped in black paper. It then
took some years before James Chadwick, in 1914, realized that the energy spectrum of
the electrons originating in beta decays was continuous and not discrete (Fig. 2.3).
This was a unique feature in the new quantum world, in which decays were explained
as transitions between well-defined energy levels. There was a missing energy problem,
and many explanations were tried over the years, but none was proved. In 1930,
Niels Bohr even went so far as to suggest that the sacrosanct energy conservation
law could be violated.
In December 1930, in a famous letter, Wolfgang Pauli proposed as “desperate
remedy” the existence of a new neutral particle with spin one-half and low mass
named neutron: “The continuous β spectrum would then become understandable
from the assumption that in the β decay a neutron is emitted along with the electron,
2.3 β Decay and the Neutrino Hypothesis 25
in such way that the sum of the energies of the neutron and the electron is constant.”
This tiny new particle was later renamed "neutrino" by Enrico Fermi. The particle
today known as the neutron, a constituent of atomic nuclei, was discovered in 1932
by James Chadwick (Nobel Prize in Physics 1935), then at the University of Cambridge:
Chadwick found a radiation consisting of uncharged particles of approximately the
mass of the proton. His group leader Rutherford had conjectured the existence of
the neutron already in 1920, in order to explain the difference between the atomic
number of an atom and its atomic mass, and he modeled it as an electron orbiting a
proton.
The atomic nuclei were thus composed (in the modern language) of protons and
neutrons, and beta radioactive decay was explained by the decay of one of the
neutrons of the nucleus into one proton, one electron, and one neutrino (in fact, as will be
discussed later, one antineutrino):

n → p e− ν̄ .
The β+ decay, i.e., the decay of one proton in the nucleus into one neutron, one
positron (the antiparticle of the electron), and one neutrino,

p → n e+ ν , (2.7)

is also possible, although the neutron mass is larger than the proton mass—consider
that nucleons are bound in the nucleus, and they are not free particles.
Neutrinos have almost no interaction with matter, and therefore their experimental
discovery was not an easy enterprise: intense sources and massive, high-performance
detectors had to be built. Only in 1956 did Reines and Cowan prove the existence of
the neutrino, placing a water tank near a nuclear reactor. Some of the antineutrinos
produced in the reactor interacted with a proton in the water, giving rise to a neutron
and a positron, the so-called inverse beta process:
ν̄ p → ne+ . (2.8)
The positron then annihilates with an ordinary electron, and the neutron is captured
by the cadmium chloride dissolved in the water. Three photons were then detected
(two from the annihilation and, 5 µs later, one from the de-excitation of the cadmium
nucleus).
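The inverse beta process also has a kinematic threshold, which is why only the more energetic reactor antineutrinos contribute; a quick check (the nucleon and electron masses below are standard values, not quoted in the text):

```python
# Threshold for ν̄ p -> n e+ on a proton at rest:
# E_ν > ((m_n + m_e)^2 - m_p^2) / (2 m_p).
m_p, m_n, m_e = 938.272, 939.565, 0.511    # masses in MeV

E_thr = ((m_n + m_e) ** 2 - m_p ** 2) / (2.0 * m_p)
print(f"E_threshold ≈ {E_thr:.2f} MeV")    # ≈ 1.8 MeV
```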
The mass of the neutrino is indeed very low (but not zero, as discovered by the end
of the twentieth century with the observation of the oscillations between neutrinos
of different families, a phenomenon that is possible only for massive neutrinos) and
determines the maximum energy that the electron may have in the beta decay (the
energy spectrum end-point). The present measurements are compatible with neutrino
masses below a few eV.
If we want to investigate a structure below a length scale Δx, we are limited by the
uncertainty principle (which indeed is a theorem). Since a wavelength

λ ≃ ℏ/p (2.9)

is associated to a particle of momentum p, photons of energy

E > ℏc/Δx (2.10)
must be used. For example, a hard X-ray with an energy of 10 keV can investigate the
structure of a target at a scale

Δx ≳ ℏc/E ≃ 2 × 10^−11 m , (2.11)
an order of magnitude smaller than the atomic radius. A particle with an energy of 7 TeV (the running energy of the LHC accelerator at CERN, Geneva) can investigate the structure of a target at a scale

Δx > ℏc/E ≃ 3 × 10⁻²⁰ m.  (2.12)
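This scaling is easy to evaluate numerically. A minimal Python sketch (the conversion constant ℏc ≃ 197.327 MeV fm is standard; the function name is ours):

```python
# Resolving power of a probe of energy E: Delta_x ~ hbar*c / E, Eq. (2.10).
HBARC_MEV_FM = 197.327  # hbar*c in MeV*fm (standard conversion constant)

def min_scale_m(energy_mev):
    """Smallest structure scale (in meters) resolvable by a probe of energy E."""
    return HBARC_MEV_FM / energy_mev * 1e-15  # 1 fm = 1e-15 m

scale_lhc = min_scale_m(7e6)  # 7 TeV = 7e6 MeV
print(f"7 TeV probe resolves down to about {scale_lhc:.1e} m")
```

For 7 TeV this gives about 3 × 10⁻²⁰ m, matching Eq. (2.12).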
Since one can extract only a finite energy from finite regions of the Universe
(and maybe the Universe itself has a finite energy), there is an intrinsic limit to the
investigation of the structure of matter, below which the quest makes no more sense.
However, as we shall see, there are practical limits much more stringent than that.
Does the concept of elementary particle have a meaning below this? The question is
more philosophical than physical, since one cannot access infinite energies.
The maximum energies attainable by human-made accelerators are believed to be
of the order of some 1000 TeV. However, Nature gives us for free beams of particles
with much larger energies, hitting the Earth from extraterrestrial sources: cosmic
rays.
Particle physicists observe and count particles, as pioneered by the Rutherford experiment. They count, for instance, the number of particles of a certain type with certain characteristics (energy, spin, scattering angle) that result from the interaction of a given particle beam at a given energy with a given target. It is then useful to express the results as quantities independent of the number of beam particles or target particles. These quantities are the cross-sections, σ.
The total cross-section measured in a collision of a beam with a single object (Fig. 2.4) is defined as

σ_tot = N_int / n_beam  (2.13)

where N_int is the total number of measured interactions and n_beam is, as previously defined, the number of beam particles per unit of transverse area.
A cross-section has thus dimensions of area. It represents the effective area with
which the interacting particles “see” each other. The usual unit for cross-section is
the barn, b (1 b = 10−24 cm2 ) and its submultiples (millibarn—mb, microbarn—μb,
nanobarn—nb, picobarn—pb, femtobarn—fb, etc.). To give an order of magnitude,
the total cross-section for the interaction of two protons at a center-of-mass energy
around 100 GeV is 40 mb (approximately the area of a circle with radius 1 fm).
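The geometric picture in the last sentence can be checked with a few lines of Python (the 1 mb = 0.1 fm² conversion follows from 1 b = 10⁻²⁴ cm²):

```python
import math

# Effective radius of a disc with the same area as a 40 mb cross-section.
MB_TO_FM2 = 0.1  # 1 mb = 1e-27 cm^2 = 0.1 fm^2

sigma_pp_mb = 40.0
sigma_fm2 = sigma_pp_mb * MB_TO_FM2          # 4.0 fm^2
radius_fm = math.sqrt(sigma_fm2 / math.pi)   # ~1.1 fm
print(f"effective radius: {radius_fm:.2f} fm")
```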
We can write the total cross-section with a single target as

σ_tot = W_int / J ,  (2.14)
in terms of the interaction rate Wint (number of interactions per unit time) and of the
flux of incident particles J (number of beam particles that cross the unit of transverse
area per unit of time). J is given as
J = ρbeam v, (2.15)
where ρbeam is the density of particles in the beam and v is the beam particle velocity
in the rest frame of the target.
In real life, most targets are composed of N_t small sub-targets (Fig. 2.5) within the beam incidence area. Considering as sub-targets the nuclei of the atoms of a target of depth Δx, and ignoring any shadowing between nuclei, N_t is given as

N_t = N ρ Δx / w_a ,  (2.16)

where N is Avogadro's number, ρ is the specific mass of the target, and w_a is its atomic weight. Note that N_t is a dimensionless number: it is just the number of sub-targets hit by a beam of unit transverse area. In the case of several sub-targets, the total cross-section is

σ_tot = W_int / (J N_t) = W_int / L ,  (2.17)

where the luminosity

L = J N_t  (2.18)

counts the number of beam-target encounters per unit transverse area and per unit time.
In practice, detectors often cover only a given angular region, and we do not measure the total cross-section over the full solid angle. It is therefore useful to introduce the differential cross-section

dσ(θ, φ)/dΩ = (1/L) dW_int(θ, φ)/dΩ  (2.19)

and

σ_tot = ∫ (dσ(θ, φ)/dΩ) dφ d cos θ .  (2.20)
In a collider, the luminosity can be expressed in terms of the beam parameters as

L = (N₁ N₂ / A_T) N_b f  (2.22)

where N₁ and N₂ are the numbers of particles in the crossing bunches, N_b is the number of bunches per beam, A_T is the intersection transverse area, and f is the beam revolution frequency.
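Equation (2.22) can be exercised numerically. In the Python sketch below the bunch parameters are illustrative, roughly LHC-design-like numbers, and the effective transverse area A_T is an assumed value chosen for the example, not a quantity given in the text:

```python
# Collider luminosity, Eq. (2.22): L = (N1*N2/A_T) * Nb * f.
N1 = N2 = 1.15e11   # protons per bunch (illustrative)
Nb = 2808           # bunches per beam (illustrative)
f = 11245.0         # revolution frequency in Hz (illustrative)
A_T = 4.2e-5        # effective transverse overlap area in cm^2 (assumed)

L = (N1 * N2 / A_T) * Nb * f   # cm^-2 s^-1
print(f"L ~ {L:.1e} cm^-2 s^-1")
```

Numbers of this kind give a luminosity of order 10³⁴ cm⁻² s⁻¹.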
When two particles collide, there are often many possible outcomes. Quantum mechanics allows us to compute the occurrence probability for each specific final state. The total cross-section is thus a sum over all possible specific final states:

σ_tot = Σ_i σ_i .  (2.23)
Fig. 2.7 Total and elastic cross sections for pp and p̄ p collisions as a function of laboratory beam
momentum and total center-of-mass energy. From the Review of Particle Physics, K.A. Olive et al.
(Particle Data Group), Chin. Phys. C 38 (2014) 090001
When a beam of particles crosses matter, its intensity is reduced. Using the definition of total cross-section (2.14) and Eq. (2.17), the reduction when crossing a slice of thickness Δx is

ΔN/N = W_int/J = N (ρ/w_A) σ_tot Δx  (2.24)

where w_A is the atomic weight of the target. Defining the interaction length L_int as

L_int = w_A / (σ_tot N ρ)  (2.25)
then

dN/dx = −(1/L_int) N  (2.26)

and thus the beam intensity is attenuated exponentially:

N(x) = N₀ e^(−x/L_int) .  (2.27)
L_int has units of length (usually cm). However, this quantity is often redefined as

L_int → L_int ρ = w_A / (σ_tot N)  (2.28)
and its units will then be g cm⁻². This way of expressing L_int is widely used in cosmic ray physics. In fact, the density of the atmosphere varies strongly with height (see next section). For this reason, to study the interactions of cosmic particles along their path in the atmosphere, the relevant quantity is not the path length but rather the amount of matter that has been traversed, ∫ ρ dx.
In a rough approximation, the atmosphere is isothermal; under this hypothesis, its
depth x in g cm−2 varies exponentially with height h (km), according to the formula
x = X e−h/H (2.29)
where H ≃ 6.5 km and X ≃ 1030 g cm⁻² is the total vertical atmospheric depth.
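Equation (2.29) in a few lines of Python:

```python
import math

# Isothermal-atmosphere depth, Eq. (2.29): x(h) = X * exp(-h/H).
H_KM = 6.5           # scale height in km
X_VERTICAL = 1030.0  # total vertical depth in g/cm^2

def depth(h_km):
    """Residual atmospheric depth (g/cm^2) above altitude h_km."""
    return X_VERTICAL * math.exp(-h_km / H_KM)

print(f"sea level: {depth(0):.0f} g/cm^2")
print(f"h = 10 km: {depth(10):.0f} g/cm^2")
```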
Stable particles like the proton and the electron (as much as we know) are the excep-
tion, not the rule. The lifetime of most particles is finite and its value may span over
many orders of magnitude from, for instance, 10−25 s for the electroweak massive
bosons (Z and W ) to around 900 s for the neutron, depending on the strength of the
relevant interaction and on the size of the decay phase space.
In order to describe decays we must use a genuinely quantum-mechanical language, since decay is a genuinely quantum process whose statistical nature cannot be described properly by classical physics. We shall thus use the language of wave functions: |Ψ(x, y, z, t)|² dV is the probability of finding a particle in a volume dV around the point (x, y, z) at time t.
Fig. 2.8 The wave function of a stable particle and its energy spectrum
Stable particles are described by pure harmonic wave functions, and their Fourier transforms are centered at well-defined proper energies—in the rest frame, E = mc² (Fig. 2.8):

Ψ(t) ∝ Ψ(0) e^(−i(E/ℏ)t)  (2.30)

Ψ̃(E) ∝ δ(E − mc²) .  (2.31)
Unstable particles are described by damped harmonic wave functions, and therefore their proper energies are not well defined (Fig. 2.9):

Ψ(t) ∝ Ψ(0) e^(−i(E/ℏ)t) e^(−(Γ/2ℏ)t)  ⟹  |Ψ(t)|² ∝ |Ψ(0)|² e^(−t/τ)  (2.32)

Ψ̃(E) ∝ 1/[(E − mc²) + iΓ/2]  ⟹  |Ψ̃(E)|² ∝ 1/[(E − mc²)² + Γ²/4]  (2.33)

which is a Cauchy function (physicists call it a Breit-Wigner function) whose width Γ is directly related to the particle lifetime τ:

τ = ℏ/Γ .  (2.34)
Fig. 2.9 The wave function of an unstable particle and its energy spectrum
If a particle can decay through different channels, its total width will be the sum of the partial widths Γ_i of each channel:

Γ_t = Σ_i Γ_i .  (2.35)

An unstable particle may thus have several specific decay rates, but it has just one lifetime:

τ = ℏ/Γ_t .  (2.36)

Therefore, all the Breit-Wigner functions related to the decays of the same particle have the same width Γ_t but different normalization factors, proportional to the fraction of the decays in each specific channel, also called the branching ratio, BR_i (or Br_i), defined as

BR_i = Γ_i / Γ_t .  (2.37)
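The width-lifetime relation (2.36) and the normalization of the branching ratios (2.37) can be made concrete in Python; the Z boson width (about 2.495 GeV) and the value of ℏ in GeV s are standard numbers, not quantities taken from the text:

```python
# tau = hbar / Gamma_t, Eq. (2.36); branching ratios, Eq. (2.37), sum to one.
HBAR_GEV_S = 6.582e-25  # hbar in GeV*s

gamma_total_gev = 2.495          # total width of the Z boson (standard value)
tau_s = HBAR_GEV_S / gamma_total_gev
print(f"tau(Z) ~ {tau_s:.2e} s")

partial = [1.7, 0.5, 0.3]        # hypothetical partial widths (arbitrary units)
brs = [g / sum(partial) for g in partial]
print(f"branching ratios: {brs}, sum = {sum(brs):.3f}")
```

A width of a few GeV thus corresponds to a lifetime of order 10⁻²⁵ s.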
Particles interact like corpuscles but propagate like waves. This was the turmoil created in physics in the early twentieth century by Einstein's theory of the photoelectric effect. In the microscopic world, deterministic trajectories were no longer possible: Newton's laws had to be replaced by wave equations. The Rutherford formula, although deduced classically, coincides anyway with the calculation based on quantum mechanics.
In quantum mechanics, the scattering of a particle due to an interaction that acts only during a finite time interval can be described as the transition between initial and final stationary states characterized by well-defined momenta. The probability λ of such a transition is given, if the perturbation is small, by the Fermi3 golden
3 Enrico Fermi (Rome 1901–Chicago 1954) studied in Pisa and became full professor of Analytical
Mechanics in Florence in 1925, and then of Theoretical Physics in Rome from 1926. Soon he
surrounded himself by a group of brilliant young collaborators, the so-called “via Panisperna boys”
(E. Amaldi, E. Majorana, B. Pontecorvo, F. Rasetti, E. Segré, O. D’Agostino). For Fermi, theory
and experiment were inseparable. In 1934, he discovered that slow neutrons catalyzed a certain type
of nuclear reactions, which made it possible to derive energy from nuclear fission. In 1938, Fermi
went to Stockholm to receive the Nobel Prize, awarded for his fundamental work on neutrons, and from there he emigrated to the USA, where he became an American citizen, in open dispute with the Italian racial laws. He actively participated in the Manhattan Project for the use of nuclear power
for the atomic bomb, but spoke out against the use of this weapon on civilian targets. Immediately
after the end of World War II, he devoted himself to theoretical physics of elementary particles
and to the origin of cosmic rays. Few scientists of the twentieth century had as profound an impact on as many different areas of physics as Fermi: he stands, for elegance and power of thought, in the group of immortal geniuses like Einstein, Landau, Heisenberg, and later Feynman.
rule (see the book by Townsend among the recommended readings at the end of the chapter):

λ = (2π/ℏ) |H_fi|² ρ(E_i)  (2.38)
where H_fi is the transition amplitude between the initial and final states and ρ(E_i) is the density of final states. For elastic scattering off a static potential, the initial and final states can be described by plane waves normalized in a box of side L, with momenta p_i = ℏk_i and p_f = ℏk_f, respectively (k = |k_i| = |k_f|):

u_i = L^(−3/2) exp(i k_i · r)  (2.40)

and

u_f = L^(−3/2) exp(i k_f · r) .  (2.41)
The perturbation is the Coulomb potential

V(r) = (1/4πε₀) Q₁Q₂/r ,  (2.42)

where ε₀ is the vacuum dielectric constant and Q₁ and Q₂ are the charges of the beam and of the target particles. The transition amplitude can thus be written as

H_fi = L⁻³ ∫ exp(−i k_f · r) V(r) exp(i k_i · r) d³x .  (2.43)
Expressing the result in terms of the momentum transfer q = ℏ(k_f − k_i), whose squared modulus for elastic scattering is

|q|² = 4ℏ²k² sin²(θ/2) ,  (2.47)

the amplitude, and thus the cross-section, becomes a function of the scattering angle θ.
2.7.2 Flux
The flux, as seen in (2.15), is J = ρ_beam v, which in the present case (one particle in a box of volume L³) may be written as

J = v/L³ = ℏk/(m L³) .  (2.49)
The density of final states ρ(E_i) is determined by the dimension of the normalization box. At the boundaries of the box the wave function must vanish, and so only harmonic waves are possible. Therefore, the projections of the wave number vector k are quantized:

k_x = 2π n_x/L ;  k_y = 2π n_y/L ;  k_z = 2π n_z/L ,  (2.50)

where n_x, n_y, and n_z are the integer harmonic numbers. Considering now the wave number k in its vector space, the volume associated to each possible state defined by a particular set of harmonic numbers is just

dk_x dk_y dk_z / (dn_x dn_y dn_z) = (2π/L)³ ,  (2.51)

while the volume element in this space is

d³k = k² dk dΩ .  (2.52)
The energy of the (nonrelativistic) final-state particle is

E = (ℏk)²/(2m) ,  (2.54)
Putting all the ingredients together, the Rutherford cross-section is recovered. The classical derivation is valid when the distance of closest approach for a projectile of kinetic energy E₀,

d_min = Q₁Q₂/(4πε₀ E₀) ,  (2.57)

is large compared to its de Broglie wavelength

λ = h/√(2m E₀) .  (2.58)

In the particular case of the Rutherford experiment (α particles with a kinetic energy of 7.7 MeV against a gold foil) λ ≪ d_min, and the classical approximation is, by chance, valid.
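The comparison invoked here can be redone numerically with standard constants (ℏc, the fine structure constant, the α-particle mass); a Python sketch:

```python
import math

# Rutherford experiment: alpha particles (Z=2), E0 = 7.7 MeV, on gold (Z=79).
HBARC = 197.327          # hbar*c in MeV*fm
ALPHA_EM = 1.0 / 137.0   # fine structure constant
E0 = 7.7                 # kinetic energy in MeV
M_ALPHA = 3727.4         # alpha mass in MeV/c^2

# Eq. (2.57): d_min = Q1*Q2/(4*pi*eps0*E0) = Z1*Z2 * alpha * hbar*c / E0
d_min_fm = 2 * 79 * ALPHA_EM * HBARC / E0
# Eq. (2.58): lambda = h/sqrt(2*m*E0) = 2*pi*hbar*c / sqrt(2 * m*c^2 * E0)
lam_fm = 2 * math.pi * HBARC / math.sqrt(2 * M_ALPHA * E0)

print(f"d_min ~ {d_min_fm:.1f} fm, lambda ~ {lam_fm:.1f} fm")
```

One finds d_min of about 30 fm against a wavelength of about 5 fm, so the classical treatment holds.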
Quantum field theories, which provide in modern physics the description of inter-
actions, describe nature in terms of fields. A force between two particles (described
by “particle fields”) is described in terms of the exchange of virtual force carrier
particles (again described by appropriate fields) between them. For example, the
electromagnetic force is mediated by the photon field, weak interactions are medi-
ated by the Z and W ± fields, while the mediators of the strong interaction are called
gluons. “Virtual” means that these particles can be off-shell, i.e., they do not need to
have the “right” relationship between mass, momentum, and energy—this is related
to the virtual particles that we discussed in the previous chapter, which can violate
energy-momentum conservation for short times.
Each exchanged photon contributes to the amplitude a factor of the order of the fine structure constant,

α = (1/4πε₀) e²/(ℏc) ≃ 1/137 ,

so the amplitudes for diagrams with many photons (see for example Fig. 2.11, right) are small compared to those with only one.
4 Richard Feynman (New York 1918—Los Angeles 1988), longtime professor at Caltech, is known
for his work in quantum mechanics, in the theory of quantum electrodynamics, as well as in particle
physics; he participated in the Manhattan project. In addition, he proposed quantum computing. He
received the Nobel Prize in Physics in 1965 for his “fundamental work in quantum electrodynamics,
with deep-ploughing consequences for the physics of elementary particles.” His life was quite
adventurous, and full of anecdotes. In the divorce file related to his second marriage, his wife
complained that “He begins working calculus problems in his head as soon as he awakens. He did
calculus while driving in his car, while sitting in the living room, and while lying in bed at night.” He
wrote several popular physics books, and an excellent general physics textbook now freely available
at http://www.feynmanlectures.caltech.edu/.
The Rutherford formula was deduced assuming a static Coulomb field created by a fixed point charge. These assumptions can be either too crude or simply not valid in many cases. Hereafter, some generalizations of the Rutherford formula are discussed.

Let us assume that the source of the static Coulomb field has some spatial extension, described by a normalized charge density ρ(r′) (Fig. 2.12) with

∫ ρ(r′) d³x′ = 1 .  (2.59)
Then

H_fi = L⁻³ (Q₁Q₂/4πε₀) ∫∫ [ρ(r′)/|r − r′|] exp(−(i/ℏ) q · r) d³x d³x′ =  (2.60)

= L⁻³ (Q₁Q₂/4πε₀) ∫∫ [ρ(r′)/|r − r′|] exp(−(i/ℏ) q · (r − r′)) exp(−(i/ℏ) q · r′) d³x d³x′  (2.61)

and the cross-section is modified into

dσ/dΩ = (dσ/dΩ)₀ |F(q)|² ,  (2.62)

where (dσ/dΩ)₀ is the Rutherford cross-section and F(q) = ∫ ρ(r′) exp(−(i/ℏ) q · r′) d³x′, the Fourier transform of the charge distribution, is called the form factor.
In the case of the proton, the differential ep cross-section at low transverse momentum is described by such a formula, the form factor being given by the dipole formula

F(q) ∝ (1 + |q|²/(ℏ²b²))⁻² .  (2.63)

The corresponding charge distribution (its inverse Fourier transform) is ρ(r) ∝ e^(−r/a), where a = 1/b ≃ 0.2 fm corresponds to a root mean square charge radius of 0.8 to 0.9 fm. The size of the proton is then determined to be at the scale of 1 fm.
The Coulomb field, like the Newtonian gravitational field, has an infinite range. Let us now consider a field with an exponential attenuation (Yukawa potential),

V(r) = (g/4πr) exp(−r/a) ,  (2.64)

where a is some interaction range scale. Then,

H_fi = L⁻³ (g/4π) ∫ (1/r) exp(−r/a) exp(−(i/ℏ) q · r) d³x .  (2.65)
Carrying out the integration,

H_fi = − (ℏ² g) / [L³ (|q|² + ℏ²/a²)]  (2.66)

and

dσ/dΩ = g² [ |q|² / (|q|² + M²c²) ]² (dσ/dΩ)₀ ,  (2.67)
where (dσ/dΩ)₀ is the Rutherford cross-section. M = ℏ/(a c) was interpreted by Hideki Yukawa,⁵ as will be discussed in Sect. 3.2.4, as the mass of a particle exchanged between nucleons and responsible for the strong interaction which ensures the stability of nuclei. The relevant scale a ≃ 1 fm corresponds to the size of nuclei, and the mass of the exchanged particle comes out to be M ≃ 200 MeV/c² (see Sect. 2.11 for the conversion).
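Yukawa's estimate amounts to one line of arithmetic; in Python, with the standard value of ℏc:

```python
# M*c^2 = hbar*c / a for a force of range a, here a ~ 1 fm (nuclear size).
HBARC_MEV_FM = 197.327  # MeV*fm

a_fm = 1.0
mass_mev = HBARC_MEV_FM / a_fm
print(f"M c^2 ~ {mass_mev:.0f} MeV")
```

The result, close to 200 MeV, is near the mass of the pion, as Chap. 3 will discuss.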
A pointlike spin-1/2 particle such as the electron carries an intrinsic magnetic moment proportional to its spin,

μ = (Q_e/m_e) S ,  (2.68)

where Q_e and m_e are, respectively, the charge and the mass of the electron.
The electron scattering cross-section is given by the Mott formula (its derivation is beyond the scope of the present chapter, as it requires relativistic quantum mechanics):

dσ/dΩ = (dσ/dΩ)₀ (1 − β² sin²(θ/2)) .  (2.69)
5 Hideki Yukawa (Tokyo, 1907–Kyoto, 1981), professor at Kyoto University, gave fundamental contributions to quantum mechanics. For his research he was awarded the Nobel Prize in Physics in 1949.
When β → 1,

dσ/dΩ = (dσ/dΩ)₀ cos²(θ/2) ,  (2.71)

which translates the fact that, for massless particles (as will be discussed in Sect. 6.2.4), the projection of the spin S over the direction of the linear momentum p is conserved (Fig. 2.13). The helicity quantum number h is defined as

h = S · p/|p| .  (2.72)
Physical laws are, since Galilei and Newton, believed to be the same in all inertial
reference frames (i.e., in all frames moving with constant speed with respect to a
frame in which they hold—classical mechanics postulates with the law of inertia the
existence of at least one admissible frame). This is called the principle of special
relativity, and it has been formulated in a quantitative way by Galilei. According
to the laws of transformations of coordinates between inertial frames in classical
physics (called Galilei transformations), accelerations are invariant with respect to
a change of reference frame—while speeds are trivially non-invariant. Since the
equations of classical physics (Newton’s equations) are based on accelerations only,
this automatically guarantees the principle of relativity.
Something revolutionary happened when Maxwell’s equations6 were formulated.
Maxwell's equations in vacuo,

∇ · E = ρ/ε₀  (2.73)

∇ × E = − ∂B/∂t  (2.74)

∇ · B = 0  (2.75)

∇ × B = (1/c²) ∂E/∂t + μ₀ j ,  (2.76)

together with the equation describing the motion of a particle of electrical charge q in an electromagnetic field,

F = q(E + v × B) ,  (2.77)

summarize the laws of electromagnetism; contrary to Newton's equations, they are not invariant under Galilei transformations, since they explicitly contain a speed, c.
6 James Clerk Maxwell (1831–1879) was a Scottish physicist. His most prominent achievement was the formulation of the classical theory of electromagnetism, unifying electricity, magnetism, and light.
To solve the problem (i.e., to maintain the speed of light c as an invariant of nature, and to guarantee the covariant formulation of the laws of mechanics) a deep change in our perception of space and time was needed: it was demonstrated that time and length
intervals are no longer absolute. Two simultaneous events in one reference frame
are not simultaneous in any other reference frame that moves with nonzero velocity
with respect to the first one; the Galilean transformations had to be replaced by new
ones, the Lorentz transformations. Another striking consequence of this revolution
was that mass is just a particular form of energy; kinematics at velocities near c is
quite different from the usual one and particle physics is the laboratory to test it.
Let S′ be a reference frame moving with constant velocity v directed along the common x axis with respect to S (Fig. 2.14). The coordinates in one reference frame transform into the coordinates in the other reference frame (Lorentz transformations) as:

ct = γ(ct′ + β x′)
x = γ(x′ + β ct′)
y = y′
z = z′

where β = v/c and γ = 1/√(1 − β²). It can be verified that, applying the above transformations, the speed of light is an invariant between S and S′.
A conceptually nontrivial consequence of these transformations is that, for an observer in S, the time interval ΔT between two events happening at the same place in S′ is larger than the interval ΔT′ measured by a clock in S′, the so-called proper time (time dilation):

ΔT = γ ΔT′  (2.78)

while the length of a ruler at rest in S′ is shorter when measured in S (length contraction):

L = L′/γ .  (2.79)
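The transformations above are easy to verify numerically. A minimal Python sketch, checking that the interval (ct)² − x² − y² − z² is unchanged by a boost (the event coordinates and the boost velocity are arbitrary test values):

```python
import math

def boost_x(ct, x, y, z, beta):
    """Transform event coordinates from S' to S for a boost along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (ct + beta * x), gamma * (x + beta * ct), y, z)

event = (3.0, 1.0, 2.0, 0.5)          # (ct', x', y', z'), arbitrary units
ct, x, y, z = boost_x(*event, beta=0.8)

s2_before = event[0]**2 - event[1]**2 - event[2]**2 - event[3]**2
s2_after = ct**2 - x**2 - y**2 - z**2
print(f"s^2: {s2_before:.6f} -> {s2_after:.6f}")
```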
The interval

ds² = c² dt² − dx² − dy² − dz²  (2.80)

is invariant under Lorentz transformations, and now we want to extend the properties of the 4-ple (cdt, dx, dy, dz) to other 4-ples behaving in a similar way. We want to do something similar to what we did in classical physics: we want to introduce representations such that the equations become manifestly covariant with respect to transformations, i.e., such that the laws of physics keep the same form in different reference frames.
Let us introduce a simple convention: in the 4-ple (cdt, d x, dy, dz) the elements
will be numbered from 0 to 3. Greek indices like μ will run from 0 to 3, and Roman
symbols will run from 1 to 3 (i = (1, 2, 3)) as in the usual three-dimensional case.
We define as four-vector a quadruple

A^μ = (A⁰, A¹, A², A³) = (A⁰, A)  (2.81)

which transforms like (cdt, dx, dy, dz) under changes of reference system. The A^μ (with high indices) is called the contravariant representation of the four-vector. Correspondingly, we define the 4-ple

A_μ = (A₀, A₁, A₂, A₃) = (A⁰, −A¹, −A², −A³) = (A⁰, −A) ,  (2.82)

called the covariant representation. For the coordinates,

x⁰ = ct ;  x¹ = x ;  x² = y ;  x³ = z .  (2.83)
By our definition, the quantity Σ_μ A_μ A^μ ≡ A_μ A^μ is invariant. Omitting the sum sign when an index is repeated once in contravariant position and once in covariant position is called the Einstein sum convention. Sometimes, when there is no ambiguity, this quantity is also indicated as A².

In analogy to the square of a four-vector, one forms the scalar product of two different four-vectors:

A^μ B_μ = A⁰B₀ + A¹B₁ + A²B₂ + A³B₃ = A⁰B⁰ − A¹B¹ − A²B² − A³B³ .

It is clear that it can be written either as A^μ B_μ or A_μ B^μ—the result is the same.
The symmetric matrix which transforms the contravariant A^μ into the covariant A_μ and vice versa is called the metric tensor (sometimes the Minkowski tensor or the Minkowski metric tensor). Indeed we can write A_μ = g_{μν} A^ν, where

g_{μν} = diag(1, −1, −1, −1)  (2.85)

and

A^μ A_μ = g_{μν} A^μ A^ν = g^{μν} A_μ A_ν .  (2.86)

Besides, we have that g_{μν} g^{μρ} = δ_ν^ρ. In this way we have enlarged the space by adding a fourth dimension A⁰: the time dimension.
The generic transformation between reference frames can be written by expressing the Lorentz transformations by means of a 4 × 4 matrix Λ.
2.10.1.1 Tensors
The completely antisymmetric unit tensor of rank 4 is defined by

ε^{0123} = 1 .  (2.89)

The components change sign under the interchange of any pair of indices, and clearly the nonzero components are those for which all four indices are different; every odd permutation changes the sign. The number of components with nonzero value is 4! = 24. We have

ε_{μνρσ} = g_{αμ} g_{βν} g_{γρ} g_{δσ} ε^{αβγδ} = −ε^{μνρσ} .
In terms of the metric tensor, the invariant interval between two events reads

ds² = g_{μν} dx^μ dx^ν .  (2.90)

The four-gradient operators are defined as

∂_μ = ∂/∂x^μ = ((1/c) ∂/∂t, ∇) ;  ∂^μ = ∂/∂x_μ = ((1/c) ∂/∂t, −∇) ,  (2.94)

and their contraction is

∂_μ ∂^μ = (1/c²) ∂²/∂t² − ∇² ≡ □ ;  (2.95)

this is called the D'Alembert operator.
The space-time is divided thus into two regions by the hypercone of the
s 2 = 0 events (the so-called light cone, Fig. 2.15). If the interval between two
causally connected events is “time-like” (no time travels, sorry) then the maximum
speed at which information can be transmitted is c.
Energy and momentum conservation have a deep meaning for physicists (they are
closely connected to the invariance of physical laws under time and space transla-
tions). In order to keep these conservation laws in special relativity, new definitions
have to be introduced:
E = γ m c²
p = γ m v .

Expanding the energy in powers of v/c,

E = mc² + (1/2) m v² + (3/8) m v⁴/c² + ...  (2.96)
The classical expression for kinetic energy is recovered at first order but a constant
term is now present. At rest, the energy of a body is E 0 = mc2 . In the words of
Einstein “mass and energy are both but different manifestations of the same thing.”
The dream of the alchemists is possible, but the recipe is very different.
The “Lorentz boost” γ of a particle and its velocity β normalized to the speed of light can be obtained as:

γ = E/(mc²)
|β| = |p|c/E .
p⁰ = E/c ;  p¹ = p_x ;  p² = p_y ;  p³ = p_z ,  (2.97)

and the same Lorentz transformations, valid for any four-vector, apply to this object. The scalar product of p^μ with itself is by definition invariant, and the result is

p² = p_μ p^μ = (E/c)² − |p|² = m²c²  (2.98)

and thus

E² = m²c⁴ + |p|²c² .  (2.99)
While in classical mechanics you cannot have a particle of zero mass, this is
possible in relativistic mechanics. The particle will have 4-momentum

p^μ = (E/c, p)  (2.100)

with

E² − |p|²c² = 0 ,  (2.101)
and thus will move at the speed of light. The converse is also true: if a particle moves
at the speed of light, its rest mass is zero—but the particle still carries a momentum
E/c. The photon is such a particle.
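Equation (2.99), together with the boost factors γ = E/mc² and β = |p|c/E, can be wrapped in a small Python helper; the proton mass used below is the standard value, and the momentum is an LHC-like illustration:

```python
import math

def kinematics(m_gev, p_gev):
    """Return (E, gamma, beta) for mass m (GeV/c^2) and momentum p (GeV/c)."""
    E = math.sqrt(m_gev**2 + p_gev**2)  # Eq. (2.99), in GeV
    return E, E / m_gev, p_gev / E

E, gamma, beta = kinematics(0.938, 7000.0)  # a 7 TeV proton
print(f"E = {E:.1f} GeV, gamma = {gamma:.0f}, beta = {beta:.9f}")
```

At these energies γ is of order 7500 and β is indistinguishable from 1 for most practical purposes.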
2.10.4.1 Decay
Let a body of mass M spontaneously decay into two parts with masses m 1 and m 2
respectively. In the frame of reference in which the body is at rest, energy conservation
gives
Mc² = E₁ + E₂ ,  (2.102)
where E₁ and E₂ are the energies of the final-state particles. Since for a particle of mass m one has E ≥ mc², this requires M ≥ (m₁ + m₂): a body can decay spontaneously only into parts for which the sum of the masses is smaller than or equal to the mass of the body. If M < (m₁ + m₂) the body is stable (with respect to that particular decay), and if we want to generate that process we have to supply from outside an amount of energy at least equal to its “binding energy” (m₁ + m₂ − M)c².
Momentum must be conserved as well in the decay: in the rest frame of the decaying particle p₁ + p₂ = 0. Consequently, |p₁|² = |p₂|², or

E₁² − m₁²c⁴ = E₂² − m₂²c⁴ .  (2.103)

Solving the two conservation equations,

E₁ = (M² + m₁² − m₂²) c² / (2M) ;  E₂ = (M² + m₂² − m₁²) c² / (2M) .  (2.104)
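Equation (2.104) applied to a concrete case, the decay π⁺ → μ⁺ν (the neutrino is treated as massless; the pion and muon masses are standard values, not quantities from the text):

```python
def decay_energies(M, m1, m2):
    """Rest-frame energies of the two daughters, Eq. (2.104); masses in MeV/c^2."""
    E1 = (M**2 + m1**2 - m2**2) / (2 * M)
    E2 = (M**2 + m2**2 - m1**2) / (2 * M)
    return E1, E2

M_PI, M_MU = 139.57, 105.66  # MeV/c^2
E_mu, E_nu = decay_energies(M_PI, M_MU, 0.0)
print(f"E_mu = {E_mu:.1f} MeV, E_nu = {E_nu:.1f} MeV")
```

The muon energy comes out fixed, close to 109.8 MeV: a two-body decay at rest produces monochromatic daughters.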
Let us consider, from the point of view of relativistic mechanics, the elastic collision of particles. We denote the momenta and energies of the two colliding particles (with masses m₁ and m₂) by p₁, E₁ and p₂, E₂, respectively; we use primes for the corresponding quantities after the collision. The laws of conservation of momentum and energy in the collision can be written together as the equation of conservation of the four-momentum:

p₁^μ + p₂^μ = p₁′^μ + p₂′^μ .  (2.105)

We rewrite it as p₁^μ + p₂^μ − p₁′^μ = p₂′^μ and square:

m₁²c⁴ + p₁^μ p₂μ − p₁^μ p₁′μ − p₂^μ p₁′μ = 0 .  (2.106)

Similarly,

m₂²c⁴ + p₁^μ p₂μ − p₂^μ p₂′μ − p₁^μ p₂′μ = 0 .  (2.107)
Let us consider the collision in a reference frame in which one of the particles (m₂) was at rest before the collision. Then p₂ = 0 and

p₁^μ p₂μ = E₁ m₂c² ,  p₂^μ p₁′μ = m₂ E₁′ c² ,  p₁^μ p₁′μ = E₁E₁′ − p₁p₁′c² cos θ₁ ,

where θ₁ is the angle of scattering of the incident particle m₁. Substituting these expressions into Eq. (2.106) we get

cos θ₁ = [E₁′(E₁ + m₂c²) − E₁m₂c² − m₁²c⁴] / (p₁ p₁′ c²) .  (2.108)
We note that if m₁ > m₂, i.e., if the incident particle is heavier than the target particle, the scattering angle θ₁ cannot exceed a certain maximum value. It is easy to find by elementary computations that this value is given by the equation

sin θ₁,max = m₂/m₁ .  (2.109)
The kinematics of two-to-two particle scattering (two incoming and two outgoing
particles, see Fig. 2.16) can be expressed in terms of Lorentz invariant scalars, the
Mandelstam variables s, t, u, obtained as the square of the sum (or subtraction) of
the four-vectors of two of the particles involved.
If p1 and p2 are the four-vectors of the incoming particles and p3 and p4 are the
four-vectors of the outgoing particles, the Mandelstam variables are defined as
s = (p₁ + p₂)²
t = (p₁ − p₃)²
u = (p₁ − p₄)² .
In the s-channel production of a resonance X,

1 + 2 → X → 3 + 4 ,  (2.111)

the resonance can be produced only if the center-of-mass energy is at least equal to its mass:

√s ≥ M_X c² .  (2.112)

If the masses are negligible (mc² ≪ E), the square of the four-momentum transfer reduces to

t = q² ≃ −4E₁E₃ sin²(θ/2) ,  (2.114)
Fig. 2.17 Two-to-two particles interaction channels: left s-channel; center t-channel; right
u-channel
where θ is the scattering angle; in this limit the four-momentum transfer is dominated by its spatial part,

q² ≃ −|q|² .  (2.115)
s + t + u = (m₁² + m₂² + m₃² + m₄²) c⁴ .  (2.116)
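A numerical cross-check of this relation in the massless limit, where the right-hand side vanishes (the center-of-mass kinematics below are built by hand, with arbitrary energy and angle):

```python
import math

def mdot(p, q):
    """Minkowski scalar product with metric (+,-,-,-)."""
    return p[0]*q[0] - p[1]*q[1] - p[2]*q[2] - p[3]*q[3]

def comb(p, q, sign):
    """Component-wise sum (sign=+1) or difference (sign=-1) of four-vectors."""
    return tuple(a + sign * b for a, b in zip(p, q))

E, th = 10.0, 0.7  # energy per massless particle and scattering angle
p1 = (E, 0.0, 0.0, E)
p2 = (E, 0.0, 0.0, -E)
p3 = (E, E*math.sin(th), 0.0, E*math.cos(th))
p4 = (E, -E*math.sin(th), 0.0, -E*math.cos(th))

s = mdot(comb(p1, p2, +1), comb(p1, p2, +1))
t = mdot(comb(p1, p3, -1), comb(p1, p3, -1))
u = mdot(comb(p1, p4, -1), comb(p1, p4, -1))
print(f"s = {s:.2f}, t = {t:.2f}, u = {u:.2f}, s+t+u = {s+t+u:.2e}")
```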
In the relativistic case, the Fermi golden rule generalizes to a transition rate

λ = (2π/ℏ) |M|² [ Π_{i=1}^{n_i} 1/(2E_i) ] ρ_{n_f}(E) ,  (2.117)

where |M|² is the Lorentz-invariant matrix element; n_i and n_f are the numbers of particles in the initial and final states, respectively, and the δ functions in the phase-space factor ρ_{n_f} ensure the conservation of linear momentum and energy.
In the case of a two-body final state, the phase space in the center-of-mass frame is simply

ρ₂(E*) = π |p*| / [(2π)⁶ E*]  (2.120)

where p* and E* are the linear momentum of each final-state particle and the total energy in the center-of-mass reference frame, respectively. The flux is now defined as
J = (2E_a)(2E_b) v_ab = 4F  (2.121)

where v_ab is the relative velocity of the two interacting particles and F is called the Møller invariant flux factor. In terms of the four-vectors p_a and p_b of the incoming particles:

F = √[(p_a · p_b)² − m_a² m_b² c⁴] .  (2.122)
Putting together all the factors, the cross-section for the two-particle interaction is given as

σ(a + b → 1 + 2 + ··· + n_f) = (S/4F) ∫ |M|² (2π)^(4−3n_f) δ³(Σ_f p_f − p₀) δ(Σ_f E_f − E₀) Π_{f=1}^{n_f} d³p_f/(2E_f)  (2.124)

where S is a statistical factor that corrects for double counting whenever there are identical particles and also accounts for spin statistics.
In the special case of two-to-two body scattering in the center-of-mass frame, a simple expression for the differential cross-section dσ/dΩ can thus be obtained (if |M|² is a function of the final momentum, the angular integration cannot be carried out):

dσ/dΩ = (ℏc/8π)² (S |M|²/s) (|p_f|/|p_i|) .  (2.125)
The partial width can be computed using again the relativistic version of the Fermi golden rule, applied to the particular case of a particle at rest decaying into a final state with n_f particles:

Γ_i = (S/2m_i) ∫ |M|² (2π)^(4−3n_f) δ³(Σ_f p_f − p₀) δ(Σ_f E_f − E₀) Π_{f=1}^{n_f} d³p_f/(2E_f) .  (2.126)
In the particular case of only two particles in the final state, it simplifies to

Γ_i = [S |p*| / (8π m_i² c)] |M|²  (2.127)

where p* is the linear momentum of each of the final-state particles in the center-of-mass reference frame.
Let us start from the homogeneous equation ∇ · B = 0. We know that it implies that B can be expressed as the curl of a vector field. So, we write it in the form:

B = ∇ × A .  (2.128)
Substituting into Faraday's law, ∇ × E = −∂B/∂t, we obtain ∇ × E + ∂(∇ × A)/∂t = 0. Since we can differentiate either with respect to time or to space first, we can also write this equation as

∇ × (E + ∂A/∂t) = 0 .  (2.129)
We see that E + ∂A/∂t is a vector whose curl is equal to zero. Therefore that vector can be expressed as the gradient of a scalar field. In electrostatics, we take E to be the gradient of −φ. We do the same thing for E + ∂A/∂t; we set

E + ∂A/∂t = −∇φ .  (2.130)

We use the same symbol φ, so that, in the electrostatic case where nothing changes, the relation E = −∇φ still holds. Thus Faraday's law can be put in the form

E = −∇φ − ∂A/∂t .  (2.131)
We have solved two of Maxwell's equations already, and we have found that to describe the electromagnetic fields E and B, we need four potential functions: a scalar potential φ, and a vector potential A, which is, of course, three functions.
Now that A determines part of E, as well as B, what happens when we change A to A′ = A + ∇ψ? Although B does not change, since ∇ × ∇ψ = 0, in general E would change. We can, however, still allow A to be changed without affecting the electric and magnetic fields—that is, without changing the physics—if we always change A and φ together by the rules

A′ = A + ∇ψ ;  φ′ = φ − ∂ψ/∂t .  (2.132)
Now we return to the two remaining Maxwell equations, which will give us relations between the potentials and the sources. Once we determine A and φ from the currents and charges, we can always get E and B from Eqs. (2.128) and (2.131), so we will have another form of Maxwell's equations.

We begin by substituting Eq. (2.131) into ∇ · E = ρ/ε₀; we get

∇ · (−∇φ − ∂A/∂t) = ρ/ε₀  ⟹  −∇²φ − (∂/∂t) ∇ · A = ρ/ε₀ .  (2.133)
The fourth equation,

∇ × B = μ₀ j + (1/c²) ∂E/∂t ,  (2.134)

can be written as

∇ × (∇ × A) = μ₀ j + (1/c²) (∂/∂t)(−∇φ − ∂A/∂t)  (2.135)

and, since ∇ × (∇ × A) = ∇(∇ · A) − ∇²A, we can write

−∇²A + ∇(∇ · A) + (1/c²) ∇(∂φ/∂t) + (1/c²) ∂²A/∂t² = μ₀ j
⟹ −∇²A + (1/c²) ∂²A/∂t² + ∇(∇ · A + (1/c²) ∂φ/∂t) = μ₀ j .  (2.136)
Fortunately, we can now make use of our freedom to choose arbitrarily the divergence of A, which is guaranteed by (2.132). What we are going to do is to use our choice to fix things so that the equations for A and for φ are separated but have the same form. We can do this by taking (this is called the Lorenz8 gauge):

∇ · A = −(1/c²) ∂φ/∂t .  (2.137)
With this choice, the two equations decouple:

−∇²A + (1/c²) ∂²A/∂t² = μ₀ j  ⟹  □A = μ₀ j  (2.138)

−∇²φ + (1/c²) ∂²φ/∂t² = ρ/ε₀  ⟹  □φ = ρ/ε₀ .  (2.139)
These equations are particularly fascinating. We can easily obtain from Maxwell's equations the continuity equation for charge,

∇ · j + ∂ρ/∂t = 0 .

If a net electric current is flowing out of a region, then the charge in that region must be decreasing by the same amount: charge is conserved. This provides a proof that j^μ = (cρ, j) is a four-vector, since we can write

∂_μ j^μ = 0 .  (2.140)
If we define the 4-ple A^μ = (φ/c, A), the Eqs. (2.138), (2.139) can be written all together as

□A^μ = μ₀ j^μ .  (2.141)

Thus the 4-ple A^μ is also a four-vector; we call it the 4-potential of the electromagnetic field. Considering this fact, it appears clear that the Lorenz gauge (2.137) is covariant, and can be written as

∂μ A^μ = 0 .  (2.142)

⁸ Ludvig Lorenz (1829–1891), not to be confused with Hendrik Antoon Lorentz, was a Danish mathematician and physicist, professor at the Military Academy in Copenhagen.
60 2 The Birth and the Basics of Particle Physics
In regions where there are no longer any charges and currents, the solution of
Eq. (2.141) is a 4-potential, which is changing in time but always moving out at the
speed c. This 4-field travels onward through free space.
Since A^μ is a four-vector, the antisymmetric matrix

F^{μν} = ∂^μ A^ν − ∂^ν A^μ  (2.143)

is thus a 4-tensor. Obviously the diagonal elements of this tensor are null. The elements of the 0-th row and column are, respectively,

F^{0i} = ∂⁰A^i − ∂^i A⁰ = (1/c) ∂A^i/∂t + (1/c) ∂φ/∂x^i = −E^i/c  (2.144)

F^{i0} = −F^{0i} = E^i/c .  (2.145)

The purely spatial elements are

F^{12} = ∂¹A² − ∂²A¹ = −(∇ × A)_z = −B_z  (2.146)

F^{13} = ∂¹A³ − ∂³A¹ = (∇ × A)_y = B_y  (2.147)

F^{23} = ∂²A³ − ∂³A² = −(∇ × A)_x = −B_x .  (2.148)
The components of the electromagnetic field are thus elements of a tensor, the
electromagnetic tensor.
The nonhomogeneous Maxwell equations have been written as (Eq. 2.141) □A^μ = μ₀ j^μ. We can write ∂ν F^{νμ} = ∂ν ∂^ν A^μ − ∂^μ (∂ν A^ν), and since ∂ν A^ν = 0,

∂ν F^{νμ} = μ₀ j^μ .  (2.153)

The homogeneous Maxwell equations,

∇ × E + ∂B/∂t = 0 ;  ∇ · B = 0 ,

give the following result (4 equations):

∂²F⁰³ + ∂³F²⁰ + ∂⁰F³² = 0  ...  ∂¹F²³ + ∂²F³¹ + ∂³F¹² = 0

and thus

∇ × E = −∂B/∂t  &  ∇ · B = 0  ⇐⇒  ε_{αβγδ} ∂^β F^{γδ} = 0  (α = 0, 1, 2, 3) .
Due to the tensor nature of F_{μν}, the two following quantities are invariant under transformations between inertial frames:

(1/2) F_{μν} F^{μν} = B² − E²/c²

(c/8) ε_{αβγδ} F^{αβ} F^{γδ} = B · E

where ε_{αβγδ} is the completely antisymmetric unit tensor of rank 4.
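As a numerical illustration (a sketch of mine, not from the text; units with c = 1), one can build F^{μν} from given E and B fields, apply a Lorentz boost as F′ = ΛFΛᵀ, and check that the two invariants above are unchanged:

```python
import numpy as np
from itertools import permutations

def parity(p):
    """Sign of a permutation (value of the Levi-Civita symbol)."""
    s, q = 1, list(p)
    for i in range(4):
        for j in range(i + 1, 4):
            if q[i] > q[j]:
                s = -s
    return s

def em_tensor(E, B):
    """Contravariant F^{mu nu} for signature (+,-,-,-), in units with c = 1."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([[0., -Ex, -Ey, -Ez],
                     [Ex,  0., -Bz,  By],
                     [Ey,  Bz,  0., -Bx],
                     [Ez, -By,  Bx,  0.]])

def invariants(F):
    eta = np.diag([1., -1., -1., -1.])
    F_lo = eta @ F @ eta                          # lower both indices: F_{mu nu}
    inv1 = 0.5 * np.sum(F_lo * F)                 # (1/2) F_mn F^mn = B^2 - E^2
    inv2 = sum(parity(p) * F[p[0], p[1]] * F[p[2], p[3]]
               for p in permutations(range(4))) / 8.0   # = E . B
    return inv1, inv2

def boost_x(beta):
    """Lorentz boost matrix along the x axis."""
    g = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = g
    L[0, 1] = L[1, 0] = -g * beta
    return L

E = np.array([0.3, 1.2, -0.5])
B = np.array([0.8, 0.1, 0.4])
F = em_tensor(E, B)
Fp = boost_x(0.6) @ F @ boost_x(0.6).T   # fields seen by a boosted observer
```

E and B separately change under the boost, but the two combinations above do not.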
The international system of units (SI) can be constructed on the basis of four funda-
mental units: a unit of length (the meter, m), a unit of time (the second, s), a unit of
mass (the kilogram, kg) and a unit of charge (the coulomb, C).9
These units are inappropriate in the world of fundamental physics: the radius of
a nucleus is of the order of 10−15 m (also called one femtometer or one fermi, fm);
the mass of an electron is of the order of 10−30 kg; the charge of an electron is (in
absolute value) of the order of 10−19 C. Using such units, we would carry along a lot
of exponents. Thus, in particle physics, we rather use units like the electron charge
9 For reasons related only to metrology (reproducibility and accuracy of the definition) in the standard
SI the unit of electrical current, the ampere A, is used instead of the coulomb; the two definitions
are however conceptually equivalent.
for the electrostatic charge, and the electron-volt eV and its multiples (keV, MeV,
GeV, TeV) for the energy:
Length: 1 fm = 10⁻¹⁵ m
Mass: 1 MeV/c² ≃ 1.78 × 10⁻³⁰ kg
Charge: |e| ≃ 1.602 × 10⁻¹⁹ C.
Note the unit of mass, in which the relation E = mc² is used implicitly: what one is doing here is to use 1 eV ≃ 1.602 × 10⁻¹⁹ J as the new fundamental unit of energy. In these new units, the mass of a proton is about 0.938 GeV/c², and the mass of the electron is about 0.511 MeV/c². The fundamental energy level of a hydrogen atom is about −13.6 eV.
In addition, nature provides us with two constants which are particularly appropriate in the world of fundamental physics: the speed of light, c ≃ 3.00 × 10⁸ m/s = 3.00 × 10²³ fm/s, and Planck's constant over 2π, ℏ ≃ 1.05 × 10⁻³⁴ J s ≃ 6.58 × 10⁻¹⁶ eV s.
It seems then natural to express speeds in terms of c, and angular momenta in terms of ℏ. We then switch to the so-called Natural Units (NU). The minimal set of natural units (not including electromagnetism) can then be chosen as a unit of speed (c), a unit of angular momentum (ℏ), and a unit of energy (the eV).
After the convention ℏ = c = 1, one single unit can be used to describe the mechanical Universe: we choose energy, and we can thus express all mechanical quantities in terms of eV and of its multiples. It is immediate to express momenta and masses directly in NU. To express 1 m and 1 s, we can write

1 m = (1 m)/(ℏc) ≃ 5.07 × 10¹² MeV⁻¹
1 s = (1 s)/ℏ ≃ 1.52 × 10²¹ MeV⁻¹
1 kg = (1 kg) c² ≃ 5.62 × 10²⁹ MeV .

Both length and time are thus, in natural units, expressed as inverses of energy. The first relation can also be written as 1 fm ≃ 5.07 GeV⁻¹; note that when you have a quantity expressed in MeV⁻¹, in order to express it in GeV⁻¹, you must multiply (and not divide) by a factor of 1000.
Let us now find a general rule to transform quantities expressed in natural units into SI, and vice versa. To express a quantity given in NU back in SI, we first restore the ℏ and c factors by dimensional arguments and then use the conversion factors ℏ and c (or ℏc) to evaluate the result. The dimensions of c are [m s⁻¹]; the dimensions of ℏ are [kg m² s⁻¹].
The vice versa (from SI to NU) is also easy. A quantity with meter–kilogram–second [m·kg·s] dimensions M^p L^q T^r (where M represents mass, L length, and T time) has the NU dimensions [E^{p−q−r}], where E represents energy. Since ℏ and c do not appear in NU, this is the only relevant dimension, and dimensional checks and estimates are very simple. The quantity Q in SI can be expressed in NU as

Q_NU = Q_SI × (5.62 × 10²⁹ MeV/kg)^p × (5.07 × 10¹² MeV⁻¹/m)^q × (1.52 × 10²¹ MeV⁻¹/s)^r ,

which carries the dimension MeV^{p−q−r}.
The NU and SI dimensions are listed for some important quantities in Table 2.1.
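As an illustration (a sketch of mine, not from the text), the conversion rule above is easy to code; a good sanity check is that ℏ and c come out equal to 1, and the electron mass to about 0.511 MeV:

```python
# Conversion factors quoted in the text (natural units based on the MeV)
KG_TO_MEV   = 5.62e29   # 1 kg -> 5.62e29 MeV
M_TO_INVMEV = 5.07e12   # 1 m  -> 5.07e12 MeV^-1
S_TO_INVMEV = 1.52e21   # 1 s  -> 1.52e21 MeV^-1

def si_to_nu(q_si, p, q, r):
    """Convert a quantity of SI dimension M^p L^q T^r into natural units.

    The result carries the dimension MeV^(p - q - r).
    """
    return q_si * KG_TO_MEV**p * M_TO_INVMEV**q * S_TO_INVMEV**r

# Sanity checks: hbar and c must be ~1 (dimensionless) in NU,
# and the electron mass (9.11e-31 kg) must come out ~0.511 MeV.
hbar_nu = si_to_nu(1.055e-34, 1, 2, -1)   # hbar = 1.055e-34 kg m^2 s^-1
c_nu    = si_to_nu(2.998e8,   0, 1, -1)   # c    = 2.998e8  m s^-1
m_e_nu  = si_to_nu(9.11e-31,  1, 0,  0)   # -> MeV
```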
Note that, choosing natural units, all factors of ℏ and c may be omitted from equations, which leads to considerable simplifications (we will profit from this in the next chapters). For example, the relativistic energy relation
E 2 = p 2 c2 + m 2 c4 (2.154)
becomes
E 2 = p2 + m 2 (2.155)
For example, the quantity

e²/(4πε₀)  (2.156)

has the dimension of [J · m], and thus is a pure, dimensionless number in NU. Dividing by ℏc one has

e²/(4πε₀ ℏc) ≃ 1/137 .  (2.157)

If we also incorporate ε₀ into the definition of the charge (setting ε₀ = 1), the fine-structure constant reads

α = e²/4π ≃ 1/137 .  (2.158)

This is called the Lorentz–Heaviside convention. The elementary charge in NU becomes then a pure number:

e ≃ 0.303 .  (2.159)
As an example, the Thomson cross-section (the low-energy cross-section for the scattering of a photon on a free electron) is, in NU,

σ_T ≃ 8πα²/(3m_e²) .  (2.160)

To express it in SI units, we write

σ_T ≃ (8πα²/(3m_e²)) ℏ^a c^b  (2.161)

and determine a and b such that the result has the dimension of a length squared. We find a = 2 and b = −2; thus

σ_T ≃ 8πα² (ℏc)² / (3 (m_e c²)²) .  (2.162)
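Numerically (a quick check of mine, using the standard values ℏc ≃ 197.33 MeV fm and m_e c² ≃ 0.511 MeV), Eq. (2.162) reproduces the well-known value σ_T ≃ 0.665 b ≃ 6.65 × 10⁻²⁹ m²:

```python
import math

ALPHA    = 1 / 137.036   # fine-structure constant
HBARC_FM = 197.327       # hbar*c in MeV fm
ME_MEV   = 0.511         # electron rest energy m_e c^2 in MeV

# sigma_T = (8 pi alpha^2 / 3) * (hbar c)^2 / (m_e c^2)^2
sigma_T_fm2 = (8 * math.pi * ALPHA**2 / 3) * (HBARC_FM / ME_MEV)**2
sigma_T_m2  = sigma_T_fm2 * 1e-30    # 1 fm^2 = 1e-30 m^2
sigma_T_b   = sigma_T_fm2 / 100.0    # 1 barn = 100 fm^2
```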
Another important quantity is the Compton wavelength of a particle of mass m,

λ_C = h/(mc) = 2πℏ/(mc) .  (2.163)

The Compton wavelength sets the distance scale at which quantum field theory becomes crucial for understanding the behavior of a particle: wave and particle descriptions become complementary at this scale.
On the other hand, we can compute for any mass m the associated Schwarzschild radius R_S, such that compressing the mass to a size smaller than this radius we would form a black hole. The Schwarzschild radius is the scale at which general relativity becomes crucial for understanding the behavior of the object:

R_S = 2G_N m/c²  (2.164)

where G_N is the gravitational constant.
We call Planck mass the mass at which the Schwarzschild radius of a particle becomes equal to its Compton length, and Planck length their common value when this happens. A probe that could locate a particle within this distance would collapse into a black hole, something that would make measurements very weird. In NU, one can write

2π/m_P = 2G_N m_P  →  m_P = √(π/G_N) .  (2.165)

Restoring the ℏ and c factors,

m_P = √(πℏc/G_N) ≃ 3.86 × 10⁻⁸ kg ≃ 2.16 × 10¹⁹ GeV/c² .  (2.166)

Since we are talking about orders of magnitude, the factor √π is often neglected, and we take as a definition

m_P = √(ℏc/G_N) ≃ 2.18 × 10⁻⁸ kg ≃ 1.22 × 10¹⁹ GeV/c² .  (2.167)
Besides the Planck length ℓ_P, we can also define a Planck time t_P = ℓ_P/c (their values are equal in NU):

ℓ_P = t_P = 1/m_P = √(G_N)  (2.168)

(this corresponds to a length of about 1.6 × 10⁻²⁰ fm, and to a time of about 5.4 × 10⁻⁴⁴ s).
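Plugging in the SI values of ℏ, c and G_N (my own quick check, not from the text) reproduces the numbers quoted above:

```python
import math

G_N  = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.055e-34    # reduced Planck constant, J s
C    = 2.998e8      # speed of light, m s^-1
GEV  = 1.602e-10    # J per GeV

m_P     = math.sqrt(HBAR * C / G_N)      # Planck mass,   ~2.18e-8 kg
l_P     = math.sqrt(HBAR * G_N / C**3)   # Planck length, ~1.6e-35 m
t_P     = l_P / C                        # Planck time,   ~5.4e-44 s
m_P_GeV = m_P * C**2 / GEV               # ~1.22e19 GeV/c^2
```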
Both general relativity and quantum field theory are needed to understand the
physics at mass scales about the Planck mass or distances about the Planck length,
or times comparable to the Planck time. Traditional quantum physics and gravita-
tion certainly fall short at this scale; since this failure should be independent of the
reference frame, many scientists think that the Planck scale should be an invariant
irrespective of the reference frame in which it is calculated (this fact would of course
require important modifications to the theory of relativity).
Note that the shortest length you may probe with the energy of a particle accelerated by the LHC is about 10¹⁵ times larger than the Planck length scale. Cosmic rays, which can reach c.m. energies beyond 100 TeV, are at the frontier of the exploration of fundamental scales.
Further Reading
Exercises
3. LHC collisions. The LHC running parameters in 2012 were, for a c.m. energy √s ≃ 8 TeV: number of bunches ≃ 1400; time interval between bunches ≃ 50 ns; number of protons per bunch ≃ 1.1 × 10¹¹; beam width at the crossing point ≃ 16 µm.
(a) Determine the maximum instantaneous luminosity of the LHC in 2012.
(b) Determine the number of interactions per collision (σ pp ∼ 100 mb).
(c) As you probably heard, the LHC found a particle called the Higgs boson, which Leon Lederman had called the "God particle" (a name the press likes very much).
If Higgs bosons are produced with a cross-section σ H ∼ 21 pb, determine the
number of Higgs bosons decaying into 2 photons (B R(H → γγ) 2.28 ×
10−3 ) which might have been produced in 2012 in the LHC, knowing that
the integrated luminosity of the LHC (luminosity integrated over the time)
during 2012 was around 20 fb−1 . Compare it to the real number of detected
Higgs in this particular decay mode reported by the LHC collaborations
(about 400). Discuss the difference.
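A back-of-the-envelope solution sketch (mine, not from the book; it approximates the crossing rate as simply 1/Δt, ignoring gaps in the bunch train, so published figures differ somewhat):

```python
import math

n_protons = 1.1e11    # protons per bunch
sigma_x   = 16e-4     # beam width at the crossing point, cm (16 um)
dt        = 50e-9     # bunch spacing, s
f_cross   = 1 / dt    # crossing rate, ~2e7 Hz (rough approximation)

# (a) instantaneous luminosity, cm^-2 s^-1
L_inst = n_protons**2 * f_cross / (4 * math.pi * sigma_x**2)

# (b) mean number of pp interactions per bunch crossing ("pileup")
sigma_pp = 100e-27    # 100 mb in cm^2
mu = sigma_pp * L_inst / f_cross

# (c) Higgs -> gamma gamma events produced in 2012
lumi_int   = 20e15    # 20 fb^-1 in b^-1
sigma_H    = 21e-12   # 21 pb in b
br_gg      = 2.28e-3
n_higgs_gg = sigma_H * lumi_int * br_gg
```

With these rough inputs one gets L of order 10³³–10³⁴ cm⁻² s⁻¹, a pileup of a few tens, and close to a thousand H → γγ events produced; comparing with the roughly 400 detected events illustrates the role of detector acceptance and selection efficiency.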
4. Classical electromagnetism is not a consistent theory. Consider two electrons at
rest, and let r be the distance between them. The (repulsive) force between the
two electrons is the electrostatic force
1 e2
F= ,
4π0 r 2
where e is the charge of the electron; it is directed along the line joining the two
charges. But an observer moving with a velocity v perpendicular to the line joining the two charges will also measure a magnetic force (still directed as F), so that the total force becomes

F′ = (1/(4πε₀)) e²/r² − (μ₀/(4π)) e²v²/r² = (1 − v²/c²) F .
The expression of the force is thus different in the two frames of reference. But
masses, charges, and accelerations are classically invariant. Comment.
5. GZK threshold. The Cosmic Microwave Background fills the Universe with pho-
tons with a peak energy of 0.37 meV and a density of ρ ∼ 400/cm3 . Determine:
(a) The minimal energy (known as the GZK threshold) that a proton should have in order that the reaction pγ → Δ⁺ may occur.
(b) The interaction length of such protons in the Universe considering a mean
cross-section above the threshold of 0.6 mb.
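A minimal numerical sketch for this exercise (mine; it assumes a head-on collision and takes m_Δ ≃ 1.232 GeV for the Δ⁺ resonance, a value not given in the text):

```python
# Threshold for p + gamma -> Delta+: s = m_p^2 + 4 E_p E_gamma >= m_Delta^2
# (head-on collision, ultrarelativistic proton); energies in GeV.
M_P     = 0.938       # proton mass, GeV
M_DELTA = 1.232       # Delta(1232) mass, GeV -- assumption, not from the text
E_GAMMA = 0.37e-12    # CMB photon energy, GeV (0.37 meV)

E_th = (M_DELTA**2 - M_P**2) / (4 * E_GAMMA)   # GeV; of order 1e20 eV

# (b) interaction length: lambda = 1 / (n sigma)
n_cmb   = 400.0       # CMB photons per cm^3
sigma   = 0.6e-27     # 0.6 mb in cm^2
lam_cm  = 1.0 / (n_cmb * sigma)
lam_mpc = lam_cm / 3.086e24      # 1 Mpc = 3.086e24 cm
```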
6. p̄ production at the Bevatron. Antiprotons were first produced in the laboratory in 1955, in proton–proton fixed target collisions at an accelerator called the Bevatron (named for its ability to impart energies of billions of eV, i.e., Billions of eV Synchrotron), located at Lawrence Berkeley National Laboratory, US. The discovery resulted in the 1959 Nobel Prize in Physics for Emilio Segrè and Owen Chamberlain.
(a) Describe the minimal reaction able to produce antiprotons in such collisions.
(b) When a proton is confined in a nucleus, it cannot have arbitrarily low momenta, as one can understand from the Heisenberg principle; the actual value of its momentum is called the "Fermi momentum." Determine the minimal energy that the proton beam must have for antiprotons to be produced, considering that the target protons have a Fermi momentum of around 150 MeV/c.
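A numerical sketch of a possible solution (my own, not from the book): the minimal reaction conserving baryon number is pp → ppp p̄, and the most favorable configuration is a target proton moving head-on against the beam with the Fermi momentum.

```python
import math

M   = 0.938    # proton mass, GeV
P_F = 0.150    # Fermi momentum of the target proton, GeV/c

def s_invariant(p_beam, p_target):
    """Mandelstam s for a beam proton hitting a target proton moving
    along the beam axis (p_target < 0 means toward the beam)."""
    E_b = math.sqrt(p_beam**2 + M**2)
    E_t = math.sqrt(p_target**2 + M**2)
    return (E_b + E_t)**2 - (p_beam + p_target)**2

def threshold_beam_energy(p_target):
    """Smallest beam energy with s >= (4 m_p)^2, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if s_invariant(mid, p_target) >= (4 * M)**2:
            hi = mid
        else:
            lo = mid
    return math.sqrt(hi**2 + M**2)   # beam energy, GeV

E_rest  = threshold_beam_energy(0.0)    # target at rest: E = 7 m_p ~ 6.6 GeV
E_fermi = threshold_beam_energy(-P_F)   # target moving toward the beam
```

The Fermi motion lowers the required beam energy by roughly a GeV, which is why the Bevatron, designed for about 6.5 GeV, could produce antiprotons on nuclear targets.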
7. Photon conversion. Consider the conversion of one photon into one electron–positron pair. Determine the minimal energy that the photon must have for this conversion to be possible in the presence of:
(a) one proton;
(b) one electron;
(c) when no charged particle is around.
8. π⁻ decay. Consider the decay in flight of a π⁻ into μ⁻ν̄μ and suppose that the μ⁻ was emitted along the line of flight of the π⁻. Determine:
(a) The energy and momentum of the μ− and of the ν¯μ in the π − frame.
(b) The energy and momentum of the μ− and of the ν¯μ in the laboratory frame,
if the momentum Pπ− = 100 GeV/c.
(c) Same as the previous question, but considering now that it was the ν̄μ that was emitted along the line of flight of the π⁻.
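A sketch of the two-body kinematics for parts (a) and (b) (mine, not from the book; masses from standard tables, energies in GeV):

```python
import math

M_PI = 0.139570    # pi- mass, GeV
M_MU = 0.105658    # mu- mass, GeV

# (a) In the pion rest frame the two-body decay is monochromatic:
p_star = (M_PI**2 - M_MU**2) / (2 * M_PI)    # ~29.8 MeV/c
E_mu_s = (M_PI**2 + M_MU**2) / (2 * M_PI)    # muon energy in the rest frame
E_nu_s = p_star                              # massless neutrino: E* = p*

# (b) Boost to the lab (pion momentum 100 GeV/c), muon emitted forward:
p_pi  = 100.0
E_pi  = math.sqrt(p_pi**2 + M_PI**2)
gamma = E_pi / M_PI
bg    = p_pi / M_PI                          # beta * gamma

E_mu_lab = gamma * E_mu_s + bg * p_star      # forward muon: almost all E_pi
E_nu_lab = gamma * E_nu_s - bg * p_star      # backward neutrino: very soft
```

The forward muon carries essentially the full pion energy, while the backward-emitted neutrino is left with only a few tens of keV; exchanging the roles, as in part (c), reverses the situation.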
9. π 0 decay. Consider the decay of a π 0 into γγ (with pion momentum of
100 GeV/c). Determine:
(a) The minimal and the maximal angles between the two photons in the labo-
ratory frame.
(b) The probability of having one of the photons with an energy smaller than
an arbitrary value E 0 in the laboratory frame.
(c) Same as (a) but considering now that the decay of the π 0 is into e+ e− .
(d) The maximum momentum that the π 0 may have in order that the maximal
angle in its decay into γγ and in e+ e− would be the same.
10. Invariant flux. In a collision between two particles a and b, the incident flux is given by F = 4|v_a − v_b| E_a E_b, where v_a, v_b, E_a, and E_b are, respectively, the vectorial speeds and the energies of particles a and b.
(a) Verify that the above formula is equivalent to F = 4√((P_a · P_b)² − (m_a m_b)²), where P_a and P_b are, respectively, the four-vectors of particles a and b, and m_a and m_b their masses.
(b) Relate the expressions of the flux in the center-of-mass and in the laboratory
reference frames.
τ_μ = 192π³ / (G_F² m_μ⁵)
This chapter illustrates the path which led to the discovery that
particles of extremely high energy, up to a few joule, come from
extraterrestrial sources and collide with Earth’s atmosphere.
The history of this discovery started in the beginning of the
twentieth century, but many of the techniques then introduced
are still in use. A relevant part of the progress happened in
recent years and has a large impact on the physics of elementary
particles and fundamental interactions.
By 1785, Coulomb found that electroscopes (Fig. 3.1) can discharge spontaneously,
and not simply due to defective insulation. The British physicist Crookes, in 1879,
observed that the speed of discharge decreased when the pressure of the air inside the
electroscope itself was reduced. The discharge was then likely due to the ionization
of the atmosphere. But what was the cause of atmospheric ionization?
The explanation came in the early twentieth century and led to the revolutionary
discovery of cosmic rays. We know today that cosmic rays are particles of extrater-
restrial origin which can reach high energy (much larger than we shall ever be able to
produce). They were the only source of high energy beams till the 1940s. World War
II and the Cold War provided new technical and political resources for the study of
elementary particles; technical resources included advances in microelectronics and
the capability to produce high energy particles in human-made particle accelerators.
From 1955 on, particle physics experiments were largely dominated by accelerators, at least until the beginning of the 1990s, when the explorations possible with the energies one can produce on Earth started showing signs of saturation; nowadays cosmic rays are again at the edge of physics.
Fig. 3.1 The electroscope is a device for detecting electric charge. A typical electroscope (the
configuration in the figure was invented at the end of the eighteenth century) consists of a vertical
metal rod from the end of which two gold leaves hang. A disk or ball is attached to the top of
the rod. The leaves are enclosed in a glass vessel, for protection against air movements. The test
charge is applied to the top, charging the rod, and the gold leaves repel and diverge. By Sylvanus P.
Thompson [public domain], via Wikimedia Commons
Radium B, ... indicated several isotopes of the element today called radon, and also
some different elements) underwent transmutations by which they generated radioactivity; these processes were called "radioactive decays." A charged electroscope promptly discharges in the presence of radioactive materials. It was concluded that the discharge was due to the emission of charged particles, which induce the formation of ions in the air, causing the discharge of electroscopes. The discharge rate of electroscopes was used to gauge the radioactivity level. During the first decade of the
twentieth century, several researchers in Europe and in the New World presented
progress on the study of ionization phenomena.
Around 1900, C.T.R. Wilson1 in Britain and Elster and Geitel in Germany
improved the sensitivity of the electroscope, by improving the technique for its
insulation in a closed vessel (Fig. 3.2). This improvement allowed the quantitative
measurement of the spontaneous discharge rate, and led to the conclusion that the
radiation causing this discharge came from outside the vessel. Concerning the origin
of such radiation, the simplest hypothesis was that its would be related to radioactive
material in the surrounding of the apparatus. Terrestrial origin was thus a com-
monplace assumption, although experimental confirmation could not be achieved.
Wilson did suggest that atmospheric ionization could be caused by a very penetrating
radiation of extraterrestrial origin. His investigations in tunnels, with solid rock for
1 Charles Thomson Rees Wilson, (1869–1959), a Scottish physicist and meteorologist, received the
Nobel Prize in Physics for his invention of the cloud chamber; see the next chapter.
3.1 The Puzzle of Atmospheric Ionization and the Discovery of Cosmic Rays 73
Fig. 3.2 Left The two friends Julius Elster and Hans Geitel, gymnasium teachers in Wolfenbuttel,
around 1900. Credit http://www.elster-geitel.de. Right an electroscope developed by Elster and
Geitel in the same period (private collection R. Fricke; photo by A. De Angelis)
shielding overhead, however, could not support the idea, as no reduction in ionization was observed. The hypothesis of an extraterrestrial origin, though now and then discussed, was dropped for many years.
By 1909, measurements on the spontaneous discharge had proved that the dis-
charging background radiation was also present in insulated environments and could
penetrate metal shields. It was thus difficult to explain it in terms of α (He nuclei)
or β (electron) radiation; it was thus assumed to be γ radiation, i.e., made of pho-
tons, which is the most penetrating among the three kinds of radiation known at
the time. Three possible sources were then hypothesized for this radiation: it could
be extraterrestrial (possibly from the Sun), it could be due to radioactivity from the
Earth's crust, or to radioactivity in the atmosphere. It was generally assumed that there had to be a large contribution from radioactive materials in the crust, and calculations of its expected decrease with height were performed.
Father Theodor Wulf, a German scientist and a Jesuit priest, thought of checking
the variation of ionization with height to test its origin. In 1909, using an improved
electroscope in which the two leaves had been replaced by metal coated silicon glass
wires, making it easier to transport than previous instruments (Fig. 3.3), he measured
the ionization rate at the top of the Eiffel tower in Paris, about 300 m high. Under
the hypothesis that most of the radiation was of terrestrial origin, he expected the
74 3 Cosmic Rays and the Development of Particle Physics
Fig. 3.3 Left Scheme of the Wulf electroscope (drawn by Wulf himself; reprinted from Z. Phys.
[public domain]). The main cylinder was made of zinc, 17 cm in diameter and 13 cm deep. The
distance between the two silicon glass wires (at the center) was measured using the microscope to
the right. The wires were illuminated using the mirror to the left. According to Wulf, the sensitivity of
the instrument was 1 V, as measured by the decrease of the inter-wire distance. Right an electroscope
used by Wulf (private collection R. Fricke; photo by A. De Angelis)
ionization rate to be significantly smaller than the value at ground. The measured
decrease was however too small to confirm the hypothesis: he observed that the
radiation intensity “decrease at nearly 300 m [altitude] was not even to half of its
ground value,” while “just a few percent of the radiation” should remain if it did
emerge from ground. Wulf’s data, coming from experiments performed for many
days at the same location and at different hours of the day, were of great value, and
for long considered the most reliable source of information on the altitude variation
of the ionization rate. However, his conclusion was that the most likely explanation
for this unexpected result was still emission from ground.
The conclusion that atmospheric ionization was mostly due to radioactivity from
the Earth’s crust was challenged by the Italian physicist Domenico Pacini. Pacini
developed a technique for underwater measurements and conducted experiments in
the sea of the Gulf of Genova and in the Lake of Bracciano (Fig. 3.4). He found
a significant decrease in the discharge rate in electroscopes placed three meters
underwater. He wrote: “Observations carried out on the sea during the year 1910 led
me to conclude that a significant proportion of the pervasive radiation that is found
in air has an origin that is independent of direct action of the active substances in the
upper layers of the Earth’s surface. [...] [To prove this conclusion] the apparatus [...]
was enclosed in a copper box so that it could immerse in depth. [...] Observations
were performed with the instrument at the surface, and with the instrument immersed
Fig. 3.4 Left Pacini making a measurement in 1910. Courtesy of the Pacini family, edited by A.
De Angelis [public domain, via Wikimedia Commons]. Right the instruments used by Pacini for
the measurement of ionization. By D. Pacini (Ufficio Centrale di Meteorologia e Geodinamica),
edited by A. De Angelis [public domain, via Wikimedia Commons]
in water, at a depth of 3 m.” Pacini measured the discharge of the electroscope during
3 h, and repeated the measurement seven times. At the surface, the average ionization
rate was 11.0 ions per cubic centimeter per second, while he measured 8.9 ions per
cubic centimeter per second at a depth of 3 m in the 7 m deep sea (the depth of the
water guaranteed that radioactivity from the soil was negligible). He concluded that
the decrease of about 20 % was due to a radiation not coming from the Earth.
After Wulf’s observations on the altitude effect, the need for balloon experiments
(widely used for atmospheric electricity studies since 1885) became clear. The first
high-altitude balloon with the purpose of studying the penetrating radiation was
flown in Switzerland in December 1909 with a balloon from the Swiss aeroclub.
Albert Gockel, professor at the University of Fribourg, ascended to 4500 m above
sea level (a.s.l.). He made measurements up to 3000 m and found that the ionization rate did not decrease with altitude as expected under the hypothesis of terrestrial origin. His conclusion was that "a non-negligible part of the penetrating radiation is independent of the direct action of the radioactive substances in the uppermost layers of the Earth."
In spite of Pacini's conclusions, and of Wulf's and Gockel's puzzling results on the altitude dependence, the issue of the origin of the penetrating radiation still raised doubts. A series of balloon flights by the Austrian physicist Victor Hess² settled the issue, firmly establishing the extraterrestrial origin of at least part of the radiation causing the atmospheric ionization.
2 Hess was born in 1883 in Steiermark, Austria, and graduated at Graz University in 1906 where he
became professor of Experimental Physics in 1919. In 1936 Hess was awarded the Nobel Prize in
Physics for the discovery of cosmic rays. He moved to the USA in 1938 as professor at Fordham
University. Hess became an American citizen in 1944 and lived in New York until his death in 1964.
Fig. 3.5 Left Hess during the balloon flight in August 1912. [public domain], via Wikimedia
Commons. Right one of the electrometers used by Hess during his flight. This instrument is a
version of a commercial model of a Wulf electroscope especially modified by its manufacturer,
Günther and Tegetmeyer, to operate under reduced pressure at high altitudes (Smithsonian National
Air and Science Museum, Washington, DC). Photo by P. Carlson
Hess started by studying Wulf's results. He carefully checked the data on gamma-ray absorption coefficients (due to the extensive use of radioactive sources, he would eventually lose a thumb) and, after careful planning, he finalized his studies with balloon observations. The first ascensions took place in August 1911. From April 1912 to August 1912, he flew seven times, with three instruments (one of them with a thin wall to estimate the effect of β radiation, as for a given energy electrons have a shorter range than heavier particles). In the last flight, on August 7, 1912, he reached 5200 m (Fig. 3.5). The results clearly showed that the ionization rate first passed through a minimum and
then increased considerably with height (Fig. 3.6). “(i) Immediately above ground the
total radiation decreases a little. (ii) At altitudes of 1000–2000 m there occurs again
a noticeable growth of penetrating radiation. (iii) The increase reaches, at altitudes
of 3000–4000 m, already 50 % of the total radiation observed on the ground. (iv) At
4000–5200 m the radiation is stronger [more than 100 %] than on the ground.”
Hess concluded that the increase in the ionization rate with altitude was due to
radiation coming from above, and he thought that this radiation was of extraterrestrial
origin. His observations during the day and during the night showed no variation and
excluded the Sun as the direct source of this hypothetical radiation.
The results by Hess would later be confirmed by Kolhörster. In flights up to
9200 m, Kolhörster found an increase in the ionization rate up to ten times its value
at sea level. The measured attenuation length of about 1 km in air at NTP came as a
surprise, as it was eight times smaller than the absorption coefficient of air for γ rays
as known at the time.
Fig. 3.6 Variation of ionization with altitude. Left panel Final ascent by Hess (1912), carrying two
ion chambers. Right panel Ascents by Kolhörster (1913, 1914)
After the 1912 flights, Hess coined the name "Höhenstrahlung." Several other names were used for the extraterrestrial radiation: Ultrastrahlung, Ultra-X-Strahlung, kosmische Strahlung. The latter, used by Gockel and Wulf in 1909, inspired Millikan, who suggested the name "cosmic rays," which became generally accepted.
The idea of cosmic rays, despite the striking experimental evidence, was not immediately accepted (the Nobel Prize for the discovery of cosmic rays would be awarded to Hess only in 1936). During the 1914–1918 war and the years that followed, very few investigations of the penetrating radiation were performed. In 1926, however,
few investigations of the penetrating radiation were performed. In 1926, however,
Millikan and Cameron performed absorption measurements of the radiation at differ-
ent depths in lakes at high altitudes. They concluded that the radiation was made of
high energy γ rays and that “these rays shoot through space equally in all directions,”
and called them “cosmic rays.”
With the development of cosmic ray physics, scientists knew that astrophysical
sources provided high energy particles which entered the atmosphere. The obvi-
ous next step was to investigate the nature of such particles, and to use them to probe
matter in detail, much in the same way as in the experiment conducted by Mars-
den and Geiger in 1909 (the Rutherford experiment, described in Chap. 2). Particle
physics thus started with cosmic rays, and many of the fundamental discoveries were
made thanks to cosmic rays.
In parallel, the theoretical understanding of the Universe was progressing quickly:
at the end of the 1920s, scientists tried to put together relativity and quantum mechan-
ics, and the discoveries following these attempts changed completely our view of
nature. A new window was going to be opened: antimatter.
⁴ The Geiger-Müller counter is a cylinder filled with a gas, with a charged metal wire inside. When a charged particle enters the detector, it ionizes the gas, and the ions and the electrons can be collected by the wire and by the walls. The electrical signal of the wire can be amplified and read by means of an amperometer. The voltage V of the wire is large (a few thousand volts), in such a way that the gas is completely ionized; the signal is then a short pulse of height independent of the energy of the particle. Geiger-Müller tubes can also be appropriate for detecting γ radiation, since a photoelectron or a Compton-scattered electron can generate an avalanche.
3.2 Cosmic Rays and the Beginning of Particle Physics 79
The Schrödinger equation⁵

iℏ ∂Ψ/∂t = −(ℏ²/2m) ∇²Ψ + VΨ

can be seen as the translation into the wave language of the Hamiltonian equation of classical mechanics

H = p²/2m + V ,

where the Hamiltonian (essentially the energy of the system) is represented by the operator

Ĥ = iℏ ∂/∂t

and momentum by

p̂ = −iℏ∇ .
The solutions of the equation are in general complex wavefunctions, which can be
seen as probability amplitudes.
5 Erwin Schrödinger was an Austrian physicist who obtained fundamental results in the fields of
quantum theory, statistical mechanics and thermodynamics, physics of dielectrics, color theory,
electrodynamics, cosmology, and cosmic-ray physics. He also paid great attention to the philosoph-
ical aspects of science, re-evaluating ancient and oriental philosophical concepts, and to biology
and to the meaning of life. He formulated the famous paradox of the Schrödinger cat. He shared
with P.A.M. Dirac the 1933 Nobel Prize for Physics “for the discovery of new productive forms of
atomic theory.”
80 3 Cosmic Rays and the Development of Particle Physics
Since the wavefunction is normalized,

∫ Ψ*(r, t) Ψ(r, t) dV = 1  (3.1)

(the integral is extended over all the volume), the probability to find the particle in an infinitesimal volume dV around a point r at a time t is

dP = Ψ*(r, t) Ψ(r, t) dV = |Ψ(r, t)|² dV .
The left term in Eq. (3.1) is defined as the scalar product of the function by itself.
The statistical interpretation introduces a kind of uncertainty into quantum
mechanics: even if you know everything the theory can tell you about the parti-
cle (its wavefunction), you cannot predict with certainty the outcome of a simple
experiment to measure its position: all the theory gives is statistical information
about the possible results.
Measurement: Operators. The expectation value of the measurement of, say, position along the x coordinate is thus given as

⟨x⟩ = ∫ dV Ψ* x Ψ  (3.2)

and one can easily demonstrate (see for example [F3.4] in the Further Readings) that the expectation value of the momentum along x is

⟨p_x⟩ = ∫ dV Ψ* (−iℏ ∂/∂x) Ψ .  (3.3)
In these two examples we saw that measurements are represented by operators acting on wavefunctions. The operator x represents position along x, and the operator (−iℏ ∂/∂x) represents the x component of momentum, p_x. When ambiguity is possible, we put a "hat" on top of the operator to distinguish it from the corresponding physical quantity.
To calculate expectation values, we put the appropriate operator in a sandwich between Ψ* and Ψ, and integrate. If A is a quantity and Â the corresponding operator,

⟨A⟩ = ∫ dV Ψ* Â Ψ .  (3.4)
Dirac Notation. In the Dirac notation, a vector (an eigenstate in this case) is identified by the symbol |ψ⟩, and is called a ket; the symbol ⟨φ| is called a bra. The bracket ⟨φ|ψ⟩ is the scalar product of the two vectors:

⟨φ|ψ⟩ = ∫ dV φ* ψ .
and thus if we want all expectation values (and the results of any measurement) to be real, $\hat{A}$ must be a Hermitian operator (i.e., such that $\hat{A}^\dagger = \hat{A}$).
Now let us call $\psi_i$ the eigenvectors of $\hat{A}$ (which form a basis) and $a_i$ the corresponding eigenvalues; for any pair $\psi_m$, $\psi_n$
$$\hat{A}\, |\psi_m\rangle = a_m |\psi_m\rangle \qquad \hat{A}\, |\psi_n\rangle = a_n |\psi_n\rangle$$
and thus
$$\langle \psi_n | \psi_m \rangle = \delta_{mn} \,.$$
Hermitian operators are thus good operators for representing the measurement of physical quantities: their eigenvalues are real (and thus so is the result of any measurement), and the eigenvectors form an orthonormal basis.
When we measure a value, we obtain a well-defined value: thus the wavefunction
“collapses” to an eigenfunction, and the measured value is one of the eigenvalues of
the measurement operator.
Schrödinger’s Equation: Meaning of the Eigenvalues. Schrödinger’s equation
is an equation for which the eigenvectors are eigenstates of definite energy. For a
potential V not explicitly dependent on time, it can be split in general into two
equations. One is a time-independent eigenvalue equation
$$\left( -\frac{\hbar^2}{2m}\, \nabla^2 + V \right) \psi(\vec{r}) = E\, \psi(\vec{r})$$
while the full wavefunction factorizes as
$$\Psi(\vec{r}, t) = \psi(\vec{r})\, \phi(t) \,.$$
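Substituting the factorized form into the time-dependent equation $i\hbar\,\partial_t\Psi = \hat{H}\Psi$ makes the time part explicit (a standard separation-of-variables step, sketched here for completeness):

```latex
i\hbar\,\frac{\partial}{\partial t}\bigl[\psi(\vec r)\,\phi(t)\bigr]
  = \hat H\,\bigl[\psi(\vec r)\,\phi(t)\bigr]
\;\Longrightarrow\;
i\hbar\,\frac{\dot\phi(t)}{\phi(t)} = \frac{\hat H\,\psi(\vec r)}{\psi(\vec r)} = E
\;\Longrightarrow\;
\phi(t) = e^{-iEt/\hbar}\,.
```

Both sides must equal the same constant, identified with the energy eigenvalue E of the time-independent equation.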
and we say that the two operators commute when their commutator is zero. We can
simultaneously measure observables whose operators commute, since such operators
have a complete set of simultaneous eigenfunctions—thus one can have two definite
measurements at the same time.
However, pairs of noncommuting operators cannot give rise to simultaneous measurements arbitrarily precise for the associated quantities (this is usually called Heisenberg’s uncertainty principle, but it is indeed a theorem).
Let us define as the spread of an operator $\hat{A}$ the operator
$$\Delta \hat{A} = \hat{A} - \langle A \rangle \,.$$
If two operators $\hat{A}$ and $\hat{B}$ satisfy
$$[\hat{A}, \hat{B}] = i\hat{C} \,,$$
one can show that
$$(\Delta A)^2 (\Delta B)^2 \geq \frac{\langle C \rangle^2}{4} \,. \qquad (3.5)$$
In particular, when a simultaneous measurement of position and momentum along an axis, say x, is performed, one has
$$\Delta x\, \Delta p_x \gtrsim \frac{\hbar}{2} \,.$$
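For a Gaussian wave packet, which is a minimum-uncertainty state, the bound is saturated; a quick numerical check (illustrative width, $\hbar = 1$):

```python
import numpy as np

# Check Delta x * Delta p = hbar/2 for a real Gaussian of width sigma (hbar = 1).
hbar = 1.0
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]
sigma = 1.3
psi = (2*np.pi*sigma**2)**(-0.25) * np.exp(-x**2/(4*sigma**2))

def expval(op_psi):
    return np.real(np.sum(np.conj(psi) * op_psi) * dx)

dpsi = np.gradient(psi, dx)        # first and second derivatives on the grid
d2psi = np.gradient(dpsi, dx)
spread_x = np.sqrt(expval(x**2 * psi) - expval(x * psi)**2)              # = sigma
spread_p = np.sqrt(expval(-hbar**2 * d2psi) - expval(-1j*hbar*dpsi)**2)  # = hbar/(2 sigma)
print(spread_x * spread_p)         # ~ 0.5 = hbar/2
```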
Somehow linked to this is the fact that energy is not defined with absolute precision but, if measured in a time $\Delta t$, has an uncertainty $\Delta E$ such that
$$\Delta E\, \Delta t \sim \hbar$$
(energy conservation can be violated for short times). The value of the reduced Planck constant, $\hbar \simeq 6.58 \times 10^{-22}$ MeV·s, is anyway small with respect to the value corresponding to the energies needed to create particles living for a reasonable (detectable) time.
Limits of the Schrödinger’s Equation. Since Schrödinger’s equation contains deriv-
atives of different order with respect to space and time, it cannot be relativistically
covariant, and thus, it cannot be the “final” equation. How can it be extended to
be consistent with Lorentz invariance? We must translate relativistically covariant
Hamiltonians in the quantum language, i.e., into equations using wavefunctions. We
shall see in the following two approaches.
In the case of a free particle (V = 0), the simplest way to extend the Schrödinger’s
equation to take into account relativity is to write the Hamiltonian equation
$$\hat{H}^2 = \hat{p}^2 c^2 + m^2 c^4$$
$$\Longrightarrow\quad -\hbar^2\, \frac{\partial^2 \Psi}{\partial t^2} = -\hbar^2 c^2\, \nabla^2 \Psi + m^2 c^4\, \Psi \,,$$
or, in natural units,
$$-\frac{\partial^2 \Psi}{\partial t^2} + \nabla^2 \Psi = m^2\, \Psi \,.$$
Solutions of this equation (the Klein-Gordon equation6) have both positive and negative eigenvalues for energy. For every plane wave solution of the form
$$\Psi(\vec{r}, t) = N\, e^{i(\vec{p}\cdot\vec{r} - E_p t)} \,, \qquad E_p = \sqrt{p^2 + m^2} \geq m \,,$$
there is a solution
$$\Psi^*(\vec{r}, t) = N^*\, e^{i(-\vec{p}\cdot\vec{r} + E_p t)}$$
6 Oskar Klein (1894–1977) was a Swedish theoretical physicist; Walter Gordon (1893–1939) was a German theoretical physicist, former student of Max Planck.
with energy
$$E = -E_p = -\sqrt{p^2 + m^2} \leq -m \,.$$
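Both signs indeed solve the equation; a small symbolic check in one spatial dimension (natural units; an illustrative sketch, not code from the book):

```python
import sympy as sp

# Verify that plane waves with E = +E_p and E = -E_p both satisfy the free
# Klein-Gordon equation  -d^2Psi/dt^2 + d^2Psi/dx^2 = m^2 Psi  (1D, natural units).
t, x = sp.symbols('t x', real=True)
p, m = sp.symbols('p m', positive=True)
Ep = sp.sqrt(p**2 + m**2)

def kg_residual(E):
    Psi = sp.exp(sp.I*(p*x - E*t))
    return sp.simplify(-sp.diff(Psi, t, 2) + sp.diff(Psi, x, 2) - m**2*Psi)

print(kg_residual(+Ep))   # 0 -> positive energy solution
print(kg_residual(-Ep))   # 0 -> negative energy solution
```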
Note that one cannot simply drop the solutions with negative energy as “unphys-
ical”: the full set of eigenstates is needed, because, if one starts from a given wave
function, this could evolve with time into a wave function that, in general, has pro-
jections on all eigenstates (including those one would like to get rid of). We remind
the reader that these are solutions of an equation describing a free particle.
A final comment about notation. The (classical) Schrödinger equation for a single particle in a time-independent potential can be decoupled into two equations: one (the so-called eigenvalue equation) depending just on space, and the other depending just on time. The solution of the eigenvalue equation is normally indicated by a lowercase Greek symbol, $\psi(\vec{r})$ for example, while the time part has a solution independent of the potential, $e^{-iEt/\hbar}$. The wavefunction is indicated by a capital letter:
$$\Psi(\vec{r}, t) = \psi(\vec{r})\, e^{-iEt/\hbar} \,.$$
This distinction makes no sense for relativistic equations, and in particular for the Klein-Gordon equation and for the Dirac equation, which will be discussed later. Both $\Psi(x)$ and $\psi(x)$ are now valid notations for indicating a wavefunction which is a function of the 4-vector $x = (ct, x, y, z)$.
Dirac7 in 1928 searched for an alternative relativistic equation, starting from the generic form describing the evolution of a wave function:
$$i\hbar\, \frac{\partial \Psi}{\partial t} = \hat{H}\, \Psi$$
with a Hamiltonian operator linear in $\hat{\vec{p}}$ (Lorentz invariance requires that if the Hamiltonian has first derivatives with respect to time, the spatial derivatives should also be of first order):
$$\hat{H} = c\, \vec{\alpha} \cdot \hat{\vec{p}} + \beta\, mc^2 \,.$$
7 Paul Adrien Maurice Dirac (Bristol, UK, 1902—Tallahassee, US, 1984) was one of the founders of quantum physics. After graduating in engineering and later studying physics, he became professor of mathematics in Cambridge. In 1933 he shared the Nobel Prize with Schrödinger. He assigned to the concept of “beauty in mathematics” a prominent role among the basic aspects intrinsic to nature, so far as to argue that “a mathematically beautiful theory is more likely to be right than an unpleasant one that is confirmed by the data.”
$$\alpha_i^2 = 1 \,; \qquad \beta^2 = 1$$
$$\alpha_i \beta + \beta \alpha_i = 0$$
$$\alpha_i \alpha_j + \alpha_j \alpha_i = 0 \quad (i \neq j) \,.$$
For a plane wave solution, the spinor8 $u(\vec{p})$ must then satisfy
$$(c\, \vec{\alpha} \cdot \vec{p} + \beta\, mc^2)\, u(\vec{p}) = E\, u(\vec{p}) \,.$$
This equation has four solutions: two with positive energy E = +E p and two with
negative energy E = −E p . We discuss later the interpretation of the negative energy
solutions.
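The anticommutation relations and the doubled $\pm E_p$ spectrum can be verified with an explicit $4\times 4$ representation (the standard Dirac-Pauli one; the numerical momentum below is an arbitrary illustration, natural units):

```python
import numpy as np

# Pauli matrices and the Dirac-Pauli representation of alpha_i and beta.
s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], dtype=complex)]
Z, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
alpha = [np.block([[Z, si], [si, Z]]) for si in s]
beta = np.block([[I2, Z], [Z, -I2]])

I4 = np.eye(4)
for i in range(3):
    assert np.allclose(alpha[i] @ alpha[i], I4)               # alpha_i^2 = 1
    assert np.allclose(alpha[i] @ beta + beta @ alpha[i], 0)  # {alpha_i, beta} = 0
    for j in range(3):
        if i != j:
            assert np.allclose(alpha[i] @ alpha[j] + alpha[j] @ alpha[i], 0)
assert np.allclose(beta @ beta, I4)                           # beta^2 = 1

# H = c alpha.p + beta m c^2 (c = 1): eigenvalues are +E_p and -E_p, twice each.
p, m = np.array([0.3, -0.2, 0.5]), 1.0
H = sum(pi * ai for pi, ai in zip(p, alpha)) + m * beta
Ep = np.sqrt(p @ p + m**2)
print(np.sort(np.linalg.eigvalsh(H)))   # [-Ep, -Ep, +Ep, +Ep]
```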
Dirac’s equation was a success. First, it accounted “for free” for the existence of two spin states (we remind the reader that spin had to be inserted by hand in the Schrödinger equation of nonrelativistic quantum mechanics). In addition, since spin is embedded in the equation, Dirac’s equation:
8 The term spinor indicates in general a vector which has definite transformation properties for a
rotation in the proper angular momentum space—the spin space. The properties of rotation in spin
space will be described in greater detail in Chap. 5 .
Fig. 3.7 Dirac picture of the vacuum. In normal conditions, the sea of negative energy states is
totally occupied with two electrons in each level. By Incnis Mrsi [own work, public domain], via
Wikimedia Commons.
• allows computing correctly the energy splitting of atomic levels with the same
quantum numbers due to the spin–orbit interaction in atoms (fine and hyperfine
splitting);
• explains the magnetic moment of point-like fermions.
The predictions on the values of the above quantities were incredibly precise and still withstand experimental tests.
Negative energy states must be occupied: if they were not, transitions from positive to
negative energy states would occur, and matter would be unstable. Dirac postulated
that the negative energy states are completely filled under normal conditions. In the
case of electrons the Dirac picture of the vacuum is a “sea” of negative energy states,
while the positive energy states are mostly free (Fig. 3.7). This condition cannot be
distinguished from the usual vacuum.
If an electron is added to the vacuum, it in general finds a place in the positive energy region, since all the negative energy states are occupied. If a negative energy electron
is removed from the vacuum, however, a new phenomenon happens: removing such
an electron with E < 0, momentum − p, spin − S and charge −e leaves a “hole”
indistinguishable from a particle with positive energy E > 0, momentum p, spin S
and charge +e. This is similar to the formation of holes in semiconductors. The two
cases are equivalent descriptions of the same phenomena. Dirac’s sea model thus
predicts the existence of a new fermion with mass equal to the mass of the electron,
but opposite charge. This particle, later called the positron, is the antiparticle of the
electron, and is the prototype of a new family of particles: antimatter.
Fig. 3.8 Left A cloud chamber built by Wilson in 1911. By C.T.R. Wilson [public domain],
via Wikimedia Commons. Right a picture of a collision in a cloud chamber [CC BY 4.0 http://
creativecommons.org/licenses/by/4.0] via Wikimedia Commons
During his doctoral thesis (supervised by Millikan), Anderson was studying the tracks
of cosmic rays passing through a cloud chamber9 in a magnetic field (Fig. 3.8). In
1933 he discovered antimatter in the form of a positive particle of mass consistent
with the electron mass, later called the positron (Fig. 3.9). Dirac’s equation prediction
was confirmed; this was a great achievement for cosmic ray physics. Anderson shared
with Hess the Nobel Prize for Physics in 1936; they were nominated by Compton,
with the following motivation:
The time has now arrived, it seems to me, when we can say that the so-called cosmic rays
definitely have their origin at such remote distances from the Earth that they may properly be
called cosmic, and that the use of the rays has by now led to results of such importance that
they may be considered a discovery of the first magnitude. [...] It is, I believe, correct to say
that Hess was the first to establish the increase of the ionization observed in electroscopes with
increasing altitude; and he was certainly the first to ascribe with confidence this increased
ionization to radiation coming from outside the Earth.
9 The cloud chamber (see also next chapter), invented by C.T.R. Wilson at the beginning of the
twentieth century, was an instrument for reconstructing the trajectories of charged particles. The
instrument is a container with a glass window, filled with air and saturated water vapor; the volume
could be suddenly expanded, bringing the vapor to a supersaturated (metastable) state. A charged
cosmic ray crossing the chamber produces ions, which act as seeds for the generation of droplets
along the trajectory. One can record the trajectory by taking a photographic picture. If the chamber
is immersed in a magnetic field, momentum and charge can be measured by the curvature. The
working principle of bubble chambers is similar to that of the cloud chamber, but here the fluid is
a liquid. Along the tracks’ trajectories, a trail of gas bubbles condenses around the ions. Bubble and cloud chambers provide complete information: the measurement of the bubble density and the range, i.e., the total track length before the particle eventually stops, provide an estimate for the energy and the mass; the angles of scattering provide an estimate for the momentum.
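The statement that momentum and charge follow from the curvature can be made quantitative with the standard rule of thumb p[GeV/c] ≈ 0.3 z B[T] r[m]; a minimal sketch with made-up numbers:

```python
# Momentum of a charged track from its curvature in a magnetic field, transverse
# to B:  p [GeV/c] ~= 0.3 * z * B [T] * r [m]   (numbers below are illustrative).
def momentum_gev(B_tesla, radius_m, z=1):
    return 0.3 * z * B_tesla * radius_m

# e.g. a singly charged track curving with r = 1 m in a 1.5 T field:
print(momentum_gev(1.5, 1.0))   # 0.45 GeV/c
```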
Fig. 3.9 The first picture by Anderson showing the passage of a cosmic antielectron, or positron,
through a cloud chamber immersed in a magnetic field. One can understand that the particle comes
from the bottom in the picture by the fact that, after passing through the sheet of material in the middle (and therefore losing energy), the radius of curvature decreases. The positive charge is inferred from the direction of bending in the magnetic field. The mass is estimated from the droplet density (a proton would lose energy faster). Since most cosmic rays come from the top, the first
evidence for antimatter comes thus from an unconventional event. From C.D. Anderson, “The
Positive Electron,” Physical Review 43 (1933) 491
At the end of the 1920s, Bothe and Kolhörster introduced the coincidence tech-
nique to study cosmic rays with the Geiger counter. A coincidence circuit activates
the acquisition of data only when signals from predefined detectors are received
within a given time window. The coincidence technique is widely used in particle
physics experiments, but also in other areas of science and technology. Walther Bothe
shared the Nobel Prize for Physics in 1954 with the motivation: “for the coincidence
method and his discoveries made therewith.” Coupling a cloud chamber to a system
of Geiger counters and using the coincidence technique, it was possible to take pho-
tographs only when a cosmic ray traversed the cloud chamber (we call today such a
system a “trigger”). This increased the chances of getting a significant photograph
and thus the efficiency of cloud chambers.
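The power of the coincidence method can be quantified by the accidental-coincidence rate of two independent counters, R_acc ≈ 2R₁R₂τ for a resolving time τ (a textbook estimate; the rates below are illustrative, not from the text):

```python
# Accidental (chance) coincidence rate of two independent counters with singles
# rates R1, R2 and resolving time tau:  R_acc ~= 2 * R1 * R2 * tau.
# A true coincidence from a traversing cosmic ray survives; random noise mostly doesn't.
def accidental_rate(R1_hz, R2_hz, tau_s):
    return 2.0 * R1_hz * R2_hz * tau_s

# e.g. two Geiger counters at 10 Hz each with a 1 ms resolving time:
print(accidental_rate(10.0, 10.0, 1e-3))   # 0.2 Hz of fake coincidences
```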
Soon after the discovery of the positron by Anderson, a new important observation
was made in 1933: the conversion of photons into pairs of electrons and positrons.
Dirac’s theory not only predicted the existence of antielectrons, but it also predicted
that electron–positron pairs could be created from a single photon with energy large
enough; the phenomenon was actually observed in cosmic rays by Blackett (Nobel
Prize for Physics in 1948) and Occhialini, who further improved the coincidence technique in Cambridge. Electron–positron pair production is a simple and direct con-
firmation of the mass-energy equivalence and thus of what is predicted by the theory
of relativity. It also demonstrates the behavior of light, confirming the quantum
concept which was originally expressed as “wave-particle duality”: the photon can
behave as a particle.
In 1934, the Italian Bruno Rossi10 reported the observation of the near-simultan-
eous discharge of two widely separated Geiger counters during a test of his equip-
ment. In the report, he wrote: “[...] it seems that once in a while the recording
equipment is struck by very extensive showers of particles, which causes coinci-
dences between the counters, even placed at large distances from one another.” In
1937 Pierre Auger, who was not aware of Rossi’s report, made a similar observation
and investigated the phenomenon in detail. He concluded that extensive showers
originate when high energy primary cosmic rays interact with nuclei high in the
atmosphere, leading to a series of interactions that ultimately yield a shower of par-
ticles that reach ground. This was the explanation of the spontaneous discharge of
electroscopes due to cosmic rays.
10 Bruno Rossi (Venice 1905—Cambridge, MA, 1993) graduated in Bologna, and then moved to Arcetri near Florence before becoming full professor of physics at the University of Padua in 1932. In Padua he was charged with overseeing the design and construction of the new Physics Institute, which was inaugurated in 1937. He was exiled in 1938, as a consequence of the Italian racial laws, and he moved to Chicago and then to Cornell. In 1943 he joined the Manhattan project in Los Alamos, working on the development of the atomic bomb, and after the end of the Second World War moved to MIT. At MIT Rossi started working on space missions as a scientific consultant for the newborn NASA, and proposed the rocket experiment that discovered the first extra-solar source of X-rays. Many fundamental contributions to modern physics, for example the electronic coincidence circuit, the discovery and study of extensive air showers, the East–West effect, and the use of satellites for the exploration of the high-energy Universe, are due to Bruno Rossi.
In 1935 the Japanese physicist Yukawa, 28 years old at that time, formulated his innovative theory explaining the “strong” interaction that ultimately keeps matter together (the strong interaction binds protons and neutrons in atomic nuclei). This
theory has been sketched in the previous chapter, and requires a “mediator” particle
of intermediate mass between the electron and the proton, thus called meson—the
word “meson” meaning “middle one.”
To account for the strong force, Yukawa predicted that the meson must have a
mass of about one-tenth of a GeV, a mass that would explain the rapid weakening of
the strong interaction with distance. The scientists studying cosmic rays started to
discover new types of particles of intermediate masses. Anderson, who after the Nobel
Prize had become a professor, and his student Neddermeyer, observed in 1937 a new
particle, present in both positive and negative charge, more penetrating than any other
particle known at the time. The new particle was heavier than the electron but lighter
than the proton, and they suggested for it the name “mesotron.” The mesotron mass,
measured from ionization, was between 200 and 240 times the electron mass; this fit Yukawa’s prediction for the meson. Most researchers were convinced that these particles were Yukawa’s carrier of the strong nuclear force, and that they
were created when primary cosmic rays collided with nuclei in the upper atmosphere,
in the same way as electrons emit photons when colliding with a nucleus.
The lifetime of the mesotron was measured by studying its flux at various altitudes, in particular by Rossi in Colorado (Rossi had been forced to emigrate to the United States to escape racial persecution); the result was about two microseconds (a hundred times larger than predicted by Yukawa for the particle that transmits the strong interaction). Rossi also found that at the end of its life the mesotron decays into an electron and other neutral particles (neutrinos) that did not leave tracks in the cloud chamber—the positive mesotron decays into a positive electron plus neutrinos.
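A back-of-the-envelope calculation shows why the altitude dependence of the flux measures the lifetime, and why relativistic time dilation matters (illustrative numbers, not Rossi's data):

```python
import math

# Attenuation of a flux of mesotrons (muons) over d = 10 km of atmosphere,
# with proper lifetime tau; gamma is an illustrative Lorentz factor.
tau = 2.2e-6        # s (modern value of the muon lifetime)
c = 3.0e8           # m/s
d = 10.0e3          # m
gamma = 20.0        # e.g. a ~2 GeV muon

p_naive = math.exp(-d / (c * tau))            # no time dilation: almost none survive
p_dilated = math.exp(-d / (c * gamma * tau))  # dilated lifetime gamma*tau: about half survive
print(p_naive, p_dilated)
```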
Despite the initial excitement, however, the picture did not work. In particular, the
Yukawa particle is the carrier of strong interactions, and therefore it cannot be highly
penetrating—the nuclei of the atmosphere would absorb it quickly. Many theorists
tried to find complicated explanations to save the theory. The correct explanation
was however the simplest one: the mesotron was not the Yukawa particle, as it was
demonstrated in 1945/46 by three young Italian physicists, Conversi, Pancini, and
Piccioni.
The experiment by Conversi, Pancini and Piccioni exploits the fact that slow
negative Yukawa particles can be captured by nuclei in a time shorter than the typical
lifetime of the mesotron, about 2 µs, and thus are absorbed before decaying; on the
contrary, slow positive particles are likely to be repelled by the potential barrier of
nuclei and thus have the time to decay. The setup is shown in Fig. 3.10; a magnetic
lens focuses particles of a given charge, thus allowing charge selection. The Geiger
counters A and B are in coincidence—i.e., a simultaneous signal is required; the C
counters under the absorber are in “delayed coincidence”: it is required that one of them fires at a time between 1 and 4.5 µs after the coincidence (AB). This
Fig. 3.10 Left a magnetic lens (invented by Rossi in 1930). Right setup of the Conversi, Pancini and Piccioni experiment. From M. Conversi, E. Pancini, O. Piccioni, “On the disintegration of negative
mesons”, Physical Review 71 (1947) 209
guarantees that the particle selected is slow and, in case of decay, has a lifetime
consistent with the mesotron. The result was that when carbon was used as absorber,
a substantial fraction of the negative mesons decayed. The mesotron was not the
Yukawa particle.
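With the modern value of the mesotron (muon) lifetime, τ ≈ 2.2 µs, one can estimate what fraction of decays falls inside the 1-4.5 µs delayed-coincidence window (a sketch, not the published analysis):

```python
import math

# Fraction of exponentially distributed decays falling inside the delayed
# coincidence window [t1, t2] used by Conversi, Pancini and Piccioni:
#   P(t1 < t < t2) = exp(-t1/tau) - exp(-t2/tau)
tau, t1, t2 = 2.2, 1.0, 4.5   # microseconds (tau: modern muon lifetime)
frac = math.exp(-t1/tau) - math.exp(-t2/tau)
print(frac)   # ~ 0.5: about half of the decays are tagged by the window
```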
There were thus two particles of similar mass. One of them (with mass of about
140 MeV/c2 ), corresponding to the particle predicted by Yukawa, was later called
pion (or π meson); it was created in the interactions of cosmic protons with the
atmosphere, and then interacted with the nuclei of the atmosphere, or decayed.
Among its decay products there was the mesotron, since then called the muon (or μ
lepton), which was insensitive to the strong force.
In 1947, Powell, Occhialini and Lattes, exposing nuclear emulsions (a kind of
very sensitive photographic plates, with space resolutions of a few µm; we shall
discuss them in the next chapter) to cosmic rays on Mount Chacaltaya in Bolivia,
finally proved the existence of charged pions, positive and negative, while observing
their decay into muons and allowing a precise determination of the masses. For this
discovery Cecil Powell, the group leader, was awarded the Nobel Prize in 1950.
Many photographs of nuclear emulsions, especially in experiments on balloons,
clearly showed traces of pions decaying into muons (the muon mass was reported to
be about 106 MeV/c2 ), decaying in turn into electrons. In the decay chain π → μ → e
(Fig. 3.11) some energy is clearly missing, and can be attributed to neutrinos.
At this point, the distinction between pions and muons was clear. The muon looks
like a “heavier brother” of the electron. After the discovery of the pion, the muon had
no theoretical reason to exist (the physicist Isidor Rabi was attributed in the 1940s
the famous quote: “Who ordered it?”). However, a new family was initiated: the
family of leptons—including for the moment the electron and the muon, and their
antiparticles.
Fig. 3.11 The pion and the muon: the decay chain π → μ → e. The pion travels from bottom to
top on the left, the muon horizontally, and the electron from bottom to top on the right. The missing
momentum is carried by neutrinos. From C.F. Powell, P.H. Fowler and D.H. Perkins, The Study of
Elementary Particles by the Photographic Method (Pergamon Press 1959)
Before it was even known that mesotrons were not the Yukawa particle, the theory of
mesons had great development. In 1938, a theory of charge symmetry was formulated,
conjecturing the fact that the forces between protons and neutrons, between protons
and protons and between neutrons and neutrons are similar. This implies the existence
of positive, negative and also neutral mesons.
The neutral pion was more difficult to detect than the charged one, due to the fact
that neutral particles do not leave tracks in cloud chambers and nuclear emulsions—
and also to the fact, discovered only later, that it lives only approximately $10^{-16}$ s before decaying, mostly into two photons. However, between 1947 and 1950, the
neutral pion was identified in cosmic rays by analyzing its decay products in showers
of particles. So, after 15 years of research, the theory of Yukawa finally received complete confirmation.
In 1947, after the thorny problem of the meson had been solved, particle physics
seemed to be a complete science. Fourteen particles were known to physicists (some
of them at the time were only postulated, and were going to be found experimentally
later): the proton, the neutron (proton and neutron together belong to the family of
baryons, the Greek etymology of the word referring to the concept of “heaviness”)
and the electron, and their antiparticles; the neutrino that was postulated because
of an apparent violation of the principle of energy conservation; three pions; two
muons; and the photon.
Apart from the muon, a particle that appeared unnecessary, all the others seemed
to have a role in nature: the electron and the nucleons constitute the atom, the photon
carries the electromagnetic force, and the pion the strong force; neutrinos are needed
for energy conservation. But, once more in the history of science, when everything
seemed understood a new revolution was just around the corner.
Fig. 3.12 The first images of the decay of particles known today as K mesons or kaons—the first
examples of “strange” particles. The image on the left shows the decay of a neutral kaon. Being
neutral it leaves no track, but when it decays into two lighter charged particles (just below the central
bar to the right), one can see a “V”. The picture on the right shows the decay of a charged kaon into
a muon and a neutrino. The kaon reaches the top right corner of the chamber and the decay occurs
where the track seems to bend sharply to the left (from G.D. Rochester, C.C. Butler, “Evidence for
the Existence of New Unstable Elementary Particles” Nature 160 (1947) 855)
Since 1944, strange topologies of cosmic particles were photographed from time
to time in cloud chambers. In 1947, G.D. Rochester and C.C. Butler from the
University of Manchester observed clearly in a photograph a couple of tracks from
a single point with the shape of a “V”; the two tracks were deflected in opposite
directions by an external magnetic field. The analysis showed that the parent neutral
particle had a mass of about half a GeV (intermediate between the mass of the proton
and that of the pion), and disintegrated into a pair of oppositely charged pions. A
broken track in a second photograph showed the decay of a charged particle of about
the same mass into a charged pion and at least one neutral particle (Fig. 3.12).
These particles, which were produced only in high energy interactions, were
observed only every few hundred photographs. They are known today as K mesons
(or kaons); kaons can be positive, negative, or neutral. A new family of particles had
been discovered. The behavior of these particles was somehow strange: the cross
section for their production could be understood in terms of strong interactions;
however, their lifetime was inconsistent with strong interaction, being too long. These
new particles were called “strange mesons”. Later analyses indicated the presence
of particles heavier than protons and neutrons. They decayed with a “V” topology
into final states including protons, and they were called strange baryons, or hyperons
(, , ...). Strange particles appear to be always produced in pairs, indicating the
presence of a new conserved quantum number—thus called strangeness.
In the beginning, the discovery of strange mesons was made complicated by the so-
called τ -θ puzzle. A strange meson was disintegrating into two pions, and was called
the θ meson; another particle called the τ meson was disintegrating into three pions.
Both particles disintegrated via the weak force and, apart from the decay mode, they
turned out to be indistinguishable from each other, having identical masses within the
experimental uncertainties. Were the two actually the same particle? It was concluded
that they were (we are talking about the K meson); this opened a problem related to
the so-called parity conservation law, which we will discuss further in Chaps. 5 and 6.
The discovery of mesons, which had put the physics world in turmoil after World War
II, can be considered as the origin of the “modern” physics of elementary particles.
The following years showed a rapid development of the research groups dealing
with cosmic rays, along with a progress of experimental techniques of detection,
exploiting the complementarity of cloud and bubble chambers, nuclear emulsions,
and electronic coincidence circuits. The low cost of emulsions allowed the spread of
nuclear experiments and the establishment of international collaborations.
It became clear that it was appropriate to equip laboratories on top of the mountains
to study cosmic rays. Physicists from all around the world were involved in a scientific
challenge of enormous magnitude, taking place in small laboratories on the tops of
the Alps, the Andes, the Rocky Mountains, the Caucasus.
Particle physicists used cosmic rays as the primary tool for their research until the
advent of particle accelerators in the 1950s, so that the pioneering results in this field
are due to cosmic rays. For the first 30 years cosmic rays allowed physicists to gain information
on the physics of elementary particles. With the advent of particle accelerators, in
the years since 1950, most physicists went from hunting to farming.
In 1953, the Cosmic Ray Conference at Bagnères de Bigorre in the French Pyrenees
was a turning point for high energy physics. The technology of artificial accelerators
was progressing, and many cosmic ray physicists were moving to this new frontier.
CERN, the European Laboratory for Particle Physics, was soon to be founded.
Also from the sociological point of view, important changes were in progress,
and large international collaborations were formed. Only 10 years before, articles for
which the preparation of the experiment and the data analysis had been performed
by many scientists were signed only by the group leader. But the recent G-stack
experiment, an international collaboration in which cosmic ray interactions were
It should be stressed that, despite the great advances of the technology of accel-
erators, the highest energies will always be reached by cosmic rays. The founding
fathers of CERN, in their Constitution (Convention for the Establishment of a Euro-
pean Organization for Nuclear Research, 1953) explicitly stated that cosmic rays are
one of the research items of the Laboratory.
A calculation made by Fermi about the maximum reasonably (and even unrea-
sonably) achievable energy by terrestrial accelerators is interesting in this regard. In
his speech “What can we learn from high energy accelerators,” held at the American Physical Society meeting in 1954, Fermi considered a proton accelerator with a ring as
large as the maximum circumference of the Earth (Fig. 3.13) as the maximum possi-
ble accelerator. Assuming a magnetic field of 2 tesla (Fermi assumed that this was the
maximum field attainable in stable conditions and for long magnets; the conjecture is still true unless new technologies appear), it is possible to obtain a maximum
energy of 5000 TeV: this is the energy of cosmic rays just under the “knee,” the typical
energy of galactic accelerators. Fermi estimated with great optimism, extrapolating
the rate of progress of the accelerator technology in the 1950s, that this accelerator
could be constructed in 1994 and cost approximately 170 million dollars (the cost of
LHC is some 100 times larger, and its energy is 700 times smaller).
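Fermi's figure can be reproduced with the same p ≈ 0.3 B r bending rule used for magnetic spectrometers; assuming a bending radius equal to the Earth's radius and B = 2 T (approximate values):

```python
# Maximum energy of Fermi's Earth-sized proton ring:  E [GeV] ~= 0.3 * B [T] * r [m]
# for an ultrarelativistic unit-charge particle (approximate values).
R_earth = 6.371e6                  # bending radius ~ Earth radius, m
B = 2.0                            # T
E_max_gev = 0.3 * B * R_earth
print(E_max_gev / 1e3)             # a few thousand TeV, the order of the quoted 5000 TeV
```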
3.4 The Recent Years 97
between the photon accelerators and the cosmic ray accelerators in the Milky Way,
in particular supernova remnants. Studying the propagation of very energetic photons
traveling through cosmological distances, they are also sensitive to possible violations
of the Lorentz invariance at very high energy, and to photon interactions with the
quantum vacuum, which in turn are sensitive to the existence of yet unknown fields.
A new detector, CTA, is planned and will outperform the present detectors by an
order of magnitude at least.
Finally, the field of study of cosmic neutrinos registered impressive results. In the
analysis of the fluxes of solar neutrinos and then of atmospheric neutrinos, studies
performed using large neutrino detectors in Japan, US, Canada, China, and Italy have
demonstrated that neutrinos can oscillate between different flavors; this phenomenon
requires that neutrinos have nonzero mass—present indications favor masses of the
order of tens of meV. Recently the IceCube South Pole Neutrino Observatory, a km³ detector buried in the ice of Antarctica, has found the first solid evidence for astrophysical neutrinos from cosmic accelerators (some with energies above the
PeV).
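The oscillation phenomenon is commonly summarized, in a two-flavor approximation, by P = sin²(2θ) sin²(1.27 Δm²[eV²] L[km]/E[GeV]); a sketch with representative (not authoritative) parameter values:

```python
import math

# Two-flavor neutrino oscillation probability; dm2 in eV^2, L in km, E in GeV.
def p_osc(sin2_2theta, dm2_ev2, L_km, E_gev):
    return sin2_2theta * math.sin(1.27 * dm2_ev2 * L_km / E_gev)**2

# Atmospheric-like parameters: near-maximal mixing, dm2 ~ 2.5e-3 eV^2,
# a 1 GeV neutrino crossing the Earth's diameter:
print(p_osc(1.0, 2.5e-3, 12742.0, 1.0))
```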
Cosmic rays and cosmological sources are thus again in the focus of very high
energy particle and gravitational physics. This will be discussed in greater detail in
Chap. 10.
Further Reading
Exercises
2. Klein-Gordon equation. Show that in the nonrelativistic limit $E \simeq mc^2$ the positive energy solutions $\Psi$ of the Klein-Gordon equation can be written in the form
$$\Psi(\vec{r}, t) \simeq \Phi(\vec{r}, t)\, e^{-i\frac{mc^2}{\hbar} t} \,,$$
Particle detectors measure physical quantities related to the result of a collision; they should ideally identify all the outgoing (and the incoming, if unknown) particles, and measure their kinematical characteristics (momentum, energy, velocity).
In order to detect a particle, one must make use of its interaction with a sensitive
material. The interaction should ideally not destroy the particle that one wants to
detect; for some particles, however, this is the only way to obtain information.
In order to study the properties of detectors, we shall thus first need to review the
characteristics of the interaction of particles with matter.
Charged particles interact basically with atoms, and the interaction is mostly electro-
magnetic: they might expel electrons (ionization), promote electrons to upper energy
levels (excitation), or radiate photons (bremsstrahlung, Cherenkov radiation, tran-
sition radiation). High-energy particles may also interact directly with the atomic
nuclei.
This is one of the most important sources of energy loss by charged particles. The
average value of the specific (i.e., calculated per unit length) energy loss due to
ionization and excitation whenever a particle goes through a homogeneous material
of density ρ is described by the so-called Bethe formula,¹ which can be written
approximately as

− dE/dx ≃ D ρ (Z/A) (z_p²/β²) [ ln(2 m_e c² β² γ²/I) − β² − δ/2 ] ,  (4.1)
© Springer-Verlag Italia 2015
A. De Angelis and M.J.M. Pimenta, Introduction to Particle
and Astroparticle Physics, Undergraduate Lecture Notes in Physics,
DOI 10.1007/978-88-470-2688-9_4
where
• ρ is the material density, in g/cm3 ;
• Z and A are the atomic and mass number of the material, respectively;
• z p is the charge of the incoming particle, in units of the electron charge;
• D ≃ 0.307 MeV cm²/g;
• m e c2 is the energy corresponding to the electron mass, ∼ 0.5 MeV;
• I is the mean excitation energy in the material; it can be approximated as I ≃ 16 eV × Z^0.9 for Z > 1;
• δ is a correction term that becomes important at high energies. It accounts for
the reduction in energy loss due to the so-called density effect. As the incident
particle velocity increases, media become polarized and their atoms can no longer
be considered as isolated.
The energy loss by ionization (Fig. 4.1) in first approximation is:
• independent of the particle’s mass;
• typically small for high-energy particles (about 2 MeV/cm in water; one can
roughly assume a proportionality to the density of the material);
• proportional to 1/β² for βγ ≤ 3 (the minimum of ionization: a minimum ionizing particle, often just called a “mip”);
• basically constant for β > 0.96 (logarithmic increase after the minimum);
• proportional to Z /A (Z /A being about equal to 0.5 for all elements but hydrogen
and the heaviest nuclei).
In practice, most relativistic particles (such as cosmic-ray muons) have mean energy
loss rates close to the minimum; they can be considered within less than a factor of
two as minimum ionizing particles. The loss from a minimum ionizing particle is
well approximated as
− (1/ρ) (dE/dx) ≃ 3.5 (Z/A) MeV cm²/g .
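As a quick numerical check (an illustrative sketch, not from the book; the Z/A ratios and densities used are standard values), the rule of thumb above reproduces the "about 2 MeV/cm in water" quoted earlier:

```python
# Minimum ionizing particle (mip) energy loss from the rule of thumb
# -(1/rho) dE/dx ~ 3.5 (Z/A) MeV cm^2/g.

def mip_loss_MeV_per_cm(Z_over_A, rho_g_cm3):
    """Approximate mip energy loss in MeV/cm for a given material."""
    mass_stopping_power = 3.5 * Z_over_A      # MeV cm^2 / g
    return mass_stopping_power * rho_g_cm3    # MeV / cm

# Z/A ~ 0.555 for water (H2O), 26/55.85 ~ 0.466 for iron
print(f"water: {mip_loss_MeV_per_cm(0.555, 1.00):.2f} MeV/cm")  # ~1.9 MeV/cm
print(f"iron : {mip_loss_MeV_per_cm(0.466, 7.87):.1f} MeV/cm")
```

The water result, about 1.9 MeV/cm, matches the "roughly proportional to the density" statement above.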
1 The 24-year-old Hans Bethe, Nobel prize in 1967 for his work on the theory of stellar nucleosyn-
thesis, published this formula in 1930; the formula—not including the density term, added later
by Fermi—was derived using quantum mechanical perturbation theory up to z_p². The description
can be improved by considering corrections which correspond to higher powers of z_p: Felix Bloch
obtained in 1933 a higher-order correction proportional to z_p⁴, not reported in this text, and some-
times the formula is called the “Bethe–Bloch energy loss”—although this naming convention has
been discontinued by the Particle Data Group since 2008.
Fig. 4.1 Specific ionization energy loss for muons, pions, and protons in different materials. From
K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
In any case, as we shall see later, the energy loss in the logarithmic increase region
can be used by means of appropriate detectors for particle identification.
Due to the statistical nature of the ionization process, large fluctuations in the
energy loss arise when fast charged particles pass through absorbers which are thin
compared to the particle range. The energy loss is distributed around the most prob-
able value according to an asymmetric distribution (named the Landau distribution).
The average energy loss, represented by the Bethe formula, is larger than the most
probable energy loss, since the Landau distribution has a long tail (as the thickness
of the material increases, the most probable energy loss becomes, however, closer to
the average, as one can see in Fig. 4.2).
Although its nature is quantum mechanical, the main characteristics of Eq. 4.1
can be derived classically, as was first done by Bohr. Let us suppose a charged
particle of charge z_p e passes at a distance b from a target of mass m and
charge Ze. The momentum Δp transferred to the target depends on the electric field
E produced by the charged traveling particle. Given the symmetry of the problem
only the transverse component E⊥ of the electric field with respect to the particle
track matters. Relating the interaction time t to the velocity of the particle,
dt = dx/v, one can write for the momentum transfer:

Δp = ∫_{−∞}^{+∞} F dt = e ∫_{−∞}^{+∞} E⊥ dt = (e/v) ∫_{−∞}^{+∞} E⊥ dx .
The electric field integral can be calculated using Gauss's law. In fact, the flux of
the electric field passing through a cylinder of radius b is given by ∫ E⊥ 2πb dx =
z_p e/ε₀. Therefore, the momentum transferred to the target particle can be written
as

Δp = z_p e² / (2π ε₀ v b) ,
or, in terms of the energy and using the classical radius of the electron²
r_e = (e²/4πε₀)/(m_e c²) ≃ 0.003 pm:

ΔE = Δp²/(2m) = (1/(4πε₀))² (2 z_p² Z² e⁴)/(m c² β² b²) = (2 z_p² Z² (m_e c²)²/(m c² β²)) (r_e/b)² .
From this expression one can see that close collisions (ΔE ∝ 1/b²) and low-mass
targets (ΔE ∝ 1/m) are the most important for the energy loss; thus one
can neglect the effect of nuclei.
Photoluminescence. In some transparent media, part of the ionization energy loss
goes into the emission of visible or near-visible light by the excitation of atoms
and/or molecules. This phenomenon is called photoluminescence; often it results
in a fast (<100 µs) excitation/deexcitation—in this last case we speak of fluorescence,
or scintillation. Specialists often use definitions which distinguish between
fluorescence and scintillation; this separation is, however, not universally accepted.
2 The classical electron radius is the size the electron would need to have for its mass to be completely
due to its electrostatic potential energy, under the assumption that charge has a uniform volume
density and that the electron is a sphere.
We shall discuss later fluorescence in the context of the detection of large showers
induced in the atmosphere by high-energy cosmic rays.
Fig. 4.3 The stopping power (−d E/d x) for positive muons in copper as a function of βγ = p/Mc
is shown over nine orders of magnitude in momentum (corresponding to 12 orders of magnitude in
kinetic energy). From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
Fig. 4.4 Fractional energy loss per radiation length in lead as a function of the electron or positron
energy. From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
The average fractional energy loss of high-energy electrons by radiation (bremsstrahlung)
can be expressed in terms of the radiation length X₀:

− (1/E) dE/dx ≃ 1/X₀ ,  (4.2)
where n a is the density of atoms per cubic centimeter in the medium, or more simply
X₀ ≃ (180 A)/(ρ Z²) cm   (ΔX₀/X₀ < ±20 % for 12 < Z < 93) .  (4.4)
The total average energy loss by radiation increases rapidly (linearly in the approx-
imation of the equation (4.2)) with energy, while the average energy loss by collision
is practically constant. At high energies, radiation losses are thus much more impor-
tant than collision losses (Fig. 4.4).
3 NTP is commonly used as a standard condition; it is defined as air at 20 ◦ C (293.15 K) and 1 atm
(101.325 kPa). Density is 1.204 kg/m3 . Standard Temperature and Pressure STP, another condi-
tion frequently used in physics, is defined by IUPAC (International Union of Pure and Applied
Chemistry) as air at 0 °C (273.15 K) and 100 kPa.
The energy at which the radiation energy loss overtakes the collision energy loss
(called the critical energy, E c ) decreases with increasing atomic number:
E_c ≃ 550 MeV / Z   (ΔE_c/E_c < ±10 % for 12 < Z < 93) .  (4.5)

The critical energy for air at NTP is about 84 MeV; for water it is about 74 MeV.
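The two parameterizations for X₀ and E_c are easy to evaluate side by side; the sketch below (illustrative, not from the book; the A, Z, and ρ values are standard) applies Eqs. (4.4) and (4.5) to a few materials:

```python
# Approximate radiation length X0 ~ 180 A / (rho Z^2) cm  (Eq. 4.4)
# and critical energy Ec ~ 550 MeV / Z                    (Eq. 4.5).

def radiation_length_cm(A, Z, rho_g_cm3):
    """Radiation length in cm from the ~20%-accurate parameterization."""
    return 180.0 * A / (rho_g_cm3 * Z**2)

def critical_energy_MeV(Z):
    """Critical energy in MeV from the ~10%-accurate parameterization."""
    return 550.0 / Z

for name, A, Z, rho in [("C", 12.0, 6, 2.0), ("Fe", 55.85, 26, 7.87), ("Pb", 207.2, 82, 11.35)]:
    print(f"{name}: X0 ~ {radiation_length_cm(A, Z, rho):.2f} cm, "
          f"Ec ~ {critical_energy_MeV(Z):.1f} MeV")
```

Note how both quantities fall quickly with Z: in lead, X₀ is about half a centimeter and E_c only a few MeV.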
The photons radiated by bremsstrahlung are distributed at leading order in such
a way that the energy loss per unit energy is constant, i.e.,
N_γ ∝ 1/E_γ
between 0 and E. This results in a divergence for E_γ → 0, which however does not
contradict energy conservation: the integral of the energy released in each energy
bin is constant.
The emitted photons are collimated: the typical angle of emission is ∼ m e c2 /E.
A charged particle moving in a medium faster than the speed of light in that
medium emits Cherenkov radiation⁴ at a characteristic angle

cos θc = 1/(nβ)  (4.6)
from the direction of the emitting particle. The threshold velocity is thus β = 1/n,
where n is the refractive index of the medium. The presence of a coherent wavefront
can be easily derived by using the Huygens–Fresnel principle.
The number of photons produced per unit path length and per unit energy interval
of the photons by a particle with charge z p e at the maximum (limiting) angle is
d²N/(dE dx) = (α z_p²/(ħc)) sin²θc ≃ 370 sin²θc eV⁻¹ cm⁻¹  (4.7)
or equivalently
4 Pavel Cherenkov (1904–1990) was a Soviet physicist who shared the Nobel prize in physics in
1958 with his compatriots Ilya Frank (1908–1990) and Igor Tamm (1895–1971) for the discovery of
Cherenkov radiation, made in 1934. The work was done under the supervision of Sergey Vavilov,
who died before the recognition of the discovery by the Nobel committee.
d²N/(dλ dx) = (2πα z_p²/λ²) sin²θc  (4.8)
(the index of refraction n is in general a function of photon energy E; Cherenkov
radiation is relevant when n > 1 and the medium is transparent, and this happens
close to the range of visible light).
The total energy radiated is small, some 10−4 times the energy lost by ionization.
In the visible range (300–700 nm), the total number of emitted photons is about
40/m in air, about 500/cm in water. Due to the 1/λ² dependence, it is important that
Cherenkov detectors be sensitive close to the ultraviolet region.
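Equations (4.6) and (4.7) can be made concrete with a few lines of code (an illustrative sketch, not from the book; the refractive index n = 1.33 for water and the roughly 1.8–4.1 eV visible band are assumed values):

```python
import math

def cherenkov_angle(n, beta):
    """Cherenkov angle in radians from cos(theta_c) = 1/(n beta), Eq. 4.6.

    Returns None below threshold (beta < 1/n), where no light is emitted."""
    cos_tc = 1.0 / (n * beta)
    if cos_tc > 1.0:
        return None
    return math.acos(cos_tc)

def photons_per_cm(n, beta, dE_eV, z_p=1):
    """Photon yield from d2N/dEdx ~ 370 z^2 sin^2(theta_c) per eV per cm, Eq. 4.7."""
    tc = cherenkov_angle(n, beta)
    return 0.0 if tc is None else 370.0 * z_p**2 * math.sin(tc)**2 * dE_eV

n_water = 1.33
print(f"threshold: beta = {1 / n_water:.3f}")
tc = cherenkov_angle(n_water, 0.999)
print(f"theta_c(beta = 0.999) = {math.degrees(tc):.1f} deg")
# visible band 300-700 nm corresponds to roughly 1.8-4.1 eV
print(f"yield ~ {photons_per_cm(n_water, 0.999, 4.1 - 1.8):.0f} photons/cm in water")
```

The yield comes out at a few hundred photons per centimeter in water for a β ≈ 1 particle, consistent with the order of magnitude quoted above.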
Dense media can be transparent not only to visible light, but also to radio waves.
The development of Cherenkov radiation in the radiowave region due to the interac-
tions with electrons in the medium is often referred to as the Askar’yan effect. This
effect has been experimentally confirmed for different media (namely sand, rock salt
and ice) in accelerator experiments at SLAC; presently, attempts are in progress to
use this effect in particle detectors.
X-ray transition radiation (XTR) occurs when a relativistic charged particle crosses
from one medium to another with different dielectric permittivity.
The energy radiated when a particle with charge z_p e and γ ≃ 1000 crosses the
boundary between vacuum and a different transparent medium is typically concen-
trated in the soft X-ray range 2–40 keV.
The process is closely related to Cherenkov radiation, and also in this case the
total energy emitted is low (typically the expected number of photons per transition
is smaller than unity; one thus needs several layers to build a detector).
Fig. 4.6 Multiple Coulomb scattering. From K.A. Olive et al. (Particle Data Group), Chin. Phys.
C 38 (2014) 090001
A charged particle passing near a nucleus undergoes deflection, with an energy loss
that is in most cases negligible (approximately zero). This phenomenon is called elas-
tic scattering and is caused by the interaction between the particle and the Coulomb
field of the nucleus. The global effect is that the path of the particle becomes a ran-
dom walk (Fig. 4.6), and information on the original direction is partly lost—this
fact can create problems for the reconstruction of direction in tracking detectors.
For very-high-energy hadrons, the hadronic cross section can also contribute to the effect.
Summing up many relatively small random changes on the direction of flight
of a particle of unit charge traversing a thin layer of material, the distribution of its
projected scattering angle can be approximated by a Gaussian distribution of standard
deviation θ₀ projected on a plane (one has to multiply by √2 to determine the standard
deviation in space):

θ₀ ≃ (13.6 MeV/(βcp)) z_p √(x/X₀) [1 + 0.038 ln(x/X₀)] .
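As an illustration (an assumed example, not from the book: a 1 GeV/c muon, and X₀ ≈ 1.76 cm for iron), the formula above gives a projected scattering angle of about 10 mrad per centimeter of iron:

```python
import math

def theta0_rad(p_MeV, beta, x_over_X0, z_p=1):
    """Projected multiple-scattering angle (Highland-type parameterization)."""
    t = x_over_X0
    return (13.6 / (beta * p_MeV)) * z_p * math.sqrt(t) * (1.0 + 0.038 * math.log(t))

# a 1 GeV/c muon (beta ~ 0.994) crossing 1 cm of iron (X0 ~ 1.76 cm)
theta0 = theta0_rad(p_MeV=1000.0, beta=0.994, x_over_X0=1.0 / 1.76)
print(f"theta0 ~ {1e3 * theta0:.1f} mrad")
```

Because θ₀ scales as 1/p, the same layer deflects a 10 GeV/c muon ten times less: this is why multiple scattering limits momentum resolution mostly at low momenta.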
Fig. 4.7 Range per unit of density and of mass for heavy charged particles in liquid (bubble
chamber) hydrogen, helium gas, carbon, iron, and lead. Example: for a K⁺ with momentum 700
MeV/c, βγ ≃ 1.42, one reads R/M ≃ 396 in lead, corresponding to a range of 195 g/cm². From
K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
4.1.2 Range
From the specific energy loss as a function of energy, we can calculate the fraction
of energy lost as a function of the distance x traveled in the medium. This is known
as the Bragg curve. For charged particles, most of the ionization loss occurs near the
end of the path, where the particle speed is low. The Bragg curve has a pronounced
peak close to the end of the path length, where it then falls rapidly to zero. The range
R for a particle of energy E is the average distance traveled before reaching the
energy at which the particle is absorbed (Fig. 4.7):
R(E) = ∫_{Mc²}^{E} (− dE′/dx)⁻¹ dE′ .
Photons mostly interact with matter via photoelectric effect, Compton scattering,
and electron–positron pair production. Other processes, like Rayleigh scattering and
photonuclear interactions, have in general much smaller cross sections.
The photoelectric effect is the ejection of an electron from a material that has just
absorbed a photon. The ejected electron is called a photoelectron.
The photoelectric effect was pivotal in the development of quantum physics (for
the explanation of this effect Albert Einstein was awarded the Nobel prize). Due to
the photoelectric effect, a photon of angular frequency ω > V/ℏ can eject from a metal
an electron, which pops up with a kinetic energy ℏω − V, where V is the minimum energy
of the electrons trapped in the metal (V is frequently called the work function
of the metal).
No simple relationship between the attenuation of the incident electromagnetic
wave and the photon energy E can be derived, since the process is characterized by
the interaction with the (quantized) orbitals. The plot of the attenuation length
(the distance per unit density at which intensity is reduced by a factor 1/e) as a
function of the photon energy displays sharp peaks at the binding energies of the
different orbital shells and depends strongly on the atomic number. A reasonable
approximation for the cross section σ is

σ ∝ Z^ν / E³ ,
with the exponent ν varying between 4 and 5 depending on the energy. The cross
section rapidly decreases with energy above the typical electron binding energies
(Fig. 4.8).
The photoelectric effect can be used for detecting photons with energies below the
MeV scale; a photosensor (see later) sensitive to such energies can “read” the signal
generated by a photoelectron, possibly amplified by an avalanche process.
Compton scattering is the collision between a photon and an electron. Let E be the
energy of the primary photon (corresponding to a wavelength λ) and suppose the
electron is initially free and at rest. After the collision, the photon is scattered at an
angle θ and comes out with a reduced energy E′, corresponding to a wavelength λ′;
the electron acquires an energy E − E′. The conservation laws of energy and
momentum yield the following relation (Compton formula):

λ′ − λ = λ_C (1 − cos θ)  −→  E′ = E / (1 + (E/(m_e c²)) (1 − cos θ)) ,

where θ is the scattering angle of the emitted photon; λ_C = h/(m_e c) ≃ 2.4 pm is the
Compton wavelength of the electron.
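The Compton formula can be sketched in a few lines of code (illustrative, with m_e c² ≈ 511 keV). For instance, a 511 keV photon backscattered at θ = 180° emerges with exactly one third of its energy:

```python
import math

ME_C2_KEV = 511.0  # electron rest energy in keV

def compton_scattered_energy(E_keV, theta_rad):
    """Scattered photon energy E' = E / (1 + (E/mec2)(1 - cos theta))."""
    return E_keV / (1.0 + (E_keV / ME_C2_KEV) * (1.0 - math.cos(theta_rad)))

E = 511.0  # a photon with the electron rest energy
print(f"E'(90 deg)  = {compton_scattered_energy(E, math.pi / 2):.1f} keV")
print(f"E'(180 deg) = {compton_scattered_energy(E, math.pi):.1f} keV")
```

The 180° case (E′ = E/3 ≈ 170 keV here) is the origin of the "backscatter peak" familiar from gamma-ray spectroscopy.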
It should be noted that, in case the target electron is not at rest, the energy of the
scattered photon can be larger than the energy of the incoming one. This regime is
called inverse Compton, and it has great importance in the emission of high-energy
photons by astrophysical sources: in practice, thanks to inverse Compton, photons
can be “accelerated.”
The differential cross section for Compton scattering was calculated by Klein and
Nishina around 1930. If the photon energy is much below m e c2 (so the scattered
electrons are non-relativistic) then the total cross section is given by the Thomson
cross section. This is known as the Thomson limit. The cross section for E ≪ m_e c²
(Thomson regime) is

σ_T = 8πα²/(3 m_e²) = 8π r_e²/3 ,  (4.9)
where r_e = (e²/4πε₀)/(m_e c²) ≃ 0.003 pm is the classical radius of the electron. If
the photon energy is E ≫ m_e c², we are in the so-called Klein–Nishina regime and
the total cross section falls off rapidly with increasing energy (Fig. 4.8):

σ_KN ≃ (3σ_T/8) (ln 2E)/E  (E in units of m_e c²) .  (4.10)
As in the case of the photoelectric effect, the ejected electron can be detected
(possibly after multiplication) by an appropriate sensor.
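A quick numerical look at the two regimes (an illustrative sketch, not from the book; r_e ≈ 2.818 fm is assumed, and the high-energy formula is applied with E in units of m_e c²):

```python
import math

R_E_CM = 2.818e-13                          # classical electron radius, cm
SIGMA_T = 8.0 * math.pi * R_E_CM**2 / 3.0   # Thomson cross section, cm^2

def sigma_klein_nishina(E_over_mec2):
    """High-energy (Klein-Nishina) limit, with E expressed in units of me c^2."""
    x = E_over_mec2
    return (3.0 * SIGMA_T / 8.0) * math.log(2.0 * x) / x

print(f"sigma_T ~ {SIGMA_T / 1e-24:.3f} barn")
ratio = sigma_klein_nishina(100.0) / SIGMA_T
print(f"sigma_KN(E = 100 me c^2) / sigma_T ~ {ratio:.4f}")
```

The Thomson value comes out near 0.665 barn, while at E = 100 m_e c² the cross section has already dropped by roughly a factor of fifty, illustrating the rapid fall-off.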
Pair production is the most important interaction process for a photon above an
energy of a few tens of MeV. In the electric field in the neighborhood of a nucleus,
a high-energy photon has a non-negligible probability of transforming itself into an
electron–positron pair—the process being kinematically forbidden unless an external
field, however weak, is present.
Energy conservation yields the following relation between the energy E of the
primary photon and the total energies U and U′ of the two leptons:

E = U + U′ .
With reasonable approximation, for 1 TeV> E > 100 MeV the fraction of energy
u = U/E taken by the secondary electron/positron is uniformly distributed between
0 and 1 (becoming peaked at the extremes as the energy increases to values above
1 PeV).
The cross section grows quickly from the kinematical threshold of about 1 MeV
to its asymptotic value reached at some 100 MeV:
σ ≃ (7/9) (1/(n_a X₀)) ,
where n a is the density of atomic nuclei per unit volume, in such a way that the
interaction length is
λ ≃ (9/7) X₀ .
The angle of emission for the particles in the pair is typically ∼m e c2 /E.
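For example (an illustrative sketch; the radiation lengths used, X₀ ≈ 0.56 cm for lead and ≈ 36 cm for water, are standard values, not from the text):

```python
# Photon pair-conversion length lambda = (9/7) X0.
def conversion_length_cm(X0_cm):
    """Mean free path of a high-energy photon before pair conversion."""
    return 9.0 / 7.0 * X0_cm

print(f"lead : {conversion_length_cm(0.56):.2f} cm")
print(f"water: {conversion_length_cm(36.1):.0f} cm")
```

A high-energy photon thus converts, on average, within less than a centimeter of lead but only after tens of centimeters of water, which is why dense high-Z materials are used to make photons interact quickly.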
The total Compton scattering probability decreases rapidly when the photon energy
increases. Conversely, the total pair production probability is a slowly increasing
function of energy. At large energies, most photons are thus absorbed by pair
production, while photon absorption by the Compton effect dominates at small energies
(the photoelectric effect being characteristic of even smaller energies). The absorption
of photons by pair production, Compton scattering, and the photoelectric effect is compared
in Fig. 4.8.
As a matter of fact, above about 30 MeV the dominant process is pair production,
and the interaction length of a photon is with extremely good approximation equal
to 9X 0 /7.
At extremely high matter densities and/or at extremely high energies (typically
above 1016 –1018 eV, depending on the medium composition and density) collisions
cannot be treated independently, and the result of the collective quantum mechanical
treatment is a reduction of the cross section. The result is the so-called Landau-
Pomeranchuk-Migdal effect, or simply LPM effect, which entails a reduction of the
pair production cross section, as well as of bremsstrahlung.
The nuclear force is felt by hadrons, charged and neutral; at high energies (above a
few GeV) the inelastic cross section for hadrons is dominated by nuclear interaction.
High-energy nuclear interactions can be characterized by an inelastic interaction
length λ H . Values for ρλ H are typically of the order of 100 g/cm2 ; a listing for
some common materials is provided in Appendix B—where the inelastic length λ I
and the total length λT are separately listed, and the rule for the composition is
1/λT = 1/λ H +1/λ I .
The final state products of inelastic high-energy hadronic collisions are mostly
pions, since these are the lightest hadrons. The rates of positive, negative, and neutral
pions are more or less equal—as we shall see, this fact is due to a fundamental
symmetry of hadronic interactions, called isospin symmetry.
The case of neutrinos is a special one. Neutrinos have a very low interaction cross
section, which can be parameterized (on a single nucleon) for intermediate energies
(Fig. 4.9) as
σ_νN ≃ (6.7 × 10⁻³⁹ E) cm² ,  (4.11)
Fig. 4.9 Measurements of muon neutrino and antineutrino inclusive scattering cross sections
divided by neutrino energy as a function of neutrino energy; different symbols represent exper-
imental measurements by different experiments. Note the transition between logarithmic and linear
scales at 100 GeV. From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
E being the neutrino energy in GeV. Solar neutrinos, which have MeV energies,
typically cross the Earth undisturbed (see a more complete discussion in Chap. 9).
The low value of the cross section makes the detection of neutrinos very difficult.
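To see this quantitatively, one can fold Eq. (4.11) with the mean density of the Earth (an illustrative sketch; ρ ≈ 5.5 g/cm³, i.e., about 3 × 10²⁴ nucleons/cm³, is an assumed round number):

```python
# Mean free path of a neutrino in matter, using sigma ~ 6.7e-39 * E(GeV) cm^2 per nucleon.
N_A = 6.022e23  # Avogadro's number (nucleons per gram, to a good approximation)

def nu_mean_free_path_km(E_GeV, rho_g_cm3):
    """Mean free path in km for a neutrino of energy E_GeV in matter of given density."""
    sigma_cm2 = 6.7e-39 * E_GeV        # cross section per nucleon
    n_nucleons = rho_g_cm3 * N_A       # nucleons per cm^3
    return 1.0 / (n_nucleons * sigma_cm2) / 1e5  # cm -> km

mfp = nu_mean_free_path_km(E_GeV=1.0, rho_g_cm3=5.5)
print(f"mean free path at 1 GeV ~ {mfp:.1e} km (Earth diameter ~ 1.3e4 km)")
```

Even at 1 GeV the mean free path exceeds the Earth's diameter by more than four orders of magnitude, so MeV solar neutrinos indeed cross the Earth essentially undisturbed.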
High-energy electrons lose most of their energy by radiation. Thus, in their interaction
with matter, most of the energy is spent in the production of high-energy photons
and only a small fraction is dissipated. The secondary photons, in turn, undergo pair
production (or, at lower energies, Compton scattering); in the first case, electrons
and positrons can in turn radiate. This phenomenon continues generating cascades
(showers) of electromagnetic particles; at each step the number of particles increases
while the average energy decreases, until the energy falls below the critical energy.
Given the characteristics of the interactions of electrons/positrons and of photons
with matter, it is natural to describe the process of electromagnetic cascades in terms
of the scaled distance

t = x/X₀

and of the scaled energy E/E_c
(where E_c is the critical energy); the radiation length and the critical energy have
been defined in Sect. 4.1.1.2. Since the opening angles for bremsstrahlung and pair
production are small, the process can be in first approximation (above the critical
energy) considered as one-dimensional (the lateral spread will be discussed at the
end of this section).
A simple approximation (a “toy model”), proposed by Heitler in the late 1930s,
assumes that
• the incoming charged particle has an initial energy E 0 much larger than the critical
energy E c ;
• each electron travels one radiation length and then gives half of its energy to a
bremsstrahlung photon;
• each photon travels one radiation length and then creates an electron–positron pair;
the electron and the positron each carry half of the energy of the original photon.
In the above model, asymptotic formulas for radiation and pair production are
assumed to be valid; the Compton effect and the collision processes are neglected.
The branching stops abruptly when E = E c , and then electrons and positrons lose
their energy by ionization.
This simple branching model is schematically shown in Fig. 4.10, left. It implies
that after t radiation lengths the shower will contain 2t particles and there will be
roughly the same number of electrons, positrons, and photons, each with an average
energy
E(t) = E₀/2^t .
The cascading process will stop when E(t) = E c , at a thickness of absorber tmax ,
that can be written in terms of the initial and critical energies as
tmax = log2 (E 0 /E c ) ,
Fig. 4.10 Left Scheme of the Heitler approximation for the development of an electromagnetic
shower. From J. Matthews, Astropart. Phys. 22 (2005) 387. Right Image of an electromagnetic
shower developing through a number of brass plates 1.25 cm thick placed across a cloud chamber
(from B. Rossi, “Cosmic rays”, McGraw-Hill 1964)
The number of particles at the maximum is

N_max = E₀/E_c ≡ y .
The model suggests that the shower depth at its maximum varies as the logarithm
of the primary energy. This emerges also from more sophisticated shower models
and is observed experimentally. A real image of an electromagnetic shower in a cloud
chamber is shown in Fig. 4.10, right.
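The Heitler toy model translates directly into code (an illustrative sketch; the 10 GeV primary and E_c = 100 MeV are assumed example values):

```python
import math

def heitler(E0_MeV, Ec_MeV):
    """Heitler toy model: t_max = log2(E0/Ec), N_max = E0/Ec.

    Returns (depth of the maximum in radiation lengths, particles at maximum)."""
    y = E0_MeV / Ec_MeV
    return math.log2(y), y

t_max, N_max = heitler(E0_MeV=10_000.0, Ec_MeV=100.0)
print(f"t_max ~ {t_max:.1f} X0, N_max ~ {N_max:.0f} particles")
```

Doubling the primary energy adds exactly one radiation length to t_max, which is the logarithmic growth of the shower maximum mentioned above.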
An improved model was formulated by Rossi in the beginning of the 1940s.
Rossi (see for example reference [F4.1]) computed analytically the development
of a shower in the so-called “approximation B,” in which electrons lose energy
by ionization and bremsstrahlung (described by asymptotic formulae), and photons
undergo pair production, also described by asymptotic formulae. The whole process
is one-dimensional. The results of the “Rossi approximation B” are summarized in
Table 4.1. Under this approximation, the number of particles grows exponentially in
the beginning up to the maximum, and then decreases as shown in Figs. 4.11 and
4.12.
Table 4.1 Shower parameters for a particle of energy E₀ according to Rossi approximation B
(y = E₀/E_c)

                                 Incident electron        Incident photon
Peak of shower t_max             1.0 × (ln y − 1)         1.0 × (ln y − 0.5)
Center of gravity t_med          t_max + 1.4              t_max + 1.7
Number of e⁺ and e⁻ at peak      0.3y/√(ln y − 0.37)      0.3y/√(ln y − 0.31)
Total track length               y                        y
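The electron column of Table 4.1 can be sketched as follows (illustrative; the 10 GeV electron and E_c = 100 MeV are assumed example values):

```python
import math

def rossi_b_electron(E0, Ec):
    """Shower parameters for an incident electron, Rossi approximation B (Table 4.1)."""
    y = E0 / Ec
    t_max = 1.0 * (math.log(y) - 1.0)                 # depth of maximum, in X0
    t_med = t_max + 1.4                               # center of gravity, in X0
    n_peak = 0.3 * y / math.sqrt(math.log(y) - 0.37)  # e+ and e- at the peak
    return t_max, t_med, n_peak

# e.g. a 10 GeV electron in a material with Ec = 100 MeV (y = 100)
t_max, t_med, n_peak = rossi_b_electron(10_000.0, 100.0)
print(f"t_max ~ {t_max:.1f} X0, t_med ~ {t_med:.1f} X0, N_peak ~ {n_peak:.0f}")
```

Compared with the Heitler toy model, the peak is shallower (ln y − 1 rather than log₂ y) and the number of charged particles at the peak is well below y, since much of the energy is carried by photons.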
Fig. 4.11 Logarithm of the number of electrons for electron-initiated showers, calculated under
Rossi approximation B, as a function of the number of radiation lengths traversed. Multiplication
by E c /I (E c is called ε in the Figure) yields the specific ionization energy loss [F4.1]
Fig. 4.12 A Monte Carlo simulation of a 30 GeV electron-induced cascade in iron. The histogram
shows the fractional energy deposition per radiation length, and the curve is a fit to the distribution
using Eq. (4.12). The circles indicate the number of electrons with total energy greater than 1.5 MeV
crossing planes at X 0 /2 intervals (scale on the right) and the squares the number of photons above
the same energy crossing the planes (scaled down to have the same area as the electron distribution).
From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
5 Monte Carlo methods are computational algorithms based on repeated random sampling. The
name is due to its resemblance to the act of playing in a gambling casino.
Monte Carlo simulations⁵ give a more accurate account of the shower development,
as well as of the angular and lateral distribution of the shower particles;
Rossi approximation B, however, is faster and represents a rather accurate model.
The description of the transverse development of a shower is more complicated.
Usually the normalized lateral density distribution of electrons is approximated by
the Nishimura-Kamata-Greisen (NKG) function, which depends on the “shower age”
s, being 0 at the first interaction, 1 at the maximum and 3 at the death [F4.1]:
s = 3t/(t + 2t_max) .  (4.13)

The NKG function

ρ(r) ∝ (N_e/R_M²) (r/R_M)^(s−2) (1 + r/R_M)^(s−4.5) ,

where N_e is the electron shower size, r is the distance from the shower axis, and R_M is
a transverse scale called the Molière radius described below, is accurate for a shower
age 0.5 < s < 1.5.
literature (Greisen, Greisen–Linsley, etc.) and are mostly specific modifications of
the NKG function.
In a crude approximation, one can assume the transverse dimension of the shower
to be dictated by the Molière radius:
R_M ≃ (21 MeV/E_c) X₀ .
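Combining this with Eq. (4.5) for E_c gives a one-line estimate of the transverse shower size (an illustrative sketch; the X₀ values used are standard, not from the text):

```python
# Moliere radius R_M ~ (21 MeV / Ec) * X0, with Ec ~ 550 MeV / Z (Eq. 4.5).
def moliere_radius_cm(Z, X0_cm):
    """Rough transverse size of an electromagnetic shower."""
    Ec_MeV = 550.0 / Z
    return 21.0 / Ec_MeV * X0_cm

print(f"lead: R_M ~ {moliere_radius_cm(82, 0.56):.1f} cm")
print(f"iron: R_M ~ {moliere_radius_cm(26, 1.76):.1f} cm")
```

In high-Z materials the small E_c compensates the small X₀, so R_M varies much less between materials than X₀ itself does, coming out near a couple of centimeters in both cases.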
4.2 Particle Detectors

The aim of a particle detector is to measure the momenta and to identify the particles
that pass through it after being produced in a collision or a decay; this is called an
“event.” The position in space where the event occurs is known as the interaction
point.
In order to identify every particle produced by the collision, and plot the paths
they have taken—i.e., to “completely reconstruct the event”—it is necessary to know
the masses and momenta of the particles themselves. The mass can be computed by
measuring the momentum and either the velocity or the energy.
The characteristics of the different instruments that allow these measurements are
presented in what follows.
In a magnetic field, a charged particle follows a helical orbit with a radius proportional
to the momentum of the particle. Measuring the momentum thus requires the
determination of the best fit to a helix of the hits (track fit). For a particle of unit
charge
p ≃ 0.3 B⊥ R ,
where B⊥ is the component of the magnetic field perpendicular to the particle velocity,
expressed in tesla (one tesla is the order of magnitude of typical fields in detectors),
the momentum p is expressed in GeV, and R is the radius of curvature of the helix
in meters.
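Inverting the relation is a one-liner (an illustrative sketch; the 2 T field and the momenta chosen are assumed example values):

```python
# p [GeV] ~ 0.3 * B_perp [T] * R [m] for a unit-charge particle.
def momentum_GeV(B_T, R_m):
    """Momentum from the measured radius of curvature."""
    return 0.3 * B_T * R_m

def radius_m(p_GeV, B_T):
    """Radius of curvature for a given momentum and field."""
    return p_GeV / (0.3 * B_T)

print(f"R(10 GeV, 2 T)  = {radius_m(10.0, 2.0):.1f} m")
print(f"p(B = 2 T, R = 1 m) = {momentum_GeV(2.0, 1.0):.1f} GeV")
```

A 10 GeV track in a 2 T field bends on a radius of about 17 m, so over a meter-scale tracker only a small sagitta is measured, which is why precise hit positions matter so much.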
A source of uncertainty in this determination is given by the errors in the measurement
of the hits; another (intrinsic) contribution comes from multiple scattering. In what
follows we shall review some detectors used to determine the trajectory of charged
tracks.
The cloud chamber was invented by C.T.R. Wilson at the beginning of the twentieth
century, and was used as a detector for reconstructing the trajectories of charged cosmic
rays. The instrument, already discussed in the previous chapter, is a container with a
glass window, filled with air and saturated water vapor (Fig. 3.8); the volume can be
suddenly expanded, and the adiabatic expansion causes the temperature to decrease,
bringing the vapor to a supersaturated (metastable) state. A charged particle crossing
the chamber produces ions, which act as seeds for the generation of droplets along
the trajectory. One can record the trajectory by taking a photographic picture. If the
chamber is immersed in a magnetic field B, momentum and charge can be measured
by the curvature.
The working principle of bubble chambers (Fig. 4.14) is similar to that of the
cloud chamber, but here the fluid is a liquid: along the trajectory of the particle, a
trail of gas bubbles forms around the ions.
Fig. 4.14 Left The BEBC bubble chamber. Center A picture taken in BEBC, and right its inter-
pretation. Credits: CERN
Due to the higher density of liquids compared with gases, the interaction prob-
ability is larger for bubble chambers than for gas chambers, and bubble chambers
act at the same time both as an effective target and as a detector. Different liquids
can be used, depending on the type of experiment: hydrogen to have protons as a
target nucleus, deuterium to study interactions on neutrons, etc. From 1950 to the mid
1980s, before the advent of electronic detectors, bubble chambers were the reference
tracking detectors. Very large chambers were built (the Big European Bubble Cham-
ber BEBC now displayed at the entrance of the CERN exhibition is a cylinder with
an active volume of 35 cubic meters), and wonderful pictures have been recorded.
Bubble and cloud chambers provide complete information: the measurement of
the bubble density (their number per unit length) provides an estimate of the specific
ionization energy loss d E/d x, hence βγ = p/Mc; the range, i.e., the total track
length before the particle eventually stops (if the stopping point is recorded), provides
an estimate for the initial energy; the multiple scattering (see below) provides an
estimate for the momentum.
A weak point of cloud and bubble chambers is their dead time: after an expansion,
the fluid must be re-compressed. This might take a time ranging from about 50 ms
for small chambers (LEBC, the LExan Bubble Chamber, used in the beginning of
the 1980s for the study of the production and decay of particles containing the
quark charm, had an active volume of less than a liter) to several seconds. Due
to this limitation and to the labor-consuming visual scanning of the photographs,
bubble chambers were abandoned in the mid-1980s—cloud chambers had been
abandoned much earlier.
A nuclear emulsion is a photographic plate with a thick emulsion layer and very
uniform grain size. Like bubble chambers and cloud chambers they record the tracks
of charged particles passing through, by changing the chemical status of grains that
have absorbed photons (which makes them visible after photographic processing).
They are compact, have high density, but have the disadvantages that the plates must
be developed before the tracks can be observed, and they must be visually examined.
Nuclear emulsions have a very good space resolution, of the order of 1 µm.
They had great importance in the beginning of cosmic-ray physics, and they are still
used in neutrino experiments (where interactions are rare) due to their lower cost per
unit volume compared to semiconductor detectors, and to the fact that they are
unsurpassed in single-point space resolution. They recently
had a revival with the OPERA experiment at the LNGS underground laboratory in
Gran Sasso, Italy, detecting the interactions of a beam of muon neutrinos sent from
the CERN SPS in Geneva, 730 km away.
4.2 Particle Detectors 123
These three kinds of detectors have the same principle of operation: they consist
of a tube filled with a gas, with a charged metal wire inside (Fig. 4.15). When a
charged particle enters the detector, it ionizes the gas, and the ions and the electrons
can be collected by the wire and by the walls (the mobility of electrons being larger
than the mobility of ions, it is convenient that the wire’s potential is positive). The
electrical signal of the wire can be amplified and read out by means of an ammeter.
The voltage V of the wire must be larger than a threshold below which ions and
electrons spontaneously recombine.
Depending on the tension V of the wire, one can have three different regimes
(Fig. 4.16):
• The ionization chamber regime when V < I/e (I is the ionization energy of
the gas, and e the electron charge). The primary ions produced by the track are
collected by the wire, and the signal is then proportional to the energy released by
the particle.
• The proportional counter regime when V > I/e, but V smaller than a breakdown
potential V_GM (see below). The ions and the electrons are then accelerated to an
energy such that they can ionize the gas. The signal is thus amplified, and it generates
an avalanche of electrons around the anode. The signal is then proportional to the
wire voltage.
• Above the potential V_GM, the gas is completely ionized; the signal is then a short
pulse of height independent of the energy of the particle (Geiger-Müller regime).
Geiger-Müller tubes are also appropriate for detecting gamma radiation, since a
photoelectron can generate an avalanche.
124 4 Particle Detection
Fig. 4.16 Practical gaseous ionization detector regions: variation of the ion charge with applied
voltage in a counter, for a constant incident radiation. By Doug Sim (own work) [CC BY-SA 3.0
http://creativecommons.org/licenses/by-sa/3.0], via Wikimedia Commons
6 Jerzy (“Georges”) Charpak (1924–2010) was awarded the Nobel prize in Physics in 1992 “for his
invention and development of particle detectors, in particular the multiwire proportional chamber.”
Charpak was a Polish-born French physicist. Coming from a Jewish family, he was deported to the
Nazi concentration camp at Dachau; after the liberation he studied in Paris, and from 1959
he worked at CERN, Geneva.
Fig. 4.18 The flash chamber built by the Laboratório de Instrumentação e Partículas (LIP) in Lisbon
for didactical purposes records a shower of cosmic rays
By arranging two wire planes perpendicularly one can determine the position of a
particle. The typical response time is of the order of 30 ns.
These are typically multianode (possibly multiwire) chambers operating in the Geiger-
Müller regime. Short high-voltage pulses, corresponding to fields of the order of 10 kV/cm, are applied between sub-
sequent planes; when a particle passes through the chamber, it can generate a series of
discharges which can be visible—a sequence of flashes along the trajectory, Fig. 4.18.
Fig. 4.19 Scheme of a silicon microstrip detector, arranged in a double-side geometry (strips are
perpendicular). Source http://spie.org/x20060.xml
An extreme case is time projection chambers (TPC), for which drift lengths can
be very large (up to 2 m), and the sense wires are arranged at one end; signals in pads
or strips near the signal wire plane are used to obtain three-dimensional information.
4.2.1.8 Scintillators
Scintillators are among the oldest particle detectors. They are slabs of transparent
material, organic or inorganic; the ionization induces fluorescence, and light is con-
veyed towards a photosensor (photosensors will be described later). The light yield
is large (can be as large as 10^4 photons per MeV of energy deposited), and the time
of formation of the signal is very fast (typically less than 1 ns): they are appropriate
for trigger7 systems.
To make the light travel efficiently towards the photosensor (photomultiplier),
light guides are frequently used (Fig. 4.20). Sometimes the fluorescence is dominated
by short wavelengths; in this case it is appropriate to match the photosensor optical
efficiency with a wavelength shifter (a material which absorbs light
and re-emits it at a more appropriate wavelength).
The scintillators can be used as tracking devices, in the so-called “hodoscope”
configuration (from the Greek “hodos” for path, and “skope” for observation) as in
the case of silicon strips.
7 A trigger is an electronic system that uses simple criteria to rapidly decide which events in a particle
detector to keep in cases where only a small fraction of the total number of events can be recorded.
The experimenter can use two segments shaped like strips, arranged in two layers.
The strips of the two layers should be arranged in perpendicular directions (let us call
them horizontal and vertical). A particle passing through hits a strip in each layer;
the vertical scintillator strip reveals the horizontal position of the particle, and the
horizontal strip indicates its vertical position (as in the case of two wire chambers
with perpendicular orientation of the wires, but with poorer resolution). Scintillator
hodoscopes are among the cheapest detectors for tracking charged particles.
Among scintillators, some are polymeric (plastic); plastic scintillators are partic-
ularly important due to their good performance at low price, their high light output
and relatively fast (few ns) signal, and in particular their ability to be shaped
into almost any desired form.
Table 4.2 Typical characteristics of different kinds of tracking detectors (data come from K.A.
Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001)
Detector type | Spatial resolution | Time resolution | Dead time
RPC | ≤ 10 mm | ∼1 ns (down to ∼50 ps) | —
Scintillation counter | 10 mm | 0.1 ns | 10 ns
Emulsion | 1 µm | — | —
Bubble chamber | 10–100 µm | 1 ms | 50 ms–1 s
Proportional chamber | 50–100 µm | 2 ns | 20–200 ns
Drift chamber | 50–100 µm | few ns | 20–200 ns
Silicon strip | pitch/5 (few µm) | few ns | 50 ns
Silicon pixel | 10 µm | few ns | 50 ns
4.2.2 Photosensors
Most detectors in particle physics and astrophysics rely on the detection of photons
near the visible range, i.e., in the eV energy range. This range covers scintillation and
Cherenkov radiation as well as the light detected in many astronomical observations.
Essentially, one needs to extract a measurable signal from a (usually very small)
number of incident photons. This goal can be achieved by generating a primary
photoelectron or electron–hole pair by an incident photon (typically by photoelectric
effect), amplifying the signal to a detectable level (usually by a sequence of avalanche
processes), and collecting the secondary charges to form the electrical signal.
The important characteristics of a photodetector include:
• the quantum efficiency QE, namely the probability that a primary photon generates
a photoelectron;
• the collection efficiency C, related to the overall acceptance;
• the gain G, i.e., the number of electrons collected for each photoelectron generated;
• the dark noise DN, i.e., the electrical signal present when there is no incoming photon;
and
• the intrinsic response time of the detector.
Several kinds of photosensor are used in experiments.
Fig. 4.21 Scheme of a photomultiplier attached to a scintillator. Source Colin Eberhardt [public
domain], via Wikimedia Commons
the focusing electrode toward the electron multiplier chain, where they are multiplied
by secondary emission.
The electron multiplier consists of several dynodes, each held at a higher positive
voltage than the previous one (the typical total voltage in the avalanche process
being of 1–2 kV). The electrons produced in the photocathode have the energy of
the incoming photon (minus the work function of the photocathode, i.e., the energy
needed to extract the electron itself from the metal, which typically amounts to a few
eV). As the electrons enter the multiplier chain, they are accelerated by the electric
field. They hit the first dynode with an already much higher energy. Low-energy
electrons are then emitted, which in turn are accelerated towards the second dynode.
The dynode chain is arranged in such a way that an increasing number of electrons are
produced at each stage. When the electrons finally reach the anode, the accumulation
of charge results in a sharp current pulse. This is the result of the arrival of a photon
at the photocathode.
Photocathodes can be made of a variety of materials with different properties.
Typically materials with a low work function are chosen.
The typical quantum efficiency of a photomultiplier is about 30 % in the
wavelength range from 300 to 800 nm, and the gain G is in the range
10^5–10^6.
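The order of magnitude of the gain can be understood as the product of the per-dynode multiplications; a minimal sketch, assuming an illustrative secondary-emission yield of about 4 electrons per dynode over 10 dynodes (both values are assumptions, typical rather than taken from the text):

```python
# Sketch: overall PMT gain from per-dynode secondary emission.
# Assumed illustrative values: each dynode emits delta ≈ 4 secondary
# electrons per incident electron, over N = 10 dynodes.
delta = 4          # secondary-emission yield per dynode (assumption)
n_dynodes = 10     # number of dynodes (assumption)

gain = delta ** n_dynodes
print(f"G ≈ {gain:.1e}")
```

The result, about 10^6, falls within the 10^5–10^6 range quoted above.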
A recent improvement to the photomultiplier was obtained thanks to hybrid photon
detectors (HPD), in which a vacuum PMT is coupled to a silicon sensor. A photo-
electron ejected from the photocathode is accelerated through a potential difference
of about ΔV ≈ 20 kV before it hits a silicon sensor/anode. The number of electron–
hole pairs that can be created in a single acceleration step is G ∼ ΔV/(3.6 V), the
denominator corresponding to the mean energy of 3.6 eV needed to create an electron–hole
pair in silicon. The linear behavior of the gain is helpful because, unlike in exponential-gain
devices, high-voltage stability translates into gain stability. HPD detectors can work as single-photon
counters.
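The linear gain of an HPD follows directly from the numbers in the text; a one-line check, taking ΔV = 20 kV and 3.6 eV per electron–hole pair in silicon:

```python
# Sketch: HPD gain for the ~20 kV acceleration step quoted in the text,
# with 3.6 eV needed per electron-hole pair in silicon.
delta_v = 20_000.0   # accelerating potential difference in volts
e_per_pair = 3.6     # mean energy per e-h pair in Si, in eV

gain = delta_v / e_per_pair
print(f"G ≈ {gain:.0f}")   # a few thousand pairs per photoelectron
```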
Fig. 4.22 Left Image of the hits on the photon detectors of the RICHs of the LHCb experiment at
CERN with superimposed rings. Credit: LHCb collaboration. Right Dependence of the Cherenkov
angle measured by the RICH of the ALICE experiment at CERN on the particle momentum; the
angle can be used to measure the mass through Eq. 4.6 (β = p/E). Credit: ALICE Collaboration
4.2.5 Calorimeters
An ideal material used for an electromagnetic calorimeter should have a short radia-
tion length, so that one can contain the electromagnetic shower in a compact detec-
tor, and the signal should travel unimpeded through the absorber (homogeneous
calorimeters). However, sometimes materials which can be good converters and
conductors of the signals are very expensive: one then uses sampling calorimeters,
where the degraded energy is measured in a number of sensitive layers separated by
passive absorbers.
The performance of calorimeters is limited both by the unavoidable fluctuations
of the elementary phenomena through which the energy is degraded and by the
technique chosen to measure the final products of the cascade processes.
Homogeneous Calorimeters. Homogeneous calorimeters may be built with heavy
(high density, high Z ) scintillating crystals, i.e., crystals in which ionization energy
loss results in the emission of visible light, or Cherenkov radiators such as lead glass
and lead fluoride. The material acts as a medium for the development of the shower,
as a transducer of the electron signal into photons, and as a light guide towards
the photodetector. Scintillation light and/or ionization can be detected also in noble
liquids.
Sampling Calorimeters. Layers of absorber are typically alternated with layers of
active material (sandwich geometry). The absorber helps the development of the
electromagnetic shower, while the active material transforms part of the energy into
photons, which are guided towards the photodetector. Different geometries can be
used: for example sometimes rods of active material cross the absorber (spaghetti
geometry).
Converters have high density and a short radiation length. Typical materials are iron
(Fe), lead (Pb), and uranium. Typical active materials are plastic scintillators, silicon,
liquid ionization chambers, and gas detectors.
A disadvantage of sampling calorimeters is that only part of the deposited particle
energy is detected in the active layers, typically a few percent (and even one or
two orders of magnitude less in the case of gaseous detectors). These sampling
fluctuations typically result in a worse energy resolution than for homogeneous calorimeters.
Electromagnetic Calorimeters: Comparison of the Performance. The fractional
energy resolution ΔE/E of a calorimeter can be parameterized as

ΔE/E = a/√E ⊕ b ⊕ c/E ,

where the symbol ⊕ indicates a sum in quadrature; a is called the stochastic term, b the constant term, and c the noise term (E in GeV).
Table 4.3 Main characteristics of some electromagnetic calorimeters (data from K.A. Olive et al.
(Particle Data Group), Chin. Phys. C 38 (2014) 090001)
Technology (experiment) | Depth (X0) | Energy resolution (relative)
BGO (L3) | 22 | 2 %/√E ⊕ 0.7 %
CsI (kTeV) | 27 | 2 %/√E ⊕ 0.45 %
PbWO4 (CMS) | 25 | 3 %/√E ⊕ 0.5 % ⊕ 0.2 %/E
Lead glass (DELPHI, OPAL) | 20 | 5 %/√E
Scintillator/Pb (CDF) | 18 | 18.5 %/√E
Liquid Ar/Pb (SLD) | 21 | 12 %/√E
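The ⊕ in the table denotes a sum in quadrature. As a sketch, evaluating the PbWO4 (CMS) entry at an illustrative energy of 100 GeV (the energy is my choice, not from the text):

```python
import math

# Sketch: fractional energy resolution a/sqrt(E) ⊕ b ⊕ c/E, where ⊕ is a
# sum in quadrature; a, b, c from the PbWO4 (CMS) row of Table 4.3.
def frac_resolution(E, a, b, c):
    """Fractional energy resolution; E in GeV, a/b/c as fractions."""
    return math.sqrt((a / math.sqrt(E)) ** 2 + b ** 2 + (c / E) ** 2)

res = frac_resolution(100.0, 0.03, 0.005, 0.002)
print(f"dE/E ≈ {100 * res:.2f} %")
```

At high energies the constant term b dominates, which is why it is kept separate from the stochastic term in the parameterization.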
that the energy of the incident particle is proportional to the multiplicity of charged
particles.
Most large hadron calorimeters are sampling calorimeters installed as part of
complex detectors at accelerator experiments. The basic structure typically consists
of absorber plates (Fe, Pb, Cu, or occasionally U or W) alternating with plastic
scintillators (shaped as plates, tiles, bars), liquid argon (LAr) chambers, or gaseous
detectors (Fig. 4.23). The ionization is measured directly, as in LAr calorimeters, or
via scintillation light observed in photodetectors (usually photomultipliers).
The fluctuations in the invisible energy and in the hadronic component of a shower
affect the resolution of hadron calorimeters.
A hadron with energy E generates a cascade in which there are repeated hadronic
collisions. In each of these, neutral pions are produced which immediately decay
to photons. A fraction of the energy is converted to a potentially observable signal
with an efficiency which is in general different from, and usually larger than, the hadronic
detection efficiency. The response to hadrons is thus not compensated with respect
to the response to electromagnetic particles (or to the electromagnetic part of the
hadronic shower).
Due to all these problems, typical fractional energy resolutions are of the order of
30–50 %/√E.
We have seen that when we use a beam of particles as a microscope, like Rutherford
did in his experiment, the minimum distance we can sample (for example, to probe
a possible substructure in matter) decreases with increasing energy. According to de
Broglie’s equation, the relation between the momentum p and the wavelength λ of
a wave packet is given by
λ = h/p .
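A quick numerical check of this relation, using the handy combination hc ≈ 1240 MeV·fm (accurate to about 0.1 %):

```python
# Sketch: de Broglie wavelength lambda = h/p, using hc ≈ 1240 MeV·fm so
# that lambda [fm] = 1240 / (pc [MeV]) for the probe particle.
hc = 1240.0    # h*c in MeV·fm, to ~0.1 % accuracy
pc = 1000.0    # a 1 GeV/c beam particle (illustrative value)

lam = hc / pc  # wavelength in fm
print(f"lambda ≈ {lam:.2f} fm")
```

A 1 GeV/c probe thus has a wavelength of about 1.24 fm, comparable to the size of a nucleon, which is why higher energies are needed to resolve finer substructure.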
[(a) W (Cu) absorber with LAr-filled tubes; (b) scintillator tiles with waveshifter fibers read out by PMTs.]
Fig. 4.23 The hadronic calorimeters of the ATLAS experiments at LHC. Credit: CERN
This means that, in a fixed target experiment, the centre-of-mass energy grows only
with the square root of the beam energy E (√s ≃ √(2Em) for a target of mass m, when E ≫ m). In
beam–beam collisions in the centre-of-mass frame, instead, E_CM = 2E.
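A sketch comparing the two cases for a proton beam, using an illustrative LHC-like beam energy (my choice of numbers; the exact fixed-target relation is s = 2Em + 2m²):

```python
import math

# Sketch: centre-of-mass energy, fixed target vs. symmetric collider.
m = 0.938     # proton mass in GeV
E = 6500.0    # beam energy in GeV (illustrative, LHC-like)

sqrt_s_fixed = math.sqrt(2 * E * m + 2 * m**2)  # exact: s = 2Em + 2m^2
sqrt_s_collider = 2 * E                          # head-on, equal energies

print(f"fixed target: {sqrt_s_fixed:.0f} GeV")
print(f"collider:     {sqrt_s_collider:.0f} GeV")
```

The collider reaches 13000 GeV while the same beam on a fixed target yields only about 110 GeV, which is the reason colliders dominate the energy frontier.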
A particle of charge q and speed v in an electric field E and a magnetic field B feels
a force
F = q( E + v × B) .
The electric field can thus accelerate the particle; the work by the magnetic field is
zero, nevertheless the magnetic field can be used to control the particle’s trajectory.
8 The Nobel prize for physics in 1984 was assigned to the Italian physicist Carlo Rubbia (1934–) and
to the Dutch physicist Simon van der Meer (1925–2011) “for their decisive contributions to the large project,
which led to the discovery of the field particles W and Z, communicators of weak interaction.” In
short, Rubbia and van der Meer used feedback signals sent around the ring to reduce the entropy of the
beam; this technique allowed the accumulation of focused particles with unprecedented efficiency,
and is at the basis of all modern accelerators.
Fig. 4.24 Scheme of an acceleration line displayed at two different times. By Sgbeer (own work)
[GFDL http://www.gnu.org/copyleft/fdl.html], via Wikimedia Commons
For example, a magnetic field perpendicular to v can constrain the particle along a
circular trajectory perpendicular to B.
If a single potential were applied, increasing the energy would demand ever increasing
voltages. The solution is to apply a limited potential difference many times.
An acceleration line (which corresponds roughly to a linear accelerator) works as
follows. In a beam pipe (a cylindrical tube in which vacuum has been made) cylin-
drical electrodes are aligned. A pulsed radiofrequency (RF) source of electromotive
force V is applied. Thus particles are accelerated when passing in the RF cavity
(Fig. 4.24); the period is adjusted in such a way that half a period corresponds to
the time needed by the particle to cross the cavity. The potential between the cylinders
is reversed while the particle is located within them.
To have a large number of collisions, it is useful that particles are accelerated in
bunches. This introduces an additional problem, since the particles tend to diverge
due to mutual electrostatic repulsion. Divergence can be compensated thanks to
focusing magnets (for example quadrupoles, which squeeze beams in a plane).
A collider consists of two circular or almost circular accelerator structures with
vacuum pipes, magnets and accelerating cavities, in which two beams of particles
travel in opposite directions. The particles may be both protons, or protons and
antiprotons, or electrons and positrons, or electrons and protons, or also nuclei and
nuclei. The two rings intercept each other at a few positions along the circumference,
where bunches can cross and particles can interact. In a particle–antiparticle collider
(electron–positron or proton-antiproton), as particles and antiparticles have opposite
charges and the same mass, a single magnetic structure is sufficient to keep the two
beams circulating in opposite directions.
4.3 High-Energy Particles 139
An important parameter for an accelerator is the maximum center-of-mass (c.m.)
energy √s available, since this sets the maximum mass of new particles that can be
produced.
Another important parameter is luminosity, already discussed in Chap. 2. Imagine
a physical process has a cross section σproc ; the number of outcomes of this process
per unit time can be expressed as
dN_proc/dt = σ_proc (dL/dt) .
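A minimal sketch of this rate formula, with illustrative (assumed) numbers for the cross section and instantaneous luminosity:

```python
# Sketch: event rate = cross section x instantaneous luminosity.
# Illustrative assumptions: sigma = 50 pb, dL/dt = 1e34 cm^-2 s^-1.
sigma_pb = 50.0
sigma_cm2 = sigma_pb * 1e-36   # 1 pb = 1e-36 cm^2
lumi = 1e34                     # instantaneous luminosity in cm^-2 s^-1

rate = sigma_cm2 * lumi         # events per second
print(f"dN/dt ≈ {rate:.2f} Hz")
```

For a 50 pb process, one event every two seconds: rare processes are why luminosity matters as much as energy.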
As we have already shown, cosmic rays can attain energies much larger than particles
produced at human-made accelerators. The main characteristics of cosmic rays have
been explained in Sect. 1.4 and in Chap. 3.
We just recall here that the distribution in energy (the so-called spectrum) of
cosmic rays is quite well described by a power law E^−p, with the so-called spectral
index p around 3 on average (Fig. 1.8), extending up to about 10^21 eV (above this
energy the GZK cutoff, explained in the previous chapters, stops the cosmic travel
of particles; a similar mechanism works for heavier nuclei, which undergo photo-
disintegration during their cosmic travel). The majority of the high-energy particles
in cosmic rays are protons (hydrogen nuclei); about 10 % are helium nuclei (nuclear
physicists call them usually “alpha particles”), and 1 % are neutrons or nuclei of
heavier elements. These together account for 99 % of the cosmic rays, and electrons
and photons make up the remaining 1 %. The number of neutrinos is estimated to
be comparable to that of high-energy photons, but it is very high at low energy
because of the nuclear processes that occur in the Sun: such processes involve a
large production of neutrinos. Cosmic rays hitting the atmosphere (called primary
cosmic rays) generally produce secondary particles that can reach the Earth’s surface,
through multiplicative showers.
The reason why human-made accelerators cannot compete with cosmic accelera-
tors from the point of view of the maximum attainable energy is that with the present
technologies acceleration requires confinement within a radius R by a magnetic field
B, and the final energy is proportional to the product of R times B. On Earth, it is
difficult to imagine reasonable radii of confinement larger than one hundred kilo-
meters and magnetic fields stronger than ten tesla (one hundred thousand times the
Earth’s magnetic field). This combination can provide energies of a few tens of TeV,
such as those of the LHC accelerator at CERN. In nature there are accelerators with
much larger radii, such as supernova remnants (hundreds of light years) and active
galactic nuclei (tens of thousands of light years): one can thus reach ener-
gies as large as 10^21 eV (the so-called Extremely-High-Energy, EHE, cosmic rays;
cosmic rays above 10^18 eV are instead called Ultra-High-Energy, UHE). Of course
terrestrial accelerators have great advantages, like luminosity and the possibility to know
the initial conditions.
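The proportionality of the final energy to the product R·B is the usual bending relation p[GeV/c] ≈ 0.3 B[T] R[m]; a sketch checking it against LHC-like values (the field and bending radius below are public approximate figures, used here as assumptions):

```python
# Sketch: maximum momentum from the bending relation p ≈ 0.3 * B * R,
# with p in GeV/c, B in tesla, R in metres.
B = 8.33      # dipole field in T (LHC-like assumption)
R = 2800.0    # effective bending radius in m (LHC-like assumption)

p = 0.3 * B * R   # momentum in GeV/c
print(f"p ≈ {p / 1000:.1f} TeV/c per beam")
```

The result, about 7 TeV/c per beam, matches the LHC design scale; scaling the same relation to hundred-kilometre radii and ten-tesla fields shows why terrestrial machines top out far below the 10^21 eV of EHE cosmic rays.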
The conditions are synthetically illustrated in the so-called Hillas plot (Fig. 10.32),
a scatter plot in which different cosmic objects are grouped according to their sizes
and magnetic fields; this will be discussed in larger detail in Chap. 10. EHE can be
reached in the surroundings of active galactic nuclei, or in gamma-ray bursts. The
product of R times B in supernova remnants is such that particles can reach energies
of some PeV.
Detectors at experimental facilities are in general hybrid, i.e., they combine many
of the detectors discussed so far, such as drift chambers, Cherenkov detectors, and
electromagnetic and hadronic calorimeters. They are built up in a sequence of layers,
each one designed to measure a specific aspect of the particles produced after the
collision.
Starting with the innermost layer the successive layers are typically as follows:
• A tracking system: this is designed to track all the charged particles and allow
for complete event reconstruction. It is in general the first layer crossed by the
particles, in such a way that their properties have not yet been deteriorated by the
interaction with the material of the detector. It should have as little material as
possible, so as to preserve the particles for the subsequent layers.
• A layer devoted to electromagnetic calorimetry.
• A layer devoted to hadronic calorimetry.
4.4 Detector Systems and Experiments at Accelerators 141
Fig. 4.25 Overview of the signatures by a particle in a multilayer hybrid detector. Credit: CERN
• A layer of muon tracking chambers: any particle releasing signal on these tracking
detectors (often drift chambers) has necessarily traveled through all the other layers
and is very likely a muon (neutrinos have extremely low interaction cross sections,
and most probably they cross also the muon chambers without leaving any signal).
A layer containing a solenoid can be inserted after the tracking system, or after the
calorimeter. Tracking in the magnetic field allows the measurement of momenta.
The particle species can be identified by energy loss, curvature in magnetic field,
and Cherenkov radiation. However, the search for the identity of a particle can be
significantly narrowed down by simply examining which parts of the detector it
deposits energy in:
• Photons leave no tracks in the tracking detectors (unless they undergo pair pro-
duction) but produce a shower in the electromagnetic calorimeter.
• Electrons and positrons leave a track in the tracking detectors and produce a shower
in the electromagnetic calorimeter.
• Muons leave tracks in all the detectors (likely as a minimum ionizing particle in
the calorimeters).
• Long-lived charged hadrons (protons for example) leave tracks in all the detectors
up to the hadronic calorimeter where they shower and deposit all their energy.
• Neutrinos are identified by missing energy-momentum when the relevant conser-
vation law is applied to the event.
These signatures are summarized in Fig. 4.25.
Fig. 4.26 A configuration of the European Hybrid Spectrometer (a fixed target detector at the CERN
Super-proton Synchrotron). From M. Aguilar-Benitez et al., “The European hybrid spectrometer,”
Nucl. Instr. Methods 258 (1987) 26
In a fixed target experiment, relativistic effects make the interaction products highly
collimated. In such experiments then, in order to enhance the possibility of detection
at small x_T (x_T = p_T/√s, where p_T is the momentum component perpendicular
to the beam direction), different stages are separated by magnets opening up the
charged particles in the final state (lever arms).
The first detectors along the beam line should be non-destructive; at the end of the
beam line, one can have calorimeters. Two examples are given in the following: the
first is a fixed target experiment from the past, while the second is an almost fixed
target detector presently operating.
The European Hybrid Spectrometer EHS was operational during the 1970s
and the beginning of the 1980s at the North Area of CERN, where beams of protons
were extracted from the SPS (Super-proton Synchrotron)9 accelerator at energies
ranging from 300 to 400 GeV. Such particles could generate secondary
beams of charged pions of slightly lower energies via a beam dump and a velocity
selector based on a magnetic field. EHS was a multi-stage detector serving different
experiments (NA16, NA22, NA23, NA27). Here we describe a typical configuration;
Fig. 4.26 shows a schematic drawing of the EHS set-up.
In the Figure, the beam particles come in from the left. Their direction is deter-
mined by the two small wire chambers U1 and U3. From the collision point inside a
rapid cycling bubble chamber (RCBC; the previously described LEBC is an exam-
ple, with a space resolution of 10 µm) most of the produced particles will enter
the downstream part of the spectrometer. The fast ones (typically with momentum
p > 30 GeV/c) will go through the aperture of the magnet M2 to the so-called second
lever arm.
9A synchrotron is a particle accelerator ring, in which the guiding magnetic field (bending the
particles into a closed path) is time dependent and synchronized to a particle beam of increasing
kinetic energy. The concept was developed by the Soviet physicist Vladimir Veksler in 1944.
The RCBC acts both as a target and as a vertex detector. If an event is triggered,
stereoscopic pictures are taken with 3 cameras and recorded on film.
The momentum resolution of the secondary tracks depends on the number of
detector element hits available for the fits. For low momentum tracks, typically
p < 3 GeV/c, the length and direction of the momentum vector at the collision point
can be well determined from the RCBC. On the other hand, tracks with p > 3 GeV/c
have a very good chance to enter the so-called first lever arm. This is defined by the
group of four wire chambers W2, D1, D2, and D3 placed between the two magnets
M1 and M2.
Fast tracks, typically p > 50 GeV/c, have a good chance to go in addition through
the gap of the magnet M2 and enter the second lever arm, consisting of the three drift
chambers D4, D5, and D6.
To detect gammas, two electromagnetic calorimeters are used in EHS, the inter-
mediate gamma detector (IGD) and the forward gamma detector (FGD). IGD is
placed before the magnet M2. It has a central hole to allow fast particles to pass
into the second lever arm. FGD covers this hole at the end of the spectrometer. The
IGD has been designed to measure both the position and the energy of a shower in
a two-dimensional matrix of lead-glass counters 5 cm × 5 cm in size, each of them
connected to a PMT. The FGD consists of three separate sections. The first section
is the converter (a lead glass wall), to initiate the electromagnetic shower. The sec-
ond section (the position detector) is a three-plane scintillator hodoscope. The third
section is the absorber, a lead-glass matrix deep enough (60 radiation length) to
totally absorb showers up to the highest available energies. For both calorimeters,
the relative accuracy on energy reconstruction is ΔE/E ≈ 0.1/√E ⊕ 0.02.
The spectrometer included also three detectors devoted to particle identification:
the silica-aerogel Cherenkov detector SAD, the ISIS chamber measuring specific
ionization, and the transition radiation detector TRD.
LHCb (“Large Hadron Collider beauty”) is a detector at the Large Hadron Collider
accelerator at CERN. LHCb is specialized in the detection of b-hadrons (hadrons
containing a bottom quark). A sketch of the detector is shown in Fig. 4.27.
Although in strict terms LHCb is a colliding-beam experiment, it is operated like a
fixed target one: the strongly boosted b-hadrons fly along the beam direction, and
only one side of the interaction point is instrumented.
At the heart of the detector is the vertex detector, recording the decays of the
b particles, which have typical lifetimes of about 1 ps and will travel only about
10 mm before decaying. It has 17 planes of silicon (radius 6 cm) spaced over a meter,
each consisting of two disks (in order to measure radial and polar coordinates), and
provides a hit resolution of about 10 µm, and an impact-parameter resolution of
about 40 µm for high-momentum tracks.
Downstream of the vertex detector, the tracking system (made of 11 tracking
chambers) reconstructs the trajectories of emerging particles. LHCb’s 1.1 T super-
conducting dipole spectrometer magnet (inherited from the DELPHI detector at LEP,
see later) opens up the tracks.
Particle identification is performed by two ring-imaging Cherenkov (RICH) detec-
tor stations. The first RICH is located just behind the vertex detector and equipped
with 5 cm of silica aerogel and 1 m of C4F10 gas as radiators, while the second one consists
of 2 m of CF4 gas radiator behind the tracker. Cherenkov photons are picked up by
a hybrid photodiode array.
The electromagnetic calorimeter, installed following the second RICH, is a
“shashlik” structure of scintillator and lead read out by wavelength-shifting fibers. It
has three annular regions with different granularities in order to optimize readout. A
lead-scintillator preshower detector improves electromagnetic particle identification.
The hadron calorimeter is made of scintillator tiles embedded in iron. Like the
electromagnetic calorimeter upstream, it has three zones of granularity. Downstream,
shielded by the calorimetry, are four layers of muon detectors. These are multigap
resistive plate chambers and cathode pad chambers embedded in iron, with an addi-
tional layer of cathode pad chambers mounted before the calorimeters. Besides muon
identification, this provides important input for triggering.
There are four levels of triggering. The initial (level 0) decisions are based on
a high transverse-momentum particle and use the calorimeters and muon detectors.
This reduces the 40 MHz input rate by a factor of 40. The next trigger level (level 1)
is based on vertex detector (to look for secondary vertices) and tracking informa-
tion, and reduces the data by a factor of 25 to an output rate of 40 kHz. Level 2,
suppressing fake secondary decay vertices, achieves further eightfold compression.
Level 3 reconstructs B decays to select specific decay channels, achieving another
compression factor of 25. Data are written to tape at 200 Hz.
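The cascade of reduction factors quoted above can be checked directly:

```python
# Sketch: the LHCb trigger cascade rates quoted in the text.
rate = 40e6     # 40 MHz bunch-crossing input rate
rate /= 40      # level 0: high-pT calorimeter/muon trigger -> 1 MHz
rate /= 25      # level 1: vertex and tracking information  -> 40 kHz
rate /= 8       # level 2: fake secondary-vertex rejection  -> 5 kHz
rate /= 25      # level 3: full B-decay reconstruction      -> 200 Hz
print(f"output rate: {rate:.0f} Hz")
```

The product of the four factors (40 x 25 x 8 x 25 = 200 000) brings the rate from 40 MHz down to the 200 Hz written to tape.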
The modern particle detectors in use today at colliders are, as far as possible,
hermetic detectors. They are designed to cover most of the solid angle around the
interaction point (a limitation being imposed by the presence of the beam pipe). The
typical detector consists of a cylindrical section covering the “barrel” region and two
endcaps covering the “forward” regions.
In the standard coordinate system, the z axis is along the beam direction, the x
axis points towards the center of the ring, and the y axis points upwards. The polar
angle to the z axis is called θ and the azimuthal angle is called φ; the radial coordinate
is R = √(x² + y²).
Frequently the polar angle is replaced by a coordinate called pseudorapidity η,
defined as

η = − ln tan(θ/2) ;

the region η ≃ 0 corresponds to θ ≃ π/2, and is called the central region. When
in Chap. 6 we shall discuss the theory of hadronic interactions, quantum chromodynamics, we shall clarify the physical significance of this variable.
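Since the relation between η and θ is purely geometric, it is easy to evaluate numerically. A minimal sketch in Python (angles in radians; the function names are ours):

```python
import math

def pseudorapidity(theta):
    """Pseudorapidity eta = -ln tan(theta/2) from the polar angle theta (rad)."""
    return -math.log(math.tan(theta / 2.0))

def polar_angle(eta):
    """Inverse relation: polar angle (rad) from the pseudorapidity."""
    return 2.0 * math.atan(math.exp(-eta))

# theta = pi/2 (perpendicular to the beam) gives eta = 0, the central region;
# small polar angles (close to the beam pipe) give large |eta|.
print(round(pseudorapidity(math.pi / 2), 12))  # -> 0.0
print(round(polar_angle(2.3), 2))  # eta = 2.3 corresponds to theta of about 0.2 rad
```

Note how a modest pseudorapidity coverage such as |η| < 2.3 already corresponds to polar angles down to about 11° from the beam line.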
The detector has the typical onion-like structure described in the previous section:
a sequence of layers, the innermost being the most precise for tracking.
The configuration of the endcaps is similar to that in a fixed-target experiment
except for the necessary presence of a beam pipe, which makes it impossible to detect
particles at very small polar angles, and entails the possible production of secondary
particles in the pipe wall.
In the following we shall briefly describe three generations of collider detectors
operating at the European Organization for Nuclear Research, CERN: UA1 at the
Spp̄S pp̄ accelerator, DELPHI at the LEP e+ e− accelerator, and the two main detectors at
the LHC pp accelerator, CMS and ATLAS. We shall see how much the technology
developed and the required labor increased; the basic ideas are still common
to the prototype detector, UA1.
The UA1 experiment, named as the first experiment in the CERN Underground Area
(UA), ran at CERN's Spp̄S (Super proton–antiproton Synchrotron) accelerator-collider
from 1981 to 1993. The discovery of the W and Z bosons, mediators of
the weak interaction, by this experiment in 1983 led to the Nobel Prize in Physics
for Carlo Rubbia and Simon van der Meer in 1984 (the motivation of the prize being
more related to the development of the collider technology). The Spp̄S collided
protons and antiprotons at a typical c.m. energy of 540 GeV; three bunches of protons
and three bunches of antiprotons, with 10¹¹ particles per bunch, were present in the ring
at the same time, and the luminosity was about 5 × 10²⁷ cm⁻² s⁻¹ (5 inverse millibarn
per second).
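The quoted luminosity fixes the interaction rate through R = Lσ. A minimal sketch; the ~60 mb total pp̄ cross section used below is an illustrative round number of ours, not a value quoted in the text:

```python
MILLIBARN_CM2 = 1.0e-27  # 1 mb = 1e-27 cm^2

def event_rate(luminosity_cm2_s, sigma_mb):
    """Interactions per second for a luminosity (cm^-2 s^-1) and a cross section (mb)."""
    return luminosity_cm2_s * sigma_mb * MILLIBARN_CM2

# SppS-like conditions: L = 5e27 cm^-2 s^-1 ("5 inverse millibarn per second"),
# with an assumed total cross section of about 60 mb:
print(round(event_rate(5e27, 60.0)))  # -> 300 interactions per second
```

The same arithmetic, with luminosities seven orders of magnitude larger, explains the far more demanding trigger requirements of the LHC detectors discussed later in this section.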
146 4 Particle Detection
Fig. 4.28 Left The UA1 detector, and Carlo Rubbia. Right A Z boson decaying into a muon–
antimuon pair as seen at the event display of UA1 (Source CERN)
UA1 was a huge and complex detector for its days, and it was and still is the
prototype of collider detectors. The collaboration constructing and managing the
detector included approximately 130 scientists from all around the world.
UA1 was a general-purpose detector. The central tracking system was an assembly
of six drift chambers, 5.8 m long and 2.3 m in diameter. It recorded the tracks of
charged particles curving in a 0.7 T magnetic field, measuring their momenta with
typical accuracy δp/p ≃ 0.01 pT (where pT is the momentum component transverse
to the beam axis, also called the transverse momentum10; p is expressed in GeV/c)
and possibly identifying them by their specific energy loss dE/dx. The geometrical
arrangement of the about 17,000 field wires and 6125 sense wires allowed a three-dimensional reconstruction of events. UA1 also introduced the concept of the event
display (Fig. 4.28).
After the tracking chamber and an air gap of 0.5 m, the particles next encountered
the calorimeter plus 60 cm of additional iron shielding, including the magnet yoke.
The calorimeter started with an electromagnetic calorimeter made of a sandwich
of lead and scintillator, with a total relative energy resolution of about 0.2/√E. The
iron shielding was partially instrumented with streamer tubes11 measuring the position and the number of minimum ionizing particles, thus acting as a hadronic
calorimeter with relative energy resolution of about 0.8/√E. Together, the calorimeter
and the shielding corresponded to more than eight interaction lengths of material,
which almost completely absorbed strongly interacting particles. Finally, muons
were detected in the outer muon chambers, which covered about 75 % of the solid
angle in the pseudorapidity range |η| < 2.3. Muon trigger processors required tracks
in the muon chambers pointing back to the interaction region to retain an event as
significant.
11 Streamer tubes are gaseous detectors in the form of a round or square tube, with a thick (0.1 mm)
anode wire along the axis; they operate at voltages close to breakdown (see Sect. 4.2.1.3). Such
detectors can be produced with 1–2 cm diameter, and they are cheap and robust.
Fig. 4.29 The DELPHI detector; visible labels include the quadrupole, beam pipe, vertex detector,
and inner detector
DELPHI (DEtector with Lepton Photon and Hadron Identification, Fig. 4.29) was
one of the four experiments built for the LEP (Large Electron–Positron) collider at
CERN. The main aim of the experiment was the verification of the theory known as
the Standard Model of particle physics. DELPHI started collecting data in 1989 and
ran about 8 months/year, 24 h a day, until the end of 2000; it recorded the products
of collisions of electrons and positrons at c.m. energies from 80 to 209 GeV (most
of the data being taken at the Z peak, around 91.2 GeV). Typical luminosity was
2 × 10³¹ cm⁻² s⁻¹ (20 inverse microbarn per second). DELPHI was built and operated
by approximately 600 scientists from 50 laboratories all over the world.
DELPHI consisted of a central cylindrical section and two endcaps, in a solenoidal
magnetic field of 1.2 T provided by a superconducting coil. The overall length and
the diameter were over 10 meters and the total weight was 2500 tons. The electron–
positron collisions took place inside the vacuum pipe at the center of DELPHI and
the products of the annihilations would fly radially outwards, tracked by several
detection layers and read out via about 200,000 electronic channels. A typical event
was about one million bits of information.
The overall accuracy in momentum can be parameterized as

δp/p ≃ 0.6 % pT , (4.15)

where p is expressed in GeV/c, and the typical calorimetric resolution in the barrel
part is

δE/E ≃ 7 %/√E ⊕ 1 % , (4.16)

where E is expressed in GeV.
Fig. 4.30 Two events reconstructed by DELPHI, projected on the x z plane (left) and on the x y
plane (right). Credits: CERN
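In resolution formulas such as (4.16), the symbol ⊕ conventionally denotes a sum in quadrature of the stochastic and constant terms. A minimal sketch with the DELPHI-like barrel numbers:

```python
import math

def calo_resolution(E_gev, stochastic=0.07, constant=0.01):
    """Relative energy resolution a/sqrt(E) (+) b, combined in quadrature."""
    return math.hypot(stochastic / math.sqrt(E_gev), constant)

# The stochastic term dominates at low energy; the constant term sets the
# asymptotic limit at high energy.
print(round(100 * calo_resolution(1.0), 1))    # -> 7.1 (% at 1 GeV)
print(round(100 * calo_resolution(100.0), 1))  # -> 1.2 (% at 100 GeV)
```

Calorimetric resolution therefore improves with energy, the opposite behavior of tracking accuracy.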
The Compact Muon Solenoid (CMS) experiment is one of the two large general-purpose particle physics detectors built at the proton–proton Large Hadron Collider
(LHC) at CERN, the other being called ATLAS.12 Approximately 3000 scientists,
representing 183 research institutes and 38 countries, form the CMS collaboration,
which built the detector and has operated it since 2008 (the size of the ATLAS collaboration
is similar). The detector is shown in Fig. 4.31. Proton–proton collisions at c.m.
energies of 8 TeV are recorded; typical luminosity is 7 × 10³⁴ cm⁻² s⁻¹ (70 inverse
nanobarn per second).
As customary for collider detectors, CMS is structured in layers arranged in an
onion-like structure.
Layer 1 is devoted to tracking. The inner silicon tracker is located immediately
around the interaction point. It is used to identify the tracks of individual particles
and match them to the vertices from which they originated. The curvature of charged
particle tracks in the magnetic field allows their charge and momentum to be measured. The CMS silicon tracker consists of 13 layers in the central region and 14
layers in the endcaps. The three innermost layers (up to 11 cm radius) are made of
100 × 150 µm pixels (a total of 66 million pixels) and the next four (up to 55 cm
radius) are silicon strips (9.6 million strip channels in total). The CMS silicon tracker
is the world’s largest silicon detector, with 205 m2 of silicon sensors (approximately
the area of a tennis court) and 76 million channels.
12 Seven detectors have been constructed at the LHC, located underground in large caverns excavated
at the LHC’s intersection points. ATLAS and CMS are large, general purpose particle detectors.
A Large Ion Collider Experiment (ALICE) and LHCb have more specific roles, respectively, the
study of collisions of heavy ions and the study of the physics of the b quark. The remaining three
are much smaller; two of them, TOTEM and LHCf, study the cross section in the forward region
(which dominates the total hadronic cross section, as we shall see in Chap. 6); finally, MoEDAL
searches for exotic particles, magnetic monopoles in particular.
The amount of raw data from each crossing is approximately 1 MB, which at the
40 MHz crossing rate would result in 40 TB of data per second. A multi-stage trigger
system reduces the rate of interesting events down to about 100/s. At the first stage,
the data from each crossing are held in buffers within the detector and some key
information is used to identify interesting features (such as large transverse-momentum
particles, high-energy jets, muons, or missing energy). This task is completed in
around 1 µs, and the event rate is reduced by a factor of about a thousand, down to
50 kHz. The data corresponding to the selected events are sent over fiber-optic links
to a higher-level trigger stage, which is a software trigger. The lower rate allows
for a much more detailed analysis of the event, and the event rate is reduced
by a further factor of about a thousand, down to around 100 events per second. In
a high-luminosity collider such as the LHC, a single bunch crossing may produce
several separate events, so-called pile-up events. Trigger systems must thus be very
effective.
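The successive rate reductions described above can be sketched as a simple chain (the reduction factors below are the approximate ones quoted in the text):

```python
def apply_stages(rate_hz, reduction_factors):
    """Divide the input rate successively by each trigger stage's reduction factor."""
    rates = []
    for factor in reduction_factors:
        rate_hz /= factor
        rates.append(rate_hz)
    return rates

# 40 MHz crossing rate; the first, hardware stage reduces by ~800 (to 50 kHz),
# then the software high-level trigger by a further ~500 (to ~100 events/s):
print(apply_stages(40e6, [800, 500]))  # -> [50000.0, 100.0]
```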
The overall accuracy in momentum can be parameterized as

δp/p ≃ 0.015 % pT ⊕ 0.5 % , (4.17)

where p is expressed in GeV/c, and the typical calorimetric resolution in the barrel
part is

δE/E ≃ 3 %/√E ⊕ 0.3 % , (4.18)

where E is expressed in GeV.
Fig. 4.32 An event reconstructed by CMS as shown in different projections by the CMS event
display (Source CERN)
Table 4.4 Comparison of the main design parameters of CMS and ATLAS

Parameter                                         ATLAS       CMS
Total weight (tons)                               7000        12,500
Overall diameter (m)                              22          15
Overall length (m)                                46          20
Magnetic field for tracking (T)                   2           4
Solid angle for precision measurement (φ × η)     2π × 5.0    2π × 5.0
Solid angle for energy measurement (φ × η)        2π × 9.6    2π × 9.6
Total cost (million Swiss francs)                 550         550
The main design parameters of ATLAS and CMS are compared in Table 4.4.
The overall accuracy in momentum can be parameterized as

δp/p ≃ 0.05 % pT ⊕ 1 % , (4.19)

where p is expressed in GeV/c, and the typical calorimetric resolution in the barrel
part is

δE/E ≃ 2.8 %/√E ⊕ 0.3 % , (4.20)

where E is expressed in GeV.
4.5 Cosmic-Ray Detectors
The strong decrease of the flux of cosmic rays with energy, in first approximation
∝ E⁻³, poses a big challenge to the dimensions and the running times of the
experimental installations when large energies are studied. Among cosmic rays, a
small fraction, about 10⁻³, are photons, which are particularly interesting since they
are not deflected by intergalactic magnetic fields, and thus point directly to their
sources; the large background from charged cosmic rays makes their detection even
more complicated. Neutrinos are expected to be even less numerous than photons,
and their detection is even more complicated due to their small cross section.
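The steepness of the spectrum can be made quantitative: for a differential flux dN/dE = k E^(−γ) with γ = 3, the integral flux above a threshold E scales as E⁻², so every decade in energy costs two decades in rate, and detector area (or exposure time) must grow accordingly. A one-line check:

```python
def integral_flux(E, k=1.0, gamma=3.0):
    """Integral of k * E'**(-gamma) dE' from E to infinity (requires gamma > 1)."""
    return k * E ** (1.0 - gamma) / (gamma - 1.0)

# Raising the threshold by a factor 10 lowers the rate by a factor 100:
print(round(integral_flux(1.0) / integral_flux(10.0)))  # -> 100
```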
We shall examine first the detectors of cosmic rays which have a sizable probability
of interaction with the atmosphere: nuclei, electrons/positrons, and photons.
Balloon- and satellite-borne detectors operate at altitudes above 15 km, where
they can detect the interaction of the primary particle inside the detector, but they
are limited in detection area and therefore also in the energy range they
can measure (the maximum primary energy that can be measured by means of direct
observation is of the order of 1 PeV; above this energy the observations are performed
by exploiting the cascades induced in the atmosphere by the interactions of cosmic rays).
The physics of electromagnetic and hadronic showers has been described before; here
we particularize the results obtained to the development of the showers due to the
interaction of high-energy particles with the atmosphere. These are called extensive
air showers (EAS).
High-energy hadrons, photons, and electrons interact in the high atmosphere. As
we have seen, the process characterizing hadronic and electromagnetic showers is
conceptually similar (Fig. 4.34).
For photons and electrons above a few hundred MeV, the cascade process is
dominated by the pair production and the bremsstrahlung mechanisms: an energetic
photon scatters on an atmospheric nucleus and produces a pair, which emits secondary
photons via bremsstrahlung; such photons produce in turn an e+ e− pair, and so on,
giving rise to a shower of charged particles and photons.
The longitudinal development of typical photon-induced extensive air showers
is shown in Fig. 4.35 for different values of the primary energy. The maximum
shower size occurs after approximately ln(E/ε0) radiation lengths, the radiation length
for air being about 37 g/cm² (approximately 300 m at sea level and NTP). The critical
Fig. 4.34 Schematic representation of two atmospheric showers initiated by a photon (left) and by
a proton (right). Credit: R.M. Wagner, dissertation, MPI Munich 2007
Fig. 4.35 Longitudinal development (shower size versus atmospheric depth in radiation lengths)
of photon-initiated showers, for primary energies from 30 GeV to 370 PeV; the parameter s indicates
the shower age. Credit: R.M. Wagner, dissertation, MPI Munich 2007; adapted from reference
[F4.1] in the "Further reading"
energy ε0 , the energy at which the ionization energy loss starts dominating the energy
loss by bremsstrahlung, is about 80 MeV in air.13
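With the two numbers just quoted (a radiation length of about 37 g/cm² and ε0 ≈ 80 MeV), the depth of the maximum of a photon-initiated shower can be estimated directly; a sketch:

```python
import math

X0_AIR = 37.0    # radiation length in air, g/cm^2
EPS0_MEV = 80.0  # critical energy in air, MeV

def shower_max_depth(E_mev):
    """Depth (g/cm^2) of shower maximum: t_max ~ ln(E/eps0) radiation lengths."""
    return X0_AIR * math.log(E_mev / EPS0_MEV)

# A 1 TeV photon reaches maximum after about 9.4 radiation lengths:
print(round(shower_max_depth(1.0e6)))  # -> 349 (g/cm^2)
```

This depth corresponds to an altitude of roughly 8 km, consistent with the height of maximal Cherenkov emission quoted later in this section for 1 TeV photons.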
The hadronic interaction length in air is about 61 g/cm² for protons (500 meters for
air at NTP), being shorter for heavier nuclei—the dependence of the cross section
on the mass number A is approximately A^(2/3). The transverse profile of hadronic
showers is in general wider than for electromagnetic showers, and fluctuations are
larger.
Particles release energy in the atmosphere, which acts like a calorimeter, through
different mechanisms—which give rise to a measurable signal. We have discussed
these mechanisms in Sect. 4.1.1; now we reexamine them in relation to their use in
detectors.
4.5.1.1 Fluorescence
As the charged particles in an extensive air shower go through the atmosphere, they
ionize and excite the gas molecules (mostly nitrogen). In the de-excitation processes
that follow, visible and UV radiation is emitted. This is the so-called fluorescence
light associated with the shower.
The number of emitted fluorescence photons is small—of the order of a few
photons per electron per meter in air. This implies that the fluorescence technique
can be used only at high energies. However, the emission is isotropic, unlike
Cherenkov light (see below), and thus it can be used in serendipitous observations.
13 In the isothermal approximation, the depth x of the atmosphere at a height h (i.e., the amount of
atmosphere above h) can be approximated as

x ≃ X e^(−h/7 km) ,

where X is the depth at sea level.
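A sketch of this isothermal approximation; the sea-level vertical depth X ≈ 1030 g/cm² used below is a standard value that we assume, as it is not quoted in this excerpt:

```python
import math

X_SEA_LEVEL = 1030.0   # assumed total vertical depth at sea level, g/cm^2
SCALE_HEIGHT_KM = 7.0  # scale height used in footnote 13

def depth(h_km):
    """Atmospheric depth (g/cm^2) above a height h (km): x ~ X * exp(-h/7 km)."""
    return X_SEA_LEVEL * math.exp(-h_km / SCALE_HEIGHT_KM)

print(round(depth(0.0)))  # -> 1030 (sea level)
print(round(depth(8.0)))  # -> 328 (around the maximum of a 1 TeV photon shower)
```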
Many secondary particles in the EAS are superluminal (i.e., faster than light in the
atmosphere), and they thus emit Cherenkov light that can be used for the detection.
The properties of Cherenkov emission have been discussed in Sect. 4.2.
At sea level, the value of the Cherenkov angle θC in air for β = 1 is about 1.3°,
while at 8 km a.s.l. it is about 1°. The energy threshold at sea level is 21 MeV for a
primary electron and 4.4 GeV for a primary muon.
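These thresholds follow from the emission condition β > 1/n, i.e., γ above γ_th = 1/√(1 − 1/n²). A sketch, assuming a typical sea-level refractive index n ≈ 1.000293 (a standard value, not quoted in the text):

```python
import math

N_AIR = 1.000293  # assumed refractive index of air at sea level

def threshold_energy_mev(mass_mev, n=N_AIR):
    """Total energy threshold for Cherenkov emission: gamma_th * m c^2."""
    gamma_th = 1.0 / math.sqrt(1.0 - 1.0 / n**2)
    return gamma_th * mass_mev

print(round(threshold_energy_mev(0.511), 1))        # electron: -> 21.1 (MeV)
print(round(threshold_energy_mev(105.7) / 1e3, 1))  # muon: -> 4.4 (GeV)
```

Since n decreases with altitude, both the thresholds and the Cherenkov angle change as the shower develops.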
Half of the emission occurs within 20 m of the shower axis (about 70 m for a
proton shower).
Since the intrinsic angular spread of the charged particles in an electromagnetic
shower is about 0.5◦ , the opening of the light cone is dominated by the Cherenkov
angle. As a consequence, the ground area illuminated by Cherenkov photons from a
shower of 1 TeV (the so-called "light pool" of the shower) has a radius of about 120 m.
The height of maximal emission for a primary of 1 TeV is approximately
8 km a.s.l., and about 150 photons per m² arrive at 2000 m a.s.l. (where Cherenkov
detectors are typically located) in the visible frequencies. The dependence on energy is
not linear: the yield is about 10 photons per square meter at 100 GeV.
The atmospheric extinction of light drastically changes the Cherenkov light spectrum
(originally proportional to 1/λ²) arriving at the detectors, in particular suppressing
the UV component (Fig. 4.36), which nevertheless remains dominant. There are several
sources of extinction: absorption bands of several molecules, molecular (Rayleigh)
scattering, and scattering from aerosols (Mie).
Radio Emission. Cosmic-ray air showers also emit radio waves in the frequency
range from a few to a few hundred MHz, an effect that opens many interesting
possibilities in the study of UHE and EHE extensive air showers. At present, however,
open questions still remain concerning both the emission mechanism and its strength.
The detection of charged cosmic rays may be done at the top of the Earth's atmosphere
in balloon- or satellite-based experiments whenever the fluxes are high enough (typically
below tens or hundreds of GeV), and otherwise in an indirect way, through the observation
of the extensive air showers produced in their interaction with the atmosphere (see
Sect. 4.5.1).
In the last thirty years, several experiments to detect charged cosmic rays in
space or at the top of the Earth's atmosphere were designed, and a few were successfully
performed. In particular:
• The Advanced Composition Explorer (ACE), launched in 1997 and still in operation
(with enough propellant until ∼2024), has been producing a large set of measurements on
the composition (from H to Ni) of solar and galactic cosmic rays, covering energies
from 1 keV/nucleon to 500 MeV/nucleon. ACE has several instruments
able to identify the particle charge and mass using different types of detectors
(for example silicon detectors, gas proportional counters, fiber-optic hodoscopes)
and techniques (for example the specific energy loss dE/dx, time-of-flight,
electrostatic deflection). The total mass at launch (including fuel) was about 800 kg.
• The Balloon-borne Experiment with Superconducting Spectrometer (BESS) performed
successive flights starting in 1993, with the main aim of measuring the low-energy
antiproton spectrum and of searching for antimatter, namely antihelium.
The last two flights (BESS-Polar) were over Antarctica and had a long duration
(8.5 days in 2004 and 29.5 days in 2007/2008). The instrument, improved before
every flight, had to ensure good charge separation and good particle identification.
It had a horizontal cylindrical configuration and its main components were as
follows: a thin-wall superconducting magnet; a central tracker composed of drift
chambers; time-of-flight scintillation counter hodoscopes; and an aerogel (an ultralight
porous material derived from a gel by replacing its liquid component with a
gas) Cherenkov counter.
• The PAMELA experiment, launched in June 2006, measured charged particles and
antiparticles outside the Earth's atmosphere during a long (six-year) period.
A permanent magnet of 0.43 T and a microstrip silicon tracking system ensured
good charge separation between electrons and positrons up to energies of the order
of 100 GeV, measured by a silicon/tungsten electromagnetic calorimeter
complemented by a neutron counter to enhance the electromagnetic/hadronic
discrimination power. The trigger was provided by a system of plastic scintillators,
which were also used to measure the time-of-flight and to estimate the
specific ionization energy loss (dE/dx).
• The Alpha Magnetic Spectrometer (AMS-02) was installed in May 2011 on the
International Space Station. Its layout is similar to that of PAMELA, but with a much
larger acceptance and a more complete set of sophisticated, higher-performing
detectors. Apart from the permanent magnet and the precision silicon tracker
The arrival direction of an air shower is determined from the arrival time at the
different surface detectors of the shower front. In a first approximation, the front
can be described by a thin disk that propagates with the speed of light; second-order
corrections can be applied to improve the measurement (Fig. 4.38).
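The plane-front approximation can be turned into numbers with a toy configuration: for a planar front, c t_i = c t0 + u x_i + v y_i, so the direction cosines (u, v) of the shower axis follow from the time differences between stations. The three-station geometry and times below are made-up illustration values:

```python
import math

C = 0.299792458  # speed of light, m/ns

def direction_from_times(d, t1, t2, t3):
    """Direction cosines (u, v) and zenith angle (deg) of a planar shower
    front, from stations at (0,0), (d,0), (0,d) with arrival times in ns."""
    u = C * (t2 - t1) / d
    v = C * (t3 - t1) / d
    theta = math.degrees(math.asin(math.hypot(u, v)))
    return u, v, theta

# Toy event: a front arriving from zenith angle 30 deg, tilted along x,
# with stations 1 km apart:
t2 = math.sin(math.radians(30.0)) * 1000.0 / C  # delay at station (d, 0)
u, v, theta = direction_from_times(1000.0, 0.0, t2, 0.0)
print(round(theta, 1))  # -> 30.0 (degrees)
```

Real arrays use many more stations and a least-squares fit, plus the second-order corrections to the flat-disk front mentioned above.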
The impact point of the air shower axis at the Earth's surface (the air shower core)
is defined as the point of maximum particle density and is determined from the measured
densities at the different surface detectors using, in a first approximation, a
modified center-of-mass algorithm. In Fig. 4.39, the particle density pattern of the
highest-energy event at the AGASA array experiment14 is shown. The energy of the
event was estimated to be about 2 × 10²⁰ eV. The measured densities show a fast
decrease with the distance from the core and are usually parameterized by empirical
or phenomenologically inspired formulae—the most popular being the NKG function,
introduced in Sect. 4.1.6—which depend also on the shower age (the level of development
of the shower at the moment it arrives at the Earth's surface). Such functions
allow for a better determination of the shower core and for the extrapolation of the
particle density to a reference distance from the core, which is then used as an estimator
of the shower size and thus of the shower energy. The exact function as well as the
reference distance depend on the particular experimental setup.
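As an illustration, the NKG function has the standard form ρ(r) ∝ (r/r_M)^(s−2) (1 + r/r_M)^(s−4.5), with r_M the Molière radius and s the shower age. The sketch below uses that textbook form with illustrative parameter values (not values from the text):

```python
import math

def nkg_density(r, Ne, s, r_M):
    """NKG lateral particle density (per m^2) at core distance r (m),
    for shower size Ne, age s and Moliere radius r_M (m)."""
    c_s = math.gamma(4.5 - s) / (2.0 * math.pi * math.gamma(s) * math.gamma(4.5 - 2.0 * s))
    x = r / r_M
    return (Ne / r_M**2) * c_s * x ** (s - 2.0) * (1.0 + x) ** (s - 4.5)

# Density falls steeply with core distance (illustrative shower: Ne = 1e9,
# at maximum s = 1, r_M = 80 m):
for r in (100.0, 300.0, 1000.0):
    print(r, nkg_density(r, 1e9, 1.0, 80.0))
```

Fitting such a function to the measured densities yields the core position and, through the density at the experiment's reference distance, the shower-size and energy estimators described above.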
The fluorescence telescopes record the intensity and arrival time of the light emitted in the atmosphere in specific solid-angle regions, and thus are able to reconstruct
14 The Akeno Giant Air-Shower Array (AGASA) is a very large surface array covering an area
of 100 km2 in Japan and consisting of 111 surface detectors (scintillators) and 27 muon detectors
(proportional chambers shielded by Fe/concrete).
the shower axis geometry and the shower longitudinal profile. In Fig. 4.40 the image
of a shower in the focal plane of one of the Pierre Auger fluorescence telescopes (see
later) is shown. The third dimension, time, is represented in a color code.
The geometry of the shower (Fig. 4.41) is then reconstructed in two steps: first the
shower-detector plane (SDP) is found by requiring its normal to be as orthogonal as
possible to the mean pointing directions of the triggered pixels, and then the shower axis
parameters within the SDP are found from the measured arrival time of the light in each
pixel, assuming that the shower develops along a line at the speed of light.
Fig. 4.40 Display of one shower in the focal plane of one of the Pierre Auger fluorescence tele-
scopes. Left Pattern of the pixels with signal; right response (signal versus time, with a time bin of
100 ns) of the selected pixels (marked with a black dot in the left panel). The development of the
shower in the atmosphere can be qualitatively pictured. From https://www.auger.org
Fig. 4.43 The Pierre Auger Observatory near Malargue, Argentina. The radial lines point to the
fluorescence detectors (FD, 4 ×6 = 24). The black dots are the 1600 ground stations (SD). Sites with
specialized equipment are also indicated. By Darko Veberic [GFDL http://www.gnu.org/copyleft/
fdl.html], via Wikimedia Commons
light emitted and the number of particles in the shower. The integral of such a profile
is a good estimator of the energy of the shower (small "missing energy" corrections,
due to particles interacting little in the atmosphere, like muons and neutrinos, have to
be taken into account).
The Pierre Auger Observatory. The Pierre Auger Observatory in Malargue,
Argentina, is the largest cosmic-ray detector ever built. It covers a surface of about
3000 square kilometers with 1600 surface detector stations (Cherenkov water tanks)
arranged in a triangular grid of 1.5 km side complemented by 24 fluorescence tele-
scopes, grouped into four locations to cover the atmosphere above the detector area
(Fig. 4.43).
Each water tank is a cylinder with a 10 m² base and 1.5 m height, filled with 12 tons
of water (Fig. 4.44). The inner walls of the tank are covered with a high-reflectivity
material. The Cherenkov light produced by the charged particles crossing the tank
is collected by three PMTs placed on its top. Each tank is autonomous: the time is
given by a GPS unit and the power is provided by a solar panel; each tank
communicates via radio with the central data acquisition system.
Fig. 4.44 Sketch of one of the Pierre Auger surface detectors (left); a fluorescence telescope (right).
From https://www.auger.org
15 In a Schmidt telescope, a spherical mirror receives light that passed through a thin aspherical lens
that compensates for the image distortions that will occur in the mirror. The light is then reflected
in the mirror into a detector that records the image.
Fig. 4.47 Transparency of the atmosphere for different photon energies and possible detection
techniques. Source A. De Angelis and L. Peruzzo, “Le magie del telescopio MAGIC,” Le Scienze,
April 2007
4.5.3.1 Satellites
The main figures of merit for a satellite-borne detector are its effective area (i.e., the
product of the area times the detection efficiency), the energy resolution, and the space
or angular resolution (also called the point-spread function, or PSF).
Satellite HE gamma telescopes can detect primary photons at energies lower
than ground-based telescopes can. They have a small effective area, of the order of 1 m²
at most, which limits the sensitivity. They have a large duty cycle, and they have
a low rate of background events, since they can be coupled to anticoincidence systems
rejecting the charged cosmic rays. Their cost is large, dominated by the cost of the
launch and by the strong requirements on instruments which must be sent into space,
with little or no possibility of intervention to fix possible malfunctions.
Two modern gamma-ray telescopes are in orbit; they are called AGILE and Fermi
(Fig. 4.48).
Their technology has been inherited from particle physics and from the smaller and
less sophisticated EGRET instrument, operational in the years 1991–2000 on the Compton
Gamma-Ray Observatory. The direction of an incident photon is determined through the
geometry of its conversion into an e+ e− pair in foils of heavy material within the
instrument, the pair being detected by planes of silicon detectors. An anticoincidence
apparatus provides a veto against unwanted incoming charged particles. The principle
of operation is illustrated in Fig. 4.48, right.
Fig. 4.48 On the left, the Fermi satellite. On the right, the layout of the Large Area Telescope
(LAT), and principle of operation. Credits: NASA
The angular resolution of these telescopes is limited by the opening angle of the
e+ e− pair, approximately (m_e c²/E) ln(E/(m_e c²)), and especially by the effect of
multiple scattering. To achieve a good energy resolution, in this kind of detector a
calorimeter is placed at the bottom of the tracker whenever the weight budget of the
payload allows it. Due to weight limitations, however, it is difficult to fit in a
calorimeter that completely contains the showers; the resulting leakage degrades
the energy resolution. Since at low energies multiple scattering is the dominant process,
the optimal detector design is a tradeoff between a small thickness of the converter
layers in radiation lengths (which, however, decreases the conversion efficiency) and a
large number of samplings (which increases the power consumption, limited by the
problems of heat dissipation in space).
The largest gamma-ray space-based detector ever built is the Fermi observatory,
launched in June 2008—and called GLAST before the successful positioning in
orbit. It is composed of the spacecraft and of two instruments: the Large Area
Telescope (LAT) and the Fermi Gamma-ray Burst Monitor (GBM); the two instruments
are integrated and work as a single observatory.
The structure of the LAT consists mainly of a tracker, an anticoincidence apparatus,
and a calorimeter (see Fig. 4.48). Its energy range goes from 20 MeV to about
300 GeV and above, while the energy range explored by the GBM is 10 keV–25 MeV.
Fermi was built and is operated by an international collaboration with contributions
from space agencies, high-energy particle physics institutes, and universities in
France, Italy, Japan, Sweden, and the United States; it involves about 600 scientists.
After the first year of operation, data became public, i.e., every scientist in the world
can in principle analyze them.
The scientific objectives of the LAT include the understanding of the nature of
unidentified gamma-ray sources and of the origin of the diffuse galactic emission; of
particle acceleration mechanisms at the sources, particularly in active galactic nuclei,
pulsars, supernova remnants, and the Sun; and of the high-energy behavior of gamma-ray
bursts and transient sources. The observations will also be used to probe dark matter
and, at high energy, the early Universe and the cosmic evolution of high-energy sources
to z ≥ 6.
The characteristics and performance of the LAT will enable significant progress
in the understanding of the high-energy sky. In particular, it has good angular reso-
lution for source localization and multi-wavelength study, high sensitivity in a broad
field-of-view to detect transients and monitor variability, good calorimetry over an
extended energy band for detailed emission spectrum studies, and good calibration
and stability for absolute, long term flux measurements.
The LAT tracker is made of 16 planes of high-Z material (tungsten) in which incident
γ-rays can convert into an e+ e− pair. The converter planes are interleaved with 18
two-layer planes of silicon detectors that measure the tracks of the particles resulting
from pair conversion. This information is used to reconstruct the directions of the
incident γ-rays.
After the tracker, a calorimeter can measure the energy. It is made of CsI(Tl) crystals with a total depth of 8.6 radiation lengths, arranged in a hodoscope configuration
in order to provide longitudinal and transverse information on the energy deposition.
The depth and the segmentation of the calorimeter enable the high-energy reach of
the LAT and significantly contribute to background rejection. The aspect ratio of
the tracker (height/width) is 0.4 (the width being about 1.7 m), resulting in a large
field-of-view (2.4 sr) and ensuring that most pair-conversion showers initiated in the
tracker will reach the calorimeter for energy measurement.
Around the tracker, an anticoincidence detector (ACD) made of plastic scintillator
provides charged particle background rejection.
The overall performance of Fermi can be summarized as follows in the region of
main interest (30 MeV–30 GeV):
• Effective area of about 1 m²;
• Relative energy resolution decreasing from 10 % at 100 MeV to 5 % at 1 GeV,
increasing again to 10 % at 30 GeV; and
• Angular resolution of 0.1° at 10 GeV, approximately varying as 1/√E.
AGILE, the precursor of Fermi, is a completely Italian satellite launched in April
2007. Its structure is very similar to that of Fermi, but its effective area is about one
order of magnitude smaller. Many remarkable physics results were nevertheless
obtained thanks to the AGILE data.
Fig. 4.49 Sketch of the operation of Cherenkov telescopes and of EAS detectors
EAS Detectors. The EAS detectors, such as MILAGRO, Tibet-AS and ARGO in
the past, and HAWC which is presently in operation, are large arrays of detectors
sensitive to charged secondary particles generated in the atmospheric showers. They
have a high duty cycle and a large field-of-view, but a relatively low sensitivity. The
energy threshold of such detectors is rather large—a shower initiated by a 1 TeV
photon typically has its maximum at about 8 km a.s.l.
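The quoted altitude of the shower maximum can be checked with a back-of-the-envelope calculation. The sketch below combines the Rossi approximation B for a photon-initiated shower with an isothermal model of the atmosphere; the radiation length and critical energy are the values used elsewhere in this chapter, while the total vertical depth (∼1030 g/cm²) and scale height (∼6.4 km) are assumed round numbers, so the result is only indicative:

```python
import math

X0 = 37.0       # radiation length of air [g/cm^2] (value used in the text)
E_C = 88.0      # critical energy of air [MeV] (value used in the text)
X_ATM = 1030.0  # total vertical atmospheric depth [g/cm^2] (assumed round number)
H_SCALE = 6.4   # isothermal-atmosphere scale height [km] (assumed round number)

def shower_max_altitude_km(e_mev):
    """Altitude of the maximum of a photon-initiated shower, combining the
    Rossi approximation B depth of maximum, X_max = X0*(ln(E/Ec) - 1/2),
    with an isothermal atmosphere, X(h) = X_ATM * exp(-h / H_SCALE)."""
    x_max = X0 * (math.log(e_mev / E_C) - 0.5)
    return H_SCALE * math.log(X_ATM / x_max)

print(round(shower_max_altitude_km(1.0e6), 1))  # 1 TeV photon: about 7 km
```

Given the crudeness of the atmospheric model, this is compatible with the ∼8 km quoted above.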
The principle of operation is the same as the one for the UHE cosmic rays detectors
like Auger, i.e., direct sampling of the charged particles in the shower. This can be
achieved:
• either using a sparse array of scintillator-based detectors, as for example in Tibet-
AS (located at 4100 m a.s.l. to reduce the threshold; for an energy of 100 TeV there
are about 50,000 electrons at mountain-top altitudes);
• or by effective covering of the ground, to ensure efficient collection and hence
lower the energy threshold.
– The ARGO-YBJ detector at the Tibet site followed this approach. It was an
array of resistive plate counters, with an energy threshold in the 0.5–1 TeV
range. It detected the Crab Nebula with a significance of about 5 standard
deviations (σ) in 50 days of observation.
– MILAGRO was a water-Cherenkov instrument located in New Mexico (at an
altitude of about 2600 m a.s.l.). It detected the Cherenkov light produced by the
secondary particles of the shower when they enter the water pool instrumented
with photomultipliers. MILAGRO could detect the Crab Nebula with a signif-
icance of about 5 σ in 100 days of observation, at a median energy of about
20 TeV.
The energy threshold of EAS detectors is at best in the 0.5–1 TeV range, so they
are built to detect UHE photons as well as the most energetic VHE gammas. At such
energies fluxes are small and large surfaces, of the order of 10⁴ m², are required.
Fig. 4.50 Left Layout of the HAWC detector. The location of the different water tanks is shown.
Right A water tank. Credit: HAWC Collaboration
Concerning the discrimination from the charged cosmic ray background, muon
detectors devoted to hadron rejection may be present; otherwise, the discrimination
is based on the reconstructed shower shape. The direction of the detected primary particles is
computed from the arrival times with an angular precision of about 1◦ . The calibration
can be performed by studying the shadow in the reconstructed directions caused by
the Moon. Energy resolution is poor.
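The direction reconstruction from arrival times can be illustrated with a minimal plane-front fit. In this sketch (the function name and the three-station geometry are invented for illustration) the shower front is treated as a plane moving at the speed of light, so the time differences among three stations fix the direction cosines of the shower axis:

```python
import math

C = 0.29979  # speed of light [m/ns]

def direction_from_times(s1, s2, s3):
    """Reconstruct the direction cosines (u, v) of a shower axis from the
    arrival times of a plane shower front at three ground stations.
    Each station is a tuple (x [m], y [m], t [ns])."""
    (x1, y1, t1), (x2, y2, t2), (x3, y3, t3) = s1, s2, s3
    # plane front: c * (t_i - t_1) = u * (x_i - x_1) + v * (y_i - y_1)
    a11, a12, b1 = x2 - x1, y2 - y1, C * (t2 - t1)
    a21, a22, b2 = x3 - x1, y3 - y1, C * (t3 - t1)
    det = a11 * a22 - a12 * a21
    u = (b1 * a22 - b2 * a12) / det
    v = (a11 * b2 - a21 * b1) / det
    zenith = math.degrees(math.asin(math.hypot(u, v)))
    return u, v, zenith

# synthetic event: a front with direction cosine u = 0.2 along x
stations = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.2 * 100.0 / C), (0.0, 100.0, 0.0)]
u, v, zen = direction_from_times(*stations)
print(round(u, 3), round(v, 3), round(zen, 1))  # 0.2 0.0 11.5
```

Real arrays fit many stations at once and correct for the curvature and thickness of the front, which is why the achieved precision is about 1° rather than arbitrarily good.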
The past-generation EAS detectors were not sensitive enough and detected just
a handful of sources. This lesson led to a new EAS observatory with much
larger sensitivity: the High Altitude Water Cherenkov detector HAWC, inaugurated
in 2015.
HAWC. HAWC (Fig. 4.50) is a very high-energy gamma-ray observatory located in
Mexico at an altitude of 4100 m. It consists of 300 steel tanks of 7.3 m diameter
and 4.5 m deep, covering an instrumented area of about 22,000 m2 . Each tank is
filled with purified water and contains three PMT of 20 cm diameter, which observe
the Cherenkov light emitted in water by superluminal particles in atmospheric air
showers. Photons traveling through the water typically undergo Compton scattering
or produce an electron–positron pair, also resulting in Cherenkov light emission.
This is an advantage of the water Cherenkov technique, as photons constitute a large
fraction of the electromagnetic component of an air shower at ground.
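The "superluminal" condition β > 1/n translates into a simple energy threshold, E_thr = γ_thr m with γ_thr = 1/√(1 − 1/n²). A quick evaluation for water (n ≈ 1.33, the value used elsewhere in this chapter; the function name is ours):

```python
import math

def cherenkov_threshold_mev(mass_mev, n=1.33):
    """Total energy above which a charged particle of the given mass is
    'superluminal' (beta > 1/n) in a medium of refractive index n."""
    gamma_thr = 1.0 / math.sqrt(1.0 - 1.0 / n ** 2)
    return gamma_thr * mass_mev

print(round(cherenkov_threshold_mev(0.511), 3))   # electron: ~0.775 MeV
print(round(cherenkov_threshold_mev(105.66), 1))  # muon: ~160.3 MeV
```

The sub-MeV electron threshold is why even the Compton electrons and pairs produced by shower photons in the tanks radiate Cherenkov light.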
HAWC improves the sensitivity for a Crab-like point spectrum by a factor of
15 in comparison to MILAGRO. The sensitivity should also allow the detection
of gamma-ray burst emission at high energy.
A future installation in the Northern hemisphere, a hybrid detector called LHAASO
to be deployed in China, is currently under discussion.
Cherenkov Telescopes. Most of the experimental results on VHE photons are
presently due to Imaging Atmospheric Cherenkov Telescopes (IACTs).
IACTs, such as the first successful cosmic γ-ray detector WHIPPLE, then the
second-generation instruments HEGRA and CANGAROO, and presently the
third-generation H.E.S.S., MAGIC, and VERITAS, detect the Cherenkov photons
produced in air showers (Fig. 4.51).
4.5 Cosmic-Ray Detectors 171
Fig. 4.51 The observational technique adopted by Cherenkov telescopes. Credit: R.M. Wagner,
dissertation, MPI Munich 2007
In the GeV-TeV region, the background from charged particles is three orders
of magnitude larger than the signal. Hadronic showers, however, have a different
topology, being larger and more subject to fluctuations than electromagnetic showers.
Showers induced by gamma-rays can thus be separated from the hadronic ones on
the basis of the shower shape.
Most of the present identification methods rely on a technique pioneered by
Hillas in the 1980s; the discriminating variables are called “Hillas parameters.” The
intensity (and area) of the image produced is an indication of the shower energy,
while the image orientation is related to the shower direction (photons “point” to
emission sources, while hadrons are in first approximation isotropic). The shape of
the image is different for events produced by photons and by other particles; this
characteristic can be used to reject the background from charged particles (Figs. 4.52
and 4.53).
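A minimal version of the moment analysis behind the Hillas parameters can be sketched as follows. The pixel list and function name are invented for illustration; real analyses add image cleaning, noise handling, and further parameters (e.g., the orientation angle alpha and the distance to the camera center):

```python
import math

def hillas_length_width(pixels):
    """Second-moment analysis of a cleaned camera image.
    pixels: list of (x, y, amplitude). Returns (length, width, size),
    i.e., the RMS spread along the major and minor axes of the image
    ellipse and the total image amplitude."""
    size = float(sum(a for _, _, a in pixels))
    mx = sum(x * a for x, y, a in pixels) / size
    my = sum(y * a for x, y, a in pixels) / size
    sxx = sum((x - mx) ** 2 * a for x, y, a in pixels) / size
    syy = sum((y - my) ** 2 * a for x, y, a in pixels) / size
    sxy = sum((x - mx) * (y - my) * a for x, y, a in pixels) / size
    # eigenvalues of the 2x2 covariance matrix give the two axes
    d = math.hypot(sxx - syy, 2.0 * sxy)
    length = math.sqrt((sxx + syy + d) / 2.0)
    width = math.sqrt((sxx + syy - d) / 2.0)
    return length, width, size

# toy image elongated along x, as expected for a gamma-ray candidate
img = [(-2, 0, 1), (-1, 0, 2), (0, 0, 3), (1, 0, 2), (2, 0, 1),
       (0, 0.5, 1), (0, -0.5, 1)]
length, width, size = hillas_length_width(img)
print(length > width)  # True: an elongated, gamma-like image
```

Gamma-ray images tend to be narrow ellipses (small width/length), while hadronic images are broader and more irregular; cuts on these moments implement the shape-based rejection described above.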
The time structure of Cherenkov images provides an additional discriminator
against the hadronic background, which can be used by isochronous detectors (with
parabolic shape) and with a signal integration time smaller than the duration of the
shower (i.e., better than 1–2 GHz).
Systems of more than one Cherenkov telescope provide a better background rejec-
tion, and a better angular and energy resolution than a single telescope.
There are three large operating IACTs: H.E.S.S., MAGIC, and VERITAS; the
first located in the southern hemisphere and the last two in the northern hemisphere.
Fig. 4.52 Development of vertical 1 TeV photon (left) and proton (right) showers in the atmosphere.
The upper panels show the positions in the atmosphere of all shower electrons above the Cherenkov
threshold; the lower panels show the resulting Cherenkov images in the focal plane of a 10 m
reflecting mirror when the showers fall 100 m from the detector (the center of the focal plane is
indicated by a star). From C.M. Hoffmann et al., “Gamma-ray astronomy at high energies,” Reviews
of Modern Physics 71 (1999) 897
Fig. 4.53 Images from the focal camera of a Cherenkov telescope. The electromagnetic events differ
from the hadronic events in several features: the envelope of an electromagnetic shower can be quite
well described by an ellipse, whereas the sizable fraction of large transverse momentum particles
in hadronic showers results in a more scattered reconstructed image. Muons are characterized
by a conical section. From http://www.isdc.unige.ch/cta/images/outreach/
Fig. 4.55 One of the MAGIC telescopes. Credit: Robert Wagner, University of Stockholm
Table 4.5 A comparison of the characteristics of Fermi, the IACTs and of the EAS particle detector
arrays. Sensitivity computed over one year for Fermi and the EAS, and over 50 h for the IACTs
Quantity      | Fermi            | IACTs              | EAS
Energy range  | 20 MeV–200 GeV   | 100 GeV–50 TeV     | 400 GeV–100 TeV
Energy res.   | 5–10 %           | 15–20 %            | ∼50 %
Duty cycle    | 80 %             | 15 %               | >90 %
FoV           | 4π/5             | 5° × 5°            | 4π/6
PSF (deg)     | 0.1              | 0.07               | 0.5
Sensitivity   | 1 % Crab (1 GeV) | 1 % Crab (0.5 TeV) | 0.5 Crab (5 TeV)
Fig. 4.56 Sensitivities of some present and future HE gamma detectors, measured as the minimum
intensity source detectable at 5 σ. The performance for EAS and satellite detector is based on one
year of data taking; for Cherenkov telescopes it is based on 50 h of data
Regular observations are performed in stereoscopic mode. Only events that trigger
both telescopes are recorded. The trigger condition for the individual telescope is
that at least 3 neighboring pixels must be above their pixel threshold (level-0 trigger).
The stereo trigger makes a tight time coincidence between both telescopes taking
into account the delay due to the relative position of the telescopes and their pointing
direction. Although the individual telescope trigger rates are of several kHz, the
stereo trigger rate is in the range of 150–200 Hz with just a few Hz being accidental
triggers. The observational threshold can be lowered to 30 GeV thanks to a
dedicated low-energy trigger.
It is difficult to foresee, for this century, an instrument for GeV photons substantially
improving on the performance of the Fermi LAT: the cost of space missions is such
that the size of Fermi cannot reasonably be exceeded with present technologies.
New satellites under construction (like the Russian-Italian mission GAMMA400 and
the Chinese-Italian mission DAMPE) improve on some aspects of Fermi, e.g.,
calorimetry.
Improvements are however possible in the sector of VHE gamma astrophysics,
which in the current era has been dominated by Cherenkov telescopes. We know
today that the previous generation of EAS detectors was under-dimensioned in
relation to the strength of the sources.
The research in the future will push on both EAS detectors and IACTs, which have
mutually complementary advantages and disadvantages. The sensitivities of the main
present and future detectors are illustrated in Fig. 4.56. We have already seen the
characteristics of HAWC, which is nearing completion; a very large Cherenkov
Telescope Array (CTA) is also under construction.
CTA. The CTA is a future instrument for VHE gamma astrophysics that is expected to
provide an order of magnitude improvement in sensitivity over existing instruments.
An array of tens of telescopes will detect gamma-ray-induced showers over a large
area on the ground, increasing the efficiency and the sensitivity, while providing a
much larger number of views of each cascade. This will result in both improved
angular resolution and better suppression of charged cosmic-ray background events.
In the present design scenario, CTA will be deployed in two sites. The southern
hemisphere array will consist of three types of telescopes with different mirror sizes,
in order to cover the full energy range. In the northern hemisphere array, the two
larger telescope types would be deployed.
• The low energy (the goal is to detect showers starting from an energy of 20 GeV)
instrumentation will consist of four 23 m telescopes with a FoV of about 4–5
degrees.
• The medium energy range, from around 100 GeV to 1 TeV, will be covered by some
20 telescopes of the 12 m class with a FoV of 6–8 degrees.
• The high-energy instruments, operating above 10 TeV, will consist of a large num-
ber (30–40) of small (4–6 m diameter) telescopes with a FoV of around 10 degrees.
The southern CTA will cover about three square kilometers of land with around
60 telescopes that will monitor all the energy ranges in the center of the Milky Way’s
galactic plane. The northern CTA will cover one square kilometer and be composed
of some 30 telescopes. These telescopes will be mostly targeted at extragalactic
astronomy.
The telescopes of different sizes will be arranged in concentric circles, the largest
in the center (Fig. 4.57). Different modes of operation will be possible: deep field
observation, pointing mode, and scanning mode.
The energy spectrum of neutrinos interesting for particle and astroparticle physics
spans more than 20 orders of magnitude, from the 2 eV of relic neutrinos from the big
bang, to the few MeV of the solar neutrinos, to the few GeV of the neutrinos produced
by the interaction of cosmic rays with the atmosphere (atmospheric neutrinos), to the
region of extremely high energy where the production from astrophysical sources is
dominant. We concentrate here on the detection of neutrinos of at least some MeV,
and we present some of the most important neutrino detectors operating.
Fig. 4.57 Left Possible layout of the CTA. Right Project of the large telescope (LST). Credit: CTA
Collaboration
Neutrino detectors mostly rely on the detection of the products of inverse β decays.
The first setups used a solution of cadmium chloride in water and two scintillation
detectors. Antineutrinos with an energy above the 1.8 MeV threshold cause
charged-current inverse beta-decay interactions with the protons in the water,
producing a positron which in turn annihilates, generating pairs of photons that can be detected.
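The 1.8 MeV threshold quoted above follows from kinematics alone. A quick check, using PDG-rounded masses (the numerical values are assumptions of this sketch):

```python
M_P = 938.272  # proton mass [MeV] (PDG rounded value)
M_N = 939.565  # neutron mass [MeV] (PDG rounded value)
M_E = 0.511    # electron mass [MeV] (PDG rounded value)

def ibd_threshold_mev():
    """Kinematic threshold of inverse beta decay, anti-nu_e + p -> e+ + n,
    on a free proton at rest: E_thr = ((m_n + m_e)^2 - m_p^2) / (2 m_p)."""
    return ((M_N + M_E) ** 2 - M_P ** 2) / (2.0 * M_P)

print(round(ibd_threshold_mev(), 2))  # 1.81 MeV, the ~1.8 MeV quoted in the text
```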
Radiochemical chlorine detectors consist instead of a tank filled with a chlorine
solution in a fluid. A neutrino converts a ³⁷Cl atom into one of ³⁷Ar; the threshold
neutrino energy for this reaction is 0.814 MeV. From time to time the argon atoms
are counted to measure the number of radioactive decays.
The first detection of solar neutrinos was achieved using a chlorine detector con-
taining 470 tons of fluid in the former Homestake Mine near Lead, South Dakota.
This was the first measurement of the deficit of electron neutrinos. For this discovery the
leader of the experiment, Ray Davis, won the Nobel prize for physics.16 A similar
detector design, with a lower detection threshold of 0.233 MeV, uses the Ga → Ge
transition.
16 One half of the Nobel prize in physics 2002 was assigned jointly to the US physicist Raymond
Davis Jr. (Washington 1914—New York 2006) and to the leader of the Kamiokande collaboration
(see later) Masatoshi Koshiba (Aichi, Japan, 1926) “for pioneering contributions to astrophysics,
in particular for the detection of cosmic neutrinos.”
Probably the most important results in the sector of MeV to GeV neutrinos in
recent years are due to a Cherenkov-based neutrino detector, Kamiokande, in Japan.
We give here a short description of this detector in its present version, called Super-
Kamiokande.
The Super-Kamiokande Detector. Super-Kamiokande (often abbreviated to Super-
K or SK) is a neutrino observatory located in a mine 1000 m underground under
Mount Kamioka near the city of Hida, Japan. The observatory was initially designed
to search for proton decay, predicted by several unification theories (see Sect. 7.6.1).
Super-K (Fig. 4.58) consists of a cylindrical tank about 40 m tall and 40 m in
diameter containing 50,000 tons of ultra-pure water. The volume is divided by a
stainless steel structure into an inner detector (ID) region (33.8 m in diameter and
Fig. 4.58 The Super-Kamiokande detector (from the Super-Kamiokande web site)
36.2 m in height) and an outer detector (OD) consisting of the remaining tank volume.
Mounted on the structure are about 11,000 PMTs of 50 cm diameter facing the ID,
and about 2000 PMTs of 20 cm diameter facing the OD.
The interaction of a neutrino with the electrons or nuclei in the water can produce
a superluminal charged particle emitting Cherenkov radiation, which is projected
as a ring on the wall of the detector and recorded by the PMTs. Each PMT records
timing and charge information, from which one can reconstruct the interaction
vertex, the direction, and the size of the cone.
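For orientation, the geometry of such a ring can be evaluated with the same refractive index used for water elsewhere in this chapter (n ≈ 1.33); the point-like-emission approximation, the 10 m lever arm, and the function name are our illustrative choices:

```python
import math

def cherenkov_ring_radius_m(distance_m, n=1.33, beta=1.0):
    """Radius of the Cherenkov ring projected on a wall at `distance_m`
    from the emission point, for a track pointing at the wall
    (point-like emission approximation)."""
    theta_c = math.acos(1.0 / (n * beta))  # cos(theta_c) = 1/(n beta)
    return distance_m * math.tan(theta_c)

print(round(math.degrees(math.acos(1.0 / 1.33)), 1))  # Cherenkov angle ~41.2 deg
print(round(cherenkov_ring_radius_m(10.0), 1))        # ~8.8 m ring for a 10 m lever arm
```

In a real event the light is emitted all along the track, so the ring has a finite thickness; its sharpness is one of the handles used to separate electron-like from muon-like events.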
The typical threshold for the detection of electron neutrinos is about 6 MeV.
Electrons lose their energy quickly and, if generated in the ID, are likely to be
fully contained (not penetrating into the OD). Muons instead can penetrate, and
the muon events can be partially contained in the detector (or not contained). The
threshold for the detection of muon neutrinos is about 2 GeV.
A new detector called Hyper-Kamiokande is envisaged, with a volume 20 times
larger than Super-Kamiokande. Construction is expected to start in 2018, and obser-
vations might start in 2025.
The SNO Detector. The Sudbury Neutrino Observatory (SNO) uses 1000 tons of
heavy water (D₂O) contained in a 12 m diameter vessel surrounded by a cylinder
of ordinary water, 22 m in diameter and 34 m high. In addition to the neutrino
interactions visible in a detector such as SK, the presence of deuterium allows the
neutral-current reaction ν + d → ν + p + n, producing a neutron which is captured,
releasing a gamma-ray that can be detected.
The challenge in the field of UHE neutrinos is to build telescopes with good enough
sensitivity to see events, since the flux is expected to be lower than the photon flux
(the main mechanism for the production of neutrinos, i.e., the hadronic mechanism,
is common to photons). This requires instrumenting very large volumes. Efforts to
use large quantities of water and ice as detectors are ongoing. Several experiments
are completed, operating, or in development using Antarctic ice, the oceans, and
lakes, with detection methods including optical and coherent radio detection as well
as particle production.
Among the experiments in operation, the detectors with the largest sensitivity are
Baikal NT-200 and IceCube.
Baikal NT-200 Detector. The underwater neutrino telescope NT-200 is located in
the Siberian lake Baikal at a depth of approximately 1 km and has been taking data
since 1998. It consists of 192 optical sensors deployed in eight strings, with a total active
volume of 5 million cubic meters. Deployment and maintenance are carried out
during winter, when the lake is covered with a thick ice sheet and the sensors can
easily be lowered into the water underneath. Data are collected over the whole year
and permanently transmitted to the shore over electrical cables.
The IceCube Experiment. IceCube, a cube of 1 km³ instrumented in the Antarctic
ice, has been in operation at the South Pole since 2010 (Fig. 4.59). The telescope
views the ice through approximately 5160 optical sensors, deployed in 80 sparse and
6 dense vertical strings, at a depth of 1450–2450 m. At the surface, an air-shower
array is coupled to the detector. As the Earth is opaque to UHE neutrinos, detection
of extremely high-energy neutrinos must come from neutrinos incident at or above
the horizon, while intermediate energy neutrinos are more likely to be seen from
below.
To date some possible high-energy events have been seen, consistent with cosmo-
genic sources (astrophysical interactions of cosmic rays or production from a cosmic
accelerator), with energies most likely between 1 and 10 PeV. It is not excluded,
however, although unlikely, that these events are atmospheric neutrinos. IceCube
sensitivity has reached the upper range of the high-energy neutrino fluxes predicted
in cosmogenic neutrino models.
KM3NeT. A large underwater neutrino detector, KM3NeT, is planned; it will host
a neutrino telescope with a volume of several cubic kilometers at the bottom of
the Mediterranean Sea.
This telescope is foreseen to contain of the order of 12,000 pressure-resistant glass
spheres attached to about 300 detection units, vertical structures nearly one
kilometer in height. Each glass sphere will contain 31 photomultipliers and will be
connected to shore via a high-bandwidth optical link. At shore, a computer farm will perform
the first data filtering in the search for the signal of cosmic neutrinos. KM3NeT builds
on the experience of three pilot projects in the Mediterranean sea: the ANTARES
detector near Marseille, the NEMO project in Sicily, and the NESTOR project in
Greece. ANTARES was completed in 2008 and is the largest neutrino telescope in
the northern hemisphere.
ΔL/L ∼ h
where L is the distance between the two masses, ΔL is its variation, and h is the
dimensionless amplitude (strain) of the gravitational wave.
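To get a feeling for the numbers, one can translate a given strain into an absolute length change; both the strain value (∼10⁻²¹, the sensitivity scale of present detectors mentioned in this section) and the kilometer-scale arm length are illustrative assumptions:

```python
# Both numbers below are illustrative assumptions, not measured values:
h = 1e-21   # dimensionless strain, the sensitivity scale of present detectors
L = 4e3     # interferometer arm length in meters (km-scale arm)

delta_L = h * L  # absolute length variation to be resolved
print(delta_L)   # about 4e-18 m
```

Resolving such a length change, vastly smaller than an atomic radius, is what drives the extreme stability requirements discussed below.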
The idea explored first to detect gravitational waves was to measure the elastic
energy induced in a metal bar by the compression/relaxation of distances at the
passage of the wave. Detectors were metal cylinders, and the energy converted
into longitudinal oscillations of the bar was measured by piezoelectric transducers. The
first large gravitational wave detector, built by Joseph Weber in the early 1960s, was
a 1.2 ton aluminum cylindrical bar of 1.5 m length and 61 cm diameter (Fig. 4.60)
working at room temperature and isolated as much as possible from acoustic and
ground vibrations. The mechanical oscillation of the bar was translated into electric
signals by piezoelectric sensors placed on its surface close to the central region. The
detector behaved as a narrow-band, high-Q (quality factor) mechanical resonator
with a central frequency of about 1600 Hz. The attenuation of the oscillations in
such devices is very small, and therefore the bar oscillates for long periods well
after the excitation induced by a gravitational wave. The sensitivity of Weber's
gravitational antenna was of the order of ∼10⁻¹⁶ over timescales of 10⁻³ s.
The present bar detectors (ALLEGRO, AURIGA, Nautilus, Explorer, Niobe)
reach sensitivities of ∼10⁻²¹, thanks to the introduction of cryogenic techniques,
which allow for a substantial reduction in the thermal noise, and to the use of
high-performing superconducting sensors. However, their frequency bandwidths remain
very narrow (∼tens of Hz) and the resonant frequencies (∼1 kHz) correspond typi-
cally to acoustic wavelengths of the order of the detector length. A further increase in
sensitivity implies a particular attention to the quantum noise, and thus a considerable
increase of the detector mass (bars with hundred tons of mass are being considered).
Nowadays the most sensitive gravitational wave detectors are Michelson-type
interferometers with kilometer-long arms and very stable laser beams (see Fig. 4.61).
The lengths of the perpendicular arms of the interferometer will be differently
modified by an incoming gravitational wave, and the interference pattern will change
according to the difference induced in the round-trip time between the two arms.
These detectors are by nature broadband, their sensitivity being limited only by the
smallest time difference they are able to measure. The present and aimed-for
sensitivities (∼10⁻²²–10⁻²⁴) correspond to measuring length differences many orders
of magnitude (∼10¹⁴–10¹⁶) smaller than the dimension of an atom, and thus both
the stability of the laser beam and the control of all possible noise sources are critical.
Optical (Fabry-Perot) cavities were installed along their arms, in such a way that
the light beams undergo multiple reflections, increasing the effective arm lengths
by a large factor.
Both LIGO and VIRGO are carrying out extensive upgrade programs (Advanced
LIGO and Advanced VIRGO), and a close collaboration among all the gravitational
wave observatories is now in place. The first direct detection of gravitational waves
may be near, in the 10–1000 Hz range.
In order to further extend the lever arms, one needs to go to space. In a more
distant future (one or two decades) a space observatory may be built, extending the
detection sensitivity to a much lower frequency range (10⁻⁴–10⁻¹ Hz). The LISA
concept project, comprising three satellite detectors spaced by more than five million
kilometers (Fig. 4.62), is under study by ESA and NASA.
The present and expected sensitivities of gravitational wave detectors are sum-
marized in Fig. 4.63.
Fig. 4.62 The proposed LISA detector (the size is increased by a factor of 10). From M. Pitkin et al.,
“Gravitational Wave Detection by Interferometry (Ground and Space),” http://www.livingreviews.org/lrr-2011-5
Fig. 4.63 Present and expected sensitivities of gravitational wave detectors (http://www.aspera-eu.org/)
Further Reading
Exercises
the minimum energy (neglecting the background) that such a system can detect
at a height of 2 km a.s.l.?
8. Cherenkov telescopes. If a shower is generated by a gamma ray of E = 1 TeV
penetrating the atmosphere vertically, considering that the radiation length X 0
of air is approximately 37 g/cm2 and its critical energy E c is about 88 MeV,
calculate the height h M of the maximum of the shower in the Heitler model and
in the Rossi approximation B.
9. Cherenkov telescopes. Show that the image of the Cherenkov emission from a
muon in the focal plane of a parabolic IACT is a conical section (approximate
the Cherenkov angle as a constant).
10. Colliders. The luminosity at the Large Electron–Positron Collider (LEP) was
determined by measuring the elastic e+ e− scattering (Bhabha scattering) as its
cross section at low angles is well known from QED. In fact, assuming small
polar angles, the Bhabha scattering cross section integrated between a polar
angle θmin and θmax is given at first order by
σ ≈ (1040 nb / s [GeV²]) · (1/θ²_min − 1/θ²_max) .
Determine the luminosity of a run of LEP knowing that this run lasted 4 hours,
and the number of identified Bhabha scattering events was 1200 in the polar
range θ ∈ [29, 185] mrad. Take into account a detection efficiency of 95 %
and a background of 10 % at √s = m_Z.
11. Initial state radiation. The effective energy of the elastic e+ e− scattering can
be changed by the radiation of a photon by the particles of the beam (initial
state radiation), which is peaked at very small angles. Supposing that a measured
e+ e− pair has the following transverse momenta: p1t = p2t = 5 GeV, and
the radiated photon is collinear with the beam and has an energy of 10 GeV,
determine the effective energy √s_{e+e−} of the interaction of the electron and
positron in the center of mass. Consider that the beam was tuned for √s = m_Z.
12. Energy loss. In the Pierre Auger Observatory the surface detectors are composed
of water Cherenkov tanks 1.2 m high, each containing 12 tons of water. These
detectors are able to measure the light produced by charged particles crossing
them. Consider one tank crossed by a single vertical muon with an energy of
5 GeV. The refractive index of water is n ≈ 1.33 and can, to a good approximation,
be considered constant for all the relevant photon wavelengths. Determine
the energy lost by ionization and compare it with the energy lost by Cherenkov
emission.
13. Synchrotron radiation. Consider a circular synchrotron of radius R0 which is
capable of accelerating charged particles up to an energy E₀. Compare the
radiation emitted by a proton and by an electron, and discuss the difficulties in
accelerating these particles with this technology.
Chapter 5
Particles and Symmetries
Fig. 5.1 The Livingston plot, representing the maximum energies attained by accelerators as a
function of the year: original (left) and updated (right). For colliders, energies are translated into
the laboratory system. Original figures from M.S. Livingston & J.P. Blewett, “Particle Accelerators,”
McGraw-Hill 1962; A. Chao et al., “2001 Snowmass Accelerator R&D Report,” eConf C010630
(2001) MT100
to the GeV in the beginning of the 1950s, was summarized by M. Stanley Livingston
in 1954, in the so-called Livingston plot (Fig. 5.1). This increase went on along the
last fifty years of the twentieth century and just in the most recent years it may have
reached a limit ∼14 TeV with the Large Hadron Collider (LHC) at CERN.
Accelerators provide the possibility to explore in a systematic way the energy scale
up to a few TeV, and thanks to this a huge number of new particles have been discov-
ered. Already in the 1950s, many discoveries of particles with different masses, spins,
charges, and properties took place. Their names almost exhausted the Greek alphabet:
these particles were called π, ρ, η, η′, φ, ω, Δ, Λ, Σ, Ξ, ...
Classifications had to be devised to put order into such a complex zoology. Particles
were first classified according to their masses into classes with names inspired,
once again, by Greek words: heavy particles like the proton were called baryons; light
particles like the electron were called leptons; and particles with intermediate masses
were called mesons. The strict meaning of the class names was soon lost, and now
we know a lepton, the tau, heavier than the proton. According to the present defini-
tion, leptons are those fermions (particles with half-integer spins) that do not interact
strongly with the nuclei of atoms, while baryons are the fermions that do. Mesons
are bosons (particles with integer spins) subject to strong interactions. Baryons and
5.1 A Zoo of Particles 191
mesons thus interact mainly via the strong nuclear force, and have the common
designation of hadrons.
The detailed study of these particles shows that there are several conserved quan-
tities in their interactions and decays. The total electric charge, for example, is always
conserved, but also the total number of baryons appears to be conserved, and thus
the proton, being the lightest baryon, cannot decay (the present experimental limit
for the proton lifetime is about 10³⁴ years). Strange particles, if they decay by
strong interactions, always give birth to a lighter strange particle, but the same is
not true when they decay via weak interactions. To each (totally or partially) conserved
quantity, a new quantum number was associated: baryons, for instance, have “bary-
onic quantum number” +1 (anti-baryons have baryonic quantum number −1, and
mesons have baryonic quantum number 0).
As a consequence of baryon number conservation, for example, the most eco-
nomic way to produce an antiproton in a proton–proton collision is the reaction
pp → ppp p̄. A proton beam with energy above the corresponding kinematic thresh-
old is thus needed to make this process possible. The Bevatron, a proton synchrotron at
the Lawrence Berkeley National Laboratory providing beams with energy of 6.2 GeV,
was designed for this purpose and started operation in 1954. The following year
Chamberlain, Segrè, Wiegand, and Ypsilantis announced the discovery of the antiproton;
the new particle was identified by measuring its momentum and mass using a spec-
trometer with a known magnetic field, a Cherenkov detector, and a time-of-flight
system. This discovery confirmed that indeed, as predicted by the Dirac equation, to
each particle an oppositely charged particle with the same mass and spin corresponds.
The existence of particles and antiparticles is an example of symmetry, and sym-
metries became more and more present in the characterization of particles and of
their interactions. Particle physicists had to study or reinvent group theory in order
to profit of the possible simplifications guaranteed by the existence of symmetries.
Being a part of the Universe, it is difficult to imagine how humans can expect to
understand it. But we can simplify the representation of the Universe if we find that
its description is symmetrical with respect to some transformations. For example,
if the representation of the physical world is invariant with respect to translation in
space, we can say that the laws of physics are the same everywhere, and this simplifies
our description of Nature—quite a lot.
The dynamical description of a system of particles can be classically expressed
by the positions q j of the particles themselves, by their momenta p j and by the
potentials of the interactions within the system. One way to express this is to use the
so-called Hamiltonian function
H = K +V (5.1)
192 5 Particles and Symmetries
which represents the total energy of the system (K is the term corresponding to the
kinetic energies, while V corresponds to the potentials). An equivalent description,
using the Lagrangian function, will be discussed in the next chapter.
From the Hamiltonian, the time evolution of the system is obtained by Hamilton's
equations:

dp_j/dt = −∂H/∂q_j (5.2)

dq_j/dt = ∂H/∂p_j (5.3)

where the momentum p_j conjugate to the coordinate q_j is defined from the
Lagrangian function L (discussed in the next chapter) as

p_j = ∂L/∂q̇_j . (5.4)
For example, in the case of a single particle in a conservative field in one dimension,

H = p²/(2m) + V (5.5)

and Hamilton's equations become

dp/dt = −dV/dx = F ; dx/dt = p/m . (5.6)
To the Hamiltonian, there corresponds a quantum mechanical operator, which in
the non-relativistic theory can be written as

Ĥ = p̂²/(2m) + V . (5.7)
We shall expand this concept in this chapter and in the next one.
Symmetries of a Hamiltonian with respect to given operations entail conservation
laws: this fact is demonstrated by the famous Noether theorem.1 In the opinion of
the authors, this is one of the most elegant theorems in physics.
Let us consider an invariance of the Hamiltonian with respect to a certain
transformation—for example, a translation along x. One can write
1 Emmy Noether (1882–1935) was a German mathematician. After abandoning her original plan to
become a teacher of foreign languages, she studied mathematics at the University of Erlangen, where
her father was a professor. After graduating in 1907, she worked for seven years as an unpaid assistant
(at the time women could not apply for academic positions). In 1915, she joined the University of
Göttingen, thanks to an invitation by David Hilbert and Felix Klein, but the faculty did not allow
her to receive a salary, and she worked for four years without pay. In that time, she published her famous
theorem. Finally, Noether moved to the US to take up a college professorship in Philadelphia, where
she died at the age of 53.
$$0 = dH = \frac{\partial H}{\partial x}\, dx = -\frac{dp_x}{dt}\, dx \;\Longrightarrow\; \frac{dp_x}{dt} = 0 \, .$$
If the Hamiltonian is invariant with respect to a translation along a coordinate, the
momentum associated with this coordinate is constant. And the Hamiltonian of the
world should be invariant with respect to translations if the laws of physics are the
same everywhere. In a similar way, one can show that the invariance of
a Hamiltonian with respect to time translations entails energy conservation, and that rotational
invariance entails the conservation of angular momentum.
A set (a, b, c, ..., finite or infinite) of objects or transformations (called hereafter elements
of the group) forms a group G if there is an operation (called hereafter product
and represented by the symbol ∘) between any two of its elements and if:

1. It is closed: the product of any two elements a, b is an element c of the group,
   c = a ∘ b.
2. There is one and only one identity element: the product of any element a with the
   identity element e is the element a itself,
   a = a ∘ e = e ∘ a.
3. Each element has an inverse: the product of an element a with its inverse element
   b (designated also as a⁻¹) is the identity e,
   e = a ∘ b = b ∘ a.
4. The associativity law holds: the product of three elements a, b, c can be
   carried out as the product of one element with the product of the other two, or as the
   product of the product of two elements with the remaining element, keeping however
   the order of the elements:
   a ∘ b ∘ c = a ∘ (b ∘ c) = (a ∘ b) ∘ c.
If, in addition, the product is commutative,

$$a \circ b = b \circ a \, ,$$

the group is said to be Abelian. Important groups in physics are associated with transformations
of space and time. For instance, in mechanics, the description of isolated systems is
invariant with respect to space and time translations as well as to space rotations.
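The axioms above can be checked mechanically on a concrete example. The following sketch (our own, not from the text) takes the integers modulo 6 under addition as the group and `op` as the product, and verifies closure, identity, inverses, associativity, and commutativity:

```python
# A minimal check that (Z_6, + mod 6) satisfies the four group axioms,
# and is moreover Abelian.
n = 6
G = list(range(n))
op = lambda a, b: (a + b) % n           # the group "product"

closed = all(op(a, b) in G for a in G for b in G)
e = 0                                    # identity candidate
identity = all(op(a, e) == a == op(e, a) for a in G)
inverses = all(any(op(a, b) == e for b in G) for a in G)
assoc = all(op(a, op(b, c)) == op(op(a, b), c)
            for a in G for b in G for c in G)
abelian = all(op(a, b) == op(b, a) for a in G for b in G)

print(closed, identity, inverses, assoc, abelian)  # True True True True True
```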
We have seen that a fundamental theorem, Noether's theorem, guarantees that for
each symmetry of a system there is a corresponding conservation law and therefore
a conserved quantity. The formulation of Noether's theorem in quantum mechanics
is particularly elegant.
Suppose that a physical system is invariant under some transformation U (it can be,
for example, the rotation of the coordinate axes). This means that invariance holds
when the wave function is subject to the transformation

$$\psi \to \psi' = U \psi \, .$$
A minimum requirement for U not to change the physical laws is unitarity, since the
normalization of the wave function should be kept:

$$\langle \psi' | \psi' \rangle = \langle \psi | U^\dagger U | \psi \rangle = \langle \psi | \psi \rangle \;\Longrightarrow\; U^\dagger U = U U^\dagger = I$$

where I represents the unit operator (which can be represented, for example, by the identity matrix)
and U† is the Hermitian conjugate of U. In what follows we shall use without
distinction the terms Hermitian conjugate, conjugate transpose, Hermitian transpose,
or adjoint of an m × n complex matrix A to indicate the n × m matrix obtained from
A by taking the transpose and then the complex conjugate of each entry.
For physical predictions to be unchanged by the symmetry transformation, the
eigenvalues of the Hamiltonian should be unchanged, i.e., if

$$\hat{H} \psi_i = E_i \psi_i$$

then

$$\hat{H} \psi'_i = E_i \psi'_i \, .$$

Since ψ′ᵢ = Uψᵢ, this implies

$$\hat{H} U \psi_i = E_i U \psi_i = U E_i \psi_i = U \hat{H} \psi_i$$

and since the {ψᵢ}, eigenstates of the Hamiltonian, form a complete basis, U commutes
with the Hamiltonian:

$$[\hat{H}, U] = \hat{H} U - U \hat{H} = 0 \, .$$
Thus for every symmetry of a system there is a unitary operator that commutes with
the Hamiltonian.
As a consequence, the expectation value of U is constant, since

$$\frac{d}{dt} \langle \psi | U | \psi \rangle = -i \langle \psi | [U, \hat{H}] | \psi \rangle = 0 \, . \qquad (5.8)$$
For a continuous transformation close to the identity, one can write, in terms of an
infinitesimal parameter ε,

$$U(\epsilon) = I + i \epsilon G$$

where G is called the generator of the transformation. Unitarity then requires, at first
order in ε,

$$U^\dagger U = (I - i \epsilon G^\dagger)(I + i \epsilon G) \simeq I + i \epsilon (G - G^\dagger) = I$$

i.e.,

$$G^\dagger = G \, .$$
The generator of the unitary group is thus Hermitian, and it is therefore associated
with an observable quantity (its eigenvalues are real). Moreover, it commutes with
the Hamiltonian:

$$[\hat{H}, I + i \epsilon G] = 0 \;\Longrightarrow\; [\hat{H}, G] = 0$$

(trivially [Ĥ, I] = 0), and since the time evolution of the expectation value of G is
given by the equation

$$\frac{d}{dt} \langle G \rangle = i \langle [\hat{H}, G] \rangle = 0 \, ,$$

the quantity G is conserved.
Continuous symmetries in quantum mechanics are thus associated with conservation
laws related to the group generators. In the next section, we shall give examples; in
particular, we shall see how translational invariance entails momentum conservation
(the momentum operator being the generator of space translations).
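The chain "symmetry → unitary operator commuting with Ĥ → conserved generator" can be verified numerically. The sketch below is our own illustration (the specific matrices are arbitrary choices, not from the text): it builds a Hermitian G with [Ĥ, G] = 0 and checks that ⟨ψ(t)|G|ψ(t)⟩ does not change under the evolution U(t) = exp(−iĤt):

```python
import numpy as np

# H is taken diagonal for simplicity; G is Hermitian and mixes only the two
# degenerate levels of H, so [H, G] = 0.
H = np.diag([1.0, 2.0, 2.0])
G = np.array([[3.0, 0, 0],
              [0, 0, 1j],
              [0, -1j, 0]])

assert np.allclose(H @ G - G @ H, 0)          # [H, G] = 0

def evolve(psi, t):
    # U(t) = exp(-iHt); H is diagonal, so the exponential acts elementwise
    return np.exp(-1j * np.diag(H) * t) * psi

psi0 = np.array([1.0, 1.0, 1.0j]) / np.sqrt(3)
expvals = [np.vdot(evolve(psi0, t), G @ evolve(psi0, t)).real
           for t in (0.0, 0.7, 2.5)]
print(np.allclose(expvals, expvals[0]))       # True: <G> is conserved
```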
Let us now see how Noether's theorem can be extended to discrete symmetries. If one
has a discrete Hermitian operator P̂ which commutes with the Hamiltonian,

$$[\hat{H}, \hat{P}] = 0 \, ,$$

its eigenvalues are conserved; if in addition

$$\hat{P}^2 = I \, ,$$

these eigenvalues can only be ±1.
Due to fundamental properties of space and time, a generic system is invariant with
respect to space translations. We can consider, without loss of generality, a translation
Δx along x:

$$\psi'(x) = \psi(x + \Delta x) = \psi(x) + \Delta x \frac{\partial \psi}{\partial x} + \frac{1}{2} (\Delta x)^2 \frac{\partial^2 \psi}{\partial x^2} + \cdots \qquad (5.9)$$

which can be written in a symbolic way as

$$\psi'(x) = \exp\left( \Delta x \frac{\partial}{\partial x} \right) \psi(x) \, .$$

Recalling the definition of the momentum operator in quantum mechanics,

$$\hat{p}_x = -i \frac{\partial}{\partial x} \, ,$$

we have thus

$$\psi'(x) = \exp\left( i\, \Delta x\, \hat{p}_x \right) \psi(x) \, .$$
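That the exponential of the derivative operator really translates a function can be checked on a polynomial, for which the Taylor series (5.9) terminates exactly. The following sketch (our own, with arbitrary coefficients) sums the series and compares the result with f(x + Δx):

```python
import numpy as np
from math import factorial

# f(x) = 2 - x + 3x^2 + 0.5x^3 ; coefficients in increasing powers of x
coeffs = [2.0, -1.0, 3.0, 0.5]
f = np.polynomial.Polynomial(coeffs)

x0, dx = 1.2, 0.7

# exp(dx d/dx) f(x0) = sum_n (dx^n / n!) f^(n)(x0); the series terminates
# at n = 3 for a cubic polynomial.
shifted = f(x0) + sum(dx**n / factorial(n) * f.deriv(n)(x0)
                      for n in range(1, len(coeffs)))

print(np.isclose(shifted, f(x0 + dx)))  # True: exp(dx d/dx) f(x) = f(x + dx)
```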
The same exercise can be done for other transformations which leave a physical
system invariant like for instance rotations about an arbitrary axis (this invariance
is due to the isotropy of space). In the case of a rotation about the z axis, the rotation
operator will be

$$R_z(\theta_z) = \exp\left( i\, \theta_z \hat{L}_z \right)$$

where L̂z, the angular momentum operator about the z axis, is the generator of the
rotation:

$$\hat{L}_z = -i \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right) ,$$
which in fact ensures that, expanding the exponential, the usual rotation matrix is
recovered:

$$R_z(\theta_z) = \exp\left( i\, \theta_z \hat{L}_z \right) = \begin{pmatrix} \cos\theta_z & \sin\theta_z & 0 \\ -\sin\theta_z & \cos\theta_z & 0 \\ 0 & 0 & 1 \end{pmatrix} . \qquad (5.12)$$
A sequence of rotations about different axes corresponds to the product

$$R_x(\theta_x)\, R_y(\theta_y)\, R_z(\theta_z) = \exp\left( i \theta_x \hat{L}_x \right) \exp\left( i \theta_y \hat{L}_y \right) \exp\left( i \theta_z \hat{L}_z \right) ,$$

which is, as the angular momentum operators do not commute, different from the exponential of
the sum of the exponents,

$$\exp\left[ i \left( \theta_x \hat{L}_x + \theta_y \hat{L}_y + \theta_z \hat{L}_z \right) \right] :$$

the result of a sequence of rotations depends on the order in which the rotations are
done.
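The non-commutativity of finite rotations is easy to verify numerically. The sketch below (our own, with arbitrary angles) uses rotation matrices written in the convention of Eq. (5.12):

```python
import numpy as np

# Rotation matrices about the x and z axes, in the sign convention of Eq. (5.12)
def Rx(t):
    return np.array([[1, 0, 0],
                     [0, np.cos(t), np.sin(t)],
                     [0, -np.sin(t), np.cos(t)]])

def Rz(t):
    return np.array([[np.cos(t), np.sin(t), 0],
                     [-np.sin(t), np.cos(t), 0],
                     [0, 0, 1]])

a, b = 0.4, 1.1
print(np.allclose(Rx(a) @ Rz(b), Rz(b) @ Rx(a)))  # False: different axes do not commute
print(np.allclose(Rx(a) @ Rx(b), Rx(a + b)))      # True: same-axis rotations compose additively
```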
In fact, for two operators  and B̂, the following relation holds whenever the commutator [Â, B̂] commutes with both operators:

$$\exp\left( \hat{A} + \hat{B} \right) = \exp\left( -\tfrac{1}{2} \left[ \hat{A}, \hat{B} \right] \right) \exp(\hat{A}) \exp(\hat{B}) \qquad (5.13)$$

where [Â, B̂] = ÂB̂ − B̂Â is the commutator of the two operators.
The commutators of the angular momentum operators are indeed not zero, and are
given by

$$\left[ \hat{L}_x, \hat{L}_y \right] = i \hat{L}_z \, ; \quad \left[ \hat{L}_y, \hat{L}_z \right] = i \hat{L}_x \, ; \quad \left[ \hat{L}_z, \hat{L}_x \right] = i \hat{L}_y \, . \qquad (5.14)$$
The commutation relations between the generators thus determine the product of
the elements of the rotation group; they are known as the Lie algebra of the group.
Once a basis is defined, operators can in most cases be associated with matrices;
there is an isomorphism between vectors and states, and between matrices and operators.2 In the
following, whenever there is no ambiguity, we shall identify operators with matrices
and omit the "hat" associated with operators.
2 Here, we are indeed cutting a long story short; we address the interested reader to a textbook in
quantum physics to learn what is behind this fundamental point.
Unitary groups U(n) and Special Unitary groups SU(n) of a generic rank n play a
central role in particle physics both related to the classification of the elementary
particles and to the theories of fundamental interactions.
The unitary group U(n) is the group of unitary complex square matrices with n
rows and n columns. A complex n × n matrix has 2n² free parameters, but the unitarity
condition (U†U = UU† = I) imposes n² constraints, and thus the number of free
parameters is n². A particularly important group is U(1), which has just
one free parameter and thus one generator. It corresponds to a phase transformation:

$$U = \exp\left( i \alpha \hat{A} \right)$$
where α is a real number and  is a Hermitian operator. Relevant cases are those where  is the
identity operator (as in a global change of the phase of the wave function, discussed
above) or an operator associated with a single measurable quantity (like the electric
charge or the baryon number). Noether's theorem ensures that the invariance of
the Hamiltonian with respect to such a transformation entails the conservation of a
corresponding measurable quantity.
The Special Unitary group SU(n) is the group of unitary complex matrices of
dimension n × n and with determinant equal to 1. The number of free parameters
and generators of the group is thus (n 2 − 1). Particularly important groups will be
the groups SU(2) and SU(3).
5.3.4 SU(2)
SU(2) is the group of the spin rotations. The generic SU(2) matrix can be written as
$$\begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \qquad (5.15)$$
where a and b are complex numbers and |a|² + |b|² = 1. This group has three free
parameters and thus three generators. SU(2) operates in an internal space of particle
spinors, which in this context are complex two-dimensional vectors introduced to
describe the polarization states of spin 1/2 particles (such as the electron). For instance, in a (|+z⟩, |−z⟩)
basis, the polarization states along z, x, and y can be written as

$$|{+}z\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \quad |{-}z\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (5.16)$$

$$|{+}x\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} , \quad |{-}x\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \qquad (5.17)$$

$$|{+}y\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix} , \quad |{-}y\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix} . \qquad (5.18)$$
The generators of the group, which are proportional to the spin 1/2 angular momentum operators,
can be chosen for example (the choice is not unique) as the Pauli matrices σz, σx, and σy:

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} ; \quad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} ; \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad (5.19)$$

with

$$\hat{S}_z = \frac{\sigma_z}{2} \, ; \quad \hat{S}_x = \frac{\sigma_x}{2} \, ; \quad \hat{S}_y = \frac{\sigma_y}{2} \, . \qquad (5.20)$$

The following commutation relations hold:

$$\left[ \hat{S}_i, \hat{S}_j \right] = i\, \epsilon_{ijk} \hat{S}_k \qquad (5.21)$$

where εᵢⱼₖ, the Levi-Civita symbol, is the completely antisymmetric tensor: it
takes the value +1 if (i, j, k) is an even permutation of (x, y, z),
the value −1 if (i, j, k) is an odd permutation of (x, y, z), and
is zero otherwise.
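The commutation relations (5.21) can be verified directly with the matrices of Eqs. (5.19)-(5.20); a minimal numerical check (our own):

```python
import numpy as np

# Pauli matrices and the spin-1/2 operators S_i = sigma_i / 2
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
Sx, Sy, Sz = sx / 2, sy / 2, sz / 2

comm = lambda A, B: A @ B - B @ A
print(np.allclose(comm(Sx, Sy), 1j * Sz))   # [Sx, Sy] = i Sz
print(np.allclose(comm(Sy, Sz), 1j * Sx))   # [Sy, Sz] = i Sx
print(np.allclose(comm(Sz, Sx), 1j * Sy))   # [Sz, Sx] = i Sy
```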
These commutation relations are identical to those discussed above for the generators of space rotations (the ordinary angular momentum operators) in three dimensions,
which form an SO(3) group (the group of real orthogonal 3 × 3 matrices
with determinant equal to 1). SU(2) and SO(3) thus have the same algebra, and there
is a mapping between the elements of SU(2) and the elements of SO(3) which respects
the respective group operations. But, while in our example SO(3) operates in real
space, transforming particle wave functions, SU(2) operates in the internal space of
particle spinors.
The rotation operator in this spinor space around a generic axis i can then be
written as

$$U = \exp\left( i \frac{\theta_i}{2} \sigma_i \right) \qquad (5.22)$$

and in general, defining $\vec{\sigma} = \sigma_x \vec{e}_x + \sigma_y \vec{e}_y + \sigma_z \vec{e}_z$, where the $\vec{e}_{x,y,z}$ are the unit
vectors of the coordinate axes, and aligning the rotation axis with a unit vector $\vec{n}$:

$$U = \exp\left( i \frac{\theta}{2}\, \vec{n} \cdot \vec{\sigma} \right) = \cos\frac{\theta}{2} + i \sin\frac{\theta}{2}\, \vec{n} \cdot \vec{\sigma} \, . \qquad (5.23)$$
Spin projection operators do not commute, and thus the Heisenberg uncertainty principle tells us
that the projections of the spin along different axes cannot be measured simultaneously
with arbitrary precision. However, for SU(n), (n − 1) operators (called the Casimir operators)
exist which commute with all the SU(n) generators. In the case of SU(2), there is then
just one, and it can be chosen as the square of the total spin:

$$\hat{S}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2 \, .$$

Since

$$\hat{S}_z |s, m_s\rangle = m_s |s, m_s\rangle \, , \qquad \hat{S}^2 |s, m_s\rangle = s(s+1) |s, m_s\rangle \, ,$$

spin eigenstates can thus be labeled by the eigenvalue m_s of the projection operator
along a given axis and by the total spin s. The two other operators Ŝx and Ŝy can be
combined to form the so-called raising and lowering operators,

$$\hat{S}_\pm = \hat{S}_x \pm i \hat{S}_y \, ,$$

the names "raising" and "lowering" being justified by the fact that Ŝ± applied to |s, m_s⟩ gives a state proportional to |s, m_s ± 1⟩.
5.3.5 SU(3)
SU(3) is the group behind the so-called “eightfold way” (the organization of baryons
and mesons in octets and decuplets, which was the first successful attempt of classifi-
cation of hadrons) and behind QCD (quantum chromodynamics, the modern theory of
strong interactions). Indeed, SU(3) operates in an internal space of three-dimensional
complex vectors and thus can accommodate at the same level rotations among three
different elements (flavors u, d, s or colors Red, Green, Blue). The eightfold way
will be discussed later in this Chapter, while QCD will be discussed in Chap. 6; here,
we present the basics of SU(3).
The elements of SU(3), generalizing those of SU(2), can be written as

$$U_j = \exp\left( i \frac{\theta_j}{2} \lambda_j \right)$$

where the 3 × 3 matrices λⱼ are the generators. Since for a generic matrix A one has
det(exp A) = exp(tr A), the condition of unit determinant implies that the generators
are traceless. The standard choice is

$$t_i = \frac{\lambda_i}{2}$$
where λi are the Gell-Mann matrices:
$$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} ; \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} ; \quad \lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

$$\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} ; \quad \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}$$

$$\lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} ; \quad \lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix} ; \quad \lambda_8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix} .$$
Useful combinations of the generators are the isospin and hypercharge operators,

$$\hat{I}_3 = t_3 \, ; \qquad \hat{Y} = \frac{2}{\sqrt{3}}\, t_8 \, ,$$

and the ladder (raising and lowering) operators

$$\hat{I}_\pm = t_1 \pm i t_2 \, ; \qquad \hat{V}_\pm = t_4 \pm i t_5 \, ; \qquad \hat{U}_\pm = t_6 \pm i t_7 \, .$$
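A quick numerical check (our own, not from the text) that the Gell-Mann matrices are traceless, Hermitian, and normalized as tr(λₐλᵦ) = 2δₐᵦ:

```python
import numpy as np

l1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)
l2 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
l3 = np.array([[1, 0, 0], [0, -1, 0], [0, 0, 0]], dtype=complex)
l4 = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=complex)
l5 = np.array([[0, 0, -1j], [0, 0, 0], [1j, 0, 0]])
l6 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=complex)
l7 = np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
l8 = np.diag([1, 1, -2]).astype(complex) / np.sqrt(3)
lams = [l1, l2, l3, l4, l5, l6, l7, l8]

traceless = all(abs(np.trace(l)) < 1e-12 for l in lams)
hermitian = all(np.allclose(l, l.conj().T) for l in lams)
norm = all(np.isclose(np.trace(la @ lb).real, 2.0 * (a == b))
           for a, la in enumerate(lams) for b, lb in enumerate(lams))
print(traceless, hermitian, norm)   # True True True
```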
Let us now examine in more detail three fundamental discrete symmetries: parity,
charge conjugation, and time reversal.
5.3.6.1 Parity
A parity transformation inverts the space coordinates:

$$\vec{x} \to \vec{x}\,' = -\vec{x} \, .$$
A vector (for instance, the position vector, the linear momentum, or the electric
field) is inverted under a parity transformation, while the cross product of two
vectors (like the angular momentum or the magnetic field) is not changed.
The latter is called a pseudo- (or axial) vector. The scalar product of two vectors
is a scalar and is invariant under a parity transformation, but the scalar product of a
vector and a pseudo-vector changes sign under a parity transformation and is called
a pseudo-scalar.
The application of the parity operator P̂ once and twice to a wave function leads to

$$\hat{P}\, \psi(\vec{x}) = \lambda_P\, \psi(\vec{x}) = \psi(-\vec{x})$$

$$\hat{P}^2 \psi(\vec{x}) = \lambda_P^2\, \psi(\vec{x}) = \psi(\vec{x}) \, ,$$

implying that the eigenvalues of the P̂ operator are λ_P = ±1. The parity group
has just two elements: P̂ and the identity operator Î. The operator P̂ is thus Hermitian, and a
measurable quantity, parity, can be associated with its eigenvectors: it is a good
quantum number.
5.3.6.2 Charge Conjugation
Charge conjugation reverses the sign of all “internal” quantum numbers (electric
charge, baryon number, strangeness, …) keeping the values of mass, momentum,
energy, and spin. It transforms a particle into its own antiparticle. Applying the charge
conjugation operator Ĉ twice brings the state back to the original state, as in the
case of parity. The eigenvalues of Ĉ are again λ_C = ±1, but most of the elementary
particles are not eigenstates of Ĉ (particle and antiparticle are usually not the same);
this is trivially so for electrically charged particles.
Once again, electromagnetic and strong interactions appear to be invariant under
charge conjugation, but weak interactions are not (they are "almost" invariant under the
product Ĉ P̂, as will be discussed later on).
Electric charge changes sign under charge conjugation, and so do the electric and
magnetic fields. The photon thus has a negative Ĉ eigenvalue. The neutral pion π⁰
decays into two photons: its Ĉ eigenvalue is positive. Now you should be able to
answer the question: is the decay π⁰ → γγγ possible?
5.3.6.3 Time Reversal

A time reversal transformation inverts the time coordinate:

$$t \to t' = -t \, .$$
Physical laws that are invariant under such a transformation have no preferred time
direction: going back into the past would be as possible as going forward into the future.
Although we have compelling evidence in our everyday lives that this is not generally the
case, the Hamiltonians of the fundamental interactions were believed to exhibit such
invariance. On the other hand, in relativistic quantum field theory in flat space-time
geometry, it has been demonstrated (the CPT theorem), under very generic assumptions,
that any quantum theory is invariant under the combined action of charge conjugation
(C), space reversal (parity, P), and time reversal (T). This is the case for the Standard
Model of particle physics. As a consequence of the CPT theorem, particles and
antiparticles must have identical masses and lifetimes. Stringent tests have been
performed, the best being the limit at 90 % CL on the relative mass difference between the
K⁰ and the K̄⁰:

$$\frac{\left| m_{K^0} - m_{\bar{K}^0} \right|}{m_{K^0}} < 0.6 \times 10^{-18} \, . \qquad (5.29)$$
5.3.7 Isospin
In 1932, J. Chadwick discovered the neutron after more than ten years of intensive
experimental searches, following the observation by Rutherford that, to explain the
masses and charges of all atoms apart from hydrogen, the nucleus should consist of
protons and of neutral bound states of electrons and protons. The particle discovered
was not a bound state of electron and proton; meanwhile, the uncertainty relations
demonstrated by Heisenberg in 1927 had indeed forbidden it. The neutron was instead
a particle like the proton, with almost the same mass (m_p ≃ 938.27 MeV/c²,
m_n ≃ 939.57 MeV/c²) and the same behavior with respect to the nuclear interaction, but with no
electric charge. It was the neutral "brother" of the proton.
Soon after the neutron's discovery, Heisenberg proposed to regard the proton and the neutron
as two states of a single particle, later on called the nucleon. The formalism was
borrowed from the Pauli spin theory, and Wigner, in 1937, called this new internal symmetry,
with respect to rotations in the space defined by the
vectors (p, 0) and (0, n), the "isospin" symmetry. Strong interactions should be invariant with respect to an
internal SU(2) symmetry group; the nucleons would have isospin I = 1/2, and their
states would be described by isospin spinors. By convention, the proton is identified
with the isospin-up (I₃ = +1/2) projection and the neutron with the isospin-down
(I₃ = −1/2) projection.
As we have discussed in Chap. 3, Yukawa postulated in 1935 that the short-range
nuclear interaction might be explained by the exchange of a massive meson, the pion.
The charge independence of nuclear interactions suggested later on that three pions
(π⁺, π⁻, π⁰) should exist. The nuclear interaction could thus be seen as an interaction
between the nucleon isospin doublet (I = 1/2) and an isovector (I = 1) triplet of
pions. Three spin 0, isospin 1 pions (π⁺ with I₃ = +1, π⁰ with I₃ = 0, π⁻
with I₃ = −1) with almost the same masses (m_π± ≃ 139.6 MeV/c², m_π⁰ ≃ 135.0
MeV/c²) were indeed discovered in the late 1940s and the beginning of the 1950s.
The isospin theory of nuclear interactions was established.
The composition of two isospin 1/2 states follows the same rules as the addition of two
angular momenta; using the Clebsch–Gordan coefficients (Fig. 5.4):

$$\left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle = |1, 1\rangle$$

$$\left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle = \tfrac{1}{\sqrt{2}} |1, 0\rangle + \tfrac{1}{\sqrt{2}} |0, 0\rangle$$

$$\left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle = \tfrac{1}{\sqrt{2}} |1, 0\rangle - \tfrac{1}{\sqrt{2}} |0, 0\rangle$$

$$\left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle = |1, -1\rangle \qquad (5.30)$$
The final states can thus be organized in a symmetric triplet of total isospin 1,

$$|1, 1\rangle = \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle$$

$$|1, 0\rangle = \tfrac{1}{\sqrt{2}} \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle + \tfrac{1}{\sqrt{2}} \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle \qquad (5.31)$$

$$|1, -1\rangle = \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle$$

and in an antisymmetric singlet of total isospin 0,

$$|0, 0\rangle = \tfrac{1}{\sqrt{2}} \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle - \tfrac{1}{\sqrt{2}} \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle \left| \tfrac{1}{2}, +\tfrac{1}{2} \right\rangle . \qquad (5.32)$$
In the language of group theory, the tensor product of two SU(2) doublets gives a
triplet and a singlet:

$$2 \otimes 2 = 3 \oplus 1 \, . \qquad (5.33)$$

Strong interactions are invariant under SU(2) rotations in the internal isospin space,
and according to Noether's theorem, the total isospin I is conserved in such interactions.
The transition amplitudes between initial and final states are a function of the isospin
I and can be labeled as M_I.
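The decomposition 2 ⊗ 2 = 3 ⊕ 1 can be verified numerically by building the total isospin operators on the four-dimensional product space. In the sketch below (our own, not from the text), the triplet states of Eq. (5.31) have Î² eigenvalue I(I + 1) = 2 and the singlet of Eq. (5.32) has eigenvalue 0:

```python
import numpy as np

# single-doublet isospin operators (Pauli matrices over 2)
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
one = np.eye(2)

# total I_i = I_i (x) 1 + 1 (x) I_i on the 4-dimensional product space
Ix = np.kron(sx, one) + np.kron(one, sx)
Iy = np.kron(sy, one) + np.kron(one, sy)
Iz = np.kron(sz, one) + np.kron(one, sz)
Isq = Ix @ Ix + Iy @ Iy + Iz @ Iz

up, dn = np.array([1, 0]), np.array([0, 1])
triplet0 = (np.kron(up, dn) + np.kron(dn, up)) / np.sqrt(2)
singlet = (np.kron(up, dn) - np.kron(dn, up)) / np.sqrt(2)

print(np.allclose(Isq @ np.kron(up, up), 2 * np.kron(up, up)))  # True: I = 1
print(np.allclose(Isq @ triplet0, 2 * triplet0))                # True: I = 1
print(np.allclose(Isq @ singlet, 0 * singlet))                  # True: I = 0
```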
Let us consider the inelastic collision of two nucleons giving a deuterium nucleus
and a pion. Three channels are considered:
1. p + p → d + π +
2. p + n → d + π 0
3. n + n → d + π − .
The deuteron is a pn bound state and must have isospin I = 0; otherwise, it would belong to an
isospin triplet, and the pp and nn bound states should also exist (experimentally, they do not).

Fig. 5.4 Clebsch–Gordan coefficients and spherical harmonics. From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001. Note: a square-root sign is to be understood over every coefficient; e.g., for −8/15 read −√(8/15)

The isospin quantum numbers |I, I₃⟩ of the final states are thus those of the π, which means |1, 1⟩,
|1, 0⟩, and |1, −1⟩, respectively. The isospins of the initial states follow the rules of
the addition of two isospin 1/2 states discussed above and are, respectively, |1, 1⟩,
(1/√2)|1, 0⟩ + (1/√2)|0, 0⟩, and |1, −1⟩. As the final state is in each case a pure I = 1 state, only the
transition amplitude corresponding to I = 1 is possible. The cross section (proportional to
the square of the transition amplitude) for the reaction p + n → d + π⁰
should then be half of the cross section of each of the other two reactions.
Let us consider now the π + p and π − p collisions:
1. π + + p → π + + p
2. π − + p → π − + p
3. π − + p → π 0 + n .
Using the Clebsch–Gordan tables, the isospin decompositions of the initial and final
states are

$$\pi^+ + p: \; |1, 1\rangle \left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle = \left| \tfrac{3}{2}, \tfrac{3}{2} \right\rangle$$

$$\pi^- + p: \; |1, -1\rangle \left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle = \sqrt{\tfrac{1}{3}} \left| \tfrac{3}{2}, -\tfrac{1}{2} \right\rangle - \sqrt{\tfrac{2}{3}} \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle$$

$$\pi^0 + n: \; |1, 0\rangle \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle = \sqrt{\tfrac{2}{3}} \left| \tfrac{3}{2}, -\tfrac{1}{2} \right\rangle + \sqrt{\tfrac{1}{3}} \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle .$$
Therefore, there are two possible transition amplitudes, M₁/₂ and M₃/₂, corresponding
to I = 1/2 and I = 3/2, respectively, and

$$\mathcal{M}(\pi^+ p \to \pi^+ p) \propto M_{3/2}$$

$$\mathcal{M}(\pi^- p \to \pi^- p) \propto \tfrac{1}{3} M_{3/2} + \tfrac{2}{3} M_{1/2}$$

$$\mathcal{M}(\pi^- p \to \pi^0 n) \propto \tfrac{\sqrt{2}}{3} M_{3/2} - \tfrac{\sqrt{2}}{3} M_{1/2} \, .$$
Experimentally, in 1951, the group led by Fermi in Chicago discovered in the
π⁺p elastic scattering channel an unexpected and dramatic increase of the cross section at a center-of-mass
energy of 1232 MeV (Fig. 5.5). This increase was soon after interpreted by
Keith Brueckner (Fermi was not convinced) as evidence that the pion and the proton
form at that energy a short-lived bound state with isospin I = 3/2. Indeed,
whenever M₃/₂ ≫ M₁/₂,

$$\frac{\sigma(\pi^+ p \to \pi^+ p)}{\sigma(\pi^- p \to \pi^- p) + \sigma(\pi^- p \to \pi^0 n)} \sim \frac{\left( M_{3/2} \right)^2}{\tfrac{1}{9} \left( M_{3/2} \right)^2 + \tfrac{2}{9} \left( M_{3/2} \right)^2} = 3 \, ,$$

in agreement with the measured value of this ratio, as shown in Fig. 5.5.
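The factor of 3 follows from simple arithmetic on the isospin coefficients; a one-line check (our own):

```python
# Cross sections are proportional to the squared isospin factors of the
# amplitudes above, with M_1/2 taken negligible at the resonance.
M32 = 1.0                                  # arbitrary units; M_1/2 -> 0
sig_pp = M32**2                            # pi+ p -> pi+ p : factor 1
sig_mm = (M32 / 3)**2                      # pi- p -> pi- p : factor 1/3
sig_m0 = (2**0.5 * M32 / 3)**2             # pi- p -> pi0 n : factor sqrt(2)/3

ratio = sig_pp / (sig_mm + sig_m0)
print(abs(ratio - 3.0) < 1e-12)            # True: the ratio is exactly 3
```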
The "eightfold way" is the name Murray Gell-Mann, inspired by the Noble Eightfold
Path of Buddhism (Fig. 5.6), gave to the classification of mesons and baryons
proposed independently by him and by Yuval Ne'eman in the early 1960s.
As discussed in Chap. 3, strange particles had been discovered in the late 1940s
in cosmic rays, and were later abundantly produced in early accelerator experiments at
the beginning of the 1950s. These particles were indeed strange considering the
knowledge at that time: they have large masses and are produced in pairs with
large cross sections, but they have long lifetimes compared with what is expected
for nuclear resonances. Their production is ruled by strong interactions, while they
decay weakly.

Fig. 5.7 Fundamental meson and baryon octets: on the left the spin 0, parity −1 states (pseudo-scalar mesons); on the right the spin 1/2 baryons. The I₃ axis is the abscissa, while the Y axis is the ordinate

A new quantum number, strangeness, was assigned in 1953 to
these particles by Nakano and Nishijima and, independently, by Gell-Mann. By
convention, the positive K mesons (kaons) have strangeness +1, while the Λ baryons
have strangeness −1. "Ordinary" (non-strange) particles (proton, neutron, pions, ...)
have strangeness 0.
Strangeness is conserved in the associated production of kaons and lambdas, as
for instance in

$$\pi^+ n \to K^+ \Lambda \; ; \qquad \pi^- p \to K^0 \Lambda \qquad (5.34)$$

while it is violated in their weak decays,

$$\Lambda \to \pi^- p \; ; \qquad K^0 \to \pi^- \pi^+ \, . \qquad (5.35)$$
Strange particles can also be grouped in isospin multiplets, but the analogy with
non-strange particles is not straightforward. Pions are grouped in an isospin triplet,
the π⁺ being the antiparticle of the π⁻ and the π⁰ its own antiparticle. For kaons,
the existence of the strangeness quantum number implies that there are four different
states, organized in two isospin doublets: (K⁺, K⁰) with strangeness +1, and the
corresponding antiparticles (K̄⁰, K⁻) with strangeness −1.
In the middle of each hexagon of Fig. 5.7, there are two particles with I₃ = 0, Y = 0: one with
I = 0 (the η⁰ in the meson octet, the Λ in the baryon octet) and one with I = 1 (the π⁰ and the Σ⁰, respectively).
The masses of the particles in each multiplet are not equal (they would be if the
symmetry were perfect). Indeed, while particles lying on the horizontal lines with
the same isospin have almost equal masses, the masses of the particles in consecutive
horizontal lines differ by 130–150 MeV/c2 .
Besides the two basic octets, a decuplet of spin 3/2 baryons was also present (Fig. 5.8).
There was however an empty spot: a baryon with Q = −1, I = 0, Y = −2,
S = −3 and a mass around 1670 MeV/c² was missing. This particle, which we
now call the Ω⁻, was indeed discovered in the Brookhaven National Laboratory 2-meter
hydrogen bubble chamber in 1964 (Fig. 5.9). A K⁻ meson interacts with a proton
in the liquid hydrogen of the bubble chamber, producing an Ω⁻, a K⁰, and a K⁺, which
then decay into other particles.
Measuring the final state charged particles and applying energy-momentum conservation,
the mass of the Ω⁻ was reconstructed with a value of (1686 ± 12) MeV/c²,
in agreement with the prediction of Gell-Mann and Ne'eman.
Fig. 5.9 Bubble chamber picture of the first Ω⁻. From V.E. Barnes et al., "Observation of a Hyperon
with Strangeness Minus Three," Physical Review Letters 12 (1964) 204
The meson and baryon multiplets were recognized as representations of SU(3), but it
was soon realized that they are not the fundamental ones, and that they could be generated
by the combination of more fundamental representations. In 1964, Gell-Mann and
Zweig proposed as the fundamental representation a triplet (3) of hypothetical spin 1/2
particles named quarks. Its conjugate representation (3̄) is the triplet of antiquarks.
Two of the quarks (named up, u, and down, d) form an isospin doublet, and the
other (named strange, s), which has a nonzero strangeness quantum number, is an isospin singlet.
This classification of quarks into u, d, and s states introduces
a new symmetry called flavor symmetry, and the corresponding SU(3) group is
labeled SU(3)flavor, or shortly SU(3)f, whenever confusion is possible with the
group SU(3)c (SU(3)color) of strong interactions that will be discussed in the next
chapter.
Mesons are quark-antiquark bound states, whereas baryons are composed of three
quarks (anti-baryons of three antiquarks). To reproduce the quantum numbers of the hadrons,
quarks must have fractional electric charge and fractional baryon number.
Their quantum numbers are as follows:

      Q      I     I₃     S    B     Y
u   +2/3   1/2   +1/2    0   1/3   1/3
d   −1/3   1/2   −1/2    0   1/3   1/3
s   −1/3    0      0    −1   1/3  −2/3

The fundamental representations of quarks and antiquarks form triangles in the
(I₃, Y) plane (Fig. 5.10).
The meson multiplets are thus obtained by the tensor product of the 3 and 3̄
SU(3) representations, which gives an octet and a singlet:

$$3 \otimes \bar{3} = 8 \oplus 1 \, . \qquad (5.36)$$

Graphically, the octet can be obtained by centering on each quark vertex the inverted
antiquark triangle (Fig. 5.11).
The baryon multiplets are obtained by combining three quarks:

$$3 \otimes 3 \otimes 3 = 10 \oplus 8 \oplus 8 \oplus 1 \, . \qquad (5.37)$$
Under the exchange of quark flavors, it can be shown that the decuplet state
wave functions are completely symmetric, while the singlet state wave function is
completely antisymmetric. The octet state wave functions have mixed symmetry. The
total wave function of each state is however not restricted to its flavor component:
it must include also a spatial component (corresponding to spin and to the orbital
angular momentum) and a color component, which will be discussed in the next
section.
Color is at the basis of the present theory of strong interactions, QCD (see Sect. 6.4),
and its introduction solves the so-called ++ puzzle. The ++ is formed by three
u quarks with orbital angular momentum l = 0 (it is a ground state) and in the
same spin projection state (the total spin is J = 3/2). Therefore, its flavor, spin, and
orbital wave functions are symmetric, while the Pauli exclusion principle imposes
that the total wave functions of states of identical fermions (as it is the case) should
be antisymmetric.
In color space, quarks are represented by complex three-vectors (the generalization
of the two-dimensional spinors). The number of colors in QCD is N_c = 3, as we
shall see later; the quark colors are usually designated as Red, Blue,
and Green, the antiquarks carrying the corresponding anti-colors.
Quarks interact via the emission or absorption of color field bosons, the gluons.
There are eight gluons, corresponding to the eight generators of the SU(3) group (see
Sect. 5.3.5). Gluons are in turn colored, and the emission of a gluon changes the
color of the quark.
(Anti-)baryons are color singlet states obtained by the combination of three (anti)quarks;
mesons are color singlet states obtained by the combination of a quark and an antiquark.
In summary, all stable hadrons are color singlets, i.e., they are neutral in color. This
is the main reason behind the so-called OZI (Okubo–Zweig–Iizuka) rule, which can
be seen, for example, in the decay of the φ meson: the decay into a pair of kaons is
experimentally favored (86 % branching ratio) with respect to the decay into three pions,
which however has a much larger phase space. The suppression of the 3π mode can
be seen as a consequence of the fact that "decays with disconnected quark lines are
suppressed" (Fig. 5.12). Since the ss̄ state is a color singlet, the initial and final states
in the right plot cannot be connected by a single gluon, the gluon being a colored
object (see Sect. 6.4). Indeed, one can prove that the "disconnected" decay would
require the exchange of at least three gluons.

Fig. 5.12 OZI favored (left) and suppressed (right) φ decay diagrams
In color space, the physical states are antisymmetric singlets (the total color charge
of hadrons is zero, and the strong interactions are short range, confined to the interior
of hadrons and nuclei). The product of the spin and flavor wave
functions in ground states (orbital angular momentum L = 0) must then be symmetric.
The net result is that the ground-state baryons are organized in a symmetric spin 1/2
octet and in a symmetric spin 3/2 decuplet (Fig. 5.13).
Hundreds of excited states have been discovered with nonzero orbital angular
momentum; they can be organized in successive SU(3) multiplets.
Fig. 5.13 Baryon ground states in the quark model: the spin 1/2 octet (left), and the spin 3/2
decuplet (right). The vertical (S) axis corresponds to the Y axis, shifted by 1 (Y = 0 corresponds
to S = −1). By Trassiorf [own work, public domain], via Wikimedia Commons
Fig. 5.14 J/ψ invariant mass plot in e+ e− annihilations (left) and in proton–beryllium interactions
(right). Credits: Nobel foundation
In the case of mesons, these states are labeled using the notation of atomic physics.
In the case of baryons, two independent orbital angular momenta can be defined
(between for instance two of the quarks, said to form a diquark state, and between
this diquark and the third quark), and the picture is more complex.
3 K. Niu and collaborators had already published candidates for charm (no such name was ascribed
to this new quark at that time) in emulsion as early as 1971. These results, taken seriously in Japan,
were not accepted as evidence for the discovery of charm by the majority of the US and European
scientific communities. Once again, cosmic-ray physics was the pathfinder.
Fig. 5.15 Left The 16-plets for the pseudo-scalar (a) and vector (b) mesons made of the u, d, s,
and c quarks as a function of isospin I3 , charm C, and hypercharge Y = B + S + C. The nonets
of light mesons occupy the central planes to which the cc̄ states have been added. Right SU(4)
multiplets of baryons made of u, d, s, and c quarks: (a) The 20-plets including an SU(3) octet; (b)
The 20-plets with an SU(3) decuplet. From K.A. Olive et al. (Particle Data Group), Chin. Phys. C
38 (2014) 090001
The cc̄ states, named charmonium states, are particularly interesting; they correspond to non-relativistic particle–antiparticle bound states. Charmonium states thus have a structure of excited states similar to positronium (an e+ e− bound state), but their energy levels can be explained by a potential in which a linear term is added to the Coulomb-like potential; this term ensures that an infinite energy is needed to separate the quark and the antiquark (no free quarks have been observed up to now, and we found a clever explanation for this fact, as we shall see):
$$V(r) \simeq -\frac{4}{3}\frac{\alpha_s}{r} + k\,r \qquad (5.38)$$
where r is the radius of the bound state, αs is the equivalent for the strong interactions of the fine structure constant α, k is a positive constant, and the coefficient 4/3 has to do with the color structure of strong interactions; we shall discuss it in larger detail in Sect. 6.4.6.
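As a quick numerical illustration (a sketch, not from the text: αs ≈ 0.3 is quoted later in this section, while k ≈ 1 GeV/fm is an assumed ballpark value for the string tension), the potential of Eq. (5.38) can be evaluated in natural units:

```python
# Illustrative sketch of the quark-antiquark potential of Eq. (5.38).
# alpha_s ~ 0.3 and k ~ 1 GeV/fm are rough, assumed values.
HBARC = 0.1973  # GeV*fm, converts 1/r from fm^-1 to GeV

def cornell_potential(r_fm, alpha_s=0.3, k=1.0):
    """V(r) = -(4/3) alpha_s / r + k r, with r in fm and V in GeV."""
    coulomb_like = -(4.0 / 3.0) * alpha_s * HBARC / r_fm  # short-distance term
    confining = k * r_fm                                   # linear confinement
    return coulomb_like + confining

for r in (0.1, 0.5, 1.0, 2.0):
    print(f"r = {r:4.1f} fm  ->  V = {cornell_potential(r):+6.2f} GeV")
```

The output makes the two regimes visible: the Coulomb-like term dominates below a few tenths of a fm, while the linear term grows without bound at large separations.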
The linear term dominates at large distances. Whenever the pair is stretched, a field string is formed, storing potential energy; at some point, (part of) the stored energy can be converted, by the tunnel effect, into mass, and a new quark–antiquark pair can be created, transforming the original meson into two new mesons (Fig. 5.16).
This process is named quark hadronization and plays a central role in high-energy
hadronic interactions.
In positronium spectroscopy, one can obtain the energy levels by solving the Schrödinger equation with a potential V_em = −α/r; the result is
$$E_{p;n} = -\frac{\alpha^2 m_e c^2}{4 n^2}.$$
Note that these levels are equal to the energy levels of the hydrogen atom, divided by two: this is due to the fact that the mass entering the Schrödinger equation is the reduced mass m_r of the system, which in the case of hydrogen is approximately equal to the electron mass (m_r = m_e m_p/(m_e + m_p)), while in the case of positronium it is exactly m_e/2. The spin–orbit interaction splits the energy levels (fine splitting), and a further splitting (hyperfine splitting) is provided by the spin–spin interactions.
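The formula above is easy to check numerically; a minimal sketch (using standard values of α and m_e c², not taken from the text):

```python
# Sketch: positronium energy levels E_n = -alpha^2 m_e c^2 / (4 n^2),
# half the hydrogen levels because the reduced mass is m_e/2.
ALPHA = 1.0 / 137.036      # fine structure constant
ME_C2_EV = 510_998.95      # electron rest energy in eV

def positronium_level(n):
    """Energy of level n in eV (negative: bound state)."""
    return -ALPHA**2 * ME_C2_EV / (4 * n**2)

print(positronium_level(1))  # ground state, about -6.8 eV
print(positronium_level(2))  # about -1.7 eV
```

The ground state comes out at about −6.8 eV, half the −13.6 eV of hydrogen, as the reduced-mass argument requires.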
The left plot of Fig. 5.17 shows the energy levels of positronium. They are indicated by the symbols n^{2S+1}L_s (n is the principal quantum number, S is the total spin, L indicates the orbital angular momentum in the spectroscopic notation (S being the ℓ = 0 state), and s is the spin projection).
Fig. 5.17 Energy levels for (a) the positronium and (b) the charmonium states. From S. Braibant, G. Giacomelli, and M. Spurio, “Particles and fundamental interactions,” Springer 2012
The right plot of Fig. 5.17 shows the energy levels of charmonium; they can be
obtained inserting the potential (5.38) into the Schrödinger equation. One obtains
k ∼ 1 GeV, and αs ∼ 0.3.
The bottom quark, which will be introduced in the next section, has an even larger mass, and it gives rise to a similar spectroscopy of quarkonia.
A fifth quark was discovered a few years later. In 1977, in Fermilab, an experiment
led by Leon Lederman studied the mass spectrum of μ− μ+ pairs produced in the
interaction of a 400 GeV proton beam on copper and platinum targets. A new heavy
and narrow resonance, named the upsilon ϒ, was found, with a mass of around 9.46
GeV/c2 .
The ϒ was interpreted as a bb̄ vector meson, where b stands for a new quark flavor, the bottom or beauty, which has, like the d and the s, an electric charge of −1/3. Several hadrons containing at least a b quark were discovered: a family of bb̄ mesons, called bottomonium and indicated by the letter ϒ, as well as mesons and baryons resulting from the combination of b quarks with lighter quarks: pseudo-scalar mesons like the B⁺ (ub̄), B_c⁺ (cb̄), B⁰ (db̄), and B_s⁰ (sb̄); bottom baryons like the Λ_b⁰ (udb), Ξ_b⁰ (usb), Ξ_b⁻ (dsb), and Ω_b⁻ (ssb). Heavy mesons and baryons with a single heavy quark are very interesting. The light quarks surround the heavy quark
Their masses cover an enormous range, from the tens of MeV/c2 for the u and the d quarks,4 to almost 200 GeV/c2 for the t quark. The flavor symmetry that was the key to organizing the many discovered hadrons is strongly violated. Why? Is there a
fourth, a fifth (...), family to be discovered? Are quarks really elementary? These are
questions we hope to answer during this century.
4 The problem of the determination of the quark masses is not trivial. We can define as a “current” quark mass the mass entering the Lagrangian (or Hamiltonian) representation of a hadron; this comes out to be of the order of some MeV/c2 for u, d quarks, and ∼0.2 GeV/c2 for s quarks. However, the strong field surrounds the quarks in such a way that they acquire a “constituent” (effective) quark mass including the equivalent of the color field; this comes out to be of the order of some 300 MeV/c2 for u, d quarks, and ∼0.5 GeV/c2 for s quarks. Current quark masses are almost the same as constituent quark masses for the heavy quarks.
The electron–proton cross section, in the case where the target proton is treated as a point-like spin-1/2 particle with mass m_p, was calculated by Rosenbluth (Sect. 6.2.8):
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2 \cos^2(\theta/2)}{4E^2 \sin^4(\theta/2)}\,\frac{E'}{E}\left(1 + \frac{Q^2}{2 m_p^2}\tan^2\frac{\theta}{2}\right) \qquad (5.39)$$
where
$$E' = \frac{E}{1 + E(1 - \cos\theta)/m_p}\,. \qquad (5.40)$$
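Equations (5.39)–(5.40) can be put into a short numerical sketch (beam energy and angle below are arbitrary illustrative values, not from the text; the cross section is returned in natural units of GeV⁻²):

```python
import math

# Sketch evaluating Eqs. (5.39)-(5.40) for a point-like proton; E in GeV.
ALPHA = 1.0 / 137.036
M_P = 0.938  # proton mass, GeV

def scattered_energy(E, theta):
    """Eq. (5.40): recoil-corrected energy of the scattered electron."""
    return E / (1.0 + E * (1.0 - math.cos(theta)) / M_P)

def rosenbluth_pointlike(E, theta):
    """Eq. (5.39): dsigma/dOmega for a point-like spin-1/2 proton, GeV^-2."""
    Ep = scattered_energy(E, theta)
    Q2 = 4.0 * E * Ep * math.sin(theta / 2.0) ** 2
    mott = (ALPHA**2 * math.cos(theta / 2.0) ** 2
            / (4.0 * E**2 * math.sin(theta / 2.0) ** 4)) * (Ep / E)
    return mott * (1.0 + Q2 / (2.0 * M_P**2) * math.tan(theta / 2.0) ** 2)

E, theta = 10.0, math.radians(10.0)
print(scattered_energy(E, theta))     # < E: the electron loses energy to recoil
print(rosenbluth_pointlike(E, theta))
```

Note how E′ < E even in elastic scattering: the proton recoil carries away part of the beam energy.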
5 Robert Hofstadter (1915–1990) was an American physicist. He was awarded the 1961 Nobel Prize in Physics “for his pioneering studies of electron scattering in atomic nuclei and for his consequent discoveries concerning the structure of nucleons.” He worked at Princeton before joining Stanford University, where he taught from 1950 to 1985. In 1948, Hofstadter patented the thallium-activated NaI gamma-ray detector, still one of the most used radiation detectors. He coined the name “fermi,” symbol fm, for the scale of 10−15 m. During his last years, Hofstadter became interested in astrophysics and participated in the design of the EGRET gamma-ray space telescope (see Chap. 10).
For a real proton, the cross section is written in terms of two form factors:
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2 \cos^2(\theta/2)}{4E^2 \sin^4(\theta/2)}\,\frac{E'}{E}\left[\frac{G_E^2(Q^2) + \tau\, G_M^2(Q^2)}{1+\tau} + 2\tau\, G_M^2(Q^2)\tan^2\frac{\theta}{2}\right], \quad \tau = \frac{Q^2}{4 m_p^2}, \qquad (5.41)$$
where G_E(Q²) and G_M(Q²) are called, respectively, the electric and the magnetic form factors (if G_E = G_M = 1, the Rosenbluth formula (5.39) is recovered).
At low Q², G_E(Q²) and G_M(Q²) can be interpreted as the Fourier transforms of the electric charge and of the magnetization current density inside the proton. In the limit Q² → 0 (λ → ∞ for the exchanged virtual photon), the electron “sees” the entire proton, and it could be expected that G_E(0) = G_M(0) = 1. This is what the experiment tells for G_E, but it is not the case for G_M: the measured value is G_M(0) = μ_p ≃ 2.79. The proton has an anomalous magnetic moment μ_p, which already reveals that the proton is not a Dirac point-like particle. The same is observed for the neutron, which has μ_n ≃ −1.91.
In fact, at low Q² (Q² < 1–2 GeV²), the experimental data on G_E and G_M are well described by the dipole formula
$$G_E(Q^2) \simeq \frac{G_M(Q^2)}{\mu_p} \simeq \left(\frac{1}{1 + Q^2/0.71\,\mathrm{GeV}^2}\right)^2, \qquad (5.42)$$
suggesting similar spatial distributions for charges and currents. However, recent
data at higher Q 2 using polarized beams showed a much richer picture reflecting a
complex structure of constituents and their interactions.
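The dipole parameterization of Eq. (5.42) is simple enough to sketch directly (a minimal illustration; the 0.71 GeV² scale and μ_p = 2.79 are the values quoted above):

```python
# Sketch of the dipole parameterization, Eq. (5.42): G_E(Q^2) and
# G_M(Q^2)/mu_p fall as (1 + Q^2 / 0.71 GeV^2)^-2 at low Q^2.
MU_P = 2.79  # proton magnetic moment in nuclear magnetons

def g_dipole(Q2):
    """Dipole form factor; Q2 in GeV^2."""
    return 1.0 / (1.0 + Q2 / 0.71) ** 2

def g_magnetic(Q2):
    return MU_P * g_dipole(Q2)

print(g_dipole(0.0))   # 1.0: the photon "sees" the whole proton charge
print(g_dipole(0.71))  # 0.25: already strongly suppressed
```

The rapid fall-off with Q² is the Fourier-transform signature of an extended (not point-like) charge distribution.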
The scattering of an electron on a proton may, if the electron energy is high enough,
show the substructure of the proton. At first order, such scattering (Fig. 5.18) can be
seen as the exchange of a virtual photon (γ ∗ ) with four-momentum:
$$q^2 = -Q^2 = (p_1 - p_3)^2 = (p_4 - p_2)^2,$$
where p₁ and p₃ are, respectively, the four-momenta of the incoming and outgoing electron, p₂ is the target four-momentum, and p₄ is the four-momentum of the final hadronic state, which has an invariant mass M = √(p₄²) (see Fig. 5.18). In the case of elastic scattering, M = m_p.
In the laboratory frame, where the target proton is at rest, the squared center-of-mass energy is
$$s \simeq m_p\,(2E + m_p).$$
It is then useful to construct, using inner products of the above four-vectors, other Lorentz-invariant variables, namely:
• the energy lost ν,
$$\nu = \frac{q \cdot p_2}{m_p}\,,$$
which in the laboratory reference frame is just the energy lost by the electron: ν = E − E′;
• the inelasticity y,
$$y = \frac{q \cdot p_2}{p_1 \cdot p_2}\,,$$
which in the laboratory frame is just the fraction of the energy lost by the electron: y = ν/E (y is thus dimensionless and limited to the interval 0 ≤ y ≤ 1);
• the Bjorken variable x,
$$x = \frac{Q^2}{2\, m_p\, \nu}\,,$$
which, using M² = m_p² + 2m_pν − Q², can also be written as
$$x = \frac{Q^2}{Q^2 + M^2 - m_p^2}$$
(x is dimensionless; it equals 1 for elastic scattering and lies between 0 and 1 otherwise).
These variables are related by
$$Q^2 = x\, y\, (s - m_p^2)\,,$$
and the laboratory-frame scattering angle satisfies
$$\frac{m_p^2\, x^2 y^2}{Q^2} = \frac{E'}{E}\,\sin^2\frac{\theta}{2}\,, \qquad 1 - y - \frac{m_p^2\, x^2 y^2}{Q^2} = \frac{E'}{E}\,\cos^2\frac{\theta}{2}\,.$$
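The invariants above are routinely computed from the measured lab-frame quantities; a minimal sketch (the beam energy, scattered energy, and angle below are arbitrary illustrative inputs):

```python
import math

# Sketch: Lorentz-invariant DIS variables from lab-frame quantities
# (beam energy E, scattered energy Ep, scattering angle theta), GeV units.
M_P = 0.938

def dis_kinematics(E, Ep, theta):
    Q2 = 4.0 * E * Ep * math.sin(theta / 2.0) ** 2  # photon virtuality
    nu = E - Ep                                     # energy lost
    x = Q2 / (2.0 * M_P * nu)                       # Bjorken x
    y = nu / E                                      # inelasticity
    return Q2, nu, x, y

Q2, nu, x, y = dis_kinematics(E=27.5, Ep=10.0, theta=math.radians(12.0))
print(f"Q2 = {Q2:.2f} GeV^2, nu = {nu:.2f} GeV, x = {x:.3f}, y = {y:.3f}")
```

Both x and y come out between 0 and 1, as the kinematic bounds require.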
W₁(Q², ν) describes the interaction between the electron and the proton magnetic moments and can be neglected at low Q².
In the limit of electron–proton elastic scattering (x → 1, ν = Q²/2m_p), these structure functions should reproduce the elastic cross-section formula discussed above:
$$W_1(Q^2, \nu) = \frac{Q^2}{4 m_p^2}\, G_M^2(Q^2)\,\delta\!\left(\nu - \frac{Q^2}{2 m_p}\right); \qquad (5.45)$$
$$W_2(Q^2, \nu) = \frac{G_E^2(Q^2) + \dfrac{Q^2}{4 m_p^2}\, G_M^2(Q^2)}{1 + \dfrac{Q^2}{4 m_p^2}}\;\delta\!\left(\nu - \frac{Q^2}{2 m_p}\right). \qquad (5.46)$$
In the case of the elastic scattering of electrons on a point-like spin-1/2 particle with mass m_p and charge e, the Rosenbluth formula (5.39) should be reproduced:
$$W_1(Q^2, \nu) = \frac{Q^2}{4 m_p^2}\,\delta\!\left(\nu - \frac{Q^2}{2 m_p}\right); \qquad W_2(Q^2, \nu) = \delta\!\left(\nu - \frac{Q^2}{2 m_p}\right). \qquad (5.47)$$
At high Q² and ν, Bjorken predicted that the structure functions depend only on the dimensionless ratio of these two quantities (scaling):
$$W_1(Q^2, \nu) \to \frac{1}{m_p}\, F_1(\omega)\,; \qquad W_2(Q^2, \nu) \to \frac{1}{\nu}\, F_2(\omega) \qquad (5.48)$$
with
$$\omega = \frac{1}{x} = \frac{2\, m_p\, \nu}{Q^2}\,. \qquad (5.49)$$
6 The Nobel Prize in Physics 1990 was assigned to Jerome I. Friedman, Henry W. Kendall and
Richard E. Taylor “for their pioneering investigations concerning deep inelastic scattering of elec-
trons on protons and bound neutrons, which have been of essential importance for the development
of the quark model in particle physics.”
Fig. 5.20 νW2 scaling: measurements at different scattered energies and angles. Source W.B.
Atwood, “Lepton Nucleon Scattering,” Proc. 1979 SLAC Summer Institute (SLAC-R-224)
Fig. 5.21 νW2 scaling for the proton: measurements at fixed x and at different Q 2 . Source J.I.
Friedman and H.W. Kendall, Annual Rev. Nucl. Science 22 (1972) 203
electron into free point-like charged particles in the nucleon: the partons (Fig. 5.22).
This is the so-called quark-parton model (QPM).
The nucleon has point-like constituents but, despite the strong interaction that holds the nucleon together, they appear to be basically free. The partons are, however, confined inside the hadrons: nobody has ever observed a parton outside a hadron.
The Feynman partons were soon identified with the Gell-Mann and Ne’eman quarks; nowadays, however, the term “parton” is used to denote all the nucleon constituents, i.e., the quarks and the gluons.
In the quark-parton model, each parton i carries a fraction Z_i of the nucleon energy and momentum:
$$E_i = Z_i\, E\,, \qquad \vec{p}_i = Z_i\, \vec{p}\,,$$
and thus the parton mass is also a fraction Z_i of the nucleon mass:
$$m_i = Z_i\, M_N\,.$$
These assumptions are exact in a frame where the parton reverses its linear momentum while keeping its energy constant (collision against a “wall”). In such a frame (called the Breit frame, or also the infinite-momentum frame), the energy of the virtual photon is zero (q_{γ*} = (0, 0, 0, −Q)) and the proton moves with a very high momentum toward the photon. However, even if the parton model was built in such an extreme frame, its results are valid whenever Q² ≫ m_N², where m_N is the nucleon mass.
Remembering the previous section, the elastic structure functions for the scattering of an electron on a point-like spin-1/2 particle with electric charge e_i and mass m_i can be written as
$$W_1(Q^2, \nu) = e_i^2\, \frac{Q^2}{4 m_i^2}\,\delta\!\left(\nu - \frac{Q^2}{2 m_i}\right); \qquad W_2(Q^2, \nu) = e_i^2\,\delta\!\left(\nu - \frac{Q^2}{2 m_i}\right).$$
Writing m_i = Z_i M_N and expressing the delta function in terms of x,
$$W_1(Q^2, \nu) = e_i^2\, \frac{x}{2 M_N Z_i}\,\delta(Z_i - x)\,; \qquad W_2(Q^2, \nu) = e_i^2\, \frac{Z_i}{\nu}\,\delta(Z_i - x)\,. \qquad (5.50)$$
The δ function imposes Z_i ≡ x, meaning that, to comply with the elastic kinematics constraints, the exchanged virtual photon has to pick up a parton with precisely a fraction x of the nucleon momentum.
Comparing F₁ and F₂, the so-called Callan–Gross relation is established:
$$F_2(Q^2, \nu) = 2x\, F_1(Q^2, \nu)\,. \qquad (5.52)$$
Summing over the partons, weighted by the density f_i(x) of partons of species i carrying a momentum fraction x,
$$F_1(Q^2, x) = \frac{1}{2}\sum_i e_i^2\, f_i(x)$$
and
$$F_2(Q^2, x) = \sum_i \int_0^1 e_i^2\, Z_i\, f_i(Z_i)\,\delta(Z_i - x)\, dZ_i = x \sum_i e_i^2\, f_i(x)\,.$$
The functions f i (x) are called the parton density functions (PDFs).
The sum of all parton momentum fractions should be 1, if the partons are the only constituents of the nucleon:
$$\sum_i \int_0^1 x\, f_i(x)\, dx = 1.$$
Experimentally, however, the partons (the charged constituents of the nucleon) carry only around 50 % of the total nucleon momentum. The rest of the momentum is carried by neutral particles, the gluons, which are, as will be discussed in the next chapter, the bosons associated with the strong field that holds the nucleon together.
The real picture is more complex: instead of a nucleon composed of three quarks, there is inside the nucleon an infinite number of quarks. In fact, as happens in the case of the electromagnetic field, where electron–positron pairs can be created even in the vacuum (the Casimir effect being a spectacular demonstration), virtual quark–antiquark pairs can be created inside the nucleon. These pairs are formed on time scales allowed by the Heisenberg uncertainty relations. In an artistic picture
(Fig. 5.24), the nucleon is formed by three quarks which determine the nucleon
quantum numbers and carry a large fraction of the nucleon momentum (the valence
quarks) surrounded by clouds of virtual quark/antiquark pairs (the “sea” quarks) and
everything is embedded in a background of strong field bosons (the gluons).
The quarks of the hadrons may have different flavors and thus different charges and masses. The corresponding PDFs are labeled according to the flavor: u(x), d(x), s(x), c(x), b(x), t(x) for the quarks; ū(x), d̄(x), s̄(x), … for the antiquarks.
The form factor F₂ for electron–proton scattering can now be written as a function of the specific quark PDFs:
$$F_2^{ep}(Q^2, x) \simeq x\left[\frac{4}{9}\left(u(x) + \bar{u}(x)\right) + \frac{1}{9}\left(d(x) + \bar{d}(x) + s(x) + \bar{s}(x)\right)\right].$$
The small contributions from heavier quarks and antiquarks can usually be neglected (due to their high masses, they are strongly suppressed). The PDFs can still be divided into valence and sea. To specify whether a given quark PDF refers to valence or sea, a subscript V or S is used. For instance, the total u quark proton PDF is the sum of two PDFs:
$$u(x) = u_V(x) + u_S(x)\,.$$
For the ū antiquark PDF, we should remember that in the proton there are no valence antiquarks, just sea antiquarks. Moreover, as the sea quarks and antiquarks are produced in pairs, the sea quark and antiquark PDFs with the same flavor should be similar. Therefore, the ū component in the proton can be expressed as
$$\bar{u}(x) = \bar{u}_S(x) \simeq u_S(x)\,.$$
There are thus several new functions (the specific quark PDFs) to be determined from the data. A large program of experiments has been carried out, in particular deep inelastic scattering experiments with electron, muon, neutrino, and antineutrino beams. The use of neutrinos and antineutrinos is particularly interesting since, as will be discussed in the next chapter, their interactions with the quarks arise through the weak force; having a well-defined helicity (neutrinos are left-handed, antineutrinos right-handed), they “choose” between quarks and antiquarks. The results of all experiments are globally analyzed, and PDFs for quarks, but also for gluons (g(x)), are obtained, like those shown in Fig. 5.25.
At low x, the PDFs of sea quarks and gluons behave as 1/x, and therefore their number inside the proton becomes extremely large as x → 0. However, the physical observable x f(x) (the momentum carried) is well behaved.
The valence quark PDFs can then be obtained by subtracting the corresponding antiquark PDFs:
$$u_V(x) = u(x) - \bar{u}(x)\,; \qquad d_V(x) = d(x) - \bar{d}(x)\,. \qquad (5.53)$$
Their integration over the full x range is consistent with the quark model. In fact, for the proton,
$$\int_0^1 u_V(x)\, dx \simeq 2\,; \qquad \int_0^1 d_V(x)\, dx \simeq 1\,. \qquad (5.54)$$
The x u_V(x) and x d_V(x) distributions have a maximum around 1/3 as expected, but the sum of the momenta carried by the valence quarks is (as discussed before) smaller than the total momentum:
$$\int_0^1 x\, u_V(x)\, dx \simeq 0.36\,; \qquad \int_0^1 x\, d_V(x)\, dx \simeq 0.18\,. \qquad (5.55)$$
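The counting rules of Eq. (5.54) can be illustrated with a toy valence distribution (a sketch only: the shape x^0.5 (1−x)³ and its exponents are invented for illustration, not a real PDF fit):

```python
# Toy sketch of valence-like PDFs normalized to the counting rules
# of Eq. (5.54); the functional form is a made-up illustration.
def integrate(f, n=100_000):
    """Simple midpoint integration on (0, 1)."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h

shape = lambda x: x**0.5 * (1.0 - x) ** 3   # hypothetical valence shape
norm = integrate(shape)
u_v = lambda x: 2.0 * shape(x) / norm       # integrates to 2 (two u quarks)
d_v = lambda x: 1.0 * shape(x) / norm       # integrates to 1 (one d quark)

print(integrate(u_v))                       # number sum rule: 2
print(integrate(lambda x: x * u_v(x)))      # momentum fraction of u_v (toy)
```

The number integrals reproduce 2 and 1 by construction; the toy momentum integral does not match the measured 0.36 of Eq. (5.55), which is the point: the shape of a real PDF must be fitted to data.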
Fig. 5.25 Parton distribution functions at Q2 = 10 GeV2 (left) and Q2 = 10000 GeV2 (right).
The gluon and sea distributions are scaled down by a factor of 10. The experimental, model, and
parameterization uncertainties are shown. From K.A. Olive et al. (Particle Data Group), Chin. Phys.
C 38 (2014) 090001
Many tests can be done by combining the measured form factors. An interesting
quantity, for instance, is the difference of the form factor functions F2 for electron–
proton and electron–neutron scattering.
Assuming isospin invariance, the proton and neutron PDFs are related by
$$u^n(x) = d^p(x)\,, \qquad d^n(x) = u^p(x)\,, \qquad s^n(x) = s^p(x)\,.$$
Then
$$F_2^{ep}(Q^2, x) \simeq x\left[\frac{4}{9}\, u_v^p(x) + \frac{1}{9}\, d_v^p(x) + \frac{10}{9}\, \bar{u}^p(x) + \frac{2}{9}\, s^p(x)\right]$$
$$F_2^{en}(Q^2, x) \simeq x\left[\frac{1}{9}\, u_v^p(x) + \frac{4}{9}\, d_v^p(x) + \frac{10}{9}\, \bar{u}^p(x) + \frac{2}{9}\, s^p(x)\right]$$
and
$$F_2^{ep}(Q^2, x) - F_2^{en}(Q^2, x) \sim \frac{1}{3}\, x\left[u_V^p(x) - d_V^p(x)\right].$$
Integrating over the full x range, one has
$$\int_0^1 \frac{1}{x}\left[F_2^{ep}(Q^2, x) - F_2^{en}(Q^2, x)\right] dx \simeq \frac{1}{3}\,. \qquad (5.56)$$
This is the so-called Gottfried sum rule. This rule is, however, strongly violated by experimental data (the measured value is 0.235 ± 0.026), showing the limits of the naïve quark-parton model. There is probably an isospin violation in the sea quark distributions.
The Q² dependence of the structure functions (Fig. 5.26) was systematically measured by several experiments, in particular at the HERA electron–proton collider, where a wide Q² and x range was covered (2.7 < Q² < 30000 GeV²; 6 × 10⁻⁵ < x < 0.65). For x > 0.1, the scaling is reasonably satisfied, but for small x the F₂ structure function clearly increases with Q². This behavior is well predicted by the theory of strong interactions and basically reflects the resolution power of the exchanged virtual photon: a higher Q² corresponds to a smaller wavelength (λ ∼ 1/√Q²), and therefore a much larger number of sea quarks with very small x can be seen.
A direct experimental test of the number of colors, N_c, comes from the measurement of the ratio R of the hadronic cross section in e⁺e⁻ annihilations to the μ⁺μ⁻ cross section, defined as
$$R = \frac{\sigma(e^+e^- \to q\bar{q})}{\sigma(e^+e^- \to \mu^+\mu^-)}\,.$$
At low energies (√s < m_Z), these processes are basically electromagnetic and are mediated at first order by one virtual photon (γ*). The cross sections are thus proportional to the squares of the electric charges q_i of the final-state particles. A rule of thumb that can frequently be helpful (note the analogy with the Rutherford cross section) is
$$\sigma(e^+e^- \to f\bar{f}) \simeq \frac{4\pi\alpha^2}{3s}\, q_i^2 \simeq \frac{86.8\ \mathrm{nb}}{s\,[\mathrm{GeV}^2]}\, q_i^2\,. \qquad (5.57)$$
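The rule of thumb of Eq. (5.57) is a one-liner to evaluate (a sketch; the example energies are arbitrary illustrative choices):

```python
# Sketch of the rule of thumb in Eq. (5.57): sigma(e+e- -> mu+mu-)
# ~ 86.8 nb / s[GeV^2] at Born level, far below the Z pole.
def sigma_mumu_nb(sqrt_s_gev):
    """Lowest-order QED muon-pair cross section in nb."""
    return 86.8 / sqrt_s_gev**2

print(sigma_mumu_nb(3.0))   # near the J/psi region, ~9.6 nb
print(sigma_mumu_nb(10.0))  # in the Upsilon region, ~0.87 nb
```

The 1/s fall-off is the characteristic behavior of a point-like electromagnetic annihilation.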
When considering more than one flavor (for example in the case of hadronic final states), a sum over all the possible final states has to be performed. Thus
$$R = N_c \sum_i q_i^2\,, \qquad (5.58)$$
where the sum runs over all the quark flavors with mass m_i < √s/2, and the factor N_c accounts for the sum over colors. For √s ≲ 3 GeV, just the u, d, and s quarks can contribute, and then
$$R = \frac{2}{3}\, N_c\,.$$
For 3 GeV ≲ √s ≲ 5 GeV, there is also the contribution of the c quark, and
$$R = \frac{10}{9}\, N_c\,.$$
Finally, for √s ≳ 5 GeV, the b quark also contributes:
$$R = \frac{11}{9}\, N_c\,.$$
The mass of the top quark is too high for tt̄ pair production to be accessible at past and present e⁺e⁻ colliders.
The measurements for √s ≲ 40 GeV, summarized in Fig. 5.27, show, apart from regions close to the resonances, a fair agreement between the data and this naïve prediction, provided N_c = 3. Above √s ≃ 40 GeV, the annihilation via the exchange of a Z boson starts to be non-negligible and the interference between the two channels is visible, the calculation in Eq. 5.58 being no longer valid (see Chap. 7).
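The step pattern of R below the Z pole follows directly from Eq. (5.58); a minimal sketch (quark masses are rough PDG-like values, and the threshold √s > 2m_q is a crude approximation for illustration):

```python
# Sketch: naive quark-parton prediction for R below the Z pole,
# summing squared quark charges over flavors open at a given sqrt(s).
QUARKS = {"u": (2 / 3, 0.002), "d": (-1 / 3, 0.005), "s": (-1 / 3, 0.095),
          "c": (2 / 3, 1.27), "b": (-1 / 3, 4.18)}  # (charge, ~mass in GeV)

def r_ratio(sqrt_s, n_colors=3):
    """R = N_c * sum of q_i^2 over kinematically open flavors."""
    return n_colors * sum(q**2 for q, m in QUARKS.values() if sqrt_s > 2 * m)

print(r_ratio(2.5))   # u, d, s open  -> 2
print(r_ratio(4.0))   # + c           -> 10/3
print(r_ratio(12.0))  # + b           -> 11/3
```

The three printed values reproduce the plateaus 2, 10/3, and 11/3 quoted above for N_c = 3, which is what the data of Fig. 5.27 confirm away from the resonances.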
Fig. 5.27 Measurements of R(√s). From [F5.2] in the “further readings”; the data are taken from K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
5.6 Leptons
Electrons, muons, and the experimental proof of the existence of neutrinos were
already discussed in Chaps. 2 and 3. Neutrino oscillations and neutrino masses will
be discussed in Chap. 9.
The τ (tau) lepton, with its neutrino, was the last to be discovered.
The third charged lepton was discovered in a series of experiments led by Martin Perl, using the Mark I detector at the SPEAR e⁺e⁻ storage ring in the
years 1974–1976. The first evidence was the observation of events with only two
charged particles in the final state: an electron or a positron and an opposite sign
muon, with missing energy and momentum (Fig. 5.28). The conservation of energy
and momentum indicated the existence in such events of at least two undetected
particles (neutrinos).
There was no conventional explanation for those events: one had to assume the
existence of a new heavy lepton, the τ . In this case, a τ + τ − pair could have been
produced,
e+ e− → τ + τ −
followed by the weak decay of each τ into its (anti)neutrino plus a W boson (Fig. 5.29); the W boson, as will be explained in the next chapter, can then decay into one of the known charged leptons (l = e, μ) plus the corresponding neutrino or antineutrino:
$$\tau^- \to \nu_\tau W^- \to \nu_\tau\, l^-\, \bar{\nu}_l\,; \qquad \tau^+ \to \bar{\nu}_\tau W^+ \to \bar{\nu}_\tau\, l^+\, \nu_l\,.$$
A confirmation came two years later with the observation of the τ hadronic decay
modes:
$$\tau^- \to \nu_\tau + W^- \to \nu_\tau + \mathrm{hadrons}\,.$$
Muons decay into electrons, and the electron energy spectrum is continuous. Once again, this excludes a two-body decay, and thus at least two neutral “invisible” particles should be present in the final state:
$$\mu^- \to e^-\, \nu_1\, \bar{\nu}_2\,.$$
Is ν̄₂ the antiparticle of ν₁? Lee and Yang were convinced, in 1960, that it should not be so (otherwise the Feynman diagram represented in Fig. 5.30 would be possible, and then the branching fraction for μ⁻ → e⁻γ would be large). At least two different species of neutrinos should exist.
Around the same time, the possibility to produce a neutrino beam from the decay of pions created in the collision of GeV protons on a target was intensively discussed in the cafeteria of Columbia University. In 1962, a kind of neutrino beam was finally available at BNL (Brookhaven National Laboratory): the idea was to put an internal target in a long straight section of the proton accelerator and to drive the proton beam onto it with a magnet; the pions coming from the proton interactions then decayed into muons and neutrinos. An experiment led by Leon Lederman, Melvin Schwartz, and Jack Steinberger was set up to observe neutrino reactions within a 10-ton spark chamber. Hundreds of millions of neutrinos were produced, mostly accompanied by a muon (BR(π → μν) ≫ BR(π → eν), as will be discussed in the next chapter). Forty neutrino interactions in the detector were clearly identified; in six of them, the final state was an electron, and in 34, the final state was a muon. The ν_μ and ν_e are thus different particles; otherwise, the same number of events with one electron and with one muon in the final state should have been observed.
Only in the last year of the 20th century was direct evidence for the third neutrino established. The DONUT experiment at Fermilab found four events, out of six million recorded, in which a τ lepton was clearly identified. In these events, the reconstruction of the charged tracks in the iron/emulsion target showed a characteristic kink, indicating at least one invisible particle produced in the decay of a heavy particle into a muon (Fig. 5.31).
Fig. 5.31 Tau-neutrino event in DONUT. A tau-neutrino produces several charged particles. Among
them a tau particle, which decays to another charged particle with missing energy (at least one neu-
trino). From K. Kodama et al., DONUT Collaboration, “Observation of tau neutrino interactions,”
Phys. Lett. B 504 (2001) 218
5.7 The Particle Data Group and the Particle Data Book
How can one manage all this information about so many particles? How can one remember all these names? The explosion of particle discoveries has been so great that Fermi said, “If I could remember the names of all these particles, I’d be a botanist.”
Fortunately, a book, called the Review of Particle Physics (also known as the Par-
ticle Data Book), can help us. It is edited by the Particle Data Group (in short PDG),
an international collaboration of about 150 particle physicists, helped by some 500
consultants, that compiles and organizes published results related to the properties
of particles and fundamental interactions, reviewing in addition theoretical advance-
ments relevant for experimental physics. The PDG publishes the Review of Particle Physics and its pocket version, the Particle Physics Booklet, which are printed biennially on paper and updated annually on the Web. The PDG also maintains the standard numbering scheme for particles in event generators (Monte Carlo simulations).
The Review of Particle Physics is a voluminous reference work (more than one thousand pages); it is currently the most referenced article in high energy physics, being cited more than 2,000 times per year in the scientific literature. It is divided into three sections:
• Particle physics summary tables—Brief tables with the properties of particles.
• Reviews, tables, and plots—Review of fundamental concepts from mathematics
and statistics, tables related to the chemical and physical properties of materials,
review of current status in the fields of Standard Model, Cosmology, and experi-
mental methods of particle physics, tables of fundamental physical and astronom-
ical constants, summaries of relevant theoretical subjects.
• Particle listings—Extended version of the Particle Physics Summary Tables, with
reference to the experimental measurements.
The Particle Physics Booklet (about 300 pages) is a condensed version of the
Review, including the summary tables, and a shortened section of reviews, tables,
and plots.
The publication of the Review of Particle Physics in its present form started in
1970; formally, it is a journal publication, appearing in different journals depending
on the year.
The “particle listings” (from which the “summary tables” are extracted) contain all
relevant data known to the PDG team that are published in journals. From these data,
“world averages” are calculated.
Sometimes a measurement might be excluded from a world average. Among the
reasons of exclusion are the following (as reported by the PDG itself, K.A. Olive et
al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001):
• it is superseded by or included in later results;
• no error is given;
• it involves questionable assumptions;
• it has a poor signal-to-noise ratio, low statistical significance, or is otherwise of
poorer quality than other data available;
• it is clearly inconsistent with other results that appear to be more reliable.
Several kinds of “world average” are provided:
• OUR AVERAGE—From a weighted average of selected data.
• OUR FIT—From a constrained or overdetermined multiparameter fit of data.
• OUR EVALUATION—Not from a direct measurement, but evaluated from mea-
surements of related quantities.
• OUR ESTIMATE—Based on the observed range of the data, not from a formal
statistical procedure.
The average is computed as an error-weighted mean of the N selected measurements x_i ± δx_i:
$$\bar{x} \pm \delta\bar{x} = \frac{\sum_i w_i\, x_i}{\sum_i w_i} \pm \left(\sum_i w_i\right)^{-1/2} \quad \text{with} \quad w_i = \frac{1}{(\delta x_i)^2}\,. \qquad (5.59)$$
The consistency of the input data is then judged from the χ² of the average:
• If χ2 /(N − 1) is less than or equal to 1, and there are no known problems with the
data, the results are accepted.
• If χ²/(N − 1) is very large, the PDG may
– not use the average at all, or
– quote the calculated average, making an educated (conservative) guess of the error.
• If χ²/(N − 1) is greater than 1, but not by much, the PDG still averages the data, but then also increases the error by a scale factor S = √(χ²/(N − 1)). This scaling procedure does not affect central values.
If the number of experiments is at least three, and χ2 /(N − 1) is greater than 1.25,
an ideogram of the data is shown in the Particle Listings. Figure 5.32 is an example.
Each measurement is shown as a Gaussian with a central value xi , error δxi , and area
proportional to 1/δxi .
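The averaging procedure described above fits in a few lines of code (a sketch of the standard weighted average with the scale factor S = √(χ²/(N − 1)); the input "measurements" are invented for illustration):

```python
import math

# Sketch of the PDG-style averaging procedure: error-weighted mean,
# with the error scaled by S = sqrt(chi2/(N-1)) when chi2/(N-1) > 1.
def pdg_average(measurements):
    """measurements: list of (value, error) pairs; returns (mean, error, S)."""
    weights = [1.0 / err**2 for _, err in measurements]
    mean = sum(w * x for w, (x, _) in zip(weights, measurements)) / sum(weights)
    err = sum(weights) ** -0.5
    chi2 = sum(w * (x - mean) ** 2 for w, (x, _) in zip(weights, measurements))
    scale = max(1.0, math.sqrt(chi2 / (len(measurements) - 1)))
    return mean, err * scale, scale

# made-up, slightly inconsistent measurements -> scale factor S > 1
mean, err, S = pdg_average([(1.0, 0.1), (1.2, 0.1), (1.6, 0.2)])
print(f"average = {mean:.3f} +/- {err:.3f} (scale factor S = {S:.2f})")
```

Note that scaling inflates only the error, not the central value, exactly as stated above.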
Further Reading
Exercises
4. Cross sections and isospin. Determine the ratios of the cross sections of the following interactions at the Δ⁺⁺ resonance:
π⁻p → K⁰Σ⁰ ; π⁻p → K⁺Σ⁻ ; π⁺p → K⁺Σ⁺ .
5. Decay branching ratios and isospin. Consider the decays of the Σ*⁰ into Σ⁺π⁻, Σ⁰π⁰, and Σ⁻π⁺. Determine the ratios between the decay rates in these decay channels.
6. Quantum numbers. Verify if the following reactions/decays are possible and if
not say why:
(a) pp → π + π − π 0 ,
(b) pp → ppn,
(c) pp → ppp p̄,
(d) p p̄ → γ,
(e) π⁻p → K⁰Λ,
(f) n → pe⁻ν,
(g) Λ → π⁻p,
(h) e⁻ → ν_e γ.
7. Ω⁻ mass. Verify the relations between the masses of all the particles lying in the fundamental baryon decuplet except the Ω⁻, and predict the mass of the latter. Compare your prediction with the measured value.
8. Experimental resolution in deep inelastic scattering. Consider an e⁻p deep inelastic scattering experiment where the electron scattering angle is ∼6°. Estimate the experimental resolution in the measurement of the energy of the scattered electron that is needed to distinguish the elastic channel (e⁻p → e⁻p) from the first inelastic channel (e⁻p → e⁻pπ⁰).
9. e− p deep inelastic scattering kinematics. Consider the e− p deep inelastic scat-
tering and deduce the following formula:
$$Q^2 = 4 E E' \sin^2(\theta/2)$$
$$Q^2 = 2\, m_p\, \nu\, x$$
$$Q^2 = x\, y\, (s - m_p^2)\,.$$
10. Gottfried sum rule. Deduce, in the framework of the quark-parton model, the Gottfried sum rule
$$\int_0^1 \frac{1}{x}\left[F_2^{ep}(x) - F_2^{en}(x)\right] dx = \frac{1}{3}$$
and comment on the fact that the value measured in e⁻p and e⁻d deep inelastic scattering experiments is approximately 1/4.
Chapter 6
Interactions and Field Theories
The structure and the dynamics of the Universe are determined by the so-called
fundamental interactions: gravitational, electromagnetic, weak, and strong. In their
absence, the Universe would be an immense space filled with ideal gases of structure-
less particles. Interactions between “matter” particles (fermions) are in relativistic
quantum physics associated to the exchange of “wave” particles (bosons)—note that
bosons can also interact among themselves. Such a picture can be visualized (and
observables related to the process can be computed) using the schematic diagrams
invented in 1948 by Richard Feynman: the Feynman diagrams (Fig. 6.1), that we
have shortly presented in Chap. 2.
Each diagram corresponds to a specific term of a perturbative expansion of the
scattering amplitude. It is a symbolic graph, where initial and final state particles are
represented by incoming and outgoing lines (which are not space–time trajectories)
and the internal lines represent the exchange of virtual particles (the term “virtual” meaning that their energy and momentum do not necessarily have to be related through the relativistic equation E² = p² + M²; if they are not, they are said to be off the mass shell). Solid straight lines are associated with fermions, while wavy, curly, or broken lines are associated with bosons. Arrows indicate the time flow of the external particles and antiparticles (in the plot, time usually runs from left to right, but having it run from bottom to top is also a possible convention). A particle (antiparticle) moving backwards in time is equivalent to its antiparticle (particle) moving forwards in time.
At the lowest order, the two initial-state particles exchange only one particle mediating the interaction (for instance a photon). Associated with each vertex (a point where at least three lines meet) is a number, the coupling constant (in the case of
© Springer-Verlag Italia 2015
A. De Angelis and M.J.M. Pimenta, Introduction to Particle and Astroparticle Physics, Undergraduate Lecture Notes in Physics, DOI 10.1007/978-88-470-2688-9_6
the electromagnetic interaction, z√α = ze/√(4π) for a particle with electric charge z), which indicates the probability of the emission/absorption of the field particle and thus the strength of the interaction. Energy–momentum, as well as all the quantum numbers, is conserved at each vertex.
At higher orders, more than one field particle can be exchanged (second diagram
from the left in the Figure) and there is an infinite number of possibilities (terms in
the perturbative expansion) for which amplitudes and probabilities are proportional
to increasing powers of the coupling constants. Although the scattering amplitude
is proportional to the square of the sum of all the terms, if the coupling constants
are small enough, just the first diagrams will be relevant. However, even low-order
diagrams can give an infinite contribution. Indeed in the second diagram, there is a
loop of internal particles and an integration over the exchanged energy-momentum
has to be carried out. Since this integration is performed in a virtual space, it is not
bound and therefore it might, in principle, diverge. Curing divergent integrals (or, in jargon, “canceling infinities”) became the central problem of quantum field theory in the middle of the twentieth century (classically, the electrostatic self-energy of a point charged particle is also infinite); it was successfully solved in the case of the electromagnetic interaction, as will be briefly discussed in Sect. 6.2.9, within the renormalization scheme.
The quantum equations for “matter” (Schrödinger, Klein–Gordon, Dirac equations) must be modified to incorporate explicitly the couplings with the interaction fields. The introduction of these new terms makes the equations invariant under a combined local (space–time dependent) transformation of the matter and of the interaction fields (the fermion wave phase and the four-potential degree of freedom in the case of the electromagnetic interaction). Conversely, requiring that the “matter” quantum equations be invariant under local transformations of some internal symmetry groups implies the existence of well-defined interaction fields, the
gauge fields. These ideas, developed in particular by Feynman and by Yang and Mills
in the 1950s, were applied to the electromagnetic, weak and strong interactions field
theories; they provided the framework for the unification of the electromagnetic and
weak interactions (electroweak interactions) which has been extensively tested with
an impressive success (see next Chapter) and may lead to further unification involv-
ing strong interaction (GUTs—Grand Unified Theories) and even gravity (ToE—
Theories of Everything). One could think that we are close to the “end of physics.” However, the experimental discovery that most of the energy of the Universe cannot be explained by the known physical objects quickly dismissed such claims: Dark Matter and Dark Energy represent around 95 % of the total energy budget of the Universe, and they are not explained by present theories.
6.1 The Lagrangian Representation of a Dynamical System
In the quantum world, we usually find it convenient to use the Lagrangian or the
Hamiltonian representation of a system to compute the equations of motion. The
Lagrangian L of a system of particles is defined as
L = K −V (6.1)
where K is the total kinetic energy of the system, and V its total potential energy.
Any system with n degrees of freedom is fully described by n generalized coordinates q_j and n generalized velocities q̇_j. The equations of motion of the system are the so-called Euler–Lagrange equations

d/dt (∂L/∂q̇_j) = ∂L/∂q_j (6.2)
where the index j = 1, 2, . . . , n runs over the degrees of freedom. For example, in
the case of a single particle in a conservative field in one dimension, x, one can write
L = (1/2) m v² − V(x) (6.3)
and applying the Euler–Lagrange equations
d(mv)/dt = −dV/dx = F ⟹ F = ma
(Newton’s law).
Although the mathematics required for Lagrange’s equations might seem more complicated than Newton’s law, Lagrange’s equations often make the solution easier,
since the generalized coordinates can be conveniently chosen to exploit symmetries
in the system, and constraint forces are incorporated in the geometry of the problem.
The Lagrangian is of course not unique: you can multiply it by a constant factor,
for example, or add a constant, and the equations will not change. You can also add
the four-divergence of an arbitrary vector function: it will cancel when you apply the
Euler–Lagrange equations, and thus the dynamical equations are not affected.
The so-called Hamiltonian representation uses instead the Hamiltonian function
H ( p j , q j , t):
H = K +V. (6.4)
We have already briefly discussed this function in the previous chapter; it represents the total energy in terms of the generalized coordinates q_j and of the generalized momenta

p_j = ∂L/∂q̇_j . (6.5)

The equations of motion are Hamilton’s equations:

dp_j/dt = −∂H/∂q_j (6.6)

dq_j/dt = ∂H/∂p_j . (6.7)
The two representations, Lagrangian and Hamiltonian, are equivalent. For example, in the case of a single particle in a conservative field in one dimension,

H = p²/(2m) + V (6.8)

and Hamilton’s equations become

dp/dt = −dV/dx = F ; dx/dt = p/m . (6.9)
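The equivalence of the two representations can also be checked numerically by integrating Hamilton’s equations (6.6)–(6.7); a minimal sketch, assuming a harmonic potential V(x) = ½kx² as an illustrative example (a symplectic Euler step keeps the energy H = p²/2m + V approximately conserved):

```python
# Integrate Hamilton's equations dp/dt = -dV/dx, dx/dt = p/m
# for a 1D harmonic oscillator V(x) = 0.5*k*x**2 (symplectic Euler).
m, k = 1.0, 1.0          # mass and spring constant (arbitrary units)
x, p = 1.0, 0.0          # initial conditions
dt, steps = 1e-3, 10_000

def energy(x, p):
    """Total energy H = p^2/2m + V(x)."""
    return p * p / (2 * m) + 0.5 * k * x * x

E0 = energy(x, p)
for _ in range(steps):
    p -= k * x * dt       # dp/dt = -dV/dx = -k x
    x += p / m * dt       # dx/dt = p/m
print(x, p, energy(x, p))
```

The trajectory reproduces the oscillation that Newton’s law F = ma predicts, with the energy conserved to a small, bounded error.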
We shall use more frequently Lagrangian mechanics. Let us now see how
Lagrangian mechanics simplifies the description of a complex system.
is conserved. For example, invariance under space translations implies that linear momentum is conserved. By a similar approach, one can see that invariance under rotations implies that angular momentum is conserved.
j^μ ≡ ∂L/∂(∂_μφ) (6.14)

∂_μ j^μ = 0 ⇒ ∂j⁰/∂t + ∇ · j⃗ = 0 , (6.15)

where j⁰ is the charge density and j⃗ is the current density. The total (conserved) charge is

Q = ∫_{all space} d³x j⁰ . (6.16)
The dimension of the Lagrangian density is [energy4 ] since the action (6.12) is
dimensionless; the scalar field φ has thus the dimension of an energy.
6.2 Quantum Electrodynamics (QED)

Electromagnetic effects were known since antiquity, but only during the nineteenth century was the (classical) theory of electromagnetic interactions firmly established. In the twentieth century, the marriage between electrodynamics and quantum mechanics (Maxwell’s equations were already relativistic even before the formulation of Einstein’s relativity) gave birth to the theory of Quantum Electrodynamics (QED), which is the most accurate theory ever formulated. QED describes the interactions between electrically charged particles mediated by a quantized electromagnetic field.
6.2.1 Electrodynamics
Maxwell’s equations read:

∇ · E⃗ = ρ
∇ · B⃗ = 0
∇ × E⃗ = −∂B⃗/∂t
∇ × B⃗ = j⃗ + ∂E⃗/∂t .
The fields can be expressed in terms of a scalar potential φ and a vector potential A⃗:

E⃗ = −∇φ − ∂A⃗/∂t
B⃗ = ∇ × A⃗ .

With these definitions the homogeneous Maxwell equations are automatically satisfied; for instance,

∇ × E⃗ = ∇ × (−∇φ − ∂A⃗/∂t) = −∂(∇ × A⃗)/∂t = −∂B⃗/∂t .
However, the potential fields (φ, A⃗) are not totally determined, having a local degree of freedom. In fact, if χ(t, x⃗) is a scalar function of the time and space coordinates, then the potentials (φ′, A⃗′) defined as

φ′ = φ − ∂χ/∂t
A⃗′ = A⃗ + ∇χ

give origin to the same E⃗ and B⃗ fields. These transformations are designated as gauge transformations and generalize the freedom that exists in electrostatics in the definition of the space points where the electric potential is zero (the electrostatic field is invariant under a global transformation of the electrostatic potential, but the electromagnetic field is invariant under a joint local transformation of the scalar and vector potentials).
The arbitrariness of these transformations can be used to write the Maxwell equa-
tions in a simpler way. What we are going to do is to use our choice to fix things so
that the equations for A and for φ are separated but have the same form. We can do
this by taking (this is called the Lorenz gauge):
∇ · A⃗ = −∂φ/∂t . (6.18)
Thus
∂²φ/∂t² − ∇²φ = ρ

∂²A⃗/∂t² − ∇²A⃗ = j⃗ .
Introducing the d’Alembertian operator □ ≡ ∂²/∂t² − ∇² and the four-potential A^μ = (φ, A⃗) (notice that the Lorenz gauge ∂_μA^μ = 0 is covariant), the two equations are summarized in

□A^μ = j^μ . (6.20)

In free space (j^μ = 0), this reduces to

□A^μ = 0 . (6.21)
This equation is similar to the Klein–Gordon equation for one particle with m = 0 (see Sects. 5.7.2 and 6.2.5) but with spin 1. A^μ is identified with the wave function of a free photon, and the solution of the above equation is, up to a normalization factor,

A^μ = ε^μ(q) e^{−iq·x} (6.22)

where q is the four-momentum of the photon and ε^μ its polarization four-vector.
The four components of ε^μ are not independent. The Lorenz condition imposes one constraint, reducing the number of independent components to three. However, even after imposing the Lorenz condition, there is still the possibility, provided ∂²χ = 0, of a further gauge transformation

A^μ → A^μ + ∂^μχ . (6.23)
This choice is known as the Coulomb gauge, and it makes clear that there are just two degrees of freedom left for the polarization, as is the case for massless spin-1 particles (m_s = ±1).
If the photon had a nonzero mass μ_γ, Maxwell’s equations would be replaced by the Proca¹ equations:

∇ · E⃗ = ρ − μ_γ² φ
∇ · B⃗ = 0
∇ × E⃗ = −∂B⃗/∂t
∇ × B⃗ = j⃗ + ∂E⃗/∂t − μ_γ² A⃗ .
In this scenario, the electrostatic field would show a Yukawa-type exponential attenuation, e^{−μ_γ r}. Experimental tests of the validity of the Coulomb inverse-square law have been performed for many years using different techniques, leading to stringent limits: μ_γ < 10⁻¹⁸ eV ≃ 10⁻⁵¹ g. Stronger limits (μ_γ < 10⁻²⁶ eV) are reported from the analyses of astronomical data, but are model dependent.
Classically, the coupling between a particle with charge e and the electromagnetic
field is given by the Lorentz force:
¹ Alexandru Proca (1897–1955) was a Romanian physicist who studied and worked in France (he was a student of Louis de Broglie). He developed the vector meson theory of nuclear forces and worked on relativistic quantum field equations.
F⃗ = e(E⃗ + v⃗ × B⃗) = e(−∇φ − ∂A⃗/∂t + v⃗ × (∇ × A⃗)) .

Using v⃗ × (∇ × A⃗) = ∇(v⃗ · A⃗) − (v⃗ · ∇)A⃗ and dA⃗/dt = ∂A⃗/∂t + (v⃗ · ∇)A⃗,

F⃗ = e(−∇(φ − v⃗ · A⃗) − dA⃗/dt)

⟹ d/dt (p⃗ + eA⃗) = −e∇(φ − v⃗ · A⃗) .
The same dynamics follows from the Lagrangian

L = (1/2) m ẋ⃗ ² − e(φ − ẋ⃗ · A⃗)

through the Euler–Lagrange equations

d/dt (∂L/∂ẋ_i) = ∂L/∂x_i .

The conjugate momentum is

p_i = ∂L/∂ẋ_i ⟹ p⃗ = m ẋ⃗ + eA⃗

and the corresponding Hamiltonian is

H = (1/2m) (p⃗ − eA⃗)² + eφ .
Then the free-particle equation

E = p⃗²/(2m)

is transformed, in the case of the coupling with the electromagnetic field, into

E − eφ = (1/2m) (p⃗ − eA⃗)² . (6.25)
This is equivalent to the following replacements for the free-particle energy and momentum:

E → E − eφ ; p⃗ → p⃗ − eA⃗ (6.26)

or, in terms of the four-momentum,

p^μ → p^μ − eA^μ , (6.27)

which, at the level of the quantum operators (p^μ → i∂^μ), corresponds to

∂^μ → D^μ ≡ ∂^μ + ieA^μ . (6.28)
The free-particle Schrödinger equation

i ∂ψ/∂t = (1/2m) (−i∇)² ψ (6.29)

becomes under such a replacement

(i ∂/∂t − eφ) ψ = (1/2m) (−i∇ − eA⃗)² ψ . (6.30)
The Schrödinger equation couples directly to the scalar and vector potentials and not to the force, and quantum effects not foreseen in classical physics do exist. One of them is the well-known Bohm–Aharonov effect, predicted in 1959 by David Bohm and his student Yakir Aharonov.² Whenever a particle is confined in a region where the electric and the magnetic fields are zero but the potential four-vector is not, its wave function acquires a phase change.
This is the case of particles crossing a region outside an infinite thin solenoid
(Fig. 6.2, left).
2 David Bohm (1917–1992) was an American scientist who contributed innovative and unorthodox
ideas to quantum theory, neuropsychology, and the philosophy of mind. Yakir Aharonov (1932)
is an Israeli physicist specialized in quantum physics, interested in quantum field theories and
interpretations of quantum mechanics.
Fig. 6.2 Left Vector potential in the region outside an infinite solenoid. Right Double-slit experiment
demonstrating the Bohm–Aharonov effect. From D. Griffiths, “Introduction to quantum mechanics,”
second edition, Pearson 2004
In this region the magnetic field B⃗ is zero but the vector potential A⃗ is not: since ∇ × A⃗ = B⃗, Stokes’ theorem gives

∮ A⃗ · dl⃗ = ∫_S B⃗ · dS⃗ .

The line integral of the vector potential A⃗ around a closed loop is equal to the magnetic flux through the area enclosed by the loop. As B⃗ inside the solenoid is not zero, the flux is also not zero and therefore A⃗ is not null.
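The magnitude of the effect can be sketched numerically: in SI units the Bohm–Aharonov phase difference between the two paths is Δϕ = qΦ_B/ℏ, with Φ_B the enclosed flux; the solenoid field and radius used below are illustrative assumptions, not values from the text:

```python
import math

# Bohm-Aharonov phase difference: delta_phi = q * Phi_B / hbar (SI units).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
Q_E = 1.602176634e-19    # elementary charge, C

def ab_phase(b_tesla: float, radius_m: float) -> float:
    """Phase shift for charge e encircling a solenoid of given field and radius."""
    flux = b_tesla * math.pi * radius_m**2   # enclosed magnetic flux, Wb
    return Q_E * flux / HBAR

# Even a micrometre-sized solenoid with a modest field gives a large phase:
print(ab_phase(0.01, 1e-6))  # radians
```

This is why the interference fringes shift visibly when the solenoid current is switched on, even though the electrons never enter the region where B⃗ is nonzero.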
This effect was experimentally verified by observing shifts in an interference pattern according to whether or not the current in a microscopic solenoid placed between the two slits is turned on (Fig. 6.2, right).
We have seen that physical observables connected to a wave function are invariant under a global change of the phase of the wave function itself,

Ψ(x⃗, t) → Ψ(x⃗, t) e^{iqα} . (6.31)

This is no longer guaranteed when the phase change is local, i.e., a function of the space–time point:

Ψ(x⃗, t) → Ψ(x⃗, t) e^{iqα(x⃗,t)} . (6.32)
On the other hand, the electromagnetic field is, as it was discussed in Sect. 6.2.1,
invariant under a combined local transformation of the scalar and vector potential:
φ → φ − ∂χ/∂t (6.33)

A⃗ → A⃗ + ∇χ (6.34)
where χ (t, x) is a scalar function of the time and space coordinates.
Remarkably the Schrödinger equation modified using the minimal coupling pre-
scription is invariant under a joint local transformation both of the phase of the wave
function and of the electromagnetic four-potential:
Ψ(x⃗, t) → Ψ(x⃗, t) e^{ieα(x⃗)} (6.35)

A^μ → A^μ − ∂^μα(x⃗) . (6.36)
Applying the minimal coupling prescription to the relativistic wave equations (the Klein–Gordon and Dirac equations), these equations also become invariant under local gauge transformations, as we shall verify later.
Conversely, imposing the invariance under a local gauge transformation of the
free-particle wave equations implies the introduction of a gauge field.
The gauge transformation of the wave functions can be written in a more general
form as
Ψ(x⃗, t) → Ψ(x⃗, t) exp(iα(x⃗) Â) (6.37)
where α ( x ) is a real function of the space coordinates and  a unitary operator (see
Sect. 5.3.3).
In the case of QED, Hermann Weyl, Vladimir Fock, and Fritz London found in the late 1920s that the invariance of a Lagrangian including fermion and field terms under transformations associated with the U(1) group, corresponding to local rotations by α(x⃗) of the wave function phase, requires (and provides) the interaction term with the electromagnetic field, whose quantum is the photon.
The generalization of this symmetry to non-Abelian groups was introduced in
1954 by Chen Yang and Robert Mills.3 Indeed we shall see that
• The weak interaction is modeled by a “weak isospin” symmetry linking “weak
isospin up” particles (identified for example with the u-type quarks and with the
3 Chen Yang (1922) is a Chinese-born American physicist who works on statistical mechanics and
particle physics. He shared the 1957 Nobel prize in physics with T.D. Lee for their work on parity
non-conservation in weak interactions. While working with the US physicist Robert Mills (1927–
1999) at Brookhaven National Laboratory, in 1954 he proposed a tensor equation for what are now
called Yang–Mills fields.
neutrinos) and “weak isospin down” particles (identified for example with the
d-type quarks and with the leptons). We have seen that SU(2) is the minimal
representation for such a symmetry. If  is chosen to be one of the generators of
the SU(2) group then the associated gauge transformation corresponds to a local
rotation in a spinor space. The gauge fields needed to ensure the invariance of the wave equations under such transformations are the weak fields, whose quanta are the W± and Z (see Sect. 6.3).
• The strong interaction is modeled by QCD, a theory exploiting the invariance of
the strong interaction with respect to a rotation in color space. We shall see that
SU(3) is the minimal representation for such a symmetry. If  is chosen to be
one of the generators of the SU(3) group then the associated gauge transformation
corresponds to a local rotation in a complex three-dimensional vector space, which
represents the color space. The gauge fields needed to ensure the invariance of the wave equations under such transformations are the strong fields, whose quanta are called gluons (see Sect. 6.4).

Figure 6.3 shows schematic representations of such transformations.

Fig. 6.3 Schematic representations of U(1), SU(2) and SU(3) transformations applied to the models of QED, weak and strong interactions
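The transformations just sketched can be made concrete numerically: a U(1) transformation is a single complex phase, while an SU(2) transformation can be built from the Pauli matrices through the closed 2 × 2 exponential formula exp(iα⃗·σ⃗/2) = cos(|α⃗|/2) I + i sin(|α⃗|/2) (α̂·σ⃗). A small sketch (the parameter values are arbitrary) checking unitarity and unit determinant:

```python
import numpy as np

# Pauli matrices
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def su2(alpha):
    """U = exp(i alpha.sigma/2), built with the closed 2x2 formula."""
    a = np.asarray(alpha, dtype=float)
    theta = np.linalg.norm(a)
    if theta == 0:
        return I2.copy()
    n = a / theta                       # rotation axis (unit vector)
    n_sigma = n[0] * s1 + n[1] * s2 + n[2] * s3
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * n_sigma

U = su2([0.3, -1.2, 0.7])
print(np.allclose(U @ U.conj().T, I2))   # unitary
print(np.isclose(np.linalg.det(U), 1))   # det = 1, i.e., an SU(2) element
```

The determinant is 1 because the Pauli matrices are traceless, which is exactly the “special” in SU(2).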
6.2.4 The Dirac Equation

The Dirac equation was briefly introduced in Sect. 3.2.1. It is a linear equation describing free relativistic particles with spin 1/2 (electrons and positrons, for instance); linearity allows one to overcome some difficulties coming from the second-order nature of the Klein–Gordon equation, which was the translation into quantum mechanical form of the relativistic Hamiltonian

H² = p² + m²
replacing the Hamiltonian itself and the momentum with the appropriate operators:

Ĥ² = p̂² + m² ⟹ −∂²ψ/∂t² = −∇²ψ + m²ψ . (6.38)
Dirac searched for an alternative relativistic equation starting from the generic
form describing the evolution of a wave function, in the familiar form:
∂
i = Ĥ ψ
∂t
with a Hamiltonian operator linear in p̂ (Lorentz invariance requires that if the Hamiltonian has first derivatives with respect to time, the spatial derivatives should also be of first order):

Ĥ = α⃗ · p⃗ + βm .

Requiring that Ĥ² = p̂² + m², the coefficients must satisfy

α_i² = 1 ; β² = 1
α_i β + βα_i = 0
α_i α_j + α_j α_i = 0 (i ≠ j) . (6.39)
Therefore, the parameters α⃗ and β cannot be numbers. However, things work if they are matrices (and if these matrices are Hermitian, the Hamiltonian is guaranteed to be Hermitian as well). It can be demonstrated that their lowest possible dimension is 4 × 4.
Using the explicit form of the momentum operator p⃗ = −i∇, the Dirac equation can be written as

i ∂ψ/∂t = (−i α⃗ · ∇ + βm) ψ . (6.40)
For a free particle with well-defined momentum p⃗, inserting a plane-wave solution turns the Dirac equation into the eigenvalue problem

(α⃗ · p⃗ + βm) u(p) = E u(p) .
This equation has four solutions: two with positive energy E = +E_p and two with negative energy E = −E_p. We will discuss later the interpretation of the
negative energy solutions. The Dirac equation accounts “for free” for the existence
of two spin states, which had to be inserted by hand in the Schrödinger equation of
nonrelativistic quantum mechanics, and therefore explains the magnetic moment of
point-like fermions. In addition, since spin is embedded in the equation, the Dirac equation allows one to compute correctly the energy splitting of atomic levels with the
same quantum numbers due to the spin-orbit and spin–spin interactions in atoms
(fine and hyperfine splitting).
We shall now write the free-particle Dirac equation in a more compact form, from
which relativistic covariance is immediately visible. This requires the introduction
of a new set of important 4 × 4 matrices, the γ μ matrices, which replace the αi and β
matrices discussed before. To account for electromagnetic interactions, the minimal
coupling prescription can once again be used.
One possible choice for α_i and β satisfying the conditions (6.39) is the set of matrices (rows separated by semicolons):

α_i = ( 0 σ_i ; σ_i 0 ) (6.41)

β = ( I 0 ; 0 −I ) (6.42)

where the σ_i are the 2 × 2 Pauli matrices (see Sect. 5.7.2) and I is the 2 × 2 unit matrix.
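That these matrices indeed satisfy the conditions (6.39) can be verified directly, for instance with a short numerical check:

```python
import numpy as np

# Pauli matrices and the Pauli-Dirac alpha_i, beta of eqs. (6.41)-(6.42)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)
I2 = np.eye(2, dtype=complex)

alpha = [np.block([[Z, s], [s, Z]]) for s in sigma]   # eq. (6.41)
beta = np.block([[I2, Z], [Z, -I2]])                  # eq. (6.42)
I4 = np.eye(4)

# Conditions (6.39): squares equal to 1, mutual anticommutation
for i, a in enumerate(alpha):
    assert np.allclose(a @ a, I4)                     # alpha_i^2 = 1
    assert np.allclose(a @ beta + beta @ a, 0)        # {alpha_i, beta} = 0
    for j, b in enumerate(alpha):
        if i != j:
            assert np.allclose(a @ b + b @ a, 0)      # {alpha_i, alpha_j} = 0
assert np.allclose(beta @ beta, I4)                   # beta^2 = 1
print("conditions (6.39) verified")
```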
Multiplying the Dirac equation (6.40) by β one has

iβ ∂ψ/∂t = −iβ α⃗ · ∇ψ + mψ .
Defining

γ⁰ = β ; γ⃗ = βα⃗ (6.43)

one has, explicitly (rows separated by semicolons):

γ⁰ = ( 1 0 0 0 ; 0 1 0 0 ; 0 0 −1 0 ; 0 0 0 −1 )
γ¹ = ( 0 0 0 1 ; 0 0 1 0 ; 0 −1 0 0 ; −1 0 0 0 )
γ² = ( 0 0 0 −i ; 0 0 i 0 ; 0 i 0 0 ; −i 0 0 0 )
γ³ = ( 0 0 1 0 ; 0 0 0 −1 ; −1 0 0 0 ; 0 1 0 0 )
then:

(i γ⁰ ∂/∂x⁰ + i γ^i ∂/∂x^i − m) ψ = 0 ,

or, with γ^μ = (β, βα⃗) and covariant notation,

(i γ^μ ∂_μ − m) ψ = 0 .
For a particle at rest (p⃗ = 0), the two spinors are subject to two independent differential equations:

∂ψ_A/∂t = −imψ_A

∂ψ_B/∂t = imψ_B

which have as solutions (up to a normalization factor):

• ψ_A = e^{−imt} ψ_A(0), with energy E = m > 0;
• ψ_B = e^{imt} ψ_B(0), with energy E = −m < 0,
or, in terms of each component of the wave function vector,

ψ₁ = e^{−imt} (1, 0, 0, 0)ᵀ ; ψ₂ = e^{−imt} (0, 1, 0, 0)ᵀ ;
ψ₃ = e^{imt} (0, 0, 1, 0)ᵀ ; ψ₄ = e^{imt} (0, 0, 0, 1)ᵀ .
There are then four solutions which can accommodate a spin 1/2 particle or antipar-
ticle. The positive energy solutions ψ1 and ψ2 correspond to fermions (electrons for
instance) with spin up and down, respectively, while the negative energy solutions ψ3
and ψ4 correspond to antifermions (positrons for instance) with spin up and down.
Free particles have p⃗ = constant and their wave function is a plane wave of the form

ψ(x⃗, t) = u(p⁰, p⃗) e^{−i(p⁰t − p⃗·x⃗)}

where

u(p⁰, p⃗) = N ( φ ; χ ) .
Inserting the plane wave as a trial solution and using the Pauli–Dirac representation of the γ matrices:

( (p⁰ − m)I −σ⃗·p⃗ ; σ⃗·p⃗ (−p⁰ − m)I ) ( φ ; χ ) = 0 .

I is again the 2 × 2 unit matrix (often omitted when writing the equations), and

σ⃗ · p⃗ = ( p_z p_x − ip_y ; p_x + ip_y −p_z ) .
φ = [σ⃗ · p⃗/(E − m)] χ

χ = [σ⃗ · p⃗/(E + m)] φ
and then the u bi-spinor can be written either in terms of the spinor φ or in terms of the spinor χ:

u₁ = N ( φ ; [σ⃗·p⃗/(E + m)] φ )

u₂ = N ( −[σ⃗·p⃗/(−E + m)] χ ; χ ) .
The first solution corresponds to states with E > 0 (particles) and the second to
states with E < 0 (antiparticles) as can be seen by going to the p = 0 limit. These
last states can be rewritten changing the sign of E and p and labeling the bi-spinor
u 2 as v (u 1 is then labeled just as u).
v = N ( [σ⃗·p⃗/(E + m)] χ ; χ ) .
Finally, we have then again four solutions: two for the particle states and two for the
antiparticle states.
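These solutions can be verified numerically by building u for an arbitrary momentum and checking that it satisfies the momentum-space Dirac equation (γ^μ p_μ − m)u = 0; the mass and momentum values below are arbitrary illustrative choices:

```python
import numpy as np

# Gamma matrices in the Pauli-Dirac representation
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
Z, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
g0 = np.block([[I2, Z], [Z, -I2]])
g = [np.block([[Z, s], [-s, Z]]) for s in sigma]

m = 0.511                        # illustrative mass (e.g. MeV)
p = np.array([0.3, -0.4, 1.2])   # arbitrary momentum
E = np.sqrt(p @ p + m * m)       # on-shell energy

sp = sum(pi * si for pi, si in zip(p, sigma))     # sigma . p
phi = np.array([1.0, 0.0], dtype=complex)         # spin-up two-spinor
u = np.concatenate([phi, (sp @ phi) / (E + m)])   # u_1 up to normalization

# gamma^mu p_mu = E gamma^0 - gamma . p ; check (slash - m) u = 0
slash = E * g0 - sum(pi * gi for pi, gi in zip(p, g))
print(np.allclose(slash @ u, m * u))
```

The check works for any on-shell momentum, since the (E − m)(E + m) = p⃗² identity cancels the two components exactly.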
The normalization factor N is often defined as

N = √((E + m)/V)

such that

u†u = v†v = 2E/V .
6.2.4.3 Helicity
The spin operator S⃗ introduced in Sect. 5.7.2 can now be generalized in this bi-spinor space as

S⃗ = (1/2) Σ⃗ (6.45)

where (rows separated by semicolons)

Σ⃗ = ( σ⃗ 0 ; 0 σ⃗ ) . (6.46)
More generally, defining the helicity operator h as the projection of the spin over the momentum direction,

h = (1/2) (Σ⃗ · p⃗)/|p⃗| , (6.47)
there are always four eigenstates of this operator. Indeed, using spherical polar coordinates (θ, φ), the two antiparticle eigenstates, for instance, read

v↑ = √(E + m) ( [p/(E + m)] sin(θ/2) ; −[p/(E + m)] cos(θ/2) e^{iφ} ; −sin(θ/2) ; cos(θ/2) e^{iφ} )

v↓ = √(E + m) ( [p/(E + m)] cos(θ/2) ; [p/(E + m)] sin(θ/2) e^{iφ} ; cos(θ/2) ; sin(θ/2) e^{iφ} ) .
Note that helicity is Lorentz invariant only in the case of massless particles (otherwise
the direction of p can be inverted choosing an appropriate reference frame).
The Dirac bi-spinors are not real four-vectors, and it can be shown that the product ψ†ψ is not a Lorentz invariant (a scalar). On the contrary, the product ψ̄ψ is a Lorentz invariant, ψ̄ being named the adjoint Dirac spinor and defined as:

ψ̄ = ψ†γ⁰ = (ψ₁*, ψ₂*, ψ₃*, ψ₄*) γ⁰ = (ψ₁*, ψ₂*, −ψ₃*, −ψ₄*) .
The parity operator P in the Dirac bi-spinor space is just the matrix γ⁰ (it reverts the sign of the terms which are functions of p⃗), and

P(ψ̄ψ) = ψ†γ⁰γ⁰γ⁰ψ = ψ̄ψ

as (γ⁰)² = 1.
Other quantities can be constructed using ψ and ψ̄ (bilinear covariants). In particular, introducing γ⁵ as (rows separated by semicolons)

γ⁵ = iγ⁰γ¹γ²γ³ = ( 0 0 0 1 ; 0 0 1 0 ; 0 1 0 0 ; 1 0 0 0 ) :

• ψ̄γ⁵ψ is a pseudoscalar.
• ψ̄γ^μψ is a four-vector.
• ψ̄γ^μγ⁵ψ is a pseudo four-vector.
• ψ̄σ^{μν}ψ, where σ^{μν} = (i/2)(γ^μγ^ν − γ^νγ^μ), is an antisymmetric tensor.
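The explicit form of γ⁵ and its characteristic properties ((γ⁵)² = 1 and anticommutation with the γ^μ, which underlie the pseudo-character of the bilinears above) can be checked numerically:

```python
import numpy as np

# gamma^5 = i g0 g1 g2 g3 in the Pauli-Dirac representation
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
Z, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
g0 = np.block([[I2, Z], [Z, -I2]])
gs = [g0] + [np.block([[Z, s], [-s, Z]]) for s in sigma]

g5 = 1j * gs[0] @ gs[1] @ gs[2] @ gs[3]
print(np.allclose(g5, np.block([[Z, I2], [I2, Z]])))     # antidiagonal form
print(np.allclose(g5 @ g5, np.eye(4)))                   # (gamma^5)^2 = 1
print(all(np.allclose(g5 @ gm + gm @ g5, 0) for gm in gs))  # anticommutation
```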
Electromagnetic interactions are included, once more, via the minimal coupling prescription

∂^μ → D^μ ≡ ∂^μ + ieA^μ .

Then

(iγ^μ D_μ − m) ψ = 0

(iγ^μ ∂_μ − eγ^μ A_μ − m) ψ = 0 .

The interaction with a magnetic field can then be described introducing the two spinors φ and χ and using the Pauli–Dirac representation of the γ matrices:

( p⁰ − m −σ⃗·(−i∇ − eA⃗) ; σ⃗·(−i∇ − eA⃗) −p⁰ − m ) ( φ ; χ ) = 0 .
In the nonrelativistic limit one obtains

(1/2m) (p⃗ − eA⃗)² ψ − (e/2m) B⃗ · σ⃗ ψ = 0 ,

where the term

μ⃗_S = (e/2m) σ⃗ = (e/m) S⃗

can be identified with the intrinsic magnetic moment of a charged particle with spin S⃗.
Defining the gyromagnetic ratio g as the ratio between μ⃗_S and the classical magnetic moment μ⃗_L of a charged particle with an angular momentum L⃗ = S⃗:

g = μ_S/μ_L = 2 .
6.2.4.6 g − 2
The value of the coupling between the magnetic field and the spin of a point charged particle is, however, modified by higher-order corrections, which can be translated into successive Feynman diagrams such as the ones we have seen in Fig. 6.1. At second order, the main correction is introduced by a vertex correction, described by the diagram represented in Fig. 6.4 and computed in 1948 by Schwinger, leading to a deviation of g from 2 of magnitude

a_e ≡ (g − 2)/2 = α/(2π) ≃ 0.0011614 .
Nowadays, the theoretical corrections are completely computed up to the eighth order (891 diagrams), and the most significant tenth-order terms as well as electroweak and hadronic corrections are also computed. There is a remarkable agreement with the present experimental value of

a_e^{exp} = 0.00115965218076 ± 0.00000000000027 .
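The leading-order (Schwinger) prediction can be compared with this experimental value directly; a sketch (the value of α is typed in here and assumed):

```python
import math

# Schwinger's leading QED correction vs the measured electron anomaly
ALPHA = 1 / 137.035999           # fine structure constant (assumed value)
a_schwinger = ALPHA / (2 * math.pi)

a_exp = 0.00115965218076         # experimental value quoted in the text
print(a_schwinger, a_exp, a_schwinger / a_exp - 1)
```

The single-vertex-correction term alone already agrees with experiment at the per-mille level; the higher-order terms account for the remaining difference.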
For an electron moving in a magnetic field B, the rotation (cyclotron) frequency is

ω_rot = eB/m .
Nowadays, Penning traps are used to keep electrons (and positrons) confined for months at a time. Such a device, invented by H. Dehmelt, uses a homogeneous static magnetic field and a spatially inhomogeneous static electric field to trap charged particles (Fig. 6.6).
The muon and electron magnetic moments should be, at first order, equal. However, the loop corrections are proportional to the square of the respective masses, and thus those of the muon are much larger (m_μ²/m_e² ∼ 4 × 10⁴). In particular, the sensitivity to loops involving hypothetical new particles (see Chap. 7 for a survey) is much higher, and a precise measurement of the muon anomalous magnetic moment a_μ may be used as a test of the Standard Model.
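The quoted enhancement factor is easy to reproduce (the mass values are typed in from standard tables, an assumption of this sketch):

```python
# Relative sensitivity of the muon vs electron anomaly to heavy virtual states
M_MU = 105.6583755   # muon mass, MeV (assumed tabulated value)
M_E = 0.51099895     # electron mass, MeV (assumed tabulated value)

enhancement = (M_MU / M_E) ** 2
print(enhancement)   # of order 4e4, as quoted in the text
```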
Fig. 6.6 Schematic representation of the electric and magnetic fields inside a Penning trap. By
Arian Kriesch Akriesch 23:40, [own work, GFDL http://www.gnu.org/copyleft/fdl.html, CC-BY-
SA-3.0], via Wikimedia Commons
Fig. 6.7 The E821 storage ring. From Brookhaven National Laboratory
The most precise measurement of a_μ so far was done by the experiment E821 at Brookhaven National Laboratory (BNL). A beam of polarized muons circulates in a storage ring with a diameter of ∼14 m under the influence of a uniform magnetic field (Fig. 6.7). The muon spin precesses, and the polarization of the beam is a function of time. After many turns, muons decay to electrons (and neutrinos) whose momentum is basically aligned with the direction of the muon spin (see Sect. 6.3). The measured value is

a_μ^{exp} = 0.00116592083 ± 0.00000000063 .

This result is more than 3σ away from the expected one, which has led to a wide discussion both on the accuracy of the theoretical computation (in particular of the hadronic contribution) and on the possibility of an indication of new physics. This discussion is quite interesting, but the statistical significance of the result is still marginal.
∂_μ (∂L/∂(∂_μψ̄)) − ∂L/∂ψ̄ = 0 ⟹ iγ^μ ∂_μψ − mψ = 0 ,
Notice that:

• the mass (i.e., the energy associated to rest, whatever this can mean in quantum mechanics) is associated to a term quadratic in the field,

m ψ̄ψ ;
The Klein–Gordon equation was briefly introduced in Sect. 3.2.1. It describes free
relativistic particles with spin 0 (scalars or pseudoscalars). With the introduction
of the four-vector notation, it can be written in a covariant form. To account for
electromagnetic interactions, the minimal coupling prescription can be used.
with

E = ±√(p² + m²)
(the positive solutions correspond to particles and the negative ones to antiparticles).
Doing some arithmetic with the Klein–Gordon equation and its conjugate, a continuity equation can also be obtained for a particle with charge e:

∇ · j⃗ = −∂ρ/∂t

where

ρ(x) = ie(φ* ∂_t φ − φ ∂_t φ*) ; j⃗(x) = −ie(φ* ∇φ − φ∇φ*)

or, in terms of four-vectors,

∂^μ j_μ = 0

where

j_μ(x) = ie(φ* ∂_μφ − φ ∂_μφ*) .
Applying the minimal coupling prescription, one obtains

(∂_μ + ieA_μ)(∂^μ + ieA^μ) φ(x) + m²φ(x) = 0

[∂_μ∂^μ + m² + ie(∂_μA^μ + A_μ∂^μ) − e²A_μA^μ] φ(x) = 0 .

The e² term is of second order and can be neglected. Then the Klein–Gordon equation in the presence of an electromagnetic field can be written at first order as

[∂_μ∂^μ + V(x) + m²] φ(x) = 0

where

V(x) = ie(∂_μA^μ + A_μ∂^μ)
is the potential.
Consider now the Lagrangian density

L = (1/2)(∂_μφ)(∂^μφ) − (1/2) m²φ² (6.51)

and apply the Euler–Lagrange equations to φ. We find

∂_μ (∂L/∂(∂_μφ)) − ∂L/∂φ = ∂_μ∂^μφ + m²φ = 0 ,
i.e., the Klein–Gordon equation; again the mass is associated to a term quadratic in the field,

(1/2) m²φ² .
Let us now build a field theory equivalent to the Dirac equation in the presence of an external field. We already wrote a Lagrangian density equivalent to the Dirac equation for a free particle:

L = ψ̄(iγ^μ∂_μ − m)ψ . (6.52)
Under a local gauge transformation the four-potential transforms as

A^μ → A^μ − ∂^μθ(x) ,

and the full (gauge-invariant) QED Lagrangian reads

L = iψ̄γ^μ∂_μψ − eψ̄γ^μA_μψ − mψ̄ψ − (1/4) F_μνF^{μν} . (6.54)
To find the equations of motion, we can apply to this Lagrangian the Euler–
Lagrange equations for a field:
∂_μ (∂L/∂(∂_μψ)) − ∂L/∂ψ = 0 . (6.55)
One has

∂_μ (∂L/∂(∂_μψ)) = ∂_μ(iψ̄γ^μ) ; ∂L/∂ψ = −eψ̄γ^μA_μ − mψ̄ .

Substituting these two expressions back into the Euler–Lagrange equation results in

i∂_μψ̄ γ^μ + eψ̄γ^μA_μ + mψ̄ = 0 ,

and taking the complex conjugate and moving the term in the potential to the right side:

iγ^μ∂_μψ − mψ = eγ^μA_μψ .
This is the Dirac equation including electrodynamics, as we have seen when dis-
cussing the minimal coupling prescription.
Let us now apply the Euler–Lagrange equations this time to the field Aμ in the
Lagrangian (6.53):
∂_ν (∂L/∂(∂_νA_μ)) − ∂L/∂A_μ = 0 . (6.56)
We find

∂_ν (∂L/∂(∂_νA_μ)) = ∂_ν(∂^μA^ν − ∂^νA^μ) ; ∂L/∂A_μ = −eψ̄γ^μψ

and thus

∂_ν F^{νμ} = eψ̄γ^μψ . (6.57)
For the spinor matter fields, the current takes the simple form

j^μ(x) = Σ_i q_i ψ̄_i(x) γ^μ ψ_i(x) (6.58)

and the inhomogeneous Maxwell equations read

∂_ν F^{νμ} = j^μ . (6.59)
The homogeneous Maxwell equations follow from the identity ε^{μνρσ} ∂_ν F_{ρσ} = 0.
In the Lorenz gauge, (6.59) becomes

□A^μ = eψ̄γ^μψ ,

which is a wave equation for the four-potential: the QED version of the classical Maxwell equations in the Lorenz gauge.
Notice that the Lagrangian (6.53) of QED, based on a local gauge invariance, con-
tains all the physics of electromagnetism. It reflects also some remarkable properties,
confirmed by the experiments:
• The interaction conserves separately P, C, and T.
• The current is diagonal in flavor space (i.e., it does not change the flavors of the
particles).
We can see how the massless electromagnetic field A^μ “appears” thanks to the gauge invariance. This is the basis of QED, quantum electrodynamics.
If a mass m ≠ 0 were associated to A, this new field would enter the Lagrangian with a Proca term

−(1/4) F^{μν}F_μν + (1/2) m² A^μA_μ ,

which is not invariant under local phase transformations. The field must, thus, be massless.
Summarizing, the requirement of local phase invariance under U(1), applied to
the free Dirac Lagrangian, generates all of electrodynamics and specifies the elec-
tromagnetic current associated to Dirac particles; moreover, it introduces a massless
field which can be interpreted as the photon. This is QED.
Notice that introducing local phase transformations just implies a simple differ-
ence in the calculation of the derivatives: we pick up an extra piece involving Aμ .
∂^μ → D^μ = ∂^μ + iqA^μ
Electrons and muons have spin 1/2; but, for a moment, let us see how to compute
transition probabilities in QED in the case of spinless particles, since the computation
of the electromagnetic scattering amplitudes between charged spinless particles is
much simpler.
The scattering of a particle due to an interaction that acts only in a finite time interval can be described, as discussed in Sect. 2.7, as the transition between initial and final stationary states characterized by well-defined momenta. The first-order amplitude for such a transition is written, in relativistic perturbative quantum mechanics, as (see Fig. 6.8, left):

H_if = −i ∫ φ_f*(x) V(x) φ_i(x) d⁴x .
Fig. 6.8 Left Schematic representation of the first-order interaction of a particle in a field. Right
Schematic representation (Feynman diagram) of the first-order elastic scattering of two charged
non-identical particles
and noting that, integrating by parts and assuming that the potential vanishes at t → ±∞ or x → ±∞,

∫ φ_f*(x) (∂_μA^μ) φ_i d⁴x = −∫ (∂_μφ_f*) A^μ φ_i d⁴x ,

one can introduce a “transition” current j_μ^{if} between the initial and final states, defined as

j_μ^{if} = ie(φ_f* ∂_μφ_i − (∂_μφ_f*) φ_i) .

In the case of plane waves describing particles with charge e, the current can be written as

j_μ^{if} = e N_i N_f (p_i + p_f)_μ e^{i(p_f − p_i)·x} . (6.60)
The interaction of two charged particles can be treated as the interaction of one of
the particles with the field created by the other (which thus acts as the source of the
field).
The initial and final states of particle 1 are labeled as the states A and C, respec-
tively, while for the particle 2 (taken as the source of the field) the corresponding
labels are B and D (see Fig. 6.8, right). Let us assume that particles 1 and 2 are not of
the same type (otherwise they would be indistinguishable) and have charge e. Then:
H_if = −i ∫ j_μ^{AC} A^μ d⁴x

with

j_μ^{AC} = e N_A N_C (p_A + p_C)_μ e^{i(p_C − p_A)·x} .

The field A^μ created by particle 2 obeys □A^μ = j^μ_{BD}, with

j^μ_{BD} = e N_B N_D (p_B + p_D)^μ e^{i(p_D − p_B)·x} .

Defining

q = (p_D − p_B) = (p_A − p_C)

then

A^μ = −(1/q²) j^μ_{BD}

and

H_if = −i ∫ j_μ^{AC} (−1/q²) j^μ_{BD} d⁴x .
H_if = −i N_A N_B N_C N_D (2π)⁴ δ⁴(p_A + p_B − p_C − p_D) M ,
The different factors of this amplitude can be associated directly to the several elements of the Feynman diagram (Fig. 6.8, right). In particular, the factors (ie(p_A + p_C)^μ) and (ie(p_B + p_D)^ν) can be associated, respectively, to the vertices of particles 1 and 2 with the photon, and the factor (−ig_{μν}/q²), called the “propagator” since it is the mathematical term accounting for the interaction, to the exchanged photon. There is energy–momentum conservation at each vertex.
With θ the scattering angle in the center-of-mass reference frame (see Fig. 6.9)
and p the modulus of the center-of-mass momentum, the four-vectors of the initial and
final states at high energy (E ≫ m) can be written as
p A = ( p, p, 0, 0)
p B = ( p, − p, 0, 0)
pC = ( p, p cos θ, p sin θ, 0)
p D = ( p, − p cos θ, − p sin θ, 0) .
Then:
( p A + pC ) = (2 p, p (1 + cos θ) , p sin θ, 0)
( p B + p D ) = (2 p, − p (1 + cos θ) , − p sin θ, 0)
and
$$\mathcal{M} = -\frac{e^2}{q^2}\left[(p_A+p_C)^0\,(p_B+p_D)^0 - \sum_{i=1}^{3}(p_A+p_C)^i\,(p_B+p_D)^i\right]$$

$$\mathcal{M} = e^2\;\frac{4p^2 + p^2\left(1+\cos\theta\right)^2 + p^2\sin^2\theta}{p^2\left(1-\cos\theta\right)^2 + p^2\sin^2\theta}$$

$$\mathcal{M} = e^2\,\frac{(3+\cos\theta)}{(1-\cos\theta)}\,.$$
The differential cross section is then

$$\frac{d\sigma}{d\Omega} = \frac{|\mathcal{M}|^2}{64\pi^2 s} = \frac{\alpha^2}{4s}\left(\frac{3+\cos\theta}{1-\cos\theta}\right)^2\,,$$

where

$$\alpha = \frac{e^2}{4\pi} \qquad (6.61)$$

is the fine structure constant.
6.2 Quantum Electrodynamics (QED) 279
Note that when cos θ → 1 the cross section diverges. This fact is a consequence
of the infinite range of the electromagnetic interactions, translated into the fact that
photons are massless.
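The forward divergence can be illustrated numerically; a minimal sketch (the function name is ours; natural units with e² = 4πα and an arbitrary illustrative value of s):

```python
import math

def dsigma_domega(cos_theta, s, alpha=1/137.036):
    """Spinless first-order scattering: M = e^2 (3 + cos θ)/(1 - cos θ),
    dσ/dΩ = |M|² / (64 π² s). Natural units (GeV⁻²), e² = 4πα."""
    e2 = 4 * math.pi * alpha
    m = e2 * (3 + cos_theta) / (1 - cos_theta)
    return m**2 / (64 * math.pi**2 * s)

s = 100.0  # s = (10 GeV)², an arbitrary illustrative choice
for c in (0.0, 0.9, 0.99, 0.999):
    print(c, dsigma_domega(c, s))  # grows without bound as cos θ → 1
```

The growth as cos θ → 1 mirrors the 1/q² photon propagator: q² → 0 for forward scattering.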
Electrons and muons have spin 1/2 and are thus described by Dirac bi-spinors (see
Sect. 6.2.4). The computation of the scattering amplitudes is more complex than the
one discussed in the previous subsection for the case of spinless particles but the
main steps, summarized hereafter, are similar.
The Dirac equation in the presence of an electromagnetic field is written as

$$\left(i\gamma^\mu\partial_\mu - e\,\gamma^\mu A_\mu - m\right)\psi = 0\,,$$

and the associated current is

$$j^\mu(x) = -e\,\bar\psi\,\gamma^\mu\,\psi\,.$$
The transition amplitude for the electron (states A and C)/muon (states B and D)
scattering can then be written as (Fig. 6.10):
$$H_{if} = -i\int j_\mu^{\mathrm{elect}}\left(-\frac{1}{q^2}\right) j^\mu_{\mathrm{muon}}\; d^4x$$

where

$$j_\mu^{\mathrm{elect}} = -e\,\bar u_C\,\gamma_\mu\,u_A\; e^{-iqx}$$

$$j^\mu_{\mathrm{muon}} = -e\,\bar u_D\,\gamma^\mu\,u_B\; e^{iqx}$$
with
q = ( p D − p B ) = ( p A − pC ) .
$$H_{if} = -i\,N_A N_B N_C N_D\,(2\pi)^4\,\delta^4\!\left(p_A + p_B - p_C - p_D\right)\mathcal{M}\,.$$
Once again there is a strict correspondence between the different factors of this
amplitude and the several elements of the Feynman diagrams. Now to each incoming
or outgoing external line a spinor (u_A, u_B) or an adjoint spinor (ū_C, ū_D) is,
respectively, associated, while the factor associated to the photon internal line
(photon propagator) is −ig_{μν}/q². The vertex factor is now given by (ieγ^μ) and again
energy-momentum is conserved at each vertex.
Higher order terms correspond to more complex diagrams which may have inter-
nal loops and fermion internal lines (see Fig. 6.11). In this case, the factor associated
to each internal fermion line is
$$\frac{i\left(\gamma^\mu p_\mu + m\right)}{p^2 - m^2}$$
and one should not forget that each internal four-momentum loop has to be integrated
over the full momentum range.
The cross sections can be computed once again using the Fermi golden rule. This
requires the computation of the square of the transition amplitude |M|2 and is, even
at the first order, beyond the scope of the present text (it involves the sum over all
internal indices and products of γ matrices which can be considerably simplified
using the so-called trace theorems). If several Feynman diagrams contribute to the
same process (same initial and final states) the corresponding amplitudes have to be
summed before squaring. On the other hand, if several different final states (different
spin configurations for instance) correspond to the same initial state, the sum must be
performed over the squares of the individual amplitudes. Finally, if the beam and/or
the target are not polarized (if they are not in spin eigenstates) an average over the
possible spin configuration has to be calculated.
The first-order elastic electron/muon scattering cross section at high energy (E ≫ m) in the center-of-mass reference frame is given by

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2s}\;\frac{1+\cos^4(\theta/2)}{\sin^4(\theta/2)}\,. \qquad (6.62)$$
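Eq. (6.62) is easy to evaluate numerically; the sketch below (the function name and the chosen √s are ours) shows how strongly the cross section is peaked in the forward direction:

```python
import math

def dsigma_domega(theta, s, alpha=1/137.036):
    """Eq. (6.62): first-order elastic e-mu cross section at E >> m,
    dσ/dΩ = (α²/2s) (1 + cos⁴(θ/2)) / sin⁴(θ/2). Natural units (GeV⁻²)."""
    return (alpha**2 / (2 * s)) * (1 + math.cos(theta / 2)**4) / math.sin(theta / 2)**4

s = 10.0**2  # s = (10 GeV)², an arbitrary illustrative energy
print(dsigma_domega(math.pi / 2, s))   # 90 degrees
print(dsigma_domega(math.pi / 36, s))  # 5 degrees: strongly forward-peaked
```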
Higher-order diagrams often involve closed loops, in which an integration over all possible
momenta must be performed (see Fig. 6.11). As these loops are virtual, they represent
phenomena that occur on time scales compatible with the Heisenberg uncertainty
relations. Since there is no limit on the range of the integration and on the number of
diagrams, the probabilities may a priori diverge to infinity. We shall see, however,
that the effect of higher-order diagrams is the redefinition of some quantities; for
example, the "bare" charge of the electron becomes a new quantity e that we measure
in experiments. A theory with such characteristics (i.e., a theory for which the
series of the contributions from all diagrams converges) is said to be renormalizable.
To avoid confusion, in what follows we shall call g_e the "pure" electromagnetic
coupling.
Following the example of the amplitude corresponding to the diagram represented
in Fig. 6.10, the photon propagator is modified by the introduction of the integration
over the virtual fermion/antifermion loop leading to
$$\mathcal{M}_2 \sim \frac{-g_e^4}{q^4}\;\bar u_C\gamma^\mu u_A\;\bar u_D\gamma_\mu u_B \int_0^\infty \frac{(\ldots)}{\left(k^2 - m^2\right)\left((k-q)^2 - m^2\right)}\; d^4k\,,$$

where g_e is the "bare" coupling constant (g_e = √(4πα₀) in the case of QED; α₀
refers to the "bare" coupling, without renormalization).
The integral can be computed setting some energy cut-off M and making M → ∞
at the end of the calculation. Then it can be shown that

$$\lim_{M\to\infty}\int_0^{M} \frac{(\ldots)}{\left(k^2-m^2\right)\left((k-q)^2-m^2\right)}\; d^4k \;\sim\; \frac{q^2}{12\pi^2}\left[\ln\frac{M^2}{m^2} - f\!\left(\frac{-q^2}{m^2}\right)\right]$$
and, neglecting g_e⁶ terms (for that, many other diagrams would have to be summed up, but
the associated probability is expected to become negligible),

$$\mathcal{M}_2 \sim \frac{-g_R^2}{q^2}\;(\ldots)\,(\ldots)\left[1 + \frac{g_R^2}{12\pi^2}\, f\!\left(\frac{-q^2}{m^2}\right)\right]\,.$$
M₂ is no longer divergent, but the coupling constant g_R (the electric charge) is
now a function of q²:

$$g_R\!\left(q^2\right) = g_R\!\left(q_0^2\right)\left[1 + \frac{g_R^2\!\left(q_0^2\right)}{12\pi^2}\, f\!\left(\frac{-q^2}{m^2}\right)\right]\,. \qquad (6.65)$$
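A rough numerical illustration of Eq. (6.65) (our own sketch; we assume the leading-log behavior f(x) ≈ ln x, valid at large x = −q²/m²) shows the slow growth of the coupling with the momentum transfer:

```python
import math

def g_running(q2, g0, m=0.000511):
    """First-order running coupling, Eq. (6.65), with the assumption
    f(x) ≈ ln(x) for large x (leading-log behavior). q2 is the spacelike
    (negative) squared momentum transfer in GeV²; m is the loop-fermion
    mass (here the electron, in GeV)."""
    x = -q2 / m**2
    return g0 * (1 + g0**2 / (12 * math.pi**2) * math.log(x))

g0 = math.sqrt(4 * math.pi / 137.036)  # coupling at low |q²|
for q2 in (-1.0, -100.0, -10000.0):
    print(q2, g_running(q2, g0))  # increases slowly (logarithmically) with |q²|
```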
Other diagrams, such as those represented in Fig. 6.12, lead to the renormalization of
fundamental constants. In the left diagram, "emission" and "absorption" of a virtual
photon by one of the fermion external lines contributes to the renormalization of
the fermion mass, while in the one on the right, "emission" and "absorption" of
a virtual photon between the fermion external lines from a same vertex contribute
to the renormalization of the fermion magnetic moment. In the same way, vacuum
polarization diagrams surround any "bare" charge with a cloud of virtual pairs, a
polarized medium that is at the same time the source of the electromagnetic field
(Fig. 6.14). This bare
charge is screened by this polarized medium and its intensity decreases with the
distance to the charge (increases with the square of the transferred momentum).
Even in the absence of any "real" matter particle (i.e., in the vacuum) there is no
empty space in quantum field theory. A rich spectrum of virtual particles (like
the photons) can be created and destroyed under the protection of the Heisenberg
uncertainty relations and, within their limits, be transformed into fermion/antifermion
pairs. Space is thus full of electromagnetic waves, and the energy of its ground
state (the zero-point energy) is, like the ground state of any harmonic oscillator,
different from zero. The integral over all space of this ground-state energy is
infinite, which leads once again to an enormous challenge for theoretical physicists
(what is its relation with a nonzero cosmological constant, which may explain the
accelerated expansion of the Universe observed in the last years, as discussed in
Sect. 8.1?).
A spectacular consequence is the attraction experienced by two neutral conducting
plates placed face to face at very short distances, typically of the order of
the micrometer (see Fig. 6.15). This effect is known as the Casimir effect, since it was
predicted by Hendrik Casimir4 in 1948 and later on experimentally demonstrated.
The two plates impose boundary conditions to the electromagnetic waves originated
by the vacuum fluctuations, and the energy decreases with the distance in such a way
that the net result is a very small but measurable attractive force.
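For the parallel-plate geometry the attractive pressure is the standard result P = π²ħc/(240 d⁴); a quick numerical check (the function name is ours) at d = 1 µm:

```python
import math

hbar = 1.054571817e-34  # J s
c = 2.99792458e8        # m/s

def casimir_pressure(d):
    """Attractive Casimir pressure between two parallel, perfectly
    conducting plates a distance d (meters) apart: P = π² ħ c / (240 d⁴)."""
    return math.pi**2 * hbar * c / (240 * d**4)

# At d = 1 micrometer the pressure is tiny but measurable:
print(casimir_pressure(1e-6))  # ~1.3e-3 Pa
```

The 1/d⁴ dependence is why the effect is only observable at (sub)micrometer separations.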
A theory is said to be renormalizable if (as in QED) all the divergences at all
orders can be absorbed into running physical constants; corrections are then finite at
any order of the perturbative expansion. The present theory of the so-called Standard
Model of Particle Physics was proven to be renormalizable. In contrast, the quanti-
zation of general relativity leads easily to non-renormalizable terms and this is one
4 Hendrik Casimir (1909–2000) was a Dutch physicist mostly known for his works on superconductivity.
of the strong motivations for alternative theories (see Chap. 7). Nevertheless, the fact
that a theory is not renormalizable does not mean that it is useless: it might just be
an effective theory that works only up to some physical scale.
Weak interactions are, of course, weak; they have short range and, contrary to the
other interactions, do not bind particles together. Their existence was first revealed
in β decay and their universality was the object of many controversies until being
finally established in the second half of the twentieth century. All fermions have
weak charges and are thus subject to their subtle or dramatic effects. The structure of
the weak interactions was found to be similar to the structure of QED, and this fact is
at the basis of one of the most important and beautiful pieces of theoretical work in
the 20th century: the Glashow–Weinberg–Salam model of electroweak interactions,
which, together with the theory of strong interactions (QCD), constitutes the Standard
Model (SM) of particle physics, that will be discussed in the next chapter.
There are however striking differences between QED and weak interactions: parity
is conserved, as it was expected, in QED, but not in weak interactions; the relevant
symmetry group in weak interactions is SU(2) (fermions are grouped in left doublets
and right singlets) while in QED the symmetry group is U(1); in QED there is only
one massless vector boson, the photon, while weak interactions are mediated by three
massive vector bosons, the W± and the Z.
β decay had been known for a long time when Enrico Fermi, in 1933, realized that
the associated transition amplitude could be written in a way similar to QED (see
Sect. 6.2.8). Assuming time reversal symmetry, one can see that the transition amplitude
for β decay,

$$n \to p\; e^-\,\bar\nu_e\,,$$

describes as well the related processes

$$e^-\, p \to n\;\nu_e \quad (\text{K capture});$$
$$\bar\nu_e\; p \to n\; e^+ \quad (\text{inverse }\beta\text{ decay})\,.$$
The transition amplitude can then be seen as the interaction of one hadronic and
one leptonic current (Fig. 6.16) and may be written, in analogy to the electron-muon
elastic scattering discussed before (Fig. 6.10) as
$$\mathcal{M} = G_F\;\bar u_p\,\gamma^\mu u_n\;\;\bar u_e\,\gamma_\mu u_{\nu_e}\,. \qquad (6.67)$$
The cross section grows with the square of the center-of-mass energy and this behav-
ior is indeed observed in low-energy neutrino scattering experiments.
However, from quantum mechanics, it is well known that a cross section can be
decomposed into a sum over all the possible angular momenta l, and then

$$\sigma \le \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\,. \qquad (6.69)$$
With λ = 1/k, this relation just means that the contribution of each partial wave is
bounded, its scale being given by the area (πλ²) "seen" by the incident particle. In a
contact interaction, the impact parameter is zero and so the only possible contribution
is the S wave (l = 0). Thus, the neutrino–nucleon cross section cannot increase forever.
Given the magnitude of the Fermi constant G F , the Fermi model of weak interactions
cannot be valid for center-of-mass energies above a few hundreds of GeV (this bound
is commonly said to be imposed by unitarity in the sense that the probability of an
interaction cannot be >1).
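A back-of-the-envelope check of this unitarity argument (the σ ~ G_F² s/π estimate is purely dimensional, and the function names are ours):

```python
import math

G_F = 1.166e-5  # Fermi constant, GeV^-2

def fermi_xsec(s):
    """Dimensional estimate of the point-like (Fermi) neutrino cross
    section, σ ~ G_F² s / π (natural units, GeV⁻²)."""
    return G_F**2 * s / math.pi

def swave_bound(s):
    """S-wave unitarity bound σ ≤ 4π/k², with k² = s/4 in the CM frame."""
    return 16 * math.pi / s

# Energy at which the estimate saturates the bound: s² = 16π²/G_F²
sqrt_s = math.sqrt(4 * math.pi / G_F)
print(sqrt_s)  # ~1e3 GeV: the Fermi theory must fail around the TeV scale
s0 = sqrt_s**2
print(fermi_xsec(s0), swave_bound(s0))  # equal at the crossing
```

The crude estimate lands at the same scale as the text's "few hundreds of GeV": Fermi theory cannot be a fundamental description at collider energies.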
6.3 Weak Interactions 287
The solution of such a problem was found by associating the weak interactions to a
new field of short range, the weak field, whose massive charged bosons (the W±)
act as propagators. In practice (see Sect. 6.3.5),

$$G_F \to \frac{g_w^2}{q^2 - M_W^2}\,, \qquad (6.70)$$
For instance, for the muon decay

$$\mu^- \to e^-\,\nu_\mu\,\bar\nu_e\,,$$

$$\mathcal{M} = G_F\;\bar u_{\nu_\mu}\gamma^\rho u_\mu\;\;\bar u_e\,\gamma_\rho u_{\nu_e}\,. \qquad (6.71)$$
Although β and μ decays are due to the same type of interaction, their phenomenology
is completely different: the neutron lifetime is ∼900 s while the muon lifetime
is ∼2.2 µs; the energy spectrum of the decay electron is in both cases continuous
(three-body decay), but its shape is quite different (Fig. 6.18). While in β decay it
vanishes at the endpoint, in the case of the μ it is clearly nonzero. These striking
differences are basically a reflection of the decay kinematics.
5 The electromagnetic decay μ− → e− γ violates the lepton family number and was never observed: Γ(μ− → e− γ)/Γ_tot < 5.7 · 10⁻¹³.
Fig. 6.18 Electron energy spectrum in β decay of thallium 206 (left) and in μ decay (right). Sources
F.A. Scott, Phys. Rev. 48 (1935) 391; ICARUS Collaboration (S. Amoruso et al.), Eur. Phys. J. C33
(2004) 233
In fact, using once more dimensional arguments, the decay width of these particles
should behave as

$$\Gamma \sim G_F^2\, E^5\,,$$

where E is the energy released in the decay. In the case of β decay,

$$E_n \sim m_n - m_p \sim 1.29\ \mathrm{MeV}\,,$$

while in muon decay the released energy is of the order of the muon mass, and therefore

$$E_n^5 \ll E_\mu^5\,.$$
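The dimensional estimate can be checked against the measured lifetimes quoted above (a crude sketch ignoring phase-space and nuclear corrections; variable names are ours):

```python
# Order-of-magnitude check of Γ ~ G_F² E⁵: the ratio of neutron to muon
# lifetimes should scale roughly as (E_mu/E_n)⁵. Phase-space factors and
# nuclear effects are ignored, so only the order of magnitude is meaningful.
E_n = 1.29    # MeV, energy released in neutron beta decay
E_mu = 105.7  # MeV, of the order of the muon mass

predicted_ratio = (E_mu / E_n) ** 5
observed_ratio = 900 / 2.2e-6  # tau_n / tau_mu

print(predicted_ratio)  # ~3.7e9
print(observed_ratio)   # ~4.1e8
```

The two agree within about an order of magnitude, which is all the E⁵ scaling can be expected to deliver.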
On the other hand, the shape of the electron energy spectrum at the endpoint is
determined by the available phase space. At the endpoint, the electron is aligned
against the other two decay products but, while in the β decay the proton is basically
at rest (or remains “imprisoned” inside the nucleus) and there is only one possible
configuration in the final state, in the case of μ decay, as neutrinos have negligible
mass, the number of endpoint configurations is quite large reflecting the different
ways to share the remaining energy between the neutrino and the anti-neutrino.
The conservation of parity (see Sect. 5.3.6) was a dogma for physicists until the
1950s. Then, a puzzle appeared: apparently two strange mesons (denominated as θ+
and τ+) had the same mass and the same lifetime but different parities according to their
decay modes: the θ+ decayed into two pions (final state with positive parity), while the
τ+ decayed into three pions (final state with negative parity).
At the 1956 Rochester conference, the conservation of parity in weak decays was
questioned by Feynman, reporting a suggestion of Martin Block. A few months later,
Lee and Yang reviewed all the past experimental data, found that there was
no evidence of parity conservation in weak interactions, and proposed new
experimental tests based on the measurement of observables depending on axial
vectors.
C.S. Wu (known as Madame Wu) was able, in a few months, to design and execute
a β decay experiment where nuclei of 60 Co (with total angular momentum J = 5)
decay into an excited state 60 Ni∗∗ (with total angular momentum J = 4):
$$^{60}\mathrm{Co} \to {}^{60}\mathrm{Ni}^{**}\; e^-\,\bar\nu_e\,.$$
The 60 Co was polarized (a strong magnetic field was used, and the temperatures were
as low as a few mK) and the number of decay electrons emitted in the direction (or
opposite to) of the polarization field was measured (Fig. 6.19). The observed angle
θ between the electron and the polarization direction followed a distribution of the
form:
N (θ) ∼ 1 − P β cos θ
Fig. 6.19 Conceptual (left) and schematic diagram (right) of the experimental apparatus used by
Wu et al. (1957) to detect the violation of the parity symmetry in β decay. The left plot comes from
Wikimedia commons; the right plot from the original article by Wu et al., Physical Review 105
(1957) 1413
where P is the degree of polarization of the nuclei and β is the speed of the electron
normalized to the speed of light.
The electrons were emitted preferentially in the direction opposite to the polar-
ization of the nuclei, thus violating parity conservation. In fact under a parity trans-
formation the momentum of the electron (a vector) reverses its direction while the
magnetic field (an axial vector) does not (Fig. 6.20). Pauli placed a bet: "I don't
believe that the Lord is a weak left-hander, and I am ready to bet a very high sum
that the experiment will give a symmetric angular distribution of electrons", and
lost.
The universality of the Fermi model of weak interactions was questioned long before
the Wu experiment. In fact in the original Fermi model, only β decays in which there
was no angular momentum change in the nuclei (Fermi transitions) were allowed,
while the existence of β decays where the spin of the nuclei changed by one unit
(the Gamow–Teller transitions) was already well established. The Fermi model had
to be generalized.
In the most general way, the currents involved in the weak interactions could be
written as a sum of Scalar (S), Pseudoscalar (P), Vector (V), Axial (A), or Tensor (T)
terms following the Dirac bilinear forms referred to in Sect. 6.2.4:

$$J_{1,2} = \sum_i C_i\left(\bar u_1\,\Gamma_i\,u_2\right)\,,$$

where the Γ_i are the corresponding Dirac bilinear operators and the C_i arbitrary constants.
Each vectorial current in the Fermi model is, in the (V-A) theory, replaced by a
vectorial minus an axial-vectorial current. For instance, the neutrino-electron vec-
torial current present in the β decay and in the muon decay amplitudes (Eqs. 6.67,
6.71, and Fig. 6.16, respectively):
$$\bar u_e\,\gamma_\mu\,u_{\nu_e} \qquad (6.74)$$
is replaced by
$$\bar u_e\,\gamma_\mu\left(1-\gamma^5\right)u_{\nu_e}\,. \qquad (6.75)$$
In terms of the Feynman diagrams, the factor associated to the vertex becomes
γ μ (1 − γ 5 ) . (6.76)
Within the (V-A) theory, the transition amplitude of the muon decay, which is a
golden example of a leptonic weak interaction, can be written as
$$\mathcal{M} = \frac{G_F}{\sqrt 2}\;\bar u_{\nu_\mu}\gamma^\mu\left(1-\gamma^5\right)u_\mu\;\;\bar u_e\,\gamma_\mu\left(1-\gamma^5\right)u_{\nu_e}\,. \qquad (6.77)$$
The factor √2 is just a convention to keep the same definition for G_F, and the only
relevant change in relation to the Fermi model is the replacement:

$$\gamma^\mu \to \gamma^\mu\left(1-\gamma^5\right)\,.$$
The muon lifetime can now be computed using the Fermi golden rule. This detailed
computation, which is beyond the scope of the present text, leads to:
$$\tau_\mu = \frac{192\,\pi^3}{G_F^2\, m_\mu^5}\,. \qquad (6.78)$$
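Eq. (6.78) can be checked numerically (our own sketch; ħ converts the width from GeV to seconds):

```python
import math

# Numerical check of Eq. (6.78), tau_mu = 192 π³ / (G_F² m_mu⁵),
# in natural units, converted to seconds with ħ = 6.582e-25 GeV·s.
G_F = 1.1664e-5   # GeV^-2
m_mu = 0.10566    # GeV
hbar = 6.582e-25  # GeV s

gamma = G_F**2 * m_mu**5 / (192 * math.pi**3)  # decay width in GeV
tau = hbar / gamma
print(tau)  # ~2.2e-6 s, in good agreement with the measured muon lifetime
```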
The transition amplitude for the β decay is written similarly, with two constants C_V and C_A:

$$\mathcal{M} = \frac{G_F^*}{\sqrt 2}\;\bar u_p\,\gamma^\mu\left(C_V - C_A\gamma^5\right)u_n\;\;\bar u_e\,\gamma_\mu\left(1-\gamma^5\right)u_{\nu_e}\,. \qquad (6.79)$$
The C V and C A constants reflect the fact that the neutron and the proton are not
point-like particles and thus form factors may lead to a change on their weak charges.
Experimentally, the measurement of many nuclear β decays is compatible with the
preservation of the value of the “vector weak charge” and a 25 % change in the axial
charge:
$$C_V = 1.000\,; \qquad C_A = 1.255 \pm 0.006\,.$$
The value of G ∗F was found to be slightly lower (2 %) than the one found from the
muon decay. This “discrepancy” was cured with the introduction of the Cabibbo
angle as will be discussed in Sect. 6.3.6.
Chiral left and right spinor states are defined through the projections

$$u_L = \frac{1-\gamma^5}{2}\,u\;; \qquad u_R = \frac{1+\gamma^5}{2}\,u \qquad (6.80)$$

with u = u_L + u_R. The adjoint spinors are given by

$$\bar u_L = \bar u\,\frac{1+\gamma^5}{2}\;; \qquad \bar u_R = \bar u\,\frac{1-\gamma^5}{2}\,. \qquad (6.81)$$

For antiparticles,

$$v_L = \frac{1+\gamma^5}{2}\,v\;; \qquad v_R = \frac{1-\gamma^5}{2}\,v \qquad (6.82)$$

$$\bar v_L = \bar v\,\frac{1-\gamma^5}{2}\;; \qquad \bar v_R = \bar v\,\frac{1+\gamma^5}{2}\,. \qquad (6.83)$$
Chiral states are closely related to helicity states but they are not identical. In fact,
applying the chiral projection operators defined above to the helicity eigenstates
(Sect. 6.2.4) one obtains, for instance, for the right helicity eigenstate:
$$u_\uparrow = \left[\frac{1}{2}\left(1-\gamma^5\right) + \frac{1}{2}\left(1+\gamma^5\right)\right]u_\uparrow$$

$$u_\uparrow = \frac{1}{2}\left(1 + \frac{p}{E+m}\right)u_R + \frac{1}{2}\left(1 - \frac{p}{E+m}\right)u_L\,. \qquad (6.84)$$
The weak (V-A) neutrino-electron current (Eq. 6.75) can then be written as:

$$\bar u_e\,\gamma_\mu\left(1-\gamma^5\right)u_{\nu_e} = 2\;\bar u_e\,\frac{1+\gamma^5}{2}\,\gamma_\mu\,\frac{1-\gamma^5}{2}\,u_{\nu_e} = 2\;\bar u_{eL}\,\gamma_\mu\,u_{\nu_e L}\,; \qquad (6.85)$$

the weak charged leptonic current thus involves only left chiral particles (and right
chiral anti-particles).
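The projector algebra behind Eqs. (6.80)–(6.83) can be verified explicitly; a sketch in the Dirac representation of γ⁵ (assuming NumPy is available):

```python
import numpy as np

# Sketch: chiral projectors P_L = (1 - γ⁵)/2, P_R = (1 + γ⁵)/2 in the
# Dirac representation, where γ⁵ has off-diagonal 2x2 identity blocks.
I2 = np.eye(2)
Z2 = np.zeros((2, 2))
gamma5 = np.block([[Z2, I2], [I2, Z2]])

P_L = (np.eye(4) - gamma5) / 2
P_R = (np.eye(4) + gamma5) / 2

# Projector algebra: idempotent, mutually orthogonal, complete
assert np.allclose(P_L @ P_L, P_L)
assert np.allclose(P_R @ P_R, P_R)
assert np.allclose(P_L @ P_R, np.zeros((4, 4)))
assert np.allclose(P_L + P_R, np.eye(4))
print("u = u_L + u_R holds for any spinor u")
```

Completeness (P_L + P_R = 1) is exactly the statement u = u_L + u_R below Eq. (6.80).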
In the case of the 60 Co β decay (the Wu experiment), the electron and anti-neutrino
masses can be neglected and so the anti-neutrino must have right helicity and the
electron left helicity. Thus, as the electron and anti-neutrino have to add up their
spins to compensate the change by one unit in the spin of the nucleus, the electron
is preferentially emitted in the direction opposite to the polarization of the nucleus
(Fig. 6.21).
The confirmation of the negative helicity of neutrinos came from a sophisticated
and elegant experiment by M. Goldhaber, L. Grodzins and A. Sunyar in 1957, studying
neutrinos produced in a K capture process (e− p → n νe). The helicity of the
emitted neutrino was translated into the longitudinal polarization of a photon resulting
from the de-excitation of a nucleus produced in the K capture:

$$^{152}\mathrm{Eu}\; e^- \to {}^{152}\mathrm{Sm}^*\,\nu_e$$

$$^{152}\mathrm{Sm}^* \to {}^{152}\mathrm{Sm}\;\gamma\,.$$
Consider now the leptonic decays of the charged pion, π− → e− ν̄e and π− → μ− ν̄μ.
In the framework of the (V-A) theory, if the charged leptons were massless these weak
decays would be forbidden. In fact, the pion has spin 0 and the anti-neutrino is a
right-handed particle; thus, to conserve angular momentum, the helicity of the electron
should be positive (Fig. 6.22), which is impossible for a massless left electron.
However, the suppression of the decay into an electron with respect to the decay into
a muon, contrary to what would be expected from the available decay phase space,
is not a proof of the (V-A) theory. It can be shown that a theory with V or A couplings
(or any combination of them) would also imply a suppression factor of the order of
m_e²/m_μ².
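The size of this suppression can be quantified with the standard two-body V-A formula for Γ(π → ℓν), which is not derived in the text; taking it as given:

```python
# Helicity suppression of π⁻ → e⁻ ν̄ relative to π⁻ → μ⁻ ν̄: the standard
# V-A prediction is Γ_e/Γ_μ = (m_e/m_μ)² [(m_π² - m_e²)/(m_π² - m_μ²)]²,
# combining the m_e²/m_μ² suppression with the two-body phase space.
m_pi, m_mu, m_e = 139.57, 105.66, 0.511  # MeV

ratio = (m_e / m_mu) ** 2 * ((m_pi**2 - m_e**2) / (m_pi**2 - m_mu**2)) ** 2
print(ratio)  # ~1.28e-4: the electron channel is strongly suppressed
```

Despite the much larger phase space available to the electron, the decay to the muon dominates by roughly four orders of magnitude.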
As a last example, the neutrino and anti-neutrino handedness is revealed in the
observed ratio of cross sections for neutrino and anti-neutrino in isoscalar nuclei
(with an equal number of protons and neutrons) N at GeV energies:
$$\frac{\sigma\left(\bar\nu_\mu\, N \to \mu^+\, X\right)}{\sigma\left(\nu_\mu\, N \to \mu^-\, X\right)} \sim \frac{1}{3}\,.$$
Note that at these energies, the neutrinos and the anti-neutrinos interact directly with
the quarks and antiquarks the protons and neutrons are made of (similarly to the
electrons in the deep inelastic scattering discussed in Sect. 5.5.3).
Four-fermion interaction theories (like the Fermi model, see Sect. 6.3.1) violate
unitarity at high energy and are not renormalizable (not all infinities can be absorbed
into running physical constants, see Sect. 6.2.9). The path to solve such a problem was
to construct, in analogy with QED, a gauge theory of weak interactions, which led
to the introduction of intermediate vector bosons with spin 1: the W± and the Z.
However, in order to model the short range of the weak interactions, such bosons
could not have zero mass, and thus would violate the gauge symmetry. The problem
was solved by the introduction of spontaneously broken symmetries, which then led
to the prediction of the existence of the so-called Higgs boson.
In this section, the modifications introduced in the structure of the charged weak
currents, as well as the discovery of the neutral currents and of the W± and Z
bosons, will be briefly reviewed. The overall discussion on the electroweak unification
and its experimental tests will be the object of the next chapter.
The structure of the weak charged and of the electromagnetic interactions became
similar with the introduction of the W ± bosons, with the relevant difference that
weak-charged interactions couple left-handed fermions (right-handed antifermions)
belonging to SU(2) doublets, while electromagnetic interactions couple fermions
belonging to U(1) singlets irrespective of chirality.
The muon decay amplitude deduced in (V-A) theory (Eq. 6.77) can now, intro-
ducing the massive W ± propagator (Fig. 6.24), be written as:
$$\mathcal{M} = \frac{g_W^2}{8}\;\bar u_{\nu_\mu}\gamma^\mu\left(1-\gamma^5\right)u_\mu\left[\frac{-i\left(g_{\mu\nu} - q_\mu q_\nu/M_W^2\right)}{q^2 - M_W^2}\right]\bar u_e\,\gamma^\nu\left(1-\gamma^5\right)u_{\nu_e}\,.$$
The derivation of the expression of the propagator for a massive spin 1 boson is based
on the Proca equation (Sect. 6.2.1) and is beyond the scope of the present text. But,
as the second term in the numerator can be neglected, a Yukawa-type expression,
g_{μν}/(q² − M_W²), is recovered.
In the low-energy limit, q² ≪ M_W², the two coupling constants are thus related by:

$$G_F = \frac{\sqrt 2\, g_W^2}{8\, M_W^2}\,.$$
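Inverting this relation gives a feel for the size of g_W (our own sketch, with present-day values of G_F and M_W): the weak coupling is actually larger than the electromagnetic one, and the low-energy weakness comes from the large W mass.

```python
import math

# Inverting G_F = √2 g_W² / (8 M_W²) to estimate the weak coupling:
G_F = 1.166e-5  # GeV^-2
M_W = 80.4      # GeV

g_W = math.sqrt(8 * M_W**2 * G_F / math.sqrt(2))
alpha_W = g_W**2 / (4 * math.pi)
print(g_W)      # ~0.65
print(alpha_W)  # ~1/30, larger than α ~ 1/137
```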
Neutral weak currents were predicted long before their discovery at CERN in 1973
(N. Kemmer 1937, O. Klein 1938, S. A. Bludman 1958). Indeed the SU(2) structure
of charged interactions (leptons organized in weak isospin doublets) suggested the
existence of a triplet of weak bosons similarly to the pion triplet responsible for the
proton–neutron strong isospin rotations.
However, if the charged components were the W±, the neutral one could not
be the γ, which has no weak charge. Furthermore, in the 1960s it was discovered
that strangeness-changing neutral currents (for instance K + → π + ν ν) were highly
suppressed and thus some thought that neutral weak interactions may not exist. Many
theorists however became enthusiastic about neutral currents around the 1970s since
they were embedded in the work by Glashow, Salam, and Weinberg on electroweak
unification (see Sect. 7.2). From the experimental point of view, it was clearly a very
difficult issue, and the previous experimental searches for neutral weak processes led
just to upper limits.
Neutrino beams were the key to such searches. In fact, as neutrinos do not have
electromagnetic and strong charges, their only possible interaction is the weak one.
Neutrino beams are produced in the laboratory (Fig. 6.25) by the decay of secondary
pions and kaons coming from a primary high-energy proton interaction on a fixed
target. The charge and the momentum range of the pions and kaons can be selected
using a sophisticated focusing magnetic optics system (narrow-band beam) or just
loosely selected maximizing the beam intensity (wide-band beam). The energy spec-
tra of such beams are quite different (Fig. 6.26). While the narrow-band beam has an
almost flat energy spectrum, the wide band is normally peaked at low energies.
In the 1960s a large heavy liquid bubble chamber (18 tons of freon under a
pressure of 10–15 atmospheres, within a magnetic field of 2 T) called Gargamelle
was designed at CERN. Gargamelle could collect a significant number (one order of
Fig. 6.25 Neutrino narrow-band beam (top) and wide-band beam (bottom) production
Fig. 6.26 Narrow-band and wide-band neutrino energy spectra. The y axis represents the number
of particles per bunch
magnitude above the previous experiments) of neutrino interactions (Fig. 6.27). Its
first physics priority was, in the beginning of the 1970s, the test of the structure of
the protons and neutrons just revealed in the deep inelastic scattering experiment at
SLAC (Sect. 5.5.3).
In a batch of about 700,000 photos of neutrino interactions, one event emerged
as anomalous. In that photo (Fig. 6.28), taken with an anti-neutrino beam, just an
electron was visible (giving rise to a small electromagnetic cascade). This event is a
perfect candidate for a ν μ e− → ν μ e− interaction (Fig. 6.29). The background in
the anti-neutrino beam was estimated to be negligible.
Fig. 6.28 Gargamelle picture (top) and sketch (bottom) of the first observed neutral current process
ν μ e− → ν μ e− . A muon anti-neutrino coming from the left knocks an electron forward, creating
a small shower of electron–positron pairs. Source CERN
If neutral current interactions did exist, they should be even more visible in the
semileptonic channel. The signature should be clear: in charged semileptonic weak
interactions, an isolated muon and several hadrons should be produced in the final
state, while in interactions mediated by the neutral current there would be no
muon (Fig. 6.30).
Fig. 6.30 First-order Feynman diagrams for the charged (left) and neutral (right) semileptonic
weak interactions
Neutral currents did exist, and the GSW model proposed a complete and unified
framework for electroweak interactions: the intermediate vector bosons should be
there (with expected masses around 65 and 80 GeV for the W ± and the Z , respec-
tively, based on the data known at that time). They had to be found.
In 1976 Carlo Rubbia pushed the idea of converting the existing Super Proton Synchrotron
accelerator at CERN (or the equivalent machine at Fermilab) into a proton/antiproton
collider. It was not necessary to build a new accelerator (protons and
antiprotons would travel in opposite directions within the same vacuum tube), but
antiprotons had to be produced and kept alive for many hours to be accumulated
in an auxiliary storage ring. Another big challenge was to keep the beam focused.
Simon van der Meer made this possible by developing an ingenious strategy of beam
cooling, to decrease the angular dispersion while maintaining monochromaticity. At the
beginning of the 1980s, the CERN SPS collider, operating at a center-of-mass energy
of 540 GeV, was able to produce the first W± and Z (Fig. 6.31) by quark/antiquark
annihilation (u ū → Z; d d̄ → Z; u d̄ → W+; d ū → W−).
The leptonic decay channels with electrons and muons in the final state were the
most obvious signatures to detect the long-awaited bosons. The hadronic decay channels,
as well as final states with tau leptons, suffer from a huge hadronic background due
to the "normal" quark and gluon strong interactions. Priority was then given to
searches in the channels:

$$p\,\bar p \to Z\, X \to e^-e^+\, X$$
$$p\,\bar p \to Z\, X \to \mu^-\mu^+\, X$$

and

$$p\,\bar p \to W^\pm\, X \to e^\pm\,\nu_e\, X$$
$$p\,\bar p \to W^\pm\, X \to \mu^\pm\,\nu_\mu\, X\,.$$
Two general purpose experiments, UA1 and UA2, were built having the usual
"onion" structure (a tracking detector surrounded by electromagnetic and hadronic
calorimeters, surrounded by an exterior layer of muon detectors). In the case of
UA1, the central detector (tracking and electromagnetic calorimeter) was immersed
in a 0.7 T magnetic field, perpendicular to the beam line, produced by a magnetic
coil (Fig. 6.32); the iron return yoke of the field was instrumented to operate as a
hadronic calorimeter. UA1 was designed to be as hermetic as possible.
The first W ± and Z events were recorded in 1983. Z → e− e+ events were
characterized by two isolated high-energy deposits in the cells of the electromagnetic
calorimeter (Fig. 6.33, left), while W± X → e± νe events were characterized by one
isolated high-energy deposit in the cells of the electromagnetic calorimeter and
significant missing transverse energy (Fig. 6.33, right).
The Z mass in this type of events can be reconstructed just computing the invariant
mass of the final state electron and positron:
$$m_Z^2 = (E_1 + E_2)^2 - \left(\vec P_1 + \vec P_2\right)^2$$
Fig. 6.33 Left Two high-energy deposits from a Z → e− e+ event seen in the electromagnetic
calorimeter of the UA2 experiment. Right One high-energy deposit and the missing transverse
momentum from a W ± X → e± νe event. From http://cern-discoveries.web.cern.ch
Neglecting the electron mass,

$$m_Z^2 \cong 4\,E_1 E_2\,\sin^2(\alpha/2)\,,$$

where α is the opening angle between the two electrons,
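A minimal sketch of this reconstruction (electron masses neglected so |p| = E; the kinematic values below are invented for illustration):

```python
import math

def inv_mass(E1, p1, E2, p2):
    """Invariant mass from two measured four-vectors,
    m² = (E1+E2)² - |p1+p2|² (energies in GeV, momenta as 3-vectors)."""
    psum = [a + b for a, b in zip(p1, p2)]
    m2 = (E1 + E2) ** 2 - sum(c * c for c in psum)
    return math.sqrt(max(m2, 0.0))

# Hypothetical nearly back-to-back e+e- pair from a Z decay (electron
# mass neglected, so |p| = E):
E1, E2 = 45.0, 46.6
theta = 2.96  # opening angle in radians
p1 = [E1, 0.0, 0.0]
p2 = [E2 * math.cos(theta), E2 * math.sin(theta), 0.0]
print(inv_mass(E1, p1, E2, p2))  # ~91 GeV for this configuration
```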
and

$$\frac{d\cos\theta^*}{dP_T} = \frac{4P_T/m_W^2}{\sqrt{1 - 4P_T^2/m_W^2}}\,,$$

so that

$$\frac{d\sigma}{dP_T} = \frac{d\sigma}{d\cos\theta^*}\,\frac{d\cos\theta^*}{dP_T}\,,$$

which has a characteristic (Jacobian) peak at P_T = m_W/2.
The measured value of m_W by UA1 and UA2 was, respectively, m_W = (82.7 ±
1.0 ± 2.7) GeV and m_W = (80.2 ± 0.6 ± 0.5) GeV; the present world average is
(80.385 ± 0.015) GeV.
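For illustration, the two measurements can be combined by an inverse-variance weighted mean, adding statistical and systematic uncertainties in quadrature (a naive combination that ignores correlations):

```python
import math

# Combining the UA1 and UA2 m_W measurements quoted in the text,
# adding stat and syst errors in quadrature for each experiment.
measurements = [(82.7, 1.0, 2.7), (80.2, 0.6, 0.5)]  # (value, stat, syst) in GeV

weights = [1.0 / (stat**2 + syst**2) for _, stat, syst in measurements]
mean = sum(w * v for w, (v, _, _) in zip(weights, measurements)) / sum(weights)
err = 1.0 / math.sqrt(sum(weights))
print(mean, err)  # ~80.4 ± 0.8 GeV, consistent with the world average
```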
Finally the V-A character of the charged weak interactions, as well as the fact that
the W has spin 1, is revealed by the differential cross section as a function of cos θ∗ for
the electron produced in the W semileptonic decay, which displays a (1 + cos θ∗ )2
dependence (Fig. 6.36).
In fact, at CERN collider energies, neglecting the masses of quarks and leptons,
and considering that the W± are mainly produced by the interaction of valence quarks
(from the proton) and valence antiquarks (from the antiproton), the third component
of the spin of the W± is aligned along the antiproton beam direction, and thus the
electron (positron) is emitted preferentially in the proton (antiproton) beam direction
(Fig. 6.37).
The universality of weak interactions, established at the end of the 1940s (see Sect. 6.3.1),
was questioned when it was discovered that some strange particle decays (as for
instance K− → μ− ν̄μ or Λ → p e− ν̄e) were suppressed by a factor of around 20 in
relation to what was expected.
The problem was solved in 1963 by Nicola Cabibbo,6 who suggested that the
quark weak and strong eigenstates may not be the same. At that time only the u, d,
and s quarks were known (Sect. 5.7.2), and Cabibbo conjectured that the two quarks
with electromagnetic charge −1/3 (d and s) mix into a weak eigenstate d′ such that

$$d' = d\,\cos\theta_c + s\,\sin\theta_c\,.$$
6 Nicola Cabibbo (1935–2010) was a professor in Rome, and president of the Italian Institute for
Nuclear Physics (INFN). He gave fundamental contributions to the development of the standard
model.
Fig. 6.38 Weak decay couplings: leptonic (top), and semileptonic without (middle) and
with (bottom) strange quarks involved
Fig. 6.39 Possible s and d quark transitions generated by Z (top) and W± (bottom) couplings
(three families)
Fig. 6.42 The two orthogonal combinations of the quarks s and d in the d′ and s′ states
In 1970, Glashow, Iliopoulos, and Maiani proposed the existence of a
fourth quark, the charm c, to symmetrize the weak currents, organizing the quarks
into two SU(2) doublets. Such a scheme, known as the GIM mechanism, solves the
FCNC puzzle and was spectacularly confirmed with the discovery of the J/ψ meson
(see Sect. 5.7.2). FCNC are now suppressed by the cancellation of the two lowest
diagrams in Fig. 6.41; in fact, in the limit of equal masses the cancellation would be
perfect. As the c mass is much higher than the u mass, the sum of the diagrams leads
to terms proportional to m²c/m²_{Z,W}.
There are now two orthogonal combinations of the quarks s and d (Fig. 6.42):

d′ = d cos θc + s sin θc
s′ = −d sin θc + s cos θc
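As a numerical aside, the rotation above can be sketched in a few lines of code; this is an illustration only, with the angle fixed by the assumption sin θc ≈ 0.225 (close to the measured value):

```python
import numpy as np

# Illustrative sketch: the Cabibbo rotation between (d, s) and (d', s').
# The angle below is an assumption, sin(theta_c) ~ 0.225.
theta_c = np.arcsin(0.225)

# Rotation matrix relating the weak eigenstates (d', s') to the mass eigenstates (d, s)
R = np.array([[ np.cos(theta_c), np.sin(theta_c)],
              [-np.sin(theta_c), np.cos(theta_c)]])

# The rotation is orthogonal: the weak eigenstates stay normalized and orthogonal
assert np.allclose(R @ R.T, np.eye(2))

# Decay rates scale with the squared couplings (cos^2 for u-d, sin^2 for u-s);
# their ratio reproduces the ~1/20 suppression of strange-particle decays.
suppression = np.tan(theta_c)**2
print(f"tan^2(theta_c) = {suppression:.3f}")  # ~0.053, i.e. roughly 1/20
```

The rotation leaves the total weak coupling strength unchanged; it only redistributes it between the d and s quarks.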
in such a way that, for example, the square of the coupling of the b quark to the u
quark in the weak transition (which is in turn proportional to the probability of the
transition) can be written as
|g_ub|² = |V_ub|² g²_W .
The Japanese physicists Makoto Kobayashi and Toshihide Maskawa proposed this
form of the quark mixing matrix in 1973. Their work built on that of Cabibbo,
extending the concept of quark mixing from two to three generations of quarks; at
that time the third generation had not yet been observed but, as we shall see, the
extension to three families allows explaining the violation of the CP symmetry, i.e.,
of the product of the operations of charge conjugation and parity. In 2008, Kobayashi
and Maskawa shared one half of the Nobel Prize in Physics “for the discovery of the
origin of the broken symmetry which predicts the existence of at least three families
of quarks in nature”.
A priori, the V_ij being complex numbers, the CKM matrix might have 2N²
degrees of freedom; however, physical constraints reduce the free elements to
(N − 1)². The physical constraints are
• Unitarity. If there are only three quark families, one must have
V †V = I (6.86)
where I is the identity matrix. This will guarantee that in an effective transition
each u-type quark will transform into one of the three d-type quarks (i.e., that the
current is conserved and no fourth generation is present). This constraint reduces
the number of degrees of freedom to N²; the six equations can be
written explicitly as

Σₖ |V_ik|² = 1 (i = 1, 2, 3)   (6.87)

Σₖ V_ik V*_jk = 0 (i ≠ j) .   (6.88)

This last equation is a constraint on three sets of three complex numbers, telling
that these numbers form the sides of a triangle in the complex plane. There are
three independent choices of i and j, and hence three independent triangles; they
are called unitarity triangles, and we shall discuss them later in more detail.
• Phase invariance. 2N − 1 of these parameters leave physics invariant, since one
phase can be absorbed into each quark field, and an overall common phase is
unobservable. Hence, the total number of free variables is N 2 − (2N − 1) =
(N − 1)2 .
Four independent parameters are thus required to fully define the CKM matrix
(N = 3). This implies that the most general 3 × 3 unitary matrix cannot be con-
structed using real numbers only: Eq. (6.86) implies that a real matrix has only three
degrees of freedom, and thus at least one imaginary parameter is required.
Many parameterizations have been proposed in the literature. An exact parame-
trization derived from the original work by Kobayashi and Maskawa (K M) extends
the concept of Cabibbo angle; it uses three angles θ12 , θ13 , θ23 , and a phase δ:
       ⎛ c12 c13                        s12 c13                        s13 e^{−iδ} ⎞
VKM =  ⎜ −s12 c23 − c12 s23 s13 e^{iδ}  c12 c23 − s12 s23 s13 e^{iδ}   s23 c13    ⎟ ,   (6.89)
       ⎝ s12 s23 − c12 c23 s13 e^{iδ}   −c12 s23 − s12 c23 s13 e^{iδ}  c23 c13    ⎠

with the standard notations s_ij = sin θ_ij (θ12 is the Cabibbo angle) and c_ij = cos θ_ij.
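A short numerical check (a sketch; the angle and phase values below are illustrative, of the same order as the fitted ones) shows that the parametrization (6.89) is exactly unitary for any choice of the four parameters:

```python
import numpy as np

# Illustrative angles (radians) and phase; these are assumptions, not fitted values.
t12, t13, t23, delta = 0.2276, 0.0037, 0.0422, 1.21
s12, s13, s23 = np.sin([t12, t13, t23])
c12, c13, c23 = np.cos([t12, t13, t23])
e = np.exp(1j * delta)

# The KM parametrization, Eq. (6.89)
V = np.array([
    [c12*c13,                   s12*c13,                  s13*np.conj(e)],
    [-s12*c23 - c12*s23*s13*e,  c12*c23 - s12*s23*s13*e,  s23*c13],
    [s12*s23 - c12*c23*s13*e,  -c12*s23 - s12*c23*s13*e,  c23*c13],
])

# Unitarity V†V = I holds exactly, by construction
assert np.allclose(V.conj().T @ V, np.eye(3), atol=1e-12)
print(np.abs(V[0]).round(3))  # first row: |V_ud|, |V_us|, |V_ub|
```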
Another frequently used parametrization of the CKM matrix is the so-called
Wolfenstein parametrization. It refers to four free parameters λ, A, ρ, and η, defined
as
λ = s12 = |V_us| / √(|V_us|² + |V_ud|²)

A = s23 / λ²

s13 e^{iδ} = Aλ³(ρ + iη)
(λ is the sine of the Cabibbo angle). We can use the experimental fact that s13 ≪
s23 ≪ s12 ≪ 1 and expand the matrix in powers of λ. We obtain at order λ⁴:
      ⎛ 1 − λ²/2         λ          Aλ³(ρ − iη) ⎞
VW ≃  ⎜ −λ               1 − λ²/2   Aλ²         ⎟ .   (6.90)
      ⎝ Aλ³(1 − ρ − iη)  −Aλ²       1           ⎠
The measured values of the four parameters are:

λ = 0.22535 ± 0.00065
A = 0.817 ± 0.015
ρ̄ = ρ(1 − λ²/2) = 0.136 ± 0.018
η̄ = η(1 − λ²/2) = 0.348 ± 0.014 .
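A numerical sketch using the quoted central values shows that the truncated Wolfenstein matrix (6.90) is unitary only up to the neglected O(λ⁴) terms:

```python
import numpy as np

# Central values quoted above; rho and eta are recovered from rho-bar, eta-bar.
lam, A = 0.22535, 0.817
rho = 0.136 / (1 - lam**2 / 2)
eta = 0.348 / (1 - lam**2 / 2)

# Wolfenstein matrix, Eq. (6.90), valid to order lambda^4
V = np.array([
    [1 - lam**2/2,                 lam,           A*lam**3*(rho - 1j*eta)],
    [-lam,                         1 - lam**2/2,  A*lam**2],
    [A*lam**3*(1 - rho - 1j*eta), -A*lam**2,      1],
])

dev = np.max(np.abs(V.conj().T @ V - np.eye(3)))
print(f"max deviation from unitarity: {dev:.1e}")  # a few times 1e-3: O(lambda^4)
```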
6.3.8 CP Violation
Weak interactions violate the parity and the charge conjugation symmetries. But, for
a while, it was thought that the combined action of charge and parity transformation
(CP) would restore the harmony physicists like so much. Indeed, a left-handed
neutrino transforms under CP into a right-handed antineutrino, and the CP-conjugate
world still obeys the V-A theory. However, surprisingly, the study of the K⁰−K̄⁰
system revealed in 1964 a small violation of the CP symmetry. Around the turn of the
century, CP violation was also observed in many channels in the B sector. Since then,
intense theoretical and experimental work has been devoted to the interpretation
of these effects in the framework of the Standard Model, in particular through
the precise determination of the parameters of the CKM matrix and tests of its
self-consistency.
6.3.8.1 K⁰−K̄⁰ Mixing
Already in 1955, Gell-Mann and Pais had observed that the K⁰ (ds̄) and the K̄⁰ (sd̄),
which are eigenstates of the strong interaction, could mix through weak box diagrams
such as those represented in Fig. 6.43.
A pure K⁰ (K̄⁰) beam will thus develop a K̄⁰ (K⁰) component, and at each time a
linear combination of the K⁰ and of the K̄⁰ may be observed. Since CP is conserved
to a good approximation, it is convenient to introduce the CP eigenstates |K₁⟩
(CP = +1), which can decay into two pions, and |K₂⟩ (CP = −1), which cannot.
Fig. 6.43 Leading box diagrams for the K⁰−K̄⁰ mixing
The short- and the long-lifetime states are usually designated by K-short (K_S) and K-
long (K_L), respectively. These states are eigenstates of the free-particle Hamiltonian,
which includes weak mixing terms, and if CP were a perfect symmetry, they would
coincide with |K₁⟩ and |K₂⟩, respectively. The K_S and K_L wavefunctions evolve
with time, respectively, as

|K_S(t)⟩ = |K_S⟩ e^{−i m_S t} e^{−Γ_S t/2} ;  |K_L(t)⟩ = |K_L⟩ e^{−i m_L t} e^{−Γ_L t/2} ,

where m_S (m_L) and Γ_S (Γ_L) are, respectively, the mass and the width of the K_S (K_L)
mesons (see Sect. 2.6).
K⁰ and K̄⁰, being combinations of K_S and K_L, will also evolve in time. Indeed,
considering an initially pure beam of K⁰ with an energy of a few GeV, after just
a few tens of cm the large majority of the K_S mesons will have decayed and the beam
will have become a pure K_L beam. The probability to find a K⁰ in this beam after a time t
can be expressed as:
P_{K⁰→K⁰}(t) = |(1/√2) (⟨K⁰|K_S(t)⟩ + ⟨K⁰|K_L(t)⟩)|²
             = (1/4) [e^{−Γ_S t} + e^{−Γ_L t} + 2 e^{−(Γ_S+Γ_L)t/2} cos(Δm t)]   (6.95)
where Γ_S = 1/τ_S, Γ_L = 1/τ_L, and Δm is the difference between the masses of the
two eigenstates. The last term, coming from the interference of the two amplitudes,
provides a direct measurement of Δm.
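The damped oscillation of Eq. (6.95) can be sketched numerically; the lifetimes and Δm below are approximate values, assumed here for illustration only:

```python
import numpy as np

# Approximate values (assumptions): tau_S ~ 0.90e-10 s, tau_L ~ 5.1e-8 s,
# and delta_m ~ 5.3e9 hbar/s for the K_L - K_S mass difference.
tau_S, tau_L, dm = 0.90e-10, 5.1e-8, 5.29e9
G_S, G_L = 1.0/tau_S, 1.0/tau_L

def p_k0(t):
    """Probability, Eq. (6.95), that an initial K0 is observed as a K0 at time t."""
    return 0.25*(np.exp(-G_S*t) + np.exp(-G_L*t)
                 + 2.0*np.exp(-(G_S + G_L)*t/2)*np.cos(dm*t))

print(p_k0(0.0))                 # 1.0: the beam starts as a pure K0
print(round(p_k0(20*tau_S), 3))  # close to 0.25: the oscillation has damped out
```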
In the limit Γ_S → 0, Γ_L → 0, a pure flavor oscillation between K⁰ and K̄⁰
would occur:

P_{K⁰→K⁰}(t) = (1/2) (1 + cos(Δm t)) = cos²(Δm t/2) .
In the real case, however, the oscillation is damped and the survival probability
converges quickly to a value close to 1/4. Measuring the initial oscillation through
the study of semileptonic decays, which will be discussed later in this section,
Δm was determined to be Δm = m_L − m_S ≃ 3.5 × 10⁻¹² MeV:
K_S and K_L have quite different lifetimes but almost the same mass.
In 1964, Christenson, Cronin, Fitch and Turlay7 performed the historical experiment
(Fig. 6.44) that revealed for the first time the existence of a small fraction of two-pion
decays in a K_L beam:
7 The Nobel Prize in Physics 1980 was awarded to James ("Jim") Cronin and Val Fitch "for the
discovery of violations of fundamental symmetry principles in the decay of neutral K-mesons".
Cronin (Chicago, 1931) received his Ph.D. from the University of Chicago in 1955. He then worked
at Brookhaven National Laboratory, became a professor at Princeton University in 1958, and finally
moved to Chicago. Later he turned to astroparticle physics, being, with Alan Watson, the founder of the
Auger cosmic ray observatory. Fitch (Merriman, Nebraska, 1923–Princeton, 2015) was initially interested
in chemistry and switched to physics in the mid-1940s when he participated in the Manhattan
Project. He received his Ph.D. in physics from Columbia University in 1954 and later moved to Princeton.
Fig. 6.44 Layout of the Christenson, Cronin, Fitch and Turlay experiment that demonstrated the
existence of the decay K_L → π⁺π⁻. © The Nobel Foundation
R = Γ(K_L → π⁺π⁻) / Γ(K_L → all charged modes) = (2.0 ± 0.4) × 10⁻³ .
The K_L beam was produced in a primary target placed 17.5 m upstream of the
experiment, and the observed decays occurred in a volume of He gas to minimize
interactions. Two spectrometers, each composed of two spark chambers separated
by a magnet and terminated by a scintillator and a water Cherenkov counter, measured
and identified the charged decay products.
The presence of two-pion decay modes implied that the long-lived K L was not a
pure eigenstate of CP. The K L should then have a small component of K 1 and so it
could be expressed as:
|K_L⟩ = (1/√(1+|ε|²)) (|K₂⟩ + ε|K₁⟩)   (6.96)
where ε is a small complex parameter whose phase φ_ε depends on the phase con-
vention chosen for the CP action on K⁰ and K̄⁰. With the present phase convention,

φ_ε ≃ tan⁻¹(2Δm/ΔΓ) ,

where Δm and ΔΓ are, respectively, the differences between the masses and the
decay widths of the two eigenstates.
Alternatively, K_L and K_S can be expressed as functions of the flavor eigenstates
K⁰ and K̄⁰ as
|K_L⟩ = (1/√(2(1+|ε|²))) [(1+ε)|K⁰⟩ − (1−ε)|K̄⁰⟩]   (6.97)

|K_S⟩ = (1/√(2(1+|ε|²))) [(1+ε)|K⁰⟩ + (1−ε)|K̄⁰⟩]   (6.98)
The probability that a state initially produced as a pure K⁰ or K̄⁰ will decay into
a 2π system will evolve in time. A "2π asymmetry" can thus be defined as

A_±(t) = [Γ(K̄⁰ → π⁺π⁻) − Γ(K⁰ → π⁺π⁻)] / [Γ(K̄⁰ → π⁺π⁻) + Γ(K⁰ → π⁺π⁻)] .   (6.101)
6.3.8.3 CP Violation in Semileptonic K⁰, K̄⁰ Decays

K⁰ and K̄⁰ may also decay semileptonically through the channels:

K⁰ → π⁻e⁺νe ;  K̄⁰ → π⁺e⁻ν̄e
and thus CP violation can also be tested by measuring the charge asymmetry A_L:

A_L = [Γ(K_L → π⁻l⁺ν) − Γ(K_L → π⁺l⁻ν̄)] / [Γ(K_L → π⁻l⁺ν) + Γ(K_L → π⁺l⁻ν̄)] .

Since the π⁻l⁺ν final state can come only from the K⁰ component and the π⁺l⁻ν̄
final state only from the K̄⁰ component,

A_L = (|1+ε|² − |1−ε|²) / (|1+ε|² + |1−ε|²) ≈ 2 Re(ε) .
The measured value of A_L is positive and in good agreement with the measurement
of ε obtained from the 2π decay modes. The number of K_L having an electron in
their decay products is slightly smaller (by 0.66 %) than the number of K_L having
a positron in their decay products. There is thus an unambiguous way to define what
is matter and what is antimatter.
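The chain from ε to the quoted 0.66 % excess can be checked numerically; the magnitude and phase of ε below are assumed, close to the measured values:

```python
import numpy as np

# Assumed values: |eps| ~ 2.23e-3 with phase ~ 43.5 degrees
eps = 2.23e-3 * np.exp(1j * np.radians(43.5))

num = abs(1 + eps)**2 - abs(1 - eps)**2
den = abs(1 + eps)**2 + abs(1 - eps)**2
A_L = num / den

print(f"A_L       = {A_L:.2e}")         # ~3.2e-3
print(f"2 Re(eps) = {2*eps.real:.2e}")  # nearly identical, as in the text
# The relative excess of positrons over electrons is 2*A_L/(1 - A_L) ~ 0.65 %
```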
|K_S⟩ = (1/√(1+|ε|²)) (|K₁⟩ + ε|K₂⟩)

|K_L⟩ = (1/√(1+|ε|²)) (|K₂⟩ + ε|K₁⟩) .
In this context, the decays of K_S and K_L into 2π modes are only due to the
presence in both states of a K₁ component. It is then expected that the ratio of the
decay amplitudes of the K_L and of the K_S into 2π modes should be equal to ε and
independent of the charges of the two pions:

η = A(K_L → ππ) / A(K_S → ππ) = ε .
and

η⁰⁰ = A(K_L → π⁰π⁰) / A(K_S → π⁰π⁰)

(both about 2 × 10⁻³) are slightly, but significantly, different. In fact, the present
experimental ratio is

|η⁰⁰/η⁺⁻| = 0.9950 ± 0.0008 .
Finally, CP violation may also occur whenever both the meson and its antimeson
can decay to a common final state, with or without M−M̄ mixing:

Γ(M → f) ≠ Γ(M̄ → f̄) .
η⁺⁻ ≃ ε + ε′
η⁰⁰ ≃ ε − 2ε′ .
The ratio between the CP violating parameters can also be related to the double ratio
of the decay probabilities of K_L and K_S into specific 2π modes:

Re(ε′/ε) ≃ (1/6) (1 − |η⁰⁰/η⁺⁻|²)
         = (1/6) [1 − Γ(K_L → π⁰π⁰) Γ(K_S → π⁺π⁻) / (Γ(K_L → π⁺π⁻) Γ(K_S → π⁰π⁰))] .
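Plugging the measured ratio into this relation gives a quick numerical estimate (a sketch):

```python
# Double-ratio relation evaluated with the measured |eta00/eta+-| = 0.9950
ratio = 0.9950
re_eps_ratio = (1 - ratio**2) / 6
print(f"Re(eps'/eps) ~ {re_eps_ratio:.1e}")  # ~1.7e-3: direct CP violation is small
```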
Fig. 6.46 Leading box diagrams for the B⁰−B̄⁰ mixing. From S. Braibant, G. Giacomelli, and
M. Spurio, "Particles and fundamental interactions", Springer 2012
e⁺e⁻ → Υ(4S) → B⁰B̄⁰ .
The B⁰B̄⁰ states evolved entangled, and therefore if one of the mesons was observed
("tagged") at a given time, the other had to be its antiparticle. The "tag" of the flavor
of the B mesons could be done through the determination of the charge of the lepton
in semileptonic B decays:

B⁰ → D⁻l⁺νl (b̄ → c̄ l⁺νl)
B̄⁰ → D⁺l⁻ν̄l (b → c l⁻ν̄l) .
It was thus possible to determine the decay rate of the untagged B meson to
J/ψ K_S as a function of its decay time. This rate is shown, both for "tagged" B⁰ and
B̄⁰, in Fig. 6.47. The observed asymmetry

A_CP(t) = [Γ(B̄⁰ → J/ψ K_S) − Γ(B⁰ → J/ψ K_S)] / [Γ(B̄⁰ → J/ψ K_S) + Γ(B⁰ → J/ψ K_S)]
8 The BaBar detector was a cylindrical detector located at the Stanford Linear Accelerator Center
in California. Electrons at an energy of 9 GeV collided with 3.1 GeV antielectrons to produce a
center-of-mass collision energy of 10.58 GeV, corresponding to the ϒ(4S) resonance. The ϒ(4S)
decays into a pair of B mesons, charged or neutral. The detector had the classical “onion-like”
structure, starting from a Silicon Vertex Tracker (SVT) detecting the decay vertex, passing through
a Cherenkov detector for particle identification, and ending with an electromagnetic calorimeter.
A magnet produced a 1.5 T field allowing momentum measurement. BaBar analyzed some 100
million B B̄ events, being a kind of “B factory”.
is a clear proof of CP violation in this channel. This asymmetry can be explained
by the fact that the decays can occur with or without mixing, and the decay amplitudes
for these channels may interfere; in the case of the B⁰, the relevant amplitudes are
A₁(B⁰ → J/ψ K_S) and A₂(B⁰ → B̄⁰ → J/ψ K_S).
Nowadays, after the experiments Belle and BaBar at the B factories at KEK and
SLAC, and after the first years of the LHCb experiment at the LHC, there is already a
rich spectrum of B channels where CP violation has been observed at a level above 5σ.
These results have allowed a precise determination of most of the parameters of the CKM
matrix and intensive tests of its unitarity, as will be briefly discussed.
where δ₁ is the phase term introduced by the CKM matrix (often called the "weak
phase") and φ₁ is the phase term generated by CP-invariant interactions in the
decay (often called the "strong phase"). The exact values of these phases depend on
the convention, but the differences between the weak phases and between the strong
phases in any two different terms of the decay amplitude are independent of the
convention.
Since physically measurable reaction rates are proportional to |M|2 , so far nothing
is different. However, consider a process for which there are different routes (say for
simplicity two routes). Now we have:
These conditions are sums of three complex numbers and thus can be represented in
a complex plane as triangles, usually called the unitarity triangles.
In the triangles obtained by taking scalar products of neighboring rows or columns,
the modulus of one of the sides is much smaller than the other two. The equation for
which the moduli of the triangle are most comparable is
V_ud V*_ub + V_cd V*_cb + V_td V*_tb = 0 .
It can also be demonstrated that the areas of all unitarity triangles are the same, and
they equal half of the so-called Jarlskog invariant (from the Swedish physicist Cecilia
Jarlskog), which can be expressed as J ≃ A²λ⁶η in the Wolfenstein parametrization.
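Both statements can be checked numerically with the Wolfenstein form (6.90); the parameter values below are assumed, close to the quoted ones:

```python
import numpy as np

# Assumed Wolfenstein parameters, close to the quoted central values
lam, A, rho, eta = 0.22535, 0.817, 0.14, 0.35

# Sides of the "bd" unitarity triangle: V_ud V_ub* + V_cd V_cb* + V_td V_tb* = 0
V_ud, V_ub = 1 - lam**2/2, A*lam**3*(rho - 1j*eta)
V_cd, V_cb = -lam,         A*lam**2
V_td, V_tb = A*lam**3*(1 - rho - 1j*eta), 1.0

closure = V_ud*np.conj(V_ub) + V_cd*np.conj(V_cb) + V_td*np.conj(V_tb)
print(f"|sum of sides| = {abs(closure):.1e}")  # ~1e-4: closes up to higher orders

# Jarlskog invariant, J ~ A^2 lambda^6 eta (twice the area of each triangle)
J = A**2 * lam**6 * eta
print(f"J ~ {J:.1e}")  # ~3e-5: CP violation in the quark sector is small
```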
The fact that the Jarlskog invariant is proportional to η shows that the unitarity
triangle is a measure of CP violation: if there is no CP violation, the triangle degen-
erates into a line. If the three sides do not close to a triangle, this might indicate that
Fig. 6.49 Unitarity triangle and global CKM fit in the plane (ρ, η). From CKMfitter Group (J.
Charles et al.), Eur. Phys. J. C41 (2005) 1, updated results and plots available at http://ckmfitter.in2p3.fr
the CKM matrix is not unitary, which would imply the existence of new physics—in
particular, the existence of a fourth quark family.
The present experimental constraints on the CKM unitarity triangle, as well as a
global fit to all the existing measurements by the CKMfitter group,9 are shown in
Fig. 6.49.
All present results are thus consistent with the CKM matrix being the only source
of CP violation in the Standard Model. Nevertheless, it is widely believed that the
observed matter–antimatter asymmetry in the Universe (see next section) requires
the existence of new sources of CP violation, which might be revealed either in the
quark sector, as small inconsistencies in the CKM matrix, or elsewhere, as in precise
measurements of neutrino oscillations or of the neutron electric dipole moment.
The real nature of CP violation is still to be understood.
9 The CKMfitter group provides once or twice per year an updated analysis of standard model
measurements and average values for the CKM matrix parameters.
At CERN the study of antimatter atoms has been pursued over the last 20 years.
Antihydrogen atoms have been formed and trapped for periods as long as 16 min, and
recently the first antihydrogen beams were produced. The way is open, for instance,
to detailed studies of the antihydrogen hyperfine transitions and to the measurement
of the gravitational interactions between matter and antimatter. The electric charge
of the antihydrogen atom was found by the ALPHA experiment to be compatible
with zero to eight decimal places (Q_H̄ = (−1.3 ± 1.1 ± 0.4) × 10⁻⁸ e).
No primordial antimatter has been observed so far, while the relative abundance of
baryons (n_B) to photons (n_γ) was found to be (see Sect. 8.1.3):

η = n_B/n_γ ∼ 5 × 10⁻¹⁰ .
Although apparently small, this number is many orders of magnitude higher than
what would be expected if there had been equal numbers of baryons and antibaryons
in the early Universe. Indeed, in such a case the annihilation between baryons and
antibaryons would have occurred until its interaction rate equaled the expansion rate
of the Universe (see Sect. 8.1.2), and the expected ratios can then be computed to be:

n_B/n_γ = n_B̄/n_γ ∼ 10⁻¹⁸ .
The excess of matter over antimatter must then have been present before nucleons and
antinucleons were formed. On the other hand, inflation (see Sect. 8.3.2) would
wipe out any excess of baryonic charge present at the beginning of the Big Bang.
Thus, this excess had to originate from some unknown mechanism (baryogenesis)
after inflation and before or during the quark–gluon plasma stage.
In 1967, soon after the discovery of the CMB and of the violation of CP in the
K⁰−K̄⁰ system (see previous section), Andrej Sakharov modeled the evolution of the
Universe from a B = 0 initial state to the B ≠ 0 present state (B indicates here the
baryonic number). This evolution needs three conditions, which are now known as
the Sakharov conditions:
1. Baryonic number (B) should be violated.
2. Charge (C) and Charge and Parity (CP) symmetries should be violated.
3. Baryon-number violating interactions should have occurred in the early Universe
out of thermal equilibrium.
The first condition is obvious. The second is necessary since, if C and CP were
conserved, any excess of baryonic charge produced in a given reaction would be
compensated by the conjugate reaction. The third is more subtle: the particles and their
corresponding antiparticles do not achieve thermal equilibrium due to the rapid
expansion decreasing the occurrence of pair annihilation.
Thermal equilibrium may have been broken when symmetry-breaking processes
occurred. Whenever two phases are present, the boundary regions between them
(for instance, the surfaces of bubbles in boiling water) are out of thermal equilibrium.
In the framework of the Standard Model (see Chap. 7), this could in principle have
happened at the electroweak phase transition. However, it was demonstrated
analytically and numerically that, for a Higgs boson with a mass as the one observed
recently (m_H ∼ 125 GeV), the electroweak phase transition does not provide the
thermal instability required for the formation of the present baryon asymmetry in the
Universe.
The exact mechanism responsible for the observed matter-antimatter asymmetry
in the Universe is still to be discovered. The Standard Model is clearly not the end
of physics.
The quark model simplifies the description of hadrons. We saw that deep inelastic
scattering evidences a physical reality for quarks—although the interaction between
these particles is very peculiar, since no free quarks have been observed up to now.
A heuristic form of the potential between quarks with the characteristics needed has
been shown.
Within the quark model, we needed to introduce a new quantum number—color—
to explain how bound states of three identical quarks can exist without violating the
Pauli exclusion principle. Invariance with respect to color can be described by a
symmetry group SU(3)c , where the subscript c indicates color.
The theory of quantum chromodynamics (QCD) enhances the concept of color
from a role of label to the role of charge, and is the basis for the description of the
interactions binding quarks in hadrons. The phenomenological description through
an effective potential can be seen as a limit of this exact description, and the strong
interactions binding nucleons can be explained as van der Waals forces between
neutral objects.
QCD has been extensively tested, and is very successful. The American physicists
David J. Gross, David Politzer and Frank Wilczek shared the 2004 Nobel Prize
in Physics for devising an elegant mathematical framework to express the asymptotic
(i.e., in the limit of very short distances, equivalent to the high momentum transfer
limit) freedom of quarks in hadrons, leading to the development of QCD.
However, a caveat should be stressed. At very short distances, QCD is essentially a
theory of free quarks and gluons, with relatively weak interactions, and observables
can be perturbatively calculated. However, at longer wavelengths, of the order of the
proton size ∼1 fm = 10⁻¹⁵ m, the coupling constant between partons becomes too
large to compute observables (we recall that exact solutions are in general impossible,
and perturbative calculations must be performed): the Lagrangian of QCD,
which in principle contains all the physics, becomes de facto of little help in this regime.
Parts of QCD can thus be calculated in terms of the fundamental parameters using
the full dynamical (Lagrangian) representation, while for other sectors one should
use models, guided by the characteristics of the theory, whose effective parameters
cannot be calculated but can be constrained by experimental data.
Before formulating QCD as a gauge theory, we must extend the formalism shown for
the description of electromagnetism (Sect. 6.2.6) to a symmetry group like SU(3).
This extension is not trivial, and it was formulated by Yang and Mills in the 1950s.
U(1). Let us first summarize the ingredients of the U(1) gauge theory—which is
the prototype of the abelian gauge theories, i.e., of the gauge theories defined by
symmetry groups for which the generators commute. We have seen in Sect. 6.2.3
that the requirement that physics is invariant under local U(1) phase transformations
implies the existence of the photon gauge field. QED can be derived by requiring
the Lagrangian to be invariant under local U(1) transformations of the form U =
e^{iqχ(x)} I — note the identity operator I, which, in the case of U(1), is just unity. The
recipe is:
• Find the gauge invariance of the theory—in the case of electromagnetism U(1):
∂μ → Dμ = ∂μ + iq Aμ (x) (6.103)
where A_μ transforms as

A_μ → A′_μ = A_μ + ∂_μχ .   (6.104)
The Lagrangian

L_QED = ψ̄(iγ^μ D_μ − m)ψ − (1/4) F_{μν} F^{μν}   (6.105)

with

F_{μν} = ∂_μ A_ν − ∂_ν A_μ = (1/iq) [D_μ, D_ν]   (6.106)
is invariant under the local gauge transformation, and the field A_μ and its interactions
with ψ are defined by the invariance itself. Note that the Lagrangian can be written as

L = L_loc + L_gf

where L_loc is the locally invariant Lagrangian for the particle and L_gf is the gauge-field
Lagrangian.
Abelian Symmetry Groups. What we have seen for U(1) can be trivially extended
to symmetries with more than one generator, if the generators commute.
Non-Abelian Symmetry Groups and Yang–Mills Theories. When the symmetry
group is non-abelian, i.e., generators do not commute, the above recipes must be
6.4 Strong Interactions and QCD 325
generalized. If the generators of the symmetry are T^a, with a = 1, …, n, one can
write the gauge transformation as

ψ(x) → ψ′(x) = e^{i g_s α_a(x) T^a} ψ(x) .   (6.107)

From now on, we shall not explicitly write the sum over a, the index running over
the generators (or the gauge bosons); it will be assumed implicitly whenever the
index is repeated. We do not associate any particular meaning to the fact that the
index a is written as subscript or superscript.
If the commutation relations

[T^a, T^b] = i f^{abc} T^c   (6.108)

hold, where the f^{abc} are the structure constants of the group, the covariant
derivative can be written as

D_μ = ∂_μ + i g A^a_μ T^a ,   (6.109)

where A^a_μ are the vector potentials, and g is the coupling constant. In four dimensions,
the coupling constant g is a pure number and for an SU(N) group one has a, b, c =
1 … N² − 1.
The gauge field Lagrangian has the form

L_gf = −(1/4) F^{aμν} F^a_{μν} .   (6.110)
The field strength tensor obeys the relation

F^a_{μν} = ∂_μ A^a_ν − ∂_ν A^a_μ + g f^{abc} A^b_μ A^c_ν .   (6.111)
The field is self-interacting: from the given Lagrangian one can derive the equations

∂^μ F^a_{μν} + g f^{abc} A^{μb} F^c_{μν} = 0 .   (6.113)
In the presence of a source (a current J^a_ν), the field equations become

∂^μ F^a_{μν} + g f^{abc} A^b_μ F^c_{μν} = −J^a_ν .   (6.114)
One can demonstrate that a Yang–Mills theory is not renormalizable for dimen-
sions greater than four.
QCD is based on the gauge group SU(3), the Special Unitary group in 3 dimensions
(each dimension is a color, conventionally Red, Green, Blue). This group is
represented by the set of unitary 3 × 3 complex matrices with determinant one (see
Sect. 5.3.5).
Since there are nine linearly independent 3 × 3 unitary complex matrices, and the
unit-determinant condition removes one of them, there are a total of eight independent
directions in this matrix space, i.e., the carriers of color (called gluons) are eight.
Another way of seeing that the number of gluons is eight is that SU(3) has eight
generators; each generator represents a color exchange, and thus a gauge boson
(a gluon) in color space.
These matrices can operate both on each other (combinations of successive gauge
transformations, physically corresponding to successive gluon emissions and/or
gluon self-interactions) and on a set of complex 3-vectors, representing quarks in
color space.
Due to the presence of color, a generic particle wave function can be written
as a three-vector ψ = (ψ_qR, ψ_qG, ψ_qB), which is a superposition of fields with a
definite color index i = Red, Green, Blue. The SU(3) symmetry corresponds to the
freedom of rotation in this three-dimensional space. As we did for the electromagnetic
gauge invariance, we can express the local gauge invariance as the invariance of the
Lagrangian with respect to the gauge transformation

ψ(x) → ψ′(x) = e^{i g_s α_a(x) t^a} ψ(x) ,   (6.115)

where the t^a (a = 1 … 8) are the eight generators of the SU(3) group, and the
α_a(x) are generic local transformations. g_s is the strong coupling, related to α_s by
the relation g_s² = 4πα_s; we shall return to the strong coupling in more detail later.
Usually, the generators of SU(3) are written as

t^a = λ^a/2 ,   (6.116)
where the λ^a are the so-called Gell-Mann matrices, defined as:

     ⎛ 0 1 0 ⎞        ⎛ 0 −i 0 ⎞        ⎛ 1 0 0 ⎞
λ1 = ⎜ 1 0 0 ⎟ ; λ2 = ⎜ i 0 0 ⎟ ; λ3 = ⎜ 0 −1 0 ⎟
     ⎝ 0 0 0 ⎠        ⎝ 0 0 0 ⎠        ⎝ 0 0 0 ⎠

     ⎛ 0 0 1 ⎞        ⎛ 0 0 −i ⎞
λ4 = ⎜ 0 0 0 ⎟ ; λ5 = ⎜ 0 0 0 ⎟
     ⎝ 1 0 0 ⎠        ⎝ i 0 0 ⎠

     ⎛ 0 0 0 ⎞        ⎛ 0 0 0 ⎞                  ⎛ 1 0 0 ⎞
λ6 = ⎜ 0 0 1 ⎟ ; λ7 = ⎜ 0 0 −i ⎟ ; λ8 = (1/√3)  ⎜ 0 1 0 ⎟ .
     ⎝ 0 1 0 ⎠        ⎝ 0 i 0 ⎠                  ⎝ 0 0 −2 ⎠
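The defining properties of these matrices can be verified numerically (a sketch):

```python
import numpy as np

# The eight Gell-Mann matrices, as listed above
lams = [np.array(m, dtype=complex) for m in [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
]]
lams.append(np.diag([1, 1, -2]).astype(complex) / np.sqrt(3))
t = [l / 2 for l in lams]  # generators t^a = lambda^a / 2, Eq. (6.116)

# Normalization: Tr(t^a t^b) = delta_ab / 2
for a in range(8):
    for b in range(8):
        assert np.isclose(np.trace(t[a] @ t[b]), 0.5 * (a == b))

# The SU(2)-like corner reproduces the Pauli algebra: [t^1, t^2] = i t^3
assert np.allclose(t[0] @ t[1] - t[1] @ t[0], 1j * t[2])
print("Gell-Mann algebra checks pass")
```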
As discussed in Sect. 5.3.5, these generators are just the SU(3) analogs of the Pauli
matrices in SU(2) (one can see it by looking at λ1 , λ2 and λ3 ). Note that superscribing
or subscribing an index for a matrix makes no difference in this case.
As a consequence of the local gauge symmetry, eight massless fields Aaμ will
appear (one for each generator); these are the gluon fields. The covariant derivative
can be written as
D_μ = ∂_μ + i g_s t^a A^a_μ .   (6.117)
The QCD Lagrangian is then

L = ψ̄_q(iγ^μ D_μ)ψ_q − m_q ψ̄_q ψ_q − (1/4) G^a_{μν} G^{aμν} ,   (6.118)

where m_q is the quark mass, and G^a_{μν} is the gluon field strength tensor for a gluon
with color index a, defined as

G^a_{μν} = ∂_μ A^a_ν − ∂_ν A^a_μ − g_s f^{abc} A^b_μ A^c_ν ,   (6.119)

where the f^{abc} are defined by the commutation relation [t^a, t^b] = i f^{abc} t^c. The last
term arises since the generators do not commute.
To guarantee the local invariance, the field A^c_μ transforms as

A^c_μ → A′^c_μ = A^c_μ − ∂_μ α^c − g_s f^{abc} α^a A^b_μ .
The only stable hadronic states are neutral in color. The simplest example is the
combination of a quark and an antiquark, which in color space corresponds to

3 ⊗ 3̄ = 8 ⊕ 1 .   (6.120)

A quark–antiquark pair is in a color-neutral state only if it is in the singlet combination
(1/√3)(RR̄ + GḠ + BB̄); otherwise it is in an overall octet state (Fig. 6.50).
Correlated production processes like Z → q q̄ or g → q q̄ will project out specific
components (here the singlet and octet, respectively).
In final states, we average over all incoming colors and sum over all possible
outgoing ones. Color factors are thus associated to QCD processes; such factors
basically count the number of “paths through color space” that the process can take,
and multiply the probability for a process to happen.
A simple example is given by the decay Z → q q̄ (see the next Chapter). This
vertex contains a δ_ij in color space: the outgoing quark and antiquark must have
matching color and anticolor, and one sums over all N_C = 3 possibilities, which
enhances the decay rate. In the crossed process q q̄ → Z → e⁺e⁻, the
Fig. 6.50 The color combinations of a quark and an antiquark, 3 ⊗ 3̄ = 8 ⊕ 1: the octet, including
(1/√2)(RR̄ − GḠ) and (1/√6)(RR̄ + GḠ − 2BB̄), and the singlet (1/√3)(RR̄ + GḠ + BB̄)
matrix element must be the same as before, but since the quarks are here incoming,
we must average rather than sum over their colors, leading to
q q̄ → Z → e⁺e⁻ :  |M|² ∝ (1/9) Σ_colors δ_ij δ*_ji = (1/9) Tr{δ} = 1/3 ,   (6.122)
and the color factor entails now a suppression due to the fact that only quarks of
matching colors can produce a Z boson. The chance that a quark and an antiquark
picked at random have a corresponding color-anticolor is 1/NC .
Color factors enter also in the calculation of probabilities for the vertices of QCD.
In Fig. 6.51 one can see the definition of the color factors for the three-body vertices
q → qg, g → gg (notice the difference from QED: since the gluon is colored, the
"triple gluon vertex" can exist, while the γ → γγ vertex does not), and g → q q̄.
After tedious calculations, the color factors are

T_F = 1/2 ;  C_F = 4/3 ;  C_A = N_C = 3 .   (6.123)
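These values follow from the closed SU(N) formulas with N_C = 3, which can be evaluated directly (a sketch):

```python
# Closed-form SU(N) color factors for N_C = 3 colors:
# T_F normalizes Tr(t^a t^b) = T_F delta_ab, C_F is the quark Casimir
# (q -> qg), and C_A the gluon Casimir (g -> gg).
NC = 3
TF = 1 / 2
CF = (NC**2 - 1) / (2 * NC)  # 8/6 = 4/3
CA = NC

print(TF, CF, CA)  # 0.5 1.3333333333333333 3
```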
When we discussed QED, we saw that renormalization can be absorbed into a
running value for the charge, i.e., a running coupling constant. The same happens in
QCD, where the evolution of α_s with the momentum transfer q² is given by

q² ∂α_s/∂q² = β(α_s) ,   (6.124)

where

β(α_s) = −α_s² (b₀ + b₁α_s + b₂α_s² + ⋯) ,   (6.125)
with

b₀ = (11C_A − 4T_R n_f) / (12π) ,   (6.126)

b₁ = (17C_A² − 10T_R C_A n_f − 6T_R C_F n_f) / (24π²) = (153 − 19n_f) / (24π²) .   (6.127)
In the expression for b0 , the first term is due to gluon loops and the second to the
quark loops. In the same way, the first term in the b1 coefficient comes from double
gluon loops, and the others represent mixed quark–gluon loops.
At variance with the QED expression (6.66), the running constant increases with
decreasing q 2 .
Solving, one obtains

α_s(q²) = α_s(μ²) / [1 + b₀ α_s(μ²) ln(q²/μ²) + O(α_s²)] .   (6.128)

Choosing as reference scale μ = M_Z, one can write

α_s(q²) ≃ α_s(M_Z²) / [1 + b₀ α_s(M_Z²) ln(q²/M_Z²)] .   (6.129)
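Equation (6.129) can be evaluated numerically; the sketch below assumes n_f = 5 active flavors and the world-average α_s(M_Z) = 0.1185 as input:

```python
import numpy as np

# One-loop running of alpha_s, Eq. (6.129); n_f = 5 is an assumption
# (valid between the b and t thresholds).
MZ, a_MZ, nf = 91.19, 0.1185, 5
CA, TR = 3.0, 0.5
b0 = (11*CA - 4*TR*nf) / (12*np.pi)  # Eq. (6.126)

def alpha_s(Q):
    """Running coupling at scale Q (GeV), Eq. (6.129)."""
    return a_MZ / (1 + b0 * a_MZ * np.log(Q**2 / MZ**2))

print(round(alpha_s(10.0), 3))    # ~0.174: the coupling grows at low Q
print(round(alpha_s(1000.0), 3))  # ~0.088: asymptotic freedom at high Q
```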
Fig. 6.52 Dependence of α_s on the energy scale Q, from measurements including τ decays
(N3LO), lattice QCD (NNLO), DIS jets (NLO), heavy quarkonia (NLO), e⁺e⁻ jets and shapes
(res. NNLO), the Z pole fit (N3LO), and pp̄/pp jets (NLO); a fit to QCD with α_s(M_Z) =
0.1185 ± 0.0006 is superimposed. From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38
(2014) 090001
When quarks are very close to each other, they behave almost as free particles. This
is the famous “asymptotic freedom” of QCD. As a consequence, perturbation theory
becomes accurate at higher energies (Eq. 6.128). Conversely, the potential grows at
large distances.
In addition, the evolution of $\alpha_s$ with energy must make it comparable to the
electromagnetic and weak couplings at some (large) energy which, according to our
present extrapolations, may lie at some $10^{15}$–$10^{17}$ GeV; such “unification” might
however happen at lower energies if new, yet undiscovered, particles generate large
corrections to the evolution. Beyond this point, we do not know how the evolution
behaves.
At a scale
$$\Lambda \sim 200\ \mathrm{MeV} \qquad (6.130)$$
the perturbative coupling (6.128) starts diverging; this is called the Landau pole. Note
however that Eq. 6.128 is perturbative, and more terms are needed near the Landau
pole: strong interactions indeed do not exhibit a divergence for $Q \to \Lambda$.
Asymptotic freedom entails that at extremely high temperature and/or density
a new phase of matter should appear due to QCD. In this phase, called quark–gluon
plasma (QGP), quarks and gluons become free: the color charges of partons are
screened. It is believed that during the first few microseconds after the big bang the
Universe was in a QGP state, and flavors were equiprobable.
QGP should be formed when the temperature is close to 200 MeV and the density is
large enough; this makes ion–ion colliders the ideal place to reproduce this state.
One characteristic of QGP should be that jets are “quenched”: the high density
of particles in the “fireball” which is formed after the collision absorbs jets in such
a way that in the end no jet or just one jet appears.
Many experiments at hadron colliders tried to create this new state of matter in
the 1980s and 1990s, and CERN announced indirect evidence for QGP in 2000.
Current experiments at the Relativistic Heavy Ion Collider (RHIC) at BNL and at
CERN's LHC are continuing this effort, colliding relativistically accelerated gold
(at RHIC) or lead (at LHC) ions. RHIC experiments have also claimed to have
created a QGP with a temperature $T \sim 4 \times 10^{12}$ K (about 350 MeV).
Fig. 6.53 The creation of a multihadronic final state from the decay of a Z boson or from a virtual
photon state generated in an e+ e− collision
At large energies, QCD processes can be described directly from the QCD
Lagrangian. Quarks will radiate gluons, which branch into gluons or generate q q̄
pairs, and so on. This is a parton shower, quite similar in concept to the electromag-
netic showers described by QED.
However, at a certain hadronization scale $Q_{had}$ we are no longer able to perform
perturbative calculations. We must turn to QCD-inspired phenomenological
models to describe the transition of colored partons into colorless states, and the
further branchings.
The problem of hadron generation from a high-energy collision is thus modeled
through four steps (Fig. 6.53):
1. Evolution of partons through a parton shower.
2a. Grouping of the partons onto high-mass color-neutral states. Depending on the
model these states are called “strings” or “clusters”—the difference is not rele-
vant for the purpose of this book; we shall describe in larger detail the “string”
model in the following.
2b. Map of strings/clusters onto a set of primary hadrons (via string break or cluster
splitting).
3. Sequential decays of the unstable hadrons into secondaries (e.g., $\rho \to \pi\pi$,
$\Lambda \to n\pi$, $\pi^0 \to \gamma\gamma$, …).
The physics governing steps 2a and 2b is non-perturbative and pertains to hadronization;
some of its properties are nevertheless constrained by the QCD Lagrangian.
An important result in lattice QCD,10 confirmed by quarkonium spectroscopy,
is that the potential of the color-dipole field between a charge and an anticharge at
distances $r \gtrsim 1$ fm can be approximated as $V \sim \kappa r$ (Fig. 6.54). This is called “linear
confinement”, and it justifies the string model of hadronization, discussed below in
Sect. 6.4.6.1.
10 This formulation of QCD in discrete rather than continuous spacetime allows pushing momentum
cut-offs for calculations to the lowest values, below the hadronization scale; however, lattice QCD
is computationally very expensive, requiring the use of the largest available supercomputers.
The Lund string model, implemented in the Pythia [F6.1] simulation software, is
nowadays commonly used to model hadronic interactions. We shall now briefly describe
the main characteristics of this model; many of its basic concepts are shared by
any string-inspired method. A more complete discussion can be found in the book
by Andersson [F6.2].
Consider the production of a q q̄ pair, for instance in the process e+ e− → γ ∗ /Z →
q q̄ → hadrons. As the quarks move apart, a potential
V (r ) = κ r (6.131)
is stretched between them (at short distances, a Coulomb term proportional to $1/r$
should be added). Such a potential describes a string with energy per unit length κ,
which has been determined from hadron spectroscopy and from fits to simulations
to have the value κ ∼ 1 GeV/fm ∼ 0.2 GeV2 (Fig. 6.54). The color flow in a string
stores energy (Fig. 6.55).
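A back-of-the-envelope consequence of Eq. 6.131: with the value of κ quoted above, stretching a string by a couple of femtometers already stores enough energy to create a light quark–antiquark pair, which is why the string breaks rather than a quark being isolated. A minimal sketch:

```python
KAPPA = 1.0  # GeV/fm, string tension of Eq. 6.131 (value quoted in the text)

def string_energy(r_fm):
    """Potential energy V = kappa * r stored in a color string of length r."""
    return KAPPA * r_fm

# The energy grows linearly with separation: isolating a quark would cost
# unbounded energy, but the string snaps into hadrons well before that.
for r in (0.5, 1.0, 2.0):
    print(f"r = {r} fm -> V = {string_energy(r):.1f} GeV")
```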
A soft gluon, if emitted, does not affect the string evolution much
(string fragmentation is “infrared safe” with respect to the emission of soft and
collinear gluons). A hard gluon, instead, can store enough energy that the $qg$ and the
$g\bar q$ elements operate as two different strings (Fig. 6.56). Quark fragmentation is
different from gluon fragmentation, since quarks are connected to a single
string while gluons have one on either side; the energy transferred to strings by
gluons is thus roughly double that transferred by quarks.
As the string endpoints move apart, their kinetic energy is converted into potential
energy stored in the string itself (Eq. 6.131). This process continues until, by quantum
Fig. 6.56 Illustration of a qg q̄ system. Color conservation entails the fact that the color string goes
from quarks to gluons and vice-versa rather than from quark to antiquark
Fig. 6.57 String breaking by quark pair creation in the string field; time evolution goes from bottom
to top
fluctuation, a quark–antiquark pair emerges, transforming energy from the string into
mass. The original endpoint partons are now screened from each other, and the string
breaks into two separate color-singlet pieces, $(q\bar q) \to (q\bar q') + (q'\bar q)$, as shown
in Fig. 6.57. This process then continues until only final state hadrons remain, as
described in the following.
The individual string breaks are modeled from quantum mechanical tunneling,
which leads to a suppression of transverse energies and masses:
$$\mathrm{Prob}(m_q^2,\, p_{\perp q}^2) \propto \exp\!\left(\frac{-\pi m_q^2}{\kappa}\right)\exp\!\left(\frac{-\pi p_{\perp q}^2}{\kappa}\right), \qquad (6.132)$$
where $m_q$ is the mass of the produced quark and $p_\perp$ is the transverse momentum
with respect to the string. The $p_\perp$ spectrum of the quarks is thus independent of the
quark flavor, and
$$\langle p_{\perp q}^2\rangle = \sigma^2 = \kappa/\pi \sim (250\ \mathrm{MeV})^2\,. \qquad (6.133)$$
The mass suppression implied by Eq. 6.132 is such that strangeness suppression
with respect to the creation of $u$ or $d$ quarks, $s/u \sim s/d$, is 0.2–0.3. This suppression is
consistent with experimental measurements, e.g., of the $K/\pi$ ratio in the final states
from Z decays.
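The suppression factors quoted here and below follow directly from the tunneling probability of Eq. 6.132. A sketch, where the quark mass values (u, d taken as massless, s ≈ 0.3 GeV, c ≈ 1.27 GeV) are illustrative assumptions:

```python
import math

KAPPA = 0.2  # GeV^2, string tension

def mass_suppression(m_q, m_ref=0.0):
    """Tunneling probability of Eq. 6.132 relative to a quark of mass m_ref."""
    return math.exp(-math.pi * (m_q**2 - m_ref**2) / KAPPA)

s_over_u = mass_suppression(0.3)   # strange vs (nearly massless) u, d
charm = mass_suppression(1.27)     # charm: effectively never produced
print(f"s/u ~ {s_over_u:.2f}")     # ~0.24, in the measured 0.2-0.3 range
print(f"c/u ~ {charm:.0e}")        # ~1e-11, as quoted in the text
```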
By inserting the charm quark mass in Eq. 6.132 one obtains a relative suppression
of charm of the order of 10−11 . Heavy quarks can therefore be produced only in the
perturbative stage and not during fragmentation.
Baryon production can be incorporated in the same picture if string breaks occur
also through the production of pairs of diquarks, bound states of two quarks in a $\bar 3$
representation (e.g., “red + blue = antigreen”). The relative probability of diquark–
antidiquark to quark–antiquark production is extracted from experimental measurements,
e.g., of the $p/\pi$ ratio.
The creation of excited states (for example, hadrons with nonzero orbital angular
momentum between the quarks) is modeled by a probability that such events occur; this
probability is again tuned to the final multiplicities measured for particles in hard collisions.
With $p_\perp^2$ and $m^2$ in the simulation of the fragmentation fixed by the extraction
of random numbers distributed as in Eq. 6.132, the final step is to model the fraction,
$z$, of the initial quark's longitudinal momentum that is carried by the final hadron; in
first approximation, this should scale with energy for large enough energies. The form
of the probability density for $z$ used in the Lund model, the so-called fragmentation
function $f(z)$, is
$$f(z) \propto \frac{1}{z}\,(1-z)^a \exp\!\left(-\frac{b\,(m_h^2 + p_{\perp h}^2)}{z}\right), \qquad (6.134)$$
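A sketch of how $z$ values can be drawn from the Lund fragmentation function by rejection sampling; the parameter values $a$, $b$ and the transverse mass used here are illustrative assumptions, not fitted values:

```python
import math
import random

def lund_f(z, a=0.68, b=0.98, mT2=0.25):
    """Unnormalized Lund fragmentation function, Eq. 6.134
    (mT2 = m_h^2 + p_perp_h^2 in GeV^2; a, b assumed values)."""
    return (1.0 / z) * (1.0 - z)**a * math.exp(-b * mT2 / z)

def sample_z(rng, fmax):
    """Rejection sampling of z against a flat envelope of height fmax."""
    while True:
        z = rng.uniform(1e-3, 1.0)
        if rng.uniform(0.0, fmax) < lund_f(z):
            return z

rng = random.Random(1)
fmax = max(lund_f(i / 1000.0) for i in range(1, 1000))  # scan for the maximum
zs = [sample_z(rng, fmax) for _ in range(2000)]
print(sum(zs) / len(zs))  # mean momentum fraction taken per string break
```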
Fig. 6.59 Iterative selection of flavors and momenta in the Lund string fragmentation model. From
P. Skands, http://arxiv.org/abs/1207.2389
Figure 6.59 illustrates this iterative procedure along a string: a $d\bar d$ pair is created
from the vacuum; the $\bar d$ combines with the $u$ and forms a $\pi^+$, which carries off a
fraction $z_1$ of the total momentum $p_+$. The next hadron takes a fraction $z_2$ of the
remaining momentum, etc. The $z_i$ are random numbers generated according to a
probability density function corresponding to the Lund fragmentation function.
A more precise expression has been obtained from QCD. The expression including
leading and next-to-leading order calculation is:
$$n(E_{CM}) = a\,[\alpha_s(E_{CM})]^b\; e^{c/\sqrt{\alpha_s(E_{CM})}}\left[1 + O\!\left(\sqrt{\alpha_s(E_{CM})}\right)\right], \qquad (6.136)$$
[Fig. 6.60: average charged particle multiplicity $n_{ch}$ as a function of $\sqrt s$ or $Q$, from 1 to $10^4$ GeV, in $e^+e^-$, $pp/p\bar p$, and $ep$ collisions; data from bubble chambers, ISR, LENA, ARGUS, CLEO, JADE, TASSO, AMY, HRS, TPC, TOPAZ, VENUS, MARK I, MARK II, UA1, UA5, H1, ZEUS, CDF, ALICE, and CMS, compared to the NNLO QCD prediction with $\alpha_s(M_Z^2) = 0.1184$. From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001.]
where a is a parameter (not calculable from perturbation theory) whose value should
be fitted from the data. The constants b = 0.49 and c = 2.27 are calculated from the
theory.
The summary of the experimental data is shown in Fig. 6.60; a plot comparing the
charged multiplicity in $e^+e^-$ annihilations with expression 6.136 over a wide range of
energies will be discussed in more detail in the next chapter (Fig. 7.18). The charged
particle multiplicity at the Z pole, 91.2 GeV, is about 21 (the total multiplicity,
including the $\pi^0$s before their decays, is about 30).
Fig. 6.61 A two-jet event (left) and a 3-jet event (right) observed by the ALEPH experiment at
LEP. Source CERN
The angular distribution of the two jets with respect to the beam axis follows the
$$\frac{d\sigma}{d\cos\theta} \propto (1 + \cos^2\theta)$$
behavior expected for spin-1/2 objects.
Some characteristics of quarks can be seen also by the ratio of the cross section
into hadrons to the cross section into μ+ μ− pairs, as discussed in Sect. 5.4.2. QED
predicts that this ratio should be equal to the sum of squared charges of the charged
hadronic particles produced; due to the nature of QCD, the sum has to be extended
over quarks and over colors. For $2m_b \ll \sqrt s \ll 2m_t$,
$$R = 3\left(\frac{1}{9} + \frac{4}{9} + \frac{1}{9} + \frac{4}{9} + \frac{1}{9}\right) = \frac{11}{3}\,.$$
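The counting behind this number can be made explicit; the factor $N_C = 3$ from the sum over colors is what the data require:

```python
# Quark charges (units of e) for the five flavors accessible when
# 2m_b << sqrt(s) << 2m_t
charges = {'d': -1/3, 'u': 2/3, 's': -1/3, 'c': 2/3, 'b': -1/3}
N_C = 3  # number of colors

R = N_C * sum(q**2 for q in charges.values())
print(R)  # 11/3 ~ 3.67; without the color factor 3 the data would be missed
```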
The $O(\alpha_S)$ process $e^+e^- \to q g \bar q$ (Fig. 6.56) can give events with three jets (Fig. 6.61,
right). Notice that, as one can see from Fig. 6.56, one expects an excess of particles
in the direction of the gluon jet with respect to the opposite direction, since this is
where most of the color field is. This effect is called the string effect, and was
observed by the LEP experiments at CERN in the 1990s; we shall discuss it in the
next chapter. It is evident also from the comparison of the color factors, as well
as from considerations based on color conservation.
Jet production was first observed at e+ e− colliders only in 1975. It was not an
easy observation, and the reason is that the question “how many jets are there in an
event”, which at first sight seems to be trivial, is in itself meaningless, because there
is arbitrariness in the definition of jets. A jet is a bunch of particles flying into similar
directions in space; the number of jets in a final state of a collision depends on the
clustering criteria which define two particles as belonging to the same bunch.
The situation is more complicated when final state hadrons come from a hadron–
hadron interaction. On top of the interaction between the two partons responsible
for the hard scattering, there are in general additional interactions between the beam
remnant partons; the results of such interactions are called the “underlying event”
(Fig. 6.62).
Usually, the underlying event comes from a soft interaction involving low momen-
tum transfer; therefore perturbative QCD cannot be applied and it has to be described
by models. Contributions to the final energy may come from additional gluon radia-
tion from either the initial state or final state partons; typically, the products have small
transverse momentum with respect to the direction of the collision (in the center-of-
mass system). In particular, in a collision at accelerators, many final products of the
collision will be lost in the beam pipe.
To characterize QCD interactions, a useful quantity is the so-called “rapidity” y
of a particle:
$$y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z}\,, \qquad (6.137)$$
11 Rapidity is a useful variable also for the study of electron–positron collisions. However, there
is nothing special in that case about the beam direction, apart from the $(1 + \cos^2\theta)$ dependence
of the jet axis; rapidity in $e^+e^-$ is thus usually defined with respect to the $q\bar q$ axis, and it still has,
for 2-jet events, the property that the distribution of final state hadrons is approximately uniform in
rapidity.
In the nonrelativistic limit,
$$y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z} \simeq \frac{1}{2}\ln\frac{m + m v_z}{m - m v_z} \simeq v_z\,. \qquad (6.138)$$
Note that nonrelativistic velocities likewise transform additively under boosts (as
guaranteed by the Galilei transformation).
The rapidity of a particle is not easy to measure, since one should know its mass.
We thus define a variable easier to measure: the pseudorapidity η
$$\eta = -\ln\tan\frac{\theta}{2}\,, \qquad (6.139)$$
where θ is the angle of the momentum of the particle relative to the +z axis. One can
derive an expression for rapidity in terms of pseudorapidity and transverse momen-
tum:
$$y = \ln\frac{\sqrt{m^2 + p_T^2\cosh^2\eta}\, +\, p_T\sinh\eta}{\sqrt{m^2 + p_T^2}}\,. \qquad (6.140)$$
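Equation 6.140 can be checked numerically against the definition (6.137), using the exact relation $p_z = p_T \sinh\eta$; the pion mass below is just an example particle:

```python
import math

def rapidity_from_eta(m, pT, eta):
    """Rapidity from (m, pT, eta), Eq. 6.140."""
    return math.log((math.sqrt(m**2 + pT**2 * math.cosh(eta)**2)
                     + pT * math.sinh(eta)) / math.sqrt(m**2 + pT**2))

def rapidity_def(m, pT, eta):
    """Definition y = (1/2) ln[(E+pz)/(E-pz)], Eq. 6.137, with pz = pT sinh(eta)."""
    pz = pT * math.sinh(eta)
    E = math.sqrt(m**2 + pT**2 + pz**2)
    return 0.5 * math.log((E + pz) / (E - pz))

m_pi = 0.1396  # GeV, charged pion mass
print(rapidity_from_eta(m_pi, 1.0, 2.0))  # slightly below eta = 2
print(rapidity_def(m_pi, 1.0, 2.0))       # identical: the two forms agree
```

For a massive particle $|y| < |\eta|$, and the two coincide in the massless limit.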
The two extreme limits of QCD, asymptotic freedom (perturbative) and confinement
(non-perturbative), translate into two radically different strategies for the computation
of the cross sections of hadronic processes. At large momentum transfer (hard
processes), cross sections can be computed as the convolution of the partonic (quark
and gluon) elementary cross sections with the parton distribution functions (PDFs).
At low momentum transfer (soft interactions), cross sections must be computed using
phenomenological models that describe the distribution of matter inside hadrons and
Fig. 6.63 Proton–(anti)proton cross sections at high energies. Cross section values for several
important processes are given. The right vertical axis reports the number of events for a luminosity
value L = 1033 cm−2 s−1 . From N. Cartiglia, arXiv:1305.6131 [hep-ex]
whose parameters must be determined from the data. The soft processes are dominant:
at the LHC, for instance (Fig. 6.63), the total proton–proton cross section is of the
order of 100 millibarn, while the Higgs production cross section is of the order of tens
of picobarn (a difference of 10 orders of magnitude!).
At high momentum transfer, the number of partons, mostly gluons, at small $x$
increases very fast, as shown in Fig. 5.25. This fast rise, responsible for the increase
of the total cross sections, can be explained by the possibility, at these energies,
for the gluons radiated by the valence quarks to radiate themselves new gluons, forming
gluonic cascades. However, at higher energies the gluons in the cascades interact
with each other, suppressing the emission of new soft gluons, and a saturation state
often described as the Color Glass Condensate (CGC) is reached. In high-energy
heavy-ion collisions, high densities may be accessible over extended regions and the
QGP may be formed.
In hadronic hard processes the factorisation assumption, tested first in the deep inelas-
tic scattering, holds. The time scale of the elementary interaction between partons
(or as in case of deep inelastic scattering between the virtual photon and the quarks),
is basically given by the inverse of the transferred momentum $Q$,
$$\tau_{int} \sim Q^{-1}\,, \qquad (6.141)$$
while the hadron time scale is given by the inverse of the QCD non-perturbative scale,
$$\tau_{had} \sim \Lambda_{QCD}^{-1}\,. \qquad (6.143)$$
Hence, whenever $\tau_{int} \ll \tau_{had}$, the processes at the two timescales can be considered
independent. Thus the inclusive cross section for the production of a final state $X$
(for instance a $\mu^+\mu^-$ dilepton, a multijet system, or a Higgs boson, …)
in the collision of two hadrons $h_1$ and $h_2$ with four-momenta $p_1$ and $p_2$,
$$h_1(p_1)\; h_2(p_2) \to X + \cdots \qquad (6.144)$$
can be written as
$$\sigma_{h_1 h_2 \to X} = \sum_{ij}\int_0^1 dx_1 \int_0^1 dx_2\; f_{h_1}^i(x_1, Q)\, f_{h_2}^j(x_2, Q)\; \hat\sigma_{ij\to X}(\hat s)\,, \qquad (6.145)$$
where $f_{h_1}^i$ and $f_{h_2}^j$ are the parton distribution functions evaluated at the scale $Q$, $x_1$
and $x_2$ are the fractions of momentum carried, respectively, by the partons $i$ and $j$,
and $\hat\sigma_{ij\to X}$ is the partonic cross section evaluated at an effective square of the center-of-mass energy
$$\hat s = x_1 x_2\, s\,. \qquad (6.146)$$
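The structure of Eq. 6.145 can be illustrated with a toy Monte Carlo. The PDF shape, the partonic cross section, and the production threshold below are all invented for the sketch and carry no physical normalization:

```python
import math
import random

def toy_pdf(x):
    """Invented valence-like shape, NOT a fitted PDF."""
    return x**-0.5 * (1.0 - x)**3

def sigma_hat(s_hat):
    """Toy partonic cross section ~ 1/s_hat (arbitrary units)."""
    return 1.0 / s_hat

def sigma_hadronic(s, s_hat_min=4.0, n=200_000, seed=0):
    """Monte Carlo estimate of the convolution in Eq. 6.145 for a single
    parton species, with a threshold s_hat >= s_hat_min regulating small x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()
        s_hat = x1 * x2 * s  # Eq. 6.146
        if s_hat >= s_hat_min:
            total += toy_pdf(x1) * toy_pdf(x2) * sigma_hat(s_hat)
    return total / n

print(sigma_hadronic(100.0))  # arbitrary units
```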
The leading-order partonic cross section for the Drell–Yan process $q\bar q \to l^+l^-$ is
$$\sigma_{q\bar q\to l^+l^-} = \frac{1}{N_c}\, Q_q^2\, \frac{4\pi\alpha^2}{3M^2}\,. \qquad (6.147)$$
$Q_q$ is the quark charge and $M^2$ is the square of the center-of-mass energy of the
colliding quark–antiquark pair (i.e., the square of the invariant mass of
the dilepton system). $M^2$ is thus given by
$$M^2 = \hat s = x_1 x_2\, s\,. \qquad (6.148)$$
Finally note that, as already discussed in Sect. 5.4.2, the color factor $N_c$
appears in the denominator (average over the incoming colors), in contrast with what
happens in the reverse process $l^+l^- \to q\bar q$ (sum over outgoing colors), whose cross
section is given by
$$\sigma_{l^+l^-\to q\bar q} = N_c\, Q_q^2\, \frac{4\pi\alpha^2}{3s}\,. \qquad (6.149)$$
There is a net topological difference between the final states of these two processes.
While in $e^+e^-$ interactions the scattering into two leptons or two jets implies
a back-to-back topology, in the Drell–Yan case there is a back-to-back topology in the
plane transverse to the beam axis but, since each quark or antiquark carries an arbitrary
fraction of the momentum of its parent hadron, the system has in general a
momentum along the beam axis.
It is then important to observe that the rapidity of the dilepton system is by energy-
momentum conservation equal to the rapidity of the quark–antiquark system,
$$y \equiv \frac{1}{2}\ln\frac{E_{ll} + P_{z\,ll}}{E_{ll} - P_{z\,ll}} = \frac{1}{2}\ln\frac{E_{q\bar q} + P_{z\,q\bar q}}{E_{q\bar q} - P_{z\,q\bar q}} = \frac{1}{2}\ln\frac{x_1}{x_2}\,. \qquad (6.151)$$
Then, if the mass M and the rapidity y of the dilepton are measured, the momentum
fractions of the quark and antiquark can, in this particular case, be directly accessed.
In fact, inverting the equations which relates M and y with x1 , x2 one obtains:
$$x_1 = \frac{M}{\sqrt s}\, e^{y}\,, \qquad (6.152)$$
$$x_2 = \frac{M}{\sqrt s}\, e^{-y}\,. \qquad (6.153)$$
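For example, for a dilepton pair at the Z mass produced at rapidity $y = 2$ in 13 TeV proton–proton collisions (numbers chosen purely for illustration):

```python
import math

def momentum_fractions(M, y, sqrt_s):
    """Eqs. 6.152-6.153: x1 = (M/sqrt(s)) e^y, x2 = (M/sqrt(s)) e^-y."""
    x1 = M / sqrt_s * math.exp(y)
    x2 = M / sqrt_s * math.exp(-y)
    return x1, x2

x1, x2 = momentum_fractions(91.2, 2.0, 13000.0)
print(x1, x2)  # ~0.052 and ~0.00095: a valence-like and a sea-like x
# consistency with Eqs. 6.148 and 6.151:
assert math.isclose(x1 * x2 * 13000.0**2, 91.2**2)
assert math.isclose(0.5 * math.log(x1 / x2), 2.0)
```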
The Drell–Yan differential cross section can now be written in terms of M and y.
Computing the Jacobian of the change of the variables from (x1 , x 2 ) to (M, y),
$$\frac{\partial(x_1, x_2)}{\partial(y, M)} = \frac{2M}{s}\,. \qquad (6.154)$$
It can be easily shown that the differential Drell–Yan cross section for the collision
of two hadrons is just:
$$\frac{d\sigma}{dM\,dy} = \frac{8\pi\alpha^2}{9Ms}\, f(x_1; x_2)\,, \qquad (6.155)$$
where f (x1 ; x2 ) is the combined PDF for the fractions of momentum carried by
the colliding quark and antiquark weighted by the square of the quark charge. For
instance, in the case of proton–antiproton scattering one has, assuming that the quark
pdfs in the proton are identical to the antiquark pdfs in the antiproton and neglecting
the contributions of the antiquark (quark) of the proton (antiproton) and of other
quarks than the u and d:
$$f(x_1; x_2) = \frac{4}{9}\, u(x_1)\, u(x_2) + \frac{1}{9}\, d(x_1)\, d(x_2)\,, \qquad (6.156)$$
where
$$u(x) \equiv u_p(x) = \bar u_{\bar p}(x)\,, \qquad (6.157)$$
$$d(x) \equiv d_p(x) = \bar d_{\bar p}(x)\,. \qquad (6.158)$$
At the LHC, in proton–proton collisions, the antiquark must come from the sea.
However, to have a good description of the dilepton data (see Fig. 6.66) it is not enough
to consider the leading order diagram discussed above. In fact, the peak observed
around $M \sim 91$ GeV corresponds to the Z resonance, not accounted for in the naïve
Drell–Yan picture, and next-to-next-to-leading order (NNLO) diagrams are needed to
reach such a perfect agreement between data and theory.
Fig. 6.66 Dilepton cross section measured by CMS. From V. Khachatryan et al. (CMS Collabora-
tion), The European Physical Journal C75 (2015) 147
and the following relations can be established between the partonic and the final state
variables:
$$x_1 = \frac{P_T}{\sqrt s}\left(e^{y_1} + e^{y_2}\right), \qquad (6.160)$$
$$x_2 = \frac{P_T}{\sqrt s}\left(e^{-y_1} + e^{-y_2}\right), \qquad (6.161)$$
$$Q^2 = P_T^2\left(1 + e^{y_1 - y_2}\right). \qquad (6.162)$$
Comparing these predictions with the LHC data provides a powerful test of QCD
spanning many orders of magnitude (see Fig. 6.67).
In the forward region ($\theta = 0$), the interference between the incident and the scattered
waves is non-negligible. In fact, this term has the net effect of reducing the
incident flux, which can be seen as a kind of “shadow” created by the diffusion center.
An important theorem, the Optical Theorem, connects the total cross section with
the imaginary part of the forward elastic scattering amplitude:
$$\sigma_{tot}(E) = \frac{4\pi}{k}\, \mathrm{Im}\, F(E, 0)\,. \qquad (6.165)$$
The elastic cross section is just the integral of the elastic differential cross section,
$$\sigma_{el}(E) = \int |F(E,\theta)|^2\, d\Omega\,, \qquad (6.166)$$
and the inelastic cross section is just the difference of the two:
$$\sigma_{inel} = \sigma_{tot} - \sigma_{el}\,. \qquad (6.167)$$
The scattering amplitude can be expanded in partial waves,
$$F(E,\theta) = \frac{1}{k}\sum_{l=0}^{\infty}(2l+1)\, f_l(E)\, P_l(\cos\theta)\,, \qquad (6.168)$$
where the functions fl (E) are the partial wave amplitudes and Pl are the Legendre
polynomials which form an orthonormal basis.
Cross sections can be also written as a function of the partial wave amplitudes:
$$\sigma_{el}(E) = \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\, |f_l(E)|^2\,, \qquad (6.169)$$
$$\sigma_{tot}(E) = \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\, \mathrm{Im}\, f_l(E)\,, \qquad (6.170)$$
and again σinel is simply the difference between σtot and σel .
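These sums can be checked in the fully absorptive (“black disk”) case, taking $f_l = i/2$ for $l \leq kR$ and zero beyond; the values of $k$ and $R$ below are arbitrary:

```python
import math

def black_disk_sigmas(k, R):
    """Eqs. 6.169-6.170 with f_l = i/2 (full absorption) for l <= kR."""
    lmax = int(k * R)
    weight = sum(2 * l + 1 for l in range(lmax + 1))  # = (lmax + 1)**2
    sigma_el = 4 * math.pi / k**2 * weight * 0.25   # |i/2|^2 = 1/4
    sigma_tot = 4 * math.pi / k**2 * weight * 0.5   # Im(i/2) = 1/2
    return sigma_el, sigma_tot

k, R = 50.0, 1.0  # short wavelength: 1/k << R
sigma_el, sigma_tot = black_disk_sigmas(k, R)
print(sigma_el / (math.pi * R**2))   # ~1: sigma_el ~ pi R^2
print(sigma_tot / (math.pi * R**2))  # ~2: the black-disk value 2 pi R^2
```

Half of the total cross section is elastic ("shadow") scattering even for a totally absorbing disk, a point taken up again below.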
The optical theorem applied now to each partial wave imposes the following
relation (unitarity condition):
$$\mathrm{Im}\, f_l \geq |f_l|^2\,. \qquad (6.171)$$
Noting that
$$\left|f_l - \frac{i}{2}\right|^2 = |f_l|^2 - \mathrm{Im}\, f_l + \frac{1}{4}\,, \qquad (6.172)$$
the unitarity condition is automatically satisfied by writing
$$f_l = \frac{i}{2}\left(1 - e^{2i\delta_l}\right), \qquad (6.174)$$
where $\delta_l$ is a complex number. Whenever $\delta_l$ is purely real, $|f_l|^2 = \mathrm{Im}\, f_l$
and the scattering is totally elastic (the inelastic cross section is zero).
On the other hand, if the wavelength associated with the beam particle is much
smaller than the target region,
$$\lambda \sim \frac{1}{k} \ll R\,, \qquad (6.176)$$
a description in terms of the classical impact parameter b (Fig. 6.69) may make sense.
Defining
$$b \equiv \frac{1}{k}\left(l + \frac{1}{2}\right), \qquad (6.177)$$
the partial wave expansion (6.168) can be rewritten as
$$F(E,\theta) = 2k \sum_{b}\, b\; \Delta b\; f_{bk-1/2}(E)\, P_{bk-1/2}(\cos\theta)\,, \qquad (6.178)$$
where the Legendre polynomials $P_l(\cos\theta)$ were replaced by the Legendre functions
$P_\nu(\cos\theta)$, $\nu$ being a real positive number, and the partial wave amplitudes $f_l$ were
interpolated, giving rise to the scattering amplitude $a(b, E)$.
For small scattering angles, the Legendre functions may be approximated by a
zeroth order Bessel function $J_0$, and finally one can write
$$F(E,\theta) \cong 2k \int_0^{\infty} b\; db\; a(b, E)\, J_0(b,\theta)\,. \qquad (6.180)$$
The scattering amplitude a (b, E) is thus related to the elastic wave amplitude
discussed above basically by a Bessel–Fourier transform.
Following a similar strategy to automatically ensure unitarity, $a(b,s)$ may be
parametrized as
$$a(s, b) = \frac{i}{2}\left(1 - e^{i\chi(b,s)}\right), \qquad (6.181)$$
where
$$\chi(b,s) = \chi_R(b,s) + i\,\chi_I(b,s)\,. \qquad (6.182)$$
The cross sections then become
$$\sigma_{tot}(s) = 2\int d^2b \left[1 - \cos(\chi_R(b,s))\, e^{-\chi_I(b,s)}\right], \qquad (6.184)$$
$$\sigma_{inel}(s) = \int d^2b \left[1 - e^{-2\chi_I(b,s)}\right] \qquad (6.185)$$
(the integrations run over the target region with a radius R).
Note that:
• if $\chi_I = 0$, then $\sigma_{inel} = 0$ and all the interactions are elastic;
• if $\chi_R = 0$ and $\chi_I \to \infty$ for $b \leq R$, then $\sigma_{inel} = \sigma_{el}$ and $\sigma_{tot} = 2\pi R^2$. This is
the so-called black disk limit.
In a first approximation, hadrons may be described as gray disks with mean radius
$R$ and $\chi(b,s) = i\,\Omega(s)$ for $b \leq R$, 0 otherwise. The opacity $\Omega$ is a real number
($0 < \Omega < \infty$). In fact, the main features of proton–proton cross sections can be
reproduced in such a simple model (Fig. 6.70). In the high energy limit, the gray
disk tends asymptotically to a black disk, and thereafter the increase of the cross
section, limited by the Froissart bound to $\ln^2(s)$, is just determined by the increase
of the mean radius.
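For the gray disk, the $b$-integrals in Eqs. 6.184–6.185 are trivial, and the approach of $\sigma_{el}/\sigma_{tot}$ to the black-disk value 1/2 can be seen directly (a sketch with an arbitrary radius):

```python
import math

def gray_disk(R, omega):
    """Eqs. 6.184-6.185 with chi_R = 0 and chi_I = Omega for b <= R, 0 outside."""
    area = math.pi * R**2
    sigma_tot = 2 * area * (1 - math.exp(-omega))
    sigma_inel = area * (1 - math.exp(-2 * omega))
    sigma_el = sigma_tot - sigma_inel
    return sigma_tot, sigma_el

for omega in (0.5, 2.0, 20.0):
    tot, el = gray_disk(1.0, omega)
    print(f"Omega = {omega}: sigma_el/sigma_tot = {el / tot:.2f}")
# the ratio rises toward the black-disk value 0.5 as the opacity grows
```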
Fig. 6.70 The total cross section (left) and the ratio of the elastic and total cross sections in proton–
proton interactions as a function of the center-of-mass energy. Points are experimental data and the
lines come from a fit using a gray disk model. From R. Conceição et al., Nuclear Physics A
888 (2012) 58
If the elementary parton–parton interactions at a given impact parameter are independent,
the number $n$ of such interactions follows a Poisson distribution with mean $\bar n$:
$$P(n, \bar n) = \frac{\bar n^{\,n}\, e^{-\bar n}}{n!}\,. \qquad (6.186)$$
The probability to have at least one collision is then $1 - e^{-\bar n}$, so
$$\sigma_{inel}(s) = \int d^2b \left[1 - e^{-\bar n}\right]. \qquad (6.188)$$
Comparing with Eq. 6.185,
$$\chi_I(b,s) = \frac{1}{2}\, \bar n(b,s)\,. \qquad (6.189)$$
2
Following this reasoning, $\chi_I(b,s)$ is then often computed as the sum over the
different kinds of parton–parton interactions, factorizing each term into a transverse
density function and the corresponding cross section:
$$\chi_I(b,s) = \sum_i G_i(b,s)\,\sigma_i\,. \qquad (6.190)$$
For instance,
$$\chi_I(b,s) = G_{qq}(b,s)\,\sigma_{qq} + G_{qg}(b,s)\,\sigma_{qg} + G_{gg}(b,s)\,\sigma_{gg}\,, \qquad (6.191)$$
where $qq$, $qg$, $gg$ stand for, respectively, the quark–quark, quark–gluon, and gluon–
gluon interactions.
On the other hand, there are models where $\chi_I$ is divided into perturbative (hard)
and nonperturbative (soft) terms:
$$\chi_I(b,s) = \chi_I^{soft}(b,s) + \chi_I^{hard}(b,s)\,. \qquad (6.192)$$
The transverse density functions G i (b, s) must take into account the overlap of
the two hadrons and can be computed as the convolution of the Fourier transform of
the form factors of the two hadrons.
This kind of strategy can be extended to nucleus–nucleus interactions, which are then
seen as an independent sum of nucleon–nucleon interactions. This approximation,
known as the Glauber 12 model, can be written as
$$\sigma_{NN}(s) = \int d^2b \left[1 - e^{-G(b,s)\,\sigma_{nn}(s)}\right]. \qquad (6.193)$$
The function $G(b,s)$ now takes into account the geometrical overlap of the two
nuclei: it gives the probability per unit area of finding simultaneously one
nucleon in each nucleus at a given impact parameter.
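A numerical sketch of Eq. 6.193 with an assumed Gaussian overlap function (a toy profile in arbitrary units, not a fit to nuclear data). It shows the characteristic saturation: once the overlap region is “black”, increasing $\sigma_{nn}$ barely changes $\sigma_{NN}$, which then measures geometry rather than the elementary cross section:

```python
import math

def sigma_NN(sigma_nn, A_eff=10.0, width=1.0, db=0.01, b_max=10.0):
    """Radial integration of Eq. 6.193 with a toy Gaussian overlap
    G(b) = A_eff * exp(-b^2 / (2 width^2)) (assumed profile)."""
    total = 0.0
    nsteps = int(b_max / db)
    for i in range(nsteps):
        b = (i + 0.5) * db  # midpoint rule in the radial coordinate
        G = A_eff * math.exp(-b**2 / (2 * width**2))
        total += 2 * math.pi * b * (1 - math.exp(-G * sigma_nn)) * db
    return total

# growing sigma_nn by factors of 10 gives diminishing returns in sigma_NN
print(sigma_NN(0.1), sigma_NN(1.0), sigma_NN(10.0))
```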
12 Roy Jay Glauber (New York, 1925) is an American physicist, recipient of the 2005 Nobel Prize
“for his contribution to the quantum theory of optical coherence,” a fundamental contribution to the
field of quantum optics. For many years before, Glauber participated in the organization of the Ig
Nobel Prize: he had the role of “keeper of the broom,” sweeping the paper airplanes thrown during
the event off the stage.
Fig. 6.73 First lead–lead event recorded by the ALICE detector at the LHC at a center-of-mass energy per
nucleon of 2.76 TeV. Thousands of charged particles were recorded by the time-projection chamber.
Source: CERN
The multiplicity in such collisions is huge, with thousands of particles detected (Fig. 6.73). Such events are an ideal
laboratory to study the formation and the characteristics of the QGP. Both global
observables, such as the asymmetry of the flow of the final state particles, and hard
probes, like high transverse momentum particles, di-jet events, and specific heavy
hadrons, are under intense scrutiny.
The asymmetry of the flow of the final state particles can be predicted as a consequence
of the anisotropies in the pressure gradients due to the shape and structure of
the nucleus–nucleus interaction region (Fig. 6.74). In fact, more and faster particles are
expected, and seen, in the region of the interaction plane (defined by the directions of
the two nuclei in the center-of-mass reference frame), where compression is higher.
Although the in-out modulation (elliptic flow) is qualitatively in agreement with the
predictions, quantitatively the effect is smaller than expected under the assumption
of a QGP formed by a free gas of quarks and gluons. Some kind of collective
phenomenon should exist, which could be related to a nonzero QGP fluid viscosity:
the QGP behaves more like a low-viscosity liquid than like an ideal gas. Such behavior was
first discovered at lower energies at the RHIC collider at Brookhaven.
Partons resulting from elementary hard processes inside the QGP have to cross a
very dense medium and thus may suffer significant energy losses, or even be absorbed,
in what is generically called “quenching”. The most spectacular observation of such
phenomena is in di-jet events, where one of the high-$P_T$ jets loses a large fraction
of its energy (Fig. 6.75). This “extinction” of jets is usually quantified in terms of
the nuclear suppression factor $R_{AA}$, defined as the ratio between the differential $P_T$
distributions in nuclei–nuclei and in proton–proton collisions:
$$R_{AA} = \frac{d^2 N_{AA}/dy\,dP_T}{N_{coll}\; d^2 N_{pp}/dy\,dP_T}\,. \qquad (6.194)$$
Fig. 6.75 Display of an unbalanced di-jet event recorded by the CMS experiment at the LHC
in lead–lead collisions at a center-of-mass energy of 2.76 TeV per nucleon. The plot shows the
sum of the electromagnetic and hadronic transverse energies as a function of the pseudorapidity
and the azimuthal angle. The two identified jets are highlighted. From S. Chatrchyan et al. (CMS
Collaboration), Phys. Rev. C84 (2011) 024906
But not only loss processes may occur in the presence of a hot and dense medium
(QGP). The production of high-energy quarkonia (bound states of heavy quark–
antiquark pairs) may also be suppressed whenever a QGP is formed, as initially proposed
in a seminal 1986 paper by Matsui and Satz for $J/\psi$ ($c\bar c$ pair)
production in high-energy heavy-ion collisions. The underlying proposed mechanism
is a color analog of Debye screening, which describes the screening of electric
charges in a plasma. Evidence of such suppression was soon reported at CERN
in fixed target oxygen–uranium collisions at 200 GeV per nucleon by the NA38
collaboration. Many other results were published in the following years, and a long
discussion was held on whether the observed suppression was due to the absorption
of these fragile $c\bar c$ states by the surrounding nuclear matter or to the possible existence
of the QGP. In 2007 the NA60 collaboration reported, in indium–indium fixed target collisions at
158 GeV per nucleon, the existence of an anomalous $J/\psi$ suppression not compatible
with nuclear absorption effects. However, this anomalous suppression did not
increase at the higher center-of-mass energies of the RHIC heavy-ion collider, and
even showed a clear decrease at the LHC (Fig. 6.77). Meanwhile, the possible
(re)combination of charm and anti-charm quarks at the boundaries of the QGP region
was proposed as an enhancement mechanism, and it seems
able to describe the present data.
The study of $J/\psi$ production, as well as of other quarkonia states, is extremely
important for the study of the QGP, as it allows a thermal spectroscopy of the QGP evolution.
The dissociation/association of these $q\bar q$ pairs is intrinsically related to the QGP
temperature; as the medium expands and cools down, these pairs may
recombine, and each flavor has a different recombination temperature. However, the
Fig. 6.77 The nuclear modification factor $R_{AA}$ for inclusive $J/\psi$ production at mid-rapidity as
reported by the PHENIX (RHIC) and ALICE (LHC) experiments, at center-of-mass energies per nucleon
of 0.2 and 2.76 TeV, respectively. From http://cerncourier.com/cws/article/cern/48619
competition between the dissociation and association effects is not trivial, and so far
it has not been experimentally assessed.
The process of formation of the QGP in high-energy heavy-ion collisions is
theoretically challenging. It is generally accepted that in the first moments of the
collision the two nuclei have already reached the saturation state described by the
color glass condensate (CGC) referred to at the beginning of Sect. 6.4.7. Then a
fast thermalization process occurs, ending in the formation of a QGP state described
by relativistic hydrodynamic models. The intermediate stage, not experimentally
accessible and not theoretically well established, is designated as glasma. Finally, the
QGP “freezes out” into a gas of hadrons. This scheme is pictured in an artistic
representation in Fig. 6.78.
In ultra-high-energy cosmic ray experiments (see Chap. 10), events with center-
of-mass energies well above those presently attainable in human-made accelerators
are detected. Higher $Q^2$, and thus smaller scales, can then be explored, opening
a new possible window to test hadronic interactions.
Fig. 6.78 An artistic representation of the time–space diagram of the evolution of the states created
in heavy-ion collisions. From “Relativistic Dissipative Hydrodynamic Description of the Quark-
Gluon Plasma,” A. Monnai, 2014 (http://www.springer.com/978-4-431-54797-6)
Further Reading
Exercises
where N stands for an isoscalar (same number of protons and neutrons) nucleus.
Consider that the involved energies are much higher than the particle masses.
Take into account only diagrams with valence quarks.
9. Top pair production. Consider the pair production of top/antitop quarks in a
proton–antiproton collider. Draw the dominant first-order Feynman diagram for
this reaction and estimate the minimal beam energy of the collider needed for
the process to occur. Discuss which channels have a clear experimental
signature.
10. c-quark decay. Consider the decay of the c quark. Draw the dominant first-order
Feynman diagrams of this decay and express the corresponding decay rates as a
function of the muon decay rate and of the Cabibbo angle. Estimate the c quark
lifetime, knowing that the muon lifetime is about 2.2 µs.
11. Gray disk model in proton–proton interactions. Determine, in the framework of
the gray disk model, the mean radius and the opacity of the proton as a function
of the center-of-mass energy (you can use Fig. 6.70 to extract the total and the
elastic proton–proton cross sections).
Chapter 7
The Higgs Mechanism and the Standard
Model of Particle Physics
In the previous chapter, we have characterized three of the four known interactions:
the electromagnetic, the strong, and the weak interactions.
We have presented an elegant mechanism for deriving the existence of gauge
bosons from a local symmetry group. We have carried out in detail the calculations
related to the electromagnetic theory, showing that the electromagnetic field naturally
appears from imposing a local U(1) gauge invariance. However, a constraint imposed
by this procedure is that the carriers of the interactions are massless: if we gave a
mass to the photon, we would violate the gauge symmetry:
$$\frac{1}{2} M_A^2 A_\mu A^\mu \;\to\; \frac{1}{2} M_A^2 \left(A_\mu - \frac{1}{e}\,\partial_\mu \alpha\right)\left(A^\mu - \frac{1}{e}\,\partial^\mu \alpha\right) \;\neq\; \frac{1}{2} M_A^2 A_\mu A^\mu \qquad (7.1)$$
The description of the weak interaction has been less satisfactory. We have
an SU(2) symmetry there, a kind of isospin, but the carriers of the force must be
massive to explain the weakness and the short range of the interaction—indeed they
are identified with the known W± and Z particles, with masses of the order of 100
GeV. But we do not yet have a mechanism explaining the existence of massive gauge
particles. Another problem is that, as we shall see, incorporating the fermion
masses in the Lagrangian by brute force via a Dirac mass term $m\bar\psi\psi$ would violate
the symmetry related to the weak interaction.
Is there a way to generate the gauge boson and the fermion masses without vio-
lating gauge invariance? The answer is given by the so-called Higgs mechanism,
proposed in the 1960s. This mechanism is one of the biggest successes of fundamen-
tal physics, and it requires the presence of a new particle: the Higgs boson, responsible
for the masses of particles. This particle was found experimentally in 2012, after
50 years of searches—consistent with the standard model parameters measured with
high accuracy at LEP.
The Higgs mechanism allowed the formulation of a relativistically covariant
quantum field theory that explains all currently observed phenomena at the scale
of elementary particles: the standard model of particle physics (in short, "standard
model," also abbreviated as SM). The SM includes all the known elementary particles
and the three interactions relevant at the particle scale: the electromagnetic interac-
tion, the strong interaction, and the weak interaction. It does not include gravitation,
which, for now, cannot be described as a quantum theory. It is a SU(3) ⊗ SU(2) ⊗ U(1)
symmetrical model.
The SM is built from two distinct interactions affecting twelve fundamental parti-
cles (quarks and leptons) and their antiparticles: the electroweak interaction, coming
from the unification of the weak force and electromagnetism (QED), and the strong
interaction described by QCD. These interactions proceed through the exchange of
gauge bosons (the mediators of these interactions) between elementary fermions.
All our knowledge about fundamental particles and interactions, which we have
described in the previous chapters, can be summarized in the following table (taken
from Iliopoulos, 2012 CERN School on Particle Physics).
The principle of local gauge invariance works beautifully for electromagnetic inter-
actions. Veltman and 't Hooft1 proved in the early 1970s that gauge theories are
renormalizable. But gauge fields appear to predict the existence of massless gauge
bosons, while we know that in nature the weak interaction is mediated by the heavy
vectors W± and Z.
How to introduce mass in a gauge theory? We have seen that a quadratic term
µ2 A2 in the gauge boson field spoils the gauge symmetry; gauge theories seem, at
face value, to account only for massless gauge particles.
The idea to solve this problem came from fields different from particle physics,
and it is related to spontaneous symmetry breaking.
Spontaneous symmetry breaking (SSB) was introduced into particle physics in 1964
by Englert and Brout, and independently by Higgs.2 Higgs was the first to mention
explicitly the appearance of a massive scalar particle associated with the curvature of
1 Martinus Veltman (1931) is a Dutch physicist. He supervised the Ph.D. thesis of Gerardus 't Hooft
(1946), and during the thesis work, in 1971, they demonstrated that gauge theories were renormal-
izable. For this achievement, they shared the Nobel Prize in Physics in 1999.
2 Peter Higgs (Newcastle, UK, 1929) was taught at home, having missed some early schooling.
He moved to the City of London School, and then, at the age of 17, to King's College, also in
London, where he graduated in molecular physics in 1954. In 1980, he was appointed to the chair
of Theoretical Physics at Edinburgh. He shared the 2013 Nobel Prize in Physics with François
Englert "for the theoretical discovery of a mechanism that contributes to our understanding of the
origin of mass of subatomic particles, and which recently was confirmed through the discovery of
the predicted fundamental particle." François Englert (1932) is a Belgian physicist; he is a Holocaust
the effective potential that determines the SSB; the mechanism is commonly called
the Higgs mechanism, and the particle is called the Higgs boson.
Let us see how SSB can create in the Lagrangian a mass term quadratic in the
field. We shall concentrate on a scalar theory; the extension to a vector theory
does not add anything conceptually.
The idea is that the system has at least two phases:
• The unbroken phase: the physical states are invariant with respect to all symmetry
groups with respect to which the Lagrangian displays invariance. In a local gauge
theory, massless vector gauge bosons appear.
• The spontaneously broken phase: below a certain energy, a phase transition might
occur. The system reaches a state of minimum energy (a “vacuum”) in which part
of the symmetry is hidden from the spectrum. For a gauge theory, we shall see that
some of the gauge bosons become massive and appear as physical states.
Infinitesimal fluctuations of a system which is crossing a critical point can decide
on the system’s fate, by determining which branch among the possible ones is taken.
Such fluctuations arise naturally in quantum physics, where the vacuum is just the
point of minimal energy and not a point of zero energy.
It is this kind of phase transition that we want to study now.
Consider the bottom of an empty wine bottle (Fig. 7.1). If a ball is put at the peak of
the dome, the system is symmetrical with respect to rotating the bottle (the potential
is rotationally symmetrical with respect to the vertical axis). But below a certain
energy (height), the ball will spontaneously break this symmetry and move into a
point of lowest energy. The bottle continues to have symmetry, but the system no
longer does.
(Footnote 2 continued)
survivor. After graduating in 1959 at the Université Libre de Bruxelles, he was appointed full professor
at the same university in 1980, where he worked with Brout. Brout died in 2011 and could not
be awarded the Nobel Prize.
In this case, what happens to the original symmetry of the equations? It still exists
in the sense that, if a symmetry transformation is applied (in this case a rotation around
the vertical axis) to the asymmetric solution, another asymmetric solution which is
degenerate with the first one is obtained. The symmetry has been "spontaneously
broken." A spontaneously broken symmetry has the characteristic, evident in the
previous example, that a critical point exists—a critical value of some external
quantity which can vary (in this case, the height from the bottom, which corresponds
to the energy)—which determines whether symmetry breaking occurs. Beyond this
critical point, frequently called a "false vacuum," the symmetric solution becomes
unstable, and the ground state becomes asymmetric—and degenerate.
Spontaneous symmetry breaking appears in many phenomena, for example, in
the orientation of domains in a ferromagnet, or in the bending of a rod pushed at
its extremes (beyond a certain pressure, the rod must bend in a direction, but all
directions are equivalent).
We now move to applications in field theory. Consider the Lagrangian of a free complex scalar field,
$$\mathcal{L}_0 = (\partial^\mu \phi^*)(\partial_\mu \phi) - M^2 \phi^*\phi \qquad (7.2)$$
with
$$\phi(x) = \frac{1}{\sqrt{2}}\left(\phi_1(x) + i\phi_2(x)\right) .$$
Let us now find the points of stability of the system (7.3). The potential
associated with the Lagrangian is
$$V(\phi) = \frac{1}{2}\mu^2\left(\phi_1^2 + \phi_2^2\right) + \frac{1}{4}\lambda\left(\phi_1^2 + \phi_2^2\right)^2 \,. \qquad (7.6)$$
$$v = \left(-\mu^2/\lambda\right)^{1/2}$$
(Fig. 7.2). Any point on the circle corresponds to a spontaneous breaking of the
symmetry of (7.4). Spontaneous symmetry breaking occurs if the kinetic energy
is smaller than the potential corresponding to the height of the dome. We call v
the vacuum expectation value: |φ| = v is the new vacuum of the system, and
the argument, i.e., the angle in the complex plane, can be anything. The actual
minimum is not symmetrical, although the Lagrangian is.
Let us assume, for simplicity, that the actual minimum chosen by the system is at
argument 0 (φ is real); this assumption does not affect generality. We now define a
new coordinate system in which a coordinate σ goes along φ1 and a coordinate ξ is
perpendicular to it (Fig. 7.3). Notice that the coordinate ξ has no influence on
the potential, since the latter is constant along the circumference. We now express
the field in the new coordinates:
$$\phi = \frac{1}{\sqrt{2}}\left[(v + \sigma) + i\xi\right] \simeq \frac{1}{\sqrt{2}}\,(v + \sigma)\, e^{i\xi/v}$$
and thus
$$\partial_\mu \phi = \frac{i}{v}\,\partial_\mu \xi\, \phi + \frac{1}{\sqrt{2}}\, e^{i\xi/v}\, \partial_\mu \sigma \,.$$
$$\mathcal{L}_1 = \frac{1}{2}\,\partial_\mu \sigma\, \partial^\mu \sigma + \frac{1}{2}\,\partial_\mu \xi\, \partial^\mu \xi - \frac{1}{2}\left(-2\mu^2\right)\sigma^2 + \text{const.} + O(3) \,. \qquad (7.7)$$
The σ² term is a mass term, and thus Lagrangian (7.7) describes a scalar field of
mass $m_\sigma^2 = -2\mu^2 = 2\lambda v^2$.
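The short algebra leading to (7.7) can be checked symbolically. The sketch below (Python with sympy; it restricts to the real direction, ξ = 0, which is all that matters for the σ mass term) confirms that the coefficient of σ² in the potential is λv², i.e., a mass term ½m_σ²σ² with m_σ² = 2λv² = −2µ²:

```python
import sympy as sp

# Potential of the complex scalar field restricted to the real direction (xi = 0):
# phi*phi -> phi^2 with phi = (v + sigma)/sqrt(2), as in Eq. (7.6).
lam, v, sigma = sp.symbols('lam v sigma', positive=True)
mu2 = -lam * v**2                     # minimum condition v^2 = -mu^2/lambda
phi = (v + sigma) / sp.sqrt(2)
V = mu2 * phi**2 + lam * phi**4       # V = mu^2 phi*phi + lambda (phi*phi)^2

coeff = sp.expand(V).coeff(sigma, 2)  # coefficient of sigma^2
print(sp.simplify(coeff))             # lam*v**2, i.e. (1/2) m_sigma^2 with m_sigma^2 = 2 lam v^2
```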
Since there are now nonzero cubic terms in σ, the reflection symmetry is broken
by the ground state: after choosing the actual vacuum, the ground state does not show
all the symmetry of the initial Lagrangian (7.3). But a U(1) symmetry operator still
exists, which turns one vacuum state into another one along the circumference.
Note that the initial field φ had two degrees of freedom. One cannot create or
cancel degrees of freedom; in the new system, one degree of freedom is taken by
the field σ, while the second is now absorbed by the massless field ξ, which moves
the potential along the “cul de bouteille.” The appearance of massless particles is an
aspect of the Goldstone theorem, which we shall not demonstrate here. The Goldstone
theorem states that, if a Lagrangian is invariant under a group of transformations G
with n generators, and if there is a spontaneous symmetry breaking such that the
new vacuum is invariant only under a group of transformations G′ ⊂ G with m < n
generators, then (n − m) massless scalar fields appear. These are called
Goldstone bosons.
In the previous example, the Lagrangian had a U(1) symmetry (one generator).
After the SSB, the new vacuum had no residual symmetry. One Goldstone field ξ appeared.
We have seen that spontaneous symmetry breaking can give a mass to a field oth-
erwise massless, and as a consequence some additional massless fields appear—the
Goldstone fields.
In this section, we want to study the consequences of spontaneous symmetry
breaking in the presence of a local gauge symmetry, as seen from the case µ2 <
0 in the potential (7.5). We shall see that (some of the) gauge bosons will become
massive, and one or more additional massive scalar field(s) will appear—the Higgs
field(s). The Goldstone bosons will disappear as an effect of the gauge invariance:
this is called the Higgs mechanism.
We consider the case of a local U(1) symmetry: a complex scalar field coupled to
itself and to an electromagnetic field Aµ
$$\mathcal{L} = -\frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} + (D_\mu \phi)^*(D^\mu \phi) - \mu^2\,\phi^*\phi - \lambda\,(\phi^*\phi)^2 \qquad (7.8)$$
If µ2 < 0, we shall have spontaneous symmetry breaking. The ground state will
be
$$\phi = v = \sqrt{-\frac{\mu^2}{\lambda}} > 0 \,; \qquad (7.9)$$
and as in the previous section, we parametrize the field φ starting from the vacuum
as
$$\phi(x) = \frac{1}{\sqrt{2}}\left(v + \sigma(x)\right) e^{i\xi(x)/v} \,. \qquad (7.10)$$
We have seen in the previous section that the field ξ(x) was massless and associated
to the displacement between the degenerate ground states. But here the ground states
are equivalent because of the gauge symmetry. Let us examine the consequences of
this. We can rewrite Eq. (7.8) as
$$\mathcal{L} = -\frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\,\partial_\mu\sigma\,\partial^\mu\sigma + \frac{1}{2}\,\partial_\mu\xi\,\partial^\mu\xi + \frac{1}{2}\,e^2 v^2 A_\mu A^\mu - v e\, A_\mu \partial^\mu \xi + \mu^2\sigma^2 + O(3) \,. \qquad (7.11)$$
Thus the σ field acquires a mass $m_\sigma^2 = -2\mu^2$; there are in addition mixed terms
between $A_\mu$ and $\xi$. Let us make a gauge transformation
$$\alpha(x) = -\frac{\xi(x)}{v} \,; \qquad (7.12)$$
thus
$$\phi(x) \to \phi'(x) = e^{-i\xi(x)/v}\,\phi(x) = \frac{1}{\sqrt{2}}\left(v + \sigma(x)\right) \qquad (7.13)$$
$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) + \frac{1}{ev}\,\partial_\mu \xi \qquad (7.14)$$
and since the Lagrangian is invariant under this transformation, we must have
$$\mathcal{L} = -\frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\,\partial_\mu\sigma\,\partial^\mu\sigma + \frac{1}{2}\,e^2 v^2 A_\mu A^\mu - \lambda v^2 \sigma^2 + O(3) \,, \qquad (7.16)$$
and now it is clear that both σ and Aµ have acquired mass:
$$m_\sigma = \sqrt{2\lambda v^2} = \sqrt{-2\mu^2} \qquad (7.17)$$
$$m_A = ev \,. \qquad (7.18)$$
Notice that the field ξ has disappeared. This is called the Higgs mechanism; the
massive (scalar) σ field is called the Higgs field. In a gauge theory, the Higgs field
"eats" the Goldstone field.
Notice that the number of degrees of freedom of the theory did not change: the
gauge field Aµ is now massive (three degrees of freedom), and the field σ has one
degree of freedom—a total of four. Before the SSB, the complex field φ had two
degrees of freedom, and the massless gauge field an additional two—again, a total
of four.
The exercise we have just done is not appropriate to model electromagnetism—after all,
the photon Aµ is massless to the best of our knowledge. However, it fully illustrates
the technique behind the Higgs mechanism.
We shall now apply this mechanism to explain the masses of the vectors of the weak
interaction, the Z , and the W ± ; but first, let us find the most appropriate description
for the weak interaction, which is naturally linked to the electromagnetic one.
The weak and electromagnetic interactions, although different, have some common
properties which can be exploited for a more satisfactory—and more "economical"—
description.
Let us start from an example, taken from experimental data. The Σ+(1189)
baryon, a uus state, decays into pπ⁰ via a strangeness-changing weak decay (the
basic transition at the quark level being s → u d ū), and it has a lifetime of about
10⁻¹⁰ s, while the Σ⁰(1192), a uds state decaying electromagnetically into Λγ, has
a lifetime of the order of 10⁻¹⁹ s, the basic transition being u → uγ. The phase
space for both decays is quite similar, and thus the difference in lifetime must be
due to the difference between the couplings of the two interactions, the amplitude
(and thus the inverse of the lifetime) being proportional to the square of the coupling. The
comparison shows that the weak coupling is smaller by a factor of order ∼10⁻⁴
with respect to the electromagnetic coupling. Although weak interactions take place
between all quarks and leptons, the weak interaction is typically hidden by the much
greater strong and electromagnetic interactions, unless these are forbidden by some
conservation rule. Observable weak interactions involve either neutrinos or quarks
with a flavor change—flavor change being forbidden in strong and electromagnetic
interactions, since photons and gluons do not carry flavor.
The factor 10−4 is very interesting, and suggests that the weak interactions might
be weak because they are mediated by gauge fields, W ± and Z , which are very
massive and hence give rise to interactions of very short range. The strength of the
interaction can be written as
$$f(q^2) = \frac{g_W^2}{q^2 + M^2} \,,$$
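This picture can be checked with a rough numerical sketch (Python; the input values below are assumed standard measurements, not quoted in the text): matching the low-q² limit of W exchange, g_W²/8M_W², to the Fermi constant shows that the weak coupling itself is not small at all.

```python
import math

# Rough check: the interaction looks weak at low q^2 only because of the large
# W mass. Input values are assumed measured quantities, not taken from the text.
alpha_MZ = 1 / 128.0        # running fine-structure constant near the Z scale
sin2_thetaW = 0.2318        # Weinberg angle (best-fit value quoted later)
M_W = 80.385                # GeV

g_W2 = 4 * math.pi * alpha_MZ / sin2_thetaW  # g^2 = e^2 / sin^2(theta_W)
G_F = math.sqrt(2) * g_W2 / (8 * M_W ** 2)   # low-q^2 limit of W exchange
print(f"G_F ~ {G_F:.3e} GeV^-2")             # measured value: 1.166e-5 GeV^-2
```

The coupling g_W² ≈ 0.4 is comparable to the electromagnetic one; the suppression factor ∼10⁻⁴ comes entirely from q² ≪ M_W².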
Glashow, Weinberg, and Salam3 proposed in the 1960s—twenty years before the
experimental discovery of the W and Z bosons—that the coupling of the W and Z to
leptons and quarks is closely related to that of the photon; the weak and electromag-
netic interactions are thus unified into an electroweak interaction. Mathematically,
this unification is accomplished under a SU(2) ⊗ U(1) gauge group.
3 Sheldon Lee Glashow (New York City 1932) shared with Steven Weinberg (New York City 1933)
and Abdus Salam (Jhang, Pakistan, 1926 - Oxford 1996) the Nobel Prize for Physics in 1979 “for
their complementary efforts in formulating the electroweak theory. The unity of electromagnetism
and the weak force can be explained with this theory.” Glashow was the son of Jewish immigrants
from Russia. He and Weinberg were members of the same classes at the Bronx High School
of Science, New York City (1950), and Cornell University (1954); Glashow later became full
professor at Princeton, and Weinberg at Harvard. Salam graduated in Cambridge, where he became
full professor of mathematics in 1954, moving then to Trieste.
A problem is apparently given by the mass of the vector bosons: the photon is
massless, while the W and Z bosons are highly massive. Indeed an appropriate
spontaneous symmetry breaking of the electroweak Lagrangian explains the masses
of the W ± and of the Z keeping the photon massless, and predicts the existence of
a Higgs boson, which is called the standard model Higgs boson. The same Higgs
boson can account for the masses of fermions. We shall see now how this unification
is possible.
We used the symmetry group SU(2) to model weak interactions, while U(1) is the
symmetry of QED. The natural space for a unified electroweak interaction appears
thus to be SU(2) ⊗ U(1)—this is what the Glashow–Weinberg–Salam electroweak
theory assumed at the end of the 1960s.
Let us call W^1, W^2, and W^0 the three gauge fields of SU(2). We call W^a_{μν} (a =
1, 2, 3) the field tensors of SU(2) and B_{μν} the field tensor of U(1). Notice that B_{μν}
is not equal to F_{μν}, as W^0 is not the Z field: since we use a tensor product of the
two spaces, in general the neutral field B can mix with the neutral field W^0, and the
photon and Z states are linear combinations of the two.
The Lagrangian of the electroweak interaction needs to accommodate some exper-
imental facts, which we have discussed in Chap. 6:
• Only the left-handed (right-handed) (anti)fermion chiralities participate in weak
transitions—therefore, the interaction violates parity P and charge conjugation C;
however, the combined transformation CP is still a good symmetry.
• The W± bosons couple to the left-handed fermionic doublets, where the electric
charges of the two fermion partners differ by one unit. This leads to the following
decay channels for the W⁻:
Thus, the weak eigenstates d′, s′, b′ are different from the mass eigenstates
d, s, b. They are related through the 3 × 3 unitary matrix V_CKM, which charac-
terizes flavor-mixing phenomena.
• The neutral carriers of the electroweak interactions have fermionic couplings with
the following properties:
– All interacting vertices conserve flavor. Both the γ and the Z couple to a fermion
and its own antifermion, i.e., γ f f¯ and Z f f¯.
– The interactions depend on the fermion electric charge Q f . Neutrinos do not
have electromagnetic interactions (Q ν = 0), but they have a nonzero coupling
to the Z boson.
– Photons have the same interaction for both fermion chiralities.
• The strength of the interaction is universal, and lepton number is conserved.
We are ready now to draft the electroweak theory.
To describe weak interactions, the left-handed fermions should appear in dou-
blets, and the right-handed fermions in singlets, and we would like to have massive
gauge bosons W ± and Z in addition to the photon. The simplest group with doublet
representations having three generators is SU(2). The inclusion of the electromag-
netic interactions implies an additional U(1) group. Hence, the symmetry group to
consider is then
G ≡ SU(2) L ⊗ U(1)Y , (7.21)
where L refers to left-handed fields (this will represent the weak sector). We shall
specify later the meaning of the subscript Y .
Let us first analyze the SU(2) part of the Lagrangian.
The SU(2) L part. We have seen that the W ± couple to the left chirality of fermionic
doublets—what was called a (V − A) coupling in the “old” scheme (Sect. 6.3.3). Let
us start for simplicity our “modern” description from a leptonic doublet
$$\chi_L = \begin{pmatrix} \nu \\ e \end{pmatrix}_L .$$
These two currents are associated, for example, with weak decays of muons and
neutrons. Notice that
$$\frac{1}{2}\,(1 - \gamma_5)\;\frac{1}{2}\,(1 - \gamma_5) = \frac{1}{2}\,(1 - \gamma_5) \,; \qquad (7.24)$$
$$\frac{1}{2}\,(1 - \gamma_5)\;\frac{1}{2}\,(1 + \gamma_5) = 0 \,. \qquad (7.25)$$
This should be evident from the physical meaning of these projectors; we leave a
formal demonstration of these properties for the exercises.
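The projector identities (7.24) and (7.25) are also easy to verify numerically. A minimal sketch in Python (numpy), using γ₅ in the chiral (Weyl) representation, where it is diagonal:

```python
import numpy as np

# Chirality projectors built from gamma5 = diag(-1, -1, +1, +1) (chiral basis).
gamma5 = np.diag([-1.0, -1.0, 1.0, 1.0])
I4 = np.eye(4)
P_L = (I4 - gamma5) / 2
P_R = (I4 + gamma5) / 2

assert np.allclose(P_L @ P_L, P_L)               # Eq. (7.24): P_L is idempotent
assert np.allclose(P_L @ P_R, np.zeros((4, 4)))  # Eq. (7.25): P_L P_R = 0
assert np.allclose(P_L + P_R, I4)                # completeness: P_L + P_R = 1
print("projector identities verified")
```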
In analogy to the case of hadronic isospin, where the proton and neutron are
considered as the two isospin eigenstates of the nucleon, we define a weak isospin
doublet structure (T = 1/2)
$$\chi_L = \begin{pmatrix} \nu \\ e \end{pmatrix}_L \quad \begin{matrix} T_3 = +1/2 \\ T_3 = -1/2 \end{matrix} \,, \qquad (7.26)$$
with raising and lowering operators between the two components of the doublet
$$\tau_\pm = \frac{1}{2}\,(\tau_1 \pm i\tau_2) \qquad (7.27)$$
where the τi are the Pauli matrices.
The same formalism applies to a generic quark doublet, for example
$$\chi_L = \begin{pmatrix} u \\ d \end{pmatrix}_L \quad \begin{matrix} T_3 = +1/2 \\ T_3 = -1/2 \end{matrix} \,. \qquad (7.28)$$
$$j_\mu^+ = \bar\chi_L \gamma_\mu \tau_+ \chi_L \qquad (7.29)$$
$$j_\mu^- = \bar\chi_L \gamma_\mu \tau_- \chi_L \,. \qquad (7.30)$$
When imposing the SU(2) symmetry, one has two vector fields W^1 and W^2 cor-
responding to the Pauli matrices τ₁ and τ₂. Notice that they do not necessarily
correspond to "good" particles since, for example, they are not necessarily eigenstates
of the electric charge operator. However, we have seen that they can be combined
into physical states corresponding to the charged currents W± (Eq. 7.27):
$$W^\pm = \frac{1}{\sqrt{2}}\left(W^1 \mp i\, W^2\right) . \qquad (7.31)$$
with algebra
$$[\tau_i, \tau_j] = i\,\epsilon_{ijk}\,\tau_k \,; \qquad (7.34)$$
from this, we can construct the Lagrangian according to the recipes in Sect. 6.4.1.
Before doing so, let us examine the U(1) part of the Lagrangian.
The U(1)_Y part. The electromagnetic current
$$j_\mu^{em} = \bar e \gamma_\mu e = \bar e_L \gamma_\mu e_L + \bar e_R \gamma_\mu e_R$$
is invariant under U(1)_Q, the gauge group of QED associated with the electric
charge. It is, however, not invariant under SU(2)_L: it contains e_L instead of χ_L.
The neutral isospin current
$$j_\mu^3 = \bar\chi_L \gamma_\mu \frac{\tau_3}{2} \chi_L = \frac{1}{2}\,\bar\nu_L \gamma_\mu \nu_L - \frac{1}{2}\,\bar e_L \gamma_\mu e_L \qquad (7.35)$$
couples only to left-handed particles, while we know that the neutral current involves
both chiralities.
To have a consistent picture, we must construct a SU(2) L -invariant U(1) current.
We define a hypercharge
$$Y = 2\,(Q - T_3) \,. \qquad (7.36)$$
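A quick numeric illustration of Eq. (7.36) shows that the two members of each left-handed doublet indeed share the same hypercharge. The (Q, T₃) assignments below are the standard first-generation ones, listed here as an input rather than taken from the text above:

```python
# Hypercharge Y = 2(Q - T3), Eq. (7.36), for the first-generation fermions.
fermions = {                    # name: (Q, T3)
    "nu_eL": (0.0, +0.5), "e_L": (-1.0, -0.5), "e_R": (-1.0, 0.0),
    "u_L": (2/3, +0.5), "d_L": (-1/3, -0.5),
    "u_R": (2/3, 0.0), "d_R": (-1/3, 0.0),
}
Y = {name: 2 * (Q - T3) for name, (Q, T3) in fermions.items()}
for name, y in Y.items():
    print(f"{name:6s} Y = {y:+.3f}")
# Members of the same left-handed doublet (nu_eL, e_L) and (u_L, d_L) share Y,
# so Y is a good U(1) quantum number compatible with SU(2)_L.
```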
$$\mathcal{L}_{int} = -i g\, j_\mu^a W^{a\mu} - i\,\frac{g'}{2}\, j_\mu^Y B^\mu \,, \qquad (7.38)$$
while the part related to the gauge fields is
$$\mathcal{L}_g = -\frac{1}{4}\, W^{a\mu\nu} W^a_{\mu\nu} - \frac{1}{4}\, B^{\mu\nu} B_{\mu\nu} \,, \qquad (7.39)$$
where W^{aμν} (a = 1, 2, 3) and B^{μν} are the field strength tensors for the weak isospin
and weak hypercharge fields. In the above, we have called g the strength of the SU(2)
coupling, and g′ the strength of the hypercharge coupling. The field tensors above
can be explicitly written as
$$W^a_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu - g\,\epsilon_{abc}\, W^b_\mu W^c_\nu \qquad (7.40)$$
$$B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu \,. \qquad (7.41)$$
with
$$D_\mu = \partial_\mu + i g\, W^a_\mu \frac{\tau^a}{2} + i g' \frac{Y}{2}\, B_\mu \,. \qquad (7.43)$$
At this point, the four gauge bosons W a and B are massless. But we know that
the Higgs mechanism can solve this problem.
In addition, we have not yet introduced fermion masses. When discussing electro-
magnetism and QCD as gauge theories, we put the fermion masses "by hand" into the
Lagrangian. This is not possible here, since an explicit mass term would break the
SU(2) symmetry: a mass term $-m_f \bar\psi_f \psi_f$ for each fermion f in the Lagrangian
would give, for the electron for instance,
$$-m_e\, \bar e e = -m_e\, \bar e \left[\frac{1}{2}\,(1-\gamma_5) + \frac{1}{2}\,(1+\gamma_5)\right] e = -m_e\,\left(\bar e_R e_L + \bar e_L e_R\right) \qquad (7.44)$$
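That the Dirac mass term mixes the two chiralities, as Eq. (7.44) states, can be checked numerically. The sketch below (Python/numpy, with γ₀ and γ₅ in the chiral representation) verifies ψ̄ψ = ψ̄_R ψ_L + ψ̄_L ψ_R for a random spinor:

```python
import numpy as np

# Chiral representation: gamma0 is off-diagonal, gamma5 is diagonal.
gamma0 = np.block([[np.zeros((2, 2)), np.eye(2)],
                   [np.eye(2), np.zeros((2, 2))]])
gamma5 = np.diag([-1.0, -1.0, 1.0, 1.0])
P_L = (np.eye(4) - gamma5) / 2
P_R = (np.eye(4) + gamma5) / 2

rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)   # a random Dirac spinor
bar = lambda x: x.conj() @ gamma0                    # psi_bar = psi^dagger gamma0

lhs = bar(psi) @ psi                                 # the mass term psi_bar psi
rhs = bar(P_R @ psi) @ (P_L @ psi) + bar(P_L @ psi) @ (P_R @ psi)
assert np.isclose(lhs, rhs)                          # Eq. (7.44) decomposition
print("psi_bar psi = psi_bar_R psi_L + psi_bar_L psi_R verified")
```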
In Sect. 7.1.4, the Higgs mechanism was used to generate a mass for the gauge boson
corresponding to a U(1) local gauge symmetry. In this case, three Goldstone bosons
will be required (we need them to give mass to W + , W − , and Z ). In addition, after
symmetry breaking, there will be (at least) one massive scalar particle corresponding
to the field excitations in the direction picked out by the choice of the physical
vacuum.
The simplest Higgs field, which has the necessary four degrees of freedom, con-
sists of two complex scalar fields, placed in a weak isospin doublet. One of the scalar
fields will be chosen to be charged with charge +1 and the other to be neutral. The
hypercharge of the doublet components will thus be Y = 2(Q − T3 ) = 1. The Higgs
doublet is then written as
$$\phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \phi_1 + i\phi_2 \\ \phi_3 + i\phi_4 \end{pmatrix} , \qquad (7.45)$$
where, as in Sect. 7.1.4, we have introduced the new field around the point of
minimum (we call it h instead of σ).
• As usual in the SSB, the h field acquires mass; we shall call the corresponding
particle H . This is the famous standard model Higgs boson, and its mass is
$$m_H = \sqrt{-2\mu^2} = \sqrt{2\lambda}\; v \,. \qquad (7.50)$$
• We now analyze the term (7.48). We have two massive charged bosons W 1 and W 2
with the same mass gv/2. We have seen, however, that physical states of integer
charge ±1 can be constructed by a linear combination of them (Eq. 7.31):
$$W^\pm = \frac{1}{\sqrt{2}}\left(W^1 \mp i\, W^2\right) ,$$
with mass
$$M_{W^\pm} = \frac{1}{2}\, g v \,, \qquad (7.51)$$
and these states correspond naturally to the charged current vectors.
• Finally, let us analyze the term (7.49).
Here, the fields W 3 and B couple through a non-diagonal matrix; they thus are not
mass eigenstates. The physical fields can be obtained by an appropriate rotation
which diagonalizes the mass matrix
$$M = \begin{pmatrix} g^2 & -g g' Y \\ -g g' Y & g'^2 \end{pmatrix} .$$
For Y = ±1 (we recall our choice Y = 1), the determinant of the matrix is zero,
and when we diagonalize it, one of the two eigenstates will be massless. If we
introduce the fields Aµ and Zµ defined as
where the angle θ_W, called the Weinberg angle, parametrizes the electroweak
mixing:
$$\tan\theta_W = \frac{g'}{g} \,, \qquad (7.54)$$
Aµ is then massless (we can identify it with the photon). Note that
$$M_Z = \frac{1}{2}\, v\, \sqrt{g^2 + g'^2} \,, \qquad (7.56)$$
and thus
MW = M Z cos θW . (7.57)
From the above expression, using the measured masses of the W and Z bosons,
we can get an estimate of the Weinberg angle:
$$\sin^2\theta_W \simeq 1 - \frac{M_W^2}{M_Z^2} \simeq 1 - \left(\frac{80.385\ \text{GeV}}{91.188\ \text{GeV}}\right)^2 \simeq 0.22 \,. \qquad (7.58)$$
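The tree-level relations collected so far can be checked numerically. In the Python sketch below, G_F and m_H are assumed measured inputs (not quoted in the text at this point), and the standard relation G_F/√2 = 1/(2v²), obtained by matching W exchange to the Fermi theory at low q², is used to extract v:

```python
import math

M_W, M_Z = 80.385, 91.188      # GeV, as in Eq. (7.58)
G_F = 1.1664e-5                # GeV^-2, Fermi constant (assumed measured input)
m_H = 125.1                    # GeV, measured Higgs boson mass (assumed input)

sin2_thetaW = 1 - (M_W / M_Z) ** 2     # Eq. (7.58), tree level
v = (math.sqrt(2) * G_F) ** -0.5       # from G_F/sqrt(2) = 1/(2 v^2)
g = 2 * M_W / v                        # Eq. (7.51): M_W = g v / 2
lam = m_H ** 2 / (2 * v ** 2)          # Eq. (7.50): m_H = sqrt(2 lambda) v

print(f"sin^2(theta_W) ~ {sin2_thetaW:.3f}")           # ~0.22
print(f"v ~ {v:.1f} GeV, g ~ {g:.3f}, lambda ~ {lam:.3f}")
```

One obtains v ≈ 246 GeV, g ≈ 0.65, and a quartic self-coupling λ ≈ 0.13: all the free parameters of the Higgs sector are fixed by measured quantities.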
Note the use of the ≃ symbols: this result has been obtained only at tree level of the
electroweak theory, while in the determination of the actual masses of the W and
Z bosons higher-order terms enter, related for example to QCD loops. Higher-
order processes ("radiative corrections") should be taken into account to obtain a
fully consistent picture, as we shall see later; the current "best fit" value of the Weinberg
angle is
$$\sin^2\theta_W = 0.2318 \pm 0.0006 \,. \qquad (7.59)$$
$$Q = T_3 + \frac{Y}{2} \,. \qquad (7.60)$$
Thus the covariant derivative can be written as
$$D_\mu = \partial_\mu + i g\, W^a_\mu \frac{\tau^a}{2} + i g' \frac{Y}{2}\, B_\mu$$
$$= \partial_\mu + i\,\frac{g}{\sqrt{2}}\left(W^+_\mu \tau^+ + W^-_\mu \tau^-\right) + i g \sin\theta_W\, Q\, A_\mu + i\,\frac{g}{\cos\theta_W}\left(\frac{\tau^3}{2} - \sin^2\theta_W\, Q\right) Z_\mu \qquad (7.61)$$
and thus
g sin θW = e . (7.62)
The above relation also holds only at tree level: we shall see in Sect. 7.4 that
radiative corrections can influence at a measurable level the actual values of the
observables of the standard model.
Up to now the fermion fields in the theory are massless, and we have seen (Eq. 7.44)
that we cannot insert them by hand in the Lagrangian as we did in the case of QED and
QCD. A simple way to explain the fermion masses consistent with the electroweak
interaction is to ascribe such a mass to the Higgs mechanism, again: masses appear
after the SSB.
The problem can be solved by introducing in the Lagrangian a coupling of the Higgs
doublet to the fermions by means of gauge-invariant terms like (for the electron)
$$\mathcal{L}_{eeh} = -\frac{\lambda_e}{\sqrt{2}} \left[ (\bar\nu_e,\ \bar e)_L \begin{pmatrix} 0 \\ v + h \end{pmatrix} e_R + \bar e_R\, (0,\ v + h) \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L \right] . \qquad (7.63)$$
$$\mathcal{L}_{eeh} = -\frac{\lambda_e v}{\sqrt{2}}\, \bar e e - \frac{\lambda_e}{\sqrt{2}}\, \bar e e\, h \,. \qquad (7.64)$$
Since the symmetry breaking term λ_f for each fermion field is unknown, the
masses of the fermions in the theory are free parameters, and must be determined
by experiment. Setting $\lambda_f = \sqrt{2}\, m_f / v$, the part of the Lagrangian describing the
fermion masses and the fermion–Higgs interaction is, for each fermion,
$$\mathcal{L}_{ffh} = -m_f\, \bar f f - \frac{m_f}{v}\, \bar f f\, h \,. \qquad (7.65)$$
Notice that the coupling to the Higgs is proportional to the mass of the fermion: this
is a strong prediction, to be verified by experiment.
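To get a feeling for the range of these free parameters, one can evaluate λ_f = √2 m_f/v for a few fermions. In the Python sketch below the masses and v ≈ 246 GeV are assumed measured values, not quoted in the text:

```python
import math

v = 246.22                      # GeV, Higgs vacuum expectation value (assumed input)
masses = {                      # GeV, approximate measured fermion masses
    "electron": 0.000511, "muon": 0.1057, "tau": 1.777,
    "b quark": 4.18, "top quark": 172.8,
}
for name, m in masses.items():
    lam_f = math.sqrt(2) * m / v      # Yukawa coupling lambda_f = sqrt(2) m_f / v
    print(f"{name:10s} lambda_f = {lam_f:.2e}")
# The Yukawa couplings span almost six orders of magnitude; only the top quark
# has a coupling of order unity.
```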
We stress the fact that we need just one Higgs field to explain all massive particles
of the standard model: the weak vector bosons W ± , Z , fermions, and the Higgs
boson itself. The electromagnetic symmetry and the SU(3) color symmetry both
remain unbroken—the former being an accidental symmetry of SU(2) L ⊗ U(1)Y .
This does not exclude, however, that additional Higgs fields exist—assuming just
one is a matter of economy of the theory or, if you prefer, a consequence of Occam's
razor, to the best of our present knowledge.
Using Eq. (7.61), we can write the interaction terms between the gauge bosons and
the fermions (we use a lepton doublet as an example, but the result is general) as
$$\mathcal{L}_{int} = -\frac{g}{2\sqrt{2}}\, \bar\nu_e \gamma^\mu (1-\gamma_5)\, e\; W^+_\mu - \frac{g}{2\sqrt{2}}\, \bar e \gamma^\mu (1-\gamma_5)\, \nu_e\; W^-_\mu$$
$$- \frac{g}{4\cos\theta_W} \left[ \bar\nu_e \gamma^\mu (1-\gamma_5)\,\nu_e - \bar e \gamma^\mu (1 - 4\sin^2\theta_W - \gamma_5)\, e \right] Z_\mu - (-e)\, \bar e \gamma^\mu e\; A_\mu \,. \qquad (7.66)$$
$$g_V^f = \frac{1}{2}\, T_3^f - Q_f \sin^2\theta_W \,; \qquad g_A^f = \frac{1}{2}\, T_3^f \,. \qquad (7.68)$$
The Z interaction can also be written considering the left and right helicity states
of the fermions. Indeed, for a generic fermion f,
$$\bar\psi_f \gamma^\mu \left(g_V^f - g_A^f \gamma_5\right) \psi_f = \bar\psi_f \gamma^\mu \left[ \frac{1}{2}\left(g_V^f + g_A^f\right)(1-\gamma_5) + \frac{1}{2}\left(g_V^f - g_A^f\right)(1+\gamma_5) \right] \psi_f \qquad (7.69)$$
$$= \bar\psi_{fL}\, \gamma^\mu g_L\, \psi_{fL} + \bar\psi_{fR}\, \gamma^\mu g_R\, \psi_{fR} \,, \qquad (7.70)$$
where the left and right couplings g_L and g_R are thus given by
$$g_L = \frac{1}{2}\left(g_V + g_A\right) \qquad (7.71)$$
$$g_R = \frac{1}{2}\left(g_V - g_A\right) . \qquad (7.72)$$
In the case of neutrinos, Q = 0 and g_V = g_A, and thus g_R = 0. The right-handed
neutrino then has no interaction with the Z and, as by construction it also has no
interactions with the γ, the W±, and the gluons, the right-handed neutrino is therefore,
if it exists, sterile.
On the contrary, for electrically charged fermions, g_V ≠ g_A, and thus the Z boson
couples to both left and right helicity states, although with different strengths
(g_L ≠ g_R ≠ 0).
Parity is also violated in the Z interactions.
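The couplings above are easy to tabulate. A small Python sketch (with sin²θ_W ≈ 0.2318, the best-fit value quoted earlier) makes the sterile right-handed neutrino explicit:

```python
sin2_thetaW = 0.2318   # best-fit value of sin^2(theta_W)

def couplings(T3, Q):
    """Return (gV, gA, gL, gR) from Eqs. (7.68), (7.71), (7.72)."""
    gV = 0.5 * T3 - Q * sin2_thetaW
    gA = 0.5 * T3
    return gV, gA, 0.5 * (gV + gA), 0.5 * (gV - gA)

for name, T3, Q in [("neutrino", +0.5, 0.0), ("electron", -0.5, -1.0),
                    ("u quark", +0.5, 2/3), ("d quark", -0.5, -1/3)]:
    gV, gA, gL, gR = couplings(T3, Q)
    print(f"{name:8s} gV={gV:+.4f} gA={gA:+.4f} gL={gL:+.4f} gR={gR:+.4f}")
# The neutrino has gR = 0 exactly; for charged fermions gL != gR != 0, so the
# Z couples to both helicities with different strengths.
```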
These results are only valid in the one-family approximation. When extending
to three families, there is a complication: in particular, the current eigenstates for
quarks q′ are not identical to the mass eigenstates q. If we take the u-type quarks
to be mass eigenstates, then in the down-type quark sector the two sets are connected
by a unitary transformation
Let us compute as an example the differential cross section for processes involving
electroweak currents. We shall not discuss the determination of the absolute value,
but just the dependence on the flavor and on the angle.
Let us examine fermion–antifermion (f f̄) production in e⁺e⁻ annihilations.
At a center-of-mass energy smaller than the Z mass, the photon coupling will dom-
inate the process. The branching fractions will be dominated by photon exchange,
and thus proportional to Q_f² (being zero in particular for neutrinos).
Close to the Z mass, the process will be dominated by decays Z → f f¯ and the
f f f f
amplitude will be proportional to (gV + g A ) and (gV g A ), respectively, for left and
right fermions. The width into f f¯ will be then proportional to
$$\left(g_V^f + g_A^f\right)^2 + \left(g_V^f - g_A^f\right)^2 = 2\left[\left(g_V^f\right)^2 + \left(g_A^f\right)^2\right]\,. \quad (7.74)$$
one has

$$\Gamma(Z\to f\bar f) = \frac{2\,G_F M_Z^3}{3\sqrt{2}\,\pi}\left[\left(g_V^f\right)^2 + \left(g_A^f\right)^2\right]\,. \quad (7.77)$$
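Plugging standard input values into Eq. (7.77) reproduces the familiar partial widths; the sketch below assumes round values of G_F, M_Z, and sin²θW:

```python
import math

# Tree-level partial width of the Z into a fermion pair, Eq. (7.77):
#   Gamma(Z -> f fbar) = (2 G_F M_Z^3)/(3 sqrt(2) pi) * (gV^2 + gA^2)
G_F = 1.1663787e-5        # GeV^-2 (assumed input)
M_Z = 91.1876             # GeV    (assumed input)
SIN2 = 0.23               # assumed illustrative value of sin^2 thetaW

def gamma_Z_ff(T3, Q, n_colors=1):
    gV = 0.5 * T3 - Q * SIN2
    gA = 0.5 * T3
    pref = 2.0 * G_F * M_Z**3 / (3.0 * math.sqrt(2.0) * math.pi)
    return n_colors * pref * (gV**2 + gA**2)   # in GeV

gamma_nu = gamma_Z_ff(+0.5, 0.0)
gamma_e = gamma_Z_ff(-0.5, -1.0)
print(f"Gamma(Z -> nu nubar) ~ {1e3 * gamma_nu:.0f} MeV")  # ~166 MeV
print(f"Gamma(Z -> e+ e-)    ~ {1e3 * gamma_e:.1f} MeV")   # ~83.5 MeV
```

Both numbers come out close to the measured partial widths, which is the content of the tree-level statement above.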
where the combinations A_f are given, in terms of the vector and axial vector couplings
of the fermion f to the Z boson, by

$$A_f = \frac{2\,g_V^f g_A^f}{\left(g_V^f\right)^2 + \left(g_A^f\right)^2}\,, \quad (7.80)$$

and A_e is the corresponding combination for the specific case of the electron.
The tree-level expressions discussed above give results which are correct at the percent
level; in the case of b-quark final states, additional mass effects O(4m_b²/M_Z²),
also ∼0.01, have to be taken into account. For the production of e⁺e⁻ final states,
the t-channel gauge boson exchange contribution has to be included (this process
allows one to determine the absolute luminosity at e⁺e⁻ colliders, which makes it particularly
important); it is dominant at low angles, the cross section being proportional to
1/sin³θ. However, one needs to include the one-loop radiative corrections so that the
4 For a deduction, see for instance Chap. 16.2 of Reference [F7.2] in the “Further readings.”
Self-couplings among the gauge bosons are present in the SM as a consequence of the
non-abelian nature of the SU(2)L ⊗ U(1)Y symmetry. These couplings are dictated
by the structure of the symmetry group as discussed before and, for instance, the
triple self-couplings among the W and the V = γ, Z bosons are given by

$$\mathcal{L}_{WWV} = i g_{WWV}\left(W_{\mu\nu}^\dagger W^\mu V^\nu - W_\mu^\dagger V_\nu W^{\mu\nu} + W_\mu^\dagger W_\nu V^{\mu\nu}\right) \quad (7.81)$$
We have already shown how to compute the invariant amplitude M for scalar fields
in Sect. 6.2.7. We give here only the Feynman rules for the propagators (Fig. 7.4) and
vertices (Fig. 7.5) of the standard model, which we can use in our calculations (or our
estimates, since the calculation of the complete amplitude including spin effects can
be very lengthy and tedious). We follow here Ref. [F7.5] in the "Further readings";
a complete treatment of the calculation of amplitudes from the Feynman diagrams
can be found in Ref. [F7.1]. Note that we do not provide the QCD terms, since the few
perturbative calculations practically feasible in QCD involve a very large number of
graphs.
Fig. 7.5 Terms associated with vertices in the electroweak model. Adapted from [F7.5]
The Lagrangian of the standard model is the sum of the electroweak Lagrangian
(including the Higgs terms, which are responsible for the masses of the W± bosons,
of the Z, and of the leptons) plus the QCD Lagrangian without the fermion mass
terms.
The SM Higgs boson is thus the Higgs boson of the electroweak Lagrangian.
In accordance with relation (7.65), the interaction of the Higgs boson with a
fermion is proportional to the mass of the fermion itself: $g_{Hff} = \dfrac{m_f}{v}$.
One finds that the Higgs boson couplings to the electroweak gauge bosons are
instead proportional to the squares of their masses:

$$g_{HWW} = \frac{2 M_W^2}{v}\,,\quad g_{HHWW} = \frac{M_W^2}{v^2}\,,\quad g_{HZZ} = \frac{M_Z^2}{v}\,,\quad g_{HHZZ} = \frac{M_Z^2}{2v^2}\,. \quad (7.82)$$
Among the consequences is the prediction of the branching fractions for the decays
of the Higgs boson, which is discussed later.
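The linear-in-mass fermion couplings and the quadratic-in-mass boson couplings of Eq. (7.82) can be made concrete with a few assumed input masses (v ≈ 246 GeV); the values below are illustrative, not a precision computation:

```python
# Higgs couplings in the SM: to fermions, g_Hff = m_f / v (linear in mass);
# to gauge bosons, proportional to the squared masses, Eq. (7.82).
v = 246.0                                          # GeV, assumed value of the vev
masses = {"b": 4.18, "tau": 1.777, "mu": 0.1057}   # GeV, assumed fermion masses

for f, m in masses.items():
    print(f"g_H{f}{f} = m_{f}/v = {m / v:.5f}")

M_W, M_Z = 80.38, 91.19       # GeV, assumed boson masses
g_HWW = 2.0 * M_W**2 / v      # Eq. (7.82)
g_HZZ = M_Z**2 / v
print(f"g_HWW = {g_HWW:.1f} GeV, g_HZZ = {g_HZZ:.1f} GeV")
```

The hierarchy of these couplings (heavier particles couple more strongly) is what drives the Higgs branching fractions discussed later.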
The standard model describes in detail particle physics, at least at energies below or of
the order of the electroweak scale (gravity is not considered here). Its power has been
intensively and extensively demonstrated in the past thirty years by an impressive
number of experiments (see later in this chapter). However, it has a relatively large
set of "magic numbers" not defined by the theory, which thus have to be obtained
from measurements. The numerical values of these parameters differ
by more than 10 orders of magnitude (e.g., m_ν < 0.1 eV, m_t ∼ 0.2 TeV).
These free parameters may be listed in the hypothesis that neutrinos are standard
massive particles (the hypothesis that they are “Majorana” particles, i.e., fermions
coincident with their antiparticles, will be discussed in Chap. 9), as follows:
• In the gauge sector:
– the three gauge couplings (respectively of SU(2)L, U(1)Y, and SU(3)):
g, g′, gs .
• In the Higgs sector:
– the two parameters of the Higgs potential:
µ, λ .
• In the fermion sector:
– the twelve fermion masses:
mν1 , mν2 , mν3 , me , mµ , mτ , mu , md , mc , ms , mt , mb ;
– the four quark CKM mixing parameters (in the Wolfenstein parametrization,
see Sect. 6.3.7):
λ, A, ρ, η ;
– the four neutrino PMNS mixing parameters (see Sect. 9.1.1), which can be three
real angles and one phase:
θ12 , θ13 , θ23 , δ .
– in addition, the QCD Lagrangian admits in principle a CP-violating phase, which is experimentally consistent with zero:
θCP ≃ 0 . (7.83)
The bare masses of the electroweak gauge bosons, as well as their couplings, are
derived directly from the standard model Lagrangian after the Higgs spontaneous
symmetry breaking mechanism (see the previous sections):

• $m_\gamma = 0$, by construction;
• $m_Z = \frac{1}{2}\sqrt{g^2 + g'^2}\;v$;
• $m_W = \frac{1}{2}\,g\,v$;

and sin²θW can also be expressed as

$$\sin^2\theta_W = 1 - \frac{m_W^2}{m_Z^2} = \frac{\pi\alpha}{\sqrt{2}\,G_F\,m_W^2} = \frac{\pi\alpha}{\sqrt{2}\,G_F\,m_Z^2\cos^2\theta_W}\,. \quad (7.84)$$
Finally, the mass of the Higgs boson is given, as seen in Sect. 7.2.2, by

$$m_H = \sqrt{2\lambda}\;v\,. \quad (7.85)$$
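As a sketch, the tree-level relations above can be checked numerically; the inputs (α at the Z scale, sin²θW, v, and a quartic coupling λ chosen so that m_H comes out near 125 GeV) are assumed illustrative values:

```python
import math

# Tree-level gauge-boson and Higgs masses from the SM relations above:
#   m_W = g v / 2,  m_Z = sqrt(g^2 + g'^2) v / 2,  m_H = sqrt(2 lambda) v,
# with g reconstructed from e = g sin(thetaW).
v = 246.0                       # GeV (assumed vacuum expectation value)
alpha_mZ = 1.0 / 129.0          # running alpha at the Z scale (assumed)
sin2 = 0.2315                   # assumed sin^2 thetaW

e = math.sqrt(4.0 * math.pi * alpha_mZ)
g = e / math.sqrt(sin2)         # SU(2)_L coupling
gp = e / math.sqrt(1.0 - sin2)  # U(1)_Y coupling

m_W = 0.5 * g * v
m_Z = 0.5 * math.sqrt(g**2 + gp**2) * v
print(f"m_W ~ {m_W:.1f} GeV, m_Z ~ {m_Z:.1f} GeV")  # close to 80.4 and 91.2

lam = 0.13                      # hypothetical quartic coupling for m_H ~ 125 GeV
m_H = math.sqrt(2.0 * lam) * v
print(f"m_H ~ {m_H:.0f} GeV")
```

The few-percent offsets from the measured boson masses are precisely the radiative-correction effects discussed below.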
The standard model exhibits additional global symmetries, collectively denoted acci-
dental symmetries, which are continuous U(1) global symmetries which leave the
Lagrangian invariant. By Noether’s theorem, each symmetry has an associated con-
served quantity; in particular, the conservation of baryon number (where each quark
is assigned a baryon number of 1/3, while each antiquark is assigned a baryon number
of − 1/3), electron number (each electron and its associated neutrino is assigned an
electron number of + 1, while the anti-electron and the associated anti-neutrino carry
a − 1 electron number), muon number, and tau number are accidental symmetries.
Note that somehow these symmetries, although mathematically “accidental,” exist
by construction, since, when we designed the Lagrangian, we did not foresee gauge
particles changing the lepton number or the baryon number as defined before.
In addition to the accidental, but nevertheless exact, symmetries
described above, the standard model exhibits several approximate symmetries. Two
of them are particularly important:
• The SU(3) quark flavor symmetry, which recalls the symmetries in the "old"
hadronic models. This obviously includes the SU(2) quark flavor symmetry (the
strong isospin symmetry), which is less badly broken (only the two light quarks
being involved).
• The SU(2) custodial symmetry, which keeps

$$\rho = \frac{m_W^2}{m_Z^2\cos^2\theta_W} \simeq 1\,,$$

limiting the size of the contributions from loops involving the Higgs particle (see
the next section). This symmetry is exact before the SSB.
As was discussed in the case of QED (see Sect. 6.2.9), measurable quantities are
not directly the bare quantities present in the Lagrangian, but correspond to effective
quantities which "absorb" the infinities arising in higher-order diagrams due to
the presence of loops, for which integration over all possible momenta must be
performed. These effective renormalized quantities depend on the energy scale of
the measurement. This was the case for α and αs, as discussed in Sects. 6.2.10 and
6.4.4. The running of the electromagnetic coupling α,

$$\alpha\!\left(m_e^2\right) \simeq \frac{1}{137}\,;\qquad \alpha\!\left(m_Z^2\right) \simeq \frac{1}{129}\,,$$

implies for instance a sizeable change in the values of m_Z and m_W from those that
could be computed using the relations listed above ignoring this running, taking
for G_F and sin²θW the values measured at low energy (muon decay for G_F and
deep inelastic neutrino scattering for sin²θW).
In addition, QCD corrections to processes involving quarks can be important.
For example, at first perturbative order in αs (essentially taking into account the
emission of one gluon),

$$\frac{\Gamma(Z\to q\bar q)}{\Gamma(Z\to q\bar q)_{\rm leading}} \simeq 1 + \frac{\alpha_s}{\pi}\,; \quad (7.86)$$

the radiation of gluons from the final-state quarks increases the decay width linearly in αs. In fact
these QCD corrections are known to O(αs³) for q q̄ (see later), and the measurement
of the hadronic Z width provides the most precise estimate of αs; a better
approximation of Eq. (7.86) is given later.
Higher-order diagrams have thus to be carefully computed so that the high-precision
measurements obtained in the last decades (see the next section) can be
related to each other and, in that way, determine the level of consistency
(or violation) of the standard model. These corrections are often introduced as
a modification of the lowest-order formulas; for example,

$$\sin^2\theta_W\cos^2\theta_W = \frac{\pi\alpha}{\sqrt{2}\,G_F\,m_Z^2\,(1-\Delta r)}\,, \quad (7.87)$$

where Δr absorbs the radiative corrections.
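Solving Eq. (7.87) for Δr with assumed measured values of the masses and couplings gives a feel for the size of the radiative corrections; this is only a sketch, not a precision extraction:

```python
import math

# Extracting the radiative correction Delta r from Eq. (7.87):
#   sin^2 cos^2 thetaW = pi alpha / (sqrt(2) G_F m_Z^2 (1 - Delta r)).
alpha = 1.0 / 137.035999       # low-energy fine structure constant (assumed)
G_F = 1.1663787e-5             # GeV^-2 (assumed)
m_W, m_Z = 80.38, 91.1876      # GeV, assumed measured masses

sin2 = 1.0 - (m_W / m_Z)**2    # on-shell definition of sin^2 thetaW
cos2 = 1.0 - sin2
lhs = sin2 * cos2
rhs_tree = math.pi * alpha / (math.sqrt(2.0) * G_F * m_Z**2)
delta_r = 1.0 - rhs_tree / lhs
print(f"sin^2 thetaW = {sin2:.4f}, Delta r ~ {delta_r:.3f}")  # Delta r ~ 0.035
```

A few-percent Δr is consistent with the size of the loop effects (top and Higgs loops, running of α) described in the text.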
The departure of $\rho_Z^f$ from unity ($\Delta\rho_Z^f$) is again a function of $m_t^2$ and of $\ln(m_H)$.
Another way to incorporate radiative corrections in the calculations, frequently
used in the literature, is to absorb them in the Weinberg angle, which then becomes
an “effective” quantity. The form of the calculations then stays the same as that at
leading order, but with an “effective” angle instead of the “bare” angle.
This approach is the simplest—and probably the most common in the literature—
but one has to take into account that, at higher orders, the “effective” value of the
angle will be different for different processes.
Fig. 7.7 Calculated mass of the top quark (left) and of the Higgs boson (right) from fits of experimental data to the standard model, as a function of the year. The lighter bands represent the theoretical predictions at 95 % confidence level, the darker bands at 68 % confidence level. The points with error bars represent the experimental values after the discovery of these particles. From http://project-gfitter.web.cern.ch/project-gfitter/
Global fits to electroweak precision data (see for instance the Gfitter project at
CERN), taking properly into account correlations between standard model observ-
ables, have so far impressively shown the consistency of the standard model and its
predictive power.
Both the mass of the top quark and mass of the Higgs boson were predicted
before their discovery (Fig. 7.7), and the masses measured by experiment confirmed
the prediction.
Many other consistency tests at accelerators confirmed the validity of the standard
model; we shall discuss them in the next section.
The SM has been widely tested, also in the cosmological regime and with high-precision
table-top experiments at sub-GeV energies. The bulk of the tests, however,
has been performed at particle accelerators, which span a wide range of high energies.
In particular, from 1989 to 1995, the Large Electron–Positron collider (LEP) at
CERN provided collisions at center-of-mass energies near the Z mass; four large
state-of-the-art detectors (ALEPH, DELPHI, L3 and OPAL) recorded about 17 million
Z decays. Almost at the same time, the SLD experiment at the SLAC laboratory
near Stanford, California, collected 600,000 Z events at the SLAC Linear Collider
(SLC), with the added advantage of a longitudinally polarized electron beam;
polarization provided additional opportunities to test the SM. LEP was later upgraded
to higher energies starting from 1996, eventually topping out at a center-of-mass
energy of about 210 GeV at the end of 2000. In this second phase, LEP could
produce and study at a good rate all SM particles except the top quark and the
Higgs boson; in particular, it produced a huge sample of pairs of W and Z bosons.
The Tevatron circular accelerator at Fermilab, near Chicago, collided protons
and antiprotons in a 7-km ring at energies of up to 1 TeV per beam. It was completed in 1983,
and its main achievement was the discovery of the top quark in 1995 by the scientists
of the CDF and D0 detectors. The Tevatron ceased operations in 2011, after the
LHC had started stable operations in early 2010.
Finally, the Large Hadron Collider (LHC) was built in the 27-km-long LEP tunnel;
it collides pairs of protons (and sometimes heavy ions). It started stable operation
in 2010, and has progressively increased its center-of-mass energy toward the design value
of 14 TeV for proton–proton collisions. Its main result has been the discovery of
the Higgs boson.
All these accelerators provided extensive tests of the standard model; we shall
review them in this section.
Before summarizing the main SM results at LEP/SLC, at the Tevatron and at the
LHC, let us briefly recall some of the earlier accelerator results which gave the
scientific community confidence in the electroweak part of the standard model, and
which were already presented in Chap. 6:
• The discovery of the weak neutral currents by Gargamelle in 1972.
A key prediction of the electroweak model was the existence of neutral currents
mediated by the Z. These currents are normally difficult to reveal, since they are
hidden by the much more probable photon-mediated interactions. However, the reactions

ν̄µ + e⁻ → ν̄µ + e⁻ ;   νµ + N → νµ + X

can happen neither via photon exchange nor via W exchange. The experimental
discovery of these reactions came from bubble-chamber events recorded by
Gargamelle, a giant bubble chamber: with a length of 4.8 m and a diameter of
nearly 2 m, it held nearly 12 m³ of liquid freon, and operated from 1970
to 1978 with a muon neutrino beam produced by the CERN Proton Synchrotron.
The first neutral current event was observed in December 1972, and the detection
was published with larger statistics in 1973; in the end, approximately 83,000
neutrino interactions were analyzed, and 102 neutral current events were observed.
Gargamelle is now on exhibition in the CERN garden.
• The discovery of a particle made of a charm quark and antiquark (the J/ψ) in 1974. Charm
was essential to explain the absence of strangeness-changing neutral currents (through
the so-called GIM mechanism, discussed in the previous chapter).
• The discovery of the W and Z bosons at the CERN Sp p̄S collider in 1983, in the
predicted mass range and consistent with the relation m_Z ≃ m_W / cos θW.
In the context of the Minimal Standard Model (MSM), neglecting the neutrino masses
(which are anyway very small), electroweak processes can be computed at tree level
from the electromagnetic coupling α, the weak coupling G_F, the Z mass M_Z, and
from the elements of the CKM mixing matrix.
When higher-order corrections and phase space effects are included, one has to add
to the above αs, m_H, and the masses of the particles. The calculations show that the
loops affecting the observables depend on the top mass through terms O(m_t²/M_Z²),
and on the Higgs mass through terms showing a logarithmic dependence, ln(m_H²/M_Z²),
plus, of course, on any kind of "heavy new physics" (see Sect. 7.4).
The set of the three SM variables which characterize the interaction is normally
taken as M_Z = 91.1876 ± 0.0021 GeV (derived from the Z line shape, see later),
G_F = 1.1663787(6) × 10⁻⁵ GeV⁻² (derived from the muon lifetime), and the fine
structure constant in the low-energy limit, α = 1/137.035999074(44), taken from
several electromagnetic observables; these quantities have the smallest experimental
errors.
One can measure the SM parameters through thousands of observables, with
partially correlated statistical and systematic uncertainties; redundancy can reveal
possible contradictions, pointing to new physics. This large set of results has been
reduced to a more manageable set of 17 precision results, called electroweak observables.
This was achieved by a model-independent procedure developed by the LEP
and Tevatron Electroweak Working Groups (groups of physicists from all around the
world charged with producing "official" fits to precision observables in the SM).
About three-fourths of all observables arise from measurements performed in
electron–positron collisions at the Z resonance by the LEP experiments ALEPH,
DELPHI, L3, and OPAL, and by the SLD experiment. The Z-pole observables are:
five observables describing the Z lineshape and the leptonic forward-backward asymmetries;
two observables describing polarized leptonic asymmetries, measured by SLD with
polarized beams and at LEP through the tau polarization; six observables describing
b- and c-quark production at the Z pole; and, finally, the inclusive hadronic
charge asymmetry. The remaining observables are the mass and total width of the W
boson, measured at LEP and at hadron accelerators, and the top quark mass, measured at
hadron accelerators. Recently, the Higgs mass has also been added to the list; the fact
that the Higgs mass has been found in the mass range predicted by the electroweak
observables is another success of the theory.
Figure 7.8 shows the comparison of the electroweak observables with the best fit
to the SM. One can appreciate the fact that the deviations from the fitted values are
consistent with statistical fluctuations.
Figure 7.9 shows the evolution of the hadronic cross section σ(e⁺e⁻ → hadrons)
with energy, compared with the predictions of the SM. This is an incredible success
of the SM, which quantitatively accounts for experimental data over a wide range of
energies:
• starting from a region (above the ϒ threshold and below some 50 GeV) where the
production is basically due to photon exchange, and σ ∝ 1/s,
• to a region in which the contributions from Z and γ are important and the Z /γ
interference has to be taken into account,
• to a region of Z dominance (Eq. 7.93), and
• to a region in which the W W channel opens and triple boson vertices become
relevant.
We describe in larger detail three of the most significant electroweak tests at LEP
in phase I: the partial widths of the Z , the forward-backward asymmetries, and the
study of the Z line shape, which has important cosmological implications. Finally, in
this section, we examine the characteristics of vertices involving three gauge bosons.
Partial Widths of the Z . The partial widths of the Z , possibly normalized to the total
width, are nontrivial parameters of the SM. Indeed, the evolution of the branching
fractions with energy due to the varying relative weights of the Z and γ couplings
are a probe into the theory.
Final states of the Z into µ⁺µ⁻ and τ⁺τ⁻ pairs can easily be identified. The e⁺e⁻
final state is also easy to recognize, but in this case the theoretical interpretation is less
trivial, since the process is dominated at low angles by t-channel exchange. Among
Fig. 7.8 Pull comparison of the fit results with the direct measurements in units of the experimental
uncertainty. The absolute value of the pull (i.e., of the difference between the measured value and the
fitted value divided by the uncertainty) of the Higgs mass is 0.0 (its value is completely consistent
with the theoretical fit)
Fig. 7.9 Evolution of the hadronic cross section σ(e⁺e⁻ → hadrons) with energy, compared with the predictions of the SM
hadronic final states, the b b̄ and c c̄ ones can be tagged using the finite lifetimes of the
primary hadrons (the typical lifetime of weakly decaying particles containing c
quarks is of the order of 0.1 ps, while the typical lifetime of weakly decaying particles containing b
quarks is of the order of 1 ps). The tagging of s s̄ final states is
more difficult and affected by larger uncertainties.
All these measurements gave results consistent with the predictions from the SM
(Table 7.1). The relative strengths of each coupling can be estimated by considering
that the decay rates include the square of these factors, and all possible diagrams
(e.g., summing over quark families, and over left and right contributions). As we are
considering only tree-level diagrams in the electroweak theory, this is naturally only
an estimate.
Also, the energy evolution of the partial widths from lower energies to near the
Z resonance is in agreement with the Z/γ mixing in the SM.
Z Asymmetries and sin²θ_eff. Like the cross section for Z → f f̄, the forward-backward
asymmetry

$$A_{FB}^f \equiv \frac{\sigma_F - \sigma_B}{\sigma_F + \sigma_B} = \frac{3}{4}\,A_e A_f\,, \quad (7.91)$$

where F (forward) means along the e⁻ direction, and where the combinations A_f are
given, in terms of the vector and axial vector couplings of the fermion f to the Z
boson, by

$$A_f = \frac{2\,g_V^f g_A^f}{\left(g_V^f\right)^2 + \left(g_A^f\right)^2}\,, \quad (7.92)$$
Table 7.1 Relative branching fractions of the Z into f f̄ pairs: predictions at leading order from
the SM (for sin²θW = 0.23) compared to experimental results

Particle                    gV                   gA      Predicted (%)   Experimental (%)
Neutrinos (all)             1/4                  1/4     20.5            20.00 ± 0.06
Charged leptons (all)                                    10.2            10.097 ± 0.003
  Electron                  −1/4 + sin²θW        −1/4    3.4             3.363 ± 0.004
  Muon                      −1/4 + sin²θW        −1/4    3.4             3.366 ± 0.007
  Tau                       −1/4 + sin²θW        −1/4    3.4             3.367 ± 0.008
Hadrons (all)                                            69.2            69.91 ± 0.06
  Down-type quarks d, s, b  −1/4 + 1/3 sin²θW    −1/4    15.2            15.6 ± 0.4
  Up-type quarks u, c       1/4 − 2/3 sin²θW     1/4     11.8            11.6 ± 0.6
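The "Predicted" column of Table 7.1 can be reproduced by summing (g_V^f)² + (g_A^f)² over the open channels, with a color factor of 3 for quarks; the sketch below assumes sin²θW = 0.23, as in the table:

```python
# Leading-order relative branching fractions of the Z (cf. Table 7.1):
# each channel's weight is colors * flavors * (gV^2 + gA^2).
SIN2 = 0.23   # assumed, as in the table

def gv2_ga2(T3, Q):
    gV = 0.5 * T3 - Q * SIN2
    gA = 0.5 * T3
    return gV**2 + gA**2

channels = {  # name: (T3, Q, colors, open flavors)
    "neutrinos":         (+0.5, 0.0,        1, 3),
    "charged leptons":   (-0.5, -1.0,       1, 3),
    "up-type (u, c)":    (+0.5, +2.0 / 3.0, 3, 2),
    "down-type (d,s,b)": (-0.5, -1.0 / 3.0, 3, 3),
}

weights = {k: c * f * gv2_ga2(T3, Q) for k, (T3, Q, c, f) in channels.items()}
total = sum(weights.values())
for k, w in weights.items():
    print(f"{k:18s}: {100.0 * w / total:5.1f} %")
# neutrinos ~20.5 %, charged leptons ~10.3 %, hadrons ~69.2 %
```

The sums come out within a fraction of a percent of the tabulated leading-order predictions.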
can be measured for all charged lepton flavors, for heavy quarks, with smaller accuracy
for s s̄ pairs, and inclusively for all five quark flavors (overall hadronic asymmetry).
It thus allows a powerful test of the SM.
One thus expects at the Z pole asymmetry values of about 7 % for up-type quarks,
about 10 % for down-type quarks, and about 2 % for leptons. Figure 7.8 shows that the
results are consistent with the SM predictions, and this observable is powerful in
constraining the value of sin²θW (Fig. 7.10).
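The quoted asymmetries of roughly 2 %, 7 % and 10 % follow directly from Eqs. (7.91) and (7.92); the sketch below assumes sin²θW ≈ 0.2315:

```python
# Leading-order forward-backward asymmetries at the Z pole:
#   A_FB^f = (3/4) A_e A_f,  A_f = 2 gV gA / (gV^2 + gA^2).
SIN2 = 0.2315   # assumed illustrative value

def A(T3, Q):
    gV = 0.5 * T3 - Q * SIN2
    gA = 0.5 * T3
    return 2.0 * gV * gA / (gV**2 + gA**2)

A_e = A(-0.5, -1.0)
afb = {}
for name, (T3, Q) in {"leptons":   (-0.5, -1.0),
                      "up-type":   (+0.5, +2.0 / 3.0),
                      "down-type": (-0.5, -1.0 / 3.0)}.items():
    afb[name] = 0.75 * A_e * A(T3, Q)
    print(f"A_FB({name}) ~ {100.0 * afb[name]:.1f} %")
# of order 2 % for leptons, 7 % for up-type and 10 % for down-type quarks
```

The smallness of the leptonic asymmetry reflects the near-vanishing leptonic g_V; this is also why A_FB is so sensitive to sin²θW.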
Since e⁺e⁻ annihilation as a function of energy scans the γ/Z mixing, the
study of the forward-backward asymmetry as a function of energy is also very important.
The energy evolution of the asymmetries from lower energies to near the Z
resonance is in agreement with the Z/γ mixing in the SM.
The Z Lineshape and the Number of Light Neutrino Species. One of the most
important measurements at LEP concerns the mass and width of the Z boson. While
the Z mass is normally taken as an input to the standard model, its width depends
on the number of kinematically available decay channels and the number of light
neutrino species (Fig. 7.11). As we shall see, this is both a precision measurement
confirming the SM and the measurement of a fundamental parameter for the evolution
of the Universe.
Why is the number of quark and lepton families equal to three? Many families
including heavy charged quarks and leptons could exist without ever being produced
in accessible experiments, because of a lack of energy. It might
be, however, that these yet undiscovered families include "light" neutrinos, kinematically
accessible in Z decays, and we might then see a proof of their existence in Z
decays. The Z lineshape indeed obviously depends on the number of kinematically
accessible neutrinos; let us call them "light" neutrinos.
Around the Z pole, the e⁺e⁻ → Z → f f̄ annihilation cross section (s-channel)
can be written as

$$\sigma_{s,Z} \simeq \frac{12\pi(\hbar c)^2}{M_Z^2}\;\frac{s\,\Gamma_e\Gamma_f}{\left(s - M_Z^2\right)^2 + s^2\,\Gamma_Z^2/M_Z^2}\; +\; {\rm corrections.} \quad (7.93)$$
The term explicitly written is the generic cross section for the production of a
spin-one particle in an e⁺e⁻ annihilation, decaying into visible fermionic channels:
just a particular case of the Breit–Wigner shape. The peak sits around the Z mass and
has a width Γ_Z; B_f Γ_Z = Γ_f is the partial width of the Z into f f̄. As we have
seen, the branching fraction of the Z into hadrons is about 70 %, each of the charged leptons
represents about 3 %, while three neutrinos would contribute approximately 20 %.
The term "corrections" includes radiative corrections and the effects of the presence
of the photon. We recall that the photon branching fractions are proportional
to Q_f², where Q_f is the electric charge of the final state. However, at the peak, the
total electromagnetic cross section is less than 1 % of the cross section at the Z
resonance. Radiative corrections, instead, are as large as 30 %; thanks to the availability
of calculations up to second order in perturbation theory, this effect can be corrected
for with a relative precision at the level of 10⁻⁴. The effect of a number of neutrinos
larger than three on the formula (7.93) would be to increase the width and to decrease
the cross section at the resonance.
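The sensitivity of the peak cross section to the number of light neutrino species can be sketched directly from Eq. (7.93); the partial widths used below are assumed approximate SM values:

```python
import math

# Hadronic peak cross section from the Breit-Wigner of Eq. (7.93),
# with Gamma_Z built from the partial widths. Extra neutrino species
# enlarge Gamma_Z and lower the peak.
M_Z = 91.1876                               # GeV (assumed)
G_E, G_NU, G_HAD = 0.0838, 0.1660, 1.744    # GeV, approximate partial widths
HBARC2 = 0.3894e6                           # (hbar c)^2 in GeV^2 * nb

def sigma_hadrons(sqrt_s, n_nu):
    gamma_Z = 3.0 * G_E + n_nu * G_NU + G_HAD   # 3 charged leptons assumed
    s = sqrt_s**2
    bw = s * G_E * G_HAD / ((s - M_Z**2)**2 + s**2 * gamma_Z**2 / M_Z**2)
    return 12.0 * math.pi * HBARC2 / M_Z**2 * bw    # nb

for n_nu in (2, 3, 4):
    print(f"N_nu = {n_nu}: sigma_peak ~ {sigma_hadrons(M_Z, n_nu):.1f} nb")
```

For three neutrinos the peak comes out near 41 nb (before radiative corrections), and each additional neutrino species lowers it by several nanobarns, which is what makes the lineshape such a sensitive counter of light species.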
The technique for the precision measurement of the Z cross section near the peak
is not trivial; we shall just sketch it here. The energy of the beam is varied, and all
visible channels are classified into four categories: hadrons, electron pairs,
muon pairs, and tau pairs. The extraction of the cross section from the number of
events requires the knowledge of the luminosity of the accelerator. This is obtained by
measuring at the same time another process with a calculable cross section, the elastic
scattering e⁺e⁻ → e⁺e⁻ in the t-channel (Bhabha scattering), which results in an
electron–positron pair at small angle. Of course, one has to separate this process from
the s-channel, and the different dependence on the polar angle is used for this purpose:
the Bhabha cross section depends on the polar angle as 1/sin³θ, and quickly goes
to zero as θ grows. Another tool is lepton universality: in the limit in which the
lepton masses are negligible compared to the Z mass, the branching fractions into all
charged lepton flavors are equal.
In the end, one has a measurement of the total hadronic cross section from the
LEP experiments (with the SLAC experiments contributing to a smaller extent, due to
their lower statistics), which is plotted in Fig. 7.11. The best fit to Eq. (7.93), assuming
that the coupling of neutrinos is universal, provides a number of light neutrino species
compatible with three.
Notice that the number of neutrinos could be fractional, in case a fourth generation
is relatively heavy and universality is apparently violated due to the limited phase
space.
The best-fit value of the Z width is likewise consistent with the SM expectation for
three light neutrino species.

[Figure: measured W-pair (left) and Z-pair (right) production cross sections at LEP II as a function of √s, compared with the SM predictions (YFSWW/RacoonWW for WW; ZZTO and YFSZZ for ZZ) and, for the WW case, with calculations omitting the ZWW vertex or including only νe exchange (Gentle).]
The experimental data at LEP II and at hadronic accelerators (mostly the Tevatron)
have allowed the determination of the W mass and width with high accuracy.
After our considerations on the electroweak observables, let us now summarize the
tests of the remaining building block of the SM: QCD.
LEP is an ideal laboratory for QCD studies, since the center-of-mass energy is
high with respect to the masses of the accessible quarks and (apart from radiative
corrections, which are important above the Z) well defined. As a consequence of the
large center-of-mass energy, jets are collimated and their environment is clean: the
hadron level is not so far from the parton level. The large statistics collected allows
one to investigate rare topologies.
In particular, LEP confirmed the predictions of QCD in the following sectors—
among others:
• QCD is not Abelian: the jet topology is inconsistent with an Abelian theory and
demonstrates the existence of the three-gluon vertex.
Angular correlations within 4-jet events are sensitive to the existence of the gluon
self-coupling (Fig. 7.14), and were extensively studied at LEP.
As a consequence of the different couplings in the gq q̄ and ggg vertices, the
distribution of the Bengtsson–Zerwas angle (Fig. 7.15, left) between the cross product
of the directions of the two most energetic jets in 4-jet events and the cross product
of the directions of the two least energetic jets is substantially different in the
predictions of QCD and of an Abelian theory in which the gluon self-coupling does
not exist.
Fig. 7.15 Left Definition of the Bengtsson–Zerwas angle in 4-jet events. From P.N. Burrows,
SLAC-PUB-7434, March 1997. Right Distribution for the data, compared with the predictions for
QCD and for an Abelian theory. The experimental distribution is compatible with QCD, but it
cannot be reproduced by an Abelian field theory of the strong interactions without gauge boson
self-coupling. From CERN Courier, May 2004
The LEP results are summarized in Fig. 7.15, right; they are in excellent agreement
with the gauge structure of QCD and inconsistent with an Abelian
theory, i.e., the three-gluon vertex is needed to explain the data.
• Structure of QCD: measurement of the color factors.
QCD predicts that quarks and gluons fragment differently due to their different
color charges. Gluon jets are expected to be broader than quark jets; the multiplicity
of hadrons in gluon jets should be larger than in quark jets of the same energy, and
particles in gluon jets are expected to be less energetic. All these properties have
been verified in the study of symmetric 3-jet events, in which the angle between
pairs of consecutive jets is close to 120◦ —the so-called “Mercedes” events, like
the event in Fig. 6.61, right. In these 3-jet events, samples of almost pure gluon jets
could be selected by requiring the presence of a particle containing the b quark in
both the other jets—this can be done due to the relatively long lifetime associated
to the decay (τb 1 ps, which corresponds to an average decay length of 300 µm
for γ = 1, well measurable for example with Silicon vertex detectors).
Many observables at hadron level can be computed in QCD using the so-called
“local parton-hadron duality” (LPHD) hypothesis, i.e., computing quantities at
parton level with a cutoff corresponding to a mass just above the pion mass, and
then rescaling to hadrons with a normalization factor.
Gluon jets in hadronic 3-jet events at LEP have indeed been found experimentally to
have larger hadron multiplicities than quark jets. Once corrected for hadronization
effects, one obtains

$$\frac{C_A}{C_F} = 2.29 \pm 0.09\,({\rm stat.}) \pm 0.15\,({\rm theory})\,,$$
Fig. 7.16 Left: sketch of a 3-jet event in e⁺e⁻ annihilations. In the Lund string model for fragmentation, string segments span the region between the quark q and the gluon g and between the antiquark q̄ and the gluon. Right: experimental measurement of the particle flow (1/N)dn/dψ, for events with ψA = 150° ± 10° and ψC = 150° ± 10°. The points with error bars show the flow from the higher-energy quark jet to the lower-energy quark jet and then to the gluon jet; the histogram shows the measured particle flow for the same events, starting at the high-energy quark jet but proceeding in the opposite sense. The dashed lines mark the regions, almost free of fragmentation uncertainties, where the effect is visible. From OPAL Collaboration, Phys. Lett. B261 (1991) 334
consistent with the ratio of the color factors C A /C F = 9/4 that one can derive
from the theory at leading order assuming LPHD (see Eq. 6.124).
• String effect.
As anticipated in Sect. 6.4.6 and as one can see from Fig. 7.16, left, one expects
in a Mercedes event an excess of particles in the direction of the gluon jet with
respect to the opposite direction, since this is where most of the color field is. This
effect is called the string effect and has been observed by the LEP experiments at
CERN in the 1990s. This is evident also from the comparison of the color factors,
as well as from considerations based on color conservation.
A direct measurement of the string effect in Mercedes events is shown in Fig. 7.16,
right.
• Measurement of αs and check of its evolution with energy.
One of the theoretically best known variables depending on αs is the ratio of the
Z partial decay widths R⁰lept, which is known to O(αs³):

R⁰lept = Γhadrons /Γleptons = 19.934 [ 1 + 1.045 (αs /π) + 0.94 (αs /π)² − 15 (αs /π)³ + O((αs /π)⁴) ] . (7.98)
From the best-fit value R⁰lept = 20.767 ± 0.025 (derived by assuming lepton
universality), one obtains

αs (m Z ) = 0.124 ± 0.004 (exp.) +0.003/−0.002 (theory). (7.99)
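The central value of Eq. (7.99) can be reproduced by inverting the series (7.98) numerically; the bisection below is an illustrative sketch that ignores the propagation of the uncertainties:

```python
import math

# Find alpha_s such that the QCD series of Eq. (7.98) reproduces the
# measured R0_lept = 20.767, by bisection on x = alpha_s / pi.
def r_lept(x):
    return 19.934 * (1 + 1.045 * x + 0.94 * x**2 - 15 * x**3)

target = 20.767
lo, hi = 0.0, 0.1          # r_lept is monotonically increasing on this interval
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if r_lept(mid) < target:
        lo = mid
    else:
        hi = mid
alpha_s = math.pi * 0.5 * (lo + hi)
print(f"alpha_s(m_Z) = {alpha_s:.3f}")  # ~0.124, the central value of Eq. (7.99)
```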
One could ask if these are the most reliable evaluations of αs (m Z ) using the
LEP data. The problem is that the quoted results depend on the validity of the
electroweak sector of the SM and thus small deviations can lead to large changes.
One can also measure αs from infrared safe hadronic event shape variables like
jet rates, etc., not depending on the electroweak theory. A fit to the combined data
results in a value
αs (m Z ) = 0.1195 ± 0.0047,
Thanks to the large statistics collected at LEP in phase I, information could be obtained
also on the behavior of this observable at center-of-mass energies below the Z peak.
The QCD prediction including leading and next-to-leading order calculations is

n(ECM) = a [αs (ECM)]ᵇ exp(c/√αs (ECM)) [1 + O(√αs (ECM))] , (7.101)
where a is the LPHD scaling parameter (not calculable from perturbation theory)
whose value should be fitted from the data; the constants b = 0.49 and c = 2.27
are instead calculated from the theory. The summary of the experimental data
available is shown in Fig. 7.18 with the best fit to the QCD prediction.
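The qualitative behavior of Eq. (7.101) can be sketched by running αs at one loop; the normalization a is set to an arbitrary value here (it must be fitted to data), so only the growth with energy is meaningful:

```python
import math

def alpha_s(E, a_mz=0.119, m_z=91.2, n_f=5):
    """One-loop running of the strong coupling, starting from alpha_s(m_Z)."""
    b0 = 11 - 2 * n_f / 3
    return a_mz / (1 + a_mz * b0 / (2 * math.pi) * math.log(E**2 / m_z**2))

def mean_n(E, a=1.0, b=0.49, c=2.27):
    """Leading-order multiplicity of Eq. (7.101); a is an arbitrary normalization."""
    als = alpha_s(E)
    return a * als**b * math.exp(c / math.sqrt(als))

# The predicted multiplicity grows with the center-of-mass energy:
for E in (35.0, 91.2, 206.0):
    print(f"E_CM = {E:6.1f} GeV  ->  relative <n> = {mean_n(E):.1f}")
```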
The energy distribution of hadrons can be computed from LPHD; the coherence
between radiated gluons causes a suppression at low energies. Experimental evi-
dence for this phenomenon comes from the “hump-backed” plateau of the distri-
bution of the variable ξ = − ln(2E h /E C M ) shown in Fig. 7.19, left.
The increase with energy of the maximum value, ξ ∗ , of these spectra is strongly
reduced compared to expectations based on phase space (Fig. 7.19, right).
7.5.1.3 The Discovery of the Top Quark at the “Right” Mass at the Tevatron
LEP could not discover the top quark but was able to indirectly estimate its mass, since
the top quark mass enters into calculations of characteristics of various electroweak
observables, as seen before.
In 1994, the best (indirect) estimate for the top quark mass by LEP was m t =
178 ± 20 GeV.
7.5 Experimental Tests of the SM at Accelerators 403
Fig. 7.19 Left Center-of-mass energy dependence of the spectra of charged hadrons as a function
of ξ = − ln x; x = 2E h /E C M . From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38
(2014) 090001. Right Energy dependence of the maximum of the ξ distribution, ξ ∗
In March 1995, the two experiments CDF and D0 running at Fermilab at a center-
of-mass energy of 1.8 TeV jointly reported the discovery of the top at a mass of
176 ± 18 GeV. The cross section was consistent with the standard model prediction.
Figure 7.7, left, compares the indirect measurements of the top mass with the
direct measurements.
The most important processes which could possibly produce a Higgs boson at LEP
were, besides the unobserved decays Z → H + γ or Z → H + Z ∗ (Z ∗ → f f¯), (a)
the so-called Higgs-strahlung e+ e− → Z + H ; and (b) the vector boson (W + W −
or Z Z ) fusion into a H boson and a lepton–antilepton pair (Fig. 7.20). The direct
process e+ e− → H has a negligible probability because of the small H coupling to
e+ e− , given the mass difference.
Fig. 7.20 Main Higgs production mechanisms at LEP: Higgs-strahlung (left) and vector boson
fusion (right)
A first limit on the Higgs mass was obtained shortly after switching on the accel-
erator: the fact that no decays of the Z into H were observed immediately implies
that the Higgs boson must be heavier than the Z . Then the center-of-mass energy
of LEP was increased up to 210 GeV, still without finding evidence for the Higgs.
Indirect experimental bounds on the SM Higgs boson mass (in the hypothesis of a
minimal SM) were obtained from a global fit of precision measurements of elec-
troweak observables at LEP described in the previous subsection; the uncertainty
on radiative corrections was dominated by the uncertainty on the yet undiscovered
Higgs boson—and, to a smaller extent, by the error on the measurement of the top
mass: solid bounds could thus be derived.
LEP shut down in the year 2000. The global fit to the LEP data, with the constraints
given by the top mass measurements at the Tevatron, suggested for the Higgs a mass
of 94 (+29/−24) GeV (the likelihood distribution was peaked toward lower Higgs mass
values, as shown in Fig. 7.21). On the other hand, direct searches for the Higgs
boson conducted by the experiments at LEP yielded a lower limit at 95 % C.L.
(confidence level)
m H > 114.4 GeV/c2 . (7.102)
Higgs masses above 171 GeV were also excluded at 95 % C.L. by the global
electroweak fit. The negative result of searches at the Tevatron and at LHC conducted
before 2011 excluded the range between 156 and 177 GeV; thus one could conclude,
still at 95 % C.L.,
m H < 156 GeV/c2 . (7.103)
Scientists were finally closing on the most wanted particle in the history of high
energy physics.
LHC started operation in September 2008 for a test run at center-of-mass energy
smaller than 1 TeV, and then in stable conditions in November 2009 after a serious
accident in the test run damaged the vacuum tubes and part of the magnets. Starting
from March 2010, LHC reached an energy of 3.5 TeV per beam, and thus an excellent
discovery potential for the Higgs; the energy was further increased to 4 TeV per beam
in mid 2012. Within the strict bound defined by (7.102) and (7.103), coming from
the constraints of the indirect measurements, a mass interval between 120 and 130
GeV was expected as the most likely for the Higgs.
A Higgs particle around that mass range is mostly produced via gluon–gluon
fusion. Gluon–gluon fusion can generate a virtual top quark loop, and since the
Higgs couples to mass, this process is very effective in producing Higgs bosons—
the order of magnitude of the cross section being 10 pb. The second largest rate
pertains to the so-called weak-boson fusion (WBF) process qq → qq H via the
radiation of one W by each quark and the subsequent fusion of the two W bosons
into a Higgs particle. The cross section is one order of magnitude smaller, i.e., about
1 pb.
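Folding these cross sections with an integrated luminosity gives a feeling for the production rates; the 20 fb⁻¹ used below is an assumed value for illustration, roughly the size of one experiment's 8 TeV dataset:

```python
# Expected number of produced Higgs bosons: N = sigma * integrated luminosity.
# Unit bookkeeping: 1 pb = 1000 fb, so sigma [pb] * L [fb^-1] * 1000 = events.
lumi_fb = 20.0        # assumed integrated luminosity, in fb^-1
sigma_ggf_pb = 10.0   # gluon-gluon fusion, order of magnitude from the text
sigma_wbf_pb = 1.0    # weak-boson fusion, order of magnitude from the text

n_ggf = sigma_ggf_pb * 1000 * lumi_fb
n_wbf = sigma_wbf_pb * 1000 * lumi_fb
print(n_ggf, n_wbf)   # 200000.0 20000.0 produced Higgs bosons
```

Only a small fraction of these events end up in clean final states such as γγ or four leptons, which is why large luminosities were needed.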
A Higgs particle between 120 and 130 GeV is difficult to detect at LHC, because
the W + W − decay channel is kinematically forbidden (one of the W s has to be highly
virtual). The total decay width is about 4 MeV; the branching fractions predicted by
the SM are depicted in Fig. 7.22.
The dominant decay modes are H → bb̄ and H → W W ∗ . The first one involves
the production of jets, which are difficult to separate experimentally in the event; the
second one involves jets and/or missing energy (in the semileptonic W , decay part of
the energy is carried by an undetected neutrino, which makes the event reconstruction
difficult). The decay H → Z Z might have some nice experimental features: the final
state into four light charged leptons is relatively easy to separate from the background.
The decay H → γγ, although suppressed at the per mil level, has a clear signature;
since it happens via a loop, it provides indirect information on the Higgs couplings
to W W , Z Z , and t t¯.
On July 4, 2012, a press conference at CERN followed worldwide finally
announced the observation at the LHC detectors ATLAS and CMS of a narrow
resonance with a mass of about 125 GeV, consistent with the SM Higgs boson. The
evidence was statistically significant, above five standard deviations in either exper-
iment; decays to γγ and to Z Z → 4 leptons were detected, with rates consistent
with those predicted for the SM Higgs.
Two candidate events are shown in Fig. 7.23, and we stress the word “candidate.”
Detection involves a statistically significant excess of such events, but the background
from accidental γγ or four-lepton events is important (Fig. 7.24).
Later, the statistics increased, and the statistical significance as well; a compilation
of the experimental data by the PDG is shown in Fig. 7.25. The present fitted value
for the mass is
m H = 125.09 ± 0.24 GeV/c2 , (7.104)
Fig. 7.23 Candidate Higgs boson events at the LHC. The upper is a Higgs decay into two photons
(dashed lines and towers) recorded by CMS. The lower is a decay into four muons (thick solid
tracks) recorded by ATLAS. Source CERN
The SM has many free parameters: is this a minimal set, or are some of them calculable?
It is very likely that the SM emerges from a more general (and maybe conceptually
simpler) theory.
Fig. 7.24 Invariant mass of the γγ candidates in the ATLAS experiment (left; in the lower part of
the plot the residuals from the fit to the background are shown) and of the 4-lepton events in CMS
(right; the expected background is indicated by the dark area, including the peak close to the Z
mass and coming from Z γ ∗ events). From K.A. Olive et al. (Particle Data Group), Chin. Phys. C
38 (2014) 090001
We have studied the standard model of particle physics, and we have seen that this
model is very successful in describing the behavior of matter at the subatomic level.
Can it be the final theory? This looks very unlikely: the SM seems rather an ad
hoc model, and the SU(3) ⊗ SU(2) ⊗ U(1) group looks like a low-energy symmetry which
must be part of a bigger picture.
First of all, the standard model looks a bit too complicated to be thought of as
the fundamental theory. There are many particles, suggesting some higher symmetries
(between families, between quarks and leptons, between fermions and bosons)
grouping them in supermultiplets. There are many free parameters, as we have seen
in Sect. 7.3.2.
Then, it does not describe gravity, which is the interaction driving the evolution
of the Universe at large scale.
It does not describe all particles: as we said in Chap. 1, and as we shall discuss
in larger detail in the next chapters, we have good reasons to believe that matter
in the Universe is dominated by a yet undiscovered kind of particles difficult to
accommodate in the standard model, the so-called dark matter.
Last but not least, one of the most intriguing questions is the following. The
fundamental constants have values consistent with the conditions for life as we know it;
sometimes this requires a fine tuning. Take for example the difference between the
mass of the neutron and the mass of the proton, and the value of the Fermi constant:
they have just the values needed for a Sun-like star to develop its life cycle in a
few billions of years, which is the time needed for life as we know it to develop
and evolve. Is this just a coincidence, or are we missing a global view of the Universe?
A minimal explanation is an “anthropic coincidence,” which leads to the so-called
“anthropic principle”. The anthropic principle in its weak form can be expressed as
“conditions that are observed in the universe must allow the observer to exist” (which
sounds more or less like a tautology; note that the conditions are verified “here” and
“now”), while one of the strongest forms states that “the Universe must have those
properties which allow life to develop within it at some stage in its history.” It is
clear that on this question the borderline between physics and philosophy is very
narrow, but we shall meet concrete predictions about the anthropic principle when
shortly introducing the superstring theory. To conclude this argument, which would
deserve a deeper treatment, we cannot avoid the observation that discussions
about existence are relevant only for civilizations evolved enough to think of the
question, and we are one.
To summarize, many different clues indicate that the standard model is a work
in progress and will have to be extended to describe physics at higher energies.
Certainly, a new framework will be required close to the Planck scale ∼10¹⁸ GeV,
where quantum gravitational effects become important. Probably, there is a simplified
description of Nature at higher energies, with a prospect for the unification of forces.
As we have seen, renormalization entails the idea of coupling constants “running”
with energy. At the energies explored up to now, the “strong” coupling constant
is larger than the electromagnetic constant, which in turn is larger than the weak
constant. The strong constant, however, decreases with increasing energy, while the
weak and electromagnetic constants increase with energy. It is thus very tempting
to conjecture that there will be an energy at which these constants meet—we know
already that the weak and electromagnetic constants meet at a large energy scale. The
evolution of the couplings with energy could be, qualitatively, shown in Fig. 7.26.
However, if we evolve the coupling constants on the basis of the known physics,
i.e., of the standard model of particle physics, they will fail to meet at a single point
(Fig. 7.29, left). The plot suggests the possibility of a grand unification scale at about
10¹⁶ GeV but, if we want the unification of the relevant forces to happen, we must
assume that there is new physics beyond the standard model. If gravity also
enters this grand unification scheme, there is no clue of how and at what energy
such unification will happen—but we shall see later that a hint can be formulated.
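The failure to meet can be checked with a one-loop evolution of the inverse couplings, 1/αi (Q) = 1/αi (m Z ) − bi /(2π) ln(Q/m Z ); the coefficients bi = (41/10, −19/6, −7) are the standard one-loop SM values (with GUT-normalized hypercharge), and the inputs at m Z below are rounded, illustrative values:

```python
import math

# One-loop running of the three inverse SM couplings from the Z mass upward.
m_z = 91.2
b = {1: 41 / 10, 2: -19 / 6, 3: -7.0}        # one-loop beta coefficients
inv_alpha_mz = {1: 59.0, 2: 29.6, 3: 8.45}   # approximate 1/alpha_i at m_Z

def crossing(i, j):
    """Scale Q at which couplings i and j become equal."""
    t = (inv_alpha_mz[i] - inv_alpha_mz[j]) * 2 * math.pi / (b[i] - b[j])
    return m_z * math.exp(t)

# The three pairwise meeting points differ by orders of magnitude:
# there is no single unification scale in the SM alone.
for i, j in ((1, 2), (1, 3), (2, 3)):
    print(f"alpha_{i} = alpha_{j} at Q ~ {crossing(i, j):.1e} GeV")
```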
410 7 The Higgs Mechanism and the Standard Model of Particle Physics
Fig. 7.26 Artistic scheme (qualitative) of the unification of the interaction forces
7.6.1 GUT
The gauge theory of electroweak interactions is unified at high energy under the group
SU L (2) ⊗ UY (1). This symmetry is spontaneously broken at low energy, splitting the
electromagnetic and weak interactions. Given the structure of gauge theories in the
standard model, it is tempting to explore the possibility that SUc (3) ⊗ SU L (2) ⊗
UY (1) is unified by a larger group G at very large energy scales, such that G ⊃ SUc (3) ⊗ SU L (2) ⊗ UY (1).
The smallest group including SUc (3) ⊗ SU L (2) ⊗ UY (1) is SU(5), proposed by
Georgi and Glashow in 1974; this approach, being the first proposed, is called by
default the GUT (grand unified theory)—but of course any group including SU(5) can
play the game. We shall describe in some detail in this section this “minimal” SU(5)
GUT since it is the lowest rank (it has the smallest number of generators) GUT model
and provides a good reference point for non-minimal GUTs. However, we should
take into account the fact that this simple model has been shown experimentally
inadequate, as discussed later.
The symmetry group has 24 generators, which include the extension to rank 5
of the generators of the standard model. A five-dimensional state vector allows one
to include quarks and leptons in the same multiplet. As an example, right-handed
states will be described all together in a spinor
ψ = (d R , dG , d B , e+ , ν̄e ) , (7.105)
p → e+ π 0
has a clear experimental signature. From the unification mass MU , one can compute
Fig. 7.28 Some mechanisms for proton decay in the SU(5) GUT
τp ∼ 10²⁹ years .
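This order of magnitude can be recovered from the dimensional estimate τp ∼ MU⁴/(αGUT² mp⁵); the values MU ≈ 3 × 10¹⁴ GeV and αGUT ≈ 1/40 used below are illustrative assumptions, not numbers from the text:

```python
# Dimensional estimate of the proton lifetime in a minimal GUT,
# tau_p ~ M_U^4 / (alpha_GUT^2 * m_p^5), computed in natural units (GeV^-1)
# and converted to years via hbar = 6.58e-25 GeV s.
M_U = 3e14           # assumed unification mass, GeV
alpha_GUT = 1 / 40   # assumed unified coupling
m_p = 0.938          # proton mass, GeV

tau_gev = M_U**4 / (alpha_GUT**2 * m_p**5)  # lifetime in GeV^-1
hbar_gev_s = 6.58e-25
tau_years = tau_gev * hbar_gev_s / 3.15e7
print(f"tau_p ~ {tau_years:.0e} years")     # of order 10^29 years
```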
The experimental lower limit on the proton lifetime, derived
assuming the branching fractions computed by means of the minimal GUT, rules out
the theory. In addition, the LEP precision data indicate that the coupling constants
fail to meet exactly in one point for the current value of sin2 θW , as accurately
measured by LEP. Of course one can save GUT by going to “non-minimal” versions,
in particular with larger groups and a larger number of Higgs particles; in this way,
one loses however simplicity and part of the elegance of the idea—apart possibly from
the unification provided by supersymmetry, which we shall examine in the next section.
In his book, “The trouble with physics” (2007), Lee Smolin writes: “After some
twenty-five years, we are still waiting. No protons have decayed. We have been
waiting long enough to know that SU(5) grand unification is wrong. It’s a beautiful
idea, but one that nature seems not to have adopted. [...] Indeed, it would be hard
to underestimate the implications of this negative result. SU(5) is the most elegant
way imaginable of unifying quarks with leptons, and it leads to a codification of the
properties of the standard model in simple terms. Even after twenty-five years, I still
find it stunning that SU(5) doesn’t work.”
7.6.2 Supersymmetry
Supersymmetry (SUSY) postulates a symmetry between particles of different spin,
pairing each known particle with a superpartner; this implies an equal number of
fermionic and bosonic degrees of freedom.
By convention, the superpartners are denoted by a tilde. Scalar superpartners of
fermions are identified by adding an “s” to the name of normal fermions (example:
the selectron is the partner of the electron), while fermionic superpartners of bosons
are identified by adding the suffix “-ino” to the name (the photino is the superpartner
of the photon). In the Minimal Supersymmetric Standard Model (MSSM), the
Higgs sector is enlarged with respect to the SM, having at least two Higgs doublets.
The spectrum of the minimal supersymmetric standard model therefore reads as in
Table 7.2.
SUSY is clearly an approximate symmetry, otherwise the superpartners of each
particle of the standard model would have been found, since they would have the
same mass as the normal particles. But as of today, no supersymmetric partner has
been observed. For example, the selectron would be relatively easy to produce in
accelerators.
Superpartners are distinguished by a new quantum number called R-parity: the
particles of the standard model have parity R = 1, and we assign a parity R = −1
to their superpartners. R-parity is a multiplicative number; if it is conserved, when
attempting to produce supersymmetric particles from normal particles, they must
be produced in pairs. In addition, a supersymmetric particle may disintegrate into
lighter particles but one will have always at least a supersymmetric particle among
the products of disintegration.
Still under the hypothesis of R-parity conservation (or small violation), a stable (or
stable over cosmological times) lightest supersymmetric particle must exist, which
can no longer disintegrate. The nature of the lightest supersymmetric particle (LSP)
is a mystery. If it is the residue of all the decays of supersymmetric particles from
the beginning of the Universe, one would expect that LSPs are abundant. Since we
have not found them yet, they must be neutral and must not interact strongly.
The LSP candidates are then the lightest sneutrino and the lightest neutralino χ0
(four neutralinos are the mass eigenstates coming from the mixtures of the zino and
Table 7.2 Fundamental particles in the minimal supersymmetric standard model: particles with
R = 1 (left) and R = −1 (right)
Symbol Spin Name Symbol Spin Name
e, µ, τ 1/2 Leptons ẽ, µ̃, τ̃ 0 Sleptons
νe , νµ , ντ 1/2 Neutrinos ν˜e , ν˜µ , ν˜τ 0 Sneutrinos
d, u, s, c, b, t 1/2 Quarks d̃, ũ, s̃, c̃, b̃, t˜ 0 Squarks
g 1 Gluon g̃ 1/2 Gluino
γ 1 Photon γ̃ 1/2 Photino
W ±, Z 1 EW gauge W˜± , Z̃ 1/2 Wino, zino
bosons
H1 , H2 0 Higgs H̃1 , H̃2 1/2 Higgsinos
the photino, and the neutral higgsinos; in the same way, the mass eigenstates coming
from the mixture of the winos and the charged higgsinos are called charginos). The
LSP is stable or almost stable, and difficult to observe because neutral and weakly
interacting.
The characteristic signature of the production of SUSY LSPs would be missing
energy in the reaction. For example, if the LSP is a neutralino (which has a “photino”
component), the production of a selectron–antiselectron pair in e+ e− collisions at
LEP could be followed by the decay of the two selectrons in final states involving
an LSP, the LSPs being invisible to detection. Since no such events have been
observed, a firm limit
MLSP > M Z /2
can be set.
An attractive feature of SUSY is that it naturally provides the unification of forces.
SUSY affects the evolution of the coupling constants, and SUSY particles can effec-
tively contribute to the running of the coupling constants only for energies above the
typical SUSY mass scale (the mass of the LSP). It turns out that within the Mini-
mal Supersymmetric Standard Model (MSSM), i.e., the SUSY model requiring the
minimal amount of particles beyond the standard model ones, a perfect unification
of interactions can be obtained as shown in Fig. 7.29, right. From the fit requiring
unification, one finds preferred values for the break point MLSP and the unification
Fig. 7.29 The interaction couplings αi = gi2 /4π c fail to meet at a single point when they are
extrapolated to high energies in the standard model, as well as in SU(5) GUTs. Minimal SUSY
SU(5) model (right) does cause the couplings to meet in a point. While there are other ways to
accommodate the data, this straightforward, unforced fit is encouraging for the idea of supersym-
metric grand unification (Adapted from S. James Gates, Jr., http://live.iop-pp01.agh.sleek.net/2014/
09/25/sticking-with-susy/; adapted from Ugo Amaldi, CERN)
point MGUT :
The observation in Fig. 7.29, right, was considered as the first “evidence” for super-
symmetry, especially since MLSP and MGUT have “good” values with respect to a
number of open problems.
In addition, the LSP provides a natural candidate for the yet unobserved component
of matter, the so-called dark matter that we introduced in Chap. 1 and we shall further
discuss in the next chapter, and which is believed to be the main component of the
matter in the Universe. Sneutrino-dominated dark matter is, however, ruled out in
the MSSM due to the current limits on the interaction cross section of dark matter
particles with ordinary matter. These limits have been provided by direct detection
experiments—the sneutrino interacts via Z boson exchange and would have been
detected by now if it makes up the dark matter.
Neutralino dark matter is thus the favored possibility. Neutralinos turn out in
SUSY to be Majorana fermions, i.e., each of them is identical to its antiparticle.
Since these particles only interact with the weak vector bosons, they are not directly
produced at hadron colliders in copious numbers. A neutralino with a mass consistent
with Eq. 7.107 would provide, as we shall see, the required amount of “dark matter”
to comply with the standard model of cosmology.
Gravitino dark matter is a possibility in non-minimal supersymmetric models
incorporating gravity in which the scale of supersymmetry breaking is low, around
100 TeV. In such models, the gravitino can be very light, of the order of one eV. The
gravitino, if it constitutes the dark matter, is sometimes called a super-WIMP, because
its interaction strength is much weaker than that of other supersymmetric dark matter
candidates.
Gravity could not be turned into a renormalisable field theory up to now. One big
problem is that classical gravitational waves carry spin j = 2, and spin-2 gauge theories
in four dimensions are not renormalisable—the quantum loop integrals related
to the graviton go to infinity for large momenta, i.e., as distance scales go to zero.
Gravity could be, however, renormalisable in a large number of dimensions.
The starting point for string theory is the idea that the point-like elementary
particles are just our view of one-dimensional objects called strings (the string scale
being smaller than what is measurable by us, i.e., the extra dimension is compactified
at our scales).
The analog of a Feynman diagram in string theory is a two-dimensional smooth
surface (Fig. 7.30). The loop integrals over such a smooth surface do not meet the zero
distance, infinite momentum problems of the integrals over particle loops. In string
theory, infinite momentum does not even mean zero distance. Instead, for strings,
the relationship between distance and momentum is, roughly,
Δx ∼ 1/p + p/Ts ,
where Ts is called the string tension, the fundamental parameter of string theory. The
above relation implies a minimum observable length for a quantum string theory of
Lmin ∼ 1/√Ts .
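The minimum length quoted follows from minimizing the distance-momentum relation over p (a one-line exercise):

```latex
\Delta x(p) \sim \frac{1}{p} + \frac{p}{T_s}, \qquad
\frac{\mathrm{d}\,\Delta x}{\mathrm{d}p} = -\frac{1}{p^2} + \frac{1}{T_s} = 0
\;\Longrightarrow\; p^\ast = \sqrt{T_s},
\qquad \Delta x_{\min} \sim \frac{2}{\sqrt{T_s}} \sim \frac{1}{\sqrt{T_s}}\,.
```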
One can further generalize the concept of string by adding more dimensions: in
this case, we speak more properly of branes. In p dimensions, these are called
p-branes.
String theories that include fermionic vibrations, incorporating supersymmetry,
are known as superstring theories; several kinds have been described, based on
symmetry groups as large as SO(32), but all are now thought to be different limits
of a general theory called M-theory. In string theories, spacetime is at least ten-
dimensional—it is eleven-dimensional in M-theory.
7.6.3.2 Criticism
7.6.4 Compositeness
The discovery that quarks and leptons are composite would completely change our
view of nature, as happened twice in the last century thanks to relativity and quantum
physics. The 12 known elementary particles have their own repeating patterns,
suggesting they might not be truly fundamental, in the same way as the patterns in
the atomic structure evidenced by Mendeleev
suggested that atoms are not fundamental.
The presence of fundamental components of quarks and leptons could reduce the
number of elementary particles and the number of free parameters in the SM. A
number of physicists have attempted to develop a theory of “pre-quarks,” which are
called, in general, preons.
A minimal number of two different preons inside a quark or a lepton could explain
the lightest family of quarks and leptons; the other families could be explained as exci-
tations of the fundamental states. For example, in the so-called rishon model, there
are two fundamental particles called rishons (which means “primary” in Hebrew);
they are spin 1/2 fermions called T (“Third” since it has an electric charge of e/3)
and V (“Vanishing,” since it is electrically neutral). All leptons and all flavors of
quarks are ordered triplets of rishons; such triplets have spin 1/2. They are built as
follows: T T T = anti-electron; V V V = electron neutrino; T T V, T V T and V T T =
three colors of up quarks; T V V, V T V and V V T = three colors of down antiquarks.
In the rishon model, the baryon number and the lepton number are not individually
conserved, while B − L is (demonstrate it); more elaborate models use three preons.
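The suggested check can be made explicit by assigning, per rishon, electric charge +1/3 to T and 0 to V, and B − L = +1/3 to T and −1/3 to V (these per-rishon B − L values are inferred, not stated in the text); every composite listed above then comes out right, and any reaction conserving the net number of each rishon type automatically conserves B − L:

```python
# Rishon bookkeeping: per-rishon electric charge and B-L assignments
# (B-L values inferred so that the composites in the text come out correctly).
charge = {"T": 1 / 3, "V": 0.0}
b_minus_l = {"T": 1 / 3, "V": -1 / 3}

# (composite, expected charge, expected B-L), from the text's assignments
composites = [
    ("TTT", 1.0, 1.0),       # anti-electron
    ("VVV", 0.0, -1.0),      # electron neutrino
    ("TTV", 2 / 3, 1 / 3),   # up quark (each rishon ordering gives a color)
    ("TVV", 1 / 3, -1 / 3),  # down antiquark
]
for name, q, bl in composites:
    assert abs(sum(charge[r] for r in name) - q) < 1e-9
    assert abs(sum(b_minus_l[r] for r in name) - bl) < 1e-9
print("charges and B-L of all composites are consistent")
```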
At the mass scale at which preons manifest themselves, interaction cross sections
should rise, since new channels are open; we can thus set a lower limit of some
10 TeV to the possible mass of preons, since they have not been found at LHC.
The interaction of UHE cosmic rays with the atmosphere reaches some 100 TeV in
the center-of-mass, and thus cosmic rays are the ideal laboratory to observe such
pre-constituents, if they exist and if their mass is not too large.
Further Reading
[F7.1] F. Halzen and A.D. Martin, “Quarks and Leptons: An Introductory Course in
Modern Particle Physics,” Wiley 1984. A book at early graduate level providing
in a clear way the theories of modern physics in a how-to approach which
teaches people how to do calculations.
[F7.2] M. Thomson, “Modern Particle Physics,” Cambridge University Press 2013.
A recent, pedagogical, and rigorous book covering the main aspects of Particle
Physics at advanced undergraduate and early graduate level.
[F7.3] B.R. Martin and G.P. Shaw, “Particle Physics,” Wiley 2009. A book at under-
graduate level teaching the main concepts with very little calculations.
[F7.4] M. Merk, W. Hulsbergen, I. van Vulpen,“Particle Physics 1,” Nikhef 2014.
Lecture notes for one semester master course covering in a clear way the
basics of electrodynamics, weak interactions, and electroweak unification
and in particular symmetry breaking.
Exercises
(a) G F and MW ;
(b) MW and M Z .
10. Higgs decays into Z Z . The Higgs boson was first observed in the H → γγ and
H → Z Z → 4 leptons decay channels. Compute the branching fraction of
Z Z → µ+ µ− µ+ µ− normalized to Z Z → anything.
11. Higgs decays into γγ. Design the lowest order Feynman diagrams for the decay
of the Higgs boson in γγ and discuss why this channel was a golden channel in
the discovery of the Higgs boson.
Chapter 8
The Standard Model of Cosmology
and the Dark Universe
The origin and fate of the Universe is, for many researchers, the fundamental
question. Many answers were provided over the ages, a few of them built over
scientific observations and reasoning. During the last century important scientific
theoretical and experimental breakthroughs occurred after Einstein’s proposal of the
General Theory of Relativity in 1915, with precise and systematic measurements
establishing the expansion of the Universe, the existence of the Cosmic Microwave
Background, and the abundances of light elements in the Universe. The fate of
the Universe can be predicted from its energy content—but, although the chemical
composition of the Universe and the physical nature of its constituent matter have
occupied scientists for centuries, we do not yet know this energy content well enough.
We are made of protons, neutrons, and electrons, combined into atoms in which
most of the energy is concentrated in the nuclei (baryonic matter); and we know a
few more particles (photons, neutrinos, ...) accounting for a limited fraction of the
total energy of atoms. However, the motion of stars in galaxies, as well as results
about background photons in the Universe (both will be discussed in the rest of
this chapter) are inconsistent with the presently known laws of physics, unless we
assume that a new form of matter exists. This matter is not visible, showing little or
no interaction with photons—we call it “dark matter”. It is, however, important in
the composition of the Universe, because its energy is a factor of five larger than the
energy of baryonic matter.
Recently, the composition of the Universe has become even more puzzling, as
observations imply an accelerated expansion. Such an acceleration can be explained
by a new, unknown, form of energy—we call it “dark energy”—generating a repulsive
gravitational force. Something is ripping the Universe apart.
The current view on the distribution of the total budget between these forms of
energy is shown in Fig. 1.7. Note that we are facing a new Copernican revolution: we
are not made of the same matter that most of the Universe is made of. Moreover, the
Universe displays a global behavior difficult to explain, as we shall see in Sect. 8.1.1.
Today, at the beginning of the twenty-first century, the Big Bang model with a large
fraction of dark matter (DM) and dark energy is widely accepted as “the standard
model of cosmology”, but no one knows what the “dark” part really is, and thus the
Universe and its ultimate fate remain basically unknown.
About one century ago, we believed that the Milky Way was the only galaxy; today we have a much more refined view of the Universe, and the field of experimental cosmology probably grows at a faster rate than any other field in physics. In the last century we obtained unexpected results about the composition of the Universe and its global structure.
The starting point was Hubble's law, which states that galaxies recede from us with a velocity v proportional to their distance d:

v = H_0 d ,   (8.1)

where H_0 ≈ 68 km s⁻¹ Mpc⁻¹ is the so-called Hubble constant (we shall see that it is not at all constant, and can change during the history of the Universe), often expressed as a function of a dimensionless parameter h defined as

h = H_0 / (100 km s⁻¹ Mpc⁻¹) .   (8.2)
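As a quick numerical sketch of Eqs. (8.1)–(8.2) — the value of H_0 and the conversion constants are the standard ones quoted above; the code itself is illustrative and not part of the original text:

```python
# Dimensionless Hubble parameter h, Eq. (8.2), and the "Hubble time" 1/H0,
# a rough timescale for the age of the Universe.
MPC_KM = 3.0857e19          # kilometres in one megaparsec
YEAR_S = 3.156e7            # seconds in one year

H0 = 68.0                   # km s^-1 Mpc^-1
h = H0 / 100.0              # Eq. (8.2)

H0_si = H0 / MPC_KM         # s^-1
hubble_time_gyr = 1.0 / H0_si / YEAR_S / 1e9

print(h)                           # 0.68
print(round(hubble_time_gyr, 1))   # ~14.4 Gyr
```

Note that 1/H_0 already lands close to the measured age of the Universe, although the exact relation depends on the expansion history.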
However, velocity and distance are not directly measured. The main observables are
the redshift z—i.e., the fractional wavelength shift observed in specific absorption
lines (hydrogen, sodium, magnesium, ...) of the measured spectra of objects (Fig. 8.1)
8.1 Experimental Cosmology 423
z = ( λ_observed − λ_emitted ) / λ_emitted = Δλ / λ_emitted ,   (8.3)
and the apparent luminosity of the celestial objects (stars, galaxies, supernovae, ...),
for which we assume we know the intrinsic luminosity.
A redshift occurs whenever Δλ > 0, which is the case for the large majority of galaxies. There are notable exceptions (Δλ < 0, a blueshift), such as M31, the nearby Andromeda galaxy, whose blueshift is explained by a large intrinsic (peculiar) velocity oriented toward us.
The wavelength shifts were first observed by the US astronomer James Keeler at
the end of the nineteenth century in the spectrum of the light reflected by the rings of
Saturn, and later on, at the beginning of the twentieth century, by the US astronomer
Vesto Slipher, in the spectral lines of several galaxies. In 1925 there were already
around 40 galaxies with measured spectral lines.
These wavelength shifts were (and still often are) incorrectly identified as simple special relativistic Doppler shifts due to the motion of the sources. In that case z would be given by

z = √( (1 + β)/(1 − β) ) − 1 ,   (8.4)

which in the limit of small β reduces to

z ≈ β ,   (8.5)

and thus, using the Hubble law,

z ≈ (H_0/c) d .   (8.6)
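A minimal numerical comparison of the exact relativistic formula, Eq. (8.4), with the small-β approximation of Eq. (8.5) (illustrative sketch, not from the text):

```python
import math

# Exact special-relativistic redshift, Eq. (8.4), vs the linear
# approximation z ~ beta, Eq. (8.5).
def z_doppler(beta):
    return math.sqrt((1.0 + beta) / (1.0 - beta)) - 1.0

for beta in (0.01, 0.1, 0.5):
    print(beta, z_doppler(beta))
# For beta = 0.01 the two agree to better than 1%; for beta = 0.5
# the exact value is z ~ 0.73 and the linear approximation fails badly.
```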
However, the small-β limit is not valid for high redshift objects, with z as high as 11 having been observed in recent years—the list of the most distant objects comprises more than 100 objects with z > 7, among them galaxies or protogalaxies (the most abundant category), quasars, black holes, and even stars or protostars. On the other hand, high redshift supernovae (typically z ∼ 0.1 to 1) have been extensively studied. From these studies an interpretation of the expansion based on special relativity is clearly excluded: one has to invoke general relativity.
In terms of general relativity (see Sect. 8.2) the observed redshift is not due to
any movement of the cosmic objects but due to the expansion of the proper space
between them. This expansion has no center: an observer at any point of the Universe
will see the expansion in the same way with all objects in all directions receding with
radial velocities given by the same Hubble law and not limited by the speed of light
(in fact for z ≳ 1.5 radial velocities are, in a large range of cosmological models,
higher than c): it is the distance scale in the Universe that is changing.
Let us now write the distance between two objects as
424 8 The Standard Model of Cosmology and the Dark Universe
Fig. 8.1 Wavelength shifts observed in spectra of galaxies depending on their distance. From J.
Silk, “The big bang”, Times Books 2000
d = a(t) x ,   (8.7)

where a(t) is a scale factor that may change with time and x is, by definition, the present (t = t_0) distance between the objects (a(t_0) = 1), which does not change with time (comoving distance). Then
ḋ = ȧ x ,

i.e.,

v = H_0 d ,

with

H_0 = ȧ / a .   (8.8)
The Hubble constant, in this simple model, is just the expansion rate of the distance
scale in the Universe.
Let us come back to the problem of the measurement of distances. The usual
method to measure distances is to use reference objects (standard candles), for which
the absolute luminosity L is known. Then, assuming isotropic light emission in an
Euclidean Universe (see Sect. 8.2) and measuring the corresponding light flux f on
Earth, the distance d can be estimated as
d = √( L / (4π f) ) .   (8.9)
Hubble, in his original plot shown in Fig. 8.2, used as standard candles Cepheid1 stars, as well as the brightest stars in galaxies, and even entire galaxies (assuming the absolute luminosity of the brightest stars and of the galaxies to be approximately constant).
The original Hubble result showed a linear correlation between v and d but the
slope (the Hubble constant) was wrong by a factor of 7 due to an overall calibration
error caused mainly by a systematic underestimate of the absorption of light by dust.
A constant slope would mean that the scale factor a(t) discussed above would increase linearly with time:
1 Cepheids are variable supergiant stars with pulsation periods strongly correlated with their absolute luminosity. This extremely useful property was discovered by the US astronomer Henrietta Leavitt at the beginning of the twentieth century, and was used by Hubble to demonstrate in 1924 that the Andromeda nebula M31 was too far away to be part of our own galaxy, the Milky Way.
a(t) / a(t_0) = 1 + H_0 (t − t_0) .   (8.10)
Hubble in his original article suggested, under the influence of a model by de Sitter, that this linear behavior could be just a first-order approximation. In fact, until recently (1998) most people were convinced that at some point the expansion should slow down under the influence of gravity, which should be the dominant (attractive) force at large scales.
This is why the next term added to the expansion is usually written by introducing
a deceleration parameter q0 (if q0 > 0 the expansion slows down) defined as
q_0 = − ä a / ȧ² = − ä / ( H_0² a ) ,   (8.11)

and then

a(t) / a(t_0) = 1 + H_0 (t − t_0) − (1/2) q_0 H_0² (t − t_0)² .   (8.12)
The relation between z and d must now be modified to include this new term.
However, in an expanding Universe the computation of the distance is much more
subtle. Two distances are usually defined between two objects: the proper distance
d p and the luminosity distance d L .
d p is defined as the length measured on the spatial geodesic connecting the two
objects at a fixed time (a geodesic is defined to be a curve whose tangent vectors
remain parallel if they are transported along it. Geodesics are (locally) the shortest
path between points in space, and describe locally the infinitesimal path of a test
inertial particle). It can be shown (see Ref. [F8.2]) that
d_p ≈ (c/H_0) z [ 1 − ((1 + q_0)/2) z ] ;   (8.13)
The relation between d_p and d_L depends on the curvature of the Universe (see Sect. 8.2.1). Even in a flat (Euclidean) Universe (see Sect. 8.2.1 for a formal definition; for the moment we rely on an intuitive one, and think of flat space as a space in which the sum of the internal angles of a triangle is always π) the flux of light emitted by an object with a redshift z and received at Earth is attenuated by a factor (1 + z)², due to the dilation of time (∝ (1 + z)) and the increase of the photon wavelength (∝ a⁻¹ = (1 + z)). Then, if the Universe were basically flat,
d_L = d_p (1 + z) ≈ (c/H_0) z [ 1 + ((1 − q_0)/2) z ] .   (8.15)
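The difference between proper and luminosity distance, Eqs. (8.13) and (8.15), is easy to quantify — a minimal sketch, with an assumed (illustrative) deceleration parameter q_0:

```python
# Proper vs luminosity distance to first order in z, Eqs. (8.13) and (8.15).
C_KM_S = 2.998e5            # speed of light, km/s
H0 = 68.0                   # km s^-1 Mpc^-1

def d_p(z, q0):             # Mpc, Eq. (8.13)
    return (C_KM_S / H0) * z * (1.0 - (1.0 + q0) / 2.0 * z)

def d_L(z, q0):             # Mpc, Eq. (8.15)
    return (C_KM_S / H0) * z * (1.0 + (1.0 - q0) / 2.0 * z)

z, q0 = 0.1, -0.5           # q0 < 0: accelerating expansion (assumed value)
print(d_p(z, q0), d_L(z, q0))
# d_L exceeds d_p: distant standard candles look fainter than naively expected.
```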
2 The Supernova Cosmology Project is a collaboration, led by Saul Perlmutter, dedicated to the study of distant supernovae of type Ia, that started collecting data in 1988. Another collaboration also searching for distant supernovae of type Ia, the High-z Supernova Search Team, was formed by Brian Schmidt and Adam Riess in 1994. These teams found over 50 distant supernovae of type Ia for
which the light received was weaker than expected—which implied that the rate of expansion of
the Universe was increasing. Saul Perlmutter, born in 1959 in Champaign–Urbana, IL, USA, Ph.D.
from University of California, Berkeley; Brian P. Schmidt, U.S. and Australian citizen, born in 1967
in USA, Ph.D. from Harvard; Adam G. Riess, born in 1969 in Washington, DC, USA. Ph.D. from
Harvard, all professors in the US, were awarded the 2011 Nobel Prize in Physics “for the discovery
of the accelerating expansion of the Universe through observations of distant supernovae.”
Several candidates for standard rulers have been discussed in recent years; in particular, the observation of Baryon Acoustic Oscillations (BAO) opened a new and promising path. BAO analyses use the Fourier transform of the distance correlation function between specific astrophysical objects (for instance luminous red galaxies, blue galaxies, etc.) to discover, as a function of the redshift z, possible clustering scales of the baryonic matter. These scales may be related to the evolution of initial density perturbations in the early Universe (see Sect. 8.3). The correlation function ξ between pairs of galaxies is essentially the probability of finding the two galaxies at a distance r from each other, and thus a sharp peak in ξ(r) will correspond, in its Fourier transform, to an oscillation spectrum with a well-defined frequency.
Fig. 8.4 Left: the "Hubble plot" obtained by the "High-z Supernova Search Team" and by the "Supernova Cosmology Project". The lines represent the predictions of several models with different energy contents of the Universe (see Sect. 8.3.3). The best fit corresponds to an accelerating expansion scenario. From "Measuring Cosmology with Supernovae", by Saul Perlmutter and Brian P. Schmidt; Lecture Notes in Physics 2003, Springer. Right: an updated version by the "Supernova Legacy Survey" and the "Sloan Digital Sky Survey" projects, M. Betoule et al., arXiv:1401.4064
In 1965 Penzias and Wilson,3 two radio astronomers working at Bell Laboratories
in New Jersey, discovered by accident that the Universe is filled with a mysterious
isotropic and constant microwave radiation corresponding to a blackbody tempera-
ture around 3 K.
Penzias and Wilson were just measuring a small fraction of the blackbody spectrum: the region in the tail around wavelength λ ∼ 7.5 cm, while the spectrum peaks around λ ∼ 2 mm. To measure the full spectral energy density it is necessary to go above the Earth's atmosphere, which absorbs wavelengths shorter than λ ∼ 3 cm. These measurements were eventually performed in several
balloon and satellite experiments. In particular, the Cosmic Background Explorer
(COBE), launched in 1989, was the first to show that in the 0.1 to 5 mm range the
spectrum is well described by the Planck blackbody formula
ε_γ(ν) dν = (8πh/c³) · ν³ dν / ( e^{hν/(k_B T)} − 1 ) ,   (8.18)
3 Arno Penzias (1933–) was born in Munich, Germany. In 1939 his family was rounded up for
deportation, but they managed to escape to the US, where he could graduate in Physics at Columbia
University. Robert Wilson (1936–) grew up in Houston, Texas, and studied at Caltech. They shared the
1978 Nobel prize in Physics “for their discovery of the cosmic microwave background radiation.”
corresponding to a temperature

T = (2.726 ± 0.001) K .

The total photon energy density is then obtained by integrating the Planck formula over the entire frequency range, resulting in the Stefan–Boltzmann law

ε_γ = (π²/15) (k_B T)⁴ / (ħ c)³ ≈ 0.26 eV cm⁻³ .   (8.19)
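Both the energy density of Eq. (8.19) and the photon number density quoted later in Eq. (8.26) follow directly from the measured temperature — a minimal numerical check using standard constants:

```python
import math

# CMB photon energy density, Eq. (8.19), and number density, Eq. (8.26),
# at the measured temperature T = 2.726 K.
K_B = 8.617e-5              # Boltzmann constant, eV/K
HBARC = 1.97327e-5          # hbar * c, eV cm

T = 2.726                   # K
kT = K_B * T                # eV

eps = math.pi**2 / 15.0 * kT**4 / HBARC**3        # eV cm^-3
n_gamma = 2.4 / math.pi**2 * (kT / HBARC)**3      # photons per cm^3

print(round(eps, 2))        # ~0.26 eV cm^-3
print(round(n_gamma))       # ~410 photons per cm^3
```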
The existence of CMB had been predicted in the 1940s by George Gamow, Robert
Dicke, Ralph Alpher, and Robert Herman in the framework of the Big Bang model.
In the Big Bang model the expanding Universe cools down going through successive
stages of lower energy density (temperature) and more complex structures. Radiation
materializes into pairs of particles and antiparticles, which, in turn, give origin to
the existing objects and structures in the Universe (nuclei, atoms, planets, stars,
galaxies, …). In this context, the CMB is the electromagnetic radiation left over when electrons and protons combined to form neutral atoms (the so-called recombination phase). After this stage, the absence of free charged matter allows photons to be basically free to travel through the entire Universe. Recombination is governed by the reactions

p + e⁻ → H + γ ;   H + γ → p + e⁻ .
If these reactions are in equilibrium at a given temperature T (high enough to allow the
photodisintegration and low enough to consider e, p, H as nonrelativistic particles)
the number density of electrons, protons, and hydrogen atoms may be approximated
by the Maxwell–Boltzmann distribution (see Sect. 8.3.1)
n_x = g_x ( m_x k_B T / (2π ħ²) )^{3/2} e^{ −m_x c²/(k_B T) } ,   (8.21)

where g_x is a statistical factor accounting for the spin.
The ratio n_H / (n_p n_e) can then be approximately modeled by the Saha equation

n_H / (n_p n_e) ≈ ( m_e k_B T / (2π ħ²) )^{−3/2} e^{ Q/(k_B T) } ,   (8.22)

where

Q = ( m_p + m_e − m_H ) c² ≈ 13.6 eV ,   (8.23)
and assuming that there is no net charge (n_p = n_e), and defining the fractional ionization X = n_p/(n_p + n_H), the Saha equation can be rewritten as

(1 − X)/X ≈ n_p ( m_e k_B T / (2π ħ²) )^{−3/2} e^{ Q/(k_B T) } .   (8.25)
On the other hand, at thermal equilibrium the energy density of photons as a function of the frequency ν follows the usual blackbody distribution, corresponding, as we have seen before, to a photon number density of

n_γ ≈ (2.4/π²) ( k_B T / (ħ c) )³ .   (8.26)
Since n_γ ≫ n_B, the baryon-to-photon ratio η = n_B/n_γ is tiny:

η ∼ 5–6 × 10⁻¹⁰ .   (8.29)
The Saha equation can then be written as a function of η and T , and used to determine
the recombination temperature (assuming X ∼ 0.5):
(1 − X)/X² = 3.84 η ( k_B T / (m_e c²) )^{3/2} e^{ Q/(k_B T) } .   (8.30)

The solution of this equation gives a remarkably stable value of the temperature for a wide range of η (see Fig. 8.6). For instance, η ∼ 5.5 × 10⁻¹⁰ results in a recombination temperature

T_rec ≈ 3740 K  (k_B T_rec ≈ 0.32 eV).
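Equation (8.30) can be solved numerically — a minimal sketch using bisection, with X = 0.5 and η = 5.5 × 10⁻¹⁰ as in the text (constants are the standard ones):

```python
import math

# Solve the Saha equation, Eq. (8.30):
#   (1 - X)/X^2 = 3.84 * eta * (kT / me c^2)^(3/2) * exp(Q / kT),
# for X = 0.5, i.e. find kT such that the right-hand side equals 2.
Q = 13.6                    # eV, hydrogen ionization energy
ME_C2 = 0.511e6             # eV, electron rest energy
K_B = 8.617e-5              # eV/K
ETA = 5.5e-10

def rhs(kT):
    return 3.84 * ETA * (kT / ME_C2) ** 1.5 * math.exp(Q / kT)

# rhs decreases monotonically with kT (the exponential dominates), so bisect.
lo, hi = 0.1, 1.0           # eV
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if rhs(mid) > 2.0 else (lo, mid)

kT_rec = 0.5 * (lo + hi)
print(kT_rec, kT_rec / K_B)  # ~0.32 eV, i.e. T_rec ~ 3700-3800 K
```

The remarkable point is how insensitive the result is to η: the steep exponential makes the transition temperature nearly universal.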
This temperature is much higher than the measured CMB temperature reported
above. The difference is attributed to the expansion of the Universe between the
recombination epoch and the present. Indeed, as discussed in Sect. 8.1.1, the photon
wavelength increases during the expansion of a flat Universe by a factor (1 + z). The entire CMB spectrum was expanded by this factor, and it can thus be estimated that recombination occurred at a redshift

z_rec ≈ T_rec / T_0 − 1 ∼ 1370 .
Photon decoupling occurred when the photon scattering rate Γ_scat dropped to the level of the expansion rate of the Universe,

Γ_scat ∼ H ,   (8.33)

with

Γ_scat = n_e σ_T c ,

where n_e and σ_T are, respectively, the free electron number density and the Thomson cross section.
Finally, as n_e can be related to the fractional ionization X(z) and the baryon number density (n_e = X(z) n_B), the redshift at which photon decoupling occurs (z_dec) is given by

X(z_dec) n_B σ_T c ∼ H .   (8.34)
However, the precise computation of z_dec is subtle. Both n_B and H evolve during the expansion (for instance, in a matter-dominated flat Universe, as will be discussed in Sect. 8.2, n_B(z) ∝ n_B,0 (1 + z)³ and H(z) ∝ H_0 (1 + z)^{3/2}). Furthermore, the Saha equation is not valid after recombination, once electrons and photons are no longer in thermal equilibrium. The exact value of z_dec thus depends on the specific model for the evolution of the Universe; the final result is of the order of z_dec ∼ 1100.
Much later, the first stars reionized part of the intergalactic medium; by then, however, the Universe had expanded so much that the scattering probability of CMB photons during this epoch is small. To account for it, the reionization optical depth parameter τ is introduced, in terms of which the scattering probability is given by

P ∼ 1 − e^{−τ} .
The CMB photons then follow spacetime geodesics until they reach us. These geodesics are slightly distorted by the gravitational effects of the mass fluctuations close to the path, giving rise to microlensing effects which are responsible for a typical total deflection of ∼2 arcminutes. The spacetime points where the last scattering occurred thereby define a region called the last scattering surface, situated at a redshift z_lss very close to z_dec.
Beyond z_lss the Universe is opaque to photons; to observe it, other messengers, e.g., gravitational waves, will have to be studied. On the other hand, the measurement of the primordial nucleosynthesis abundances (Sect. 8.1.3) allows one to indirectly test the Big Bang model at times well before the recombination epoch.
The COBE satellite4 measured the temperature fluctuations in sky regions centered at different points with galactic coordinates (θ, φ),

δT(θ, φ) / ⟨T⟩ = ( T(θ, φ) − ⟨T⟩ ) / ⟨T⟩ ,   (8.37)
and found that, apart from a dipole anisotropy of the order of 10−3 , the temperature
fluctuations are of the order of 10−5 : the observed CMB spectrum is remarkably
isotropic.
4 Three satellite missions have been launched so far to study the cosmic background radiation. The
first was COBE in 1989, followed by WMAP (Wilkinson Microwave Anisotropy Probe) in 2001,
both of which were NASA missions. The latest (with the best angular resolution and sensitivity),
called Planck, has been launched by the European Space Agency (ESA) with a contribution from
NASA in 2009, and is still in orbit. In terms of sensitivity and angular resolution, WMAP improved on COBE by a factor of 40, and Planck gave a further improvement by a factor of 4; in addition, Planck measures polarization. The instruments onboard Planck are a low frequency (solid state) instrument operating from 30 GHz, and a bolometer—a device for measuring the power of incident electromagnetic
radiation via the heating of a material with a temperature-dependent electrical resistance—for
higher frequencies (up to 900 GHz). The total weight of the payload is 2 tons (it is thus classified
as a large mission); it needs to be kept at cryostatic temperatures. John Mather, from the Goddard
Space Flight Center, and George Smoot, at the University of California, Berkeley, shared the 2006
Nobel Prize in Physics “for their discovery of the blackbody form and anisotropy of the cosmic
microwave background radiation”.
Fig. 8.7 Sky map (in galactic coordinates) of CMB temperatures measured by COBE after the
subtraction of the emission from our Galaxy. A dipole component is clearly visible. (from http://
apod.nasa.gov/apod/ap010128.html)
The dipole distortion (a slight blueshift in one direction of the sky and a redshift in
the opposite direction—Fig. 8.7) observed in the measured average temperature can
be attributed to a global Doppler shift due to the peculiar motion of the observer (COBE, Earth, Solar System, Milky Way, Local Group, Virgo cluster, …) with respect to a hypothetical isotropic CMB reference frame characterized by a temperature ⟨T⟩. Indeed,

T* = ⟨T⟩ ( 1 + (v/c) cos θ ) ,   (8.38)
Fig. 8.8 CMB temperature fluctuations sky map as measured by COBE after the subtraction of the
dipole component and of the emission from our Galaxy. http://lambda.gsfc.nasa.gov/product/cobe/
Fig. 8.9 CMB temperature fluctuations sky map as measured by the Planck mission after the
subtraction of the dipole component and of the emission from our Galaxy. http://www.esa.int/
spaceinimages/
with
v = (371 ± 1) km/s.
After removing this effect, the remaining fluctuations reveal a pattern of tiny inhomo-
geneities at the level of the last scattering surface. The original picture from COBE
(Fig. 8.8) with an angular resolution of 7◦ was confirmed and greatly improved by
the Wilkinson Microwave Anisotropy Probe WMAP, which obtained full sky maps
with a 0.2◦ angular resolution. Very recently, the Planck satellite delivered sky maps
with three times higher resolution and ten times higher sensitivity (Fig. 8.9), also
covering a larger frequency range.
Once these maps are obtained it is possible to establish two-point correlations
between any two spatial directions.
Technically, the temperature fluctuations are expanded using spherical harmonics:

δT*/⟨T⟩ (θ, φ) = Σ_{l=0}^{∞} Σ_{m=−l}^{+l} a_lm Y_lm(θ, φ) ,   (8.39)

with

a_lm = ∫_{θ=0}^{π} ∫_{φ=0}^{2π} ( δT*/⟨T⟩ )(θ, φ) Y*_lm(θ, φ) dΩ .   (8.40)
The two-point correlation function between directions n̂ and n̂′, with n̂ · n̂′ = cos α, is

C(α) = ⟨ (δT*/⟨T⟩)(n̂) (δT*/⟨T⟩)(n̂′) ⟩ ,

and can be expanded in Legendre polynomials,

C(α) = (1/4π) Σ_{l} (2l + 1) C_l P_l(cos α) ,

where the P_l are the Legendre polynomials and the C_l, the multipole moments, are given by the variance of the harmonic coefficients a_lm:

C_l = ( 1/(2l + 1) ) Σ_{m=−l}^{+l} ⟨ |a_lm|² ⟩ .   (8.41)

Each multipole moment l corresponds roughly to an angular scale

α = 180°/l .   (8.42)
The total temperature fluctuation (temperature power spectrum) can then be expressed as a function of the multipole moment l (Fig. 8.10, top):

(ΔT)² = ( l(l + 1)/(2π) ) C_l ⟨T⟩² .   (8.43)
Such a function shows a characteristic pattern with a first peak around l ∼ 200
followed by several smaller peaks.
The first peak, at an angular scale of 1°, defines the size of the horizon at the time of last scattering (see Sect. 8.1.3), and the other peaks (acoustic peaks) are extremely sensitive to the specific contents and evolution model of the Universe at that time. The observation of very tiny fluctuations at large scales (much greater than the horizon, l ≪ 200) leads to the hypothesis that the Universe, to be causally connected, went through a very early stage of exponential expansion, called inflation.
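The variance estimator of Eq. (8.41) and the angular-scale relation of Eq. (8.42) can be illustrated with synthetic coefficients — a toy sketch, not real CMB data:

```python
import random

# Toy illustration of Eq. (8.41): C_l is estimated as the variance of
# the 2l+1 harmonic coefficients a_lm. Here the a_lm are drawn from a
# Gaussian with a known variance; only 2l+1 samples are available per l,
# which is the origin of the so-called "cosmic variance".
random.seed(1)

def estimate_Cl(l, true_Cl):
    alm = [random.gauss(0.0, true_Cl ** 0.5) for _ in range(2 * l + 1)]
    return sum(a * a for a in alm) / (2 * l + 1)

l, true_Cl = 200, 1.0e-10
Cl_hat = estimate_Cl(l, true_Cl)
print(Cl_hat)               # scatters around 1e-10 within a few percent
print(180.0 / l)            # Eq. (8.42): l = 200 probes ~0.9 degrees
```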
Anisotropies can also be found by studying the polarization of the CMB photons. Indeed, at the recombination and reionization epochs the CMB may be partially polarized by Thomson scattering off electrons. It can be shown that linear polarization may also originate from quadrupole temperature anisotropies. In general there are two orthogonal polarization modes, respectively called B-mode (curl-like) and E-mode (gradient-like); Thomson scattering only originates the E-mode, while primordial gravitational waves are expected to display both polarization modes. Gravitational lensing of the CMB E-modes may also be a source of B-modes. E-modes were first measured in 2002 by the DASI telescope in Antarctica, and recently the Planck collaboration published high resolution maps of the CMB polarization over the full sky. The detection and interpretation of B-modes are very challenging, since the signals are tiny and foreground contaminations, such as the emission by galactic dust, are not always easy to estimate. The arrival angles of the CMB photons are smeared, due
to microlensing effects, by dispersions that are functions of the integrated mass distribution along the photon paths. It is possible, however, to deduce these dispersions statistically from the observed temperature angular power spectra and/or from the polarized E- and B-mode fields. The precise measurement of these dispersions will give valuable information for the determination of the cosmological parameters. It will also help to constrain parameters, such as the sum of the neutrino masses or the dark energy content, that are relevant for the growth of structures in the Universe, and to evaluate contaminations in the B-mode patterns from possible primordial gravitational waves. In recent years, the detection of gravitational lensing was reported by several experiments, such as the Atacama Cosmology Telescope, the South Pole Telescope, and the POLARBEAR experiment, and recently at 40σ and 5σ significance by the Planck collaboration using, respectively, temperature and polarization data, or polarization data only. Some of these aspects will be discussed briefly in Sect. 8.3, but a
detailed discussion of the theoretical and experimental aspects of this fast-moving
field is far beyond the scope of this book.
The measurement of the abundances of the light elements in the Universe (H, D, ³He, ⁴He, ⁶Li, ⁷Li) is the third observational "pillar" of the Big Bang model, after the Hubble expansion and the CMB. As proposed, and first computed, by Ralph Alpher and the Russian American physicist George Gamow in 1948, the expanding Universe cools down, and when it reaches temperatures of the order of the nuclear binding energies per nucleon (∼1–10 MeV) nucleosynthesis occurs, provided there are enough protons and neutrons available. The main nuclear fusion reactions are
• proton–neutron fusion:

  p + n → D + γ

• deuterium–deuterium fusion:

  D + D → ³He + n
  D + D → ³H + p
  D + D → ⁴He + γ

  ³He + n → ⁴He + γ
  ³H + p → ⁴He + γ

• and finally the lithium and beryllium formation reactions (there are no stable nuclei with A = 5):

  ⁴He + D → ⁶Li + γ
  ⁴He + ³H → ⁷Li + γ
  ⁴He + ³He → ⁷Be + γ
  ⁷Be + n → ⁷Li + p .
The absence of stable nuclei with A = 8 basically stops the primordial big bang
nucleosynthesis chain. Heavier nuclei are produced in stellar (up to Fe) or supernova
nucleosynthesis.5
The relative abundance of neutrons and protons, in case of thermal equilibrium at
a temperature T , is fixed by the ratio of the usual Maxwell–Boltzmann distributions
(similarly to what was discussed for the recombination—Sect. 8.1.2):
n_n / n_p = ( m_n / m_p )^{3/2} exp( − (m_n − m_p) c² / (k_B T) ) .   (8.44)

If k_B T ≫ (m_n − m_p) c², then n_n/n_p ∼ 1; if k_B T ≪ (m_n − m_p) c², then n_n/n_p ∼ 0.
Thermal equilibrium is established through the weak processes connecting protons and neutrons:

n + ν_e ↔ p + e⁻
n + e⁺ ↔ p + ν̄_e ,

as long as the interaction rate Γ_{n,p} of these reactions is greater than the expansion rate of the Universe,

Γ_{n,p} ≥ H .
Γ_{n,p} and H evolve during the expansion, the former diminishing much faster than the latter. Indeed, in a flat Universe dominated by radiation (Sect. 8.2),
5 Iron (⁵⁶Fe) is the stable element for which the binding energy per nucleon is largest (about 8.8 MeV); it is thus the natural endpoint of fusion processes of lighter elements, and of fission of heavier elements.
Γ_{n,p} ∼ G_F² T⁵ ,   (8.45)

H ∼ √g* T² ,   (8.46)

where G_F is the Fermi weak interaction constant and g* the number of degrees of freedom, which depends on the relativistic particle content of the Universe (namely, on the number of generations of light neutrinos N_ν, which, in turn, allows one to set a limit on N_ν).
The exact calculation of the freeze-out temperature T_f, at which

Γ_{n,p} ∼ H ,

is beyond the scope of this book. The values obtained for T_f are a little below the MeV scale,

k_B T_f ∼ 0.8 MeV .

At this temperature

n_n / n_p ∼ 0.2 .
After the freeze-out this ratio would remain constant if neutrons were stable. However, as we know, neutrons decay via beta decay,

n → p + e⁻ + ν̄_e .

Therefore, the n_n/n_p ratio decreases slowly until the remaining neutrons are bound inside nuclei, so that

n_n / n_p ∼ 0.2 e^{−t/τ_n} ,   (8.48)

with τ_n ≈ 885.7 s.
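Putting the freeze-out ratio and Eq. (8.48) together numerically — a minimal sketch; the freeze-out temperature (k_B T_f ∼ 0.8 MeV) and the time at which nucleosynthesis starts (t ∼ 300 s) are assumed, round values:

```python
import math

# Neutron-to-proton ratio at freeze-out, from Eq. (8.44) with the mass
# prefactor ~1, then its decay according to Eq. (8.48).
DELTA_M = 1.293             # (m_n - m_p) c^2 in MeV
KT_F = 0.8                  # MeV, assumed freeze-out temperature
TAU_N = 885.7               # s, neutron lifetime

ratio_freeze = math.exp(-DELTA_M / KT_F)      # ~0.2, as quoted above

def ratio(t):                                 # Eq. (8.48)
    return ratio_freeze * math.exp(-t / TAU_N)

print(round(ratio_freeze, 2))  # 0.2
print(ratio(300.0))            # ~0.14 when nucleosynthesis starts (assumed t)
```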
The first step of primordial nucleosynthesis is, as we have seen, the formation of deuterium via proton–neutron fusion,

p + n ↔ D + γ .

Although the deuterium binding energy, 2.22 MeV, is higher than the freeze-out temperature, the fact that the baryon-to-photon ratio η is quite small (η ∼ 5–6 × 10⁻¹⁰) makes photodissociation of the deuterium nuclei possible even at temperatures well below the blackbody peak temperature (the Planck distribution has a long tail).
The relative number of free protons, free neutrons, and deuterium nuclei can be expressed, using a Saha-like equation (Sect. 8.1.2), as

n_D / (n_p n_n) ≈ ( g_D / (g_p g_n) ) ( m_D / (m_p m_n) )^{3/2} ( k_B T / (2π ħ²) )^{−3/2} e^{ Q/(k_B T) } ,   (8.49)

where

Q = ( m_p + m_n − m_D ) c² ≈ 2.22 MeV .

Using n_p ∝ η n_γ, this can be written as

n_D / n_n ∝ η n_γ ( m_p c² k_B T / (π ħ² c²) )^{−3/2} e^{ Q/(k_B T) } ,   (8.50)

and, since n_γ ∝ ( k_B T / (ħ c) )³,

n_D / n_n ∝ η ( k_B T / (m_p c²) )^{3/2} e^{ Q/(k_B T) } .   (8.51)
nn m p c2
This is analogous to the formulation of the Saha equation used to determine the
recombination temperature (Sect. 8.1.2). As we have shown its solution (for instance
for (n D /n n ) ∼ 1) gives a remarkably stable value of temperature. In fact there is a
sharp transition around k B TD ∼ 0.1 MeV: above this value neutrons and protons
are basically free; below this value all neutrons are substantially bound first inside
D nuclei and finally inside 4 He nuclei, provided that there is enough time before
the fusion rate of nuclei becomes smaller than the expansion rate of the Universe.
Indeed, since the 4 He binding energy per nucleon is much higher than those of the
D, 3 H , and 3 He, and since there is no stable nuclei with A = 5, then 4 He is the most
favorite final state.
The primordial abundance of ⁴He, Y_p, is usually defined as the fraction of the mass density of ⁴He nuclei, ρ(⁴He), over the total baryonic mass density, ρ(Baryons):

Y_p = ρ(⁴He) / ρ(Baryons) .   (8.52)

In a crude way, let us assume that after nucleosynthesis all baryons are in H or ⁴He, i.e., that

ρ(H) + ρ(⁴He) = ρ(Baryons) .

Thus

Y_p = 1 − ρ(H) / ρ(Baryons)
and, since nucleon masses are approximately equal,

Y_p = 1 − (n_p − n_n)/(n_p + n_n) = 2 (n_n/n_p) / ( 1 + n_n/n_p ) .   (8.53)
Around one quarter of the baryonic mass of the Universe is due to 4 He and around
three quarters is made of hydrogen.
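Equation (8.53) makes the "one quarter" statement quantitative — a minimal check, using the often-quoted illustrative ratio n_n/n_p ∼ 1/7 at the onset of nucleosynthesis (after freeze-out plus neutron decay; the value is an assumption for the example):

```python
# Helium mass fraction from the neutron-to-proton ratio, Eq. (8.53):
# Y_p = 2 (n_n/n_p) / (1 + n_n/n_p).
def Y_p(nn_over_np):
    return 2.0 * nn_over_np / (1.0 + nn_over_np)

print(round(Y_p(1.0 / 7.0), 2))   # 0.25: one quarter of baryons in 4He
print(round(Y_p(0.2), 2))         # with the raw freeze-out ratio instead
```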
There are, however, small fractions of D, ³He, and ³H that did not turn into ⁴He, and tiny fractions of ⁷Li and ⁷Be that could be formed after the production of ⁴He and before the dilution of the nuclei due to the expansion of the Universe. Although their abundances are quantitatively quite small, the comparison of the expected and measured ratios is important, because they are rather sensitive to the baryon-to-photon ratio η.
In Fig. 8.11 the predicted abundances of 4 He, D, 3 He, and 7 Li computed in the
framework of the standard model of Big Bang nucleosynthesis as a function of η
are compared with recent measurements (for details see the Particle Data Book).
An increase in η slightly increases the deuterium formation temperature T_D (there are fewer photons per baryon available for the photodissociation of deuterium), and therefore there is more time for the chain of fusion processes ending at the formation of ⁴He to develop. The fraction of ⁴He thus increases slightly, in relative terms, while the fractions of D and ³He decrease much more significantly, again in relative terms. The evolution of the fraction of ⁷Li is, on the contrary, not monotonic: it shows a minimum, due to the fact that ⁷Li is built up by two processes with different behaviors (the fusion of ⁴He and ³H is a decreasing function of η; the production via ⁷Be is an increasing function of η).
Apart from the measured value for the fraction of 7 Li all the other measurements
converge to a common value of η that is, within the uncertainties, compatible with
the value indirectly determined by the study of the acoustic peaks in the CMB power
spectrum (see Sect. 8.3.3).
Evidence that Newtonian physics applied to visible matter does not describe the dynamics of stars, galaxies, and galaxy clusters was well established during the twentieth century.
As a first approximation, one can estimate the mass of a galaxy based on its
brightness: the brighter galaxies contain more stars. However, there are other ways
to assess the total mass of a galaxy. In spiral galaxies, for example, stars rotate
in quasicircular orbits around the center. The rotational speed of peripheral stars
depends, according to Newton’s law, on the total mass of the galaxy, and one has
thus an independent measurement of this mass. Do these two methods give consistent
results?
In 1933 the Swiss astronomer Fritz Zwicky studied the motion of distant clusters of galaxies, in particular the Coma and Virgo clusters. The masses of the galaxies in the clusters were estimated by Zwicky based on their brightnesses. Zwicky
obtained the masses of the clusters from the sums of the galactic masses (based
on the luminosity of galaxies) and compared them to the value of the total mass
obtained independently from the measurement of the distributions of velocities of
individual galaxies in the cluster (the mass can be derived from the average speed using the virial6 theorem, which states that, for a bound system, the time average of the total kinetic energy is equal to minus one-half of the time average of the total potential energy).
6 The word virial comes from the Latin vis, i.e., strength, or force; the term was coined by German
physicist and mathematician Rudolf Clausius, one of the founders of the science of thermodynamics,
around 1870.
For a star of mass μ moving with speed v on a circular orbit of radius r around the center of a galaxy, Newton's law requires

μ v² / r = G μ M(r) / r²   ⟹   v² = G M(r) / r ,

where M(r) is the total mass within the radius r. As a consequence, in the halo of a galaxy (where one expects from the distribution of luminous matter that M(r) ≈ M_tot, the total mass of the galaxy),
v ≈ √( G M_tot / r ) .   (8.55)
The above result holds also for a generic bounded orbit, as a consequence of the
virial theorem.
Experimental observations do not agree with this conclusion: the speed of stars as a function of the distance from the center (the so-called rotation curve of the galaxy) does not decrease as 1/√r in the halo (see for example Fig. 8.12, referring to the galaxy M33, the so-called Triangulum galaxy).
Stars at the outskirts of M33 have observed orbital velocities of about 150 kilometers per second, and would escape on reasonably short timescales if the galaxy were composed only of visible matter, since their orbital speeds are three times larger than the escape velocity computed from the visible mass of the galaxy.
The Milky Way itself displays a roughly saturating speed in the halo (Fig. 8.13).
To explain within Newtonian gravity an asymptotically constant or nearly constant speed in the halo, as observed in many spiral galaxies, an amount of matter M(r) roughly proportional to the radius has to be assumed, and thus a density ρ of dark matter decreasing as r⁻², a behavior which is not observed for the luminous mass. One can thus conclude that dark matter exists, with a distribution that does not fall off with radius as that of luminous matter does. This matter is likely to be more densely packed in the center than luminous matter, but its effect is more visible in the halo, since it dominates there.
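The virial relation v² = G M(r)/r above can be turned into a toy rotation-curve calculation; the masses below are purely illustrative and not a fit to any real galaxy:

```python
import math

# Gravitational constant in units convenient for galaxies:
# G = 4.30e-6 kpc (km/s)^2 / Msun
G = 4.30e-6

def v_circ(m_enclosed_msun, r_kpc):
    """Circular speed v = sqrt(G M(r) / r), in km/s."""
    return math.sqrt(G * m_enclosed_msun / r_kpc)

radii = (5.0, 10.0, 20.0, 40.0)  # kpc

# Keplerian case: all the (luminous) mass M_tot is enclosed at every radius
M_tot = 1.0e10  # Msun, illustrative
v_kepler = [v_circ(M_tot, r) for r in radii]

# Dark halo case: M(r) grows linearly with r (rho ~ r^-2), so v stays flat
M_per_kpc = 2.0e9  # Msun per kpc, illustrative
v_halo = [v_circ(M_per_kpc * r, r) for r in radii]
```

With these numbers, v_kepler falls off as 1/√r while v_halo is the same at every radius, mimicking the observed flat rotation curves.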
Asymptotically, at large r, a density ρ proportional to r⁻² can be described by the profile

$$\text{Isothermal:}\quad \rho(r) = \rho_{\rm iso}(r) = \frac{\rho_s}{1 + (r/r_s)^2}\,. \qquad (8.56)$$
This is the so-called isothermal spherical halo, since it can also be derived from simulations of a gas of particles in hydrostatic equilibrium at uniform temperature. This model is described by two parameters: the central density and the core radius r_s (alternatively, the bulk radius and the limiting speed). The isothermal halo is therefore characterized by a rotation curve which grows linearly from the center and approaches a flat rotation curve at large radii.
Profiles obtained in numerical simulations of dark matter including baryons are steeper in the center than those obtained from simulations with nonbaryonic dark matter only. The Navarro, Frenk, and White (NFW) profile, often used as a benchmark, follows an r⁻¹ distribution at the center. The Einasto profile, on the other hand, does not follow a power law near the Galactic center, is smoother at kpc scales, and seems to fit more recent numerical simulations better. A value of about 0.17 for the shape parameter α in Eq. 8.57 is consistent with present data and simulations. Moore and collaborators have suggested profiles steeper than NFW.
The analytical expressions of these profiles are

$$\text{NFW:}\quad \rho_{\rm NFW}(r) = \rho_s\,\frac{r_s}{r}\left(1+\frac{r}{r_s}\right)^{-2}$$

$$\text{Einasto:}\quad \rho_{\rm Einasto}(r) = \rho_s \exp\left\{-\frac{2}{\alpha}\left[\left(\frac{r}{r_s}\right)^{\alpha} - 1\right]\right\} \qquad (8.57)$$

$$\text{Moore:}\quad \rho_{\rm Moore}(r) = \rho_s\left(\frac{r_s}{r}\right)^{1.16}\left(1+\frac{r}{r_s}\right)^{-1.84}.$$
i.e., five orders of magnitude larger than the total energy density of the Universe.
[Plot omitted: dark matter density ρ_DM (GeV/cm³) versus radius r (kpc), for the Moore, NFW, Einasto, EinastoB, isothermal (Iso), and Burkert profiles.]
Fig. 8.14 Comparison of the densities as a function of the radius for DM profiles used in the
literature, with values adequate to fit the radial distribution of velocities in the halo of the Milky
Way. The curve EinastoB indicates an Einasto curve with a different α parameter. From M. Cirelli
et al., “PPPC 4 DM ID: A Poor Particle Physicist Cookbook for Dark Matter Indirect Detection”,
arXiv:1012.4515, JCAP 1103 (2011) 051
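As a cross-check of Eq. 8.57, the profiles plotted in Fig. 8.14 can be evaluated directly; the normalizations and scale radii below are illustrative choices, of the order of those used in Milky Way fits:

```python
import math

def rho_iso(r, rho_s=1.0, r_s=4.4):
    """Isothermal profile, Eq. (8.56)."""
    return rho_s / (1.0 + (r / r_s) ** 2)

def rho_nfw(r, rho_s=1.0, r_s=24.4):
    """NFW profile, Eq. (8.57): r^-1 cusp at the center."""
    return rho_s * (r_s / r) / (1.0 + r / r_s) ** 2

def rho_einasto(r, rho_s=1.0, r_s=28.4, alpha=0.17):
    """Einasto profile, Eq. (8.57): no power-law cusp; rho(r_s) = rho_s."""
    return rho_s * math.exp(-(2.0 / alpha) * ((r / r_s) ** alpha - 1.0))

def rho_moore(r, rho_s=1.0, r_s=30.3):
    """Moore et al. profile, Eq. (8.57): steeper (r^-1.16) central cusp."""
    return rho_s * (r_s / r) ** 1.16 / (1.0 + r / r_s) ** 1.84
```

Evaluating these at small radii reproduces the hierarchy of central slopes visible in Fig. 8.14 (Moore steeper than NFW, Einasto and isothermal finite at the center).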
Also because of this, among the preferred targets for astrophysical searches for DM are the small satellite galaxies of the Milky Way, the so-called dwarf spheroidals (dSph). For these galaxies the ratio between the estimate of the total mass M, inferred from the velocity dispersion (velocities of single stars are measured with an accuracy of a few kilometers per second thanks to optical measurements), and the luminosity L, inferred from star counts, can be very large. The dwarf spheroidal satellites of the Milky Way would be tidally disrupted if they did not contain enough dark matter. In addition, these objects are not far from us: a possible DM signal should not be attenuated by distance dimming.
Table 8.1 A list of dSph satellites of the Milky Way that may represent the best candidates for DM
searches according to their distance from the Sun, luminosity, and inferred M/L ratio
dSph             D (kpc)    L (10³ L_⊙)    M/L ratio
Segue 1 23 0.3 >1000
UMa II 32 2.8 1100
Willman 1 38 0.9 700
Coma Berenices 44 2.6 450
UMi 66 290 580
Sculptor 79 2200 7
Draco 82 260 320
Sextans 86 500 90
Carina 101 430 40
Fornax 138 15500 10
Fig. 8.15 The Local Group of galaxies around the Milky Way (from http://abyss.uoregon.edu/~js/
ast123/lectures/lec11.html). The largest galaxies are the Milky Way, Andromeda, and M33, and
have a spiral form. Most of the other galaxies are rather small and with a spheroidal form. These
orbit closely the large galaxies, as is also the case of the irregular Magellanic Clouds, best visible in
the southern hemisphere, which are located at a distance of about 120,000 ly to be compared with
the Milky Way radius of about 50,000 ly
Table 8.1 shows some characteristics of dSph in the Milky Way; their position is
shown in Fig. 8.15.
The observations of the dynamics of galaxies and clusters of galaxies, however,
are not the only astrophysical evidence of the presence of DM. Cosmological models
for the formation of galaxies and clusters of galaxies indicate that these structures
fail to form without DM.
The dependence of v² on the mass M(r), on which the evidence for DM is based, relies on the virial theorem, stating that for a bound system the time average of the kinetic energy equals one-half of the absolute value of the time average of the potential energy (defining zero energy at infinite distance). The departure from this Newtonian prediction could, however, also be related to a departure from Newtonian gravity.
Alternative theories do not necessarily require dark matter, replacing it instead with a modified Newtonian gravitational dynamics. Notice that, in a historical perspective, deviations from the expected gravitational dynamics have already led to the discovery of previously unknown matter sources. Indeed, the planet Neptune was discovered following Le Verrier's prediction of its position in the 1840s, based on the detailed observation of the orbit of Uranus and on Newtonian dynamics. In the late nineteenth century, apparent disturbances to the orbit of Neptune led to the search that culminated in the discovery of Pluto. On the other hand, the precession of the perihelion of Mercury, which could not be quantitatively explained by Newtonian gravity, confirmed the prediction of general relativity, i.e., a modified dynamics.
The simplest model of modified Newtonian dynamics is called MOND; it was
proposed in 1983 by Milgrom, suggesting that for extremely small accelerations the
Newton's gravitational law may not be valid: indeed, Newton's law has been verified only at reasonably large values of the gravitational acceleration. MOND postulates that the acceleration a is not linearly dependent on the gradient of the gravitational potential φ_N at small values of the acceleration, and proposes the following modification:

$$\mu\!\left(\frac{a}{a_0}\right)\mathbf{a} = -\nabla \phi_N\,. \qquad (8.58)$$
For a ≪ a₀ one can take μ(a/a₀) ≃ a/a₀; using a = v²/r for a circular orbit, one has then

$$\frac{v^4}{a_0\, r^2} \simeq G\,\frac{M}{r^2}\,.$$

In this limit the rotation curve flattens at a typical value

$$v_f \simeq \left(G M a_0\right)^{1/4}.$$
MOND describes well the shapes of galactic rotation curves. For the masses of clusters, however, it brings an improvement but does not completely solve the problem.
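In the deep-MOND limit the flat rotation speed depends only on the total mass, v_f = (G M a₀)^{1/4}. A minimal numerical sketch (the galaxy mass of 10¹¹ solar masses is illustrative; a₀ is the usual value fitted to rotation curves):

```python
G = 6.674e-11       # m^3 kg^-1 s^-2
M_SUN = 1.989e30    # kg
a0 = 1.2e-10        # m s^-2, MOND acceleration scale (fitted value)

def v_flat(mass_kg):
    """Asymptotic rotation speed v_f = (G M a0)^(1/4), in m/s."""
    return (G * mass_kg * a0) ** 0.25

v_f = v_flat(1.0e11 * M_SUN)  # ~2e5 m/s, i.e., ~200 km/s
```

The result, about 200 km/s, is in the right ballpark for a large spiral galaxy, which is one of the phenomenological successes of MOND.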
It is, however, unlikely that MOND is the full explanation for the anomaly observed in the velocities of stars in the halos of galaxies. An explanation through MOND would require an ad hoc theory to account for the cosmological evidence as well. In addition, a recent (2004) astrophysical observation, the merging galaxy cluster 1E0657-58 (the so-called bullet cluster), has further weakened the MOND hypothesis. The bullet cluster consists of two colliding clusters of galaxies at a distance of about 3.7 Gly. In this case (Fig. 8.16), the separation between the center of mass and the center of baryonic mass cannot be explained by changes in the gravitational law, as indicated by data with a statistical significance of 8σ.
One could also consider the fact that galaxies may contain invisible matter of known nature, either baryons in a form which is hard to detect optically, or massive neutrinos; MOND then reduces the amount of invisible matter needed to explain the observations.
The age of the Universe is an old question. Does the Universe have a finite age, or is it eternal and always equal to itself (a steady-state Universe)?
Fig. 8.16 The matter in the “bullet cluster” is shown in this composite image (from http://apod.nasa.
gov/apod/ap060824.html, credits: NASA/CXC/CfA/ M. Markevitch et al.). In this image depicting
the collision of two clusters of galaxies, the bluish areas show the distributions of dark matter in
the clusters, as obtained from gravitational lensing, and the red areas correspond to the hot X-ray emitting gas. The individual galaxies observed in the optical image have a total mass much smaller than the mass of the gas, but the sum of these masses is far less than the mass of dark matter. The clear separation of dark matter and gas clouds is direct evidence for the existence of dark matter
Certainly the Universe must be older than the oldest object it contains, and the first question has then been: how old is the Earth? In the eleventh century the Persian astronomer Abu Rayhan al-Biruni had already realized that the Earth should have a finite age, but he just stated that the origin of the Earth was too remote to possibly be measured. The first quantitative estimates finally came in the nineteenth century. From considerations both on the formation of the geological layers and on the thermodynamics of the formation and cooling of the Earth, it was estimated that the age of the Earth should be of the order of tens of millions of years. These estimates contradicted both some religious beliefs and Darwin's theory of evolution. Rev. James Ussher, an Irish Archbishop, had published in 1650 a detailed calculation concluding that, according to the Bible, "God created Heaven and Earth" some six thousand years ago (more precisely "at the beginning of the night of October 23rd in the year 710 of the Julian period," i.e., in 4004 B.C.). On the other hand, tens or even a few hundred million years seemed too short a time to allow for the slow evolution advocated by Darwin. Only the discovery of radioactivity at the end of the nineteenth century provided precise clocks with which to date rocks and meteorite debris, and thus to allow for reliable estimates of the age of the Earth. Surveys in the Hudson Bay area in Canada found rocks over four billion (∼4.3 × 10⁹) years old. Measurements on several meteorites, in particular on the Canyon Diablo meteorite found in Arizona, USA, established ages of the order of (4.5−4.6) × 10⁹ years. Darwin had the time he needed!
The proportion of elements other than hydrogen and helium (the metallicity) in a celestial object can be used as an indication of its age. After primordial nucleosynthesis (Sect. 8.1.3) the Universe was basically composed of hydrogen and helium. Thus the older (first) stars should have a lower metallicity than the younger ones (for instance, our Sun). The measurement of the age of low-metallicity stars therefore imposes an important constraint on the age of the Universe. The oldest stars with a well-determined age found so far are HE 1523-0901, a red giant around 7500 light-years away from us, and HD 140283, known as the Methuselah star, located around 190 light-years away. The age of HE 1523-0901 was measured to be 13.2 billion years, using mainly the decay of uranium and thorium. The age of HD 140283 was determined to be (14.46 ± 0.8) billion years.
The "cosmological" age of the Universe is defined as the time since the Big Bang, which at zeroth order is just given by the inverse of the Hubble constant:

$$T \simeq \frac{1}{H_0} \simeq 14 \times 10^{9}\ \text{years}\,. \qquad (8.60)$$
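As an order-of-magnitude check of Eq. (8.60), one can convert 1/H₀ into years; the value of H₀ below is indicative:

```python
H0_KM_S_MPC = 67.8      # Hubble constant in km/s/Mpc (indicative value)
MPC_M = 3.086e22        # meters per megaparsec
YEAR_S = 3.156e7        # seconds per year

H0_si = H0_KM_S_MPC * 1.0e3 / MPC_M    # H0 in s^-1
hubble_time_yr = 1.0 / H0_si / YEAR_S  # ~1.4e10 years
```

The result, about 14 billion years, is indeed close to the precise value quoted below.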
The precise value is determined by solving the Friedmann equations (see Sect. 8.2) for a given set of cosmological parameters. Within the ΛCDM model (see Sect. 8.3.3) the best fit value, taking into account the present knowledge of such parameters, is T = (13.798 ± 0.037) × 10⁹ years.
Within uncertainties both the cosmological age and the age of the first stars are
compatible, but the first stars had to be formed quite early in the history of the
Universe.
Finally, we stress that a Universe with a finite age and in expansion escapes the nineteenth-century Olbers' paradox: "How can the night sky be dark?" This paradox relies on the argument that in an infinite static Universe with uniform star density (as the Universe was believed to be by most scientists until the middle of the last century) the night should be as bright as the day. In fact, the light coming from a star is inversely proportional to the square of its distance, but the number of stars in a shell at a distance between r and r + dr is proportional to the square of the distance r; it thus seems that every shell in the Universe should contribute the same amount of light. Apart from some too crude approximations (such as not taking into account the finite lifetime of stars), redshift and the finite size of the observable Universe solve the paradox.
force. The net result is that the local acceleration of a body, g, due to a gravitational field created by a mass M at a distance r, is proportional to the ratio m_g/m_I:

$$F_g = m_g\, G\,\frac{M}{r^2}$$

and

$$F_g = m_I\, g\,,$$

so that

$$g = \frac{m_g}{m_I}\, G\,\frac{M}{r^2}\,,$$
passing close to the Sun's limb; this result assumed in its deduction Newton's corpuscular theory of light. Using the newborn equations of general relativity, however, Einstein found a value twice as large, and a clear test was thus in principle possible through the observation of the apparent positions of stars during a total solar eclipse. In May 1919 Eddington and Dyson led two independent expeditions, respectively to the equatorial island of Príncipe (São Tomé and Príncipe) and to Sobral, Brazil. The observations were disturbed by clouds (Príncipe) and by instrumental effects (Sobral), but nevertheless the announcement by Eddington that Einstein's predictions were confirmed had an enormous impact on public opinion and made general relativity widely known. Further and more robust observations were carried out in the following years, and the predictions of general relativity on light deflection were firmly confirmed.
A light beam emitted by a source at rest in such an elevator (Fig. 8.19) is seen by an outside observer at rest in the gravitational field as emitted by a moving source, and thus its frequency is Doppler shifted (redshifted if the elevator is moving away from the outside observer):

$$z = \frac{\Delta\lambda}{\lambda} \sim \beta \sim \frac{g\,\Delta x}{c^2}\,. \qquad (8.61)$$
For the outside observer, a clock placed in a freely falling elevator runs slower (has a longer period) than a clock placed in another freely falling elevator situated above the first one, which thus has a lower velocity (Fig. 8.20).
This time dilation due to the gravitational field can also be seen as due to the difference in the energy lost by the photons emitted in the two elevators in "climbing" out of the gravitational field. In fact, in a weak gravitational field the variation of the total energy is

$$\frac{\Delta E}{E} = \frac{\Delta\nu}{\nu} \sim \frac{\Delta\lambda}{\lambda} \sim \frac{g\,x}{c^2}\,.$$

Time runs slower in the presence of stronger gravitational fields.
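The shift of Eq. (8.61) is tiny for terrestrial gravitational fields; for the 22.5 m height of the Pound–Rebka experiment:

```python
g = 9.81        # m s^-2, Earth's gravitational acceleration
c = 2.998e8     # m/s, speed of light
dx = 22.5       # m, tower height used in the Pound-Rebka experiment

z = g * dx / c ** 2   # fractional frequency shift, Eq. (8.61); ~2.5e-15
```

A fractional shift of a few parts in 10¹⁵ is measurable only with an extremely narrow spectral line, which is why the Mössbauer effect was essential to that experiment.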
In general relativity, gravity is no longer a force (whose sources are the masses) acting in a flat spacetime. Gravity (as will be discussed in the next sections) is embedded in the geometry of spacetime, which is determined by the energy and momentum contents of the Universe.
8.2 General Relativity 455
8.2.1.1 2D Space
In a flat 2D surface (a plane) the square of the distance between two points is given in Cartesian coordinates by

$$ds^2 = dx^2 + dy^2 \qquad (8.62)$$

or

$$ds^2 = g_{\mu\nu}\, dx^{\mu} dx^{\nu} \qquad (8.63)$$
Fig. 8.21 2D surfaces with positive, negative, and null curvatures (from http://thesimplephysicist.
com/?p=665, ©2014 Bill Halman/tdotwebcreations)
with

$$g_{\mu\nu} = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}. \qquad (8.64)$$

The metric g_{μν} of the flat 2D surface is constant, and the geodesics are straight lines.
The metric of a 2D spherical surface is a little more complex. The square of the distance between two neighboring points on the surface of a sphere of radius a embedded in our usual 3D Euclidean space (Fig. 8.22) is given in spherical coordinates by

$$ds^2 = a^2 \left( d\theta^2 + \sin^2\theta\, d\varphi^2 \right);$$

the corresponding metric

$$g_{\mu\nu} = a^2 \begin{pmatrix} 1 & 0\\ 0 & \sin^2\theta \end{pmatrix}$$

is no longer constant, because of the presence of the sin²θ term. It is not possible to cover the entire sphere with one single plane without somewhat distorting it, although it is always possible to define locally at each point a tangent plane. The geodesics are not straight lines; they are arcs of great circles, as can be deduced directly from the metric and its derivatives.
This metric can be rewritten by introducing the new variable r = sin θ:

$$ds^2 = a^2 \left( \frac{dr^2}{1 - K r^2} + r^2 d\varphi^2 \right) \qquad (8.67)$$
8.2 General Relativity 457
with K = 1 for the case of the sphere.⁸ Indeed, the sphere has positive (K = 1) curvature at every point of its surface. The above expression, however, holds also for the cases of negative (K = −1) and null (K = 0) curvature. In the case of a flat surface the usual expression in polar coordinates is recovered:

$$ds^2 = a^2 \left( dr^2 + r^2 d\varphi^2 \right). \qquad (8.68)$$
The distance between two points with the same ϕ and with, respectively, r₁ = 0 and r₂ = R/a is given by

$$s = a \int_0^{R/a} \frac{dr}{\sqrt{1 - K r^2}} = a\, S_k \qquad (8.69)$$

with

$$S_k = \begin{cases} \arcsin (R/a) & \text{if } K = 1\\ R/a & \text{if } K = 0\\ \operatorname{arcsinh} (R/a) & \text{if } K = -1\,. \end{cases} \qquad (8.70)$$

$$A = 4 \pi a^2 S_k^2\,. \qquad (8.71)$$
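A quick numerical sanity check of Eqs. (8.69)–(8.70): integrating dr/√(1 − K r²) with a simple midpoint rule reproduces the three closed forms of S_k (the value x = R/a = 0.5 is an arbitrary choice):

```python
import math

def S_k_numeric(K, x, n=100_000):
    """Midpoint-rule evaluation of S_k = \u222b_0^x dr / sqrt(1 - K r^2), Eq. (8.69)."""
    h = x / n
    return sum(h / math.sqrt(1.0 - K * ((i + 0.5) * h) ** 2) for i in range(n))

x = 0.5  # R/a, test value in [0, 1)
s_closed = S_k_numeric(+1, x)   # -> arcsin(x)
s_flat = S_k_numeric(0, x)      # -> x
s_open = S_k_numeric(-1, x)     # -> arcsinh(x)
```

For K = +1 the proper distance exceeds the coordinate distance (arcsin x > x), while for K = −1 it is smaller, as expected for positive and negative curvature respectively.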
The relation between the proper distance and the luminosity distance (Sect. 8.1.1) is now

$$d_L = d_p\, \frac{a}{R}\, S_k\, (1+z)\,, \qquad (8.72)$$

and the metric can also be written in a more compact form using the function S_k:

$$ds^2 = a^2 \left( dr^2 + S_k^2\, d\varphi^2 \right). \qquad (8.73)$$
⁸ r is dimensionless, with range [0, 1]; K is the curvature, which, in general, can be −1, 0, or +1. The more general change of coordinates r′ = a sin θ does not bring anything new, and can be recast in the form used above by setting r = r′/a. Of course, with the r′ coordinate, the curvature is not normalized and can be, generically, negative, zero, or positive.
8.2.1.2 3D Space
For a homogeneous and isotropic 3D space the previous formula can be generalized (now r and θ are independent variables), leading to

$$ds^2 = a^2 \left[ \frac{dr^2}{1 - K r^2} + r^2 \left( d\theta^2 + \sin^2\theta\, d\varphi^2 \right) \right].$$
8.2.1.3 4D Spacetime

Introducing the time coordinate, the interval of a homogeneous and isotropic 4D spacetime can be written as

$$ds^2 = dt^2 - a^2(t) \left[ \frac{dr^2}{1 - K r^2} + r^2 \left( d\theta^2 + \sin^2\theta\, d\varphi^2 \right) \right],$$

where a(t) is a scale factor which may depend on t (allowing for the expansion or contraction of the Universe). Introducing the solid angle, dΩ² = dθ² + sin²θ dϕ², the Robertson–Walker metric can be written as

$$ds^2 = dt^2 - a^2(t) \left[ \frac{dr^2}{1 - K r^2} + r^2\, d\Omega^2 \right]. \qquad (8.75)$$
Finally, the Robertson–Walker metric can also be written using the functions S_k introduced above as

$$ds^2 = dt^2 - a^2(t) \left[ dr^2 + S_k^2\, d\Omega^2 \right]. \qquad (8.76)$$
In general relativity the world lines of freely falling test particles are just the geodesics of the 4D spacetime of the Universe we live in, whose geometry is locally determined by its energy and momentum contents, as expressed by the Einstein equations (written below in the form neglecting the cosmological constant term):

$$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = \frac{8\pi G}{c^4}\, T_{\mu\nu}\,.$$

In the equations above, G_{μν} and R_{μν} are, respectively, the Einstein and the Ricci tensors, which are built from the metric and its derivatives; R = g^{μν} R_{μν} is the Ricci scalar, and T_{μν} is the energy–momentum tensor.
The energy and the momentum of the particles determine the geometry of the
Universe which then determines the trajectories of the particles. Gravity is embedded
in the geometry of spacetime. Time runs slower in the presence of gravitational fields.
The Einstein equations are tensor equations and thus independent of the reference frame (the covariance of the laws of physics is automatically ensured). They involve 4D symmetric tensors and represent in fact 10 independent nonlinear partial differential equations whose solutions, the metrics of spacetime, are in general very difficult to work out. In particular relevant cases, however, exact or approximate solutions can be found. Examples are the Minkowski metric (empty Universe); the Schwarzschild metric, i.e., the metric outside an uncharged, spherically symmetric, nonrotating massive object (see Sect. 8.2.4); the Kerr metric (an axially symmetric vacuum solution); and the Robertson–Walker metric, describing a homogeneous and isotropic Universe (see Sect. 8.2.3).
Einstein at some point introduced in his equations a new term proportional to the metric (with coefficient Λ, the so-called "cosmological constant"):

$$G_{\mu\nu} + \Lambda\, g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}\,. \qquad (8.77)$$
His motivation was to allow for static cosmological solutions, as this term can balance
gravitational attraction. Although later on Einstein discarded this term (the static
Universe would be unstable), the recent discovery of the accelerated expansion of
the Universe might give it again an essential role (see Sects. 8.2.3 and 8.3.3).
The energy–momentum tensor T^{μν} in a Universe of free noninteracting particles with four-momenta p_i^μ moving along trajectories r_i(t) is defined as

$$T^{\mu 0} = \sum_i p_i^{\mu}(t)\, \delta^3\!\left(\mathbf{r} - \mathbf{r}_i(t)\right) \qquad (8.78)$$

$$T^{\mu k} = \sum_i p_i^{\mu}(t)\, \frac{dx_i^k}{dt}\, \delta^3\!\left(\mathbf{r} - \mathbf{r}_i(t)\right). \qquad (8.79)$$
The T^{μ0} terms can be seen as "charges" and the T^{μk} terms as "currents," which obey a continuity equation ensuring energy–momentum conservation. In general relativity, local energy–momentum conservation generalizes the corresponding result in special relativity,

$$\frac{\partial}{\partial x^0}\, T^{\mu 0} + \nabla_i\, T^{\mu i} = 0 \qquad (8.80)$$

or

$$\frac{\partial}{\partial x^{\nu}}\, T^{\mu\nu} = 0\,. \qquad (8.81)$$
To get an intuitive grasp of the physical meaning of the energy–momentum tensor, let us consider the case of a special relativistic perfect fluid (no viscosity). In the rest frame of a fluid with energy density ρ and pressure P,

$$T^{00} = c^2 \rho \qquad (8.82)$$

$$T^{0i} = 0 \qquad (8.83)$$

$$T^{ij} = P\, \delta^{ij}\,. \qquad (8.84)$$

In Newtonian gravity the potential φ obeys the Poisson equation

$$\nabla^2 \phi = 4\pi G\, \rho\,, \qquad (8.85)$$

from which we readily see that pressure does not contribute. On the contrary, the weak field limit of the Einstein equations is

$$\nabla^2 \phi = 4\pi G \left(\rho + \frac{3P}{c^2}\right). \qquad (8.86)$$

For a fluid of ultrarelativistic particles, such as a photon gas, for which

$$P \sim \frac{1}{3}\, \rho\, c^2\,, \qquad (8.87)$$

the weak gravitational field is then determined by

$$\nabla^2 \phi = 8\pi G\, \rho\,, \qquad (8.88)$$

which shows that the gravitational field predicted by general relativity is twice the one predicted by Newtonian gravity. Indeed, the light deflection observed by Eddington in 1919 at the São Tomé and Príncipe islands during a solar eclipse was twice the one expected according to classical Newtonian mechanics.
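The factor of two can be made concrete by evaluating the standard general relativistic deflection angle, α = 4GM/(c²b), for a light ray grazing the Sun (impact parameter b = R_⊙); the numbers below are the usual solar values:

```python
import math

G = 6.674e-11      # m^3 kg^-1 s^-2
c = 2.998e8        # m/s
M_sun = 1.989e30   # kg
R_sun = 6.96e8     # m, solar radius (grazing impact parameter)

alpha_newton = 2.0 * G * M_sun / (c ** 2 * R_sun)  # Newtonian/corpuscular value
alpha_gr = 4.0 * G * M_sun / (c ** 2 * R_sun)      # general relativity: twice as large

alpha_gr_arcsec = alpha_gr * (180.0 / math.pi) * 3600.0  # ~1.75 arcsec
```

The general relativistic value of about 1.75 arcseconds is the one confirmed by the 1919 eclipse expeditions.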
Once the metric is known, the free-fall trajectories of test particles are obtained "just" by solving the geodesic equations

$$\frac{d^2 x^{\sigma}}{d\tau^2} + \Gamma^{\sigma}_{\mu\nu}\, \frac{dx^{\mu}}{d\tau}\frac{dx^{\nu}}{d\tau} = 0\,, \qquad (8.89)$$

where the Γ^σ_{μν} are the Christoffel symbols, given by

$$\Gamma^{\sigma}_{\mu\nu} = \frac{g^{\rho\sigma}}{2}\left(\frac{\partial g_{\nu\rho}}{\partial x^{\mu}} + \frac{\partial g_{\mu\rho}}{\partial x^{\nu}} - \frac{\partial g_{\mu\nu}}{\partial x^{\rho}}\right). \qquad (8.90)$$

In the particular case of flat space in Cartesian coordinates, the metric tensor is everywhere constant, Γ^σ_{μν} = 0, and then

$$\frac{d^2 x^{\mu}}{d\tau^2} = 0\,:$$

the classical straight world lines of free particles are recovered.
The present standard model of cosmology relies on the so-called cosmological principle, which states that the Universe is homogeneous and isotropic at large scales. Homogeneity means that, in Einstein's words, "all places in the Universe are alike," and isotropy means that all directions are equivalent. The Robertson–Walker metric discussed before (Sect. 8.2.1) embodies these symmetries, leaving two independent functions, a(t) and K(t), which represent, respectively, the evolution of the scale and of the curvature of the Universe. The Russian physicist Alexander Friedmann in 1922, and independently the Belgian Georges Lemaître in 1927, solved the Einstein equations for such a metric, obtaining the famous Friedmann equations, which are still the starting point for the standard cosmological model, also known as the Friedmann–Lemaître–Robertson–Walker (FLRW) model.
The first of the Friedmann equations relates the expansion rate to the total energy density ρ and to the curvature, while the second relates the acceleration of the expansion to ρ and to the pressure P; energy–momentum conservation implies in addition

$$\frac{d}{dt}\left(\rho\, a^3\right) = -P\, \frac{d}{dt}\left(a^3\right), \qquad (8.93)$$

corresponding to dE = −P dV. Defining the Hubble parameter as

$$H = \frac{\dot{a}}{a}\,,$$

the first Friedmann equation is also often written as

$$H^2 + \frac{K}{a^2} = \frac{8\pi G}{3}\,\rho + \frac{\Lambda}{3}\,, \qquad (8.94)$$

which shows that the Hubble constant is not really a constant, but a parameter that evolves with the evolution of the Universe.
The two Friedmann equations are "almost" recovered in classical mechanics. The striking differences are that in classical mechanics the pressure does not contribute to the "gravitational mass," and that the Λ term must be introduced by hand, as a form of repulsive potential.
The curvature of spacetime is, in this "classical" version, associated with (minus) the total energy of the system, which can somehow be interpreted as a "binding energy." The two Friedmann equations determine, once the energy density ρ and the pressure P are known, the evolution of the scale a(t) and of the curvature K(t) of the Universe. However, ρ and P are nontrivial quantities: they depend critically on the amounts of the different forms of energy and matter present in the Universe at each stage of its evolution.
In the simplest case of a Universe containing just nonrelativistic particles (ordinary baryonic matter or "cold," i.e., nonrelativistic, dark matter) the pressure is negligible with respect to the energy density (P ≪ ρ_m c²) and the Friedmann equations can be approximated as

$$\frac{d}{dt}\left(\rho_m\, a^3\right) = 0 \qquad (8.97)$$

$$\frac{\ddot{a}}{a} = -\frac{4\pi G}{3}\, \rho_m\,, \qquad (8.98)$$

leading to

$$\rho_m \propto \frac{1}{a^3}\;; \qquad a(t) \propto t^{2/3}\,. \qquad (8.99)$$
a
In general, for a Universe with just one kind of matter, characterized by an equation of state relating ρ and P of the type

$$P = \alpha\, \rho\,, \qquad (8.100)$$

the energy density scales as ρ ∝ a^{−3(1+α)}. In particular, for a radiation-dominated Universe (α = 1/3),

$$a(t) \propto t^{1/2}\;; \qquad \rho_{\rm rad} \propto \frac{1}{a^4}\,. \qquad (8.102)$$

This last relation can be interpreted by taking, as an example, a photon-dominated Universe, where the decrease in the number density of photons (n_γ ∝ a⁻³) combines with the decrease in the mean photon energy (E_γ ∝ a⁻¹) corresponding to the wavelength dilation.
For a static Universe (ȧ = ä = 0) the Friedmann equations require

$$\frac{K}{a^2} = \frac{8\pi G}{3}\,\rho\;; \qquad \rho + 3P = 0\,. \qquad (8.103)$$

A positive total energy density thus requires a negative pressure. This can be provided by the cosmological constant, which behaves as a fluid with

$$\rho_{\Lambda} = \frac{\Lambda}{8\pi G} \qquad (8.105)$$

and

$$P_{\Lambda} = -\rho_{\Lambda}\,. \qquad (8.106)$$

With

$$\rho = \rho_m + \rho_{\Lambda}\,, \qquad (8.107)$$

the static (Einstein) Universe then requires

$$\rho_m = 2\, \rho_{\Lambda}\,. \qquad (8.108)$$
A Universe dominated by the cosmological constant (a de Sitter Universe) expands instead exponentially,

$$a(t) \propto e^{H t}\,,$$

with

$$H = \sqrt{\frac{\Lambda}{3}}\,. \qquad (8.112)$$

Thus the de Sitter Universe has an exponential expansion, while its energy density remains constant.
The first Friedmann equation can also be written as

$$\frac{K}{a^2} = \frac{8\pi G}{3}\,\rho - H^2\,. \qquad (8.113)$$

Therefore, if

$$\rho = \rho_{\rm crit} = \frac{3 H^2}{8\pi G}\,, \qquad (8.114)$$

one obtains K = 0,
i.e., less than 6 hydrogen atoms per cubic meter. The number of baryons per cubic meter obtained from galaxy counts is, however, about twenty times smaller, consistent with the result of the fit to CMB data.
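The statement of about 5–6 hydrogen atoms per cubic meter follows directly from Eq. (8.114); a quick numerical sketch, with H₀ = 67.8 km/s/Mpc as an indicative value:

```python
import math

G = 6.674e-11                  # m^3 kg^-1 s^-2
H0 = 67.8e3 / 3.086e22         # Hubble constant in s^-1 (indicative value)
m_H = 1.67e-27                 # kg, mass of a hydrogen atom

rho_crit = 3.0 * H0 ** 2 / (8.0 * math.pi * G)  # Eq. (8.114), kg/m^3
atoms_per_m3 = rho_crit / m_H                   # ~5 hydrogen atoms per m^3
```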
The energy densities of each type of component (matter, radiation, and vacuum) are often normalized to the critical density:

$$\Omega_i = \frac{\rho_i}{\rho_{\rm crit}} = \frac{8\pi G}{3 H^2}\, \rho_i\,. \qquad (8.116)$$
Defining analogously

$$\Omega_K = -\frac{K}{H^2 a^2} = -\frac{K}{\dot{a}^2}\,, \qquad (8.117)$$

the first Friedmann equation

$$\frac{8\pi G}{3 H^2}\,\rho - \frac{K}{H^2 a^2} = 1 \qquad (8.118)$$

takes then a very simple form:

$$\Omega_m + \Omega_{\rm rad} + \Omega_{\Lambda} + \Omega_K = 1\,. \qquad (8.119)$$
On the other hand, taking into account the specific evolution of each type of density with the scale parameter a, the evolution equation for the Hubble parameter can be written as

$$H^2 = H_0^2\left(\Omega^0_{\Lambda} + \Omega^0_K\, a^{-2} + \Omega^0_m\, a^{-3} + \Omega^0_{\rm rad}\, a^{-4}\right), \qquad (8.120)$$

where the scale factor is normalized so that (1 + z) = a⁻¹.
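Equation (8.120) is straightforward to evaluate numerically; a minimal sketch, with indicative present-day density parameters (Ω⁰_Λ = 0.7, Ω⁰_m = 0.3, Ω⁰_rad = 5 × 10⁻⁵, Ω⁰_K = 0):

```python
def hubble(a, H0=67.8, OL=0.7, OK=0.0, Om=0.3, Orad=5.0e-5):
    """H(a) from Eq. (8.120), in the units of H0 (km/s/Mpc); a = 1/(1+z)."""
    return H0 * (OL + OK * a ** -2 + Om * a ** -3 + Orad * a ** -4) ** 0.5

H_today = hubble(1.0)   # recovers ~H0, since the Omegas sum to ~1
H_z1 = hubble(0.5)      # Hubble parameter at redshift z = 1
```

At z = 1 the expansion rate was already almost twice the present one, dominated by the matter term.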
The deceleration parameter is defined as

$$q_0 = -\frac{\ddot{a}}{H_0^2\, a}\,. \qquad (8.122)$$
Assuming for each component an equation of state

$$P_i = \alpha_i\, \rho_i\,,$$

one obtains

$$q_0 = -\frac{\ddot{a}}{H_0^2\, a} = \frac{1}{2}\,\frac{8\pi G}{3 H_0^2} \sum_i \rho_i \left(1 + 3\alpha_i\right) = \frac{1}{2} \sum_i \Omega_i \left(1 + 3\alpha_i\right),$$

i.e.,

$$q_0 = \frac{1}{2}\,\Omega^0_m + \Omega^0_{\rm rad} - \Omega^0_{\Lambda}\,. \qquad (8.123)$$
These equations for H and q₀ are of the utmost importance, since they directly connect the experimentally measured quantities H₀ and q₀ to the densities of the various energy species in the Universe.
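As a numerical illustration of Eq. (8.123), with the indicative values Ω⁰_m = 0.3, Ω⁰_rad = 5 × 10⁻⁵, Ω⁰_Λ = 0.7:

```python
Om, Orad, OL = 0.3, 5.0e-5, 0.7     # indicative present-day density parameters
q0 = 0.5 * Om + Orad - OL            # Eq. (8.123); q0 ~ -0.55
```

A negative q₀ means the expansion of the Universe is accelerating, driven by the vacuum-energy term.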
$$\Omega = \frac{\rho}{\rho_{\rm crit}} = 1 + \frac{K}{H^2 a^2}\,, \qquad (8.129)$$

where the closure parameter Ω is the sum of Ω_m, Ω_rad, and Ω_Λ, with Ω_rad ≃ 5 × 10⁻⁵ being negligible. This means that, in general, Ω is a function of time, unless Ω = 1 and thus K = 0 (flat Universe).
Present experimental data indicate a value of Ω very close to one; it would look very strange if this were a coincidence, unless Ω is identically equal to one. For this reason this fact is at the heart of the standard model of cosmology, the ΛCDM model, which postulates Ω = 1.
The evolution of the Hubble parameter can be used to estimate the age of the Universe for different compositions of the total energy density. Indeed, from

$$H = \frac{\dot{a}}{a} = \frac{1}{a}\frac{da}{dt} = -\frac{dz/dt}{1+z}$$

one has

$$dt = -\frac{dz}{(1+z)\, H}$$

and thus

$$(t_0 - t) = \frac{1}{H_0} \int_0^{z} \frac{dz'}{(1+z')\left[\Omega^0_{\Lambda} + \Omega^0_K (1+z')^2 + \Omega^0_m (1+z')^3 + \Omega^0_{\rm rad} (1+z')^4\right]^{1/2}}\,. \qquad (8.131)$$
The solution to this equation has to be obtained numerically in most realistic situations. However, in some simplified scenarios an analytical solution can be found. In particular, for matter-dominated (Ω⁰_m = 1) and radiation-dominated (Ω⁰_rad = 1) Universes the solutions are, respectively,
$$(t_0 - t) = \frac{2}{3 H_0} \qquad (8.132)$$

and

$$(t_0 - t) = \frac{1}{2 H_0}\,. \qquad (8.133)$$
In a flat Universe with matter and vacuum energy parameters close to the ones presently measured (Ω⁰_m ≃ 0.3, Ω⁰_Λ ≃ 0.7) we obtain

$$(t_0 - t) \sim \frac{0.96}{H_0}\,. \qquad (8.134)$$
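The integral of Eq. (8.131) can be checked numerically; the sketch below, for a flat Universe with Ω⁰_m = 0.3 and Ω⁰_Λ = 0.7, reproduces the factor 0.96 of Eq. (8.134):

```python
import math

def age_integrand(z, Om=0.3, OL=0.7):
    """Integrand of Eq. (8.131) for a flat matter + vacuum Universe."""
    return 1.0 / ((1.0 + z) * math.sqrt(OL + Om * (1.0 + z) ** 3))

def age_in_hubble_times(z_max=1000.0, n=200_000):
    """t0 * H0 from Eq. (8.131), midpoint-rule integration from z = 0 to z_max."""
    h = z_max / n
    return sum(h * age_integrand((i + 0.5) * h) for i in range(n))

t0_H0 = age_in_hubble_times()   # ~0.96, as in Eq. (8.134)
```

The truncation at z_max = 1000 is safe, since the integrand falls off as (1 + z)^{-5/2} in the matter era.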
The Friedmann equations have four independent parameters, which can be chosen as:
• the present value of the Hubble parameter, H₀;
• the present value of the energy density of radiation, Ω⁰_rad;
• the present value of the energy density of matter, Ω⁰_m;
• the present value of the energy density of vacuum, Ω⁰_Λ.
If these parameters are known, the geometry and the past and future evolution of the Universe are determined, provided the dynamics of the interactions, annihilations, and creations of the different particle components (see Sect. 8.3.1) can be neglected. The solutions of these equations in the general multicomponent scenarios cannot be expressed in closed analytical form and require numerical approaches. However, as discussed above, the energy densities of the different components scale with different powers of the scale parameter a of the Universe. Therefore, there are "eras" in which a single component dominates. It is then reasonable to suppose that, initially, the Universe was radiation dominated (apart from a very short period during which it is believed that inflation occurred; see Sect. 8.3.2), then that it was matter dominated, and finally, at the present time, that the vacuum energy (mostly "dark" energy, i.e., not coming from quantum fluctuations of the vacuum of the known interactions) is starting to dominate (Fig. 8.23).
The crossing point (a = a_cross) between the matter and radiation eras can be obtained, to a first approximation, by just equating the corresponding densities:

$$\frac{a_{\rm cross}}{a_0} = \left(1 + z_{\rm cross}\right)^{-1} = \left[\frac{\Omega_m(a_0)}{\Omega_{\rm rad}(a_0)}\right]^{-1}. \qquad (8.135)$$
The time after the Big Bang at which this crossing point occurs can be obtained, approximately, from the evolution of the scale factor in a radiation-dominated Universe,

$$a_{\rm cross} \sim \left[2 H_0\, \Omega_{\rm rad}^{1/2}(a_0)\, t_{\rm cross}\right]^{1/2}, \qquad (8.136)$$

i.e.,

$$t_{\rm cross} \sim a_{\rm cross}^2 \left[2 H_0\, \Omega_{\rm rad}^{1/2}(a_0)\right]^{-1}. \qquad (8.137)$$
Using the current best fit values for the parameters (see Sect. 8.3.3), one obtains a t_cross of the order of a few times 10⁴ years.
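A rough numerical sketch of Eqs. (8.135)–(8.137), with the indicative values Ω_m = 0.3, Ω_rad = 5 × 10⁻⁵ (photons only), and H₀ = 67.8 km/s/Mpc:

```python
import math

Om, Orad = 0.3, 5.0e-5        # indicative density parameters (photons only)
H0 = 67.8e3 / 3.086e22        # Hubble constant in s^-1
YEAR_S = 3.156e7              # seconds per year

z_cross = Om / Orad - 1.0                              # Eq. (8.135)
a_cross = 1.0 / (1.0 + z_cross)
t_cross = a_cross ** 2 / (2.0 * H0 * math.sqrt(Orad))  # Eq. (8.137), in seconds
t_cross_yr = t_cross / YEAR_S                          # a few 1e4 years
```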
After this time (i.e., during the large majority of the evolution of the Universe) a two-component (matter and vacuum) description should give a reasonable approximation. In this case the geometry and the evolution of the Universe are determined only by Ω_m and Ω_Λ. Although this is a restricted parameter space, there are several different possible evolution scenarios, as shown in Fig. 8.24:
1. If Ω_m + Ω_Λ = 1 the Universe is flat, but it can either expand forever (Ω_Λ > 0) or eventually recollapse (Ω_Λ < 0).
2. If Ω_m + Ω_Λ > 1 the Universe is closed (positive curvature).
3. If Ω_m + Ω_Λ < 1 the Universe is open (negative curvature).
4. In a small region of the parameter space with Ω_m + Ω_Λ > 1 and Ω_Λ > 0 there is a solution with no Big Bang: the Universe bounces between a minimum and a maximum scale factor.
Some of these evolution scenarios are represented as functions of time in Fig. 8.25 for selected points in the parameter space discussed above. The green curve represents a flat, matter-dominated, critical-density Universe (the expansion rate continually slows down). The blue curve shows an open, low-density, matter-dominated Universe (the expansion slows down, but not as much). The orange curve shows a closed, high-density Universe (the expansion reverses into a "big crunch"). The red curve shows a Universe with a large fraction of "dark energy" (the expansion of the Universe accelerates). The present experimental evidence (see Sect. 8.3.3) strongly favors the "dark energy" scenario, leading to a cold thermal death of the Universe.
The first exact analytical solution of the Einstein equations was found in 1915, just a month after the publication of Einstein's original paper, by Karl Schwarzschild, a German physicist who died one year later from a disease contracted on the First World War battlefield.
Schwarzschild’s solution describes the gravitational field in the vacuum surround-
ing a single, spherical, nonrotating massive object. In this case the spacetime metric
(the Schwarzschild metric) can be expressed as
$$ds^2 = \left(1-\frac{r_S}{r}\right) c^2\, dt^2 - \left(1-\frac{r_S}{r}\right)^{-1} dr^2 - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \qquad (8.139)$$

with

$$r_S = \frac{2\,G M}{c^2} \simeq 3\ {\rm km}\ \frac{M}{M_{\rm Sun}}\,. \qquad (8.140)$$
In the weak field limit, $r \to \infty$, we recover flat spacetime. According to this solution, a clock with period $\tau^*$ placed at a point $r$ is seen by an observer placed at $r = \infty$ with a period $\tau$ given by:

$$\tau = \left(1-\frac{r_S}{r}\right)^{-1/2} \tau^*\,. \qquad (8.141)$$
In the limit r → r S (the Schwarzschild radius) the metric shows a coordinate sin-
gularity: the time component goes to zero and the radial component goes to infinity.
From the point of view of an asymptotic observer, the period τ ∗ is seen now as
infinitely large. No light emitted at r = r S is able to reach the r > r S world. This is
what is usually called, following John Wheeler, a “black hole”.
The existence of objects so massive that light would not be able to escape from them was already predicted at the end of the eighteenth century by Michell in England and independently by Laplace in France. They simply realized that, if the escape velocity from a massive object were greater than the speed of light, light could not escape from the object:

$$v_{\rm esc} = \sqrt{\frac{2GM}{r}} > c\,. \qquad (8.142)$$
Thus an object with radius $R$ and mass $M$ would be a "black hole" if:

$$M > \frac{c^2}{G}\,\frac{R}{2}\,; \qquad (8.143)$$

the "classical" radius and the Schwarzschild radius coincide.
The singularity observed in the Schwarzschild metric is in fact not a real physical singularity; it depends on the reference frame chosen (see [8.3] for a discussion). An observer in a free-fall frame will cross the Schwarzschild surface without feeling any discontinuity; (s)he will go on receiving signals from the outside world but will not be able to escape the unavoidable: being crushed, at last, at the center of the black hole (the real physical singularity).
Schwarzschild black holes are, however, just a specific case. In 1963, the New Zealand mathematician Roy Kerr found an exact solution to the Einstein equations for the case of a rotating uncharged black hole, and two years later the US physicist Ezra Newman extended it to the more general case of rotating charged black holes. In fact, it can be proved that an electrovacuum black hole is completely described by three parameters: mass, angular momentum, and electric charge (the so-called no-hair theorem).
Black holes are not just exotic solutions of the General Theory of Relativity. They may be formed either by gravitational collapse or by high-energy particle collisions. While so far there is no evidence of their formation in human-made accelerators, there is striking indirect evidence that they are part of several binary systems and that they are present at the center of most galaxies, including our own (the Milky Way hosts at its center a black hole of roughly 4 million solar masses, as determined from the orbits of nearby stars). Extreme high-energy phenomena in the Universe, generating the most energetic cosmic rays, may also be caused by supermassive black holes inside AGNs (Active Galactic Nuclei—see Chap. 10).
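As a quick numerical illustration of Eq. 8.140 (a sketch using standard values of $G$, $c$, and the solar mass; the $4 \times 10^6$ solar masses is the figure quoted above for the Milky Way's central black hole):

```python
G = 6.67430e-11        # m^3 kg^-1 s^-2
C = 2.99792458e8       # m/s
M_SUN = 1.989e30       # kg

def schwarzschild_radius_m(mass_kg):
    """Schwarzschild radius r_S = 2GM/c^2 (Eq. 8.140), in meters."""
    return 2.0 * G * mass_kg / C**2

print(schwarzschild_radius_m(M_SUN))          # ~3e3 m: about 3 km for one solar mass
print(schwarzschild_radius_m(4.0e6 * M_SUN))  # ~1.2e10 m for the Milky Way's central black hole
```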
In its "first" moments the Universe, according to the Big Bang model, is filled with a high-density, hot (high-energy) gas of relativistic particles in thermal equilibrium. The assumption of thermal equilibrium is justified since the interaction rate per particle ($\Gamma = n \sigma v$: $n$, number density; $\sigma$, cross section; $v$, mean relative velocity) and the Hubble parameter $H$ ($H^2 \sim \frac{8\pi G}{3}\rho$) evolve with the energy density $\rho$ as

$$\Gamma \propto n \propto \rho^{3/4}\,; \qquad H \propto \rho^{1/2}\,. \qquad (8.144)$$

Since $\Gamma/H \propto \rho^{1/4}$, going back in time (increasing density) the interactions become fast compared with the expansion:

$$\Gamma \gg H\,, \quad {\rm i.e.,} \quad \frac{1}{\Gamma} \ll \frac{1}{H}\,. \qquad (8.145)$$

Since the early Universe is radiation dominated (Sect. 8.2),

$$\rho_{\rm rad} \propto \frac{1}{a^4}\,; \qquad a(t) \propto t^{1/2}\,. \qquad (8.146)$$
The temperature is, by definition, proportional to the mean particle energy and thus, in the case of radiation, it increases proportionally to the inverse of the Universe scale factor:

$$T \propto a^{-1}\,. \qquad (8.147)$$
On the other hand, at each temperature $T$ the number density, the energy density, and the pressure of each particle type are given (neglecting chemical potentials) by standard quantum statistical mechanics:

$$n_i = \frac{g_i}{(2\pi\hbar)^3} \int_0^\infty \frac{4\pi p^2}{e^{E_i/k_B T} \mp 1}\, dp\,, \qquad (8.148)$$

$$\rho_i c^2 = \frac{g_i}{(2\pi\hbar)^3} \int_0^\infty E_i\, \frac{4\pi p^2}{e^{E_i/k_B T} \mp 1}\, dp\,, \qquad (8.149)$$

$$P_i = \frac{g_i}{(2\pi\hbar)^3} \int_0^\infty \frac{p^2 c^2}{3 E_i}\, \frac{4\pi p^2}{e^{E_i/k_B T} \mp 1}\, dp\,, \qquad (8.150)$$

where $g_i$ are the internal degrees of freedom of the particles; the $-$ and $+$ signs are for bosons (Bose–Einstein statistics) and fermions (Fermi–Dirac statistics), respectively.
For $k_B T \gg m_i c^2$ (relativistic limit):

$$n_i = \begin{cases} g_i \dfrac{\zeta(3)}{\pi^2} \left( \dfrac{k_B T}{\hbar c} \right)^3 & \text{for bosons,} \\[2mm] \dfrac{3}{4}\, g_i \dfrac{\zeta(3)}{\pi^2} \left( \dfrac{k_B T}{\hbar c} \right)^3 & \text{for fermions;} \end{cases} \qquad (8.151)$$

$$\rho_i c^2 = \begin{cases} g_i \dfrac{\pi^2}{30}\, k_B T \left( \dfrac{k_B T}{\hbar c} \right)^3 & \text{for bosons,} \\[2mm] \dfrac{7}{8}\, g_i \dfrac{\pi^2}{30}\, k_B T \left( \dfrac{k_B T}{\hbar c} \right)^3 & \text{for fermions;} \end{cases} \qquad (8.152)$$

$$P_i = \frac{\rho_i c^2}{3}\,.$$
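Equation 8.151 can be checked against a familiar number: for photons (bosons, $g = 2$) at the present CMB temperature of 2.725 K it should give the $\approx 410$ photons/cm³ of the cosmic microwave background quoted later in this chapter. A minimal numerical sketch:

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m/s
KB = 1.380649e-23        # J/K
ZETA3 = 1.2020569        # Riemann zeta(3)

def photon_number_density(T_kelvin):
    """Boson number density of Eq. 8.151 with g = 2, in photons per cm^3."""
    x = KB * T_kelvin / (HBAR * C)                 # (k_B T / hbar c), in 1/m
    return 2.0 * ZETA3 / math.pi**2 * x**3 * 1e-6  # m^-3 -> cm^-3

print(photon_number_density(2.725))  # ~410 photons per cm^3
```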
8.3 Past, Present, and Future of the Universe 475
The total energy density in the early Universe can be obtained by summing over all possible relativistic particles and can be written as

$$\rho c^2 = \frac{\pi^2}{30}\, g^*_{\rm eff}\, k_B T \left( \frac{k_B T}{\hbar c} \right)^3, \qquad (8.153)$$

where $g^*_{\rm eff}$ is defined as the total "effective" number of degrees of freedom, and is given by

$$g^*_{\rm eff} = \sum_{\rm bosons} g_i + \frac{7}{8} \sum_{\rm fermions} g_j\,. \qquad (8.154)$$
However, the scattering interaction rate of some relativistic particles (like neutrinos, see below) may at some point become smaller than the expansion rate of the Universe, and then they will no longer be in thermal equilibrium with the other particles. They are said to decouple, and their temperature will evolve as $a^{-1}$ independently of the temperature of the other particles. The individual temperatures $T_i$, $T_j$ may be introduced in the definition of the "effective" number of degrees of freedom as

$$g_{\rm eff} = \sum_{\rm bosons} g_i \left( \frac{T_i}{T} \right)^4 + \frac{7}{8} \sum_{\rm fermions} g_j \left( \frac{T_j}{T} \right)^4 \qquad (8.155)$$

($g_{\rm eff}$ is of course a function of the age of the Universe). At a given time all the particles with $M_x c^2 \ll k_B T$ contribute.
The total energy density determines the evolution of the Hubble parameter:

$$H^2 \sim \frac{8\pi G}{3\,c^2}\, \frac{\pi^2}{30}\, g_{\rm eff}\, k_B T \left( \frac{k_B T}{\hbar c} \right)^3, \qquad (8.156)$$

i.e.,

$$H \sim \left( \frac{4 \pi^3 G}{45\, \hbar^3 c^5} \right)^{1/2} g_{\rm eff}^{1/2}\, (k_B T)^2\,, \qquad (8.157)$$

or

$$H \simeq 1.66\; g_{\rm eff}^{1/2}\, \frac{(k_B T)^2}{\hbar\, c^2\, m_P}\,, \qquad (8.158)$$

where $m_P = \sqrt{\hbar c / G}$ is the Planck mass.
476 8 The Standard Model of Cosmology and the Dark Universe
Remembering (Sect. 8.2) that in a radiation-dominated Universe the Hubble parameter is related to time simply by

$$H = \frac{1}{2t}\,, \qquad (8.159)$$

time and temperature are related by

$$t = \left( \frac{45\, \hbar^3 c^5}{16\, \pi^3 G} \right)^{1/2} \frac{1}{g_{\rm eff}^{1/2}}\, \frac{1}{(k_B T)^2}\,. \qquad (8.160)$$
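Equation 8.160 is easy to evaluate numerically. A sketch (the value $g_{\rm eff} \approx 10.75$, appropriate for photons, $e^\pm$ pairs, and three neutrino species around 1 MeV, is an assumption from standard Big Bang thermodynamics, not a number taken from the text):

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m/s
G = 6.67430e-11          # m^3 kg^-1 s^-2
MEV = 1.602176634e-13    # J

def age_at_temperature(kT_MeV, g_eff):
    """Time (in seconds) when the Universe had temperature k_B T (Eq. 8.160)."""
    prefactor = math.sqrt(45.0 * HBAR**3 * C**5 / (16.0 * math.pi**3 * G))
    return prefactor / (math.sqrt(g_eff) * (kT_MeV * MEV)**2)

print(age_at_temperature(1.0, 10.75))  # ~0.7 s: consistent with nucleosynthesis at t ~ 1 s
```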
The entropy $S$ in a comoving volume $V$ is given by

$$S = \frac{\rho c^2 + P}{k_B T}\, V\,, \qquad (8.162)$$

corresponding to an entropy density

$$s = \frac{\rho c^2 + P}{k_B T}\,. \qquad (8.163)$$

Following the same steps as for the energy density, an "effective" number of degrees of freedom for the entropy can be defined:

$$g^s_{\rm eff} = \sum_{\rm bosons} g_i \left( \frac{T_i}{T} \right)^3 + \frac{7}{8} \sum_{\rm fermions} g_j \left( \frac{T_j}{T} \right)^3. \qquad (8.165)$$
At early times the possibility of physics beyond the standard model (Grand Unified Theories (GUTs) with new bosons and Higgs fields; supersymmetry, which associates with each of the existing bosons or fermions a new fermion or boson, respectively; ...) may increase this number. The way up to the Planck time ($\sim 10^{-43}$ s), where general relativity meets quantum mechanics and all the interactions may become unified, remains basically unknown. Quantum gravity theories like string theory or loop quantum gravity have been extensively explored in recent years but, for the moment, they are still more elegant mathematical constructions than real physical theories. A review of such attempts is beyond the scope of the present book. Only the decoupling of a possible stable heavy dark matter particle will be discussed in the following.
At later times the temperature decreases, and $g_{\rm eff}$ decreases as well. At $k_B T \sim 0.2$ GeV hadronization occurs, and quarks and gluons become confined into massive hadrons. At $k_B T \sim 1$ MeV ($t \sim 1$ s) the light elements are formed (primordial nucleosynthesis, see Sect. 8.1.3). Around the same temperature neutrinos also decouple, as will be discussed below. At $k_B T \sim 0.8$ eV the total energy density of nonrelativistic particles becomes higher than the total energy density of relativistic particles, and the Universe enters a matter-dominated era (see Sect. 8.2.3). Finally, at $k_B T \sim 0.3$ eV, recombination and decoupling occur (see Sect. 8.1.2). At that moment the hot plasma of photons, baryons, and electrons, which was coherently oscillating under the combined action of gravity (attraction) and radiation pressure (repulsion), breaks apart: photons propagate away, originating the CMB, while the baryon oscillations stop (no more radiation pressure), leaving a density excess at a fixed radius (the sound horizon) which, convolved with the initial density fluctuations, seeds the subsequent structure formation. This entire evolution scenario is strongly constrained by the existence of dark matter, which is gravitationally coupled to baryons and at that time represented about 2/3 of the total energy of the Universe.
The present most popular cosmological model for the Universe (the so-called ΛCDM model—see Sect. 8.3.3) assumes that dark matter is formed by stable massive nonrelativistic particles. These particles must be weakly interacting—otherwise they would have been found (see later); the acronym WIMP (Weakly Interacting Massive Particle) is often used to name them since, for several reasons that will be discussed below, the favorite theoretical guess compatible with experiment is that they are heavier than 45 GeV. The lightest supersymmetric particle, possibly one of the neutralinos χ (see the previous chapter), is for many the most likely candidate; we shall often use the symbol χ to indicate a generic WIMP. WIMPs must be neutral, and we assume without loss of generality that they coincide with their antiparticle (as is the case for the neutralino).
Decoupling ("freeze-out") of the dark matter particle χ occurs when its interaction rate drops to the level of the expansion rate:

$$\Gamma_\chi = n\, \sigma v \sim H\,. \qquad (8.168)$$

After decoupling, annihilations cease and the density of dark matter particles just decreases as $a^{-3}$. The value of the decoupling density is therefore an inverse function of $\langle \sigma v \rangle$, where the velocity $v$ is small for a large-mass particle. Figure 8.26 shows the number density of a hypothetical dark matter particle as a function of time (expressed in terms of the ratio $m_\chi c^2 / k_B T$) for different assumed values of $\langle \sigma v \rangle$.
If this new particle χ has Fermi-type weak interactions (Chap. 6), its annihilation cross section at low energies can be expressed as

$$\sigma \sim G_\chi^2\, m_\chi^2 \sim \frac{g_\chi^4}{m_\chi^2}\,, \qquad (8.169)$$

where $G_\chi$ and $g_\chi$ are the effective and the elementary coupling constants, respectively.
A fixed relation between $g_\chi$ and $m_\chi$, as shown in Fig. 8.27, can then ensure a given decoupling density of particles. In particular, the expected values for a WIMP with $g_\chi \sim g_W \sim 0.6$ and $m_\chi \sim m_W \sim 80$ GeV give the right scale for the observed dark matter density ($\Omega_{DM} \sim 0.2$–$0.3$, see Sect. 8.3.3); this coincidence is called the WIMP miracle. A WIMP can indeed be the mysterious missing dark particle, but it is not the only available solution.
However, the Fermi interaction is a model valid only at low energies; its divergence at higher energies is regularized by introducing the intermediate vector bosons $W^\pm$:

$$\frac{1}{M_W^2} \to \frac{1}{q^2 - M_W^2}\,,$$

which drastically changes the energy dependence of the cross section from

$$\sigma \propto E^2$$

to

$$\sigma \propto E^{-2}\,.$$

A more careful treatment should then be carried out. In the present text, we just want to stress that this fact explains the usual break, for masses around $M_W$, in the sensitivity of the present upper limits on WIMP cross sections (Fig. 8.36) reported in Sect. 8.4.2.
Decoupling of neutrinos occurs, similarly to what was discussed in Sect. 8.1.3 in the case of primordial nucleosynthesis, whenever the neutrino interaction rate $\Gamma_\nu$ satisfies

$$\Gamma_\nu \sim H\,.$$

Neutrinos interact only via weak interactions (like $\nu\, e^- \to \nu\, e^-$), and thus for $k_B T \sim \sqrt{s} \ll m_W$ their cross sections have a magnitude of the order of

$$\sigma \sim G_F^2\, s \sim G_F^2\, (k_B T)^2\,. \qquad (8.170)$$
After neutrino decoupling, when the temperature drops below the electron mass, electron–positron pairs annihilate, heating the photons but not the already decoupled neutrinos. Conservation of entropy in a comoving volume implies

$$g^s_{\rm eff}\, T^3 = {\rm constant}\,,$$

so that, comparing the epochs before and after the annihilation,

$$g^{\,e\gamma}_{\rm eff}\, T^3_{e\gamma} = g^{\,\gamma}_{\rm eff}\, T^3_{\gamma}\,;$$

before decoupling

$$g^{\,e\gamma}_{\rm eff} = 2 \times 2 \times \frac{7}{8} + 2 = \frac{11}{2}\,, \qquad (8.171)$$

and after decoupling

$$g^{\,\gamma}_{\rm eff} = 2\,. \qquad (8.172)$$

Therefore,

$$\frac{T_\gamma}{T_{e\gamma}} \simeq \left( \frac{11}{4} \right)^{1/3} \simeq 1.4\,. \qquad (8.173)$$
The temperature of photons after the annihilation of the electrons and positrons is
thus higher than the neutrino temperature at the same time (the so-called reheating).
The temperature of the neutrino cosmic background is therefore nowadays around
1.95 K, while the temperature of the CMB is around 2.73 K (see Sect. 8.1.2).
The ratio between the number densities of cosmic background neutrinos (and antineutrinos) and photons can then be computed using Eq. 8.153 as

$$\frac{N_\nu}{N_\gamma} = 3\, \frac{3}{11}\,,$$

where the factor 3 takes into account the existence of three relativistic neutrino families. Recalling that nowadays $N_\gamma \simeq 410/{\rm cm}^3$, the number density of cosmological neutrinos should be

$$N_\nu \simeq 340/{\rm cm}^3\,.$$
The detection of such neutrinos (which are all around us) remains an enormous challenge for experimental particle and astroparticle physicists.
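The neutrino temperature and number density quoted above follow from Eq. 8.173 and the fermion/boson factors of Eq. 8.151; as a numerical sketch:

```python
# Neutrino background temperature and number density.
T_CMB = 2.725                                 # K, photon temperature today
T_nu = T_CMB / (11.0 / 4.0) ** (1.0 / 3.0)    # Eq. 8.173: photons reheated by e+e- annihilation

N_gamma = 410.0                               # photons per cm^3 (CMB)
N_nu = 3.0 * (3.0 / 11.0) * N_gamma           # three families, factor 3/11 each

print(T_nu)  # ~1.95 K
print(N_nu)  # ~335 per cm^3, i.e. the ~340/cm^3 quoted above
```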
The early Universe should have been remarkably flat, isotropic, and homogeneous to be consistent with the present measurements of the total energy density of the Universe (equal or very close to the critical density) and with the extremely tiny temperature fluctuations ($\sim 10^{-5}$) observed in the CMB. On the contrary, at scales $\sim 50$ Mpc, the observed Universe is filled with rather inhomogeneous structures, like galaxies, clusters, superclusters, and voids. The solution to this apparent paradox was given by introducing, in the very early Universe, an exponential superluminal expansion (receding velocities much greater than the speed of light): the so-called inflation.
In the inflationary scenario the expansion is driven by a scalar field $\phi$ (the "inflaton"), with energy density and pressure

$$\rho = \frac{1}{2\,\hbar c^3}\, \dot{\phi}^2 + V(\phi)\,; \qquad P = \frac{1}{2\,\hbar c^3}\, \dot{\phi}^2 - V(\phi)\,. \qquad (8.174)$$

Thus, whenever

$$\frac{1}{2\,\hbar c^3}\, \dot{\phi}^2 < V(\phi)\,,$$

the pressure is negative and, if the potential term dominates, the Universe expands exponentially,

$$a(t) \sim e^{H t}\,, \qquad (8.175)$$

growing during the inflationary period by a factor

$$\frac{a(t_f)}{a(t_i)} \sim e^N\,, \qquad (8.176)$$

where $N$ is the number of e-foldings. During such an exponential expansion the deviation from the critical density,

$$\Omega - 1 = \frac{K c^2}{H^2 a^2}\,,$$

is driven exponentially toward zero. Then, at the end of the inflationary period, the energy density will be very close to the critical density, as predicted by extrapolating back the present measured energy density values to the early moments of the Universe (the so-called flatness problem). For example, at the epoch of primordial nucleosynthesis ($t \sim 1$ s) the deviation from the critical density should be $\lesssim 10^{-12}$–$10^{-16}$.
The exponential expansion will also give a solution to the puzzle that arises from the observations of the extreme uniformity of the CMB temperature measured all over the sky (the so-called horizon problem).
In the standard Big Bang model the horizon distance (the maximum distance light could have traveled since the origin of time) at last scattering ($t_{ls} \sim 3 \times 10^5$ years, $z_{ls} \sim 1100$) is given by

$$d_H = a(t_{ls}) \int_0^{t_{ls}} \frac{c\, dt}{a(t)}\,. \qquad (8.177)$$

In the Big Bang model there was (see Sect. 8.2.3) first a radiation-dominated expansion followed by a matter-dominated expansion, with scale parameter evolutions $a(t) \propto t^{1/2}$ and $a(t) \propto t^{2/3}$, respectively. The crossing point was computed to be around $t_{\rm cross} \sim 7 \times 10^4$ years.
Then, assuming that during most of the time the Universe is matter dominated (the correction due to the radiation-dominated period is small),

$$d_H \sim 3\, c\, t_{ls}\,, \qquad (8.178)$$

while the distance traveled by the CMB photons between last scattering and the present time $t_0$ is

$$D \sim 3\, c\, t_0\,.$$
The regions causally connected at the time of last scattering (when the CMB photons were emitted), as seen by an observer on Earth, have an angular size of

$$\delta\theta \sim \frac{d_H}{D}\,(1+z_{ls})\, \frac{180^\circ}{\pi} \sim 1^\circ{-}2^\circ\,, \qquad (8.179)$$

where the $(1+z_{ls})$ factor accounts for the expansion between the time of last scattering and the present.
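Plugging in the numbers quoted in the text (with an assumed present age $t_0 \approx 1.4 \times 10^{10}$ years) reproduces the 1°–2° figure of Eq. 8.179:

```python
import math

t_ls = 3.0e5    # years, time of last scattering (from the text)
t_0 = 1.4e10    # years, present age of the Universe (assumed value)
z_ls = 1100.0   # redshift of last scattering (from the text)

# d_H / D = (3 c t_ls) / (3 c t_0) = t_ls / t_0; the factor c cancels.
delta_theta = (t_ls / t_0) * (1.0 + z_ls) * 180.0 / math.pi
print(delta_theta)  # ~1.4 degrees
```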
Regions separated by more than this angular distance have, in the standard Big Bang model, no way to have been in thermal equilibrium. Inflation, by postulating a superluminal expansion at a very early time, ensures that the entire Universe that we can now observe was causally connected in those first moments.
Finally, according to the Big Bang picture, at the very early moments of the Universe all the interactions should be unified. As the temperature later decreases, successive phase transitions due to spontaneous symmetry breaking originate the present world we live in, in which the different interactions are well individualized. The problem is that the Grand Unified Theory (GUT) phase transition should give rise to a high density of magnetic monopoles. Although no such monopoles were ever observed (the so-called "monopole problem"), if inflation had occurred just after the GUT phase transition the monopoles (or any other possible relics, e.g., the gravitino) would be extremely diluted, and this problem would be solved.
It is then tempting to associate the inflaton field with some GUT breaking mechanism, but it was shown that potentials derived from GUTs do not work; for this reason the inflaton potential is still, for the moment, an empirical choice.
Nowadays, the most relevant and falsifiable aspect of inflation models is their pre-
dictions for the origin and evolution of the structures that are observed in the present
Universe.
Quantum fluctuations of the inflaton field originate primeval density perturbations
at all distance scales. During the inflationary period all scales that can be observed
today went out of the horizon (the number of e-foldings is set accordingly) to reenter
later (starting from the small scales and progressively moving to large scales) during
the classical expansion (the horizon grows faster than the Universe scale). They
evolve under the combined action of gravity, pressure, and dissipation, giving rise
first to the observed acoustic peaks in the CMB power spectrum and, finally, to the
observed structures in the Universe.
The spatial density fluctuations are usually decomposed into Fourier modes labeled by their wave number $k$ or by their wavelength $\lambda = 2\pi/k$:

$$\frac{\delta\rho}{\rho}\,(\vec{r}\,) = A \int_{-\infty}^{\infty} \delta_k\, e^{-i\,\vec{k}\cdot\vec{r}}\, d^3k\,.$$

The corresponding temperature fluctuations in the CMB are related to the fluctuations $\phi$ of the gravitational potential at the last scattering surface by

$$\frac{\delta T}{T} \simeq \frac{1}{3}\, \phi\,.$$
Fig. 8.30 The density, temperature, age, and redshift for the several Universe epochs. From Eric
Linder, “First principles of Cosmology”, Addison-Wesley 1997
At the same time, additional questions pop up. What is dark matter made of? And what about dark energy? Why is the "particle physics" vacuum expectation value originating from quantum fluctuations 120 orders of magnitude higher than what is needed to account for dark energy?
Finally, the standard model of cosmology gives us a coherent picture of the evolution of the Universe (Figs. 8.30 and 8.31), starting from as far back as the Planck time, where even General Relativity is no longer valid. What happened before? Was there a single beginning, or is our Universe one of many? What will happen in the future? Is our Universe condemned to a thermal death? Questions for the twenty-first century. Questions for the present students and the future researchers.
• First, the ΛCDM model (Sect. 8.3.3) computes the total content of baryonic DM (i.e., nonluminous matter made of ordinary baryons) from the fit to the CMB spectrum, and the result obtained is only some 4% of the total energy of the Universe; the structure of the Universe, computed from astrophysical simulations, is consistent with the fractions computed within the ΛCDM model.
• Second, the abundances of light elements depend on the baryon density, and the observed abundances are again consistent with the value of $\Omega_b$ coming from the fit to the CMB data.
A direct search is however motivated by the fact that the hypotheses on which cos-
mological measurements are based might be wrong (as in the case of MOND, for
example).
Baryonic DM should cluster into massive astrophysical compact objects (the so-
called MACHOs), or into molecular clouds. Molecular clouds, when hot, are easily
observable (for example in hot galaxy clusters); the result of observations is that
the amount of DM due to molecular clouds is small. The main baryonic component
should be thus concentrated in massive objects (MACHOs), including black holes.
We can estimate the amount of this component using the gravitational field generated
by it.
Several research groups have searched for MACHOs and found that only a few
percent of the total DM can be attributed to them. Therefore, MACHOs do not solve
the missing mass problem.
The microlensing technique is also one of the main sources of information in the
search for extrasolar planets (see Chap. 11).
In particular, standard neutrinos can contribute only a small fraction of the dark matter; cosmological data constrain

$$\Omega_\nu \leq 0.004\,.$$
After having excluded known matter as a possible DM candidate, we are left only with presently unknown—although sometimes theoretically hypothesized—matter.
$g_{a\gamma\gamma} = 1/M$—all quantities here are expressed in NU. The axion mass $m$ is given by the formula

$$\frac{m}{1\ {\rm eV}} \simeq \frac{1}{M / 6 \times 10^6\ {\rm GeV}}\,. \qquad (8.181)$$

The axion lifetime would then be proportional to $1/M^5$, which is larger than the age of the Universe for $m < 10$ eV. An axion below this mass would thus be stable.
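Equation 8.181 can be evaluated directly (the chosen values of $M$ are purely illustrative):

```python
def axion_mass_eV(M_GeV):
    """Axion mass in eV for a two-photon coupling scale M in GeV (Eq. 8.181)."""
    return 1.0 / (M_GeV / 6.0e6)

print(axion_mass_eV(6.0e6))    # 1 eV
print(axion_mass_eV(6.0e12))   # 1e-6 eV: a micro-eV axion for M ~ 6e12 GeV
```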
Since the axion couples to two photons, in a magnetic or electric field it could
convert to a photon; vice versa, a photon in an external magnetic or electric field
could convert into an axion; the amplitude of the process would be proportional to
gaγγ .
Axion-like particles (ALPs) are a generalization of the axion: while the axion is
characterized by a strict relationship between its mass m and gaγγ = 1/M, these
two parameters are unrelated for ALPs. Depending on the actual values of their mass
and coupling constant, ALPs can play an important role in cosmology, either as cold
dark matter particles or as quintessential dark energy.
In order to account for dark matter, that is, to reach an energy density of the order of the critical density, axion masses should be at least 0.1 meV. Light axions and ALPs could still be DM candidates, since they are produced nonthermally, and thus they can be "cold".
Axion Searches. Attempts are being made to directly detect axions mostly by:
1. using the light-shining-through-a-wall technique: a laser beam travels through a
region of high magnetic field, allowing the possible conversion of photons into
axions. These axions can then pass through a wall, and on the other side they can
be converted back into photons in a magnetic field;
2. trying to detect solar axions: the CAST (CERN Axion Solar Telescope) experiment looks for the X-rays that would result from the conversion of solar axions (produced in the Sun when X-rays scatter off electrons and protons) back into photons, using a 9-ton superconducting magnet.
Indirect searches are also possible, and are prevalently made by investigating three
effects.
4. The birefringence of vacuum in high magnetic fields due to photon–axion mixing.
It is common that different polarizations experience a different refractive index
in matter—a common example is a uniaxial crystal. The vacuum is also expected
to become birefringent in presence of an external magnetic field perpendicular to
the propagation direction, due to the orientation of the virtual e+ e− loops. The
magnitude of this birefringence could be enhanced by the presence of an axion
field, which provides further magnetic-dependent mixing of light to a virtual field
(experiment PVLAS by E. Zavattini et al. 2006).
5. Possible anomalies in the cooling times of cataclysmic stars. Stars produce vast quantities of weakly interacting particles, like neutrinos and possibly hypothetical gravitons, axions, and other unknown particles. Although this flux of particles cannot be measured directly, the properties of the stars would differ from expectations if they could lose too much energy in a new way. The results on the cooling times and the photon fluxes (since photons are coupled to axions) constrain the characteristics of the nonvisible axions.
6. ALPs can also directly affect the propagation of photons coming from astrophysical sources, by mixing with them. This possibility was proposed in 2007 by De Angelis, Roncadelli, and Mansutti (DARMa), and by Simet, Hooper, and Serpico. The conversion of photons into axions (in the random extragalactic magnetic fields, or at the source and in the Milky Way) could give rise to a sort of cosmic light-shining-through-a-wall effect. This might enhance the yield of very-high-energy photons from distant blazars, which would otherwise be suppressed by the interaction of these photons with the background photons in the Universe (see Chap. 10).
Techniques (1) to (5), with negative results, have limited the region of mass and
coupling allowed for ALPs. Failure to detect axions from the Sun with the CAST
experiment sets the robust bound M > 1.14 · 1010 GeV for m < 0.02 eV. Further-
more, the stronger bound M > 3 · 1011 GeV holds for ALPs with m < 10−10 eV,
based on astrophysical observations (technique 5; in particular the photon flux and
the cooling time of SN1987A).
Possible evidence for ALPs comes from anomalies in the propagation of very-high-energy photons from astrophysical sources (technique 6).
The present exclusion limits, and the possible observational window indicated by the propagation of VHE photons, are summarized in Fig. 8.33. The topic is very active, and many new experimental results are expected in the coming years.
If dark matter particles are massive (above some $M_Z/2$ and below some TeV) and weakly interacting (WIMPs), the "WIMP miracle" guarantees that they can saturate the energy budget for dark matter, and the weak interaction characterizing them can be the well-known electroweak interaction. Let us look in more detail at this economical solution.
WIMPs should be neutral, and should have a lifetime large enough to have survived from the early Universe until the present time. They "freeze out" once their interaction rate equals the Hubble expansion rate of the Universe—or, alternatively, when their mean free path equals the size of the Universe: $c/H_0 \sim 1/(n_\chi\, \sigma_{\rm ann})$, where $\sigma_{\rm ann}$ is the cross section for WIMP pair annihilation. Inserting the appropriate constants, their relic density can be calculated as

$$\Omega_\chi \simeq \frac{0.6\ {\rm pb}}{\sigma_{\rm ann}}\,. \qquad (8.182)$$
8.4 What Is Dark Matter Made Of, and How Can It Be Found? 493
Fig. 8.33 Axion and ALP coupling to photons versus the ALP mass. Colored regions are: generic prediction for the QCD axion, which relates its mass with its coupling to photons (yellow); experimentally excluded regions (dark green); constraints from astronomical observations (gray) or from astrophysical or cosmological arguments (blue); and sensitivity of planned experiments (light green). Shown in red are boundaries where axions and ALPs can account for all the cold dark matter, produced either thermally or nonthermally by the vacuum-realignment mechanism. Adapted from A. Ringwald, J. Phys. Conf. Ser. 485 (2014) 012013
If we assume that dark matter can be explained by just one particle χ, we have $\Omega_\chi \sim 0.22$, and

$$\sigma_{\rm ann} \sim 3\ {\rm pb}\,. \qquad (8.183)$$

In terms of the thermally averaged product of the annihilation cross section and the relative velocity, the relic density can also be written as

$$\Omega_\chi h^2 \simeq \frac{3 \times 10^{-27}\ {\rm cm^3\, s^{-1}}}{\langle \sigma_{\rm ann} |v_\chi| \rangle}\,, \qquad (8.184)$$

where $v_\chi$ is the relative velocity between the two WIMPs; this leads, since $\Omega_\chi h^2 \sim 0.1$, to

$$\langle \sigma_{\rm ann} |v_\chi| \rangle \simeq 3 \times 10^{-26}\ {\rm cm^3\, s^{-1}}\,. \qquad (8.185)$$
The results in Eqs. 8.183 and 8.185 are a natural benchmark for the behavior of DM particles. A little miracle is that, if one takes the electroweak cross section at an energy scale around 100 GeV (since we are in the unification regime, we can take for the order of magnitude the electromagnetic expression $\sigma(e^+e^- \to \mu^+\mu^-) \simeq \alpha^2/s$, see Eq. 5.57), the result,

$$\sigma(100\ {\rm GeV}) \sim \alpha^2/s \simeq \frac{100\ {\rm nb}}{s\,[{\rm GeV}^2]} \simeq 10\ {\rm pb}\,, \qquad (8.186)$$

is remarkably close to what is needed for a single particle in the 100 GeV range subject to an interaction strength typical of electroweak interactions.
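The coincidence can be checked in a few lines (the conversion 1 GeV⁻² ≈ 0.389 mb is the standard natural-units factor; differences of a factor of a few are irrelevant at this level):

```python
GEV2_TO_PB = 0.3894e9   # 1 GeV^-2 = 0.3894 mb = 0.3894e9 pb
ALPHA = 1.0 / 137.0     # fine-structure constant

# Cross section needed for Omega_chi ~ 0.22 (Eq. 8.182):
sigma_needed_pb = 0.6 / 0.22
print(sigma_needed_pb)  # ~3 pb

# Electroweak-scale estimate, sigma ~ alpha^2 / s at sqrt(s) = 100 GeV (Eq. 8.186):
s = 100.0 ** 2          # GeV^2
sigma_ew_pb = ALPHA ** 2 / s * GEV2_TO_PB
print(sigma_ew_pb)      # ~2 pb: the same order of magnitude as required
```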
Several extensions of the SM have proposed WIMP candidates, most notably supersymmetric models (SUSY) with R-parity conservation, in which the lightest supersymmetric particle, the putative neutralino χ, is stable and thus a serious candidate (Sect. 7.6.1). If its mass is between a few GeV and a few TeV, the SUSY parameter space can accommodate weak self-annihilation cross sections (the neutralino is a Majorana particle). For this reason the neutralino is usually thought to be a "natural" DM candidate. However, more general models are also allowed.
WIMPs (we shall often identify them generically by the letter χ, irrespective of their possible neutralino-like nature) could be detected directly, via elastic scattering off targets on Earth, or indirectly, by their self-annihilation products in high-density DM environments, or by their decay products.
WIMP velocities in the Earth's surroundings are expected to be about one order of magnitude smaller than the galactic escape velocity, i.e., they are nonrelativistic: thermalized WIMPs have typical speeds

$$\sqrt{\langle v_\chi^2 \rangle} \simeq \sqrt{\frac{2\, k_B T}{m_\chi}} \simeq 27\ {\rm m/s} \left( \frac{100\ {\rm GeV}}{m_\chi} \right)^{1/2}.$$

These are much smaller than the velocity $v$ of the Solar System with respect to the center of the Galaxy, which is of the order of $10^{-3}\,c$.
If the Milky Way's dark halo is composed of WIMPs, then, given the DM density in the vicinity of the Solar System and the speed of the Solar System with respect to the center of the Galaxy, the χ flux on the Earth should be about

$$\Phi_\chi \simeq v\, n_{\rm DM,local} \simeq 10^5 \left( \frac{100\ {\rm GeV}}{m_\chi} \right) {\rm cm}^{-2}\, {\rm s}^{-1}$$

(a local dark matter density of 0.4 GeV/cm³ has been used to compute the number density of DM particles). This flux is rather large, and a potentially measurable fraction might scatter off nuclei.
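The flux estimate above is a one-liner (density and speed are the values used in the text):

```python
RHO_LOCAL = 0.4   # GeV per cm^3, local dark matter density (from the text)
V = 230.0e5       # cm/s, ~230 km/s, speed of the Solar System in the Galaxy

def wimp_flux(m_chi_GeV):
    """WIMP flux on Earth in cm^-2 s^-1: number density times velocity."""
    return (RHO_LOCAL / m_chi_GeV) * V

print(wimp_flux(100.0))  # ~1e5 WIMPs per cm^2 per second
```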
Direct detection of dark matter relies on observation of the scattering or other
interaction of the WIMPs inside low-background Earth-based detectors.
Indirect detection relies on the observation of:
• The annihilation products of pairs of WIMPs—for example, in the halo of the
galaxy, or as a result of their accumulation in the core of the Sun or of the Earth.
Such annihilation can happen if the WIMP is a boson, or, if a fermion, a Majorana
particle; indeed the SUSY neutralino, a likely WIMP candidate, is a Majorana
particle.
• The decay products of the WIMP—WIMPs can be unstable, provided their lifetime
is larger than the Hubble time.
Direct Detection of WIMPs at Accelerators. WIMPs can be created at colliders, but
not observed, since they are neutral and weakly interacting. However, it is possible
to infer their existence. Their signature would be missing energy when one tries to
reconstruct the dynamics of the collision. There has been a huge effort to search for
the appearance of these new particles.
The searches are complementary to the direct searches that will be described later; however, to compare with noncollider dark matter searches, the limits need to be translated, via an effective field theory, into upper limits on WIMP–nucleon scattering or on WIMP annihilation cross sections. They are thus model dependent. However, for many channels foreseen in SUSY they are complementary to the limits from direct searches (Fig. 8.36); in particular, they can exclude the region below 10 GeV and a cross section per nucleon of the order of $10^{-44}$ cm², where direct searches are not very sensitive.
Direct Detection of WIMPs in Underground Detectors. If the dark matter conjec-
ture is correct, we live in a sea of WIMPs. For a WIMP mass of 50 GeV, there might
be in our surroundings some 105 particles per cubic meter, moving at a speed smaller
than the revolution velocity of the Earth around the Sun.9 Experimental detection is
based on the nuclear recoil that would be caused by WIMP elastic scattering.
The kinematics of the scattering is such that the transferred energy is in the keV range. The recoil energy $E_K$ of a nucleus of mass $M$, initially at rest, after a nonrelativistic collision with a particle of mass $m_\chi$ traveling at a speed $\sim 10^{-3}\,c$ is approximately

$$E_K \simeq 50\ {\rm keV} \left( \frac{M}{100\ {\rm GeV}} \right) \left( \frac{2}{1 + M/m_\chi} \right)^2. \qquad (8.187)$$
The expected number of collisions is some 10−3 per day in a kilogram of material.
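Equation 8.187 can be sketched as follows (the xenon-like mass is an illustrative choice of target, not from the text):

```python
def recoil_energy_keV(M_GeV, m_chi_GeV):
    """Typical nuclear recoil energy of Eq. 8.187, in keV."""
    return 50.0 * (M_GeV / 100.0) * (2.0 / (1.0 + M_GeV / m_chi_GeV)) ** 2

print(recoil_energy_keV(100.0, 100.0))  # 50 keV when target and WIMP masses match
print(recoil_energy_keV(131.0, 100.0))  # ~49 keV for a xenon-like target nucleus
```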
Detectors sensitive to the direct detection of a WIMP interaction should have a low energy threshold, low background noise, and a large mass. The energy of a nucleus after a scattering from a WIMP is converted into a signal corresponding to (1) ionization, (2) scintillation light, or (3) vibration quanta (phonons). The main experimental problem is to distinguish the genuine nuclear recoil induced by a WIMP from the huge background due to environmental radioactivity. It would be useful to perform experiments which can measure the nuclear recoil energy and, if possible, its direction. The intrinsic rejection power of these detectors can be enhanced by the simultaneous detection of different observables (for example, heat and ionization, or heat and scintillation).
9 From astrophysical observations, the local WIMP density is about 0.4 GeV/cm³; the velocity
distribution is Maxwellian, truncated by the galactic escape velocity of 650 km/s. For a mass of
50 GeV, the RMS velocity is comparable to the speed of the solar system in the Galaxy, ∼230 km/s.
496 8 The Standard Model of Cosmology and the Dark Universe
Fig. 8.34 Left: the directions of the Sun's and the Earth's motions during a year. Assuming the
WIMPs to be on average at rest in the Galaxy, the average speed of the WIMPs relative to the Earth
is modulated with a period of 1 year. Right: annual modulation of the total counting rate (background
plus possible dark matter signal) in 7 years of data with the DAMA detector. A constant counting
rate has been subtracted. From R. Bernabei et al., Riv. Nuovo Cim. 26 (2003) 1.
The WIMP rate may be expected to exhibit some angular and time dependence.
For example, there might be a daily modulation because of the shadowing effects of
the Earth when turned away from the galactic center. An annual modulation in the
event rate would also be expected as the Earth’s orbital velocity around the Sun (about
30 km/s) adds to or subtracts from the velocity of the Solar System with respect to
the galactic center (about 230 km/s), so that the number of WIMPs intercepted per
unit time varies (Fig. 8.34, left).
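The expected modulation can be sketched numerically. The toy model below is our own simplification: a cosine with a one-year period peaking near June 2, and an illustrative projection factor of ~0.5 to account for the inclination of the Earth's orbit relative to the Sun's motion (the exact value is an assumption of this sketch).

```python
import math

V_SUN = 230.0   # km/s, speed of the solar system in the Galaxy (from the text)
V_ORB = 30.0    # km/s, Earth's orbital speed (from the text)
PROJ = 0.5      # illustrative projection of the orbital velocity on the Sun's motion

def relative_speed(day_of_year, t_max=153):
    """Mean WIMP speed (km/s) relative to the detector; maximum near June 2 (day ~153)."""
    return V_SUN + PROJ * V_ORB * math.cos(2 * math.pi * (day_of_year - t_max) / 365.25)

june, december = relative_speed(153), relative_speed(336)
fractional_amplitude = (june - december) / (june + december)
print(round(fractional_amplitude * 100, 1))  # a few percent, as stated in the text
```

In a real experiment the signal rate is not simply proportional to the mean speed, but this captures why the expected amplitude variation is only at the few-percent level.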
The detectors have then to be well isolated from the environment, possibly shielded
with active and passive materials, and constructed with very low activity materials. In
particular, it is essential to operate in an appropriate underground laboratory to limit
the background from cosmic rays and from natural radioactivity. There are many
underground laboratories in the world, mostly located in mines or in underground
halls close to tunnels, and the choice of the appropriate laboratory for running a
low-noise experiment is of primary importance. The main characteristics to consider
are:
• The thickness of the rock (to isolate from muons and from the secondary products
of their interaction).
• The geology (radioactive materials produce neutrons that should be shielded) and
the presence of radon.
• The volume available (none of the present installations could host a megaton
detector).
• The logistics.
Some of the largest underground detectors in the world are shown in Fig. 8.35.
As an example, the INFN Gran Sasso National Laboratory (LNGS), which is
the largest underground European laboratory, hosts some 900 researchers from 30
different countries. LNGS is located near the town of L’Aquila, about 120 kilometers
from Rome. The underground facilities are located on one side of the highway tunnel
crossing the Gran Sasso mountain; there are three large experimental halls, each about
100 m long, 20 m wide, and 18 m high. An average rock coverage of 1400 m gives a
reduction factor of one million in the cosmic ray flux; the neutron flux is a thousand
Fig. 8.35 Underground laboratories for research in particle physics (1–10) listed with their depth
in meters water equivalent. Laboratories for research in the million-year scale isolation of nuclear
waste are also shown (11–20). The NELSAM laboratory (21) is for earthquake research. From
www.deepscience.org
times smaller than at the surface. One of the halls points toward CERN, allowing
long-baseline accelerator neutrino experiments.
Essentially three types of detectors searching directly for dark matter operate in
underground facilities around the world.
• Semiconductor detectors. The recoil nucleus or an energetic charged particle or
radiation ionizes the traversed material and produces a small electric signal proportional
to the deposited energy. Germanium crystals, which have a very small
gap energy (3 eV) and thus a good resolution of 1 ‰ at 1 MeV, have been
commonly used as very good detectors for some years. The leading detectors are
the CDMS, CoGeNT, CRESST, and EDELWEISS experiments. The bolometric
technique (bolometers are ionization-sensitive detectors kept cold in a Wheatstone
bridge; they measure both the change in electric resistance consequent to the
heating, i.e., the deposited energy, and the ionization) increases the power of background
rejection and allows a direct estimate of the mass of the scattering particle.
• Scintillating crystals. Although their resolution is worse than that of germanium
detectors, no cooling is required. The scintillation technique is simple and well known,
and large volumes can be attained because the cost per unit mass is low. However,
these detectors do not perform well enough to allow an event-by-event analysis.
For this reason, some experiments are looking for a time-dependent modulation
of a WIMP signal in their data. As the Earth moves around the Sun, the WIMP
flux should be maximum in June (when the revolution velocity of the Earth adds
to the velocity of the solar system in the Galaxy) and minimum in December, with
an expected amplitude variation of a few percent.
DAMA (called DAMA/LIBRA after its upgrade) was the first experiment to use
this detection strategy. The apparatus is made of highly radio-pure NaI(Tl) crystals,
each with a mass of about 10 kg, with two PMTs at the two opposing faces.
• Noble liquid detectors. Certainly the best technique, in particular in a low-
background environment, is to use noble elements as detectors (this implies low
background from the source itself) such as argon (A = 40) and xenon (A = 131).
Liquid xenon (LXe) and liquid argon (LAr) are good scintillators and ionizers
in response to the passage of radiation. Using pulse-shape discrimination of the
signal, events induced by a WIMP can be distinguished from background electron
recoils.
At present the main technique is the "double-phase" technique. A
vessel is partially filled with noble liquid, with the rest of the vessel containing the
same element in a gaseous state. Electric fields of about 1 kV/cm and 10 kV/cm
are established across the liquid and gas volumes, respectively. An interaction in
the liquid produces excitation and ionization processes. Photomultiplier tubes are
present in the gas volume and in the liquid. The double phase allows the reconstruction
of the topology of the interaction (the gas allowing a TPC reconstruction), thus
helping background removal.
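The discrimination idea behind these detectors can be caricatured in a few lines. The numbers below are entirely hypothetical (real detectors calibrate the nuclear- and electron-recoil bands with dedicated sources); only the logic, a cut on the charge-to-light ratio, reflects the discrimination described in the text.

```python
def classify(s1, s2, s2_over_s1_cut=50.0):
    """Toy event tag: nuclear recoils give relatively less ionization (S2) per unit
    of primary scintillation (S1) than electron recoils. The cut value is invented."""
    return "nuclear recoil" if s2 / s1 < s2_over_s1_cut else "electron recoil"

print(classify(s1=10.0, s2=200.0))    # WIMP-like: low S2/S1
print(classify(s1=10.0, s2=2000.0))   # background-like: high S2/S1
```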
Whatever the detector, the energy threshold is a limiting factor on the sensitivity
at low WIMP masses; for high values of m_χ the flux decreases as 1/m_χ, and
the sensitivity for fixed mass densities also drops. The best sensitivity is attained for
WIMP masses close to the mass of the recoiling nucleus (Fig. 8.36).
The experimental situation is presently unclear. Possible WIMP detection signals
were claimed by the DAMA experiment, based on a large scintillator (NaI(Tl))
volume, and the CRESST and CoGeNT data show some tension with respect to
experiments finding no signal. The data analyzed by DAMA correspond to 6 years of
exposure with a detector mass of 250 kg, to be added to 6 years of exposure taken
earlier with a detector mass of 100 kg. Based on the observation of a signal at 8.9
σ (Fig. 8.34, right), modulated with the expected period of 1 year and the correct
phase (with a maximum near June 2), the DAMA collaborators propose two possible
scenarios: a WIMP with m_χ ≃ 50 GeV and σ_A ≃ 7 × 10⁻⁶ pb, and a WIMP with
m_χ ≃ 8 GeV and σ_A ≃ 10⁻³ pb. The DAMA signal is controversial, as it has
not presently been reproduced by other experiments with comparable sensitivity but
with different types of detectors.
The current best limits come from:
• The XENON100 detector, a 165 kg liquid xenon detector located at LNGS, with
62 kg in the target region and the remaining xenon in an active veto, together with
high-purity germanium detectors.
• The LUX detector, a 370 kg xenon detector installed in the Homestake laboratory
(now called SURF) in the US. The LUX collaboration has recently announced
results from an 85-day run with an average operational mass of 118 kg.
A new liquid xenon-based project, XENON1T, is planned at the LNGS, with 3.5 tons
of liquid xenon.
Indirect Detection of WIMPs. WIMPs are likely to annihilate in pairs; it is also
possible that they are unstable, with lifetimes comparable to the Hubble time or
larger. In these cases one can detect the secondary products of WIMP decays. Let us
concentrate now on the case of annihilation in pairs; most of the considerations
apply to decays as well.
If the WIMP mass is below the W mass, the annihilation of a pair of WIMPs
should proceed mostly through f f̄ pairs. The state coming from the annihilation
should be mostly a spin-0 state (for small mutual velocity the s-wave
state is favored in the annihilation; a more general demonstration can be derived
using the Clebsch–Gordan coefficients). Helicity suppression entails that the decay
into the heaviest accessible fermion pair is preferred, similar to what was seen in Chap. 6
when studying the π± decay (Sect. 6.3.4): the decay probability into a fermion–
antifermion pair is proportional to the square of the mass of the fermion. In the mass
region between 10 GeV and 80 GeV, the decay into bb̄ pairs is thus preferred (this
consideration does not hold if the decay is radiative, in which case a generic f f̄
pair will be produced). The f f̄ pair will then hadronize and produce a number of
secondary particles.
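The m_f² scaling can be made concrete with a small sketch. This is our own illustration, not the book's: weights ∝ N_c m_f² with approximate masses, while couplings, QCD corrections, and phase-space factors are deliberately ignored.

```python
# Fermion channels kinematically open for a WIMP in the ~10-80 GeV mass window:
# (approximate mass in GeV, color factor N_c); values are standard approximate masses.
CHANNELS = {"b": (4.18, 3), "c": (1.27, 3), "tau": (1.777, 1)}

def branching_fractions(channels):
    """Relative annihilation weights proportional to N_c * m_f^2 (helicity suppression)."""
    weights = {f: nc * m ** 2 for f, (m, nc) in channels.items()}
    total = sum(weights.values())
    return {f: w / total for f, w in weights.items()}

fractions = branching_fractions(CHANNELS)
print(max(fractions, key=fractions.get))  # 'b' -- the b-bbar channel dominates
```

Even in this crude counting the bb̄ channel takes most of the weight, which is why the text singles it out in the 10–80 GeV mass region.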
In the case of the annihilation in the cores of stars, the only secondary products
which could be detected would be neutrinos. However, no evidence for a significant
extra flux of high-energy neutrinos from the direction of the Sun or from the Earth’s
core has ever been found.
One could also have annihilations in the halos of galaxies, in accretion regions close
to black holes, or in generic cusps of dark matter density. In this case one could have
generation of secondary particles, including gamma rays or antimatter, which would
appear in excess of the standard rate.
We briefly present here the possible scenarios for detection, which will be discussed
in greater detail in Chap. 10, in the context of multimessenger astrophysics.
Gamma Rays. The self-annihilation of a heavy WIMP χ can generate photons
(Fig. 8.37) in three main ways.
(a) Directly, via annihilation into a photon pair (χχ → γγ) or into a photon–Z pair
(χχ → γZ), with E_γ = m_χ or E_γ = m_χ − m_Z²/(4m_χ), respectively; these
processes give a clear signature at high energies, as the energy is monochromatic,
but the process is suppressed at one loop, so the flux is expected to be very faint.
(b) Via annihilation into a quark pair which produces jets emitting in turn a large
number of γ photons (qq̄ → jets → many photons); this process produces a
continuum of gamma rays with energies below the WIMP mass. The flux can
be large, but the signature might be difficult to detect, since it might be masked
by astrophysical sources of photons.
(c) Via internal bremsstrahlung; also in this case one has an excess of low-energy
gamma rays with respect to a background which is not so well known. Besides
the internal bremsstrahlung photons, one will still have the photons coming from
the processes described in the two previous items.
The γ-ray flux from the annihilation of dark matter particles of mass m_χ can
be expressed as the product of a particle physics component times an astrophysics
component:
$$\frac{dN}{dE} = \underbrace{\frac{1}{4\pi}\,\frac{\sigma_{\mathrm{ann}} v}{2 m_\chi^{2}}\,\frac{dN_\gamma}{dE}}_{\text{Particle Physics}} \times \underbrace{\int_{\mathrm{l.o.s.}} dl(\Omega)\,\rho_\chi^{2}}_{\text{Astrophysics}}\,. \qquad (8.188)$$
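The "astrophysics" factor of Eq. (8.188) is often called the J-factor. A minimal numerical sketch of this line-of-sight integral for an NFW-like halo profile is given below; the profile parameters, the integration settings, and the function names are illustrative assumptions of ours, not fitted values from the book.

```python
import math

RHO_S = 0.3       # GeV/cm^3, illustrative NFW normalization (assumption)
R_S = 20.0        # kpc, illustrative scale radius (assumption)
D_GC = 8.5        # kpc, Sun-Galactic Center distance
KPC_TO_CM = 3.086e21

def nfw_rho(r_kpc):
    """NFW density profile rho(r) = rho_s / [(r/r_s)(1 + r/r_s)^2], in GeV/cm^3."""
    x = r_kpc / R_S
    return RHO_S / (x * (1.0 + x) ** 2)

def j_factor(psi_rad, l_max_kpc=100.0, steps=20000):
    """Line-of-sight integral of rho^2 (GeV^2 cm^-5) at angle psi from the Galactic Center."""
    total, dl = 0.0, l_max_kpc / steps
    for i in range(steps):
        l = (i + 0.5) * dl   # midpoint rule along the line of sight
        r = math.sqrt(D_GC ** 2 + l ** 2 - 2.0 * D_GC * l * math.cos(psi_rad))
        total += nfw_rho(max(r, 1e-3)) ** 2 * dl   # clip r to tame the central cusp
    return total * KPC_TO_CM

# Directions closer to the Galactic Center give a larger J-factor:
print(j_factor(math.radians(1.0)) > j_factor(math.radians(30.0)))   # True
```

The ρ² dependence is why indirect searches concentrate on dense regions (the Galactic Center, dwarf spheroidals), as the surrounding text explains.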
Further Reading
Exercises
1. Cosmological principle and Hubble law. Show that the Hubble law does not
contradict the cosmological principle (all points in space and time are equivalent).
2. Olbers' Paradox. Why is the night sky dark? Does the existence of interstellar dust
(an explanation studied by Olbers himself) solve the paradox?
3. Steady state Universe. In a steady state Universe with Hubble law, matter has to
be permanently created. Compute in that scenario the creation rate of matter.
4. Blackbody form of the Cosmic Microwave Background. In 1965 Penzias and
Wilson discovered that the Universe is nowadays filled with a cosmic microwave
background which follows an almost perfect Planck blackbody formula. Show
that the blackbody form of the energy density of the background photons was
preserved during the expansion and the cooling that occurred in the Universe
after photon decoupling.
5. Nucleosynthesis and neutron lifetime. The value of the neutron lifetime, which is
anomalously long for weak decay processes (why?), is a determining factor in the evolution
of the Universe. Discuss what the primordial fraction of He would have been if
the neutron lifetime had been one-tenth of its real value.
6. GPS time corrections. Identical clocks situated in a GPS satellite and at the Earth's
surface have different periods due to general relativity effects. Compute the time
difference accumulated in one day between a clock in a satellite in a circular orbit
around Earth with a period of 12 h and a clock situated on the Equator at the Earth's
surface. Assume that the Earth has spherical symmetry and use the Schwarzschild
metric.
7. Asymptotically Matter-dominated Universe. Consider a Universe composed only
of matter and radiation. Show that, whatever the initial proportion of the matter
and radiation energy densities, this Universe will be asymptotically matter dominated.
8. Flatness of the Early Universe. The present experimental data indicate a value for
the total energy density of the Universe compatible with one within a few per mil.
Compute the maximum possible value of |Ω − 1| at the scale of the electroweak
symmetry breaking consistent with the measurements at the present time.
9. WIMP "miracle". Show that a possible Weakly Interacting Massive Particle
(WIMP) with a mass of the order of m_χ ∼ 100 GeV would have the relic density
needed to be the cosmic dark matter (this is the so-called WIMP "miracle").
Chapter 9
The Properties of Neutrinos
Neutrinos have been important for the development of particle physics since they
were first conjectured in the twentieth century, and are still at present at the center of
many theoretical and experimental efforts. Their detection is difficult, since they are
subject only to weak interactions (besides the even weaker gravitational interaction).
The existence of neutrinos was predicted by Wolfgang Pauli in 1930 in order
to ensure energy–momentum conservation in β decay, as recalled in
Sect. 2.3. In 1933 Enrico Fermi established the basis of the theory of weak
interactions in analogy with QED; later it was discovered that parity is not
conserved in weak interactions: neutrinos should be (with probability close to one)
left-handed, antineutrinos right-handed (see Chap. 6). The theory needed
a serious update, which was provided by the electroweak unification (Chap. 7).
Neutrinos were experimentally discovered only in the second-half of the twentieth
century: first the electron antineutrino in 1956 by Reines1 and Cowan (Sect. 2.8); then
1 Frederick Reines (1918–1998) was an American physicist, professor at the University of California
at Irvine and formerly employed in the Manhattan project. He won the Nobel Prize in Physics 1995
“for pioneering experimental contributions to lepton physics;” his compatriot and coworker Clyde
Cowan Jr. (1919–1974) had already passed away at the time of the recognition.
in 1962 the muon neutrino by Lederman, Schwartz, and Steinberger2 ; and finally the
tau neutrino in 2000 by the DONUT experiment at Fermilab (Sect. 5.6.2). Meanwhile,
it was established in 1991 by the LEP experiments at CERN that there are
indeed only three kinds of light neutrinos (see Sect. 7.5.1).
Neutrinos cannot be directly observed: they are only detected through their inter-
actions. Different neutrino flavors are defined by the flavors of the charged lepton
they produce in weak interactions. The electron neutrino νe , for example, is the neu-
trino produced together with a positron, and its interaction will produce an electron.
Similarly, for the muon and the tau neutrinos.
For many years it was thought that neutrinos were massless; for the standard
model of particle physics, three generations of massless left-handed neutrinos were
enough (a nonzero mass was not forbidden, but it implied new mass terms in the
Lagrangian discussed in Chap. 7). There was anyway a cloud: the so-called solar
neutrino problem. In short, the number of solar electron neutrinos arriving at the Earth
was measured to be much smaller than predicted from the solar power (roughly
between one-third and 60 % of the expectation, depending on the experiment's
threshold). This problem was solved when it was demonstrated that neutrinos
can change flavor dynamically: neutrino species "mix", and quantum mechanics
implies that they then cannot all be massless.
Neutrinos are generated in several processes, and their energy spans a wide range
(Fig. 9.1). Correspondingly, there are different kinds of detectors to comply with the
different fluxes and cross sections expected.
Let us start by analyzing some of the neutrino sources. In particular, solar neutrinos,
atmospheric neutrinos, reactor neutrinos, and accelerator neutrinos have been
complementary in determining the neutrino oscillation parameters, and thus in
constraining the masses and the mixing matrix. Other sources of neutrinos, more relevant
for neutrino astrophysics, will be discussed in Chap. 10.
In the so-called “Standard Solar Model” (SSM), the Sun produces energy via ther-
monuclear reactions in its center, in a kernel much smaller than the Sun’s radius. Most
2 The Nobel Prize in Physics 1988 was awarded jointly to Leon Lederman (New York 1922), Melvin
Schwartz (New York 1931—Ketchum, Idaho, 2006) and Jack Steinberger (Bad Kissingen 1921)
“for the neutrino beam method and the demonstration of the doublet structure of the leptons through
the discovery of the muon neutrino”.
9.1 Sources and Detectors; Evidence of the Transmutation of the Neutrino Flavor 507
Fig. 9.1 Neutrino interaction cross section as a function of energy, showing typical energy regimes
accessible by different neutrino sources and experiments. The curve shows the scattering cross
section for an electron antineutrino on an electron. From A. de Gouvêa et al., arXiv:1310.4340v1
of the energy is released via MeV photons, which give rise to the electromagnetic
solar radiation through propagation and interaction processes that take a long time
(∼2 million years). The light emitted comes mostly from the thermal emission of
the outer region, the photosphere, which has a temperature of about 6000 K and
is heated by the moderation of these photons.
The fusion reactions in the Sun release 26.7 MeV per reaction and also produce
a large flux of electron neutrinos that can be detected at Earth (the flux
at Earth predicted by John Bahcall and collaborators in the Standard Solar Model
(SSM) is ∼6 × 10¹⁰ cm⁻² s⁻¹). This flux is produced mainly by the nuclear reactions
initiated by proton–proton (pp) fusion, as sketched in Fig. 9.2. The contribution of
the alternative CNO chain3 is small.
The dominant pp reaction (>90 % of the total flux) produces νe with a low
energy endpoint (<0.42 MeV), as shown in Fig. 9.3. The ⁷Be line at 0.86 MeV is
the second most relevant νe source (7–8 %), while the “pep” reaction, producing νe
with an energy of 1.44 MeV, contributes just 0.2 %.
The ⁸B neutrinos are produced in the “pp III” chain with energies <15 MeV
and, although their flux could appear marginal (∼0.1 %), they have a major role in the
3 …hydrogen to helium. In the CNO cycle, four protons fuse, giving rise to one alpha particle, two
positrons, and two electron neutrinos; the cycle uses C, N, and O as catalysts. While the threshold
of the pp chain is around temperatures of 4 MK, the threshold of a self-sustained CNO chain is at
approximately 15 MK. The CNO chain becomes dominant at 17 MK.
508 9 The Properties of Neutrinos
Fig. 9.2 Main nuclear fusion reactions that contribute to the solar neutrino flux. By Dorottya Szam
[CC BY 2.5 http://creativecommons.org/licenses/by/2.5], via Wikimedia Commons
solar neutrino detection experiments. In fact, they were the dominant contribution in
the historical Chlorine experiment and can be detected by Cherenkov experiments
like Super-Kamiokande and SNO (Fig. 9.3).
The first solar neutrino experiment was done in the late 1960s by Ray Davis in
the Homestake mine in South Dakota, USA, counting the number of ³⁷Ar atoms
produced in 615 tons of C₂Cl₄ by the reaction involving chlorine:

$$\nu_e\; {}^{37}_{17}\mathrm{Cl} \to {}^{37}_{18}\mathrm{Ar}\; e^{-} \qquad (9.1)$$
(Nobel prize for Davis, as we discussed in Chap. 4). The observed rate was just around
one-third of the expected number of interactions. This unexpected result originated
the so-called "solar neutrino problem," which for three decades led to systematic and
careful work by a large community of physicists, chemists, and engineers which, in
the end, confirmed both the predictions of the SSM and the experimental results of
Ray Davis. The explanation had to lie in some fundamental property of neutrinos.
Indeed, subsequent solar neutrino experiments based on different detection techniques
also found a significant deficit in the observed νe fluxes; in particular, the GALLEX
(at Gran Sasso in Italy) and SAGE (at Baksan in Russia) experiments also used a
radiochemical technique, with a lower threshold, choosing gallium as the detection
medium, thus enabling the detection of pp neutrinos.
The Kamiokande and the Super-Kamiokande (described in Chap. 4; also called
Super-K, or SK) experiments at Kamioka in Japan used water as the target material
Fig. 9.3 The solar neutrino energy spectrum predicted by the SSM. For continuum sources, the
fluxes are expressed in units of cm−2 s−1 MeV−1 at the Earth’s surface. For line sources, the units
are number of neutrinos cm−2 s−1 . The total theoretical errors are quoted for each source. Source
http://arxiv.org/abs/0811.2424
(50,000 tons in the case of Super-K), which allowed the detection, by Cherenkov radiation,
of the electrons produced in the interactions of MeV neutrinos with atomic electrons.
The energy and the direction of the scattered electron could be measured by determining,
respectively, the number of photons and the orientation of the Cherenkov ring. In this
way, since the electron basically keeps the direction of the incoming neutrino, it could
be proved that the neutrinos were indeed coming from the Sun, as shown by the
beautiful neutrino "photograph" of the Sun obtained in this way (Fig. 9.4).
In this experiment, too, the total observed flux, when interpreted as due only to νe
interactions, is significantly lower than expected from the SSM.
Was the SSM wrong, or were some electron neutrinos disappearing on their way
to the Earth?
The final answer was given by the Sudbury Neutrino Observatory (SNO) in
Canada. SNO used 1000 tons of heavy water (D2 O) as the target material. Both
charged- and neutral-current neutrino interactions with deuterium nuclei are then
possible:
• νe d → e− p p (charged current, CC);
• νx d → νx n p (neutral current, NC).
While in the first reaction only νe may interact (the neutrino energy is below
the kinematic threshold for muon production), neutrinos of all flavors contribute
to the second one. The resulting e⁻ is detected by measuring the corresponding
water Cherenkov ring. The neutron in the final state may be captured, either with low
efficiency by the deuterium nuclei or with higher efficiency by ³⁵Cl nuclei from 2 tons
of salt (NaCl) that were added in the second phase of the experiment. In either case,
γ photons are produced in these radiative captures, and these may produce, via Compton
scattering, relativistic electrons which again originate Cherenkov radiation. In the
third and final phase, an array of ³He-filled proportional counters was deployed to
provide an independent counting of the NC reaction. In addition to the two processes
described above, the elastic scattering
νx e− → νx e−
is also possible for all neutrino types, although with different cross sections, the
process on electron neutrinos being favored with respect to the other neutrino types.
While νe, νμ, and ντ can all contribute to the NC, only νe contributes to the CC. Thus
one has in SNO a clear way to separate the measurement of the νe flux from the
measurement of the flux of all the active neutrino species (in a three-flavor model, νe +
νμ + ντ). SNO could determine that
$$\frac{\Phi(\nu_e)}{\Phi(\nu_x)} = 0.340 \pm 0.038\ \text{(stat. + syst.)}$$
and thus indicate that electron neutrinos might transform themselves into different
neutrino flavors during their travel from the Sun to the Earth. The result is compatible
with a value of 1/3.
Fig. 9.5 The flux of muon plus tau neutrinos versus the flux of electron neutrinos as derived from
the SNO data. The vertical band comes from the SNO charged-current analysis; the diagonal band
from the SNO neutral-current analysis; the ellipse shows the 68 % confidence region from the best
fit to the data. The predicted Standard Solar Model total neutrino flux is the solid line lying between
the dotted lines
The results obtained by SNO are summarized in Fig. 9.5. The total measured
neutrino flux is clearly compatible with the total flux expected from the SSM and the
fraction of detected νe is consistent with being only one-third of the total number of
the neutrinos.
The solar neutrino problem could thus be solved without relying on the SSM: the
solution is that solar neutrinos change their flavor on their way to the Earth;
the oscillation appears to be complete, in the sense that electron neutrinos are only
one-third of the total.
Let us now examine the characteristics of neutrino oscillations in the simplified
hypothesis that there are only two flavors and two mass eigenstates.
In a world with two flavor eigenstates (let us suppose for the moment they are νe, νμ)
and two mass eigenstates (ν1, ν2), the flavor eigenstates can be written as a function of
a single real mixing angle θ as:

$$\nu_e = \cos\theta\,\nu_1 + \sin\theta\,\nu_2\,, \qquad \nu_\mu = -\sin\theta\,\nu_1 + \cos\theta\,\nu_2\,.$$

A state produced at (t = 0, x = 0) as a pure νe then evolves as

$$\psi = \cos\theta\, e^{-i(E_1 t-\vec p_1\cdot\vec x)}\,\nu_1 + \sin\theta\, e^{-i(E_2 t-\vec p_2\cdot\vec x)}\,\nu_2$$

(note that, in order for the mixing to have an effect, the two masses must be different,
i.e., at least one should be different from zero) or, expressing this quantum state again
in terms of the weak eigenstates:

$$\psi = \left(\cos^2\theta\, e^{-i(E_1 t-\vec p_1\cdot\vec x)} + \sin^2\theta\, e^{-i(E_2 t-\vec p_2\cdot\vec x)}\right)\nu_e - \cos\theta\,\sin\theta\left(e^{-i(E_1 t-\vec p_1\cdot\vec x)} - e^{-i(E_2 t-\vec p_2\cdot\vec x)}\right)\nu_\mu\,.$$

Note that at (t = 0, x = 0), ψ = νe but, at later times, there will in general be a mixture
of the two flavor states νe, νμ.
It can be seen from the equations above that the probability to find a state νμ at a
distance L from the production point is given by:

$$P(\nu_e \to \nu_\mu) = \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 L}{4E_\nu}\right) \qquad (9.6)$$

where

$$\Delta m^2 = m_1^2 - m_2^2\,. \qquad (9.7)$$
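The phase in Eq. (9.6) follows from expanding the energies of the ultrarelativistic mass eigenstates, a standard one-line step (with equal momenta p ≃ E_ν and t ≃ L in natural units):

$$E_i = \sqrt{p^2 + m_i^2} \simeq p + \frac{m_i^2}{2p}\,, \qquad (E_1 - E_2)\,t \simeq \frac{m_1^2 - m_2^2}{2E_\nu}\,L = \frac{\Delta m^2 L}{2E_\nu}\,;$$

the interference of the two phases then yields the $\sin^2\!\left(\Delta m^2 L/4E_\nu\right)$ dependence of Eq. (9.6).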
The sin²(2θ) factor plays the role of the amplitude of the oscillation, while the
phase is given by Δm²L/(4E_ν). A phase too small or too large makes the measurement
of the oscillation parameters quite difficult. Typically, an experiment is sensitive to:

$$\Delta m^2 \sim \frac{E_\nu}{L}\,. \qquad (9.8)$$
One can define the oscillation length in vacuum

$$L_\nu = \frac{2\pi E_\nu}{\Delta m^2} \qquad (9.9)$$

and then

$$P(\nu_e \to \nu_\mu) = \sin^2(2\theta)\,\sin^2\!\left(\frac{\pi}{2}\,\frac{L}{L_\nu}\right). \qquad (9.10)$$
In practical units,

$$P(\nu_e \to \nu_\mu) = \sin^2(2\theta)\,\sin^2\!\left(1.27\,\frac{\Delta m^2\,[\mathrm{eV}^2]\;L\,[\mathrm{km}]}{E_\nu\,[\mathrm{GeV}]}\right) \qquad (9.11)$$

and then the probability to find a state νe at the same distance L is, by construction,

$$P(\nu_e \to \nu_e) = 1 - P(\nu_e \to \nu_\mu) \qquad (9.12)$$

$$= 1 - \sin^2(2\theta)\,\sin^2\!\left(1.27\,\frac{\Delta m^2\,[\mathrm{eV}^2]\;L\,[\mathrm{km}]}{E_\nu\,[\mathrm{GeV}]}\right). \qquad (9.13)$$
The oscillation probabilities, in this two-flavor world, are just a function of two
parameters: the mixing angle θ and the difference of the squares of the two masses,
Δm² = m₁² − m₂².
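In code, the two-flavor formulas of Eq. (9.11) and Eq. (9.13) are one-liners. The sketch below is ours; the function names and the sample parameter values (a reactor-like baseline and energy) are chosen only for illustration.

```python
import math

def p_appear(dm2_ev2, sin2_2theta, l_km, e_gev):
    """Two-flavor appearance probability, Eq. (9.11)."""
    return sin2_2theta * math.sin(1.27 * dm2_ev2 * l_km / e_gev) ** 2

def p_survive(dm2_ev2, sin2_2theta, l_km, e_gev):
    """Survival probability, Eq. (9.13): by construction 1 - P(appearance)."""
    return 1.0 - p_appear(dm2_ev2, sin2_2theta, l_km, e_gev)

# Illustrative (assumed) parameters: dm2 = 8e-5 eV^2, sin^2(2 theta) = 0.85,
# baseline L = 180 km, E = 4 MeV = 0.004 GeV.
print(p_survive(8e-5, 0.85, 180.0, 0.004))
```

Scanning L/E with such a function reproduces the oscillatory survival pattern of the kind shown later for KamLAND (Fig. 9.6).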
Experiments that measure the possible depletion of the initial neutrino beam
are called disappearance experiments. Experiments that search for neutrinos with
a flavor different from the flavor of the initial neutrino beam are called appearance
experiments. An appearance experiment is basically sensitive to a given oscillation
channel νi → νj with i ≠ j, while a disappearance experiment is sensitive to transitions
to all possible different neutrino species, or to pure disappearance (absorption,
for example).
The determination of the parameters of neutrino oscillations has been one of the
priorities of research in recent years. If neutrinos oscillate, their masses,
although small, cannot all be zero. The direct measurement of such masses and of the
mixing strengths has gained renewed interest.
The theoretical origin of neutrino masses is not yet established: either it is the result
of the Higgs mechanism, as is the case for all the other fermions (Dirac neutrino), or,
as suggested by Majorana, the neutrino is its own antineutrino (Majorana neutrino).
If the latter is the case, double beta decays (nuclear decays in which two neutrons
become protons) could be neutrinoless (the simplest way of viewing this fact is to
think that the two neutrinos annihilate each other, or that the second neutron absorbs
the neutrino emitted by the first one during its transition, and then undergoes the
process νn → p).
In addition, neutrinos travel a long way within the Sun, and it is also plausible
that most of the neutrino oscillation happens in matter.
The neutrino oscillations may be enhanced (or suppressed) whenever neutrinos
travel through matter. In fact, while all neutrino flavors interact equally with matter
through neutral currents, charged current interactions with matter are flavor depen-
dent (at solar neutrino energies, basically only electron neutrinos can interact). This
is called the MSW effect, after the works of Lincoln Wolfenstein, Stanislav
Mikheyev, and Alexei Smirnov. Thus, the time evolution in matter of the electron
neutrino and of the other neutrinos can be different. In a two-flavor approximation,
this effect modifies the oscillation probability νe → νx to:
$$P(\nu_e \to \nu_x) = \sin^2(2\theta_m)\,\sin^2\!\left(\frac{\pi}{2}\,\frac{L}{L_\nu}\,F\right) \qquad (9.14)$$

where

$$\sin(2\theta_m) = \sin(2\theta)/F\,, \qquad (9.15)$$

$$F = \sqrt{\left(\cos(2\theta) - \frac{L_\nu}{L_e}\right)^{2} + \sin^2(2\theta)} \qquad (9.16)$$

and

$$L_e = \pm\,\frac{2\pi}{\sqrt{2}\,G_F N_e}\,. \qquad (9.17)$$
L e , the electron neutrino interaction length, is positive for neutrinos, negative for
antineutrinos. G F is the Fermi constant and Ne the electron density in matter.
L ν , the neutrino oscillation length in vacuum, is, as defined before, a function of
the neutrino energy and of the difference of the square of the masses:
$$L_\nu = \frac{2\pi E_\nu}{\Delta m^2}\,. \qquad (9.18)$$
Whenever

$$L_\nu = L_e \cos(2\theta) \qquad (9.20)$$

the amplitude of the oscillation is maximal (sin²(2θ_m) = 1). Thus, for a given set
of (E, N_e) values, resonant oscillations are possible and the oscillation probability
may be strongly enhanced independently of the value of θ in vacuum.
In the center of the Sun Ne ∼ 3×10³¹ m⁻³, and then the value of Le is ∼ 3×10⁵ m,
which is a small number when compared with the Sun radius (10⁸–10⁹ m). In this
way, the suppression of the electron neutrinos is a function of the neutrino energy
for given values of Δm² and θ.
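As a rough numerical cross-check of Eqs. 9.14–9.18, the quantities Le, Lν and F can be evaluated for a constant electron density. The following is a minimal sketch only (two flavors, constant density, natural-unit conversion via ħc; function names are ours, not the book's):

```python
import math

HBARC = 1.9732e-7   # eV·m: conversion factor (hbar*c) between natural units and metres
G_F = 1.166e-23     # Fermi constant in eV^-2 (natural units)

def L_e(N_e_per_m3):
    """Electron neutrino interaction length (Eq. 9.17), in metres, for neutrinos."""
    n_e = N_e_per_m3 * HBARC**3                      # electron density in eV^3
    return 2 * math.pi / (math.sqrt(2) * G_F * n_e) * HBARC

def L_nu(E_eV, dm2_eV2):
    """Vacuum oscillation length (Eq. 9.18), in metres."""
    return 2 * math.pi * E_eV / dm2_eV2 * HBARC

def P_matter(E_eV, dm2_eV2, theta, N_e_per_m3, L_m):
    """Two-flavor P(nu_e -> nu_x) after a path L_m (metres) in matter of
    constant electron density (Eqs. 9.14-9.16)."""
    lnu, le = L_nu(E_eV, dm2_eV2), L_e(N_e_per_m3)
    F = math.sqrt((math.cos(2 * theta) - lnu / le) ** 2 + math.sin(2 * theta) ** 2)
    sin2_2theta_m = (math.sin(2 * theta) / F) ** 2   # Eq. 9.15, squared
    return sin2_2theta_m * math.sin(math.pi * L_m / (2 * lnu) * F) ** 2

# Solar-core check: N_e ~ 3e31 m^-3 gives L_e ~ 3e5 m, as quoted in the text
print(L_e(3e31))
```

With the solar-core density quoted above, the computed Le indeed comes out at about 3 × 10⁵ m.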
How to determine the oscillation parameters? More information comes from dif-
ferent neutrino sources.
Nuclear reactors are abundant ν̄e sources, via the β decays of several of the isotopes
produced in the fission reactions. The ν̄e have energies of a few MeV and can be
detected by the inverse β decay reaction (ν̄e p → e⁺ n). The results from reactors
can be combined with the results obtained in the solar experiments, supposing that
νe and ν̄e have the same behavior (as guaranteed by CPT invariance). In reactor
experiments the energies and the distances are much better determined than in solar
experiments.
The KamLAND experiment (again in Kamioka, Japan), a 1000-ton liquid scin-
tillator detector, is placed at distances of the order of 100 km from several nuclear
reactors (the weighted average distance being 180 km) and thus, as discussed in
the previous section, is sensitive to long-scale oscillations.
Electron antineutrinos are detected through the reaction ν̄e p → e⁺ n, which has
a 1.8 MeV energy threshold. The prompt scintillation light from the positron allows
an estimate of the energy of the incident antineutrino. The neutron recoil energy is only
a few tens of keV; the neutron is captured on hydrogen and a characteristic 2.2 MeV
gamma ray is emitted after some 200 µs. This delayed coincidence between the
positron and the gamma-ray signals provides a very powerful signature for distin-
guishing antineutrinos from backgrounds produced by other sources.
KamLAND detects a clear pattern of oscillation, as shown in Fig. 9.6. In a two-
flavor approximation (Eq. 9.13), the fit to this pattern provides for the Δm² involved
in the νe oscillations (we shall call it Δm²Sun) a value of about 8 × 10⁻⁵ eV², and
a value of

tan²(θSun) ∼ 0.47 .   (9.22)
Fig. 9.6 The ν̄e survival probability as a function of L/E observed in the KamLAND experiment.
Figure from A. Gando et al. (KamLAND Collab.), Phys. Rev. D83 (2011) 052002
Further solid evidence that neutrinos do oscillate came from the measurement at the
Earth's surface of the relative ratio of the νe and νμ produced in cosmic-ray showers
(Fig. 9.8; see also Chap. 10) by the decays of the π± and, in a smaller percentage, of
the K±. The decay chains

π⁺ → μ⁺ νμ ;  μ⁺ → e⁺ νe ν̄μ   (9.23)
π⁻ → μ⁻ ν̄μ ;  μ⁻ → e⁻ ν̄e νμ   (9.24)

imply that the ratio of muon-type to electron-type neutrinos, (νμ + ν̄μ)/(νe + ν̄e),
should be around 2. In fact the value of this ratio is slightly different from 2, because
not all muons decay on their way to Earth and only around 63 % of the K± follow
similar decay chains; this ratio is, thus, energy dependent. Monte Carlo calculations
allow the computation of these corrections.
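The factor of about 2 can be made concrete with a toy count over the decay chains 9.23–9.24, assuming every pion and every muon decays (the names below are ours):

```python
from collections import Counter

def chain_products(pion_charge):
    """Neutrinos produced by one complete pi -> mu -> e decay chain."""
    if pion_charge == +1:
        return ["nu_mu", "nubar_mu", "nu_e"]    # pi+ -> mu+ nu_mu; mu+ -> e+ nu_e nubar_mu
    return ["nubar_mu", "nu_mu", "nubar_e"]     # pi- -> mu- nubar_mu; mu- -> e- nubar_e nu_mu

counts = Counter()
for q in (+1, -1):                  # equal numbers of pi+ and pi-
    counts.update(chain_products(q))

mu_like = counts["nu_mu"] + counts["nubar_mu"]  # 2 per chain
e_like = counts["nu_e"] + counts["nubar_e"]     # 1 per chain
print(mu_like / e_like)             # -> 2.0
```

Muons that reach the ground before decaying, and the K± contribution, shift this ratio; these are exactly the corrections the Monte Carlo calculations account for.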
9.1 Sources and Detectors; Evidence of the Transmutation of the Neutrino Flavor 517
Fig. 9.7 Allowed parameter regions (at 1σ and 2σ) in the (sin2 (θ12 ), m 221 ) space for the combined
analysis of solar neutrino data and for the analysis of KamLAND data. The result for KamLAND
is illustrated by the ellipses with horizontal major axis (with the best fit marked by a green star).
Figure adapted from NuFIT 2014: M.C. Gonzalez-Garcia, M. Maltoni and T. Schwetz, “Updated
fit to three neutrino mixing: status of leptonic CP violation”, JHEP 11 (2014) 052 (color online)
Fig. 9.8 The interaction of cosmic rays in the upper atmosphere gives rise to particle showers
comprising neutrinos (right picture), which originate from a 10–20 km thick atmospheric layer. A
large-volume detector placed underground, like Super-K, is used to detect them; downward-going
neutrinos traveled only a few tens of kilometers and had no “space” to oscillate, while upward-going
neutrinos have traveled about 10,000 km and have likely oscillated. The detector (left picture) can
distinguish between electron neutrinos and muon neutrinos: secondary muons are likely to escape
the detector (non-contained or partially contained events), while secondary electrons produced by
electron neutrinos interacting in the detector are likely to be absorbed (fully contained events). In
the case of fully contained events the electron ring is “fuzzier” than the muon ring. From Braibant,
Giacomelli and Spurio, “Particles and Fundamental Interactions,” Springer 2014
oscillation formula deduced in just a two-flavor scenario but now between the muon
and the tau neutrinos.
The best fit to the data in a νμ − ντ oscillation scenario provides (NuFIT2014)4
4 The NuFIT group provides and regularly updates at the web site http://www.nu-fit.org/ a global
analysis of neutrino oscillation measurements.
Fig. 9.9 Left Zenith angle distribution of the muon neutrinos in SK. The observed number of
upward-going neutrinos was roughly half of the prediction. Right Survival probability of νμ as a
function of L/E. Black dots show the observations and the lines show the prediction based on
neutrino oscillation. Data show a dip around L/E ≃ 500 km/GeV. The prediction of two-flavor
neutrino oscillations agrees well with the position of the dip. From http://www-sk.icrr.u-tokyo.ac.
jp/sk/physics/atmnu-e.html and The Super-Kamiokande Collaboration, Y. Ashie et al., “Evidence
for an Oscillatory Signature in Atmospheric Neutrino Oscillations,” Phys. Rev. Lett. 93 (2004)
101801 (color online)
Bruno Pontecorvo first suggested in 1957 that the neutrino may oscillate; in the
1960s it was suggested that the neutrino weak and mass eigenstates were not the
same. Neutrinos would be produced in weak interactions in pure flavor states, each
a superposition of several mass states (preserving unitarity); the mass states would
determine their space-time evolution, giving rise to mixed flavor states.
We briefly discussed at the beginning of this chapter a simplified model in
which only two neutrino flavors and two mass eigenstates appear. Assuming three weak
eigenstates (νe, νμ, ντ) and three mass eigenstates (ν1, ν2, ν3), the mixing can be
modeled, similarly to what we have seen for the CKM matrix, using a 3 × 3 unitary
matrix, which we call today the Pontecorvo–Maki–Nakagawa–Sakata (PMNS) matrix
⎛ νe ⎞   ⎛ Ue1 Ue2 Ue3 ⎞ ⎛ ν1 ⎞
⎜ νμ ⎟ = ⎜ Uμ1 Uμ2 Uμ3 ⎟ ⎜ ν2 ⎟ .   (9.28)
⎝ ντ ⎠   ⎝ Uτ1 Uτ2 Uτ3 ⎠ ⎝ ν3 ⎠
Taking into account the relations imposed by unitarity and the fact that several
phases can be absorbed in the definition of the fields (if the neutrinos are standard
fermions) there are only three real parameters usually chosen as the mixing angles
θ12, θ13, θ23, and a single complex phase written in the form e^{iδ}. If the mixing angles
and δ are ≠ 0, CP is violated.
The PMNS matrix can be written as the product of three 3 × 3 matrices:
⎛ Ue1 Ue2 Ue3 ⎞
⎜ Uμ1 Uμ2 Uμ3 ⎟ =   (9.29)
⎝ Uτ1 Uτ2 Uτ3 ⎠

  ⎛ 1     0        0     ⎞ ⎛  cos θ13        0   sin θ13 e^{−iδ} ⎞ ⎛  cos θ12  sin θ12  0 ⎞
= ⎜ 0  cos θ23  sin θ23  ⎟ ⎜     0           1         0         ⎟ ⎜ −sin θ12  cos θ12  0 ⎟ .
  ⎝ 0 −sin θ23  cos θ23  ⎠ ⎝ −sin θ13 e^{iδ} 0      cos θ13      ⎠ ⎝     0        0     1 ⎠
(9.30)
This format puts in evidence what we observed: in a first approximation, both the
oscillation νe → νμ and the oscillation νμ → ντ can be described as oscillations
between two weak eigenstates and two mass eigenstates. Thus, we can identify the
two most important parameters for solar neutrinos, θSun and Δm²Sun, with θ12 and
Δm²21, respectively; while for atmospheric neutrinos we identify θatm and Δm²atm
with θ23 and Δm²32 ≃ Δm²31, respectively (experimentally it was observed that
|Δm²32| ≃ |Δm²31| ≫ Δm²21).
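The factorization of Eq. 9.30 can be checked numerically. Below is a minimal sketch using only the standard library and purely illustrative angle values (no fit values implied; the function names are ours):

```python
import cmath, math

def matmul(A, B):
    """Multiply two 3x3 (complex) matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def pmns(t12, t13, t23, delta):
    """PMNS matrix as the product of the three rotations of Eq. 9.30."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c13, s13 = math.cos(t13), math.sin(t13)
    c23, s23 = math.cos(t23), math.sin(t23)
    R23 = [[1, 0, 0], [0, c23, s23], [0, -s23, c23]]
    R13 = [[c13, 0, s13 * cmath.exp(-1j * delta)],
           [0, 1, 0],
           [-s13 * cmath.exp(1j * delta), 0, c13]]
    R12 = [[c12, s12, 0], [-s12, c12, 0], [0, 0, 1]]
    return matmul(matmul(R23, R13), R12)

# Illustrative angles (radians), roughly of the size of the measured ones
U = pmns(0.59, 0.15, 0.79, 1.2)
# Unitarity check: U U^dagger must be the identity
Udag = [[U[j][i].conjugate() for j in range(3)] for i in range(3)]
UUd = matmul(U, Udag)
```

Since each of the three factors is unitary, the product is unitary by construction, whatever the angles; this is the property that reduces the matrix to three angles plus one phase.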
The survival probability, for example νe → νe, in the case of three families is
given by:

P(νe → νe) = 1 − 4|Ue1|²|Ue2|² sin²(Δm²21 L/4Eν)
             − 4|Ue1|²|Ue3|² sin²(Δm²31 L/4Eν) − 4|Ue2|²|Ue3|² sin²(Δm²32 L/4Eν) .

The fact that Δm²32 ≃ Δm²31 ≫ Δm²21 leads to an oscillation characterized
by two different length scales. Indeed, assuming that Δm²32 = Δm²31, imposing
unitarity and expressing the matrix elements in terms of the PMNS parametrization
reported above, one obtains:

P(νe → νe) ≃ 1 − cos⁴(θ13) sin²(2θ12) sin²(Δm²21 L/4Eν) − sin²(2θ13) sin²(Δm²32 L/4Eν) .
(9.31)
Fig. 9.10 The νe survival probability as a function of L/E for fixed oscillation parameters as
indicated in the figure. From http://www.hep.anl.gov/minos/reactor13/reactor13.pdf
Depending on the L/E range probed, one of the two terms dominates:

P(ν̄e → ν̄e) ≈ 1 − sin²(2θ13) sin²(Δm²32 L/4Eν)   (9.32)

P(ν̄e → ν̄e) ≈ 1 − cos⁴(θ13) sin²(2θ12) sin²(Δm²21 L/4Eν) .   (9.33)
Close to a fission reactor, where the long-wavelength (solar) oscillation has not yet
developed, the electron antineutrino survival probability can thus be approximated
as in Eq. 9.32.
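Equation 9.31 and its two limits can be compared numerically. With Δm² in eV², L in km and E in GeV, the phase Δm²L/4Eν becomes 1.267 Δm²L/E; the sketch below uses illustrative parameter values only:

```python
import math

def p_ee(L_km, E_GeV, dm2_21=7.5e-5, dm2_32=2.4e-3, t12=0.59, t13=0.15):
    """Three-family nu_e survival probability in the approximation of Eq. 9.31
    (Delta m^2 in eV^2, L in km, E in GeV; parameter defaults are illustrative)."""
    phi21 = 1.267 * dm2_21 * L_km / E_GeV
    phi32 = 1.267 * dm2_32 * L_km / E_GeV
    return (1.0
            - math.cos(t13) ** 4 * math.sin(2 * t12) ** 2 * math.sin(phi21) ** 2
            - math.sin(2 * t13) ** 2 * math.sin(phi32) ** 2)

# Short baseline (~2 km, ~4 MeV): the solar term is negligible -> Eq. 9.32 regime.
# Long baseline (~180 km, KamLAND-like): the theta13 term oscillates fast and averages.
print(p_ee(2.0, 0.004), p_ee(180.0, 0.004))
```

The short-baseline value is driven almost entirely by θ13, the long-baseline value by (θ12, Δm²21), which is why the two classes of reactor experiments measure different parameters.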
The Daya Bay experiment in China is a system of six 20-ton liquid scintillator
detectors (antineutrino detectors, AD) arranged in three experimental halls (EH),
placed near six nuclear reactors (the geometry is shown in Fig. 9.11, left); as a conse-
quence of the distances and of the geometry, it is sensitive to the short-scale oscillations
which may occur in a 3 × 3 mixing matrix scenario (see Sect. 9.1.5). In fact, Daya Bay
reported in March 2012 the first evidence of such short-scale oscillations (Fig. 9.11,
right), finding a nonzero value of sin²(2θ13), of about 0.09.
Fig. 9.11 Left Layout of the Daya Bay experiment. The dots represent reactors, labeled as D1, D2,
L1, L2, L3, and L4; the locations of the detectors are labeled EH1, EH2 and EH3. Right The ν̄e
disappearance as measured by the Daya Bay experiment: ratio of the measured signal in each
detector versus the signal expected assuming no oscillation. The oscillation survival probability at
the best-fit sin²2θ13 value is given by the smooth curve. The χ² versus sin²2θ13 is shown in the
inset. Figures from F.P. An et al., Phys. Rev. Lett. 108 (2012) 171803
Although small, a nonzero value of θ13 allows a phase δ ≠ 0 to produce CP vio-
lation in the neutrino sector. Later, the RENO experiment in South Korea confirmed
the result.
The results from atmospheric neutrino experiments and reactor experiments can be
tested in accelerator experiments, building intense and collimated νμ and ν μ beams
from the decay of secondary π ± (and in a smaller percentage of K ± ), and placing
detectors both near (100–1000 m) and far (100–1000 km) from the primary target.
The oscillation distance L is then fixed and the neutrino flux and the energy spectrum
can be well predicted and precisely measured at the near detectors, constraining the
elements of the neutrino mixing matrix.
The K2K (KEK to Kamioka) experiment, in Japan, was the first such experi-
ment (actually its construction started at the end of the 1990s, before the discovery
of the neutrino oscillations in Super-Kamiokande). The neutrino beam, with a mean
energy of 1.3 GeV, was produced at KEK in Tsukuba and the interactions were mea-
sured in a nearby detector at 300 m and in the Super-Kamiokande detector at 250 km
(Fig. 9.12). 112 events were detected while 158 ± 9 were expected without con-
sidering oscillations; a neutrino oscillation pattern compatible with the atmospheric
neutrino results was observed.
Fig. 9.12 Sketch of the neutrino path in the K2K long-baseline experiment. From http://neutrino.kek.jp

The T2K (Tokai to Kamioka) experiment succeeded K2K in sending muon neutrinos
to the Super-Kamiokande detector. It is a second-generation experiment, with a
baseline of 295 km to Kamioka. The neutrino beam, produced in the J-PARC facility
in Tokai, Eastern Japan, has a narrow range of energies around 600 MeV, selected
in order to maximize the neutrino oscillation probability on the way to Super-Kamiokande.
The intensity of the beam is two orders of magnitude larger than in K2K.
The near detector (ND280), 280 m downstream of the neutrino production target, is a
segmented detector composed of neutrino targets inside a tracking system surrounded by a
magnet. ND280 can measure the energy spectrum of the ν beam, its flux, flavor
content, and interaction cross sections before the neutrino oscillation. We shall see
later that, on top of precise measurements of the νμ disappearance (Fig. 9.13, left),
T2K detected for the first time explicitly the appearance of νe in a νμ beam.
In the USA, the MINOS experiment started taking data in 2005. The beam line at
Fermilab is optimized to produce both νμ and ν μ beams with mean energy of 3 GeV.
The far detector is placed at a distance of 735 km in the Soudan mine. A distortion
of the energy spectrum at the far detector compatible with the previous oscillation
measurements was observed for νμ beams (Fig. 9.13, right).
These results can be once again interpreted in terms of oscillations in a two-
flavor scenario (but now considering νμ → ντ, as shown in Fig. 9.14). They confirm
and improve the result from the atmospheric neutrinos. The mixing is almost max-
imal (sin²(2θ23) > 0.90) and the mass difference (Δm²atm ∼ 2.3 × 10⁻³ eV²)
is again much smaller than the normal fermion masses but much higher than the
values measured in the case of the electron neutrino beams, i.e., in the “solar” neu-
trinos (Δm²Sun ∼ 8 × 10⁻⁵ eV²), as discussed above. Accelerator and atmospheric
experiments are complementary: in the former L is fixed and E known, assuring a
good resolution in the measurement of Δm²23, while in the latter the fluxes are high,
assuring a good resolution in the measurement of θ23 and providing sensitivity also to θ13.
Fig. 9.13 Left The first T2K study on the disappearance of muon neutrinos: muon-neutrino
events with well-reconstructed energy recorded before 2011. The energy distribution is compared
to the calculations with and without oscillations. From Phys. Rev. D 85, 031103 (2012). Right The
ratio of the observed spectrum of muon neutrino interactions from MINOS to the predicted spectrum
in the absence of oscillations. The dark band represents the prediction assuming oscillations and its 1σ
systematic uncertainty, using the best-fit oscillation parameters from MINOS. The observed data are
well described by the oscillation model. From http://www-numi.fnal.gov/PublicInfo/forscientists.
html
Fig. 9.14 The (Δm²23, sin²(2θ23)) 90 % CL allowed regions. From K. Abe et al. (T2K Collab.),
Phys. Rev. Lett. 111 (2013) 211803
Fig. 9.15 One of the three tau antineutrino candidate events observed by OPERA. From http://
operaweb.lngs.infn.it/
Two experiments made an explicit detection of neutrinos of a flavor different from
the muon neutrinos in an accelerator beam.
The OPERA experiment, situated at Gran Sasso, Italy, receives a 17 GeV muon
neutrino beam produced at CERN, located 730 km away. OPERA uses a sophisticated
1300-ton detector composed of a sandwich of photographic emulsion films and lead
plates in order to be able to detect tau leptons: it is thus an appearance experiment aim-
ing to detect tau neutrinos resulting from the oscillation of the initial muon neutrino
beam. OPERA, which concluded data-taking, reported four tau-neutrino candidates,
corresponding to a significance of about 4σ; one of them is shown in Fig. 9.15.
The first observation of the νe appearance from a high-purity νμ beam was recently
reported by T2K: 28 electron neutrino events were detected while 4.92 ± 0.55 back-
ground events were expected in the case of no neutrino oscillation.
The interior of the Earth radiates heat at a rate of about 50 TW, which is about 0.1 %
of the incoming solar energy. Part of this heat originates in the energy released in
the decay of radioactive isotopes, while another part is due to the cooling of
the Earth.
The Earth’s radioactive elements (in particular ²³⁸U, ²³²Th, ⁴⁰K) are β⁻ emitters
and thus natural sources of ν̄e, in this case designated as geo-neutrinos. The fluxes are
small (as an example, around 21 events/year in KamLAND) but their measurement
may provide important geological information on Earth’s composition and structure
that is not accessible by any other means. The main backgrounds are due to
The simplified model in which neutrinos coming from two mass eigenstates oscillate
between two flavors does not describe the full picture coming from the data. The large
majority of the present experimental results are well-described assuming three weak
eigenstates (νe, νμ, ντ) and three mass eigenstates (ν1, ν2, ν3). Some researchers
point to a possible tension in the data, which was first announced as the
“LSND⁵ anomaly.” LSND claimed an oscillation with |Δm| ∼ 1 eV, which would
imply the existence of a neutrino with a mass of at least one eV. The only way to
accommodate this with the LEP results on the number of neutrino families is that this
particle is a new kind of neutrino, which should be sterile. A recent review of short-
range (L < 100 m) reactor neutrino results, indicating a possible “reactor neutrino
anomaly,” can be found in G. Mention et al., Phys. Rev. D 83 (2011) 073006.
discrepancies are difficult to reconcile with cosmological bounds on the neutrino
mass (see later), unless neutrinos have new properties; the discussion is beyond the
scope of this book.
The mixing matrix between three states is the Pontecorvo-Maki-Nakagawa-Sakata
(PMNS) matrix (see Sect. 9.1.5). However, it should be noted that a complete treat-
ment of neutrino propagation requires subtle questions of field theory, and has close
links to the foundation of quantum mechanics. Since different mass components
travel at different speeds, the mixing spreads the neutrino wavefunction in space,
with EPR-like6 implications.
The parameters of the PMNS matrix are: two mass differences (Δm²21 and Δm²31);
three angles (θ12, θ23 and θ13); one single complex phase written in the form e^{iδ}.
Several groups provide regular fits imposing the unitarity of the mixing matrix, in
particular the PDG group and the NuFIT group. Averages of the mass differences
5 The Liquid Scintillator Neutrino Detector (LSND) was a 167-tons scintillation counter at Los
Alamos National Laboratory that measured the flux of neutrinos produced by a near accelerator
neutrino source.
6 The Einstein-Podolski-Rosen (EPR) paradox originally involved two particles, A and B, which
interact briefly and then move off in opposite directions. The two particles are then entangled, and
any measurement on A (projection of A on an eigenstate) would have immediately implications on
the state of B; this would violate locality. In the case of neutrinos, the neutrino wavefunction itself
spreads during the travel, with possible nonlocal effects.
9.2 Neutrino Oscillations; Present Knowledge of the Oscillation Parameters 527
and of the mixing angles (derived from PDG 2014 and PDG 2013) are:
Note the elaborate way in which we defined the parameter |Δm²| in Eq. 9.35. This
comes from the fact that the sign of Δm²31 (and thence of Δm²32) is not known. In
fact only the sign of Δm²21 is determined from the experimental measurements (solar
neutrinos). There are two possibilities (Fig. 9.16):
• m1 < m2 < m3 (the so-called “normal” hierarchy, NH);
• m3 < m1 < m2 (called the inverted hierarchy, IH).
The complex phase is, from NuFit 2014,
It may thus be δ ≠ 0 (which would imply the violation of CP in the neutrino sector), but
there is not yet sensitivity to confirm that hypothesis firmly (the value is consistent
with 360° within less than three standard deviations).
The PMNS matrix is thus highly non-diagonal, which is very different from what
is observed in the quark sector (see Sect. 6.3.7). The best estimates of the 3σ confidence
intervals for its elements are:
528 9 The Properties of Neutrinos
⎛ νe ⎞   ⎛ 0.801 → 0.845  0.514 → 0.580  0.137 → 0.158 ⎞ ⎛ ν1 ⎞
⎜ νμ ⎟ = ⎜ 0.225 → 0.517  0.441 → 0.699  0.614 → 0.793 ⎟ ⎜ ν2 ⎟ .   (9.40)
⎝ ντ ⎠   ⎝ 0.246 → 0.529  0.464 → 0.713  0.590 → 0.776 ⎠ ⎝ ν3 ⎠
Future facilities are planned to improve our knowledge of the mixing matrix
and possibly discover new physics; in particular, high-precision and high-luminosity
long-baseline neutrino oscillation experiments have been proposed in the US, in
Japan, and in Europe.
The discovery of neutrino oscillations showed, as discussed above, that the neutrino
flavor eigenstates are not mass eigenstates and that at least two of the mass eigenvalues
are different from zero.
Thanks to a huge experimental effort, we know quite well the neutrino mass
differences. As of today we do not know, however, the absolute values of the neutrino
masses. The values of Δm²ij (Eqs. 9.34 and 9.35) suggest masses of the order of 1–
100 meV. However, the possibility that the mass of the lightest neutrino is much
larger than this and that all three known neutrino masses are quasi-degenerate is not
excluded.
Neutrino masses can only be directly determined via non-oscillation neutrino
experiments. The most model-independent observable for the determination of the
mass of the electron neutrino is the shape of the endpoint of the beta decay spectrum.
Other probes of the absolute value of the neutrino masses include double beta decays,
if neutrinos are of the Majorana type, discussed below, and maps of the large-scale
structure of the Universe, which is sensitive to the masses of neutrinos—although
this sensitivity depends on cosmological models.
at 95 % C.L.
A more conservative limit
9.3 Neutrino Masses 529
Σ mνi < 0.6 eV   (9.42)
can be extracted as follows, based on the density and sizes of structures in the Universe.
Initial fluctuations seeded the present structures in the Universe, which grew
during its evolution. Neutrinos, due to their tiny masses, can escape from most struc-
tures, their speed being larger than the escape velocity. As a net result, neutrinos can
erase structures at scales smaller than a certain value DF, called the free-streaming
distance. The smaller the sum of the neutrino masses, the larger is DF. The relevant
observable is the mass spectrum, i.e., the probability of finding a structure of a given
mass as a function of the mass itself. Cosmological simulations predict the shape of
the mass spectrum in terms of a small number of parameters; the limit in Eq. 9.42 is
the limit beyond which the predicted distribution of structures is inconsistent with
the observed one.
Data from astrophysical neutrino propagation over large distances are less con-
straining. So far, the only reported upper limit on the neutrino velocity was obtained
by comparing the energies and the arrival times of a few tens of neutrinos, detected
by three different experiments, from the explosion of the supernova 1987A in the
Large Magellanic Cloud, at around 50 kpc from Earth. From these results a limit of about 6 eV was
obtained on the masses of the neutrinos reaching the Earth. The present long-baseline
accelerator experiments are not sensitive enough to set competitive limits. Recent
claims of the observation of superluminal neutrinos by the OPERA experiment were
proved to be experimentally wrong.
The study of the energy spectrum of the electrons produced in nuclear β decays
is, one century after the first measurement, still the target of intense experimental
efforts. In particular, the detailed measurement of the endpoint of this spectrum may
allow the determination of the electron neutrino mass by direct energy conservation.
In fact it can be shown that whenever the parity of the initial and the final nuclei
is the same, the spectrum of the outgoing electron is given by:
dN/dE = (G²F cos²θc / 2π³) |I|² F(Z, R, E) p E (E0 − E) √[(E0 − E)² − m²νe]   (9.43)
where:
1. cos θc is the cosine of the Cabibbo angle.
2. I is an isospin factor that depends on the isospins of the initial and the final nucleus.
3. F(Z, R, E) is the “Fermi function,” accounting for the electrostatic interaction
between the nucleus and the outgoing electron, which depends on the nuclear charge
Z, on the nuclear radius R, and on the electron energy.
The Kurie plot is built from the quantity

K(E) = √[ (dN/dE) / (F(Z, R, E) |p| E) ] .   (9.44)

In the case of mνe = 0,

K(E) ∝ (E0 − E)   (9.45)

and the plot is just a straight line. However, if mνe ≠ 0, this line bends slightly near
the endpoint (Fig. 9.17) and K(E) now vanishes at:

E = E0 − mνe .   (9.46)
Assuming a mixing scenario with three nondegenerate mass eigenvalues, the spec-
trum at the endpoint would be the superposition of the Kurie plots corresponding to
each of the mass eigenvalues (Fig. 9.17, right); indeed the measured mass will be a
combination m_β such that

m²β = Σi |Uei|² m²i .   (9.47)
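A small sketch of Eqs. 9.44–9.46, up to an overall constant (the function name is ours and the endpoint value is only approximate):

```python
import math

def kurie(E, E0, m_nu):
    """Kurie function near the endpoint, up to a constant factor:
    K(E) = sqrt((E0 - E) * sqrt((E0 - E)^2 - m_nu^2)) for E <= E0 - m_nu, else 0.
    Energies in eV."""
    d = E0 - E
    if d < m_nu:
        return 0.0
    return math.sqrt(d * math.sqrt(d * d - m_nu * m_nu))

E0 = 18574.0   # approximate tritium beta-decay endpoint, eV
# Massless case: K is proportional to (E0 - E), a straight line.
# Massive case: K already vanishes at E = E0 - m_nu and bends just below it.
print(kurie(E0 - 10.0, E0, 0.0), kurie(E0 - 10.0, E0, 2.0))
```

Evaluating the two cases a few eV below the endpoint shows the bend: the massive curve lies below the straight line and reaches zero at E0 − mν instead of E0.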
Fig. 9.17 Left Kurie plot. The green line represents the ideal case for mνe = 0, the red line the
ideal case for mνe ≠ 0, and the blue line the real case, where a finite detector resolution introduces a
smearing at the endpoint. Right Detail of the endpoint in the case of a mixing scenario with three
nondegenerate mass eigenvalues. Credit: Andrea Giuliani, “Review of Neutrino Mass Measurements,”
Quark-Lepton conference, Prague, 2005 (color online)
The best present results were obtained by experiments (Troitsk in Russia and
Mainz in Germany) using tritium (³H) as the source. These experiments measure the
electron energy using complex magnetic and electrostatic spectrometers. The current
limit is
m νe < 2.0 eV (9.48)
at 95 % C.L.
Following this line, an ambitious project (KATRIN in Karlsruhe, Germany), with
a 200-ton spectrometer, is presently in preparation. KATRIN aims either to improve
this limit by an order of magnitude or to measure the mass, with a sensitivity of 0.2 eV.
An alternative proposal (Project 8, in Yale, US) is to use the measurement of the
cyclotron frequency of individual electrons to reach similar sensitivities.
The muon and the tau neutrino masses were studied, respectively, in the decays of
charged pions (π⁺ → μ⁺νμ and π⁻ → μ⁻ν̄μ), and in the three- and five-prong
decays of the tau lepton. Limits
and
m ντ < 18.2 MeV (9.50)
were obtained at 95 % confidence level. They are not competitive with the cosmo-
logical limits.
where gν is the Yukawa coupling, v = 246 GeV is the Higgs vacuum expectation
value, and νL and νR are, respectively, the left- and right-handed chiral⁷ Dirac spinors.
This mass term is built with the normal Dirac spinors, and so these neutrinos are
designated as Dirac neutrinos. The right-handed neutrino, which was not necessary in
the first formulation of the SM, should then exist. In fact, in the case of massless
neutrinos chirality is conserved, and the right-handed neutrino, being an SU(2) singlet,
has no weak interactions, as well as no strong or electromagnetic interactions:
excluding gravitational effects, it is invisible.
However, neutrinos and antineutrinos have, apart from the lepton number (which
is not generated by a fundamental interaction symmetry), the same quantum numbers,
and thus they could be one and the same particle. This would not be possible in the case
of electrons/positrons, for example, as they carry electric charge.
The neutrino and the antineutrino can, in this hypothesis first introduced by Ettore
Majorana in 1937, be described by just two-component chiral spinors (instead of the
four-component spinors needed for Dirac fermions). A left-handed neutrino is
identical (but for a phase) to a right-handed antineutrino, which may be described by
the CP conjugate of the left-handed neutrino (νL^C). A mass term involving left-handed
neutrinos and right-handed antineutrinos can then be written:

−(m/2) (ν̄L νL^C + ν̄L^C νL) .   (9.52)
However, ν̄L^C νL has weak hypercharge Y = −2 and thus cannot form a gauge-invari-
ant coupling with the Standard Model Higgs doublet, which has Y = +1. In this way
either the Higgs sector has to be extended, or again right-handed neutrinos (in this
case of the Majorana type) are introduced. Indeed a left-handed antineutrino is iden-
tical (but for a phase) to a right-handed neutrino and may be described by the CP
conjugate of the right-handed neutrino (νR^C). In this case mass terms involving right-
handed neutrinos and left-handed antineutrinos, like

−(M/2) (ν̄R^C νR + ν̄R νR^C) ,   (9.53)

are SU(2) singlets: they can be introduced directly in the Lagrangian without break-
ing gauge invariance. No Higgs mechanism is therefore needed in this scenario in
the neutrino sector. These Majorana neutrinos would not couple to the weak bosons.
Both Dirac and Majorana mass terms may be present. In the so-called “see-saw”
mechanism, a Dirac term with mass mD and a Majorana term with mass M as defined
above are introduced. The physical states for which the mass matrix is diagonal are
now a light neutrino with mass

mν ∼ m²D / M   (9.54)
7 Hereafter in this section the designations “left”- and “right-handed” refer to chirality and not to
helicity. Note that for massive neutrinos chirality and helicity are not equivalent (see chap. 6).
The interested reader can find the explanation in the additional material.
The light neutrino, in the limit M ≫ mD, has basically the same couplings as
the standard model neutrinos, while the heavy neutrino is basically right-handed and
thus sterile, and may have, if it is stable, a role in dark matter.
The extremely small values that experiments indicate for the neutrino masses are
in this way generated thanks to the existence of a huge Majorana mass, while the
scale of the Dirac mass is the same as for the other fermions.
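The see-saw relation (Eq. 9.54) follows from diagonalizing the 2×2 mass matrix [[0, mD], [mD, M]]. A sketch with purely illustrative scales (the rationalized form of the light eigenvalue avoids catastrophic floating-point cancellation when M ≫ mD):

```python
import math

def seesaw(m_D, M):
    """Absolute eigenvalues of the see-saw mass matrix [[0, m_D], [m_D, M]]:
    a light state ~ m_D^2/M and a heavy state ~ M (masses in eV)."""
    s = math.sqrt(M * M + 4 * m_D * m_D)
    light = 2 * m_D * m_D / (M + s)   # algebraically equal to (s - M)/2, but stable
    heavy = (M + s) / 2
    return light, heavy

# Illustrative: Dirac mass at the electroweak scale, Majorana mass near a GUT scale
m_light, m_heavy = seesaw(100e9, 1e24)   # m_D = 100 GeV, M = 1e15 GeV, in eV
print(m_light)   # ~ m_D^2 / M ~ 0.01 eV
```

With mD of the order of the other fermion masses and M very large, the light eigenvalue lands naturally in the sub-eV range, which is the point of the mechanism.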
If neutrinos are Majorana particles (i.e., if neutrinos and antineutrinos are the same par-
ticle), neutrinoless double β decays (0νββ, also called ββ0ν) can occur, in particular
for nuclei for which the normal β decays are forbidden by energy conservation. The
lines corresponding to the emission of the ν̄e can be connected, becoming an internal
line (Fig. 9.18).
Considering that the nuclei before and after the decay are basically at rest, the sum
of the energies of the two electrons in 0νββ decays is just the difference of the masses
of the two nuclei (Q = M(Z , A)−M(Z + 2, A) − 2m e ). Thus in these decays the
energy spectrum of the emitted electrons should be a well-defined line while in the
normal double β decay, with the emission of two ν e , this spectrum accommodates a
large phase space and the electron energy distribution is broad (Fig. 9.19).
The decay rate is proportional to the square of the sum of the several mass-
eigenstate amplitudes corresponding to the exchange of the electron (anti)neutrino
(coherent sum of virtual channels). It is then useful to define an effective Majorana
mass as:

m_ββ = | Σk |Uek|² e^{iαk} mk |   (9.56)
where the αk are the Majorana phases (one of them can be seen as a global phase and
be absorbed by the neutrino wavefunctions, but the two remaining ones cannot be
absorbed, as was instead the case for Dirac neutrinos).
Being a function both of the neutrino masses and of the mixing parameters, this
effective mass depends on the neutrino mass hierarchy. In the case of the normal hier-
archy, total cancelation may occur for a given range of masses of the lightest neutrino,
and m_ββ may be null.
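Equation 9.56 and the possible cancelation between terms can be illustrated numerically; all numbers below are purely illustrative, not fit results:

```python
import cmath, math

def m_betabeta(masses, Ue_abs2, alphas):
    """Effective Majorana mass of Eq. 9.56:
    m_bb = | sum_k |U_ek|^2 e^{i alpha_k} m_k | (masses in eV)."""
    return abs(sum(u * cmath.exp(1j * a) * m
                   for u, a, m in zip(Ue_abs2, alphas, masses)))

# Normal hierarchy with a nearly massless lightest state: m_2 and m_3 fixed by
# illustrative mass splittings; |U_ek|^2 values are illustrative as well.
masses = [0.0, math.sqrt(7.5e-5), math.sqrt(2.4e-3)]
Ue2 = [0.68, 0.30, 0.022]
print(m_betabeta(masses, Ue2, [0.0, 0.0, 0.0]))       # phases aligned
print(m_betabeta(masses, Ue2, [0.0, math.pi, 0.0]))   # partial cancellation
```

Varying the unobservable Majorana phases changes m_ββ at fixed masses and mixings, which is why the predicted signal spans a band rather than a line in Fig. 9.20.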
The experimental measurement is extremely difficult due to the low decay rates
and the backgrounds (the 2νββ background is difficult to reduce, unless the energy
resolution is extremely good). An ideal experiment should then have a large source
mass and an excellent energy resolution; a clean environment and techniques to sup-
press the background (such as particle identification, spatial resolution, and timing
information) would help in general.
Several experimental strategies have been implemented in recent years, and
eleven isotopes for which single beta decay is energetically forbidden have been
experimentally observed undergoing double beta decay, for which half-lives (typi-
cally of the order of 10²¹ years) have been measured. Among the most interesting
double beta decay emitters are:
• 136 Xe, with a high Q-value of about 2.5 MeV where background is small, which can
Fig. 9.20 Left m β and the sum of the neutrino masses as a function of the lightest neutrino mass. In
green (upper band) and in red (lower band) the allowed phase spaces, respectively, for the inverted
and the normal hierarchy scenarios are plotted. Right m ββ as a function of the lightest neutrino mass
(three standard deviation intervals). In the upper band and in the lower band the allowed phase
spaces, respectively, for the inverted and the normal hierarchy scenarios are plotted
No confirmed signal has so far been established (limits for m ββ of a few hundred
meV were obtained from the limits on the neutrinoless half-lives). In the next years,
the new generation of experiments may reach the inverted hierarchy mass region.
The measurements of single β decay and, if neutrinos are Majorana particles, of
double β decay set limits, as discussed in the previous sections, respectively on the
weighted mass m νe and on the effective mass m ββ of the electron neutrino. In the
latter case, we should note however that the limits are directly measured on the
lifetimes of the unstable nuclei, and must be translated into m ββ .
The bound from cosmology can be similarly expressed in a graph of that form. In
that case

Σ m i = m 1 + √(m 1² + δm²12) + √(m 1² + δm²13). (9.57)
These limits are plotted as a function of the lightest neutrino mass in Fig. 9.20.
Most of the phase space for the inverted mass hierarchy scenario may be fully
probed by the next generation of beta and double beta experiments.
Further Reading
[F9.1 ] C. Giunti and C.W. Kim, “Fundamentals of Neutrino Physics and Astro-
physics”, Oxford 2007.
Exercises
1. Neutrino interaction cross section. Explain the peak in the cross section in Fig. 9.1.
2. Neutrino oscillation probability. Given a pure muon neutrino beam,
with fixed energy E, derive the probability of observing another neutrino flavor
at a distance L assuming two weak eigenstates related to two mass eigenstates
by a simple rotation matrix.
3. Tau antineutrinos appearance. OPERA is looking for the appearance of tau anti-
neutrinos in the CNGS (CERN neutrinos to Gran Sasso) muon-antineutrino beam.
The average neutrino energy is 17 GeV and the baseline is about 730 km. Neglect-
ing mass effects, calculate the oscillation probability
P(ν̄μ → ν̄τ )
and comment.
4. Neutrino mass differences. A neutrino experiment detects, at 200 m from a
nuclear reactor, that the flux of a 3 MeV antineutrino beam is (90 ± 10) % of what
would be expected in case of no oscillation. Assuming maximal mixing, determine
the value of Δm²ν at a confidence level of 95 %.
5. Neutrino rotation angles. Suppose there are three neutrino types (electron, muon,
tau) and three mass values, related by the 3 × 3 PMNS matrix, usually factorized
by three rotation matrices. Knowing that the three mass values are such that:
– Δm² (solar) = m²₂ − m²₁ ∼ 10⁻⁵ eV²
– Δm² (atmospheric) = m²₃ − m²₂ ∼ 10⁻³ eV²
discuss the optimization of reactor and accelerator experiments to measure each
of the three rotation angles and to confirm such mass differences. Compare, for
example, the pairs of experiments (KamLAND, DayaBay), (T2K,OPERA).
6. Neutrino from Supernova 1987A. In 1987, a Supernova explosion was observed in
the Magellanic Cloud, and neutrinos were measured in three different detectors.
The neutrinos, with energies between 10 and 50 MeV, arrived within a time span of
10 s, after a travel time of 5 × 10¹² s, and 3 h before photons at any wavelength.
(a) Can this information be used to determine a neutrino mass? Discuss the
quantitative mass limits that could be derived from the SN1987A.
(b) This was the only SN observed in neutrinos, up to now, but the same rea-
soning can be used in pulsed accelerator beams. Derive the needed time and
position precision to measure ∼1 eV masses, given a beam energy E ∼1 GeV
and distance L.
7. Double β decay. Double β decay is a rare process, possible only for a small
fraction of the nuclear isotopes. The neutrinoless double β decay is only possible
if lepton number is not conserved, and is one of the most promising channels
to discover lepton number violation. Discuss the optimization (total mass, cho-
sen isotope, backgrounds, energy resolution, ...) of the experiments looking for
0νββ. List other possible experimental signals of lepton number violation you can
think of.
Chapter 10
Messengers from the High-Energy Universe
Cosmic rays were discovered at the beginning of the twentieth century (see Chap. 3).
Since then an enormous number of experiments have been performed on the Earth's
surface, on balloons, on airplanes, or even on satellites. We know today that cosmic
rays of different natures, spanning many decades in energy, come from outer space
and travel through the interstellar medium. Their origin and composition are a
challenging question, and the combined study of cosmic rays of different natures and
energies (multi-messenger astrophysics) can solve fundamental problems, in particular
related to physics in extreme environments, and unveil the presence of new particles
produced in high-energy phenomena and/or in earlier stages of the Universe.
As we have seen in the Introduction, we think that the ultimate engine of the
acceleration of cosmic rays is gravity. In gigantic gravitational collapses, such as
those occurring in supernovae and in the accretion of supermassive black holes at
the expense of the surrounding matter, part of the potential gravitational energy is
transformed into kinetic energy of particles. The mechanism is not fully understood,
although we can model part of it; we shall give more details in this chapter. As usual
in physics, experimental data are the key: we need to know as accurately as possible
the origin, composition, and energy spectrum of cosmic rays.
The composition and energy spectrum of cosmic rays are not a well-defined
problem: they depend on where the experiments are performed. One could try a schematic
© Springer-Verlag Italia 2015 537
A. De Angelis and M.J.M. Pimenta, Introduction to Particle
and Astroparticle Physics, Undergraduate Lecture Notes in Physics,
DOI 10.1007/978-88-470-2688-9_10
Charged cosmic rays arrive close to the Solar System after being deflected by the
galactic magnetic fields (about 1 µG in intensity) and possibly by extragalactic
magnetic fields, if they are of extragalactic origin; when getting closer to the Earth
they start interacting with stronger magnetic fields—up to O(1 G) at the Earth's
surface, although over much shorter distances. Fluxes of charged particles at lower
energies, below 1 GeV, can thus be influenced, e.g., by the solar cycle, which affects
the magnetic field from the Sun.
The energy spectrum of the charged cosmic rays reaching the atmosphere spans
over many decades in flux and energy (Fig. 10.1). At low energies, E < 1 GeV,
the fluxes are high (thousands of particles per square meter per second) while at the
highest energies, E > 1011 GeV, they are extremely scarce (less than one particle
per square kilometer per century).
The cosmic rays at the end of the known spectrum thus have energies well above
the highest beam energies attained in any man-made accelerator, and their interactions
on the top of the Earth's atmosphere have center-of-mass energies of a few hundred
TeV (the design LHC beam energy is E = 7 × 10³ GeV).
Below a few GeV their flux is modulated by the solar wind, displaying an anti-
correlation with it, and depends also on the local Earth geomagnetic field. Above a
few GeV the intensity of the cosmic ray flux basically follows a power law,

I(E) ∝ E^−γ ,

with the differential spectral index γ being typically between 2.7 and 3.3.
The small changes in the spectral index can be clearly visualized by multiplying the
flux by some power of the energy. Figure 10.2 shows a suggestive anthropomorphic
10.1 The Data 539
representation of the cosmic ray energy spectrum obtained by multiplying the flux by
E^2.6 .
Two clear break points, corresponding to changes in the spectral index, are
observed. The first, called the knee, occurs around E ≃ 5 × 10¹⁵ eV, and it is
commonly associated with the maximum energy attainable in galactic accelerators;
it corresponds to a steepening from a spectral index of about 2.7 to a spectral index
of about 3.1. There is experimental evidence that the chemical composition of cosmic
rays changes after the knee region, with an increasing fraction of heavy nuclei
at higher energy, at least up to about 10¹⁷ eV. At even higher energies the chemical
composition remains a matter of debate. The second clear feature, denominated the
“ankle,” occurs around E ≃ 5 × 10¹⁸ eV; its nature is still controversial, and it is
often interpreted as the imprint of the transition from galactic to extragalactic
cosmic rays. Another feature, called the second knee, marks a steepening from about
3.1 to about 3.3 at an energy of about 400 PeV.
The number of primary nucleons per GeV from some 10 GeV to beyond 100 TeV
is approximately

N(E) ≃ 1.8 × 10⁴ E^−2.7 nucleons/(m² s sr GeV) (10.1)

where E is the energy per nucleon in GeV.
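The normalization of eq. (10.1) translates directly into integral fluxes; a minimal sketch (the single power law only holds from roughly 10 GeV up to the knee, so extrapolating it to the highest energies would overpredict the flux):

```python
# Integral intensity implied by eq. (10.1): integrating N(E) = k * E^-2.7
# above a threshold E0 gives I(>E0) = k/(gamma-1) * E0^(1-gamma).
# Valid roughly between 10 GeV and the knee, where the spectrum steepens.
K, GAMMA = 1.8e4, 2.7     # k in nucleons / (m^2 s sr GeV), from eq. (10.1)

def integral_intensity(E0_GeV, k=K, gamma=GAMMA):
    """Nucleons per m^2 s sr above energy E0 (E0 in GeV)."""
    return k / (gamma - 1.0) * E0_GeV ** (1.0 - gamma)

i10 = integral_intensity(10.0)      # a couple of hundred per m^2 s sr
i100 = integral_intensity(100.0)    # a few per m^2 s sr
print(f"I(>10 GeV) ≈ {i10:.0f} /m^2 s sr, I(>100 GeV) ≈ {i100:.1f} /m^2 s sr")
```

The steep index means each decade in threshold energy costs a factor of about 50 in rate, which is why the highest-energy events are so rare.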
A strong suppression at the highest energies, E ≳ 5 × 10¹⁹ eV, is nowadays
clearly established (Fig. 10.2); it may result, as explained in Chaps. 1 and 2, from the
so-called GZK mechanism due to the interaction of highly energetic protons with
the Cosmic Microwave Background (CMB), but a scenario in which an important
part of the effect is a change of composition (from protons to heavier nuclei, which
undergo nuclear photodisintegration,1 see later), and/or the exhaustion of the sources
(see Sect. 10.3), is not excluded.
10.1.1.1 Composition
Cosmic rays are basically protons (∼90 %) and heavier nuclei. The electron/positron
flux at the top of the atmosphere is small (a few per mil of the total cosmic ray
flux) but extremely interesting as it may be a signature of unknown astrophysical or
Dark Matter sources (see Chap. 8). Antiproton fluxes are even smaller (by about four
orders of magnitude) and so far compatible with secondary production by hadronic
interactions of primary cosmic rays with the interstellar medium. Up to now there is
no evidence for the existence of heavier anti-nuclei (in particular anti-deuterium and
anti-helium) in cosmic rays.
1 In the case of nuclei, the spallation due to the giant dipole resonance (a collective excitation of
nucleons in nuclei due to the interaction with photons) has also to be considered. For all nuclei but
iron the corresponding mean free paths are, at these energies, much smaller than the proton GZK
mean free path.
Fig. 10.3 Shower development scheme. Adapted from the Ph.D. thesis of R. Ulrich: “Measurement
of the proton–air cross section using hybrid data of the Pierre Auger Observatory,” http://bibliothek.
fzk.de/zb/berichte/FZKA7389.pdf
Accessing the composition of cosmic rays can be done in the region below a
few TeV, for example, by combining momentum measurement with the information
from Cherenkov detectors, or transition radiation detectors. This is how the AMS-02
magnetic spectrometer performs the task.
For EAS detectors, distinguishing between a shower generated by a proton and
one generated by a heavier nucleus is a more difficult task. One of the most popular
air shower variables which in principle allows disentangling protons from heavier
nuclei is the so-called Xmax , the depth at which the number of particles in the shower
is maximal, expressed in g/cm². Xmax may be defined (Fig. 10.3) as the sum of the
depth of the first interaction X1 and a shower development length ΔX:

Xmax = X1 + ΔX .
This discriminating variable uses the fact that the cross section for a nucleus grows
with its mass number.
The X1 distribution is just a negative exponential, exp(−X1 /η), where η
is the interaction length, which is proportional to the inverse of the cosmic ray–air
interaction cross section. The observed Xmax distribution is thus the convolution of
the X1 distribution with the ΔX distribution (which has a shape similar to the Xmax
distribution) and a detector resolution function (Fig. 10.4). Note that the tail of the
Xmax distribution reflects the exponential X1 distribution.
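The convolution picture above can be illustrated with a toy Monte Carlo. All numerical values (interaction length, mean development length, widths) are illustrative placeholders, not fitted parameters:

```python
import random

# Toy Monte Carlo of the Xmax distribution described above: an exponential
# first-interaction depth X1 (interaction length eta) plus a fluctuating
# shower-development length DX, smeared by a detector resolution.
# All numbers are illustrative placeholders, not fitted parameters.
ETA = 50.0            # g/cm^2, assumed proton-air interaction length

def sample_xmax(eta=ETA, dx_mean=700.0, dx_sigma=60.0, resolution=20.0):
    x1 = random.expovariate(1.0 / eta)              # first interaction depth
    dx = random.gauss(dx_mean, dx_sigma)            # shower development length
    return x1 + dx + random.gauss(0.0, resolution)  # detector smearing

random.seed(1)
sample = [sample_xmax() for _ in range(100_000)]
mean_xmax = sum(sample) / len(sample)
# The deep tail of the distribution is dominated by the exponential X1 term,
# which is how the proton-air cross section is extracted from the data.
print(f"mean Xmax ≈ {mean_xmax:.0f} g/cm^2")
```

Fitting an exponential to the deep tail of such a distribution recovers η, and hence the interaction cross section, which is the method used with hybrid Auger data.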
The absolute and relative fluxes of the main hadronic components of the cosmic
rays arriving at Earth are shown, respectively, in Figs. 10.5 and 10.6 and compared, in
this last figure, to the relative abundances existing in the Solar System. To understand
these figures, we should take into account the following facts:
• in nature, nuclei with an even number of nucleons are more stable, having higher
binding energy because of pairing effects;
• features of charged particle acceleration must depend on rigidity
Fig. 10.4 Ingredients of the Xmax distribution. Adapted from the Ph.D. thesis of R. Ulrich: “Mea-
surement of the proton–air cross section using hybrid data of the Pierre Auger Observatory,” http://
bibliothek.fzk.de/zb/berichte/FZKA7389.pdf
Fig. 10.5 Flux of the hadronic components of cosmic rays arriving at Earth for energies greater
than 2 GeV. From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001
R = r L Bc = pc/(ze)

where r L is the Larmor radius, and z the atomic number of the nucleus. Rigidity
is measured in volts (V).
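As a numerical illustration (the values assumed here are a 10¹⁸ eV proton in the ~1 µG galactic field quoted earlier), the Larmor radius follows directly from the rigidity:

```python
# Larmor radius of a cosmic-ray nucleus in a magnetic field, using
# r_L = R/(B*c) with the rigidity R = pc/(ze) expressed in volts (for an
# ultra-relativistic particle, R in volts is numerically E[eV]/z).
C = 2.998e8           # m/s
M_PER_PC = 3.086e16   # metres per parsec

def larmor_radius_pc(energy_eV, z, b_tesla):
    rigidity_V = energy_eV / z            # R = pc/(ze) in volts
    return rigidity_V / (b_tesla * C) / M_PER_PC

# A 10^18 eV proton in a ~1 microgauss (1e-10 T) galactic field:
r = larmor_radius_pc(1e18, 1, 1e-10)
print(f"r_L ≈ {r:.0f} pc")
```

The result, of order 1 kpc, is comparable to the thickness of the galactic disk; this is why, as noted below, the galactic magnetic field cannot confine cosmic rays above about 10¹⁸ eV.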
Besides a clear deficit of hydrogen and helium in the cosmic rays compared to
the composition of the Solar System, the main features from this comparison are
Fig. 10.6 Relative abundance of the main hadronic components present in the cosmic rays arriving
at Earth and in the Solar System. Both are normalized to the abundance of Si = 103 , and the relevant
energy range is a few hundreds MeV/nucleon. Credit: ACE archives, http://www.srl.caltech.edu/
ACE
the agreement on the “peaks” (more tightly bound even-Z nuclei) and the higher
abundances for cosmic rays in the “valleys.” These features can be explained within
a scenario where primary cosmic rays are produced in stellar endproducts (see later),
the “valley” elements being mainly secondaries produced in the interactions of the
primary cosmic rays with the interstellar medium (“spallation”).
10.1.1.2 Anisotropy
The arrival direction of charged cosmic rays is basically isotropic—a fact which can
be explained by the effect of the galactic magnetic field. Milagro, IceCube, HAWC,
ARGO, and the Tibet air shower array have, however, observed small anisotropies
(at the level of about one part per mil) in cosmic rays with energies above a few
TeV, in sky regions about 10° in size; they might be due to nearby sources.
The excesses, however, cannot be attributed to a common source—which means that
deeper studies are needed to understand them.
To accelerate particles up to the ultrahigh-energy (UHE) region above the EeV,
1018 eV, one needs conditions that are present in astrophysical objects such as the
surroundings of supermassive black holes in active galactic nuclei (AGN), or transient
high-energy events such as the ones generating gamma ray bursts (see later). Galactic
objects are not likely to be acceleration sites for particles of such energy, also because
UHECRs do not concentrate in the galactic plane; in addition, the galactic magnetic
field cannot confine UHECRs above 1018 eV within our galaxy.
A search for anisotropies in the UHE region has been performed by the Pierre
Auger and the Telescope Array collaborations. In the south galactic hemisphere the
Pierre Auger Collaboration reports a correlation of events above 57 EeV with the
Fig. 10.7 Energy ordered correlation of events above 57 EeV with the Véron-Cetty Véron (VCV)
AGN catalog as seen by the Pierre Auger Observatory up to June 2011. The red line represents the
expected correlation for an isotropic sky. Image credit: Pierre Auger Collaboration
Fig. 10.8 Top Positions of the AGN within 71 Mpc used in the correlation search by Auger (dots)
and circles centered at the arrival directions of the events with E > 57 EeV observed by Auger
in galactic coordinates. Bottom Map in equatorial coordinates (note the difference of coordinates
with respect to the plot on top) for the events above 57 EeV as seen by the Telescope Array
collaboration up to May 2014. The color scale represents the estimated significance. For details see
arXiv:1404.5890. Image credit: K Kawata/University of Tokyo Institute for Cosmic Ray Research
Fig. 10.9 Energy spectrum of e+ plus e− (except PAMELA data, for which electrons only are
plotted) multiplied by E 3 . The line shows 1 % of the proton spectrum. From K.A. Olive et al.
(Particle Data Group), Chin. Phys. C 38 (2014) 090001
direction of Centaurus A, at a distance of about 12 million light-years—one of the closest AGN, and the fifth brightest galaxy in the sky, hosting
a black hole of about 55 million solar masses and known to be a VHE gamma ray
emitter. In the northern hemisphere the Telescope Array collaboration observed for
events above 57 EeV a hotspot in the direction of the Ursa Major constellation with a
diameter of 30◦ –40◦ (Fig. 10.8, bottom). No statistically significant correlation with
AGNs from the VCV catalog was observed.
10.1.1.3 Electrons and Positrons

High-energy electrons and positrons have short propagation distances, as they lose
energy through synchrotron radiation and inverse Compton scattering while they
propagate through the Galaxy. Their spectrum is therefore expected to be dominated
by local electron accelerators or by the decay/interactions of heavier particles nearby.
Positrons in particular could be the signature of the decay of dark matter particles.
Given these considerations, the experimental data on the flux of electrons plus
positrons, shown in Fig. 10.9, are quite puzzling. First of all, the yield in the 100 GeV
region and above indicates the presence of nearby sources. Second, the ATIC balloon
experiment measured a bump-like structure at energies between 250 and 700 GeV.
A bump-like structure might indicate the presence of a heavy particle of definite
mass decaying into electrons and positrons. However, this finding was not confirmed
by later and more accurate instruments like the Fermi satellite Large Area Tracker
(Fermi-LAT) and AMS-02. The H.E.S.S. Cherenkov array also measured the electron
Fig. 10.10 Top The positron fraction in high-energy cosmic rays: measurement from the AMS-02,
PAMELA, and Fermi-LAT satellites. The AMS measurement confirms an excess in the high-energy
positron fraction, above what is expected from positrons produced in cosmic ray interactions, first
observed by PAMELA (the gray band indicates the expected range in the positron fraction, which
is based on standard cosmic ray models). Bottom Recently updated results by AMS-02. No sharp
structures are observed, besides a hint of a maximum at (275 ± 32) GeV
flux at high energies, finding indications of a cutoff above 1 TeV, but no evidence for
a peak.
The ratio as a function of energy between the flux of positrons and the total flux
of electrons plus positrons is shown in Fig. 10.10. This is even more intriguing: in
a matter-dominated Universe, one would expect this ratio to decrease with energy,
unless specific sources of positrons are present nearby; but data show an anomaly
called the “PAMELA anomaly.” If these sources are heavy particles decaying into
Fig. 10.11 Antiproton to proton ratio measured by AMS-02. Within the existing models of sec-
ondary production, the ratio lacks explanation
final states involving positrons, one could expect the ratio to increase, and then steeply
drop after reaching half of the mass of the decaying particle. If an astrophysical source
of high-energy positrons is present, a smooth spectrum is expected.
Data cannot give a definite answer yet. A possible decrease of the ratio at an
energy of about 275 GeV might have been observed (Fig. 10.10, bottom). The most
recent data on the abundance of high-energy pulsars nearby (see later) might justify
an astrophysical explanation. However, the origin from a new massive particle of
mass of the order of 1 TeV is not excluded yet. New data will tell more. Recent
results by AMS-02 on the antiproton yield are particularly intriguing.
10.1.1.4 Antiprotons
Data are shown in Fig. 10.11. The antiproton to proton ratio stays constant from 20 to
400 GeV. This behavior cannot be explained by secondary production of antiprotons
from ordinary cosmic ray collisions. At variance with the excess of positrons, the
excess of antiprotons cannot easily be explained as being of pulsar origin. More study is
needed, and this is certainly one of the next frontiers.
Most charged particles at the top of the atmosphere are protons and other nuclei;
however, the interaction with the atoms of the atmosphere itself dramatically changes
the composition. Secondary muons, photons, electrons/positrons, and neutrinos are
produced by the interaction of charged cosmic rays in air, in addition to less stable
particles. Note that the nucleon composition also changes dramatically: neutrons,
which are about 10 % of the nucleons at the top of the atmosphere, become roughly
1/3 at the Earth's surface.
Primary muons can hardly reach the Earth from outside the atmosphere due to their
lifetime (τ ≃ 2.2 µs); this lifetime is, however, large enough that secondary muons
produced in the atmosphere can reach the Earth's surface, offering a wonderful
example of time dilation: the distance traveled on average by such particles is
L ≃ cγτ , and already for γ ∼ 50 (i.e., an energy of about 5 GeV) they can travel
30 km, which roughly corresponds to the atmospheric depth. Muons lose about
2 GeV by ionization when crossing the atmosphere.
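A minimal numerical check of the time-dilation argument (the 15 km production height is an assumed typical value, and continuous ionization losses are ignored):

```python
import math

# Time-dilation check for atmospheric muons: a muon of energy E has
# gamma ≈ E/m_mu and travels on average L = gamma*c*tau before decaying
# (ionization losses are ignored in this sketch).
M_MU_GEV = 0.1057     # muon mass in GeV
C_TAU_M = 658.6       # c*tau for the muon, in metres

def decay_length_km(energy_GeV):
    return (energy_GeV / M_MU_GEV) * C_TAU_M / 1000.0

def survival_probability(energy_GeV, path_km):
    return math.exp(-path_km / decay_length_km(energy_GeV))

L = decay_length_km(5.0)              # gamma ≈ 47: about 31 km
p = survival_probability(5.0, 15.0)   # assumed production height ~15 km
print(f"decay length ≈ {L:.0f} km, survival over 15 km ≈ {p:.2f}")
```

Without the Lorentz factor the same muon would travel only cτ ≈ 0.66 km, and essentially none would reach the ground.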
Charged particles at sea level are mostly muons (see Fig. 10.12), with a mean
energy of about 4 GeV.
The flux of muons from above with energy beyond 1 GeV at sea level is about 60
m⁻² s⁻¹ sr⁻¹. A horizontal detector sees roughly one muon per square centimeter
per minute. The zenith angular distribution for muons of E ∼ 3 GeV is ∝ cos²θ,
being steeper at lower energies and flatter at higher energies: low-energy muons at
large angles decay before reaching the surface. The excess of μ⁺ over μ⁻ is due
to the fact that there are more π⁺ than π⁻ in the proton-initiated showers; there are
about 30 % more μ⁺ than μ⁻ at momenta above 1 GeV/c.
Among known particles, only muons and neutrinos reach significant depths
underground. The muon flux is reduced to 10⁻² m⁻² s⁻¹ sr⁻¹ under 1 km of water
equivalent (corresponding to about 400 m of average rock) and becomes about
10⁻⁸ m⁻² s⁻¹ sr⁻¹ at 10 km of water equivalent.
10.1.2 Photons
Fig. 10.13 Spectrum of the photons experimentally detected at different energies. From M.T.
Ressell and M.S. Turner 1989, “The grand unified photon spectrum: a coherent view of the diffuse
extragalactic background radiation,” [Batavia, Ill.]: Fermi National Accelerator Laboratory. http://
purl.fdlp.gov/GPO/gpo48234
Fig. 10.14 Spectrum of the total extragalactic gamma ray emission measured by the Fermi-LAT.
From M. Ackermann et al., The Astrophysical Journal 799 (2015) 86
(Fig. 10.14). A cutoff at energies close to 1 TeV might be explained by the absorption
of higher-energy photons on the background photons near the visible which populate
the intergalactic medium—through creation of e⁺e⁻ pairs, see later.
There is little doubt about the existence of the so-called ultra- and extremely-high-
energy photons (respectively in the PeV–EeV and in the EeV–ZeV range), but so far
cosmic gamma rays have been unambiguously detected only in the low- (MeV), high-
(GeV) and very-high- (TeV) energy domains; the behavior above some 30 TeV is
extrapolated from data at lower energies and constrained by experimental upper limits.
The data related to the diffuse (i.e., unassociated with sources) flux do not go beyond
1 TeV (Fig. 10.14).
In Chap. 4 we have defined as high energy (HE) the photons above 30 MeV—i.e.,
the threshold for the production of e+ e− pairs plus some phase space; as very high
energy (VHE) the photons above 30 GeV. The HE—and VHE in particular—regions
are especially important in relation to the physics of cosmic rays and to fundamental
physics. One possible origin of HE gamma rays is indeed production as a secondary
product in conventional scenarios of acceleration of charged particles; in this case
cosmic gamma rays are a probe into cosmic accelerators. The VHE
domain is sensitive to energy scales important for particle physics. One is the 100
GeV scale expected for cold dark matter. A second energy scale is the TeV scale,
where supersymmetric particles might appear. Finally, it might be possible to access
the unification scale and the Planck scale: an energy ∼10¹⁹ GeV, corresponding to
a mass √(ℏc/G)—which is, apart from factors of order ∼1, the mass of a black hole
whose Schwarzschild radius equals its Compton wavelength.
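As a quick order-of-magnitude check, a few lines of Python reproduce the Planck scale quoted above from the fundamental constants:

```python
import math

# Order-of-magnitude check of the Planck scale: the Planck mass
# sqrt(hbar*c/G) expressed as an energy in GeV.
HBAR = 1.0546e-34     # J s
C = 2.998e8           # m/s
G = 6.674e-11         # m^3 kg^-1 s^-2
EV = 1.602e-19        # J per eV

m_planck_kg = math.sqrt(HBAR * C / G)
e_planck_GeV = m_planck_kg * C**2 / EV / 1e9
print(f"Planck mass ≈ {e_planck_GeV:.2e} GeV")
```

The result, about 1.2 × 10¹⁹ GeV, is the energy scale at which gravitational and quantum effects become comparable.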
Gamma rays provide at present the best window into the nonthermal Universe:
the “hottest” thermalized processes observed up to now, in the accretion regions
of supermassive black holes, reach a temperature scale of about 10⁵ K, which corresponds to
Fig. 10.15 Sources of VHE emission plotted in galactic coordinates. The background represents the
high-energy gamma rays detected by Fermi-LAT. The region near the Galactic Center is enlarged.
From http://tevcat.uchicago.edu/, May 2015
a few tens of eV—the X-ray region. Tests of fundamental physics with gamma rays
are much beyond the reach of terrestrial accelerators.
Besides the interest for fundamental physics, the astrophysical interest of HE and
VHE photons is evident: for some sources such as the AGN—supermassive black
holes in the center of galaxies, powered by infalling matter—the total power emitted
above 100 MeV dominates the electromagnetic dissipation.
A look at the sources of cosmic gamma rays in the HE region shows a diffuse
background, plus a set of localized emitters. Roughly, 3000 HE emitters have been
identified up to now, mostly by the Fermi-LAT, and some 200 of them are VHE
emitters as well (Fig. 10.15).
About half of the gamma ray emitters are objects in our Galaxy; at VHE most of
them can be associated with supernova remnants (SNR), while at MeV to GeV energies
they are mostly pulsars. The remaining half are extragalactic; the angular resolution
of present detectors (slightly better than 0.1°) is not good enough to associate them
with particular points in the host galaxies, but we believe they are produced in the
vicinity of the supermassive black holes in the centers of the galaxies—see later.
The strongest steady emitters are galactic objects; this can be explained by the fact
that, being closer, they suffer a smaller attenuation. The observed strongest steady
emitter at VHE is the Crab Nebula. The energy distribution of the photons from the
Crab Nebula, shown in Fig. 10.16, is typical for gamma ray sources (see the
explanation of the “double-hump” structure in Sect. 10.2.2.1).
Fig. 10.16 Spectral energy distribution of the Crab Nebula (data from radio-optical up to 100 TeV).
From Yuan, Yin et al., arXiv:1109.0075
Among cosmic rays, gamma rays are important not only because they point to the
sources, but also because the sensitivity of present instruments is such that transient
events (in jargon, “transients”) can be recorded. Sources of HE and VHE gamma
rays (some of which might likely be also sources of charged cosmic rays, neutrinos
and other radiation) were indeed discovered to exhibit transient phenomena, with
timescales from a few seconds to a few days.
The sky exhibits in particular transient events from steady emitters (“flares”)
and bursts of gamma rays from previously dark regions (“gamma ray bursts”). The
phenomenology of such events is described in the rest of this section.
Short timescale variability has been observed in the gamma ray emission at high
energies for several astrophysical objects, both galactic and extragalactic, in particular
binary systems and AGN. For binary systems the variability is quasiperiodical and
can be related to the orbital motion, while for AGN it must be related to some
cataclysmic event; this is the phenomenon of flares. Flares observed from the Crab
Nebula have, as of today, no universally accepted interpretation.
Flares. Flares are characteristic mostly of extragalactic emitters (AGN). Among
galactic emitters, the Crab Nebula, which was for a long time used as a “standard
candle” in gamma ray astrophysics, has recently been discovered to be subject to dramatic
flares on timescales of ∼10 h, with transient emission briefly dominating the flux
Fig. 10.17 Variability in the very-high-energy emission of the blazar PKS 2155-304. The dotted
horizontal line indicates the flux from the Crab Nebula (from the H.E.S.S. experiment, http://www.
mpi-hd.mpg.de/hfm/HESS)
from this object, which has a diameter of 10 light-years—the diameter of the shell
including the pulsar, remnant of the imploded star—corresponding to roughly 0.1°
as seen from Earth.
Very short timescale emission from blazars has also been observed in the TeV
band, the most prominent example being at present the flare of the AGN PKS 2155-304
shown in Fig. 10.17: a factor >100 flux increase with respect to the quiescent state,
with variability on timescales close to 1 min. Note that the Schwarzschild radius of
the black hole powering PKS 2155-304 is about 10⁴ light-seconds (corresponding to
10⁹ solar masses), which has implications for the mechanisms of emission of gamma
rays (see later).
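The size argument can be made quantitative with a short sketch; the numbers are taken from the text, and the redshift correction is neglected:

```python
# Causality limits the size of a region varying on a timescale dt to
# R ≲ c*dt*delta, with delta the relativistic Doppler factor of the emitting
# plasma. Inverting: a region of size R needs delta ≳ R/(c*dt) to vary that
# fast. Working in light-seconds makes c drop out of the formula.

def min_doppler_factor(r_light_seconds, dt_seconds):
    """Minimum Doppler factor for a region of size r (in light-seconds) to
    show variability on an observed timescale dt (redshift neglected)."""
    return r_light_seconds / dt_seconds

# PKS 2155-304 (numbers from the text): black-hole scale ~1e4 light-seconds,
# observed variability ~1 minute.
delta = min_doppler_factor(1.0e4, 60.0)
print(f"delta ≳ {delta:.0f}")
# A boost this large for the whole system is implausible: the TeV emission
# must come from a much smaller, relativistically moving region of the jet.
```

This is the kind of implication for the emission mechanism alluded to above: the fast variability points to a compact, strongly beamed emission region rather than the whole accretion system.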
Indeed the gamma ray sky looks like a movie rather than a picture, the most
astonishing phenomenon being the explosion of gamma ray bursts.
Gamma Ray Bursts. Gamma Ray Bursts (GRBs) are extremely intense and fast
shots of gamma radiation. They last from fractions of a second to a few seconds
and sometimes up to a thousand seconds, often followed by “afterglows” orders
of magnitude less energetic than the primary emission after minutes, hours, or even
days. GRBs are detected about once per day on average, typically in X-rays and soft gamma
rays. They are named GRByymmdd after the date on which they were detected: the
first two numbers after “GRB” correspond to the last two digits of the year, the second
two numbers to the month, and the last two numbers to the day. A progressive letter
(“A,” “B,” ...) might be added—it is mandatory if more than one GRB was discovered
in the same day, and it became customary after 2010.
Their position appears random in the sky (Fig. 10.18), which suggests that they
are of extragalactic origin. A few of them per year have fluxes and energies large
enough that the Fermi-LAT can detect them (photons of the order of a few tens
of GeV have been detected in a few of them). Also in this case the sources appear to
be isotropic.
Fig. 10.18 Skymap of the GRBs located by the GRB monitor of Fermi and by the Fermi-LAT.
Some events also seen by the Swift satellite are also shown. Credit: NASA
The energy spectrum is nonthermal and varies from event to event, peaking at
around a few hundred keV and extending up to several GeV. It can be roughly
fitted by a phenomenological function (a smoothly broken power law) called the “Band
spectrum” (from the name of David Band, who proposed it). The change of spectral
slope, from a typical value of −1 to a typical value of −2, occurs at a break energy E b
which, for the majority of observed bursts, is in the range between 0.1 and 1 MeV.
Sometimes HE photons are emitted in the afterglows.
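A sketch of the Band function with typical (assumed, not fitted) parameter values; the normalization of the high-energy branch is fixed by requiring continuity at the break:

```python
import math

# Sketch of the Band function: a smoothly broken power law with low-energy
# index alpha, high-energy index beta and e-folding energy E0; the break sits
# at (alpha - beta)*E0. Parameter values are typical round numbers, not a fit
# to any particular burst.

def band(E_keV, A=1.0, alpha=-1.0, beta=-2.0, E0_keV=300.0):
    """Photon spectrum N(E) in arbitrary units, pivot energy 100 keV."""
    e_break = (alpha - beta) * E0_keV
    if E_keV < e_break:
        return A * (E_keV / 100.0) ** alpha * math.exp(-E_keV / E0_keV)
    return (A * (e_break / 100.0) ** (alpha - beta) * math.exp(beta - alpha)
              * (E_keV / 100.0) ** beta)

# The normalization of the high-energy branch makes the two pieces join
# smoothly at the break energy (here 300 keV):
lo, hi = band(300.0 * 0.999), band(300.0 * 1.001)
print(f"continuity at the break: {lo:.4f} vs {hi:.4f}")
```

With these indices the function falls as E⁻¹ times an exponential below the break and as E⁻² above it, reproducing the typical slopes quoted above.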
During fractions of a second, their energy emission in the gamma ray band can exceed
the energy flux of the rest of the Universe in the same band. The time-integrated
fluxes range from about 10⁻⁷ to about 10⁻⁴ erg/cm². If the emission were
isotropic, the energy output would on average amount to a solar rest-mass energy,
about 10⁵⁴ erg; however, if the mechanism is similar to the one in AGN, the emission
should be beamed,² with a typical jet opening angle of a few degrees. Thus the actual
average energy yield in γ rays should be ∼10⁵¹ erg. This value can be larger than
the energy content of a typical supernova explosion, of which only 1% emerges as
visible photons (over a time span of thousands of years).
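This energy bookkeeping can be sketched numerically (the fluence, the luminosity distance for z ∼ 1, and the 5° jet half-opening angle are illustrative values, not a fit to any particular burst):

```python
import math

fluence = 1e-4      # time-integrated flux, erg/cm^2 (upper end of observed range)
d_L = 2.1e28        # luminosity distance for z ~ 1, in cm (~6.8 Gpc)

# Isotropic-equivalent energy: E_iso = 4*pi*d_L^2 * fluence
E_iso = 4 * math.pi * d_L**2 * fluence
print(f"E_iso ~ {E_iso:.1e} erg")          # ~5e53 erg, of order a solar rest mass

# Beaming correction for a jet of half-opening angle ~5 degrees
theta_j = math.radians(5)
f_beam = 1 - math.cos(theta_j)             # ~ theta_j**2 / 2
E_gamma = E_iso * f_beam
print(f"E_gamma ~ {E_gamma:.1e} erg")      # ~2e51 erg
```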
The distribution of their durations is bimodal (Fig. 10.19), allowing a first phe-
nomenological classification into “short” GRBs (lasting typically 0.3 s; the duration
is usually defined as the time T₉₀ during which 90% of the photons are detected)
and “long” GRBs (lasting more than 2 s, and typically 40 s). Short GRBs are on
average harder than long GRBs.
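The T₉₀ definition can be sketched with a minimal estimator on a toy photon list (real analyses work on background-subtracted light curves; the exponential pulse below is purely illustrative):

```python
import numpy as np

def t90(times: np.ndarray) -> float:
    """Duration containing the central 90% of detected photons:
    time between the 5% and 95% quantiles of the arrival-time distribution."""
    t = np.sort(times)
    n = len(t)
    lo = t[int(0.05 * n)]
    hi = t[int(0.95 * n) - 1]
    return hi - lo

# Toy burst: photon arrival times drawn from an exponentially decaying pulse
rng = np.random.default_rng(0)
times = rng.exponential(scale=10.0, size=10_000)  # seconds
print(f"T90 ~ {t90(times):.1f} s")
```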
2 Accretion disks, in particular in AGN, sometimes produce two opposite collimated jets, with a fast
outflow of matter and energy from close to the disk. The direction of the jets is determined by the
rotational axis of the accretion disk and/or of the black hole. The resolution of astronomical instru-
ments is in general too poor, especially at high energies, to resolve the jet morphology in gamma rays,
so observations cannot yet constrain the emission mechanism. The limited experimental information
available comes from the radio waveband, where very-long-baseline interferometry can image at
sub-parsec scales the synchrotron emission near the black hole—but radiation should be present
from the radio through to the gamma ray range.
10.1 The Data 555
GRBs are generally far away, typically at z ∼ 1 and beyond (Fig. 10.20). The
farthest event detected so far is a 10-s-long GRB at z ≃ 8.2, called GRB090423,
observed by the Swift satellite (the burst alert instrument of Swift being sensitive to
energies up to 0.35 MeV).
Short GRBs are very difficult to associate with known objects. For long GRBs, in
several cases the emission has been associated with the formation of a supernova,
presumably of very high mass (a “hypernova”). Possible mechanisms for GRBs will
be discussed later.
A historical curiosity: the first GRB was discovered in 1967 by one of the US
satellites of the Vela series, but the discovery was kept secret for six years. The
Vela satellites had been launched to verify whether the Soviet Union was complying
with the nuclear test ban treaty, which forbade the testing of nuclear devices in space.
After the observation of the GRB, it took some time to make sure that the event was
of astrophysical origin. Unfortunately, we do not know anything about possible
similar discoveries by the Soviet Union.
Binary Systems. Binary stars (i.e., pairs of stars bound by gravitational interaction)
are frequent in the Universe: most solar-size and larger stars reside in binaries. Binary
systems in which one object is compact (a pulsar, a neutron star, or a black hole)
have been observed to be periodic emitters of gamma radiation.
A particular class of binary systems are the microquasars: binary systems compris-
ing a black hole, which exhibit relativistic jets (morphologically, they are similar to
AGN). In quasars, the accreting object is a supermassive black hole (millions to
billions of solar masses); in microquasars, the mass of the compact object is only a
few solar masses.
Since the angular resolution of the Fermi-LAT and of the Cherenkov telescopes is of
the order of 0.1°, we can image diffuse structures only in the Milky Way: other
galaxies mostly appear point-like. Morphology studies at VHE are thus basically
limited to structures within our galaxy.
The morphology of SNRs is in particular one of the keys to understanding physics in the
vicinity of matter at high density—and one of the tools to understand the mechanism
of acceleration of cosmic rays. Sometimes SNRs and the surrounding regions are
too large to be imaged by Cherenkov telescopes, which typically have fields of view
of 3°–4°. A large field of view is also essential to understand the nature of the primary
accelerators in pulsar wind nebulae (PWN), as discussed in Sect. 10.3.1.2: it would
be important to measure the energy spectrum as a function of the angular distance
to the center of the pulsar, in order to separate hadronic acceleration from leptonic
acceleration. The highest energy electrons lose energy quickly as they propagate
away from the source; this is not true for protons.
Intermediate emission structures, a few degrees in radius, which can be attributed
to the diffusion of protons within the interstellar medium, have been observed by
MILAGRO and ARGO.
A surprising discovery by Fermi-LAT was the existence of a giant structure emit-
ting photons in our Galaxy, with size comparable to the size of the Galaxy itself:
the so-called Fermi bubbles. These two structures, about 50,000 light-years across
(Fig. 10.21), have quite sharp boundaries and emit rather uniformly in space, with an
energy spectrum peaking at a few GeV but still yielding a sizable amount of energy
up to 20 GeV.
Although the parts of the bubbles closest to the galactic plane shine in microwaves
as well as gamma rays, about two-thirds of the way out the microwaves fade and
only gamma rays are detectable.
Possible explanations of such a large structure are related to the past activity of the
black hole in the center of our Galaxy. A large-scale structure of the magnetic field
in the bubble region might indicate an origin from the center of the Galaxy, where
magnetic fields are of the order of 100 µG, and might also explain the mechanism of
emission as synchrotron radiation from trapped electrons. However, this explanation
is highly speculative, and as of today the reason for the emission is unknown.
10.1.3 Neutrinos
Neutrinos play a special role in particle astrophysics. Their cross section is very small,
so they can leave production sites without interacting: unlike photons, neutrinos can
carry information about the cores of the astrophysical objects that produce them.
Also unlike photons, they suffer practically no absorption during their cosmic
propagation.
1. Neutrinos can be produced in the nuclear reactions generating energy in stars.
For example, the Sun emits about 2 × 10³⁸ neutrinos/s (see Chap. 9).
2. They should also be produced in the most violent phenomena, including the Big
Bang, supernovae, and accretion onto supermassive black holes.
3. They are the main output of the cooling of astrophysical objects, including neutron
stars and red giants.
4. They are produced as secondary by-products of cosmic ray collisions:
• with photons or nuclei near the acceleration regions (these are “astrophysical”
neutrinos, like those in items 2 and 3);
• with the CMB in the case of ultrahigh-energy cosmic rays suffering the
GZK effect (these are called cosmogenic neutrinos, or also “GZK neutrinos,”
although the mechanism was proposed by Berezinsky and Zatsepin in 1969);
• and also in the atmosphere (they are called atmospheric neutrinos).
5. Finally, they are likely to be present in the decay chain of unstable massive
particles, or in the annihilation of pairs of particles like dark matter particles.
Sources 2, 4, and 5 in the list above are also common to photons; however, the
detection of astrophysical neutrinos could help constrain the properties of the primary
cosmic ray spectrum more effectively than high-energy photons can. Neutrinos produced
by reactions of ultrahigh-energy cosmic rays can provide information on otherwise
inaccessible cosmic accelerators.
Experimental data on neutrinos are, however, scarce: the small cross section makes
detection difficult, and a detector sensitive to astrophysical fluxes must have a very
large instrumented volume.
On February 24, 1987, a supernova was observed in the Large Magellanic Cloud
(LMC), a small satellite galaxy of the Milky Way (about 10¹⁰ solar masses, i.e., 1%
of the Milky Way) at a distance of about 50 kpc from the Earth. As it was the first
supernova observed in 1987, it was called SN1987A; it was also the first supernova
since 1604 visible with the naked eye. The event was associated with the collapse of
the star Sanduleak −69 202, a main sequence star of mass about 20 solar masses.
No very high-energy gamma emission was detected (in 1987 VHE gamma detectors
were not operating), but gamma rays at the intermediate energies characteristic of
nuclear gamma transitions could be recorded.
Three hours before the optical detection, a burst of neutrinos was observed on
Earth. SN1987A was the first (and so far the only) neutrino source detected other than
the Sun: the water Cherenkov detectors Kamiokande-II and Irvine–Michigan–
Brookhaven (IMB), and the Baksan scintillator detector, observed 12, 8, and 5 neu-
trino interaction events, respectively, within a 13 s interval. This time interval is con-
sistent with the estimated duration of a core collapse. The energy of the neutrinos,
inferred from the energy of the recoil electrons, was in the tens-of-MeV range,
consistent with an origin from a collapse.
During the last years, IceCube reported the observation of a few neutrinos with
deposited energy above 1 PeV and of a population of several tens of events above 30
TeV; this significantly exceeds the expected atmospheric background (above 100
TeV the expected atmospheric neutrino background falls to the level of one event per
year; above 30 TeV, the background is expected to be about 40%): the excess must
be of astrophysical or cosmogenic origin.
To identify any bright neutrino sources in the data, IceCube passed the detected
neutrinos through a clustering algorithm (Fig. 10.23) and also searched for
directional correlations with the TeV gamma ray sources of Fig. 10.15. No significant
evidence of clustering or correlation was found when comparing the significance
with random datasets of the same statistics.
The overall high-energy spectrum is shown in Fig. 10.24.
Fig. 10.23 Arrival directions of the IceCube events above 30 TeV in galactic coordinates (numbers
correspond to the numbering in IceCube’s catalog). Shower-like events (angular resolution ∼15°)
are marked with “+” and those containing muon tracks (angular resolution ∼1°) with “x.” The gray
line denotes the equatorial plane. Colors show the test statistic (TS) for the point source clustering
test at each location. No significant clustering was observed. Credit: IceCube collaboration
The graviton, a massless spin-2 particle (the spin-2 assignment is required by the
fact that gravity is attractive only), is the proposed mediator of any field theory of
gravity, but it has not been detected yet. Indeed, the coupling of the graviton to matter
is predicted to be extremely weak, and thus its direct detection is extremely difficult.
However, indirect evidence for gravitational radiation was clearly demonstrated by
Hulse and Taylor, who discovered in 1974 the binary pulsar PSR 1913+16. In this
system, composed of two neutron stars, it was possible to deduce, from the precise
times of arrival of the recorded pulses, all the binary orbital parameters, and thus to
verify that the orbital period is decreasing in agreement with the prediction of
Einstein’s general theory of relativity (by about 40 s in 30 years; see Fig. 10.25).
The total energy of the system is thus decreasing.
Gravitational waves, which carry away the lost energy, must then exist; their detec-
tion will open a new and unique channel to observe the Universe, complementary
to those described in the previous sections.
10.2 How Are High-Energy Cosmic Rays Produced? 561
Charged cosmic rays ejected by the various possible astrophysical sources may be
accelerated in regions of space with strong turbulent magnetic fields. Static magnetic
fields are not a good candidate, since they cannot accelerate particles, and static
electric fields would be quickly neutralized; variable magnetic fields, however,
induce variable electric fields and can thus accelerate particles, provided these
undergo many acceleration cycles.
In 1949 Fermi proposed a mechanism in which particles are accelerated by
stochastic collisions; this mechanism can model acceleration in shock waves,
such as those associated with the remnant of a gravitational collapse (for example a
stellar collapse but also, as we know today, the surroundings of a black hole accreting
at the center of a galaxy). Let us consider (see Fig. 10.26) a charged particle with
energy E₁ (velocity v) in the “laboratory” frame, scattering inside a shock wave, i.e.,
a moving boundary between regions of different density.
The cloud has a velocity β = V /c, and θ1 and θ2 are the angles between, respec-
tively, the initial and final particle momentum and the cloud velocity.
The energy E₁* of the particle in the cloud reference frame is given by (neglecting
the particle mass with respect to its kinetic energy)

E₁* = γ E₁ (1 − β cos θ₁) .

Inside the cloud the particle suffers a great number of collisions, so its output angle
is essentially random: it keeps no memory of the input direction. Its energy E₂ back
in the laboratory frame after leaving the cloud is

E₂ = γ E₂* (1 + β cos θ₂) ,

with E₂* = E₁* (the scatterings inside the cloud are elastic). Averaging over the
random exit angle gives ⟨cos θ₂⟩ = 0, and the fractional energy change per encounter is

ΔE/E₁ = (1 − β cos θ₁) / (1 − β²) − 1 .
The probability P of a collision between the cosmic ray and the cloud is not
constant as a function of the relative angle θ₁: it is proportional to their relative
velocity (it is more probable for a particle to hit a cloud coming toward it than one
moving away from it),

P ∝ (v − V cos θ₁) ∝ (1 − β cos θ₁) ,

and thus

⟨cos θ₁⟩ = [∫₋₁¹ cos θ₁ (1 − β cos θ₁) d cos θ₁] / [∫₋₁¹ (1 − β cos θ₁) d cos θ₁] = −β/3 .
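Combining this flux-weighted average with ⟨cos θ₂⟩ = 0 in the energy-change formula above yields the classic second-order result: a mean fractional gain per encounter of order β². A numerical sketch (the value of β is illustrative):

```python
import numpy as np

beta = 0.01                        # cloud velocity V/c (illustrative)

# Flux-weighted average of cos(theta1): weight proportional to (1 - beta*cos)
c1 = np.linspace(-1.0, 1.0, 200_001)
w = 1.0 - beta * c1
mean_c1 = np.sum(c1 * w) / np.sum(w)            # -> -beta/3

# Mean fractional energy gain per encounter, using <cos(theta2)> = 0
gain = (1.0 - beta * mean_c1) / (1.0 - beta**2) - 1.0
print(mean_c1, gain)               # gain ~ (4/3) * beta**2: second order in beta
```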
In the shock wave rest frame, the medium ahead of the shock (upstream) runs
into the shock front with velocity u₁, while the shocked gas (downstream) moves
away with velocity u₂ (Fig. 10.28). Thus in the laboratory frame a particle going
from upstream to downstream meets a high-density magnetized gas travelling
with V = u₁ − u₂ in a head-on collision. The particle may suffer a great number of
elastic scatterings inside this medium (due to the turbulent magnetic field) and invert
the direction of its initial velocity, crossing the shock front multiple times.
The angle θ₁ (θ₂) between the particle’s initial (final) velocity and the shock velocity
(see Fig. 10.26) is now constrained by this specific geometry: −1 ≤ cos θ₁ ≤ 0
(0 ≤ cos θ₂ ≤ 1); on the other hand, the probability of crossing the wave front is
proportional to cos θ₁ (cos θ₂).
The mean values are

⟨cos θ₁⟩ = [∫₋₁⁰ cos²θ₁ d cos θ₁] / [∫₋₁⁰ cos θ₁ d cos θ₁] = −2/3

⟨cos θ₂⟩ = [∫₀¹ cos²θ₂ d cos θ₂] / [∫₀¹ cos θ₂ d cos θ₂] = +2/3 .
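These geometry-constrained averages, inserted into the energy-change formula of the previous section, give a mean gain per cycle that is first order in β—hence the name first-order Fermi acceleration. A numerical sketch (β illustrative):

```python
import numpy as np

beta = 0.01                      # shock velocity in units of c (illustrative)

# Upstream -> downstream crossing: -1 <= cos(theta1) <= 0, weight ~ |cos|
c1 = np.linspace(-1.0, 0.0, 100_001)
mean_c1 = np.sum(c1 * (-c1)) / np.sum(-c1)      # -> -2/3

# Downstream -> upstream crossing: 0 <= cos(theta2) <= 1, weight ~ cos
c2 = np.linspace(0.0, 1.0, 100_001)
mean_c2 = np.sum(c2 * c2) / np.sum(c2)          # -> +2/3

# Mean fractional gain per complete back-and-forth cycle (independent angles)
gain = (1.0 - beta * mean_c1) * (1.0 + beta * mean_c2) / (1.0 - beta**2) - 1.0
print(mean_c1, mean_c2, gain)    # gain ~ (4/3) * beta: first order in beta
```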
Each cycle increases the energy on average by a factor (1 + ε), where ε ≡ ⟨ΔE/E⟩;
after n cycles a particle injected with energy E₀ reaches E = E₀(1 + ε)ⁿ, so the
required number of cycles is

n = ln(E/E₀) / ln(1 + ε) . (10.5)
On the other hand, at each cycle a particle may escape from the shock region with
some probability P_i (which is proportional to the velocity V); the probability P_{E_n}
that a particle escapes from the shock region with an energy greater than or equal
to E_n is then

P_{E_n} = P_i Σ_{j=n}^∞ (1 − P_i)^j = (1 − P_i)^n . (10.6)

Taking logarithms and using Eq. 10.5,

ln P_{E_n} = n ln(1 − P_i) = [ln(E/E₀) / ln(1 + ε)] ln(1 − P_i)

ln P_{E_n} = [ln(1 − P_i) / ln(1 + ε)] ln(E/E₀) .
Then

N/N₀ = P_{E_n} = (E/E₀)^(−α)

and

dN/dE ∝ (E/E₀)^(−γ) (10.7)
with

α = −ln(1 − P_i) / ln(1 + ε) ≃ P_i / ε (10.8)

and

γ = α + 1 . (10.9)

The Fermi mechanism thus predicts a power-law energy spectrum with an almost
constant index (both ε and P_i are proportional to β).
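The interplay between the gain per cycle and the escape probability can be illustrated with a toy Monte Carlo (the values of the gain ε and of the escape probability P_i below are arbitrary, chosen only to make the power law visible):

```python
import numpy as np

# Toy Monte Carlo of Fermi acceleration: each cycle multiplies the energy by
# (1 + eps); after each cycle the particle escapes with probability p_esc.
rng = np.random.default_rng(1)
eps, p_esc = 0.1, 0.05

# Cycles completed before escape: P(n = j) = p_esc * (1 - p_esc)**j
n = rng.geometric(p_esc, size=200_000) - 1
energies = (1 + eps) ** n                     # E/E0 after n cycles
print(f"max simulated E/E0: {energies.max():.1f}")

# Predicted integral spectral index
alpha_pred = -np.log(1 - p_esc) / np.log(1 + eps)

# Measured index from the integral spectrum: N(>E_k)/N = (1-p_esc)**k = E_k**-alpha
k = 20
frac = np.mean(n >= k)
alpha_mc = -np.log(frac) / (k * np.log(1 + eps))
print(alpha_pred, alpha_mc)                   # differential index: gamma = alpha + 1
```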
In the case of a supersonic shock, α is predicted to be around 1 (γ ∼ 2). However,
the spectrum detected at Earth is steeper: in its long journey from the Galactic
sources to the Earth, the probability that a particle escapes from the Galaxy grows
with its energy (see Sect. 10.4.1), so that

dN/dE |_Earth ∝ (dN/dE)|_sources × E^(−δ) ∝ (E/E₀)^(−γ−δ) .

Indeed, the experimental values of the observed spectral index, as discussed in
Sect. 10.1.1, are between 2.7 and 3.3; from these, γ at the sources is estimated to be
between 2.0 and 2.3.
SNRs, through the Fermi first-order acceleration mechanism, are nowadays com-
monly recognized as responsible for most of the high-energy cosmic rays. However,
the proof that this mechanism can accelerate cosmic rays all the way up to the knee
region is still missing.
The maximum energy that a charged particle can achieve in the supernova remnant
is then simply the rate of energy gain times the time T_S spent in the shock. In the
Fermi first-order model,

dE/dt ≃ β E / T_cycle ,

where T_cycle ≃ λ_cycle/(βc) is the time between two crossings. Since λ_cycle ≃ r_L ≃
E/(Z e B) (r_L is the Larmor radius),

T_cycle ≃ E / (Z e B β c) ⟹ dE/dt ≃ (β² c) Z e B .

Finally, with R_S ≃ β c T_S the radius reached by the shock,

E_max ≃ T_S (dE/dt) ≃ Z e B R_S β .
Fig. 10.29 The interpretation of the knee as due to the dependence of the maximum energy on
the nuclear charge Z. The flux of each nuclear species sharply decreases above a given cutoff. The
behavior of hydrogen, silicon (Z = 14), and iron (Z = 26) nuclei is depicted in the figure. From
[F10.1]
(since in the rest frame of the particle the acceleration is small, we can use the
nonrelativistic expression for the radiation rate).
It is immediately evident from Eq. 10.11 that synchrotron energy loss is by far
more important for electrons than for protons.
Compton scattering and “inverse Compton.” Compton scattering is a relativistic
effect by which the frequency of a photon changes due to a scattering off an electron.
In the scattering of a photon of angular frequency ω by an electron at rest, the
wavelength shift of the photon can be expressed as

(λ′ − λ)/λ = (ℏω / m_e c²) (1 − cos α) ,

where α is the angle of the photon after the collision with respect to its line of flight.
As is evident from the equation and from the physics of the problem, the energy of
the scattered photon always decreases in this configuration.
However, when low-energy photons collide with high-energy electrons rather
than with electrons at rest, their energy can increase: such a process is called inverse
Compton (IC) scattering. This mechanism is very effective for increasing the photon
energy (for this reason it is called “inverse”), and is important in regions of high
soft-photon energy density and energetic electron number density.
Synchrotron Self-Compton. The simplest purely leptonic mechanism we can draw
for photon “acceleration”—a mechanism we have seen at work in astrophysical
objects—is the so-called synchrotron self-Compton (SSC) mechanism. In the SSC,
ultrarelativistic electrons accelerated in a magnetic field—such as the field present
in the accretion region of an AGN, or in the surroundings of a SNR—generate
synchrotron photons. The typical values of the fields involved are such that the
synchrotron photons have an energy spectrum peaked in the infrared/X-ray range.
Such photons in turn interact via Compton scattering with
their own parent electron population (Fig. 10.30); since the electrons are ultrarelativis-
tic (with Lorentz factors γ_e ∼ 10⁴–10⁵), the energy of the upscattered photon can be
boosted by a large factor.
For a power-law population of relativistic electrons with differential spectral
index q and a blackbody population of soft photons at a temperature T, mean pho-
ton energies and energy distributions can be calculated for electron energies in the
Thomson regime and in the relativistic Klein–Nishina regime:

⟨E_γ⟩ ≃ (4/3) γ_e² η   for γ_e η ≪ m_e c² (Thomson limit) (10.12)

⟨E_γ⟩ ≃ (1/2) E_e   for γ_e η ≫ m_e c² (Klein–Nishina limit) (10.13)

dN_γ/dE_γ ∝ E_γ^(−(q+1)/2)   for γ_e η ≪ m_e c² (Thomson limit) (10.14)

dN_γ/dE_γ ∝ E_γ^(−(q+1)) ln(E_γ)   for γ_e η ≫ m_e c² (Klein–Nishina limit) (10.15)
where E_γ denotes the scattered photon’s energy, E_e the energy of the parent
electron, and η the energy of the seed photon. A useful approximate relation
linking the electron’s energy and the Comptonized photon’s energy is given by:
E_γ ≃ 6.5 (E_e / 1 TeV)² (η / 1 meV) GeV .
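As a quick numerical cross-check: for a monoenergetic electron in the Thomson limit, (4/3)γ_e²η with E_e = 1 TeV and η = 1 meV gives ∼5 GeV, the same order as the 6.5 GeV coefficient of the parametrization (the coefficient in the text includes spectral averaging; the values below are illustrative):

```python
m_e = 0.511e6            # electron rest energy, eV
E_e = 1e12               # 1 TeV electron, eV
eta = 1e-3               # 1 meV seed photon, eV

gamma_e = E_e / m_e                          # ~2e6
E_gamma = (4 / 3) * gamma_e**2 * eta         # Thomson-limit mean IC photon energy, eV
print(f"E_gamma ~ {E_gamma / 1e9:.1f} GeV")  # ~5 GeV
```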
The Compton component can peak at GeV–TeV energies; the two characteristic
synchrotron and Compton peaks are clearly visible on top of a general E_γ^(−2) depen-
dence. Figure 10.31 shows the resulting energy spectrum. This behavior has been
verified with high accuracy on the Crab Nebula; see Fig. 10.16. If in a given region
the photons from synchrotron radiation can be described by a power law with spec-
tral index p, to first approximation the high-energy tails of both the synchrotron and
the Compton components will have spectral index p.
A cornerstone characteristic of the SSC model is a definite correlation between
the yields from synchrotron radiation and from IC during a flare (it would be difficult
to accommodate in the theory an “orphan flare,” i.e., a flare in the IC region not accom-
panied by a flare in the synchrotron region). Although most flaring activity
occurs almost simultaneously in TeV gamma rays and in X-rays, observations of
1ES 1959+650 and other AGN have exhibited VHE gamma ray flares without
counterparts in X-rays. Flares observed in VHE gamma rays in the absence of high
activity in X-rays are very difficult to reconcile with the standard SSC model, although
this model has been very successful in explaining the SED of blazars.
The existence of a hadronic component has been demonstrated by the experi-
mental data (see later) and could explain the acceleration of hadrons up to almost
the knee.
Since neutrinos are neither absorbed during propagation nor produced in purely
leptonic processes, a neutrino would be a unique tracer of a hadronic cascade.
VHE neutrinos can come from the same sources as VHE gamma rays, but not from
purely leptonic mechanisms: top-down mechanisms and hadronic cascades are thus
neutrino sources. As a consequence, evidence of neutrinos from an accelerator would
be a smoking gun for the presence of a hadronic component among the accelerated
particles—or for new physics.
Table 10.1 Typical values of radii and magnetic fields in acceleration sites, and the maximum
attainable energy

Source   Magnetic field   Radius     Maximum energy (eV)
SNR      30 µG            1 pc       3 × 10¹⁶
AGN      300 µG           10⁴ pc     >10²¹
GRB      10⁹ G            10⁻³ AU    0.2 × 10²¹
Fig. 10.32 The “Hillas plot” represents astrophysical objects that are potential cosmic ray accel-
erators on a two-dimensional diagram, with the linear size R of the accelerator on the horizontal
axis and the magnetic field strength B on the vertical axis. The maximal acceleration energy E is
proportional to Z R B v, where v is the shock velocity in units of the speed of light and Z is the
absolute value of the particle charge in units of the electron charge. Particular values of the maximal
energy correspond to diagonal lines in this diagram and can be realized either in a large, low-field
acceleration region or in a compact accelerator with a high magnetic field. Typical shock velocities
range from v ∼ 1 in extreme environments down to v ∼ 1/300.
From http://astro.uni-wuppertal.de/~kampert
In convenient units, the maximum contained energy reads

E / (1 PeV) ≃ (B / 1 µG) × (R / 1 pc) (10.16)

E / (1 PeV) ≃ 0.2 (B / 1 G) × (R / 1 AU) . (10.17)
This is the so-called Hillas relation, which is illustrated by Table 10.1 and
Fig. 10.32. We recall that the energies in the Hillas plot are maximum attainable
energies: besides containment, one must also have an acceleration mechanism.
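As a numerical check of Eq. 10.16, the SNR and AGN entries of Table 10.1 can be reproduced directly from E ≃ Z β e B R (a sketch; the helper `e_max_eV` and the choice β = 1 are ours):

```python
def e_max_eV(B_gauss: float, R_m: float, Z: int = 1, beta: float = 1.0) -> float:
    """Hillas-type maximum energy E ~ Z*beta*e*B*R, returned in eV.
    Numerically E[eV] = Z * beta * B[T] * R[m] * c[m/s]."""
    c = 3.0e8                        # speed of light, m/s
    return Z * beta * (B_gauss * 1e-4) * R_m * c

pc = 3.086e16                        # one parsec in meters
snr = e_max_eV(30e-6, 1.0 * pc)      # B = 30 uG, R = 1 pc
agn = e_max_eV(300e-6, 1e4 * pc)     # B = 300 uG, R = 10^4 pc
print(f"SNR: {snr:.1e} eV, AGN: {agn:.1e} eV")   # ~3e16 and ~3e21 eV
```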
In the following, known possible acceleration sites are described.
We have seen that most of the VHE gamma emission in the Galaxy can be associated
with SNRs. More than 90% of the TeV galactic sources discovered up to now are,
indeed, SNRs in a broad sense (we include here in the set of “SNRs” also pulsar
wind nebulae; see later).
The term “supernova” indicates a very energetic “stella nova,” a term invented by
Galileo Galilei to indicate objects that appeared to be new stars, never observed
before in the sky. The name is a bit ironic, since Galilei’s diagnosis was wrong:
supernovae are actually stars at the end of their life cycle, exploding as their cores
collapse.
In the beginning, a massive star burns the hydrogen in its core. When the hydrogen
is exhausted, the core contracts until density and temperature conditions are reached
such that the fusion 3α → ¹²C can take place, which continues until helium
is exhausted. This pattern (fuel exhaustion, contraction, heating, and ignition of the
ashes of the previous cycle) may repeat several times depending on the mass, leading
finally to explosive burning. A 25-solar-mass star can go through a full set of burning
cycles, ending with the burning of Si into Fe, in about 7 million years (as discussed
in Chap. 1, Fe is stable with respect to fusion), with the final explosion stage taking
a few days.
A supernova remnant (SNR) is the structure left over after a supernova explosion:
a high-density neutron star (or a black hole) lies at the center of the exploded star,
whereas the ejecta appear as an expanding bubble of hot gas that shocks and sweeps
up the interstellar medium. A star with mass larger than 1.4 times the mass of the
Sun cannot end up as a white dwarf: it will collapse into a neutron star or possibly,
if its mass is larger than 5–10 times the mass of the Sun, into a black hole.
Most collapsing stars are expected to produce a neutron star.
When a star collapses into a neutron star, its size shrinks to some 10–20 km, with a
density of about 5 × 10¹⁷ kg/m³. Since angular momentum is conserved, the rotation can
become very fast, with periods of the order of a few ms up to 1 s. Neutron stars in
young SNRs are typically pulsars (short for pulsating stars), i.e., they emit a pulsed
beam of electromagnetic radiation. Since the magnetic axis is in general not aligned
to the rotation axis, two peaks corresponding to each of the magnetic poles can be
seen for each period (Fig. 10.33).
The millisecond rotation periods of young pulsars can be estimated using basic
physics arguments. A star like our Sun has a radius R ∼ 7 × 10⁵ km and a rotation
period T ≃ 30 days, so that its angular velocity is ω ∼ 2.5 µrad/s. After
the collapse, the neutron star has a radius R_NS ∼ 10 km. From angular momentum
conservation, one can write:
10.3 Possible Acceleration Sites and Sources 573
Fig. 10.33 Left Schematic of the Crab Pulsar. Electrons are trapped and accelerated along the
magnetic field lines of the pulsar and can emit electromagnetic synchrotron radiation. Vacuum gaps
or vacuum regions occur at the “polar cap” close to the neutron star surface and in the outer region;
in these regions density varies and thus one can have acceleration. From MAGIC Collaboration,
Science 322 (2008) 1221. Right Time-resolved emission from the Crab Pulsar at HE and VHE; the
period is about 33 ms. From VERITAS Collaboration, Science 334 (2011) 69
R² ω ∼ R_NS² ω_NS ⟹ ω_NS = (R² / R_NS²) ω ⟹ T_NS ≃ 0.5 ms .
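Plugging in the numbers (solar-like progenitor values, as in the estimate above):

```python
import math

R_star = 7e5                 # progenitor radius, km (solar-like)
T_star = 30 * 86400          # rotation period ~30 days, in seconds
R_ns = 10                    # neutron-star radius, km

omega = 2 * math.pi / T_star                 # ~2.4e-6 rad/s
omega_ns = omega * (R_star / R_ns) ** 2      # spin-up factor (R/R_NS)^2
T_ns = 2 * math.pi / omega_ns
print(f"T_NS ~ {T_ns * 1e3:.2f} ms")         # ~0.5 ms
```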
The gravitational collapse amplifies the stellar magnetic field. As a result, the
magnetic field B_NS near the neutron star surface is extremely high. To estimate its
magnitude, let us use the conservation of the magnetic flux during the contraction.
Assuming the magnetic field to be approximately constant over the surface,

B_star R² = B_NS R_NS² ⟹ B_NS = B_star (R² / R_NS²) .

For a typical value B_star = 1 kG, the magnetic field on the surface of the neutron
star is about 10¹² G. This estimate has been experimentally confirmed by measuring
the quantized energy levels of free electrons in the strong magnetic fields of pulsars.
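With the same illustrative progenitor values as before, flux conservation gives:

```python
# Magnetic-flux conservation during the collapse: B * R^2 = const.
B_star = 1e3     # G, typical progenitor surface field (1 kG)
R_star = 7e5     # km, progenitor radius
R_ns = 10        # km, neutron-star radius

B_ns = B_star * (R_star / R_ns) ** 2
print(f"B_NS ~ {B_ns:.1e} G")    # ~5e12 G, i.e., of order 10^12 G
```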
Typical pulsars emitting high-energy radiation have spectral cutoffs of the order of a
few GeV. By 2013, 117 HE pulsars emitting at energies above 100 MeV had been
discovered by the Fermi-LAT. They are very close to the Solar System (Fig. 10.34,
left), most of those with measured distances being less than a few kpc away. A typical
spectral energy distribution is shown in Fig. 10.34, right.
The pulsar in the Crab Nebula is not typical, being one of only two (together with the
Vela pulsar) detected so far at VHE (Fig. 10.35); Crab and Vela were also the
first HE pulsars discovered, in the late 1970s.
Fig. 10.34 Left Map of the pulsars detected by the Fermi-LAT (the Sun is roughly in the center
of the distribution). The open squares with arrows indicate the lines of sight toward pulsars for
which no distance estimates exist. Credit: NASA. Right Spectral energy distribution from a typical
high-energy pulsar. Credit: NASA
Typical velocities for the expulsion of the material out of the core of the explosion are
of the order of c/100. The shock slows down over time as it sweeps up the ambient
medium, but it can expand over tens of thousands of years and over tens of parsecs
before its speed falls below the local sound speed.3
Based on their emission and morphology (which are actually related), SNRs are
generally classified into three categories: shell-type, pulsar wind nebulae (PWN),
and composite (a combination of the two, i.e., a shell-type SNR containing a
PWN).
Shell Supernova Remnants. As the shock wave from the supernova explosion plows
through space, it heats and stirs up any interstellar material it encounters, producing
a big shell of hot material. This process can continue for 10⁴–10⁵ years before the
energy release becomes negligible.
Magnetic field strengths are estimated to be B ∼ 10 µG to 1 mG.
Pulsar Wind Nebulae. Pulsar Wind Nebulae (PWNe) are SNRs with a young pulsar
slowing down in rotation: the typical rate of decrease of rotational kinetic energy lies
in the range Ė ∼ 10³²–10³⁹ erg/s. In most cases only a negligible fraction of this energy
goes into the pulsed electromagnetic radiation observed from the pulsar; most of it is
deposited into an outflowing relativistic magnetized particle wind. Due to external
pressure, this wind decelerates abruptly at a termination shock, beyond which
the pulsar wind thermalizes and radiates synchrotron photons, which may then
undergo IC scattering—resulting in a PWN.
The best known case of a PWN is the Crab Nebula, powered by the central young
(∼1000-year-old) pulsar B0531+21.
3 The speed of sound is the speed at which pressure waves propagate, and thus it determines the rate
at which disturbances can propagate in the medium, i.e., the “cone of causality.”
Fig. 10.35 Left Spectral energy distribution of the Crab Pulsar. Right The VHE energy emission,
compared with the emission from the pulsar wind nebula powered by the pulsar itself (see later).
The two periodical peaks are separated. Credits: MAGIC Collaboration
The Crab Nebula emits radiation across a large part of the electromagnetic spectrum,
as seen in Fig. 10.16.
One can separate the contribution of the pulsar itself to the photon radiation from
the contribution of the PWN (Fig. 10.35).
In the early stages of a PWN’s evolution, the pulsar spin-down luminosity provides
a steady source of energy to the PWN, which then expands supersonically into the
surrounding low-density medium, the PWN radius evolving as R ∝ t^(6/5). Eventually,
inside the PWN volume there will be two zones:
• an inner one, dominated by the freely outgoing particles and fields, that ends at a
termination shock where the wind, whose pressure decreases outwards, is con-
tained by the external pressure; there the particles are accelerated and scrambled,
so that synchrotron radiation can be emitted;
• an outer zone, which is the emitting region of the PWN.
The morphology of a PWN in this early phase is revealed by the X-ray image of the
Crab Nebula, which is dominated by a bright torus, whose ring-shaped inner part is
thought to correspond to the wind termination shock.
A PWN enters a new phase when its outer boundary collides with the reverse
shock of the embedding SNR. For a 10⁵¹ erg supernova explosion ejecting 10 M⊙
into an ambient interstellar medium of density 1 atom cm⁻³ (a typical case in the
Galaxy), this happens typically ∼7000 years after the explosion. The encounter with
the reverse shock eventually compresses the PWN, which leads to an increase in the
magnetic field and hence to a brightening of the PWN.
Among the extragalactic emitters that may be observed from Earth, AGN and gamma ray bursts could have the energetics to reach the highest energies.
Supermassive black holes of ∼10⁶–10¹⁰ solar masses (M⊙) and beyond reside in the cores of most galaxies; for example, the center of our Galaxy, the Milky Way, hosts a black hole of roughly 4 million solar masses, its mass having been determined from the orbital motion of nearby stars. The mass of the black hole is correlated with the velocity dispersion of the stars in the galaxy bulge.4
In approximately 1 % of the cases such a black hole is active, i.e., it displays strong emission and has signatures of accretion at the expense of nearby matter: we then speak of an active galactic nucleus (AGN). Around 10 % of these AGN exhibit relativistic jets powered by accretion onto the supermassive BH. Although AGN have been studied for several decades, knowledge of their emission characteristics up to the highest photon energies is mandatory for an understanding of these extreme particle accelerators. Infalling matter onto the black hole can produce spectacular activity.
4 The so-called M − σ relation is an empirical correlation between the stellar velocity dispersion σ of a galaxy bulge and the mass M of the supermassive black hole at its center. A relationship exists also between galaxy luminosity and BH mass, but with a larger scatter.
An AGN emits over a wide range of wavelengths, from γ rays to radio: typical luminosities range from about 10³⁷ to 10⁴⁰ W. The energy spectrum of an AGN is radically different from that of an ordinary galaxy, whose emission is due to its constituent stars. Very large luminosities are possible (up to 10,000 times that of a typical galaxy). The maximum luminosity (in equilibrium conditions) is set by the requirement that gravity (inward) be equal to radiation pressure (outward); this is called the Eddington luminosity. Approximately, the Eddington luminosity (in units of the solar luminosity) is 40,000 times the BH mass expressed in solar units. For short times, the luminosity can be larger than the Eddington luminosity.
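As a quick numerical check of the rule of thumb above (a sketch; the prefactor 4 × 10⁴ and the solar luminosity value 3.8 × 10²⁶ W are approximate):

```python
L_SUN_W = 3.8e26  # approximate solar luminosity, in watts

def eddington_luminosity_W(mass_solar):
    """Approximate Eddington luminosity: L_Edd ~ 4e4 L_sun x (M / M_sun)."""
    return 4.0e4 * mass_solar * L_SUN_W

# A 10^8 solar-mass AGN black hole lands inside the quoted 10^37-10^40 W range:
L = eddington_luminosity_W(1e8)  # ~1.5e39 W
```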
Matter falling toward the central black hole conserves its angular momentum and forms a rotating accretion disk around the BH itself. In about 10 % of AGN, the infalling matter powers two collimated jets that shoot out in opposite directions, likely perpendicular to the disk, at relativistic speeds (see Fig. 10.36). Jets have been observed close to the BH with a size of about 0.01 pc, orders of magnitude larger than the radius of the black hole and a fraction 10⁻⁵ of the length of the jets themselves.
Frictional effects within the disk raise the temperature to very high values, causing the emission of energetic radiation; the gravitational energy of infalling matter accounts for the power emitted. The typical values of the magnetic fields are of the order of 10⁴ G close to the BH horizon, quickly decaying along the jet axis.
Many AGN vary substantially in brightness over very short timescales (days or even minutes). Since a source of light cannot vary in brightness on a timescale shorter than the time taken by light to cross it, the energy sources in AGN must be very compact, much smaller than their Schwarzschild radii (the Schwarzschild radius of a BH is 3 km × (M/M⊙), i.e., 20 AU, or about 10⁴ light seconds, for a supermassive black hole of mass 10⁹ M⊙).
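The numbers in parentheses can be verified directly; a minimal sketch, assuming R_s ≈ 3 km per solar mass as quoted:

```python
C_KM_S = 3.0e5   # speed of light, km/s
AU_KM = 1.5e8    # astronomical unit, km

def schwarzschild_radius_km(mass_solar):
    """R_s ~ 3 km x (M / M_sun), as quoted in the text."""
    return 3.0 * mass_solar

rs = schwarzschild_radius_km(1e9)  # a 10^9 solar-mass black hole
size_au = rs / AU_KM               # ~20 AU
crossing_time_s = rs / C_KM_S      # light-crossing time, ~1e4 s
```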
The so-called “unified model” accounts for all kinds of active galaxies within
the same basic model. The supermassive black hole and its inner accretion disk are
surrounded by matter in a toroidal shape, and according to the “unified model” the
type of active galaxy we see depends on the orientation of the torus and jets relative
to our line of sight. The jet radiates mostly along its axis, due in part to Lorentz enhancement: the observed photon frequency in the laboratory frame is boosted by a Doppler factor, obtained by the Lorentz transformation of a photon from the jet fluid frame into the laboratory frame; in addition, the jet emission is beamed by the Lorentz boost.
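The Doppler factor mentioned here takes the standard kinematic form δ = 1/[Γ(1 − β cos θ)], which is not written out in the text; a sketch:

```python
import math

def doppler_factor(gamma, theta_rad):
    """delta = 1 / (Gamma (1 - beta cos theta)): observed frequencies are
    boosted by delta relative to the jet fluid frame."""
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    return 1.0 / (gamma * (1.0 - beta * math.cos(theta_rad)))

on_axis = doppler_factor(10.0, 0.0)                  # ~2 Gamma ~ 20
off_axis = doppler_factor(10.0, math.radians(30.0))  # < 1: de-boosted
```

For a jet seen on-axis δ ≈ 2Γ, while a few tens of degrees off-axis the emission is de-boosted; this is why jets pointing at us (blazars) dominate the observed extragalactic TeV sky.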
• An observer looking very close to the jet axis will observe essentially the emission
from the jet, and thus will detect a (possibly variable) source with no spectral lines:
this is called a blazar.
• As the angle of sight with respect to the jet grows, the observer will start seeing a
compact source inside the torus; in this case we speak of a quasar.
• From a line of sight closer to the plane of the torus, the BH is hidden, and one
observes essentially the jets (and thus, extended radio-emitting clouds); in this
case, we speak of a radio galaxy (Fig. 10.36).
The class of jet-dominated AGN corresponds mostly to radio-loud AGN. These can be blazars or nonaligned AGN depending on the orientation of their jets with respect to the line of sight. In blazars, emission is modified by relativistic effects due to the Lorentz boost.
578 10 Messengers from the High-Energy Universe
Fig. 10.36 Schematic diagram for the emission by an AGN. In the “unified model” of AGN, all share a common structure and only appear different to observers because of the angle at which they are viewed. From http://www.astro-photography.net, adapted
Blazars. Observationally, blazars are divided into two main subclasses depending on their spectral properties.
• FSRQs (flat-spectrum radio quasars). They show broad emission lines in their optical spectrum.
• BL Lacertae objects (BL Lacs). They have no strong, broad lines in their optical spectrum. BL Lacs are further classified according to the energies of the peaks of their SED: low-energy peaked BL Lacs (LBLs), intermediate-energy peaked BL Lacs (IBLs), and high-energy peaked BL Lacs (HBLs). Typically FSRQs have a synchrotron peak at lower energies than LBLs.
Blazar population studies at radio to X-ray frequencies indicate a redshift distribution for BL Lacs that seems to peak at z ∼ 0.3, with only a few sources beyond z ∼ 0.8, while the FSRQ population is characterized by a rather broad maximum between z ∼ 0.6 and 1.5.
Non-AGN Extragalactic Gamma Ray Sources. At TeV energies, the extragalactic γ ray sky is completely dominated by blazars. At present, more than 50 objects have been discovered and are listed in the online TeV Catalog. The two most massive nearby starburst galaxies, NGC 253 and M82, are the only non-AGN sources detected at TeV energies. Only three radio galaxies have been detected at TeV energies (Centaurus A, M87, and NGC 1275).
At GeV energies, a significant number (about 1/3) of AGN of uncertain type have been detected by the Fermi-LAT (emitters that Fermi could not associate with any known object), and a few non-AGN objects have been discovered. Among the latter are several Local Group galaxies (LMC, SMC, M31) as well as galaxies in the star-forming phase (NGC 4945, NGC 1068, NGC 253, and M82); CRs below the knee can be accelerated by SNRs or other objects related to star formation activity.
The Gamma Ray Yield From Extragalactic Objects. The observed VHE spectra are usually described by a power law dN/dE ∝ E^{−Γ}. The spectral indices Γ need to be fitted from a distribution deconvoluted from the absorption in the Universe, since the transparency of the Universe depends on energy; they typically range from 2 to 4, with some indication of spectral hardening with increasing activity. Emission beyond 10 TeV has been established for close galaxies like Mrk 501 and Mrk 421. Some sources are detected only during high states (flares), with low states falling below current sensitivities.
Observed VHE flux levels for extragalactic objects typically range from 1 % of
the Crab Nebula steady flux (for the average/steady states) up to 10 times as much
when the AGN are in high activity phases. Since TeV instruments are now able to detect sources at the level of 1 % of the Crab, the variability of the near and bright TeV-emitting blazars (Mrk 421 and Mrk 501) can be studied in detail down to few-minute timescales. Another consequence of the sensitivity of Cherenkov telescopes is
that more than one extragalactic object could be visible in the same field of view.
The study and classification of AGN and their acceleration mechanisms require
observations from different instruments. The spectral energy distributions (SEDs)
of blazars can span almost 20 orders of magnitude in energy, making simultaneous
multiwavelength observations a particularly important diagnostic tool to disentangle
the underlying nonthermal processes. Often, SEDs of various objects are obtained
using nonsimultaneous data—which limits the accuracy of our models.
In all cases, the overall shape of the SEDs exhibits the typical broad double-hump distribution, as shown in Fig. 10.37 for three AGN at different distances. The SEDs of all AGN considered show that there are considerable differences in the position of the peaks of the two components and in their relative intensities. According to current models, the low-energy hump is due to synchrotron emission from highly relativistic electrons, while the high-energy hump is related to inverse Compton emission off various underlying radiation fields, or to π0 decays, depending on the production mechanism at work (Sect. 10.2.2). Large variability is present, especially at optical/UV and X-ray frequencies.
Variability is also a way to distinguish between hadronic and leptonic acceleration modes. In a pure leptonic mode, one expects that in a flare the increase of the synchrotron hump is fully correlated with the increase of the IC hump; in a hadronic mode, instead, one can have an “orphan flare” of the peak corresponding to π0 decay.
Studies on different blazar populations indicate a continuous spectral trend from FSRQ to LBL to IBL to HBL, called the “blazar sequence” (Fig. 10.37, left).
Fig. 10.37 Left The blazar sequence. From G. Fossati et al., Mon. Not. Roy. Astron. Soc. 299 (1998) 433. Right The SED of three different AGN at different distances from the Earth, belonging to different subclasses. To improve the visibility of the spectra, the contents of the farthest (3C 279) have been multiplied by a factor 1000, and those of the nearest (Mrk 421) by a factor 0.001. The dashed lines represent the best fit to the data assuming leptonic production. From D. Donato et al., Astron. Astrophys. 375 (2001) 739
Gamma ray bursts are another very important possible extragalactic acceleration site. As discussed before, GRBs are the most luminous events occurring in the gamma ray Universe (10⁵¹ erg on timescales of tens of seconds or less).
The connection between large-mass supernovae (from the explosion of hypergiants, stars with a mass between 100 and 300 times that of the Sun) and long GRBs is proven by the observation of events coincident both in time and space, and the energetics would account for the emission, just by extrapolating the energetics from a supernova. During the abrupt compression of such a giant star the magnetic field could be squeezed to extremely large values, of the order of 10¹²–10¹⁴ G, in a radius of some tens of kilometers.
Models of the origin of the short GRBs are, instead, more speculative. Observation
of afterglows allowed the identification of host galaxies where short GRBs occurred.
It was found that short GRBs are not associated with supernovae, and prefer galaxies
that contain a considerable quantity of old stars. The current hypothesis attributes the origin of short GRBs to the merging of pairs of compact objects, for example a neutron star and a black hole: the neutron star loses energy through gravitational radiation and spirals closer and closer, until tidal forces disintegrate it, releasing an enormous quantity of energy just before it merges with the black hole. This process can last only a few seconds.
Although the two families of GRBs are likely to have different progenitors, the
acceleration mechanism that gives rise to the γ rays themselves (and possibly to
charged hadrons one order of magnitude more energetic, and thus also to neutrinos)
can be the same.
The fireball model is the most widely used theoretical framework to describe the
physics of the GRBs, either long or short.
In this model, the newly formed (or newly accreting) black hole starts to pull in more stellar material; quickly an accretion disk forms, with its inner portion spinning around the BH at relativistic speed. This creates a magnetic field which blasts outward two jets of electrons, positrons and protons at ultrarelativistic speed, out of the plane of the accretion disk. Photons are produced in this pre-burst phase.
Step two is the fireball shock. Each jet behaves as a shock wave, plowing into and sweeping up matter like a “fireball”. Gamma rays are produced as a result of the collisions of blobs of matter; the fireball medium does not allow the light to
escape until it has cooled enough to become transparent—at which point light can
escape in the direction of motion of the jet, ahead of the shock front. From the point
of view of the observer, the photons first detected are emitted by a particle moving at
relativistic speed, resulting in a Doppler blueshift to the highest energies (i.e., gamma
rays). This is the gamma ray burst.
An afterglow results when material escaped from the fireball collides with the
interstellar medium and creates photons. The afterglow can persist for months as the
energies of photons decrease.
Figure 10.38 shows a scheme of the fireball shock model.
10.3.4.1 Gamma Rays and the Origin of Cosmic Rays from SNRs
Among the categories of possible cosmic ray accelerators, several have been studied
in an effort to infer the relation between gamma rays and charged particles. In the
Milky Way in particular, SNRs have been considered, since the hypothesis formulated by Baade and Zwicky in 1934, as possible accelerators up to energies of the order of 1 PeV and beyond. The particle acceleration in SNRs is accompanied by production of
gamma rays due to interactions of accelerated protons and nuclei with the ambient
medium.
The conjecture has a twofold justification. On one side, SNRs are natural places in which strong shocks develop, and such shocks can accelerate particles. On the other side, supernovae can easily account for the required energetics. Nowadays, as a general remark, we can state that there is no doubt that SNRs accelerate (part of the) galactic CRs, the open questions being: which kinds of SNRs, and in which phase of their evolution, really do accelerate particles; and whether the maximum energy of these accelerated particles can go beyond ∼1 PeV.
A very important step forward in this field of research was achieved in recent years with an impressive amount of experimental data at TeV energies, by Cherenkov
telescopes (H.E.S.S., MAGIC, VERITAS), and at GeV energies, by the Fermi-LAT
and AGILE satellites.
In SNRs with molecular clouds, in particular, a possible mechanism involves a source of cosmic rays illuminating clouds at different distances, and generating hadronic showers by pp collisions. This allows one to spot the generation of cosmic rays through the study of photons coming from π0 decays in the hadronic showers.
Fig. 10.39 On the left, scheme of the generation of a hadronic cascade in the dump of a proton onto a molecular cloud. On the right, IC443: centroid of the emission from different gamma detectors. The position measured by Fermi-LAT is marked as a diamond, that by MAGIC as a downwards oriented triangle; the latter is consistent with the molecular cloud G
Recent experimental results support the “beam dump” hypothesis. An example of such a mechanism at work is the SNR IC443. In Fig. 10.39, a region of acceleration at GeV energies, seen by the LAT onboard the Fermi satellite, is significantly displaced from the centroid of emission detected at higher energies by the MAGIC gamma ray telescope, which, in turn, is positionally consistent with a molecular cloud. The spectral energy distribution of photons also supports a two-component emission, with a rate of acceleration of primary electrons approximately equal to the rate of production of protons.
Such a two-region displaced emission morphology has also been detected in several other SNRs (W44 and W82, for example).
Besides indications from the studies of the morphology, the detection of photons of
energies of the order of 100 TeV and above could be a direct indication of production
via π 0 decay, since the emission via leptonic mechanisms should be strongly sup-
pressed at those energies where the inverse Compton scattering cross section enters
the Klein–Nishina regime. Finally, in some sources the fit to the spectral energy
distribution seems to indicate a component from π 0 decay (Fig. 10.40).
Let us briefly examine the relation between the spectra of secondary photons and the spectra of primary cosmic rays (we shall assume protons). We shall also examine the case of neutrinos, which could become, if large enough detectors are built, another powerful tool for experimental investigation.
In beam dump processes, almost equal numbers of π0, π− and π+ are produced. The π0 decay immediately into two gamma rays; the charged pions decay into μνμ, with the μ decaying into eνeνμ (charge conjugates are implicitly included). Thus, there are three neutrinos for each charged pion, and six neutrinos for every two gamma rays.
When the collision happens between an accelerated proton and a nucleon at rest
in a molecular cloud, for a c.m. energy of 1 TeV (which corresponds to an energy
of the accelerated proton of 0.5 PeV in the frame of the cloud), around 60 charged
and 30 neutral pions are produced. Taking into account the fact that in the final
state of the charged pion decay there are three neutrinos and one electron/positron,
the energy fraction transferred to each neutrino is about 1/4. Typically, thus, each neutrino carries (1/4) × (1/90) ≈ 0.3 % of the primary energy. This fraction decreases roughly logarithmically with energy.
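The counting argument above can be spelled out numerically (equal energy sharing among the ∼90 pions is assumed, as in the text's order-of-magnitude estimate):

```python
n_charged, n_neutral = 60, 30      # pion multiplicities quoted in the text
n_pions = n_charged + n_neutral    # ~90 pions share the primary energy

frac_per_pion = 1.0 / n_pions
frac_per_neutrino = frac_per_pion / 4.0  # 3 nu + 1 e share each charged-pion energy

# frac_per_neutrino ~ 0.0028, i.e. about 0.3 % of the primary energy
```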
In photoproduction processes (γp → nπ+, γp → pπ0), the neutrino energy from the π+ decay is related to the parent proton energy through the relation
E_ν ≈ E_p/20 .
This arises because the average energy of the pion in photoproduction is some 20 % of
the energy of the parent proton, and in the π + decay chain four leptons are produced,
each of which roughly carries 1/4 of the pion energy.
The fraction of energy transferred to photons in π+ decays is 1/4 (because the positron annihilates, producing additional photons); for each π+ one has in addition a π0, which decays into photons. Following such simple arguments, the ratio of neutrino to photon flux in photoproduction processes is
L_ν/L_γ ≈ 3/5 .
Here, and in the following, under the symbol ν we always consider the sum of
neutrinos and antineutrinos.
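The bookkeeping behind L_ν/L_γ ≈ 3/5 is short enough to write out (one π+ and one π0 of equal energy per interaction are assumed, following the simple argument above):

```python
e_pi_plus = 1.0   # energy carried by the pi+
e_pi_zero = 1.0   # energy carried by the accompanying pi0

e_neutrinos = 0.75 * e_pi_plus            # 3 of the 4 decay leptons are neutrinos
e_photons = 0.25 * e_pi_plus + e_pi_zero  # positron annihilation + pi0 -> 2 gamma

ratio = e_neutrinos / e_photons  # 3/5
```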
The existence of neutrino sources not observed in gamma rays is not excluded.
If a source is occulted by the presence of thick clouds or material along the line of
sight to the Earth, gamma rays are absorbed while neutrinos survive.
As the energetics of SNRs might explain the production of galactic CRs, the energetics of AGN might explain the production of CRs up to the highest energies. According to the Hillas relation, their magnetic fields and typical sizes are such that acceleration is possible (Table 10.1).
Where molecular clouds are not a likely target, as, for example, in the vicinity of
supermassive black holes, proton–photon interactions can start the hadronic shower.
Although the spatial resolution of gamma ray telescopes is not yet good enough
to study the morphology of extragalactic emitters, a recent study of a flare from
the nearby galaxy M87 (at a distance of about 50 Mly, i.e., a redshift of about 0.004) by the main gamma telescopes plus the VLBA radio array has shown, based on the VLBA imaging power, that this AGN accelerates particles to very high energies in the immediate vicinity (less than 60 Schwarzschild radii) of its central black hole. This galaxy is very active: its black hole, of a mass of approximately 7 billion solar masses, accretes 2 or 3 solar masses per year. A jet of energetic plasma originates at the core and extends outward for at least 5000 ly.
Also Centaurus A, the nearby AGN for which some hint of correlation with the Auger UHE data exists, has been shown to be a VHE gamma ray emitter.
5 The apparent magnitude (m) of a celestial object measures its brightness as seen by an observer on Earth. The magnitude scale originates in the Hellenistic practice of dividing stars into six magnitudes. The brightest stars were said to be of first magnitude (m = 1), while the faintest were of sixth magnitude (m = 6), the limit of naked-eye human visibility; each grade of magnitude was considered twice the brightness of the following grade. The system is today formalized by defining a first magnitude star as a star that is 100 times as bright as a sixth magnitude star; thus, a first magnitude star is 100^{1/5} (about 2.512) times as bright as a second magnitude star (obviously the brighter an object appears, the lower the value of its magnitude). The stars Arcturus and Vega have an apparent magnitude approximately equal to 0. The problem of the relation between apparent magnitude, intrinsic magnitude, and distance is related also to cosmology, as discussed in Chap. 8.
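The footnote's modern definition reduces to a one-line formula; a sketch:

```python
MAG_STEP = 100.0 ** (1.0 / 5.0)  # ~2.512: brightness ratio per magnitude

def flux_ratio(m1, m2):
    """Brightness ratio F1/F2 for apparent magnitudes m1, m2 (smaller m = brighter)."""
    return 100.0 ** ((m2 - m1) / 5.0)

# A 1st-magnitude star is exactly 100 times as bright as a 6th-magnitude one:
r = flux_ratio(1.0, 6.0)
```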
Fig. 10.41 Sources of neutrinos with energies below 1 TeV. From W.C. Haxton, arXiv:1209.3743,
to appear in Wiley’s Encyclopedia of Nuclear Physics
$$\pi^\pm (K^\pm) \to \mu^\pm + \nu_\mu(\bar\nu_\mu)\,, \qquad \mu^\pm \to e^\pm + \nu_e(\bar\nu_e) + \bar\nu_\mu(\nu_\mu)\,. \qquad (10.18)$$
The neutrino sources just discussed are displayed in Fig. 10.41 according to their
contributions to the terrestrial flux density. The figure includes low-energy sources,
such as the thermal solar neutrinos of all flavors, and the terrestrial neutrinos—i.e.,
neutrinos coming from the Earth’s natural radioactivity—not explicitly discussed
here. Beyond the figure’s high-energy limits there exist neutrino sources associ-
ated with some of nature’s most energetic accelerators. Existing data on cosmic ray
protons, nuclei, and γ rays constrain possible neutrino fluxes (see Fig. 10.42). The
high-energy spectrum is one of the frontiers of neutrino astronomy.
Apart from the (low-energy) data from the Sun and from the Earth, and from the data from the flare of SN1987A, the astrophysical neutrino fluxes detected up to now are consistent with being isotropic.
The equations of Einstein’s general relativity (see Chap. 8) couple the metric of space–time with the energy and momentum of matter and radiation, thus providing the mechanism to generate gravitational waves at all scales. At the largest scales (extremely low frequencies, 10⁻¹⁵–10⁻¹⁸ Hz) the expected sources are the fluctuations of the primordial Universe. At smaller scales (frequencies of 10⁻⁴–10⁴ Hz) the expected sources are:
Fig. 10.42 A theoretical model of high-energy neutrino sources. The figure includes experimental data, limits, and projected sensitivities of existing and planned telescopes. Figure from G. Sigl, arXiv:0612240
The presence of magnetic fields in the Universe limits the possibility to investigate
sources of emission of charged cosmic rays, as they are deflected by such fields.
We know from studies of the Faraday rotation of polarization that the Galactic mag-
netic fields are of the order of a few μG; the structure is highly directional and maps
exist.
In spite of intense efforts during the last few decades, the origin and structure of cosmic (i.e., extragalactic) magnetic fields remain elusive. Observations have detected the presence of nonzero magnetic fields in galaxies, clusters of galaxies, and in the bridges between clusters. Generally, the values of extragalactic magnetic fields (EGMF) are estimated to lie between 10⁻¹⁵ G and 10⁻⁹ G. The determination of the strength and topology of large-scale magnetic fields is crucial because of their role in the propagation of ultrahigh-energy cosmic rays and, possibly, in structure formation.
Large-scale magnetic fields are believed to have a cellular structure: the magnetic field B is supposed to have a correlation length λ, randomly changing its direction from one domain to another but keeping approximately the same strength.
The present knowledge of EGMF is summarized in Fig. 10.43.
Since the Larmor radius of a particle of unit charge in a magnetic field can be written as
$$\frac{R_L}{1\,\mathrm{pc}} \simeq \frac{E/(1\,\mathrm{PeV})}{B/(1\,\mu\mathrm{G})}\,,$$
in order to “point” to the Galactic center, which is about 8 kpc from the Earth, for a Galactic field of 1 μG one needs protons of energy of at least 2 × 10¹⁹ eV. The flux is very small at this energy; moreover, a black hole of 4 million solar masses, like the black hole in the center of our Galaxy, is not likely to accelerate particles up to this energy.
10.4 The Propagation Process 589
There is thus a need to use neutral messengers to study the emission of charged cosmic rays. Unfortunately, the yield of photons at an energy of 1 TeV is only 10⁻³ times the yield of protons, and the yield of neutrinos is expected to be of the same order. In addition, the detection of neutrinos is experimentally very challenging, as discussed in Chap. 4.
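The Larmor-radius estimate above reduces to a one-liner; a sketch (the charge number Z is folded in as a parameter, an addition for illustration):

```python
def larmor_radius_pc(energy_PeV, B_microgauss, Z=1):
    """R_L / 1 pc ~ (E / 1 PeV) / (Z B / 1 uG), as in the text."""
    return energy_PeV / (Z * B_microgauss)

# A 2e19 eV (= 2e4 PeV) proton in a 1 uG Galactic field:
r_l_pc = larmor_radius_pc(2.0e4, 1.0)  # 2e4 pc = 20 kpc, a few times the
                                       # 8 kpc distance to the Galactic center
```

For heavier nuclei (larger Z) the radius shrinks accordingly, making the "pointing" requirement even harder to satisfy.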
Cosmic rays produced in distant sources have to travel a long way before reaching Earth. Those produced in our Galaxy (Fig. 10.44) suffer diffusion in magnetic fields of the order of a μG, convection by Galactic winds, spallation in the interstellar medium, radioactive decays, as well as energy losses or gains (reacceleration). At some point they may arrive at Earth or simply escape the Galaxy. Low-energy cosmic rays typically stay within the Galaxy for quite long times (∼10⁷ years).
All these processes must be accounted for in coupled transport equations, which then determine the density Ni of each cosmic ray species. These differential equations, which are also functions of the energy, can, for instance, be written as:
$$\frac{\partial N_i}{\partial t} = Q_i + \vec{\nabla}\cdot\left(D\,\vec{\nabla} N_i - \vec{V} N_i\right) + \frac{\partial}{\partial E}\big(b(E)\,N_i\big) - \left(\beta c\,\sigma_i^{\rm spall} + \frac{1}{\tau_i^{\rm decay}} + \frac{1}{\tau_i^{\rm esc}}\right) N_i + \sum_{j>i}\left(\beta c\,\sigma_{ji}^{\rm spall} + \frac{1}{\tau_{ji}^{\rm decay}}\right) N_j \,. \qquad (10.19)$$
The first term on the right-hand side accounts for the sources (injection spectrum); the second for diffusion and convection (D is the diffusion coefficient and V the convection velocity); the third for the changes in the energy spectrum due to energy losses or reacceleration; the fourth for the losses due to spallation, radioactive decays, and the probability of escaping the Galaxy; and the fifth for the gains due to the spallation or decays of heavier elements.
These equations may thus include all the physics processes and all spatial and energy dependences, but the number of parameters is high and the constraints from experimental data (see below) are not enough to avoid strong correlations between them. The solutions can be obtained in a semi-analytical way or numerically using sophisticated codes (e.g., GALPROP), where three-dimensional distributions of sources (as traced by pulsars) and of the interstellar medium can be included.
Simpler models, like the leaky box, are used to reproduce the main features of the data. In its simplest version, it consists only of a volume (box) with uniformly distributed sources, in which charged cosmic rays propagate freely, having however some probability to escape through the walls (see Fig. 10.45).
The stationary equation of the leaky box can be written as:
$$0 = Q_i - N_i \left( \rho \beta c\,\sigma_i^{\rm spall} + \frac{1}{\gamma \tau_i^{\rm decay}} + \frac{1}{\tau_i^{\rm esc}} \right) + \sum_{j>i} N_j \left( \rho \beta c\,\sigma_{ji}^{\rm spall} + \frac{1}{\gamma \tau_{ji}^{\rm decay}} \right) . \qquad (10.20)$$
Here once again the first term on the right-hand side accounts for the sources, and the second and third, respectively, for the losses (due to spallation, radioactive decays, and the probability of escape) and the gains (spallation or decays of heavier elements). The effects of diffusion, geometry, and so on are all hidden in the escape lifetime.
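The stationary leaky box above can be solved in closed form for one primary and one pure secondary; a toy sketch with illustrative rates (all numbers are hypothetical, not fitted values), showing how a shorter escape time lowers the secondary/primary ratio:

```python
def leaky_box_ratio(tau_esc, r_spall_P=0.05, r_spall_S=0.05, r_P_to_S=0.03):
    """Stationary secondary/primary ratio; all rates in 1/Myr, tau_esc in Myr.
    Illustrative numbers only."""
    Q_P = 1.0                                 # primary source term
    # 0 = Q_P - N_P (r_spall_P + 1/tau_esc)
    N_P = Q_P / (r_spall_P + 1.0 / tau_esc)
    # 0 = N_P r_P_to_S - N_S (r_spall_S + 1/tau_esc)
    N_S = N_P * r_P_to_S / (r_spall_S + 1.0 / tau_esc)
    return N_S / N_P

low_E = leaky_box_ratio(tau_esc=10.0)   # long confinement at low energy
high_E = leaky_box_ratio(tau_esc=1.0)   # short confinement at high energy
# low_E > high_E: the ratio falls when escape gets easier
```

With an escape time that decreases with energy, this reproduces the qualitative decrease of secondary/primary ratios such as B/C at high energies.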
All these models are adjusted to the experimental data, and in particular to the energy dependence of the ratios of secondary elements (produced by spallation of heavier elements during the propagation) over primary elements (produced directly at the sources), as well as the ratios between unstable and stable isotopes of the same element (see Fig. 10.46).
Fig. 10.46 On the left, B/C (secondary over primary) as a function of energy; on the right, ¹⁰Be/⁹Be (unstable over stable). The data points were made using the Cosmic Ray database (Maurin, Melot, Taillet, A&A 569, A32 (2014) [arXiv:1302.5525]); the full lines are fits using Galprop model standard parameters (http://galprop.stanford.edu)
While primary/primary ratios (Fig. 10.47) basically do not depend on the energy, the secondary/primary ratio (Fig. 10.46, left) shows a strong dependence at high energies, as a result of the increase of the escape probability.
Fig. 10.47 A primary/primary ratio as a function of energy. The data points were made using the Cosmic Ray database (Maurin, Melot, Taillet, A&A 569, A32 (2014) [arXiv:1302.5525]); the full lines are fits using Galprop model standard parameters (http://galprop.stanford.edu)
One should note that in the propagation of electrons and positrons the energy losses (dominated by synchrotron radiation and inverse Compton scattering) are much higher, as is the escape probability. Thus leaky-box-like models do not apply to electrons and positrons.
The photon background in the Universe has the spectrum shown in Fig. 10.13. The maximum photon density corresponds to the CMB; its total density is about 410 photons per cubic centimeter.
Fig. 10.48 Left Spectral energy distribution of the EBL as a function of the wavelength. Open
symbols correspond to lower limits from galaxy counts while filled symbols correspond to direct
estimates. The curves show a sample of different recent EBL models, as labeled. On the upper axis
the TeV energy corresponding to the peak of the γγ cross section is plotted. From L. Costamante,
IJMPD 22 (2013) 1330025. Right A summary of our knowledge about the density of background
photons in intergalactic space, from the radio region to the CMB, to the infrared/optical/ultraviolet
region. Reference: M. Ahlers et al., Astropart. Phys. 34 (2010) 106
Extragalactic cosmic rays might cross large distances (tens or hundreds of Mpc) in
the Universe. Indeed the Universe is filled with CMB photons (n γ ∼ 410 γ/cm3 —see
Chap. 8) at a temperature T ∼ 2.73 K (∼10−3 eV). Greisen, and independently Zatsepin
and Kuzmin, realized in 1966 that for high-energy protons the inelastic channels
10.4 The Propagation Process 593
$$p\,\gamma_{CMB} \to \Delta^{+} \to p\,\pi^{0}\ (n\,\pi^{+})$$
become likely, leading to a strong decrease of the proton interaction length (the so-called
GZK cutoff). The threshold energy is set by relativistic kinematics,

$$(p_p + p_\gamma)^2 = (m_p + m_\pi)^2\,,$$

which gives

$$E_p = \frac{m_\pi^2 + 2\,m_p\,m_\pi}{4\,E_\gamma} \approx 6 \times 10^{19}\ \mathrm{eV}\,, \qquad (10.21)$$

while the interaction length is

$$\lambda_p \simeq \frac{1}{n_\gamma\,\sigma_{\gamma p}} \simeq 10\ \mathrm{Mpc}\,. \qquad (10.22)$$
In each GZK interaction the proton loses on average around 20 % of its initial
energy.
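The numbers above can be checked with a few lines. The sketch below is an order-of-magnitude estimate only: it uses a single effective CMB photon energy (∼10⁻³ eV) instead of the full blackbody spectrum, and a fixed 20 % energy loss per interaction length.

```python
# Illustrative GZK estimates (order-of-magnitude sketch; assumes a single
# effective CMB photon energy instead of the full blackbody spectrum).
m_p = 938.272e6       # proton mass [eV]
m_pi = 134.977e6      # neutral pion mass [eV]
E_cmb = 1e-3          # typical CMB photon energy [eV]

# Threshold of Eq. (10.21), from (p_p + p_gamma)^2 = (m_p + m_pi)^2
E_th = (m_pi**2 + 2 * m_p * m_pi) / (4 * E_cmb)
print(f"GZK threshold: {E_th:.1e} eV")   # ~6.8e19 eV for head-on collisions

# Energy degradation: ~20% lost per interaction, one interaction per ~10 Mpc
lam = 10.0            # interaction length [Mpc]
E, d = 3e20, 0.0      # start well above threshold
while E > E_th:
    E *= 0.8          # inelasticity ~20%
    d += lam
print(f"drops below threshold after ~{d:.0f} Mpc")
```

With these assumed inputs the proton falls below threshold within the 50–100 Mpc range quoted in the text.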
A detailed computation of the effect of such a cutoff on the energy spectrum of
ultrahigh-energy cosmic rays at Earth would involve not only the convolution of the
full CMB energy spectrum with the pion photoproduction cross section and the
inelasticity distributions, but also knowledge of the sources, their location and
energy spectrum, as well as the exact model of the expansion of the Universe (CMB
photons are redshifted). An illustration of the energy losses of protons as a function
of their propagation distance is shown in Fig. 10.49, without considering the expansion
of the Universe. Typically, protons with energies above the GZK threshold lose
the memory of their initial energy after 50–100 Mpc and end up with energies
below the threshold.
The decay of the neutral and charged pions produced in these “GZK interactions”
will give rise, respectively, to high-energy photons and neutrinos, which would be a
distinctive signature of such processes.
At a much lower energy (E p ∼ 2 × 1018 eV) the conversion of a scattered CMB
photon into an electron–positron pair may start to occur, a process that Hillas, and
later Berezinsky, associated with the existence of the ankle (Sect. 10.1.1).
Heavier nuclei interacting with the CMB and infrared background (IRB) photons
may disintegrate into lighter nuclei and, typically, one or two nucleons. The photo-
disintegration cross section is high (up to ∼100 mb) and is dominated by the giant
dipole resonance, with a threshold which is a function of the nuclear binding energy
per nucleon (for Fe the threshold photon energy in the nucleus rest frame is
∼10 MeV). Stable nuclei thus survive longer. The interaction length of Fe, the most
stable nucleus, at the GZK energy is similar to the proton GZK interaction length.
Lighter nuclei have shorter interaction lengths, and thus the probability of spallation
on their way to Earth is higher.
Fig. 10.49 Proton energy as a function of the propagation distance. From J.W. Cronin, Nucl. Phys.
B Proc. Suppl. 28B (1992) 213
Once produced, VHE photons must travel towards the observer. Electron–positron
(e− e+ ) pair production in the interaction of VHE photons off extragalactic back-
ground photons is a source of opacity of the Universe to γ rays whenever the corre-
sponding photon mean free path is of the order of the source distance or smaller.
The dominant process for the absorption is pair-creation
γ + γbackground → e+ + e− ;
$$\epsilon > \epsilon_{\rm thr}(E, \varphi) \equiv \frac{2\,m_e^2 c^4}{E\,(1 - \cos\varphi)}\,, \qquad (10.23)$$
where ϕ denotes the scattering angle, m e is the electron mass, E is the energy of
the incident photon and ε is the energy of the target (background) photon. Note
that E and ε change along the line of sight in proportion to (1 + z) because of the
cosmic expansion. The corresponding cross section, computed by Breit and Wheeler
in 1934, is
$$\sigma_{\gamma\gamma}(E, \epsilon, \varphi) = \frac{2\pi\alpha^2}{3 m_e^2}\,W(\beta) \simeq 1.25 \cdot 10^{-25}\,W(\beta)\ \mathrm{cm}^2\,, \qquad (10.24)$$
with
$$W(\beta) = \left(1 - \beta^2\right)\left[2\beta\,(\beta^2 - 2) + (3 - \beta^4)\,\ln\frac{1+\beta}{1-\beta}\right].$$
The cross section depends on E, ε, and ϕ only through the speed β—in natural
units—of the electron and of the positron in the center-of-mass frame,

$$\beta(E, \epsilon, \varphi) \equiv \left[1 - \frac{2\,m_e^2 c^4}{E\,\epsilon\,(1 - \cos\varphi)}\right]^{1/2}, \qquad (10.25)$$
and Eq. (10.23) implies that the process is kinematically allowed for β² > 0. The cross
section σγγ (E, ε, ϕ) reaches its maximum σγγ,max ≃ 1.70 · 10−25 cm2 for β ≃ 0.70.
Assuming head-on collisions (ϕ = π), it follows that σγγ (E, ε, π) is maximized
for the background photon energy

$$\epsilon(E) \simeq \left(\frac{500\ \mathrm{GeV}}{E}\right)\mathrm{eV}\,. \qquad (10.26)$$

$$\epsilon(E) \simeq \left(\frac{900\ \mathrm{GeV}}{E}\right)\mathrm{eV}\,. \qquad (10.27)$$
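The shape of W(β) can be verified numerically. The scan below (plain Python, no fitting) locates the maximum of the Breit–Wheeler cross section of Eq. (10.24) and reproduces the quoted values σ_max ≃ 1.70 · 10⁻²⁵ cm² at β ≃ 0.70.

```python
import math

SIGMA0 = 1.25e-25   # 2*pi*alpha^2 / (3 m_e^2), in cm^2 (Eq. 10.24)

def W(beta):
    """Breit-Wheeler kinematic factor (equation following Eq. 10.24)."""
    return (1 - beta**2) * (2 * beta * (beta**2 - 2)
                            + (3 - beta**4) * math.log((1 + beta) / (1 - beta)))

# Scan beta over (0, 1) to locate the maximum of the cross section
betas = [i / 10000 for i in range(1, 10000)]
sigma_max, beta_max = max((SIGMA0 * W(b), b) for b in betas)
print(beta_max, sigma_max)   # maximum near beta ~ 0.70, sigma ~ 1.7e-25 cm^2
```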
where θ is the scattering angle, n ε (z) is the density of photons of energy ε(z)
at the redshift z, and l(z) = c dt (z) is the distance as a function of the redshift,
defined by

$$\frac{dl}{dz} = \frac{c}{H_0}\,\frac{1}{(1+z)\left[(1+z)^2\,(\Omega_M\,z + 1) - \Omega_\Lambda\,z\,(z+2)\right]^{1/2}}\,. \qquad (10.30)$$
In the last formula (see Chap. 8) H0 is the Hubble constant, Ω M is the matter density
(in units of the critical density ρc ) and Ω Λ is the “dark energy” density (in units
of ρc ); therefore, since the optical depth depends also on the cosmological parameters,
its determination constrains the values of the cosmological parameters if the
cosmological emission of galaxies is known.
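Eq. (10.30) is easy to integrate numerically. The sketch below assumes illustrative cosmological parameters (H₀ = 70 km/s/Mpc, Ω_M = 0.3, Ω_Λ = 0.7) and checks that l(z) reduces to the Hubble law zc/H₀ for small z.

```python
# Numerical integration of Eq. (10.30) for the path length l(z).
# Parameters below are illustrative assumptions, not fitted values.
import math

C_KMS = 299792.458   # speed of light [km/s]
H0 = 70.0            # Hubble constant [km/s/Mpc]
OM, OL = 0.3, 0.7    # matter and dark-energy densities (units of rho_c)

def dl_dz(z):
    """Integrand of Eq. (10.30), in Mpc per unit redshift."""
    disc = (1 + z)**2 * (OM * z + 1) - OL * z * (z + 2)
    return (C_KMS / H0) / ((1 + z) * math.sqrt(disc))

def path_length(z_max, steps=10000):
    """Trapezoidal integral of dl/dz from 0 to z_max [Mpc]."""
    h = z_max / steps
    s = 0.5 * (dl_dz(0.0) + dl_dz(z_max))
    s += sum(dl_dz(i * h) for i in range(1, steps))
    return s * h

print(f"l(z=0.1) ~ {path_length(0.1):.0f} Mpc")   # a few hundred Mpc
```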
The energy dependence of τ leads to appreciable modifications of the observed
source spectrum (with respect to the spectrum at emission) even for small differences
in τ , due to the exponential dependence described in Eq. (10.28). Since the
optical depth (and consequently the absorption coefficient) increases with energy,
the observed spectrum is steeper than the emitted one.
The horizon or attenuation edge for a photon of energy E is defined as the distance
corresponding to the redshift z for which τ (E, z) = 1, which gives an attenuation by
a factor 1/e (see Fig. 10.51).
Interactions other than the one just described might change our picture of the
attenuation of γ rays, and they are presently the subject of thorough study, since the
present data on the absorption of photons are hardly compatible with the pure QED
picture: from the observed luminosity of VHE photon sources, the Universe appears
to be more transparent to γ rays than expected. One speculative explanation could be
that γ rays might transform into sterile or quasi-sterile particles (such as the
axions described in Chap. 8); this would increase the transparency
by effectively decreasing the path length. A more detailed discussion will be given
at the end of this chapter.
Mechanisms in which the absorption is changed through violation of the Lorentz
invariance are also under scrutiny; such models are particularly appealing within
scenarios inspired by quantum gravity (QG).
Neutrinos have the smallest cross section among elementary particles. They can
thus travel with the smallest interaction probability and are the best possible astro-
physical probe.
598 10 Messengers from the High-Energy Universe
The study of the galactic sources continues and their morphology and the SED of the
emitted photons are telling us more and more, also in the context of multiwavelength
10.5 Frontier Physics and Open Questions 599
analyses; in the future, the planned Cherenkov Telescope Array (CTA) will give the
possibility to explore the highest energies, and to contribute, together with high-
energy CR detectors and possibly with neutrino detectors, to the final solution of the
CR problem.
One of the main results from the next-generation detectors will probably be the
discovery of new classes of CR sources. The key will probably come from dedicating
effort to surveys, which constitute an unbiased, systematic exploratory approach.
Surveys of different extents and depths are amongst the scientific goals of all major
planned facilities.
The key instruments for such surveys are gamma detectors (and, if possible, in the
future, neutrino detectors).
More than half of the known VHE gamma sources (about 80) are located in the
galactic plane. Galactic plane surveys are well suited to Cherenkov telescopes given
the limited area to cover, as well as their low-energy thresholds and relatively good
angular resolution (better than 0.1◦ , to be compared to ∼1◦ for EAS detectors). CTA,
investing 250 h (3 months) of observation, can achieve a 3 mCrab sensitivity on the
Galactic plane (the flux limit of a single pointing being roughly proportional to
1/√tobs , where tobs is the observation time). More than 300 sources are expected at a
sensitivity based on an extrapolation of the current “(log N − log S)” diagram6 for
VHE Galactic sources (Fig. 10.53).
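The 1/√t_obs scaling gives a quick rule of thumb for survey planning. The helper below uses the 3 mCrab in 250 h figure quoted in the text as its reference point; treating that pair as exact is an assumption for illustration.

```python
# Sensitivity vs observation time: the flux limit F scales roughly as
# 1/sqrt(t_obs). Reference point (3 mCrab in 250 h) taken from the text.
def time_for_sensitivity(f_target_mcrab, f_ref_mcrab=3.0, t_ref_h=250.0):
    """Observation time [h] needed for flux limit f_target, given F ~ 1/sqrt(t)."""
    return t_ref_h * (f_ref_mcrab / f_target_mcrab) ** 2

print(time_for_sensitivity(1.5))   # halving the flux limit costs 4x the time
```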
All-sky VHE surveys are well suited to EAS arrays that observe the whole sky
with high duty cycles and large field of view. MILAGRO and the Tibet air shower
arrays have carried out a survey for sources in the Northern hemisphere down to an
average sensitivity of 600 mCrab above 1 TeV; HAWC has a sensitivity of 50 mCrab
in a year, at median energy around 1 TeV. EAS detectors like HAWC can then “guide”
the CTA. A combination of CTA and the EAS can reach sensitivities better than 30
6 The number of sources as a function of flux, “(log N − log S)”, is an important tool for describing
and investigating the properties of various types of source populations. It is defined as the cumulative
distribution of the number of sources brighter than a given flux density S, and it obeys some
regularity properties.
mCrab in large parts of the extragalactic sky. The survey could be correlated with
maps obtained by UHE cosmic ray and high-energy neutrino experiments.
Cosmic rays at ultrahigh energies are messengers from the extreme Universe and a
unique opportunity to study particle physics at energies well above those reachable
at the LHC. However, their limited flux and their indirect detection have not yet
allowed us to answer the basic, ever-present questions: Where are they coming
from? What is their nature? How do they interact?
Under the commonly accepted assumptions of a finite horizon (due to a GZK-like
interaction) and tiny extragalactic magnetic fields, it is expected that the number of
possible sources is relatively small, and thus some degree of anisotropy should
be found by studying the arrival directions of the most energetic cosmic rays. Such
searches have been performed extensively in recent years, either by looking for
correlations with catalogs of known astrophysical objects or by applying sophisti-
cated self-correlation algorithms at all angular scales.
We already discussed the hot spot in the direction of the Ursa Major constellation
claimed by the Telescope Array, and the small excess in the region of Cen A seen by
the Pierre Auger Observatory. Both experiments have submitted proposals for the
upgrade of their detectors and for the extension of their activity for a period of
around 10 years. The quest for the origin of the extreme high-energy cosmic rays is
still open.
The strong GZK-like suppression at the highest energies may be interpreted
assuming different CR composition and source scenarios. Indeed, both pure proton
and mixed composition scenarios are able to describe the observed features. In the
case of a pure proton scenario, the ankle would be explained by the opening, at that
energy, of the pair-production channel in the interaction of the incoming protons
with the CMB photons ( p γC M B → p e+ e− ) (this is called the dip model), while the
suppression at the highest energies would be described in terms of the predicted GZK
effect. In the case of mixed composition scenarios, such features may be described
by playing with different source distributions and injection spectra, assuming that the
maximum energy each nucleus may attain scales with its atomic number Z . An
example of such fits is given in Fig. 10.54, where recent Telescope Array and Pierre
Auger data are fitted, respectively, to a pure proton and a mixed composition scenario.
The solution of such a puzzle may only be found with the experimental determination
of the cosmic ray composition from detailed studies of the observed characteristics
of the extensive air showers.
The Xmax distribution measured by the Pierre Auger collaboration in the energy
bin 1018 –1018.5 eV for the 20 % most deeply penetrating showers is shown in
Fig. 10.55. It follows the expected shape, with a clear exponential tail. The selection
of the most deeply penetrating showers strongly enhances the proton content in
Fig. 10.54 Flux of UHE cosmic rays measured by the Telescope Array and by the Pierre Auger
Observatory. Superposed are fits assuming pure proton (light blue curve) and mixed composition
(red curve) scenarios. The curves below show the components resulting from the mixed mass
composition fit. From K-H Kampert, P. Tinyakov, http://arxiv.org/abs/1405.0575 (Color online)
the data sample, since protons penetrate deeper into the atmosphere than any other
nuclei.
The conversion of the exponential index of the distribution tail into a value of the
proton–air cross section is performed using detailed Monte Carlo simulations. The
conversion to the proton–proton total and inelastic cross sections is then done using
the Glauber model, which takes into account the multiple-scattering probability
inside the nuclei (Sect. 6.4.7). The Auger result is shown in Fig. 10.56 together with
accelerator data, namely with the recent LHC result, as well as with the expected
extrapolations of several phenomenological models. The various experimental results
lie on a smooth extrapolation line, which not only confirms the energy evolution of
the proton–proton cross section observed so far, but is also a strong confirmation that
at these energies there is an important fraction of protons in the cosmic ray “beam.”
Fig. 10.56 Comparison of the inelastic proton–proton cross section derived by the
Pierre Auger Observatory in the energy interval 1018 –1018.5 eV to phenomenologi-
cal model predictions and results from accelerator experiments at lower energies. From
P. Abreu et al., Phys. Rev. Lett. 109 (2012) 062002
Fig. 10.57 Energy evolution of the mean (left) and the RMS (right) of the Xmax distributions
measured by the Pierre Auger Observatory. The lines represent the expectations for pure proton
and iron composition for several simulation models which take into account the data obtained at
the LHC. From A. Aab et al., Phys. Rev. D90 (2014) 122005
The first (mean) and second (RMS) moments of the Xmax distributions measured
by the Pierre Auger collaboration as a function of the energy are shown in Fig. 10.57,
where the lines represent the expected values for pure proton (red) or pure iron (blue)
compositions for several hadronic models tuned with recent LHC data.
These results may indicate some evidence of a change of the cosmic ray composi-
tion from light elements (with a large fraction of protons) at lower energies to heavy
elements (with a large fraction of iron) at the highest energies, suggesting therefore
a scenario of exhaustion of the sources. However, a simple bimodal proton–iron sce-
nario does not fit the data of Fig. 10.57 well: as a matter of fact, none of the present
simulation models is able to reproduce well all the observed ultrahigh-energy data.
Features in the spectra of known particles, in the GeV–TeV range, could show up
if these particles originate in decays of exotic particles of very large mass, possibly
produced in the early Universe. Such long-lived heavy particles are predicted in many
models, and the energy distribution of particles coming from their decay should
be radically different from what is predicted by the standard emission models for
astrophysical sources.
Dark matter candidates (WIMPs in particular, as discussed in Chap. 8) are possible
sources of, e.g., photons, electrons and positrons, and neutrinos via a top-down
mechanism.
As discussed in Chap. 8, the relic density of DM can be expressed as

$$\Omega_\chi h^2 \simeq \frac{3 \times 10^{-27}\ \mathrm{cm^3\,s^{-1}}}{\langle \sigma_{ann} v \rangle}\,.$$
If one can trust the extrapolations of the dark matter density, one can predict the
expected annihilation signal when assuming a certain interaction rate ⟨σv⟩, or put
limits on the latter in the absence of a signal.
Matter and Antimatter. Dark matter particles annihilating or decaying in the halo
of the Milky Way could potentially produce an observable flux of cosmic positrons
and/or antiprotons, and thus be one possible explanation for the PAMELA anomaly,
i.e., the excess of positrons with respect to models accounting only for secondary
production (Fig. 10.10). Most DM annihilation or decay models can naturally reproduce
the observed rise of the positron fraction with energy, up to the mass of the DM can-
didate (or half the mass, depending on whether the self-annihilation or the decay
hypothesis is chosen). This flux is expected not to be directional.
The measured antiproton flux also shows unexpected features with respect to the
hypothesis of pure secondary production.
It is plausible that both the positron excess and the excess observed in the elec-
tron/positron yield with respect to current models (see Sect. 10.1.1) can be explained
by the presence of nearby sources, in particular pulsars, which have indeed been
copiously found by the Fermi-LAT (Sect. 10.3.1.1). AMS-02 is steadily increasing
the energy range over which positrons and electrons are measured, as well as the
statistics. If the positron excess originates from a few nearby pulsars, it would
probably give rise to an anisotropy in the arrival directions at the highest energies—
there is a tradeoff here between distance and energy, since synchrotron losses are
important; in addition, the energy spectrum should drop smoothly at the highest
energies. A sharp cutoff in the positron fraction would instead be the signature of a
DM origin of the positron excess, but the present data do not seem to support such
a scenario.
The excess of antiprotons is difficult to accommodate in this explanation.
Search in Photon Channels. We shall refer in the following, unless explicitly spec-
ified, to a scenario in which secondary photons are produced in the annihilation of
pairs of particles.
The expected flux of photons from dark matter annihilation can be expressed as
$$\frac{dN}{dE} = \underbrace{\frac{1}{4\pi}\,\frac{\langle \sigma_{ann} v \rangle}{2\,m_{DM}^2}\,\frac{dN_\gamma}{dE}}_{\text{Particle Physics}} \times \underbrace{\int_{\rm l.o.s.} dl(\Omega)\,\rho_{DM}^2}_{\text{Astrophysics}}\,. \qquad (10.31)$$
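Eq. (10.31) factorizes into a particle-physics term and an astrophysical line-of-sight integral (the "J-factor"). The sketch below evaluates it for purely illustrative inputs: a thermal-relic annihilation rate, a 100 GeV WIMP, an assumed photon yield per annihilation, and a J-factor of the order commonly quoted for dwarf spheroidal galaxies.

```python
# Order-of-magnitude sketch of Eq. (10.31) for WIMP annihilation.
# All numbers are illustrative assumptions, not measurements.
import math

sigma_v = 3e-26   # annihilation rate <sigma v> [cm^3 s^-1] (thermal relic)
m_dm = 100.0      # WIMP mass [GeV] (assumed)
n_gamma = 10.0    # integrated photons per annihilation (assumed)
J = 1e19          # line-of-sight integral of rho_DM^2 [GeV^2 cm^-5] (assumed)

# "Particle Physics" x "Astrophysics" factorization of Eq. (10.31)
pp_factor = sigma_v / (2 * m_dm**2) * n_gamma / (4 * math.pi)
flux = pp_factor * J   # photons cm^-2 s^-1
print(f"integrated flux ~ {flux:.1e} photons/cm^2/s")
```

The resulting flux, of order 10⁻¹¹ photons cm⁻² s⁻¹ for these inputs, illustrates why only the most sensitive gamma detectors can probe thermal-relic cross sections.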
DM-induced gamma rays can present sharp spectral signatures, for instance γγ
or Z γ annihilation lines, with energies strictly related to the WIMP mass. How-
ever, since the WIMP is electrically neutral, these processes are loop-suppressed and
therefore should be rare. WIMP-induced gamma rays are thus expected to be dom-
inated by a relatively featureless continuum of by-products of cascades and decays
(mostly from π 0 ) following the annihilation into pairs of quarks or leptons. The num-
ber of resulting gamma rays depends quadratically on the DM density along the line
of sight of the observer. This motivates searches toward targets where one expects DM
density enhancements. Among these targets are the Galactic center, galaxy clusters,
• Galactic center.
The GC is expected to be the brightest source of dark matter annihilation. How-
ever, the many astrophysical sources of gamma rays in that region complicate
the identification of DM. In the GeV region the situation is further complicated
by the presence of a highly structured and extremely bright diffuse gamma ray
background arising from the interaction of the pool of cosmic rays with dense
molecular material in the inner Galaxy. To limit these problems, searches for dark
matter annihilation/decay are usually performed in regions 0.3◦ –1◦ away from the
central black hole.
At TeV energies, Cherenkov telescopes detected a point source compatible with
the position of the supermassive black hole in the center of our Galaxy and a
diffuse emission coinciding with molecular material in the Galactic ridge. The
galactic center source has a featureless power law spectrum at TeV energies, with
an exponential cutoff at ∼10 TeV, not indicative of a dark matter scenario; the signal
is usually attributed to the supermassive black hole Sgr A* or to a pulsar wind
nebula in that region.
Searches have been performed for a signal from the galactic dark matter halo
close to the core; no signal has been found, and limits are comparable to those
from dwarf spheroidals. However, these limits are more model dependent, since
one must estimate the background.
There have been, however, two hints of a signal in the galactic center region. An
extended signal coinciding with the center of our Galaxy was reported above the
galactic diffuse emission; however, the interaction of freshly produced cosmic
rays with interstellar material is a likely explanation. The second claimed signal
was the indication of a photon line at ∼130 GeV in regions of interest around the
galactic center; we shall describe it in more detail in the next item.
• Line Searches.
The annihilation of WIMP pairs into γX would lead to monochromatic gamma
rays with E γ = m χ (1 − m 2X /4m 2χ ). Such a signal would provide a smoking gun,
since astrophysical sources could hardly produce it, in particular if it were found
in several locations. This process is expected to be loop-suppressed, being possible
only at O(α2 ).
The discovery of a hint of a signal at ∼130 GeV in the Fermi-LAT data, at a
significance slightly larger than 4σ, was recently claimed as originating from
regions near the GC. The search for the emission regions was driven by the data
rather than by astrophysical considerations.
The existence of this signal is strongly disputed, and some five more years of Fermi-
LAT data will be needed to clarify the situation; the Large Size Telescope
of CTA (Chap. 4) will also have the sensitivity to confirm or disprove the claim. The
planned Russian/Italian space mission Gamma-400, for gamma detection in the
MeV–GeV range, can significantly improve the energy resolution in the 10 GeV
region, which might be relevant for such line searches.
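The line-energy kinematics quoted above can be checked numerically. The masses below are illustrative: a hypothetical 150 GeV WIMP and the Z boson mass; for the γγ channel (m_X = 0) the line sits exactly at the WIMP mass.

```python
# Line energy E_gamma = m_chi * (1 - m_X^2 / (4 m_chi^2)) for chi chi -> X gamma.
# Illustrative masses: m_chi = 150 GeV (assumed), m_Z ~ 91.19 GeV.
def line_energy(m_chi, m_x):
    """Photon energy [GeV] of the chi chi -> X gamma annihilation line."""
    return m_chi * (1 - m_x**2 / (4 * m_chi**2))

print(line_energy(150.0, 91.19))   # Z gamma line, slightly below m_chi
print(line_energy(150.0, 0.0))     # gamma gamma line: exactly m_chi
```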
A summary of the present results is plotted in Fig. 10.58, together with extrapola-
tions to next-generation detectors.
Fig. 10.58 Comparison of current (solid lines) and projected (dashed lines) limits on the DM
annihilation interaction rate from different gamma ray searches as a function of WIMP mass. Left
Expected limits for CTA are shown for WIMP annihilation compared to limits from Fermi-LAT and
H.E.S.S. The points represent different choices of the parameters in SUSY. Right CTA is compared
with the potential from direct detection (COUPP) and accelerator experiments (ATLAS and CMS
at LHC). From S. Funk, arXiv:1310.2695
There is a unique region of the WIMP parameter space that CTA can best address
in the near future: the high-mass (∼1 TeV) scenario.
Neutrinos. Equation (10.31) holds for neutrinos as well, but the branching fractions
into neutrinos are expected to be smaller, since the radiative production
of neutrinos is negligible. In addition, experimental detection is more difficult.
However, the backgrounds are smaller than in the photon case.
Balancing the pros and the cons, gamma rays are the best investigation tool in
case the emission comes from a region transparent to photons.
However, neutrinos are the best tool in case DM is concentrated in the center
of massive objects, the Sun for example, which are opaque to gamma rays. Once
gravitationally captured by such massive objects, DM particles lose energy in the
interaction with nuclei and then settle into the core, where their densities and anni-
hilation rates can be greatly enhanced; only neutrinos (and axions) can escape these
dense objects. The centers of massive objects are among the places to look for a
possible neutrino excess from DM annihilation using neutrino telescopes. No signal
has been detected up to now (as in the case of axions from the Sun). A reliable pre-
diction of the sensitivity is difficult, depending on many uncertain parameters like
the annihilation cross section, the decay modes and the capture rate. The first two
uncertainties are common to the photon channels.
The variability of AGN in the VHE region can provide information about possible
violations of Lorentz invariance, which are expected, for example, in some QG models
in the form of a modified dispersion relation for light.
Lorentz invariance violation (LIV) at the n-th order in energy can be heuristically
incorporated in a perturbation to the relativistic Hamiltonian:
608 10 Messengers from the High-Energy Universe
$$E^2 \simeq m^2 c^4 + p^2 c^2 \left[1 - \xi_n \left(\frac{pc}{E_{\mathrm{LIV},n}}\right)^n\right], \qquad (10.32)$$
which implies that the speed of light (m = 0) could have an energy dependence.
From the expression v = ∂ E/∂ p, the modified dispersion relation of photons can
be expressed by the leading term of the Taylor series as an energy-dependent light
speed
$$v(E) = \frac{\partial E}{\partial p} \simeq c\left[1 - \xi_n\,\frac{n+1}{2}\left(\frac{E}{E_{\mathrm{LIV},n}}\right)^n\right], \qquad (10.33)$$
Here, t h is the arrival time of the high-energy photon, and t l is the arrival time of the
low-energy photon, with E h and El being the photon energies measured at Earth.
For small z, and at first order,

$$t(E) \simeq \frac{d}{c(E)} \simeq \frac{z\,c_0}{H_0\,c(E)} \simeq z\,T_H \left(1 + \xi_1\,\frac{E}{E_P}\right),$$

where T H = 1/H0 is the Hubble time.
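For n = 1 the induced delay grows linearly with photon energy and with distance. The sketch below evaluates it for illustrative inputs only: a Mkn 501-like source at z = 0.034, a 10 TeV photon, ξ₁ = 1 with the LIV scale set at the Planck energy E_P ≈ 1.22 × 10²⁸ eV, and H₀ = 70 km/s/Mpc.

```python
# Linear-LIV (n = 1) arrival delay: delta_t ~ z * T_H * xi_1 * E / E_P.
# All inputs below are illustrative assumptions for the sketch.
H0 = 70.0 / 3.0857e19   # Hubble constant [1/s] (70 km/s/Mpc)
T_H = 1.0 / H0          # Hubble time [s]

def liv_delay(z, E_eV, xi1=1.0, E_P=1.22e28):
    """Extra arrival delay [s] of a photon of energy E_eV from redshift z."""
    return z * T_H * xi1 * E_eV / E_P

print(f"delay ~ {liv_delay(0.034, 1e13):.1f} s")   # ~10 s scale for 10 TeV
```

Delays of order ten seconds for TeV photons are comparable to the time structure of AGN flares, which is why flaring blazars and GRBs are sensitive probes of E_LIV.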
Fig. 10.59 Integral flux of Mkn 501 detected by MAGIC in four different energy ranges. From
J. Albert et al., Phys. Lett. B668 (2008) 253
Lately, several GRBs observed by the Fermi satellite have been used to set more
stringent limits. A problem when setting limits is that one does not know whether
photon emission at the source is ordered in energy; thus one has to make hypotheses,
for example that QG effects can only increase the intrinsic dispersion.
The Fermi satellite derived strong upper limits at 95 % C.L. from the total degree
of dispersion, in the data of four GRBs:
Some experimental indications exist that the Universe might be more transparent to
gamma rays than computed in Sect. 10.4.2.
As discussed before, the existence of a soft photon background in the Universe
leads to a suppression of the observed flux of gamma rays from astrophysical sources
through the γγ → e+ e− pair-production process. Several models have been pro-
posed in the literature to estimate the spectral energy density (SED) of the soft
background (EBL); since they are based on suitable experimental evidence (e.g.,
deep galaxy counts), all models yield consistent results, so that the SED of the EBL
is fixed to a very good extent. Basically, the latter reproduces the SED of star-forming
galaxies, which is characterized by a visible/ultraviolet hump due to direct emission
from stars and by an infrared hump due to the emission from the star-heated warm
dust that typically hosts the sites of star formation.
However, the Universe looks more transparent than expected—this is called the
“EBL crisis.” Basically, two pieces of experimental evidence support this conjecture:
• When, for each SED of high-z blazars, the data points observed in the optically
thin regime (τ < 1) are used to fit the VHE spectrum in the optically thick region,
points at large attenuation are observed (Fig. 10.60). This violates the current EBL
models, which are strongly based on observations, at some 5σ.
• The energy dependence of the gamma opacity τ leads to appreciable modifications
of the observed source spectrum with respect to the spectrum at emission, due to the
exponential attenuation and the increase of τ with energy in the VHE gamma region.
One would naively expect the observed spectral index of blazars at VHE to increase
with distance: due to absorption, the SED of blazars should become steeper at
increasing distance. This phenomenon has not been observed (Fig. 10.61).
Among the possible explanations, photon mixing with axion-like particles
(ALPs), predicted by several extensions of the Standard Model (Sect. 8.4.2), could fix
the EBL crisis and restore consistency in the horizon calculation. Since ALPs are
characterized by a coupling to two photons, in the presence of an external magnetic
field B photon-ALP oscillations can show up. Photons are supposed to be emitted
by a blazar in the usual way; some of them can turn into ALPs, either in the emission
region or during their travel. Later, some of the produced ALPs can convert back
into photons (for example, in the Milky Way, which has a relatively large magnetic
field) and ultimately be detected. In empty space this would obviously produce a flux
dimming; remarkably enough, due to the EBL such a double conversion can make
the observed flux considerably larger than in the standard situation: in fact, ALPs do
not undergo EBL absorption (Fig. 10.62).
We concentrate now on the photon transition to ALPs in the intergalactic medium.
The probability of photon-ALP mixing depends on the value and on the structure of
the cosmic magnetic fields, which are largely unknown (see Sect. 10.4.1.1).
Both the strength and the correlation length of the cosmic magnetic fields
influence the calculation of the γ → a conversion probability. In the limit of low
conversion probability, if s is the size of the typical region, the average probability
Pγ→a of conversion in a region is
$$P_{\gamma\to a} \simeq 2 \times 10^{-3} \left(\frac{B_T}{1\ \mathrm{nG}}\;\frac{\lambda_B}{1\ \mathrm{Mpc}}\;\frac{g_{a\gamma\gamma}}{10^{-10}\ \mathrm{GeV}^{-1}}\right)^2, \qquad (10.35)$$
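In this low-conversion limit the probability scales as the square of the product of field strength, coherence length, and coupling. The helper below encodes this scaling with inputs expressed in units of the reference values quoted in the text; the chosen numbers are illustrative, and the placement of the square over the whole product is an assumption of this sketch.

```python
# Average photon-to-ALP conversion probability per magnetic-field domain,
# Eq. (10.35), in the low-conversion-probability limit.
# Inputs are in units of the reference values (1 nG, 1 Mpc, 1e-10 GeV^-1).
def p_gamma_to_alp(B_nG, lambda_Mpc, g_1e10_GeV):
    """P(gamma -> a) in one field domain; quadratic in B * lambda * g."""
    return 2e-3 * (B_nG * lambda_Mpc * g_1e10_GeV) ** 2

print(p_gamma_to_alp(1.0, 1.0, 1.0))   # reference configuration: 2e-3
print(p_gamma_to_alp(0.1, 1.0, 1.0))   # 10x weaker field: 100x smaller P
```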
Fig. 10.61 The observed values of the spectral index for all blazars detected in VHE; superimposed
is the predicted behavior of the observed spectral index from a source at constant intrinsic spectral
index within two different scenarios. In the first one (area between the two dotted lines) the index is
computed from EBL absorption; in the second (area between the two solid lines) it is evaluated
including also the photon-ALP oscillation. Original from A. de Angelis et al., Mon. Not. R. Astron.
Soc. 394 (2009) L21; updated
Fig. 10.62 Illustration of gamma ray propagation in the presence of oscillations between gamma
rays and axion-like particles. From M.A. Sanchez-Conde et al., Phys. Rev. D79 (2009) 123511
Another possible explanation for the hard spectra of distant blazars, requiring
however more fine-tuning, is that line-of-sight interactions of cosmic rays with the
CMB radiation and the EBL generate secondary gamma rays relatively close to the
observer.
A powerful tool to investigate Planck scale departures from Lorentz symmetry could
be provided by a possible change in the energy threshold of the pair production
process γV H E γ E B L → e+ e− of gamma rays from cosmological sources. This would
affect the optical depth, and thus, photon propagation.
In a collision between a soft photon of energy ε and a high-energy photon of
energy E, an electron–positron pair can be produced only if E is greater than the
threshold energy E th , which depends on ε and m 2e .
Note that the violation of Lorentz invariance also changes the optical depth.
Using a dispersion relation as in Eq. (10.32), one obtains, for n = 1 and an unmodified
law of energy–momentum conservation, that for a given soft-photon energy ε, the
process γγ → e+ e− is allowed only if E is greater than a certain threshold energy
E th which depends on ε and m 2e . At first order,

$$E_{th}\,\epsilon + \xi\,\frac{E_{th}^3}{8 E_P} \simeq m_e^2\,. \qquad (10.36)$$
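The modified threshold condition is a cubic in E_th and is easily solved numerically. The sketch below works in natural units (c = 1, energies in eV) and uses illustrative inputs: a far-infrared background photon of ε = 0.01 eV and ξ = 1 with E_P ≈ 1.22 × 10²⁸ eV.

```python
# Shift of the pair-production threshold under linear LIV (Eq. 10.36):
# eps * E_th + xi * E_th^3 / (8 E_P) = m_e^2  (natural units, energies in eV).
# Inputs (eps, xi) below are illustrative assumptions.
M_E2 = (0.511e6) ** 2   # electron mass squared [eV^2]
E_P = 1.22e28           # Planck energy [eV]

def threshold(eps, xi=1.0):
    """Solve eps*E + xi*E^3/(8*E_P) = m_e^2 for E by bisection."""
    lo, hi = 0.0, M_E2 / eps   # for xi > 0 the LIV term can only lower E_th
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if eps * mid + xi * mid**3 / (8 * E_P) < M_E2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

E_std = M_E2 / 0.01     # standard (LIV-free) threshold, ~26 TeV
E_liv = threshold(0.01)
print(f"standard: {E_std:.2e} eV, with LIV (xi=1): {E_liv:.2e} eV")
```

Even with a Planck-scale E_P, the cubic term is comparable to m_e² at tens of TeV, which is why VHE absorption on the EBL is sensitive to this class of models.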
Further Reading
Exercises
How come that human beings are here on Earth today? How did
the laws of physics make it possible that intelligent life evolved?
Are we unique, or are we just one of many intelligent species
populating the Universe? It is likely that in the vastness of the
Universe we humans do not stand alone; in the near future we
might be within reach of other civilizations, and we must
understand how to identify them, and how to communicate with
them. At the basis of all this is understanding what life is, and
how it emerged on Earth and maybe elsewhere. The answer to
these questions is written in the language of physics.
The ultimate quest of astrophysics is probably to understand the role of human beings in the Universe, and in this sense it converges with many other sciences.
Astrobiology is the study of the origin, evolution, distribution, and future of life in
the universe: both life on Earth and extraterrestrial life. This interdisciplinary field
encompasses the study of the origin of the materials forming living beings on Earth,
search for habitable environments outside Earth, and studies of the potential for ter-
restrial forms of life to adapt to challenges on Earth and in outer space. Astrobiology
also addresses the question of how humans can detect extraterrestrial life if it exists,
and how we can communicate with aliens. This relatively new field of science is
a focus of a growing number of NASA and European Space Agency exploration
missions in the solar system, as well as searches for extraterrestrial planets which
might host life.
One of the main goals of astrobiology is to understand whether we are unique, or just one of many intelligent species populating the Universe. The most
important discovery of all in astrophysics would probably be to communicate with
different beings: this would enrich us and probably change completely our vision of
ourselves and of the Universe. But the question of life and of its meaning is central
also in many other sciences, from biology to philosophy. In particular, biology seeks to answer many questions, such as how life was born from non-living material (abiogenesis), a question that has been central since Aristotle. We are convinced
that humans will soon be able to generate life from nonliving materials—and this will
© Springer-Verlag Italia 2015 617
A. De Angelis and M.J.M. Pimenta, Introduction to Particle
and Astroparticle Physics, Undergraduate Lecture Notes in Physics,
DOI 10.1007/978-88-470-2688-9_11
618 11 Astrobiology and the Relation of Fundamental Physics to Life
probably be the most important discovery of all in biology, again changing radically
our vision of ourselves. This would probably help also in understanding our origin
as humans.
We shall see how astroparticle physics can help us in this research.
A proper definition of life, universally accepted, does not exist. We shall just try to
clarify some of the conditions under which we might say that a system is living, i.e.,
to formulate a description.
Some of the characteristics that most of us accept as defining a living being are listed below.
• Presence of a body, distinguishing an individual. This definition is sometimes
nontrivial (think for example of mushrooms, or of coral).
• Metabolism: Conversion of outside energy and materials into cellular components
(anabolism) and decomposition of organic material (catabolism). Living bodies
use energy to maintain internal organization (homeostasis), and the internal environment must be regulated to maintain characteristics different from the “external”
environment.
• Growth: At least in a large part of life, anabolism is larger than catabolism, and
growing organisms increase in size.
• Adaptation: Living beings change in response to the environment. This is funda-
mental to the process of evolution and is influenced by the organism’s heredity, as
well as by external factors.
• Response to stimuli: these can range from the contraction of a unicellular organism exposed to external chemicals, to complex reactions involving all the senses of multicellular organisms. Often the response generates motion—e.g., the leaves of a plant turning toward the sun (phototropism).
• Reproduction: the ability to produce new individual organisms. Clearly not every-
thing that replicates is alive: in fact computers can replicate files; some machines
can replicate themselves, but we cannot say that they are alive; on the other hand,
some animals have no reproductive ability, such as most bees—reproduction
has to be considered at the level of species rather than of individuals.
The above “physiological functions” have underlying physical and chemical
bases. The living organisms we know have a body that is based on carbon: the
molecules needed to form and operate the cells are made of carbon. But, why car-
bon? One reason is that carbon allows the lowest energy chemical bonds, and is a
particularly versatile chemical element that can be bound to as many as four atoms
at a time.
However, we can think of different elements. If we ask for a material which
can allow the formation of complex structures, tetravalent elements (carbon, silicon,
germanium, ...) are favored. The tetravalent elements heavier than silicon are heavier
11.1 What is Life? 619
than iron, hence they can come only from supernova explosions, and are thus very
rare; we are thus left with silicon as the only candidate, other than carbon, for a life similar to ours. Like carbon, silicon can create molecules large enough to carry
biological information; it is however less abundant than carbon in the Universe.
Silicon has an additional drawback with respect to carbon: since silicon atoms are much bigger than carbon atoms, having a larger mass and atomic radius, they have difficulty forming double bonds. This limits the chemical versatility required for
metabolism.
A reassuring thought about silicon-based aliens is that, in case of invasion, they would rather eat our buildings than us...
In this section we will analyze what life needed and needs to develop on Earth, and the factors that influence it.
Liquid water is fundamental for life as we know it: it is very important because it
is used as a solvent for many chemical reactions. On Earth, we have the perfect
temperature to maintain water in liquid state, and one of the main reasons is the
obliquity of Earth with respect to the ecliptic plane at about 23◦ , which allows
seasonal changes.
Water can exchange organisms and substances with Earth, thanks to tides. The
Moon is mostly responsible for the tides: the Moon’s gravitational pull on the near
side of the Earth is stronger than on the far side, and this difference causes tides
(Fig. 11.1). The Moon orbits the Earth in the same direction as the Earth spins on its
axis, so it takes about 24 h and 50 min for the Moon to return to the same location
with respect to the Earth. In this time it passes overhead once and underfoot once, producing two tides. The Sun contributes to Earth's tides as well: although its gravitational force is much stronger than the Moon's, the solar tides are less than half those produced by the Moon. This is because the tides are caused by the
difference in the gravity field, and the Earth’s diameter is such a small fraction of the
Sun–Earth distance that the gravitational force from the Sun changes only by a factor
of 1.00017 across the Earth (see the first exercise). Tides are important because many
biological organisms have biological cycles based on them, and if the Moon did not
exist these types of cycles might not have arisen.
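The 1.00017 factor quoted above, and the statement that solar tides are less than half the lunar ones, can be checked in a few lines (a sketch; the astronomical values are standard data, not taken from the text):

```python
# Back-of-envelope check of the quoted ~1.00017 variation of the Sun's
# pull across the Earth, and of the Sun/Moon tidal-force ratio.
M_SUN, D_SUN = 1.989e30, 1.496e11    # mass (kg), distance (m)
M_MOON, D_MOON = 7.342e22, 3.844e8
R_EARTH = 6.371e6                    # Earth's radius (m)

def near_far_ratio(d, r=R_EARTH):
    """Ratio of gravitational pull on the near side vs the far side."""
    return ((d + r) / (d - r)) ** 2

print(near_far_ratio(D_SUN))    # ~1.00017: a tiny relative change
print(near_far_ratio(D_MOON))   # ~7%: the Moon dominates the tides

# The tidal (differential) acceleration scales as M/d^3; Sun/Moon ratio:
tide_ratio = (M_SUN / D_SUN**3) / (M_MOON / D_MOON**3)
print(tide_ratio)               # ~0.46: solar tides < half the lunar ones
```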
But, how did Earth come to possess water? The oceans of the early Earth probably resulted from several factors: first of all, volcanoes released gases and water vapor into the atmosphere, which condensed to form the oceans. Nevertheless, vapor from the volcanoes is sterile and no organisms can actually live in it: for this reason, many scientists think that some liquid water, carrying the seeds of life, may have been brought to Earth by comets and meteorites. The problem of how and where the water was generated on these bodies is not yet solved; it is, however, known that they carry water.
Life on Earth is based on more than 20 elements, but just 4 of them (i.e., oxygen,
carbon, hydrogen, and nitrogen) make up 96 % of the mass of living cells (Fig. 11.2).
Non-water Solvents. An extraterrestrial life-form, however, might develop and use
a solvent other than water, like ammonia, sulfuric acid, formamide, hydrocarbons,
and (at temperatures lower than Earth’s) liquid nitrogen.
Water has many properties important for life: it is liquid over a large range of temperatures; it has a high heat capacity, and thus it can help regulate temperature; it has a large heat of vaporization; and it is a good solvent. Water is also amphoteric, i.e., it can donate and accept an H+ ion, and act as an acid or as a base—this is important for facilitating many organic and biochemical reactions in water. In addition, water has the uncommon property of being less dense as a solid (ice) than as a liquid: bodies of water thus freeze from the top, covered by a layer of ice which insulates the water below from the external environment (fish in frozen lakes swim at a temperature of 4 ◦C, the temperature of maximum density of water).
Ammonia (NH3 ) is the best candidate to host life after water, being abundant in
the Universe. Liquid ammonia is chemically similar to water, and numerous chemical
reactions are possible in a solution of ammonia, which like water is a good solvent
for most organic molecules, and in addition is capable of dissolving many elemental
metals. Ammonia is amphoteric; it is, however, flammable in oxygen, which could create
problems for aerobic metabolism as we know it.
As the melting and boiling points of ammonia at normal pressure are between 195
and 240 K, a biosphere based on ammonia could exist at temperatures and air pres-
sures extremely unusual in relation to life on Earth. Chemical reactions being in general slower at low temperatures, ammonia-based life, if it exists, would metabolize more
slowly and evolve more slowly than life on Earth. On the other hand, lower tem-
peratures might allow the development of living systems based on chemical species
unstable at our temperatures. To be liquid at temperatures similar to the ones on
Earth, ammonia needs high pressures: at 60 bar it melts at 196 K and boils at 371 K,
more or less like water.
Since ammonia and ammonia–water mixtures remain liquid at temperatures far
below the freezing point of water, they might be suitable for biochemistry on planets and
moons that orbit outside of the “zone of habitability”. This might apply, for example,
to Titan, the largest moon of Saturn.
A key ingredient affecting the development of life on our planet is temperature. One
may think that the temperature on Earth is appropriate for liquid water because of the
Earth’s distance from the Sun; this is not true, because, for example, the Moon lies at
the same distance from the Sun but its temperature, during the day, is 125 ◦C—and
at night, −155 ◦C. The main reasons why the Earth has its current temperature are interior heating and the greenhouse effect.
The greenhouse effect slows down the infrared light’s return to space: instrumental
to this process are gases, like water vapor (H2 O), carbon dioxide (CO2 ), and methane
(CH4 ), that are present in the atmosphere. They absorb infrared radiation and sub-
sequently they release a new infrared photon. This latter photon can be absorbed by
another greenhouse molecule, so the process may be repeated on and on: the result
is that these gases tend to trap the infrared radiation in the lower atmosphere. More-
over, molecular motions contribute to heat the air, so both the low atmosphere and
the ground get warmer (Fig. 11.3).
If the greenhouse effect did not take place, the average temperature on our planet
would be −18 ◦C. A discriminating factor is the level of CO2 in the atmosphere: on Earth most of the carbon dioxide is locked up in carbonate rocks, and only 19 % is diffused in the atmosphere. This prevents the temperature from getting too hot, as it does on Venus, where CO2 is mostly distributed in the atmosphere and the temperature is hotter than on Mercury.
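The −18 ◦C figure quoted above follows from a standard radiative-equilibrium estimate (a sketch; the solar constant and albedo are standard assumed values, not taken from the text):

```python
# With no greenhouse effect, absorbed sunlight S*(1-a)*pi*R^2 balances
# thermal emission 4*pi*R^2*sigma*T^4, so the R's cancel and T follows.
SIGMA = 5.670e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)
S = 1361.0         # solar constant at Earth (W m^-2), assumed
ALBEDO = 0.30      # Earth's Bond albedo, assumed

T_eq = (S * (1 - ALBEDO) / (4 * SIGMA)) ** 0.25
print(T_eq - 273.15)   # ~ -18.6 C: the temperature without greenhouse gases
```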
Life on our planet can develop because the atmosphere and the Earth's magnetic field protect us from the high-energy particles and radiation coming from space. Cosmic rays are mostly degraded by their interaction with the atmosphere, which emerged in the Earth's first 500 million years from the vapor and gases expelled during the degassing of the planet's interior. Most of the gases of the atmosphere
are thus the result of volcanic activity. In the early times, the Earth’s atmosphere
was composed of nitrogen and traces of CO2 (<0.1 %), and very little molecular
Fig. 11.3 Greenhouse gases trap and keep most of the infrared radiation in the low atmosphere.
Source NASA
oxygen (O2 , which is now 21 %); the oxygen currently contained in the atmosphere
increased as the result of photosynthesis by living organisms.
High-energy cosmic rays are not the only danger: the charged particles coming from the Sun (the solar wind), and some of the Sun's radiation, can also be dangerous for life; for example, UV rays can damage proteins and DNA, destroying the molecules that produce folic acid. Such a deficiency can cause (in up to 75 % of cases) spina bifida in newborns. The ozone (O3 ) layer in the upper atmosphere
acts as a natural shield for UV rays, absorbing most of them.
The magnetic field of the Earth generates the magnetosphere that protects us from
the lower energy cosmic rays that travel in the Galaxy (Fig. 11.4), in particular from
the solar wind; the associated amount of energy would destroy life on our planet if there were no magnetosphere, which traps and confines these particles.
Some of the cosmic rays are trapped in the Van Allen belts. The Van Allen belts
were discovered in the late 1950s when Geiger counters were put on satellites. They
consist of two main donut-shaped clouds.
• The outer belt is approximately toroidal, and it extends from an altitude of about
three to ten Earth radii above the Earth’s surface (most particles are around 4 to 5
Earth radii). It consists mainly of high-energy (0.1–10 MeV) electrons trapped by
the Earth’s magnetosphere.
• Electrons inhabit both belts; high-energy protons characterize the inner Van Allen
belt, which goes typically from 0.2 to 2 Earth radii (1000–10000 km) above the
Earth. When solar activity is particularly strong or in a region called the South
Fig. 11.4 The Earth’s magnetic field and the Van Allen belts. From http://www.redorbit.com/
Atlantic Anomaly,1 the inner boundary goes down to roughly 200 km above sea
level. Energetic protons with energies up to 100 MeV and above are trapped by the
strong magnetic fields in the region. The inner belt is a severe radiation hazard to
astronauts working in Earth orbit, and to some scientific instruments on satellites.
Close to the poles, charged particles trapped in the Earth's magnetic field can reach the atmosphere, producing photons: this phenomenon is called Aurora Borealis at the North Pole, and Aurora Australis at the South Pole (Fig. 11.4).
To understand what life is, and how to find life in the Universe, we can examine the
most extreme living forms we know. We shall use this to define a habitable region—
i.e., a region fulfilling a set of conditions under which we know life might occur. It
is obviously not excluded that the actual conditions for life are wider.
Thanks to homeostasis, there exist on Earth organisms, called extremophiles, that can survive in extreme environments, such as:
• hot and cold places;
• salty and dry environments;
• acidic and basic places;
• environments of extreme pressure and radiation.
Let us analyze how such organisms can survive in these places.
• Hot and cold environments
Examples of hot places are volcanoes in the deep oceans: there the temperature can range from 50 to 80 ◦C, and some organisms, called hyperthermophiles, evolved their proteins and membranes to resist such high temperatures. An example of these
1 The non-concentricity of the Earth and its magnetic dipole causes the magnetic field to be weakest
in a region between South America and the south Atlantic; the solar wind can penetrate this region.
For thousands of years philosophers, scientists, and theologians have argued over how life can come from non-life. In the interpretation of St. Augustine, too, life came from non-living forms, although this biogenic process was mediated by God: “And God
said, let the Earth bring forth the living creature after his kind, cattle, and creeping
thing, and beast of the Earth after his kind: and it was so.” Thus, God transferred
to the Earth special life-giving powers, and using these powers the Earth generated
plants and animals: “The Earth is said then to have produced grass and trees, that is,
to have received the power of producing”.
A first step in the search for life in the solar system is to define a “habitable zone”: the region where the temperature allows (or allowed) the presence of liquid water, where there is an atmosphere, and where other appropriate conditions apply.
11.2 Life on the Solar System, Outside Earth 627
Extremophiles suggest that life can exist over a wide range of conditions, and that there is no universal definition of habitability that suits every organism.
The habitable zone (Fig. 11.5) lies likely between Venus and Mars. This zone is
not fixed because planets change their internal structure and conditions: they can get
hotter or colder, and so they may not be forever habitable. Mercury, the first planet
from the Sun—just 58 million km away—has a temperature ranging from 457 ◦C in the day to −173 ◦C at night, which does not allow the presence of liquid water or any stable chemistry; moreover, it has no atmosphere, and it is thus exposed to meteoric and cometary impacts.
The giant planets Jupiter and Saturn, in the outer Solar System, having respectively
a mass of 318 and 95 times the Earth’s mass, seem also a very unlikely place for
life. Jupiter, for example, is composed primarily of hydrogen and helium, plus small
amounts of sulfur, ammonia, oxygen, and water. Temperatures and pressures are
extreme. Jupiter does not have a solid surface either, and gravity would drag a solid body into zones of high pressure. Saturn's atmospheric environment is also unfriendly due
to strong gravity, high pressure, strong winds, and cold temperatures. Some of the
moons of Jupiter and Saturn, however, can be thought of as possible hosts of life.
Finally, the planets external to Saturn are too cold to be life-friendly.
In this section, we will discuss the possibility that conditions for life may exist on other planets of the solar system, close to the habitability zone just defined.
11.2.1.1 Venus
Venus’ structure and mass are very similar to the Earth’s. However, although Venus,
unlike Mercury, has an atmosphere, carbon- and water-based life cannot develop on Venus. The main problem is the high temperature of more than 400 ◦C, due to the greenhouse effect. This effect is particularly strong on Venus because of volcanic activity that fills the atmosphere with a large amount of gases. Pressure too is very
high (∼90 atmospheres), a condition that on Earth can be found only in the deepest
oceans.
11.2.1.2 Mars
Mars orbits at approximately 228 million km from the Sun, and its mass is 11 % of the Earth's (Fig. 11.6). Its atmosphere was originally similar to Venus' and Earth's
(early) atmospheres, due to similar conditions during their formation.
Mars has always been one of the best candidates for extraterrestrial life: a long time ago, Mars was probably warmer, it probably had liquid water (on its surface we recognize structures which can be attributed to past rivers, as shown in Fig. 11.6), and it must have had a deep atmosphere with gases produced by volcanic activity. But things have changed: volcanic activity stopped, and Mars quickly lost its internal heat (due to its small mass) and most of its atmosphere. Mars was no longer protected from cosmic radiation and particles, and it began to cool down. This process led to its current conditions: no liquid but frozen water, and temperatures hostile to life (summer: 27 to −130 ◦C; winter: down to −143 ◦C at the poles). In 1976, two space probes (the Vikings) landed on the surface to search for evidence of life, but found none.
Fig. 11.6 Left Mars and Earth sizes. http://space-facts.com/mars-characteristics/. Right a structure
on Mars’ surface that can be related to the presence of ancient rivers. Source NASA
In July 2008, laboratory tests aboard NASA’s Phoenix Mars Lander identified frozen
water in a soil sample.
In this section, we will examine the particularities of three moons within the solar
system: Europa (a satellite of Jupiter); and Titan and Enceladus (satellites of Saturn).
11.2.2.1 Europa
Jupiter’s four main satellites are Io, Europa, Callisto, and Ganymede (the Galilean moons). Some of them may have habitats capable of sustaining life: heated subsurface
oceans of water may exist deep under the crusts of the three outer moons—Europa,
Ganymede, and Callisto. The planned JUICE mission will study the habitability of
these moons; Europa is seen as the main target.
It is the smallest of the four, having roughly the same size as our Moon. Its
temperature reaches −160 ◦C. At such temperatures there is no liquid water, but what
makes Europa so fascinating is hidden under its frozen surface: planetary geologists
found out that only the oldest cracks appear to have drifted across the surface, which rotates at a different rate with respect to the interior, probably due to an underlying, 50-km-thick ocean layer of liquid water, methane, and ammonia. Figure 11.7 shows
the hypothetical structure of Europa.
Fig. 11.8 Left Titan, a satellite of Saturn. Right Detail of Titan: note the presence of lakes on its
surface. Credits: NASA (Cassini)
11.2.2.2 Titan
Titan (Fig. 11.8, left) is the largest moon of Saturn. With a diameter of about 5150 km it is bigger than Mercury, but has less than half its mass. Even though only a weak envelope of gas surrounds it, Titan retains an atmosphere because it is situated in one of the coldest regions of the solar system. With a pressure of 1.5 atmospheres and a temperature of −170 ◦C, Titan can host solid, gaseous, and liquid methane: the Cassini space probe (launched in 1997) captured evidence of a giant methane lake, the Kraken Mare (Fig. 11.8, right), which has a surface of about 400,000 km2 .
11.2.2.3 Enceladus
living on Earth could live in Enceladus's geysers. In view of the relatively accessible distance of Saturn's satellites, it is conceivable to think of a return space mission.
In the previous section, we saw how difficult it is to find life on the other planets and moons of the solar system, since they hardly have the characteristics that life based on liquid water and carbon needs. But, what about the rest of the Galaxy? Are we alone?
Our Galaxy is about 30 kpc across and contains about 4 × 10¹¹ stars, most with a planetary system: it seems impossible that we represent the only form of life.
N T = R × f p × n E × fl × f i × f c × L ,
where:
• R is the yearly rate at which suitable stars are born;
• f p is the fraction of stars with a planetary system;
• n E is the number of Earth-like planets per planetary system;
• fl is the fraction of those Earth-like planets where life can develop;
• f i is the fraction of these planets on which intelligent life can develop;
• f c is the fraction of planets with intelligent life that could develop technology;
• L is the lifetime of a civilization with communicating capability.
Let us examine each factor in it. We can distinguish among astronomical, plane-
tary, and biological factors.
Astronomical Factors. The astronomical factors are the star formation rate R in our
Galaxy, and the fraction f p of stars which develop a planetary system. The star formation rate R is estimated to be 2–5 stars/yr. The current estimate of f p is about 0.2–0.5: thanks
to technological innovation in the search for extraterrestrial planets, we discovered
that a large fraction of stars have a planetary system.
Planetary Factors. The planetary factor of the equation is n E , which depends on the “habitable zone”, i.e., the zone of a planetary system where the temperature allows liquid water. In the solar system, the habitable zone (Fig. 11.5) lies between
Venus and Mars: the Earth is the only planet located in the solar system’s habitable
zone today. As we discussed before, this zone is not fixed because conditions change:
planets can get hotter or colder, and so they may not be forever habitable. We esti-
mate, based also on the recent results on searches for extrasolar planets, that n E ∼ 1.
Biological Factors. These are the most difficult to estimate, and the values we assume here are guesses: fl is the fraction of the planets where life can develop, f i the fraction of planets where intelligent life can develop, f c the fraction of intelligent species that can develop communication technology, and L the lifetime of a civilization. Even if it is very difficult to assign a range to these factors, because we do not know the probability of finding life based on liquid water, Drake estimated fl to range from 0.1 to 1; more recent studies suggest fl ∼ 1. As for the other factors: f i ∼ 0.01 − 1, f c ∼ 0.1 − 1, while L is estimated to range from 10³ to 10⁶ years, 10,000 years being a conservative estimate.
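Multiplying out the ranges quoted above (a sketch; the pairing of all low values against all high values is my own illustrative choice) gives an idea of how wide the resulting estimate is:

```python
# Drake-equation sketch using the factor ranges quoted in the text.
def drake(R, f_p, n_E, f_l, f_i, f_c, L):
    """Number of communicating civilizations in the Galaxy."""
    return R * f_p * n_E * f_l * f_i * f_c * L

low = drake(R=2, f_p=0.2, n_E=1, f_l=0.1, f_i=0.01, f_c=0.1, L=1e3)
high = drake(R=5, f_p=0.5, n_E=1, f_l=1.0, f_i=1.0, f_c=1.0, L=1e6)
print(low, high)   # from ~0.04 to ~2.5e6 civilizations
```

The spread of more than seven orders of magnitude shows why no firm conclusion can be drawn from the equation itself.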
The number of communicating civilizations in our Galaxy can be estimated to be:
As one can see, due to the large uncertainties one cannot exclude that the value is just
one (we know it has to be at least one). On the contrary, it is very unlikely that the
chance to have intelligent life in a galaxy like the Milky Way is smaller than 0.01,
which, given the fact that there are 10¹¹ galaxies in the Universe, makes it very likely
that life exists in some other galaxies. However, communication with these forms of
civilization is, at the present state of technology, unthinkable.
The Drake equation is based on the idea of development of life:
however we cannot give a definition of life, and we cannot rule out that some form
of life could be based, for example, on silicon instead of carbon, so all the factors in
Drake’s equation could take different values if we assume that life could also develop
in extreme conditions, for example where no liquid water exists.
One of the critical factors in Drake's equation is n E : its estimated value is influenced by the number of habitable planets that we have discovered per planetary system. Our estimate of n E is continuously changing thanks to the technological evolution that allows us to discover new extrasolar planets.
When scientists search for new habitable planets orbiting a star, they first want
to determine the position of the star's habitable zone, and to do that they study the radiation emitted by the star: bigger stars are hotter than the Sun and so their habitable zone is farther out, while the habitable zone of smaller stars is tighter.
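The scaling of the habitable zone with stellar luminosity can be sketched by flux equivalence (an assumption of this illustration, not a formula from the text): a planet receives Earth-like irradiation where L/(4πd²) equals the solar flux at 1 AU, i.e. d ~ sqrt(L/L_sun) AU.

```python
# Flux-equivalence sketch of the habitable-zone distance.
def hz_distance_au(lum_in_solar_units):
    """Distance (AU) at which a star of given luminosity gives Earth's flux."""
    return lum_in_solar_units ** 0.5

print(hz_distance_au(1.0))    # Sun-like star: 1 AU
print(hz_distance_au(0.04))   # dim M-dwarf: 0.2 AU, a much tighter zone
print(hz_distance_au(25.0))   # hot, bright star: 5 AU, farther out
```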
11.3 Search for Life Outside the Solar System 633
Planets are very small and dark compared to the stars, and so how can they be
detected? If scientists cannot look at the planets, they study the stars and the effects
that orbiting planets have on them.
There are several methods to detect planets orbiting a star. Three of the main methods are listed below.
• Doppler spectroscopy. This method is the most effective to detect extrasolar plan-
ets. It relies on the fact that a star moves, responding to the gravitational force of
the planet. These movements affect the starlight spectrum, via a periodic Doppler
shift of the wavelengths of the emission.
• Transit photometry. With this method scientists can detect planets by measuring
the dimming of the star as the planet that orbits it passes between the star and the
observer on the Earth: if this dimming is periodic, and it lasts a fixed length of
time, likely there is a planet orbiting the star.
• Microlensing. This is the method to detect planets at the largest distances from the
Earth. The gravitational field of a host star acts like a lens, magnifying the light of
a distant background star. This effect occurs only when the two stars are aligned. If
the foreground lensing star hosts a planet, then that planet’s own gravitational field
can contribute in an appreciable way to the lensing (Fig. 11.9). Since such a precise
alignment is not very likely, a large number of distant stars must be monitored in
order to detect such an effect. The galactic center region has a large number of stars,
and thus this method is effective for planets lying between Earth and the center of
the galaxy.
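Orders of magnitude for two of the methods above can be sketched with standard formulas (the planetary data are textbook values, assumed here rather than taken from the text):

```python
# Transit photometry: the fractional dimming is ~ (R_planet / R_star)^2.
R_SUN, R_EARTH, R_JUP = 6.957e8, 6.371e6, 6.991e7   # radii (m)

depth_earth = (R_EARTH / R_SUN) ** 2
depth_jup = (R_JUP / R_SUN) ** 2
print(depth_earth)   # ~8.4e-5: an Earth dims a Sun-like star by ~0.008%
print(depth_jup)     # ~1e-2: a Jupiter produces a far easier ~1% dip

# Doppler spectroscopy: the star's reflex speed around the common centre
# of mass is v_star ~ v_planet * (M_planet / M_star).
M_SUN, M_JUP = 1.989e30, 1.898e27   # masses (kg)
V_JUP = 1.31e4                      # Jupiter's orbital speed (m/s)
print(V_JUP * M_JUP / M_SUN)        # ~12.5 m/s reflex motion of the Sun
```

These numbers explain why giant planets close to their star were the first to be found: they produce both the deepest transits and the largest Doppler shifts.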
Looking for habitable planets, scientists want to find planets with mass, density, and composition similar to the Earth's: in large planets like Jupiter the gravitational force would be too strong for life, while too-small planets could never retain an atmosphere.
In Fig. 11.10 (left) we show the distance from their sun of the extrasolar planets
discovered up to now, and the energy flux of their host star; in Fig. 11.10 (right) we
can see how the research evolved over the years, and how, thanks to scientific and technological innovation, we can now find planets with mass similar to the Earth's.
Fig. 11.10 Left discovered extrasolar planets as a function of the distance from their sun, and the
stellar flux. The Earth, Venus and Mars are included as a reference. Credit: NASA. Right plot of
year of discovery of extrasolar planets versus their mass (normalized to Jupiter’s mass). Credit:
Wikimedia Commons
The first extrasolar planet (HD114762b) was discovered in 1989; its mass is about 10 times Jupiter's.
The most important mission to detect Earth-like planets outside our solar system
is nowadays the Kepler Mission; the spacecraft was launched on March 7th, 2009.
A photometer analyzed over 145,000 stars in the Cygnus, Lyra, and Draco constel-
lations, to detect a dimming of brightness which could be evidence of the existence of an orbiting planet.
The most important result of the Kepler mission is the discovery of the first Earth-like planet in a stellar habitable zone, Kepler-186f, in the constellation Cygnus—and more recently, of a possible Earth-like planet orbiting a Sun-like star: Kepler-452b.
11.3.2.1 Kepler-186f
On April 17, 2014, NASA announced the discovery of the first extrasolar Earth-like planet orbiting an M-dwarf, in the first habitable zone discovered outside our solar system: Kepler-186f (Fig. 11.11), in the constellation Cygnus, 500 lightyears from us. This planet was found by NASA's Kepler Space Telescope, and many of its characteristics, composition and mass, make it similar to our planet. M-dwarfs,
which have masses in the range of 0.1–0.5 solar masses, make up about 75 % of the
stars within our galaxy.
Kepler-186f has a period of revolution around its sun of 130 days, it is likely to be rocky, and it is the first newly discovered planet with dimensions similar to the Earth's: its radius is 1.1 times the Earth's, and its estimated mass is 0.32 times the Earth's. It receives from its star one third of the energy that Earth gets from the Sun, although it is much closer (just 0.36 astronomical units (AU), where 1 AU is the average distance between the Earth and the Sun): it is thus a cold planet, and it could not host human life.
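The quoted flux and distance are mutually consistent: the stellar flux at the planet, relative to Earth's, is (L/L_sun)/(d/AU)², so inverting with one third of Earth's flux at 0.36 AU gives the luminosity of Kepler-186 (an inferred estimate, not a number stated in the text):

```python
# Consistency check: infer the luminosity of Kepler-186 from the quoted
# relative flux (one third of Earth's) and orbital distance (0.36 AU).
d_au = 0.36
flux_rel = 1.0 / 3.0
lum_solar = flux_rel * d_au ** 2   # L/L_sun = flux_rel * (d/AU)^2
print(lum_solar)   # ~0.04 L_sun: the dim output expected of an M-dwarf
```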
Kepler-186f is located in a five-planet system; the other four planets in this system,
Kepler-186b, Kepler-186c, Kepler-186d, and Kepler-186e, orbit around their sun
Fig. 11.11 Kepler-186f and Kepler-452b in their solar systems: comparison with the habitable zone
of our Solar System. Credit: NASA
with periods of 4, 7, 13, and 22 days, respectively; they are too hot for life as we know it to develop.
11.3.2.2 Kepler-452b
On 23 July 2015, NASA announced the discovery of the first extrasolar Earth-like
planet (potentially rocky) within the habitable zone of a Sun-like star (a G star). At a
distance of 1400 ly from the Earth, and located in the constellation Cygnus, it has a
revolution period of 385 days. Its star is six billion years old, i.e., 1.5 billion years
older than our Sun; Kepler-452b receives a power close to the one we receive
from our Sun.
The similarities with the Earth are striking, and the future promises more candidates
possibly able to host carbon-based life.
Despite this abundance of candidate worlds, no communication from an alien
civilization has ever been detected. Enrico Fermi, around 1950, tried to give an
explanation to this lack of detection; his question is called the "Fermi paradox":
where is everybody? Indeed, given Drake's equation, we could expect that a contact
had already been established. Several possible answers have been suggested.
636 11 Astrobiology and the Relation of Fundamental Physics to Life
• We are alone. We have not received any signal simply because nobody sent one, and
life needs certain properties, such as liquid water, carbon, and the right temperature,
that we can find only on Earth. But this opinion is difficult to accept, also because we
cannot give a unique definition of life: life might evolve and develop also
in places where, for example, water can be found only in a frozen state.
• Civilizations able to communicate do not last for long. There are two
main reasons why a civilization can fall:
1. Cultural reasons: sufficiently evolved populations destroy themselves.
2. Natural reasons: catastrophic events, like meteorite or cometary impacts.
• Communicative extraterrestrial civilizations do exist, but they are too far away
from us. The Galaxy is so extended (30 kpc) that any signal could take up to a
hundred thousand years to travel from one planet to another, and in this time a
civilization could even become extinct.
• They do not want to communicate with us, maybe because they are afraid of our
possible reaction. If we knew there were civilizations weaker than us in our Galaxy,
would we attack them?
• We cannot understand their signals. All our attempts at communication are based
on electromagnetic waves, but maybe they have already sent us signals based on
neutrinos or on gravitational waves, which we are not yet able to detect.
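Drake's equation, invoked above, estimates the number N of communicative civilizations in the Galaxy as a product of seven factors. The sketch below is a toy evaluation; every parameter value is an illustrative guess, not a measurement or a figure from the text.

```python
# Drake's equation: N = R* * fp * ne * fl * fi * fc * L.
# All numerical inputs below are arbitrary illustrative choices.

def drake(R_star, f_p, n_e, f_l, f_i, f_c, L):
    """Expected number of communicative civilizations in the Galaxy."""
    return R_star * f_p * n_e * f_l * f_i * f_c * L

N = drake(R_star=1.0,  # rate of star formation, stars per year
          f_p=0.5,     # fraction of stars with planetary systems
          n_e=2.0,     # habitable planets per such system
          f_l=0.5,     # fraction of those on which life appears
          f_i=0.1,     # fraction of those developing intelligence
          f_c=0.1,     # fraction of those that try to communicate
          L=10_000.0)  # mean lifetime of the communicative phase, years
print(f"N = {N:.0f}")  # N = 50 with these inputs
```

With even mildly optimistic inputs like these, tens of communicative civilizations would coexist with us, which is what makes the paradox sharp.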
One of the main unknowns is how aliens could communicate with us, and how we
could receive and decrypt their signals. In this section we describe what kind of
signals we are trying to detect.
Let us first examine how far a signal can reach. First of all, we must know how
much energy a civilization can use for transmitting.
In 1964, Kardashev defined three levels of civilizations, based on the order of
magnitude of power available to them:
• Type 1. Technological level close to the level presently attained on Earth, with
energy consumption at 4 × 10¹² W (four orders of magnitude less than the total
solar insolation).
• Type 2. A civilization capable of harnessing the energy radiated by its own star (if
the host star is Sun-like, 4 × 10²⁶ W).
• Type 3. A civilization in possession of energy on the scale of its own galaxy (for
the Milky Way, about 4 × 10³⁷ W).
The above jumps might look too steep. Carl Sagan suggested defining intermediate
values by interpolating the values given above:

K = (log₁₀ P − 6) / 10

where K is a civilization's rating and P is the power it controls, in watts. Using this
interpolation, humanity's civilization in 2011 (average power 17.4 TW) rated
K ≈ 0.72.
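The interpolation above is easy to evaluate; a small sketch (function name ours) reproduces the K ≈ 0.72 rating quoted for 2011.

```python
from math import log10

# Sagan's interpolated Kardashev rating: K = (log10(P) - 6) / 10, P in watts.

def kardashev(P_watts):
    return (log10(P_watts) - 6.0) / 10.0

print(f"{kardashev(17.4e12):.2f}")  # 0.72, humanity in 2011
print(f"{kardashev(4e26):.1f}")     # 2.1, about a Type 2 civilization
```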
In general, the inverse square law for intensity applies: I = P/(4πd²), with
P the power of the signal. A rule of thumb for the distance that can be reached
with a radio signal (the most economical carrier within the electromagnetic
spectrum), using top technological devices at present technology (50 m dish size
both for transmitting and receiving, best gains), is:

d ≃ 1 kpc × √(P / 1 GW) . (11.1)
It seems thus difficult for a Type 1 civilization to reach beyond the scale of a galaxy
based on radio communication.
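Equation 11.1 turns into a one-line reach estimate; the sketch below assumes the 50 m dish scenario just described, with illustrative input powers.

```python
from math import sqrt

# Rule of thumb (Eq. 11.1): d ~ 1 kpc * sqrt(P / 1 GW), for 50 m dishes at
# both ends of the link.

def reach_kpc(P_watts):
    return sqrt(P_watts / 1e9)

print(f"{reach_kpc(1e9):.1f} kpc")   # 1.0 kpc for a 1 GW transmitter
print(f"{reach_kpc(4e12):.0f} kpc")  # 63 kpc: even the full Type 1 budget
                                     # barely spans a Milky-Way-sized galaxy
```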
11.3.4.1 SETI@home
The term SETI (search for extraterrestrial intelligence) refers to a number of activities
to search for intelligent extraterrestrial life. As already discussed, communicating across
space can be quite prohibitive, the main reason being cosmic distances. Receiving
the visit of a spacecraft is extremely unlikely, so SETI looks for radio waves that
might have been sent by extraterrestrial intelligent civilizations (radio waves, since
this is the most economical way of communicating from the energetic point of view).
SETI looks for "narrowband transmissions," which can be produced only by artificial
equipment; the problem is that such transmissions are very difficult to
single out from the many produced on Earth. Not even the world's biggest
supercomputers could manage the task of analyzing all this noise from the Universe.
So an Internet-based, public-volunteer computing project, SETI@home, was set up:
after the user downloads and installs the appropriate software on a personal computer,
the executable switches on when the computer is not in use, receives about 300 kB
of data over the Internet from the Arecibo radio telescope in Puerto Rico, and tries to find
regularities in these data.
The rationale of the search is that we expect the communication to be narrowband
and periodic; such signals can thus be isolated with Fourier analysis or autocorrelation
studies, tested at different wavelengths.
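The strategy above can be illustrated with a toy example: a weak narrowband tone buried in noise stands out as a sharp peak in the power spectrum. All sampling rates, frequencies, and amplitudes below are arbitrary choices for illustration, not SETI@home parameters.

```python
import numpy as np

# Toy narrowband search: bury a weak 100 Hz tone in unit-variance noise and
# recover it as the strongest peak of the power spectrum.

rng = np.random.default_rng(0)
fs = 1024.0                                   # sampling rate, Hz
t = np.arange(4096) / fs
tone = 0.2 * np.sin(2 * np.pi * 100.0 * t)    # weak narrowband signal
data = tone + rng.normal(0.0, 1.0, t.size)    # signal-to-noise well below 1

power = np.abs(np.fft.rfft(data))**2          # power spectrum
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
peak = freqs[np.argmax(power[1:]) + 1]        # skip the DC bin
print(f"strongest tone near {peak:.1f} Hz")   # recovers the 100 Hz carrier
```

The coherent gain of the Fourier transform concentrates the tone into a single frequency bin, while the noise stays spread over all bins; this is why narrowband carriers are such attractive search targets.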
If aliens ever sent us messages, the real problem is whether we can receive them or not,
distance being the main thing that could prevent signals from reaching us: as
seen in the previous subsection, the distance from which a telescope could detect
an extraterrestrial transmission depends on the sensitivity of the receiver and on the
strength and type of the signal.
Most of the SETI searches are based on radio waves, but it is possible that the aliens
would try to communicate using visible light, or other forms of energy/particles.
Some SETI efforts are indeed devoted to searching for such signals. Distant civilizations
might choose to communicate in other ways, for example with ultraviolet light or X-rays
(infrared light, in particular, has potential value because it can penetrate interstellar
dust), but all these forms of light are much more expensive in terms of energy cost.
Some people have suggested that extraterrestrial civilizations might use neutrinos or
gravitational waves, but the problem with these kinds of communication technologies
is that they might involve physics we are not yet able to manage. Some carriers of
information offer the possibility to beam the emission: lasers in the visible range, for
example. Using current technology available on Earth (10 m reflectors as the trans-
mitting and receiving apertures and a 4 MJ pulsed laser source), a 3 ns optical pulse
could be produced which would be detectable at a distance of 1000 ly, outshining
the starlight from the host system by a factor of 10⁴. It is not unlikely that our civilization
could reach the full Galaxy with a beamed signal, which means that we should be
able to choose our targets. In this case, Cherenkov telescopes, being equipped with
the largest mirrors, would be the ideal receivers for aliens to target. Indeed, some effort
has been made to look for extraterrestrial signals with Cherenkov telescopes, so far
with no success.
We also try to communicate with alien civilizations, hoping that they will decrypt
our signal and answer.
In this section, we outline how we try to communicate with technologically
advanced civilizations outside the Solar System. The most effective way possible for
our technology is to broadcast a radio signal; however, several other attempts were
made, in particular sending a "message in a bottle" (a handcrafted plaque onboard a
spacecraft).
11.3.5.1 Arecibo
The Arecibo message, sent into space by the Arecibo radio telescope in 1974, is
composed of 1679 symbols. This number is the product of two prime numbers:
1679 = 23 × 73. The message was aimed at the current location of the globular star
cluster M13, some 25,000 light-years away, because M13 is a large and close
collection of stars. Possible decoders from different galaxies are anyway welcome.
Decoding the binary message (Fig. 11.12), translating the digit 0 into a black
square and the digit 1 into a colored square, the result is a 23 × 73 matrix
(Fig. 11.13) that contains some messages about our world:
Fig. 11.12 The Arecibo message in binary form. Source Wikimedia Commons
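The choice of 1679 symbols is deliberate: since 1679 factors only as 23 × 73, a decoder has essentially just two rectangular grids to try. A quick check (the helper function is ours):

```python
# Verify that the Arecibo message length factors uniquely into two primes.

def prime_factors(n):
    """Prime factorization of n by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(prime_factors(1679))  # [23, 73]: only a 23 x 73 (or 73 x 23) grid works
```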
We now analyze the parts that compose the message in Fig. 11.13, from top to
bottom, set by set.
1. Reading from left to right, vertically from the bottom to the top, there are the
numbers from 1 to 10 written in binary form, where in each column the black
square at the bottom marks the beginning of the number: for example, the first
number on the left is 1 = 1 × 2⁰, i.e., 1 in decimal form;
then comes the binary number 10, which is 0 × 2⁰ + 1 × 2¹ = 2
in decimal form; then the binary number 11, which corresponds
to 1 × 2⁰ + 1 × 2¹ = 3, and so on. The numbers 8, 9, 10 are written on
two columns.
4. The helix structure of the DNA, with the central bar, read from the bottom to the
top (the lower black square on the left marks the beginning), representing
the number of nucleotides: the bar is the binary number
11111111111101111111101101011110, that is, 4294441822 in decimal form,
which was believed to be the number of nucleotides in human DNA in 1974, when
the message was sent; we now think that about 3.2 billion nucleotides form the DNA.
5. In the center, the figure of a man; on the left, the average height of a man, i.e.,
1.764 m, which is the product of 14, written in binary form in the horizontal line
(the black square on the left marks the beginning), times the wavelength of the
message (126 mm); on the right, the size of the human population in binary form,
to be read horizontally, where the upper black square on the left
marks the beginning: the number is 000011111111110111111011111111110110
(4292853750 in decimal form).
6. Our Solar System, where the Earth is offset and the human figure is shown "stand-
ing on" the figure of the Earth.
7. A drawing of the Arecibo Telescope, in Puerto Rico, with, below, the dimension
of the telescope, 306.18 m: this is the number 2430, written in binary form
(100101111110) in the two bottom rows, read horizontally (the black square on
the lower right of the central block marks the beginning of the number),
multiplied by the wavelength of the message.
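The binary values quoted in the items above can be checked directly; Python's `int(s, 2)` parses a bit string as an unsigned integer (the variable names are ours).

```python
# Decoding the numbers embedded in the Arecibo message, as read in the text.

nucleotides = int("11111111111101111111101101011110", 2)
population = int("000011111111110111111011111111110110", 2)
height = int("1110", 2) * 0.126        # 14 message wavelengths of 126 mm
dish = int("100101111110", 2) * 0.126  # 2430 wavelengths

print(nucleotides)        # 4294441822, the 1974 estimate
print(population)         # 4292853750
print(f"{height:.3f} m")  # 1.764 m, the encoded average height of a man
print(f"{dish:.2f} m")    # 306.18 m, the encoded dish size
```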
11.3.5.2 Pioneer 10
Pioneer 10 is an American space probe launched in 1972. A gold plaque (Fig. 11.14)
was placed onboard (and also onboard the subsequent mission, Pioneer 11), contain-
ing information about the space mission and mankind. The information that we can
find on the plaque is the following:
1. The hyperfine transition of neutral hydrogen, the most abundant element. The inter-
action between the proton and the electron magnetic dipole moments in the ground
state of neutral hydrogen results in a slight increase in energy when the spins are
parallel, and a decrease when they are antiparallel. The transition between the two
states causes the emission of a photon at a frequency of about 1420 MHz, which
corresponds to a period of about 7.04 × 10⁻¹⁰ s, and to a wavelength of ∼21 cm.
This is the key to reading the message.
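The unit system of the plaque follows from the hyperfine frequency alone; a quick check with rounded constants:

```python
# The Pioneer plaque "key": the 1420 MHz hydrogen line fixes the unit of time
# (its period) and the unit of length (its wavelength).

C = 2.998e8      # speed of light, m/s
F_HI = 1.420e9   # hyperfine transition frequency of hydrogen, Hz (rounded)

print(f"period: {1.0 / F_HI:.2e} s")           # 7.04e-10 s
print(f"wavelength: {100 * C / F_HI:.1f} cm")  # 21.1 cm
```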
2. The figures of a man and a woman; between the vertical bracket marks that
indicate the height of the woman, the number eight can be seen in binary form,
1000, where the vertical line means 1 and the horizontal lines mean 0: in units of
the wavelength of the hyperfine transition of hydrogen, the result is 8 × 21 cm
= 168 cm, which was at the time the average height of a woman. The right hand
of the man is raised as a sign of good will, and it may also be a way to show the
opposable thumb and how the limbs can be moved.
Fig. 11.14 The plaque onboard the Pioneer 10. Source Wikimedia Commons
3. The relative position of the Sun with respect to the center of the Galaxy and to
14 pulsars, with their periods: on the left, we can see 15 lines emanating from the
same origin; 14 of the lines carry long binary numbers, which indicate the periods
of the pulsars, using the hydrogen transition period as the unit. For example,
starting from the unlabeled line and heading clockwise, the first pulsar we find is
labeled by the binary number 1000110001111100100011011101010, which
corresponds to 1178486506 in decimal form: to find the period of this pulsar
we have to multiply this number by 7.04 × 10⁻¹⁰ s, the period of
the hyperfine transition of hydrogen.
The fifteenth line extends to the right, behind the human figures: it indicates the
Sun's distance from the center of the Galaxy.
4. The Solar System with the trajectory of Pioneer. In this section the distances of
the planets from the Sun are indicated, in units of one tenth of Mercury's distance
from the Sun: Mercury itself is labeled by the binary number 1010, that is, 10 in
decimal form. For example, the number associated with Saturn is 11110111, that
is, 247 in decimal form, which means that Saturn is 24.7 times farther from the
Sun than Mercury.
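The pulsar labels decode the same way; the sketch below reproduces the example in item 3 (the resulting period, about 0.83 s, is our own evaluation of the quoted numbers).

```python
# Decode one pulsar period from the Pioneer plaque: binary label times the
# hydrogen hyperfine period.

T_HI = 7.04e-10  # hyperfine transition period of hydrogen, s

label = int("1000110001111100100011011101010", 2)
print(label)                    # 1178486506
print(f"{label * T_HI:.3f} s")  # 0.830 s, this pulsar's period
```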
Given Eq. 11.1, the possibility of reaching a distance of 25,000 light-years is very
optimistic.
11.4 Conclusions
be tied down into staring at the same stars, instead monitoring 50 % of the sky in
eight 30-day positions and two longer 3-year fields. This might allow a hundred
Earth-like planets with potentially habitable temperatures to be discovered.
Scientists will possibly study with new telescopes stars of nearby galaxies, to
better estimate the number of communicative extraterrestrial civilizations. They will
listen to the sound of gravitational waves and neutrinos in the Universe: this will give
mankind the ability to detect signals at larger distances.
If all this happens, it will be easier to find evidence of extraterrestrial life; but
this will require innovation, investment, and perseverance.
Further Reading
Exercises
1. The mass of the Moon is about 1/81 of the Earth's mass, and the mass of the Sun
is 333,000 times the Earth's mass. The average Sun–Earth distance is 150 × 10⁶
km, while the average Moon–Earth distance is 0.38 × 10⁶ km (computed from
center to center).
(a) What is the ratio between the gravitational forces by the Moon and by the
Sun?
(b) What is the ratio between the tidal forces (i.e., between the differences of
the forces at two opposite sides of the Earth along the line joining the two
bodies)?
2. What is the maximum temperature for which the Earth could trap an atmosphere
containing molecular oxygen O₂?
3. Assuming that the Sun is a blackbody emitting at a temperature of 6000 K (approx-
imately the temperature of the photosphere), what is the temperature of the Earth
at equilibrium due to the radiation exchange with the Sun? Assume the Sun's radius
to be 7 × 10⁵ km, i.e., about 110 times the Earth's radius.
Appendix A
Periodic Table of the Elements
Source From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001