Lecture Notes in Physics

Editorial Board
R. Beig, Wien, Austria
W. Beiglböck, Heidelberg, Germany
W. Domcke, Garching, Germany
B.-G. Englert, Singapore
U. Frisch, Nice, France
P. Hänggi, Augsburg, Germany
G. Hasinger, Garching, Germany
K. Hepp, Zürich, Switzerland
W. Hillebrandt, Garching, Germany
D. Imboden, Zürich, Switzerland
R. L. Jaffe, Cambridge, MA, USA
R. Lipowsky, Potsdam, Germany
H. v. Löhneysen, Karlsruhe, Germany
I. Ojima, Kyoto, Japan
D. Sornette, Nice, France, and Zürich, Switzerland
S. Theisen, Potsdam, Germany
W. Weise, Garching, Germany
J. Wess, München, Germany
J. Zittartz, Köln, Germany
The Lecture Notes in Physics
The series Lecture Notes in Physics (LNP), founded in 1969, reports new developments
in physics research and teaching – quickly and informally, but with a high quality and
the explicit aim to summarize and communicate current knowledge in an accessible way.
Books published in this series are conceived as bridging material between advanced
graduate textbooks and the forefront of research and to serve three purposes:
• to be a compact and modern up-to-date source of reference on a well-defined topic
• to serve as an accessible introduction to the field to postgraduate students and
nonspecialist researchers from related areas
• to be a source of advanced teaching material for specialized seminars, courses and
schools
Both monographs and multi-author volumes will be considered for publication. Edited
volumes should, however, consist of a very limited number of contributions only.
Proceedings will not be considered for LNP.

Volumes published in LNP are disseminated both in print and in electronic formats, the
electronic archive being available at springerlink.com. The series content is indexed,
abstracted and referenced by many abstracting and information services, bibliographic
networks, subscription agencies, library networks, and consortia.

Proposals should be sent to a member of the Editorial Board, or directly to the managing
editor at Springer:

Christian Caron
Springer Heidelberg
Physics Editorial Department I
Tiergartenstrasse 17
69121 Heidelberg / Germany
[email protected]
M. Gasperini
J. Maharana (Eds.)

String Theory
and Fundamental
Interactions
Gabriele Veneziano and Theoretical Physics:
Historical and Contemporary Perspectives

Editors
Maurizio Gasperini
Università di Bari
Dipartimento di Fisica
Via G. Amendola, 173
70126 Bari, Italy
[email protected]

Jnan Maharana
Institute of Physics
Sachivalaya Marg
Bhubaneswar - 751 005
Orissa, India
[email protected]

M. Gasperini and J. Maharana (Eds.), String Theory and Fundamental Interactions,
Lect. Notes Phys. 737 (Springer, Berlin Heidelberg 2008), DOI 10.1007/978-3-540-74233-3

Library of Congress Control Number: 2007934340

ISSN 0075-8450
ISBN 978-3-540-74232-6 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from Springer. Violations
are liable for prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2008
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Typesetting: by the authors and Integra using a Springer LaTeX macro package
Cover design: eStudio Calamar S.L., F. Steinen-Broo, Pau/Girona, Spain
Printed on acid-free paper SPIN: 12070305 543210
To Gabriele
from his friends, with best wishes
Preface

This book has been prepared to celebrate the 65th birthday of Gabriele
Veneziano and his retirement from CERN in September 2007. This retirement
certainly will not mark the end of his extraordinary scientific career (in
particular, he will remain on the permanent staff of the Collège de France in
Paris), but we believe that this important step deserves a special celebration,
and an appropriate recognition of his monumental contribution to physics.
Our initial idea of preparing a volume of Selected papers of Professor
Gabriele Veneziano, possibly with some added commentary, was dismissed
when we realized that this format of book, very popular in former times, has
become redundant today because of the full “digitalization” of all important
physics journals, and their availability online in the electronic archives. We
have thus preferred an alternative (and unconventional, but probably more
effective) form of celebrating Gabriele’s birthday: a collection of new papers
written by his main collaborators and friends on the various aspects of
theoretical physics that have been the object of his research work, during his long
and fruitful career.
Selecting a reasonable number of invited contributors and contributed
topics has proved to be a very difficult task, given the impressive number of
distinguished collaborators (see the full list in the first chapter of this book), and
the exceptionally wide spectrum of research interests. After a careful analysis
of four decades of work, we have finally decided to invite only a few
representative contributions, trying to provide a survey of most of the many faces
of Gabriele’s activity, and to avoid, at the same time, too many overlaps and
too large gaps. We have been assisted in this process by Gabriele himself, but
we are responsible for any important omission, of course. We hope, however,
that the reader will appreciate the time (and space) limitations of this book,
since making a complete and detailed survey of all of Gabriele’s activities is
surely impossible.
The contributors have been invited to prepare high-level (but not overly
specialized) lectures on the assigned themes, with some introductory part
and, possibly, some historical perspective concerning their work with Gabriele.
We are very grateful to our colleagues and friends for having accepted our
invitation, and for their excellent scientific and pedagogic work:
Daniele Amati
Adi Armoni
Ram Brustein
Alessandra Buonanno
Marcello Ciafaloni
Thibault Damour
Paolo Di Vecchia
Sergio Ferrara
Alberto Giovannini
Massimo Giovannini
Kenichi Konishi
Giuseppe Marchesini
Krzysztof Meissner
Roberto Petronzio
Eliezer Rabinovici
Giancarlo Rossi
Hector Rubinstein
Adam Schwimmer
Mikhail Shifman
Graham Shore
Tomasz Taylor
Luca Trentadue
Henry Tye
Carlo Ungarelli
Gregory Vilkovisky
Miguel Virasoro
Should this book have any form of success and appreciation, the merit will
rest on their dedicated and enthusiastic work, and on the many hours of their
valuable time spent on the materialization of this project. We would also like
to thank Christian Caron, Senior Editor of Physics at Springer, for his kind
encouragement, advice, and for many important suggestions.
This book is divided into various parts. The introductory part is fully
devoted to Gabriele Veneziano, and contains a short biography summarizing his
main successes and achievements (to date), a full updated list of his
collaborators and of his publications, and a short interview concerning his personal
point of view about the present and future of fundamental physics. We have
also included the LaTeX version of an old, unpublished (and handwritten) note,
dating back to 1973, that Gabriele discovered after a long search in his office
at CERN. Apart from the genuine historical value of such a document (see,
for instance, the comments added by the author for the edition of this book),
parts of the original draft are still of interest, and potentially relevant for
modern applications.
The rest of the book is divided into the following seven parts:

Part 1 – Dual resonance models and string theory
Part 2 – Perturbative QCD
Part 3 – Non-perturbative QCD
Part 4 – Supersymmetric gauge theories
Part 5 – String dualities and symmetries
Part 6 – String/quantum gravity, black holes, and entropy
Part 7 – String cosmology

Each of these parts contains from a minimum of two to a maximum of five
articles (organized in historical or pedagogical order), illustrating different
aspects of these fields with special emphasis on the contribution of Gabriele
and of his collaborators.
The result is a rather unconventional, “unique” book where old and new
scientific results are mixed with personal memories and feelings of the authors,
spanning over four decades of research on fundamental interactions and an
impressive spectrum of interests, ranging from subnuclear physics to cosmology.
We think that it should be easy for the specialized reader to find his/her
preferred topic, and to jump directly to his/her field of interest. We also hope,
however, that he/she will be tempted to deviate from this preferred path to
enjoy the exploration of other branches of theoretical physics, and to learn
about their historical development, following the excellent introductions
written by leading experts in the field.
To conclude this short introduction we would like to present our warmest
thanks to Gabriele Veneziano, teacher and friend, for so many years of
enjoyable and rewarding collaboration. We wish him, also on behalf of the
other contributors to this volume and of all his friends, a happy 65th
birthday, and many future years full of exciting research projects and outstanding
achievements.

Bari and Bhubaneswar, March 2007
Maurizio Gasperini
Jnan Maharana
Contents

Part I Introduction

Gabriele Veneziano: A Concise Scientific Biography and an Interview
M. Gasperini, J. Maharana
1 Biographical Notes
2 List of Collaborators of Gabriele Veneziano (Updated to 2006)
3 An Interview with Gabriele Veneziano
References

An Unpublished Draft by Gabriele Veneziano (1973): “Non-local Field Theory Suggested by Dual Models”
G. Veneziano
1 Introduction and Content of the Paper
2 Yukawa’s Non-local Field Theory
3 The Zero Slope (Local) Limit of Dual Models
4 The Correspondence Principle
5 Non-Local, Classical Field Theory
6 Smeared Fields
References

Part II Dual Resonance Models and String Theory

The Birth of the Veneziano Model and String Theory
H. Rubinstein
1 The Weizmann Institute in January 1966 and the Work Leading to the Veneziano Model
2 The Dominant Problems from 1950 to 1970
3 The Breakthrough
4 The Early Phenomenology
5 Conclusion
References

The Birth of String Theory
P. Di Vecchia
1 Introduction
2 Construction of the N-point Amplitude
3 Operator Formalism and Factorization
4 The Case α_0 = 1
5 Physical States and Their Vertex Operators
6 The DDF States and Absence of Ghosts
7 The Zero Slope Limit
8 Loop Diagrams
9 From Dual Models to String Theory
10 Conclusions
References

The Beginning of String Theory: A Historical Sketch
P. Di Vecchia, A. Schwimmer
1 Introduction
2 Prehistory: the Discovery of the Dual Scattering Amplitudes
3 The String World Sheet Through Factorization of the N-point Amplitudes
4 The Virasoro Conditions
5 The Critical Dimension
6 Conclusions
References

The Little Story of an Algebra
M. A. Virasoro
1 Introduction
2 The Context
References

Part III Perturbative QCD

Parton Densities: A Personal Retrospective
R. Petronzio
References

Infrared-sensitive Physics in QCD and in Electroweak Theory
M. Ciafaloni
1 Infrared-sensitive Observables
2 QCD Form Factors, Multiplicities, Preconfinement
3 Inclusive Electroweak Double Logarithms
References

From QCD Lagrangian to Monte Carlo Simulation
G. Marchesini
1 The Status
2 Structure of Monte Carlo Generator
3 The Long Way to Monte Carlo
4 Multi-gluon Soft Distributions
5 Monte Carlo Simulation for Soft Emission
6 From Partons to Hadrons
7 Conclusion
References

Fracture Functions
L. Trentadue
1 Introduction and Motivations
2 Formalism and Definitions
3 Applications and Phenomenology
4 Jet Cross Sections and Fracture Functions
5 Conclusions
References

Part IV Non-perturbative QCD

Coherence and Incoherence in QCD Jets Dynamics (QCD Jets and Branching Processes)
A. Giovannini, R. Ugoccioni
1 Introduction
2 Elementary Models and Unexplained Facts in Multiparticle Dynamics in the Early 1970s
3 KUV Differential Evolution Equations and the Advent of QCD in the Late 1970s
4 The Collaboration with Léon Van Hove, and the UA5 Collaboration Results at CERN pp̄ Collider on Multiplicity Distributions, in Full Phase Space and in Restricted Pseudo-rapidity Windows
5 New Experimental Findings on Final Charged Particle MD in e+e− Annihilation at LEP c.m. Energy, and More Precise Measurements on Final Particle MD at pp̄ Collider Top c.m. Energy. The Occurrence of Substructures or Components in the Various Collisions
6 New Physics at CERN. The Weighted Superposition of Three Classes of Events (Soft, Semihard, and Hard) in pp Collisions at LHC
References

The U(1)_A Anomaly and QCD Phenomenology
G. M. Shore
1 Introduction
2 The U(1)_A Anomaly and the Topological Susceptibility
3 ‘U(1)_A Without Instantons’
4 Pseudoscalar Mesons
5 Topological Charge Screening and the ‘Proton Spin’
6 Polarised Two-photon Physics and a Sum Rule for g_1^γ
References

Planar Equivalence 2006
A. Armoni, M. Shifman
1 Planar Equivalence: a Refined Proof
2 The Orientifold Large-N Expansion
3 Applications for One-flavor QCD
4 Applications for Three-flavor QCD
5 Sagnotti’s Model and the Gauge/String Correspondence
6 Charge Conjugation and the Validity of Planar Equivalence
7 Other Developments
References

Part V Supersymmetric Gauge Theories

Instantons and Supersymmetry
M. Bianchi, S. Kovacs, G. Rossi
1 Introduction
2 Generalities about Instantons
3 Chiral and Supersymmetric Ward–Takahashi Identities
4 Instanton Calculus
5 The Effective Action Approach
6 N = 2 SYM: Introduction
7 N = 2 SYM: Generalities
8 Seiberg–Witten Analysis
9 Checking the SW Formula by Instanton Calculations
10 Topological Twist and Non-commutative Deformation
11 (Constrained) Instantons from Open Strings
12 Instanton Effects in N = 4 SYM
13 N = 4 Supersymmetric Yang–Mills Theory
14 Instanton Calculus in N = 4 SYM
15 One-instanton in N = 4 SYM with SU(N_c) Gauge Group
16 Generalisation to Multi-instanton Sectors
17 AdS/CFT Correspondence: a Brief Overview
18 Instanton Effects in the AdS/CFT Duality
19 Conclusions
References

The Magnetic Monopoles Seventy-five Years Later
K. Konishi
1 Color Confinement
2 Difficulties with the Semiclassical “Non-Abelian Monopoles”
3 Non-Abelian Monopoles from Vortex Moduli
4 N = 2 Supersymmetric Gauge Theories and Light Non-Abelian Monopoles
5 Vortices
6 The Model
7 Confinement Near Conformal Vacua
8 Quantum Chromodynamics
9 Conclusive Remarks
References

Part VI String Dualities and Symmetries

Novel Symmetries of String Theory
J. Maharana
1 Introduction
2 Hamiltonian Formalism and BRS Quantization
3 Canonical Transformations and Invariance Properties of Σ
4 Symmetries of Massive String Excitations
5 Summary and Conclusions
References

Threshold Effects Beyond the Standard Model
T. R. Taylor
1 Introduction
2 Threshold Effects of Extra Dimensions
3 Superstring Threshold Corrections
References

Dualities in String Cosmology
K. A. Meissner
1 Introduction
2 Scale Factor Duality
3 O(d, d) Symmetry to the Lowest Order
4 O(d, d) Symmetry to the Next Order
5 Discussion
References

Spontaneous Breaking of Space–Time Symmetries
E. Rabinovici
1 Introduction
2 Spontaneous Breaking of Space Symmetries
3 Spontaneous Breaking of Time-Translational Invariance and of Supersymmetry
4 Spontaneous Breaking of Conformal Invariance
5 O(N) Vector Models in d = 3: Spontaneous Breaking of Scale Invariance and the Vacuum Energy
References

Part VII String/Quantum Gravity, Black Holes and Entropy

The Information Paradox
D. Amati
1 Introduction
2 String Theories and Black Holes
3 The Role of Decoherence
4 High-energy Collisions in String Theory and Metric Back Reaction
5 Metric Back Reaction and Possible Avoidance of Black Holes
6 Conclusions and Outlook
References

Cosmological Entropy Bounds
R. Brustein
1 To Gabriele
2 Introduction
3 The Causal Entropy Bound
4 The Generalized Second Law and the Causal Entropy Bound
5 Area Entropy, Entanglement Entropy and Entropy Bounds
References

Extremal Black Holes in Supergravity
L. Andrianopoli, R. D’Auria, S. Ferrara, M. Trigiante
1 Introduction: Extremal Black Holes from Classical General Relativity to String Theory
2 Extremal Black Holes as Massive Representations of Supersymmetry
3 The General Form of the Supergravity Action in Four Dimensions and its BPS Configurations
4 Supersymmetric Black Holes: General Discussion
5 BPS and Non-BPS Attractor Mechanism: The Geodesic Potential
6 Detailed Analysis of Attractors in Extended Supergravities: BPS and Non-BPS Critical Points
7 Conclusions
References

Expectation Values and Vacuum Currents of Quantum Fields
G. A. Vilkovisky
1 Introduction
2 Lecture 1
3 Lecture 2
4 Lecture 3
5 Lecture 4
References

Part VIII String Cosmology

Dilaton Cosmology and Phenomenology

M. Gasperini . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 787
1 Dilaton-dominated Inflation: the Pre-big Bang Scenario . . . . . . . . . . . 789
2 The Relic Dilaton Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812
3 Late-time Cosmology: Dilaton Dark Energy . . . . . . . . . . . . . . . . . . . . . . 826
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 842

Relic Gravitons and String Pre-big-bang Cosmology

A. Buonanno, C. Ungarelli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 845
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 845
2 Graviton Production in Cosmology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
3 Gravitational-wave Background in Pre-big-bang Inflation . . . . . . . . . . 853
4 Accessibility of LIGO to Pre-big-bang Models . . . . . . . . . . . . . . . . . . . . 857
5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 859
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 860

Magnetic Fields, Strings and Cosmology

M. Giovannini . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 863
1 Half a Century of Large-Scale Magnetic Fields . . . . . . . . . . . . . . . . . . . 863
2 Magnetogenesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 869
3 Why String Cosmology? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 892
4 Primordial or Not Primordial, This Is the Question... . . . . . . . . . . . . . 902
5 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 935

Cosmological Singularities and a Conjectured Gravity/Coset Correspondence

T. Damour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 941
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 941
2 Cosmological Billiards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 942
3 Gravity/Coset Correspondence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944
4 A New View of the (quantum) Fate of Space at a Cosmological
Singularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 946
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 948

Brane Inflation: String Theory Viewed from the Cosmos

S.-H. H. Tye . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 949
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 949
2 Brane Inflation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 956
3 Graceful Exit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 961
4 Production and Properties of Cosmic Superstrings . . . . . . . . . . . . . . . 964
5 Evolution and Detection of Cosmic Superstrings . . . . . . . . . . . . . . . . . . 966
6 Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 970
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 972
Part I

Introduction
Gabriele Veneziano: A Concise Scientific
Biography and an Interview

M. Gasperini1 and J. Maharana2

1
Dipartimento di Fisica, Università di Bari, Via G. Amendola 173, 70126 Bari,
Italy and Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
2
Institute of Physics, Bhubaneswar University, 751005 India

Abstract. The aim of these notes is to present a broad-brush profile of the scientific
activity of Gabriele Veneziano, whose wide spectrum of interests and variety of
contributions to fundamental theoretical physics is also reflected by the articles of
his collaborators and friends in this book. We thank Gabriele for his kind help in
preparing these notes, and for disclosing to us some aspects of his life that we were
not aware of. The responsibility for any omission or imprecision rests with the
authors, of course, and we apologize in advance for the (unavoidable) incompleteness
of Sect. 1, warning the reader that a full survey of all of Gabriele’s activities is
outside the scope of this introduction. Finally, we thank Gabriele for his patience in
answering our questions that made possible the interview reported in Sect. 3 where,
starting from the evocation of his past experience, he illustrates his personal point
of view on the present status of fundamental physics, and his expectations for the
future.

1 Biographical Notes
Gabriele Veneziano was born on September 7, 1942 in Florence (Italy). After
completing his high-school studies (at the Liceo Scientifico “Leonardo da
Vinci,” Florence) he entered the University of Florence in 1960, where he
started studying physics. He took his degree (Laurea in Fisica) in 1965, defending
a thesis on the applications of group theory to strong interactions,
under the supervision of Professor Raoul Gatto. A short paper extracted from
his thesis became his first scientific publication [1] (here, and in what follows,
the quoted numbers refer to the list of publications of Gabriele Veneziano
reported at the end of this chapter).
After graduating he won an Angelo della Riccia scholarship to carry
out research in the group directed by Raoul Gatto, who had gathered in
Florence a number of brilliant young theorists (like Guido Altarelli, Franco
Buccella, Giovanni Gallavotti, Luciano Maiani, and Giuliano Preparata, to
mention a few). During that period he wrote a paper on saturation of current
algebra sum rules [5] that attracted the attention of Professor Sergio
Fubini. Meanwhile (after a conversation with Professor Giulio Racah) he
had decided to continue his studies toward a PhD, choosing to apply to the
Weizmann Institute of Science in Rehovot (Israel). In July 1966 he got married
to Edy Pacifici and, after their honeymoon in Venice, they moved together to
Israel.

M. Gasperini and J. Maharana: Gabriele Veneziano: A Concise Scientific Biography and an Interview, Lect. Notes Phys. 737, 3–27 (2008)
DOI 10.1007/978-3-540-74233-6_1
© Springer-Verlag Berlin Heidelberg 2008

His official advisor at the Weizmann Institute was Professor Harry
J. Lipkin; however, his research activity was mainly carried out under the
supervision of Professor Hector Rubinstein (see the contribution of Hector
Rubinstein to this volume). In Israel he quickly completed his PhD stud-
ies, getting the degree at the end of 1967 (see Fig. 1). The PhD thesis was
largely based on his research with Rubinstein and on work done in collabo-
ration with Marco Ademollo (professor in Florence and visiting Harvard at
that time) and Miguel Virasoro (who had joined the Weizmann group in the
spring of 1967). That work developed important ideas initiated by Sergio Fu-
bini and collaborators on a bootstrap approach to strong interactions based
on “superconvergence” and “duality” (see, e.g., [13, 18]).
At the beginning of 1968 he was offered several post-doctoral positions in
the United States, and he decided to accept the invitation of MIT (Boston) to
join the newly formed Center for Theoretical Physics to which Sergio Fubini
and Steven Weinberg had recently moved. Before starting the MIT appoint-
ment he spent the whole summer at the TH Division of CERN, where he
completed the celebrated paper “Construction of a crossing-symmetric, Regge-behaved
amplitude for linearly-rising trajectories” [20], in which he proposed
the scattering amplitude that bears his name, and that is usually regarded
as marking the birth of string theory. The model presented in his seminal
paper incorporated most of the desired ingredients of an S-matrix theory of
strong interactions, and it was widely quoted at the Vienna Conference on
High Energy Physics, at the end of that summer.

Fig. 1. Gabriele Veneziano (left) receiving his PhD diploma at the Weizmann
Institute of Science (Rehovot, 1967)

At MIT he mainly worked with Sergio Fubini to develop generalizations
of his earlier work that became known as “dual resonance models.” Their
work paved the way to the re-interpretation of such models as a theory of
strings. In fact, some of the crucial features of string theory, such as the expo-
nential degeneracy of the states [24, 25], the concept of “Fubini–Veneziano”
vertex operator [28], and the algebraic structure underlying the Virasoro op-
erators [31], were introduced by them in that period (see the contributions
of Paolo di Vecchia, Adam Schwimmer, and Miguel Virasoro to this volume).
In that period he also spent a summer at the Lawrence Berkeley Labora-
tory (California), where he contributed to an influential paper on the “twist”
operator [26].
After the birth of his son Ariel (September 1970), and a one-term visit at
the Institute for Advanced Study in Princeton, he undertook a program invoking
topological ideas in order to implement unitarity in the context of dual
resonance models [30]. This led, in particular, to a model for the “Pomeron”
[49], later developed by other researchers into the so-called dual parton model.
In 1972 he came back to the Weizmann Institute as a full professor. In
the subsequent 4 years he also spent extended periods at CERN, pursuing the
development of the topological unitarization ideas, meanwhile interpreted by
Gerard ’t Hooft as a 1/N expansion [53].
In 1976 he joined the TH Division of CERN, first as a scientific associate,
then as a junior staff member (1977–1978), and, finally, as a senior staff mem-
ber. Later he became Head of the TH Division (1994–1997). The beginning
of this period was marked by the birth of his daughter Erika (July 1976), and
by a change of direction of his research interests.
He started to work, in particular, on large N expansions in quantum
chromodynamics (QCD) [62], its applications to baryon dynamics [64], and
Bose–Einstein effects in jet physics [67] (see the contribution of Alberto
Giovannini to this volume). Together with Daniele Amati and Roberto
Petronzio, he proved the factorization theorem on collinear singularities in
perturbative QCD, which forms the basis of the QCD parton model [69, 70]
(see the contribution of Roberto Petronzio to this volume). This brought
him naturally to devote his activity to the physics of QCD jets, writing
some seminal papers with Kenichi Konishi and Akira Ukawa [71, 72]
(the KUV jet calculus), and with Daniele Amati [74] (pre-confinement) (see
the contributions of Marcello Ciafaloni and Giuseppe Marchesini to this
volume).
Turning his attention to non-perturbative aspects of QCD, he tackled the
U(1) axial problem from a 1/N perspective, arriving at the celebrated (and
even recently confirmed) Witten–Veneziano formula [77]. Related studies led
to an estimate of the electric dipole moment of the neutron induced by a
non-vanishing QCD θ-angle [78]. These results were encoded into an effective
Lagrangian formalism developed with Paolo di Vecchia [79].
The effective Lagrangian formalism was later applied to supersymmetric
(SUSY) Yang–Mills theories [93] and SUSY QCD [95], where the
non-perturbative breakdown of non-renormalization theorems was first suggested.
The superpotentials derived in those papers, in collaboration with
Tomasz Taylor and Shimon Yankielowicz, are still being widely used and
cited (often under some other names) in many contexts. The indications
of those papers were confirmed by explicit calculations that he later per-
formed with Giancarlo Rossi and collaborators, and that are summarized in
[119] (see the contribution of Giancarlo Rossi to this volume). In that period
he also pointed out the possible formulation of SUSY Yang–Mills theories
on the lattice, and suggested an implementation [115] that is still being
attempted.
When string theory was recognized as a promising candidate to unify grav-
ity and gauge interactions (i.e., after the so-called Green–Schwarz revolution
in 1984) he came back to the theory that he had to abandon (not without
regret) when it appeared inappropriate as a theory of strong interactions. His
studies (with various collaborators, including Amit Giveon, Jnan Maharana,
and Eliezer Rabinovici) first concentrated on the following directions: the
physical consequences of a fundamental length [111], the emergence of new
field-theoretic and “stringy” symmetries [113] (see the contributions of Jnan
Maharana and Eliezer Rabinovici to this volume), the possible phenomeno-
logical consequences of a light dilaton [125], and a background field approach
to the study of the T-duality symmetry [133].
A more substantial activity in that period concerned the study of gedanken
experiments on trans-Planckian string collisions, in collaboration with Daniele
Amati and Marcello Ciafaloni [120]. The main purpose of such studies was
the understanding of how string theory may reproduce general-relativistic
results at large distances, while providing important corrections at string-size
distances. These works have possible applications to an effectively modified
uncertainty principle [138] and to the problem of “information loss” in black-
hole physics (see the contribution of Daniele Amati to this volume).
While working on string theory he kept alive his interest in the subject of
strong interaction phenomenology, producing works on the “spin of the pro-
ton” puzzle [132] (see the contribution of Graham Shore to this volume), and
on semi-inclusive hard processes [167] (see the contribution of Luca Trentadue
to this volume).
Triggered by his wish to find novel applications of string theory (and
new possible ways to test it), he then turned his interest toward primor-
dial cosmology and its theoretical and observational challenges. Starting
from the study of duality symmetries in cosmological backgrounds [148,
149, 151] (see the contribution of Krzysztof Meissner to this volume) he
proposed, in collaboration with Maurizio Gasperini, the so-called pre-big
bang scenario [161], which attracted considerable interest in the astrophysical
community, stimulating the studies of new mechanisms of inflation (see
the contribution of Maurizio Gasperini to this volume). The multiple impli-
cations of this scenario were the object of many subsequent studies with
various collaborators (see the contributions of Alessandra Buonanno, Thibault
Damour, Massimo Giovannini, and Carlo Ungarelli to this volume). Of
particular relevance were the phenomenological predictions concerning the
generation of magnetic seeds [175], the enhanced production of primordial
gravitational waves [178], and the possible axionic origin of the cosmic mi-
crowave background (CMB) anisotropy [225], opening a unique observational
window on string/Planck-scale physics.
Encouraged by the possibility of concrete experimental verifications of such
a string cosmology scenario, he and Maurizio Gasperini also tackled the prob-
lem of understanding (or re-interpreting), in such a context, the big bang sin-
gularity, by applying either quantum cosmology techniques (in collaboration
with Jnan Maharana [185]), or higher-order string corrections (in collabora-
tion with Michele Maggiore [190]), or non-local effects of the quantum back
reaction (in collaboration with Massimo Giovannini [235]). The study of the
high-curvature, strong-coupling regime (also appropriate to brane inflation,
see the contribution of Henry Tye to this volume) led him to obtain, as a by-
product, unexpected results on entropy in collaboration with Ram Brustein
[207, 213] (see the contribution of Ram Brustein to this volume), and unex-
pected connections with black-hole physics [211, 240], in collaboration with
Thibault Damour. Later developments of the pre-big bang scenario also led
to interesting (and testable, in principle) interpretations of the presently ob-
served cosmic acceleration [219].
His most recent interests are mainly focused again on the 1/N expan-
sion, with two different ramifications. The first concerns a new version of
such expansion, capable of connecting QCD to supersymmetric theories [236],
developed in collaboration with Adi Armoni and Mikhail Shifman. The
obtained predictions for one-flavor QCD, in particular, have been confirmed
by subsequent (phenomenological or lattice) computations (see the con-
tribution of Adi Armoni and Mikhail Shifman to this volume). The sec-
ond, developed in collaboration with Enrico Onofri and Jacek Wosiek, deals
with a Hamiltonian approach to large N dynamics, which, while still lim-
ited to quantum mechanics, has already produced interesting results in dif-
ferent branches of mathematical physics, like combinatorics and statistical
mechanics [246].
Since 2004 he has held the prestigious Chair of Elementary Particles, Gravitation
and Cosmology at the Collège de France, in Paris (see Fig. 2).
We give below a schematic summary of his professional career, his admin-
istrative appointments at CERN, his positions and associations, and his prizes
and honors.

Fig. 2. Gabriele Veneziano giving the Inaugural Lecture at the Collège de France.
Paris, February 17, 2005 (Photo Suzy Vascotto)

1.1 Professional Career

• Research Associate at MIT, Cambridge (USA), 1968–1969

• Visiting Assistant Professor at MIT, 1969–1970
• Visiting Associate Professor at MIT, 1970–1972
• Full Professor at Weizmann Institute of Science, Rehovot (Israel),
1971–1975
• Amos-de Shalit Professor of Physics at Weizmann Institute of Science,
1975–1977
• Junior Staff Member at CERN, TH Division, Geneva, 1977–1978
• Senior Staff Member, CERN, Geneva, 1978–2007
• Head of Theory Division, CERN, Geneva, 1994–1997
• Professor at Collège de France, Paris, since 2004

1.2 Administrative Appointments at CERN

• Member of the SPS (Super Proton Synchrotron) Committee, 1983–1986

• CERN representative to Plenary ECFA (European Committee for Future
Accelerators), 1987–1990
• Chairman of the Academic Training Committee, 1990–1994
• Division Leader of the Theory Division, 1994–1997
• Member of the Scientific Information Policy Board, 1997–2000
• Member of the Archives Committee, 2001–2004
• Chairman of the Pauli Committee, 2003–2007

1.3 Positions and Associations

• Recipient of a Chaire Condorcet at LPTENS (Laboratoire de Physique
Théorique de l’École Normale Supérieure), Paris, 1994
• Co-director (with Gerard ’t Hooft and Antonino Zichichi) of the Interna-
tional School on Subnuclear Physics in Erice, Sicily, 1996–2001
• Recipient of a Chaire Blaise Pascal at LPT (Laboratoire de Physique
Théorique), Université Paris Sud, Orsay, and IHES (Institut des Hautes
Études Scientifiques), Bures-sur-Yvette (France), 2000–2002
• Academic Staff Member at Kavli Institute of Theoretical Physics,
University of California, Santa Barbara, 2003
• Chairman of the Advisory Committee of the Galileo Galilei Institute
in Arcetri (Italy), since 2005
• Member of Accademia delle Scienze di Torino (Italy), since 1994
• Member of Accademia Nazionale dei Lincei, Roma, since 1996
• Member of Académie des Sciences of the Institut de France, Paris,
since 2002

1.4 Prizes and Honors

• I. Ya. Pomeranchuk Prize, ITEP, Moscow (May 1999).

Motivation: “For his outstanding contributions to quantum field theory
and theory of strings.”
• Gold Medal of the Italian Republic (Diploma di prima classe riservati ai
Benemeriti della Scienza e della Cultura), Rome (June 2000).
• Dannie Heineman Prize of the American Physical Society (May 2004).
Motivation: “For his pioneering discoveries in dual resonance models
which, partly through his own efforts, have developed into string theory
and a basis for the quantum theory of gravity.”
• Enrico Fermi Prize of the Italian Physical Society (September 2005).
• Einstein Medal of the Albert Einstein Gesellschaft, Berne (June 2006).
Motivation: “The laureate has made significant contributions to the
understanding of string theory.”
• Commendatore dell’Ordine al Merito della Repubblica Italiana
(February 2007).
• Oskar Klein Medal of the Royal Swedish Academy of Sciences, Stockholm
(June 2007).

2 List of Collaborators of Gabriele Veneziano

(Updated to 2006)

Here we report, in alphabetical order (and to the best of our knowledge), all
authors who have published a paper in collaboration with Gabriele Veneziano.
Their number is impressive (for a theoretical physicist), and we apologize in
advance for any possible omission.

M. Ademollo D. Amati A. Armoni

R. Barbieri A. Bassetto M. Bishari
V. Bozza V. Branchina R. Brustein
F. Buccella A. Buonanno L. Caneschi
M. Ciafaloni R. Crewther G. Curci
T. Damour A. C. Davis V. De Alfaro
D. De Florian E. Del Giudice C. DeTar
A. Di Giacomo P. Di Vecchia M. J. Duff
R. Durrer S. Elitzur M. Fabbrichesi
K. Fabricius S. Ferrara V. Ferrari
S. Foffa D. Freedman J. Freeman
S. Fubini G. Furlan E. Gabathuler
M. Gasperini R. Gatto A. Giovannini
M. Giovannini L. Giusti A. Giveon
D. Gordon A. Ghosh D. Graudenz
M. Grazzini M. B. Green N. S. Han
R. Iengo C. E. Jones F. Karsch
E. Kohlprath K. Konishi T. Kubota
G. Longhi F. E. Low R. Madden
M. Maggiore N. Magnoli J. Maharana
G. Marchesini A. Masiero K. A. Meissner
A. Melchiorri Y. Meurice V. F. Mukhanov
S. Narison F. Nicodemi S. Okubo
L. B. Okun E. Onofri P. Pavlopoulos
P. Pendenza R. Petronzio R. Pettorino
F. Piazza G. Pollifrone E. Predazzi
E. Rabinovici R. Ricci M. Roncadelli
C. Rosenzweig G. C. Rossi H. Rubinstein
M. Sakellariadou N. Sanchez M. M. Schaap
A. Schwimmer L. Sertorio M. Shifman
G. Shore T. Taylor M. Testa
L. Trentadue H. Tye A. Ukawa
C. Ungarelli G. Vilkovisky F. Vernizzi
M. Virasoro S. Weinberg E. Witten
J. Wosiek S. Yankielowicz J. E. Young
Y. Zarmi

3 An Interview with Gabriele Veneziano

MG & JM: Hi Gabriele, and thank you very much for sparing us your valuable
time and agreeing to answer our questions. We had the privilege of preparing
this collection of papers written by your close collaborators, and we would
like to ask you a few questions concerning your personal experience with
physics during over four decades of successful work. We are interested in your
feelings and perspectives about the present status of the research activity in
fundamental physics, and your hopes and expectations for the future. But
let us start with the past. When (and why) did you decide to devote your
professional activity to physics?
GV: In senior high school in Florence I had a very good teacher of maths
and physics, Tebaldo Liverani. He clearly loved those subjects (more maths
than physics, he once admitted) and enjoyed teaching them. Probably under
his influence, in 1960, two other students in my class and I decided to
enroll at the local university for a degree in either maths or physics. During the
summer we had long debates on what to choose, and, eventually, we all opted
for physics. I believe that none of us has regretted the choice. This little story
tells us how important good-quality teaching is, and not only at the academic
level, quite the contrary.
MG & JM: Do you remember any professor who played a crucial role in
influencing your career, both during your studies and at the beginning of
your research activity? What should be, in your opinion, the main objectives
of undergraduate and graduate courses in physics?
GV: Besides the high school teacher I have just mentioned, I remember some
very good courses at the university, in particular by Professors Mandò and
Toraldo di Francia. Then, while I was entering my third year, Professor Raoul
Gatto arrived in Florence, together with a group of brilliant young theorists,
mainly from Rome. His teaching and his presence made me turn in the direc-
tion of theoretical particle physics. Without him around I would have proba-
bly yielded to some gentle pressure to become a high-energy experimentalist.
Later on, at the Weizmann Institute, Hector Rubinstein had a very positive
influence on my research. And, finally, at MIT, I learned a lot from working
with/under Sergio Fubini. Gatto, Rubinstein, and Fubini had rather different
styles in doing theoretical physics and I tried to pick up what I appreciated
most from each one of them. Whether I succeeded or not, I certainly owe a
lot to all three. What I have appreciated most in all my teachers and mentors
has been their passion in doing research together with their professionalism.
Both are very important attitudes to communicate to the new generations,
more important than just giving them a long series of notions. Particularly
important is to inject into students a critical, yet constructive, attitude in do-
ing research. Nothing should be taken for granted until it is understood at the
deepest possible level.

MG & JM: Among the many scientific institutions you have visited, where
did you find the most pleasant atmosphere and facility of work? What do you
think should be of primary care for a laboratory, an institute, or a depart-
ment of physics in order to encourage the creativity and productivity of its
researchers?
GV: The group in Florence under Professor Gatto was a fantastic one. The
atmosphere at the Weizmann Institute, particularly in 1966–1968, was also
extremely congenial for doing research. Work at the Center for Theoretical
Physics at MIT was also carried out under optimal conditions, and the same
has always been true for the TH division at CERN. All these places shared
the virtue of giving the physicists the time and the means to carry out their
research in complete freedom, without administrative burdens and without any
demand of short-term results. For instance, at MIT, Fubini and I were working
on a program (dual resonance models) which was far from fashionable at the
time, but no one tried to push us out of it. I have always been very lucky with
the places where I have been working, but also, I must say, with the historical
period in which I embarked on theoretical particle physics. A posteriori we can
say that the years 1965–1975 were a “golden decade” in theoretical particle
physics. We still live, to a large extent, on the great heritage of that period:
the standard model, its possible extensions, and string theory.
MG & JM: You have deeply influenced, in many ways, the past develop-
ment of fundamental theoretical physics. From your perspective, are you sat-
isfied with the present approach to the physics of fundamental interactions?
In particular, what is your attitude toward the main contemporary theoretical
“paradigms”?
GV: You ask me to stick my neck out. Well, in my opinion, theorists, on the
basis of their recent successes with the standard model, have grown a little too
arrogant. Some of the ideas around are very well motivated and even beautiful,
but it is very hard to find the right way without the input of new data (it is even
hard with the data, to be sure, see, e.g., the case of neutrino masses and mix-
ing!). For this reason I am not too excited about the huge activity that is going
on in building models for data . . . that are not there yet. Perhaps it would be
better to wait until those become available and, meanwhile, to put more effort
into some of the outstanding theoretical and phenomenological problems that are
already in the data, both in particle physics and in cosmology. Just to mention
a few: confinement and dynamical symmetry breaking in QCD, and the origin
of primordial—as well as of the present—cosmological acceleration. As an ex-
ample, I don’t think that enough effort has been devoted to trying to solve the
first two problems I mentioned above at least in the large-N limit. I am pretty
convinced that both analytic and numerical large-N techniques can and should
be improved. A similar criticism could apply to present mathematical–physics
research, mainly concentrated these days on string theory. It looks to me as if
we forgot that the main “raison d’être” of modern string theory is the construction
of a fully consistent quantum theory of gravity. Most of the present
activity deals with very special (static, supersymmetric) solutions that fail to
address the issue of what happens to generic solutions that approach those of
General Relativity in some limit, but should look very different near the ubi-
quitous singularities of the classical solutions. What happens, in string theory,
to the big bang singularity? Or to the one inside a black-hole horizon? These
are tough problems, of course, but it looks to me that our community tends to
ignore these issues in favor of tackling some easier problems (more for socio-
logical than for scientific reasons, I guess). Let me repeat a motto I have voiced
a few times: “Let’s find tools for our problems, rather than problems for our
tools!” In the case of the singularities, for instance, new techniques should
be searched for studying string theory in geometries whose curvature radius
is much below the string scale. I have the feeling that, by some appropriate
duality, this problem should not be too different from that of large curvature
radii, which we are already able to deal with. Would it not be wonderful to
know about the fate of the big bang singularity—and thus of the beginning of
time—in string theory?
MG & JM: A frequently voiced criticism of string theory (see, e.g., L. Smolin’s
recent book) is that string theory is not science since it cannot make predictions and,
therefore, cannot be falsified. What is your opinion on this?
GV: I completely disagree. It is fair to say that, at present, we are unable to
extract reliable testable predictions from string theory, but this is only due to
our present incomplete understanding of such a complicated theory. After all,
how many decades had to pass before we could go from Yang–Mills theory to
a theory of the weak interactions? For instance, it is often said that, in order
to test string theory, one would need such high energies that no (human-built)
accelerator will ever be able to produce. But (besides the fact that the Universe
itself has provided such enormous energies right after the big bang, and may
well have kept some imprint of string theory since then) it is not true that
the predictions of string theory are just in the high-energy domain. String
theory contains—at the lowest level of approximation—many massless scalar
fields that could deeply influence low-energy physics by inducing violations
of the equivalence principle, deviations from Newton’s law, or space and/or
time variations of fundamental constants. The problem is that we are presently
unable to understand whether (some of ) those massless particles stay massless
after the theory is completely solved. If the answer is yes, then superstring
theory will be falsified for the same main reason that the old hadronic string
was abandoned: strong interactions are short range, but the old string insisted
on having massless particles! Another generic prediction of string theory is the
existence of extra dimensions of space. If those are not too small they could
be revealed at accelerator experiments. But even if they are tiny they could
have affected very early cosmology, leaving an imprint on today’s cosmological
observables.
MG & JM: What would you like the LHC to discover? And what do you think
the LHC will actually discover?
GV: The best gift the LHC could deliver is . . . surprises. The worst would be
just a confirmation of the Standard Model by the discovery of a light Higgs
boson and nothing else. Unfortunately, given the striking phenomenological
successes of the Standard Model, the latter possibility is not easy to exclude. It
would amount to some fine-tuning of the Standard Model’s parameters, true,
but the cosmological constant problem has accustomed us to much worse than
that. Another item in any theorist’s wish list is the discovery, by the LHC, of a
good dark-matter candidate, even better if this has to do with discovering
supersymmetry. Personally, I am quite convinced that supersymmetry will play
a role in particle physics, sooner or later. The problem, if supersymmetry
lies at too high an energy scale to be reached at the LHC, is that we may
never find the motivations (and resources) to push toward the next energy
frontier. If I should bet my own money on something, I would say that the
LHC will find more than the standard Higgs but not quite what we theorists are
expecting or hoping for (like extra dimensions or strong gravity). For instance,
I am not fully convinced that the ideas of a dynamical symmetry breaking (of
“technicolor” type), or of some compositeness of leptons and quarks explaining
the origin of the three families, can already be put to rest. We have not yet
understood the non-perturbative dynamics of QCD: how can we be sure that a
different gauge theory cannot solve one or both of those questions?
MG & JM: What are your main suggestions and recommendations to young
people at the beginning of their research activity in the field of fundamental
theoretical physics?
GV: To think with their own heads rather than follow the fashion. They should,
of course, learn what has been done by the previous generation, but following
the latter’s prejudices will not help bring out the new ideas we badly need in
order to solve the outstanding problems still facing us.
MG & JM: Do you remember any amusing episode or anecdote concerning
your scientific life that you would like to share with us and with the readers?
GV: An amusing one is the drink I had with Feynman at Caltech after his
talk at a conference on QCD. It must have been around 1979–1980. I had been
invited to present some results obtained at CERN about how quark and gluon
jets evolve and lead, eventually, to a state that looks almost ready to convert
into low-mass hadrons. I gave the talk, which was well received. Feynman was
in the audience, but I do not remember any question from him, either at the
end of my talk or in private afterward. The next day Feynman gave his talk.
Apparently he had rewritten it overnight and, consequently, was not very well
prepared; but it was brilliant, as usual. His talk was largely inspired by mine
and Feynman kept mentioning my results over and over again. I remember
he was even misspelling the name of Petronzio by quoting “Veneziano and
Petronziano,” surely joking Mr. Feynman? After the end of the session, I told
Feynman I had enjoyed his talk. He must not have been very satisfied with
it, since he answered: well, that’s because I quoted you all the time, isn’t it?
And then he added: come, let’s have a drink, I want to understand better what
exactly you have done. So we went to a nearby pub, had coffee (or was it
beer?) and I started to tell him about my work. At some point, before I had
finished, he interrupted me and said: “But then you have been cheating me!
I thought you had done much more! This is nothing but the Altarelli–Parisi
stuff!” I had to sweat a lot to convince him that, indeed, I had done more. Did
he get convinced? I am not sure. But at some point I stumbled on his English
(too good for me, I guess). I asked him what he meant by a “fraying jet,” an
expression he had used many times. To explain, he pointed at my shirt and
said: well, your poor-Italian-physicist’s shirt is fraying... I got it. He also said:
you know, we should get together and fix them, referring to some colleagues at
Caltech who had also been doing jet physics. I thought that “fixing” them would
mean to attack them badly, so I asked “Why be nasty?” But then he reassured
me: no, I mean we should just correct what they are doing incorrectly... This
was indeed my first and last substantial encounter with Feynman, a person I
admired very much for his tremendous talent as a physicist but also for being
so straight, so simple, and yet so deep, as a man.
MG & JM: To which subject(s), in particular, would you like to dedicate your
future scientific activity?
GV: Probably the wisest thing for me to do would be to retire from active
research and give more time to teaching and to writing. However, for me
doing research is a little bit like being addicted to a drug (I’m not sure since
I’ve never been!). It will be difficult to stop abruptly. I would really like to
know, for instance, what happens to spacetime singularities in string theory,
to understand the origin of cosmic acceleration, and to solve QCD in some
suitable large-N limit. But all this sounds like wishful thinking, doesn’t it?
MG & JM: Finally, how do you imagine the path that fundamental physics
and cosmology will follow in the future? What do you expect, in particular,
from string theory and/or M-theory? Is, in your opinion, a successful “theory
of everything” really within our reach in a foreseeable future?
GV: Who said that it is difficult to make predictions, particularly for the
future? But, if I have to make some guess, or a bet, I would say that, probably,
the new accelerator data will not confirm our simplest theoretical ideas and, in
particular, will suggest that there is more structure in today’s “elementary
particles” than we presently assume. In other words, the desert will blossom. The
difficulties we are experiencing with getting the right model from string theory
could mean that, just as the old strings did not succeed in describing hadrons, the
new ones will fail to describe quarks and leptons. Also, about the hierarchy
problem, we could be on the wrong track with low-energy SUSY. Possibly, the
solutions of the hierarchy and cosmological constant problems are not unrelated.
Will we arrive one day at a “final theory” and at the end of theoretical physics?
I do not think we will ever arrive at a “final theory” (I have given many talks
about “Dreams of a Finite Theory” instead) but we may very well come to
the end of some branch of physics because of “practical” reasons. I think that
Feynman said once that a certain branch of physics may terminate the day
the effort needed to make a tiny step forward (experimentally or theoretically)
becomes too large to afford. We may be (slowly!) approaching that limit in
high-energy accelerator physics, but I am old enough not to be afraid of it.

References
List of publications of Gabriele Veneziano (updated to 2006)
1. R. Gatto and G. Veneziano, Mass of N33 from N/D calculation with SU(6)_W
vertices, Phys. Lett. 19 (1965) 512. 3
2. R. Gatto and G. Veneziano, Strong interactions dynamics with vertices invariant
under the collinear group, Phys. Lett. 20 (1966) 439.
3. F. Buccella, R. Gatto and G. Veneziano, Analysis of sum rules following from
local commutation relations of currents, Nuovo Cimento 42 (1966) 1019.
4. G. Veneziano, Remarks on the saturation of the sum rules of the chiral algebra,
Nuovo Cimento 43 (1966) 529.
5. G. Veneziano, On the approximate saturation of the algebra of moments, Nuovo
Cimento 44 (1966) 295. 4
6. M. Ademollo, R. Gatto, G. Longhi and G. Veneziano, The SU(6)_W algebra
at infinite momentum, its tensor charges, and electric dipoles, Phys. Lett. 22
(1966) 521.
7. F. Buccella, G. Veneziano, R. Gatto and S. Okubo, Necessity of additional
unitary-antisymmetric q-number terms in the commutator of spatial current
components, Phys. Rev. 149 (1966) 1268.
8. M. Ademollo, R. Gatto, G. Longhi and G. Veneziano, The SU(6)_W algebra
and the commutators of electric dipoles at infinite momentum, Phys. Rev. 153
(1967) 1623.
9. M. Ademollo, R. Gatto, G. Longhi and G. Veneziano, Mixing schemes for chiral
and collinear algebras, Nuovo Cimento 47A (1967) 334.
10. H.R. Rubinstein and G. Veneziano, Application of current algebra to pion
emission, Phys. Rev. Lett. 18 (1967) 411.
11. H.R. Rubinstein and G. Veneziano, Connection between Regge pole parameters
and local commutation relations, Phys. Rev. 160 (1967) 1286.
12. M. Ademollo, H.R. Rubinstein, G. Veneziano and M.A. Virasoro, Saturation
of superconvergent sum rules at non-zero momentum transfer, Nuovo Cimento
51 (1967) 227.
13. M. Ademollo, H.R. Rubinstein, G. Veneziano and M.A. Virasoro, Bootstrap-
like conditions from superconvergence, Phys. Rev. Lett. 19 (1967) 1402. 4
14. H.R. Rubinstein, G. Veneziano and M.A. Virasoro, Fixed poles and compositeness,
Phys. Rev. 167 (1968) 1441.
15. M. Ademollo, H.R. Rubinstein, G. Veneziano and M.A. Virasoro, Reciprocal
bootstrap of the vector and tensor trajectories from superconvergence, Phys.
Lett. B27 (1968) 99.
16. D. Amati, R. Jengo, H.R. Rubinstein, G. Veneziano and M.A. Virasoro, Compositeness
as a clue for the understanding of the asymptotic behaviour of form
factors, Phys. Lett. B27 (1968) 38.
17. H.R. Rubinstein, A. Schwimmer, G. Veneziano and M.A. Virasoro, Generation
of parallel daughters from superconvergence, Phys. Rev. Lett. 21 (1968) 491.
18. M. Ademollo, H.R. Rubinstein, G. Veneziano and M.A. Virasoro, Bootstrap of
meson trajectories from superconvergence, Phys. Rev. 176 (1968) 1904. 4
19. M. Bishari, H.R. Rubinstein, A. Schwimmer and G. Veneziano, Meson bootstraps
for unnatural-parity states, Phys. Rev. 176 (1968) 1926.
20. G. Veneziano, Construction of a crossing-symmetric, Regge behaved amplitude
for linearly-rising trajectories, Nuovo Cimento 57A (1968) 190. 5
21. M. Ademollo, G. Longhi and G. Veneziano, Spectral function sum rules for
tensor currents, Nuovo Cimento 58A (1968) 540.
22. M. Ademollo, G. Veneziano and S. Weinberg, Quantization conditions for
Regge intercepts and hadron masses, Phys. Rev. Lett. 22 (1969) 83.
23. G. Veneziano, Crossing symmetry Regge behaviour and the idea of duality,
Proc. 6th Coral Gables Conference on “Fundamental Interactions at High Energy”,
Coral Gables, FL, 1969 (Gordon and Breach, New York, 1969), p. 113.
24. S. Fubini and G. Veneziano, Level structure of dual resonance models, Nuovo
Cimento 64A (1969) 811. 5
25. S. Fubini, D. Gordon and G. Veneziano, A general treatment of factorization
in dual resonance models, Phys. Lett. B29 (1969) 679. 5
26. L. Caneschi, A. Schwimmer and G. Veneziano, Twisted propagator in the operatorial
duality formalism, Phys. Lett. B30 (1969) 351. 5
27. G. Veneziano, Elementary particles, Physics Today 22 (1969) 31.
28. S. Fubini and G. Veneziano, Duality in operator formalism, Nuovo Cimento
67A (1970) 29. 5
29. E. Del Giudice and G. Veneziano, Dual models, Pomeranchuk term and crossing
symmetry, Nuovo Cimento Lett. 3 (1970) 363.
30. A. Di Giacomo, S. Fubini, L. Sertorio and G. Veneziano, Unitarity in dual
resonance models, Phys. Lett. B33 (1970) 171. 5
31. S. Fubini and G. Veneziano, Algebraic treatment of subsidiary conditions in
dual resonance models, Ann. Phys., Amos de Shalit Memorial Volume 63
(1971) 12. 5
32. G. Veneziano, Narrow resonance models compatible with duality and their developments,
in Proc. 8th Int. School of Subnuclear Physics “Ettore Majorana”,
Erice, Sicily, 1970 (Academic Press, New York, 1971), p. 94.
33. G. Veneziano, Duality and dual models, in Proc. 15th Int. Conference on High-
Energy Physics, Kiev, 1970 (Naukova Dumka, Kiev, 1972), p. 437.
34. G. Veneziano, Duality and the bootstrap, Phys. Lett. B34 (1971) 59.
35. D. Gordon and G. Veneziano, Inclusive reactions and dual models, Phys. Rev.
D3 (1971) 2116.
36. G. Veneziano, General features of inclusive reactions from duality, Nuovo Cimento
Lett. 1 (1971) 681. 875
37. C. E. DeTar, D. Z. Freedman and G. Veneziano, Sum rules for inclusive cross-
sections, Phys. Rev. D4 (1971) 906.
38. G. Veneziano, Sum rules for inclusive reactions and discontinuity formulae,
Phys. Lett. B36 (1971) 397.
39. M. B. Green and G. Veneziano, Average properties of dual resonances, Phys.
Lett. B36 (1971) 477.
40. E. Predazzi and G. Veneziano, A general formulation of inclusive sum rules,
Nuovo Cimento Lett. 15 (1971) 749.
41. G. Veneziano, Conservation laws in inclusive reactions, in Rendiconti del 53
Corso Scuola Internazionale di Fisica “Enrico Fermi”, Varenna, 1971 (Academic
Press, New York, 1973), p. 117.
42. S.-H. H. Tye and G. Veneziano, Exotic channels and approach to scaling in
inclusive reactions, Phys. Lett. B38 (1972) 30.
43. G. Veneziano, Inclusive approach to unitarity, Phys. Rev. Lett. 28 (1972) 578.
44. C. E. Jones, F. E. Low, S.-H. H. Tye, G. Veneziano and J. E. Young, Some
general consequences of Regge theory for Pomeranchukon-pole couplings, Phys.
Rev. D6 (1972) 1033.
45. G. Veneziano, Trilinear coupling of scalar bosons in the small mass limit, Nucl.
Phys. B44 (1972) 142.
46. C. Rosenzweig and G. Veneziano, Unitarity sum rules and soft-pion amplitudes,
Nuovo Cimento 12A (1972) 409.
47. S.-H. H. Tye and G. Veneziano, Properties of inclusive reactions in a unitarized
dual model of production amplitudes, Nuovo Cimento 14A (1973) 711.
48. G. Veneziano, Duality and multiparticle production, in Proc. 4th Int. Symposium
on Multiparticle Hadrodynamics, Pavia, 1973 (Istituto Nazionale di Fisica
Nucleare, Italy), p. 325.
49. G. Veneziano, Origin and intercept of the Pomeranchuk singularity, Phys. Lett.
B43 (1973) 413. 5
50. G. Veneziano, An introduction to dual models of strong interactions and their
physical motivations, Phys. Rep. 9C (1973) 4.
51. G. Veneziano, Unitarity sum rules and the two-Reggeon cut, Nucl. Phys. B69
(1974) 317.
52. G. Veneziano, Regge intercepts and unitarity in planar dual models, Nucl. Phys.
B74 (1974) 365.
53. G. Veneziano, Large N expansion in dual models, Phys. Lett. B52 (1974) 220. 5
54. C. Rosenzweig and G. Veneziano, Regge couplings and intercepts from the
planar dual bootstrap, Phys. Lett. B52 (1974) 335.
55. A. Schwimmer and G. Veneziano, Saturation of unitarity bounds in planar and
non-planar models of multiparticle rescattering, Nucl. Phys. B81 (1974) 445.
56. M. M. Schaap and G. Veneziano, Self-consistent ρ − p trajectory from the
planar dual bootstrap, Nuovo Cimento Lett. 12 (1975) 204.
57. G. Marchesini and G. Veneziano, Non-vanishing of the bare triple-Pomeron
coupling from s-channel unitarity, Phys. Lett. B36 (1975) 271.
58. M. Ciafaloni, G. Marchesini and G. Veneziano, A topological expansion for
high-energy hadronic collisions: I. General properties and connection with the
Reggeon calculus, Nucl. Phys. B98 (1975) 472.
59. M. Ciafaloni, G. Marchesini and G. Veneziano, A topological expansion for
high-energy hadronic collisions: II. s-channel discontinuities and multiparticle
content, Nucl. Phys. B98 (1975) 493.
60. M. Bishari and G. Veneziano, Cut cancellation in the planar integral equation
for the Reggeon, Phys. Lett. B58 (1975) 445.
61. G. Veneziano, Harari-Freund and other schemes for the Pomeron in the topological
expansion, Nucl. Phys. B108 (1976) 285.
62. G. Veneziano, Some aspects of a unified approach to gauge, dual, and Gribov
theories, Nucl. Phys. B117 (1976) 519. 5
63. J. R. Freeman, G. Veneziano and Y. Zarmi, Constraints on Reggeon amplitudes
from analyticity and planar unitarity, Nucl. Phys. B120 (1977) 477.
64. G. C. Rossi and G. Veneziano, A possible description of baryon dynamics in
dual and gauge theories, Nucl. Phys. B123 (1977) 507. 5
65. G. Veneziano, The colour and flavour 1/N expansions, in Proc. 12th Rencontre
de Moriond, Flaine (1977), ed. J. Tran Thanh Van, Vol. 3, p. 113.
66. G. C. Rossi and G. Veneziano, Electromagnetic mixing of narrow baryonium
states, Phys. Lett. B70 (1977) 255.
67. A. Giovannini and G. Veneziano, The Bose–Einstein effect and the jet structure
of hadronic final states, Nucl. Phys. B130 (1977) 61. 5
68. G. Veneziano, A topological approach to the dynamics of quarks and hadrons,
in Proc. 9th Ecole d’Eté de Physique des Particules, Gif-sur-Yvette (1977),
Vol. 2, p. 23.
69. D. Amati, R. Petronzio and G. Veneziano, Relating hard QCD processes
through universality of mass singularities, Nucl. Phys. B140 (1978) 54. 5
70. D. Amati, R. Petronzio and G. Veneziano, Relating hard QCD processes
through universality of mass singularities (II), Nucl. Phys. B146 (1978) 29. 5
71. K. Konishi, A. Ukawa and G. Veneziano, A simple algorithm for QCD jets,
Phys. Lett. B78 (1978) 243. 5
72. K. Konishi, A. Ukawa and G. Veneziano, On the transverse spread of QCD
jets, Phys. Lett. B80 (1979) 259. 5
73. G. Veneziano, Dynamics of hadronic reactions, in Proc. XIXth Int. Conference
on High-Energy Physics, Tokyo (1978).
74. D. Amati and G. Veneziano, Preconfinement as a property of perturbative
QCD, Phys. Lett. B83 (1979) 87. 5
75. K. Konishi, A. Ukawa and G. Veneziano, Jet calculus: a simple algorithm for
resolving QCD jets, Nucl. Phys. B157 (1979) 45.
76. G. Veneziano, Momentum and colour structure of jets in QCD, in Proc. 3rd
Workshop on Current Problems in High Energy Particle Theory, Florence
(May–June 1979).
77. G. Veneziano, U(1) without instantons, Nucl. Phys. B159 (1979) 213. 6
78. R. J. Crewther, P. Di Vecchia, G. Veneziano and E. Witten, Chiral estimate of
the electric dipole moment of the neutron in QCD, Phys. Lett. B88 (1979) 123. 6
79. P. Di Vecchia and G. Veneziano, Chiral dynamics in the large N limit, Nucl.
Phys. B171 (1980) 253. 6
80. G.C. Rossi and G. Veneziano, Baryonium physics, Phys. Rep. 63 (1980) 153.
81. D. Amati, A. Bassetto, M. Ciafaloni, G. Marchesini and G. Veneziano, A treatment
of hard processes sensitive to the infra-red structure of QCD, Nucl. Phys.
B173 (1980) 429.
82. P. Di Vecchia and G. Veneziano, Minimal composite Higgs systems, Phys. Lett.
B95 (1980) 247.
83. G. Veneziano, Goldstone mechanism from gluon dynamics, Phys. Lett. B95
(1980) 90.
84. G. Veneziano, Quantum chromodynamics, in From nuclei to particles, Proc.
International School of Physics Enrico Fermi, Varenna (June 1980).
85. G. Marchesini, L. Trentadue and G. Veneziano, Space-time description of
colour screening via jet calculus techniques, Nucl. Phys. B181 (1981) 335.
86. P. Di Vecchia, F. Nicodemi, R. Pettorino and G. Veneziano, Large N, chiral
approach to pseudoscalar masses, mixings and decays, Nucl. Phys. B181
(1981) 318.
87. G. Veneziano, Tumbling and the strong anomaly, Phys. Lett. B102 (1981) 139.
88. D. Amati, R. Barbieri, A.C. Davis and G. Veneziano, Dynamical gauge bosons
from fundamental fermions, Phys. Lett. B102 (1981) 408.
89. P. Di Vecchia, K. Fabricius, G.C. Rossi and G. Veneziano, Preliminary evidence
for U(1)_A breaking in QCD from lattice calculations, Nucl. Phys. B192
(1981) 392.
90. P. Di Vecchia, K. Fabricius, G.C. Rossi and G. Veneziano, Numerical check
of the lattice definition independence of topological charge fluctuations, Phys.
Lett. B108 (1982) 323.
91. D. Amati and G. Veneziano, Metric from matter, Phys. Lett. B105 (1981)
358.
92. D. Amati and G. Veneziano, A unified gauge and gravity theory with only
matter fields, Nucl. Phys. B204 (1982) 451.
93. G. Veneziano and S. Yankielowicz, An effective Lagrangian for the pure N =
1 supersymmetric Yang–Mills theory, Phys. Lett. B113 (1982) 231. 6
94. E. Gabathuler, G. Veneziano and P. Pavlopoulos, Axions, ghosts and pseudoscalars
at LEAR, Phys. Lett. B114 (1982) 58.
95. T.R. Taylor, G. Veneziano and S. Yankielowicz, Supersymmetric QCD and its
massless limit: an effective Lagrangian analysis, Nucl. Phys. B218 (1983) 493. 6
96. G. Veneziano, Chiral properties of supersymmetric vacua, Phys. Lett. B124
(1983) 357.
97. G. Veneziano, A supersymmetric variant of Dashen’s formula, Phys. Lett. B128
(1983) 199.
98. R. Barbieri, A. Masiero and G. Veneziano, Hierarchy of fermion masses in
supersymmetric composite models, Phys. Lett. B128 (1983) 179.
99. F. Karsch, E. Rabinovici, G. Shore and G. Veneziano, The spectrum of a class
of supersymmetric theories with false vacua, Nucl. Phys. B242 (1984) 503.
100. G.C. Rossi and G. Veneziano, Non-perturbative breakdown of the non-
renormalization theorem in supersymmetric QCD, Phys. Lett. B138
(1984) 195.
101. Y. Meurice and G. Veneziano, SUSY vacua versus chiral fermions, Phys. Lett.
B141 (1984) 69.
102. V. De Alfaro, S. Fubini, G. Furlan and G. Veneziano, Stochastic identities in
supersymmetric theories, Phys. Lett. B142 (1984) 399.
103. A. Masiero and G. Veneziano, Split light composite supermultiplets, Nucl.
Phys. B249 (1985) 593.
104. D. Amati, G.C. Rossi and G. Veneziano, Instanton effects in supersymmetric
gauge theories, Nucl. Phys. B249 (1985) 1.
105. V. De Alfaro, S. Fubini, G. Furlan and G. Veneziano, Stochastic identities in
quantum theory, Nucl. Phys. B255 (1985) 1.
106. D. Amati and G. Veneziano, Gauge dependence of the Nicolai map in super
Yang-Mills theory, Phys. Lett. B157 (1985) 32.
107. A. Masiero, R. Pettorino, M. Roncadelli and G. Veneziano, An attempt at
realistic supercompositeness, Nucl. Phys. B261 (1985) 633.
108. D. Amati, Y. Meurice, G.C. Rossi and G. Veneziano, Massive SQCD and the
consistency of instanton calculations, Nucl. Phys. B263 (1986) 591.
109. G. Veneziano, Ward identities in dual string theories, Phys. Lett. B167
(1986) 388.
110. J. Maharana and G. Veneziano, Gauge Ward identities of the compactified
bosonic string, Phys. Lett. B169 (1986) 177.
111. G. Veneziano, A stringy nature needs just two constants, Europhys. Lett. 2
(1986) 199. 6
112. G.M. Shore and G. Veneziano, Current algebra and supersymmetry, Int. J.
Mod. Phys. 1 (1986) 499.
113. J. Maharana and G. Veneziano, Strings in a background: a BRS Hamiltonian
approach, Nucl. Phys. B283 (1987) 126. 6
114. K. Konishi and G. Veneziano, Effective action for dynamical supersymmetry
breaking, Phys. Lett. B187 (1987) 106.
115. G. Curci and G. Veneziano, Supersymmetry and the lattice: a reconciliation?,
Nucl. Phys. B292 (1987) 555. 6
116. D. Amati, M. Ciafaloni and G. Veneziano, Superstring collisions at Planckian
energies, Phys. Lett. B197 (1987) 81.
117. R. Petronzio and G. Veneziano, Constraints from string unification, Mod. Phys.
Lett. A2 (1987) 707.
118. G. Veneziano, Mutual focusing of graviton beams, Mod. Phys. Lett. A2
(1987) 899.
119. D. Amati, K. Konishi, Y. Meurice, G.C. Rossi and G. Veneziano, Non-
perturbative aspects in supersymmetric gauge theories, Phys. Rep. 162
(1988) 169. 6
120. D. Amati, M. Ciafaloni and G. Veneziano, Classical and quantum gravity effects
from Planckian energy superstring collisions, Int. J. Mod. Phys. A7
(1988) 1615. 6
121. T. Kubota and G. Veneziano, Off-shell effective actions in string theory, Phys.
Lett. B207 (1988) 419.
122. V. Ferrari, P. Pendenza and G. Veneziano, Beam-like Gravitational waves and
their geodesics, Gen. Rel. Grav. 20 (1988) 1185.
123. G. Veneziano, Topics in string theory, in Proc. DST Workshop in Particle
Physics—Superstring Theory (Kanpur, December 1987), eds. H.S. Mani and
R. Ramachandran (World Scientific, Singapore, 1988) p. 1.
124. T.R. Taylor and G. Veneziano, Strings and D = 4, Phys. Lett. B212
(1988) 147.
125. T.R. Taylor and G. Veneziano, Dilaton couplings at large distances, Phys. Lett.
B213 (1988) 450. 6
126. S. Narison and G. Veneziano, QCD tests of G(1.6) = glueball, Int. J. Mod.
Phys. A14 (1989) 2751.
127. S. Fubini, J. Maharana, M. Roncadelli and G. Veneziano, Quantum constraints
for an interacting superstring, Nucl. Phys. B316 (1989) 36.
128. D. Amati, M. Ciafaloni and G. Veneziano, Can space-time be probed below
the string size?, Phys. Lett. B216 (1989) 41.
129. M. Fabbrichesi and G. Veneziano, Thinning out of relevant degrees of freedom
in scattering of strings, Phys. Lett. B233 (1989) 135.
130. G. Veneziano, Wormholes, non-local actions and a new mechanism for sup-
pressing the cosmological constant, Mod. Phys. Lett. A4 (1989) 695.
131. T.R. Taylor and G. Veneziano, Quenching the cosmological constant, Phys.
Lett. B228 (1989) 210.
132. G. Veneziano, Is there a QCD “spin crisis”?, Mod. Phys. Lett. A4 (1989) 1605. 6
133. A. Giveon, E. Rabinovici and G. Veneziano, Duality in string background
space, Nucl. Phys. B322 (1989) 167. 6
134. G. Veneziano, Quantum strings and the constants of Nature, in Proc. 27th
Course of the International School of Subnuclear Physics, Erice, July 1989, ed.
A. Zichichi (Plenum Press, 1990) p. 199.
135. T.R. Taylor and G. Veneziano, Quantum Gravity at large distances and the
cosmological constant, Nucl. Phys. B345 (1990) 210.
136. G.M. Shore and G. Veneziano, The U(1) Goldberger–Treiman relation and the
two components of the proton “spin”, Phys. Lett. B244 (1990) 75.
137. G. Veneziano, The spin of the proton and the OZI limit of QCD, in From
Symmetries to Strings: Forty Years of Rochester Conferences (Okubofest), ed.
Ashok Das (World Scientific, Singapore, 1990) p. 13.
138. G. Veneziano, An enlarged uncertainty principle from gedanken string collisions?,
in Proc. Strings ’89, Texas A&M University, March 1989, eds. R.
Arnowitt et al. (World Scientific, Singapore, 1990) p. 86. 6
139. D. Amati, M. Ciafaloni and G. Veneziano, Higher-order gravitational deflection
and soft bremsstrahlung in Planckian energy superstring collisions, Nucl. Phys.
B347 (1990) 550.
140. G. Veneziano, Quantum string gravity near the Planck scale, in Proc. 1st Symposium
on Particles, Strings and Cosmology, Northeastern University, March
1990, eds. P. Nath and S. Reucroft (World Scientific, Singapore, 1991) p. 486.
141. S. Ferrara, N. Magnoli, T.R. Taylor and G. Veneziano, Duality and supersymmetry
breaking in string theory, Phys. Lett. B245 (1990) 409.
142. N. Sanchez and G. Veneziano, Jeans-like instabilities for strings in cosmological
backgrounds, Nucl. Phys. B333 (1990) 253.
143. M. Gasperini, N. Sanchez and G. Veneziano, Highly unstable fundamental
strings in inflationary cosmologies, Int. J. Mod. Phys. A6 (1991) 3853.
144. M. Gasperini, N. Sanchez and G. Veneziano, Self-sustained inflation and dimensional
reduction from fundamental strings, Nucl. Phys. B364 (1991) 365.
145. Nguyen Suan Han and G. Veneziano, Inflation-driven string instabilities: towards
a systematic Large-R expansion, Mod. Phys. Lett. A6 (1991) 1993.
146. G. Veneziano, Inflation-driven string instabilities... and the other way around,
(Gatto-Ruegg birthday Conference, Geneva, Nov. 1990), Helv. Phys. Acta 64
(1991) 877.
147. G. Veneziano, Strings and Gravity, in Proc. Texas/ESO-CERN Symposium
on Relativistic Astrophysics, Cosmology, and Fundamental Physics, Brighton,
Dec. 1990, eds. J. D. Barrow, L. Mestel and P.A. Thomas (The New York
Academy of Sciences, NY, 1991) p. 180.
148. G. Veneziano, Scale factor duality for classical and quantum strings, Phys.
Lett. B265 (1991) 287. 6
149. K.A. Meissner and G. Veneziano, Symmetries of cosmological superstring
vacua, Phys. Lett. B267 (1991) 33. 6
150. K.A. Meissner and G. Veneziano, Manifestly O(d, d) invariant approach to
space-time dependent string vacua, Mod. Phys. Lett. A6 (1991) 3397.
151. M. Gasperini, J. Maharana and G. Veneziano, From trivial to non-trivial
conformal string backgrounds via O(d, d) transformations, Phys. Lett. B272
(1991) 277. 6
152. G. Veneziano, Strings in/and inflation, in Proc. 2nd Symposium on Particles,
Strings and Cosmology, Northeastern University, March 1991, eds. P. Nath
and S. Reucroft (World Scientific, Singapore, 1992) p. 425.
153. M. Gasperini and G. Veneziano, O(d, d)-covariant string cosmology, Phys. Lett.
B277 (1992) 256.
154. G. Veneziano, Bound on reliable one-instanton cross-sections, Mod. Phys. Lett.
A7 (1992) 1661.
155. D. Amati, M. Ciafaloni and G. Veneziano, Planckian Scattering beyond the
semi-classical approximation, Phys. Lett. B289 (1992) 87.
156. G.M. Shore and G. Veneziano, The U(1) Goldberger–Treiman relation and the
proton “spin”: a renormalisation group analysis, Nucl. Phys. B381 (1992) 23.
157. G.M. Shore and G. Veneziano, Renormalization group aspects of η → γγ, Nucl.
Phys. B381 (1992) 3.
158. S. Narison, G.M. Shore and G. Veneziano, A sum rule for the polarized photon
structure function g_1^γ, Nucl. Phys. B391 (1993) 69.
159. G.M. Shore and G. Veneziano, The polarized photon structure function g_1^γ as
a probe of chiral symmetry realizations, Mod. Phys. Lett. A8 (1993) 373.
160. M. Gasperini, J. Maharana and G. Veneziano, Boosting away singularities from
conformal string background, Phys. Lett. B296 (1992) 51.
161. M. Gasperini and G. Veneziano, Pre Big-Bang in string cosmology, Astropart.
Phys. 1 (1993) 317. 7
162. M. Gasperini, M. Giovannini and G. Veneziano, Squeezed thermal vacuum and
the maximum scale for inflation, Phys. Rev. D48 (1993) 707.
163. D. Amati, M. Ciafaloni and G. Veneziano, Effective action and all-order gravitational
eikonal at Planckian energies, Nucl. Phys. B403 (1993) 707.
164. M. Fabbrichesi, R. Pettorino, G. Veneziano and G.A. Vilkovisky, Planckian
energy scattering and surface terms in the gravitational action, Nucl. Phys.
B419 (1994) 147.
165. M. Gasperini and G. Veneziano, Inflation, deflation, and frame independence
in string cosmology, Mod. Phys. Lett. A8 (1993) 3701.
166. M. Gasperini, R. Ricci and G. Veneziano, A problem with non-Abelian duality?,
Phys. Lett. B319 (1993) 438.
167. L. Trentadue and G. Veneziano, Fracture functions: an improved description
of inclusive hard processes in QCD, Phys. Lett. B323 (1994) 201. 6
168. M. Gasperini and G. Veneziano, Dilaton production in string cosmology, Phys.
Rev. D50 (1994) 2519.
169. R. Brustein and G. Veneziano, The graceful exit problem in string cosmology,
Phys. Lett. B329 (1994) 429.
170. G. Veneziano, Strings, cosmology,... and a particle, in Proc. PASCOS 1994,
Syracuse, NY, May 1994 (QCD 161:I69:1994), p. 453.
171. G. Veneziano, A new approach to semiclassical gravitational scattering, in Proc.
of the Second Paris Cosmology Colloquium (Observatoire de Paris, June 1994),
eds. H. De Vega and N. Sanchez (World Scientific, Singapore, 1995) p. 322.
172. R. Brustein, M. Gasperini, M. Giovannini, V.F. Mukhanov and G. Veneziano,
Metric perturbations in dilaton driven inflation, Phys. Rev. D51 (1995) 6744.
173. S. Narison, G.M. Shore and G. Veneziano, Target independence of the EMC-
SMC effect, Nucl. Phys. B433 (1995) 209.
174. M. Gasperini, M. Giovannini, K.A. Meissner and G. Veneziano, Evolution of a
string network in backgrounds with rolling horizons, in String theory in Curved
Space Times (Observatoire de Paris, June 1995), ed. N. Sanchez (World Scientific,
Singapore, 1998), p. 49.
175. M. Gasperini, M. Giovannini and G. Veneziano, Primordial magnetic fields
from string cosmology, Phys. Rev. Lett. 75 (1995) 3796. 7
176. M. Gasperini, M. Giovannini and G. Veneziano, Electromagnetic origin of the
cosmic microwave backgrounds anisotropy, Phys. Rev. D52 (1995) 6651.
177. S. Elitzur, A. Giveon, E. Rabinovici, A. Schwimmer and G. Veneziano, Remarks
on nonabelian duality, Nucl. Phys. B435 (1995) 147.
178. R. Brustein, M. Gasperini, M. Giovannini and G. Veneziano, Relic gravitational
waves from string cosmology, Phys. Lett. B361 (1995) 45.
179. D. Graudenz and G. Veneziano, Estimating diffractive Higgs boson production
at LHC from HERA data, Phys. Lett. B365 (1996) 302.
180. G. Veneziano, String cosmology: basic ideas and general results, in Proc. of the
Third Paris Cosmology Colloquium (Observatoire de Paris, June 1995), eds. H.
De Vega and N. Sanchez (World Scientific, Singapore, 1996).
181. R. Brustein, M. Gasperini, M. Giovannini and G. Veneziano, Gravitational
radiation from string cosmology, in Proc. Int. Europhysics Conference on High
Energy Physics (HEP 95, Brussels, July 1995), eds. J. Lemonne et al. (World
Scientific, Singapore, 1996) p. 408.
182. G. Veneziano, String cosmology: concepts and consequences, in Proc. 4th
Course of the International School of Astrophysics D. Chalonge (Erice, September
1995), eds. N. Sanchez and A. Zichichi (Kluwer Academic Publishers, Dordrecht,
The Netherlands, 1996).
183. G. Veneziano, Summary of SUSY-95, in Supersymmetry and Unification of
Fundamental Interactions (SUSY 95), Palaiseau, France, May 1995, eds.
I. Antoniadis and H. Videau (Editions Frontieres, Paris, 1996).
184. M. Gasperini and G. Veneziano, Birth of the Universe as quantum scattering
in string cosmology, Gen. Rel. Grav. 28 (1996) 1301.
185. M. Gasperini, J. Maharana and G. Veneziano, Graceful exit in quantum string
cosmology, Nucl. Phys. B472 (1996) 349.
186. G. Veneziano, Summary, in Proc. 28th Int. Conference on High Energy Physics,
Warsaw, July 1996, eds. Z. Ajduk and A. K. Wroblewski (World Scientific,
Singapore, 1997), p. 449.
187. R. Brustein, M. Gasperini and G. Veneziano, Peak and endpoint of the relic
graviton background in string cosmology, Phys. Rev. D55 (1997) 3882.
188. G. Veneziano, String cosmology and relic gravitational radiation, in Proc.
Int. Conference on Gravitational Waves: Sources and Detectors, Pisa, Italy,
March 1996.
189. M. Gasperini and G. Veneziano, Singularity and exit problems in two-
dimensional string cosmology, Phys. Lett. B387 (1996) 715.
190. M. Gasperini, M. Maggiore and G. Veneziano, Towards a non-singular pre-big
bang cosmology, Nucl. Phys. B494 (1997) 315.
191. G. Veneziano, Inhomogeneous pre-big bang string cosmology, Phys. Lett. B406
(1997) 297.
192. G. M. Shore and G. Veneziano, Testing target independence of the ‘proton
spin’ effect in semi-inclusive deep inelastic scattering, Nucl. Phys. B516 (1998)
333-353.
193. D. de Florian, G.M. Shore and G. Veneziano, Target fragmentation at polarized
HERA: a test of universal topological-charge screening in QCD, in Proc. 1997
Workshop on Physics with polarized protons at HERA, DESY-Zeuthen and
CERN, March–September 1997.
194. G. Veneziano, Theoretical Outlook, in Proc. Int. EPS Conference on High
Energy Physics, Jerusalem 1997, eds. D. Lellouch, G. Mikenberg and Eliezer
Rabinovici (Springer-Verlag, Berlin, 1999).
195. M. Grazzini, L. Trentadue and G. Veneziano, Fracture functions from cut ver-
tices, Nucl. Phys. B519 (1998) 394-404.
196. A. Buonanno, K.A. Meissner, C. Ungarelli and G. Veneziano, Classical inho-
mogeneities in string cosmology, Phys. Rev. D57 (1998) 2543.
197. J. Maharana, E. Onofri and G. Veneziano, A numerical simulation of pre-big
bang cosmology, JHEP 4 (1998) 4.
198. A. Buonanno, K. A. Meissner, C. Ungarelli and G. Veneziano, Quantum inho-
mogeneities in string cosmology, JHEP 1 (1998) 4.
199. R. Brustein, M. Gasperini and G. Veneziano, Duality in cosmological pertur-
bation theory, Phys. Lett. B431 (1998) 277.
200. R. Durrer, M. Gasperini, M. Sakellariadou and G. Veneziano, Seeds of large-
scale anisotropy in string cosmology, Phys. Rev. D59 (1999) 043511.
201. R. Durrer, M. Gasperini, M. Sakellariadou and G. Veneziano, Massless
(pseudo-)scalar seeds of CMB anisotropy, Phys. Lett. B436 (1998) 66.
202. G. Veneziano, Quantum geometric origin of all forces in string theory, in The
Geometric Universe (Oxford University Press, Oxford, 1998) p. 235.
203. A. Buonanno, T. Damour and G. Veneziano, Pre-big bang bubbles from the
gravitational instability of generic string vacua, Nucl. Phys. B543 (1999) 275.
204. M. Gasperini and G. Veneziano, Constraints on pre-big bang models for seeding
large-scale anisotropy by massive Kalb-Ramond axions, Phys. Rev. D59 (1999)
043503.
205. A. Ghosh, G. Pollifrone and G. Veneziano, Quantum fluctuations in open pre-
big bang cosmology, Phys. Lett. B440 (1998) 20.
206. G. Veneziano, Physics and Mathematics: a happily evolving marriage?, in
Les relations entre les Mathématiques et la Physique Théorique, Festschrift
for the 40th anniversary of the IHES (IHES Publications, Bures-sur-Yvette
1998), p. 183.
207. G. Veneziano, Pre bangian origin of our entropy and time arrow, Phys. Lett.
B454 (1999) 22.
208. G. Veneziano, Entropy bounds and string cosmology, in Fundamental Interac-
tions: from Symmetries to Black Holes (Proceedings of conference in honour
of F. Englert) (ULB, Bruxelles, March 1999) p. 273.
209. A. Melchiorri, F. Vernizzi, R. Durrer and G. Veneziano, CMB anisotropies and
extra dimensions in string cosmology, Phys. Rev. Lett. 83 (1999) 4464.
210. A. Ghosh, R. Madden and G. Veneziano, Back reaction to dilaton-driven
inflation, Nucl. Phys. B570 (2000) 207.
211. T. Damour and G. Veneziano, Self-gravitating fundamental strings and black
holes, Nucl. Phys. B568 (2000) 93.
212. G. Veneziano, Testing string theory by probing the pre-bangian Universe, in
Proc. COSMO-98 Conference, Asilomar, CA, 1998, ed. D.O. Caldwell (AIP
Conference Proceedings, 1999), p. 97.
213. R. Brustein and G. Veneziano, A causal entropy bound, Phys. Rev. Lett. 84
(2000) 5695.
214. G. Veneziano, String Cosmology: the pre-big bang scenario, in Proc. Les
Houches Summer School, on The Primordial Universe, Les Houches, 1999,
eds. P. Binétruy et al. (Springer-Verlag, Heidelberg, 2000), p. 581.
215. Valerio Bozza, Gabriele Veneziano, O(d,d)-invariant collapse/inflation from
colliding superstring waves, JHEP 0010 (2000) 035.
216. R. Brustein, S. Foffa and G. Veneziano, CFT, holography, and causal entropy
bound, Phys. Lett. B507 (2001) 270–276.
217. V. Bozza, M. Gasperini and G. Veneziano, Localization of scalar fluctuations
in a dilatonic brane-world scenario, Nucl. Phys. B619 (2001) 191.
218. L. Giusti, G.C. Rossi, M. Testa and G. Veneziano, The U_A(1) Problem on the
lattice with Ginsparg–Wilson Fermions, Nucl. Phys. B628 (2002) 234–252.
219. M. Gasperini, F. Piazza and G. Veneziano, Quintessence as a run-away dilaton,
Phys. Rev. D65 (2002) 023508.
220. M. J. Duff, L. B. Okun and G. Veneziano, Trialogue on the number of funda-
mental constants, JHEP 03 (2002) 023.
221. G. Veneziano, Large-N bounds on, and compositeness limit of, gauge and grav-
itational interactions, JHEP 0206 (2002) 051.
222. E. Kohlprath and G. Veneziano, Black holes from high-energy beam–beam
collisions, JHEP 0206 (2002) 057.
223. T. Damour, F. Piazza and G. Veneziano, Runaway dilaton and equivalence
principle violations, Phys. Rev. Lett. 89 (2002) 081601.
224. T. Damour, F. Piazza and G. Veneziano, Violations of the equivalence principle
in a dilaton-runaway scenario, Phys. Rev. D66 (2002) 046007.
225. V. Bozza, M. Gasperini, M. Giovannini and G. Veneziano, Assisting pre-big
bang phenomenology through short-lived axions, Phys. Lett. B543 (2002) 14.
226. M. Gasperini and G. Veneziano, The pre-big bang scenario in string cosmology,
Phys. Reports 373 (2003) 1.
227. V. Bozza, M. Gasperini, M. Giovannini and G. Veneziano, Constraints on pre-
big bang parameter space from CMBR anisotropies, Phys. Rev. D67 (2003)
063514.
228. A. Armoni, M. Shifman and G. Veneziano, Exact results in nonsupersymmetric
large N orientifold field theories, Nucl. Phys. B667 (2003) 170.
229. V. Bozza, M. Giovannini and G. Veneziano, Cosmological perturbations from
a new physics hypersurface, JCAP 0305 (2003) 001.
230. M. Gasperini, M. Giovannini and G. Veneziano, Perturbations in a nonsingular
bouncing universe, Phys. Lett. B569 (2003) 113.
231. A. Armoni, M. Shifman and G. Veneziano, SUSY relics in one flavor QCD from
a new 1/N expansion, Phys. Rev. Lett. 91 (2003) 191601.
232. V. Branchina, K. A. Meissner and G. Veneziano, The price of an exact, gauge
invariant RG flow equation, Phys. Lett. B574 (2003) 319.
233. A. Armoni, M. Shifman and G. Veneziano, QCD quark condensate from SUSY
and the orientifold large N expansion, Phys. Lett. B579 (2004) 384.
234. G. Veneziano, A model for the big bounce, JCAP 0403 (2004) 004.
235. M. Gasperini, M. Giovannini and G. Veneziano, Cosmological perturbations
across a curvature bounce, Nucl. Phys. B694 (2004) 206.
236. A. Armoni, M. Shifman and G. Veneziano, From Super Yang–Mills theory to
QCD: planar equivalence and its implications, in From Fields to Strings, eds.
M. Shifman et al. (World Scientific, Singapore, 2004) Vol. 1, p. 353.
237. G.C. Rossi and G. Veneziano, Isospin mixing of narrow pentaquark states,
Phys. Lett. B597 (2004) 338.
238. A. Armoni, M. Shifman and G. Veneziano, Exact results in a non supersym-
metric gauge theory, Fortsch. Phys. 52 (2004) 453.
239. G. Veneziano, The myth of the beginning of time, Sci. Am. 290, N5 (2004)
30.
240. G. Veneziano, String-theoretic unitary S-matrix at the threshold of black-hole
production, JHEP 0411 (2004) 001.
241. A. Armoni, M. Shifman and G. Veneziano, Refining the proof of planar
equivalence, Phys. Rev. D71 (2005) 045015.
242. V. Bozza and G. Veneziano, Scalar perturbations in regular two-component
bouncing cosmologies, Phys. Lett. B625 (2005) 177.
243. V. Bozza and G. Veneziano, Regular two-component bouncing cosmologies and
perturbations therein, JCAP 0509 (2005) 007.
244. G. Veneziano, Unconventional scenarios and perturbations therein, Phys. Scr.
T117 (2005) 51.
245. A. Armoni, G. Shore and G. Veneziano, Quark condensate in massless QCD
from planar equivalence, Nucl. Phys. B740 (2006) 23.
246. G. Veneziano and J. Wosiek, Planar quantum mechanics: an intriguing
supersymmetric example, JHEP 0601 (2006) 156.
247. G. Veneziano, Cosmology (including neutrino mass limits): a particle theorist’s
viewpoint (contribution to HEP-EPS 2005), Lisbon, Portugal, July 2005, PoS
HEP 2005 (2006) 403.
248. G. Veneziano and J. Wosiek, A supersymmetric matrix model. II. Exploring
higher-fermion-number sectors, JHEP 0610 (2006) 033.
249. G. Veneziano and J. Wosiek, A supersymmetric matrix model. III. Hidden
SUSY in statistical systems, JHEP 0611 (2006) 030.
250. G. Veneziano, Towards a unitary S-matrix description of black-hole formation
and decay in string theory, AIP Conf. Proc. 861 (2006) 39.
An Unpublished Draft by Gabriele Veneziano
(1973): “Non-local Field Theory Suggested
by Dual Models”

G. Veneziano

CERN, Theory Unit, Physics Department, CH-1211 Geneva 23, Switzerland,

and Collège de France, 11 Place M. Berthelot, 75005 Paris, France

Abstract. This article reports an old and incomplete note (written in 1973, mostly
at the Weizmann Institute, Rehovot, Israel) about a non-local field theory suggested
by dual resonance models, and largely inspired by Yukawa’s late work on bilocal
fields. It has definite relations to the study of strings in a background (discussed
by Ademollo et al.), and to Polyakov’s action for a string moving in a tachyonic
background. It also suggests, for the first time, a modification of the uncertainty
principle coming from the extended nature of strings. The original note is reported
in this article using the slanted typographical style, for an immediate visual
separation between the old, original text and the modern comments added by the
author in the notes and in the final appendix.

1 Introduction and Content of the Paper

The success of quantum electrodynamics [1] (QED), as well as the recent
breakthroughs in weak interactions [2], are a clear confirmation of the soundness
of local field theory (LFT) in describing leptonic interactions.
The situation appears much more dubious at the hadronic level. LFT
fails to explain the spectrum of hadrons, in particular its amazingly rich
structure, and to account for the simple systematics of the SLAC data on
electron/nucleon high energy collisions, known as Bjorken scaling. It is also
hard to construct field theories which provide strong damping of transverse
momenta at high energy, this failure being probably related to the previous
ones.
What seems to emerge is the fact that, for strong interactions, already
in the several GeV region, local field theory is too singular in position space
or, if we prefer, too spread out in momentum space. High frequencies are not
damped enough to provide a sharp transverse momentum cutoff, infinite
renormalization constants are needed, and the resulting cutoff brings in scaling
violations. Nature, on the other hand, seems to be as naïve as a free field
theory or, better, as damped as a super-renormalizable LFT. Unfortunately,
there are no sound super-renormalizable LFTs in four dimensions.

G. Veneziano: An Unpublished Draft by Gabriele Veneziano (1973): “Non-local Field Theory
Suggested by Dual Models”, Lect. Notes Phys. 737, 29–44 (2008)
DOI 10.1007/978-3-540-74233-6_2
© Springer-Verlag Berlin Heidelberg 2008

In order to explain the simple SLAC data, crude models have been proposed
which get away without an underlying field theory. These (“parton”)
models are based on a composite picture of the hadron with a large number
of constituents giving it a structure. As a consequence, the hadron becomes
anything but a pointlike object.
Such a composite system can be arranged so as to exhibit Bjorken scaling. At
the same time, a composite hadron can possibly lie on a Regge trajectory and,
for an infinitely composite object, a trajectory rising from −∞ to +∞ is quite
conceivable. Also, a composite structure will have a spread in position space
and can therefore lead to enough damping of large momenta to overcome
the above-mentioned difficulties.
It is very difficult to put parton models on a more than descriptive, intuitive
level. On the other hand, a much more refined and detailed model has
been developed over the past five years which has several attractive features
and seems to depart in many respects from any LFT approach. This is the
dual resonance model which, having started as a simple mathematical realization
of the duality idea of Dolen, Horn and Schmid, has been developed so far as to
represent now, for many, the only possible candidate for a complete theory of
hadrons.
Although this theory has not yet produced a completely satisfactory first-order
solution to the strong interaction S-matrix, its theoretical consistency
and the number of constraints it fulfills can hardly be considered accidental.
One of the most amazing properties of this model is the fact that it has a
universal length (or mass) scale in it. Calling this length λ we can list here
the various quantities related to λ in this dual theory:
1) the slope $dJ/dM^2 \approx \lambda^2 \equiv \alpha'$
2) the size of total cross-sections $\sigma_{\rm tot} \approx \lambda^2$
3) the cut-off in transverse momenta $p_\perp^2 \approx 1/\lambda^2$
• ...¹
The properties of dual models which are related to this length are indeed
very suggestive of λ being related to the “size” of the hadron, an intrinsic size
which we do not see (yet) in the leptonic world.
This simple observation suggests that we may look at dual models as
at some approximation of a non-local (rather than of a local) field theory
characterized by this new microscopic constant λ, and which goes into a local
theory in the limit λ → 0.
Of course this idea of introducing a fundamental length λ in quantum relativistic
theories is quite old. In particular, Yukawa has advocated a particular
modification of LFT where the introduction of λ can be made quite naturally.
Using some ideas of Born, Yukawa managed to constrain this theory further.
In spite of some appealing features, however, this attempt of Yukawa's has not
progressed very far and has encountered problems with higher-order corrections.

¹ The original text has a vertical series of dots indicating that several other
quantities related to λ were known: an obvious one is the limiting Hagedorn
temperature of dual resonance models.

The aim of this investigation is to point out that dual models can be
reformulated as a sort of Yukawa-type non-local field theory, with a lot more
structure in it. We hope that further study along these lines may clarify
the physical meaning of duality and of hadronic compositeness. On the one
hand, a better physical understanding of the dual formalism could provide new
hints for the solution of the remaining problems afflicting dual models, such as
fermions and currents. On the other hand, this more sophisticated non-local
FT could solve some of the problems met by the original Yukawa proposal.
We are thus trying to develop a Non-local-Quantum-Relativistic theory,
characterized by the three fundamental constants:
Relativistic – $c \approx 3.0 \times 10^{10}$ cm sec$^{-1}$
Quantum – $h \approx 6.6 \times 10^{-27}$ erg sec
Non-Local – $\lambda \approx 2.0 \times 10^{-14}$ cm
It would be of course interesting to analyze other limits besides that of
a LFT (λ = 0). An interesting one, on which we shall have some comments
here, is that of a non-local classical field theory.
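
As a rough orientation, the three constants quoted above can be combined into the mass scale 1/λ and the slope λ² in hadronic units. The sketch below is only an illustration and not part of the original note: it assumes that the conversion uses ħ = h/2π, and the erg-to-GeV factor and the function names are introduced here.

```python
import math

# Constants as quoted in the text (CGS units).
C_CM_PER_S = 3.0e10        # speed of light
H_ERG_S = 6.6e-27          # Planck constant h
LAMBDA_CM = 2.0e-14        # hadronic length scale lambda

# Standard unit conversion (not from the text): 1 GeV = 1.602e-3 erg.
ERG_PER_GEV = 1.602e-3


def mass_scale_gev(length_cm: float) -> float:
    """Energy scale hbar*c/length in GeV (assumes hbar = h / 2 pi)."""
    hbar = H_ERG_S / (2.0 * math.pi)
    return hbar * C_CM_PER_S / length_cm / ERG_PER_GEV


def regge_slope_gev2(length_cm: float) -> float:
    """Slope alpha' = lambda^2 expressed in GeV^-2 (natural units)."""
    return 1.0 / mass_scale_gev(length_cm) ** 2


if __name__ == "__main__":
    print(f"1/lambda ~ {mass_scale_gev(LAMBDA_CM):.2f} GeV")
    print(f"alpha'   ~ {regge_slope_gev2(LAMBDA_CM):.2f} GeV^-2")
```

Under these assumptions both numbers come out of order one in GeV units, which is the scale at which the text places the slope of the trajectories and the transverse-momentum cut-off.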
The plan of this paper is as follows:
In Sect. 2 we review briefly Yukawa’s non-local field theory and its local limit.
In Sect. 3 we reconsider the zero slope (local) limit of dual models and we
argue that there exists an alternative to the results of the type given by Scherk.
In Sect. 4 we establish our correspondence principle between a local field and
a non-local dual field and discuss its physical meaning in terms of quantum
measurements. In Sect. 5 we consider the case of a classical (h = 0) non-local
field theory as it would emerge from the string picture of the dual model.
In Sect. 6 we derive a few simple quantities which could be relevant in the
development of the theory. Finally, Sect. 7 contains a few more speculative
remarks and our outlook.

2 Yukawa’s Non-local Field Theory

Let us consider, at the beginning, a theory of first quantization in which we
have introduced, as usual, operators $q_i$ and $p_i$ such that

\[ [q_i, p_j] = i h\, \delta_{ij} , \qquad i = 1, 2, 3, \ldots, (D - 1) . \tag{1} \]

$D$ is the dimensionality of space-time. For the moment, take $D = 4$. A local
field is introduced by Yukawa as a “first-quantized” Hermitian operator $U$
which commutes with $q_i$, the position operators:

\[ [q_i, U] = 0 . \tag{2} \]

Hence position and field can be measured simultaneously, i.e., given a test
body, we can measure its position (which implies a point-like body) at a given
time as well as the field acting on it at that point and at that time. In other
words, we can define the meaning of a field at a point $x = (\mathbf{x}, ct)$. If we work
in the coordinate representation we shall have, by definition,

\[ q_i |x\rangle = x_i |x\rangle , \qquad \langle x'|x\rangle = \delta^{(3)}(x - x') , \tag{3} \]

and, because of (2),

\[ \langle x|U|x'\rangle = \delta^{(3)}(x - x')\, \phi(x) . \tag{4} \]

Having in mind relativistic invariance, we shall write instead:

\[ q_\mu |x\rangle = x_\mu |x\rangle , \tag{5} \]

\[ \langle x'|x\rangle = \delta^{(4)}(x - x') , \tag{6} \]

\[ \langle x|U|x'\rangle = \delta^{(4)}(x - x')\, \phi(x) , \tag{7} \]

\[ \phi(x) = \frac{\langle x|U|x'\rangle}{\langle x|x'\rangle} . \tag{8} \]
Yukawa thus identifies $\phi(x)$ as the local c-number field, which then undergoes
the usual second quantization procedure. $\phi(x)$ satisfies a wave equation, e.g.
a Klein–Gordon equation

\[ \left( \frac{\partial}{\partial x_\mu} \frac{\partial}{\partial x^\mu} - m^2 \right) \phi(x) = 0 . \tag{9} \]

This follows from the equation of motion at the $U$-operator level

\[ [p_\mu, [p^\mu, U]] = m^2 c^2\, U . \tag{10} \]

Equations (2) and (10) hence characterize, in Yukawa’s scheme, a local field
theory of a spinless particle of mass $m$ and zero size.
Non-local field theories are then introduced by Yukawa through a modification
of (2) to read
\[ [q, U] \neq 0 . \tag{11} \]
As a consequence of (11) we can no longer extract a $\delta^{(4)}(x - x')$ from (2) and
we shall have
\[ \langle x'|U|x\rangle = U(x', x) . \tag{12} \]
Similarly, if we start from eigenstates of $p$,

\[ p_\mu |k\rangle = k_\mu |k\rangle , \qquad \langle k'|k\rangle = \delta^{(4)}(k - k') , \qquad |k\rangle = \frac{1}{(2\pi)^2} \int d^4x\, e^{ikx}\, |x\rangle , \tag{13} \]

we have, for a local FT,

\[ \langle k'|U|k\rangle = \phi(k - k') , \tag{14} \]

and for a NLFT

\[ \langle k'|U|k\rangle = U(k, k') = \frac{1}{(2\pi)^4} \int d^4x\, d^4x'\, e^{ikx}\, e^{-ik'x'}\, U(x', x) . \tag{15} \]
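It is convenient to introduce the coordinates

\[ X = \frac{x + x'}{2} , \quad r = x - x' , \qquad K = \frac{k + k'}{2} , \quad \Delta = k - k' , \tag{16} \]

and, using $kx - k'x' = Kr + \Delta \cdot X$, we have

\[ \langle x'|U|x\rangle = U(X, r) \to (\text{in LFT limit})\ \delta^{(4)}(r)\, \phi(X) , \]
\[ \langle k'|U|k\rangle = U(K, \Delta) \to (\text{in LFT limit})\ \phi(\Delta) , \tag{17} \]

with

\[ U(K, \Delta) = \frac{1}{(2\pi)^4} \int d^4X\, d^4r\, \exp[\,i(Kr + \Delta X)\,]\, U(X, r) , \]
\[ \phi(\Delta) = \frac{1}{(2\pi)^4} \int d^4x\, \exp(i \Delta x)\, \phi(x) . \tag{18} \]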
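
The phase identity used to pass from the $(x, x')$ to the $(X, r)$ variables can be checked in one line. The following is a short worked step that uses only the definitions in (16); nothing else is assumed.

```latex
\begin{align*}
Kr + \Delta X
  &= \tfrac12 (k + k')(x - x') + \tfrac12 (k - k')(x + x') \\
  &= \tfrac12 \left( kx - kx' + k'x - k'x' \right)
   + \tfrac12 \left( kx + kx' - k'x - k'x' \right) \\
  &= kx - k'x' .
\end{align*}
```

The change of variables $(x, x') \to (X, r)$ has unit Jacobian, so $d^4x\, d^4x' = d^4X\, d^4r$ and the prefactor $1/(2\pi)^4$ of (15) carries over unchanged.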
At this point, in order to restrict the possible choices of NLFT, Yukawa took
inspiration from Born’s reciprocity principle and specified (11) to read:

\[ [q_\mu, [q^\mu, U]] = \lambda^2\, U , \tag{19} \]

where $\lambda$ obviously has the dimensions of a length. Notice the close similarity with
(10). Notice also that, as a consequence of (10) and (19),

\[ [q, [q, U]] = \frac{\lambda^2}{m^2 c^2}\, [p, [p, U]] = \frac{\bar\lambda^4}{h^2}\, [p, [p, U]] , \tag{20} \]

where $\bar\lambda = (\lambda^2 h^2 m^{-2} c^{-2})^{1/4}$ also has the dimensions of a length. Hence $[q, [q, U]]$
and $[p, [p, U]]$ are proportional, with an assigned constant of proportionality.
An immediate consequence of (19) is

\[ (r^2 - \lambda^2)\, U(x, r) = 0 \;\Rightarrow\; U(x, r) = \delta(r^2 - \lambda^2)\, \phi(x, r) , \tag{21} \]

and, as usual, from (10)

\[ U(k, \Delta) = \delta(\Delta^2 - m^2)\, \phi(k, \Delta) . \tag{22} \]

(21) and (22) are the starting point of Yukawa’s approach to a field theory
describing particles of mass $m$ and radius $\lambda$. LFT is recovered by letting $\lambda \to 0$.
Yukawa himself pointed out the difficulties inherent in constructing a dynamical
system of equations of motion for $U$. In particular he stressed the
fact that a differential formalism (Schrödinger equation) can be made very
ineffective because the initial conditions cannot be specified on a spacelike
surface, as they involve some average over different times as well.
Further developments: field defined on a domain.²

3 The Zero Slope (Local) Limit of Dual Models

If we want to understand in which sense the dual model can be seen as a
non-local extension of ordinary field theory, we have to consider first its own
local limit, i.e. the limit $\lambda^2 = \alpha' \to 0$.
This problem was first investigated by Scherk and then further examined
by Scherk and others. The result of Scherk, for the generalized Beta function
model (GBM), is quite simple. The dual n-point function is given by

\[ A_n = \gamma^{n-2}\, (\alpha')^{(n-4)/2} \sum_{\{P\}} B_n^{\{P\}} , \tag{23} \]

where $\gamma$ is the dimensionless dual coupling constant, $\alpha' = \lambda^2$, and $B_n^{\{P\}}$ is
the particular generalized B-function corresponding to the permutation $\{P\}$
of the external legs. $B_n$ is dimensionless and thus the dimensionality of $A_n$
(which comes from the dimensionality of the eigenstates of $p$, $|p_i\rangle$) is taken
care of entirely by the factor $\lambda^{n-4}$.
Scherk’s limit is defined as the limit of $A_n$ for $\lambda^2 = \alpha' \to 0^+$ with $\gamma/\lambda \equiv g$
fixed (hence $\gamma \to 0$). The limit is taken while keeping $\alpha(m^2) = 0$ with $m^2$ also
kept fixed. $m$ is the mass of the external particles and is also the mass of the
lowest state lying on the leading trajectory (assumed here to have $\alpha(0) < 0$).
One can see immediately that, in such a limit, the coupling in front of $B_n$
becomes

\[ \gamma^{n-2}\, (\alpha')^{\frac{n-4}{2}} \to (\text{Scherk limit})\ g^{n-2}\, (\lambda^2)^{n-3} \to 0 \quad \text{for } \lambda \to 0 ,\ g \text{ fixed} . \tag{24} \]
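
The power counting behind (24) can be spelled out as a short worked step. It uses only $\gamma = g\lambda$ and $\alpha' = \lambda^2$ as stated in the text.

```latex
\begin{align*}
\gamma^{n-2}\, (\alpha')^{\frac{n-4}{2}}
  &= (g\lambda)^{n-2}\, \lambda^{\,n-4}
     && \text{using } \gamma = g\lambda,\ \alpha' = \lambda^2 \\
  &= g^{n-2}\, \lambda^{\,2n-6}
   = g^{n-2}\, (\lambda^2)^{\,n-3} .
\end{align*}
```

For $n = 3$ the prefactor is just $g$, while for every $n \geq 4$ it vanishes as $\lambda \to 0$ unless $B_n$ supplies compensating inverse powers of $\lambda^2$.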

The limit is thus $O(\lambda^{2n-6})$ unless $B_n$ can be singular for $\lambda \to 0$. In fact, $B_n$
can be exactly as singular as $\lambda^{-2n+6} = (\lambda^2)^{-(n-3)}$ if and only if one is sitting
near a set of compatible lowest poles such as those of Fig. 1.³
Hence for finite $s_i$ this term survives as $\lambda \to 0$. An excited pole would not
survive because

\[ \frac{\Gamma(-\alpha_s)\, \Gamma(-\alpha_t)}{\Gamma(-\alpha_s - \alpha_t)} \to \frac{(-\alpha_t - 1)}{-\alpha_s + 1} \to (\lambda \to 0)\ \frac{1}{\alpha_s - 1} = \frac{1}{\lambda^2 \left( s - \frac{1}{\lambda^2} \right)} \to -1 . \tag{25} \]

Hence this term is not of order $1/\lambda^2$ if $s$ is finite. Of course, it becomes
$O(1/\lambda^2)$ if $s$ becomes $O(1/\lambda^2)$. We have to understand therefore that Scherk’s
limit also keeps all $s_i$ finite in the limit. In other words, all momenta are
supposed to be small compared to the scale $1/\lambda$. This is actually the only
meaning we can give to a local limit.
In general, Scherk proved that $A_n$ goes in the limit to the sum of all
tree diagrams of a $(g/3!)\,\phi^3$ theory. A similar result could be proven for
loops.

² I probably meant to add some mention of a further development of Yukawa’s
work.
³ A sketch of a multiperipheral tree-diagram with seven external legs and four
internal propagators appears in the original version.

The result of Scherk is certainly correct. On the other hand, there can be
other ways to take the limit $\lambda \to 0$. Take for instance the Lovelace–Shapiro
model for $\pi\pi$ scattering

\[ B_4 = \frac{\Gamma(1 - \alpha_s)\, \Gamma(1 - \alpha_t)}{\Gamma(1 - \alpha_s - \alpha_t)} . \tag{26} \]

For $s$, $t$ finite and $\alpha' \to 0$ with $\alpha(0) = 1/2$ kept fixed,

\[ B_4 \to \frac{\Gamma(1/2)\, \Gamma(1/2)}{\Gamma(0)} = O(\lambda^2 s, \lambda^2 t) \to 0 . \tag{27} \]

If it were not for the Adler zero, say for $\alpha(0) = 1/3$ or in my original proposal, we
would have found
\[ B_4 \to (\lambda \to 0)\ \text{constant} . \tag{28} \]
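
The two behaviours in (27) and (28) can be checked numerically. The sketch below is an illustration only: it assumes a linear trajectory $\alpha(s) = \alpha(0) + \lambda^2 s$, and the function name and the sample values of $s$ and $t$ are introduced here rather than taken from the text.

```python
import math


def lovelace_shapiro_b4(alpha0: float, slope: float, s: float, t: float) -> float:
    """B4 = Gamma(1-a_s) Gamma(1-a_t) / Gamma(1-a_s-a_t) for a linear trajectory.

    a_s = alpha0 + slope * s and a_t = alpha0 + slope * t, with slope = lambda^2.
    """
    a_s = alpha0 + slope * s
    a_t = alpha0 + slope * t
    return math.gamma(1.0 - a_s) * math.gamma(1.0 - a_t) / math.gamma(1.0 - a_s - a_t)


if __name__ == "__main__":
    s, t = 1.0, 2.0  # illustrative finite invariants (arbitrary units)
    for slope in (1e-2, 1e-4, 1e-6):
        with_adler_zero = lovelace_shapiro_b4(0.5, slope, s, t)
        without_adler_zero = lovelace_shapiro_b4(1.0 / 3.0, slope, s, t)
        print(f"slope={slope:g}  alpha(0)=1/2: {with_adler_zero:.3e}  "
              f"alpha(0)=1/3: {without_adler_zero:.6f}")
```

With $\alpha(0) = 1/2$ the denominator approaches the pole of $\Gamma$ at zero, so $B_4$ shrinks in proportion to the slope. With $\alpha(0) = 1/3$ the result settles at $\Gamma(2/3)^2/\Gamma(1/3)$.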

In general, $B_n \to$ const. for $\lambda \to 0$ unless the region of the small (finite) external
momenta happens to take us near a pole (or several poles) of $B_n$. Hence
$A_n \to \gamma^{n-2} \lambda^{n-4}$ and the limit depends on what we do with $\gamma$ as $\lambda \to 0$.
But, in any case, $B_n$ has no structure in it, in the sense that no singularity
appears. It is crucial to have in this limit $\alpha(0)$ fixed and not an integer. There
is a little problem, however. Dual models can only be constructed, so far,
for on-shell external particles at $p^2 = m^2$. If $m^2 = -\alpha(0)/\alpha'$, $m^2 \to \infty$ as
$\alpha' \to 0$ and therefore $p_\mu$ cannot be kept finite. If we let $p_\mu \to O(1/\lambda)$ then we
are back on top of the poles and we get again results à la Scherk. We notice,
however, that
1) In a world of pion amplitudes with massless pions we can take $p_\mu$ finite.
Then the only singularities come from pion poles but, because of
the zero slope limit, their contributions are down if there is an Adler condition:

\[ B_6 \to B_4\, \frac{1}{\alpha_s}\, B_4 \to \lambda^2\, \frac{1}{\lambda^2}\, \lambda^2 \approx \lambda^2 \to 0 . \tag{29} \]

2) One may hope that in a future formulation of the theory off-shell amplitudes
can be defined so that one can take the external momenta to be
fixed as $\lambda \to 0$. In that limit, $B_n \to$ const.
3) In a theory with external quarks of zero mass, not appearing as poles, the
limit $\lambda \to 0$ ($p_\mu$ finite) is again conceivable.
We note that, in dual models, keeping $\alpha(0)$ fixed is more natural than
keeping $\alpha(m^2)$ fixed since many properties do depend on the value of $\alpha(0)$ (or
$\alpha' m^2$) and not just on $m^2$.
We now want to argue that our $\lambda \to 0$ limit may be the correct one
physically. This comes from the expression of $B_n$ in the operator formalism.
There, $B_n$ is written as a vacuum-expectation value of a product of
fields:

\[ B_n = \int_0^{2\pi} d\tau_1 \ldots d\tau_n\, \theta(\tau_i - \tau_{i-1})\, \langle 0|V(k_1, \tau_1) \ldots V(k_n, \tau_n)|0\rangle , \tag{30} \]

where

\[ V(k, \tau) = \; :\exp(ik \cdot Q(\tau)): \; , \]
\[ Q_\mu(\tau) = q_\mu + 2\lambda^2 p_\mu \tau + \lambda \sqrt{2} \sum_n \left( \frac{a_{n,\mu}}{\sqrt{n}}\, e^{-in\tau} + \frac{a^\dagger_{n,\mu}}{\sqrt{n}}\, e^{in\tau} \right) , \]
\[ [q_\mu, p_\nu] = i g_{\mu\nu} , \]
\[ [a_{n,\mu}, a^\dagger_{m,\nu}] = \delta_{n,m}\, g_{\mu\nu} . \tag{31} \]

For $\lambda \to 0$, $Q(\tau) \to q$ and the vertex $V(k, \tau)$ reduces to the usual $\exp(ik \cdot q)$
of an ordinary local theory and gives, up to a number,

\[ B_n = \delta^{(4)}(k_1 + k_2 + \cdots + k_n) , \tag{32} \]

hence the same as a local interaction $\phi^n$ to lowest order.

This limit can also be seen in the formalism of Ademollo et al. (strings
in an external field) and in the expression of $B_n$ given by Fubini and
Veneziano:

\[ B_n = \int d\tau_1 \ldots d\tau_n\, \theta(\tau_i - \tau_{i-1})\, \langle 0|\phi(\tau_1) \ldots \phi(\tau_n)|0\rangle . \tag{33} \]

In conclusion the type of correspondence principle that we shall use in
this paper will not be that dual amplitudes in the zero slope limit go into the
trees of $g\phi^3$ but rather that each $B_n$ goes to the first-order approximation of
the highly non-linear local Lagrangian $\gamma^{n-2} \lambda^{n-4} \phi^n$. In the zero-slope limit,
the whole dual model would then collapse into the first iteration of a
non-polynomial Lagrangian of the type:

\[ L_{\rm int}(\phi) \approx \frac{1}{\lambda^2}\, \phi^2\, F(\gamma \lambda \phi) , \tag{34} \]

$F$ being a function of the dimensionless quantity $(\gamma \lambda \phi)$ which can be computed.
Of course, for the sum over $n$ the concept of leading order in $\lambda$ is
somehow lost.
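
The form (34) follows from the individual terms by pulling out a common factor. In the sketch below the numerical coefficients $c_n$ are placeholders introduced here for illustration; the text does not specify them.

```latex
\begin{align*}
L_{\rm int}(\phi)
  &\approx \sum_{n} c_n\, \gamma^{n-2} \lambda^{n-4} \phi^{n}
   = \frac{1}{\lambda^{2}}\, \phi^{2} \sum_{n} c_n\, (\gamma\lambda\phi)^{n-2}
   \equiv \frac{1}{\lambda^{2}}\, \phi^{2}\, F(\gamma\lambda\phi) .
\end{align*}
```

Each term has the same overall power $\lambda^{-2}$ once written in terms of the dimensionless combination $\gamma\lambda\phi$, which is why no single power of $\lambda$ can be singled out as leading in the sum over $n$.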

4 The Correspondence Principle

We have seen that
\[ Q_\mu(\tau) \to (\lambda \to 0)\ q_\mu . \tag{35} \]
If we consider a field $\phi(q_\mu)$ this is, in the sense of Yukawa, a local field whereas
$\phi(Q_\mu)$ in general is not. Hence

\[ \phi(Q_\mu) \to (\lambda \to 0)\ \phi(q_\mu) = \text{local field} . \tag{36} \]

Also we notice that

\[ \langle Q_\mu(\tau) \rangle \equiv \frac{1}{2\pi} \int_{-\pi}^{+\pi} d\tau\, Q_\mu(\tau) = q_\mu . \tag{37} \]
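
The average in (37) can be verified term by term from the mode expansion of $Q_\mu(\tau)$ in (31). The following worked step uses nothing beyond that expansion.

```latex
\begin{align*}
\frac{1}{2\pi}\int_{-\pi}^{+\pi} d\tau\; \tau &= 0 , \\
\frac{1}{2\pi}\int_{-\pi}^{+\pi} d\tau\; e^{\pm i n \tau}
  &= \frac{\sin(n\pi)}{n\pi} = 0 \qquad (n = 1, 2, 3, \ldots) , \\
\Rightarrow\quad
\frac{1}{2\pi}\int_{-\pi}^{+\pi} d\tau\; Q_\mu(\tau) &= q_\mu .
\end{align*}
```

Both the term linear in $\tau$ and every oscillator term drop out over the symmetric interval, leaving only the zero mode $q_\mu$.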

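Hence
\[ \phi(\langle Q_\mu(\tau) \rangle) = \phi(q_\mu) = \text{local field} . \tag{38} \]
Our correspondence principle will be such that, in the non-local theory, the
field at the average position goes into the average of the field (angle brackets
denoting the average over $\tau$), i.e.

\[ \phi(q_\mu) = \phi(\langle Q_\mu(\tau) \rangle) \to (\lambda \neq 0)\ \langle \phi(Q_\mu(\tau)) \rangle \equiv \frac{1}{2\pi} \int_{-\pi}^{+\pi} \phi(Q_\mu(\tau))\, d\tau \to (\lambda \to 0)\ \phi(q_\mu) , \tag{39} \]

or, in terms of matrices:

\[ \delta(x - x')\, \phi(x) = \langle x|\phi(\langle Q \rangle)|x'\rangle \to \langle x|\langle \phi(Q) \rangle|x'\rangle . \tag{40} \]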

We clearly see that the process of averaging has introduced in a quite essential
way a dependence of $\phi$ on both $q_\mu$ and $p_\mu$, thus making the theory non-local
in the sense of Yukawa. This has to be contrasted with recent attempts at
constructing a field theory (of the ∞-component type) for the dual model by
introducing a field $\varphi[X(\sigma, 0)]$, i.e. a functional of $X$ evaluated at one value of
$\tau$. Since $[X(\sigma, 0), X(\sigma', 0)] = 0$, this still keeps the theory local (multilocal, to
be more precise), i.e. diagonal in position space:

\[ \langle x_1, x_2, \ldots, x_n|\varphi|x'_1, x'_2, \ldots, x'_n\rangle = \delta(x_1 - x'_1)\, \delta(x_2 - x'_2) \ldots \phi(x_1, x_2, \ldots, x_n) . \tag{41} \]

We see that this field depends on half as many variables as our field.
In other words we insist on the physical idea that the extended nature
of the dual hadron makes it impossible not only to define a field at a point
in space, but also at a point in time. Namely the field one probes with a
dual hadronic test body is an average field over a period of time and a region
of space related by $\Delta x / \Delta t = c$. This is even more transparent in the
Shapiro-Virasoro model where the average is done over both $\sigma$ and $\tau$. The generalized
Beta-function model is less symmetric because it corresponds to the case in
which the test body is only active at the ends of the string. Yet it is not the
same as having a pointlike test body since the motion of the ends results from
that of the string as a whole.
The introduction of non-locality is thus made necessary by the simple fact
that, if we average the field over a period of time, that average depends on
the trajectory described by the test body. This depends on both the original
position and velocity and hence classically ($h = 0$) it is a function of
both $x$ and $p$. Quantum-mechanically, $x$ and $p$ cannot be measured simultaneously
and one gets therefore only a matrix representation in $x$ (or $p$)
space.
The above is actually the crucial point at which one is definitively departing
from conventional theories. We do not claim that our interpretation is a
necessary one for the dual model, but suggest that it is a possible one. Within
such an interpretation we now show that dual amplitudes arise as a non-local
extension of an ordinary, local Lagrangian (or S-matrix in lowest order).
For a single scalar field theory the only interaction one has is

\[ L_I = \phi^n(x) . \tag{42} \]

To lowest order the S-matrix for scattering of $n$ particles of
momenta $k_1, k_2, \ldots, k_n$ is

\[ \langle k_1, \ldots, k_n| \int d^4x\, \phi^n(x)\, |0, \ldots, 0\rangle = \int d^4x\, e^{ik_1 x} \ldots e^{ik_n x} = \delta^{(4)}(k_1 + \cdots + k_n) . \tag{43} \]
Let us see now how to use our correspondence principle to give a non-local
extension of such a scattering amplitude. We write:

\[ S^{(1)} = \int d^4x_1\, \phi^n(x_1) = \int d^4x_1 \ldots d^4x_n\, \phi(x_1)\, \delta(x_1 - x_2) \ldots \phi(x_{n-1})\, \delta(x_{n-1} - x_n)\, \phi(x_n) . \tag{44} \]

Using our correspondence principle

\[ S^{(1)} = \int \prod_i d^4x_i\, \langle x_1|\phi|x_2\rangle \langle x_2|\phi|x_3\rangle \cdots \langle x_{n-1}|\phi|x_n\rangle . \tag{45} \]

Now the integrations over $x_1$ and $x_n$ give $\langle 0|$ and $|0\rangle$ respectively (eigenstates
of $p_\mu$ with zero eigenvalue) and the sum over intermediate $x_i$, $i = 2, 3, \ldots, n-1$
can be replaced by completeness sums in the vector space of $Q$, $\sum |x\rangle\langle x| = \sum |p\rangle\langle p|$,
as well as in the harmonic oscillator basis. Extracting finally the Fourier
components with momenta $k_1 \ldots k_n$ one finds

\[ S^{(1)} = \int d\tau_1 \ldots d\tau_n\, \langle 0| \exp(ik_1 Q(\tau_1))\, \exp(ik_2 Q(\tau_2)) \ldots \exp(ik_n Q(\tau_n)) |0\rangle . \tag{46} \]

This is exactly the n-point dual model provided we add an ordering constraint
$\tau_1 \leq \tau_2 \leq \tau_3 \cdots \leq \tau_n$.

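A hint for how to get the ordering comes from:

\[ \int_0^{2\pi} d\tau_i \exp(i k_i Q(\tau_i)) \;\to\; \sum_{\text{orderings}} T_\tau \int_0^{2\pi} d\tau_i\, \langle 0| \exp(ik_1 Q(\tau_{i_1})) \cdots \exp(ik_n Q(\tau_{i_n})) |0\rangle . \tag{47} \]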

One should get the l.h.s. of this equation as a first step. Hence this model
is capable of producing the dual model interaction in a very natural way.
Indeed, if we consider a closed string interacting at all values of σ we get the
Shapiro-Virasoro model.
Actually the expression we have obtained is the n-point function only up
to an infinite constant since (with z_i = e^{iτ_i}),

\[ \frac{1}{2\pi} \int_{\tau_i<\tau_{i+1}} \prod^{n} d\tau_i\, \langle 0|V(k_i,\tau_i)|0\rangle = \int \frac{d\tau_a\, d\tau_b\, d\tau_c}{|z_a - z_b||z_b - z_c||z_c - z_a|} \cdot B_n = \infty \cdot B_n . \tag{48} \]

In other words, the local interaction giving rise to A_n = g^{n−2} (λ)^{n−4} = B_n is

\[ L_{\rm int} = \sum_{n=3} L_{\rm int}^{(n)} , \qquad L_{\rm int}^{(n)} = \lambda^{-4}\, \frac{G}{g^2}\, \frac{(g\phi\lambda)^n}{n!\,n} , \tag{49} \]

with

\[ G = \left[ \int \frac{d\tau_a\, d\tau_b\, d\tau_c}{|z_a - z_b||z_b - z_c||z_c - z_a|} \right]^{-1} . \]

gλ and G would thus play the role of the so-called minor and major coupling
constants of a non-polynomial Lagrangian.
One may ask where the infinity has been produced from since, after all, the
α′ → 0 limit should be finite. We see that the infinity is still there in the α′ → 0
limit because our external masses have been fixed to α′k² = α′μ² = −1; hence,
μ² → −∞ as α′ → 0.
If our external masses were not fixed at values of order ∼ 1/√α′,
but at a finite value as α′ → 0, we would not have produced an infinity.
On the other hand, the model thus obtained would not have been dual
(projective invariant) with the volume element ∏_i dτ_i. In order to
get duality, we would have had to use a more complicated volume element
such as

\[ \frac{d\tau_i}{|z_i - z_{i+1}| \cdots} . \]

The ideal situation would be one in which the model is dual for external
massless particles and, at the same time, it is free of infrared divergences for
√α′ k_i → 0. This could be possible in a chiral-invariant pion world with a
non-integer ρ intercept.
40 G. Veneziano

5 Non-Local, Classical Field Theory

We discuss briefly here the case ℏ = 0, λ ≠ 0, i.e., the case of a non-local,
non-quantized field theory.
Having the dual model in mind, consider the classical motion of the free
string. The end points of the string describe the classical trajectory (say for
σ = 0)

\[ x_\mu(\tau) = x_\mu + 2\lambda^2 p_\mu \tau + i\sqrt{2}\,\lambda \sum_{n \neq 0} \frac{a_{n,\mu}}{\sqrt{n}}\, e^{-in\tau} . \tag{50} \]

This motion is “almost periodic”, i.e., periodic with period τ0 = 2π up to a
linear term 2λ²pμ τ. During such a period of proper time the end of the string
shifts its position by the amount 2(λpμ) · (2πλ). If pμ is, say, in the z direction
we have

\[ \Delta x = \Delta y = 0 , \quad \Delta z_0 = 4\pi\lambda\, p_z \lambda , \quad \Delta t_0 = 4\pi\lambda\, p_0 \lambda , \quad \frac{\Delta z_0}{\Delta t_0} = \frac{p_z}{p_0} = v . \tag{51} \]

Hence p/p0 is the average velocity of the end point.
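As a numerical illustration of (50) and (51), the following Python sketch evaluates one component of the end-point trajectory over one period and compares the shift with 4πλ²p. It is only a sketch under stated assumptions: the oscillator sum is replaced by an arbitrary real, 2π-periodic sum of cosines, and the values of λ, p_z, p_0 and of the mode amplitudes are hypothetical.

```python
import math

def end_point(tau, x0, p, lam, modes):
    """One component of the end-point position: linear drift plus a 2*pi-periodic part.

    modes is a list of (n, amplitude, phase) with integer n >= 1. This real
    cosine sum is an illustrative stand-in for the oscillator sum in (50).
    """
    periodic = sum(a * math.cos(n * tau + ph) for n, a, ph in modes)
    return x0 + 2.0 * lam**2 * p * tau + periodic

lam = 0.7              # hypothetical length parameter lambda
p_z, p_0 = 1.2, 2.0    # hypothetical momentum components
modes = [(1, 0.3, 0.1), (2, 0.05, 1.0), (5, 0.01, 2.5)]  # hypothetical internal motion

tau = 0.37             # arbitrary starting value of the proper time
period = 2.0 * math.pi

# Shift of the z and t components after one full period.
delta_z = end_point(tau + period, 0.0, p_z, lam, modes) - end_point(tau, 0.0, p_z, lam, modes)
delta_t = end_point(tau + period, 0.0, p_0, lam, modes) - end_point(tau, 0.0, p_0, lam, modes)

print("delta_z:", delta_z, " 4*pi*lam^2*p_z:", 4.0 * math.pi * lam**2 * p_z)
print("delta_z/delta_t:", delta_z / delta_t, " p_z/p_0:", p_z / p_0)
```

The periodic part cancels exactly over one period whatever the amplitudes and phases, so only the total momentum enters the average velocity.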
Suppose now that we want to define a field φ(x) which interacts only with
the end point of the string. Since in the classical case we know exactly the
motion of the end point we can think of being able to specify the field φ(x)
at all the points φ(xμ (τ )) namely along the trajectory described by the end
point.
On the other hand, even classically, we may think of having a measuring
apparatus incapable of measuring the reactions of the string to the field in
a time Δt → 0 and we may demand instead to measure its average reaction
during a characteristic interval Δt0 = 4πλ²p0. There is a further advantage
to that. After a time Δt0 we know exactly where the end of the string ought to
be in the absence of interactions if we just measure its total momentum. For
a Δt ≠ Δt0 (or a multiple of it), the full knowledge of the internal motion is
needed before we can disentangle the free motion from the one produced by
the field. Of course we can take Δt = Δt0/n and we shall only need the first
n harmonics.
When the system is quantized this will be even harder. In this case we
shall measure, instead of φ(xμ(τ)),

\[ \bar\phi(x_\mu) = \frac{1}{2\pi} \int_{-\pi}^{+\pi} \phi(x_\mu + 2\lambda^2 p_\mu \tau + \ldots)\, d\tau \;\to\; (\lambda \to 0)\; \phi(x_\mu) . \tag{52} \]

The field thus defined has become a functional of xμ(τ): in a sense, it is not only
a function of what the field is but also depends on the state of the measuring
apparatus. This seems to be the lesson to learn. For strong interactions the
only way to measure them is to scatter strongly interacting probes. If these
have a composite, extended structure then the field measured is a function of
the internal motion of the probe as well as a function of actual sources. This
may be the clue to duality.
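How should we generalize an interaction of the type λ∫φ³(x)dx? We can
try to write

\begin{align*}
&\lambda \int dx \int_{-\pi}^{+\pi} d\tau_1\, d\tau_2\, d\tau_3\, \phi(x + p_1\tau_1)\, \phi(x + p_2\tau_2)\, \phi(x + p_3\tau_3) \\
&= \lambda \int dx\, dx'\, dy\, dz\, \phi(x')\, \phi(y)\, \phi(z) \int_{-\pi}^{+\pi} d\tau_1\, d\tau_2\, d\tau_3\, \delta(x' - x - p_1\tau_1)\, \delta(y - x - p_2\tau_2)\, \delta(z - x - p_3\tau_3) \\
&= \lambda \int dx\, dy\, dz\, \phi(x)\, \phi(y)\, \phi(z) \int dw\, \theta_{p_1}(w - x)\, \theta_{p_2}(w - y)\, \theta_{p_3}(w - z) \\
&\equiv \lambda \int dx\, dy\, dz\, \phi(x)\, \phi(y)\, \phi(z)\, G(x, y, z; p_1, p_2, p_3) . \tag{53}
\end{align*}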

We get a smoothed interaction, with a smoothing function which depends on
the momenta of the three interacting objects. If we let λ → 0 at fixed pi , or
pi → 0 at fixed λ, we recover a local interaction: G → δ( )δ( ).

6 Smeared Fields
6.1 ?

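Consider φ̄ between non-excited states of coordinates x, x′. We have:

we find:

\begin{align*}
\langle x'|\bar\phi|x\rangle &= \frac{1}{2\pi} \int d\tau \int dy\, \phi(y) \int d^4p\, d^4p'\, e^{ip(x'-y)}\, e^{ip'(y-x)} \\
&= \frac{1}{2\pi} \int dy\, \phi(y) \int d^4p\, d^4p'\, \frac{\sin \pi\lambda^2(p^2 - p'^2)}{\pi\lambda^2(p^2 - p'^2)}\, e^{ip(x'-y)}\, e^{ip'(y-x)} \\
&= \int d\tau \int dy\, \phi(y) \int d^4P\, d^4k\, e^{iP(x'-x)}\, e^{ik(x+x'-2y)}\, e^{i\lambda^2\tau P\cdot k} \\
&= \int dy\, \phi(y) \int d\tau\, \frac{1}{(\lambda^2\tau)^4} \exp\left( i\, \frac{(x'-x)\cdot(x+x'-2y)}{\lambda^2\tau} \right) \\
&\equiv \int dy\, \phi(y)\, G(x, x', y) . \tag{56}
\end{align*}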

We can also write:

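\[ \langle x'|\bar\phi|x\rangle = \phi(x, x') = \phi(X, r) = \int dy\, \phi(y)\, G(X, r, y) = \int dy\, \phi(y) \int d\tau\, \frac{1}{(\lambda^2\tau)^4} \exp\left( i\, \frac{r\cdot(X - 2y)}{\lambda^2\tau} \right) . \tag{57} \]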
We can also get an expression for other quantities:

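\[ \phi(k, X) = \int d^4r\, e^{ikr}\, \phi(X, r) = \frac{1}{2\pi} \int dy\, \phi(y) \int d\tau\, \frac{1}{(\lambda^2\tau)^4}\, \delta\!\left( \frac{X - 2y}{\lambda^2\tau} - k \right) = \frac{1}{2\pi} \int d\tau\, \phi(X - \lambda^2\tau k) , \tag{58} \]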
or, in terms of the Fourier-transform of φ(y),

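\[ \phi(k, X) = \int d^4q\, \phi(q)\, e^{iqX}\, \frac{\sin \pi\lambda^2 k\cdot q}{\pi\lambda^2 k\cdot q} , \tag{59} \]

\[ \phi(k, \Delta) = \int d^4X\, d^4r\, \exp\left( i(kr + \Delta X) \right) \phi(X, r) . \tag{60} \]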

Also:

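\[ \phi(\Delta, r) = \int d^4X\, e^{i\Delta X}\, \phi(X, r) = \frac{1}{2\pi} \int d\tau\, \delta^{(4)}(r - \lambda^2\tau\Delta)\, \phi(\Delta) = \phi(\Delta)\, \Theta(-\pi\lambda^2\Delta_\mu < r_\mu < \pi\lambda^2\Delta_\mu) , \tag{61} \]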

and

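\[ \phi(\Delta, p) = \int d^4r\, e^{ipr}\, \phi(\Delta, r) = \int d\tau\, e^{i\lambda^2\tau\, p\cdot\Delta}\, \phi(\Delta) = \phi(\Delta)\, \frac{\sin \pi\lambda^2 p\cdot\Delta}{\pi\lambda^2 p\cdot\Delta} . \tag{62} \]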
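The last step of (62) can be checked numerically. The sketch below assumes that the τ integral is the normalized average over −π < τ < π, i.e. (1/2π)∫dτ as in (52) and (61); with that normalization the average of exp(iλ²τ p·Δ) should reduce to sin(πλ² p·Δ)/(πλ² p·Δ). The function names and the value of λ² are illustrative only.

```python
import cmath
import math

def tau_average(a, steps=20000):
    """Midpoint-rule estimate of (1/2pi) * integral_{-pi}^{pi} exp(i*a*tau) dtau."""
    h = 2.0 * math.pi / steps
    total = 0j
    for j in range(steps):
        tau = -math.pi + (j + 0.5) * h
        total += cmath.exp(1j * a * tau)
    return total * h / (2.0 * math.pi)

def sinc_form(a):
    """Closed form sin(pi*a)/(pi*a), with the a -> 0 limit set to 1."""
    return 1.0 if a == 0.0 else math.sin(math.pi * a) / (math.pi * a)

lam2 = 0.5  # hypothetical value of lambda^2
for p_dot_delta in (0.0, 0.3, 1.7, 4.0):
    a = lam2 * p_dot_delta  # the combination lambda^2 * (p . Delta)
    print(p_dot_delta, tau_average(a).real, sinc_form(a))
```

The smearing factor equals 1 at p·Δ = 0 and falls off once λ² p·Δ is of order one, which is the momentum-space counterpart of the finite support in r found in (61).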

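6.2 Various Types of φ(y) and Δx/Δp = λ²

• φ(y) = const

\[ \langle x|\phi|x\rangle = \text{const}\; \delta(r) , \quad r^2 = 0 . \tag{63} \]

• φ(y) = e^{iqy}

\[ \int dy \int d\tau\, \delta\!\left( q - \frac{r}{\lambda^2\tau} \right) e^{iqX} , \quad r^2 \sim q^2\lambda^2 , \quad \frac{|r|}{q} = \lambda^2 . \tag{64} \]

• φ(y) = e^{iky} exp(−y²/η²)

\begin{align*}
\phi(r, X) &= \int dy \int d\tau\, \frac{1}{(\lambda^2\tau)^4} \exp\left( i\, \frac{r\cdot X}{\lambda^2\tau} \right) \exp\left( -2i\, \frac{r\cdot y}{\lambda^2\tau} + iky \right) e^{-y^2/\eta^2} \\
&= \int d\tau\, \frac{1}{(\lambda^2\tau)^4} \exp\left( i\, \frac{r\cdot X}{\lambda^2\tau} \right) \exp\left( -\eta \left( k - \frac{r}{\lambda^2\tau} \right)^{2} \right) . \tag{65}
\end{align*}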
Thus:

\[ r \approx \lambda^2\tau k , \quad \Delta r \approx \frac{\lambda^2}{\eta} , \quad \Delta y \approx \eta , \quad \Delta p \approx \frac{1}{\eta} , \tag{66} \]

implying:

\[ \frac{\Delta r}{\Delta p} \approx \lambda^2 . \tag{67} \]
Conclusion
If the local field is a wave packet of average momentum k and spread 1/η,
average position y0 and spread η, the non-local version has a non-locality
parameter Δr ∼ λ²/η. Hence Δr/Δp ≈ λ², which is the new indetermination
principle.

References
1. See footnote 4.
2. See footnote 5.

Appendix – Comments by the Author (March 2007)

According to Sect. 1, a seventh (and last) section should have contained some
speculative remarks and an outlook, but apparently has never been written.
Also, no bibliography has been found with the manuscript.

The following comments on this unpublished manuscript may be of interest
and/or of help to the reader:
This draft was probably written at the beginning of 1973, i.e. around the
time that QCD was introduced as a candidate theory of strong interactions,
but before it was accepted as such. The discovery of asymptotic freedom, the
idea of confinement, and the reinterpretation of dual resonance models and
string theory as a large-N limit of QCD, have all probably contributed to
convince me not to pursue any further the line of thought exposed in this
manuscript and to keep it in a drawer.
However, many of the ideas presented there do acquire a definite interest
in the context of the reinterpretation of string theory (some 11 years later
and after rescaling the length parameter λ by some 20 orders of magnitude)
as a unified quantum theory of all interactions, including gravity. Indeed,
in my 1986 paper “A stringy Nature needs just two constants” (Europhys.
Lett. 2 (1986) 199), many of the themes presented in this draft, consciously
or not, were taken up again. In particular, Born’s reciprocity idea – and its
implementation in Yukawa’s approach – are among the issues common to both
works.

Footnote 4: Presumably a reference to QED precision tests.
Footnote 5: Presumably a reference to the proof of renormalizability of the GSW theory.
Two points got clarified during the 13-year interval between the two papers:
• That α′ and λ² are conceptually distinct: the first is the inverse of a classical
tension, the second is a fundamental length appearing as a result of
quantization;
• That Born’s reciprocity works, in string theory, as a symmetry between
X′ ≡ ∂σ X(σ, τ) and P, rather than between x and p, as in Born’s or
Yukawa’s approaches. Precisely, this X′ ↔ P reciprocity gives rise to the
famous T-duality of closed strings, or to the connection between Neumann
and Dirichlet open strings.
A second point of the manuscript is its reinterpretation of the zero-slope limit
of string theory as a low-energy limit in which it reduces to a QFT with a
non-polynomial Lagrangian (unlike Scherk’s limit of an ordinary QFT). This
can be understood today as the result of “integrating out” the massive string
modes when the external particles are light and soft. The non-polynomial na-
ture of the Einstein–Hilbert action does indeed come this way in string theory.
What was missing in the draft is the idea of defining a one-particle-irreducible
functional (the effective action) to avoid the problems of singularities due to
the exchange of massless quanta. This makes Sect. 3 somewhat hard to read.
Last, but not least, the manuscript contains (and by far!) the first claim
that string theory should lead to a modified uncertainty principle whereby,
besides the usual ΔxΔp > 2π, the new constraint Δx/Δp ∼ α′ should also
be imposed. There are statements in this direction in the above-mentioned
1986 paper of mine but not as clearly stated as in the draft (this makes me
believe that, by 1986, I had lost track of the draft). It was not until 1989–1990,
with the results coming from studying transplanckian string collisions, that
the modified uncertainty principle was formulated (independently by D. Gross
and by Amati, Ciafaloni and myself) in the form presented at the very end of
the manuscript!
Part II

Dual Resonance Models and String Theory

The Birth of the Veneziano Model
and String Theory

H. Rubinstein

Albanova Center, Fysikum, Stockholm, Sweden

Abstract. In this article I describe the work at the Weizmann Institute just before
and when Gabriele arrived.

1 The Weizmann Institute in January 1966
and the Work Leading to the Veneziano Model

1.1 Preliminaries

After two years as a postdoc at Orsay, France, I went to the Weizmann
Institute in 1966. My stay lasted about 20 years. At Orsay, I had worked with several
students: Bernard Diu, Jean Loup Gervais, and also with Jean Basdevant and
the late Roger van Royen. Our interest then, one common to many physicists,
was the theory of strong interactions. Rehovot was a sleepy town, with a large
number of still unpaved streets. I had gone to Weizmann invited by Amos
de Shalit to work with Harry Lipkin. The atmosphere at the Institute was
very relaxed and friendly. The Department was small, just a few professors
and very few students. The research was concentrated in nuclear and atomic
physics, and some experimental particle physics led by people educated in
cosmic-ray experiments.
The Weizmann Institute had taken the lead in the reconciliation with Ger-
many and several young German physicists came to Rehovot for long visits.
Germany had instituted a scientific exchange programme, called Minerva, that
was a key factor in rapid scientific development. We had a very active time,
full of distinguished short time visitors and several postdocs.
The symmetry approach to particle physics SU(3), SU(6) and later SU(6)_W
was popular everywhere and Israel had been a leader in the subject thanks to
the work of Racah in Jerusalem in atomic and nuclear physics. Amos de Shalit
and Igal Talmi continued the tradition at Weizmann in nuclear physics. Harry
Lipkin, originally a nuclear physicist, had turned his efforts to the recently

H. Rubinstein: The Birth of the Veneziano Model and String Theory, Lect. Notes Phys. 737,
47–58 (2008)
DOI 10.1007/978-3-540-74233-6 3
c Springer-Verlag Berlin Heidelberg 2008
48 H. Rubinstein

established field of particle physics. Haim Harari and Moshe Kugler had recently
finished their theses and had gone as postdocs to the USA.
Lipkin had been inspired by the late Yuval Ne’eman who had returned
from London after having proposed the octet model based on SU(3) [1]. The
model had been recently spectacularly confirmed by the discovery of the Ω⁻.
Matters developed rapidly. Murray Gell-Mann, George Zweig and Ne’eman
proposed the quarks and we started work in the subject with Lipkin and
several graduate students: Moshe Elitzur and Hannah Stern amongst others.
The quark idea was not popular in the USA. There was great reluctance to accept
theories that did not have observable asymptotic states.
At Weizmann and at CERN, and in other places, interest in quarks was
intense. A great stream of visitors and postdocs like Florian Scheck, now at
Mainz, worked with us in the subject [2].
We did work on the tensor mesons [3], and with Hannah Stern [4] on
nucleon–antinucleon annihilation, explaining simply the large meson multiplicity
against phase space intuition. E. Teller wrote to me that we had invented
quark chemistry! Also, hadronic mass relations were clarified in work
together with P. Federman and I. Talmi [5]. We did show that the octet for-
mula is not always correct, and related masses of the octet and decuplet of
baryons without assumptions on the forces, except that these are two body
forces.

1.2 The Players Arrive

After my stay at Orsay (1964–1965) my wife and I went to Argentina for
3 months before going to Weizmann. In Buenos Aires, I taught a course on
SU(3), invited by J.J. Giambiagi (of dimensional regularization). One of
the students was Miguel Virasoro. He immediately impressed me as an outstanding
mind and soon we wrote a paper on SU(3) × SU(3). We did not
publish it.
As soon as I came to Weizmann I asked the Head of the Department, Igal
Talmi, if we could bring Miguel to the Weizmann Institute. Talmi, without
any information but my word, generously agreed to bring him as a postdoc.
This was in March 1966. Miguel came in early 1967.
Gabriele Veneziano arrived at Weizmann to complete a Ph.D. in 1966. He
came from Florence where he had worked with Raoul Gatto. He came as a
student, since Italy did not have an equivalent to a Ph.D. degree.
Lipkin asked me to take him along as a student and I did. My natural
inclination to do some dynamical calculations and not only symmetries co-
incided with Gabriele’s interests, and we wrote two papers on commutation
relations and Regge poles [6].
It became immediately obvious that Gabriele, like Miguel, was in a class
of his own. We started to look at a variety of problems and in particular at
analyticity sum rules (see below).
The Birth of the Veneziano model 49

Fig. 1. Picture at the 1966 Rehovot Conference. From left: H. Dahmen, the author,
Sergio Fubini, M. Virasoro (standing), G. Veneziano

Soon summer came and we all went abroad. I went to Texas, and after-
wards to the Bohr Institute and Gothenburg, where my wife’s family had a
summer house in southern Sweden. Miguel came with me to Copenhagen and
Gabriele went to Italy.
In Copenhagen I taught and I met Professor Ziro Koba. Two students
who will appear in other articles got interested in the subject. These were
Holger Nielsen in Copenhagen and Lars Brink in Gothenburg (Fig. 1).

2 The Dominant Problems from 1950 to 1970

Strong interaction physics was the topic that attracted most attention. Driven
by a large number of experiments on cross sections, discovery of new particles
and resonances, a large classification effort was taking place.
The symmetry approach described above was very successful in classifying
particles but it did not have a dynamical principle. It merely related particles
and cross sections to other particles and cross sections.
Until the beginning of this period the dynamics had been based on trying
to emulate quantum electrodynamics. These perturbative calculations proved
unsuccessful.
Under the leadership of Geoffrey Chew at Berkeley the concept of bootstrap
emerged. “There are no fundamental particles: any one is as important
as any other” was the slogan. Field theory, that could then only be handled
perturbatively, was unable to make sensible predictions.
S-Matrix theory became the dogma, and Lagrangian physics was declared
obsolete. The analytic structure of scattering amplitudes would – so people
thought – constrain the physics, and predictions would ensue.
Other ideas were pursued, like current algebra, which proved to be wanting
in depth. Some interesting results like soft pion theorems and the Adler–Weisberger
sum rule for the weak axial coupling were important, but the
excitement, in my opinion, was not justified. Another field theoretic result that
turned out to be most important was the triangle anomaly calculation that
showed a very important difference between classical and quantum mechanics.
It became a key consideration when field theory returned to the forum.
Field theory was dormant, but important work on spontaneous symmetry
breaking and short distance expansions was advancing. It became essential 10
years later. It had to wait for the seminal contributions of Curtis Callan, Kurt
Symanzik and Ken Wilson via the renormalization group. Yitzhak Frishman and
his students did mainly this type of work in the early period at Weizmann
[7]. High-energy physics was dominated by the discovery of a large number of
resonances and new particles. Tullio Regge discovered that the Schrödinger
equation allowed continuation in angular momentum for complex values, and
linked resonances with different spin. Phenomenologists discovered that all
invariant functions in a scattering process had the form

\[ A(s, t) = \beta(t)\, s^{\alpha(t)} , \tag{1} \]
where s is the direct channel energy and t the momentum transfer. The invari-
ant functions when particles carried spin had to be properly chosen to ensure
that only physical singularities were present. This became a sophisticated
industry.
These Regge trajectories started to be filled with particles, and it was soon
noticed that they were linear in J, and that different families had identical
slope. This was evidence that a simple potential would not do. Linearity re-
quired physics beyond the potential model. What was nice was the connection
of energy and momentum transfer, and the fact that high-spin particles could
be fit to a straight line, and that the continuation to negative t corresponds to
the scattering angle in the cross channel and could fit the angular distribution.
The steps that led to understanding the particle spectrum and couplings
started with the work of Sergio Fubini and collaborators. The thick book by
De Alfaro et al. [8] contains a detailed picture of the period. It is a remarkable
fact that the book contains little field theory and is based mostly on S-matrix
analyticity. The only exciting topics related to field theory were the infinite
momentum frame and chiral symmetry.
The first success of the analyticity ideas was the exploration of superconvergence
sum rules. Extracting an amplitude with the correct analyticity
properties and the appropriate number of helicity flips to ensure a rapid t
decrease, the sum rule reads

\[ \int^{\infty} \mathrm{Im}\,A(\nu, t)\, d\nu = 0 . \tag{2} \]

As such this equation is almost tautological, unless a dynamical idea is
introduced.
In the first period, Fubini [9] saturated the equation with a few resonances
and noticed that the relations between masses and couplings were reasonable.
Because of the rapid convergence these sum rules became known as superconvergence
relations. The positions and strengths of the resonances were related to
one another.
The real watershed was a paper by R. Dolen, D. Horn and C. Schmid on
duality in the pion–nucleon amplitude, written in the fall of 1967 [10]. This
paper deviated from the dogma established by quantum electrodynamics.
Though electrodynamics had taught us that singularities in all channels
are additive (this was called the interference model), this paper showed that
information on the crossed channels was contained in the direct channel. In a
pictorial way: the resonances average the asymptotic behaviour which is dominated
by the Regge trajectories in the t channel. The strong interactions have
a different structure that cannot be described by perturbative mechanisms.
In modern words, low-energy hadronic physics is a strong interaction realm,
as QCD now shows quite clearly, due to infrared slavery.
This duality hypothesis was received with skepticism. When Gabriele pro-
posed the model the reaction was quite negative (see Gabriele’s letter, Figs.
2 and 3).
Several groups, including ours, added the Regge behaviour to the sum
rules, generalizing the previous equation to

\[ \int^{\nu_m} A(\nu, t)\, d\nu = \sum_r \beta_r(t)\, \frac{\nu_m^{\alpha_r+n+1}}{\alpha_r + n + 1} . \tag{3} \]
Physically, one divides the integral in two parts: the low-energy part that is
resonance dominated, and the high-energy part that is controlled by Regge
trajectories. The next assumption is how to perform the saturation. Notice
that it is not adding resonances in the other channel.
Already in the superconvergence case a few resonances saturated the sum
rule quite well [11]. The solution gave relations between masses and couplings
of resonances. In this scheme, the idea was to relate Regge parameters to
resonances. By displacing νm we could see the contribution to the sum rule of
the individual resonances! The more resonances one included, the better the
t dependence on both sides agreed.

2.1 A Simple Theoretical Model: π + π → π + ω

Back in Rehovot more students joined me in the following academic year:
Yoram Avni who died prematurely, Mordechai Bishari, Mordechai Milgrom,
Adam Schwimmer and Masud Chaichian amongst others. Also, the first generation
of young people trained in particle physics in Israel by Ne’eman and
Lipkin was returning.

Fig. 2. Letter from Gabriele to H. R., August 1968, first part

Fig. 3. Letter from Gabriele to H. R., August 1968, cont.
The return process was not completed until 1968. These scientists included
the late Joe Dothan, Haim Harari, David Horn, Moshe Kugler and Shmuel
Nussinov. Harari came from SLAC and had worked with Fred Gilman on
dispersion relations and symmetries, somewhat related to our work. He and
his students also became involved in sum rule work that led to the relation
between the background and diffraction scattering, called the Harari–Freund
hypothesis [12].
A small digression is needed to understand the situation of theoretical
and experimental particle physics almost everywhere at the time.
The quark model was still looked at with suspicion in most places, because
light quarks were not produced as asymptotic states. Several difficult problems
existed. The most puzzling was that the proton was lighter than the neutron.
Many people tried to solve the problem but it remained. The other problem
was the annoying behaviour of hadronic form factors. Infinite compositeness,
as the bootstrap model required, predicted exponential suppression with momentum
transfer, as first pointed out by S. Mandelstam. The quark model
gave the natural answer to both problems. The d quark being heavier than
the u quark solves the first problem. We worked with Gabriele, Miguel,
Daniele Amati and collaborators on form factors. This work proved that proton
compositeness required a q^{−4} form factor [13].
However, it was only after the deep inelastic experiments at SLAC that
quarks became fashionable in the USA.
We continued our work on sum rules and found something that turned out
to be important. Together with Marco Ademollo and Adam Schwimmer we
developed finite energy sum rules for hadronic amplitudes. We were inspired
by Sergio Fubini’s work, as already mentioned [14].
The showcase was the fully crossing symmetric π + π → π + ω that would
soon become the Veneziano model. The solution to these equations was simple
and remarkable: they required linearly rising Regge trajectories, in agreement
with the hadronic evidence. Moreover, even the couplings looked reasonable.
Also, parallel trajectories at lower intercept spaced by 1 unit were unravelled
[15]. Here the agreement was spectacular, and the coupling phenomenology
worked very accurately. Other reactions like π + π → π + A_2, the A_2 being a
spin 2 meson, gave further information [16]. We soon studied all mesonic reac-
tions and results were very good. Meson–nucleon reactions were not working
that well. The model was not good for fermions.

3 The Breakthrough
We separated that summer and we all went to Europe planning to continue
to the USA in the fall. Gabriele made the seemingly trivial but decisive step. In
August I received the letter shown in Figs. 2 and 3.
From

\[ A(s, t) = \beta(t)\, s^{\alpha(t)} \tag{4} \]

he wrote

\[ A(s, t) = \frac{\Gamma(1 - \alpha(s))\, \Gamma(1 - \alpha(t))}{\Gamma(1 - \alpha(s) - \alpha(t))} . \tag{5} \]

This equation has full symmetry between s and t, both at low energies and
asymptotically, and obviously multiplies instead of adding different channel
resonances [17].
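A short numerical sketch of (5) may help make the s ↔ t symmetry and the pole structure concrete. It assumes a linear trajectory α(x) = α0 + α′x with purely illustrative values of the intercept and slope (they are not taken from the text), and uses only the Python standard library.

```python
import math

def rgamma(x):
    """Reciprocal gamma function 1/Gamma(x); it vanishes at x = 0, -1, -2, ..."""
    if x <= 0.0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def alpha(x, intercept=0.5, slope=0.9):
    """Linear trajectory alpha(x) = intercept + slope * x (illustrative numbers)."""
    return intercept + slope * x

def amplitude(s, t):
    """Gamma(1 - alpha(s)) * Gamma(1 - alpha(t)) / Gamma(1 - alpha(s) - alpha(t))."""
    a_s, a_t = alpha(s), alpha(t)
    return math.gamma(1.0 - a_s) * math.gamma(1.0 - a_t) * rgamma(1.0 - a_s - a_t)

# Crossing symmetry: exchanging s and t leaves the value unchanged.
s, t = -0.4, -1.3
print(amplitude(s, t), amplitude(t, s))

# Just below the first s-channel pole, alpha(s) = 1 - eps, the amplitude
# grows like 1/eps; multiplying by eps isolates the residue.
eps = 1e-6
s_near_pole = (1.0 - eps - 0.5) / 0.9
print((1.0 - alpha(s_near_pole)) * amplitude(s_near_pole, t), -alpha(t))
```

Since Γ(1 − α(t))/Γ(−α(t)) = −α(t), the product (1 − α(s)) A(s, t) tends to −α(t) at the first pole, so the two numbers on the last line should nearly coincide.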
In his paper he realized that the amplitude looked “almost right”, but
had inherent difficulties. It was difficult to see that it was the starting point
of almost 40 years of research that had incredible physical and mathematical
developments but has not yet achieved a credible form. The theoretical developments
took place rapidly. Fubini and Veneziano [18] and Yoichiro Nambu
[19] realized that the equation has a large degeneracy, the Hagedorn spectrum.
Miguel with K. Kikkawa and B. Sakita [20] showed that it was a true theory
by discovering the loop expansion. Further developments that led towards
string theory include the work of many, and this will be covered by other
contributors in this book. Lovelace [21] discovered that Lorentz invariance
requires 26 space–time dimensions. But even then problems persisted.
The crucial step was Scherk’s realization that the model must include
gravity!

4 The Early Phenomenology

The appearance of Gabriele’s paper, and its phenomenal impact at the Inter-
national High Energy Vienna conference in September 1968, led to a deluge
of papers. I was then starting to edit Nuclear Physics B, and we received in
less than 3 months 200 papers on the subject.
The phenomenology of pion–nucleon scattering and five-point functions
looked qualitatively promising but not really correct. I will concentrate here
on the paper of C. Lovelace [22], later expanded by G. Altarelli and myself
[23] on proton–antiproton annihilation at low energy. This paper is perhaps
the most intriguing confirmation of the Veneziano formula besides what was
already known from the FESR.
The reaction at rest can be thought of as the disintegration of a pseudoscalar
heavy meson composed of p̄n into three charged pions. By crossing symmetry
it is, in a rough approximation, the scattering of a heavy pion on a pion giving
two pions. So it is a Veneziano formula and, if the decay is to charged pions,
by exoticity there is only one term!
The Dalitz plot was known from Anninos et al. [24] and it is quite remarkable
(see Fig. 4). First, as seen in the figure, it has a hole at the centre, and
second, the conventional fit with the interference model predicted
a low-energy resonance that has never been seen. The duality explanation,
embodied in the Veneziano formula, explains these features naturally: the hole
was a zero caused by the denominator (see (5)) when α(s) = α(t) = 1/2.
The resonance in the direct channel is a reflection of a resonance in the t
channel.
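The zero described here can be seen directly from (5). The following sketch evaluates the formula as a function of the trajectory values α(s) and α(t), treating 1/Γ as zero at the poles of Γ; the function name is hypothetical and the code is only an illustration, not a fit to the data of Fig. 4.

```python
import math

def veneziano_like(a_s, a_t):
    """Formula (5) written in terms of the trajectory values alpha(s) and alpha(t)."""
    arg = 1.0 - a_s - a_t
    # 1/Gamma(arg) vanishes where Gamma has a pole (arg = 0, -1, -2, ...).
    if arg <= 0.0 and arg == math.floor(arg):
        inv_gamma = 0.0
    else:
        inv_gamma = 1.0 / math.gamma(arg)
    return math.gamma(1.0 - a_s) * math.gamma(1.0 - a_t) * inv_gamma

# Centre of the hole: alpha(s) = alpha(t) = 1/2, where the denominator has a pole.
print(veneziano_like(0.5, 0.5))

# Approaching that point the amplitude shrinks roughly linearly with the distance.
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, veneziano_like(0.5 + eps, 0.5))
```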
Improvements to the formula make the result plausible, although we know
that the theory is inconsistent, and the agreement may all be an accident.
However, it is possible that a consistent string model of QCD will conserve the
relevant features of the Veneziano tree-level amplitude curing its shortcomings.
At this stage the jury is out.

5 Conclusion
The period 1967–1970 was indeed very productive at Weizmann. The work
that culminated in the Veneziano amplitude and the loop expansion of
Kikkawa, Sakita and Virasoro made it a respectable theory. But the jewel
in the crown was the construction of the Veneziano model.

[Figure 4: (a) Dalitz plot, with M²(π⁺π⁻) of the two π⁺π⁻ combinations on the axes, in (BeV/c²)², range 0.5 to 3.0; (b) number of events versus M²(π⁺π⁻); (c) number of events versus M²(π⁻π⁻)]

Fig. 4. Dalitz plot of the reaction pseudoscalar to three pions

Gabriele returned to Rehovot after an absence, but soon moved to CERN.
Lately he has become interested in cosmology. Miguel moved also to other
research topics, and I also moved to cosmology. Adam Schwimmer is
the only one of us who has kept working mainly on string theory. Nathan Seiberg,
David Kutasov, Ofer Aharony and Michael Berkooz became the new Rehovot
forces in particle physics.
The developments that led to string theory, in particular the work of
Gabriele and Miguel, have not been fully rewarded. It is unquestionable that
the little formula above opened a new chapter in theoretical physics that led
to important developments with repercussions also in mathematics.
Gabriele’s creativity has left an indelible mark in high-energy physics and
cosmology.

References
1. Yuval Ne’eman: Nucl. Phys. 26, 222 (1961)
2. H. R. Rubinstein, F. Scheck, R. Socolow: Phys. Rev. 154, 1608 (1967)
3. M. Elitzur, H. J. Lipkin, H. R. Rubinstein, H. Stern: Phys. Rev. Lett. 17, 420 (1966)
4. H. R. Rubinstein, H. Stern: Phys. Lett. B 21, 447 (1966)
5. P. Federman, H. R. Rubinstein, I. Talmi: Phys. Lett. B 22, 208 (1966); H. R. Rubinstein: Phys. Rev. Lett. 17, 31 (1966)
6. H. R. Rubinstein, G. Veneziano: Phys. Rev. Lett. 18, 411 (1967); H. R. Rubinstein, G. Veneziano: Phys. Rev. 160, 5 (1967)
7. Y. Frishman: Phys. Rev. Lett. 25, 966 (1970)
8. V. De Alfaro, S. Fubini, G. Furlan, C. Rossetti: Currents in Hadron Physics (North-Holland, Amsterdam 1973)
9. V. De Alfaro, S. Fubini, C. Rossetti: Nuovo Cimento Suppl. 6, 575 (1968)
10. R. Dolen, D. Horn, C. Schmid: Phys. Rev. Lett. 19, 402 (1967)
11. S. Fubini, G. Furlan, C. Rossetti: Nuovo Cimento 43, 1611 (1966)
12. H. Harari: Proc. Roy. Soc. Lond. A 318, 355 (1970); P. Freund: Lett. Nuovo Cimento 4, 147 (1970)
13. D. Amati, R. Jengo, H. R. Rubinstein, G. Veneziano, M. Virasoro: Phys. Lett. B 27, 38 (1968)
14. M. Ademollo, H. R. Rubinstein, G. Veneziano, M. Virasoro: Phys. Rev. Lett. 19, 1402 (1967); and also M. Ademollo, H. R. Rubinstein, G. Veneziano, M. Virasoro: Phys. Rev. 176, 1904 (1968)
15. H. R. Rubinstein, A. Schwimmer, G. Veneziano, M. Virasoro: Phys. Rev. Lett. 21, 491 (1968)
16. M. Bishari, H. R. Rubinstein, A. Schwimmer, G. Veneziano: Phys. Rev. 176, 1926 (1968)
17. G. Veneziano: Nuovo Cimento A 57, 190 (1968)
18. S. Fubini, G. Veneziano: Nuovo Cimento A 64, 811 (1969); Y. Nambu: lecture to be delivered at Copenhagen. The lecture was never delivered because of an accident.
19. Y. Nambu: Copenhagen undelivered lecture
20. K. Kikkawa, B. Sakita, M. Virasoro: Phys. Rev. D 1, 3258 (1970)
21. C. Lovelace: Proc. Roy. Soc. Lond. A 318, 321 (1970)
22. C. Lovelace: Phys. Lett. B 28, 269 (1968)
23. G. Altarelli, H. R. Rubinstein: Phys. Rev. 185, 1469 (1969)
24. P. Anninos et al.: Phys. Rev. Lett. 20, 402 (1968)
The Birth of String Theory

P. Di Vecchia

Nordita, Blegdamsvej 17, 2100 Copenhagen Ø, Denmark

Abstract. In this contribution we go through the developments that in the years
from 1968 to about 1974 led from the Veneziano model to the bosonic string
theory. They include the construction of the N-point amplitude for scalar particles,
its factorization through the introduction of an infinite number of oscillators and
the proof that the physical subspace was a positive-definite Hilbert space. We also
discuss the zero slope limit and the calculation of loop diagrams. Lastly, we
describe how it finally was recognized that a quantum-relativistic string theory was
the theory underlying the Veneziano model.

1 Introduction
The 1960s were a period in which strongly interacting processes were studied
in detail using the newly constructed accelerators at CERN and other places.
Many new hadronic states were found that appeared as resonant peaks in var-
ious cross sections, and hadronic cross sections were measured with increasing
accuracy. In general, the experimental data for strongly interacting processes
were rather well understood in terms of resonance exchanges in the direct
channel at low energy, and by the exchange of Regge poles in the transverse
channel at higher energy. Field theory, which had been very successful in
describing QED, seemed useless for strong interactions, given the large number of
hadrons to accommodate in a Lagrangian and the strength of the pion–nucleon
coupling constant, which did not allow perturbative calculations. The only
domain in which field theoretical techniques were successfully used was current
algebra. Here, assuming that strong interactions were described by an almost
chiral invariant Lagrangian, that chiral symmetry was spontaneously broken
and that the pion was the corresponding Goldstone boson, field theoretical
methods gave rather good predictions for scattering amplitudes involving pi-
ons at very low energy. Going to higher energy was, however, not possible
with these methods.
Because of this, many people started to think that field theory was useless
for describing strong interactions, and tried to describe strongly interacting

P. Di Vecchia: The Birth of String Theory, Lect. Notes Phys. 737, 59–118 (2008)
DOI 10.1007/978-3-540-74233-6_4 © Springer-Verlag Berlin Heidelberg 2008
processes with alternative and more phenomenological methods. The basic
ingredients for describing the experimental data were, at low energy, the
exchange of resonances in the direct channel and, at higher energy, the exchange
of Regge poles in the transverse channel. Sum rules for strongly interacting
processes were saturated in this way, and one found good agreement with the
experimental data that came from the newly constructed accelerators. Because
of these successes, and of the problems that field theory encountered in
describing the data, it was proposed to construct the S matrix directly, without
passing through a Lagrangian. The S matrix was supposed to be constructed
from the properties that it should satisfy, but there was no clear procedure for
implementing this construction.¹ The word “bootstrap” was often used
for the way to construct the S matrix, but it did not help very much to get an
S matrix for the strongly interacting processes.
One of the basic ideas that led to the construction of an S matrix was
that it should include resonances at low energy and at the same time give
Regge behaviour at high energy. But the two contributions of the resonances
and of the Regge poles should not be added because this would imply double
counting. This was called Dolen, Horn and Schmid duality [2]. Another idea
that helped in the construction of an S matrix was planar duality [3] that
was visualized by associating to a certain process a duality diagram, shown in
Fig. 1, where each meson was described by two lines representing the quark
and the antiquark. Finally, the requirement of crossing symmetry also played
a very important role.
Starting from these ideas Veneziano [4] was able to construct an S matrix
for the scattering of four mesons that, at the same time, had an infinite number
of zero width resonances lying on linearly rising Regge trajectories and Regge
behaviour at high energy. Veneziano originally constructed the model for the
process ππ → πω, but it was immediately extended to the scattering of four
scalar particles.

Fig. 1. Duality diagram for the scattering of four mesons

1
For a discussion of S matrix theory see [1].
In the case of four identical scalar particles, the crossing symmetric
scattering amplitude found by Veneziano consists of a sum of three terms:

A(s, t, u) = A(s, t) + A(s, u) + A(t, u) (1)

where
\[
A(s, t) = \frac{\Gamma(-\alpha(s))\,\Gamma(-\alpha(t))}{\Gamma(-\alpha(s) - \alpha(t))} = \int_0^1 dx\, x^{-\alpha(s)-1} (1 - x)^{-\alpha(t)-1} \tag{2}
\]

with linearly rising Regge trajectories

\[
\alpha(s) = \alpha_0 + \alpha' s \tag{3}
\]

This was a very important property to implement in a model because it was
in agreement with the experimental data in a wide range of energies. Here s, t
and u are the Mandelstam variables:

\[
s = -(p_1 + p_2)^2 \,, \qquad t = -(p_3 + p_2)^2 \,, \qquad u = -(p_1 + p_3)^2 \tag{4}
\]
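As a hedged numerical illustration of (2), the following sketch compares the ratio of Γ-functions with the integral representation and looks at the residue of one s-channel pole. The trajectory values are arbitrary illustrative numbers, the midpoint quadrature is our own choice, and the integral is only used where it converges, that is for α(s) < 0 and α(t) < 0.

```python
import math

def veneziano(alpha_s, alpha_t):
    """Closed form of Eq. (2) in terms of alpha(s) and alpha(t)."""
    return math.gamma(-alpha_s) * math.gamma(-alpha_t) / math.gamma(-alpha_s - alpha_t)

def veneziano_integral(alpha_s, alpha_t, steps=200_000):
    """Midpoint-rule estimate of the integral in Eq. (2).

    Only meaningful where the integral converges, alpha(s) < 0 and alpha(t) < 0.
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (-alpha_s - 1.0) * (1.0 - x) ** (-alpha_t - 1.0)
    return total * h

# Illustrative trajectory values (assumptions, not data).
closed_form = veneziano(-1.5, -2.3)
numeric = veneziano_integral(-1.5, -2.3)

# Near alpha(s) = n the amplitude behaves like R_n / (n - alpha(s)),
# where R_n is a polynomial of degree n in alpha(t), hence in t.
n, alpha_t, eps = 2, -0.7, 1e-6
residue_estimate = eps * veneziano(n - eps, alpha_t)
residue_polynomial = math.prod(alpha_t + k for k in range(1, n + 1)) / math.factorial(n)
```

The polynomial residue is what allows the s-channel pole to be read as the exchange of a finite set of spins, while the same function also has poles in t.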

The three terms in (1) correspond to the three orderings of the four particles
that are not related by a cyclic or anticyclic² permutation of the external
legs. They correspond, respectively, to the three permutations: (1234), (1243)
and (1324) of the four external legs. They have only simple pole singularities.
The first one has only poles in the s and t channels, the second only in the s
and u channels and the third only in the t and u channels. This property fol-
lows directly from the duality diagram that is associated to each inequivalent
permutation of the external legs. In fact, at that time one used to associate
to each of the three inequivalent permutations a duality diagram where each
particle was drawn as consisting of two lines that represented the quark and
antiquark making up a meson. Furthermore, the diagram was supposed to
have only pole singularities in the planar channels, which are those involving
adjacent external lines. This means that, for instance, the duality diagram
corresponding to the permutation (1234) has only poles in the s and t chan-
nels as one can see by deforming the diagram in the plane in the two possible
ways shown in Fig. 2.
This was a very important property of the duality diagram that makes
it qualitatively different from a Feynman diagram in field theory where each
diagram has only a pole in one of the three s, t and u channels and not
simultaneously in two of them. If we accept the idea that each term of the sum
in (1) is described by a duality diagram, then it is clear that we do not need
to add terms corresponding to equivalent diagrams because the corresponding
duality diagram is the same and has the same singularities. It is now clear
2
An anticyclic permutation corresponding, for instance, to the ordering (1234) is
obtained by taking the reverse of the original ordering (4321) and then performing
a cyclic permutation.
Fig. 2. The duality diagram contains both s and t channel poles

that the fact that the Veneziano model corresponds to the scattering of
relativistic strings was in some way implicit in this picture. But at that time
the connection was not obvious at all. The only S matrix property that the
Veneziano model failed to satisfy was unitarity, because
it contained only zero width resonances and did not have the various cuts
required by unitarity. We will see how this property will be implemented.
Immediately after the formulation of the Veneziano model, Virasoro [5]
proposed another crossing symmetric four-point amplitude for scalar particles
that consisted of a unique piece given by

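\[
A(s, t, u) \sim \frac{\Gamma\left(-\frac{\alpha(u)}{2}\right)\Gamma\left(-\frac{\alpha(s)}{2}\right)\Gamma\left(-\frac{\alpha(t)}{2}\right)}{\Gamma\left(1 + \frac{\alpha(u)}{2}\right)\Gamma\left(1 + \frac{\alpha(s)}{2}\right)\Gamma\left(1 + \frac{\alpha(t)}{2}\right)} \tag{5}
\]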

where

\[
\alpha(s) = \alpha_0 + \alpha' s \tag{6}
\]

The model had poles in all three s, t and u channels and could not be written
as a sum of three terms having poles only in planar diagrams. In conclusion,
the Veneziano model satisfies the principle of planar duality being a crossing
symmetric combination of three contributions each having poles only in the
planar channels. On the other hand, the Virasoro model consists of a unique
crossing symmetric term having poles in both planar and non-planar channels.
The attempts to construct consistent models that were in good agreement
with the strong interaction phenomenology of the 1960s boosted enormously
the activity in this research field. The generalization of the Veneziano model to
the scattering of N scalar particles was built, an operator formalism consisting
of an infinite number of harmonic oscillators was constructed and the complete
spectrum of mesons was determined. It turned out that the degeneracy of
states grew exponentially with the mass. It was also found that the N-point
amplitude had states with negative norm (ghosts) unless the intercept of the
Regge trajectory was α0 = 1 [6]. In this case it turned out that the model
was free of ghosts but the lowest state was a tachyon. The model was called
in the literature the “dual resonance model”.
The model was not unitary because all the states were zero width
resonances and the various cuts required by unitarity were absent. The unitar-
ity was implemented in a perturbative way by adding loop diagrams obtained
by sewing some of the external legs together after the insertion of a propaga-
tor. The multiloop amplitudes showed a structure of Riemann surfaces. This
became obvious only later when the dual resonance model was recognized to
correspond to scattering of strings.
But the main problem was that the model had a tachyon if α0 = 1 or had
ghosts for other values of α0 and was not in agreement with the experimental
data: α0 was not equal to about 1/2, as required by experiments for the ρ
Regge trajectory, and the external scalar particles did not behave as pions
satisfying the current algebra requirements. Many attempts were made to
construct more realistic dual resonance models, but the main result of these
attempts was the construction of the Neveu–Schwarz [7] and the Ramond [8]
models, respectively, for mesons and fermions. They were constructed as two
independent models and only later were recognized to be two sectors of the
same model. The Neveu–Schwarz model still contained a tachyon, which was
eliminated from the physical spectrum only in 1976 through the GSO projection.
Furthermore, it did not properly describe the properties of the physical
pions.
Actually a model describing ππ scattering in a rather satisfactory way
was proposed by Lovelace and Shapiro [9].³ According to this model the three
isospin amplitudes for pion–pion scattering are given by
\[
A_0 = \frac{3}{2}\,[A(s, t) + A(s, u)] - \frac{1}{2}\,A(t, u) \,, \qquad A_1 = A(s, t) - A(s, u) \,, \qquad A_2 = A(t, u) \tag{7}
\]

where
\[
A(s, t) = \beta\, \frac{\Gamma(1 - \alpha(s))\,\Gamma(1 - \alpha(t))}{\Gamma(1 - \alpha(t) - \alpha(s))} \,; \qquad \alpha(s) = \alpha_0 + \alpha' s \tag{8}
\]
The amplitudes in (7) provide a model for ππ scattering with linearly rising
Regge trajectories containing three parameters: the intercept of the ρ Regge
trajectory α0, the Regge slope α′ and β. The first two can be determined by
imposing Adler’s self-consistency condition, which requires the vanishing of
the amplitude when s = t = u = m_π² and one of the pions is massless, and the
fact that the Regge trajectory must give the spin of the ρ meson, which is equal
to 1, when √s is equal to the mass m_ρ of the ρ meson. These two conditions
determine the Regge trajectory to be

\[
\alpha(s) = \frac{1}{2}\left[1 + \frac{s - m_\pi^2}{m_\rho^2 - m_\pi^2}\right] = 0.48 + 0.885\,s \tag{9}
\]
3
See also [10].
Having fixed the parameters of the Regge trajectory, the model predicts the
masses and the couplings of the resonances that decay into ππ in terms of a
unique parameter β. The values obtained are in reasonable agreement with
the experiments. Moreover, one can compute the ππ scattering lengths:

\[
a_0 = 0.395\,\beta \,, \qquad a_2 = -0.103\,\beta \tag{10}
\]

and one finds that their ratio is within 10% of the current algebra ratio given
by a0 /a2 = −7/2. The amplitude in (8) has exactly the same form as that for
four tachyons of the Neveu–Schwarz model with the only apparently minor
difference that α0 = 1/2 (for mπ = 0) instead of 1 as in the Neveu–Schwarz
model. This difference, however, implies that the critical space–time dimension
of this model is d = 4 and not d = 10 as in the Neveu–Schwarz model.⁴ In
conclusion, this model seems to be a perfectly reasonable model for describing
low-energy ππ scattering. The problem is, however, that nobody has been able
to generalize it to the multipion scattering and therefore to get the complete
meson spectrum.
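The numbers quoted in (9) and (10) can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the two conditions α(m_π²) = 1/2 and α(m_ρ²) = 1 are read off from (9), while the input masses are our own assumption (in GeV, with a ρ mass close to the values used in the 1960s) and are not given in the text.

```python
def linear_trajectory(m_pi, m_rho):
    """Intercept and slope fixed by alpha(m_pi^2) = 1/2 and alpha(m_rho^2) = 1, cf. Eq. (9)."""
    slope = 0.5 / (m_rho ** 2 - m_pi ** 2)
    intercept = 0.5 - slope * m_pi ** 2
    return intercept, slope

# Assumed input masses in GeV (not given in the text).
M_PI, M_RHO = 0.1396, 0.765
alpha_0, alpha_prime = linear_trajectory(M_PI, M_RHO)

# Scattering lengths of Eq. (10) in units of beta, and the current algebra ratio.
a0_over_beta, a2_over_beta = 0.395, -0.103
model_ratio = a0_over_beta / a2_over_beta
current_algebra_ratio = -7.0 / 2.0
relative_deviation = abs(model_ratio - current_algebra_ratio) / abs(current_algebra_ratio)

print(f"alpha(s) = {alpha_0:.3f} + {alpha_prime:.3f} s")
print(f"a0/a2 = {model_ratio:.3f}, deviation from -7/2: {relative_deviation:.1%}")
```

With these assumed masses the intercept and slope agree with (9) to the quoted precision, and the ratio of scattering lengths differs from −7/2 by a little less than 10%, as stated above.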
As we have seen the S matrix of the dual resonance model was constructed
using ideas and tools of hadron phenomenology of the end of the 1960s. Al-
though it did not seem possible to write a realistic dual resonance model
describing the pions, it was nevertheless such a source of fascination for those
who actively worked in this field at that time for its beautiful internal struc-
ture and consistency that a lot of energy was used to investigate its properties
and for understanding its basic structure. It turned out with great surprise
that the underlying structure was that of a quantum-relativistic string.
The aim of this contribution is to explain the logic of the work that was
done in the years from 1968 to 1974⁵ in order to uncover the deep properties of
this model, which appeared from the beginning to be so beautiful and consistent
as to deserve intensive study.
This seems to me a very good way of celebrating the 65th anniversary of
Gabriele who is the person who started and also contributed to develop the
whole thing with his deep physical intuition.

2 Construction of the N -point Amplitude

We have seen that the construction of the four-point amplitude is not sufficient
to get information on the full hadronic spectrum because it contains only
those hadrons that couple to two ground state mesons and does not see those
intermediate states which only couple to three or to a higher number of ground
state mesons [12]. Therefore, it was very important to construct the N -point
amplitude involving identical scalar particles. The construction of the N -point
4
This can be checked by computing the coupling of the spinless particle at the
level α(s) = 2 and seeing that it vanishes for d = 4.
5
Reviews from this period can be found in [11].
amplitude was done in [13] (extending the work of [14]) by requiring the same
principles that have led to the construction of the Veneziano model, namely
the fact that the axioms of S-matrix theory be satisfied by an infinite number
of zero width resonances lying on linearly rising Regge trajectories and planar
duality.
The fully crossing symmetric scattering amplitude of N identical scalar
particles is given by a sum of terms corresponding to the inequivalent permu-
tations of the external legs:

\[
A = \sum_{n=1}^{N_p} A_n \tag{11}
\]

Also in this case two permutations of the external legs are inequivalent if they
are not related by a cyclic or anticyclic permutation. Np is the number of
inequivalent permutations of the external legs and is equal to Np = (N − 1)!/2,
and each term has only simple pole singularities in the planar channels. Each
planar channel is described by two indices (i, j), to mean that it includes the
legs i, i + 1, i + 2 . . . j − 1, j, by the Mandelstam variable

\[
s_{ij} = -(p_i + p_{i+1} + \cdots + p_j)^2 \tag{12}
\]

and by an additional variable uij whose role will become clear soon. It is
clear that the channels (i, j) and (j + 1, i − 1)⁶ are identical and they should
be counted only once. In the case of N identical scalar particles the number
of planar channels is equal to N(N − 3)/2. This can be obtained as follows. The
independent planar diagrams involving the particle 1 are of the type (1, i)
where i = 2 . . . N − 2. Their number is N − 3. This is also the number of
planar diagrams involving the particle 2 and not the 1. The number of planar
diagrams involving the particle 3 and not the particles 1 and 2 is equal to
N − 4. In general the number of planar diagrams involving the particle i and
not the previous ones from 1 to i − 1 is equal to N − 1 − i. This means that
the total number of planar diagrams is equal to

\[
2(N - 3) + \sum_{i=3}^{N-2} (N - 1 - i) = 2(N - 3) + \sum_{i=1}^{N-4} i = 2(N - 3) + \frac{(N - 4)(N - 3)}{2} = \frac{N(N - 3)}{2} \tag{13}
\]
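The two counting statements above, Np = (N − 1)!/2 inequivalent orderings and N(N − 3)/2 planar channels, can be checked by brute force for small N. The sketch below is an illustration only; the function names are ours, and each channel is represented by the set of consecutive legs that does not contain leg N, so that a channel and its complement are counted once.

```python
from itertools import permutations
from math import factorial

def canonical(order):
    """Representative of an ordering under cyclic and anticyclic permutations."""
    n = len(order)
    candidates = []
    for seq in (order, order[::-1]):
        for r in range(n):
            candidates.append(seq[r:] + seq[:r])
    return min(candidates)

def inequivalent_orderings(n):
    """All orderings of the legs 1..n up to cyclic and anticyclic permutations."""
    legs = tuple(range(1, n + 1))
    return sorted({canonical(p) for p in permutations(legs)})

def planar_channels(n):
    """Channels (i, j) of consecutive legs i..j, each counted once with its complement."""
    return [(i, j) for i in range(1, n) for j in range(i + 1, n) if j - i + 1 <= n - 2]

def channels_by_leg(n):
    """Left-hand side of Eq. (13): channels counted leg by leg."""
    return 2 * (n - 3) + sum(n - 1 - i for i in range(3, n - 1))

for n in range(4, 8):
    print(n, len(inequivalent_orderings(n)), len(planar_channels(n)))
```

For N = 4 the three representatives are (1234), (1243) and (1324), the orderings used in (1), and for N = 5 the five channels are (1, 2), (1, 3), (2, 3), (2, 4) and (3, 4).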
If one writes down the duality diagram corresponding to a certain planar
ordering of the external particles, it is easy to see that the diagram can have
simultaneous pole singularities only in N −3 channels. The channels that allow
simultaneous pole singularities are called compatible channels, the others are
6
This channel includes the particles (j + 1, . . . , N, 1, . . . i − 1).
called incompatible. Two channels (i, j) and (h, k) are incompatible if the
following inequalities are satisfied:

\[
i \le h \le j \,; \qquad j + 1 \le k \le i - 1 \tag{14}
\]
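The statement that at most N − 3 channels can have simultaneous poles can be explored by brute force. In the sketch below, which is our own formulation and not taken from the text, each channel is represented by the interval of consecutive legs that does not contain leg N, and two channels are treated as compatible when these intervals are either nested or disjoint; this criterion is meant to capture the same notion of compatibility as the duality diagrams.

```python
from itertools import combinations

def planar_channels(n):
    """Channels (i, j) of consecutive legs i..j, using the representative without leg n."""
    return [(i, j) for i in range(1, n) for j in range(i + 1, n) if j - i + 1 <= n - 2]

def compatible(a, b):
    """Assumed criterion: the two leg intervals are nested or disjoint."""
    (i, j), (h, k) = a, b
    disjoint = j < h or k < i
    nested = (i <= h and k <= j) or (h <= i and j <= k)
    return disjoint or nested

def incompatible_with(n):
    """Map each channel to the list of channels it cannot share a pole with."""
    chans = planar_channels(n)
    return {a: [b for b in chans if b != a and not compatible(a, b)] for a in chans}

def max_compatible(n):
    """Largest number of mutually compatible channels, found by exhaustive search."""
    chans = planar_channels(n)
    best = 0
    for size in range(1, len(chans) + 1):
        found = any(
            all(compatible(a, b) for a, b in combinations(subset, 2))
            for subset in combinations(chans, size)
        )
        if not found:
            break  # subsets of compatible sets are compatible, so we can stop
        best = size
    return best

print({n: max_compatible(n) for n in (4, 5, 6, 7)})
```

For N = 5 the channels incompatible with (1, 2) come out as (2, 3) and (2, 4), which is the pairing that appears in the five-point relations below.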

The aim is to construct the scattering amplitude for each inequivalent
permutation of the external legs that has only pole singularities in the N(N − 3)/2
planar channels. We also have to impose that the amplitude has simultaneous
poles only in N − 3 compatible channels. In order to gain intuition on how to
proceed, we rewrite the four-point amplitude in (2) as follows:
\[
A(s, t) = \int_0^1 du_{12} \int_0^1 du_{23}\; u_{12}^{-\alpha(s_{12})-1}\, u_{23}^{-\alpha(s_{23})-1}\, \delta(u_{12} + u_{23} - 1) \tag{15}
\]

where u12 and u23 are the variables corresponding to the two planar chan-
nels (12) and (23) and the cancellation of simultaneous poles in incompatible
channels is provided by the δ-function which forbids u12 and u23 to vanish
simultaneously.
We will now extend this procedure to the N -point amplitude. But for the
sake of clarity let us start with the case of N = 5 [14]. In this case we have five
planar channels described by u12 , u13 , u23 , u24 and u34 . Since we have only two
compatible channels only two of the previous five variables are independent.
We can choose them to be u12 and u13 . In order to determine the depen-
dence of the other three variables on the two independent ones, we exclude
simultaneous poles in incompatible channels. This can be done by imposing
relations that prevent variables corresponding to incompatible channels from
vanishing simultaneously. A sufficient condition for excluding simultaneous poles
in incompatible channels is to impose the conditions:

\[
u_P = 1 - \prod_{\bar P} u_{\bar P} \tag{16}
\]

where the product is over the variables P̄ corresponding to channels that
are incompatible with P. In the case of the five-point amplitude, we get the
following relations:

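\[
\begin{aligned}
u_{23} &= 1 - u_{34} u_{12} \,; \qquad u_{24} = 1 - u_{13} u_{12} \\
u_{13} &= 1 - u_{34} u_{24} \,; \qquad u_{34} = 1 - u_{23} u_{13} \,; \qquad u_{12} = 1 - u_{24} u_{23}
\end{aligned} \tag{17}
\]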

Solving them in terms of the two independent ones we get

\[
u_{23} = \frac{1 - u_{12}}{1 - u_{12} u_{13}} \,; \qquad u_{34} = \frac{1 - u_{13}}{1 - u_{12} u_{13}} \,; \qquad u_{24} = 1 - u_{12} u_{13} \tag{18}
\]
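That the expressions in (18) satisfy all five relations in (17), and not only the three used to derive them, can be confirmed numerically. The following is a minimal sketch with randomly sampled values of the two independent variables; the sampling range and the seed are arbitrary choices of ours.

```python
import random

def dependent_variables(u12, u13):
    """Eq. (18): returns (u23, u34, u24) in terms of the independent u12, u13."""
    d = 1.0 - u12 * u13
    return (1.0 - u12) / d, (1.0 - u13) / d, d

def residuals(u12, u13):
    """Left-hand side minus right-hand side for the five relations in Eq. (17)."""
    u23, u34, u24 = dependent_variables(u12, u13)
    return [
        u23 - (1.0 - u34 * u12),
        u24 - (1.0 - u13 * u12),
        u13 - (1.0 - u34 * u24),
        u34 - (1.0 - u23 * u13),
        u12 - (1.0 - u24 * u23),
    ]

random.seed(0)
worst = max(
    abs(r)
    for _ in range(1000)
    for r in residuals(random.uniform(0.01, 0.99), random.uniform(0.01, 0.99))
)
print(f"largest violation of Eq. (17): {worst:.2e}")
```

The dependent variables also stay between 0 and 1 on the integration region, so that two incompatible variables cannot vanish at the same point.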
In analogy with what we have done for the four-point amplitude in (15) we
write the five-point amplitude as follows:
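\[
\begin{aligned}
&\int_0^1 du_{12} \int_0^1 du_{13} \int_0^1 du_{23} \int_0^1 du_{24} \int_0^1 du_{34}\; u_{12}^{-\alpha(s_{12})-1}\, u_{13}^{-\alpha(s_{13})-1} \\
&\quad \times u_{24}^{-\alpha(s_{24})-1}\, u_{23}^{-\alpha(s_{23})-1}\, u_{34}^{-\alpha(s_{34})-1} \\
&\quad \times \delta(u_{23} + u_{12} u_{34} - 1)\, \delta(u_{24} + u_{12} u_{13} - 1)\, \delta(u_{34} + u_{13} u_{23} - 1)
\end{aligned} \tag{19}
\]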

Performing the integral over the variables u23 , u24 and u34 we get
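\[
\begin{aligned}
&\int_0^1 du_{12} \int_0^1 du_{13}\; u_{12}^{-\alpha(s_{12})-1}\, u_{13}^{-\alpha(s_{13})-1} \\
&\quad \times (1 - u_{12})^{-\alpha(s_{23})-1} (1 - u_{13})^{-\alpha(s_{34})-1} (1 - u_{12} u_{13})^{-\alpha(s_{24}) + \alpha(s_{23}) + \alpha(s_{34})}
\end{aligned} \tag{20}
\]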

We have implicitly assumed that the Regge trajectory is the same in all chan-
nels and that the external scalar particles have the same common mass m
and are the lowest lying states on the Regge trajectory. This means that their
mass is given by

\[
\alpha_0 - \alpha' p_i^2 = 0 \,; \qquad p_i^2 \equiv -m^2 \tag{21}
\]

Using then the relation

\[
\alpha(s_{23}) + \alpha(s_{34}) - \alpha(s_{24}) = 2\alpha' p_2 \cdot p_4 \tag{22}
\]

we can rewrite (20) as follows:

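\[
\begin{aligned}
B_5 = {}& \int_0^1 du_2 \int_0^1 du_3\; u_2^{-\alpha(s_2)-1}\, u_3^{-\alpha(s_3)-1} (1 - u_2)^{-\alpha(s_{23})-1} \\
& \times (1 - u_3)^{-\alpha(s_{34})-1} \prod_{i=2}^{2} \prod_{j=4}^{4} (1 - x_{ij})^{2\alpha' p_i \cdot p_j}
\end{aligned} \tag{23}
\]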

where

\[
s_i \equiv s_{1i} \,, \quad u_i \equiv u_{1i} \,; \quad i = 2, 3 \,; \qquad x_{ij} = u_i u_{i+1} \ldots u_{j-1} \tag{24}
\]

We are now ready to construct the N -point function [13]. In analogy with
what has been done for the four- and five-point amplitudes, we can write the
N -point amplitude as follows:
\[
B_N = \int_0^1 \cdots \int_0^1 \prod_P \left[ du_P\, u_P^{-\alpha(s_P)-1} \right] \prod_Q \delta\Big( u_Q - 1 + \prod_{\bar Q} u_{\bar Q} \Big) \tag{25}
\]
where the first product is over the N(N − 3)/2 variables corresponding to all
planar channels, while the second one is over the (N − 3)(N − 2)/2 independent
δ-functions. The product in the δ-function is defined in (16).
The solution of all the non-independent linear relations imposed by the
δ-functions is given by
\[
u_{ij} = \frac{(1 - x_{ij})(1 - x_{i-1,j+1})}{(1 - x_{i-1,j})(1 - x_{i,j+1})} \tag{26}
\]
where the variables xij are given in (24). Eliminating the δ-function from Eq.
(25) one gets

\[
B_N = \prod_{i=2}^{N-2} \int_0^1 du_i\; u_i^{-\alpha(s_i)-1} (1 - u_i)^{-\alpha(s_{i,i+1})-1} \prod_{i=2}^{N-3} \prod_{j=i+2}^{N-1} (1 - x_{ij})^{-\gamma_{ij}} \tag{27}
\]

where

\[
\gamma_{ij} = \alpha(s_{ij}) + \alpha(s_{i+1;j-1}) - \alpha(s_{i;j-1}) - \alpha(s_{i+1;j}) \,; \qquad j \ge i + 2 \tag{28}
\]

It is easy to see that

\[
\alpha(s_{i,i+1}) = -\alpha_0 - 2\alpha' p_i \cdot p_{i+1} \,; \qquad \gamma_{ij} = -2\alpha' p_i \cdot p_j \,; \quad j \ge i + 2 \tag{29}
\]

Inserting them in (27) we get

\[
B_N = \prod_{i=2}^{N-2} \int_0^1 du_i\; u_i^{-\alpha(s_i)-1} (1 - u_i)^{\alpha_0 - 1} \prod_{i=2}^{N-2} \prod_{j=i+1}^{N-1} (1 - x_{ij})^{2\alpha' p_i \cdot p_j} \tag{30}
\]

This is the form of the N -point amplitude that was originally constructed.
Then Koba and Nielsen [15] put it in the form that is better known nowadays.
They constructed it using the following rules. They associated a real variable
zi to each leg i. Then they associated to each channel (i, j) an anharmonic
ratio constructed from the variables zi , zi−1 , zj , zj+1 in the following way:
\[
(z_i, z_{i+1}, z_j, z_{j+1})^{-\alpha(s_{ij})-1} = \left[ \frac{(z_i - z_j)(z_{i-1} - z_{j+1})}{(z_{i-1} - z_j)(z_i - z_{j+1})} \right]^{-\alpha(s_{ij})-1} \tag{31}
\]
and finally they gave the following expression for the N-point amplitude:
\[
B_N = \int_{-\infty}^{\infty} dV(z) \prod_{(i,j)} (z_i, z_{i+1}, z_j, z_{j+1})^{-\alpha(s_{ij})-1} \tag{32}
\]

where

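\[
dV(z) = \frac{\prod_{i=1}^{N} \left[ \theta(z_i - z_{i+1})\, dz_i \right]}{\prod_{i=1}^{N} (z_i - z_{i+2})\; dV_{abc}} \,; \qquad dV_{abc} = \frac{dz_a\, dz_b\, dz_c}{(z_b - z_a)(z_c - z_b)(z_a - z_c)} \tag{33}
\]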
and the variables zi are integrated along the real axis in a cyclically ordered
way: z1 ≥ z2 · · · ≥ zN with a, b and c arbitrarily chosen.
The integrand of the N -point amplitude is invariant under projective
transformations acting on the leg variables zi :
\[
z_i \to \frac{\alpha z_i + \beta}{\gamma z_i + \delta} \,; \quad i = 1 \ldots N \,; \qquad \alpha\delta - \beta\gamma = 1 \tag{34}
\]

This is because both the anharmonic ratio in (31) and the measure dVabc are
invariant under a projective transformation. Since a projective transformation
depends on three real parameters, the integrand of the N-point amplitude
depends only on N − 3 variables zi. In order to avoid infinities, one has then
to divide the integration volume by the factor dVabc that is also invariant
under the projective transformations. The fact that the integrand depends
only on N − 3 variables is in agreement with the fact that N − 3 is also the
maximal number of simultaneous poles allowed in the amplitude.
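Both invariance statements can be illustrated numerically. The sketch below, with arbitrarily chosen points and transformation parameters (our assumptions), checks that the anharmonic ratio of (31) is unchanged by (34) and that the coefficient of dz_a dz_b dz_c in dV_abc compensates the Jacobian factors dz′/dz = 1/(γz + δ)².

```python
def moebius(z, a, b, c, d):
    """Projective transformation of Eq. (34) with real parameters, a*d - b*c = 1."""
    return (a * z + b) / (c * z + d)

def anharmonic_ratio(z_i, z_im1, z_j, z_jp1):
    """Right-hand side of Eq. (31) without the exponent."""
    return ((z_i - z_j) * (z_im1 - z_jp1)) / ((z_im1 - z_j) * (z_i - z_jp1))

def dv_density(za, zb, zc):
    """Coefficient of dz_a dz_b dz_c in dV_abc, Eq. (33)."""
    return 1.0 / ((zb - za) * (zc - zb) * (za - zc))

# Illustrative parameters and points (assumptions): a*d - b*c = 1.
a, b, c, d = 2.0, 1.0, 3.0, 2.0
points = [4.0, 2.5, 1.0, -0.5]
images = [moebius(z, a, b, c, d) for z in points]

ratio_before = anharmonic_ratio(*points)
ratio_after = anharmonic_ratio(*images)

# Each differential picks up dz'/dz = 1 / (c z + d)^2 under the transformation.
jacobian = 1.0
for z in points[:3]:
    jacobian *= 1.0 / (c * z + d) ** 2
measure_before = dv_density(*points[:3])
measure_after = dv_density(*images[:3]) * jacobian
```

Since three of the zi can be moved to arbitrary positions in this way, only N − 3 of them are left to be integrated over.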
It is convenient to write the N -point amplitude in a form that involves the
scalar product of the external momenta rather than the Regge trajectories.
We distinguish three kinds of channels. The first one is when the particles
i and j of the channel (i, j) are separated by at least two particles. In this
case the channels that contribute to the exponent of the factor (zi − zj ) are
the channels (i, j) with exponent equal to −α(sij ) − 1, (i + 1, j − 1) with
exponent −α(si+1,j−1 ) − 1, (i + 1, j) with exponent α(si+1,j ) + 1 and (i, j − 1)
with exponent α(si,j−1 ) + 1. Adding these four contributions, one gets for the
channels where i and j are separated by at least two particles

\[
-\alpha(s_{ij}) - \alpha(s_{i+1,j-1}) + \alpha(s_{i+1,j}) + \alpha(s_{i,j-1}) = 2\alpha' p_i \cdot p_j \tag{35}
\]

The second one comes from the channels that are separated by only one
particle. In this case only three of the previous four channels contribute. For
instance, if j = i + 2 the channel (i + 1, j − 1) consists of only one particle
and therefore should not be included. This means that we would get

\[
-\alpha(s_{i;i+2}) - 1 + \alpha(s_{i+1;i+2}) + 1 + \alpha(s_{i;i+1}) + 1 = 1 + 2\alpha' p_i \cdot p_{i+2} \tag{36}
\]

Finally, the third one that comes from the channels whose particles are adja-
cent, gets only contribution from

\[
-\alpha(s_{i;i+1}) - 1 = \alpha_0 - 1 + 2\alpha' p_i \cdot p_{i+1} \tag{37}
\]

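Putting all these three terms together in (32) and remembering the factor in
the denominator in the first equation of (33) we get

\[
B_N = \int_{-\infty}^{\infty} \frac{\prod_{i=1}^{N} dz_i\, \theta(z_i - z_{i+1})}{dV_{abc}} \prod_{i=1}^{N} (z_i - z_{i+1})^{\alpha_0 - 1} \prod_{j>i} (z_i - z_j)^{2\alpha' p_i \cdot p_j} \tag{38}
\]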
A convenient choice for the three variables to keep fixed is

\[
z_a = z_1 = \infty \,; \qquad z_b = z_2 = 1 \,; \qquad z_c = z_N = 0 \tag{39}
\]

With this choice the previous equation becomes

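\[
\begin{aligned}
B_N = {}& \prod_{i=3}^{N-1} \int_0^1 dz_i\, \theta(z_i - z_{i+1}) \prod_{i=2}^{N-1} (z_i - z_{i+1})^{\alpha_0 - 1} \\
& \times \prod_{i=2}^{N-1} \prod_{j=i+1}^{N} (z_i - z_j)^{2\alpha' p_i \cdot p_j}
\end{aligned} \tag{40}
\]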

We now want to show that this amplitude is identical to the one given in (30).
This can be done by performing the following change of variables:
\[
u_i = \frac{z_{i+1}}{z_i} \,; \qquad i = 2, 3 \ldots N - 2 \tag{41}
\]
that implies

\[
z_i = u_2 u_3 \ldots u_{i-1} \,; \qquad i = 3, 4 \ldots N - 1 \tag{42}
\]

Taking into account that the Jacobian is equal to

\[
\det \frac{\partial z}{\partial u} = \prod_{i=3}^{N-2} z_i = \prod_{i=2}^{N-3} u_i^{N-2-i} \tag{43}
\]

using the following two relations:

\[
\det \frac{\partial z}{\partial u} \prod_{i=2}^{N-1} (z_i - z_{i+1})^{\alpha_0 - 1} = \prod_{i=2}^{N-2} u_i^{(N-1-i)\alpha_0 - 1} \prod_{i=2}^{N-2} (1 - u_i)^{\alpha_0 - 1} \tag{44}
\]

and

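\[
\begin{aligned}
&\prod_{i=2}^{N-1} \prod_{j=i+1}^{N} (z_i - z_j)^{2\alpha' p_i \cdot p_j} \\
&\quad = \prod_{i=2}^{N-2} \prod_{j=i+1}^{N-1} (1 - x_{ij})^{2\alpha' p_i \cdot p_j} \prod_{i=2}^{N-2} u_i^{-\alpha(s_i) - (N-i-1)\alpha_0}
\end{aligned} \tag{45}
\]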

and the conservation of momentum

\[
\sum_{i=1}^{N} p_i = 0 \tag{46}
\]

together with (21), one can easily see that (30) and (40) are equal.
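The relations (43), (44) and (45) can be tested numerically for a fixed N. The sketch below is only a consistency check under stated assumptions: N = 6, illustrative values of α0, α′ and of the ui, and hypothetical test momenta given by N planar vectors of equal length that sum to zero. A Euclidean dot product stands in for the Minkowski one, which is harmless here because the identities are purely algebraic consequences of (21) and (46).

```python
import math

N = 6
ALPHA_PRIME = 1.0   # illustrative slope (assumption)
ALPHA_0 = 0.5       # illustrative intercept (assumption)

# Hypothetical test momenta: equal length, zero sum, with p_i^2 = ALPHA_0 / ALPHA_PRIME
# so that Eq. (21) holds. Euclidean dot product used as a stand-in.
radius = math.sqrt(ALPHA_0 / ALPHA_PRIME)
order = [0, 2, 5, 1, 4, 3]
p = {leg: (radius * math.cos(2 * math.pi * k / N), radius * math.sin(2 * math.pi * k / N))
     for leg, k in zip(range(1, N + 1), order)}

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def alpha_s1(i):
    """alpha(s_i) with s_i = s_{1i} = -(p_1 + ... + p_i)^2."""
    total = (sum(p[k][0] for k in range(1, i + 1)), sum(p[k][1] for k in range(1, i + 1)))
    return ALPHA_0 - ALPHA_PRIME * dot(total, total)

u = {2: 0.3, 3: 0.7, 4: 0.45}          # u_2 ... u_{N-2}, arbitrary values in (0, 1)
z = {2: 1.0}                            # z_2 = 1, Eq. (39)
for i in range(2, N - 1):               # Eq. (42)
    z[i + 1] = z[i] * u[i]
z[N] = 0.0                              # z_N = 0, Eq. (39)

def x(i, j):
    """x_ij = u_i u_{i+1} ... u_{j-1}, Eq. (24)."""
    return math.prod(u[k] for k in range(i, j))

def e(i, j):
    """Exponent 2 alpha' p_i . p_j."""
    return 2 * ALPHA_PRIME * dot(p[i], p[j])

jac_z = math.prod(z[i] for i in range(3, N - 1))                      # Eq. (43), middle
jac_u = math.prod(u[i] ** (N - 2 - i) for i in range(2, N - 2))       # Eq. (43), right

lhs44 = jac_z * math.prod((z[i] - z[i + 1]) ** (ALPHA_0 - 1) for i in range(2, N))
rhs44 = math.prod(u[i] ** ((N - 1 - i) * ALPHA_0 - 1) * (1 - u[i]) ** (ALPHA_0 - 1)
                  for i in range(2, N - 1))

lhs45 = math.prod((z[i] - z[j]) ** e(i, j) for i in range(2, N) for j in range(i + 1, N + 1))
rhs45 = (math.prod((1 - x(i, j)) ** e(i, j) for i in range(2, N - 1) for j in range(i + 1, N))
         * math.prod(u[i] ** (-alpha_s1(i) - (N - i - 1) * ALPHA_0) for i in range(2, N - 1)))
```

Multiplying (44) and (45), the powers of α0 in the exponents of the ui cancel and one is left with the factors ui^(−α(si)−1) of (30).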
The N-point amplitude that we have constructed in this section corresponds
to the scattering of N spinless particles with no internal degrees of
freedom. On the other hand, it was known that the mesons were classified
according to multiplets of an SU(3) flavour symmetry. This was implemented
by Chan and Paton [16] by multiplying the N-point amplitude with a factor,
called Chan–Paton factor, given by

\[
\mathrm{Tr}(\lambda^{a_1} \lambda^{a_2} \ldots \lambda^{a_N}) \tag{47}
\]

where the λ’s are matrices of a unitary group in the fundamental representa-
tion. Including the Chan–Paton factors the total scattering amplitude is given
by

\[
\sum_P \mathrm{Tr}(\lambda^{a_1} \lambda^{a_2} \ldots \lambda^{a_N})\, B_N(p_1, p_2, \ldots p_N) \tag{48}
\]

where the sum is extended to the (N − 1)! permutations of the external legs
that are not related by a cyclic permutation. Originally, when the dual
resonance model was supposed to describe strongly interacting mesons, this factor
was introduced to represent their flavour degrees of freedom. Nowadays, the
interpretation is different and the Chan–Paton factor represents the colour
degrees of freedom of the gauge bosons and the other massive particles of the
spectrum.
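The reason why the sum in (48) runs over orderings modulo cyclic permutations only can be seen in a small example. The sketch below uses the identity and the Pauli matrices as an assumed set of U(2) matrices (our choice, purely for illustration) and shows that the trace is unchanged by a cyclic permutation of the labels but in general changes under reversal of the ordering.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace_of_product(mats):
    """Tr(M_1 M_2 ... M_k)."""
    prod = mats[0]
    for m in mats[1:]:
        prod = matmul(prod, m)
    return sum(prod[i][i] for i in range(len(prod)))

# Assumed U(2) matrices: identity and Pauli matrices.
LAMBDA = {
    0: [[1, 0], [0, 1]],
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def chan_paton(labels):
    """Chan-Paton factor of Eq. (47) for a given ordering of the labels a_1 ... a_N."""
    return trace_of_product([LAMBDA[a] for a in labels])

labels = (1, 2, 3, 0)
cyclic = [labels[r:] + labels[:r] for r in range(len(labels))]
cyclic_values = [chan_paton(c) for c in cyclic]
reversed_value = chan_paton(labels[::-1])
```

For these labels every cyclic rotation gives the same factor, while the reversed ordering gives the opposite sign, so orderings related by an anticyclic permutation must be kept as separate terms once the Chan–Paton factors are included.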
The N -point amplitude BN that we have constructed in this section con-
tains only simple pole singularities in all possible planar channels. They cor-
respond to zero width resonances located at non-negative integer values n
of the Regge trajectory α(M²) = n. The lowest state located at α(m²) = 0
corresponds to the particles on the external legs of BN . The spectrum of
excited particles can be obtained by factorizing the N -point amplitude in
the most general channel with any number of particles. This was done in
[17] and [18] finding a spectrum of states rising exponentially with the mass
M. Since the model is relativistically invariant, it was found that many states
obtained by factorizing the N-point amplitude were “ghosts”, namely, states
with negative norm as one finds in QED when one quantizes the electromag-
netic field in a covariant gauge. The consistency of the model requires the
existence of relations satisfied by the scattering amplitudes that are similar to
those obtained through gauge invariance in QED. If the model is consistent
they must decouple the negative norm states leaving us with a physical spec-
trum of positive norm states. In order to study in a simple way these issues,
we discuss in the next section the operator formalism introduced already in
1969 [19, 20, 21].
Before concluding this section let us go back to the non-planar four-point
amplitude in (5) and discuss its generalization to an N -point amplitude. Using
the technique of the electrostatic analogue on the sphere instead of on the disk
Shapiro [22] was able to obtain an N-point amplitude that reduces to the four-
point amplitude in (5) with intercept α0 = 2. The N-point amplitude found
in [22] is
\[
\int \frac{\prod_{i=1}^{N} d^2 z_i}{dV_{abc}} \prod_{i<j} |z_i - z_j|^{\alpha' p_i \cdot p_j} \tag{49}
\]

3 Operator Formalism and Factorization

The factorization properties of the dual resonance model were first studied by
factorizing by brute force the N -point amplitude at the various poles [17, 18].
The number of terms that factorize the residue of the pole at α(s) = n
increases rapidly with the value of n. In order to find their degeneracy, it turned
out to be convenient to first rewrite the N -point amplitude in an operator for-
malism. In this section we introduce the operator formalism and we rewrite
the N -point amplitude derived in the previous section in this formalism.
The key idea [19, 20, 21] is to introduce an infinite set of harmonic
oscillators and position and momentum operators,⁷ which satisfy the following
commutation relations:

\[
[a_{n\mu}, a^\dagger_{m\nu}] = \eta_{\mu\nu}\, \delta_{nm} \,; \qquad [\hat q_\mu, \hat p_\nu] = i \eta_{\mu\nu} \tag{51}
\]

where ημν is the flat Minkowski metric that we take to be ημν = (−1, 1, . . . 1).
A state with momentum p is constructed in terms of a state with zero
momentum as follows:

\[
\hat p\, |p\rangle \equiv \hat p\, e^{i p \cdot \hat q} |0\rangle = p\, |p\rangle \,; \qquad \hat p\, |0\rangle = 0 \tag{52}
\]

normalized as⁸

\[
\langle p | p' \rangle = (2\pi)^d\, \delta^{(d)}(p + p') \tag{53}
\]

In order to avoid minus signs we use the convention that

\[
\langle p | = \langle 0 |\, e^{i p \cdot \hat q} \tag{54}
\]

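A complete and orthonormal basis of vectors in the harmonic oscillator space
is given by
\[
|\lambda_1, \lambda_2, \ldots \lambda_i ; p\rangle = \prod_n \frac{(a^\dagger_{\mu_n;n})^{\lambda_{n;\mu_n}}}{\sqrt{\lambda_{n,\mu_n}!}}\; e^{i p \cdot \hat q}\, |0, 0\rangle \tag{55}
\]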
7
Actually the position and momentum operators were introduced in [23].
8
Although we now use an arbitrary d we want to remind you that all original
calculations were done for d = 4.
where the first 0 in |0, 0⟩ corresponds to the state annihilated by all annihilation
operators and the second one to the state of zero momentum

\[
a_{\mu_n;n} |0, 0\rangle = \hat p\, |0, 0\rangle = 0 \tag{56}
\]

Notice that Lorentz invariance forces us to introduce also oscillators that create
states with negative norm due to the minus sign in the flat Minkowski metric.
This implies that the space spanned by the states in (55) is not positive
definite. This is, however, not allowed in a quantum theory and therefore, if
the dual resonance model is a consistent quantum-relativistic theory, we expect
the presence of relations of the kind of those provided by gauge invariance in
QED.
Let us introduce the Fubini–Veneziano [23] operator

\[
Q_\mu(z) = Q^{(+)}_\mu(z) + Q^{(0)}_\mu(z) + Q^{(-)}_\mu(z) \tag{57}
\]

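where
\[
\begin{aligned}
Q^{(+)} &= i \sqrt{2\alpha'} \sum_{n=1}^{\infty} \frac{a_n}{\sqrt{n}}\, z^{-n} \,; \qquad Q^{(-)} = -i \sqrt{2\alpha'} \sum_{n=1}^{\infty} \frac{a^\dagger_n}{\sqrt{n}}\, z^{n} \\
Q^{(0)} &= \hat q - 2 i \alpha' \hat p \log z
\end{aligned} \tag{58}
\]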

In terms of Q we introduce the vertex operator corresponding to the external
leg with momentum p:
\[
V(z; p) = {:}e^{i p \cdot Q(z)}{:} \equiv e^{i p \cdot Q^{(-)}(z)}\, e^{i p \cdot \hat q}\, e^{+2\alpha' \hat p \cdot p \log z}\, e^{i p \cdot Q^{(+)}(z)} \tag{59}
\]

and compute the following vacuum expectation value:

\[
\langle 0, 0 | \prod_{i=1}^{N} V(z_i, p_i)\, | 0, 0 \rangle \tag{60}
\]

It can be easily computed using the Baker–Hausdorff relation

\[
e^A e^B = e^B e^A e^{[A,B]} \tag{61}
\]

that is valid if the commutator [A, B] is, as in our case, a c-number. In our
case the commutation relations to be used are
\[
[Q^{(+)}(z), Q^{(-)}(w)] = -2\alpha' \log\left(1 - \frac{w}{z}\right) \tag{62}
\]
and the second one in (51). Using them one gets

\[
V(z; p)\, V(w; k) = {:}V(z; p)\, V(w; k){:}\; (z - w)^{2\alpha' p \cdot k} \tag{63}
\]
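Two ingredients of this computation can be illustrated with elementary numerics. The sketch below is not the oscillator calculation itself: the relation (61) is checked on 3×3 strictly upper-triangular matrices, a finite-dimensional stand-in of our own choosing in which the commutator commutes with both A and B, and the logarithm in (62) is compared with the series Σ (w/z)ⁿ/n that arises from the oscillator sum for |w| < |z|.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def exp_nilpotent(m):
    """exp(m) = 1 + m + m^2/2, exact for a strictly upper-triangular 3x3 matrix."""
    m2 = matmul(m, m)
    return [[(1.0 if i == j else 0.0) + m[i][j] + 0.5 * m2[i][j] for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

# Stand-in operators whose commutator is central (assumption for illustration).
A = [[0.0, 1.5, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
B = [[0.0, 0.0, 0.0], [0.0, 0.0, -0.8], [0.0, 0.0, 0.0]]
C = commutator(A, B)

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
rhs = matmul(matmul(exp_nilpotent(B), exp_nilpotent(A)), exp_nilpotent(C))
bch_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))

def oscillator_sum(ratio, terms=200):
    """Truncated sum over n of (w/z)^n / n, as produced by the commutators [a_n, a_n^dagger]."""
    return sum(ratio ** n / n for n in range(1, terms + 1))

ratio = 0.4   # illustrative value of w/z
series_gap = abs(oscillator_sum(ratio) + math.log(1.0 - ratio))
```

In the operator computation the same logarithm is exponentiated with the factor p·k and, combined with the zero-mode contribution, produces the power (z − w)^(2α′p·k) in (63).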

and
\[
\langle 0, 0 | \prod_{i=1}^{N} V(z_i, p_i)\, | 0, 0 \rangle = \prod_{i>j} (z_i - z_j)^{2\alpha' p_i \cdot p_j}\, (2\pi)^d\, \delta^{(d)}\Big( \sum_{i=1}^{N} p_i \Big) \tag{64}
\]

where the normal ordering requires that all creation operators be put on the
left of the annihilation one and the momentum operator p̂ be put on the right
of the position operator q̂. This means that

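\[
\begin{aligned}
(2\pi)^d\, \delta^{(d)}\Big( \sum_{i=1}^{N} p_i \Big) B_N = {}& \int_{-\infty}^{\infty} \frac{\prod_{i=1}^{N} dz_i\, \theta(z_i - z_{i+1})}{dV_{abc}} \prod_{i=1}^{N} (z_i - z_{i+1})^{\alpha_0 - 1} \\
& \times \langle 0, 0 | \prod_{i=1}^{N} V(z_i, p_i)\, | 0, 0 \rangle
\end{aligned} \tag{65}
\]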

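By choosing the three variables za, zb and zc as in (39) we can rewrite the
previous equation as follows:

\[
\begin{aligned}
(2\pi)^d\, \delta^{(d)}\Big( \sum_{i=1}^{N} p_i \Big) B_N = {}& \int_0^1 \prod_{i=3}^{N-1} dz_i \prod_{i=2}^{N-1} \theta(z_i - z_{i+1}) \\
& \times \prod_{i=2}^{N-1} (z_i - z_{i+1})^{\alpha_0 - 1}\, \langle 0, p_1 | \prod_{i=2}^{N-1} V(z_i; p_i)\, | 0, p_N \rangle
\end{aligned} \tag{66}
\]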

where we have taken z2 = 1 and we have deﬁned (α0 ≡ α p2i ; i = 1 . . . N ) :

lim V (zN ; pN )|0, 0 ≡ |0; pN ; 0; 0| lim z12α0 V (z1 ; p1 ) = 0, p1 | (67)
zN →0 z1 →∞

Before proceeding to factorize the N-point amplitude, let us study the properties under the projective group of the operators that we have introduced. We have already seen that the projective group leaves the integrand of the Koba–Nielsen representation of the N-point amplitude invariant. The projective group has three generators $L_0$, $L_1$ and $L_{-1}$ corresponding respectively to dilatations, inversions and translations. Assuming that the Fubini–Veneziano field $Q(z)$ transforms as a field with weight 0 (as a scalar) we can immediately write the commutation relations that $Q(z)$ must satisfy. This means in fact that, under a projective transformation, $Q(z)$ transforms as follows:

\[ Q(z) \to Q^{T}(z) = Q\Big(\frac{\alpha z+\beta}{\gamma z+\delta}\Big)\,;\qquad \alpha\delta-\beta\gamma = 1 \tag{68} \]

Expanding for small values of the parameters we get

\[ Q^{T}(z) = Q(z) + (\epsilon_1+\epsilon_2 z+\epsilon_3 z^2)\frac{dQ(z)}{dz} + o(\epsilon^2) \tag{69} \]

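This means that the three generators of the projective group must satisfy the following commutation relations with $Q(z)$:

\[ [L_0,Q(z)] = z\frac{dQ}{dz}\,;\qquad [L_{-1},Q(z)] = \frac{dQ}{dz}\,;\qquad [L_1,Q(z)] = z^2\frac{dQ}{dz} \tag{70} \]

They are given by the following expressions in terms of the harmonic oscillators:

\[ L_0 = \alpha'\hat{p}^2 + \sum_{n=1}^{\infty}n\,a_n^{\dagger}\cdot a_n\,;\qquad L_1 = \sqrt{2\alpha'}\,\hat{p}\cdot a_1 + \sum_{n=1}^{\infty}\sqrt{n(n+1)}\,a_{n+1}\cdot a_n^{\dagger} \tag{71} \]

and

\[ L_{-1} = L_1^{\dagger} = \sqrt{2\alpha'}\,\hat{p}\cdot a_1^{\dagger} + \sum_{n=1}^{\infty}\sqrt{n(n+1)}\,a_{n+1}^{\dagger}\cdot a_n \tag{72} \]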

They annihilate the vacuum

\[ L_0|0,0\rangle = L_1|0,0\rangle = L_{-1}|0,0\rangle = 0 \tag{73} \]

that is therefore called the projective invariant vacuum, and satisfy the algebra that is called the Gliozzi algebra [24]⁹:

\[ [L_0,L_1] = -L_1\,;\qquad [L_0,L_{-1}] = L_{-1}\,;\qquad [L_1,L_{-1}] = 2L_0 \tag{74} \]

The vertex operator with momentum $p$ is a projective field with weight equal to $\alpha_0 = \alpha' p^2$. It transforms in fact as follows under the projective group:

\[ [L_n,V(z,p)] = z^{n+1}\frac{dV(z,p)}{dz} + \alpha_0(n+1)z^{n}V(z,p)\,;\qquad n = 0,\pm 1 \tag{75} \]

or in finite form as follows:

\[ U\,V(z,p)\,U^{-1} = \frac{1}{(\gamma z+\delta)^{2\alpha_0}}\,V\Big(\frac{\alpha z+\beta}{\gamma z+\delta},p\Big) \tag{76} \]

where $U$ is the generator of an arbitrary finite projective transformation. Since $U$ leaves the vacuum invariant, by using (76) it is easy to show that

\[ \langle 0,0|\prod_{i=1}^{N}V(z_i',p_i)|0,0\rangle = \prod_{i=1}^{N}(\gamma z_i+\delta)^{2\alpha_0}\;\langle 0,0|\prod_{i=1}^{N}V(z_i,p_i)|0,0\rangle \tag{77} \]

that together with the following equation:

\[ \prod_{i=1}^{N}dz_i'\prod_{i=1}^{N}(z_i'-z_{i+1}')^{\alpha_0-1} = \prod_{i=1}^{N}dz_i\prod_{i=1}^{N}(z_i-z_{i+1})^{\alpha_0-1}\prod_{i=1}^{N}(\gamma z_i+\delta)^{-2\alpha_0} \tag{78} \]

implies that the integrand of the N-point amplitude in (65) is invariant under projective transformations.

⁹ See also [25].
We are now ready to factorize the N-point amplitude and find the spectrum of mesons. From (75) and (76) it is easy to derive the transformation of the vertex operator under a finite dilatation

\[ z^{L_0}\,V(1,p)\,z^{-L_0} = V(z,p)\,z^{\alpha_0} \tag{79} \]

Changing the integration variables as follows:

\[ x_i = \frac{z_{i+1}}{z_i}\,;\quad i = 2,3\ldots N-2\,;\qquad \det\frac{\partial z_i}{\partial x_j} = z_3 z_4\ldots z_{N-2} \tag{80} \]

where the last term is the Jacobian of the transformation from $z_i$ to $x_i$, we get from (66) the following expression:

\[ A_N \equiv \langle 0,p_1|V(1,p_2)\,D\,V(1,p_3)\ldots D\,V(1,p_{N-1})|0,p_N\rangle \tag{81} \]

where the propagator $D$ is equal to

\[ D = \int_{0}^{1}dx\,x^{L_0-1-\alpha_0}(1-x)^{\alpha_0-1} = \frac{\Gamma(L_0-\alpha_0)\,\Gamma(\alpha_0)}{\Gamma(L_0)} \tag{82} \]

and

\[ A_N = (2\pi)^d\,\delta^{(d)}\Big(\sum_{i=1}^{N}p_i\Big)B_N \tag{83} \]
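The closed form of the propagator in (82) is the Euler Beta function evaluated on an eigenvalue of $L_0$. As a quick numerical sanity check, the following sketch compares a midpoint-rule estimate of the integral with the Gamma-function expression. The sample values `l0 = 3.5` and `alpha0 = 1.5` and the function names are illustrative assumptions, not taken from the text.

```python
import math

def propagator_integral(l0, alpha0, steps=200_000):
    """Midpoint-rule estimate of the integral in (82) for a numerical L0 eigenvalue."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (l0 - 1.0 - alpha0) * (1.0 - x) ** (alpha0 - 1.0)
    return total * h

def propagator_gamma(l0, alpha0):
    """Closed form Gamma(L0 - alpha0) Gamma(alpha0) / Gamma(L0) from (82)."""
    return math.gamma(l0 - alpha0) * math.gamma(alpha0) / math.gamma(l0)

# Illustrative values only: any l0 > alpha0 > 0 makes the integral converge.
l0, alpha0 = 3.5, 1.5
numeric = propagator_integral(l0, alpha0)
closed = propagator_gamma(l0, alpha0)
print(numeric, closed)
```

The two numbers should agree to within the accuracy of the quadrature; for these sample values the closed form equals 4/15.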

The factorization properties of the amplitude can be studied by inserting in the channel $(1,M)$, or equivalently in the channel $(M+1,N)$, described by the Mandelstam variable

\[ s = -(p_1+p_2+\ldots+p_M)^2 = -(p_{M+1}+p_{M+2}+\cdots+p_N)^2 \equiv -P^2 \tag{84} \]

the complete set of states given in (55):

\[ A_N = \sum_{\lambda,\mu}\langle p_{(1,M)}|\lambda,P\rangle\langle\lambda,P|D|\mu,P\rangle\langle\mu,P|p_{(M+1,N)}\rangle \tag{85} \]

where

\[ \langle p_{(1,M)}| = \langle 0,p_1|V(1,p_2)\,D\,V(1,p_3)\ldots V(1,p_M) \tag{86} \]

and

\[ |p_{(M+1,N)}\rangle = V(1,p_{M+1})\,D\ldots V(1,p_{N-1})|p_N,0\rangle \tag{87} \]

Introducing the quantity

\[ R = \sum_{n=1}^{\infty}n\,a_n^{\dagger}\cdot a_n \tag{88} \]

it is possible to rewrite

\[ \langle\lambda,P|D|\mu,P\rangle = \langle\lambda,P|\sum_{m=0}^{\infty}\frac{(-1)^m\binom{\alpha_0-1}{m}}{R+m-\alpha(s)}|\mu,P\rangle \tag{89} \]

where $s$ is the variable defined in (84). Using this equation we can rewrite (85) as follows:

\[ A_N = \sum_{\lambda,\mu}\langle p_{(1,M)}|\lambda,P\rangle\langle\lambda,P|\sum_{m=0}^{\infty}\frac{(-1)^m\binom{\alpha_0-1}{m}}{R+m-\alpha(s)}|\mu,P\rangle\langle\mu,P|p_{(M+1,N)}\rangle \tag{90} \]

This expression shows that the amplitude $A_N$ has a pole in the channel $(1,M)$ when $\alpha(s)$ is equal to an integer $n \geq 0$ and the states $|\lambda\rangle$ that contribute to its residue are those satisfying the relation

\[ R|\lambda\rangle = (n-m)|\lambda\rangle\,;\qquad m = 0,1\ldots n \tag{91} \]

The number of independent states $|\lambda\rangle$ contributing to the residue gives the degeneracy of states for each level $n$.
Because of manifest relativistic invariance the space spanned by the complete system of states in (55) contains states with negative norm, corresponding to those states having an odd number of oscillators with time-like directions (see (51)). This is not consistent in a quantum theory, where the states of a system must span a positive-definite Hilbert space. This means that there must exist a number of relations satisfied by the external states that decouple a number of states, leaving us with a positive-definite Hilbert space. In order to find these relations we rewrite the state in (87), going back to the Koba–Nielsen variables

\[ |p_{(1,M)}\rangle = \int\prod_{i=2}^{M-1}[dz_i\,\theta(z_i-z_{i+1})]\prod_{i=1}^{M-1}(z_i-z_{i+1})^{\alpha_0-1}\;V(1,p_1)V(z_2,p_2)\ldots V(z_{M-1},p_{M-1})|0,p_M\rangle \tag{92} \]

Let us consider the operator $U(\alpha)$ that generates the projective transformation that leaves the points $z = 0, 1$ invariant:

\[ z' = \frac{z}{1-\alpha(z-1)} = z + \alpha(z^2-z) + o(\alpha^2) \tag{93} \]

From the transformation properties of the vertex operators in (76), it is easy to see that the previous transformation leaves the state in (92) invariant:

\[ U(\alpha)|p_{(1,M)}\rangle = |p_{(1,M)}\rangle \tag{94} \]

This means that the generator of the previous transformation annihilates the state in (92):

\[ W_1|p_{(1,M)}\rangle = 0\,;\qquad W_1 = L_1 - L_0 \tag{95} \]

The explicit form of $W_1$ follows from the infinitesimal form of the transformation in (93). This condition, which is of the same kind as the relations that on-shell amplitudes with the emission of photons satisfy as a consequence of gauge invariance, implies that the residue at the pole in (90) can be factorized with a smaller number of states. A detailed analysis of the spectrum shows, however, that negative norm states are still present. This can be qualitatively understood as follows. Due to the Lorentz metric, we have a negative norm component for each oscillator. In order to be able to decouple all negative norm states, we need to have a gauge condition of the type in (95) for each oscillator. But the number of oscillators is infinite and, therefore, we need an infinite number of conditions of the type in (95). It was found in [6] that, if we take $\alpha_0 = 1$, then one can easily construct an infinite number of operators that leave the state in (92) invariant. In the next section we will concentrate on this case.
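The two properties of the projective map (93) that the argument above relies on can be checked with exact rational arithmetic: it fixes $z = 0$ and $z = 1$, and to first order in the parameter it shifts $z$ by $\alpha(z^2 - z)$, which is the combination generated by $L_1 - L_0$. The sketch below is only an illustration; the function name and the sample values of `alpha` and `z` are assumptions.

```python
from fractions import Fraction

def projective_map(z, alpha):
    """The map (93): z -> z / (1 - alpha (z - 1))."""
    return z / (1 - alpha * (z - 1))

alpha = Fraction(1, 1000)  # small sample parameter (assumption)

# Both z = 0 and z = 1 are left invariant.
fixed_points_ok = (projective_map(Fraction(0), alpha) == 0
                   and projective_map(Fraction(1), alpha) == 1)

# Compare the exact map with its first-order expansion at a sample point.
z = Fraction(3, 7)
exact = projective_map(z, alpha)
first_order = z + alpha * (z * z - z)
remainder = abs(exact - first_order)
print(fixed_points_ok, float(remainder))
```

The remainder is of second order in `alpha`, as stated in (93).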

4 The Case α0 = 1
If we take $\alpha_0 = 1$ many of the formulae given in the previous section simplify. The N-point amplitude in (38) becomes

\[ B_N = \int_{-\infty}^{\infty}\frac{\prod_{1}^{N}dz_i\,\theta(z_i-z_{i+1})}{dV_{abc}}\prod_{j>i}(z_i-z_j)^{2\alpha' p_i\cdot p_j} \tag{96} \]

that can be rewritten in the operator formalism as follows:

\[ (2\pi)^4\,\delta\Big(\sum_{i=1}^{N}p_i\Big)B_N = \int_{-\infty}^{\infty}\frac{\prod_{1}^{N}dz_i\,\theta(z_i-z_{i+1})}{dV_{abc}}\;\langle 0,0|\prod_{i=1}^{N}V(z_i,p_i)|0,0\rangle \tag{97} \]

By choosing $z_1 = \infty$, $z_2 = 1$ and $z_N = 0$ it becomes

\[ (2\pi)^4\,\delta\Big(\sum_{i=1}^{N}p_i\Big)B_N = \int_{0}^{1}\prod_{i=3}^{N-1}dz_i\prod_{i=2}^{N-1}\theta(z_i-z_{i+1})\;\langle 0,p_1|\prod_{i=2}^{N-1}V(z_i;p_i)|0,p_N\rangle \tag{98} \]

where

\[ \lim_{z_N\to 0}V(z_N;p_N)|0,0\rangle \equiv |0;p_N\rangle\,;\qquad \langle 0;0|\lim_{z_1\to\infty}z_1^{2}V(z_1;p_1) = \langle 0,p_1| \tag{99} \]

Equation (81) is as before, but now the propagator becomes

\[ D = \int_{0}^{1}dx\,x^{L_0-2} = \frac{1}{L_0-1} \tag{100} \]

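This means that (89) becomes

\[ \langle\lambda,P|D|\mu,P\rangle = \langle\lambda,P|\frac{1}{R-\alpha(s)}|\mu,P\rangle \tag{101} \]

because only the $m = 0$ term of the sum survives when $\alpha_0 = 1$, and (90) reduces to

\[ A_N = \sum_{\lambda,\mu}\langle p_{(1,M)}|\lambda,P\rangle\langle\lambda,P|\frac{1}{R-\alpha(s)}|\mu,P\rangle\langle\mu,P|p_{(M+1,N)}\rangle \tag{102} \]

$B_N$ has a pole in the channel $(1,M)$ when $\alpha(s)$ is equal to an integer $n \geq 0$ and the states $|\lambda\rangle$ that contribute to its residue are those satisfying the relation

\[ R|\lambda\rangle = n|\lambda\rangle \tag{103} \]

Their number gives the degeneracy of the states contributing to the pole at $\alpha(s) = n$. The N-point amplitude can be written as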

\[ B_N = \langle p_{(1,M)}|D|p_{(M+1,N)}\rangle \tag{104} \]

where

\[ |p_{(1,M)}\rangle = \int_{0}^{1}\prod_{i=2}^{M-1}[dz_i\,\theta(z_i-z_{i+1})]\;V(1,p_1)V(z_2,p_2)\ldots V(z_{M-1},p_{M-1})|0,p_M\rangle \tag{105} \]

Using (79) and changing variables from $z_i$, $i = 2\ldots M-1$, to $x_i = z_{i+1}/z_i$, $i = 1\ldots M-2$, with $z_1 = 1$, we can rewrite the previous equation as follows:

\[ |p_{(1,M)}\rangle = V(1,p_1)\,D\,V(1,p_2)\ldots D\,V(1,p_{M-1})|0,p_M\rangle \tag{106} \]

where the propagator $D$ is defined in (100).

We now want to show that the state in (105) and (106) is annihilated not only by the operator in (95) but, if $\alpha_0 = 1$ [6], by an infinite set of operators whose lowest one is the one in (95). We will derive this by using the formalism developed in [26], following their derivation closely.
Starting from (70), Fubini and Veneziano realized that the generators of the projective group acting on a function of $z$ are given by

\[ L_0 = -z\frac{d}{dz}\,;\qquad L_{-1} = -\frac{d}{dz}\,;\qquad L_1 = -z^2\frac{d}{dz} \tag{107} \]

They generalized the previous generators to an arbitrary conformal transformation by introducing the following operators, called Virasoro operators:

\[ L_n = -z^{n+1}\frac{d}{dz} \tag{108} \]

that satisfy the algebra

\[ [L_n,L_m] = (n-m)L_{n+m} \tag{109} \]

that does not contain the term with the central charge! They also showed that the Virasoro operators satisfy the following commutation relations with the vertex operator:

\[ [L_n,V(z,p)] = \frac{d}{dz}\big(z^{n+1}V(z,p)\big) \tag{110} \]

More generally, they define an operator $L_f$ corresponding to an arbitrary function $f(\xi)$, with $L_f = L_n$ if we choose $f(\xi) = \xi^n$. In this case the commutation relation in (110) becomes

\[ [L_f,V(z,p)] = \frac{d}{dz}\big(zf(z)V(z,p)\big) \tag{111} \]

By introducing the variable

\[ y = \int_{A}^{z}\frac{d\xi}{\xi f(\xi)} \tag{112} \]

where $A$ is an arbitrary constant, one can rewrite (111) in the following form:

\[ [L_f,zf(z)V(z,p)] = \frac{d}{dy}\big(zf(z)V(z,p)\big) \tag{113} \]

This implies that, under an arbitrary conformal transformation $z \to f(z)$, generated by $U = e^{\alpha L_f}$, the vertex operator transforms as

\[ e^{\alpha L_f}\,V(z,p)\,zf(z)\,e^{-\alpha L_f} = V(z',p)\,z'f(z') \tag{114} \]

where the parameter $\alpha$ is given by

\[ \alpha = \int_{z}^{z'}\frac{d\xi}{\xi f(\xi)} \tag{115} \]

On the other hand, this equation implies

\[ \frac{dz}{zf(z)} = \frac{dz'}{z'f(z')} \tag{116} \]

that, inserted in (114), implies that the quantity $V(z,p)\,dz$ is left invariant by the transformation $z \to f(z)$:

\[ e^{\alpha L_f}\,V(z,p)\,dz\,e^{-\alpha L_f} = V(z',p)\,dz' \tag{117} \]

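Let us now act with the previous conformal transformation on the state in (105). We get

\[ e^{\alpha L_f}|p_{(1,M)}\rangle = \int_{0}^{1}\prod_{i=2}^{M-1}[dz_i\,\theta(z_i-z_{i+1})]\;e^{\alpha L_f}V(1,p_1)e^{-\alpha L_f}\;e^{\alpha L_f}V(z_2,p_2)e^{-\alpha L_f}\ldots e^{\alpha L_f}V(z_{M-1},p_{M-1})e^{-\alpha L_f}\;e^{\alpha L_f}|0,p_M\rangle \]

\[ = \int_{0}^{1}\prod_{i=2}^{M-1}\theta(z_i'-z_{i+1}')\;e^{\alpha L_f}V(1,p_1)e^{-\alpha L_f}\;V(z_2',p_2)\,dz_2'\ldots V(z_{M-1}',p_{M-1})\,dz_{M-1}'\;e^{\alpha L_f}|0,p_M\rangle \tag{118} \]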

where we have used (117). The previous transformation leaves the state invariant if both $z = 0$ and $z = 1$ are fixed points of the conformal transformation. This happens if the denominator in (115) vanishes when $\xi = 0, 1$. This requires the following conditions:

\[ f(1) = 0\,;\qquad \lim_{\xi\to 0}\xi f(\xi) = 0 \tag{119} \]

Expanding $\xi$ near the point $\xi = 1$, we can determine the relation between $z'$ and $z$ near $z = z' = 1$. We get

\[ z' = \frac{z\,e^{-\alpha f'(1)}}{1-z+z\,e^{-\alpha f'(1)}} \tag{120} \]

and from it we can determine the conformal factor

\[ \frac{dz'}{dz} = \frac{e^{-\alpha f'(1)}}{\big(1-z+z\,e^{-\alpha f'(1)}\big)^2} \to e^{\alpha f'(1)} \tag{121} \]

in the limit $z \to 1$. Proceeding in the same way near the point $z = z' = 0$ we get

\[ z' = \frac{z\,f(0)\,e^{\alpha f(0)}}{f(0)+z\,f'(0)\big(1-e^{\alpha f(0)}\big)} \to z\,e^{\alpha f(0)} \tag{122} \]

in the limit $z \to 0$. This means that (118) becomes

\[ e^{\alpha(L_f-f'(1)-f(0))}|p_{(1,M)}\rangle = |p_{(1,M)}\rangle \tag{123} \]

A choice of $f$ that satisfies (119) is the following:

\[ f(\xi) = \xi^n - 1 \tag{124} \]

that gives the following gauge operator:

\[ W_n = L_n - L_0 - (n-1) \tag{125} \]

that annihilates the state in (105):

\[ W_n|p_{1\ldots M}\rangle = 0\,;\qquad n = 1\ldots\infty \tag{126} \]

These are the Virasoro conditions found in [6]. There is one condition for each negative norm oscillator and, therefore, in this case there is the possibility that the physical subspace is positive definite. An alternative, more direct derivation of (126) can be obtained by acting with $W_n$ on the state in (106) and using the following identities:

\[ W_n V(1,p) = V(1,p)(W_n+n)\,;\qquad (W_n+n)D = [L_0+n-1]^{-1}W_n \tag{127} \]

The second equation is a consequence of the following equation:

\[ L_n\,x^{L_0} = x^{L_0+n}L_n \tag{128} \]

Equations (127) imply

\[ W_n V(1,p)D = V(1,p)[L_0+n-1]^{-1}W_n \tag{129} \]

This shows that the operator $W_n$ goes unchanged through all the product of terms $VD$ until it arrives in front of the term $V(1,p_{M-1})|0,p_M\rangle$. Going through the vertex operator it becomes $L_n - L_0 + 1$, which then annihilates the state:

\[ (L_n - L_0 + 1)|p_M,0\rangle = 0 \tag{130} \]

This proves (126).

Using the representation of the Virasoro operators given in (108), Fubini and Veneziano showed that they satisfy the algebra given in (109) without the central charge. The presence of the central charge was recognized by Joe Weis¹⁰ in 1970 and never published. Unlike Fubini and Veneziano [26], he used the expression of the $L_n$ operators in terms of the harmonic oscillators

\[ L_n = \sqrt{2\alpha' n}\;\hat{p}\cdot a_n + \sum_{m=1}^{\infty}\sqrt{m(n+m)}\;a_{n+m}\cdot a_m^{\dagger} + \frac{1}{2}\sum_{m=1}^{n-1}\sqrt{m(n-m)}\;a_{n-m}\cdot a_m\,;\qquad n \geq 0\,;\qquad L_{-n} = L_n^{\dagger} \tag{131} \]

¹⁰ See note added in proof in [26].

He got the following algebra:

\[ [L_n,L_m] = (n-m)L_{n+m} + \frac{d}{24}\,n(n^2-1)\,\delta_{n+m;0} \tag{132} \]
where $d$ is the dimension of the Minkowski space–time. We write here $d$ for the dimension of the Minkowski space, but we want to remind the reader that almost everybody working on a model for mesons at that time took for granted that the dimension of the space–time was $d = 4$. As far as I remember, the first paper where a dimension $d \neq 4$ was introduced was [27], where it was shown that the unitarity violating cuts in the non-planar loop become poles that were consistent with unitarity if $d = 26$.
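The statement that the differential-operator representation (108) closes on the algebra (109) without a central term can be verified directly on monomials, since $L_n z^k = -k\,z^{k+n}$. The following sketch does this for a small range of indices; the tuple encoding of a monomial as `(coefficient, power)` and the function names are illustrative assumptions.

```python
def apply_L(n, monomial):
    """Act with L_n = -z^(n+1) d/dz on coeff * z^power; return the new (coeff, power)."""
    coeff, power = monomial
    return (-power * coeff, power + n)

def commutator(n, m, monomial):
    """[L_n, L_m] acting on a monomial; both terms land on the same power of z."""
    c1, p1 = apply_L(n, apply_L(m, monomial))
    c2, p2 = apply_L(m, apply_L(n, monomial))
    assert p1 == p2
    return (c1 - c2, p1)

def witt_rhs(n, m, monomial):
    """(n - m) L_{n+m} acting on the same monomial, the right-hand side of (109)."""
    coeff, power = apply_L(n + m, monomial)
    return ((n - m) * coeff, power)

# Check the algebra on z^k for a small window of n, m and k (including negative powers).
closes = all(
    commutator(n, m, (1, k)) == witt_rhs(n, m, (1, k))
    for n in range(-3, 4) for m in range(-3, 4) for k in range(-5, 6)
)
print(closes)
```

No central term can appear in this check because the operators act on functions of $z$ rather than on the oscillator Fock space, which is where the extra term in (132) originates.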
In the last part of this section we will generalize the factorization procedure
to the Shapiro–Virasoro model whose N -point amplitude is given in (49). In
this case we must introduce two sets of harmonic oscillators commuting with
each other and only one set of zero modes satisfying the algebra [28]

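\[ [a_{n\mu},a_{m\nu}^{\dagger}] = [\tilde{a}_{n\mu},\tilde{a}_{m\nu}^{\dagger}] = \eta_{\mu\nu}\delta_{nm}\,;\qquad [\hat{q}_\mu,\hat{p}_\nu] = i\eta_{\mu\nu} \tag{133} \]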

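In terms of them we can introduce the Fubini–Veneziano operator

\[ Q(z,\bar{z}) = \hat{q} - 2\alpha'\hat{p}\log(z\bar{z}) + i\frac{\sqrt{2\alpha'}}{2}\sum_{n=1}^{\infty}\frac{1}{\sqrt{n}}\big(a_n z^{-n} - a_n^{\dagger}z^{n}\big) + i\frac{\sqrt{2\alpha'}}{2}\sum_{n=1}^{\infty}\frac{1}{\sqrt{n}}\big(\tilde{a}_n\bar{z}^{-n} - \tilde{a}_n^{\dagger}\bar{z}^{n}\big) \tag{134} \]

We can then introduce the vertex operator

\[ V(z,\bar{z};p) = {:}e^{ip\cdot Q(z,\bar{z})}{:} \tag{135} \]

and write the N-point amplitude in (49) in the following factorized form:

\[ \int\frac{\prod_{i=1}^{N}d^2z_i}{dV_{abc}}\;\langle 0|R\Big[\prod_{i=1}^{N}V(z_i,\bar{z}_i,p_i)\Big]|0\rangle = (2\pi)^4\,\delta^{(4)}\Big(\sum_{i=1}^{N}p_i\Big)\int\frac{\prod_{i=1}^{N}d^2z_i}{dV_{abc}}\prod_{i<j}|z_i-z_j|^{\alpha' p_i\cdot p_j} \tag{136} \]

where the radial ordered product is given by

\[ R\Big[\prod_{i=1}^{N}V(z_i,\bar{z}_i,p_i)\Big] = \prod_{i=1}^{N}V(z_i,\bar{z}_i,p_i)\prod_{i=1}^{N-1}\theta(|z_i|-|z_{i+1}|) + \ldots \tag{137} \]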

and the dots indicate a sum over all permutations of the vertex operators. By fixing $z_1 = \infty$, $z_2 = 1$ and $z_N = 0$, we can rewrite the previous expression as follows:

\[ \int\prod_{i=3}^{N-1}d^2z_i\;\langle 0,p_1|R\Big[\prod_{i=2}^{N-1}V(z_i,\bar{z}_i,p_i)\Big]|0,p_N\rangle \tag{138} \]

For the sake of simplicity, let us consider the term corresponding to the permutation $1, 2, \ldots N$. In this case the Koba–Nielsen variables are ordered in such a way that $|z_i| \geq |z_{i+1}|$ for $i = 1,\ldots N-1$. We can then use the formula

\[ V(z_i,\bar{z}_i,p_i) = z_i^{L_0-1}\,\bar{z}_i^{\tilde{L}_0-1}\,V(1,1,p_i)\,z_i^{-L_0}\,\bar{z}_i^{-\tilde{L}_0} \tag{139} \]

and change variables

\[ w_i = \frac{z_{i+1}}{z_i}\,;\qquad |w_i| \leq 1 \tag{140} \]

to rewrite (138) as follows:

\[ \langle 0,p_1|V(1,1,p_2)\,D\,V(1,1,p_3)\,D\ldots V(1,1,p_{N-1})|0,p_N\rangle \tag{141} \]

where

\[ D = \int\frac{d^2w}{|w|^2}\,w^{L_0-1}\,\bar{w}^{\tilde{L}_0-1} = \frac{2}{L_0+\tilde{L}_0-2}\cdot\frac{\sin\pi(L_0-\tilde{L}_0)}{L_0-\tilde{L}_0} \tag{142} \]
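We can now follow the same procedure for all permutations, arriving at the following expression:

\[ \langle 0,p_1|P[V(1,1,p_2)\,D\,V(1,1,p_3)\,D\ldots V(1,1,p_{N-1})]|0,p_N\rangle \tag{143} \]

where $P$ means a sum over all permutations of the particles.

If we want to consider the factorization of the amplitude on the pole at $s = -(p_1+\ldots+p_M)^2$ we get only the following contribution:

\[ \langle p_{(1\ldots M)}|D|p_{(M+1\ldots N)}\rangle \tag{144} \]

where

\[ |p_{(M+1\ldots N)}\rangle = P[V(1,1,p_{M+1})\,D\ldots V(1,1,p_{N-1})]|0,p_N\rangle \tag{145} \]

and

\[ \langle p_{(1\ldots M)}| = \langle 0,p_1|P[V(1,1,p_2)\,D\ldots V(1,1,p_M)] \tag{146} \]

The amplitude is factorized by introducing a complete set of states and rewriting (141) as follows:

\[ \sum_{\lambda,\tilde{\lambda}}\langle p_{1\ldots M}|\lambda,\tilde{\lambda}\rangle\,\frac{2\pi\,\langle\lambda,\tilde{\lambda}|\delta_{L_0,\tilde{L}_0}|\lambda,\tilde{\lambda}\rangle}{L_0+\tilde{L}_0-2}\,\langle\lambda,\tilde{\lambda}|p_{(M+1,\ldots N)}\rangle \tag{147} \]

By writing

\[ L_0 = \frac{\alpha'}{4}\hat{p}^2 + R\,;\qquad \tilde{L}_0 = \frac{\alpha'}{4}\hat{p}^2 + \tilde{R} \tag{148} \]

with

\[ R = \sum_{n=1}^{\infty}n\,a_n^{\dagger}\cdot a_n\,;\qquad \tilde{R} = \sum_{n=1}^{\infty}n\,\tilde{a}_n^{\dagger}\cdot\tilde{a}_n \tag{149} \]

we can rewrite (147) as follows:

\[ \sum_{\lambda,\tilde{\lambda}}\langle p_{1\ldots M}|\lambda,\tilde{\lambda}\rangle\,\frac{2\pi\,\langle\lambda,\tilde{\lambda}|\delta_{R,\tilde{R}}|\lambda,\tilde{\lambda}\rangle}{R+\tilde{R}-\alpha(s)}\,\langle\lambda,\tilde{\lambda}|p_{(M+1,\ldots N)}\rangle \tag{150} \]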

We see that the amplitude for the Shapiro–Virasoro model has simple poles only for even integer values of $\alpha_{SV}(s) = 2 + \frac{\alpha'}{2}s = 2n \geq 0$ and the residue at the poles factorizes in a sum with a finite number of terms. Notice that the Regge trajectory of the Shapiro–Virasoro model has double the intercept and half the slope of that of the generalized Veneziano model.
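The comparison between the two trajectories can be made concrete by solving the pole conditions for the mass squared, $s = m^2$, in units of $1/\alpha'$. The sketch below assumes the generalized Veneziano trajectory $\alpha(s) = 1 + \alpha' s = n$ of this section and the Shapiro–Virasoro trajectory $\alpha_{SV}(s) = 2 + \frac{\alpha'}{2}s = 2n$ quoted above; the function names are hypothetical.

```python
from fractions import Fraction

def veneziano_mass_squared(n):
    """Solve 1 + alpha' s = n for s = m^2; result in units of 1/alpha'."""
    return Fraction(n - 1)

def shapiro_virasoro_mass_squared(n):
    """Solve 2 + (alpha'/2) s = 2n for s = m^2; result in units of 1/alpha'."""
    return Fraction(4 * (n - 1))

# Lowest levels of both spectra, in units of 1/alpha'.
for n in range(4):
    print(n, veneziano_mass_squared(n), shapiro_virasoro_mass_squared(n))
```

In both models the level $n = 0$ is tachyonic and the level $n = 1$ is massless; the level spacing of the Shapiro–Virasoro model is four times larger.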

5 Physical States and Their Vertex Operators

In the previous section, we have seen that the residue at the poles of the N-point amplitudes factorizes in a sum of a finite number of terms. We have also seen that some of these terms, due to the Lorentz metric, correspond to states with negative norm. We have also derived a number of “Ward identities” given in (126) that imply that some of the terms of the residue decouple. The question to be answered now is: Is the space spanned by the physical states a positive norm Hilbert space? In order to answer this question, we need first to find the conditions that characterize the on-shell physical states $|\lambda,P\rangle$ and then to determine which are the states that contribute to the residue of the pole at $\alpha(s = -P^2) = n$. In other words, we have to find a way of characterizing the physical states and of eliminating the spurious states that decouple in (102) as a consequence of (126). A state $|\lambda,P\rangle$ contributes to the residue of the pole in (102) for $\alpha(s = -P^2) = n$ if it is on-shell, namely if it satisfies the following equations:

\[ R|\lambda,P\rangle = n|\lambda,P\rangle\,;\qquad \alpha(-P^2) = 1 - \alpha' P^2 = n \tag{151} \]

that can be written as a single equation

\[ (L_0 - 1)|\lambda,P\rangle = 0 \tag{152} \]

Because of (126) we also know that a state of the type

\[ |s,P\rangle = W_m^{\dagger}|\mu,P\rangle \tag{153} \]

is not going to contribute to the residue of the pole. We call it a spurious or unphysical state. We start by constructing the subspace of spurious states that are on-shell at the level $n$. Let us consider the set of orthogonal states $|\mu,P\rangle$ such that

\[ R|\mu,P\rangle = n_\mu|\mu,P\rangle\,;\qquad L_0|\mu,P\rangle = (1-m)|\mu,P\rangle\,;\qquad 1 - \alpha' P^2 = n \tag{154} \]

where, since $L_0 = \alpha' P^2 + R$ has eigenvalue $(1-n) + n_\mu$ on these states,

\[ m = n - n_\mu \tag{155} \]

In terms of these states we can construct the most general spurious state that is on-shell at the level $n$. It is given by

\[ |s,P\rangle = W_m^{\dagger}|\mu,P\rangle\,;\qquad (L_0 - 1)|s,P\rangle = 0 \tag{156} \]

for any positive integer $m$. Using (154), (156) becomes

\[ |s,P\rangle = L_m^{\dagger}|\mu,P\rangle \tag{157} \]

where $|\mu,P\rangle$ is an arbitrary state satisfying (154).

A physical state $|\lambda,P\rangle$ is defined as one that is orthogonal to all spurious states appearing at a certain level $n$. This means that it must satisfy the following equation:

\[ \langle\lambda,P|L_m^{\dagger}|\mu,P\rangle = 0 \tag{158} \]

for any state $|\mu,P\rangle$ satisfying (154). In conclusion, the on-shell physical states at the level $n$ are characterized by the fact that they satisfy the following conditions:

\[ L_m|\lambda,P\rangle = (L_0 - 1)|\lambda,P\rangle = 0\,;\qquad 1 - \alpha' P^2 = n \tag{159} \]

These conditions characterizing the physical subspace were first found by Del Giudice and Di Vecchia [28], where the analysis described here was done. In order to find the physical subspace, one starts by writing the most general on-shell state contributing to the residue of the pole at level $n$ in (154). Then one imposes (159) and determines the states that span the physical subspace. Actually, among these states one finds also a set of zero norm states that are physical and spurious at the same time. Those states are of the form given in (157), but also satisfy (159). It is easy to see that they are not really physical because they do not contribute to the residue of the pole at the level $n$. This follows from the form of the unit operator given in the space of the physical states by

\[ 1 = \sum_{\text{norm}\neq 0}|\lambda,P\rangle\langle\lambda,P| + \sum_{\text{zero norm}}\big[|\lambda_0,P\rangle\langle\mu_0,P| + |\mu_0,P\rangle\langle\lambda_0,P|\big] \tag{160} \]

where $|\lambda_0,P\rangle$ is a zero norm physical and spurious state and $|\mu_0,P\rangle$ its conjugate state. A conjugate state of a zero norm state is obtained by changing the sign of the oscillators with time-like direction. Since $|\lambda_0,P\rangle$ is a spurious state, when we insert the unit operator given in (160) in (102) we see that the zero norm states never contribute to the residue, because their contribution is annihilated either by the state $\langle p_{(1,M)}|$ or by the state $|p_{(M+1,N)}\rangle$. In conclusion, the physical subspace contains only the states in the first term in the r.h.s. of (160).
Let us analyse the first two excited levels. The first excited level corresponds to a massless gauge field. It is spanned by the states $\epsilon_\mu a_{1\mu}^{\dagger}|0,P\rangle$. In this case the only condition that we must impose is

\[ L_1\,\epsilon_\mu a_{1\mu}^{\dagger}|0,P\rangle = 0 \;\Longrightarrow\; P\cdot\epsilon = 0 \tag{161} \]

Choosing a frame of reference where the momentum of the photon is given by $P^\mu \equiv (P,0\ldots 0,P)$, (161) implies that the only physical states are

\[ \epsilon_i\,a_{1i}^{\dagger}|0,P\rangle + \epsilon\,(a_{1;0}^{\dagger} - a_{1;d-1}^{\dagger})|0,P\rangle\,;\qquad i = 1\ldots d-2 \tag{162} \]

where $\epsilon_i$ and $\epsilon$ are arbitrary parameters. The state in (162) is the most general state of the level $N = 1$ satisfying the conditions in (159). The first state in (162) has positive norm, while the second one has zero norm and is orthogonal to all other physical states since it can be written as follows:

\[ (a_{1;0}^{\dagger} - a_{1;d-1}^{\dagger})|0,P\rangle = L_1^{\dagger}|0,P\rangle \tag{163} \]

in the frame of reference where $P^\mu \equiv (P,\ldots 0,P)$. Because of the previous property it is decoupled from the physical states together with its conjugate

\[ (a_{1,0}^{\dagger} + a_{1,d-1}^{\dagger})|0,P\rangle \tag{164} \]

In conclusion, we are left only with the transverse $d-2$ states corresponding to the physical degrees of freedom of a massless spin 1 state. At the next level $n = 2$, the most general state is given by

\[ \big[\alpha^{\mu\nu}a_{1,\mu}^{\dagger}a_{1,\nu}^{\dagger} + \beta^{\mu}a_{2,\mu}^{\dagger}\big]|0,P\rangle \tag{165} \]

If we work in the centre of mass frame where $P^\mu = (M,\vec{0})$ we get the following most general physical state:

\[ |Phys\rangle = \alpha^{ij}\Big[a_{1,i}^{\dagger}a_{1,j}^{\dagger} - \frac{1}{(d-1)}\delta_{ij}\sum_{k=1}^{d-1}a_{1,k}^{\dagger}a_{1,k}^{\dagger}\Big]|0,P\rangle + \beta^{i}\big[a_{2,i}^{\dagger} + a_{1,0}^{\dagger}a_{1,i}^{\dagger}\big]|0,P\rangle + \sum_{i=1}^{d-1}\alpha^{ii}\Big[\sum_{i=1}^{d-1}a_{1,i}^{\dagger}a_{1,i}^{\dagger} + \frac{d-1}{5}\big(a_{1,0}^{\dagger 2} - 2a_{2,0}^{\dagger}\big)\Big]|0,P\rangle \tag{166} \]

where the indices $i, j$ run over the $d-1$ space components. The first term in (166) corresponds to a spin 2 in $(d-1)$-dimensional space and has a positive norm, being made with space indices. The second term has zero norm and is orthogonal to the other physical states since it can be written as $L_1^{\dagger}a_{1,i}^{\dagger}|0,P\rangle$. Therefore, it must be eliminated from the physical spectrum together with its conjugate, as explained above. Finally, the last state in (166) is spinless and has a norm given by

\[ 2(d-1)(26-d) \tag{167} \]

If $d < 26$ it corresponds to a physical spin zero particle with positive norm. If $d > 26$ it is a ghost. Finally, if $d = 26$ it has a zero norm and is also orthogonal to the other physical states since it can be written in the form

\[ (2L_2^{\dagger} + 3L_1^{\dagger 2})|0\rangle \tag{168} \]

It does not belong, therefore, to the physical spectrum. The analysis of this
level was done [29] with d = 4. This did not allow the authors of [29] to see
that there was a critical dimension.
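The role of the critical dimension in (167) is easy to tabulate: the norm of the spinless state changes sign exactly at $d = 26$. The short sketch below only evaluates the expression $2(d-1)(26-d)$ quoted in the text; the function names and the sample dimensions are illustrative choices.

```python
def spin_zero_norm(d):
    """Norm (167) of the spinless level-2 state in d space-time dimensions."""
    return 2 * (d - 1) * (26 - d)

def classify(d):
    """Label the spinless state according to the sign of its norm."""
    norm = spin_zero_norm(d)
    if norm > 0:
        return "positive norm"
    if norm < 0:
        return "ghost"
    return "zero norm"

for d in (4, 10, 26, 27):
    print(d, spin_zero_norm(d), classify(d))
```

For $d = 4$ the norm is positive and nothing singles out $d = 26$, which is consistent with the remark that an analysis restricted to four dimensions could not reveal the critical dimension.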
The analysis of the physical states can be easily extended [28] to the Shapiro–Virasoro model. In this case the physical conditions given in (159) for the open string become [28]

\[ L_m|\lambda,\tilde{\lambda}\rangle = \tilde{L}_m|\lambda,\tilde{\lambda}\rangle = (L_0-1)|\lambda,\tilde{\lambda}\rangle = (\tilde{L}_0-1)|\lambda,\tilde{\lambda}\rangle = 0 \tag{169} \]

for any positive integer $m$. It can be easily seen from the previous equations that the lowest state of the Shapiro–Virasoro model is the vacuum $|0_a,0_{\tilde{a}},p\rangle$ corresponding to a tachyon with mass $\alpha' p^2 = 4$, while the next level described by the state $a_{1\mu}^{\dagger}\tilde{a}_{1\nu}^{\dagger}|0_a,0_{\tilde{a}},p\rangle$ contains massless states corresponding to the graviton, a dilaton and a two-index antisymmetric tensor $B_{\mu\nu}$.
Having characterized the physical subspace one can go on and construct an N-point scattering amplitude involving arbitrary physical states. This was done by Campagna, Fubini, Napolitano and Sciuto [30], where the vertex operator for an arbitrary physical state was constructed in analogy with what has been done for the ground tachyonic state. They associated to each physical state $|\alpha,P\rangle$ a vertex operator $V_\alpha(z,P)$ that is a conformal field with conformal dimension equal to 1:

\[ [L_n,V_\alpha(z,p)] = \frac{d}{dz}\big(z^{n+1}V_\alpha(z,p)\big) \tag{170} \]

and reproduces the corresponding state acting on the vacuum as follows:

\[ \lim_{z\to 0}V_\alpha(z;p)|0,0\rangle \equiv |\alpha;p\rangle\,;\qquad \langle 0;0|\lim_{z\to\infty}z^{2}V_\alpha(z;p) = \langle\alpha,p| \tag{171} \]

It satisfies, in addition, the hermiticity relation

\[ V_\alpha^{\dagger}(z,P) = V_\alpha\Big(\frac{1}{z},-P\Big)(-1)^{\alpha(-P^2)} \tag{172} \]

An excited vertex that will play an important role in the next section is the one associated to the massless gauge field. It is given by

\[ V_\epsilon(z,k) \equiv \epsilon\cdot\frac{dQ(z)}{dz}\,e^{ik\cdot Q(z)}\,;\qquad k\cdot\epsilon = k^2 = 0 \tag{173} \]

Because of the last two conditions in (173) the normal ordering is not necessary. It is convenient to give the expression of $\frac{dQ(z)}{dz}$ in terms of the harmonic oscillators

\[ P(z) \equiv \frac{dQ(z)}{dz} = -i\sqrt{2\alpha'}\sum_{n=-\infty}^{\infty}\alpha_n z^{-n-1} \tag{174} \]

It is a conformal field with conformal dimension equal to 1. The rescaled oscillators $\alpha_n$ are given by

\[ \alpha_n = \sqrt{n}\,a_n\,;\qquad \alpha_{-n} = \sqrt{n}\,a_n^{\dagger}\,;\quad n > 0\,;\qquad \alpha_0 = \sqrt{2\alpha'}\,\hat{p} \tag{175} \]

In terms of the vertex operators previously introduced the most general amplitude involving arbitrary physical states is given by [30]

\[ (2\pi)^4\,\delta\Big(\sum_{i=1}^{N}p_i\Big)B_N^{ex} = \int_{-\infty}^{\infty}\frac{\prod_{1}^{N}dz_i\,\theta(z_i-z_{i+1})}{dV_{abc}}\;\langle 0,0|\prod_{i=1}^{N}V_{\alpha_i}(z_i,p_i)|0,0\rangle \tag{176} \]

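In the case of the Shapiro–Virasoro model the tachyon vertex operator is given in (135). By rewriting (134) as follows:

\[ Q(z,\bar{z}) = Q(z) + \tilde{Q}(\bar{z}) \tag{177} \]

where

\[ Q(z) = \frac{1}{2}\Big[\hat{q} - 2\alpha'\hat{p}\log(z) + i\sqrt{2\alpha'}\sum_{n=1}^{\infty}\frac{1}{\sqrt{n}}\big(a_n z^{-n} - a_n^{\dagger}z^{n}\big)\Big] \tag{178} \]

and

\[ \tilde{Q}(\bar{z}) = \frac{1}{2}\Big[\hat{q} - 2\alpha'\hat{p}\log(\bar{z}) + i\sqrt{2\alpha'}\sum_{n=1}^{\infty}\frac{1}{\sqrt{n}}\big(\tilde{a}_n\bar{z}^{-n} - \tilde{a}_n^{\dagger}\bar{z}^{n}\big)\Big] \tag{179} \]

we can write the tachyon vertex operator in the following way:

\[ V(z,\bar{z},p) = {:}e^{ip\cdot Q(z)}\,e^{ip\cdot\tilde{Q}(\bar{z})}{:} \tag{180} \]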

This shows that the vertex operator corresponding to the tachyon of the Shapiro–Virasoro model can be written as the product of two vertex operators corresponding each to the tachyon of the generalized Veneziano model. Analogously the vertex operator corresponding to an arbitrary physical state of the Shapiro–Virasoro model can always be written as a product of two vertex operators of the generalized Veneziano model:

\[ V_{\alpha,\beta}(z,\bar{z},p) = V_\alpha\Big(z,\frac{p}{2}\Big)V_\beta\Big(\bar{z},\frac{p}{2}\Big) \tag{181} \]

The first one contains only the oscillators $\alpha_n$, while the second one only the oscillators $\tilde{\alpha}_n$. They both contain only half of the total momentum $p$ and the same zero modes $\hat{p}$ and $\hat{q}$. The two vertex operators of the generalized Veneziano model are both conformal fields with conformal dimension equal to 1. If they correspond to physical states at the level $2n$, they satisfy the following relation ($n = \tilde{n}$):

\[ \alpha'\frac{p^2}{4} + n = 1 \tag{182} \]

They lie on the following Regge trajectory:

\[ 2 - \frac{\alpha'}{2}p^2 \equiv \alpha_{SV}(-p^2) = 2n \tag{183} \]

as we have already seen by factorizing the amplitude in (150).

6 The DDF States and Absence of Ghosts

In the previous section we have derived the equations that characterize the
physical states and their corresponding vertex operators. In this section we
will explicitly construct an infinite number of orthonormal physical states with
positive norm.
The starting point is the DDF operator introduced by Del Giudice, Di Vecchia and Fubini [31] and defined in terms of the vertex operator corresponding to the massless gauge field introduced in (173)

\[ A_{i,n} = \frac{i}{\sqrt{2\alpha'}}\oint_{0}dz\,\epsilon_i^{\mu}P_\mu(z)\,e^{ik\cdot Q(z)} \tag{184} \]

where the index $i$ runs over the $d-2$ transverse directions, that are orthogonal to the momentum $k$. We have also taken $\oint_0\frac{dz}{z} = 1$. Because of the $\log z$ term appearing in the zero mode part of the exponential, the integral in (184), that is performed around the origin $z = 0$, is well defined only if we constrain the momentum of the state, on which $A_{i,n}$ acts, to satisfy the relation

\[ 2\alpha' p\cdot k = n \tag{185} \]

where $n$ is a non-vanishing integer.

The operator in (184) will generate physical states because it commutes with the gauge operators $L_m$:

\[ [L_m,A_{n;i}] = 0 \tag{186} \]

since the vertex operator transforms as a primary field with conformal dimension equal to 1, as follows from (170).
On the other hand it also satisfies the algebra of the harmonic oscillator, as we are now going to show. From (184) we get

\[ [A_{n,i},A_{m,j}] = -\frac{1}{2\alpha'}\oint_{0}d\zeta\oint_{\zeta}dz\;\epsilon_i\cdot P(z)\,e^{ik\cdot Q(z)}\;\epsilon_j\cdot P(\zeta)\,e^{ik'\cdot Q(\zeta)} \tag{187} \]

where

\[ 2\alpha' p\cdot k = n\,;\qquad 2\alpha' p\cdot k' = m \tag{188} \]

and $k$ and $k'$ are supposed to be in the same direction, namely,

\[ k_\mu = n\hat{k}_\mu\,;\qquad k'_\mu = m\hat{k}_\mu \tag{189} \]

with

\[ 2\alpha' p\cdot\hat{k} = 1 \tag{190} \]

Finally, the polarizations are normalized as

\[ \epsilon_i\cdot\epsilon_j = \delta_{ij} \tag{191} \]

Since $\hat{k}\cdot\epsilon_i = \hat{k}\cdot\epsilon_j = \hat{k}^2 = 0$ a singularity for $z = \zeta$ can appear only from the contraction of the two terms $P(\zeta)$ and $P(z)$ that is given by

\[ \langle 0,0|\epsilon_i\cdot P(z)\,\epsilon_j\cdot P(\zeta)|0,0\rangle = -\frac{2\alpha'\delta_{ij}}{(z-\zeta)^2} \tag{192} \]

Inserting it in (187), we get

\[ [A_{n,i},A_{m,j}] = \delta_{ij}\,in\oint_{0}d\zeta\;\hat{k}\cdot P(\zeta)\,e^{-i(n+m)\hat{k}\cdot Q(\zeta)} = in\,\delta_{ij}\,\delta_{n+m;0}\oint_{0}d\zeta\;\hat{k}\cdot P(\zeta) \tag{193} \]

where we have used the fact that the integrand is a total derivative and therefore one gets a vanishing contribution unless $n + m = 0$. If $n + m = 0$, from (174) and (190) we get
\[ [A_{n,i},A_{m,j}] = n\,\delta_{ij}\,\delta_{n+m;0}\,;\qquad i,j = 1\ldots d-2 \tag{194} \]

Equation (194) shows that the DDF operators satisfy the harmonic oscillator algebra.
In terms of this infinite set of transverse oscillators we can construct an orthonormal set of states

\[ |i_1,N_1;i_2,N_2;\ldots i_m,N_m\rangle = \prod_{h}\frac{1}{\sqrt{\lambda_h!}}\prod_{k=1}^{m}\frac{A_{i_k,-N_k}}{\sqrt{N_k}}\,|0,p\rangle \tag{195} \]

where $\lambda_h$ is the multiplicity of the operator $A_{i_h,-N_h}$ in the product in (195) and the momentum of the state in (195) is given by

\[ P = p + \sum_{i=1}^{m}\hat{k}N_i \tag{196} \]

They were constructed in four dimensions where they were not a complete system of states¹¹ and it took some time to realize that in fact they were a complete system of states if $d = 26$ [32, 33].¹² Brower [32] and Goddard and Thorn [33] showed also that the dual resonance model was ghost free for any dimension $d \leq 26$. In $d = 26$ this follows from the fact that the DDF operators obviously span a positive-definite Hilbert space (see (194)). For $d < 26$ there are extra states called Brower states [32]. The first of these states is the last state in (166) that becomes a zero norm state for $d = 26$. But also for $d < 26$ there is no negative norm state among the physical states. The proof of the no-ghost theorem in the case $\alpha_0 = 1$ is a very important step because it shows that the dual resonance model, constructed by generalizing the four-point Veneziano formula, is a fully consistent quantum-relativistic theory! This is not quite true because, when the intercept $\alpha_0 = 1$, the lowest state of the spectrum, corresponding to the pole in the N-point amplitude for $\alpha(s) = 0$, is a tachyon with mass $m^2 = -\frac{1}{\alpha'}$. A lot of effort was then made to construct a model without a tachyon and with a meson spectrum consistent with the experimental data. The only reasonably consistent models that came out of these attempts were the Neveu–Schwarz model [7] for mesons and the Ramond model [8] for fermions, which only later were recognized to be part of a unique model that nowadays is called the Neveu–Schwarz–Ramond model.
¹¹ Because of this Fubini did not want to publish our result, but then he went to a meeting in Israel in spring 1971 and gave a talk on our work, where he found that the audience was very interested in our result, and when he came back to MIT we decided to publish it.
¹² I still remember Charles Thorn coming into my office at CERN and telling me: Paolo, do you know that your DDF states are complete if d = 26? I quickly redid the analysis done in [29] with an arbitrary value of the space–time dimension, obtaining (166) and (167) that show that the spinless state at the level α(s) = 2 is decoupled if d = 26. I strongly regretted not having used an arbitrary space–time dimension d in the analysis of [29].
But this model was not really more consistent than the original dual resonance model because it still had a tachyon with mass $m^2 = -\frac{1}{2\alpha'}$. The tachyon was eliminated from the spectrum only in 1976 through the GSO projection proposed by Gliozzi, Scherk and Olive [34].
The realization that, at least for the critical value of the space–time
dimension d = 26, the physical states are described by the DDF states having only
d − 2 = 24 independent components opened the way for Brink and Nielsen [35]
to compute the value α0 = 1 of the intercept of the Regge trajectory with a very
physical argument. They related the intercept of the Regge trajectory to the zero
point energy of a system with an infinite number of oscillators having only d − 2
independent components

\alpha_0 = -\frac{d-2}{2} \sum_{n=1}^{\infty} n \qquad (197)

This quantity is obviously infinite and, in order to make sense of it, they in-
troduced a cut-off on the frequencies of the harmonic oscillators obtaining an
infinite term that they eliminated by renormalizing the speed of light and a
finite universal constant term that gave the intercept of the Regge trajectory.
Instead of following their original approach we discuss here an alternative ap-
proach due to Gliozzi [36] that uses the ζ-function regularization. He rewrites
(197) as follows:
\alpha_0 = -\frac{d-2}{2} \sum_{n=1}^{\infty} n = -\frac{d-2}{2} \lim_{s \to -1} \sum_{n=1}^{\infty} n^{-s} = -\frac{d-2}{2}\, \zeta_R(-1) = 1 \qquad (198)

where in the last equation we have used the identity ζR(−1) = −1/12 and we
have put d = 26. Since the Shapiro–Virasoro model has two sets of transverse
harmonic oscillators it is obvious that its intercept is twice that of the
generalized Veneziano model.
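The two regularizations mentioned above can be compared numerically. The following is a minimal sketch, not the computation of [35] or [36]: it assumes an exponential cut-off exp(−εn) as a stand-in for the frequency cut-off, subtracts the divergent 1/ε² term, and checks that the finite remainder approaches ζR(−1) = −1/12. The function names and the truncation n_max are illustrative choices.

```python
import math

def cutoff_sum(eps, n_max=20_000):
    """Regulated sum over n >= 1 of n * exp(-eps * n), truncated at n_max.

    The truncation is harmless as long as eps * n_max is large (eps >= 0.01 here).
    """
    return math.fsum(n * math.exp(-eps * n) for n in range(1, n_max + 1))

def finite_part(eps):
    """Remove the divergent 1/eps**2 term; the remainder tends to zeta_R(-1)."""
    return cutoff_sum(eps) - 1.0 / eps**2

def intercept(d, zeta_minus_one=-1.0 / 12.0):
    """alpha_0 = -(d - 2)/2 * zeta_R(-1), as in (198)."""
    return -(d - 2) / 2.0 * zeta_minus_one

if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.01):
        print(eps, finite_part(eps))   # approaches -1/12 as eps decreases
    print(intercept(26))
```

The finite part is independent of the cut-off as ε → 0, which is why it can be identified with a universal constant, while the 1/ε² term is the one that has to be absorbed by a renormalization.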
Using the rules discussed in the previous section we can construct the
vertex operator corresponding to the state in (195). It is given by
V_{(i;N_i)}(z, P) = \prod_{i=1}^{m} \oint_z dz_i \, \epsilon_i \cdot P(z_i) \, e^{i N_i \hat{k} \cdot Q(z_i)} : e^{i p \cdot Q(z)} : \qquad (199)

where the integral on the variable zi is evaluated along a curve of the complex
plane zi containing the point z. The singularity of the integrand for zi = z is
a pole provided that the following condition is satisfied.

2\alpha' \, p \cdot \hat{k} = 1 \qquad (200)

The last vertex in (199) is the vertex operator corresponding to the ground
tachyonic state given in (59) with α′p² = 1.
Using the general form of the vertex one can compute the three-point
amplitude involving three arbitrary DDF vertex operators. This calculation
has been performed in [37] and, since the vertex operators are conformal fields
with dimension equal to 1, one gets

\langle 0,0| V_{(i^{(1)}_{k_1}; N^{(1)}_{k_1})}(z_1, P_1) \, V_{(i^{(2)}_{k_2}; N^{(2)}_{k_2})}(z_2, P_2) \, V_{(i^{(3)}_{k_3}; N^{(3)}_{k_3})}(z_3, P_3) |0,0\rangle = \frac{C_{123}}{(z_1 - z_2)(z_1 - z_3)(z_2 - z_3)} \qquad (201)

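where the explicit form of the coefficient C123 is given by

C_{123} = {}_1\langle 0,0| \, {}_2\langle 0,0| \, {}_3\langle 0,0| \exp\Big[ \frac{1}{2} \sum_{r,s=1}^{3} \sum_{n,m=1}^{\infty} A^{(r)}_{-n;i} N^{rs}_{nm} A^{(s)}_{-m;i} + P_i \cdot \sum_{i=1}^{3} \sum_{n=1}^{\infty} A^{(r)}_{-n;i} \Big]
    \times e^{\tau_0 \sum_{r=1}^{3} (\alpha' \Pi_r^2 - 1)} \, |N^{(1)}_{k_1}, i^{(1)}_{k_1}\rangle_1 \, |N^{(2)}_{k_2}, i^{(2)}_{k_2}\rangle_2 \, |N^{(3)}_{k_3}, i^{(3)}_{k_3}\rangle_3 \qquad (202)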

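where

N^{rs}_{nm} = -N^r_n N^s_m \frac{n m \, \alpha_1 \alpha_2 \alpha_3}{n \alpha_s + m \alpha_r} ; \qquad N^r_n = \frac{\Gamma\left(-n \frac{\alpha_{r+1}}{\alpha_r}\right)}{\alpha_r \, n! \, \Gamma\left(1 - n \frac{\alpha_{r+1}}{\alpha_r} - n\right)} \qquad (203)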

with

\Pi = P_{r+1} \alpha_r - P_r \alpha_{r+1} ; \qquad r = 1, 2, 3 \qquad (204)

Π is independent of the value of r chosen as a consequence of the equations

\sum_{r=1}^{3} \alpha_r = \sum_{r=1}^{3} P_r = 0 \qquad (205)

7 The Zero Slope Limit

In the introduction we have seen that the dual resonance model has been
constructed using rules that are different from those used in field theory.
For instance, we have seen that planar duality implies that the amplitude
corresponding to a certain duality diagram, contains poles in both s and t
channels, while the amplitude corresponding to a Feynman diagram in field
theory contains only a pole in one of the two channels. Furthermore, the
scattering amplitude in the dual resonance model contains an infinite number
of resonant states that, at high energy, average out to give Regge behaviour.
Also this property is not observed in field theory. The question that was
natural to ask, was then: is there any relation between the dual resonance
model and field theory? It turned out, to the surprise of many, that the dual
resonance model was not in contradiction with field theory, but was instead
an extension of a certain number of field theories. We will see that the limit in
which a field theory is obtained from the dual resonance model corresponds
to taking the slope α′ of the Regge trajectory to zero.
Let us consider the scattering amplitude of four ground state particles in
(1) that we rewrite here with the correct normalization factor

A(s, t, u) = C_0 N_0^4 \left( A(s, t) + A(s, u) + A(t, u) \right) \qquad (206)

where

N_0 = \sqrt{2} \, g \, (2\alpha')^{\frac{d-2}{4}} \qquad (207)

is the correct normalization factor for each external leg, g is the dimensionless
open string coupling constant that we have constantly ignored in the previous
sections and C0 is determined by the following relation:

C_0 N_0^2 \alpha' = 1 \qquad (208)

that is obtained by requiring the factorization of the amplitude at the pole

corresponding to the ground state particle whose mass is given in (21). Using
(21) in order to rewrite the intercept of the Regge trajectory in terms of the
mass of the ground state particle m2 and the following relation satisﬁed by
the Γ -function:

Γ (1 + z) = zΓ (z) (209)

we can easily perform the limit for α′ → 0 of A(s, t) obtaining

\lim_{\alpha' \to 0} A(s, t) = \frac{1}{\alpha'} \left[ \frac{1}{m^2 - s} + \frac{1}{m^2 - t} \right] \qquad (210)

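Performing the same limit on the other two planar amplitudes, we get the
following expression for the total amplitude in (206):

\lim_{\alpha' \to 0} A(s, t, u) = \left[ \sqrt{2} \, g \, (2\alpha')^{\frac{d-2}{4}} \right]^2 \frac{2}{(\alpha')^2} \left[ \frac{1}{m^2 - s} + \frac{1}{m^2 - t} + \frac{1}{m^2 - u} \right] \qquad (211)

By introducing the coupling constant

g_3 = 4 g \, (2\alpha')^{\frac{d-6}{4}} \qquad (212)

Equation (211) becomes

\lim_{\alpha' \to 0} A(s, t, u) = g_3^2 \left[ \frac{1}{m^2 - s} + \frac{1}{m^2 - t} + \frac{1}{m^2 - u} \right] \qquad (213)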

that is equal to the sum of the tree diagrams for the scattering of four particles
with mass m of Φ³ theory with coupling constant equal to g3. We have shown
that, by keeping g3 fixed in the limit α′ → 0, the scattering amplitude of four
ground state particles of the dual resonance model is equal to the tree diagrams
of Φ³ theory. This proof can be extended to the scattering of N ground state
particles recovering also in this case the tree diagrams of Φ³ theory. It is also
valid for loop diagrams that we will discuss in the next section. In conclusion,
the dual resonance model reduces in the zero slope limit to Φ³ theory. The
proof that we have presented here is due to Scherk [38].13
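The limit (210) and the definition of g3 in (212) can be checked numerically. The sketch below rests on two assumptions that are not restated in this section: that the planar amplitude of (1) has the Beta-function form Γ(−α(s))Γ(−α(t))/Γ(−α(s) − α(t)), and that (21) fixes the intercept so that α(s) = α′(s − m²). The function names are illustrative.

```python
import math

def planar_amplitude(s, t, m2, alpha_prime):
    """Assumed form of A(s,t): Gamma(-a(s)) Gamma(-a(t)) / Gamma(-a(s) - a(t)),
    with a(x) = alpha' * (x - m2), so that a(m2) = 0 at the ground-state pole."""
    a = alpha_prime * (m2 - s)  # equals -alpha(s)
    b = alpha_prime * (m2 - t)  # equals -alpha(t)
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def zero_slope_limit(s, t, m2, alpha_prime):
    """Right-hand side of (210)."""
    return (1.0 / (m2 - s) + 1.0 / (m2 - t)) / alpha_prime

def g3_squared_from_normalization(g, alpha_prime, d):
    """Overall coefficient of the pole sum in (211), built from (207) and (208)."""
    n0 = math.sqrt(2.0) * g * (2.0 * alpha_prime) ** ((d - 2) / 4.0)  # (207)
    c0 = 1.0 / (n0**2 * alpha_prime)                                  # (208)
    # Each pole appears in two of the three planar terms of (206): factor 2.
    return c0 * n0**4 * 2.0 / alpha_prime

def g3(g, alpha_prime, d):
    """Coupling constant defined in (212)."""
    return 4.0 * g * (2.0 * alpha_prime) ** ((d - 6) / 4.0)

if __name__ == "__main__":
    for ap in (1e-2, 1e-3, 1e-4):
        ratio = planar_amplitude(-1.0, -2.0, 1.0, ap) / zero_slope_limit(-1.0, -2.0, 1.0, ap)
        print(ap, ratio)  # tends to 1 as alpha' -> 0
```

The ratio approaches 1 because Γ(a)Γ(b)/Γ(a + b) behaves as 1/a + 1/b for small a and b, which is the only property of the amplitude used in the limit.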
A more interesting case to study is the one with intercept α0 = 1. We will
see that, in this case, one will obtain the tree diagrams of Yang–Mills theory,
as shown by Neveu and Scherk [40].14
Let us consider the three-point amplitude involving three massless gauge
particles described by the vertex operator in (173). It is given by the sum
of two planar diagrams. The ﬁrst one corresponding to the ordering (123) is
given by
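C_0 N_0^3 \, i^3 \, \mathrm{Tr}(\lambda^{a_1} \lambda^{a_2} \lambda^{a_3}) \, \frac{\langle 0,0| V_1(z_1, p_1) V_2(z_2, p_2) V_3(z_3, p_3) |0,0\rangle}{[(z_1 - z_2)(z_2 - z_3)(z_1 - z_3)]^{-1}} \qquad (214)

Using momentum conservation p1 + p2 + p3 = 0 and the mass shell conditions
p_i² = p_i · ε_i = 0, one can rewrite the previous equation as follows:

C_0 N_0^3 \, \mathrm{Tr}(\lambda^{a_1} \lambda^{a_2} \lambda^{a_3}) \sqrt{2\alpha'} \times \left[ (\epsilon_1 \cdot \epsilon_2)(p_1 \cdot \epsilon_3) + (\epsilon_1 \cdot \epsilon_3)(p_3 \cdot \epsilon_2) + (\epsilon_2 \cdot \epsilon_3)(p_2 \cdot \epsilon_1) \right] \qquad (215)

The second contribution comes from the ordering (132) that can be obtained
from the previous one by the substitution

\mathrm{Tr}(\lambda^{a_1} \lambda^{a_2} \lambda^{a_3}) \to -\mathrm{Tr}(\lambda^{a_1} \lambda^{a_3} \lambda^{a_2}) \qquad (216)

Summing the two contributions one gets

C_0 N_0^3 \, \mathrm{Tr}(\lambda^{a_1} [\lambda^{a_2}, \lambda^{a_3}]) \sqrt{2\alpha'} \times \left[ (\epsilon_1 \cdot \epsilon_2)(p_1 \cdot \epsilon_3) + (\epsilon_1 \cdot \epsilon_3)(p_3 \cdot \epsilon_2) + (\epsilon_2 \cdot \epsilon_3)(p_2 \cdot \epsilon_1) \right] \qquad (217)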

The factor

N_0 = 2 g \, (2\alpha')^{(d-2)/4} \qquad (218)

is the correct normalization factor for each vertex operator if we normalize
the generators of the Chan–Paton group as follows:

\mathrm{Tr}(\lambda^i \lambda^j) = \frac{1}{2} \delta^{ij} \qquad (219)
13
See also [39].
14
See also [41].

It is related to C0 through the relation15

C_0 N_0^2 \alpha' = 2 \qquad (220)

g is the dimensionless open string coupling constant. Notice that (218) and
(220) differ from (207) and (208) because of the presence of the Chan–Paton
factors that we did not include in the case of Φ³ theory.
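By using the commutation relations

[\lambda^a, \lambda^b] = i f^{abc} \lambda^c \qquad (221)

and the previous normalization factors we get for the three-gluon amplitude

i g_{YM} f^{a_1 a_2 a_3} \left[ (\epsilon_1 \cdot \epsilon_2)((p_1 - p_2) \cdot \epsilon_3) + (\epsilon_1 \cdot \epsilon_3)((p_3 - p_1) \cdot \epsilon_2) + (\epsilon_2 \cdot \epsilon_3)((p_2 - p_3) \cdot \epsilon_1) \right] \qquad (222)

that is equal to the three-gluon vertex that one obtains from the Yang–Mills
action

L_{YM} = -\frac{1}{4} F^a_{\alpha\beta} F_a^{\alpha\beta} , \qquad F^a_{\alpha\beta} = \partial_\alpha A^a_\beta - \partial_\beta A^a_\alpha + g_{YM} f^{abc} A^b_\alpha A^c_\beta \qquad (223)

where

g_{YM} = 2 g \, (2\alpha')^{\frac{d-4}{4}} \qquad (224)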

The previous procedure can be extended to the scattering of N gluons, finding
the same result that one gets from the tree diagrams of Yang–Mills theory.
In the next section, we will discuss the loop diagrams. Also in this case one
finds that the h-loop diagrams involving N external gluons reproduce, in the
zero slope limit, the sum of the h-loop diagrams with N external gluons of
Yang–Mills theory.
We conclude this section mentioning that one can also take the zero slope
limit of a scattering amplitude involving three and four gravitons obtaining
agreement with what one gets from the Einstein Lagrangian of general rela-
tivity. This has been shown by Yoneya [43].

8 Loop Diagrams
The N -point amplitude previously constructed satisﬁes all the axioms of
S-matrix theory except unitarity because its only singularities are simple poles
corresponding to zero width resonances lying on the real axis of the Mandel-
stam variables and does not contain the various cuts required by unitarity [1].
15
The determination of the previous normalization factors can be found in the
appendix of [42].

In order to eliminate this problem, it was proposed already in the early days of
dual theories to assume, in analogy with what happens for instance in pertur-
bative ﬁeld theory, that the N -point amplitude was only the lowest order (the
tree diagram) of a perturbative expansion and, in order to implement unitar-
ity, it was necessary to include loop diagrams. Then, the one-loop diagrams
were constructed from the propagator and vertices that we have introduced
in the previous sections [44]. The planar one-loop amplitude with M external
particles was computed by starting from a (M + 2)-point tree amplitude and
then by sewing two external legs together after the insertion of a propagator
D given in (100). In this way one gets

\int \frac{d^d P}{(2\alpha')^{d/2} (2\pi)^d} \sum_\lambda \langle P, \lambda| V(1, p_1) D V(1, p_2) \ldots V(1, p_N) D |P, \lambda\rangle \qquad (225)

where the sum over λ corresponds to the trace in the space of the harmonic
oscillators and the integral in d^dP corresponds to integrating over the momentum
circulating in the loop. The previous expression for the one-loop amplitude
cannot be quite correct because all states of the space generated by the oscil-
lators in (51) are circulating in the loop, while we know that we should include
only the physical ones. This was achieved ﬁrst by cancelling by hand the time
and one of the space components of the harmonic oscillators reducing the de-
grees of freedom of each oscillator from d to d − 2 as suggested by the DDF
operators at least for d = 26. This procedure was then shown to be correct
by Brink and Olive [45]. They constructed the operator that projects over the
physical states and, by inserting it in the loop, showed that the reduction of
the degrees of freedom of the oscillators from d to d − 2 was indeed correct.
This was, at that time, the only procedure available to let only the physical
states circulate in the loop because the BRST procedure was discovered a bit
later also in the framework of the gauge ﬁeld theories!
To be more explicit let us compute the trace in (225) adding also the
Chan–Paton factor. We get
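(2\pi)^d \delta^{(d)}\Big( \sum_{i=1}^{M} p_i \Big) \frac{N \, \mathrm{Tr}(\lambda^{a_1} \ldots \lambda^{a_M})}{(8\pi^2 \alpha')^{d/2}} N_0^M \int_0^\infty \frac{d\tau}{\tau^{d/2+1}} [f_1(k)]^{2-d} \, k^{\frac{d-26}{12}} \, (2\pi)^M
    \times \int_0^1 d\nu_M \int_0^{\nu_M} d\nu_{M-1} \ldots \int_0^{\nu_3} d\nu_2 \; \tau^M \prod_{i<j} \left[ e^{G(\nu_{ji})} \right]^{2\alpha' p_i \cdot p_j} ; \qquad k \equiv e^{-\pi\tau} \qquad (226)

where ν_{ji} ≡ ν_j − ν_i,

G(\nu) = \log \left[ i e^{-\pi \nu^2 \tau} \frac{\Theta_1(i\nu\tau | i\tau)}{f_1^3(k)} \right] ; \qquad f_1(k) = k^{1/12} \prod_{n=1}^{\infty} (1 - k^{2n}) \qquad (227)

and

\Theta_1(\nu | i\tau) = -2 k^{1/4} \sin \pi\nu \prod_{n=1}^{\infty} \left( 1 - e^{2i\pi\nu} k^{2n} \right) \left( 1 - e^{-2i\pi\nu} k^{2n} \right) (1 - k^{2n}) \qquad (228)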
Finally, the normalization factor N0 is given in (218). We have performed the
calculation for an arbitrary value of the space–time dimension d. However, in
this way one gets also the extra factor of k^{(d−26)/12} appearing in the first line of
(226) that implies that our calculation is actually only consistent if d = 26. In
fact, the presence of this factor does not allow one to rewrite the amplitude,
originally obtained in the Reggeon sector, in the Pomeron sector as explained
below. In the following we neglect this extra factor, implicitly assuming that
d = 26, but, on the other hand, still keeping an arbitrary d.
Using the relations

f_1(k) = \sqrt{t} \, f_1(q) ; \qquad \Theta_1(i\nu\tau | i\tau) = i \Theta_1(\nu | it) \, t^{1/2} e^{\pi \nu^2 / t} \qquad (229)

where t = 1/τ and q ≡ e^{−πt}, we can rewrite the one-loop planar diagram in the
Pomeron channel. We get
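(2\pi)^d \delta^{(d)}\Big( \sum_{i=1}^{M} p_i \Big) \frac{N \, \mathrm{Tr}(\lambda^{a_1} \ldots \lambda^{a_M})}{(8\pi^2 \alpha')^{d/2}} N_0^M \int_0^\infty dt \, [f_1(q)]^{2-d} (2\pi)^M
    \times \int_0^1 d\nu_M \int_0^{\nu_M} d\nu_{M-1} \ldots \int_0^{\nu_3} d\nu_2 \prod_{i<j} \left[ -\frac{\Theta_1(\nu_{ji} | it)}{f_1^3(q)} \right]^{2\alpha' p_i \cdot p_j} \qquad (230)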
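The first relation in (229), which drives the change from the Reggeon variable τ to the Pomeron variable t = 1/τ, is easy to verify numerically. This is a minimal sketch using a truncated infinite product; the function names and the truncation n_terms are illustrative choices.

```python
import math

def f1(x, n_terms=200):
    """f_1(x) = x**(1/12) * prod_{n>=1} (1 - x**(2n)), truncated after n_terms factors."""
    prod = 1.0
    for n in range(1, n_terms + 1):
        prod *= 1.0 - x ** (2 * n)
    return x ** (1.0 / 12.0) * prod

def modular_pair(tau):
    """Return (f_1(k), sqrt(t) * f_1(q)) with k = exp(-pi*tau), t = 1/tau, q = exp(-pi*t)."""
    t = 1.0 / tau
    k = math.exp(-math.pi * tau)
    q = math.exp(-math.pi * t)
    return f1(k), math.sqrt(t) * f1(q)

if __name__ == "__main__":
    for tau in (0.5, 1.0, 3.0):
        print(tau, modular_pair(tau))  # the two entries of each pair should agree
```

With this relation the measure dτ τ^{−d/2−1} [f1(k)]^{2−d} of (226) turns into dt [f1(q)]^{2−d}, which is the combination appearing in (230).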

Notice that, by factorizing the planar loop in the Pomeron channel, one con-
structed for the ﬁrst time what we now call the boundary state [46].16 This
can be easily seen in the way that we are now going to describe. First of all,
notice that the last quantity in (230) can be written as follows:

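\prod_{i<j} \left[ -\frac{\Theta_1(\nu_{ji} | it)}{f_1^3(q)} \right]^{2\alpha' p_i \cdot p_j} = \prod_{i<j} \left[ -2 \sin(\pi \nu_{ji}) \prod_{n=1}^{\infty} \frac{\left( 1 - q^{2n} e^{2\pi i \nu_{ji}} \right) \left( 1 - q^{2n} e^{-2\pi i \nu_{ji}} \right)}{(1 - q^{2n})^2} \right]^{2\alpha' p_i \cdot p_j} \qquad (231)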

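This equation can be rewritten as follows:

\frac{\mathrm{Tr}\left( \langle p = 0| \, q^{2R} \prod_{i=1}^{M} : e^{i p_i \cdot Q(e^{2i\pi\nu_i})} : |p = 0\rangle \right)}{\mathrm{Tr}\left( \langle p = 0| q^{2N} |p = 0\rangle \right)} \, i^M ; \qquad R = \sum_{n=1}^{\infty} n \, a_n^\dagger \cdot a_n \qquad (232)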
16
See also the ﬁrst paper in [47].

where the trace is taken only over the non-zero modes and momentum con-
servation has been used. It must also be stressed that the normal ordering of
the vertex operators in the previous equation is such that the zero modes are
taken to be both in the same exponential instead of being ordered as in (59).
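By bringing all annihilation operators on the left of the creation ones, from
the expression in (232), one gets (z_i ≡ e^{2πiν_i})

(2\pi)^d \delta^{(d)}\Big( \sum_{i=1}^{M} p_i \Big) \prod_{i<j} (-2 \sin \pi\nu_{ji})^{2\alpha' p_i \cdot p_j}
    \times \frac{\prod_{i,j} \prod_{n=1}^{\infty} \mathrm{Tr}\left( q^{2n a_n^\dagger \cdot a_n} \, e^{\sqrt{2\alpha'} \, p_j \cdot \frac{a_n^\dagger}{\sqrt{n}} z_j^n} \, e^{-\sqrt{2\alpha'} \, p_i \cdot \frac{a_n}{\sqrt{n}} z_i^{-n}} \right)}{\mathrm{Tr}\left( \langle p = 0| q^{2N} |p = 0\rangle \right)} \qquad (233)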

The trace can be computed by using the completeness relation involving
coherent states |f⟩ = e^{f a†}|0⟩:

\int \frac{d^2 f}{\pi} \, e^{-|f|^2} \, |f\rangle \langle f| = 1 \qquad (234)

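Inserting the previous identity operator in (233), one gets after some calculation

(2\pi)^d \delta^{(d)}\Big( \sum_{i=1}^{M} p_i \Big) \prod_{i<j} (-2 \sin \pi\nu_{ji})^{2\alpha' p_i \cdot p_j} \times \prod_{i,j=1}^{M} \prod_{n=1}^{\infty} \exp\left[ -2\alpha' p_i \cdot p_j \, e^{2\pi i n \nu_{ji}} \frac{q^{2n}}{n (1 - q^{2n})} \right] \qquad (235)

Expanding the denominator in the last exponent and performing the sum over
n one gets

(2\pi)^d \delta^{(d)}\Big( \sum_{i=1}^{M} p_i \Big) \prod_{i<j} (-2 \sin \pi\nu_{ji})^{2\alpha' p_i \cdot p_j} \times \prod_{i,j} \exp\left[ 2\alpha' p_i \cdot p_j \sum_{m=0}^{\infty} \log\left( 1 - e^{2\pi i \nu_{ji}} q^{2(m+1)} \right) \right] \qquad (236)

that is equal to the last line of (231) apart from the δ-function for momentum
conservation. In conclusion, we have shown that (231) and (232) are equal.
Using (231) we can rewrite (230) as follows: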
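\frac{N N_0^M \, \mathrm{Tr}(\lambda^{a_1} \ldots \lambda^{a_M})}{(8\pi^2 \alpha')^{d/2}} \int_0^\infty dt \, [f_1(q)]^{2-d} (2\pi i)^M \int_0^1 d\nu_M \int_0^{\nu_M} d\nu_{M-1} \ldots \int_0^{\nu_3} d\nu_2
    \times \frac{\sum_\lambda \langle p = 0, \lambda| \, q^{2R} \prod_{i=1}^{M} : e^{i p_i \cdot Q(e^{2i\pi\nu_i})} : |p = 0, \lambda\rangle}{\sum_\lambda \langle p = 0, \lambda| q^{2N} |p = 0, \lambda\rangle} \qquad (237)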
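where the sum over any state |λ⟩ corresponds to taking the trace over the
non-zero modes. If d = 26 we can rewrite (237) in a simpler form

\frac{N N_0^M \, \mathrm{Tr}(\lambda^{a_1} \ldots \lambda^{a_M})}{(8\pi^2 \alpha')^{d/2}} \int_0^\infty dt \, (2\pi i)^M \int_0^1 d\nu_M \int_0^{\nu_M} d\nu_{M-1} \ldots \int_0^{\nu_3} d\nu_2
    \times \sum_\lambda \langle p = 0, \lambda| \, q^{2R-2} \prod_{i=1}^{M} : e^{i p_i \cdot Q(e^{2i\pi\nu_i})} : |p = 0, \lambda\rangle \qquad (238)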

The previous equation contains the factor ∫ dt q^{2R−2} that is like the
propagator of the Shapiro–Virasoro model, but with only one set of oscillators as
in the generalized Veneziano model. In the following we will rewrite it
completely with the formalism of the Shapiro–Virasoro model. This can be done
by introducing the Pomeron propagator

\int_0^\infty dt \, q^{2N-2} = \frac{2}{\pi\alpha'} \hat{D} ; \qquad \hat{D} \equiv \frac{\alpha'}{4\pi} \int \frac{d^2 z}{|z|^2} \, z^{L_0 - 1} \bar{z}^{\tilde{L}_0 - 1} ; \qquad |z| \equiv q = e^{-\pi t} \qquad (239)

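and rewriting the planar loop in the following compact form:

\langle B_0| \hat{D} |B_M\rangle ; \qquad |B_0\rangle \equiv \frac{T_{d-1}}{2} \, N \, e^{\sum_{n=1}^{\infty} a_n^\dagger \cdot \tilde{a}_n^\dagger} |p = 0, 0_a, 0_{\tilde{a}}\rangle \qquad (240)

where |B0⟩ is the boundary state without any Reggeon on it,

T_{d-1} = \frac{\sqrt{\pi}}{2^{(d-10)/4}} \left( 2\pi \sqrt{\alpha'} \right)^{-d/2-1} \qquad (241)

and |BM⟩ is instead the one with M Reggeons given by

|B_M\rangle = N_0^M \, \mathrm{Tr}(\lambda^{a_1} \ldots \lambda^{a_M}) (2\pi i)^M \int_0^1 d\nu_M \int_0^{\nu_M} d\nu_{M-1} \ldots \int_0^{\nu_3} d\nu_2 \times \prod_{i=1}^{M} : e^{i p_i \cdot Q(e^{2i\pi\nu_i})} : |B_0\rangle \qquad (242)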

We want to stress once more that the normal ordering in the previous equa-
tion is deﬁned by taking the zero modes in the same exponential. Both the
boundary states and the propagator are now states of the Shapiro–Virasoro
model. This means that we have rewritten the one-loop planar diagram, where

the states of the generalized Veneziano model circulate in the loop, as a tree
diagram of the Shapiro–Virasoro model involving two boundary states and a
propagator. This is what nowadays is called open/closed string duality.
Besides the one-loop planar diagram in (225), that is nowadays called the
annulus diagram, also the non-planar and the non-orientable diagrams were
constructed and studied. In particular the non-planar one, that is obtained as
the planar one in (225) but with two propagators multiplied with the twist
operator

\Omega = e^{L_{-1}} (-1)^R , \qquad (243)

had unitarity violating cuts that disappeared [27] if the dimension of the
space–time d = 26, leaving behind additional pole singularities. The explicit
form of the non-planar loop can be obtained following the same steps done
for the planar loop. One gets for the non-planar loop the following amplitude:

\langle B_R| \hat{D} |B_M\rangle \qquad (244)

where now both boundary states contain, respectively, R and M Reggeon

states. The additional poles found in the non-planar loop were called Pomerons
because they occur in the Pomeron sector, that today is called the closed string
channel, to distinguish them from the Reggeons that instead occur in the
Reggeon sector, that today is called the open string sector of the planar and
non-planar loop diagrams. At that time in fact, the states of the generalized
Veneziano models were called Reggeons, while the additional ones appear-
ing in the non-planar loop were called Pomerons. The Reggeons correspond
nowadays to open string states, while the Pomerons to closed string states.
These things are obvious now, but at that time it took a while to show that
the additional states appearing in the Pomeron sector have to be identiﬁed
with those of the Shapiro–Virasoro model. The proof that the spectrum was
the same came rather early. This was obtained by factorizing the non-planar
diagram in the Pomeron channel [46] as we have done in (244). It was found
that the states of the Pomeron channel lie on a linear Regge trajectory that
has double intercept and half slope of the one of the Reggeons. This follows
immediately from the propagator D̂ in Eq. (239) that has poles for values of
the momentum of the Pomeron exchanged given by
2 - \frac{\alpha'}{2} p^2 = 2n \qquad (245)

that are exactly the values of the masses of the states of the Shapiro–Virasoro
model [48], while the Reggeon propagator in (100) has poles for values of
momentum equal to

1 - \alpha' p^2 = n \qquad (246)
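The two pole conditions translate directly into mass spectra. The sketch below uses p² = −m², which is the convention implied by the tachyon having α′p² = 1 and m² = −1/α′ earlier in the text; the function names are illustrative, and exact rational arithmetic is used only to keep the output readable.

```python
from fractions import Fraction

def reggeon_mass_squared(n, alpha_prime=Fraction(1)):
    """Solve (246), 1 - alpha' p^2 = n, for m^2 = -p^2."""
    return Fraction(n - 1) / alpha_prime

def pomeron_mass_squared(n, alpha_prime=Fraction(1)):
    """Solve (245), 2 - (alpha'/2) p^2 = 2n, for m^2 = -p^2."""
    return Fraction(4 * (n - 1)) / alpha_prime

if __name__ == "__main__":
    for n in range(4):
        # Level n: open-string (Reggeon) and closed-string (Pomeron) m^2 in units of 1/alpha'
        print(n, reggeon_mass_squared(n), pomeron_mass_squared(n))
```

In both sectors the level n = 0 is tachyonic and n = 1 is massless, and the level spacing in the Pomeron sector is four times larger, as expected for a trajectory with double intercept and half slope.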

However, it was still not clear that the Pomeron states interact among them-
selves as the states of the Shapiro–Virasoro model. To show this it was

ﬁrst necessary to construct tree amplitudes containing both states of the

generalized Veneziano model and of the Shapiro–Virasoro model [49]. They
reduced to the amplitudes of the generalized Veneziano (Shapiro–Virasoro)
model if we have only external states of the generalized Veneziano (Shapiro–
Virasoro) model. Those amplitudes are called today disk amplitudes contain-
ing both open and closed string states. They were constructed [49] by using
for the Reggeon states the vertex operators that we have discussed in Sect. 5
involving one set of harmonic oscillators and for the Pomeron states the vertex
operators given in (181) that we rewrite here
V_{\alpha,\beta}(z, \bar{z}, p) = V_\alpha\left( z, \frac{p}{2} \right) V_\beta\left( \bar{z}, \frac{p}{2} \right) \qquad (247)

because now both component vertices contain the same set of harmonic
oscillators as in the generalized Veneziano model. Furthermore, each of the two
vertices is separately normal ordered, but their product is not normal ordered.
The amplitude involving both kinds of states is then constructed by taking
the product of all vertices between the projective invariant vacuum and inte-
grating the Reggeons on the real axis in an ordered way and the Pomerons in
the upper half plane, as one does for a disk amplitude.
We have mentioned above that the two vertices are separately normal
ordered, but their product is not normal ordered. When we normal order
them we get, for instance for the tachyon of the Pomeron sector, a factor
(z − z̄)^{α′p²/2} that describes the Reggeon–Pomeron transition. This implies a
direct coupling [51] between the U(1) part of the gauge field and the two-index
antisymmetric field Bμν, called the Kalb–Ramond field [50], of the Pomeron sector,
that makes the gauge field massive [51].
It was then shown that, by factorizing the non-planar loop in the Pomeron
channel, one reproduced the scattering amplitude containing one state of
the Shapiro–Virasoro and a number of states of the generalized Veneziano
model [52]. If we have also external states belonging to the generalized
Shapiro–Virasoro model, then by factorizing the non-planar one-loop ampli-
tude in the pure Pomeron channel, one would obtain the tree amplitudes of
the Shapiro–Virasoro model [52].
All this implies that the generalized Veneziano model and the Shapiro–
Virasoro model are not two independent models, but they are part of the
same and unique model. In fact, if one started with the generalized Veneziano
model and added loop diagrams to implement unitarity, one found the
appearance in the non-planar loop of additional states that had the same mass
and interaction as those of the Shapiro–Virasoro model.
The planar diagram, written in (230) in the closed string channel, is di-
vergent for large values of t. This divergence was recognized to be due to
exchange, in the Pomeron channel, of the tachyon of the Shapiro–Virasoro
model and of the dilaton [47]. They correspond, respectively, to the first two
terms of the expansion

[f_1(q)]^{-24} = e^{2\pi t} + 24 + O\left( e^{-2\pi t} \right) \qquad (248)

The first one could be cancelled by an analytic continuation, while the second
one could be eliminated through a renormalization of the slope of the Regge
trajectory α′ [47].
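The expansion (248) can be checked directly from the product representation of f1 given in (227), with q = e^{−πt}. A minimal numerical sketch, with illustrative function names and a truncated product:

```python
import math

def f1(x, n_terms=200):
    """f_1(x) = x**(1/12) * prod_{n>=1} (1 - x**(2n)), truncated after n_terms factors."""
    prod = 1.0
    for n in range(1, n_terms + 1):
        prod *= 1.0 - x ** (2 * n)
    return x ** (1.0 / 12.0) * prod

def pomeron_expansion(t):
    """Return ([f_1(q)]**(-24), exp(2*pi*t) + 24) for q = exp(-pi*t)."""
    q = math.exp(-math.pi * t)
    exact = f1(q) ** (-24)
    first_two_terms = math.exp(2.0 * math.pi * t) + 24.0
    return exact, first_two_terms

if __name__ == "__main__":
    for t in (1.0, 2.0, 3.0):
        exact, approx = pomeron_expansion(t)
        print(t, exact - approx)  # remainder decays like exp(-2*pi*t)
```

The growing term e^{2πt} is the tachyon contribution and the constant 24 counts the massless states, so the large-t divergence of the integral over t comes entirely from these two terms.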
We conclude the discussion of the one-loop diagrams by mentioning that
the one-loop diagram for the Shapiro–Virasoro model was computed by
Shapiro [53] who also found that the integrand was modular invariant.
The computation of multiloop diagrams requires a more advanced technology
that was also developed in the early days of the dual resonance model,
a few years before the discovery of its connection to string theory. In order to
compute multiloop diagrams, one needs first to construct an object that was
called the N -Reggeon vertex and that has the properties of containing N sets
of harmonic oscillators, one for each external leg, and is such that, when we
saturate it with N physical states, we get the corresponding N -point ampli-
tude. In the following we will discuss how to determine the N -Reggeon vertex.
The first step toward the N -Reggeon vertex is the Sciuto–Della Selva-
Saito [54] vertex that includes two sets of harmonic oscillators that we denote
with the indices 1 and 2. It is equal to

V_{SDS} = {}_2\langle x = 0, 0| : \exp\left( -\frac{1}{2\alpha'} \oint_0 dz \, X_2'(z) \cdot X_1(1 - z) \right) : \qquad (249)

where X is the quantity that we have called Q in (57) and the prime denotes
a derivative with respect to z. It satisfies the important property of giving the
vertex operator Vα(z = 1) of an arbitrary state |α⟩ when we saturate it with
the corresponding state

V_{SDS} |\alpha\rangle_2 = V_\alpha(z = 1) \qquad (250)

A shortcoming of this vertex is that it is not invariant under a cyclic permu-

tation of the three legs. A cyclic symmetric vertex has been constructed by
Caneschi, Schwimmer and Veneziano [55] by inserting the twist operator in
(243). But the three-Reggeon vertex is not enough if we want to compute an
arbitrary multiloop amplitude. We must generalize it to an arbitrary number
of external legs. Such a vertex, that can be obtained from the one in (249)
with a very direct procedure, or that can also be obtained by sewing together
three-Reggeon vertices, has been written in its ﬁnal form by Lovelace [56].17
Here we do not derive it, but we give directly its expression written in [56]:

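V_{N,0} = \int \frac{\prod_{i=1}^{N} dz_i}{dV_{abc} \prod_{i=1}^{N} [V_i'(0)]} \prod_{i=1}^{N} \left[ {}_i\langle x = 0, O_a| \right] \delta\Big( \sum_{i=1}^{N} p_i \Big)
    \times \exp\Big[ -\frac{1}{2} \sum_{i,j=1; \, i \neq j}^{N} \sum_{n,m=0}^{\infty} a_n^{(i)} D_{nm}(\Gamma V_i^{-1} V_j) \, a_m^{(j)} \Big] \qquad (251)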

17
See also [57]. Earlier papers on the N -Reggeon can be found in [58].
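where a_0^{(i)} ≡ α_0^i = √(2α′) p̂_i is the momentum of particle i and the infinite
matrix

D_{nm}(\gamma) = \frac{1}{m!} \sqrt{\frac{m}{n}} \, \partial_z^m [\gamma(z)]^n \big|_{z=0} ; \quad n, m = 1, \ldots ; \qquad D_{00}(\gamma) = -\log \left| \frac{D}{\sqrt{AD - BC}} \right|

D_{n0} = \frac{1}{\sqrt{n}} \left( \frac{B}{D} \right)^n ; \qquad D_{0n} = \frac{1}{\sqrt{n}} \left( -\frac{C}{D} \right)^n ; \qquad \gamma(z) = \frac{Az + B}{Cz + D} \qquad (252)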

is a “representation” of the projective group corresponding to the conformal
weight Δ = 0, that satisfies the equations

D_{nm}(\gamma_1 \gamma_2) = \sum_{l=1}^{\infty} D_{nl}(\gamma_1) D_{lm}(\gamma_2) + D_{n0}(\gamma_1) \delta_{0m} + D_{0m}(\gamma_2) \delta_{n0} \qquad (253)

and

D_{nm}(\gamma) = D_{mn}(\Gamma \gamma^{-1} \Gamma) , \qquad \Gamma(z) = \frac{1}{z} \qquad (254)

Finally, V_i is a projective transformation that maps 0, 1 and ∞ into z_{i−1}, z_i
and z_{i+1}.
The previous vertex can be written in a more elegant form as follows:

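V_{N,0} = \int \frac{\prod_{i=1}^{N} dz_i}{dV_{abc} \prod_{i=1}^{N} [V_i'(0)]} \prod_{i=1}^{N} \left[ {}_i\langle x = 0, O_a| \right] \delta\Big( \sum_{i=1}^{N} p_i \Big)
    \times \exp\Big[ \frac{i}{4\alpha'} \oint dz \, \partial X^{(i)}(z) \, \hat{p}_i \log V_i'(z) \Big]
    \times \exp\Big\{ -\frac{1}{2} \sum_{i,j=1; \, i \neq j}^{N} \oint dz \oint dy \, \partial X^{(i)}(z) \log[V_i(z) - V_j(y)] \, \partial X^{(j)}(y) \Big\} \qquad (255)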

where the quantities X (i) are what we called Q, namely the Fubini–Veneziano
ﬁeld, in the previous sections. The N -Reggeon vertex that satisﬁes the impor-
tant property of giving the scattering amplitude of N physical particles when
we saturate it with their corresponding states, is the fundamental object for
computing the multiloop amplitudes. In fact, if we want to compute a M -loop
amplitude with N external states, we need to start from the (N +2M )-Reggeon
vertex and then we have to sew the M pairs together after having inserted a
propagator D. In this way we obtain an amplitude that is not only integrated
over the punctures zi (i = 1 . . . N ) of the N external states, but also over
the additional 3h − 3 moduli corresponding to the puncture variables of the
states that we sew together and the integration variables of the M propagators.
h is the number of loops. The multiloop amplitudes have been obtained
in this way already in 1970 [59, 60, 61] and, through the sewing procedure, one
obtained functions, such as the period matrix, the abelian differentials, the prime
form, etc., that are well defined on the Riemann surface! The only thing that
was missing was the correct measure of integration over the 3h − 3 variables,
because it was technically not possible to let only the physical states
circulate in the loops. This problem was solved only much later [62, 63] when
a BRST invariant formulation of string theory and the light-cone functional
integral could be used for computing multiloops. They are two very different
approaches that, however, gave the same result. For the sake of completeness,
we write here the planar h-loop amplitude involving M tachyons

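A_M^{(h)}(p_1, \ldots, p_M) = N^h \, \mathrm{Tr}(\lambda^{a_1} \cdots \lambda^{a_M}) \, C_h \left[ 2 g_s (2\alpha')^{(d-2)/4} \right]^M
    \times \int [dm]_h^M \prod_{i<j} \left[ \frac{\exp\left( G^{(h)}(z_i, z_j) \right)}{\sqrt{V_i'(0) \, V_j'(0)}} \right]^{2\alpha' p_i \cdot p_j} , \qquad (256)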

where N^h Tr(λ^{a1} · · · λ^{aM}) is the appropriate U(N) Chan–Paton factor, g is
the dimensionless open string coupling constant, Ch is a normalization factor
given by

C_h = \frac{1}{(2\pi)^{dh}} \, g_s^{2h-2} \, \frac{1}{(2\alpha')^{d/2}} , \qquad (257)

and G^{(h)} is the h-loop bosonic Green function

G^{(h)}(z_i, z_j) = \log E^{(h)}(z_i, z_j) - \frac{1}{2} \int_{z_i}^{z_j} \omega^\mu \, (2\pi \, \mathrm{Im} \, \tau_{\mu\nu})^{-1} \int_{z_i}^{z_j} \omega^\nu , \qquad (258)

with E^{(h)}(z_i, z_j) being the prime form, ω^μ (μ = 1, . . . , h) the abelian
differentials and τμν the period matrix. All these objects, as well as the measure on
moduli space [dm]_h^M, can be explicitly written in the Schottky parametrization
of the Riemann surface, and their expressions for arbitrary h can be found for
example in [64]. In particular, the measure on the moduli space is given by

[dm]_h^M = \frac{1}{dV_{abc}} \prod_{i=1}^{M} \frac{dz_i}{V_i'(0)} \prod_{\mu=1}^{h} \left[ \frac{dk_\mu \, d\xi_\mu \, d\eta_\mu}{k_\mu^2 (\xi_\mu - \eta_\mu)^2} (1 - k_\mu)^2 \right]
    \times [\det(-i\tau_{\mu\nu})]^{-d/2} \prod_{\alpha} \left[ \prod_{n=1}^{\infty} (1 - k_\alpha^n)^{-d} \prod_{n=2}^{\infty} (1 - k_\alpha^n)^2 \right] \qquad (259)

where kμ are the multipliers, ξμ and ημ are the fixed points of the generators
of the Schottky group.

9 From Dual Models to String Theory

The approach presented in the previous sections is a real bottom-up approach.
The experimental data were the driving force in the construction of the
Veneziano model and of its generalization to N external legs. The rest of the
work that we have described above consisted in deriving its properties. The
result is, except for a tachyon, a fully consistent quantum-relativistic model
that was a source of fascination for those who worked in the field. Although
the model grew out of S-matrix theory where the scattering amplitude is the
only observable object, while the action or the Lagrangian does not have a central
role, some people nevertheless started to investigate what was the underlying
microscopic structure that gave rise to such a consistent and beautiful
model. It turned out, as we know today, that this underlying structure is that
of a quantum-relativistic string. However, the process of connecting the dual
resonance model (actually two of them, the generalized Veneziano and the
Shapiro–Virasoro model) to string theory took several years from the original
idea to a complete and convincing proof of the conjecture. The original
conjecture was independently formulated by Nambu [20, 65], Nielsen [66] and
Susskind [21].18 If we look at it in retrospect, it was at that time a fantastic
idea that shows the enormous physical intuition of those who formulated it.
On the other hand, it took several years to digest it before one was able to
derive from it all the deep features of the dual resonance model. Because of
this, the idea that the underlying structure was that of a relativistic string,
did not really influence most of the research in the field up to 1973. Let me
try to explain why.
A common feature of the work of [20, 66, 21] is the suggestion that the
infinite number of oscillators, that one got through the factorization of the dual
resonance model, naturally comes out from a two-dimensional free Lagrangian
for the coordinate X μ (τ, σ) of a one-dimensional string, that is an obvious
generalization of the Lagrangian that one writes for the coordinate X μ (τ ) of
a point-like object in the proper time gauge

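L \sim \frac{1}{2} \frac{dX}{d\tau} \cdot \frac{dX}{d\tau} \implies L \sim \frac{1}{2} \left[ \frac{dX}{d\tau} \cdot \frac{dX}{d\tau} - \frac{dX}{d\sigma} \cdot \frac{dX}{d\sigma} \right] \qquad (260)

Since this theory is conformally invariant, the Virasoro operators were also
constructed together with their algebra. In this very first formulation, however,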
the Virasoro generators Ln were just the generators associated to the con-
formal symmetry of the string world-sheet Lagrangian given in (260) as in
any conformal ﬁeld theory. It was not clear at all why they should imply the
gauge conditions found by Virasoro or, in modern terms, why they should be
zero classically. The basic ingredient to solve this problem was provided by

18
See also [67].

Nambu [65] and Goto [68], who wrote the non-linear Lagrangian proportional
to the area spanned by the string in the external target space. They proceeded
in analogy with the point particle and wrote the following action:

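S \sim \int \sqrt{-d\sigma_{\mu\nu} \, d\sigma^{\mu\nu}} \qquad (261)

where

d\sigma_{\mu\nu} = \frac{\partial X_\mu}{\partial \zeta^\alpha} \frac{\partial X_\nu}{\partial \zeta^\beta} \, d\zeta^\alpha \wedge d\zeta^\beta = \frac{\partial X_\mu}{\partial \zeta^\alpha} \frac{\partial X_\nu}{\partial \zeta^\beta} \, \epsilon^{\alpha\beta} \, d\sigma \, d\tau \qquad (262)

Xμ(σ, τ) is the string coordinate and ζ⁰ = τ and ζ¹ = σ are the coordinates of
the string world sheet. ε^{αβ} is an antisymmetric tensor with ε^{01} = 1. Inserting
(262) into (261) and fixing the proportionality constant, one gets the Nambu–
Goto action [65, 68]

S = -cT \int_{\tau_i}^{\tau_f} d\tau \int_0^\pi d\sigma \sqrt{(\dot{X} \cdot X')^2 - \dot{X}^2 X'^2} \qquad (263)

where

\dot{X}^\mu \equiv \frac{\partial X^\mu}{\partial \tau} , \qquad X'^\mu \equiv \frac{\partial X^\mu}{\partial \sigma} \qquad (264)

and T ≡ 1/(2πα′) is the string tension, that replaces the mass appearing in the
case of a point particle. In this formulation, the string Lagrangian is invariant
under any reparametrization of the world-sheet coordinates σ and τ and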
not only under the conformal transformations. This, in fact, implies that the
two-dimensional world-sheet energy–momentum tensor of the string is actu-
ally zero as we will show later on. But it took still a few years to connect
the Nambu–Goto action to the properties of the dual resonance model. In
the meantime, an analogue model was formulated [69] that reproduced the
tree and loop amplitudes of the generalized Veneziano model. This approach
anticipated by several years the path integral derivation of dual amplitudes.
It was very closely related to the functional integral formulation of [70].
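However, one needed to wait until 1973 for the paper of Goddard,
Goldstone, Rebbi and Thorn [71], where the Nambu–Goto action was correctly
treated, all its consequences were derived, and it became completely
clear that the structure underlying the dual resonance model was that of a
quantum-relativistic string. The equations of motion for the string were derived
from the action in (263) by imposing $\delta S = 0$ for variations such that
$\delta X^\mu(\tau_i) = \delta X^\mu(\tau_f) = 0$. One gets

\[
\delta S = \int_{\tau_i}^{\tau_f} d\tau \left[ \int_0^\pi d\sigma \left( - \frac{\partial}{\partial \tau} \frac{\partial L}{\partial \dot{X}^\mu} - \frac{\partial}{\partial \sigma} \frac{\partial L}{\partial X'^\mu} \right) \delta X^\mu + \left. \frac{\partial L}{\partial X'^\mu}\, \delta X^\mu \right|_{\sigma=0}^{\sigma=\pi} \right] = 0 \tag{265}
\]

where $L$ is the Lagrangian in (263). Since $\delta X^\mu$ is arbitrary, from (265) one
gets the Euler–Lagrange equation of motion

\[
\frac{\partial}{\partial \tau} \frac{\partial L}{\partial \dot{X}^\mu} + \frac{\partial}{\partial \sigma} \frac{\partial L}{\partial X'^\mu} \equiv \frac{\partial}{\partial \zeta^\alpha} \frac{\partial L}{\partial \left( \frac{\partial X^\mu}{\partial \zeta^\alpha} \right)} = 0 \tag{266}
\]
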
and the boundary conditions

\[
\frac{\partial L}{\partial X'^\mu} = 0 \quad \text{or} \quad \delta X_\mu = 0 \quad \text{at } \sigma = 0, \pi \tag{267}
\]

for an open string and

\[
X^\mu(\tau, 0) = X^\mu(\tau, \pi) \tag{268}
\]

for a closed string. In the case of an open string, the first kind of boundary
condition in (267) corresponds to Neumann boundary conditions, while
the second corresponds to Dirichlet boundary conditions. Only the Neumann
boundary conditions preserve the translation invariance of the theory and,
therefore, they were the ones mostly used in the early days of string theory.
It must be stressed, however, that Dirichlet boundary conditions were also
discussed and used in those early days, for constructing models with
off-shell states [72].
From (263), one can compute the momentum density along the string

\[
P_\mu \equiv \frac{\partial L}{\partial \dot{X}^\mu} = cT\, \frac{\dot{X}_\mu X'^2 - X'_\mu (\dot{X} \cdot X')}{\sqrt{(\dot{X} \cdot X')^2 - \dot{X}^2 X'^2}} \tag{269}
\]

and obtain the following constraints between the dynamical variables $X^\mu$ and $P^\mu$:

\[
c^2 T^2 X'^2 + P^2 = X' \cdot P = 0 \tag{270}
\]

They are a consequence of the reparametrization invariance of the string
Lagrangian. Because of this, one can choose the orthonormal gauge specified
by the conditions

\[
\dot{X}^2 + X'^2 = \dot{X} \cdot X' = 0 \tag{271}
\]

which nowadays is called the conformal gauge. In this gauge (269) becomes

\[
P_\mu = cT \dot{X}_\mu, \qquad \frac{\partial L}{\partial X'^\mu} = -cT X'_\mu \tag{272}
\]

and therefore the equation of motion in (266) becomes

\[
\ddot{X}_\mu - X''_\mu = 0 \tag{273}
\]

while the boundary condition in (267) becomes

\[
X'_\mu(\sigma = 0, \pi) = 0 \tag{274}
\]
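The constraints in (270) follow algebraically from the momentum density (269), and this can be checked numerically. The short Python sketch below evaluates (269) for one arbitrarily chosen pair of tangent vectors and computes the two combinations appearing in (270). The metric signature (−, +, +, +), the choice cT = 1, the sample vectors and the helper names `dot` and `momentum_density` are assumptions made only for this illustration; they are not taken from [71].

```python
import math


def dot(a, b):
    """Minkowski product with signature (-, +, +, ...); an assumed convention."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def momentum_density(xdot, xprime, cT=1.0):
    """Momentum density of (269) for given tangent vectors Xdot and X'."""
    d_dp = dot(xdot, xprime)
    d_dd = dot(xdot, xdot)
    d_pp = dot(xprime, xprime)
    root = math.sqrt(d_dp ** 2 - d_dd * d_pp)
    return [cT * (xd * d_pp - xp * d_dp) / root for xd, xp in zip(xdot, xprime)]


# Illustrative tangent vectors: Xdot is timelike, X' is spacelike.
xdot = [1.0, 0.3, 0.1, 0.0]
xprime = [0.2, 0.5, -0.4, 0.7]

P = momentum_density(xdot, xprime)
constraint_1 = dot(P, P) + dot(xprime, xprime)  # c^2 T^2 X'^2 + P^2 with cT = 1
constraint_2 = dot(xprime, P)                   # X' . P
```

Both `constraint_1` and `constraint_2` vanish up to floating-point rounding for any choice of vectors that keeps the square root real, which is the content of (270).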

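The most general solution of the equation of motion and of the boundary
conditions can be written as follows:

\[
X^\mu(\tau, \sigma) = q^\mu + 2\alpha' p^\mu \tau + i \sqrt{2\alpha'} \sum_{n=1}^{\infty} \left[ a_n^\mu e^{-in\tau} - a_n^{\dagger\mu} e^{in\tau} \right] \frac{\cos n\sigma}{\sqrt{n}} \tag{275}
\]

for an open string and

\[
\begin{aligned}
X^\mu(\tau, \sigma) = q^\mu + 2\alpha' p^\mu \tau
&+ \frac{i}{2} \sqrt{2\alpha'} \sum_{n=1}^{\infty} \left[ \tilde{a}_n^\mu e^{-2in(\tau+\sigma)} - \tilde{a}_n^{\dagger\mu} e^{2in(\tau+\sigma)} \right] \frac{1}{\sqrt{n}} \\
&+ \frac{i}{2} \sqrt{2\alpha'} \sum_{n=1}^{\infty} \left[ a_n^\mu e^{-2in(\tau-\sigma)} - a_n^{\dagger\mu} e^{2in(\tau-\sigma)} \right] \frac{1}{\sqrt{n}}
\end{aligned} \tag{276}
\]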

for a closed string. This procedure really shows that, starting from the Nambu–
Goto action, one can choose a gauge (the orthonormal or conformal gauge)
where the equation of motion of the string becomes the two-dimensional
D’Alembert equation in (273). Furthermore, the invariance under reparamet-
rization of the Nambu–Goto action implies that the two-dimensional energy–
momentum tensor is identically zero at the classical level (see (271)).
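As a cross-check of the open-string solution (275), the following Python sketch evaluates one component of a classical, three-mode truncation and tests by finite differences that it obeys the wave equation (273) and the Neumann condition (274). Classically the operators $a_n^\dagger$ are replaced by complex conjugates of the amplitudes $a_n$. The numerical values of $\alpha'$, $q$, $p$ and $a_n$, the step size and the function names are illustrative assumptions, not quantities fixed by the text.

```python
import cmath
import math

ALPHA_PRIME = 0.5   # illustrative value of alpha'
Q, P = 0.3, 1.2     # illustrative centre-of-mass position and momentum
MODES = {1: 0.7 + 0.2j, 2: -0.4 + 0.5j, 3: 0.1 - 0.3j}  # illustrative amplitudes a_n


def x_open(tau, sigma):
    """One component of the classical open-string solution (275), three modes only."""
    total = Q + 2 * ALPHA_PRIME * P * tau
    for n, a in MODES.items():
        osc = a * cmath.exp(-1j * n * tau) - a.conjugate() * cmath.exp(1j * n * tau)
        # i * (z - conj(z)) is real, so taking the real part discards only rounding noise.
        total += (1j * math.sqrt(2 * ALPHA_PRIME) * osc).real * math.cos(n * sigma) / math.sqrt(n)
    return total


def wave_residual(tau, sigma, h=1e-3):
    """Central-difference estimate of the left-hand side of (273)."""
    d2_tau = (x_open(tau + h, sigma) - 2 * x_open(tau, sigma) + x_open(tau - h, sigma)) / h ** 2
    d2_sigma = (x_open(tau, sigma + h) - 2 * x_open(tau, sigma) + x_open(tau, sigma - h)) / h ** 2
    return d2_tau - d2_sigma


def sigma_derivative(tau, sigma, h=1e-3):
    """Central-difference estimate of X' entering the boundary condition (274)."""
    return (x_open(tau, sigma + h) - x_open(tau, sigma - h)) / (2 * h)
```

Evaluating `wave_residual` at any interior point, and `sigma_derivative` at $\sigma = 0$ and $\sigma = \pi$, gives values compatible with zero within rounding error.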
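As with the Lorentz gauge in QED, the orthonormal gauge does not fix the gauge
completely. We can still perform reparametrizations that leave us in the
conformal gauge: these are the conformal transformations. Introducing the
variable $z = e^{i\tau}$, the generators of the conformal transformations for the open
string can be written as follows:

\[
L_n = \frac{1}{2\pi i} \oint dz\, z^{n+1} \left( - \frac{1}{4\alpha'} \left( \frac{\partial X^\mu}{\partial z} \right)^2 \right) = \frac{1}{2} \sum_{m=-\infty}^{\infty} \alpha_{n-m} \cdot \alpha_m = 0 \tag{277}
\]

where

\[
\alpha_n^\mu = \begin{cases}
\sqrt{n}\, a_n^\mu & \text{if } n > 0 \\
\sqrt{2\alpha'}\, p^\mu & \text{if } n = 0 \\
\sqrt{|n|}\, a_{|n|}^{\dagger\mu} & \text{if } n < 0
\end{cases} \tag{278}
\]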

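They are zero as a consequence of (270), which in the conformal gauge become
(271). In the case of a closed string we get instead

\[
\tilde{L}_n = \frac{1}{2\pi i} \oint dz\, z^{n+1} \left( - \frac{1}{\alpha'} \left( \frac{\partial X^\mu}{\partial z} \right)^2 \right) = 0 \tag{279}
\]

\[
L_n = \frac{1}{2\pi i} \oint d\bar{z}\, \bar{z}^{n+1} \left( - \frac{1}{\alpha'} \left( \frac{\partial X^\mu}{\partial \bar{z}} \right)^2 \right) = 0 \tag{280}
\]

In terms of the harmonic oscillators introduced in (276) we get

\[
L_n = \frac{1}{2} \sum_{m=-\infty}^{\infty} \alpha_m \cdot \alpha_{n-m} = 0 ; \qquad \tilde{L}_n = \frac{1}{2} \sum_{m=-\infty}^{\infty} \tilde{\alpha}_m \cdot \tilde{\alpha}_{n-m} = 0 \tag{281}
\]

where for the non-zero modes we have used the convention in (278), while the
zero mode is given by

\[
\alpha_0^\mu = \tilde{\alpha}_0^\mu = \sqrt{2\alpha'}\, \frac{p^\mu}{2} \tag{282}
\]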
In conclusion, reparametrization invariance implies that
the Virasoro generators are classically identically zero. When we quantize the
theory, we cannot, and also do not need to, impose that they vanish at
the operator level. Instead, they are imposed as conditions characterizing the
physical states:

\[
\langle \mathrm{Phys} | L_n | \mathrm{Phys} \rangle = \langle \mathrm{Phys} | (L_0 - 1) | \mathrm{Phys} \rangle = 0 ; \qquad n \neq 0 \tag{283}
\]

These equations are satisfied if we require

\[
L_n | \mathrm{Phys} \rangle = (L_0 - 1) | \mathrm{Phys} \rangle = 0 \tag{284}
\]

The extra factor −1 in the previous equations comes from the normal ordering
as explained in (198).
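The condition $(L_0 - 1)|\mathrm{Phys}\rangle = 0$ fixes the mass of each open-string level. Using (277) and (278), the normal-ordered $L_0$ can be written as $\alpha' p^2 + N$, where $N = \sum_{n \geq 1} n\, a_n^\dagger \cdot a_n$ counts the oscillator level, so that $\alpha' M^2 = N - 1$ with $M^2 = -p^2$. The Python sketch below tabulates this relation. The mostly-plus metric convention (which gives $M^2 = -p^2$) and the function name `mass_squared` are assumptions introduced here for illustration.

```python
from fractions import Fraction


def mass_squared(level, alpha_prime=Fraction(1)):
    """M^2 = (N - 1) / alpha' for an open-string state at oscillator level N."""
    return Fraction(level - 1) / Fraction(alpha_prime)


# Lowest levels in units of 1/alpha': level 0 is tachyonic, level 1 is massless.
spectrum = {level: mass_squared(level) for level in range(4)}
```

The level-0 and level-1 entries are the states that reappear later as the tachyon and the massless photon coupled to the string through vertex operators.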
The authors of [71] further specified the gauge by fixing it completely.
They introduced the light-cone gauge specified by imposing the condition

\[
X^+ = 2\alpha' p^+ \tau \tag{285}
\]

where

\[
X^\pm = \frac{X^0 \pm X^{d-1}}{\sqrt{2}}, \qquad X_\pm = \frac{X_0 \pm X_{d-1}}{\sqrt{2}}. \tag{286}
\]

In this gauge the only physical degrees of freedom are the transverse ones.
In fact, the components along the directions $0$ and $d-1$ can be expressed in
terms of the transverse ones by inserting (285) into the constraints in (271)
and getting

\[
\dot{X}^- = \frac{1}{4\alpha' p^+} \left( \dot{X}_i^2 + X_i'^2 \right), \qquad X'^- = \frac{1}{2\alpha' p^+}\, \dot{X}_i \cdot X'_i \tag{287}
\]

which, up to a constant of integration, completely determine $X^-$ as a function
of $X^i$. In terms of oscillators we get
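\[
\alpha_n^+ = 0 ; \qquad \sqrt{2\alpha'}\, \alpha_n^- = \frac{1}{2p^+} \sum_{m=-\infty}^{\infty} \alpha_{n-m}^i \alpha_m^i ; \qquad n \neq 0 \tag{288}
\]

for an open string and

\[
\alpha_n^+ = \tilde{\alpha}_n^+ = 0 , \qquad n \neq 0 \tag{289}
\]

together with

\[
\sqrt{2\alpha'}\, \alpha_n^- = \frac{1}{2p^+} \sum_{m=-\infty}^{\infty} \alpha_{n-m}^i \alpha_m^i , \qquad
\sqrt{2\alpha'}\, \tilde{\alpha}_n^- = \frac{1}{2p^+} \sum_{m=-\infty}^{\infty} \tilde{\alpha}_{n-m}^i \tilde{\alpha}_m^i \tag{290}
\]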

in the case of a closed string.
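The way the light-cone gauge solves the constraints can be illustrated with a small numerical sketch. Given arbitrary transverse components $\dot{X}^i$ and $X'^i$, the Python code below builds $\dot{X}^-$ and $X'^-$ from (287), sets $\dot{X}^+ = 2\alpha' p^+$ and $X'^+ = 0$ as implied by (285), and evaluates the two constraints in (271). The light-cone scalar product $A \cdot B = -A^+ B^- - A^- B^+ + A^i B^i$ (mostly-plus metric), the sample numbers and the function names are assumptions made for this illustration.

```python
def lightcone_dot(a, b):
    """a.b = -a^+ b^- - a^- b^+ + a^i b^i; each vector is (plus, minus, [transverse])."""
    return -a[0] * b[1] - a[1] * b[0] + sum(x * y for x, y in zip(a[2], b[2]))


def solve_minus_components(xdot_t, xprime_t, p_plus, alpha_prime):
    """Build Xdot and X' in the light-cone gauge from transverse data, using (285) and (287)."""
    dd = sum(x * x for x in xdot_t)
    pp = sum(x * x for x in xprime_t)
    dp = sum(x * y for x, y in zip(xdot_t, xprime_t))
    xdot_minus = (dd + pp) / (4 * alpha_prime * p_plus)
    xprime_minus = dp / (2 * alpha_prime * p_plus)
    xdot = (2 * alpha_prime * p_plus, xdot_minus, list(xdot_t))
    xprime = (0.0, xprime_minus, list(xprime_t))
    return xdot, xprime


# Illustrative transverse data (three transverse directions only, for brevity).
xdot, xprime = solve_minus_components(
    [0.4, -1.1, 0.3], [0.9, 0.2, -0.5], p_plus=1.5, alpha_prime=0.5
)
constraint_sum = lightcone_dot(xdot, xdot) + lightcone_dot(xprime, xprime)
constraint_cross = lightcone_dot(xdot, xprime)
```

Both combinations vanish up to rounding for any transverse input, which shows that no independent information is carried by the $+$ and $-$ components.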

This shows that the physical states are described only by the transverse
oscillators having only d − 2 components. Those transverse oscillators corre-
spond to the transverse DDF operators that we have discussed in Sect. 6. The
authors of [71] also constructed the Lorentz generators only in terms of the
transverse oscillators and they showed that they satisfy the correct Lorentz
algebra only if the space–time dimension is d = 26. In this way the spectrum
of the dual resonance model was completely reproduced starting from the
Nambu–Goto action if d = 26! On the other hand, the choice of d = 26 is a
necessity if we want to keep the Lorentz invariance!
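The value d = 26 can also be motivated by a quick heuristic count, which is not the Lorentz-algebra argument of [71] but a zero-point-energy estimate: each of the $d - 2$ transverse oscillator towers contributes $\frac{1}{2} \sum_{n} n$ to $L_0$, and regularizing the divergent sum with $\zeta(-1) = -1/12$ gives a normal-ordering constant $(d - 2)/24$. Requiring this constant to equal the shift 1 appearing in (284) gives d = 26. The Python sketch below only reproduces this arithmetic; the function names are hypothetical and the zeta-function regularization is an assumption of the heuristic.

```python
from fractions import Fraction

ZETA_MINUS_ONE = Fraction(-1, 12)  # regularized value assigned to 1 + 2 + 3 + ...


def normal_ordering_constant(d):
    """Heuristic shift of L_0 from d - 2 transverse towers: -(d - 2)/2 * zeta(-1)."""
    return -Fraction(d - 2, 2) * ZETA_MINUS_ONE


def critical_dimension(shift=Fraction(1)):
    """Solve (d - 2)/24 = shift for d."""
    return 24 * shift + 2
```

With the shift equal to 1, `critical_dimension()` returns 26, in agreement with the dimension singled out by the Lorentz algebra.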
Immediately after this, the interaction was also included either by adding
a term describing the interaction of the string with an external gauge field [73]
or by using a functional formalism [74, 75].
In the following we will give some detail only of the first approach for the
case of an open string. One way to describe the string interaction is to add
to the free string action a term that describes the interaction of
the string with an external field:

\[
S_{INT} = \int d^D y\, \Phi_L(y)\, J_L(y) \tag{291}
\]

where $\Phi_L(y)$ is the external field and $J_L$ is the current generated by the string.
The index $L$ stands for possible Lorentz indices, which are saturated in order to
have a Lorentz invariant action.
In the case of a point particle, such an interaction term will not give any
information on the self-interaction of a particle.
In the case of a string, instead, we will see that $S_{INT}$ describes the
interaction among strings, because the external fields that can consistently
interact with a string are only those that correspond to the various states of
the string, as will become clear in the discussion below.
This is a consequence of the fact that, for the sake of consistency, we must
put the following restrictions on $S_{INT}$:
• It must be a well-defined operator in the space spanned by the string
oscillators.
• It must preserve the invariances of the free string theory. In particular, in
the “conformal gauge” it must be conformal invariant.
• In the case of an open string, the interaction occurs at the end point of
a string (say at $\sigma = 0$). This follows from the fact that two open strings
interact by attaching to each other at their end points.
The simplest scalar current generated by the motion of a string can be written
as follows:

\[
J(y) = \int d\tau \int d\sigma\, \delta(\sigma)\, \delta^{(d)}\left[ y^\mu - X^\mu(\tau, \sigma) \right] \tag{292}
\]

where $\delta(\sigma)$ has been introduced because the interaction occurs at the end of
the string. For the sake of simplicity, we omit the coupling constant $g$
in (292).
Inserting (292) into (291) and using a plane wave for the scalar external field,
$\Phi(y) = e^{ik \cdot y}$, we get the following interaction:

\[
S_{INT} = \int d\tau\, : e^{ik \cdot X(\tau, 0)} : \tag{293}
\]

where the normal ordering has been introduced in order to have a well-defined
operator. The invariance of (293) under a conformal transformation $\tau \to w(\tau)$
requires the following identity:

\[
S_{INT} = \int d\tau\, : e^{ik \cdot X(\tau, 0)} : \; = \int dw\, : e^{ik \cdot X(w, 0)} : \tag{294}
\]

or, in other words, that

\[
: e^{ik \cdot X(\tau, 0)} : \;\Longrightarrow\; w'(\tau)\, : e^{ik \cdot X(w, 0)} : \tag{295}
\]

This means that the integrand in (294) must be a conformal field with conformal
dimension equal to one, and this happens only if $\alpha' k^2 = 1$. The external
field then corresponds to the tachyonic lowest state of the open string. Another
simple current generated by the string is given by

\[
J_\mu(y) = \int d\tau \int d\sigma\, \delta(\sigma)\, \dot{X}_\mu(\tau, \sigma)\, \delta^{(d)}\left( y - X(\tau, \sigma) \right) \tag{296}
\]

Inserting (296) into (291) we get

\[
S_{INT} = \int d\tau\, \dot{X}_\mu(\tau, 0)\, \epsilon^\mu\, e^{ik \cdot X(\tau, 0)} \tag{297}
\]

if we use a plane wave $\Phi_\mu(y) = \epsilon_\mu e^{ik \cdot y}$. The vertex operator in (297) is
conformal invariant only if

\[
k^2 = \epsilon \cdot k = 0 \tag{298}
\]

and, therefore, the external vector must be the massless photon state of the
string. We can generalize this procedure to an arbitrary external field and
the result is that we can only use external fields that correspond to on-shell
physical states of the string.
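The two on-shell conditions found above, $\alpha' k^2 = 1$ for the scalar vertex (293) and $k^2 = \epsilon \cdot k = 0$ for the vector vertex (297), are easy to test on explicit momenta. The Python sketch below does so for a few sample vectors. The mostly-plus metric, the tolerance, the sample momenta and the function names are assumptions introduced only for this illustration.

```python
import math


def minkowski_dot(a, b):
    """Scalar product with signature (-, +, +, ...); an assumed convention."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def tachyon_vertex_allowed(k, alpha_prime, tol=1e-12):
    """Condition below (295): the scalar vertex has dimension one only if alpha' k^2 = 1."""
    return abs(alpha_prime * minkowski_dot(k, k) - 1.0) < tol


def photon_vertex_allowed(k, eps, tol=1e-12):
    """Condition (298): k^2 = 0 and eps . k = 0."""
    return abs(minkowski_dot(k, k)) < tol and abs(minkowski_dot(eps, k)) < tol


# Illustrative momenta and polarizations.
tachyon_k = (1.0, math.sqrt(3.0), 0.0, 0.0)  # k^2 = 2, so alpha' k^2 = 1 for alpha' = 1/2
photon_k = (1.0, 0.0, 0.0, 1.0)              # null momentum
transverse_eps = (0.0, 1.0, 0.0, 0.0)        # orthogonal to photon_k
bad_eps = (0.0, 0.0, 0.0, 1.0)               # not orthogonal to photon_k
```

With these samples, `tachyon_vertex_allowed(tachyon_k, 0.5)` and `photon_vertex_allowed(photon_k, transverse_eps)` hold, while `photon_vertex_allowed(photon_k, bad_eps)` fails because the polarization is not transverse.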
This procedure has been extended in [73] to the case of external gravitons
by introducing in the Nambu–Goto action a target space metric and obtaining
the vertex operator for the graviton that is a massless state in the closed string
theory. Remember that, at that time, this could have been done only with the
Nambu–Goto action because the σ-model action was introduced only in 1976
first for the point particle [76] and then for the string [77]. As in the case of
the photon, it turned out that the external field corresponding to the graviton
was required to be on-shell. This condition is the precursor of the equations
of motion that one obtains from the σ-model action requiring the vanishing
of the β-function [78].
One can then compute the probability amplitude for the emission of a
number of string states corresponding to the various external fields, from an
initial string state to a final one. This amplitude gives precisely the N -point
amplitude that we discussed in the previous sections [73]. In particular, one
learns that, in the case of the open string, the Fubini–Veneziano field is just
the string coordinate computed at σ = 0:

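\[
Q^\mu(z) \equiv X^\mu(z, \sigma = 0) ; \qquad z = e^{i\tau} \tag{299}
\]

In the case of a closed string we get instead

\[
Q^\mu(z, \bar{z}) \equiv X^\mu(z, \bar{z}) ; \qquad z = e^{2i(\tau - \sigma)} , \quad \bar{z} = e^{2i(\tau + \sigma)} \tag{300}
\]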

Finally, let me mention that with the functional approach Mandelstam [74]
and Cremmer and Gervais [79] computed the interaction between three arbi-
trary physical string states and reproduced in this way the coupling of three
DDF states given in (202) and obtained in [37] by using the operator for-
malism. At this point it was completely clear that the structure underlying
the generalized Veneziano model was that of an open relativistic string, while
that underlying the Shapiro–Virasoro model was that of a closed relativistic
string. Furthermore, these two theories are not independent because, if one
starts from an open string theory, one gets automatically closed strings by
loop corrections.

10 Conclusions
In this contribution, we have gone through the developments that led from
the construction of the dual resonance model to the bosonic string theory,
trying as much as possible to include all the necessary technical details. This
is because we believe that they are not only important from a historical point
of view, but are also still part of the formalism that one uses today in many
string calculations. We have tried to be as complete and objective as possible,
but it could very well be that some of those who participated in the research
of those years will not agree with some, or even many, of the statements we
have made. We apologize to those we have forgotten to mention or have not
mentioned as they would have liked.
Finally, after having gone through the developments of these years, my
thoughts go to Sergio Fubini who shared with me and Gabriele many of the
ideas described here and who is deeply missed, and to my friends from Flo-
rence, Naples and Turin for a pleasant collaboration in many papers discussed
here.

Acknowledgements

I thank R. Marotta and I. Pesando for a critical reading of the manuscript.

References
1. G.F. Chew: The Analytic S Matrix (W.A. Benjamin, Inc., New York, 1966);
R.J. Eden, P.V. Landshoﬀ, D.I. Olive, J.C. Polkinghorne: The Analytic S Matrix
(Cambridge University Press, Cambridge, 1966) 60, 97
2. R. Dolen, D. Horn, C. Schmid: Phys. Rev. 166, 1768 (1968);
C. Schmid: Phys. Rev. Lett. 20, 689 (1968) 60
3. H. Harari: Phys. Rev. Lett. 22, 562 (1969);
J.L. Rosner: Phys. Rev. Lett. 22, 689 (1969) 60
4. G. Veneziano: Nuovo Cimento A 57, 190 (1968) 60
5. M.A. Virasoro: Phys. Rev. 177, 2309 (1969) 62
6. M.A. Virasoro: Phys. Rev. D 1, 2933 (1970) 62, 78, 79, 82
7. A. Neveu, J.H. Schwarz: Nucl. Phys. B 31, 86 (1971);
Phys. Rev. D 4, 1109 (1971) 63, 92
8. P. Ramond: Phys. Rev. D 3, 2415 (1971) 63, 92
9. C. Lovelace: Phys. Lett. B 28, 265 (1968);
J. Shapiro: Phys. Rev. 179, 1345 (1969) 63
10. P.H. Frampton: Phys. Lett. B 41, 364 (1972) 63
11. V. Alessandrini, D. Amati, M. Le Bellac, D. Olive: Phys. Rep. C 1, 269 (1971);
G. Veneziano: Phys. Rep. C 9, 199 (1974);
S. Mandelstam: Phys. Rep. C 13, 259 (1974);
C. Rebbi: Phys. Rep. C 12, 1 (1974);
J. Scherk: Rev. Mod. Phys. 47, 123 (1975) 64
12. F. Gliozzi: Lett. Nuovo Cimento 2, 1160 (1970) 64
13. K. Bardakçi, H. Ruegg: Phys. Rev. 181, 1884 (1969);
C.G. Goebel, B. Sakita: Phys. Rev. Lett. 22, 257 (1969);
Chan Hong-Mo, T.S. Tsun: Phys. Lett. B 28, 485 (1969);
Z. Koba, H.B. Nielsen: Nucl. Phys. B 10, 633 (1969) 65, 67
14. K. Bardakçi, H. Ruegg: Phys. Lett. B 28, 671 (1969);
M.A. Virasoro: Phys. Rev. Lett. 22, 37 (1969) 65, 66
15. Z. Koba, H.B. Nielsen: Nucl. Phys. B 12, 517 (1969) 68
16. H.M. Chan, J.E. Paton: Nucl. Phys. B 10, 516 (1969) 71
17. S. Fubini, G. Veneziano: Nuovo Cimento A 64, 811 (1969) 71, 72
18. K. Bardakçi, S. Mandelstam: Phys. Rev. 184, 1640 (1969) 71, 72
19. S. Fubini, D. Gordon, G. Veneziano: Phys. Lett. B 29, 679 (1969) 71, 72
20. Y. Nambu: Proc. Int. Conf. on Symmetries and Quark Models, Wayne State
University 1969 (Gordon and Breach, New York, 1970), p. 269 71, 72, 107
21. L. Susskind: Nuovo Cimento A 69, 457 (1970); Phys. Rev. Lett. 23, 545 (1969)
71, 72, 107
22. J. Shapiro: Phys. Lett. B 33, 361 (1970) 71
23. S. Fubini, G. Veneziano: Nuovo Cimento A 67, 29 (1970) 72, 73
24. F. Gliozzi: Lettere al Nuovo Cimento 2, 846 (1969) 75
25. C.B. Chiu, S. Matsuda, C. Rebbi: Phys. Rev. Lett. 23, 1526 (1969);
C.B. Thorn: Phys. Rev. D 1, 1963 (1970) 75
26. S. Fubini, G. Veneziano: Ann. Phys. 63, 12 (1971) 80, 82
27. C. Lovelace: Phys. Lett. B 34, 500 (1971) 83, 102
28. E. Del Giudice, P. Di Vecchia: Nuovo Cimento A 5, 90 (1971);
M. Yoshimura: Phys. Lett. B 34, 79 (1971) 83, 86, 88
29. E. Del Giudice, P. Di Vecchia: Nuovo Cimento A 70, 579 (1970) 88, 92
30. P. Campagna, S. Fubini, E. Napolitano, S. Sciuto: Nuovo Cimento A 2, 911
(1971) 88, 89
31. E. Del Giudice, P. Di Vecchia, S. Fubini: Ann. Phys. 70, 378 (1972) 90
32. R.C. Brower: Phys. Rev. D 6, 1655 (1972) 92
33. P. Goddard, C.B. Thorn: Phys. Lett. B 40, 235 (1972) 92
34. F. Gliozzi, J. Scherk, D. Olive: Phys. Lett. B 65, 282 (1976); Nucl. Phys. B
122, 253 (1977) 93
35. L. Brink, H.B. Nielsen: Phys. Lett. B 45, 332 (1973) 93
36. F. Gliozzi: unpublished;
see also P. Di Vecchia: in Many Degrees of Freedom in Particle Physics, ed. by
H. Satz (Plenum Publishing Corporation, New York, 1978), p. 493 93
37. M. Ademollo, E. Del Giudice, P. Di Vecchia, S. Fubini: Nuovo Cimento A 19,
181 (1974) 94, 114
38. J. Scherk: Nucl. Phys. B 31, 222 (1971) 96
39. N. Nakanishi: Prog. Theor. Phys. 48, 355 (1972);
P.H. Frampton, K.C. Wali: Phys. Rev. D 8, 1879 (1973) 96
40. A. Neveu, J. Scherk: Nucl. Phys. B 36, 155 (1973) 96
41. A. Neveu, J.L. Gervais: Nucl. Phys. B 46, 381 (1972) 96
42. P. Di Vecchia, A. Lerda, L. Magnea, R. Marotta, R. Russo: Nucl. Phys. B 469,
235 (1996) 97
43. T. Yoneya: Prog. Theor. Phys. 51, 1907 (1974) 97
44. K. Kikkawa, B. Sakita, M. Virasoro: Phys. Rev. 184, 1701 (1969);
K. Bardakçi, M.B. Halpern, J. Shapiro: Phys. Rev. 185, 1910 (1969);
D. Amati, C. Bouchiat, J.L. Gervais: Lett. al Nuovo Cimento 2, 399 (1969);
A. Neveu, J. Scherk: Phys. Rev. D 1, 2355 (1970);
G. Frye, L. Susskind: Phys. Lett. B 31, 537 (1970);
D.J. Gross, A. Neveu, J. Scherk, J.H. Schwarz: Phys. Rev. D 2, 697 (1970) 98
45. L. Brink, D. Olive: Nucl. Phys. B 56, 253 (1973); Nucl. Phys. B 58, 237 (1973)
98
46. E. Cremmer, J. Scherk: Nucl. Phys. B 50, 222 (1972);
L. Clavelli, J. Shapiro: Nucl. Phys. B 57, 490 (1973);
L. Brink, D.I. Olive, J. Scherk: Nucl. Phys. B 61, 173 (1973) 99, 102
47. M. Ademollo, A. D’Adda, R. D’Auria, F. Gliozzi, E. Napolitano, S. Sciuto,
P. Di Vecchia: Nucl. Phys. B 94, 221 (1975);
J. Shapiro: Phys. Rev. D 11, 2937 (1975) 99, 103, 104
48. D.I. Olive, J. Scherk: Phys. Lett. B 44, 296 (1973) 102
49. M. Ademollo, A. D’Adda, R. D’Auria, E. Napolitano, P. Di Vecchia, F. Gliozzi,
S. Sciuto: Nucl. Phys. B 77, 189 (1974) 103
50. M. Kalb, P. Ramond: Phys. Rev. D 9, 2273 (1974) 103
51. E. Cremmer, J. Scherk: Nucl. Phys. B 72, 117 (1974) 103
52. A. D’Adda, R. D’Auria, E. Napolitano, P. Di Vecchia, F. Gliozzi, S. Sciuto:
Phys. Lett. B 68, 81 (1977) 103
53. J. Shapiro: Phys. Rev. D 5, 1945 (1975) 104
54. S. Sciuto: Lett. al Nuovo Cimento 2, 411 (1969);
A. Della Selva, S. Saito: Lett. al Nuovo Cimento 4, 689 (1970) 104
55. L. Caneschi, A. Schwimmer, G. Veneziano: Phys. Lett. B 30, 356 (1969);
L. Caneschi, A. Schwimmer: Lett. al Nuovo Cimento 3, 213 (1970) 104
56. C. Lovelace: Phys. Lett. B 32, 490 (1970) 104
57. D.I. Olive: Nuovo Cimento A 3, 399 (1971) 104
58. I. Drummond: Nuovo Cimento A 67, 71 (1970);
G. Carbone, S. Sciuto: Lett. al Nuovo Cimento 3, 246 (1970);
L. Kosterlitz, D. Wray: Lett. al Nuovo Cimento 3, 491 (1970);
D. Collop: Nuovo Cimento A 1, 217 (1971);
L.P. Yu: Phys. Rev. D 2, 1010 (1970); Phys. Rev. D 2, 2256 (1970);
E. Corrigan, C. Montonen: Nucl. Phys. B 36, 58 (1972);
J.L. Gervais, B. Sakita: Phys. Rev. D 4, 2291 (1971) 104
59. C. Lovelace: Phys. Lett. B 32, 703 (1970) 106
60. V. Alessandrini: Nuovo Cimento A 2, 321 (1971) 106
61. D. Amati, V. Alessandrini: Nuovo Cimento A 4, 793 (1971) 106
62. P. Di Vecchia, M. Frau, A. Lerda, S. Sciuto: Phys. Lett. B 199, 49 (1987);
J.L. Petersen, J. Sidenius: Nucl. Phys. B 301, 247 (1988) 106
63. S. Mandelstam: in Uniﬁed String Theories, ed. by M. Green, D. Gross (World
Scientiﬁc, Singapore), p. 46 106
64. P. Di Vecchia, F. Pezzella, M. Frau, K. Hornfeck, A. Lerda, S. Sciuto: Nucl.
Phys. B 322, 317 (1989) 106
65. Y. Nambu: Lectures at the Copenhagen Symposium, 1970 (unpublished) 107, 108
66. H.B. Nielsen: Paper submitted to the 15th Int. Conf. on High Energy Physics,
Kiev, 1970; Nordita preprint (1969) 107
67. T. Takabayasi: Prog. Theor. Phys. 44, 1117 (1970);
O. Hara: Progr. Theor. Phys. 46, 1549 (1971);
L.N. Chang, J. Mansouri: Phys. Rev. D 5, 2535 (1972);
J. Mansouri, Y. Nambu: Phys. Lett. B 39, 357 (1972);
M. Minami: Prog. Theor. Phys. 48, 1308 (1972) 107
68. T. Goto: Prog. Theor. Phys. 46, 1560 (1971) 108
69. D. Fairlie, H.B. Nielsen: Nucl. Phys. B 20, 637 (1970) and 22, 525 (1970) 108
70. C.S. Hsue, B. Sakita, M.A. Virasoro: Phys. Rev. 2, 2857 (1970);
J.L. Gervais, B. Sakita: Phys. Rev. D 4, 2291 (1971) 108
71. P. Goddard, J. Goldstone, C. Rebbi, C. Thorn: Nucl. Phys. B 56, 109 (1973) 108, 111, 112
72. E.F. Corrigan, D.B. Fairlie: Nucl. Phys. B 91, 527 (1975) 109
73. M. Ademollo, A. D’Adda, R. D’Auria, P. Di Vecchia, F. Gliozzi, R. Musto,
E. Napolitano, F. Nicodemi, S. Sciuto: Nuovo Cimento A 21, 77 (1974) 112, 114
74. S. Mandelstam: Nucl. Phys. B 64, 205 (1973) 112, 114
75. J.L. Gervais, B. Sakita: Phys. Rev. Lett. 30, 716 (1973) 112
76. L. Brink, P. Di Vecchia, P. Howe, S. Deser, B. Zumino: Phys. Lett. B 64, 435
(1976) 114
77. L. Brink, P. Di Vecchia, P. Howe: Phys. Lett. B 65, 471 (1976);
S. Deser, B. Zumino: Phys. Lett. B 65, 369 (1976) 114
78. C. Lovelace: Phys. Lett. B 136, 75 (1984);
C.G. Callan, E.J. Martinec, M.J. Perry, D. Friedan: Nucl. Phys. B 262, 593
(1985) 114
79. E. Cremmer, J.L. Gervais: Nucl. Phys. 76, 209 (1974) 114
The Beginning of String Theory:
A Historical Sketch

P. Di Vecchia (1) and A. Schwimmer (2)

(1) Nordita, Blegdamsvej 17, 2100 Copenhagen Ø, Denmark
(2) Weizmann Institute, Rehovot 76100, Israel

Abstract. In this note we follow the historical development of the ideas that led
to the formulation of String Theory. We start from the inspired guess of Veneziano
and its extension to the scattering of N scalar particles, then we describe how the
study of its factorization properties made it possible to identify the physical spectrum, and
finally we discuss how the critical values of the intercept of the Regge trajectory and
of the space–time dimension were fixed to be $\alpha_0 = 1$ and $d = 26$.

1 Introduction
The purpose of this note is to follow the historical development of the ideas
that led to the formulation of String Theory. As we will discuss, the story
consists of a remarkable succession of inspired insights first by Veneziano who
guessed the form of the four-point function [1], followed by its extension to
an arbitrary number of external legs. At this point the dual resonance model
was constructed, and it took some time to analyse its properties and check its
consistency through its factorization properties that allowed one to identify
the full target Hilbert space of physical states and its critical dimension by
the use of various consistency conditions. The natural interpretation of the
structure uncovered was that of a string propagating in Minkowski space–time.
We want to stress that all this was achieved without the use of a La-
grangian formulation, but by implementing the basic principles of S-matrix
directly on a scattering amplitude in a model containing an infinite number
of zero width resonances, where the sum of resonances in one channel rep-
resents correctly the resonances in the other channel. As a result, the basic
framework of Perturbative String Theory at the operational level was well
understood by 1971. Further progress was achieved through the discovery of
the Superstring and Space–time Supersymmetry, which led to tachyon-free
theories. Later, some basic concepts used before at a heuristic level, like the
origin of the first-class constraints necessary for making the spectrum unitary
and Lorentz invariant, were put on firm ground starting from the action
used in [2].

P. Di Vecchia and A. Schwimmer: The Beginning of String Theory: A Historical Sketch, Lect.
Notes Phys. 737, 119–136 (2008)
DOI 10.1007/978-3-540-74233-6_5
© Springer-Verlag Berlin Heidelberg 2008

Further conceptual developments, like the connection between world sheet
conformal invariance and target space equations of motion, were only
partially understood, and had to wait for the first String Revolution to
get a more complete formulation. Finally, the relation between different
String Theories through dualities was the result of the second String
Revolution.
In this note we will concentrate on the developments during the period
1969–1972.
As we mentioned above three components entering the basic structure of
perturbative string theory, i.e.:
• the string world sheet
• the physical spectrum and vertex operators
• the critical dimension
were all correctly identified by the end of 1972, and in this short note
we will limit ourselves to the description of the evolution of their under-
standing. We will not cover other very important developments during the
same period, like, e.g. fermionic degrees of freedom on the world sheet (the
Neveu–Schwarz–Ramond formalism [3, 4]), compact degrees of freedom on
the world sheet leading to internal symmetries [5] and String Field Theory in
its light-cone formulation [6].
We will follow the evolution of the ideas, which led to the understanding
of the three basic concepts above, outlining the most important conceptual
jumps. Just the essential formulae will be given, referring for the detailed
derivations to the accompanying paper [7]. We will try to put in perspective
the evolution of the ideas by translating the guesses and insights in today’s lan-
guage and understanding, as presented in the standard modern textbooks [8].
We start with a brief reminder of the developments on which the three break-
throughs mentioned above were based.

2 Prehistory: the Discovery of the Dual Scattering Amplitudes
The ﬁrst step which started the evolution of String Theory was the Veneziano
Formula [1]. By a historical accident Veneziano’s formula refers to what is
today Open String Theory. The analogous formula for Closed String Theory
guessed by Virasoro [9] was generalized [10] and analysed later [11] when the
basic structure of the open string was already understood. We will follow the
historical path and discuss only Open String Theory.
The formula guessed by Veneziano corresponds to what we call today the
2 to 2 scattering amplitude of the bosonic open string tachyons:
\[
A(s, t, u) = A(s, t) + A(s, u) + A(t, u), \tag{1}
\]

where

\[
A(s, t) = \frac{\Gamma(-\alpha(s))\, \Gamma(-\alpha(t))}{\Gamma(-\alpha(s) - \alpha(t))} = \int_0^1 dx\, x^{-\alpha(s)-1} (1 - x)^{-\alpha(t)-1} , \tag{2}
\]

and

\[
\alpha(s) = \alpha_0 + \alpha' s \tag{3}
\]

is a linearly rising Regge trajectory.
The appearance of the free parameter α0 instead of the usual value 1 will
be discussed below. Moreover, in the Veneziano amplitude, as written above,
there is no requirement that the external particles are the spin 0 particles on
the leading trajectory α(s). Nevertheless, we will continue to call the external
particles “tachyons” because they have negative mass squared if we require
them to be on the leading trajectory for α0 = 1.
In Veneziano’s original approach the amplitude was supposed to describe
scattering of mesons due to strong interactions. The physical principles guid-
ing Veneziano in his guess were the usual analyticity and crossing sym-
metry requirements of the scattering amplitudes and a new principle, the
Dolen–Horn–Schmid (DHS) duality [12].
DHS duality was abstracted from a phenomenological study of hadronic
reactions and stated that the scattering amplitude could be decomposed al-
ternatively into a set of s-channel or t-channel poles, each decomposition be-
ing complete, and containing, by analytic continuation, the other. This was
expressed by the pictorial identity [13, 14] presented in Fig. 1.
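The DHS property of the amplitude (2) can be made concrete with a short numerical experiment. Expanding $(1 - x)^{-\alpha(t)-1}$ in powers of $x$ inside the integral in (2) and integrating term by term gives a sum of poles $1/(n - \alpha(s))$ in the s-channel alone, and this sum already reproduces the full amplitude, which is equally a sum of t-channel poles. The Python sketch below compares the Gamma-function form, a Simpson estimate of the integral, and the s-channel pole sum. The sample values $\alpha(s) = -2.5$ and $\alpha(t) = -3.5$ (chosen so that the integral and the series converge), the number of steps and terms, and the function names are assumptions made for this illustration.

```python
import math


def veneziano_gamma(a_s, a_t):
    """A(s, t) of (2) via Gamma functions; a_s = alpha(s), a_t = alpha(t)."""
    return math.gamma(-a_s) * math.gamma(-a_t) / math.gamma(-a_s - a_t)


def veneziano_integral(a_s, a_t, steps=2000):
    """Composite Simpson estimate of the integral in (2); assumes alpha(s), alpha(t) < -1."""
    h = 1.0 / steps

    def f(x):
        return x ** (-a_s - 1.0) * (1.0 - x) ** (-a_t - 1.0)

    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def veneziano_s_channel_sum(a_s, a_t, terms=2000):
    """Sum over s-channel poles 1 / (n - alpha(s)); converges for alpha(t) < 0."""
    total, coeff = 0.0, 1.0  # coeff = (a_t + 1)(a_t + 2)...(a_t + n) / n!
    for n in range(terms):
        total += coeff / (n - a_s)
        coeff *= (a_t + 1.0 + n) / (n + 1.0)
    return total


A_S, A_T = -2.5, -3.5  # illustrative values of alpha(s) and alpha(t)
gamma_form = veneziano_gamma(A_S, A_T)
integral_form = veneziano_integral(A_S, A_T)
pole_sum = veneziano_s_channel_sum(A_S, A_T)
```

The three numbers agree within the accuracy of the quadrature and of the truncated series, and exchanging the two arguments of `veneziano_gamma` leaves the result unchanged, so the same amplitude is equally a sum over t-channel poles.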
In today’s language it is qualitatively clear that the DHS requirement is
fulfilled if the amplitude is related to the correlator of four vertex operators
in a conformal field theory. The two different decompositions which make
explicit the pole structure can be represented graphically by two “duality
diagrams” related by a continuous deformation, and correspond to the two
possible decompositions in conformal blocks of the conformal correlator. This
happens if the conformal block is translated into poles in Lorentz invariants
constructed from the space–time momenta. This basic feature of String The-
ory to which DHS duality led, is very far from its phenomenological origin.

Fig. 1. The duality diagram contains both s- and t-channel poles


Ironically, it seems that present hadron scattering data [15] are not anymore
in agreement with DHS duality, which was a feature related to the energy
range available at the time.
For the N -point function the DHS duality is generalized by requiring that,
for a fixed ordering of the external particles, the amplitude can be represented
by any one of the deformations of the respective N -point duality diagram. As
described in [7], one way to understand the mechanism by which A(s, t) sat-
isfies the DHS duality is to study its integral representation, and identify the
two mutually exclusive integration domains, which produce the poles in the
s- and t-channel, respectively. This is generalized for the N -point function
by writing it as a sum of terms, each one corresponding to a given ordering
of the external legs. Each term has an (N − 3)-dimensional integral representation.
The different deformations of the duality diagram are obtained from
the singular contributions to the integral representation coming from mutually
exclusive (N − 3)-dimensional integration regions.
Based on this idea the unique N -point function was constructed in [16]:

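\[
B_N = \prod_{i=2}^{N-2} \left[ \int_0^1 du_i\, u_i^{-\alpha(s_i)-1} (1 - u_i)^{\alpha_0 - 1} \right] \prod_{i=2}^{N-2} \prod_{j=i+1}^{N-1} (1 - x_{ij})^{2\alpha' p_i \cdot p_j} , \tag{4}
\]

where

\[
s_i \equiv s_{1i} ; \qquad x_{ij} = u_i u_{i+1} \cdots u_{j-1} , \tag{5}
\]

\[
s_{ij} = -(p_i + p_{i+1} + \cdots + p_j)^2 , \tag{6}
\]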

and $p_i$, $i = 1, 2, \ldots, N$, are the external momenta. We require that the external
scalar lies on the leading trajectory, as explained in [7]. Starting from
this expression, Koba and Nielsen [17] put it in the more symmetric $SL(2, R)$
invariant form (see [7] for details)

\[
B_N = \int_{-\infty}^{\infty} dV(z) \prod_{(i,j)} (z_i, z_{i+1}, z_j, z_{j+1})^{-\alpha(s_{ij})-1} , \tag{7}
\]

where

\[
dV(z) = \frac{\prod_{i=1}^{N} \left[ \theta(z_i - z_{i+1})\, dz_i \right]}{\prod_{i=1}^{N} (z_i - z_{i+2})\; dV_{abc}} ; \qquad
dV_{abc} = \frac{dz_a\, dz_b\, dz_c}{(z_b - z_a)(z_c - z_b)(z_a - z_c)} , \tag{8}
\]

and the variables $z_i$ are integrated along the real axis in a cyclically ordered
way, $z_1 \ge z_2 \ge \cdots \ge z_N$, with $a$, $b$ and $c$ arbitrarily chosen.
The SL(2, R) group mentioned above acts on the integration variables zi
as a Möbius transformation:
\[
z_i \to \frac{\alpha z_i + \beta}{\gamma z_i + \delta} ; \qquad i = 1 \ldots N ; \qquad \alpha\delta - \beta\gamma = 1 . \tag{9}
\]

Using the transformation in (9) for a fixed ordering, one can relate
amplitudes corresponding to circularly permuted kinematical invariants and
then, adding terms for different orderings, one can show that all the require-
ments of crossing symmetry are fulfilled. As we understand it today, the
Möbius transformations are related to globally defined reparametrizations of
the disk which leave invariant the metric up to a conformal factor. This was
the first manifestation of the conformal symmetry underlying the world sheet
action of String Theory, which played an essential role in the understanding
of the theory.
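The role of the Möbius transformations (9) can be illustrated with exact rational arithmetic. Under (9) every difference $z_a - z_b$ is rescaled by $1/[(\gamma z_a + \delta)(\gamma z_b + \delta)]$, so that ratios in which each point appears equally often in numerator and denominator are left unchanged. The Python sketch below checks both statements for one transformation. The particular cross-ratio convention used in `cross_ratio` is an assumption (the text refers to [7] for the definition of the four-point symbol in (7)), as are the sample points, the sample parameters and the function names.

```python
from fractions import Fraction


def mobius(z, a, b, c, d):
    """z -> (a z + b) / (c z + d) with a d - b c = 1, as in (9)."""
    assert a * d - b * c == 1
    return (a * z + b) / (c * z + d)


def cross_ratio(z1, z2, z3, z4):
    """One common convention for the anharmonic ratio of four points (an assumption)."""
    return (z1 - z3) * (z2 - z4) / ((z1 - z4) * (z2 - z3))


def rescaled_difference(za, zb, c, d):
    """Expected image of za - zb under (9): (za - zb) / ((c za + d)(c zb + d))."""
    return (za - zb) / ((c * za + d) * (c * zb + d))


# Illustrative ordered points on the real axis and an SL(2, R) element.
points = [Fraction(5), Fraction(3), Fraction(1), Fraction(-2)]
params = (2, 1, 1, 1)  # (alpha, beta, gamma, delta) with alpha*delta - beta*gamma = 1
images = [mobius(z, *params) for z in points]
```

Because `Fraction` arithmetic is exact, the equality of `cross_ratio(*images)` and `cross_ratio(*points)` can be tested without any tolerance. The rescaling of differences is also what makes the combination $dV_{abc}$ in (8) invariant.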
The expression in (7) which was guessed as following from the principles
mentioned above, coincides (for α0 = 1) with the tree-level scattering ampli-
tude of N open string tachyons, obtained from calculating the open string
path integral on a disk with the insertion of N -tachyon vertex operators after
mapping the disk to the upper half plane.
The Koba–Nielsen form of the N -point function was the starting point for
the crucial developments which started in 1969. There was a general feeling
among the workers in the field that the set of N -point functions represent
the result of a unique and consistent underlying theory. While attempts to
use the functions to fit hadronic data continued, the search for this theory
became the major theoretical challenge. One aspect which became immedi-
ately obvious was the necessity to “unitarize” the theory: the presence of zero
width poles in the N -point functions showed that the amplitudes should be
considered, at best, as “tree diagrams” of an underlying, unknown theory and
“loop” diagrams should be added to them. A first attempt [18] to write loop
diagrams was by using again a generalized form of the DHS principle, requir-
ing a singularity structure of the amplitudes consistent with deformations of
duality diagrams involving loops. The existence of rather involved integrals,
found in [18], which fulfil the constraints, reinforced the belief in the existence
of an underlying theory. On the other hand, the ambiguities of the amplitudes
so constructed, originating in what we today call “the measure factors”, and the
impossibility of verifying unitarity, reinforced the necessity of understanding
the basic underlying theory.
The approaches used were conditioned by the development of the theoreti-
cal tools at the time. Though the path integral formulation of Quantum Field
Theory existed, it was not well developed as a calculational tool. This was
the case especially for gauge theories where the correct treatment of gauge
symmetries achieved a few years later by Faddeev–Popov did not exist. As a
consequence, Lagrangian methods based on an action were not very precise,
and involved some guess work at different stages. On the other hand, operato-
rial methods were well developed, and through the Gupta–Bleuler treatment
of QED as a prototype even the correct impositions of constraints correspond-
ing to a gauge fixing (at least for the case when the ghosts are decoupled in
today’s language) were understood. We can roughly divide the search for
the underlying theory into the “Lagrangian approach” and the “operatorial
approach”.
Since we will discuss later in more detail the operatorial approach we start
with a description of the evolution of the “Lagrangian” ideas. Researchers
following this path tried to guess the underlying Lagrangian which would
lead to the N-point functions. This line was opened by Nambu, Nielsen and
Susskind. Nambu [19] and Susskind [20] proposed that the underlying dy-
namics of the dual N -point functions corresponds to a generalization of the
Schwinger proper time formalism where a relativistic string is propagating
in proper time. The equation of motion satisfied by the string coordinates
was the two-dimensional D’Alembert equation following from a linearized La-
grangian. Using plausible arguments they obtained expressions similar to the
N -point (tree) amplitudes.
Then Nielsen [21] and immediately after Fairlie and Nielsen [22] used
this linearized Lagrangian for constructing the “analogue model”. The ba-
sic observation was that the momentum dependence of the integrands in
the Koba–Nielsen amplitudes, and their loop generalizations is related to
the energy of two-dimensional electrostatic problems where the momenta are
“charges” located on the boundary. Then the electrostatic problem is solved
on a disk for the tree amplitude, or on a higher genus two-dimensional surface
described by the duality diagram corresponding to the respective loop ampli-
tude. We understand this result today as a simple consequence of the fact that
the ikX(σ) factor in the exponential of the vertex operator acts as a source
for the string coordinates whose propagator is the two-dimensional Coulomb
kernel. Though the measure was not correctly reproduced, the “analogue
model” is important since it is the first appearance of the two-dimensional
world sheet in a mathematical role, rather than just as a picture in the du-
ality diagram. This model is the precursor of the path integral formulation
of string theory that was understood completely only later. Furthermore, the
“analogue model” motivated the generalization [10] of the Virasoro ampli-
tude [9], and therefore the formulation of the Closed String Theory by simply
putting electrostatic sources on a sphere instead of on the boundary of
a disk.
A non-linear action, proportional to the area spanned by the string, gen-
eralizing the non-linear one for the point-like particle, was also proposed by
Nambu and Goto in [23, 24]. But the consequences of its non-linear structure,
implying the invariance under an arbitrary reparametrization of the world
sheet coordinates, were only clarified a few years later with the treatment of
[25] that provides a rigorous derivation of the properties of the generalized
Veneziano model, though our present understanding of string theory is mostly
based on the action used in [2].
The second approach that we will describe in detail in the next section,
is based instead on the construction of an operator formalism that made
transparent the most important properties of the model, such as the spectrum of
physical states and their scattering amplitudes, and that historically has been
essential for relating it to string theory in a completely satisfactory way.

3 The String World Sheet Through Factorization of the N-point Amplitudes

The basic observation used in order to uncover the underlying theory in the
operatorial approach was that, having a set of N -point functions satisfying
DHS duality, crossing symmetry and tree-level analyticity, does not define a
consistent set of S-matrix elements, unless the different poles in the various
channels can be shown to come from the same set of physical states, the
residues being factorized. This means that one should find a set of states, and
a set of three-point couplings between these states, such that any expansion
of a given ordering contribution to any of the N -point functions is reproduced
by the same set of states and couplings.
During 1969 there was an intensive activity in this programme of finding
the universal set of states and couplings leading to factorization. We will de-
scribe in words the main steps in historical succession, and then describe the
complete solution as formulated in [26] at the end of 1969. Through an explicit
analysis of the residues of a given pole in [27, 28], it was shown that factor-
ization can be achieved by having an infinite number of intermediate states.
An essential step was made in [29], where it was proven that the spectrum is
the Fock space of an infinite number of harmonic oscillators. The authors of
[29] gave general formulae for the masses of the states in terms of occupation
numbers, and for the couplings of the external tachyons to arbitrary pairs
of states in terms of matrix elements of vertex operators depending on the
harmonic oscillator degrees of freedom. An important result of [29] was the
discovery of the existence of the Hagedorn temperature in the theory, a basic
feature characterizing String Theories.
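To make the oscillator spectrum concrete, the sketch below counts how many Fock-space states sit at each level N, the eigenvalue of the sum over n of n times the occupation number, when every mode n carries d independent oscillators. It expands the generating function ∏_{n≥1} (1 − q^n)^{−d}, which is a standard counting device rather than something taken from [29]; the function name `level_degeneracies` is hypothetical. The count refers to the full Fock space, before any physical-state conditions are imposed. The rapid growth of these numbers with N is the feature that underlies the Hagedorn temperature.

```python
def level_degeneracies(d, max_level):
    """Return [g_0, ..., g_max_level], the number of Fock-space states per level.

    g_N is the coefficient of q**N in prod_{n>=1} (1 - q**n)**(-d),
    i.e. d independent oscillators for every mode number n.
    """
    coeffs = [0] * (max_level + 1)
    coeffs[0] = 1
    for n in range(1, max_level + 1):
        # Multiply the truncated series by 1/(1 - q**n), once per direction.
        for _ in range(d):
            for k in range(n, max_level + 1):
                coeffs[k] += coeffs[k - n]
    return coeffs


# One oscillator direction reproduces the ordinary partition numbers.
print(level_degeneracies(1, 5))
# d = 26 directions, first few levels of the full (unconstrained) Fock space.
print(level_degeneracies(26, 3))
```

The in-place forward update is the usual way to multiply a truncated power series by 1/(1 − q^n); repeating it d times per mode avoids any binomial bookkeeping.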
We describe now the solution of the factorization problem following [26].
One starts by defining the operator $Q_\mu(z)$ as

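\[
Q_\mu(z) = Q^{(+)}(z) + Q^{(0)}(z) + Q^{(-)}(z),
\tag{10}
\]
where
\[
\begin{gathered}
Q^{(+)} = i\sqrt{2\alpha'} \sum_{n=1}^{\infty} \frac{a_n}{\sqrt{n}}\, z^{-n} ; \qquad
Q^{(-)} = -i\sqrt{2\alpha'} \sum_{n=1}^{\infty} \frac{a_n^{\dagger}}{\sqrt{n}}\, z^{n} ; \\
Q^{(0)} = \hat{q} - 2i\alpha' \hat{p} \log z,
\end{gathered}
\tag{11}
\]
and the vertex operators by
\[
V(z; p) = \, :e^{ip\cdot Q(z)}: \, \equiv e^{ip\cdot Q^{(-)}(z)}\, e^{ip\cdot\hat{q}}\, e^{+2\alpha' p\cdot\hat{p} \log z}\, e^{ip\cdot Q^{(+)}(z)} .
\tag{12}
\]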

Then it was shown [26] that the integrand of the Koba–Nielsen N -point func-
tion is related to the Fock space vacuum matrix element of the product of
vertex operators
\[
\langle 0, 0 | \prod_{i=1}^{N} V(z_i, p_i) | 0, 0 \rangle
= \prod_{i>j} (z_i - z_j)^{2\alpha' p_i \cdot p_j} \, (2\pi)^4 \delta^{(4)}\Big( \sum_{i=1}^{N} p_i \Big).
\tag{13}
\]

In order to obtain exactly the Koba–Nielsen expression one has to deal carefully
with the fixing of three of the z variables. This is done by extracting the
z dependence of the vertex operators using the identity
\[
z^{L_0} V(1, p) z^{-L_0} = V(z, p) z^{\alpha_0},
\tag{14}
\]
where $L_0$ is the operator
\[
L_0 = \alpha' \hat{p}^2 + \sum_{n=1}^{\infty} n\, a_n^{\dagger} \cdot a_n .
\tag{15}
\]
Choosing three consecutive values of $z_i$ to be fixed:
\[
z_a = z_1 = \infty ; \qquad z_b = z_2 = 1 ; \qquad z_c = z_N = 0,
\tag{16}
\]
the Koba–Nielsen amplitude can be rewritten in the operator language as
\[
A_N \equiv \langle 0, p_1 | V(1, p_2) D V(1, p_3) \ldots D V(1, p_{N-1}) | 0, p_N \rangle ,
\tag{17}
\]

where the “propagator” D is equal to
\[
D = \int_0^1 dx \, x^{L_0 - 1 - \alpha_0} (1 - x)^{\alpha_0 - 1}
= \frac{\Gamma(L_0 - \alpha_0)\, \Gamma(\alpha_0)}{\Gamma(L_0)} ,
\tag{18}
\]
and the states (using what we understand today as “operator-state correspondence”)
are defined as
\[
\lim_{z \to 0} V(z; p) | 0, 0 \rangle \equiv | 0; p \rangle ; \qquad
\langle 0; 0 | \lim_{z \to \infty} z^{2\alpha_0} V(z; p) = \langle 0, p | .
\tag{19}
\]
The $z_i$ integrations of the Koba–Nielsen formula which were absorbed in the
definition in (18) are translated into integrations over the “proper times” $x_i$
appearing in the propagators.
This provides an explicit solution to the factorization. In fact, one can
insert between each V and D a complete set of states of the space spanned by
the harmonic oscillators (Fock space) appearing in Q(z). Since D is diagonal
in the basis of occupation numbers, poles will appear at α(s) = 0, 1, 2, ..., with
factorized residues related in a universal fashion to the matrix elements of the
vertex operators.
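As a worked illustration of the last statement, consider an intermediate state of total momentum P and oscillator level N, the eigenvalue of $\sum_n n\, a_n^{\dagger} \cdot a_n$. The sketch assumes the linear trajectory $\alpha(s) = \alpha_0 + \alpha' s$ with $s = -P^2$; this sign convention is an assumption made here and is not stated in this section.

```latex
% On such a state L_0 in (15) takes the value alpha' P^2 + N, so that
\[
L_0 - \alpha_0 = \alpha' P^2 + N - \alpha_0 = N - \alpha(s) .
\]
% In (18), D can be singular only where Gamma(L_0 - alpha_0) is, i.e. at
% L_0 - alpha_0 = -k:
\[
N - \alpha(s) = -k \iff \alpha(s) = N + k , \qquad k = 0, 1, 2, \ldots
\]
% For alpha_0 = 1 the integral in (18) is elementary:
\[
D = \int_0^1 dx \, x^{L_0 - 2} = \frac{1}{L_0 - 1} = \frac{1}{N - \alpha(s)} ,
\]
% a single pole at alpha(s) = N for each level N = 0, 1, 2, ...
```

In the case $\alpha_0 = 1$ the zeros of $1/\Gamma(L_0)$ cancel all but one of the candidate poles, which is why the closed form has a single pole per level.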
This solution to the factorization problem was the crucial step in the de-
velopment of String Theory since, from now on, the N -point functions were
clearly related to a theory in which the set of space–time fields is labelled
by the states in the Fock space on which the $Q_\mu$ fields are realized. The $Q_\mu$
fields are, of course, the open string coordinate fields $X^\mu(\sigma, \tau)$ in d space–time
dimensions for μ = 0, 1, 2, ..., d − 1, computed at the endpoint of the
string coordinate σ = 0, where z is related to the other string coordinate τ
by $z = e^{i\tau}$. They are Heisenberg operators: their dependence on the world
sheet coordinates σ and τ follows from the fact that they are solutions of an
equation of motion following from a free linearized Lagrangian. However, as
described above, the Lagrangian was not used in the derivation, the various
expressions being obtained by a rewriting of the N-point amplitudes. While
the linear spacing between the poles of the Veneziano formula was suggestive
of some underlying harmonic oscillator-type structure, only the solution of
the factorization problem unveiled the true structure of the theory, i.e. an
infinite number of oscillators assembled into a set of fields Qμ living on a
two-dimensional world sheet.
The vertex operators for the emission of tachyons represent insertions on
the boundary (for open string theories) of the two-dimensional world sheet. Of
course the relation in (17) is the way in which scattering amplitudes are ob-
tained in String Theory, starting from the matrix element of products of vertex
operators. The historical way was exactly the opposite, i.e. given the Koba–
Nielsen formula, the operators whose matrix elements reproduce the formula
were correctly guessed identifying the Hilbert space. Now the fulfillment of
the DHS requirements became natural: the Qμ are massless two-dimensional
fields defining a two-dimensional conformal theory and the N -point functions
are related to integrals of correlators of the vertex operators in the SL(2, R)
invariant vacuum. The integration over the z variables required by the Koba–Nielsen
formula, in order to produce the poles in $\alpha(s_{ij})$, is related to the integration
over the “proper times” after the mapping of the disk into the upper
half plane. The fact that this particular expression is special to a particular
gauge (at that time called the orthonormal gauge) was already understood
during the first period of String Theory, but it became more transparent and
rigorous after Polyakov’s seminal paper [2].
Having the decomposition of the amplitudes in “vertices” and “propaga-
tors” allows the calculation of loop diagrams by gluing them and taking traces
for the loops. The loop diagrams are necessary for producing an S-matrix consistent
with unitarity. In this way, one obtained already in 1970 the correct
expression in the Schottky parametrization of quantities defined on a Riemann
surface, such as the period matrix, the abelian differentials and the Green’s
functions [30, 31, 32]. However, the correct measure of integration in the multiloops
was not known at the time, since it requires the understanding of ghost
contributions. It is clear now that these operatorial expressions in the covari-
ant gauge are the same as those obtained by performing the path integral of
the string Lagrangian over the appropriate world sheet.
We know today that the restrictions on the operators V and D which
can be used follow from a correct gauge fixing of the string Lagrangian. In
the absence of a Lagrangian, again the correct restrictions on V and D were
found by a rather tortuous path (from today’s point of view), which we
now describe.
The expressions used above differ from the ones used in the modern formulation
in two respects:
i) The vertex operators used were defined for a conformal weight $\alpha' k^2$.
This value, related to the mass squared of the open string tachyon, is given
in terms of the arbitrary parameter $\alpha_0$: $\alpha_0 = \alpha' k^2$.
ii) The dimension d of space–time, i.e. the number of string coordinates,
was left free.

4 The Virasoro Conditions

We start this section by reminding the reader how the two points mentioned at
the end of the previous section are understood today. The starting point today
for the bosonic string theory is the σ-model action (the action used in [2]) that,
at the classical level, couples the string coordinates to the two-dimensional
world sheet metric in a diffeomorphism-invariant and Weyl-invariant manner.
Then the requirement that these two “gauge symmetries” (diffeomorphism
and Weyl) are not anomalous in the quantum theory fixes the space–time
dimension to the value d = 26 for the bosonic string.
Once the two “gauge symmetries” are respected at the quantum level,
the standard Faddeev–Popov procedure can be applied, in principle in an
arbitrary gauge, and a consistent quantization can be performed giving the
physical states/operators in the gauge chosen. The states/operators in differ-
ent gauges are isomorphic leading to the same results when gauge-invariant
correlators are calculated. In particular, by choosing a covariant gauge, the
Lorentz invariance of the theory follows automatically, while the unitarity of
the theory is not obvious. On the other hand, by choosing an explicitly uni-
tary gauge (the light-cone gauge) the unitarity of the theory is completely
manifest, while the Lorentz invariance has to be checked. In the covariant
gauge the physical states correspond to operators with dimension 1 for the
open string and (1, 1) for the closed string. This fixes the leading Regge tra-
jectories to have intercept α0 = 1 or α0 = 2 for the open and closed strings,
respectively. In a “physical” gauge, such as the light-cone gauge, the states which
are now “transverse” correspond to cohomologically equivalent families in the
covariant gauge.
We have described above the present procedure for quantizing the bosonic
string. However it must also be said that, in practice, one can invert the logic
outlined above and fix the Regge intercept and the space–time dimension in
the light-cone gauge by requiring that the Lorentz algebra be obeyed at the
quantum level. This is, in fact, the way followed in the early days of string
theory when the procedure described above was not yet known and this, of
course, has led to the above values of the critical dimension and intercepts.
Actually, to be more precise, the point of view expressed above has been
essential, when we quantize the bosonic string in a covariant gauge, only in
order to compute the correct integration measure for multiloop amplitudes. It
has not played, in practice, any significant role in the light-cone gauge where
the Regge intercept and the space–time dimension have been correctly determined
by imposing the closure of the Lorentz algebra.
We want to stress here, once more, that none of the ideas based on
the Becchi–Rouet–Stora–Tyutin (BRST)-invariant approach (including the
σ-model action) were known in the early days of string theory. The Nambu–
Goto action was known, but it was not really known how to use it for deriving
all the properties obtained using the operator formalism. One had to use al-
ternative methods which amazingly enough led to the correct results. This is
what we are going to explain below.
But before we proceed, let us notice that, from the present point of view,
the description given in the previous section involved just a conformal theory of
d massless fields. Of course, in such a theory any vertex operator is legal, and
the correlators of vertex operators on the SL(2, R)-invariant vacuum have the
block decomposition properties even after integrating over their “proper time”
coordinates. Interestingly, even without the understanding that a consistent
String Theory should be the gauge fixed version of a Weyl anomaly-free theory,
the way to make the theory consistent by restricting i) and ii) was correctly
guessed. This was done by looking for some “gauge” conditions that could
help in decoupling the negative norm states, required by manifest Lorentz
covariance, from the spectrum of the physical states, pretty much in analogy
with what was known to happen in QED. We start by discussing the way in
which the correct gauge conditions were discovered.
In [27] it was pointed out that the residues of the poles on which the
amplitude is factorized are not positive definite simply due to the presence
of the time components of the oscillators, which in the operator formulation
lead to a negative contribution to the scalar product. As a possible way out
from this inconsistency of the theory, linear relations between the residues
were uncovered leading to the decoupling of some Fock space states from the
amplitude. The basic driving idea was that the situation here was analogous
to the Gupta–Bleuler quantization of QED. As in QED the Lorentz condition
was imposed to characterize the subspace of the physical states, here also
some “gauge” conditions, that later on were understood to be due to some
first class constraints, were imposed on the spectrum which would eliminate
the negative norm states.
In this way, one managed to get the correct result without having to fix the
gauge of the diffeomorphisms and Weyl invariance and to introduce the b, c
ghost system. This has been possible because the ghosts are decoupled from
the string coordinates. As a consequence, the non-trivial BRST cohomology
can be realized in terms of the string coordinates only, the ghost ground state
not being excited and, for tree diagrams at least, one can calculate consistently
using the string coordinates restricted by the first class constraints.
The correct final answer was reached following a rather tortuous, but phys-
ical and at that time intuitive path.
We start by describing the linear relations [33] mentioned above. In the operatorial
formalism there is a realization [34, 35] of the Möbius transformations
in (9) in terms of the infinite set of harmonic oscillators. This SL(2, R) algebra
has a simple action on the vertex operators and annihilates the vacuum. Its
generators $L_1$, $L_0$, $L_{-1}$ are
\[
L_0 = \alpha' \hat{p}^2 + \sum_{n=1}^{\infty} n\, a_n^{\dagger} \cdot a_n ; \qquad
L_1 = \sqrt{2\alpha'}\, \hat{p} \cdot a_1 + \sum_{n=1}^{\infty} \sqrt{n(n+1)}\, a_{n+1} \cdot a_n^{\dagger}
\tag{20}
\]
and
\[
L_{-1} = L_1^{\dagger} = \sqrt{2\alpha'}\, \hat{p} \cdot a_1^{\dagger} + \sum_{n=1}^{\infty} \sqrt{n(n+1)}\, a_{n+1}^{\dagger} \cdot a_n .
\tag{21}
\]
We recognize, of course, the central extension free SL(2, R) subalgebra of the
Virasoro algebra, which acts as a symmetry on an arbitrary conformal field
theory (CFT) correlator, provided it is evaluated on the SL(2, R)-invariant
vacuum. We remind the reader, however, that the algebra of the Virasoro
operators and, more generally, two-dimensional conformal field theories, were
not known at the time. Their understanding was a result of the developments
we are describing. The SL(2, R) subalgebra generates the Möbius group of
the finite transformations of z:
\[
z' = \frac{\alpha z + \beta}{\gamma z + \delta} ,
\tag{22}
\]
where αδ − βγ = 1. The vertex operators have the standard transformation
properties under the Möbius group corresponding to the weight $L_0 = \alpha' p^2$.
In the expectation value in (17) the information that $z_a$ is fixed appears only
through the “bra” vector on the l.h.s. of the matrix element. Therefore, the
r.h.s. has a residual symmetry, the subgroup of the Möbius group which leaves
the fixed points $z_b = 1$, $z_c = 0$ unchanged:
\[
z' = \frac{z}{1 - \alpha (z - 1)} = z + \alpha (z^2 - z) + O(\alpha^2) .
\tag{23}
\]
This subgroup is generated by

\[
W_1 = L_1 - L_0 .
\tag{24}
\]
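The expansion in (23) and the choice of generator in (24) can be checked in a few lines. The identification of $L_n$ with the vector field $z^{n+1}\, d/dz$ used below is the conventional one up to an overall sign, and is stated here as an assumption rather than taken from the text.

```latex
% Expand the finite transformation for small alpha:
\[
z' = \frac{z}{1 - \alpha (z - 1)}
   = z \left[ 1 + \alpha (z - 1) + O(\alpha^2) \right]
   = z + \alpha (z^2 - z) + O(\alpha^2) .
\]
% Both fixed points survive for any alpha:
\[
z = 0 \mapsto 0 , \qquad z = 1 \mapsto \frac{1}{1 - 0} = 1 .
\]
% The infinitesimal shift is
\[
\delta z = \alpha \left( z^{1+1} - z^{0+1} \right) ,
\]
% i.e. the combination of vector fields associated with L_1 - L_0.
```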

Since the “ket” on the r.h.s. is left invariant by the subgroup in (23) we obtain
\[
W_1 | p_{(1,M)} \rangle = 0,
\tag{25}
\]
where
\[
| p_{(1,M)} \rangle = V(1, p_M) D \ldots V(1, p_2) | p_1, 0 \rangle ,
\tag{26}
\]
independently of the number of V D insertions. Clearly, one gauge condition
$W_1$ is not enough to project out all the negative norm states and additional
conditions were searched for. We remark that (25) is not a consequence of
any gauge symmetry, being valid in any CFT for vertex operators of arbitrary
dimensions, provided the vertex operators are inserted at the value z = 1.
Nevertheless, following the pattern that led to (24), Virasoro [36] realized
that, if $\alpha_0 = 1$, the state in (26) is annihilated by an infinite set of “gauge”
operators
\[
W_n | p_{(1,M)} \rangle = 0 ; \qquad n = 1, 2, 3, \ldots
\tag{27}
\]
where
\[
W_n = L_n - L_0 - (n - 1)
\tag{28}
\]
with
\[
L_n = \sqrt{2\alpha' n}\, \hat{p} \cdot a_n
+ \sum_{m=1}^{\infty} \sqrt{m(n+m)}\, a_{n+m} \cdot a_m^{\dagger}
+ \frac{1}{2} \sum_{m=1}^{n} \sqrt{m(n-m)}\, a_{n-m} \cdot a_m ; \qquad n \ge 0 ; \qquad L_{-n} = L_n^{\dagger} .
\tag{29}
\]
The “gauge” conditions in (27) imply the following equations for the on-shell
physical states of the generalized Veneziano model [37]:
\[
(L_0 - 1) | \mathrm{Phys} \rangle = L_n | \mathrm{Phys} \rangle = 0 ; \qquad n = 1, 2, \ldots
\tag{30}
\]

These are exactly the constraints following from the diffeomorphism and Weyl
symmetry of the action in the presence of a two-dimensional metric, after the
gauge fixing that eliminates completely the metric. These constraints annihi-
late the intermediate states in (17), that are not physical, as we know from
the now standard gauge fixing–BRST procedure [8]. We postpone the discus-
sion of the exact conditions under which the constraints eliminate the negative
norm states to the next section, since it is closely tied to the recognition of the
critical dimension. In conclusion, the correct results were obtained at the tree
level without needing to know the underlying Lagrangian and to introduce
the ghost degrees of freedom. What is more amazing is that the correct
one-loop measure was also obtained, by using the Brink–Olive operator
that projects onto the subspace of physical states [38]. The correct measure
for the multiloop amplitudes was instead determined much later, although
it would have been possible, in principle, to determine it by extending the
procedure of Brink and Olive to multiloops.
Once the intercept α0 got fixed to 1, it became clear that the first state on
the leading trajectory is a tachyon; its consistent removal was achieved only
with the discovery of the superstring and the GSO projection [39]. Imposing
the infinite set of Virasoro constraints on the vertex operators corresponds, in
today’s language that was already used in [40], to the requirement that vertex
operators should be primary fields with dimension 1 [40]. Projecting from the
Fock space the states which are annihilated by all the Virasoro constraints,
and eliminating the zero norm states following the procedure explained in
[37], defines the physical Hilbert space which should have positive norm.
Shortly after Virasoro found the constraints (28) it was realized that the
Ln operators are the generators of the conformal group in d = 2 [33]. The full
algebra of the group including the central extension present in the commutator
of $L_n$ with $L_{-n}$ was correctly worked out only somewhat later [41]¹. In this way
the algebra of the Virasoro operators was established and became the basic
algebraic structure underlying two-dimensional CFT and String Theory. The
central extension discovered by Weis [41], which is understood today as a
manifestation of the conformal anomaly [2], has far reaching consequences
which we are going to discuss now.
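For reference, the algebra referred to here can be written out explicitly. The form below is the standard one for d free string coordinates with $L_0$ defined as in (15), without an added normal-ordering constant; the normalization of the central term depends on that convention and should be read as an assumption.

```latex
\[
[L_n , L_m] = (n - m)\, L_{n+m} + \frac{d}{12}\, n (n^2 - 1)\, \delta_{n+m,0} .
\]
% The central term vanishes for n = 0, \pm 1, so L_1, L_0, L_{-1} close on the
% central extension free SL(2,R) subalgebra:
\[
[L_1 , L_{-1}] = 2 L_0 , \qquad [L_0 , L_{\pm 1}] = \mp L_{\pm 1} .
\]
% The first non-trivial case is n = 2:
\[
[L_2 , L_{-2}] = 4 L_0 + \frac{d}{2} .
\]
```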

5 The Critical Dimension

The discovery of the critical dimension with its various manifestations shows
the serendipity characteristic of this first period of String Theory. Since, as
we know it today, the existence of the critical dimension is a consequence
of the conformal anomaly cancellation between the string coordinate fields
and the b, c ghost system, it is clear that in the absence of the understanding
of the coupling to two-dimensional metrics and its gauge fixing which leads to
the ghosts, the critical dimension could manifest itself only through its “side
effects”, i.e. various consistency conditions of the theory. The first calculation
pointing to the existence of the critical dimension was done by Lovelace [42].
He calculated the non-planar loop with a number of tachyons as external
particles, represented in Fig. 2.
This diagram was proposed earlier [43] as a model for the “Pomeron”
which dominates the high-energy elastic scattering amplitude of hadrons and
therefore, according to the lore of the time, was described as the Regge pole in
the t-channel with the highest intercept. In the calculation the dimension of
space–time d and the effective number of dimensions going around in the loop
$d'$ were left as free parameters. It was understood at the time that only the
physical degrees of freedom which obey the Virasoro gauge conditions circulate
in the loops, but the exact way to implement this fact was not understood².
The result of the calculation showed that the singularity in the t-channel
became a pole only when d = 26 and $d' = 24$, and in this case the intercept
of the “Pomeron” Regge trajectory is 2. We understand this result today
as a consequence of the conformal invariance of the theory: by a continuous
deformation of the world sheet, the diagram in Fig. 2 can be brought to the
form in Fig. 3.

Fig. 2. The doubly twisted open string diagram

¹ See note added in proof of [33].

Now it is clear that one has a tree diagram, in the t-channel a closed string
(the cylinder) being exchanged with the open string tachyons being coupled to
the upper and lower disks. However, the conformal deformation of the world
sheet on which the above expectation is based is valid only when conformal
transformations act as expected classically, i.e. no anomaly is present implying
d = 26. In addition, we know today that the b, c ghosts circulating in the loop
cancel the contribution of two of the space–time string coordinates leading to
$d' = 24$. Finally, the intercept 2 is the one required by the correct gauge fixing
for the closed string. We identify nowadays the trajectory in the t-channel
with the graviton and not the Pomeron, though the connection may come
back to haunt us [44]. In the critical case the couplings of the open strings can
be factorized and a consistent open–closed theory can be constructed [45, 46].
Further evidence for the existence of the critical dimension came from a
close examination of the physical spectrum, i.e. the Hilbert space left after the
infinite set of Virasoro conditions is imposed on the Fock space. In [47, 48] it
was shown that the physical spectrum, i.e. the ensemble of Fock space states,
which satisfy the conditions in (30), has a positive-definite scalar product (it
is “ghost free”) only when d ≤ 26. Of course, if the spectrum is ghost free for
d = 26, it is a fortiori so also for d < 26. In order to prove the “no ghost
theorem” for d = 26 the manipulations used in [47] are very similar to the
modern ones based on the BRST formalism, and which are valid provided
that the BRST operator Q obeys at the quantum level $Q^2 = 0$. As a corollary
of their proof Goddard and Thorn showed that the DDF [49] states form a
basis for the physical Hilbert space.

Fig. 3. The diagram of Fig. 2 in the closed string channel

² This was clarified a few years later by Brink and Olive [38], who inserted in the loop
the operator that projects onto the space of physical states.

This leads to a third manifestation [25] of the critical dimension which is
already very close to our modern understanding. Though the starting point
in [25] is the Nambu–Goto action the final results correspond to a correct
quantization in light-cone [8] and in covariant gauge [8] of the σ-model ac-
tion. The DDF states are isomorphic to the states in the light-cone gauge
which live in a Hilbert space which has an explicitly positive-definite scalar
product. The light-cone gauge is, therefore, unitary; however, Lorentz invari-
ance is not explicit. On the other hand, in the covariant gauge Lorentz in-
variance is explicit but unitarity is valid only on the physical Hilbert space
after the imposition of the conditions of (30). In our modern understanding,
the equivalence of the two gauges at the critical dimension ensures, without
further proof, that the spectrum is both unitary and Lorentz invariant. However,
at the time one had to prove explicitly that on the spectrum in the
light-cone gauge the Lorentz algebra is fully realized. By constructing all the
Lorentz generators in [25], it was shown that the algebra correctly closes only
if d = 26.
We mention finally an interesting interpretation of the central extension
(and implicitly of the critical dimension) given by Brink and Nielsen [50].
They related the central extension to the Casimir energy of the string. In our
present understanding this is simply the fact that, transforming L0 to the strip
(or cylinder for the closed string) coordinates, an additional term proportional
to the central extension appears. This argument was later generalized to an
arbitrary CFT in [51], giving a relation between the central extension and
energies on finite geometries.
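The Casimir-energy argument can be summarized in a short calculation. The sketch below assumes the light-cone counting of d − 2 transverse oscillators and the zeta-function value $\zeta(-1) = -1/12$ for the regularized sum over mode numbers; neither step is spelled out in the text above.

```latex
% Zero-point energy of d - 2 transverse directions, one oscillator per mode n:
\[
E_0 = \frac{d - 2}{2} \sum_{n=1}^{\infty} n
\;\longrightarrow\; \frac{d - 2}{2}\, \zeta(-1) = - \frac{d - 2}{24} .
\]
% Identifying -E_0 with the intercept alpha_0 gives
\[
\alpha_0 = \frac{d - 2}{24} , \qquad \alpha_0 = 1 \iff d = 26 .
\]
% Equivalently, for a CFT with central charge c the generator on the strip is
% shifted with respect to the one on the plane:
\[
L_0^{\text{strip}} = L_0^{\text{plane}} - \frac{c}{24} .
\]
```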

6 Conclusions

In this history-oriented note we briefly reviewed some of the developments
that led to what we call today “String Theory”. At the end of 1972, a complete
theory existed (as summarized in [25]) which, except for the existence of
the tachyon, was consistent. Its perturbative spectrum and the precise rules
for calculating perturbatively scattering amplitudes were completely under-
stood in the operator formalism. The theory is unitary and Lorentz invariant
for α0 = 1 and d = 26. All this was obtained starting from a rather strange
physical motivation, and involved a long chain of beautiful conceptual in-
sights and guesses. The impressive theoretical structure created in the years
1969–1972, and further intensively developed during the last 25 years, contin-
ues to be at the forefront of Theoretical Physics. We dedicate this contribu-
tion to Gabriele Veneziano who played a leading role in the developments we
described.

References
1. G. Veneziano: Nuovo Cimento A 57, 190 (1968) 119, 120
2. A. M. Polyakov: Phys. Lett. B 103, 207 (1981) 120, 124, 127, 128, 132
3. A. Neveu, J. H. Schwarz: Nucl. Phys. B 31, 86 (1971) 120
4. P. Ramond: Phys. Rev. D 3, 2415 (1971) 120
5. K. Bardakci, M. Halpern: Phys. Rev. D 3, 2493 (1971) 120
6. S. Mandelstam: Nucl. Phys. B 64, 205 (1973) 120
7. P. Di Vecchia: The birth of string theory, article in this volume 120, 122
8. M. B. Green, J.H. Schwarz, E. Witten: Superstring Theory, Vol. I (Cambridge
University Press, Cambridge 1987);
J. Polchinski : String Theory, Vol. I (Cambridge University Press, Cambridge
1998);
B. Zwiebach: A First Course in String Theory (Cambridge University Press,
Cambridge 2004) 120, 131, 134
9. M. A. Virasoro: Phys. Rev. 177, 2309 (1969) 120, 124
10. J. Shapiro: Phys. Lett. B 33, 361 (1970) 120, 124
11. E. Del Giudice, P. Di Vecchia: Nuovo Cimento A 5, 90 (1971);
M. Yoshimura: Phys. Lett. B 34, 79 (1971) 120
12. R. Dolen, D. Horn, C. Schmid: Phys. Rev. 166, 1768 (1968);
C. Schmid: Phys. Rev. Lett. 20, 689 (1968) 121
13. H. Harari: Phys. Rev. Lett. 22, 562 (1969) 121
14. J. L. Rosner: Phys. Rev. Lett. 22, 689 (1969) 121
15. A. Donnachie, P. V. Landshoff: Phys. Lett. B 296, 227 (1992) 122
16. K. Bardakçi, H. Ruegg: Phys. Rev. 181, 485 (1969);
C.G. Goebel, B. Sakita: Phys. Rev. Lett. 22, 256 (1969);
Chan Hong-Mo, T.S. Tsun: Phys. Lett. B 28, 485 (1969) 122
17. Z. Koba, H. B. Nielsen: Nucl. Phys. B 10, 633 (1969) 122
18. K. Kikkawa, B. Sakita, M. Virasoro: Phys. Rev. 184, 1701 (1969) 123
19. Y. Nambu: Proc. Int. Conf. on Symmetries and Quark Models, Wayne State
University 1969 (Gordon and Breach, New York 1970) p. 269 124
20. L. Susskind: Phys. Rev. D 1, 1182 (1970) 124
21. H. B. Nielsen: Paper submitted to the 15th Int. Conf. on High Energy Physics
(Kiev, 1970) and Nordita preprint (1969) 124
22. D. B. Fairlie, H. B. Nielsen: Nucl. Phys. B 20, 637 (1969) 124
23. Y. Nambu: Lectures at the Copenhagen Symposium (1970), unpublished 124
24. T. Goto: Prog. Theor. Phys. 46, 1560 (1971) 124
25. P. Goddard, J. Goldstone, C. Rebbi, C. Thorn: Nucl. Phys. B 56, 109 (1973) 124, 134
26. S. Fubini, G. Veneziano: Nuovo Cimento A 67, 29 (1970) 125
27. S. Fubini, G. Veneziano: Nuovo Cimento A 64, 811 (1969) 125, 129
28. K. Bardakçi, S. Mandelstam: Phys. Rev. 184, 1640 (1969) 125
29. S. Fubini, D. Gordon, G. Veneziano: Phys. Lett. B 29, 679 (1969) 125
30. C. Lovelace: Phys. Lett. B 32, 490 (1970) 127
31. V. Alessandrini: Nuovo Cimento A 2, 321 (1971) 127
32. D. Amati, V. Alessandrini: Nuovo Cimento A 4, 793 (1971) 127
33. S. Fubini, G. Veneziano: Ann. Phys. 63, 12 (1971) 130, 132
34. F. Gliozzi: Lett. al Nuovo Cimento 2, 846 (1969) 130
35. C. B. Chiu, S. Matsuda, C. Rebbi: Phys. Rev. Lett. 23, 1526 (1969);
C. B. Thorn: Phys. Rev. D 1, 1963 (1970) 130
36. M. A. Virasoro: Phys. Rev. D 1, 2933 (1970) 131
37. E. Del Giudice, P. Di Vecchia: Nuovo Cimento A 70, 579 (1970) 131, 132
38. L. Brink, D. Olive: Nucl. Phys. B 56, 253 (1973) and Nucl. Phys. B 58, 237
(1973) 131, 133
39. F. Gliozzi, J. Scherk, D. Olive: Phys. Lett. B 65, 282 (1976); Nucl. Phys. B
122, 253 (1977) 131
40. P. Campagna, S. Fubini, E. Napolitano, S. Sciuto: Nuovo Cimento A 2, 911
(1971) 132
41. J. Weis: Unpublished work (1970) 132
42. C. Lovelace: Phys. Lett. B 34, 500 (1971) 132
43. P. G. O. Freund, R. J. Rivers: Phys. Lett. B 29, 510 (1969) 132
44. R. C. Brower, J. Polchinski, M. J. Strassler, Chung-I Tan: hep-th/0603115 133
45. E. Cremmer, J. Scherk: Nucl. Phys. B 50, 222 (1972);
L. Clavelli, J. Shapiro: Nucl. Phys. B 57, 490 (1973);
L. Brink, D.I. Olive, J. Scherk: Nucl. Phys. B 61, 173 (1973) 133
46. M. Ademollo, A. D’Adda, R. D’Auria, E. Napolitano, P. Di Vecchia, F. Gliozzi,
S. Sciuto: Nucl. Phys. B 77, 189 (1974) 133
47. P. Goddard, C. B. Thorn: Phys. Lett. B 40, 235 (1972) 133
48. R. C. Brower: Phys. Rev. D 6, 1655 (1972) 133
49. E. Del Giudice, P. Di Vecchia, S. Fubini: Ann. Phys. 70, 378 (1972) 134
50. L. Brink, H. B. Nielsen: Phys. Lett. B 45, 332 (1973) 134
51. H. W. J. Bloete, J. L. Cardy, M. P. Nightingale: Phys. Rev. Lett. 56, 742 (1986)
134
The Little Story of an Algebra

M. A. Virasoro

Dipartimento di Fisica, Universita di Roma1, Roma, Italy

Abstract. The historical path leading to the so-called Virasoro algebra is recalled,
and the associated physical context is briefly discussed.

1 Introduction

When I heard about this project to honor our friend Gabriele Veneziano I
could not have been happier. Gabriele is one of those people who, once you meet
and interact with them, you know will be your friend forever. If I try to
take stock of my life I realize how lucky I was to encounter him and
the rest of the Italian–Argentinian–Israeli mafia on my first post-doc.
Unfortunately, this happiness soon dissolved when I realized that I would
have to contribute an article about the algebra. It is no mystery that I have
not invested too much in it. This is not because of any deep reason or because
I do not “believe” in it – just the opposite, I am sure contributions to it will
remain in the textbooks long after us. But I prefer diversity and perhaps
I also share a widespread feeling that as we get wiser we should risk working on
subjects that are still shapeless. In any case, on a similar occasion celebrating
Sakita (another person I truly cherished) I did choose to talk about “Models
of the Brain”. This time I chose heroically to submerge myself in the past and
try to recover some old impressions. I hope Gabriele will appreciate at least
my effort.

2 The Context

Let me put the story in context. The place was Madison, Wisconsin, a
Midwestern midsize town with a good University and a large student body.
The year was 1968, one of those moments in history when everything seemed
possible, when the obstacles lay around the corner, hidden from the young
optimists that we were. I was arriving after 4 months spent in Argentina doing
important things, like getting married, but no physics.

M. A. Virasoro: The Little Story of an Algebra, Lect. Notes Phys. 737, 137–144 (2008)
DOI 10.1007/978-3-540-74233-6_6
© Springer-Verlag Berlin Heidelberg 2008

In Buenos Aires I had suffered a big frustration. Having left Israel in mid-
April (leaving my collaborators to finish and write several pending papers),
I had noticed, while preparing a CERN seminar, that the “miracles” that
we were encountering while saturating the finite energy sum rules with the
leading trajectory plus daughters were pointing to a simple mathematical fact:
in the imaginary amplitude we were building a beta function:

$$ \frac{\Gamma(\alpha(s)+\alpha(t))}{\Gamma(\alpha(s))\,\Gamma(\alpha(t))} = \sum_k c_k\, \frac{\alpha(s)^{\alpha(t)-k}}{\Gamma(\alpha(t))} . \qquad (1) $$

For us, interested in obtaining s − t duality, this formula was telling us
something. It was the imaginary part of the amplitude but it had to correspond
to a full dual amplitude. I tried to use dispersion relations to build the full
amplitude but the result was messy, and so I wrote a letter to Gabriele, who
replied immediately saying that he was playing with the same idea but
that he had figured it out. He suggested that we write up our results separately.
I looked at my calculations and ... gave up. Gabriele had added a key element
when he assumed all resonances to be infinitely narrow. That was not our
philosophy in Israel and, although he presented it as a mathematically
convenient way to deal with an average amplitude, it was the first hint that dual
amplitudes represented something different from the total amplitude, more
like a Born approximation.
Endowed with this healthy frustration I brought forward my trip to Madison
and arrived there with a generous dose of adrenaline. Obviously, Gabriele’s
paper had opened a Pandora’s box and was, to use an expression popular
in those times, mind-blowing. But curiously, while in Europe the impact was
immediate, the reaction in the US was more subdued. The bootstrap
approach was identified as a West Coast ideology antagonistic to the field theory
framework that was instead the main playground for East Coast physicists.
As a consequence, only a few (but important) American physicists jumped
on the bandwagon. Bunji Sakita was of a different kind. He had a broad
background and an extreme curiosity. Putting together his wide perspective,
Charlie Goebel’s sheer power of analysis, and a young bright Keiji Kikkawa
made Madison the perfect continuation to Rehovoth.
The first year in Wisconsin was a year of adaptation to a new environment.
The pace was hectic, nothing like the relaxing Rehovoth atmosphere.
We had to learn to expect new results almost every day, many of them
from more than one group simultaneously. The telephone was a key instrument.
I was communicating regularly with Gabriele at MIT and Hector in New York.
During the summer of 1969 I visited Europe. I spent a pleasant month
discussing Loop Diagrams at Orsay, where I met two young graduate students
(Joel Scherk and Andre Neveu) who, advised by Daniele Amati, were interested
in Dual Models. I visited CERN (I vividly remember being there when the
Apollo mission was landing on the Moon) and the Niels Bohr Institute, where
I learned from Holger Nielsen about his analog model, and for the first time
I realized the crucial role played by Conformal Symmetry.
When I went back to Wisconsin I began to look carefully at the low-lying
resonances of the model by calculating by brute force their couplings to
n-ground states and checking whether there were cancellations. My luck then
was a direct consequence of my laziness. I knew that calculations were much
simpler if α(0) = 1 (this as a by-product of an earlier work on an alterna-
tive dual, crossing-symmetric amplitude), so I was routinely working at that
value. Thus when I found that at the first level the ghost decoupled I could
continue to the second level and there find that there were some additional
decouplings. At that exact moment I heard from Gabriele the unwelcome news
that at least two groups [1, 2] had derived the first level results. Fortunately,
by then I was used to this kind of frustration and did not rush to publish
but continued trying to simplify the calculation. The paper still shows how
messy the original calculation was, but how it simplifies considerably once
the Fubini–Veneziano [3] creation–annihilation operators $a_n^{\mu\dagger}$, $a_n^{\mu}$ were used.
Furthermore, written in that way, the generalization to the mth level became
trivial: the operator

$$ O_m = \sum_{n=1}^{\infty} \sqrt{n(n+m)}\, a^\dagger_{n+m} a_n - i\sqrt{2m}\, P\, a^\dagger_m - \sum_{n=1}^{m} \sqrt{n(m-n)}\, \frac{a^\dagger_n a^\dagger_{m-n}}{2} + m - H, $$
$$ H = \sum_{n=1}^{\infty} n\, a^\dagger_n a_n - P^2 - 1, \qquad (2) $$

turns out to create a resonance uncoupled to any number of ground states [4].
Thus there are as many uncoupled resonances as there are ghosts and
therefore I assumed that all the ghosts had been killed. I was worried that I
was trading ghosts for a tachyon. On the other hand, I happily dismissed the
possibility that I could be killing good resonances and leaving ghosts alive.
I had checked that this was not the case for the first two levels but it was
Thorn, Brower and Goddard who took this issue seriously.
More or less at that time (end of the 1969–1970 winter) Nambu came to
Madison and in a seminar he boldly presented his idea that the Lagrangian for
the string was the one of a relativistic massless string. Convinced as I was
that the Lagrangian was the conformal invariant one $\partial_\sigma \phi^\mu \partial_\sigma \phi_\mu + \partial_\tau \phi^\mu \partial_\tau \phi_\mu$,
I went to him and kindly explained that there were too many resonances
for his picture to be correct: the classical string would have two oscillating
modes per level and not three. He stared at me, did not answer but was not
seriously affected. Of course he was right [5]. One year later Goto [6] in Japan
and a year and a half later Chang and Mansouri [7] (Nambu’s collaborators)
in Chicago proved that through Dirac quantization, a gauge fixing and for a
26-dimensional space–time the two Lagrangians were equivalent. Nambu was
an expert in Dirac quantization but I still do not understand what led him
to this idea. Because of this boldness (or foresight) he is considered the
father of the string idea.
During the next year (1970) we watched a sustained effort to build new
alternative dual models endowed with a similar ghost-killing mechanism. The
models proposed by Neveu–Schwarz [10] and Ramond [11] turned out to be
richer. The operators $O_m$ were completed with new anti-commuting operators
in what turned out to be the first appearance of supersymmetry in physics.
In Berkeley, Bardakci and Halpern [13] tried an algebraic approach working
directly on the Fock space with creation–destruction operators. The
calculations became very complicated. In Wisconsin, our work on loop diagrams
led us to believe that the locality of the String degrees of freedom had to be
made manifest. For this we implemented the functional integral formalism rather
than the operator one. With Sakita and his new student Hsue, we tried to fill
several gaps. We investigated the anomalous dimension of the vertex
operator exp(kφ(z)) and grasped the role of the full conformal algebra [14]. In a
contribution to the Tel Aviv Conference held in April 1971, I presented the
programme of studying systematically conformally invariant Lagrangians to
build new models [15]. All the models known at that moment were explained
and several proposals could be discussed. I had one unfinished proposal with
M. Kaku and M. Yoshimura but in September 1971 I decided to go back to
Argentina and my personal interaction with the algebra finished.
The algebra by then had its own life nurtured by physicists like R. Brower,
P. Goddard, D. Olive and C.B. Thorn. Any of them could write a much better
account of its development. I can only attempt to give an incomplete, personal
and probably biased version.
The operators $O_m$, whose commutators I had computed, generate a
subgroup of the full conformal algebra: the transformations that leave the lines
Im z = 0 and Im z = π, z = 0 and z = −∞ invariant. The operators $O_m^\dagger$ leave
the same lines, z = 0 and z = ∞ invariant. Fubini and Veneziano [8] reshuffled
these operators with the Hamiltonian, defining

$$ L_m = O_m - H, \quad m > 0, $$
$$ L_m = O_m^\dagger - H, \quad m < 0, \qquad (3) $$
$$ L_0 = H, $$

and calculated the remaining commutators to write:

$$ [L_n, L_m] = (n-m)\, L_{n+m} + \delta_{n,-m}\, \text{Central Charge}. \qquad (4) $$

Finally J.H. Weis noticed and calculated the Central Charge to be added to
complete the so-called Virasoro algebra

$$ \text{Central Charge} = \frac{m^3 - m}{12}\, c, \qquad (5) $$

with c equal to the number of fields. R. Brower, C.B. Thorn and P. Goddard [9]
proved the conjecture that all ghosts had been killed. In their work they found
the special role played by the choice of d = 26. J. Shapiro constructed the full
theory of closed strings proving that in this case there were two commuting
Algebras.
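
The structure of (4) and (5) can be checked with a few lines of exact arithmetic. The sketch below is only an illustration, not part of the original derivation: it assumes the realization L_n = −z^(n+1) d/dz on monomials for the part of (4) without central term, and it labels the central term by the first index, f(n) = c(n^3 − n)/12, which is an assumption about conventions (the check is insensitive to the overall sign and normalization). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def central_term(n, c=1):
    """Central extension c * (n^3 - n) / 12 attached to [L_n, L_{-n}]."""
    return Fraction(c * (n**3 - n), 12)


def jacobi_defect(n, m, c=1):
    """Central part of the Jacobi identity for L_n, L_m, L_p with p = -(n + m).

    A consistent central extension must make this vanish for all n, m.
    """
    p = -(n + m)
    return ((n - m) * central_term(n + m, c)
            + (m - p) * central_term(m + p, c)
            + (p - n) * central_term(p + n, c))


def witt(n, monomial):
    """Act with L_n = -z^(n+1) d/dz on coeff * z^k, stored as (coeff, k)."""
    coeff, k = monomial
    return (-k * coeff, k + n)


def commutator_matches(n, m, k):
    """Check [L_n, L_m] z^k == (n - m) L_{n+m} z^k without central term."""
    z_k = (1, k)
    lhs = witt(n, witt(m, z_k))[0] - witt(m, witt(n, z_k))[0]
    rhs = (n - m) * witt(n + m, z_k)[0]
    return lhs == rhs


def check_all(max_mode=6):
    """Run both checks for all mode numbers in [-max_mode, max_mode]."""
    modes = range(-max_mode, max_mode + 1)
    return all(
        jacobi_defect(n, m) == 0
        and all(commutator_matches(n, m, k) for k in range(-3, 4))
        for n, m in product(modes, repeat=2)
    )


if __name__ == "__main__":
    print(check_all())
```

The cubic-minus-linear form vanishes for n = −1, 0, 1, so the three generators L_{−1}, L_0, L_1 close among themselves without any central term.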
In September 1972 I visited Fermilab to organize a parallel session on
Dual Models for the XVI International Conference on High Energy Physics.
There I learned about the GGRT paper on the massless string dynamics and
the light cone quantization [12]. I found that paper extremely interesting and
began to work on the interaction among strings in the light cone gauge, but
when I went back to Argentina too many things were happening: a military
regime was reaching its end and a new era full of hopes was announced.
A whole generation became deeply involved in the process, with tragic
consequences for many of them, because after a short promising period events
turned for the worse and dragged us into a political storm. In August 1975
Tullio Regge invited me to Princeton and I decided to leave Argentina at least
temporarily. At that moment I was interested in Geophysical Fluid Dynamics
and was still planning to go back to Argentina, but then the military coup of
1976 definitively convinced me that I had to change plans.
The rest of the decade was rather quiet also on the front of String Theory
and Conformal Invariance. From time to time I was browsing articles and
attending Seminars perhaps just to hear, as Hector Rubinstein used to say, the
“music”. Around 1980 I received a preprint by I.B. Frenkel and V. Kac entitled
“Basic Representations of affine Lie algebras and Dual Resonance Models”
[16]. I only read the nice introduction and understood that there was an
ongoing effort to study the representations of Infinite Dimensional Lie algebras
and in this context the developments in dual resonance models of the 1970s
were a source of inspiration. Not only had we found unitary, positive
energy representations of an infinite dimensional algebra, but in addition these
authors discovered that they could use the Fubini–Veneziano vertex operator to
build the representations of the affine Lie (Kac–Moody) algebras.
In mathematics these algebras represent a natural generalization of finite
dimensional Lie algebras. An excellent review specially written to introduce
this subject to physicists can be found in [17]. Suppose g is defined by

$$ [T^a, T^b] = i f^{abc}\, T^c, \qquad (6) $$

where a, b, c run from 1 to dim g and $f^{abc}$ are the structure constants of g,
then an affine Kac–Moody algebra is defined by the commutation relations

$$ [T^a_m, T^b_n] = i f^{abc}\, T^c_{m+n} + k\, m\, \delta^{ab}\, \delta_{m,-n} \qquad (7) $$

with m any integer and k a central charge. In physics they are known objects.
In fact, if we define

$$ T^a(\theta) = \sum_{n=-\infty}^{\infty} T^a_n\, e^{2i\pi n\theta}, \qquad (8) $$

we obtain

$$ [T^a(\theta), T^b(\phi)] = i f^{abc}\, T^c(\theta)\, \delta(\theta - \phi) + i\, \frac{1}{2\pi}\, k\, \delta^{ab}\, \delta'(\theta - \phi). \qquad (9) $$

These are the equations of current algebra. Sugawara [18] showed in
1968 that one could construct an energy–momentum tensor directly from the
currents. We also know that from the energy–momentum tensor we can
obtain the generators for coordinate transformations. Thus, generically, from
representations of the Kac–Moody algebra one can construct representations
of the Virasoro algebra. More specifically, the normal ordered bilinear
$: \sum_a T^a(\theta) T^a(\theta) :$, conveniently normalized and Fourier expanded, gives
a set of $L_n$ operators that obey

$$ [L_m, T^a_n] = -n\, T^a_{n+m}. \qquad (10) $$

Even my original construction can be written in this new way though obviously
in the original Dual Model the g group is abelian.
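
For the abelian case just mentioned, relation (10) can be checked in a few lines. The worked sketch below assumes a single current with modes $T_n$ obeying (7) with $f^{abc} = 0$, and assumes the normalization $L_m = \frac{1}{2k}\sum_j :T_{m-j}T_j:$, which is one common choice and is not fixed by the text. Normal ordering only shifts $L_m$ by a c-number, so it does not affect the commutator.

```latex
\begin{align*}
[T_m, T_n] &= k\,m\,\delta_{m,-n}, \qquad
L_m = \frac{1}{2k}\sum_{j=-\infty}^{\infty} :T_{m-j}\,T_j: \\
[L_m, T_n] &= \frac{1}{2k}\sum_{j}\Big( T_{m-j}\,[T_j, T_n] + [T_{m-j}, T_n]\,T_j \Big) \\
&= \frac{1}{2k}\sum_{j}\Big( k\,j\,\delta_{j,-n}\,T_{m-j} + k\,(m-j)\,\delta_{m-j,-n}\,T_j \Big) \\
&= \frac{1}{2k}\big( -k\,n\,T_{m+n} - k\,n\,T_{m+n} \big) = -n\,T_{n+m}.
\end{align*}
```

The factor $1/2k$ is what "conveniently normalized" amounts to under these assumptions: each of the two contractions contributes $-k\,n\,T_{n+m}$.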
This interrelation between the Kac–Moody and Virasoro algebras was cru-
cial in the next development. From the point of view of physics, one is inter-
ested in representations that have a vacuum state: L0 should have a spectrum
bounded below and (at least in quantum physics) we want the representa-
tions to be unitary. Then in a remarkable paper Friedan, Qiu and Shenker
[19] proved that the possible values of the central charge c and the ground
state energy h had to be

$$ \text{either} \quad c \ge 1 \ \text{and} \ h \ge 0, $$
$$ \text{or} \quad c = 1 - \frac{6}{(m+2)(m+3)} \ \text{and} \ h = \frac{[(m+3)p - (m+2)q]^2 - 1}{4(m+2)(m+3)}, \qquad (11) $$

with m = 0, 1, 2, . . . ; p = 1, 2, . . . , m + 1; q = 1, 2, . . . , p. Goddard, Kent and Olive
[20] then were able to build all the corresponding representations.
The next chapter in this little story concerns statistical physics systems
in two dimensions at a critical point. The conformal group in two dimen-
sions can be seen as rotations plus dilatations that depend on the coordi-
nates. It is not too surprising that local systems that become scale invariant
become simultaneously invariant under the full conformal group. This had
been stressed before, but became a full programme with the work of Belavin,
Polyakov and Zamolodchikov [21]. Their point was that Conformal Invariance
imposes constraints on the correlation functions of the different fields defined
in the system. If $\phi(z, \bar z)$ is one of these fields then there are two families of $L_n$
operators acting on them in the following way:

$$ [L_n, \phi] = z^{n+1} \frac{\partial}{\partial z}\phi + h(n+1)\, z^n \phi, $$
$$ [\bar L_n, \phi] = \bar z^{n+1} \frac{\partial}{\partial \bar z}\phi + \bar h (n+1)\, \bar z^n \phi. \qquad (12) $$

In these equations $h$, $\bar h$ are two, possibly anomalous, dimensions. They are
restricted by the conditions stated above on the lowest eigenvalue of the $L_0$
operator. Therefore the correlation functions $\langle 0|\phi(z,\bar z)\phi(z',\bar z')|0\rangle$ will scale
with $h + \bar h$. One can then identify which representation is acting on the different
systems at their critical point by looking at critical exponents. For instance,
c = 1/2 corresponds to the Ising Model, c = 4/5 to the three-state Potts
model and so on.
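
The discrete series (11) is easy to tabulate. The following is a minimal sketch, assuming p runs from 1 to m + 1 as in the Friedan, Qiu and Shenker classification; the function names are hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def central_charge(m):
    """Central charge c = 1 - 6 / ((m + 2)(m + 3)) of the discrete series."""
    return 1 - Fraction(6, (m + 2) * (m + 3))


def allowed_weights(m):
    """Sorted set of allowed ground state energies h for a given m.

    Assumes p = 1, ..., m + 1 and q = 1, ..., p.
    """
    weights = set()
    for p in range(1, m + 2):
        for q in range(1, p + 1):
            numerator = ((m + 3) * p - (m + 2) * q) ** 2 - 1
            weights.add(Fraction(numerator, 4 * (m + 2) * (m + 3)))
    return sorted(weights)


if __name__ == "__main__":
    for m in range(4):
        values = ", ".join(str(h) for h in allowed_weights(m))
        print(f"m = {m}: c = {central_charge(m)}, h in {{{values}}}")
```

With these ranges, m = 1 gives the c = 1/2 value quoted for the Ising Model and m = 3 gives the c = 4/5 value quoted for the three-state Potts model.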
This is as much as I have been able to follow this subject. There are
many new developments about which I am even more ignorant. However,
even to a layman this little story shows clearly the advantages of a
multidisciplinary approach. The so-called Virasoro algebra was known to
mathematicians even with its central charge. However, when we discovered it in
Physics, the amount of excitement that it produced had the positive effect of
a sustained effort to understand it and generalize it. Hence the no-ghost
theorem and the Neveu–Schwarz–Ramond generalization. When mathematicians
rediscovered our work they had understood other aspects, and in particular
the connection with the Kac–Moody algebras, but were happily surprised with
the vertex operators that Sergio and Gabriele had introduced. The Kac de-
terminant, a key ingredient for the Friedan et al. classification, could hardly
have been discovered by physicists. Furthermore, the excitement on our side
had decreased a lot by 1980. The latest discoveries, including the classifica-
tion of all unitary representations, the restrictions on c and h, were the direct
consequence of a fertile dialogue between the two communities. In short this
is an edifying story.

References
1. F. Gliozzi: Lett. Nuovo Cimento 2, 846 (1969) 139
2. C. Chiu, S. Matsuda, C. Rebbi: Phys. Rev. Lett. 23, 1526 (1969) 139
3. S. Fubini, G. Veneziano: Nuovo Cimento A 64, 811 (1969) 139
4. M. A. Virasoro: Phys. Rev. D 1, 2933 (1970) 139
5. Y. Nambu: in Proc. Int. Conf. on Symmetries and Quark Models, ed. by
R. Chand, Wayne State University, 1969 (Gordon and Breach, NY, 1970), p.
269 139
6. T. Goto: Prog. Theor. Phys. 46, 1560 (1971) 139
7. L. N. Chang, F. Mansouri: Phys. Rev. D 5, 2535 (1972) 139
8. S. Fubini, G. Veneziano: Ann. Phys. 63, 12 (1971) 140
9. P. Goddard, C. B. Thorn: Phys. Lett. B 40, 235 (1972) 141
10. A. Neveu, J.H. Schwarz: Nucl. Phys. B 31, 86 (1971) 140
11. P. Ramond: Phys. Rev. D 3, 2415 (1971) 140
12. P. Goddard, J. Goldstone, C. Rebbi, C. B. Thorn: Nucl. Phys. B 56, 109 (1973)
141
13. K. Bardakci, M. Halpern: Phys. Rev. D 3, 2493 (1971) 140
14. C. Hsue, B. Sakita, M. A. Virasoro: Phys. Rev. D 2, 2857 (1970) 140
15. M. A. Virasoro: in Proc. Int. Conf. on Duality and Symmetries in Hadron
Physics, ed. by E. Gotsman, Tel Aviv University, 1971 (The Weizmann Science
Press of Israel, Jerusalem), p. 224 140
16. I. B. Frenkel, V.G. Kac: Invent. Math. 62, 23 (1980) 141
17. P. Goddard, D. Olive: Int. J. Mod. Phys. A 1, 303 (1986) 141
18. H. Sugawara: Phys. Rev. 170, 1659 (1968) 142
19. D. Friedan, Z. Qiu, S. Shenker: Phys. Rev. Lett. 52, 1575 (1984) 142
20. P. Goddard, A. Kent, D. Olive: Commun. Math. Phys. 103, 105 (1986) 142
21. A.A. Belavin, A. Polyakov, A.B. Zamolodchikov: Nucl. Phys. B 241, 333 (1984)
142
Part III

Perturbative QCD
Parton Densities: A Personal Retrospective

R. Petronzio

University of Rome “Tor Vergata” and INFN, Sezione di Roma “Tor Vergata”,
Roma, Italy

Abstract. The beginning of perturbative QCD and the generalisation of parton
evolution probabilities beyond leading order are briefly recalled, together with my
personal experience of collaboration and friendships with Gabriele.

My collaboration with Gabriele started at CERN. I was there as a fellow and
I got involved with Daniele Amati in a discussion about the general validity
of factorisation of mass singularities beyond leading order [1, 2]. The subject
started from a stimulating argument by H.D. Politzer [3], who argued that
the universality of the leading-log result for mass singularities
among different processes involving partons in the initial states could be
generalised, and would lead to universal parton distributions for the normalisation
of parton initiated hard processes. The discussion made extensive use of
the Lee–Nauenberg–Kinoshita [4, 5] theorem, and led to arguments in favour
of the validity of what is known as the factorisation theorem.
Interacting with Gabriele was very stimulating, easy and rewarding. He
was able to stimulate a genuine discussion, in spite of his greater experience
in physics, and I felt I could bring my personal contribution. Later we
had many more discussions on several topics of the then emerging
“perturbative QCD”, on subjects like jet definition and pre-confinement. They
did not always lead to joint publications, but that was not the main aim.
The research on parton densities brought me to a deeper study of an explicit
framework by which the study of mass singularities could lead to the
generalisation beyond leading order of the evolution probabilities, now known as
DGLAP probabilities [6, 7, 8, 9]. En passant, it may be worth noticing that a
first expression for a part of the leading evolution probabilities appeared in a
work [10] with Giorgio Parisi, and was obtained by taking the inverse
Mellin transform of the well-known operator product expansion (OPE) result.
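
The link between x space and moment space can be made concrete with a small exact check. The sketch below is an illustration under stated assumptions, not a reproduction of the calculation in [10]: it takes the standard leading-order non-singlet quark kernel, (1 + x^2)/(1 − x)_+ + (3/2) δ(1 − x) with the colour factor stripped, which does not appear in the text above, and compares its Mellin moments with the familiar closed form in terms of harmonic sums. Function names are hypothetical.

```python
from fractions import Fraction


def harmonic(n):
    """Harmonic number H_n = 1 + 1/2 + ... + 1/n, with H_0 = 0."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


def moment_from_x_space(N):
    """N-th Mellin moment, int_0^1 x^(N-1) P(x) dx, of the assumed kernel.

    The plus prescription turns the integrand into
    [x^(N-1) (1 + x^2) - 2] / (1 - x) = -(1 - x^(N-1))/(1 - x) - (1 - x^(N+1))/(1 - x),
    and int_0^1 (1 - x^j)/(1 - x) dx = H_j, so the integral is -(H_{N-1} + H_{N+1}).
    The delta function contributes 3/2.
    """
    return Fraction(3, 2) - harmonic(N - 1) - harmonic(N + 1)


def moment_closed_form(N):
    """Closed form 3/2 + 1/(N(N+1)) - 2 H_N of the same moment."""
    return Fraction(3, 2) + Fraction(1, N * (N + 1)) - 2 * harmonic(N)


if __name__ == "__main__":
    for N in range(1, 6):
        print(N, moment_from_x_space(N), moment_closed_form(N))
```

The first moment vanishes, which expresses conservation of the quark number under evolution.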
The generalisation of probabilities beyond the leading order was achieved
by choosing a suitable calculation scheme. Together with W. Furmanski and
G. Curci, whom I take this occasion to remember a year after his premature
passing, we embarked on an explicit factorisation of mass singularities
in a light-like gauge, and with dimensional regularisation of both ultraviolet
and collinear singularities [11]. We could perform the first two-loop
calculations of evolution probabilities directly in the “x” space, both for the flavour
non-singlet and for the flavour singlet sector. The method could deal with the
probabilities in the time-like region as well as in the space-like: the only
alternative was the use of the generalisation of the operator product expansion
to time-like processes due to A. Mueller [12], and known as the cut vertices.

R. Petronzio: Parton Densities: A Personal Retrospective, Lect. Notes Phys. 737, 147–150
(2008)
DOI 10.1007/978-3-540-74233-6_7
© Springer-Verlag Berlin Heidelberg 2008

Many new points became clear to us in the new language: I remember
in particular the clarification of the mixing between quark and anti-quark
parton densities, occurring in the non-singlet case through a peculiar two-
loop evolution kernel, and its connection with the distinction at two-loop level
between even and odd moments of the probability evolutions. Our kinematical
interpretation of the breaking of the relation between space-like and time-like
processes has been confirmed by recent three-loop results.
The singlet calculation was more complicated owing to the larger number of
diagrams, and took Furmanski and myself about 6 months of intense work [13].
I remember we had a discrepancy with the classic result [14] obtained in the
moment space through OPE: only a test of gauge invariance and of the
supersymmetric relations among probabilities, at the end of additional lengthy
checks, convinced us of the validity of our result. Later, the OPE result
was corrected to agree with ours.
Shortly after, Furmanski and I [15] proposed a new method to
analyse data that would allow incorporating our two-loop result and yet improve
the efficiency of data analysis. The method was based on the use of Laguerre
polynomials as a basis for the expansion of the experimental parton densities:
the choice was mainly motivated by the easy composition rule of these
polynomials under convolution, the standard mathematical operation by which
probabilities and parton densities are tied together. The advantage was to
avoid a specific parametrisation of the experimental parton densities, thereby
avoiding bias in their determination. The real goal was the determination of
$\Lambda_{\rm QCD}$, in a specific renormalisation scheme.
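
The composition rule can be illustrated with the standard convolution identity for Laguerre polynomials, int_0^t L_n(s) L_m(t − s) ds = L_{n+m}(t) − L_{n+m+1}(t). The sketch below verifies it exactly with rational arithmetic. It is an illustration under assumptions rather than the procedure of [15]: in particular, it assumes the change of variable x = e^(−t), under which the convolution int_x^1 (dy/y) f(y) g(x/y) becomes an ordinary convolution in t. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def laguerre(n):
    """Coefficients [c_0, c_1, ...] of L_n(t) = sum_k C(n, k) (-t)^k / k!."""
    return [Fraction((-1) ** k * comb(n, k), factorial(k)) for k in range(n + 1)]


def convolve(f, g):
    """Coefficients of (f * g)(t) = int_0^t f(s) g(t - s) ds for polynomials."""
    out = [Fraction(0)] * (len(f) + len(g))
    for a, fa in enumerate(f):
        for b, gb in enumerate(g):
            # int_0^t s^a (t - s)^b ds = a! b! / (a + b + 1)! * t^(a + b + 1)
            weight = Fraction(factorial(a) * factorial(b), factorial(a + b + 1))
            out[a + b + 1] += fa * gb * weight
    return out


def subtract(f, g):
    """Coefficients of f - g, padding the shorter polynomial with zeros."""
    size = max(len(f), len(g))
    f = f + [Fraction(0)] * (size - len(f))
    g = g + [Fraction(0)] * (size - len(g))
    return [x - y for x, y in zip(f, g)]


def rule_holds(n, m):
    """Check L_n * L_m == L_{n+m} - L_{n+m+1} coefficient by coefficient."""
    return convolve(laguerre(n), laguerre(m)) == subtract(
        laguerre(n + m), laguerre(n + m + 1)
    )


if __name__ == "__main__":
    print(all(rule_holds(n, m) for n in range(6) for m in range(6)))
```

Because the convolution of two basis functions is again a combination of two basis functions, convolving two Laguerre expansions reduces to simple sums over their coefficients.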
We first applied the method to NA4 data, with the help of Ruediger Voss
and Marc Virchaux, another premature loss in our community. The first
results were surprising: $\Lambda_{\rm QCD}$ in the minimal scheme was of the order of 200
MeV, instead of the usually quoted values around 130 MeV. We also applied
the same method to the analysis of the Charm data (although less precise
than the NA4 data), with similar findings. The higher value became the
standard one some time later. I regret not having pursued the application of the
method to more recent data, but both Wojtek and I moved away from
this subject. I kept working on structure functions with Ellis and
Furmanski [16], and in particular on higher-twist effects [17], a subject that the
higher momentum transfers reached by the experiments quickly made not so
relevant for the determination of the parton densities.
Parton distributions have been a subject of phenomenological studies in
their own right, not only because of their scaling violations. One of my first
papers [18] was about a two-stage model of parton distribution in a nucleon,
together with Cabibbo, Altarelli and Maiani: a recent paper with Ricco and
Simula [19] revisits some of those old ideas with very precise new data:
a “constituent quark” picture always seems to emerge from the data, without
yet a solid field theoretic description of its nature.
The values of parton densities were also investigated through lattice
simulations of QCD in the approximation of neglecting fermion loops [20, 21, 22,
23, 24, 25]. Only the first couple of moments could be evaluated. Today, we
are only a few years away from accurate predictions of moments, but delicate
issues, like the gluon content at low x, need an approach different from the
one based on the operator product expansion, and will not be addressed in a
short time.
I would like to end this brief commentary on my activity related to
structure functions with the sketch of an idea on which I am currently working:
looking for signatures of the quark–gluon plasma phase transition from a
sudden modification of the parton densities. The effect comes from a
power-suppressed contribution that occurs also in the absence of a new phase of nuclear
matter. Structure functions may get a contribution coming from the merging,
through a higher twist process, of parton densities belonging to different
nucleons that adds to the yield of ordinary parton densities. In the absence of a
dense state of matter, the effect is strongly suppressed by the inverse power
of the square momentum transfer, times the square of the typical distance of
nucleons inside a nucleus. The phase transition brings a nuclear density
higher than that of ordinary matter by a factor of 10, and enhances such an effect
by a factor of about 10. The effect remains small, but undergoes a jump, a signature
of nuclear matter at unusual density values. Only explicit phenomenological
calculations can decide the feasibility of this idea.
My collaboration with Gabriele was not confined to hard processes in
QCD, but extended to the study of low values of coupling constants running
through the renormalisation group equations, from the energy scales of
unification schemes and of the onset of string physics [26]. I hope I will be able
to keep discussing with Gabriele at CERN or elsewhere in the coming years,
always learning from him how complex problems can be approached without
prejudice, in a simple and physically intuitive manner.

References
1. D. Amati, R. Petronzio, G. Veneziano: Nucl. Phys. B 140, 54 (1978) 147
2. D. Amati, R. Petronzio, G. Veneziano: Nucl. Phys. B 146, 29 (1978) 147
3. H. D. Politzer: Nucl. Phys. B 129, 301 (1977) 147
4. T. Kinoshita: J. Math. Phys. 3, 650 (1962) 147
5. T. D. Lee, M. Nauenberg: Phys. Rev. 133, B1549 (1964) 147
6. G. Altarelli, G. Parisi: Nucl. Phys. B 126, 298 (1977) 147
7. L. N. Lipatov: Sov. J. Nucl. Phys. 20, 94 (1975) [Yad. Fiz. 20 (1974) 181] 147
8. V. N. Gribov, L. N. Lipatov: Sov. J. Nucl. Phys. 15, 438 (1972) [Yad. Fiz. 15
(1972) 781] 147
9. Y. L. Dokshitzer: Sov. Phys. JETP 46, 641 (1977) [Zh. Eksp. Teor. Fiz. 73
(1977) 1216] 147
10. G. Parisi, R. Petronzio: Phys. Lett. B 62, 331 (1976) 147
11. G. Curci, W. Furmanski, R. Petronzio: Nucl. Phys. B 175, 27 (1980) 148
12. A. H. Mueller: Phys. Rev. D 18, 3705 (1978) 148
13. W. Furmanski, R. Petronzio: Phys. Lett. B 97, 437 (1980) 148
14. E. G. Floratos, D. A. Ross, C. T. Sachrajda: Nucl. Phys. B 152, 493 (1979) 148
15. W. Furmanski, R. Petronzio: Nucl. Phys. B 195, 237 (1982) 148
16. R. K. Ellis, W. Furmanski, R. Petronzio: Nucl. Phys. B 207, 1 (1982) 148
17. R. K. Ellis, W. Furmanski, R. Petronzio: Nucl. Phys. B 212, 29 (1983) 148
18. G. Altarelli, N. Cabibbo, L. Maiani, R. Petronzio: Nucl. Phys. B 92, 413 (1975)
149
19. R. Petronzio, S. Simula, G. Ricco: Phys. Rev. D 67, 094004 (2003) [Erratum-
ibid. D 68, 099901 (2003)] 149
20. M. Guagnelli, K. Jansen, F. Palombi, R. Petronzio, A. Shindler, I. Wetzorke
[Zeuthen-Rome (ZeRo) Collaboration]: Eur. Phys. J. C 40, 69 (2005) 149
21. M. Guagnelli, K. Jansen, F. Palombi, R. Petronzio, A. Shindler, I. Wetzorke
[Zeuthen-Rome (ZeRo) Collaboration]: Phys. Lett. B 597, 216 (2004) 149
22. M. Guagnelli, F. Palombi, R. Petronzio, K. Jansen, A. Shindler, I. Wetzorke
[Ze-Ro Zeuthen-Roma Collaboration]: Eur. Phys. J. A 17, 365 (2003). 149
23. M. Guagnelli, K. Jansen, F. Palombi, R. Petronzio, A. Shindler, I. Wetzorke
[Zeuthen-Rome/ZeRo Collaboration]: Nucl. Phys. B 664, 276 (2003) 149
24. F. Palombi, R. Petronzio, A. Shindler: Nucl. Phys. B 637, 243 (2002) 149
25. A. Bucarelli, F. Palombi, R. Petronzio, A. Shindler: Nucl. Phys. B 552, 379
(1999) 149
26. R. Petronzio, G. Veneziano: Mod. Phys. Lett. A 2, 707 (1987) 149
Infrared-sensitive Physics in QCD
and in Electroweak Theory

M. Ciafaloni

Dipartimento di Fisica, Università di Firenze, Italy and INFN, Sezione di Firenze, Italy

Abstract. I recall the main ideas about the treatment of QCD infrared physics, as
developed in the late 1970s, and I outline some novel applications of those ideas to
Electroweak Theory.

1 Infrared-sensitive Observables

The high-energy physics of elementary particles, as described by the Standard
Model, gives particular emphasis to states constructed out of massless partons
or leptons, because of either the original gauge symmetry, or of the QCD
chiral symmetry. This in principle introduces a number of problems because
of the existence of mass singularities in gauge theories – that is, of infrared
and collinear divergences due to the initial or final states being massless. Of
course, physical states yield finite cross sections because of QCD confinement,
or of electroweak symmetry breaking, or of QED coherent states. However,
a remnant of the mass singularities of the problem is that the cross section,
besides being dependent on energy and momentum transfers of the process at
hand, may also depend on energy through large logarithmic variables, involv-
ing some infrared-sensitive mass parameters.
In QCD, avoiding large parameters is vital for the perturbative description
of hard processes, characterized by probe(s) with large momentum transfer(s)
Q and by a supposedly small coupling. Therefore, the cross section must
be infrared safe, that is, sufficiently inclusive in order to cancel the mass
singularities according to the KLN and/or Bloch–Nordsieck (BN) theorems
[1, 2]. As a consequence, fully inclusive processes are truly perturbative, while
the inclusive processes in which some partons of virtuality Q0 are looked
at (in the initial or final state) show anomalous dimensions [3]. However,
observables in which soft emission is suppressed (e.g., at the boundary of the
phase space) or emphasized (e.g., of multiplicity type) are infrared sensitive
[4], and still contain parametrically large logarithms of infrared origin, because
of an incomplete cancellation of virtual corrections with real emission.

M. Ciafaloni: Infrared-sensitive Physics in QCD and in Electroweak Theory, Lect. Notes Phys. 737, 151–158 (2008)
DOI 10.1007/978-3-540-74233-6_8 © Springer-Verlag Berlin Heidelberg 2008
The above observation raises a problem for quite interesting observables
(like pT-form factors and jet multiplicity distributions), but also indicates how
to solve it because we know that the infrared behaviour is largely universal
due to the QED factorization theorem [1] and generalizations thereof. This
fact triggered, in the late 1970s, a number of seminal papers dealing with
factorization of the collinear behaviour [5], form factor resummation [6], pre-
confinement [7], jet evolution [8] and multiplicities [9]. It also appeared that
one could describe in full the final state [10] at the level of partons with off-
shellness Q0 much smaller than Q but still large with respect to Λ, the QCD
scale, thus providing a ground for event generators [11].
All the above papers are largely based on factorization theorems for vari-
ous hard processes, and gradually introduce generalized renormalization group
techniques in order to predict the logarithmic dependence on the infrared-sensitive
parameters at leading-logarithm anomalous dimension level, extended,
by further analysis [12], to the subleading ones. The factorization properties
are in turn dependent on the cancellation of truly infrared divergent contri-
butions for all such processes, which requires a generalized Bloch-Nordsieck
theorem to be valid in QCD, as better established in the 1980s [13]. In fact,
the BN theorem states that a cross section which is inclusive over soft final
states is also infrared safe, irrespective of the fixed, possibly degenerate initial
state. In this form, the theorem is not automatically valid, because the non-
abelian nature of QCD allows degenerate initial states in a multiplet, which
have different charges and thus in general different cross sections for the same
momentum configuration. This spoils the cancellation of virtual corrections
with real emission when summing over final soft states, unless an average over
initial colour is performed in order to restore the BN theorem. Fortunately,
this averaging is automatic because of QCD confinement, which allows only
colour singlet asymptotic states.
The ideas above have been refined over the years in QCD, leading to an
approximate treatment of coherence effects by angular ordering in jet evolu-
tion [14], and to a more general treatment of subleading logarithms in form
factor calculations [15]. Recently, they have also led to a new interesting development
in electroweak theory. Naïvely, one would say that in the latter
case the infrared structure is irrelevant because of the spontaneously broken
gauge symmetry, which provides a mass for weak bosons and for fermions.
However, with the advent of teravolt scale accelerators, we shall soon have
access to energies which are much larger than the symmetry breaking scale
(say, the W mass) which may act as infrared cut-off and thus give rise to para-
metrically large infrared logarithms in the energy dependence, in addition to
the ones of collinear origin. That this is indeed the case was first remarked in
the late nineties [17] and soon applied to inclusive observables [18]. The fail-
ure of the BN theorem is due again to the nonabelian nature of electroweak
theory, where now no averaging over flavour is possible, because the initial
state consists of electrons, protons, and so on, each of them having a nontriv-
ial weak isospin charge. This also means that double logarithms depending
on the electroweak scale affect most cross sections which are apparently infrared
safe, so that electroweak radiative corrections are enhanced, sometimes
comparable to QCD ones, and must be carefully evaluated in a unified way.
My purpose in this note is to outline, in a few examples, how the novel
ideas of the 1970s allow one to understand the physics of large logarithms for both
QCD and electroweak theory, thus turning a potential problem into a powerful
tool. They also lead to a precise calculational framework for the logarithmic
energy dependence, for which I refer to the reviews already mentioned [4, 14],
and to further dedicated papers [15, 16].

2 QCD Form Factors, Multiplicities, Preconfinement

2.1 Form Factors

An early consequence of the understanding of infrared and collinear behaviours
in QCD was the remark [6, 7, 8, 9, 10] that observables where real
emission is suppressed are sensitive to the (square of) the partons’ Sudakov
form factor. The latter is evaluated, at leading logarithmic level, by an evolution
equation in $\mu^2$ (the parton virtuality) which is derived by a dispersive
argument [4, 6], or by applying [19] Gribov’s generalization of the Low theorem
[20] as follows:

$$\frac{d \log F_a(Q^2,\mu^2)}{d \log \mu^2} = \frac{\alpha_s(\mu^2)}{2\pi}\, C_a \log\frac{Q^2}{\mu^2}, \qquad (1)$$

where $C_a = C_F, C_A$ is the Casimir charge of parton $a = q, g$. Note that $\mu^2 > Q_0^2$
plays the role of cut-off for an infrared divergent anomalous dimension,
so that $F_a$ shows an exponential suppression which, in the frozen $\alpha_s$ limit,
involves two logarithms per power of $\alpha_s$, one of collinear type and the other
of infrared origin. In the case of physical observables, the cut-off on $\mu^2$ should
be replaced by a parameter which regulates real emission, like $Q^2/N$ for the
parton probability density functions (PDFs) at large moment index $N$, or
$1/B^2$ for impact parameter distributions. The outcome is the characteristic
large-$N$ dependence of PDFs for deep inelastic scattering (DIS) and for the
Drell–Yan processes and the corresponding $p_T$-distributions.
For instance, the DIS structure function $F_N(Q^2)$ allows real emission up
to gluon momentum fraction $z < 1/N$, and this regulates the anomalous
dimension of (1) in the form

$$F_N(Q^2) \simeq \exp\left[-\frac{C_F}{\pi}\int_{Q_0^2}^{Q^2}\frac{d\mu^2}{\mu^2}\,\alpha_s(\mu^2)\,\log \mathrm{Min}\left(\frac{Q^2}{\mu^2}, N\right)\right]. \qquad (2)$$

We can see that the anomalous dimension becomes finite and of $\log N$ type for
$\mu^2 < Q^2/N$, while the “exclusive” limit is reached for $N = Q^2/Q_0^2$, in which
case (2) reduces to $F_q^2(Q^2, Q_0^2)$, where $Q_0$ is the minimal quark virtuality.
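The frozen-coupling content of (1) and (2) can be checked numerically. The sketch below is only illustrative: it assumes a constant αs, the normalization Fa = 1 at μ² = Q², and hypothetical function names and input values that are not taken from the text. It integrates the exponent of (2) in log μ² and compares it with the closed form implied by the Min prescription, −(CF αs/π)(L log N − ½ log² N) with L = log(Q²/Q0²), valid for 1 ≤ N ≤ Q²/Q0².

```python
import math

def log_sudakov(q2, q02, casimir, alpha_s):
    """Frozen-coupling solution of (1): log F_a(Q^2, Q0^2).

    Assumption: F_a = 1 at mu^2 = Q^2, so the result is -C_a alpha_s L^2 / (4 pi).
    """
    big_log = math.log(q2 / q02)
    return -casimir * alpha_s / (4.0 * math.pi) * big_log ** 2

def dis_exponent(q2, q02, n, alpha_s, c_f=4.0 / 3.0, steps=20000):
    """Exponent of (2), by midpoint integration in log mu^2 (frozen coupling)."""
    lo, hi = math.log(q02), math.log(q2)
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        mu2 = math.exp(lo + (j + 0.5) * h)
        total += math.log(min(q2 / mu2, n)) * h
    return -c_f * alpha_s / math.pi * total

def dis_exponent_closed(q2, q02, n, alpha_s, c_f=4.0 / 3.0):
    """Closed form of the same exponent for 1 <= N <= Q^2 / Q0^2."""
    big_log = math.log(q2 / q02)
    log_n = math.log(n)
    return -c_f * alpha_s / math.pi * (big_log * log_n - 0.5 * log_n ** 2)
```

For N = Q²/Q0² the exponent equals twice `log_sudakov` evaluated with CF, which is the reduction to the squared quark form factor stated above; for smaller N the double logarithm is replaced by a single logarithm times log N.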
2.2 Multiplicities

Actually, the idea underlying [7, 10] is to describe outgoing hadronic jets in
semi-inclusive form, at the level of partons of virtuality Q0 > Λ, the de-
cay products of the latter being summed over. Here a problem of consis-
tency arises, because Q0 is a somewhat arbitrary scale, and hadronic distri-
butions should be independent of it. Fortunately, two important properties
help. Firstly, multiplicity distributions show a factorized Q-dependence with
respect to the Q0 dependence and, secondly, preconfinement holds, namely
the average mass of “minimal” colour singlets connected to a q − q̄ pair is of
order Q0 , much smaller than Q. This means that jet evolution can be viewed
in two steps, a perturbative QCD evolution from Q down to Q0 (of order
Λ) and a hadronization process at scale Q0 . Thus, the virtue of factorization
and preconfinement is that the conversion into hadrons does not affect the
Q-dependence, and occurs at a much lower scale.
Of course, the infrared analysis is essential in order to derive the above
properties. Factorization of multiplicity distributions is argued for by resumming
the double-log Feynman-x dependence of jet distribution functions in
the soft region, which eventually leads to a finite anomalous dimension with a
singular $\alpha_s$-dependence [9, 10] of type $\gamma_0 \simeq \sqrt{N_c\alpha_s/2\pi}$ [19, 21]. Correspondingly,
the average hadronic jet multiplicity has the behaviour

$$\bar n(Q^2) \sim \exp\int_0^t dt\,\gamma_0(\alpha_s(t)) \simeq \exp\sqrt{\frac{2N_c}{\pi b}\log\frac{Q^2}{\Lambda^2}}, \qquad (3)$$

and thus grows more rapidly than any power of $\log(Q^2/\Lambda^2) = t$.
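The second step in (3) can be verified directly if one assumes the one-loop form αs(t) = 1/(b t), which the appearance of b in (3) suggests but which is not spelled out in the text. The following sketch, with hypothetical function names and an assumed value of b, integrates γ0 numerically and compares the result with the square-root exponent of (3).

```python
import math

def gamma0(alpha_s, n_c=3):
    """Anomalous dimension of type sqrt(N_c alpha_s / (2 pi))."""
    return math.sqrt(n_c * alpha_s / (2.0 * math.pi))

def alpha_s_one_loop(t, b):
    """Assumed one-loop coupling, alpha_s(t) = 1 / (b t) with t = log(Q^2 / Lambda^2)."""
    return 1.0 / (b * t)

def log_multiplicity_numeric(t, b, n_c=3, steps=1000):
    """Integral of gamma0 from 0 to t.

    The substitution t' = u^2 removes the integrable 1/sqrt(t') endpoint
    singularity, so a plain midpoint rule in u is sufficient.
    """
    h = math.sqrt(t) / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        total += gamma0(alpha_s_one_loop(u * u, b), n_c) * 2.0 * u * h
    return total

def log_multiplicity_closed(t, b, n_c=3):
    """Exponent on the right-hand side of (3)."""
    return math.sqrt(2.0 * n_c * t / (math.pi * b))

# Assumed value: b = (11 N_c - 2 n_f) / (12 pi) for N_c = 3, n_f = 5.
B_ONE_LOOP = 23.0 / (12.0 * math.pi)
```

Since the exponent grows like the square root of t, the multiplicity itself rises faster than any fixed power of t but more slowly than any power of Q².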
The behaviour (3) is remarkably different from the one of QED radiation,
essentially because of the gluon charge, implying that the QCD jet evolution
is a branching process, leading to a cascade, rather than a bremsstrahlung
process off one leg, as in QED. Correspondingly, strong correlations of the final
soft partons are present, leading to an approximate KNO scaling of “exclusive”
n-parton emission probabilities, which for a gluon jet have the form [4]
$$\frac{\sigma_n}{\sigma_{\rm jet}} \simeq \frac{1}{\bar n}\exp\left[-\frac12\left(\log\frac{n}{\bar n}\right)^2\right], \qquad (n \ll \bar n). \qquad (4)$$
This result shows that the approximate proportionality of the $\sigma_n$'s in a
gluon jet to the corresponding form factor (1) still holds, at double-log level,
as for the electron in QED, but their relationship to the average multiplicity
(3) – in the frozen $\alpha_s$ limit – is quite different from QED because of the QCD
cascade.

2.3 Preconfinement

On the other hand, preconfinement [7] follows from a veto on the possible final
states which are allowed in the minimal colour singlets in which, by definition,
a U(3) colour line connects a quark of offshellness $Q_0$ to the corresponding
antiquark. Because of factorization, and of the veto, the inclusive mass distribution
of minimal singlets being produced in a jet of mass up to Q is
independent of Q and is instead sensitive to the quark form factor, as follows
[7, 10]:

$$\frac{M^2}{\sigma_{\rm jet}}\frac{d\sigma}{dM^2} \sim F_q^2(M^2, Q_0^2), \qquad (5)$$
so that its average mass is of order $Q_0$. Therefore, the conversion of partons
into hadrons can occur by an interaction of partons which are close in phase
space, leading to the so-called local parton–hadron duality [22], and to the
possibility of building event generators with relatively simple hadronization
models [11, 23].

3 Inclusive Electroweak Double Logarithms

The infrared physics outlined above relies on the BN cancellation of virtual
and real emission singularities, which in QCD occurs because of the colour
averaging in the initial state, as remarked above. Therefore, the form factor
behaviour of type (1) shows up only if some veto uncovers the “exclusive” limit
of the given hard process. On the other hand, in electroweak (EW) theory the
BN theorem fails because of the flavour charges of the accelerator beams.
For instance, the total cross section for e+ e− annihilation into hadrons is an
infrared safe observable from the QCD standpoint, but carries nevertheless
EW double logarithms, embodied into an enhanced effective coupling
$$\alpha_{\rm eff}(s) = \frac{\alpha_W}{4}\left(\log\frac{s}{M_W^2}\right)^2, \qquad (6)$$
which is of order 0.2 in the teravolt energy range and leads, therefore, to
sizeable corrections, of the same order as QCD ones. Besides the expected
collinear logarithm, the expression (6) carries an additional one, of infrared
origin, due to the violation of the BN theorem.
The analysis of such inclusive double logarithms [18] involves form factors
of type (1), where now $\mu^2$ is cut off by the EW scale $M_W^2 \simeq M_Z^2 = M^2$
and the Casimir $C_a$ refers to the isospin $I$ representation $a = I = 0, 1, \ldots$
in the t-channel of the lepton–antilepton overlap matrix. For instance, the
combinations $\sigma_{e^-\nu} \pm \sigma_{e^-e^+}$ correspond to $I = 0$ ($I = 1$), so that

$$\sigma_{e^+e^-}(s, M^2) \simeq \frac12\left(\sigma_0 - \sigma_1 F_1(s, M^2)\right) \simeq \frac12\left(\sigma_0 - \sigma_1 \exp\left(-2\,\frac{\alpha_{\rm eff}(s)}{\pi}\right)\right), \qquad (7)$$
where $\sigma_0$ corresponds to the isospin averaged cross section and has therefore no
double logarithms, while the antisymmetric combination $\sigma_1$ is damped by the
$I = 1$ form factor, with $C_1 = 2$. We note that, because of the optical theorem,
the inclusive form factor is not squared, though referring to a physical cross
section in the crossed channel. Note also that in this example σ1 > 0, because
the neutrino cross section is larger, and therefore the σe+ e− /σ0 ratio increases
in the teravolt energy range towards its high-energy limit, which is provided
by the flavour average.
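The orders of magnitude quoted for (6) and (7) can be reproduced with a few lines of arithmetic. The sketch below is a rough illustration under stated assumptions: the values of αW and MW are inserted by hand (αW ≈ α/sin²θW ≈ 0.034, MW ≈ 80.4 GeV), the ratio σ1/σ0 is a free hypothetical input, and the function names are not from the text.

```python
import math

ALPHA_W = 0.034   # assumed SU(2) coupling, roughly alpha / sin^2(theta_W)
M_W_GEV = 80.4    # assumed W mass in GeV

def alpha_eff(sqrt_s_gev, alpha_w=ALPHA_W, m_w_gev=M_W_GEV):
    """Effective double-log coupling of (6)."""
    return alpha_w / 4.0 * math.log(sqrt_s_gev ** 2 / m_w_gev ** 2) ** 2

def form_factor_i1(sqrt_s_gev):
    """I = 1 form factor appearing in (7), exp(-2 alpha_eff / pi)."""
    return math.exp(-2.0 * alpha_eff(sqrt_s_gev) / math.pi)

def sigma_ee_over_sigma0(sqrt_s_gev, sigma1_over_sigma0):
    """Ratio sigma_{e+e-} / sigma_0 from (7); sigma1_over_sigma0 is a hypothetical input."""
    return 0.5 * (1.0 - sigma1_over_sigma0 * form_factor_i1(sqrt_s_gev))
```

With these inputs `alpha_eff(1000.0)` comes out close to 0.2, in line with the estimate given after (6), and for any positive σ1/σ0 the ratio returned by `sigma_ee_over_sigma0` rises with energy, as stated in the text.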
The above description can be generalized, by collinear factorization, to
single logarithmic level and to a generic overlap matrix involving leptons and
partons in the initial states, thus coupling the EW and QCD sectors of the
Standard Model. The result of this procedure is a set of evolution equations
in $\mu^2$ which are similar to the DGLAP equations [24], except that evolution
kernels exist in the channels with $I \neq 0$ also, and are infrared singular or, in
other words, depend on a logarithmic cut-off, much as in (1). For instance, in
the evolution of lepton densities $f_l$ and boson densities $f_b$, the $I = 0$ evolution
kernels coincide with the customary DGLAP splitting functions $P_{ba}$, while the
I = 1 ones involve the cut-off-dependent virtual kernels

$$P_f^V = \delta(1-z)\left(-\log\frac{Q^2}{\mu^2} + \frac32\right), \qquad P_b^V = \delta(1-z)\left(-\log\frac{Q^2}{\mu^2} + \frac{11}{6} - \frac{n_f}{6}\right). \qquad (8)$$
The corresponding evolution equations have the form
$$-\frac{d f_a^1}{d\log\mu^2} = \frac{\alpha_W}{2\pi}\, f_a^1 P_a^V + \text{regular terms}, \qquad (9)$$
and have been described in fully coupled form in [25]. Here I just note that
(9) shows a Sudakov behaviour similar to (1) and is consistent with (7) after
taking into account the antilepton evolution, which doubles the virtual kernel.
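The consistency between (9) and (7) can be made explicit at double-log level. The short derivation below is a sketch under simplifying assumptions that are not stated in the text: αW is frozen, only the logarithmic term of the lepton virtual kernel in (8) is kept, the hard scale is identified with Q² = s, and the density is normalized at μ² = s and evolved down to the cut-off M².

```latex
% Double-log part of (9) for a lepton density, keeping only -log(Q^2/mu^2) from (8):
\frac{d\log f_l^{1}}{d\log\mu^2} \simeq \frac{\alpha_W}{2\pi}\,\log\frac{s}{\mu^2}
% Integrating from mu^2 = M^2 to mu^2 = s with frozen alpha_W:
\log\frac{f_l^{1}(M^2)}{f_l^{1}(s)}
  \simeq -\frac{\alpha_W}{2\pi}\int_{\log M^2}^{\log s} d\log\mu^2\,\log\frac{s}{\mu^2}
  = -\frac{\alpha_W}{4\pi}\,\log^2\frac{s}{M^2}
  = -\frac{\alpha_{\rm eff}(s)}{\pi}
% Lepton and antilepton each contribute one such factor:
\exp\left(-\frac{\alpha_{\rm eff}(s)}{\pi}\right)^{2}
  = \exp\left(-2\,\frac{\alpha_{\rm eff}(s)}{\pi}\right) = F_1(s, M^2)
```

The last line reproduces the exponent of (7) with C1 = 2, the factor of two arising from the doubling of the virtual kernel.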
The presence of inclusive double logarithms in spontaneously broken gauge
theories remains an intriguing subject. It is mostly an initial state effect and, as
such, it is present for any final states of the same class (e.g., flavour blind) and
strongly depends on the accelerator beams. Leptonic accelerators maximize
it, while hadronic ones (like LHC) provide some partial average on the initial
partonic flavours, thus decreasing it. But the effect appears also if the flavour
charges are looked at in the final state instead of the initial state, for instance,
in gluon fusion processes in which some W ’s are observed [26]. Furthermore,
the effect occurs whenever the soft boson emission mixes several degenerate
states having different hard cross sections. Nonabelian theories have it because
of the nontrivial multiplets, but also a broken abelian theory shows it whenever
the mass eigenstates are not charge eigenstates [27]. An example of the latter
type is the mixing of the Higgs boson with the longitudinal gauge boson
occurring in a U (1) theory. The Standard Model shows both kinds of effects
and, given their magnitude in (6), I think that the coupled evolution equations
of parton–lepton distribution functions [25] deserve by now a quantitative
study at the teravolt scale.
Perhaps, the most important lesson to be learned from several decades of
investigation of infrared-sensitive high-energy physics is that, even at the level
of hard processes, the fundamental interactions look much more intertwined,
due to the large time nature of asymptotic states which possibly increases their
effective couplings. By the same token, because of the large times involved,
factorization theorems are at work and allow a good understanding of the
infrared dynamics. It remains true, however, that a unified treatment of all
degrees of freedom is needed already at Standard Model level – that is, even
before discovery of a possible short-distance unification.

Acknowledgements

It is a pleasure to thank Gabriele, as a collaborator and as a friend, for sharing
over many years and subjects the excitement of long discussions and, sometimes,
of real understanding. I also warmly thank old and new teams on this
subject, in particular Stefano, and Paolo and Denis, for various updates of
the picture presented here. Finally, I am grateful to the CERN Theory Divi-
sion for hospitality while this work was being completed, and to the Italian
Ministry of University and Research for a PRIN grant.

References
1. F. Bloch, A. Nordsieck: Phys. Rev. 52, 54 (1937);
V. V. Sudakov: Sov. Phys. JETP 3, 65 (1956);
D. R. Yennie, S. C. Frautschi, H. Suura: Ann. Phys. 13, 379 (1961) 151, 152
2. T. Kinoshita: J. Math. Phys. 3, 650 (1962);
T. D. Lee, M. Nauenberg: Phys. Rev. 133, B1549 (1964) 151
3. Y. L. Dokshitzer, D. Dyakonov, S. I. Troyan: Phys. Rep. 58, 269 (1980);
A. H. Mueller: Phys. Rep. 73, 237 (1981);
G. Altarelli: Phys. Rep. 81, 1 (1982) 151
4. A. Bassetto, M. Ciafaloni, G. Marchesini: Phys. Rep. 100, 201 (1983) 151, 153, 154
5. D. Amati, R. Petronzio, G. Veneziano: Nucl. Phys. B 140, 54 (1978) and B 146,
29 (1978);
R. K. Ellis, H. Georgi, M. Machacek, H. D. Politzer, G. C. Ross: Phys. Lett.
B 78, 281 (1978); Nucl. Phys. B 152, 285 (1979) 152
6. D. Amati, A. Bassetto, M. Ciafaloni, G. Marchesini, G. Veneziano: Nucl. Phys.
B 173, 429 (1980);
G. Parisi, R. Petronzio: Nucl. Phys. B 154, 427 (1979);
G. Parisi: Phys. Lett. B 90, 295 (1980);
G. Curci, M. Greco: Phys. Lett. B 92, 175 (1980) 152, 153
7. D. Amati, G. Veneziano: Phys. Lett. B 83, 87 (1979) 152, 154, 155
8. K. Konishi, A. Ukawa, G. Veneziano: Nucl. Phys. B 157, 45 (1979);
A. Bassetto, M. Ciafaloni, G. Marchesini: Phys. Lett. B 86, 366 (1979) 152
9. A. Bassetto, M. Ciafaloni, G. Marchesini: Phys. Lett. B 83, 207 (1979);
W. Furmanski, R. Petronzio, S. Pokorski: Nucl. Phys. B 155, 253 (1979) 152, 154
10. A. Bassetto, M. Ciafaloni, G. Marchesini: Nucl. Phys. B 163, 477 (1980) 152, 154, 155
11. G. C. Fox, S. Wolfram: Nucl. Phys. B 168, 285 (1980);
R. Odorico: Nucl. Phys. B 172, 157 (1980); Phys. Lett. B 102, 341 (1981);
G. Marchesini, B. R. Webber: Nucl. Phys. B 238, 1 (1984) 152, 155
12. J. C. Collins, D. E. Soper: Nucl. Phys. B 193, 381 (1981), B 194, 445 (1982);
Nucl. Phys. B 197, 446 (1982);
A. Sen: Phys. Rev. D 24, 3281 (1981);
S. Mukhi, G. Sterman: Nucl. Phys. B 206, 221 (1982);
J. Kodaira, L. Trentadue: Phys. Lett. B 112, 66 (1982) 152
13. R. Doria, J. Frenkel, J. C. Taylor: Nucl. Phys. B 168, 93 (1980);
G. T. Bodwin, S. J. Brodsky, G. P. Lepage: Phys. Rev. Lett. 47, 1799 (1981);
A. H. Mueller: Phys. Lett. B 108, 355 (1982);
W. W. Lindsay, D. A. Ross, C. T. Sachrajda: Nucl. Phys. B 214, 61 (1983);
P. H. Sörensen, J. C. Taylor: Nucl. Phys. B 238, 284 (1984);
S. Catani, M. Ciafaloni, G. Marchesini: Phys. Lett. B 168, 284 (1986); Nucl.
Phys. B 264, 588 (1986) 152
14. See Yu. L. Dokshitzer, V. A. Khoze, S. I. Troyan, A. H. Mueller: Rev. Mod.
Phys. 60, 373 (1988) and references therein 152, 153
15. G. Sterman: Nucl. Phys. B 281, 310 (1987);
S. Catani, L. Trentadue: Nucl. Phys. B 327, 323 (1989) 152, 153
16. For an updated list of references, see, e.g., S. Catani et al.: Proceedings of the
CERN Workshop on Standard Model Physics (and More) at the LHC, ed. by
G. Altarelli, M. L. Mangano (CERN, Geneva 2000), Sect. 5 153
17. P. Ciafaloni, D. Comelli: Phys. Lett. B 446, 278 (1999); Phys. Lett. B 476, 49
(2000);
V. S. Fadin, L. N. Lipatov, A. D. Martin, M. Melles: Phys. Rev. D 61, 094002
(2000);
M. Hori, H. Kawamura, J. Kodaira: Phys. Lett. B 491, 275 (2000);
J. H. Kuhn, S. Moch, A. A. Penin, V. A. Smirnov: Nucl. Phys. B 616, 286
(2001) 152
18. M. Ciafaloni, P. Ciafaloni, D. Comelli: Phys. Rev. Lett. 84, 4810 (2000); Nucl.
Phys. B 589, 359 (2000) 152, 155
19. B. I. Ermolaev, V. S. Fadin: JETP Lett. 33, 269 (1981);
V. S. Fadin: Yad. Fiz. 37, 408 (1983) 153, 154
20. F. Low: Phys. Rev. 110, 974 (1958);
V. N. Gribov: Sov. J. Nucl. Phys. 5, 399 (1967) 153
21. A. H. Mueller: Phys. Lett. B 104, 161 (1981) 154
22. See Ya. I. Azimov, Y. L. Dokshitzer, V. A. Khoze, S. I. Troyan: Z. Phys. C 27,
65 (1989) and references therein 155
23. For an overview see, e.g., G. Marchesini: From QCD Lagrangian to Monte Carlo
Simulation, this volume 155
24. V. N. Gribov, L. N. Lipatov: Sov. J. Nucl. Phys. 15, 438 (1972);
G. Altarelli, G. Parisi: Nucl. Phys. B 126, 298 (1977);
Y. L. Dokshitzer: Sov. Phys. JETP 46, 641 (1977) 156
25. M. Ciafaloni, P. Ciafaloni, D. Comelli: Phys. Rev. Lett. 88, 102001 (2002);
P. Ciafaloni, D. Comelli: JHEP 0511, 039 (2005) 156
26. P. Ciafaloni, D. Comelli: JHEP 0609, 055 (2006) 156
27. M. Ciafaloni, P. Ciafaloni, D. Comelli: Phys. Rev. Lett. 87, 211802 (2001) 156
From QCD Lagrangian
to Monte Carlo Simulation

G. Marchesini

Dipartimento di Fisica, Università di Milano-Bicocca, Milano, Italy
and INFN, Sezione di Milano-Bicocca, Milano, Italy

Abstract. I discuss old and recent aspects of quantum chromodynamics (QCD) jet
emission and describe how hard QCD results are used to construct Monte Carlo
programs for generating hadron emission in hard collisions. I focus on the program
HERWIG at the Large Hadron Collider (LHC).

1 The Status
The LHC is a discovery machine: it is expected to tell us how to complete the unified
theory of elementary interactions. New (heavy) particles are searched for to
indicate/confirm new symmetries. Events with heavy particles are expected
to be accompanied by an intense emission of hadrons at short distances, and
this is the domain of perturbative QCD. Therefore, to identify and understand
non-standard events a quantitative knowledge of the characteristics of
the hard radiation is strongly needed. In 1973 QCD was at the frontier of
particle physics (discovery of asymptotic freedom [1] and beginning of quantitative
QCD studies); now, in 2007, QCD is at the centre of particle studies.
The Monte Carlo programs for jet emissions [2, 3, 4] are important instru-
ments for analysing standard and non-standard short distance events. They
are the Summa of most QCD theoretical results and many present studies aim
to improve their quantitative predictions. Thanks to the QCD factorization
structure [5], Monte Carlo programs can be interfaced with hard cross sec-
tions involving also non-QCD processes (electroweak, supersymmetric, extra
dimension, black holes, ...). In this way, Monte Carlo generators can describe
both QCD and non-QCD events at short distances.
In this paper I describe the main QCD results which enter the construction
of a Monte Carlo generator. They are so many that most of the key points will
be recalled in a schematic way, but I hope that this short description could
provide an idea of the reliability range of the Monte Carlo generators. For a
more detailed description see [6]. Here, aiming to be simple and synthetic, I
follow a personal point of view and the focus will be on the Monte Carlo event

G. Marchesini: From QCD Lagrangian to Monte Carlo Simulation, Lect. Notes Phys. 737, 159–180 (2008)
DOI 10.1007/978-3-540-74233-6_9 © Springer-Verlag Berlin Heidelberg 2008
generator HERWIG [2]. Its general structure is similar to other important
Monte Carlo generators [3, 4]. In Sect. 2, I present the scheme of the operations
performed by Monte Carlo codes for LHC. The fact that the generation of
events can be subdivided into successive stages is physically based on QCD
factorization properties. The theoretical bases are discussed/recalled in Sect. 3.
In Sect. 4, I discuss the multi-gluon soft distributions and in Sect. 5, I describe
in detail a Monte Carlo code for soft emissions. Although important non-soft
contributions included in a realistic Monte Carlo are omitted here, it provides
a simple example containing many important physical effects. In Sect. 6, I
discuss non-perturbative effects which enter the Monte Carlo generators. The
last section contains final considerations.

2 Structure of Monte Carlo generator

I start by describing schematically the way a Monte Carlo code is organized in
order to generate hard QCD and non-QCD events at LHC. As a specific illustration
I consider the emission of two jets with high $E_T$. This process is
factorized into the elementary hard distribution, the parton densities (struc-
ture functions as in DIS (deep inelastic scattering)) and the fragmentation
functions (as in e+ e− ):

[Figure: two-jet emission in a pp collision at the hard scale Q = E_T, indicating the elementary hard distribution, the structure functions and the fragmentation functions.]

Here are the necessary factorized steps:

• start from the hard elementary distribution $\hat\sigma_{ab\to cd}$ with ab the incoming
and cd the two outgoing partons. This hard distribution corresponds
to QCD jet emission at high ET . Here one can substitute distributions
for other QCD or non-QCD processes. There are many studies of hard
distribution for processes relevant for LHC, see [7].
• generate the momenta of the hard incoming (ab) and outgoing (cd) partons
(and possible non-QCD particles). Given the hard scales $E_T$ (and possible
heavy masses), the momenta are generated (via importance sampling)
in computing the total cross section as convolutions of the elementary
distribution and the parton densities (structure functions);
• use the initial state space-like evolution (which at the inclusive level gives
the structure functions) to generate the “bremsstrahlung” of outgoing ini-
tial state partons k1 , k2 · · · . This requires imposing a minimal transverse
momentum w.r.t. the collision direction;
• given the outgoing hard QCD partons cd and k1 , k2 · · · , start the QCD
shower (parton multiplication). First, from the set of these partons, iden-
tify their colour connections and reconstruct the set of the various pri-
mary q q̄ dipoles. Here one works in the large Nc approximation so that
a gluon, from the colour point of view, can be represented as a pair of
quark–antiquark lines, and is then associated to two dipoles;
• generate, for each primary dipole, the multi-parton emission according to
the coherent branching structure that will be illustrated in the following.
This requires imposing a lower bound on the relative transverse momenta
of final state partons (inside sub-jets);
• match with the exact high-order calculation, if available. It consists in
weighting the generated event by comparing [8] the Monte Carlo distribution
and the exact squared matrix element computed to higher order
[7];
• given the system of all emitted partons, generate the final hadrons by using
a hadronization model making hadrons out of partons. Using hadronization
models based on colour connections and preconfinement [9], such a process
should not substantially modify [10] the structure of the hadronic radiation
with respect to the partonic one which has been obtained in the previous
steps.
In the next sections I describe the QCD basis of these steps.
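The importance sampling used in the second step of the list above can be illustrated on a one-dimensional toy problem. The sketch below is not taken from HERWIG or any other generator: the integrand (1 − x)³/x is a hypothetical stand-in for a small-x enhanced parton density, and all names and parameter values are assumptions made for the example. Points are drawn with density proportional to 1/x, so that the steep part of the integrand is absorbed into the sampling and the remaining weights are bounded.

```python
import math
import random

def toy_integrand(x):
    """Hypothetical small-x enhanced shape, (1 - x)^3 / x; not a real parton density."""
    return (1.0 - x) ** 3 / x

def exact_integral(x_min):
    """Analytic integral of the toy integrand from x_min to 1."""
    def antiderivative(x):
        return math.log(x) - 3.0 * x + 1.5 * x ** 2 - x ** 3 / 3.0
    return antiderivative(1.0) - antiderivative(x_min)

def importance_sampled_integral(x_min, n_samples, seed=1):
    """Monte Carlo estimate using points distributed as 1/x on (x_min, 1]."""
    rng = random.Random(seed)
    log_range = math.log(1.0 / x_min)
    total = 0.0
    for _ in range(n_samples):
        x = x_min ** rng.random()                    # density p(x) = 1 / (x * log_range)
        total += toy_integrand(x) * x * log_range    # weight f(x) / p(x)
    return total / n_samples
```

In an event generator the same weights would also be used to accept or reject the generated momenta, so that the retained events follow the target distribution; that unweighting step is left out of this sketch.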

3 The Long Way to Monte Carlo

QCD has a dimensionless coupling but, even at large-scale Q, when all masses
can be neglected, the cross sections do not scale simply as powers of Q2 .
This is due to the presence of ultraviolet, collinear and infrared divergences.
Ultraviolet divergences are responsible for the presence of the fundamental
QCD scale $\Lambda_{\rm QCD}$ entering the running coupling. Collinear and infrared divergences
are well known from QED [11]. Parton distributions can be computed
only by fixing a resolution $Q_0$ (technically, a subtraction point) in the parton
transverse momentum. Collinear and infrared divergences are responsible for
large enhancements in these distributions which need to be resummed. Monte
Carlo generators do actually perform these resummations as I discuss in the
following.
The possibility to resum these enhanced terms is based on specific properties
of the collinear and infrared singularities: they factorize [5, 12, 13].
In this way one can formulate recurrence relations that lead to evolution
equations. The fundamental one is the DGLAP evolution equation [14] resum-
ming collinear singularities in parton densities and fragmentation functions.
These are single-inclusive quantities, but to reach a complete description of an
event one needs many-particle distributions so that the fully exclusive picture
can be reconstructed (with given resolutions). The way to this is the jet cal-
culus formulated and constructed by Ken Konishi, Akira Ukawa and Gabriele
Veneziano [15] as generalization of the DGLAP evolution equation. Therefore,
their work can be considered as the basis of the Monte Carlo parton multipli-
cation. Jet calculus leads the way to the evolution equation for the generating
functional [12, 13] of the multi-parton distributions and then to the branch-
ing probabilities for parton splitting in a way that could be implemented into
Monte Carlo codes. The pioneering Monte Carlo codes [16, 17, 18] resummed
collinear singularities, but only after the discovery of the coherence of
soft gluon radiation were both collinear and infrared enhanced logarithms
correctly resummed. The present Monte Carlo generators [2, 3, 4] fully resum
not only the leading collinear and infrared singularities, but also relevant
subleading contributions.
In the following I describe the main theoretical points corresponding to
the Monte Carlo steps recalled in the previous section.

3.1 Asymptotic Freedom and Physical Coupling

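At short distances the theory becomes free [1] and here the use of perturbation
theory is justified. At two loops one has

$$\alpha_s(Q) \simeq \frac{4\pi}{\beta_0 L}\left(1 - \frac{2\beta_1}{\beta_0^2}\,\frac{\ln L}{L} + \ldots\right), \qquad L = \ln\frac{Q^2}{\Lambda^2_{\rm QCD}} \gg 1, \qquad (1)$$

with $\beta_0 = 11 - \frac{2}{3} n_f$, $\beta_1 = 51 - \frac{19}{3} n_f$ and $n_f$ the number of light flavours.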
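As a quick numerical illustration of (1), the two-loop expression can be evaluated directly. The function name and the input values below (Q near the Z mass, ΛQCD = 0.2 GeV, nf = 5) are assumptions chosen for the example; they are not quoted in the text, and no flavour-threshold matching is attempted.

```python
import math

def alpha_s_two_loop(q_gev, lambda_qcd_gev, n_f):
    """Two-loop running coupling of (1), valid for L = ln(Q^2 / Lambda^2) >> 1."""
    beta0 = 11.0 - 2.0 * n_f / 3.0
    beta1 = 51.0 - 19.0 * n_f / 3.0
    big_l = math.log(q_gev ** 2 / lambda_qcd_gev ** 2)
    leading = 4.0 * math.pi / (beta0 * big_l)
    correction = 1.0 - 2.0 * beta1 / beta0 ** 2 * math.log(big_l) / big_l
    return leading * correction

# Example with assumed inputs: Q = 91.19 GeV, Lambda_QCD = 0.2 GeV, n_f = 5.
alpha_s_at_mz = alpha_s_two_loop(91.19, 0.2, 5)
```

With these inputs the two-loop term reduces the one-loop value by roughly 13 per cent, and the coupling decreases monotonically as Q grows, which is the asymptotic freedom referred to above.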
To account for high-order effects one needs to start from the scheme for
the definition of the running coupling. A physical definition [19] is given by
the strength of the distribution for the emission of a soft gluon k off a colour
singlet pair of a massless quark and antiquark of momenta p, p̄. It is given by

$$dw_{p\bar p}(k) = C_F\,\frac{\alpha_s(k_t)}{\pi k_t^2}\,\frac{d^3k}{2\pi|k|}, \qquad k_t^2 = 2\,\frac{(pk)(k\bar p)}{(p\bar p)}, \qquad (2)$$

and corresponds to the coupling associated to the Wilson loop cusp anomalous
dimension [20]. The relation to the $\overline{\rm MS}$ coupling is known at three loops [21].
The argument of the coupling, the transverse momentum kt relative to the
emitting dipole, is obtained by using dispersive methods [12, 22] or, directly,
by two-loop calculations [23]. In order to accurately describe soft emissions,
the physical coupling with the argument in (2) is used in the Monte Carlo
generators.

3.2 Coherence of Soft Gluons and Colour Connection

Successive soft gluon emission takes place into angular ordered regions with
intensities related to the colour charges. In the large Nc limit these regions are
identified by the parton colour connections. To explain this, one starts from
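the emission of a soft gluon k off a colour singlet $q\bar q$ pair, the dipole (2). This
distribution has collinear singularities for $\theta_{pk} = 0$ or $\theta_{k\bar p} = 0$. Introducing the
angular variable $\xi_{ij} = 1 - \cos\theta_{ij}$, one can isolate the two singular pieces and
write

$$ w_{p\bar p}(k) = \frac{(p\bar p)}{(pk)(k\bar p)} = \frac{1}{|k|^2}\left(\frac{\Psi^{p}_{p\bar p}(k)}{\xi_{pk}} + \frac{\Psi^{\bar p}_{p\bar p}(k)}{\xi_{k\bar p}}\right), \qquad \Psi^{p}_{p\bar p}(k) = \frac{1}{2}\left(1 + \frac{\xi_{p\bar p} - \xi_{pk}}{\xi_{k\bar p}}\right), \tag{3} $$

and similarly for the function $\Psi^{\bar p}_{p\bar p}(k)$ associated to the singularity for $\xi_{k\bar p} = 0$.
Performing the integration of $\Psi^{a}_{p\bar p}(k)$ over the azimuthal angle around a one
has

$$ \int \frac{d\phi_{ak}}{2\pi}\,\Psi^{a}_{p\bar p}(k) = \Theta(\xi_{p\bar p} - \xi_{ak}), \qquad a = p, \bar p. \tag{4} $$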
This shows that the soft dipole distribution is made up of two collinear pieces:
the one singular for k collinear to a ($\xi_{ak} = 0$) is, upon azimuthal averaging,
confined to a cone around a with opening half-angle $\theta_{p\bar p}$. Since the $q\bar q$ dipole
is a colour singlet system, the p and p̄ colour lines are “connected”.
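The azimuthal averaging in (4) can be checked numerically. The sketch below is only an illustration: it assumes p along the z axis and p̄ in the x-z plane, and the function names are hypothetical. It averages the function Ψ of (3) over the azimuth of k around p with a simple midpoint rule.

```python
import math

def psi_p(theta_pk, phi, theta_ppbar):
    """Psi^p_{p pbar}(k) of Eq. (3); p along z, pbar in the x-z plane."""
    xi_ppbar = 1.0 - math.cos(theta_ppbar)
    xi_pk = 1.0 - math.cos(theta_pk)
    # angle between k and pbar from the spherical law of cosines
    cos_kpbar = (math.cos(theta_pk) * math.cos(theta_ppbar)
                 + math.sin(theta_pk) * math.sin(theta_ppbar) * math.cos(phi))
    xi_kpbar = 1.0 - cos_kpbar
    return 0.5 * (1.0 + (xi_ppbar - xi_pk) / xi_kpbar)

def azimuthal_average(theta_pk, theta_ppbar, n_points=2000):
    """Midpoint-rule average of Psi^p over the azimuth of k around p."""
    total = 0.0
    for i in range(n_points):
        phi = 2.0 * math.pi * (i + 0.5) / n_points
        total += psi_p(theta_pk, phi, theta_ppbar)
    return total / n_points
```

For a gluon inside the cone (θ_pk smaller than the dipole opening angle) the average is 1, and outside the cone it is 0, which is the angular ordering property.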
This coherent structure can be generalized to the soft emission of a gluon
k off a colour singlet system made of any number of partons. Consider a q q̄ g
colour singlet of momenta p, p̄ and q, respectively. The distribution is given
by (for simplicity, we take also the gluon q to be soft)

$$ w_{p\bar p g}(k) = w_{p\bar p}(q)\cdot\left[w_{pq}(k) + w_{q\bar p}(k) - \frac{1}{N_c^2}\,w_{p\bar p}(k)\right]. \tag{5} $$

Splitting all dipole distributions as in (3), one can classify all collinear sin-
gularities in successive emissions within corresponding angular regions. One
finds that the piece which is singular for k collinear to a (with a = p, p̄ or q)
is bounded to a cone around a with opening half-angle θab with b the parton
colour connected to a (recall that in the planar limit the gluon is equivalent
to a quark–antiquark pair).
This angular ordered structure associated to colour connections at large
Nc has been extended [24] to the 2 → 2 QCD hard processes needed for LHC
and used in [2]. Beyond large Nc, the structure of soft radiation off the 2 → 2
hard QCD processes is considerably more complex: it involves [25] rotations in
colour space for the hard matrix elements and includes Coulomb phase
contributions. This is a very interesting contribution and it would be nice if
it could be included in a future Monte Carlo generator.
The distribution of a soft gluon k emitted off a colour singlet pair of
massive quark and antiquark P and P̄ is given by

$$ W_{P\bar P}(k) = -\frac{1}{2}\left(\frac{P}{(Pk)} - \frac{\bar P}{(\bar P k)}\right)^2 = \frac{(P\bar P)}{(Pk)(k\bar P)} - \frac{1}{2}\,\frac{P^2}{(Pk)^2} - \frac{1}{2}\,\frac{\bar P^2}{(\bar P k)^2}, \tag{6} $$

with $(ij) = E_i E_j (1 - v_i v_j \cos\theta_{ij})$ and $v_i = \sqrt{1 - m_i^2/E_i^2}$. While in the
massless case (3) the distribution is collinear singular for k parallel to the
emitting charges, in the heavy quark case the collinear singularities are
screened: the distribution vanishes for k parallel to the heavy quark (or
antiquark) $P_a$ and the radiation is suppressed [26, 27] in the cone
$\cos\theta_{ak} > v_a$.
The heavy quark screening is included in the Monte Carlo generators.
One needs to avoid a sharp cut-off around the heavy quark which, taken together
with the angular limitations, would leave a dead cone, that is, a phase space
region without radiation.
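To make the screening in (6) concrete, here is a minimal sketch that evaluates the heavy-quark distribution for a back-to-back pair. The back-to-back configuration, the unit gluon energy and the default `energy` and `mass` values are assumptions chosen for illustration only.

```python
import math

def dot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def w_heavy(theta, energy=10.0, mass=1.0):
    """W_{P Pbar}(k) of Eq. (6) for a back-to-back pair along +z and -z.

    The gluon has unit energy and polar angle theta to the quark P.
    energy and mass are illustrative values in arbitrary units.
    """
    v = math.sqrt(1.0 - mass**2 / energy**2)
    p = (energy, 0.0, 0.0, v * energy)
    pbar = (energy, 0.0, 0.0, -v * energy)
    k = (1.0, math.sin(theta), 0.0, math.cos(theta))
    pk, pbark = dot(p, k), dot(pbar, k)
    return (dot(p, pbar) / (pk * pbark)
            - 0.5 * dot(p, p) / pk**2
            - 0.5 * dot(pbar, pbar) / pbark**2)
```

In this configuration the distribution vanishes at θ = 0 and grows as θ approaches m/E, instead of diverging as in the massless case.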

3.3 Sudakov Form Factor and Jets

An important element in a Monte Carlo generator is the probability that, in a
hard process, a parton is not radiating within a given resolution, the Sudakov
form factor. To introduce this quantity, consider the inclusive distributions
(no particle momenta are measured but only energy flows) which are free
from collinear and infrared singularities. Classical examples in $e^+e^-$ are the
jet-shape distributions $\Sigma(Q, V)$ with

$$ V = \sum_i v(k_i). \tag{7} $$

Here the sum runs over all particles in the final state (hadrons in the mea-
surements and partons in the calculations). For v(k) linear in the particle mo-
mentum, such jet-shape observables are collinear and infrared safe. Actually,
individual Feynman diagrams for real emitted partons and virtual corrections
are divergent but they are summed in such a way that, order by order, the
infinities cancel [11] leaving finite results.
Collinear and infrared safe jet-shape distributions Σ(Q, V ) have a pertur-
bative expansion with finite coefficients

$$ \Sigma(Q, V) = \Sigma_0(Q, V)\left(1 + \alpha_s(Q)\,c_1(V) + \alpha_s^2(Q)\,c_2(V) + \cdots\right), \qquad Q \gg \Lambda_{\rm QCD}, \tag{8} $$

with $\Sigma_0(Q, V)$ the Born distribution and $c_i(V)$ finite functions of V expressed
in terms of the quark, $C_F$, or gluon, $C_A$, colour charges. Actually, by inhibiting
the radiation by taking $V \ll 1$, these coefficients are enhanced by powers
of ln V. A clever reshuffling of the PT (perturbative) series, based on the
universal nature of soft and collinear radiation (factorization), results
[12, 13] in the exponentiated answer of the Sudakov form factor S(Q, V)

$$ \Sigma(Q, V) = \Sigma_0(Q, V)\cdot S(Q, V), \qquad S(Q, V) = e^{-R(Q,V)}, \qquad R(Q, V) = \sum_{n=1}^{\infty} \alpha_s^n(Q^2)\left(d_n \ln^{n+1} V + s_n \ln^n V + \cdots\right). \tag{9} $$
The $d_n$ series is referred to as double logarithmic (DL) and $s_n$ as single
logarithmic (SL). Reliable predictions for these distributions require the
matching [28] of the exact finite order calculation (8) for finite V and the
Sudakov resummation (9) for small V.
It is instructive to discuss the emergence of the powers of ln V in the
Sudakov form factor S(Q, V). They result from the incomplete cancellation
of real and virtual effects. For $V \ll 1$ the real parton production is inhibited:
one has $v(k) < V \ll 1$. Since the virtual PT radiative contributions remain
unrestricted, the divergences do cancel in the region $v(k) < V$, leaving only
virtual contributions for $v(k) > V$ which produce finite but logarithmically
enhanced leftovers. The DL contributions originate from the fact that each
gluon emission brings in at most two logarithms (one of collinear, another of
infrared origin). This explains the first term $d_1 \ln^2 V$, while the rest of the DL
series is generated simply by the presence of the running coupling (1). The
SL contributions are necessary to set the scale of the logarithms
($\ln^n cV = \ln^n V + n \ln c\,\ln^{n-1} V + \cdots$).
In conclusion, the Sudakov form factor S(Q, V ) corresponds to the prob-
ability that in e+ e− the primary quark–antiquark pair remains without ac-
companying radiation up to resolution Q0 = V Q for small V .
To obtain the result (9) one uses the fact that the collinear and/or infrared
enhanced contributions factorize and are resummed by linear evolution
equations of the DGLAP type. Therefore, after factorization of collinear and
infrared singularities (including soft gluon coherence) QCD radiation appears as
produced by “independent” gluon emission (bremsstrahlung). Gluon branch-
ing (into two gluons or quark–antiquark pair) enters only in reconstructing
the running coupling (1) as function of transverse momentum. The fact that
here the branching component does not contribute (within SL accuracy) can
be understood as a result of real–virtual cancellations of singularities. Indeed,
in the collinear limit, the transverse momentum of an emitted gluon is equal
to the sum of transverse momenta of its decay products. Therefore, if one
measures the total emitted transverse momentum, as in broadening for in-
stance, it is enough to consider the contributions of primary bremsstrahlung
gluons. Further branching does not contribute due to unitarity (real–virtual
cancellation).

3.4 Structure and Fragmentation Functions

Moving to less inclusive measurements one faces infinities. The simplest case
involves fixing (measuring) the momentum of a hadron, e.g. that of the initial
proton in DIS (structure function) or of a final hadron (fragmentation
function). These are functions of the Bjorken and Feynman variables,
respectively:

$$ x_B = \frac{-q^2}{2(Pq)}, \qquad x_F = \frac{2(Pq)}{q^2}. \tag{10} $$

In DIS q is the large space-like momentum transferred from the incident lepton
to the target nucleon P . In e+ e− annihilation q is the time-like total incoming
momentum and P the momentum of the final observed hadron.
In a perturbative calculation, replacing the hadron with a parton, one has
infinities: real and virtual contributions do not cancel. Soft divergences still
cancel but collinear ones do not, making such observables not calculable at
the parton level. These effects, however, turn out to be universal and, given
a proper technical treatment, can be factored out [5] as non-perturbative in-
puts. What remains under control then is only the Q2 -dependence (scaling
violation pattern). This fact is realized in the DGLAP evolution equation
which needs, in order to be solved, an initial condition at a low virtuality Q0 .
This corresponds to a parton resolution (or a factorized subtraction point),
which absorbs all large distance divergences. Such “initial condition” cannot
be computed by perturbative means and has to be provided by low-scale ex-
perimental data.

3.5 DGLAP Evolution Equation for DIS and e+ e−

To derive the DGLAP evolution equation [14], one needs to study the phase
space region leading to collinear singularities. The same Feynman diagrams
are involved in the case of structure function (space-like) and fragmentation
function (time-like). Therefore they can be studied simultaneously. First note
that the Bjorken and Feynman variables (10) are mutually reciprocal: after the
crossing operation P → −P , one x becomes the inverse of the other (although
in both channels 0 ≤ x ≤ 1 thus requiring the analytical continuation).
Such a reciprocity property can be extended to the Feynman diagrams
for the two processes and, in particular, to the contributions from mass-
singularities. Consider, for DIS (S-case) and e+ e− annihilation (T -case), the
skeleton structure of Feynman graphs in axial gauge and the kinematical re-
lation leading to the mass singularities

[Figure: DIS or $e^+e^-$ skeleton graphs, with hard momentum q, parton lines
$k_1, \dots, k_n$ and outgoing parton systems $k'_1, \dots, k'_n$]

Here $k'_1, \dots, k'_n$ are the outgoing parton systems (sub-jets). For space-like (S:
$q^2 < 0$, $k_0$ entering) and time-like (T: $q^2 > 0$, $k_0$ outgoing) one has

$$ S:\ \frac{k_{i,+}}{k_{i-1,+}} \equiv z_i \qquad \text{and} \qquad T:\ \frac{k_{i,+}}{k_{i-1,+}} \equiv z_i^{-1}. \tag{11} $$

The virtuality $k_i^2$ enters the denominators of the Feynman diagrams. In order
for the transverse momentum integration to produce a logarithmic
enhancement, the following conditions must be satisfied:

$$ \frac{|k_{i-1}^2|}{k_{i-1,+}} \ll \frac{|k_i^2|}{k_{i,+}} \quad\Rightarrow\quad |k_{i-1}^2| \ll |k_i^2|\,z_i^{\sigma}, \tag{12} $$

with σ = −1 for DIS and σ = 1 for $e^+e^-$. The same Feynman graphs
contribute and, going from the S- to the T-channel, the mass singularities are
obtained by reciprocity: change z into 1/z and the momentum k from space-like
to time-like. This fact is at the origin of the Drell–Levy–Yan relation [29]
and Gribov–Lipatov [30] reciprocity, which has been widely used in order to
obtain the time-like anomalous dimensions from the space-like ones [31, 32].
The ordering (12) in the inverse fluctuation time $k^2/k_+$ is well known, see for
instance [33].
To make the Gribov–Lipatov reciprocity more clear, use the ordering (12)
in the computation of the probability $D_\sigma(x, Q^2)$ to find a parton with
longitudinal momentum fraction x and virtuality $|k^2|$ up to $Q^2$, with σ = −1 for
the S-case and σ = 1 for the T-case. This ordering gives rise to the following
reciprocity respecting equation [34]:

$$ Q^2 \partial_{Q^2} D_\sigma(x, Q^2) = \int_0^1 \frac{dz}{z}\,P(z, \alpha_s)\,D_\sigma\!\left(\frac{x}{z},\, Q^2 z^{\sigma}\right), \qquad \sigma = \pm 1, \tag{13} $$

with the same parton splitting kernel P (z, αs ) in the S- or T -channel. This
equation, derived simply from kinematical considerations, has been (partially)
tested at two [31] and three loops [21, 34, 35].
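The shift of the virtuality argument in (13) is perhaps easiest to see in moment space. As a short worked step (added here for illustration, not part of the original derivation), define the moments $D_\sigma(N, Q^2) = \int_0^1 dx\, x^{N-1} D_\sigma(x, Q^2)$, use that $D_\sigma(y, Q^2)$ vanishes for $y > 1$, and change variable to $y = x/z$:

```latex
\begin{align}
Q^2 \partial_{Q^2} D_\sigma(N, Q^2)
  &= \int_0^1 dx\, x^{N-1} \int_0^1 \frac{dz}{z}\, P(z, \alpha_s)\,
     D_\sigma\!\left(\frac{x}{z},\, Q^2 z^{\sigma}\right) \\
  &= \int_0^1 dz\, z^{N-1} P(z, \alpha_s) \int_0^1 dy\, y^{N-1}
     D_\sigma\!\left(y,\, Q^2 z^{\sigma}\right) \\
  &= \int_0^1 dz\, z^{N-1} P(z, \alpha_s)\, D_\sigma(N,\, Q^2 z^{\sigma}) .
\end{align}
```

If the factor $z^\sigma$ in the virtuality were dropped, the right-hand side would reduce to a moment of the kernel times $D_\sigma(N, Q^2)$, that is, to an evolution equation in which only the scale $Q^2$ appears. With the shift kept, the moment at scale $Q^2$ is coupled to moments at all scales $Q^2 z^\sigma$.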
The reciprocity respecting equation (13) is non-local since the derivative of
$D_\sigma(x, Q^2)$ in the l.h.s. involves the distribution in the r.h.s. with all
virtualities larger or smaller than Q for σ = −1 or σ = +1, respectively. For
the use in a Monte Carlo generator, one needs to formulate (13) in terms of a
local evolution equation, a Markov process. Formally this is easy to do: as a
hard scale for the parton densities replace $Q^2$ with $\bar Q_+^2 = x\,Q^2$ in the T-case
and, by reciprocity, with $\bar Q_-^2 = x^{-1} Q^2$ in the S-case. The physical meaning of
these two different hard scales is well known from the studies of soft gluon
coherence [12, 13, 33, 36]: in the T-case it is related to the branching angle
and in the S-case to the transverse momentum.
It is interesting to illustrate this. The fact that, in the T-case, the ordering
variable is not the inverse fluctuation time $k^2/k_+$ (12) but rather the angle
$k^2/k_+^2 \simeq k_t^2/k_+^2 \simeq \theta_k^2$ originates from cancellations [36] due to destructive
interference in the region

$$ \text{T-case:}\quad z_i^2 k_i^2 < k_{i-1}^2 < z_i k_i^2, \tag{14} $$

thus leaving the angular ordered region $k_{i-1}^2 < z_i^2 k_i^2$. Using reciprocity
($z_i \to z_i^{-1}$) one has that in the S-case the cancelling region (14) becomes

$$ \text{S-case:}\quad |k_i^2| < |k_{i-1}^2| < z_i^{-1} |k_i^2|, \tag{15} $$

thus leaving the transverse momentum ordering $k_{t,i-1}^2 < k_{t,i}^2$. This agrees also,
at small x, with the BFKL [37] leading order multi-parton kinematical region.
The cancellation in the region (15) has a well-known physical basis for
small x. Consider (see the skeleton graph) the successive emissions
$k_{i-2} \to k_{i-1} + k'_{i-1}$ and $k_{i-1} \to k_i + k'_i$ in the region
$k_{i,+} \ll k_{i-1,+} \ll k_{i-2,+}$ giving the leading contribution for small x. These
cancellations result from taking into account the emission of $k_i$ off the
partons $k_{i-2}$ and $k_{i-1}$ in the region (15).
Physically, the process can be viewed as an inelastic diffraction of the
incident particle $k_{i-2}$ in the external gluon field of transverse size of order $k_{it}$.
In the kinematical region (15), the transverse size of the parton fluctuation
$k_{i-2} \to k_{i-1} + k'_{i-1}$ is smaller than the resolution power of the probe, $k_{it}^2$. In
these circumstances, the destructive interference between $k_i$ interacting with
the initial ($k_{i-2}$) and with the final state ($k_{i-1} + k'_{i-1}$) comes onto the stage.
The cancellation under discussion is then equivalent to the general physical
observation, due to V.N. Gribov, that inelastic diffraction vanishes in the
forward direction.
To deal with very small x, one needs to resum at least all terms $\alpha_s^n \ln^n x$
as given by the BFKL equation [37], which cannot be accounted for by the
collinear singularities resummation performed in the Monte Carlo codes.
However, the evolution equation in [38] resums leading collinear and ln x terms
(by enlarging the phase space and adding a non-Sudakov form factor) and
allows Monte Carlo simulations [39] at the cost of generating events which
need to be weighted.

4 Multi-gluon Soft Distributions

Collinear and infrared pieces of the multi-parton QCD distributions factorize
and can be reproduced by recurrence relations which can be formulated as
a Markov branching process. This can be implemented into a Monte Carlo
code and the simulation provides a “complete” description of the multi-parton
emission in a hard process.
I illustrate in detail the case in the leading soft approximation. Although
important non-soft contributions that are included in realistic Monte Carlo
codes are here neglected, many important physical effects are well described,
in particular, large angle soft emission (without collinear approximation).
Moreover, in this approximation the path from multi-gluon soft amplitudes to
Monte Carlo is simple to explain. The scheme of the presentation involves the
following steps:
• Multi-gluon soft distributions. They are computed in the leading soft
approximation and in the planar approximation.
• Recurrence relation for the multi-gluon soft distributions. This is obtained
by introducing the generating functional for all multi-gluon distributions
[12] and deriving the evolution equation. From the generating functional,
one computes observables, as will be discussed in Sect. 4.2. For collinear
and infrared safe observables such as jet-shape distributions, the cut-off
contributes only with power corrections.
• Markov process and Monte Carlo implementation. Here one needs to include
a proper cut-off for collinear and infrared singularities. This will be
discussed in the next section.
• From parton to hadron emission. This will be discussed in Sect. 6.

The starting point is the amplitude for the emission of n soft gluons
$q_1, \cdots, q_n$ off a primary colour singlet $q\bar q$ pair of momenta $p, \bar p$. It is
represented as a sum of Chan–Paton factors with the coefficients given by
colour-ordered amplitudes. We consider the contribution with a single
Chan–Paton factor (topological expansion [40])

$$ M_n(p\bar p q_1 \cdots q_n) = \sum_{\pi_n} \{\lambda^{a_{i_1}} \cdots \lambda^{a_{i_n}}\}_{\beta\bar\beta}\, M_n(p q_{i_1} \cdots q_{i_n} \bar p), \tag{16} $$

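where the sum is over the permutations $\pi_n$ of colour indices and $\lambda^a$ are the
$SU(N_c)$ matrices in the fundamental representation. The softest emitted gluon
$q_m$ factorizes and, denoting by $\ell$ and $\ell'$ its neighbours in the colour
ordering, one has [12, 42]

$$ M_n(\cdots \ell\, m\, \ell' \cdots) = g_s\, M_{n-1}(\cdots \ell\, \ell' \cdots)\cdot\left(\frac{q_\ell^{\mu}}{(q_\ell q_m)} - \frac{q_{\ell'}^{\mu}}{(q_{\ell'} q_m)}\right). \tag{17} $$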

The softest gluon is emitted by the two partons neighbouring in colour space.
This approximation is accurate in the soft limit without any collinear approx-
imation. From this factorized structure, one deduces a recurrence relation
and computes all colour amplitudes in the soft limit. Summing over the po-
larization indices, the squared averaged colour amplitude is given, for the
fundamental colour permutation, by

$$ |M_n(p q_1 \cdots q_n \bar p)|^2 = |M_0|^2\,(2 g_s^2)^n\, W_{p\bar p}(q_1 \cdots q_n), \qquad W_{p\bar p}(q_1 \cdots q_n) = \frac{(p\bar p)}{(p q_1) \cdots (q_n \bar p)}. \tag{18} $$

This very simple result for the squared amplitude is valid for any energy
ordering and depends only on the colour ordering. Note that here one takes the
square of the same colour-ordered amplitude. Indeed $M_n(\pi_n) M_n^*(\pi'_n)$, with $\pi_n$
and $\pi'_n$ two different colour permutations, cannot be expressed in a closed form
for any n. On the other hand, contributions from different permutations enter
the calculation of the averaged squared amplitude $|M_n|^2$. A closed expression
for this distribution for any n is obtained only in the planar approximation
[41]. To see this observe that

$$ \mathrm{Tr}(\lambda_{\pi_n} \lambda_{\pi_n^T}) = 2 C_F \left(\frac{N_c}{2}\right)^{n} \left(1 - \frac{1}{N_c^2}\right)^{n-1}, \tag{19} $$

with $\lambda_{\pi_n} = \{\lambda^{a_1} \cdots \lambda^{a_n}\}$ and $\lambda_{\pi_n^T} = \{\lambda^{a_n} \cdots \lambda^{a_1}\}$. Taking instead two
different colour permutations one has that $\mathrm{Tr}(\lambda_{\pi_n} \lambda_{\pi_n'^T})$ is suppressed at least
by $1/N_c^2$. Therefore, only in the planar approximation can one use the simple
result in (18), and one obtains [12]

$$ |M_n|^2 = \frac{\sigma_0}{n!}\,(N_c g_s^2)^n \sum_{\pi_n} W_{p\bar p}(q_{i_1} \cdots q_{i_n}), \tag{20} $$

where $\sigma_0 = 2 C_F |M_0|^2$ and symmetrization has been taken into account.
The distributions (18) contain the leading infrared singularities: for any
colour permutation one has $W_{p\bar p} \sim (\omega_1 \cdots \omega_n)^{-2}$ with $\omega_i$ the energy of gluon
$q_i$. They contain also the leading collinear singularities for $\theta_{ij} = 0$ with ij two
partons neighbouring in colour (thus there are up to n collinear singularities).
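The colour traces behind (19) and the 1/Nc² suppression of mixed orderings can be checked explicitly for two gluons in SU(3). The sketch below assumes the normalization in which the matrices in (19) are the generators $t^a = \lambda^a/2$ built from the Gell-Mann matrices, for which (19) holds as written; all function names are hypothetical.

```python
import math

def gell_mann_generators():
    """SU(3) generators t^a = lambda^a / 2 as nested lists (assumed normalization)."""
    s = 1.0 / math.sqrt(3.0)
    lambdas = [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]
    return [[[0.5 * x for x in row] for row in lam] for lam in lambdas]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return sum(a[i][i] for i in range(3))

def colour_traces_two_gluons():
    """Return (same-ordering, different-ordering) traces summed over colours."""
    t = gell_mann_generators()
    same = 0.0
    crossed = 0.0
    for ta in t:
        for tb in t:
            # Tr(t^a t^b t^b t^a): ordering (a, b) against its own transpose
            same += trace(matmul(matmul(ta, tb), matmul(tb, ta)))
            # Tr(t^a t^b t^a t^b): ordering (a, b) against the permuted one
            crossed += trace(matmul(matmul(ta, tb), matmul(ta, tb)))
    return same.real, crossed.real
```

For n = 2 and Nc = 3 the right-hand side of (19) equals 16/3, while the mixed-ordering trace is smaller in magnitude by a factor 1/(Nc² − 1) = 1/8.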
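An alternative way to obtain the multi-gluon colour amplitude is based
on the helicity techniques [43]. For $q\bar q$ with + and − polarization, the leading
soft contribution is obtained when all gluons have + helicities and the
recurrence relation (17) reads (for opposite helicities, the result is the
complex conjugate one)

$$ M_n(\cdots \ell\, m\, \ell' \cdots) = g_s\, M_{n-1}(\cdots \ell\, \ell' \cdots)\cdot\frac{\langle q_\ell\, q_{\ell'}\rangle}{\langle q_\ell\, q_m\rangle \langle q_m\, q_{\ell'}\rangle}, \qquad \langle q q'\rangle = \sqrt{2 (q q')}\cdot e^{i\phi_{qq'}}, \tag{21} $$

with $q_m$ the softest gluon, $\ell$ and $\ell'$ its neighbours in colour, z the
longitudinal direction and the phase

$$ e^{i\phi_{qq'}} = \sqrt{\frac{q_+ q'_+}{2 (q q')}}\left(\frac{q_t}{q_+} - \frac{q'_t}{q'_+}\right), \qquad q_t = q_x + i q_y. \tag{22} $$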

The solution of this recurrence for the amplitude is very simple; it is the same
for any energy ordering and depends only on the colour ordering. For the
fundamental permutation, one has

$$ M_n(p q_1 \cdots q_n \bar p) = g_s^n\, M_0\, \frac{\langle p \bar p\rangle}{\langle p q_1\rangle \cdots \langle q_n \bar p\rangle}, \tag{23} $$

with squared amplitude given by (18). This shows the well-known result that
non-planar contributions, obtained from $M_n(\pi_n)\cdot M_n^*(\pi'_n)$ for two different
colour orderings, have the same soft singularities but reduced number
of collinear singularities.
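The modulus of the spinor product in (21) and (22) can be verified numerically. The sketch below assumes the light-cone convention $q_+ = q_0 + q_z$, which the text does not spell out, and uses hypothetical helper names; it checks that the squared modulus of the product equals $2(qq')$ for massless momenta.

```python
import math

def minkowski_dot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def spinor_product(q, qp):
    """<q q'> = sqrt(q_+ q'_+) (q_t/q_+ - q'_t/q'_+), cf. Eqs. (21)-(22).

    Assumes q_+ = q_0 + q_z and q_t = q_x + i q_y (convention is an assumption).
    """
    q_plus, qp_plus = q[0] + q[3], qp[0] + qp[3]
    q_t, qp_t = complex(q[1], q[2]), complex(qp[1], qp[2])
    return math.sqrt(q_plus * qp_plus) * (q_t / q_plus - qp_t / qp_plus)

def massless(energy, theta, phi):
    """Massless four-momentum (E, px, py, pz)."""
    return (energy,
            energy * math.sin(theta) * math.cos(phi),
            energy * math.sin(theta) * math.sin(phi),
            energy * math.cos(theta))
```

The product is antisymmetric under the exchange of its two arguments, so only its modulus enters the squared amplitude (18).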

4.1 Virtual Correction, Generating Functional and Evolution

To compute observables one needs to supplement the multi-gluon soft
distributions (20) with the related virtual corrections. For infrared and
collinear safe observables, such as jet-shape distributions, the infrared and
collinear singularities in (18) have to be cancelled by corresponding
singularities in the virtual corrections. One way to compute the virtual
corrections, at the same level of accuracy in the soft limit as for the real
emission contribution, consists of performing the integration over the virtual
gluon energy by the Cauchy method and then taking the soft limit for the
virtual gluon. This way one also regularizes the ultraviolet divergences by
neglecting the divergent contribution from the contour at infinity in the
complex energy plane. By properly choosing a constant, this regularization
corresponds to the physical scheme in (1). The virtual corrections so computed
can be included into the generating functional for the multi-gluon soft
distributions. The result of this study not only gives the relevant virtual
corrections but, due to the simple structure of (20) in the planar
approximation, gives the branching structure of multi-gluon soft emission
leading to the Monte Carlo generator.
Consider the soft distribution $d\sigma^{(n)}_{ab}$ for the emission of n gluons off a colour
singlet dipole ab (thus one generalizes the primary dipole $p\bar p$ to a general
dipole with a and b in arbitrary directions). For each emitted soft gluon $q_i$
one introduces a source function $u(q_i)$ and defines the generating functional as

$$ G_{ab}[E, u] = \sum_n \frac{1}{n!} \int \frac{d\sigma^{(n)}_{ab}}{\sigma^{\rm tot}_{ab}} \prod_i u(q_i), \tag{24} $$

with E = Q/2 the hard scale. This functional depends on the directions a and
b of the primary dipole. By setting all $u(q_i) = 1$ one has $G_{ab}[E, 1] = 1$. Using
(20), one has the real emission contribution for the generating functional

$$ G^{\rm real}_{ab}[E, u] = \sum_n \int \prod_i \left\{ \bar\alpha_s\, u(q_i)\, \omega_i\, d\omega_i\, \frac{d\Omega_{q_i}}{4\pi}\, \Theta(E - \omega_i) \right\} \cdot W_{ab}(q_1 \cdots q_n), \tag{25} $$

with $\bar\alpha_s = N_c \alpha_s/\pi$. Here one neglects $1/N_c^2$ corrections (planar limit) and
uses the soft approximation for the phase space, $\omega_i \ll E$. Symmetry of the
phase space is used. The condition $G_{ab}[E, 1] = 1$ must be satisfied only after
including the virtual corrections. To include them, we construct the evolution
equation for the generating functional. To this end, we use the fact that the
very simple expression (18) has the following factorization property:

$$ W_{ab}(q_1 \cdots q_n) = w_{ab}(q_\ell)\cdot W_{a\ell}(q_1 \cdots q_{\ell-1})\cdot W_{\ell b}(q_{\ell+1} \cdots q_n), \tag{26} $$

with $q_\ell$ one of the soft gluons and $w_{ab}(q)$ the dipole distribution (3). Taking
$q_\ell$ as the hardest (soft) gluon and differentiating (25) with respect to E, thus
setting $\omega_\ell = E$, one obtains [44]

$$ E \partial_E G_{ab}[E, u] = \int \frac{d\Omega_q}{4\pi}\, \frac{\bar\alpha_s\, \xi_{ab}}{\xi_{aq}\, \xi_{qb}} \Big[ u(q)\, G_{aq}[E, u]\cdot G_{qb}[E, u] - G_{ab}[E, u] \Big], \tag{27} $$
with $\xi_{ij} = 1 - \cos\theta_{ij}$. The negative term in the integrand originates from
the virtual corrections obtained via Cauchy integration as mentioned before.
Since they are evaluated within the same soft approximation used for the real
contributions, at the inclusive level they cancel against the real contributions
giving the correct constraint $G_{ab}[E, 1] = 1$. Both the real emission (first term
in the integrand) and the virtual correction (second term) are collinear and
infrared singular. For inclusive observables (i.e. for suitable sources u(q))
these singularities cancel. This evolution equation accounts for coherence of
soft gluon radiation [12, 13].
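The factorization property (26), on which the evolution equation rests, is a purely algebraic identity of the eikonal chain in (18) and can be checked directly. The following minimal sketch uses hypothetical helper names and arbitrary test momenta.

```python
import math

def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def multi_w(a, b, gluons):
    """W_ab(q_1 ... q_n) of Eq. (18): (ab) / [(a q_1)(q_1 q_2) ... (q_n b)]."""
    chain = [a] + list(gluons) + [b]
    denom = 1.0
    for left, right in zip(chain, chain[1:]):
        denom *= mdot(left, right)
    return mdot(a, b) / denom

def factorized_w(a, b, gluons, ell):
    """Right-hand side of Eq. (26), with ell the 0-based position of q_ell."""
    q = gluons[ell]
    return (multi_w(a, b, [q])                  # dipole factor w_ab(q_ell)
            * multi_w(a, q, gluons[:ell])       # chain between a and q_ell
            * multi_w(q, b, gluons[ell + 1:]))  # chain between q_ell and b

def massless(energy, theta, phi):
    """Massless four-momentum (E, px, py, pz)."""
    return (energy,
            energy * math.sin(theta) * math.cos(phi),
            energy * math.sin(theta) * math.sin(phi),
            energy * math.cos(theta))
```

The identity holds for any choice of the gluon singled out, which is what allows one to take it to be the hardest one when deriving (27).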

4.2 Observables in the Soft Limit

Using $G_{ab}[E, u]$ one obtains all inclusive distributions in the soft limit.
No collinear approximations are involved in (20); therefore, the functional
$G_{ab}[E, u]$ gives quantities that also involve large angle soft emission. Let me
first recall some observables which are collinear singular around the primary
partons a and b.

Collinear Observables

The simplest one is the multiplicity of soft gluons with resolution $Q_0$. Taking
u(q) = u this observable is defined as, see (24),

$$ n_{ab}(E) = \partial_u G_{ab}(E, u)\Big|_{u=1} = \sum_n n\, \frac{\sigma^{(n)}_{ab}}{\sigma^{\rm tot}_{ab}}. \tag{28} $$

It is easy to derive from (27) the well-known result [36] for the multiplicity

$$ n_{ab}(E) \simeq n^{(0)}_{ab} \exp\left\{ \frac{4\pi}{\beta_0} \sqrt{\frac{2 N_c}{\pi \alpha_s(E)}} \right\}, \tag{29} $$

with $n^{(0)}_{ab}$ the non-perturbative initial condition. Similarly, one derives the
fragmentation function $D_{ab}(x, E)$ by taking the source u(q) = u(x) with x
the soft gluon energy fraction

$$ D_{ab}(x, E) = \frac{\delta}{\delta u(x)}\, G_{ab}[E, u]\Big|_{u(x)=1}. \tag{30} $$

Soft gluon coherence here is shown by a depletion of radiation [12, 13] at
small x.
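To see the energy dependence hidden in the exponent of (29), one may insert the leading term of the coupling (1), evaluated at the scale E. This is only an illustrative one-loop estimate, with $L_E = \ln(E^2/\Lambda_{\rm QCD}^2)$ introduced here for the purpose:

```latex
\begin{equation}
\frac{4\pi}{\beta_0} \sqrt{\frac{2 N_c}{\pi \alpha_s(E)}}
  \;\simeq\; \frac{4\pi}{\beta_0} \sqrt{\frac{N_c\, \beta_0\, L_E}{2\pi^2}}
  \;=\; \sqrt{\frac{8 N_c}{\beta_0}\, L_E}\,,
\qquad
\alpha_s(E) \simeq \frac{4\pi}{\beta_0 L_E}\,,
\quad L_E = \ln\frac{E^2}{\Lambda_{\rm QCD}^2}\,.
\end{equation}
```

In this approximation the multiplicity grows like the exponential of the square root of the logarithm of the energy, that is, faster than any power of $\ln E$ but slower than any power of E.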

Observables at Large Angle

The simplest case is the distribution discussed in [45] of heavy systems of
mass M emitted in $e^+e^-$ at large angle, $\rho = \frac{1}{2}(1 - \cos\theta)$, and small velocity.
The heavy system (typically a heavy $q\bar q$ system) originates from a gluon in
the cascade. The collinear singularities are screened by M, so this distribution
is finite and given by a function of the SL quantity

$$ \tau = \int_M^E \frac{dq_t}{q_t}\, \bar\alpha_s(q_t). \tag{31} $$

It is interesting that this distribution $I(\rho, \tau)$ satisfies an equation with a
structure similar to the BFKL equation [45] and then its asymptotic behaviour in
τ involves the BFKL characteristic function. One has

$$ I(\rho, \tau) \sim \frac{e^{4 \ln 2\, \tau}}{\tau^{3/2}} \cdot \frac{\ln(\rho_0/\rho)}{\sqrt{\rho}}\; e^{-\frac{\ln^2(\rho_0/\rho)}{2 D \tau}}, \qquad D = 28\, \zeta(3). \tag{32} $$

The functional Gab [E, u] is suited to give the distributions in the energy emit-
ted away from jets. Such distributions do not have collinear singularities, but
only infrared ones. An example in e+ e− is the distribution in energy recorded
outside a cone θin around the thrust (this is a typical “non-global” jet observ-
able [46]):

[Figure: cone of half-angle $\theta_{\rm in}$ around the thrust axis; the regions inside
the cone are labelled "in" and the region outside is labelled "out"]

Since the jet region is excluded, there are no collinear singularities to SL
accuracy and the resummed PT contributions come from large angle soft
emission. Here resummation is complex but informative. It brings information
on the QCD radiation between jets, a region interesting for understanding
colour neutralization among jets.
It is interesting to discuss this quantity in some detail since it illustrates
the structure of (27). First observe that the distribution depends on E and
$E_{\rm out}$ through the SL function τ given by (31) with $M \to E_{\rm out}$. To obtain
$\Sigma(\tau)$ from $G_{ab}[E, u]$, one takes u(q) = 0 away from jets and u(q) = 1 inside
the jet region. From (27), one derives the evolution equation [44]

$$ \partial_\tau \Sigma_{ab}(\tau) = -s_{ab}\, \Sigma_{ab}(\tau) + \int_{\rm in} \frac{d\Omega_q}{4\pi}\, \frac{\bar\alpha_s\, \xi_{ab}}{\xi_{aq}\, \xi_{qb}} \Big[ \Sigma_{aq}(\tau)\cdot \Sigma_{qb}(\tau) - \Sigma_{ab}(\tau) \Big], \tag{33} $$

with $s_{ab}$ related to the Sudakov form factor

$$ S(\tau) = e^{-\tau s_{ab}}, \qquad s_{ab} = \int_{\rm out} \frac{d\Omega_q}{4\pi}\, \frac{\xi_{ab}}{\xi_{aq}\, \xi_{qb}} \sim \ln \theta_{\rm in}^{-1}. \tag{34} $$

Equation (33) has a bremsstrahlung (first term) and a branching (second term)
component:

[Figure: (a) bremsstrahlung component; (b) branching component]

The bremsstrahlung component resums contributions from gluons emitted in
the recorded region outside the cone. These contributions are the only ones
present for the global jet observables considered in the previous subsection.
Here, since the collinear singularities are screened by the cone $\theta_{\rm in}$, the Sudakov
form factor is a SL function.
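The logarithmic dependence of the Sudakov exponent in (34) on the cone size can be made explicit in the simplest configuration. The sketch below assumes a back-to-back dipole along the thrust axis and takes the "out" region to be everything outside two cones of half-angle `theta_in` around the two jet directions; both the geometry and the function names are illustrative assumptions.

```python
import math

def s_ab_back_to_back(theta_in, n_steps=20000):
    """s_ab of Eq. (34) for a back-to-back dipole, by midpoint integration.

    Assumes 'out' is the region outside two cones of half-angle theta_in
    around the +z and -z directions.
    """
    c_max = math.cos(theta_in)
    h = 2.0 * c_max / n_steps
    total = 0.0
    for i in range(n_steps):
        c = -c_max + (i + 0.5) * h   # c = cos(theta_q)
        # xi_ab = 2, xi_aq = 1 - c, xi_qb = 1 + c; dOmega/4pi -> dc/2
        total += 0.5 * 2.0 / ((1.0 - c) * (1.0 + c)) * h
    return total

def s_ab_closed_form(theta_in):
    """Analytic value of the same integral, 2 ln cot(theta_in / 2)."""
    return 2.0 * math.log(1.0 / math.tan(0.5 * theta_in))
```

For small `theta_in` the closed form behaves like 2 ln(2/θ_in), a single logarithm of the inverse cone size, in line with the estimate in (34).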
The branching component resums contributions from gluons emitted inside
the jet region. These gluons need to branch in order to generate decay prod-
ucts entering the recorded region. Here real–virtual cancellation is incomplete
and virtual enhanced contributions are dominating thus leading to a strong
suppression of the distribution which asymptotically turns out to be Gaussian
in τ .
The Monte Carlo generator [2] resums only collinear singularities; there-
fore, it does not fully resum soft emissions at large angles although phe-
nomenologically, it turns out [47] that the most important pieces are correctly
reproduced due to soft gluon coherence.

5 Monte Carlo Simulation for Soft Emission

The evolution equation (27) can be formulated as a Markov process and then
numerically solved. This Monte Carlo procedure has been introduced in [46] to
study non-global distributions. A similar procedure based on dipole branching
is used in the Monte Carlo generator [4].
To construct a Monte Carlo generator from (27) one splits the real and
virtual corrections. To do so, it is necessary to introduce a cut-off $Q_0$ in
transverse momentum (the argument of $\alpha_s$) giving the Sudakov form factor

$$ \ln S_{ab}(E) = -\int_{Q_0}^{E} \frac{d\omega_q}{\omega_q} \int \frac{d\Omega_q}{4\pi}\, \frac{\bar\alpha_s\, \xi_{ab}}{\xi_{aq}\, \xi_{qb}} \cdot \theta(q_{tab} - Q_0), \qquad q_{tab}^2 = 2\, \omega_q^2\, \frac{\xi_{aq}\, \xi_{qb}}{\xi_{ab}}, \tag{35} $$

which is the solution of (27) with the real emission piece neglected. Here
$q_{tab}$ is the transverse momentum of q with respect to the ab-dipole. Then the
evolution equation (27) can be integrated to give (the cut-off $Q_0$ dependence
is implicit)

$$ G_{ab}[E, u] = S_{ab}(E, Q_0) + \int dP_{ab}(E, \omega_q, \Omega_q)\, u(q)\, G_{aq}[\omega_q, u]\cdot G_{qb}[\omega_q, u], \tag{36} $$
where one has introduced the probability for dipole branching, (ab) → (aq) (qb):

$$ dP_{ab}(E, \omega_q, \Omega_q) = \frac{d\omega_q}{\omega_q}\, \frac{S_{ab}(E)}{S_{ab}(\omega_q)}\, \frac{d\Omega_q}{4\pi}\, \frac{\bar\alpha_s\, \xi_{ab}}{\xi_{aq}\, \xi_{qb}} \cdot \theta(q_{tab} - Q_0). \tag{37} $$

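To see how this could be used in a Monte Carlo simulation one writes
$dP_{ab}(E, \omega, \Omega)$ in the equivalent form (the bound $q_{tab} > Q_0$ is implicit)

$$ dP_{ab}(E, \omega, \Omega) = dr_{ab}(E, \omega)\cdot dR_{ab}(\Omega) \tag{38} $$

with

$$ r_{ab}(E, \omega_q) = \frac{S_{ab}(E)}{S_{ab}(\omega_q)}, \quad \int dr_{ab}(E, \omega_q) = 1 - S_{ab}(E), \qquad dR_{ab}(\Omega_q) = N_{ab}\, \frac{d\Omega_q}{4\pi}\, \frac{\bar\alpha_s\, \xi_{ab}}{\xi_{aq}\, \xi_{qb}}, \quad \int dR_{ab}(\Omega_q) = 1. \tag{39} $$

The integral of the branching probability gives

$$ \int dP_{ab}(E, \omega, \Omega) = 1 - S_{ab}(E), \tag{40} $$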

and this shows that the Sudakov factor Sab (E) gives the probability for not
emitting a gluon within the resolution Q0 in qtab .
The probability distribution dPab (E, ω, Ω) can be used to generate Monte
Carlo events distributed according to QCD in the soft and planar approxima-
tion. Using sets of random numbers 0 < ρ < 1, the procedure is as follows:
1. take the ab-dipole with the energy scale E and compare the Sudakov factor
Sab (E) with ρ. If ρ < Sab (E) then the ab-dipole does not emit any soft
gluon within the resolution. In the opposite case, the dipole is emitting a
soft gluon with energy ωq given by solving the equation ρ = rab (E, ωq );
2. obtain the direction Ωq by sampling the distribution dRab (Ωq ). At this
point, from the ab-dipole one has generated two dipoles: aq and qb, both
at the new energy scale ωq ;
3. repeat the procedure for each new generated dipole till no dipole emits
any more within the resolution.
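The three steps above can be sketched in a toy model. To keep the example short, it assumes that every dipole emits with the same constant rate per unit logarithm of energy, so that the Sudakov factor is a pure power and step 2 (sampling the direction from the angular distribution) is omitted; a realistic code would keep the angular dependence of (35) and (39). All function and parameter names are hypothetical.

```python
import random

def sudakov(energy, q0, rate):
    """Toy Sudakov factor S(E) = (Q0/E)^rate for a constant emission rate."""
    return (q0 / energy) ** rate if energy > q0 else 1.0

def sample_energy(energy, q0, rate, rho):
    """Step 1: return the emitted gluon energy, or None for no emission.

    Solves rho = r(E, w) = S(E)/S(w) = (w/E)^rate for w.
    """
    if rho < sudakov(energy, q0, rate):
        return None
    return energy * rho ** (1.0 / rate)

def generate_event(energy, q0, rate, rng):
    """Steps 1 and 3: evolve one primary dipole; return the emitted energies."""
    emitted = []
    stack = [energy]            # energy scales of dipoles still to evolve
    while stack:
        scale = stack.pop()
        omega = sample_energy(scale, q0, rate, rng.random())
        if omega is None:
            continue            # this dipole emits nothing above the cut-off
        emitted.append(omega)
        stack.extend([omega, omega])   # dipoles aq and qb at the new scale
    return emitted
```

A possible usage is `generate_event(100.0, 1.0, 0.5, random.Random(1))`, which returns the list of gluon energies of one event. In this toy model the mean number of emitted gluons is (E/Q0)^rate − 1, which provides a simple check of the procedure.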
At the end of this procedure, one is left with a Monte Carlo event: a
collection of emitted soft gluons q1 · · · qn together with the primary partons
a, b. These events are distributed with the QCD probability so they can be
used to compute any soft distribution as discussed in Sect. 4.2.
Such a Monte Carlo simulation, based on evolution equation in energy, is
then a successive emission of softer and softer gluons. Angles are given by
the dipole distribution (3) so they are ordered (upon azimuthal average) and
coherence is automatically implemented.
In order to obtain a realistic simulation one needs to overcome the soft
approximation, that is, to take into account the recoil in the emission and the
non-soft pieces of the gluon splitting function (only the singular pieces are
present in (27))

$$ P_{g \to gg}(z) = N_c \left[ \frac{1}{z} + \frac{1}{1-z} + z(1-z) - 2 \right]. \tag{41} $$

Similarly, one needs to account also for the quark branching channels. All these
points are accounted for in the present realistic Monte Carlo generators. Their
basis is an evolution equation in angle rather than in energy (as (27)). However
this implies that one considers collinear approximations in the emission, thus
soft radiation at large angles is not fully accounted for.

6 From Partons to Hadrons

The above description of the Monte Carlo code refers to the generation of
events with emission of partons (possibly together with non-QCD particles)
which, due to the presence of collinear and infrared singularities, requires a
cut-off Q0 . The main questions are then: how to go from partons to hadrons
and how much a phenomenological hadronization model affects and distorts
the QCD radiation generated perturbatively. A suggestion on hadronization
models which do not substantially modify the perturbative radiation is
provided by preconfinement [9].

6.1 Preconfinement

The basis is again the Sudakov function, which suppresses the probability of
non-emission. Consider, in the planar approximation, two colour connected
partons emitted in a hard collision at scale Q and with resolution Q0. Colour
connection means that the quark colour line of one parton ends in the
antiquark colour line of the other parton (in the planar approximation a gluon
can be described, from the colour point of view, as a pair of q q̄ colour
lines). Thus no gluons are emitted within the resolution Q0 by this colour
line and a Sudakov form factor arises which forces the two colour connected
partons to form a system of mass of order Q0 (even for very large Q). The
system of the quark and antiquark in question forms a colour singlet of small
mass. Although this is not yet an indication of confinement (the colour system
should be localized in space), such a preconfinement property suggests
that any hadronization model that associates hadrons to colour connected
partons would not distort the perturbative structure of the QCD radiation:
parton and hadron flows are similar within the resolution Q0. Preconfinement
is then related to the property of local hadron–parton duality [10], which has
been phenomenologically well tested.

6.2 Power corrections

Other non-perturbative effects are the power corrections to the observables.
They result from the non-convergence of the PT expansions, even if the
coefficients are finite as in (8) and (9). As a consequence, all PT predictions are
affected by corrections in powers of ΛQCD /Q with coefficients determined by
NP (non-perturbative) effects. An important NP effect, present in short dis-
tance quantities, is that the running coupling is involved at any scale smaller
than Q. For example, the average value of V in (7) is given by an integral of
the type
$$\langle V\rangle = \int_0^Q \frac{dk_t}{k_t}\,\alpha_s(k_t)\cdot\mathcal{V}(k_t/Q) = v_1\,\alpha_s(Q) + v_2\,\alpha_s^2(Q) + \cdots, \qquad (42)$$
where the virtual momentum kt in the Feynman diagrams runs into the
large distance region (although the observable is dominated by short dis-
tance physics). Since the observable is collinear and infrared finite, for kt → 0
the Feynman integrand is regular (V(kt /Q) ∼ kt /Q) so that the integral is
finite, apart from the presence of αs (kt ) that enters the confinement region.
Mathematically, this is reflected into the fact that, although all PT coeffi-
cients in αs (Q) are finite, the expansion is non-convergent [48] (renormalon
singularity).
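The factorial growth behind the renormalon can be made explicit in a minimal
sketch. It rests on assumptions that go beyond the text: a one-loop running
coupling with five flavours, and the integrand taken to be exactly kt/Q, as
suggested by its small-kt behaviour. Expanding the coupling at scale kt in
powers of the coupling at scale Q and integrating term by term then gives
coefficients (2b)^n n!, with b the one-loop beta-function coefficient divided
by 4π. The function names are illustrative only.

```python
import math


def one_loop_b(n_flavours=5):
    """b = beta_0 / (4 pi), with beta_0 = 11 - 2 n_f / 3 (three colours assumed)."""
    return (11.0 - 2.0 * n_flavours / 3.0) / (4.0 * math.pi)


def pt_coefficients(n_terms, b):
    """Coefficients v_1, v_2, ... of the toy expansion.

    They follow from int_0^1 du (2 b ln(1/u))^n = (2 b)^n n!, with u = kt/Q.
    """
    return [(2.0 * b) ** n * math.factorial(n) for n in range(n_terms)]


def smallest_term(alpha_s, b, n_terms=60):
    """Index and size of the smallest term of the divergent series."""
    coefficients = pt_coefficients(n_terms, b)
    terms = [c * alpha_s ** (n + 1) for n, c in enumerate(coefficients)]
    n_min = min(range(n_terms), key=lambda n: terms[n])
    return n_min, terms[n_min]


if __name__ == "__main__":
    b = one_loop_b()
    for alpha_s in (0.12, 0.20):
        n_min, size = smallest_term(alpha_s, b)
        print(f"alpha_s = {alpha_s}: smallest term at n = {n_min}, size {size:.2e}")
```

In this toy the terms stop decreasing at n of order 1/(2 b αs(Q)), where their
size is of order exp(−1/(2 b αs(Q))), that is of order ΛQCD/Q for a one-loop
coupling. This is the intrinsic ambiguity of the expansion that shows up as a
power correction.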
The fact that the running coupling entering the NP region is at the origin
of the leading power correction can be checked phenomenologically. From the
study of jet-shape observables one finds [49] that, within 10–20%, the power
corrections are described by the same parameter accounting for the running
coupling in the NP region. In the Monte Carlo generators, one sets a cut-off
Q0 in the argument of the coupling and this does bring in these physically
relevant power corrections at the perturbative (parton) stage. Instead,
power-behaved contributions to jet shapes arise at the hadronization level [51].

6.3 Underlying event

Another important NP component in the Monte Carlo for the LHC is the presence
of radiation besides that emitted in the hard event. This is typically around
the beams, as for the peripheral interactions (events at low ET ). Perturbative
QCD does not provide any indication for this component. Thus there are various
models which need to be studied [50] at the Tevatron together with their
extrapolation to the LHC.

7 Conclusion
What I have discussed shows that the Monte Carlo generators involve the
entire Summa of hard QCD results and provide a framework for many future
QCD and non-QCD studies. The general attempts to improve the Monte
Carlo generators aim at making the quantitative predictions
both more reliable (by adding new theoretical QCD results and
phenomenological studies) and more general (by including also electroweak and
beyond the standard model physics). As for the first direction, I have mentioned
the work done to include in the Monte Carlo generators the known exact
higher-order distributions [8]. As also mentioned, it would be interesting to
include in the present generators reliable predictions on large angle soft
emission (see Sect. 4.2). This would also require accounting for non-planar
corrections by studying the colour rotations involved in the colour structure of
ensembles of more than three hard partons (see [25]).
The three key elements in a Monte Carlo generator for jet emissions are
the QCD factorization properties, the branching algorithm and the procedure
for converting partons into hadrons. As I have mentioned, Gabriele Veneziano
has either contributed to or started each of these three key developments:
The Monte Carlo generators are based on factorization of QCD collinear sin-
gularities [5]. Jet calculus [15] leads to the evolution equation for the gener-
ating functional for multi-parton distributions which can be formulated as a
Markov process. Moreover, the preconfinement property [9] is at the basis of
hadronization models that do not destroy the QCD radiation structure.

Acknowledgements
In addition to Gabriele, I am grateful to the many colleagues who shared with
me the beauty of QCD and in particular to Bryan Webber, with whom I undertook
the risk of conveying incomplete theoretical concepts and results into an event
generator, and to Marcello Ciafaloni, Yuri Dokshitzer and Al Mueller, for many
discussions during the construction of the original Monte Carlo generator.

References
1. D.J. Gross, F. Wilczek: Phys. Rev. Lett. 30, 1343 (1973);
H.D. Politzer: Phys. Rev. Lett. 30, 1346 (1973) 159, 162
2. G. Marchesini, B.R. Webber: Nucl. Phys. B 238, 1 (1984); Nucl. Phys. B 310,
461 (1988)
G. Marchesini, B.R. Webber, G. Abbiendi, I.G. Knowles, M.H. Seymour,
L. Stanco: Comput. Phys. Commun. 67, 465 (1992);
G. Corcella, I.G. Knowles, G. Marchesini, S. Moretti, K. Odagiri, P. Richardson,
M.H. Seymour, B.R. Webber: JHEP 0101, 010 (2001) 159, 160, 162, 163, 174
3. T. Sjöstrand: Comput. Phys. Commun. 82, 74 (1994) 159, 160, 162
4. L. Lönnblad, Comput. Phys. Commun. 71, 15 (1992) 159, 160, 162, 174
5. D.Amati, R.Petronzio, G. Veneziano: Nucl. Phys. B 140, 54 (1978); Nucl. Phys.
B 146, 29 (1978);
R.K. Ellis, H. Georgi, M. Machacek, H.D. Politzer, G.G. Ross: Nucl. Phys.
B 152, 285 (1979);

S. Libby, G. Sterman: Phys. Rev. D 18, 3252 (1978);

A.H. Mueller: Phys. Rev. D 18, 3705 (1978);
C.T. Sachrajda: Phys. Lett. B 73, 185 (1978); Phys. Lett. B 76, 100 (1978) 159, 161, 166, 17
6. R.K. Ellis, W.J. Stirling, B.R. Webber: Camb. Monogr. Part. Phys. Nucl. Phys.
Cosmol. 8, 1 (1996) 159
7. C. Buttar et al.: Les Houches physics at Tev colliders 2005, hep-ph 0604120 160, 161
8. S. Frixione, B.R. Webber: JHEP 0206, 029 (2002); S. Frixione, P. Nason, B.R.
Webber: JHEP 0308, 007 (2003); P. Nason, G. Ridolfi: JHEP 0608, 077 (2006) 161, 178
9. D. Amati, G. Veneziano: Phys. Lett. B 83, 87 (1979) 161, 176, 178
10. Ya.I. Azimov, Y.L. Dokshitzer, V.A. Khoze, S.I. Troian: Z. Phys. C 2, 65 (1985)
161, 176
11. T.D. Lee, M. Nauenberg: Phys. Rev. 133, B1549 (1964);
T. Kinoshita: J. Math. Phys. 3, 650 (1962) 161, 164
12. A. Bassetto, M. Ciafaloni, G. Marchesini: Phys. Rep. 100, 201 (1983) 161, 162, 164, 167, 169
13. Y.L. Dokshitzer, V.A. Khoze, S.I. Troian, A. H. Mueller: Rev. Mod. Phys. 60,
373 (1988); Basics of Perturbative QCD (Ed. Frontieres, Gif-sur-Yvette, France,
1991) 161, 162, 164, 167, 172
14. V.N. Gribov, L.N. Lipatov: Sov. J. Nucl. Phys. 15, 438 (1972);
G. Altarelli, G. Parisi: Nucl. Phys. B 126, 298 (1977);
Y. L. Dokshitzer: Sov. Phys. JETP 46, 641 (1977) 162, 166
15. K. Konishi, A. Ukawa, G. Veneziano: Nucl. Phys. B 157, 45 (1979); Phys. Lett.
B 78, 243 (1978) 162, 178
16. G.C. Fox , S.Wolfram: Nucl. Phys. B 168, 285 (1980) 162
17. R. Odorico: Nucl. Phys. B 172, 157 (1980) 162
18. F. Paige, S. Protopopescu: Supercollider Physics, ed. by D. Soper (World Sci-
entific, Singapore, 1986) 162
19. S. Catani, B.R. Webber, G. Marchesini: Nucl. Phys. B 349, 635 (1991)
Yu.L. Dokshitzer, V.A. Khoze, S.I. Troian: Phys. Rev. D 53 89 (1996) 162
20. G.P. Korchemsky: Mod. Phys. Lett. A 4, 1257 (1989);
G.P. Korchemsky, G. Marchesini: Nucl. Phys. B 406, 225 (1993) 162
21. A. Vogt, S. Moch, J.A.M. Vermaseren: Nucl. Phys. B 691, 129 (2004); Nucl.
Phys. B 688, 101 (2004) 162, 167
22. D. Amati, A. Bassetto, M. Ciafaloni, G. Marchesini, G. Veneziano: Nucl. Phys.
B 173, 429 (1980) 162
23. Y. L. Dokshitzer, G. Marchesini, G. Oriani: Nucl. Phys. B 387, 675 (1992);
Y. L. Dokshitzer, A. Lucenti, G. Marchesini, G.P. Salam: Nucl. Phys. B 511,
396 (1998), Erratum-ibid. B 593, 729 (2001) 162
24. R.K. Ellis, G. Marchesini, B.R. Webber: Nucl. Phys. B 286, 643 (1987),
Erratum-ibid. B 294, 1180 (1987) 163
25. N. Kidonakis, G.Sterman: Phys. Lett. B 387, 867 (1996); Nucl. Phys. B 505,
321 (1997);
N. Kidonakis, G. Oderda, G. Sterman: Nucl. Phys. B 531, 365 (1998);
G. Oderda: Phys. Rev. D 61, 014004 (2000);
R. Bonciani, S. Catani, M. Mangano, P. Nason: Phys. Lett. B 575, 268 (2003);
A. Banfi, G.P. Salam, G. Zanderighi: Phys. Lett. B 584, 298 (2004);
Yu.L. Dokshitzer, G. Marchesini: Phys. Lett. B 631, 118 (2005); JHEP 0601,
007 (2006) 163, 178
26. G. Marchesini, B.R. Webber: Nucl. Phys. B 330, 261 (1990) 164
27. Yu.L. Dokshitzer, V.A. Khoze S.I. Troian: Phys. Rev. D 53 89 (1996) 164

28. S. Catani, G. Turnock, B.R. Webber, L. Trentadue: Phys. Lett. B 263, 491
(1991) 165
29. S.D. Drell, D.J. Levy, T.-M. Yan: Phys. Rev. D , 1035 (1970); Phys. Rev. D 1,
1617 (1970) 167
30. V.N. Gribov, L.N. Lipatov: Sov. J. Nucl. Phys. 1, 438 (1972) 167
31. G. Curci, W. Furmanski, R. Petronzio: Nucl. Phys. B 175, 27 (1980) 167
32. M. Stratmann, W. Vogelsang: Nucl. Phys. B 496, 41 (1997) 167
33. S. Catani, M. Ciafaloni: Phys. Lett. B 150, 379 (1985);
S. Catani, M. Ciafaloni, G. Marchesini: Nucl. Phys. B 264, 588 (1986); Phys.
Lett. B 168, 284 (1986) 167
34. Yu.L. Dokshitzer, G. Marchesini, G.P. Salam: Phys. Lett. B 634, 504 (2006) 167
35. A. Mitov, S. Moch, A. Vogt: Phys. Lett. B 638, 61 (2006) 167
36. A.H. Mueller: Phys. Lett. B 104, 161 (1981);
B.I. Ermolayev, V.S. Fadin: JETP Lett. 33, 285 (1981);
A. Bassetto, M. Ciafaloni, G. Marchesini, A.H. Mueller: Nucl. Phys. B 207, 189
(1982);
Yu.L. Dokshitzer, V.S. Fadin, V.A. Khoze: Z. Phys. C 15, 325 (1983); Z. Phys.
C 18, 37 (1983) 167, 172
37. I.I. Balitsky, L.N. Lipatov: Sov. J. Nucl. Phys. 28, 822 (1978);
E.A. Kuraev, L.N. Lipatov, V.S. Fadin: Sov. Phys. JETP 45, 199 (1977) 168
38. M. Ciafaloni: Nucl. Phys. B 296, 49 (1988);
S. Catani, F. Fiorani, G. Marchesini: Phys. Lett. B 234, 339 (1990); Nucl. Phys.
B 336, 18 (1990) 168
39. G. Marchesini, B.R. Webber: Nucl. Phys. B 349, 617 (1991); Nucl. Phys. B 386,
215 (1992);
H. Jung, G.P. Salam: Eur. Phys. J. C 19, 351 (2001) 168
40. G. Veneziano: Nucl. Phys. B 117, 519 (1976) 169
41. G.’t Hooft: Nucl. Phys. B 72, 461 (1974) 169
42. F. Fiorani, G. Marchesini, L. Reina: Nucl. Phys. B 309, 439 (1988) 169
43. S. J. Parke, T. R. Taylor: Phys. Rev. Lett. 56, 2459 (1986);
M.L. Mangano, S.J. Parke: Phys. Rep. 200, 301 (1991) 170
44. A. Banﬁ, G. Marchesini, G. Smye: JHEP 0208, 006 (2002) 171, 173
45. G. Marchesini, A.H. Mueller: Phys. Lett. B 575, 37 (2003);
see also G. Marchesini, E. Onofri: JHEP 0407, 031 (2004) 172, 173
46. M. Dasgupta, G.P. Salam: Phys. Lett. B 512, 323 (2001); JHEP 0203, 017
(2002) 173, 174
47. A. Banﬁ, G. Corcella, M. Dasgupta: hep-ph 0612282 174
48. M. Beneke: Phys. Rep. 317, 1 (1999) 177
49. For a review see M. Dasgupta, G.P. Salam: J. Phys. G 30 R143 (2004);
R.W. Jones, M. Ford, G.P. Salam, H. Stenzel, D. Wicke: JHEP 0312, 007 (2003)
177
50. D. Acosta et al.: Phys. Rev. D 70 072002 (2004);
see also G. Marchesini, B.R. Webber: Phys. Rev. D 38, 3419 (1988) 177
51. B.R. Webber: Phys. Lett. B 339, 148 (1994);
Y.L. Dokshitzer, B.R. Webber: Phys. Lett. B 352, 451 (1995) 177
Fracture Functions

L. Trentadue

Dipartimento di Fisica, Università di Parma, and INFN, Gruppo Collegato di

Parma, Parma, Italy

Abstract. We present a review of the idea of fracture functions. Starting from
the original motivations, we examine the subsequent theoretical developments and
some of the phenomenological results. Further applications are also envisaged.

1 Introduction and Motivations

Deep inelastic scattering (DIS) has played a crucial role over the last four
decades in the understanding of the inner structure of hadronic interactions.
From the very beginning, with the parton model [1] interpretation of
the SLAC experiments, it has represented an unavoidable test for the ever
deeper probing of the high-energy experiments and a benchmark
for the theoretical description of the most intimate features of strong
interaction dynamics. Quantum chromodynamics (QCD) [2], as the theoretical
framework for strong interactions, and the discovery of asymptotic
freedom [3] have given rise to the QCD-improved parton model. A series of new
ideas, theoretical tools and hypotheses have then opened a rich and successful
phenomenological approach, giving rise to a novel interpretation of the
experimental results. The separate and complementary roles played by the
“current” and “target” fragmentation were already considered in the parton
model approach to high-energy processes. A heuristic discussion can be found,
for example, in Richard Feynman’s “Lecture 55” on the “Final Hadronic States
in Deep Inelastic Scattering” in his “Photon Hadron Interactions” book [4]. In
the framework of perturbative QCD, however, the physical need to describe
hadronic initial states semi-inclusively, together with the dynamics of target
fragmentation, was not addressed at the beginning.
The idea of fracture functions originates from the need to extend the
description of semi-inclusive hadronic processes in deep inelastic scattering
to include the initial state target fragmentation region. It would indeed seem
a natural task to give a complete description of the final state entirely

L. Trentadue: Fracture Functions, Lect. Notes Phys. 737, 181–220 (2008)

DOI 10.1007/978-3-540-74233-6_10 © Springer-Verlag Berlin Heidelberg 2008

in terms of the collinear and infrared logarithmic structure of QCD in its
perturbative phase. However, the formulation of the initial state dynamics,
including the rich complexity of the QCD-improved parton model with its quark
and gluon degrees of freedom, had not been considered. This appeared even more
necessary at the time when the new HERA lepton–proton collider was beginning to
operate at DESY.
The dynamics of target fragmentation naturally extends the region of
applicability of perturbative QCD. It involves the description
of quantitatively important processes which are softer than the hard current
fragmentation. It therefore deals with physical scales which are smaller and
at the limits of the perturbative region and, also for this reason, it constitutes
a dynamics complementary to the current fragmentation. Both
target and current fragmentation have to be taken into account in order to
reproduce the entire final state without imposing unnatural cuts to separate
them.
Let us at this point recall the idea and the motivations for the fracture
functions in the words of [5]: “When one or
two hadrons are present in the initial state, collinear singularities cannot be
avoided. Asymptotic freedom, however, is still of much importance. Together
with general factorization theorems for collinear singularities [8], it allows to
justify the so-called QCD-improved parton model whereby experimental cross
sections can be computed by convoluting some uncalculable, but process inde-
pendent, quantities with process-dependent, but calculable, elementary cross
sections. The best known case of this type is undoubtfully that of structure
functions, which can be measured in deep inelastic lepton–hadron collisions in
some kinematical regime and then used to compute either the same process or
a completely new hard reaction at a different scale. Besides this utilitaristic
value, structure functions have also provided, for many years, an invaluable
source of information [6] about the structure of hadrons in terms of valence
and sea quarks and gluons together with interesting information on their po-
larization state. Another much studied set of uncalculable, universal functions
is that of the so-called fragmentation functions, providing the probability that
a given hadron is produced (inclusively) in a jet initiated by a given parton.
A typical use of factorization resides here in the possibility of computing mul-
tihadron final states in jet physics, by convoluting the above fragmentation
functions with the calculable perturbative jet evolution [9]. With the advent
of the new powerful electron–proton collider HERA at DESY, more phase
space is becoming available together with a richer variety of channels. One
may thus wonder if the only QCD-inspired use of the machine should be the
refined measurements of structure and fragmentation functions together with
tests of their predictable evolution and factorization properties. There seems
to be some widespread consensus that this should not be the case and that,
on the contrary, the study of hadron structure can be extended at HERA in
new directions. Actually, already at hadronic colliders, there have been stud-
ies [7] of quantities such as the pomeron structure function, diffractive hard

scattering and the like, with stimulating outcomes. The aim of this paper is
to give a proper framework in which to talk about these extensions of “bread
and butter” QCD physics. We shall argue that, within perturbative QCD,
it is possible to introduce new uncalculable, but measurable and universal
functions, that we call “fracture” functions, which tell us about the structure
function of a given target hadron once it has fragmented (hence its name)
into another given final state hadron. Fracture functions (besides exhibiting
a mild, calculable $Q^2$ dependence) depend upon two hadronic and one
partonic label and on two momentum fractions, a Bjorken x and a Feynman z
variable: $M = M^j_{p,h}(x, z, Q^2)$. One can also say that M measures the parton
distribution of the object exchanged between the target and the final hadron,
without making a (possibly doubtful) model about what that object actually
is, a single particle, a Regge trajectory, a multiparticle continuum, or else. As
for ordinary structure functions, the importance of measuring such an object
will be twofold: (i) it will teach us about the structure of hadronic systems
other than the usual targets and (ii) it can be used as input for computing
other hard semi-inclusive processes at other machines, such as some future
hadronic colliders. By a judicious choice of the final hadron and of its momen-
tum, one will be able, for instance, to enrich the gluonic component of the
partonic flux and thus to enhance signal to background ratios for interesting
gluon-induced processes in hadron–hadron collisions”.
The programme we had in mind at that time, concerning the predictable
evolution and the factorization properties, was pursued by the experiments
almost literally as stated, and a series of comparisons with the HERA deep
inelastic data showed that it was possible to verify features and properties
of fracture functions.
On the theoretical side it is also useful to recall that, as already
stated above, the basic formalism without which this straightforward
definition of fracture functions could not have been given is the
“jet calculus” of Konishi, Ukawa and Veneziano [9]. In the original
formulation, the properties of fracture functions were expressed in terms of
the typical jet calculus quantities: the evolution variables $Y, y, y_0$, the
real $\hat P_i^j(u)$ and regularized $P_i^j(u)$ Altarelli–Parisi vertices, and
the “evolution functions” $E_i^j(x, Y - y)$. The jet calculus formalism, as for
the whole of “jet physics”, constituted the proper, rich and fruitful
background for defining them.
In this report we review the idea of fracture functions as originally
proposed. We then discuss some of the theoretical developments it has since
received and some of the applications made over the years.
The paper is organized as follows: In Sect. 2 we recall the original definitions
and the evolution equations together with the relevant properties of the
fracture functions. The complete proof of the factorization of fracture
functions, using the cut vertex formalism, is then given. In Sect. 2.3 extended
fracture functions are defined. Two-loop next-to-leading fracture functions are
then introduced in Sect. 2.4. In Sect. 2.5 transverse momentum fracture functions
are obtained. The formalism of fracture functions for diffractive processes is
shown in Sect. 2.6. In Sect. 3.1 the phenomenological application to
diffractive processes is discussed. Further applications to Higgs production,
polarized processes and heavy quark production are shown in Sects. 3.2–3.4. The
extension of the fracture function concept to the case of multiple hadronic
inclusive processes is sketched in Sect. 3.5. A new formulation, via fracture
functions, of initial state jets can be found in Sect. 4.

2 Formalism and Definitions

To define fracture functions let us follow again [5]: “In any hard process with
at least one hadron in the initial state the question arises on how to describe
both target and current fragmentation in perturbative QCD. For the current
fragmentation, according to factorization theorems [8] in the QCD-improved
parton model, inclusive single particle distributions can be accounted for by
factorized convolution of structure and fragmentation functions with the hard
point-like cross section, i.e.

$$\sigma_{\text{current}} = F_p^i \otimes \hat\sigma_{ij} \otimes D_j^h. \qquad (1)$$

For target fragmentation, however, distributions have to be defined differently.
To this purpose, fracture functions have been introduced. A fracture function
$M^i_{p,h}(x, z, Q^2)$ describes the distribution of a given final state hadron h in a
process with a target hadron p with the parton-like state i exchanged with a
hard process at the scale $Q^2$. A fracture function depends on three labels: two
hadronic and a partonic one and on three variables: two momentum fractions
x and z the Bjorken and Feynman variables, respectively, of the i and h states
and the hard process scale Q2 . A new kind of factorization is conjectured to
take place. The factorized form of the target fragmentation cross section can
be written in terms of fracture functions as follows:

$$\sigma_{\text{target}} = M^i_{p,h} \otimes \hat\sigma_i. \qquad (2)$$

Fracture functions are not calculable but measurable and universal functions,
as structure functions are. As for ordinary structure functions, measuring them
will give us information about hadronic systems and dynamics. This
information can be used, as for structure and fragmentation functions, as an input
for different hard, semi-inclusive processes, also, eventually, at different
energies, by means of evolution equations. By identifying the final hadron h, one
can constrain the exchanged, parton-like, state. As an example the parton-like
state i in $M^i_{p,h}(x, z, Q^2)$ will be a gluon-rich state for h = p and a quark-rich
state, possibly a pion-like object, for h = n where n is a neutron”.
From the point of view of a consistent formulation, in terms of factorized
amplitudes, of the theoretical inputs and assumptions, fracture functions
represent a further step toward the control of the singularities within
perturbation theory.
It has been shown with a one-loop evaluation in [10] that an entire class
of collinear divergences, due to configurations in which hadrons are
emitted along the initial state direction, is naturally absorbed into
fracture functions. This observation extends the validity of the factorization
theorems [8] to the initial state mass singularities within the target
fragmentation region.
Two separate contributions can be isolated in the target cross section [5]:

$$\sigma_{\text{target}} = M^i_{p,h} \otimes \hat\sigma_i + F_p^i \otimes D_k^h \otimes \hat\sigma_i. \qquad (3)$$

Correspondingly, one can associate to the cross section two terms, i.e.
$\sigma_{\text{target}} = M^{NP} + M^{P}$. The first is a non-perturbative contribution and the
second a perturbative one. They can be defined at a given scale $Q_0^2$ by requiring
that $M^{P}|_{Q^2 = Q_0^2} = 0$. It is possible to obtain an evolution equation to determine
the fracture function $M^i_{p,h}(x, z, Q^2)$ at any other scale $Q^2$. The evolution equation
has the form
$$\frac{\partial M^j_{p,h}(x,z,Q^2)}{\partial \ln Q^2} = \frac{\alpha_s(Q^2)}{2\pi}\int_{\frac{x}{1-z}}^{1}\frac{du}{u}\,P_i^j(u)\,M^i_{p,h}\!\left(\frac{x}{u},z,Q^2\right) + \frac{\alpha_s(Q^2)}{2\pi}\int_{x}^{\frac{x}{x+z}}\frac{du}{x(1-u)}\,\hat P_i^{j,l}(u)\,D_l^h\!\left(\frac{zu}{x(1-u)},Q^2\right)F_p^i\!\left(\frac{x}{u},Q^2\right), \qquad (4)$$
$P_i^j(u)$ and $\hat P_i^{j,l}(u)$ being the regularized and real Altarelli–Parisi vertices,
respectively [9]. $D_l^h(z, Q^2)$ represents the fragmentation function of the parton
l into hadron h and $F_p^i(x, Q^2)$ is the ordinary deep inelastic proton structure
function. The evolution equation can be solved and the solution reads
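$$\begin{aligned} M^j_{p,h}(x,z,Q^2) ={}& \int_{x}^{1-z}\frac{dw}{w}\,E_i^j\!\left(\frac{x}{w},Q^2,Q_0^2\right)M^i_{p,h}(w,z,Q_0^2) \\ &+ \int_{Q_0^2}^{Q^2}\frac{dk^2}{k^2}\,\frac{\alpha_s(k^2)}{2\pi}\int_{x+z}^{1}\frac{dw}{w}\int_{\frac{x}{w}}^{1-\frac{z}{w}}\frac{du}{u(1-u)}\,E_k^j\!\left(\frac{x}{wu},Q^2,k^2\right)\hat P_i^{kl}(u) \\ &\times D_l^h\!\left(\frac{z}{w(1-u)},k^2\right)F_p^i(w,k^2). \qquad (5) \end{aligned}$$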
The first term describes the hadron distribution at a given arbitrary scale
$Q_0^2$, evolving it to a scale $Q^2$ by means of the perturbative evolution function
$E_i^j(x/w, Q^2, Q_0^2)$, which satisfies the equation [9]

$$Q^2\frac{\partial}{\partial Q^2}E_i^j(x,Q^2,Q_0^2) = \frac{\alpha_s(Q^2)}{2\pi}\int_x^1\frac{du}{u}\,P_k^j(u)\,E_i^k\!\left(\frac{x}{u},Q^2,Q_0^2\right). \qquad (6)$$
The second term describes the perturbative evolution from Q20 to Q2 of the
active exchanged parton i. The perturbatively generated partonic shower ac-
companying the evolution of the parton i contains an inclusive distribution
for an additional parton l which finally fragments into the hadron h. Fracture
functions satisfy several properties [5] as follows:
• They do not depend on the arbitrarily chosen scale $Q_0^2$, i.e.
$$\frac{\partial}{\partial Q_0^2}\,M^j_{p,h}(x,z,Q^2) = 0. \qquad (7)$$
• Both $D_l^h(x, Q^2)$ and $F_p^i(x, Q^2)$ satisfy the usual Altarelli–Parisi evolution
equations, with $\sum_h\int_0^1 dz\,z\,D_l^h(z,Q^2) = 1$ and $\sum_i\int_0^1 dx\,x\,F_p^i(x,Q^2) = 1$, and with
$$\sum_i\int_0^1 du\,u\,P_i^j(u) = 0. \qquad (8)$$
• $M^j_{p,h}(x, z, Q^2)$ satisfies the momentum sum rule
$$\sum_h\int_0^1 dz\,z\,M^j_{p,h}(x,z,Q^2) = (1-x)\,F_p^j(x,Q^2). \qquad (9)$$
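Condition (8) can be checked numerically in a minimal sketch. It assumes
something the text does not spell out, namely the standard leading-order QCD
splitting functions for a parent quark, with the colour factor 4/3 and the
plus prescription applied to the whole quark-to-quark kernel. The function
names are illustrative.

```python
CF = 4.0 / 3.0  # quark colour factor, assumed for three colours


def midpoint(f, n=20000):
    """Midpoint-rule estimate of the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def p_qq_kernel(u):
    """Quark -> quark kernel, to be used under the plus prescription."""
    return CF * (1.0 + u * u) / (1.0 - u)


def p_gq(u):
    """Quark -> gluon kernel, regular on the integration range of interest."""
    return CF * (1.0 + (1.0 - u) ** 2) / u


def momentum_moments_quark():
    """Second moments of the two kernels for a parent quark.

    The plus prescription is applied as
    int f(u) [g(u)]_+ du = int (f(u) - f(1)) g(u) du, here with f(u) = u.
    """
    quark = midpoint(lambda u: (u - 1.0) * p_qq_kernel(u))
    gluon = midpoint(lambda u: u * p_gq(u))
    return quark, gluon


if __name__ == "__main__":
    quark, gluon = momentum_moments_quark()
    print(f"quark moment {quark:.6f}, gluon moment {gluon:.6f}, sum {quark + gluon:.2e}")
```

The momentum lost by the quark is carried away by the radiated gluon, so the
two moments cancel. This cancellation is what keeps the normalization
conditions quoted in the second property stable under evolution.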

2.1 Factorization of Fracture Functions: a Proof

Here we give the proof of the factorization of fracture functions. We follow
closely the line of reasoning of [11] and give the proof by using a generalized
operator product expansion (OPE)–cut vertex [12] formalism.
The definition of cut vertices is used in the simple case of $(\phi^3)_6$ theory. This
toy model, despite its simpler structure, shares several important properties
with QCD. It has a dimensionless, asymptotically free coupling constant and
the diagrams with leading mass singularities have the same topology as in
(light cone gauge) QCD. For these reasons, $(\phi^3)_6$ is an excellent theoretical
framework for the study of factorization properties [15].
To define cut vertices, we consider the inclusive deep inelastic process

$$p + J(q) \to X$$

off the current $J = \frac{1}{2}\phi^2$. We define as usual $Q^2$ and x as

$$Q^2 = -q^2, \qquad x = \frac{Q^2}{2pq}. \qquad (12)$$
Let us choose a frame in which $p = (p_+, p_-, \mathbf{0})$ with $p_+ \gg p_-$ and $pq \simeq p_+ q_-$.
Given a vector $k = (k_+, k_-, \mathbf{k})$ define $\hat k = (k_+, 0, \mathbf{0})$. The structure function,
defined as
$$F(p,q) = \frac{Q^2}{2\pi}\int d^6y\; e^{iqy}\,\langle p|J(y)J(0)|p\rangle, \qquad (13)$$
describes the interaction of the far off-shell current J(q) with an elementary
quantum of momentum p through the discontinuity of the forward scattering
amplitude (see Fig. 1).
The leading contribution to the structure function comes from the decom-
position shown in Fig. 2. Here, with the notations of [12], τ is the hard part
of the diagram, i.e. the one in which the large momentum flows, while λ is
the soft part. Decompositions with more than two legs connecting the hard
to the soft part are suppressed by powers of 1/Q2 .
Such a decomposition can be written as
$$F(p,q) = \sum_\tau\int V_\lambda(p,k)\,H_\tau(\hat k,q)\,\frac{d^6k}{(2\pi)^6}, \qquad (14)$$

where $V_\lambda(p, k)$ and $H_\tau(k, q)$ are the discontinuities of the long and short
distance parts, respectively.
Moreover, in order to pick up the leading contribution in (14), the mo-
mentum k which enters τ is taken to be collinear to the external momentum
p. Neglecting renormalization, let us define for a given decomposition into a
λ and a τ subdiagram
$$v_\lambda(p^2,x) = \int V_\lambda(p,k)\;x\,\delta\!\left(x-\frac{k_+}{p_+}\right)\frac{d^6k}{(2\pi)^6} \qquad (15)$$
and
$$C_\tau(x,Q^2) = H_\tau(k^2=0,\,x,\,q^2). \qquad (16)$$

Fig. 1. Deep inelastic structure function in $(\phi^3)_6$

Fig. 2. Relevant decomposition for the deep inelastic structure function in $(\phi^3)_6$

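Here $v_\lambda(p^2, x)$ represents the contribution of λ when the hard part is
contracted to a point, while $C_\tau(x, Q^2)$ is the hard part in which one neglects the
virtuality of the incoming momentum with respect to $Q^2$. Since
$$x \simeq \frac{Q^2}{2p_+q_-} \qquad (17)$$
and
$$H_\tau(\hat k,q) = H_\tau\!\left(0,\frac{Q^2}{2k_+q_-},q^2\right), \qquad (18)$$
using the definition of $\hat k$ and (15)–(18), we can write
$$\begin{aligned} F(p,q) &= \sum_\tau\int V_\lambda(p,k)\,H_\tau(\hat k,q)\,\frac{d^6k}{(2\pi)^6} \\ &= \sum_\tau\int V_\lambda(p,k)\,\delta\!\left(u-\frac{k_+}{p_+}\right)du\;C_\tau(x/u,Q^2)\,\frac{d^6k}{(2\pi)^6} \\ &= \sum_\tau\int\frac{du}{u}\,v_\lambda(p^2,u)\,C_\tau(x/u,Q^2) \equiv \int\frac{du}{u}\,v(p^2,u)\,C(x/u,Q^2). \qquad (19) \end{aligned}$$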

The last integral defines the space-like cut vertex $v(p^2, x)$ and the corresponding
coefficient function $C(x, Q^2)$. As usual, a simpler factorized expression
for the structure function is obtained by taking moments with respect to x.
Defining the Mellin transform as
$$f_\sigma = \int_0^1 dx\;x^{\sigma-1}f(x), \qquad (20)$$
we find immediately

$$F_\sigma(p^2,Q^2) = v_\sigma(p^2)\,C_\sigma(Q^2). \qquad (21)$$
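The step from the convolution (19) to the product (21) can be verified
numerically in a minimal sketch. The two functions below are hypothetical toy
shapes chosen only because their moments are elementary. They are not the cut
vertex or the coefficient function of any theory, and the midpoint rule is
just the simplest quadrature that avoids the end points.

```python
def midpoint(f, a, b, n=400):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def toy_vertex(u):
    """Hypothetical stand-in for v(p^2, u)."""
    return u * (1.0 - u)


def toy_coefficient(y):
    """Hypothetical stand-in for C(y, Q^2)."""
    return y * y


def structure(x):
    """Convolution of (19): F(x) = int_x^1 du/u v(u) C(x/u)."""
    return midpoint(lambda u: toy_vertex(u) * toy_coefficient(x / u) / u, x, 1.0)


def moment(f, sigma):
    """Mellin moment of (20): int_0^1 dx x^(sigma - 1) f(x)."""
    return midpoint(lambda x: x ** (sigma - 1.0) * f(x), 0.0, 1.0)


def check_factorization(sigma):
    """Return both sides of (21) for the toy functions."""
    lhs = moment(structure, sigma)
    rhs = moment(toy_vertex, sigma) * moment(toy_coefficient, sigma)
    return lhs, rhs


if __name__ == "__main__":
    for sigma in (1, 2, 3):
        lhs, rhs = check_factorization(sigma)
        print(f"sigma = {sigma}: F_sigma = {lhs:.6f}, v_sigma * C_sigma = {rhs:.6f}")
```

For these toy shapes both sides equal 1/((σ + 1)(σ + 2)²), so the agreement
can also be checked by hand.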

It was shown in [13, 14] that the cut vertex represents the analytic
continuation in the spin variable of a matrix element of operators of minimal twist.
This correspondence has been confirmed up to two loops by direct calculation
of the anomalous dimensions of cut vertices and leading twist operators [13].
Hence, in the case of DIS, the factorized expression (21) can be identified with
the one given by OPE

$$F_n(p^2,Q^2) = A_n(p^2)\,C_n(Q^2), \qquad (22)$$

where $A_n(p^2)$ are now matrix elements of local operators. Thus, for integer
values of σ, the coefficient function which appears in (21) is the same as in (22).
This fact will be used in the next section where the evolution of the extended
fracture function will be shown to be driven by the anomalous dimension of
the same set of local operators.

2.2 A Cut Vertex Approach to Semi-inclusive Processes

Let us consider now, still within $(\phi^3)_6$, a deep inelastic reaction in which a
particle with momentum $p'$ is inclusively observed in the final state, i.e. the
process
$$p + J(q) \to p' + X.$$
By using the same line of reasoning as for the inclusive case we may define a
semi-inclusive structure function as (see Fig. 3)

$$W(p,p',q) = \frac{Q^2}{2\pi}\sum_X\int d^6x\; e^{iqx}\,\langle p|J(x)|p'X\rangle\langle Xp'|J(0)|p\rangle \qquad (23)$$

in terms of matrix elements of the current operator between the incoming
hadron with momentum p and the outgoing hadron with momentum $p'$ plus
anything.

Fig. 3. Deep inelastic semi-inclusive structure function in $(\phi^3)_6$

When the observed particle has transverse momentum $p'^2_\perp$ of order $Q^2$ the
cross section is dominated by the current fragmentation mechanism and can
be written in the usual factorized way [16]

$$W(p,p',q) = \int\frac{dx'}{x'}\frac{dz'}{z'}\;f_A(x',Q^2)\,\hat\sigma(x/x',z',Q^2)\,D_A(z/z',Q^2), \qquad (24)$$

where
$$z = \frac{pp'}{pq} \simeq \frac{p'_-}{q_-}. \qquad (25)$$

In the language of cut vertices, (24) is a convolution of a space-like and a
time-like cut vertex through a coefficient function [17]

$$W(p,p',q) = \int\frac{dx'}{x'}\frac{dz'}{z'}\;v(p^2,x')\,C(x/x',z',Q^2)\,v'(p'^2,z/z'). \qquad (26)$$

By contrast, the limit t = −(p − p )2 Q2 is dominated by the target

fragmentation mechanism and has not been considered in either approach.
In particular, it has been shown [10] at one loop that in the limit t → 0 a
new collinear singularity appears in the semi-inclusive cross section, which
cannot be absorbed into parton densities and fragmentation functions, and so
must be lumped into a new phenomenological distribution, i.e. the fracture
function.
Following the same steps as before, one can argue that, in the region
$t \ll Q^2$, the leading contribution to the semi-inclusive cross section is given
by the decomposition shown in Fig. 4.
Such a decomposition implies that an expansion similar to the one in
(19) holds, in terms of a new function $v(p,p',\bar x)$ and a coefficient function
$C(\bar x, Q^2)$
\[ W(p,p',q) = \int \frac{du}{u}\; v(p,p',u)\, C(\bar x/u, Q^2) \tag{27} \]

Fig. 4. Relevant decomposition for the semi-inclusive structure function in $(\phi^3)_6$
in the limit $t \ll Q^2$
Fracture Functions 191

where we have defined a new variable $z$ as

\[ z = \frac{p'q}{p\,q} \simeq \frac{p'_+}{p_+} \tag{28} \]
and a rescaled variable $\bar x = x/(1-z)$. The new function $v(p,p',\bar x)$ is
given by
\[ v(p,p',\bar x) = \int T(p,p',k)\; \bar x\, \delta\!\left(\bar x - \frac{k_+}{p_+ - p'_+}\right) \frac{d^6k}{(2\pi)^6}, \tag{29} \]
where $T(p,p',k)$ is the discontinuity of a six-point amplitude in the channel
$(p-p'-k)^2$. The function $v(p,p',\bar x)$ is a new object that we will call a
generalized cut vertex, which depends on both $p$ and $p'$ and embodies all the leading
mass singularities of the cross section. By taking moments with respect to $\bar x$
as in (20), (27) becomes
\[ W_\sigma(p,p',q) = v_\sigma(p,p')\, C_\sigma(Q^2), \tag{30} \]
which is a completely factorized expression analogous to (21).
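The step from the convolution (27) to the product (30) rests on the fact that Mellin moments turn a convolution of the form $\int du/u$ into an ordinary product. The sketch below checks this numerically. It assumes the common moment convention $f_\sigma = \int_0^1 dx\, x^{\sigma-1} f(x)$, which may differ from (20) by a shift in $\sigma$, and it uses toy functions `v_toy` and `c_toy` that are purely illustrative and are not the cut vertex or the coefficient function of the text.

```python
import math

def mellin_moment(f, sigma, n=400):
    """Midpoint-rule estimate of the moment int_0^1 dx x**(sigma-1) f(x)."""
    h = 1.0 / n
    return sum(((i + 0.5) * h) ** (sigma - 1) * f((i + 0.5) * h) for i in range(n)) * h

def convolution(v, c, x, n=400):
    """W(x) = int_x^1 du/u v(u) c(x/u), computed with tau = ln(u/x) so du/u = dtau."""
    length = -math.log(x)
    h = length / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        total += v(x * math.exp(tau)) * c(math.exp(-tau))
    return total * h

# Toy inputs, chosen only because their moments are known in closed form.
v_toy = lambda u: u * (1.0 - u)   # moment: 1 / ((sigma + 1) * (sigma + 2))
c_toy = lambda y: y * y           # moment: 1 / (sigma + 2)

sigma = 2.0
lhs = mellin_moment(lambda x: convolution(v_toy, c_toy, x), sigma)  # moment of the convolution
rhs = mellin_moment(v_toy, sigma) * mellin_moment(c_toy, sigma)     # product of the moments
print(lhs, rhs)
```

The two printed numbers agree within the quadrature error, which is the content of the factorized form (30) for this toy case.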
We are now going to show that this expansion holds up to corrections
suppressed by powers of $1/Q^2$. To do so, we use the method
of infrared power counting [18, 19] applied to our process.
In order to get insight into the large $Q^2$ limit of the semi-inclusive cross
section, let us look at the singularities in the limit $p^2, p'^2, t \to 0$. The infrared
power counting technique can predict the strength of such singularities. Starting
from a given diagram, its reduced form in the large $Q^2$ limit is constructed
by simply contracting to a point all the lines whose momenta are not on shell.
The general reduced diagrams in the large $Q^2$ limit for the process under
study involve a jet subdiagram $J$, composed of on-shell lines collinear to the
incoming particle, from which the detected particle emerges in the forward
direction, since in the large $Q$ limit $p$ and $p'$ can be taken as parallel, and a
hard subgraph $H$ in which momenta of order $Q$ circulate, which is connected
to the jet by an arbitrary number of collinear lines. Soft connections between
$J$ and $H$ can be collected into a soft blob $S$ which is connected to the
rest of the diagram by an arbitrary number of lines (see Fig. 5). In $(\phi^3)_6$, by
using power counting [19], we find that the leading contributions come from
graphs with no soft lines and the minimum number of collinear lines connecting
the hard to the jet subdiagram, as in Fig. 6. This fact has been verified
by an explicit one-loop calculation in [20].
Any other diagram containing additional collinear lines between $J$ and $H$
is suppressed by powers of $1/Q^2$. It follows that $W(p,p',q)$ is of the following
form:
\[ W(p,p',q) = \int T(p,p',k)\, H(\hat k, q)\, \frac{d^6k}{(2\pi)^6} + O(1/Q^2). \tag{31} \]
It is now straightforward to show that (31) is equivalent to (27) with the
substitution $H(0,x,q^2) = C(x,Q^2)$. Thus the expansion (27) corresponds to
taking the leading part of the semi-inclusive cross section.

Fig. 5. General reduced graphs which contribute to the semi-inclusive structure
function in $(\phi^3)_6$

2.3 Extended Fracture Functions

In the previous section we have given arguments for the validity of a generalized
cut vertex expansion for the process $p + J(q) \to p' + X$ in the region
$t \ll Q^2$. Let us now investigate the consequences of such a result.
The coefficient function which appears in (27) is the same as that of (19)
since it comes from the hard part of the graphs, which is exactly the same
as in DIS. So we can draw the important conclusion that the evolution of
the coefficient function appearing in (27) is directly related to the anomalous
dimension of the leading twist local operator which drives the evolution of the
DIS coefficient function.
Fig. 6. Leading contributions to the semi-inclusive structure function in $(\phi^3)_6$

Although the theoretical framework in which we have been working is the
model field theory $(\phi^3)_6$, we expect the main consequences expressed in (27)
to remain valid also in a gauge theory such as QCD [11].
The only further complications expected to arise are due to soft
gluon lines connecting the hard to the jet subdiagrams. Unlike in $(\phi^3)_6$,
in QCD these diagrams are not suppressed by power counting. The only
way to get rid of such contributions is to show that they cancel out. As
already argued in [21], we do not expect this complication to
destroy factorization. The issue of a complete factorization proof in QCD
was later considered in [22]. In QCD, by using the renormalization group,
we have

\[ C_n^i(Q^2) \equiv C_n^i(Q^2/Q_0^2, \alpha_s) = \left[ \exp \int_{\alpha_s}^{\alpha_s(Q^2)} d\alpha\, \frac{\gamma^{(n)}(\alpha)}{\beta(\alpha)} \right]_{ij} C_n^j(1, \alpha_s(Q^2)), \tag{32} \]

where $Q_0$ is the renormalization scale, $\alpha_s \equiv \alpha_s(Q_0^2)$, $\gamma^{(n)}$ is the anomalous
dimension matrix of the relevant operators and an ordered exponential is to
be understood. Thus we can write the analogue of (30) in QCD as

\[ W_n(z,t,Q^2) = \sum_i \mathcal{M}^i_n(z,t,Q^2)\, C_n^i(1,\alpha_s(Q^2)), \tag{33} \]

where, by following [13], we have defined a $t$-dependent fracture function

\[ \mathcal{M}^j_n(z,t,Q^2) \equiv V_n^i(z,t,Q_0^2) \left[ \exp \int_{\alpha_s}^{\alpha_s(Q^2)} d\alpha\, \frac{\gamma^{(n)}(\alpha)}{\beta(\alpha)} \right]_{ij} \tag{34} \]

just in terms of a cut vertex $V_n^i(z,t,Q_0^2)$ (see Fig. 7).
Inverting the moments and expressing the extended fracture function in
terms of the usual Bjorken variable $x$, one finds that $\mathcal{M}^i_{A,A'}(x,z,t,Q^2)$ obeys
the simple homogeneous evolution equation

\[ Q^2 \frac{\partial}{\partial Q^2}\, \mathcal{M}^i_{A,A'}(x,z,t,Q^2) = \int_{x/(1-z)}^{1} \frac{du}{u}\; K_{ij}(u,\alpha_s(Q^2))\, \mathcal{M}^j_{A,A'}(x/u,z,t,Q^2), \tag{35} \]
where $K_{ij}(u,\alpha)$, defined as
\[ K_{ij}(u,\alpha) \equiv \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} dn\; \gamma^{(n)}_{ij}(\alpha)\, u^{-n}, \tag{36} \]
is the same Dokshitzer–Gribov–Lipatov–Altarelli–Parisi (DGLAP) kernel
that controls the evolution of the ordinary parton distribution functions.
This result looks particularly appealing since it means that the evolution of
the extended fracture function follows the usual perturbative behaviour. One
may ask at this point how this result matches the peculiar equation
which drives the evolution of ordinary fracture functions [5]

Fig. 7. Extended fracture function

\[
\begin{aligned}
Q^2 \frac{\partial}{\partial Q^2}\, M^j_{A,A'}(x,z,Q^2) ={}& \frac{\alpha_s(Q^2)}{2\pi} \int_{x/(1-z)}^{1} \frac{du}{u}\; P_i^j(u)\, M^i_{A,A'}(x/u, z, Q^2) \\
&+ \frac{\alpha_s(Q^2)}{2\pi} \int_{x}^{x/(x+z)} \frac{du}{x(1-u)}\; F_A^i(x/u, Q^2)\, \hat P_i^{jl}(u)\, D_{l,A'}\!\left(\frac{zu}{x(1-u)}, Q^2\right).
\end{aligned} \tag{37}
\]
The evolution equation for $M^j_{A,A'}(x,z,Q^2)$ contains two terms: a homogeneous
term describing the non-perturbative production of a hadron coming
from target fragmentation and an inhomogeneous term whose origin is the
perturbative fragmentation due to initial state bremsstrahlung. As discussed in
[5], the separation between perturbative and non-perturbative fragmentation
introduces an arbitrary scale but the fracture function $M^j_{A,A'}(x,z,Q^2)$ itself
does not depend on it.

It can be shown [23] that the evolution equation (37) can be derived by an
explicit calculation using (35) together with jet calculus rules [9]. Indeed, by
defining the ordinary fracture function as an integral over $t$ up to a cut-off of
order $Q^2$, e.g. $\epsilon Q^2$ with $\epsilon < 1$:
\[ M^j_{A,A'}(x,z,Q^2) = \int^{\epsilon Q^2} dt\; \mathcal{M}^j_{A,A'}(x,z,t,Q^2), \tag{38} \]
the inhomogeneous term in the evolution equation is obtained by taking into
account the $Q^2$ dependence of the integration cut-off.
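The role of the cut-off can be made explicit with the Leibniz rule. The following is only a schematic sketch: it writes the cut-off as $\epsilon Q^2$, leaves the lower limit of the $t$ integration unspecified as in (38), and does not attempt the explicit calculation of [23].

```latex
\[
Q^2 \frac{\partial}{\partial Q^2}\, M^j_{A,A'}(x,z,Q^2)
  = \epsilon Q^2\, \mathcal{M}^j_{A,A'}(x,z,t,Q^2)\Big|_{t=\epsilon Q^2}
  + \int^{\epsilon Q^2} dt\; Q^2 \frac{\partial}{\partial Q^2}\,
    \mathcal{M}^j_{A,A'}(x,z,t,Q^2) .
\]
```

The integrand of the second term evolves homogeneously through (35), so after the $t$ integration it has the structure of the homogeneous term of (37). The boundary term, evaluated at $t = \epsilon Q^2$ where $t$ is of order $Q^2$, is where the inhomogeneous term originates. Identifying it with the explicit form in (37) requires the jet calculus computation referred to above.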

Moreover, the interplay between the scales $Q^2$ and $t$ has a sizeable effect
in terms of a new class of perturbative corrections of the form $\log Q^2/t$. Such
corrections are large and potentially dangerous in the region $t \ll Q^2$ since
they can ruin a reliable perturbative expansion. Those terms are naturally
resummed into (34). For the extended fracture function, these corrections
play an important role in understanding the dynamics of semi-inclusive
processes in the kinematic region we have been considering here [23].
Although factorization has been proved in deep inelastic scattering [11, 22], a
proof that fracture functions factorize in hadron–hadron scattering has not yet
been given.

2.4 Two-loop Next-to-leading Fracture Functions

Daleo, Garcia-Canal and Sassot [24, 25] have considered the extension of the
fracture function formalism to include $O(\alpha_s^2)$ QCD corrections and to evaluate
amplitudes and evolution equations to next-to-leading order (NLO) accuracy. The
factorization of fracture functions has also been explicitly checked. In these
works the main features of fracture functions have been studied up
to NLO accuracy, as is standard in the inclusive case. Before them, there
were neither explicit checks of factorization at $O(\alpha_s^2)$ nor indications of how
relevant the non-homogeneous evolution might be at NLO.
The evaluation of the NLO corrections in the semi-inclusive channel and
the explicit check of the factorization of the collinear singularities need a careful
treatment. In the inclusive case, after a convenient integration over final
states, singularities may be written as distributions in only
one variable times a regular function; in the semi-inclusive case at $O(\alpha_s^2)$, by
contrast, it is necessary to keep additional variables unintegrated. Consequently,
entangled singularities in more than one variable have to be dealt with. In order
to check factorization, one has to keep track of the kinematical origin or
configuration that gives rise to each singularity [24]. This requires [24, 25] a
detailed analysis of the singularity structure characteristic of the process. The
case where the initial state parton is a gluon is addressed in [24]. After obtaining
the explicit expressions for the renormalized fracture functions, the explicit
evolution equations can be derived [24]:
\[
\begin{aligned}
\frac{\partial M^j_{p,h}(x,z,Q^2)}{\partial \ln Q^2} ={}& \frac{\alpha_s(Q^2)}{2\pi} \int_{x/(1-z)}^{1} \frac{du}{u} \left[ P_i^j(u) + \frac{\alpha_s(Q^2)}{2\pi}\, P_i^{(1)j}(u) \right] M^i_{p,h}\!\left(\frac{x}{u}, z, Q^2\right) \\
&+ \frac{\alpha_s(Q^2)}{2\pi}\, \frac{1}{x} \int_{x}^{x/(x+z)} \frac{du}{u} \int_{z/x}^{(1-u)/u} \frac{dv}{v} \left[ \hat P_i^{j,k}(u,v) + \frac{\alpha_s(Q^2)}{2\pi}\, \hat P_i^{(1)j,k}(u,v) \right] \\
&\qquad \cdot\; D_k^h\!\left(\frac{z}{xv}, Q^2\right) F_p^i\!\left(\frac{x}{u}, Q^2\right).
\end{aligned} \tag{39}
\]
Here $P_i^j(u)$, $P_i^{(1)j}(u)$, $\hat P_i^{j,k}(u,v)$ and $\hat P_i^{(1)j,k}(u,v)$ are the leading order and NLO
complete and real kernels, respectively. The corresponding expressions allow the
complete determination of the non-homogeneous evolution, as for the ordinary
DGLAP [26] evolution equations. They therefore allow the factorization of
collinear singularities to be verified up to $O(\alpha_s^2)$. The relevance of next-to-leading
corrections may also be shown explicitly by comparing the effects of the new
evolution kernels with the leading order corrections.
The case where the initial state parton is a quark has been addressed in
[25]. Here a more complex singularity structure is present, which implies a
correspondingly more involved pattern of factorization. The factorization has
been explicitly checked, and an evolution equation analogous to the one
obtained for the gluon case has been explicitly derived [25]. The comparison with
the leading order non-homogeneous equation shows, for the quark-initiated
semi-inclusive hadronic distributions, that the impact of next-to-leading
corrections depends on the kinematical region of the final hadron emission.
Next-to-leading corrections are larger for smaller values of the Bjorken variable
$x$ and longitudinal momentum fraction $z$ of the hadrons and of the parent
partons, and the impact, at fixed values of $z$, becomes larger as $x$ decreases.
In diffractive processes the inhomogeneous term is kinematically suppressed
($z \to 1$), while for forward hadron production, where the inhomogeneous term
becomes important, the impact of non-leading contributions increases [27].

2.5 Transverse Momentum-dependent Fracture Functions

In this section we discuss the explicit inclusion of transverse momenta for the
semi-inclusive distributions by using fracture functions. We follow the work
of [28]. In the current fragmentation region, the transverse momentum of the
detected hadron is taken into account through the following DGLAP time-like
equation [29]:

\[
Q^2 \frac{\partial D_i^h(z_h, Q^2, p_\perp)}{\partial Q^2} = \frac{\alpha_s(Q^2)}{2\pi} \int_{z_h}^{1} \frac{du}{u}\; P_{ij}(u, \alpha_s(Q^2)) \int \frac{d^2 q_\perp}{\pi}\; \delta\big(u(1-u)Q^2 - q_\perp^2\big)\, D_j^h\!\left(\frac{z_h}{u}, Q^2, p_\perp - \frac{z_h}{u}\, q_\perp\right). \tag{40}
\]
The corresponding space-like equation can be derived as follows:

\[
Q^2 \frac{\partial F_P^i(x_B, Q^2, k_\perp)}{\partial Q^2} = \frac{\alpha_s(Q^2)}{2\pi} \int_{x_B}^{1} \frac{du}{u^3}\; P_j^i(u, \alpha_s(Q^2)) \int \frac{d^2 q_\perp}{\pi}\; \delta\big((1-u)Q^2 - q_\perp^2\big)\, F_P^j\!\left(\frac{x_B}{u}, Q^2, \frac{k_\perp - q_\perp}{u}\right). \tag{41}
\]
Perturbative evolution is, however, at work even in the target fragmentation
region, and we expect that a non-negligible amount of transverse momentum
is also produced there. We thus generalize fracture function distributions to
contain also transverse degrees of freedom. By definition, fracture functions
$\mathcal{M}^i_{P,h}(x, k_\perp, z, p_\perp, Q^2)$ give the conditional probability to find in a proton $P$,
at a scale $Q^2$, a parton with momentum fraction $x$ and transverse momentum
$k_\perp$ while a hadron $h$, with momentum fraction $z$ and transverse momentum
$p_\perp$, is detected. Under these assumptions the following evolution equations
can thus be derived [28]:

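\[
\begin{aligned}
Q^2 \frac{\partial \mathcal{M}^i_{P,h}(x, k_\perp, z, p_\perp, Q^2)}{\partial Q^2} = \frac{\alpha_s(Q^2)}{2\pi} \Bigg\{ & \int_{x/(1-z)}^{1} \frac{du}{u^3}\; P_j^i(u) \int \frac{d^2 q_\perp}{\pi}\; \delta\big((1-u)Q^2 - q_\perp^2\big)\, \mathcal{M}^j_{P,h}\!\left(\frac{x}{u}, \frac{k_\perp - q_\perp}{u}, z, p_\perp, Q^2\right) \\
& + \int_{x}^{x/(x+z)} \frac{du}{x(1-u)u^2}\; \hat P_j^{i,l}(u) \int \frac{d^2 q_\perp}{\pi}\; \delta\big((1-u)Q^2 - q_\perp^2\big) \\
& \quad \cdot\; F_P^j\!\left(\frac{x}{u}, \frac{k_\perp - q_\perp}{u}, Q^2\right) D_l^h\!\left(\frac{zu}{x(1-u)},\; p_\perp - \frac{zu}{x(1-u)}\, q_\perp,\; Q^2\right) \Bigg\}.
\end{aligned} \tag{42}
\]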

Fig. 8. Evolution of fracture functions $M$: (a) homogeneous term; (b) inhomogeneous
one

As in the longitudinal case, two terms contribute to the evolution of transverse
momentum fracture functions, as displayed in Fig. 8. The homogeneous one has
a purely non-perturbative nature since it involves the fragmentation of the proton
remnants into the hadron $h$. The inhomogeneous one takes into account the
production of the hadron $h$ from a time-like cascade of parton $l$ and thus is
dubbed perturbative. The transverse momentum fracture functions fulfil the
normalization condition

\[ \int d^2 k_\perp\, d^2 p_\perp\; \mathcal{M}^i_{P,h}(x, k_\perp, z, p_\perp, Q^2) = M^i_{P,h}(x, z, Q^2), \tag{43} \]

as a direct consequence of the kinematics of both terms in the evolution
equations (42). The proof of factorization, i.e. that all singularities occurring in
the target remnant direction can be properly renormalized by the less inclusive
transverse momentum quantity $\mathcal{M}^i_{P,h}(x, k_\perp, z, p_\perp, Q^2)$, is still lacking
at present. In the following we assume such a factorization to hold. Once the
transverse momentum evolution equations are solved, these predictions can be
compared with semi-inclusive DIS data, as for the longitudinal case, provided
that a factorization theorem holds even for transverse momentum distributions.
Such a theorem has been shown to hold in the current fragmentation
region for the structure function $H_2$ in [30]:

\[
\begin{aligned}
H_2(x_B, z_h, P_{h\perp}, Q^2) ={}& \sum_{i=q,\bar q} e_q^2 \int d^2 k_\perp\, d^2 p_\perp\; \delta^{(2)}(z_h k_\perp + p_\perp - P_{h\perp}) \\
&\times F_P^i(x_B, \mu_F^2, k_\perp)\, D_i^h(z_h, \mu_D^2, p_\perp)\, C(Q^2, \mu_F^2, \mu_D^2),
\end{aligned} \tag{44}
\]

where the standard semi-inclusive variables are defined as follows:

\[ z_h = \frac{P \cdot P_h}{P \cdot q}, \qquad x_B = \frac{Q^2}{2 P \cdot q}, \tag{45} \]

and $\mu_F^2$ and $\mu_D^2$ are the factorization scales. The above results are accurate
up to powers of $(P_{h\perp}^2/Q^2)^n$ for soft transverse momenta $P_{h\perp} \sim \Lambda_{QCD}$.
Evolution equations for $D$ and $F$ are given in (40) and (41), respectively. The
factor $C$ is the process-dependent hard coefficient function, computable in
perturbative QCD; to leading logarithmic accuracy (LLA) we can set $C = 1$.
Provided that factorization holds for the transverse momentum fracture
functions, we may add, according to (42), their contributions to $H_2$:

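\[
\begin{aligned}
H_2(x_B, z_h, P_{h\perp}, Q^2) = \sum_{i=q,\bar q} e_q^2 \int d^2 k_\perp\, d^2 p_\perp\; \Big\{ & \delta^{2}(z_h k_\perp + p_\perp - P_{h\perp})\, F_P^i(x_B, Q^2, k_\perp)\, D_i^h(z_h, Q^2, p_\perp)\, A(0) \\
& + (1 - x_B)\, \mathcal{M}^i_{P,h}(x_B, k_\perp, z, p_\perp, Q^2)\, \delta^{2}(p_\perp - P_{h\perp})\, A(1) \Big\}
\end{aligned} \tag{46}
\]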

where we have identified all three factorization scales with the hard scale,
$Q^2 = \mu_F^2 = \mu_D^2 = \mu_M^2$. Although, formally, the two contributions are simply
added in (46), at LLA and in the photon–proton centre of mass frame the
produced hadrons are mainly distributed in two opposite hemispheres. Target
fragmented hadrons are produced mainly in the $\theta = \pi$ direction, while current
fragmented hadrons are produced mainly along the $\theta = 0$ direction. Here $\theta$ is
the angle of the produced hadron $h$ with respect to the photon direction, as shown in
Fig. 8b. In order to keep track of the emission angle of the detected hadron
$h$, we supplement the current and target fragmentation terms in (46) with an
angular distribution $A(v)$ [10]. The angular and energy variables $v$ and $z$ are
defined as
\[ z = \frac{E_h}{E_p (1 - x_B)}, \qquad v = \frac{1 - \cos\theta}{2}, \qquad z_h = z\, v. \tag{47} \]

In (47), $E_h$ and $E_p$ denote, respectively, the energies of the detected hadron
and of the incoming proton in the photon–proton centre of mass frame. The
variables $z$ and $v$ are a useful frame-dependent representation for the hadronic
invariant $z_h$ in two respects: $z$ reduces to $z_h$ in the current fragmentation
region so that we recover the standard definitions, while for low $z_h$ values we
can distinguish soft hadrons ($z \to 0$) from the ones produced in the target
remnant direction ($\theta \to \pi$). Since to LLA all sources of transverse momenta
contributing to $P_{h\perp}$ have been taken into account, we may pictorially represent
(46) as in Fig. 9. This figure shows the sources of transverse momenta in the
current and target fragmentation regions, extending the description of
semi-inclusive processes to transverse degrees of freedom.
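The variables of (47) are simple to evaluate once the hadron energy and angle are known in the photon–proton centre of mass frame. The helper below is a minimal sketch that transcribes (47) literally as printed; the function name `frame_variables` and its argument names are hypothetical, and the angle is taken in radians.

```python
import math

def frame_variables(e_hadron, e_proton, x_bjorken, theta):
    """Return (z, v, z_h) exactly as written in (47).

    e_hadron, e_proton: energies in the photon-proton centre of mass frame.
    theta: hadron angle in radians, with the convention used in the text.
    """
    z = e_hadron / (e_proton * (1.0 - x_bjorken))
    v = 0.5 * (1.0 - math.cos(theta))
    return z, v, z * v

# Example with made-up numbers: a hadron emitted at theta = pi.
z, v, z_h = frame_variables(e_hadron=10.0, e_proton=100.0, x_bjorken=0.5, theta=math.pi)
print(z, v, z_h)
```

Since $v$ runs from 0 at $\theta = 0$ to 1 at $\theta = \pi$, it interpolates between the two arguments $A(0)$ and $A(1)$ of the angular distribution in (46).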

2.6 Diffraction

Diffractive reactions in photon–hadron interactions can be defined as the
reactions where, within the final state, an isolated hadron can be observed,
separated from the rest of the process by a large rapidity gap.
Compared with an inclusive deep inelastic process $\gamma^* p \to X$, a diffractive
channel can be represented by the reaction $\gamma^* p \to p^* X$, where the final
inclusive collection of hadrons $X$ is well separated from the final, possibly
excited, hadron $p^*$. Unlike totally inclusive deep inelastic scattering,
a diffractive process involves a combination of the hard scale given by the
photon virtuality $Q^2$ and another scale, either hard or possibly soft, given by
the momentum $t$ transferred between $p$ and $p^*$. Therefore, in a perturbative
approach, a combination of perturbative evolutions leads to a peculiar
final state consisting of a hard hadronic output together with an unbroken
proton or a slightly excited final hadronic state.
Fig. 9. Sources of transverse momentum in the current (left) and in the target
(right) fragmentation region in semi-inclusive processes. Dark blobs symbolize
hard parton emission. The transverse momentum $P_{h\perp}$ of the detected hadron $h$ is
also indicated. $F$, $D$ and $M$ represent parton distribution, fragmentation and
fracture functions, respectively

The peculiar signature of the large rapidity gap events suggests furthermore
that between the lower and the upper parts of the reaction the
exchange of a colourless object takes place. The mechanism of colour
screening shows itself directly in diffraction. The dynamics of deep inelastic
diffractive reactions was studied long ago [31] in terms of the
space–time evolution of a composite photon. Analogously to the case of
totally inclusive deep inelastic scattering, also in the case of diffractive reactions
it is possible to define particular distributions in terms of suitable structure
and fragmentation functions.
The typical diffractive reaction $\gamma^* p \to p^* X$ can be written in terms of a
new kind of structure function $F_2^{D(3)}(x, Q^2, \xi)$, i.e.
\[ \frac{d\sigma}{dx\, dQ^2\, d\xi} = \frac{4\pi\alpha^2}{x Q^4} \left(1 - y + \frac{y^2}{2}\right) F_2^{D(3)}(x, Q^2, \xi), \]
where the process is fully defined by the variables
\[ x = \frac{Q^2}{2 p \cdot q}; \qquad \xi = \frac{q \cdot (p - p')}{q \cdot p}; \qquad \beta = \frac{Q^2}{2 q \cdot (p - p')} = \frac{x}{\xi}; \qquad y = \frac{Q^2}{x s} \tag{48} \]
with $q$ and $p$ the photon and initial proton momenta, $p'$ the momentum of the
final hadron, and $x$ and $Q^2$ the Bjorken variable and the hard scale. $\xi$ and $\beta$
characterize the intermediate state of the process. According to the
Ingelman–Schlein [32] approach to diffractive processes, a diffractive distribution
can be written as
\[ \frac{d\sigma}{dt\, d\xi} = f_P(\xi, t)\; \hat\sigma_P(M^2), \tag{49} \]
where $\xi \simeq (Q^2 + M_x^2)/(Q^2 + W^2)$, and $M_x^2$ and $W^2$ are the squared invariant
masses of the system $X$ and of the full hadronic final state, respectively.
$\hat\sigma_P(M^2)$ is the point-like hard cross section and $f_P$ the pomeron partonic
distribution. The fully differential diffractive distribution can be written as

\[ \frac{d\sigma^D(Q^2, x, \xi, t)}{dx\, dQ^2\, d\xi\, dt} = \frac{4\pi\alpha^2}{x Q^4}\, \frac{dF_2^D}{d\xi\, dt}(x, Q^2, \xi, t), \tag{50} \]
where $F_2^D(x, Q^2, \xi, t)$ is the diffractive structure function given by the
factorized expression
\[ \frac{dF_2^D}{d\xi\, dt}(x, Q^2, \xi, t) = \sum_a \int dx'\; \frac{df_{a/P}}{d\xi\, dt}(x', Q^2, \xi, t)\; \hat F_2^a. \tag{51} \]

If one uses for the differential parton distribution the Regge parametrization
it can be written as
\[ \frac{df_{a/P}}{d\xi\, dt}(x', Q^2, \xi, t) = \frac{g(t)^2}{8\pi^2}\; \xi^{1 - 2\alpha_P}\, f_{a/P} \tag{52} \]
with $f_{a/P}$ the pomeron structure function for parton $a$. In perturbative QCD, a
more direct model-independent expression can be given

\[ \frac{d\sigma}{dt\, d\xi}(x, Q^2, \xi, t) = \sum_i \int_x^{\xi} dy\; \hat\sigma_i(x, Q^2, y) \cdot \frac{df_i(y, \xi, t)}{d\xi\, dt}, \tag{53} \]
where the parton distributions [21] $df_i(y, \xi, t)/d\xi\, dt$ are just fracture functions

\[ \frac{df_i(y, \xi, t)}{d\xi\, dt} = M^i_{p,p}(x, 1 - \xi, Q^2, t). \tag{54} \]
This factorized expression does not require any Regge or any alternative
parametrization for the parton distributions. Fracture functions represent a
natural continuation of the Ingelman–Schlein [32] approach to diffractive
processes, in the sense that fracture functions allow the perturbative QCD
evolution of the distributions in terms of the variable $Q^2$.
An interesting description of diffractive scattering and factorization has
been proposed by Hautmann, Kunszt and Soper [33] in terms of diffractive
parton distributions. According to this formulation, in hadronic systems with
small transverse size diffraction occurs predominantly at short distances and
the diffractive parton distributions can be studied by perturbative methods.
For larger systems, the authors discuss the possibility that diffractive parton
distributions are controlled essentially by semi-hard physics at a scale of the
order of a GeV. They find that this possibility accounts for
important qualitative aspects of the diffractive data from HERA, such as the flat
behaviour in $\beta$ and the delay in the fall-off with $Q^2$.
Arguments have been given in [39] against diffractive factorization in
hadron–hadron scattering.

3 Applications and Phenomenology

3.1 Diffraction

The phenomenological description of the diffraction dynamics in terms of fracture
functions has been extensively used to analyse HERA data.
Here we follow the work of De Florian and Sassot [34], where a fracture
function-based QCD analysis of the first data produced by the H1 and ZEUS
collaborations was given. Both the diffractive and the leading proton deep
inelastic lepton–proton scattering structure functions, $F_2^{D(3)}$ and $F_2^{LP(3)}$, have
been considered. The aim is to verify whether the QCD framework for semi-inclusive
processes, based on fracture functions, allows a unified treatment of
diffractive and leading proton processes, with a detailed perturbative QCD
description for them, alternative to those that rely on model-dependent
assumptions.
The leading proton structure function $F_2^{LP(3)}$ is defined from the
corresponding triple-differential deep inelastic scattering cross section

\[ \frac{d^3\sigma^{LP}}{dx\, dQ^2\, d\xi} \equiv \frac{4\pi\alpha^2}{x Q^4} \left(1 - y + \frac{y^2}{2}\right) F_2^{LP(3)}(x, Q^2, \xi), \tag{55} \]

with the usual kinematical variables. Even though the processes accounted for
are of a semi-inclusive nature, the formulation based on the leading proton
structure function is used instead of the usual approach for semi-inclusive
deep inelastic scattering
\[ \frac{d^3\sigma^{p}_{\rm current}}{dx\, dQ^2\, dz} \simeq \frac{4\pi\alpha^2}{x Q^4} \left(1 - y + \frac{y^2}{2}\right) x \sum_i e_i^2\, F_p^i(x, Q^2)\, D_i^h(z, Q^2) \tag{56} \]

in terms of parton distributions and fragmentation functions, since the latter
only takes into account hadrons produced in the current fragmentation
region and thus does not contribute to the forward leading hadron observables.
In terms of fracture functions, for very forward protons with $1 - \xi \simeq z$,

\[ \frac{d^3\sigma^{p}_{\rm target}}{dx\, dQ^2\, dz} = \frac{4\pi\alpha^2}{x Q^4} \left(1 - y + \frac{y^2}{2}\right) \sum_i e_i^2\; x\, M^i_{p,p}(x, z, Q^2), \tag{57} \]

where $M^i_{p,p}(x, z, Q^2)$ is the fracture function that accounts for target
fragmentation processes and obeys the evolution equation (4). Defining the
equivalent of $F_2$ for fracture functions, i.e.

\[ M_2^{p/p}(x, z, Q^2) \equiv x \sum_i e_i^2\, M^i_{p,p}(x, z, Q^2), \tag{58} \]

and taking into account the shift from $z$ to $\xi$, the relation between this function
and the leading proton structure function is quite apparent.
Similarly, the differential cross section for diffractive deep inelastic scattering
is usually written in terms of the diffractive structure function $F_2^{D(3)}$

\[ \frac{d^3\sigma^{D}}{d\beta\, dQ^2\, dx_{IP}} \equiv \frac{4\pi\alpha^2}{\beta Q^4} \left(1 - y + \frac{y^2}{2}\right) F_2^{D(3)}(\beta, Q^2, x_{IP}), \tag{59} \]

where $x_{IP} \equiv \xi$, and the variable $\beta$ is used instead of $x$. To collect the data,
an integration over the small transverse momentum of the final state proton
is implied, i.e. over the variable $t = (P - P')^2$. When the integration over
the variable $t$ is not performed, the “extended fracture functions” [11],
with an explicit dependence on that variable and obeying homogeneous
evolution equations, can be used. These correspond to the diffractive structure
functions $F_2^{D(4)}(\beta, Q^2, x_{IP}, t)$.
The diffractive region is given by small values of $x_{IP}$ ($x_{IP} < 0.1$), whereas
leading proton data are associated with larger values of $x_{IP}$ ($x_{IP} > 0.1$).
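The variables of (48), with $x_{IP} \equiv \xi$, can be computed directly from four-momenta. The sketch below is illustrative only: the function names are hypothetical, the metric convention (+, -, -, -) is an assumption, the denominator of $\xi$ is taken to be $q \cdot p$ (which is what $\beta = x/\xi$ requires), and the input momenta are made-up numbers rather than real event kinematics.

```python
def minkowski_dot(a, b):
    """Scalar product with metric (+, -, -, -); vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def diffractive_variables(p, p_out, q, s):
    """Return Q2, x, xi (= x_IP), beta and y for proton p, final hadron p_out, photon q."""
    q2 = -minkowski_dot(q, q)                         # photon virtuality Q^2 = -q^2
    transfer = tuple(a - b for a, b in zip(p, p_out)) # p - p'
    x = q2 / (2.0 * minkowski_dot(p, q))
    xi = minkowski_dot(q, transfer) / minkowski_dot(q, p)
    beta = q2 / (2.0 * minkowski_dot(q, transfer))
    y = q2 / (x * s)
    return {"Q2": q2, "x": x, "xi": xi, "beta": beta, "y": y}

# Made-up kinematics: the final proton keeps 90% of the incoming momentum.
kin = diffractive_variables(
    p=(100.0, 0.0, 0.0, 100.0),
    p_out=(90.0, 0.0, 0.0, 90.0),
    q=(10.0, 5.0, 0.0, -20.0),
    s=10000.0,
)
print(kin)
```

In this example `kin["beta"]` equals `kin["x"] / kin["xi"]`, as stated in (48).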
A suitable parametrization for the proton-to-proton fracture function
$M_2^{p/p}(\beta, Q_0^2, x_{IP})$ at a given initial scale $Q_0^2$ is chosen [34] by selecting a simple
functional dependence in the variables $\beta$ and $x_{IP}$. The quark singlet component
($M_q^{p/p} \equiv 3M_u^{p/p} = 3M_d^{p/p} = 3M_s^{p/p}$) of the fracture function is parametrized
as [34]

\[ x M_q^{p/p}(\beta, Q_0^2, x_{IP}) = N_s\, \beta^{a_s} (1 - \beta)^{b_s} \left[ C_{IP}\, \beta\, x_{IP}^{\alpha_{IP}} + C_{LP}\, (1 - \beta)^{\gamma_{LP}} \left(1 + a_{LP} (1 - x_{IP})^{\beta_{LP}}\right) \right], \tag{60} \]
and similarly for gluons with the corresponding parameters $N_g$, $a_g$ and $b_g$. The
normalization constants $N_s$, $C_{IP}$ and $C_{LP}$ are also properly set. For
the evolution, the chosen values are $Q_0^2 = 2.5$ GeV$^2$ and $\Lambda_{QCD} = 0.232$ GeV
in a scheme with a variable number of flavours, where charm and bottom
distributions are radiatively generated from their corresponding thresholds.
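The functional form (60) can be coded in a few lines, which makes the separate roles of the pomeron-like and leading-proton pieces easy to explore. This is a sketch only: the function name, the dictionary keys and, above all, the parameter values in the example are hypothetical placeholders and are not the fitted values of [34].

```python
def quark_singlet_fracture(beta, x_pom, par):
    """Evaluate x * M_q^{p/p}(beta, Q0^2, x_IP) with the functional form of (60)."""
    shape = par["N_s"] * beta ** par["a_s"] * (1.0 - beta) ** par["b_s"]
    pomeron_like = par["C_IP"] * beta * x_pom ** par["alpha_IP"]
    leading_proton = (
        par["C_LP"]
        * (1.0 - beta) ** par["gamma_LP"]
        * (1.0 + par["a_LP"] * (1.0 - x_pom) ** par["beta_LP"])
    )
    return shape * (pomeron_like + leading_proton)

# Placeholder parameters, chosen only to make the arithmetic easy to follow.
toy_par = {
    "N_s": 1.0, "a_s": 0.0, "b_s": 1.0,
    "C_IP": 1.0, "alpha_IP": -1.0,
    "C_LP": 1.0, "gamma_LP": 1.0, "a_LP": 0.0, "beta_LP": 1.0,
}
value = quark_singlet_fracture(beta=0.5, x_pom=0.1, par=toy_par)
print(value)
```

With a negative `alpha_IP` the first term grows as $x_{IP}$ decreases, so it dominates in the diffractive region, while the second term takes over at larger $x_{IP}$.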
The data of the H1 and ZEUS collaborations [35, 36, 37] have been analysed
in [34]. The results are shown in Figs. 10–12. The parametrization in
terms of fracture functions describes the data over the entire range of $Q^2$ and
$\beta$ when compared with the diffractive H1 (Figs. 10 and 11) and ZEUS data.
The scale dependence obtained from the evolution also agrees with
the data over the whole range of $Q^2$ and $\beta$.

Fig. 10. H1 diffractive data against the outcome of the fracture function
parametrization (solid lines) and its pomeron-like component (dashed lines).
From [34]

Fig. 11. Scale dependence of H1 diffractive data and the one obtained by evolving
the fracture functions. From [34]
A more recent analysis has been made by the H1 collaboration [38]. Here
a measurement of the diffractive deep inelastic cross section has been analysed
in terms of diffractive parton distribution functions.
The data are presented in the form of a “diffractive reduced cross section”
$\sigma_r^{D(3)}$, related to the differential cross section measured experimentally by the
equation [38]

\[ \frac{d^3\sigma^{ep \to eXY}}{dx_{IP}\, dx\, dQ^2} = \frac{2\pi\alpha^2}{x Q^4} \cdot Y_+ \cdot \sigma_r^{D(3)}(x_{IP}, x, Q^2), \tag{61} \]

where $Y_+ = 1 + (1 - y)^2$. Similarly to what is done for the inclusive deep inelastic
case [46], the reduced $e^+ p$ cross section depends on the diffractive structure
functions $F_2^{D(3)}$ and $F_L^{D(3)}$ in the one-photon exchange approximation
according to the relation
\[ \sigma_r^{D(3)} = F_2^{D(3)} - \frac{y^2}{Y_+}\, F_L^{D(3)}. \tag{62} \]
For $y$ not too close to unity, $\sigma_r^{D(3)} = F_2^{D(3)}$ holds to a very good
approximation. Unlike previous measurements of inclusive diffractive
deep inelastic scattering at HERA, where the data were presented in terms of
$F_2^{D(3)}$, the data in [38] are given in terms of $\sigma_r^{D(3)}$.

Fig. 12. ZEUS diffractive data against the expectation coming from the fracture
function parametrization. From [34]
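A short numerical sketch of (62) shows how small the $F_L^{D(3)}$ correction is away from $y \to 1$. The function names are hypothetical and the structure function values used in the example are arbitrary illustrative numbers, not measured ones.

```python
def y_plus(y):
    """Kinematic factor Y_+ = 1 + (1 - y)^2 from (61)."""
    return 1.0 + (1.0 - y) ** 2

def reduced_cross_section(f2, fl, y):
    """Diffractive reduced cross section of (62) for given F2, FL and inelasticity y."""
    return f2 - (y * y / y_plus(y)) * fl

# Arbitrary illustrative inputs.
f2_example, fl_example = 0.03, 0.01
for y in (0.1, 0.5, 0.9):
    print(y, reduced_cross_section(f2_example, fl_example, y))
```

The weight $y^2/Y_+$ multiplying $F_L^{D(3)}$ vanishes at $y = 0$ and reaches 1 only at $y = 1$, which is why the reduced cross section and $F_2^{D(3)}$ nearly coincide over most of the measured range.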
The charged current measurements of the data are integrated over some
or all of the kinematic variables. They are presented as a total cross section
and single differentially in either $x_{IP}$, $\beta$ or $Q^2$.
The $Q^2$ dependence is quantified by fitting the data at fixed $x_{IP}$ and $\beta$ to
the form
\[ \sigma_r^{D(3)}(x_{IP}, Q^2, \beta) = a_D(\beta, x_{IP}) + b_D(\beta, x_{IP}) \ln Q^2, \tag{63} \]
such that $b_D(\beta, x_{IP}) = \left(\partial \sigma_r^{D(3)} / \partial \ln Q^2\right)_{\beta, x_{IP}}$ is the first logarithmic $Q^2$
derivative of the reduced cross section.
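A fit to the form (63) is a straight-line fit in the variable $\ln Q^2$. The sketch below uses an unweighted least-squares fit on synthetic points; this is an assumption made for brevity, since a real analysis would weight the points by their uncertainties, and neither the function name nor the numbers come from [38].

```python
import math

def fit_log_q2(q2_values, sigma_values):
    """Unweighted least-squares fit of sigma = a + b * ln(Q^2); returns (a, b)."""
    logs = [math.log(q2) for q2 in q2_values]
    n = len(logs)
    mean_x = sum(logs) / n
    mean_y = sum(sigma_values) / n
    sxx = sum((x - mean_x) ** 2 for x in logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(logs, sigma_values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Synthetic points generated from a = 0.02, b = 0.005 (illustrative only).
q2_points = [10.0, 20.0, 40.0, 80.0]
sigma_points = [0.02 + 0.005 * math.log(q2) for q2 in q2_points]
a_fit, b_fit = fit_log_q2(q2_points, sigma_points)
print(a_fit, b_fit)
```

The fitted slope `b_fit` plays the role of $b_D(\beta, x_{IP})$, so its sign at fixed $\beta$ and $x_{IP}$ indicates whether the reduced cross section rises or falls with $Q^2$.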
As discussed before, QCD hard scattering collinear factorization, when applied
to diffractive deep inelastic scattering, implies that the cross section for
the process $ep \to eXY$ can be written in terms of convolutions of partonic
cross sections $\hat\sigma^{ei}(x, Q^2)$ with diffractive parton distribution functions $f_i^D$ as

\[ d\sigma^{ep \to eXY}(x, Q^2, x_{IP}, t) = \sum_i f_i^D(x, Q^2, x_{IP}, t) \otimes d\hat\sigma^{ei}(x, Q^2). \tag{64} \]

The partonic cross sections are the same as those of inclusive deep inelastic
scattering, and the functions $f_i^D$ represent probability distributions for the
$i$-th parton in the proton, under the constraint that the proton is scattered
to a particular system $Y$ with specified four-momentum. They are not known
from first principles, but can be determined from fits to the data using the
evolution equations [26].
In the analysis, input parameters describing the diffractive parton distribution
functions at a starting scale $Q_0^2$ for QCD evolution
are adjusted to obtain the best description of the data after NLO DGLAP [47]
evolution to $Q^2 > Q_0^2$ and convolution of the diffractive parton distribution
functions with coefficient functions. The fit is performed in the $\overline{\rm MS}$
renormalization scheme. The strong coupling is set via $\Lambda^{(3)}_{QCD} = 399 \pm 37$ MeV for
three flavours. The evolution of the diffractive reduced cross section with $Q^2$
is compared with that of the inclusive deep inelastic reduced cross section $\sigma_r$
by forming the ratio

\[ \left. \frac{\sigma_r^{D(3)}(x_{IP}, x, Q^2)}{\sigma_r(x, Q^2)} \right|_{x, x_{IP}} \tag{65} \]

at fixed $x$ and $x_{IP}$, by using parametrizations of previously analysed $\sigma_r$
data. This ratio is shown multiplied by $x_{IP}$ in Figs. 13–17 as a function of $Q^2$
for all measured $x_{IP}$ and $x = \beta\, x_{IP}$ values.
The figures show a well-defined pattern of scaling violation, indicating
that diffractive deep inelastic scattering obeys evolution equations, as
ordinary inclusive deep inelastic scattering does. At large values of $x_{IP}$ the
presence of partonic degrees of freedom with a perturbative QCD evolution is
apparent. Diffractive parton distribution functions are dominated by the gluon
distribution [36].
Factorization in diffractive deep inelastic scattering has been checked by
using diffractive parton distribution functions to compare diffractive events
with final state cross sections for jets [53, 54] and heavy quarks [55]. These
comparisons show consistency among them.
On the other hand, the apparent failure of factorization in hadron–hadron
collider data [56], when deep inelastic diffractive parton distributions
are used, still leaves open the question of whether the input of deep inelastic
data can be used at hadron colliders. These and other related issues have been
recently discussed [57] and have been investigated by the diffraction working
group at the HERA–LHC Workshop [58].

Fig. 13. The ratio of the diffractive to the inclusive reduced cross section, multiplied
by $x_{IP}$ and shown as a function of $Q^2$ for fixed $x$ and fixed $x_{IP} = 0.0003$. The data
are multiplied by a further factor of $3^i$ for visibility, with $i$ as indicated. The inner
and outer error bars represent the statistical and total uncertainties, respectively.
Normalization uncertainties are not shown. The results of fits of a linear dependence
on $\log Q^2$ to the data are also shown. Picture taken from [38]
The analyses of the H1 [38] and ZEUS [52] collaborations clearly support the
picture of a perturbative QCD description of diffraction via factorized
diffractive parton distributions, as indicated by the fracture function
approach.

3.2 Higgs Production

The possibility of producing the Higgs boson via a diffractive reaction by us-
ing fracture functions has been proposed by Graudenz and Veneziano [48].
This rests on the factorization hypothesis for semi-inclusive hard processes in
QCD at the hadronic colliders. In principle, the diffractive production of the
Standard Model Higgs boson at LHC can be studied by using only, as input,
diffractive hard-processes data of the type recently collected and analysed

Fig. 14. As in Fig. 13 with xIP = 0.001. From [38]

by the H1 and ZEUS collaborations at HERA. In [48] the existing HERA
data have been combined with a simple pomeron exchange picture. A large
spread in the Higgs boson production cross section is found, depending on
the input parametrization of the pomeron's parton content. In particular, if
the pomeron gluon density $f_{g/I\!P}(x)$ is peaked at large β for small scales,
single diffractive events can represent a sizeable fraction of all produced
Higgs bosons, with an expected better-than-average signal-to-background ratio.
Different analyses are also possible since, as the more precise HERA data have
shown, a hard perturbative QCD approach to diffractive processes that relies on
a parametrized diffractive parton distribution is, in defined kinematical
regions, to be preferred to the hadron–pomeron vertex picture. It would be
interesting to test the possibility of diffractive Higgs production (modulo the
factorization problem) by using the presently available H1 and ZEUS data, as
well as the future (and probably more precise) data.

3.3 Polarized Processes

The application of fracture functions to describe polarized processes has been

studied by De Florian, Garcia Canal, Sampayo and Sassot [50, 51]. The
aim is to extend to the target fragmentation region the description of po-
larized processes. They discuss the factorization of the collinear singularities

Fig. 15. As in Fig. 13 with xIP = 0.003. From [38]

related to the polarized processes, particularly those which are absorbed in the
redeﬁnition of the spin-dependent analogue of fracture functions.1 In [50] they
show that, with the inclusion of polarized fracture functions, it is possible to
consistently factorize all the collinear singularities that occur and that the

1 An extensive discussion on the role of the $U(1)_A$ anomaly in QCD phenomenology
can be found in the contribution. In this review the issues related to spin
physics and to the theoretical implications as well as the use of fracture func-
tions in experiments on semi-inclusive polarized deep inelastic scattering are also
discussed.

Fig. 16. The ratio of the diffractive to the inclusive reduced cross section, multiplied
by xIP and shown as a function of Q2 for fixed x and fixed xIP = 0.01. See the caption
of Fig. 13 for further details. From [38]

formalism can be straightforwardly applied in order to factorize unwanted
finite soft contributions. In this way the conservation of the non-singlet
axial current and the non-conservation of the singlet one, as dictated by the
anomaly, are preserved. This requirement allows the definition of polarized
parton and fracture distributions intimately related to the fraction of the
nucleon spin carried by partons. The definition of a universal and physically
meaningful factorization scheme for both current and target fragmentation,
consistent with those used in totally inclusive spin-dependent deep inelastic
scattering and unpolarized electron–positron annihilation, allows one to perform an

Fig. 17. As in Fig. 16 with xIP = 0.03. See the caption of Fig. 13 for further details.
From [38]

unambiguous $O(\alpha_s)$ analysis of inclusive experiments. A phenomenological
analysis has been carried out in [51], where results of different experiments
have been discussed.
The potential relevance of the fracture functions to describe spin-dependent
distributions has been advocated by Teryaev [42]. They may be applied at
fixed target energies and may also include interference and final state inter-
action, providing a source for azimuthal asymmetries at HERMES and polar-
ization at NOMAD. Accordingly, the work of [44] can be rephrased in terms
of fracture functions (see also [43]).
Kotzinian [45] has recently discussed the role of the hadronization mechanism
in polarization phenomena in semi-inclusive deep inelastic scattering and a
purity method for the extraction of polarized distribution functions, by using
a Monte Carlo event generator that produces hadrons via current quark as well
as target diquark fragmentation or light cluster decays. Since the purity
method assumes that only quark fragmentation contributes to hadron production
in the current fragmentation region, neglecting the contributions of target
diquark fragmentation and cluster decays to the asymmetry can lead to
incorrect values of the polarized quark distributions extracted by the purity
method.

3.4 Fracture Functions and Heavy Quark Production

The possibility of using fracture function formalism to describe the produc-
tion of heavy quarks in the target fragmentation region has been studied by
Graudenz [59]. Several interesting features are advocated in favour of this
possibility:
i) Fixed-target experiments permit the study of hadron production in the
target fragmentation region.
ii) The tagging of specific particles in the target fragments can be employed
to introduce a bias in the hard scattering process towards a specific flavour
content. The case of hadrons containing a heavy quark is particularly at-
tractive because of the clear experimental signatures and the applicability
of perturbative QCD. Modulo factorization, the production of heavy quarks
at hadron colliders can also be considered one such case.
iii) The standard approach to one-particle inclusive processes based on frag-
mentation functions is valid in the current fragmentation region and for
large transverse momenta pt in the target fragmentation region, but it fails
for particle production at small pt in the target fragmentation region. A
collinear singularity, which cannot be absorbed in the standard way into
the phenomenological distribution functions, prohibits the application of
this procedure.
This situation is remedied by the introduction of fracture functions, which
describe particle production in the target fragmentation region and can be
viewed as correlated distribution functions in the momentum fractions of the
observed particle and of the parton initiating the hard scattering process.
In [59] it is shown, in a next-to-leading-order calculation for the case of deep
inelastic lepton–nucleon scattering, that the additional singularity can be con-
sistently absorbed into the renormalized target fragmentation functions on the
one-loop level. The formalism is applied to the production of heavy quarks.
The renormalization group equation of the target fragmentation functions for
the perturbative contribution is solved numerically, and the results of a case
study for deeply inelastic lepton–nucleon scattering at DESY (H1 and ZEUS
at HERA), at CERN (NA47) and at Fermilab (E665) are discussed.
Higgs production and the possible association with heavy quark fracture
functions have been recently considered in [49].

3.5 Multiple Inclusive Processes

By extending the definition of the single-hadron fracture function to include
a second hadron, one finds that for a generic double inclusive DIS process,
l + P → l + h1 + h2 + X, the corresponding cross section at leading
logarithmic level is [40]
dσ x
= (1 − x) e2i Fpi (x, Q2 )D2h1 h2 (z1 , z2 , Q2 )
dxdz1 dz2 dQ2 i=q,q̄
1 − x

i 2 i
+M2,h1 h2 /p (x, z1 , z2 , Q ) + M1,h 1 /p
(x, z1 , Q2 )D1,i
h2
(x, z2 )

i h1
+M1,h 2 /p
(x, z2 , Q2 )D1,i (z1 , Q2 ) (66)

The first term takes into account the current fragmentation of h1 and h2, the
second the target fragmentation, and for completeness we also added mixed
terms in which one hadron is produced in the opposite fragmentation region
with respect to the other.
In the following we will discuss evolution equations for
$M^i_{2,h_1 h_2/p}(x, z_1, z_2, Q^2)$. $M_2$ gives the conditional probability of
finding an active quark $i$ with fraction $x$ of the incoming hadron momentum
while two secondary hadrons are produced with momentum fractions $z_1$ and $z_2$
with respect to the incoming hadron momentum. The evolution equation for $M_2$
reads [40]

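$$
\begin{aligned}
Q^2 \frac{\partial M_2(x, z_1, z_2, Q^2)}{\partial Q^2} = \frac{\alpha_s(Q^2)}{2\pi} \Bigg[ & \int_{\frac{x}{1-z_1-z_2}}^{1} \frac{du}{u}\, P(u)\, M_2(x/u, z_1, z_2, Q^2) \\
& + \int_{\frac{x}{1-z_2}}^{\frac{x}{x+z_1}} \frac{du}{u}\, \frac{u}{x(1-u)}\, \hat{P}(u)\, M_1(x/u, z_2, Q^2)\, D_1\!\left(\frac{z_1 u}{x(1-u)}, Q^2\right) \\
& + \int_{x}^{\frac{x}{x+z_1+z_2}} \frac{du}{u}\, \frac{u^2}{x^2(1-u)^2}\, \hat{P}(u)\, F(x/u, Q^2)\, D_2\!\left(\frac{z_1 u}{x(1-u)}, \frac{z_2 u}{x(1-u)}, Q^2\right) \Bigg] .
\end{aligned} \tag{67}
$$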
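It can be extended to $M_n$:

$$
\begin{aligned}
Q^2 \frac{\partial M_n(x, z_1, .., z_n, Q^2)}{\partial Q^2} = \frac{\alpha_s(Q^2)}{2\pi} \Bigg[ & \int_{\frac{x}{1-\sum_{i=1}^{n} z_i}}^{1} \frac{du}{u}\, P(u)\, M_n(x/u, z_1, .., z_n, Q^2) \\
& + \sum_{j=1}^{n-1} \int_{\frac{x}{1-\sum_{i=j+1}^{n} z_i}}^{\frac{x}{x+\sum_{i=1}^{j} z_i}} \frac{du}{u} \left[\frac{u}{x(1-u)}\right]^{j} \hat{P}(u)\, M_{n-j}(x/u, z_{j+1}, .., z_n, Q^2)\, D_j\!\left(\frac{z_1 u}{x(1-u)}, .., \frac{z_j u}{x(1-u)}, Q^2\right) \\
& + \int_{x}^{\frac{x}{x+\sum_{i=1}^{n} z_i}} \frac{du}{u} \left[\frac{u}{x(1-u)}\right]^{n} \hat{P}(u)\, F(x/u, Q^2)\, D_n\!\left(\frac{z_1 u}{x(1-u)}, .., \frac{z_n u}{x(1-u)}, Q^2\right) \Bigg] ,
\end{aligned} \tag{68}
$$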
where the numeric subscript represents the number of hadrons described by
a given distribution and the remaining partonic and hadronic indexes have
been suppressed for simplicity.

4 Jet Cross sections and Fracture Functions

In the previous section we have shown that theoretical predictions for inclusive
multi-particle distributions, although available, become more involved as
we increase the number of identified particles in the final state. The situation
is also worsened, at the experimental level, by the high-multiplicity nature
of events at present and future colliders. Even in the next-to-simplest case,
M2 , an analysis seems to be prohibitive. Outgoing hadrons however emerge
as clusters of particles in defined portions of momentum space, a signature
of the dominant collinear branching scheme of QCD dynamics. For this rea-
son jets become the natural representation of hadronic activity. Perturbative
calculations with an arbitrary number of partons in the final state and exper-
imental jet observables can be quantitatively compared only once a common
jet-algorithm is chosen and used on both the theoretical and experimental
level. Let us focus now on DIS jet cross sections. Inclusive DIS structure
functions can be decomposed in terms of n-particle exclusive structure functions
$F_2^{(n)}$ as [61]

$$ \frac{1}{\sigma} \frac{d\sigma}{dx\,dQ^2} \equiv F_2(x, Q^2) = x \sum_{i=q,\bar{q}} e_i^2\, q_i(x, Q^2) = \sum_{n=1}^{\infty} F_2^{(n)} = \sum_{n=1}^{\infty} \frac{d\sigma^{(n)}}{dx\,dQ^2} \tag{69} $$

Analogously for jet cross sections, in terms of a suitable jet algorithm, for
example the one defined in [41], the jet exclusive structure functions show a
factorized structure of the type

$$ F_2^{(n)}(x, Q^2; E_t^2, y_{cut}) = \sum_{i=q,\bar{q}} \int_x^1 \frac{dz}{z}\, F_p^i(x/z, \mu_F^2)\, R_{2,i}^{(n)}\!\left(z, \alpha_s, \frac{Q^2}{E_t^2}, y_{cut}\right) \tag{70} $$

where $y_{cut}$ represents the jet structure resolution parameter [41] and is
defined in terms of an arbitrary perturbative scale $E_t^2$, with
$\Lambda^2 \ll E_t^2 \le Q^2$. In (70), initial state collinear divergences are
factorized into parton distribution functions. The jet coefficients
$R_{2,i}^{(n)}$ are calculable in perturbation theory and again depend on the
particular jet algorithm chosen, as indicated by the dependence on $E_t^2$ and
$y_{cut}$. Since we are interested in initial state jets, i.e. jets originating
from the space-like struck parton, we briefly recall some features of the
approach of [41]. Initial state jets in arbitrary number have been accounted
for by using a generating functional method [41]. The n-jet cross sections were
then constructed with an iterative block structure. The Sudakov form factor
$\Delta_i(Q_i^2, Q_j^2)$, which inhibits emissions off the struck parton line $i$
between the two scales $Q_i^2$ and $Q_j^2$, is defined as

$$ \Delta_i(Q_i^2, Q_j^2) \equiv \exp\left\{ -\sum_j \int_{Q_i^2}^{Q_j^2} \frac{dt}{t} \int_{Q_i^2/Q_j^2}^{1 - Q_i^2/Q_j^2} dz\, \frac{\alpha_s(t)}{2\pi}\, \hat{P}_{ji}(z) \right\} . \tag{71} $$

It guarantees that no hadronic activity takes place between each pair of jets.
The emission of the real partons is then controlled by real splitting functions
$\hat{P}$ [9]. Their subsequent decays are taken into account via the jet
function [60] $J(Q^2, k^2)$,

$$ J(Q^2, k^2) = \sum_h \int_0^1 dz\, d_h(z, Q^2, k^2) , \tag{72} $$

where dh (z, Q2 , k 2 ) is the probability that an initial parton with mass Q2

decays into a parton with a longitudinal momentum fraction z with respect
to the parent parton and with a mass k 2 Q2 . The d’s functions satisfy the
properties
$$ \int_0^{Q^2} dk^2\, d(z; Q^2, k^2) \equiv D(z; Q^2); \qquad \int_0^{Q^2} dk^2\, J(Q^2, k^2) = 1 . \tag{73} $$

In the construction of the n-jet cross sections, an iterative block-like
structure G is associated to each jet insertion,

$$ G_{ik}^{n_{\mathrm{jet}}=1}(u, Q_i^2, Q_j^2) \equiv \Delta_{ij}(Q_i^2, Q_j^2)\, \hat{P}_{lj}^{m}(u)\, J_m(Q_j^2, Q_0^2)\, \Delta_{lk}(Q_j^2, Q_k^2) , \tag{74} $$

along the struck parton line with ordered virtualities
$Q_0^2 < \ldots < Q_i^2 < Q_j^2 < Q_k^2 < \ldots < Q^2$ representing the scales
where real parton emissions are allowed. Here $Q_0^2$ is intended to be the
factorization scale, while $Q^2$ is the virtuality of the parton which directly
interacts with the photon. Let us discuss the differences between this approach
and the jet calculus one. The structure of (74) is obtained by construction at
the exclusive level. The evolution function $E(u, Q_i^2, Q_k^2)$, as defined in
(6), can be regarded as the analogue at the inclusive level of the function
$G_{ik}^{n_{\mathrm{jet}}=1}(u, Q_i^2, Q_j^2)$: it inclusively sums all the
radiated partons between $Q_i^2$ and $Q_k^2$ and does not describe jet production.
We may define a new inclusive distribution [40] that gives the probability of
detecting hadrons in a portion of phase space defined by z and t. The sum
over the partons i, specified by x and Q2 , and struck by the virtual photon,
is understood as in the totally inclusive case (69)

$$ \frac{1}{\sigma_{tot}} \frac{d\sigma}{dx\,dQ^2\,dz\,dt} \equiv \sum_{i=q,\bar{q}} e_i^2\, M_i(x, Q^2, z, t) \tag{75} $$

We interpret such hadronic distributions as characterized only by the partonic
indices, where $x$ and $Q^2$ are determined by the scattered lepton variables,
as for the inclusive case in (69). The variables $z$ and $t$ are the fraction
of the longitudinal momentum and the invariant momentum transfer squared of
the final state hadrons $h_i$, respectively. The hadrons considered are those
contained in a portion of phase space R limited by the constraints

$$ R:\quad t_i = -(P - h_i)^2 < t, \qquad t_0 \le t \le Q^2 . \tag{76} $$


Once this procedure is followed, we may obtain $z$ by summing the fractional
longitudinal momenta of the hadrons satisfying the phase space constraint
of (76):

$$ z = \sum_i z_i , \qquad h_i \in R \tag{77} $$
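The selection in (76) and the sum in (77) can be illustrated with a minimal Python sketch. The `Hadron` record, the function name `summed_z` and the numerical values are hypothetical and serve only as an illustration; the sketch also assumes that hadrons with $t_i$ below the acceptance value $t_0$ are not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hadron:
    """Hypothetical record for one final state hadron."""
    z: float  # fractional longitudinal momentum z_i
    t: float  # invariant momentum transfer squared t_i = -(P - h_i)^2


def summed_z(hadrons, t, t0):
    """Sum z_i over the hadrons inside the region R of (76).

    Assumption: a hadron belongs to R when t0 <= t_i < t; hadrons below
    t0 are treated as unmeasured and are skipped.
    """
    if t < t0:
        raise ValueError("the cut t must not be smaller than t0")
    return sum(h.z for h in hadrons if t0 <= h.t < t)


# Made-up event, for illustration only.
event = [Hadron(0.10, 0.5), Hadron(0.20, 2.0),
         Hadron(0.05, 0.01), Hadron(0.30, 10.0)]
z_total = summed_z(event, t=5.0, t0=0.1)  # keeps the first two hadrons
```

Raising the cut `t` towards $Q^2$ includes more hadrons in R, so the summed `z` can only grow with `t` for a given event.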

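By using n-particle exclusive cross sections,

$$ \Sigma_{excl}^{(n)} \equiv \frac{1}{n!} \frac{d^{2n+2}\sigma^{(n)}}{dx\,dQ^2 \prod_{m=1}^{n} dz_m\, dt_m} , \tag{78} $$

which may be obtained directly from experiments, we may derive, in analogy
with (69), the hadronic distributions in (75) as

$$ \frac{1}{\sigma_{tot}} \frac{d\sigma}{dx\,dQ^2\,dz\,dt} \equiv \frac{1}{\sigma_{tot}} \sum_{k=1}^{\infty} \prod_{m=1}^{k} \int_{t_0}^{t} dt_m \int_0^1 dz_m\, \Sigma_{excl}^{(k)}\, \delta\!\left(z - \sum_{m=1}^{k} z_m\right) , \tag{79} $$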

with the R phase space constraints already implemented in the cross sections.
The minimum value $t_0$ corresponds to the beam pipe acceptance, where hadrons
are not measured and therefore must not be counted in $M_i$. We may recover
the structure function $F_2$ by simply integrating over the hadronic variables:

$$ \frac{d\sigma}{dx\,dQ^2} = \int_0^1 dz \int_{t_0}^{Q^2} dt\, \frac{d\sigma}{dx\,dQ^2\,dz\,dt} \tag{80} $$
The dynamics of the evolution along the struck parton line, as seen from
a leading logarithmic accurate evolution equation, can be sketched as fol-
lows: partons are emitted strongly ordered in t, with increasing values of t
along the line towards the virtual photon, while softest kt -emissions are clos-
est to the proton remnant. In such conﬁgurations, planar diagrams give, as
is well known in the inclusive case, the leading logarithmic contributions to
the cross sections, which are actually resummed by parton evolution functions
E(x, Q2i , Q2j ). Let us deﬁne, by using jet calculus rules as in the previous sec-
tions, the semi-inclusive extended functions Mi in terms of parton evolution
functions E(x, Q2i , Q2j )
$$ M_j(x, Q^2, z, t) = \int_x^{1-z} \frac{dw}{w}\, M_i(w, t, z, t)\, E_i^j(x/w, t, Q^2) \tag{81} $$

The Mi (w, t, z, t) are intended as the distributions corresponding to a parton

with longitudinal momentum w and scale t and a final state hadrons config-
uration specified by (76) and (77). The convolution limits are determined by
using momentum conservation. Once Mi (w, t, z, t) is measured, the evolution
can be obtained by differentiating (81) with respect to Q2

$$ Q^2 \frac{\partial}{\partial Q^2} M_j(x, Q^2, z, t) = \frac{\alpha_s(Q^2)}{2\pi} \int_{\frac{x}{1-z}}^{1} \frac{du}{u}\, P_k^j(u)\, M_k(x/u, Q^2, z, t) \tag{82} $$
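The step from (81) to (82) can be spelled out as follows. This is only a sketch: it assumes that the evolution function obeys the standard leading logarithmic equation written on the first line below, which is not stated in this section, and it uses lower indices on $M$ for the active parton.

```latex
% Assumption: E obeys the standard leading-log evolution equation
Q^2 \frac{\partial}{\partial Q^2} E_i^j(x, t, Q^2)
  = \frac{\alpha_s(Q^2)}{2\pi} \int_x^1 \frac{du}{u}\, P_k^j(u)\, E_i^k(x/u, t, Q^2)

% Step 1: differentiate (81); only E depends on Q^2
Q^2 \frac{\partial}{\partial Q^2} M_j(x, Q^2, z, t)
  = \frac{\alpha_s(Q^2)}{2\pi} \int_x^{1-z} \frac{dw}{w}\, M_i(w, t, z, t)
    \int_{x/w}^{1} \frac{du}{u}\, P_k^j(u)\, E_i^k\!\left(\frac{x}{wu}, t, Q^2\right)

% Step 2: exchange the integrations. The region x <= w <= 1-z, x/w <= u <= 1
% is the same as x/(1-z) <= u <= 1, x/u <= w <= 1-z
  = \frac{\alpha_s(Q^2)}{2\pi} \int_{\frac{x}{1-z}}^{1} \frac{du}{u}\, P_k^j(u)
    \int_{x/u}^{1-z} \frac{dw}{w}\, M_i(w, t, z, t)\, E_i^k\!\left(\frac{x/u}{w}, t, Q^2\right)

% Step 3: the inner integral is (81) evaluated at x/u with final index k
  = \frac{\alpha_s(Q^2)}{2\pi} \int_{\frac{x}{1-z}}^{1} \frac{du}{u}\, P_k^j(u)\, M_k(x/u, Q^2, z, t)
```

The lower limit $x/(1-z)$ in (82) is thus a direct consequence of the momentum conservation bound $w \le 1 - z$ in (81).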
This evolution equation actually resums large logarithms of the type
$\alpha_s \log(Q^2/t)$.
In reality, however, t-ordering is only partially realized. Higher-order
corrections together with large angle emissions, i.e. fixed order matrix
elements, produce partons that, even if originated by a parent parton with a
hard t-scale, end up along the time-like shower in final state hadrons with
soft values of t. Since neither the experiments nor the clustering procedure
distinguish the origin of such hadrons, the description becomes more reliable
as the accuracy in describing the partonic shower increases. This can be
achieved, for instance, by inserting appropriate higher loop splitting functions

$$ P_k^j(u) = P_k^{j\,(0)}(u) + \frac{\alpha_s}{2\pi}\, P_k^{j\,(1)}(u) + \ldots \tag{83} $$

The coefficients corresponding to the two-loop vertex functions are the ones
given in [24, 25]. They allow an evolution of next-to-leading logarithmic
accuracy and provide a space-like jet calculus formulation via fracture functions.
At variance with the inclusive case, Mi is more sensitive to the details of
the struck parton evolution, since a portion of the final state hadrons is
observed semi-inclusively and not just summed over. The evolution equation (82)
is formally equivalent to the one for one-particle inclusive extended fracture
functions [23]. The reason for this similarity is that, in (82), it is actually
the parton with the hardest ti = t̄ in the region R that drives the evolution of
M, as in the one-particle case. An illustrative example is found in the
diffractive data analysis at HERA [38, 52]. Whenever we perform a
semi-inclusive measurement with a leading baryon detected in the forward
spectrometer, we can of course describe the cross sections in terms of
one-particle extended fracture functions M. On the other hand, a diffractive
event is also specified by observing a gap in the forward hadronic activity
while the proton or its low-mass excitation escapes undetected. This ensemble
of particles, all of which have ti < t̄, is what we call, collectively, M.

5 Conclusions
Fracture functions represent a new approach and a useful theoretical tool
to describe initial state radiation in QCD semi-inclusive processes. A series
of successful applications has already been explored. Theoretical and
phenomenological developments are underway. Higher statistics data from HERA
and from the higher energy experiments at hadron colliders will constitute fur-
ther important tests for the fracture function idea.

Acknowledgements

This note has been written to celebrate the 65th birthday of Gabriele
Veneziano. It is a great privilege to work with Gabriele, to share with him
the bright intuition, the vivid imagination, the sharp reasoning and the pro-
found knowledge of physics. I would like to express to Gabriele also the deep

gratitude for the generosity, for the enthusiasm and, sometimes, for the en-
couragement he has been able to transmit, unchanged in the course of the
years, as a mentor and as a friend and for the continuing enjoyable collabo-
ration. I wish to Gabriele to be happy and to continue to do physics, in his
extraordinary way, for many more years to come, for his pleasure and ours. I
have much beneﬁted from conversations and discussions with several friends
and colleagues. In addition to Gabriele I would like also to thank Gianni
Camici, Federico Ceccopieri, Dirk Graudenz and Massimiliano Grazzini, for
the collaboration we have had on the topics discussed here.

References
1. R.P. Feynman: Phys. Rev. Lett. 23, (1969) 1415
J.D. Bjorken, E.A. Paschos: Phys. Rev. 185, (1969) 1975 181
2. H. Fritzsch, M. Gell-Mann, H. Leutwyler: Phys. Lett. B 47, 365 (1973) 181
3. H. D. Politzer: Phys. Rev. Lett. 30, 1346 (1973); D. J. Gross, F. Wilczek: Phys.
Rev. Lett. 30, 1343 (1973) 181
4. R. P. Feynman: Photon-Hadron Interactions (W. A. Benjamin Advanced Book
Program, New York, 1972) 181
5. L. Trentadue, G. Veneziano: Phys. Lett. B 323, 201(1994) 182, 184, 185, 186, 193, 194
6. R. Taylor: An Historical Review of Lepton Proton Scattering, SLAC-PUB-5832
(June 1992); G. Altarelli, Phys. Rep. 81, 1 (1992) 182
7. P. V. Landshoﬀ: in Proc. 27th Rencontre de Moriond on Perturbative QCD
and Hadronic Interactions (22–28 March, 1992), ed. by J. Tran Thanh Van
(Editions Frontieres), p. 393 and references therein, Gif-sur-Yvette, France 182
8. D. Amati, R. Petronzio, G. Veneziano: Nucl. Phys. B 140, 54 (1978), B 146,
29 (1978); R. K. Ellis, H. Georgi, M. Machacek, H. D. Politzer, G. G. Ross:
Phys. Lett. B 78, 281 (1978); Nucl. Phys. B 152, 285 (1979) 182, 184, 185
9. K. Konishi, A. Ukawa, G. Veneziano: Phys. Lett. B 78, 243 (1978) Phys. Lett.
B 80, 259 (1979); Nucl. Phys. B 157, 45 (1979) 182, 183, 185, 194, 215
10. D. Graudenz: Nucl. Phys. B 432, 351 (1994) 185, 190, 198
11. M. Grazzini, L. Trentadue, G. Veneziano: Nucl. Phys. B 519, 394 (1998) 186, 193, 195, 202
12. A. H. Mueller: Phys. Rev. D 18, 3705 (1978) 186, 187
13. L. Baulieu, E.G. Floratos, C. Kounnas: Nucl. Phys. B 166, 321 (1980) 189, 193
14. T. Munehisa: Prog. Theor. Phys. 67, 882 (1982) 189
15. J.C. Taylor: Phys. Lett. B 73, 85 (1978); Y. Kazama, Y.P. Yao: Phys. Rev.
Lett. 41, 611 (1978); Phys. Rev. D 19, 3111 (1979); T. Kubota: Nucl. Phys.
B165, 277 (1980); L. Baulieu, E.G. Floratos, C. Kounnas: Phys. Rev. D 23,
2464 (1981) 186
16. G. Altarelli, R.K. Ellis, G. Martinelli, S.Y. Pi: Nucl. Phys. B 160, 301 (1979) 190
17. S. Gupta, A.H. Mueller: Phys. Rev. D 20, 118 (1979) 190
18. G. Sterman: Phys. Rev. D 17, 2773 (1978) 191
19. J.C. Collins, D.E. Soper, G. Sterman: in Perturbative Quantum Chromodynam-
ics, ed. by A.H. Mueller (World Scientiﬁc, Singapore, 1989) 191
20. M. Grazzini: Nucl. Phys. B 518, 303 (1998); see also M. Grazzini: Phys. Rev.
D 57, 4352 (1998) 191
21. A. Berera, D.E. Soper: Phys. Rev. D 50, 4328 (1994) 193, 201

22. J. Collins: Phys. Rev. D 57, 305 (1998); Erratum ibid. D 61, 019902 (2000) 193, 195
23. G. Camici, M. Grazzini, L. Trentadue: Phys. Lett. B 439, 382 (1998) 194, 195, 217
24. A. Daleo, C.A. Garcia-Canal, R. Sassot: Nucl. Phys. B 662, 334 (2003) 195, 217
25. A. Daleo, R. Sassot: Nucl. Phys. B 673, 357 (2003) 195, 196, 217
26. V. Gribov, L. Lipatov: Sov. J. Nucl. Phys. 15, 438 (1972) [Yad. Fiz. 15, 781
(1972) ]; ibid. 15, 675 (1972) [Yad. Fiz. 15, 1218 (1972) ]; Yu. L. Dokshitzer: Sov.
Phys. JETP 46, 641 (1977) [Zh. Eksp. Teor. Fiz. 73, 1216 (1977)]; G. Altarelli,
G. Parisi: Nucl. Phys. B 126, 298 (1977) 196, 206
27. A. Daleo, D. de Florian, R. Sassot: Phys. Rev. D 71, 034013 (2005); A. Daleo,
R. Sassot: Phys. Rev. D 73, 054014 (2006) 196
28. F. A. Ceccopieri, L. Trentadue: Phys. Lett. B 636, 310 (2006) 196, 197
29. A. Bassetto, M. Ciafaloni, G. Marchesini: Nucl. Phys. B 163, 477 (1980) 196
30. X. Ji, J. Ma, F. Yuan: Phys. Rev. D 71, 034005 (2005) 198
31. J.D. Bjorken, J.B. Kogut, Phys. Rev. D 8, 1341 (1973) 200
32. G. Ingelman,P. Schlein: Phys. Lett. B 152, 256 (1985) 200, 201
33. F. Hautmann, Z. Kunszt, D. E. Soper: Nucl. Phys. B 563, (1999) 153; Phys.
Rev. Lett. 81, (1998) 3333 201
34. D. de Florian, R. Sassot: Phys. Rev. D 58, 054003 (1998) 201, 202, 203, 204, 205
35. A. Prinias [H1 and ZEUS Collaborations], Talk given at International Euro-
physics Conference on High-Energy Physics (HEP 97), Jerusalem, Israel, 19–26
August 1997 203
36. C. Adloff et al [H1 Collaboration]: Z. Phys. C 76, 613 (1997) 203, 206
37. J. Breitweg et al [ZEUS Collaboration]: Eur. Phys. J. C 1, 81 (1998) 203
38. A. Aktas et al [H1 Collaboration]: Eur. Phys. J. C 48, 715 (2006) 204, 205, 207, 208, 209, 21
39. J. C. Collins, L. Frankfurt, M. Strikman, Phys. Lett. B 307, 161 (1993) 201
40. F. Ceccopieri, L. Trentadue, arXiv:0706.4242 [hep-ph], Phys. Lett. B ( in press).
213, 215
41. S. Catani, Yu. L. Dokshitzer, B.R. Webber: Phys. Lett. B 285, 291 (1992) 214
42. O. V. Teryaev: Acta Phys. Polon. B 33, 3749 (2002) 211
43. O. V. Teryaev: Phys. Part. Nucl. 35, 524 (2004) 211
44. S. J. Brodsky, D. S. Hwang, I. Schmidt: Phys. Lett. B 530, 99 (2002) 211
45. A. Kotzinian: Phys. Lett. B 552, 172 (2003) 211
46. C. Adloff et al [H1 Collaboration]: Eur. Phys. J. C 30, 1 (2003) 204
47. W. Furmanski, R. Petronzio: Z. Phys. C 11, 293 (1982) 206
48. D. Graudenz, G. Veneziano: Phys. Rev. D 66, 010001 (2002) 207, 208
49. F. Maltoni, T. McElmurry, S. Willenbrock: Phys. Rev. D 72, 074024 (2005) 212
50. D. de Florian, C. A. Garcia Canal, R. Sassot: Nucl. Phys. B 470, 195 (1996) 208, 209
51. D. de Florian, O. A. Sampayo, R. Sassot: Phys. Rev. D 66, 010001 (2002) 208, 211
52. S. Chekanov et al. [ZEUS Collaboration]: Nucl. Phys. B 713, 3 (2005) 207, 217
53. S. Chekanov et al. [ZEUS Collaboration]: Eur. Phys. J. C 38, 43 (2004); 206
J. Breitweg et al. [ZEUS Collaboration]: Eur. Phys. J. C 5, 41 (1998);
K. Golec-Biernat, J. Kwiecinski: Phys. Lett. B 353, 329 (1995);
C. Royon et al.: Phys. Rev. D 63, 074004 (2001)
54. C. Adloff et al. [H1 Collaboration]: Eur. Phys. J. C 6, 421 (1999); 206
C. Adloff et al. [H1 Collaboration]: Eur. Phys. J. C 20, 29 (2001)
55. C. Adloff et al. [H1 Collaboration]: Phys. Lett. B 520, 191 (2001) 206
56. F. Abe et al. [CDF Collaboration]: Phys. Rev. Lett. 79, 2636 (1997);
F. Abe et al. [CDF Collaboration]: Phys. Rev. Lett. 78, 2698 (1997); 206
B. Abbott et al. [D0 Collaboration]: Phys. Lett. B 531, 52 (2002);
T. Affolder et al. [CDF Collaboration]: Phys. Rev. Lett. 84, 5043 (2000);
T. Affolder et al. [CDF Collaboration]: Phys. Rev. Lett. 85, 4215 (2000);
V. Abazov et al. [D0 Collaboration]: Phys. Lett. B 574, 169 (2003)

57. J. Bjorken, Phys. Rev. D 47, (1993) 101;

E. Gotsman, E. Levin, U. Maor, Phys. Lett. B 309, 199 (1993); 207
E. Gotsman, E. Levin, U. Maor, Phys. Lett. B 438, 229 (1998);
B. Cox, J. Forshaw, L. Lönnblad, JHEP 9910, 023 (1999);
A. Kaidalov, V. Khoze, A. Martin, M. Ryskin, Phys. Lett. B 567, 61 (2003)
58. M. Arneodo et al.: in Proc. of the HERA-LHC Workshop, ed. by A. De Roeck,
H. Jung (CERN-2005-014, 2005), p. 417 207
59. D. Graudenz: Fortsch. Phys. 45, 629 (1997) 212
60. S. Catani, L. Trentadue, Nucl. Phys. B 327, 323 (1989) 215
61. C.E. Detar, D.Z. Freedman, G. Veneziano: Phys. Rev. D 4, 906 (1971) 214
Part IV

Non-perturbative QCD
Coherence and Incoherence in QCD Jets
Dynamics (QCD Jets
and Branching Processes)

A. Giovannini1 and R. Ugoccioni2

1
Theoretical Physics Department, Torino University, Italy, and INFN, Sezione di
Torino, Italy
2
Theoretical Physics Department, Torino University, Italy, and INFN, Sezione di
Torino, Italy

Abstract. The interpretation of QCD jets as Markov branching processes obtained

by solving Konishi–Ukawa–Veneziano equations [1] in the leading logarithmic ap-
proximation with a fixed cut-off regularization prescription [2] is reviewed, and
its impact in multiparticle dynamics critically examined. Independent
intermediate gluon sources (clans) are generated through quark bremsstrahlung;
each source then decays into final partons according to a cascading mechanism dominated by
gluon self-interaction. At the hadron level, approximate universal regularities are
expected in the different components (or substructures) of the various classes of
high-energy collisions. The general behavior of collective variables of final multi-
plicity distributions is reproduced in terms of the weighted superposition of the
above-mentioned regularities controlling the component behaviors of each collision.
Predictions of signals of new physics at LHC [3] are reviewed, and perspective of
the 1/N expansion approach [4] indicated.

1 Introduction

The research activity of Gabriele Veneziano is very wide and covers different
fields, but usually the emphasis is on his discoveries in dual resonances models
and on his contribution to the understanding of string theory as the correct
quantum theory of gravity. On the occasion of his 65th birthday, which
motivated the present volume, we would like to point out also the impact of
his work on research in multiparticle production and correlations, in a region
that is by definition far from the perturbative sector of QCD. In this paper we
will focus our attention on the influence of Gabriele Veneziano in this sector of
physics, starting from his results on jet-calculus and their further applications
to a probabilistic description of parton showers in the leading logarithmic

A. Giovannini and R. Ugoccioni: Coherence and Incoherence in QCD Jets Dynamics (QCD
Jets and Branching Processes), Lect. Notes Phys. 737, 223–234 (2008)
DOI 10.1007/978-3-540-74233-6_11
© Springer-Verlag Berlin Heidelberg 2008

approximation (LLA). These developments, together with the idea of local
parton-hadron duality [13] and its related generalization [6], are at the basis
of quite successful models of event generators and of many lines of research in
this field.
The Konishi–Ukawa–Veneziano (KUV) evolution equations opened indeed
a new horizon in the basic understanding of the partonic sector of multipar-
ticle production processes. They provided a QCD basis for the approximate
description of observed universal regularities in the final charged-particle mul-
tiplicity distributions, in all classes of collisions—both in full phase space and
in (pseudo) rapidity windows—in terms of independent intermediate gluon
sources (or partonic clan ancestors) formation, which then decay into final
partons through a cascading mechanism dominated by gluon self-interactions. This
result is a consequence of the simplified description of quark and gluon QCD
jets as Markov branching processes, as obtained from the above-mentioned
KUV equations. One is led indeed to a sound QCD framework for the approx-
imate description of the observed final charged particle multiplicity distribu-
tions, and of the properties of the related collective variables discovered in
high energy collisions. These observations are the subject of the next section.

2 Elementary Models and Unexplained Facts in Multiparticle Dynamics in the Early 1970s

The first regular behavior to be recalled concerns the description of the
multiplicity distributions (MD) of all available final particles in high-energy
hadronic collisions, in the accelerator region, in terms of a two-parameter MD
which we will call from now on the Pascal (NB) regularity.1
The two mentioned parameters are the average charged-particle multiplic-
ity n̄ of the distribution itself, and the k parameter, which is linked to the
dispersion D of the distribution by the simple relation k = n̄2 /(D2 − n̄). The
accelerator results confirmed an earlier discovery (in the 1960s) of the multi-
plicity distribution for the pion component in cosmic ray physics, in a variety
of observations performed with different primary nucleon energies [9].
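The relation between k, n̄ and D can be made concrete with a short Python sketch. The explicit form of the Pascal (NB) probabilities used below is the standard textbook parametrization in terms of n̄ and k; it is not written out in the text, and the function names and numerical values are purely illustrative.

```python
import math


def k_from_moments(nbar, dispersion):
    """Return k = nbar^2 / (D^2 - nbar) from the mean and the dispersion D."""
    excess = dispersion ** 2 - nbar
    if nbar <= 0 or excess <= 0:
        raise ValueError("need nbar > 0 and D^2 > nbar for a positive k")
    return nbar ** 2 / excess


def pascal_pmf(n, nbar, k):
    """Pascal (NB) probability of n particles for mean nbar and parameter k.

    Standard parametrization (an assumption, not taken from the text):
    P(n) = Gamma(n + k) / (Gamma(k) n!) * (nbar/(nbar+k))^n * (k/(nbar+k))^k
    Evaluated through log-gamma so that non-integer k is handled as well.
    """
    log_p = (math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
             + n * math.log(nbar / (nbar + k))
             + k * math.log(k / (nbar + k)))
    return math.exp(log_p)


# Illustrative numbers only: nbar = 10 and D^2 = 35 give k = 4.
k = k_from_moments(10.0, math.sqrt(35.0))
probs = [pascal_pmf(n, 10.0, k) for n in range(400)]
mean = sum(n * p for n, p in enumerate(probs))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
```

Since the variance of this distribution is n̄ + n̄²/k, the computed `variance` reproduces the input D², and nothing in the formula restricts k to integer values or to values larger than one.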
The motivation of this successful phenomenological search in the accelerator
region (57 experiments were examined) has to be found in the statistical
generalization of the multiperipheral model of multiparticle production
(proposed from 1972 onwards), which, through the Poissonian superposition of
properly weighted multiperipheral diagrams, led to the Pascal (NB) description
of the final particle multiplicity distributions [7, 10].

1 It should be noted that the Pascal (NB) distribution appears with different
names in the literature on multiparticle dynamics. When introduced for the first
time it was called Pólya–Eggenberger [7] multiplicity distribution (a due tribute
to biology, where it has been widely applied), while in more recent times the name
Negative binomial (NB) was used (in memory of its statistical origin). However,
as the first historical appearance of the multiplicity distribution in science goes
back to Blaise Pascal [8], the name Pascal distribution was finally proposed, a
choice which will be followed in the present paper.

The parameter k was interpreted, in this context, as the ratio of the
reggeon–reggeon particle vertex to the pomeron–reggeon particle vertex; in its
high-energy limit it predicted Koba–Nielsen–Olesen (KNO) scaling violations
in multiparticle production for hadronic collisions, and provided an explana-
tion of observed deviations [11] from the standard expectations of the multi-
peripheral model in terms of the onset of the pomeron coupling.2
It should be recalled that the experimental occurrence of the Pascal distributions
in the MD of the final charged particles also suggested another possible
interpretation of the data in terms of a stochastic cell model [12]. This model
assumed stimulated emission of identical bosons by identical cells, where each
cell was producing a Bose–Einstein distribution. The parameter k is here an
integer number ≥ 1, and was interpreted as the number of identical cells in-
volved in the collision, according to an old idea of Max Planck [13]. Such an
interpretation was disproved by the data, since k was found to be in general
a non-integer number, and even smaller than one in some cases.

3 KUV Differential Evolution Equations and the Advent of QCD in the Late 1970s
It should be pointed out that KUV parton evolution equations [1, 2] are
an application of jet calculus in the leading log approximation (LLA) which
allows—as already mentioned—a probabilistic description of parton shower
processes originated by a quark or a gluon. Since both collinear and infrared
singularities are present in the LLA expression of the DGLAP kernels [14],
the new problem was how to cure such singularities.
It turns out that collinear singularities can be avoided by imposing a soft
cutoff on the evolution of the parton population, whereas infrared divergences
are cured by imposing a fixed cutoff $\epsilon$ on the variable $z$ in the
Dokshitzer–Gribov–Lipatov–Altarelli–Parisi (DGLAP) elementary kernel $P_{jk}(z)$
describing emission of parton $k$ from parton $j$, with parton $k$ carrying a
fraction $z$ of $j$'s momentum, i.e., $z_{\min} = \epsilon = 1 - z_{\max}$. The
integrals of the regularized kernels are then interpreted as elementary
splitting probabilities,

\[ A \equiv \int_{\epsilon}^{1-\epsilon} P_{gg}(z)\,dz = \frac{C_A}{\tilde{\epsilon}} = \frac{N_c}{\tilde{\epsilon}} ; \tag{1} \]

\[ \tilde{A} \equiv \int_{\epsilon}^{1-\epsilon} P_{gq}(z)\,dz = \frac{C_F}{\tilde{\epsilon}} = \frac{N_c^2 - 1}{2N_c\,\tilde{\epsilon}} ; \tag{2} \]

\[ B \equiv N_f \int_{\epsilon}^{1-\epsilon} P_{qg}(z)\,dz = \frac{N_f}{3} , \tag{3} \]

where $\tilde{\epsilon} = (-2\ln\epsilon)^{-1}$, and $N_f$, $N_c$ are the number of flavors and colors,
respectively.

2 The paper [7] was in part done during a stay at MIT of one of the present
authors, and profited from many discussions with G. Veneziano, as witnessed in the
acknowledgments at the end of the paper itself.

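As a rough numerical illustration of the splitting probabilities (1)-(3), the sketch below evaluates A, Ã and B for a given infrared cutoff. It assumes the colour factors C_A = N_c and C_F = (N_c² − 1)/(2N_c) and a rescaled cutoff equal to (−2 ln ε)⁻¹, with ε the cutoff on z; the function name, the cutoff values and the choice N_f = 3 are hypothetical.

```python
import math

def splitting_vertices(eps, n_c=3, n_f=3):
    """Return (A, A_tilde, B) for an infrared cutoff eps on z (0 < eps < 1/2)."""
    eps_tilde = 1.0 / (-2.0 * math.log(eps))  # rescaled cutoff, assumed form
    c_a = float(n_c)                          # gluon colour factor C_A
    c_f = (n_c ** 2 - 1) / (2.0 * n_c)        # quark colour factor C_F
    A = c_a / eps_tilde                       # g -> g + g
    A_tilde = c_f / eps_tilde                 # q -> q + g
    B = n_f / 3.0                             # g -> q + qbar
    return A, A_tilde, B

# The ratio A_tilde / A is independent of the cutoff and equals C_F / C_A.
ratios = []
for eps in (0.1, 0.01, 0.001):
    A, A_tilde, B = splitting_vertices(eps)
    ratios.append(A_tilde / A)
```

The individual values of A and Ã depend on the arbitrary cutoff, whereas their ratio does not; this is why the ratio can later acquire a physical meaning.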
The jet thickness Y can be used as evolution variable from the virtual
scale W down to the scale Q,

\[ Y = \frac{1}{2\pi b}\,\log\frac{\alpha_s(Q^2)}{\alpha_s(W^2)} = \frac{1}{2\pi b}\,\log\frac{\log(W^2/\Lambda^2)}{\log(Q^2/\Lambda^2)} , \tag{4} \]

with b = (11Nc − 2Nf )/12π. The jet thickness Y contains the dependence
on the running coupling constant (leading order), and is the mixture of three
scales: W (the virtual mass of the primary parton), Q (the splitting scale)
and the QCD scale Λ. Y is a small number (< 1) during the early stages of
the shower evolution ($Q \lesssim W$). The probability $P_q(Q|W)\,dQ$ that a quark q
of virtuality W splits in the range [Q, Q + dQ], by emitting a gluon, is then
given by [2]

\[ P_q(Q|W)\,dQ = e^{-\tilde{A}Y}\,\tilde{A}\,dY , \tag{5} \]

and the probability $P_g(Q|W)\,dQ$ that a gluon g splits (by either emitting
another gluon or a quark–antiquark pair) by

\[ P_g(Q|W)\,dQ = e^{-(A+B)Y}\,(A+B)\,dY . \tag{6} \]
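The two forms of the jet thickness in (4) and the exponential factor in (5) and (6) can be checked with a minimal sketch. It assumes the leading-order running coupling 1/(b log(Q²/Λ²)); the scales W = 100, Q = 10 and Λ = 0.2 (in GeV) and all function names are illustrative assumptions.

```python
import math

def beta_b(n_c=3, n_f=3):
    """Coefficient b = (11 N_c - 2 N_f) / (12 pi)."""
    return (11 * n_c - 2 * n_f) / (12.0 * math.pi)

def alpha_s_lo(scale, lam, b):
    """Leading-order running coupling at the given scale (assumed form)."""
    return 1.0 / (b * math.log(scale ** 2 / lam ** 2))

def jet_thickness(W, Q, lam, b):
    """Jet thickness Y written with the ratio of logarithms."""
    ratio = math.log(W ** 2 / lam ** 2) / math.log(Q ** 2 / lam ** 2)
    return math.log(ratio) / (2.0 * math.pi * b)

def no_split_probability(rate, Y):
    """Probability that no splitting has occurred up to thickness Y."""
    return math.exp(-rate * Y)

b = beta_b()
W, Q, lam = 100.0, 10.0, 0.2  # illustrative scales in GeV
Y_from_logs = jet_thickness(W, Q, lam, b)
Y_from_couplings = math.log(alpha_s_lo(Q, lam, b) / alpha_s_lo(W, lam, b)) / (2.0 * math.pi * b)
```

With these inputs Y stays well below one, in line with the statement that the thickness is small in the early stages of the shower.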

Neglecting conservation laws, the last two equations imply that the splitting
probability per unit thickness is constant in each dY interval. This simplifying
assumption allows us to classify the process as Markovian, and therefore to
write the corresponding approximate (forward and backward) Kolmogorov
equations for the probabilities of creating $n_q$ quarks and $n_g$ gluons, starting
from an initial quark, $P_q(n_q, n_g; Y)$, or an initial gluon, $P_g(n_q, n_g; Y)$, at
thickness Y. The corresponding non-zero transition probabilities in the
interval dY are

\[ \begin{aligned}
(n_q, n_g) \to (n_q, n_g) &= 1 - A n_g\,dY - \tilde{A} n_q\,dY - B n_g\,dY ; \\
(n_q, n_g) \to (n_q, n_g + 1) &= A n_g\,dY + \tilde{A} n_q\,dY ; \\
(n_q, n_g) \to (n_q + 2, n_g - 1) &= B n_g\,dY .
\end{aligned} \tag{7} \]
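The transition probabilities (7) define a continuous-time Markov branching process in the variable Y, which can be sampled event by event. The sketch below is one possible simulation of it (a standard waiting-time algorithm, not the authors' own code); the vertex values A = 1, Ã = 4/9, B = 0, the thickness Y = 1, the seed and the sample size are arbitrary illustrative choices.

```python
import math
import random

def simulate_shower(A, A_tilde, B, Y_max, rng, n_q=1, n_g=0):
    """Evolve (n_q, n_g) up to thickness Y_max; defaults start from one quark."""
    Y = 0.0
    while True:
        rate_emit = A * n_g + A_tilde * n_q  # (n_q, n_g) -> (n_q, n_g + 1)
        rate_pair = B * n_g                  # (n_q, n_g) -> (n_q + 2, n_g - 1)
        total = rate_emit + rate_pair
        if total == 0.0:
            break
        Y += rng.expovariate(total)          # exponential waiting "time" in Y
        if Y > Y_max:
            break
        if rng.random() * total < rate_emit:
            n_g += 1
        else:
            n_q += 2
            n_g -= 1
    return n_q, n_g

rng = random.Random(12345)
A, A_tilde, B, Y_max = 1.0, 4.0 / 9.0, 0.0, 1.0
gluon_counts = [simulate_shower(A, A_tilde, B, Y_max, rng)[1] for _ in range(20000)]
mean_gluons = sum(gluon_counts) / len(gluon_counts)

# Mean gluon number in a quark-initiated shower when B = 0.
expected_mean = A_tilde * (math.exp(A * Y_max) - 1.0) / A
```

The sampled mean can be compared with the value obtained from the generating-function solution for B = 0; agreement is only expected within statistical fluctuations.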

It is simpler to use the generating functions, calculated from the corresponding
transition probabilities,

\[ G_a(u, v; Y) \equiv \sum_{n_q, n_g} u^{n_q} v^{n_g} P_a(n_q, n_g; Y) , \tag{8} \]

where a = q, g. Accordingly, the following differential equations are then
obtained:

\[ \frac{dG_g}{dY} = A\,(G_g^2 - G_g) + B\,(G_q^2 - G_g) , \tag{9} \]

\[ \frac{dG_q}{dY} = \tilde{A}\,G_q\,(G_g - 1) . \tag{10} \]
When the production of quark–antiquark pairs can be neglected (i.e., when
B = 0) the above equations decouple; by looking only at the gluon population
generating function at Y one has
\[ G_g(u, v; Y) = v\,[v + (1 - v)\,e^{AY}]^{-1} , \tag{11} \]

\[ G_q(u, v; Y) = u\,[v + (1 - v)\,e^{AY}]^{-\tilde{A}/A} . \tag{12} \]
It turns out that the gluon multiplicity distribution, in a gluon-initiated
shower, is a shifted geometric distribution with average gluon multiplicity $e^{AY}$
and parameter k ≈ 1. The gluon multiplicity, in a quark-initiated shower, is
instead a Pascal (NB) multiplicity distribution, with average gluon multiplicity
$\bar{n} = \tilde{A}(e^{AY} - 1)/A$ and parameter $k = \tilde{A}/A$ (= 4/9 for $N_c = 3$): k is then the
ratio between the gluon bremsstrahlung initiated by a quark (q → q + g), with
vertex Ã, and the gluon self-interaction (g → g + g), with vertex A.
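The closed forms (11) and (12) can be cross-checked against a direct numerical integration of equations (9) and (10) with B = 0. The sketch below uses a hand-written fourth-order Runge-Kutta step; the initial conditions G_g = v and G_q = u at Y = 0 (one initial gluon or one initial quark) and the numerical values are assumptions made for the illustration.

```python
import math

def rhs(G_g, G_q, A, A_tilde, B):
    """Right-hand sides of the evolution equations for G_g and G_q."""
    dG_g = A * (G_g ** 2 - G_g) + B * (G_q ** 2 - G_g)
    dG_q = A_tilde * G_q * (G_g - 1.0)
    return dG_g, dG_q

def integrate(u, v, Y, A, A_tilde, B, steps=2000):
    """RK4 integration from thickness 0 to Y with G_g(0) = v, G_q(0) = u."""
    G_g, G_q = v, u
    h = Y / steps
    for _ in range(steps):
        k1 = rhs(G_g, G_q, A, A_tilde, B)
        k2 = rhs(G_g + 0.5 * h * k1[0], G_q + 0.5 * h * k1[1], A, A_tilde, B)
        k3 = rhs(G_g + 0.5 * h * k2[0], G_q + 0.5 * h * k2[1], A, A_tilde, B)
        k4 = rhs(G_g + h * k3[0], G_q + h * k3[1], A, A_tilde, B)
        G_g += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        G_q += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return G_g, G_q

def closed_form(u, v, Y, A, A_tilde):
    """Closed-form solutions valid when B = 0."""
    bracket = v + (1.0 - v) * math.exp(A * Y)
    return v / bracket, u * bracket ** (-A_tilde / A)

u, v, Y, A, A_tilde = 0.9, 0.7, 1.0, 1.0, 4.0 / 9.0
numeric = integrate(u, v, Y, A, A_tilde, B=0.0)
analytic = closed_form(u, v, Y, A, A_tilde)
```

Setting B to a non-zero value in `integrate` gives a quick way to see how far the coupled system departs from the decoupled closed forms.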
KUV evolution equations revealed in this way the approximate QCD skele-
ton in the early stages of multi-parton production: they single out the essen-
tials of QCD dynamics to be taken into account in its application to the
exploration of the partonic sector. The evolution is characterized at this stage
by the dominance of the g → g + g vertex over the g → q + q̄ vertex, and by
the weak effects of coherence and conservation laws.
We can summarize the situation after Sects. 2 and 3 as follows.
(a) From Sect. 2 one learns that the Pascal (NB) multiplicity distribution
appears experimentally as the natural candidate for describing the final
pion multiplicity distributions in cosmic ray physics; the parameter k
decreases going from low to high energy of the primary hadron, whereas
the average charged particle multiplicity n̄ increases in the same energy
range.
The mentioned regularity appears also in the accelerator region (it has
been tested in 57 experiments), although the general trend of its parameters
is not so spectacular as in cosmic rays, in view of the relatively low
$p_{\rm lab}$ of the incident particle in fixed-target experiments. It also appears
in theoretical work on the statistical generalization of the multiperipheral
model, where k is interpreted as the ratio of reggeon to pomeron couplings,
or, more generally, in terms of a coherent production mechanism over an
incoherent one.
(b) From Sect. 3 one notices the occurrence of the Pascal (NB) multiplicity
distribution also in the approximate description of QCD parton show-
ers, originated by an initial quark and an initial gluon according to the
corresponding KUV evolution equation under the simplified assumption
B = Nf /3 → 0.
According to the common wisdom of the time, the occurrence of the Pascal
multiplicity distribution in so many (apparently different) theoretical and
experimental situations was not considered interesting enough. Very few people
in the field had the opposite point of view; for them, the wide occurrence of
the Pascal MD in hadronic reactions was a signature of the approximately unified
nature of multiparticle production processes, and of the universality of QCD
thanks to the Markov branching nature of the quark and gluon showers in the
early stages of their evolution.

4 The Collaboration with Léon Van Hove, and the UA5 Collaboration Results at CERN pp̄ Collider on Multiplicity Distributions, in Full Phase Space and in Restricted Pseudo-rapidity Windows
Results summarized in points (a) and (b) of Sect. 3 were not overlooked by
Léon Van Hove. His interest in the subject was enhanced by the discovery
by the UA5 Collaboration (in the 1980s) that the Pascal (NB) regularity
described quite well the MD of final charged particles in pp̄ collisions at the
CERN Collider at various c.m. energies (200, 546 and 900 GeV), not only in full
phase space, but also in restricted pseudo-rapidity windows [15].
The result by the UA5 Collaboration was independent from the previous
results obtained in the full phase space in cosmic ray physics and in hadronic
collisions in the accelerator region, and initially was not related to the the-
oretical work which had led to the introduction of the Pascal (NB) MD in
high-energy phenomenology. The UA5 results were confirmed by the NA22
Collaboration data on MDs in pp and π±p collisions at 22 GeV c.m. energy
[16].
The characteristic experimental trend of the parameters of the Pascal (NB)
MD was that the k parameter decreased in full phase space as the c.m.
energy increased, while the opposite occurred for the average charged particle
multiplicity n̄, as expected from previous experiments at lower energies and
in cosmic ray physics; in addition, at different fixed c.m. energies, k and n̄
both decreased going from large to restricted (pseudo-)rapidity windows.
All these facts led Léon Van Hove to look for a new interpretation of the
occurrence of the regularity in hadronic collisions. A first paper was then
produced [17].
One question was still to be answered, concerning e+e− annihilation
and deep inelastic scattering. More precisely, is the regularity found in
hadron–hadron collisions also present in other classes of collisions? Assuming
the answer is positive, what is the trend of its parameters with respect to the
energy and rapidity variables?
A positive reply would have been remarkable for exploring the partonic
sector controlled in its early stages by QCD through KUV equations, where
the Pascal (NB) MD had also been discovered in the form discussed in Sect. 3.
This search was motivated by the conviction that the complex structures
which we observe in experimental data often have a simple origin at the parton
level, and are revealed by universal approximate regularities at the hadron
level. The intimate conviction was indeed that an eventual satisfactory
explanation of the observed regularity at the final hadron level should be found in
a QCD framework at the parton level. In order to proceed in our program,
we had to solve the following two problems:

(a) calculate final parton MD from KUV and DGLAP equations or, in more
general terms, join the non-perturbative to the perturbative sector of
parton showers;
(b) look for final charged particle MD in e+e− annihilation and deep inelastic
scattering experiments.

In a region where QCD had no predictions the answer to the point (a) was
found by following an encouraging result provided by W. Kittel [18]. It led us
to rely on the Monte Carlo model of event generator JETSET 7.2 [19], which
is based on DGLAP equations in the partonic sector, and has a hadronization
prescription based on the string model for the transfer of information from
the partonic to the hadronic sector [20]. All event generators have indeed one
feature in common, namely the DGLAP or KUV equations, and differ by
the type of hadronization model, which is the string model in JETSET, and
the cluster model in HERWIG [21]; more recently, a statistical hadronization
model has been proposed by F. Becattini [22].
In order to answer point (b), the HRS and EMC Collaborations were
asked to produce the requested data on final hadron multiplicity distributions.
Remarkably, all replies were positive [23, 24]. The new facts were the following.

(a) The final parton multiplicity distributions originated, at various
virtualities, by a quark–antiquark system and by a gluon–gluon system were all
approximate Pascal (NB) MD, either in full phase space or in restricted
windows in rapidity [6].
In addition, the parameter k was approximately the same after the
hadronization, and the average number of particles was varying by a
constant factor ρ ≈ 2, i.e. $\bar{n}_{\rm hadron} = \rho\,\bar{n}_{\rm parton}$ and $k_{\rm hadron} = k_{\rm parton}$ (the
generalized local parton–hadron duality (GLPHD) prescription).
Notice that LPHD [5] is based on preconfinement (i.e., it says that
$\bar{n}_{\rm hadron} = \rho\,\bar{n}_{\rm parton}$), whereas GLPHD is expressed in terms of n-particle
inclusive rapidity distributions of partons, $Q_{n,p}(y_1, \dots, y_n)$, and hadrons,
$Q_{n,h}(y_1, \dots, y_n)$, by the equations

\[ Q_{n,h}(y_1, \dots, y_n) = \rho^n\,Q_{n,p}(y_1, \dots, y_n) , \tag{13} \]

ρ being constant. In addition, by assuming that one of the two (partonic
or hadronic) MD is of the Pascal (NB) type in a rapidity domain, then the
other one is again a Pascal (NB) multiplicity distribution, and the
corresponding NB parameters are linked by the above relations with constant
ρ [6].
(b) The Pascal (NB) distribution described the MD of the final charged
particles in all classes of examined collisions, reproducing with specific
parameters the trends at various c.m. energies in full phase space and in
restricted (pseudo-)rapidity windows.
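The GLPHD statement in item (a) can be illustrated at the level of factorial moments, which are the integrals of the n-particle inclusive distributions over the rapidity domain. The sketch below assumes the standard factorial moments of a Pascal (NB) distribution; the parton-level values (mean 5, k = 4) and ρ = 2 are illustrative, and the function name is hypothetical.

```python
def nb_factorial_moment(q, nbar, k):
    """Factorial moment F_q of a Pascal (NB) distribution with mean nbar and parameter k."""
    value = nbar ** q
    for i in range(1, q):
        value *= 1.0 + i / k
    return value

rho = 2.0
nbar_parton, k_parton = 5.0, 4.0  # illustrative parton-level parameters

comparison = []
for q in range(1, 6):
    # Integrating the n-particle relation over rapidity multiplies F_q by rho^q.
    hadron_from_scaling = rho ** q * nb_factorial_moment(q, nbar_parton, k_parton)
    # An NB distribution with mean rho * nbar and the same k has these moments.
    hadron_direct = nb_factorial_moment(q, rho * nbar_parton, k_parton)
    comparison.append((hadron_from_scaling, hadron_direct))
```

The two columns coincide order by order, which is the moment-level counterpart of the statement that the hadronic MD is again Pascal (NB) with the mean rescaled by ρ and k unchanged.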
An avalanche of data was then produced, leading to the same results in all
experiments available at that time. All these facts suggested the
interpretation of the Pascal (NB) MD in terms of a two-step process [25]. In the
first step, independent objects (the clan ancestors) are produced according to
a Poisson MD; in the second step, each ancestor decays, following a logarithmic
MD (the clan MD). No correlations exist among particles produced in different
clans, and each clan contains at least one particle.
This new perspective led to introducing two new variables in the produc-
tion process: the average number of clans, N̄ , and the average number of
particles per clan, n̄c , which are linked to the standard parameters n̄ and k
by the relations

\[ \bar{N} = k \ln\left(1 + \frac{\bar{n}}{k}\right) , \tag{14} \]

\[ \bar{n}_c = \frac{\bar{n}}{\bar{N}} . \tag{15} \]
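The two-step clan picture can be verified at the level of generating functions: a Poisson number of clans, each decaying according to a logarithmic distribution, reproduces the Pascal (NB) generating function when the clan parameters are related to the mean and to k as in (14) and (15). The sketch below assumes the standard forms of the three generating functions; the function names and the numerical values are illustrative.

```python
import math

def clan_parameters(nbar, k):
    """Average number of clans and average number of particles per clan."""
    n_clans = k * math.log(1.0 + nbar / k)
    return n_clans, nbar / n_clans

def nb_generating_function(z, nbar, k):
    """Generating function of the Pascal (NB) distribution."""
    return (1.0 + nbar * (1.0 - z) / k) ** (-k)

def log_generating_function(z, b):
    """Generating function of the logarithmic distribution (n >= 1)."""
    return math.log(1.0 - b * z) / math.log(1.0 - b)

def clan_generating_function(z, nbar, k):
    """Poisson-distributed clans, each with a logarithmic multiplicity."""
    n_clans, _ = clan_parameters(nbar, k)
    b = nbar / (nbar + k)
    return math.exp(n_clans * (log_generating_function(z, b) - 1.0))

nbar, k = 12.0, 3.0  # illustrative values
n_clans, n_per_clan = clan_parameters(nbar, k)
b = nbar / (nbar + k)
# Mean of the logarithmic distribution, for comparison with n_per_clan.
log_mean = -b / ((1.0 - b) * math.log(1.0 - b))
z_values = (0.0, 0.3, 0.9, 1.0)
pairs = [(nb_generating_function(z, nbar, k), clan_generating_function(z, nbar, k))
         for z in z_values]
```

At z = 0 the comparison reduces to the statement that the probability of producing no particles equals the Poisson probability of producing no clans.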
The introduction of the clan concept in the interpretation of the approx-
imate Pascal (NB) universal regularity, in the mentioned experiments, led to
very suggestive results. The average number of clans is larger in e+ e− anni-
hilation than in hadron–hadron collisions, whereas just the opposite occurs
for the average number of particles per clan. Clan bremsstrahlung is stronger
and clan size smaller in the former case than in the latter. An intermediate
situation occurs for deep inelastic scattering, where clans are less numerous
than in e+ e− , but the average number of particles per clan is much larger.
In addition, pumping energy into a collision does not increase the average
number of clans but only the average length of the showers, which are larger
in more central than in more peripheral rapidity intervals (see Fig. 1).
Clan formalism can be applied also at parton level, where it disentangles
the two above-mentioned QCD vertices; in fact, by recalling the results
described in Sect. 3, one obtains

\[ \bar{N} = \tilde{A}\,Y , \tag{16} \]

\[ \bar{n}_c = \frac{e^{AY} - 1}{AY} , \tag{17} \]

i.e., gluon production from a quark is controlled by parameter Ã (average clan
production), and gluon emission from a gluon is controlled by parameter A
(average gluon shower production inside clans). Clans in this context can be
approximately understood (under the assumption Y < 1) as bremsstrahlung
gluon jets. These can be considered indeed as the building blocks of a unified,
although approximate, description of multiparticle production in all classes of
collisions.
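The parton-level results (16) and (17) follow from inserting the quark-jet Pascal (NB) parameters of Sect. 3 into the clan definitions of the average number of clans and of particles per clan. A minimal numerical check, with illustrative values of the vertices and of Y (all names are hypothetical):

```python
import math

def parton_level_clans(A, A_tilde, Y):
    """Clan parameters of the gluon MD in a quark-initiated shower (B = 0)."""
    nbar = A_tilde * (math.exp(A * Y) - 1.0) / A  # average gluon multiplicity
    k = A_tilde / A                               # NB parameter
    n_clans = k * math.log(1.0 + nbar / k)        # average number of clans
    n_per_clan = nbar / n_clans                   # average gluons per clan
    return n_clans, n_per_clan

A, A_tilde = 1.0, 4.0 / 9.0  # illustrative vertex values
results = {Y: parton_level_clans(A, A_tilde, Y) for Y in (0.1, 0.5, 1.0)}
```

For small Y the number of gluons per clan stays close to one, consistent with reading the clans as single bremsstrahlung gluon jets in that regime.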
[Figure 1: two panels showing N̄ (left) and n̄c (right) versus ycut or ηcut, for pp̄ at 546 GeV, pp at 62 GeV, pp at 31 GeV and e+e− at 29 GeV]

Fig. 1. Average number of clans, N̄ (left panel), and average number of particles
per clan, n̄c (right panel), versus the half-width of the pseudo-rapidity (for 546
GeV data) or rapidity (for the other energies) interval [26]

The picture one has in mind goes as follows: independent, intermediate
gluon sources are produced at the parton level via bremsstrahlung, and they
later decay into gluon showers dominated by gluon self-interactions. The
GLPHD prescription allows us to determine the corresponding evolution of the
showers at the hadron level, with specific different behavior in the various
classes of collisions.
The next step to be performed, in order to overcome the approximate
description of multiparticle production processes, was to build up a parton
shower model based on essentials of QCD in a correct kinematical framework,
including conservation laws and coherence effects.

5 New Experimental Findings on Final Charged Particle MD in e+e− Annihilation at LEP c.m. Energy, and More Precise Measurements on Final Particle MD at pp̄ Collider Top c.m. Energy. The Occurrence of Substructures or Components in the Various Collisions
The situation described in Sect. 4 was simple and quite satisfactory. As
already pointed out, however, including coherence effects and conservation laws
in such a framework was still an open problem. This search was in part
accomplished. But since its natural goal was to build up a Monte Carlo event
generator model, to be added to the already existing (and successful) ones, we
decided to pay more attention to the new experimental facts in multiparticle
production processes, which required a deeper level of investigation than
previously thought. We sketch below the relevant steps; a complete review can
be found in [27].
The Shoulder Effect
A shoulder structure in the multiplicity distribution (“shoulder effect”)
was seen experimentally in pp̄ collisions at 900 GeV c.m. energy and in e+ e−
annihilation at LEP c.m. energies. This was explained with a weighted super-
position mechanism of two classes of events [28], each described by Pascal (NB)
MD, and identified, respectively, with soft (without mini-jets) and semihard
(with mini-jets) events in pp̄, and with two-jet and three-or-more-jet events
in e+ e− .
Cumulant and Factorial Moments
When computing higher-order cumulant and factorial moments of exper-
imental MD, their ratio Hq , when plotted versus the order q, shows sign
oscillations (both in pp̄ and e+ e− ): the weighted superposition mechanism of
two classes of events described by Pascal (NB) MD again is able to explain
this feature [29].
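The weighted superposition mechanism and the ratio H_q can be explored numerically with the sketch below. It assumes that H_q is the ratio of the factorial cumulant K_q to the factorial moment F_q, with the cumulants obtained from the standard recursion between the two sets of moments; the weight 0.7 and the NB parameters of the two components are purely illustrative and are not fitted to any data.

```python
import math

def nb_pmf(n, nbar, k):
    """Pascal (NB) probability for n particles, mean nbar, parameter k."""
    log_p = (math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
             + n * math.log(nbar / (nbar + k))
             + k * math.log(k / (nbar + k)))
    return math.exp(log_p)

def falling_factorial(n, q):
    """n (n - 1) ... (n - q + 1); vanishes when n < q."""
    result = 1.0
    for i in range(q):
        result *= n - i
    return result

def factorial_moments(pmf, q_max):
    """Unnormalized factorial moments F_0 ... F_qmax of a tabulated MD."""
    return [sum(p * falling_factorial(n, q) for n, p in enumerate(pmf))
            for q in range(q_max + 1)]

def factorial_cumulants(F):
    """K_q from F_q = sum_{i=0}^{q-1} C(q-1, i) K_{q-i} F_i, with F_0 = 1."""
    K = [0.0] * len(F)
    for q in range(1, len(F)):
        K[q] = F[q] - sum(math.comb(q - 1, i) * K[q - i] * F[i]
                          for i in range(1, q))
    return K

def h_ratios(pmf, q_max):
    """H_q = K_q / F_q for q = 1 ... q_max."""
    F = factorial_moments(pmf, q_max)
    K = factorial_cumulants(F)
    return [K[q] / F[q] for q in range(1, q_max + 1)]

N_MAX, ALPHA = 800, 0.7  # truncation and weight of the first class (assumed)
soft = [nb_pmf(n, 20.0, 7.0) for n in range(N_MAX)]
semihard = [nb_pmf(n, 40.0, 3.5) for n in range(N_MAX)]
total = [ALPHA * s + (1.0 - ALPHA) * h for s, h in zip(soft, semihard)]

h_single = h_ratios(soft, 5)   # one Pascal (NB) component alone
h_total = h_ratios(total, 5)   # weighted superposition of the two components
```

For a single Pascal (NB) component all H_q are positive, so any change of sign found in `h_total` at higher orders must come from the superposition; whether and where this happens depends on the parameters chosen.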
Forward–Backward Multiplicity Correlations
Forward–backward multiplicity correlations (FBMC) in e+ e− annihilation
and in pp̄ collisions appear different: barely visible in the first case, rather
stronger and increasing with c.m. energy in the second one. Both behaviors can
be explained combining together the weighted superposition mechanism and
particle production via clans [30], the differences in clan behavior being just
the key for correctly describing the features of FBMC in the different reactions.
Notice that this makes FBMC a very relevant characteristic to investigate in
future experiments at CERN, because they can be used to explore the “color”
landscape of very high-energy collisions.

The point to be stressed is that all the above-mentioned experimental
facts can be explained in terms of the weighted superposition mechanism of
two classes of events (or components), each described by a Pascal MD with
characteristic parameters. This explanation allows us to maintain the original
simple interpretation of the regularity, which was violated when applied to
the full sample of events: the regularity is not a property of the full sample of
events of the various collisions (except at lower energies, when the full sample
essentially coincides with a single class); instead, it is a property of the dif-
ferent components or substructures (classes of events) in which each reaction
could be eventually disentangled, and whose properly weighted superposition
should reproduce observed experimental data for the general behavior of col-
lective variables.
6 New Physics at CERN. The Weighted Superposition of Three Classes of Events (Soft, Semihard, and Hard) in pp Collisions at LHC
A new class of hard events to be added to the soft class of events (no mini-
jets), and to the semihard class of events (with mini-jets), has been envisioned
to exist at LHC [3]. It has been proposed to be described by a Pascal (NB)
MD with $k \ll 1$ and n̄ very large (or, in the clan language, by N̄ ≈ 1, n̄c very
large). The total MD of the final charged particles is given by its weighted
superposition with the soft and semihard classes of events. The problem is
how to distinguish the three classes of events.
In addition to the seminal work of Gabriele and coworkers on KUV equa-
tions (which led to the understanding of parton showers and of the occurrence
of the Pascal (NB) regularity in multiparticle production), there is another
work (among his many papers) which might have—in our opinion—a new
interesting application. We are referring to the article on the 1/N expansion
(with N the number of chains) and on Bose–Einstein (BE) interferometry [4].
Although the predictions presented in that paper were disproved by the
data shortly after the paper was produced, we recall that the paper foresaw
indeed no BE interference, contrary to experimental findings, in case of a
coherent reaction like e+e− annihilation (the number of chains is here just
one). The interference was expected to be large in pp (one pomeron exchange,
N = 2), and even larger in pp̄ (N = 3). This trend of a coherent versus an
incoherent reaction is today what one should expect for disentangling eventual
substructures in the minimum bias sample of events in pp collisions at LHC
[31]. The three classes of events, each described by a Pascal (NB) MD with
different parameters, would correspond to:
(a) one parton to one parton scattering (soft class of coherent events);
(b) two partons to two partons scattering (semihard class of partly coherent
events);
(c) three partons to three partons scattering (hard class of incoherent events).
As the number of parton–parton scatterings (chains) becomes larger the col-
lision becomes harder, the temperature higher and the parton density larger.
Future experiments at LHC will provide data to be confronted with this spec-
ulative thought.

References
1. K. Konishi, A. Ukawa, G. Veneziano: Nucl. Phys. B 157, 45 (1979)
2. A. Giovannini: Nucl. Phys. B 161, 429 (1979)
3. A. Giovannini, R. Ugoccioni: Phys. Rev. D 59, 094020 (1999); Phys. Rev. D 60, 074027 (1999)
4. A. Giovannini, G. Veneziano: Nucl. Phys. B 130, 61 (1977)
5. D. Amati, G. Veneziano: Phys. Lett. B 83, 87 (1979)
6. L. Van Hove, A. Giovannini: Acta Phys. Pol. B 19, 917 (1988)
7. A. Giovannini: Il Nuovo Cimento A 10, 713 (1972); Il Nuovo Cimento A 15, 543 (1973)
8. B. Pascal: Varia Opera Mathematica, D. Petri de Fermat (Tolossae, France, 1679)
9. P.K. MacKeown, A.W. Wolfendale: Proc. Phys. Soc. 89, 553 (1966)
10. A. Giovannini, P. Antich, E. Calligarich, G. Cecchet, R. Dolfini, F. Impellizzeri, S. Ratti: Il Nuovo Cimento A 24, 421 (1974); M. Garetto, A. Giovannini, E. Calligarich, G. Cecchet, R. Dolfini, S. Ratti: Il Nuovo Cimento A 38, 38 (1977)
11. M. Derrick et al.: Phys. Rev. Lett. 29, 515 (1972)
12. L. Mandel: Proc. Phys. Soc. London, 233 (1959)
13. M. Planck: Sitzungsber. Deutsch. Akad. Wiss. Berlin 33, 355 (1923)
14. Yu. L. Dokshitzer, V. A. Khoze, A. H. Mueller, S. I. Troyan: Basics of Perturbative QCD (Editions Frontières, Gif-sur-Yvette, 1991)
15. G. J. Alner et al. (UA5 Collaboration): Phys. Lett. B 160, 193 (1985)
16. M. Adamus et al. (NA22 Collaboration): Phys. Lett. B 177, 239 (1986)
17. A. Giovannini, L. Van Hove: Z. Phys. C 30, 391 (1986)
18. W. Kittel: in Workshop on Physics with Future Accelerators, ed. by J. Mulvey (CERN, Yellow Rep. 87-7, 1987), Vol. II, p. 424
19. T. Sjöstrand, M. Bengtsson: Computer Physics Commun. 82, 74 (1994)
20. B. Andersson: The Lund Model (Cambridge University Press, Cambridge, 1996)
21. G. Marchesini, B. R. Webber: Nucl. Phys. B 310, 461 (1988)
22. F. Becattini: Z. Phys. C 69, 485 (1996); F. Becattini, U. W. Heinz: Z. Phys. C 76, 269 (1997)
23. M. Derrick et al. (HRS Collaboration): Phys. Lett. B 168, 299 (1986)
24. M. Arneodo et al. (EMC Collaboration): Z. Phys. C 35, 335 (1987)
25. L. Van Hove, A. Giovannini: in XVII International Symposium on Multiparticle Dynamics, ed. by M. Markytan et al. (World Scientific, Singapore, 1987), p. 561
26. A. Breakstone et al.: Il Nuovo Cimento A 102, 1199 (1989)
27. A. Giovannini, R. Ugoccioni: Int. J. Mod. Phys. A 20, 3897 (2005)
28. A. Giovannini, S. Lupia, R. Ugoccioni: Nucl. Phys. B (Proc. Suppl.) 25, 115 (1992); Phys. Lett. B 374, 231 (1996)
29. R. Ugoccioni, A. Giovannini, S. Lupia: Phys. Lett. B 342, 387 (1995)
30. A. Giovannini, R. Ugoccioni: Phys. Rev. D 66, 034001 (2002)
31. W.D. Walker: Phys. Rev. D 69, 034007 (2004)

The $U(1)_A$ Anomaly and QCD Phenomenology

G. M. Shore

Department of Physics, University of Wales, Swansea, Swansea SA2 8PP, UK

Abstract. The role of the $U(1)_A$ anomaly in QCD phenomenology is reviewed,
focusing on the relation between quark dynamics and gluon topology. Topics covered
include a generalisation of the Witten–Veneziano formula for the mass of the $\eta'$,
the determination of pseudoscalar meson decay constants, radiative pseudoscalar
decays and the $U(1)_A$ Goldberger–Treiman relation. Sum rules are derived for the
proton and photon structure functions $g_1^p$ and $g_1^\gamma$ measured in polarised deep inelastic
scattering (DIS). The first moment sum rule for $g_1^p$ (the ‘proton spin’ problem) is
confronted with new data from COMPASS and HERMES on the deuteron structure
function and shown to be quantitatively explained in terms of topological charge
screening. Proposals for experiments on semi-inclusive DIS and polarised two-photon
physics at future ep and high-luminosity e+e− colliders are discussed.

1 Introduction
The $U(1)_A$ anomaly has played an important historical role in establishing
QCD as the theory of the strong interactions. The description of radiative
decays of the pseudoscalar mesons in the framework of a gauge theory
requires the existence of the electromagnetic axial anomaly and determines the
number of colours to be Nc = 3. The compatibility of the symmetries of QCD
with the absence of a ninth light pseudoscalar meson – the so-called $U(1)_A$
problem – in turn depends on the contribution of the colour gauge fields to the
anomaly. More recently, it has become clear how the anomaly-mediated link
between quark dynamics and gluon topology (the non-perturbative dynamics
of topologically non-trivial gluon configurations) is the key to understanding
a range of phenomena in polarised QCD phenomenology, most notably the
‘proton spin’ sum rule for the first moment of the structure function $g_1^p$.
In this paper, based on original research performed in a long-standing
collaboration with Gabriele Veneziano, we review the role of the $U(1)_A$ anomaly
in describing a wide variety of phenomena in QCD, ranging from the low-energy
dynamics of the pseudoscalar mesons to sum rules in polarised deep-inelastic
scattering. The aim is to show how these experiments reveal subtle
aspects of quantum field theory, in particular topological gluon dynamics,
which go beyond simple current algebra or parton model interpretations.

G. M. Shore: The $U(1)_A$ Anomaly and QCD Phenomenology, Lect. Notes Phys. 737, 235–288
(2008)
DOI 10.1007/978-3-540-74233-6_12
© Springer-Verlag Berlin Heidelberg 2008

We begin in Sect. 2 with a brief review of the essential theoretical toolkit:
anomalous chiral Ward identities, Zumino transforms, the renormalisation
group and the range of expansion schemes associated with large Nc, notably
the Okubo–Zweig–Iizuka (OZI) approximation. Then, in Sect. 3, we build on
Veneziano’s seminal 1979 paper [1] to describe how the pseudoscalar mesons
saturate the Ward identities in a way compatible with both the renormalisation
group and large-Nc constraints and derive a generalisation of the famous
Witten–Veneziano mass formula for the $\eta'$ which incorporates, but goes
beyond, the original large-Nc derivation [2, 3].
In Sect. 4, we turn to QCD phenomenology and describe how this intuition
on the resolution of the $U(1)_A$ problem allows a quantitative description of
low-energy pseudoscalar meson physics, especially radiative decays, the
determination of the pseudoscalar decay constants, and meson–nucleon couplings.
We review the $U(1)_A$ extension of the Goldberger–Treiman formula first
proposed by Veneziano [4] as the key to understanding the ‘proton spin’ problem
and test an important hypothesis on the origin of OZI violations and their
relation to the renormalisation group. Low-energy $\eta$ and $\eta'$ physics is currently
an active experimental field and we explain the importance of an accurate
determination of the couplings $g_{\eta NN}$ and $g_{\eta' NN}$ in elucidating the role of gluon
topology in QCD.
All of these low-energy phenomena have counterparts in high-energy, po-
larised deep inelastic scattering. This enables us to formulate a new sum rule
for the first moment of the polarised photon structure function $g_1^\gamma$ (Sect. 6).
The dependence of this sum rule on the invariant momentum of the off-shell
target photon measures the form factors of the three-current AVV Green
function and encodes a wealth of information about the realisation of chiral
symmetry in QCD, while its asymptotic limit reflects both the electromagnetic
and colour $U(1)_A$ anomalies. We show how this sum rule, which we first
proposed in 1992 [5, 6], may soon be tested if the forthcoming generation
of high-luminosity e+ e− colliders, currently conceived as B factories, are run
with polarised beams [7].
The most striking application of these ideas is, however, to the famous
‘proton spin’ problem, which originated with the observation of the violation of
the Ellis–Jaffe sum rule for the first moment of the polarised proton structure
function $g_1^p$ by the EMC collaboration at CERN in 1988. This experiment, and
its successors at SLAC, DESY (HERMES) and CERN (SMC, COMPASS)
determined the axial charge a0 of the proton. In the simple valence-quark
parton model, this can be identified with the quark spin and its observed
suppression led to an intense experimental and theoretical search over two
decades for the origin of the proton spin. In fact, as Veneziano was the first
to understand [4], a0 does not measure spin in QCD itself and its suppression
is related to OZI violations induced by the $U(1)_A$ anomaly.
In a series of papers, summarised in Sect. 5, we have shown how
a0 decouples from the real angular momentum sum rule for the proton (the
form factors for this sum rule are given by generalised parton distributions
(GPDs) which can be extracted from less inclusive measurements such as
deeply virtual Compton scattering) and is instead related to the gluon topo-
logical susceptibility [8, 9]. The experimentally observed suppression is a mani-
festation of topological charge screening in the QCD vacuum. In a 1994 paper
with Narison [10], using QCD spectral sum rule methods, we were able to
compute the slope of the topological susceptibility and give a quantitative
prediction for a0 . Our prediction, a0 = 0.33, has within the past few months
been spectacularly conﬁrmed by the latest data on the deuteron structure
function from the COMPASS and HERMES collaborations.
Hopefully, this impressive new evidence for topological charge screening
will provide fresh impetus to experimental ‘spin’ physics – first, to verify
the real angular momentum sum rule by measuring the relevant GPDs, and
second, to pursue the programme of target-fragmentation studies in semi-inclusive
DIS at polarised ep colliders which we have proposed as a further
test of our understanding of the $g_1^p$ sum rule [11].

This review has been prepared in celebration of the 65th birthday of
Gabriele Veneziano. I first met Gabriele when I came to Geneva as a CERN
fellow in 1981. In fact, our first interaction was across a tennis court, in a
regular Friday doubles match with Daniele Amati and Toine Van Proeyen. I
like to think that in those days I could show Gabriele a thing or two about
tennis – physics, of course, was a different matter. It has been my privilege
through these ensuing 25 years to collaborate with one of the most brilliant
and innovative physicists of our generation. But it has also been fun. As all his
collaborators will testify, his good humour, generosity to younger colleagues,
and enthusiasm in thinking out solutions to the deepest and most fundamen-
tal problems in particle physics and cosmology make working with Gabriele
not only intellectually rewarding but hugely enjoyable.
In his contribution to the ‘Okubofest’ in 1990 [12], Gabriele concluded
an account of the relevance of the OZI rule to $g_1^p$ by hoping that he had
‘made Professor Okubo happy’. In turn, I hope that this review will make
Gabriele happy: happy to recall how his original ideas on the $U(1)_A$ problem
have grown into a quantitative description of anomalous QCD phenomenology,
and happy at the prospect of new discoveries from a rich programme of
experimental physics at future polarised colliders. It is my pleasure to join all
the contributors to this volume in wishing him a happy birthday.

2 The U (1)A Anomaly and the Topological Susceptibility

We begin by reviewing some essential features of the $U(1)_A$ anomaly, chiral
Ward identities and the renormalisation group, placing particular emphasis on
the role of the gluon topological susceptibility. As we shall see, the anomaly
provides the vital link between quark dynamics and gluon topology which
is essential in understanding a range of phenomena in polarised QCD phenomenology.

2.1 Anomalous Chiral Ward Identities

An anomaly arises when a symmetry which is present in the classical limit can-
not be consistently imposed in a quantum field theory. The original example
of an anomaly, and one which continues to have far-reaching implications for
the phenomenology of QCD, is the famous Adler–Bell–Jackiw axial anomaly
[13, 14, 15], which was first understood in its present form in 1969. In fact,
calculations exhibiting what we now recognise as the anomaly had already
been performed much earlier by Steinberger in his analysis of meson decays
[16] and by Schwinger [17].
Anomalies manifest themselves in a number of ways. The original deriva-
tions of the axial anomaly involved the impossibility of simultaneously impos-
ing conservation of both vector and axial currents due to regularisation issues
in the AVV triangle diagram in QED. More generally, they arise as anoma-
lous contributions to the commutation relations in current algebra. A modern
viewpoint, due to Fujikawa [18], sees anomalies as due to the non-invariance
of the fermionic measure in the path integral under transformations corresponding
to a symmetry of the classical Lagrangian. In this approach, the
result of a chiral transformation $q \to e^{i\alpha^a T^a \gamma_5} q$ on the quark fields in the
QCD generating functional $W[V^a_{\mu5}, V^a_\mu, \theta, S^a_5, S^a]$ defined as¹

$$ e^{iW} = \int DA\,D\bar q\,Dq\; \exp\Big[\, i\int dx\,\Big( L_{QCD} + V^{\mu a}_5 J^a_{\mu5} + V^{\mu a} J^a_\mu + \theta Q + S^a_5\phi^a_5 + S^a\phi^a \Big)\Big] \tag{1} $$

is

$$ \int DA\,D\bar q\,Dq\,\Big[\,\partial^\mu J^a_{\mu5} - \sqrt{2n_f}\,\delta_{a0}\,Q - d_{abc}\,m^b\phi^c_5 - \delta\!\int d^4x\,L_{QCD}\Big]\exp[\ldots] = 0 \tag{2} $$

¹ Our notation follows that of [3]. The currents and pseudoscalar fields $J^a_{\mu5}$, $Q$, $\phi^a_5$
together with the scalar $\phi^a$ are defined by

$$ J^a_{\mu5} = \bar q\gamma_\mu\gamma_5 T^a q, \qquad J^a_\mu = \bar q\gamma_\mu T^a q, \qquad Q = \frac{\alpha_s}{8\pi}\,\mathrm{tr}\,G_{\mu\nu}\tilde G^{\mu\nu}, \qquad \phi^a_5 = \bar q\gamma_5 T^a q, \qquad \phi^a = \bar q T^a q $$

where $G_{\mu\nu}$ is the field strength for the gluon field. Here, $T^i = \frac{1}{2}\lambda^i$ are flavour
$SU(n_f)$ generators, and we include the singlet $U(1)_A$ generator $T^0 = 1/\sqrt{2n_f}$
and let the index $a = 0, i$. With this normalisation, $\mathrm{tr}\,T^a T^b = \frac{1}{2}\delta^{ab}$ for all the
generators $T^a$. This accounts for the rather unconventional factor $\sqrt{2n_f}$ in the
anomaly equation but has the advantage of giving a consistent normalisation to
the full set of decay constants including the flavour singlets $f^{0\eta'}$ and $f^{0\eta}$.
We will only need to consider fields where $i$ corresponds to a generator in the
Cartan sub-algebra, so that $a = 3, 8, 0$ for $n_f = 3$ quark flavours. We define d-symbols
by $\{T^a, T^b\} = d_{abc}T^c$. For $n_f = 3$, the explicit values are
$d_{000} = d_{033} = d_{088} = d_{330} = d_{880} = \sqrt{2/3}$, $d_{338} = d_{383} = -d_{888} = \sqrt{1/3}$.
The terms in the square bracket are simply those arising from Noether’s theorem,
including soft breaking by the quark masses, with the addition of the
anomaly involving the gluon topological charge density $Q$. Re-expressing the
chiral variation of the elementary fields in terms of a variation with respect to
the sources $V^a_{\mu5}, V^a_\mu, \theta, S^a_5, S^a$ then gives the functional form of the anomalous
chiral Ward identities:

$$ \partial_\mu W_{V^a_{\mu5}} - \sqrt{2n_f}\,\delta_{a0}\,W_\theta - d_{abc}\,m^b\,W_{S^c_5} + f_{abc}\,V^b_\mu\,W_{V^c_{\mu5}} + f_{abc}\,V^b_{\mu5}\,W_{V^c_\mu} + d_{abc}\,S^b\,W_{S^c_5} - d_{abc}\,S^b_5\,W_{S^c} = 0 \tag{3} $$

where we have abbreviated functional derivatives as suffixes. This is the key to
all the results derived in this section. It makes precise the familiar statement
of the anomaly as

$$ \partial^\mu J^a_{\mu5} - \sqrt{2n_f}\,Q\,\delta_{a0} - d_{abc}\,m^b\phi^c_5 \sim 0 \tag{4} $$

The chiral Ward identities for two- and higher-point Green functions are
found by taking functional derivatives of (3) with respect to the sources. The
complete set of identities for two-point functions is given in our review [19].
As an example, we find²

$$ \partial_\mu W_{V^a_{\mu5} S^b_5} - \sqrt{2n_f}\,\delta_{a0}\,W_{\theta S^b_5} - M_{ac}\,W_{S^c_5 S^b_5} - \Phi_{ab} = 0 \tag{5} $$

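which in more familiar notation reads

$$ \partial^\mu\, i\langle 0|T\,J^a_{\mu5}\,\phi^b_5|0\rangle - \sqrt{2n_f}\,\delta_{a0}\; i\langle 0|T\,Q\,\phi^b_5|0\rangle - M_{ac}\; i\langle 0|T\,\phi^c_5\,\phi^b_5|0\rangle - \Phi_{ab} = 0 \tag{6} $$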

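² We use the following SU(3) notation for the quark masses and condensates:

$$ \begin{pmatrix} m_u & 0 & 0 \\ 0 & m_d & 0 \\ 0 & 0 & m_s \end{pmatrix} = \sum_{a=0,3,8} m^a\,T^a $$

and

$$ \begin{pmatrix} \langle\bar uu\rangle & 0 & 0 \\ 0 & \langle\bar dd\rangle & 0 \\ 0 & 0 & \langle\bar ss\rangle \end{pmatrix} = 2\sum_{a=0,3,8} \langle\phi^a\rangle\,T^a $$

where $\langle\phi^a\rangle$ is the VEV of $\phi^a = \bar q\,T^a q$. It is also convenient to use the compact
notation

$$ M_{ab} = d_{acb}\,m^c, \qquad \Phi_{ab} = d_{abc}\,\langle\phi^c\rangle $$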
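The compact notation $M_{ab}$, $\Phi_{ab}$ rests on the d-symbols of the Cartan generators. As a minimal numerical sketch (not part of the original text), the d-symbols for $n_f = 3$ can be evaluated directly from $\{T^a, T^b\} = d_{abc}T^c$ with $\mathrm{tr}\,T^aT^b = \frac{1}{2}\delta^{ab}$. Since $T^0$, $T^3$ and $T^8$ are diagonal, each generator is stored as a list of its diagonal entries. The names `NF`, `T`, `trace_norm` and `d` are illustrative assumptions.

```python
import math

NF = 3  # number of quark flavours

# Diagonal entries of the Cartan generators, normalised so that tr(T^a T^b) = delta^ab / 2.
T = {
    0: [1 / math.sqrt(2 * NF)] * 3,                                          # singlet: 1/sqrt(2 n_f)
    3: [0.5, -0.5, 0.0],                                                     # lambda_3 / 2
    8: [1 / (2 * math.sqrt(3)), 1 / (2 * math.sqrt(3)), -1 / math.sqrt(3)],  # lambda_8 / 2
}


def trace_norm(a, b):
    """tr(T^a T^b) for diagonal generators."""
    return sum(x * y for x, y in zip(T[a], T[b]))


def d(a, b, c):
    """d_abc = 2 tr({T^a, T^b} T^c); for diagonal matrices {T^a, T^b} = 2 T^a T^b."""
    return 4 * sum(x * y * z for x, y, z in zip(T[a], T[b], T[c]))


if __name__ == "__main__":
    for a, b, c in [(0, 0, 0), (0, 3, 3), (0, 8, 8), (3, 3, 8), (8, 8, 8), (0, 3, 8)]:
        print(f"d_{a}{b}{c} = {d(a, b, c):+.6f}")
    print("sqrt(2/3) =", math.sqrt(2 / 3), " sqrt(1/3) =", math.sqrt(1 / 3))
```

With this normalisation the non-vanishing symbols come out as $\sqrt{2/3}$ whenever an index is 0, and $\pm\sqrt{1/3}$ for $d_{338}$ and $d_{888}$.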

The anomaly breaks the original $U(n_f)_L \times U(n_f)_R$ chiral symmetry to
$SU(n_f)_L \times SU(n_f)_R \times U(1)_V / Z^V_{n_f}$ and the quark condensate spontaneously
breaks this further to the coset $SU(n_f)_L \times SU(n_f)_R / SU(n_f)_V$. Goldstone’s
theorem follows immediately. In the chiral limit, there are $(n_f^2 - 1)$ massless
Nambu–Goldstone bosons, which acquire masses of order $\sqrt{m}$ for non-zero
quark mass. There is no flavour singlet Nambu–Goldstone boson since the
corresponding current is anomalous.
The zero-momentum Ward identities are especially important here, since
they control the low-energy dynamics. With the assumption that there are no
exactly massless particles coupling to the currents, we find

$$ \begin{aligned} \sqrt{2n_f}\,\delta_{a0}\,W_{\theta\theta} + M_{ac}\,W_{S^c_5\theta} &= 0 \\ \sqrt{2n_f}\,\delta_{a0}\,W_{\theta S^b_5} + M_{ac}\,W_{S^c_5 S^b_5} + \Phi_{ab} &= 0 \end{aligned} \tag{7} $$

Another key element of our analysis will be the chiral Ward identities for
the effective action $\Gamma[V^a_{\mu5}, V^a_\mu, Q, \phi^a_5, \phi^a]$, defined as the generating functional
for vertices which are 1PI with respect to the set of fields $Q$, $\phi^a_5$ and $\phi^a$ but not
the currents $J^a_{\mu5}$, $J^a_\mu$. This is achieved using the partial Legendre transform
(or Zumino transform):

$$ \Gamma[V^a_{\mu5}, V^a_\mu, Q, \phi^a_5, \phi^a] = W[V^a_{\mu5}, V^a_\mu, \theta, S^a_5, S^a] - \int dx\,\Big(\theta Q + S^a_5\phi^a_5 + S^a\phi^a\Big) \tag{8} $$

The chiral Ward identities for $\Gamma$ are

$$ \partial_\mu \Gamma_{V^a_{\mu5}} - \sqrt{2n_f}\,\delta_{a0}\,Q - d_{abc}\,m^b\phi^c_5 + f_{abc}\,V^b_\mu\,\Gamma_{V^c_{\mu5}} + f_{abc}\,V^b_{\mu5}\,\Gamma_{V^c_\mu} - d_{abc}\,\phi^c_5\,\Gamma_{\phi^b} + d_{abc}\,\phi^c\,\Gamma_{\phi^b_5} = 0 \tag{9} $$

Again, the zero-momentum identities for the two-point vertices play an important
role:

$$ \begin{aligned} \Phi_{ac}\,\Gamma_{\phi^c_5 Q} - \sqrt{2n_f}\,\delta_{a0} &= 0 \\ \Phi_{ac}\,\Gamma_{\phi^c_5\phi^b_5} - M_{ab} &= 0 \end{aligned} \tag{10} $$

These will be used in Sect. 3 to construct an effective action which captures
the low-energy dynamics of QCD in the pseudoscalar sector.

2.2 Topological Susceptibility

The connection with topology arises through the identification of the gluon
operator $Q$ in the anomaly with a topological charge density. $Q$ is a total
divergence:

$$ Q = \frac{\alpha_s}{8\pi}\,\mathrm{tr}\,G_{\mu\nu}\tilde G^{\mu\nu} = \partial^\mu K_\mu \tag{11} $$

where $K_\mu$ is the Chern–Simons current,

$$ K_\mu = \frac{\alpha_s}{4\pi}\,\epsilon_{\mu\nu\rho\sigma}\,\mathrm{tr}\Big( A^\nu G^{\rho\sigma} - \frac{1}{3}\,g\,A^\nu [A^\rho, A^\sigma] \Big) \tag{12} $$

Nevertheless, the integral over (Euclidean) spacetime of $Q$ need not vanish.
In fact, for gauge field configurations such as instantons which become pure
gauge at infinity,

$$ \int d^4x\; Q = n \in \mathbb{Z} \tag{13} $$

where the integer $n$ is the topological winding number, an element of the
homotopy group $\pi_3(SU(N_c))$.
The form of the anomaly is then understood as follows. Under a chiral
transformation, the fermion measure in the path integral transforms as (for
one flavour)

$$ D\bar q\,Dq \;\to\; \exp\Big[-2i\alpha\int dx\,\varphi_i^\dagger\gamma_5\varphi_i\Big]\,D\bar q\,Dq = \exp\big[-2i\alpha\,(n_+ - n_-)\big]\,D\bar q\,Dq \tag{14} $$

where $\varphi_i$ is a basis of eigenfunctions of the Dirac operator $\not\!\!D$ in the background
gauge field. The non-zero eigenvalues are chirality paired, so the Jacobian only
depends on the difference $(n_+ - n_-)$ of the positive and negative chirality zero
modes of $\not\!\!D$. Finally, the index theorem relates the anomaly to the topological
charge density:

$$ \mathrm{ind}\,\not\!\!D = n_+ - n_- = \int d^4x\; Q \tag{15} $$

The topological susceptibility $\chi(p^2)$ is defined as the two-point Green function
of $Q$, viz.

$$ \chi(p^2) = i\int dx\; e^{ipx}\,\langle 0|T\,Q(x)\,Q(0)|0\rangle \tag{16} $$

We are primarily concerned with the zero-momentum limit $\chi(0) = W_{\theta\theta}(0)$.
Combining the two identities in (7) gives the crucial Ward identity satisfied by $\chi(0)$:

$$ 2n_f\,\chi(0) = M_{0a}\,W_{S^a_5 S^b_5}\,M_{b0} + (M\Phi)_{00} \tag{17} $$

that is,

$$ n_f^2\int dx\; i\langle 0|T\,Q(x)\,Q(0)|0\rangle = \int dx\; m^a m^b\, i\langle 0|T\,\phi^a_5(x)\,\phi^b_5(0)|0\rangle + m^a\langle\phi^a\rangle \tag{18} $$

Determining exactly how this is satisfied in QCD is at the heart of the Witten–Veneziano
approach to the $U(1)_A$ problem [1, 20].
The zero-momentum Ward identities allow us to write a precise form for
the topological susceptibility in QCD in terms of just one unknown dynamical
constant [21]. To derive this, recall that the matrix of two-point vertices is simply
the inverse of the two-point Green function matrix, so in the pseudoscalar
sector we have the following inversion formula:

$$ \Gamma_{QQ} = -\Big( W_{\theta\theta} - W_{\theta S^a_5}\,(W_{S_5 S_5})^{-1}_{ab}\,W_{S^b_5\theta} \Big)^{-1} \tag{19} $$

Using the identities (7) and (17), this implies that at zero momentum

$$ \Gamma^{-1}_{QQ} = -\chi\,\Big( 1 - 2n_f\,\chi\,(M\Phi)^{-1}_{00} \Big)^{-1} \tag{20} $$

and inverting this relation gives

$$ \chi = -\Gamma^{-1}_{QQ}\,\Big( 1 - 2n_f\,\Gamma^{-1}_{QQ}\,(M\Phi)^{-1}_{00} \Big)^{-1} \tag{21} $$

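Finally, substituting for $(M\Phi)^{-1}_{00}$ using the definitions above, for which
$2n_f\,(M\Phi)^{-1}_{00} = \sum_q 1/(m_q\langle\bar qq\rangle)$, we find the following
important identity which determines the quark mass dependence of
the topological susceptibility in QCD:

$$ \chi(0) = -A\,\Big( 1 - A\sum_q \frac{1}{m_q\langle\bar qq\rangle} \Big)^{-1} \tag{22} $$

where we identify the non-perturbative coefficient $A$ as $\Gamma^{-1}_{QQ}$.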
Notice immediately how this expression exposes the well-known result that
$\chi(0)$ vanishes if any quark mass is set to zero. In Sect. 3, we will see how it
also clarifies the role of the $1/N_c$ expansion in the $U(1)_A$ problem.

2.3 Renormalisation Group

The conserved current corresponding to a non-anomalous symmetry is not
renormalised and has vanishing anomalous dimension. However, an anomalous
current such as the flavour singlet axial current $J^0_{\mu5}$ is renormalised. The
composite operator renormalisation and mixing in the $J^0_{\mu5}$, $Q$ sector is as follows [22]:

$$ J^0_{\mu5R} = Z\,J^0_{\mu5B}, \qquad Q_R = Q_B - \frac{1}{\sqrt{2n_f}}\,(1 - Z)\,\partial^\mu J^0_{\mu5B} \tag{23} $$

Notice the form of the mixing of the operator $Q$ with $\partial^\mu J^0_{\mu5}$ under renormalisation.
This ensures that the combination $\partial^\mu J^0_{\mu5} - \sqrt{2n_f}\,Q$ occurring in the
$U(1)_A$ anomaly equation is RG invariant. The chiral Ward identities therefore
take precisely the same form expressed in terms of the bare or renormalised
operators, making precise the notion of ‘non-renormalisation of the anomaly’.
We may therefore interpret the above Ward identities, which were derived
in terms of the bare operators, as identities for the renormalised composite
operators (and omit the suffix R for notational simplicity).
The renormalisation group equation (RGE) for the generating functional
$W[V^a_{\mu5}, V^a_\mu, \theta, S^a_5, S^a]$ follows immediately from the definitions (23) of the
renormalised composite operators. Including also a standard multiplicative
renormalisation $Z_\phi = Z_m^{-1}$ for the pseudoscalar and scalar operators $\phi^a_5$ and
$\phi^a$ and denoting the anomalous dimensions corresponding to $Z$ and $Z_\phi$ by $\gamma$
and $\gamma_\phi$, respectively, we find³

$$ \mathcal{D}W = \gamma\,\Big( V^0_{\mu5} - \frac{1}{\sqrt{2n_f}}\,\partial_\mu\theta \Big)\,W_{V^0_{\mu5}} + \gamma_\phi\,\Big( S^a_5\,W_{S^a_5} + S^a\,W_{S^a} \Big) + \ldots \tag{24} $$

where $\mathcal{D} = \Big( \mu\frac{\partial}{\partial\mu} + \beta\frac{\partial}{\partial g} - \gamma_m\sum_q m_q\frac{\partial}{\partial m_q} \Big)\Big|_{V,\theta,S_5,S}$.
The RGEs for Green functions are found by functional differentiation of
(24) and can be simplified using the Ward identities. For example, for $W_{\theta\theta}$ we
find

$$ \mathcal{D}W_{\theta\theta} = 2\gamma\,W_{\theta\theta} + 2\gamma\,\frac{1}{\sqrt{2n_f}}\,M_{0b}\,W_{\theta S^b_5} + \ldots \tag{25} $$

At zero momentum, we can then use the first identity in (7) to prove that the
topological susceptibility $\chi(0)$ is RG invariant,

$$ \mathcal{D}\chi(0) = 0 \tag{26} $$

which is consistent with its explicit expression (22).

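A similar RGE holds for the effective action $\Gamma[V^a_{\mu5}, V^a_\mu, Q, \phi^a_5, \phi^a]$, which
allows the scaling behaviour of the proper vertices involving $Q$ and $\phi^a_5$ to be
determined [9, 23, 24]. This reads

$$ \mathcal{D}\Gamma = \gamma\,\Big( V^0_{\mu5} - \frac{1}{\sqrt{2n_f}}\,\Gamma_Q\,\partial_\mu \Big)\,\Gamma_{V^0_{\mu5}} - \gamma_\phi\,\Big( \phi^a_5\,\Gamma_{\phi^a_5} + \phi^a\,\Gamma_{\phi^a} \Big) + \ldots \tag{27} $$

An immediate consequence is that $\mathcal{D}\Gamma_{QQ} = 0$ at zero momentum, which
ensures the compatibility of (22) with the RG invariance of $\chi(0)$.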

2.4 1/Nc , the Topological Expansion and OZI

The final theoretical input into our analysis of the $U(1)_A$ problem and
phenomenological implications of the anomaly concerns the range of dynamical
approximation schemes associated with the large-$N_c$ limit. At various points
we will refer to the original large-$N_c$ expansion of ’t Hooft [25], the topological
expansion introduced by Veneziano [26] and the OZI limit [27, 28, 29].
A very clear summary of the distinction between them is given in Veneziano’s
‘Okubofest’ review [12], which we follow here.
In terms of Feynman diagrams, the leading order in the large $N_c$, fixed
$n_f$ (’t Hooft) limit is the most restrictive of these approximations, including
only planar diagrams with sources on a single quark line and no further quark
loops (Fig. 1).
³ The notation $+ \ldots$ refers to additional terms which are required to produce the
contact term contributions to the RGEs for n-point Green functions and vertices
of composite operators. These are discussed fully in [9, 23, 24], but will be omitted
here for simplicity. They vanish at zero momentum.

Fig. 1. A typical Feynman diagram allowed in the large-Nc limit. The dots on the
quark loop represent external sources

A better approximation to QCD is the quenched approximation familiar
in lattice gauge theory. This is a small $n_f$ expansion at fixed $N_c$, i.e. excluding
quark loops but allowing non-planar diagrams (Fig. 2).
An alternative is the topological expansion, which allows any number of
internal quark loops, but restricts to planar diagrams at leading order. Pro-
vided the sources remain attached to the same quark line, this corresponds
to taking large Nc at fixed nf /Nc . This means that quarks and gluons are
treated democratically and the order of approximation is determined solely
by the topology of the diagrams (Fig. 2).
Finally, the OZI approximation is a still closer match to full QCD with
dynamical quarks than either the leading order quenched or topological ex-
pansions. Non-planar diagrams and quark loops are retained, but diagrams
in which the external sources are connected to different quark loops are still
excluded (Fig. 3). This means that amplitudes which involve purely gluonic
intermediate states are suppressed. This is the field-theoretic basis for the
original empirical OZI rule.
In each of these large-$N_c$ expansions, except the topological expansion
where $n_f/N_c$ is fixed, the $U(1)_A$ anomaly does not contribute at leading
order. More precisely, the anomalous contribution $\langle 0|T\,Q\,\phi^b_5|0\rangle$ in the chiral
Ward identity (6) is suppressed by $O(1/N_c)$ relative to the current term
$\langle 0|T\,J^0_{\mu5}\,\phi^b_5|0\rangle$. This means that the flavour singlet current is conserved, Goldstone’s
theorem applies, and conventional PCAC methods can be used to
understand the dynamics of the Green functions with a full set of $(n_f^2 - 1)$
massless bosons in the chiral limit. Taking this as a starting point, we can then
learn about the spectral decomposition of the actual QCD Green functions
as we relax from the leading-order limits. In particular, this leads us to the
famous Witten–Veneziano mass formula for the $\eta'$ meson [1, 20].

Fig. 2. Feynman diagrams allowed in the quenched approximation (left) or leading
order in the topological expansion (right)

Fig. 3. Feynman diagrams allowed (left) and forbidden (right) by the OZI rule
The behaviour of the topological susceptibility at large $N_c$ is central to
this analysis. It is clear from looking at planar diagrams that at leading order
in $1/N_c$, $\chi(0)$ in QCD coincides with the topological susceptibility $\chi(0)|_{YM}$
in the corresponding pure Yang–Mills theory. Referring now to the explicit
expression (22) for $\chi(0)$, large-$N_c$ counting rules give $A = O(1)$ while
$\langle\bar qq\rangle = O(N_c)$. It follows that for non-zero quark masses,

$$ \chi(0) = -A + O(n_f/N_c) \tag{28} $$

where $A = \Gamma^{-1}_{QQ}$ is identified as $-\chi(0)|_{YM} + O(1/N_c)$. On the other hand, if
we consider the limit of $\chi(0)$ for $m_q \to 0$ at finite $N_c$, then we have

$$ \chi(0)|_{m_q \to 0} = 0 \tag{29} $$
The ’t Hooft large-Nc limit is therefore not smooth in QCD; the Nc → ∞ and
mq → 0 limits do not commute [1, 20, 21]. This is remedied in the topological
expansion, where quark loops are retained and the O(nf /Nc ) contribution in
(28) allows the smooth chiral limit χ(0) → 0 even for large Nc .
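The non-commuting limits can be made concrete with a small numerical sketch of the formula (22) for $\chi(0)$. It assumes, purely for illustration, $n_f$ degenerate flavours of mass `m` and a condensate scaling as $\langle\bar qq\rangle = N_c\,c_0$ with a hypothetical $O(1)$ constant `c0`. The function name and all numerical values are assumptions in arbitrary units.

```python
def chi0_large_nc(A, n_c, m, n_f=3, c0=-1.0):
    """chi(0) from eq. (22) for n_f degenerate flavours with <qbar q> = n_c * c0.

    A, m and c0 are hypothetical inputs in arbitrary units; m must be non-zero.
    """
    s = n_f / (m * n_c * c0)          # sum_q 1 / (m_q <qbar q>)
    return -A / (1.0 - A * s)


if __name__ == "__main__":
    A = 1.0
    # N_c -> infinity at fixed quark mass: chi(0) tends to -A, as in (28).
    for n_c in (3, 30, 300, 3000):
        print("N_c =", n_c, " m = 0.01 :", chi0_large_nc(A, n_c, 0.01))
    # m -> 0 at fixed N_c: chi(0) tends to 0, as in (29).
    for m in (1e-2, 1e-4, 1e-6, 1e-8):
        print("N_c = 3  m =", m, ":", chi0_large_nc(A, 3, m))
```

In this toy scaling the correction to $-A$ is controlled by $A\,n_f/(m N_c |c_0|)$, so it is small only when $N_c$ is taken large before the quark mass is taken small.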

3 ‘U (1)A Without Instantons’

The U (1)A problem has a long history, pre-dating QCD itself, and has been
an important stimulus to new theoretical ideas involving anomalies and gluon
topology.
At its simplest, the original ‘$U(1)_A$ problem’ in current algebra is relatively
straightforwardly resolved by the existence of the anomalous contributions to
the chiral Ward identities (anomalous commutators in current algebra) and
the consequent absence of a ninth light Nambu–Goldstone boson in $n_f = 3$
QCD.⁴ However, a full resolution requires a much more detailed understanding
of the dynamics of the pseudoscalar sector and the role of topological
fluctuations in the anomalous Green functions.
In this section, we review the analysis of the $U(1)_A$ problem presented by
Veneziano in his seminal 1979 paper, ‘$U(1)_A$ without instantons’ [1].⁵ As well
as deriving the eponymous mass formula relating the $\eta'$ mass to the topological
susceptibility, the essential problem resolved in [1] is how to describe the
dynamics of the Green functions of the pseudoscalar operators in QCD in
terms of a spectral decomposition compatible with the $n_f$, $N_c$, $\theta$ and quark
mass dependence imposed by the anomalous Ward identities.
First, recall that in the absence of the anomaly, there will be light pseudoscalar
mesons $\eta^\alpha$ coupling derivatively to the currents with decay constants
$f^{a\alpha}$, i.e. $\langle 0|J^a_{\mu5}|\eta^\alpha\rangle = i p_\mu f^{a\alpha}$. (We use the notation $\eta^\alpha$ to denote the physical
mesons $\pi^0$, $\eta$ and $\eta'$, while the SU(3) index $a = 3, 8, 0$.) The mass matrix $\mu^2_{\alpha\beta}$
satisfies the Dashen, Gell-Mann–Oakes–Renner (DGMOR) relation [35, 36]

$$ f^{a\alpha}\,\mu^2_{\alpha\beta}\,f^{T\beta b} = -(M\Phi)_{ab} \qquad \text{(no anomaly)} \tag{30} $$

The consequences of the anomaly are determined by the interaction of the
pseudoscalar fields $\phi^a_5$ with the topological charge density $Q$ and the subsequent
mixing. This gives rise to an additional contribution to the masses.
Moreover, we can no longer identify the flavour singlet decay constant by the
coupling to $J^0_{\mu5}$ since this is not RG invariant. Instead, the physical decay constants
$f^{a\alpha}$ are defined in terms of the couplings of the $\eta^\alpha$ to the pseudoscalar
fields through the relation $f^{a\alpha}\,\langle 0|\phi^b_5|\eta^\alpha\rangle = d_{abc}\,\langle\phi^c\rangle$. This coincides with the
usual definition except in the flavour singlet case.
The most transparent way to describe how all this works is to use an
effective action Γ [Q, φa5 ] constructed to satisfy the anomalous chiral Ward
identities. It is important to emphasise from the outset that this is an effective
action in the sense of Sect. 2.1, i.e. the generating functional for vertices which
are 1PI with respect to the set of fields Q, φa5 only. The choice of fields is
designed to capture the degrees of freedom essential for the dynamics.⁶ A
different choice (or linear combination) redefines the physical meaning of the
vertices, so it is important that the final choice of fields in $\Gamma$ results in vertices
which are most directly related to physical couplings.

⁴ The existence of a light flavour-singlet Nambu–Goldstone boson would produce
a rapid off-shell variation in the $\eta \to 3\pi$ decay amplitude, in contradiction with
the experimental data [30].
⁵ For reviews of the instanton approach to the resolution of the $U(1)_A$ problem,
see, e.g. [31, 32, 33, 34].
⁶ Note especially the frequently misunderstood point that the choice of fields in $\Gamma$ is
not required to be in any sense a complete set, nor does the restriction to a given
set of fields constitute an approximation. Before imposing dynamical simplifications,
the identities derived from $\Gamma$ are exact – increasing the set of basis fields
simply changes the definitions of the 1PI vertices. The effective action considered
here is therefore different from the non-linear chiral Lagrangians incorporating
the large-$N_c$ approach to the pseudoscalar mesons constructed by a number of
groups. See, for example, [21, 37, 38, 39, 40, 41, 42].

The simplest effective action consistent with the anomalous Ward identities
and the renormalisation group is

$$ \Gamma[Q, \phi^a_5] = \int dx\,\Big( \frac{1}{2A}\,Q^2 + Q\,\big(\sqrt{2n_f}\,\delta_{0a} - B_a\,\partial^2\big)\,\Phi^{-1}_{ab}\,\phi^b_5 + \frac{1}{2}\,\phi^a_5\,\Phi^{-1}_{ac}\,\big((M\Phi)_{cd} - C_{cd}\,\partial^2\big)\,\Phi^{-1}_{db}\,\phi^b_5 \Big) \tag{31} $$

The constants $C_{ab}$ and $B_a$ are related to $\Gamma_{V^a_{\mu5}V^b_{\nu5}}$ and $\Gamma_{V^a_{\mu5}Q}$, respectively. The
inclusion of the term with $B_a$ is unusual, but is required for consistency with
the RGEs derived from (27) beyond zero momentum.
This form of $\Gamma[Q, \phi^a_5]$ encodes three key dynamical assumptions:
• Pole dominance. We assume that the Green functions are dominated by the
contribution of single-particle poles associated with the pseudoscalar mesons
including the flavour singlet. This extends the usual PCAC assumption to the
singlet sector.
• Smoothness. We assume that pole-free dynamical quantities such as the
decay constants and couplings (1PI vertices) are only weakly momentum-dependent
in the range from $p = 0$ to their on-shell values. This allows us to
impose relations derived from the zero-momentum Ward identities, provided
this is compatible with the renormalisation group.
• Topology. There must exist topologically non-trivial fluctuations which
can give a non-vanishing value to $\chi(0)|_{YM}$ in pure gluodynamics. This is required
to give the non-vanishing coefficient in the all-important $\frac{1}{2A}Q^2$ term
in $\Gamma[Q, \phi^a_5]$. Notice that we do not require a kinetic term for $Q$, which would
be associated with a (presumed heavy) pseudoscalar glueball.
The second derivatives of $\Gamma[Q, \phi^a_5]$ are

$$ \begin{pmatrix} \Gamma_{QQ} & \Gamma_{Q\phi^b_5} \\ \Gamma_{\phi^a_5 Q} & \Gamma_{\phi^a_5\phi^b_5} \end{pmatrix} = \begin{pmatrix} A^{-1} & \big(\sqrt{2n_f}\,\delta_{0d} + B_d\,p^2\big)\,\Phi^{-1}_{db} \\ \Phi^{-1}_{ac}\,\big(\sqrt{2n_f}\,\delta_{c0} + B_c\,p^2\big) & \Phi^{-1}_{ac}\,\big((M\Phi)_{cd} + C_{cd}\,p^2\big)\,\Phi^{-1}_{db} \end{pmatrix} \tag{32} $$

The corresponding Green functions (composite operator propagators) are
given by inversion:

$$ \begin{pmatrix} W_{\theta\theta} & W_{\theta S^b_5} \\ W_{S^a_5\theta} & W_{S^a_5 S^b_5} \end{pmatrix} = -\begin{pmatrix} \Gamma_{QQ} & \Gamma_{Q\phi^b_5} \\ \Gamma_{\phi^a_5 Q} & \Gamma_{\phi^a_5\phi^b_5} \end{pmatrix}^{-1} \tag{33} $$

and we find, to leading order in $p^2$,

$$ \begin{aligned} W_{\theta\theta} &= -A\,\tilde\Delta^{-1} \\ W_{\theta S^b_5} = W_{S^b_5\theta} &= \sqrt{2n_f}\,A\,\Delta^{-1}_{0d}\,\Phi_{db} \\ W_{S^a_5 S^b_5} &= -\Phi_{ac}\,\Delta^{-1}_{cd}\,\Phi_{db} \end{aligned} \tag{34} $$
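where

$$ \tilde\Delta = 1 - \Big[ 2n_f\,A\,\delta_{a0}\delta_{0b} + \sqrt{2n_f}\,A\,(\delta_{a0}B_b + B_a\delta_{0b})\,p^2 \Big]\,\big( M\Phi + C p^2 \big)^{-1}_{ab} \tag{35} $$

and

$$ \Delta_{ab} = \Big[ C_{ab} - \sqrt{2n_f}\,A\,(\delta_{a0}B_b + B_a\delta_{0b}) \Big]\,p^2 + (M\Phi)_{ab} - 2n_f\,A\,\delta_{a0}\delta_{0b} \tag{36} $$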

In this form, however, the propagator matrix is not diagonal and the op-
erators are not normalised so as to couple with unit decay constants to the
physical states. It is therefore convenient to make a change of variables in Γ
so that it is written in terms of operators which are more closely identified
with the physical states. We do this in two stages, since the intermediate
stage allows us to make direct contact with the discussion in [1] and will play
an important role in some of the phenomenological applications considered
later.
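First, we define rescaled fields $\hat\eta^\alpha$ whose kinetic terms, before mixing with
$Q$, are canonically normalised. That is, we set

$$ \hat\eta^\alpha = \hat f^{T\alpha a}\,\Phi^{-1}_{ab}\,\phi^b_5 \tag{37} $$

with the ‘decay constants’ $\hat f^{a\alpha}$ defined such that $\frac{d}{dp^2}\Gamma_{\hat\eta^\alpha\hat\eta^\beta}\big|_{p=0} = \delta_{\alpha\beta}$. This
implies

$$ (\hat f\hat f^T)_{ab} = C_{ab} = \frac{d}{dp^2}\,W_{S^a_D S^b_D}\Big|_{p=0} \tag{38} $$

where $D^a = \sqrt{2n_f}\,\delta_{a0}\,Q + M_{ab}\,\phi^b_5$ is the divergence of the current $J^a_{\mu5}$. In the
chiral limit, this reduces in the flavour singlet sector to

$$ (\hat f\hat f^T)_{00} = \frac{d}{dp^2}\,\chi(p^2)\Big|_{p=0} = \chi'(0) \tag{39} $$

a result which plays a vital role in understanding the ‘proton spin’ problem.
Notice however that the $\hat f^{a\alpha}$ are not RG invariant: in fact, $\mathcal{D}\hat f^{a\alpha} = \gamma\,\delta_{a0}\,\hat f^{a\alpha}$.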
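The effective action $\Gamma[Q, \hat\eta^\alpha]$ is

$$ \Gamma[Q, \hat\eta^\alpha] = \int dx\,\Big( \frac{1}{2A}\,Q^2 + Q\,\big(\sqrt{2n_f}\,\delta_{0a} - B_a\,\partial^2\big)\,(\hat f^{-1})^{a\alpha}\,\hat\eta^\alpha + \frac{1}{2}\,\hat\eta^\alpha\,\big( -\partial^2 + \hat f^{-1T} M\Phi\,\hat f^{-1} \big)_{\alpha\beta}\,\hat\eta^\beta \Big) \tag{40} $$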
2
In this form, the $\hat\eta^\alpha$ are the canonically normalised fields corresponding to
the ‘would-be Nambu–Goldstone bosons’ in the absence of the anomaly, before
they acquire an additional anomaly-induced mass. In the framework
of the large-$N_c$ or OZI approximations, they would correspond to true
Nambu–Goldstone bosons. The singlet $\hat\eta^0$ is what we have therefore referred
to in our previous papers as the ‘OZI boson’ $\eta_{OZI}$. As we see later, the naive
current algebra relations hold when expressed in terms of the $\hat\eta^\alpha$ and $\hat f^{a\alpha}$,
though these do not correspond to physical states or decay constants.
The physical particle masses are identified with the poles in the two-point
Green functions (34). We immediately see that due to mixing with the topological
charge density $Q$, the physical pseudoscalar meson mass $m^2_{\alpha\beta}$ is shifted
from its original value. From the pole in (36), we immediately find

$$ f^{a\alpha}\,m^2_{\alpha\beta}\,f^{T\beta b} = -(M\Phi)_{ab} + 2n_f\,A\,\delta_{a0}\delta_{b0} \tag{41} $$

where we identify the physical, RG-invariant decay constants as

$$ (f f^T)_{ab} = (\hat f\hat f^T)_{ab} - \sqrt{2n_f}\,A\,(\delta_{a0}B_b + B_a\delta_{0b}) \tag{42} $$

Equation (41) is the key result. It generalises the original DGMOR relations
(30) to the flavour-singlet sector with the anomaly properly incorporated and
the renormalisation group constraints satisfied. It represents a generalisation
of the Witten–Veneziano mass formula which makes no direct reference to
large-$N_c$ arguments but depends only on the three dynamical assumptions
stated above [2].
With this clarification of the distinction between the physical decay constants
$f^{a\alpha}$ and the RG non-invariant $\hat f^{a\alpha}$, we can rewrite (35) for the topological
susceptibility $\chi(p^2) = W_{\theta\theta}(p^2)$ as

$$ \chi(p^2) = -A\,\Big( 1 - \mathrm{tr}\Big[ \big( (\hat f\hat f^T - f f^T)\,p^2 + 2n_f\,A\,\mathbf{1}_{00} \big)\,\big( \hat f\hat f^T p^2 + M\Phi \big)^{-1} \Big] \Big)^{-1} \tag{43} $$

It is clear that in the zero-momentum limit, this expression successfully reproduces
(22) for $\chi(0)$. For one flavour, the formula simplifies to

$$ \chi(p^2) = -A\,\big( \hat f\hat f^T p^2 + M\Phi \big)\,\big( f f^T p^2 + M\Phi - 2n_f\,A \big)^{-1} \qquad (n_f = 1) \tag{44} $$

showing clearly the pole at the shifted mass $m^2$ of (41). The occurrence of
both $\hat f^{a\alpha}$ and $f^{a\alpha}$ in these expressions allows them to satisfy the RGE (25)
for the topological susceptibility, which requires $\mathcal{D}\chi(p^2) = O(p^2)$.
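For one flavour every quantity is a number, so the statement that the inverted vertex matrix has its pole at the shifted mass of (41) can be checked directly. The sketch below is an illustration under stated assumptions: it takes $n_f = 1$, for which $d_{000} = \sqrt{2}$, $M = 2m$ and $\Phi = \langle\bar qq\rangle$, inverts the 2×2 matrix (32) as in (33), and compares with the leading-order pole form. The parameter values for `A`, `B`, `C`, `m` and `cond` are hypothetical, in arbitrary units, and the function names are not from the original text.

```python
import math

SQRT2 = math.sqrt(2.0)   # sqrt(2 n_f) for n_f = 1


def chi_exact(p2, A, B, C, m, cond):
    """W_theta_theta = -(Gamma^{-1})_QQ from the 2x2 vertex matrix (32) with n_f = 1."""
    m_phi = 2.0 * m * cond                 # (M Phi)_00 = 2 m <qbar q>
    g_qq = 1.0 / A
    g_qs = (SQRT2 + B * p2) / cond         # Gamma_{Q phi5}
    g_ss = (m_phi + C * p2) / cond**2      # Gamma_{phi5 phi5}
    det = g_qq * g_ss - g_qs**2
    return -g_ss / det


def chi_pole_form(p2, A, B, C, m, cond):
    """Leading order in p^2: -A (fhat^2 p^2 + M Phi) / (f^2 p^2 + M Phi - 2 n_f A)."""
    f_hat2 = C
    f2 = C - 2.0 * SQRT2 * A * B           # eq. (42) for n_f = 1
    m_phi = 2.0 * m * cond
    return -A * (f_hat2 * p2 + m_phi) / (f2 * p2 + m_phi - 2.0 * A)


def pole_mass2(A, B, C, m, cond):
    """Shifted mass from eq. (41): f^2 m^2 = -(M Phi) + 2 n_f A."""
    f2 = C - 2.0 * SQRT2 * A * B
    return (-2.0 * m * cond + 2.0 * A) / f2


if __name__ == "__main__":
    pars = dict(A=1.0, B=0.1, C=1.0, m=0.05, cond=-1.0)   # hypothetical values
    for p2 in (0.0, 1e-3, 1e-2):
        print(p2, chi_exact(p2, **pars), chi_pole_form(p2, **pars))
    print("pole at m^2 =", pole_mass2(**pars))
```

The two expressions agree at $p^2 = 0$, where both reduce to (22), and differ only by terms of relative order $p^4$ that come from the square of the $B\,p^2$ coupling.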
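The second stage is to make a change of variable which diagonalises the
propagator matrix, so as to give the most direct possible relation between the
operators and the physical states. Choosing

$$ \begin{aligned} G &= Q - W_{\theta S^a_5}\,W^{-1}_{S^a_5 S^b_5}\,\phi^b_5 \simeq Q + \sqrt{2n_f}\,A\,\Phi^{-1}_{0b}\,\phi^b_5 \\ \eta^\alpha &= f^{T\alpha a}\,\Phi^{-1}_{ab}\,\phi^b_5 \end{aligned} \tag{45} $$

defines the effective action $\Gamma[G, \eta^\alpha]$ as

$$ \Gamma[G, \eta^\alpha] = \int dx\,\Big( \frac{1}{2A}\,G^2 + \frac{1}{2}\,\eta^\alpha\,(-\partial^2 - m^2)_{\alpha\beta}\,\eta^\beta \Big) \tag{46} $$

with $m^2_{\alpha\beta}$ given by (41). The corresponding propagators are

$$ \langle 0|T\,G\,G|0\rangle = -A, \qquad \langle 0|T\,\eta^\alpha\,\eta^\beta|0\rangle = \frac{-1}{p^2 - m^2_{\eta^\alpha}}\,\delta^{\alpha\beta} \tag{47} $$

where with no loss of generality we have taken $m^2_{\alpha\beta}$ diagonal.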

Notice also that the states mix in the complementary way to the operators.
In particular, the mixing for the states corresponding to (45) for the fields $G$
and $\eta^\alpha$ is

$$ |G\rangle = |Q\rangle, \qquad |\eta^\alpha\rangle = (f^{-1})^{\alpha a}\,\Big( \Phi_{ab}\,|\phi^b_5\rangle - \sqrt{2n_f}\,A\,\delta_{a0}\,|Q\rangle \Big) \tag{48} $$

In this sense, we see that we can regard the physical $\eta'$ (and, with SU(3)
breaking, the $\eta$) as an admixture of quark and gluon components, while the
unphysical state $|G\rangle$ is purely gluonic.
An immediate corollary is the following relation, which we will use repeatedly
in deriving alternative forms of the current algebra identities for the
pseudoscalar mesons:

$$ \Phi_{ab}\,\frac{\delta}{\delta\phi^b_5} = \hat f^{a\alpha}\,\frac{\delta}{\delta\hat\eta^\alpha} = f^{a\alpha}\,\frac{\delta}{\delta\eta^\alpha} + \sqrt{2n_f}\,A\,\delta_{a0}\,\frac{\delta}{\delta G} \tag{49} $$

The formulation in terms of $\Gamma[G, \eta^\alpha]$ is exactly what we need to construct a
simple ‘$U(1)_A$ PCAC’ with which to interpret the low-energy phenomenology
of the pseudoscalar mesons. We turn to this in the next section.
Here, we focus on the intermediate formulation $\Gamma[Q, \hat\eta^\alpha]$ in order to describe
Veneziano’s analysis of the $U(1)_A$ problem in the framework of the
large-$N_c$ and topological expansions. The starting point is the anomalous
Ward identity (18) for the topological susceptibility:

$$ n_f^2\int dx\; i\langle 0|T\,Q(x)\,Q(0)|0\rangle = \int dx\; m^a m^b\, i\langle 0|T\,\phi^a_5(x)\,\phi^b_5(0)|0\rangle + m^a\langle\phi^a\rangle \tag{50} $$

The essential problem is how to understand this relation in terms of a spectral
decomposition in the context of the $1/N_c$ expansion.
Assuming that $\chi(0)|_{YM} = -A + O(1/N_c)$ is non-vanishing at $O(1)$, the
l.h.s. is $O(n_f^2)$ in leading order in $1/N_c$. On the other hand, the r.h.s. includes
the condensate term of $O(n_f N_c m)$. To resolve this apparent paradox, we have
to go beyond leading order in $1/N_c$ and consider the quark loop contributions
which are included in the topological expansion. Although these are formally
suppressed by powers of $(n_f/N_c)$, they contain light intermediate states that
can enhance the order of the Green function. As we have seen above, these light
states are just the ‘OZI bosons’ $|\hat\eta^\alpha\rangle$ with masses $\mu^2_{\alpha\beta}$ of $O(n_f m)$. Inserting
these intermediate states, we therefore find that:

$$ \chi(p^2) = \chi(p^2)|_{YM} - \langle 0|Q|\hat\eta^\alpha\rangle\,\Big( \frac{1}{p^2 - \mu^2} \Big)_{\alpha\beta}\,\langle\hat\eta^\beta|Q|0\rangle + \ldots \tag{51} $$

where the coupling $\langle 0|Q|\hat\eta^\alpha\rangle$ is $O(\sqrt{n_f/N_c})$.
Approximating $\chi(p^2)|_{YM} \simeq -A$ (a low-momentum smoothness assumption)
and $\langle 0|Q|\hat\eta^\alpha\rangle \simeq \sqrt{2n_f}\,A\,(f^{-1})^{0\alpha}$, then summing the series of intermediate
state contributions, we find

$$ \chi(p^2) \simeq \frac{-A}{1 - 2n_f\,A\,\big( f\,(p^2 - \mu^2)\,f^T \big)^{-1}_{00}} \tag{52} $$

This expression reproduces (13) of [1]. Clearly, it is dominated by the physical
pseudoscalar pole with anomaly-induced mass given by (41). It does not completely
recover our more precise expression (43) because of the approximation
for the coupling of $Q$ to the $|\hat\eta^\alpha\rangle$, which misses the subtleties related to the
introduction of $B_a$ in the effective action $\Gamma[Q, \hat\eta^\alpha]$ and the distinction of $\hat f^{a\alpha}$
and $f^{a\alpha}$. These are effects of higher order in $1/N_c$ but, as we have seen, they
are necessary to establish full RG consistency and will prove to be important
for phenomenology.
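The resummation of intermediate-state insertions into a single shifted pole is a geometric series. A minimal one-flavour sketch is given below, assuming a single ‘OZI boson’ of mass squared `mu2` and decay constant `f`, with $\chi|_{YM} \simeq -A$. All numerical values are hypothetical and in arbitrary units, and the function names are illustrative.

```python
def insertion(p2, A, f, mu2, n_f=1):
    """One insertion of the light pole: x = 2 n_f A / (f^2 (p^2 - mu^2))."""
    return 2.0 * n_f * A / (f**2 * (p2 - mu2))


def chi_resummed(p2, A, f, mu2, n_f=1):
    """Closed form: chi = -A / (1 - x)."""
    return -A / (1.0 - insertion(p2, A, f, mu2, n_f))


def chi_partial_sum(p2, A, f, mu2, n_terms, n_f=1):
    """Truncated series -A (1 + x + x^2 + ...); converges only for |x| < 1."""
    x = insertion(p2, A, f, mu2, n_f)
    return -A * sum(x**k for k in range(n_terms))


def shifted_pole(A, f, mu2, n_f=1):
    """Zero of 1 - x: p^2 = mu^2 + 2 n_f A / f^2."""
    return mu2 + 2.0 * n_f * A / f**2


if __name__ == "__main__":
    A, f, mu2 = 1.0, 1.0, 0.1            # hypothetical inputs
    p2 = -10.0                           # space-like point where |x| < 1
    for n in (1, 2, 5, 20):
        print(n, chi_partial_sum(p2, A, f, mu2, n), chi_resummed(p2, A, f, mu2))
    print("resummed pole at p^2 =", shifted_pole(A, f, mu2))
```

The first correction term of the series, $-A\,x$, is the single-pole exchange, and the pole of the resummed form sits at $p^2 = \mu^2 + 2n_f A/f^2$, which is the one-flavour content of the anomaly-induced mass shift.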
To see how a term with the $O(n_f N_c m)$ dependence of the condensate can
arise in $n_f^2\,\chi(0)$, notice from (41) that the physical pseudoscalar mass squared
$m^2_{\eta^\alpha}$ has two contributions, the first of $O(m)$ from the conventional quark
mass term and the new, anomaly-induced contribution of $O(n_f/N_c)$. If we are
in a regime where the anomaly contribution dominates ($m < \Lambda_{QCD}/N_c$), then
it follows that the above expression for $\chi(0)$ indeed becomes of $O(n_f^{-1} N_c m)$.
The original Witten–Veneziano mass formula for the $\eta'$ is the large-$N_c$
limit of (41). In the chiral limit there is no flavour mixing and the singlet
mass is given by

$$ m^2_{\eta'} = \frac{1}{(f^{0\eta'})^2}\,2n_f\,A = -\frac{2n_f}{f_\pi^2}\,\chi(0)|_{YM} + O\big((n_f/N_c)^2\big) \tag{53} $$

This formula provided the first link between the $\eta'$ mass and gluon topology.
For an alternative recent derivation in the context of an $n_f/N_c$ expansion, see
also [43].
What we learn from all this is that the Green functions in the anomalous
chiral Ward identities admit a consistent spectral decomposition in terms of a
full set of (n2f − 1) pseudoscalar mesons, provided they satisfy the generalised
DGMOR mass formula (41) including the all-important anomaly term. The
presence of these light poles can enhance the apparent order of the Green func-
tions, as is familiar with Nambu–Goldstone bosons, and the anomaly-induced
252 G. M. Shore

$O(n_f/N_c)$ contribution to $m^2_{\eta^\alpha}$ is critical in ensuring complete consistency with the Ward identities.
Similar considerations apply to the resolution of apparent paradoxes in the $\theta$-dependence of some Green functions. For example [1], we can show from the anomalous Ward identities that the condensate satisfies

$$\sum_q m_q\langle\bar q q\rangle\big|_\theta \equiv m_a\langle\phi_a\rangle = \cos(\theta/n_f)\; m_a\langle\phi_a\rangle\big|_{\theta=0} \tag{54}$$

This implies

$$\frac{\partial^2}{\partial\theta^2}\,m_a\langle\phi_a\rangle\Big|_{\theta=0} = -m_a\int dx\int dy\;\langle 0|T\,Q(x)\,Q(y)\,\phi_a(0)|0\rangle = -\frac{1}{n_f^2}\,m_a\langle\phi_a\rangle\big|_{\theta=0} \tag{55}$$

Here, the Green function is superficially of $O(n_f/N_c)$ while the r.h.s. is $O(N_c/n_f)$. The resolution is simply that it contains pseudoscalar intermediate states contributing two light poles with $m^2 \sim O(n_f/N_c)$. So once again we see how the spectral decomposition in terms of the full set of pseudoscalar mesons, including the flavour singlet, ensures consistency with the anomalous Ward identities.

4 Pseudoscalar Mesons
This theoretical analysis provides the basis for an extension of the conventional PCAC or chiral Lagrangian description of the phenomenology of the pseudoscalar mesons to the flavour singlet sector. In this section$^7$ we describe the role of the $U(1)_A$ anomaly in the radiative decays of $\pi^0$, $\eta$ and $\eta'$ and derive the $U(1)_A$ Goldberger–Treiman relation, first proposed by Veneziano as a resolution of the 'proton spin' problem.

4.1 $U(1)_A$ Dashen, Gell-Mann–Oakes–Renner Relations

The extension of the DGMOR relations to the $U(1)_A$ sector follows from the application of the three key dynamical assumptions used above (viz. pole dominance by the nonet of pseudoscalar mesons, smoothness of decay constants and couplings over the range from zero to on-shell momentum, and the existence of topologically non-trivial gluon dynamics) to the anomalous chiral Ward identities.
The fundamental $U(1)_A$ DGMOR relation

$$f^{a\alpha}\,m^2_{\alpha\beta}\,f^{T\beta b} = -M_{ac}\Phi_{cb} + 2n_f A\,\delta_{a0}\delta_{b0} \tag{56}$$

has been derived above in the course of the general discussion of the $U(1)_A$ problem. In order to make this section self-contained, we give a brief and direct derivation here.

$^7$ This section is based on the presentation in [3], where we extend and update our original work [2, 44] to include a detailed comparison with experimental data.

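Recall that the physical meson fields are given as $\eta^\alpha = f^{T\alpha a}\,\Phi^{-1}_{ab}\,\phi^b_5$, with the decay constants defined so that the propagator $W_{S^\alpha_\eta S^\beta_\eta} = -1/(p^2 - m^2_\eta)_{\alpha\beta}$. It follows immediately that at zero momentum,

$$f^{a\alpha}\,m^2_{\alpha\beta}\,f^{T\beta b} = \Phi_{ac}\,(W_{S_5 S_5})^{-1}_{cd}\,\Phi_{db} \tag{57}$$

Using the chiral Ward identities of Sect. 2 together with the identification (21) of the topological susceptibility, we can then show

$$\begin{aligned}
\Phi_{ac}\,(W_{S_5S_5})^{-1}_{cd}\,\Phi_{db} &= (\Phi M)_{ac}\,\big(M W_{S_5S_5} M\big)^{-1}_{cd}\,(M\Phi)_{db} \\
&= (M\Phi)_{ac}\,\big[-(M\Phi) + 2n_f\,\chi(0)\,\mathbf{1}_{00}\big]^{-1}_{cd}\,(M\Phi)_{db} \\
&= -(M\Phi)_{ab} + 2n_f\,\Gamma^{-1}_{QQ}\,\delta_{a0}\delta_{b0}
\end{aligned} \tag{58}$$

proving the result (56).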
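The passage from the second to the third line of (58) is a rank-one matrix inverse. A sketch of that step follows, with the shorthand $X \equiv M\Phi$ and $c \equiv 2n_f\chi(0)$ introduced here for illustration. It assumes that the identity (22), which is not reproduced in this section, takes the form $\chi(0) = -A\,[1 - 2n_f A\,(M\Phi)^{-1}_{00}]^{-1}$; that form is an assumption of this sketch.

```latex
\begin{align*}
\bigl[-X + c\,\mathbf{1}_{00}\bigr]^{-1}
   &= -X^{-1} - \frac{c\,X^{-1}\,\mathbf{1}_{00}\,X^{-1}}{1 - c\,(X^{-1})_{00}}
   && \text{(Sherman--Morrison)} \\
X\,\bigl[-X + c\,\mathbf{1}_{00}\bigr]^{-1}X
   &= -X - \frac{c}{1 - c\,(X^{-1})_{00}}\,\mathbf{1}_{00} \\
-\frac{2n_f\,\chi(0)}{1 - 2n_f\,\chi(0)\,(X^{-1})_{00}}
   &= 2n_f A
   && \text{for } \chi(0) = \frac{-A}{1 - 2n_f A\,(X^{-1})_{00}}
\end{align*}
```

Under that assumption the coefficient of $\delta_{a0}\delta_{b0}$ is $2n_f A$, so comparing with (56) the quantity $\Gamma^{-1}_{QQ}$ in (58) is identified with $A$.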

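Expanding this out, and assuming the mixed decay constants $f^{0\pi}$, $f^{8\pi}$, $f^{3\eta}$, $f^{3\eta'}$ are all negligible, we have

$$(f^{0\eta'})^2 m^2_{\eta'} + (f^{0\eta})^2 m^2_\eta = -\frac{2}{3}\big(m_u\langle\bar u u\rangle + m_d\langle\bar d d\rangle + m_s\langle\bar s s\rangle\big) + 6A \tag{59}$$

$$f^{0\eta'}f^{8\eta'}m^2_{\eta'} + f^{0\eta}f^{8\eta}m^2_\eta = -\frac{\sqrt2}{3}\big(m_u\langle\bar u u\rangle + m_d\langle\bar d d\rangle - 2m_s\langle\bar s s\rangle\big) \tag{60}$$

$$(f^{8\eta'})^2 m^2_{\eta'} + (f^{8\eta})^2 m^2_\eta = -\frac{1}{3}\big(m_u\langle\bar u u\rangle + m_d\langle\bar d d\rangle + 4m_s\langle\bar s s\rangle\big) \tag{61}$$

$$f_\pi^2 m_\pi^2 = -\big(m_u\langle\bar u u\rangle + m_d\langle\bar d d\rangle\big) \tag{62}$$

and we can add the standard DGMOR relation for the $K^+$,

$$f_K^2 m_K^2 = -\big(m_u\langle\bar u u\rangle + m_s\langle\bar s s\rangle\big) \tag{63}$$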

We emphasise that these formulae, as well as the radiative decay and $U(1)_A$ Goldberger–Treiman relations derived below, do not depend at all on the $1/N_c$ expansion. In particular, the constant $A$ appearing in the flavour singlet formula is defined as the non-perturbative parameter determining the topological susceptibility $\chi(0)$ in QCD according to the exact identity (22). As explained above, large-$N_c$ ideas do indeed provide a rationale for extending the familiar PCAC assumptions of pole dominance and smoothness to the flavour singlet channel, but these assumptions can be tested independently against experimental data.

The most useful form of these relations for phenomenology is to assume exact $SU(2)$ flavour symmetry and eliminate the quark masses and condensates in favour of $f_\pi$, $f_K$, $m_\pi^2$ and $m_K^2$ in the DGMOR relations for the $\eta$ and $\eta'$. This gives

$$(f^{0\eta'})^2 m^2_{\eta'} + (f^{0\eta})^2 m^2_\eta = \frac{1}{3}\big(f_\pi^2 m_\pi^2 + 2f_K^2 m_K^2\big) + 6A \tag{64}$$

$$f^{0\eta'}f^{8\eta'}m^2_{\eta'} + f^{0\eta}f^{8\eta}m^2_\eta = \frac{2\sqrt2}{3}\big(f_\pi^2 m_\pi^2 - f_K^2 m_K^2\big) \tag{65}$$

$$(f^{8\eta'})^2 m^2_{\eta'} + (f^{8\eta})^2 m^2_\eta = -\frac{1}{3}\big(f_\pi^2 m_\pi^2 - 4f_K^2 m_K^2\big) \tag{66}$$

We can also now clarify the precise relation of these results to the Witten–Veneziano formula for the mass of the $\eta'$ in its non-vanishing quark mass form, viz.

$$m^2_{\eta'} + m^2_\eta - 2m^2_K = -\frac{6}{f_\pi^2}\,\chi(0)|_{\rm YM} \tag{67}$$

Of course, only the $m^2_{\eta'}$ term on the l.h.s. is present in the chiral limit. Substituting in the explicit values for the masses in this formula gives a prediction [1] for the topological susceptibility, $\chi(0)|_{\rm YM} \simeq -(180~{\rm MeV})^4$, which as we see below is remarkably close to the subsequently calculated lattice result.
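The estimate from (67) is easy to reproduce numerically. The following is a minimal sketch, assuming the meson masses and $f_\pi$ quoted later in (85) and (86) as inputs; the original prediction in [1] may have used slightly different values, and the function and variable names are illustrative only.

```python
# Witten-Veneziano estimate of the Yang-Mills topological susceptibility, eq. (67).
# Inputs (in GeV) are the central values quoted in (85) and (86); using them here
# is an assumption of this sketch.
M_ETA_PRIME = 0.95778
M_ETA = 0.54730
M_K = 0.49368
F_PI = 0.09242


def chi_ym_witten_veneziano():
    """Solve (67), m_eta'^2 + m_eta^2 - 2 m_K^2 = -(6 / f_pi^2) chi, for chi in GeV^4."""
    lhs = M_ETA_PRIME**2 + M_ETA**2 - 2.0 * M_K**2
    return -F_PI**2 * lhs / 6.0


chi_ym = chi_ym_witten_veneziano()
scale_mev = 1000.0 * (-chi_ym) ** 0.25  # |chi|^(1/4) expressed in MeV
print(f"chi(0)|YM = {chi_ym:.3e} GeV^4, i.e. -({scale_mev:.1f} MeV)^4")
```

With these inputs the fourth root of $|\chi(0)|_{\rm YM}|$ comes out close to the 180 MeV quoted in the text.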
If we now add the DGMOR formulae (64) and (66), we find

$$(f^{0\eta'})^2 m^2_{\eta'} + (f^{0\eta})^2 m^2_\eta + (f^{8\eta'})^2 m^2_{\eta'} + (f^{8\eta})^2 m^2_\eta - 2f_K^2 m_K^2 = 6A \tag{68}$$

which we repeat is valid to all orders in $1/N_c$. To reduce this to its Witten–Veneziano approximation, we impose the large-$N_c$ limit to approximate the QCD topological charge parameter $A$ with $-\chi(0)|_{\rm YM}$ as explained in Sect. 2.4. We then set the 'mixed' decay constants $f^{0\eta}$ and $f^{8\eta'}$ to zero and all the remaining decay constants $f^{0\eta'}$, $f^{8\eta}$ and $f_K$ equal to $f_\pi$. With these approximations, we recover (67). Eventually, after we have found explicit experimental values for all these quantities, we will be able to demonstrate quantitatively how good an approximation the large-$N_c$ Witten–Veneziano formula is to the generalised $U(1)_A$ DGMOR relation in full QCD.

4.2 Radiative Decay Formulae for $\pi^0, \eta, \eta' \to \gamma\gamma$

Radiative decays of the pseudoscalar mesons are of particular interest as they are controlled by the electromagnetic $U(1)_A$ anomaly,

$$\partial^\mu J^a_{\mu5} - M_{ab}\,\phi^b_5 - \sqrt{2n_f}\,Q\,\delta_{a0} - a^a_{\rm em}\,\frac{\alpha}{8\pi}\,F^{\mu\nu}\tilde F_{\mu\nu} = 0 \tag{69}$$

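where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the usual electromagnetic field strength and the anomaly coefficients $a^a_{\rm em}$ are determined by the quark charges. The generating functional $\Gamma[V^a_{\mu5}, V^a_\mu, Q, \phi^a_5, \phi^a, A_\mu]$ of 1PI vertices including the photon satisfies the Ward identity

$$\begin{aligned}
\partial_\mu\Gamma_{V^a_{\mu5}} &- \sqrt{2n_f}\,\delta_{a0}\,Q - a^a_{\rm em}\,\frac{\alpha}{8\pi}\,F^{\mu\nu}\tilde F_{\mu\nu} - d_{abc}\,m^b\phi^c_5 \\
&+ f_{abc}V^b_\mu\,\Gamma_{V^c_{\mu5}} + f_{abc}V^b_{\mu5}\,\Gamma_{V^c_\mu} - d_{abc}\,\phi^c_5\,\Gamma_{\phi^b} + d_{abc}\,\phi^c\,\Gamma_{\phi^b_5} = 0
\end{aligned} \tag{70}$$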

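To derive the radiative decay formulae, we first differentiate this identity with respect to the photon field $A_\mu$. This gives

$$i p_\mu\,\Gamma_{V^a_{\mu5}A^\lambda A^\rho} + \Phi_{ab}\,\Gamma_{\phi^b_5 A^\lambda A^\rho} = -a^a_{\rm em}\,\frac{\alpha}{\pi}\,\epsilon_{\mu\nu\lambda\rho}\,k_1^\mu k_2^\nu \tag{71}$$

where $k_1$, $k_2$ are the momenta of the two photons. Notice that the mass term does not contribute directly to this formula. From its definition as 1PI w.r.t. the pseudoscalar fields, the vertex $\Gamma_{V^a_{\mu5}A^\lambda A^\rho}$ does not have a pole at $p^2 = 0$, even in the massless limit, so we find simply

$$\Phi_{ab}\,\Gamma_{\phi^b_5 A^\lambda A^\rho}\Big|_{p=0} = -a^a_{\rm em}\,\frac{\alpha}{\pi}\,\epsilon_{\mu\nu\lambda\rho}\,k_1^\mu k_2^\nu \tag{72}$$

The radiative couplings $g_{\eta^\alpha\gamma\gamma}$ for the physical mesons $\eta^\alpha = \pi^0, \eta, \eta'$ are defined as usual from the decay amplitude $\langle\gamma\gamma|\eta^\alpha\rangle$. With the PCAC assumptions already discussed, they can be identified with the 1PI vertices as follows:

$$\langle\gamma\gamma|\eta^\alpha\rangle = -i\,g_{\eta^\alpha\gamma\gamma}\,\epsilon_{\mu\nu\lambda\rho}\,k_1^\mu k_2^\nu\,\epsilon^\lambda(k_1)\,\epsilon^\rho(k_2) = i\,\Gamma_{\eta^\alpha A^\lambda A^\rho}\,\epsilon^\lambda(k_1)\,\epsilon^\rho(k_2) \tag{73}$$

Re-expressing (72) in terms of the canonically normalised 'OZI bosons' $\hat\eta^\alpha$, we therefore have the first form of the decay formula,

$$\hat f^{a\alpha}\,g_{\hat\eta^\alpha\gamma\gamma} = a^a_{\rm em}\,\frac{\alpha}{\pi} \tag{74}$$

Then, rewriting this in terms of the physical pseudoscalar couplings $g_{\eta^\alpha\gamma\gamma}$ and decay constants according to the relation (49) gives the final form for the generalised $U(1)_A$ PCAC formula describing radiative pseudoscalar decays, incorporating both the electromagnetic and colour anomalies:

$$f^{a\alpha}\,g_{\eta^\alpha\gamma\gamma} + \sqrt{2n_f}\,A\,g_{G\gamma\gamma}\,\delta_{a0} = a^a_{\rm em}\,\frac{\alpha}{\pi} \tag{75}$$

Expanding this formula, we have

$$f^{0\eta'}g_{\eta'\gamma\gamma} + f^{0\eta}g_{\eta\gamma\gamma} + \sqrt6\,A\,g_{G\gamma\gamma} = a^0_{\rm em}\,\frac{\alpha}{\pi} \tag{76}$$

$$f^{8\eta'}g_{\eta'\gamma\gamma} + f^{8\eta}g_{\eta\gamma\gamma} = a^8_{\rm em}\,\frac{\alpha}{\pi} \tag{77}$$

$$f_\pi\,g_{\pi\gamma\gamma} = a^3_{\rm em}\,\frac{\alpha}{\pi} \tag{78}$$

where $a^0_{\rm em} = \frac{2\sqrt2}{3\sqrt3}N_c$, $a^8_{\rm em} = \frac{1}{3\sqrt3}N_c$ and $a^3_{\rm em} = \frac{1}{3}N_c$.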
The new element in the flavour singlet decay formula is the gluonic coupling parameter $g_{G\gamma\gamma}$. It takes account of the fact that because of the anomaly-induced mixing with the gluon topological density $Q$, the physical $\eta'$ is not a true Nambu–Goldstone boson so the naive PCAC formulae must be modified. $g_{G\gamma\gamma}$ is not a physical coupling and must be regarded as an extra parameter to be fitted to data, although in view of the identifications in (48) it may reasonably be thought of as the coupling of the photons to the gluonic component of the $\eta'$.
The renormalisation group properties of these relations are readily derived from the RGE (27) for $\Gamma$. In the 'OZI boson' form, the unphysical coupling $g_{\hat\eta^\alpha\gamma\gamma}$ satisfies the complementary RGE to the decay constant $\hat f^{a\alpha}$ so the combination is RG invariant:

$$\mathcal{D}\hat f^{a\alpha} = \gamma\,\delta_{a0}\,\hat f^{a\alpha} \qquad \mathcal{D}\big(\hat f^{a\alpha}\,g_{\hat\eta^\alpha\gamma\gamma}\big) = 0 \tag{79}$$

In contrast, all the decay constants and couplings in the relation (75) can be shown to be separately RG invariant, including the gluonic coupling $g_{G\gamma\gamma}$ [24, 44].

4.3 The Renormalisation Group, OZI and 1/Nc : a Conjecture

Although these U (1)A PCAC relations have been derived purely on the basis
of the pole dominance and smoothness assumptions, we will nevertheless find
it useful in practical applications to exploit their OZI or large-Nc behaviour,
in conjunction with the renormalisation group.
The basic idea is that violations of the OZI rule, or equivalently anoma-
lous large-Nc behaviour, are generally related to the existence of the U (1)A
anomaly. Moreover, we can identify the quantities which will be particularly
sensitive to the anomaly as those which have RGEs involving the anomalous
dimension γ. We therefore conjecture that the dependence of Green functions
and 1PI vertices on γ will be an important guide in identifying propagators
and couplings which are likely to show violations of the OZI rule and those
for which the OZI (or large-Nc ) limit should be a good approximation [9, 24].
As an example, the large-$N_c$ order of the quantities in the flavour singlet decay relation (76) is as follows: $f^{a\alpha} = O(\sqrt{N_c})$ for all the decay constants, $g_{\eta^\alpha\gamma\gamma} = O(\sqrt{N_c})$, $g_{G\gamma\gamma} = O(1)$, $a^a_{\rm em} = O(N_c)$ and the topological susceptibility parameter $A = O(1)$. The renormalisation group behaviour is especially simple, with both the meson and gluonic couplings $g_{\eta^\alpha\gamma\gamma}$ and $g_{G\gamma\gamma}$ as well as the decay constants being RG invariant. Putting this together, we find that all the terms in the decay formula are of $O(N_c)$ except the anomalous contribution $A\,g_{G\gamma\gamma}$ which is $O(1)$. Since it is RG invariant and independent of the anomalous dimension $\gamma$, we conjecture that it is a quantity for which the OZI (or large-$N_c$) approximation should be reliable so we expect it to be numerically small compared with the other contributions. In the next section, we test this against experiment.
As we shall see later, this conjecture has far-reaching implications for a range of predictions related to the anomaly, particularly in the interpretation of the $U(1)_A$ Goldberger–Treiman relation and associated ideas on the first moment sum rules for $g_1^p$ and $g_1^\gamma$ in deep-inelastic scattering.

4.4 Phenomenology

After all this theoretical development, we finally turn to experiment and use the data on the radiative decays $\eta, \eta' \to \gamma\gamma$ to deduce values for the pseudoscalar meson decay constants $f^{0\eta'}$, $f^{0\eta}$, $f^{8\eta'}$ and $f^{8\eta}$ from the set of decay formulae (76), (77) and $U(1)_A$ DGMOR relations (64)–(66). We will also find the value of the unphysical coupling parameter $g_{G\gamma\gamma}$ and test the realisation of the $1/N_c$ expansion in real QCD.
The two-photon decay widths are given by

$$\Gamma\big(\eta'(\eta)\to\gamma\gamma\big) = \frac{m^3_{\eta'(\eta)}}{64\pi}\,\big|g_{\eta'(\eta)\gamma\gamma}\big|^2 \tag{80}$$

The current experimental data, quoted in the Particle Data Group tables [45], are

$$\Gamma(\eta'\to\gamma\gamma) = 4.28 \pm 0.19~{\rm keV} \tag{81}$$

which is dominated by the 1998 L3 data [46] on the two-photon formation of the $\eta'$ in $e^+e^- \to e^+e^-\pi^+\pi^-\gamma$, and

$$\Gamma(\eta\to\gamma\gamma) = 0.510 \pm 0.026~{\rm keV} \tag{82}$$

which arises principally from the 1988 Crystal Ball [47] and 1990 ASP [48] results on $e^+e^- \to e^+e^-\eta$. From this data, we deduce the following results for the couplings $g_{\eta'\gamma\gamma}$ and $g_{\eta\gamma\gamma}$:

$$g_{\eta'\gamma\gamma} = 0.031 \pm 0.001~{\rm GeV}^{-1} \tag{83}$$

and

$$g_{\eta\gamma\gamma} = 0.025 \pm 0.001~{\rm GeV}^{-1} \tag{84}$$

which may be compared with $g_{\pi\gamma\gamma} = 0.024 \pm 0.001~{\rm GeV}^{-1}$.
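The step from the widths (81), (82) to the couplings (83), (84) is just the inversion of (80). A minimal sketch follows, using the meson masses quoted in (85); the function names are illustrative, and the error estimate is simple linear propagation of the width uncertainty only (an assumption of this sketch, not a statement of how the quoted errors were obtained).

```python
import math


def coupling_from_width(width_kev, mass_mev):
    """Invert (80), Gamma = m^3 |g|^2 / (64 pi); returns |g| in GeV^-1."""
    width_gev = width_kev * 1.0e-6
    mass_gev = mass_mev * 1.0e-3
    return math.sqrt(64.0 * math.pi * width_gev / mass_gev**3)


def coupling_error(g, width_kev, width_err_kev):
    """Linear error propagation: since g scales as sqrt(Gamma), dg/g = dGamma/(2 Gamma)."""
    return 0.5 * g * width_err_kev / width_kev


g_etap = coupling_from_width(4.28, 957.78)    # eta' -> gamma gamma
g_eta = coupling_from_width(0.510, 547.30)    # eta  -> gamma gamma
dg_etap = coupling_error(g_etap, 4.28, 0.19)
dg_eta = coupling_error(g_eta, 0.510, 0.026)

print(f"g_eta'gg = {g_etap:.4f} +/- {dg_etap:.4f} GeV^-1")
print(f"g_etagg  = {g_eta:.4f} +/- {dg_eta:.4f} GeV^-1")
```

The central values agree with (83) and (84) after rounding to three decimal places.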
We also require the pseudoscalar meson masses:

$$\begin{aligned}
m_{\eta'} &= 957.78 \pm 0.14~{\rm MeV} & m_\eta &= 547.30 \pm 0.12~{\rm MeV} \\
m_K &= 493.68 \pm 0.02~{\rm MeV} & m_\pi &= 139.57 \pm 0.00~{\rm MeV}
\end{aligned} \tag{85}$$

and the decay constants $f_\pi$ and $f_K$. These are defined in the standard way, so we take the following values (in our normalisations) from the PDG [45]:

$$f_K = 113.00 \pm 1.03~{\rm MeV} \qquad f_\pi = 92.42 \pm 0.26~{\rm MeV} \tag{86}$$

giving $f_K/f_\pi = 1.223 \pm 0.012$.

[Fig. 4 plots: panel (a) shows $f^{0\eta'}$ in MeV (vertical axis, 95 to 115) and panel (b) shows $f^{0\eta}$ in MeV (vertical axis, 15 to 35), each against $x$ from 185 to 200.]

Fig. 4. The decay constants $f^{0\eta'}$ and $f^{0\eta}$ as functions of the non-perturbative parameter $A = (x~{\rm MeV})^4$ which determines the topological susceptibility in QCD

The octet decay constants $f^{8\eta'}$ and $f^{8\eta}$ are obtained from (66) and (77). This leaves three remaining equations which determine the singlet decay constants $f^{0\eta'}$, $f^{0\eta}$ and the gluonic coupling $g_{G\gamma\gamma}$ in terms of the QCD topological susceptibility parameter $A$. This dependence is plotted in Figs. 4 and 5.
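As an illustration of the first step, the two octet equations (66) and (77) can be solved directly for $f^{8\eta'}$ and $f^{8\eta}$. The sketch below is not the fitting procedure of the original analysis: it uses only central values, takes the couplings from inverting (80), assumes $\alpha = 1/137.036$ and $N_c = 3$, and resolves the two-fold ambiguity of the quadratic by choosing the branch in which the diagonal constant $f^{8\eta}$ is closest to $f_\pi$ (a choice made here for illustration).

```python
import math

ALPHA_OVER_PI = (1.0 / 137.036) / math.pi   # assumed value of alpha
A8_EM = 1.0 / math.sqrt(3.0)                # a^8_em = N_c / (3 sqrt(3)) with N_c = 3

# Central values in GeV units: masses (85), f_pi and f_K (86).
M_ETAP, M_ETA, M_K, M_PI = 0.95778, 0.54730, 0.49368, 0.13957
F_PI, F_K = 0.09242, 0.11300
# Couplings in GeV^-1 obtained by inverting (80) with the widths (81), (82).
G_ETAP, G_ETA = 0.0313, 0.0250


def solve_octet_decay_constants():
    """Solve (66) and (77) for (f^{8 eta'}, f^{8 eta}) in GeV."""
    rhs66 = -(F_PI**2 * M_PI**2 - 4.0 * F_K**2 * M_K**2) / 3.0
    rhs77 = A8_EM * ALPHA_OVER_PI
    # Use (77) to eliminate f^{8 eta} = (rhs77 - G_ETAP * x) / G_ETA, with x = f^{8 eta'},
    # which turns (66) into a quadratic a x^2 + b x + c = 0.
    k = M_ETA**2 / G_ETA**2
    a = M_ETAP**2 + k * G_ETAP**2
    b = -2.0 * k * rhs77 * G_ETAP
    c = k * rhs77**2 - rhs66
    disc = math.sqrt(b * b - 4.0 * a * c)
    roots = [(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)]
    solutions = [(x, (rhs77 - G_ETAP * x) / G_ETA) for x in roots]
    # Branch choice (assumption): the diagonal constant should lie near f_pi.
    return min(solutions, key=lambda s: abs(s[1] - F_PI))


f8_etap, f8_eta = solve_octet_decay_constants()
print(f"f8eta' = {1000 * f8_etap:.1f} MeV, f8eta = {1000 * f8_eta:.1f} MeV")
```

With these inputs the solution lies within the uncertainties of the octet values quoted in (89).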
To make a deﬁnite prediction, we need a theoretical input value for the
topological susceptibility. In time, lattice calculations in full QCD with dy-
namical fermions should be able to determine the parameter A. For the mo-
ment, however, only the topological susceptibility in pure Yang–Mills theory
is known accurately. The most recent value [49] is


$$\chi(0)|_{\rm YM} = -(191 \pm 5~{\rm MeV})^4 = -(1.33 \pm 0.14)\times 10^{-3}~{\rm GeV}^4 \tag{87}$$

[Fig. 5 plot: contributions in units of $10^{-3}$ (vertical axis, $-1$ to 4) against $x$ from 185 to 200.]

Fig. 5. This shows the relative sizes of the contributions to the flavour singlet radiative decay formula (76) expressed as functions of the topological susceptibility parameter $A = (x~{\rm MeV})^4$. The dotted (black) line denotes $\frac{2\sqrt2}{\sqrt3}\frac{\alpha_{\rm em}}{\pi}$. The dominant contribution comes from the term $f^{0\eta'}g_{\eta'\gamma\gamma}$, denoted by the long-dashed (green) line, while the short-dashed (blue) line denotes $f^{0\eta}g_{\eta\gamma\gamma}$. The contribution from the gluonic coupling, $\sqrt6\,A\,g_{G\gamma\gamma}$, is shown by the solid (red) line


This supersedes the original value $\chi(0)|_{\rm YM} \simeq -(180~{\rm MeV})^4$ obtained some time ago [50]. Similar estimates are also obtained using QCD spectral sum rule methods [51]. At this point, therefore, we have to make an approximation and so we assume that the $O(1/N_c)$ corrections in the identification

$$A = -\chi(0)|_{\rm YM} + O(1/N_c) \tag{88}$$

are numerically small. With this provisional input for $A$, we can then determine the full set of decay constants:

$$\begin{aligned}
f^{0\eta'} &= 104.2 \pm 4.0~{\rm MeV} & f^{0\eta} &= 22.8 \pm 5.7~{\rm MeV} \\
f^{8\eta'} &= -36.1 \pm 1.2~{\rm MeV} & f^{8\eta} &= 98.4 \pm 1.4~{\rm MeV}
\end{aligned} \tag{89}$$

and

$$g_{G\gamma\gamma} = -0.001 \pm 0.072~{\rm GeV}^{-4} \tag{90}$$

It is striking how close both the diagonal decay constants $f^{0\eta'}$ and $f^{8\eta}$ are to $f_\pi$. Predictably, the off-diagonal ones $f^{0\eta}$ and $f^{8\eta'}$ are strongly suppressed.
It is also useful to quote these results in the two-angle parametrisation normally used in phenomenology. Defining

$$\begin{pmatrix} f^{0\eta'} & f^{0\eta} \\ f^{8\eta'} & f^{8\eta} \end{pmatrix} = \begin{pmatrix} f_0\cos\theta_0 & -f_0\sin\theta_0 \\ f_8\sin\theta_8 & f_8\cos\theta_8 \end{pmatrix} \tag{91}$$

we find

$$\begin{aligned}
f_0 &= 106.6 \pm 4.2~{\rm MeV} & f_8 &= 104.8 \pm 1.3~{\rm MeV} \\
\theta_0 &= -12.3 \pm 3.0~{\rm deg} & \theta_8 &= -20.1 \pm 0.7~{\rm deg}
\end{aligned} \tag{92}$$

that is

$$\frac{f_0}{f_\pi} = 1.15 \pm 0.05 \qquad \frac{f_8}{f_\pi} = 1.13 \pm 0.02 \tag{93}$$
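The conversion from the four decay constants in (89) to the two-angle parameters in (92) and (93) follows directly from the definition (91). A minimal sketch, using central values only (the function name is illustrative):

```python
import math


def two_angle_parameters(f0_etap, f0_eta, f8_etap, f8_eta):
    """Invert (91): f0 cos(th0) = f^{0eta'}, -f0 sin(th0) = f^{0eta},
    f8 sin(th8) = f^{8eta'}, f8 cos(th8) = f^{8eta}. Angles are returned in degrees."""
    f0 = math.hypot(f0_etap, f0_eta)
    f8 = math.hypot(f8_etap, f8_eta)
    theta0 = math.degrees(math.atan2(-f0_eta, f0_etap))
    theta8 = math.degrees(math.atan2(f8_etap, f8_eta))
    return f0, theta0, f8, theta8


F_PI = 92.42  # MeV, from (86)
f0, theta0, f8, theta8 = two_angle_parameters(104.2, 22.8, -36.1, 98.4)
print(f"f0 = {f0:.1f} MeV, theta0 = {theta0:.1f} deg, f0/f_pi = {f0 / F_PI:.2f}")
print(f"f8 = {f8:.1f} MeV, theta8 = {theta8:.1f} deg, f8/f_pi = {f8 / F_PI:.2f}")
```

The output can be compared directly with the central values in (92) and (93); small differences in the last digit reflect rounding of the inputs.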
Given these results, we can now investigate how closely our expectations based on OZI or $1/N_c$ reasoning are actually realised by the experimental data. With the input value (87) for $A$, the numerical magnitudes and $1/N_c$ orders of the terms in the flavour singlet decay relation are as follows (see Fig. 5):

$$f^{0\eta'}g_{\eta'\gamma\gamma}\;[N_c;\,3.23] + f^{0\eta}g_{\eta\gamma\gamma}\;[N_c;\,0.57] + \sqrt6\,A\,g_{G\gamma\gamma}\;[1;\,-0.005 \pm 0.23] = a^0_{\rm em}\,\frac{\alpha_{\rm em}}{\pi}\;[N_c;\,3.79] \tag{94}$$

The important point is that the gluonic contribution $g_{G\gamma\gamma}$, which is suppressed by a power of $1/N_c$ compared to the others, is also experimentally small. The near-vanishing for the chosen value of $A$ is presumably a coincidence, but we see from Fig. 5 that across a reasonable range of values of the topological susceptibility it is still contributing no more than around 10%, in line with our expectations for a RG-invariant, OZI-suppressed quantity.
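The magnitudes quoted in (94) can be reproduced from the central values in (83), (84), (87), (89) and (90). The sketch below assumes $\alpha_{\rm em} = 1/137.036$ and $N_c = 3$, and reads the bracketed numbers as being in units of $10^{-3}~{\rm GeV}^{-1}$ (an inference from the sizes of the terms, not stated explicitly in the text).

```python
import math

ALPHA_EM = 1.0 / 137.036            # assumed value of the fine-structure constant
N_C = 3
A0_EM = 2.0 * math.sqrt(2.0) * N_C / (3.0 * math.sqrt(3.0))

A = 0.191**4                        # GeV^4, central value of (87) with A ~ -chi(0)|_YM
F0_ETAP, F0_ETA = 0.1042, 0.0228    # GeV, from (89)
G_ETAP, G_ETA = 0.031, 0.025        # GeV^-1, from (83) and (84)
G_GLUE, G_GLUE_ERR = -0.001, 0.072  # GeV^-4, from (90)


def singlet_decay_terms():
    """Return the terms of (94) in units of 10^-3 GeV^-1."""
    scale = 1.0e3
    return {
        "f0etap_g": scale * F0_ETAP * G_ETAP,
        "f0eta_g": scale * F0_ETA * G_ETA,
        "gluonic": scale * math.sqrt(6.0) * A * G_GLUE,
        "gluonic_err": scale * math.sqrt(6.0) * A * G_GLUE_ERR,
        "rhs": scale * A0_EM * ALPHA_EM / math.pi,
    }


terms = singlet_decay_terms()
for name, value in terms.items():
    print(f"{name:12s} {value:8.3f}")
```

Even at the edge of its quoted uncertainty, the gluonic term is well below 10% of the right-hand side in this evaluation.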
It is also interesting to see how the $1/N_c$ approximation is realised in the $U(1)_A$ DGMOR generalisation (68) of the Witten–Veneziano formula (67). Here we find

$$\begin{aligned}
(f^{0\eta'})^2 m^2_{\eta'}\;[N_c;\,9.96] &+ (f^{0\eta})^2 m^2_\eta\;[N_c;\,0.15] + (f^{8\eta'})^2 m^2_{\eta'}\;[N_c;\,1.19] \\
&+ (f^{8\eta})^2 m^2_\eta\;[N_c;\,2.90] - 2f_K^2 m_K^2\;[N_c;\,-6.22] = 6A\;[1;\,7.98]
\end{aligned} \tag{95}$$

This confirms the picture that the anomaly-induced contribution of $O(1/N_c)$ to $m^2_{\eta'}$, which gives a sub-leading $O(1)$ effect in $(f^{0\eta'})^2 m^2_{\eta'}$, is in fact numerically dominant and matched by the $O(1)$ topological susceptibility term $6A$. Away from the chiral limit, the conventional non-anomalous terms are all of $O(N_c)$ and balance as expected. The surprising numerical accuracy of the Witten–Veneziano formula (18) is seen to be in part due to a cancellation between the underestimates of $f^{8\eta'}$ (taken to be 0) and $f_K$ (set equal to $f_\pi$). This emphasises, however, that great care must be taken in using the formal order in the $1/N_c$ expansion as a guide to the numerical importance of a physical quantity, especially in the $U(1)_A$ channel.
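The balance in (95) can be checked term by term from the central values in (85), (86), (87) and (89). In the sketch below the bracketed numbers are read as being in units of $10^{-3}~{\rm GeV}^4$ (an inference from the sizes of the terms); the dictionary keys are illustrative names.

```python
# Central values in GeV units: masses (85), f_K (86), decay constants (89).
M_ETAP, M_ETA, M_K = 0.95778, 0.54730, 0.49368
F_K = 0.11300
F0_ETAP, F0_ETA = 0.1042, 0.0228
F8_ETAP, F8_ETA = -0.0361, 0.0984
A = 0.191**4                      # GeV^4, central value of (87) with A ~ -chi(0)|_YM


def dgmor_terms():
    """Return the individual terms of (95) in units of 10^-3 GeV^4."""
    scale = 1.0e3
    return {
        "f0etap": scale * F0_ETAP**2 * M_ETAP**2,
        "f0eta": scale * F0_ETA**2 * M_ETA**2,
        "f8etap": scale * F8_ETAP**2 * M_ETAP**2,
        "f8eta": scale * F8_ETA**2 * M_ETA**2,
        "kaon": -scale * 2.0 * F_K**2 * M_K**2,
    }


terms = dgmor_terms()
lhs = sum(terms.values())
rhs = 1.0e3 * 6.0 * A
for name, value in terms.items():
    print(f"{name:8s} {value:7.2f}")
print(f"sum = {lhs:.2f}, 6A = {rhs:.2f}")
```

The sum of the five terms on the left agrees with $6A$ to within rounding, as it must, since the singlet decay constants were determined using this relation.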
Nevertheless, the fact that the RG-invariant, OZI-suppressed coupling $g_{G\gamma\gamma}$ is experimentally small is a very encouraging result. It increases our confidence that we are able to identify quantities where the OZI, or leading $1/N_c$, approximation is likely to be numerically good. It also shows that $g_{G\gamma\gamma}$ gives a contribution to the decay formula which is entirely consistent with its picturesque interpretation as the coupling of the photons to the anomaly-induced gluonic component of the $\eta'$. A posteriori, the fact that its contribution is at most 10% explains the general success of previous theoretically inconsistent phenomenological parametrisations of $\eta'$ decays in which the naive current algebra formulae omitting the gluonic term are used.
However, while the flavour singlet decay formula is well-defined and theoretically consistent, it is necessarily non-predictive. To be genuinely useful, we would need to find another process in which the same coupling enters. The problem here is that, unlike the decay constants which are universal, the coupling $g_{G\gamma\gamma}$ is process-specific just like $g_{\eta'\gamma\gamma}$ or $g_{\eta\gamma\gamma}$. There are of course many other processes to which our methods may be applied such as $\eta'(\eta) \to V\gamma$, where $V$ is a flavour singlet vector meson $\rho$, $\omega$, $\phi$, or $\eta'(\eta) \to \pi^+\pi^-\gamma$. The required flavour singlet formulae may readily be written down, generalising the naive PCAC formulae. However, each will introduce its own gluonic coupling, such as $g_{GV\gamma}$. Although strict predictivity is lost, our experience with the two-photon decays suggests that these extra couplings will give relatively small, at most $O(10$–$20\%)$, contributions if like $g_{G\gamma\gamma}$ they can be identified as RG invariant and $1/N_c$ suppressed. This observation restores at least a reasonable degree of predictivity to the use of PCAC methods in the $U(1)_A$ sector.

4.5 U (1)A Goldberger–Treiman Relation

A further classic application of PCAC is to the pseudoscalar couplings of the nucleon. For the pion, the relation between the axial-vector form factor of the nucleon and the pion–nucleon coupling $g_{\pi NN}$ is the famous Goldberger–Treiman relation. Here, we present its generalisation to the flavour singlet sector, which involves the anomaly and gluon topology. This $U(1)_A$ Goldberger–Treiman relation was first proposed by Veneziano [4] in an investigation of the 'proton-spin' problem and further developed in [3, 8, 9, 52].
The axial-vector form factors are defined from

$$\langle N|J^a_{\mu5}|N\rangle = 2m_N\Big(G^a_A(p^2)\,s_\mu + G^a_P(p^2)\,p.s\;p_\mu\Big) \tag{96}$$

where $s_\mu = \bar u\gamma_\mu\gamma_5 u/2m_N$ is the covariant spin vector. In the absence of a massless pseudoscalar, only the form factors $G^a_A(0)$ contribute at zero momentum.
Expressing the matrix element in terms of the 1PI vertices derived from the generating functional $\Gamma[V^a_{\mu5}, V^a_\mu, Q, \phi^a_5, \phi^a]$, including spectator fields $N, \bar N$ for the nucleon, we have

$$\langle N|J^a_{\mu5}|N\rangle = \bar u\Big[\Gamma_{V^{\mu a}_5\bar N N} + W_{V^{\mu a}_5\theta}\,\Gamma_{Q\bar N N} + W_{V^{\mu a}_5 S^b_5}\,\Gamma_{\phi^b_5\bar N N}\Big]u \tag{97}$$

Note that this expansion relies on the specific definition (8) of $\Gamma$ as a partial Legendre transform.
We also need the following relation, valid for all momenta, which is derived directly from the fundamental anomalous chiral Ward identity (9) for $\Gamma$:

$$\partial_\mu\Gamma_{V^a_{\mu5}\bar N N} = -\Phi_{ab}\,\Gamma_{\phi^b_5\bar N N} \tag{98}$$

Now, taking the divergence of (97), using this Ward identity and then$^8$ taking the zero-momentum limit, noting that the propagators vanish at zero momentum since there is no massless pseudoscalar, gives

$$2m_N\,G^a_A(0)\;\bar u\gamma_5 u = i\,\bar u\,\Phi_{ab}\,\Gamma_{\phi^b_5\bar N N}\Big|_{p=0}\,u \tag{99}$$

The meson–nucleon couplings are related to the 1PI vertices by

$$\langle N|\eta^\alpha N\rangle = g_{\eta^\alpha NN}\;\bar u\gamma_5 u = i\,\bar u\,\Gamma_{\eta^\alpha\bar N N}\,u \tag{100}$$

$^8$ The $p \to 0$ limit is delicate, as is the case for the derivation of the conventional Goldberger–Treiman relation, and should be taken in this order. Literally at $p = 0$, both sides vanish since $\bar u\gamma_5 u = 0$.

Re-expressing (99) in terms of the canonically normalised 'OZI boson' field $\hat\eta^\alpha$, we therefore derive

$$2m_N\,G^a_A(0) = \hat f^{a\alpha}\,g_{\hat\eta^\alpha NN} \tag{101}$$

This relation will be useful to us when we consider the 'proton spin' problem.
All that now remains to cast this into its final form is to make the familiar change of variables from $Q, \hat\eta^\alpha$ to $G, \eta^\alpha$, where $\eta^\alpha$ are interpreted as the physical mesons. We therefore find the generalised $U(1)_A$ Goldberger–Treiman relation:

$$2m_N\,G^a_A(0) = f^{a\alpha}\,g_{\eta^\alpha NN} + \sqrt{2n_f}\,A\,g_{GNN}\,\delta_{a0} \tag{102}$$

For the individual components, this is

$$2m_N\,G^3_A = f_\pi\,g_{\pi NN} \tag{103}$$

$$2m_N\,G^8_A = f^{8\eta'}g_{\eta' NN} + f^{8\eta}g_{\eta NN} \tag{104}$$

$$2m_N\,G^0_A = f^{0\eta'}g_{\eta' NN} + f^{0\eta}g_{\eta NN} + \sqrt6\,A\,g_{GNN} \tag{105}$$

The renormalisation group properties of these relations are described in great detail in [9]. It is clear that the flavour singlet axial coupling $G^0_A$ satisfies a homogeneous RGE and scales with the anomalous dimension $\gamma$ corresponding to the multiplicative renormalisation of $J^0_{\mu5}$. In the form (101), RG consistency is simply achieved by

$$\mathcal{D}\hat f^{a\alpha} = \gamma\,\delta_{a0}\,\hat f^{a\alpha} \qquad \mathcal{D}g_{\hat\eta^\alpha NN} = 0 \tag{106}$$

All the scale dependence is in the decay constant $\hat f^{0\alpha}$ while the coupling $g_{\hat\eta^\alpha NN}$ of the 'OZI boson' to the nucleon is RG invariant (in contrast to $g_{\hat\eta^\alpha\gamma\gamma}$). In the final form (102) involving the physical decay constants, a careful analysis shows that apart from $G^0_A(0)$ the only other non RG-invariant quantity is the gluonic coupling $g_{GNN}$, which is required to satisfy the following non-homogeneous RGE to ensure the self-consistency of (105):

$$\mathcal{D}g_{GNN} = \gamma\left(g_{GNN} + \frac{1}{\sqrt{2n_f}}\,\frac{1}{A}\,f^{0\alpha}\,g_{\eta^\alpha NN}\right) \tag{107}$$

The large-$N_c$ behaviour in the flavour singlet relation is as follows: $G^0_A = O(N_c)$, $f^{0\eta'}, f^{0\eta} = O(\sqrt{N_c})$, $A = O(1)$, $g_{\eta NN}, g_{\eta' NN} = O(\sqrt{N_c})$, $g_{GNN} = O(1)$. So the final term $A\,g_{GNN}$ is $O(1)$, suppressed by a power of $1/N_c$ compared to all the others, which are $O(N_c)$.
We see that, like $g_{G\gamma\gamma}$, the gluonic coupling $g_{GNN}$ is suppressed at large $N_c$ relative to the corresponding meson couplings. However, unlike $g_{G\gamma\gamma}$, which is RG invariant, $g_{GNN}$ has a complicated RG non-invariance and depends on the anomaly-induced anomalous dimension $\gamma$. The conjecture in Sect. 4.3 then suggests that while the OZI or large-$N_c$ approximation should be a good guide to the value of $g_{G\gamma\gamma}$, we may expect significant OZI violations for $g_{GNN}$. We would therefore not be surprised to find that $g_{GNN}$ makes a sizeable numerical contribution to the $U(1)_A$ Goldberger–Treiman relation.
We now try to test these expectations against the experimental data. We first introduce a notation that has become standard in the literature on deep-inelastic scattering. There, the axial couplings are written as

$$G^3_A = \frac{1}{2}\,a^3 \qquad G^8_A = \frac{1}{2\sqrt3}\,a^8 \qquad G^0_A = \frac{1}{\sqrt6}\,a^0 \tag{108}$$

where the $a^a$ have a simple interpretation in terms of parton distribution functions.
Experimentally,

$$a^3 = 1.267 \pm 0.004 \qquad a^8 = 0.585 \pm 0.025 \tag{109}$$

from low-energy data on nucleon and hyperon beta decay. The latest result$^9$ for $a^0$ quoted by the COMPASS collaboration [53] from deep inelastic scattering data is

$$a^0\big|_{Q^2\to\infty} = 0.33 \pm 0.06 \tag{110}$$

with a similar result from HERMES [54].
The OZI expectation is that $a^0 = a^8$. In the context of DIS, this is a prediction of the simple quark model, where it is known as the Ellis–Jaffe sum rule [57]. We return to this in the context of the 'proton spin' problem in Sect. 5 but for now we concentrate on the low-energy phenomenology of the pseudoscalar meson–nucleon couplings.
$^9$ This supersedes the result $a^0|_{Q^2=4\,{\rm GeV}^2} = 0.237^{+0.024}_{-0.029}$ quoted by COMPASS in 2005 [55, 56], which we used as input into our analysis of the phenomenology of the $U(1)_A$ GT relation in [3]. The fits presented here are updated from those of [3] to take account of this. For a further discussion of the experimental situation, see Sect. 5.

The original Goldberger–Treiman relation (103) gives the following value for the pion–nucleon coupling:

$$g_{\pi NN} = 12.86 \pm 0.06 \tag{111}$$

consistent to within about 5% with the experimental value $13.65\,(13.80) \pm 0.12$ (depending on the data set used [58]). In an ideal world where $g_{\eta NN}$ and $g_{\eta' NN}$ were both known, we would now verify the octet formula (104) then determine the gluonic coupling $g_{GNN}$ from the singlet Goldberger–Treiman relation (105). However, the experimental situation with the $\eta$ and $\eta'$-nucleon couplings is far less clear. One would hope to determine these couplings from the near threshold production of the $\eta$ and $\eta'$ in nucleon–nucleon collisions, i.e. $pp \to pp\eta$ and $pp \to pp\eta'$, measured for example at COSY-II [59, 60, 61]. However, the $\eta$ production is dominated by the $N(1535)S_{11}$ nucleon resonance which decays to $N\eta$, and as a result very little is known about $g_{\eta NN}$ itself. The detailed production mechanism of the $\eta'$ is not well understood. However, since there is no known baryonic resonance decaying into $N\eta'$, we may simply assume that the reaction $pp \to pp\eta'$ is driven by the direct coupling supplemented by heavy-meson exchange. This allows an upper bound to be placed on $g_{\eta' NN}$ and on this basis [62] quotes $g_{\eta' NN} < 2.5$. This is supported by an analysis [63] of very recent data from CLAS [64] on the photoproduction reaction $\gamma p \to p\eta'$. Describing the cross section data with a model comprising the direct coupling together with $t$-channel meson exchange and $s$ and $u$-channel resonances, it is found that equally good fits can be obtained for several values of $g_{\eta' NN}$ covering the whole region $0 < g_{\eta' NN} < 2.5$.
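The pion–nucleon value (111) follows from (103) together with the normalisation $G^3_A = a^3/2$ of (108). A minimal sketch with central values is given below; the nucleon mass is not quoted in this section, so the proton mass used here is an assumption, as are the variable names.

```python
M_N = 0.93827     # GeV, proton mass (assumed input; not quoted in the text)
F_PI = 0.09242    # GeV, from (86)
A3 = 1.267        # from (109)
G_PI_NN_EXP = 13.65   # one of the experimental values quoted in the text


def g_pi_nn_goldberger_treiman():
    """Solve (103), 2 m_N G^3_A = f_pi g_piNN, with G^3_A = a^3 / 2 from (108)."""
    g3_a = A3 / 2.0
    return 2.0 * M_N * g3_a / F_PI


g_pi_nn = g_pi_nn_goldberger_treiman()
discrepancy = (G_PI_NN_EXP - g_pi_nn) / G_PI_NN_EXP
print(f"g_piNN = {g_pi_nn:.2f}, {100 * discrepancy:.1f}% below the experimental value")
```

The result reproduces the central value in (111) and shows a discrepancy of a few per cent from the experimental coupling.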
In view of this experimental uncertainty, we shall use the octet and singlet Goldberger–Treiman relations to plot the predictions for $g_{\eta NN}$ and $g_{GNN}$ as a function of the ill-determined $\eta'$-nucleon coupling in the experimentally allowed range $0 < g_{\eta' NN} < 2.5$. The results (again taking the value (87) for $A$) are given in Fig. 6. In Fig. 7 we have shown the relative magnitudes of the various contributions to the flavour-singlet formula.
What we learn from this is that for values of $g_{\eta' NN}$ approaching the upper end of the experimentally allowed range, the contribution of the OZI-suppressed gluonic coupling $g_{GNN}$ is quite large. The variation of $f^{0\eta'}g_{\eta' NN}$ over the allowed range is compensated almost entirely by the variation of $\sqrt6\,A\,g_{GNN}$, with the $f^{0\eta}g_{\eta NN}$ contribution remaining relatively constant.
For example, if experimentally we found $g_{\eta' NN} \simeq 2.5$, which corresponds to the cross sections for $pp \to pp\eta'$ and $\gamma p \to p\eta'$ being almost entirely determined by the direct coupling, then we would have $g_{\eta NN} \simeq 4.14$ and $g_{GNN} \simeq -31.2~{\rm GeV}^{-3}$. In terms of the contributions to the $U(1)_A$ Goldberger–Treiman relation, this would give (in GeV)

$$2m_N\,G^0_A\;[N_c;\,0.25] = f^{0\eta'}g_{\eta' NN}\;[N_c;\,0.26] + f^{0\eta}g_{\eta NN}\;[N_c;\,0.09] + \sqrt6\,A\,g_{GNN}\;[O(1);\,-0.10] \tag{112}$$

[Fig. 6 plots: panel (a) shows $g_{\eta NN}$ (vertical axis, 3.2 to 4.4) and panel (b) shows $g_{GNN}$ in ${\rm GeV}^{-3}$ (vertical axis, $-40$ to 60), each against $g_{\eta' NN}$ from 0.5 to 2.5.]

Fig. 6. These figures show the dimensionless $\eta$-nucleon coupling $g_{\eta NN}$ and the gluonic coupling $g_{GNN}$ in units of ${\rm GeV}^{-3}$ expressed as functions of the experimentally uncertain $\eta'$-nucleon coupling $g_{\eta' NN}$, as determined from the flavour octet and singlet Goldberger–Treiman relations (104) and (105)

[Fig. 7 plot: contributions in GeV (vertical axis, $-0.1$ to 0.25) against $g_{\eta' NN}$ from 0.5 to 2.5.]

Fig. 7. This shows the relative sizes of the contributions to the $U(1)_A$ Goldberger–Treiman relation from the individual terms in (105), expressed as functions of the coupling $g_{\eta' NN}$. The dotted (black) line denotes $2m_N G^0_A$. The long-dashed (green) line is $f^{0\eta'}g_{\eta' NN}$ and the short-dashed (blue) line is $f^{0\eta}g_{\eta NN}$. The solid (red) line shows the contribution of the novel gluonic coupling, $\sqrt6\,A\,g_{GNN}$, where $A$ determines the QCD topological susceptibility

The anomalously small value of $G^0_A$ compared to its OZI value (the OZI approximation is $2m_N G^0_A|_{\rm OZI} = 2\sqrt2\,m_N G^8_A = 0.45$) is then due to the partial cancellation of the sum of the meson–nucleon coupling terms by the gluonic coupling $g_{GNN}$. Although formally $O(1/N_c)$ suppressed, numerically it gives a major contribution to the large OZI violation in $G^0_A$. This would give some support to our conjecture and provide further evidence that we are able to predict the location of large OZI violations using the renormalisation group as a guide.
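The numbers in this example can be reproduced by solving the octet relation (104) for $g_{\eta NN}$ and then the singlet relation (105) for $g_{GNN}$, for any trial value of $g_{\eta' NN}$. The sketch below uses central values only; the proton mass is an assumed input (it is not quoted in this section) and the function name is illustrative.

```python
import math

M_N = 0.93827                        # GeV, proton mass (assumed input)
A = 0.191**4                         # GeV^4, central value of (87) with A ~ -chi(0)|_YM
A0, A8 = 0.33, 0.585                 # axial charges from (110) and (109)
F0_ETAP, F0_ETA = 0.1042, 0.0228     # GeV, from (89)
F8_ETAP, F8_ETA = -0.0361, 0.0984    # GeV, from (89)

G0_A = A0 / math.sqrt(6.0)           # normalisations of (108)
G8_A = A8 / (2.0 * math.sqrt(3.0))


def gt_couplings(g_etap_nn):
    """Return (g_etaNN, g_GNN in GeV^-3) from (104) and (105) for a given g_eta'NN."""
    g_eta_nn = (2.0 * M_N * G8_A - F8_ETAP * g_etap_nn) / F8_ETA
    g_glue_nn = (2.0 * M_N * G0_A - F0_ETAP * g_etap_nn - F0_ETA * g_eta_nn) / (
        math.sqrt(6.0) * A
    )
    return g_eta_nn, g_glue_nn


g_eta_nn, g_glue_nn = gt_couplings(2.5)
ozi_value = 2.0 * math.sqrt(2.0) * M_N * G8_A   # 2 m_N G^0_A in the OZI limit a0 = a8
print(f"g_etaNN = {g_eta_nn:.2f}, g_GNN = {g_glue_nn:.1f} GeV^-3")
print(f"2 m_N G0_A = {2.0 * M_N * G0_A:.2f} GeV, OZI value = {ozi_value:.2f} GeV")
```

Scanning `g_etap_nn` over the allowed range 0 to 2.5 with this function gives the kind of dependence shown in Figs. 6 and 7.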
Of course, it may be that experimentally we eventually find a value for $g_{\eta' NN} \simeq 1.5$, in the region where $g_{GNN}$ contributes only around 10% or less. Although surprising, this would open the possibility that all gluonic couplings of type $g_{GXX}$ are close to zero, which could be interpreted as implying that the gluonic component of the $\eta'$ wave function is simply small. Clearly, a reliable determination of $g_{\eta' NN}$, or equivalently $g_{\eta NN}$, would shed considerable light on the $U(1)_A$ dynamics of QCD.

5 Topological Charge Screening and the ‘Proton Spin’

So far, we have focused on the implications of the U (1)A anomaly for low-
energy QCD phenomenology. However, the anomaly also plays a vital role
in the interpretation of high-energy processes, in particular polarised deep-
inelastic scattering.
In this section, we discuss one of the most intensively studied topics in
QCD of the past two decades – the famous, but misleadingly named, ‘proton
spin’ problem. We review the interpretation initially proposed by Veneziano
[4] and developed by us in a series of papers exploring the relation with the
$U(1)_A$ GT relation and gluon topology [8, 9, 65]. In a subsequent work with Narison, we were able to quantify our prediction by using QCD spectral sum rules to compute the slope $\chi'(0)$ of the topological susceptibility [10, 52]. Remarkably, the most recent experimental data from the COMPASS [53] and HERMES [54] collaborations, released in September 2006, now confirms our original 1994 numerical prediction [10].

5.1 The g1p and Angular Momentum Sum Rules

The ‘proton spin’ problem concerns the sum rule for the ﬁrst moment of the
polarised proton structure function g1p . This is measured in polarised DIS ex-
periments through the inclusive processes μp → μX (EMC, SMC, COMPASS
at CERN) or ep → eX (SLAC, HERMES at DESY) together with similar
experiments on a deuteron target. The polarisation asymmetry of the cross-
section is expressed as

dΔσ YP 16π 2 α2 p
M 2 x2
x = g1 (x, Q2 ) + O (113)
dxdy 2 s Q2

with conventional notation: Q2 = −q 2 and x = Q2 /2p2 .q are the Bjorken

variables, where p2 , q are the momenta of the target proton and incident
virtual photon, respectively, y = Q2 /xs and Yp = (2 − y)/y.
According to standard theory, g1p is determined by the proton matrix el-
ement of two electromagnetic currents carrying a large spacelike momentum.
The sum rule for the first moment of g1p is derived from the twist 2, spin 1
terms in the operator product expansion for the currents:
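$$J^\lambda(q)\,J^\rho(-q) \underset{Q^2\to\infty}{\sim} 2\,\epsilon^{\lambda\rho\nu\mu}\,\frac{q_\nu}{Q^2}\left[\Delta C_1^{NS}(\alpha_s)\left(J^3_{\mu5} + \frac{1}{\sqrt3}\,J^8_{\mu5}\right) + \frac{2\sqrt2}{\sqrt3}\,\Delta C_1^{S}(\alpha_s)\,J^0_{\mu5}\right] \tag{114}$$
where $\Delta C_1^{NS}$ and $\Delta C_1^{S}$ are Wilson coefficients and $J^a_{\mu5}$ ($a = 3, 8, 0$) are the
renormalised axial currents, with the normalisations defined in Sect. 2. It is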
the occurrence of the axial currents in this OPE that provides the link between
the U (1)A anomaly and polarised DIS. The sum rule is therefore:
$$\Gamma_1^p(Q^2) \equiv \int_0^1 dx\, g_1^p(x,Q^2) = \frac{1}{12}\,\Delta C_1^{NS}\left(a^3 + \frac13\,a^8\right) + \frac19\,\Delta C_1^{S}\,a^0(Q^2) \tag{115}$$

where the axial charges a3 , a8 and a0 (Q2 ) are defined in terms of the forward
proton matrix elements as in (108). Here, we have explicitly shown the Q2
scale dependence associated with the RG non-invariance of a0 (Q2 ).
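As a rough numerical illustration of how (115) links a measured first moment to the singlet axial charge, the following sketch evaluates the sum rule and inverts it for a0. It is not the analysis used by the experiments: the Wilson coefficients are simply set to their leading-order value of 1, the value a3 = 1.27 is an illustrative input that is not quoted in this text, and the function names are hypothetical. The value a8 = 0.585 is the one quoted later in this section.

```python
def first_moment_g1p(a3, a8, a0, dc_ns=1.0, dc_s=1.0):
    """Evaluate the sum rule (115) for given axial charges.

    dc_ns and dc_s stand for the Wilson coefficients; the default 1.0 is
    the leading-order value and is an assumption of this sketch.
    """
    return dc_ns * (a3 + a8 / 3.0) / 12.0 + dc_s * a0 / 9.0


def singlet_charge_from_moment(gamma1, a3, a8, dc_ns=1.0, dc_s=1.0):
    """Invert (115): solve for a0 given a measured first moment gamma1."""
    non_singlet = dc_ns * (a3 + a8 / 3.0) / 12.0
    return 9.0 * (gamma1 - non_singlet) / dc_s


A3_ILLUSTRATIVE = 1.27   # assumed input, not quoted in the text
A8_FROM_TEXT = 0.585     # value of a8 quoted in this section

# Quark-model expectation a0 = a8 versus a suppressed singlet charge.
ellis_jaffe = first_moment_g1p(A3_ILLUSTRATIVE, A8_FROM_TEXT, a0=A8_FROM_TEXT)
suppressed = first_moment_g1p(A3_ILLUSTRATIVE, A8_FROM_TEXT, a0=0.33)
```

Because the non-singlet term is fixed by low-energy data, any shift in the measured moment maps linearly onto a0 with slope 9 in this leading-order approximation.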
Since the flavour non-singlet axial charges are known from low-energy data,
a measurement of the first moment of g1p amounts to a determination of the
flavour singlet a0 (Q2 ). At the time of the original EMC experiment in 1988
[66] the theoretical expectation based on the quark model was that a0 =
a8 . The resulting sum rule for g1p is known as the Ellis–Jaffe sum rule [57].
The great surprise of the EMC measurement was the discovery that in fact
a0 is significantly suppressed relative to a8 , and indeed the earliest results
suggested it could even be zero. However, the reason the result sent shockwaves
through both the theoretical and experimental communities (to date, the EMC
paper has over 1300 citations) was the interpretation that this implies that
the quarks contribute only a fraction of the total spin of the proton.
In fact, this interpretation relies on the simple valence quark model of the
proton and is not true in QCD, where the axial charge decouples from the
real angular momentum sum rule for the proton. Rather, as we shall show,
the suppression of a0 (Q2 ) reflects the dynamics of gluon topology and appears
to be largely independent of the structure of the proton itself. Precisely, it is
a manifestation of topological charge screening in the QCD vacuum.
The angular momentum sum rule is derived by taking the forward matrix
element of the conserved angular momentum current M μνλ , defined in terms
of the energy-momentum tensor as

$$M^{\mu\nu\lambda} = x^{[\nu}T^{\lambda]\mu} + \partial_\rho X^{\rho\mu\nu\lambda} \tag{116}$$

The inclusion of the arbitrary tensor X ρμνλ just reflects the usual freedom in
QFT of defining conserved currents. This gives us some flexibility in attempt-
ing to write M μνλ as a sum of local operators, suggesting interpretations of the
total angular momentum as a sum of ‘components’ of the proton spin. In fact,
however, it is not possible to write M μνλ as a sum of operators corresponding
to quark and gluon spin and angular momentum in a gauge-invariant way.
The best decomposition is [67, 68, 69]
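$$M^{\mu\nu\lambda} = O_1^{\mu\nu\lambda} + O_2^{\mu[\lambda}x^{\nu]} + O_3^{\mu[\lambda}x^{\nu]} + \ldots \tag{117}$$

where the dots denote terms whose forward matrix elements vanish. Here,
$$\begin{aligned}
O_1^{\mu\nu\lambda} &= \frac12\,\epsilon^{\mu\nu\lambda\sigma}\,\bar q\gamma_\sigma\gamma_5 q = \frac12\,\epsilon^{\mu\nu\lambda\sigma}\sqrt{2n_f}\,J^0_{\sigma5} \\
O_2^{\mu\lambda} &= i\,\bar q\gamma^\mu \overleftrightarrow{D}^\lambda q \\
O_3^{\mu\lambda} &= F^{\mu\rho}F_\rho{}^{\lambda}
\end{aligned} \tag{118}$$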

At first sight, $O_1^{\mu\nu\lambda}$ looks as if it could be associated with ‘quark spin’, since
for free Dirac fermions the spin operator coincides with the axial vector current.
$O_2^{\mu[\lambda}x^{\nu]}$ would correspond to ‘quark orbital angular momentum’, leaving
$O_3^{\mu[\lambda}x^{\nu]}$ as ‘gluon total angular momentum’. Any further decomposition of the
gluon angular momentum is necessarily not gauge invariant.
The forward matrix elements of these operators may be expressed in terms
of form factors and, as we showed in [68], this exhibits an illuminating
cancellation. After some analysis, we find

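$$\begin{aligned}
\langle p,s|O_1^{\mu\nu\lambda}|p,s\rangle &= a^0\, m_N\,\epsilon^{\mu\nu\lambda\sigma}s_\sigma \\
\langle p,s|O_2^{\mu[\lambda}x^{\nu]}|p,s\rangle &= J_q\,\frac{1}{2m_N}\,p_\rho\,p^{\{\mu}\epsilon^{[\lambda\}\nu]\rho\sigma}s_\sigma - a^0\, m_N\,\epsilon^{\mu\nu\lambda\sigma}s_\sigma \\
\langle p,s|O_3^{\mu[\lambda}x^{\nu]}|p,s\rangle &= J_g\,\frac{1}{2m_N}\,p_\rho\,p^{\{\mu}\epsilon^{[\lambda\}\nu]\rho\sigma}s_\sigma
\end{aligned} \tag{119}$$
The angular momentum sum rule for the proton is then just
$$\frac12 = J_q + J_g \tag{120}$$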
where the Lorentz and gauge-invariant form factors Jq and Jg may reason-
ably be thought of as representing quark and gluon total angular momentum.
However, even this interpretation is not at all rigorous, not least because Jq
and Jg mix under renormalisation and scale as

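$$\frac{d}{d\ln Q^2}\begin{pmatrix} J_q \\ J_g \end{pmatrix} = \frac{\alpha_s}{4\pi}\begin{pmatrix} -\frac83 C_F & \frac23 n_f \\ \frac83 C_F & -\frac23 n_f \end{pmatrix}\begin{pmatrix} J_q \\ J_g \end{pmatrix} \tag{121}$$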

Only the total angular momentum is Lorentz, gauge and scale invariant.10
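The statement that only the total is scale invariant can be checked directly from the structure of (121): each column of the mixing matrix sums to zero, so whatever leaves $J_q$ enters $J_g$. The sketch below integrates the equation with a simple Euler scheme. Holding $\alpha_s$ fixed is an assumption made purely to keep the example short (in QCD the coupling runs), and the starting values and function name are illustrative rather than taken from the text.

```python
import math


def evolve_jq_jg(jq, jg, alpha_s, n_f, c_f=4.0 / 3.0, t_span=200.0, steps=20000):
    """Euler-integrate (121) over an interval t_span in ln Q^2.

    alpha_s is held fixed here, which is an assumption made purely to
    keep the sketch short; in QCD the coupling runs with Q^2.
    """
    dt = t_span / steps
    prefactor = alpha_s / (4.0 * math.pi)
    for _ in range(steps):
        # Net transfer from quarks to gluons; it enters J_q and J_g with
        # opposite signs, so J_q + J_g cannot change.
        transfer = prefactor * ((8.0 / 3.0) * c_f * jq - (2.0 / 3.0) * n_f * jg)
        jq -= transfer * dt
        jg += transfer * dt
    return jq, jg


# Illustrative starting point: all angular momentum assigned to quarks.
jq_final, jg_final = evolve_jq_jg(jq=0.5, jg=0.0, alpha_s=0.3, n_f=3)
```

In this fixed-coupling toy evolution the sum stays at 1/2 while the ratio $J_q/J_g$ relaxes towards $n_f/(4C_F)$, the point at which the transfer term vanishes.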
The crucial observation, however, is that the axial charge a0 explicitly
cancels from the angular momentum sum rule. a0 is an important form factor,
which relates the ﬁrst moment of g1p to gluon topology via the U (1)A anomaly,
but it is not part of the angular momentum sum rule for the proton.
Just as a0 can be measured in polarised inclusive DIS, the form factors Jq
and Jg can be extracted from measurements of unpolarised generalised parton
distributions (GPDs) in processes such as deeply virtual Compton scattering
γ ∗ p → γp. These can also in principle be calculated in lattice QCD. The
required identiﬁcations with GPDs are given in [68].

5.2 QCD Parton Model

Before describing our resolution of the ‘proton spin’ problem, we brieﬂy review
the parton model interpretation of the ﬁrst moment sum rule for g1p .
In the simplest form of the parton model, the proton structure at large
Q2 is described by parton distributions corresponding to free valence quarks
only. The polarised structure function is given by

$$g_1^p(x) = \frac12\sum_{i=1}^{n_f} e_i^2\,\Delta q_i(x) \tag{122}$$

where $\Delta q_i(x)$ is defined as the difference of the distributions of quarks (and
antiquarks) with helicities parallel and antiparallel to the nucleon spin. It is
convenient to work with the conventionally defined flavour non-singlet and
singlet combinations $\Delta q^{NS}$ and $\Delta q^{S}$ (often also written as ΔΣ).

10 For a careful discussion of the parton interpretation of longitudinal and transverse
angular momentum sum rules, see [70]. This confirms our assertion that the axial
charge a0 is not to be identified with quark helicities in the parton model.

In this model, the first moment of the singlet quark distribution
$\Delta q^S = \int_0^1 dx\,\Delta q^S(x)$ can be identified as the sum of the helicities of the quarks.
Interpreting the structure function data in this model then leads to the con-
clusion that the quarks carry only a small fraction of the spin of the proton.
There is indeed a real contradiction between the experimental data and the
free valence quark parton model.
However, this simple model leaves out many important features of QCD,
the most important being gluons, RG scale dependence and the chiral UA (1)
anomaly. When these effects are included, in the QCD parton model, the naive
identification of Δq S with spin no longer holds and the experimental results
for g1p can be accommodated, though not predicted.
In the QCD parton model, the polarised structure function is written in
terms of both quark and gluon distributions as follows:

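$$g_1^p(x,Q^2) = \int_x^1\frac{du}{u}\,\frac19\left[\Delta C^{NS}\!\left(\frac{x}{u}\right)\Delta q^{NS}(u,t) + \Delta C^{S}\!\left(\frac{x}{u}\right)\Delta q^{S}(u,t) + \Delta C^{g}\!\left(\frac{x}{u}\right)\Delta g(u,t)\right] \tag{123}$$

where $\Delta C^S$, $\Delta C^g$ and $\Delta C^{NS}$ are perturbatively calculable functions related
to the Wilson coefficients and the quark and gluon distributions have a priori
a $t = \ln Q^2/\Lambda^2$ dependence determined by the RG evolution, or DGLAP,
equations. The first moment sum rule is therefore
$$\Gamma_1^p(Q^2) = \frac19\left[\Delta C_1^{NS}\,\Delta q^{NS} + \Delta C_1^{S}\,\Delta q^{S} + \Delta C_1^{g}\,\Delta g\right] \tag{124}$$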
Comparing with (115), we see that the axial charge a0 (Q2 ) is identified with
a linear combination of the first moments of the singlet quark and gluon
distributions. It is often, though not always, the case that the moments of
parton distributions can be identified in one-to-one correspondence with the
matrix elements of local operators. The polarised first moments are special in
that two parton distributions correspond to the same local operator.
The RG evolution equations for the first moments of the parton distri-
butions are derived from the matrix of anomalous dimensions for the lowest
spin, twist 2 operators. This introduces an inevitable renormalisation scheme
ambiguity in the definitions of Δq and Δg, and their physical interpretation
is correspondingly nuanced. The choice closest to our own analysis is the ‘AB’
scheme [71] where the parton distributions have the following RG evolution:
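$$\frac{d}{d\ln Q^2}\,\Delta q^{NS} = 0 \qquad\qquad \frac{d}{d\ln Q^2}\,\Delta q^{S} = 0$$
$$\frac{d}{d\ln Q^2}\,\frac{\alpha_s}{2\pi}\Delta g(Q^2) = \gamma\left(\frac{\alpha_s}{2\pi}\Delta g(Q^2) - \frac13\,\Delta q^{S}\right) \tag{125}$$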
which requires $\Delta C_1^g = -\frac{3\alpha_s}{2\pi}\,\Delta C_1^S$. It is then possible to make the following
identifications with the axial charges:

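$$\begin{aligned}
a^3 &= \Delta u - \Delta d \\
a^8 &= \Delta u + \Delta d - 2\Delta s \\
a^0(Q^2) &= \Delta u + \Delta d + \Delta s - \frac{3\alpha_s}{2\pi}\,\Delta g(Q^2)
\end{aligned} \tag{126}$$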
with Δq S = Δu + Δd + Δs. Notice that in the AB scheme, all the scale
dependence of the axial charge a0 (Q2 ) is assigned to the gluon distribution
Δg(Q2 ).
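The consistency of (125) and (126) can be seen in a few lines: writing $G = \alpha_s\Delta g/2\pi$, the combination $a^0 = \Delta q^S - 3G$ obeys $da^0/d\ln Q^2 = \gamma\,a^0$ whenever $G$ obeys (125) and $\Delta q^S$ is constant. The sketch below checks this for a single small Euler step. All numerical inputs, including the value used for γ, are illustrative assumptions and the function names are hypothetical.

```python
def a0_ab_scheme(delta_q_s, g):
    """Singlet axial charge from (126); g stands for alpha_s*Delta g/(2 pi)."""
    return delta_q_s - 3.0 * g


def ab_scheme_step(delta_q_s, g, gamma, dt):
    """One Euler step of (125) in ln Q^2: Delta q^S is fixed, g evolves."""
    g_next = g + gamma * (g - delta_q_s / 3.0) * dt
    return delta_q_s, g_next


# Illustrative inputs only; none of these numbers comes from the text.
dq, g0, gamma, dt = 0.6, 0.09, -0.05, 1.0e-3
_, g1 = ab_scheme_step(dq, g0, gamma, dt)

# For an Euler step the ratio should equal 1 + gamma*dt, i.e. a0 scales with gamma.
growth = a0_ab_scheme(dq, g1) / a0_ab_scheme(dq, g0)
```

The check makes explicit that, in this scheme, the whole anomalous scaling of $a^0(Q^2)$ is carried by the gluon term.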
This was the identification originally introduced for the first moments by
Altarelli and Ross [72], and resolves the ‘proton spin’ problem in the context
of the QCD parton model. In this scheme, the Ellis–Jaffe sum rule follows from
the assumption that in the proton both Δs and Δg(Q2 ) are zero, which is
the natural assumption in the free valence quark model. This is equivalent to
the OZI approximation a0 (Q2 ) = a8 . However, in the full QCD parton model,
there is no reason why Δg(Q2 ), or even Δs, should be zero in the proton.
Indeed, given the different scale dependence of a0 (Q2 ) and a8 , it would be
unnatural to expect this to hold in QCD itself.
An interesting conjecture [72] is that the observed suppression in a0 (Q2 )
is due overwhelmingly to the gluon distribution Δg(Q2 ) alone. Although by
no means a necessary consequence of QCD, this is a reasonable expectation
given that it is the anomaly (which is due to the gluons and is responsible for
OZI violations) which is responsible for the scale dependence in a0 (Q2 ) and
Δg(Q2 ) whereas the Δq are scale invariant. This would be in the same spirit
as our conjecture on OZI violations in low-energy phenomenology in Sect. 4.3.
To test this, however, we need to find a way to measure Δg(Q2 ) itself, rather
than the combination a0 (Q2 ). The most direct option is to extract Δg(x, Q2 )
from processes such as open charm production, γ ∗ g → cc̄, which is currently
being intensively studied by the COMPASS [73], STAR [74] and PHENIX [75]
collaborations.

5.3 Topological Charge Screening

We now describe a less conventional approach to deep inelastic scattering

based entirely on ﬁeld-theoretic concepts. In particular, the role of parton dis-
tributions is taken over by the 1PI vertices of composite operators introduced
above (for a review, see [76]).
Once again, the starting point is the use of the OPE to express the mo-
ments of a generic structure function F (x, Q2 ) as
$$\int_0^1 dx\, x^{n-1}F(x,Q^2) = \sum_A C_A^n(Q^2)\,\langle p|O_A^n(0)|p\rangle \tag{127}$$

where $O_A^n$ denotes the set of lowest twist, spin $n$ operators and $C_A^n(Q^2)$ are
the corresponding Wilson coefficients. The next step is to introduce a new
set of composite operators $\tilde O_B$, chosen to encompass the physically relevant
degrees of freedom, and write the matrix element as a product of two-point
Green functions and 1PI vertices as follows:
$$\int_0^1 dx\, x^{n-1}F(x,Q^2) = \sum_A\sum_B C_A^n(Q^2)\,\langle 0|T\,O_A^n\,\tilde O_B|0\rangle\,\Gamma_{\tilde O_B pp} \tag{128}$$

This decomposition splits the structure function into three parts – first, the
Wilson coefficients $C_A^n(Q^2)$ which can be calculated in perturbative QCD;
second, non-perturbative but target independent Green functions that encode
the dynamics of the QCD vacuum; third, non-perturbative vertex functions
that characterise the target by its couplings to the chosen operators $\tilde O_B$.11
Now specialise to the first moment sum rule for g1p . For simplicity, we first
present the analysis for the chiral limit, where there is no flavour mixing.
Using the anomaly (4), we can express the flavour singlet contribution to the
sum rule as
$$\Gamma_1^p(Q^2)_{\rm singlet} \equiv \int_0^1 dx\, g_1^p(x,Q^2)_{\rm singlet} = \frac23\,\frac{1}{2m_N}\,\Delta C_1^S(\alpha_s)\,\langle p|Q|p\rangle \tag{129}$$

In this case, the obvious choice for the operators $\tilde O_B$ is the set of flavour singlet
pseudoscalars, and it is natural to choose the ‘OZI boson’ field
$\hat\eta^0 = \hat f^{00}\,\frac{1}{\langle\bar q q\rangle}\,\phi_5^0$,
which is normalised so that $\frac{d}{dp^2}\Gamma_{\hat\eta^0\hat\eta^0}\big|_{p=0} = 1$. As we have seen in (106),
the corresponding 1PI vertex is then RG invariant. Writing the 1PI vertices
in terms of nucleon couplings as in (100), we find (see Fig. 8)
$$\Gamma_1^p(Q^2)_{\rm singlet} = \frac23\,\frac{1}{2m_N}\,\Delta C_1^S(\alpha_s)\Big[\langle 0|T\,Q\,Q|0\rangle\,g_{QNN} + \langle 0|T\,Q\,\hat\eta^0|0\rangle\,g_{\hat\eta^0 NN}\Big] \tag{130}$$

Recalling that the matrix of two-point Green functions is given by the
inversion formula
$$\begin{pmatrix} W_{\theta\theta} & W_{\theta S_{\hat\eta^0}} \\ W_{S_{\hat\eta^0}\theta} & W_{S_{\hat\eta^0}S_{\hat\eta^0}} \end{pmatrix} = -\begin{pmatrix} \Gamma_{QQ} & \Gamma_{Q\hat\eta^0} \\ \Gamma_{\hat\eta^0 Q} & \Gamma_{\hat\eta^0\hat\eta^0} \end{pmatrix}^{-1} \tag{131}$$
and using the normalisation condition for $\hat\eta^0$, we can easily show that at zero
momentum,
$$W_{\theta S_{\hat\eta^0}}^{\,2} = \frac{d}{dp^2}\,W_{\theta\theta}\Big|_{p=0} \tag{132}$$

Fig. 8. Illustration of the decomposition of the matrix element ⟨p|Q|p⟩ into two-
point Green functions and 1PI vertices. The Green function in the first diagram is
χ(0); in the second it is √χ′(0)

11 We emphasise again that this decomposition of the matrix elements into products
of Green functions and 1PI vertices is exact, independent of the choice of the set
of operators $\tilde O_B$. In particular, it is not necessary for $\tilde O_B$ to be in any sense a
complete set. If a different choice is made, the vertices $\Gamma_{\tilde O_B pp}$ themselves change,
becoming 1PI with respect to a different set of composite fields. In practice, the set
of operators $\tilde O_B$ should be as small as possible while still capturing the essential
degrees of freedom. A good choice can also result in vertices $\Gamma_{\tilde O_B pp}$ which are
both RG invariant and closely related to low-energy physical couplings.


Finally, therefore, we can represent the first moment of g1p in the following,
physically intuitive form:
$$\Gamma_1^p(Q^2)_{\rm singlet} = \frac23\,\frac{1}{2m_N}\,\Delta C_1^S(\alpha_s)\Big[\chi(0)\,g_{QNN} + \sqrt{\chi'(0)}\;g_{\hat\eta^0 NN}\Big] \tag{133}$$
This shows that the first moment is determined by the gluon topological
susceptibility in the QCD vacuum as well as the couplings of the proton to
the pseudoscalar operators $Q$ and $\hat\eta^0$. In the chiral limit, χ(0) = 0 so the first
term vanishes. The entire flavour singlet contribution is therefore simply
$$\Gamma_1^p(Q^2)_{\rm singlet} = \frac23\,\frac{1}{2m_N}\,\Delta C_1^S(\alpha_s)\,\sqrt{\chi'(0)}\;g_{\hat\eta^0 NN} \tag{134}$$
The 1PI vertex $g_{\hat\eta^0 NN}$ is RG invariant, and we see from (25) that in the chiral
limit the slope of the topological susceptibility scales with the anomalous
dimension γ, viz.
$$\frac{d}{d\ln Q^2}\sqrt{\chi'(0)} = \gamma\,\sqrt{\chi'(0)} \tag{135}$$
ensuring consistency with the RGE for the flavour singlet axial charge.
The formulae (133) and (134) are our key result. They show how the first
moment of $g_1^p$ can be factorised into couplings $g_{QNN}$ and $g_{\hat\eta^0 NN}$, which carry
information on the proton structure, and Green functions that characterise
the QCD vacuum. In the case of $g_1^p$, the Green functions reduce simply to
the topological susceptibility χ(0) and its slope χ′(0). We now argue that
the experimentally observed suppression in the first moment of $g_1^p$ is due not
to a suppression in the couplings, but to the vanishing of the topological
susceptibility χ(0) and an anomalously small value for its slope χ′(0). This is
what we refer to as topological charge screening in the QCD vacuum.
The justification follows our now familiar conjecture on the relation between
OZI violations and RG scale dependence. We expect the source of OZI
violations to be in those quantities which are sensitive to the anomaly, as
identified by their scaling dependence on the anomalous dimension γ, in this
case χ′(0). In contrast, it should be a good approximation to use the OZI
value for the RG-invariant vertex $g_{\hat\eta^0 NN}$, that is $g_{\hat\eta^0 NN} \simeq \sqrt2\,g_{\hat\eta^8 NN}$. The
corresponding OZI value for $\sqrt{\chi'(0)}$ would be $f_\pi/\sqrt6$. This gives our key formula
for the flavour singlet axial charge:
$$\frac{a^0(Q^2)}{a^8} \simeq \frac{\sqrt6}{f_\pi}\,\sqrt{\chi'(0)} \tag{136}$$

The corresponding prediction for the first moment of $g_1^p$ is

$$\Gamma_1^p(Q^2)_{\rm singlet} = \frac19\,\Delta C_1^S(\alpha_s)\;a^8\,\frac{\sqrt6}{f_\pi}\,\sqrt{\chi'(0)} \tag{137}$$
The final step is to compute the slope of the topological susceptibility. In
time, lattice gauge theory should provide an accurate measurement of χ′(0).
However, this is a particularly difficult correlator for lattice methods since it
requires a simulation of QCD with light dynamical fermions and algorithms
that implement topologically non-trivial configurations in a sufficiently fast
and stable way. Instead, we have estimated the value of χ′(0) using the QCD
spectral sum rule method. Full details and discussion of this computation can
be found in [10, 52]. The result is

$$\sqrt{\chi'(0)} = 26.4 \pm 4.1\ {\rm MeV} \tag{138}$$

This gives our final prediction for the flavour singlet axial charge and the
complete first moment of g1p :

$$a^0\big|_{Q^2=10\,{\rm GeV}^2} = 0.33 \pm 0.05 \tag{139}$$

$$\Gamma_1^p\big|_{Q^2=10\,{\rm GeV}^2} = 0.144 \pm 0.009 \tag{140}$$

Topological charge screening therefore gives a suppression factor of
approximately 0.56 in a0 compared to its OZI value a8 = 0.585.
In the decade since we made this prediction, the experimental measurement
has been somewhat lower than this value, in the range a0 ≃ 0.20–0.25.
This would have suggested there is also a significant OZI violation in the
nucleon coupling gη̂0 N N itself, implicating the proton structure in the anoma-
lous suppression of Γ1p . Very recently, however, the COMPASS and HERMES
collaborations have published new results on the deuteron structure function
which spectacularly confirm our picture that topological charge screening in
the QCD vacuum is the dominant suppression mechanism.
These new data are shown in Fig. 9. This is based on data collected by
COMPASS at CERN in the years 2002–2004 and has only recently been pub-
lished. The accuracy compared to earlier SMC data at small x is significantly
improved and the dip in $x g_1^d$ around $x \sim 10^{-2}$ suggested by the SMC data
is no longer present (Fig. 9). This explains the significantly higher value for
a0 found by COMPASS compared to SMC. From this data, COMPASS quote
the first moment for the proton–neutron average $g_1^N = (g_1^p + g_1^n)/2$ as [53]

$$\Gamma_1^N\big|_{Q^2=3\,{\rm GeV}^2} = 0.050 \pm 0.003({\rm stat}) \pm 0.002({\rm evol}) \pm 0.005({\rm syst}) \tag{141}$$

Extracting the flavour singlet axial charge from the analogue of (115) for $\Gamma_1^N$
then gives
$$a^0\big|_{Q^2=3\,{\rm GeV}^2} = 0.35 \pm 0.03({\rm stat}) \pm 0.05({\rm syst}) \tag{142}$$

or evolving to the $Q^2 \to \infty$ limit,

$$a^0\big|_{Q^2\to\infty} = 0.33 \pm 0.03({\rm stat}) \pm 0.05({\rm syst}) \tag{143}$$

Similar results are found by HERMES, who quote [54]

$$a^0\big|_{Q^2=5\,{\rm GeV}^2} = 0.330 \pm 0.011({\rm th}) \pm 0.025({\rm exp}) \pm 0.028({\rm evol}) \tag{144}$$

The agreement with our prediction (139) is striking.
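To put a number on this agreement, one simple option is to combine the quoted uncertainties in quadrature and express the difference between prediction and measurement in units of the combined error. The sketch below does this for (139), (142) and (144). It rests on two assumptions that are not made in the text: that the individual errors are uncorrelated, and that results quoted at different $Q^2$ may be compared directly without evolving them to a common scale. The helper names are hypothetical.

```python
import math


def combine_in_quadrature(*errors):
    """Total uncertainty, assuming the individual errors are uncorrelated."""
    return math.sqrt(sum(e * e for e in errors))


def pull(value_a, error_a, value_b, error_b):
    """Difference of two results in units of their combined uncertainty."""
    return (value_a - value_b) / math.hypot(error_a, error_b)


prediction = (0.33, 0.05)                                      # (139)
compass = (0.35, combine_in_quadrature(0.03, 0.05))            # (142)
hermes = (0.330, combine_in_quadrature(0.011, 0.025, 0.028))   # (144)

pull_compass = pull(*compass, *prediction)
pull_hermes = pull(*hermes, *prediction)
```

Under these assumptions both measurements lie well within one combined standard deviation of the prediction.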

[Figure: $x g_1^d(x)$ plotted against $x$ between $10^{-2}$ and $10^{-1}$; legend: COMPASS Q² > 1 (GeV/c)², COMPASS Q² > 0.7 (GeV/c)², SMC, fit with ΔG > 0, fit with ΔG < 0]
Fig. 9. COMPASS and SMC data for the deuteron structure function $g_1^d(x)$.
Statistical error bars are shown with the data points. The shaded band shows the
systematic error

To close this section, we briefly comment on the extension of our analysis
beyond the chiral limit. In this case, the operator $\sqrt{2n_f}\,Q$ in (129) is replaced
by the full divergence of the flavour singlet axial current, viz.
$D^0 = \sqrt{2n_f}\,Q + d_{0bc}\,m^b\phi_5^c$. Separating the matrix element $\langle p|D^0|p\rangle$ into Green
functions and 1PI vertices, we find from the zero-momentum Ward identities
that $\langle 0|T\,D^0\,Q|0\rangle = 0$ so the contribution from $g_{QNN}$ still vanishes. The
other Green function is $\langle 0|T\,D^0\,\hat\eta^\alpha|0\rangle = -\hat f^{0\alpha}$, so the first moment sum rule
becomes
$$\Gamma_1^p(Q^2)_{\rm singlet} = \frac19\,\frac{1}{2m_N}\,\Delta C_1^S(\alpha_s)\,\sqrt6\,\hat f^{0\alpha}\,g_{\hat\eta^\alpha NN} \tag{145}$$
9 2mN
It is clear that this is simply an alternative derivation of the U(1)A GT relation
(101) for a0. We could equally use the alternative form (102) to write
$$\Gamma_1^p(Q^2)_{\rm singlet} = \frac19\,\frac{1}{2m_N}\,\Delta C_1^S(\alpha_s)\left[\sqrt6\,f^{0\alpha}\,g_{\eta^\alpha NN} + \sqrt6\,A\,g_{GNN}\right] \tag{146}$$
Recalling the RGE (107) for $g_{GNN}$, we see that this bears a remarkable
similarity to the expression for a0 in terms of parton distributions in the AB
scheme (126). This was first pointed out in [8, 9].
Manipulating the zero-momentum Ward identities in a similar way to that
explained above in the chiral limit now shows that we can express the decay
constants $\hat f^{a\alpha}$ in terms of vacuum Green functions as follows (see (38)):

$$(\hat f\hat f^T)_{ab} = \frac{d}{dp^2}\,\langle 0|T\,D^a D^b|0\rangle\Big|_{p=0} \tag{147}$$

However, for non-zero quark masses there is flavour mixing amongst the ‘OZI
bosons’ $\hat\eta^\alpha$ and we cannot extract the decay constants simply by taking a
square root, as was the case in writing $\hat f^{00} = \sqrt{\chi'(0)}$ in the chiral limit.
Nevertheless, in [52] we estimated the decay constants and form factors in the
approximation where we use (147) with the full divergence $D^a$ but neglect
flavour mixing. Assuming OZI for the couplings, this gives the estimate

$$\frac{a^0(Q^2)}{a^8} \simeq \sqrt6\,\frac{\hat f^{00}}{\hat f^{88}} \tag{148}$$
where we take
$$(\hat f^{00})^2 \simeq \frac{d}{dp^2}\,\langle 0|T\,D^0 D^0|0\rangle\Big|_{p=0} \qquad\qquad (\hat f^{88})^2 \simeq \frac{d}{dp^2}\,\langle 0|T\,D^8 D^8|0\rangle\Big|_{p=0} \tag{149}$$
Evaluating the Green functions using QCD spectral sum rules gives

$$a^0\big|_{Q^2=10\,{\rm GeV}^2} = 0.31 \pm 0.02 \tag{150}$$

$$\Gamma_1^p\big|_{Q^2=10\,{\rm GeV}^2} = 0.141 \pm 0.005 \tag{151}$$

As we have seen in the last section, ﬂavour mixing can be non-negligible

in the phenomenology of the pseudoscalar mesons, so we should be a little
cautious in overestimating the accuracy of these estimates. (The quoted errors
do not include this systematic effect.) Nevertheless, the fact that they are
consistent with those obtained in the chiral limit reinforces our confidence that
the flavour singlet axial charge is relatively insensitive to the quark masses and
that (139) and (140) indeed provide an accurate estimate of the first moment
of g1p .
The observation that the ‘proton spin’ sum rule could be explained in
terms of an extension of the Goldberger–Treiman relation to the flavour singlet
sector was made in Veneziano’s original paper [4]. This pointed out for the
first time that the suppression in a0 was an OZI-breaking effect. Since the
Goldberger–Treiman relation connects the pseudovector form factors with the
pseudoscalar channel, where it is known that there are large OZI violations for
the flavour singlet, it becomes natural to expect similar large OZI violations
also in a0 . This is the fundamental intuition which we have developed into a
quantitative resolution of the ‘proton spin’ problem.

5.4 Semi-inclusive Polarised DIS

While the agreement between our prediction for the first moment of g1p and
experiment is now impressive, it would still be interesting to find other ex-
perimental tests of topological charge screening. A key consequence of this
mechanism is that the OZI violation observed in a0 is not a property specifi-
cally of the proton, but is target independent. This leads us to look for ways to
make measurements of the polarised structure functions of other hadronic tar-
gets besides the proton and neutron. We now show how this can effectively be
done by studying semi-inclusive DIS eN → ehX in the target fragmentation
region (see Fig. 10).
The differential cross section in the target fragmentation region can be
written analogously to (113) in terms of fracture functions:
$$x\,\frac{d\Delta\sigma^{\rm target}}{dx\,dy\,dz\,dt} = \frac{Y_P}{2}\,\frac{4\pi\alpha^2}{s}\,\Delta M_1^{hN}(x,z,t,Q^2) \tag{152}$$
where $x = Q^2/2p_2.q$, $x_B = Q^2/2k.q$, $z = p_2'.q/p_2.q$ so that $1 - z = x/x_B$, and
the invariant momentum transfer $t = K^2 = -k^2$, where $k$ is the momentum
of the struck parton. For $K^2 \ll Q^2$, $z \simeq E_h/E_N$ (in the photon–nucleon CM
frame) is the energy fraction of the target nucleon carried by the detected
hadron h.

Fig. 10. Semi-inclusive DIS eN → ehX in the target fragmentation region. In the
equivalent current fragmentation process, the detected hadron h is emitted from the
hard collision with γ. The right-hand figure shows a simple Reggeon exchange model
valid for z ∼ 1, where h carries a large target energy fraction

ΔM1hN is the fracture function [77] equivalent of the inclusive structure
function g1N , so in the same way as in (122) we have
$$\Delta M_1^{hN}(x,z,t,Q^2) = \frac12\sum_i e_i^2\,\Delta M_i^{hN}(x,z,t,Q^2) \tag{153}$$

Here, $\Delta M_i^{hN}(x,z,t,Q^2)$ is an extended fracture function, introduced by
Grazzini, Trentadue and Veneziano [78], which carries an explicit dependence
on t. One of the advantages of these fracture functions is that they satisfy a
simple, homogeneous RG evolution equation analogous to the usual inclusive
parton distributions.
Our proposal [11, 79] (see also [80]) is to study semi-inclusive DIS in the
kinematical region where the detected hadron h (π, K or D) carries a large
target energy fraction, i.e. z approaching 1, with a small invariant momen-
tum transfer t. In this region, it is useful to think of the target fragmen-
tation process as being simply modelled by a single Reggeon exchange (see
Fig. 10), i.e.

$$\Delta M_1^{hN}(x,z,t,Q^2)\Big|_{z\sim1} \simeq F(t)\,(1-z)^{-2\alpha_B(t)}\,g_1^B(x_B,t,Q^2) \tag{154}$$

If we consider ratios of cross sections, the dynamical Reggeon emission
factor $F(t)(1-z)^{-2\alpha_B(t)}$ will cancel and we will be able to isolate the ratios
of $g_1^B(x_B,t,Q^2)$ for different effective targets B. Although single Reggeon
exchange is of course only an approximation to the more fundamental QCD
description in terms of fracture functions (see [81] for a more technical dis-
cussion), it shows particularly clearly how observing semi-inclusive processes
at large z with particular choices of h and N amounts in effect to performing
inclusive DIS on virtual hadronic targets B. Since our predictions will depend
only on the SU (3) properties of B, together with target independence, they
will hold equally well when B is interpreted as a Reggeon rather than a pure
hadron state.
The idea is therefore to make predictions for the ratios R of the first
moments of the polarised fracture functions $\int_0^{1-z} dx\,\Delta M_1^{hN}(x,z,t,Q^2)$ or
equivalently $\int_0^1 dx_B\,g_1^B(x_B,t,Q^2)$ for various reactions. The first moments $\Gamma_1^B$ are
calculated as in (115) in terms of the axial charges a3, a8 and a0(Q²) for a
state with the SU(3) quantum numbers of B. We then use topological charge
screening to say that $a^0(Q^2) \simeq s(Q^2)\,a^0|_{\rm OZI}$, i.e. the flavour singlet axial charge
is suppressed relative to its OZI value by a universal, target-independent,
suppression factor $s(Q^2)$. From our calculation of χ′(0) and the experimental
results for $g_1^p$, we have $s|_{Q^2=10\,{\rm GeV}^2} \simeq 0.33/0.585 = 0.56$.
Some of the more interesting predictions obtained in [11] are as follows.
The ratio
$$R\left(\frac{en \to e\pi^+X}{ep \to e\pi^-X}\right)_{z\sim1} \simeq \frac{2s-1}{2s+2} \tag{155}$$
is calculated by comparing $\Gamma_1$ for the $\Delta^-$ and $\Delta^{++}$. It is particularly striking
because the physical value of $s(Q^2)$ is close to one half, so the ratio becomes
very small. For strange mesons, on the other hand, the ratio depends on
whether the exchanged object is in the 8 (where the reduced matrix elements
involve the appropriate F/D ratio) or 10 representation, so the prediction is
less conclusive, viz.

$$R\left(\frac{en \to eK^+X}{ep \to eK^0X}\right)_{z\sim1} \simeq \frac{2s-1-3(2s-1)F/D}{2s-1-3(2s+1)F/D}\ \ (\mathbf{8}) \quad\text{or}\quad \frac{2s-1}{2s+1}\ \ (\mathbf{10}) \tag{156}$$
which we find by comparing $\Gamma_1$ for either the $\Sigma^-$ and $\Sigma^+$ in the 8
representation or $\Sigma^{*-}$ and $\Sigma^{*+}$ in the 10. For charmed mesons, we again find

$$R\left(\frac{en \to eD^0X}{ep \to eD^-X}\right)_{z\sim1} \simeq \frac{2s-1}{2s+2} \tag{157}$$
corresponding to the ratio for $\Sigma_c^0$ to $\Sigma_c^{++}$.
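The sensitivity of these ratios to the suppression factor is easy to see numerically. The sketch below evaluates (155)–(157) for the screened value s ≃ 0.33/0.585 quoted above and for the unsuppressed case s = 1. The function names are hypothetical, and no value of F/D is assumed: it is left as an input parameter.

```python
def ratio_pion(s):
    """(155) and (157): ratio for pion (or D meson) production at z ~ 1."""
    return (2.0 * s - 1.0) / (2.0 * s + 2.0)


def ratio_kaon_octet(s, f_over_d):
    """(156), octet exchange; f_over_d is the SU(3) F/D ratio."""
    numerator = 2.0 * s - 1.0 - 3.0 * (2.0 * s - 1.0) * f_over_d
    denominator = 2.0 * s - 1.0 - 3.0 * (2.0 * s + 1.0) * f_over_d
    return numerator / denominator


def ratio_kaon_decuplet(s):
    """(156), decuplet exchange."""
    return (2.0 * s - 1.0) / (2.0 * s + 1.0)


s_screened = 0.33 / 0.585   # suppression factor quoted in the text
s_ozi = 1.0                 # no suppression

pion_screened = ratio_pion(s_screened)
pion_ozi = ratio_pion(s_ozi)
```

With these inputs the pion ratio falls from 1/4 in the unsuppressed case to roughly 0.04, which is the kind of large, target-independent effect the proposed measurement is designed to expose.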
At the other extreme, for z approaching 0, the detected hadron carries
only a small fraction of the target nucleon energy. In this limit, the ratio R
of the fracture function moments becomes simply the ratio of the structure
function moments for n and p, i.e. using the current experimental values,
$R_{z\sim0} \simeq \Gamma_1^n/\Gamma_1^p = -0.30$. This is to be compared with the corresponding
OZI or Ellis–Jaffe value of −0.12.
The differences between the OZI, or valence quark model, expectations and
our predictions based on topological charge screening can therefore be quite
dramatic and should give a very clear experimental signal. In [79], together
with De Florian, we analysed the potential for realising these experiments in
some detail. Since we require particle identification in the target fragmenta-
tion region, fixed-target experiments such as COMPASS or HERMES are not
appropriate. The preferred option is a polarised ep collider.
The first requirement is to measure particles at extremely small angles (θ ≤
1 mrad), corresponding to t less than around 1 GeV2 . This has already been
achieved at HERA in measurements of diffractive and leading proton/neutron
scattering using a forward detection system known as the Leading Proton
Spectrometer (LPS). The technique for measuring charged particles involves
placing detectors commonly known as ‘Roman Pots’ inside the beam pipe
itself.
The next point is to notice that the considerations above apply equally to
ρ as to π production, since the ratios R are determined by flavour quantum
numbers alone. The particle identification requirements will therefore be less
stringent, especially as the production of leading strange mesons from protons
or neutrons is strongly suppressed. However, we require the forward detectors
to have good acceptance for both positive and negatively charged mesons
M = π, ρ in order to measure the ratio (155).

The reactions with a neutron target can be measured if the polarised
proton beam is replaced by polarised ³He. In this case, if we assume that
³He = Ap + Bn, the cross section for the production of positive hadrons h+
measured in the LPS is given by

$$\sigma\left({}^3{\rm He} \to h^+\right) \simeq A\,\sigma\left(p \to h^+\right) + B\,\sigma\left(n \to p\right) + B\,\sigma\left(n \to M^+\right) \tag{158}$$

The first contribution can be obtained from measurements with the proton
beam. However, to subtract the second one, the detectors must have sufficient
particle identification at least to distinguish protons from positively charged
mesons.
Finally, estimates of the total rates [79] suggest that around 1% of the
total DIS events will contain a leading meson in the target fragmentation
region where an LPS would have non-vanishing acceptance (z > 0.6) and in the
dominant domain x < 0.1. The relevant cross sections are therefore sufficient
to allow the ratios R to be measured.
The conclusion is that while our proposals undoubtedly pose a challenge
to experimentalists, they are nevertheless possible. Given the theoretical im-
portance of the ‘proton spin’ problem and the topological charge screening
mechanism, there is therefore strong motivation to perform target fragmenta-
tion experiments at a future polarised ep collider [82].

6 Polarised Two-photon Physics and a Sum Rule for g1γ

The U (1)A anomaly plays a vital role in another sum rule arising in polarised
deep inelastic scattering, this time for the polarised photon structure function
g1γ (xγ , Q2 ; K 2 ). For real photons, the first moment of g1γ vanishes as a conse-
quence of electromagnetic current conservation [83]. For off-shell photons, we
proposed a sum rule in 1992 [5, 6] whose dependence on the virtual momen-
tum of the target photon encodes a wealth of information about the anomaly,
chiral symmetry breaking and gluon dynamics in QCD. This is of special cur-
rent interest since, given the ultra-high luminosity of proposed e+ e− colliders
designed as B factories, a detailed measurement of our sum rule is about to
become possible for the first time.

6.1 The First Moment Sum Rule for g1γ

The polarised structure function $g_1^\gamma$ is measured in the process
$e^+e^- \to e^+e^-X$, which at sufficiently high energy is dominated by the two-photon
interaction shown in Fig. 11. The deep inelastic limit is characterised by
$Q^2 \to \infty$ with $x = Q^2/2p_2.q$ and $x_\gamma = Q^2/2k.q$ fixed, where $Q^2 = -q^2$,
$K^2 = -k^2$ and $s = (p_1 + p_2)^2$. The target photon is assumed to be relatively
soft, $K^2 \ll Q^2$.

Fig. 11. Kinematics for the two-photon DIS process e+ e− → e+ e− X

We are interested in the dependence of the photon structure function
$g_1^\gamma(x_\gamma,Q^2;K^2)$ on the invariant momentum $K^2$ of the target photon.
Experimentally, this is given by $K^2 \simeq E E_2\,\theta_2^2$ where $E_2$ and $\theta_2$ are the energy
and scattering angle of the target electron. For the values $K^2 \sim m_\rho^2$ of interest
in the sum rule, the target electron is nearly forward and θ2 is very small. If
it can be tagged, then the virtuality K 2 is simply determined from θ2 ; other-
wise K 2 can be inferred indirectly from a measurement of the total hadronic
energy.
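The total cross section $\sigma$ and the spin asymmetry $\Delta\sigma$ can be expressed
formally in terms of 'electron structure functions' as follows [5]:

$$\sigma = 2\pi\alpha^2\,\frac{1}{s}\int_0^\infty \frac{dQ^2}{Q^2}\int_0^1 \frac{dx}{x^2}\left[F_2^e\,\frac{1}{y}\left(1 - y + \frac{y^2}{2}\right) - F_L^e\,\frac{y}{2}\right] \tag{159}$$

$$\Delta\sigma = 2\pi\alpha^2\,\frac{1}{s}\int_0^\infty \frac{dQ^2}{Q^2}\int_0^1 \frac{dx}{x}\; g_1^e\left(1 - \frac{y}{2}\right) \tag{160}$$

where $\sigma = \frac{1}{2}(\sigma_{++} + \sigma_{+-})$ and $\Delta\sigma = \frac{1}{2}(\sigma_{++} - \sigma_{+-})$, with $+, -$ referring to the
electron helicities. The parameter $y = Q^2/xs \ll 1$ and only the leading order
terms are retained below.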
These electron structure functions can be expressed as convolutions of the
photon structure functions with appropriate splitting functions. In particular,
we have

$$g_1^e(x, Q^2) = \frac{\alpha}{2\pi}\int_0^\infty \frac{dK^2}{K^2}\int_x^1 \frac{dx_\gamma}{x_\gamma}\;\Delta P_{\gamma e}\!\left(\frac{x}{x_\gamma}\right) g_1^\gamma(x_\gamma, Q^2; K^2) \tag{161}$$

where $\Delta P_{\gamma e}(x) = (2 - x)$. This allows us to relate the $x_\gamma$-moments of the
photon structure functions to the $x$-moments of the cross sections. For the
first moment of $g_1^\gamma$, we find

$$\int_0^1 dx\; x\,\frac{d^3\Delta\sigma}{dQ^2\,dx\,dK^2} = \frac{3}{2}\,\alpha^3\,\frac{1}{s\,Q^2 K^2}\int_0^1 dx_\gamma\; g_1^\gamma(x_\gamma, Q^2; K^2) \tag{162}$$
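The factor 3/2 in (162) is the first moment of the splitting function, $\int_0^1 dz\,(2 - z) = 3/2$, which factors out of the convolution (161) once the $x$ and $x_\gamma$ integrations are interchanged. The following is a minimal numerical sketch of that step. The test shape `toy_g1_gamma` and the grid size are illustrative assumptions only and carry no physical meaning.

```python
import numpy as np

def delta_P(z):
    """Polarised splitting function Delta P_{gamma e}(z) = 2 - z, as quoted in the text."""
    return 2.0 - z

def toy_g1_gamma(x_gamma):
    """Hypothetical smooth test shape; NOT a physical photon structure function."""
    return x_gamma * (1.0 - x_gamma)

def first_moment_of_convolution(g, n=1000):
    """Midpoint-rule estimate of int_0^1 dx int_x^1 dx_g/x_g DeltaP(x/x_g) g(x_g)."""
    h = 1.0 / n
    pts = (np.arange(n) + 0.5) * h
    x, xg = np.meshgrid(pts, pts, indexing="ij")
    integrand = delta_P(x / xg) * g(xg) / xg
    # Cells with x < x_gamma count fully; cells on the diagonal count half.
    weight = np.where(x < xg, 1.0, 0.0) + np.where(x == xg, 0.5, 0.0)
    return float(np.sum(weight * integrand) * h * h)

def first_moment(g, n=1000):
    """Midpoint-rule estimate of int_0^1 dx_g g(x_g)."""
    h = 1.0 / n
    pts = (np.arange(n) + 0.5) * h
    return float(np.sum(g(pts)) * h)

lhs = first_moment_of_convolution(toy_g1_gamma)   # double integral
rhs = 1.5 * first_moment(toy_g1_gamma)            # 3/2 times the first moment
print(lhs, rhs)
```

The two numbers agree to the accuracy of the grid, independently of the test shape chosen, which is the statement that the first $x$-moment of the cross section measures the first $x_\gamma$-moment of $g_1^\gamma$.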
The U (1)A Anomaly and QCD Phenomenology 281

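The first moment sum rule follows, as for the proton, by using the OPE
(114) to express the product of electromagnetic currents for the incident photon
in terms of the axial currents $J^a_{\mu 5}$. The matrix elements
$\langle\gamma^*(k)|J^a_{\mu 5}|\gamma^*(k)\rangle$ with the target photon are then expressed in terms of the
three-current AVV Green function involving one axial and two electromagnetic
currents. We define form factors for this fundamental correlator as follows:

$$\begin{aligned}
-i\,\langle 0|J^a_{\mu 5}(p)\,J_\lambda(k_1)\,J_\rho(k_2)|0\rangle
 ={}& A_1^a\,\epsilon_{\mu\lambda\rho\alpha}k_1^\alpha + A_2^a\,\epsilon_{\mu\lambda\rho\alpha}k_2^\alpha \\
 &+ A_3^a\,\epsilon_{\mu\lambda\alpha\beta}k_1^\alpha k_2^\beta k_{2\rho} + A_4^a\,\epsilon_{\mu\rho\alpha\beta}k_1^\alpha k_2^\beta k_{1\lambda} \\
 &+ A_5^a\,\epsilon_{\mu\lambda\alpha\beta}k_1^\alpha k_2^\beta k_{1\rho} + A_6^a\,\epsilon_{\mu\rho\alpha\beta}k_1^\alpha k_2^\beta k_{2\lambda}
\end{aligned} \tag{163}$$

where the six form factors are functions of the invariant momenta, i.e.
$A_i^a = A_i^a(p^2, k_1^2, k_2^2)$. We also abbreviate $A_i^a(0, k^2, k^2) = A_i^a(K^2)$.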
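The first moment sum rule for $g_1^\gamma$ is then [5]:

$$\int_0^1 dx_\gamma\; g_1^\gamma(x_\gamma, Q^2; K^2) = 4\pi\alpha \sum_{a=3,8,0} \Delta C_1^a(Q^2)\left(A_1^a(K^2) - A_2^a(K^2)\right) \tag{164}$$

where the Wilson coefficients are related to those in (115) by $\Delta C_1^3 = \Delta C_1^{NS}$,
$\Delta C_1^8 = \frac{1}{\sqrt{3}}\Delta C_1^{NS}$ and $\Delta C_1^0 = \frac{2\sqrt{2}}{\sqrt{3}}\Delta C_1^{S}$ (see footnote 12).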

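Now, just as the sum rule for the proton structure function $g_1^p$ could be
related to low-energy meson–nucleon couplings via the $U(1)_A$ Goldberger–Treiman
relations, we can relate this sum rule for $g_1^\gamma$ to the pseudoscalar meson
radiative decays using the analysis in Sect. 4.2. Introducing the off-shell radiative
pseudoscalar couplings for photon virtuality $K^2$, we define form factors

$$F^a(K^2) = 1 - \left(a^a_{\rm em}\frac{\alpha}{\pi}\right)^{-1} \hat f^{a\alpha}\, g_{\hat\eta^\alpha\gamma\gamma}(K^2) \tag{165}$$

or alternatively,

$$\begin{aligned}
F^3(K^2) &= 1 - \left(a^3_{\rm em}\frac{\alpha}{\pi}\right)^{-1} f_\pi\, g_{\pi\gamma\gamma}(K^2) \\
F^8(K^2) &= 1 - \left(a^8_{\rm em}\frac{\alpha}{\pi}\right)^{-1}\left(f^{8\eta} g_{\eta\gamma\gamma}(K^2) + f^{8\eta'} g_{\eta'\gamma\gamma}(K^2)\right) \\
F^0(K^2) &= 1 - \left(a^0_{\rm em}\frac{\alpha}{\pi}\right)^{-1}\left(f^{0\eta} g_{\eta\gamma\gamma}(K^2) + f^{0\eta'} g_{\eta'\gamma\gamma}(K^2) + \sqrt{6}\,A\,g_{G\gamma\gamma}(K^2)\right)
\end{aligned} \tag{166}$$

where the $a^a_{\rm em}$ are the electromagnetic $U(1)_A$ anomaly coefficients defined
earlier. We may then rewrite the sum rule as

$$\int_0^1 dx_\gamma\; g_1^\gamma(x_\gamma, Q^2; K^2) = \frac{1}{2}\frac{\alpha}{\pi}\sum_{a=3,8,0} \Delta C_1^a(Q^2)\, a^a_{\rm em}\, F^a(K^2) \tag{167}$$

12 Explicitly,
$$\Delta C_1^{NS} = \frac{1}{3}\left(1 - \frac{\alpha_s(Q^2)}{\pi}\right), \qquad \Delta C_1^{S} = \frac{1}{3}\left(1 - \frac{\alpha_s(Q^2)}{\pi}\right)\exp\left[\int_0^{t(Q)} dt'\,\gamma(\alpha_s(t'))\right]$$
at leading order, where $t(Q) = \frac{1}{2}\ln\frac{Q^2}{\mu^2}$ and $\gamma = -\frac{3}{4}\frac{\alpha_s^2}{(4\pi)^2}$ is the anomalous
dimension corresponding to the $U(1)_A$ current renormalisation.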
The dependence of $g_1^\gamma$ on the invariant momentum $K^2$ of the target
photon reflects many key aspects of both perturbative and non-perturbative
QCD dynamics. For on-shell photons, $K^2 = 0$, we have simply [5, 83]

$$\int_0^1 dx_\gamma\; g_1^\gamma(x_\gamma, Q^2; K^2 = 0) = 0 \tag{168}$$

This is a consequence of electromagnetic current conservation: it follows
simply by taking the divergence of (163) and observing that in the limit $p \to 0$,
both $A_1$ and $A_2$ are of $O(K^2)$ (see footnote 13).
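The current-conservation argument can be checked numerically. The sketch below builds the tensor structure of (163) for random momenta and random $A_3,\dots,A_6$, fixes $A_1$ and $A_2$ by the relations quoted in footnote 13, and contracts with $k_2^\rho$ and $k_1^\lambda$. The mostly-minus metric, the momentum routing $p^2 = (k_1 + k_2)^2$ and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed mostly-minus Minkowski metric

def levi_civita():
    """Totally antisymmetric symbol with four indices."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        inversions = sum(perm[i] > perm[j] for i in range(4) for j in range(i + 1, 4))
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps

EPS = levi_civita()

def mdot(a, b):
    """Minkowski product of two contravariant four-vectors."""
    return float(a @ ETA @ b)

def avv_tensor(A, k1, k2):
    """Right-hand side of (163) with lower indices (mu, lambda, rho)."""
    k1_lo, k2_lo = ETA @ k1, ETA @ k2
    e1 = np.einsum("mlra,a->mlr", EPS, k1)        # eps_{mu lam rho alpha} k1^alpha
    e2 = np.einsum("mlra,a->mlr", EPS, k2)        # eps_{mu lam rho alpha} k2^alpha
    e12 = np.einsum("mnab,a,b->mn", EPS, k1, k2)  # eps_{mu nu alpha beta} k1^alpha k2^beta
    T = A[0] * e1 + A[1] * e2
    T = T + A[2] * np.einsum("ml,r->mlr", e12, k2_lo)
    T = T + A[3] * np.einsum("mr,l->mlr", e12, k1_lo)
    T = T + A[4] * np.einsum("ml,r->mlr", e12, k1_lo)
    T = T + A[5] * np.einsum("mr,l->mlr", e12, k2_lo)
    return T

def conserved_A1_A2(A3, A4, A5, A6, k1, k2):
    """Footnote 13 relations, using k1.k2 = (p^2 - k1^2 - k2^2)/2."""
    k1k2 = mdot(k1, k2)
    return A3 * mdot(k2, k2) + A5 * k1k2, A4 * mdot(k1, k1) + A6 * k1k2

rng = np.random.default_rng(0)
k1, k2 = rng.normal(size=4), rng.normal(size=4)
A3, A4, A5, A6 = rng.normal(size=4)
A1, A2 = conserved_A1_A2(A3, A4, A5, A6, k1, k2)
T = avv_tensor([A1, A2, A3, A4, A5, A6], k1, k2)
div_rho = np.einsum("mlr,r->ml", T, k2)  # k2^rho contracted with the tensor
div_lam = np.einsum("mlr,l->mr", T, k1)  # k1^lambda contracted with the tensor
print(np.abs(div_rho).max(), np.abs(div_lam).max())
```

Both divergences vanish to rounding error. Setting $k_1 = -k_2 = k$ in `conserved_A1_A2` gives $A_1 = (A_3 - A_5)k^2$ and $A_2 = (A_4 - A_6)k^2$, which is the $O(K^2)$ behaviour used above, provided the remaining form factors have no massless poles.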
In the asymptotic limit where $K^2 \gg m_\rho^2$, a relatively straightforward
renormalisation group analysis combined with the anomaly equation shows
that, for the flavour non-singlets, the $A_i^a$ tend to the value $\frac{1}{2}\frac{\alpha}{\pi} a^a_{\rm em}$, while in
the flavour singlet sector, $A_i^0$ has an additional factor depending on the
anomalous dimension $\gamma$. Using the explicit expressions for the Wilson coefficients,
we find

$$\begin{aligned}
\int_0^1 dx_\gamma\; g_1^\gamma(x_\gamma, Q^2; K^2 \gg m_\rho^2)
 &= \frac{1}{6}\frac{\alpha}{\pi}\left(1 - \frac{\alpha_s(Q^2)}{\pi}\right)\left[a^3_{\rm em} + \frac{1}{\sqrt{3}}a^8_{\rm em} + \frac{2\sqrt{2}}{\sqrt{3}}a^0_{\rm em}\exp\int_{t(K)}^{t(Q)} dt'\,\gamma(\alpha_s(t'))\right] \\
 &= \frac{1}{3}\frac{\alpha}{\pi}\left[1 - \frac{4}{9}\frac{1}{\ln Q^2/\Lambda^2} + \frac{16}{81}\left(\frac{1}{\ln Q^2/\Lambda^2} - \frac{1}{\ln K^2/\Lambda^2}\right)\right]
\end{aligned} \tag{169}$$

The asymptotic limit is therefore determined by the electromagnetic $U(1)_A$
anomaly, with logarithmic corrections reflecting the anomalous dimension of
the flavour singlet current due to the colour $U(1)_A$ anomaly. (See also [84] for
a NNLO analysis.)
In between these limits, the first moment of $g_1^\gamma$ provides a measure of
the form factors defining the three-current AVV Green function, which encodes
a great deal of information about the dynamics of QCD, especially the
non-perturbative realisation of chiral symmetry [6]. Equivalently, in the form
(167), it measures the momentum dependence of the off-shell radiative couplings
of the pseudoscalar mesons as the form factors $F^a(K^2)$ vary from 0 to 1.

13 Electromagnetic current conservation in (163) implies
$$A_1^a = A_3^a k_2^2 + \frac{1}{2}A_5^a\,(p^2 - k_1^2 - k_2^2), \qquad A_2^a = A_4^a k_1^2 + \frac{1}{2}A_6^a\,(p^2 - k_1^2 - k_2^2)$$
The chiral limit is special since the form factors can have massless poles and is
considered in detail in [6]. The sum rule (168) still holds.
Just as for $g_1^p$, we can again isolate a dependence on the topological
susceptibility through the identification of the flavour singlet decay constant $\hat f^{00}$
in (165) with $\chi'(0)$ in the chiral limit. This time, however, it is unlikely to
be a good approximation to set the corresponding coupling $g_{\hat\eta^0\gamma\gamma}$ equal to its
OZI value since it is not RG invariant. A more promising approximation is to
recall from Sect. 4 that the RG-invariant gluonic coupling $g_{G\gamma\gamma}(0)$ is OZI
suppressed and likely to be small. This was confirmed by the phenomenological
analysis. If we assume this is also true of the off-shell coupling, then we may
approximate the sum rule for $g_1^\gamma$ entirely in terms of the off-shell couplings of
the physical mesons $\pi^0$, $\eta$ and $\eta'$.
In general, the momentum dependence of the form factors $(A_1^a - A_2^a)$ or
$F^a$ will depend on the fermions contributing to the AVV Green function [6].
In the case of leptons, or heavy quarks, the crossover scale as the form factors
$F^a(K^2)$ rise from 0 to 1 with increasing $K^2$ will be given by the fermion mass.
For the light quarks, however, we expect the crossover scale to be a typical
hadronic scale $\sim m_\rho$ rather than $m_{u,d,s}$. This can be justified by a rough
OPE argument and is consistent with old ideas of vector meson dominance
[6, 85]. This behaviour would be an interesting manifestation of the spontaneous
breaking of chiral symmetry.
Once again, therefore, we see a close relation between the realisation of sum
rules in high-energy deep inelastic scattering and low-energy meson physics.
All these issues are discussed at some length in our earlier papers, but here
we now turn our attention to the vital question of whether the $g_1^\gamma$ sum rule
can be measured in current or future collider experiments [7].

6.2 Cross Sections and Spin Asymmetries at Polarised B Factories

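The spin-dependent cross sections for the two-photon DIS process
$e^+e^- \to e^+e^- X$ were analysed in [5, 7] taking account of the experimental cuts on
the various kinematical parameters. Keeping the lower cut on $Q^2$ as a free
parameter, we found the following results for the total cross section and spin
asymmetry:

$$\sigma \simeq 0.5 \times 10^{-8}\;\frac{1}{Q^2_{\rm min}}\,\log\frac{Q^2_{\rm min}}{\Lambda^2}\left(\log\frac{s}{Q^2_{\rm min}}\right)^2 \tag{170}$$

and

$$\frac{\Delta\sigma}{\sigma} = \frac{1}{2}\,\frac{Q^2_{\rm min}}{s}\,\log\frac{s}{4Q^2_{\rm min}}\left[1 + \log\frac{s}{4\Lambda^2}\left(\log\frac{Q^2_{\rm min}}{\Lambda^2}\right)^{-1}\right] \tag{171}$$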

In order to measure the $g_1^\gamma$ sum rule, we need to find collider parameters such
that the spin asymmetry is significant in a kinematic region where the total
cross section is still large. A useful statistical measure of the significance of
the asymmetry is that $\sqrt{L\sigma}\,\Delta\sigma/\sigma \gg 1$, where $L$ is the luminosity.
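As a rough illustration, the formulae (170) and (171) and the significance criterion can be coded directly. This is only a sketch under assumptions that are not stated in the text: that (170) gives $\sigma$ in GeV$^{-2}$ when $Q^2_{\rm min}$ is in GeV$^2$, that $\Lambda^2 = 0.25$ GeV$^2$ (chosen because it roughly reproduces the cross sections quoted in this section), and that (171) is read with the inverse logarithm inside the square bracket. The function names are hypothetical.

```python
import math

GEV2_TO_PB = 0.3894e9  # (hbar c)^2: 1 GeV^-2 = 0.3894 mb = 0.3894e9 pb

def sigma_pb(s, q2min, lam2=0.25):
    """Eq. (170) converted to pb; the GeV^-2 units and lam2 (GeV^2) are assumptions."""
    sigma_gev = 0.5e-8 / q2min * math.log(q2min / lam2) * math.log(s / q2min) ** 2
    return sigma_gev * GEV2_TO_PB

def spin_asymmetry(s, q2min, lam2=0.25):
    """Eq. (171) for Delta sigma / sigma, with the same assumed lam2."""
    bracket = 1.0 + math.log(s / (4.0 * lam2)) / math.log(q2min / lam2)
    return 0.5 * (q2min / s) * math.log(s / (4.0 * q2min)) * bracket

def significance(lumi_inv_pb, sigma, asymmetry):
    """sqrt(L sigma) * (Delta sigma / sigma), with L in pb^-1 and sigma in pb."""
    return math.sqrt(lumi_inv_pb * sigma) * asymmetry

# Example: s in GeV^2 and the Q^2_min cut in GeV^2, with 100 pb^-1 of data.
s, q2min = 1.0e4, 10.0
print(sigma_pb(s, q2min), spin_asymmetry(s, q2min),
      significance(100.0, sigma_pb(s, q2min), spin_asymmetry(s, q2min)))
```

The competition described in the text is visible in the code: the asymmetry carries an explicit factor $Q^2_{\rm min}/s$, while the cross section falls like $1/Q^2_{\rm min}$ up to logarithms.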
When we first proposed the first moment sum rule for $g_1^\gamma$, the luminosity
available from the then current accelerators was inadequate to allow it
to be studied. For example, for a polarised version of LEP operating at
$s = 10^4$ GeV$^2$ with an annual integrated luminosity of $L = 100$ pb$^{-1}$, and
optimising the cut at $Q^2_{\rm min} = 10$ GeV$^2$, we only have $\sigma \simeq 35$ pb and $\Delta\sigma/\sigma \simeq 0.01$.
The corresponding annual event rate would be $3.5 \times 10^3$ and the statistical
significance $\sqrt{L\sigma}\,\Delta\sigma/\sigma \simeq 0.5$, so even a reliable measurement of the spin
asymmetry could not be made.
Clearly, a hugely increased luminosity is required and this has now become
available with proposals for machines with projected annual integrated
luminosities measured in inverse attobarns. However, as noted in [5], if this
increased luminosity is associated with increased CM energy, then the $1/s$ factor
in the spin asymmetry (171) sharply reduces the possibility of extracting $g_1^\gamma$.
There is also a competition as $Q^2_{\rm min}$ is varied between increasing spin asymmetry
and decreasing total cross section. This is particularly evident when we
analyse the potential of the ILC [86, 87] for measuring the sum rule [7]. We
find that even optimising the $Q^2_{\rm min}$ cut, the spin asymmetry is still only of
order $\Delta\sigma/\sigma \simeq 0.002$ when $\sigma$ itself has fallen to around 15 pb. While, given
the high luminosity, this would allow a measurement of the first moment of
$g_1^\gamma$ integrated over $K^2$, a detailed study of the $K^2$-dependence of the sum rule
requires a much greater spin asymmetry.
This leads us to consider instead the new generation of ultra-high luminos-
ity e+ e− colliders. Although these are envisaged as B factories, these colliders
operating with polarised beams would, as we now show, be extremely valuable
for studying polarisation phenomena in QCD. As an example of this class, we
take the proposed SuperKEKB collider. (The analysis for PEPII is very sim-
ilar, the main difference being the additional 10-fold increase in luminosity in
the current SuperKEKB proposals.)
SuperKEKB is an asymmetric $e^+e^-$ collider with $s = 132$ GeV$^2$, corresponding
to electron and positron beams of 8 and 3.5 GeV, respectively. The
design luminosity is $5 \times 10^{35}$ cm$^{-2}$ s$^{-1}$, which gives an annual integrated
luminosity of 5 ab$^{-1}$ [88]. To see the effects of the experimental cut on $Q^2_{\rm min}$ in
this case, we have plotted the total cross section and the spin asymmetry in
Fig. 12, in the range of $Q^2_{\rm min}$ from 1 to 10 GeV$^2$. In this range $\sigma$ is falling like
$1/Q^2_{\rm min}$ while $\Delta\sigma/\sigma$ rises to what is actually a maximum at $Q^2_{\rm min} = 10$ GeV$^2$.
Taking $Q^2_{\rm min} = 5$ GeV$^2$, we find $\sigma \simeq 12.5$ pb with spin asymmetry
$\Delta\sigma/\sigma \simeq 0.1$. The annual event rate is therefore $6.25 \times 10^7$, with $\sqrt{L\sigma}\,\Delta\sigma/\sigma \simeq 750$.
This combination of a very high event rate and the large 10% spin asymmetry
means that SuperKEKB has the potential not only to measure $\Delta\sigma$ but to
access the full first moment sum rule for $g_1^\gamma$ itself. Recall from (162) that to
measure $\int_0^1 dx\, g_1^\gamma(x, Q^2; K^2)$ we need not just $\Delta\sigma$ but the fully differential
cross section w.r.t. $K^2$ as well as $x$ and $Q^2$ if the interesting non-perturbative
QCD physics is to be accessed. To measure this, we need to divide the data
into sufficiently fine $K^2$ bins in order to plot the explicit $K^2$ dependence of
$g_1^\gamma$, while still maintaining the statistical significance of the asymmetry. The
ultra-high luminosity of SuperKEKB ensures that the event rate is sufficient,
while its moderate CM energy means that the crucial spin asymmetry is not
overly suppressed by its $1/s$ dependence.

Fig. 12. The left-hand graph (a) shows the total cross section $\sigma$ (in pb) at SuperKEKB
as the experimental cut $Q^2_{\rm min}$ is varied from 1 to 10 GeV$^2$. The right-hand graph (b)
shows the spin asymmetry $\Delta\sigma/\sigma$ over the same range of $Q^2_{\rm min}$
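The event rates and significances quoted in this section follow from simple arithmetic on the luminosity, cross section and asymmetry. The sketch below reproduces that arithmetic from the rounded numbers given in the text. The conversion of the design luminosity to an annual figure assumes a conventional running year of $10^7$ s, which is an assumption and not a number taken from the text.

```python
import math

PB_PER_INV_AB = 1.0e6     # 1 ab^-1 = 1e6 pb^-1
CM2_PER_INV_AB = 1.0e42   # 1 ab^-1 = 1e42 cm^-2

def annual_luminosity_inv_ab(inst_lumi_cm2_s, seconds_per_year=1.0e7):
    """Integrated luminosity in ab^-1; seconds_per_year is an assumed running time."""
    return inst_lumi_cm2_s * seconds_per_year / CM2_PER_INV_AB

def event_rate(lumi_inv_pb, sigma_pb):
    """Number of events N = L * sigma."""
    return lumi_inv_pb * sigma_pb

def significance(lumi_inv_pb, sigma_pb, asymmetry):
    """Statistical significance sqrt(L sigma) * (Delta sigma / sigma)."""
    return math.sqrt(lumi_inv_pb * sigma_pb) * asymmetry

# Rounded inputs quoted in the text.
lep = {"lumi": 100.0, "sigma": 35.0, "asym": 0.01}
superkekb = {"lumi": 5.0 * PB_PER_INV_AB, "sigma": 12.5, "asym": 0.1}

for name, c in (("LEP", lep), ("SuperKEKB", superkekb)):
    print(name, event_rate(c["lumi"], c["sigma"]),
          significance(c["lumi"], c["sigma"], c["asym"]))
print(annual_luminosity_inv_ab(5.0e35))
```

With these rounded inputs the LEP significance comes out below one, while the SuperKEKB value is several hundred, of the same size as the figure quoted above. Dividing the SuperKEKB sample into $n$ equal $K^2$ bins reduces the significance per bin by roughly $\sqrt{n}$, which is why the large margin matters.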
Our conclusion is that the new generation of ultra-high luminosity, moderate
energy $e^+e^-$ colliders, currently conceived as B factories, could also be
uniquely sensitive to important QCD physics if run with polarised beams.
In particular, they appear to be the only accelerators capable of accessing
the full physics content of the sum rule for the first moment of the polarised
structure function $g_1^\gamma(x, Q^2; K^2)$. The richness of this physics, in particular
the realisation of chiral symmetry breaking, the manifestations of the axial
$U(1)_A$ anomaly and the role of gluon topology, provides a strong motivation
for giving serious consideration to an attempt to measure the $g_1^\gamma$ sum rule at
these new colliders.

Acknowledgements

In addition to Gabriele, I would like to thank Daniel de Florian,
Massimiliano Grazzini, Stephan Narison and Ben White for their collaboration
on the original research presented here. This paper has been prepared
with the partial support of PPARC grant PP/D507407/1.

References
1. G. Veneziano: Nucl. Phys. B 159, 213 (1979) 236, 241, 245, 246, 248, 251, 252, 254
2. G. M. Shore: Nucl. Phys. B 569, 107 (2000) 236, 249, 252
3. G. M. Shore: Nucl. Phys. B 744, 34 (2006) 236, 238, 252, 261, 263
4. G. Veneziano: Mod. Phys. Lett. A 4, 1605 (1989) 236, 261, 265, 276
5. S. Narison, G. M. Shore, G. Veneziano: Nucl. Phys. B 391, 69 (1993) 236, 279, 280, 281, 282
6. G. M. Shore, G. Veneziano: Mod. Phys. Lett. A 8, 373 (1993) 236, 279, 282, 283
7. G. M. Shore: Nucl. Phys. B 712, 411 (2005) 236, 283, 284
8. G. M. Shore, G. Veneziano: Phys. Lett. B 244, 75 (1990) 237, 261, 266, 275
9. G. M. Shore, G. Veneziano: Nucl. Phys. B 381, 23 (1992) 237, 243, 256, 261, 262, 266, 275
10. S. Narison, G. M. Shore, G. Veneziano: Nucl. Phys. B 433, 209 (1995) 237, 266, 273
11. G. M. Shore, G. Veneziano: Nucl. Phys. B 516, 333 (1998) 237, 277
12. G. Veneziano: in From Symmetries to Strings: Forty Years of Rochester Con-
ferences, ed. by A. Das (World Scientiﬁc, Singapore, 1990), pp. 13–26 237, 243
13. S. L. Adler: Phys. Rev. 177, 2426 (1969) 238
14. J. S. Bell, R. Jackiw: Nuovo Cimento A 60, 47 (1969) 238
15. S. L. Adler, W.A. Bardeen: Phys. Rev. 182, 1517 (1969) 238
16. J. Steinberger: Phys. Rev. 76, 1180 (1949) 238
17. J. Schwinger: Phys. Rev. 82, 664 (1951) 238
18. K. Fujikawa: Phys. Rev. Lett. 42, 1195 (1979); Phys. Rev. D 21, 2848 (1980),
erratum-ibid. D 22, 1499 (1980) 238
19. G. M. Shore: in Hidden Symmetries and Higgs Phenomena, Zuoz Summer
School, Switzerland, 1998, pp. 201–223; arXiv:hep-ph/9812354 239
20. E. Witten: Nucl. Phys. B 156, 269 (1979) 241, 245
21. P. Di Vecchia, G. Veneziano: Nucl. Phys. B 171, 253 (1980) 241, 245, 247
22. D. Espriu, R. Tarrach: Z. Phys. C 16, 77 (1982) 242
23. G. M. Shore: Nucl. Phys. B 362, 85 (1991) 243
24. G. M. Shore, G. Veneziano: Nucl. Phys. B 381, 3 (1992) 243, 256
25. G. ’t Hooft: Nucl. Phys. B 72, 461 (1972) 243
26. G. Veneziano: Phys. Lett. B 52, 220 (1974); Nucl. Phys. B 117, 519 (1976) 243
27. S. Okubo: Phys. Lett. 5, 165 (1963) 243
28. G. Zweig: CERN report 8419/TH412 (1964) 243
29. J. Iizuka: Prog. Theor. Phys. Suppl. 37–38, 21 (1966) 243
30. S. Weinberg: Phys. Rev. D 11, 3583 (1975) 246
31. G. ’t Hooft: Phys. Rev. D 14, 3432 (1976); [Erratum-ibid. D 18, 2199 (1978)] 246
32. R. J. Crewther: Riv. Nuovo Cimento 2N8, 63 (1979) 246
33. G. A. Christos: Phys. Rep. 116, 251 (1984) 246
34. G. ’t Hooft: Phys. Rept. 142, 357 (1986) 246
35. M. Gell-Mann, R. J. Oakes, B. Renner: Phys. Rev. 175, 2195 (1968) 246
36. R. F. Dashen: Phys. Rev. 183, 1245 (1969) 246
37. C. Rosenzweig, J. Schechter, C. G. Trahern: Phys. Rev. D 21, 3388 (1980) 247
38. P. Di Vecchia, F. Nicodemi, R. Pettorino, G. Veneziano: Nucl. Phys. B 181,
318 (1981) 247
39. K. Kawarabayashi, N. Ohta: Nucl. Phys. B 175, 477 (1980) 247
40. P. Herrera-Siklody, J. I. Latorre, P. Pascual, J. Taron: Nucl. Phys. B 497, 345
(1997); Phys. Lett. B 419, 326 (1998) 247
41. H. Leutwyler: Nucl. Phys. Proc. Suppl. 64, 223 (1998) 247
42. R. Kaiser, H. Leutwyler: Eur. Phys. J. C 17, 623 (2000) 247
43. L. Giusti, G. C. Rossi, M. Testa, G. Veneziano: Nucl. Phys. B 628, 234 (2002)
251
44. G. M. Shore: Phys. Scr. T 99, 84 (2002) 252, 256
45. Particle Data Group: Review of Particle Properties, Phys. Lett. B 592, 1 (2004)
257
46. M. Acciarri et al., L3 Collaboration: Phys. Lett. B 418, 399 (1998) 257
47. D. A. Williams et al., Crystal Ball Collaboration: Phys. Rev. D 38, 1365 (1988)
257
48. N. A. Roe et al., ASP Collaboration: Phys. Rev. D 41, 17 (1990) 257
49. L. Del Debbio, L. Giusti, C. Pica: Phys. Rev. Lett. 94, 032003 (2005) 258
50. A. Di Giacomo: Nucl. Phys. Proc. Suppl. 23, 191 (1991) 259
51. S. Narison: Phys. Lett. B 255, 101 (1991); Z. Phys. C 26, 209 (1984) 259
52. S. Narison, G. M. Shore, G. Veneziano: Nucl. Phys. B 546, 235 (1999) 261, 266, 273, 275
53. V. Y. Alexakhin et al. [COMPASS Collaboration]: “The deuteron spin-
dependent structure function g1(d) and its first moment,” arXiv:hep-
ex/0609038 263, 266, 274
54. A. Airapetian et al. [HERMES Collaboration]: “Precise determination of the
spin structure function g(1) of the proton, deuteron and neutron,” arXiv:hep-
ex/0609039 263, 266, 274
55. G. Mallot, S. Platchkov, A. Magnon: CERN-SPSC-2005-017; SPSC-M-733 263
56. E. S. Ageev et al. [COMPASS Collaboration]: Phys. Lett. B 612, 154 (2005) 263
57. J. R. Ellis, R. L. Jaffe: Phys. Rev. D 9, 1444 (1974), [Erratum-ibid. D 10, 1669
(1974)] 263, 266
58. D. V. Bugg: Eur. Phys. J. C 33, 505 (2004) 263
59. P. Moskal: “Hadronic interaction of eta and eta mesons with protons,”
arXiv:hep-ph/0408162 263
60. S. D. Bass: Phys. Scr. T 99, 96 (2002) 263
61. P. Moskal et al.: Int. J. Mod. Phys. A 20, 1880 (2005) 263
62. P. Moskal et al.: Phys. Rev. Lett. 80, 3202 (1998) 264
63. K. Nakayama, H. Haberzettl: “Analyzing eta photoproduction data on the
proton at energies of 1.5GeV–2.3GeV,” arXiv:nucl-th/0507044. 264
64. M. Dugger [CLAS Collaboration]: “S=0 pseudoscalar meson photoproduction
from the proton,” arXiv:nucl-ex/0512005. 264
65. G. M. Shore: Nucl. Phys. Proc. Suppl. 39BC, 101 (1995) 266
66. J. Ashman et al: Phys. Lett. B 206, 364 (1988); Nucl. Phys. B 328, 1 (1990) 266
67. R. L. Jaffe, A. Manohar: Nucl. Phys. B 337, 509 (1990) 267
68. G. M. Shore, B. E. White: Nucl. Phys. B 581, 409 (2000) 267, 268
69. G. M. Shore: Nucl. Phys. Proc. Suppl. 96,171 (2001) 267
70. B. L. G. Bakker, E. Leader, T. L. Trueman: Phys. Rev. D 70, 114001 (2004) 268
71. R. D. Ball, S. Forte, G. Ridolfi: Phys. Lett. B 378, 255 (1996) 269
72. G. Altarelli, G. G. Ross: Phys. Lett. B 212, 391 (1988) 270
73. S. Procureur [COMPASS Collaboration]: “New measurement of Delta(G)/G at
COMPASS,” arXiv:hep-ex/0605043 270
74. R. Fatemi [STAR Collaboration]: “Using jet asymmetries to access Delta(G),
the gluon helicity distribution of the proton at STAR,” arXiv:nucl-ex/0606007.
270
75. Y. Fukao [PHENIX Collaboration]: “The overview of the spin physics at RHIC-
PHENIX experiment,” AIP Conf. Proc. 842, 321 (2006) 270
76. G. M. Shore: in From the Planck Length to the Hubble Radius, Erice 1998, ed.
by A. Zichichi (World Scientific, Singapore, 1998), pp. 79–105 270
77. L. Trentadue, G. Veneziano: Phys. Lett. B 323, 201 (1994) 277
78. M. Grazzini, L. Trentadue, G. Veneziano: Nucl. Phys. B 519, 394 (1998) 277
79. D. de Florian, G. M. Shore, G. Veneziano: in Proceedings of the 1997 Work-
shop with Polarized Protons at Hera, ed. by A. de Roeck, T. Gehrmann
(Hamburg/Zeuthen, 1997) pp. 696–703; arXiv:hep-ph/9711353 277, 278, 279
80. G. M. Shore: Nucl. Phys. Proc. Suppl. 64, 167 (1998) 277
81. M. Grazzini, G. M. Shore, B. E. White: Nucl. Phys. B 555, 259 (1999) 277
82. A. De Roeck: Nucl. Phys. Proc. Suppl. 105, 40 (2002) 279
83. S. D. Bass: Int. J. Mod. Phys. A 7, 6039 (1992) 279, 282
84. K. Sasaki, T. Ueda, T. Uematsu: Phys. Rev. D 73, 094024 (2006) 282
85. T. Ueda, T. Uematsu, K. Sasaki: Phys. Lett. B 640, 188 (2006) 283
86. F. Richard et al.: “TESLA: The Superconducting electron positron linear col-
lider with an integrated X-ray laser laboratory. Technical Design Report, Part
I”; hep-ph/0106314 284
87. M. Woods et al.: in Proc. 5th International Workshop on Electron–Electron
Interactions at TeV Energies, Santa Cruz, 2003; physics/0403037 284
88. A. G. Akeroyd et al (SuperKEKB Physics Working Group): “Physics at Super
B Factory”; hep-ex/0406071 284
Planar Equivalence 2006∗

A. Armoni1 and M. Shifman2

1 Department of Physics, Swansea University, Singleton Park, Swansea
SA2 8PP, UK
[email protected]
2 William I. Fine Theoretical Physics Institute, University of Minnesota,
Minneapolis, MN 55455, USA
[email protected]

Abstract. Planar equivalence between supersymmetric Yang–Mills theory and its
orientifold daughters is a promising tool for explorations of nonperturbative aspects
of quantum chromodynamics. Taking our 2004 review as a starting point, we summarize
some recent developments in this issue.

The most interesting processes in quantum chromodynamics (QCD) are those
occurring at large distances, at strong coupling. The large distance dynamics
determining such salient features as chiral symmetry breaking and color
confinement are the realm of nonperturbative phenomena. Despite the practical
importance of the issue and the fact that this is a very deep theoretical
problem, very few analytic methods of calculation (of a limited scope) have been
developed over the years; for a recent review see [1].
The situation is much better in supersymmetric (SUSY) theories: certain
quantities (which go under the name of F terms) can be calculated exactly, due
to holomorphic dependences on various parameters. In particular, it is possible
to calculate the exact value of the gluino condensate [2] in pure N = 1 super-
Yang–Mills (SYM) theory (we will also refer to this theory as supersymmetric
gluodynamics).
∗
A mini-review in honor of Gabriele Veneziano’s 65th birthday.

A. Armoni and M. Shifman: Planar Equivalence 2006, Lect. Notes Phys. 737, 289–300 (2008)
DOI 10.1007/978-3-540-74233-6_13
© Springer-Verlag Berlin Heidelberg 2008

The basic idea behind planar equivalence is to approximate QCD by a
supersymmetric theory!
The history of planar equivalence is as follows. In 1998, soon after the
seminal AdS/CFT paper of Maldacena [3], Kachru and Silverstein [4] suggested
a class of nonsupersymmetric large-N conformal gauge theories. The
candidate theories were the duals of $AdS_5 \times S^5/\Gamma$ and were therefore named
"orbifold field theories." Although it turns out that these theories are in fact not
conformal (not even in perturbation theory, see [5, 6]), Kachru–Silverstein's
conjecture led to a more subtle conjecture by Strassler [7]. A refined version
of Strassler's conjecture is planar equivalence for orientifold field theories. In
contrast to various other conjectures, the latter can be proven [8, 9] under
rather mild assumptions, see Sect. 6. The orientifold daughter of SUSY gluodynamics
is a nonsupersymmetric Yang–Mills theory with one Dirac fermion
in the two-index antisymmetric (or symmetric) representation of SU(N). In
this mini-review, written on the occasion of Gabriele Veneziano's 65th birthday,
we focus on recent developments in the issue of planar equivalence, namely those
that took place after our detailed review on this subject [10] was published.
The statement of planar equivalence for (the minimal) orientifold field
theory is as follows: at large N, in a certain well defined bosonic sector, SU(N )
N = 1 SYM theory is equivalent to an SU(N ) gauge theory with a Dirac
fermion in the two-index antisymmetric representation. The same statement
holds for Dirac fermions in the two-index symmetric representation.
Although planar equivalence is an extremely interesting theoretical state-
ment per se, its practical importance goes far beyond since it relates a super-
symmetric gauge theory to a nonsupersymmetric one. Thus, potentially, it is
a very useful tool for QCD. Let us make a simple observation: for SU(3)color
a Dirac fermion in the antisymmetric representation is equivalent to a Dirac
fermion in the fundamental representation. Therefore, the SU(3) version of the
orientifold field theory is in fact one-flavor QCD! Thus, we can approximate
one-flavor QCD by supersymmetric Yang–Mills and in this way evaluate some
nonperturbative quantities in QCD. In particular, planar equivalence will en-
able us to calculate the quark condensate in one-flavor QCD by using the
value of the gluino condensate in supersymmetric gluodynamics.
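The SU(3) observation above rests on a standard group-theory fact: for SU(3) the two-index antisymmetric representation is the complex conjugate of the fundamental one, and for a Dirac fermion conjugating the representation is immaterial. A minimal numerical sketch of this fact follows. It dualises the antisymmetric index pair with the Levi-Civita symbol and compares the result with the conjugate matrix; the random-matrix construction and all function names are illustrative assumptions.

```python
import itertools
import numpy as np

def random_special_unitary(n, rng):
    """Random SU(n) matrix from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q / np.linalg.det(q) ** (1.0 / n)  # rescale so that det = 1

def levi_civita3():
    """Totally antisymmetric symbol with three indices."""
    eps = np.zeros((3, 3, 3))
    for perm in itertools.permutations(range(3)):
        inversions = sum(perm[i] > perm[j] for i in range(3) for j in range(i + 1, 3))
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps

def antisymmetric_rep_su3(u):
    """Action of u on psi_[ij], written on the dual index k via eps_{kij}."""
    eps = levi_civita3()
    return 0.5 * np.einsum("kij,nlm,il,jm->kn", eps, eps, u, u)

rng = np.random.default_rng(1)
u = random_special_unitary(3, rng)
rep = antisymmetric_rep_su3(u)
print(np.abs(rep - u.conj()).max())
```

The difference printed is zero to rounding error, because the dualised matrix is the cofactor matrix of $U$, which equals $U^*$ when $U$ is unitary with unit determinant. For $N > 3$ the antisymmetric representation has dimension $N(N-1)/2 \neq N$, so no such identification exists, which is why the extension to large $N$ is a genuine choice.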

1 Planar Equivalence: a Reﬁned Proof

Originally, the idea of planar equivalence between supersymmetric gluodynamics
and its orientifold daughter was formulated in 2003. Since then we
have refined the proof and made it more rigorous [9]. Let us briefly outline the
main ingredients of the proof.
It is instructive to start from a perturbative analysis. We want to show
that all planar graphs of the two theories coincide. To this end it is useful to
use 't Hooft's notation. In this notation the adjoint representation is denoted
by two parallel lines with color flow arrows pointing in the opposite directions,
whereas the antisymmetric (symmetric) representation is denoted by two parallel
lines with the arrows pointing in the same direction. The Feynman rules
of the two theories are depicted in Fig. 1.

Fig. 1. (a) The quark-gluon vertex; (b) In N = 1 SYM theory; (c) In the orientifold
field theory
Next, we observe that the direction of the color flow arrows does not affect
the value of the planar graphs under consideration. To see that this is indeed
the case, imagine that we paint every pair of the fermionic lines in blue and
red colors, respectively. Accordingly, the gluon lines will be either both red
or both blue. A planar graph then will be divided into blue regions and red
regions separated by fermionic loops. A typical example is given in Fig. 2.
Now imagine that we reverse the arrows attached to the red lines. In
this way we map a planar graph of one theory onto a planar graph of the
other theory. This action does not change the value of the graph. Quod Erat
Demonstrandum.
The complete nonperturbative proof [9] is more involved, of course. The
main ingredients are as follows. First, define, for a generic Dirac fermion in
the representation $r$, the generating functional

$$e^{-W_r(J_{\rm YM},\, J_\Psi)} = \int DA_\mu\, D\Psi\, D\bar\Psi\; e^{-S_{\rm YM}[A,\, J_{\rm YM}]}\, \exp\left\{\int \bar\Psi\left(i\,\not\!\partial + \not\!\!A^a\, T^a_r + J_\Psi\right)\Psi\right\}. \tag{1}$$

Next, integrate out fermions to arrive at

$$e^{-W_r(J_{\rm YM},\, J_\Psi)} = \int DA_\mu\; e^{-S_{\rm YM}[A,\, J_{\rm YM}] + \Gamma_r[A,\, J_\Psi]}, \tag{2}$$

where

$$\Gamma_r[A, J_\Psi] = \log\det\left(i\,\not\!\partial + \not\!\!A^a\, T^a_r + J_\Psi\right). \tag{3}$$

Fig. 2. A typical planar graph in SYM and the orientifold field theory
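For what follows it is convenient to write the effective action $\Gamma_r[A, J_\Psi]$ in the
world-line formalism [11], as an integral over (super-)Wilson loops

$$\begin{aligned}
\Gamma_r[A, J_\Psi] = -\frac{1}{2}\int_0^\infty \frac{dT}{T}
 &\int Dx\, D\psi\; \exp\left\{-\int_0^T d\tau\left(\frac{1}{2}\dot x^\mu \dot x^\mu + \frac{1}{2}\psi^\mu \dot\psi^\mu - \frac{1}{2}J_\Psi^2\right)\right\} \\
 &\times\; {\rm Tr}\, P \exp\left\{ i\int_0^T d\tau\left(A^a_\mu \dot x^\mu - \frac{1}{2}\psi^\mu F^a_{\mu\nu}\psi^\nu\right) T^a_r\right\}.
\end{aligned} \tag{4}$$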

Thus, the generating functionals of theories with matter in the
antisymmetric/adjoint are very similar. The dependence on the representation enters
through the Wilson loops. The latter can be written as follows:

$$W_{\rm AS} = \frac{1}{2}\left[({\rm Tr}\, U)^2 - {\rm Tr}\, U^2\right] + (U \to U^\dagger), \tag{5}$$

$$W_{\rm adjoint} = \left({\rm Tr}\, U\, {\rm Tr}\, U^\dagger - 1\right) + (U \to U^\dagger) = 2\left({\rm Tr}\, U\, {\rm Tr}\, U^\dagger - 1\right), \tag{6}$$

where $U$ (respectively $U^\dagger$) represents the same group element in the fundamental
(respectively antifundamental) representation of SU(N).
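The two traces entering (5) and (6) are the characters of the antisymmetric and adjoint representations, and they can be verified by projecting tensor products of the fundamental representation. The sketch below does this for random SU(N) matrices; it is an illustration only, and the helper names and the random-matrix construction are assumptions.

```python
import numpy as np

def random_special_unitary(n, rng):
    """Random SU(n) matrix from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q / np.linalg.det(q) ** (1.0 / n)  # rescale so that det = 1

def swap_operator(n):
    """Operator exchanging the two factors of the n x n tensor product space."""
    s = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            s[i * n + j, j * n + i] = 1.0
    return s

def antisymmetric_character(u):
    """Trace of U x U restricted to the antisymmetric subspace."""
    n = u.shape[0]
    projector = 0.5 * (np.eye(n * n) - swap_operator(n))
    return np.trace(projector @ np.kron(u, u))

def adjoint_character(u):
    """Trace of U x U* with the colour-singlet direction projected out."""
    n = u.shape[0]
    singlet = np.eye(n).reshape(n * n) / np.sqrt(n)
    projector = np.eye(n * n) - np.outer(singlet, singlet)
    return np.trace(projector @ np.kron(u, u.conj()))

rng = np.random.default_rng(2)
for n in (3, 5):
    u = random_special_unitary(n, rng)
    tr = np.trace(u)
    print(n, antisymmetric_character(u) - 0.5 * (tr ** 2 - np.trace(u @ u)),
          adjoint_character(u) - (abs(tr) ** 2 - 1.0))
```

Both differences vanish to rounding error. The terms ${\rm Tr}\, U^2$ and $-1$ are subleading in $N$ relative to the products of two traces, which is the origin of the large-N forms used in the next step of the proof.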
To complete the proof [9], one must show that at large N one can use

$$W_{\rm AS} \sim \frac{1}{2}({\rm Tr}\, U)^2 + \frac{1}{2}({\rm Tr}\, U^\dagger)^2, \tag{7}$$

$$W_{\rm adjoint} \sim 2\,{\rm Tr}\, U\, {\rm Tr}\, U^\dagger, \tag{8}$$

and that $U$ can be replaced by $U^\dagger$ everywhere (see footnote 2). The factor 2 in (8) is canceled
by the factor $\frac{1}{2}$, since the adjoint representation is realized by the Majorana
rather than Dirac fermions.
A remarkable consequence of nonperturbative planar equivalence is that
(non-SUSY) orientifold field theories exhibit some features of supersymmetric
theories. This is surprising since the spectrum of the large-N theory consists
of bosons only: it is impossible to form finite-mass fermionic color singlets.
As a "remnant" of SUSY they are predicted to have an even/odd parity
degeneracy, as in supersymmetric gluodynamics. More generally, two bosons
from one and the same would-be supermultiplet must be degenerate in mass
at $N \to \infty$. In addition, the quark condensate $\langle\bar\Psi\Psi\rangle$ will form, and its value
will be identical to that of the gluino condensate in N = 1 SYM theory. Other
important properties are the NSVZ β function, the domain wall spectrum and
gluonic Green functions [8, 10].

2 See Sect. 6 for a more detailed discussion.

2 The Orientifold Large-N Expansion

Let us forget for a short while about supersymmetry and look at planar equivalence
from a broader perspective. Assume we are interested in the large-N
limit of multiflavor QCD. There are various options of generalizing SU(3)
QCD to SU(N) gauge theory. In the original 't Hooft large-N expansion [12]
both $g^2 N$ and the number of flavors $N_f$ (quarks in the fundamental representation)
are kept fixed (this is realized in the modern gauge/string duality by
keeping the number of flavor branes fixed). In the Veneziano large-N expansion
(the topological expansion [13]) the ratio $N_f/N$ is kept fixed, together
with $g^2 N$ (it can be achieved by placing branes on orbifold singularities, in a
certain region of the moduli space). The advantage of the latter expansion is
that the quark loops are not suppressed at large N and, hence, flavor physics
is better captured in this approximation. In particular, the $\eta'$ mass does not
vanish even when $N \to \infty$, that is to say, a massive $\eta'$ is a part of the planar
theory.
While both expansions are interesting and useful, there is no full quantitative
solution to either. It is tempting to say that large-N QCD is dual
to a string theory, and there has been significant progress along these lines [3],
but it would certainly be wrong to say that an accurate and well-developed
description of QCD has already been attained. Therefore, alternative large-N
limits may well prove to be very useful.
Let us discuss a new orientifold large-N expansion [14]. It will lead to cer-
tain quantitative predictions for QCD. We start from SU(3)color Yang–Mills
theory with Nf quark flavors in the fundamental representation (to be referred
to as multiflavor QCD). Since for SU(3) the Dirac fermion in the fundamen-
tal representation is equivalent to the Dirac fermion in the antisymmetric
two-index representation, we have the option of generalizing the theory to
SU(N )color treating Nf fermions as antisymmetric Dirac fermions, see Fig. 3.
The next step is to consider the large-N limit of this theory while keeping
Nf fixed. This large-N approximation, to be referred to as the orientifold
large-N approximation, is somewhat similar to the topological expansion since
the quark loops are not suppressed with respect to the gluon loops.
Through planar equivalence the theory with $N_f$ Dirac quarks in the two-index
antisymmetric representation is related to the theory with $N_f$ adjoint
Majorana quarks (in the common sector).

Fig. 3. Antisymmetric/fundamental representation in SU(3)

While phenomenological consequences of the orientifold large-N limit
so far remain essentially unexplored, on the purely theoretical side planar
equivalence of these theories revived interest in gauge theories with quarks
in higher representations, other than the fundamental representation. In particular,
one can ask the question as to the form of the chiral Lagrangian in
the Yang–Mills theory with antisymmetric or adjoint quarks. The chiral Lagrangian
of QCD with fundamental quarks supports skyrmions which can be
identified with baryons [17]. And what about the chiral Lagrangian in the
theory with antisymmetric quarks? The pattern of the spontaneous breaking
of the chiral symmetry in this gedanken case is well-known. The corresponding
chiral Lagrangian is not drastically different from that of QCD. It supports
skyrmions too. However, the mass of the skyrmions in this case scales as $N^2$
rather than $N$, as is the case in the 't Hooft limit. At first sight, there is no apparent
match between skyrmions and baryons. It turns out [18] that N-quark
hadrons built of antisymmetric fermions are unstable with respect to fusion
of N species into a huge compound object built of $N^2$ quarks. It is the latter
which is an analog of the baryon! For subsequent discussions see [19].
Moreover, chiral Lagrangians were found in theories with adjoint
quarks [20]. The issue of baryon analogs and skyrmions in this case is
intriguing and subtle. There is no conservation of fermion number; rather it is
$(-1)^F$ which is conserved. It was argued [20] that an analog of the baryon
is a compound object built of $N^2$ quarks with an abnormal assignment of
$(-1)^F$. On the skyrmion side, it is seen as a Hopf skyrmion whose topological
stability is associated with a nontrivial Hopf invariant.

3 Applications for One-flavor QCD

As we explained in Sects. 1 and 2, we can approximate one-flavor QCD by a
planar theory with one Dirac two-index antisymmetric fermion. This theory
is planar-equivalent to N = 1 SYM theory. We can therefore make several
quantitative predictions about the nonperturbative regime of the one-flavor
QCD.
The first prediction concerns the spectrum of the theory. As we discussed
at the end of Sect. 1, the color-singlet spectrum of the orientifold field theory
exhibits an odd/even parity degeneracy. Thus, we expect a similar degeneracy
in the spectrum of one-flavor QCD, within a 1/N error,
\[
  \frac{M_-^S}{M_+^S} = 1 + O(1/N) , \tag{9}
\]
where $M_-^S$ is a color-singlet bosonic degree of freedom with spin S and odd
parity and $M_+^S$ is a color-singlet bosonic degree of freedom with spin S and
even parity. In particular the η and the σ mesons should be approximately
degenerate. This prediction was supported by lattice QCD analyses, see [15].
Another prediction is the value of the quark condensate in one-flavor QCD.
The analysis carried out in [16] was recently tested in a lattice simulation by
DeGrand et al. [21]. A comment on this issue is in order here. It is convenient to
deal with a renormalization group invariant definition of the gluino condensate
and the quark condensate,

\[
  \langle\bar\Psi\Psi\rangle_{\rm RGI} \equiv (g^2)^{\gamma/\beta}\,\langle\bar\Psi\Psi\rangle . \tag{10}
\]

The renormalization group invariant value of the gluino condensate is

\[
  \langle\lambda\lambda\rangle = -\frac{N^2}{2\pi^2}\,\Lambda^3 . \tag{11}
\]

Nonperturbative planar equivalence implies the equality of the orientifold
quark condensate and the gluino condensate at infinite N. Moreover, since
we know that for N = 2 the antisymmetric representation is equivalent to
a color-singlet, we can make an educated guess that the value of the quark
condensate at any N is

\[
  \langle\bar\Psi\Psi\rangle = -\left(1-\frac{2}{N}\right)\frac{N^2}{2\pi^2}\,\Lambda^3 . \tag{12}
\]
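The N dependence of the educated guess (12) is easy to tabulate. The short Python sketch below evaluates the ratio of the quark condensate (12) to the gluino condensate (11), which is just 1 − 2/N. The function name is ours and purely illustrative; nothing beyond (11) and (12) is assumed.

```python
from fractions import Fraction


def condensate_ratio(n_colors: int) -> Fraction:
    """Ratio of the quark condensate (12) to the gluino condensate (11): 1 - 2/N."""
    if n_colors < 2:
        raise ValueError("SU(N) requires N >= 2")
    return 1 - Fraction(2, n_colors)


if __name__ == "__main__":
    # The ratio vanishes at N = 2 (antisymmetric = color singlet)
    # and tends to 1 as N grows (planar equivalence).
    for n in (2, 3, 10, 100):
        print(n, condensate_ratio(n))
```

For N = 3 the factor is 1/3, so the finite-N correction built into (12) is far from small in the case of physical interest.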

The evaluation of the quark condensate for N = 3 (one-flavor QCD) at 2 GeV
(assuming the ’t Hooft coupling is 0.115) yields

\[
  \langle\bar q q\rangle^{\rm orientifold}_{2\,{\rm GeV}} = -(262\ {\rm MeV})^3 \pm 30\% . \tag{13}
\]

This value can be compared with a recent lattice evaluation by DeGrand
et al. [21]

\[
  \langle\bar q q\rangle^{\rm lattice}_{2\,{\rm GeV}} = -(269(9)\ {\rm MeV})^3 . \tag{14}
\]
The agreement is more than satisfactory.
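To make the comparison between (13) and (14) concrete, the following Python sketch converts the ±30% uncertainty into a range for the quantity inside the parentheses. It assumes, as the placement of "± 30%" outside the parentheses in (13) suggests, that the 30% refers to the condensate itself, i.e. to the cube; the function and variable names are ours.

```python
def cube_root_band(central_mev, rel_err_on_cube):
    """Range of y (in MeV) such that y**3 lies within central**3 * (1 +/- rel_err)."""
    low = central_mev * (1.0 - rel_err_on_cube) ** (1.0 / 3.0)
    high = central_mev * (1.0 + rel_err_on_cube) ** (1.0 / 3.0)
    return low, high


ORIENTIFOLD_MEV = 262.0  # central value in (13)
LATTICE_MEV = 269.0      # central value in (14)

if __name__ == "__main__":
    low, high = cube_root_band(ORIENTIFOLD_MEV, 0.30)
    ratio = (ORIENTIFOLD_MEV / LATTICE_MEV) ** 3
    print(f"orientifold band: {low:.1f} to {high:.1f} MeV")
    print(f"condensate ratio orientifold/lattice: {ratio:.3f}")
    print("lattice value inside band:", low < LATTICE_MEV < high)
```

Under this reading a 30% error on the condensate corresponds to roughly 10% on the MeV scale, and the lattice central value falls well inside the band.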

4 Applications for Three-flavor QCD

Is it possible to use planar equivalence to calculate nonperturbative quantities
in real three-flavor QCD? In a bid to answer this question positively, a “mixed”
approach has been suggested.
Consider an SU(N ) gauge theory with one Dirac fermion Ψ in the antisym-
metric representation and two extra Dirac fermions χi in the fundamental rep-
resentation. For SU(3) this model reduces to three-flavor QCD. When N → ∞
the fundamental flavors can be neglected and our model is planar equivalent
to N = 1 SYM theory. Thus, the model at hand interpolates between QCD
for SU(3) and SYM theory at large N .
Several subtleties arise when considering this model. Because of chiral
symmetry breaking, Goldstone bosons occur in this model at any finite N.
Therefore, in the attempt to match quantities of this theory and N = 1 SYM
theory, one has to choose sources which do not couple to these Goldstone
particles.
A detailed analysis of the model [22] leads to the estimate

\[
  \langle\bar\Psi\Psi\rangle_{\rm RGI}/\Lambda^3 = -\left(1-\frac{2}{N}\right)\frac{N^2}{2\pi^2} , \tag{15}
\]

just as in the previous case. Note, however, that in this model $\beta_0 = 3N$ (as
in SYM theory), so the running coupling differs from that of
one-flavor QCD. As a result, we find, instead of (13),

\[
  \langle\bar q q\rangle^{\rm orientifold}_{2\,{\rm GeV}} = -(317 \pm 30 \pm 36\ {\rm MeV})^3 . \tag{16}
\]

The errors here are due to the 30% uncertainty of the 1/N formula and the
experimental uncertainty in the ’t Hooft coupling at 2 GeV. The above
prediction should be compared with a recent lattice analysis by McNeile [23]

\[
  \langle\bar q q\rangle^{\rm lattice}_{2\,{\rm GeV}} = -(259 \pm 27\ {\rm MeV})^3 . \tag{17}
\]

The orientifold prediction and the lattice simulation result are confronted
in Fig. 4.
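A rough measure of how far apart (16) and (17) are can be obtained by combining the quoted errors. The Python sketch below adds the two errors of (16) and the error of (17) in quadrature. Treating them as independent Gaussian uncertainties on the MeV scale is our simplifying assumption, not a statement made in the text, and the function name is illustrative.

```python
import math


def tension_in_sigma(prediction, prediction_errors, measurement, measurement_error):
    """Difference between two values in units of their combined (quadrature) error."""
    variance = sum(err * err for err in prediction_errors) + measurement_error ** 2
    return abs(prediction - measurement) / math.sqrt(variance)


if __name__ == "__main__":
    # Values inside the parentheses of (16) and (17), in MeV.
    sigma = tension_in_sigma(317.0, (30.0, 36.0), 259.0, 27.0)
    print(f"difference: {sigma:.2f} combined standard deviations")
```

With these assumptions the orientifold estimate and the lattice result differ by about one combined standard deviation.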

[Figure 4: plot of y (vertical axis, 200 to 400) against the ’t Hooft coupling λ (horizontal axis, 0.12 to 0.17)]

Fig. 4. The quark condensate expressed as $-(y\ {\rm MeV})^3$ as a function of the ’t Hooft
coupling λ. The solid line represents the prediction of planar equivalence. The two
dashed lines represent the ±30% error. The ±1σ range of the coupling, 0.138 < λ <
0.158, and the lattice estimate $-(259 \pm 27\ {\rm MeV})^3$ define the shaded region

5 Sagnotti’s Model and the Gauge/String Correspondence
Orientifold field theories originate in string theory. The starting point is 10D
type-0B string theory. By adding the orientifold
\[
  \Omega' \equiv \Omega\,(-1)^{f_R}
\]
and 32 D9 branes we end up with a nonsupersymmetric nontachyonic string
theory [24, 25]. The low-energy spectrum of the closed string modes consists
of the dilaton, the graviton, and a set of the Ramond–Ramond (RR) fields.
There are no fermions (the Neveu–Schwarz–Ramond sector). The open string
sector consists of a 10-dimensional U(32) gauge theory with an antisymmetric
fermion. The model is free of RR tadpoles.
In order to obtain a realization of the 4D orientifold field theory one can use
a Hanany–Witten brane configuration in type 0A, namely a set of N D4 branes
and an O4 plane suspended between rotated NS5 branes [8]. An alternative
realization [26] is via fractional D3 branes placed on a $C^3/Z_2 \times Z_2$ orbifold
singularity in type 0B. The latter description is useful for the gauge/gravity
correspondence [27]. Since at $g_{\rm st} = 0$ the bosonic gravity modes of type 0B
and their interactions are identical to those of type IIB, the gauge/gravity
correspondence (provided that it holds) provides additional evidence in
favor of planar equivalence: if the bosonic sectors of two gauge theories are
described by the same bosonic sectors of two string theories at $g_{\rm st} = 0$ then
the two gauge theories must be equivalent at infinite N.
The gauge/gravity correspondence for the orientifold field theories was
used recently [27] to make predictions regarding the theories at finite N. In
contrast to the supersymmetric type IIB background which contains N units
of the RR flux, the type-0B background contains N − 2 units of the RR
flux, due to the presence of the O5 plane that shifts the flux by −2. Certain
quantities are sensitive to this shift. This is in agreement with results from
the effective action approach presented in [28].

6 Charge Conjugation and the Validity of Planar Equivalence
Recently it was pointed out [29] that a necessary and sufficient condition
for orientifold planar equivalence to hold is the absence of the spontaneous
breaking of charge conjugation symmetry (for earlier work related to pla-
nar equivalence between SYM theories and orbifold daughters, see [30]). This
assumption was implicit in our refined proof [9]. It is clear that this issue
deserves a separate discussion in Yang–Mills theory per se, not necessarily in
association with supersymmetry or planar equivalence.
Motivated by [29] we argued [31] that C parity does not break spontaneously
in any vector-like gauge theory on $R^4$. We first argued that charge
conjugation is not broken in pure Yang–Mills theory. Our reasoning is based
on the uniqueness of the Yang–Mills vacuum. Although physically compelling, our
arguments unfortunately stop short of a rigorous mathematical proof of the
type given in [32] regarding P parity. There is a deep distinction between
these two aspects of QCD. While the spatial parity conservation is essentially
nondynamical and is based on a general feature of vector-like gauge theories
with spinor quarks, the C-parity conservation versus nonconservation is a dy-
namical question. The uniqueness of the Yang–Mills vacuum provides us with
the necessary dynamical information.
Then we prove [31] that if the charge conjugation is unbroken in pure
Yang–Mills it is not broken in any vector-like theory.
The above arguments are general and apply to QCD as well as to any
other vector-like theory. The absence of the spontaneous breaking of C parity
is sufficient for planar equivalence to be valid. It is instructive to return to
the proof [9] and check where exactly we assume charge conjugation to hold.
In fact, as was noted in Sect. 1, we need to assume that the expectation values
of traces of all Wilson loops coincide with those of their conjugates, being
evaluated in the pure Yang–Mills vacuum. This requires unbroken C parity
of pure Yang–Mills theory. Once it is established, it automatically covers the
theories with vector-like quarks provided that the expansion in quark loops is
convergent.

7 Other Developments

Planar equivalence has been used both in formal works and in phenomenology.
Papers on the subject appeared on all theoretical high-energy archives:
hep-th, hep-ph, and hep-lat. Here we would like to mention a few.
The lattice works are mainly devoted to verification of planar equivalence.
A formal strong coupling and large mass proof was given by Patella [33]. The
paper by DeGrand et al. [21] confirms our prediction for the quark condensate
in one-flavor QCD. The prediction regarding the mass ratio $m_\eta^2/m_\sigma^2$ was
confirmed by Keith-Hynes and Thacker [15].
Phenomenological papers, mainly by Sannino and collaborators [34, 35],
were devoted to constructing technicolor models based on the orientifold
field theories with symmetric matter. In another recent work [36], predictions
about one-flavor QCD were used for “beyond the standard model
phenomenology.”
Among the more formal aspects, it is worth mentioning the work by
Di Vecchia et al. [26] who studied realizations of the orientifold field theo-
ries in type-0’ string theory as well as tree level string amplitudes in these
models.
A partial list of other related works is given in [37, 38, 39, 40, 41].
Summarizing, planar equivalence is a new useful tool in a very limited
toolkit available at present for calculations of nonperturbative quantities in
QCD. It has already resulted in a few promising applications in QCD,
string theory, AdS/CFT, lattice gauge theory and beyond the standard model
phenomenology. We believe that further studies are needed in order to exploit
the potential of this method. In particular, it seems promising to search for
new planar-equivalent pairs with the aim of learning about one of them from
the other.

Acknowledgments

We are happy to thank Gabriele Veneziano for a fruitful and enjoyable col-
laboration. We are grateful to Courtney Davis for the kind permission to use
her cartoon of Gabriele Veneziano from (M)agazine one 2006.
A.A. is supported by the PPARC advanced fellowship award. The work of
M.S. is supported in part by DOE grant DE-FG02-94ER408.

References
1. M. Shifman: Int. J. Mod. Phys. A 21, 5695 (2006)
2. M. A. Shifman, A. I. Vainshtein: Nucl. Phys. B 296, 445 (1988)
3. J. M. Maldacena: Adv. Theor. Math. Phys. 2, 231 (1998); also S. S. Gubser,
I. R. Klebanov, A. M. Polyakov: Phys. Lett. B 428, 105 (1998); E. Witten: Adv.
Theor. Math. Phys. 2, 253 (1998)
4. S. Kachru, E. Silverstein: Phys. Rev. Lett. 80, 4855 (1998)
5. A. Armoni, E. Lopez, A. M. Uranga: JHEP 0302, 020 (2003)
6. A. Dymarsky, I. R. Klebanov, R. Roiban: JHEP 0511, 038 (2005)
7. M. J. Strassler: On Methods for Extracting Exact Nonperturbative Results in
Nonsupersymmetric Gauge Theories, hep-th/0104032
8. A. Armoni, M. Shifman, G. Veneziano: Nucl. Phys. B 667, 170 (2003)
9. A. Armoni, M. Shifman, G. Veneziano: Phys. Rev. D 71, 045015 (2005)
10. A. Armoni, M. Shifman, G. Veneziano: in From Fields to Strings: Circumnavigating
Theoretical Physics, ed. by M. Shifman, A. Vainshtein, J. Wheater (World
Scientific, Singapore, 2005), Vol. 1, p. 353
11. R. Casalbuoni, J. Gomis, G. Longhi: Nuovo Cimento A 24, 249 (1974);
R. Casalbuoni: Nuovo Cimento A 33, 389 (1976); A. Barducci, F. Bordi,
R. Casalbuoni: Nuovo Cimento B 64, 287 (1981); L. Brink, P. Di Vecchia,
P. S. Howe: Nucl. Phys. B 118, 76 (1977); M. J. Strassler: Nucl. Phys. B 385,
145 (1992); E. D’Hoker, D. G. Gagne: Nucl. Phys. B 467, 272 (1996)
12. G. ’t Hooft: Nucl. Phys. B 72, 461 (1974)
13. G. Veneziano: Nucl. Phys. B 117, 519 (1976)
14. A. Armoni, M. Shifman, G. Veneziano: Phys. Rev. Lett. 91, 191601 (2003)
15. P. Keith-Hynes, H. B. Thacker: Double Hairpin Diagrams and the Planar
Equivalence of N = 1 Supersymmetric Yang–Mills Theory and One-Flavor QCD,
hep-lat/0610045; Relics of Supersymmetry in Ordinary One-Flavor QCD: Hairpin
Diagrams and Scalar-Pseudoscalar Degeneracy, hep-th/0701136
16. A. Armoni, M. Shifman, G. Veneziano: Phys. Lett. B 579, 384 (2004)
17. E. Witten: Nucl. Phys. B 223, 422 (1983); Nucl. Phys. B 223, 433 (1983)
[reprinted in S. Treiman et al., Current Algebra and Anomalies (Princeton
University Press, Princeton, 1985), p. 515]
18. S. Bolognesi: Baryons and Skyrmions in QCD with Quarks in Higher
Representations, hep-th/0605065
19. A. Cherman, T. D. Cohen: JHEP 0612, 035 (2006); Phys. Lett. B 641, 401
(2006)
20. R. Auzzi, M. Shifman: Low-Energy Limit of Yang–Mills with Massless Adjoint
Quarks: Chiral Lagrangian and Skyrmions, hep-th/0612211; S. Bolognesi,
M. Shifman: The Hopf Skyrmion in QCD with Adjoint Quarks, hep-th/0701065
21. T. DeGrand, R. Hoffmann, S. Schaefer, Z. Liu: Quark Condensate in One-Flavor
QCD, hep-th/0605147
22. A. Armoni, G. Shore, G. Veneziano: Nucl. Phys. B 740, 23 (2006)
23. C. McNeile: Phys. Lett. B 619, 124 (2005)
24. A. Sagnotti: Some Properties of Open String Theories, hep-th/9509080
25. A. Sagnotti: Nucl. Phys. Proc. Suppl. 56B, 332 (1997)
26. P. Di Vecchia, A. Liccardo, R. Marotta, F. Pezzella: JHEP 0409, 050 (2004)
27. A. Armoni, E. Imeroni: Phys. Lett. B 631, 192 (2005)
28. F. Sannino, M. Shifman: Phys. Rev. D 69, 125004 (2004)
29. M. Ünsal, L. G. Yaffe: Phys. Rev. D 74, 105019 (2006)
30. P. Kovtun, M. Ünsal, L. G. Yaffe: JHEP 0312, 034 (2003); JHEP 0507, 008
(2005)
31. A. Armoni, M. Shifman, G. Veneziano: A note on C-Parity Conservation and
the Validity of Orientifold Planar Equivalence, hep-th/0701229
32. C. Vafa, E. Witten: Phys. Rev. Lett. 53, 535 (1984)
33. A. Patella: A Proof of Orientifold Planar Equivalence on the Lattice,
hep-lat/0511037
34. F. Sannino, K. Tuominen: Phys. Rev. D 71, 051901 (2005)
35. D. K. Hong, S. D. H. Hsu, F. Sannino: Phys. Lett. B 597, 89 (2004)
36. M. J. Strassler, K. M. Zurek: Echoes of a Hidden Valley at Hadron Colliders,
hep-ph/0604261
37. A. Feo, P. Merlatti, F. Sannino: Phys. Rev. D 70, 096004 (2004)
38. J. L. F. Barbon, C. Hoyos: JHEP 0601, 114 (2006)
39. G. Veneziano, J. Wosiek: JHEP 0601, 156 (2006)
40. T. DeGrand, R. Hoffmann: QCD with One Compact Spatial Dimension,
hep-lat/0612012
41. T. J. Hollowood, A. Naqvi: Phase Transitions of Orientifold Gauge Theories at
Large N in Finite Volume, hep-th/0609203
Part V

Supersymmetric Gauge Theories

Instantons and Supersymmetry

M. Bianchi¹, S. Kovacs², and G. Rossi³

¹ University of Rome Tor Vergata and INFN, Sez. di Roma Tor Vergata, Via della
Ricerca Scientifica, 00133 Roma, Italy
² Trinity College Dublin, Dublin 2, Ireland
³ University of Rome Tor Vergata and INFN, Sez. di Roma Tor Vergata, Via della
Ricerca Scientifica, 00133 Roma, Italy

Abstract. The role of instantons in describing non-perturbative aspects of globally
supersymmetric gauge theories is reviewed. The cases of theories with N = 1,
N = 2 and N = 4 supersymmetry are discussed. Special attention is devoted to the
intriguing relation between instanton solutions in field theory and branes in string
theory.

1 Introduction

In this review we discuss the role of instantons in describing non-perturbative
effects in globally supersymmetric gauge theories.¹
Instantons (anti-instantons) are non-trivial self-dual (anti-self-dual) solutions
of the equations of motion of the pure non-abelian Yang–Mills theory
when the latter is formulated on a compactified Euclidean manifold $S^4$.
Instanton solutions are characterised by a topological charge (the Pontrjagin
number, K) which takes integer values, $K \in \mathbb{Z}$. The integer K represents the
number of times the (sub)group SU(2) of the gauge group is wrapped by the
classical solution, while its space–time location spans the $S^3$ sphere at infinity.
The presentation of the material of this review can be naturally split into
three parts according to the number of supersymmetries of the theory.
In the first part (Sects. 2–5) we discuss (weak and strong coupling)
computations of instanton-dominated correlators in pure N = 1 super Yang–Mills
(SYM) and in some super QCD-like (SQCD) and chiral extensions of it. A
careful analysis of the latter case shows that with a suitable choice of the
chiral matter flavour representations the interesting phenomenon of dynamical
breaking of supersymmetry occurs in the theory, as a consequence of the
constraints imposed by the Konishi anomaly equation.

¹ See the contribution by G. Shore to this book for applications to the
non-supersymmetric case of QCD.

M. Bianchi et al.: Instantons and Supersymmetry, Lect. Notes Phys. 737, 303–470 (2008)
DOI 10.1007/978-3-540-74233-6_14
© Springer-Verlag Berlin Heidelberg 2008
In the second part (Sects. 6–11) we move to the N = 2 super Yang–Mills
case. We will show how the highly sophisticated instanton calculus, developed
over the years, is able to produce the correct coefficients that determine the exact
expression of the N = 2 prepotential, derived in the famous Seiberg–Witten
(SW) construction. We will also review the construction of the instanton solu-
tion in terms of branes with the purpose of illustrating the intriguing relation
with string theory.
In the third part (Sects. 12–18) we discuss the role of instantons in N = 4
super Yang–Mills. Although there are no anomalous U (1)’s in this theory
(with the consequence that there exist no chiral U (1) selection rules that would
limit the value of Pontrjagin number of the instanton solutions contributing to
correlators, as instead happens in the N = 1 and N = 2 cases), instantons are
crucial to check the validity of the Maldacena conjecture beyond the realm
of perturbation theory. Furthermore, their correspondence with IIB string
D-instantons gives us hope to understand the yet elusive Montonen–Olive
duality between the weak and strong coupling regimes of N = 4 super Yang–
Mills.
A detailed outline of this review is as follows. In Sect. 2 we start with
some introductory remarks about instantons and their interpretation as field
configurations interpolating between classical Euclidean vacua (quantum tun-
nelling) and we discuss in detail how semi-classical calculations are performed
in the instanton background with special reference to the notion of collective
coordinates. We also illustrate the simplifications that occur in supersymmet-
ric theories. In Sect. 3 we derive from supersymmetry and holomorphicity
the general structure of the Green functions with only insertions of lowest
(highest) components of chiral (anti-chiral) superfields. We show that these
Green functions do not depend on the operator insertion points and have a
completely fixed dependence upon the parameters of the theory (like masses
and coupling constant). We then move in Sect. 4 to the explicit semi-classical
instanton computation of constant Green functions in pure SYM and in mas-
sive SQCD finding perfect agreement between the theoretical expectations
spelled out in the previous section and the results of actual calculations. The
main result of this analysis is that the perturbative non-renormalisation theo-
rems of supersymmetry are violated in the semi-classical instanton approxima-
tion. The instanton calculus is then extended to encompass the more delicate
cases of massless SQCD and to Georgi–Glashow-type theories with matter
in suitable non-anomalous chiral representations. In the first case, certain in-
consistencies are found between results obtained in the massless limit of the
massive theories and what can be directly computed in the strictly massless
situation. The problem is discussed in detail and the issue of “strong coupling”
vs “weak coupling” instanton calculation strategy is addressed. In the second
case, conflicting results with the constraints imposed by the Konishi anomaly
equation lead to the conclusion that in certain supersymmetric chiral theories
supersymmetry is dynamically broken by non-perturbative instanton effects.
In Sect. 5 we give the expression of the effective action for all the cases for
which we have obtained results in the semi-classical instanton approximation.
A nice agreement between these two approaches is found, which gives support
to instanton-based computations.
We then pass to discuss instanton effects in N = 2 SYM theories. After
the introduction to the subject contained in Sect. 6, we present some general
discussion of their properties in Sect. 7. We start by recalling their supermul-
tiplet content. We then describe the coupling of vector multiplets to hyper-
multiplets and the structure of the classical and effective actions. In Sect. 8
we review the celebrated analysis of Seiberg and Witten and the derivation of
the analytic prepotential in the case of pure N = 2 SYM with SU (2) gauge
group. N = 2 instanton calculus is argued to provide a powerful check of
the SW prepotential in Sect. 9. In Sect. 9.1 we describe Matone’s non-linear
recursion relations for the expansion coefficients. The validity of the recur-
sion relations and thus of the expression of the analytic prepotential itself is
checked against instanton calculations for winding numbers K = 1 and K = 2
in Sect. 9.2. In order to go beyond these two cases, we follow the strategy ad-
vocated by Nekrasov and collaborators which is based on the possibility of
topologically twisting N = 2 SYM theories and turning on a non-commutative
deformation that localises the integration over instanton moduli spaces. After
reviewing the strategy in Sect. 10, we describe how to couple hypermultiplets
in Sect. 10.1. We then sketch the mathematical arguments that lead to the
localisation of the measure in Sect. 10.2 and the computation of the residues
that allows a non-perturbative check of the correctness of the SW prepoten-
tial for arbitrary winding number in Sect. 10.3. In Sect. 11 we change gear
and exploit the Veneziano model and D-branes in order to embed (supersym-
metric) YM theories in string theory. In particular, we outline the emergence
of the Atiyah–Drinfeld–Hitchin–Manin (ADHM) data and the ADHM equa-
tions as a result of the introduction of lower dimensional D-branes in a given
configuration with maximal N = 4 supersymmetry. Finally, we describe in
Sect. 11.1 the truncation to N = 2 supersymmetry and the derivation of the
SW prepotential within this framework in Sect. 11.2.
In the final part of the review, starting in Sect. 12, we discuss instanton
effects in N = 4 SYM, focussing in particular on the role of instantons in the
context of the anti-de Sitter space/conformal field theory (AdS/CFT) corre-
spondence. N = 4 SYM is the maximally extended (rigid) supersymmetric
theory in four dimensions and is believed to be exactly conformally invari-
ant at the quantum level. The main properties of the model are reviewed in
Sect. 13. We give explicitly the form of the action and the supersymmetry
transformations and we discuss the basic implications of conformal invari-
ance on the physics of the theory, highlighting some of the features which
make it special compared to the N = 1 and N = 2 theories considered in
previous sections. General aspects of instanton calculus in N = 4 SYM are
presented in Sect. 14. We describe the general strategy for the calculation of
instanton contributions to correlation functions of gauge-invariant composite
operators in the semi-classical approximation, emphasising again the essential
differences with respect to the N = 1 and N = 2 cases. In Sect. 15 we focus
on the case of the SU (Nc ) gauge group, which is relevant for the AdS/CFT
duality, and we discuss in detail the calculation of correlation functions in
the one-instanton sector. We first construct a generating function which fa-
cilitates a systematic analysis of instanton contributions to gauge-invariant
correlators and we then present some explicit examples. The generalisation
of these results to multi-instanton sectors in the large-Nc limit is briefly out-
lined in Sect. 16. At this point we change perspective somewhat and we move
to a discussion of the remarkable gauge/gravity duality conjectured by
Maldacena, explaining how instanton calculus allows one to test its validity beyond
perturbation theory. In Sect. 17 we recall the basic aspects of the duality
which relates N = 4 SYM to type IIB superstring theory in an AdS5 × S 5
background. Instanton effects in N = 4 SYM are in correspondence with the
effects of D-instantons in string theory. More precisely instanton contributions
to correlators in N = 4 SYM are related to D-instanton-induced scattering
amplitudes in the AdS5 × S 5 string theory. In Sects. 18.1 and 18.2 we present
the calculation of D-instanton contributions to the string amplitudes dual to
the SYM correlation functions studied in Sect. 15. The remarkable agreement
between gauge and string theory calculations provides a rather stringent test
of the conjectured duality. Finally in Sect. 18.3 we review the role of instan-
tons in a particularly interesting limit of the AdS/CFT correspondence, the
so-called BMN limit, in which the gravity side of the correspondence is under
a better quantitative control beyond the low-energy supergravity approxima-
tion. We show how instanton effects provide again important insights into the
non-perturbative features of the duality.
Our notation and various technical details are discussed in a number of
appendices.
Given the pedagogic nature of this review we refrain from drawing any
conclusion or presenting any speculation. In Sect. 19 we try, instead, to summarise
the crucial contributions given by Gabriele to the subject both as the father
of open string theory and as one of the deepest and most original investigators
of the non-perturbative aspects of gauge theories. We thus simply list a few
lines of research activity where Gabriele’s profound insight was precious to
put existing problems in the correct perspective and help in solving them.

2 Generalities about Instantons

Instantons (anti-instantons) are self-dual ($F_{\mu\nu} = \tilde F_{\mu\nu}$, $\tilde F_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\rho\sigma}F_{\rho\sigma}$)
(anti-self-dual, $F_{\mu\nu} = -\tilde F_{\mu\nu}$) solutions of the classical non-abelian Yang–Mills
(YM) Euclidean equations of motion (e.o.m.) [1, 2].² They are classified by
a topological number, the Pontrjagin (or winding) number, $K \in \mathbb{Z}$, which
represents the number of times the (sub)group SU(2) of the gauge group is
wrapped by the classical solution, $A^{aI}_\mu(x)$, when x spans the $S^3$ sphere at
the infinity of the compactified $S^4$ Euclidean space–time.³ Homotopy theory
shows, in fact, that the homotopically inequivalent mappings $S^3 \to SU(2)$ are
classified by integers since $\Pi_3(SU(2)) \sim \Pi_3(S^3) = \mathbb{Z}$ [7].

² There are very many good reviews on the subject of instantons and their role in
field theory. Some are listed in [3, 4, 5, 6].

2.1 The Geometry of Instantons

In the Feynman gauge ($\partial_\mu A^a_\mu = 0$) the explicit expression of the gauge
instanton with winding number K = 1 for the SU(2) gauge group (to which case
we now restrict) is

\[
  A^{aI}_\mu = \frac{2}{g}\,\bar\eta^a_{\mu\nu}\,\frac{(x-x_0)_\nu}{(x-x_0)^2}\,\frac{\rho^2}{(x-x_0)^2+\rho^2} , \tag{1}
\]

where $\bar\eta^a_{\mu\nu}$ are the ’t Hooft symbols [2].⁴ In (1) $x_0$ and ρ are the so-called
location and size of the instanton, respectively. They are not fixed by the YM
classical e.o.m., neither is the orientation of the instanton gauge field in colour
space. Consequently, the derivatives of the instanton solution with respect to
each one of these parameters (collective coordinates [8]) will give rise to zero
modes of the operator associated with the quadratic fluctuations of the gauge
field in the instanton background [2, 9, 10] (see Appendix B for details).
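The duality properties of the ’t Hooft symbols entering (1) can be checked by brute force. The Python sketch below uses the standard component definition from ’t Hooft's paper (η equal to the three-dimensional Levi-Civita symbol on spatial indices, Kronecker deltas on mixed indices, with the mixed components flipped in sign for η̄) together with the orientation ε₁₂₃₄ = +1. These conventions are our assumption and may differ by signs from those of Appendix A; the function names are illustrative.

```python
def levi_civita(*idx):
    """Totally antisymmetric symbol with eps(1, 2, ..., n) = +1."""
    n = len(idx)
    if sorted(idx) != list(range(1, n + 1)):
        return 0
    sign = 1
    for i in range(n):
        for j in range(i + 1, n):
            if idx[i] > idx[j]:
                sign = -sign
    return sign


def eta(a, mu, nu, bar=False):
    """'t Hooft symbol eta^a_{mu nu} (bar=False) or eta-bar^a_{mu nu} (bar=True)."""
    s = -1 if bar else 1
    if mu <= 3 and nu <= 3:
        return levi_civita(a, mu, nu)
    if mu <= 3 and nu == 4:
        return s * (1 if a == mu else 0)
    if mu == 4 and nu <= 3:
        return -s * (1 if a == nu else 0)
    return 0


def dual(a, mu, nu, bar=False):
    """(1/2) eps_{mu nu rho sigma} eta^a_{rho sigma}; the sum is always even."""
    total = sum(levi_civita(mu, nu, r, s) * eta(a, r, s, bar)
                for r in range(1, 5) for s in range(1, 5))
    return total // 2


def has_duality(bar, sign):
    """True if dual(eta) == sign * eta for every index combination."""
    return all(dual(a, m, n, bar) == sign * eta(a, m, n, bar)
               for a in range(1, 4) for m in range(1, 5) for n in range(1, 5))


def full_contraction(bar=False):
    """Sum over a, mu, nu of (eta^a_{mu nu})^2."""
    return sum(eta(a, m, n, bar) ** 2
               for a in range(1, 4) for m in range(1, 5) for n in range(1, 5))


if __name__ == "__main__":
    print("eta self-dual:", has_duality(False, +1))
    print("eta-bar anti-self-dual:", has_duality(True, -1))
    print("eta^a_{mu nu} eta^a_{mu nu} =", full_contraction())
```

With these conventions η is self-dual, η̄ is anti-self-dual, and the full contraction of either symbol with itself equals 12.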
The winding number of a gauge configuration can be expressed in terms
of the associated field strength through the (gauge invariant) formula

\[
  K = \frac{g^2}{32\pi^2}\int d^4x\, F^a_{\mu\nu}\tilde F^a_{\mu\nu} . \tag{2}
\]

For the action of a self-dual (or anti-self-dual) instanton configuration one
then gets

\[
  S^I = \frac{8\pi^2}{g^2}\,|K| . \tag{3}
\]
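Formula (2) can be checked numerically for the one-instanton solution. The sketch below assumes the standard gauge-invariant density of the K = 1 SU(2) instanton, $F^a_{\mu\nu}F^a_{\mu\nu} = 192\,\rho^4/[g^2 (y^2+\rho^2)^4]$ with y = x − x₀, which is not written out in this section; for a self-dual field it coincides with $F^a_{\mu\nu}\tilde F^a_{\mu\nu}$. The four-dimensional integral reduces to a radial one (the solid angle is 2π²), evaluated here with a midpoint rule after mapping the half-line onto the unit interval. The function name and the step count are ours.

```python
def winding_number(rho, n_steps=2000):
    """Midpoint-rule estimate of (g^2 / 32 pi^2) * Int d^4x F F-dual for K = 1.

    The coupling g cancels between the prefactor and the density, and the
    factors of pi cancel against the solid angle 2 pi^2, leaving
    K = 12 rho^4 * Int_0^inf dr r^3 / (r^2 + rho^2)^4.
    """
    h = 1.0 / n_steps
    radial = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h            # midpoint in (0, 1)
        r = rho * t / (1.0 - t)      # maps (0, 1) onto (0, infinity)
        jac = rho / (1.0 - t) ** 2   # dr/dt
        radial += r ** 3 / (r * r + rho * rho) ** 4 * jac * h
    return 12.0 * rho ** 4 * radial


if __name__ == "__main__":
    for rho in (0.3, 1.0, 5.0):
        print(rho, winding_number(rho))
```

The estimate is independent of the instanton size ρ and reproduces K = 1, so that (3) gives the familiar one-instanton action 8π²/g².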
The topological nature of (1) can be made clearer by first recasting it in
the form ($y \equiv x - x_0$)

\[
  A^I_\mu = \frac{i}{g}\,\frac{\rho^2}{y^2+\rho^2}\,\big[\Omega^{(1)\dagger}\partial_\mu\Omega^{(1)}\big](y) , \tag{4}
\]

where (see (A.14))

\[
  \big[\Omega^{(1)\dagger}\partial_\mu\Omega^{(1)}\big](x) = -\bar\sigma_{\mu\nu}\,\frac{x_\nu}{x^2} . \tag{5}
\]

In the previous equations

\[
  \Omega^{(1)}(x) = \sigma_\mu\,\frac{x_\mu}{\sqrt{x^2}} \tag{6}
\]

is a topologically non-trivial SU(2) gauge transformation, since it does not
tend to the group identity as $\sqrt{x^2}$ tends to infinity. To compute the winding
number of the gauge configuration (4), it is convenient to gauge transform it
by the transformation $\Omega^{(1)}$ itself. One gets in this way

\[
\begin{aligned}
  (A^I_\mu)_{\rm N.S.} = \big(A^I_\mu\big)^{\Omega^{(1)}}
    &= \frac{i}{g}\,\frac{y^2}{y^2+\rho^2}\,\big[\Omega^{(1)}\partial_\mu\Omega^{(1)\dagger}\big](y) \\
    &= -\frac{i}{g}\,\sigma_{\mu\nu}\,\frac{y_\nu}{y^2}\,\frac{y^2}{y^2+\rho^2} .
\end{aligned} \tag{7}
\]

From the second equality we see that $(A^I_\mu)_{\rm N.S.}$ tends at infinity to a non-trivial
pure gauge. Inserting (7) into (2), one gets the expected result, K = 1.
The form (1) (or (4)) of the one-instanton field is called “singular” because
the point where the non-vanishing contribution to the action integral comes
from is at $x = x_0$, unlike the “non-singular” form (7) in which this point has
been brought to infinity.

³ In the following adjoint gauge (colour) indices will be indicated with early Latin
letters, a, b, c, . . ., and vector indices by middle Latin letters, i, j, k, . . .. Thus for
an $SU(N_c)$ gauge group we will have a, b, c, . . . = 1, 2, . . . , $N_c^2 - 1$ and i, j, k, . . . =
1, 2, . . . , $N_c$. Further notations are summarised in Appendix A.
⁴ They relate the generators of one of the two SU(2) groups, in which the Euclidean
Lorentz group, SO(4), can be decomposed ($SO(4) \sim SU_L(2) \times SU_R(2)$), to the
generators of the latter through the formula $\Sigma^L_{\mu\nu} = \frac{1}{2}\bar\eta^a_{\mu\nu}\sigma_a$ with $\sigma_a$ the Pauli
matrices. The similar coefficients for the other SU(2) group are the $\eta^a_{\mu\nu}$ symbols with
$\Sigma^R_{\mu\nu} = \frac{1}{2}\eta^a_{\mu\nu}\sigma_a$.
We recall in this context that the $F^a_{\mu\nu}\tilde F^a_{\mu\nu}$ density can be locally rewritten
as the divergence of a gauge non-invariant vector through the formula

\[
  F^a_{\mu\nu}\tilde F^a_{\mu\nu} = 2\,\partial_\mu K_\mu , \qquad
  K_\mu = \epsilon_{\mu\nu\rho\sigma}\,{\rm Tr}\Big[A_\nu F_{\rho\sigma} + \frac{2ig}{3}\,A_\nu A_\rho A_\sigma\Big] . \tag{8}
\]

Thus (2) can be written

\[
  K = \frac{g^2}{16\pi^2}\int_{S^3_\infty} dS_\mu\, K_\mu , \tag{9}
\]

where $S^3_\infty$ is the three-sphere at infinity in $S^4$. Since, as we noticed above,
$(A^I_\mu)_{\rm N.S.}$ tends to a pure gauge at infinity and hence its field strength vanishes,
(9) can be cast in the very expressive form

\[
  K = \frac{1}{24\pi^2}\int_{S^3_\infty} dS_\mu\,\epsilon_{\mu\nu\rho\sigma}\,
      {\rm Tr}\big[\Omega^\dagger\partial_\nu\Omega\;\Omega^\dagger\partial_\rho\Omega\;\Omega^\dagger\partial_\sigma\Omega\big] , \tag{10}
\]

in which we recognise the Cartan–Maurer formula. In general terms this quantity
is an integer, which represents the winding number of the SU(2) gauge
transformation, Ω.⁵

⁵ It can be explicitly proved that, setting $\Omega_h = \exp[iT^a h^a]$, K is invariant under the
infinitesimal deformations $h^a \to h^a + \delta h^a$. Thus K only depends on the homotopy
class to which $\Omega_h$ belongs and can always be normalised so as to be an integer.

2.2 Quantum Tunnelling

The existence of instanton solutions in YM can be interpreted as an indication
of quantum tunnelling between different vacua, the latter being pure gauge
configurations characterised by their winding numbers [2, 11, 12]. This fact
can be illustrated in quite a number of ways. An easy, but heuristic argument
is described below. A more sophisticated analysis is presented in Appendix C.
Consider again the asymptotic formula (9), in which, however, the closed
surface $S^3_\infty$ has been (smoothly) deformed to another closed surface, which
we take as a hyper-cylinder of length T, bounded at t = −T/2 and t = T/2
by three-dimensional compact spatial manifolds, $S^3$. In the limit T → ∞ (9)
can then be rewritten as the sum of three contributions

\[ K = \frac{ig^3}{24\pi^2}\lim_{T\to\infty}\Big\{ \int_{S^3} d^3x\,\epsilon_{4ijk}\,\mathrm{Tr}\big[A_iA_jA_k\big]\Big|_{t=T/2} - \int_{S^3} d^3x\,\epsilon_{4ijk}\,\mathrm{Tr}\big[A_iA_jA_k\big]\Big|_{t=-T/2} \]
\[ \qquad\qquad + \int_{-T/2}^{T/2}\!\int_{S_L} dS_i\,\epsilon_{i\nu\rho\sigma}\,\mathrm{Tr}\big[A_\nu A_\rho A_\sigma\big] \Big\} , \tag{11} \]
where $S_L$ is the three-dimensional lateral surface of the cylinder. It can be
proved [13] that one can always find a gauge where (1) $A_0 = 0$ on the lateral
surface $S_L$ and (2) the (time-independent) gauge transformations $\Omega_\pm(x)$ at
t = ±T/2 are such that at large |x| they are independent of the direction
x/|x|. Under these conditions the third term in the r.h.s. of (11) vanishes.
The other two terms take integer values because they represent the winding
number of the mapping $x \to \Omega_\pm(x)$ from the (manifold $R^3$ compactified to
the) $S^3$ sphere onto SU(2).⁶ The net result of these considerations is that (11)
can be cast in the form

\[ K = n_+ - n_- , \tag{12} \]

\[ n_\pm = -\frac{1}{24\pi^2}\,\epsilon_{ijk}\int_{S^3} d^3x\,\mathrm{Tr}\big[\Omega_\pm^\dagger\partial_i\Omega_\pm\;\Omega_\pm^\dagger\partial_j\Omega_\pm\;\Omega_\pm^\dagger\partial_k\Omega_\pm\big](x) , \tag{13} \]

which shows that the instanton solution (K ≠ 0) interpolates between vacuum
states (pure gauge configurations) with different winding numbers.
We refer the reader to Appendix C for a more rigorous discussion of
instanton-induced tunnelling effects in a YM theory.

⁶ As an example of such gauge transformations one can take
$\Omega(x) = \exp\big[i\pi\,\sigma\cdot x/\sqrt{x^2+1}\,\big]$, in which all the points at large |x| are mapped into the
group element −1. In this way the three-dimensional space manifolds at t = ±T/2
become topologically equivalent to $S^3$.
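
As a rough numerical illustration of the statement that such maps carry an integer winding number, the sketch below evaluates the degree of the "hedgehog" map of footnote 6, Ω(x) = exp[i f(r) σ·x/|x|] with f(r) = πr/√(r²+1). It assumes the standard reduction of the winding-number integral for a hedgehog profile, (2/π)∫ sin²f (df/dr) dr, which is not derived in the text, and it fixes the overall sign by hand, so it should be read as a check of |n| only. The function names are hypothetical.

```python
import math


def profile(r):
    """Radial profile f(r) of the map in footnote 6: f(0) = 0, f(inf) = pi."""
    return math.pi * r / math.sqrt(r * r + 1.0)


def winding_number(f, f_at_infinity, steps=100_000):
    """Degree of the hedgehog map exp[i f(r) sigma.x/|x|] from S^3 to SU(2).

    Uses the assumed radial formula (2/pi) * integral of sin^2(f) df,
    evaluated as a midpoint Stieltjes sum on the grid r = t / (1 - t),
    with t uniform in [0, 1]; the last node (r = infinity) uses f_at_infinity.
    """
    total = 0.0
    f_prev = f(0.0)
    for i in range(1, steps + 1):
        if i < steps:
            t = i / steps
            f_next = f(t / (1.0 - t))
        else:
            f_next = f_at_infinity
        f_mid = 0.5 * (f_prev + f_next)
        total += math.sin(f_mid) ** 2 * (f_next - f_prev)
        f_prev = f_next
    return 2.0 / math.pi * total


if __name__ == "__main__":
    print(winding_number(profile, math.pi))
```

Because the profile runs from 0 to π, the image covers SU(2) exactly once, which is why the integral comes out as a unit winding number.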

2.3 Introducing Fermions

In this subsection we want to briefly recall some elementary facts on how
to deal with fermions in the functional language in general and in the semi-classical
approximation in particular.

Fermionic Functional Integration

When fermions are introduced it is necessary to define integration rules for
Grassmann variables. This is a beautiful piece of mathematics of which a
simple account can be found in [14]. The well-known results of this analysis
can be summarised as follows.
(1) The functional integration over the degrees of freedom of a Dirac
fermion belonging to the representation R of the gauge group has the effect
of adding to the gauge action a contribution which is formally given by

\[ \log\Big\{ \int \mathcal{D}\mu[R](\psi,\bar\psi)\,\exp\Big[\int d^4x\,\big(\bar\psi\, i\gamma_\mu D_\mu[R]\,\psi\big)(x)\Big] \Big\} = \log\det\big[i\gamma_\mu D_\mu[R]\big] = \mathrm{Tr}\,\log\big[i\gamma_\mu D_\mu[R]\big] , \tag{14} \]

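where

\[ iD_\mu[R]\,\gamma_\mu = iD_\mu[R] \begin{pmatrix} 0 & \sigma_\mu \\ \bar\sigma_\mu & 0 \end{pmatrix} \tag{15} \]

and the matrices $\sigma_\mu$ and $\bar\sigma_\mu$ are defined in Appendix C.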
(2) For a Weyl fermion, which has half the degrees of freedom of a Dirac
fermion (cf. (15) and (A.12)), there is the subtlety that the Dirac–Weyl op-
erator maps dotted indices into undotted ones, thus making problematic the
definition of a determinant for such an operator. In the literature many pre-
scriptions have been proposed to address this issue in a rigorous way (see [15]
and works quoted therein). In practice this difficulty is not relevant,
because one can always factor out the free operator and compute
the determinant of the resulting operator, which is perfectly well defined [16].
The contribution from the free part is obviously irrelevant in the computation
of Green functions as it will cancel with an identical contribution from the nor-
malisation factor (see (A.1)). Loosely speaking, looking at (15) and (A.12),
we may say that ceteris paribus the contribution of a Weyl fermion to the
functional integral is the “square root” of that of a Dirac fermion.

Fermionic Zero Modes

In computing the fermionic functional integral one is led to consider the
decomposition of the associated spinor fields in eigenstates of the fermionic
kinetic operator. As is well known, the existence of zero modes in certain
background gauge fields, such as instantons, is of particular relevance
for non-perturbative calculations both in ordinary and supersymmetric field
theories [2, 4, 5].
The number of zero modes of the Dirac operator in an external field is con-
trolled by the famous Atiyah–Singer index theorem [17]. The theorem states
that “the index of the Dirac operator (15), i.e. the difference between the
number of left-handed (nL ) and right-handed (nR ) zero modes, is equal to
twice the Dynkin index of the representation R times the Pontrjagin number
of the background gauge field”. In formulae we write
ind(Dμ [R]γμ ) ≡ nL − nR = 2[R]K , (16)
where [R] is the Dynkin index of the representation R. Let us now consider
some interesting applications of this theorem.
1. Weyl fermion in the adjoint representation, Adj, of the gauge group. We
must distinguish between the left-handed (Dμ [Adj]σ̄μ ) and the right-
handed (Dμ [Adj]σμ ) Weyl operator. In the first case nR = 0 and the
formula (16) becomes
ind(Dμ [Adj]σ̄μ ) = nL = 2Nc K , (17)
because [Adj] = Nc . Since obviously nL is a non-negative number, there
can exist zero modes of the left-handed Weyl operator only if the classical
background instanton field has positive winding number, K > 0. Similarly
for the right-handed Weyl operator one gets
ind(Dμ [Adj]σμ ) = −nR = 2Nc K , (18)
implying that there can be zero modes only if K < 0.
2. Actually the number of zero modes in the adjoint representation of any
compact Lie group, G, is always given by twice the value of the quadratic
Casimir operator, 2c2(Adj(G)). This result follows from the formula [18]
\[ \mathrm{Adj}(G) = \mathrm{Adj}(SU(2)) \oplus n(G)\,\mathbf{2} \oplus s(G)\,\mathbf{1} , \tag{19} \]
which expresses how the adjoint representation of G can be decomposed
into irreducible representations of SU(2). In (19) we have introduced the
definitions
\[ n(G) = 2\,\big(c_2(\mathrm{Adj}(G)) - 2\big) , \tag{20} \]
\[ s(G) = d(\mathrm{Adj}(G)) - 4\,c_2(\mathrm{Adj}(G)) + 5 , \tag{21} \]
where d(Adj(G)) is the dimension of the adjoint representation of G. The
SU(2) adjoint contributes 4 zero modes and each doublet contributes one, so the
number of zero modes is 4 + 2(c2(Adj(G)) − 2) = 2c2(Adj(G)).
3. Dirac fermion in the fundamental representation, Nc , of the gauge group.
Equation (16) with [Nc ] = 1/2 gives
ind(Dμ [Nc ]γμ ) = nL − nR = K . (22)
Again in the classical background instanton field there can be either left-
handed fermionic zero modes, if K > 0, or right-handed ones, if K < 0.

4. Fermion in the rank two anti-symmetric representation Nc(Nc − 1)/2.
The number of zero modes (of definite chirality) is (Nc − 2)K.
Deriving the explicit expression of all these fermionic zero modes is beyond the
scope of this review and we refer the interested reader to the general methods
that, starting from the seminal paper of [19], have been developed in the
literature [6, 20]. However, for completeness we give in Appendix A their explicit
expression for the few cases most relevant for this review.
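
The counting rules listed above can be collected in a few lines of code. The sketch below is only an illustration: it takes the Dynkin indices quoted or implied in the text for SU(Nc) ([Nc] = 1/2, [Adj] = Nc, and (Nc − 2)/2 for the rank-two anti-symmetric representation, as implied by item 4), assumes c2(Adj) = Nc for SU(Nc), and checks that the dimensions in the SU(2) decomposition of the adjoint add up. All function names are hypothetical.

```python
from fractions import Fraction


def dynkin_index(rep, nc):
    """Dynkin index [R] for a few SU(nc) representations (normalised so [Nc] = 1/2)."""
    table = {
        "fundamental": Fraction(1, 2),
        "adjoint": Fraction(nc),
        "antisymmetric": Fraction(nc - 2, 2),  # implied by item 4: 2[R]K = (Nc - 2)K
    }
    return table[rep]


def zero_mode_index(rep, nc, k):
    """Index n_L - n_R = 2 [R] K of formula (16)."""
    return 2 * dynkin_index(rep, nc) * k


def su2_decomposition(nc):
    """Multiplicities n(G), s(G) of (20)-(21) for G = SU(nc), assuming c2(Adj) = nc."""
    c2 = nc
    dim_adj = nc * nc - 1
    n_doublets = 2 * (c2 - 2)
    n_singlets = dim_adj - 4 * c2 + 5
    return n_doublets, n_singlets, dim_adj


if __name__ == "__main__":
    for nc in (2, 3, 4, 5):
        n, s, d = su2_decomposition(nc)
        # One SU(2) triplet, n doublets and s singlets must fill the adjoint of SU(nc).
        assert 3 + 2 * n + s == d
        # 4 zero modes from the triplet plus one per doublet equals 2 c2 = 2 nc.
        assert 4 + n == zero_mode_index("adjoint", nc, 1)
        print(nc, zero_mode_index("adjoint", nc, 1), zero_mode_index("fundamental", nc, 1))
```

The dimension check is what fixes the multiplicity of the SU(2) adjoint in the decomposition to one.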

2.4 Putting Together Fermion and Boson Contributions

As we said, we are interested in computing expectation values of (multi-local)
gauge-invariant operators, by dominating the functional integral with the
semi-classical contributions coming from the non-trivial minima (instantons)
of the Euclidean action. The obvious question is whether this computational
strategy leads to a reliable estimate of ⟨O⟩.

The General Case

In order to prepare ourselves for this analysis, let us write down the formal
result obtained by performing the integration over the quadratic fluctuations
(semi-classical approximation, s.c.) around an instanton solution with winding
number K. Including also the fermionic contribution in (B.17) and assuming
for simplicity that there are no scalar fields in the theory,⁷ one gets

\[ \langle O\rangle\Big|_{\rm s.c.} = \mu^{\,n_B-\kappa_F n_F}\,\frac{e^{-\frac{8\pi^2}{g^2}|K|}}{Z|_{\rm s.c.}} \int\prod_{j=1}^{n_F} dc_j \tag{23} \]
\[ \qquad \times \int\prod_{i=1}^{n_B} d\beta_i\,\frac{\|a^{(i)}\|}{\sqrt{2\pi}}\; \frac{\big(\det{}'[M^{\rm g.f.}_{\mu\nu}]\big)^{-\frac12}\,\det[-D^2(A^I)]\,\det{}'[\slashed{D}(A^I)]}{\big(\det[M^{\rm g.f.}_{0;\mu\nu}]\big)^{-\frac12}\,\det[-\partial^2]\,\det[\slashed{\partial}]}\; O(c;A^I) , \]

where $\slashed{D}$ is the fermionic kinetic operator appropriate to the kind of fermion
one is dealing with (Dirac or Weyl) and the prime on the determinants
means that only non-zero eigenvalues are to be included. Further
observations about this formula are the following.
• The factor κF is 1 for a Dirac fermion and 1/2 for a Weyl fermion.
• The residual fermionic integration $\prod_{j=1}^{n_F} dc_j$ is over the Grassmannian
coefficients associated with the nF zero modes of the fermionic kinetic operator.
We stress that in order not to get a trivially vanishing result, the
Berezin [14] integration rules require a perfect matching in the number of
fermionic zero modes between those of the fermion operators in the action
and those contained in the operator O.
• The extra μ dependence in front of the r.h.s. of (23) (with respect to
(B.17)) is due (similarly to the case of the bosonic functional integration, see
Appendix B) to unmatched μ factors coming from the determinant of the
fermion Pauli–Villars (PV) regulators.⁸
• The power κF nF is dictated by the way in which zero modes contribute
to the fermionic mass term in the action and the nature of the Grassmannian
integration rules.
• No further factor comes from dealing with the fermionic zero modes,
provided they are normalised to one, which we will always do (this is at
variance with what happens for bosonic zero modes, each of which contributes
a factor $\|{\rm norm}\|/\sqrt{2\pi}$ to (23)).
• Generally speaking, the ratio of determinants in (23) will be a function
of the instanton collective coordinates as well as μ.

⁷ The extension of the formulae of this section to the more general case where also
elementary scalar fields are present is possible, but not completely trivial. See
below and [4, 6, 21, 22, 23].
The computational strategy outlined above can be safely used if it can be
convincingly argued that the classical minima (instantons) really dominate the
integral. This is a delicate issue which can only be settled on a case by case
basis. For instance, for the instanton contribution to dominate the functional
integral one can imagine considering Green functions that are zero in perturbation
theory. The argument here is that otherwise the non-perturbative instanton
contribution, which is proportional to exp(−8π²|K|/g²), would represent
a completely negligible correction with respect to any perturbative term. This
is the situation one is usually dealing with in N = 1 supersymmetric theories.
In the N = 2 and N = 4 cases it is, instead, interesting to consider more
general correlators which do not necessarily vanish in perturbation theory.
In these cases instanton contributions, though comparatively exponentially
small, can always be “tracked” if the theory ϑ-dependence is followed (see
Appendix C).
A second crucial question concerns the finiteness of the r.h.s. of (23). In
his beautiful paper ’t Hooft [2] has shown that in QCD the integration over
the instanton collective coordinates around the classical instanton solution
Aμ = AIμ , all other fields equal to zero (24)
does not lead to a finite result. The reason behind this fact is that the
integration over the size of the instanton, ρ, which comes from the ratios of
determinants in (23) as well as from the norm of the bosonic zero modes,
diverges in the infrared limit, i.e. for large values of ρ (the integration near
ρ = 0 is, instead, convergent thanks to asymptotic freedom). This problem is
not present in the supersymmetric case which we discuss next.

⁸ In order to have a more readable formula we have not shown in (23) the
determinants of the various PV regulators.

The Supersymmetric Case

Something really surprising indeed happens in the case of a supersymmetric
theory. There, irrespective of the details of the theory (number of
supersymmetries, gauge group, matter content, etc.), the whole ratio of
(regularised) determinants is always exactly equal to 1 [24]. This is because the
eigenvalues of the various kinetic operators in the instanton background are,
up to multiplicities, essentially all equal and, due to supersymmetry, there is
a perfect matching between bosonic and fermionic degrees of freedom, leading
to contributions that are inverses of each other. The formula (23) thus
becomes
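\[ \langle O\rangle\Big|^{\rm SUSY}_{\rm s.c.} = \mu^{\,n_B-\frac12 n_F}\,\frac{e^{-\frac{8\pi^2}{g^2}|K|}}{Z|_{\rm s.c.}} \int\prod_{i=1}^{n_B} d\beta_i\,\frac{\|a^{(i)}\|}{\sqrt{2\pi}} \sum_{\{j_k\}} (-1)^{P\{j_k\}}\, O\Big(\prod_k f_{j_k};A^I\Big) , \tag{25} \]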

where we have explicitly carried out the final integration over the Grassmannian
variables $c_j$, j = 1, 2, . . . , nF. As a result the product of the nF fermionic
fields contained in O is simply replaced by the product of the wave functions,
$f_j(x, \beta)$, of the nF zero modes. The sum over permutations is weighted by
alternating signs because of Fermi statistics. Finally, we have set κF = 1/2,
because in supersymmetric theories fermions are always introduced as Weyl
particles.
Actually in the case K = 1 (25) can be made even more explicit, because
for a gauge invariant operator the only dependence on the collective
coordinates is that on the size and position of the instanton. Using (B.18) of
Appendix B and the coset integration formula (derived in [9]) necessary for
the generalisation to the case of the SU(Nc) group, one gets to leading order
in g (where $Z|_{\rm s.c.} = 1$)

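\[ \langle O\rangle\Big|^{\rm SUSY}_{\rm s.c.} = V_{N_c}\,\mu^{\,4N_c-\frac12 n_F}\,\frac{e^{-\frac{8\pi^2}{g^2}}}{(g^2)^{2N_c}} \int d^4x_0\,\frac{d\rho}{\rho^5}\,(\rho^2)^{2N_c} \sum_{\{j_k\}} (-1)^{P\{j_k\}}\, O\Big(\prod_k f_{j_k};A^I\Big) , \tag{26} \]

with

\[ V_{N_c} = \frac{4}{\pi^2}\,\frac{(4\pi^2)^{2N_c}}{(N_c-1)!\,(N_c-2)!} . \tag{27} \]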
Supersymmetry has in store another surprise for us. Recalling the multiplicity
of the fermionic zero modes as given by the Atiyah–Singer theorem (see
Appendix A), one finds that for a supersymmetric theory

\[ 4N_c - \frac12\,n_F = b_1 , \tag{28} \]
\[ \beta = -\frac{g^3}{16\pi^2}\,b_1 + O(g^5) , \tag{29} \]

where b1 > 0 is the first coefficient of the Callan–Symanzik β-function. To
prove (28) we recall the general formula

\[ b_1 = \frac{11}{3}\,[\mathrm{Adj}] - \frac{2}{3}\sum_{R_F} n_{R_F}[R_F] - \frac{1}{3}\sum_{R_B} n_{R_B}[R_B] , \tag{30} \]

where $n_{R_F}$ and $n_{R_B}$ are the numbers of fermions and bosons in the
representations $R_F$ and $R_B$, respectively. Since in a supersymmetric theory each fermion
is accompanied by a bosonic partner belonging to the same representation R,
(30) simplifies to

\[ b_1 = 3\,[\mathrm{Adj}] - \sum_R n_R[R] = 3N_c - \sum_R n_R[R] , \tag{31} \]

with $n_R$ the number of chiral superfields in the representation R. To be able
to compare b1 in the above equation with the combination that appears in the
l.h.s. of (28) we make use of the Atiyah–Singer theorem (see (16)). Separating
out the contribution due to gluinos (the fermions in the gauge supermultiplet),
which accounts for a 2Nc/2 contribution, we can write the l.h.s. of (28) in the
form

\[ 4N_c - \frac12\,n_F = 4N_c - N_c - \frac12\,2\sum_R n_R[R] = 3N_c - \sum_R n_R[R] , \tag{32} \]

in agreement with (31).
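
The equality between (28) and (31) can be checked mechanically for a concrete matter content. The sketch below does this for SQCD, under the assumption (taken from the usual definition of the theory, not spelled out in this section) that there are Nf chiral superfields in the fundamental and Nf in the anti-fundamental representation, each with Dynkin index 1/2. It evaluates b1 from the general formula (30) and compares it with 4Nc − nF/2, where nF is counted with the index theorem (16) at K = 1. The helper names are hypothetical.

```python
from fractions import Fraction


def b1_from_content(nc, weyl_fermions, complex_scalars):
    """General formula (30); each entry is (multiplicity, Dynkin index)."""
    fermion_sum = sum(n * index for n, index in weyl_fermions)
    scalar_sum = sum(n * index for n, index in complex_scalars)
    return Fraction(11, 3) * nc - Fraction(2, 3) * fermion_sum - Fraction(1, 3) * scalar_sum


def sqcd_check(nc, nf):
    """Return (4 Nc - n_F / 2, b1) for SQCD with nf flavours at K = 1."""
    half = Fraction(1, 2)
    # Gluino in the adjoint; 2*nf quark Weyl fermions (assumed: nf fundamental + nf anti-fundamental).
    fermions = [(1, Fraction(nc)), (2 * nf, half)]
    # Squarks: the scalar partners of the quarks, in the same representations.
    scalars = [(2 * nf, half)]
    b1 = b1_from_content(nc, fermions, scalars)
    # Index theorem (16): each Weyl fermion in representation R has 2 [R] K zero modes.
    n_zero_modes = sum(n * 2 * index for n, index in fermions)
    lhs = 4 * nc - Fraction(1, 2) * n_zero_modes
    return lhs, b1


if __name__ == "__main__":
    for nc, nf in [(2, 1), (3, 2), (5, 4)]:
        lhs, b1 = sqcd_check(nc, nf)
        print(nc, nf, lhs, b1)
```

For these inputs both numbers reduce to 3Nc − Nf, the exponent of Λ that appears in the SQCD Green functions considered later.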

The interesting consequence of this equality is that we can combine the
exponential of the instanton action with the explicit μ dependence to form
the renormalisation group-invariant Λ-parameter of the theory. Introducing
the running coupling g(μ), we can thus write
\[ \mu^{\,4N_c-\frac12 n_F}\; e^{-\frac{8\pi^2}{g(\mu)^2}} = \Lambda^{b_1} . \tag{33} \]

We will exploit this key observation in the following, making it more precise
(Sect. 4.1).

3 Chiral and Supersymmetric Ward–Takahashi Identities

Before embarking on explicit instantonic calculations of correlators, we want to
spell out the constraints imposed on correlators by chiral and supersymmetric
Ward–Takahashi identities (WTIs). We will show that in some interesting
cases, when these “geometric” constraints are coupled to the requirement of
renormalisability, the expression of certain Green functions is (up to
multiplicative numerical constants) completely fixed.
The special Green functions which enjoy this amazing property are the
n-point correlation functions of lowest (highest) components of chiral
(anti-chiral) gauge-invariant composite superfields. Although this is a very limited
set of correlators, we will see that their knowledge, when used in conjunction
with clustering, is sufficient to draw interesting non-perturbative information
about the structure of the vacuum and of its symmetry properties. For this
reason, in this section we will limit our consideration to such correlators. We
will in particular concentrate on the case of N = 1 super QCD (SQCD) (see
Appendix A for notations) with the purpose of exploring the properties of a
sufficiently general theory in which also mass terms can be present.
WTIs provide relations among different Green functions. They will be
worked out under the assumption that supersymmetry is not spontaneously
(or explicitly) broken, i.e. under the assumption that the vacuum of the the-
ory is annihilated by all the generators of supersymmetry. Our philosophy
will be that, if we find that some dynamical calculation turns out to be in
contradiction with constraints imposed by supersymmetry, then this should
be interpreted as evidence for spontaneous supersymmetry breaking.
As we explained above, we are now going to consider the n-point Green
functions

\[ G(x_1,\dots,x_n) = \langle 0|T\big(\chi_1(x_1)\dots\chi_n(x_n)\big)|0\rangle , \tag{34} \]

where each $\chi_k(x_k)$ is a local gauge-invariant operator made out of a product
of lowest components of the fundamental chiral superfields of the theory. Thus
the operators $\chi_k$ are themselves lowest components of some composite chiral
superfield, $X_k$, for which we formally have the expansion

\[ X_k(x) = \chi_k(y) + \sqrt{2}\,\theta^\alpha\psi_{\alpha k}(y) + \theta^2 F_k(y) , \tag{35} \]
\[ y_\mu = x_\mu + i\theta^\alpha\sigma_{\mu\,\alpha\dot\alpha}\bar\theta^{\dot\alpha} . \tag{36} \]

On these fields the Q and Q̄ generators of supersymmetry act as “raising”
and “lowering” operators according to the (anti-)commutation rules

\[ [\bar Q_{\dot\alpha},\chi_k(x)] = 0 , \qquad \{\bar Q_{\dot\alpha},\psi_k^{\alpha}(x)\} = \sqrt{2}\,\bar\sigma_\mu^{\dot\alpha\alpha}\,\partial_\mu\chi_k(x) , \tag{37} \]
\[ [Q_\alpha,F_k(x)] = 0 , \qquad \{Q_\alpha,\psi_{k\beta}(x)\} = \sqrt{2}\,\epsilon_{\alpha\beta}\,F_k(x) . \tag{38} \]

3.1 Space–time Dependence

The independence of the correlators of the form (34) from space–time arguments
immediately follows from the (anti-)commutation relations (37). Taking,
in fact, the derivative of G with respect to $x_\ell$ and contracting with $\sqrt{2}\,\bar\sigma_\mu^{\dot\alpha\alpha}$,
one gets

\[ \sqrt{2}\,\bar\sigma_\mu^{\dot\alpha\alpha}\,\frac{\partial}{\partial x_{\ell\mu}}\,G(x_1,\dots,x_n) = \langle 0|T\big(\chi_1(x_1)\dots\{\bar Q_{\dot\alpha},\psi_\ell^{\alpha}(x_\ell)\}\dots\chi_n(x_n)\big)|0\rangle = 0 . \tag{39} \]

The last equality is a consequence of the fact that Q̄ can be freely (first
commutation rule in (37)) brought to act on the vacuum state at the beginning and
at the end of the string of $\chi_k$ operators and that, under the assumption that
supersymmetry is unbroken, Q̄|0⟩ = 0. Contributions coming from the derivative
acting on the θ-functions that prescribe the time ordering of operators in
G are zero because they give rise to the vanishing equal time commutators,
$[\chi_\ell(\mathbf{x}_\ell,t_\ell),\chi_k(\mathbf{x}_k,t_k)]\,\delta(t_k-t_\ell) = 0$. Equation (39) proves the constancy of G.
A similar result clearly holds for n-point correlators, G∗, where only lowest
components of anti-chiral superfields are inserted.
We end this section with the important observation that all these cor-
relators vanish identically in perturbation theory. Only non-perturbative
instanton-like contributions can make them non-zero.

3.2 Mass and g Dependence

The following further properties hold for correlators of lowest components of
chiral, G, (or anti-chiral, G∗) superfields [25, 26, 4]:
(a) G is an analytic function of the complex mass parameters $m_f$, i.e. it
does not depend on $m_f^*$ (the opposite being true for G∗).
(b) The mass dependence of G (and G∗ ) is completely fixed.
(c) When renormalisation group-invariant operators are inserted, the de-
pendence upon the coupling constant is, in a mass-independent renormalisa-
tion scheme, fully accounted for by the renormalisation group-invariant (RGI)
quantities Λ and [mf ]inv ≡ m̂f (see below (51)).
It is important to remark that properties of this kind can be readily ex-
ported to the generating functional of Green functions, as they only follow
from symmetry principles. They provide strong constraints on the form of
the associated effective action. A celebrated example of application of this
observation, though in a different context, can be found in the construction
of the low-energy effective action that describes the interaction of pions in
QCD [27, 28]. In supersymmetric theories the invariances are so tight that
often the full expression of the effective superpotential is completely deter-
mined [21, 29, 30, 31] (see Sect. 5).

(a) Mass Analyticity

The statement (a) follows from the supersymmetric relation (no sum over f)

\[ m_f^*\,\frac{\partial}{\partial m_f^*}\,G(x_1,\dots,x_n) = m_f^*\,\langle 0|T\Big(\chi_1(x_1)\dots\chi_n(x_n)\int d^4x\,F^{*f}_f(x)\Big)|0\rangle \]
\[ \qquad = m_f^*\int d^4x\,\langle 0|T\big(\chi_1(x_1)\dots\chi_n(x_n)\,\{\bar Q_{\dot\alpha},\bar\psi^{f\dot\alpha}_f(x)\}\big)|0\rangle = 0 , \tag{40} \]

with $F^{*f}_f$ the auxiliary field of the anti-chiral superfield $T^{*f}_f = (\chi^{*f}_f, \bar\psi^{f}_f, F^{*f}_f)$.
The first equality follows from the fact that, before the auxiliary field is
eliminated by the e.o.m., $F^{*f}_f$ is the coefficient of $m_f^*$. The second is a consequence
of the complex conjugate of the anti-commutation relation in (38). Finally,
since Q̄ commutes with the $\chi_k$’s, it can be brought in contact with the vacuum
state which is thus annihilated.

(b) Mass Dependence

In order to simplify this analysis we restrict to Green functions where only
the gauge-invariant composite operators

\[ \frac{g^2}{32\pi^2}\,\lambda^{\alpha a}(x)\lambda^a_\alpha(x) \equiv \frac{g^2}{32\pi^2}\,\lambda\lambda(x) , \tag{41} \]
\[ \tilde\phi^{f}_{r}(x)\,\phi^{r}_{h}(x) \equiv \tilde\phi^f\phi_h(x) \tag{42} \]

are inserted. They are the lowest components of chiral superfields which will be
called S and $T^f_h$, respectively. Besides their obvious complex conjugate fields,
we will sometimes also consider the composite operators
$\tilde\psi^{\alpha f}_{r}(x)\,\psi^{r}_{h\alpha}(x) = \tilde\psi^f\psi_h(x)$. In general terms we will then consider correlators of the kind

\[ G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(x_1,\dots,x_p;x_{p+1},\dots,x_{p+q}) \tag{43} \]
\[ \qquad = \langle 0|T\Big(\tilde\phi^{f_1}\phi_{h_1}(x_1)\dots\tilde\phi^{f_p}\phi_{h_p}(x_p)\,\frac{g^2}{32\pi^2}\lambda\lambda(x_{p+1})\dots\frac{g^2}{32\pi^2}\lambda\lambda(x_{p+q})\Big)|0\rangle . \]
32π 32π
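The dependence of (43) upon the mass parameters can be established in the
following way. First of all we notice that from (40) we have

\[ m_f\,\frac{\partial}{\partial m_f}\,G^{(p,q)} = \Big(m_f\,\frac{\partial}{\partial m_f} - m_f^*\,\frac{\partial}{\partial m_f^*}\Big)G^{(p,q)} = \frac{1}{i}\,\frac{\partial}{\partial\alpha_f}\,G^{(p,q)} , \tag{44} \]

where we have set

\[ m_f = |m_f|\,e^{i\alpha_f} . \tag{45} \]

In order to compute the derivative in the r.h.s. of (44) we perform the
non-anomalous $U_{A_f}(1)$ transformation (see Appendix A)

\[ (\tilde\psi^h,\psi_h) \to e^{i\delta_{fh}\alpha_f/2}\,(\tilde\psi^h,\psi_h) , \qquad (\tilde\phi^h,\phi_h) \to e^{i(\delta_{fh}-1/N_c)\alpha_f/2}\,(\tilde\phi^h,\phi_h) , \qquad \lambda \to e^{-i\alpha_f/2N_c}\,\lambda , \tag{46} \]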

by means of which the $\alpha_f$ dependence of the action is eliminated and
transferred to the fields appearing in $G^{(p,q)}$. This allows one to carry out the
$\alpha_f$ derivative explicitly, leading to the result

\[ m_f\,\frac{\partial}{\partial m_f}\,G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p} = q^{(f),f_1,\dots,f_p}_{h_1,\dots,h_p}\;G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p} , \tag{47} \]

\[ q^{(f),f_1,\dots,f_p}_{h_1,\dots,h_p} = \frac{p+q}{N_c} - \frac12\sum_{\ell=1}^{p}\big(\delta_{f_\ell,f}+\delta_{h_\ell,f}\big) , \tag{48} \]

where $q^{(f)}$ is the sum of all the $U_{A_f}(1)$ charges of the operators contained in
$G^{(p,q)}$. The above differential equation is easily integrated and yields

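\[ \prod_{\ell=1}^{p}\big(m_{f_\ell}m_{h_\ell}\big)^{\frac12}\;G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p} = C^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(\mu,g)\,\prod_{\ell=1}^{N_f} m_\ell^{\frac{p+q}{N_c}} . \tag{49} \]

The μ dependence of $C^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(\mu,g)$ is trivially fixed by dimensional
analysis and one finds

\[ C^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(\mu,g) \propto \mu^{(p+q)(3-N_f/N_c)} . \tag{50} \]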
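
The dimensional analysis behind (50) is short enough to spell out as a small check. The sketch below assumes the canonical mass dimensions 2 for the scalar bilinear in (42), 3 for the gluino bilinear in (41) and 1 for each mass parameter; these assignments are standard but are not stated explicitly in the text. It then balances the two sides of (49). The function name is hypothetical.

```python
from fractions import Fraction


def mu_exponent(p, q, nc, nf):
    """Power of mu required in C(mu, g) by dimensional analysis of (49).

    Assumed canonical dimensions: [phi~ phi] = 2, [g^2 lambda lambda / 32 pi^2] = 3, [m] = 1.
    """
    dim_green_function = 2 * p + 3 * q          # p scalar bilinears, q gluino bilinears
    dim_mass_prefactor = p                      # product over l of (m_f m_h)^(1/2) on the l.h.s.
    dim_mass_product = Fraction(nf * (p + q), nc)  # nf masses, each to the power (p+q)/nc
    return dim_green_function + dim_mass_prefactor - dim_mass_product


if __name__ == "__main__":
    p, q, nc, nf = 2, 1, 3, 2
    print(mu_exponent(p, q, nc, nf), (p + q) * (3 - Fraction(nf, nc)))
```

For p = Nf and p + q = Nc the exponent reduces to 3Nc − Nf, the first coefficient of the SQCD β-function found in Sect. 2.4.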

(c) g Dependence

The g dependence of $C^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(\mu,g)$ is completely determined by
renormalisability. In fact, having factorised in the l.h.s. of (49) the mass factor
$\prod_{\ell=1}^{p}(m_{f_\ell}m_{h_\ell})^{1/2}$, which precisely serves the purpose of making the $\tilde\phi^f\phi_h$
operators in $G^{(p,q)}$ behave like RGI insertions, the rest of the g dependence must
all be expressed through the RGI quantities

\[ \Lambda = \mu\,\exp\Big(-\int^{g}\frac{dg'}{\beta(g')}\Big) , \qquad \hat m = m\,\exp\Big(-\int^{g}dg'\,\frac{\gamma_m(g')}{\beta(g')}\Big) , \tag{51} \]

where β and $\gamma_m(g)$ are the Callan–Symanzik function of the theory and
the mass anomalous dimension of the matter superfield, respectively. This
implies that the g dependence must be of the form
\[ C^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(\mu,g) \propto \exp\Big\{-(p+q)\int^{g}\frac{dg'}{\beta(g')}\Big[\Big(3-\frac{N_f}{N_c}\Big)+\gamma_m(g')\,\frac{N_f}{N_c}\Big]\Big\} , \tag{52} \]

in order to have

\[ \prod_{\ell=1}^{p}\big(m_{f_\ell}m_{h_\ell}\big)^{\frac12}\;G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p} = \Lambda^{(p+q)(3-N_f/N_c)}\,\prod_{\ell=1}^{N_f}\hat m_\ell^{\frac{p+q}{N_c}}\;t^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p} , \tag{53} \]

with $t^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}$ a dimensionless constant tensor in flavour space.
The form of $t^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}$ is strongly constrained (and sometimes completely
determined) by the pattern of unbroken flavour symmetries of the theory. Its
explicit computation will be one of the main subjects of the next sections.

3.3 The Anomalous Uλ (1) R-symmetry

The integrated WTI associated with the anomalous $U_\lambda(1)$ R-symmetry (see
Appendix A, (A.28), (A.29) and (A.17)) reads

\[ 2iKN_c\,\langle O(x_1,\dots,x_n)\rangle = \sum_{i=1}^{n}\Big\langle \frac{\partial O^{(\alpha)}}{\partial\alpha(x_i)}(x_1,\dots,x_n)\Big\rangle\Big|_{\alpha=0} , \tag{54} \]

where $O^{(\alpha)}$ is the operator which is obtained by performing on O a $U_\lambda(1)$
rotation of an angle α. For the special Green function $G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}$ (see (43)),
(54) simply becomes

\[ 2KN_c\,G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(x_1,\dots,x_p;x_{p+1},\dots,x_{p+q}) = 2(p+q)\,G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}(x_1,\dots,x_p;x_{p+1},\dots,x_{p+q}) , \tag{55} \]

because the $U_\lambda(1)$ rotation of $G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}$ is proportional to $G^{(p,q)\,f_1,\dots,f_p}_{h_1,\dots,h_p}$ itself
through the factor 2(p + q). As a result only if

p + q = KNc , (56)

we can get a non-vanishing result. Notice that (56) implies K > 0 consistently
with the fact that we are dealing with lowest components of chiral superfields.
Negative values of K will come into play in correlators with insertions of
highest components of anti-chiral superfields.
A particularly interesting situation arises if we insist that each flavour
should appear exactly K times. Then (56) requires Nf ≤ Nc . At this point to
simplify our treatment, we restrict ourselves to the case K = 1. Since in this
situation p = Nf , the whole dependence on the bare mass parameters drops
out from the Green function we are considering and we get
\[ G^{(N_f,N_c-N_f)\,f_1,\dots,f_{N_f}}_{h_1,\dots,h_{N_f}}(x_1,\dots,x_{N_f};x_{N_f+1},\dots,x_{N_c}) \propto \Lambda^{3N_c-N_f} . \tag{57} \]

We expressly note that the exponent to which Λ is raised in (57) is not only
the physical dimension of G(Nf ,Nc −Nf ) , but it also coincides with the first
coefficient of the β-function of SQCD (see the discussion and the formulae in
Sect. 2.4).
Among the Green functions of the type (43) which fulfil the further re-
quirements spelled out in this subsection, we wish to specially mention here
the one relevant in pure SYM where one gets the famous correlator [32, 33]

\[ G^{(0,N_c)}(x_1,\dots,x_{N_c}) = \Big\langle \frac{g^2}{32\pi^2}\lambda\lambda(x_1)\dots\frac{g^2}{32\pi^2}\lambda\lambda(x_{N_c})\Big\rangle . \tag{58} \]

3.4 The Konishi Anomaly

The general need to regularise products of operator fields at the same point is
at the origin of the axial anomaly [34] (see Appendix A) and of the anomalous
contribution that appears in certain supersymmetric anti-commutators. Starting
from the supersymmetry graded algebra summarised in (37) and (38), it
has been shown in [35] that, after regularisation, in massive SQCD the
following (anomalous) anti-commutation relation holds:

\[ \frac{1}{2\sqrt{2}}\,\{\bar Q_{\dot\alpha},\bar\psi^{\dot\alpha f}\phi_h(x)\} = -m_f\,\tilde\phi^f\phi_h(x) + \frac{g^2}{32\pi^2}\,\lambda\lambda(x)\,\delta^f_h , \tag{59} \]

where besides the naive $m_f\,\tilde\phi^f\phi_h(x)$ term an extra contribution appears. This
relation is what usually goes under the name of “Konishi anomaly”. Clearly, if
the vacuum of the theory is supersymmetric, by taking the vacuum expectation
value (v.e.v.) of (59) a proportionality relation between gluino and scalar
condensates emerges, namely

\[ m_f\,\langle\tilde\phi^f\phi_f\rangle = \frac{g^2}{32\pi^2}\,\langle\lambda\lambda\rangle , \qquad \text{no sum over } f , \tag{60} \]

besides

\[ \langle\tilde\phi^f\phi_h\rangle = 0 , \qquad f \neq h . \tag{61} \]

4 Instanton Calculus
We want to show in this section that the Green functions considered in (57)
receive a non-vanishing computable one-instanton contribution. In other words,
although zero in perturbation theory, they can be exactly evaluated in the
semi-classical approximation by dominating the functional integral with the
one-instanton saddle point. A non-trivial result is obtained because the number
of fermionic fields that are inserted in $G^{(N_f,N_c-N_f)}$ (either at face value
or at the appropriate order in g) is precisely equal to the number of fermionic
zero modes present in the K = 1 instanton background.

4.1 Instanton Calculus in SYM

The computation of the correlator (58) in the semi-classical one-instanton
approximation is not too difficult using the results we have collected in
Appendix B (about bosonic zero modes and collective coordinate integration)
and the explicit expression of the 2Nc gluino zero modes that can be found
in Appendix A [4, 20].
We will consider the case of a pure SYM theory with gauge group SU(Nc).⁹
The striking outcome of the calculation (which is based on equations from (25)
to (27)) is that the apparently extremely complicated dependence of the
correlator upon the space–time location of the inserted operators is completely
washed out by the bosonic collective coordinate integration and, as expected,
a space–time-independent (constant) result is obtained in agreement with the
supersymmetric WTI (39). Explicitly one finds [32, 33]

\[ G^{(0,N_c)}(x_1,\dots,x_{N_c}) = C_{N_c}\,\big(\Lambda^{\rm 2-loops}_{\rm SYM}\big)^{3N_c} , \tag{62} \]

where¹⁰

\[ C_{N_c} = \frac{2^{N_c}}{(3N_c-1)\,(N_c-1)!} , \tag{63} \]
\[ \Lambda^{\rm 2-loops}_{\rm SYM} = \mu\; e^{-8\pi^2/3N_c g^2}\,(g^2)^{-1/3} . \tag{64} \]

Equation (64) follows from the known value of the two-loop coefficient of the
β-function of the theory and shows that dominating the functional integral by
the semi-classical one-instanton saddle point gives a (two-loop) RGI answer.
From this result two important consequences can be derived, one concerning
the form of the β-function of the theory and the second the structure of
the vacuum.

⁹ For SYM theories with other compact Lie groups see [36].

The SYM β-function

One can argue that the result (64) is valid to all loops in the sense that higher-
order power corrections in g are indeed all vanishing. The argument goes as
follows. As we remarked in Appendix B just at the end of the first subsection,
one can go on with perturbation theory around the instanton background by
expanding in powers of g terms cubic and quartic in the fluctuations, as well
as terms coming from the Faddeev–Popov procedure. One should be finding in
this way logarithmically divergent contributions which would be interpreted
as higher order terms in the Callan–Symanzik β-function. In the present case,
however, no such term can arise because there is no dimensionful quantity with
which we might scale the (would-be) logarithmically divergent μ dependence.
In fact, the only other dimensionful quantities are the relative distances xi −xj
of the operator insertion points. But the supersymmetric WTI (39) prohibits
any such dependence.
We must conclude that in the regularisation and renormalisation scheme in which
we work and in the background gauge, the Λ parameter is “two-loop exact”.
This observation is equivalent to the result of β-exactness first put forward
in [39], which amounts to saying that one has the exact formula

\[ \beta_{\rm SYM}(g) = -\frac{g^3}{16\pi^2}\,\frac{3N_c}{1-2g^2N_c/16\pi^2} . \tag{65} \]

¹⁰ The constant $C_{N_c}$ differs from the similar constant appearing in (4.9) of [4] by a
factor $2^{N_c}$. This mistake was pointed out by various authors [23, 37, 38] and was
the consequence of an erroneous normalisation of the gluino zero modes.

Introducing (65) in

\[ \int_{g(\mu_0)}^{g(\mu)}\frac{dg'}{\beta_{\rm SYM}(g')} = \int_{\mu_0}^{\mu}\frac{d\mu'}{\mu'} , \tag{66} \]

one gets, in fact, by a straightforward integration precisely (64). We stress
that no approximation (no expansion in powers of g) has been performed in
the step from (65) to the formula (64).
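
The statement that (64) is exactly RG invariant under the flow generated by (65) can be checked numerically. The sketch below is an illustration only: it uses the expressions for Λ and β as quoted in (64) and (65), differentiates log(Λ/μ) with respect to g by central finite differences, and verifies that μ dlogΛ/dμ = 1 + β(g) ∂log(Λ/μ)/∂g vanishes. The function names and sample values of g and Nc are arbitrary choices.

```python
import math


def beta_sym(g, nc):
    """Exact SYM beta-function as quoted in (65)."""
    return -(g ** 3) / (16 * math.pi ** 2) * 3 * nc / (1 - 2 * g * g * nc / (16 * math.pi ** 2))


def log_lambda_over_mu(g, nc):
    """log(Lambda / mu) from (64): -8 pi^2 / (3 Nc g^2) - (1/3) log g^2."""
    return -8 * math.pi ** 2 / (3 * nc * g * g) - math.log(g * g) / 3.0


def mu_dlog_lambda_dmu(g, nc, h=1e-6):
    """mu d(log Lambda)/d mu = 1 + beta(g) * d log(Lambda/mu)/dg (central difference in g)."""
    dlog_dg = (log_lambda_over_mu(g + h, nc) - log_lambda_over_mu(g - h, nc)) / (2 * h)
    return 1.0 + beta_sym(g, nc) * dlog_dg


if __name__ == "__main__":
    for g, nc in [(0.5, 2), (1.0, 3), (1.5, 5)]:
        print(g, nc, mu_dlog_lambda_dmu(g, nc))
```

The printed values should be zero up to finite-difference error, with no expansion in g involved, in line with the remark above.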
Equation (65) can be generalised [39] to encompass the case of extended
supersymmetry with $\mathcal{N}$ = 1, 2, 4 supercharge multiplets through the simple
formula

\[ \beta_{\mathcal{N}}(g) = -\frac{g^3}{16\pi^2}\,\frac{(4-\mathcal{N})\,N_c}{1-2(2-\mathcal{N})\,g^2N_c/16\pi^2} , \tag{67} \]

which incorporates the known facts that the $\mathcal{N}$ = 2 β-function is one-loop
exact and the $\mathcal{N}$ = 4 β-function just vanishes.
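
The three special cases of (67) mentioned above can be read off by direct evaluation. The following sketch simply codes formula (67) as quoted and compares it with (65) and with a pure one-loop expression; the function names and the sample values of g and Nc are arbitrary.

```python
import math


def beta_extended(g, nc, n_susy):
    """Formula (67) for n_susy = 1, 2, 4 supersymmetries."""
    x = g * g * nc / (16 * math.pi ** 2)
    return -(g ** 3) / (16 * math.pi ** 2) * (4 - n_susy) * nc / (1 - 2 * (2 - n_susy) * x)


def beta_one_loop(g, b1):
    """One-loop beta-function with first coefficient b1, as in (29) without the O(g^5) terms."""
    return -(g ** 3) / (16 * math.pi ** 2) * b1


if __name__ == "__main__":
    g, nc = 1.0, 3
    print(beta_extended(g, nc, 1))                               # reproduces (65)
    print(beta_extended(g, nc, 2), beta_one_loop(g, 2 * nc))     # one-loop exact
    print(beta_extended(g, nc, 4))                               # vanishes
```

For two supersymmetries the denominator equals one, so only the one-loop term with b1 = 2Nc survives; for four the numerator vanishes identically.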

The Structure of the SYM Vacuum

The space–time constancy of the result (64) allows us to compute the
expectation value of the composite operator $g^2\lambda\lambda/32\pi^2$ by simply
imagining that the separations $|x_i - x_j|$ are very large. Using clustering, it
will be possible to write $G^{(0,N_c)}$ as the product of the v.e.v.’s of such
operators (gluino condensate, in the following).
The computation is straightforward if the vacuum of the theory is unique.
Here the situation is more complicated because of the very fact that the
gluino condensate is not vanishing. This means, in fact, that the residual
$Z_{2N_c}$ symmetry of the theory (see Appendix A) is actually spontaneously
broken down to $Z_2$, with the consequence that there are $N_c$ degenerate vacua
in which the theory can live, related by $Z_{N_c}$ transformations. Incidentally, we
note that this result is perfectly consistent with the prediction based on the
Witten index calculation [40].
In the presence of many equivalent vacua the functional integral yields
non-perturbative results where contributions coming from different vacua are
averaged out. Thus in order to extract useful information from the clustering
properties of the theory, one has to take into account this phenomenon and
go through a procedure called “vacuum disentangling” [4, 33]. All this simply
means that we should write for $G^{(0,N_c)}$ the formula

$$G^{(0,N_c)} = \frac{1}{N_c}\sum_{k=1}^{N_c}\left[\langle\Omega_k|\frac{g^2}{32\pi^2}\lambda\lambda|\Omega_k\rangle\right]^{N_c}, \tag{68}$$

with the gluino condensates transforming under $Z_{2N_c}$ as

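$$\langle\Omega_k|\frac{g^2}{32\pi^2}\lambda\lambda|\Omega_k\rangle = e^{\frac{2\pi i k}{N_c}}\,\langle\Omega_0|\frac{g^2}{32\pi^2}\lambda\lambda|\Omega_0\rangle\,, \qquad k = 1, 2, \dots, N_c\,. \tag{69}$$
This equation is telling us that the average in (68) is trivial and we get in the
k-th vacuum
$$\langle\Omega_k|\frac{g^2}{32\pi^2}\lambda\lambda|\Omega_k\rangle = e^{\frac{2\pi i k}{N_c}}\,C_{N_c}^{1/N_c}\left(\Lambda^{2\text{-loops}}_{\rm SYM}\right)^3, \qquad k = 1, \dots, N_c\,. \tag{70}$$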
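To see why the average in (68) is trivial, one can insert (69) into (68). The short computation below is a sketch that uses only these two equations.

```latex
% Inserting the phase relation (69) into the vacuum average (68).
\begin{align}
\left[\langle\Omega_k|\tfrac{g^2}{32\pi^2}\lambda\lambda|\Omega_k\rangle\right]^{N_c}
 &= \left(e^{2\pi i k/N_c}\right)^{N_c}
    \left[\langle\Omega_0|\tfrac{g^2}{32\pi^2}\lambda\lambda|\Omega_0\rangle\right]^{N_c}
  = \left[\langle\Omega_0|\tfrac{g^2}{32\pi^2}\lambda\lambda|\Omega_0\rangle\right]^{N_c}, \\
G^{(0,N_c)}
 &= \frac{1}{N_c}\sum_{k=1}^{N_c}
    \left[\langle\Omega_0|\tfrac{g^2}{32\pi^2}\lambda\lambda|\Omega_0\rangle\right]^{N_c}
  = \left[\langle\Omega_0|\tfrac{g^2}{32\pi^2}\lambda\lambda|\Omega_0\rangle\right]^{N_c}.
\end{align}
```

All $N_c$ terms of the sum are equal, so taking the $N_c$-th root of $G^{(0,N_c)}$ determines the condensate only up to an $N_c$-th root of unity. This is the phase that labels the vacua in (70).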

Discussion of the Results

The picture we got from the calculation presented in the previous section
looks rather convincing and physically sound. It perfectly matches all our
expectations and it has been carried out in a clean and rigorous mathematical
way. It is based solely on the assumption that Green functions which can
receive contributions only from the K = 1 sector of the theory can be reliably
computed by dominating the functional integral by the one-instanton saddle
point. We have also argued that the two-loop RGI result obtained in the
semi-classical approximation is exact in the sense that it does not get further
perturbative corrections.
Despite all these nice features, it has been argued in the literature that
the method employed to get the result (62) cannot be right because it seems
to conflict with a number of other considerations.
(1) The $N_c$ dependence of $C_{N_c}$ leads in the ’t Hooft limit ($N_c \to \infty$ with
$g^2 N_c$ fixed) to an $N_c$ dependence of the gluino condensate (70) that is
not what one would expect from the fact that the gluinos (together with
the gauge field, $A_\mu$) belong to the adjoint representation of the gauge group.
Taking into account the $g^2$ factor that was introduced in front of the gluino
bilinear, one would naively expect $g^2\langle\lambda\lambda\rangle \sim O(g^2 N_c^2) \sim O(N_c)$ in the
’t Hooft limit. From (62) and (63), one finds instead
$g^2\langle\lambda\lambda\rangle \sim O(g^2 N_c) \sim O(1)$.
(2) In [38] the calculation of $G^{(0,N_c)}$ has been repeated in a fully
supersymmetric formalism and the result summarised in (62), (63) and (64) was
confirmed (up to the correction for the factor 2Nc that we already mentioned).
Interestingly, these authors have also been able to extend, in the large $N_c$
limit, the semi-classical instanton calculation to Green functions which receive
contributions from topological sectors with winding numbers K > 1, i.e. to
Green functions with $K N_c$ insertions of the gluino bilinear. The result of this
calculation is inconsistent with clustering: when clustering is used, it leads to
a value of the condensate which is not independent of K.
(3) The computation of $G^{(0,N_c)}$ can be done indirectly by starting from
the more complicated case where extra massive matter supermultiplets are
added [21, 31, 41] and then exploiting the notion of decoupling [42, 43]. We
recall that, in a nutshell, decoupling is the property of a local field theory
according to which, when some mass becomes large, the corresponding matter
field disappears from the low-energy physics (see Appendix D), modulo
possible consistency conditions resulting from the requirement of anomaly
cancellation [44].
From symmetry arguments it is often possible to determine, up to a
multiplicative constant, the form of the effective superpotential of the
enlarged theory in terms of the relevant composite operators (see Sect. 5). Then
consistency arguments, following from sending each mass to infinity in
succession, supplemented by “constrained instanton” calculations (see the next
paragraph), can be used to determine this constant (see Sect. 5.2). Clearly,
checking its value is important for the self-consistency and the reliability of
the various approaches (see point (1) above). One finds that, if computed
from the effective superpotential, the value of this constant does not agree
with what one can deduce from the formulae (62)–(64). Of course, the
comparison was done after having properly matched the RGI parameters
associated with the different regularisations employed in the various
calculations [23]. For instance, in the SU(2) case, one finds from the equations
in Sect. 4.1 $C_2 = 4/5$ instead of the result $C_2 = 1$ one would obtain from
decoupling arguments.
Despite a lot of work in the years that followed these findings, there is
no clear understanding of why there are such discrepancies and where they
come from. One line of argument [21, 31, 45, 46], first prompted by the
results described in (3), relies on the observation that when scalar fields
are present other quasi-saddle points exist, in which scalar fields get a
non-vanishing v.e.v., which should be taken as background configurations in the
semi-classical calculation of Green functions. In fact, the (partial or full)
breaking of the gauge symmetry leads in the limit of very large v.e.v.’s to a
weakly coupled theory, where semi-classical instanton calculations are expected
to be reliable. In this context the non-trivial problem arising from the fact that
in the presence of non-vanishing scalar v.e.v.’s the SQCD e.o.m. have no
solution (owing precisely to the nature of the scalar boundary conditions) is
circumvented by having recourse to the so-called “constrained instanton”
method [47] (see footnote 11).
It is not completely clear to us whether the constrained instanton method
(sometimes also called the “weak coupling instanton” (WCI) method, in
opposition to which the approach described in Sect. 4.1 and further employed
in Sect. 4.2 below was nicknamed the “strong coupling instanton” (SCI)
method) can be considered a completely satisfactory solution to
the problems listed above. We now want to briefly discuss this question by
illustrating some pros and cons of the two approaches.
• Certainly, if one accepts the WCI computational strategy, the problem
mentioned in (1) disappears. As for the question of consistency with clustering
(point (2) above), to date no check of the kind done in [38] has been carried out.
Finally, we do not see a really rigorous way to decide on the basis of
present knowledge whether $\sqrt{C_2} = 2/\sqrt{5}$ or $\sqrt{C_2} = 1$ is the correct
answer for the constant in front of the gluino condensate. One possibility to
settle this question could be to have recourse to a lattice formulation of SYM [48]
and directly measure the gluino condensate in Monte Carlo simulations. Up to
now, unfortunately, severe technical difficulties have prevented such a
measurement. For a recent review on the subject of supersymmetry on the lattice
see [49].

11 The theoretical foundation of the method is somewhat delicate (it relies on
introducing in the functional integral a suitable “constraint” which breaks the
integration measure into sectors of well-defined instanton scale size) and its
technical implementation requires a number of non-trivial mathematical steps.
Its presentation is beyond the scope of this review, but can be found in the
original literature. We recommend to the reader the nice work of [23].

• The whole idea of working in the Higgs phase of SQCD comes from
the key observation that, in the massless limit, the superpotential possesses
a complicated vacuum manifold (see Appendix E). It is customary in the
literature to speak about “flat directions” [21, 31, 50], i.e. constant values of
the scalar fields along which the D-term vanishes. It is in this situation that all
the explicit WCI calculations have been carried out (see footnote 12). Although
explicit instanton calculations have been carried out in the massless limit,
their results and implications have been employed in the massive case. In this
context it should be noted that the massless limit of the massive SQCD theory
is a very delicate one. For instance, as we shall see, the SCI approach gives
results in the massive case that, when extrapolated to vanishing mass, are not
consistent with results directly obtained in the massless theory. This feature
finds a natural explanation in the infrared structure of the theory, which is
such that the massless limit of the massive theory does not coincide with the
strictly massless situation [43].
• The WCI approach has found its most successful application in predicting
the non-perturbative expansion coefficients of the SW [51] expression for the
effective prepotential of the N = 2 SYM theory (see Sect. 9).
• On the other hand in N = 4 SYM, despite the fact that there are flat
directions for the scalar potential, no scalar v.e.v.’s are assumed to be gener-
ated (as a non-vanishing v.e.v. would break the (super)conformal invariance
of the theory) and all instanton calculations are performed in the SCI way
we described in Sect. 4.1. Actually, in N = 4 there is no running of the
gauge coupling (67) and one can always think that calculations are done at
infinitesimally small values of g. Thus non-perturbative instanton calculations
in N = 4 SYM do not seem to fall under the criticisms raised for the N = 1
and N = 2 cases (see Sects. 14 and 15).

4.2 Instanton Calculus in SQCD

In this section we move to SQCD. The action of SQCD is obtained by
coupling (in a gauge invariant and supersymmetric way) to the SYM
supermultiplet $N_f$ pairs of matter chiral superfields, $\Phi^r_f$ and $\tilde\Phi^f_r$
($f = 1, 2, \dots, N_f$, $r = 1, 2, \dots, N_c$) belonging, respectively, to the
$\mathbf{N_c}$ and $\mathbf{\bar N_c}$ representations of the gauge group (see Appendix A for
some detail and [36] for an extension of these considerations to theories with
different gauge groups and matter content).

12 Besides the original papers in [45], the basic work to which all the old WCI
calculations refer is the paper quoted in [23].

We want to identify and compute, according to the strategy developed in
Sect. 4.1 to deal with SYM, the Green functions that, besides being space–
time constant, can be reliably evaluated by dominating the functional integral
with the one-instanton saddle point. We shall start by analysing the massive
case, where the further information provided to us by the Konishi anomaly
relation [35] can be exploited and will allow us to determine both the gluino
and the scalar matter condensates and to check the internal consistency of our
calculations. In Sect. 4.2 we will discuss the puzzling features that arise when
the limit m → 0 is taken.

Massive SQCD

Already looking at the general results derived in Sect. 3.2 about the mass
dependence of Green functions with only lowest components of chiral superfields,
we see that their small mass limit is rather delicate, as infrared divergences
seem to arise. To avoid hitting this difficulty, we start by limiting the use of
instanton calculus to the computation of the correlators that according to (53)
are mass independent. Among those we will concentrate here on the following
three (see (57)):

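(A)
$$F^{(0,N_c)}(x_1,\dots,x_{N_c}) \equiv \prod_{f=1}^{N_f}\frac{\partial}{\partial m_f}\,G^{(0,N_c)}(x_1,\dots,x_{N_c})
 = \prod_{f=1}^{N_f}\frac{\partial}{\partial m_f}\left\langle\frac{g^2}{32\pi^2}\lambda\lambda(x_1)\cdots\frac{g^2}{32\pi^2}\lambda\lambda(x_{N_c})\right\rangle, \tag{71}$$

(B)
$$G^{(N_f,N_c-N_f)}(x_1,\dots,x_{N_c})
 = \left\langle\tilde\phi^{1}\phi_{1}(x_1)\cdots\tilde\phi^{N_f}\phi_{N_f}(x_{N_f})\,\frac{g^2}{32\pi^2}\lambda\lambda(x_{N_f+1})\cdots\frac{g^2}{32\pi^2}\lambda\lambda(x_{N_c})\right\rangle, \tag{72}$$

and, in the particular case $N_f = N_c$,

(C)
$$D(x,x') = \left\langle\det\phi(x)\,\det\tilde\phi(x')\right\rangle, \tag{73}$$
$$\det\phi = \frac{1}{N_c!}\,\epsilon^{f_1\dots f_{N_f}}\,\epsilon_{r_1\dots r_{N_c}}\,\phi^{r_1}_{f_1}\cdots\phi^{r_{N_c}}_{f_{N_f}}\,, \tag{74}$$
$$\det\tilde\phi = \frac{1}{N_c!}\,\epsilon_{f_1\dots f_{N_f}}\,\epsilon^{r_1\dots r_{N_c}}\,\tilde\phi^{f_1}_{r_1}\cdots\tilde\phi^{f_{N_f}}_{r_{N_c}}\,. \tag{75}$$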

(A) Let us start the discussion with $F^{(0,N_c)}$. We notice that it contains
exactly the number of gluino fields necessary to match the number of zero
modes that the theory possesses in the K = 1 sector. We recall that, since at
the moment we are considering the case in which the matter is massive, no
zero modes associated with matter Weyl operators exist. In this situation, we
can safely compute the functional integral which defines the above correlator
by dominating it with the one-instanton saddle point. The calculation goes
through the following steps.
(1) Every factor $\partial/\partial m_f$ can be replaced by the insertion of the action
mass term
$$\frac{\partial}{\partial m_f} \to \int d^4x\left[\tilde\psi^f\psi_f(x) + m_f^*\left(\phi^{*f}\phi_f(x) + \tilde\phi^f\tilde\phi^*_f(x)\right)\right], \tag{76}$$
(2) which, after integration over the matter supermultiplets, becomes
$$\frac{|m_f|^2}{\mu}\left[\mathrm{Tr}\,\frac{2}{D^2 - |m_f|^2} - \mathrm{tr}\,\frac{1}{\not{D}\bar{\not{D}} - |m_f|^2}\right]. \tag{77}$$
It is understood that the covariant operators $\not{D}$ and $D^2$ in (77) are computed
in the one-instanton background field. The multiplicative mass factors in front
of the trace have the following origin: (i) the term $m_f^*$ comes from the
expression of the matter propagators and (ii) the ratio $m_f/\mu$ comes from what is
left out from the ratio between the determinant of the matter Weyl operator
and its regulator, after the supersymmetric cancellation of the non-zero mode
contribution has been taken care of.
(3) A cancellation of modes also takes place between the two terms in (77).
Only the fermionic mode with eigenvalue $m_f$ (i.e. the zero mode in the
massless limit) contributes and one simply gets
$$\frac{\partial}{\partial m_f} \to \frac{|m_f|^2}{\mu}\,\frac{1}{|m_f|^2} = \frac{1}{\mu}\,. \tag{78}$$
(4) At this point, the functional integration with respect to the gauge
supermultiplet fields remains to be done. Since the sole effect of the matter
integration is to yield the factor $\mu^{-N_f}$, we are left with exactly the same
calculation we did in Sect. 4.1. We thus get
$$F^{(0,N_c)}(x_1,\dots,x_{N_c}) = C_{N_c}\,\frac{\left(\Lambda^{1\text{-loop}}_{\rm SQCD}\right)^{3N_c-N_f}}{g^{2N_c}}\,, \tag{79}$$
from which, by integrating with respect to $m_f$, $f = 1, \dots, N_f$, we obtain
$$G^{(0,N_c)}(x_1,\dots,x_{N_c}) = C_{N_c}\,\frac{\left(\Lambda^{1\text{-loop}}_{\rm SQCD}\right)^{3N_c-N_f}}{g^{2N_c}}\,\prod_{f=1}^{N_f} m_f\,. \tag{80}$$
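The power of the scale appearing in (79) can be checked by simple counting. The sketch below rests on two assumptions that are not taken from this section: that, as in the SYM computation of Sect. 4.1, the gauge supermultiplet integration contributes the factor $\mu^{3N_c}e^{-8\pi^2/g^2}$, and that the one-loop Λ parameter is defined in the standard way, $\Lambda^{1\text{-loop}} = \mu\,e^{-8\pi^2/(b_0 g^2)}$ with $b_0 = 3N_c - N_f$.

```latex
% Power counting for (79).
% Assumptions: gauge-sector factor \mu^{3N_c} e^{-8\pi^2/g^2} and the
% standard one-loop definition of \Lambda with b_0 = 3N_c - N_f.
\begin{align}
\underbrace{\mu^{3N_c}\,e^{-8\pi^2/g^2(\mu)}}_{\text{gauge supermultiplet}}
\;\times\;
\underbrace{\mu^{-N_f}}_{\text{matter, from (78)}}
 &= \mu^{3N_c-N_f}\,e^{-8\pi^2/g^2(\mu)} \\
 &= \left(\mu\,e^{-8\pi^2/[(3N_c-N_f)\,g^2(\mu)]}\right)^{3N_c-N_f}
  = \left(\Lambda^{1\text{-loop}}_{\rm SQCD}\right)^{3N_c-N_f}.
\end{align}
```

The exponent $3N_c - N_f$ coincides with the one-loop coefficient of the SQCD β-function, as it must for the result to be μ independent at this order.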

Two observations are in order here. First of all, by taking the $N_c$-th root of
the above expression, one can determine the value of the gluino condensate in
massive SQCD. One finds

$$\langle\Omega_k|\frac{g^2}{32\pi^2}\lambda\lambda|\Omega_k\rangle = e^{\frac{2\pi i k}{N_c}}\left[C_{N_c}\left(\Lambda^{1\text{-loop}}_{\rm SQCD}\right)^{3N_c-N_f}\frac{1}{g^{2N_c}}\prod_{f=1}^{N_f} m_f\right]^{1/N_c}, \tag{81}$$

which shows that, as in SYM, the discrete $Z_{2N_c}$ symmetry is spontaneously
broken down to $Z_2$, leaving behind an $N_c$-fold vacuum degeneracy. This is the
expected result, since the presence of massive fields cannot modify the value
of the Witten index [40].
Secondly, it can be checked that the formulae (80) and (81) define two-loop
RGI quantities, as follows from the known expressions of the β and $\gamma_m$
functions of the theory. Through $O(g^5)$ and $O(g^2)$ they read, respectively,

$$\beta_{\rm SQCD} = -\frac{g^3}{16\pi^2}(3N_c - N_f) + \frac{g^5}{(16\pi^2)^2}\left(-6N_c^2 + 4N_cN_f - 2\frac{N_f}{N_c}\right) + O(g^7)\,, \tag{82}$$
$$\gamma_m = -\frac{g^2}{8\pi^2}\,\frac{N_c^2 - 1}{N_c} + O(g^4)\,. \tag{83}$$

Actually it has been argued [39] that the following “exact” formula holds:

$$\beta_{\rm SQCD}(g) = -\frac{g^3}{16\pi^2}\,\frac{3N_c - N_f\,[1 - \gamma_m(g)]}{1 - 2g^2N_c/16\pi^2}\,, \tag{84}$$

which generalises (65) to the SQCD case. Formula (84) perfectly fits with the
previous ones to the order they are known and renders the expressions (80)
and (81) RGI quantities to all orders.
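The agreement between (84) and the perturbative expressions can be verified by expanding (84) in powers of g. The following lines are a sketch of that expansion, using only (83) and (84).

```latex
% Expansion of (84) through O(g^5), with \gamma_m taken from (83).
\begin{align}
3N_c - N_f\,[1-\gamma_m(g)]
 &= (3N_c - N_f) - \frac{g^2}{16\pi^2}\,\frac{2N_f(N_c^2-1)}{N_c} + O(g^4)\,, \\
\frac{1}{1-2g^2N_c/16\pi^2}
 &= 1 + \frac{g^2}{16\pi^2}\,2N_c + O(g^4)\,, \\
\beta_{\rm SQCD}(g)
 &= -\frac{g^3}{16\pi^2}(3N_c-N_f)
    -\frac{g^5}{(16\pi^2)^2}\left[2N_c(3N_c-N_f) - \frac{2N_f(N_c^2-1)}{N_c}\right] + O(g^7) \\
 &= -\frac{g^3}{16\pi^2}(3N_c-N_f)
    +\frac{g^5}{(16\pi^2)^2}\left(-6N_c^2 + 4N_cN_f - \frac{2N_f}{N_c}\right) + O(g^7)\,.
\end{align}
```

The last line reproduces (82) term by term.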
(B) The computation of the correlator (72) is much more subtle. First of
all, one notices that it vanishes to lowest order in g because at the instanton
saddle point $\phi_f = \tilde\phi^f = 0$. Secondly, the number of inserted gluino fields
does not appear to match the number of the existing zero modes. Finally, the
matter functional integration requires the knowledge of the massive fermion
and scalar propagators, $(\not{D}\bar{\not{D}} - |m|^2)^{-1}$ and $(D^2 - |m|^2)^{-1}$, in the
instanton background, which is not available in closed form.
The first and second problems are solved by observing that the integration
over the scalar matter fields amounts to substituting $\phi_f$ and $\tilde\phi^f$ with the
solutions of their classical e.o.m., which schematically read
$$\phi_f = -i\sqrt{2}\,g\,(D^2 - |m_f|^2)^{-1}\lambda\psi_f\,, \tag{85}$$
$$\tilde\phi^f = i\sqrt{2}\,g\,(\tilde D^2 - |m_f|^2)^{-1}\tilde\psi^f\lambda\,. \tag{86}$$

One easily checks that, at the expense of going to higher order in g, in this
fashion one ends up having the right number of inserted gluino fields.
As for the last problem, we start by observing that the integration over the
matter fermions has the effect of replacing for each flavour the $\tilde\psi^f(x)\psi_f(x')$
product with the corresponding fermionic propagator in the instanton
background. After the matter integration one thus arrives at an extremely
complicated integral over the collective instanton coordinates, where the unknown
fermion and scalar background propagators appear. In order to proceed
with the calculation, we notice that the instanton semi-classical
approximation respects supersymmetry and that consequently the correlators we
are considering will come out to be constant in space–time and mass
independent, as shown in Sect. 3. The idea is then to perform the residual
computation in the limit of very large masses (more precisely in the limit
$m_f \gg |x_i - x_j|^{-1} \gg \Lambda_{\rm SQCD}$), where the fermion and scalar background
propagators tend to their free-field expressions. One ends up in this way with
feasible integrals which yield the result (cf. (80))
$$G^{(N_f,N_c-N_f)}(x_1,\dots,x_{N_c}) = C_{N_c}\,\frac{\left(\Lambda^{1\text{-loop}}_{\rm SQCD}\right)^{3N_c-N_f}}{g^{2N_c}}\,. \tag{87}$$
We remark that this quantity is not RGI as it stands. To make it RGI we
must renormalise the scalar fields. One way of doing this is to multiply both
sides of (87) by the factor $\prod_f m_f$.
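(C) The computational strategy outlined above leads for the correlator (73)
to the simple result
$$D(x,x') = 0\,. \tag{88}$$
From the results (81) and (87) one can compute both the gluino and the
scalar condensates. Recalling (82) and (83), one gets

$$\langle\Omega_k|m_f\tilde\phi^f\phi_f|\Omega_k\rangle = \langle\Omega_k|\frac{g^2}{32\pi^2}\lambda\lambda|\Omega_k\rangle
 = e^{\frac{2\pi i k}{N_c}}\left[C_{N_c}\left(\Lambda^{2\text{-loop}}_{\rm SQCD}\right)^{3N_c-N_f}\prod_{h=1}^{N_f}\hat m_h^{1\text{-loop}}\right]^{1/N_c}. \tag{89}$$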

Furthermore, one can derive the relations [26, 4]

$$\langle\Omega_k|\tilde\phi^f\phi_h|\Omega_k\rangle = 0\,, \qquad f \neq h\,, \tag{90}$$

$$\langle\Omega_k|\det\tilde\phi|\Omega_k\rangle = \langle\Omega_k|\det\phi|\Omega_k\rangle = 0\,. \tag{91}$$

All these results (see (89)–(91)) are fully consistent with the WTIs of
supersymmetry and with (60) and (61) implied by the Konishi anomaly relation [4].
The important conclusion of this thorough analysis is that the
non-renormalisation theorems [52] of supersymmetry are violated by instanton
effects, as follows from the fact that chiral (composite) operators acquire
non-vanishing v.e.v.’s, while they are identically zero at the perturbative level.
One way of understanding this surprising finding in the language of the effective
theory approach of Sect. 5 is to say that instantons generate a contribution
to the effective superpotential which is non-perturbative in nature.

Massless SQCD

We now consider the strictly massless ($m_f = 0$, $f = 1, \dots, N_f$) SQCD theory.
From the formulae we derived in the previous sections it should be already
clear that the limit $m_f \to 0$ is not smooth. Indeed, we will see that a
straightforward application of the instanton calculus rules, that we have
developed in the massive case, to massless SQCD leads to results that do not agree
with the massless limit of the massive formulae.
The origin of this discrepancy is not completely clear. As we said, one
possibility is that the $m_f \to 0$ limit of the massive theory does not coincide
with the strictly massless theory, as a consequence of the fact that the small
mass limit of massive SQCD is plagued by infrared divergences. Besides the
divergences encountered if the massless limit of (89) is taken, a simple
analysis shows, in fact, that a (naive) small $|m_f|^2$ Taylor expansion gives rise to
$|m_f|^2 \times 1/|m_f|^2$ contributions that would be absent in the strictly massless
SQCD theory. Another possibility, strongly advocated in [32] and [21, 31], is
related to the observation that in the absence of mass terms the matter
superpotential has a huge manifold of flat directions along which the exponential of
the action does not provide any damping. In this situation it is not at all clear
that the instanton solution (24) can be taken as the configuration that
dominates the functional integral. Other types of quasi-saddle points, where scalar
fields take a non-zero v.e.v., may also be relevant. The strategy suggested by
these authors to deal with this situation will be discussed in Sect. 5. Here
we want to first show what sort of results follow when the massive instanton
calculus developed in Sect. 4.2 is blindly applied to massless SQCD.
The Green functions that have the correct number of fermionic zero modes
in the one-instanton background are restricted to
$$G^{(N_f,N_c-N_f)\{f\}}_{\{h\}}(x_1,\dots,x_{N_c})
 = \left\langle\tilde\phi^{f_1}\phi_{h_1}(x_1)\cdots\tilde\phi^{f_{N_f}}\phi_{h_{N_f}}(x_{N_f})\,\frac{g^2}{32\pi^2}\lambda\lambda(x_{N_f+1})\cdots\frac{g^2}{32\pi^2}\lambda\lambda(x_{N_c})\right\rangle
 \quad\text{for } N_c \geq N_f\,, \tag{92}$$
$$D(x,x') = \left\langle\det[\tilde\phi(x)]\,\det[\phi(x')]\right\rangle \quad\text{for } N_c = N_f\,, \tag{93}$$

because now there exist zero modes also for the matter fermions, $\tilde\psi^f$ and
$\psi_f$. A non-vanishing result is obtained if for each scalar field an
appropriate Yukawa interaction term is brought down from the action. In this way
$2N_c$ gluino zero modes, $\lambda_0$, together with the fermionic matter zero modes,
$\tilde\psi^f_0$ and $\psi_{0f}$, $f = 1, \dots, N_f$, will appear simultaneously. At the same
time, when scalars are contracted in pairs, the scalar propagator in the instanton
background, $(D^2)^{-1}$ or $(\tilde D^2)^{-1}$, is generated, which will act on the product
$\lambda_0\psi_0$ or $\tilde\psi_0\lambda_0$, respectively. Unlike the massive case, closed expressions
for $(D^2)^{-1}$ and $(\tilde D^2)^{-1}$ exist, which allows one to compute explicitly the form
of the “induced scalar modes” by solving the field equations
$D^2\phi + ig\sqrt{2}\,\lambda_0\psi_0 = 0$ and $\tilde D^2\tilde\phi - ig\sqrt{2}\,\tilde\psi_0\lambda_0 = 0$, respectively.

The problem with the SCI computational strategy we have briefly
described can already be seen by taking, for simplicity, the case $N_c = 2$ and
$N_f = 1$. In massless SQCD (after correcting for the usual factor 2Nc with
respect to the result quoted in [4]), one gets

$$\left\langle\tilde\phi\phi(x_1)\,\frac{g^2}{32\pi^2}\lambda\lambda(x_2)\right\rangle_{m=0} = \frac{1}{2}\,\frac{\left(\Lambda^{1\text{-loop}}_{2,1}\right)^5}{g^4}\,, \tag{94}$$

while for the same Green function in the massive case we got (see (87))

$$\left\langle\tilde\phi\phi(x_1)\,\frac{g^2}{32\pi^2}\lambda\lambda(x_2)\right\rangle_{m\neq 0} = \frac{4}{5}\,\frac{\left(\Lambda^{1\text{-loop}}_{2,1}\right)^5}{g^4}\,. \tag{95}$$
Apart from the numerical discrepancy visible between (94) and (95), what
is more disturbing is that (94) is in conflict with the Konishi anomaly re-
lation (60), which in the massless regime (and using clustering) implies the
vanishing of the gluino condensate. An alternative to this conclusion would
be to say that the scalar condensate can be infinite in massless SQCD (see
the discussion in Sect. 5.2).
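The tension can be made explicit for $N_c = 2$, $N_f = 1$ by combining the massive result (89) with clustering. The lines below are a schematic sketch: they ignore the renormalisation factors that distinguish $\hat m$ from $m$ and the one-loop from the two-loop Λ, and the symbol $X$ is introduced here purely for illustration.

```latex
% N_c = 2, N_f = 1; schematic, renormalisation factors dropped (assumption).
\begin{align}
X \equiv m\,\langle\tilde\phi\phi\rangle
  = \Big\langle\frac{g^2}{32\pi^2}\lambda\lambda\Big\rangle
 &= \pm\left(C_2\,\Lambda^5\,m\right)^{1/2}
 && \text{from (89)}, \\
\langle\tilde\phi\phi\rangle\,\Big\langle\frac{g^2}{32\pi^2}\lambda\lambda\Big\rangle
 &= \frac{X^2}{m} = C_2\,\Lambda^5
 && \text{mass independent, cf. (87)}, \\
\Big\langle\frac{g^2}{32\pi^2}\lambda\lambda\Big\rangle \propto m^{1/2} \to 0\,,
 &\qquad \langle\tilde\phi\phi\rangle \propto m^{-1/2} \to \infty
 && \text{as } m \to 0\,.
\end{align}
```

In the massive theory the product of the two condensates stays finite as the mass is removed only because the scalar condensate grows without bound while the gluino condensate vanishes.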
Notice that for $N_f > 1$ the massless SQCD action possesses a non-anomalous
$SU_L(N_f) \times SU_R(N_f) \times U_V(1) \times U_{\hat A}(1)$ symmetry (see (A.36)).
This means that in extracting the scalar condensates a vacuum disentangling
step analogous to the one performed in Sect. 4.1 is necessary. Proceeding in
this way, one again finds results for the condensates that do not agree with
what was found in the massive case.
Also the result for the correlator (93) is at variance with (88). We now
find $D(x,x') \neq 0$, which implies (no disentangling is necessary here, as $\det\phi$
and $\det\tilde\phi$ are invariant under the chiral flavour group)

$$\langle\det\tilde\phi\rangle \neq 0\,, \qquad \langle\det\phi\rangle \neq 0\,, \tag{96}$$

signalling the spontaneous breaking of the $U_V(1)$ symmetry.

4.3 The Case of Chiral Theories

In this section we wish to discuss the very interesting case of supersymmetric
theories of the Georgi–Glashow type [53], where matter fermions are chiral.
There is a considerable literature on the subject. A selection of useful
papers can be found in [4, 31, 54, 55, 56].
In this review we will limit ourselves to considering $SU(N_c)$ gauge theories
with matter in the fundamental, $N_c$, and anti-symmetric, $N_c(N_c - 1)/2$,
representations. We recall that gauge anomaly cancellation requires the number of
fundamental, $n_{\rm fund}$, and anti-symmetric, $n_{\rm anti} \equiv M$, representations to be
related by

$$n_{\rm fund} = M(N_c - 4)\,. \tag{97}$$

The resulting β-function

$$\beta_{GG} = -\frac{g^3}{16\pi^2}\left[3(N_c + M) - MN_c\right] + O(g^5) \tag{98}$$
implies asymptotic freedom if $M < 3N_c/(N_c - 3)$.
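The bracket in (98) and the condition (97) can be recovered from standard group-theory data. The sketch below assumes the usual normalisation in which the fundamental of $SU(N_c)$ has Dynkin index $T = 1/2$ and anomaly coefficient of unit magnitude, so that the anti-symmetric tensor has $T = (N_c - 2)/2$ and anomaly coefficient $N_c - 4$. It also assumes that the $n_{\rm fund}$ multiplets contribute to the anomaly with the opposite sign to the anti-symmetric tensors. These conventions are not spelled out in the text.

```latex
% Anomaly cancellation and one-loop coefficient b_0 for the
% Georgi-Glashow-type matter content (conventions assumed as stated above).
\begin{align}
\text{anomaly:}\quad & M\,(N_c-4) - n_{\rm fund} = 0
   \;\Rightarrow\; n_{\rm fund} = M(N_c-4)\,, \\
\text{one loop:}\quad & b_0 = 3N_c - M\,\frac{N_c-2}{2} - n_{\rm fund}\,\frac{1}{2}
   = 3N_c - M(N_c-3) = 3(N_c+M) - MN_c\,, \\
\text{asymptotic freedom:}\quad & b_0 > 0 \;\Leftrightarrow\; M < \frac{3N_c}{N_c-3}\,, \\
\text{examples:}\quad & (N_c,M)=(5,1):\; n_{\rm fund}=1,\; b_0=13\,; \qquad
   (N_c,M)=(6,1):\; n_{\rm fund}=2,\; b_0=15\,.
\end{align}
```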
The composite operators that, besides $g^2\lambda\lambda/32\pi^2$, come into play are
generically constructed in terms of the lowest components of the chiral matter
superfields for which we introduce the notation
$$\Phi^I_r\,,\quad I = 1, 2, \dots, n_{\rm fund}\,; \qquad X_i^{rs} = -X_i^{sr}\,,\quad r, s = 1, 2, \dots, N_c\,,\quad i = 1, 2, \dots, M\,. \tag{99}$$
Non-perturbative calculations are of special importance here, as Witten
index arguments have so far been unable to make any definite statement
about the nature of the vacua of the theory. Actually, a variety of scenarios
turns out to be realised, depending on the specific matter content of the
action. They can be summarised as follows.
(I) Unbroken supersymmetry with well-defined vacua [54, 31]. One such
example is the SU(6) case with M = 1 and correspondingly $n_{\rm fund} = 2$.
The allowed superpotential possesses no flat directions and the unique
perturbative vacuum is at vanishing values of the scalar fields. There exist
instanton-dominated (constant) Green functions, which upon using clustering
give results in perfect agreement with the constraints coming from the
Konishi anomaly relations. One finds that the discrete $Z_{30}$ symmetry is
spontaneously broken down to $Z_6$, leaving behind 30/6 = 5 well-defined
supersymmetric vacua. We remark that here, unlike SYM and SQCD, the number of
vacua is not equal to $N_c$. It should be noted that in this example vacuum
disentangling can be trivially carried out.
A more delicate situation occurs if we double the number of families,
i.e. if we take M = 2 [56], because a non-trivial vacuum disentangling over
the transformations of the complexification of the global symmetry group
SU(4) × SU(2) is necessary here. When this is done, results from instanton
calculations allow one to determine all the condensates. In particular, one finds
that the discrete $Z_{12}$ symmetry group is spontaneously broken down to $Z_3$,
leaving behind 12/3 = 4 well-defined supersymmetric vacua. The interesting
observation is that in this theory the relations entailed by the Konishi
anomaly equations also allow one to compute all the condensates completely.
Reassuringly, the two sets of results turn out to be in perfect agreement. For a
discussion of these results from the complementary effective action point of
view, see Sect. 5.3.
In both the above cases when the superpotential is switched off the vacuum
becomes ill defined, because in this limit necessarily some of the condensates
must “run away” to infinity. This is due to the fact that the relations among
condensates involve (inverse) factors of the Yukawa couplings.
(II) Unbroken supersymmetry with ill-defined vacua. This situation occurs
in theories based on an $SU(N_c)$ gauge group with $N_c$ even and larger than
8. Even in the presence of a non-vanishing superpotential, one finds that,
in order to reconcile instanton results with the implications of the Konishi
anomaly relations, one has to assume that some of the scalar condensates run
away to infinity [56]. Such a result is seen to be related to the existence of
flat directions in the superpotential. In this respect the situation is similar
to massless SQCD, where we had at the same time flat directions in the
superpotential and an infinite scalar condensate in order to avoid contradictory
results between instanton calculations and the Konishi anomaly equation.
(III) Spontaneously broken supersymmetry. This conclusion indirectly
arises in Georgi–Glashow-type models in which the gauge group is $SU(N_c)$
with $N_c$ odd, because of conflicting constraints between instanton calculations
and relations implied by the Konishi anomaly equation. Several specific cases
have been considered in the literature. We list here some interesting examples.
1. Nc = 5, M = 1 and consequently nfund = 1. Although in this case no su-
perpotential can be constructed, it can be shown that the theory does not
admit any perturbative flat direction [31]. Because of the absence of super-
potential the Konishi anomaly relations imply that the gluino condensate
must vanish. When this result is put together with the non-perturbative
calculations of certain instanton-dominated Green functions [54, 4], one
is led to the conclusion that the involved scalar condensate cannot take a
finite value, if clustering is used and the vacuum is supersymmetric. The
wandering to infinity of the scalar condensate in the absence of flat direc-
tions looks highly implausible and one should rather conclude that there
is a dynamical breaking of supersymmetry owing to non-perturbative in-
stanton effects.
2. $N_c = 5$, M = 2 and consequently $n_{\rm fund} = 2$. This time the theory admits
a superpotential but still no flat directions exist. Unlike the previous case,
from the Konishi anomaly equations one can prove that all condensates
must vanish. This is in contradiction with the non-vanishing result given
by the instanton calculation of the Green function where the product
of these condensates appears, if clustering is invoked and the vacuum is
supersymmetric. The most tempting conclusion is that supersymmetry is
dynamically broken. One might object to this conclusion that actually
the instanton calculation is performed in the absence of a superpotential,
i.e. in a situation where disentangling is necessary and flat directions are
present. This should not be a problem, however, because, unlike the case
of the mass dependence, one expects the limit in which the superpotential
coupling vanishes to be a smooth one. For a discussion of these results
from the complementary effective action point of view, see Sect. 5.3.
3. Nc ≥ 7, M = 1. There exist many instanton-dominated Green functions
which, after vacuum disentangling, yield an overdetermined set of relations
for the condensates [56]. One can solve the resulting equations finding full
consistency with clustering. However, the Konishi anomaly equations are
such as to imply the vanishing of several condensates and thus through
instanton results the run away of others. Because no (perturbative) flat
directions exist in these models, one is led again to conclude that super-
symmetry is spontaneously broken.

5 The Effective Action Approach

Symmetry properties of the action in the form of anomalous and
non-anomalous WTIs, together with explicit dynamical (instanton) calculations,
have taught us a lot about the nature of supersymmetric N = 1 theories. A
very useful and elegant way to collect all these results is to have recourse to
the notion of “effective”, or “low-energy”, action (sometimes also referred to
as “effective Lagrangian”). This notion, though with slightly different
meanings and realms of application, has a long history. It was first introduced in
the papers of [27], as a way of compactly deriving the soft pion theorems of
current algebra, then fully developed for QCD with the inclusion of the η′
meson and the $U_A(1)$ anomaly in the works quoted in [57] and [28]. A parallel
road was opened by Symanzik [58] to deal with the lattice regularisation of
QCD, which turned out to be crucial for understanding the approach to the
continuum of the lattice theory.
The extension of these ideas to supersymmetric theories was first proposed
in refs. [29, 30], where the cases of N = 1 pure SYM and SQCD were
considered, and then expanded to a field of investigation of its own in [21, 37, 41].
Some review papers on the subject can be found in [59].
Effective actions for all the theories we have discussed in the previous
sections have been constructed and many interesting results have been ob-
tained. In the following we want to briefly review what was done with the
main purpose of comparing with instanton results.

5.1 The Effective Action of SYM

The first step in constructing the effective action, $\Gamma_{\rm eff}$,
describing the low-energy dynamics of a theory is to identify the degrees of
freedom relevant in the energy regime $E \ll \Lambda$, where $\Lambda$ is the
RGI mass scale of the theory. In the pure SYM case, where confinement seems to
hold, the obvious degrees of freedom can be collected in the (dimension three)
superfield
\[
S = \frac{g^2}{16\pi^2}\,{\rm Tr}(W^\alpha W_\alpha)\,, \tag{100}
\]
whose lowest component is precisely the gluino composite operator (41).
The second step is the observation that the interesting piece of
$\Gamma^{\rm SYM}_{\rm eff}$ is not so much its kinetic contribution (a D-term
which is non-holomorphic in S and reduces to the standard kinetic terms as
g → 0), but rather the F-term which provides the correct anomalous
transformation properties of the effective action. In the present case, it is
enough and convenient to refer to the $U_R(1)$ symmetry (see (A.33)) to fix the
form of this term, which is often referred to as the “effective
superpotential” in the literature.
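Recalling the $U_R(1)$ transformation properties of the superfield S (see the
table in Appendix A)
\[
S(x,\theta) \to e^{3i\alpha}\, S(x,\theta e^{-3i\alpha/2})\,, \tag{101}
\]
we are led to write for the full effective action the formula
\[
\Gamma^{\rm SYM}_{{\rm eff};N_c}
= \Gamma^{\rm SYM}_{\rm kin}(S,S^*)
+ \left[ W^{\rm SYM}_{{\rm eff};N_c}(S) + {\rm h.c.} \right] , \tag{102}
\]
where
\[
\Gamma^{\rm SYM}_{\rm kin}(S,S^*) = k \left[ (S^* S)^{1/3} \right]_D , \tag{103}
\]
\[
W^{\rm SYM}_{{\rm eff};N_c}(S)
= - \left[ S \left( \log\frac{S^{N_c}}{(c\Lambda_{\rm SYM})^{3N_c}} - N_c \right) \right]_F . \tag{104}
\]
In the above equations we used the shorthand notation
\[
[\ldots]_F \equiv \int d^4x\, d^2\theta \,\ldots\,, \qquad
[\ldots]_D \equiv \int d^4x\, d^2\theta\, d^2\bar\theta \,\ldots\, . \tag{105}
\]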

The expression of $\Gamma^{\rm SYM}_{\rm kin}(S,S^*)$ in (103) is in no way
unique. It is only an example of a functional having the property that (with a
suitable choice of the constant k) it reproduces the standard form of the
kinetic term in the limit g → 0. The other constant, c, cannot be fixed by
symmetry considerations alone. A way of determining its value will be discussed
in Sect. 5.2. Equation (104) is the famous Veneziano–Yankielowicz effective
action [29].
It is not difficult to prove that the second term in (102) has the desired
transformation properties under $U_R(1)$ (see (A.34)). From (101) we get in
fact (the x-dependence and the corresponding space–time integration are
understood)

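\[
\int d^2\theta\, S(\theta) \to e^{3i\alpha} \int d^2\theta\, S(x,\theta e^{-3i\alpha/2})
= \int d^2(\theta e^{-3i\alpha/2})\, S(\theta e^{-3i\alpha/2})
= \int d^2\theta\, S(\theta)\,, \tag{106}
\]
\[
\begin{aligned}
\int d^2\theta\, S(\theta) \log S(x,\theta)
&\to e^{3i\alpha} \int d^2\theta\, S(\theta e^{-3i\alpha/2})
\left[ 3i\alpha + \log S(\theta e^{-3i\alpha/2}) \right] \\
&= 3i\alpha \int d^2(\theta e^{-3i\alpha/2})\, S(\theta e^{-3i\alpha/2})
+ \int d^2(\theta e^{-3i\alpha/2})\, S(\theta e^{-3i\alpha/2}) \log S(\theta e^{-3i\alpha/2}) \\
&= 3i\alpha \int d^2\theta\, S(\theta) + \int d^2\theta\, S(\theta) \log S(\theta)\,.
\end{aligned} \tag{107}
\]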
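
As a reminder of why the phase $e^{3i\alpha}$ disappears in (106) and (107),
the following sketch uses only the standard Berezin rule
$\int d\theta\,\theta = 1$. The symbols $\lambda$, $\theta'$ and $F$ are
introduced here purely for illustration and do not appear elsewhere in the
text.
\[
\begin{aligned}
\theta' = \lambda\,\theta \;&\Rightarrow\; d\theta' = \lambda^{-1}\, d\theta
\quad \text{(so that } \textstyle\int d\theta'\,\theta' = 1 \text{)}\,, \\
\lambda = e^{-3i\alpha/2} \;&\Rightarrow\; d^2\theta' = \lambda^{-2}\, d^2\theta = e^{3i\alpha}\, d^2\theta \,.
\end{aligned}
\]
Hence $e^{3i\alpha}\, d^2\theta\, F(\theta e^{-3i\alpha/2}) = d^2\theta'\, F(\theta')$
for any chiral integrand F, and the only net effect of the $U_R(1)$ rotation
on $\int d^2\theta\, S \log S$ is the shift $3i\alpha \int d^2\theta\, S$
displayed in (107).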

Useful information about the non-perturbative properties of the theory can be
obtained from the formula (102) by determining the values of S which make
$\Gamma^{\rm SYM}_{\rm eff}$ stationary. These are constant field
configurations which minimise the effective action. Thus they yield the values
of the v.e.v. of the gluino composite operator (gluino condensate). From (102)
one gets the result
\[
\langle S \rangle = (c\Lambda_{\rm SYM})^3\, e^{2i\pi k/N_c}\,, \qquad k = 1, \ldots, N_c \,. \tag{108}
\]
If $c^3$ is identified with $(C_{N_c})^{1/N_c}$, then (108) becomes identical
to (70). However, in connection with the comments we made in Sect. 4.1, we
must remark here that there is a discrepancy between the number given by the
above identification and the choice c = 1 made in [23, 31, 59]. The latter can
be justified in the framework of the SQCD effective action approach if, in
conjunction with WCI calculations, certain consistency relations following
from decoupling (see Sect. 5.2) are employed.
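
The stationarity condition behind (108) can be made explicit. What follows is
a minimal sketch, assuming a constant field S and using only the F-term (104);
the overall sign of (104) plays no role in it.
\[
\begin{aligned}
\frac{\partial}{\partial S} \left[ S \left( \log\frac{S^{N_c}}{(c\Lambda_{\rm SYM})^{3N_c}} - N_c \right) \right]
&= \log\frac{S^{N_c}}{(c\Lambda_{\rm SYM})^{3N_c}} - N_c + N_c = 0 \\
&\Rightarrow\; S^{N_c} = (c\Lambda_{\rm SYM})^{3N_c}
\;\Rightarrow\; \langle S \rangle = (c\Lambda_{\rm SYM})^3\, e^{2i\pi k/N_c}\,, \quad k = 1, \ldots, N_c \,.
\end{aligned}
\]
The $N_c$ solutions are simply $(c\Lambda_{\rm SYM})^3$ times the $N_c$-th
roots of unity.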
A crystal clear way to resolve this puzzling discrepancy would be to arrive
at an evaluation of the SYM effective action from first principles, i.e. à la
Wilson–Polchinski [60]. Many efforts have been made in this direction and
a lot of interesting results have been obtained [61] insisting on the role of
anomalies [35, 62] in the construction of the Wilsonian action. A different
and perhaps more promising road has been recently undertaken which uses
the matrix model formulation of SYM [63]. In this framework the form of the
effective action could be derived [64, 65] and the result c = 1 was obtained.

5.2 The Effective Action of SQCD

For the purpose of extending the previous considerations to SQCD it is
convenient to distinguish among the three cases Nc > Nf , Nc = Nf and Nc < Nf
and separately discuss the massive and the massless situation.

SQCD with Nc > Nf

The Massive Case

The generalisation of the previous formulae to massive SQCD is almost
immediate if one also includes, among the degrees of freedom that describe the
low energy dynamics of the theory, the composite operators (42)
\[
T^f_h = \tilde\phi^f_r(x)\, \phi^r_h(x) \,. \tag{109}
\]
Apart from the kinetic terms, which are inessential for this discussion, one
finds that the formula which extends (102) is
\[
\Gamma^{\rm SQCD}_{{\rm eff};N_c,N_f}(S,S^*;T,T^*)
= \Gamma^{\rm SQCD}_{\rm kin}(S,S^*;T,T^*)
+ \left[ W^{\rm SQCD}_{{\rm eff};N_c,N_f}(S;T) + {\rm h.c.} \right] , \tag{110}
\]
\[
W^{\rm SQCD}_{{\rm eff};N_c,N_f}(S;T)
= \left[ - S \left( \log\frac{S^{N_c-N_f} \det T}{(c\Lambda_{\rm SQCD})^{3N_c-N_f}} - (N_c - N_f) \right)
+ \sum_f m_f T^f_f \right]_F . \tag{111}
\]

As before, the constant c cannot be fixed by symmetry considerations alone.
We will discuss the important issue of determining its precise value below. The
minimisation conditions (also called F-flatness conditions) allow one to
determine all the condensates, and one finds$^{13}$
\[
\langle S \rangle = e^{2i\pi k/N_c}\, (c\Lambda_{\rm SQCD})^3
\prod_{f=1}^{N_f} \left( \frac{m_f}{c\Lambda_{\rm SQCD}} \right)^{1/N_c} ,
\qquad k = 1, \ldots, N_c \,, \tag{112}
\]
\[
\langle T^f_h \rangle = \delta^f_h\, \frac{\langle S \rangle}{m_f} \,, \tag{113}
\]
viz. the same results that were obtained in (89) up to the normalisation of
the Λ parameter.

13 We recall the elementary formula $d \log[\det T]/dT^f_h = (T^{-1})^h_f$.
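
For the reader who wishes to retrace the steps, the F-flatness conditions of
(111) can be sketched as follows. This is only an outline, assuming constant
fields and a diagonal mass term, and using the formula recalled in footnote 13.
\[
\begin{aligned}
\frac{\partial W}{\partial T^f_h} = 0 \;&:\quad - S\, (T^{-1})^h_f + m_f\, \delta^h_f = 0
\;\Rightarrow\; T^f_h = \delta^f_h\, \frac{S}{m_f} \,, \\
\frac{\partial W}{\partial S} = 0 \;&:\quad \log\frac{S^{N_c-N_f} \det T}{(c\Lambda_{\rm SQCD})^{3N_c-N_f}} = 0
\;\Rightarrow\; S^{N_c-N_f} \det T = (c\Lambda_{\rm SQCD})^{3N_c-N_f} \,, \\
\det T = \frac{S^{N_f}}{\prod_f m_f} \;&\Rightarrow\;
S^{N_c} = (c\Lambda_{\rm SQCD})^{3N_c-N_f} \prod_{f=1}^{N_f} m_f
= (c\Lambda_{\rm SQCD})^{3N_c} \prod_{f=1}^{N_f} \frac{m_f}{c\Lambda_{\rm SQCD}} \,.
\end{aligned}
\]
Taking the $N_c$-th root of the last relation reproduces (112), and the first
line is (113).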
Equation (111) has a number of nice properties.
(1) It is mathematically meaningful not only for Nc > Nf , where instanton
calculations are feasible, but for any value of Nc and Nf , except Nc = Nf .
Actually, in the last case a further composite operator has to come into play,
as already hinted at by the results of Sect. 4.2. We will discuss this case in
detail below.
(2) It obeys the expected decoupling theorem, in the sense that when one
of the flavours gets infinitely massive, (111) precisely turns into the
effective action for the theory with one less flavour, in which that particular
flavour is absent, provided the Λ parameters of the two theories,
$\Lambda^{(N_f)}_{\rm SQCD}$ and $\Lambda^{(N_f-1)}_{\rm SQCD}$, are matched as
described in Appendix D.
(3) The massive S field can be “integrated out”, leaving a pure matter
effective superpotential
\[
W^{\rm SQCD}_{{\rm eff};N_c,N_f}(T)
= \left[ (N_c - N_f) \left( \frac{(c\Lambda_{\rm SQCD})^{3N_c-N_f}}{\det T} \right)^{\frac{1}{N_c-N_f}}
+ \sum_f m_f T^f_f \right]_F , \tag{114}
\]
which coincides with the Affleck–Dine–Seiberg [21] effective superpotential.
One has to remark, however, that the formula is only meaningful for Nc > Nf .
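
The step from (111) to (114) can be sketched as follows, assuming that S is
eliminated through its own stationarity condition.
\[
\begin{aligned}
\frac{\partial W}{\partial S} = 0
\;&\Rightarrow\; S = \left( \frac{(c\Lambda_{\rm SQCD})^{3N_c-N_f}}{\det T} \right)^{\frac{1}{N_c-N_f}} , \\
W\big|_{\partial_S W = 0}
&= \left[ - S \left( 0 - (N_c - N_f) \right) + \sum_f m_f T^f_f \right]_F
= \left[ (N_c - N_f)\, S + \sum_f m_f T^f_f \right]_F .
\end{aligned}
\]
The logarithm vanishes on the solution, so only the term linear in S survives.
The exponent $1/(N_c - N_f)$ makes it manifest that the elimination cannot be
carried out at Nc = Nf .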

The Massless Case

The discussion of the massless case is quite delicate (and controversial).
This is to be expected in view of the results of the dynamical instanton
calculations presented in Sect. 4.2, which showed conflicting findings for the
condensates when the massless limit of the massive results was compared to
what one gets in the strictly massless case.
Let us now see what the formulae (111) or (114) imply for the massless SQCD
effective action. If we start from (111), we conclude that, when any of the
masses is sent to zero, the gluino condensate must be taken to vanish for
consistency with (112), while (113) does not contain sufficient information to
determine any of the scalar condensates $\langle T^f_h \rangle$. If, on the
contrary, all the masses are set to zero from the beginning, as was done
in [31], and (114) is used, one must conclude that the scalar condensates run
away to infinity, as the potential generated by the massless effective
action (114) monotonically decreases to zero in this limit.

SQCD with Nc = Nf

When Nf = Nc new composite operators can be constructed which must appear
among the fields of the effective action. They are the determinants, X and X̃,
over colour and flavour indices of the matter fields, whose lowest components
are shown in (74) and (75). The need for new field operators is confirmed by
the observation that the action (110) does not contain a sufficiently large
number of massless fermions to satisfy the ’t Hooft anomaly matching
conditions [41, 44, 59] associated with the non-anomalous $U_{\hat A}(1)$
symmetry defined by (A.35) and (A.36) (see also the table in Appendix A).

The Massive Case

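The most general form of the effective action is
\[
\begin{aligned}
\Gamma^{\rm SQCD}_{\rm eff}(S,S^*;T,T^*;X,\tilde X)
&= \Gamma^{\rm SQCD}_{\rm kin}(S,S^*;T,T^*;X,\tilde X) \\
&\quad + \left[ - S \left( \log\frac{\det T}{(c\Lambda_{\rm SQCD})^{2N_c}} + f(Z) \right)
+ \sum_f m_f T^f_f \right]_F + {\rm h.c.} \,,
\end{aligned} \tag{115}
\]
where f(Z) is a function of the ratio $Z = X\tilde X/\det T$. For
$m_f \neq 0$ (all f) the stationarity conditions lead to the equations
\[
\log\frac{\det T}{(c\Lambda_{\rm SQCD})^{2N_c}} + f(Z) = 0 \,, \tag{116}
\]
\[
m_f\, \delta^h_f = S \left[ 1 - Z\, \frac{\partial f(Z)}{\partial Z} \right] (T^{-1})^h_f \,, \tag{117}
\]
\[
S\, \frac{\partial f(Z)}{\partial Z}\, \frac{\tilde X}{\det T} = 0 \,, \tag{118}
\]
\[
S\, \frac{\partial f(Z)}{\partial Z}\, \frac{X}{\det T} = 0 \,, \tag{119}
\]
which have the solution
\[
\langle X \rangle = \langle \tilde X \rangle = 0 \,, \tag{120}
\]
\[
m_f\, \langle T^f_h \rangle = \delta^f_h\, \langle S \rangle \,, \tag{121}
\]
\[
\langle \det T \rangle = e^{-f(0)}\, (c\Lambda_{\rm SQCD})^{2N_c} \,, \tag{122}
\]
a result with exactly the same structure as (89)–(91).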
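
The way in which (116)–(119) lead to (120)–(122) can be sketched as follows.
The outline assumes $m_f \neq 0$ for all f and, as an extra assumption not
spelled out in the text, $\partial f/\partial Z \neq 0$ at the stationary
point.
\[
\begin{aligned}
&\frac{\partial Z}{\partial T^f_h} = - Z\, (T^{-1})^h_f
\quad \text{(from } Z = X\tilde X/\det T \text{ and footnote 13), which gives (117)}\,, \\
&m_f \neq 0 \text{ in (117)} \;\Rightarrow\; S \neq 0
\;\Rightarrow\; X = \tilde X = 0 \text{ from (118), (119)}
\;\Rightarrow\; Z = 0 \,, \\
&Z = 0 \text{ in (117)} \;\Rightarrow\; m_f\, \delta^h_f = S\, (T^{-1})^h_f
\;\Rightarrow\; m_f\, T^f_h = \delta^f_h\, S \,, \\
&Z = 0 \text{ in (116)} \;\Rightarrow\; \det T = e^{-f(0)}\, (c\Lambda_{\rm SQCD})^{2N_c} \,.
\end{aligned}
\]
Taking the determinant of (121) then fixes the gluino condensate through
$S^{N_c} = e^{-f(0)}\, (c\Lambda_{\rm SQCD})^{2N_c} \prod_f m_f$.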

The Massless Case

The massless case is, as usual, more subtle. Besides $\langle S \rangle = 0$
(as implied by the massless limit of the Konishi anomaly equation (121)), by
varying the effective action with respect to S one still gets the constraint
\[
\log\frac{\det T}{(c\Lambda_{\rm SQCD})^{2N_c}} + f(Z) = 0 \,, \tag{123}
\]
which (if $f(Z) \neq 0$) only fixes one combination of X, X̃ and det T ,
leaving the other two undetermined. This can be interpreted as the equivalent
of the statement that for Nf = Nc and mf = 0 the perturbative flat directions
are not (all) removed, so vacua with arbitrarily large values of these
condensates can occur. The effective action vanishes at the minimum and one is
only left with the constraint (123).
The explicit form of this constraint was worked out in [41] with the
conclusion that the classical relation $\det T = X\tilde X$ is lifted by
quantum corrections to the formula
\[
\det T - X\tilde X = (\Lambda_{\rm SQCD})^{2N_c} \,. \tag{124}
\]
This was the first example of a by now well-known phenomenon (see Sect. 8)
according to which the quantum theory can display a whole manifold of
(degenerate) vacuum states where supersymmetry is unbroken. It is a complex
Kähler manifold (often called the “quantum moduli space”, M) to which the
point representing the classical vacuum does not always belong. We end this
discussion by noticing that the constraint (124) can also be derived from a
massless effective action of the type (115) if one simply takes
\[
f(Z) = \log(1 - Z) \,. \tag{125}
\]
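
The statement that the choice (125) reproduces (124) is a one-line check, which
we sketch here using (123) and $Z = X\tilde X/\det T$.
\[
\log\frac{\det T}{(c\Lambda_{\rm SQCD})^{2N_c}} + \log\left( 1 - \frac{X\tilde X}{\det T} \right)
= \log\frac{\det T - X\tilde X}{(c\Lambda_{\rm SQCD})^{2N_c}} = 0
\;\Rightarrow\; \det T - X\tilde X = (c\Lambda_{\rm SQCD})^{2N_c} \,.
\]
This coincides with (124) for c = 1. Moreover, since f(0) = 0 for this choice,
(122) reduces to $\langle \det T \rangle = (c\Lambda_{\rm SQCD})^{2N_c}$, as
expected at $X = \tilde X = 0$.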

SQCD with Nf > Nc

In this case dynamical instanton calculations are not possible (see our
discussion in Sect. 4), nor do the general considerations of [31] apply. In
principle, one can imagine proceeding with the effective action approach,
guided by information on the relevant low-energy degrees of freedom provided
by the ’t Hooft anomaly matching conditions.
• For instance, in the case Nf = Nc + 1 the two baryon-like superfields
\[
B^f = \epsilon^{f f_1 f_2 \ldots f_{N_c}}\, \epsilon_{r_1 r_2 \ldots r_{N_c}}\,
\Phi^{r_1}_{f_1} \Phi^{r_2}_{f_2} \cdots \Phi^{r_{N_c}}_{f_{N_c}} \,, \tag{126}
\]
\[
\tilde B_f = \epsilon_{f f_1 f_2 \ldots f_{N_c}}\, \epsilon^{r_1 r_2 \ldots r_{N_c}}\,
\bar\Phi^{f_1}_{r_1} \bar\Phi^{f_2}_{r_2} \cdots \bar\Phi^{f_{N_c}}_{r_{N_c}} \tag{127}
\]
must come into play in order to fulfil such conditions. They can combine with
$T^h_f$ to give the term $B^f T^h_f \tilde B_h$ in the effective action. The
whole expression of the latter can then be argued to have the form
\[
\Gamma^{\rm SQCD}_{\rm eff} = \Gamma^{\rm SQCD}_{\rm kin}
+ \left[ \frac{\det T - B^f T^h_f \tilde B_h}{(\Lambda_{\rm SQCD})^{b_1}} \right]_F + {\rm h.c.} \,, \tag{128}
\]
where b1 = 3Nc − Nf = 2Nc − 1. As a consistency check, it can be shown that,
if the (Nc + 1)-th flavour is given a mass and decoupled, then the situation
we described in the previous subsection, where we had Nf = Nc , is recovered.
It is interesting to remark that by solving the F-flatness equations implied
by the effective action (128), one finds that, unlike the case Nf = Nc , the
point corresponding to the classical vacuum
$\langle T^h_f \rangle = \langle B^f \rangle = \langle \tilde B_h \rangle = 0$
belongs to the moduli space of the theory. At this point the (non-anomalous)
symmetry of the classical action,
$SU_L(N_f) \times SU_R(N_f) \times U_V(1) \times U_{\hat A}(1)$, is fully
unbroken.
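• For larger values of Nf (Nf > Nc + 1, but smaller than 3Nc , where UV
asymptotic freedom is lost) the above formulae have an obvious generalisation.
One introduces the chiral superfields
\[
B^{f_1 \ldots f_{\tilde N_c}} = \epsilon^{f_1 \ldots f_{\tilde N_c} f_{\tilde N_c+1} \ldots f_{N_f}}\,
\epsilon_{r_1 \ldots r_{N_c}}\,
\Phi^{r_1}_{f_{\tilde N_c+1}} \cdots \Phi^{r_{N_c}}_{f_{N_f}} \,, \tag{129}
\]
\[
\tilde B_{f_1 \ldots f_{\tilde N_c}} = \epsilon_{f_1 \ldots f_{\tilde N_c} f_{\tilde N_c+1} \ldots f_{N_f}}\,
\epsilon^{r_1 \ldots r_{N_c}}\,
\tilde\Phi^{f_{\tilde N_c+1}}_{r_1} \cdots \tilde\Phi^{f_{N_f}}_{r_{N_c}} \,, \tag{130}
\]
where, following [41], we have set
\[
\tilde N_c = N_f - N_c \,. \tag{131}
\]
In terms of the operators (129), (130) and $T^h_f$ one may construct the
effective action
\[
\Gamma^{\rm SQCD}_{\rm eff} = \Gamma^{\rm SQCD}_{\rm kin}
+ \left[ \frac{\det T - B^{f_1 f_2 \ldots f_{\tilde N_c}}\,
T^{h_1}_{f_1} T^{h_2}_{f_2} \cdots T^{h_{\tilde N_c}}_{f_{\tilde N_c}}\,
\tilde B_{h_1 h_2 \ldots h_{\tilde N_c}}}{(\Lambda_{\rm SQCD})^{b_1}} \right]_F + {\rm h.c.} \,, \tag{132}
\]
where b1 is the first coefficient of the β-function of the theory.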

The trouble with this analysis is that, if only the above set of composite
operators is considered, the ’t Hooft anomaly conditions are not fulfilled,
nor does the superpotential have the correct quantum numbers to fit the
anomalous symmetries of the theory.
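
For the case Nf = Nc + 1, the earlier remark that the origin belongs to the
moduli space of (128) can be checked directly. The following is only a sketch,
assuming constant fields and ignoring the overall factor
$(\Lambda_{\rm SQCD})^{-b_1}$.
\[
\begin{aligned}
\frac{\partial}{\partial B^f} \;&:\quad T^h_f\, \tilde B_h = 0 \,, \\
\frac{\partial}{\partial \tilde B_h} \;&:\quad B^f\, T^h_f = 0 \,, \\
\frac{\partial}{\partial T^h_f} \;&:\quad \frac{\partial \det T}{\partial T^h_f} - B^f \tilde B_h = 0 \,.
\end{aligned}
\]
The cofactor $\partial \det T/\partial T^h_f$ is a homogeneous polynomial of
degree $N_f - 1 = N_c$ in the entries of T, so all three conditions are
satisfied at $T = B = \tilde B = 0$. This is to be contrasted with the case
Nf = Nc , where the constraint (124) has a non-vanishing right-hand side and
therefore excludes the origin.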
An inspiring and physically compelling interpretation of the situation was
given in [41], where it was argued that the theory also admits a “dual”
description in terms of an SQCD-like action with the same global “flavour”
symmetries, hence with quark fields $Q_f$ and $\tilde Q^f$
(f = 1, 2, . . . , Nf ), but with gauge group $SU(\tilde N_c)$ with
$\tilde N_c = N_f - N_c$. This conclusion follows from the observation that
the moduli space of the theory remains unmodified quantum mechanically for all
values of Nf > Nc + 1, at least up to Nf = 3Nc . In turn this means that the
classical vacuum at the origin (where all the expectation values of the
composite fields which represent the degrees of freedom of the low energy
theory vanish) preserves the original
$SU_L(N_f) \times SU_R(N_f) \times U_V(1) \times U_{\hat A}(1)$ symmetry.
Consequently, in the dual theory there must necessarily be Nf quark fields,
though coupled to a different gauge group. In this theory the operators (129)
and (130) are interpreted as composite operators of the form
\[
B^{f_1 \ldots f_{\tilde N_c}} = \epsilon^{r_1 \ldots r_{\tilde N_c}}\,
\tilde Q^{f_1}_{r_1} \tilde Q^{f_2}_{r_2} \cdots \tilde Q^{f_{\tilde N_c}}_{r_{\tilde N_c}} \,, \tag{133}
\]
\[
\tilde B_{f_1 \ldots f_{\tilde N_c}} = \epsilon_{r_1 \ldots r_{\tilde N_c}}\,
Q^{r_1}_{f_1} Q^{r_2}_{f_2} \cdots Q^{r_{\tilde N_c}}_{f_{\tilde N_c}} \,. \tag{134}
\]
An additional chiral, gauge invariant, supermultiplet, $M^f_h$, is assumed to
exist, which is necessary for matching the ’t Hooft anomaly conditions. In
terms of the above composite fields an effective superpotential can be written
down. It reads $Q_f M^f_h \tilde Q^h$. The relation between this theory and the
original theory is referred to as “non-abelian electric–magnetic duality” (or
more simply as “Seiberg duality”) and indeed it can be argued to be a duality
relation in the sense that the dual of the dual is the original theory, with
the quarks and gluons of one description interpreted as solitons (magnetic
monopoles) of the other.
Summarising, according to [41, 59], we can briefly describe what happens
to SQCD when Nf increases at fixed Nc beyond Nc + 1 as follows.
• For Nc + 2 ≤ Nf < 3Nc /2 the asymptotic particles of the theory are
the dual quark fields $Q_f$ and $\tilde Q^f$ and the mesons $M^f_h$, which
interact through an IR-free ($b_1 = 3\tilde N_c - N_f = 2N_f - 3N_c < 0$)
supersymmetric theory with gauge group $SU(\tilde N_c)$. This is how the SQCD
theory we started with looks in terms of “magnetic” variables dual to the
original “electric” variables (which are instead strongly coupled in this
range of Nf values).
• As soon as Nf goes through the value 3Nc /2, the first coefficient of the
dual theory β-function changes sign and the theory is expected to flow to
a non-trivial IR fixed point. This continues to be true for the whole range
of values 3Nc /2 < Nf < 3Nc . Both the original and the dual theory can
be argued to be conformal theories of interacting quarks and gluons (we
remark that $3N_c/2 < N_f < 3N_c \iff 3\tilde N_c/2 < N_f < 3\tilde N_c$).
However, as Nf increases the electric variables tend to become more and more
weakly coupled and the opposite happens for the dual magnetic variables.
• At Nf = 3Nc , where the original theory loses asymptotic freedom, the
IR fixed point comes to zero coupling.
• For even larger values of Nf > 3Nc the original electric theory is an IR
free theory of quarks and gluons.
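
The arithmetic behind the ranges listed above is elementary and can be
sketched as follows. The symbol $\tilde b_1$ for the first β-function
coefficient of the dual theory is a notation introduced here for clarity (the
text calls it b1), and the values Nc = 5, Nf = 7, 8 are purely illustrative.
\[
\begin{aligned}
\tilde b_1 &= 3\tilde N_c - N_f = 3(N_f - N_c) - N_f = 2N_f - 3N_c \,,
\qquad \tilde b_1 < 0 \iff N_f < \tfrac{3}{2} N_c \,, \\
N_f &< 3\tilde N_c = 3N_f - 3N_c \iff N_f > \tfrac{3}{2} N_c \,,
\qquad
N_f > \tfrac{3}{2}\tilde N_c = \tfrac{3}{2}(N_f - N_c) \iff N_f < 3N_c \,, \\
N_c = 5,\; N_f = 7 \;&:\quad \tilde N_c = 2 \,,\quad \tilde b_1 = 6 - 7 = -1 < 0
\quad \text{(IR-free magnetic theory)} \,, \\
N_c = 5,\; N_f = 8 \;&:\quad \tilde N_c = 3 \,,\quad \tilde b_1 = 9 - 8 = 1 > 0 \,,\quad
b_1 = 15 - 8 = 7 > 0 \quad \text{(conformal window)} \,.
\end{aligned}
\]
The second line is the equivalence
$3N_c/2 < N_f < 3N_c \iff 3\tilde N_c/2 < N_f < 3\tilde N_c$ quoted in the
text: the conformal window is mapped into itself by the exchange
$N_c \leftrightarrow \tilde N_c$.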

Normalising the SQCD and SYM Effective Action

As we have seen, the interesting piece of the SYM and SQCD effective actions
can be fixed by symmetry arguments only up to a constant rescaling of their Λ
parameter. We want to show in this section how, exploiting the
self-consistency requirement implicit in the decoupling theorem, one can fix
these constants, provided that some dynamical (e.g. instanton-based)
information is also available. We will develop the argument along the line of
reasoning advocated in [31, 23, 37] and summarised in [59].

The Case of SQCD

Starting from (114) with just one massive flavour, say the Nf -th one, we
require that the effective superpotential of the theory with Nf flavours,
namely,
\[
W^{\rm SQCD}_{{\rm eff};N_c,N_f}(T)
= \left[ (N_c - N_f)\, \eta(N_f)
\left( \frac{(\Lambda^{(N_f)}_{\rm SQCD})^{3N_c-N_f}}{\det T} \right)^{\frac{1}{N_c-N_f}}
+ m_{N_f}\, T^{N_f}_{N_f} \right]_F , \tag{135}
\]
goes over to the effective superpotential of the theory with one flavour less
when $m_{N_f}$ gets large after using (D.3). To simplify and clarify
notations, we have introduced the new constant
$\eta(N_f) = c^{(3N_c-N_f)/(N_c-N_f)}$ with respect to what we had in (114)
and we have attached the extra superscript Nf to the Λ parameter of SQCD in
order to trace the number of “active” flavours in each theory.
We now proceed to eliminate $T^{N_f}_{N_f}$ by using the F-flatness condition
for $T^{N_f}_{N_f}$, which amounts to the stationarity equation
$\partial W^{\rm SQCD}_{{\rm eff};N_c,N_f}(T)/\partial T^{N_f}_{N_f} = 0$.
We also notice that the analogous conditions for the $T^{f}_{N_f}$ and
$T^{N_f}_{h}$ components imply the vanishing of their expectation value. After
some algebra one finds that the r.h.s. of (135) becomes
\[
\left[ (N_c - N_f + 1)\, \eta(N_f)^{\frac{N_c-N_f}{N_c-N_f+1}}
\left( \frac{m_{N_f}\, (\Lambda^{(N_f)}_{\rm SQCD})^{3N_c-N_f}}{\det \tilde T} \right)^{\frac{1}{N_c-N_f+1}} \right]_F , \tag{136}
\]
where $\det \tilde T$ is the matter determinant with the Nf -th flavour
missing. Since the decoupling condition (D.3) implies
\[
m_{N_f}\, (\Lambda^{(N_f)}_{\rm SQCD})^{3N_c-N_f} = (\Lambda^{(N_f-1)}_{\rm SQCD})^{3N_c-N_f+1} \,, \tag{137}
\]
we see that the expression (136) becomes the formula for the effective
superpotential of SQCD with Nf − 1 flavours if η(Nf ) satisfies the equation
\[
\eta(N_f)^{\frac{N_c-N_f}{N_c-N_f+1}} = \eta(N_f - 1) \,. \tag{138}
\]
The most general solution of (138) is
\[
\eta(N_f) = \eta_0^{\frac{1}{N_c-N_f}} \,, \tag{139}
\]

with η0 a quantity which does not depend on Nf . The last observation is
rather important as it can be exploited to simplify the calculation of η0 . In
practice, one can proceed in two ways. One is based on an explicit dynamical
computation which was done in the WCI approach with the result η0 = 1 [21,
31, 45, 46]. The calculation is performed in the especially simple case of SQCD
with Nf = Nc − 1 flavours, where the SU (Nc ) gauge symmetry is completely
broken by non-vanishing scalar v.e.v.’s (see Appendix E). In this situation the
theory is weakly coupled for sufficiently large v.e.v.’s, thus constrained [47]
instanton calculations are expected to be fully reliable.
The second strategy [37] consists in determining η0 by means of a
self-consistency constraint that fixes the value of the gluino condensate in an
$SU(2)_1 \times SU(2)_2$ gauge theory with matter in the (2, 2) representation.
The argument, which is quite elegant, exploits the knowledge of the effective
superpotential of the theory, derived in [66], and confirms the result η0 = 1.
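
The “algebra” leading from (135) to (136), and the check that (139) solves
(138), can be sketched as follows. The abbreviations $n$, $t$ and $A$ are
introduced here only to lighten the notation, and the sketch assumes that the
off-diagonal components involving the Nf -th flavour vanish, so that
$\det T = t\, \det \tilde T$.
\[
\begin{aligned}
&n \equiv N_c - N_f \,, \qquad t \equiv T^{N_f}_{N_f} \,, \qquad
A \equiv \frac{(\Lambda^{(N_f)}_{\rm SQCD})^{3N_c-N_f}}{\det \tilde T} \,, \qquad
W = n\, \eta\, A^{1/n}\, t^{-1/n} + m_{N_f}\, t \,, \\
&\frac{\partial W}{\partial t} = - \eta\, A^{1/n}\, t^{-1/n-1} + m_{N_f} = 0
\;\Rightarrow\; m_{N_f}\, t = \eta\, A^{1/n}\, t^{-1/n} \,, \qquad
t = \left( \frac{\eta}{m_{N_f}} \right)^{\frac{n}{n+1}} A^{\frac{1}{n+1}} \,, \\
&W\big|_{\partial_t W = 0} = (n + 1)\, m_{N_f}\, t
= (n + 1)\, \eta^{\frac{n}{n+1}} \left( m_{N_f}\, A \right)^{\frac{1}{n+1}} \,, \\
&\eta(N_f) = \eta_0^{1/n} \;\Rightarrow\;
\eta(N_f)^{\frac{n}{n+1}} = \eta_0^{\frac{1}{n+1}} = \eta_0^{\frac{1}{N_c-(N_f-1)}} = \eta(N_f - 1) \,.
\end{aligned}
\]
The third line is (136), and the last line verifies (138). Because η0 does not
depend on Nf , it is sufficient to compute it for a single convenient value of
Nf , which is what the two strategies described above exploit.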

The Case of SYM

Already the result mentioned at the end of the previous section tells us that
the normalising constant, c, in (104) is to be taken equal to one, at variance
with the direct SCI calculation which, if a factor Nc = 2 is divided out, gave
$c = 1/\sqrt{5}$ (see (63)–(70) and the discussion in Sect. 4.1).
There are other similar indirect ways to determine c. An elegant one is to
start from a pure N = 2 SYM theory with the addition of a mass term for
the chiral (matter) superfield which breaks supersymmetry down to N = 1.
Decoupling the massive multiplet by sending the mass parameter to infinity
leaves behind a pure N = 1 SYM theory. The reason to reach N = 1 in this
somewhat complicated way is that for the effective action of pure N = 2 SYM
we have the beautiful SW formula [51] and a simple description of the theory
in terms of low energy degrees of freedom (see Sect. 8).
To illustrate the method we wish to present here a simplified adaptation
of the original argument given in [38]. The starting point is the formula
\[
\langle g^2 \lambda^{\alpha a} \lambda^a_\alpha \rangle
= -16\pi i\, \frac{\partial W^{N=1}_{\rm eff}}{\partial \tau}
= \frac{32\pi^2}{b_1}\, \Lambda\, \frac{\partial W^{N=1}_{\rm eff}}{\partial \Lambda} \,, \tag{140}
\]
where for the SU (2) gauge group b1 = 3 × 2 and we have set
\[
\tau = \frac{4\pi i}{g^2} + \frac{\vartheta}{2\pi} \,, \tag{141}
\]
\[
\Lambda = \mu\, e^{2\pi i \tau(\mu)/b_1} \,. \tag{142}
\]
We need to compute $W^{N=1}_{\rm eff}$ with its correct normalisation. In
principle $W^{N=1}_{\rm eff}$ could be obtained from (104), after integrating
out the S superfield. Precisely because at this stage the normalisation of the
N = 1 effective action is unknown, we shall start from the well-defined
expression of the N = 2 effective superpotential which in the relevant strong
coupling regime takes the form
\[
W^{N=2}_{\rm eff}(A_D, M, \bar M) = \sqrt{2}\, \bar M A_D M + m\, U(A_D) \,, \tag{143}
\]
where the chiral superfields M, M̄ describe the monopole multiplet and $A_D$
is the dual Higgs superfield. In this regime the quantum modulus, U , is
naturally expressed in terms of $A_D$ (and not of A). Solving the F-flatness
conditions leads to the v.e.v.’s
\[
a_D \equiv \langle A_D \rangle = 0 \,, \qquad
\langle M \rangle = \langle \bar M \rangle
= \left( - \frac{m}{\sqrt{2}}\, \frac{dU(x)}{dx}\bigg|_{x=0} \right)^{1/2} . \tag{144}
\]
In this vacuum configuration, one finds
\[
W^{N=2}_{\rm eff}(0, \langle M \rangle, \langle \bar M \rangle)
= m\, u(0) + \ldots = m\, \Lambda^2_{\rm SW} + \ldots
= 2m\, (\Lambda^{N=2}_{\rm PV})^2 + \ldots \,, \tag{145}
\]
where $u(0) = U(0) = \Lambda^2_{\rm SW}$ (as it follows from the known relation
between $a_D$ and $u = \langle U \rangle$$^{14}$) and $\Lambda_{\rm SW}$ is the
SW dynamical scale. In the last equality, for the purpose of comparing with the
rest of our formulae, we have introduced the more standard Pauli–Villars scale
(which we consistently used throughout this review), related to the former by
the relation [37] $\Lambda^{N=2}_{\rm PV} = \Lambda_{\rm SW}/\sqrt{2}$.
The last step of this quite elaborate argument consists in decoupling the
matter superfield by sending m to infinity while keeping fixed the combination
(see (D.3) and (67))
\[
\Lambda^6 = m^2\, (\Lambda^{N=2}_{\rm PV})^4 \,. \tag{146}
\]
Inserting this relation in the last equality of (145) gives
\[
W^{N=1}_{\rm eff} = 2\Lambda^3 \,, \tag{147}
\]
from which the equation
\[
\left\langle \frac{g^2}{32\pi^2}\, \lambda^{\alpha a} \lambda^a_\alpha \right\rangle = \Lambda^3 \tag{148}
\]
follows. This calculation again yields the so-called WCI result c = 1.
Although, as we have developed the argument, this computation is enough
to fix the normalisation of the effective potential for any number of colours,
it would be nice to repeat a similar reasoning in the generic case of an
SU (Nc ) gauge group in order to explicitly check the Nc behaviour of the
gluino condensate. This issue is of relevance for the interesting question of
relating non-supersymmetric QCD-like gauge theories with supersymmetric ones
in the large Nc limit as proposed in the nice papers of [67].
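
The chain of steps from (143) to (148) can be retraced in a few lines. The
following is a sketch which keeps only the leading terms (the ellipses in
(145) are dropped) and uses b1 = 6 for SU (2).
\[
\begin{aligned}
&\frac{\partial W^{N=2}_{\rm eff}}{\partial M} = \sqrt{2}\, \bar M A_D = 0 \,, \qquad
\frac{\partial W^{N=2}_{\rm eff}}{\partial A_D} = \sqrt{2}\, \bar M M + m\, \frac{dU}{dA_D} = 0
\;\Rightarrow\; a_D = 0 \,,\;\;
\langle \bar M M \rangle = - \frac{m}{\sqrt{2}}\, \frac{dU(x)}{dx}\bigg|_{x=0} \,, \\
&W^{N=2}_{\rm eff}\big|_{\rm vac} = m\, U(0) = m\, \Lambda^2_{\rm SW}
= 2m\, (\Lambda^{N=2}_{\rm PV})^2 \,, \qquad
m\, (\Lambda^{N=2}_{\rm PV})^2 = \Lambda^3 \text{ by (146)}
\;\Rightarrow\; W^{N=1}_{\rm eff} = 2\Lambda^3 \,, \\
&\langle g^2 \lambda^{\alpha a} \lambda^a_\alpha \rangle
= \frac{32\pi^2}{6}\, \Lambda\, \frac{\partial (2\Lambda^3)}{\partial \Lambda}
= \frac{32\pi^2}{6} \cdot 6\, \Lambda^3 = 32\pi^2\, \Lambda^3 \,.
\end{aligned}
\]
Dividing the last relation by $32\pi^2$ gives (148).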

5.3 The Effective Action of Georgi–Glashow-type Models

A number of interesting results have been obtained in the literature [55, 68]
for supersymmetric theories with chiral matter. Here, for brevity, we will only
discuss two specific cases, (1) SU (6) with two matter superfields in the 6 and
one in the $\overline{15}$ representation and (2) SU (5) with one matter
superfield in the 5 and another one in the $\overline{10}$ representation, as
prototypes of two different typical situations, namely unbroken supersymmetry
with well-defined vacua and dynamically broken supersymmetry, respectively
(see the corresponding discussion in Sect. 4.3).

14 We recall the SW formula
$a_D = \frac{\sqrt{2}}{\pi} \int_{\Lambda^2_{\rm SW}}^{u} dx\,
\frac{\sqrt{x - u}}{\sqrt{x^2 - \Lambda^4_{\rm SW}}}$, see Sect. 8.

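SU (6) with Matter in 2 × 6 + $\overline{15}$

The construction of the effective action of this theory requires, besides the
chiral composite superfields (see (99))
\[
S = \frac{g^2}{32\pi^2}\, W^\alpha W_\alpha \,, \qquad
T = \epsilon_{IJ}\, \Phi^I_r \Phi^J_s X^{rs} \,, \tag{149}
\]
\[
U = \epsilon_{r_1 s_1 r_2 s_2 r_3 s_3}\, X^{r_1 s_1} X^{r_2 s_2} X^{r_3 s_3} \,, \tag{150}
\]
the real (vector) ones
\[
R_I{}^J = \Phi^\dagger_I\, e^{2gV^{(6)}}\, \Phi^J \,, \qquad
Q = X^\dagger\, e^{2gV^{(\overline{15})}}\, X \,. \tag{151}
\]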

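The expression of the effective action which fulfils all the relevant anomalous
and non-anomalous WTIs of the microscopic theory reads [55]
\[
\begin{aligned}
\Gamma^{\rm GG-SU(6)}_{\rm eff}
&= \left[ {\rm Tr} R + Q + \bar S S + \xi_R\, {\rm Tr} \log R + \xi_Q \log Q
+ {\rm Tr}(T^\dagger R^{-1} T)\, (\det R)^{-1} Q^{-1} + \bar U U Q^{-3} \right]_D \\
&\quad + \left[ {\rm Tr} W_R^2 + W_Q^2 + S \log\frac{S^3\, U\, T}{\Lambda^{15}_{\rm GG}}
+ h T + h' U \right]_F + {\rm h.c.} \,,
\end{aligned} \tag{152}
\]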

where $\xi_R$, $\xi_Q$ are (in principle) calculable constants,
$\Lambda_{\rm GG}$ is the RGI scale parameter of the theory and
\[
W_{R\alpha} = - \frac{1}{4}\, \bar D^2 R^{-1} D_\alpha R \,, \qquad
W_{Q\alpha} = - \frac{1}{4}\, \bar D^2 Q^{-1} D_\alpha Q \,. \tag{153}
\]
Despite its quite complicated form, the consequences of (152) are rather
simple. One gets for the v.e.v.’s of the composite operators (149) and (150)
\[
\langle S \rangle_k = (h h')^{1/5}\, \Lambda^3_{\rm GG}\, e^{2\pi i k/5} \,, \qquad k = 1, 2, \ldots, 5 \,, \tag{154}
\]
\[
\langle T \rangle_k = \frac{\langle S \rangle_k}{h} \,, \qquad
\langle U \rangle_k = \frac{\langle S \rangle_k}{h'} \,, \tag{155}
\]
in perfect agreement with instanton results and the constraints imposed by the
Konishi anomaly equations. One finds that the discrete $Z_{15}$ symmetry group
is spontaneously broken down to $Z_3$, leaving behind 15/3 = 5 well-defined
supersymmetric vacua. As we noticed in Sect. 4.3 point (I), for non-vanishing
values of the Yukawa couplings h, h′ supersymmetry is unbroken and the vacuum
states are well defined. Only when either h or h′ goes to zero and flat
directions appear in the superpotential do some of the condensates run away to
infinity.

SU (5) with Matter in 5 + $\overline{10}$

This case is more interesting as the phenomenon of dynamical breaking of
supersymmetry is seen to occur [68]. The construction of the effective action
which fulfils all the anomalous and non-anomalous WTIs of the microscopic
theory again requires the introduction of the two real composite (vector)
superfields
\[
R = \Phi^\dagger\, e^{2gV^{(5)}}\, \Phi \,, \qquad
Q = X^\dagger\, e^{2gV^{(\overline{10})}}\, X \,. \tag{156}
\]
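Furthermore, besides the chiral composite operators (149) and (150), the chiral
superfields
\[
Y = W^{r_1}{}_{s}\, W^{s}{}_{r}\, \Phi_t X^{tr} X^{r_2 r_3} X^{r_4 r_5}\, \epsilon_{r_1 r_2 r_3 r_4 r_5} \,, \qquad
A = \frac{g^2}{16\pi^2}\, (W^2)^{r}{}_{s}\, \Phi_r \Phi_{s'} X^{s s'} \tag{157}
\]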
must come into play in order to fulfil the ’t Hooft anomaly conditions. Finally,
the requirement of the absence of flat directions in the microscopic theory
implies a judicious choice of the invariant kinetic terms. An expression of the
effective action which satisfies all the above constraints is
\[
\begin{aligned}
\Gamma^{\rm GG-SU(5)}_{\rm eff}
&= \left[ R + Q + \bar S S + \xi_1 \log R + \xi_2 \log Q
+ (\bar Y R^{-1} T Q^{-3} Y)^{-1} + \bar A R Q A \right]_D \\
&\quad + \left[ \kappa_1 W_R^2 + \kappa_2 W_Q^2 + S \log\frac{S^2\, Y}{\Lambda^{13}_{\rm GG}} \right]_F + {\rm h.c.} \,.
\end{aligned} \tag{158}
\]

The minimisation of $\Gamma^{\rm GG-SU(5)}_{\rm eff}$ displays the phenomenon
of dynamical breaking of supersymmetry. One finds, in fact, that the minimum
occurs at finite non-vanishing values of all the condensates (with the
exception of A, for which a vanishing result is obtained) and that at this
point the effective superpotential is positive.
It is interesting to look at the spectrum of the low-lying states that emerges
from the analysis of the effective potential (158). Together with supersymme-
try, a non-anomalous U (1) is spontaneously broken by the v.e.v. of Y . Another
anomalous U (1) remains instead unbroken and its triangle anomaly is satu-
rated by the composite fermion in A, which remains massless. The only other
massless fermion in the spectrum is the Goldstino associated with the sponta-
neous breaking of supersymmetry. The latter partially lies in the real vector
fields R and Q. In the spin zero sector we find the massless Goldstone bo-
son of the spontaneously broken U (1) mentioned above. Two more would-be
Goldstone bosons are eaten up à la Higgs to give mass to the vector bosons
belonging to R and Q. It is the fact that the Goldstino partially lies in the
real superfields R and Q that prevents integrating out their massive degrees
of freedom, because if one does so the manifest supersymmetry of the effective
action is lost.
The overall picture that is coming out is completely consistent with the
symmetry breaking pattern that emerges from the dynamical instanton com-
putation of the Green functions with only insertions of lowest components of
chiral composite superfields (see Sect. 4.3, point (III) 1).

6 N = 2 SYM: Introduction
As shown in the previous discussion of N = 1 SYM theories, the combination of
instanton calculus with holomorphy of the F-terms in the (low-energy)
effective action proves to be very powerful, in that it allows one to determine
non-perturbative corrections to the superpotential and to argue for dynamical
supersymmetry breaking in a class of models. Unfortunately, the spectrum of
bound states in supersymmetric vacuum configurations, if present, depends not
only on the F-terms, encompassing the superpotential and gauge kinetic terms,
but also on D-terms, encoding the kinetic terms for chiral multiplets and
their couplings to vector multiplets. D-terms are determined by the Kähler
potential K(Φ, Φ† , V ), a real non-holomorphic “function” of the (light)
chiral multiplets and the vector multiplets, that in principle receives both
perturbative and non-perturbative corrections.$^{15}$
The situation significantly improves for N = 2 SYM theories, since the
extra supersymmetry relates what in the N = 1 description would be unrelated,
i.e. the Kähler potential, the superpotential and the gauge kinetic
function [69]. This is true not only when N = 2 vector multiplets are present
but also when one couples the resulting N = 2 SYM to “matter” fields belonging
to so-called hypermultiplets, or hypers for short.$^{16}$ N = 2 supersymmetry
allows only (N = 2) minimal couplings of hypers to vector multiplets, coded
in “tri-holomorphic moment maps”, and the hypers are known to have vanishing
anomalous dimensions [71]. The (low-energy) effective theory is thus
determined by an analytic prepotential F, which only depends on the N = 2
vector multiplets, and a choice of gauging of tri-holomorphic isometries of
the hyperkähler manifold described by the hypers [72]. Vector multiplets are
“chiral” in the N = 2 description. In turn the analytic prepotential is known
to receive only one-loop and non-perturbative corrections. In their seminal
paper [51], Seiberg and Witten were able to determine the exact form of F
for pure N = 2 SYM with gauge group SU (2) by a series of elegant arguments
based on electric-magnetic duality [73]. In a subsequent paper [74], they
extended their arguments to the case of N = 2 SQCD with gauge group SU (2)
that arises after minimal coupling of Nf hypermultiplets belonging to the
pseudo-real fundamental representation of SU (2). Later on, these results
have been generalised to other gauge groups with hypers in various
representations, both in the Coulomb branch, corresponding to turning on
v.e.v.’s of scalars in vector multiplets thus preserving the rank of the gauge
group, and in the Higgs branch, corresponding to turning on v.e.v.’s of
scalars in hypers thus generically reducing the rank of the gauge group. The
possibility of having new and mixed branches has also been widely explored.

15 For this reason the “exact” β-function of [32] should be properly seen as
an elegant way to hide one’s ignorance of the anomalous dimensions γ of chiral
multiplets.
16 In fact it can even improve if the hypermultiplets belong to some special
representation of the gauge group, whereby the theory becomes exactly
superconformal and thus UV finite, so that the two-derivative effective action
does not receive any correction either perturbatively or non-perturbatively.
For instance, this is the case when one extra hypermultiplet is added that
belongs to the adjoint representation, leading to the N = 4 SYM theory [70].

Our aim is to first describe the structure of N = 2 SYM theories both at
the microscopic level and at the macroscopic one, when they are described in
terms of Wilsonian low-energy effective actions. We then review the arguments
of Seiberg and Witten leading to the identification of an auxiliary Riemann
surface, i.e. a “complex curve”, encoding the complexified gauge coupling τ
in (the ratio of) the derivatives of its two “period” integrals, eventually ar-
riving at the determination of F in the simplest case of SU (2). We then
discuss the non-linear recursion relations satisfied by the coefficients of the
instanton expansion, following the work of Matone [75], and check the consistency
of Matone’s relations and, thus, of the SW prepotential, with instanton
calculus in sectors with K = 1, 2 [76]. In order to tackle the general case,
i.e. arbitrary K and generic gauge group (with SU (Nc ) in mind), one may
exploit the “topological twist” of N = 2 SYM theories [77] that, combined
with some “non-commutativity” parameter [78], in the form of the so-called
Ω-background, allowed Nekrasov and collaborators [79, 80, 81, 82] to localise
the integrals over instanton moduli spaces and compute recursively the expan-
sion coefficients of the non-perturbative series. To this end we sketch these
beautiful, but at the same time rather technical, mathematical arguments un-
derlying the ADHM construction. We then turn to the string description of
the ADHM construction and its ramifications [83, 84]. The astonishing feature
of string theory is that the sophisticated algebro-geometric ADHM construc-
tion becomes rather transparent and intuitive once D-branes and their open
string excitations are taken into account [85, 86]. In particular, (supersymmet-
ric) gauge theories emerge as the low-energy effective theories governing the
dynamics of stacks of D-branes [87]. In this setting instantons can be realised
as lower dimensional D-branes within higher dimensional ones [83, 84]. The
structure of the ADHM data emerges naturally from the set of open strings connecting
the various stacks of branes. Even Nekrasov’s Ω-background admits
a natural description in terms of the closed string graviphoton that couples
to D-branes and their open string excitations [85, 86]. Last but not least, the
long-sought duality between gauge fields and strings turns out to emerge
quite naturally, at least in the maximally supersymmetric case (N = 4 SYM),
in the form of Maldacena’s holographic correspondence [88, 89, 90, 91, 92].
We will return to this unprecedented achievement of string theory in Sects. 17
and 18. Here we only give a schematic description of how instanton effects can
be computed within string theory in a particular double scaling limit [85, 86].

7 N = 2 SYM: Generalities
N = 2 SYM theories admit two kinds of massless multiplets, both containing
four bosonic and as many fermionic degrees of freedom. Vector multiplets are
described by chiral N = 2 superfields that comprise a vector boson, two Weyl
fermions (the gauginos) and a complex scalar all in the adjoint representation
of the gauge group. N = 2 vector superfields will be denoted by A and their
θ expansion schematically reads
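$$A(x, \theta) = a(x) + \theta^{\alpha}_{r}\, \lambda^{r}_{\alpha}(x) + \frac{1}{2}\, \theta^{r}_{\alpha}\, \sigma^{\mu\nu\,\alpha}{}_{\beta}\, \theta^{\beta}_{r}\, F_{\mu\nu}(x) + \ldots \, . \qquad (159)$$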
Higher-order terms in θr with r = 1, 2 can be expressed as derivatives of the
lower ones. In terms of N = 1 supersymmetry, an N = 2 vector multiplet can
be decomposed into a vector superfield V and a chiral superfield Φ, both in
the adjoint representation of the gauge group (see Appendix A for notation).
N = 2 massless “matter” appears in hypermultiplets that consist of four
real scalars and two Weyl fermions (the hyperinos), belonging to a real repre-
sentation of the gauge group. In terms of N = 1 supersymmetry, they can be
decomposed into a chiral superfield Q in an a priori complex representation R
of the gauge group and a chiral superfield Q̃ in the conjugate representation
R̄. Among the massive representations, a special role is played by the 1/2
BPS representations that are shorter than generic massive representations in
that they only involve eight bosonic and as many fermionic degrees of freedom
(see Appendix G for a brief explanation of this and related notions). Thanks
to the relation between mass and “central” charge

M = |Z| , (160)

1/2 BPS states are indeed annihilated by half of the supersymmetry charges.
The structure of classical N = 2 SYM theories is tightly constrained by
the large amount of (super)symmetry they are endowed with. The most gen-
eral two-derivative classical action is completely determined in terms of an
“analytic” prepotential F that is a priori an analytic function of the N = 2
vector multiplets A and the complex coupling constant
$$\tau = \frac{\vartheta}{2\pi} + \frac{4\pi i}{g^2} \, . \qquad (161)$$
The N = 2 hypermultiplet dynamics is described by a non linear σ model on
a hyperkähler space (see Appendix J). The coupling of N = 2 hypermultiplets
to vector multiplets is minimal in that the vector fields “gauge” (make local)
the global hyperkähler isometries of the hypermultiplet metric that preserve
the three Kähler structures
$$\omega^I = \frac{1}{2}\, \omega^I_{ij}\, dq^i \wedge dq^j \, , \qquad (162)$$
where $\omega^I_{ij} = -\omega^I_{ji}$, with $I = 1, 2, 3$ and $i, j = 1, \ldots, 4n_H$, are anti-symmetric
tensors such that $d\omega^I = 0$, where $d$ denotes the exterior differential in field
space, i.e. with respect to the scalar components $q^i$ of the $n_H$ hypermultiplets.
In the simple case of constant $\omega^I_{ij}$, writing $i = f + 4r$ with $f = 1, \ldots, 4$ and
$r = 0, \ldots, n_H - 1$, one can choose

$$\omega^I_{f+4r,\, f'+4r'} = \eta^I_{ff'}\, \delta_{rr'} \, , \qquad (163)$$

where $\eta^I_{ff'}$ are the ’t Hooft symbols [2].

The effect of “gauging” (hyperkähler) isometries can be elegantly expressed
through the minimal substitution

$$\partial_\mu q^i \to D_\mu q^i = \partial_\mu q^i + g A^a_\mu\, \xi^i_a(q) \, , \qquad (164)$$

where $a = 1, \ldots, n_V$. A tri-holomorphic isometry generated by the vector field
$\xi_a = \xi^i_a\, \partial/\partial q^i$ satisfies

$$L_{\xi}\, \omega^I \equiv \iota_{\xi_a} d\omega^I + d(\iota_{\xi_a} \omega^I) = d(\iota_{\xi_a} \omega^I) = 0 \, , \qquad (165)$$

where $\iota_{\xi_a}$ denotes contraction with the Killing vector field $\xi_a(q)$. As a
consequence, locally $\iota_{\xi_a} \omega^I = d\mu^I_a$, i.e. a tri-holomorphic Killing vector $\xi_a(q)$
admits hyperkähler moment maps $\mu^I_a(q)$ such that $\xi^i_a(q)\, \omega^I_{ij}(q) = \partial_j \mu^I_a(q)$.
The $\mu^I_a(q)$ may be thought of as some sort of N = 2 auxiliary fields. In the N = 1
notation, whereby a hypermultiplet with scalar components $q^f$ is described by two
chiral multiplets with scalar components $\phi = q^1 + i q^2$ and $\tilde\phi = q^3 + i q^4$, one
has

$$\mu^3_a(q) = D_a(\phi, \tilde\phi; \phi^\dagger, \tilde\phi^\dagger) \, ,$$
$$\mu^+_a(q) = \mu^1_a(q) + i \mu^2_a(q) = F_a(\phi, \tilde\phi) \, ,$$
$$\mu^-_a(q) = \mu^1_a(q) - i \mu^2_a(q) = \bar F_a(\phi^\dagger, \tilde\phi^\dagger) \, . \qquad (166)$$

Indeed the contribution of the hypermultiplets to the potential is exactly given
by
$$V_H(q) = \frac{1}{2}\, \delta_{IJ}\, \mathrm{Im}\, \tau_a(a)\, \mu^I_a(q)\, \mu^J_a(q) \, . \qquad (167)$$
Notice that except for the minimal coupling (164) and its N = 2 completion,
entailing (167) and various Yukawa-type interactions, there is no neutral cou-
pling between the vector and the hyper multiplets. In particular, as indicated,
the complexiﬁed gauge couplings τa (a) can only depend on the lowest scalar
components a of the nV vector multiplets A.
Quantum renormalisability drastically restricts the choice of F(A) and
ξ(q), or equivalently of μ(q). In the microscopic fundamental theory, F(A) is
at most a quadratic function of A, while $\xi^i_a(q)$ are linear in the q’s, viz.

$$\xi^i_a(q) = (T_a)^i{}_j\, q^j \, , \qquad (168)$$

where Ta are the generators of the gauge group in the (a priori reducible)
representation spanned by the scalars in the hypermultiplets. Moreover the
hyperkähler metric is flat, up to global tri-holomorphic identifications $R^{4n}/\Gamma$
(examples are the ALE spaces $R^4/\Gamma_{ADE}$ where n = 1 and $\Gamma_{ADE}$ is one of the
Kleinian discrete subgroups of SU (2), in the ADE classification, see e.g. [93]).
As a result the tri-holomorphic moment maps $\mu^I_a(q)$ are completely determined
in case of semi-simple gauge groups. When abelian factors are present in the
gauge group, one can add constant tri-holomorphic Fayet–Iliopoulos terms $\zeta^I_a$,
so that $\mu^I_a(q) = \hat\mu^I_a(q) + \zeta^I_a$, where $\hat\mu^I_a(q)$ is such that $\hat\mu^I_a(q = 0) = 0$.
A different story applies to the Wilsonian effective action$^{17}$ for the light
(massless) modes that survive, i.e. do not acquire a mass, after partial or
complete gauge symmetry breaking below the scale Λ [60]. Here Λ is the
explicit cut-off in the Wilsonian effective action, such that all modes with
mass or energy above this scale have been integrated out. It is not known
how to explicitly perform this task, but the outcome of the “integrating out”
procedure is severely constrained by symmetries and one can often “guess”
the correct result to lowest order approximation, which is nothing else but
a bookkeeping of the relevant degrees of freedom and the symmetries of the
theory.
In addition to the Coleman–Weinberg [94]-type logarithmic correction at
one-loop
$$\mathcal{F}_{1\text{-loop}}(A) = \frac{i}{8}\, b_1\, A^2 \log\frac{A^2}{\Lambda^2} \, , \qquad (169)$$
where b1 is the β function coefficient, i.e. b1 = 2Nc − Nf for SU (Nc ) with Nf
hypers in the fundamental representation, the prepotential can and in fact
must acquire an infinite number of non-perturbative corrections. Indeed N =
2 supersymmetry prevents further perturbative corrections, but the one-loop
term violates positivity of the imaginary part of the effective gauge coupling

$$\tau(a) = \frac{\partial^2 \mathcal{F}(a)}{\partial a^2} = \frac{\vartheta(a)}{2\pi} + \frac{4\pi i}{g^2(a)} \, , \qquad (170)$$

where a denotes the lowest (scalar) component of the chiral superfield A that
describes the N = 2 vector multiplet and has been defined in (159).

8 Seiberg–Witten Analysis

In their seminal paper [51], Seiberg and Witten have shown how one can
exactly compute the analytic prepotential F(A) in the case of an SU (2) gauge
theory without hypermultiplets. In another closely related paper [74] they
have shown how to incorporate 2Nf half-hypermultiplets in the fundamental
representation of SU (2), leading to a theory that deserves to be called N = 2

17
M.B. would like to thank M. Bochicchio for first pointing out the important
difference between the “non-local” 1PI effective action, with an arbitrary number
of derivatives, and the Wilsonian low-energy effective action, often considered
only up to two derivatives.
SQCD. The case Nf = 2Nc = 4 is special since it corresponds to an exact
quantum N = 2 superconformal theory.
Clearly, in the Coulomb phase we are focussing on, higher derivative terms
are generated by quantum effects with finite coefficients and are suppressed
by inverse powers of the v.e.v.’s of the scalar fields. In the superconformal
phase with vanishing v.e.v.’s the situation is much subtler. The relevant ob-
servables are correlation functions of gauge invariant operators. The modern
tool to tackle this interesting issue is the AdS/CFT correspondence proposed
by Maldacena that will be discussed in Sects. 17 and 18.
After briefly reviewing the arguments of Seiberg and Witten, based on
symmetries, monodromies and duality, we will describe how to check the result
by (constrained) instanton calculus.
As we already mentioned, classical N = 2 SYM admits an infinite tower
of BPS-saturated monopoles and dyons thanks to the existence of a complex
scalar central charge Z in the N = 2 extended superalgebra. In the simple
case of SU (2), they are the supersymmetric analogue of the classical solutions
found by ’t Hooft [95] and Polyakov [96] and by Julia and Zee [97]. Indeed,
we recall that the potential of pure N = 2 SYM, which reads
$$V(a, a^\dagger) = \frac{1}{2}\, \mathrm{Tr}\left([a, a^\dagger]^2\right) , \qquad (171)$$
has flat directions identified by the condition [a, a† ] = 0. Up to gauge trans-
formations, this means that both a and a† belong to the Cartan subalgebra
generated by, e.g. T3 = σ3 /2. In modern terms, one says the theory admits a
one complex dimensional moduli space of classical vacua parametrised by a3
or rather by the gauge-invariant composite
$$u = \mathrm{Tr}(a^2) = \frac{1}{2}\, a_3^2 \, . \qquad (172)$$
Henceforth, we denote a3 by a for simplicity. Along the flat direction the
gauge group is broken to U (1) and one automatically realises the
Prasad–Sommerfield condition$^{18}$ without the need of sending the scalar self-coupling
to zero [98]. Monopole solutions that saturate the Bogomol’nyi bound

$$M_M = |p\, \tau_0\, a| \, , \qquad (173)$$

where $\tau_0$ denotes the classical “complexified” coupling, which we have already
encountered many times by now, and p is the magnetic charge, can be explicitly
constructed by solving Nahm’s equations [99]. Notice the striking analogy
with the Higgs formula for the mass of a U (1) W -like boson with charge q

MW = |qe a| . (174)
18
The Prasad–Sommerfield condition of non-vanishing scalar v.e.v. with zero
potential is usually achieved by setting the scalar self-coupling λ to zero in the
potential $V(\varphi) = \lambda(\varphi^2 - \varphi_0^2)^2$ while keeping $|\varphi| = |\varphi_0|$ at infinity.

In fact one can do better and show that 1/2 BPS-saturated dyons have a mass
spectrum given by the formula

$$M_D = |q_e\, a + q_m\, a_D| = |Z| \, , \qquad (175)$$

where we have introduced the notation $a_D = \tau_0 a$ and Z is the “central”
extension of the N = 2 superalgebra, that being central, i.e. commuting with
all the remaining generators, has to be a c-number by Schur’s Lemma [100,
101]. In terms of the analytic prepotential $\mathcal{F}(a) = \tau_0 a^2/2 + \cdots$, one is led to
the identification
$$a_D = \frac{\partial \mathcal{F}}{\partial a} = \tau_0 a + \cdots \, , \qquad (176)$$
where the dots take into account quantum corrections to $\mathcal{F}(a)$ and thus to
$\tau(a) = \tau_0 + \cdots$. The exact (quantum) identification $a_D = \partial\mathcal{F}/\partial a$ is tantamount
to assuming that the classical formula (175) for the central charge retains the
same form in the quantum theory, as strongly suggested by consideration of
N = 2 supersymmetry. Actually, the formula |Z| = |qe a + qm aD | displays a
remarkable symmetry under SL(2, Z) transformations acting on the electric
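and magnetic charges $q_e$ and $q_m$. In fact, under

$$q_e \to k q_e + m q_m \, , \qquad q_m \to l q_e + n q_m \, , \qquad (177)$$

with integers k, l, m, n such that $kn - lm = 1$, Z is invariant if one
simultaneously performs a “monodromy” transformation

$$a \to n a - l a_D \, , \qquad a_D \to -m a + k a_D \, . \qquad (178)$$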
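The invariance of Z can be checked mechanically. The following is a minimal sketch, not taken from the original text: it assumes that the charges transform as $q_e \to k q_e + m q_m$, $q_m \to l q_e + n q_m$, which is the assignment that compensates the transformation (178) of the periods when $kn - lm = 1$. The function names and the sample values of a and a_D are purely illustrative.

```python
# Sketch: Z = q_e*a + q_m*a_D is unchanged when charges and periods are
# transformed by the same integers (k, l, m, n) with k*n - l*m = 1.

def central_charge(q_e, q_m, a, a_D):
    return q_e * a + q_m * a_D

def transform_charges(q_e, q_m, k, l, m, n):
    # Assumed action on the charges: the one that compensates eq. (178).
    return k * q_e + m * q_m, l * q_e + n * q_m

def transform_periods(a, a_D, k, l, m, n):
    # Eq. (178): a -> n a - l a_D, a_D -> -m a + k a_D.
    return n * a - l * a_D, -m * a + k * a_D

def is_invariant(q_e, q_m, a, a_D, k, l, m, n):
    if k * n - l * m != 1:
        raise ValueError("(k, l, m, n) must satisfy k*n - l*m = 1")
    z_old = central_charge(q_e, q_m, a, a_D)
    new_q_e, new_q_m = transform_charges(q_e, q_m, k, l, m, n)
    new_a, new_a_D = transform_periods(a, a_D, k, l, m, n)
    return central_charge(new_q_e, new_q_m, new_a, new_a_D) == z_old

# Hypothetical sample data; integer real and imaginary parts keep the
# complex arithmetic exact.
sample_a, sample_a_D = complex(3, -2), complex(-1, 5)
sl2z_elements = [(1, 0, 0, 1), (1, 2, 0, 1), (1, 0, -2, 1), (2, 3, 1, 2), (-1, 2, 0, -1)]
results = [is_invariant(1, -1, sample_a, sample_a_D, *g) for g in sl2z_elements]
print(results)
```

The check only uses the unit-determinant condition, so it holds for any BPS charge vector and any values of the periods.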

In this way a, aD are seen as components of a section of an SL(2, Z) bundle

over the moduli space of vacua parametrised by the gauge-invariant compos-
ite u, for which we write a = a(u), aD = aD (u). This geometrical description
implies that the components a = a(u) and aD = aD (u) undergo non-trivial
transformations, i.e. acquire non-trivial monodromy, when one parallel transports
them as functions of u around some special points. As a result of the
dependence of a and $a_D$ on u, the complexified coupling that has
been so far considered a function of a can be considered a function of u given
by the ratio of the derivatives of $a_D(u)$ and $a(u)$ through the chain rule
$$\tau(a) = \tau(a(u)) = \tau(u) = \frac{\partial^2 \mathcal{F}(a)}{\partial a^2} = \frac{\partial}{\partial a}\frac{\partial \mathcal{F}(a)}{\partial a} = \frac{\partial a_D}{\partial a} = \frac{da_D/du}{da/du} \, , \qquad (179)$$

with Im τ (u) > 0 (for vacuum stability). Remarkably, at this point, the com-
plexiﬁed eﬀective coupling τ (u) can be considered as the modular parameter
(the “period”) of an auxiliary torus, a Riemann surface of genus one. The lat-
ter is also known as an “elliptic curve”, i.e. a complex dimension one manifold
whose periods are determined in terms of elliptic integrals. In fact determining
this auxiliary elliptic curve, the so-called “Seiberg–Witten curve”, allows one
to compute its periods from the equations

$$a_D'(u) = \frac{da_D}{du} \, , \qquad a'(u) = \frac{da}{du} \qquad (180)$$

and, after integration w.r.t. u, $\mathcal{F}(a)$ itself, since $a_D = \partial\mathcal{F}(a)/\partial a$.

In order to determine the SW curve one starts by computing the monodromy
of the section $(a = a(u), a_D = a_D(u))$ at infinity where the theory,
being asymptotically free ($b_1 = 2N_c = 4$ for SU (2)), is weakly coupled and
(see (169))
$$\mathcal{F}(a) \approx \frac{i}{2}\, a^2 \log\frac{a^2}{\Lambda^2} \qquad (181)$$
with Λ the RGI scale

$$\Lambda = M \exp\left(-\frac{8\pi^2}{b_1\, g^2(M)}\right) . \qquad (182)$$

In this way one gets

$$a_D = \mathcal{F}'(a) \approx \frac{i}{2}\left[2a \log\frac{a^2}{\Lambda^2} + 2a\right] . \qquad (183)$$

Under $u \to e^{2\pi i} u$, one has $a \to -a$ and $a_D \to -a_D + 2a$. These considerations
fix the monodromy of the section $(a = a(u), a_D = a_D(u))$ around the branch
point at infinity.
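The statement that Λ in (182) is renormalisation-group invariant can be illustrated numerically. The sketch below is an illustration under a stated assumption: the coupling is taken to run at one loop as $8\pi^2/g^2(\mu) = 8\pi^2/g^2(M) + b_1 \log(\mu/M)$, which is the running implied by demanding that (182) does not depend on M. The numerical inputs are arbitrary.

```python
import math

def rgi_scale(M, g2, b1):
    """Eq. (182): Lambda = M * exp(-8 pi^2 / (b1 * g^2(M)))."""
    return M * math.exp(-8 * math.pi ** 2 / (b1 * g2))

def one_loop_g2(mu, M, g2_at_M, b1):
    """Assumed one-loop running: 8 pi^2/g^2(mu) = 8 pi^2/g^2(M) + b1 log(mu/M)."""
    inverse = 8 * math.pi ** 2 / g2_at_M + b1 * math.log(mu / M)
    return 8 * math.pi ** 2 / inverse

b1 = 4                      # pure SU(2): b1 = 2 Nc
M, g2_at_M = 100.0, 1.0     # arbitrary reference scale and coupling
reference = rgi_scale(M, g2_at_M, b1)

# Recompute Lambda from the running coupling at several other scales.
other_scales = [50.0, 200.0, 1.0e4]
recomputed = [rgi_scale(mu, one_loop_g2(mu, M, g2_at_M, b1), b1) for mu in other_scales]
relative_spread = max(abs(value / reference - 1.0) for value in recomputed)
print(relative_spread)
```

Within rounding, every choice of reference scale returns the same Λ, which is what makes it a meaningful parameter of the quantum theory.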
Perturbatively, the other branch point of $\mathcal{F}(a)$ is at a = 0. If this were
the full story, the theory would be inconsistent since Im τ could not possibly
be positive throughout the moduli space, τ being holomorphic and thus Im τ
harmonic. Seiberg and Witten argued that the non-abelian symmetry (a =
0) is never restored at the quantum level and that this is consistent with
assuming the existence of only two more singular points. They interpreted
the singularities as due to the fact that some massive states become massless
at each of the two additional singular points in the moduli space. In fact, the
two relevant states are a monopole ($q_e = 0$, $q_m = 1$) and a dyon ($q_e = 1$,
$q_m = -1$).$^{19}$ In order to identify the location of these extra singularities, it is crucial
to exploit a discrete $Z_4$ symmetry of the quantum theory for $N_c = 2$, which
is a remnant of the anomalous $U(1)_R$ subgroup of the U (2) R-symmetry of
the classical theory. Indeed, classically N = 2 SYM is invariant under global
$SU(2) \times U(1)_R$ transformations under which the gauge field is invariant, the
gauginos rotate as a $\mathbf{2}_{1/2}$ and the complex boson is a charge +1 SU (2) singlet.
The $U(1)_R$ symmetry is broken by the quantum anomaly that preserves a
$Z_{4N_c} \approx Z_{2N_c} \times Z_2$, where the latter factor is fermion parity and the former is
the above-mentioned $Z_4$ under which $u \to -u$. This is enough to completely
determine the SW curve

$$E_{SW}: \quad y^2 = (x^2 - \Lambda^2)(x^2 - u^2) \, , \qquad (184)$$

19
These particular choices of electric and magnetic charges simplify the notation in
the following. Other choices are possible but require performing SL(2, Z)
transformations.

which is indeed singular when u = ±Λ. A generic elliptic curve can be written
as a double cover of the sphere y 2 = (x − x1 )(x − x2 )(x − x3 )(x − x4 ) with
branch points at x = xi . By the SL(2, C) symmetry of the sphere, one can
always put three of the branch points at, say, 0, 1 and ∞ so that the remaining
complex parameter determines the shape of the torus (actually the ratio of the
two periods). In order not to spoil the Z2 symmetry of quantum N = 2 SYM
theory, it is however more convenient to ﬁx only two branch points at, say,
±Λ (or ±1 after rescaling the variables). The remaining two branch points
are set at ±u. When u reaches ±Λ the curve (184) representing the torus
degenerates, i.e. one of the two cycles and the corresponding period become
zero signalling the presence of a singularity in the theory.
The periods of ESW can be expressed in terms of elliptic integrals and after
identifying the cycles that correspond to aD (u) and a (u), one can eventually
compute F.
Making more precise the above geometrical considerations, we can say
that the vector $(a_D, a)$ is a section of a flat bundle over the moduli space
parametrised by u with monodromy group Γ(2) ⊂ SL(2, Z) generated by

$$M_{-1} = \begin{pmatrix} -1 & 2 \\ -2 & 3 \end{pmatrix} , \qquad M_{1} = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix} , \qquad M_{\infty} = \begin{pmatrix} -1 & 2 \\ 0 & -1 \end{pmatrix} \qquad (185)$$

such that $M_1 M_{-1} = M_\infty$. It can be checked that the monodromy transformations
of $(a_D, a)$ around ±Λ are indeed represented by the matrices $M_{\pm 1}$.
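The algebraic statements about the monodromy matrices are easy to verify. The sketch below is only an illustration with hypothetical helper names: it checks the product relation, membership in Γ(2) (unit determinant and congruence to the identity modulo 2), and the action of $M_\infty$ on the column vector $(a_D, a)$.

```python
# Monodromy matrices of eq. (185), stored as tuples of rows.
M_MINUS = ((-1, 2), (-2, 3))
M_PLUS = ((1, 0), (-2, 1))
M_INF = ((-1, 2), (0, -1))

def matmul(A, B):
    """Product of two 2x2 integer matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def in_gamma2(A):
    """Gamma(2): determinant one and A congruent to the identity mod 2."""
    identity = ((1, 0), (0, 1))
    congruent = all((A[i][j] - identity[i][j]) % 2 == 0
                    for i in range(2) for j in range(2))
    return det(A) == 1 and congruent

def act(M, a_D, a):
    """Action of M on the column vector (a_D, a)."""
    return (M[0][0] * a_D + M[0][1] * a, M[1][0] * a_D + M[1][1] * a)

product_ok = matmul(M_PLUS, M_MINUS) == M_INF
all_in_gamma2 = all(in_gamma2(M) for M in (M_MINUS, M_PLUS, M_INF))
image_at_infinity = act(M_INF, 5, 3)   # sample integers standing in for (a_D, a)
print(product_ok, all_in_gamma2, image_at_infinity)
```

The action of $M_\infty$ reproduces $a_D \to -a_D + 2a$, $a \to -a$, and the order of the factors matters: the opposite product $M_{-1} M_1$ does not equal $M_\infty$.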
If the modular parameter τ of $E_{SW}$ were the ratio $a_D/a$ we would have
completed our task, since $a_D$ and $a$ would have coincided with the two “canonical”
periods of the unique holomorphic$^{20}$ differential $d\omega = dx/y$. However
$\tau = a_D'(u)/a'(u)$ and this means that $a_D$ and $a$ are periods of a meromorphic
differential whose derivative w.r.t. u is the unique holomorphic differential dω.
Seiberg and Witten identified the “natural” meromorphic differential dλ(u)
that prior to monodromy considerations is only determined up to a meromorphic
differential independent of u. Setting Λ = 1 for simplicity, one finds
that
$$d\lambda(u) = \frac{\sqrt{x - u}}{\sqrt{x^2 - 1}}\, dx \qquad (186)$$
is the unique differential that satisfies all the requirements, i.e. $d\lambda'(u) = d\omega$
(up to normalisation) and it is such that
$$a_D(u) = \frac{\sqrt{2}}{\pi} \int_{1}^{u} \frac{\sqrt{x - u}}{\sqrt{x^2 - 1}}\, dx \, , \qquad a(u) = \frac{\sqrt{2}}{\pi} \int_{-1}^{1} \frac{\sqrt{x - u}}{\sqrt{x^2 - 1}}\, dx \qquad (187)$$

have the correct monodromy when parallel transported around ±1 and ∞.
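The period integral for a(u) in (187) can be evaluated numerically. The following is a rough sketch under stated assumptions: u is taken real and larger than 1, the branch is chosen so that a(u) is real and positive, and the substitution x = cos t turns the integral into $a(u) = (\sqrt{2}/\pi)\int_0^\pi \sqrt{u - \cos t}\, dt$. The code checks the weak-coupling behaviour $a \approx \sqrt{2u}$ and the second-order equation (203) quoted in Sect. 9.1, using a finite-difference second derivative.

```python
import math

def period_a(u, n=4000):
    """a(u) = (sqrt(2)/pi) * integral_0^pi sqrt(u - cos t) dt, midpoint rule.

    Assumes real u > 1 and the branch on which a(u) is real and positive.
    """
    h = math.pi / n
    total = sum(math.sqrt(u - math.cos((j + 0.5) * h)) for j in range(n))
    return math.sqrt(2.0) / math.pi * total * h

def ode_residual(u, step=1.0e-2):
    """Residual of (1 - u^2) a''(u) - a(u)/4 with a central-difference a''."""
    second = (period_a(u + step) - 2.0 * period_a(u) + period_a(u - step)) / step ** 2
    return (1.0 - u * u) * second - period_a(u) / 4.0

residual = ode_residual(2.0)
asymptotic_ratio = period_a(1.0e4) / math.sqrt(2.0e4)
print(residual, asymptotic_ratio)
```

The residual is limited by the finite-difference step rather than by the quadrature, since the midpoint rule converges very quickly for this smooth periodic integrand.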

Consistently with the surviving $Z_4$ symmetry, one can write

$$\mathcal{F}(a) = \mathcal{F}_{pert}(a) + \mathcal{F}_{non\text{-}pert}(a) \, , \qquad (188)$$

where
$$\mathcal{F}_{pert}(a) = \mathcal{F}_{tree}(a) + \mathcal{F}_{1\text{-loop}}(a) = \frac{i}{2}\, a^2 \log\frac{a^2}{\Lambda^2} \qquad (189)$$
and
$$\mathcal{F}_{non\text{-}pert}(a) = a^2 \sum_{K=1}^{\infty} \mathcal{F}_K\, \frac{\Lambda^{4K}}{a^{4K}} \, , \qquad (190)$$
with the latter incorporating the contribution of instantons of increasing winding
number K.

20
Although it does not look holomorphic, ω is in fact holomorphic!
A few comments are in order here. First, a = 0 is excised from the moduli
space, i.e. there is no value of u such that a(u) = 0. Second, the singular
points u = ±Λ correspond to $a_D(u) = 0$ and $a_D(u) = a$, respectively, and
lie on the so-called surface of marginal stability where Im τ = 0. This is
the locus where the lattice of BPS states collapses and transitions of the form
$|Z = Z_1 + Z_2\rangle \to |Z_1\rangle + |Z_2\rangle$ are allowed by both charge and mass conservation.
The effective coupling $g^2(a) = 4\pi/\mathrm{Im}\, \tau(u)$, see (170), is always positive semi-definite
and never grows too much. At large a this is due to asymptotic freedom. In
the interior of the moduli space all charged vector bosons become extremely
massive and the theory is essentially abelian. Near the singular points, it is
better to switch to a dual magnetic or dyonic description, whereby the abelian
magnetic or dyonic photons are coupled to light monopoles and dyons. The
effective coupling decreases with the renormalisation scale μ in the IR until it
reaches the value
$$\tilde g^2(\mu)\big|_{\mu = m} \approx \frac{1}{|\log(m/\Lambda)|} \, , \qquad (191)$$
with m the mass of the lightest charged state, be it a monopole or a dyon.
As stressed above, these states are arranged in hypermultiplets. Due to the
presence of light charged particles in hypermultiplets coupled to abelian vector
multiplets, the dual magnetic or dyonic theory is different from the original
electric theory, which only involved non-abelian vector multiplets. Yet
electric-magnetic duality has led us quite a long way. Only for N = 4 SYM and for
other exactly (super)conformal-invariant theories is the dual magnetic or dyonic
theory expected to coincide with the electric theory.$^{21}$
Finally, at the singular points u = ±Λ2 new branches of the moduli space
open up where monopoles or dyons can condense, i.e. acquire a v.e.v., thus in-
ducing (oblique) confinement of the chromo-electric charges and flux tubes due
to the dual Meissner effect [103]. Adding N = 1 supersymmetric mass terms
to the adjoint chiral multiplet induces dynamical chiral symmetry breaking
and confinement in a controllable way.
21
More precisely in these cases the duality maps the electric theory into a magnetic
theory with the same action but dual gauge group [102], G∗ . The latter is obtained
from the original gauge group of the electric theory, G, exchanging the role of the
weight and root lattices. Therefore, in the case of groups with simply laced Lie
algebras, G and G∗ are isomorphic. For groups with non-simply laced algebras
this is not the case and one has the following pairs: G ↔ G∗ : SO(2n+1) ↔ Sp(n),
F4 ↔ F4 and G2 ↔ G2 .

9 Checking the SW Formula by Instanton Calculations

Our next task is to check the SW prepotential against explicit instanton com-
putations. A one-instanton check is not enough because of ambiguities in the
definition of Λ that can rescale it by a finite constant. The prospect of
a two-instanton check seems a priori daunting but the calculation turns out
to be feasible [37, 76]. In fact one can do much better. Matone has shown
that the coefficients of the SW prepotential satisfy non linear recurrence re-
lations that can be checked to hold in instanton calculus [75]. Fucito and
Travaglini have shown that multi-instanton calculus precisely reproduces the
desired relations [76]. More recently the problem has been attacked once again
in a beautiful series of papers by Nekrasov and collaborators [79, 80, 81, 82].
Introducing suitable deformation parameters (Ω-background), one can localise
the measure over the multi-instanton (super)moduli space, reducing the
calculation of the $\mathcal{F}_K$ coefficients to a mere, though certainly not trivial,
combinatorial problem. We will have limited space to discuss this fascinating issue
and we refer the reader to the original literature as well as to the accessible
reviews [79, 80, 81, 82]. We cannot resist saying immediately that the some-
what obscure deformation parameters introduced by Nekrasov admit a very
natural explanation in a string setting for the problem whereby open string
excitations of D-branes account for the gauge and matter light degrees of free-
dom [83, 84, 85]. The Ω-deformation is equivalent to turning on a background
for a closed string state in the so-called Ramond–Ramond (R–R) sector, the
graviphoton, effectively producing a non (anti-)commutative superspace [86],
i.e. a superspace where the θ variables do not anti-commute, very much like
the x variables do not commute in a non-commutative space. From this vantage
point, higher-order terms
in the deformation, receiving instanton corrections, are associated with higher
derivative gravitational F -terms that appear in the type II low-energy effec-
tive actions after compactification on Calabi–Yau threefolds [104]. A similar
approach for the calculation of the SW prepotential, based on localisation on
the instanton moduli space, was proposed in [105].

9.1 Matone Relations

Exploiting powerful results in the theory of uniformisation of Riemann surfaces,
it was shown in [75] that the non-perturbative coefficients in the expansion
of $\mathcal{F}(a)$ satisfy certain recursion relations known as Matone’s relations. In
order to achieve this result, it is convenient to consider the auxiliary function
$$\mathcal{G}(a) = \mathcal{F}(a) - \frac{a}{2}\, \frac{\partial \mathcal{F}(a)}{\partial a} \, , \qquad (192)$$
where $\mathcal{F}(a)$ is defined in (188) and consider the expansion
$$\mathcal{G}(a) = a^2 \sum_{K=0}^{\infty} \mathcal{G}_K\, \frac{\Lambda^{4K}}{a^{4K}} \, . \qquad (193)$$
We will momentarily see that $\mathcal{G}(a) = u$. The expansion coefficients $\mathcal{G}_K$ and
$\mathcal{F}_K$ are related by
$$\mathcal{G}_K = 2K\, \mathcal{F}_K \, , \qquad (194)$$
for K ≠ 0 while $\mathcal{G}_0 = 1/2$.
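The relation (194) follows from a one-line computation, spelled out here as a worked step. Acting with the operator that defines $\mathcal{G}$ in (192) on a single instanton term of (190), and on the one-loop logarithm of (189), gives:

```latex
\left(1 - \frac{a}{2}\frac{\partial}{\partial a}\right) a^{2-4K}
  = \left(1 - \frac{2-4K}{2}\right) a^{2-4K}
  = 2K\, a^{2-4K} ,
\qquad
\left(1 - \frac{a}{2}\frac{\partial}{\partial a}\right) a^{2}\log\frac{a^{2}}{\Lambda^{2}}
  = -\,a^{2} .
```

The first identity gives $\mathcal{G}_K = 2K \mathcal{F}_K$ for $K \geq 1$. The second shows that the logarithm drops out of $\mathcal{G}$, so the perturbative part only feeds the $a^2$ term, i.e. the $K = 0$ coefficient.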
As previously discussed, for SU (2) the moduli space is parameterised by
$u = \mathrm{Tr}(\phi^2)$ and turns out to be a Riemann sphere with three punctures
at $u_1 = -\Lambda$, $u_2 = \Lambda$ and $u_3 = \infty$ with a symmetry $u \leftrightarrow -u$. We recall
that $(a_D(u), a(u))$ is a section of a flat bundle over the moduli space with
monodromy group Γ(2) ⊂ SL(2, Z) generated by the three matrices $M_{-1}$,
$M_{+1}$ and $M_\infty$ in (185) with $M_{+1} M_{-1} = M_\infty$.
Using the obvious integrability of the differential
$$W(u)\, du = a\, da_D - a_D\, da \, , \qquad (195)$$
that being a complex function of the single variable u is necessarily an exact
differential, one can define the auxiliary function
$$g(u) = \int_{1}^{u} dz\, W(z) \, . \qquad (196)$$
This helps determine the behaviour of $\mathcal{F}$ under monodromy (modular
transformations). In fact by integrating
$$\partial_u \mathcal{F} = a_D\, \partial_u a = \frac{1}{2}\left[\partial_u(a_D a) - W(u)\right] \qquad (197)$$
one finds
$$\mathcal{F}(u) = \frac{1}{2}\left[a_D a - g(u)\right] + \mathcal{F}_0 \, . \qquad (198)$$
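One can check that, under
$$a_D \to \tilde a_D = k a_D - m a \, , \qquad a \to \tilde a = -l a_D + n a \, ,$$
with $kn - lm = 1$, one has
$$\tilde{\mathcal{F}}(\tilde a) = \mathcal{F}(a) + \frac{1}{2}\left[\tilde a_D \tilde a - a_D a\right] = \mathcal{F}(a) + \frac{1}{2}\left[2lm\, a_D a - kl\, a_D^2 - mn\, a^2\right] , \qquad (199)$$
while $\mathcal{G}(a)$, conveniently defined as above, turns out to be modular invariant,
i.e.
$$\tilde{\mathcal{G}}(\tilde a) = \mathcal{G}(a) \, , \qquad (200)$$
since u and hence g(u) are invariant. By taking the ratio of $a_D'(u)$ and $a'(u)$
and keeping in mind that u is invariant, one also finds that
$$\tau(a) = \frac{\partial^2 \mathcal{F}(a)}{\partial a^2} = \frac{a_D'(u)}{a'(u)} \;\to\; \tilde\tau(\tilde a) = \frac{\partial^2 \tilde{\mathcal{F}}(\tilde a)}{\partial \tilde a^2} = \frac{k\tau(a) - m}{-l\tau(a) + n} \, , \qquad (201)$$
which is the expected projective transformation of the complexified coupling.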
By uniformisation arguments, i.e. monodromy invariance and asymptotic
behaviour at large a, Matone eventually showed that [75]
$$\mathcal{G}(a) = -i\pi g(u)/2 = u \, . \qquad (202)$$
The linear u dependence of g(u) is tantamount to saying that W (u) is a
constant, independent of u. In fact $W(u) = a(u)\, a_D'(u) - a_D(u)\, a'(u)$ is nothing
but the Wronskian of the solutions of the second-order differential equation
satisfied by a(u) and $a_D(u)$, which in canonical (Schrödinger-like) form reads
$$(1 - u^2)\, \frac{d^2\psi(u)}{du^2} - \frac{1}{4}\, \psi(u) = 0 \, . \qquad (203)$$
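The constancy of the Wronskian is immediate once the equation is in canonical form, because no first-derivative term is present. As a short worked step, using only (203) for both $a(u)$ and $a_D(u)$:

```latex
\frac{dW}{du}
  = a\, a_D'' - a_D\, a''
  = a\, \frac{a_D}{4(1-u^{2})} - a_D\, \frac{a}{4(1-u^{2})}
  = 0 .
```

The cross terms $a' a_D'$ cancel identically in the derivative of $W = a\, a_D' - a_D\, a'$, which is why only the second derivatives appear.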
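As a result of the uniformisation theorem of the moduli space of Riemann
surfaces, $\mathcal{G}(a)$ obeys a non-linear differential equation of the form
$$(1 - \mathcal{G}^2)\, \frac{d^2\mathcal{G}}{da^2} + \frac{1}{4}\, a \left(\frac{d\mathcal{G}}{da}\right)^{3} = 0 \, , \qquad (204)$$
so that the coefficients of the expansion (193) satisfy the sought-for recursion
relation
$$\mathcal{G}_{K+1} = \frac{1}{8\, \mathcal{G}_0^2\, (K+1)^2} \Bigg[ (2K-1)(4K-1)\, \mathcal{G}_K + 2\mathcal{G}_0 \sum_{N=0}^{K-1} c(N, K)\, \mathcal{G}_{K-N}\, \mathcal{G}_{N+1}$$
$$\qquad\qquad - 2 \sum_{L=0}^{K-1} \sum_{N=0}^{L+1} d(L, N, K)\, \mathcal{G}_{K-L}\, \mathcal{G}_{L+1-N}\, \mathcal{G}_N \Bigg] , \qquad (205)$$

where
$$c(N, K) = 2N(K - N - 1) + K - 1 \, ,$$
$$d(L, N, K) = [2(K - L) - 1]\,[2K - 3L - 1 + 2N(L - N + 1)] \qquad (206)$$
and $\mathcal{G}_0 = 1/2$. The first few coefficients read
$$\mathcal{G}_1 = \frac{1}{2^2} \, , \qquad \mathcal{G}_2 = \frac{5}{2^6} \, , \qquad \mathcal{G}_3 = \frac{9}{2^7} \, , \qquad (207)$$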
in perfect agreement with the results of Seiberg and Witten. Moreover, since
$u = \mathcal{G}(a)$, using the asymptotic behaviour of $\mathcal{G}$ one can determine the constant
value of W that reads
$$W = a_D'\, a - a_D\, a' = \frac{2i}{\pi} \, . \qquad (208)$$
This relation is very useful in order to determine the “critical” curve where
Im(aD /a) = 0. On this curve the lattice of BPS states collapses to a line, as
already observed.
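The recursion (205) with the coefficients (206) is straightforward to iterate with exact rational arithmetic. The sketch below is an illustration, not the authors' code: it reads (205) as an overall factor $1/(8\mathcal{G}_0^2 (K+1)^2)$ multiplying a bracket with a linear, a quadratic and a cubic term, sets Λ = 1, and cross-checks the result against the differential equation (204). For the cross-check, (204) is rewritten in the variable $x = a^{-4}$ using $\mathcal{G} = a^2 h(x)$, $d\mathcal{G}/da = a\, p(x)$ and $d^2\mathcal{G}/da^2 = q(x)$, which turns it into $p^3/4 - h^2 q + x q = 0$. All function names are hypothetical.

```python
from fractions import Fraction

def c(N, K):
    return 2 * N * (K - N - 1) + K - 1

def d(L, N, K):
    return (2 * (K - L) - 1) * (2 * K - 3 * L - 1 + 2 * N * (L - N + 1))

def matone_coefficients(k_max):
    """Return [G_0, ..., G_{k_max}] from the recursion (205), with G_0 = 1/2."""
    G = [Fraction(1, 2)]
    for K in range(k_max):
        linear = (2 * K - 1) * (4 * K - 1) * G[K]
        quadratic = sum(c(N, K) * G[K - N] * G[N + 1] for N in range(K))
        cubic = sum(d(L, N, K) * G[K - L] * G[L + 1 - N] * G[N]
                    for L in range(K) for N in range(L + 2))
        bracket = linear + 2 * G[0] * quadratic - 2 * cubic
        G.append(bracket / (8 * G[0] ** 2 * (K + 1) ** 2))
    return G

def series_mul(p, q, order):
    """Product of two power series in x, truncated at x**order."""
    out = [Fraction(0)] * (order + 1)
    for i in range(min(len(p), order + 1)):
        for j in range(min(len(q), order + 1 - i)):
            out[i + j] += p[i] * q[j]
    return out

def ode_residual(G, order):
    """Coefficient of x**order in p^3/4 - h^2 q + x q, i.e. eq. (204) at Lambda = 1."""
    h = G[: order + 1]
    p = [(2 - 4 * K) * g for K, g in enumerate(h)]
    q = [(2 - 4 * K) * (1 - 4 * K) * g for K, g in enumerate(h)]
    p_cubed = series_mul(series_mul(p, p, order), p, order)
    h2_q = series_mul(series_mul(h, h, order), q, order)
    shifted_q = q[order - 1] if order >= 1 else Fraction(0)
    return p_cubed[order] / 4 - h2_q[order] + shifted_q

G = matone_coefficients(3)
F = [G[K] / (2 * K) for K in range(1, 4)]   # F_K = G_K / (2K), eq. (194)
print(G, F)
```

Using `Fraction` avoids any rounding, so agreement with (207) and with the differential equation can be tested as exact equalities.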

9.2 (Constrained) Instanton Checks for K = 1, 2

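Following Matone [75], Fucito and Travaglini [76] have been able to check the
non-perturbative relation

$$\langle \mathrm{Tr}(\phi^2) \rangle(a) = u(a) = \mathcal{G}(a) = \mathcal{F}(a) - \frac{a}{2}\, \frac{\partial \mathcal{F}(a)}{\partial a} \qquad (209)$$

for K = 1, 2 and show agreement with the SW prepotential.

Using the relation between $\mathcal{F}$ in (188) and $\mathcal{G}$ in (193), one finds

$$\langle \mathrm{Tr}(\phi^2) \rangle(a) = -\frac{a^2}{2} - \sum_{K} \mathcal{G}_K\, \frac{\Lambda^{4K}}{a^{4K-2}} \, . \qquad (210)$$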

The calculation was carried out by making use of the ADHM construction,
which we now briefly review in the SU (2) case [19]. In the ADHM
approach [19], the gauge connection is written in the form

$$A_\mu(x) = U^\dagger(x)\, \partial_\mu U(x) \, . \qquad (211)$$

The key observation is that U (x) is not a unitary SU (2) matrix but rather a
(1 + K) × 1 “array” of quaternions, satisfying

$$\Delta^\dagger(x)\, U(x) = 0 \, , \qquad (212)$$

where
$$\Delta(x) = a + b x \, , \qquad (213)$$
with $x = x^\mu \sigma_\mu$ the position quaternion. Self-duality requires

$$\Delta^\dagger(x)\, \Delta(x) = f^{-1} \otimes 1 \, , \qquad (214)$$

with f an invertible K × K matrix and 1 the 2 × 2 identity matrix. The
projector on the kernel of $\Delta^\dagger(x)$, spanned by U (x), reads

$$P(x) = U(x)\, U^\dagger(x) = 1 - \Delta f \Delta^\dagger(x) \, . \qquad (215)$$

Gauge field zero modes, that we here denote by $a_\mu$, are orthogonal to the
gauge orbit and can be parametrised as [20, 76]

$$a_\mu(x) = U^\dagger(x)\left[C \bar\sigma_\mu f b^\dagger - b f \sigma_\mu C^\dagger\right] U(x) \, , \qquad (216)$$

with C a (1 + K) × K “matrix” of quaternions satisfying

$$\Delta^\dagger(x)\, C = \left(\Delta^\dagger(x)\, C\right)^T . \qquad (217)$$

These conditions reduce the number of independent (quaternionic) components
of C from (1 + K) × K to (1 + K) × K − (K − 1) × K = 2K, i.e. 8K
zero modes as expected for SU (2) instantons. Modulo symmetries, which are
local SU (2) and global SO(K), the components of C can be identified with
the fluctuations of Δ, δΔ, i.e. variations of the ADHM data, satisfying the
self-duality condition

$$(\Delta + \delta\Delta)^\dagger(x)\, (\Delta + \delta\Delta)(x) = f^{-1} \otimes 1 \qquad (218)$$

if non-linear terms are neglected. Since δΔ = C is linear in the gauge field zero
modes parametrised by C, one can identify zero modes of the gauge fields with
solutions of the linearised ADHM equations around a given self-dual solution.
This is equivalent to identifying the bosonic zero modes as solutions of the
equation $S[A_\mu + a_\mu] = S[A_\mu]$ up to cubic terms. One can similarly determine
the fermionic zero modes that in the case of N = 2 are as many as the bosonic
zero modes and are given by [20, 76]
$$\lambda^{(i)}_{\beta \dot a} = \sigma^{\mu}_{\beta \dot a}\, a^{(i)}_\mu \, , \qquad (219)$$

with i = 1, ..., 8. In the presence of flat directions of the classical scalar
potential, the constrained instanton method entails an expansion around a solution
of the (approximate) coupled equations$^{22}$

$$D_\mu F^{\mu\nu} = 0 \, , \qquad D^2 \phi = 0 \qquad (220)$$

with boundary condition at infinity, $\phi \to \phi_{flat}$. For SU (2), $\phi_{flat} = a\sigma_3/2i$
modulo gauge transformations.
For K = 1, everything simplifies drastically. As discussed in detail in
Sects. 2.3 and 2.4 and Appendix B, the bosonic measure (“integrated” over
$SU(2)/Z_2$) reads
$$d\mu_B = \frac{4}{\pi^2} \left(\frac{2\pi\rho\mu}{g}\right)^{8} \frac{d^4x_0\, d\rho}{\rho^5} \, . \qquad (221)$$
Using the fermionic zero modes, that are not normalised, the fermionic measure
is given by
$$d\mu_F = \left(\frac{g^2}{32\pi^2\mu}\right)^{4} d^4\eta\, d^4\bar\xi \, . \qquad (222)$$
Due to the presence of the scalar v.e.v., a, the classical action consists of
various terms

$$S_{cl} = S_{YM} + S_{scal} + S_{ferm} + S_{Yuk} + S_{pot} \, . \qquad (223)$$

22
The attentive reader may notice that these are not the classical equations since the
scalar induced source J ν = φ† (Dν φ)−(Dν φ† )φ is being neglected. Exact topolog-
ically non-trivial solutions in the presence of non-zero v.e.v.’s for the scalars are
not known [106]. The standard approach, which allows to control the ﬂuctuations
around the approximate solution, consists in adding to the action a “constraint”
on the instanton size. The resulting “solution” is thus known as a “constrained
instanton” [47].

After integration over the ﬂuctuations of φ and φ† around their v.e.v., the
Yukawa couplings produce an additional (to φharmonic ) inhomogeneous term
in φ of the form
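$$\phi^{a}_{\rm inhom} = \sqrt{2}\,[D^{-2}]^{a}{}_{b}\,\epsilon^{bcd}\lambda^{\alpha}_{c}\lambda_{d\alpha} = \sqrt{2}\,\zeta^{\alpha}\lambda^{a}_{\alpha} , \tag{224}$$

where $\zeta^{\alpha} = \eta^{\alpha} + x_{\mu}\sigma^{\mu\,\alpha\dot\alpha}\bar\xi_{\dot\alpha}$. The absence of zero modes with “wrong” chirality
leads to
$$S_{\rm Yuk} = \bar a^{b}\,\bar\xi_{\dot\alpha}\,\sigma_{b}^{\dot\alpha\dot\beta}\,\bar\xi_{\dot\beta}\,\frac{g}{\sqrt{2}}\left(\frac{g^{2}}{32\pi^{2}\mu}\right)^{-1} . \tag{225}$$
Moreover
$$S_{\rm scal} = 4\pi^{2}|a|^{2}\rho^{2} \tag{226}$$
and we set
$$\Lambda^{4} = \mu^{4}\,e^{-8\pi^{2}/g^{2}} . \tag{227}$$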
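The explicit computation of u (the v.e.v. of Tr(φ²)) then yields
$$u = \langle\phi^a\phi^a\rangle_{K=1} = \Lambda^4\,\frac{4}{\pi^2}\left(\frac{2\pi}{g}\right)^{8}\int d^4x_0\, d\rho\,\rho^3\, e^{-4\pi^2|a|^2\rho^2}\, F^a_{\mu\nu}F_a^{\mu\nu}
\times\left(\frac{g^2}{32\pi^2}\right)^{4}\int d^4\eta\, d^4\bar\xi\,(\eta\eta)^2\exp\left[-\bar a_b\,\bar\xi\sigma^b\bar\xi\,\frac{g}{\sqrt2}\left(\frac{g^2}{32\pi^2\mu}\right)^{-1}\right] . \tag{228}$$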

Performing the integrations over the collective coordinates yields

$$\langle\phi^a\phi^a\rangle_{K=1} = \frac{2}{g^4}\frac{\Lambda^4}{a^2} \tag{229}$$
in agreement with $G_1$.
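As a sketch of one of the integrations behind (229), the size integral in (228) can be done in closed form, assuming that the $x_0$-integral of $F^a_{\mu\nu}F_a^{\mu\nu}$ does not depend on ρ (it is fixed by the instanton number). Substituting $t=\rho^2$:

```latex
\int_0^\infty d\rho\,\rho^3\,e^{-4\pi^2|a|^2\rho^2}
  = \frac{1}{2}\int_0^\infty dt\,t\,e^{-4\pi^2|a|^2 t}
  = \frac{1}{2\,(4\pi^2|a|^2)^2}
  = \frac{1}{32\pi^4|a|^4} .
```

The remaining dependence on the v.e.v. comes from the Grassmann integral over $\bar\xi$, which brings down two powers of the Yukawa exponent; only the ρ-integral is worked out here.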
For K = 2, the (constrained) instanton calculus is more laborious. The
oﬀ-diagonal component d of the lower sub-block of Δ is of the form
$$d = \frac{1}{2}\,\frac{y}{y^2}\,(\bar v_2 v_1 - \bar v_1 v_2) , \tag{230}$$

where y = x1 − x2 ≡ 2e and x0 = (x1 + x2 )/2, with x1 and x2 denoting the

two instanton “centres” and v1 and v2 the two extra quaternionic collective
coordinates. Similar restrictions as before apply to C = δΔ so that the oﬀ-
diagonal component γ of the lower sub-block of C is of the form
$$\gamma = \frac{y}{y^2}\,(2\bar d\,\eta + \bar v_2\nu_1 - \bar v_1\nu_2) , \tag{231}$$
where η, ν1 and ν2 are quaternions that parametrise the independent ﬂuctu-
ations of the fermions. Separating the four collective coordinates associated
with translations, x0 , and the four broken Poincaré supersymmetries, η0 , the
relevant correlator reads
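$$\langle\phi^a\phi^a\rangle_{K=2} = \frac{\Lambda^8}{16}\int d^4v\, d^4e\, d^4\bar\xi\, d^4\nu_1\, d^4\nu_2\left(\frac{J_B}{J_F}\right)^{1/2} e^{-S_{\rm Yuk}-S_{\rm scal}}
\times\int d^4x_0\, d^4\eta_0\,(\eta_0\eta_0)^2\, F^a_{\mu\nu}F_a^{\mu\nu} \tag{232}$$

where
$$\left(\frac{J_B}{J_F}\right)^{1/2} = \frac{2^{10}}{\pi^8}\,\frac{\big|\,|e|^2-|d|^2\,\big|}{|v_1|^2+|v_2|^2+4(|d|^2+|c|^2)} . \tag{233}$$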
Performing all the many necessary integrations yields

$$\langle\phi^a\phi^a\rangle_{K=2} = -\frac{5}{4g^8}\frac{\Lambda^8}{a^6} \tag{234}$$
in agreement with $G_2$.
Actually, one can formally prove that Matone relations are satisfied by
instanton calculus for any K [76].
Another elegant approach to derive the SW prepotential from first princi-
ples is based on the so-called N = 2∗ theory. This is nothing else but N = 4
SYM theory deformed by the addition of a mass M for the hypermultiplet
in the adjoint representation, or equivalently the same mass M1 = M2 = M
for two of the three adjoint chiral multiplets in the N = 1 description of the
N = 4 theory. Quite remarkably the hypermultiplet, H = {Φ1 , Φ2 }, appears
quadratically in the microscopic action,

$$S[\Phi_{I=1,2};\Phi_3,V] = \int d^2\theta\, d^2\bar\theta\;{\rm Tr}(\Phi_I^{\dagger}e^{gV}\Phi_I)
+\left[\int d^2\theta\left(g\,{\rm Tr}([\Phi_1,\Phi_2]\Phi_3)+\frac{1}{2}M\,{\rm Tr}(\Phi_I)^2\right)+{\rm h.c.}\right] , \tag{235}$$
and can be integrated out in a Gaussian fashion. One ends up with an effective
action à la Wilson–Polchinski where M plays the role of a UV cut-off. The
advantage of the approach is the UV finiteness of N = 4 SYM theory which
persists after the inclusion of the N = 2 supersymmetric mass terms. The
resulting low-energy effective action is expected to coincide with the one re-
sulting from the SW prepotential. As we said in Sect. 5, this has been partially
checked by means of the exact renormalisation group in [61].

10 Topological Twist and Non-commutative Deformation

N = 2 SYM theories admit an interesting reformulation which goes under the

name of “topological twist” [77]. Although the topologically twisted version
is not fully equivalent to the original (dynamical) theory, some of the observ-
ables coincide. In particular, one can suspect that the analytic prepotential F
could be one of these observables thanks to holomorphy. As we will see later
on, this is not completely true. The topological theory cannot reproduce the
logarithmic term generated by one-loop corrections. Yet, a properly defined
partition function of the topological theory captures all the non-perturbative
corrections to F and more. Indeed, higher derivative “gravitational” F -terms
can be reliably computed by means of its topologically twisted version if a
suitable background inducing “non-commutativity” is turned on. After briefly
reviewing the topological twist formalism, we will sketch the arguments lead-
ing to the derivation of Fnon−pert from the topological partition function.
The topological twist consists in bringing bosons and fermions to transform
in the same way under the subgroup SU (2)L × SU (2)D ⊂ SU (2)L × SU (2)R ×
SU (2)I , where D stands for the diagonal subgroup of SU (2)R ×SU (2)I , which
is not to be confused with the (Euclidean) Lorentz group SU (2)L × SU (2)R ,
since SU (2)I is part of the R-symmetry group U (1)×SU (2)I . Under SU (2)L ×
SU (2)D the two Weyl gaugini transform as a four-vector, ψμ ∈ (1/2, 1/2), a
singlet, η̄ ∈ (0, 0), and a self-dual tensor, χ̄+ μν ∈ (0, 1), where, adhering to
standard notation, (jL , jD ) refers to the SU (2)L × SU (2)D spins rather than
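the dimension of the representation. Similarly the superspace variables dual
to the eight supercharges are $\theta \to \theta^{\mu},\ \bar\theta^{+}_{\mu\nu},\ \bar\theta$, so that the chiral superfield Φ
admits the new-looking decomposition
$$\Phi = \phi + \theta^{\mu}\psi_{\mu} + \frac{1}{2}\theta^{\mu}\theta^{\nu}F_{\mu\nu} + \cdots , \tag{236}$$
where
$$\theta^{\mu} = \frac{1}{2}\,\sigma^{\mu}_{\alpha,r}\,\theta^{\alpha,r} . \tag{237}$$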
The supercharge Q̄ = εα̇r Q̄α̇r is a scalar and plays the role of topological
Becchi, Rouet, Stora and Tyutin (BRST) charge. In the topologically twisted
version, which we would like to stress is only a reformulation of N = 2 SYM
theories, the action reads

$$S_{\rm top} = \int F\wedge F + \{\bar Q,\Psi\} , \tag{238}$$

where Ψ = φ∂μ ψ μ + Fμν χμν + η[φ̄, φ] is the “topological gauge fermion”.

For hyperkähler manifolds, i.e. manifolds with three closed Kähler forms,
the supercharges $\bar Q^{+}_{\mu\nu}$ can also be exploited in order to perform the topological
twist [77]. Nekrasov [81, 82] proposed to also use $Q_{\mu}$ or, better, to deform $\bar Q$ to
$$\bar Q_E = \bar Q + E_a V^{a}_{\mu\nu}\,x^{\mu}Q^{\nu} , \tag{239}$$
where $V^{a}_{\mu\nu} = -V^{a}_{\nu\mu}$ are the six generators of the Euclidean rotation group
SO(4) and $E_a$ are constant parameters. This allows one to define equivariant
forms $\Omega(E) = \sum_p \Omega_p(E) = \sum_p\sum_{i_1,\ldots,i_p}\frac{1}{p!}\,\Omega_{i_1,\ldots,i_p}\,dx^{i_1}\wedge\ldots\wedge dx^{i_p}$ such that

RΩ(E) = Ω(R−1 ER) (240)

for any R ∈ SO(4). Ω(E) are naturally acted on by the equivariant exterior
derivative
dE = d + ιV (E) , (241)
where $\iota_{V(E)}$ denotes contraction with the vector field $V(E) = E_a V^{a}_{\mu\nu}\,x^{\mu}\partial^{\nu}$, i.e.

dE Ωp = dΩp + ιV (E) Ωp . (242)

As a result, acting with dE on a p-form generically yields both a (p + 1)-form

dΩp and a (p − 1)-form ιV (E) Ωp .
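A one-line check, given here as a sketch that uses only Cartan's formula, shows why invariance under the rotation generated by V(E) is the natural requirement on the forms one works with:

```latex
d_E^{\,2} = (d + \iota_{V(E)})^2
          = d^2 + d\,\iota_{V(E)} + \iota_{V(E)}\,d + \iota_{V(E)}^2
          = \mathcal{L}_{V(E)} ,
\qquad\text{since } d^2 = 0 \text{ and } \iota_{V(E)}^2 = 0 .
```

The equivariant derivative is therefore nilpotent only on forms that are invariant under the flow of V(E).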
One can check that the topological observable

$$O_P^{\Omega(E)} = \int_{\mathbb{R}^4}\Omega(E)\wedge P(\Phi) \tag{243}$$

is Q̄E -closed iﬀ Ω(E) is “equivariantly” closed, i.e. iﬀ dE Ω(E) = 0. For generic

choices of E a the set of equivariantly closed forms is empty. However one
can consider E a ∈ U (2)ω ⊂ SO(4), where U (2)ω is the stability group of a
“reference” symplectic (Kähler and thus closed) form,

ω = dx1 ∧ dx2 + dx3 ∧ dx4 , (244)

that by deﬁnition satisﬁes dω = 0. In this way from the condition of equiv-

ariance, that can be phrased in terms of the vanishing of the following Lie
derivative
LV (E) ω = 0 = d(ιV (E) ω) + ιV (E) dω , (245)
it follows that, at least locally,

ιV (E) ω = dμ(E) , (246)

or in other terms
dE (ω − μ(E)) = 0 . (247)
Decomposing μ(E) along the four generators of the stability group $U(2)_\omega$, one
finds
$$h(x) \equiv \mu_0 = \delta_{\mu\nu}x^{\mu}x^{\nu} , \qquad \mu_a = \sum_{\mu<\nu}\eta^{a}_{\mu\nu}\,x^{\mu}x^{\nu} , \tag{248}$$
where $\eta^{a}_{\mu\nu}$ are ’t Hooft symbols. Since ω defines a complex structure one can
introduce complex coordinates $z_1, z_2$ such that $\omega = dz_1\wedge d\bar z_1 + dz_2\wedge d\bar z_2$. We
also define
$$H = \mu_R(E) = \epsilon_1|z_1|^2 + \epsilon_2|z_2|^2 , \tag{249}$$
where $\mu_R(E) = \frac{1}{2}(\epsilon_1+\epsilon_2)\,\mu_0(z,\bar z) + \frac{1}{2}(\epsilon_1-\epsilon_2)\,\mu_3(z,\bar z)$ is an arbitrary linear
combination of the “real” moment maps, the complex part being $\mu_C = \mu_1 + i\mu_2$.
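The relation between the torus rotation and the moment map H can be checked numerically. The sketch below is illustrative only: the orientation of the rotation, the normalisation $z_1=(x_1+ix_2)/\sqrt2$, $z_2=(x_3+ix_4)/\sqrt2$ and all function names are assumptions made here, chosen so that $\iota_V\omega = dH$ with $\omega = dx_1\wedge dx_2 + dx_3\wedge dx_4$.

```python
# Finite-difference check that i_V(omega) = dH for the torus rotation of R^4.
# Assumed conventions (not fixed by the text): omega = dx1^dx2 + dx3^dx4 and
# |z1|^2 = (x1^2 + x2^2)/2, |z2|^2 = (x3^2 + x4^2)/2.

def rotation_field(x, eps1, eps2):
    """Vector field generating rotations in the (x1,x2) and (x3,x4) planes."""
    x1, x2, x3, x4 = x
    return (eps1 * x2, -eps1 * x1, eps2 * x4, -eps2 * x3)

def contract_with_omega(v):
    """Components of i_V(omega) along dx1..dx4 for omega = dx1^dx2 + dx3^dx4."""
    v1, v2, v3, v4 = v
    return (-v2, v1, -v4, v3)

def moment_map(x, eps1, eps2):
    """H = eps1 |z1|^2 + eps2 |z2|^2 in real coordinates."""
    x1, x2, x3, x4 = x
    return 0.5 * (eps1 * (x1**2 + x2**2) + eps2 * (x3**2 + x4**2))

def gradient(f, x, step=1e-5):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += step
        down[i] -= step
        grad.append((f(up) - f(down)) / (2 * step))
    return tuple(grad)

def max_mismatch(x, eps1, eps2):
    """Largest componentwise difference between i_V(omega) and dH at x."""
    lhs = contract_with_omega(rotation_field(x, eps1, eps2))
    rhs = gradient(lambda y: moment_map(y, eps1, eps2), x)
    return max(abs(l - r) for l, r in zip(lhs, rhs))
```

Calling `max_mismatch` at any point returns a value at the level of rounding error, which is the statement that ω minus the moment map is equivariantly closed for this choice of signs.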
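Relying on the equivariance properties of ω and H, one can define the
generating function of $\bar Q_E$-closed observables by the formula
$$Z(a,\epsilon) = \left\langle\exp\left\{\frac{1}{(2\pi i)^2}\int_{\mathbb{R}^4\equiv\mathbb{C}^2}\left[\omega\wedge{\rm Tr}\left(\phi F+\frac{1}{2}\psi\wedge\psi\right)-\frac{1}{2}H(x)\,{\rm Tr}(F\wedge F)\right]\right\}\right\rangle_{a} , \tag{250}$$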

where the suﬃx a denotes the dependence on the scalar v.e.v., a. Supersym-
metry, which in this context is tantamount to topological invariance since Q̄E
is a linear combination of the supercharges, guarantees a perfect cancellation
of all perturbative contributions between bosons and fermions. As a result,
$Z(a,\epsilon)$ is saturated by instantons, viz.

$$Z(a,\epsilon) = \sum_{K} q^{K} Z_{K}(a,\epsilon) , \tag{251}$$

where q = exp(2πiτ ). Moreover, the presence of H suppresses the contribution

of widely separated instantons and can be combined with ω into
$$H(x,\theta) = H(x) + \frac{1}{2}\,\omega_{\mu\nu}\theta^{\mu}\theta^{\nu} . \tag{252}$$
$H(x,\theta)$ represents a manifestly supersymmetric regulator for the holomorphic
function $\mathcal{F}(a,\Lambda)$, where the explicit presence of Λ as an argument denotes
the dependence of $\mathcal{F}$ on the renormalisation group-invariant scale. In turn,
the latter gets effectively replaced by $\mathcal{F}(a,\Lambda e^{-H})$. Indeed, rescaling the metric
of $\mathbb{R}^4\equiv\mathbb{C}^2$ by a factor λ and taking the limit λ → ∞, only the last term
survives in the partition function, since all the other terms are suppressed by
inverse powers of λ that appear in the propagators needed for the contractions.
Taking into account that derivatives of H with respect to $x^{\mu}$ or, equivalently,
$z_1$ and $z_2$, are proportional to $\epsilon_{1,2}$, one finds
$$Z(a,\epsilon) = \exp\left\{\frac{1}{2(2\pi i)^2}\int_{\mathbb{R}^4}\omega\wedge\omega\,\frac{\partial^2\mathcal{F}(a,\Lambda e^{-H})}{\partial(\log\Lambda)^2}+O(\epsilon)\right\}
\approx\exp\left\{\frac{\mathcal{F}_{\rm inst}(a,\Lambda)+O(\epsilon)}{\epsilon_1\epsilon_2}\right\} \tag{253}$$
where
$$\mathcal{F}_{\rm inst}(a,\Lambda) = \int_0^{\infty}\frac{\partial^2\mathcal{F}(a,\Lambda e^{-H})}{\partial H^2}\,H\,dH . \tag{254}$$
Equation (254) makes the analytic properties of Z and F manifest.

10.1 Including Hypermultiplets

In the presence of $N_f$ hypermultiplets in the fundamental representation with
masses $m_f$, a possible parametrisation23 of the SW curve is [107]
$$w + \frac{\Lambda^{2N_c-N_f}\,Q(z)}{w} = P(z) , \tag{255}$$
where
$$Q(z) = \prod_{f=1}^{N_f}(z+m_f) \tag{256}$$
and
$$P(z) = \prod_{l=1}^{N_c}(z-\alpha_l) . \tag{257}$$

23
In order to make contact with the parametrisation used previously one has to
perform the transformation $y = w - \frac{1}{2}P(z)$ and set Q(z) to zero in the absence
of hypers.

The $\alpha_l$’s are related to the v.e.v.’s of the adjoint scalars belonging to the
Cartan subalgebra and are such that $\sum_l\alpha_l = 0$. The space of monic polynomials
P(z), i.e. polynomials where the coefficient of the monomial of highest
degree is 1, with the coefficient of the monomial of next to highest degree
vanishing because $\sum_l\alpha_l = 0$, is $U = \mathbb{C}^{N_c-1}$ and can thus be parametrised by the $N_c-1$ variables
$u_n = {\rm Tr}(a^n)$, with n = 1, ..., Nc − 1. The latter are symmetric polynomials in
the $\alpha_l$ that can be identified with the $N_c-1$ Casimirs of SU(Nc). The first
two symmetric polynomials are 1 and $u_2 = \sum_l\alpha_l^2$ or, equivalently, $\sum_{l<l'}\alpha_l\alpha_{l'}$
since $\sum_l\alpha_l = 0$. The relation between $u_n$ and $\alpha_l$ can be similarly determined.
We now discuss how to determine the relation between $u_n$ and the periods $a_l$
and $a^D_l$ of the SW curve.
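In the perturbative region, where $|\alpha_l|, |\alpha_l-\alpha_n| \gg |\Lambda|, |m_f|$, one can choose
local coordinates
$$a_l = \frac{1}{2\pi i}\oint_{A_l}\frac{z\,dw}{w} \tag{258}$$
and
$$a^D_l = \frac{1}{2\pi i}\oint_{B_l}\frac{z\,dw}{w} , \tag{259}$$
where the $A_l$ cycles encircle the cuts in the z-plane from the point $\alpha_l^{+}$ to the
point $\alpha_l^{-}$, where the points $\alpha_l^{\pm}$ are such that
$$P(z=\alpha_l^{\pm}) = \pm 2\,\Lambda^{N_c-\frac{N_f}{2}}\sqrt{Q(z=\alpha_l^{\pm})} . \tag{260}$$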
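The branch-point condition (260) follows from the curve (255) by elementary algebra; the short derivation below is a sketch that treats (255) as a quadratic equation in w at fixed z.

```latex
w^2 - P(z)\,w + \Lambda^{2N_c-N_f}\,Q(z) = 0
\;\;\Longrightarrow\;\;
w_\pm(z) = \frac{1}{2}\Big[P(z) \pm \sqrt{P(z)^2 - 4\Lambda^{2N_c-N_f}Q(z)}\Big] ,
\qquad
w_+ = w_- \;\Longleftrightarrow\; P(z)^2 = 4\Lambda^{2N_c-N_f}Q(z)
\;\Longleftrightarrow\; P(z) = \pm 2\Lambda^{N_c-\frac{N_f}{2}}\sqrt{Q(z)} .
```

The two sheets of the curve meet exactly at the end-points of the cuts, which is where the two signs in (260) come from.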

Not all $A_l$ cycles are homologically independent since $\sum_l A_l \approx 0$ can be shrunk
to zero. The $B_l$ cycles go through the cuts from $\alpha_l^{+}$ to $\alpha^{-}_{l+1\,({\rm mod}\ N)}$. Once again
$\sum_l B_l \approx 0$ in homology. As a result, $\sum_l da_l\wedge da^D_l = 0$ and on local patches
one can introduce the prepotential F(a; m, Λ) such that

$$d\mathcal{F}(a;m,\Lambda) = \sum_l a^D_l\,da_l . \tag{261}$$

F(a; m, Λ) admits an expansion of the form

F(a; m, Λ) = Fpert (a; m, Λ) + Finst (a; m, Λ) , (262)

where $\mathcal{F}_{\rm inst}(a;m,\Lambda)$ encompasses the instanton contribution, that can be computed
by the localisation techniques outlined below, and

$$\mathcal{F}_{\rm pert}(a;m,\Lambda) = \frac{1}{2}\sum_{l\neq l'}(a_l-a_{l'})^2\log\frac{a_l-a_{l'}}{\Lambda}
-\sum_{l,f}(a_l+m_f)^2\log\frac{a_l+m_f}{\Lambda} \tag{263}$$

encodes the logarithmic running of the gauge coupling with the mass scales
at play. Indeed $a_l-a_{l'}$ are the masses of the W-bosons and $a_l+m_f$ are the
masses of the charged hypermultiplets.
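To make the structure of (263) concrete, here is a minimal numerical sketch that evaluates the two sums for given v.e.v.'s, masses and scale. The function name `f_pert` is hypothetical, and the use of the principal branch of the complex logarithm is an assumption of this sketch; the text does not fix a branch.

```python
import cmath

def f_pert(a, masses, Lam):
    """Evaluate the perturbative prepotential (263).

    a      -- sequence of v.e.v.'s a_l
    masses -- sequence of hypermultiplet masses m_f (may be empty)
    Lam    -- the scale Lambda
    Uses the principal branch of log (an assumption of this sketch).
    """
    total = 0j
    # W-boson contribution: (1/2) * sum over ordered pairs l != l'
    for l, al in enumerate(a):
        for lp, alp in enumerate(a):
            if l != lp:
                d = al - alp
                total += 0.5 * d**2 * cmath.log(d / Lam)
    # Hypermultiplet contribution: - sum over l and f
    for al in a:
        for m in masses:
            s = al + m
            total -= s**2 * cmath.log(s / Lam)
    return total
```

For SU(2) without flavours, `f_pert([1.0, -1.0], [], 0.5)` sums the contributions of the two W-bosons of mass ±2; the imaginary part that appears comes from the logarithm of a negative argument on the chosen branch.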

10.2 Instanton Measure and Localisation for Arbitrary K

Following the ADHM construction [19], the moduli space MK,Nc of K instan-
tons in SU (Nc ) with fixed framing (i.e. orientation in colour space) at infinity
is a 4KNc dimensional variety and can be viewed as the hyperkähler quotient
of the ADHM data (B1 , B2 , I, J), where B1,2 ∈ End(VK ), I ∈ Hom(WNc , VK )
and J ∈ Hom(VK , WNc ), with respect to the action of U (K). The correspond-
ing formulae
μC = [B1 , B2 ] + IJ = 0 (264)
and
μR = [B1 , B1† ] + [B2 , B2† ] + II † − J † J = 0 (265)
are the celebrated ADHM equations [19] that indeed enjoy invariance under
U (K) transformations.
As a result $\mathcal{M}_{K,N_c}$ is compact neither in the UV (due to small size instantons)
nor in the IR (due to the non-compactness of $\mathbb{R}^4$).
Various compactifications of $\mathcal{M}_{K,N_c}$ have been proposed [108]. The Uhlenbeck
compactification $\mathcal{M}^{U}_{K,N_c}$ corresponds to the construction of a hyperkähler
orbifold where the UV problem is cured by including point-like
instantons, e.g. gluing subspaces of the form $\mathcal{M}_{K-1,N_c}\times\mathbb{R}^4$, $\mathcal{M}_{K-2,N_c}\times\mathbb{R}^8$,
$\mathcal{M}_{K-3,N_c}\times\mathbb{R}^{12}$ and so on. Alternatively, according to Nekrasov and Schwarz,
the singularities of $\mathcal{M}^{U}_{K,N_c}$ can be blown up to a smooth space $\mathcal{M}^{NS}_{K,N_c}$ which
includes “exceptional divisors” in place of the original singularities [79, 80].

This blowing up relies on a non-commutative extension of the gauge theory
that translates into the possibility of deforming the ADHM equations (264)
and (265) to24
$$\mu_R = \zeta_R\,1_K , \qquad \mu_C = 0 . \tag{266}$$
Deformed instanton calculus then boils down to computing equivariant volumes
of $\mathcal{M}^{NS}_{K,N_c}$, provided one uses in the definition of the integration measure
24
In principle one can deform μC as well. But this deformation is irrelevant as it
can always be eliminated by a non analytic change of coordinates.
the closed symplectic two-form25 lifted from $\mathcal{M}^{U}_{K,N_c}$, where the relevant symplectic
form is the reference Kähler form. Since this symplectic form vanishes
when restricted to the exceptional divisors, it does not add contributions “extraneous”
to the original “commutative” gauge theory. In order to localise the
measure, i.e. reduce the integrals to contour integrals that are calculable by
the residue theorem, it is convenient to consider the combined action of U(K),
$G = SU(N_c)/Z_{N_c}$ and $T^2$, the latter representing the maximal torus, i.e. the
exponential of the Cartan subalgebra, of SO(4). The use of this combined
action is instrumental in deforming the symplectic Kähler form ω of $\mathbb{R}^4$ by
the moment maps $\mu_G = \delta_G A^{\Lambda}\,\omega^{ADHM}_{\Lambda\Lambda'}\,A^{\Lambda'}$, where $A^{\Lambda}$ collectively denote the
ADHM data, and $\mu_{T^2} = \epsilon_a\,x^{i}(V^{a}_{T^2})_i{}^{j}\,\omega_{jk}\,x^{k}$ and in constructing an equivariant
form that localises the integrals on point-like abelian instantons.
The partition function over the compactified instanton moduli space reads
$$Z(a,\epsilon_1,\epsilon_2;q) = \sum_{K} q^{K}\oint_{\mathcal{M}_K} 1 , \tag{267}$$
where $q = e^{2\pi i\tau}$ and $\oint 1$ denotes the localisation of the integral to point-like
instantons while $a = (a_1,\ldots,a_{N_c})$ parametrise the Cartan subalgebra of
G = SU(Nc), i.e. $\sum_{i=1}^{N_c}a_i = 0$, and $\epsilon_1,\epsilon_2$ are deformation parameters corresponding
to the Ω background, defined below. For the purpose of computing
the integral, it is convenient to rewrite the contour integral in the form
$$Z_K = \oint_{\mathcal{M}_K} 1 = \int_{\mathcal{M}_K}\exp\big(\omega+\mu_G(a)+\mu_{T^2}(\epsilon)\big) , \tag{268}$$
due to topological BRST invariance. The non-perturbative contributions to
the prepotential, but not the perturbative ones, are proportional to the logarithm
of the topological partition function
$$Z(a,\epsilon_1,\epsilon_2;q) = \exp\left(\frac{\mathcal{F}_{\rm non-pert}(a,\epsilon_1,\epsilon_2;q)}{\epsilon_1\epsilon_2}\right) , \tag{269}$$

as previously shown. Here we are only describing an efficient way to explicitly
compute the contour integrals that yield $Z_K$, the coefficients of the expansion
of the topological partition function. Using localisation, one can indeed derive
an explicit expression for $Z(a,\epsilon_1,\epsilon_2;q)$. Taking for simplicity $\epsilon_1 = -\epsilon_2 = \hbar$
(the notation suggests that some quantum non-commutativity is switched
on, as we will see!), one finds
$$Z(a,\hbar,-\hbar;q) = \sum_{\mathbf{K}} q^{|\mathbf{K}|}\prod_{(m,n)\neq(i,j)}\frac{a_{mi}+\hbar\,(K_{m,n}-K_{i,j}+j-n)}{a_{mi}+\hbar\,(j-n)} , \tag{270}$$
25
A symplectic 2-form is the generalisation of the familiar 2-form $\omega = \sum_i dp_i\wedge dq^i$
in phase space.
where ami = am − ai and the sum is over the “coloured” partitions of the
instanton numbers among the Nc abelian factors U (1)Nc of the Cartan sub-
algebra of U (Nc )
K = (K 1 , . . . , K Nc ) (271)
with

K n = {Kn,1 ≥ Kn,2 ≥ · · · ≥ Kn,ln ≥ Kn,ln +1 = Kn,ln +2 = · · · = 0} , (272)

while the product in (270) is over 1 ≤ m, i ≤ Nc and n, j ≥ 1.
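The sum in (270) runs over coloured partitions, i.e. $N_c$-tuples of ordinary partitions whose sizes add up to the instanton number. A minimal sketch of how these can be enumerated is given below; the function names are hypothetical and each partition is represented as a tuple of non-increasing positive row lengths, with trailing zeros omitted.

```python
def partitions(n, max_part=None):
    """Yield all partitions of n as tuples of non-increasing positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def coloured_partitions(K, Nc):
    """Yield all Nc-tuples of partitions with total size K."""
    if Nc == 1:
        for p in partitions(K):
            yield (p,)
        return
    for k in range(K, -1, -1):
        for p in partitions(k):
            for rest in coloured_partitions(K - k, Nc - 1):
                yield (p,) + rest
```

For two colours the number of terms at instanton number K = 0, 1, 2, 3 is 1, 2, 5, 10, so the sum grows quickly with K even before the product in (270) is evaluated.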

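The theory can be enlarged by the addition of $N_f$ hypermultiplets in the
fundamental $\mathbf{N_c}+\mathbf{N_c}^{*}$ with masses $m_1,\ldots,m_{N_f}$. The explicit expression of
$Z(a_i,m_f,\hbar,-\hbar;q)$ in this case becomes

$$Z(a_i,m_f,\hbar,-\hbar;q) = \sum_{\mathbf{K}}(q\,\hbar^{N_f})^{|\mathbf{K}|}\prod_{(m,n)}\prod_{f=1}^{N_f}\frac{\Gamma\!\left(\frac{1}{\hbar}(a_m+m_f)+1+K_{m,n}-n\right)}{\Gamma\!\left(\frac{1}{\hbar}(a_m+m_f)+1-n\right)}
\times\prod_{(m,n)\neq(i,j)}\frac{a_{mi}+\hbar\,(K_{m,n}-K_{i,j}+j-n)}{a_{mi}+\hbar\,(j-n)} . \tag{273}$$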

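10.3 Computing the Residues and Checking the Instanton Contributions

In a remarkable paper, Moore, Nekrasov and Shatashvili [80] have indeed been
able to reduce the computation of $Z_K$ in the K-instanton sector to contour
integrals of the form
$$Z_K(a;\epsilon_i) = \frac{1}{K!}\,\frac{(\epsilon_1+\epsilon_2)^K}{(2\pi i\,\epsilon_1\epsilon_2)^K}\oint\prod_{I=1}^{K}\frac{d\phi_I\,Q(\phi_I)}{P(\phi_I)\,P(\phi_I+\epsilon_1+\epsilon_2)}
\times\prod_{1\le I<J\le K}\frac{\phi_{IJ}^2\,\big(\phi_{IJ}^2-(\epsilon_1+\epsilon_2)^2\big)}{(\phi_{IJ}^2-\epsilon_1^2)(\phi_{IJ}^2-\epsilon_2^2)} , \tag{274}$$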

where the complex variables φIJ = φI − φJ can be thought of as entries of a

K ×K matrix, P (z) and Q(z) were deﬁned before and the integration contours
run along the real axis.
The variables $\phi_I$, $a_l$ and $\epsilon_{1,2}$ represent an infinitesimal deformation of the
ADHM equations such that

$$[B_1,\phi] = \epsilon_1B_1 , \qquad [B_2,\phi] = \epsilon_2B_2 , \qquad -\phi I+Ia = 0 , \qquad -aJ+J\phi = -(\epsilon_1+\epsilon_2)J . \tag{275}$$

In the bases of the K-dimensional vector space VK and the Nc -dimensional

vector space WNc of the ADHM construction, where the K × K matrix φ and
the Nc × Nc matrix a, representing the scalar v.e.v.’s, are diagonal, one has
$$(\phi_{IJ}+\epsilon_1)B_{1,IJ} = 0 , \qquad (\phi_{IJ}+\epsilon_2)B_{2,IJ} = 0 , \qquad (\phi_I-a_l)I_{I,l} = 0 , \qquad (\phi_I+\epsilon_1+\epsilon_2-a_l)J_{l,I} = 0 . \tag{276}$$

The poles at $\phi_{IJ} = \pm\epsilon_{1,2}$ should be avoided by deforming the contour or
setting $\epsilon_{1,2}\to\epsilon_{1,2}+i\delta$. Similarly $a_l\to a_l+i\delta$ in order to avoid the zeroes
of P in the denominator. The origin of the poles at $\phi_{IJ} = \pm\epsilon_{1,2}$ can be
understood by means of the Duistermaat–Heckman (DH) formula

$$\frac{1}{n!}\int_{X^{2n}}\omega^{n}\,e^{-\mu[\xi]} = \sum_{P_f:\,V_\xi(P_f)=0}\frac{e^{-\mu[\xi](P_f)}}{\prod_{i=1}^{n}W_i[\xi](P_f)} , \tag{277}$$

where $X^{2n}$ is a symplectic manifold with symplectic form ω and μ is the moment
map of a “torus” action generated by ξ and represented by $V_\xi$, that
has fixed points $P_f$ with “exponents” $W_i[\xi](P_f)$. In the case of $X = \mathcal{M}^{NS}_{K,N_c}$,
the relevant torus action, that consists of the geometric transformations that
form an abelian group, is $U(1)^{N_c-1}\subset SU(N_c)$ and $U(1)^2\subset U(2)_\omega$. Indeed
$U(1)^{N_c-1}$ is the maximal torus of the gauge group, generated by the Cartan
subalgebra and one cannot hope to get any larger torus action from the gauge
group generators. Similarly $U(1)^2$ is the maximal abelian subgroup of the stability
group of the symplectic Kähler form and one cannot get anything more
from the Euclidean rotation group. For generic ADHM data, the deformed
ADHM equations have solutions only in correspondence with the poles of the
integrand. This means that $\phi_I$ and $\phi_{IJ} = \phi_I-\phi_J$ are uniquely specified in
terms of $a_l$, $\epsilon_{1,2}$ and $P_f$. The last ingredient, $\prod_{i=1}^{n}W_i[\xi](P_f)$, in the DH formula
can be related to the Chern character of the tangent bundle of $\mathcal{M}^{NS}_{K,N_c}$ at the
point $P_f$.
Another important step in the computation of the contour integral is the
classification26 of the residues in terms of Young tableaux $Y = (Y_1,\ldots,Y_{N_c})$,
such that $\sum_l|Y_l| = K$. Indeed to each $Y_l$ with $0 < K_l\le K$ boxes corresponds
a partition

Kl,1 ≥ · · · ≥ Kl,nl ≥ Kl,nl +1 = Kl,nl +2 = · · · = 0 . (278)

Then the pole corresponding to a given Y is located at

$$\phi_I^{(r,s)} = a_l+\epsilon_1(r-1)+\epsilon_2(s-1) , \tag{279}$$

with the integers r and s such that $1\le r\le n_l$ and $1\le s\le K_{l,r}$, so that each of the K boxes of Y labels one pole.
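The pole positions (279) are easy to list explicitly. The sketch below is illustrative: the function name is hypothetical, and it assumes that the boxes of each diagram are labelled by a row index r and a position s within the row, both starting from 1, so that a diagram with K boxes produces exactly K poles.

```python
def pole_positions(Y, a, eps1, eps2):
    """List the poles phi = a_l + eps1*(r-1) + eps2*(s-1) for a coloured diagram.

    Y -- tuple of partitions, one per colour l; each partition is a tuple of
         row lengths K_{l,1} >= K_{l,2} >= ... > 0 (trailing zeros omitted)
    a -- sequence of v.e.v.'s a_l, one per colour
    Box labels (r, s) are assumed to start from 1.
    """
    poles = []
    for l, rows in enumerate(Y):
        for r, row_length in enumerate(rows, start=1):
            for s in range(1, row_length + 1):
                poles.append(a[l] + eps1 * (r - 1) + eps2 * (s - 1))
    return poles
```

For example, with Y = ((2, 1), ()) the three poles all sit on the first colour, at $a_1$, $a_1+\epsilon_2$ and $a_1+\epsilon_1$.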

In more physical terms the fixed points of the action of $G\times T^2$ on the “resolved”
$\mathcal{M}^{NS}_{K,N_c}$ correspond to U(Nc) non-commutative instantons that split
into $U(1)^{N_c}$ non-commutative instantons such that the instanton charge K is
26
Young tableaux are sets of boxes. The number of columns is Nc for U (Nc ). Start-
ing from the ﬁrst column the number of boxes should not increase. Boxes in
the same column correspond to anti-symmetrised indices. Boxes in the same row
correspond to symmetrised indices.
split into $K = \sum_l K_l$ with $K_l$ in the lth subgroup. The non-commutativity
induced by the ε-deformation prevents the instantons from coalescing one on
top of the other.
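For a given Y the residue of the contour integral reads
$$R(Y) = \frac{1}{(\epsilon_1\epsilon_2)^K}\prod_{l=1}^{N_c}\prod_{r=1}^{n_l}\prod_{s=1}^{K_{l,r}}\frac{T_l\big(\epsilon_1(r-1)+\epsilon_2(s-1)\big)}{\big(\epsilon(\ell_l(r,s)+1)-\epsilon_2h_l(r,s)\big)\big(\epsilon_2h_l(r,s)-\epsilon\,\ell_l(r,s)\big)}$$
$$\times\prod_{l<m}^{1,N_c}\prod_{r=1}^{n_l}\prod_{p=1}^{n_m}\prod_{s=1}^{K_{l,r}}\prod_{t=1}^{K_{m,p}}\left[\frac{\big(a_{lm}+\epsilon_1(t-K_{m,p})-\epsilon_2(s-1)\big)\big(a_{lm}+\epsilon_1t-\epsilon_2(s-1-K_{l,r})\big)}{\big(a_{lm}+\epsilon_1(t-K_{m,p})-\epsilon_2(s-1-K_{l,r})\big)\big(a_{lm}+\epsilon_1t-\epsilon_2(s-1)\big)}\right]^{2} \tag{280}$$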

where $\epsilon = \epsilon_1+\epsilon_2$, $a_{lm} = a_l-a_m$, $\ell_l(r,s) = K_{l,r}-s$, $h_l(r,s) = K_{l,r}+K_{l,s}-r-s+1$ and
$$T_l(z) = \frac{Q(z+a_l)}{\prod_{m\neq l}(z+a_{lm})(z+\epsilon+a_{lm})} . \tag{281}$$

For future use it is convenient to define

$$S_l(z) = \frac{Q(z+a_l)}{\prod_{m\neq l}(z+a_{lm})^2} , \tag{282}$$

in terms of which the first two coefficients of the instanton expansion of the
topological partition function are given by
$$Z_1 = \frac{1}{\epsilon_1\epsilon_2}\sum_l S_l(0) , \tag{283}$$
$$Z_2 = \frac{1}{(\epsilon_1\epsilon_2)^2}\left[\frac{1}{4}\sum_l S_l(0)\,[S_l(\hbar)+S_l(-\hbar)]+\frac{1}{2}\sum_{l\neq m}\frac{S_l(0)S_m(0)\,a_{lm}^4}{(a_{lm}^2-\hbar^2)^2}\right]$$

and so on. Using the known relation between the topological partition function
and the non-perturbative contribution to the holomorphic prepotential (190)
one gets

$$\mathcal{F}_1 = \sum_l S_l(0) , \qquad
\mathcal{F}_2 = \sum_l\frac{1}{4}S_l(0)S_l''(0)+\sum_{l\neq m}\frac{S_l(0)S_m(0)}{a_{lm}^2}+O(\hbar^2) \tag{284}$$

and so on. Formulae tend soon to become unwieldy but Nekrasov has been
able to check agreement with previous results for the holomorphic prepotential
up to ﬁve instantons [81, 82]. The consistency among various independent
approaches conﬁrms the correctness of the result for the SW prepotential.
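The step from $Z_2$ to $\mathcal{F}_2$ rests on a small-ħ expansion that can be checked numerically. The sketch below is restricted to SU(2) without hypermultiplets (so Q = 1), uses hypothetical function names, and only verifies the algebraic identity: the bracket in $Z_2$ minus $\frac{1}{2}(\sum_l S_l(0))^2$ equals ħ² times the combination $\frac{1}{4}\sum_l S_lS_l''+\sum_{l\neq m}S_lS_m/a_{lm}^2$ up to higher orders. The overall factor relating this to $\mathcal{F}_2$ depends on the conventions of (190) and is not fixed here.

```python
def S(l, z, a):
    """S_l(z) of (282) with Q = 1 (no hypermultiplets)."""
    value = 1.0
    for m in range(len(a)):
        if m != l:
            value /= (z + a[l] - a[m]) ** 2
    return value

def S_second_derivative_su2(l, a):
    """For SU(2) and Q = 1, S_l(z) = (z + a_lm)^(-2), hence S_l''(0) = 6 / a_lm^4."""
    m = 1 - l
    return 6.0 / (a[l] - a[m]) ** 4

def bracket_minus_half_square(hbar, a):
    """Bracket of Z_2 minus (1/2) (sum_l S_l(0))^2."""
    n = len(a)
    first = 0.25 * sum(S(l, 0.0, a) * (S(l, hbar, a) + S(l, -hbar, a)) for l in range(n))
    second = 0.0
    for l in range(n):
        for m in range(n):
            if l != m:
                alm = a[l] - a[m]
                second += 0.5 * S(l, 0.0, a) * S(m, 0.0, a) * alm**4 / (alm**2 - hbar**2) ** 2
    half_square = 0.5 * sum(S(l, 0.0, a) for l in range(n)) ** 2
    return first + second - half_square

def leading_coefficient_su2(a):
    """(1/4) sum_l S_l S_l'' + sum_{l != m} S_l S_m / a_lm^2 for SU(2), Q = 1."""
    total = sum(0.25 * S(l, 0.0, a) * S_second_derivative_su2(l, a) for l in range(2))
    for l in range(2):
        m = 1 - l
        total += S(l, 0.0, a) * S(m, 0.0, a) / (a[l] - a[m]) ** 2
    return total
```

Dividing `bracket_minus_half_square(hbar, a)` by `hbar**2` for a small value of `hbar` approaches `leading_coefficient_su2(a)`, which is how the terms of order $1/\hbar^4$ cancel when the logarithm of the partition function is taken.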
11 (Constrained) Instantons from Open Strings

One of the most astonishing features of critical strings is the presence of a
massless vector boson in the open string spectrum and of a massless sym-
metric tensor in the closed string spectrum. The latter can be interpreted as
the graviton. The former can be interpreted as the photon in the abelian case
or as a gauge boson in the non-abelian one. Originally, a Yang–Mills group
was introduced ad hoc through Chan–Paton (CP) factors. They respect the
cyclicity of the Veneziano amplitude [109], that requires insertions of open
string vertex operators on the boundary of a disk. In modern terms the group
theory structure emerges from certain configurations of Dp-branes (D standing
for Dirichlet, p for the number of spatial dimensions of the brane), i.e.
hypersurfaces where open strings can end [110].
In the supersymmetric case, i.e. after GSO projection, the low-energy
world-volume dynamics of Nc coincident Dp-branes is governed by the di-
mensional reduction from d = 10 to d = p + 1 of the N = 1 SYM theory
with gauge group U (Nc ) [111]. In particular, p = 3 corresponds to the cele-
brated N = 4 SYM in d = 4, some (non-)perturbative properties of which will
be discussed later on. From a macroscopic viewpoint, Dp-branes are 1/2 BPS
solitons of type II or type I supergravities, in that they preserve one-half of the
supersymmetries of the parent theory. Configurations with different kinds of
Dp-branes are generically non-supersymmetric except for very special choices
of embeddings, i.e. dimensions and orientation of the various branes w.r.t.
one another. For our purposes of relating strings to instanton calculus, it is
crucial that a configuration with K D(p − 4)-branes lying within a stack of
Nc Dp-branes, i.e. such that the branes have p − 4 dimensions in common,
preserves 1/4 of the original supersymmetries. In fact this configuration is a
“bound state” at threshold [111], i.e. the mass of the bound state is the sum of
the masses of the constituent branes. Moreover, the D(p − 4)-branes have all
the right to be considered as a “gas” of instantons within the Dp-branes [83].
We will exploit the fact that D(p − 4)-branes behave as a gas of instantons
within the Dp-branes for the case p = 3 that corresponds to N = 4 SYM
and will indicate how to get instantons in gauge theories with less or no
supersymmetries. We will also discuss how to tune the parameters, i.e. the
string tension $T = 1/2\pi\alpha'$ and the string coupling $g_s$ (related to the v.e.v. of
the massless scalar dilaton) in order to decouple heavy string modes. We will
not consider the cases p ≠ 3.
In the presence of Nc D3 and K D(−1)-branes there are three sectors of
the open string spectrum. Strings that start and end on D3-branes provide
the U (Nc ) gauge fields and their superpartners. Strings that start and end on
D(−1)-branes yield U (K) non-dynamical (background) gauge fields and their
superpartners. Together they provide a subset of the (super) ADHM data,
e.g. the centre of mass $x_{CM} = \sum_i M_ix_i/\sum_i M_i$, where $M_i$ are the masses of
the brane constituents, and global SUSY parameters. Strings that start on
D3-branes and end on D(−1)-branes or that start on D(−1)-branes and end
on D3-branes provide the remaining (super) ADHM data.
Suppressing CP factors for the moment, the vertex operators for gauge
bosons, that belong to the Neveu–Schwarz (NS) sector, read
VA = AM (p)Ψ M e−ϕ eip·X , (285)
where X M and Ψ M , with M = 0, ..., 9, denote the bosonic and fermionic
string coordinates, respectively, and ϕ the superghost boson. BRST invariance
requires p2 = 0 and p · A(p) = 0, which is the form that the linearised Yang–
Mills equations for AM (p) take in the transverse gauge. Vertex operators for
gauginos, belonging to the Ramond (R) sector, read
VΛ = Λa (p)Sa e−ϕ/2 eip·X , (286)
where Sa , with a = 1, ...16, is a chiral spin field that creates a cut for Ψ M , i.e.
a line connecting two branch points of the polydromous fields Ψ M . This means
that the operator product expansion (OPE) of Ψ M (z) with Sa (w) contains half
integer powers of z − w. BRST invariance requires p2 = 0 and p · Γab Λb (p) = 0,
which is the massless Dirac equation for Λb (p).
After reduction to d = 4, relevant for D3-branes, the gauge bosons in
d = 10 yield gauge bosons Aμ as well as six real scalars Ai = φi . The d =
10 gauginos yield four Weyl gauginos $\Lambda^{A}_{\alpha}$ and their anti-particles $\bar\Lambda_{A}^{\dot\alpha}$. The
structure of the on-shell effective action can be extracted from the knowledge
of the scattering amplitudes on the disk with D3-brane boundary conditions.
In the low-energy limit, α′ → 0 with the Yang–Mills coupling $g^2 = 4\pi g_s$ fixed,
the effective action coincides with N = 4 SYM theory.
After reduction to d = 0, relevant for D(−1)-branes, also known as D-
instantons, the gauge field vertex operator VA defined above yields 10 non-
dynamical “fields”, i.e. matrices whose dynamics is governed by an action in 0
dimensions. Due to the breaking of the (Euclidean) Lorentz symmetry SO(10)
to SO(4) × SO(6) in the presence of D3-branes, it turns out to be convenient
to split the ten “gauge bosons”, aM , into four gauge bosons, aμ , and six real
“scalars”, χi . Similarly, the d = 10 gauginos, VΛ , produce four non-dynamical
Weyl “gauginos”, $\Theta^{A}_{\alpha}$, and their anti-particles, $\bar\Theta_{A}^{\dot\alpha}$. The structure of the on-shell
effective action can be extracted from the scattering amplitudes on the
disk with D(−1)-brane boundary conditions. In the low-energy limit, α′ → 0
with the zero-dimensional Yang–Mills coupling $g_0^2 = g_s/4\pi^3(\alpha')^2$ fixed, the
effective action for the low lying excitations of the D(−1)-brane reads
SD(−1) = Scub + Squart , (287)
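where
$$S_{\rm cub} = \frac{i}{g_0^2}\,{\rm Tr}_K\left\{\bar\Theta_A\sigma^{\mu}[a_\mu,\Theta^A]-\frac{1}{2}\tau_i^{AB}\,\bar\Theta_A[\chi_i,\bar\Theta_B]-\frac{1}{2}\bar\tau^{i}_{AB}\,\Theta^A[\chi_i,\Theta^B]\right\} , \tag{288}$$
with ${\rm Tr}_K$ denoting the trace in the K-dimensional representation of U(K)
and
$$S_{\rm quart} = \frac{1}{4g_0^2}\,{\rm Tr}_K\Big\{[a_\mu,a_\nu][a_\mu,a_\nu]+2[a_\mu,\chi_i][a_\mu,\chi_i]+[\chi_i,\chi_j][\chi_i,\chi_j]\Big\} . \tag{289}$$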

In what follows, it is crucial to replace $S_{\rm quart}$ with a cubic action $S'_{\rm cub}$, through
the Hubbard–Stratonovich procedure, that entails the introduction of auxiliary
fields $X_{\mu\nu}$, $Y_{\mu i}$ and $Z_{ij}$. Their vertex operators, bilinear in the fermions
Ψ’s, are not BRST invariant and a priori one should not insert them as vertices
in scattering amplitudes. Nevertheless, three-point amplitudes with one
auxiliary field insertion are consistent and yield the correct interactions, because
the BRST non-invariant part decouples. In the end one replaces $S_{\rm quart}$
with

$$S'_{\rm cub} = \frac{1}{2g_0^2}\,{\rm Tr}_K\left\{\frac{1}{2}X_{\mu\nu}X^{\mu\nu}+Y_{\mu i}Y^{\mu i}+\frac{1}{2}Z_{ij}Z^{ij}+X_{\mu\nu}[a^\mu,a^\nu]+2Y_{\mu i}[a^\mu,\chi^i]+Z_{ij}[\chi^i,\chi^j]\right\} . \tag{290}$$

We now pass to consider open strings connecting D3-branes to D(−1)-

branes. Vertex operators in this sector involve Z2 bosonic twist fields, σ(μ) ,
because one is changing the boundary conditions of the four (Euclidean)
“spacetime” coordinates from Neumann (D3) to Dirichlet (D(−1)). Twist
fields are local conformal primary operators that generate a cut in the bosonic
coordinate field X very much like spin fields, already encountered above, gen-
erate cuts in the fermionic coordinates Ψ . In the canonical superghost picture
q = −1 for bosons, the vertex operators read

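$$V_{w}^{(-1)} = w_{\dot\alpha}\,\Sigma\,C^{\dot\alpha}e^{-\varphi}\,T_{K,N_c} , \qquad V_{\bar w}^{(-1)} = \bar w_{\dot\alpha}\,\Sigma\,C^{\dot\alpha}e^{-\varphi}\,T_{N_c,K} , \tag{291}$$

where $\Sigma = \prod_\mu\sigma_{(\mu)}$ is a bosonic twist field of dimension 1/4 = 4 × 1/16 and
$C^{\dot\alpha}$ is an SO(4) spin field of dimension 1/4. $T_{N_c,K}$ denote the K × Nc Chan–Paton
“matrices”. The supersymmetry partners, in the canonical q = −1/2
picture for fermions, have vertex operators of the form

$$V_{\nu}^{(-1/2)} = \nu^{A}\,\Sigma\,C_Ae^{-\varphi/2}\,T_{K,N_c} , \qquad V_{\bar\nu}^{(-1/2)} = \bar\nu^{A}\,\Sigma\,C_Ae^{-\varphi/2}\,T_{N_c,K} , \tag{292}$$

where $C_A$ is an SO(6) spin field.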

Computing amplitudes on disks with mixed boundary conditions allows
one to extract the effective action for the “twisted” sector. Defining the K ×K
matrices
W a = (wσ a w̄)K×K , (293)
the action that governs the dynamics of the light modes (or moduli) of the
system of D(−1)-branes in the presence of D3-branes, takes the form
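$$S_{\rm twist} = \frac{2i}{g_0^2}\,{\rm Tr}_K\left\{(w_{\dot\alpha}\bar\nu^{A}+\nu^{A}\bar w_{\dot\alpha})\,\bar\Theta_A^{\dot\alpha}-X_aW^{a}+\frac{1}{2}\chi_i\tau^{i}_{AB}\,\nu^{A}\bar\nu^{B}-i\chi_i\,w_{\dot\alpha}\bar w^{\dot\alpha}\chi_i\right\} , \tag{294}$$
where we have set $X_{\mu\nu} = X_a\bar\eta^{a}_{\mu\nu}+\bar X_a\eta^{a}_{\mu\nu}$, so that the three components $\bar X_a$
actually decouple because $\eta^{a}_{\mu\nu}\bar\eta_b^{\mu\nu} = 0$.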
Combining with the previous terms and rescaling appropriately the fields,
so as to get a non-trivial field theory limit, one finds for the complete action
that governs the dynamics of the light modes (or moduli) of the system of
D(−1)-branes in the presence of D3-branes

$$S_{\rm moduli} = S_{\rm cub}+S'_{\rm cub}+S_{\rm twist} . \tag{295}$$

One can check that

$$x_0^{\mu} = {\rm Tr}_K(a^{\mu}) \quad\text{and}\quad \theta_0^{\alpha A} = {\rm Tr}_K(\Theta^{\alpha A}) \tag{296}$$

drop from the action, while varying w.r.t. $X_a$ and $\bar\Theta_A^{\dot\alpha}$ yields the super ADHM
equations. The latter consist in 3K × K real bosonic equations

$$W^{a}+i\,\bar\eta^{a}_{\mu\nu}[a^{\mu},a^{\nu}] = 0 \tag{297}$$

that, taking into account U(K) invariance, impose 4K × K constraints on
the ADHM data which implement the hyperkähler quotient, and 8K × K
fermionic constraints (for N = 4 supersymmetry)

$$w_{\dot\alpha}\bar\nu^{A}+\nu^{A}\bar w_{\dot\alpha}+[\Theta^{\alpha A},a_\mu]\,\sigma^{\mu}_{\alpha\dot\alpha} = 0 , \tag{298}$$

that reduce the number of independent fermionic zero modes. These ingredi-
ents, i.e. the constrained ADHM superdata encoded in the various open string
vertex operators and their interactions encoded in the scattering amplitudes,
are suﬃcient to reconstruct the classical super instanton proﬁle as well as to
compute instanton contributions to correlation functions. In particular
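$$A^{\rm inst}_{\mu}(p;w,\bar w) = \big\langle V_{\bar w}^{(-1)}\,U_{\mu}^{(0)}(-p)\,V_{w}^{(-1)}\big\rangle = (\bar w\sigma_aw)_{N_c\times N_c}\,\bar\eta^{a}_{\mu\nu}\,p^{\nu}e^{-ip\cdot x_0} , \tag{299}$$
where $U_{\mu}^{(0)}$ is the “amputated” vertex operator

$$U_{\mu}^{(0)}(-p) = 2i\,(\partial X_\mu-ip\cdot\Psi\,\Psi_\mu)\,e^{-ip\cdot X} \tag{300}$$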

in the q = 0 superghost picture. After Fourier transforming to x space one

obtains

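$$A^{\rm inst}_{\mu}(x;w,\bar w) = \int\frac{d^4p}{4\pi^2p^2}\,A^{\rm inst}_{\mu}(p;w,\bar w)\,e^{ip\cdot x}
= (\bar w\sigma_aw)_{N_c\times N_c}\,\bar\eta^{a}_{\mu\nu}\,\frac{(x-x_0)^{\nu}}{(x-x_0)^4} , \tag{301}$$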
(x − x0 )4
which should coincide with the asymptotic behaviour of the unconstrained

instanton at large distance in the singular gauge. Indeed, focussing on K = 1
and Nc = 2, if one sets 2ρ2 = w̄w by a global SU (2) rotation, one ﬁnds

$$A^{{\rm inst},a}_\mu(x; \rho) \approx 2\rho^2\, \bar\eta^a_{\mu\nu}\, \frac{(x - x_0)^\nu}{(x - x_0)^4} , \tag{302}$$

which is the large distance term in the expansion of the celebrated BPST solu-
tion. To make contact with (1) one clearly has to extract a factor g from (302).
Higher-order terms in $\rho^2 = \bar w w/2$ are sub-dominant at large distances and are
anyway determined by solving the YM equations with the given asymptotic
behaviour. By similar methods one can compute the classical asymptotic pro-
files of the other elementary fields (gauginos and scalars) that involve the 16
supersymmetry (8 Poincaré and 8 superconformal) parameters broken by the
D-instanton but preserved by the D3-branes (in the near horizon limit). These
profiles enter the computation of instanton contributions to amplitudes.
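
As a quick numerical illustration of the statement that (302) is the large-distance term of the BPST solution, the sketch below compares the radial profile of (302) with that of the SU(2) one-instanton in singular gauge. It assumes the standard singular-gauge form $2\rho^2/[y^2(y^2+\rho^2)]$ with $y = x - x_0$ and the coupling absorbed into the field; the function names are illustrative only.

```python
def bpst_singular_profile(r2, rho2):
    """Radial factor multiplying eta_bar^a_{mu nu} y^nu in singular gauge.

    Assumed standard form: 2 rho^2 / (y^2 (y^2 + rho^2)), with r2 = y^2.
    """
    return 2.0 * rho2 / (r2 * (r2 + rho2))

def asymptotic_profile(r2, rho2):
    """Radial factor of the leading large-distance term: 2 rho^2 / y^4."""
    return 2.0 * rho2 / r2**2

rho2 = 1.0
# The ratio equals y^2 / (y^2 + rho^2) = 1 - rho^2 / y^2 + ...
ratios = [bpst_singular_profile(r2, rho2) / asymptotic_profile(r2, rho2)
          for r2 in (1e1, 1e2, 1e4)]
```

The ratio tends to one from below, with a relative correction of order $\rho^2/y^2$, in line with the remark that higher-order terms in $\rho^2$ are sub-dominant at large distances.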
One can then embark on the computation of instanton-dominated correlators.
Denoting by $U_O(p)$ the unintegrated open string vertex operators
corresponding to the SYM fields O(−p), one schematically has to compute

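$$\langle O_1(p_1) \ldots O_n(p_n) \rangle\big|_{D\text{-inst amp}} = \int dM\, \langle U_{O_1}(-p_1) \rangle_{D(M)} \ldots \langle U_{O_n}(-p_n) \rangle_{D(M)}\, e^{-S(M)} . \tag{303}$$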

The simple “product” form of the integrand is due to the fact that the am-
plitude is dominated by disconnected disks with mixed boundary conditions
D(M) obtained by inserting the non-dynamical (super)moduli fields, which
must include at least the 16 exact fermionic zero modes. This is the most
interesting part of the string construction of instantons. We have only devoted
a few lines to it because, once the “super-instanton” profile has been generated
and the “supermoduli” have been correctly identified, one can repeat word by
word what has been pedagogically said and carefully done in the discussion
of N = 1 SYM.

11.1 N = 2 SYM from Open Strings

There are various ways to realise d = 4 N = 2 SYM in string theory. The
easiest way is to put a stack of D3-branes at an orbifold point,27 let us say
the origin of $R^6/\Gamma$, such that the holonomy group28 Γ is a discrete subgroup
of SU(2) of order r. As discussed in [93, 112], in the context of ALE
instantons in string theory, there are essentially two kinds of branes one can
consider. Regular branes are those that transform in the “regular”
representation of Γ, i.e. the (usually reducible) representation of dimension
$r = \sum_i n_i^2$ equal to the order of Γ. For instance for the cyclic group $Z_n$, the regular
n-dimensional reducible representation is simply the direct sum of the n
one-dimensional irreducible representations. The D3-branes can be moved away
from the orbifold point, where the curvature is concentrated, to the flat bulk
in such a way that the r images in the covering space actually correspond to
one physical brane in $R^6/\Gamma$. There can be other branes that transform under
smaller (irreducible) representations of Γ, e.g. any of the n − 1 non-trivial
one-dimensional irreps of $Z_n$, and are called “fractional” branes in that they
carry fractional R–R charge in $R^6/\Gamma$, corresponding to integer charge in the
covering space. Branes of this kind cannot be moved away from the orbifold
point and give rise to gauge theories with lower supersymmetry than branes
that can be moved into the flat bulk (here “moving” has exactly the same
meaning as above). In orbifolds the curvature is concentrated at the
singularity. If a (stack of) branes is displaced from the singular (orbifold)
point and placed in the bulk, the effective field theory governing the dynamics
of the light modes enjoys N = 4 SUSY.
27
Another possibility is to consider intersecting branes or branes with internal
magnetic fluxes preserving N = 2 supersymmetry. Other configurations are possible
in M-theory, e.g. by wrapping M5-branes around Riemann surfaces producing SW
curves, etc.
28
If the holonomy group Γ ⊂ SU(3) one has N = 1 SYM, when Γ = 1 (trivial
holonomy group) one has N = 4 SYM.
For definiteness, let us consider the case of Γ = Zn ⊂ SU (2), correspond-
ing to the A-series in the ADE classification of discrete subgroups of SU (2)
and thus the case of ALE instantons, see, e.g. [93]. The regular representation
is n-dimensional and reducible. One starts with n stacks of Nc -branes each.
The reduction of SUSY from N = 4 to N = 2 is achieved by truncating the
parent theory with gauge group U (nNc ) to the sector which is invariant under
the action of Zn . The natural action of Zn ⊂ SU (2) on the gauge fields and
complex scalars is given by

$$A_\mu \to A_\mu , \quad \phi_3 \to \phi_3 , \quad \phi_1 \to \omega\phi_1 , \quad \phi_2 \to \bar\omega\phi_2 , \tag{304}$$

where ω = exp(2πi/n). Furthermore, $Z_n$ is taken to act on the gauge group
$U(nN_c)$ via a discrete Wilson line

$$W_{\rm reg} = (1_{N_c\times N_c}, \omega 1_{N_c\times N_c}, \ldots, \omega^{n-1} 1_{N_c\times N_c}) , \tag{305}$$

in such a way that

$$T^a \to W T^a W^{-1} . \tag{306}$$

Taking into account the combined action of $Z_n$ in (304)–(306), one concludes
that the condition $W\Phi W^{-1} = \omega_\Phi \Phi$, where Φ collectively denotes the (bosonic)
fields and $\omega_\Phi$ the phase associated with Φ in (304), truncates the theory to
one with a vector boson and a complex scalar $\phi_3$
in the adjoint of $U(N_c)^n$ and two complex bosons $\phi_{1,2}$ in the bi-fundamental
of adjacent $U(N_c)$’s. Since we have chosen precisely $Z_n \subset SU(2)$, out of the
16 supersymmetry parameters associated with the N = 4 Poincaré
supersymmetry, 8 are invariant and generate the N = 2 Poincaré supersymmetry.

Indeed the $(2_L, 4)$ and $(2_R, 4^*)$ spinors (that arise from dimensional
reduction of the 16 of N = 1 SYM in d = 10) give rise to $(2_L, 2, 1)$ and $(2_R, 2, 1)$
spinors that are invariant under $SU(2)_H$ as well as to $(2_L, 1, 2)$ and $(2_R, 1, 2)$
spinors that are not invariant under $SU(2)_H$. The resulting N = 2 Poincaré
supersymmetry implies that each of the above bosons is accompanied by its
fermionic superpartner, which promotes the theory to N = 2 SYM coupled to
hypermultiplets in the $(N_i, N^*_{i+1}) \oplus (N^*_i, N_{i+1})$ representation. The one-loop
β-function of $SU(N_c)^n$ turns out to be zero, because $2N_c - 2N_c = 0$, while
the $U(1) \subset U(N_c)^n$ are IR free (as for any abelian gauge theory coupled to
charged matter) and thus the U(1) vector multiplets decouple at low
energies. One is dealing with an exact N = 2 superconformal theory in the IR.
In fact, one can turn on v.e.v.’s of the adjoint scalar (Coulomb branch) or of
the bi-fundamentals (Higgs branch). The former generically breaks the group
to $U(1)^{nN_c}$, the latter to $U(N_c)_{\rm diag}$ realising the expected simultaneous
motion of the n stacks of $N_c$ branes away from the fixed point into the bulk,
where supersymmetry is enhanced to N = 4, since the hypermultiplets in
the bi-fundamentals produce the extra adjoint of U (Nc )diag needed to pro-
mote a N = 2 vector multiplet to a N = 4 vector multiplet. The diagonal
U (1) ⊂ U (Nc )diag is free and corresponds to the centre of mass motion of the
bound state of the various stacks of D-branes.
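
The orbifold truncation described above can be made concrete with a small numerical sketch. It assumes that the Wilson line acts as $W = {\rm diag}(1, \omega, \ldots, \omega^{n-1}) \otimes 1_{N_c\times N_c}$ and that a field carrying phase $\omega^q$ in (304) survives if $W\Phi W^{-1} = \omega^q\Phi$; the helper names (`wilson_line`, `project`, `surviving_blocks`) and the values n = 3, $N_c$ = 2 are illustrative choices.

```python
import numpy as np

def wilson_line(n, nc):
    """W = diag(1, omega, ..., omega^(n-1)) tensored with the nc x nc identity."""
    omega = np.exp(2j * np.pi / n)
    return np.kron(np.diag(omega ** np.arange(n)), np.eye(nc))

def project(phi, n, nc, charge):
    """Keep the part of phi obeying W phi W^{-1} = omega^charge phi (Z_n average)."""
    omega = np.exp(2j * np.pi / n)
    w = wilson_line(n, nc)
    w_inv = w.conj().T                      # W is diagonal and unitary
    out = np.zeros_like(phi)
    wm = np.eye(n * nc, dtype=complex)      # running power W^m
    wm_inv = np.eye(n * nc, dtype=complex)  # running power W^{-m}
    for m in range(n):
        out += omega ** (-m * charge) * (wm @ phi @ wm_inv)
        wm, wm_inv = wm @ w, wm_inv @ w_inv
    return out / n

def surviving_blocks(phi, n, nc, tol=1e-12):
    """Positions (j, k) of the nc x nc blocks of phi that are non-zero."""
    return [(j, k) for j in range(n) for k in range(n)
            if np.abs(phi[j*nc:(j+1)*nc, k*nc:(k+1)*nc]).max() > tol]

n, nc = 3, 2
rng = np.random.default_rng(0)
phi = rng.normal(size=(n*nc, n*nc)) + 1j * rng.normal(size=(n*nc, n*nc))

adjoint_blocks = surviving_blocks(project(phi, n, nc, 0), n, nc)  # A_mu, phi_3
phi1_blocks = surviving_blocks(project(phi, n, nc, 1), n, nc)     # phi_1
phi2_blocks = surviving_blocks(project(phi, n, nc, -1), n, nc)    # phi_2
```

Only diagonal blocks survive for the invariant fields (the adjoint of $U(N_c)^n$), while $\phi_1$ and $\phi_2$ survive only in blocks connecting adjacent nodes, i.e. the bi-fundamentals. Each node thus sees two bi-fundamental hypermultiplets, each counting as $N_c$ fundamental flavours, which is the origin of the cancellation $2N_c - 2N_c = 0$ quoted in the text.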
If instead of choosing the “regular” embedding of $Z_n$ in $U(nN_c)$ one takes
another representation for W, one gets non-superconformal theories that live
on fractional branes. In the extreme case where $W = W_k$ with

$$W_k = (\omega^k 1_{M\times M}) , \tag{307}$$

and $\omega = e^{2\pi i/n}$ for any k = 1, ..., n − 1 one gets pure N = 2 SYM with
gauge group U(M) where M is not necessarily a multiple of n, i.e.
$M \neq nN_c$ generically. Fractional branes are stuck at the fixed point, conventionally
put at the origin of R6 /Zn and cannot move away from it. Referring to our
previous notation, out of the six real φi ’s only two (one complex), φ and
φ† , survive the orbifold projection. The precise linear combination of the six
original real scalar fields is determined by the choice of the embedding of
SU (2) into the rotation group of R6 , SO(6) ≈ SU (4). Similarly, out of the
four gaugini only two survive the projection, i.e. the ones that are singlets
of SU (2) ⊃ Γ and transform as a charged doublet under the SU (2) × U (1)
subgroup of SO(6) ≈ SU (4) commuting with Γ . The complexified gauge
coupling of the surviving N = 2 SYM theory with gauge group U (M ) is
determined by the closed string background, i.e. the v.e.v.’s of the so-called
blowing up modes of the orbifold fixed point. The blowing up modes are
nothing but twist fields for the closed string coordinates; this means that the
OPE of the bosonic coordinates X(z, z̄) with the bosonic twist fields σ(w, w̄)
contains fractional powers. We have already encountered twist fields for the
open string coordinates. Since closed string vertex operators are given by
combinations of open string vertex operators for the left- and right-moving
excitations of the closed string, blowing up modes are described by products of
twist fields for the left and right movers, schematically $\sigma(z, \bar z) = \sigma_L(z)\sigma_R(\bar z)$.

Indeed one may regard fractional D3-branes as D5-branes wrapped around
homologically non-trivial cycles, sometimes called “exceptional divisors”, that
are complex varieties of codimension one29 in $R^4/\Gamma \equiv C^2/\Gamma$, that shrink to
zero size, i.e. to zero area in one’s preferred units, at the fixed point in the
orbifold limit, i.e. prior to resolution of the singularity. For a $Z_n$ singularity
there are n − 1 two-spheres that intersect according to the Cartan matrix
of $A_{n-1}$. The complexified coupling is given by the “period integrals” of the
2-form B2 + iC2 , with B2 belonging to the Neveu–Schwarz–Neveu–Schwarz
(NS–NS) sector and C2 belonging to the R–R sector. For regular branes, the
gauge coupling of the diagonal subgroup, the one surviving when the branes
move to the bulk, is given by Φ + iC0 , where Φ is the NS–NS dilaton and C0 is
the R–R scalar “axion”. Indeed one can show that the corresponding tadpoles
precisely match the one-loop running of the couplings [113, 114]!
Essentially the same analysis applies to open strings with both ends on
D(−1)-branes (D-instantons). Taking K fractional D-instantons with

$$W_l^{D\text{-inst}} = (\omega^l 1_{K\times K}) , \tag{308}$$

produces the truncation of the world-volume low-energy theory to pure
(zero-dimensional!) N = 2 SYM with gauge group U(K). The surviving adjoint
scalars will be denoted by χ and χ†. The two associated non-dynamical
fermions will be denoted by $\Theta^{\alpha r}$ with r = 1, 2 and their conjugates by $\bar\Theta^{\dot\alpha}_r$.
Setting

$$a^\mu_{K\times K} = x_0^\mu 1_{K\times K} + y^\mu_g T^g_{K\times K} , \tag{309}$$

where $T^g_{K\times K}$ are the generators of SU(K), and

$$\Theta^{r\alpha}_{K\times K} = \theta_0^{r\alpha} 1_{K\times K} + \zeta_g^{r\alpha} T^g_{K\times K} \tag{310}$$

one can regard $x_0^\mu$ and $\theta_0^{r\alpha}$ as coordinates in N = 2 superspace.

Open strings connecting $N_c$ fractional D3-branes to K fractional
D-instantons belong to the bi-fundamental $(N_c, \bar K)$ representation of
$U(N_c) \times U(K)$. The bosonic modes $w_{\dot\alpha}$ and $\bar w_{\dot\alpha}$ are as in the N = 4 case, while the
fermionic modes are halved and will be consistently denoted by $\nu^r$ and $\bar\nu^r$.
In the double scaling limit $\alpha' \to 0$, $g_0 \to \infty$ ($g_0$ has mass dimension
+2) with $(4\pi^2\alpha' g_0)^2 = 4\pi g_s = g^2$ fixed, the non-dynamical moduli fields are
governed by the action

$$S^{N=2}_{\rm moduli} = S^{N=2}_{\rm bose} + S^{N=2}_{\rm fermi} + S^{N=2}_{\rm ADHM} , \tag{311}$$

where
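$$\begin{aligned}
S^{N=2}_{\rm bose} &= \mathrm{Tr}_K\left( -2[\chi^\dagger, a_\mu][\chi, a^\mu] + \chi w_{\dot\alpha}\bar w^{\dot\alpha}\chi^\dagger + \chi^\dagger w_{\dot\alpha}\bar w^{\dot\alpha}\chi \right) \\
S^{N=2}_{\rm fermi} &= i\frac{\sqrt{2}}{2}\, \varepsilon_{rs}\, \mathrm{Tr}_K\left( \nu^r\bar\nu^s\chi^\dagger - \Theta^r[\chi, \Theta^s] \right) \\
S^{N=2}_{\rm ADHM} &= -i\, \mathrm{Tr}_K\left[ \bar\Theta^{\dot\alpha}_r\left( w_{\dot\alpha}\bar\nu^r + \nu^r\bar w_{\dot\alpha} + [\Theta^{\alpha r}, a_\mu]\sigma^\mu_{\alpha\dot\alpha} \right) - X_a\left( W^a + i\bar\eta^a_{\mu\nu}[a^\mu, a^\nu] \right) \right] .
\end{aligned} \tag{312}$$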
29
Recall that Γ ⊂ SU (2) only acts on C2 ≡ R4 ⊂ C3 ≡ R6 .

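Varying the action w.r.t. $X_a$ and $\bar\Theta^{\dot\alpha}_r$ yields the N = 2 super ADHM
constraints

$$W^a + i\bar\eta^a_{\mu\nu}[a^\mu, a^\nu] = 0 \tag{313}$$

and

$$w_{\dot\alpha}\bar\nu^r + \nu^r\bar w_{\dot\alpha} + [\Theta^{\alpha r}, a_\mu]\sigma^\mu_{\alpha\dot\alpha} = 0 . \tag{314}$$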
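As before, one can perform a Hubbard–Stratonovich transformation and
replace the quartic couplings in $S^{N=2}_{\rm bose}$ with trilinear couplings to auxiliary fields
$Y^\mu_{K\times K}$ and $U^{\dot\alpha}_{N_c\times K}$ and $\bar U^{\dot\alpha}_{K\times N_c}$ and their conjugates. As a result one gets

$$\begin{aligned}
S^{N=2}_{\rm bose} = \mathrm{Tr}_K\Big( & 2Y^\mu Y^\dagger_\mu - 2Y_\mu[\chi^\dagger, a^\mu] - 2Y^\dagger_\mu[\chi, a^\mu] \\
& + U^\dagger_{\dot\alpha}U^{\dot\alpha} + \bar U^\dagger_{\dot\alpha}\bar U^{\dot\alpha} + \bar U^\dagger_{\dot\alpha}\bar w^{\dot\alpha}\chi + \bar U_{\dot\alpha}\bar w^{\dot\alpha}\chi^\dagger + \chi w^{\dot\alpha}U^\dagger_{\dot\alpha} + \chi^\dagger w^{\dot\alpha}U_{\dot\alpha} \Big) .
\end{aligned} \tag{315}$$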
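
The way the auxiliary field $Y^\mu$ reproduces the quartic commutator term can be checked numerically. The sketch below is restricted to the Y sector and treats $Y_\mu$ and $Y^\dagger_\mu$ as independent variables (an assumption made for the purpose of the saddle-point evaluation); the matrices are random test data and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 3

def comm(x, y):
    return x @ y - y @ x

def random_matrix(k):
    return rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k))

chi = random_matrix(K)
chi_dag = chi.conj().T
a = []
for _ in range(4):
    m = random_matrix(K)
    a.append(m + m.conj().T)  # Hermitian position matrices a_mu

def s_aux(y, y_dag):
    """Y-dependent part of the auxiliary-field action."""
    return sum(np.trace(2 * y[mu] @ y_dag[mu]
                        - 2 * y[mu] @ comm(chi_dag, a[mu])
                        - 2 * y_dag[mu] @ comm(chi, a[mu]))
               for mu in range(4))

def s_quartic():
    """Quartic term -2 Tr [chi^dagger, a_mu][chi, a^mu]."""
    return sum(np.trace(-2 * comm(chi_dag, a[mu]) @ comm(chi, a[mu]))
               for mu in range(4))

# Stationary point: Y_mu = [chi, a_mu], Y^dagger_mu = [chi^dagger, a_mu].
y_saddle = [comm(chi, a[mu]) for mu in range(4)]
y_dag_saddle = [comm(chi_dag, a[mu]) for mu in range(4)]
```

Evaluating `s_aux` at the stationary point returns `s_quartic()`, and, because the action is linear in Y once $Y^\dagger$ sits at its stationary value, shifting Y leaves the value unchanged. The U and $\bar U$ sectors can be treated along the same lines.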
Computing amplitudes with insertions of the scalar field vertex operator

$$V^{(-1)}_\phi(p) = \phi(p)\, e^{-\varphi}\, e^{ip\cdot X} \tag{316}$$

at p = 0, which corresponds to turning on a v.e.v. for φ in the Cartan subalgebra
of $U(N_c)$, one can construct the relevant action for the moduli fields. By the
invariance of the scattering amplitudes under the exchange of the dynamical
field φ with the non-dynamical field χ, the effect of the presence of a constant
φ in the computation of instanton effects simply amounts to the replacements

$$\chi_{K\times K} \otimes 1_{N_c\times N_c} \to \chi_{K\times K} \otimes 1_{N_c\times N_c} - 1_{K\times K} \otimes \phi_{N_c\times N_c} \tag{317}$$

and

$$\chi^\dagger_{K\times K} \otimes 1_{N_c\times N_c} \to \chi^\dagger_{K\times K} \otimes 1_{N_c\times N_c} - 1_{K\times K} \otimes \phi^\dagger_{N_c\times N_c} . \tag{318}$$

It is crucial to observe at this point that φ and φ† do not enter the fermionic
action in the same way; indeed the additional terms in the fermionic action of
the N = 2 supermoduli read

$$\Delta S^{N=2}_{\rm fermi} = i\frac{\sqrt{2}}{2}\, \varepsilon_{rs}\, \mathrm{Tr}_K\left( \nu^r\phi^\dagger\bar\nu^s \right) . \tag{319}$$

As a consequence all $2K(N_c - 2)$ zero modes associated with $\bar\nu^r$ and $\nu^s$ are
lifted.

11.2 SW Prepotential from String Instantons

Let us now specialise to the case of an SU(2) gauge group. We are ready
to accomplish the task of checking the SW prepotential, $\mathcal{F}_{SW}$, by means of
Veneziano’s open string theory! The Wilsonian effective action for the light
neutral modes is

$$S_{\rm eff}[\Phi] = \int d^4x\, d^4\theta\, \mathcal{F}(\Phi) + \text{h.c.} , \tag{320}$$

where $\Phi = \Phi_3\sigma^3/2$ is the N = 2 vector superfield,

$$\Phi(x, \theta) = \phi(x) + \theta^{r\alpha}\lambda_{r\alpha}(x) + \frac{1}{2}\theta^{r\alpha}\theta^{s\beta}\left( \varepsilon_{rs}\sigma^{\mu\nu}_{\alpha\beta}F_{\mu\nu}(x) + \sigma^a_{rs}\varepsilon_{\alpha\beta}X^a(x) \right) + \cdots . \tag{321}$$
In (321) · · · stands for higher-order terms in θ’s that can be expressed in
terms of the lowest components. We hope the reader does not get confused by
the notation. In this section, Φ denotes an N = 2 chiral superfield (previously
denoted by A), φ is its lowest component and v denotes the v.e.v. of φ, while
a or more precisely aμ are the non-dynamical moduli fields.
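The contribution of the K-instanton sector to $S_{\rm eff}[\Phi]$ is given by

$$S^{(K)}_{\rm eff}[\Phi] = \int_{\mathcal{M}_K} d\mu_K\, e^{-S_K(\Phi, \mu)} , \tag{322}$$

where μ collectively denotes the supermoduli parametrising $\mathcal{M}_K$. Separating
the collective coordinates $x_0^\mu$ and $\theta_0^{r\alpha}$ and, dropping the subscript 0 for
simplicity, one gets

$$S^{(K)}_{\rm eff}[\Phi] = \int d^4x\, d^4\theta \int_{\hat{\mathcal{M}}_K} d\hat\mu_K\, e^{-S_K(\Phi, \hat\mu)} , \tag{323}$$

so that comparison with the formula (320) yields

$$\mathcal{F}_K(\Phi) = \int_{\hat{\mathcal{M}}_K} d\hat\mu_K\, e^{-S_K(\Phi, \hat\mu)} , \tag{324}$$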

where M̂K denotes the supermoduli space of “centred” instantons. M̂K de-
scribes configurations with fixed position of the centre of mass of the various
instantons, which in turn are parameterised by μ̂’s, i.e. by the collective co-
ordinates that do not move the position of the centre of mass. Since Φ(x, θ)
may be taken to be a constant (slowly varying) superfield Φ(x, θ) = φ inde-
pendent of the μ̂’s, one can compute FK (φ) and then promote the argument φ
to a chiral superfield by holomorphy in the low-energy approximation. Indeed
higher (super)derivatives would contribute to the 1PI effective action. Resum-
ming the infinite number of such contributions should reveal the spectrum of
stable particles (BPS monopoles and dyons), expected on the basis of the SW
analysis. The study of this feature is beyond the scope of the present analysis.
Following this strategy till the end, one finds

$$\mathcal{F}_K(\phi) = C_K\, \phi^2\, \frac{\Lambda^{4K}}{\phi^{4K}} , \tag{325}$$

where $\Lambda = v \exp\left( -\frac{8\pi^2}{g^2(v)\, b_1} \right)$ is the RGI scale, dynamically generated by
dimensional transmutation, and v is an arbitrary scale that can be taken to
coincide with the v.e.v. of φ. The coefficients of the φ expansion of $\mathcal{F}$ can be
computed by setting φ to any convenient value, including φ = 0, and are given
by

$$C_K = \int_{\hat{\mathcal{M}}_K} d\hat\mu_K\, e^{-S_K(\phi=0, \hat\mu)} . \tag{326}$$

The coefficients $C_K$ are also known as Gromov–Witten invariants. They have
been explicitly computed for K = 1 and K = 2 by performing the integral over
M̂K and shown to match with the SW proposal and to reproduce Matone’s
relations, as previously reviewed.
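
To see how the K-instanton weight is encoded in (325), one can evaluate $(\Lambda/v)^{4K}$ numerically. The sketch below assumes the one-loop coefficient $b_1 = 4$ appropriate to pure N = 2 SYM with gauge group SU(2) (this value is not stated in the passage above and is used here as an assumption), and sets $C_K = 1$ as a placeholder rather than a computed coefficient.

```python
import math

def rg_scale(v, g, b1):
    """Lambda = v exp(-8 pi^2 / (g^2 b1)), with g the coupling at the scale v."""
    return v * math.exp(-8.0 * math.pi**2 / (g**2 * b1))

def prepotential_term(c_k, phi, lam, k):
    """K-instanton term C_K phi^2 (Lambda / phi)^(4K) of (325)."""
    return c_k * phi**2 * (lam / phi) ** (4 * k)

def instanton_weight(g, k):
    """Classical K-instanton suppression factor exp(-8 pi^2 K / g^2)."""
    return math.exp(-8.0 * math.pi**2 * k / g**2)

v, g, k = 3.0, 1.5, 2
b1 = 4  # assumed one-loop coefficient for pure N = 2 SYM with SU(2)
term = prepotential_term(1.0, v, rg_scale(v, g, b1), k)  # C_K = 1 placeholder
expected = v**2 * instanton_weight(g, k)
```

With $b_1 = 4$ the two quantities coincide, i.e. each power of $\Lambda^4$ carries exactly one unit of the instanton factor $e^{-8\pi^2/g^2}$, which is why the expansion in $\Lambda^{4K}$ is an expansion in instanton number.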
In fact, as previously shown in Sect. 10.2, they can all be computed by
exploiting powerful localisation properties of the integral over the (super)moduli
space. Nekrasov and collaborators [79, 80, 81, 82] have been able to
localise the integrals over instanton moduli spaces by turning on the so-called
Ω-background, characterised by a constant self-dual anti-symmetric tensor
$\Omega_{\mu\nu} = \varepsilon_a\eta^a_{\mu\nu}$. From the string vantage point, the Ω-background amounts
to a constant R–R graviphoton field strength in the (Euclidean) spacetime
directions $f_{\mu\nu} = f_a\eta^a_{\mu\nu}$. The precise numerical factor is 1/2, so that $f_{\mu\nu} = \frac{1}{2}\Omega_{\mu\nu}$
or $f_a = \frac{1}{2}\varepsilon_a$. In the presence of such a background, the D3-brane action gets
modified to30

$$S_{D3} = S_{SYM} - \int d^4x \left[ 2ig f_{\mu\nu}\, \mathrm{Tr}_{N_c}(\bar\phi F^{\mu\nu}) + g^2 f_{\mu\nu}f^{\mu\nu}\, \mathrm{Tr}_{N_c}(\bar\phi^2) \right] , \tag{327}$$

where $\mathrm{Tr}_{N_c}$ denotes the trace over the $N_c$-dimensional representation of the
$U(N_c)$ Chan–Paton group associated with the D3-branes. The modification of
the effective action of the D3-branes after switching on the Ω-background can
be derived by the procedure of computing open string scattering amplitudes
on the disk with an insertion of a closed string vertex operator for the R–R
graviphoton. In the canonical (−1/2, −1/2) superghost picture the vertex
operator for the R–R graviphoton reads

$$V_f = f_{\mu\nu}\, S^\alpha (\sigma^{\mu\nu})_{\alpha\beta} \tilde S^\beta\, \Sigma \tilde\Sigma^\dagger\, e^{-(\varphi + \tilde\varphi)/2} . \tag{328}$$

We observe that the only relevant amplitude is

$$\langle V^{(-1)}_A\, V^{(0)}_{\bar\phi}\, V^{(-1/2,-1/2)}_f \rangle . \tag{329}$$

In fact all other amplitudes, including the one with $V^{(0)}_{\bar\phi}$ replaced by $V^{(0)}_\phi$,
either vanish or are irrelevant in the low energy limit, i.e. produce higher
derivative terms. The combined effect of the Ω-background and the non-vanishing
v.e.v. for φ is to replace the standard ADHM matrix $\Delta_{(N_c+2K)\times 2K}$ with

$$\Delta_{(N_c+2K)\times 2K} \to \Delta_{(N_c+2K)\times 2K} + iA_{(N_c+2K)\times 2K}(v, \varepsilon) + \cdots . \tag{330}$$

30
In order to expose the relative strength of the various terms in the action, we
henceforth switch to the perturbative normalisation, whereby we drop the overall
1/g 2 .

It is important to note that the upper block of $A_{(N_c+2K)\times 2K}(v, \varepsilon)$ is given by

$$A^{\rm up}_{N_c\times 2K}(\phi, \varepsilon) = \phi^u{}_v\, w^v_{i\dot\alpha} - w^u_{j\dot\alpha}\, \chi^j{}_i , \tag{331}$$

where $u, v = 1, \ldots, N_c$, $i, j = 1, \ldots, K$, and the lower block is

$$A^{\rm low}_{2K\times 2K}(v, \varepsilon) = [\chi, a_\mu]\sigma^\mu_{\alpha\dot\alpha} + \varepsilon_a (\sigma^a)_\alpha{}^\beta\, a_\mu \sigma^\mu_{\beta\dot\alpha} . \tag{332}$$

As in the standard (commutative, in the absence of graviphoton background,
i.e. for Ω = 0) case, the gauge field can be written in the convenient form
$gA_\mu = U^\dagger\partial_\mu U$, which, as before, is not a pure gauge because U is not an
$N_c \times N_c$ matrix.
The fermionic zero modes can be parametrised as

$$g^{1/2}\lambda^r_\alpha = U^\dagger\left( L^r f(w, x)\bar b_\alpha + b_\alpha f(w, x)\bar L^r \right) U , \tag{333}$$

where $L^r$, $\bar L^r$ are K × K spinor matrices satisfying the super ADHM
constraints and b, b̄ are $(N_c + 2K) \times 2K$ constant spinor matrices with vanishing
upper block and diagonal lower block. Moreover, one has

$$f_{K\times K}(w, x) = \left( \bar w_{\dot\alpha}w^{\dot\alpha} + (a - x1)^2 \right)^{-1} \tag{334}$$

as a consequence of the ADHM constraints. Finally, the scalar field profile in
the presence of the non-commutative Ω background is given by

$$\phi = i\sqrt{2}\, \epsilon_{rs}\, U^\dagger L^r f(w, x)\bar L^s U + U^\dagger J U \tag{335}$$

and correctly satisfies

$$D^2\phi = -i\sqrt{2}\, \epsilon_{rs}\, \lambda^{\alpha r}\lambda^s_\alpha - ig\,\Omega_{\mu\nu}F^{\mu\nu} \tag{336}$$

to lowest order in g with $F_{\mu\nu} = \tilde F_{\mu\nu}$. We note that J is an $(N_c + 2K) \times (N_c + 2K)$
block diagonal matrix with the upper block $J^{\rm up}_{N_c\times N_c} = v_{N_c\times N_c}$,
where $v_{N_c\times N_c}$ represents the v.e.v. of the dynamical scalar fields $\phi_{N_c\times N_c}$, and
the lower block $J^{\rm low}_{2K\times 2K} = \chi_{K\times K} \otimes 1 + 1 \otimes \varepsilon_a\sigma^a$, where $\chi_{K\times K}$ represents the
non-dynamical scalar fields (moduli).
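
The block structure of J can be spelled out explicitly. The following sketch assembles J for given v, χ and $\varepsilon_a$; the ordering of the tensor factors in the lower block (instanton index first, spinor index second) and the function name `build_j` are assumptions made for illustration only.

```python
import numpy as np

# Pauli matrices sigma^a, a = 1, 2, 3.
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def build_j(v, chi, eps):
    """Block-diagonal J with J_up = v and J_low = chi (x) 1 + 1 (x) eps_a sigma^a."""
    nc, k = v.shape[0], chi.shape[0]
    eps_sigma = np.einsum("a,aij->ij", np.asarray(eps, dtype=complex), PAULI)
    lower = np.kron(chi, np.eye(2)) + np.kron(np.eye(k), eps_sigma)
    j = np.zeros((nc + 2 * k, nc + 2 * k), dtype=complex)
    j[:nc, :nc] = v       # v.e.v. of the dynamical scalar
    j[nc:, nc:] = lower   # moduli chi plus the Omega-background deformation
    return j

# Example: SU(2) v.e.v. in the Cartan direction, K = 1, eps along the third axis.
v = 0.5 * np.diag([1.0, -1.0]).astype(complex)
chi = np.zeros((1, 1), dtype=complex)
j_matrix = build_j(v, chi, eps=(0.0, 0.0, 0.3))
```

For K = 1 and vanishing χ the lower block reduces to $\varepsilon_a\sigma^a$, so the deformation parameters enter J on the same footing as the v.e.v. v in the upper block.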
In principle one can analyse by similar means gauge theories with lower
(N = 1) or no supersymmetry. This analysis is only in its infancy and goes
beyond the scope of this review. It is the subject of very intense research
activity at present, see e.g. [115]. We hope we have provided the interested and
proficient reader with the necessary tools to enter the arena of this fascinating
endeavour.

12 Instanton Effects in N = 4 SYM

In the following sections we shall review the calculation of instanton effects in
N = 4 SYM [70]. This is the maximally extended (rigid) supersymmetric
theory in four dimensions and possesses a number of remarkable properties. It is
ultraviolet finite [116] and provides an example of four-dimensional quantum
field theory with exact conformal invariance at the quantum level. The theory
is also believed to be invariant under a strong↔weak coupling duality, known
as S-duality, which generalises the Montonen–Olive electric-magnetic dual-
ity [73, 102]. Originally, the interest in the theory was driven by the discovery
of its finiteness properties. In recent years it has been extensively studied in
the context of the AdS/CFT duality [88, 89, 90], which relates it to type IIB
superstring theory in an AdS5 × S 5 background.
As a conformal field theory N = 4 SYM has rather different physical
properties from those of the N = 1 and N = 2 theories previously discussed.
However, instanton effects play a decisive role also here and the methods of
supersymmetric instanton calculus described in the previous sections have
recently been extensively applied to the study of non-perturbative aspects of
the model. In particular, instantons are expected to be instrumental to the
realisation of S-duality in N = 4 SYM and the study of their contributions
has led to some of the most striking tests of the validity of the AdS/CFT
correspondence.
After a brief overview of the structure of N = 4 SYM and its notable
properties, in Sects. 14–16 we shall describe the calculation of instanton con-
tributions to correlation functions, essentially following the method that in
Sect. 2 was called the SCI method. In Sects. 17 and 18 we shall then discuss
the rôle of instantons in the AdS/CFT duality.

13 N = 4 Supersymmetric Yang–Mills Theory

The N = 4 supersymmetric Yang–Mills theory was originally constructed
in [70] as the dimensional reduction of the 10-dimensional N = 1
supersymmetric Yang–Mills theory on a six torus. The field content of the theory
consists of a gauge field, $A_\mu$, four Weyl fermions, $\lambda^A_\alpha$ (A = 1, . . . , 4) and six real
scalars, $\varphi^i$ (i = 1, . . . , 6). In terms of N = 1 multiplets these fields combine
into one vector and three chiral multiplets. All the fields are in the adjoint
representation of the gauge group, which in most of the following will be taken to
be $SU(N_c)$. The global supergroup of symmetries of the theory is PSU(2, 2|4),
whose maximal bosonic subgroup is SO(2, 4) × SU(4). The SO(2, 4) factor
is the four-dimensional conformal group and SU(4) is the R-symmetry group
under which the fermions transform in the 4 (and their conjugates in the 4̄),
the scalars in the 6 and the gauge field is a singlet. It is often convenient to
label the scalars by an anti-symmetric pair of indices in the 4, as $\varphi^{AB}$, subject
to the reality constraint

$$\bar\varphi_{AB} = (\varphi^{AB})^\dagger = \frac{1}{2}\varepsilon_{ABCD}\,\varphi^{CD} . \tag{337}$$
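
The two parametrisations are related by

$$\varphi^i = \frac{1}{\sqrt{2}}\Sigma^i_{AB}\,\varphi^{AB} , \qquad \varphi^{AB} = \frac{1}{\sqrt{8}}\bar\Sigma^{AB}_i\,\varphi^i , \tag{338}$$

where $\Sigma^i_{AB}$ ($\bar\Sigma^{AB}_i$) are Clebsch–Gordan coefficients projecting the product of
two 4’s (two 4̄’s) onto the 6. They can be expressed in terms of the ’t Hooft
symbols [2] as

$$\begin{aligned}
\Sigma^i_{AB} &= (\Sigma^a_{AB}, \Sigma^{a+3}_{AB}) = (\eta^a_{AB}, i\bar\eta^a_{AB}) , \\
\bar\Sigma^{AB}_i &= (\bar\Sigma^{AB}_a, \bar\Sigma^{AB}_{a+3}) = (-\eta^{AB}_a, i\bar\eta^{AB}_a) , \qquad i = 1, \ldots, 6 , \quad a = 1, 2, 3 .
\end{aligned} \tag{339}$$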

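The elementary fields are conveniently represented as colour matrices and the
classical action of the theory, which is uniquely determined (up to the choice
of gauge group) by N = 4 supersymmetry, can be written as

$$\begin{aligned}
S = \int d^4x\, \mathrm{Tr}\Big\{ & \frac{1}{2}F_{\mu\nu}F^{\mu\nu} + 2D_\mu\varphi^{AB}D^\mu\bar\varphi_{AB} - 2i\lambda^{\alpha A}\slashed{D}_{\alpha\dot\alpha}\bar\lambda^{\dot\alpha}_A \\
& - 2g\lambda^{\alpha A}[\lambda^B_\alpha, \bar\varphi_{AB}] - 2g\bar\lambda_{\dot\alpha A}[\bar\lambda^{\dot\alpha}_B, \varphi^{AB}] - 2g^2[\varphi^{AB}, \varphi^{CD}][\bar\varphi_{AB}, \bar\varphi_{CD}] \Big\} .
\end{aligned} \tag{340}$$

The action (340) is invariant under the supersymmetry transformations

$$\begin{aligned}
\delta\varphi^{AB} &= \frac{1}{2}\left( \lambda^{\alpha A}\epsilon^B_\alpha - \lambda^{\alpha B}\epsilon^A_\alpha \right) + \frac{1}{2}\varepsilon^{ABCD}\,\bar\epsilon_{\dot\alpha C}\bar\lambda^{\dot\alpha}_D \\
\delta\lambda^A_\alpha &= -\frac{1}{2}F_{\mu\nu}\,\sigma^{\mu\nu}{}_\alpha{}^\beta\epsilon^A_\beta + 4i\,\slashed{D}_{\alpha\dot\alpha}\varphi^{AB}\bar\epsilon^{\dot\alpha}_B - 8g[\bar\varphi_{BC}, \varphi^{CA}]\epsilon^B_\alpha \\
\delta A_\mu &= -i\lambda^{\alpha A}\sigma^\mu_{\alpha\dot\alpha}\bar\epsilon^{\dot\alpha}_A - i\epsilon^{\alpha A}\sigma^\mu_{\alpha\dot\alpha}\bar\lambda^{\dot\alpha}_A .
\end{aligned} \tag{341}$$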

Given the gauge group, the action (340) contains a single parameter, the cou-
pling constant g.31 The absence of divergences in the theory implies that the
corresponding β-function vanishes. As discussed in Appendix C, it is possible
to add to the action a ϑ-term.
31
In the following we will maintain the notation used in the previous sections,
denoting the Yang–Mills coupling by g. The string coupling constant will be
denoted by $g_s$.

The N = 4 SYM theory has a vacuum manifold parametrised by the
v.e.v.’s of the six scalars which make the potential vanish. The resulting
moduli space turns out to be

$$M = R^{6r}/S_r , \tag{342}$$

where r is the rank of the gauge group and $S_r$ is the group of permutations of
r elements. At a generic point of the moduli space, the theory is in a Coulomb
phase and the gauge group is broken down to $U(1)^r$. In this phase and in
the presence of a ϑ-term the theory contains BPS-saturated monopole and
dyon states characterised by integer quantum numbers, $q_e$ and $q_m$, associated
with their electric and magnetic charges [100, 101]. The conjectured S-duality
of N = 4 SYM requires that the spectrum of such states be invariant under
the action of SL(2, Z) transformations acting projectively on the complexified
coupling, τ (defined in (141)),

$$\tau \to \frac{a\tau + b}{c\tau + d} , \qquad a, b, c, d \in Z , \quad ad - bc = 1 , \tag{343}$$

while simultaneously rotating the electric and magnetic quantum numbers
according to

$$\begin{pmatrix} q_e \\ q_m \end{pmatrix} \to \begin{pmatrix} -a & b \\ c & -d \end{pmatrix} \begin{pmatrix} q_e \\ q_m \end{pmatrix} . \tag{344}$$

Significant evidence in support of this conjecture has been obtained using
semi-classical methods [117].
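
The projective action (343) is easy to experiment with numerically. The sketch below assumes the standard definition $\tau = \vartheta/2\pi + 4\pi i/g^2$ for the complexified coupling (the definition (141) is not reproduced in this part of the text); the helper names and the sample values of g and ϑ are illustrative.

```python
import math

def coupling_tau(g, theta):
    """Complexified coupling, assuming tau = theta / (2 pi) + 4 pi i / g^2."""
    return theta / (2.0 * math.pi) + 4j * math.pi / g**2

def sl2z_act(tau, m):
    """Projective action tau -> (a tau + b) / (c tau + d) of m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    if a * d - b * c != 1:
        raise ValueError("matrix is not in SL(2, Z)")
    return (a * tau + b) / (c * tau + d)

def compose(m1, m2):
    """Matrix product m1 m2, so that (m1 m2) . tau = m1 . (m2 . tau)."""
    (a1, b1), (c1, d1) = m1
    (a2, b2), (c2, d2) = m2
    return ((a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
            (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2))

S = ((0, -1), (1, 0))   # tau -> -1 / tau
T = ((1, 1), (0, 1))    # tau -> tau + 1, i.e. theta -> theta + 2 pi

tau = coupling_tau(g=2.0, theta=0.7)
tau_s = sl2z_act(tau, S)
tau_st = sl2z_act(sl2z_act(tau, T), S)

# Strong-weak exchange at theta = 0: g^2 -> (4 pi)^2 / g^2.
g_dual_squared = 4.0 * math.pi / sl2z_act(coupling_tau(2.0, 0.0), S).imag
```

Under any element the imaginary part transforms as ${\rm Im}\,\tau \to {\rm Im}\,\tau/|c\tau + d|^2$, so the upper half-plane is mapped to itself, and the generator S at ϑ = 0 exchanges weak and strong coupling.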
The conformal phase of the theory corresponds to the origin of the moduli
space where all the scalar v.e.v.’s vanish. As already observed, at this point the
classical (super)conformal symmetry is preserved at the quantum level, result-
ing in a non-trivial interacting conformal field theory. In this phase the fun-
damental observables are correlation functions of gauge-invariant composite
operators constructed from the elementary fields in (340). Such operators are
classified according to their transformation under the global symmetries and
are organised in multiplets of the superconformal group, PSU(2, 2|4). Some
properties of the N = 4 superconformal group and its multiplets are reviewed
in Appendix C. Each operator is characterised by its quantum numbers with
respect to the bosonic subgroup SO(2, 4) × SU (4). These can be chosen to be
two spins, (j1 , j2 ), and the scaling dimension, Δ, identifying the transforma-
tion under the conformal group together with three Dynkin labels, [k, l, m],
identifying the SU (4) representation under which the operator transforms.
N = 4 composite operators can be broadly divided into two classes, protected
operators belonging to short or semi-short “BPS” multiplets of the supercon-
formal group and unprotected ones belonging to long multiplets [118, 92]. Cor-
relation functions of protected operators satisfy special non-renormalisation
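properties. A notable example of BPS multiplet is the one comprising the
PSU(2, 2|4) conserved currents, i.e. the energy–momentum tensor, $T_{\mu\nu}$, and
the supersymmetry and R-symmetry currents, $\Sigma^\mu_{\alpha A}$ and $J^\mu{}_A{}^B$, respectively.
We give here the explicit form of the first few components of the $T_{\mu\nu}$ multiplet

$$\begin{aligned}
Q^{[A_1B_1][A_2B_2]} &= \mathrm{Tr}\left( 2\varphi^{A_1B_1}\varphi^{A_2B_2} + \varphi^{A_1A_2}\varphi^{B_1B_2} + \varphi^{A_1B_2}\varphi^{A_2B_1} \right) \\
X^{A_1[A_2B_2]}_\alpha &= \mathrm{Tr}\left( 2\lambda^{A_1}_\alpha\varphi^{A_2B_2} + \lambda^{A_2}_\alpha\varphi^{A_1B_2} - \lambda^{B_2}_\alpha\varphi^{A_1A_2} \right) \\
E^{(A_1A_2)} &= \mathrm{Tr}\left( -\lambda^{\alpha A_1}\lambda^{A_2}_\alpha + g\, t^{(A_1A_2)}_{CDEFGH}\,\varphi^{CD}\varphi^{EF}\varphi^{GH} \right) \\
B^{[A_1A_2]}_{\mu\nu} &= \mathrm{Tr}\left( \lambda^{\alpha A_1}\sigma_{\mu\nu\,\alpha}{}^\beta\lambda^{A_2}_\beta + 2iF_{\mu\nu}\varphi^{A_1A_2} \right) \\
\Lambda^A_\alpha &= \mathrm{Tr}\left( \sigma^{\mu\nu}{}_\alpha{}^\beta F_{\mu\nu}\lambda^A_\beta + g[\bar\varphi_{BC}, \varphi^{CA}]\lambda^B_\alpha + \left( \slashed{D}_{\alpha\dot\alpha}\bar\lambda^{\dot\alpha}_B + g[\lambda^C_\alpha, \bar\varphi_{BC}] \right)\varphi^{AB} \right) .
\end{aligned} \tag{345}$$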


The operator Q is the lowest component of the multiplet and transforms in the
20 of the SU(4) R-symmetry, E and $B_{\mu\nu}$ are respectively in the 10 and 6 and
the fermionic operators $X_\alpha$ and $\Lambda_\alpha$ transform in the 20 and 4, respectively.
The tensor $t^{(A_1A_2)}_{CDEFGH}$ in E projects the product of three 6’s onto the 10.
In the next sections we shall study various examples of correlation functions
involving the operators in (345). We shall also consider other BPS multiplets
in the same class whose lowest component is a dimension $\ell$ scalar operator
which, in terms of the $\varphi^i$ scalars, takes the form32

$$Q^{\{i_1 i_2 \cdots i_\ell\}} = \mathrm{Tr}\left( \varphi^{\{i_1}\varphi^{i_2} \cdots \varphi^{i_\ell\}} \right) . \tag{346}$$

The first example of long (non-BPS) multiplet is the N = 4 Konishi multiplet,
whose lowest component is the dimension 2, SU(4) singlet scalar

$$K_1 = \varepsilon_{ABCD}\, \mathrm{Tr}\left( \varphi^{AB}\varphi^{CD} \right) . \tag{347}$$
Conformal invariance implies that, given a complete basis of operators in the
theory, any correlation function is fully determined, via the operator product
expansion (OPE), by two sets of numbers, the scaling dimensions of the opera-
tors and the Wilson coefficients that couple triplets of operators. Both scaling
dimensions and Wilson coefficients receive perturbative and non-perturbative
quantum corrections, so they are non-trivial functions of the coupling, g, and
the ϑ-angle.
The spectrum of scaling dimensions in N = 4 SYM has been the subject of
extensive study in the context of the AdS/CFT correspondence. The scaling
dimensions of composite operators are determined by their two-point
correlation functions. For a primary operator, O(x), conformal invariance fixes the
form of the two-point function $\langle O(x)O^\dagger(y) \rangle$ to be

$$\langle O(x)O^\dagger(y) \rangle = \frac{c}{(x - y)^{2\Delta}} , \tag{348}$$
where Δ is the scaling dimension and c is a constant which may, in general,
depend on g and the ϑ-angle.33 In general Δ receives quantum corrections,
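$\Delta = \Delta_0 + \gamma$, where $\Delta_0$ is the bare or engineering dimension and γ the
anomalous part. The latter has an expansion of the form

$$\gamma(g, \vartheta) = \sum_{n=1}^{\infty} \gamma^{\rm pert}_n g^{2n} + \sum_{K>0}\sum_{m=0}^{\infty} \left[ \gamma^{(K)}_m g^{2m}\, e^{(-8\pi^2/g^2 + i\vartheta)K} + \text{c.c.} \right] , \tag{349}$$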
32
We use curly brackets to denote symmetrisation with subtraction of traces. For
simple symmetrisation and anti-symmetrisation we use parentheses and square
brackets, respectively.
33
This is actually an oversimplification. In general, in a given sector characterised
by certain quantum numbers, one needs to consider a complete set of operators
and resolve their mixing. The resolution of the operator mixing diagonalises the
matrix of two-point functions. Only after this step equations of the form (348)
determine the physical scaling dimensions.

where the first series contains the perturbative contributions and the second
double series the instanton and anti-instanton contributions. The two-point
functions of protected operators are not renormalised implying that their bare
dimensions are not corrected (Δ = Δ0 ).
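
The structure of the expansion (349) can be illustrated with a small numerical sketch. It assumes the standard definition $\tau = \vartheta/2\pi + 4\pi i/g^2$, under which the instanton weight $e^{(-8\pi^2/g^2 + i\vartheta)K}$ equals $e^{2\pi i K\tau}$; the coefficients passed to the function are made-up placeholders, not computed values of any anomalous dimension.

```python
import cmath
import math

def coupling_tau(g, theta):
    """Complexified coupling, assuming tau = theta / (2 pi) + 4 pi i / g^2."""
    return theta / (2.0 * math.pi) + 4j * math.pi / g**2

def instanton_weight(g, theta, k):
    """exp(2 pi i K tau), equal to exp((-8 pi^2 / g^2 + i theta) K)."""
    return cmath.exp(2j * math.pi * k * coupling_tau(g, theta))

def anomalous_dimension(g, theta, pert, inst):
    """Truncated version of (349).

    pert[n - 1] plays the role of gamma_n^pert; inst[K][m] that of gamma_m^(K).
    """
    total = sum(c * g ** (2 * n) for n, c in enumerate(pert, start=1))
    for k, coeffs in inst.items():
        series = sum(c * g ** (2 * m) for m, c in enumerate(coeffs))
        total += 2.0 * (series * instanton_weight(g, theta, k)).real  # term + c.c.
    return total

# Placeholder coefficients, for illustration only.
pert = [0.1, -0.02]
inst = {1: [0.5 + 0.2j, 0.1], 2: [0.3]}
gamma = anomalous_dimension(3.0, 0.4, pert, inst)
gamma_shifted = anomalous_dimension(3.0, 0.4 + 2.0 * math.pi, pert, inst)
```

The perturbative series is blind to ϑ, whereas each instanton sector contributes a phase $e^{i\vartheta K}$; the result is periodic under ϑ → ϑ + 2π, and all ϑ dependence of γ originates from the instanton terms.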
The behaviour (349) of the anomalous dimensions illustrates an important
feature of N = 4 observables. In general, physical quantities receive contri-
butions at all orders in perturbation theory and from all instanton sectors.
Moreover, in each instanton sector there exists an infinite series of perturba-
tive corrections arising from fluctuations around the leading instanton semi-
classical contribution. This is the consequence of the absence of chiral selection
rules and marks an important difference with respect to the cases of N = 1
and N = 2 theories considered in the previous sections. Indeed, in N = 4
SYM there are no anomalous U (1)’s. As a consequence, as will be discussed
in the next sections, there exist no correlation functions which are dominated
by the contribution of specific instanton sectors.
In the conformal phase the field equations of N = 4 SYM admit no (non-
singular) monopole or dyon solutions and the conjectured S-duality has a
different realisation. Specifically, it requires that the spectrum of scaling di-
mensions of gauge-invariant operators be invariant under the SL(2, Z) trans-
formations (343). This suggests that the scaling dimensions should naturally
be written as functions of τ and τ̄ in the form
Δ = Δ(τ, τ̄ ) = Δ0 + γ(τ, τ̄ ) . (350)
We then conclude that instanton effects, which are the source of the ϑ de-
pendence in (349), must play a crucial role here. Similarly, it can be argued
that instantons are important in determining the behaviour of correlation
functions under S-duality. As will be discussed in Sects. 17 and 18, the argu-
ments outlined here also resonate with what is understood about the role of
D-instantons in the dual type IIB string theory compactified on AdS5 × S 5 .
Unfortunately, little is known beyond these qualitative considerations and the
details of how the S-duality of N = 4 SYM is implemented in the supercon-
formal phase remain largely elusive, see, however, [119] and [120] for recent
progress.

14 Instanton Calculus in N = 4 SYM

In this section we discuss some general features of instanton calculus in the
N = 4 SYM theory, highlighting the differences with respect to the N = 1
and N = 2 cases. In the next section we analyse in more detail one-instanton
contributions to some specific correlation functions, focussing on the case of
SU (Nc ) gauge group which is particularly important for the applications to
the AdS/CFT correspondence. Multi-instanton configurations are described
using the ADHM formalism [19] (see also Sect. 9.2). A full account of the tech-
nical aspects of the ADHM construction is beyond the scope of the present
Instantons and Supersymmetry 391

work. A comprehensive review of multi-instanton calculus in supersymmetric

gauge theories can be found in [6]. The calculation of multi-instanton
contributions is extremely involved and their direct evaluation is only possible
for small instanton numbers (although remarkable progress was made in [81, 82];
see the discussion in Sect. 10). However, dramatic simplifications occur in the
large Nc limit of relevance for the AdS/CFT duality. A brief review of the
computation of multi-instanton corrections to N = 4 correlators in this limit
will be given in Sect. 16 following the work of [121].
The calculation of correlation functions in N = 4 SYM in the semi-classical
approximation proceeds as described in a general setting in Sect. 2. The path
integral is evaluated using a saddle point approximation around the instanton
configuration and thus reduced to a finite dimensional integral over the collec-
tive coordinate manifold associated with bosonic and fermionic zero modes.
However, the form of the interaction terms in the N = 4 action (340), and
in particular the coupling to the scalar fields, requires a modification of the
previous analysis.
In principle, as done in Sect. 2, it is possible to use as saddle point config-
uration the solution in which the gauge field is given by the standard bosonic
instanton with all the other fields vanishing, i.e.
\[
A_\mu = A^I_\mu , \qquad \lambda^A_\alpha = \bar\lambda^{\dot\alpha}_A = \varphi^{AB} = 0 . \tag{351}
\]
The fluctuations of the various fields in the background of this configuration,
including fermions, should then be treated perturbatively. This results in an
expansion which, in the case under consideration of N = 4 SYM, is somewhat
hard to handle beyond leading order. A more efficient approach consists in
utilising as saddle point configuration a finite action solution of the complete
set of coupled field equations of the theory including higher order corrections
in g. This also provides a natural framework for implementing the supersym-
metric generalisation of the ADHM construction.
A generic n-point correlation function computed in the semi-classical ap-
proximation around such a saddle point has an expression analogous to (23)–
(25) which we schematically rewrite in the form

\[
\langle O_1(x_1) \cdots O_n(x_n) \rangle = \int d\mu(\beta, c)\, e^{-S_{\rm inst}}\, \hat O_1(x_1; \beta, c) \cdots \hat O_n(x_n; \beta, c) , \tag{352}
\]

where dμ(β, c) is the integration measure over the bosonic (β) and fermionic
(c) collective coordinates arising from the zero-mode fluctuations around the
classical solution and Sinst is the action evaluated on the solution. With Ôi we
denote the classical expressions of the operators at the saddle point. The latter
depend on the insertion points of the operators and the collective coordinates.
For pure N = 1 SYM there exists a whole manifold of saddle points
(including the one in (351)) which correspond to field configurations with
$A_\mu = A^I_\mu$, a gaugino solution of the Weyl–Dirac equation
\[
\bar{\slashed{D}}^{\dot\alpha\alpha} \lambda_\alpha = 0 , \tag{353}
\]
392 M. Bianchi et al.

and the anti-chiral fermion, λ̄α̇ , identically zero. The resulting semi-classical
expectation values involve integrals over the nB bosonic collective coordinates
as well as the nF fermion zero modes resulting from the index theorem and
discussed in previous sections.
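The generalisation of this analysis to the N = 4 case runs into a serious
obstacle: no exact topologically non-trivial solution of the coupled field
equations is known except (351). The N = 4 SYM field equations read
\[
\begin{aligned}
& D_\mu F^{\mu\nu} + i g \{\lambda^{\alpha A} \sigma^\nu_{\alpha\dot\alpha} , \bar\lambda^{\dot\alpha}_A\} + \frac{1}{2}\, g\, [\bar\varphi_{AB} , D^\nu \varphi^{AB}] = 0 \\
& D^2 \varphi^{AB} + \sqrt{2}\, g \Big[ \{\lambda^{\alpha A} , \lambda^B_\alpha\} + \frac{1}{2}\, \varepsilon^{ABCD} \{\bar\lambda_{\dot\alpha C} , \bar\lambda^{\dot\alpha}_D\} \Big] - \frac{g^2}{2}\, [\bar\varphi_{CD} , [\varphi^{AB} , \varphi^{CD}]] = 0 \\
& \bar{\slashed{D}}^{\dot\alpha\alpha} \lambda^A_\alpha + i \sqrt{2}\, g\, [\varphi^{AB} , \bar\lambda^{\dot\alpha}_B] = 0 \\
& \slashed{D}_{\alpha\dot\alpha} \bar\lambda^{\dot\alpha}_A - i \sqrt{2}\, g\, [\bar\varphi_{AB} , \lambda^B_\alpha] = 0 .
\end{aligned}
\tag{354}
\]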

In the following discussion we denote by $\Phi^{(n)}$ a solution of the classical
equations of motion for the generic field $\Phi$, which depends on n zero modes of
the Dirac operator in the instanton background. As already observed, the
equations (354) are solved by the purely bosonic configuration (351), where
$A^I_\mu = A^{(0)}_\mu$ is a charge-K instanton, with nB associated collective
coordinates. One can try to do better and solve iteratively the full set of
coupled equations (354). Upon substituting the instanton solution (351), the
equation for $\lambda^A$ is of the form (353) and admits nF independent solutions
for each “flavour” A = 1, . . . , 4. After this first step of the iteration, which
generates a non-trivial solution, $\lambda^{A(1)}_\alpha$, for the gauginos, one
notices that the configuration
\[
A_\mu = A^{(0)}_\mu , \qquad \lambda^A_\alpha = \lambda^{A(1)}_\alpha , \qquad \varphi^{AB} = \bar\lambda^{\dot\alpha}_A = 0 , \tag{355}
\]
unlike what happens in the cases of N = 1 and N = 2 SYM, is not a solution of
(354) and the process has to continue. The equation for the scalar fields,
obtained by inserting back $\lambda^{A(1)}_\alpha$, admits a solution which is
bilinear in the fermion modes, $\varphi^{AB(2)}$. Again the resulting configuration
\[
A_\mu = A^{(0)}_\mu , \qquad \lambda^A_\alpha = \lambda^{A(1)}_\alpha , \qquad \varphi^{AB} = \varphi^{AB(2)} , \qquad \bar\lambda^{\dot\alpha}_A = 0 , \tag{356}
\]
is not an exact solution of (354). A further iteration generates a non-trivial
field configuration, $\bar\lambda^{\dot\alpha(3)}_A$, for the anti-chiral fermions
involving three zero-modes. At this point also the first equation in (354) gets
an extra term, so that at the next step a modification, $A^{(4)}_\mu$, of the
standard bosonic instanton is necessary. One may notice that the field strength
associated with $A^{(4)}_\mu$ is no longer (anti-)self-dual.
In principle this recursive procedure can only stop when, after a number
of successive iterations, a field configuration involving a number of fermion
modes exceeding nF is generated. The first few steps of the construction de-
scribed above were explicitly carried out in [122]. However, a complete super-
instanton multiplet, which would exactly solve (354) in closed form is not
known. Indeed it has been argued in [6] that for generic gauge group such
an exact solution may not exist. In spite of this obstacle, it is possible to
consistently compute the semi-classical contribution to correlation functions
expanding the path integral around an appropriately approximate solution.
The crucial observation is that successive steps in the iterative procedure
outlined above produce corrections to the solution which are suppressed by
increasing powers of g. Therefore, in the weak coupling limit, it is consistent
to employ as saddle point configuration an approximate truncated solution of
the equations of motion in which only terms up to a certain power of g are
retained. Thus the idea is to solve the field equations to leading order and in-
clude in the integration in (352) all the zero modes of the truncated equations.
For the purpose of computing correlators in the semi-classical approximation,
the relevant saddle point is determined by solving the system

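\[
\begin{aligned}
& F_{\mu\nu} = \tilde F_{\mu\nu} , \\
& \bar{\slashed{D}}^{\dot\alpha\alpha} \lambda^A_\alpha = 0 , \\
& D^2 \varphi^{AB} = \sqrt{2}\, g \left( \lambda^{\alpha A} \lambda^B_\alpha - \lambda^{\alpha B} \lambda^A_\alpha \right) ,
\end{aligned}
\tag{357}
\]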

with the integration measure in (352) including all the associated fermion zero
modes. The action evaluated on the solution of (357) is not simply given by
$8\pi^2/g^2$, but manifestly depends on a subset of the collective coordinates
\[
S_{\rm inst} = \frac{8\pi^2}{g^2} - i\vartheta + \tilde S_{\rm inst}(\tilde\beta, \tilde c) , \tag{358}
\]

where we have denoted with β̃ and c̃ the collective coordinates associated with
the “non-exact” zero modes, i.e. those which are zero modes of the truncated
equations (357), but not of the full coupled equations (354). These non-exact
zero modes are said to be “lifted” by the interactions. As will be discussed
more explicitly in the next section, in the case of gauge group SU (Nc ), the only
fermion modes which remain unlifted are those associated with the Poincaré
and special supersymmetries which are broken in the instanton background.
All the remaining modes are lifted by the coupling to the scalars.
The lifting of fermion zero modes has important consequences for the prop-
erties of correlation functions which receive instanton contributions. Since
some of the fermionic collective coordinates (c̃ in the formula above) ap-
pear explicitly in the action, it is not necessary to saturate the corresponding
fermionic integrations with the operator insertions in order to obtain a non-
vanishing result. This implies that in the N = 4 theory there are no (strict)
selection rules determining which correlation functions receive contributions
from which winding number sector, unlike what happens in the N = 1 and
N = 2 cases. In particular, non-vanishing correlators receive contributions
from configurations with arbitrary instanton number. This result emerging
from explicit calculations has its origin in the absence of an anomalous U (1)
R-symmetry in N = 4 SYM.
It must be stressed that the approach described here is consistent only if
restricted to the calculation of the leading order contributions in g (see (357)).
The reason is that, in order to go to higher orders in a consistent way, one
should also systematically take into account all the quantum fluctuations beyond
the semi-classical approximation, which have been neglected here.

15 One-instanton in N = 4 SYM with SU (Nc ) Gauge Group
In the one-instanton sector, the approach described in the previous section
can be implemented in a straightforward way. We focus here on the case
of SU (Nc ) gauge group, but the generalisation to orthogonal and symplectic
groups is not too difficult. As explained above, we shall use as saddle point for
the calculation of correlation functions in the semi-classical approximation the
solution to the truncated equations (357) (see footnote 34). The resulting
saddle point field configuration is characterised by 4Nc bosonic collective
coordinates. As we know, they are the position, x0, and size, ρ, of the instanton
and its global gauge orientation parameters. The latter can be conveniently
identified with the set of variables, $w_{u\dot\alpha}$ and $\bar w^{\dot\alpha u}$
(where u = 1, . . . , Nc is a colour index and α̇ = 1, 2 is a spinor index),
parametrising the coset SU(Nc)/(SU(Nc − 2) × U(1)) describing the SU(2) colour
orientation of the instanton and its embedding into SU(Nc) [9]. Moreover, as we
will be working with the approximate solution of (357), all the 8Nc fermionic
collective coordinates associated with the zero modes of the Dirac operator will
be included in the integration measure. These comprise the 16 moduli associated
with the Poincaré and special supersymmetries broken by the bosonic instanton
and denoted, respectively, by $\eta^A_\alpha$ and $\bar\xi^{\dot\alpha A}$
(A = 1, 2, 3, 4). For brevity we shall refer collectively to these modes as
superconformal modes. The remaining fermion moduli, which can be thought of as
superpartners of the gauge orientation parameters, are described by 8Nc
parameters, $\nu^A_u$ and $\bar\nu^{Au}$, subject to the 2 × 8 constraints
\[
\bar w^{\dot\alpha u}\, \nu^A_u = 0 , \qquad \bar\nu^{Au}\, w_{u\dot\alpha} = 0 , \tag{359}
\]
which effectively reduce their number to 8(Nc − 2).
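The counting of collective coordinates quoted above can be checked with a few
lines of arithmetic. The following is only a bookkeeping sketch: the function
name and the dictionary keys are illustrative choices, not notation used in
the text, and the numbers are simply those stated in this section.

```python
def one_instanton_moduli(n_c: int) -> dict:
    """Count the one-instanton collective coordinates for SU(n_c).

    All numbers are the ones quoted in the text; the function and key
    names are hypothetical and serve only as bookkeeping.
    """
    if n_c < 2:
        raise ValueError("an SU(2) instanton can only be embedded for n_c >= 2")
    fermionic_total = 8 * n_c       # all zero modes of the Dirac operator
    superconformal = 16             # eta and xi-bar modes (exact zero modes)
    nu_parameters = 8 * n_c         # nu and nu-bar before imposing (359)
    constraints = 2 * 8             # the 2 x 8 conditions in (359)
    lifted = nu_parameters - constraints
    return {
        "bosonic": 4 * n_c,
        "fermionic_total": fermionic_total,
        "superconformal": superconformal,
        "lifted": lifted,
    }


if __name__ == "__main__":
    for n_c in (2, 3, 5):
        print(n_c, one_instanton_moduli(n_c))
```

For every Nc the 16 superconformal modes and the 8(Nc − 2) constrained modes
add up to the 8Nc fermionic collective coordinates, so no mode is counted twice.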

34 In preparation for the forthcoming discussion on the AdS/CFT duality we shall
from now on work with fields rescaled by a factor of g. Consequently, the action
of the N = 4 theory will have an overall factor $1/g^2$ in front. This is the
normalisation which arises naturally in the dual string theory and therefore
this rescaling will simplify the comparison of string and gauge theory results.

As discussed in the previous section, only the 16 superconformal modes
remain exact zero modes to all orders in g. The $\nu^A_u$ and $\bar\nu^{Au}$ modes
are lifted and appear explicitly in the instanton action. The solution of the coupled
equations (357) generates a non-trivial configuration for the scalars,
$\varphi^{AB}$, which is bilinear in the 8Nc fermion zero modes. Substituting
this solution, together with $\bar\lambda^{\dot\alpha}_A = 0$, in the action
(truncated to the cubic couplings for consistency with our iterative procedure)
gives
\[
S_{\rm inst} = -2\pi i\tau + S_{4F} = -2\pi i\tau + \frac{\pi^2}{2 g^2 \rho^2}\, \varepsilon_{ABCD}\, F^{AB} F^{CD} , \tag{360}
\]
where τ was defined in (141) and
\[
F^{AB} = \frac{1}{2\sqrt{2}} \left( \bar\nu^{Au} \nu^B_u - \bar\nu^{Bu} \nu^A_u \right) . \tag{361}
\]
It is the four-fermion term, $S_{4F}$, arising from the Yukawa couplings
$\lambda^A [\lambda^B , \bar\varphi_{AB}]$, which is responsible for the lifting of
the 8(Nc − 2) $\nu^A_u$ and $\bar\nu^{Au}$ modes.

15.1 A Generating Function

In the following we only consider correlators of gauge-invariant operators for
which the integration over the moduli describing the global orientation of
the instanton simply produces a volume factor which can be absorbed in the
measure [9]. We denote by dμphys the gauge-invariant, or physical, integra-
tion measure on the instanton moduli space obtained after integration over
the gauge orientation parameters. The physical measure to be used in the
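calculation of expectation values in semi-classical approximation in the
one-instanton sector is (see footnote 35)
\[
\int d\mu_{\rm phys}\, e^{-S_{\rm inst}} = \frac{\pi^{-4N_c}\, g^{4N_c}\, e^{2\pi i\tau}}{(N_c-1)!\,(N_c-2)!}
\int d\rho\, d^4x_0 \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A\, d^{N_c-2}\nu^A\, d^{N_c-2}\bar\nu^A \; \rho^{4N_c-13}\, e^{-S_{4F}} , \tag{362}
\]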

where the instanton action is given in (360) and (361) and the ρ, g and Nc
dependence is the result of the normalisation of the collective coordinates as
explained in Sect. 2.
Following [121] the integration over the 8(Nc − 2) non-exact modes, νuA
and ν̄ Au , can be reduced to a Gaussian form introducing auxiliary bosonic
coordinates, χi , i = 1, . . . , 6, and rewriting the r.h.s. of (362) in the form
\[
\frac{\pi^{-4N_c}\, g^{4N_c}\, e^{2\pi i\tau}}{(N_c-1)!\,(N_c-2)!}
\int d\rho\, d^4x_0\, d^6\chi \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A\, d^{N_c-2}\nu^A\, d^{N_c-2}\bar\nu^A \;
\rho^{4N_c-7} \exp\left[ -2\rho^2 \chi^i \chi^i + \frac{4\pi i}{g}\, \chi_{AB} F^{AB} \right] , \tag{363}
\]
where $\chi_{AB} = \frac{1}{\sqrt{8}}\, \Sigma^i_{AB}\, \chi^i$ and $F^{AB}$ is defined in (361).

35 Here and in the following formulae we omit a (Nc-independent) numerical
constant that will be reinstated in the final expressions.
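The step from (362) to (363) rests on the standard Gaussian identity
$\int dx\, e^{-a x^2 + i b x} = \sqrt{\pi/a}\; e^{-b^2/(4a)}$, applied to each of the
six auxiliary coordinates with $a = 2\rho^2$. The sketch below checks the
one-dimensional identity numerically and tracks the resulting power of ρ. It
treats b as an ordinary real number, which is an assumption made purely for
illustration (in (363) the corresponding quantity is a Grassmann bilinear), and
all function names are hypothetical.

```python
import math


def gaussian_fourier_numeric(a: float, b: float, steps: int = 20001) -> float:
    """Trapezoidal estimate of the integral of exp(-a x^2 + i b x) over the real line.

    Only the cosine part is summed: the sine part is odd and integrates to zero.
    """
    half_width = 12.0 / math.sqrt(a)   # exp(-a x^2) = exp(-144) at the cut-off
    h = 2.0 * half_width / (steps - 1)
    total = 0.0
    for k in range(steps):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * math.exp(-a * x * x) * math.cos(b * x)
    return total * h


def gaussian_fourier_exact(a: float, b: float) -> float:
    """Closed form sqrt(pi / a) * exp(-b^2 / (4 a))."""
    return math.sqrt(math.pi / a) * math.exp(-b * b / (4.0 * a))


def rho_power_after_chi_integral(n_c: int) -> int:
    """Power of rho left after integrating out the six chi^i in (363).

    Each of the six Gaussian integrals contributes sqrt(pi / (2 rho^2)),
    i.e. one inverse power of rho, on top of the explicit rho^(4 n_c - 7).
    """
    return (4 * n_c - 7) - 6


if __name__ == "__main__":
    rho, b = 0.7, 1.3
    a = 2.0 * rho ** 2
    print(gaussian_fourier_numeric(a, b), gaussian_fourier_exact(a, b))
    print(rho_power_after_chi_integral(3))
```

The power of ρ obtained in this way, 4Nc − 13, is the one appearing in (362),
which is a useful consistency check on the exponent 4Nc − 7 in (363).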
The semi-classical contribution to a correlation function of local gauge-
invariant operators is obtained by integrating the product of their profiles in
the instanton background with the above measure. The integration over the
Grassmann variables in (363) requires that the classical expressions of the op-
erators soak up the 16 fermion modes ηαA and ξ¯α̇A for the result to be non-zero.
The Grassmann integrals over the ν A and ν̄ A modes are non-vanishing even if
the operators do not contain any dependence on these variables. In the follow-
ing we shall use the terminology introduced in [123] and refer to correlation
functions in which the operator insertions soak up only the 16 exact modes as
“minimal”. Correlators in which the operators contain a dependence on more
than sixteen modes will be referred to as “non-minimal”. In order to
systematically study the instanton contributions to generic correlation
functions, it is convenient to construct a generating function. This allows one
to drastically simplify the combinatorics associated with the $\nu^A$ and
$\bar\nu^A$ integrations in the non-minimal cases [123]. For this purpose, we
introduce sources, $\bar\vartheta^u_A$ and $\vartheta_{Au}$, in (363) coupled to
those fermionic variables and define
\[
\begin{aligned}
Z[\vartheta, \bar\vartheta] = {} & \frac{\pi^{-4N_c}\, g^{4N_c}\, e^{2\pi i\tau}}{(N_c-1)!\,(N_c-2)!}
\int d\rho\, d^4x_0\, d^6\chi \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A\, d^{N_c-2}\bar\nu^A\, d^{N_c-2}\nu^A \\
& \times \rho^{4N_c-7} \exp\left[ -2\rho^2 \chi^i \chi^i + \frac{\sqrt{8}\,\pi i}{g}\, \bar\nu^{Au} \chi_{AB}\, \nu^B_u + \bar\vartheta^u_A\, \nu^A_u + \vartheta_{Au}\, \bar\nu^{Au} \right] .
\end{aligned}
\tag{364}
\]

Since the integrals over ν̄ and ν are Gaussian, they can be immediately com-
puted. Introducing polar coordinates

\[
\chi^i \to (r, \Omega) , \qquad \sum_{i=1}^{6} (\chi^i)^2 = r^2 , \tag{365}
\]

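we find
\[
\begin{aligned}
Z[\vartheta, \bar\vartheta] = {} & \frac{2^{-29}\, \pi^{-13}\, g^{8}\, e^{2\pi i\tau}}{(N_c-1)!\,(N_c-2)!}
\int d\rho\, d^4x_0\, d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \; \rho^{4N_c-7} \\
& \times \int_0^\infty dr\, r^{4N_c-3}\, e^{-2\rho^2 r^2}\, Z(\vartheta, \bar\vartheta; \Omega, r) ,
\end{aligned}
\tag{366}
\]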

where all the numerical coefficients omitted in previous expressions have been
reinstated and we have introduced the density
\[
Z(\vartheta, \bar\vartheta; \Omega, r) = \exp\left[ -\frac{i g}{\pi r}\, \bar\vartheta^u_A\, \Omega^{AB}\, \vartheta_{Bu} \right] , \tag{367}
\]
where the symplectic form $\Omega^{AB}$ is given by (see (365))
\[
\Omega^{AB} = \bar\Sigma^{AB}_i\, \Omega^i , \qquad \sum_{i=1}^{6} (\Omega^i)^2 = 1 . \tag{368}
\]
Notice that the angular variables, $\Omega^i$, introduced in the polar representation
of the auxiliary coordinates, $\chi^i$, parametrise a five-sphere. This will play a
very important role in comparing with string theory results in the context of
the AdS/CFT correspondence.
An n-point correlation function in the semi-classical approximation now takes
the form (see (352))
\[
\langle O_1(x_1) \cdots O_n(x_n) \rangle = \int d\mu_{\rm phys}\, e^{-S_{\rm inst}}\, \hat O_1 \cdots \hat O_n , \tag{369}
\]

where
\[
\hat O_i = \hat O_i(x_i; x_0, \rho, \eta^A, \bar\xi^A, \nu^A, \bar\nu^A) \tag{370}
\]
denotes the classical instantonic profile of the operator $O_i$ and generically
depends on all the bosonic and fermionic moduli. In particular, the $\nu^A$ and
$\bar\nu^A$ modes appear in gauge-invariant operators only in colour singlet
bilinears, in either symmetric or anti-symmetric combinations belonging to the
representation 10 or 6 of SU(4), respectively, i.e. in the combinations
\[
(\bar\nu^A \nu^B)_{10} \equiv \bar\nu^{u(A} \nu^{B)}_u = (\bar\nu^{Au} \nu^B_u + \bar\nu^{Bu} \nu^A_u) , \tag{371}
\]
\[
(\bar\nu^A \nu^B)_{6} \equiv \bar\nu^{u[A} \nu^{B]}_u = (\bar\nu^{Au} \nu^B_u - \bar\nu^{Bu} \nu^A_u) . \tag{372}
\]

The strategy is then to rewrite the dependence on these collective coordinates

in each insertion in (369) in terms of derivatives with respect to the sources
(ϑA , ϑ̄A ). In this way the dependence on the ν A ’s and ν̄ A ’s is traded for a
dependence on the angular variables Ω AB . After this step the integration over
the radial parameter, r, can be computed and one is left with an integration
over the bosonic coordinates, x0 , ρ, Ω AB , and the 16 coordinates associated
with the exact modes, ηαA and ξ¯α̇A . In the next subsections we present some
examples of such calculations.

15.2 Minimal Correlation Functions

A class of correlation functions which have been extensively studied in the

context of the AdS/CFT correspondence are those involving the operators in
the N = 4 supercurrent multiplet (345). This is a 1/2-BPS supermultiplet
and thus all its components are protected operators. Their two- and three-
point functions are not renormalised and in particular do not receive instanton
contributions. However, their four- and higher-point functions can be non-zero
in an instanton background and turn out to contain interesting dynamical
information. The first examples of correlators in this class were considered
in [124] in the case of SU (2) gauge group. The calculations have then been
generalised to SU (Nc ) in [125] and to multi-instantons in the large Nc limit
in [121].
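• The simplest minimal correlation function involves 16 insertions of the
fermionic operator (see footnote 36)
\[
\Lambda^A_\alpha = \frac{1}{g^2}\, {\rm Tr} \Big\{ \sigma^{\mu\nu}{}_\alpha{}^\beta F_{\mu\nu} \lambda^A_\beta + [\bar\varphi_{BC} , \varphi^{CA}]\, \lambda^B_\alpha
+ \big( \slashed{D}_{\alpha\dot\alpha} \bar\lambda^{\dot\alpha}_B + [\lambda^C_\alpha , \bar\varphi_{BC}] \big)\, \varphi^{AB} \Big\} , \tag{373}
\]
transforming in the 4 of the SU(4) R-symmetry group
\[
G_{16}(x_1, x_2, \ldots, x_{16}) = \langle \Lambda^{A_1}_{\alpha_1}(x_1)\, \Lambda^{A_2}_{\alpha_2}(x_2) \cdots \Lambda^{A_{16}}_{\alpha_{16}}(x_{16}) \rangle . \tag{374}
\]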

For the calculation of (374) in the semi-classical approximation only the
contribution of the first term in (373) to the classical profile of
$\Lambda^A_\alpha$ is relevant. In fact by reinserting for a moment the powers of
g that were absorbed in the redefinition of the fields, it is immediately seen
that all the other terms are of higher order in g.
Substituting the solution for $A^{(0)}_\mu$ and $\lambda^{A(1)}_\alpha$ one obtains
for the classical profile of $\Lambda^A_\alpha$
\[
\hat\Lambda^A_\alpha(x) = \frac{96}{g^2}\, \rho^4\, [f(x)]^4\, \zeta^A_\alpha(x) , \tag{375}
\]
where the function f(x) is defined in (A.40) and $\zeta^A_\alpha(x)$ is the combination
\[
\zeta^A_\alpha(x) = \frac{1}{\sqrt{\rho}} \left[ \rho\, \eta^A_\alpha - (x - x_0)_\mu\, \sigma^\mu_{\alpha\dot\alpha}\, \bar\xi^{\dot\alpha A} \right] . \tag{376}
\]
We explicitly note that $\hat\Lambda^A_\alpha$ is linear in the superconformal
collective coordinates and does not depend on the $\nu^A$ and $\bar\nu^A$ modes.
This means that in evaluating the correlator (374) the sources
$\bar\vartheta^u_A$ and $\vartheta_{Au}$ can be set to zero. We
thus get

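\[
\begin{aligned}
G_{16}(x_1, \ldots, x_{16}) = {} & \frac{2^{51}\, 3^{16}\, \pi^{-13}\, e^{2\pi i\tau}}{g^{24}\, (N_c-1)!\,(N_c-2)!}
\int d\rho\, d^4x_0\, d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \\
& \times \int_0^\infty dr\, r^{4N_c-3}\, e^{-2\rho^2 r^2}\, \rho^{4N_c-7} \prod_{i=1}^{16} \Big[ \rho^4\, [f(x_i)]^4\, \zeta^{A_i}_{\alpha_i}(x_i) \Big] .
\end{aligned}
\tag{377}
\]
The r integral is elementary and yields
\[
\int_0^\infty dr\, r^{4N_c-3}\, e^{-2\rho^2 r^2} = 2^{-2N_c}\, \rho^{2-4N_c}\, \Gamma(2N_c - 1) , \tag{378}
\]
so that
\[
\begin{aligned}
G_{16}(x_1, \ldots, x_{16}) = {} & \frac{c_1(N_c)\, 2^{51}\, 3^{16}\, \pi^{-13}\, e^{2\pi i\tau}}{g^{24}}
\int \frac{d\rho\, d^4x_0}{\rho^5}\, d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \\
& \times \prod_{i=1}^{16} \left[ \frac{\rho^4}{[(x_i - x_0)^2 + \rho^2]^4}\, \zeta^{A_i}_{\alpha_i}(x_i) \right] ,
\end{aligned}
\tag{379}
\]
where
\[
c_1(N_c) = \frac{2^{-2N_c}\, \Gamma(2N_c - 1)}{(N_c-1)!\,(N_c-2)!} \;\underset{N_c \to \infty}{\sim}\; N_c^{1/2} . \tag{380}
\]

36 Here and in the following we use for the composite operators the normalisation
appropriate in the context of the AdS/CFT correspondence, which requires that
their tree-level correlation functions be proportional to $N_c^2$ (see (422) and (423)).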

The integration over the fermion modes selects terms with eight η A ’s and
eight ξ¯A ’s in (379) and results in a fully anti-symmetric tensor in the SU (4)
and spinor indices. As will be discussed in Sect. 18, the unintegrated
expression (379) suffices for the comparison with the associated dual process in string
theory.
• As a second example of minimal correlation function, we consider the
four-point function

\[
G_4(x_1, \ldots, x_4) = \langle Q^{A_1B_1C_1D_1}(x_1) \cdots Q^{A_4B_4C_4D_4}(x_4) \rangle , \tag{381}
\]
where the scalar operators $Q^{ABCD}$ belong to the 20 of SU(4) and are given
by
\[
Q^{ABCD} = \frac{1}{g^2}\, {\rm Tr} \left( 2\varphi^{AB} \varphi^{CD} + \varphi^{AC} \varphi^{BD} - \varphi^{AD} \varphi^{BC} \right) . \tag{382}
\]
When evaluated on the solution of the saddle point equations (357), the
$Q^{ABCD}$'s contain four fermion modes. Unlike the fermions $\Lambda^A_\alpha$ they
also involve the $\nu^A$ and $\bar\nu^A$ modes. However, in the minimal correlator
(381) this dependence can be neglected as all the fermion modes need to be of
type $\eta^A$ and $\bar\xi^A$ to saturate the corresponding Grassmann integrals.
The relevant terms giving the profile of $Q^{ABCD}$ are
\[
\hat Q^{ABCD} = \frac{96}{g^2}\, \rho^4\, [f(x)]^4 \left[ (\zeta^{\alpha A} \zeta^C_\alpha)(\zeta^{\beta B} \zeta^D_\beta) - (\zeta^{\alpha A} \zeta^D_\alpha)(\zeta^{\beta B} \zeta^C_\beta) \right] , \tag{383}
\]
with $\zeta^A_\alpha$ defined in (376).

Proceeding as for the 16-point function (374), one finds
\[
\begin{aligned}
G_4(x_1, \ldots, x_4) = {} & c_1(N_c)\, 2^{-9}\, 3^{4}\, \pi^{-13}\, e^{2\pi i\tau}
\int \frac{d\rho\, d^4x_0}{\rho^5}\, d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \\
& \times \prod_{i=1}^{4} \left\{ \frac{\rho^4}{[(x_i - x_0)^2 + \rho^2]^4} \left[ (\zeta^{A_i} \zeta^{C_i})(\zeta^{B_i} \zeta^{D_i}) - (\zeta^{A_i} \zeta^{D_i})(\zeta^{B_i} \zeta^{C_i}) \right](x_i) \right\} ,
\end{aligned}
\tag{384}
\]
with $c_1(N_c)$ given in (380).

In the case of the four-point function (381) the final integrals in (384) have
been explicitly computed in [126]. The result is a very complicated function

of the distances x2ij = (xi − xj )2 , which, however, can be used to extract infor-
mation about instanton contributions to the anomalous dimensions of certain
operators via the OPE analysis. As discussed in [126], the result shows, in par-
ticular, that the Konishi operator (347) does not acquire an instanton induced
anomalous dimension. The study of the OPE also shows that there are SU (4)
singlet operators with Δ0 = 4, which do receive an instanton contribution
(as well as possibly a perturbative one) to their anomalous dimension. We
shall briefly return to the calculation of instanton corrections to the scaling
dimensions of composite operators at the end of the next subsection and to
the interpretation of (379) and (384) in Sect. 18.2.
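The radial integral (378), the large-Nc behaviour of the coefficient $c_1(N_c)$
in (380) and the numerical prefactors in (377) and (384) are easy to check
numerically. The following sketch does so; the function names are illustrative,
and the limiting constant $1/(4\sqrt{\pi})$ used in the last check is not quoted
in the text but follows from Stirling's formula.

```python
import math
from fractions import Fraction


def radial_integral_closed_form(n_c: int, rho: float) -> float:
    """Right-hand side of (378): 2^(-2 Nc) rho^(2 - 4 Nc) Gamma(2 Nc - 1)."""
    return 2.0 ** (-2 * n_c) * rho ** (2 - 4 * n_c) * math.gamma(2 * n_c - 1)


def radial_integral_numeric(n_c: int, rho: float, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the left-hand side of (378)."""
    r_max = (math.sqrt(n_c) + 10.0) / rho   # integrand is negligible beyond this
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += r ** (4 * n_c - 3) * math.exp(-2.0 * rho * rho * r * r)
    return total * h


def c1(n_c: int) -> float:
    """Coefficient c_1(N_c) of (380), evaluated through log-gamma functions."""
    log_c1 = (-2 * n_c * math.log(2.0) + math.lgamma(2 * n_c - 1)
              - math.lgamma(n_c) - math.lgamma(n_c - 1))
    return math.exp(log_c1)


def profile_prefactor(n_insertions: int) -> Fraction:
    """Product of n factors of 96 from the profiles times the 2^(-29) of (366)."""
    return Fraction(96 ** n_insertions, 2 ** 29)


if __name__ == "__main__":
    print(radial_integral_numeric(3, 0.8), radial_integral_closed_form(3, 0.8))
    for n_c in (10, 100, 10000):
        print(n_c, c1(n_c) / math.sqrt(n_c), 1.0 / (4.0 * math.sqrt(math.pi)))
    print(profile_prefactor(16) == 2 ** 51 * 3 ** 16)
    print(profile_prefactor(4) == Fraction(3 ** 4, 2 ** 9))
```

The ratio $c_1(N_c)/N_c^{1/2}$ approaches a constant as Nc grows, in agreement
with the scaling stated in (380).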

15.3 Non-minimal Correlation Functions

The minimal correlators considered above are not dominated by the contri-
bution of the one-instanton sector. Apart from ordinary perturbative correc-
tions, they receive contributions from K > 1 instanton configurations as well
as from perturbative fluctuations in each instanton sector. The non-minimal
correlation functions, in which the operator insertions soak up more than the
minimal number of fermion zero modes, have similar properties in this respect.
However, their calculation presents new complications that will now be illus-
trated with explicit examples. In general, one can distinguish two classes of
such non-minimal correlation functions, based on the features that differenti-
ate them from related minimal cases, i.e. those involving additional insertions
and those involving higher-dimensional operators.
• An example of the first type is the 20-point function
\[
G_{20}(x_1, \ldots, x_{20}) = \langle \Lambda^{A_1}_{\alpha_1}(x_1) \cdots \Lambda^{A_{16}}_{\alpha_{16}}(x_{16})\, E^{B_1C_1}(y_1) \cdots E^{B_4C_4}(y_4) \rangle , \tag{385}
\]
where $\Lambda^A_\alpha$ is defined in (373) and
\[
E^{BC} = \frac{1}{g^2}\, {\rm Tr} \left( -\lambda^{\alpha B} \lambda^C_\alpha + t^{(BC)}_{DEFGHL}\, \varphi^{DE} \varphi^{FG} \varphi^{HL} \right) . \tag{386}
\]
For the calculation of (385) in the semi-classical approximation only the first
term in (386) is relevant. Its contribution to the classical profile of $E^{BC}$ is
\[
\hat E^{BC} = -\frac{96}{g^2}\, \rho^4\, [f(x)]^4\, \zeta^{\alpha B} \zeta^C_\alpha - \frac{2}{g^2}\, \rho^2\, [f(x)]^3\, (\bar\nu^{Bu} \nu^C_u + \bar\nu^{Cu} \nu^B_u) . \tag{387}
\]
As explained in the previous subsection, the operator $\Lambda^A_\alpha$ does not
depend on the fermion modes of type $\nu^A$ and $\bar\nu^A$ and thus in evaluating
(385) we need to use for each $E^{BC}$ insertion the second term in (387), as the
$\Lambda^A_\alpha$ insertions already soak up all the superconformal modes. In
this way we get
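\[
G_{20}(x_1, \ldots, x_{20}) = \frac{1}{g^{40}} \int d\mu_{\rm phys}\, e^{-S_{\rm inst}}
\prod_{i=1}^{16} \Big[ 96\, \rho^4\, [f(x_i)]^4\, \zeta^{A_i}_{\alpha_i}(x_i) \Big]
\prod_{j=1}^{4} \Big[ 2\, \rho^2\, [f(y_j)]^3\, (\bar\nu^{(B_j} \nu^{C_j)}) \Big] . \tag{388}
\]
Using the generating function (366), this formula can be rewritten in the form
\[
\begin{aligned}
G_{20}(x_1, \ldots, x_{20}) = {} & \frac{2^{55}\, 3^{16}\, \pi^{-13}\, e^{2\pi i\tau}}{g^{32}\, (N_c-1)!\,(N_c-2)!}
\int d\rho\, d^4x_0\, d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \\
& \times \int_0^\infty dr\, r^{4N_c-3}\, e^{-2\rho^2 r^2}\, \rho^{4N_c-7}
\prod_{i=1}^{16} \Big[ \rho^4\, [f(x_i)]^4\, \zeta^{A_i}_{\alpha_i}(x_i) \Big] \prod_{j=1}^{4} \Big[ \rho^2\, [f(y_j)]^3 \Big] \\
& \times \left. \frac{\delta^8 Z(\vartheta, \bar\vartheta; \Omega, r)}{\delta\vartheta_{u_1(B_1}\, \delta\bar\vartheta^{u_1}_{C_1)} \cdots \delta\vartheta_{u_4(B_4}\, \delta\bar\vartheta^{u_4}_{C_4)}} \right|_{\vartheta = \bar\vartheta = 0} .
\end{aligned}
\tag{389}
\]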

After evaluating the derivatives and eliminating the sources the integral over
r can be performed and one gets

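\[
\begin{aligned}
G_{20}(x_1, \ldots, x_{20}) = {} & \frac{2^{57}\, 3^{16}\, \pi^{-17}\, c_2(N_c)\, e^{2\pi i\tau}}{g^{28}}
\int \frac{d\rho\, d^4x_0}{\rho^5} \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \\
& \times \prod_{i=1}^{16} \left[ \frac{\rho^4}{[(x_i - x_0)^2 + \rho^2]^4}\, \zeta^{A_i}_{\alpha_i}(x_i) \right]
\prod_{j=1}^{4} \left[ \frac{\rho^3}{[(y_j - x_0)^2 + \rho^2]^3} \right] \\
& \times \int d^5\Omega\, \left[ \Omega^{B_1C_2}\, \Omega^{B_2C_1}\, \Omega^{B_3C_4}\, \Omega^{B_4C_3} + \cdots \right] ,
\end{aligned}
\tag{390}
\]
where the ellipsis in the last line stands for permutations of the $B_i$, $C_i$ indices
and
\[
c_2(N_c) = \frac{2^{-2N_c}\, (N_c - 2)^2\, \Gamma(2N_c - 3)}{(N_c-1)!\,(N_c-2)!}
\;\underset{N_c \to \infty}{\sim}\; N_c^{1/2} \left( 1 - \frac{25}{8N_c} + O(1/N_c^2) \right) . \tag{391}
\]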

The factor of $(N_c - 2)^2$ in the numerator of $c_2(N_c)$ comes from the contraction
of colour indices in the $\vartheta_{Au}$ and $\bar\vartheta^u_A$ sources. The
integration over the five-sphere in (390) gives the SU(4) tensor
\[
t^{B_1C_1 \cdots B_4C_4} = \varepsilon^{B_1C_2B_2C_1}\, \varepsilon^{B_3C_4B_4C_3} + {\rm permutations} . \tag{392}
\]
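The powers of 2, 3, π and g in passing from (366) to (389) and then to (390) can
be tracked mechanically. The sketch below is only a bookkeeping aid under the
following assumptions: each pair of source derivatives acting on (367)
contributes one factor $g/(\pi r)$ (the phases cancel for four pairs), the
factor $2^{-2N_c}$ of the radial integral is absorbed into $c_2(N_c)$, and the
SU(4) and colour tensor structure is ignored. All names are hypothetical.

```python
def g20_power_counting():
    """Track exponents of 2, 3, pi and g for the 20-point function (385).

    Returns the exponents matching (389), those matching (390), and the net
    power of rho in (390) not counting the rho^4 attached to each Lambda.
    """
    # prefactor of the generating function (366): 2^(-29) pi^(-13) g^8
    exps = {"2": -29, "3": 0, "pi": -13, "g": 8}

    # sixteen Lambda profiles, each 96 / g^2 = 2^5 * 3 / g^2, see (375)
    exps["2"] += 16 * 5
    exps["3"] += 16
    exps["g"] -= 16 * 2

    # four E profiles, each 2 / g^2 from the second term of (387);
    # the four minus signs multiply to +1
    exps["2"] += 4
    exps["g"] -= 4 * 2
    after_389 = dict(exps)

    # four pairs of source derivatives of (367), each giving g / (pi r)
    pairs = 4
    exps["g"] += pairs
    exps["pi"] -= pairs

    # radial integral of r^(4Nc - 3 - pairs) exp(-2 rho^2 r^2):
    # 2^(-2Nc) * 2^(pairs/2) * rho^(2 - 4Nc + pairs) * Gamma(2Nc - 1 - pairs/2);
    # the 2^(-2Nc) and the Gamma function are part of c_2(N_c)
    exps["2"] += pairs // 2
    after_390 = dict(exps)

    # rho^(4Nc - 7) * rho^(2 - 4Nc + pairs) * (rho^2)^4 from the E profiles
    rho_power = -5 + pairs + 2 * 4
    return after_389, after_390, rho_power


if __name__ == "__main__":
    print(g20_power_counting())
```

The net power of ρ obtained in this way, 7, agrees with the factor
$\rho^{-5} \prod_j \rho^3$ displayed in (390).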

The main difference to be noted with respect to the minimal cases is the non-
trivial dependence on the angular variables parametrising the five-sphere. In
general, as in the above expression, the five-sphere integral factorises and gives
rise to SU (4) selection rules. Specifically, a correlation function can receive
a non-zero instanton contribution only if the SU (4) flavour indices carried
by the non-exact modes, ν A and ν̄ A , appear in a combination containing the
SU (4) singlet representation. We shall re-examine the results (390) and (391)
in Sect. 18.2 in connection with the corresponding processes in the dual string
theory. In particular, we will see that the calculation of non-minimal corre-
lators such as (385) leads to a puzzle: the N -dependence in the SYM result
does not agree with that of the amplitudes which are naturally identified as
their dual. The resolution of the puzzle will require taking into account further
types of contributions which do not arise in the minimal case (see Sect. 18.2).
From the previous example and the form of the generating function (366)
we can deduce some general features of non-minimal correlation functions. The
insertion of each (ν̄ A ν B ) bilinear in a correlator corresponds to two derivatives
of (367) with respect to the sources. This, besides producing a factor of g,
also modifies the r dependence of the integrand, thus affecting the overall Nc -
dependence, see (378). Moreover additional factors of Nc are associated with
the contraction of the colour indices carried by the νuA ’s and ν̄ Au ’s variables.
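From (366) and (367) one checks that in general the insertion of any
$(\bar\nu^A \nu^B)_{10}$ pair yields a factor g and the insertion of each
$(\bar\nu^A \nu^B)_{6}$ pair a factor $g \sqrt{N_c}$.
Schematically, for a generic non-minimal n-point correlation function
containing q $(\bar\nu^A \nu^B)_{10}$ factors and p $(\bar\nu^A \nu^B)_{6}$ bilinears
one finds
\[
\begin{aligned}
\langle O_1(x_1) \cdots O_n(x_n) \rangle \sim {} & g^{8+p+q}\, e^{2\pi i\tau}\, \alpha(N_c)
\int \frac{d\rho\, d^4x_0}{\rho^5}\, d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \\
& \times \rho^{p+q} \prod_{i=1}^{n} \tilde O_i(x_i; x_0, \rho, \eta, \bar\xi, \Omega) ,
\end{aligned}
\tag{393}
\]
where $\tilde O_i$ denote the profiles of the operators after the dependence on the
non-exact modes has been re-expressed in terms of the $\Omega^{AB}$'s of (368). For
future use we give the expression of the coefficient $\alpha(N_c)$ at large Nc
\[
\alpha(N_c) = \frac{2^{-2N_c}\, \Gamma\!\left( 2N_c - 1 - \frac{p+q}{2} \right)}{(N_c-1)!\,(N_c-2)!}\; N_c^{\,p + \frac{q}{2}} \left( 1 + O\!\left( \frac{1}{N_c} \right) \right)
\sim N_c^{\frac{1}{2} + \frac{p}{2}} \left( 1 + O\!\left( \frac{1}{N_c} \right) \right) . \tag{394}
\]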
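The large-Nc scaling stated in (394) can be verified numerically from the
leading expression for $\alpha(N_c)$. The sketch below estimates the exponent
as the slope of $\log \alpha$ against $\log N_c$ between two large values of Nc.
The function names are illustrative, and the $1 + O(1/N_c)$ corrections in (394)
are dropped, so only the leading power is tested.

```python
import math


def log_alpha(n_c: int, p: int, q: int) -> float:
    """Logarithm of the leading expression for alpha(N_c) in (394)."""
    return (-2 * n_c * math.log(2.0)
            + math.lgamma(2 * n_c - 1 - (p + q) / 2.0)
            - math.lgamma(n_c)          # log (N_c - 1)!
            - math.lgamma(n_c - 1)      # log (N_c - 2)!
            + (p + q / 2.0) * math.log(n_c))


def large_nc_exponent(p: int, q: int, n_c: int = 10 ** 5) -> float:
    """Estimate d log(alpha) / d log(N_c) from the values at n_c and 2 n_c."""
    return (log_alpha(2 * n_c, p, q) - log_alpha(n_c, p, q)) / math.log(2.0)


def coupling_power(p: int, q: int) -> int:
    """Power of g multiplying the integral in the schematic formula (393)."""
    return 8 + p + q


if __name__ == "__main__":
    for p, q in [(0, 0), (0, 4), (2, 0), (1, 3)]:
        print(p, q, large_nc_exponent(p, q), 0.5 + p / 2.0, coupling_power(p, q))
```

The estimated exponent depends only on p, the number of anti-symmetric
bilinears, as stated in (394); symmetric bilinears do not change the leading
power of Nc.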
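• All the operators in the supercurrent multiplet considered so far only
involve the $(\bar\nu^A \nu^B)_{10}$ bilinears. Anti-symmetric bilinears occur in
higher dimension operators such as those in multiplets having as lowest
component the scalars (346) with $\ell \geq 3$. An example of correlation function
containing such insertions is
\[
G_{16}(x_1, \ldots, x_{16}) = \langle \Lambda^{A_1}_{\alpha_1}(x_1) \cdots \Lambda^{A_{14}}_{\alpha_{14}}(x_{14})\, \tilde\Lambda^{B_1B_2B_3}_{\beta_1}(y_1)\, \tilde\Lambda^{C_1C_2C_3}_{\beta_2}(y_2) \rangle , \tag{395}
\]
where the operator $\tilde\Lambda^{B_1B_2B_3}_\alpha$, which belongs to the same
multiplet as $Q_{\ell=3}$ and transforms in the 20 of SU(4), is
\[
\begin{aligned}
\tilde\Lambda^{B_1B_2B_3}_\alpha = \frac{1}{g^3 N_c^{1/2}}\, {\rm Tr} \Big\{ & 2\lambda^{B_1}_\alpha \big( \lambda^{\beta B_2} \lambda^{B_3}_\beta + \lambda^{\beta B_3} \lambda^{B_2}_\beta \big)
+ \lambda^{B_2}_\alpha \big( \lambda^{\beta B_1} \lambda^{B_3}_\beta + \lambda^{\beta B_3} \lambda^{B_1}_\beta \big) \\
& + \lambda^{B_3}_\alpha \big( \lambda^{\beta B_1} \lambda^{B_2}_\beta + \lambda^{\beta B_2} \lambda^{B_1}_\beta \big) \\
& + F_{mn}\, \sigma^{mn}{}_\alpha{}^\beta \big( \{\lambda^{B_2}_\beta , \varphi^{B_1B_3}\} + \{\lambda^{B_3}_\beta , \varphi^{B_1B_2}\} \big) + \cdots \Big\} ,
\end{aligned}
\tag{396}
\]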

with the ellipsis referring to terms which are negligible in the semi-classical
approximation.
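The profile of the operator (396) in the one-instanton background is
\[
\hat{\tilde\Lambda}{}^{B_1B_2B_3}_\alpha = \frac{24}{g^3 N_c^{1/2}}\, \rho^4\, [f(x)]^5 \left[ \zeta^{B_2}_\alpha\, (\bar\nu^{[B_1} \nu^{B_3]}) + \zeta^{B_3}_\alpha\, (\bar\nu^{[B_1} \nu^{B_2]}) \right] . \tag{397}
\]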
g 3 Nc

The correlation function (395) is computed by replacing each inserted operator
with its classical profile and integrating over the moduli space. In particular,
the normalisation of the operators $\tilde\Lambda^{ABC}_\alpha$ is such as to
compensate the additional factors Nc associated with the $(\bar\nu^A \nu^B)_{6}$
bilinears and the final result behaves again like $N_c^{1/2}$ in the large-Nc
limit. Proceeding as in the previous cases, one finds
\[
\begin{aligned}
G_{16}(x_1, \ldots, x_{16}) = {} & \frac{c_3(N_c)\, 2^{48}\, 3^{16}\, e^{2\pi i\tau}}{\pi^{17}\, g^{24}}
\int \frac{d\rho\, d^4x_0}{\rho^5}\, d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \\
& \times \prod_{i=1}^{14} \left[ \frac{\rho^4}{[(x_i - x_0)^2 + \rho^2]^4}\, \zeta^{A_i}_{\alpha_i}(x_i) \right] \\
& \times \left[ \frac{\rho^5}{[(y_1 - x_0)^2 + \rho^2]^5}\, \zeta^{B_2}_{\beta_1}(y_1)\, \Omega^{B_1B_3}\;
\frac{\rho^5}{[(y_2 - x_0)^2 + \rho^2]^5}\, \zeta^{C_2}_{\beta_2}(y_2)\, \Omega^{C_1C_3} + \cdots \right] ,
\end{aligned}
\tag{398}
\]
where the · · · refers to symmetrisation in $(B_2, B_3)$ and $(C_2, C_3)$. The Nc
dependence is contained in the coefficient $c_3(N_c)$, where
\[
c_3(N_c) = \frac{2^{-2N_c}\, (N_c - 2)^2\, \Gamma(2N_c - 2)}{N_c\, (N_c-1)!\,(N_c-2)!} \;\underset{N_c \to \infty}{\sim}\; N_c^{1/2} . \tag{399}
\]
The integration over the five-sphere in this case gives a single ε-tensor
\[
\int d^5\Omega\; \Omega^{AB}\, \Omega^{CD} = \frac{\pi^3}{6}\, \varepsilon^{ABCD} . \tag{400}
\]
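The numerical coefficient in (400) can be cross-checked from the second moment
of the coordinates on the unit five-sphere, $\int d^5\Omega\, \Omega^i \Omega^j =
\frac{\pi^3}{6}\, \delta^{ij}$. The sketch below evaluates this moment by a simple
quadrature. It assumes that the symbols in (368) are normalised so that
$\bar\Sigma^{AB}_i \bar\Sigma^{CD}_i = \varepsilon^{ABCD}$; this normalisation is
an assumption made here for illustration and should be compared with the
conventions of the Appendix. The function names are hypothetical.

```python
import math


def sphere_volume(dim: int) -> float:
    """Volume of the unit sphere S^dim embedded in R^(dim + 1)."""
    n = dim + 1
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)


def second_moment_s5(steps: int = 2000) -> float:
    """Integral of (Omega^1)^2 over the unit five-sphere.

    Uses Omega^1 = cos(theta) and d^5 Omega = sin^4(theta) d theta d^4 Omega,
    with the polar angle integrated by the midpoint rule.
    """
    h = math.pi / steps
    polar = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        polar += math.cos(theta) ** 2 * math.sin(theta) ** 4
    return polar * h * sphere_volume(4)


if __name__ == "__main__":
    print(sphere_volume(5), math.pi ** 3)
    print(second_moment_s5(), math.pi ** 3 / 6.0)
```

Since the six coordinates are equivalent and their squares sum to one, the
second moment is one sixth of the volume $\pi^3$ of the five-sphere, which
reproduces the factor $\pi^3/6$ in (400) under the stated normalisation.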

The example of (395) allows us to illustrate another feature of non-minimal
correlators. Since not all the fields are employed to soak up the 16 exact
superconformal modes, there are contributions to the expectation value in
which pairs of fields are contracted with an instantonic propagator. In the
case of (395), for instance, it is possible to contract pairs of scalars in the
two $\tilde\Lambda^{ABC}_\alpha$ operators. Contributions of this type are of the same order in g
as those in which the extra insertions soak up ν A and ν̄ A modes, since with
the normalisations we are using (S ∝ 1/g 2 ) the scalar propagator is propor-
tional to g 2 . They are, however, sub-leading with respect to terms containing
(ν̄ A ν B )6 pairs at large Nc . The evaluation of the contributions with contrac-
tions is rather involved because they require the use of the propagator in the
instanton background, which has a complicated expression [20, 127]. We shall
not discuss further these effects, but we stress that they are essential for the
comparison with certain string theory amplitudes [123].
There are many other interesting examples of non-minimal correlation
functions in N = 4 SYM which could be discussed. For lack of space we
conclude this section with a brief list of some other notable cases, referring
the reader to the original literature for further details. A comprehensive study
of non-minimal correlators can be found in [123].
• A special class of correlation functions in N = 4 SYM are the so-called
extremal correlators. These are n-point functions of operators of the type
(346) in which the dimension, $\ell_1$, of one of the operators equals the sum of
the dimensions, $\ell_i$, i = 2, . . . , n, of the others
($\ell_1 = \sum_{i=2}^{n} \ell_i$). The analysis of the associated dual amplitudes
in supergravity led to the prediction that such correlation functions should not
be renormalised [128]. This was then confirmed by field theory calculations in
[129, 130]. In particular, an argument for the absence of instanton corrections
to extremal correlators, based on the analysis of fermion zero modes, was given
in [129]. Similar results have been shown to hold for next-to-extremal
correlation functions for which $\ell_1 = \sum_{i=2}^{n} \ell_i - 2$ [131]. A more
complicated class are the near extremal correlators, characterised by the
condition $\ell_1 = \sum_{i=2}^{n} \ell_i - m$ with m ≤ n − 3. These satisfy certain
partial non-renormalisation properties [132], which have been argued in [123] to
survive the inclusion of instanton corrections.
• The Wilson loop is a particularly important operator in non-abelian gauge
theories since it plays the rôle of order parameter characterising confine-
ment. In pure Yang–Mills theory the Wilson loop is the expectation value
\[ W[C] = \frac{1}{N_c}\,\mathrm{Tr}_{N_c}\, P \exp\left( i \oint_C dx^\mu A_\mu \right) \qquad (401) \]

of the holonomy associated with the closed contour C. A generalisation of

this quantity in N = 4 SYM has been constructed in [133] together with
a proposal for the dual quantity in string theory to be associated with it.
A special class of Wilson loops in N = 4 SYM are circular BPS loops,
which are annihilated by 16 linear combinations of Poincaré and special
supersymmetries. These Wilson loops are deﬁned as
\[ W[C_R] = \frac{1}{N_c}\,\mathrm{Tr}_{N_c}\, P \exp\left[ i \oint_{C_R} ds \left( A_\mu \dot{x}^\mu + i \varphi^i n^i |\dot{x}| \right) \right] , \qquad (402) \]

where ni is a constant unit vector on the ﬁve-sphere and CR is a circle of

radius R. An elegant method for computing instanton corrections to (402)
in the case of SU (2) gauge group was devised in [134]. The BPS Wilson
loop is a non-minimal correlator since it is non-polynomial in the ﬁelds. In
the SU (Nc ) case its calculation in the instanton background is a formidable
task and the SU (2) analysis of [134] has not been extended to this more
general case.
• In Sect. 15.2 we mentioned that certain results concerning instanton

corrections to the anomalous dimensions of gauge-invariant composite op-
erators can be obtained from the OPE analysis of four-point functions such
as (381). On the other hand, as discussed in Sect. 13, the anomalous dimen-
sions can be computed directly from two-point functions after resolving
the operator mixing. Depending on the bare dimension of the operators
one is considering, two-point functions can be minimal or non-minimal.
A systematic study of instanton contributions to two-point functions of
scalar operators was initiated in [135]. As discussed in Sect. 13, general
considerations, and in particular arguments based on S-duality, suggest
that generically anomalous dimensions in N = 4 SYM should receive both
perturbative and non-perturbative contributions. A rather surprising re-
sult found in [135] is the absence of instanton corrections to the majority
of scalar operators of bare dimensions Δ0 ≤ 5.
• Finally an important class of non-minimal correlation functions are those
relevant for the so-called BMN limit, which is the subject of Sect. 18.3.
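The extremality conditions listed above depend only on the operator dimensions, so they can be encoded in a few lines. The sketch below is a hypothetical helper (the name `classify` and the label "generic" are not from the text) that takes the dimensions Δ1, . . . , Δn and applies the three conditions in the order in which they are stated.

```python
def classify(dims):
    """Classify an n-point function from its dimensions [Delta_1, Delta_2, ..., Delta_n].

    m = sum_{i>=2} Delta_i - Delta_1 measures the distance from extremality.
    """
    n = len(dims)
    m = sum(dims[1:]) - dims[0]
    if m == 0:
        return "extremal"
    if m == 2:
        return "next-to-extremal"
    if 0 < m <= n - 3:
        return "near-extremal"
    return "generic"  # none of the special conditions applies
```

For instance, dimensions (6, 2, 2, 2) satisfy Δ1 = Δ2 + Δ3 + Δ4 and are classified as extremal.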

16 Generalisation to Multi-instanton Sectors

The generalisation of the analysis presented in the previous section to multi-

instanton sectors is technically very involved and requires the full machinery of
the ADHM construction [19]. A detailed description of this formalism and its
generalisation to supersymmetric theories, as well as references to the original
literature can be found in [6]. Due to space limits we shall only report an
important result of [121], where multi-instanton contributions to N = 4 SYM
correlation functions were explicitly evaluated in the large Nc limit.
In the generic K-instanton sector and with gauge group SU (Nc ) an in-
stanton configuration in pure Yang–Mills theory is characterised by 4KNc
collective coordinates parametrising a hyper-Kähler manifold, MK . A de-
scription of the moduli space associated with general (anti-)self-dual gauge
configurations can be given using the ADHM construction [19]. This is based
on the introduction of an overcomplete set of matrix-valued parameters on
MK , satisfying non-linear constraints, which can be shown to be equivalent
to the self-duality condition for the Yang–Mills field strength. The constraints
can be implemented describing the moduli space, MK , and the associated
metric by means of what is referred to as a hyper-Kähler quotient construc-
tion [136]. In the case of the SU (Nc ) N = 4 SYM theory there are also 8KNc
fermionic collective coordinates in the generic K instanton sector. These can
be included in the ADHM formalism as matrix-valued generalisations of the
collective coordinates introduced in the one-instanton sector, subject to suit-
able constraints.
As usual, the calculation of instanton contributions to correlation functions
involves the integration over the instanton moduli space. This can be formally
achieved integrating over the redundant set of bosonic and fermionic ADHM
matrices and imposing the constraints via δ-functions. However, as already

observed, the calculations are not feasible for generic K, since an explicit
solution to the constraints is not known.
In the case of N = 4 SYM, these calculations are also not particularly
enlightening since, in general, correlators receive non-vanishing contributions
from all instanton sectors. However, a dramatic simplification occurs in the
large Nc limit, making the calculation of K-instanton corrections to correla-
tion functions feasible for arbitrary K [121]. The reason for this simplification
is that in the large Nc limit the integration over the (super) moduli space
is dominated by a very special configuration and can be evaluated using a
saddle point approximation. After the introduction of a matrix generalisation
of the auxiliary variables χi , the saddle point that dominates the K-instanton
moduli space integration corresponds to a configuration in which the K in-
stantons have the same size and share the same location both in space–time
and in the five-sphere directions parametrised by the χi ’s.37 As in the one-
instanton sector only the 16 exact fermion zero modes associated with the
broken superconformal symmetries remain exact. The physical moduli space
integration measure obtained using the saddle point approximation is

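\[
\int d\mu^{(K)}_{\rm phys}\, e^{-S_{\rm inst}}
\;\underset{N_c\to\infty}{\longrightarrow}\;
\frac{N_c^{1/2}\, g^8\, e^{2\pi i K\tau}}{K^3\, 2^{17K^2/2-K/2+25}\, \pi^{9K^2/2+9}}
\int \frac{d^4x_0\, d\rho}{\rho^5}\; d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A \; Z_K , \qquad (403)
\]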

where ZK contains the integration over the ﬂuctuations around the saddle
point. These can be expressed in terms of [K] × [K] bosonic and fermionic
matrices, AM , M = 0, . . . , 9 and Ψr , r = 1, . . . , 16, by means of which the ZK
factor in (403) takes the form of the partition function of a SU (K) supersym-
metric matrix model38 , i.e.

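\[ Z_K = \frac{1}{\mathrm{Vol}\, SU(K)} \int d^{10}A\, d^{16}\Psi\; e^{-S(A,\Psi)} , \qquad (404) \]
where
\[ S(A,\Psi) = -\frac{1}{2}\,\mathrm{Tr}_K\left( [A_M,A_N]^2 + \bar\Psi\,[\slashed{A},\Psi] \right) . \qquad (405) \]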
The partition function ZK was computed in [80, 137] with the result
\[ Z_K = 2^{17K^2/2-K/2-8}\, \pi^{9K^2/2-9/2}\, K^{-1/2} \sum_{m|K} \frac{1}{m^2} , \qquad (406) \]

37
In the analysis of the ﬂuctuations around the saddle point it is also important
that, as far as the global gauge orientation is concerned, the K instantons lie in
mutually orthogonal SU (2) subgroups of SU (Nc ).
38
This is the dimensional reduction to zero dimensions of 10-dimensional N = 1
SYM.
where the sum is over the positive integer divisors of K.

Correlation functions of composite operators in the large Nc limit are
computed integrating their profiles in the K-instanton background with the
measure (403). In particular, if the operator profiles do not depend on the
collective coordinates parametrising the matrix model, the partition function
ZK factors out. This is the case for minimal correlation functions of gauge
invariant operators such as those considered in Sect. 15.2. As an example of
this type we consider the K-instanton contribution to (374). In the large Nc
limit the profile of the operator ΛAα in the K-instanton background does not
depend on the matrix model coordinates. It is proportional to its one-instanton
expression, namely
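\[ \hat\Lambda^A_\alpha\Big|_{K\text{-inst}} = \frac{96\,K}{g^2}\, \frac{\rho^4}{[(x-x_0)^2+\rho^2]^4}\, \zeta^A_\alpha \;\equiv\; K\, \hat\Lambda^A_\alpha\Big|_{1\text{-inst}} . \qquad (407) \]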

Therefore one ﬁnds [121]

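\[
\begin{aligned}
\langle \Lambda^{A_1}_{\alpha_1}(x_1) \cdots \Lambda^{A_{16}}_{\alpha_{16}}(x_{16}) \rangle_{K\text{-inst}}
&= \frac{N_c^{1/2}\, K^{25/2}\, 2^{47}\, 3^{16}\, \pi^{-27/2}\, e^{2\pi i K\tau}}{g^{24}}
\sum_{m|K} \frac{1}{m^2} \\
&\quad \times \int \frac{d\rho\, d^4x_0}{\rho^5}\; d^5\Omega \prod_{A=1}^{4} d^2\eta^A\, d^2\bar\xi^A
\prod_{i=1}^{16} \frac{\rho^4}{[(x_i-x_0)^2+\rho^2]^4}\, \zeta^{A_i}_{\alpha_i}(x_i) .
\end{aligned} \qquad (408)
\]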
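The numerical prefactor of (408) is the product of three ingredients: the prefactor of the measure (403), the matrix model partition function (406) and sixteen copies of the profile normalisation in (407). The sketch below tracks the exponents of K, 2, 3, π and g with exact fractions and checks that the K²-dependent pieces cancel. All function names are illustrative; the divisor sum is left out since it is carried over unchanged.

```python
from fractions import Fraction as F

N_OPERATORS = 16  # number of Lambda insertions in (408)

def measure_exponents(k: int) -> dict:
    """Exponents of K, 2, 3, pi and g in the prefactor of the large-Nc measure (403)."""
    return {"K": F(-3), "2": -(F(17, 2) * k * k - F(1, 2) * k + 25),
            "3": F(0), "pi": -(F(9, 2) * k * k + 9), "g": F(8)}

def matrix_model_exponents(k: int) -> dict:
    """Exponents carried by the partition function Z_K of (406), divisor sum excluded."""
    return {"K": F(-1, 2), "2": F(17, 2) * k * k - F(1, 2) * k - 8,
            "3": F(0), "pi": F(9, 2) * k * k - F(9, 2), "g": F(0)}

def profile_exponents() -> dict:
    """Exponents of a single operator profile (407): 96 K / g^2 = 2^5 * 3 * K * g^(-2)."""
    return {"K": F(1), "2": F(5), "3": F(1), "pi": F(0), "g": F(-2)}

def correlator_exponents(k: int) -> dict:
    """Total exponents: measure times Z_K times sixteen operator profiles."""
    m, z, p = measure_exponents(k), matrix_model_exponents(k), profile_exponents()
    return {name: m[name] + z[name] + N_OPERATORS * p[name] for name in m}

# Prefactor of (408): K^(25/2) 2^47 3^16 pi^(-27/2) g^(-24)
EXPECTED = {"K": F(25, 2), "2": F(47), "3": F(16), "pi": F(-27, 2), "g": F(-24)}
```

The result is independent of K, as it must be for the K dependence of (408) to reduce to K^{25/2} e^{2πiKτ} times the divisor sum.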

The calculation of multi-instanton contributions to non-minimal correlation

functions, even in the large Nc limit, is much more complicated. In the non-
minimal case the operator insertions depend on the one-instanton moduli
and also on the matrix model variables and thus one cannot simply factor
out ZK . It is natural to expect that in these cases, instead of the partition
function, the integration over the AM and Ψr variables should be related to
certain correlation functions in the matrix model giving rise to generalisations
of (406).

17 AdS/CFT Correspondence: a Brief Overview

As already mentioned, the recent renewed interest in N = 4 SYM stems from
the conjecture about the AdS/CFT correspondence [88, 89, 90]. In this section
we provide a brief overview of the main concepts at the basis of this conjecture
and in the following sections we review the rôle of instantons in this context.
The idea of the AdS/CFT correspondence was presented in [88] and a
more concrete formulation was given in [89, 90]. Reviews can be found in [91,
92]. In [88] Maldacena proposed a remarkable duality relation connecting two
completely diﬀerent theories, N =4 SYM with SU (Nc ) gauge group and type
IIB superstring theory in an AdS5 × S 5 background.
The type IIB superstring theory has N = (2, 0) supersymmetry in 10 di-

mensions, i.e. it is invariant under 32 supersymmetries. Its spectrum contains
a finite number of massless states and an infinite tower of massive states. The
massless spectrum is chiral. The bosonic degrees of freedom are divided into
the so-called Neveu–Schwarz–Neveu–Schwarz (NS–NS) and Ramond–Ramond
(R–R) sectors. The massless NS–NS sector contains a traceless rank-two sym-
metric tensor (the graviton, gM N , M, N = 0, . . . , 9), an anti-symmetric two-
form (BM N ) and a scalar (the dilaton, φ). The massless R–R sector contains
a scalar (C(0) ), an anti-symmetric two-form (CM N ) and an anti-symmetric
four-form (CM N P Q ), with self-dual field strength. The massless fermions are
the spin 1/2 dilatino (λ) and the spin 3/2 gravitino (ψM ). These are com-
plex Weyl spinors of opposite chiralities. The theory has two parameters, the
coupling constant, related to the v.e.v. of the dilaton, gs = e^φ, and the in-
verse string tension, α′. The latter sets the scale for the massive states in the
spectrum, which have masses proportional to 1/√α′.
The background relevant for the correspondence with N = 4 SYM, AdS5 ×
S 5 , is the product of a five-dimensional anti-de Sitter space and a five-sphere.
The non-compact factor, Lorentzian AdS5 , can be described as a hyperboloid
embedded in six dimensions, i.e. in terms of six Cartesian coordinates, X i ,
i = 0, . . . , 5, satisfying the constraint

\[ X_0^2 - X_1^2 - \cdots - X_4^2 + X_5^2 = L^2 \qquad (409) \]

where L is the (constant) radius of curvature. This deﬁnition immediately

shows that the AdS5 space has isometry group SO(2, 4). The so-called global
coordinates for AdS5 are introduced setting

\[ X_0 = L\cosh\rho\,\cos t , \qquad X_5 = L\cosh\rho\,\sin t , \]
\[ X_r = L\sinh\rho\;\Omega_r , \quad r = 1,2,3,4 , \qquad \sum_r \Omega_r^2 = 1 . \qquad (410) \]

In terms of these coordinates the metric reads

\[ ds^2 = L^2\left( -\cosh^2\rho\, dt^2 + d\rho^2 + \sinh^2\rho\, d\Omega^2 \right) . \qquad (411) \]
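The step from the embedding (410) to the metric (411) can be verified numerically. The sketch below assumes, as is standard but not stated explicitly in the text, that the six-dimensional embedding space carries the flat metric diag(−,+,+,+,+,−) in the coordinates (X0, X1, . . . , X4, X5). It checks the constraint (409) and extracts three metric components by finite differences. The radius value, the angle parametrisation of the three-sphere and all names are arbitrary choices.

```python
import math

L = 1.7  # AdS radius; the numerical value is an arbitrary choice for this check

def embed(rho, t, angles):
    """Embedding coordinates (X0, X1, ..., X4, X5) of (410) for a point of AdS5."""
    a1, a2, a3 = angles  # hyperspherical angles on the unit three-sphere
    omega = (
        math.cos(a1),
        math.sin(a1) * math.cos(a2),
        math.sin(a1) * math.sin(a2) * math.cos(a3),
        math.sin(a1) * math.sin(a2) * math.sin(a3),
    )
    x0 = L * math.cosh(rho) * math.cos(t)
    x5 = L * math.cosh(rho) * math.sin(t)
    return [x0] + [L * math.sinh(rho) * w for w in omega] + [x5]

def hyperboloid(x):
    """Left-hand side of the constraint (409)."""
    return x[0] ** 2 - sum(v * v for v in x[1:5]) + x[5] ** 2

def ambient_norm(v):
    """Norm in the assumed flat metric diag(-,+,+,+,+,-) of the embedding space."""
    return -v[0] ** 2 + sum(c * c for c in v[1:5]) - v[5] ** 2

def tangent(rho, t, angles, direction, h=1e-5):
    """Central-difference derivative of the embedding along rho, t or the first angle."""
    def shifted(s):
        r, tt, a = rho, t, list(angles)
        if direction == "rho":
            r += s
        elif direction == "t":
            tt += s
        else:
            a[0] += s
        return embed(r, tt, a)
    plus, minus = shifted(h), shifted(-h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]
```

The norms of the three tangent vectors reproduce the coefficients L², −L² cosh²ρ and L² sinh²ρ of dρ², dt² and dΩ² in (411).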

Another convenient set of coordinates for AdS5 are the so-called Poincaré
coordinates, (zμ , z0 ). The z0 coordinate parametrises the radial direction of
AdS5 and the four zμ coordinates parametrise the directions parallel to the
boundary located at z0 = 0. In terms of these coordinates the metric is
\[ ds^2 = \frac{L^2}{z_0^2}\left( dz_\mu^2 + dz_0^2 \right) . \qquad (412) \]

The AdS5 × S 5 space is maximally supersymmetric if the two factors have

the same radius of curvature, L. The non-vanishing components of the Ricci
tensor are
\[ R_{mn} = -\frac{4}{L^2}\, g_{mn} , \qquad R_{ab} = \frac{4}{L^2}\, g_{ab} , \qquad (413) \]
where the indices m, n span the AdS5 directions and the indices a, b the S 5
directions. Moreover the self-dual R–R five-form field strength has a non-
vanishing background value
\[ F_{mnpqr} = \frac{1}{L}\,\varepsilon_{mnpqr} , \qquad F_{abcde} = \frac{1}{L}\,\varepsilon_{abcde} . \qquad (414) \]
The conjectured duality has a holographic nature in that it relates the
physics described by the string theory in the bulk of AdS5 × S 5 to that of a
gauge theory, N = 4 SYM, living on the four-dimensional boundary of AdS5 .
The first ingredient of the correspondence is a dictionary relating the pa-
rameters of the two theories. In N = 4 SYM the parameters are the coupling,
g, and the rank of the gauge group. In the string theory, besides the coupling
constant, gs , and the inverse string tension, α′, the radius of curvature, L, of
the AdS5 and S 5 spaces enters as an additional dimensionful parameter. The
relations among the gauge and string theory parameters are

\[ g^2 = 4\pi g_s , \qquad L^4 = 4\pi g_s\, \alpha'^2 N_c . \qquad (415) \]

The second equation can be used to relate the dimensionless ratio L⁴/α′² to
the ’t Hooft coupling, λ = g² Nc ,

\[ \frac{L^4}{\alpha'^2} = \lambda . \qquad (416) \]
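As a small illustration of how the dictionary is used, the sketch below maps the gauge theory parameters (g, Nc) to the string theory quantities gs and L⁴/α′² via (415) and confirms the relation (416). The class and function names are hypothetical and serve only this example.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class GaugeTheory:
    g: float    # Yang-Mills coupling
    n_c: int    # number of colours of SU(Nc)

    @property
    def t_hooft(self) -> float:
        """'t Hooft coupling lambda = g^2 Nc."""
        return self.g ** 2 * self.n_c

def to_string_side(theory: GaugeTheory):
    """Return (g_s, L^4 / alpha'^2) following the two relations in (415)."""
    g_s = theory.g ** 2 / (4.0 * math.pi)
    radius4_over_alpha2 = 4.0 * math.pi * g_s * theory.n_c
    return g_s, radius4_over_alpha2
```

For g = 0.5 and Nc = 100 the second entry equals g² Nc = 25, in agreement with (416).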
The ϑ-angle, which can be turned on in the gauge theory, is related to the
expectation value of the R–R scalar
\[ \frac{\vartheta}{2\pi} = C_{(0)} . \qquad (417) \]
Given this dictionary for the parameters of the two theories, the correspon-
dence is formulated in terms of two additional basic ingredients:
• A map between the fundamental degrees of freedom of the two theories.
• A prescription for the computation of the observables of one theory in terms
of those of the other.
The map between degrees of freedom is dictated by the symmetries. The (su-
per)isometries of the string background, under which the states in the string
spectrum are classified, coincide with the (super)group of global symmetries
of the gauge theory, which, as already discussed, is P SU (2, 2|4). The dual-
ity associates states in the string spectrum with gauge-invariant composite
operators in N = 4 SYM, which have the same quantum numbers under
the SO(2, 4) × SO(6) maximal bosonic subgroup of P SU (2, 2|4). Specifically,
(1) the Lorentz quantum numbers are identified, (2) the masses of the string
states are related to the scaling dimensions of the dual operators and (3) the
SO(6) quantum numbers arising in the Kaluza–Klein (KK) reduction on S 5
of the string theory are related to the Dynkin labels characterising the trans-
formation of the dual gauge theory operators under the SU (4) R-symmetry.
Supersymmetry then implies that entire multiplets are related. The simplest
case of this relation is represented by the correspondence between the super-
gravity multiplet, which contains the graviton and its superpartners, and the
N = 4 supercurrent multiplet discussed in Sect. 13.
The prescription relating observables on the two sides of the correspon-
dence is based on the identification of properly defined partition functions.
The string partition function in AdS5 × S 5 is a functional of the boundary
values of the fields. The latter play the rôle of sources for the dual operators in
the boundary gauge theory [89, 90] and one is led to propose the holographic
formula

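\[ Z_{\rm IIB}\big[\Phi|_{\partial{\rm AdS}} = J\big] = \int [dA][d\lambda][d\bar\lambda][d\varphi]\, \exp\left( -S_{\mathcal{N}=4} + \int \mathcal{O}_\Phi\, J \right) . \qquad (418) \]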

Here Φ denotes a generic ﬁeld in the string theory and OΦ is the dual composite
operator in N = 4 SYM according to the map previously described.
The quantisation of string theory in an AdS5 ×S 5 background is not under-
stood well enough to make really operative use of (418). However, interesting
results can be obtained in certain limits. In particular in the weak coupling
and small curvature limit on the gravity side, where

\[ g_s \ll 1 , \qquad \frac{L^2}{\alpha'} \gg 1 , \qquad (419) \]
classical supergravity becomes a good approximation. Based on the dictio-
nary (415), this limit corresponds to the limit of large Nc and large ’t Hooft
coupling, λ, in the gauge theory. Since in the Nc → ∞ limit λ plays effectively
the rôle of coupling constant, one obtains a duality between classical type IIB
supergravity in AdS5 × S 5 and the strong coupling limit of N = 4 SYM in the
planar approximation. This observation illustrates the strong/weak nature of
the duality, which on the one hand makes it difficult to test, but on the other
makes it a powerful tool for the study of strongly coupled gauge theories.
In the limit (419) the IIB partition function in (418) is well approximated
by
\[ Z_{\rm IIB}\big[\Phi|_{\partial{\rm AdS}} = J\big] \sim e^{-S_{\rm IIB}[\Phi|_{\partial{\rm AdS}} = J]} , \qquad (420) \]
where SIIB is the classical type IIB supergravity action in the AdS5 × S 5
background.
In this limit the relation (418) has a simple and intriguing interpreta-
tion. Correlation functions in the gauge theory are obtained taking functional
derivatives with respect to the sources on the r.h.s. of (418). Using the ap-
proximation (420), one finds that differentiating with respect to the sources
is equivalent to solving the supergravity equations of motion with boundary
conditions Φ|∂AdS = J. Therefore the correspondence states that an n-point
correlation function, ⟨O1 (x1 ) · · · On (xn )⟩, in N = 4 SYM is equal to an am-
plitude in which, for each Oi insertion, the dual supergravity state, Φi , is
propagated from the bulk to the boundary point xi . An intuitive graphical
representation of this prescription was proposed in [90].39 Figure 1 repre-
sents the supergravity amplitudes contributing to the process dual to a SYM
four-point function, ⟨O1 (x1 ) · · · O4 (x4 )⟩. The interior of the circle in Fig. 1
represents the bulk of AdS5 × S 5 and the circle itself is the four-dimensional
boundary where the gauge theory lives.

Fig. 1. Contact and exchange contributions to a four-point amplitude in AdS5 × S 5
A normalisable solution of the free supergravity equations of motion sat-
isfying the boundary condition Φ|∂AdS = J can be written as

\[ \Phi^{(i)}(z_\mu, z_0) = \int d^4x\; K_i(z_\mu, z_0; x^{(i)}_\mu)\, J_\Phi(x^{(i)}_\mu) , \qquad (421) \]

where the function Ki (zμ , z0 ; xμ^(i) ) is a so-called bulk-to-boundary propagator,
i.e. the kernel that allows one to express a supergravity field, Φ, at the bulk point
(zμ , z0 ) in terms of its boundary value, JΦ , at (zμ = xμ^(i) , z0 = 0). Substitut-
ing the solution (421) into the generating functional (420) one obtains the
following expressions for the two contributions in Fig. 1:

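\[ A_{\rm cont}(x_1,\ldots,x_4) = N_c^2 \int \frac{d^4z\, dz_0}{z_0^5}\; d^5\omega \prod_{i=1}^{4} K_i(z_\mu, z_0; x^{(i)}_\mu) , \qquad (422) \]

\[
\begin{aligned}
A_{\rm exc}(x_1,\ldots,x_4) &= N_c^2 \int \frac{d^4z\, dz_0}{z_0^5}\; d^5\omega \sum_m \prod_{i=1}^{2} \prod_{j=3}^{4} K_i(z_\mu, z_0; x^{(i)}_\mu) \\
&\quad \times G_m(z,w)\, K_j(z_\mu, z_0; x^{(j)}_\mu) ,
\end{aligned} \qquad (423)
\]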

where Gm (z, w) in the second amplitude, corresponding to the exchange di-

agram, represents a bulk-to-bulk propagator in AdS5 and the index m runs
over the set of all allowed intermediate states. In (422) and (423) we have
39
In the following we shall refer to processes of this type as scattering amplitudes
in AdS.
used Poincaré coordinates, (zμ , z0 ), to parametrise AdS5 and angles, ωi , for

the ﬁve-sphere. In terms of these parameters the AdS5 × S 5 metric becomes

\[ ds^2 = \frac{L^2}{z_0^2}\left( dz_\mu^2 + dz_0^2 + z_0^2\, d\omega_5^2 \right) . \qquad (424) \]

In (422) and (423) the overall factor of Nc² is obtained rewriting the coefficient
in front of the IIB supergravity action in the string frame, namely L⁸/(α′⁴ gs²),
in terms of Yang–Mills parameters using (415).
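The counting behind this factor can be made explicit. Squaring the second relation in (415) and dividing by gs² gives the short chain below; it is a consistency check written out here rather than a formula quoted from the text, and the numerical factor 16π² is absorbed in the overall normalisation.

```latex
\[
\frac{L^8}{\alpha'^4 g_s^2}
= \frac{1}{g_s^2}\left(\frac{L^4}{\alpha'^2}\right)^2
= \frac{(4\pi g_s N_c)^2}{g_s^2}
= 16\pi^2 N_c^2 .
\]
```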

In the next sections we shall discuss the inclusion of instanton eﬀects in
this picture. In Sect. 18.3 we shall consider another notable limit, i.e. the
BMN limit, in which the string theory ↔ ﬁeld theory correspondence is under
control beyond the supergravity approximation.

18 Instanton Eﬀects in the AdS/CFT Duality

In the AdS/CFT correspondence, the effects of Yang–Mills instantons in
N = 4 SYM are related to non-perturbative effects induced by D-instantons
in the dual IIB string theory [138]. In the low-energy supergravity limit of
string theory, D-instantons arise as non-trivial solutions of the Euclidean field
equations. In the 10-dimensional Euclidean space they correspond to config-
urations in which the metric (in the Einstein frame) is flat and the dilaton
and the R–R scalar have non-constant profiles while all the other fields van-
ish. Like ordinary Yang–Mills instantons, the supergravity D-instantons are
characterised by their integer-valued charge. The supergravity action evalu-
ated on a charge-K D-instanton configuration is proportional to K, as in the
Yang–Mills case, and inversely proportional to the string coupling, gs . The
D-instanton solution of the type IIB field equations in AdS5 × S 5 has similar
properties and can be obtained from the flat space solution [124]. In string
theory D-instantons are identified with D(−1)-branes, i.e. point-like objects in
Euclidean 10-dimensional space. Their world-volume is zero-dimensional and
therefore open strings ending on D(−1)-branes carry no propagating degrees of
freedom. They describe instead zero-modes associated with the D-instantons
as discussed in Sect. 11. D-branes, and D-instantons in particular, can also be
described in terms of closed string modes as collective excitations using the
so-called boundary state formalism. This will be utilised in a special case in
Sect. 18.3.
The discussion of the general principles of the AdS/CFT correspondence
in Sect. 17 and specifically the fundamental relation (418) indicate that, in or-
der to make contact with the calculation of instanton contributions to N = 4
correlation functions, one should study D-instanton induced contributions to
string scattering amplitudes in AdS5 × S 5 . In principle, this involves includ-
ing in the genus expansion of the closed string amplitudes the contribution of
world-sheets with boundaries associated with the presence of D(−1)-branes.
However, as already explained, in the AdS5 × S 5 background such calcula-

tions are not under control and one is restricted to a low-energy supergravity
analysis. In the supergravity approximation, the inclusion of the effect of D-
instantons requires a refinement of (420) in which the classical supergravity
action is replaced by the low-energy effective action which incorporates the
effect of the infinite tower of massive string excitations on the dynamics of
the massless modes.

18.1 The Type IIB Eﬀective Action

The type IIB string theory effective action is expressed as a power series in
the inverse string tension, α′. It takes the form

\[ S^{\rm eff}_{\rm IIB} = \frac{1}{\alpha'^4}\left[ S^{(0)} + \alpha'^3 S^{(3)} + \alpha'^4 S^{(4)} + \cdots + \alpha'^r S^{(r)} + \cdots \right] , \qquad (425) \]
where S (0) denotes the classical action and the subsequent terms contain
higher derivative couplings, which receive D-instanton contributions. The in-
clusion of such vertices in supergravity amplitudes in AdS5 × S 5 gives rise
to contributions which are in correspondence with the correlation functions
discussed in Sects. 15 and 16.
The form of (425) is in principle determined by supersymmetry. The terms
appearing in the leading correction, S (3) , have been extensively studied. The
couplings arising at this level include the well known R4 term and a large
number of other terms related to it by supersymmetry. Schematically, in the
string frame, the form of S (3) is

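\[
\begin{aligned}
\alpha'^3 S^{(3)} = \frac{1}{\alpha'} \int d^{10}X \sqrt{-g}\; e^{-\phi/2} \Big[ & f_1^{(0,0)}(\tau,\bar\tau) \left( R^4 + (G\bar G)^4 + \cdots \right) + \cdots \\
& + f_1^{(8,-8)}(\tau,\bar\tau) \left( G^8 + \cdots \right) + \cdots + f_1^{(12,-12)}(\tau,\bar\tau)\, \lambda^{16} \Big] .
\end{aligned} \qquad (426)
\]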

The precise form of many of these couplings has been determined, see for
instance [139] where the R4 coupling was studied. In the following we shall
further discuss certain vertices which are relevant for the comparison with
the Yang–Mills calculations of the previous sections. The coefficients in (426)
are functions of the complex scalar, τ = τ1 + iτ2 = C(0) + ie−φ , where φ is
the dilaton and C(0) the R–R scalar. The effective action is invariant under
SL(2, Z) transformations acting on τ as
\[ \tau \to \frac{a\tau + b}{c\tau + d} , \qquad (427) \]
where the integers a, b, c, d satisfy ad − bc = 1. Under such transformations
any supergravity field, Φ, acquires a phase
\[ \Phi \to \left( \frac{c\tau + d}{c\bar\tau + d} \right)^{q_\Phi} \Phi , \qquad (428) \]
where qΦ is the charge of Φ under the local U (1) symmetry of (425) which
also rotates the two chiral supersymmetries [140]. In particular, the metric
and the IIB self-dual five-form are not charged, the complex combination
G(3) = (τ dB(2) + dC(2) )/√τ2 (where B(2) and C(2) are the NS–NS and R–R
two forms) has charge 1, the fluctuation of the complex scalar, δτ ≡ τ̂ , has
charge 2, the dilatino, λ, and the gravitino, ψM , have charge 3/2 and 1/2,
respectively. The coefficient functions in (426) transform as modular forms
with holomorphic and anti-holomorphic weights (w, −w), so that
\[ f_1^{(w,-w)}(\tau,\bar\tau) \to \left( \frac{c\tau+d}{c\bar\tau+d} \right)^{w} f_1^{(w,-w)}(\tau,\bar\tau) . \qquad (429) \]

Invariance under SL(2, Z) requires that the weight w of the modular form in
each term in the eﬀective action be equal to half the sum of the U (1) charges
of the ﬁelds in the vertex.
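The weight rule just stated is easy to apply mechanically. The sketch below tabulates the U(1) charges quoted above and computes the weight w required for a given vertex; it also implements the Möbius action (427) and the phase factor that appears in (428) and (429). The dictionary keys and function names are illustrative, and the charge −1 assigned to the conjugate field Ḡ is an assumption made here (it is consistent with (GḠ)⁴ multiplying a weight-zero form).

```python
from fractions import Fraction as F

# U(1) charges quoted in the text; the charge -1 of the conjugate field Gbar is an assumption.
U1_CHARGE = {
    "R": F(0), "F5": F(0), "G": F(1), "Gbar": F(-1),
    "tau_hat": F(2), "lambda": F(3, 2), "psi": F(1, 2),
}

def modular_weight(vertex: dict) -> F:
    """Weight w required for a vertex, given as {field: power}: half the total U(1) charge."""
    return sum((U1_CHARGE[f] * n for f, n in vertex.items()), F(0)) / 2

def sl2z_action(tau: complex, a: int, b: int, c: int, d: int) -> complex:
    """Moebius transformation (427); (a, b, c, d) must have unit determinant."""
    if a * d - b * c != 1:
        raise ValueError("ad - bc must equal 1")
    return (a * tau + b) / (c * tau + d)

def u1_phase(tau: complex, c: int, d: int) -> complex:
    """The factor (c tau + d) / (c taubar + d) appearing in (428) and (429)."""
    return (c * tau + d) / (c * tau.conjugate() + d)
```

With these charges the λ¹⁶ vertex requires w = 12 and the R⁴ vertex w = 0, matching the modular forms that multiply them in (426). The factor returned by `u1_phase` has unit modulus, so the transformation (428) is a pure phase.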
The modular forms f1^{(w,−w)}(τ, τ̄) are obtained acting on f1^{(0,0)}(τ, τ̄) with
modular covariant derivatives

\[ f_1^{(w,-w)}(\tau,\bar\tau) = D_{w-1} D_{w-2} \cdots D_0\, f_1^{(0,0)}(\tau,\bar\tau) , \qquad (430) \]

where Dw = τ2 ∂/∂τ − i w/2.
The modular form in front of the R⁴ term, f1^{(0,0)}(τ, τ̄), is given by a non-
holomorphic Eisenstein series

\[ f_1^{(0,0)}(\tau,\bar\tau) = \sum_{(m,n)\neq(0,0)} \frac{\tau_2^{3/2}}{|m+n\tau|^3} . \qquad (431) \]

It can be expanded in Fourier modes as

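\[
\begin{aligned}
f_1^{(0,0)}(\tau,\bar\tau) &= \sum_{K=-\infty}^{\infty} F^1_K(\tau_2)\, e^{2\pi i K\tau_1}
= 2\zeta(3)\,\tau_2^{3/2} + \frac{2\pi^2}{3}\,\tau_2^{-1/2} \\
&\quad + 4\pi \sum_{K\neq 0} |K|^{1/2}\, \mu(K,1)\, e^{-2\pi(|K|\tau_2 - iK\tau_1)} \sum_{j=0}^{\infty} \frac{\Gamma(j-1/2)}{\Gamma(-j-1/2)\, j!}\, (4\pi K\tau_2)^{-j} ,
\end{aligned} \qquad (432)
\]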

where the r.h.s. is the result of a further weak coupling (large τ2 ) expansion.
The non-zero Fourier modes are interpreted as D-instanton contribu-
tions with instanton number K (K > 0 terms are D-instanton contributions
while K < 0 terms are anti-D-instanton contributions). The measure factor,
μ(K, 1), is
\[ \mu(K,1) = \sum_{m|K} \frac{1}{m^2} , \qquad (433) \]

where the sum is over the positive integer divisors of K. The coefficients
of the D-instanton terms in (432) include an infinite series of perturbative
fluctuations around any charge-K D-instanton. The leading term in this series
is the one of relevance for the comparison with the semi-classical Yang–Mills
instanton calculations. In the case of f1^{(0,0)}(τ, τ̄) this term is independent of τ2.
From (430) it follows that the leading D-instanton term in the modular form
f1^{(w,−w)}(τ, τ̄) behaves as τ2^w = gs^{−w}. The zero D-instanton term, F^1_0, contains
only two power-behaved contributions that arise in string perturbation theory
as tree-level and one-loop contributions, with no higher-loop terms.
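Two ingredients of the expansion (432) are simple enough to tabulate directly: the measure factor (433) and the coefficients of the perturbative fluctuations around a D-instanton. The sketch below computes both; the function names and the optional `power` argument are conveniences introduced here.

```python
import math
from fractions import Fraction

def mu(k: int, power: int = 2) -> Fraction:
    """Divisor sum over the positive divisors m of |K| of m^(-power); power = 2 gives (433)."""
    k = abs(k)
    return sum((Fraction(1, m ** power) for m in range(1, k + 1) if k % m == 0), Fraction(0))

def fluctuation_coefficient(j: int) -> float:
    """Gamma(j - 1/2) / (Gamma(-j - 1/2) j!), the coefficient of (4 pi K tau2)^(-j) in (432)."""
    return math.gamma(j - 0.5) / (math.gamma(-j - 0.5) * math.factorial(j))
```

The j = 0 coefficient equals 1, so the leading term of each D-instanton contribution to f1^{(0,0)} is independent of τ2, as stated above. The same divisor sum appears in the matrix model partition function (406).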
Much less is known about higher-order terms beyond S (3) in the string
eﬀective action, but various terms in S (5) are known and certain classes of
terms at higher orders have been studied. Among the interactions at order
α′⁵ we have the following:

\[
\begin{aligned}
\alpha'^5 S^{(5)} = \alpha' \int d^{10}X \sqrt{-g}\; e^{\phi/2} \Big[ & f_2^{(0,0)}(\tau,\bar\tau)\, D^4R^4 + f_2^{(2,-2)}(\tau,\bar\tau)\, G^4R^4 \\
& + f_2^{(12,-12)}(\tau,\bar\tau)\, R^2\lambda^{16} + \cdots \Big] .
\end{aligned} \qquad (434)
\]
The modular forms, f2^{(w,−w)}(τ, τ̄), appearing here are generalisations of those
previously defined. More generally at higher orders in the α′ expansion one
expects modular forms of the type

\[ f_l^{(0,0)}(\tau,\bar\tau) = \sum_{(m,n)\neq(0,0)} \frac{\tau_2^{\,l+1/2}}{|m+n\tau|^{2l+1}} . \qquad (435) \]

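All these functions satisfy relations similar to (430). The weak coupling ex-
pansion of f2^{(0,0)}(τ, τ̄) is

\[
\begin{aligned}
f_2^{(0,0)}(\tau,\bar\tau) &= 2\zeta(5)\,\tau_2^{5/2} + \frac{4\pi^4}{135}\,\tau_2^{-3/2} \\
&\quad + \frac{8\pi^2}{3} \sum_{K\neq 0} |K|^{3/2}\, \mu(K,2)\, e^{-2\pi(|K|\tau_2 - iK\tau_1)} \left( 1 + \frac{3}{16\pi K}\,\tau_2^{-1} + \cdots \right) ,
\end{aligned} \qquad (436)
\]

where μ(K, 2) = Σ_{m|K} 1/m⁴.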
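The structure of (436) can be tested against the defining lattice sum (435) with l = 2, which converges quickly enough for a brute-force evaluation. The sketch below compares a truncated lattice sum with the two power-behaved terms and with the leading D-instanton term of (436). The cutoffs and function names are arbitrary choices, and sub-leading corrections in 1/τ2 are deliberately left out.

```python
import math

def lattice_sum(tau: complex, l: int, cutoff: int = 100) -> float:
    """Truncated version of (435): sum over (m, n) != (0, 0) with |m|, |n| <= cutoff."""
    total = 0.0
    for n in range(-cutoff, cutoff + 1):
        for m in range(-cutoff, cutoff + 1):
            if m == 0 and n == 0:
                continue
            total += abs(m + n * tau) ** (-(2 * l + 1))
    return tau.imag ** (l + 0.5) * total

def zeta(s: int, terms: int = 20000) -> float:
    """Riemann zeta function by direct summation (adequate for s = 5)."""
    return sum(k ** (-s) for k in range(1, terms + 1))

def perturbative_f2(tau2: float) -> float:
    """The two power-behaved terms of (436)."""
    return 2.0 * zeta(5) * tau2 ** 2.5 + 4.0 * math.pi ** 4 / 135.0 * tau2 ** (-1.5)

def leading_instanton_f2(tau: complex, kmax: int = 3) -> float:
    """Leading term of the K != 0 sum in (436); K and -K combine into a cosine."""
    total = 0.0
    for k in range(1, kmax + 1):
        mu2 = sum(1.0 / m ** 4 for m in range(1, k + 1) if k % m == 0)
        total += (2.0 * k ** 1.5 * mu2 * math.exp(-2.0 * math.pi * k * tau.imag)
                  * math.cos(2.0 * math.pi * k * tau.real))
    return 8.0 * math.pi ** 2 / 3.0 * total
```

At τ = 2i the instanton term is already exponentially small, yet including it brings the expansion visibly closer to the lattice sum than the perturbative terms alone.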
In the next subsection we shall discuss how the D-instanton induced terms
appearing in the IIB effective action are related to instanton contributions to
N = 4 correlation functions. In the analysis of processes dual to non-minimal
correlators it will also be important to include the effect of the fluctuations,
τ̂ , of the complex scalar in the modular forms fl^{(w,−w)}(τ, τ̄). For instance, re-
writing the complex scalar as τ = τ0 + τ̂ (where τ0 is the constant background
value of τ ), the expansion of the D-instanton exponential factor in f1^{(0,0)}(τ, τ̄)
gives rise to a series of the form

\[ e^{2\pi i K\tau} = e^{2\pi i K\tau_0} \sum_r \frac{(2\pi i K)^r}{r!}\, \hat\tau^r . \qquad (437) \]

Equations (437) and (426) show that at order α′⁻¹ in the string low-energy
action there are eﬀective vertices of the form τ̂ r R4 , which can contribute to
scattering amplitudes in the AdS5 × S 5 background.
18.2 D-instantons in AdS5 × S 5 and Comparison with Yang–Mills Instantons

The discussion in the previous subsection provides the background necessary

to analyse the processes dual to the correlation functions computed in Sects. 15
and 16. These are dual to supergravity amplitudes involving the D-instanton
induced vertices in the IIB effective action. In order to make contact with N =
4 SYM, one needs to specialise the general expressions of the vertices in (426)
to the case of the AdS5 × S 5 background. For this purpose, we shall expand
the 10-dimensional supergravity fields in harmonics on the five-sphere [141]
\[ \Phi(X) = \sum_{\ell} \Phi^{(\ell)}_I(z)\, Y^{(\ell)}_I(\omega) , \qquad (438) \]

where the Y_I^{(ℓ)}(ω)’s are spherical harmonics, with ℓ denoting the level and
I a set of SO(6) indices. After expanding the supergravity fields in S^eff_IIB in
this way the amplitudes dual to SYM correlators are computed using the
prescription described in Sect. 17.
In studying AdS amplitudes we distinguish again between minimal and
non-minimal cases, characterising an amplitude as (non-)minimal if it is dual
to a (non-)minimal Yang–Mills correlator.

Minimal AdS Amplitudes

The simplest minimal amplitude is the one dual to the 16-point correlation
function (374). The operator ΛA α in (373) is dual to the type IIB dilatino,
λ, and thus according to the prescription explained in Sect. 17 we need to
consider an amplitude with 16 dilatini propagating to the boundary. The
vertex in the effective action which contributes to such a process is

\[ \frac{1}{\alpha'} \int d^{10}X \sqrt{-g}\; e^{-\phi/2}\, f_1^{(12,-12)}(\tau,\bar\tau)\, t_{16}\, \lambda^{16} , \qquad (439) \]
where t16 is a 16-index anti-symmetric tensor contracting the spinor indices of
the 16 dilatini. The amplitude dual to (374) involves the leading D-instanton
term in (439) (see (430)–(432)), i.e.
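\[ \frac{1}{\alpha'} \int d^{10}X \sqrt{-g} \sum_{K>0} \sum_{m|K} \frac{1}{m^2}\; 2^{14}\, \pi^{13}\, K^{25/2}\, e^{2\pi i K\tau}\, e^{-25\phi/2}\, t_{16}\, \lambda^{16} . \qquad (440) \]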
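The coefficient in (440) can be recovered by applying the covariant derivatives of (430) to the leading K-instanton term of (432), 4π K^{1/2} e^{2πiKτ} for K > 0 (the factor μ(K, 1) is a spectator). The sketch below represents a function of the form e^{2πiKτ} Σ_a c_a τ2^a by its coefficients and applies D_0, . . . , D_11 in turn; the representation and all names are choices made for this example.

```python
import math

def apply_D(coeffs: dict, w: int, k: int) -> dict:
    """Apply D_w = tau2 d/dtau - i w/2 to e^{2 pi i K tau} * sum_a c_a tau2^a.

    Since d(tau2)/d(tau) = 1/(2i), one has
    D_w (tau2^a e^{2 pi i K tau}) = [-(i/2)(a + w) tau2^a + 2 pi i K tau2^(a+1)] e^{2 pi i K tau}.
    """
    out = {}
    for a, c in coeffs.items():
        out[a] = out.get(a, 0) + (-0.5j) * (a + w) * c
        out[a + 1] = out.get(a + 1, 0) + 2j * math.pi * k * c
    return out

def leading_coefficient(w: int, k: int):
    """Highest power of tau2, and its coefficient, in the K-instanton term of f_1^{(w,-w)}."""
    coeffs = {0: 4.0 * math.pi * math.sqrt(k)}  # leading K-instanton term of (432), K > 0
    for weight in range(w):                     # D_0 acts first, D_{w-1} last, as in (430)
        coeffs = apply_D(coeffs, weight, k)
    top = max(coeffs)
    return top, coeffs[top]
```

For w = 12 the highest power is τ2^{12} with coefficient 4π K^{1/2} (2πK)^{12} = 2^{14} π^{13} K^{25/2}. Combined with the overall e^{−φ/2} and τ2 = e^{−φ}, this gives the factor e^{−25φ/2} in (440).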

The amplitude induced by this interaction is depicted on the l.h.s. of Fig. 2:

it is a contact amplitude in which the 16 dilatini interact via the vertex (440)
and propagate to the boundary points x1 , . . . , x16 .
After introducing an explicit parametrisation for AdS5 × S 5 and rewriting
the string theory parameters, gs and α , in terms of Yang–Mills parameters
using the dictionary (415), the amplitude in Fig. 2 becomes
Fig. 2. D-instanton induced minimal amplitudes in AdS5 × S 5

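\[
\frac{N_c^{1/2}}{g^{24}} \sum_{K>0} K^{25/2} \sum_{m|K} \frac{1}{m^2}\; e^{2\pi i K\tau} \int \frac{d^4z\, dz_0}{z_0^5}\; d^5\omega\; t_{16} \prod_{i=1}^{16} \left[ Y^{(0)}_F(\omega)\, K^F_{7/2}(z, z_0; x_i) \right] , \qquad (441)
\]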

where overall numerical constants have been dropped and no indices have been
indicated explicitly. In (441) we have used Poincaré coordinates, (zμ , z0 ), for
AdS5 , with zμ parametrising the directions parallel to the boundary and z0
the radial direction. In terms of these coordinates and ﬁve angular variables
for the S 5 factor, the AdS5 ×S 5 metric has the form (424). The 10-dimensional
dilatino has been expanded in spherical harmonics. In the expansion we have
retained the ground state component, dual to the SYM operator Λ^A_α . Y_F^{(0)}(ω)
denotes the corresponding harmonic function. In (441) K^F_{7/2} denotes the bulk-
to-boundary propagator for the dilatino, i.e. a spin 1/2 fermion with AdS
mass −3/(2L),

\[ K^F_{7/2}(z, z_0; x) = K_4(z, z_0; x) \left[ \sqrt{z_0}\,\gamma_5 - \frac{1}{\sqrt{z_0}}\,(x-z)_\mu \gamma^\mu \right] , \qquad (442) \]

where

\[ K_\Delta(z, z_0; x) = \frac{z_0^\Delta}{[(z-x)^2 + z_0^2]^\Delta} . \qquad (443) \]
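The scalar kernel (443) with Δ = 4 has the same functional form as the moduli dependence of the operator profile in (407), once the bulk point (zμ , z0 ) is read as the instanton position and size (x0 , ρ). The sketch below makes this identification explicit; the function names are illustrative.

```python
def bulk_to_boundary(delta, z, z0, x):
    """K_Delta(z, z0; x) of (443); z and x are four-vectors given as sequences."""
    dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
    return z0 ** delta / (dist2 + z0 ** 2) ** delta

def instanton_profile(x, x0, rho):
    """rho^4 / [(x - x0)^2 + rho^2]^4, the moduli dependence of the profile in (407)."""
    dist2 = sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0))
    return rho ** 4 / (dist2 + rho ** 2) ** 4
```

Calling `bulk_to_boundary(4, x0, rho, x)` returns the same number as `instanton_profile(x, x0, rho)`. Under a common rescaling of z, z0 and x by a factor s the kernel scales as s^{−Δ}, as expected for the source of an operator of dimension Δ.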
Remarkably, the result (441), in its unintegrated form, is in exact agreement with the multi-instanton contribution to the correlation function (374) (cf. (379) and its multi-instanton generalisation (408)) after the integration over the 16 exact fermion zero-modes in the latter. To compare the two results, one identifies the $AdS_5$ coordinates, $z_\mu$, $z_0$, with the position and size of the instanton and the $S^5$ angles with the auxiliary angular variables, $\Omega^{AB}$, introduced in the gauge theory calculation. The integration over the position of the interaction point in the supergravity amplitude reproduces the integration over the $\mathcal{N}=4$ moduli space, which, in the large $N_c$ limit and with the inclusion of the auxiliary variables, is precisely one copy of $AdS_5\times S^5$. The bulk-to-boundary propagators reconstruct the dependence on the moduli contained in the profiles of the Yang–Mills operators. Finally, although we have not kept track of all the numerical factors, the dependence on the parameters, $g$ and $N_c$, as well as on the instanton number, $K$, is in perfect agreement. The factors of $\tau_2$ in the weak coupling expansion of the modular form $f_1^{(12,-12)}(\tau,\bar\tau)$ give rise to the same $g$ dependence as in the Yang–Mills result. Similarly the matrix model partition function is reproduced by the measure factor, $\mu(K,1)$, in the modular form, see (433). The power of $N_c$ in (441) follows from the application of the AdS/CFT dictionary (415), which gives

$$\frac{e^{-\phi/2}L^2}{\alpha'} = 2\pi\,N_c^{1/2}\,. \tag{444}$$
The calculation of the amplitude dual to the four-point function (381) is completely analogous. The $\mathcal{N}=4$ scalar operator QABCD in the 20 of SU(4) is dual to a linear combination of the trace part of the metric in the $S^5$ directions and the $S^5$ components of the R–R four-form potential. The scalar in the supergravity multiplet, corresponding to QABCD, arises at level $\ell=2$ in the expansion in spherical harmonics. An amplitude contributing to the process dual to (381) involves the $R^4$ interaction in the bulk. This is depicted on the r.h.s. of Fig. 2. Proceeding as in the case of the 16-point amplitude one finds that this four-point amplitude is

$$N_c^{1/2}\sum_{K>0}K^{1/2}\sum_{m|K}\frac{1}{m^2}\;e^{2\pi iK\tau}\int\frac{d^4z\,dz_0}{z_0^5}\,d^5\omega\;\prod_{i=1}^{4}\left[Y_B^{(2)}(\omega)\,K_4(z,z_0;x_i)\right], \tag{445}$$

where the bulk-to-boundary propagator, $K_4$, is now the one appropriate for a scalar of mass squared $-4/L^2$ in AdS and the $Y_B^{(2)}$'s are $\ell=2$ scalar spherical harmonics. Again the result agrees perfectly with the Yang–Mills calculation.
The examples described here illustrate the striking agreement between
instanton contributions to Yang–Mills correlation functions and D-instanton
induced supergravity amplitudes. The agreement found represents one of the
most convincing tests of the validity of the AdS/CFT correspondence. The
majority of the explicit tests of the Maldacena conjecture compare protected
quantities which do not depend on the coupling constant and thus coincide
with their free theory expressions. In these cases, the comparison is not af-
fected by the strong/weak coupling nature of the correspondence, but the cal-
culations simply test that the same non-renormalisation properties are valid
on both sides. The calculations reviewed above represent instead one of the
few instances in which a precise comparison is possible for quantities which
do receive non-trivial quantum corrections.40 Such a precise agreement is remarkable and somewhat unexpected, since the calculations in this section and
those in Sect. 15 appear to have diﬀerent regimes of validity. It is natural to
interpret the result of the comparison as due to an underlying partial non-
renormalisation property [142], whose origin, however, remains unexplained.

Non-minimal AdS Amplitudes

The discussion in the previous subsection has a natural generalisation to the case of amplitudes dual to the non-minimal correlation functions of Sect. 15.3.
In the non-minimal case, the study of SYM correlators has not been gener-
alised to multi-instanton sectors. In this section we show how the supergrav-
ity analysis gives results which are in qualitative agreement with those of the
N = 4 calculations in the one-instanton sector. We consider supergravity am-
plitudes related to the two main types of non-minimal correlators discussed
in Sect. 15.3.
Correlators such as (395) correspond to amplitudes involving KK excited states in the spectrum of type IIB supergravity in $AdS_5\times S^5$. The $\mathcal{N}=4$ operators in 1/2 BPS multiplets with lowest component a scalar of the form (346) with $\ell>2$ are dual to KK excited states. In particular, the fermionic operator $\tilde\Lambda^{ABC}_\alpha$ in (396) is dual to the first KK excitation of the dilatino. Therefore, the amplitude dual to the correlation function (395) is similar to the 16-point amplitude considered in the previous subsection, but with two of the dilatini in the first KK excited level.
The other class of non-minimal correlators presented in Sect. 15.3 comprises higher-point functions which have a natural interpretation as dual to amplitudes induced by vertices in the string effective action at order $\alpha'^5$ and higher. However, in these cases the comparison is less straightforward and, as will be shown shortly, there are subtleties that need to be taken into account.
The amplitude dual to the 16-point correlator (395) involves the same interaction as in (439) and (440) with the only difference that upon reduction on the five-sphere one selects for two of the dilatini the first excited state instead of the KK ground state. The diagrammatic representation of the amplitude is given in Fig. 3, where the double lines indicate bulk-to-boundary propagators for the KK excited states. The dilatini in the first KK level are spin 1/2 fermions of mass $-\frac{5}{2L}$ for which the bulk-to-boundary propagator is

$$K^F_{9/2}(z,z_0;x) = K_5(z,z_0;x)\left[\sqrt{z_0}\,\gamma_5 - \frac{1}{\sqrt{z_0}}\,(x-z)_\mu\gamma^\mu\right]. \tag{446}$$
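
The masses quoted for the dilatino and its first KK excitation fix the labels carried by the fermionic propagators. A small Python sketch of this bookkeeping follows; it assumes the standard relation $\Delta = d/2 + |m|L$ for a spin 1/2 field in $AdS_{d+1}$, which is a textbook result not spelled out in the text, and the function names are illustrative.

```python
from fractions import Fraction


def spinor_dimension(mass_times_L, d=4):
    """Dimension of the operator dual to a spin-1/2 field in AdS_{d+1}.

    Assumes the standard relation Delta = d/2 + |m L| (not quoted in the text).
    """
    return Fraction(d, 2) + abs(Fraction(mass_times_L))


def scalar_factor_index(delta):
    """Index Delta + 1/2 of the scalar factor multiplying the spinor structure."""
    return delta + Fraction(1, 2)
```

With $mL=-3/2$ this gives $\Delta=7/2$ and a scalar factor $K_4$, and with $mL=-5/2$ it gives $\Delta=9/2$ and $K_5$, matching the labels used in the two fermionic propagators.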

40
The BPS Wilson loops mentioned in Sect. 15.3 provide another notable example
in perturbation theory.
Fig. 3. Supergravity amplitude dual to the non-minimal 16-point function (395)

The resulting amplitude is

$$\frac{N_c^{1/2}}{g^{24}}\sum_{K>0}K^{25/2}\sum_{m|K}\frac{1}{m^2}\;e^{2\pi iK\tau}\int\frac{d^4z\,dz_0}{z_0^5}\,d^5\omega\;t_{16}\prod_{i=1}^{14}\left[Y_F^{(0)}(\omega)\,K^F_{7/2}(z,z_0;x_i)\right]\prod_{j=1}^{2}\left[Y_F^{(1)}(\omega)\,K^F_{9/2}(z,z_0;y_j)\right], \tag{447}$$

where $Y_F^{(1)}(\omega)$ denotes the first excited fermionic spherical harmonic. The result (447) is again in agreement with the corresponding Yang–Mills calculation (398).
Non-minimal amplitudes of this type, involving KK excited states, are
generalisations of the analogous minimal ones. Apart from the appearance
of bulk-to-boundary propagators for fields of the appropriate mass, the only
difference is in the five-sphere integrals, because of the presence of higher
harmonics, which are necessary to reproduce the Ω-dependence of the corre-
sponding Yang–Mills expressions.
The study of the other class of non-minimal amplitudes is more complicated and yields some surprises. In order to describe the main features of these amplitudes, we focus on the example of the process dual to the correlation function (385) involving 16 fermionic operators, $\Lambda^A_\alpha$, in the 4 of SU(4) and four scalar operators, $E^{AB}$, in the 10. As already observed, it is natural to expect that in these cases the amplitudes should involve couplings of order $\alpha'^5$ and beyond. The operator $E^{AB}$ in (386) is dual to a linear combination of the NS–NS and R–R two forms with indices in the internal directions. The corresponding field strength, $G_{(3)}$, was defined in Sect. 18.1. Therefore a contact amplitude involving the vertex

$$\alpha'\int d^{10}X\,\sqrt{-g}\;e^{\phi/2}\,f_2^{(14,-14)}(\tau,\bar\tau)\,G^4\lambda^{16} \tag{448}$$

represents an obvious candidate for the dual of the correlator (385). This process is represented in the second diagram of Fig. 4.

Fig. 4. Contributions to the 20-point amplitude dual to the correlation function (385)

This interpretation, however, leads to a puzzle. Using the dictionary (415) the 12-derivative couplings at order $\alpha'^5$ give rise to contributions of order $N_c^{-1/2}$; in fact

$$\frac{\alpha'\,e^{\phi/2}}{L^2}\sim N_c^{-1/2}\,, \tag{449}$$

which is not the behaviour expected from the Yang–Mills analysis. The leading contribution to the correlation function (385) is in fact of order $N_c^{1/2}$, as follows from (390) and (391).
The resolution of this mismatch requires the inclusion in the supergravity analysis of contributions of a type not encountered in the calculation of minimal amplitudes. These are exchange diagrams involving a D-instanton-induced vertex as well as additional perturbative couplings. The relevant D-instanton vertices are those of order $1/\alpha'$ in the expansion of the effective action, so that the resulting amplitudes give rise to contributions of order $N_c^{1/2}$, see (444). In order to generate contributions to non-minimal amplitudes one needs to include in the D-instanton vertices the fluctuations of the complex scalar, $\tau=\tau_0+\hat\tau$, as described at the end of Sect. 18.1.
In the case of the amplitude under consideration one needs to consider the vertex coupling 16 dilatini with two additional insertions of $\hat\tau$ coming from the expansion of the exponential factor in the modular form $f_1^{(12,-12)}(\tau,\bar\tau)$. The non-perturbative part of the effective vertex is

$$\frac{1}{\alpha'}\int d^{10}X\,\sqrt{-g}\;e^{-25\phi/2}\,e^{2\pi i\tau}\,\hat\tau^2\,t_{16}\,\lambda^{16}\,, \tag{450}$$

where only the $K=1$ contribution relevant for the comparison with the one-instanton sector in $\mathcal{N}=4$ SYM has been included. The amplitude contributing to the dual of the 20-point correlator (385) is depicted on the l.h.s. of Fig. 4. The two bulk-to-bulk lines joining the D-instanton vertex at point $z$ to the points $v$ and $w$ are $\langle\hat\tau\,\hat{\bar\tau}\rangle$ propagators and the two cubic vertices are $\hat{\bar\tau}GG$ couplings from the classical type IIB action.
In evaluating this diagram, upon using dimensional-reduction-like formulae such as (438), one has to sum over all the contributions associated with the exchange of the KK excitations of the complex scalar. The coupling to the external three-forms restricts this sum to the states allowed by the SO(6) selection rules enforced by the integration over the five-sphere. In the present case there is only one allowed contribution, corresponding to the exchange of a complex scalar in the second KK excited level, i.e. a state in the representation 20 of SO(6) ∼ SU(4).
At first sight the resulting amplitude does not resemble the N = 4 SYM re-
sult: it is an exchange amplitude requiring integrations over three bulk points.
However, because of the specific coupling involved the integrations over the
positions of the two cubic couplings can be performed. This is because, after
using expressions such as (438), one can integrate by parts the derivatives in
each of the three-form field strengths onto the τ̂ scalar, and the cubic couplings
schematically reduce to the form
$$(\partial^2+m_{\tau_0}^2)\,\hat\tau\,B_{ij}B^{ij}\,, \tag{451}$$
where B is the complex combination of the NS–NS and R–R two forms and
the mass term comes from the derivatives in the S 5 directions. After the
integration by parts one thus reconstructs the AdS5 wave operator acting
on the internal bulk-to-bulk propagators which then yield five-dimensional
δ-functions. The integrations at the points v and w in Fig. 4 can thus be com-
puted and the exchange diagram reduces to a contact contribution. Therefore,
the net effect of the exchange diagram is to give rise to a new coupling in the
AdS5 effective action of the form

$$\frac{1}{g_s^2\,\alpha'}\int\frac{d^4z\,dz_0}{z_0^5}\;e^{-25\phi/2}\,e^{2\pi i\tau}\,t_{16}\,\lambda^{16}B^4\,, \tag{452}$$

where the factor of $g_s^{-2}$ arises from the rescaling of the complex scalar, $\hat\tau$, needed to make its kinetic term canonically normalised.
The amplitude induced by this vertex (expressed in terms of SYM parameters) takes schematically the form

$$\frac{\sqrt{N_c}}{g^{28}}\;e^{2\pi i\tau}\int\frac{d^4z\,dz_0}{z_0^5}\;\prod_{i=1}^{16}K^F_{7/2}(z,z_0;x_i)\prod_{j=1}^{4}K_3(z,z_0;y_j)\,, \tag{453}$$

which reproduces the leading large $N_c$ term in the $\mathcal{N}=4$ result (390) with the correct space–time dependence.
The amplitude involving the order $\alpha'^5$ vertex $\lambda^{16}G^4$ in (448) gives rise to a contribution with the same space–time dependence, but of order $N_c^{-1/2}$. This sub-leading contribution is interpreted as corresponding to the $1/N_c$ correction in the SYM result (390).
The example of the above 20-point function illustrates some features com-
mon to many non-minimal amplitudes. In general, unlike in the minimal cases,
the amplitudes dual to non-minimal N = 4 correlation functions receive sev-
eral contributions. Various eﬀects such as those described in the previous
example need to be taken into account to show agreement between the Yang–
Mills and supergravity calculations. More details and other non-minimal ex-
amples are discussed in [123].
18.3 Beyond Supergravity: the BMN Limit

The AdS/CFT correspondence discussed in the previous sections is a very remarkable duality and the study of instanton effects has led to some of the most successful tests of its validity. In the formulation presented so far the duality has, however, some limitations. Because of our present limited understanding of the quantisation of string theory in non-trivial backgrounds such as $AdS_5\times S^5$, the study of the gravity side of the correspondence is restricted to the supergravity approximation. Moreover, even in this regime, the strong–weak coupling nature of the duality makes the direct comparison of the two sides problematic. In this section we briefly review a very interesting limit of the correspondence, the so-called BMN limit [143], which makes it possible to overcome both of the above limitations.41 The idea is to consider string theory in a background obtained from $AdS_5\times S^5$ via a special procedure known as the Penrose limit [146]. The result of the limit is a background with the geometry of a maximally supersymmetric gravitational plane wave [147]. Remarkably, despite the non-flatness of the metric and the presence of an R–R background, it is possible to quantise string theory in this geometry [148, 149]. In [143]
it has been proposed that strings propagating in this particular plane-wave
background are dual to a certain sector of N = 4 SYM. The latter, usually
referred to as the BMN sector, comprises operators of large scaling dimen-
sion, Δ, and large charge, J, with respect to a U (1) subgroup of the SU (4)
R-symmetry group. The possibility of quantising string theory in the plane-
wave background has made the comparison between string and gauge theory
possible beyond the supergravity approximation, albeit only in a specific sec-
tor of N = 4 SYM. Moreover, in this limit there exists a regime in which
both sides of the correspondence are weakly coupled, so that the strong–weak
coupling problem is also avoided.
At the heart of the correspondence proposed in [143] is a relation between
the energy, E, of plane-wave string states and a combination of the scaling
dimension and R-charge of the dual operators, which reads

$$\frac{1}{\mu}\,E = \Delta - J\,, \tag{454}$$

where the parameter μ is related to the value of the R–R self-dual five-form
present in the background, see (459). The validity of (454) has been success-
fully tested in perturbation theory in a number of cases. Reviews of these
results can be found in [150]. In this subsection we present a brief overview of
the non-perturbative tests carried out in [151, 152, 153].

41
Another limit that has attracted some attention is the highly “stringy” regime
λ → 0 where the theory exposes higher spin symmetry enhancement [144]. The
bulk counterpart of the recombination of semi-short multiplets into long ones and
the emergence of anomalous dimensions in the boundary theory is a pantagruelic
Higgs mechanism termed La Grande Bouﬀe [145].
In order to take the Penrose limit that gives rise to the plane-wave background we start with the $AdS_5\times S^5$ metric written in global coordinates

$$ds^2 = L^2\left[-\cosh^2\!\rho\;dt^2+d\rho^2+\sinh^2\!\rho\;d\Omega_3^2+\cos^2\!\theta\;d\psi^2+d\theta^2+\sin^2\!\theta\;d\tilde\Omega_3^2\right], \tag{455}$$

where $\Omega_3$ and $\tilde\Omega_3$ refer to angles parametrising the two three-spheres inside $AdS_5$ and $S^5$, respectively. In this coordinate system one can choose

$$\tilde x^\pm = \pm\frac{1}{\sqrt2}\,(t\pm\psi) \tag{456}$$

as light-cone variables and define the new coordinates

$$x^+ = \frac{1}{\mu}\,\tilde x^+\,,\qquad x^- = \mu L^2\,\tilde x^-\,,\qquad \rho=\frac{r}{L}\,,\qquad \theta=\frac{y}{L}\,, \tag{457}$$

where $\mu$ is an arbitrary scale. The Penrose limit is obtained by sending $L$ to infinity while keeping $x^\pm$, $r$ and $y$ “fixed”. The resulting metric is that of a plane wave

$$ds^2 = 2\,dx^+dx^- - \mu^2x^Ix^I\,(dx^+)^2 + dx^Idx^I\,, \tag{458}$$

where $x^I$, $I=1,\ldots,8$, are Cartesian coordinates such that $x^Ix^I=r^2+y^2$. The original $AdS_5\times S^5$ background also has a non-zero self-dual R–R five-form (414), which after the limit has non-vanishing components

$$F_{+1234} = F_{+5678} = 2\mu\,, \tag{459}$$

with indices $1,2,\ldots,8$ corresponding to the $x^I$ directions.42

The plane-wave background preserves the same (maximal) amount of su-
persymmetry as the original AdS5 × S 5 . In fact at the level of the super-
isometries the Penrose limit corresponds to an Inönü–Wigner contraction.43
The supergroup of isometries resulting from the contraction of P SU (2, 2|4)
is P SU (2|2) × P SU (2|2) × U (1) × U (1) with maximal bosonic subgroup
H(4)2 SO(4)×SO(4)×U (1)×U (1), where H(4) denotes the four-dimensional
Heisenberg group [147]. In the following we shall denote the two SO(4) factors
with SO(4)C and SO(4)R , where the subscript refers to the fact that they are
subgroups, respectively, of the conformal group and the R-symmetry group of
the dual N = 4 SYM theory. The states in the string spectrum can therefore
be labelled by quantum numbers characterising their transformation under
$SO(4)_C\times SO(4)_R\times U(1)\times U(1)$. These are identified with two pairs of spins, $(s_1,s_2;s'_1,s'_2)$, and the light-cone energy and momentum, $(p_+,p_-)$, associated with translations in the $x^+$ and $x^-$ directions.
42
Notice that the metric (458) is SO(8) invariant, but the background value of the
ﬁve-form breaks this symmetry down to SO(4) × SO(4).
43
More precisely this contraction should be called a Saletan contraction [154].
A very remarkable feature of the plane-wave background is that it allows the quantisation of string theory in the Green–Schwarz (GS) formalism.44 As shown in [148], in the light-cone gauge the plane-wave GS string is described by a (massive) free world-sheet theory. This allows the quantisation to be carried out in essentially the same fashion as in flat space. The string action constructed in [148] is

$$S=\frac{1}{2\pi\alpha'}\int_{-\infty}^{+\infty}d\tau\int_0^{2\pi p_-}d\sigma\left[\frac12\,\partial_+X^I\partial_-X^I-\frac12\,m^2X^IX^I+i\left(S^a\partial_+S^a+\tilde S^a\partial_-\tilde S^a-2m\,S^a\Pi_{ab}\tilde S^b\right)\right], \tag{460}$$

where the mass parameter $m=\mu p_-\alpha'$ has been introduced. In (460) the $X^I$'s denote the transverse coordinates of the string and the index $I=1,\ldots,8$ is in the $8_v$ of SO(8). The $S^a$'s and $\tilde S^a$'s, $a=1,\ldots,8$, are GS fermions. These are SO(8) spinors of the same chirality in the $8_s$. The matrix $\Pi$ is a product of SO(8) $\gamma$-matrices, $\Pi=\gamma^1\gamma^2\gamma^3\gamma^4$. As usual in the light-cone gauge the non-physical components have been eliminated by setting $X^+(\sigma,\tau)=2\pi\alpha'p_-\tau$, whereas $X^-(\sigma,\tau)$ is expressed in terms of the $X^I$'s using the so-called Virasoro constraints which follow from consistency with the equation of motion for the world-sheet metric.
Since (460) describes a free theory the equations of motion lead to a standard mode expansion. For instance, for the transverse bosons one finds

$$X^I(\sigma,\tau)=\cos(m\tau)\,x_0^I+\frac{1}{m}\sin(m\tau)\,p_0^I+i\sum_{n\neq0}\frac{1}{\omega_n}\left(e^{-i\omega_n\tau+2i\pi n\sigma}\,\alpha_n^I+e^{-i\omega_n\tau-2i\pi n\sigma}\,\tilde\alpha_n^I\right), \tag{461}$$

where

$$\omega_n=\mathrm{sign}(n)\,\sqrt{m^2+n^2}\,. \tag{462}$$

The GS fermions have a similar mode expansion with coefficients which will be denoted by $S^a_{\pm n}$ and $\tilde S^a_{\pm n}$.
Upon quantisation, the coefficients in the expansion of the world-sheet
fields give rise to creation and annihilation operators for the states in the string
spectrum. In order to construct the physical creation and annihilation opera-
tors, the SO(8) oscillators need to be decomposed under SO(4)C × SO(4)R .
Massive excitations of the string are associated with non-zero oscillators.
The bosonic ones, αnI and α̃nI , are in the 8v which decomposes as 8v →
(4; 1) ⊕ (1; 4), and one obtains

44
By “quantisation” we here mean the determination of the spectrum of states with $p_-\neq0$, with $p_->0$ for incoming and $p_-<0$ for outgoing states. Interactions and the spectrum of states with $p_-=0$ are much subtler and not fully known even at tree level.

$$\alpha_n^I\to\left(\alpha_n^i,\ \alpha_n^{\mu+5}\right),\qquad \tilde\alpha_n^I\to\left(\tilde\alpha_n^i,\ \tilde\alpha_n^{\mu+5}\right),\qquad n\in\mathbb{Z},\ n\neq0\,, \tag{463}$$

where $i=1,2,3,4$ and $\mu=0,1,2,3$ are vector indices in $SO(4)_R$ and $SO(4)_C$, respectively. The fermions are in the $8_s$ which decomposes into $(2_L;2_L)\oplus(2_R;2_R)$. The fermionic oscillators are decomposed using the projectors $P_\pm=\frac12(1\pm\Pi)$

$$S_n^a\to S_n^\pm=P_\pm S_n^a\,,\qquad \tilde S_n^\pm=P_\pm\tilde S_n^a\,, \tag{464}$$

which yield spinors with $SO(4)_C\times SO(4)_R$ chiralities $(+,+)$ and $(-,-)$.

The zero modes in the expansion of the world-sheet fields are treated similarly. The bosonic ones, $x_0^I$ and $p_0^I$, associated with the transverse position and momentum of the string, are combined into

$$a^I=\frac{1}{\sqrt{2|m|}}\left(p_0^I-i|m|\,x_0^I\right)\quad\text{and}\quad a^{I\dagger}=\frac{1}{\sqrt{2|m|}}\left(p_0^I+i|m|\,x_0^I\right). \tag{465}$$

These are then decomposed as in (463). The fermion zero modes, $S_0$ and $\tilde S_0$, are combined into

$$\theta=\frac{1}{\sqrt2}\left(S_0+i\tilde S_0\right)\quad\text{and}\quad\bar\theta=\frac{1}{\sqrt2}\left(S_0-i\tilde S_0\right) \tag{466}$$

and then further decomposed as

$$\theta_{L,R}=P_{+,-}\,\theta\,,\qquad \bar\theta_{L,R}=P_{+,-}\,\bar\theta\,. \tag{467}$$

The Fock space of states is built on a vacuum, $|0\rangle_h=|0,p_-\rangle_h$, defined as the state annihilated by $\theta_R$, $\bar\theta_L$, $a^I$ and all the non-zero oscillators of positive frequency. This is a non-degenerate bosonic state of zero mass usually referred to as the BMN vacuum. The fermionic zero modes $\theta_L$ and $\bar\theta_R$ are creation operators and generate the supergravity multiplet acting on $|0\rangle_h$ [149]. The bosonic zero modes $a^{I\dagger}$ create the Kaluza–Klein-like excitations. The massive string modes are created by combinations of non-zero oscillators with negative frequencies, $\alpha^i_{-n}$, $\tilde\alpha^i_{-n}$, $\alpha^{\mu+5}_{-n}$, $\tilde\alpha^{\mu+5}_{-n}$, $S^\pm_{-n}$ and $\tilde S^\pm_{-n}$, acting on the BMN vacuum. Physical states, $|s\rangle_{\rm phys}$, are subject to the level-matching condition

$$\left(N-\tilde N\right)|s\rangle_{\rm phys}=0\,, \tag{468}$$

where the left and right moving number operators are defined by

$$N=\sum_{n=1}^{\infty}\left[\frac{n}{\omega_n}\,\alpha^I_{-n}\alpha^I_n+n\,S^a_{-n}S^a_n\right],\qquad \tilde N=\sum_{n=1}^{\infty}\left[\frac{n}{\omega_n}\,\tilde\alpha^I_{-n}\tilde\alpha^I_n+n\,\tilde S^a_{-n}\tilde S^a_n\right]. \tag{469}$$

The string theory Hamiltonian can be expressed in terms of the above oscillators as

$$2p_-H=m\left(a^{I\dagger}a^I+\theta_L^a\bar\theta_L^a+\bar\theta_R^a\theta_R^a\right)+\sum_{k=1}^{\infty}\left[\alpha^I_{-k}\alpha^I_k+\tilde\alpha^I_{-k}\tilde\alpha^I_k+\omega_k\left(S^a_{-k}S^a_k+\tilde S^a_{-k}\tilde S^a_k\right)\right]. \tag{470}$$

From the form of the Hamiltonian (470) it is straightforward to compute the free string spectrum. The mass of a generic massive string excitation is

$$\frac{1}{\mu}\,M=\frac{1}{m}\sum_{n=1}^{\infty}\left(N+\tilde N\right)|\omega_n|\,, \tag{471}$$

with $\omega_n$ defined in (462) and $m=\mu p_-\alpha'$.

In the following our discussion will focus on massive string states, which are more interesting in the context of the duality with $\mathcal{N}=4$ SYM since their masses receive quantum corrections. For simplicity, we shall restrict our attention to states created by the $\alpha^i_{-n}$ and $\tilde\alpha^i_{-n}$ oscillators

$$|s\rangle=\alpha^{i_1}_{-n_1}\cdots\alpha^{i_r}_{-n_r}\cdots\tilde\alpha^{j_1}_{-m_1}\cdots\tilde\alpha^{j_s}_{-m_s}\cdots|0\rangle_h\,, \tag{472}$$

with $\sum_r n_r=\sum_s m_s$ to satisfy level matching.
As has been mentioned before, the states in the spectrum are characterised by their $SO(4)_C\times SO(4)_R$ quantum numbers, besides their mass and light-cone momentum. As in the original formulation of the AdS/CFT correspondence, the quantum numbers associated with the symmetries on the two sides also dictate the map relating string states to composite operators in $\mathcal{N}=4$ SYM. Equation (457) leads to the following identifications:
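
$$\begin{aligned}\frac{1}{\mu}\,H\equiv\frac{1}{\mu}\,p_+&=-i\partial_+=-i(\partial_t+\partial_\psi)\ \to\ D-J\\ p_-&=-i\partial_-=\frac{i}{2\mu L^2}\,(\partial_t-\partial_\psi)\ \to\ \frac{1}{2\mu L^2}\,(D+J)\,,\end{aligned} \tag{473}$$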

where, as usual, $L^4=4\pi g_sN_c\alpha'^2$. Equations (473) relate the string light-cone energy and momentum to linear combinations of the dilation operator, $D$, and the generator, $J$, of the U(1) subgroup of the SU(4) R-symmetry singled out in taking the Penrose limit. From the above relations it follows that in the $L\to\infty$ limit string states with finite energy and light-cone momentum correspond to SYM operators with values of $D$ and $J$ satisfying

$$\Delta\to\infty\,,\qquad J\to\infty\,,\qquad \Delta-J\ \ \text{finite}\,. \tag{474}$$

Operators with these properties form the BMN sector of N = 4 SYM. The
explicit form of the operators dual to states in the plane-wave string spectrum
was proposed in [143]. The starting point for the construction of such operators
is the deﬁnition of the dual to the BMN vacuum, which is identiﬁed with the
operator

$$O=\frac{1}{\sqrt{J\,N_c^J}}\;\mathrm{Tr}\,Z^J\,, \tag{475}$$

where Z is the complex combination of the N = 4 scalar ﬁelds with J = 1
for which we choose Z = 2ϕ14 (see (337) and (338)). The operator (475)
has Δ − J = 0 as expected for the dual of a zero energy state. Operators
corresponding to the other states in the string spectrum are obtained inserting
in the trace in (475) “impurities”, i.e. other elementary ﬁelds in the N = 4
fundamental multiplet. The action of each creation operator on the string side
corresponds to the insertion of an impurity of a certain type.45 In particular,
the operators dual to states of the form (472) are
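
$$O^{i_1\ldots i_k}_{J;n_1\ldots n_k}=\frac{1}{\sqrt{J^{k-1}\left(\frac{g^2N_c}{8\pi^2}\right)^{J+k}}}\sum_{\substack{p_1,\ldots,p_{k-1}=0\\ p_1+\cdots+p_{k-1}\le J}}^{J}e^{2\pi i\left[(n_1+\cdots+n_{k-1})p_1+(n_2+\cdots+n_{k-1})p_2+\cdots+n_{k-1}p_{k-1}\right]/J}\;\mathrm{Tr}\left(Z^{J-(p_1+\cdots+p_{k-1})}\varphi^{i_1}Z^{p_1}\varphi^{i_2}\cdots Z^{p_{k-1}}\varphi^{i_k}\right), \tag{476}$$

where the integers $n_i$ correspond to the mode numbers of the string state46 and the action of the creation operators in (472) is in correspondence with the insertion of impurities, $\varphi^i$, for which we have the definitions

$$\begin{aligned}\varphi^1&=\frac{1}{\sqrt2}\left(-\varphi^{13}+\varphi^{24}\right),&\varphi^2&=\frac{1}{\sqrt2}\left(\varphi^{12}+\varphi^{34}\right),\\ \varphi^3&=\frac{i}{\sqrt2}\left(-\varphi^{13}-\varphi^{24}\right),&\varphi^4&=\frac{i}{\sqrt2}\left(\varphi^{12}-\varphi^{34}\right).\end{aligned} \tag{477}$$
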

The four real scalars, ϕi , transform in the (1; 4) of SO(4)C × SO(4)R and the
map between the operators (476) and the string states (472) is determined by
the SO(4)R quantum numbers.
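
The structure of the sum in (476) is perhaps easiest to see by listing its terms for small $J$. The sketch below, in Python, enumerates the allowed positions $(p_1,\ldots,p_{k-1})$ together with the phase multiplying each trace. It is only an illustration of the formula as written: the function name is hypothetical, and the normalisation prefactor and the traces themselves are not constructed.

```python
import cmath
from itertools import product


def bmn_terms(J, modes):
    """List (positions, phase) for every term in the sum of (476).

    modes = (n_1, ..., n_k) are the mode numbers; only n_1 ... n_{k-1}
    enter the phase. positions = (p_1, ..., p_{k-1}) with total <= J.
    """
    k = len(modes)
    if k < 2:
        raise ValueError("at least two impurities are needed")
    # Coefficient of p_a in the exponent: n_a + n_{a+1} + ... + n_{k-1}.
    tail = [sum(modes[a:k - 1]) for a in range(k - 1)]
    terms = []
    for positions in product(range(J + 1), repeat=k - 1):
        if sum(positions) > J:
            continue  # constraint p_1 + ... + p_{k-1} <= J
        exponent = sum(t * p for t, p in zip(tail, positions))
        terms.append((positions, cmath.exp(2j * cmath.pi * exponent / J)))
    return terms
```

For two impurities with $n_1=1$ and $J=2$ this returns three terms with phases $1$, $-1$ and $1$, i.e. the familiar $e^{2\pi i n p/J}$ weighting of the position $p$ of the second impurity.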
To make the comparison between the string theory in the plane-wave background and the BMN sector of $\mathcal{N}=4$ SYM possible at a quantitative level, one has to consider the large $N_c$ limit. The combination of the large $N_c$ limit with the limit of large $\Delta$ and $J$ implies that new effective parameters, $\lambda'$ and $g_2$, arise [143, 155, 156], which are related to the ordinary ’t Hooft parameters, $\lambda$ and $1/N_c$, by a rescaling

$$\lambda'=\frac{g^2N_c}{J^2}\,,\qquad g_2=\frac{J^2}{N_c}\,. \tag{478}$$
45
For this reason the string excitations are also often referred to as impurities.
46
Conventionally, left-moving modes correspond to ni > 0 and right-moving ones
to ni < 0.
These in turn are related to the parameters of the plane-wave string theory by

$$m^2=(\mu p_-\alpha')^2=\frac{1}{\lambda'}\,,\qquad 4\pi g_s\,m^2=g_2\,. \tag{479}$$

The double scaling limit defined by (474) and $N_c\to\infty$ with $J^2/N_c$ fixed connects the weak coupling regime of the gauge theory to string theory at small $g_s$ and large $m$. The property that in this limit physical quantities can be expanded in powers of the effective parameters, $\lambda'$ and $g_2$, is referred to as BMN scaling.
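
The relations (478) and (479), together with the dispersion relation (462), can be collected in a few lines of Python. This is a sketch under stated assumptions rather than a statement of the authors' method: the function names are illustrative, and (471) is read here as assigning $|\omega_n|/m$ to each excited oscillator, which with $m^2=1/\lambda'$ gives $\sqrt{1+\lambda' n^2}$ per oscillator.

```python
import math


def bmn_parameters(g, n_c, j):
    """Effective BMN parameters from the Yang-Mills data (g, N_c, J)."""
    lam_prime = g * g * n_c / j ** 2          # lambda' in (478)
    g_2 = j ** 2 / n_c                        # g_2 in (478)
    m_squared = 1.0 / lam_prime               # first relation in (479)
    g_s = g_2 / (4.0 * math.pi * m_squared)   # second relation in (479)
    return {"lambda_prime": lam_prime, "g_2": g_2,
            "m": math.sqrt(m_squared), "g_s": g_s}


def omega(n, m):
    """Frequency omega_n = sign(n) sqrt(m^2 + n^2) of (462), for n != 0."""
    if n == 0:
        raise ValueError("omega_n is defined for non-zero mode numbers")
    return math.copysign(math.sqrt(m * m + n * n), n)


def delta_minus_j(left_modes, right_modes, lam_prime):
    """Free-string value of Delta - J for oscillators with the given modes.

    Assumes each oscillator contributes |omega_n| / m (a reading of (471)).
    """
    if sum(left_modes) != sum(right_modes):
        raise ValueError("level matching violated")
    m = 1.0 / math.sqrt(lam_prime)
    return sum(abs(omega(n, m)) for n in (*left_modes, *right_modes)) / m
```

Note that $g_2\lambda'=g^2$, so the second relation in (479) amounts to $g_s=g^2/4\pi$ in this parametrisation.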

Instanton Eﬀects in the BMN Limit

Tests of the BMN limit of the AdS/CFT correspondence consist in verifying the validity of the relation

$$\frac{1}{\mu}\,H=D-J\,. \tag{480}$$

This is an operator relation and it requires that the eigenvalues of the two
sides be equal, i.e. masses of states in the plane-wave string spectrum, rescaled
by a factor of μ, should equal the combination Δ − J for the dual operators.
In general the comparison requires the resolution of a mixing problem, i.e. the
diagonalisation of the operators in (480).
The quantum corrections to the string mass spectrum are extracted from
two-point amplitudes. At the perturbative level, calculations of such am-
plitudes have been performed using string ﬁeld theory methods. The non-
perturbative corrections that we are interested in are induced by two-point
amplitudes in which the external states are coupled to D-instantons. These
quantum corrections to the string masses should be compared to instanton
corrections to the eigenvalues of the operator D − J , i.e. to the anomalous
dimensions of BMN operators since the charge J is not renormalised.
The leading D-instanton contribution to a two-point amplitude is obtained
coupling the external states to two disconnected disks, with Dirichlet bound-
ary conditions, localised at the same space–time point. This is schematically
depicted in Fig. 5.
Fig. 5. Leading D-instanton contribution to a two-point scattering amplitude

The D-instanton in the plane-wave string theory can be described as a collective excitation of elementary closed string oscillators using the boundary state formalism [157]. The construction of the D-instanton boundary state in the plane-wave background follows closely the approach used in [158] for the light-cone GS string theory in flat space. The boundary state describing a D-instanton with transverse position $z^I$ will be denoted by $||z^I\rangle\rangle$. It is defined by the following gluing conditions:

$$\begin{aligned}\left(\alpha_n^I-\tilde\alpha_{-n}^I\right)||z^I\rangle\rangle&=0\,,\\ \left(S_n^a+iM_n^{ab}\,\tilde S_{-n}^b\right)||z^I\rangle\rangle&=0\,,\qquad n\in\mathbb{Z}\,,\end{aligned} \tag{481}$$

where the matrix $M_n$ is

$$M_n=\frac{1}{n}\left(\omega_n\mathbb{1}-m\Pi\right), \tag{482}$$

with $\omega_n$ given in (462). The explicit expression for $||z^I\rangle\rangle$ is [157]

$$||z^I\rangle\rangle=(4\pi m)^2\exp\left(\sum_{k=1}^{\infty}\left[\frac{1}{\omega_k}\,\alpha^I_{-k}\tilde\alpha^I_{-k}-i\,S^a_{-k}M_k^{ab}\tilde S^b_{-k}\right]\right)||z^I\rangle\rangle_0\,, \tag{483}$$

where $||z^I\rangle\rangle_0$ denotes the zero-mode part

$$||z^I\rangle\rangle_0=e^{-|m|(z^I)^2/2}\;e^{i\sqrt{2|m|}\,z^Ia^{I\dagger}}\;e^{\frac12a^{I\dagger}a^{I\dagger}}\,|0\rangle_D \tag{484}$$

and $|0\rangle_D=\theta_L^1\theta_L^2\theta_L^3\theta_L^4\,|0\rangle_h$.

The plane-wave background is maximally supersymmetric, i.e. it is invariant under 32 supersymmetries. These are divided into 16 kinematical supersymmetries, which do not commute with the string Hamiltonian, and 16 dynamical ones, which commute with the Hamiltonian. The boundary state (483) and (484) is annihilated by eight kinematical and eight dynamical supersymmetries as a consequence of the conditions (481). The other half of the supersymmetries acting on $||z^I\rangle\rangle$ generate the fermion zero modes of the D-instanton. These are represented by the open strings attached to the boundary of the disks in Fig. 5. We shall denote the combinations of kinematical and dynamical supersymmetries which act non-trivially on the boundary state by $(\bar q_L,\bar q_R)$ and $Q^-$, respectively. The bosonic collective coordinates of the D-instanton correspond to its position in the 10-dimensional plane-wave geometry.
In order to compute a two-point amplitude of the type represented in Fig. 5, one needs to construct a state that includes the full dependence on the collective coordinates of the D-instanton and couples to two external states. We shall refer to such a state as a “dressed two-boundary state”. The latter describes the two disks in Fig. 5 and is obtained by considering the product of two boundary states associated with distinct Fock spaces but located at the same position, $z^I$. The “dressing” corresponding to the inclusion of the dependence on the bosonic and fermionic collective coordinates is achieved by acting with the bosonic and fermionic generators of the broken symmetries. Denoting the two Fock spaces with indices 1 and 2, the dressed two-boundary state can be written as

$$||V_2;z,\eta,\epsilon\rangle\rangle=e^{iz^+(p_{1+}+p_{2+})}\,e^{iz^-(p_{1-}+p_{2-})}\left[\eta\left(Q_1^-+Q_2^-\right)\right]^8\left[\epsilon_L\left(\bar q_{1L}+\bar q_{2L}\right)\right]^4\left[\epsilon_R\left(\bar q_{1R}+\bar q_{2R}\right)\right]^4\,||z^I\rangle\rangle_1\otimes||z^I\rangle\rangle_2\,, \tag{485}$$

where $z=(z^I,z^+,z^-)$ is the 10-dimensional location of the D-instanton and the SO(8) spinors $\eta$ and $\epsilon=(\epsilon_L,\epsilon_R)$ denote the fermionic collective coordinates associated with the dynamical and kinematical supersymmetries broken by the D-instanton.
The two-point amplitude that we are interested in is obtained by coupling the dressed two-boundary state to a pair of external physical states and integrating over the bosonic and fermionic collective coordinates of the D-instanton. Denoting the incoming and outgoing states by $|s_1\rangle$ and $|s_2\rangle$, the one-particle irreducible part of the amplitude is

$$A=c\,g_s^{7/2}\,e^{2\pi i\tau}\int d^8z\,dz^+dz^-\,d^8\eta\,d^8\epsilon\;\left(\langle s_1|\otimes\langle s_2|\right)||V_2;z,\eta,\epsilon\rangle\rangle\,, \tag{486}$$

where $c$ is a numerical constant which has not been explicitly computed and the measure factor, $g_s^{7/2}e^{2\pi i\tau}$, follows from the comparison with the D-instanton induced contributions to the low-energy effective action.
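As an example of an amplitude of the type (486) we consider the leading
D-instanton contribution to the case in which $|s_1\rangle$ and $|s_2\rangle$ are particular states
in the class (472). Specifically, we consider $SO(4)_R$ singlet states with four
impurities

$$ \langle s_1| \otimes \langle s_2| = \frac{1}{\omega_{n_1}\omega_{n_2}\omega_{m_1}\omega_{m_2}}\, \varepsilon_{ijkl}\, \varepsilon_{i'j'k'l'}\; {}_h\langle 0|\, \alpha^{(1)i}_{n_1} \alpha^{(1)j}_{n_2} \tilde\alpha^{(1)k}_{n_1} \tilde\alpha^{(1)l}_{n_2} \otimes {}_h\langle 0|\, \alpha^{(2)i'}_{m_1} \alpha^{(2)j'}_{m_2} \tilde\alpha^{(2)k'}_{m_1} \tilde\alpha^{(2)l'}_{m_2} , \tag{487} $$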

where the prefactors ensure that the states are normalised to one. A generic
four impurity state has three independent mode numbers, after imposing the
level matching condition. In (487) we have made a special choice: each of
the states contains two left- and two right-moving excitations with pairwise
equal mode numbers. This is because only states of this type couple to a
D-instanton at leading order in gs . This is a general property of D-instanton-
induced amplitudes: because of the way in which the creation operators enter
in (483) and (484), the boundary state couples only to states with the same
number of left- and right-moving oscillators with pairwise matched mode
numbers.
To proceed with the calculation of the leading D-instanton contribution
we insert (487) into (486). The general strategy for the calculation of such
amplitudes consists in expanding the $\|z^I\rangle\rangle$ factors in the dressed boundary
state in a power series retaining only the terms which do not annihilate the
product $\langle s_1| \otimes \langle s_2|$ on the left. The integration over the collective coordinates
$z^+$ and $z^-$ imposes conservation of light-cone energy and momentum. Because
of the non-linear dispersion relation (462) energy conservation requires that
the mode numbers in the incoming and outgoing states be equal. Of the
remaining integrations, those over the eight transverse $z^I$'s and over the eight
fermionic moduli $\epsilon$ are trivial in the case of external states of the type we are
considering.47 The non-trivial part of the calculation is the integration over
the eight fermion moduli $\eta$

$$ \langle s_1| \otimes \langle s_2| \int d^8\eta\, \left[\eta\,(Q_1^- + Q_2^-)\right]^8 \exp\left\{ \sum_n \left[ \frac{1}{\omega_n}\left( \alpha^{(1)}_{-n}\tilde\alpha^{(1)}_{-n} + \alpha^{(2)}_{-n}\tilde\alpha^{(2)}_{-n} \right) - i\left( S^{(1)}_{-n} M_n \tilde S^{(1)}_{-n} + S^{(2)}_{-n} M_n \tilde S^{(2)}_{-n} \right) \right] \right\} |0\rangle_1 \otimes |0\rangle_2 . \tag{488} $$

These integrations induce a coupling between the two disks, since the dy-
namical supercharges, Q− , which couple to the η’s depend non-trivially on
the non-zero string oscillators. The calculation is greatly simpliﬁed when one
considers the large m limit relevant for the comparison with N = 4 SYM at
weak coupling. Since in this limit Mn ∼ m, the dominant contribution to the
amplitude with external states (487) is obtained retaining in the expansion of
the boundary state two SM S̃ factors on each disk and distributing the eight
Q− ’s evenly on the two disks. After some lengthy but straightforward algebra
one obtains
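$$ A(n_1, n_2) = \frac{1}{n_1^2 n_2^2}\, \varepsilon_{ijkl}\, \varepsilon_{i'j'k'l'}\, e^{2\pi i\tau}\, g_s^{7/2}\, m^8\, I_\eta^{ijkl,i'j'k'l'} , \tag{489} $$

where

$$ I_\eta^{ijkl,i'j'k'l'} = \int d^8\eta\;\, \eta^+\gamma^{ij}\eta^+\;\, \eta^+\gamma^{kl}\eta^+\;\, \eta^-\gamma^{i'j'}\eta^-\;\, \eta^-\gamma^{k'l'}\eta^- = \left(\varepsilon^{ijkl} + \delta^{ik}\delta^{jl} - \delta^{il}\delta^{jk}\right)\left(\varepsilon^{i'j'k'l'} - \delta^{i'k'}\delta^{j'l'} + \delta^{i'l'}\delta^{j'k'}\right) . \tag{490} $$

In the case of the amplitude (489) the only contribution comes from the term
containing the product of two ε-tensors in (490) and one gets

$$ A(n_1, n_2) = 576\, \frac{1}{n_1^2 n_2^2}\, e^{2\pi i\tau}\, g_s^{7/2}\, m^8 . \tag{491} $$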
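
The numerical factor 576 in (491) can be checked by brute force. The sketch below contracts a rank-4 Levi-Civita symbol with each bracket on the right-hand side of (490): the Kronecker-δ terms drop out by antisymmetry and each ε-ε contraction gives 4! = 24. The helper names are illustrative, and the indices are taken to run over four values, as appropriate for SO(4) vector indices.

```python
from itertools import permutations, product

def levi_civita_4():
    """Build the rank-4 Levi-Civita symbol as a dict {(i, j, k, l): +1 or -1}."""
    table = {}
    for perm in permutations(range(4)):
        # The sign of a permutation is fixed by the parity of its inversion count.
        inversions = sum(
            1 for a in range(4) for b in range(a + 1, 4) if perm[a] > perm[b]
        )
        table[perm] = -1 if inversions % 2 else 1
    return table

EPS = levi_civita_4()

def eps(i, j, k, l):
    # Index tuples with a repeated entry are absent from the table and give zero.
    return EPS.get((i, j, k, l), 0)

def delta(a, b):
    return 1 if a == b else 0

def contract(sign):
    """Contract eps_{ijkl} with eps^{ijkl} + sign * (d^{ik} d^{jl} - d^{il} d^{jk})."""
    total = 0
    for i, j, k, l in product(range(4), repeat=4):
        bracket = eps(i, j, k, l) + sign * (
            delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)
        )
        total += eps(i, j, k, l) * bracket
    return total

unprimed = contract(+1)   # first bracket of (490)
primed = contract(-1)     # second bracket of (490)
coefficient = unprimed * primed
print(unprimed, primed, coefficient)
```

Only the ε ε terms survive in each factor, which is the statement made in the text just before (491).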

The same integral (490) arises in the calculation of two-point amplitudes
between other $SO(4)_C \times SO(4)_R$ singlet four-impurity operators and in these
cases the terms involving Kronecker δ's in $I_\eta^{ijkl,i'j'k'l'}$ can contribute.
The result (489) is the leading non-perturbative correction to the one
particle irreducible part of the two-point amplitude and thus it yields the
D-instanton correction to the mass matrix for states of the form (487)

47
They give rise to δ-functions that in the present case simply integrate to one.
$$ \delta M \sim \frac{1}{\mu}\, \frac{e^{2\pi i\tau}\, g_s^{7/2}\, m^7}{(n_1 n_2)^2} = \frac{g_2^{7/2}\, e^{-\frac{8\pi^2}{g_2\lambda'} + i\vartheta}}{(n_1 n_2)^2} . \tag{492} $$

The SYM operators dual to the states in (487) are a special case of (476),
i.e. four impurity $SO(4)_C \times SO(4)_R$ singlets. They are given by

$$ O^J_{n_1,n_2,n_3} = \frac{\varepsilon_{ijkl}}{\sqrt{J^3 (g^2 N_c)^{J+4}}} \sum_{\substack{p,q,r=0 \\ p+q+r\le J}}^{J} e^{2\pi i[(n_1+n_2+n_3)p + (n_2+n_3)q + n_3 r]/J}\; {\rm Tr}\left( Z^{J-(p+q+r)} \varphi^i Z^p \varphi^j Z^q \varphi^k Z^r \varphi^l \right) . \tag{493} $$

In order to compute the one instanton contribution to the matrix of anomalous
dimensions for such operators, one considers the two-point function

$$ G(x_1, x_2) = \langle O_{n_1 n_2 n_3}(x_1)\, \bar O_{m_1 m_2 m_3}(x_2) \rangle . \tag{494} $$

The calculation proceeds as in the case of the correlators discussed in Sect. 15.
In the semi-classical approximation one needs to compute the classical profiles
of the operators On1 n2 n3 and Ōm1 m2 m3 and integrate them over the instanton
moduli space. The profiles of the operator (493) and of its conjugate contain
2J + 8 fermion zero modes each and thus (494) is non-minimal according to
the terminology introduced in Sect. 15.
Although the calculation of the two-point function (494) presents no new
conceptual difficulties, it involves rather complicated combinatorics associated
with the distribution of the exact and non-exact fermion zero modes in the two
operators. Each of the two operators should soak up eight of the 16 supercon-
formal modes in the combination (ζ 1 )2 (ζ 2 )2 (ζ 3 )2 (ζ 4 )2 , while the remaining
modes are of type ν A and ν̄ A . Expanding the trace in (493) and in the conju-
gate operator one obtains a large number of terms satisfying this requirement.
The double limit Nc → ∞, J → ∞, with J 2 /Nc fixed, simplifies somewhat the
analysis. The dominant contributions in this limit come from certain specific
distributions of the fermion modes. The large Nc limit requires that all the
ν̄ A ν B bilinears be in the 6, see (394). Moreover at large J the leading con-
tributions to the operator profiles come from terms in which as many of the
superconformal modes as possible are provided by the Z’s and Z̄’s rather than
by the impurities. This is because one gets roughly a multiplicity factor of J
associated with every Z or Z̄ providing one such mode. Taking into account
these simplifications the calculation of the profiles of On1 n2 n3 and Ōm1 m2 m3 ,
albeit rather tedious, is feasible. Eventually, the dependence on the collective
coordinates in all the relevant terms in the profile of the operator On1 n2 n3
reduces to
$$ \frac{\rho^8}{[(x_1 - x_0)^2 + \rho^2]^{J+8}} \left(\bar\nu^{[1}\nu^{4]}\right)^J (\zeta^1)^2 (\zeta^2)^2 (\zeta^3)^2 (\zeta^4)^2 (x_1) . \tag{495} $$
Similarly, all the terms in the classical profile of $\bar O_{m_1 m_2 m_3}$ which contribute
in the BMN limit contain the following factor:

$$ \frac{\rho^8}{[(x_2 - x_0)^2 + \rho^2]^{J+8}} \left(\bar\nu^{[2}\nu^{3]}\right)^J (\zeta^1)^2 (\zeta^2)^2 (\zeta^3)^2 (\zeta^4)^2 (x_2) . \tag{496} $$

After factoring out the dependence on the collective coordinates, the dependence
on the mode numbers, $n_i$ and $m_i$, is determined by sums of the form

$$ K(n_1, n_2, n_3; J) = \sum_{\substack{p,q,r=0 \\ p+q+r\le J}}^{J} e^{2\pi i[(n_1+n_2+n_3)p + (n_2+n_3)q + n_3 r]/J} \left[ \frac{c_1}{4!}\, p(p-1)(p-2)(p-3) + \frac{c_2}{3!}\, q\, p(p-1)(p-2) + \cdots \right] , \tag{497} $$

where each term contains combinatorial factors and $c_1, c_2, \ldots$ are numerical
coefficients.
The two-point function (494) is thus

$$ \langle O(x_1)\, \bar O(x_2) \rangle_{\rm inst} = c(g, N_c, J) \int \frac{d^4x_0\, d\rho}{\rho^5}\, \frac{\rho^{2J+16}}{[(x_1 - x_0)^2 + \rho^2]^{J+8}\, [(x_2 - x_0)^2 + \rho^2]^{J+8}} $$
$$ \times \int d^8\eta\, d^8\bar\xi\, \prod_{A=1}^{4} \left[\zeta^A(x_1)\right]^2 \left[\zeta^A(x_2)\right]^2 \times \int d^5\Omega\, \left(\Omega^{14}\right)^J \left(\Omega^{23}\right)^J \left[ K(n_1, n_2, n_3; J)\, K(m_1, m_2, m_3; J) \right] , \tag{498} $$

where $c(g, N_c, J)$ contains the dependence on the parameters arising from the
normalisation of the operators and the moduli space integration measure, as
well as the factors of $g\sqrt{N_c}$ obtained rewriting the $(\bar\nu^A \nu^B)_{\bf 6}$ bilinears in terms
of the angular variables $\Omega^{AB}$. In the large J limit the sums in (497) can be
approximated with integrals. For instance, the first term becomes

$$ \sum_{p,q,r=0}^{J} e^{2\pi i[(n_1+n_2+n_3)p + (n_2+n_3)q + n_3 r]/J}\, p(p-1)(p-2)(p-3) \;\to\; J^7 \int_0^1 dx \int_0^{1-x} dy \int_0^{1-x-y} dz\; e^{2\pi i[(n_1+n_2+n_3)x + (n_2+n_3)y + n_3 z]}\, x^4 . \tag{499} $$
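
The $J^7$ scaling in (499) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the constrained sum by brute force and, in the special case of vanishing mode numbers (an assumption made only so that the continuum integral is elementary), compares it with $J^7$ times the simplex integral of $x^4$, which equals $\frac{1}{2}B(5,3) = 1/210$. The function names are illustrative.

```python
import cmath

def simplex_sum(J, n1=0, n2=0, n3=0):
    """Brute-force sum over p, q, r >= 0 with p + q + r <= J of
    exp(2 pi i [(n1+n2+n3) p + (n2+n3) q + n3 r] / J) * p (p-1) (p-2) (p-3)."""
    total = 0j
    for p in range(J + 1):
        weight = p * (p - 1) * (p - 2) * (p - 3)
        if weight == 0:
            continue  # terms with p < 4 vanish identically
        for q in range(J + 1 - p):
            for r in range(J + 1 - p - q):
                phase = 2 * cmath.pi * (
                    (n1 + n2 + n3) * p + (n2 + n3) * q + n3 * r
                ) / J
                total += weight * cmath.exp(1j * phase)
    return total

def continuum_estimate_zero_modes(J):
    """J^7 times the integral of x^4 over the simplex x + y + z <= 1, i.e. J^7 / 210."""
    return J ** 7 / 210

J = 60
ratio = simplex_sum(J).real / continuum_estimate_zero_modes(J)
print(ratio)
```

For vanishing mode numbers the discrete sum can also be done in closed form, giving $(J+3)(J+2)(J+1)J(J-1)(J-2)(J-3)/210$, so the ratio approaches one with corrections of order $1/J^2$.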

From (498) and (499), recalling the analysis in Sect. 15, one can deduce the
dependence of the two-point function on the parameters. There are numerous
sources of powers of g, $N_c$ and J in the calculation, but remarkably the final
result can be expressed only in terms of the parameters $g_2$ and $\lambda'$, as required
by BMN scaling. In detail one gets
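$$ \underbrace{\left( \frac{1}{\sqrt{J^3 (g^2 N_c)^{J+4}}} \right)^{2}}_{\text{norm. operators}} \times \underbrace{e^{2\pi i\tau}\, g^8 \sqrt{N_c}}_{\text{measure}} \times \underbrace{\left( g\sqrt{N_c} \right)^{2J} \frac{1}{J^2}}_{\nu,\bar\nu\ \text{integrals}} \times \underbrace{\frac{1}{J^2}}_{x_0,\rho\ \text{integrals}} \times \underbrace{\left( J^7 \right)^2}_{\text{sums}} \;\sim\; \frac{J^7}{N_c^{7/2}}\, e^{2\pi i\tau} = g_2^{7/2}\, e^{-\frac{8\pi^2}{g_2\lambda'} + i\vartheta} , \tag{500} $$

which is in agreement with the $\lambda'$ and $g_2$ dependence of the string theory
result (489).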
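
The power counting behind (500) can be made explicit with a small bookkeeping sketch. It assumes the following reading of the individual factors, which is a reconstruction and should be checked against the original: operator normalisation $[J^3 (g^2 N_c)^{J+4}]^{-1}$, measure $g^8 \sqrt{N_c}$, $\nu,\bar\nu$ integrals $(g\sqrt{N_c})^{2J}/J^2$, $x_0,\rho$ integrals $1/J^2$, and sums $(J^7)^2$. Each exponent is stored as a pair (constant part, coefficient of J).

```python
from fractions import Fraction as F

def power(g=(0, 0), N=(0, 0), J=(0, 0)):
    """Exponents of g, N_c and J, each stored as (constant, coefficient of J)."""
    return {"g": tuple(map(F, g)), "N": tuple(map(F, N)), "J": tuple(map(F, J))}

# Assumed factors entering (500); the instanton weight exp(2 pi i tau) is left out.
factors = {
    "operator normalisation": power(g=(-8, -2), N=(-4, -1), J=(-3, 0)),
    "measure": power(g=(8, 0), N=(F(1, 2), 0)),
    "nu integrals": power(g=(0, 2), N=(0, 1), J=(-2, 0)),
    "x0, rho integrals": power(J=(-2, 0)),
    "sums": power(J=(14, 0)),
}

def combine(all_factors):
    """Multiply the factors by adding exponents symbol by symbol."""
    total = power()
    for exponents in all_factors.values():
        for symbol, (const, slope) in exponents.items():
            old_const, old_slope = total[symbol]
            total[symbol] = (old_const + const, old_slope + slope)
    return total

total = combine(factors)
print(total)
```

Under these assumptions every J-dependent exponent cancels, the powers of g cancel, and what remains is $J^7 N_c^{-7/2}$, i.e. $g_2^{7/2}$ with $g_2 = J^2/N_c$.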
The simple mode number dependence of the string two-point amplitude
is more complicated to reproduce. In the SYM two-point function the depen-
dence on the integers ni and mi is contained in the functions K(n1 , n2 , n3 ; J)
and K(m1 , m2 , m3 , J) defined in (497). Each term in these sums receives a
large number of contributions resulting in very complicated expressions. How-
ever, combining all the contributions leads to impressive cancellations and a
very simple result. In conclusion, the one-instanton contribution to the two-
point function (494) can be written in the form
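$$ G(x_1, x_2) = \frac{3^2\, (g_2)^{7/2}\, e^{-\frac{8\pi^2}{g_2\lambda'} + i\vartheta}}{2^{41}\, \pi^{13/2}}\; \frac{1}{(n_1 n_2 m_1 m_2)}\; \frac{1}{(x_{12}^2)^{J+4}}\; \log\left( \Lambda^2 x_{12}^2 \right) , \tag{501} $$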
where Λ is a scale that appears as a consequence of the logarithmic divergence
in the x0 and ρ integrals, which signals a contribution to the matrix of anoma-
lous dimensions. Notably the result is only non-zero if the mode numbers in
the two operators are equal in pairs, again in agreement with string theory.
From the coefficient of (501), one can read off the contribution to the
matrix of anomalous dimensions. The above calculation is not sufficient to
determine the actual anomalous dimension of the operator (493) since this
requires the diagonalisation of the matrix of two-point functions of all the
operators with the same quantum numbers. However, all such two-point functions
are expected to have the same dependence on $\lambda'$ and $g_2$ found in (501).
Therefore, one can conclude that the behaviour of the leading instanton
contribution to the anomalous dimension of four impurity $SO(4)_C \times SO(4)_R$
singlet operators is

$$ \gamma_{\rm inst} \sim \frac{g_2^{7/2}\, e^{-\frac{8\pi^2}{g_2\lambda'} + i\vartheta}}{(n_1 n_2)^2} , \tag{502} $$
in agreement with (492). In view of the complexity of the calculation, this
result provides a striking test of the BMN proposal.
A number of other two-point string amplitudes and their dual correlation
functions have been studied in [151, 152, 153]. The many interesting results
obtained in these papers can be summarised in the following statements.
• Four impurity operators in other representations of SO(4)R and the
corresponding string states have two-point functions which behave as
$(\lambda')^2 (g_2)^{7/2} \exp(-8\pi^2/g_2\lambda' + i\vartheta)$, i.e. they are suppressed by two powers
of $\lambda'$ with respect to those in the singlet sector.
• Two impurity operators have the same suppression. The calculation of
instanton contributions to two-point functions of two impurity operators
in N = 4 SYM is rather subtle because in order to saturate the integrations
over the superconformal modes one needs to use the classical solution for
the scalar fields involving six fermion modes, ϕ(6) AB .
• Supergravity states and their KK excitations do not couple to the D-
instanton boundary state and thus, as expected, their masses do not receive
non-perturbative corrections. This result is far from obvious in the gauge
theory and requires non-trivial cancellations which have not been explicitly
verified.
• (D-)Instantons contribute to the mixing of states in the NS–NS and R–R
sectors of the plane-wave string theory.48
• Instanton contributions to two-point functions of certain operators dual
to R–R string states, i.e. operators with an even number of fermionic
impurities, involve inverse powers of $\lambda'$. Although this behaviour is rather
surprising, it is not pathological in the $\lambda' \to 0$ limit because the inverse
powers of $\lambda'$ are accompanied by the instanton weight $\exp(-8\pi^2/\lambda' g_2)$.
These two-point functions vanish in perturbation theory.
It is notable that many of these results can be straightforwardly obtained in
string theory where they are easily deduced from properties of the D-instanton
boundary state, whereas they are much more complicated to obtain from a
field theoretical calculation in N = 4 SYM.

19 Conclusions
We would like to conclude this long review by highlighting the many top-
ics where Gabriele’s contributions along the years have been at the heart
of the theoretical developments that have made our understanding of non-
perturbative effects of field theory so deep and powerful.
Conceptually, perhaps the most important contributions in this direction
have been his works on the foundation of the notion of effective action in
a supersymmetric framework. The effective action for the N = 1 SYM the-
ory [29] and its extension to SQCD [30] are milestones along the way of dealing
with the non-perturbative structure of field theory. These works appear as an
immediate extension and generalisation of the approach established for the
description of the low-energy degrees of freedom of QCD [27, 28], as soon as
the fundamental rôle of anomalies was recognized [57, 159, 160]. The validation
of the famous Witten–Veneziano formula [161] for the η′ mass, yielded by
lattice simulations [162], and the explicit instanton calculations, carried out in
48
Unlike in flat space, in the plane-wave background this mixing occurs also in
perturbation theory beyond tree-level [153].
various instances in supersymmetric theories [4], have beautifully conﬁrmed

the predictive power of the eﬀective action approach both in a supersymmetric
and in a non-supersymmetric context.
Together with many other important, independently derived, results [21],
these ideas have proved to be of enormous impact on the way we think today
of possible extensions of the Standard Model.
We cannot end this review without mentioning what we consider the most
important step of modern physics beyond ﬁeld theory, namely the construction
of the dual Veneziano amplitude [109], which is expressed by the remarkably
simple formula
$$ A(s, t) = \int_0^1 dx\; x^{-\alpha' s - 1} (1 - x)^{-\alpha' t - 1} . \tag{503} $$

It is unanimously recognized that the paper in which (503) was proposed is the
founding paper of String Theory. It took some time to realise that the infinite tower of “resonances”
exchanged in the s and t channel are the excitations of an open bosonic string
living in 26 dimensions. Planar duality, A(s, t) = A(t, s), and the UV softness
of the amplitude are exposed quite neatly by its geometric interpretation in
terms of vertex operators inserted on the boundary of a disk. The presence of
a massless vector excitation has brought String Theory to be the most cred-
ited candidate for the unification of all interactions, including gravity. In this
respect, the graviton comes in as the massless excitation of the closed string
spectrum and its vertex operator is a sort of “square” of the vertex operator for
the massless vector of the Veneziano amplitude. That open strings might be
considered more fundamental than closed strings is something which seems
to emerge in all modern approaches, where D-branes and their open string
excitations are used to describe interactions mediated by gauge bosons. We
want also to recall that in a somewhat more distant context string excitations
have been shown to be able to account for the microscopic degrees of freedom
of black holes, thus yielding what is considered today the only satisfactory
solution to the holographic puzzle of black hole thermodynamics [163].
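
As a small illustration of the planar duality property, the integral (503) is an Euler Beta function, $A(s,t) = \Gamma(-\alpha' s)\,\Gamma(-\alpha' t)/\Gamma(-\alpha' s - \alpha' t)$, which is manifestly symmetric under $s \leftrightarrow t$. The sketch below compares a direct numerical evaluation of the integral with the Gamma-function form. The shorthand $a = -\alpha' s$, $b = -\alpha' t$ and the sample values are choices made here; the integral converges only for $a, b > 0$, other values being reached by analytic continuation.

```python
import math

def veneziano_integral(a, b, steps=200_000):
    """Midpoint-rule estimate of the integral of x^(a-1) (1-x)^(b-1) over [0, 1],
    with a = -alpha' s and b = -alpha' t (both taken positive)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (a - 1) * (1 - x) ** (b - 1)
    return h * total

def veneziano_gamma(a, b):
    """Closed form of the same integral in terms of Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

a, b = 2.5, 1.5  # illustrative values of -alpha' s and -alpha' t
numeric = veneziano_integral(a, b)
closed = veneziano_gamma(a, b)
print(numeric, closed, veneziano_gamma(b, a))
```

The poles of the Gamma functions at non-positive integer arguments correspond to the tower of resonances exchanged in the s and t channels.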
In the present review we have briefly sketched the enormous simplification
that open strings bring into the ADHM construction of instantons. However,
for lack of space we had no chance to stress the far-reaching consequences of
ideas underlying the Veneziano amplitude in the quest for unification and in
the process of clarification of the many puzzles of quantum gravity. We dare to
conclude by saying that we expect the Veneziano amplitude to be among the
basic blocks of any consistent formulation of the fundamental laws of Nature.

Acknowledgements

Discussions with Massimo Testa, Yassen Stanev and especially Michael Green
are gratefully acknowledged. The work of S.K. was supported in part by a
Marie Curie Intra-European Fellowship and by the EU-RTN network
Constituents, Fundamental Forces and Symmetries of the Universe
(MRTN-CT-2004-005104).

Appendix A – Notations
A.1 Generalities
We work in Euclidean metric with $g^E_{\mu\nu} = \delta_{\mu\nu}$. Factors of the gauge coupling
constant, g, will be explicit everywhere. We are interested in computing
expectation values of gauge-invariant (possibly multi-local) renormalisable,
composite operators, i.e. functional integrals of the type

$$ \langle O\rangle = \frac{1}{Z} \int D\mu(\psi, \bar\psi)\, DA_\mu\, \exp\left[ -\left( S_{\rm YM} + \int d^4x\, \bar\psi\, (D\!\!\!\!/ + m)\, \psi \right) \right] O[\psi, \bar\psi, A_\mu] , \tag{A.1} $$

where $D\!\!\!\!/$ can be either a Dirac or a Weyl–Dirac operator (see below) and Z
is a similar functional integral with O replaced by the identity operator.

A.2 Yang–Mills Action

The pure Yang–Mills action has the form

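$$ S_{\rm YM} = \frac{1}{2} \int d^4x\, {\rm Tr}[F_{\mu\nu} F_{\mu\nu}] = \frac{1}{4} \int d^4x \sum_a F^a_{\mu\nu} F^a_{\mu\nu} , \tag{A.2} $$
$$ F_{\mu\nu} = T^a F^a_{\mu\nu} , \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + i g [A_\mu, A_\nu] . \tag{A.3} $$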

A.3 Some Group Theory Formulae

In (A.2) the matrices $T^a \equiv T^a_{N_c}$, $a = 1, 2, \ldots, N_c^2 - 1$ are the $SU(N_c)$
generators in the fundamental representation, $N_c$. In general, the generators, $T_R$,
in the (irreducible) representation R are normalised according to the formula

$$ {\rm Tr}\left[ T^a_R T^b_R \right] = \ell[R]\, \delta^{ab} , \tag{A.4} $$

with $\ell[R]$ the Dynkin index of the representation. It is customary to normalise
generators in the $N_c$, $\bar N_c$ and Adj representations so that

$$ \ell[N_c] = \ell[\bar N_c] = \frac{1}{2} , \qquad \ell[{\rm Adj}] = N_c . \tag{A.5} $$

Taking the trace of the equation which defines the quadratic Casimir operator
of the representation R

$$ \sum_a T^a_R T^a_R = c_2[R]\, 1_{\dim(R)\times\dim(R)} , \tag{A.6} $$

and using (A.4), one gets the useful relation

$$ c_2[R]\, \dim(R) = \ell[R]\, \dim(G) . \tag{A.7} $$
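
The relation (A.7) can be checked on the simplest example. The sketch below uses SU(2), i.e. $N_c = 2$, chosen here purely for illustration, with fundamental generators $\sigma^a/2$ and adjoint generators $(T^a)_{bc} = -i\varepsilon_{abc}$. It reads off the Dynkin index from (A.4) and the Casimir from (A.6), then compares the two sides of (A.7). The helper names are not from the text.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def eps3(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

# SU(2) generators: fundamental T^a = sigma^a / 2, adjoint (T^a)_{bc} = -i eps_{abc}.
FUNDAMENTAL = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]
ADJOINT = [[[-1j * eps3(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

def dynkin_index(gens):
    """Read the index off Tr[T^a T^b] = index * delta^{ab}, as in (A.4)."""
    for a, Ta in enumerate(gens):
        for b, Tb in enumerate(gens):
            if a != b:
                assert abs(trace(matmul(Ta, Tb))) < 1e-12  # off-diagonal traces vanish
    return trace(matmul(gens[0], gens[0])).real

def quadratic_casimir(gens):
    """Read c_2 off sum_a T^a T^a = c_2 * identity, as in (A.6)."""
    dim = len(gens[0])
    total = [[sum(matmul(T, T)[i][j] for T in gens) for j in range(dim)] for i in range(dim)]
    return total[0][0].real

results = {}
for name, gens in (("fundamental", FUNDAMENTAL), ("adjoint", ADJOINT)):
    index = dynkin_index(gens)
    c2 = quadratic_casimir(gens)
    dim_R, dim_G = len(gens[0]), len(gens)
    results[name] = (index, c2, c2 * dim_R, index * dim_G)
    print(name, results[name])
```

For the fundamental representation this reproduces the index 1/2 of (A.5), and for the adjoint the index $N_c = 2$.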

A.4 Dirac Fermions

The Euclidean action of a Dirac fermion, ψ, ψ̄, in the representation R of the
gauge group SU (Nc ) takes the form

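$$ S_{\rm DF} = \int d^4x\; \bar\psi_r\, D^{\,r}_{\mu\, s}[R]\, \gamma_\mu\, \psi^s , \qquad r, s = 1, \ldots, \dim(R) , \tag{A.8} $$

where Dirac indices are understood and

$$ D^{\,r}_{\mu\, s}[R] = \partial_\mu \delta^r_s - g\, (T^a_R)^r_s\, A^a_\mu . \tag{A.9} $$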
In eq. (A.8) hermitean γ-matrices are used, satisfying the anti-commutation
relations {γμ , γν } = 2δμν .

A.5 Weyl Fermions

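The Euclidean action of a Weyl fermion, $\lambda^a_\alpha$, $\bar\lambda^a_{\dot\alpha}$ ($\alpha, \dot\alpha = 1, 2$) belonging to the
adjoint representation of the gauge group $SU(N_c)$ takes the form

$$ S_{\rm WF} = \int d^4x\; \bar\lambda^a_{\dot\alpha}\, D^{ab}_\mu[{\rm Adj}]\, \bar\sigma_\mu^{\dot\alpha\alpha}\, \lambda^b_\alpha , \tag{A.10} $$

where

$$ D^{ab}_\mu[{\rm Adj}] = \partial_\mu \delta^{ab} - g f^{abc} A^c_\mu , \tag{A.11} $$
$$ \bar\sigma_\mu = (1, -i\sigma_k) , \tag{A.12} $$

with $\sigma_k$ the Pauli matrices. It is also useful to introduce the matrices $(\sigma_\mu)_{\alpha\dot\alpha}$

$$ \sigma_\mu = (1, i\sigma_k) , \tag{A.13} $$

and the definitions

$$ (\sigma_{\mu\nu})_\alpha{}^\beta = \frac{1}{2} \left( \sigma_{\mu\,\alpha\dot\alpha}\, \bar\sigma_\nu^{\dot\alpha\beta} - \sigma_{\nu\,\alpha\dot\alpha}\, \bar\sigma_\mu^{\dot\alpha\beta} \right) , \tag{A.14} $$
$$ (\bar\sigma_{\mu\nu})^{\dot\beta}{}_{\dot\alpha} = \frac{1}{2} \left( \bar\sigma_\mu^{\dot\beta\alpha}\, \sigma_{\nu\,\alpha\dot\alpha} - \bar\sigma_\nu^{\dot\beta\alpha}\, \sigma_{\mu\,\alpha\dot\alpha} \right) . \tag{A.15} $$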
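
A quick consistency check of the conventions (A.12) to (A.14) is sketched below with explicit 2×2 matrices. It verifies that $\sigma_\mu \bar\sigma_\nu + \sigma_\nu \bar\sigma_\mu = 2\delta_{\mu\nu}$ and that $\sigma_{\mu\nu}$ is antisymmetric and traceless. Listing the identity component first and the helper names are assumptions of this sketch, not statements from the text.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B, sign=1):
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def is_close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Assumed ordering: index 0 is the identity component, indices 1..3 the Pauli part.
SIGMA = [IDENTITY] + [scale(1j, s) for s in PAULI]        # sigma_mu = (1, i sigma_k)
SIGMA_BAR = [IDENTITY] + [scale(-1j, s) for s in PAULI]   # sigma-bar_mu = (1, -i sigma_k)

def sigma_munu(mu, nu):
    """(1/2) (sigma_mu sigma-bar_nu - sigma_nu sigma-bar_mu), as in (A.14)."""
    first = matmul(SIGMA[mu], SIGMA_BAR[nu])
    second = matmul(SIGMA[nu], SIGMA_BAR[mu])
    return scale(0.5, add(first, second, sign=-1))

pairs = [(m, n) for m in range(4) for n in range(4)]

clifford_ok = all(
    is_close(
        add(matmul(SIGMA[m], SIGMA_BAR[n]), matmul(SIGMA[n], SIGMA_BAR[m])),
        scale(2 if m == n else 0, IDENTITY),
    )
    for m, n in pairs
)
antisymmetric_ok = all(
    is_close(sigma_munu(m, n), scale(-1, sigma_munu(n, m))) for m, n in pairs
)
traceless_ok = all(
    abs(sigma_munu(m, n)[0][0] + sigma_munu(m, n)[1][1]) < 1e-12 for m, n in pairs
)
print(clifford_ok, antisymmetric_ok, traceless_ok)
```

The same helpers can be reused for $\bar\sigma_{\mu\nu}$ of (A.15) by exchanging the roles of the two lists.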
A.6 The SYM Action
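The Euclidean action of the minimal N = 1 supersymmetric gauge theory
(super Yang–Mills, SYM), $S_{\rm SYM}$, when written in components, is simply given
by the sum of (A.2) and (A.10). The classical action is invariant under the
$U_\lambda(1)$ R-symmetry [164]

$$ \lambda \to e^{i\alpha}\lambda , \qquad A_\mu\ \ {\rm untouched} . \tag{A.16} $$

Quantum-mechanically this symmetry is anomalous with

$$ \partial_\mu J^{(\lambda)}_\mu = 2i\, \ell[{\rm Adj}]\, \frac{g^2}{32\pi^2}\, F^a_{\mu\nu} \tilde F^a_{\mu\nu} = 2i N_c\, \frac{g^2}{32\pi^2}\, F^a_{\mu\nu} \tilde F^a_{\mu\nu} , \tag{A.17} $$
$$ J^{(\lambda)}_\mu = \bar\lambda^a_{\dot\alpha}\, \bar\sigma_\mu^{\dot\alpha\alpha}\, \lambda^a_\alpha , \tag{A.18} $$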
but has a non-anomalous discrete subgroup

$$ Z_{2N_c} = \{ z_k = e^{i\alpha_k} ,\ \alpha_k = 2\pi k/2N_c ,\ k = 1, 2, \ldots, 2N_c \} . \tag{A.19} $$

This important statement can be proved in diﬀerent ways. An elegant proof

makes use of the (natural) extension of SYM in which a ϑ term is added to the
action (see the last section of Appendix C). In this situation under a Uλ (1)
rotation of the gluino ﬁelds in the functional integral, we get the (anomalous)
WTI

$$ \langle O_1(x_1) \ldots O_n(x_n) \rangle_{(\vartheta)} = \langle O_1(x_1) \ldots O_n(x_n) \rangle_{(\vartheta + 2N_c\alpha)} . \tag{A.20} $$

Since the theory is classically invariant under such a rotation (the transfor-
mation u(α)Ok u† (α) = exp (iηk α)Ok , with ηk the Uλ (1) charge of Ok leaves
invariant the correlator if the vacuum is annihilated by the unitary operator
u(α)), the only eﬀect of the transformation is to change the value of the ϑ
angle. The change does not aﬀect physics if

2Nc α = 2nπ , n ∈ Z. (A.21)

Clearly this result holds for any value of ϑ, thus also at ϑ = 0.

When written in superﬁelds, the SYM action takes the form [164]

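$$ S_{\rm SYM} = \int d^4x\, d^2\theta\; {\rm Tr}[W_\alpha W^\alpha] , \tag{A.22} $$
$$ W_\alpha = -\frac{1}{4}\, \bar D^2\, e^{-gV} D_\alpha\, e^{gV} , \tag{A.23} $$
$$ V(x, \theta) = C(x) + \theta\mu(x) + \bar\theta\bar\mu(x) + \frac{1}{2}\theta^2 S(x) + \frac{1}{2}\bar\theta^2 \bar S(x) + \theta\sigma^\mu\bar\theta A_\mu(x) + \frac{1}{2}\bar\theta^2\theta\lambda(x) + \frac{1}{2}\theta^2\bar\theta\bar\lambda(x) + \frac{1}{4}\theta^2\bar\theta^2 D(x) + \ldots , \tag{A.24} $$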
where dots stand for terms that can be expressed as derivatives of the ﬁelds
already present in (A.24).

A.7 The SQCD Action

The Euclidean action of the N = 1 supersymmetric theory which more closely
resembles QCD is obtained by coupling to the SYM supermultiplet, in a
supersymmetric and gauge invariant way, $N_f$ pairs of matter chiral superfields
($f = 1, \ldots, N_f$, $r = 1, \ldots, N_c$)

$$ \Phi^r_f(x) = \phi^r_f(y) + \sqrt{2}\, \theta^\alpha \psi^r_{\alpha f}(y) + \theta^2 F^r_f(y) , \qquad y_\mu = x_\mu + i\theta\sigma_\mu\bar\theta , \tag{A.25} $$
$$ \tilde\Phi^f_r(x) = \tilde\phi^f_r(y) + \sqrt{2}\, \theta^\alpha \tilde\psi^f_{\alpha r}(y) + \theta^2 \tilde F^f_r(y) , \tag{A.26} $$

belonging to the representations Nc and N̄c , respectively, of the gauge group.

In this way the gauge-invariant mass term
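$$ S^{\rm mass}_{\rm SQCD} = \sum_f \left[ m_f \int d^4x \sum_{\alpha r} \tilde\psi^{\alpha f}_r \psi^r_{\alpha f} + m_f^* \int d^4x \sum_{\dot\alpha r} \bar\psi^f_{\dot\alpha r}\, \bar{\tilde\psi}{}^{\dot\alpha r}_f + |m_f|^2 \int d^4x \sum_r \left( \phi^{*f}_r \phi^r_f + \tilde\phi^{*r}_f \tilde\phi^f_r \right) \right] \tag{A.27} $$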

can be constructed. The rest of the action is completely standard and can be
found in any textbook or, for instance, in [4].

A.8 The “Flavour” Symmetries of the SQCD Action

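(I) The classical SQCD action is invariant under the $U_\lambda(1)$ R symmetry
[164] 49 which now transforms in a non-trivial way gluinos and scalars
according to

$$ \lambda \to e^{i\alpha}\lambda , \quad \phi \to e^{i\alpha}\phi , \quad \tilde\phi \to e^{i\alpha}\tilde\phi , \quad + \ {\rm complex\ conjugate} , \qquad A_\mu , \psi , \bar\psi , \tilde\psi , \bar{\tilde\psi}\ \ {\rm untouched} . \tag{A.28} $$

The $U_\lambda(1)$ R-symmetry of SQCD is anomalous with the same anomaly as in
SYM (see (A.17)). Again only the $Z_{2N_c}$ subgroup is unbroken. With respect
to the SYM case $J^{(\lambda)}_\mu$ must now be augmented with the inclusion of the matter
contribution and reads

$$ J^{(\lambda)}_\mu = \bar\lambda^a_{\dot\alpha}\, \bar\sigma_\mu^{\dot\alpha\alpha}\, \lambda^a_\alpha + \sum_f \left[ i\phi^{*f} \overset{\leftrightarrow}{\partial}_\mu \phi_f + \left( \phi_f \to \tilde\phi^f \right) \right] . \tag{A.29} $$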

(II) The massless theory with $N_f$ flavours possesses a global
$SU(N_f) \times SU(N_f) \times U_V(1) \times U_A(1)$ symmetry. The chiral $SU(N_f) \times SU(N_f)$ symmetry is
broken by matter mass terms. For instance, if all masses are equal ($m_f = m$),
the unbroken subgroup is the diagonal vector group $U_V(N_f)$, while, if all
masses are different ($m_1 \neq m_2 \neq \ldots \neq m_{N_f}$), the leftover unbroken subgroup
is $U(1)^{N_f}$.
(III) The UA (1) transformation

(φ, ψ) → eiα (φ, ψ) , (φ̃, ψ̃) → eiα (φ̃, ψ̃) , + complex conjugate ,
λ, Aμ , untouched . (A.30)

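is classically a symmetry at vanishing masses, but it is quantum-mechanically
anomalous with

$$ \partial_\mu J^{(A)}_\mu = 2i N_f\, \frac{g^2}{32\pi^2}\, F^a_{\mu\nu} \tilde F^a_{\mu\nu} , \tag{A.31} $$
$$ J^{(A)}_\mu = \sum_f \left[ \bar\psi^f_{\dot\alpha}\, \bar\sigma_\mu^{\dot\alpha\alpha}\, \psi_{\alpha f} + i\phi^{*f} \overset{\leftrightarrow}{\partial}_\mu \phi_f + \left( (\phi_f, \psi_f) \to (\tilde\phi^f, \tilde\psi^f) \right) \right] . \tag{A.32} $$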

49
$U_\lambda(1)$ is sometimes also called $U_A^{PQ}(1)$ [25], where PQ stands for
Peccei–Quinn [165], because it is anomalous and classically unbroken even at
non-vanishing masses.
(IV) Often, instead of $U_\lambda(1)$, the linear combination

$$ U_R(1) = \frac{3}{2}\, U_\lambda(1) - \frac{1}{2}\, U_A(1) \tag{A.33} $$

is introduced because the associated current belongs to the current
supermultiplet [166]. This classical symmetry is anomalous and from (A.17) and (A.31)
we see that the associated current obeys the anomaly equation

$$ \partial_\mu J^{(R)}_\mu = i (3N_c - N_f)\, \frac{g^2}{32\pi^2}\, F^a_{\mu\nu} \tilde F^a_{\mu\nu} . \tag{A.34} $$
(V) It is possible to construct a non-anomalous, exactly conserved current,
$J^{(\hat A)}_\mu$, out of the two anomalous currents $J^{(A)}_\mu$ and $J^{(\lambda)}_\mu$ (see (A.32) and (A.29)).
One finds

$$ J^{(\hat A)}_\mu = -N_f\, J^{(\lambda)}_\mu + N_c\, J^{(A)}_\mu . \tag{A.35} $$

The transformations induced on the fields by the associated $U_{\hat A}(1)$ symmetry
(A.35) are

$$ (\phi, \tilde\phi) \to e^{i(N_c - N_f)\alpha} (\phi, \tilde\phi) , \quad (\psi, \tilde\psi) \to e^{iN_c\alpha} (\psi, \tilde\psi) , \quad \lambda \to e^{-iN_f\alpha} \lambda , \quad + \ {\rm complex\ conjugate} . \tag{A.36} $$

(VI) In Table A.1 we collect for convenience the charges of elementary
and composite gauge and matter fields under the various U(1)'s we have
introduced. In the last two rows we report the coefficient of the anomaly of the
associated current in units of $Q = \frac{g^2}{32\pi^2} F^a_{\mu\nu} \tilde F^a_{\mu\nu}$ and we indicate whether the
conservation of the current is broken by mass terms or not.

Table A.1. The SU(N_f) × SU(N_f) quantum numbers and U(1)-charges of elementary
and composite fields of SQCD. The anomaly associated with each U(1) current
is given in units of $\frac{g^2}{32\pi^2} F^a_{\mu\nu} \tilde F^a_{\mu\nu}$. In the last line by “Yes” (“No”) we mean that
the corresponding symmetry is (is not) broken by the presence of mass terms

Field         SU(N_f)  SU(N_f)  U_A(1)  U_R(1)      U_Â(1)        U_λ(1) = U_A^PQ(1)  U_V(1)
λ             1        1        0       3/2         −N_f          1                   0
ψ             N_f      1        1       −1/2        N_c           0                   1
φ             N_f      1        1       1           N_c − N_f     1                   0
ψ̃             1        N_f      1       −1/2        N_c           0                   −1
φ̃             1        N_f      1       1           N_c − N_f     1                   0
S|θ=0         1        1        0       3           −2N_f         2                   0
T^h_f|θ=0     N_f      N_f      2       2           2(N_c − N_f)  2                   0
Anomaly       /        /        2N_f    3N_c − N_f  0             2N_c                0
Mass term     Yes      Yes      Yes     Yes         Yes           No                  No
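
The charges and anomaly coefficients of the derived symmetries follow mechanically from the $U_\lambda(1)$ and $U_A(1)$ assignments in (A.28) and (A.30) together with the linear combinations (A.33) and (A.35). The sketch below carries out this arithmetic for illustrative values $N_c = 3$, $N_f = 2$ (a choice made here, not in the text). The composite S is taken to carry the charge of two gluinos and T that of a $\tilde\phi\phi$ pair, which is an assumption about their field content.

```python
from fractions import Fraction as F

# (U_lambda, U_A) charges read off from (A.28) and (A.30).
BASE = {
    "lambda": (1, 0),
    "psi": (0, 1),
    "phi": (1, 1),
    "psi_tilde": (0, 1),
    "phi_tilde": (1, 1),
}

def u_r(q_lambda, q_a):
    """Charge under U_R(1) = (3/2) U_lambda(1) - (1/2) U_A(1), see (A.33)."""
    return F(3, 2) * q_lambda - F(1, 2) * q_a

def u_hat(q_lambda, q_a, n_c, n_f):
    """Charge under the combination -N_f U_lambda(1) + N_c U_A(1), see (A.35)."""
    return -n_f * q_lambda + n_c * q_a

def anomaly(c_lambda, c_a, n_c, n_f):
    """Anomaly coefficient of c_lambda J^(lambda) + c_a J^(A), using the
    coefficients 2 N_c from (A.17) and 2 N_f from (A.31)."""
    return c_lambda * 2 * n_c + c_a * 2 * n_f

n_c, n_f = 3, 2  # illustrative values
r_charges = {name: u_r(*q) for name, q in BASE.items()}
hat_charges = {name: u_hat(*q, n_c, n_f) for name, q in BASE.items()}

# Composites, assuming S ~ lambda lambda and T ~ phi_tilde phi.
r_charges["S"] = 2 * r_charges["lambda"]
r_charges["T"] = r_charges["phi"] + r_charges["phi_tilde"]

anomaly_r = anomaly(F(3, 2), F(-1, 2), n_c, n_f)
anomaly_hat = anomaly(-n_f, n_c, n_c, n_f)
print(r_charges, hat_charges, anomaly_r, anomaly_hat)
```

The vanishing of the last coefficient for any $N_c$ and $N_f$ is the statement that the current (A.35) is exactly conserved.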
A.9 Gluino Zero Modes

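The explicit expression of the $2N_c$ gluino zero modes endowed with the
correct normalisation $\int d^4x \sum_a \lambda^{a\alpha}(x)\, \lambda^{a*}_\alpha(x) = 1$ is (the index counting the
$2N_c$ zero modes is indicated in parentheses)

$$ (\lambda^\alpha_{(k)})^s_r(x) = \frac{1}{\pi} \left( \delta^\alpha_r \delta^s_k - \varepsilon^{\alpha s} \varepsilon_{rk} \right) \rho^2 (f(x))^2 , \qquad k = 1, 2 , \tag{A.37} $$
$$ (\lambda^\alpha_{(\dot\alpha)})^s_r(x) = -\frac{i}{\sqrt{2}\,\pi}\, \bar\sigma_\mu^{\dot\alpha\beta} (x - x_0)_\mu \left( \delta^\alpha_r \delta^s_\beta - \varepsilon^{\alpha s} \varepsilon_{r\beta} \right) \rho\, (f(x))^2 , \qquad \dot\alpha = 1, 2 , \tag{A.38} $$
$$ (\lambda^\alpha_{(\pm i)})^s_r(x) = \frac{1}{\sqrt{2}\,\pi} \left( \delta^\alpha_r \delta^s_i \pm \varepsilon^{\alpha s} \delta_{ri} \right) \rho\, (f(x))^{3/2} , \qquad i = 1, \ldots, N_c - 2 , \tag{A.39} $$

with

$$ f(x) = \frac{1}{(x - x_0)^2 + \rho^2} . \tag{A.40} $$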
The first four modes are SU (2) triplets, while the last 2(Nc − 2) are doublets.
The four triplets can be directly generated starting from the expression (24)
and acting on it with any one of the two supersymmetric and two superconformal
transformations that are unbroken in the background instanton field. They
are often called “exact zero modes” in the literature, see, for instance, [121]
and references therein. This name originates from the following observation.
Effectively the overall field configuration which is relevant for the kind of com-
putations we have presented in Sect. 4 (see Sect. 14 for further applications)
is given by the gauge instanton solution, the associated set of fermionic zero
modes and the expression of the scalar fields that are obtained by solving their
linearised classical e.o.m., i.e. the e.o.m. that result upon neglecting the quar-
tic scalar self-interaction terms. The reason for neglecting such terms is that
the latter would give rise to contributions of higher order in g compared to the
leading ones we have been keeping. When the action of the theory is computed
in this approximation and on the above field configuration, it just happens
that the result does not depend on the fermionic collective coordinates as-
sociated with the four SU (2) triplet zero modes of (A.37) and (A.38). The
other fermionic zero modes give rise to quartic terms in the remaining
2(Nc − 2) fermionic collective coordinates (see Sect. 14).

Appendix B – Bosonic Collective Coordinates and Functional Integration
In this appendix we want to explain how one can compute the pure gauge
part of the functional integration in the semi-classical approximation around
a non-trivial instantonic background. We will follow the method of [10], which
neatly explains how to deal with the problem of bosonic zero modes and the
consequent need of introducing collective coordinates.
In the semi-classical approximation one starts by expanding the (gauge)
action around the (instanton) classical solution, keeping only terms up to
quadratic ﬂuctuations. Setting

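$$ A_\mu = A^I_\mu + Q_\mu , \tag{B.1} $$

one gets in this way

$$ S_{\rm YM} = S^I - \frac{1}{2} \int d^4x\; {\rm Tr}\left[ Q_\mu M_{\mu\nu}(A^I) Q_\nu \right] + O(Q^3) , \tag{B.2} $$

where

$$ S^I = \frac{8\pi^2}{g^2}\, |K| , \tag{B.3} $$
$$ M_{\mu\nu}(A^I) = -D^2(A^I)\, \delta_{\mu\nu} + D_\mu(A^I) D_\nu(A^I) - 2 [F^I_{\mu\nu}, \cdot\,] , \tag{B.4} $$
$$ D_\mu(A^I) = \partial_\mu + g [A^I_\mu, \cdot\,] . \tag{B.5} $$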

The operator Mμν (AI ) has quite a large manifold of (normalisable and
non-normalisable) zero modes. Not only is it annihilated by all the functions
of the form $D_\nu(A^I) F(x)$, as a consequence of gauge invariance, but
also by the $4|K|N_c$ normalisable vectors that are obtained by differentiating
the instanton field configuration with respect to the $4|K|N_c$ parameters,
$\beta_i$, $i = 1, \ldots, 4|K|N_c$ (bosonic collective coordinates in the following), upon
which the most general classical solution depends.50 The existence of such zero
modes is immediately proved by noticing that by differentiating the classical
instanton e.o.m., $(\delta S_{\rm YM}/\delta A_\mu)|_{A^I_\mu} = 0$, with respect to $\beta_i$, one gets

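$$ \int d^4x'\, \left. \frac{\delta^2 S_{\rm YM}}{\delta A_\mu(x)\, \delta A_\nu(x')} \right|_{A^I_\mu} \frac{\partial A^I_\nu(x', \beta)}{\partial \beta_i} = M_{\mu\nu}(A^I)\, \frac{\partial A^I_\nu(x, \beta)}{\partial \beta_i} = 0 , \qquad i = 1, \ldots, 4|K|N_c . \tag{B.6} $$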

The most elegant way to deal with an operator with such a kernel was worked
out some time ago in [10]. The idea is to functionally integrate over all
fluctuations, $Q_\mu(x) = A_\mu(x) - A^I_\mu(x, \beta)^U$, that are orthogonal to the manifold
described by $A^I_\mu(x, \beta)^U$ when U spans the space of topologically trivial gauge
transformations, $G_0$, and the parameters $\beta_i$ are let to move in their allowed
50
We are referring here to the SU (Nc ) gauge group case. In the one-instanton sector,
|K| = 1, the collective coordinates are the size and the location of the instanton
and its 4Nc − 5 “orientation angles” in colour space [9, 20, 23].
range of variation. In more mathematical terms the latter manifold is called

the “instanton moduli space”.
The orthogonality conditions (B.6) are imposed by a straightforward gen-
eralisation of the usual Faddeev–Popov (FP) procedure [167] which consists
in introducing in the functional integral the identity

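$$ 1 = \Delta_{\rm FP} \int_{G_0} \prod_{a,x} \delta h^a(x) \int_{M} \prod_i d\beta_i\; \delta\!\left( \left\langle (A_\mu - A^I_\mu(\beta)^{U_h}) ,\, \frac{\delta A^I_\mu(\beta)^{U_h}}{\delta h^a(x)} \right\rangle \right) \times \delta\!\left( \left\langle (A_\nu - A^I_\nu(\beta)^{U_h}) ,\, \frac{\partial A^I_\nu(\beta)^{U_h}}{\partial \beta_i} \right\rangle \right) , \tag{B.7} $$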
where ΔFP is the FP determinant. In (B.7) we have used the shorthand no-
tation
$$ \langle f_\mu, g_\mu \rangle = \frac{1}{2} \int d^4x\; {\rm Tr}_{\rm Adj}[f_\mu(x)\, g_\mu(x)] \tag{B.8} $$
for the scalar product < ·, · > induced in the space of functions by the form of
the gauge action. After some algebra (see [168] for details) (B.7) can be cast
in the more expressive form

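$$ 1 = \Delta_{\rm FP} \int_{G_0} D\mu[h^a(x)] \int_{M} \prod_i d\beta_i \prod_{a,x} \delta\!\left( {\rm Tr}\left[ T^a D_\mu(A^I) \left( A_\mu(x)^{U_h^\dagger} - A^I_\mu(x, \beta) \right) \right] \right) \times \delta\!\left( \left\langle (A_\nu^{U_h^\dagger} - A^I_\nu(\beta)) ,\, \frac{\partial A^I_\nu(\beta)}{\partial \beta_i} \right\rangle \right) , \tag{B.9} $$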
which shows that we are naturally brought to work in the instanton back-
ground gauge. ΔFP can be shown to have in the semi-classical approximation
the expression
$$ \Delta_{\rm FP} = {\det}^{x,y}_{a,b} \left[ -D^2(A^I)^{ab}\, \delta(x - y) \right]\; {\det}_{i,j} \left[ \langle a^{(i)}(\beta), a^{(j)}(\beta) \rangle \right] , \tag{B.10} $$

where the $a^{(i)}$'s ($i = 1, 2, \ldots, 4|K|N_c$) are the mutually orthogonal (see the
next subsection) vectors

$$ a^{(i)}_\mu(x, \beta) = \left[ \delta_{\mu\nu} - D_\mu(A) [D^2(A)]^{-1} D_\nu(A) \right]_{A_\mu = A^I_\mu} \frac{\partial A^I_\nu(x, \beta)}{\partial \beta_i} . \tag{B.11} $$

We will indicate by $\|a^{(i)}\|$ their norm in the metric induced by the scalar
product (B.8). The vectors $a^{(i)}_\mu(x, \beta)$ are not exactly the functions $\partial A^I_\mu(x, \beta)/\partial \beta_i$.
They differ from the latter by a term which makes them fulfil the equation

$$ D_\mu(A^I)\, a^{(i)}_\mu(x, \beta) = 0 , \tag{B.12} $$

i.e. which makes them transverse with respect to the covariant derivative in
the instanton background.
Putting everything together and noticing that the orthogonality condition
among the vectors (B.11) makes immediate the computation of the factor
$\det_{i,j}[\langle a^{(i)}(\beta), a^{(j)}(\beta) \rangle]$ in $\Delta_{\rm FP}$, one finally gets for the v.e.v. of a gauge
invariant operator, O(A), in the semi-classical approximation around an
instanton configuration with winding number |K| the expression

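$$ \langle O \rangle\big|_{\rm s.c.} = \frac{e^{-\frac{8\pi^2}{g^2}|K|}}{Z|_{\rm s.c.}} \int DQ_\mu \prod_i \frac{d\beta_i\, \|a^{(i)}\|}{\sqrt{2\pi}}\; e^{-\frac{1}{2} \int d^4x\, d^4y\; Q_\mu M^{\rm g.f.}_{\mu\nu} Q_\nu}\, \det[-D^2(A^I)]\; \delta\!\left( D^{ab}_\mu(A^I) Q^b_\mu \right) O(A^I) , \tag{B.13} $$

where

$$ Z|_{\rm s.c.} = \int DQ_\mu\; e^{-\frac{1}{2} \int d^4x\, d^4y\; Q_\mu M^{\rm g.f.}_{0;\mu\nu} Q_\nu}\, \det[-\partial^2]\; \delta(\partial_\mu Q^a_\mu) , \tag{B.14} $$
$$ M^{\rm g.f.}_{\mu\nu} = -D^2(A^I)\, \delta_{\mu\nu} - 2 [F^I_{\mu\nu}, \cdot\,] , \tag{B.15} $$
$$ M^{\rm g.f.}_{0;\mu\nu} = -\partial^2 \delta_{\mu\nu} . \tag{B.16} $$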

Z|s.c. is the necessary normalisation factor which, in order to be consistent

with the approximation we are working in, must be evaluated by expanding
the action around the trivial solution of the field e.o.m. keeping only terms
quadratic in the fluctuations. Note that to make more transparent analogies
and differences between (B.13) and (B.14) we have named $Q_\mu$ the integration
variable also in (B.14). $M^{\rm g.f.}_{\mu\nu}$ ($M^{\rm g.f.}_{0;\mu\nu}$) is the gauge fixed operator that
governs the quadratic fluctuations of the gauge field in the instanton (trivial)
background and $\det[-D^2(A^I)]$ ($\det[-\partial^2]$) is the associated FP determinant.
One can formally perform the gauge functional integrations in the r.h.s.
of (B.13), getting

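$$ \langle O\rangle\big|_{s.c.} = \mu^{n_B}\,\frac{e^{-\frac{8\pi^2}{g^2}|K|}}{Z|_{s.c.}} \int\prod_{i=1}^{n_B}\frac{d\beta_i\,\|a^{(i)}\|}{\sqrt{2\pi}}
\times\frac{(\det{}'[M^{g.f.}_{\mu\nu}])^{-\frac12}\,\det[-D^2(A^I)]}{(\det[M^{g.f.}_{0;\mu\nu}])^{-\frac12}\,\det[-\partial^2]}\;O(A^I)\,, \tag{B.17} $$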

where $n_B = 4|K|N_c$ is the number of bosonic zero modes and μ is the
subtraction point (see below). The prime on $\det{}'[M^{g.f.}_{\mu\nu}]$ means that the
determinant should be taken in the space orthogonal to the manifold spanned
by the zero modes (B.11).
A number of observations are in order here.
(1) As is seen from the above equations, by the method of [10] one is
naturally led to the background gauge fixing condition $D^{ab}_\mu(A^I)Q^b_\mu = 0$.
(2) One must imagine that the above functional integral has been com-
puted in some regularisation. In these instantonic computations it is custom-
ary to work in the Pauli–Villars (PV) regularisation [2], where a ghost-like
field with mass μ (but opposite statistics) is introduced for each fundamental
field in the action (gluons, FP ghosts and, if present, fermions). The net effect
of the presence of PV regulators is that the result of the functional integration
over quadratic fluctuations will have the form of a product of factors, with each
term being the ratio of the determinant of each particle quadratic operator
divided by the associated PV ghost determinant (raised to the appropriate
power according to multiplicity and statistics).
(3) When the limit μ → ∞ is taken, the only left-over μ dependence is
the multiplicative factor $\mu^{n_B}$, $n_B = 4|K|N_c$. This factor arises for
the following reason. There is a one-to-one correspondence between the
eigenvectors (and the eigenvalues) of analogous operators in each ratio of
determinants, except for the zero modes. There is, in fact, a mismatch between
the numerator and the denominator in the sense that there are some (actually
$n_B = 4|K|N_c$) eigenvalues in the PV denominator that do not have their
counterpart in the “primed” determinant in the numerator. This leaves out
precisely a factor $(\mu^2)^{1/2}$ for each bosonic collective coordinate (and actually a
factor $\mu^{-1/2}$ for each Weyl fermion zero mode, see (23) in Sect. 2.3).
(4) The factor $1/\sqrt{2\pi}$ for each bosonic zero mode appears for a similar
reason. In fact, the integrations that give rise to the product of eigenvalues
finally leading to the various bosonic determinants are all Gaussian in the
(quadratic, i.e. semi-classical) approximation in which we are working. Since,
as we noticed above, there is a one-to-one correspondence between physical
modes and PV modes, all the factors $\sqrt{2\pi}$ will compensate between the
numerator and the denominator, except for the factors coming from the integration
over the PV modes that are in correspondence with the bosonic zero modes.
The reason is that no Gaussian integration is associated with the bosonic zero
modes, as the latter were replaced by integrations over the related collective
coordinates. In this way a factor $1/\sqrt{2\pi}$ for each bosonic zero mode will be
left in the denominator.
(5) In principle, one can go beyond the semi-classical formulae (B.13)
and (B.17), including perturbatively $O(Q_\mu^3) \sim O(g)$ and $O(Q_\mu^4) \sim O(g^2)$
corrections that were neglected before. As is well known, perturbation theory
in an external field is perfectly well defined and fully renormalisable.

B.1 Bosonic Zero Modes

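We close this appendix by reporting in the case $N_c = 2$ and K = 1 the explicit
expression of the $4|K|N_c = 8$ “transverse” bosonic zero modes and of their
norms. One finds ($y = x - x_0$)

$$
\begin{aligned}
a^{(\nu)}_\mu &= F^I_{\mu\nu}(y)\,, & \|a^{(\nu)}_\mu\| &= \frac{2\sqrt{2}\,\pi}{g}\,,\\
a^{(dil.)}_\mu &= A^I_\mu(y)\,\frac{2y^2}{\rho\,(y^2+\rho^2)}\,, & \|a^{(dil.)}_\mu\| &= \frac{4\pi}{g}\,,\\
a^{(a)}_\mu &= D_\mu(A^I)\Big[\frac{T^a}{g}\,\frac{y^2}{y^2+\rho^2}\Big]\,, & \|a^{(a)}_\mu\| &= \frac{2\pi\rho}{g}\,.
\end{aligned} \tag{B.18}
$$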

One can check that these vectors are mutually orthogonal.
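
As a rough numerical illustration of how the norms in (B.18) enter the collective-coordinate measure of (B.17), the following sketch multiplies the factors μ ||a^(i)|| / √(2π) over the eight zero modes for N_c = 2, K = 1. The grouping of the modes (four labelled by ν, one dilatation, three labelled by a) is read off from (B.18); the function names and the closed form used as a cross-check are ours and are only meant as a consistency check, not as part of the original derivation.

```python
import math


def collective_measure_factor(g, rho, mu):
    """Product over the zero modes of mu * ||a^(i)|| / sqrt(2*pi), using the (B.18) norms."""
    norms = (
        [2.0 * math.sqrt(2.0) * math.pi / g] * 4   # modes a^(nu), nu = 1..4
        + [4.0 * math.pi / g]                      # dilatation mode
        + [2.0 * math.pi * rho / g] * 3            # modes a^(a), a = 1..3
    )
    factor = 1.0
    for norm in norms:
        factor *= mu * norm / math.sqrt(2.0 * math.pi)
    return len(norms), factor


def closed_form(g, rho, mu):
    """Same product done by hand: 128 * pi^4 * rho^3 * mu^8 / g^8."""
    return 128.0 * math.pi**4 * rho**3 * mu**8 / g**8


if __name__ == "__main__":
    n_modes, value = collective_measure_factor(g=1.3, rho=0.7, mu=2.0)
    print(n_modes, value, closed_form(1.3, 0.7, 2.0))
```

The count of modes returned by the function reproduces n_B = 4|K|N_c = 8, and the μ^8 g^{-8} ρ^3 dependence is made explicit by the closed form.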

Appendix C – Quantum Tunnelling

The emergence of the quantum tunnelling phenomenon in the presence of
instantons is most easily and rigorously explained in the Schrödinger
functional formalism [169], where the theory is formulated in terms of a
“propagation kernel” which expresses the probability amplitude to find the gauge
field configuration $A^{(2)}(x) = (A^{(2)}_i(x),\ i = 1, 2, 3)$ at the final time t = T/2,
if the gauge field configuration at the initial time t = −T/2 was
$A^{(1)}(x) = (A^{(1)}_i(x),\ i = 1, 2, 3)$.

C.1 The Schrödinger Functional in the Temporal Gauge

The Schrödinger kernel is most expressively written in the temporal
gauge [170]. As a result of making use of the Faddeev–Popov procedure, one
can show that in the formal continuum language it takes the form⁵¹

$$ K[A^{(2)},A^{(1)};T] = \int_{\hat G_0}\prod_{x} D\mu[h(x)] \int_{A^{(1)}(x)}^{[A^{(2)}(x)]^{U_0[h(x)]}} \prod_{a,x;\;-\frac{T}{2}<t<\frac{T}{2}} DA^a(x,t)\; \exp\Big[-S_{YM}[A, A_0=0]\Big]\,, \tag{C.1} $$

where $U_0[h] = \exp(iT^a h^a) \in \hat G_0$ with $\hat G_0$ the group of the time-independent,
topologically trivial gauge transformations (i.e. those that tend to the group
identity at spatial infinity) and $D\mu[h(x)]$ is the invariant Haar measure over
$SU(N_c)$ at each spatial point x. The integration over the spatial components
of the gauge field is extended to all configurations that satisfy the boundary
conditions $A(x, T/2) = [A^{(2)}(x)]^{U_0[h(x)]}$ and $A(x, -T/2) = A^{(1)}(x)$.
The gauge integration over Ĝ0 plays a crucial role in the formalism as it
has the effect of projecting out from the kernel all the states that do not
satisfy the Gauss’ law constraint. In fact, since the Gauss’ law operator is the
generator of the time-independent topologically trivial gauge transformations,
only the states annihilated by it will appear in the spectral decomposition of
$K[A^{(2)}, A^{(1)}; T]$ [170], for which we can then formally write

$$ K[A^{(2)}, A^{(1)}; T] = \sum_n e^{-E_n T}\, \Psi_n[A^{(2)}]\,(\Psi_n[A^{(1)}])^*\,, \tag{C.2} $$

where

$$ H\, \Psi_n[A] = E_n \Psi_n[A]\,, \tag{C.3} $$

$$ D_i(A)^{ab}\, \frac{\delta}{\delta A^b_i(x)}\, \Psi_n[A] = 0\,. \tag{C.4} $$
⁵¹ For the lattice regularised formulation of the Schrödinger kernel (more commonly
called Schrödinger functional in that context) see [171].

The last equation is indeed the statement that the eigenstates of the Hamil-
tonian appearing in the spectral decomposition (C.2) are left untouched by
time-independent gauge transformations that tend to the identity at inﬁnity.
In fact, from the invariance property

$$ U_0[h]\,\Psi_n[A] = \Psi_n[A^{U_0[h]}] = \Psi_n[A]\,, \tag{C.5} $$

$$ U_0[h] = \exp\Big[-\int d^3x\, \big(D_i^{ab} h^b(x)\big)\, \frac{\delta}{\delta A^a_i(x)}\Big]\,, \tag{C.6} $$

the Gauss’ law (C.4) follows by expanding (C.5) in powers of h(x), if the
latter function vanishes as |x| → ∞, i.e. precisely if U0 [h] ∈ Ĝ0 . An equivalent
way to prove this statement is to observe that the Schrödinger kernel enjoys
the invariance properties

$$ K[(A^{(2)})^{U_0}, A^{(1)}; T] = K[A^{(2)}, A^{(1)}; T] = K[A^{(2)}, (A^{(1)})^{U_0}; T]\,. \tag{C.7} $$

The ﬁrst equality follows from the invariance of the Haar measure, as the U0
gauge transformation can be reabsorbed in the integration measure over Ĝ0 .
The second equality is an immediate consequence of the previous equation
and the invariance property

$$ K[(A^{(2)})^{U}, (A^{(1)})^{U}; T] = K[A^{(2)}, A^{(1)}; T]\,, \qquad U \in \hat G_0 \tag{C.8} $$

which in turn follows from the observation that any time-independent gauge
transformation acting on the boundary fields can be reabsorbed by the change
of variables $A \to A' = A^U$ in (C.1). The invariance property (C.8) can be used
to show that the $\hat G_0$ gauge integration in (C.1) can be equally well performed
over the time-independent gauge transformations acting on the boundary field
$A^{(1)}$ at t = −T/2.

C.2 Emergence of the ϑ Angle

We finally notice that the states $\Psi_n$ also support a unitary representation,
$U_K$, of the abelian homotopy group $\Pi_3(SU(2)) \sim \Pi_3(S^3) = \mathbb{Z}$. Since the
Hamiltonian, H, and $U_K$ commute, they can be simultaneously diagonalised.
Thus on their common eigenvectors (for a while we will keep calling them $\Psi_n$)
we have

$$ U_K \Psi_n[A] = \Psi_n[A^{U_K}] = e^{-i\vartheta_K}\, \Psi_n[A]\,. \tag{C.9} $$

Consistency with the group property

$$ U_K U_{K'} = U_{K+K'} \tag{C.10} $$

implies

$$ \vartheta_K = K\vartheta\,, \tag{C.11} $$

naturally leading to the emergence of a ϑ-angle. States should (and will) then
be indicated by $\Psi^{(\vartheta)}_n[A]$ in the following.

C.3 Classical Vacua and Quantum Tunnelling

The classical vacua of the theory are immediately identified as the gauge
configurations for which the classical Hamiltonian

$$ H = \int d^3x \Big[\frac{1}{2}\, \dot A^a_i \dot A^a_i + \frac{1}{4}\, F^a_{ij} F^a_{ij}\Big] \tag{C.12} $$

vanishes, thus as time-independent ($\dot A^a_i = 0$) pure gauges ($F^a_{ij} = 0$). This
simple argument shows that there are infinitely many “vacua” labelled by an
integer, $K \in \mathbb{Z}$, which is telling us which homotopy class the K-th vacuum
belongs to.
In Euclidean time the one-instanton (K = 1) solution interpolates between
adjacent minima, i.e. between pure gauge configurations with winding number
differing by one unit.⁵² The formulae (12) and (13) can then be immediately
proved. Since $A_0 = 0$, one successively gets, in fact

$$
\begin{aligned}
K &= \frac{g^2}{32\pi^2}\int d^4x\, F^a_{\mu\nu}\tilde F^a_{\mu\nu}(x) = \frac{g^2}{16\pi^2}\int d^4x\, \partial_\mu K_\mu(x) = \frac{g^2}{16\pi^2}\int d^4x\, \partial_0 K_0(\mathbf{x}, t)\\
&= \frac{g^2}{16\pi^2}\Big[\int d^3x\, K_0(\mathbf{x}, +\infty) - \int d^3x\, K_0(\mathbf{x}, -\infty)\Big] = n_+ - n_-\,.
\end{aligned} \tag{C.13}
$$

The last equality follows remembering that at very large (positive and
negative) times $K_0 \propto \varepsilon_{ijk}\, \mathrm{Tr}[A_i A_j A_k]$ with A a pure gauge.
The classical vacuum degeneracy is removed by the quantum mechanical
tunnelling between adjacent minima occurring with an amplitude
$\Gamma^I \propto \exp(-S^I) = \exp(-8\pi^2/g^2)$. A band spectrum is generated with the lowest
energy eigenstates and eigenvalues given by

$$ \Psi^{(\vartheta)}_0[A] = \sum_{K\in\mathbb{Z}} e^{-iK\vartheta}\, \Psi^{(K)}_0[A]\,, \tag{C.14} $$

$$ E_0(\vartheta) = \alpha_0 + \beta_0 \cos\vartheta\,, \tag{C.15} $$

where, at the leading order in $\Gamma^I$, $\Psi^{(K)}_0[A]$ is the perturbative vacuum state
functional “centred” around the K-th minimum of the energy, i.e. around a
pure gauge field with winding number K and $\alpha_0$, $\beta_0$ are computable constants
proportional to the spatial volume of the system.
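
The structure of (C.14) and (C.15) is that of a Bloch band. A minimal toy model, under assumptions that are ours and not the text's, makes this concrete: the infinite chain of vacua is truncated to a ring of `n_vacua` sites, each vacuum has energy `alpha`, and adjacent vacua are connected by a hypothetical real tunnelling amplitude `t`. The superposition with phases exp(−iKϑ) is then an exact eigenvector with eigenvalue alpha + 2 t cos ϑ, so that β_0 plays the role of 2t.

```python
import cmath
import math


def apply_hamiltonian(psi, alpha, t):
    """Nearest-neighbour 'tunnelling' Hamiltonian on a ring of vacua acting on psi."""
    n = len(psi)
    return [alpha * psi[k] + t * (psi[(k - 1) % n] + psi[(k + 1) % n])
            for k in range(n)]


def theta_state(n_vacua, m):
    """Coefficients exp(-i K theta) as in (C.14), with theta = 2 pi m / n_vacua
    so that the state is single valued on the ring (an artefact of the truncation)."""
    theta = 2.0 * math.pi * m / n_vacua
    return theta, [cmath.exp(-1j * k * theta) for k in range(n_vacua)]


def band_energy(alpha, t, theta):
    """Toy analogue of (C.15): alpha + 2 t cos(theta)."""
    return alpha + 2.0 * t * math.cos(theta)


if __name__ == "__main__":
    theta, psi = theta_state(n_vacua=12, m=5)
    h_psi = apply_hamiltonian(psi, alpha=0.3, t=-0.05)
    energy = band_energy(0.3, -0.05, theta)
    print(max(abs(a - energy * b) for a, b in zip(h_psi, psi)))
```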
It is not too difficult to prove the result (C.15). We start by observing
that, once quantum tunnelling has been recognised to take place, the spectral
decomposition of the Schrödinger kernel can be written in the more informative
form ($\sum_n \to \int d\vartheta$)

$$ K[A^{(2)}, A^{(1)}; T] = \int_0^{2\pi} d\vartheta\; e^{-E(\vartheta)T}\, \Psi^{(\vartheta)}[A^{(2)}]\,(\Psi^{(\vartheta)}[A^{(1)}])^*\,. \tag{C.16} $$

⁵² Multi-instanton solutions with |K| > 1 connect vacua with winding numbers
differing by exactly |K| units. Since their action (see (B.3)) is larger, their weight
is exponentially small with respect to the one-instanton weight. In the approximation
we are working, their contribution is automatically taken care of by the exponentiation
of the one-instanton contribution implicit in the spectral formula (C.2). Incidentally this
is the way in which within the Schrödinger functional formalism the “dilute gas”
approximation [3, 12] is recovered.

For the purpose of our calculation, it is enough to take $A^{(2)}$ and $A^{(1)}$ as
pure gauges. At this point only their winding number matters and we can
simplify our notation by writing the Schrödinger kernel and the associated
state functionals in the form $K[K^{(2)}, K^{(1)}; T]$ and $\Psi^{(\vartheta)}[K]$, respectively. In
this notation (C.9) becomes

$$ U_K \Psi^{(\vartheta)}[0] = \Psi^{(\vartheta)}[K] = e^{-i\vartheta K}\, \Psi^{(\vartheta)}[0]\,, \tag{C.17} $$
where, we stress, “0” means a pure gauge configuration with K = 0.
To leading order in the instanton tunnelling amplitude, we only need to
evaluate the kernels K[K, K; T], K[K, K + 1; T] and K[K + 1, K; T], as all the
others should be considered exponentially small to this order

$$ K[K, K'; T] = 0\,, \qquad |K - K'| > 1\,. \tag{C.18} $$

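In order to proceed further we first note the relation

$$
\begin{aligned}
K[K, K+1; T] &= (K[K+1, K; T])^*\\
&= \int_0^{2\pi} d\vartheta\; \Psi^{(\vartheta)}[K]\,(\Psi^{(\vartheta)}[K+1])^*\, e^{-E(\vartheta)T}\\
&= \int_0^{2\pi} d\vartheta\; e^{i\vartheta}\, \Psi^{(\vartheta)}[0]\,(\Psi^{(\vartheta)}[0])^*\, e^{-E(\vartheta)T}\,,
\end{aligned} \tag{C.19}
$$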

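that follows from (C.17). Since we are interested in computing the
energy of the lowest lying state, we shall take T very large, keeping however
$T \exp(-8\pi^2/g^2) < 1$. Expanding the exponential of the energy up to terms
linear in T, one finds

$$
\begin{aligned}
K[K, K+1; T] &= \int_0^{2\pi} d\vartheta\; e^{i\vartheta}\, |\Psi^{(\vartheta)}_0[0]|^2\, \big(1 - E_0(\vartheta)T + O(T^2)\big)\\
&= |\Psi^{P.T.}_0[0]|^2 \int_0^{2\pi} \frac{d\vartheta}{2\pi}\; e^{i\vartheta}\, \big(1 - E_0(\vartheta)T + O(T^2)\big)\\
&= -T\, |\Psi^{P.T.}_0[0]|^2 \int_0^{2\pi} \frac{d\vartheta}{2\pi}\; e^{i\vartheta} E_0(\vartheta) + O(T^2)\,,
\end{aligned} \tag{C.20}
$$

$$ K[K+1, K; T] = -T\, |\Psi^{P.T.}_0[0]|^2 \int_0^{2\pi} \frac{d\vartheta}{2\pi}\; e^{-i\vartheta} E_0(\vartheta) + O(T^2)\,, \tag{C.21} $$

$$ K[K, K; T] = |\Psi^{P.T.}_0[0]|^2 - T\, |\Psi^{P.T.}_0[0]|^2 \int_0^{2\pi} \frac{d\vartheta}{2\pi}\; E_0(\vartheta) + O(T^2)\,, \tag{C.22} $$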

where the first equality in (C.20) follows from the fact that in the semi-classical
approximation one has

$$ |\Psi^{(\vartheta)}_0[0]|^2 = \frac{1}{2\pi}\, |\Psi^{P.T.}_0[0]|^2\,. \tag{C.23} $$

We conclude from (C.18), (C.20), (C.21) and (C.22) that the coefficient of
the terms linear in T has only the three non-vanishing Fourier components of
order ±1, 0. Thus $E_0(\vartheta)$ is precisely of the form (C.15).
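
The Fourier argument can be checked numerically. The sketch below, with helper names of our own choosing, evaluates the integrals (1/2π)∫dϑ e^{inϑ} E_0(ϑ) appearing in (C.20) to (C.22) by a uniform Riemann sum (which is exact up to rounding for trigonometric polynomials of low order) and confirms that an energy of the form (C.15) has only the components n = 0, ±1.

```python
import cmath
import math


def fourier_component(energy, n, samples=256):
    """(1/2pi) * integral over [0, 2pi) of exp(i n theta) * energy(theta)."""
    total = 0j
    for j in range(samples):
        theta = 2.0 * math.pi * j / samples
        total += cmath.exp(1j * n * theta) * energy(theta)
    return total / samples


if __name__ == "__main__":
    alpha0, beta0 = 0.3, 0.1                      # illustrative values only

    def e0(theta):
        return alpha0 + beta0 * math.cos(theta)   # the form (C.15)

    for n in (-2, -1, 0, 1, 2):
        print(n, fourier_component(e0, n))
```

With these inputs the n = 0 component equals α_0, the n = ±1 components equal β_0/2, and all higher components vanish within rounding, mirroring the vanishing of the kernels in (C.18).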

C.4 Adding a ϑ-term

It is instructive to see what happens if a ϑ-term is added to the gauge action.
In this case the contribution

$$ i\vartheta\, \frac{g^2}{32\pi^2} \int d^4x\, F^a_{\mu\nu} \tilde F^a_{\mu\nu}(x) \tag{C.24} $$

should be included in (A.2).⁵³ It is easy to prove that such an action describes
a world with a well-defined ϑ-angle (obviously equal to the value appearing
in (C.24)). From the formula (see (C.1))

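$$ K^{(\vartheta)}[A^{(2)}, A^{(1)}; T] = \int_{\hat G_0} \prod_x D\mu[h(x)]\; \tilde K^{(\vartheta)}[(A^{(2)})^{U_0[h]}, A^{(1)}; T]\,, \tag{C.25} $$

$$ \tilde K^{(\vartheta)}[A^{(2)}, A^{(1)}; T] = \int_{A^{(1)}(x)}^{A^{(2)}(x)} \prod_{a,x;\;-\frac{T}{2}<t<\frac{T}{2}} DA^a(x,t)\; \exp\Big[-S_{YM}[A, A_0=0] - i\vartheta\, \frac{g^2}{32\pi^2}\int d^4x\, F^a_{\mu\nu}\tilde F^a_{\mu\nu}(x)\Big]\,, \tag{C.26} $$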
one checks, in fact, that under a homotopically non-trivial (time-independent)
gauge transformation with winding number K, acting, say, on the boundary
gauge ﬁeld at T /2, the Schrödinger kernel is not invariant (recall the situation
in the absence of a ϑ-term, (C.8)), rather one has

$$ K^{(\vartheta)}[(A^{(2)})^{U_K}, A^{(1)}; T] = e^{-iK\vartheta}\, K^{(\vartheta)}[A^{(2)}, A^{(1)}; T]\,. \tag{C.27} $$

This result (which incidentally implies that physics is invariant if we replace ϑ
with ϑ + 2π) follows from the fact that the exponent in (C.26) is not invariant
under such gauge transformation. Obviously, the YM action is invariant, but
the second term is not. The reason can be traced back to the fact that the
vector $K_\mu$ in (8) is not gauge invariant. Under the time-independent gauge
transformation $U_K$ in (C.27) one finds, in fact (recall that we are in the
temporal gauge)

$$
\begin{aligned}
\frac{g^2}{32\pi^2}\int d^4x\, \big[F^a_{\mu\nu}\tilde F^a_{\mu\nu}\big]^{U_K}
&= \frac{g^2}{16\pi^2}\int d^3x\, \Big[K_0[(A^{(2)})^{U_K}] - K_0[A^{(1)}]\Big]\\
&= \frac{g^2}{16\pi^2}\int d^3x\, \Big[K_0[A^{(2)}] - K_0[A^{(1)}]\Big]
 + \frac{\varepsilon_{ijk}}{24\pi^2}\int_{S^3} d^3x\, \mathrm{Tr}\Big[U_K^\dagger \partial_i U_K\, U_K^\dagger \partial_j U_K\, U_K^\dagger \partial_k U_K\Big]\\
&= \frac{g^2}{32\pi^2}\int d^4x\, \big[F^a_{\mu\nu}\tilde F^a_{\mu\nu}\big] + K\,.
\end{aligned} \tag{C.28}
$$

From the spectral decomposition of $K^{(\vartheta)}$, one concludes that (C.17) holds for
each state appearing in it, thus proving the announced statement.

⁵³ Notice the presence of the imaginary unit in front of this term even in Euclidean
metric.

Appendix D – Decoupling

The physical content of the Appelquist–Carazzone theorem [42] is that in an
asymptotically free theory a heavy particle (i.e. a particle with $m_f \gg \Lambda$)
should “decouple”, that is to say, it should not influence physics at energies
$E \ll m_f$.
The most important (for us) consequence of this statement is that one can
relate the Λ parameter of an SU (Nc ) gauge theory with Nf flavours to that
of the theory with Nf − 1 dynamically active flavours, which is obtained after
the mass of one of the flavours (say the Nf -th one) has been sent to infinity.
The running of the coupling constant of the two theories is guided at one
loop by the evolution equations (the dependence of $b_1$ on $N_c$ is understood)

$$ \frac{g^2_{N_f}(\mu)}{8\pi^2} = \frac{1}{b_{1,N_f}\, \log \mu/\Lambda^{(N_f)}}\,, \tag{D.1} $$

$$ \frac{g^2_{N_f-1}(\mu)}{8\pi^2} = \frac{1}{b_{1,N_f-1}\, \log \mu/\Lambda^{(N_f-1)}}\,. \tag{D.2} $$

A necessary implication of decoupling is that for $m_f \gg \Lambda^{(N_f-1)}, \Lambda^{(N_f)}$ the
running of $g^2$ in the theory with $N_f$ flavours must change from the behaviour
in (D.1), when μ is sufficiently larger than $m_f$, to that in (D.2), when μ
is well below it. The equality of the coupling constants at $\mu \sim m_f$ (required
by smoothness) leads to the sought relation

$$ \Big(\frac{m_f}{\Lambda^{(N_f)}}\Big)^{b_{1,N_f}} = \Big(\frac{m_f}{\Lambda^{(N_f-1)}}\Big)^{b_{1,N_f-1}}\,. \tag{D.3} $$

Notice that, since we are assuming that $m_f$ is larger than both $\Lambda^{(N_f-1)}$ and
$\Lambda^{(N_f)}$, from (D.3) it follows that $\Lambda^{(N_f-1)} > \Lambda^{(N_f)}$. This relation is
phenomenologically quite important. It is telling us that, when the energy scale, E, of
a process goes through the production threshold of a particle of mass $m_f$,
since the running of the coupling constant switches from that of (D.1) to that
of (D.2), the value taken by the effective coupling constant, $g^2_{\rm eff}(E)$, that
controls the process is always the larger of $g^2_{N_f-1}(E)$ and $g^2_{N_f}(E)$
for all values of E.
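
The matching condition can be sketched numerically as follows. The function names are ours, and the values of the one-loop coefficients in the usage example are purely illustrative placeholders (the text does not specify them here); the only assumption used is that b_1 decreases when a flavour is added.

```python
import math


def coupling(mu, b1, lam):
    """One-loop g^2/(8 pi^2) = 1 / (b1 * log(mu / lam)), as in (D.1)-(D.2)."""
    return 1.0 / (b1 * math.log(mu / lam))


def matched_lambda(m_f, lam_nf, b1_nf, b1_nf_minus_1):
    """Solve (D.3) for Lambda^(Nf-1) given Lambda^(Nf) and the heavy mass m_f."""
    return m_f * (lam_nf / m_f) ** (b1_nf / b1_nf_minus_1)


if __name__ == "__main__":
    # Hypothetical inputs: b1 values chosen for illustration only.
    m_f, lam_nf, b1_nf, b1_low = 100.0, 1.0, 7.0, 8.0
    lam_low = matched_lambda(m_f, lam_nf, b1_nf, b1_low)
    print(lam_low)
    print(coupling(m_f, b1_nf, lam_nf), coupling(m_f, b1_low, lam_low))
```

By construction the two couplings agree at μ = m_f, and with b_{1,N_f} < b_{1,N_f−1} the matched scale satisfies Λ^{(N_f−1)} > Λ^{(N_f)}, in line with the remark following (D.3).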

Appendix E – Flat Directions of Massless SQCD

In this appendix we want to elucidate the structure of the vacuum manifold of
massless SQCD. The theory possesses the (non-anomalous) symmetry group
(see Table A.1)

$$ G = SU_L(N_f) \times SU_R(N_f) \times U_V(1) \times U_{\hat A}(1)\,. \tag{E.1} $$

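Any field configuration of the type

$$ A_\mu = \lambda = \psi = \bar\psi = \tilde\psi = \bar{\tilde\psi} = 0\,, \tag{E.2} $$

$$ D^a = \phi^{f\dagger}_{r}\,(T^a)_{rr'}\,\phi^{f}_{r'} - \tilde\phi^{f}_{r}\,(T^a)_{rr'}\,\tilde\phi^{f\dagger}_{r'} = 0 \tag{E.3} $$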

has vanishing energy, thus it is to be interpreted as a classical vacuum state.

Non-renormalisation theorems ensure that this conﬁguration is stable against
perturbative corrections (but, as we have seen, not against non-perturbative
instantonic corrections).
For the applications it is important to determine the solutions of (E.3).
In order to simplify the discussion, it is convenient to separately examine the
case Nf < Nc and Nf ≥ Nc .
• For $N_f < N_c$ it can be easily seen that (up to symmetry operations) the
most general solution of (E.3) is given by

$$ \phi^f_r = \tilde\phi^{f\dagger}_r = \begin{cases} v_r\,\delta_{rf} & 1 \le r \le N_f\\ 0 & \text{otherwise}\,. \end{cases} \tag{E.4} $$
If the v’s are all non-vanishing the gauge symmetry is broken from the original
$SU(N_c)$ group down to $SU(N_c - N_f)$. In the special case $N_f = N_c - 1$ the
gauge symmetry is completely broken. Among the quark superfields,
$(2N_c - N_f)N_f$ of them become heavy owing to the super-Higgs mechanism, while
the remaining $N_f^2$ will contain the Goldstone bosons of the various broken
global symmetries, as well as their superpartners. The pattern of surviving
symmetries will depend upon the detailed values assumed by the $v_r$’s in (E.4).
• For Nf ≥ Nc the analysis of (E.3) is a bit more involved. The result is
that (up to symmetry operations) the most general pattern of scalar v.e.v.’s
that makes Da vanish is
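
$$ \phi^f_r = \begin{cases} v_r\,\delta_{rf} & 1 \le f \le N_c\\ 0 & \text{otherwise} \end{cases} \tag{E.5} $$

$$ \tilde\phi^{f\dagger}_r = \begin{cases} (|v_r|^2 - b^2)^{\frac12}\,\delta_{rf} & 1 \le f \le N_c\\ 0 & \text{otherwise} \end{cases} \tag{E.6} $$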

where b is an arbitrary constant. For non-zero $v_r$ the gauge symmetry is
completely broken and $N_c^2 - 1$ quarks become massive by the super-Higgs
mechanism. Again the detailed pattern of surviving symmetries depends on
the particular values taken by the scalar v.e.v.’s (E.5) and (E.6).
We wish to conclude with a comment. As we have seen, the vacuum man-
ifold is not compact. This is due to the fact that the symmetry of the set of
supersymmetric vacua is a certain complexification of the symmetry group of
the Lagrangian [172]. In fact, any rescaling of the massless scalar fields, al-
though not a symmetry of the theory, when applied to a vacuum configuration
leads to another acceptable, physically inequivalent, vacuum.
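
That the v.e.v. patterns quoted above make $D^a$ vanish can be checked numerically. The sketch below is only illustrative and rests on assumptions of ours: the colour-index conventions of (E.3) are taken to mean $D^a = \sum_f \phi_f^\dagger T^a \phi_f - \sum_f \tilde\phi_f T^a \tilde\phi_f^\dagger$, the generators are an unnormalised generalised Gell-Mann basis, and all function names are hypothetical. For the pattern (E.5), (E.6) the cancellation reduces to $b^2\,\mathrm{Tr}\,T^a = 0$.

```python
import math


def su_n_generators(n):
    """Traceless Hermitian n x n matrices (generalised Gell-Mann basis, unnormalised)."""
    gens = []
    for r in range(n):
        for s in range(r + 1, n):
            sym = [[0j] * n for _ in range(n)]
            sym[r][s] = sym[s][r] = 1 + 0j
            asym = [[0j] * n for _ in range(n)]
            asym[r][s], asym[s][r] = -1j, 1j
            gens += [sym, asym]
    for d in range(1, n):
        diag = [[0j] * n for _ in range(n)]
        for r in range(d):
            diag[r][r] = 1 + 0j
        diag[d][d] = -d + 0j
        gens.append(diag)
    return gens


def bilinear(x, t):
    """sum_f x_f^dagger T x_f for an n_c x n_f matrix x whose columns are x_f."""
    n_c, n_f = len(x), len(x[0])
    return sum(x[r][f].conjugate() * t[r][s] * x[s][f]
               for f in range(n_f) for r in range(n_c) for s in range(n_c))


def d_terms(phi, phi_tilde_dag):
    """All D^a for scalar v.e.v.'s phi and phi_tilde^dagger (colour x flavour matrices)."""
    return [bilinear(phi, t) - bilinear(phi_tilde_dag, t)
            for t in su_n_generators(len(phi))]


def flat_direction(v, b, n_f):
    """V.e.v.'s of the form (E.5)-(E.6); requires n_f >= len(v) and |v_r| >= |b|."""
    n_c = len(v)
    phi = [[complex(v[r]) if r == f else 0j for f in range(n_f)] for r in range(n_c)]
    phi_tilde_dag = [[complex(math.sqrt(abs(v[r]) ** 2 - b ** 2)) if r == f else 0j
                      for f in range(n_f)] for r in range(n_c)]
    return phi, phi_tilde_dag


if __name__ == "__main__":
    phi, phi_tilde_dag = flat_direction([1.5, 2.0 + 0.5j, 1.2], b=1.0, n_f=4)
    print(max(abs(d) for d in d_terms(phi, phi_tilde_dag)))
```

Setting b = 0 and restricting to N_f < N_c columns reproduces the simpler pattern (E.4), for which the two bilinears cancel term by term.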

Appendix F – N = 2 Lagrangian and Supersymmetry Transformations

Rigid N = 2 supersymmetric theories consist of two kinds of massless
multiplets: vector multiplets and hypermultiplets.
Vector multiplets contain a vector $A_\mu$, two spinor gaugini $\lambda^r_\alpha$ and a complex
scalar φ all transforming in the adjoint representation of the gauge group.
Vector multiplets are described by chiral superfields usually denoted by A
whose θ expansion schematically reads

$$ A(x,\theta) = a(x) + \theta^{r}_{\alpha}\,\lambda^{\alpha}_{r}(x) + \frac{1}{2}\,\theta^{r}_{\alpha}\,\sigma^{\mu\nu\,\alpha}{}_{\beta}\,\theta^{\beta}_{r}\,F_{\mu\nu}(x) + \cdots\,. \tag{F.1} $$

Higher order terms in $\theta_r$ with r = 1, 2 can be expressed as derivatives of the
lower ones.
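The Lagrangian of pure N = 2 SYM theory is given by

$$ L = \int d^4\theta\; \mathcal{F}(A)\,, \tag{F.2} $$

where $\mathcal{F}(A)$ is a group invariant function of the chiral superfields.
Renormalisability restricts $\mathcal{F}(A)$ to be quadratic

$$ \mathcal{F}(A) = \frac{1}{2}\, \tau_0\, A^2\,, \tag{F.3} $$

so that

$$ \tau_0\, \delta_{ab} = \frac{\partial^2 \mathcal{F}(A)}{\partial A^a\, \partial A^b} \tag{F.4} $$

and

$$ L_{\mathcal{N}=2} = \mathrm{Im}\, \tau_0\; \mathrm{Tr}\Big[\frac{1}{4} F_{\mu\nu}F^{\mu\nu} + i\lambda^r \sigma^\mu D_\mu \bar\lambda_r + D_\mu\phi\, D^\mu\phi^\dagger
+ [\phi, \phi^\dagger]^2 + \lambda^r[\phi^\dagger, \lambda_r] + \bar\lambda^r[\phi, \bar\lambda_r]\Big]\,. \tag{F.5} $$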

The Lagrangian LN =2 is invariant under Poincaré transformations (up to total

derivatives), under U (2)R R-symmetry transformations and under the global
N = 2 supersymmetry transformations

δAμ = η r σμ λ̄r + η̄ r σ̄μ λr

1
δλr = ( Fμν σ μν + [φ, φ† ])η r + iσ μ Dμ φη̄ r
2
δφ = η r λr , (F.6)

where Aμ is the gauge potential associated with the ﬁeld strength Fμν . R-
symmetry indices are raised and lowered with the symplectic matrix εrs .

Appendix G – BPS Conﬁgurations

The acronym BPS, for Bogomol’nyi–Prasad–Sommerfield, was initially
introduced to designate certain solitonic solutions in non-supersymmetric quantum
field theories. The simplest configuration of this type is a symmetric monopole
arising in the Georgi–Glashow model [173] describing an SU(2) gauge field
coupled to a scalar field in the adjoint representation. The Lagrangian of the
model is

$$ L = -\frac{1}{4}\, F^a_{\mu\nu}F^{a\mu\nu} + \frac{1}{2}\, D_\mu\Phi^a D^\mu\Phi^a - \frac{\lambda}{4}\, (\Phi^a\Phi^a - v^2)^2\,, \qquad a = 1, 2, 3\,, \tag{G.1} $$

with gauge coupling constant e. As shown by ’t Hooft [95] and Polyakov [96],
in the Coulomb phase, i.e. in the presence of a v.e.v. for the adjoint scalar, the
theory admits monopole solutions characterised by an integer-valued topolog-
ical charge. Static finite energy configurations have vanishing scalar potential
at spatial infinity.
The condition for the vanishing of the potential defines a
two-sphere, $\sum_a \Phi^a\Phi^a = v^2$. Therefore, for such configurations, the scalar field
provides a map from the two-sphere at spatial infinity into the two-sphere
of the Higgs vacuum. Such maps are classified by the second homotopy group of $S^2$,
$\Pi_2(S^2) \equiv \mathbb{Z}$. As a result, the magnetic charge, g, associated with a solution
of the field equations satisfies a Dirac quantisation condition. Denoting by B
the non-abelian magnetic field with components $B^{ai} = -\frac{1}{2}\varepsilon^{ijk}F^a_{jk}$, one finds

$$ g = \int_{S^2_\infty} B\cdot d\Sigma = \frac{1}{2ev^3}\int d\Sigma^i\, \varepsilon_{ijk}\,\varepsilon^{abc}\, \Phi^a\, \partial^j\Phi^b\, \partial^k\Phi^c = \frac{4\pi n}{e}\,, \tag{G.2} $$

where the integer n is the winding number determined by the behaviour of
the Higgs field at spatial infinity.
No exact solution to the complete non-linear field equations is known
explicitly, even in the simplest case of gauge group SU(2). However, the analysis
can be drastically simplified taking advantage of the implications of a general
bound on the mass of field configurations with non-vanishing winding
number known as the Bogomol’nyi bound. For a static field configuration with
vanishing electric field, $E^{ai} = -F^{a0i} = 0$, the energy (mass) satisfies

$$
\begin{aligned}
M &= \frac{1}{2}\int d^3r\, \big[B^a\cdot B^a + D\Phi^a\cdot D\Phi^a + V(\Phi)\big]\\
&\ge \frac{1}{2}\int d^3r\, (B^a - D\Phi^a)^2 + vg\,,
\end{aligned} \tag{G.3}
$$

implying the bound

$$ M \ge vg\,. \tag{G.4} $$

Minimal energy configurations saturate the bound and thus should have
vanishing potential and should satisfy the first-order Bogomol’nyi equation

$$ B^a = D\Phi^a\,. \tag{G.5} $$
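
The step from the first to the second line of (G.3) is the usual completion of the square. For convenience, a sketch of it is given here, assuming a non-negative potential, the Bianchi identity $D_i B^a_i = 0$ and the asymptotic behaviour $\Phi^a\Phi^a \to v^2$, with g defined as in (G.2):

```latex
\begin{aligned}
\frac{1}{2}\big[B^a\cdot B^a + D\Phi^a\cdot D\Phi^a\big]
  &= \frac{1}{2}\,(B^a - D\Phi^a)^2 + B^a\cdot D\Phi^a \,, \\
\int d^3r\; B^a_i\, D_i\Phi^a
  &= \int d^3r\; \partial_i\big(B^a_i\,\Phi^a\big)
   = \int_{S^2_\infty} d\Sigma_i\; B^a_i\,\Phi^a
   = v\,g \,.
\end{aligned}
```

Dropping the non-negative terms $\frac{1}{2}(B^a - D\Phi^a)^2$ and $V(\Phi)$ then gives the bound, which is saturated precisely when both vanish.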
The first explicit example of solution to (G.5) with M = vg is the spherically
symmetric one constructed by Prasad and Sommerfield [98]. Following their
analysis, the expression BPS saturated has been used to designate solutions to
the field equations saturating the Bogomol’nyi bound. For dyons with electric
and magnetic charges e and g, respectively, the bound generalises to
$$ M \ge v\,(e^2 + g^2)^{1/2}\,. \tag{G.6} $$
A comprehensive review of the physics of solitons and monopoles in gauge
theory can be found in [174].

Appendix H – Extended Superalgebras, Central Charges and Multiplet Shortening

In the context of supersymmetric theories, certain short multiplets which cor-
respond to special representations of the supersymmetry algebra (see also
Appendix A) are referred to as BPS multiplets. States in such multiplets sat-
urate a generalisation of the Bogomol’nyi bound (G.4), which relates their
mass, M , to their “central charge”.
The N = 1 supersymmetry algebra in D = 4,

$$ \{Q_\alpha, \bar Q_{\dot\alpha}\} = i\sigma^\mu_{\alpha\dot\alpha} P_\mu\,, \qquad \{Q_\alpha, Q_\beta\} = 0\,, \tag{H.1} $$

admits no central extension. Generalised (non-scalar) central charges associ-
ated with the existence of domain wall conﬁgurations may appear, but they
carry Lorentz indices.
Extended supersymmetry algebras, on the other hand, admit non-trivial
bona fide central charges, usually denoted by Z. The N = 1 superalgebra (H.1)
can be generalised to

$$ \{Q^A_\alpha, \bar Q_{\dot\alpha B}\} = i\delta^A_B\, \sigma^\mu_{\alpha\dot\alpha} P_\mu\,, \qquad \{Q^A_\alpha, Q^B_\beta\} = Z^{AB}\, \varepsilon_{\alpha\beta}\,, \tag{H.2} $$

where A, B = 1, ..., N are supersymmetry indices and the central charges,
$Z^{AB}$, satisfy $Z^{AB} = -Z^{BA}$. In particular, for N = 2, there is only one
complex central charge, $Z \equiv Z^{12}$, while for N = 4 there are six complex
central charges satisfying a (self) duality condition $Z^{AB} = \frac{1}{2}\varepsilon^{ABCD}\bar Z_{CD}$, very
much as the elementary scalar fields in the theory. By means of an R-symmetry
transformation, the central charges can be skew diagonalised and shown to
satisfy

$$ M \ge |Z_1| \ge |Z_2| \ge \cdots \ge |Z_i| \ge \cdots \ge 0\,, \tag{H.3} $$

where $M^2 = P_\mu P^\mu$ is one of the Casimirs of the representation one is
considering, for a proper ordering of the skew eigenvalues $Z_i$, i = 1, ..., [N/2]. For N
odd, one eigenvalue is necessarily zero by Binet’s theorem. The relation (H.3)
represents a generalisation of the Bogomol’nyi bound.
Irreducible “massive” representations of the supersymmetry algebra are
indeed constructed by going to the rest frame. This reduces the form of the
algebra to that of a Clifford algebra and one can split the 4N supercharges into
2N creation operators and 2N annihilation operators by considering suitable
linear combinations [164]. This means that, in general, a massive multiplet
consists of $2^{2N}$ states, $2^{2N-1}$ bosonic and as many fermionic. However, if
some of the central charges coincide with M the multiplet shortens, since
some of the creation operators annihilate the ground state.
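
The counting of states in a generic massive multiplet can be illustrated by brute force. The following sketch, whose function name is ours, enumerates the occupation patterns of a set of fermionic creation operators acting on a single ground state (assumed bosonic here; for a fermionic ground state the two counts are simply interchanged) and splits them according to the parity of the number of operators applied.

```python
from itertools import product


def count_states(n_creation_ops):
    """Return (bosonic, fermionic) state counts built on one bosonic ground state."""
    bosonic = fermionic = 0
    for occupation in product((0, 1), repeat=n_creation_ops):
        if sum(occupation) % 2 == 0:
            bosonic += 1        # even number of fermionic operators applied
        else:
            fermionic += 1      # odd number of fermionic operators applied
    return bosonic, fermionic


if __name__ == "__main__":
    for n_susy in (1, 2, 4):
        # 2N creation operators for N-extended supersymmetry, as stated in the text
        print(n_susy, count_states(2 * n_susy))
```

With 2N creation operators this reproduces the $2^{2N}$ states, half bosonic and half fermionic; reducing the number of operators that act non-trivially models the shortening mechanism described above.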
In the case of N = 2, this happens when M = |Z|. The corresponding
multiplet is said to be 1/2 BPS since half the creation operators (four out of
eight) act trivially. As a result the supermultiplet contains only half as many
states, i.e. eight (four bosons and four fermions) instead of $16 = 2^4$.
In the case of the N = 4 superalgebra, one has two sub-cases $M = |Z_1| = |Z_2|$
(1/2 BPS) and $M = |Z_1| > |Z_2|$ (1/4 BPS). The corresponding multiplets
are, respectively, 1/2 and 3/4 the length of ordinary N = 4 multiplets.
As discussed in Sect. 13, in the conformal phase the N = 4 SYM theory has
a larger group of symmetries, the N = 4 superconformal group, PSU(2, 2|4),
see Appendix I.
In this situation the fundamental degrees of freedom of the theory are
gauge-invariant composite operators which are organised into multiplets form-
ing unitary irreducible representations (UIRs) of P SU (2, 2|4). Each composite
operator in a multiplet can be labelled by the quantum numbers associated
with the maximal bosonic sub-group of P SU (2, 2|4), SO(2, 4) × SO(6), i.e.
two spins, j1 and j2 , the scaling dimension, Δ, and three SO(6) Dynkin labels,
[k, l, m].
A further generalisation of the concept of BPS multiplet arises in this
context. The UIRs of P SU (2, 2|4) have been classified in [175]. For a review
and applications to the AdS/CFT correspondence, see [118, 92]. Ordinary
long representations contain a number of states proportional to $2^{16}$, with the
proportionality constant related to the dimension of the representation of the
bosonic sub-group under which the lowest component transforms. Shorter
representations arise when specific relations occur among the SO(2, 4) × SO(6)
quantum numbers of the lowest component of the multiplet. The correlation
functions considered in Sect. 15 involve operators belonging to multiplets
classified as 1/2 BPS. These are characterised by the fact that their lowest
component is a Lorentz scalar operator of dimension Δ = ℓ, with ℓ ≥ 2,
transforming in the [0, ℓ, 0] representation of the SO(6) R-symmetry group. Generic
multiplets of this type have $\frac{1}{12}\,\ell^2(\ell^2 - 1)\, 2^8$ components. The cases ℓ = 2, 3
are special in that they are characterised by a further accidental shortening
and are sometimes referred to as ultra-short. Many other shortening
conditions can be identified. For instance, 1/4 BPS multiplets arise when the lowest
component is a Lorentz scalar, double trace operator with Δ = 2k + l
transforming in the representation [k, l, k] of SO(6). In the case of 1/2 and 1/4
BPS multiplets the range of spin is, respectively, 4 and 6 units, whereas long
multiplets have a spin range of 8 units.
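
For orientation, the component count quoted for generic 1/2 BPS multiplets is easy to tabulate. In the sketch below (function names ours) the count is also compared with $2^8$ times the dimension of the SU(4) representation [0, ℓ − 2, 0], using the standard dimension formula (p + 1)(p + 2)²(p + 3)/12 for [0, p, 0]; this rewriting is our own cross-check and is not taken from the text.

```python
def half_bps_components(ell):
    """Number of components (1/12) * ell^2 * (ell^2 - 1) * 2^8 quoted for generic ell."""
    if ell < 2:
        raise ValueError("the text requires ell >= 2")
    return ell**2 * (ell**2 - 1) * 2**8 // 12


def dim_0p0(p):
    """Dimension of the SU(4) representation with Dynkin labels [0, p, 0]."""
    return (p + 1) * (p + 2) ** 2 * (p + 3) // 12


if __name__ == "__main__":
    for ell in range(4, 8):
        print(ell, half_bps_components(ell), 2**8 * dim_0p0(ell - 2))
```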

Appendix I – The N = 4 Superconformal Group

In this appendix we summarise some basic properties of the four-dimensional

N = 4 superconformal group, P SU (2, 2|4). More details and references can be
found in the reviews [91]. The maximal bosonic subgroup of P SU (2, 2|4) is the
direct product of the four-dimensional conformal group, SO(2, 4) ∼ SU (2, 2),
and of the R-symmetry group of the N = 4 superalgebra, SO(6) ∼ SU (4).
The conformal group is the group of transformations which preserve the form
of the metric up to a (position dependent) scale factor. In four-dimensional
Minkowski space, with metric ημν = diag(−, +, +, +), it is generated by trans-
lations, Lorentz transformations, dilations and special conformal transforma-
tions. We denote the corresponding generators respectively by Pμ , Lμν , D and
Kμ , μ, ν = 0, 1, 2, 3. For the generators of the SU (4) R-symmetry we use T a ,
a = 1, 2, . . . , 15.
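The action of infinitesimal conformal transformations on the coordinates,
$x^\mu \to x'^\mu(x) = x^\mu + \delta x^\mu(x)$, is the following:

$$
\begin{aligned}
\delta x^\mu(x) &= a^\mu && \text{(translations)}\\
\delta x^\mu(x) &= \Lambda^\mu{}_\nu\, x^\nu && \text{(Lorentz transformations)}\\
\delta x^\mu(x) &= \lambda\, x^\mu && \text{(dilations)}\\
\delta x^\mu(x) &= 2\, b_\nu x^\nu\, x^\mu - x_\nu x^\nu\, b^\mu && \text{(special conformal transformations)}\,,
\end{aligned} \tag{I.1}
$$

where $a^\mu$ and $b^\mu$ are constant vectors, $\lambda \in \mathbb{R}^+$ and the constant matrix $\Lambda^\mu{}_\nu$
satisfies $\eta^{\rho\sigma} \Lambda_\rho{}^\mu \Lambda_\sigma{}^\nu = \eta^{\mu\nu}$.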
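
That the special conformal variation in (I.1) preserves the metric up to a position dependent scale factor can be verified numerically: the symmetrised gradient $\partial_\mu \delta x_\nu + \partial_\nu \delta x_\mu$ should be proportional to $\eta_{\mu\nu}$, with proportionality factor $4\, b\cdot x$. The following sketch, with names of our own choosing, checks this by central finite differences (which are exact up to rounding for a variation quadratic in x).

```python
ETA = [-1.0, 1.0, 1.0, 1.0]   # diag(-,+,+,+), as in the text


def dot(a, b):
    """Minkowski product of two vectors given by their upper-index components."""
    return sum(e * p * q for e, p, q in zip(ETA, a, b))


def special_conformal_shift(x, b):
    """delta x^mu = 2 (b.x) x^mu - (x.x) b^mu, upper-index components."""
    bx, xx = dot(b, x), dot(x, x)
    return [2.0 * bx * x[m] - xx * b[m] for m in range(4)]


def symmetrised_gradient(x, b, h=1e-3):
    """Matrix d_mu(delta x)_nu + d_nu(delta x)_mu with lower indices."""
    grad = [[0.0] * 4 for _ in range(4)]          # grad[mu][nu] = d_mu (delta x)_nu
    for mu in range(4):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        dp, dm = special_conformal_shift(xp, b), special_conformal_shift(xm, b)
        for nu in range(4):
            grad[mu][nu] = ETA[nu] * (dp[nu] - dm[nu]) / (2.0 * h)   # lower index nu
    return [[grad[m][n] + grad[n][m] for n in range(4)] for m in range(4)]


if __name__ == "__main__":
    x, b = [0.3, 1.0, 0.2, -0.5], [0.1, -0.4, 0.7, 0.2]
    sym = symmetrised_gradient(x, b)
    scale = 4.0 * dot(b, x)
    print(max(abs(sym[m][n] - (scale * ETA[m] if m == n else 0.0))
              for m in range(4) for n in range(4)))
```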

The fermionic symmetries in PSU(2, 2|4) comprise 16 Poincaré
supersymmetries, generated by the supercharges $Q^A_\alpha$ and $\bar Q^{\dot\alpha}_A$, with A = 1, 2, 3, 4 and
$\alpha, \dot\alpha = 1, 2$, and 16 special (or conformal) supersymmetries, generated by the
supercharges $S^\alpha_A$ and $\bar S^A_{\dot\alpha}$.
As explained in Appendix H the superconformal algebra also admits six
complex scalar central charges as well as additional generalised central charges
which carry Lorentz indices.
Neglecting the central extensions the superconformal algebra reads
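$$
\begin{aligned}
&[L_{\mu\nu}, P_\rho] = -i(\eta_{\mu\rho}P_\nu - \eta_{\nu\rho}P_\mu)\,, \qquad [L_{\mu\nu}, K_\rho] = -i(\eta_{\mu\rho}K_\nu - \eta_{\nu\rho}K_\mu)\,,\\
&[L_{\mu\nu}, L_{\rho\sigma}] = -i\eta_{\mu\rho}L_{\nu\sigma} + \text{permutations}\,, \qquad [P_\mu, K_\nu] = 2iL_{\mu\nu} - 2i\eta_{\mu\nu}D\,,\\
&[D, L_{\mu\nu}] = 0\,, \qquad [D, P_\mu] = -iP_\mu\,, \qquad [D, K_\mu] = iK_\mu\,,\\
&\{Q^A_\alpha, Q^B_\beta\} = \{S^\alpha_A, S^\beta_B\} = \{Q^A_\alpha, \bar S^B_{\dot\alpha}\} = \{\bar Q^{\dot\alpha}_A, S^\beta_B\} = 0\,,\\
&\{Q^A_\alpha, \bar Q_{\dot\alpha B}\} = 2\sigma^\mu_{\alpha\dot\alpha}P_\mu\, \delta^A_B\,, \qquad \{S_{\alpha A}, \bar S^B_{\dot\alpha}\} = 2\sigma^\mu_{\alpha\dot\alpha}K_\mu\, \delta_A^B\,,\\
&\{Q^A_\alpha, S_{\beta B}\} = \varepsilon_{\alpha\beta}\,(\delta^A_B D + T^A{}_B) + \frac{1}{2}\,\delta^A_B\, L_{\mu\nu}\, \varepsilon_{\beta\gamma}\, \sigma^{\mu\nu}{}_\alpha{}^\gamma\,.
\end{aligned} \tag{I.2}
$$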
The R-symmetry group of automorphisms of the generic N -extended su-
persymmetry algebra is U (N ). The N = 4 case under consideration is special
in that the U (1) factor in the decomposition U (N ) = SU (N ) × U (1) of the
R-symmetry becomes an outer automorphism: none of the other generators
in the algebra is charged under this U (1) symmetry [175, 176] and all the
fields and composite operators in N = 4 SYM are neutral under this central
U (1). The absence of the abelian factor in the R-symmetry is reflected in the
notation P SU (2, 2|4) as opposed to SU (2, 2|4).
As has been discussed in Sect. 13, the observables in the N = 4 SYM
theory are correlation functions of local gauge-invariant composite opera-
tors. Such operators are labelled by the quantum numbers characterising their
transformation under the bosonic subgroup SO(2, 4) × SO(6). A class of op-
erators playing a special role in a conformal field theory such as N = 4 SYM
are the conformal primary operators. These are defined by the condition of
being annihilated by special conformal transformations acting at the origin,

$$ [K_\mu, O(x)]\big|_{x=0} = 0\,. \tag{I.3} $$
The existence of such operators is associated with the presence of a lower
bound on the dimension of fields and operators in a unitary conformal field
theory. Since the action of Kμ lowers the dimension of an operator, the
existence of the unitarity bound implies that in every representation of the
conformal group there must be an operator satisfying (I.3). The action of the
generators of the conformal group on primary operators is as follows:
$$
\begin{aligned}
&[P_\mu, O(x)] = i\partial_\mu O(x)\,, \\
&[L_{\mu\nu}, O(x)] = [\,i(x_\mu\partial_\nu - x_\nu\partial_\mu) + M_{\mu\nu}\,]\,O(x)\,, \\
&[D, O(x)] = -i(\Delta - x^\mu\partial_\mu)\,O(x)\,, \\
&[K_\mu, O(x)] = [\,i(x^2\partial_\mu - 2x_\mu x^\nu\partial_\nu + 2x_\mu\Delta) - 2x^\nu M_{\mu\nu}\,]\,O(x)\,.
\end{aligned}
\tag{I.4}
$$
The functional form of two- and three-point functions of primary operators is
fixed by conformal invariance. In the case of Lorentz scalars, for instance, the
two-point function vanishes unless the two operators have the same scaling
dimension, in which case it takes the form
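$$ \langle O_i(x_1)\, O_j(x_2) \rangle = \frac{c_{ij}}{|x_{12}|^{2\Delta}}\,, \tag{I.5} $$
where $x_{12} = x_1 - x_2$, $c_{ij}$ are constants and $\Delta$ is the common
dimension of $O_i$ and $O_j$. For three-point functions conformal invariance implies
$$ \langle O_i(x_1)\, O_j(x_2)\, O_k(x_3) \rangle =
\frac{c_{ijk}}{|x_{12}|^{\Delta_i+\Delta_j-\Delta_k}\, |x_{13}|^{\Delta_i+\Delta_k-\Delta_j}\, |x_{23}|^{\Delta_j+\Delta_k-\Delta_i}}\,, \tag{I.6} $$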

where $c_{ijk}$ are numerical coefficients. The form of four- and higher-point
functions is not completely determined by the conformal symmetry. Four-point
functions, for instance, are determined up to a function of two conformally
invariant cross-ratios constructed from the four insertion points, e.g.
$r = x_{12}^2 x_{34}^2 / (x_{13}^2 x_{24}^2)$ and $s = x_{14}^2 x_{23}^2 / (x_{13}^2 x_{24}^2)$.
The scaling dimensions and the coefficients, $c_{ij}$ and $c_{ijk}$, in (I.5) and
(I.6) are in general functions of the Yang–Mills coupling constant, g, and the ϑ-angle.
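The invariance of the two cross-ratios can be checked numerically. The sketch below is only an illustration under stated assumptions: it works in Euclidean signature, uses the standard finite special conformal map $x \to (x - b\,x^2)/(1 - 2\,b\cdot x + b^2 x^2)$, and the sample points and the parameter `b` are arbitrary choices, none of which appear in the text.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance x_ab^2 between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def cross_ratios(x1, x2, x3, x4):
    """Return (r, s) built from the four insertion points as in the text."""
    denom = sq_dist(x1, x3) * sq_dist(x2, x4)
    r = sq_dist(x1, x2) * sq_dist(x3, x4) / denom
    s = sq_dist(x1, x4) * sq_dist(x2, x3) / denom
    return r, s

def special_conformal(x, b):
    """Finite special conformal map x -> (x - b x^2) / (1 - 2 b.x + b^2 x^2)."""
    x2 = sum(xi * xi for xi in x)
    b2 = sum(bi * bi for bi in b)
    bx = sum(bi * xi for bi, xi in zip(b, x))
    denom = 1.0 - 2.0 * bx + b2 * x2
    return tuple((xi - bi * x2) / denom for xi, bi in zip(x, b))

# Arbitrary sample points in R^4 and an arbitrary small parameter b (assumptions).
points = [
    (0.3, -1.2, 0.5, 2.0),
    (1.1, 0.4, -0.7, 0.2),
    (-0.9, 0.8, 1.5, -0.3),
    (2.0, -0.5, 0.1, 1.0),
]
b = (0.05, -0.02, 0.03, 0.01)

before = cross_ratios(*points)
after = cross_ratios(*(special_conformal(x, b) for x in points))
print("before:", before)
print("after: ", after)
```

The two printed pairs should agree up to rounding, since each $x_{ij}^2$ is rescaled by a product of local factors that cancels in $r$ and $s$.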
Local composite operators in N = 4 SYM are organised in multiplets of
the superconformal group. The bottom component of any such multiplet, i.e.
the operator of lowest dimension, is referred to as a superconformal primary
operator. Superconformal primary operators are annihilated by the special
supersymmetry generators acting at the origin,
$$ \{S_{\alpha A}, O(x)]\big|_{x=0} = 0\,, \tag{I.7} $$
where the symbol $\{S, O]$ indicates a commutator if O is bosonic and an
anti-commutator if O is fermionic. Notice that superconformal primary operators
are always also conformal primaries, but the converse is not true.
As discussed in Appendix G there are special UIRs of the superconformal
group corresponding to short BPS multiplets. Operators in such multiplets
are protected and their two- and three-point functions do not receive quan-
tum corrections. This implies that their scaling dimensions and three-point
couplings are not renormalised.

Appendix J – Compendium of Differential Geometry

An n-dimensional topological manifold is a set of points such that the
neighbourhood of a point P (any open set containing the point P) looks like
$\mathbb{R}^n$. In order to describe a manifold, one needs an atlas made of many
patches that are related to one another by transition functions. If the
transition functions are continuous, then the manifold is continuous. If the
transition functions are differentiable, then the manifold is differentiable.
If the transition functions are complex analytic, then the manifold is complex.
One can add further structures. On a differentiable manifold, one can define
a metric, which is a symmetric bilinear form on the vector fields such that
$g(U, V) = g(V, U) = g_{ij} U^i V^j$ in a local coordinate patch. Parallel
transport is achieved by means of a connection $\Gamma^i_{jk}$ that can be fixed
to be the Christoffel connection by imposing that the metric be covariantly
constant, i.e. $0 = D_i g_{jk} \equiv \partial_i g_{jk} - \Gamma^l_{ik} g_{jl} - \Gamma^l_{ji} g_{lk}$.
One can then construct the Riemann curvature tensor $R_{ij}{}^k{}_l$ and its
contractions, the Ricci curvature tensor $R_{il}$ and the scalar curvature $R$.
After parallel transport a vector gets transformed by means of an SO(n)
rotation. Transformations along closed paths form the holonomy group of the
manifold, which is necessarily a subgroup of SO(n).
On differentiable manifolds, one can define p-forms with $p \le n$. A 0-form is
a function, a 1-form is a combination of the differentials of the local
coordinates, $A = A_i(x)\, dx^i$. In a local coordinate patch
$$ A_p = \frac{1}{p!} \sum_{i_1,\ldots,i_p} A_{i_1 \ldots i_p}\, dx^{i_1} \wedge \ldots \wedge dx^{i_p}\,, \tag{J.1} $$
where the (anti-)symmetric wedge product satisfies $A_p \wedge B_q = (-)^{pq} B_q \wedge A_p$.

On forms one can define an exterior differential $dA_p = B_{p+1}$ that satisfies
the (graded) Leibniz rule. On a Riemannian manifold, one can also define a Hodge
star operator $*A_p = \tilde A_{n-p}$. Combining d and ∗ one can define a
differential operator δ such that $\delta A_p \equiv *\,d\,{*A_p} = C_{p-1}$,
which generalises the divergence. The Lie derivative of a p-form along a vector
field V is defined by
$$ \mathcal{L}_V A_p = \iota_V\, dA_p + d(\iota_V A_p)\,, \tag{J.2} $$
where $\iota_V$ denotes contraction with V. In a local coordinate patch one has
$$ \mathcal{L}_V A_{i_1 \ldots i_p} = V^i \partial_i A_{i_1 \ldots i_p}
 + \sum_{k=1}^{p} A_{i_1 \ldots i \ldots i_p}\, \partial_{i_k} V^i\,, \tag{J.3} $$
where in the k-th term of the sum the index i occupies the k-th slot.

Both d and δ are nilpotent, i.e. $d^2 = 0$ and $\delta^2 = 0$. The generalised
Laplacian is given by $\Delta A_p = (d\delta + \delta d) A_p$. It coincides with
the standard Laplacian $\Delta = \|g\|^{-1/2}\, \partial_i (\|g\|^{1/2} g^{ij} \partial_j)$
on 0-forms (scalars). A form is closed if dA = 0 and exact if A = dC. A form is
co-closed if δA = 0 and co-exact if A = δC. A form which is closed and co-closed
is harmonic, i.e. ΔA = 0. The equivalence classes of closed forms $C^p$ that
differ by exact forms $E^p$ define the cohomology groups $H^p = C^p / E^p$.
De Rham has shown that one can always find a harmonic representative in each
cohomology class.54

54
Given a closed p-form $A$ such that $dA = 0$ but $\delta A \neq 0$, one can always
find a cohomologous form $A' = A + d\Lambda$ such that $dA' = 0$, by construction,
and $\delta A' = 0$ by requiring that $\delta d\Lambda = -\delta A$, i.e. inverting
the elliptic operator, $\Lambda = -(\delta d)^{-1} \delta A$.
A symplectic manifold is an even-dimensional manifold that admits a closed,
non-degenerate 2-form, known as the symplectic form; e.g. for the phase space
of a point particle in $\mathbb{R}^n$ one has $\Omega = \sum_i dp_i \wedge dx^i$.
On complex manifolds, one can decompose d as $d = \partial + \bar\partial$, where
both $\partial$ and $\bar\partial$ are nilpotent. By a complex coordinate change,
one can always put the metric in Hermitean form
$ds^2 = g_{i\bar j}\, dz^i\, d\bar z^{\bar j}$, at least locally.
The Kähler form on a complex Riemannian manifold is
$\omega = g_{i\bar j}\, dz^i \wedge d\bar z^{\bar j}$. If ω is closed, dω = 0,
which implies $\partial\omega = 0 = \bar\partial\omega$, the manifold is Kähler.
Locally $\omega = \partial\bar\partial K$, where $K(z, \bar z)$ is the Kähler
potential. If the manifold has real dimension 4n and admits three closed Kähler
forms, $d\omega^I = 0$, I = 1, 2, 3, whose components satisfy the algebra of
quaternions, the manifold is said to be hyper-Kähler. If the three Kähler forms
are not closed but rather $d\omega^I = c_n\, \epsilon^{IJK}\, \omega_J \wedge \omega_K$,
the manifold is said to be quaternionic.
An isometry of the metric is a coordinate transformation that leaves the
metric invariant and is thus generated by a vector field V that satisfies
$$ 0 = \mathcal{L}_V g_{ij} = V^k \partial_k g_{ij} + g_{ik} \partial_j V^k + g_{kj} \partial_i V^k
 \equiv \nabla_i V_j + \nabla_j V_i\,, \tag{J.4} $$
where indices are raised and lowered with the metric.
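As a minimal worked illustration of the isometry condition (an example that is not in the original text), one may take flat $\mathbb{R}^2$ with $g_{ij} = \delta_{ij}$, where covariant derivatives reduce to partial derivatives, and test the rotation generator $V = -y\,\partial_x + x\,\partial_y$:

```latex
\begin{aligned}
& V_x = -y\,, \qquad V_y = x\,, \qquad \nabla_i V_j = \partial_i V_j\,, \\
& \nabla_x V_x = 0\,, \qquad \nabla_y V_y = 0\,, \qquad
  \nabla_x V_y + \nabla_y V_x = 1 - 1 = 0\,.
\end{aligned}
```

All components of $\nabla_i V_j + \nabla_j V_i$ vanish, so rotations are isometries of the flat metric. By contrast, the dilation $V = x\,\partial_x + y\,\partial_y$ gives $2\nabla_x V_x = 2 \neq 0$ and is therefore not an isometry.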

A holomorphic isometry is such that $\mathcal{L}_V \omega = 0$. Thanks to the
closure of ω, V admits a prepotential because $d(\iota_V \omega) = 0$ implies
$\iota_V \omega = d\mu_V$ locally. The prepotential is known also as the
holomorphic Kähler map. A tri-holomorphic isometry is such that
$\mathcal{L}_V \omega^I = 0$. Thanks to the closure of $\omega^I$, V admits three
prepotentials because $d(\iota_V \omega^I) = 0$ implies $\iota_V \omega^I = d\mu^I_V$
locally. The prepotentials are known also as the tri-holomorphic Kähler maps.

References
1. A.A. Belavin, A.M. Polyakov, A.M. Schwartz, Yu.S. Tyupkin: Phys. Lett. B
59, 85 (1975) 306
2. G. ’t Hooft: Phys. Rev. Lett. 37, 8 (1976); Phys. Rev. D 14, 3432 (1976); ibid.
18, 2199 (1978); Phys. Rep. 142, 357 (1986) 306, 307, 309, 311, 313, 351, 387, 446
3. S.R. Coleman: The Uses of Instantons, Lecture delivered at 1977 International
School of Subnuclear Physics, Erice, Italy, 23 July–10 August, 1977 (Plenum
Press, New York, 1978) 306, 450
4. D. Amati, K. Konishi, Y. Meurice, G.C. Rossi, G. Veneziano: Phys. Rep. 162,
169 (1988) 306, 311, 312, 317, 321, 322, 323, 330, 331, 332, 334, 437, 441
5. A.I. Vainshtein, V.I. Zakharov, V.A. Novikov, M.A. Shifman: Sov. Phys. Usp.
25 195 (1982) [Usp. Fiz. Nauk 136 553 (1982)], revised and updated version
published in ITEP Lectures on Particle Physics and Field Theory (World
Scientific, Singapore, 1999), Vol. 1, pp. 201–299;
M.A. Shifman: Lectures given at the International School of Physics “Enrico
Fermi”, Varenna, Italy, 3–6 July 1995, hep-th/9704114;
M.A. Shifman, A.I. Vainshtein: hep-th/9902016 306, 311
6. N. Dorey, T.J. Hollowood, V.V. Khoze, M.P. Mattis: Phys. Rep. 371, 231
(2002), and references therein 306, 312, 391, 393, 405
7. M. Nakahara: Geometry, Topology and Physics, Graduate Student Series in
Physics, Gen. Ed. D.F. Brewer (Institute of Physics Publishing, Bristol and
Philadelphia, 1990) 307
8. J.L. Gervais, B. Sakita: Phys. Rev. B 11, 2943 (1975);
E. Tomboulis: Phys. Rev. B 12, 1678 (1975) 307
9. C. Bernard: Phys. Rev. D 19, 3013 (1979) 307, 314, 394, 395, 444
10. L.G. Yaffe: Nucl. Phys. B 151, 247 (1979). 307, 443, 444, 446
11. R. Jackiw, C. Rebbi: Phys. Rev. Lett. 37, 172 (1976); Phys. Rev. D 14, 517
(1976) 309
12. C. Callan, R. Dashen, D. Gross: Phys. Lett. B 63, 334 (1976); Phys. Lett. B66,
375 (1977) 309, 450
13. D.I. Olive, R.J. Crewther, S. Sciuto: Riv. Nuovo Cimento 2N8, 1 (1979) 309
14. F.A. Berezin: The Method of Second Quantization (Academic Press, New York,
1966);
L.D. Faddeev: Introduction to Functional Methods, in Methods in Field Theory,
Les Houches 1975, Eds. R. Balian, J. Zinn-Justin (North-Holland, Amsterdam,
1976);
P. Ramond: Field Theory: a Modern Primer (Benjamin-Cumming, Reading,
Mass., 1981) 310, 312
15. Y. Meurice: Phys. Lett. B 164, 141 (1985) 310
16. L. Maiani, G.C. Rossi and M. Testa: Phys. Lett. B 292, 397 (1992) 310
17. M.F. Atiyah, I.M. Singer: Bull. Amer. Math. Soc. 69, 422 (1963); Ann. Math.
87, 485; 546 (1968);
M.F. Atiyah, G.B. Segal: Ann. Math. 87, 531 (1968);
L. Alvarez-Gaumé: Comm. Math. Phys. 90, 161 (1983);
D. Friedan, P. Windey: Nucl. Phys. B 235, 395 (1984) 311
18. C.W. Bernard, N.H. Christ, A.H. Guth, E.J. Weinberg: Phys. Rev. D 16, 2967
(1977) 311
19. M.F. Atiyah, V. Drinfeld, N. Hitchin, Y. Manin: Phys. Lett. A 65, 185 (1978) 312, 361, 369,
20. E. Corrigan, D. Fairlie, S. Templeton, P. Goddard: Nucl. Phys. B 140, 31
(1978);
N.H. Christ, E.G. Weinberg, N.K. Stanton: Phys. Rev. D 18, 2013 (1978);
E. Corrigan, P. Goddard, S. Templeton: Nucl. Phys. B 151, 63 (1979) 312, 321, 361, 362, 40
21. I. Affleck, M. Dine, N. Seiberg: Phys. Rev. Lett. 51, 1026 (1983); Nucl. Phys.
B241, 493 (1984); Nucl. Phys. B 256, 557 (1985) 312, 317, 324, 325, 326, 331, 335, 338, 343,
22. V.A. Novikov, M.A. Shifman, A.I. Vainshtein, V.I. Zakharov: JETP Lett. 39
601 (1984) 312
23. S.F. Cordes: Nucl. Phys. B 273, 629 (1986) 312, 322, 325, 326, 336, 342, 444
24. A. D’Adda, P. Di Vecchia: Phys. Lett. B 73, 162 (1978) 314
25. G. Veneziano: Phys. Lett. B 124, 357 (1983) 317, 441
26. D. Amati, G.C. Rossi, G. Veneziano: Nucl. Phys. B 249, 1 (1985) 317, 330
27. S. Coleman, J. Wess, B. Zumino: Phys. Rev. 177, 2239 (1969);
C.G. Callan, S. Coleman, J. Wess, B. Zumino: Phys. Rev. 177, 2247 (1969);
S. Weinberg: Physica A 96, 327 (1979) 317, 335, 436
28. J. Gasser, H. Leutwyler: Phys. Rep. 87, 77 (1982); Ann. Phys. 158, 142 (1984);
Nucl. Phys. B 250, 465 (1985) 317, 335, 436
29. G. Veneziano, S. Yankielowicz: Phys. Lett. B 113, 321 (1982) 317, 335, 336, 436
30. T. Taylor, G. Veneziano, S. Yankielowicz: Nucl. Phys.B 218, 493 (1983) 317, 335, 436
31. I. Affleck, M. Dine, N. Seiberg: Phys. Lett. B 137, 187 (1983); Phys. Rev. Lett.
52, 1677 (1984); Phys. Lett. B 140, 59 (1984) 317, 324, 325, 326, 331, 332, 333, 334, 336, 338
32. V.A. Novikov, M.A. Shifman, A.I. Vainshtein, V.I. Zakharov: Nucl. Phys. B
223, 445 (1983); 229, 407 (1983) 320, 322, 331, 348
33. G.C. Rossi, G. Veneziano: Phys. Lett. B 138, 195 (1984). 320, 322, 323
34. J. Schwinger: Phys. Rev. 82, 664 (1951);
S. Adler: Phys. Rev. 177, 2426 (1969);
J.S. Bell, R. Jackiw: Nuovo Cimento A 60, 47 (1969) 321
35. K. Konishi: Phys. Lett. B 135, 439 (1984);
K. Konishi, K. Shizuya: Nuovo Cimento A 90, 111 (1985);
T.E. Clark, O. Piguet, K. Sibold: Nucl. Phys. B 143, 445 (1978); ibid. 159, 1
(1979); ibidem 169, 77 (1980);
S. Gates, Jr., M.T. Grisaru, M. Roček, W. Siegel: Superspace (Ben-
jamin/Cummings, New York, 1983) 321, 327, 337
36. K. Konishi, H. Panagopoulos: Phys. Lett. B 191, 290 (1987) 321, 326
37. D. Finnell, P. Pouliot: Nucl. Phys. B 453, 227 (1995) 322, 335, 342, 343, 345, 358
38. T.J. Hollowood, V.V. Khoze, W.J. Lee, M.P. Mattis: Nucl. Phys. B 570, 241
(2000) 322, 324, 325, 344
39. V.A. Novikov, M.A. Shifman, A.I. Vainshtein, V.I. Zakharov: Nucl. Phys. B
229, 381 (1983); Phys. Lett. B 166, 329 (1986) [Sov. J. Nucl. Phys. 43, 294
(1986); Yad. Fiz. 43, 459 (1986)];
T. Morris, D. Ross, C. Sachrajda: Phys. Lett. B 172, 40 (1986) 322, 323, 329
40. E. Witten: Nucl. Phys. B 202, 253 (1982) 323, 329
41. N. Seiberg: Phys. Rev. D 49, 6857 (1994); Nucl. Phys. B 431 484 (1995) 324, 335, 339, 340,
42. T. Appelquist, J. Carazzone: Phys. Rev. D 11, 2856 (1975) 324, 453
43. D. Amati, Y. Meurice, G.C. Rossi, G. Veneziano: Nucl. Phys. B 263, 591 (1986) 324, 326
44. G. ’t Hooft: in Proceedings of the 1979 Cargèse Summer School, Eds. G. ’t Hooft
et al. (Plenum Press, New York, 1980) 324, 339
45. V.A. Novikov, M.A. Shifman, A.I. Vainshtein, V.I. Zakharov: JETP Lett. 39,
601 (1984); Nucl. Phys. B 260, 157 (1985);
M.A. Shifman, A.I. Vainshtein, V.I. Zakharov: Usp. Fiz. Nauk 146, 683 (1985)
[Sov. Phys. Usp. 28, 709 (1985)] 325, 326, 343
46. J. Fuchs, M.G. Schmidt: Z. Phys. C 30, 161 (1986);
J. Fuchs: Nucl. Phys. B 272, 677 (1986); ibid. 282, 437 (1987) 325, 343
47. I. Affleck: Nucl. Phys. B 191, 429 (1981) 325, 343, 362
48. G. Curci, G. Veneziano: Nucl. Phys. B 292, 555 (1987) 326
49. I. Montvay: Int. J. Mod. Phys. A 17, 2377 (2002) 326
50. F. Buccella, J.P. Derendinger, S. Ferrara, C. Savoy: Phys. Lett. B 115, 375
(1982) 326
51. N. Seiberg, E. Witten: Nucl.Phys. B 426, 19 (1994); Erratum: ibid. 430, 485
(1994) 326, 344, 348, 352
52. J. Wess, B. Zumino: Phys. Lett. B 49, 52 (1974);
J. Iliopoulos, B. Zumino: Nucl. Phys. B 76, 310 (1974);
S. Ferrara, J. Iliopoulos, B. Zumino: Nucl. Phys. B 77, 413 (1974);
S. Weinberg: Phys. Lett. B 62, 111 (1976);
M.T. Grisaru, M. Roček, W. Siegel: Nucl. Phys. B 159, 429 (1979) 330
53. H. Georgi, S. Glashow: Phys. Rev. Lett. 32, 438 (1974) 332
54. Y. Meurice, G. Veneziano: Phys. Lett. B 141, 69 (1984) 332, 333, 334
55. E. Guadagnini, K. Konishi: Nuovo Cimento A 90, 400 (1985);
A. Bicci, K. Konishi: Europhys. Lett. 1, 275 (1986) 332, 345, 346
56. K. Konishi: Nucl. Phys. B 289, 253 (1987) 332, 333, 334
57. C. Rosenzweig, J. Schechter, G. Trahern: Phys. Rev. D 21, 3388 (1980);
P. Di Vecchia, G. Veneziano: Nucl. Phys. B 171, 253 (1980);
E. Witten: Ann. Phys. 128, 363 (1980);
K. Kawarabayashi, N. Ohta: Nucl. Phys. B 175 477 (1980);
K. Kawarabayashi, N. Ohta: Prog. Theor. Phys. 66 1789 (1981);
P. Nath, A. Arnowitt: Phys. Rev. D 23, 473 (1981) 335, 436
58. K. Symanzik: in New Developments in Gauge Theories, Eds. G. ’t Hooft et al.
(Plenum, New York, 1980), p. 313;
K. Symanzik: “Some topics in quantum field theory” in Mathematical Problems
in Theoretical Physics, Eds. R. Schrader et al., Lecture Notes in Physics, Vol.
153 (Springer, New York, 1982);
K. Symanzik: Nucl. Phys. B 226, 187; Nucl. Phys. B 226, 205 (1983) 335
59. G. Shore, G. Veneziano: Int. J. Mod. Phys. A 1, 499 (1986);
M. Peskin: Proc. of the 1996 Theoretical Advanced Study Institute on Fields,
String and Duality (Boulder, CO, 2–28 June 1996), hep-th/9702094 335, 336, 339, 342
60. K.G. Wilson: Phys. Rev. B 4, 3174 (1971);
J. Polchinski: Nucl. Phys. B 231, 269 (1984);
G. Gallavotti: Rev. Mod. Phys. 57, 471 (1985) 337, 352
61. S. Arnone, C. Fusi, K. Yoshida: JHEP 9902, 022 (1999);
S. Arnone, S. Chiantese, K. Yoshida: Int. J. Mod. Phys. A 16, 1811 (2001);
S. Arnone, D. Francia, K. Yoshida: Mod. Phys. Lett. A 17, 1191 (2002);
J. Ambjörn, R.A. Janik: Phys. Lett. B 569, 81 (2003);
S. Arnone, K. Yoshida: Int. J. Mod. Phys. B 18, 469 (2004);
S. Arnone, F. Guerrieri, K. Yoshida: JHEP 0405, 031 (2004) 337, 364
62. N. Arkani-Hamed, H. Murayama: Phys. Rev. D 57, 6638 (1998); JHEP 0006,
030 (2000) 337
63. R. Dijkgraaf, C. Vafa: hep-th/0208048;
R. Dijkgraaf, M.T. Grisaru, C.S. Lam, C. Vafa, D. Zanon: Phys. Lett. B 573,
138 (2003) 337
64. G. Hailu, H. Georgi: JHEP 0402, 038 (2004) 337
65. H. Kawai, T. Kuroki, T. Morita, K. Yoshida: Phys. Lett. B 611, 269 (2005) 337
66. K.A. Intriligator, R.G. Leigh, N. Seiberg: Phys. Rev. D 50, 1092 (1994) 344
67. A. Armoni, M.A. Shifman, G. Veneziano: Nucl. Phys. B 667, 170 (2003); Phys.
Rev. Lett. 91, 191601 (2003); Phys. Lett. B 579, 384 (2004);
A. Armoni, G. Shore, G. Veneziano: Nucl. Phys. B 740, 23 (2006) 345
68. K. Konishi, G. Veneziano: Phys. Lett. B 187, 106 (1987) 345, 346
69. S. Ferrara, B. Zumino: Nucl. Phys. B 79, 413 (1974);
M. Sohnius, K.S. Stelle, P.C. West: Phys. Lett. B 92, 123 (1980);
W. Lerche: “Lecture on N = 2 supersymmetric gauge theory”, given at the
NATO Advanced Study Institute: Les Houches Summer School on Theoretical
Physics, Session 64: Quantum Symmetries, Les Houches, France, 1 August–8
September 1995;
A. Bilal: hep-th/9601007 348
70. L. Brink, J.H. Schwarz, J. Scherk: Nucl. Phys. B 121, 77 (1977);
F. Gliozzi, J. Scherk, D. Olive: Nucl. Phys. B 122, 256 (1977) 348, 385, 386
71. P.S. Howe, K.S. Stelle, P.C. West: Phys. Lett. B 124, 55 (1983). 348
72. L. Andrianopoli, M. Bertolini, A. Ceresole, R. D’Auria, S. Ferrara, P. Fre,
T. Magri: J. Geom. Phys. 23, 111 (1997) 348
73. C. Montonen, D.I. Olive: Phys. Lett. B 72, 117 (1977) 348, 386
74. N. Seiberg, E. Witten: Nucl. Phys. B 431, 484 (1994) 348, 352
75. M. Matone: Phys. Lett. B 357, 342 (1995);
M. Matone: JHEP 0104, 041 (2001) 349, 358, 360, 361
76. F. Fucito, G. Travaglini: Phys. Rev. D 55, 1099 (1997) 349, 358, 361, 362, 364
77. E. Witten: Commun. Math. Phys. 117, 353 (1988) 349, 364, 365
78. N. Seiberg, E. Witten: JHEP 9909, 032 (1999) 349
79. N. Nekrasov, A.S. Schwarz: Commun. Math. Phys. 198, 689 (1998) 349, 358, 369, 384
80. G. W. Moore, N. Nekrasov, S. Shatashvili: Commun. Math. Phys. 209, 77
(2000) 349, 358, 369, 371, 384, 406
81. N.A. Nekrasov: Commun. Math. Phys. 241, 143 (2003) 349, 358, 365, 373, 384, 391
82. N.A. Nekrasov: Adv. Theor. Math. Phys. 7, 831 (2004) 349, 358, 365, 373, 384, 391
83. M.R. Douglas: hep-th/9512077 349, 358, 374
84. E. Witten: Nucl. Phys. B 460, 335 (1996);
E. Witten: JHEP 0204, 012 (2002) 349, 358
85. M. Billo, M. Frau, I. Pesando, F. Fucito, A. Lerda, A. Liccardo: JHEP 0302,
045 (2003) 349, 358
86. M. Billo, M. Frau, F. Fucito, A. Lerda: hep-th/0606013 349, 358
87. A. Giveon, D. Kutasov: Rev. Mod. Phys. 71, 983 (1999) 349
88. J.M. Maldacena: Adv. Theor. Math. Phys. 2, 231 (1998) [Int. J. Theor. Phys.
38, 1113 (1999)] 349, 386, 407
89. S.S. Gubser, I.R. Klebanov, A.M. Polyakov: Phys. Lett. B 428, 105 (1998) 349, 386, 407, 41
90. E. Witten: Adv. Theor. Math. Phys. 2, 253 (1998) 349, 386, 407, 410, 411
91. O. Aharony, S.S. Gubser, J.M. Maldacena, H. Ooguri, Y. Oz: Phys. Rep. 323,
183 (2000);
E. D’Hoker, D.Z. Freedman: Lectures given at Theoretical Advanced Study Insti-
tute in Elementary Particle Physics on Strings, Branes and Extra Dimensions
(Boulder, CO, 3–29 June 2001), hep-th/0201253 349, 407, 459
92. M. Bianchi: Nucl. Phys. Proc. Suppl. 102, 56 (2001);
M. Bianchi: Fortsch. Phys. 53, 665 (2005) 349, 388, 407, 459
93. M. Bianchi, F. Fucito, G.C. Rossi, M. Martellini: Nucl. Phys. B 440, 129 (1995);
M. Bianchi, F. Fucito, G.C. Rossi, M. Martellini: Nucl. Phys. B 473, 367 (1996)
352, 379
94. S.R. Coleman, E. Weinberg: Phys. Rev. D 7, 1888 (1973) 352
95. G. ’t Hooft: Nucl. Phys. B 79, 276 (1974). 353, 456
96. A.M. Polyakov: JETP Lett. 20, 194 (1974) [Pisma Zh. Eksp. Teor. Fiz. 20, 430
(1974)] 353, 456
97. B. Julia, A. Zee: Phys. Rev. D 11, 2227 (1975) 353
98. M.K. Prasad, C.M. Sommerfield: Phys. Rev. Lett. 35, 760 (1975) 353, 457
99. W. Nahm: Phys. Lett. B 90, 413 (1980) 353
100. E. Witten, D.I. Olive: Phys. Lett. B 78, 97 (1978) 354, 388
101. H. Osborn: Phys. Lett. B 83, 321 (1979) 354, 388
102. P. Goddard, J. Nuyts, D.I. Olive: Nucl. Phys. B 125, 1 (1977) 357, 386
103. G. ’t Hooft: Nucl. Phys. B 153, 141 (1979) 357
104. R. Dijkgraaf, M.T. Grisaru, H. Ooguri, C. Vafa, D. Zanon: JHEP 0404, 028
(2004) 358
105. T.J. Hollowood: JHEP 0203, 038 (2002) and Nucl. Phys. B 639, 66 (2002). 358
106. D. Anselmi, P. Fré: Nucl. Phys. B 404, 288 (1993) ; Phys. Lett. B 347, 247
(1995);
E. Witten: Math. Res. Lett. 1, 769 (1994) 362
107. A. Klemm, W. Lerche, S. Theisen: Int. J. Mod. Phys. A 11, 1929 (1996) 367
108. S.K. Donaldson, P.B. Kronheimer: The Geometry of Four-Manifolds, Oxford
Mathematical Monographs (Oxford University Press, Oxford, New York, 1997). 369
109. G. Veneziano: Nuovo Cimento A 57, 190 (1968) 374, 437
110. J. Dai, R.G. Leigh, J. Polchinski: Mod. Phys. Lett. A 4, 2073 (1989);
R.G. Leigh: Mod. Phys. Lett. A 4, 2767 (1989);
J. Polchinski: Phys. Rev. Lett. 75, 4724 (1995);
C. Angelantonj, A. Sagnotti: Phys. Rept. 371, 1 (2002) [Erratum: ibid. 376,
339 (2003)] 374
111. E. Witten: JHEP 9807, 006 (1998) 374
112. M.R. Douglas, G.W. Moore: hep-th/9603167 379
113. C. Angelantonj, A. Armoni: Phys. Lett. B 482, 329 (2000) 381
114. M. Bianchi, J.F. Morales: JHEP 0008, 035 (2000) 381
115. F. Fucito, J.F. Morales, R. Poghossian, A. Tanzini: JHEP 0601, 031 (2006);
R. Blumenhagen, M. Cvetic, T. Weigand: hep-th/0609191;
M. Haack, D. Krefl, D. Lust, A. Van Proeyen, M. Zagermann: hep-th/0609211;
L.E. Ibanez, A.M. Uranga: hep-th/0609213;
B. Florea, S. Kachru, J. McGreevy, N. Saulina: hep-th/0610003;
N. Akerblom, R. Blumenhagen, D. Lust, E. Plauschinn, M. Schmidt-
Sommerfeld: hep-th/0612132;
M. Bianchi, E. Kiritsis: hep-th/0702015 385
116. L.V. Avdeev, O.V. Tarasov, A.A. Vladimirov: Phys. Lett. B 96, 94 (1980);
M.T. Grisaru, M. Roček, W. Siegel: Phys. Rev. Lett. 45, 1063 (1980);
M.F. Sohnius, P.C. West: Phys. Lett. B 100, 245 (1981);
W.E. Caswell, D. Zanon: Nucl. Phys. B 182, 125 (1981);
S. Mandelstam: Nucl. Phys. B 213, 149 (1983);
L. Brink, O. Lindgren, B.E.W. Nilsson: Phys. Lett. B 123, 323(1983);
P.S. Howe, K.S. Stelle, P.K. Townsend: Nucl. Phys. B 236, 125 (1984) 386
117. A. Sen: Phys. Lett. B 329, 217 (1994) 388
118. M. Bianchi, F.A. Dolan, P.J. Heslop, H. Osborn: hep-th/0609179 388, 459
119. C. Vafa, E. Witten: Nucl. Phys. B 431, 3 (1994) 390
120. A. Kapustin, E. Witten: hep-th/0604151;
S. Gukov, E. Witten: hep-th/0612073;
A. Kapustin: hep-th/0612119 390
121. N. Dorey, T.J. Hollowood, V.V. Khoze, M.P. Mattis, S. Vandoren: Nucl. Phys.
B552, 88 (1999) 391, 395, 397, 405, 406, 407, 443
122. A.V. Belitsky, S. Vandoren, P. van Nieuwenhuizen: Class. Quant. Grav. 17,
3521 (2000) 393
123. M.B. Green, S. Kovacs: JHEP 0304, 058 (2003) 396, 404, 422
124. M. Bianchi, M.B. Green, S. Kovacs, G.C. Rossi: JHEP 9808, 013 (1998) 397, 412
125. N. Dorey, V.V. Khoze, M.P. Mattis, S. Vandoren: Phys. Lett. B 442, 145 (1998)
397
126. M. Bianchi, S. Kovacs, G.C. Rossi, Ya.S. Stanev: JHEP 9908, 020 (1999) 399, 400
127. L.S. Brown, R.D. Carlitz, D.B. Creamer, C. Lee: Phys. Rev. D 17, 1583 (1978)
403
128. E. D’Hoker, D.Z. Freedman, S.D. Mathur, A. Matusis, L. Rastelli:
hep-th/9908160 404
129. M. Bianchi, S. Kovacs: Phys. Lett. B 468, 102 (1999) 404
130. B. Eden, P.S. Howe, C. Schubert, E. Sokatchev, P.C. West: Phys. Lett. B 472,
323 (2000) 404
131. J. Erdmenger, M. Perez-Victoria: Phys. Rev. D 62, 045008 (2000);
B.U. Eden, P.S. Howe, E. Sokatchev, P.C. West: Phys. Lett. B 494, 141 (2000)
404
132. E. D’Hoker, J. Erdmenger, D.Z. Freedman, M. Perez-Victoria: Nucl. Phys. B
589, 3 (2000) 404
133. S.J. Rey, J.T. Yee: Eur. Phys. J. C 22, 379 (2001);
J.M. Maldacena: Phys. Rev. Lett. 80, 4859 (1998) 404
134. M. Bianchi, M.B. Green, S. Kovacs: JHEP 0204, 040 (2002); hep-th/0107028
404
135. S. Kovacs: Nucl. Phys. B 684, 3 (2004) 405
136. N.J. Hitchin, A. Karlhede, U. Lindstrom, M. Roček: Commun. Math. Phys.
108, 535 (1987) 405
137. W. Krauth, H. Nicolai, M. Staudacher: Phys. Lett. B 431, 31 (1998);
W. Krauth, M. Staudacher: Phys. Lett. B 435, 350 (1998) 406
138. T. Banks, M.B. Green: JHEP 9805, 002 (1998) 412
139. M.B. Green, M. Gutperle: Nucl. Phys. B 498, 195 (1997) 413
140. J.H. Schwarz: Nucl. Phys. B 226, 269 (1983) 414
141. H.J. Kim, L.J. Romans, P. van Nieuwenhuizen: Phys. Rev. D 32, 389 (1985) 416
142. R. Gopakumar, M.B. Green: JHEP 9912, 015 (1999) 419
143. D. Berenstein, J.M. Maldacena, H. Nastase: JHEP 0204, 013 (2002) 423, 427, 428
144. B. Sundborg: Nucl. Phys. B 573, 349 (2000);
S.E. Konstein, M.A. Vasiliev, V.N. Zaikin: JHEP 0012, 018 (2000);
E. Witten: Spacetime Reconstruction, Talk at JHS 60 Conference, California
Institute of Technology, 3–4 November 2001,
http://quark.caltech.edu/jhs60/witten/1.html;
E. Sezgin, P. Sundell: JHEP 0109, 036 (2001);
E. Sezgin, P. Sundell: JHEP 0109, 025 (2001);
A.M. Polyakov: Int. J. Mod. Phys. A 17S1, 119 (2002) 423
145. M. Bianchi, J.F. Morales, H. Samtleben: JHEP 0307, 062 (2003);
N. Beisert, M. Bianchi, J.F. Morales, H. Samtleben: JHEP 0402, 001 (2004);
N. Beisert, M. Bianchi, J.F. Morales, H. Samtleben: JHEP 0407, 058 (2004) 423
146. R. Penrose: in Differential Geometry and Relativity, eds. M. Cahen, M. Flato
(Reidel, Dordrecht, Netherlands, 1976);
R. Gueven: Phys. Lett. B 482, 255 (2000) 423
147. M. Blau, J. Figueroa-O’Farrill, C. Hull, G. Papadopoulos: Class. Quant. Grav.
19, L87 (2002); JHEP 0201, 047 (2002) 423, 424
148. R.R. Metsaev: Nucl. Phys. B 625, 70 (2002) 423, 425
149. R.R. Metsaev, A.A. Tseytlin: Phys. Rev. D 65, 126004 (2002) 423, 426
150. A. Pankiewicz: Fortsch. Phys. 51, 1139 (2003);
J.C. Plefka: Fortsch. Phys. 52, 264 (2004);
J.M. Maldacena: Lectures given at the Theoretical Advanced Study Institute in
Elementary Particle Physics (TASI 2003) on Recent Trends in String Theory
(Boulder, CO, 1–27 June 2003), hep-th/0309246;
M. Spradlin, A. Volovich: Lectures given at ICTP Spring School on Super-
string Theory and Related Topics (Trieste, Italy, 31 March–8 April 2003),
hep-th/0310033;
D. Sadri, M.M. Sheikh-Jabbari: Rev. Mod. Phys. 76, 853 (2004);
R. Russo, A. Tanzini: Class. Quant. Grav. 21, S1265 (2004) 423
151. M.B. Green, S. Kovacs, A. Sinha: JHEP 0505, 055 (2005) 423, 435
152. M.B. Green, S. Kovacs, A. Sinha: JHEP 0512, 038 (2005) 423, 435
153. M.B. Green, S. Kovacs, A. Sinha: Phys. Rev. D 73, 066004 (2006) 423, 435, 436
154. E.J. Saletan: J. Math. Phys. 7, 53 (1961) 424
155. C. Kristjansen, J. Plefka, G.W. Semenoff, M. Staudacher: Nucl. Phys. B 643,
3 (2002) 428
156. N.R. Constable, D.Z. Freedman, M. Headrick, S. Minwalla, L. Motl, A. Post-
nikov, W. Skiba: JHEP 0207, 017 (2002) 428
157. M.R. Gaberdiel, M.B. Green: Ann. Phys. 307, 147 (2003) 429, 430
158. M.B. Green, M. Gutperle: Nucl. Phys. B 476, 484 (1996) 430
159. E. Witten: Nucl. Phys. B 223, 422 (1983) 436
160. P. Di Vecchia, F. Nicodemi, R. Pettorino, G. Veneziano: Nucl. Phys. B 181,
318 (1981) 436
161. E. Witten: Nucl. Phys. B 156, 269 (1979);
G. Veneziano: Nucl. Phys. B 159, 213 (1979) 436
162. For a recent review see, L. Del Debbio, L. Giusti, C. Pica: Nucl. Phys. (Proc.
Suppl) B 140, 603 (2005) [hep-lat/0409100] and references therein 436
163. C. Vafa, A. Strominger: Phys. Lett. B379, 99 (1996) 437
164. J. Wess, B. Zumino: Nucl. Phys. B 70, 39 (1974);
P. Fayet, S. Ferrara: Phys. Rept. 32, 250 (1977);
J. Wess, J. Bagger: Supersymmetry and Supergravity, 2nd Edition (Princeton
University Press, Princeton, 1992) 439, 440, 441, 458
165. R. Peccei, H. Quinn: Phys. Rev. Lett. 38, 1440 (1977); Phys. Rev. D 16, 1791
(1977) 441
166. S. Ferrara, B. Zumino: Nucl. Phys. B 87, 207 (1975);
O. Piguet, K. Sibold: Nucl. Phys. B 196, 428 (1982); ibid. 447 (1982); Helv.
Phys. Acta 63, 71 (1990) 442
167. L. Faddeev, V. Popov: Phys. Lett. B 25, 29 (1967) 445
168. G. Travaglini: Ph.D. lectures, University of Rome “Tor Vergata”, unpublished
445
169. R.P. Feynman, A.R. Hibbs: Quantum Mechanics and Path Integrals (McGraw-
Hill, New York, 1977) 448
170. G.C. Rossi, M. Testa: Nucl. Phys. B 163, 109 (1980); ibid. B176, 477 (1980);
ibid. B237, 442 (1984);
K. Symanzik: Nucl. Phys. B 190, 1 (1981);
J.P. Leroy, J. Micheli, G.C. Rossi, K. Yoshida: Z. Phys. C 48, 653 (1990) 448
171. M. Lüscher: Comm. Math. Phys. 54, 283 (1977);
G. Marchesini, E. Onofri: Nuovo Cimento A 65, 298 (1981);
M. Lüscher, R. Narayanan, P. Weisz, U. Wolff: Nucl. Phys. B 384, 168 (1992);
M. Lüscher, R. Sommer, P. Weisz, U. Wolff: Nucl. Phys. B 389, 247 (1993);
ibid. 413, 481 (1994);
S. Sint: Nucl. Phys. B 421, 135 (1994); Nucl. Phys. B 451, 416 (1995) 448
172. B.A. Ovrut, J. Wess: Phys. Rev. D 25, 409 (1982);
W.E. Lerche: Nucl. Phys. B 238, 582 (1984);
T. Kugo, I. Ojima, T. Yanagida: Phys. Lett. B 135, 402 (1984) 455
173. H. Georgi, S. L. Glashow: Phys. Rev. D 6, 2977 (1972) 456
174. D. Tong: Lectures given at Theoretical Advanced Study Institute in Elementary
Particle Physics: Many Dimensions of String Theory (Boulder, CO, 5 June–1
July 2005), hep-th/0509216 457
175. V.K. Dobrev, V.B. Petkova: Phys. Lett. B 162, 127 (1985) 458, 460
176. K. Intriligator: Nucl. Phys. B 551, 575 (1999) 460
The Magnetic Monopoles Seventy-five
Years Later

K. Konishi

Dipartimento di Fisica “E. Fermi”, Università degli Studi di Pisa, Largo
Pontecorvo, 3, Ed. C, 56127 Pisa, Italy, and INFN, Sezione di Pisa, Pisa, Italy

Abstract. Non-Abelian monopoles are present in the fully quantum-mechanical
low-energy effective action of many solvable supersymmetric theories. They behave
perfectly as point-like particles carrying non-Abelian dual magnetic charges. They
play a crucial role in confinement and in dynamical symmetry breaking in these the-
ories. There is a natural identification of these excitations within the semiclassical
approach, which involves the flavor symmetry in an essential manner. We review
in an introductory fashion the recent development which has led to a better un-
derstanding of the nature and definition of non-Abelian monopoles, as well as of
their role in confinement and dynamical symmetry breaking in strongly interacting
theories.

Three quarters of a century have passed since the introduction of magnetic
monopoles in quantum field theory by Dirac [1]. Our understanding of the
soliton sector of spontaneously broken gauge theories [2] is still largely
unsatisfactory. In particular, progress in our understanding of non-Abelian
versions of monopoles [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] and vortices [18]
has been very slow, in spite of many articles written on these subjects, and
in spite of the important role these topological excitations are likely to play in
various areas of physics. For instance, they might hold the key to the mystery
of quark confinement in QCD. Their quantum-mechanical properties are
gradually emerging, however, thanks to an ever improving grasp of the
nonperturbative dynamics in the context of supersymmetric gauge theories. Some
of the ingredients of this development include the Seiberg–Witten solution of
N = 2 supersymmetric gauge theories and exact instanton summations, bet-
ter understanding of the properties of (super-) conformal field theories, exact
results on the chiral condensates and symmetry breaking pattern in a wide
class of N = 1 supersymmetric gauge theories, and so on. Also, many new
results on non-Abelian vortices and domain walls are now available, which are
closely related to the problems concerning the monopoles.
It is the author’s opinion that a serious discussion about confinement
and non-Abelian monopoles today cannot ignore these basic results from
supersymmetric gauge theories. This lecture presents a review of what the
author believes to be some of the most relevant aspects of this development,
which should serve as an introduction to this very exciting area of research.

K. Konishi: The Magnetic Monopoles Seventy-five Years Later, Lect. Notes Phys. 737,
471–521 (2008)
DOI 10.1007/978-3-540-74233-6_15 © Springer-Verlag Berlin Heidelberg 2008

1 Color Confinement

One of the profound unsolved problems in elementary particle physics
today is quark confinement. A popular idea, due to ’t Hooft and Mandelstam
[19], holds that the ground state of QCD (quantum chromodynamics) is a
dual superconductor: the quarks are confined by chromo-electric vortices,
analogous to the magnetic Abrikosov–Nielsen–Olesen vortex in the usual type
II superconductors in solids. The Lagrangian of QCD
$$ \mathcal{L} = -\frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu} + \bar\psi\, (i\gamma_\mu D^\mu + m)\, \psi\,, \tag{1} $$
however, describes the dynamics of quarks and gluons, and it is not obvious
from (1) how magnetic (dual) degrees of freedom appear and how they inter-
act. One way to detect such degrees of freedom is ’t Hooft’s Abelian gauge
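fixing. One chooses the gauge so that a given field $X$ (perhaps some composite
of $F^a_{\mu\nu}$) in the adjoint representation takes an Abelian (diagonal) form
\[ X = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}, \qquad \lambda_1 > \lambda_2 > \lambda_3. \tag{2} \]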

For a generic gauge-field configuration $A_\mu(x)$, however, it is not possible to
keep the above diagonal form everywhere in $\mathbb{R}^4$. Near a singularity $\lambda_1 = \lambda_2$,
diagonalization of the matrix
\[ X = X|_{\lambda_1=\lambda_2} + \begin{pmatrix} C(x) & 0 \\ 0 & 0 \end{pmatrix}, \tag{3} \]
where $C$ is a 2 × 2 matrix, for instance, of the form
\[ C = \tau^i\,(x - x_0)^i, \tag{4} \]
by a gauge transformation $U(x)$, introduces a magnetic monopole, $A_i \sim U(x)\,\partial_i U(x)^{-1}$.
Another possibility is to use the Cho–Faddeev–Niemi decomposition [20]
of the gauge fields (for SU(2)),
\[ A^a_\mu = C_\mu n^a + \tilde\sigma(x)\,(\partial_\mu \mathbf{n} \times \mathbf{n})^a + \rho\,\partial_\mu n^a; \qquad \tilde\sigma(x) = 1 + \sigma(x), \tag{5} \]
in terms of the unit vector field n and the Abelian gauge field $C_\mu$, which live
on the $S^2$ and $S^1$ factors, respectively, of SU(2), and a charged “scalar” field
\[ \phi = \rho(x) + i\sigma(x). \tag{6} \]

The Wu–Yang singular monopole solution [21], for instance, corresponds to
\[ n^a = \frac{x^a}{r}, \qquad C_\mu = \phi = 0. \tag{7} \]
It is possible that these singularities, regularized, e.g., by the zero of $1 - |\phi|^2$,
somehow manage to behave as dominant degrees of freedom in the ground
state of QCD.
Either way, a central question is whether the magnetic monopoles of
QCD are of Abelian or non-Abelian type. The ’t Hooft–Mandelstam scenario is
essentially Abelian. By assuming that the relevant infrared degrees of freedom
are those which signal the singularities of Abelian gauge fixing, one tacitly
makes a highly nontrivial dynamical assumption.
In this respect, the SU (2) gauge theory is an exception, though. It is quite
possible that in this particular case ’t Hooft’s (or related) Abelian gauge-
fixing procedure allows us to “detect” the correct magnetic degrees of free-
dom, even if the system does not dynamically Abelianize.1 The singularities of
the Abelian gauge-fixing would signal the presence of the magnetic degrees of
freedom, which correspond [22] just to the Wu–Yang monopoles, (7). As the
Cho–Faddeev–Niemi $n^a(x)$ field parametrizes $\pi_2(S^2) \sim \pi_2(SU(2)/U(1)) = \mathbb{Z}$,
the magnetic charges of the Wu–Yang monopoles are the same, and quantized
in the same way, as those of the ’t Hooft–Polyakov monopoles of the Georgi–Glashow
model. In more general SU(N) theories with N ≥ 3, however, one does not expect
such a lucky situation. If the system does not dynamically Abelianize to
a $U(1)^{N-1}$ effective system at some low-energy scale, it would not be appropriately
described by an effective Lagrangian describing the Abelian monopoles
which signal the singularities of the Abelian gauge fixing.2
Actually, there is no hint that such a dynamical scenario (dynamical
Abelianization) is realized in Nature. We must seriously consider the much
more subtle possibility that somehow non-Abelian, magnetic degrees of free-
dom play a role in the physics of confinement and chiral symmetry breaking.
Are there models in which the low-energy dynamics is known and in which
non-Abelian magnetic degrees of freedom play a central role?
It does not seem to be widely known that not only do such systems exist,
but that in a sense this (occurrence of light non-Abelian monopoles) is a
most typical dynamical phenomenon in a wide class of supersymmetric gauge
systems. The class of models in question is N = 2 supersymmetric theories
1
This could explain the mysterious success of the Abelian dominance idea in lattice
simulations of the pure SU (2) gauge theory, even if there are no other indications
for dynamical Abelianization. The author thanks T. Suzuki for useful discussions.
2
Vice versa, in a system where Abelianization does take place, as in a class of
supersymmetric models mentioned in Sect. 5.4 below, ’t Hooft’s Abelian gauge
fixing should be a perfectly valid tool for extracting and studying the relevant
infrared degrees of freedom.

with SU, SO or USp gauge groups with quark hypermultiplets in various
representations [23, 24, 25, 26, 27, 28, 29, 30, 31]. Moreover, the class of models
for which one can make a reliable analysis of the low-energy behavior has
grown enormously thanks to more recent work on certain N = 1 models
[32] with scalar multiplets in the adjoint representation. Again, the appearance
of massless, non-Abelian monopoles in their low-energy effective action is the
rule, rather than an exception, in these models.
Of course, in the context of superconformal theories there are famous ex-
amples of non-Abelian dualities such as the Montonen–Olive duality in N = 4
supersymmetric theories [33] or the Seiberg duality in the N = 1 supersym-
metric models [34] with nontrivial infrared fixed points.
These problems (conformal invariance and confinement) are closely re-
lated, as the confinement and dynamical symmetry breaking can often be
seen as the result of breaking of (nontrivial) conformal invariance near an
infrared fixed-point theory.
Evidently, supersymmetric theories are trying to tell us something impor-
tant about the non-Abelian monopoles and confinement. In what follows we
review briefly the old difficulties associated with the semiclassical concepts of
non-Abelian monopoles. It will be argued that the dual group properties of
non-Abelian monopoles occurring in a system with gauge symmetry breaking
$G \to H$ are best defined by setting the low-energy H system in the Higgs
phase, so that the dual system is in the confinement phase. The transformation
law of the monopoles follows from that of monopole–vortex mixed configurations
in the system with a large hierarchy of energy scales, $v_1 \gg v_2$,
\[ G \xrightarrow{\;v_1\;} H \xrightarrow{\;v_2\;} \emptyset, \tag{8} \]
under an unbroken, exact color–flavor diagonal symmetry $H_{C+F}$. This last
symmetry is broken by individual soliton vortices, so the latter develop continuous
moduli. The transformation law among the regular monopoles, which
appear at the end point of the vortex, follows from that of the vortices. This
defines, once rewritten in the dual, magnetic variables, the dual group H̃
under which the monopoles transform as a multiplet.

2 Difficulties with the Semiclassical “Non-Abelian Monopoles”
2.1 Abelian Monopoles
A system in which the gauge symmetry is spontaneously broken,
\[ G \xrightarrow{\;\langle\phi_1\rangle \neq 0\;} H, \tag{9} \]
where H is some non-Abelian subgroup of G, possesses a set of regular magnetic
monopole solutions in the semiclassical approximation. They are natural
generalizations of the Abelian ’t Hooft–Polyakov monopoles [2], found
originally in the G = SO(3) theory broken to H = U(1) by a Higgs mechanism.
In that theory, the field content is just the SU(2) gauge fields and a
scalar field in the adjoint representation of the gauge group; the energy of a
static field configuration has the expression

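\[ E = \int d^3x \left[ \frac{1}{4}\,(F^a_{ij})^2 + \frac{1}{2}\,(D_i\phi^a)^2 + \frac{\lambda}{8}\,(\phi^a\phi^a - F^2)^2 \right], \tag{10} \]
where
\[ F^a_{ij} = \partial_i A^a_j - \partial_j A^a_i - g\,\epsilon^{abc} A^b_i A^c_j, \]
while $D_i\phi^a$ is a covariant derivative,
\[ D_i\phi^a = \partial_i\phi^a - g\,\epsilon^{abc} A^b_i\,\phi^c. \]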

Now the static finite-energy solution of the equation of motion must behave
asymptotically as
\[ \phi^a \to n^a(x)\,F, \qquad n^a(x)^2 = 1, \tag{11} \]
where the vector field $n^a(x)$ clearly labels the winding of the map $S^2 \to S^2$,
the first sphere being the space sphere surrounding the monopole, the second
sphere representing the vacuum orientation in the group space. One possibility
is that $n^a$ has a fixed orientation, such as $n^a(x) = (0, 0, 1)$ everywhere: this
represents a vacuum. Another possibility is that $n^a$ makes a nontrivial winding in
the group space as $x_i$ goes around the sphere, e.g.,
\[ n^a(x) = (\sin\theta\cos m\phi,\ \sin\theta\sin m\phi,\ \cos\theta), \qquad m = \pm 1, \pm 2, \ldots. \]

This integer labels the homotopy classes
\[ \pi_2(SU(2)/U(1)) \sim \pi_2(S^2) \sim \mathbb{Z} \]
of the scalar field configurations. The gauge fields must reduce to the pure
gauge,
\[ A^a_i \to \frac{1}{g}\,\epsilon^{abc}\, n^b(x)\,\partial_i n^c(x), \]
in order for the energy to be finite.
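As an illustration of how the integer m labels the map $S^2 \to S^2$, the following minimal sketch estimates the degree of the configuration $n^a(x)$ quoted above from the standard formula $(1/4\pi)\int d\theta\, d\phi\; \mathbf{n}\cdot(\partial_\theta \mathbf{n} \times \partial_\phi \mathbf{n})$. The function names, the grid size and the finite-difference step are our own choices for illustration and are not taken from the text.

```python
import math

def hedgehog(theta, phi, m):
    """n^a(x) = (sin(theta) cos(m phi), sin(theta) sin(m phi), cos(theta))."""
    return (math.sin(theta) * math.cos(m * phi),
            math.sin(theta) * math.sin(m * phi),
            math.cos(theta))

def winding_number(m, steps=120, h=1e-5):
    """Estimate (1/4 pi) * integral of n . (d_theta n x d_phi n) over the sphere."""
    d_theta = math.pi / steps
    d_phi = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta  # midpoint rule in theta
        for j in range(steps):
            phi = (j + 0.5) * d_phi  # midpoint rule in phi
            n = hedgehog(theta, phi, m)
            # Central finite differences for the two tangent vectors.
            up, dn = hedgehog(theta + h, phi, m), hedgehog(theta - h, phi, m)
            n_theta = [(a - b) / (2.0 * h) for a, b in zip(up, dn)]
            up, dn = hedgehog(theta, phi + h, m), hedgehog(theta, phi - h, m)
            n_phi = [(a - b) / (2.0 * h) for a, b in zip(up, dn)]
            cross = (n_theta[1] * n_phi[2] - n_theta[2] * n_phi[1],
                     n_theta[2] * n_phi[0] - n_theta[0] * n_phi[2],
                     n_theta[0] * n_phi[1] - n_theta[1] * n_phi[0])
            total += sum(a * b for a, b in zip(n, cross)) * d_theta * d_phi
    return total / (4.0 * math.pi)
```

For this particular configuration the integrand reduces analytically to $m \sin\theta$, so the numerical estimate should approach the integer m as the grid is refined.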
The solution of the equation of motion in the nontrivial sectors can be
found by rewriting (10) as
\[ E = \int d^3x \left[ \frac{1}{4}\,(F^a_{ij} - \epsilon_{ijk} D_k\phi^a)^2 + \frac{1}{2}\,\epsilon_{ijk} F^a_{ij} D_k\phi^a + \frac{\lambda}{8}\,(\phi^a\phi^a - F^2)^2 \right]. \tag{12} \]
The crucial observation is that while the first and third terms are positive
semidefinite, the second term is a total derivative,
\[ \frac{1}{2}\,\epsilon_{ijk} F^a_{ij} D_k\phi^a = \partial_k B_k, \qquad B_k = \frac{1}{2}\,\epsilon_{ijk} F^a_{ij}\,\phi^a. \]
We used above a useful identity for the derivatives of gauge-invariant products,
\[ \partial_k \mathrm{Tr}(A\,B \ldots) = \mathrm{Tr}(D_k A\; B \ldots) + \mathrm{Tr}(A\; D_k B \ldots) + \ldots. \]

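Thus the second term of (12) represents F times the “magnetic” charge
\[ \int dv\, \nabla\cdot\mathbf{B} = \oint d\mathbf{S}\cdot\mathbf{B} = 4\pi g_m, \qquad \mathbf{B} \sim \frac{g_m}{r^3}\,\mathbf{r}. \]
If $\lambda = 0$ with $|\phi^a|^2 \to F^2$ asymptotically (the Bogomol’nyi–Prasad–Sommerfield (BPS) limit), the
mass is proportional to the magnetic charge, $4\pi g_m F = \langle\phi\rangle/g$, while the field
configuration satisfies the first-order BPS equation
\[ F^a_{ij} - \epsilon_{ijk} D_k\phi^a = 0, \]
with an explicit (BPS) solution [2]
\[ A^a_i = \epsilon_{aij}\, r_j\, \frac{1 - K(r)}{g\,r^2}, \qquad K(r) = \frac{gFr}{\sinh gFr}, \tag{13} \]
\[ \phi^a = r^a\, \frac{H(r)}{g\,r^2}, \qquad H(r) = gFr\,\coth gFr - 1. \tag{14} \]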
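The profile functions in (13) and (14) can be checked numerically. The sketch below assumes that, for this ansatz, the BPS equation reduces to the usual Prasad–Sommerfield radial form $r K' = -K H$, $r H' = H + 1 - K^2$; these radial equations are not written out in the text and are an assumption of the example, as are the function names and the default values g = F = 1.

```python
import math

def K(r, g=1.0, F=1.0):
    """Gauge profile of (13): K(r) = gFr / sinh(gFr)."""
    x = g * F * r
    return x / math.sinh(x)

def H(r, g=1.0, F=1.0):
    """Scalar profile of (14): H(r) = gFr coth(gFr) - 1."""
    x = g * F * r
    return x / math.tanh(x) - 1.0

def bps_residuals(r, g=1.0, F=1.0, h=1e-5):
    """Residuals of the assumed radial equations r K' = -K H and r H' = H + 1 - K^2."""
    dK = (K(r + h, g, F) - K(r - h, g, F)) / (2.0 * h)  # central difference
    dH = (H(r + h, g, F) - H(r - h, g, F)) / (2.0 * h)
    res_K = r * dK + K(r, g, F) * H(r, g, F)
    res_H = r * dH - (H(r, g, F) + 1.0 - K(r, g, F) ** 2)
    return res_K, res_H
```

Both residuals should vanish up to finite-difference error at any r > 0. At large r, K decays exponentially and H(r)/(g r) tends to F, so the scalar field approaches its asymptotic value as required by (11).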

2.2 Non-Abelian Unbroken Group

When the “unbroken” gauge group is non-Abelian, the asymptotic gauge field
can be written as
\[ F_{ij} = \epsilon_{ijk} B_k = \epsilon_{ijk}\, \frac{r_k}{r^3}\,(\beta\cdot H), \tag{15} \]
in an appropriate gauge, where H are the diagonal generators of H in the Cartan
subalgebra. A straightforward generalization of Dirac’s quantization
condition leads to
\[ 2\,\beta\cdot\alpha \in \mathbb{Z}, \tag{16} \]
where α are the root vectors of H.3 It is not difficult to write down explicit
classical solutions [5, 6] by generalizing (13) and (14).
The constant vectors β (with the number of components equal to the rank
of the group H) label possible monopoles. It is easy to see that the solution
of (16) is that β is any of the weight vectors of a group whose nonzero roots
are given by
\[ \alpha^* = \frac{\alpha}{\alpha\cdot\alpha}. \tag{17} \]
3
This is most easily seen by considering $\mathrm{Tr}\, e^{ig\oint A_i\, dx^i}$ along an infinitesimal closed
curve on the surface of a sphere surrounding the monopole. By enlarging the loop
and reclosing it at the other side of the sphere, one ends up with
\[ e^{ig\int d\mathbf{S}\cdot\mathbf{B}} = e^{4\pi i\,\beta\cdot H}. \]
This should be an identity operator: commuting the above with nondiagonal
generators $E_\alpha$ yields (16).

This is just a standard group theory theorem: (16) can in fact be rewritten
as the well-known relation between a weight vector and a root vector of any
group, $2\,\beta\cdot\alpha^*/(\alpha^*\cdot\alpha^*) \in \mathbb{Z}$.
The group generated by (17) is known as the dual (we shall call it the
Goddard–Nuyts–Olive–Weinberg (GNOW) dual below) of H, which we denote by H̃. One is thus
led to a set of semiclassical degenerate monopoles, with multiplicity equal to
that of a representation of H̃; this has led to the so-called GNOW conjecture,
i.e., that they form a multiplet of the group H̃, dual of H [4, 5, 6]. For simply
laced groups (with the same length of all nonzero roots) such as SU(N),
SO(2N), the dual of H is basically the same group, except that the allowed
representations tell us that
\[ U(N) \leftrightarrow U(N); \qquad SO(2N) \leftrightarrow SO(2N), \tag{18} \]
while
\[ SU(N) \leftrightarrow \frac{SU(N)}{\mathbb{Z}_N}; \qquad SO(2N+1) \leftrightarrow USp(2N). \tag{19} \]
There is no difficulty in explicitly constructing this degenerate set of monopoles
[6]. The basic idea is to embed the ’t Hooft–Polyakov monopoles in
various broken SU(2) subgroups. The main results are summarized in Appendixes
9 and 9. This set of monopoles constitutes the prime candidate for
the members of a multiplet of the dual group H̃.
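The root map (17) can be made concrete for the simplest non-simply-laced pair in (19), SO(5) ↔ USp(4). The sketch below is only an illustration: the explicit root coordinates (long roots of SO(5) taken as (±1, ±1), short roots as (±1, 0) and (0, ±1)) are a conventional normalization assumed here, and the overall rescaling by 2 used in the comparison is likewise a matter of convention.

```python
from fractions import Fraction
from itertools import product

def dual_root(alpha):
    """alpha* = alpha / (alpha . alpha), eq. (17), in exact rational arithmetic."""
    norm2 = sum(a * a for a in alpha)
    return tuple(Fraction(a) / norm2 for a in alpha)

def so5_roots():
    """Roots of SO(5) (B2) in an assumed normalization: long (+-1, +-1), short (+-1, 0), (0, +-1)."""
    long_roots = list(product((1, -1), repeat=2))
    short_roots = [(s, 0) for s in (1, -1)] + [(0, s) for s in (1, -1)]
    return long_roots + short_roots

def usp4_roots():
    """Roots of USp(4) (C2) in an assumed normalization: short (+-1, +-1), long (+-2, 0), (0, +-2)."""
    short_roots = list(product((1, -1), repeat=2))
    long_roots = [(2 * s, 0) for s in (1, -1)] + [(0, 2 * s) for s in (1, -1)]
    return short_roots + long_roots

def rescaled_dual_roots(roots, scale=2):
    """Apply (17) to every root and rescale by a common conventional factor."""
    return {tuple(scale * c for c in dual_root(a)) for a in roots}
```

With these conventions, `rescaled_dual_roots(so5_roots())` coincides with `set(usp4_roots())`: the map (17) exchanges long and short roots. Applying `dual_root` twice returns the original root, which is the statement $\alpha^*/(\alpha^*\cdot\alpha^*) = \alpha$ used to rewrite (16).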
There are however well-known difficulties in such an interpretation. The
first concerns the topological obstruction discussed in [11]: in the presence
of the classical monopole background, it is not possible to define a globally
well-defined set of generators isomorphic to H. As a consequence, no “colored
dyons” exist. In the simplest case, with the breaking
\[ SU(3) \xrightarrow{\;\langle\phi_1\rangle \neq 0\;} SU(2) \times U(1), \tag{20} \]
this means that
\[ \text{no monopoles with charges } (2, 1^*) \text{ exist}, \tag{21} \]
where the asterisk indicates the dual, magnetic U(1) charge.

The second can be regarded as an infinitesimal version of the same difficulty:
certain bosonic zero modes around the monopole solution, corresponding
to H gauge transformations, are non-normalizable (behaving as $r^{-1/2}$
asymptotically). Thus the standard procedure of quantization, leading to H
multiplets of monopoles, does not work. Some progress in checking GNOW
duality along this orthodox line of thought has nevertheless been reported
[14], in the context of N = 4 supersymmetric gauge theories. That approach,
however, requires the consideration of a particular class of multimonopole
systems, neutral with respect to the non-Abelian group (more precisely, the
non-Abelian part of) H only.

Both of these difficulties concern the transformation properties of the
monopoles under the subgroup H, while the relevant question should be how
they transform under the dual group, H̃. As field transformation groups, H
and H̃ are relatively nonlocal: the latter should look like a nonlocal transformation
group in the original, electric description.
Another related question concerns the multiplicity of the monopoles: Take
again the case of the system with a breaking pattern, (20). One might argue
that there is only one monopole, as all the degenerate solutions are related by
the unbroken gauge group H = SU (2).4 Or one might say that there are two
monopoles, in the sense that according to the semiclassical GNO classification
they are supposed to belong to a doublet of the dual SU (2) group. Or, per-
haps, one should conclude that there are infinitely many, continuously related
solutions, as the two solutions obtained by embedding the ’t Hooft solutions
in (1,3) and (2,3) subspaces, are clearly part of the continuous set of (moduli
of) solutions. In short, what is the multiplicity (N ) of the monopoles:

N = 1, 2, or ∞? (22)

Formulated perhaps more adequately:

• What is the dual group? How do the degenerate magnetic monopoles trans-
form among themselves under the dual group? Which of the semiclassical
aspects of monopoles survive quantum effects?
In the attempt to answer these questions, some general considerations seem
to be unavoidable. The first is the fact that, since the H and H̃ groups are non-Abelian,
the dynamics of the system should enter the problem in an essential way. For
instance, the non-Abelian H interactions can become strongly coupled at low
energies and H can break itself dynamically. This indeed occurs in pure N = 2
super Yang–Mills theories (i.e., theories without quark hypermultiplets),
where the exact quantum-mechanical result is known in terms of the Seiberg–Witten
curves [23, 24, 25]: see below. Consider, for instance, a pure N = 2,
SU(N + 1) gauge theory. Even though partial breaking, e.g., SU(N + 1) →
SU(N) × U(1), looks perfectly possible semiclassically, in an appropriate region
of classical degenerate vacua, no such vacua exist quantum mechanically. In
all vacua the light monopoles are Abelian, the effective, magnetic gauge group
being $U(1)^N$.
Generally speaking, the concept of a dual group multiplet is well-defined
when H̃ interactions are weak (or at worst, conformal). This however means
that one must study the original, electric theory in the regime of strong cou-
pling, which would usually make the task exceedingly difficult. Fortunately,
in N = 2 supersymmetric gauge theories, exact Seiberg–Witten curves de-
scribe the fully quantum–mechanical consequences of the strong-interaction
4
This interpretation however encounters the difficulties mentioned above. Also
there are cases in which degenerate monopoles occur, which are not simply related
by the group H, see below.

dynamics in terms of weakly coupled dual magnetic variables. This is how we

know that the non-Abelian monopoles exist in fully quantum theories [27]:
in the so-called r-vacua of softly broken N = 2, SU (N ) gauge theory, the
light monopoles appear as the dominant infrared degrees of freedom and in-
teract as point-like particles having the charges of a fundamental multiplet
r of an eﬀective, dual SU (r) gauge group. In an SU (3) gauge theory broken
to SU (2) × U (1) as in (20), with an appropriate number of quark multiplets
(Nf ≥ 4), for instance, light magnetic monopoles carrying the charges

(2∗ , 1∗ ) (23)

under the dual SU (2) × U (1) appear in the low-energy effective action. (Dual)
colored dyons do exist! The distinction between H and H̃ is crucial (cf. (21)).
In N = 2, SU(N) SQCD with $N_f$ flavors, light non-Abelian monopoles
with SU(r) dual gauge group appear for $r \le N_f/2$ only. Such a limit clearly
reflects the dynamics of the soliton monopoles under renormalization group: the
effective low-energy gauge group must be either infrared free or conformally
invariant, in order for the monopoles to emerge as recognizable low-energy
degrees of freedom [28, 29, 30].
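The bound $r \le N_f/2$ can be read off from the one-loop beta function: for an N = 2 SU(r) gauge theory with $N_f$ fundamental hypermultiplets the coefficient is proportional to $2r - N_f$, so the dual theory is infrared free or conformal precisely when $2r \le N_f$. The short sketch below encodes this counting; the function names and the argument `r_max` (an upper limit on the candidate ranks, to be supplied by the reader) are hypothetical and introduced only for illustration.

```python
def one_loop_coefficient(r, n_f):
    """Quantity proportional to the one-loop beta-function coefficient of
    N=2 SU(r) with n_f fundamental hypermultiplets: 2 r - n_f."""
    return 2 * r - n_f

def infrared_safe(r, n_f):
    """True if the dual SU(r) theory is infrared free (< 0) or conformal (= 0)."""
    return one_loop_coefficient(r, n_f) <= 0

def allowed_dual_ranks(n_f, r_max):
    """Candidate non-Abelian dual groups SU(r), 2 <= r <= r_max, obeying r <= n_f / 2."""
    return [r for r in range(2, r_max + 1) if infrared_safe(r, n_f)]
```

For example, with $N_f = 4$ only r = 2 survives, and the dual SU(2) is then exactly at the conformal boundary, which is consistent with the condition $N_f \ge 4$ quoted above for the SU(3) example.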
A closely related point concerns the phase of the system. If the dual group
were in Higgs phase, the multiplet structure among the monopoles would get
lost, generally. Therefore one must study the dual (H̃) system in confinement
phase.5 But then, according to the standard electromagnetic duality argu-
ment, one must analyze the electric system in Higgs phase. The monopoles
will appear confined by the vortices of the H system, which can be naturally
interpreted as confining string of the dual system H̃.
We are thus led to study the system with a hierarchical symmetry breaking,
\[ G \xrightarrow{\;v_1\;} H \xrightarrow{\;v_2\;} \emptyset, \tag{24} \]
where
\[ v_1 \gg v_2, \tag{25} \]
instead of the original system (9). The smaller VEV breaks H completely.
However, in order for the degeneracy among the monopoles not to be broken
by the breaking at the scale $v_2$, we require that some global color–flavor
diagonal group
\[ H_{C+F} \subset H_{\rm color} \otimes G_F \tag{26} \]
remains unbroken (see below).
As we shall see, such a scenario is very naturally realized in N = 2 super-
symmetric theories. An important lesson one learns from these considerations
(and from the explicit models) is that the role of the massless flavors is fundamental.
This manifests itself in more than one way.
5
Non-Abelian monopoles in the Coulomb phase suffer from the difficulties already
discussed.

(i) H must be non-asymptotically free; this requires that there be a sufficient
number of massless flavors: otherwise, H interactions would become
strong at low energies and the H group could break itself dynamically;
(ii) The physics of the r vacua [28, 30] indeed shows that the non-Abelian
dual group SU(r) appears only for $r \le N_f/2$. This limit can be understood
from the renormalization group: in order for a nontrivial r vacuum to
exist, there must be at least 2r massless matter flavors in the original,
electric theory;
(iii) Non-Abelian vortices [35, 37], which as we shall see are closely related
to the concept of non-Abelian monopoles, require also an exact flavor
group. The non-Abelian flux moduli arise as a result of an exact color–
flavor diagonal symmetry of the system, broken by individual soliton
vortices.

3 Non-Abelian Monopoles from Vortex Moduli

It turns out that the properties of the monopoles induced by the breaking

G→H (27)

are closely related to the properties of the vortices, which develop when the
low-energy H gauge theory is put in Higgs phase by a set of scalar VEVs,
H → ∅. The crucial instrument is the exact homotopy sequence,

· · · → π2 (G) → π2 (G/H) → π1 (H) → π1 (G) → · · · (28)

But first a few words on homotopy groups and on the use of these relations
to characterize the semiclassical monopoles. We shall come back to consider
monopole–vortex mixed configurations later.
$\pi_1(M)$ and $\pi_2(M)$ are the first and second homotopy groups, respectively,
representing the distinct classes of maps from $S^1$ or $S^2$ to the (group) manifold
M. Now “products” among such equivalence classes can be defined and
they turn out to form a group structure [39, 8]. The definition of “the relative
homotopy groups” such as $\pi_2(G/H)$ and the proof of the exactness of
the sequence (28) can be found in the first reference. An exact sequence is a
useful tool for studying the structure of different groups through their corre-
spondences (group homomorphisms). “Exact” means that the kernel of the
map at any point of the chain is equal to the image of the preceding map.
Such relations are shown pictorially in Fig. 1. These sequences can be used,
for instance, as follows. Assume for simplicity that π2 (G) and π1 (G) are both
trivial. In this case it is clear that each element of π1 (H) is an image of a cor-
responding element of π2 (G/H): all monopoles are regular, ’t Hooft–Polyakov
monopoles.
Consider now the case in which $\pi_1(G)$ is nontrivial. Take for concreteness G =
SO(3), with $\pi_1(SO(3)) = \mathbb{Z}_2$, and H = U(1), with $\pi_1(U(1)) = \mathbb{Z}$. For any
compact Lie group $\pi_2(G) = \emptyset$. The exact sequence illustrated in Fig. 1 in this
case implies that the monopoles, classified by $\pi_1(U(1)) = \mathbb{Z}$, can further be
divided into two classes: those belonging to the image of $\pi_2(SO(3)/U(1))$ (the
’t Hooft–Polyakov monopoles) and those which are not related to the
breaking (the singular, Dirac monopoles). The correspondence is two-to-one:
the monopoles of magnetic charges 2n times (n = 1, 2, . . .) the Dirac unit
are regular monopoles while those with charges 2n + 1 are Dirac monopoles.
In other words, the regular monopoles correspond to the kernel of the map
$\pi_1(H) \to \pi_1(G)$ [8].
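For the SO(3) → U(1) example the kernel statement amounts to reducing the U(1) winding number modulo 2. A minimal sketch, with hypothetical function names and with the magnetic charge k measured in Dirac units, is:

```python
def image_in_pi1_so3(k):
    """Homomorphism pi_1(U(1)) = Z -> pi_1(SO(3)) = Z_2: winding number mod 2."""
    return k % 2

def classify_monopole(k):
    """Classify a magnetic charge k (in Dirac units) for SO(3) -> U(1).

    Charges in the kernel of Z -> Z_2 (even k) are images of pi_2(SO(3)/U(1))
    and correspond to regular monopoles; odd k correspond to singular ones.
    """
    if image_in_pi1_so3(k) == 0:
        return "regular ('t Hooft-Polyakov)"
    return "singular (Dirac)"
```

Here k = 0 formally belongs to the kernel as well; it is the trivial element and describes the vacuum sector rather than a monopole.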
The exact sequence (28) assumes an important significance when we consider
the system with a hierarchical symmetry breaking (24),
\[ G \xrightarrow{\;v_1\;} H \xrightarrow{\;v_2\;} \emptyset. \]
As H is now completely broken, the low-energy theory has vortices, classified
by $\pi_1(H)$. If $\pi_1(G) = \emptyset$, however, the full theory cannot have vortices. This
apparent paradox is solved when one realizes that there is another related
paradox: monopoles representing $\pi_2(G/H)$ cannot be stable, because in the
full theory the gauge group is completely broken, G → ∅, and because for
any Lie group, $\pi_2(G) = \emptyset$. These paradoxes solve themselves: the vortices
of the low-energy theory end at the monopoles, which have large but finite
masses. Or they are broken in the middle by (though suppressed) monopole–antimonopole
pair production. Vice versa, the monopoles are not stable, as their
flux is carried away by the vortex (see Fig. 2).
Applied to the case of SO(3) → U (1) → ∅, this was precisely the logic used
by ’t Hooft in his pioneering paper on the monopoles. As is seen from Fig. 1,

Fig. 1. A pictorial representation of the exact homotopy sequence, (28), with the
leftmost figure corresponding to $\pi_2(G/H)$ (regions labeled “’t Hooft–Polyakov” and “Dirac”)

Fig. 2. The monopole as a sink of the magnetic flux lines (labels: vortex, monopole, B, $A_\phi$)


the vortices (π1 (U (1)) = Z) of the winding number two, corresponding to the
trivial element of π1 (SO(3)) = Z2 , should not be stable in the full theory:
there must be a regular monopole-like configuration, having the magnetic
charge twice the Dirac unit, $g_m = 4\pi/g$, where g is the gauge coupling
constant of the SO(3) theory, acting as a source or a sink of the magnetic flux
(Fig. 2). 6
An important new aspect we have here, as compared to the case discussed
by ’t Hooft [2] is that now the unbroken group H is non-Abelian and that the
low-energy vortices carry continuous, non-Abelian flux moduli. As the color–
flavor diagonal symmetry is an exact unbroken symmetry of the full theory,
and the non-Abelian moduli among the low-energy vortices are a consequence
of it, it follows that the monopoles appearing as the end points of such
vortices carry the same continuous moduli.
The monopole transformation properties follow from those of the vortices,
which can be studied in the low-energy approximation.

4 N = 2 Supersymmetric Gauge Theories and Light Non-Abelian Monopoles

It is always a healthy attitude to try to test one’s general idea against a
concrete model. For various reasons it turns out that N = 2 models provide
a good testing ground, as the results of strong infrared dynamics are known in
the form of exact Seiberg–Witten curves. Another advantage is that by varying
certain parameters upon which the system depends holomorphically, as is
usual in supersymmetric theories, one can study the system (8) in different
regimes.
In the regions of parameters where $v_1 \gg v_2 \gg \Lambda$, semiclassical analysis
in the original electric theory is justified, and one can study monopoles (in
the effective theory at mass scales much higher than $v_2$) and, separately, the
vortices (in the effective theory valid at mass scales much lower than $v_1$). The
symmetry and homotopy-map argument allows one to obtain the missing information
about the non-Abelian transformation properties of the monopoles
from the known properties of the vortices. We come back to this discussion
in Sect. 6.2. In the concrete models studied there the breaking mass scales
are given by $m_i = m \sim v_1$; $\sqrt{\mu m} \sim v_2$, so the parameter regions explored
correspond to $|m_i| \gg |\mu| \gg \Lambda$.
These results are then checked against the fully quantum–mechanical re-
sults on the monopoles appearing as the massless degrees of freedom in the

6
The relation appears to violate the Dirac quantization condition: actually, the
minimum electric charge which could be introduced in the theory is that of a
quark, $e = g/2$, which satisfies $g_m e = 2\pi$, in accordance with Dirac’s condition.

magnetic dual theory, in the region $v_1 \sim v_2 \sim \Lambda$. This regime will be discussed
first. In the following Sect. 4.2, in fact, the parameters are chosen to
be $m_i, \mu \sim \Lambda$, and in particular, $m_i \to m$.
We shall return later (Sect. 6.2) to see how our ideas on non-Abelian
duality based on the hierarchical symmetry breaking and on color–flavor diagonal
symmetry can be studied reliably in the same model, and that the
results found match the full quantum results.

4.1 Seiberg–Witten Solution of Pure N = 2 Yang–Mills

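N = 2 supersymmetric SU(2) Yang–Mills theory is described by the
Lagrangian,
\[ L = \frac{1}{8\pi}\, \mathrm{Im}\, \tau_{cl} \left[ \int d^4\theta\; \Phi^\dagger e^V \Phi + \int d^2\theta\; \frac{1}{2}\, W W \right], \tag{29} \]
where
\[ \tau_{cl} \equiv \frac{\theta_0}{2\pi} + \frac{4\pi i}{g_0^2} \tag{30} \]
combines the bare θ parameter and coupling constant. $\Phi = \phi + \sqrt{2}\,\theta\psi + \ldots$ and
$W_\alpha = -i\lambda + \frac{i}{2}\,(\sigma^\mu \bar\sigma^\nu)^\beta_\alpha F_{\mu\nu}\theta_\beta + \ldots$ are N = 1 chiral and gauge superfields,
both in the adjoint representation of the gauge group. The theory possesses
N = 2 supersymmetry as there are two gauginos, λ and ψ.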
The scalar potential in this case consists only of the so-called D term,
\[ V_D = \frac{g^2}{8}\, \big| [\Phi^\dagger, \Phi] \big|^2, \tag{31} \]
and the system has a continuous vacuum degeneracy (CMS, the classical
moduli space), parametrized by a complex number a,
\[ \langle\Phi\rangle = \begin{pmatrix} a & 0 \\ 0 & -a \end{pmatrix}. \tag{32} \]
At any given a the gauge symmetry is broken by the Higgs mechanism to U(1).
The low-energy theory is a U(1) theory, describing the photon and photino λ,
and the N = 2 partners, A = (A, ψ).
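The general requirement of N = 2 supersymmetry implies that the
Lagrangian has the form
\[ L_{eff} = \frac{1}{4\pi}\, \mathrm{Im} \left[ \int d^4\theta\; \frac{dF(A)}{dA}\, \bar{A} + \int d^2\theta\; \frac{1}{2}\, \frac{d^2F(A)}{dA^2}\, W^\alpha W_\alpha \right], \tag{33} \]
where F(A) is holomorphic in A; F(A) is known as the prepotential. Going to
component fields, the fermionic and gauge parts take the form
\[ L_{ferm} = \frac{1}{8\pi^2} \left[ \mathrm{Im}\, \frac{d^2F(A)}{dA^2} \right] (i\bar\psi \bar\sigma^\mu \bar{D}_\mu \psi + i\bar\lambda \bar\sigma^\mu \bar{D}_\mu \lambda + \ldots), \]
\[ L_{gauge} = \frac{1}{16\pi^2} \left[ \mathrm{Im}\, \frac{d^2F(A)}{dA^2} \right] (F_{\mu\nu}^2 + i F_{\mu\nu} \tilde{F}^{\mu\nu} + \ldots), \]
which shows clearly that ψ and λ have the same properties as the adjoint fermions
($SU_R(2)$ global symmetry of N = 2 supersymmetry); the second formula
shows that
\[ \tau_{eff} = \frac{dA_D}{dA} = \frac{d^2F(A)}{dA^2}, \qquad A_D \equiv \frac{dF(A)}{dA}, \]
acts as the low-energy effective (complex) coupling constant
\[ \tau_{eff} = \frac{\theta_{eff}}{2\pi} + \frac{4\pi i}{g_{eff}^2}. \tag{34} \]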

Let us recall that in a general 4D supersymmetric sigma model, with a set
of scalar multiplets Φ, the kinetic term is given by a (real) Kähler potential,
\[ L = \int d^4\theta\; K(\Phi, \bar\Phi) = \frac{\partial^2 K}{\partial\phi_i\, \partial\bar\phi_j}\, \partial_\mu \phi_i\, \partial^\mu \bar\phi_j + \ldots. \]
Here the Kähler potential has a special form, determined by the prepotential,
\[ K = \frac{1}{2i} \left[ \frac{dF(A)}{dA_i}\, \bar{A}_i - \frac{d\bar{F}(\bar{A})}{d\bar{A}_i}\, A_i \right] \]
(termed special geometry).
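Coming back to the SU(2) N = 2 Yang–Mills theory where there is only
one scalar multiplet A, the bosonic part of the Lagrangian has the form
\[ L_{bos} = \frac{1}{2i}\, (\partial_\mu a_D\, \partial^\mu \bar{a} - \partial_\mu a\, \partial^\mu \bar{a}_D) + \mathrm{Im}\, \tau(a)\, (F^+_{\mu\nu})^2, \qquad F^+_{\mu\nu} = F_{\mu\nu} + i \tilde{F}_{\mu\nu}. \]
Now this model has a nice property of (form) invariance under the generalized
electromagnetic duality transformation [40]
\[ \begin{pmatrix} a_D \\ a \end{pmatrix} \to M \begin{pmatrix} a_D \\ a \end{pmatrix}, \qquad \begin{pmatrix} F^+_{\mu\nu} \\ G^+_{\mu\nu} \end{pmatrix} \to M \begin{pmatrix} F^+_{\mu\nu} \\ G^+_{\mu\nu} \end{pmatrix}; \tag{35} \]
where
\[ G^+_{\mu\nu} \equiv \frac{1}{2}\, \frac{\partial}{\partial F^+_{\mu\nu}} \left[ \tau(a)\, F^{+\,2}_{\mu\nu} \right] \]
and M is an SL(2, Z) matrix,
\[ M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \qquad A D - B C = 1. \]
Such an invariance group includes the electromagnetic duality transformation
$F_{\mu\nu} \leftrightarrow \tilde{F}_{\mu\nu}$, together with $a \leftrightarrow a_D$.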
Since F(A) is holomorphic, so is τ(A): it is harmonic, $\nabla^2 \tau = \nabla^2\, \mathrm{Im}\,\tau = 0$.
Thus Im τ, which as a harmonic function has no minimum, cannot be everywhere
positive. This means that A cannot be a good
global variable everywhere in the field space: there must be some singularities
where the description in terms of a, $F_{\mu\nu}$ fails.
The beautiful argument by Seiberg and Witten [23, 24], that the singularity
is related to the point where the magnetic monopole of the theory becomes
massless due to quantum effects (as the bosonic part of the model is just the
Georgi–Glashow model, the soliton monopoles found by ’t Hooft and Polyakov
are part of the spectrum), and the consequent determination of the
prepotential F(A) are by now well known. For completeness we summarize
the main points of the solution in Appendix .4. Let us recall the main result
here: by introducing an auxiliary torus (whose genus 1 corresponds to the
rank of the gauge group SU(2)), described by the algebraic curve
\[ y^2 = (x^2 - \Lambda^4)(x - u) = (x + \Lambda^2)(x - \Lambda^2)(x - u), \qquad u \equiv \langle \mathrm{Tr}\,\Phi^2 \rangle, \tag{36} \]
the solution is expressed as
\[ \frac{da_D}{du} = \oint_\beta \frac{dx}{y}, \qquad \frac{da}{du} = \oint_\alpha \frac{dx}{y}, \tag{37} \]
where α and β are the two canonical cycles on the torus, Fig. 3. Explicitly,
\[ a_D(u) = \frac{\sqrt{2}}{\pi} \int_{\Lambda^2}^{u} dx\, \frac{(x - u)^{1/2}}{(x^2 - \Lambda^4)^{1/2}} = i\, \frac{u - \Lambda^2}{2}\, F\!\left( \frac{1}{2}, \frac{1}{2}; 2; \frac{\Lambda^2 - u}{2} \right), \]
\[ a(u) = \frac{\sqrt{2}}{\pi} \int_{-\Lambda^2}^{\Lambda^2} dx\, \frac{(x - u)^{1/2}}{(x^2 - \Lambda^4)^{1/2}} = \sqrt{2}\, (u + \Lambda^2)^{1/2}\, F\!\left( -\frac{1}{2}, \frac{1}{2}; 1; \frac{2}{u + \Lambda^2} \right). \tag{38} \]
The key step of the solution (37) was the theorem in algebraic geometry that
the integrals of the holomorphic differential ($dx/y$ in our case of the genus one
torus (36)) along the canonical cycles α and β (they are called period integrals)
satisfy
\[ \mathrm{Im}\, \frac{\oint_\beta dx/y}{\oint_\alpha dx/y} > 0, \]
independently of the way canonical cycles are redefined. According to the
identification of the period integrals with the physical quantities as in (37), this
guarantees that
\[ \mathrm{Im}\, \tau_{eff} = \mathrm{Im}\, \frac{da_D}{da} = \frac{4\pi}{g_{eff}^2} > 0. \]

Fig. 3. The torus (36) represented as a two-sheeted Riemann surface, with two
branch cuts (left). Note that two Riemann spheres attached at two cuts are equivalent
to a torus (figure on the right)

Let us add several remarks.
(i) Another key observation by Seiberg–Witten is that the N = 2 supersymmetry
implies an exact mass formula for BPS saturated states with
magnetic and electric charges $n_m$, $n_e$:
\[ M_{n_m, n_e} = \sqrt{2}\, |n_m a_D + n_e a|. \tag{39} \]
This is a consequence of the fact that the system has an underlying N = 2
supersymmetry with a central extension (see Appendix .4). This formula
generalizes the standard Higgs formula, $M_{0,n_e} = g\, n_e \langle\phi\rangle$, as $a \sim g \langle\phi\rangle$
semiclassically, and at the same time, the ’t Hooft–Polyakov monopole
mass formula, $M_{n_m,0} = 4\pi\, n_m \langle\phi\rangle / g$ (semiclassically $a_D \sim 4\pi \langle\phi\rangle / g$).
Note that in the fully quantum formula (39) the magnetic and electric
charges appear symmetrically. Indeed the mass formula is invariant under
the generalized duality transformations (35), modulo appropriate relabeling
of magnetic and electric charges.
(ii) Quite remarkably, the low-energy effective action thus determined contains
quantum effects in their entirety: the one-loop perturbative effects plus
the sum of infinitely many instanton contributions. Indeed, the Seiberg–Witten
curves have been checked against direct instanton calculations [41], and
more recently have been rederived by an explicit instanton resummation
[42].
(iii) The Seiberg–Witten solution nicely solves an old (apparent) paradox
related to the Dirac quantization versus renormalization group [8]: how
can the relation $g_m g_e = 2\pi n$, n = 0, 1, . . . be compatible with the fact
that both the electric and magnetic charges are Abelian U(1) coupling
constants, expected to get renormalized in the same direction? In the
Seiberg–Witten solution, $g_m(\mu)$ gets renormalized as in (the magnetic
version of) QED, through monopole loops, with monopoles playing the role of
the electron. The same infrared behavior is explained, in the original electric
picture, as due to instanton-induced nonperturbative renormalization of
the electric coupling constant $g_e(\mu)$. As a consequence
$g_m(\mu)\,g_e(\mu) = 4\pi$ holds [23] at any infrared cutoff $\mu = a_D$. For
other subtle issues related to renormalization group properties of the
Seiberg–Witten solution, see [43].
(iv) How do we know that these massless monopoles are related to the ’t
Hooft–Polyakov monopoles? That they are can be verified by studying their
electric and quark-number charges (the latter in the cases with
$N_f = 1, 2, 3$). As is well known the ’t Hooft–Polyakov monopoles acquire
these U(1) charges quantum mechanically, via a beautiful phenomenon of
charge fractionalization [44], which in this specific situation takes the form
of the Witten [45] and Jackiw–Rebbi [46] effects. By moving within the space
of vacua (QMS) and going into the regions where the semiclassical
approximation is valid (where $u = \langle \mathrm{Tr}\,\Phi^2\rangle \gg \Lambda^2$),
one can compare these fractional U(1) charges read off from the leading terms
of the exact Seiberg–Witten solution with the ones obtained many years
earlier by standard quantization of fermion fields around the semiclassical
monopole backgrounds [47]. The results exactly match [48, 49].
(v) The low-energy effective Lagrangian near one of the singularities, e.g.,
$u = \Lambda^2$, looks like a (dual) QED with a massless monopole, whose
Lagrangian has the standard N = 2 QED form,
\[
L = \frac{1}{4\pi}\,\mathrm{Im}\Big[\int d^4\theta\,\frac{dF(A_D)}{dA_D}\,\bar A_D
  + \int d^2\theta\,\frac12\,\frac{d^2F(A_D)}{dA_D^2}\,W_D^{\alpha} W_{D\alpha}\Big]
\]
\[
\quad + \int d^4\theta\,\big(\bar M e^{V_D} M + \tilde M e^{-V_D}\bar{\tilde M}\big)
  + \int d^2\theta\,\sqrt2\,\tilde M A_D M, \tag{40}
\]
where the gauge terms are just the dual of (33); the third and fourth terms
describe the monopole.
(vi) Addition of an N = 1 perturbation, the adjoint scalar mass term
$\mu\,\mathrm{Tr}\,\Phi^2$, in the original electric theory induces
$\Delta L = \mu\,U(A_D)$, where the function $U(A_D)$ is the inverse of the
solution $a_D(u)$. By minimizing the potential, the degeneracy (the quantum
moduli space, QMS) is eliminated, leaving just two vacua, where
\[
a_D = 0, \qquad u = \langle \mathrm{Tr}\,\Phi^2\rangle = \pm\Lambda^2, \qquad
\langle M\rangle = \langle\tilde M\rangle = \sqrt{\mu\,\frac{\partial U}{\partial A_D}} \sim \sqrt{\mu\Lambda}.
\]
The first result says that the magnetic monopole is massless in this vacuum
(see (40)); the third states that the magnetic monopole condenses, leading
to confinement à la ’t Hooft–Mandelstam. This is perhaps the first example
of a nontrivial 4D system where this phenomenon has been demonstrated
explicitly and analytically.
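Remark (i) above can be made concrete with a few lines of code. This is only an illustrative sketch of the mass formula (39): the function name `bps_mass` and the numerical values of the periods are made up for the example and do not come from the lecture.

```python
import math

def bps_mass(n_m, n_e, a_dual, a):
    """Mass of a BPS state with charges (n_m, n_e), following eq. (39)."""
    return math.sqrt(2) * abs(n_m * a_dual + n_e * a)

# Made-up values of the periods (a_D, a) at some point of the moduli space.
a_dual_val, a_val = 0.3 + 0.1j, 1.2 - 0.4j

print(bps_mass(1, 0, a_dual_val, a_val))   # monopole
print(bps_mass(0, 1, a_dual_val, a_val))   # electrically charged state
print(bps_mass(1, 1, a_dual_val, a_val))   # dyon

# At a point where a_D = 0 the (1, 0) monopole is exactly massless.
print(bps_mass(1, 0, 0j, a_val))

# Exchanging (a_D, a) together with (n_m, n_e) leaves the mass unchanged:
# one simple relabeling, not the full duality group of (35).
print(bps_mass(2, 3, a_dual_val, a_val) == bps_mass(3, 2, a_val, a_dual_val))
```

The last line only demonstrates the symmetric appearance of the two charges in (39); the full set of duality transformations is not reproduced here.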

4.2 Seiberg–Witten Solutions for N = 2 Models with Quarks

A general enthusiasm (alarm?) caused by the news that the SU(2)
Seiberg–Witten model with a small N = 1 perturbation exhibited the ’t
Hooft–Mandelstam mechanism of confinement was followed by a widespread
disappointment (relief?) among theoretical physicists when it was realized
that the light monopoles appearing in the low-energy theory were Abelian and
that confinement was accompanied by dynamical Abelianization. This
surely was not a good model of QCD! The fact that in the SU(2) models
with $N_f = 1, 2, 3$ hypermultiplets of quarks, studied in the (quite remarkable)
second paper by Seiberg and Witten [24], as well as in pure N = 2 Yang–Mills
theories with more general gauge groups [25], the low-energy monopoles were
always Abelian, did not help.
What was not realized at the time, however, was the fact that there was
a clear reason for the Abelianization in these simplest models (see Sect. 4.3
below), and that, in the context of a more general class of N = 2 theories
with quark multiplets, Abelian confinement belonged to the exceptional cases.
In fact, confinement is more typically caused by condensation of non-Abelian
monopoles, as the subsequent analyses have revealed. We shall below briefly
summarize the main features of these models, with technical aspects kept to
a minimum.
The systems we consider are simple generalizations of the N = 2 models
with “quark” multiplets. The N = 1 chiral and gauge superfields
$\Phi = \phi + \sqrt2\,\theta\psi + \dots$ and
$W_\alpha = -i\lambda + \frac{i}{2}(\sigma^\mu\bar\sigma^\nu)_\alpha{}^{\beta} F_{\mu\nu}\theta_\beta + \dots$
are both in the adjoint representation of the gauge group, while the
hypermultiplets are taken in the fundamental representation of the gauge
group. The Lagrangian takes the form,
\[
L = \frac{1}{8\pi}\,\mathrm{Im}\,\tau_{cl}\Big[\int d^4\theta\,\Phi^\dagger e^{V}\Phi
  + \int d^2\theta\,\frac12\,W W\Big] + L^{(\rm quarks)} + \Delta L + \Delta' L, \tag{41}
\]
where
\[
L^{(\rm quarks)} = \sum_i\Big[\int d^4\theta\,\{Q_i^\dagger e^{V} Q_i + \tilde Q_i^\dagger e^{\tilde V}\tilde Q_i\}
  + \int d^2\theta\,\{\sqrt2\,\tilde Q_i\Phi Q_i + m_i\tilde Q_i Q_i\}\Big] \tag{42}
\]
describes the $n_f$ flavors of hypermultiplets (“quarks”), and
\[
\tau_{cl} \equiv \frac{\theta_0}{\pi} + \frac{8\pi i}{g_0^2} \tag{43}
\]
is the bare θ parameter and coupling constant.
We consider small generic nonvanishing bare masses $m_i$ for the
hypermultiplets (“quarks”), which is consistent with N = 2 supersymmetry.
Furthermore, it is convenient to introduce the mass for the adjoint scalar
multiplet
\[
\Delta L = \int d^2\theta\,\mu\,\mathrm{Tr}\,\Phi^2 \tag{44}
\]
which breaks supersymmetry to N = 1. An advantage of doing so is that all
flat directions are eliminated and one is left with a finite number of isolated
vacua; keeping track of this number (and the symmetry breaking pattern in
each of them) allows us to make highly nontrivial checks of our analyses at
various stages.
Below we summarize the physical results on these systems. To solve the
system (41), the first step is the generalization of the curve (36) to the case
of a general group G. When the breaking is maximal, $G \to U(1)^{r_G}$, where
$r_G$ is the rank of the group G, we set μ = 0 and consider vacua
\[
\langle\Phi\rangle = \mathrm{diag}(\phi_1, \phi_2, \dots), \qquad \phi_1 \ne \phi_2, \ \text{etc.} \tag{45}
\]

The auxiliary genus $g = N_c - 1$ (or $N_c$) curves for $SU(N_c)$
($USp(2N_c)$) theories corresponding to these classical vacua (called the
Coulomb branch of the moduli space) are given by
\[
y^2 = \prod_{k=1}^{n_c}(x-\phi_k)^2 + 4\Lambda^{2n_c-n_f}\prod_{j=1}^{n_f}(x+m_j),
\qquad SU(N_c),\ N_f \le 2N_c-2, \tag{46}
\]
and
\[
y^2 = \prod_{k=1}^{n_c}(x-\phi_k)^2 + 4\Lambda\prod_{j=1}^{n_f}\Big(x+m_j+\frac{\Lambda}{N_c}\Big),
\qquad SU(N_c),\ N_f = 2N_c-1, \tag{47}
\]
with $\phi_k$ subject to the constraint $\sum_{k=1}^{n_c}\phi_k = 0$, and
\[
x\,y^2 = \Big[x\prod_{a=1}^{n_c}(x-\phi_a^2) + 2\Lambda^{2n_c+2-n_f}\,m_1\cdots m_{n_f}\Big]^2
 - 4\Lambda^{2(2n_c+2-n_f)}\prod_{i=1}^{n_f}(x+m_i^2) \tag{48}
\]
for $USp(2N_c)$. Analogous results for $SO(N_c)$ theories are also known.
The connection between these genus g hypertori and physics is made [23,
24, 25, 26, 27, 28, 29] through the identification of various period integrals of
the holomorphic differentials on the curves with
$(da_{Di}/du_j,\ da_i/du_j)$, where the gauge invariant parameters $u_j$ are
defined by the standard relation
\[
\prod_{a=1}^{n_c}(x-\phi_a) = \sum_{k=0}^{N_c} u_k\,x^{N_c-k}, \qquad u_0 = 1,\ u_1 = 0, \qquad SU(N_c); \tag{49}
\]
\[
\prod_{a=1}^{n_c}(x-\phi_a^2) = \sum_{k=0}^{N_c} u_k\,x^{N_c-k}, \qquad u_0 = 1, \qquad USp(2N_c), \tag{50}
\]
and $u_2 \equiv \langle\mathrm{Tr}\,\Phi^2\rangle$, $u_3 \equiv \langle\mathrm{Tr}\,\Phi^3\rangle$, etc.
The VEVs of $a_{Di}$, $a_i$, which are directly related to the physical masses
of the BPS particles through the exact Seiberg–Witten mass formula [23, 24]
\[
M^{n_{mi},n_{ei},S_k} = \sqrt2\,\Big|\sum_{i=1}^{g}(n_{mi}\,a_{Di} + n_{ei}\,a_i) + \sum_k S_k\,m_k\Big|, \tag{51}
\]
are constructed as integrals over the nontrivial cycles of the meromorphic
differentials on the curves. $S_k$ is the k-th quark number charge of the
monopole under consideration, which enters the formula for the central
charge (hence the mass).
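The relations (49) and (50) are just polynomial expansions, and can be sketched in a few lines. The helper `expand_roots` and the sample values of $\phi_a$ below are illustrative assumptions, not taken from the lecture.

```python
def expand_roots(roots):
    """Return [u_0, ..., u_N] with prod_a (x - root_a) = sum_k u_k x^(N-k)."""
    coeffs = [1]
    for r in roots:
        new = coeffs + [0]               # multiply the polynomial by x ...
        for k in range(len(coeffs)):
            new[k + 1] -= r * coeffs[k]  # ... and subtract r times it
        coeffs = new
    return coeffs

# SU(3): sample classical VEVs phi_a, chosen so that they sum to zero.
phis_su3 = [1, 2, -3]
print(expand_roots(phis_su3))            # coefficients u_0 .. u_3 of (49)

# USp(4): the roots of (50) are the squares phi_a^2.
phis_usp4 = [1, 2]
print(expand_roots([p * p for p in phis_usp4]))
```

With the tracelessness constraint the SU(N_c) output always has $u_1 = 0$. Note that in this classical expansion the coefficient of $x^{N_c-2}$ equals $-\frac12\sum_a\phi_a^2$, so its identification with $\mathrm{Tr}\,\Phi^2$ holds only up to normalization.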
(i) These formulae naturally generalize those of the pure SU (2) theory, (37)
and (39). The singularities of the curves (46)–(48) are the points in the
space of vacua (QMS) where various particles become massless.
(ii) When $m_i \gg \Lambda$ these singularities are at the points where
$\phi \sim m_i$ (where the quarks become massless, see (42)) and at the points
where monopoles of pure Yang–Mills theory become massless. The latter are
the points where the curve of the Yang–Mills theory,
\[
y^2 = \prod_{k=1}^{n_c}(x-\phi_k)^2 + 4\Lambda_{YM}^{2n_c},
\]
becomes maximally singular,
$y^2 \sim \prod_{i=1}^{n_c-1}(x-x_i)^2\,(x-\alpha)(x-\beta)$.
(iii) It is the property of these curves that when mi ∼ Λ all singularities are
found to correspond to magnetic degrees of freedom (massless monopoles
and dyons). To trace how, as mi are varied, the original “electric” singular-
ities (massless quarks) make a metamorphosis into magnetic monopoles,
due to the movement of certain branch points (or branch cuts) sliding
under other branch cuts (branch surfaces), is a rather complicated busi-
ness, and has been analyzed satisfactorily only in the SU (2) theories with
matter [24, 51].
(iv) The particular forms of the curves specific to different groups reflect
different global symmetries. A nice discussion is given in [26].

4.3 Exact Quantum Behavior of Light Non-Abelian Monopoles

Physics of confining vacua and properties of light monopoles in these theories
are studied by identifying all of the N = 1 vacua (the points in the QMS, the
quantum moduli space or space of vacua, which survive the N = 1
perturbation) and studying the low-energy action for each of them. The
underlying N = 2 theory, especially with $m_i = 0$ or with equal masses
$m_i = m$, has a large continuous degeneracy of vacua (flat directions), which
has been studied by using the Seiberg–Witten curves, nonrenormalization of
Higgs branch metrics, superconformal points and their universality, their
moduli structure and symmetries, etc. [28, 29]. For the purpose of this
section, however, we are most interested in the set of vacua which are picked
out when small generic bare quark masses $m_i$ and a small nonzero adjoint
mass μ are present. All these vacua, which survive the N = 1 perturbation
(44), lie at the roots of the different branches of N = 2 vacua, where the
Higgs branches meet the Coulomb branch (see Fig. 4). In $SU(N_c)$ theories
with $N_f$ flavors with generic masses, all N = 1 vacua arising this way have
been completely classified [30, 31].
For nearly equal quark masses they fall into classes labeled by
$r = 0, 1, \dots, N_f/2$, groups of vacua near the “roots of nonbaryonic Higgs
branches,” and for $N_f \ge N_c$, there are special vacua at the “roots of
baryonic Higgs branches.” These names reflect the fact that in the respective
Higgs branch nonbaryonic or baryonic squark VEVs,
\[
\langle Q^a_i\,\tilde Q^j_a\rangle, \qquad
\langle \epsilon_{a_1 a_2 \dots a_{N_c}}\,Q^{a_1}_{i_1} Q^{a_2}_{i_2}\cdots Q^{a_{N_c}}_{i_{N_c}}\rangle, \tag{52}
\]
are formed (see Fig. 4). Each group of vacua coalesces into a single vacuum
where the gauge symmetry is enhanced to a non-Abelian gauge group, as in
Table 1.
The vacua at the root of the baryonic branch are in the “free-magnetic”
phase: the light non-Abelian magnetic monopoles appear as asymptotic
states; they do not condense, and no confinement and no symmetry breaking
occur. Although the appearance of the Seiberg dual gauge group,
$SU(\tilde N_c)$, $\tilde N_c \equiv N_f - N_c$, is certainly intriguing [28],
these are not the type of vacua we are interested in.
Our main interest is the first classes of the so-called r-vacua, where the
magnetic gauge group is
\[
SU(r) \times U(1)^{N_c-r},
\]
and the massless matter multiplets consist of $N_f$ monopoles in the
fundamental representation of SU(r), and flavor-singlet Abelian monopoles
carrying a single charge, each with respect to one of the U(1) factors
(Table 1).⁷

Fig. 4. QMS of N = 2 SQCD (SU(n) with n_f quarks). Legend: N = 1 confining
vacua (with perturbation); N = 1 vacua (with perturbation) in free magnetic
phase.

Once the gauge group and the quantum numbers of the matter fields are all
known, the N = 2 supersymmetry uniquely fixes the structure of the effective
action. We find that
(i) We see the non-Abelian monopoles in action, in the generic r
($2 \le r \le N_f/2$) vacua (see Table 2, taken from [30]). They behave
perfectly as point-like particles, albeit in a dual, magnetic gauge system.
Upon N = 1 perturbation they condense (confinement phase),
$\langle q^a_i\rangle \sim \delta^a_i\sqrt{\mu\Lambda}$, and induce flavor
symmetry breaking

SU(N_f) × U(1) → U(r) × U(N_f − r).

(ii) The upper limit $r \le N_f/2$ is a manifestation of monopole dynamics:
only in this range of r can the non-Abelian monopoles appear as recognizable
infrared degrees of freedom. We now see why in the SU(2) Seiberg–Witten
models, as well as in pure N = 2 Yang–Mills (i.e., $N_f = 0$) models with
different gauge groups, the low-energy monopoles were found to be always
Abelian: in all these cases, non-Abelian monopoles would interact too
strongly, not enough of them being there. We remind the reader that the
beta function in N = 2 SU(r) theories has the pure one-loop form with
$\beta_0 \propto 2r - N_f$.
(iii) Indeed, there are homotopy and symmetry arguments [30, 52] which
suggest that non-Abelian monopoles appearing in the r-vacua are “baryonic
constituents” of an Abelian (’t Hooft–Polyakov) monopole,
\[
\text{Abelian monopole} \sim \epsilon_{a_1\dots a_r}\,q^{a_1}_{i_1} q^{a_2}_{i_2}\cdots q^{a_r}_{i_r}, \tag{53}
\]
$a_i$ being the dual color indices and $i_m$ the flavor indices. The SU(r)
gauge interactions, being infrared-free, are unable to keep the Abelian
monopole bound: it disintegrates into non-Abelian monopoles.
(iv) That the effective degrees of freedom in the r vacua are non-Abelian
rather than Abelian monopoles is actually required also by the symmetry of
the system [30, 53], not only by the dynamics. If the Abelian monopoles
of the r-th tensor flavor representation were the correct degrees of freedom,
the low-energy effective theory would have too large an accidental symmetry,
$SU\big(\binom{N_f}{r}\big)$. The condensation of such monopoles would
produce far more Nambu–Goldstone bosons than expected from the
symmetry of the underlying theory. The system prevents such an awkward
situation from being realized in an elegant manner, introducing smaller
solitons, non-Abelian monopoles, in the fundamental representation of
$SU(N_f)$, so that the low-energy theory has the right symmetry.

Table 1. The effective degrees of freedom and their quantum numbers at the
“nonbaryonic root”

|                 | SU(r) | U(1)_0 | U(1)_1 | ... | U(1)_{n_c−r−1} | U(1)_B |
|-----------------|-------|--------|--------|-----|----------------|--------|
| n_f × q         | r     | 1      | 0      | ... | 0              | 0      |
| e_1             | 1     | 0      | 1      | ... | 0              | 0      |
| ...             | ...   | ...    | ...    | ... | ...            | ...    |
| e_{n_c−r−1}     | 1     | 0      | 0      | ... | 1              | 0      |

Table 2. Phases of SU(n_c) gauge theory with n_f flavors. ñ_c ≡ n_f − n_c

| Label (r)       | Deg. freed.  | Eff. gauge group           | Phase         | Global symmetry       |
|-----------------|--------------|----------------------------|---------------|-----------------------|
| 0               | Monopoles    | U(1)^{n_c−1}               | Confinement   | U(n_f)                |
| 1               | Monopoles    | U(1)^{n_c−1}               | Confinement   | U(n_f − 1) × U(1)     |
| ≤ [(n_f − 1)/2] | NA monopoles | SU(r) × U(1)^{n_c−r}       | Confinement   | U(n_f − r) × U(r)     |
| n_f/2           | Rel. nonloc. | -                          | Confinement   | U(n_f/2) × U(n_f/2)   |
| BR              | NA monopoles | SU(ñ_c) × U(1)^{n_c−ñ_c}   | Free magnetic | U(n_f)                |

7
We shall use the notations N_c and n_c interchangeably, and analogously N_f
and n_f.
(v) An analogous argument might be used in standard QCD to exclude the
Abelian picture of confinement, though admittedly this is not a very
rigorous one. We know from lattice simulations of SU(3) theory that
confinement and chiral symmetry breaking are closely related. If Abelian ’t
Hooft–Mandelstam monopoles were the right degrees of freedom describing
confinement, their condensation would somehow have to describe chiral
symmetry breaking as well. We would then be led to assume that they carry
flavor quantum numbers of $SU(N_f)_L \times SU(N_f)_R$, e.g.,
\[
\text{Monopoles} \sim M_{ij}, \qquad \langle M_{ij}\rangle \propto \delta_{ij}\,\Lambda_{QCD},
\]
where i, j are $SU(N_f)_L \times SU(N_f)_R$ indices. But such a system would
have a far too large accidental symmetry. Confinement would be accompanied
by a large number of unexpected (and indeed unobserved) light
Nambu–Goldstone bosons.
(vi) The limiting case of r vacua, with $r = N_f/2$, as well as the massless
($m_i \to 0$) limit of $USp(2N_c)$ and $SO(N_c)$ theories, are of great interest
(see Fig. 5 and Table 3). The low-energy effective theory in these cases turns
out to be a conformally invariant (nontrivial infrared fixed-point) theory.
This is an analogue of an Abelian superconformal vacuum found first in the
pure SU(3) Yang–Mills theory by Argyres and Douglas [54]. It can be
explicitly checked that the low-energy degrees of freedom include relatively
nonlocal monopoles and dyons [30, 53, 55]. There are no local effective
Lagrangians describing the infrared dynamics. These are the most difficult
cases to analyze, but are potentially the most interesting ones, from the point
of view of understanding QCD. We shall come back to these (perhaps,
crucial) cases at the end of the lecture, Sect. 7.

Table 3. Phases of USp(2n_c) gauge theory with n_f flavors with m_i → 0.
ñ_c ≡ n_f − n_c − 2

|              | Deg. freed.  | Eff. gauge group            | Phase         | Global symmetry |
|--------------|--------------|-----------------------------|---------------|-----------------|
| First group  | Rel. nonloc. | -                           | Confinement   | U(n_f)          |
| Second group | Dual quarks  | USp(2ñ_c) × U(1)^{n_c−ñ_c}  | Free magnetic | SO(2n_f)        |

Fig. 5. QMS of N = 2 SQCD USp(2n) theory with n_f quarks. Labels: Higgs
branches, special Higgs branch, Coulomb branch, SCFT, dual quarks,
non-Abelian monopoles. Legend: N = 1 confining vacua (with perturbation);
N = 1 vacua (with perturbation) in free magnetic phase.
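Remarks (ii) and (iv) above rest on two pieces of counting that are simple to tabulate. The sketch below is illustrative only: the function names are invented, the overall normalization of the beta function is dropped (only the sign of $2r - N_f$ is used), and the r-th tensor flavor representation is assumed to be the antisymmetric one, of dimension $\binom{N_f}{r}$.

```python
from math import comb

def beta0(r, n_f):
    """One-loop coefficient of N = 2 SU(r) with n_f flavors, up to normalization."""
    return 2 * r - n_f

def infrared_behaviour(r, n_f):
    """Classify the dual SU(r) gauge dynamics by the sign of 2r - n_f."""
    b = beta0(r, n_f)
    if b < 0:
        return "infrared-free"
    if b == 0:
        return "conformal"
    return "asymptotically free"

def accidental_generators(n_f, r):
    """Generators of SU(binom(n_f, r)) versus those of the true flavor SU(n_f)."""
    dim = comb(n_f, r)
    return dim * dim - 1, n_f * n_f - 1

for r in range(0, 5):
    print(r, infrared_behaviour(r, 6))

# Abelian monopoles in the rank-2 antisymmetric representation of SU(6)
# would imply a much larger symmetry group than SU(6) itself.
print(accidental_generators(6, 2))
```

For $N_f = 6$ the loop shows that the dual gauge interactions are weak in the infrared only for $r < 3$, with $r = 3 = N_f/2$ the borderline conformal case, in line with Table 2.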

5 Vortices
The moral of the story is that the non-Abelian monopoles do exist in fully
quantum-mechanical systems. In typical confining vacua in supersymmetric
gauge theories they are the relevant infrared degrees of freedom. Their
condensation induces confinement and dynamical symmetry breaking. This
brings us back to the problem of understanding these light, magnetic degrees
of freedom as quantum solitons:
• What are their semiclassical counterparts?
• Are they Goddard–Nuyts–Olive–Weinberg monopoles?
• In what sense does condensation of non-Abelian monopoles imply
confinement?
• How has the difficulty related to the dual group mentioned earlier been
avoided?
These are the questions we wish to answer. The idea is to take advantage of the
fact that in supersymmetric theories there are parameters which can be varied,
upon which the physical properties of the system depend in a holomorphic
fashion. As $m_i$ and μ are varied, there cannot be a phase transition at some
|μ| or at some $|m_i|$: the number of Nambu–Goldstone bosons, and hence the
pattern of the symmetry breaking, must be invariant.

5.1 Abrikosov–Nielsen–Olesen Vortex

Topologically stable vortices arise when the ground states of a system have a
nontrivial moduli space which is not simply connected. The best-known case
[56] is the Abelian gauge theory with a charged complex matter field in the
Higgs phase (superconductor), where the static configurations have energy
density
\[
H = \frac14 F_{ij}^2 + |D_i\phi|^2 + V(|\phi|), \qquad D_i = \partial_i - i\,e\,A_i.
\]
The potential V is assumed to attain its minimum at $|\phi| = v \ne 0$. The
asymptotic gauge and scalar fields must be such that the field energy is
finite,
\[
|\phi(x)| \to v, \qquad D_i\phi \to 0, \qquad F_{ij}^2 \to 0.
\]
These allow for nontrivial configurations classified by
\[
\pi_1(U(1)) = Z,
\]
i.e., by an integer winding number n,
\[
\phi \to e^{in\varphi}\,v, \qquad A_\varphi \to \frac{n}{e\rho},
\]
where ρ, ϕ, and z are the position variables of the cylindrical coordinate
system. At the center of the vortex φ(ρ = 0, ϕ) = 0 in order for φ(ρ, ϕ) to be a
smooth configuration: the gauge symmetry is restored along the vortex core.
Depending on the potential, the vacuum can be a superconductor of type
II, where single isolated (Abrikosov–Nielsen–Olesen) vortices are stable, or of
type I, where vortices stick together to form regions of normal ground
state; finally there is the critical case between them (BPS), where vortices
have no net interaction and the tension of a winding number k vortex is equal
to k times that of the minimum-winding vortex.
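The winding number that labels these vortices can be read off numerically from the phase of the asymptotic field. The following is a minimal sketch under stated assumptions: the field is sampled on a large circle using the asymptotic form $\phi \to e^{in\varphi}v$ quoted above, the function names are illustrative, and the sampling must be fine enough that the phase changes by less than π between neighboring points.

```python
import cmath
import math

def asymptotic_higgs(n, v=1.0, samples=360):
    """Samples of phi = v exp(i n varphi) on a large circle around the vortex."""
    return [v * cmath.exp(1j * n * 2 * math.pi * k / samples)
            for k in range(samples)]

def winding_number(field):
    """Total phase advance around the closed loop, in units of 2 pi."""
    total = 0.0
    for k in range(len(field)):
        nxt = field[(k + 1) % len(field)]    # wrap around to close the loop
        total += cmath.phase(nxt / field[k])
    return round(total / (2 * math.pi))

def magnetic_flux(n, e):
    """Line integral of A_varphi = n / (e rho) around the circle: 2 pi n / e."""
    return 2 * math.pi * n / e

for n in (-2, 0, 1, 3):
    print(n, winding_number(asymptotic_higgs(n)), magnetic_flux(n, e=1.0))
```

The flux function simply integrates the asymptotic gauge field quoted in the text, showing that the magnetic flux through the vortex is quantized in units of 2π/e.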

5.2 ZN Vortices

In pure SU(N) theory with all matter fields in the adjoint representation, the
true gauge group is $SU(N)/Z_N$. When the gauge group is completely broken
the vacuum manifold has nontrivial structure,
\[
\pi_1\Big(\frac{SU(N)}{Z_N}\Big) = Z_N. \tag{54}
\]
The asymptotic behavior of the fields, required by finiteness of the tension, is
\[
A_i \sim \frac{i}{g}\,U(\phi)\,\partial_i U^\dagger(\phi); \qquad
\phi_A \sim U\,\phi_A^{(0)}\,U^\dagger, \qquad
U(\phi) = \exp\Big(i\sum_j^{r}\beta_j T_j\,\phi\Big),
\]
where φ in U(φ) is the azimuthal angle, $T_j$ are the generators of the Cartan
subalgebra of H, and $\phi_A^{(0)}$ are the (set of) VEVs of the adjoint scalar
fields which break the SU(N) group completely. The smoothness of the
configurations requires the quantization condition (α = root vectors of H)
\[
U(2\pi) \in Z_N, \qquad \alpha\cdot\beta \in Z. \tag{55}
\]
The second condition of (55) appears to imply that these vortices are
characterized by the weight vectors of the group $\tilde H = SU(N)$, the dual of
$H = SU(N)/Z_N$ [4]: one vortex for each irreducible representation of
$\tilde H$. Actually, (54) shows that there is just one stable vortex with a given
$Z_N$ charge (N-ality).⁸
An interesting model of this sort is the so-called N = 1∗ theory [57, 58, 59]
defined as the N = 4 supersymmetric theory with the addition of mass terms
for the three adjoint scalar multiplets,
\[
\Delta L = \sum_{i=1}^{3} m_i\,\Phi_i^2\,\big|_{\theta\theta},
\]
which break supersymmetry to N = 1. The general properties of chiral
condensates,
\[
\langle W W\rangle, \quad \langle\Phi_1^2\rangle, \quad \langle\Phi_2^2\rangle, \quad \langle\Phi_3^2\rangle,
\]
in all possible types of vacua (confinement vacua, Coulomb vacua, Higgs
vacua) have been analyzed exactly in a series of papers [60].
This model is based on the underlying N = 4 model, which is believed
to display exact Olive–Montonen duality. In spite of the relative simplicity of
the model, the properties of $Z_N$ monopoles in the Higgs (or partially Higgs)
vacua in the N = 1∗ theory are not very well known, except for the SU(2) [61]
or SU(3) cases.
8
That an excitation in a theory in which all fields are neutral with respect to ZN
is characterized by a fractional ZN charge, may be thought of as an analogue of
a very general behavior of solitons: charge fractionalization.

5.3 Non-Abelian Vortices in a U (N ) Model

The $Z_N$ vortices discussed in the preceding section at first sight appear to
carry a non-Abelian charge, being labeled by the weight vector of a
non-Abelian dual group $\tilde H$: actually, they do not [62]. There is just a
single solution, which can be transformed by Weyl transformations of H.
There are no continuous moduli associated with it.
Truly non-Abelian vortices have been constructed [35, 37] in the context
of a N = 2 supersymmetric U (N ) gauge theory, with Nf ﬂavors, where the
gauge group is broken by the VEVs of a set of scalar ﬁelds in the fundamental
representations. The model Lagrangian has the form

\[
L = \mathrm{Tr}\Big[-\frac{1}{2g^2}F_{\mu\nu}F^{\mu\nu} - \frac{2}{g^2}D_\mu\phi^\dagger D^\mu\phi
  - D_\mu H\,D^\mu H^\dagger - \lambda\,\big(c\,1_N - H H^\dagger\big)^2\Big]
\]
\[
\quad + \mathrm{Tr}\big[(H^\dagger\phi - M H^\dagger)(\phi H - H M)\big] \tag{56}
\]
where $F_{\mu\nu} = \partial_\mu W_\nu - \partial_\nu W_\mu + i\,[W_\mu, W_\nu]$ and
$D_\mu H = (\partial_\mu + i\,W_\mu)H$, and H represents the fields in the
fundamental representation of SU(N), written in a color–flavor $N \times N_f$
matrix form, $(H)^i_\alpha \equiv q^i_\alpha$, and M is an $N_f \times N_f$ mass
matrix. Here, g is the $U(N)_G$ gauge coupling and λ is a scalar coupling. For
\[
\lambda = \frac{g^2}{4} \tag{57}
\]
the system is BPS saturated. For such a choice, (56) can be regarded as a
truncation of the bosonic sector of an N = 2 supersymmetric U(N) gauge
theory, with $(H)^i_\alpha$ representing half of the squark fields,
\[
(H)^i_\alpha \equiv q^i_\alpha, \qquad \tilde q^\alpha_i \equiv 0. \tag{58}
\]
In the supersymmetric context the parameter c is the Fayet–Iliopoulos
parameter. In the following we set c > 0 so that the system is in the Higgs
phase, and so as to allow stable vortex configurations. For generic, unequal
quark masses,
\[
M = \mathrm{diag}(m_1, m_2, \dots, m_{N_f}), \tag{59}
\]
the adjoint scalar VEV takes the form,
\[
\langle\phi\rangle = M = \begin{pmatrix} m_1 & 0 & 0 & 0 \\ 0 & m_2 & 0 & 0 \\ 0 & 0 & \ddots & 0 \\ 0 & 0 & 0 & m_N \end{pmatrix}, \tag{60}
\]
which breaks the gauge group to $U(1)^N$.
In order to have a non-Abelian vortex, it is necessary to choose the masses
equal,
\[
M = \mathrm{diag}(m, m, \dots, m); \tag{61}
\]
the adjoint and squark fields then have the vacuum expectation values (VEVs)
\[
\langle\phi\rangle = m\,1_N, \qquad
\langle H\rangle = \sqrt{c}\begin{pmatrix} 1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{62}
\]
where only the first N flavors are left explicit. The squark VEV breaks the
gauge symmetry completely, while leaving an unbroken $SU(N)_{C+F}$
color–flavor diagonal symmetry (the flavor group acts on H from the right
while the $U(N)_G$ gauge symmetry acts on H from the left). The global
symmetry group associated with the other $N_f - N$ flavors also remains
unbroken. The BPS vortex equations are
\[
(D_1 + iD_2)\,H = 0, \qquad F_{12} + \frac{g^2}{2}\big(c\,1_N - H H^\dagger\big) = 0. \tag{63}
\]
The matter equation can be solved [65, 66, 67] by use of the N × N moduli
matrix $H_0(z)$ whose components are holomorphic functions of the complex
coordinate $z = x^1 + i x^2$,
\[
H = S^{-1}(z,\bar z)\,H_0(z), \qquad W_1 + i\,W_2 = -2i\,S^{-1}(z,\bar z)\,\bar\partial_z S(z,\bar z). \tag{64}
\]
The gauge field equations then take the simple form (“master equation”), in
terms of $\Omega \equiv S S^\dagger$,
\[
\partial_z\big(\Omega^{-1}\partial_{\bar z}\Omega\big) = \frac{g^2}{4}\big(c\,1_N - \Omega^{-1}H_0 H_0^\dagger\big). \tag{65}
\]
The moduli matrix and S are defined up to a redefinition,
\[
H_0(z) \to V(z)\,H_0(z), \qquad S(z,\bar z) \to V(z)\,S(z,\bar z), \tag{66}
\]

where V(z) is any nonsingular N × N matrix which is holomorphic in z. This
class of models has been extensively studied recently [65, 66, 67, 68, 69, 70, 71].
In particular, in the context of these models, considerable attention was
given to the system in which the U(N) gauge symmetry is either explicitly or
dynamically broken to $U(1)^N$, producing Abelian monopoles. The terminology
used and the concepts involved in those studies, though physically distinct,
are often similar to those of the non-Abelian monopoles discussed in this
note, and could be misleading.
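The redundancy (66) of the moduli-matrix description can be checked pointwise: since $H = S^{-1}H_0$, multiplying both S and $H_0$ from the left by the same nonsingular V leaves H unchanged. Below is a minimal sketch for N = 2 at a single point z; the particular matrices are arbitrary sample values chosen for illustration, not solutions of the master equation (65).

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def squark_field(S, H0):
    """H = S^{-1} H0, the first relation in (64), evaluated at one point."""
    return matmul(inverse_2x2(S), H0)

z = 0.3 + 0.4j                       # sample point in the complex plane
S = [[2.0, 1j], [0.5, 1.0]]          # arbitrary nonsingular S(z, zbar)
H0 = [[z, 1.0], [0.0, z]]            # sample moduli matrix, holomorphic in z
V = [[1.0, z], [0.0, 2.0]]           # nonsingular and holomorphic in z

H_before = squark_field(S, H0)
H_after = squark_field(matmul(V, S), matmul(V, H0))
max_diff = max(abs(H_before[i][j] - H_after[i][j])
               for i in range(2) for j in range(2))
print(max_diff)                      # zero up to rounding errors
```

The same cancellation of V holds for any N; the 2 × 2 case is used here only to keep the explicit inverse short.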

5.4 Dynamical Abelianization

As should be clear from what we said so far, it is crucial that the color–flavor
diagonal symmetry SU(N) remain exactly conserved for the emergence of
the non-Abelian dual gauge group (see the next section). Consider, instead,
the cases in which the gauge U(N) (or SU(N) × U(1)) symmetry is broken to
the Abelian subgroup $U(1)^N$, either by small quark mass differences (60) or
dynamically, as in the N = 2 models with $N_f < 2N$ [36, 69]. From the
breaking of various SU(2) subgroups to U(1) there appear light ’t Hooft–
Polyakov monopoles of mass $O(\Delta m/g)$ (in the case of an explicit breaking)
or O(Λ) (in the case of dynamical breaking). As the $U(1)^N$ gauge group is
further broken by the squark VEVs, the system develops ANO vortices. The
light magnetic monopoles, carrying magnetic charges of two different U(1)
factors, look confined by the two vortices (Fig. 6). These cases have been
discussed extensively [67, 68, 69, 70], within the context of the U(N) model of
Sect. 5.3.
The dynamics of the fluctuation of the orientational modes along the vortex
turns out to be described by a two-dimensional $CP^{N-1}$ model [35, 37].
It has been shown [35, 36, 69, 70] that the kinks of the two-dimensional
sigma model precisely correspond to these light monopoles, to be expected
in the underlying 4D gauge theory. In particular, it was noted that there is
an elegant matching between the dynamics of the two-dimensional sigma model
(describing the dynamics of the vortex orientational modes in the Higgs phase
of the 4D theory) and the dynamics of the 4D gauge theory in the Coulomb
phase, including the precise matching of the coupling constant renormalization
[36, 68, 69].
Note that these cases are analogous to what would occur in QCD if the
color SU(3) symmetry were to dynamically break itself to $U(1)^2$. Confinement
would be described in this case by the condensation of magnetic monopoles
carrying the Abelian charges $Q_1$ or $Q_2$, and the resulting ANO vortices
would be of two types, 1 and 2, carrying the related fluxes.

Fig. 6. Monopoles in U(N) systems with abelianization are confined by two
Abelian vortices

6 The Model
Actually the model we need here is not exactly the model of Sect. 5.3, but
is a model which contains it as a low-energy approximation. It is the same
model already discussed in Sect. 4.2, but now we analyze it in the region
$m_i \gg \mu \gg \Lambda$, so that the semiclassical reasoning of Sect. 3 makes
sense. For concreteness, we take as our model the standard N = 2 SQCD with
$N_f$ quark hypermultiplets, with a larger gauge symmetry, e.g., SU(N + 1),
which is broken at a much larger mass scale ($v_1 \sim |m_i|$) as
\[
SU(N+1) \ \xrightarrow{\,v_1 \ne 0\,}\ \frac{SU(N)\times U(1)}{Z_N}. \tag{67}
\]
The unbroken gauge symmetry is completely broken at a lower mass scale,
$v_2 \sim |\sqrt{\mu m}|$, as in (78) below.
Clearly, one can attempt a similar embedding of the model (56) in a larger
gauge group broken at some higher mass scale, in the context of a
nonsupersymmetric model, even though in such a case the potential must be
judiciously chosen and the dynamical stability of the scenario would have to
be carefully monitored. Here we choose to study the softly broken N = 2 SQCD
for concreteness, and above all because the dynamical properties of this model
are well understood: this will provide us with a nontrivial check of our results.
Another motivation is purely one of convenience: it gives a definite potential
with the desired properties.⁹
We are hereby back to our argument on the duality and non-Abelian
monopoles, defined through the better-understood non-Abelian vortices
presented in general terms in Sect. 2.2, but now in the context of a concrete
model, where the fully quantum-mechanical answer is known.
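The underlying theory is thus
\[
L = \frac{1}{8\pi}\,\mathrm{Im}\,S_{cl}\Big[\int d^4\theta\,\Phi^\dagger e^{V}\Phi + \int d^2\theta\,\frac12\,W W\Big]
  + L^{(\rm quarks)} + \int d^2\theta\,\mu\,\mathrm{Tr}\,\Phi^2 + \mathrm{h.c.}; \tag{68}
\]
\[
L^{(\rm quarks)} = \sum_i\Big[\int d^4\theta\,\{Q_i^\dagger e^{V} Q_i + \tilde Q_i e^{-V}\tilde Q_i^\dagger\}
  + \int d^2\theta\,\{\sqrt2\,\tilde Q_i\Phi Q^i + m_i\tilde Q_i Q^i\} + \mathrm{h.c.}\Big] \tag{69}
\]
where $m_i$ are the bare masses of the quarks and we have defined the complex
coupling constant
\[
S_{cl} \equiv \frac{\theta_0}{\pi} + \frac{8\pi i}{g_0^2}. \tag{70}
\]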
9
Recent developments [32, 77] allow us actually to consider systems of this sort
within a much wider class of N = 1 supersymmetric models, whose infrared
properties are very much under control.

We also added the parameter μ, the mass of the adjoint chiral multiplet, which
breaks the supersymmetry softly to N = 1. The bosonic sector of this model
is described, after elimination of the auxiliary fields, by
\[
L = \frac{1}{4g^2}F_{\mu\nu}^2 + \frac{1}{g^2}|D_\mu\Phi|^2 + |D_\mu Q|^2 + |D_\mu\bar{\tilde Q}|^2 - V_1 - V_2, \tag{71}
\]
where
\[
V_1 = \frac18\sum_A\Big(t^A_{ij}\Big[\frac{1}{g^2}(-2)\,[\Phi^\dagger,\Phi]_{ji} + Q_j^\dagger Q_i - \tilde Q_j\tilde Q_i^\dagger\Big]\Big)^2; \tag{72}
\]
\[
V_2 = g^2\,\big|\mu\,\Phi^A + \sqrt2\,\tilde Q\,t^A Q\big|^2
    + \tilde Q\,[m + \sqrt2\,\Phi]\,[m + \sqrt2\,\Phi]^\dagger\,\tilde Q^\dagger
    + Q^\dagger\,[m + \sqrt2\,\Phi]^\dagger\,[m + \sqrt2\,\Phi]\,Q. \tag{73}
\]
In the construction of the approximate monopole and vortex solutions, we
shall consider only the VEVs and fluctuations around them which satisfy
\[
[\Phi^\dagger, \Phi] = 0, \qquad Q_i = \tilde Q_i^\dagger, \tag{74}
\]
and hence the D-term potential $V_1$ can be set identically to zero throughout.
In order to keep the hierarchy of the gauge symmetry breaking scales, (24),
we choose the masses such that
\[
m_1 = \dots = m_{N_f} = m, \tag{75}
\]
\[
m \gg \mu \gg \Lambda. \tag{76}
\]
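Although the theory described by the above Lagrangian has many degenerate
vacua, we are interested in the vacuum where (see [30] for the detail)
\[
\langle\Phi\rangle = -\frac{1}{\sqrt2}\begin{pmatrix} m & 0 & 0 & 0 \\ 0 & \ddots & \vdots & \vdots \\ 0 & \dots & m & 0 \\ 0 & \dots & 0 & -N m \end{pmatrix}; \tag{77}
\]
\[
Q = \tilde Q^\dagger = \begin{pmatrix} d & 0 & 0 & 0 & \dots \\ 0 & \ddots & 0 & \vdots & \dots \\ 0 & 0 & d & 0 & \dots \\ 0 & \dots & 0 & 0 & \dots \end{pmatrix}, \qquad d = \sqrt{(N+1)\,\mu\,m}. \tag{78}
\]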
This is a particular case of the so-called r vacuum, with r = N . Although
such a vacuum certainly exists classically, the existence of the quantum r = N
vacuum in this theory requires Nf ≥ 2 N , which we shall assume.10
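That (77) and (78) indeed minimize the potential (73) can be verified numerically. The sketch below is illustrative, under two assumptions that are not spelled out in the text: the generators are normalized as $\mathrm{Tr}(t^A t^B) = \delta^{AB}/2$ with $\Phi = \Phi^A t^A$, so that $\sum_A t^A(\mu\Phi^A + \sqrt2\,\tilde Q t^A Q)$ equals $\mu\Phi$ plus $1/\sqrt2$ times the traceless part of $Q\tilde Q$, and the square root in $d = \sqrt{(N+1)\mu m}$ is taken as written above. The function names and numerical values are invented for the example.

```python
import math

def vacuum(N, n_f, m, mu):
    """Diagonal of <Phi> from (77) and the color x flavor matrix Q from (78)."""
    d = math.sqrt((N + 1) * mu * m)
    phi_diag = [-m / math.sqrt(2)] * N + [N * m / math.sqrt(2)]
    Q = [[d if (c == f and c < N) else 0.0 for f in range(n_f)]
         for c in range(N + 1)]
    return phi_diag, Q

def quark_f_term(phi_diag, Q, m):
    """(m + sqrt(2) Phi) Q, which must vanish for the last terms of (73)."""
    return [[(m + math.sqrt(2) * phi_diag[c]) * Q[c][f]
             for f in range(len(Q[0]))] for c in range(len(Q))]

def adjoint_f_term(phi_diag, Q, mu):
    """mu Phi + (Q Q~ - trace part) / sqrt(2), with Q~ = Q^dagger (real here)."""
    n = len(phi_diag)
    qq = [[sum(Q[c][f] * Q[c2][f] for f in range(len(Q[0])))
           for c2 in range(n)] for c in range(n)]
    trace = sum(qq[c][c] for c in range(n))
    return [[mu * phi_diag[c] * (c == c2)
             + (qq[c][c2] - trace / n * (c == c2)) / math.sqrt(2)
             for c2 in range(n)] for c in range(n)]

phi_diag, Q = vacuum(N=3, n_f=6, m=5.0, mu=0.2)
largest = max(abs(x) for row in quark_f_term(phi_diag, Q, 5.0)
              + adjoint_f_term(phi_diag, Q, 0.2) for x in row)
print(sum(phi_diag), largest)        # both vanish up to rounding errors
```

The first printed number checks that $\langle\Phi\rangle$ is traceless, as it must be for SU(N + 1); the second shows that both F-term conditions are satisfied only because d takes the value quoted in (78).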
10
This might appear to be a rather tight condition as the original theory loses
asymptotic freedom for Nf ≥ 2 N + 2. This is not so. An analogous discussion
can be made by considering the breaking SU (N ) → SU (r) × U (1)N −r . In this
case the condition for the quantum non-Abelian vacuum is 2 N > Nf ≥ 2 r, which
is a much looser condition.
To start with, ignore the smaller squark VEV, (78). As
$\pi_2(G/H) \sim \pi_1(H) = \pi_1(U(1)) = Z$, the symmetry breaking (77) gives
rise to regular magnetic monopoles with mass of order $O(v_1/g)$, whose
continuous transformation property is our main concern here.
The semiclassical formulas for their mass and fluxes [6, 52] are summarized
in Appendix 9.

6.1 Low-energy Approximation and Vortices

At scales much lower than $v_1 = m$ but still neglecting the smaller squark VEV
$v_2 = d = \sqrt{(N+1)\, \mu\, m} \ll v_1$, the theory reduces to an $SU(N) \times U(1)$ gauge
theory with Nf light quarks qi , q̃ i (the first N components of the original
quark multiplets Qi , Q̃i ). By integrating out the massive fields, the effective
Lagrangian valid between the two mass scales has the form,
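$$\mathcal{L} = \frac{1}{4g_N^2} (F^a_{\mu\nu})^2 + \frac{1}{4g_1^2} (F^0_{\mu\nu})^2 + \frac{1}{g_N^2} |D_\mu \phi^a|^2 + \frac{1}{g_1^2} |D_\mu \phi^0|^2 + |D_\mu q|^2 + |D_\mu \bar{\tilde{q}}|^2$$
$$\qquad - g_1^2 \left| -\mu\, m \sqrt{N(N+1)} + \frac{\tilde{q}\, q}{\sqrt{N(N+1)}} \right|^2 - g_N^2\, |\sqrt{2}\, \tilde{q}\, t^a q|^2 + \ldots \tag{79}$$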

where $a = 1, 2, \ldots, N^2 - 1$ labels the $SU(N)$ generators $t^a$; the index 0 refers
to the $U(1)$ generator $t^0 = \frac{1}{\sqrt{2N(N+1)}}\, \mathrm{diag}(1, \ldots, 1, -N)$. We have taken into
account the fact that the $SU(N)$ and $U(1)$ coupling constants ($g_N$ and $g_1$)
get renormalized differently towards the infrared.
The adjoint scalars are fixed to their VEV, (77), with small fluctuations
around it,
$$\Phi = \langle \Phi \rangle\, (1 + \langle \Phi \rangle^{-1}\, \tilde{\Phi}), \qquad |\tilde{\Phi}| \ll m. \tag{80}$$
In the consideration of the vortices of the low-energy theory, they will be in
fact replaced by the constant VEV. The presence of the small terms (80),
however, makes the low-energy vortices not strictly BPS (and this will be
important in the consideration of their stability below).11
The quark fields are replaced, consistently with (74), as
$$\tilde{q} \equiv q^\dagger, \qquad q \to \frac{1}{\sqrt{2}}\, q, \tag{81}$$
where the second replacement brings back the kinetic term to the standard
form.

11
In the terminology used in Davis et al. [63] in the discussion of the Abelian
vortices in supersymmetric models, our model corresponds to an F model while
the models of [68, 69, 66] correspond to a D model. In the approximation of
replacing Φ with a constant, the two models are equivalent: they are related by
an SUR (2) transformation [64, 78].
The Magnetic Monopoles Seventy-ﬁve Years Later 503

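We further replace the singlet coupling constant and the $U(1)$ gauge field as

$$e \equiv \frac{g_1}{\sqrt{2N(N+1)}}; \qquad \tilde{A}_\mu \equiv \frac{A_\mu}{\sqrt{2N(N+1)}}, \qquad \tilde{\phi}^0 \equiv \frac{\phi^0}{\sqrt{2N(N+1)}}. \tag{82}$$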

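The net effect is

$$\mathcal{L} = \frac{1}{4g_N^2} (F^a_{\mu\nu})^2 + \frac{1}{4e^2} (\tilde{F}_{\mu\nu})^2 + |D_\mu q|^2 - \frac{e^2}{2}\, |\, q^\dagger q - c\, \mathbf{1}\, |^2 - \frac{1}{2}\, g_N^2\, |\, q^\dagger t^a q\, |^2, \tag{83}$$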

c = N (N + 1) 2 μ m. (84)
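
As a consistency check on the coefficients appearing in (83), the substitutions (81) and (82) can be followed term by term. The short derivation below is only a sketch: it assumes that the $U(1)$ and $SU(N)$ potential terms of (79) have the forms quoted in the comment lines, and it does not attempt to fix the normalization of the constant $c$ beyond what (84) states.

```latex
% Assumed input, read off from (79):
%   U(1) term:   g_1^2 | -\mu m \sqrt{N(N+1)} + \tilde q q / \sqrt{N(N+1)} |^2
%   SU(N) term:  g_N^2 | \sqrt{2}\, \tilde q\, t^a q |^2
% Substitutions: \tilde q = q^\dagger,  q \to q/\sqrt{2},  g_1 = \sqrt{2N(N+1)}\, e.
\begin{align*}
\frac{1}{4 g_1^2}\,(F^0_{\mu\nu})^2
  &= \frac{1}{4 e^2}\,(\tilde F_{\mu\nu})^2 ,
  \qquad F^0_{\mu\nu} = \sqrt{2N(N+1)}\;\tilde F_{\mu\nu}, \\[4pt]
g_N^2\, \bigl| \sqrt{2}\, \tilde q\, t^a q \bigr|^2
  &\to g_N^2\, \Bigl| \tfrac{1}{\sqrt 2}\, q^\dagger t^a q \Bigr|^2
   = \frac{g_N^2}{2}\, \bigl| q^\dagger t^a q \bigr|^2 , \\[4pt]
g_1^2\, \Bigl| \frac{\tilde q\, q}{\sqrt{N(N+1)}} - \mu m \sqrt{N(N+1)} \Bigr|^2
  &\to \frac{g_1^2}{4N(N+1)}\, \bigl| q^\dagger q - 2N(N+1)\,\mu m \bigr|^2
   = \frac{e^2}{2}\, \bigl| q^\dagger q - 2N(N+1)\,\mu m \bigr|^2 .
\end{align*}
```

The prefactors $e^2/2$ and $g_N^2/2$ agree with (83); the constant in the last line is the quantity to be compared with $c$ in (84).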
Neglecting the small terms left implicit, this is identical to the $U(N)$ model
(56), except for the fact that $e \neq g_N$ here. The transformation property of the
vortices can be determined from the moduli matrix, as was done in [76]. In-
deed, the system possesses BPS saturated vortices described by the linearized
equations
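$$(D_1 + i D_2)\, q = 0, \tag{85}$$
$$F_{12}^{(0)} + \frac{e^2}{2} \left( c\, \mathbf{1}_N - q\, q^\dagger \right) = 0; \qquad F_{12}^{(a)} + \frac{g_N^2}{2}\, q_i^\dagger t^a q_i = 0. \tag{86}$$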
The matter equation can be solved exactly as in [65, 66, 67] ($z = x^1 + i x^2$)
by setting

$$q = S^{-1}(z, \bar{z})\, H_0(z), \qquad A_1 + i A_2 = -2 i\, S^{-1}(z, \bar{z})\, \bar{\partial}_z S(z, \bar{z}), \tag{87}$$

where $S$ is an $N \times N$ matrix, invertible over the whole of the $z$ plane, and $H_0$ is
the moduli matrix, holomorphic in $z$.
The gauge ﬁeld equations take a slightly more complicated form than in
the U (N ) model (56):
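$$\partial_z (\Omega^{-1} \partial_{\bar{z}} \Omega) = -\frac{g_N^2}{2}\, \mathrm{Tr}\, ( t^a\, \Omega^{-1} q\, q^\dagger )\, t^a - \frac{e^2}{4N}\, \mathrm{Tr}\, ( \Omega^{-1} q\, q^\dagger - \mathbf{1} ), \qquad \Omega = S S^\dagger. \tag{88}$$
The last equation reduces to the master equation (65) in the $U(N)$ limit,
$g_N = e$.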
The advantage of the moduli matrix formalism is that all the moduli pa-
rameters appear in the holomorphic, moduli matrix H0 (z). Especially, the
transformation property of the vortices under the color–ﬂavor diagonal group
can be studied by studying the behavior of the moduli matrix.

6.2 Dual Gauge Transformation from the Vortex Moduli

The concepts such as the low-energy BPS vortices or the high-energy BPS
monopole solutions are thus only approximate: their explicit forms are valid
only in the lowest-order approximation, in the respective kinematical regions.
Nevertheless, there is a property of the system which is exact and does not depend
on any approximation: the full system has an exact, global $SU(N)_{C+F}$
symmetry, which is broken neither by the interactions nor by either set of
VEVs, $v_1$ and $v_2$. This symmetry is broken by an individual soliton vortex, endowing
the latter with non-Abelian orientational moduli, analogous to the
translational zero modes of a kink. Note that the vortex breaks the color–flavor
symmetry as

$$SU(N)_{C+F} \to SU(N-1) \times U(1), \tag{89}$$

leading to the moduli space of the minimum vortices, which is

$$\mathcal{M} \simeq CP^{N-1} = \frac{SU(N)}{SU(N-1) \times U(1)}. \tag{90}$$

The fact that this moduli space coincides with the space of quantum states
of an $N$-state quantum-mechanical system is a first hint that the monopoles
appearing at the end point of a vortex transform as a fundamental multiplet
N of a group $SU(N)$.
The moduli space of the vortices is described by the moduli matrix (we
consider here the vortices of minimal winding, k = 1)
$$H_0(z) \simeq \begin{pmatrix} 1 & 0 & 0 & -a_1 \\ 0 & \ddots & 0 & \vdots \\ 0 & 0 & 1 & -a_{N-1} \\ 0 & \ldots & 0 & z \end{pmatrix}, \tag{91}$$

where the constants $a_i$, $i = 1, 2, \ldots, N-1$, are the coordinates of $CP^{N-1}$.

Under an $SU(N)_{C+F}$ transformation, the squark fields transform as

$$q \to U^{-1}\, q\, U, \tag{92}$$

but as the moduli matrix is defined modulo holomorphic redefinition (66), it
is sufficient to consider
$$H_0(z) \to H_0(z)\, U. \tag{93}$$
Now, for an infinitesimal $SU(N)$ transformation acting on a matrix of the
form (91), $U$ can be taken in the form

$$U = \mathbf{1} + X, \qquad X = \begin{pmatrix} 0 & \xi \\ -\xi^\dagger & 0 \end{pmatrix}, \tag{94}$$

where $\xi$ is a small $(N-1)$-component constant vector. Computing $H_0 X$ and
making a $V$ transformation from the left to bring $H_0$ back to the original
form, we find
$$\delta a_i = -\xi_i - a_i\, (\xi^\dagger \cdot a), \tag{95}$$
which shows that the $a_i$'s indeed transform as the inhomogeneous coordinates of
$CP^{N-1}$. In other words, the vortex represented by the moduli matrix (91)
transforms as a fundamental multiplet of $SU(N)$.12
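
The first-order computation leading to (95) can be checked numerically. The sketch below uses plain Python complex arithmetic; the compensating matrix `v` (with $V = 1 + v$, linear in $z$ and hence holomorphic) is our own explicit choice made for illustration and is not quoted from the text, and the function names are hypothetical.

```python
# First-order check of (95): H0 X + v H0 should equal the variation of H0
# induced by a_i -> a_i + delta_a_i, with delta_a_i = -xi_i - a_i (xi^dagger . a).
# Assumption: v below is one explicit holomorphic choice of the V transformation.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def first_order_residual(a, xi, z):
    n = len(a)                       # n = N - 1 moduli parameters
    N = n + 1
    # Moduli matrix (91): identity block, last column -a, corner entry z
    H0 = [[(1 if i == j else 0) + 0j for j in range(N)] for i in range(N)]
    for i in range(n):
        H0[i][n] = -a[i]
    H0[n][n] = z
    # Generator X of (94)
    X = [[0j] * N for _ in range(N)]
    for i in range(n):
        X[i][n] = xi[i]
        X[n][i] = -xi[i].conjugate()
    xi_dag_a = sum(xi[i].conjugate() * a[i] for i in range(n))
    # Compensating transformation v (entries at most linear in z)
    v = [[0j] * N for _ in range(N)]
    for i in range(n):
        for j in range(n):
            v[i][j] = -a[i] * xi[j].conjugate()
        v[n][i] = z * xi[i].conjugate()
    v[n][n] = xi_dag_a
    # Expected variation of H0 from (95): only the last column changes
    dH0 = [[0j] * N for _ in range(N)]
    for i in range(n):
        dH0[i][n] = -(-xi[i] - a[i] * xi_dag_a)
    HX, vH = matmul(H0, X), matmul(v, H0)
    return max(abs(HX[i][j] + vH[i][j] - dH0[i][j])
               for i in range(N) for j in range(N))

# Example for N = 3 with arbitrary illustrative numbers
residual = first_order_residual([0.3 + 0.2j, -1.1 + 0.5j],
                                [0.01 - 0.02j, 0.03 + 0.01j], 0.7 - 0.4j)
```

A residual at the level of floating-point rounding indicates that, with this choice of `v`, the form (91) is restored with the shifted parameters of (95).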
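As an illustration consider the simplest case of $SU(2)$ theory. In this case,
the moduli matrix is simply [72]

$$H_0^{(1,0)} \simeq \begin{pmatrix} z - z_0 & 0 \\ -b_0 & 1 \end{pmatrix}; \qquad H_0^{(0,1)} \simeq \begin{pmatrix} 1 & -a_0 \\ 0 & z - z_0 \end{pmatrix}, \tag{96}$$

with the transition function between the two patches:

$$b_0 = \frac{1}{a_0}. \tag{97}$$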
The points on this $CP^1$ represent all possible $k = 1$ vortices. Note that points
on the space of a quantum-mechanical two-state system,

$$|\Psi\rangle = a_1 |\psi_1\rangle + a_2 |\psi_2\rangle, \qquad (a_1, a_2) \sim \lambda\, (a_1, a_2), \quad \lambda \in \mathbb{C}, \tag{98}$$

can be put in one-to-one correspondence with the inhomogeneous coordinate
of a $CP^1$,
$$a_0 = \frac{a_1}{a_2}, \qquad b_0 = \frac{a_2}{a_1}. \tag{99}$$
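In order to make this correspondence manifest, note that the minimal vortex
(96) transforms under the $SU(2)_{C+F}$ transformation as

$$H_0 \to V H_0 U^\dagger, \qquad U = \begin{pmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{pmatrix}, \qquad |\alpha|^2 + |\beta|^2 = 1, \tag{100}$$

where the factor $U^\dagger$ from the right represents a flavor transformation, and $V$ is a
holomorphic matrix which brings $H_0$ to the original triangular form [76]. The
action of this transformation on the moduli parameter, for instance, $a_0$, can
be found to be
$$a_0 \to \frac{\alpha\, a_0 + \beta}{\alpha^* - \beta^* a_0}. \tag{101}$$
But this is precisely the way a doublet state (98) transforms under $SU(2)$,

$$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \to \begin{pmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}. \tag{102}$$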
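
The statement that the fractional linear map (101) is what the doublet rotation (102) induces on the ratio $a_0 = a_1/a_2$ of (99) can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names and the sample numbers are illustrative choices, not part of the text.

```python
import cmath

def su2_matrix(alpha, beta):
    """U of (100)/(102); the caller must supply |alpha|^2 + |beta|^2 = 1."""
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]

def rotate_doublet(U, a1, a2):
    """Linear action (102) on the homogeneous coordinates (a1, a2)."""
    return U[0][0] * a1 + U[0][1] * a2, U[1][0] * a1 + U[1][1] * a2

def mobius(alpha, beta, a0):
    """Fractional linear action (101) on the inhomogeneous coordinate a0."""
    return (alpha * a0 + beta) / (alpha.conjugate() - beta.conjugate() * a0)

# Arbitrary sample point with |alpha|^2 + |beta|^2 = 0.36 + 0.64 = 1
alpha, beta = 0.6 * cmath.exp(0.3j), 0.8 * cmath.exp(-1.1j)
a1, a2 = 1.2 - 0.7j, -0.4 + 0.9j
b1, b2 = rotate_doublet(su2_matrix(alpha, beta), a1, a2)
# Compare the ratio of the rotated doublet with the Moebius image of a1/a2
mismatch = abs(b1 / b2 - mobius(alpha, beta, a1 / a2))
```

The ratio is insensitive to the overall rescaling $\lambda$ in (98), which is why only the inhomogeneous coordinate appears in (101).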

The fact that the vortices (seen as solitons of the low-energy approximation)
transform as in the N representation of $SU(N)_{C+F}$ implies that there
exists a set of monopoles which transform accordingly, as N. The existence of
such a set follows from the exact $SU(N)_{C+F}$ symmetry of the theory, broken
by the individual monopole–vortex configuration.
12
Note that, if an $N$-vector $c$ transforms as $c \to (1 + X)\, c$, the inhomogeneous
coordinates $a_i = c_i / c_N$ transform as in (95).
This answers some of the questions formulated earlier (below (22)) unam-
biguously [76]. Note that in our derivation of continuous transformations of
the monopoles, the explicit, semiclassical form of the latter is not used.
A subtle point is that in the high-energy approximation, and to lowest
order of such an approximation, the semiclassical monopoles are just certain
nontrivial field configurations involving $\phi(x)$ and $A_i(x)$ fields only, and therefore
apparently transform under the color part of $SU(N)_{C+F}$ only. When the
full monopole–vortex configuration $\phi(x)$, $A_i(x)$, $q(x)$ (Fig. 2) is considered,
however, only the combined color–flavor diagonal transformations keep the
energy of the configuration invariant. In other words, the monopole transformations
must be regarded as part of more complicated transformations
involving flavor, when higher-order effects in $O(v_2/v_1)$ are taken into account.
And this means that the transformations are among physically distinct states,
as the vortex moduli obviously describe physically distinct vortices [37].
This discussion highlights the crucial role played by the (massless) flavors
in the underlying theory as has been already summarized at the end of Sect. 2.
There is, however, another important independent effect due to the massless
flavors. Due to the zero modes of the fermions, the semiclassical monopoles are
converted to some irreducible multiplets in the flavor group $SU(N_f)$ [46]. The
“clouds” of the fermion zero-mode fluctuation fields surrounding the monopole
have an extension of $O(1/v_1)$, which is much smaller than the distance scales
associated with the infrared effects discussed here. We conclude that flavor
plays one more crucial role for non-Abelian monopoles: it allows the dual
magnetic gauge group to be generated on the one hand, and it “dresses”
the monopoles and endows them with global flavor quantum numbers à la
Jackiw–Rebbi on the other. These should be regarded as two distinct effects.
Our construction has been generalized to the symmetry breaking $SO(2N+1) \to U(N) \to \emptyset$,
$SO(2N+1) \to U(r) \times U(1)^{N-r} \to \emptyset$, in the concrete context
of softly broken N = 2 models. There is an interesting difference in the quantum
fate of the semiclassical monopoles in the case where the unbroken SU factor
has the maximum rank and in the cases where $r \le N - 1$. The semiclassical
(vortex–monopole complex) argument of Sect. 3 and of this section and the
fully quantum-mechanical results (of Sects. 4.2 and 4.3) agree qualitatively,
quite nontrivially [76].
The fact that the vortices of the low-energy theory are BPS saturated,
which allows us to analyze their moduli and transformation properties elegantly
as discussed above, while in the full theory there are corrections which
make them non-BPS (and unstable), might cause some concern. Actually, the
rigor of our argument is not affected by those terms, which can be treated as
a perturbation. Attributes characterized by integers, such as the transformation
property of certain configurations as a multiplet of a non-Abelian group
which is an exact symmetry group of the full theory, cannot receive renormalization.
This is similar to the current algebra relations of Gell-Mann, which
are not renormalized. CVC of Feynman and Gell-Mann also hinges upon an
analogous situation.13 The results obtained in the BPS limit (in the limit
$v_2/v_1 \to 0$) are thus valid at any finite values of $v_2/v_1$ [79]. Thus
The dual group $\tilde{H}$ is the transformation group $H_{C+F}$, seen in the dual
magnetic description.

6.3 Other Symmetry Breaking Patterns

The cases such as $SO(2N+3) \to SO(2N+1) \times U(1)$ or $USp(2N+2) \to USp(2N) \times U(1)$
are particularly interesting, as the groups $SO(2N+1)$ and
$USp(2N)$ are interchanged by the GNOW duality. In the first case, for instance,
the GNOW conjecture states that the monopoles belong to multiplets
of the dual group $USp(2N)$. Although there are some hints how such GNOW
dual monopoles might emerge naturally in the semiclassical approximations
[80], there is a strong argument (based on N = 2 supersymmetry and global
symmetry [30, 53]) as well as clear evidence [30], against the appearance of
these GNOW monopoles as the light degrees of freedom. In other words, even
if they might emerge in a semiclassical approximation, they do not survive
quantum effects.
It is perhaps not a coincidence that the Seiberg duals of N = 1 supersymmetric
theories do not always coincide with GNOW duals.
The systems $USp(2N) \to U(r) \times U(1)^{N-r} \to \emptyset$ are also known to possess
light non-Abelian monopoles in the fundamental representation of the dual
group $SU(r)$ [30], which can be nicely understood by our definition of the
dual group.

7 Conﬁnement Near Conformal Vacua

A particular class of confining vacua, in which confinement and dynamical
symmetry breaking are described by strongly interacting non-Abelian magnetic
monopoles, is of great interest. The vacua we are talking about are known
as non-Abelian Argyres–Douglas vacua. These are found as a particular case
of r vacua, with $r = N_f/2$, of $SU(N)$ SQCD, as well as in the massless limit
($m_i \to 0$) of all of the confining vacua of $SO(N)$ and $USp(2N)$ theories. Many
other examples of vacua with analogous properties can be found in the context
of a wider class of N = 1 supersymmetric gauge theories [32].
Although the details (the global symmetry, the light degrees of freedom)
depend on the model, there is a common feature in this class of systems which
makes them particularly interesting. For dynamical reasons and because of symmetry
requirements, the system chooses to produce non-Abelian (rather than Abelian)
magnetic monopoles as the low-energy degrees of freedom, but cannot produce
quite enough of them to make the effective theory infrared-free.
13
The absence of “colored dyons” [11] mentioned earlier can also be interpreted in
this manner.
As a consequence, conﬁnement is caused by the condensation of certain

monopole composites rather than by the condensation of single monopoles
[53]. As non-Abelian monopoles carry ﬂavor quantum numbers of the orig-
inal quarks (this is necessary for the low-energy theory to have the correct
symmetry of the underlying theory), the pattern of the symmetry breaking
reﬂects such a mechanism. These considerations have been distilled from stud-
ies on this class of systems and on the problem of understanding non-Abelian
monopoles discussed in various parts of this lecture.

8 Quantum Chromodynamics
What does all this teach us about QCD? That the Abelian superconductor picture
is probably not the correct picture of real-world QCD ($SU(3)$) has
already been pointed out. In particular, the fact that the deconfinement and chiral
restoration transitions occur at exactly the same temperature in $SU(3)$ lattice
measurements appears to make the assumption that Abelian $U(1)^2$ monopoles
are responsible for confinement and chiral symmetry breaking rather awkward
(the remark (v) of Sect. 4.3). On the other hand, in ordinary (non-supersymmetric)
gauge theories, the “sign flip” of the beta function needed to
make the non-Abelian monopoles recognizable infrared (or intermediate-scale)
degrees of freedom, is much more difficult to achieve. If the dual “magnetic”
group were again SU (3), the magnetic monopoles of such a theory (regular-
ized Z3 monopoles?) would probably interact too strongly and would form
composite monopoles (cf. the point (iii) of Sect. 4.3). A small number of light
flavors, dressing these monopoles with flavor quantum numbers, would not be
sufficient.
We might speculate that the dynamics of QCD lies somewhere between.
The dual theory could be an

SU (2) × U (1) or U (2) (103)

theory, with magnetic monopoles in 2 of the $SU(2)$ group and moreover we
expect them to carry flavor $SU_L(2) \times SU_R(2)$ quantum numbers. We expect
them to interact strongly, but not too much, and it is possible that the system
is close to a nontrivial infrared fixed point, with relatively nonlocal dyons
present at the same time, as in the SCFT effective low-energy theories of the
supersymmetric models discussed in the previous subsection.
Let us assume that they are $M_a^i$, $\tilde{M}_j^b$, with the (dual) color $a, b$ and flavor
indices $i, j$, and carrying opposite $U(1)$ charges. A condensate of the form

$$\langle M_a^i\, \tilde{M}_j^a \rangle \sim \Lambda^2\, \delta^i_j \tag{104}$$

might form, inducing confinement and chiral symmetry breaking $SU_L(2) \times SU_R(2) \to SU_V(2)$
simultaneously. It could be that the standard quark condensate
$$\langle \psi_L^i\, \bar{\psi}_{R\, j} \rangle \sim \Lambda^3\, \delta^i_j \tag{105}$$
is closely related dynamically to or induced by the monopole condensation,
(104), for instance, via the Rubakov effect [81].
It is interesting that in such a picture, there should be a considerable
difference between a theory with quarks in the fundamental representation and
an (unrealistic) theory with quarks in the adjoint representation. The Jackiw–Rebbi
effect works differently in the two cases. In the former case the fermion
zero modes give rise to a bosonic multiplet of degenerate monopoles, while in the
latter case some of the monopoles become fermions. In the theory with adjoint
quarks, then, there can be a considerable difference between the phenomenon of
confinement and that of chiral symmetry breaking. There is ample evidence
for such a difference (e.g., different transition temperatures) in lattice gauge
theory, as is well known.

9 Conclusive Remarks
Non-Abelian monopoles are present in the fully quantum–mechanical low-
energy effective action of many solvable supersymmetric theories. They behave
perfectly as point-like particles carrying non-Abelian dual magnetic charges.
They play a crucial role in confinement and in dynamical symmetry breaking
in these theories. There is a natural identification of these excitations within
the semiclassical approach, which involves the flavor symmetry in an essential
manner. It is hoped that such an improved grasp on the nature of non-Abelian
monopoles would one day lead to a better understanding of confinement in
QCD.

Acknowledgments
It is a great pleasure for me to present these notes in honor of the 65th
birthday of my friend Gabriele Veneziano. With his deep understanding of
physics, brilliant intuition, elegance of his logics, and inexhaustible fantasy,
as well as with his exemplary human quality, he has been a guide to many
of us contemporary and younger generations of theoretical physicists for so
many years. It is not easy to emulate such a high standard, but I present
these lecture notes, with the best of my eﬀorts and with a deep sense of
gratitude to Gabriele. Finally, I wish to thank many friends and collaborators
who contributed at various stages of this investigation.

Appendix A—Semiclassical “Non-Abelian” Monopoles

In this appendix we review some general formulae [6, 4]. These degenerate
monopoles appear in a system with the gauge symmetry breaking

$$G \xrightarrow{\ \langle \phi \rangle \neq 0\ } H \tag{A.1}$$

with a nontrivial $\pi_2(G/H)$ and non-Abelian $H$.

The normalization of the generators can be chosen [4] so that the metric
of the root vector space is14

$$g_{ij} = \sum_{\text{roots}} \alpha_i\, \alpha_j = \delta_{ij}. \tag{A.4}$$

The Higgs ﬁeld vacuum expectation value (VEV) is taken to be of the form

φ0 = h · H, (A.5)

where h = (h1 , . . . , hrank(G) ) is a constant vector representing the VEV. The

root vectors orthogonal to h belong to the unbroken subgroup H.
The monopole solutions are constructed from various SU (2) subgroups of
G that do not commute with H,
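$$S_1 = \frac{1}{\sqrt{2\alpha^2}}\, (E_\alpha + E_{-\alpha}); \qquad S_2 = -\frac{i}{\sqrt{2\alpha^2}}\, (E_\alpha - E_{-\alpha}); \qquad S_3 = \alpha^* \cdot H, \tag{A.6}$$
where $\alpha$ is a root vector associated with a pair of broken generators $E_{\pm\alpha}$. $\alpha^*$
is a dual root vector defined by
$$\alpha^* \equiv \frac{\alpha}{\alpha \cdot \alpha}. \tag{A.7}$$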
The symmetry breaking (A.1) induces the Higgs mechanism in such an
SU (2) subgroup, SU (2) → U (1). By embedding the known ’t Hooft–Polyakov
monopole [2, 38] lying in this subgroup and adding a constant term to φ so
that it behaves correctly asymptotically, one easily constructs a solution of
the equation of motion [6, 27]:

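$$A_i(\mathbf{r}) = A_i^a(\mathbf{r}, h \cdot \alpha)\, S_a; \qquad \phi(\mathbf{r}) = \chi^a(\mathbf{r}, h \cdot \alpha)\, S_a + [\, h - (h \cdot \alpha)\, \alpha^*\, ] \cdot H, \tag{A.8}$$

where
$$A_i^a(\mathbf{r}) = \epsilon_{aij}\, \frac{r_j}{r^2}\, A(r); \qquad \chi^a(\mathbf{r}) = \frac{r_a}{r}\, \chi(r), \qquad \chi(\infty) = h \cdot \alpha \tag{A.9}$$
is the standard ’t Hooft–Polyakov–BPS solution. Note that $\phi(\mathbf{r} = (0, 0, \infty)) = \phi_0$.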
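14
In the Cartan basis, the Lie algebra of the group $G$ takes the form

$$[H_i, H_k] = 0, \quad (i, k = 1, 2, \ldots, r); \qquad [H_i, E_\alpha] = \alpha_i\, E_\alpha; \qquad [E_\alpha, E_{-\alpha}] = \alpha^i H_i; \tag{A.2}$$

$$[E_\alpha, E_\beta] = N_{\alpha\beta}\, E_{\alpha+\beta} \qquad (\alpha + \beta \neq 0). \tag{A.3}$$

$\alpha_i = (\alpha_1, \alpha_2, \ldots)$ are the root vectors.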
The mass of a BPS monopole is then given by

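$$M = \int d\mathbf{S} \cdot \mathrm{Tr}\, \phi\, \mathbf{B}, \qquad B_i = \frac{r_i\, (S \cdot r)}{r^4}. \tag{A.10}$$
This can be computed by going to the gauge in which
$$\mathbf{B} = \frac{\mathbf{r}\, S_3}{r^3} = \frac{\mathbf{r}}{r^3}\, \alpha^* \cdot H, \tag{A.11}$$
to be
$$M = \frac{4\pi\, h_i\, \alpha^*_j}{g}\, \mathrm{Tr}\, H_i H_j. \tag{A.12}$$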
For instance, the mass of the minimal monopole of SU (N +1) → SU (N )×U (1)
can be found easily by using (B.4)–(B.10)

$$M = \frac{2\pi\, v\, (N+1)}{g}. \tag{A.13}$$
For the cases $SO(N+2) \to SO(N) \times U(1)$ and $USp(2N+2) \to USp(2N) \times U(1)$,
where $\mathrm{Tr}\, H_i H_j = C\, \delta_{ij}$, one finds

$$M = \frac{4\pi\, C\, h \cdot \alpha^*}{g} = \frac{4\pi v}{g}, \tag{A.14}$$
while for $SO(2N) \to SU(N) \times U(1)$, $SO(2N+1) \to SU(N) \times U(1)$, and
$USp(2N) \to SU(N) \times U(1)$, the mass is
$$M = \frac{8\pi\, C\, h \cdot \alpha^*}{g} = \frac{8\pi v}{g}. \tag{A.15}$$

In order to get the $U(1)$ magnetic charge,15 we first divide by an appropriate
normalization factor in the mass formula (A.10)

$$F_m = \int d\mathbf{S} \cdot \frac{\mathrm{Tr}\, \phi\, \mathbf{B}}{N_\phi} = \int d\mathbf{S} \cdot \mathbf{B}^{(0)}, \qquad B_i = \frac{r_i\, (S \cdot r)}{r^4}. \tag{A.16}$$

The result, which is equal to $4\pi g_m$ by definition, gives the magnetic charge.
The latter must then be expressed as a function of the minimum $U(1)$ electric
charge present in the given theory, which can be easily found from the
normalized (such that $\mathrm{Tr}\, T^{(a)} T^{(a)} = \frac{1}{2}$) form of the relevant $U(1)$ generator.
For example, in the case of the symmetry breaking $SO(2N) \to U(N)$, the
adjoint VEV is of the form $\phi = \sqrt{4N}\, v\, T^{(0)}$, where $T^{(0)}$ is a $2N \times 2N$ block-diagonal
matrix with $N$ nonzero submatrices $\frac{i}{\sqrt{4N}} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Dividing the mass
(A.15) by $\sqrt{N}\, v$ and identifying the flux with $4\pi g_m$ one gets $g_m = \frac{2}{\sqrt{N}\, g}$.
15
In this calculation it is necessary to use the generators normalized as
$\mathrm{Tr}\, T^{(a)} T^{(b)} = \frac{1}{2}\, \delta_{ab}$, such that $\mathbf{B} = \mathbf{B}^{(0)}\, T^{(0)} + \ldots$.
Finally, in terms of the minimum electric charge of the theory $e_0 = \frac{g}{\sqrt{4N}}$
(which follows from the normalized form of $T^{(0)}$ above) one finds
$$g_m = \frac{2}{\sqrt{N}\, g} = \frac{2}{N} \cdot \frac{1}{2 e_0}. \tag{A.17}$$
The calculation is similar in other cases.

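The asymptotic gauge field can be written as
$$F_{ij} = \epsilon_{ijk}\, B_k = \epsilon_{ijk}\, \frac{r_k}{r^3}\, (\beta \cdot H), \qquad \beta = \alpha^* \tag{A.18}$$
in an appropriate gauge ((A.10)). The Goddard–Nuyts–Olive quantization
condition [4]
$$2\, \beta \cdot \alpha \in \mathbb{Z} \tag{A.19}$$
then reduces to the well-known theorem that for two root vectors $\alpha_1$, $\alpha_2$ of
any group,
$$\frac{2\, (\alpha_1 \cdot \alpha_2)}{(\alpha_1 \cdot \alpha_1)} \tag{A.20}$$
is an integer.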
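
The integrality statement (A.20) is easy to test on explicit root systems. The sketch below does so for $SU(4)$ and for $SO(5)$, using unnormalized integer root vectors (the ratio in (A.20) does not depend on the overall normalization); the helper names and the choice of these two groups are ours, for illustration only.

```python
from fractions import Fraction
from itertools import permutations

def su_roots(dim):
    """Roots e_i - e_j of SU(dim), as integer vectors in dim dimensions."""
    roots = []
    for i, j in permutations(range(dim), 2):
        r = [0] * dim
        r[i], r[j] = 1, -1
        roots.append(tuple(r))
    return roots

def so5_roots():
    """Roots of SO(5): short roots +-e_1, +-e_2 and long roots +-e_1 +- e_2."""
    short = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    long_ = [(s, t) for s in (1, -1) for t in (1, -1)]
    return short + long_

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gno_numbers(roots):
    """All values of 2 (alpha_1 . alpha_2) / (alpha_1 . alpha_1), kept exact."""
    return {Fraction(2 * dot(a1, a2), dot(a1, a1)) for a1 in roots for a2 in roots}

su4_values = gno_numbers(su_roots(4))
so5_values = gno_numbers(so5_roots())
all_integer = all(v.denominator == 1 for v in su4_values | so5_values)
```

Exact rational arithmetic is used so that a non-integer value, if one occurred, could not be hidden by rounding.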

Appendix B—Root Vectors and Weight Vectors

B.1 $A_N = SU(N+1)$

It is sometimes convenient to have the root vectors and weight vectors of the
Lie algebra SU (N + 1) as vectors in an (N + 1)-dimensional space rather than
an N -dimensional one. The root vectors are then simply

(· · · , ±1, · · · , ∓1, · · · ). (B.1)

(· · · stand for zero elements) which all lie on the plane

x1 + x2 + . . . + xN +1 = 0, (B.2)

while the weight vectors are projections in this plane of the orthogonal vectors

μ = (· · · , ±1, · · · ) (B.3)

where the dots represent zero elements.

In order to use the general formulas of Weinberg and Goddard–Olive–Nuyts
we normalize these vectors so that the diagonal (Cartan) generators
may be written

$$H_i = \mathrm{diag}\, (w_1^i, w_2^i, \ldots, w_N^i, w_{N+1}^i), \qquad i = 1, 2, \ldots, N, \tag{B.4}$$

where $w_k$ represents the $k$-th weight vector of the fundamental representation
of $SU(N+1)$, satisfying
$$w_k \cdot w_l = -\frac{1}{2(N+1)^2} \quad (k \neq l); \qquad w_k \cdot w_k = \frac{N}{2(N+1)^2}, \qquad k, l = 1, 2, \ldots, N+1; \tag{B.5}$$
and $\sum_{k=1}^{N+1} w_k = 0$. They are vectors lying in an $N$-dimensional space (B.2):
in the coordinates of the $(N+1)$-dimensional space,
$$w_i = \frac{1}{\sqrt{2(N+1)^3}}\, (-1, \ldots, -1, N, -1, -1, \ldots). \tag{B.6}$$
The root vectors are simply
$$\alpha = w_i - w_j = \frac{1}{\sqrt{2(N+1)}}\, (\cdots, \pm 1, \cdots, \mp 1, \cdots) \tag{B.7}$$
with the norm
$$\alpha \cdot \alpha = \frac{1}{N+1}. \tag{B.8}$$
Note that for $i \neq j$
$$\mathrm{Tr}\, (H_i H_j) = w_1^i w_1^j + \ldots + w_{N+1}^i w_{N+1}^j = \frac{-2N + N - 1}{2(N+1)^3} = -\frac{1}{2(N+1)^2}, \tag{B.9}$$
while
$$\mathrm{Tr}\, (H_i H_i) = \frac{N^2 + N}{2(N+1)^3} = \frac{N}{2(N+1)^2}. \tag{B.10}$$
The adjoint VEV causing the symmetry breaking $SU(N+1) \to SU(N) \times U(1)$
is of the form

$$\phi = h \cdot H, \qquad h = v \sqrt{2(N+1)^3}\; (0, 0, \ldots, 1). \tag{B.11}$$
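
The normalizations (B.5)–(B.10) and the mass (A.13) can be cross-checked numerically. The sketch below builds the weight vectors of (B.6) in the $(N+1)$-dimensional coordinates and evaluates (A.12). It assumes the square-root normalizations $1/\sqrt{2(N+1)^3}$ in (B.6) and $v\sqrt{2(N+1)^3}$ in (B.11), and it takes the root $\alpha = w_{N+1} - w_1$ so that the mass comes out positive; these choices, and all function names, are ours.

```python
import math

def weights(N):
    """Weight vectors (B.6) of SU(N+1), assuming the prefactor 1/sqrt(2 (N+1)^3)."""
    s = 1.0 / math.sqrt(2 * (N + 1) ** 3)
    return [[s * (N if k == i else -1) for k in range(N + 1)] for i in range(N + 1)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cartan_trace(N):
    """Tr(H_i H_j) = sum_k w_k^i w_k^j, with H_i = diag(w_1^i, ..., w_{N+1}^i)."""
    w = weights(N)
    return [[sum(w[k][i] * w[k][j] for k in range(N + 1)) for j in range(N + 1)]
            for i in range(N + 1)]

def monopole_mass(N, v=1.0, g=1.0):
    """Evaluate (A.12) for SU(N+1) -> SU(N) x U(1) with h taken from (B.11)."""
    w, G = weights(N), cartan_trace(N)
    h = [0.0] * N + [v * math.sqrt(2 * (N + 1) ** 3)]
    alpha = [w[N][k] - w[0][k] for k in range(N + 1)]   # a broken root
    norm2 = dot(alpha, alpha)
    alpha_dual = [x / norm2 for x in alpha]             # (A.7)
    total = sum(h[i] * alpha_dual[j] * G[i][j]
                for i in range(N + 1) for j in range(N + 1))
    return 4 * math.pi * total / g

mass_su3 = monopole_mass(2)   # to be compared with 2 pi v (N + 1) / g at N = 2
```

For $N = 1$ the same function gives the familiar $SU(2)$ value $4\pi v/g$, which is the $N = 1$ case of (A.13).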

B.2 $B_N = SO(2N+1)$
The $N$ generators in the Cartan subalgebra of the Lie algebra $SO(2N+1)$
can be taken to be
$$H_i = \begin{pmatrix} -i w_1^i J & & & \\ & -i w_2^i J & & \\ & & \ddots & \\ & & & -i w_N^i J \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{B.12}$$

where $w_k$ ($k = 1, 2, \ldots, N$) are the weight vectors of the fundamental representation,
which are vectors in an $N$-dimensional Euclidean space

$$w_k \cdot w_l = 0, \quad k \neq l; \qquad w_k \cdot w_k = \frac{1}{2(2N-1)}: \tag{B.13}$$

they form a complete set of orthogonal vectors. The root vectors of the $SO(2N+1)$
group are $\alpha = \{ \pm w_i,\ \pm w_i \pm w_j \}$; their duals are:

$$\alpha^* = \pm 2(2N-1)\, w_i, \qquad (2N-1)\, [\, \pm w_i \pm w_j\, ]. \tag{B.14}$$

The diagonal generators satisfy

$$\mathrm{Tr}\, H_i H_j = \frac{1}{2N-1}\, \delta_{ij}. \tag{B.15}$$
In the system with symmetry breaking $SO(2N+1) \to SO(2N-1) \times U(1)$
the adjoint scalar VEV is

$$\phi = h \cdot H, \qquad h = i v \sqrt{2(2N-1)}\; (0, 0, \ldots, 1). \tag{B.16}$$

B.3 $C_N = USp(2N)$

The $N$ generators in the Cartan subalgebra of $USp(2N)$ are the following
$2N \times 2N$ matrices:

$$H_i = \begin{pmatrix} B_i & 0 \\ 0 & -B_i^t \end{pmatrix}, \qquad i = 1, 2, \ldots, N, \tag{B.17}$$

where
$$B_i = \begin{pmatrix} w_1^i & & & & \\ & w_2^i & & & \\ & & \ddots & & \\ & & & w_{N-1}^i & \\ & & & & w_N^i \end{pmatrix}, \qquad i = 1, 2, \ldots, N. \tag{B.18}$$
The weight vectors $w_k$ ($k = 1, 2, \ldots, N$) form a complete set of orthogonal
vectors in an $N$-dimensional Euclidean space and satisfy
$$w_k \cdot w_l = 0, \quad k \neq l; \qquad w_k \cdot w_k = \frac{1}{4(N+1)}. \tag{B.19}$$

The root vectors of the $USp(2N)$ group are $\alpha = \{ \pm 2 w_i,\ \pm w_i \pm w_j \}$. The diagonal
generators satisfy
$$\mathrm{Tr}\, H_i H_j = \frac{1}{2(N+1)}\, \delta_{ij}. \tag{B.20}$$

For the breaking $USp(2N) \to USp(2(N-1)) \times U(1)$ the adjoint scalar VEV is

$$\phi = h \cdot H, \qquad h = v \sqrt{4(N+1)}\; (0, 0, \ldots, 1). \tag{B.21}$$
B.4 $D_N = SO(2N)$

The $N$ generators in the Cartan subalgebra of the $SO(2N)$ group can be
chosen to be
$$H_i = \begin{pmatrix} -i w_1^i \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} & & & \\ & -i w_2^i \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} & & \\ & & \ddots & \\ & & & -i w_N^i \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \end{pmatrix}, \tag{B.22}$$

where $w_k$ ($k = 1, 2, \ldots, N$) are the weight vectors of the fundamental representation,
living in an $N$-dimensional Euclidean space and satisfying
$$w_k \cdot w_l = 0, \quad k \neq l; \qquad w_k \cdot w_k = \frac{1}{4(N-1)}: \tag{B.23}$$

they form a complete set of orthogonal vectors. The root vectors of $SO(2N)$
are $\alpha = \{ \pm w_i \pm w_j \}$. The diagonal generators satisfy

$$\mathrm{Tr}\, H_i H_j = \frac{1}{2(N-1)}\, \delta_{ij}. \tag{B.24}$$

In the system with symmetry breaking $SO(2N) \to SO(2N-2) \times U(1)$ the
adjoint scalar VEV takes the form

$$\phi = h \cdot H, \qquad h = i v \sqrt{4(N-1)}\; (0, 0, \ldots, 1). \tag{B.25}$$
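
The trace normalizations (B.15), (B.20) and (B.24) follow from the weight norms (B.13), (B.19) and (B.23), and this can be verified with explicit matrices. The sketch below assumes a basis in which the orthogonal weight vectors point along the coordinate axes, $w_k^i = |w|\, \delta_{ki}$; that basis choice and the helper names are ours. For $SO(2N+1)$ the generators are padded with one extra zero row and column.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def so_cartan(i, N, w_norm2, odd):
    """H_i for SO(2N) (odd=False) or SO(2N+1) (odd=True), with w_k^i = |w| delta_ki."""
    dim = 2 * N + (1 if odd else 0)
    w = math.sqrt(w_norm2)
    H = [[0j] * dim for _ in range(dim)]
    H[2 * i][2 * i + 1] = -1j * w    # block -i w J with J = [[0, 1], [-1, 0]]
    H[2 * i + 1][2 * i] = 1j * w
    return H

def usp_cartan(i, N, w_norm2):
    """H_i = diag(B_i, -B_i^t) for USp(2N), with B_i = diag(w_1^i, ..., w_N^i)."""
    w = math.sqrt(w_norm2)
    H = [[0j] * (2 * N) for _ in range(2 * N)]
    H[i][i] = w
    H[N + i][N + i] = -w
    return H

def trace_table(generators):
    return [[trace(matmul(Hi, Hj)).real for Hj in generators] for Hi in generators]

N = 3
table_B = trace_table([so_cartan(i, N, 1 / (2 * (2 * N - 1)), True) for i in range(N)])
table_C = trace_table([usp_cartan(i, N, 1 / (4 * (N + 1))) for i in range(N)])
table_D = trace_table([so_cartan(i, N, 1 / (4 * (N - 1)), False) for i in range(N)])
```

In each case the table is proportional to the identity, with the diagonal entry equal to twice the weight norm squared.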

Appendix C—Seiberg–Witten Curves for SU (2) N = 2

Super Yang-Mills Theory

The variables $a$ and $a_D$ are to be considered as local variables, describing
the low-energy effective action in a particular patch of the space of vacua
(QMS). On the other hand, the variable $u = \mathrm{Tr}\, \Phi^2$ is a gauge invariant and
apparently unique and global variable describing the QMS. The space $(a_D, a)$
is the covering space $\tilde{\mathcal{M}}$ of the space $\mathcal{M}$ whose coordinate is the complex
VEV $u$. If the base space were simply connected, the map $\tilde{\mathcal{M}} \to \mathcal{M}$ would
be trivial. In general, a closed loop of the point $u$ in the base space induces a
discrete transformation among the inverse images of the point $u$ in the
covering space; these transformations form the monodromy group.
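The fact that the space $\mathcal{M}$ is nontrivial follows from the one-loop beta
function,
$$\tau_{\rm eff} = \frac{d a_D}{d a} = \frac{\theta_{\rm eff}}{2\pi} + \frac{4\pi i}{g_{\rm eff}^2} \sim \frac{i}{2\pi}\, \log a + \ldots$$
so
$$F(A) \simeq \frac{i}{2\pi}\, A^2 \log \frac{A^2}{\Lambda^2}, \qquad a_D = \frac{dF(a)}{da} = \frac{i}{2\pi} \left( a \log a + \frac{a}{2} \right).$$
The effect of a loop at large $u \sim a^2/2$, $u \to e^{2\pi i} u$, is $a \to e^{\pi i} a$, so

$$a_D \to -a_D + 2a, \qquad a \to -a,$$

or
$$\begin{pmatrix} a_D \\ a \end{pmatrix} \to M_\infty \begin{pmatrix} a_D \\ a \end{pmatrix}, \qquad M_\infty = \begin{pmatrix} -1 & 2 \\ 0 & -1 \end{pmatrix}.$$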
A singularity at $\infty$ in the $u$ space implies the presence of at least one more
singularity at finite $u$. As the theory possesses an invariance under spontaneously
broken discrete $Z_2$, under which $u \to -u$, it is natural to assume a
pair of singularities at $u = \pm\Lambda^2$. The key idea of Seiberg and Witten is that
these singularities correspond to the points of $u$ where the ’t Hooft–Polyakov
monopole becomes massless due to quantum effects. Near $u \sim \Lambda^2$ then
$$a_D(u = \Lambda^2) = 0, \qquad \tau_D = -\frac{da}{da_D} \simeq -\frac{i}{\pi}\, \log a_D, \tag{C.1}$$
and
$$a_D \sim c_0\, (u - \Lambda^2),$$
where (C.1) is the standard beta function of N = 2 supersymmetric QED.
Thus a closed loop in $u$ around the point $\Lambda^2$ induces the monodromy transformation

$$a \to a - 2 a_D; \qquad a_D \to a_D, \qquad M_{\Lambda^2} = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}.$$

The monodromy transformation around $-\Lambda^2$ follows from the consistency
condition,
$$M_{\Lambda^2} \cdot M_{-\Lambda^2} = M_\infty.$$
The map $a_D(u)$, $a(u)$ with the desired properties is precisely the one given
in (36)–(38).
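
The consistency condition determines the monodromy around $-\Lambda^2$ once the other two matrices are given. The sketch below reads the condition as $M_{\Lambda^2} \cdot M_{-\Lambda^2} = M_\infty$ (the ordering of the two loops is an assumption on our part) and solves for $M_{-\Lambda^2}$ with integer arithmetic; the helper names are hypothetical.

```python
def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(M):
    """Inverse of a 2x2 integer matrix with determinant +1 or -1."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det not in (1, -1):
        raise ValueError("expected a unimodular matrix")
    # For det = +-1, 1/det equals det, so the inverse stays integer-valued
    return [[d * det, -b * det], [-c * det, a * det]]

# Matrices acting on the column vector (a_D, a), as quoted in the text
M_INF = [[-1, 2], [0, -1]]       # loop at large u
M_PLUS = [[1, 0], [-2, 1]]       # loop around u = +Lambda^2

# Assumed reading of the consistency condition: M_PLUS . M_MINUS = M_INF
M_MINUS = matmul2(inverse2(M_PLUS), M_INF)
```

All three matrices have unit determinant and integer entries, as expected for elements of $SL(2, \mathbb{Z})$.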

Appendix D—One-particle Representations of N = 1

and N = 2 Supersymmetry Algebra
(i) For massive N = 1 supersymmetric particle states, one has ($P^\mu = (M, 0, 0, 0)$)
$$\{ Q_\alpha, \bar{Q}_{\dot\alpha} \} = \delta_{\alpha\dot\alpha}\, 2M, \qquad \alpha, \dot\alpha = 1, 2, \tag{D.1}$$
or, by defining
$$b_\alpha^\dagger = \frac{1}{\sqrt{2M}}\, Q_\alpha, \qquad b_{\dot\alpha} = \frac{1}{\sqrt{2M}}\, \bar{Q}_{\dot\alpha}, \tag{D.2}$$
these can be regarded as two pairs of annihilation and creation operators,
$\{ b_{\dot\alpha}, b_\alpha^\dagger \} = \delta_{\alpha\dot\alpha}$. The complete set of one-particle states can then be
constructed by defining the vacuum state by ($i = 1, 2$)

$$b_i\, |0\rangle = 0; \tag{D.3}$$

the full set of states is

$$|0\rangle, \quad b_1^\dagger |0\rangle, \quad b_2^\dagger |0\rangle, \quad b_1^\dagger b_2^\dagger |0\rangle; \tag{D.4}$$

they form a degenerate supersymmetry multiplet (two bosons and two
fermions). For $N$ supersymmetry, the same argument shows that the multiplicity
of a massive multiplet is
$$\sum_{n=0}^{2N} \binom{2N}{n} = 2^{2N}. \tag{D.5}$$

(ii) Massless N = 1 supersymmetric particle states: In this case it is not
possible to go to the rest frame but the momentum can be chosen as
$P^\mu = (p, 0, 0, p)$. Then

$$\{ Q_\alpha, \bar{Q}_{\dot\alpha} \} = \begin{pmatrix} 2p & 0 \\ 0 & 0 \end{pmatrix}_{\alpha\dot\alpha} \tag{D.6}$$

The state $b_2^\dagger |0\rangle$ has zero norm. The particle states are given by the
positive norm states, half of (D.4),

$$|0\rangle, \quad b_1^\dagger |0\rangle. \tag{D.7}$$

For $N$ supersymmetry, the multiplicity of a massless multiplet is

$$\sum_{n=0}^{N} \binom{N}{n} = 2^{N}. \tag{D.8}$$

(iii) Massive N = 2 supersymmetric particle states with central charges. In
the rest frame ($P^\mu = (M, 0, 0, 0)$) the supersymmetry algebra reduces to

$$\{ Q^i_\alpha, \bar{Q}^j_{\dot\alpha} \} = \delta^{ij}\, \delta_{\alpha\dot\alpha}\, 2M, \qquad \alpha, \dot\alpha = 1, 2, \quad i, j = 1, 2, \tag{D.9}$$

$$\{ Q^i_\alpha, Q^j_\beta \} = \epsilon_{\alpha\beta}\, \epsilon^{ij}\, (U + iV). \tag{D.10}$$

Within an irreducible representation $U$ and $V$ are just numbers (electric
and magnetic charges of these particles). There are three cases:
1. $2M < \sqrt{U^2 + V^2}$: It is not possible to find a positive-norm representation
of the algebra;
2. $2M = \sqrt{U^2 + V^2}$: A representation exists with multiplicity $2^N = 4$
(short multiplet) (this is the so-called BPS saturated case);
3. $2M > \sqrt{U^2 + V^2}$: A representation exists with multiplicity $2^{2N} = 16$
(long multiplet).
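
The state counting behind (D.5), (D.8) and the multiplicities quoted in cases 2 and 3 is a sum of binomial coefficients: each independent fermionic creation operator is either applied or not. A minimal sketch, with function names of our own choosing:

```python
from math import comb

def massive_multiplicity(n_susy):
    """Sum over n of C(2N, n): 2N creation operators act on the vacuum, cf. (D.5)."""
    return sum(comb(2 * n_susy, n) for n in range(2 * n_susy + 1))

def short_multiplicity(n_susy):
    """Sum over n of C(N, n): only N creation operators give nonzero norm, cf. (D.8)."""
    return sum(comb(n_susy, n) for n in range(n_susy + 1))

long_n2 = massive_multiplicity(2)    # long multiplet of N = 2
short_n2 = short_multiplicity(2)     # short (BPS) multiplet of N = 2
```

The same counting applies to the massless case and to the BPS saturated case, since in both half of the creation operators produce zero-norm states.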

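Proof. Define
$$\frac{Q^1_1}{\sqrt{2M}} = b_1, \qquad \frac{Q^1_2}{\sqrt{2M}} = b_2, \qquad \frac{Q^2_1}{\sqrt{2M}} = b_3, \qquad \frac{Q^2_2}{\sqrt{2M}} = b_4, \tag{D.11}$$

$$\frac{U}{2M} = u, \qquad \frac{V}{2M} = v; \tag{D.12}$$
then

$$\{ b_i, b_j^\dagger \} = \delta_{ij}, \qquad \{ b_1, b_4 \} = u + iv, \qquad \{ b_2, b_3 \} = -u - iv, \tag{D.13}$$

$$\{ b_1^\dagger, b_4^\dagger \} = u - iv, \qquad \{ b_2^\dagger, b_3^\dagger \} = -u + iv. \tag{D.14}$$

Now make the change of variables

$$Q^1_\alpha \to e^{i\gamma}\, Q^1_\alpha, \qquad Q^2_\alpha \to Q^2_\alpha, \tag{D.15}$$

$$b_1 \to e^{i\gamma}\, b_1, \qquad b_2 \to e^{i\gamma}\, b_2, \tag{D.16}$$

to have $\{ b_1, b_4 \}$ real and positive:
$$\{ b_1, b_4 \} = \{ b_1^\dagger, b_4^\dagger \} = \alpha = \frac{\sqrt{U^2 + V^2}}{2M}, \tag{D.17}$$

$$\{ b_2, b_3 \} = \{ b_2^\dagger, b_3^\dagger \} = -\alpha. \tag{D.18}$$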
In order to see the spectrum, it is convenient to set

$$A = b_1 \cos\vartheta + b_4^\dagger \sin\vartheta, \qquad B = -b_1 \sin\vartheta + b_4^\dagger \cos\vartheta. \tag{D.19}$$

The condition $\{ A, B \} = \{ A, B^\dagger \} = 0$ yields $\vartheta = \frac{\pi}{4}$: $A$ and $B$ satisfy disjoint
anticommutators

$$\{ A, B \} = 0, \qquad \{ A, A^\dagger \} = 1 + \alpha, \qquad \{ B, B^\dagger \} = 1 - \alpha. \tag{D.20}$$

Thus if $|\alpha| < 1$ there are two creation operators $A^\dagger$, $B^\dagger$; while if $\alpha = \pm 1$, $B^\dagger$
(or $A^\dagger$) creates zero-norm states. The same steps for $b_2$ and $b_3^\dagger$ lead to a
similar result. The net result is that particles with mass $M > \frac{\sqrt{U^2 + V^2}}{2}$ come
in “long multiplets”, with multiplicity $2^{2N} = 16$, while the BPS particles with
mass $M = \frac{\sqrt{U^2 + V^2}}{2}$ come in “short multiplets” of multiplicity $2^N = 4$.
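
The anticommutators (D.20) follow from (D.13) and (D.17) by bilinearity, and this is simple to verify mechanically. In the sketch below an operator is represented by its coefficients in the basis $(b_1, b_1^\dagger, b_4, b_4^\dagger)$ and the anticommutator by a symmetric table; the representation and the function names are our own device for the check, with $\{b_1, b_4^\dagger\} = 0$ taken from (D.9).

```python
import math

def anticommutator_table(alpha):
    """Symmetric table of {x_i, x_j} in the basis (b1, b1^dag, b4, b4^dag)."""
    G = [[0.0] * 4 for _ in range(4)]
    G[0][1] = G[1][0] = 1.0        # {b1, b1^dag} = 1
    G[2][3] = G[3][2] = 1.0        # {b4, b4^dag} = 1
    G[0][2] = G[2][0] = alpha      # {b1, b4} = alpha, cf. (D.17)
    G[1][3] = G[3][1] = alpha      # {b1^dag, b4^dag} = alpha
    return G

def dagger(x):
    """Hermitian conjugate: swap b <-> b^dag and conjugate the coefficients."""
    return [x[1].conjugate(), x[0].conjugate(), x[3].conjugate(), x[2].conjugate()]

def anticommutator(x, y, G):
    return sum(x[i] * G[i][j] * y[j] for i in range(4) for j in range(4))

def rotated_operators(theta):
    """A and B of (D.19) as coefficient vectors."""
    c, s = math.cos(theta), math.sin(theta)
    A = [complex(c), 0j, 0j, complex(s)]
    B = [complex(-s), 0j, 0j, complex(c)]
    return A, B

alpha = 0.3                                    # illustrative value with |alpha| < 1
G = anticommutator_table(alpha)
A, B = rotated_operators(math.pi / 4)
norm_A = anticommutator(A, dagger(A), G).real  # expected 1 + alpha
norm_B = anticommutator(B, dagger(B), G).real  # expected 1 - alpha
mixed = abs(anticommutator(A, dagger(B), G))   # expected to vanish at theta = pi/4
```

For a generic angle the mixed anticommutator equals $\alpha \cos 2\vartheta$, which is why $\vartheta = \pi/4$ is singled out; at $\alpha = 1$ the norm of the state created by $B^\dagger$ vanishes, which is the short-multiplet case.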

References
1. P.A.M. Dirac: Proc. Roy. Soc. (1931) A 133, 60; Phys. Rev. 74, 817 (1948) 471
2. G. ’t Hooft: Nucl. Phys. B 79, 817 (1974), A.M. Polyakov: JETP Lett. 20, 194
(1974); M.K. Prasad, C.M. Sommerfield: Phys. Rev. Lett. 35, 760 (1975); W.
Nahm: Phys. Lett. B 90, 413 (1980) 471, 474, 476, 482, 510
3. E. Lubkin: Ann. Phys. 23, 233 (1963); E. Corrigan, D.I. Olive, D.B. Fairlie:
J. Nuyts, Nucl. Phys. B 106, 475 (1976) 471
4. P. Goddard, J. Nuyts, D. Olive: Nucl. Phys. B 125, 1 (1977) 471, 477, 496, 509, 510, 512
5. F.A. Bais: Phys. Rev. D 18, 1206 (1978) 471, 476, 477
6. E.J. Weinberg: Nucl. Phys. B 167, 500 (1980); Nucl. Phys. B 203, 445 (1982);
K. Lee, E. J. Weinberg, P. Yi: Phys. Rev. D 54 , 6351 (1996) 471, 476, 477, 502, 509, 510
7. C.H. Taubes: Commun. Math. Phys. 80, 343 (1980) 471
8. S. Coleman: “The Magnetic Monopole Fifty Years Later”, Lectures given at
International School of Subnuclear Physics, Erice, Italy (1981) 471, 480, 481, 486
9. R.S. Ward: Commun. Math. Phys. 86, 437 (1982) 471
10. N. Manton: Phys. Lett. B 154, 397 (1985), Erratum ibid. B 157, 475 (1985) 471
11. A. Abouelsaood: Nucl. Phys. B 226, 309 (1983); P. Nelson, A. Manohar: Phys.
Rev. Lett. 50, 943 (1983); A. Balachandran, G. Marmo, M. Mukunda, J. Nils-
son, E. Sudarshan, F. Zaccaria: Phys. Rev. Lett. 50, 1553 (1983); P. Nelson,
S. Coleman: Nucl. Phys. B 227, 1 (1984) 471, 477, 507
12. C. Rebbi, G. Soliani: Solitons and Particles (World Scientific, Singapore, 1984).
Many earlier references on the solitons are collected in this book 471
13. P.A. Horvathy, J.H. Rawnsley: Phys. Rev. D 32, 968 (1985); J. Math. Phys.
27, 982 (1986) 471
14. N. Dorey, C. Fraser, T.J. Hollowood, M.A.C. Kneipp: “NonAbelian duality in
N = 4 supersymmetric gauge theories” [arXiv: hep-th/9512116]; Phys.Lett. B
383, 422 (1996) 471, 477
15. C.J. Houghton, P.M. Sutcliffe: J. Math. Phys. 38, 5576 (1997)
16. B.J. Schroers, F.A. Bais: Nucl. Phys. B 512, 250 (1998); Nucl. Phys. B 535,
197 (1998)
17. M. Strassler: Prog. Theor. Phys. Suppl. 131, 439 (1998)
18. H.J. de Vega: Phys. Rev. D 18, 2932 (1978); H.J. de Vega, F.A. Shaposnik:
Phys. Rev. Lett. 56, 2564 (1986); Phys. Rev. D34, 3206 (1986); J. Heo, T.
Vachaspati: Phys. Rev. D 58, 065011 (1998), P. Suranyi: hep-lat/9912023;
F.A. Shaposnik, P. Suranyi: Phys. Rev. D 62, 125002 (2000); J. Edelstein,
W. Fuertes, J. Mas, J. Guilarte: Phys. Rev. D 62, 065008 (2000); M. Kneipp,
P. Brockill: Phys. Rev. D 64, 125012 (2001) 471
19. G. ’t Hooft: Nucl. Phys. B 190, 455 (1981); S. Mandelstam: Phys. Lett. 53B,
476 (1975); Phys. Rep. C 23, 245 (1976) 472
20. Y.M. Cho: Phys. Rev. D 21, 1080 (1980); L.D. Faddeev and A.J. Niemi: Phys.
Rev. Lett. 82, 1624 (1999); Phys. Lett. B 449, 214 (1999) 472
21. T.T. Wu, C.N. Yang: in Properties of Matter Under Unusual Conditions, ed.
by H. Mark, S. Fernbach (Interscience, New York, 1969) 473
22. K. Konishi, K. Takenaga: Phys. Lett. B 508, 392 (2001) 473
23. N. Seiberg, E. Witten: Nucl. Phys. B 426, 19 (1994); Erratum ibid. B 430,
485 (1994) 474, 478, 485, 486, 489
24. N. Seiberg, E. Witten: Nucl. Phys. B 431, 484 (1994) 474, 478, 485, 487, 489, 490
520 K. Konishi

25. P. C. Argyres, A. F. Faraggi: Phys. Rev. Lett 74, 3931 (1995); A. Klemm,
W. Lerche, S. Theisen, S. Yankielowicz: Phys. Lett. B 344, 169 (1995); Int.
J. Mod. Phys. A 11, 1929 (1996), A. Hanany, Y. Oz: Nucl. Phys. B 452, 283
(1995) 474, 478, 487, 489
26. P. C. Argyres, M. R. Plesser, A. D. Shapere: Phys. Rev. Lett. 75, 1699 (1995);
P. C. Argyres, A. D. Shapere: Nucl. Phys. B 461, 437 (1996); A. Hanany: Nucl.
Phys. B 466, 85 (1996) 474, 489, 490
27. S. Bolognesi, K. Konishi: Nucl. Phys. B 645, 337 (2002) 474, 479, 489, 510
28. P. C. Argyres, M. R. Plesser, N. Seiberg: Nucl. Phys. B 471, 159 (1996); P.C.
Argyres, M.R. Plesser, A.D. Shapere: Nucl. Phys. B 483, 172 (1997); K. Hori,
H. Ooguri, Y. Oz: Adv. Theor. Math. Phys. 1, 1 (1998) 474, 479, 480, 489, 490, 491
29. A. Hanany, Y. Oz: Nucl. Phys. B 466, 85 (1996) 474, 479, 489, 490
30. G. Carlino, K. Konishi, H. Murayama: JHEP 0002, 004 (2000); Nucl. Phys. B
590, 37 (2000) 474, 479, 480, 490, 492, 493, 494, 501, 507
31. G. Carlino, K. Konishi, S. P. Kumar, H. Murayama: Nucl. Phys. B 608, 51
(2001) 474, 490
32. F. Cachazo, M. R. Douglas, N. Seiberg, E. Witten: JHEP 0212, 071 (2002);
F. Cachazo, N. Seiberg, E. Witten: JHEP 0302, 042 (2003); F. Cachazo, N.
Seiberg, E. Witten: JHEP 0304, 018 (2003); for a review and further references,
see: R. Argurio, G. Ferretti, R. Heise: Int. J. Mod. Phys. A 19, 2015 (2004) 474, 500, 507
33. C. Montonen, D. Olive: Phys. Lett. B 72, 117 (1977) 474
34. N. Seiberg: Nucl. Phys. B 435, 129 (1995) 474
35. A. Hanany, D. Tong: JHEP 0307, 037 (2003) 480, 497, 499
36. A. Hanany, D. Tong: JHEP 0404, 066 (2004) 499
37. R. Auzzi, S. Bolognesi, J. Evslin, K. Konishi, A. Yung: Nucl. Phys. B 673, 187
(2003) 480, 497, 499, 506
38. E.B. Bogomolnyi: Sov. J. Nucl. Phys. 24, 449 (1976) 510
39. B.A. Dubrovin, A.T. Fomenko, S.P. Novikov: Modern Geometry—Methods and
Applications, Part II. The Geometry and Topology of Manifolds, translated by
R.G. Burns (Graduate Text in Mathematics, Springer, Berlin, 1985) 480
40. M.K. Gaillard, B. Zumino: Nucl. Phys. B 193, 221 (1981) 484
41. D. Finnell, P. Pouliot: Nucl. Phys. B 453, 225 (1995); N. Dorey, V.V. Khoze,
M.P. Mattis: Nucl. Phys. B 492, 607 (1997) 486
42. N. A. Nekrasov: Adv. Theor. Math. Phys. 7 , 831 (2004) 486
43. K. Konishi: Int. J. Mod. Phys. A 16, 1861 (2001) 486
44. J. Goldstone, F. Wilczek: Phys. Rev. Lett. 47, 986 (1981) 486
45. E. Witten: Phys. Lett. B 86, 283 (1979) 486
46. R. Jackiw, C. Rebbi: Phys. Rev. D 13, 3398 (1976) 486, 506
47. A. J. Niemi, Manu B. Paranjape, G. W. Semenoﬀ: Phys. Rev. Lett. 53, 515
(1984) 487
48. F. Ferrari: Phys. Rev. Lett. 78, 795 (1997) 487
49. K. Konishi, H. Terao: Nucl. Phys. B 511, 264 (1998); G. Carlino, K. Konishi,
H. Terao: JHEP 9804, 003 (1998) 487
50. A. Rebhan, P. van Nieuwenhuizen, R. Wimmer: Phys. Lett. B 594, 234 (2004);
Phys. Lett. B 632, 145 (2006); JHEP 0606, 056 (2006)
51. A. Bilal, F. Ferrari: Nucl. Phys. B 516, 175 (1998); A. Cappelli, P. Valtancoli,
L. Vergnano: Nucl. Phys. B 524, 469 (1998) 490
52. R. Auzzi, S. Bolognesi, J. Evslin, K. Konishi, H. Murayama: Nucl. Phys. B
701, 207 (2004) 492, 502

53. G. Marmorini, K. Konishi, N. Yokoi: Nucl. Phys. B 741, 180 (2006) 493, 494, 507, 508
54. P. C. Argyres, M. R. Douglas: Nucl. Phys. B 448, 93 (1995); P. C. Argyres, M.
R. Plesser, N. Seiberg, E. Witten: Nucl. Phys. B 461, 71 (1996); T. Eguchi,
K. Hori, K. Ito, S.-K. Yang: Nucl. Phys. B 471, 431 (1996) 494
55. R. Auzzi, R. Grena, K. Konishi: Nucl. Phys. B 653, 204 (2003) 494
56. A. Abrikosov: Sov. Phys. JETP 32, 1442 (1957); H. Nielsen, P. Olesen: Nucl.
Phys. B 61, 45 (1973) 495
57. R. Donagi, E. Witten: Nucl. Phys. B 460, 299 (1996) 496
58. M. J. Strassler: “Messages for QCD from the Superworld” [arXiv: hep-
th/9803009] 496
59. A. Hanany, M. Strassler, A. Zaﬀaroni: Nucl. Phys. B 513, 87 (1998) 496
60. N. Dorey: JHEP 9811, 005 (1998); N. Dorey, T.J. Hollowood, S.P. Kumar:
Nucl. Phys. B 624, 95 (2002) 496
61. V. Markov, A. Marshakov, A. Yung: Nucl. Phys. B 709, 267 (2005) 496
62. K. Konishi, L. Spanu: Int. J. Mod. Phys. A 18, 249 (2003) 497
63. S. C. Davis, A-C. Davis, M. Trodden: Phys. Lett. B 405, 257 (1997) 502
64. A.I. Vainshtein, A. Yung: Nucl. Phys. B 614, 3 (2001) 502
65. Y. Isozumi, M. Nitta, K. Ohashi, N. Sakai: Phys. Rev. D 71, 065018 (2005) 498, 503
66. M. Eto, Y. Isozumi, M. Nitta, K. Ohashi, N. Sakai: Phys. Rev. Lett. 96, 161601
(2006) 498, 502, 503
67. M. Eto, Y. Isozumi, M. Nitta, K. Ohashi, N. Sakai: J. Phys. A 39, 315 (2006) 498, 499, 503
68. D. Tong: “TASI lectures on solitons: Instantons, monopoles, vortices and kinks”
[arXiv: hep-th/0509216] 498, 499, 502
69. M. Shifman and A. Yung: Phys. Rev. D 66, 045012 (2002); M. Shifman and
A. Yung: Phys. Rev. D 70, 045004 (2004) 498, 499, 502
70. A. Gorsky, M. Shifman, A. Yung: Phys. Rev. D 71, 045010 (2005) 498, 499
71. M. Shifman, A. Yung: Phys. Rev. D 73, 125012 (2006) 498
72. M. Eto, Y. Isozumi, M. Nitta, K. Ohashi, N. Sakai: Phys. Rev. D 72, 025011
(2005) 505
73. S. Bolognesi, K. Konishi, G. Marmorini: Nucl. Phys. B 718, 134 (2005)
74. N. Seiberg: Nucl. Phys. B 435, 129 (1994)
75. R. Auzzi, S. Bolognesi, J. Evslin, K. Konishi: Nucl. Phys. B 686, 119 (2004)
76. M. Eto, K. Konishi, G. Marmorini, M. Nitta, K. Ohashi, W. Vinci, N. Yokoi:
Phys. Rev. D 74, 065021 (2006) 503, 505, 506
77. R. Dijkgraaf, C. Vafa: Nucl. Phys. B 644, 3 (2002); Nucl. Phys. B 644, 21
(2002) 500
78. R. Auzzi, S. Bolognesi, J. Evslin: JHEP 0502, 046 (2005) 502
79. M. Eto, L. Ferretti, K. Konishi, G. Marmorini, M. Nitta, K. Ohashi, W. Vinci,
N. Yokoi: “Non-abelian duality from vortex moduli: a dual model of color-
conﬁnement”, [arXiv: hep-th/0611313] 507
80. L. Ferretti, K. Konishi: in Sense of Beauty in Physics, Festschrift in honor of
the 70th birthday of A. Di Giacomo, (Edizioni PLUS, University of Pisa Press,
Pisa, 2006) 507
81. V. A. Rubakov: Nucl. Phys. B 203, 311 (1982) 509
Part VI

String dualities and symmetries

Novel Symmetries of String Theory

J. Maharana

Institute of Physics, Bhubaneswar-751005, India


Abstract. The evolution of a closed bosonic string in its massless background is
considered. The target space local symmetries associated with the graviton and
the two-form antisymmetric tensor are exposed through Ward identities using a
path-integral formalism in the Hamiltonian phase space. Similar Ward identities are
derived for nonabelian gauge bosons that appear as massless excitations of
compactified strings. It is proposed that excited massive stringy states might be endowed
with hidden local symmetries. We find evidence in support of this conjecture when
we examine string evolution in the background of some of the low-lying massive
states.

1 Introduction

It is recognized that string theory holds the promise to unify the fundamental
forces of Nature. There have been important developments to achieve this
goal [1]. One of the wonders of string theories is the rich trove of their
symmetries. The symmetry content has been unraveled over the decades. It is
not obvious whether we have exhausted and comprehended all the stringy
symmetries so far. There are symmetries associated with the worldsheet such
as the invariance under Weyl rescaling and the reparametrization invariance.
It is well known that the quantum constraints determine the critical dimen-
sions of the target space. On the other hand, if one envisages evolution of a
string on the background of its massless excitations, the quantum constraints
severely restrict the configurations of these backgrounds [2, 3]. The so-called
β-function equations govern the evolution of the backgrounds. The effective
action constructed from the β-function equations reveals the local symmetries
associated with the target space. For example, the massless spectrum of
closed strings contains a spin-2 state identified with the graviton. Therefore,
it is expected that string theory will encode general coordinate transforma-
tion invariance in its dynamics when we consider interaction of the graviton
with other states. The open string spectrum incorporates a gauge boson, so

J. Maharana: Novel Symmetries of String Theory, Lect. Notes Phys. 737, 525–552 (2008)
DOI 10.1007/978-3-540-74233-6 16 c Springer-Verlag Berlin Heidelberg 2008

that gauge invariance is naturally expected to be manifested in string theory

where such states are involved. Furthermore, it is well-known that nonabelian
gauge bosons appear when a suitable compactification scheme is adopted.
The resulting effective action is found to be invariant under those nonabelian
gauge symmetries. Furthermore, dualities have played a cardinal role in our
understanding of the string dynamics. The extended nature of string, and
of other associated objects like branes, is responsible for the novel features
which are not encountered in field theory describing point particles. The rich-
ness of the spectrum, the appearance of a limiting temperature due to an
enormous degeneracy of the excited levels, and the presence of symmetries,
are at the root of some of the mysteries of the theory which are yet to be fully
understood.
The article presents the attempts of Gabriele Veneziano and of the author
to understand symmetries of string theory, and our endeavours to explore
the presence of new symmetries which might be associated with higher excited
states of strings [4, 5]. It is worthwhile to ask whether one could unravel
any local symmetries for the excited massive levels, just as massless states
are endowed with ‘gauge’ symmetries [5]. Moreover, one might contemplate
whether the higher levels acquire their masses due to a Higgs-like mechanism, so
that there is a phase where all excited levels are massless. Whereas string
field theory might be the appropriate arena to address these issues, the first
quantized approach might provide a more intuitive picture where one studies
the attributes of S-matrix elements. We mention in passing that most of our
results are based on computations of S-matrix elements, and there is not
enough headway to evaluate off-shell amplitudes. Of course, ideally, a complete
theoretical framework should provide us tools to derive Green’s functions. The
relevance of the preceding comments will be alluded to later, in the context
of symmetries associated with massive excited states of the strings.
The path-integral Hamiltonian formulation is adopted in order to exhibit
symmetries associated with stringy states. Let us first consider the evolution
of a closed bosonic string in the background of its massless excitations. The
action corresponds to that of a σ-model where the backgrounds play the role
of coupling constants. The quantum constraints lead to β-function equations
when we carry out the computation by the standard prescriptions in the weak
field approximation. In order to expose the underlying local symmetries
associated with the massless states, we resort to the phase-space BRS Hamiltonian
framework in the path-integral formalism. As will be demonstrated in the sequel,
the local symmetries are exhibited through the Ward identities satisfied by
the S-matrix elements. We shall arrive at the Ward identities through the
following steps: (i) obtain the canonical Hamiltonian, (ii) derive the algebra of
constraints associated with the worldsheet symmetries, (iii) present the gauge
fixed Hamiltonian with the relevant ghost terms and (iv) formally construct
the generating functional for the S-matrix. We shall adopt the same
prescription when we explore the presence of symmetries associated with excited
massive stringy states [6, 7].

The article is organized as follows. We present a pedagogical introduction
to the BRS Hamiltonian formalism in the context of the problem at hand. In the
next section, a closed bosonic string is envisaged in the presence of graviton
and of the antisymmetric tensor field, and the Hamiltonian constraint analy-
sis is presented. Next we discuss the modifications necessary in the σ-model
action when nonabelian gauge bosons are present due to adoption of some
compactification schemes. The fourth section is devoted to the derivation of
Ward identities for the S-matrix elements with massless external states. We
discuss how to incorporate the coupling of a dilaton background to the string
in our approach. We present an illustrative example to show how anomalies
creep in to vitiate the Ward identities. Next, we present our approach to
study the presence of symmetries for massive stringy states. This section is based
on unpublished works [6, 7]. We elaborate on the difficulties encountered by
us while attempting to generalize our formalism for the massless case to the
higher levels, and point out why we believe that the excited states are en-
dowed with symmetries. The summary and conclusions are presented in the
final section.

2 Hamiltonian Formalism and BRS Quantization

The quantization of theories with local symmetries, such as gauge theories,
Einstein’s general theory of relativity and string theory, poses problems due to
the presence of first class constraints. Therefore, we are required to choose a
gauge fixing prescription. The starting point is to construct a covariant theory,
and in the process we introduce more degrees of freedom. For example, in the
relativistically covariant description of free electrodynamics, one deals with
the vector potential $A_\mu$ with four degrees of freedom, whereas the photon
is endowed with two physical degrees of freedom. The procedure of gauge
fixing allows us to eliminate the undesirable components. However, in this process
we might lose the covariant character of the theory. On the other hand,
introducing a gauge fixing term like $\frac{1}{2}(\partial_\mu A^\mu)^2$ requires enlargement of the
Hilbert space, as is well-known. The covariant gauge fixed descriptions of
nonabelian gauge theory and string theory are best formulated in the framework
of the BRS formalism.
The constrained Hamiltonian dynamics due to Dirac provides a very pow-
erful and elegant technique to study theories with local symmetries and
to quantize them. It is well-known that such theories possess first class
constraints (they might have additional second class constraints). These con-
straints satisfy Poisson bracket algebra describing the underlying local symme-
tries. The gauge fixing conditions are introduced to eliminate the redundant
degrees of freedom of the theories. The set of gauge fixing conditions, to-
gether with the originally derived first class constraints, constitute a set
of second class constraints. Next, one may follow Dirac’s prescriptions to
quantize the gauge fixed theories. One of the most attractive features of
Dirac’s formulation is that we can keep account of the degrees of freedom
of the theory in the Hamiltonian phase space. Subsequently, the procedures
of canonical quantization may be followed. The BRS formalism is best
suited for quantization of such theories with covariant gauge fixing. It is cus-
tomary to start from the gauge invariant Lagrangian, supplement it with
a covariant gauge fixing term, and finally add the ghost part. The result-
ing new Lagrangian is no longer invariant under the local symmetry; how-
ever, it is invariant under the global BRS transformation. The BRS charge
is nilpotent and it annihilates all physical states. Furthermore, it commutes
with all operators corresponding to observables of the theory. The Batalin–
Fradkin–Vilkovisky (BFV) [8, 9, 10] Hamiltonian phase-space approach lays
down procedures to construct the Hamiltonian action, incorporating the
constrained Hamiltonian formalism of Dirac. Let $\{F_a,\ a = 1, \ldots, N\}$ be a set
of first class constraints identified for the given theory. They satisfy the
algebra
\[ \{F_a, F_b\}_{PB} = f_{ab}^{\;\;c} F_c . \tag{1} \]
In what follows, ≈ stands for weak equality, i.e. first the PBs are evaluated
and then the constraints are set to zero in all such computations.
Here $f_{ab}^{\;\;c}$ are the structure ‘constants’, which usually do not depend
on the phase-space variables; however, in general, they could carry
dependence on such variables. Indeed, they coincide with the structure
constants of the underlying Lie algebra for nonabelian gauge theories, whereas
in the case of the Einstein–Hilbert action they are derivatives of appropriate
δ-functions, as they appear in the algebra of the Hamiltonian and momentum
constraints of the theory. Furthermore, the set $\{F_a\}$ also satisfies
\[ \{H_C, F_a\}_{PB} \approx V_a^{\;b} F_b . \tag{2} \]
Here $H_C$ is the canonical Hamiltonian density derived from the gauge
invariant action and $V_a^{\;b}$ are constants determined for the theory under
investigation. Equation (2) conveys that the time evolution of a first class
constraint can be expressed as a linear combination of the first class
constraints.
BFV provide a prescription to construct the BRS charge
\[ Q = F_a \eta^a + \frac{1}{2} P^a f_{ab}^{\;\;c} \eta^b \eta_c , \tag{3} \]
where $P^a$, $\eta^a$ are sets of Grassmann odd ghosts (in quantum theory these are
anticommuting objects) satisfying the PB relation
\[ \{P^a, \eta_b\}_{PB} = \delta^a_b , \tag{4} \]
with an appropriate definition of PBs for such objects (note that there
will be a δ-function on the RHS for fields). Moreover, Q is nilpotent by
construction,
\[ \{Q, Q\}_{PB} = 0. \tag{5} \]

It is to be emphasized that (5) is a nontrivial statement for a fermionic charge
like Q. Furthermore, this condition imposes severe constraints on the
underlying Hilbert space of the corresponding quantum theory. The next step is to
construct the gauge fixed action. Recall that the PB of Q with $H_C$ vanishes
by construction, since the $F_a$ are first class. Then a Grassmann odd
object χ is introduced, which is a function of the fields, of their conjugate
momenta, and of the set of ghosts $\{P^a, \eta^a\}$. The gauge fixed Hamiltonian density
is constructed to be
\[ H_\chi = H_C + P^a V_a^{\;b} \eta_b - \{\chi, Q\}_{PB} . \tag{6} \]
Therefore, the choice of the gauge fixing function χ determines the effective
gauge fixed Hamiltonian density. Now the Hamiltonian action is
\[ S_H = \int d^d x \left( \dot\phi_i \pi_\phi^i - H_\chi \right) , \tag{7} \]
where $\phi_i$ are generic fields (gauge fields, scalars, fermions), and $\pi_\phi^i$ their
conjugate momenta. The expression (7) is for d-dimensional spacetime.
Let us consider the evolution of a closed string in a d-dimensional target
space. The string sweeps out a cylindrical worldsheet during its evolution.
The underlying action is required to be invariant under
the reparametrization of the worldsheet coordinates. The Polyakov action
is the most convenient form to describe and quantize open and closed
strings [11],
\[ S = -\frac{1}{2} \int d^2\sigma \sqrt{-\gamma}\, \gamma^{ab} \partial_a X^\mu \partial_b X^\nu \eta_{\mu\nu} , \tag{8} \]
where $\gamma_{ab}$ is the worldsheet metric, $\gamma^{ab}$ is its inverse, γ is the determinant of the
worldsheet metric and $\eta_{\mu\nu}$ is the flat metric of the target space. The variation
of the action with respect to $\gamma^{ab}$ results in the worldsheet energy–momentum
tensor,
\[ T_{ab} = \partial_a X^\mu \partial_b X^\nu \eta_{\mu\nu} - \frac{1}{2} \gamma_{ab} \gamma^{cd} \partial_c X^\mu \partial_d X^\nu \eta_{\mu\nu} . \tag{9} \]
Note that $T_{ab} = 0$, since there is no kinetic term for the worldsheet metric:
the analogue of the Einstein–Hilbert piece, $\int d^2\sigma \sqrt{-\gamma}\, R^{(2)}$, is a topological
term. We can solve for $\gamma_{ab}$ from the above equation. If we insert the resulting
expression for the worldsheet metric into the Polyakov action, then we recover
the string action as proposed by Nambu and Goto. An important point to note
is that the equivalence between the Polyakov and the Nambu–Goto actions
holds when the equation of motion for the worldsheet metric is utilized
in (8).
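
The step from the Polyakov to the Nambu–Goto form can be sketched as follows. The symbol $h_{ab}$ for the induced metric and the positive function $f(\sigma,\tau)$ are notation introduced here for illustration; the normalization is that of (8), with the string tension absorbed.

```latex
\begin{align*}
% induced metric on the worldsheet (notation introduced for this sketch)
h_{ab} &\equiv \partial_a X^\mu \partial_b X^\nu \eta_{\mu\nu}, \\
% T_{ab} = 0 from (9) fixes \gamma_{ab} only up to a local rescaling f > 0
T_{ab} = 0 \;&\Longrightarrow\; h_{ab} = \tfrac{1}{2}\gamma_{ab}\,\gamma^{cd} h_{cd}
   \;\Longrightarrow\; \gamma_{ab} = f\, h_{ab}, \\
% in two dimensions \det\gamma = f^2 \det h
\sqrt{-\gamma} &= f \sqrt{-h}, \qquad \gamma^{ab} h_{ab} = \frac{1}{f}\, h^{ab} h_{ab} = \frac{2}{f}, \\
% the arbitrary factor f drops out, reflecting Weyl invariance
S &= -\frac{1}{2}\int d^2\sigma\, \sqrt{-\gamma}\,\gamma^{ab} h_{ab}
   = -\int d^2\sigma\, \sqrt{-\det h_{ab}} .
\end{align*}
```

The last expression is the area of the worldsheet in the induced metric, i.e. the Nambu–Goto form.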
The action (8) has the following symmetry properties.

(a) Two-dimensional reparametrization invariance,
\[ \delta\gamma_{ab} = \xi^c \partial_c \gamma_{ab} + \partial_a \xi^c \gamma_{bc} + \partial_b \xi^c \gamma_{ac} , \tag{10} \]
and hence $\delta\sqrt{-\gamma} = \partial_a (\xi^a \sqrt{-\gamma})$. The string coordinate transforms as
\[ \delta X^\mu = \xi^a \partial_a X^\mu . \tag{11} \]
(b) Weyl invariance,
\[ \delta\gamma_{ab} = 2\Omega\gamma_{ab} , \qquad \delta X^\mu = 0 , \tag{12} \]
where the parameter Ω depends on the worldsheet variables.
(c) Poincaré invariance (in target space),
\[ \delta X^\mu = \omega^\mu{}_\nu X^\nu + a^\mu , \qquad \delta\gamma_{ab} = 0 , \tag{13} \]
where $\omega_{\mu\nu}$ are antisymmetric parameters associated with the Lorentz
transformation and $a^\mu$ are the parameters of translation.
Note that Weyl invariance implies tracelessness of the two-dimensional
energy–momentum tensor in the classical theory. The preservation of
this symmetry at the quantum level has far-reaching consequences in string theory.
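
The tracelessness can be checked directly from the energy–momentum tensor (9). In this short sketch $h_{ab} \equiv \partial_a X^\mu \partial_b X^\nu \eta_{\mu\nu}$ is shorthand introduced here.

```latex
\begin{align*}
% trace of (9) with the worldsheet metric; \gamma^{ab}\gamma_{ab} = 2 in two dimensions
\gamma^{ab} T_{ab}
  &= \gamma^{ab} h_{ab} - \frac{1}{2}\left(\gamma^{ab}\gamma_{ab}\right)\gamma^{cd} h_{cd} \\
  &= \gamma^{ab} h_{ab} - \gamma^{cd} h_{cd} = 0 .
\end{align*}
```

The trace vanishes identically, without use of the equations of motion, because the worldsheet is two-dimensional.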
If we make the orthonormal (ON) gauge choice for the worldsheet metric, $\gamma_{ab} =
e^{2\Omega(\sigma,\tau)} \eta_{ab}$ with $\eta_{ab} = \mathrm{diag}(-1, +1)$, the form of the Polyakov action simplifies,
since $\sqrt{-\gamma}\,\gamma^{ab} = \eta^{ab}$ in this gauge. The condition of the vanishing of $T_{ab}$
reduces to two constraints,
\[ (\dot X \pm X')^2 = 0. \tag{14} \]
These are the Virasoro constraints. They take the following form in the
Hamiltonian formalism:
\[ P_\mu X'^\mu = 0, \qquad H = \frac{1}{2}(P^2 + X'^2) = 0, \tag{15} \]
where $P_\mu$ is the momentum conjugate to $X^\mu$ derived from the Polyakov
action. It is easy to check that the first constraint generates σ translations
on the worldsheet, whereas the latter, being the canonical Hamiltonian,
generates τ translations. It is more convenient to define the constraints
$L_\pm = \frac{1}{4}(P^\mu \pm X'^\mu)\eta_{\mu\nu}(P^\nu \pm X'^\nu)$, whose ‘equal time’ algebra takes an elegant
form,
\[ \{L_\pm(\sigma), L_\pm(\sigma')\}_{PB} = \pm \left[ L_\pm(\sigma) + L_\pm(\sigma') \right] \partial_\sigma \delta(\sigma - \sigma') \tag{16} \]
and
\[ \{L_\pm(\sigma), L_\mp(\sigma')\}_{PB} = 0. \tag{17} \]
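
The relation between $L_\pm$ and the constraints (14) and (15) can be made explicit. In this sketch a prime denotes $\partial_\sigma$, and the identification $P_\mu = \dot X_\mu$ is an assumption that holds in the ON gauge with the normalization of (8).

```latex
\begin{align*}
% expand the definition of L_\pm
L_\pm &= \frac{1}{4}\left( P^2 \pm 2\, P_\mu X'^\mu + X'^2 \right), \\
% sum and difference reproduce the two constraints in (15)
L_+ + L_- &= \frac{1}{2}\left( P^2 + X'^2 \right) = H, \qquad
L_+ - L_- = P_\mu X'^\mu, \\
% with P_\mu = \dot X_\mu (ON gauge) this is the Lagrangian form (14)
L_\pm &= \frac{1}{4}\left( \dot X \pm X' \right)^2 .
\end{align*}
```

Thus $L_+ = L_- = 0$ is equivalent to the pair of constraints in (15), and $L_+ - L_-$ and $L_+ + L_-$ generate σ and τ translations, respectively.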
It is obvious from the constraint algebra (16) and (17) that the $L_\pm$ represent
a pair of first class constraints. The classical BRS charge is
\[ Q = \int d\sigma \left( L_+ \eta_+ + L_- \eta_- + P_+ \eta_+ \eta_+' - P_- \eta_- \eta_-' \right) . \tag{18} \]
The gauge fixed Hamiltonian density is
\[ H_\chi = \{\chi, Q\}. \tag{19} \]
Note that χ is the gauge fixing function; for the ON gauge choice $\chi = P_+ + P_-$.
As discussed above, the ON gauge choice corresponds to $H = \frac{1}{2}(P^2 + X'^2) =
L_+ + L_-$ involving string coordinates. Therefore, this choice of χ gives us the
full ON gauge Hamiltonian,
\[ H_{ON} = L_+ + L_- + 2P_+ \eta_+' + P_+' \eta_+ - 2P_- \eta_-' - P_-' \eta_- . \tag{20} \]
The classical BRS charge Q is nilpotent by construction, i.e. $\{Q, Q\}_{PB} = 0$.
The quantum BRS charge is defined with a normal ordering prescription. The
string coordinate $X^\mu$, its canonical momentum $P_\mu$ and the ghost fields $P_\pm$, $\eta_\pm$
are expanded in Fourier series with creation and annihilation operators. The
quantum constraint $\hat Q^2 = 0$ implies the number of spacetime dimensions,
D = 26, and the ‘intercept’ $\alpha_0 = 2$ (we are discussing the closed bosonic string)
[12]. We may remind the reader that the massless excitations of a closed
bosonic string contain a scalar Φ, called the dilaton, a symmetric tensor $G_{\mu\nu}$,
identified with the graviton, and an antisymmetric tensor $B_{\mu\nu}$. One of our
principal goals is to unveil the target space symmetries of the theory. The first
step in this direction is to adopt the first quantized framework and envisage
the evolution of the string in the background of these massless excitations,
whose worldsheet action takes the form
\[ S = -\frac{1}{2} \int d^2\sigma \left[ \sqrt{-\gamma}\,\gamma^{ab} \partial_a X^\mu \partial_b X^\nu G_{\mu\nu}(X) + \epsilon^{ab} \partial_a X^\mu \partial_b X^\nu B_{\mu\nu}(X) \right] . \tag{21} \]
The generators
\[ L_\pm = \frac{1}{4} (P_\mu \pm X'^\rho G_{\mu\rho} + X'^\rho B_{\mu\rho}) G^{\mu\nu} (P_\nu \pm X'^\lambda G_{\nu\lambda} + X'^\lambda B_{\nu\lambda}) \tag{22} \]
satisfy the same Poisson bracket algebra as in (16) and (17). Next, the
generating functional can be formally defined using the phase-space path-integral
formalism in order to implement BRS quantization,
\[ \Sigma[G, B] = \int d[X^\mu]\, d[P_\mu]\, d[\eta_\pm]\, d[P_\pm] \exp\left[ i \int d\sigma \left( \dot X^\mu P_\mu + P_\pm \dot\eta_\pm - H_\chi \right) \right] . \tag{23} \]
Here $H_\chi = L_+ + L_- + 2P_+ \eta_+' + P_+' \eta_+ - 2P_- \eta_-' - P_-' \eta_-$ is the gauge
fixed Hamiltonian density for the case at hand. In the context of the BRS
quantization, the following remarks are to be borne in mind. The nilpotency
of the quantum BRS charge $\hat Q$ imposes stringent constraints on the
admissible backgrounds in the form of differential equations. These are precisely
the β-function equations computed while adopting the conformal field theory
techniques.
The generating functional (23) was introduced by Fradkin and Tseytlin
[2] in order to study string dynamics in the presence of nontrivial background
ﬁelds in their Hamiltonian path-integral approach. Notice that Σ plays the
role of generating functional for S-matrix elements in the following sense. Let
us collectively denote the backgrounds as

B(X(σ)) = (φ(x), Gμν (x), Bμν (x), M ), (24)

where M collectively stands for massive stringy states such as the tachyon
T and higher excited levels. If we consider the worldsheet action in the
presence of B, the resulting σ-model action is required to be conformally
invariant. In the simplest case we consider massless backgrounds $B =
(\phi(x), G_{\mu\nu}(x), B_{\mu\nu}(x), M = 0)$, such that B fluctuates around a
trivial vacuum configuration,
\[ B = B_0 + \tilde B, \tag{25} \]
where $B_0 = (\mathrm{const}, \eta_{\mu\nu}, \mathrm{const})$ and $\tilde B = (e^{i k_\phi \cdot x}, \alpha_{\mu\nu} e^{i k \cdot x}, \ldots)$, where $\alpha_{\mu\nu}$
is identified with the polarization tensor of the graviton. The fluctuating fields
are required to satisfy the vanishing ‘β-function’ conditions ($k_\phi^2 = 0$, $k^2 = 0$,
$k^\mu \alpha_{\mu\nu} = 0$, ...). If we have to introduce the tachyon background, the
corresponding constraint is $\tilde B_T = e^{i k_T \cdot x}$, $k_T^2 = 4/\alpha'$. Note that for such trivial
backgrounds Σ generates the S-matrix elements for the scattering of those states. It
is hoped that we shall be able to obtain the S-matrix elements for the
massless excitations even when nontrivial vacuum configurations are envisaged.
Furthermore, it is expected that the S-matrix elements can be derived also
when higher level massive modes are included in the corresponding σ-model
action, and consequently Σ is treated as a functional of (φ, G, B, M, ...).
Our starting point is to construct Σ in a formal sense, include the ghost
fields, and define it through the phase-space path integral. We introduce a
set of generating functionals for canonical transformation which will play an
important role in exhibiting the local symmetries associated with the massless
states of the string in the target space. Indeed, when one constructs the string
effective action in target space starting from the β-function equations, the
effective action is invariant under local target space symmetries. However, it
is not obvious how the two-dimensional σ-model encodes the local
symmetries of the target space. Note that we have not introduced the string
coupling to the dilaton background in (21). The additional term is
\[ \frac{1}{4\pi} \int d^2\sigma \sqrt{-\gamma}\, R^{(2)} \Phi(X). \tag{26} \]

Here $R^{(2)}$ is the scalar curvature of the two-dimensional worldsheet. The
dilaton coupling (26) is not conformally invariant. However, we demand that the
sum of (21) and (26) be conformally invariant, which is equivalent to the
vanishing of the β-functions [3]. Consequently, one obtains three
sets of coupled differential equations involving the backgrounds $G_{\mu\nu}$, $B_{\mu\nu}$ and Φ
(the total number of equations being $d^2$). In the context of the BRS Hamiltonian
formalism, the coupling of a string to the dilaton background needs careful
discussion, and we shall return to this issue later.
It is well-known that massless gauge bosons appear in the spectrum of
closed strings when some of the spatial dimensions are compactified [13]; of
special interest are the toroidally compactified coordinates. Indeed, the hope
of unifying all fundamental forces in string theory gained strong support with
the construction of heterotic string theory [14]. This ten-dimensional theory
not only contained the spectrum of N = 1 supergravity, but also admitted
nonabelian gauge theories with SO(32) or E8 ×E8 groups. We may recall that
the seminal work of Green and Schwarz [15], which generated the so-called
1984 superstring revolution, had shown that these were the only two admissi-
ble gauge groups in order to satisfy anomaly-free conditions for theories in ten
dimensions. Therefore, when we examine the presence of local symmetries for
a string in its massless backgrounds, we are expected to address the relevant
issues for compactified string coordinates.
Let us consider the case where d spatial coordinates are compactified (we
focus on a bosonic string in critical dimensions D = 26). These coordinates
are denoted by X I , I = 1, 2, ..d, and for a closed string (in absence of nontriv-
ial backgrounds) are separated into left and right movers. Let us recall that
there will be d Kaluza–Klein gauge bosons if we were to consider a higher-
dimensional theory of gravity with compact coordinates. Our goal is to couple
the string to the nonabelian gauge boson background. It is more convenient
to fermionize the compact bosonic coordinates [16]. Essentially, each compact
bosonic coordinate corresponds to two fermionic degrees of freedom. Thus for
a set of {X I }, I = 1, 2..d bosonic coordinates, one introduces ψi , i = 1, 2, ..2d
two-dimensional Majorana fermions. Therefore, for a free closed string (in
critical D = 26 dimensions) with 16 compact coordinates, there are 32 left
moving and 32 right moving chiral fermions. Moreover, the nonabelian gauge
group is SO(32)L × SO(32)R , or two E8 × E8 gauge groups when appropriate
boundary conditions are imposed on the worldsheet fermions (corresponding
to fermionic representations of the compactified bosonic coordinates). For the
sake of simplicity, let us couple the right-handed sector to the corresponding
background gauge fields. The worldsheet action is

\[ S = -\frac{1}{2} \int d^2\sigma \Big[ \sqrt{-\gamma}\,\gamma^{ab} \partial_a X^\mu \partial_b X^\nu G_{\mu\nu} + \epsilon^{ab} \partial_a X^\mu \partial_b X^\nu B_{\mu\nu} + i e\, \psi_L^i e_-^a \partial_a \psi_L^i + i e\, \psi_R^i \left( e_\alpha^a \partial_a \delta_{ij} + \Gamma_{ij}^M A_\mu^M \partial_a X^\mu \right) \psi_R^j \Big] , \tag{27} \]
where $e_\alpha^a$ is the zweibein associated with the worldsheet and e is its
determinant; $\Gamma_{ij}^M$ are the generators of the gauge group SO(32) for d = 16 in the
fundamental representation to which $\psi_{L,R}^i$ belong. The above action (27) is
conformally invariant at the classical level. The two constraints, $L_\pm$, can be
obtained after a somewhat lengthy calculation:

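$$
0 = L_+ - L_- = P_\mu X^{\prime\mu} + \Pi_R^i \partial_1 \psi_R^i + \Pi_L^i \partial_1 \psi_L^i, \tag{28}
$$
and
$$
0 = L_+ + L_- = \frac{1}{2}\tilde P_\mu \tilde P_\nu G^{\mu\nu}(X) + \frac{1}{2} X^{\prime\mu} X^{\prime\nu} G_{\mu\nu}(X)
- \Pi_L^i \partial_1 \psi_L^i + \Pi_R^i \partial_1 \psi_R^i
- \Pi_R^i T^M_{ij} A^M_\mu(X) \psi_R^j X^{\prime\mu}, \tag{29}
$$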
where the conjugate momenta of the chiral worldsheet fermions are
$\Pi^i_{L,R} = \frac{1}{2} i e\,\psi^i_{L,R}$ and
$$
\tilde P_\mu = P_\mu + X^{\prime\lambda} B_{\mu\lambda} + \Pi_R^i T^M_{ij} A^M_\mu \psi_R^j. \tag{30}
$$
It is straightforward to show that the classical brackets
$\{L_+, L_+\}_{PB}$, $\{L_-, L_-\}_{PB}$ and $\{L_+, L_-\}_{PB}$ take the same form as in (16) and
(17). The BRS charge and the Hamiltonian can be obtained for the case under
study (i.e. with backgrounds G, B and A) following the prescriptions described
above for the case where the string evolves in the presence of the graviton
and of the antisymmetric tensor ﬁelds. Notice that the massless sector, for a
string with compact coordinates, also contains scalars belonging to the adjoint
representations of the (left × right) gauge groups. The corresponding action
in the presence of these states can be written down easily.

3 Canonical Transformations and Invariance Properties of Σ

We show that a chosen set of canonical transformations on the phase-space
path integral (23) brings out the local symmetries of the generating func-
tional [5], Σ, under transformations of the background ﬁelds, Gμν , Bμν and
AMμ . Note, however, that this will be demonstrated at a formal level. We are
aware that careful computations might lead to anomalies. We shall present an
example where such an anomaly can be explicitly computed. In what follows,
we shall focus on exhibiting the underlying local symmetries. First, we shall
deal with the case of a noncompact string in backgrounds of G and B, and then
take up the case of a string with compact coordinates.
(i) String with noncompact coordinates:
Let ΦF be a generator of canonical transformation in the phase space such that
$\delta z = \{z, \Phi_F\}_{PB}$, where $z$ collectively stands for the phase-space variables $X^\mu$
and $P_\mu$. We introduce a generator

$$
\Phi_G = \int d\sigma\, P_\mu \xi^\mu(X). \tag{31}
$$

The transformations induced on z are

$$
\delta_G X^\mu = \xi^\mu(X), \quad \delta_G P_\mu = -P_\nu\, \xi^\nu{}_{,\mu}(X), \quad \text{and} \quad \delta_G(\eta, \mathcal{P}) = 0. \tag{32}
$$

Here the comma stands for ordinary derivative. The backgrounds, Gμν (X)
and Bμν (X), are functions of X μ and therefore, under ΦG , the coordinates
shift according to the rules (32) leading to

$$
\delta_G G_{\mu\nu} = G_{\mu\nu,\lambda}\, \xi^\lambda, \quad \delta_G B_{\mu\nu} = B_{\mu\nu,\lambda}\, \xi^\lambda. \tag{33}
$$
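
The rules (32) can be checked directly from the Poisson bracket with the generator (31). The following is a minimal sketch, assuming the canonical equal-time bracket $\{X^\mu(\sigma), P_\nu(\sigma')\}_{PB} = \delta^\mu_\nu\,\delta(\sigma-\sigma')$; this normalization is an assumption, since it is not written out in this section.

```latex
\begin{align*}
\delta_G X^\mu(\sigma) &= \{X^\mu(\sigma), \Phi_G\}_{PB}
  = \int d\sigma'\, \{X^\mu(\sigma), P_\nu(\sigma')\}_{PB}\, \xi^\nu(X(\sigma'))
  = \xi^\mu(X(\sigma)), \\
\delta_G P_\mu(\sigma) &= \{P_\mu(\sigma), \Phi_G\}_{PB}
  = -\frac{\delta \Phi_G}{\delta X^\mu(\sigma)}
  = -P_\nu(\sigma)\, \xi^\nu{}_{,\mu}(X(\sigma)).
\end{align*}
```

Since $\Phi_G$ contains no ghost variables, their brackets with it vanish, which is the last statement in (32). The rule (33) is then simply the chain rule applied to a background evaluated at the shifted coordinate.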

The Hamiltonian action

$$
S_H = \int d^2\sigma\, \big[ \dot X^\mu P_\mu + \mathcal{P}_\pm \dot\eta_\pm - H_{ON} \big] \tag{34}
$$

exhibits the property

$$
\delta_G S_H = -\delta^{GCT} S_H. \tag{35}
$$
The LHS of the above equation is well defined from the rules described above,
while $\delta^{GCT}$ is to be interpreted as follows: noting that $G_{\mu\nu}$ and $B_{\mu\nu}$ are second
rank tensors in the target space, their transformation rules under general
coordinate transformations (GCT) are dictated by their tensorial properties,
namely
$$
\delta^{GCT} G_{\mu\nu} = -G_{\mu\lambda}\, \xi^\lambda{}_{,\nu} - G_{\nu\lambda}\, \xi^\lambda{}_{,\mu} - G_{\mu\nu,\lambda}\, \xi^\lambda. \tag{36}
$$
A similar expression follows for δ GCT Bμν . If we introduce another generator
of canonical transformation,

$$
\Phi_B = \int d\sigma\, X^{\prime\mu} \Lambda_\mu(X), \tag{37}
$$

then the variation of the phase-space variables, and consequently the vari-
ations of the background, can be computed according to the prescriptions
already given. The analogous transformation property of SH is

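$$
\delta_B S_H = -\delta^{B\text{-gauge}} S_H. \tag{38}
$$

We define the B-gauge transformation to be

$$
\delta^{B\text{-gauge}} B_{\mu\nu} = \partial_\mu \Lambda_\nu(X) - \partial_\nu \Lambda_\mu(X). \tag{39}
$$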
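
The way $\Phi_B$ induces the B-gauge transformation can be sketched as follows. The computation assumes that the generator reads $\Phi_B = \int d\sigma\, X^{\prime\mu}\Lambda_\mu(X)$ with a $\sigma$-derivative on $X^\mu$, that the string is closed (so the boundary term from the integration by parts drops out), and that the canonical bracket is $\{X^\mu(\sigma), P_\nu(\sigma')\}_{PB} = \delta^\mu_\nu\,\delta(\sigma-\sigma')$.

```latex
\begin{align*}
\delta_B X^\mu(\sigma) &= \{X^\mu(\sigma), \Phi_B\}_{PB} = 0
  \qquad (\Phi_B \text{ contains no momenta}), \\
\delta_B P_\mu(\sigma) &= -\frac{\delta \Phi_B}{\delta X^\mu(\sigma)}
  = -X^{\prime\nu}\,\partial_\mu \Lambda_\nu + \partial_\sigma \Lambda_\mu
  = -X^{\prime\nu}\big(\partial_\mu \Lambda_\nu - \partial_\nu \Lambda_\mu\big).
\end{align*}
```

Under these assumptions the momentum shifts by $-X^{\prime\nu}$ times the combination appearing in (39). Because the momentum and $B_{\mu\nu}$ enter the constraints only through $P_\mu + X^{\prime\lambda}B_{\mu\lambda}$, the canonical transformation has the same effect as the opposite B-gauge shift of the background, in line with the sign in (38).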

Note that the target space metric $G_{\mu\nu}$ is not affected by the $\delta^{B\text{-gauge}}$
transformation. We assume that the phase-space measure remains invariant under the
canonical transformations induced by the generators $\Phi_G$ and $\Phi_B$. We arrive at
the following conclusion, after taking into account the relations (35) and (38):

$$
\Sigma(G, B) = \Sigma\big(G + \delta^{GCT} G,\; B + \delta^{GCT} B + \delta^{B\text{-gauge}} B\big). \tag{40}
$$

(ii) Compact case:

The massless sector of a closed bosonic string, with compact coordinates,
consists of nonabelian gauge fields in addition to G and B (we ignore the
coupling of the string to the massless scalars arising due to compactification).
The generator of canonical transformations, which exposes the nonabelian
gauge symmetry, turns out to be

$$
\Phi_A = \int d\sigma\, \psi_R^i \psi_R^j\, T^M_{ij}\, \xi^M(X). \tag{41}
$$

The corresponding Hamiltonian action satisﬁes the relation

$$
\delta_A S_H = -\frac{i}{2}\, \delta^{gauge} S_H. \tag{42}
$$
Here $\delta^{gauge}$ is the nonabelian gauge transformation acting on $A^M_\mu(X)$
such that
$$
\delta^{gauge} A^M_\mu = \xi^M{}_{,\mu}(X) + f^{MNP} A^N_\mu\, \xi^P(X), \tag{43}
$$
where $f^{MNP}$ are the structure constants defined by the relation

$$
\big[T^M, T^N\big] = i f^{MNP}\, T^P. \tag{44}
$$

Taking into account (42), and repeating the arguments for the non-compact
string case, we conclude that

Σ(G, B, A) = Σ(G, B, A + δ gauge A). (45)

We may remind the reader that the relations (40) and (45) hold modulo the
anomalies alluded to earlier. We shall brieﬂy address this issue at the end of
this section.
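The invariance properties of the generating functional allow us to derive,
in an elegant manner, Ward identities for the S-matrix elements, since these
are generated by Σ. Let us first focus on GCT. We arrive at
$$
0 = \delta^{GCT}\Sigma = \Big\langle \int d^D x \Big[ \frac{\delta S_H}{\delta G_{\mu\nu}}\, \delta^{GCT} G_{\mu\nu} + \frac{\delta S_H}{\delta B_{\mu\nu}}\, \delta^{GCT} B_{\mu\nu} \Big] \Big\rangle_{G,B}. \tag{46}
$$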

The symbol $\langle\;\rangle_{G,B}$ stands for an average in the sense of functional integration
in the Hamiltonian phase space with the weight factor $\exp(iS_H(G, B))$. The
backgrounds $G_{\mu\nu}$ and $B_{\mu\nu}$ are finally set to those configurations enforced by
the β-function equations. The generic form of $S_H$ is

$$
S_H = \int d^2\sigma\, L\big(X, P, \eta, \mathcal{P}, G(X(\sigma)), B(X(\sigma))\big). \tag{47}
$$

Therefore, the functional derivative of $S_H$ with respect to a generic
background gives us the corresponding vertex operator. As an illustrative example
consider the variation of $S_H$ with respect to $G_{\mu\nu}$:

$$
\frac{\delta S_H}{\delta G_{\mu\nu}(x)} = \int d^2\sigma\, \delta(x - X(\sigma))\, \frac{\delta L_H}{\delta G_{\mu\nu}(X)} = \int d^2\sigma\, \delta(x - X(\sigma))\, V_G^{\mu\nu}(X). \tag{48}
$$

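We can similarly obtain the corresponding vertex operator associated with
$B_{\mu\nu}$. We define $V_G^{\mu\nu}$ and $V_B^{\mu\nu}$ as the graviton and antisymmetric tensor vertex
operators in a background. These operators can be obtained explicitly as

$$
V_G^{\mu\nu} = \frac{1}{2} X^{\prime\mu} X^{\prime\nu} - \frac{1}{2} G^{\mu\rho} G^{\nu\lambda} P_\rho P_\lambda - P_\rho G^{\rho\mu} G^{\nu\sigma} B_{\sigma\tau} X^{\prime\tau} - \frac{1}{2} X^{\prime\rho} B_{\alpha\rho} B_{\beta\sigma} G^{\mu\alpha} G^{\nu\beta} \tag{49}
$$
and
$$
2 V_B^{\mu\nu} = P_\rho G^{\rho\mu} X^{\prime\nu} + X^{\prime\nu} B_{\rho\sigma} X^{\prime\sigma} G^{\rho\mu} - (\mu \leftrightarrow \nu). \tag{50}
$$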

We can use in (46) the expression (36) for δ GCT Gμν , and the deﬁnition of
vertex operator (49), to arrive at
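$$
0 = \Big\langle \int d^2\sigma \Big\{ V_G^{\mu\nu} \big[ G_{\mu\lambda}(X)\, \xi^\lambda{}_{,\nu}(X) + G_{\nu\lambda}(X)\, \xi^\lambda{}_{,\mu}(X) + G_{\mu\nu,\lambda}(X)\, \xi^\lambda(X) \big]
+ V_B^{\mu\nu}(X) \big[ B_{\mu\lambda}(X)\, \xi^\lambda{}_{,\nu}(X) - B_{\nu\lambda}(X)\, \xi^\lambda{}_{,\mu}(X) + B_{\mu\nu,\lambda}(X)\, \xi^\lambda(X) \big] \Big\} \Big\rangle_{G,B}. \tag{51}
$$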

Now we are in a position to derive the desired Ward identities (WI) for
processes with multigravitons and antisymmetric tensor field. Notice that (51)
holds for arbitrary infinitesimal parameters ξ α (X). Therefore, we are per-
mitted to take the functional derivative of the right-hand side of (51) with
respect to ξ α (X) at ξ α = 0. Furthermore, we can take an arbitrary num-
ber of derivatives of the above equation with respect to Gμν (Y ) and Bμν (Z)
at the ground state values of Gμν and Bμν (i.e. backgrounds corresponding
to a string vacuum configuration). As an illustrative example, let us envis-
age the case of a closed bosonic string in critical dimensions, D = 26, in
which the background metric describes the Minkowski space. If we consider
the n-graviton amplitude (ignore the B-field, for the moment), then (51) can
be expressed as

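$$
\frac{\delta^n}{\delta G_{\mu_1\nu_1}(y_1) \cdots \delta G_{\mu_n\nu_n}(y_n)}
\Big\langle \int d^2\sigma\, V_G^{\mu\nu} \big[ G_{\mu\lambda}\, \partial_\nu \delta(x - X) + (\mu \leftrightarrow \nu) + G_{\mu\nu,\lambda}\, \delta(x - X) \big] \Big\rangle = 0. \tag{52}
$$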

This form of WI is familiar from multigraviton scattering amplitudes. Let us
examine (52) a little more carefully. The chain of functional derivatives, applied to
the expression for the path-integral average, acts in the following three ways: the
derivative with respect to the metric $G_{\mu_i\nu_i}(y_i)$ can act (i) on the vertex operator
$V_G^{\mu\nu}$, or (ii) on $G_{\mu\lambda}$, $G_{\nu\lambda}$, $G_{\mu\nu,\lambda}$, or (iii) on the Hamiltonian action $S_H$, which
is hidden in the definition of the path-integral average $\langle \ldots \rangle_G$ itself. The
action of the G-functional derivatives in cases (i) and (ii) removes the corresponding
metric factor and produces terms of the $\delta(y_i - X)$ type, which are the contact terms.
On the other hand, each G-functional derivative acting on $S_H$ produces an
additional $V_G$. Note that the WI presented above is in the x-space representation.
If we Fourier transform (52), the familiar form of the WI can be recovered:
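$$
\Big\langle \prod_{i=1}^{n} \int d^2\sigma_i\, V_G^{\mu_i\nu_i}(X(\sigma_i))\, e^{i k_i X(\sigma_i)} \int d^2\sigma\, 2 k_\nu V_G^{\lambda\nu}(X(\sigma))\, e^{i k X(\sigma)} \Big\rangle_{G=\eta}
= \sum_{i=1}^{n} k_i^\lambda \Big\langle \int d^2\sigma_i\, V_G^{\mu_i\nu_i}(X(\sigma_i))\, e^{i(k + k_i) X(\sigma_i)}
\prod_{j \neq i} \int d^2\sigma_j\, V_G^{\mu_j\nu_j}(X(\sigma_j))\, e^{i k_j X(\sigma_j)} \Big\rangle + (\text{other contact terms}). \tag{53}
$$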

The above equation tells us that the divergence of an (n + 1)-graviton amplitude
is related to the sum of lower-point S-matrix elements. It is to be kept in
mind that there could be potential anomaly terms in the WI. We arrived at
these results through formal manipulations. We expect the WI to hold for
backgrounds which are consistent with BRS invariance. Furthermore, the
fluctuations should correspond to on-shell external particles. Naturally, if we want
to derive WI for off-shell Green functions, there could be additional terms in
the RHS of the above equation.
We can extend the aforementioned arguments to derive Ward identities
for amplitudes with gravitons, antisymmetric tensor fields, and nonabelian
gauge bosons, in the presence of generic backgrounds compatible with BRS
invariance. The vertex operator for the gauge bosons is
$$
V_A^{\mu M} = \frac{\delta S_H}{\delta A^M_\mu} = \psi_R^i\, T^M_{ij}\, \psi_R^j \big( G^{\mu\nu} \tilde P_\nu - X^{\prime\mu} \big), \tag{54}
$$

and P̃μ is given by (30). Note that Gμν and Bμν are all buried in the expression
for P̃μ derived from an action which describes the evolution of the compactiﬁed
closed bosonic string in a background with the graviton, the antisymmetric
tensor ﬁeld and a nonabelian gauge boson. The corresponding WI can then
be derived from the basic equation
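$$
\Big\langle \int d^2\sigma\, V_A^{\mu M}(X(\sigma)) \big[ \partial_\mu \delta(X - x)\, \delta^{MP} + f^{NMP} A^N_\mu(X(\sigma))\, \delta(X - x) \big] \Big\rangle_{G,B,A} = 0. \tag{55}
$$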

We briefly discuss the coupling of a string to the dilaton background in
the context of the BRS Hamiltonian formalism. The well-known coupling
$\sim \int d^2\sigma \sqrt{-\gamma}\, R^{(2)} \Phi(X)$ is to be converted into a term involving ghost fields. This
issue has been addressed in the Lagrangian BRS approach in [17]. We noticed
that an additional piece contributes to the gauge-fixed Hamiltonian density
involving ghosts and the dilaton background,
$$
H_D = \frac{8}{3}\, \mathcal{P}_+ \eta_+ \big(P_\mu + X^{\prime\rho} B_{\mu\rho} - X^{\prime\rho} G_{\mu\rho}\big) G^{\mu\nu} \partial_\nu \Phi(X)
+ \frac{8}{3}\, \mathcal{P}_- \eta_- \big(P_\mu + X^{\prime\rho} B_{\mu\rho} + X^{\prime\rho} G_{\mu\rho}\big) G^{\mu\nu} \partial_\nu \Phi(X)
+ \frac{64}{9}\, \mathcal{P}_+ \eta_+ \mathcal{P}_- \eta_-\, \partial_\mu \Phi\, G^{\mu\nu} \partial_\nu \Phi. \tag{56}
$$
9
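The total Hamiltonian density is $H = H_{ON} + H_D$. The phase-space variables
transform as follows under the canonical transformation induced by the
generator $\Phi_{ghost} = \int d^2\sigma\, (\mathcal{P}_+ \eta_+ + \mathcal{P}_- \eta_-)\, \epsilon(X)$:

$$
\delta\eta_\pm = \epsilon(X)\, \eta_\pm, \quad \delta\mathcal{P}_\pm = -\epsilon(X)\, \mathcal{P}_\pm, \quad \delta X^\mu = 0, \quad
\delta P_\mu = -(\partial_\mu \epsilon)(\mathcal{P}_+ \eta_+ + \mathcal{P}_- \eta_-). \tag{57}
$$

These variations of the phase-space variables induce a change in the Hamiltonian
action, which is equivalent to shifting the dilaton by a parameter $\epsilon$,
i.e.
$$
\delta_{ghost} \Phi = -\frac{1}{8}\, \epsilon(X). \tag{58}
$$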
8
Therefore, one might naively argue that Φ(X) can be rotated away by a field
redefinition. However, it is well-known that such classical arguments are in-
valid when quantum–mechanical considerations are invoked.
Let us focus our attention on a simple scenario when the left moving sector
of a closed bosonic string is compactified on a d-dimensional torus. We may
choose d such that it corresponds to an even self-dual lattice, if we desire. If we
fermionize the compact coordinates, we must introduce 2d chiral fermions,
and the massless sector will have nonabelian gauge bosons $A^M_\mu$. The generator
of canonical transformation given by (41) is instrumental for deriving
the WI associated with gauge invariance. We consider backgrounds where
$G_{\mu\nu} = \eta_{\mu\nu}$ and $B_{\mu\nu} = 0$, and we are interested in investigating whether the
canonical transformation introduces anomalies, i.e. whether the phase-space measure
remains invariant or not [18]. We must deal carefully with the transformed
two-dimensional chiral fermion measure, which is essential in the definition of
Σ(A). The anomaly, if any, will arise from the noninvariance of this measure
[19] under the transformations induced by (41). We start from the action

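$$
S = -\frac{1}{2} \int d^2\sigma\, \big( \partial_\alpha X^\mu \partial^\alpha X^\nu \eta_{\mu\nu} + i \bar\psi^i \rho^\alpha \partial_\alpha \psi^i \big), \tag{59}
$$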

where $\psi^i$, i = 1, ..., 2d, are the two-dimensional Majorana fermions on the
worldsheet, and $\rho^\alpha$ are the two-dimensional ‘γ’-matrices. Here we have already
adopted the orthonormal gauge. Let us couple only the left-moving fermions
to the corresponding gauge field background, so that the action is schematically
written as
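$$
S = -\frac{1}{2} \int d^2\sigma \Big[ \partial_\alpha X^\mu \partial^\alpha X_\mu + \bar\psi^i \rho^\alpha \Big( i \partial_\alpha \delta_{ij} + \frac{(1 - \rho_5)}{2}\, \partial_\alpha X^\mu\, T^M_{ij} A^M_\mu \Big) \psi^j \Big], \tag{60}
$$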
where $\rho_5$ is the product of the two ρ-matrices. The action is invariant under
the following transformations: (i) $\delta\psi = \frac{i}{2}(1 - \rho_5)\theta(X) T \psi$, (ii) $\delta A^M_\mu = \partial_\mu \theta(X) +
A \wedge \theta$ and (iii) $\delta X^\mu = 0$. Here θ is the gauge parameter. In order to check
the presence of any anomaly, in the form of noninvariance of the fermionic
path-integral measure, we deﬁne the Euclidean Dirac operator

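$$
\rho^\alpha D_\alpha = \rho^\alpha \Big[ \partial_\alpha - \frac{i}{2}(1 - \rho_5)\, a_\alpha \Big], \tag{61}
$$

where $a_\alpha = \partial_\alpha X^\mu\, T^M A^M_\mu$, suppressing the indices with the understanding that
$a_\alpha$ is a matrix. Note that caution has to be exercised in computing anomalies
in the presence of ‘γ5’ couplings. First write

$$
\rho^\alpha D_\alpha = \rho^\alpha \Big[ \partial_\alpha - \frac{i}{2}\, v_\alpha + \frac{i}{2}\, \rho_5\, a_\alpha \Big], \tag{62}
$$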
with the prescription that we set vα = aα at the end of the calculations. It
is necessary to analytically continue aα → iaα in order to make the operator
hermitian; however, we rotate it back at the end of the computation. The
standard trick is to expand both ψ and ψ̄ in terms of a complete set of the
eigenfunctions of the hermitian Dirac operator deﬁned by

$$
\psi = \sum_n c_n \phi_n, \quad \bar\psi = \sum_n d_n \phi_n^\dagger, \tag{63}
$$

where $\phi_n$ correspond to a complete set of functions satisfying the Dirac
equation $\rho^\alpha D_\alpha \phi_n = \lambda_n \phi_n$, and the Dirac operator is the one which is obtained
after going through the aforementioned steps. Now, the path-integral measure
becomes
$$
D\bar\psi\, D\psi = \prod_n dc_n\, dd_n. \tag{64}
$$
The gauge transformation introduces a change in the fermionic measure,

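$$
D\bar\psi\, D\psi = \big[ (\det \tilde C_{nm})(\det C_{nm}) \big]^{-1} D\bar\lambda\, D\lambda, \tag{65}
$$

where
$$
C_{nm} = \delta_{nm} + \int d^2\sigma\, \phi_n^\dagger(\sigma)\, \frac{i(1 - \rho_5)}{2}\, (\theta T)\, \phi_m, \tag{66}
$$

$$
\tilde C_{nm} = \delta_{nm} - \int d^2\sigma\, \phi_n^\dagger(\sigma)\, \frac{i(1 + \rho_5)}{2}\, (\theta T)\, \phi_m. \tag{67}
$$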
In order to evaluate det Cnm we use (66), and write it in the following form
for inﬁnitesimal gauge transformation:

$$
\det C_{nm} = \exp\Big[ \mathrm{Tr} \sum_n \int d^2\sigma\, \phi_n^\dagger(\sigma)\, \frac{i}{2}(1 - \rho_5)\, \theta T\, \phi_n(\sigma) \Big]. \tag{68}
$$
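
The step from (66) to (68) uses the standard identity $\det = \exp \mathrm{Tr} \ln$ at first order in the gauge parameter. A short sketch, where the matrix $K_{nm}$ is a shorthand introduced here for illustration only:

```latex
\begin{align*}
C_{nm} &= \delta_{nm} + K_{nm}, \qquad
K_{nm} = \int d^2\sigma\, \phi_n^\dagger(\sigma)\, \frac{i}{2}(1-\rho_5)\, \theta T\, \phi_m(\sigma), \\
\det C &= \exp\big[\mathrm{Tr}\,\ln(1 + K)\big]
        = \exp\big[\mathrm{Tr}\,K + O(\theta^2)\big], \qquad
\mathrm{Tr}\,K = \sum_n K_{nn}.
\end{align*}
```

The sum over $n$ is not well defined as it stands, which is why a regulator is needed before the trace can be evaluated.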
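This equation can be rewritten as

$$
\det C_{nm} = \lim_{M^2 \to \infty} \exp\Big[ \mathrm{Tr} \sum_n \int d^2\sigma\, \phi_n^\dagger(\sigma)\, \frac{i}{2}(1 - \rho_5)\, \theta T\, \phi_n(\sigma)\, e^{-\lambda_n^2 / M^2} \Big]
= \lim_{M^2 \to \infty} \exp\Big[ \mathrm{Tr} \sum_n \int d^2\sigma\, \phi_n^\dagger(\sigma)\, \frac{i}{2}(1 - \rho_5)\, \theta T\, \phi_n(\sigma)\, e^{-D^2 / M^2} \Big], \tag{69}
$$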

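where $D = \rho^\alpha D_\alpha$. The expression for $\det C_{nm}$ can be evaluated by using the
completeness of states, with the aid of some identities specific to two dimensions.
At the end, we analytically continue back $a_\alpha \to -i a_\alpha$ and eventually
set $v_\alpha = a_\alpha$. The determinant (69) becomes

$$
\det C_{nm} = \exp\Big[ \frac{i}{8\pi} \int d^2\sigma\, \theta^M(X) \big( -i\, \partial^\alpha a^M_\alpha - \epsilon_{\alpha\beta}\, \partial^\alpha a^{M\beta} \big) \Big]. \tag{70}
$$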

A rather simple and analogous calculation shows that

$$
\det \tilde C_{nm} = 1. \tag{71}
$$

The change in the path-integral measure assumes the form

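$$
S_{anomalous} = -\frac{1}{16\pi} \int d^2\sigma\, \theta^M(X) \big( \partial_\alpha a^{M\alpha} - \epsilon^{\alpha\beta}\, \partial_\alpha a^M_\beta \big). \tag{72}
$$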

We use the freedom to add a local counter term

$$
S_{CT} = -\frac{1}{32\pi} \int d^2\sigma\, a^M_\alpha a^{M\alpha}, \tag{73}
$$
whose gauge variation will cancel the first term of Sanomalous . We use the
definition of aα and then after some algebra we arrive at

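$$
S_{anomalous} = -\frac{1}{16\pi} \int d^2\sigma\, \epsilon^{\alpha\beta}\, \partial_\alpha X^\mu \partial_\beta X^\nu\, \partial_\mu \theta^M A^M_\nu. \tag{74}
$$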
We conclude that the gauge coupling for such a theory will be inconsistent
since the path-integral measure is affected by an anomaly. We recall that the
string also couples to the antisymmetric two-form, Bμν (X):

$$
S_B = \frac{1}{32\pi} \int d^2\sigma\, B_{\mu\nu}(X)\, \epsilon^{\alpha\beta}\, \partial_\alpha X^\mu \partial_\beta X^\nu. \tag{75}
$$
Notice the slightly different normalization factor on the RHS. The gauge in-
variance is restored if we demand that the two-form B-field transforms as

$$
\delta B_{\mu\nu} = \partial_{[\mu} \theta^M(X)\, A^M_{\nu]} \tag{76}
$$
under nonabelian gauge transformation, as was implemented on $A^M_\mu$. We recall
that, given the transformation (76), the field strength $H_{\mu\nu\lambda}$ of $B_{\mu\nu}$ requires
the addition of a gauge Chern–Simons (C–S) term. Thus, we notice that
the coupling of a gauge background alone to a chiral sector (in the world-
sheet action) introduces an anomalous term in the fermionic path-integral
measure. This piece can be removed by introducing a local counter term, and
demanding that the two-form B-field transforms in a specific manner under
the nonabelian gauge transformation associated with the gauge background.
In turn, the field strength H of the B-field is required to be modified by a
C–S term. As emphasized earlier, our Ward identities are valid modulo anomalies,
and the above illustrative example shows how such anomalies could be
computed in the path-integral context.
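
The cancellation between (74) and the variation of (75) can be made explicit. The sketch below assumes that the bracket in (76) denotes antisymmetrization without a factor of 1/2, i.e. $\partial_{[\mu}\theta^M A^M_{\nu]} = \partial_\mu\theta^M A^M_\nu - \partial_\nu\theta^M A^M_\mu$; this convention is an assumption, chosen because it reproduces the relative normalization of (74) and (75).

```latex
\begin{align*}
\delta S_B &= \frac{1}{32\pi}\int d^2\sigma\, \epsilon^{\alpha\beta}\,
   \partial_\alpha X^\mu \partial_\beta X^\nu
   \big(\partial_\mu\theta^M A^M_\nu - \partial_\nu\theta^M A^M_\mu\big) \\
 &= \frac{1}{16\pi}\int d^2\sigma\, \epsilon^{\alpha\beta}\,
   \partial_\alpha X^\mu \partial_\beta X^\nu\, \partial_\mu\theta^M A^M_\nu
  = -S_{anomalous}.
\end{align*}
```

The second line follows by relabelling $\mu \leftrightarrow \nu$ and $\alpha \leftrightarrow \beta$ in the second term and using $\epsilon^{\beta\alpha} = -\epsilon^{\alpha\beta}$, so that both terms contribute equally. This is why the coefficient in (75) is half of that in (74).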

4 Symmetries of Massive String Excitations

We have elucidated how the local symmetries in target space could be un-
raveled by introducing canonical transformations in phase space. This was
achieved within the first-quantized approach to string theory. There are reasons
to believe that we are yet to unveil hidden symmetries (higher symmetries)
of string theory. There are hints about the existence of such symmetries
from the exponential degeneracy of excited string states, and from the
string-theoretic description of very high-energy collision processes
[20, 21]. Therefore, it is argued that discovering and understanding
such higher symmetries will provide us with deeper insight into string theory. It
is natural to ask how one goes about exploring such symmetries. We adopt
the conventional approach, in the sense that we look for classical symmetries.
These are easy to understand. Subsequently, we examine these symmetries in
a quantum context. As we illustrated earlier, the symmetry might be affected
by anomalies, leading to the breakdown of the symmetry. There are reasons
to explore in these directions. It is recognized that string-theory vacua are
embarrassingly rich. It is not unreasonable to speculate that the discovery of
new stringy symmetries might provide us a way to identify the vacuum which
describes the low-energy standard model, as well as the Universe we live in. It
is quite natural to presume that string field theory will encode the symmetries
of string theory in their totality. Therefore, this seems to be the right setting
to seek answers to the questions raised earlier. There are some hints that
string field theory could be the right forum to address these issues. However,
string field theory has not fully developed efficient techniques to carry out
practical calculations. Therefore, our approach is based on a more pragmatic
first-quantized formulation. Indeed, we envisage the evolution of a string in the
background of its massive states, generalizing the two-dimensional worldsheet
σ-model action for massless backgrounds.
We proceed to explore the higher symmetries following our experience
with massless excitations. We work in a classical framework where, quite
interestingly, we can get glimpses of the underlying symmetries. Moreover,
even in this simple approach, we can safely understand some of the features of
these symmetries. Let us first briefly recall some salient and relevant features
of the local symmetries we have studied so far. We constructed the Hamilto-
nian action, SH , for the string in the background of its massless excitations
$G_{\mu\nu}$, $B_{\mu\nu}$ and $A^M_\mu$. We introduced generators of canonical transformations
associated with general coordinate transformation, gauge symmetry of $B_{\mu\nu}$ and
nonabelian gauge symmetries. Note that, when implementing transformations
associated with GCT, the vertex involved both $B_{\mu\nu}$ and $A^M_\mu$, since $S_H$ has
pieces where Gμν couples to these backgrounds, and the vertex is a functional
derivative of SH with respect to Gμν . The same arguments can be repeated
when we derive WI for the other two massless excitations. The point to be
emphasized is that, if we perform any one of the three canonical transfor-
mations introduced in Sect. 3, we always obtain only the three backgrounds
and their vertex functions. In other words, the canonical transformations are
such that we do not have to introduce additional terms in the Hamiltonian
action, corresponding to new vertex functions, when we look for its invariance
properties. Let us consider infinitesimal general coordinate transformations of
the form
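$$
\delta_\xi X^\mu = \xi^\mu(X), \quad \text{and} \quad \delta_\eta X^\nu = \eta^\nu(X), \tag{77}
$$
such that a transformation of this kind acting on a function $f(X)$ produces
the following variation:

$$
\delta_\xi f(X) = f(X + \xi) - f(X) = f_{,\lambda}\, \xi^\lambda. \tag{78}
$$

Thus, two transformations involving the shift of $X^\mu$ will result in

$$
\delta_\eta \delta_\xi f(X) = f_{,\lambda\rho}\, \xi^\lambda \eta^\rho + f_{,\lambda}\, \xi^\lambda{}_{,\rho}\, \eta^\rho. \tag{79}
$$

Therefore, it is straightforward to verify that

$$
(\delta_\eta \delta_\xi - \delta_\xi \delta_\eta) = \delta_{\xi\eta}, \tag{80}
$$

where the parameter of $\delta_{\xi\eta}$ is $(\xi\eta)^\lambda = \xi^\lambda{}_{,\rho}\, \eta^\rho - \eta^\lambda{}_{,\rho}\, \xi^\rho$. Therefore, two such coordinate
transformations result in another coordinate transformation, as is easily demonstrated.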
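
The closure of two coordinate shifts can be spelled out in one line, using only the rule that a shift acts on a function of $X$ by the chain rule. The second-derivative terms cancel because $f_{,\lambda\rho}$ is symmetric in its indices:

```latex
\begin{align*}
(\delta_\eta\delta_\xi - \delta_\xi\delta_\eta) f(X)
 &= f_{,\lambda\rho}\big(\xi^\lambda\eta^\rho - \eta^\lambda\xi^\rho\big)
  + f_{,\lambda}\big(\xi^\lambda{}_{,\rho}\,\eta^\rho - \eta^\lambda{}_{,\rho}\,\xi^\rho\big) \\
 &= f_{,\lambda}\big(\xi^\lambda{}_{,\rho}\,\eta^\rho - \eta^\lambda{}_{,\rho}\,\xi^\rho\big).
\end{align*}
```

The result has the form of a single shift with parameter $\xi^\lambda{}_{,\rho}\eta^\rho - \eta^\lambda{}_{,\rho}\xi^\rho$, i.e. the Lie bracket of the two vector fields up to an overall sign convention.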
Let us consider a generalized form of shift of the string coordinates,
$$
\tilde\delta X^\mu = \xi^\mu + \partial X^\nu \bar\partial X^\rho\, \zeta^\mu_{\{\nu\rho\}}, \tag{81}
$$

where $\partial$ and $\bar\partial$ are derivatives defined in terms of worldsheet complexified
coordinates. The notation {μν} is to be interpreted as symmetrization with
respect to the two indices μ and ν, here and everywhere. We consider a worldsheet
action in the presence of a tachyon and graviton background only (and
ignore all others):

$$
S = \int d^2 z\, \big[ T(X) + (\partial X^\mu \bar\partial X^\rho + \partial X^\rho \bar\partial X^\mu)\, G_{\mu\rho}(X) \big]. \tag{82}
$$

Under this new shift transformation (81), the tachyon part simply gets
transformed to
$$
\tilde\delta T(X) = T(X)_{,\mu}\, \xi^\mu + T(X)_{,\mu}\, \partial X^\nu \bar\partial X^\rho\, \zeta^\mu_{\{\nu\rho\}}. \tag{83}
$$

Notice the form of the last term in the above equation: when the tachyon
background is varied, the extra piece produced looks like a graviton vertex,
since $\partial X^\nu \bar\partial X^\rho$ couples to the graviton $G_{\nu\rho}(X)$. At this stage, we can already
notice an interesting feature. Suppose we had considered a string in a tachyonic
background alone, and implemented the new shift (81). Then the variation
of the action will generate a graviton-like vertex. If we want to apply the
arguments of the previous section, then we shall have to introduce a graviton
vertex to see if we can compensate this shift by a generalized form of GCT, and
obtain a relation $\delta^{new} S_H = -\delta^{GCT}_{new} S_H$, just as we had $\delta_G S_H = -\delta^{GCT} S_H$.
We shall show that the variation of the graviton vertex under (81) yields some
interesting features. Note that δX μ , in this context, has two parts: one that
corresponds to the usual GCT, and another piece which we have introduced.
It will be argued that the second piece could be associated with a (local)
higher symmetry transformation.
Let us consider the variation of the graviton vertex,

$$
\tilde\delta \big[ (\partial X^\mu \bar\partial X^\rho + \partial X^\rho \bar\partial X^\mu)\, G_{\mu\rho} \big], \tag{84}
$$

with $\tilde\delta X^\lambda = \xi^\lambda(X) + \zeta^\lambda_{\{\nu\rho\}}(X)\, \partial X^\nu \bar\partial X^\rho$. Under the new coordinate
transformations, partial derivatives of $X^\mu$ will transform in two ways: they will get
shifted in the usual form due to the ξ-shift, and a new shift will be added due
to the ζ-part in those derivatives. The graviton background will also vary as its
argument X undergoes the shifts. It is quite obvious that the ζ-part is going
to add some extra pieces to (84), since it contains derivatives of the string
coordinates (crudely speaking these are (1,1) operators). Thus, we can see
that there will be terms like $\partial X^\mu \partial X^\nu \bar\partial X^\rho$ contracted with ζ, and so on. We
shall discuss them later. Let us look at the coefficients of $\partial X^\mu \bar\partial X^\nu$
which appear after the variation operation has been performed in (84). Note
that the usual shift of the form $X^\mu \to X^\mu + \xi^\mu(X)$ gives rise to a variation
$\delta_\xi(\partial X^\mu) = \xi^\mu{}_{,\lambda}(X)\, \partial X^\lambda$. Therefore, under this variation,
$$
\delta_\xi \big[ (\partial X^\mu \bar\partial X^\nu + \partial X^\nu \bar\partial X^\mu)\, G_{\mu\nu} \big]
= \xi^\mu{}_{,\lambda} (\partial X^\lambda \bar\partial X^\rho + \partial X^\rho \bar\partial X^\lambda)\, G_{\mu\rho}
+ (\partial X^\mu \bar\partial X^\rho + \partial X^\rho \bar\partial X^\mu)\, G_{\mu\rho,\lambda}\, \xi^\lambda. \tag{85}
$$

This is the usual variation which was the starting point for the derivation of
gravitational WI, and which we obtained through the generator of canonical
transformation, $\Phi_G$. However, there is another piece in (84), $\zeta^\mu_{\{\nu\rho\}}\, \partial X^\nu \bar\partial X^\rho$.
We have argued earlier that the variation induced by $\delta_\zeta$ on $\partial X^\nu \bar\partial X^\rho G_{\nu\rho}$ takes
it away from the form of the graviton vertex. However, recall that the variation
of the tachyon background under this shift is of the form of a graviton vertex.
To briefly summarize, the effect of the $\tilde\delta = \delta_\xi + \delta_\zeta$ shift transformation is that
there is a piece in the transformed graviton vertex which is in the form of
a graviton vertex multiplied by $\xi^\lambda{}_{,\mu}$ (coming from the $\delta_\xi$ variation), and another
piece coming from the $\delta_\zeta$ variation of the tachyon. This variation in the action
can be compensated by the following variation of the metric:

$$
\tilde\delta^{GCT} G_{\mu\rho} = G_{\mu\lambda}\, \xi^\lambda{}_{,\rho} + \xi^\lambda{}_{,\mu}\, G_{\rho\lambda} + \xi^\lambda G_{\mu\rho,\lambda} + \zeta^\alpha_{\{\mu\rho\}}\, T_{,\alpha}, \tag{86}
$$

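in the sense that if we ignore the presence of other terms arising due to
the $\tilde\delta$ variation, we could derive a new WI. We can use the relation
$\tilde\delta S_H = -\tilde\delta^{GCT} S_H + \ldots$, where the ellipsis stands for the terms we have ignored.
Let us now look at the coefficients of the parameter $\zeta^\mu_{\{\nu\rho\}}$, which will be
obtained from the variation of the graviton vertex under $\delta_\zeta$:

$$
\big( \partial X^\nu \bar\partial X^\eta\, \zeta^\mu_{\{\nu\eta\},\lambda}\, \partial X^\lambda + \partial\partial X^\nu \bar\partial X^\eta\, \zeta^\mu_{\{\nu\eta\}} + \partial X^\nu \partial\bar\partial X^\eta\, \zeta^\mu_{\{\nu\eta\}} \big)\, \bar\partial X^\rho\, G_{\mu\rho}
+ \partial X^\mu \big( \partial X^\nu \bar\partial X^\eta\, \zeta^\rho_{\{\nu\eta\},\lambda}\, \bar\partial X^\lambda + \bar\partial\partial X^\nu \bar\partial X^\eta\, \zeta^\rho_{\{\nu\eta\}} + \partial X^\nu \bar\partial\bar\partial X^\eta\, \zeta^\rho_{\{\nu\eta\}} \big)\, G_{\mu\rho}
+ (\mu \leftrightarrow \rho) + (\partial X^\mu \bar\partial X^\rho + \bar\partial X^\mu \partial X^\rho)\, G_{\mu\rho,\lambda}\, \zeta^\lambda_{\{\eta\alpha\}}\, \partial X^\eta \bar\partial X^\alpha. \tag{87}
$$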

Let us examine the structure of the terms appearing in the above equation.
Suppressing target space indices we may write them as (i) $\partial X \partial X \bar\partial X$, (ii)
$\partial\bar\partial X$ and (iii) terms where we interchange $\partial \leftrightarrow \bar\partial$ in (i) and (ii); and there is
finally the term (iv) $\partial X \bar\partial X \partial X \bar\partial X$. We recall that we use equations of motion
when we derive Ward identities: therefore, pieces appearing in the category
(ii) will vanish due to the on-shell condition. The appearance of these types
of terms (after equations of motion are implemented) forces us to think that
we must add additional vertex operators to the worldsheet action if we are to
use our technique to derive WI associated with the $\delta_\zeta$ shift of the coordinates.
Therefore, we include the vertex operators corresponding to the first excited
massive states, which assume the form
$$
F^{(1)}_{\mu\nu\rho}\, \partial X^\mu \partial X^\nu \bar\partial\bar\partial X^\rho + F^{(2)}_{\mu\nu\rho}\, \partial\partial X^\mu \bar\partial X^\nu \bar\partial X^\rho
+ S_{\{\mu\nu\}\{\rho\eta\}}\, \partial X^\mu \partial X^\nu \bar\partial X^\rho \bar\partial X^\eta. \tag{88}
$$

Let us now examine how the terms appearing in the above equation should
transform under the combined shifts $\delta_\xi$ and $\delta_\zeta$. We know the rules for the
transformations of $\partial X$ and $\bar\partial X$ already. The three-index background undergoes a
variation $\delta F^{(1)}_{\mu\nu\rho} = F^{(1)}_{\mu\nu\rho,\lambda}\, \delta X^\lambda$,
$$
\delta F^{(1)}_{\mu\nu\rho} = F^{(1)}_{\mu\nu\rho,\lambda}\, \xi^\lambda + F^{(1)}_{\mu\nu\rho,\lambda}\, \partial X^\kappa \bar\partial X^\eta\, \zeta^\lambda_{\{\kappa\eta\}}. \tag{89}
$$
Similarly, we can obtain the variation of $F^{(2)}_{\mu\nu\rho}$. Let us look at the vertex
operator associated with $F^{(1)}$, and note what will be its transformed form
after the variations of $\partial X$, $\bar\partial X$ and $F^{(1)}$. It is easy to see that there will be
some pieces coming from the $\delta_\xi$ variation that will look like the ones coming
from the $\delta_\zeta$ variation of the graviton vertex. Thus a symmetry associated
with a local ζ-type shift needs the introduction of the $F^{(1),(2)}$ backgrounds. Note,
however, that the ζ-variation of the F's already tells us that we need to introduce
vertex operators associated with still higher-level states.
We argue more qualitatively below, rather than presenting our explicit
lengthy algebra. Notice the first term of (88), which gives the coupling of
a string to the F (1) background. When we consider the variation of this
background due to the ξ-shift, there will be one term which will be associated
with the variation of the massive background with four indices
(the S-field appearing in (88)). Under usual GCT in the target space, $F^{(1)}$
transforms as a tensor. Now we focus attention on the vertex involving
the four-index background: $S_{\{\mu\nu\}\{\rho\eta\}}\, \partial X^\mu \partial X^\nu \bar\partial X^\rho \bar\partial X^\eta$. As before, the
$\delta_\xi$-variation of this vertex will consist of several terms: there will be terms
which will have pieces like $\partial X \partial X \bar\partial X \bar\partial X$, and also terms which are the
product of five pieces involving $\partial X \bar\partial X \ldots$. Again we see that the usual $\delta_\xi$-shift
already calls for the presence of vertices at higher massive levels. If we consider
the consequences of the $\delta_\zeta$-shift on the aforementioned vertex, we immediately
realize that we have to add more vertices corresponding to even
higher massive levels. Therefore, a simple $\delta_\xi$-shift already requires the presence
of higher states, and hints at a hidden symmetry. It is quite interesting to
explore the consequences of inducing the shifts we have considered so far.
However, we cannot make any headway beyond a certain limit, since test-
ing our proposition through explicit computations becomes unmanageable.
Moreover, there is no reason to consider only δξ and δζ shifts. One could
consider a more general form such as $\delta_\Sigma X^\mu = \partial X^\rho \partial X^\alpha \bar\partial X^\eta\, \Sigma^\mu_{\{\rho\alpha\}\eta}$, which
is generic. Obviously, one has to add other suitable terms to this expression.
We can conclude that the presence of $\delta_\Sigma$ will transform the tachyon in
such a way that we shall need higher massive vertices by looking at the
tachyon background variation alone. Therefore, there are two different av-
enues opening up if we generalize our original prescription (successfully uti-
lized for massless backgrounds), to investigate the symmetries of string theory.
(i) We add vertex functions corresponding to massive string states and gen-
eralize the δξ -shift by adding one extra piece, i.e. δζ shift. The consequences
have been already discussed. (ii) We can generalize the δξ shift by adding
all possible allowed terms, and immediately note that such a generalization
also requires further vertex functions to be added to the worldsheet action.
Both paths lead to the conclusion that string theory is endowed with higher
symmetries.
It is worthwhile exploring additional properties of the transformations δξ
and δζ . The former is associated with the GCT, and we have seen that two such
successive GCT transformations correspond to another GCT transformation.
Let us closely look at δξ and δζ transformations. A simple explicit calculation
will illustrate the point:
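$$
(\delta_\zeta \delta_\xi - \delta_\xi \delta_\zeta) X^\mu = \xi^\mu{}_{,\alpha}\, \partial X^\kappa \bar\partial X^\sigma\, \zeta^\alpha_{\{\kappa\sigma\}} - \xi^\alpha{}_{,\beta}\, \zeta^\mu_{\{\alpha\eta\}}\, \partial X^\beta \bar\partial X^\eta
- \xi^\eta{}_{,\beta}\, \zeta^\mu_{\{\alpha\eta\}}\, \partial X^\alpha \bar\partial X^\beta - \zeta^\mu_{\{\alpha\eta\},\beta}\, \xi^\beta\, \partial X^\alpha \bar\partial X^\eta. \tag{90}
$$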

It is obvious that the above rule of operation is applicable to any function of
X, f(X), or to any tensor which depends on X. Therefore, we conclude that

$$
[\delta_\zeta, \delta_\xi] \equiv \delta_\zeta \delta_\xi - \delta_\xi \delta_\zeta = \delta_{\hat\zeta}, \tag{91}
$$

where $\hat\zeta^\mu_{\{\rho\sigma\}} = \zeta^\lambda_{\{\rho\sigma\}}\, \xi^\mu{}_{,\lambda} - \zeta^\mu_{\{\lambda\rho\}}\, \xi^\lambda{}_{,\sigma} - \zeta^\mu_{\{\lambda\sigma\}}\, \xi^\lambda{}_{,\rho} - \zeta^\mu_{\{\rho\sigma\},\lambda}\, \xi^\lambda$, and it is to be
understood in the light of (90). Therefore, a usual GCT followed by a ζ-shift
is still a combined operation of these two shifts. However, two combined
ζ-shift operations will take us out of a ζ-shift, and will signal that we have to
introduce higher transformations. If we continue to repeat this process, we
will have to introduce a hierarchy of higher and higher shifts.
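
The commutator of a ξ-shift and a ζ-shift can be derived in two steps. The sketch below assumes only $\delta_\xi X^\mu = \xi^\mu(X)$, $\delta_\zeta X^\mu = \zeta^\mu_{\{\alpha\eta\}}(X)\, \partial X^\alpha \bar\partial X^\eta$, and that each variation acts on functions of $X$ and on $\partial X$, $\bar\partial X$ by the chain rule:

```latex
\begin{align*}
\delta_\zeta\delta_\xi X^\mu &= \delta_\zeta\,\xi^\mu(X)
  = \xi^\mu{}_{,\alpha}\,\zeta^\alpha_{\{\kappa\sigma\}}\,
    \partial X^\kappa\bar\partial X^\sigma, \\
\delta_\xi\delta_\zeta X^\mu &= \delta_\xi\big(\zeta^\mu_{\{\alpha\eta\}}(X)\,
    \partial X^\alpha\bar\partial X^\eta\big) \\
 &= \zeta^\mu_{\{\alpha\eta\},\beta}\,\xi^\beta\,
    \partial X^\alpha\bar\partial X^\eta
  + \zeta^\mu_{\{\alpha\eta\}}\,\xi^\alpha{}_{,\beta}\,
    \partial X^\beta\bar\partial X^\eta
  + \zeta^\mu_{\{\alpha\eta\}}\,\xi^\eta{}_{,\beta}\,
    \partial X^\alpha\bar\partial X^\beta .
\end{align*}
```

Subtracting the second expression from the first gives four terms, each bilinear in $\partial X\,\bar\partial X$. Reading off the coefficient of $\partial X^\rho \bar\partial X^\sigma$ yields a single ζ-type parameter, which is why a GCT combined with a ζ-shift closes on a ζ-shift.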
Our discussion has been confined to the classical level only. The generators
of canonical transformations that we have introduced for deriving WI asso-
ciated with massless backgrounds have not been restricted. Therefore, they
take us from the phase-space manifold of a string to another domain which is
huge indeed. There is a way to constrain the choice of the generators, to some
extent. We argued that the canonical transformations are associated with
underlying symmetries. In the context of string theory, we interpret a symmetry
transformation as the existence of physically indistinguishable solutions to the
string equations of motion of the backgrounds. Thus, the transformed back-
grounds and phase-space variables correspond to isomorphic conformal field
theories. Rephrased in another way, with each solution of the string equations
of motion there is a two-dimensional conformally invariant theory. Moreover,
this theory is defined by specifying the phase-space variables and the genera-
tors L± . As mentioned earlier, the spacetime fields are the coupling constants
of the σ-model. The couplings of the backgrounds to the string are identified
as vertex operators. We note that the vertex should be BRS invariant in or-
der to fulfill the requirements of conformal invariance of the theory. When we
implement canonical transformation, not only phase-space variables but also
vertex operators are transformed in a specified way. However, these vertex op-
erators must be BRS invariant too. In turn, this condition imposes constraints
on the choice of the generators.
Let us look at the graviton vertex operator $V_G = \partial X^\mu \bar\partial X^\nu G_{\mu\nu}(X)$, which
will get transformed according to the rules we have given. For infinitesimal
transformations we have $\tilde V_G = V_G + \delta V_G$, where $\delta V_G = \{V_G, \Phi_G\}_{PB}$ and $\delta Q = \{Q, \Phi\}_{PB}$.
However, $\{Q, V_G\}_{PB} = 0$. If a generator corresponds to a symmetry
of the theory, then it should commute with the BRS charge, i.e. $\{Q, \Phi_G\}_{PB} = 0$.
Thus, by the Jacobi identity, $\{\delta V_G, Q\}_{PB} = 0$. The generator is already constrained by such
requirements. Let us consider the case of a weak graviton background, i.e. $G_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$.
It is well known that the BRS invariance yields the equation of
motion, $\nabla^\mu \nabla_\mu h_{\alpha\beta} = 0$, and the transversality condition, $h^{\mu\nu}{}_{,\mu} = h^{\mu\nu}{}_{,\nu} = 0$.
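
As a concrete illustration of these two conditions, consider a plane-wave perturbation of the form h_{μν} = ε_{μν} exp(ik·X), for which the equation of motion reduces to k² = 0 and transversality to k^μ ε_{μν} = 0. The sketch below is only an illustration: it assumes four flat dimensions, a mostly-plus metric, a momentum along the third spatial axis and two hand-picked polarization tensors, none of which are taken from the text.

```python
# Illustrative check of the on-shell conditions for a plane-wave graviton
# h_{mu nu} = eps_{mu nu} exp(i k.X) in flat space.
# Assumptions (not from the text): D = 4, metric diag(-1, +1, +1, +1).

ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def k_squared(k_lower):
    """Return k^mu k_mu; the wave equation for a plane wave needs this to vanish."""
    # For this diagonal metric the inverse metric has the same entries as ETA.
    return sum(ETA[m][n] * k_lower[m] * k_lower[n]
               for m in range(4) for n in range(4))


def divergence(k_lower, eps):
    """Return the four components of k^mu eps_{mu nu}; transversality needs all of them to vanish."""
    k_upper = [sum(ETA[m][n] * k_lower[n] for n in range(4)) for m in range(4)]
    return [sum(k_upper[m] * eps[m][nu] for m in range(4)) for nu in range(4)]


E = 2.0
k = [-E, 0.0, 0.0, E]            # k_mu for a massless wave moving along the third axis

# A transverse "plus" polarization and a longitudinal one (both hypothetical examples).
eps_plus = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 0]]
eps_long = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]

print("k^2 =", k_squared(k))
print("k.eps (plus) =", divergence(k, eps_plus))
print("k.eps (longitudinal) =", divergence(k, eps_long))
```

The transverse polarization passes both checks, while the longitudinal one fails transversality and would therefore not correspond to a BRS-invariant vertex in this simplified picture.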

We mention that a lot of care must be taken in actual computations since
the quantum operators must be defined keeping in mind issues like operator
ordering and other delicate points. It is relatively simple nowadays to derive
equations of motion for massless backgrounds using standard techniques.
It is a more difficult task to fully analyse the consequences of conformal
invariance when excited massive levels couple to the string. There have been
attempts to study such cases [22] from the perspective of conformal field
theory (CFT). As in the case of the graviton coupling, we obtain the equation of
motion and the transversality condition by demanding that the vertex operator
be of dimension (1, 1). To simplify matters further, let us look at a subset of
couplings of the first excited massive levels. The F(1) coupling in (88) is slightly
modified [22] if we look for a vertex of the type (1, 0). The desired piece is
$$F^{(1)} = F^{(1)}_{\mu\nu\lambda}\,\partial X^\mu\,\partial X^\nu\,\bar\partial X^\lambda + \partial_\mu F^{(1)\mu}_{\nu\lambda}\,\partial\partial X^\nu\,\bar\partial X^\lambda, \tag{92}$$

and here ∂μ is a partial derivative with respect to the spacetime coordinates.

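This vertex is (1, 0) if
$$F^{(1)\mu}_{\mu\lambda} + 2\,\partial_\mu \partial_\nu F^{(1)\mu\nu}_{\lambda} = 0, \tag{93}$$
$$\partial_\lambda F^{(1)\lambda}_{\mu\nu} = 0, \tag{94}$$
and
$$\nabla^\rho \nabla_\rho F^{(1)}_{\mu\nu\lambda} = 0. \tag{95}$$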
A similar argument is to be adopted to include additional terms in (88),
when we demand that the corresponding operators should be of the (0, 1)
type. If we want to derive ‘equations of motion’ for other backgrounds, we
could proceed along these lines. It is assumed that a string is evolving in the
flat Minkowski background and that the vertex operators are to be added
to the worldsheet action in the weak field approximation. In a more general
setting, the ‘beta’ function equations for massive excited backgrounds can be
computed by adopting the technique of Riemann normal coordinate expansion
(as is customary for massless tensor background fields). Such computations
are clearly not easy, although they can be carried out, in principle, using
conformal field theory techniques.
We would like to draw attention to one more important feature of our
approach in deriving the Ward identities. It will be necessary to introduce
a prescription to define off-shell amplitudes within the first quantized framework.
Let us consider a process such as the scattering graviton + graviton →
tachyon + tachyon. Note that we shall have to go off-shell for this reaction.
However, if we consider the (n + 1)-point amplitude with a single graviton
and n-tachyons, then we can write down WI in a field-theoretic framework.
The amplitude will require an off-shell description. If we are interested in
computing such an amplitude in string theory, going off-shell will amount to
introducing a conformal factor, with a specific prescription, in the amplitude,
and this prescription is not necessarily unique. We already know (from our
experiences in quantum field theory with gauge symmetries) that a gauge
fixing parameter appears in the general n-point functions. It is the S-matrix
element which is required to be gauge invariant. In the context of string the-
ory, a conformal factor (signaling the presence of a gauge fixing parameter)
is to be introduced. Kubota and Veneziano [23] have suggested a prescription
to construct off-shell amplitudes, and one of their motivations is to get a han-
dle on the n-point amplitudes involving multi tachyons and massless stringy
excitations. This programme is far from being complete, although its impor-
tance is recognized. We may stress that the importance of string field theory
is realized at every stage when we venture to address the issues alluded to
earlier.

5 Summary and Conclusions

The goal of this article is to unveil and investigate symmetries of string the-
ory in its first quantized formulation. We adopted the point of view that the
worldsheet action for a string in the background of its massless excitations
encodes the local symmetries of the theory. The background fields may be
envisaged as coupling constants of the σ-model. The conformal invariance of
the theory, leading to the equations of motion, already provides us with some
clue about the local symmetries of the target space. This becomes more transparent
when we construct, from the equations of motion, the effective action,
which is manifestly invariant under target space local symmetries. It is worthwhile
to note that higher-order corrections force us to add higher derivative
terms in the background fields; however, the local symmetries are maintained
at each order of the perturbation theory. One may intuitively claim that the
worldsheet description inherently contains those target space symmetries.
The Hamiltonian path integral formulation in phase space, adopted here,
provides an elegant technique to expose the target space symmetries. Our
derivation of the Ward identities is based on formal arguments, and could be
interpreted as a classical result. We have followed the traditional avenue, in
the sense that our first goal is to study the classical symmetries, and then
quantize the theory. We are aware that the WI we derived might be violated
due to the presence of anomalies. Indeed, we presented an example where the
conservation law is anomalous, and the anomaly was computed within the
path integral framework. In the process, we discovered that even the compactified
bosonic string requires, under certain circumstances, a Green–Schwarz-type
Chern–Simons term in the definition of the three-form H-field. Our point is
that the invariance properties of the measure, a potential source of anomaly,
could be analysed using standard prescriptions, at least in principle. We found
that a simple generalization of the usual coordinate shift (associated with
GCT) leads to a very interesting structure. In the first place, when we include
tachyonic background, the generalized shift conveys that GCT gets modified
when we adopt our technique to derive WI for graviton–tachyon backgrounds.
At the same time, inclusion of δζ in the shift of coordinate already signals the
presence of a hierarchy of new symmetries, as described in the text. We saw
that contributions of excited massive states, in the form of vertex operators,
are required in the σ-model action. In fact, it is not possible to truncate the
contributions of these terms. Moreover, we analysed the action of the δξ and
δζ operating on string coordinates and on functions of string coordinates. We
noticed that the algebra does not close. We have already conjectured
that the massive stringy states might acquire their masses due to some spon-
taneously broken gauge symmetry.
It is interesting to note that the symmetries associated with higher excited
states of c = 1 string theory have been studied extensively in the past. Let
us recall that a one-dimensional string coupled to a gravitational background
has a two-dimensional target space interpretation. We require that the cen-
tral charge value be 26. This is achieved by introducing a background charge
and, as a consequence, the Virasoro generators and the BRS charge get mod-
ified accordingly. In this picture, the general couplings are functions of two
variables, one of them being the conformal mode of the two-dimensional tar-
get space metric. The ground state is the tachyon. However, this theory also
contains an infinite set of discrete states [24]. Therefore, a σ-model action
can be constructed involving these states. In this model, we may envisage
the possibility of inducing canonical transformations for these states and seeking
the associated symmetries. Indeed, the c = 1 theory is endowed with a W∞
symmetry [25], and the generators of this transformation are the generators of
the W∞ algebra. In fact, these symmetries have a very nice interpretation as
gauge transformations when they are analysed from the perspective of string
field theory, suitably formulated for c = 1 string theory [26]. This is a very
interesting and encouraging result for us. Now the question arises whether, for
the bosonic string in critical dimensions, we can unravel higher symmetries
from a string field theory perspective.
We have a partial answer to this question [27]. One sets out with a non-
polynomial formulation of string field theory. This string field action is known
to be invariant under infinitesimal gauge transformations. We recall that the
string field theory action can be expanded in terms of component fields; sim-
ilarly, the corresponding gauge parameter will also have an expansion. It was
argued that a specific choice of the gauge function, Λ, can be made, such that
this gauge transformation corresponds to a canonical transformation when
viewed from the first quantized σ-model description in the worldsheet [27].
Since string field theory can account for off-shell descriptions as well, this
gauge transformation is expected to be more general. Of course, the most general
gauge transformation could not be implemented due to technical reasons;
however, a linearized version was adopted to check the gauge transformation
properties of the first few massive levels. Moreover, the gauge functions could
be identified for a few levels explicitly. It was possible to identify the gauge
transformations, associated with some of the low-lying massive states, with
canonical transformations of massive levels in the first quantized descriptions.
At this stage one was not able to compute the algebra of the generators and
identify the underlying algebra, in contrast to the c = 1 theory, where
W∞ symmetry was identified as the underlying symmetry and the generators
satisfied the corresponding algebra. Nevertheless, it is heartening to unravel the
presence of a hierarchy of symmetries in the critical bosonic string; moreover,
the presence of W∞ algebra in c = 1 strings also gives us a clue that the
critical bosonic string might be endowed with a rich underlying symmetry.

Acknowledgments
I would like to thank Ashok Das and Nick Mavromatos for useful discussions
over the years. I especially thank T. Kubota for sharing his unpublished work
and his valuable notes with me. It has been a rewarding experience to know
Gabriele as a collaborator, as a colleague and as a friend. I have immensely
benefited from long discussions with him, from his very deep insights in physics
and from his human values. I wish him many more happy, prosperous and
productive years ahead.

References
1. M. B. Green, J. H. Schwarz, E. Witten: Superstring Theory (Cambridge
University Press, Cambridge, 1987); J. Polchinski: String Theory (Cambridge
University Press, Cambridge, 1998) 525
2. E. S. Fradkin, A. A. Tseytlin: Phys. Lett. B 158, 316 (1985); Nucl. Phys. B
261, 1 (1985) 525, 532
3. C. Lovelace: Phys. Lett. B 135, 75 (1984); Nucl. Phys. B 273, 413 (1986);
A. Sen: Phys. Rev. D 32, 2102 (1985); C. G. Callan, D. Friedan, E. J. Martinec,
M. J. Perry: Nucl. Phys. B 262, 593 (1985) 525, 533
4. G. Veneziano: Phys. Lett. B 167, 387 (1986) 526
5. J. Maharana, G. Veneziano: Phys. Lett. B 169, 177 (1985); Nucl. Phys. B 283,
126 (1987) 526, 534
6. J. Maharana, G. Veneziano (unpublished work 1986) 526, 527
7. J. Maharana, G. Veneziano (unpublished work 1991 and 1993) 526, 527
8. I. A. Batalin, G. A. Vilkovisky: Phys. Lett. B 69, 309 (1977) 528
9. I. A. Batalin, E. S. Fradkin: Phys. Lett. B 122, 157 (1983) 528
10. For a review see M. Henneaux: Phys. Rep. 126, 1 (1985) 528
11. A. M. Polyakov: Phys. Lett. B 103, 207 (1981) 529
12. M. Kato, K. Ogawa: Nucl. Phys. B 212, 443 (1983); S. Hwang: Phys. Rev. D
28, 2614 (1983) 531
13. I. B. Frenkel, V. G. Kac: Inv. Math. 62, 23 (1981); I. B. Frenkel: J. Funct. Analysis
44, 259 (1981); P. Goddard and D. Olive: in Vertex Operators in Mathematics
and Physics, eds J. Lepowsky, S. Mandelstam, I. M. Singer (Springer-Verlag,
New York, 1985), p. 51 533
14. D. J. Gross, J. A. Harvey, E. Martinec, R. Rohm: Nucl. Phys. B 216, 253 (1985)
533
15. M. B. Green, J. H. Schwarz: Phys. Lett. B 149, 117 (1984) 533
16. E. Witten: Commun. Math. Phys. 92, 451 (1984) 533
17. T. Banks, D. Nemeschansky, A. Sen: Nucl. Phys. B 277, 67 (1986) 538
18. A. Das, J. Maharana, P. Panigrahi: Mod. Phys. Lett. A 8, 759 (1988) 539
19. K. Fujikawa: Phys. Rev. Lett. 42, 1195 (1979); Phys. Rev. D 21, 2848 (1980);
Phys. Rev. D 29, 285 (1984) 539
20. D. J. Gross, P. Mende: Phys. Lett. B 197, 129 (1987); Nucl. Phys. B 303, 407
(1988); D. J. Gross: Phys. Rev. Lett. 60, 1229 (1988) 542
21. D. Amati, M. Ciafaloni, G. Veneziano: Phys. Lett. B 197, 81 (1987); Int. J.
Mod. Phys. A 3, 1615 (1988); Phys. Lett. B 216, 41 (1989); Phys. Lett. B 289,
87 (1992); Nucl. Phys. B 403, 707 (1993) 542
22. M. Evans, B. Ovrut: Phys. Rev. D 39, 3016 (1989); Phys. Rev. D 41, 3149
(1990); R. Akhoury, Y. Okada: Nucl. Phys. B 318, 176 (1989) 548
23. T. Kubota, G. Veneziano: Phys. Lett. B 207, 419 (1988); and unpublished
results. 549
24. A. M. Polyakov: Mod. Phys. Lett. A 6, 635 (1991); S. Mukherji, S. Mukhi, A.
Sen: Phys. Lett. B 266, 337 (1991); B. Lian, G. Zuckerman: Phys. Lett. B 266,
21 (1991) 550
25. J. Avan, A. Jevicki: Phys. Lett. B 266, 35 (1991); G. Moore, N. Seiberg: Int.
J. Mod. Phys. A 7, 2634 (1992); S. Das, A. Dhar, G. Mandal, S. R. Wadia:
Int. J. Mod. Phys. A 7, 5165 (1992); I. Klebanov, A. M. Polyakov: Mod. Phys.
Lett. A 6, 3373 (1991); E. Witten: Nucl. Phys. B 373, 187 (1992); D. Minic, J.
Polchinski, Z. Yang: Nucl. Phys. B 369, 324 (1992) 550
26. S. Mukherji, S. Mukhi, A. Sen: Phys. Lett. B 275, 39 (1991) 550
27. J. Maharana, S. Mukherji: Phys. Lett. B 284, 36 (1992) 550
Threshold Eﬀects Beyond the Standard Model

T. R. Taylor

Department of Physics, Northeastern University, Boston, MA 02115, USA

Abstract. In this contribution to the Festschrift celebrating Gabriele Veneziano
on his 65th birthday, I discuss the threshold effects of extra dimensions and their
applications to physics beyond the standard model, focusing on superstring theory.

1 Introduction

I am very happy to contribute to the Festschrift celebrating Gabriele
Veneziano on his 65th birthday. I have known Gabriele for more than 25
years and worked with him on many projects, learning not only physics, but
how to enjoy physics. “Amusing” is the word that he often uses to describe in-
teresting ideas, and that single word characterizes best a unique style of joyful
research that led to his pioneering work on string theory, particle physics and
cosmology described in this book. When researching Gabriele’s original work
on running coupling constants, preceding our 1988 collaboration [1] described
below, I ran into a write-up of his lectures on “Topics in String Theory” de-
livered in 1987 in China and in India [2]. His paper concludes with: “But my
moral, I hope, is a clear one for the young string theorist: If string math. is
lots of fun, string phys. is no less.” Indeed, I had much fun working on string
physics over the following 20 years. In this contribution, I discuss the thresh-
old eﬀects of extra dimensions and their applications to physics beyond the
standard model, focusing on superstring theory.

2 Threshold Eﬀects of Extra Dimensions

At a given time in the history of elementary particle physics, there is always
the mystery of higher energies and the hope of building even more powerful
accelerators that would take us one step farther in the understanding of short-
distance physics. Thirty years ago, discovering yet another quark was a major

T. R. Taylor: Threshold Eﬀects Beyond the Standard Model, Lect. Notes Phys. 737, 553–560
(2008)
DOI 10.1007/978-3-540-74233-6_17
© Springer-Verlag Berlin Heidelberg 2008
breakthrough; but now, the next round of experiments can hardly satisfy the-
orists without uncovering extra dimensions or producing black hole fireballs.
Threshold effects appear each time a new particle is discovered. They appear
in many physical quantities, signaling transition to new energy domains.
As an example, consider the top quark threshold. We want to see how the
QCD coupling constant evolves from the region below the top mass scale mt ,
across the threshold, to higher energies. In order to determine the correspond-
ing one-loop correction to the effective action, we can consider the vacuum
polarization diagram with two external gauge bosons at momentum scale Q,
as shown in Fig. 1. Since we are mostly interested in the effects of quark loops,
there is no need to use a full-fledged background field method. This two-point
function is
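$$\Pi^{\mu\nu}(Q) = i\,(Q^\mu Q^\nu - Q^2 g^{\mu\nu})\,\Pi(Q), \tag{1}$$
with
$$\Pi(Q) \approx i \sum_{m_n<\Lambda} \beta_n \int \frac{d^4P}{(2\pi)^4}\,\frac{1}{P^2 + m_n^2}\,\frac{1}{(P+Q)^2 + m_n^2}. \tag{2}$$
Here, the sum extends over all particles with masses below the ultraviolet
cutoff Λ, and βn denote the respective beta function coefficients:
$$\beta_n = 2(-1)^{F_n}\Big(\lambda_n^2 - \frac{1}{12}\Big)\,C_n, \tag{3}$$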
where Fn is the fermion number, λn the helicity and Cn the quadratic Casimir
in the particle’s SU (3) color representation. The momentum dependence of
the integral (2) changes at the threshold. In a rough approximation,
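$$\begin{aligned}
Q \lesssim m_t:\quad \Pi(Q) &\approx \sum_{n:\, m_n<m_t} \frac{\beta_n}{(4\pi)^2}\,\ln\Big(\frac{Q}{\Lambda}\Big)^2 + \sum_{n:\, \Lambda>m_n\ge m_t} \frac{\beta_n}{(4\pi)^2}\,\ln\Big(\frac{m_n}{\Lambda}\Big)^2,\\
Q \gtrsim m_t:\quad \Pi(Q) &\approx \sum_{n:\, m_n\le m_t} \frac{\beta_n}{(4\pi)^2}\,\ln\Big(\frac{Q}{\Lambda}\Big)^2 + \sum_{n:\, \Lambda>m_n>m_t} \frac{\beta_n}{(4\pi)^2}\,\ln\Big(\frac{m_n}{\Lambda}\Big)^2.
\end{aligned} \tag{4}$$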

Below the threshold, the top quark loop does not participate in the loga-
rithmic running of the coupling constant, which is completely determined by
the particle spectrum below mt . However, its contribution ensures a smooth
transition to higher energies, where the coupling runs with the beta function

Fig. 1. One-loop contributions to the eﬀective gauge coupling

coefficient including top. While the full renormalization group beta function
determines the cutoff dependence of couplings, the finite threshold effects play
an important role in the evolution of effective physical couplings. Thus they
are very important for all applications involving extrapolations to high ener-
gies, in particular in the framework of unification scenarios.
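
The rough form of (4) is easy to explore numerically. The following sketch assumes that each state contributes βn/(4π)² ln(max(Q, mn)/Λ)², which is a compact way of writing both lines of (4); the two-state spectrum and its beta coefficients are made-up placeholders, not actual QCD values. It checks that Π(Q) is continuous across the threshold while its logarithmic slope, the effective beta coefficient, jumps.

```python
import math


def rough_polarization(Q, spectrum, cutoff):
    """Rough Pi(Q) in the spirit of Eq. (4).

    spectrum is a list of (mass, beta) pairs. A state lighter than Q runs
    with ln(Q/cutoff)^2; a heavier state is frozen at ln(mass/cutoff)^2.
    """
    total = 0.0
    for mass, beta in spectrum:
        if mass >= cutoff:
            continue  # only states below the ultraviolet cutoff enter the sum
        scale = max(Q, mass)
        total += beta / (4 * math.pi) ** 2 * math.log((scale / cutoff) ** 2)
    return total


def log_slope(Q, spectrum, cutoff, step=1e-4):
    """Numerical derivative d Pi / d ln Q (central difference)."""
    up = rough_polarization(Q * math.exp(step), spectrum, cutoff)
    down = rough_polarization(Q * math.exp(-step), spectrum, cutoff)
    return (up - down) / (2 * step)


# Hypothetical spectrum: one massless state and one heavy state at 173 (arbitrary units).
spectrum = [(0.0, -1.0), (173.0, 0.5)]
cutoff = 1.0e4

for Q in (50.0, 500.0):
    print(Q, rough_polarization(Q, spectrum, cutoff), log_slope(Q, spectrum, cutoff))
```

Below the heavy mass the slope is set by the light state alone; above it, both coefficients add, which is the change of running described in the text.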
The fact that even a single particle can produce significant threshold effects
is very important for grand unification, but it does not excite the imagination
in the way that the threshold to higher dimensions does, as envisaged in some
Kaluza–Klein (KK) scenarios beyond the standard model [3]. When crossing to higher
dimensions, including, say, a circle of radius R, one encounters not just one
particle, but an infinite tower of KK excitations with masses mn = n/R
labeled by n ≥ 0. For each tower with βn = β0 , the sums in (4), with the
threshold mass mt replaced by 1/R, split into 0 ≤ n < QR and QR < n < ΛR.
The latter can be approximated by an integral, giving [1]:

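$$\begin{aligned}
Q \ll 1/R:&\quad \Pi(Q) \approx \frac{\beta_0}{(4\pi)^2}\Big[\ln(QR)^2 - 2(N-1)\Big],\\
1/R \ll Q \ll \Lambda:&\quad \Pi(Q) \approx 2\,\frac{\beta_0}{(4\pi)^2}\,(RQ - N),
\end{aligned} \tag{5}$$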

where N = ΛR is the (large) number of KK excitations below the cutoﬀ Λ.

Note that the logarithmic running occurs only below the decompactification
scale, and is completely determined by the properties of the massless state
at the bottom of the tower. Above the threshold, there is a power (linear)
running appropriate to non-compact five dimensions. The logarithmic run-
ning is something very special to four dimensions—it is a remnant of infrared
divergences that do not appear in higher dimensions. Incidentally, in order to
explain why we live in four dimensions, one needs a mechanism that relies on
some special properties of D = 4; thus infrared divergences are very likely to
play a role in such dynamical compactifications [1]. Note that the dominant
momentum-independent one-loop threshold correction is of order O(N ).
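
The two regimes in (5) can be checked against a direct sum over the tower. The sketch below works in units of β0/(4π)² and assumes the same rough rule as in (4): a state with mn < Q contributes ln(Q/Λ)², a heavier one ln(mn/Λ)². The values of R, Λ and Q are arbitrary choices for illustration.

```python
import math


def kk_sum(Q, R, cutoff):
    """Direct sum over KK states m_n = n/R, n = 0..N, in units of beta_0/(4 pi)^2."""
    N = int(cutoff * R)
    total = 0.0
    for n in range(0, N + 1):
        mass = n / R
        scale = max(Q, mass)  # light states run with Q, heavy ones are frozen at m_n
        total += math.log((scale / cutoff) ** 2)
    return total


def below_threshold(Q, R, cutoff):
    """First line of Eq. (5): logarithmic running plus an O(N) constant."""
    N = cutoff * R
    return math.log((Q * R) ** 2) - 2 * (N - 1)


def above_threshold(Q, R, cutoff):
    """Second line of Eq. (5): linear (power-law) running."""
    N = cutoff * R
    return 2 * (R * Q - N)


R, cutoff = 1.0, 1000.0
print(kk_sum(0.01, R, cutoff), below_threshold(0.01, R, cutoff))
print(kk_sum(100.0, R, cutoff), above_threshold(100.0, R, cutoff))
```

The direct sum and the closed forms agree in their leading O(N) behaviour; the residual differences are of order ln N, which is the accuracy one expects from replacing the sum by an integral.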
The computations of one-loop threshold effects can be repeated for more
general, possibly anisotropic compactifications, always with the same result
that the radius R which determines the scale of logarithmic running should
be understood as the largest length scale characterizing the compact space.
Thus the onset of power running occurs as soon as the energies approach the
first Kaluza–Klein mass.
More recently, large-radius compactifications have become quite a popular element
of model building beyond the standard model. Although it is a very
attractive possibility, it seems to be incompatible with the existence of su-
persymmetric grand unification suggested by the observed values of gauge
coupling constants, which is based on the logarithmic running. However, as
shown in [4], it is possible that large threshold corrections can also lead to
unification, at lower energy scales, determined by the size of compact dimen-
sions. Here, supersymmetry seems to lose its special appeal; however, it is
desirable for another reason. A mechanism based on large one-loop threshold
corrections can be reliable only if the higher loop eﬀects are small. Without
supersymmetry, there is no reason to expect that this is the case. The common
feature of N = 1 supersymmetric compactiﬁcations is that at heavy Kaluza–
Klein levels the spectrum as well as interactions are N = 2 supersymmetric.
It is well known that N = 2 gauge couplings are not renormalized beyond
one loop. It can be also shown [5] that the one-loop threshold corrections are
dominant, while the higher loop corrections are suppressed, at least by some
powers of the tree-level coupling constants.

3 Superstring Threshold Corrections

Threshold corrections appear also in the framework of string theory, which
brings two new elements. First, the ultraviolet cutoff is a physical parameter,
related to the Regge slope α′ that determines the masses of heavy string
modes: Λ ≈ (α′)^{−1/2}. Thus the cutoff itself becomes a threshold for the production
of heavy string modes. Second, the tree-level coupling constants and
the radius, as well as other geometric quantities characterizing the shape and
size of extra dimensions, correspond to vacuum expectation values (VEVs) of
certain moduli fields. The moduli parameterize flat directions of the tree-level
scalar potential; therefore, the determination of their VEVs is a dynamical
problem of “moduli stabilization.”
The fact that string theory is ultraviolet finite does not prevent gauge
couplings from running which, as explained before, is an infrared effect, and
can be studied by using the low-energy effective field theory. A more rigor-
ous, formal treatment of threshold corrections is complicated by the fact that
only on-shell amplitudes can be computed by using standard string-theoretical
techniques. At the same time as Gabriele was using effective field theory
with a string cutoff [2], Kaplunovsky [6] developed a full-fledged formalism for
studying threshold corrections in string theory. Then Dixon, Kaplunovsky and
Louis [7] studied moduli dependence of string loop corrections in certain het-
erotic orbifold compactifications.1 For any untwisted modulus T upon which
the threshold corrections Δ do depend non-trivially, the functional form of
this dependence is given by

$$\Delta = A \cdot \ln\big[\,|\eta(T)|^4 \cdot {\rm Im}\,T\,\big] + T\text{-independent terms}, \tag{6}$$

where A are computable constants determined by the massless spectrum. The
Dedekind function is defined by
$$\eta(T) = e^{\pi i T/12} \prod_{n=1}^{\infty} \big(1 - e^{2\pi i n T}\big). \tag{7}$$

1 These computations were later extended to more general orbifolds by Mayr and
Stieberger [8]. More recently, Lüst and Stieberger [9] studied gauge threshold corrections
in intersecting brane-world models. The formalism for computing threshold
corrections to Yukawa couplings has been developed in [10].

It is very interesting to compare (5) and (6). To make it simple, consider
a six-dimensional orbifold which is a product of a two-dimensional torus T²
and “something” four-dimensional, and suppose that T² is a product of two circles
with radii R1 and R2, respectively. For such compactifications, there exists a
modulus parameterizing the volume of T²: Im T = R1 R2/α′ = (ΛR1) · (ΛR2).
Note that Im T ∼ N , measuring also the (approximate) number N of KK
excitations of T 2 with masses below the string cutoff. In the limit of large
radii, N → ∞, and
$$\Delta \sim -A \cdot \frac{\pi}{3}\,N; \tag{8}$$
thus the string threshold corrections have the same large-radius behavior as a
generic sum of KK modes (5), up to a multiplicative constant which is rather
ambiguous because it is related to the precise implementation of the mass cut-
off on the KK spectrum. However, the coefficients A are non-zero only for the
orbifold sectors with N = 2 supersymmetry. Furthermore, according to the
non-renormalization theorem proved in [11], all higher loop (genus) corrections
are zero exactly; thus the full-fledged string computations are perfectly com-
patible with the effective field theory analysis. A more precise match between
the two formalisms has been discussed in [12].
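
The statement that Im T measures the approximate number N of KK excitations of T² below the cutoff can be illustrated by a direct count. The sketch below assumes masses of the form m² = (n1/R1)² + (n2/R2)² with integer momenta of both signs; this counting convention is an assumption of the example, and it only affects the overall constant, which the text already describes as ambiguous.

```python
import math


def count_kk_states(R1, R2, cutoff):
    """Count integer pairs (n1, n2) with (n1/R1)^2 + (n2/R2)^2 < cutoff^2."""
    n1_max = int(cutoff * R1)
    n2_max = int(cutoff * R2)
    count = 0
    for n1 in range(-n1_max, n1_max + 1):
        for n2 in range(-n2_max, n2_max + 1):
            if (n1 / R1) ** 2 + (n2 / R2) ** 2 < cutoff ** 2:
                count += 1
    return count


cutoff = 1.0                       # units in which Lambda = 1, i.e. alpha' = 1
for R1, R2 in ((60.0, 40.0), (120.0, 80.0)):
    im_T = (cutoff * R1) * (cutoff * R2)
    n_states = count_kk_states(R1, R2, cutoff)
    print(R1, R2, n_states, n_states / im_T)
```

With this convention the ratio of the count to Im T tends to the area factor π of the ellipse in momentum space, so the number of states grows in proportion to Im T as the radii increase.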
In any closed string theory like the heterotic one, KK modes of each circle
are accompanied by strings winding n times around the circle, with masses
mn = nR/α′. The spectrum as well as the interactions have a small–large radius
symmetry R ↔ α′/R, which is extended in the orbifold compactifications
to a full T-duality: PSL(2, Z) modular invariance generated by T → −1/T
and T → T + 1. The threshold correction (6) is PSL(2, Z)-invariant. This
invariance is realized, however, in quite a non-trivial way. Δ is the coefficient
of the kinetic energy terms of gauge bosons so its form is restricted by supersymmetry
to be a real part of a holomorphic function of chiral fields. Indeed,
at the tree level g^{−2} = 4 Re S, where S is the dilaton superfield. The presence
of Im T under the logarithm in (6), which is necessary for the modular
invariance, is in conflict with that property. Thus the string threshold corrections
suffer from a “holomorphic anomaly” [13, 14], which is related to the
infrared divergences associated with massless states that cannot be described in
terms of a local effective action [15]. Since then, holomorphic anomalies have played
an important role in more formal areas of superstring theory, see e.g. [16].
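
Both the modular invariance of (6) and the large-radius limit (8) can be verified numerically from the product formula (7). The sketch below truncates the infinite product at a fixed number of factors, which is an approximation that is adequate when Im T is not too small; the sample values of T and the choice A = 1 are arbitrary.

```python
import cmath
import math


def dedekind_eta(T, terms=200):
    """Truncated product for eta(T) = exp(i pi T/12) prod_{n>=1} (1 - exp(2 pi i n T)), Im T > 0."""
    q = cmath.exp(2j * math.pi * T)
    product = 1.0 + 0.0j
    q_power = 1.0 + 0.0j
    for _ in range(terms):
        q_power *= q                 # q^n, built up iteratively
        product *= 1.0 - q_power
    return cmath.exp(1j * math.pi * T / 12) * product


def threshold(T, A=1.0):
    """T-dependent part of Eq. (6): A * ln(|eta(T)|^4 * Im T)."""
    return A * math.log(abs(dedekind_eta(T)) ** 4 * T.imag)


T = 0.3 + 1.2j
print(threshold(T), threshold(-1 / T), threshold(T + 1))   # three equal values expected

big = 50j                                                  # large Im T, i.e. large radii
print(threshold(big), -math.pi / 3 * big.imag)             # compare with Eq. (8)
```

The first line shows invariance under the two generators T → −1/T and T → T + 1. In the second comparison the agreement with (8) is only asymptotic, because ln Im T gives a subleading correction that is not captured by the linear term.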
The moduli-dependent threshold corrections have some interesting phe-
nomenological consequences. For example, in some specific orbifold models
with the light spectrum below the compactification scale consisting only of
the particles belonging to the minimal standard model, a phenomenologically
viable gauge coupling unification imposes certain constraints on the modular
transformation properties of quark, lepton and Higgs superfields [17].
The fact that gauge (and other) couplings are moduli-dependent may also
help in stabilizing the moduli VEVs. In particular, in the context of the hidden
gaugino condensation mechanism of supersymmetry breaking [18], the scale
ΛSYM of gaugino condensation is given, in the two-loop approximation, by
$$\Lambda_{\rm SYM} = \mu\, g^{\beta_1/2\beta_0^2}\, \exp\Big(\frac{-8\pi^2}{\beta_0\, g^2}\Big), \tag{9}$$
where μ is the scale at which the gauge coupling constant g is defined, and β0,
β1 are the beta function coefficients of the hidden super Yang–Mills (SYM)
sector: $\beta(g) = -\frac{\beta_0}{(4\pi)^2}\, g^3 - \frac{\beta_1}{(4\pi)^4}\, g^5 + \dots$. From Gabriele and Shimon’s work
[19] we know that gaugino condensation can be described in terms of a simple
lagrangian for the composite superfield $W_\alpha W^\alpha = \lambda_\alpha \lambda^\alpha + \dots$, with the
coupling constant promoted to a holomorphic function of the dilaton and
moduli superfields [20]. In heterotic orbifold compactifications, the moduli
dependence is completely determined by modular invariance [21, 22]. As an
example, consider a pure SYM hidden sector and focus on the dependence
of the Veneziano–Yankielowicz lagrangian on three superfields: Wα W α , the
dilaton S and one modulus T = a + iR²/α′, where R is the (common) radius
of six compact dimensions and a is the associated axion. One finds [21] that
the condensation occurs at the expected scale
$$|\langle \lambda^\alpha \lambda_\alpha \rangle| = \Lambda^3_{\rm SYM}, \tag{10}$$
with the identification:2
$$\mu^2 = \frac{1}{2\,{\rm Im}\,T}, \qquad \frac{1}{g^2} = 4\,{\rm Re}\Big[S + \frac{\beta_0}{(4\pi)^2}\,\ln\eta(T)\Big]. \tag{11}$$
The above result can be interpreted by saying that the scale μ is the in-
frared cutoff while g is the Wilsonian coupling constant [23] including the
non-anomalous part of threshold corrections (6), due to massive KK states
with mn > μ only. This is the coupling that should be used in the effective
four-dimensional field theory at energies below the compactification thresh-
old μ, which from the low-energy point of view becomes an ultraviolet cutoff.
Indeed, g −2 is a real part of a holomorphic function, as required by supersym-
metry. However, it is not modular-invariant because the zero mass modes are
excluded from loop integrals.
In order to obtain the moduli superpotential generated by gaugino con-
densation, one integrates out the composite field Wα W α . This leads to the
following superpotential:
$$W = \exp\Big\{-\frac{96\pi^2}{\beta_0}\Big[S + \frac{\beta_0}{(4\pi)^2}\,\ln\eta(T)\Big]\Big\} = e^{-\frac{96\pi^2}{\beta_0} S}\,\eta^{-6}(T). \tag{12}$$
2 This comparison makes use of the fact that $\beta_1/2\beta_0^2 = -2/3$ in SYM theory.

The above superpotential transforms under the PSL(2, Z) modular transformations
as a form of weight −3, which ensures modular invariance of the
lagrangian. In fact, its form is determined uniquely by the modular properties
and asymptotic behavior, so it can be also derived without necessarily
going into details of SYM dynamics. The corresponding scalar potential is
modular invariant; therefore, it is symmetric under R ↔ α′/R and has stationary
points at R² = α′ (T = i). Since its form depends on the details of
the Kähler potential, which also receives some non-perturbative corrections, it is
difficult to prove that T and other moduli are stabilized; however, there are
some indications that this is indeed the case in some models. As far as the
dilaton is concerned, the scalar potential exhibits a “runaway” behavior at
S → ∞, driving the model to its trivial zero-coupling limit. This problem can
be circumvented if the hidden SYM sector contains a gauge group consisting
of several simple subgroup factors. Then the dilaton VEV can be “locked” by
a “racetrack” of the potential [24].
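
The weight −3 behaviour of the superpotential (12) follows from the standard transformation η(−1/T) = (−iT)^{1/2} η(T), and it can be checked numerically. The sketch below again truncates the product (7); the values chosen for S, T and β0 are arbitrary placeholders, and the function name is hypothetical.

```python
import cmath
import math


def dedekind_eta(T, terms=200):
    """Truncated product for the Dedekind eta function of Eq. (7), valid for Im T > 0."""
    q = cmath.exp(2j * math.pi * T)
    product = 1.0 + 0.0j
    q_power = 1.0 + 0.0j
    for _ in range(terms):
        q_power *= q
        product *= 1.0 - q_power
    return cmath.exp(1j * math.pi * T / 12) * product


def superpotential(S, T, beta0):
    """Right-hand side of Eq. (12): exp(-96 pi^2 S / beta0) * eta(T)^(-6)."""
    return cmath.exp(-96 * math.pi ** 2 * S / beta0) * dedekind_eta(T) ** -6


S, T, beta0 = 0.1, 0.3 + 1.2j, 9.0     # illustrative values only

ratio = superpotential(S, -1 / T, beta0) / superpotential(S, T, beta0)
print(ratio, (-1j * T) ** -3)           # weight -3 under T -> -1/T

shifted = superpotential(S, T + 1, beta0) / superpotential(S, T, beta0)
print(abs(shifted))                     # pure phase under T -> T + 1
```

Under T → −1/T the superpotential picks up the factor (−iT)^{−3}, and under T → T + 1 only a constant phase, which is the behaviour of a modular form of weight −3 up to a phase.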
The threshold effects of extra dimensions and the related gaugino con-
densation mechanism remain as important ingredients of superstring model
building, now including not only heterotic strings, but also D-branes and flux
compactifications. They may play a major role in connecting superstring the-
ory to the real world.

Acknowledgments

I would like to thank Gabriele for 25 years of enjoyable collaborations, his
friendship, guidance and support. I am looking forward to many future
projects, as exciting and enjoyable as usual. I am also grateful to my collabo-
rators Ignatios Antoniadis, Pierre Binétruy, Sergio Ferrara, Mary K. Gaillard,
Edi Gava, Zurab Kakushadze, Dieter Lüst, Nico Magnoli, Narain and Pran
Nath, who worked together with me on the related topics. This work is sup-
ported in part by the US National Science Foundation Grant PHY-0600304.
Any opinions, ﬁndings and conclusions or recommendations expressed in this
material are those of the author and do not necessarily reﬂect the views of
the National Science Foundation.

References
1. T. R. Taylor, G. Veneziano: Phys. Lett. B 212, 147 (1988) 553, 555
2. G. Veneziano: Topics in String Theory, CERN-TH-5019/88 (1988) 553, 556
3. I. Antoniadis: Phys. Lett. B 246, 377 (1990) 555
4. K. R. Dienes, E. Dudas, T. Gherghetta: Phys. Lett. B 436, 55 (1998) 555
5. Z. Kakushadze, T. R. Taylor: Nucl. Phys. B 562, 78 (1999) 556
6. V. S. Kaplunovsky: Nucl. Phys. B 307, 145 (1988) [Erratum-ibid. B 382, 436
(1992)] 556
7. L. J. Dixon, V. Kaplunovsky, J. Louis: Nucl. Phys. B 355, 649 (1991) 556
8. P. Mayr, S. Stieberger: Nucl. Phys. B 407, 725 (1993) 556
9. D. Lüst, S. Stieberger: arXiv:hep-th/0302221 556
10. I. Antoniadis, E. Gava, K. S. Narain, T. R. Taylor: Nucl. Phys. B 407, 706
(1993) 556
11. I. Antoniadis, K. S. Narain, T. R. Taylor: Phys. Lett. B 267, 37 (1991) 557
12. M. K. Gaillard, T. R. Taylor: Nucl. Phys. B 381, 577 (1992) 557
13. J. P. Derendinger, S. Ferrara, C. Kounnas, F. Zwirner: Nucl. Phys. B 372, 145
(1992) 557
14. G. Lopes Cardoso, B. A. Ovrut: Nucl. Phys. B 392, 315 (1993) 557
15. J. Louis: SLAC-PUB-5527 (1991), published in Boston PASCOS 1991,
pp. 751–765 557
16. M. Bershadsky, S. Cecotti, H. Ooguri, C. Vafa: Nucl. Phys. B 405, 279 (1993) 557
17. L. E. Ibanez, D. Lüst, G. G. Ross: Phys. Lett. B 272, 251 (1991); L. E. Ibanez,
D. Lüst: Nucl. Phys. B 382, 305 (1992); P. Mayr, H. P. Nilles, S. Stieberger:
Phys. Lett. B 317, 53 (1993); H. P. Nilles, S. Stieberger: Phys. Lett. B 367,
126 (1996); Nucl. Phys. B 499, 3 (1997) 557
18. H. P. Nilles: Phys. Lett. B 115, 193 (1982); S. Ferrara, L. Girardello, H. P. Nilles:
Phys. Lett. B 125, 457 (1983); M. Dine, R. Rohm, N. Seiberg, E. Witten: Phys.
Lett. B 156, 55 (1985); C. Kounnas, M. Porrati: Phys. Lett. B 191, 91 (1987) 558
19. G. Veneziano, S. Yankielowicz: Phys. Lett. B 113, 231 (1982) 558
20. T. R. Taylor: Phys. Lett. B 164, 43 (1985) 558
21. S. Ferrara, N. Magnoli, T. R. Taylor, G. Veneziano: Phys. Lett. B 245, 409
(1990) 558
22. A. Font, L. E. Ibanez, D. Lüst, F. Quevedo: Phys. Lett. B 245, 401 (1990);
H. P. Nilles, M. Olechowski: Phys. Lett. B 248, 268 (1990); P. Binétruy,
M. K. Gaillard: Phys. Lett. B 253, 119 (1991); M. Cvetic, A. Font, L. E. Ibanez,
D. Lüst, F. Quevedo: Nucl. Phys. B 361, 194 (1991); D. Lüst, T. R. Taylor:
Phys. Lett. B 253, 335 (1991); P. Binétruy, M. K. Gaillard, T. R. Taylor: Nucl.
Phys. B 455, 97 (1995); P. Nath, T. R. Taylor: Phys. Lett. B 548, 77 (2002) 558
23. V. Kaplunovsky, J. Louis: Nucl. Phys. B 444, 191 (1995) 558
24. N. V. Krasnikov: Phys. Lett. B 193, 37 (1987); L. J. Dixon: SLAC-PUB-5229
(1990), published in DPF Conf. 1990, pp. 811–822; T. R. Taylor: Phys. Lett.
B 252, 59 (1990) 559
Dualities in String Cosmology1

K. A. Meissner

Institute of Theoretical Physics, Warsaw University, Hoża 69, 00-681 Warsaw,
Poland
[email protected]

Abstract. We describe in this chapter a set of duality symmetries present in the
string-inspired theory of gravity coupled to the dilaton. These dualities are the
cornerstones of String Cosmology, which provides alternatives to the usual inflation
scenario. The crucial role of Prof. Gabriele Veneziano in the discovery and the
development of string dualities is described and emphasized.

1 Introduction

Before turning to the description of dualities in String Cosmology and the
fundamental role of Prof. Gabriele Veneziano in their discovery, I would like to
devote a few lines to some personal recollections. I met Gabriele for the first
time in 1984 when, as a young experimental physicist, I worked for a few
months in the UA2 group at CERN. This was a remarkable year not only
for CERN (because of the Z and W discoveries) but also for string theory,
initiated 17 years earlier by Gabriele (because of the discovery of the
Green–Schwarz mechanism of cancellation of anomalies). Obviously, at that time the
distance between a young experimental physics student and the world-famous
theoretical physicist was so huge that I neither dared to approach Gabriele nor
imagined that I ever would. Fortunately, a few years later, I gave a seminar at
the theory division at CERN and Gabriele was generous enough to encourage
me to apply for a 1-year position there. My stay at CERN in 1990–1991 was
the beginning of our collaboration (marked by the publication of two papers that I
consider the most important in my life) and of a friendship for which I am deeply
grateful.
1 In honour of Prof. Gabriele Veneziano

Krzysztof A. Meissner: Dualities in String Cosmology, Lect. Notes Phys. 737, 561–571 (2008)
DOI 10.1007/978-3-540-74233-6_18 © Springer-Verlag Berlin Heidelberg 2008

Although general relativity is a theory with an extremely large group of
local symmetries (i.e. the group of diffeomorphisms), it is very difficult to find
any nontrivial global symmetry not directly linked with diffeomorphisms. The
first such symmetry was discovered by Ehlers [1] in 1957, when it was shown
that four-dimensional pure gravity with one Killing vector exhibits SL(2, R)
symmetry. The argument is very simple and can be presented in a few lines.
One starts with a metric parameterized as (the Killing vector is assumed here
to be along the spatial coordinate z)

$$ds^2 = \left( e^{-\rho} g_{ij} + e^{\rho} A_i A_j \right) dx^i dx^j + 2 e^{\rho} A_i\, dx^i dz + e^{\rho} (dz)^2, \tag{1}$$

where i, j = 0, 1, 2, all fields are real and depend only on $x^i$. Calculating the
scalar curvature we get

$$\sqrt{-g^{(4)}}\, R^{(4)} = \sqrt{-g^{(3)}} \left( R^{(3)} - \frac{1}{4} e^{2\rho} F_{ij}^2 - \frac{1}{2} (\nabla\rho)^2 + \Box\rho \right). \tag{2}$$

In three dimensions a vector is dual to a scalar. We perform the dualization
by a substitution in the action

$$F_{ij} = e^{-2\rho} \epsilon_{ijk} \partial^k \sigma, \tag{3}$$

and changing the sign of the resulting expression (which follows from the
path integral formulation of dualization). Then introducing a complex field $\phi$
defined on the upper half-plane

$$\phi = \sigma + i e^{\rho}, \tag{4}$$

we get the reduced three-dimensional action

$$S = \int \sqrt{-g^{(3)}} \left( R^{(3)} - \frac{\nabla\bar\phi\, \nabla\phi}{2 (\mathrm{Im}\, \phi)^2} \right). \tag{5}$$

This action is explicitly invariant under the Ehlers SL(2, R) group acting as

$$\phi \to \frac{a\phi + b}{c\phi + d}, \tag{6}$$

with real numbers a, b, c, d satisfying ad − bc = 1 (the Ehlers group was recently
also found at higher-derivative orders in the gravitational action [2]).
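The invariance of (5) under (6) can be checked in a few lines. The block below is a sketch of the standard argument, added for illustration and not taken from the cited papers; $\phi'$ denotes the transformed field, and the three-dimensional metric $g^{(3)}_{ij}$ is assumed not to transform.

```latex
% Moebius map with real a, b, c, d and ad - bc = 1:
\phi' = \frac{a\phi + b}{c\phi + d}
\quad\Longrightarrow\quad
\nabla\phi' = \frac{(ad - bc)\,\nabla\phi}{(c\phi + d)^2} = \frac{\nabla\phi}{(c\phi + d)^2},
\qquad
\mathrm{Im}\,\phi' = \frac{(ad - bc)\,\mathrm{Im}\,\phi}{|c\phi + d|^2} = \frac{\mathrm{Im}\,\phi}{|c\phi + d|^2}.

% The two factors of |c\phi + d|^4 cancel in the sigma-model term of (5):
\frac{\nabla\bar\phi'\,\nabla\phi'}{(\mathrm{Im}\,\phi')^2}
= \frac{\nabla\bar\phi\,\nabla\phi}{|c\phi + d|^4}\,
  \frac{|c\phi + d|^4}{(\mathrm{Im}\,\phi)^2}
= \frac{\nabla\bar\phi\,\nabla\phi}{(\mathrm{Im}\,\phi)^2}.
```

Since $\mathrm{Im}\,\phi' > 0$ whenever $\mathrm{Im}\,\phi > 0$, the map also preserves the upper half-plane on which $\phi$ is defined.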
With two Killing vectors the resulting symmetry is much bigger – in fact
it is an infinite symmetry group called the Geroch group [3]. It was found in
1971 as a “solution generating” technique in a set of stationary, axisymmetric
solutions of Einstein’s equations (the actual infinite Lie algebra structure
was found later, and the group structure of finite transformations still later, at
the beginning of the 1980s, in connection with coset constructions, nonlinear
σ-models and the Kac–Moody SL(2, R) algebra).
In 1984 (the year of the “first string revolution”) there appeared a notion
of duality in string theory which in the simplest form says that the propagation
of strings on a manifold of radius R is equivalent to the propagation on a
manifold of radius 1/R [4]. It was later generalized to more complicated fixed
background situations and there was an argument that it should be an exact
symmetry to all orders in α′ [5].
In 1986 Narain discovered a large set of consistent string theories corresponding
to toroidal compactifications of the heterotic string [6]. These
compactifications are parameterized by points in the coset space G/H =
SO(26 − d, 10 − d)/[SO(26 − d) × SO(10 − d)]. Although all these theories
were unrealistic phenomenologically (they correspond to N = 4 supersymmetry
in D = 4), the discovery ended the dream of a unique string theory. The
points on the Narain lattice did, however, correspond to different theories
and not to different solutions inside the same theory, so the construction is
more a “theory generating” than a “solution generating” technique. The second
paper of [6] found a correspondence between the Lorentzian self-dual lattices
in the heterotic picture of (26 − d, 10 − d) compactification and the (10 − d)
compactification in the presence of gauge and antisymmetric fields (that are
of direct concern to subsequent developments).

2 Scale Factor Duality

The crucial observation that later led to extremely fruitful lines of research
(especially in cosmology but in many other areas as well) was made by Gabriele
in the paper released in April 1991 and published half a year later [7]. As a
starting point he took the effective action for fields that are always in the
massless spectrum of any closed string theory: gravity and the dilaton φ. The
action reads

$$\Gamma^{(0)} = \frac{1}{2\kappa^2} \int d^{d+1}x\, \sqrt{-g}\, e^{-\phi} \left[ R + (\nabla\phi)^2 \right]. \tag{7}$$

Then the assumption is made that all fields depend only on time. With the
diagonal ansatz for the metric:

$$g_{\mu\nu} = \mathrm{diag}\left( -1, a_1^2(t), \ldots, a_d^2(t) \right), \tag{8}$$

the main observation of the paper states that the equations of motion are
invariant under

$$\Phi \to \Phi, \qquad a_i(t) \to a_i^{-1}(t), \tag{9}$$

with

$$\Phi = \phi - \sum_i \ln(a_i(t)). \tag{10}$$
Such a symmetry is called in the paper scale factor duality (SFD) and it
heavily relies on the fact that string theory predicts the relative coefficient
of two terms in (7) to be 1 – otherwise the symmetry would not be there!
Gabriele showed that starting with some simple solutions one can generate
new solutions that still solve the equations of motion. The paper attracted
broad attention and was the basis for the introduction, almost 2 years later,
of String (Pre-Big-Bang) Cosmology by Maurizio Gasperini and Gabriele [8]
(each paper now has more than 500 citations).

3 O(d, d) Symmetry to the Lowest Order

Even before the seminal paper on scale factor duality was released, during our
discussions with Gabriele the question arose whether it can be generalized to
other ﬁelds, most notably the antisymmetric tensor Bμν which also is present
in the massless spectrum. Soon after it turned out that the symmetry is much
bigger than just the discrete Z2 of SFD – it is actually a large noncompact
O(d, d) symmetry with SFD as a small discrete subgroup. The paper released
in Autumn 1991 [9] took as a starting point the string massless action
$$\Gamma^{(0)} = \frac{1}{2\kappa^2} \int d^{d+1}x\, \sqrt{-g}\, e^{-\phi} \left[ R + (\nabla\phi)^2 - \frac{1}{12} H^2 \right], \tag{11}$$

where Hμνρ = ∂μ Bνρ + cyclic. The actions (7) and (11) are taken as basic
ingredients in String Cosmology and the existence of duality plays a funda-
mental role there (see the contribution of M. Gasperini in this volume [10]).
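When fields depend only on time we can write

$$g_{\mu\nu} = \begin{pmatrix} -1 & 0 \\ 0 & G(t) \end{pmatrix}, \qquad B_{\mu\nu} = \begin{pmatrix} 0 & 0 \\ 0 & B(t) \end{pmatrix}. \tag{12}$$

Then the paper introduced a symmetric 2d × 2d matrix (introduced earlier in
a different context of fixed backgrounds by Shapere and Wilczek [11])

$$M_0 = \begin{pmatrix} G^{-1} & -G^{-1} B \\ B G^{-1} & G - B G^{-1} B \end{pmatrix} \tag{13}$$

belonging to the O(d, d) group:

$$M_0^T = M_0, \qquad M_0 \eta M_0 = \eta, \qquad \eta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{14}$$

The action (11) can then be rewritten as

$$\Gamma^{(0)} = -\frac{1}{2\kappa^2} \int dt\, e^{-\Phi} \left[ \dot\Phi^2 + \frac{1}{8} \mathrm{Tr}[\dot M \eta \dot M \eta] \right]. \tag{15}$$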

The action (15) is explicitly symmetric under the action of the O(d, d) group:

$$M_0 \to \Omega^T M_0\, \Omega, \qquad \Phi \to \Phi, \tag{16}$$

where Ω belongs to O(d, d): $\Omega^T \eta\, \Omega = \eta$.
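Scale factor duality is recovered as the simplest nontrivial case of (16). The following lines are an illustrative sketch added here, assuming B = 0 and the diagonal metric (8), and writing M for $M_0$ as in (15).

```latex
% B = 0, G = diag(a_1^2, ..., a_d^2):
M_0 = \begin{pmatrix} G^{-1} & 0 \\ 0 & G \end{pmatrix}.

% The choice \Omega = \eta is an O(d,d) element, since \eta^T \eta\, \eta = \eta, and
\eta^T M_0\, \eta = \begin{pmatrix} G & 0 \\ 0 & G^{-1} \end{pmatrix},
% i.e. G -> G^{-1}, or a_i -> 1/a_i at fixed \Phi, which is the transformation (9).

% The kinetic term of (15) depends on the scale factors only through (\dot a_i / a_i)^2:
\frac{1}{8}\,\mathrm{Tr}[\dot M \eta \dot M \eta]
= -\frac{1}{4}\,\mathrm{Tr}\!\left[ G^{-1}\dot G\, G^{-1}\dot G \right]
= -\sum_i \left( \frac{\dot a_i}{a_i} \right)^2 .
```

Under $a_i \to 1/a_i$ each ratio $\dot a_i/a_i$ only changes sign, so the reduced action is unchanged.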

Soon afterwards our second joint paper appeared [12], which discussed the general
solutions to the equations of motion using the conserved current connected
with the presence of a global continuous O(d, d) symmetry of the action. It
is appropriate to describe them here in some detail, as it seems that their
potential has not yet been fully exploited.
We start with the description of the group O(d, d). We write the general
element of the group as

$$\Omega = \Omega_t\, \Omega_n = \begin{pmatrix} A_1^{-1} & A_1^{-1} A_2 \\ 0 & A_1^T \end{pmatrix} \begin{pmatrix} 1 & 0 \\ A_3 & 1 \end{pmatrix}, \tag{17}$$

where $A_2$, $A_3$ are antisymmetric d × d matrices. The action of $\Omega_t$ on M gives
a trivial constant rescaling and shift of G and B:

$$G \to G' = A_1 G A_1^T, \qquad B \to B' = A_1 B A_1^T - A_2. \tag{18}$$

The action of $\Omega_n$, however, gives genuinely new (and complicated) solutions.
For example, even if we start from the simplest case B = 0, we get

$$G' = (1 - G A_3 G A_3)^{-1} G, \qquad B' = G A_3 (1 - G A_3 G A_3)^{-1} G. \tag{19}$$
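Formula (19) can be checked with exact rational arithmetic. The sketch below is only an illustration for d = 2: the sample matrices `G` and `A3` and all helper names are assumptions introduced here, not taken from the papers. It builds $M_0$ from (13), rotates it with $\Omega_n$ from (17) as in (16), and compares the result with the matrix built from the right-hand sides of (19).

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def neg(X):
    return [[-x for x in row] for row in X]

def inv2(X):
    """Inverse of a 2 x 2 matrix (exact when the entries are Fractions)."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def block(A, B, C, D):
    """Assemble the 2d x 2d matrix [[A, B], [C, D]] from d x d blocks."""
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

def build_M(G, B):
    """The matrix M_0 of eq. (13)."""
    Gi = inv2(G)
    return block(Gi, neg(matmul(Gi, B)),
                 matmul(B, Gi), sub(G, matmul(matmul(B, Gi), B)))

ONE = [[F(1), F(0)], [F(0), F(1)]]
ZERO = [[F(0), F(0)], [F(0), F(0)]]
ETA = block(ZERO, ONE, ONE, ZERO)

# Illustrative inputs (assumptions): symmetric G, antisymmetric A3, B = 0.
G = [[F(2), F(1)], [F(1), F(3)]]
A3 = [[F(0), F(1, 2)], [F(-1, 2), F(0)]]

omega_n = block(ONE, ZERO, A3, ONE)          # second factor of eq. (17)
M = build_M(G, ZERO)
M_rotated = matmul(matmul(transpose(omega_n), M), omega_n)   # eq. (16)

# Right-hand sides of eq. (19).
P = matmul(G, A3)
K = inv2(sub(ONE, matmul(P, P)))             # (1 - G A3 G A3)^(-1)
G_new = matmul(K, G)
B_new = matmul(matmul(P, K), G)

omega_in_group = matmul(matmul(transpose(omega_n), ETA), omega_n) == ETA
M_in_group = matmul(matmul(M, ETA), M) == ETA
eq19_holds = M_rotated == build_M(G_new, B_new)
print(omega_in_group, M_in_group, eq19_holds)
```

Exact fractions are used so that the comparison is an equality test rather than a floating-point tolerance check; a larger d would need a general matrix inverse in place of `inv2`.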

We now turn to the equations of motion from the action (15). The ﬁrst
equation follows from reintroducing G00 into the action and from setting
to zero the corresponding variation. This gives directly the “Zero Energy”
condition

$$(\dot\Phi)^2 + \frac{1}{8} \mathrm{Tr}[\dot M \eta \dot M \eta] - V(\Phi) = 0, \tag{20}$$
where we allow now for a potential V (Φ). If we assume that the potential does
not break the symmetry explicitly, then it can depend on M only through a
function of the invariants Tr(M η)p , p = 1, 2, .... However, for p odd these
traces vanish and for p even they are equal to 2d; hence, the potential can
depend only on Φ.
Such a potential is however rather unusual since Φ is not a scalar under
general coordinate transformations (but see [10] for its inclusion into a fully
general-covariant formulation). On the other hand, if the presence of V (Φ)
could be justiﬁed, then one would have a relatively simple solution to the so-
called graceful exit problem in String Cosmology [13, 14]. This point deserves
further study.
The variation of the action with respect to Φ yields:

$$(\dot\Phi)^2 - 2\ddot\Phi - \frac{1}{8} \mathrm{Tr}[\dot M \eta \dot M \eta] - \frac{\partial V(\Phi)}{\partial\Phi} = 0. \tag{21}$$

The variation of the action with respect to M has to be done carefully, since
M is subject to several constraints (it is symmetric and belongs to O(d, d)).
The resulting equation reads

$$\partial_t (M \eta \dot M) = \dot\Phi\, (M \eta \dot M), \tag{22}$$

which can be integrated to give

$$e^{-\Phi} (M \eta \dot M) = \mathrm{const} = A. \tag{23}$$

The constant matrix A satisfies

$$A^T = -A, \qquad M \eta A = -A \eta M. \tag{24}$$

It is obvious that (20), (21) and (23) are invariant under the full O(d, d) group.
Substituting (23) into (20) we obtain the first-order equation for Φ:

$$(\dot\Phi)^2 = \frac{\exp(2\Phi)}{8} \mathrm{Tr}(A\eta)^2 + V(\Phi), \tag{25}$$

which can be solved by quadratures:

$$t = \int_{\Phi_0}^{\Phi} dy \left[ \frac{\exp(2y)}{8} \mathrm{Tr}(A\eta)^2 + V(y) \right]^{-1/2}. \tag{26}$$

This solution can then be used to define a “dilaton time” τ:

$$\tau = \int_{t_0}^{t} e^{\Phi}\, dt'. \tag{27}$$

In terms of τ the general solution of (23) simply reads

$$M(t) = \exp(-A \eta \tau)\, M(t_0). \tag{28}$$
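The steps leading from (23) to (25) and (28) can be spelled out as follows. This is a sketch added for the reader, using only (14), (20), (23), (24) and (27); it is not quoted from [12].

```latex
% From M \eta M = \eta and \eta^2 = 1:  M^{-1} = \eta M \eta.  Then (23) gives
\dot M = e^{\Phi}\, \eta M^{-1} A = e^{\Phi}\, M \eta A = -e^{\Phi}\, A \eta M ,
% where the last step uses (24).

% With d\tau = e^{\Phi} dt from (27) this is a linear equation with constant coefficients:
\frac{dM}{d\tau} = -A\eta\, M
\quad\Longrightarrow\quad
M(t) = \exp(-A\eta\tau)\, M(t_0).

% Kinetic term: use \dot M = e^{\Phi} M\eta A, then (24), then M\eta M\eta = 1:
\mathrm{Tr}[\dot M \eta \dot M \eta]
= e^{2\Phi}\, \mathrm{Tr}[M\eta A\eta\, M\eta A\eta]
= -e^{2\Phi}\, \mathrm{Tr}[A\eta\, (M\eta M\eta)\, A\eta]
= -e^{2\Phi}\, \mathrm{Tr}(A\eta)^2 .

% Inserting this into (20) reproduces (25):
(\dot\Phi)^2 = \frac{e^{2\Phi}}{8}\, \mathrm{Tr}(A\eta)^2 + V(\Phi).
```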

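The case of vanishing potential V = 0 is easy to analyse. In this case we
get

$$e^{\Phi} = \frac{C}{T - t}; \qquad C = \sqrt{\frac{8}{\mathrm{Tr}(A\eta)^2}}. \tag{29}$$

We also get

$$\tau = -C \ln\frac{T - t}{T - t_0}, \tag{30}$$

so that the solution for M reads

$$M(t) = \exp\left[ C A \eta\, \ln\frac{T - t}{T - t_0} \right], \tag{31}$$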

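where we assumed that $M(t_0) = 1$. Consider, for instance, the special form
of A:

$$A = \begin{pmatrix} 0 & -A_d \\ A_d & 0 \end{pmatrix}, \tag{32}$$

where $A_d = \mathrm{diag}(a_1, \ldots, a_d)$. In this case we get

$$M(t) = \begin{pmatrix} \mathrm{diag}\left[ \left( \frac{T - t}{T - t_0} \right)^{-2\alpha_1}, \ldots \right] & 0 \\ 0 & \mathrm{diag}\left[ \left( \frac{T - t}{T - t_0} \right)^{2\alpha_1}, \ldots \right] \end{pmatrix}, \tag{33}$$

where $\alpha_i = a_i / \sqrt{\sum_j a_j^2}$. These solutions are exactly the ones discussed in
[7] (see also [15]).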
It is equally easy to analyse the case V = Λ = const [12]. The solution then
reads

$$e^{\Phi} = \frac{C \sqrt{\Lambda}}{\sinh(\sqrt{\Lambda}\, (T - t))} \tag{34}$$

and

$$M(t) = \exp\left[ C A \eta\, \ln\frac{\tanh(\sqrt{\Lambda}\, (T - t)/2)}{\tanh(\sqrt{\Lambda}\, (T - t_0)/2)} \right], \tag{35}$$

where we again assumed that $M(t_0) = 1$.
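That (34) solves the first-order equation (25) with V = Λ can be confirmed numerically. The sketch below is illustrative only: the function names and parameter values are assumptions made here, and the time derivative is approximated by a central finite difference rather than computed analytically.

```python
import math

def dilaton(t, C, lam, T):
    """Phi(t) from eq. (34): exp(Phi) = C sqrt(Lambda) / sinh(sqrt(Lambda) (T - t))."""
    s = math.sqrt(lam)
    return math.log(C * s / math.sinh(s * (T - t)))

def residual_eq25(t, C, lam, T, h=1e-5):
    """Left side minus right side of eq. (25) with V = Lambda.

    Eq. (29) gives Tr(A eta)^2 = 8 / C^2, so eq. (25) becomes
    Phi_dot^2 = exp(2 Phi) / C^2 + Lambda.  Phi_dot is estimated with a
    central finite difference of step h.
    """
    phi_dot = (dilaton(t + h, C, lam, T) - dilaton(t - h, C, lam, T)) / (2 * h)
    return phi_dot ** 2 - (math.exp(2 * dilaton(t, C, lam, T)) / C ** 2 + lam)

# Illustrative parameter values (assumptions, chosen only so that t < T).
C, LAM, T_END = 1.5, 0.7, 2.0
for t in (0.0, 0.5, 1.0):
    print(t, residual_eq25(t, C, LAM, T_END))
```

The printed residuals should be zero up to the finite-difference error. For very small Λ the same function approaches ln(C/(T − t)), which is the V = 0 solution (29).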
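In [12] one amusing solution was found for d = 9 (corresponding to the
usual uncompactified superstring). As A we take

$$A = \begin{pmatrix} 0 & \mathrm{diag}(-a_1, \ldots, -a_9) \\ \mathrm{diag}(a_1, \ldots, a_9) & 0 \end{pmatrix}. \tag{36}$$

Then M is equal to

$$M = \begin{pmatrix} \mathrm{diag}\left[ \tanh^{-2\alpha_1}(\sqrt{\Lambda}\, (T - t)/2), \ldots \right] & 0 \\ 0 & \mathrm{diag}\left[ \tanh^{2\alpha_1}(\sqrt{\Lambda}\, (T - t)/2), \ldots \right] \end{pmatrix}, \tag{37}$$

where $\alpha_i = a_i / \sqrt{\sum_j a_j^2}$. The scalar curvature and dilaton are finite for t → T
when

$$\sum_i \alpha_i = 1. \tag{38}$$

Assuming that all $|\alpha_i|$ are equal, the only solution for $\alpha_i$ is (−1/3, −1/3,
−1/3, +1/3, .., +1/3), so that three dimensions expand and six dimensions
contract.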
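The counting behind this statement is elementary and can be checked by brute force. In the sketch below (an illustration with variable names chosen here), the normalization of the $\alpha_i$ implies $\sum_i \alpha_i^2 = 1$, which for nine equal magnitudes fixes $|\alpha_i| = 1/3$; the loop then tries every sign assignment and keeps those satisfying (38).

```python
from fractions import Fraction
from itertools import product

d = 9
alpha = Fraction(1, 3)      # common magnitude: d * alpha**2 == 1

# Number of negative exponents in every sign assignment obeying eq. (38).
negative_counts = set()
for signs in product((-1, 1), repeat=d):
    if sum(s * alpha for s in signs) == 1:
        negative_counts.add(signs.count(-1))

print(negative_counts)
```

Every solution has the same number of negative exponents, so the split into three and six dimensions is unique up to relabelling of the coordinates.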

4 O(d, d) Symmetry to the Next Order

In the paper [12] an argument was given that O(d, d) symmetry should be
present to all orders in the α′ expansion. The argument can be summarized
as follows: Consider a conformal background (a string vacuum) of massless
fields (metric, torsion and dilaton) which do not depend upon a particular
set of (possibly noncompact) string coordinates $X^a$ (a = 1, 2, ..., d). The associated
nilpotent string BRST operator Q will depend trivially (i.e. quadratically
at most) on the 2d phase-space variables $Z = (P_a, X^a)$. We perform
now a global, canonical O(d, d) transformation on the Z variables. Since this
transformation preserves commutation relations and Wick contractions, the
new BRST operator will also be nilpotent. However, the change in Z can be
traded for a change in the backgrounds, implying that also the transformed
backgrounds define a (generally inequivalent) conformal theory. Thus, O(d, d)
should be a symmetry of this particular class of string vacua. The action of
O(d, d) can be very complicated, however, when we go over to higher orders
in the α′ expansion.
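Indeed it was shown later in [16] that to the next order a seemingly very
complicated string action

$$\begin{aligned}
\Gamma = \frac{1}{2\kappa^2} \int d^{d+1}x\, \sqrt{-g}\, e^{-\phi} \Big\{ & R + (\nabla\phi)^2 - \frac{1}{12} H^2 \\
& + \alpha' \Big[ -R^2_{GB} + 4 \Big( R^{\mu\nu} - \frac{1}{2} g^{\mu\nu} R \Big) \partial_\mu\phi\, \partial_\nu\phi - 2\, \Box\phi\, (\nabla\phi)^2 + (\nabla\phi)^4 \\
& \qquad + \frac{1}{2} \Big( R^{\mu\nu\sigma\rho} H_{\mu\nu\alpha} H_{\sigma\rho}{}^{\alpha} - 2 R^{\mu\nu} H^2_{\mu\nu} + \frac{1}{3} R H^2 \Big) - D^\mu \partial^\nu\phi\, H^2_{\mu\nu} + \frac{1}{3} \Box\phi\, H^2 \\
& \qquad - \frac{1}{6} H^2 (\nabla\phi)^2 - \frac{1}{24} H_{\mu\nu\lambda} H^{\nu}{}_{\rho\alpha} H^{\rho\sigma\lambda} H_{\sigma}{}^{\mu\alpha} + \frac{1}{8} H^2_{\mu\nu} H^{2\,\mu\nu} - \frac{1}{144} (H^2)^2 \Big] \Big\},
\end{aligned} \tag{39}$$

also exhibits the O(d, d) symmetry. To display this symmetry one has to redefine
M by O(d, d) rotations of order α′. The redefinition can be written as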

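$$M \to \omega^T M_0\, \omega, \tag{40}$$

where ω is in the form

$$\omega = \exp \begin{pmatrix} a_1 & a_2 \\ a_3 & -a_1^T \end{pmatrix}, \qquad a_2^T = -a_2, \quad a_3^T = -a_3, \tag{41}$$

with

$$\begin{aligned}
a_1 &= -\alpha' \Big( -\frac{1}{2} \dot G G^{-1} \dot G G^{-1} + \frac{1}{2} \dot B G^{-1} \dot B G^{-1} \Big), \\
a_2 &= -\alpha' \Big( -\dot G G^{-1} \dot B - \dot B G^{-1} \dot G + \frac{1}{2} (\dot G G^{-1} \dot G - \dot B G^{-1} \dot B)\, G^{-1} B \\
&\qquad\quad + \frac{1}{2} B G^{-1} (\dot G G^{-1} \dot G - \dot B G^{-1} \dot B) \Big), \\
a_3 &= 0.
\end{aligned} \tag{42}$$

Comparing with (17) we see that this redefinition is a trivial rotation (since
$a_3 = 0$).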
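With this new M, the action (39) reads

$$\Gamma = \int dt\, e^{-\Phi} \Big\{ -\dot\Phi^2 - \frac{1}{8} \mathrm{Tr}(\dot M \eta)^2 - \alpha' \Big[ \frac{1}{16} \mathrm{Tr}(\dot M \eta)^4 - \frac{1}{64} \big( \mathrm{Tr}(\dot M \eta)^2 \big)^2 - \frac{1}{4} \big( \mathrm{Tr}(\dot M \eta)^2 \big) \dot\Phi^2 - \frac{1}{3} \dot\Phi^4 \Big] \Big\}. \tag{43}$$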

Since the O(d, d) symmetry is continuous and global, it has an associated
conserved current, which means, for a theory depending only on time, that
the current should be constant (it is an “integrated once” equation of motion
for M). In analogy to (23) we call this constant A:

$$A = e^{-\Phi} \Big\{ M \eta \dot M + 2\alpha' \Big[ \frac{1}{2} M (\eta \dot M)^3 - \frac{1}{8} M \eta \dot M\, \mathrm{Tr}(\dot M \eta)^2 - M \eta \dot M\, \dot\Phi^2 \Big] \Big\}, \tag{44}$$

where $A^T = -A$ and $A \eta M = -M \eta A$.
The $g_{00}$ equation reads

$$0 = -\dot\Phi^2 - \frac{1}{8} \mathrm{Tr}(\dot M \eta)^2 - 3\alpha' \Big[ \frac{1}{16} \mathrm{Tr}(\dot M \eta)^4 - \frac{1}{64} \big( \mathrm{Tr}(\dot M \eta)^2 \big)^2 - \frac{1}{4} \big( \mathrm{Tr}(\dot M \eta)^2 \big) \dot\Phi^2 - \frac{1}{3} \dot\Phi^4 \Big]. \tag{45}$$

Equations (44) and (45) cannot be explicitly solved because of their nonlinear
structure. Since they are ﬁrst order in the derivatives, however, they are in
principle solvable by quadratures.

5 Discussion
The main result of the above papers consisted in showing that in string theory
there exist large symmetries of the dynamical backgrounds (and not only
symmetries of propagation of strings in different static backgrounds). These
developments led to the idea that the dilaton may play a crucial role in the
evolution of the Early Universe. The idea was taken as a cornerstone of the so-
called Pre-Big-Bang Cosmology, developed in 1993 by Gabriele and Maurizio
Gasperini [8], that has grown into a separate part of early cosmology in itself.
In this idea the scale factor duality (or, more generally, O(d, d) duality) plays a
crucial role – the history of the Universe for negative times t < 0 and positive
times t > 0 is connected by a duality transformation (9), combined with the
time reversal transformation t → −t.
Such a scenario gives a natural possibility for solving all the usual problems
of standard cosmology by a phase of superinflation at negative times, driven
by the presence of the dilaton field without the need of any extra inflaton field.
The scenario gives a spectrum of perturbations which is significantly different
from that of the usual inflationary scenario – the power spectra of Pre-Big-Bang
Cosmology calculated in two papers, written together with Alessandra
Buonanno and Carlo Ungarelli [17], show a significant difference in the origin
of structure: in this scenario it comes from the axion field fluctuations, and
not from the scalar perturbations of the inflaton field. It is, however, not clear
at present how to actually realize a connection from negative to positive
times (“passing through the singularity”), nor how to stop the dilaton from
evolving since, perturbatively, the dilaton does not develop any potential. This
is the so-called graceful exit problem [13, 14], for which there is no definite
solution at present (one of the possibilities is to invoke a duality-invariant
potential depending on Φ and not on φ; this seems to be very much against
the spirit of general relativity; see however [10] for a general-covariant, non-local
interpretation of such a potential).
There has been (and still is) intensive research initiated by
the discovery of dualities in String Cosmology. It is difficult even to list all
the different lines of research, so we name just a few of them here: large-scale
magnetic fields [18], cosmological perturbations [19], dilaton production
[20], relic gravitational waves [21], noncompact symmetries [22], ekpyrotic [23]
and cyclic [24] models of the Universe, phantom duality [25] and triality [26],
entropy of the Universe [27], black-hole solutions [28] and many others (for
more detailed references the reader may consult recent [29, 30] and earlier [31]
reviews).
Although it is not clear yet which of the ideas described in this paper will
be part of the future Early Universe Cosmology, one can safely predict that the
discovery of duality in the gravi-dilaton system (as well as all other discoveries
in Gabriele’s monumental set of achievements) will have a profound impact
on any future (not necessarily string related) theoretical research on gravity
and its symmetries.

References
1. J. Ehlers: Konstruktionen und Charakterisierung von Lösungen der Einstein-
schen Gravitationsfeldgleichungen, Dissertation (Hamburg, 1957) 561
2. C. Colonnello, A. Kleinschmidt: Ehlers symmetry at the next derivative order,
arXiv:0706.2816 [hep-th] 562
3. R. Geroch: J. Math. Phys. 12, 918 (1971); J. Math. Phys. 13, 394 (1972) 562
4. K. Kikkawa, M. Yamasaki: Phys. Lett. B 149, 357 (1984);
N. Sakai, I. Senda: Prog. Theor. Phys. 75, 692 (1986) 562
5. A. Giveon, E. Rabinovici, G. Veneziano: Nucl. Phys. B 322, 167 (1989) 563
6. K. S. Narain: Phys. Lett. B 169, 41 (1986);
K. S. Narain, M. H. Sarmadi, E. Witten: Nucl. Phys. B 279, 369 (1987) 563
7. G. Veneziano: Phys. Lett. B 265, 287 (1991) 563, 567
8. M. Gasperini, G. Veneziano: Astropart. Phys. 1, 317 (1993) 563, 569
9. K. A. Meissner, G. Veneziano: Phys. Lett. B 267, 33 (1991) 564
10. M. Gasperini: Dilaton cosmology and phenomenology, this volume 564, 565, 570
11. A. D. Shapere, F. Wilczek: Nucl. Phys. B 320, 669 (1989) 564
12. K. A. Meissner, G. Veneziano: Mod. Phys. Lett. A 6, 3397 (1991) 564, 567
13. R. Brustein, G. Veneziano: Phys. Lett. B 329, 429 (1994) 565, 569
14. M. Gasperini, J. Maharana, G. Veneziano: Nucl. Phys. B 472, 349 (1996) 565, 569
15. M. Mueller: Nucl. Phys. B 337, 37 (1990) 567
16. K. A. Meissner: Phys. Lett. B 392, 298 (1997) 568
17. A. Buonanno, K. A. Meissner, C. Ungarelli, G. Veneziano: Phys. Rev. D 57,
2543 (1998); JHEP 9801, 004 (1998) 569
18. M. Gasperini, M. Giovannini, G. Veneziano: Phys. Rev. Lett. 75, 3796 (1995) 570
19. R. Brustein, M. Gasperini, M. Giovannini, V. F. Mukhanov, G. Veneziano:
Phys. Rev. D 51, 6744 (1995) 570
20. M. Gasperini, G. Veneziano: Phys. Rev. D 50, 2519 (1994) 570
21. R. Brustein, M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B 361,
45 (1995) 570
22. J. Maharana, J. H. Schwarz: Nucl. Phys. B 390, 3 (1993) 570
23. J. Khoury, B. A. Ovrut, P. J. Steinhardt, N. Turok: Phys. Rev. D 64, 123522
(2001) 570
24. P. J. Steinhardt, N. Turok: Phys. Rev. D 65, 126003 (2002) 570
25. M. P. Dabrowski, T. Stachowiak, M. Szydlowski: Phys. Rev. D 68, 103519
(2003) 570
26. J. E. Lidsey: Phys. Rev. D 70, 041302 (2004) 570
27. G. Veneziano: Phys. Lett. B 454, 22 (1999) 570
28. A. Sen: Nucl. Phys. B 440, 421 (1995) 570
29. M. Gasperini, G. Veneziano: Phys. Rep. 373, 1 (2003) 570
30. J. E. Lidsey, D. Wands, E. J. Copeland: Phys. Rep. 337, 343 (2000) 570
31. A. Giveon, M. Porrati, E. Rabinovici: Phys. Rep. 244, 77 (1994) 570
Spontaneous Breaking
of Space–Time Symmetries

E. Rabinovici

Racah Institute of Physics, The Hebrew University of Jerusalem, 91904
Jerusalem, Israel
[email protected]

Abstract. Kinematical and dynamical mechanisms leading to the spontaneous
breaking of space–time symmetries are described. The symmetries affected are space
and time translations, space rotations, scale and conformal transformations.
Applications are made to solidification, string theory compactifications, the analysis of
stable theories with no ground states, supersymmetry breaking and the
determination of the value of the vacuum energy.

1 Introduction

This being a contribution to honor Gabriele Veneziano, I allow myself to open
with some personal words. I first heard Gabriele’s name on the radio,
when the late Yuval Ne’eman described the great importance of young
Gabriele’s work. That was in the late 1960s. Several years later, as a student,
I had the privilege to learn from a still very young Gabriele about the dual
model in full detail. These were outstanding lectures. Over the years I have
learned many more things from Gabriele, some of them through direct
collaborations, and in parallel we have developed a personal friendship for which I
am grateful.
It is not uncommon for young scientists to complain that their teachers
did not educate them appropriately and did not really point them
to the relevant information. I may have some such complaints of my own, but
not about Gabriele. I would have liked, for example, to know earlier about the ideas
of Kaluza and Klein. So, in order to somewhat reduce the complaints that will
be directed at me, I would like to use this opportunity to describe something
that is not taught extensively in particle physics courses, namely the
mechanisms that spontaneously break space–time symmetries. The world around us
is actually not explicitly invariant under translations or under rotations. It
is also not explicitly invariant under scale and conformal symmetries. In this
work we will review various mechanisms to break all these space–time
symmetries. I think they may yet play an important role in particle physics as
well. I will first describe attempts to break translational invariance
kinematically by imposing specific boundary conditions. Then I will review the Landau
theory of solidification and an attempt to apply it to generate a dynamical
mechanism for compactifications. I will discuss both the successes and
challenges of that approach. Next, in the context of breaking time-translational
invariance, I will discuss various systems which are well defined but have no
ground state. Following a review of the breaking of scale invariance and
conformal invariance, I will also not miss this opportunity to describe in a Katoish
manner that the vacuum energy in conformal/scale invariant theories is very
constrained, and its zero value does not depend on the presence or absence
of any spontaneously generated scales. This may eventually be recognized as
an important ingredient in understanding and explaining the cosmological
constant problem.

E. Rabinovici: Spontaneous Breaking of Space–Time Symmetries, Lect. Notes Phys. 737,
573–605 (2008)
DOI 10.1007/978-3-540-74233-6_19
© Springer-Verlag Berlin Heidelberg 2008

2 Spontaneous Breaking of Space Symmetries

Space symmetries include space translations and space rotations, and we
address here the spontaneous breaking of these space symmetries. This oc-
curs for example when a liquid solidifies and a lattice is formed. The standard
manner to identify the ground state of a system is to construct what is called
the effective potential. The symmetry properties of the ground state deter-
mine whether a spontaneous breaking of symmetries which are manifest in
the Lagrangian occurs.
Let’s review the manner in which the effective potential is constructed. One
first considers all wave functionals which have the same expectation value $\tilde\phi$ of
the field operator,

$$\langle \Psi(\phi) |\, \hat\phi\, | \Psi(\phi) \rangle = \tilde\phi. \tag{1}$$

Out of this subset of wave functionals, one chooses the particular wave
functional which minimizes the expectation value of the Hamiltonian. One
calls the resulting value $V_{\mathrm{eff}}(\langle\phi\rangle)$,

$$V_{\mathrm{eff}}(\tilde\phi) = \min_{\tilde\phi}\, \langle \Psi(\phi) |\, \hat H\, | \Psi(\phi) \rangle. \tag{2}$$

Eventually one draws a picture portraying $V_{\mathrm{eff}}$ as a function of $\tilde\phi$ and one
searches for its minimum. The wave functional for which this energy minimum
was obtained is the wave functional of the ground state of the system.
However, one usually ignores the possibility that the ground-state wave function
would correspond to a non-constant (in x) expectation value $\langle\phi(x)\rangle$.
Of course this makes drawing pictures in books much easier; here, however,
we will discuss cases where $\langle\phi(x)\rangle$ actually does depend on x when
evaluated in the ground state.
Why does one usually only consider wave functionals with constant values
of $\langle\phi(x)\rangle$?
The reason is expediency: when one wants to pick out the ground state of
the system among various candidates, one is interested only in the winner, that
is the true ground state. One does not care if one misses candidate states
whose energies are just above that of the ground state of the system. As one
generally does not expect spontaneous breakdown of space–time symmetries
in the ground state, one considers it enough to search for the ground state
only among those candidates for which $\langle\phi(x)\rangle$ is constant. However, that
need not always be the case.

2.1 Kinematics: Attempts to Break Spatial Translational
Invariance Through Boundary Conditions

I will first describe an easy way to attempt to break space symmetries, that is,
to break the symmetries not by the dynamics of the system but kinematically,
by imposing certain boundary conditions which may induce such a breaking.
This easy solution mirrors what is done in String Theory in several cases,
including when one is considering brane sectors. To try to break translational
invariance by boundary conditions, one considers for example a system which
depends on a scalar field φ. Assume the system lives in a box extending from
−L to L, and impose the condition of anti-periodicity, namely,

$$\phi(L) = -\phi(-L), \tag{3}$$

where L is the spatial cutoff we put on the system.
If the system at hand is described by an effective potential that has only
one minimum, as in Fig. 1, where the expectation value $\langle\phi\rangle$ vanishes,
then there is no effect resulting from imposing the boundary conditions. The
ground state does fulfill the boundary condition, and it remains the one which
does not break translational invariance. From the point of view of the wave
functional, it is concentrated around φ = 0.

Fig. 1. Unbroken symmetry

Fig. 2. Broken symmetry

However, consider the double-well potential of Fig. 2 (in circumstances
where there is no tunneling). In this case the effective potential has two min-
ima, one at φ = a and the other at φ = −a. Imposing the boundary condition
removes both the possible true vacua of the system, because neither the ground
state for which < φ >= −a nor the ground state for which < φ >= a obeys
the boundary condition. One is driven to look for another type of ground state.
We know, for example, that in a two-dimensional system composed only of
scalar fields there is a finite energy solution, a soliton, that at L has
the value a and at −L has the value −a, see Fig. 3. An anti-soliton will have the
opposite values. This is a stable topological configuration, and one may imagine
that indeed in such a system there is no translational invariance, because
the ground state will have to be such that its spatial expectation value follows
the values of the soliton field, and thus is not translationally invariant.
It is true that by imposing the boundary conditions one has forced the
system into the soliton sector, but one has to remember that this system has
a zero mode. Technically, if one solves the small fluctuations of the scalar field
in the presence of a background, which is a soliton, one finds that there is
a zero mode. This zero mode is a reflection of the underlying translational
invariance and it actually tells us that one is not able to determine, by en-
ergetic considerations, where the inflection point x0 (namely, the point from
which one turns from one vacuum to the other) is placed (see Fig. 3). Actually
there is a valid soliton solution for each value of x0 .
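A familiar explicit example may help here. The text does not specify the potential, so the block below assumes the quartic double well $V(\phi) = \frac{\lambda}{4}(\phi^2 - a^2)^2$ with Lagrangian density $\frac{1}{2}(\partial\phi)^2 - V(\phi)$; the coupling λ and the symbol $M_{\rm sol}$ are introduced only for this illustration. On the infinite line the static kink centred at $x_0$ then reads as follows, and it satisfies the box condition (3) up to exponentially small corrections when L is much larger than the kink width.

```latex
% Assumed model: L = (1/2)(\partial\phi)^2 - V,  V = (\lambda/4)(\phi^2 - a^2)^2
\phi_{x_0}(x) = a \tanh\!\left[ a \sqrt{\lambda/2}\; (x - x_0) \right],
\qquad \phi_{x_0}(\pm\infty) = \pm a .

% Energy (soliton mass), independent of x_0:
M_{\rm sol} = \int dx\, \left( \partial_x \phi_{x_0} \right)^2
            = \frac{2\sqrt{2}}{3}\, \sqrt{\lambda}\, a^3 .

% Zero mode: an infinitesimal shift of x_0, with norm fixed by the soliton mass
\psi_0(x) \propto \frac{\partial \phi_{x_0}}{\partial x_0} = -\partial_x \phi_{x_0},
\qquad
\int dx\, \left( \partial_x \phi_{x_0} \right)^2 = M_{\rm sol} .
```

In this example the norm of the zero mode equals the soliton mass, so the zero mode is normalizable exactly when the mass is finite.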
Why is this important? At the case at hand, the zero mode is normalizable.
This amounts to saying that the soliton mass is finite. In such a case, there is
actually no bulk violation of translational invariance. What one needs to do is
to construct a configuration which is an eigenstate of the linear momentum
operator, a plane wave in terms of the center of mass coordinate of the
soliton. The lowest-energy momentum eigenstate has p = 0, and in this way one
has restored translational invariance. The only problem will be to fix the
system very near the edges, but in the bulk the symmetry has been restored,
and there is no breaking of bulk translational invariance. Could one still have
a case where the boundary conditions do cause a spontaneous breaking of
translational invariance?

Fig. 3. A soliton attempts to break translational invariance

This may occur when one drives the mass of the soliton to infinity by an
appropriate choice of parameters. When the mass of the soliton is infinite,
physically one cannot form a linear momentum state out of it, and techni-
cally the zero mode ceases to be normalizable. In such a case, one does indeed
break translational invariance spontaneously by fixing the point where the
soliton makes the transition from one vacuum to the other. This occurs for
example in String Theory in a sector containing infinite branes: branes which
have finite energy do not break translational invariance, and one can build
out of them linear momentum states. However, branes which extend up to
infinity carry infinite energy, and therefore do lead to the breakdown of translational
invariance. I will mention at this point that once upon a time, when
people were considering the breakdown of extended global supersymmetries,
there was a predominant common wisdom which claimed that one cannot
break down extended global supersymmetry to anything but N = 0. That is,
either all the supersymmetries are manifest together, or they are all broken
together. The argument went in the following way: one writes the formula for
the Hamiltonian

$$H = \sum_{\alpha} \bar{Q}^{I}_{\alpha} Q^{I}_{\alpha} , \tag{4}$$

where I = 1, . . . , N is not summed, and it is a non-trivial constraint to get
the same Hamiltonian by summing over different supersymmetry generators.
When one can do that, one has an extended supersymmetry. However, it is
clear from this that if the Hamiltonian does not vanish on the ground state,
then some of the QI (for each I independently) do not annihilate the ground
state. Therefore, the supersymmetries are either all preserved or all broken.
This type of argument assumed implicitly that Poincaré invariance is
present in the system. If one now considers a system of branes (see for example
[1, 2]), then part of the Poincaré invariance is preserved and part is broken.
This exposes a loophole in the former argument, and in the absence of full
translational invariance (due to the presence of infinite-mass branes) one may
obtain fractional BPS states, and one may break down N = 4 to N = 2,
N = 2 to N = 1, and various other combinations.
This is an example where spontaneous breaking of translation invariance
occurs—it has an impact also on the partial breaking of global supersymmetry
and, if one wishes, this is a way to break translation invariance by forcing the
system, using boundary conditions, to a certain super-selection sector.
This is not what I mainly want to discuss here. I would like to discuss a
situation where the dynamics of the system drive the spontaneous breaking
of translational and rotational invariance.

2.2 Dynamics: The Landau Theory of Liquid–Solid Phase Transitions

Let us now turn to discuss the transition between a liquid and a solid. This fol-
lows the seminal work of Landau [3]. In a monumental paper he simultaneously
described spontaneous symmetry breaking of both internal and space–time
symmetries. Consider a liquid, a system whose Lagrangian is either relativis-
tic or non-relativistic, and it possesses full rotational and translational invari-
ance. A solid, on the other hand, is a system which maintains only a very
small part of the translational invariance and rotational invariance (Fig. 4).
Let us simplify the study by ignoring the point structure at each lattice
point which a solid may have. That is, let's not consider the atomic structure
at each point. One focuses first on the question of how the simplest lattice
forms.
I will describe this following Landau and then, following [4], I am going to
describe applications to String Theory. Landau starts by defining the Landau
order parameter to monitor the transition between a solid and a liquid. It is
a scalar order parameter ρ(x),

$$\rho(\mathbf{x}) = \rho_s(\mathbf{x}) - \rho_0 , \tag{5}$$

the difference between the non-translational non-rotational invariant density
of the solid ρ_s(x), and the constant density ρ_0 of the liquid. Next, consider
the Fourier decomposition of ρ(x)

$$\rho(\mathbf{x}) = \sum_{\mathbf{q}} \rho(\mathbf{q})\, e^{i \mathbf{q} \cdot \mathbf{x}} + \text{h.c.} \tag{6}$$

It is useful to use as order parameters the Fourier components ρ(q).

Fig. 4. The solid lattice breaks most of the translational and rotational invariance

The question is thus: Does the wave functional of the ground state have
support on q ≠ 0? If the answer is positive, then at the very least continuous
translational space symmetry would be spontaneously broken. This will be
determined by studying the Landau–Ginsburg effective action as expressed in
terms of the order parameter ρ(q). The first relevant term of the
Landau–Ginsburg action is quadratic in the order parameter and is given by

$$L_2 = \int d\mathbf{q}_1\, d\mathbf{q}_2\, \rho(\mathbf{q}_1)\, \rho(\mathbf{q}_2)\, A(|\mathbf{q}_1|^2)\, \delta(\mathbf{q}_1 + \mathbf{q}_2) . \tag{7}$$

The delta function δ(q₁ + q₂) enforces translational invariance, while
rotational invariance is preserved by the dependence on |q|² of the function
A(|q|²). The function A(|q|²), like in any Landau–Ginsburg potential, is
determined by the microscopic theory. In the particular case at hand, it will
depend on the hardcore potential component in the atoms involved and on
other possible potentials, as well as on the temperature of the system. In the
case of neutron stars, studied in [5], the Pauli exclusion principle plays a role
in determining the function A(|q|²).
Let us treat first an example that we are familiar with, that of a free
massive spin-zero particle in a relativistic field theory. In that case the function
A(|q|²) is

$$A(|\mathbf{q}|^2) = |\mathbf{q}|^2 + m^2 . \tag{8}$$

This has a minimum at |q|² = 0, as shown in Fig. 5, and thus the function
ρ(q) should get support only at q = 0: There is no spontaneous breakdown
of translational invariance in this case.
In the presence of interactions things may become more complicated; for
example, I am not familiar even with a proof that the standard model ground
state does not violate space–time symmetry (though most likely it does not).
In any case, the microscopic theory may allow a different function for A(|q|²).
In particular, assume that the form of A(|q|²) is as given in Fig. 6. In this case,
the function A(|q|²) has a minimum at a value |q₀|² ≠ 0. In such a system
the ground state wave functional gives rise to a density concentrated around
|q₀|² ≠ 0. In particular, one would expect the support to be concentrated
around a sphere in q-space, whose radius is |q₀|. So, given A(|q|²) of that form,
one is in a situation where there is a spontaneous breaking of translational
invariance, but not yet also a breaking of rotational invariance, which is what is
needed to form a solid. It is good enough to break just translational invariance.
The ground-state density does depend on x:

$$\rho(\mathbf{x}) = \int_{S_{|\mathbf{q}_0|}} d\Omega\, \rho(\mathbf{q})\, e^{i \mathbf{q} \cdot \mathbf{x}} + \text{h.c.} \tag{9}$$

Fig. 5. The form of A(|q|²) in a free massive relativistic field theory does not lead
to spontaneous breaking of translational invariance
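
The role of the minimum of A(|q|²) can be sketched numerically. The snippet below is only an illustration: `A_model`, with its parameters `q0` and `r`, is a hypothetical function chosen to have a minimum on a shell, and is not the microscopic A of any particular system; `A_free` is the free massive form of (8).

```python
def A_free(q2, m=1.0):
    """Free massive field, eq. (8): A(|q|^2) = |q|^2 + m^2."""
    return q2 + m ** 2


def A_model(q2, q0=1.0, r=0.1):
    """Hypothetical A(|q|^2) with its minimum on the shell |q| = q0 (illustration only)."""
    return r + (q2 - q0 ** 2) ** 2


def argmin_q(A, q_max=3.0, steps=3000):
    """Scan |q| on a uniform grid in [0, q_max]; return the |q| minimizing A(|q|^2)."""
    grid = [q_max * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda q: A(q * q))


q_free = argmin_q(A_free)    # minimum at |q| = 0: no translational breaking
q_shell = argmin_q(A_model)  # minimum at |q| = q0: support on a shell in q-space
print(q_free, q_shell)
```

Only the location of the minimum matters for the argument in the text; the particular functional form used here is an assumption.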

In this approximation the wave functional of the ground state is supported on
a sphere S_{|q₀|} whose radius is |q₀|. In particle physics we have become rather
sophisticated, and when one writes down the Landau–Ginsburg action, one
usually requires that the expansion which one does in the order parameter
be under control: That means, for example, that there is a limit in which
this expansion becomes exact. In the case at hand this is not the situation,
which is actually very complicated; nevertheless, one follows the usual
Landau–Ginsburg expansion.

Fig. 6. Example of a function A(|q|²) which leads to the breaking of translational
invariance. An explicit microscopic realization of such a form appears in neutron
stars [5]. The wave functional is concentrated at most on the shell of a sphere of
radius |q₀|

The term which follows the quadratic interaction is a cubic term:

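$$L = L_2 + L_3 , \tag{10}$$

$$L_3 = \int d^3\mathbf{q}_1\, d^3\mathbf{q}_2\, d^3\mathbf{q}_3\, \rho(\mathbf{q}_1)\, \rho(\mathbf{q}_2)\, \rho(\mathbf{q}_3)\, \delta(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3)\, B(|\mathbf{q}_1|^2, |\mathbf{q}_2|^2, |\mathbf{q}_3|^2, \mathbf{q}_1 \cdot \mathbf{q}_2, \mathbf{q}_1 \cdot \mathbf{q}_3, \mathbf{q}_2 \cdot \mathbf{q}_3) . \tag{11}$$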

For the purpose of illustration I am going to assume, as Landau did, that
this is a good perturbation, namely that when one considers L₃ one is going
already to assume that the support of ρ comes from only those values of q
such that |q₁|² ≅ |q₂|² ≅ |q₃|² ≅ |q₀|². This was determined by L₂.
In (11), once again, the delta function δ(q₁ + q₂ + q₃) enforces the explicit
translational invariance, and the dependence of B on the momentum respects
both translational and rotational invariance. The integral in the q's is not
over all possible values, but only over those whose length is determined by
|q₀|², which in turn was fixed by L₂.
An additional structure emerges due to the effect of the delta function
δ(q₁ + q₂ + q₃). It restricts the candidates for the ground state to have support
on at least three different values for the qᵢ. The three vectors appearing need
to sum up to give a triangle, see Fig. 7.
Actually they are six if the field is real since one needs

$$\rho(\mathbf{q}) = \rho(-\mathbf{q}) . \tag{12}$$

Thus one has at least six components of ρ(q) which do not vanish. In general,
instead of ρ(q) having support on all values of a sphere, they are now broken
into triplets where the qᵢ have to sum together to form triangles (Fig. 7). In
this manner also rotational invariance is spontaneously broken.
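
The constraint imposed by the delta function can be checked with a few lines of code. This is a minimal sketch, with function names chosen here for illustration: it fixes two vectors of equal length q0, sets the third to minus their sum, and asks at which relative angle the third vector also has length q0.

```python
import math


def third_side_length(q0, angle_deg):
    """|q3| for q3 = -(q1 + q2), with |q1| = |q2| = q0 and angle_deg between q1 and q2."""
    theta = math.radians(angle_deg)
    q1 = (q0, 0.0)
    q2 = (q0 * math.cos(theta), q0 * math.sin(theta))
    return math.hypot(q1[0] + q2[0], q1[1] + q2[1])


def closing_angles(q0, tol=1e-9):
    """Whole-degree angles between q1 and q2 for which q3 also has length q0."""
    return [a for a in range(0, 181)
            if abs(third_side_length(q0, a) - q0) < tol * q0]


# Only one relative angle survives: 120 degrees between the vectors, which
# corresponds to 60-degree interior angles when they are drawn head to tail.
print(closing_angles(1.0))
```

The result does not depend on the value of q0, so the preferred angle is fixed purely by the equal-length approximation together with momentum conservation.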
Let's be even more explicit. Because we have used the approximation that
all the qᵢ have the same length, the qᵢ that tessellate the sphere have to form
equilateral triangles, as in Fig. 7. Equilateral triangles single out a specific
angle, 60°; that is a spontaneous breaking of rotational invariance. One has
obtained a non-zero value for q, and one has derived that the ground state is
built out of objects which have to sum up to form triangles which are equilateral
and thus have 60° angles. From energy and combinatorial considerations
one then finds that, to be on an extremum, one needs all the values of ρ(qᵢ)
to be equal,

$$|\rho(\mathbf{q}_i)|^2 = |\rho(\mathbf{q}_0)|^2 , \tag{13}$$

which leads to

$$|\rho(\mathbf{x})|^2 = n\, |\rho(\mathbf{q}_0)|^2 , \tag{14}$$

where n is the number of non-vanishing components of ρ(q).

Fig. 7. The sphere S_{|q₀|} is triangulated due to the presence of a cubic term in
the Lagrangian. Since in this approximation all the sides of the triangles have the
same length, their angles are determined to be 60°. Rotational invariance is thus
spontaneously broken

There are a couple of general ways to distribute the triplets, one in which
each qᵢ appears in only one of the triplets, and another in which each value of
qᵢ participates in two triplets. The number of elements (i.e. the number
of triplet configurations) is proportional to n in both cases, being either 2n/3
or 4n/3. When one does the analysis and estimates the value of L₃, one
finds that it decreases as the inverse of √n:

$$L_3 \sim \frac{|\rho(\mathbf{q}_0)|^2}{\sqrt{n}} . \tag{15}$$
Thus the ground state will be obtained for some finite value of n. One needs
to consider only a finite number of triplet configurations when one searches
for the extrema of the free energy. Just three, i.e. six participants (if the field
is real), lead to the following density distribution

$$\rho(x, y) = \pm \left( \frac{2}{3} \right)^{1/2} \rho_{q_0} \left[ \cos(q_0 x) + 2 \cos\!\left( \tfrac{1}{2} q_0 x \right) \cos\!\left( \tfrac{\sqrt{3}}{2} q_0 y \right) \right] . \tag{16}$$

The corresponding free energy is

$$L_3^{n=3} = \frac{2 B \rho_{q_0}^3}{3\sqrt{3}} . \tag{17}$$
For the case of two spatial dimensions it turns out that if ρ(q₀) > 0 it is
advantageous to form a triangular lattice, while if ρ(q₀) < 0, the dual lattice,
which is a honeycomb lattice, is formed.
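
The bracket in (16) is just the sum of three plane waves whose wave vectors have equal length q0 and lie at 120° to each other (together with their negatives, which the cosines account for). A small numerical check, with helper names chosen here for illustration:

```python
import math


def rho_eq16_bracket(x, y, q0=1.0):
    """Bracket of eq. (16): cos(q0 x) + 2 cos(q0 x / 2) cos(sqrt(3) q0 y / 2)."""
    return (math.cos(q0 * x)
            + 2 * math.cos(0.5 * q0 * x) * math.cos(0.5 * math.sqrt(3) * q0 * y))


def rho_three_waves(x, y, q0=1.0):
    """Sum of cos(k . r) over three wave vectors of length q0 at 120 degrees to each other."""
    angles = (0.0, 2 * math.pi / 3, 4 * math.pi / 3)
    ks = [(q0 * math.cos(a), q0 * math.sin(a)) for a in angles]
    return sum(math.cos(kx * x + ky * y) for kx, ky in ks)


points = [(0.3, -1.2), (2.0, 0.7), (-4.1, 3.3)]
max_diff = max(abs(rho_eq16_bracket(x, y) - rho_three_waves(x, y)) for x, y in points)
print(max_diff)  # agreement up to floating-point rounding
```

The sample points are arbitrary; the identity behind the agreement is cos(A + B) + cos(A − B) = 2 cos A cos B.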
This required only studying the minimal possible configuration. In three
spatial dimensions this would be a candidate for a two-dimensional lattice in
three dimensions, if one wishes some type of compactification.
In three dimensions one needs to consider also larger configurations to
obtain the extrema. The next candidate configuration has six (n = 6), i.e.
twelve values of q. This is a more complicated configuration, whose density
distribution is

$$\rho(x, y, z) = \frac{2}{\sqrt{3}}\, \rho_{q_0} \left[ \cos\!\left( \tfrac{\sqrt{2}}{2} q_0 x \right) \cos\!\left( \tfrac{\sqrt{2}}{2} q_0 y \right) + \cos\!\left( \tfrac{\sqrt{2}}{2} q_0 x \right) \cos\!\left( \tfrac{\sqrt{2}}{2} q_0 z \right) + \cos\!\left( \tfrac{\sqrt{2}}{2} q_0 y \right) \cos\!\left( \tfrac{\sqrt{2}}{2} q_0 z \right) \right] , \tag{18}$$

which is actually that of a BCC lattice (in real space). The value of L₃ is
larger than for the former configuration:

$$L_3^{n=6} = \frac{4 B \rho_{q_0}^3}{3\sqrt{6}} > L_3^{n=3} , \tag{19}$$

and leads to the extrema of the free energy, corresponding to the most stable
configuration.
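
Two statements above lend themselves to a quick numerical check: that the bracket in (18) is a sum of plane waves whose wave vectors all have length q0, and that the numerical coefficient in (19) exceeds the one in (17). The sketch below does both; the function names are illustrative, and the common factor Bρ³ is set aside.

```python
import itertools
import math


def bcc_wavevectors(q0=1.0):
    """Six wave vectors (q0/sqrt(2)) * (1, +-1, 0) and permutations (negatives implied)."""
    a = q0 / math.sqrt(2)
    ks = []
    for i, j in itertools.combinations(range(3), 2):
        for sign in (1, -1):
            k = [0.0, 0.0, 0.0]
            k[i], k[j] = a, sign * a
            ks.append(tuple(k))
    return ks


def rho_eq18_bracket(x, y, z, q0=1.0):
    """Bracket of eq. (18): products of cosines with argument sqrt(2) q0 / 2 times a coordinate."""
    a = q0 * math.sqrt(2) / 2
    cx, cy, cz = math.cos(a * x), math.cos(a * y), math.cos(a * z)
    return cx * cy + cx * cz + cy * cz


def rho_six_waves(x, y, z, q0=1.0):
    """Half the sum of cos(k . r) over the six wave vectors above."""
    return 0.5 * sum(math.cos(kx * x + ky * y + kz * z)
                     for kx, ky, kz in bcc_wavevectors(q0))


lengths = [math.sqrt(sum(c * c for c in k)) for k in bcc_wavevectors(1.0)]
diff = abs(rho_eq18_bracket(0.4, -1.1, 2.3) - rho_six_waves(0.4, -1.1, 2.3))
coeff_n3 = 2 / (3 * math.sqrt(3))  # numerical factor in eq. (17)
coeff_n6 = 4 / (3 * math.sqrt(6))  # numerical factor in eq. (19)
print(lengths, diff, coeff_n3, coeff_n6)
```

Wave vectors of the form (1, ±1, 0) and permutations span a face-centred cubic lattice in q-space, which is why the real-space structure is body-centred cubic.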
From amazingly simple considerations, one has a prediction that solids
in three dimensions are all BCC lattices—a very universal description of the
system. Before confronting this claim with the data one needs to recall that
the transitions between solids and liquids are not second-order transitions,
they are actually first-order transitions. So, one may question the validity of
universality claims in this context. However, it turns out that in many cases
one can arrange that the solidification occurs as a weak first-order transition,
in which case approximate universality properties can be present.
Returning to the data and following [6], one discovers that about 40 metals,
which are on the left of the periodic table (excluding magnesium (Mg)), form
near the solidification point a BCC configuration. I will repeat the difficulties
of the analysis and the arguments for proceeding with it nevertheless. The
transition is first order; the fact that in many cases it is a weak first-order
transition softens this problem. There is no true expansion parameter in the
problem. The microscopic theory constructing A and B is very phenomenological,
and therefore the real relative stability of the metal is a very delicate
matter. Even taking all these into account, the result and its agreement with
a large body of the experimental data is striking.

Fig. 8. In the absence of a cubic term, a quartic term would not suffice classically to
induce a spontaneous breaking of rotational invariance. A rhombus does not single
out a preferred angle θ

Consider what would have happened without a cubic term. In that case,
the term following the quadratic term would be L4 , which schematically would
assume the form

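$$L_4 = \int d\mathbf{q}_1\, d\mathbf{q}_2\, d\mathbf{q}_3\, d\mathbf{q}_4\, \delta(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3 + \mathbf{q}_4)\, C(|\mathbf{q}_1|^2, |\mathbf{q}_2|^2, |\mathbf{q}_3|^2, |\mathbf{q}_4|^2, \mathbf{q}_1 \cdot \mathbf{q}_2, \mathbf{q}_1 \cdot \mathbf{q}_3, \ldots) \tag{20}$$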
where the delta function enforces translational invariance, and C should
be built by such invariants that maintain both rotational and translational
invariance.
This does not break rotational invariance because, unlike the case of trian-
gles, the conﬁgurations which are enforced now, assuming perturbation the-
ory, are those of quadrilaterals with equal sides. But for a rhombus (Fig. 8)
no angle is singled out. The rotational invariance is not broken. Fortunately
there is no microscopic symmetry consideration that rules out the cubic term.
Another interesting type of lattice is the Abrikosov lattice, formed of
vortices, which we do not discuss here.

String Theory Compactiﬁcations

What has been described above has a very solid basis in nature. What we
will describe next is of a much more speculative nature, and it is based on
work by Elitzur, Forge and myself [4] , in which we try to address the issues
of compactification in String Theory. There are several attitudes one might
adopt regarding compactification. One, which makes a lot of sense, is to say
that the Universe starts up very small, and the issue of compactification is an
issue of explaining why four dimensions became very large, while the rest of
the dimensions remain small. This is not what I am going to discuss here.
Here, I discuss possible dynamical aspects of compactification, taking into
account some of the hints learnt from the case of solid-state physics. I don't
have much confidence in human imagination when it is totally detached from
reality; I would hope that many of the hints available in nature will be useful for
understanding other phenomena. In particle physics one has learned quite a lot
from the dynamics of solid-state physics and statistical mechanics systems.
Returning to the case at hand we have just reviewed a system which has
lost most of its rotational and translational invariance, and we want to see
how such a thing could happen in String Theory. One of the key ingredients
driving this behavior is the presence of a bulk tachyon.
There are actually at least three types of tachyons/instabilities with which
one is familiar right now in String Theory. One is that of the Bosonic String
Theory tachyon. This instability could well be an incurable one; nevertheless,
let’s try and follow it.
The other types of instabilities, which we will discuss later, are an
instability in Open String Theory, an open string tachyon, and also localized
bulk tachyons.
For the moment we focus on bulk tachyons, which will be one key ingredient.
Due to them, it is preferable for a system in String Theory containing
a tachyon to have a support on a non-zero value of q². One can see this from
the form of the tachyon, whose vertex operator is the following

$$T(x) = e^{i q_0 x} + \text{h.c.} \tag{21}$$

To obtain a dimension (1, 1) operator one needs q₀ ≠ 0. Tachyons do give
us the starting point that appears in Landau theory of solidification (note
that here it is not a minimum consideration). The second key ingredient that
we need for Landau’s theory of solidification, in order to obtain not only
the breakdown of translational invariance, but also of rotational invariance,
is the presence of a cubic term. We know from the OPE (operator-product-
expansion) that three tachyons do couple together (see Fig. 9). In particular,
the OPE between two tachyons does contain a third tachyon. So we have in
such a theory a T³ term. One indeed has the necessary ingredients to try
to follow if tachyons could lead to the spontaneous breaking of rotational and
translational invariance in String Theory, and maybe also to compactification.
In order to be more concrete, we followed the ideas of [7, 8] and tried to
handle in a reliable fashion almost marginal operators. Consider a tachyon
which is not an exact (1, 1) operator, but one which has q₀² = 2 − ε. We will
also look at the subset of the full string background, a subset which contains
a c = 2 sector. We will not deal here with the question of how the total
central charge remains at the appropriate value, which is zero, and how to
dress operators.
As an illustration, consider the subset of the backgrounds which are strings
moving in flat space, where the piece of the Lagrangian on which we focus is

$$L = \partial X^1 \bar{\partial} X^1 + \partial X^2 \bar{\partial} X^2 + T(X^1, X^2) . \tag{22}$$

Fig. 9. Tachyonic cubic vertex

From Landau's theory of solidification we know that, because the system
has support on a q₀ ≠ 0, and because the free energy of the system contains
a cubic coupling, we can try and build the triplets, which again actually
correspond to six vectors, so that they get a support in an appropriate way,
i.e. such that they break translational and rotational invariance.
The 60° angle, discussed in the solidification case, manifests itself in a
suggested tachyon configuration:

$$T(X^1, X^2) = \sum_{a=1}^{3} T_a \cos\!\left( \sum_{i=1}^{2} k_i^a X^i \right) , \tag{23}$$

where the three momenta k₁, k₂ and k₃ are the following:

$$k_1 = k\,(1, 0) , \qquad k_2 = k \left( -\tfrac{1}{2}, \tfrac{\sqrt{3}}{2} \right) , \qquad k_3 = k \left( -\tfrac{1}{2}, -\tfrac{\sqrt{3}}{2} \right) . \tag{24}$$

All of them have k² = 2 − ε, and the structure is very similar to that of
the SU(3) root lattice (see Fig. 10); as before, for every kᵢ there is also the
corresponding −kᵢ contribution.
One can simplify the tachyon potential by taking the ansatz for the ampli-
tudes Ta = T . The Lagrangian one needs to solve is the one given in (22), and
actually one can show that the beta function of the tachyon alone vanishes to
order ε. So (23) is a solution of the approximate tachyon equations of motion.
This means that had it been up to the tachyon alone one would have obtained
the lattice, perhaps some honeycomb or triangular lattice, which would break
both translational and rotational invariance. However, this system contains
also gravity so one needs to see what is the inﬂuence of the formation of such
a lattice on gravity. As shown in [4], the beta function for the graviton, β^G_{μν},
vanishes (at leading order in α′) if

$$\beta^{G}_{\mu\nu} = -R_{\mu\nu} + \nabla_\mu T\, \nabla_\nu T = -R_{\mu\nu} + \tfrac{3}{2}\, \varepsilon^2\, \delta_{\mu\nu} = 0 . \tag{25}$$

For D = 2, due to the Liouville theorem R_{μν} can be written as R_{μν} = a δ_{μν},
so actually one can solve the equation by forming a two-dimensional sphere.

Fig. 10. SU(3) roots

This is actually a highlight of a model for compactification. We started
by having just a tachyon. The tachyon would have produced the lattice on
its own, but because of the presence of the gravity, the lattice of tachyons
actually causes the compactification of space to a sphere.
However, it turns out, and details are presented in [4], that unfortunately
this result is not obtained in a desired reliable approximation. The main prob-
lem is that, in order to do reliable perturbation theory, we need to perform a
plane-wave expansion, with the wave lengths representing a nearly marginal
operator. However, when the sphere is formed, the topology changes, and
the change of topology means that one should now expand the fields in
terms of spherical harmonics Yl,m . This topological obstruction takes away
the reliability of our calculation. Some defects may form in order to resolve
this topological problem, and one conjecture we had at that time was that
actually parafermions, which are defects, form to resolve the tension. A more
complex form of compactification emerges.
Once again, recall that actually the system, when fully considered, has
to be coupled to the dilaton in order to maintain the total central charge.
According to the Zamolodchikov theorem [7], once the system starts to flow,
the central charge decreases from 2 and this on its own breaks the balance. In
a sense, in the case of bulk tachyons we were tantalizingly close to obtaining an
explicit dynamical mechanism for compactification. However, due to topolog-
ical obstructions, what was a solution for the beta function locally in space
cannot be a global solution without taking into account other effects. We will
return to the breaking of translational invariance in the different context of
the open string tachyon.

Liquid Crystals

The tachyon is a scalar order parameter; String Theory has additional fields
which carry indices. In particular, one might think that if one looks for a
similarity to our universe, maybe one should consider the phase of liquid
crystals. Such systems are translationally invariant in some directions but not
in others (see Fig. 11). We will now give examples of that.
There are various types of liquid crystals and one can ask what is the
Landau–Ginsburg theory of them. Actually, one can also ask about vector
potential systems which are described, as gauge fields are, by vector-like order
parameters. Such systems include detergents which possess a hydrophobic and
a hydrophilic pole, and play a crucial role in cleaning our garments. One can
try to extract from p(r) the various invariants one wants to use in order to
describe this system, such as div p, curl p, s_{αβ} = ∂_α p_β + ∂_β p_α. It turns out that
one can write down a Landau–Ginsburg theory for detergents, which explains
many of their very fascinating properties.
Considering the case of liquid crystals, these can also be described by
choosing for example particular spherical harmonic functions, and using them
as order parameters.

Fig. 11. Various phases of liquid crystals. These systems exhibit asymmetrical
breaking of translational and rotational invariance

For illustrative purposes, we give the dependence of the density φ on the
angles and on the coordinates (the index structure of φ has been omitted):

$$\phi = \sum_{i} \mu_i\, Y_{22}(\theta_i, \phi_i)\, e^{i \mathbf{k}_i \cdot \mathbf{r}_i} + \text{h.c.} \tag{26}$$

By assuming the ansatz μᵢ = μ, the effective Landau–Ginsburg free energy is
given below

$$F \sim (\alpha_0 + d k + c k^2)\, \mu^2 - \beta \mu^3 + r \mu^4 , \tag{27}$$

from which one can extract the properties of the nematic, smectic A and
smectic C phases, and many other exciting things for which we refer to the
literature [6].

Boundary Perturbations

Next I discuss an example where a breakdown of spatial translational symmetry
actually clearly occurs. As mentioned above one can formulate an intuitive
theorem in the bulk; the theorem states that under the renormalization group
flow, the value of the Virasoro central charge c decreases from its UV value
to a smaller IR value. This is due to the integrating out of the degrees of
freedom and applies to the unitary sectors of String Theory. In String Theory
with its ghosts the total central charge vanishes. One can imagine mechanisms
by which the central charge of the ghosts increases [9], but basically one needs
to couple the two systems maintaining a total vanishing central charge. This
can be done for example with the help of a linear dilaton, and leads to very
interesting questions and results. Generically, the matter central charge will
decrease to zero leaving one with just a c = 0 topological theory, but there
are also other possibilities. The central charge is related to the anomaly which
exists in the bulk. On the other hand, when one considers the boundary the-
ory, there are no gravitational anomalies in it. Thus in that case one can
consider tachyonic open string theory perturbations. In the example given by
the action below

$$S = \int_{\Sigma} L_{\text{CFT}} + \int_{\partial\Sigma} g\, O_{\text{Rel.}} , \tag{28}$$
the bulk theory is defined on the surface Σ, and on its boundary ∂Σ one adds
a relevant operator ORel. . There is a boundary renormalization group flow
which does not change the bulk central charge, and therefore does not lead to
all the problems associated with tachyons in the bulk.
One can associate a term in the boundary which measures the effective
number of degrees of freedom, and this has been done by various authors
[10, 11].
It can be proved, moreover, that one can define such a function whose
value also decreases when the theory flows on the boundary, all this without
requiring an adjustment of the total central charge. What happens for example
is that the theory flows from Dirichlet (D) to Neumann (N) boundary conditions,
so that, in other words, branes may dissolve or may be created under such
a flow. In Fig. 12 we give an example of a very simple compactification in
which one can identify what are the stable configurations, describing when
the system chooses to obey Dirichlet and when the system chooses to obey
Neumann boundary conditions [9].

Fig. 12. Map of the preferred boundary conditions in the c = 1 moduli space. N
stands for Neumann and D for Dirichlet boundary conditions [9]

This can be used even further if one changes the relevant operator added
on the boundary into a Sine–Gordon one. In that case one can actually have
situations where one breaks translational invariance in space–time by a D25
brane, for example, dissolving into a lattice of D24 branes [12] (Fig. 13).
Again, such a situation will also lead to a reduction of the original amount
of supersymmetry. Thus, the idea of spontaneous breaking of spatial translational
invariance is demonstrably realized in String Theory by the presence
of open string tachyons.

3 Spontaneous Breaking of Time-Translational Invariance and of Supersymmetry
Next I will discuss a somewhat diﬀerent mechanism which may allow the
possibility of a spontaneous breaking of time-translational invariance. For that
it is useful to consider conformal and superconformal quantum mechanics.
One way to motivate the interest in such systems is to recall some basic facts
concerning the validity of a perturbative expansion.
Consider the Hamiltonian,

$$H = \frac{p_q^2}{2m} + \frac{1}{2}\, g\, q^n . \tag{29}$$

Fig. 13. A lattice of D24 branes is formed from a D25 brane in the presence of a
boundary tachyon

One may wonder if it is possible to make a meaningful perturbative expansion
in terms of small or large g or small or large m. To answer this one needs
to find out if one can remove the g, m dependence from the operators, and
relegate it to the total energy scale. This type of rescaling is used for discussing
the harmonic oscillator. One attempts to define a new set of dimensionless
canonical variables p_x, x that preserve the commutation relations,

$$[p_q, q] = [p_x, x] , \tag{30}$$

and

$$H = h(m, g)\, \frac{1}{2} \left( p_x^2 + x^n \right) . \tag{31}$$

The following decomposition

$$q = f(m, g)\, x , \qquad p_q = \frac{1}{f(m, g)}\, p_x \tag{32}$$

gives

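$$2H = \frac{p_x^2}{m f(m, g)^2} + g f(m, g)^n x^n , \tag{33}$$

and so one may choose

$$g f(m, g)^n = \frac{1}{m f(m, g)^2} , \qquad \text{i.e.} \qquad f(m, g) = \left( \frac{1}{g m} \right)^{\frac{1}{n+2}} . \tag{34}$$

The Hamiltonian becomes

$$H = g^{1 - \frac{n}{n+2}}\, m^{-\frac{n}{n+2}}\, \frac{1}{2} \left( p_x^2 + x^n \right) . \tag{35}$$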

The role of g and m is indeed just to determine the overall energy scale.
They may not serve as meaningful perturbation parameters. This does not
apply to the special case of n = −2, the case of conformal quantum mechanics,
where g can be a real perturbative parameter.
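
The rescaling argument can be checked numerically. The sketch below assumes positive m and g and a real exponent n; the function name and return convention are chosen here for illustration. It computes the factor f that equalizes the two coefficients in (33) and returns the resulting overall scale.

```python
def rescaling(m, g, n):
    """Return (f, kinetic, potential) for q = f x, p_q = p_x / f.

    kinetic and potential are the coefficients of p_x^2 and x^n in 2H, eq. (33).
    They coincide for f = (g m)^(-1/(n+2)), which is only defined for n != -2.
    """
    if n == -2:
        raise ValueError("n = -2: both terms scale alike, so g*m cannot be absorbed")
    f = (g * m) ** (-1.0 / (n + 2))
    kinetic = 1.0 / (m * f ** 2)
    potential = g * f ** n
    return f, kinetic, potential


# Harmonic oscillator (n = 2): the common coefficient is sqrt(g/m), the frequency.
f, kinetic, potential = rescaling(m=2.0, g=8.0, n=2)
print(f, kinetic, potential)
```

For n = −2 the two terms in (33) scale in the same way under q → f x, so the product gm survives any rescaling; this is the sense in which g remains a genuine parameter in conformal quantum mechanics.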

3.1 Conformal Quantum Mechanics: A Stable System with No Ground State Breaks Time-Translational Invariance

Consider the Hamiltonian

$$H = \frac{1}{2} \left( p^2 + g x^{-2} \right) \tag{36}$$

for a positive value of g [13]. H is part of the following algebra:

$$[H, D] = iH , \qquad [K, D] = -iK , \qquad [H, K] = 2iD . \tag{37}$$

It is an SO(2,1) algebra, one representation of which is

$$D = -\frac{1}{4} (xp + px) , \qquad K = \frac{1}{2} x^2 , \tag{38}$$

with H given above. The Casimir is given by

$$\frac{1}{2} (HK + KH) - D^2 = \frac{g}{4} - \frac{3}{16} . \tag{39}$$
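
The algebra and the Casimir can be verified by letting the operators act on monomials. The sketch below assumes p = −i d/dx (with ħ = 1) and represents a function as a dictionary mapping an exponent s to the coefficient of x^s; this representation and all helper names are choices made here for illustration.

```python
def H(f, g):
    """H = (p^2 + g/x^2)/2 with p = -i d/dx: H x^s = (g - s(s-1))/2 * x^(s-2)."""
    return {s - 2: c * (g - s * (s - 1)) / 2 for s, c in f.items()}


def K(f):
    """K = x^2/2: K x^s = x^(s+2)/2."""
    return {s + 2: c / 2 for s, c in f.items()}


def D(f):
    """D = -(xp + px)/4: D x^s = i(2s+1)/4 * x^s."""
    return {s: c * 1j * (2 * s + 1) / 4 for s, c in f.items()}


def combine(a, f1, b, f2):
    """Linear combination a*f1 + b*f2 of two monomial expansions."""
    keys = set(f1) | set(f2)
    return {k: a * f1.get(k, 0) + b * f2.get(k, 0) for k in keys}


def is_zero(f, tol=1e-12):
    return all(abs(c) < tol for c in f.values())


g = 0.7
f = {3: 1.0, 0: 2.0 - 1.0j, -1: 0.5}  # arbitrary test function

HD = combine(1, H(D(f), g), -1, D(H(f, g)))   # [H, D] f
KD = combine(1, K(D(f)), -1, D(K(f)))         # [K, D] f
HK = combine(1, H(K(f), g), -1, K(H(f, g)))   # [H, K] f
anti = combine(0.5, H(K(f), g), 0.5, K(H(f, g)))   # (HK + KH)/2 acting on f
casimir = combine(1, anti, -1, D(D(f)))

checks = {
    "[H,D] = iH": is_zero(combine(1, HD, -1j, H(f, g))),
    "[K,D] = -iK": is_zero(combine(1, KD, 1j, K(f))),
    "[H,K] = 2iD": is_zero(combine(1, HK, -2j, D(f))),
    "Casimir = g/4 - 3/16": is_zero(combine(1, casimir, -(g / 4 - 3 / 16), f)),
}
print(checks)
```

Since every relation holds monomial by monomial, the check does not depend on the particular test function or on the value of g.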
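In the Lagrangian formalism the system is described by

$$L = \frac{1}{2} \left( \dot{x}^2 - \frac{g}{x^2} \right) , \qquad S = \int dt\, L . \tag{40}$$

Symmetries of the action S, and not of the Lagrangian L alone, are given by

$$t' = \frac{at + b}{ct + d} , \qquad x'(t') = \frac{1}{ct + d}\, x(t) , \tag{41}$$

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} , \qquad \det A = ad - bc = 1 . \tag{42}$$

H acts as translation

$$A_T = \begin{pmatrix} 1 & \delta \\ 0 & 1 \end{pmatrix} , \qquad t' = t + \delta . \tag{43}$$

D acts as dilation

$$A_D = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha^{-1} \end{pmatrix} , \qquad t' = \alpha^2 t . \tag{44}$$

K acts as a special conformal transformation

$$A_K = \begin{pmatrix} 1 & 0 \\ \delta & 1 \end{pmatrix} , \qquad t' = \frac{t}{\delta t + 1} . \tag{45}$$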
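
With the convention of (41) and (42), where the matrix entries (a, b; c, d) define t' = (at + b)/(ct + d), the three one-parameter subgroups can be written down and composed explicitly. The following is a minimal sketch; the function names are illustrative.

```python
def mobius(A, t):
    """Act on t with A = ((a, b), (c, d)): t -> (a t + b) / (c t + d)."""
    (a, b), (c, d) = A
    return (a * t + b) / (c * t + d)


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def det(A):
    (a, b), (c, d) = A
    return a * d - b * c


def translation(delta):
    return ((1.0, delta), (0.0, 1.0))        # t -> t + delta


def dilation(alpha):
    return ((alpha, 0.0), (0.0, 1.0 / alpha))  # t -> alpha^2 t


def special_conformal(delta):
    return ((1.0, 0.0), (delta, 1.0))        # t -> t / (delta t + 1)


t = 0.7
composed = matmul(translation(0.3), special_conformal(0.5))
lhs = mobius(composed, t)
rhs = mobius(translation(0.3), mobius(special_conformal(0.5), t))
print(lhs, rhs)  # matrix multiplication reproduces composition of the maps
```

Each of the three matrices has unit determinant, and composing transformations corresponds to multiplying matrices, which is the group structure behind the algebra (37).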

The spectrum of the Hamiltonian (36) is the open set (0, ∞); the spectrum
is therefore continuous and bounded from below. The wave functions
are given by

$$\psi_E(x) = \sqrt{x}\, J_{\sqrt{g + \frac{1}{4}}}\!\left( \sqrt{2E}\, x \right) , \qquad E \neq 0 . \tag{46}$$

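The zero-energy state is given by φ(x) = x^α:

$$H\phi(x) = \left( -\frac{d^2}{dx^2} + \frac{g}{x^2} \right) x^\alpha = 0 . \tag{47}$$

This implies

$$g = \alpha(\alpha - 1) , \tag{48}$$

and solving this equation gives

$$\alpha = \frac{1}{2} \pm \frac{\sqrt{1 + 4g}}{2} . \tag{49}$$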
This gives rise to two independent solutions, and by completeness these are all
the solutions. The case α+ > 0 does not lead to a normalizable solution since
the function diverges at infinity. The case α− < 0 is not normalizable either,
since the function diverges at the origin (a result of the scale symmetry).
Thus, there is no normalizable (not even plane-wave normalizable) E = 0
solution (Fig. 14)!
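
The absence of a normalizable zero-energy state can be made explicit. Acting with −d²/dx² + g/x² on x^α gives (g − α(α − 1)) x^(α−2), and a pure power law x^α is square integrable near the origin only if 2α > −1 and near infinity only if 2α < −1. The sketch below, with illustrative function names, checks both facts for a few positive values of g.

```python
import math


def zero_mode_exponents(g):
    """Roots of alpha * (alpha - 1) = g for g > 0."""
    root = math.sqrt(1 + 4 * g)
    return (1 + root) / 2, (1 - root) / 2


def residual(alpha, g):
    """Coefficient of x^(alpha - 2) in (-d^2/dx^2 + g/x^2) x^alpha."""
    return g - alpha * (alpha - 1)


def square_integrable(alpha):
    """(near the origin, near infinity) convergence of the integral of x^(2 alpha)."""
    return 2 * alpha > -1, 2 * alpha < -1


for g in (0.5, 2.0, 6.0):
    for alpha in zero_mode_exponents(g):
        at_origin, at_infinity = square_integrable(alpha)
        print(g, alpha, residual(alpha, g), at_origin and at_infinity)
```

The two convergence conditions are mutually exclusive, so neither root yields a state that is normalizable on the whole half line.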
Most of the analysis in field theory proceeds by identifying a ground state
and the fluctuations around it. How do we deal with a system in the absence
of a ground state?
One possibility is to accept this as a fact of life. Perhaps it is possible to
view this as similar to cosmological models that also lack a ground state, such
as those with Quintessence. In field theory such systems have no finite-energy
states in the spectrum at all. Only time-dependent states are allowed. In the
presence of an appropriate cutoff, and in quantum mechanics, it is only the
potential lowest energy state which is disallowed.

Fig. 14. There is no normalizable ground state for this potential

Another possibility is to define a new evolution operator that does have a
ground state
\[ G = uH + vD + wK . \tag{50} \]
This operator has a ground state if v² − 4uw < 0. Any choice explicitly
breaks scale invariance. Take for example
\[ G = \frac{1}{2}\left(\frac{1}{a} K + aH\right) \equiv R , \tag{51} \]
where a has the dimension of a length. The eigenvalues of R are
\[ r_n = r_0 + n , \qquad r_0 = \frac{1}{2}\left(1 + \sqrt{g + \frac{1}{4}}\right) . \tag{52} \]
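The value of r_0 can be checked directly. The following is a sketch under the assumptions ħ = 1 and p = −i d/dx, with H = (p² + g/x²)/2 and K = x²/2; the trial function ψ_0 is introduced here for illustration. Writing
\[ R = \frac{a}{4}\left(-\frac{d^{2}}{dx^{2}} + \frac{g}{x^{2}} + \frac{x^{2}}{a^{2}}\right) , \qquad \psi_0(x) = x^{\beta} e^{-x^{2}/(2a)} , \qquad \beta = \frac{1}{2} + \sqrt{g + \frac{1}{4}} , \]
one finds, using β(β − 1) = g,
\[ \left(-\frac{d^{2}}{dx^{2}} + \frac{g}{x^{2}} + \frac{x^{2}}{a^{2}}\right)\psi_0 = \left[\frac{g - \beta(\beta - 1)}{x^{2}} + \frac{2\beta + 1}{a}\right]\psi_0 = \frac{2\beta + 1}{a}\,\psi_0 , \]
\[ R\,\psi_0 = \frac{2\beta + 1}{4}\,\psi_0 = \frac{1}{2}\left(1 + \sqrt{g + \frac{1}{4}}\right)\psi_0 . \]
Unlike a pure power law, ψ_0 is normalizable: the Gaussian factor, whose width is set by the scale a, cuts off the integral at large x.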
This is a breaking of scale invariance by a dictum and not by the dynamics
of the system. Nevertheless, it is very interesting to search for a physical
interpretation of this. Surprisingly, this question arises in the context of black-
hole physics. Consider a particle of mass m and charge q falling into a charged
black hole. The black hole is BPS, meaning that its mass M and charge Q
are related, in the appropriate units, by M = Q.
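The black hole metric and vector potential are given by
\[ ds^{2} = -\left(1 + \frac{M}{r}\right)^{-2} dt^{2} + \left(1 + \frac{M}{r}\right)^{2} \left(dr^{2} + r^{2}\, d\Omega^{2}\right) , \qquad A_t = \frac{r}{M} . \tag{53} \]

Now consider the near-horizon limit, i.e. r ≪ M, which we will reach by
taking M → ∞ and keeping r fixed. This produces an AdS2 × S² geometry
\[ ds^{2} = -\left(\frac{r}{M}\right)^{2} dt^{2} + \left(\frac{M}{r}\right)^{2} dr^{2} + M^{2}\, d\Omega^{2} . \tag{54} \]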
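The limit can be followed term by term (a short illustrative expansion in the small ratio r/M):
\[ \left(1 + \frac{M}{r}\right)^{-2} = \left(\frac{r}{M}\right)^{2}\left(1 + \frac{r}{M}\right)^{-2} \longrightarrow \left(\frac{r}{M}\right)^{2} , \]
\[ \left(1 + \frac{M}{r}\right)^{2}\left(dr^{2} + r^{2}\, d\Omega^{2}\right) = \left(\frac{M}{r}\right)^{2}\left(1 + \frac{r}{M}\right)^{2}\left(dr^{2} + r^{2}\, d\Omega^{2}\right) \longrightarrow \left(\frac{M}{r}\right)^{2} dr^{2} + M^{2}\, d\Omega^{2} . \]
The radius of the two-sphere becomes the constant M, which is what produces the S² factor, while the (t, r) part is AdS2 in Poincaré-type coordinates.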

We also wish to keep M²(m − q) fixed as we scale M. This means we must
scale (m − q) → 0, that is, the particle itself becomes BPS in the limit.
The Hamiltonian for this infalling particle, in this limit, is given by our
old friend:
\[ H = \frac{p_r^{2}}{2m} + \frac{g}{2r^{2}} , \qquad g = 8M^{2}(m - q) + \frac{4l(l + 1)}{M} . \tag{55} \]
For l = 0 we have g > 0, and there is no ground state. This is associated with
the coordinate singularity at the horizon. The change in evolution operator is
now associated with a change of time coordinate, one for which the world line
of a static particle passes through the black-hole horizon, instead of remaining
in the exterior of the space–time. In any case, the consequence of removing
the potential lowest energy state of the system from the spectrum can be
described as a breaking of time-translational invariance.

3.2 Superconformal Quantum Mechanics: A Stable System with No Ground State Also Breaks Supersymmetry

The bosonic conformal mechanical system had no ground state. The absence
of an E = 0 ground state in the supersymmetric context leads to the breaking
of supersymmetry. This breaking has a different flavor from that which was
discussed for the spatial translations. We next examine the supersymmetric
version of conformal quantum mechanics [1, 14], to see if indeed supersymmetry
is broken. The superpotential is chosen to be
\[ W(x) = \tfrac{1}{2}\, g \log x^{2} , \tag{56} \]
yielding the Hamiltonian:
\[ H = \frac{1}{2}\left[p^{2} + \left(\frac{dW}{dx}\right)^{2} \mathbb{1} - \frac{d^{2}W}{dx^{2}}\, \sigma_3\right] . \tag{57} \]

Representing ψ by ½σ− and ψ⁺ by ½σ+ gives the supercharges:
\[ Q = \psi^{+}\left(-ip + \frac{dW}{dx}\right) , \qquad Q^{+} = \psi\left(ip + \frac{dW}{dx}\right) . \tag{58} \]
One now has a larger algebra, the superconformal algebra,
\[ \{Q, Q^{+}\} = 2H , \qquad \{Q, S^{+}\} = g - B + 2iD , \]
\[ \{S, S^{+}\} = 2K , \qquad \{Q^{+}, S\} = g - B - 2iD . \tag{59} \]

A realization is
\[ B = \sigma_3 , \qquad S = \psi^{+} x , \qquad S^{+} = \psi x . \tag{60} \]
The zero-energy solutions are
\[ \exp(\pm W(x)) = x^{\pm g} , \tag{61} \]
and neither solution is normalizable.

The Hamiltonian H factorizes
\[ 2H = \begin{pmatrix} p^{2} + \dfrac{g(g+1)}{x^{2}} & 0 \\ 0 & p^{2} + \dfrac{g(g-1)}{x^{2}} \end{pmatrix} , \tag{62} \]
and we may solve for the full spectrum:
\[ \psi_E(x) = x^{1/2}\, J_{\sqrt{\nu}}\!\left(x\sqrt{2E}\right) , \qquad E \neq 0 , \tag{63} \]
where ν = g(g − 1) + 1/4 for NF = 0 and ν = g(g + 1) + 1/4 for NF = 1.
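The block structure of (62) follows from the superpotential in one step. As a sketch, reading the Hamiltonian (57) as H = ½[p² + (dW/dx)² − σ3 d²W/dx²] and taking x > 0, so that W = g log x:
\[ \frac{dW}{dx} = \frac{g}{x} , \qquad \frac{d^{2}W}{dx^{2}} = -\frac{g}{x^{2}} , \qquad \left(\frac{dW}{dx}\right)^{2} - \sigma_3 \frac{d^{2}W}{dx^{2}} = \frac{g^{2} + \sigma_3\, g}{x^{2}} = \frac{g(g \pm 1)}{x^{2}} \quad (\sigma_3 = \pm 1) . \]
Each block is therefore a bosonic conformal Hamiltonian of the form (36) with coupling g(g ± 1), and the Bessel index of (46) becomes a perfect square:
\[ \nu = g(g \mp 1) + \frac{1}{4} = \left(g \mp \frac{1}{2}\right)^{2} , \qquad \sqrt{\nu} = \left|g \mp \frac{1}{2}\right| . \]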

The spectrum is continuous and there is no normalizable zero-energy state.
One must interpret the absence of a normalizable ground state. It is also
possible to define a new operator which has a normalizable ground state. By
inspection the operator (51) can be used, provided one makes the following
identifications:
\[ N_F = 1 : \quad g_B = g_{\rm susy}(g_{\rm susy} + 1) , \]
\[ N_F = 0 : \quad g_B = g_{\rm susy}(g_{\rm susy} - 1) . \tag{64} \]

Thus the spectrum differs between the NF = 1 and NF = 0 sectors, and
supersymmetry would be broken. One needs to define a whole new set of
operators:
\[ M = Q - S , \qquad M^{+} = Q^{+} - S^{+} , \]
\[ N = Q^{+} + S^{+} , \qquad N^{+} = Q + S , \tag{65} \]

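which produces the algebra:
\[ \tfrac{1}{4}\{M, M^{+}\} = R + \tfrac{1}{2}B - \tfrac{1}{2}g \equiv T_1 , \]
\[ \tfrac{1}{4}\{N, N^{+}\} = R + \tfrac{1}{2}B + \tfrac{1}{2}g \equiv T_2 , \]
\[ \tfrac{1}{4}\{M, N\} = L_{-} , \qquad \tfrac{1}{4}\{M^{+}, N^{+}\} = L_{+} , \]
\[ L_{\pm} = -\tfrac{1}{2}(H - K \mp 2iD) . \tag{66} \]
The operators T1, T2, H have a doublet spectrum. “Ground states” are given by
\[ T_1|0\rangle = 0 ; \qquad T_2|0\rangle = 0 ; \qquad H|0\rangle = 0 . \tag{67} \]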

In this setup one can also exhibit [1] how in the presence of a breaking
of a space–time symmetry, global N = 2 can be broken only to N = 1. A
physical context arises when one considers a supersymmetric particle falling
into a black hole [15, 16]. This is the supersymmetric analogue of the situation
already discussed.
One should mention again that there is a dictum in the way one has
broken scale/conformal invariance in the problem. It is amusing to mention
that if one takes the dictated ground state, and decomposes it in terms of the
energy eigenstates, then one usually gets that the new ground state looks like
a thermal distribution of the old ground states. This looks very attractive and
it is related to black holes, which as mentioned above do come up.
Another example where such breakdown of time-translational invariance
may occur is the Liouville model. Here too there is no normalizable ground state.
For works on the possible breakdown of translational invariance in the two-
dimensional Liouville model see [17, 18].
Beyond d = 2, we can mention that in four dimensions in N = 1 supersym-
metric theories, where the number of flavors NF is smaller than the number
of colors, 0 < NF < NC , one also gets [19, 20] a situation where the spectrum
is bounded from below, but there is no ground state. The spectrum is open,
and actually in the presence of a cutoff such systems have no finite-energy
states at all, which is very interesting as far as Cosmology is concerned.

4 Spontaneous Breaking of Conformal Invariance

Fubini [21] also suggested discussing such situations in a general number of
dimensions. He researched it in a scientific environment which did not yet
fully realize that interacting finite theories might exist in various numbers
of dimensions. Therefore, much of his analysis was of a classical nature. He
emphasized the conformal features of the system, and we are going to discuss
the breakdown of conformal invariance. The discussion of the breakdown of
time-translational invariance brought us to conformal theories and now we are
discussing also the breakdown of the conformal invariance.
If one considers a theory with only one scalar field, a general classically
conformally invariant theory is given by the following Lagrangian
\[ L = \frac{1}{2}\,\partial_\mu \varphi\, \partial^\mu \varphi - g\, \varphi^{\frac{2d}{d-2}} . \tag{68} \]
The symmetry of the system is the bosonic O(d, 2) symmetry, and the
generators are Mμν, Pμ of the Poincaré group, the special conformal
transformation generator Kμ, and the dilatation D. The dictum of Fubini in this
case is that the ground state is not translationally invariant, and this is not
accompanied by any dynamical calculation. The vacuum expectation value
⟨φ(x)⟩ is x dependent, and actually it looks very much like an instanton
\[ \langle \varphi(x) \rangle = b \left(\frac{a^{2} + x^{2}}{2a}\right)^{-\frac{d-2}{2}} , \tag{69} \]
which is a solution of the equation of motion
\[ \partial^{2}\varphi(x) - 2g\, \frac{d}{d-2}\, \varphi^{\frac{d+2}{d-2}}(x) = 0 . \tag{70} \]
Fig. 15. The sign of the quartic coupling g determines the symmetry breaking
patterns of the symmetry group O(d, 2)

By choosing this to be the vacuum (again I emphasize, this is by dictum),
one breaks down the O(d, 2) symmetry (as in Fig. 15) in the following fashion:
if the coupling g of the scalar self-interaction is positive, then the theory breaks
down to O(d − 1, 2) and the resulting symmetries are Mμν, Rμ. If g < 0, then
the symmetry breaks to O(d − 1), generated by Mμν, Sμ, where
\[ S_\mu = \frac{1}{2}\left(a P_\mu - \frac{1}{a} K_\mu\right) . \tag{71} \]
If g = 0, one remains with Poincaré invariance (Fig. 15). In the de Sitter
example, which occurs for g > 0, one can show again that there are signatures
of temperature. A question which at the time seemed interesting was: Does
a spontaneous breaking of conformal invariance require also the breakdown
of translational invariance? Examples were since found where this is not the
case. Counter-examples to the idea that the breaking of conformal invariance
must drive a breaking of supersymmetry were discovered, and we will discuss
in more detail some such examples. One can break scale invariance without
breaking rotational or translational invariance. We also mention briefly that
conformal invariance and scale invariance are not always equivalent, and in
a set of works (see, e.g., [22]) it has been shown that scale invariance leads,
under certain conditions, to conformal invariance.
For instance, this occurs in the case where the spectrum of the theory
is discrete, such as in a two-dimensional sigma model description in which
the target space is compact. But for non-compact target spaces one can find
counter-examples [23] in which scale invariance does not lead to conformal
invariance. In recent years it has been fully realized that theories which are
quantum mechanically scale invariant and finite may exist in d = 2, 3, 4, 5, 6
dimensions. Such theories can exhibit spontaneous breaking, e.g., the d = 4,
N = 4 super Yang–Mills with SU (N ) gauge group which is characterized by
the following spectrum

(Aaμ , λa , φa + ia ).
The theory is parameterized by the complex parameter ig + θ, where g
is the coupling constant and θ is the angle. Such a theory has flat directions
which allow either a phase in which ⟨φ⟩ vanishes and the theory is realized in
a conformal manner, or a phase in which ⟨φ⟩ ≠ 0 along flat directions. This
is the Coulomb phase, in which the gauge group SU(N) may be reduced all
the way to U(1)^{N−1}, where N − 1 is the rank of the gauge group. This is the
maximum possible breaking of the gauge group when the fields are in the adjoint
representation. In such a case, scale invariance is broken spontaneously and
the vacuum energy remains zero, and there is no breakdown of either
translational invariance or supersymmetry. Such a theory will have a Goldstone
boson, associated with the spontaneous breaking of scale invariance, which is
called the dilaton. This is a true dilaton worthy of its name. It is interesting
to note that in such a system the vacuum energy is not influenced by the value
of ⟨φ⟩, and it vanishes in all the phases.

5 O(N) Vector Models in d = 3: Spontaneous Breaking of Scale Invariance and the Vacuum Energy

The next example that we have is related to the spontaneous breaking of scale
invariance in a three-dimensional bosonic theory. Such a theory describes the
mixing of ³He and ⁴He (see [24] and references therein).
The most general Lagrangian describing such a system is
\[ L = \frac{1}{2}\,\partial_\mu \varphi\, \partial^\mu \varphi - \frac{1}{2}\lambda_2 (\varphi)^{2} + \frac{\lambda_4}{4N}(\varphi)^{4} + \frac{\lambda_6}{N^{2}}(\varphi)^{6} , \tag{72} \]
and it can be treated at d = 3 − ε. The system has two order parameters,
⟨(φ)²⟩ and ⟨φ⟩.
In a classical analysis performed for d = 3 − ε, when the sign of λ2 changes,
⟨φ⟩ is produced. However, ⟨(φ)²⟩ ≠ 0 even for λ2 > 0, which is exemplified
by the diagram shown in Fig. 16.

Fig. 16. The phase diagram of a d = 3 − ε conformal theory; in three dimensions
the CP and CEP points coincide to produce a flat direction

When one goes to three dimensions, the point denoted by CP, which is a
critical point, and the point CEP, which is the critical end point, actually
meet and lead to a very interesting structure. Going directly to d = 3, one
can write down the O(N) vector model
\[ L = \frac{1}{2}\,\partial_\mu \varphi\, \partial^\mu \varphi - \frac{1}{2}\lambda_2 (\varphi)^{2} + \frac{\lambda_4}{4!\,N}(\varphi)^{4} + \frac{\lambda_6}{6!\,N^{2}}(\varphi)^{6} . \tag{73} \]
It should be emphasized that everything said depends on the very spe-
cific manner of taking the limit. One first keeps the cutoff Λ fixed and takes
N → ∞, by performing a functional integral or selecting a subset of dia-
grams, and only then does one remove the cutoff, sending it to infinity, set-
ting the renormalized quadratic and quartic couplings to zero. Such a system
turns out to be not only classically conformally invariant, but also quantum
mechanically, having a vanishing beta function [25]. We next elaborate on
such systems.
Let us now review some more known facts about the three-dimensional
theory once a classically marginal operator, (φ²)³, is added [25]. For any finite
value of N, the coupling g6 of this operator is infrared-free quantum
mechanically, as the marginal operator gets a positive anomalous dimension already
at one loop. This implies that the theory is only well defined for zero value of
the coupling of this operator. In the presence of a cutoff, interacting particles
have mass of the order of the cutoff. At its tri-critical point the O(N) model
in three dimensions is described by the Lagrangian
\[ L = \frac{1}{2}\,\partial_\mu \varphi\, \partial^\mu \varphi + \frac{1}{6N^{2}}\, g_6\, (\varphi^{2})^{3} , \tag{74} \]
where the fields φ are in the vector representation of O(N).
In the limit N → ∞ [25]
\[ \beta_{g_6} = 0 ; \tag{75} \]
1/N corrections break conformality. In the large N limit, g6 is a modulus. It
turns out there is no spontaneous breaking of the O(N) symmetry, and it is
instructive to write the effective potential in terms of an O(N) invariant field,
\[ \sigma = \varphi^{2} . \tag{76} \]
The effective potential is [24]
\[ V(\sigma) = f(g_6)\, |\sigma|^{3} , \tag{77} \]
where
\[ f(g_6) = g_c - g_6 \tag{78} \]
with
\[ g_c = (4\pi)^{2} . \tag{79} \]
The system has various phases. For values of g6 smaller than gc , i.e. when
f (g6 ) is positive, the system consists of N massless non-interacting φ particles.
These particles do not interact in the infinite N limit; thus, correlation func-
tions do not depend on g6 . For the special value g6 = gc , f (g6 ) vanishes and a
flat direction in σ opens up: The expectation value of σ becomes a modulus.
For a zero value of this expectation value the theory continues to consist of
N massless φ fields. For any non-zero value of the expectation value the sys-
tem has N massive φ particles. All have the same mass due to the unbroken
O(N ) symmetry. Scale invariance is broken spontaneously though the vacuum
energy still vanishes. The Goldstone boson associated with the spontaneous
breaking of scale invariance, the dilaton, is massless and identified as the O(N)
singlet field δσ ≡ σ − ⟨σ⟩. All the particles are non-interacting in the infinite
N limit. This theory is not conformal: In the infrared limit, it flows to another
theory containing a single, massless, O(N )-singlet particle. For larger values
of g6 the exact potential is unbounded from below. The system is unstable (in
the supersymmetric case the potential is bounded from below and the larger
g6 structure is similar to the smaller g6 structure [26]). Actually this instabil-
ity is an artifact of the dimensional regularization used above, which does not
respect the positivity of the renormalized field σ. In any case a more careful
analysis [25] shows that the apparent instability reflects the inability to define
a renormalizable interacting theory. All the masses are of the order of the
cutoff, and there is no mechanism to scale them down to low mass values. In
other words, the theory depends strongly on its UV completion.
This is summarized in Table 1. There, S.B. denotes spontaneous symmetry
breaking of scale invariance and V is the vacuum energy. For f (g6 ) < 0 the
theory is unstable. Note that the vacuum energy always vanishes whenever
the theory is well defined.
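The three regimes of f(g6) can be encoded in a few lines. The sketch below is illustrative only: it uses V(σ) = f(g6)|σ|³ with f(g6) = g_c − g6 and g_c = (4π)² from (77)–(79), and the function names and the returned dictionary layout are assumptions made for this example.

```python
import math

G_C = (4 * math.pi) ** 2  # critical coupling g_c of (79)


def effective_potential(sigma, g6):
    """Large-N effective potential V(sigma) = (g_c - g6) |sigma|^3 of (77)-(78)."""
    return (G_C - g6) * abs(sigma) ** 3


def classify(g6, sigma_vev=0.0):
    """Return the vacuum of V for a given g6.

    sigma_vev is only meaningful at g6 = g_c, where V is flat and the
    expectation value of sigma is a free modulus.
    """
    f = G_C - g6
    if f > 0:
        sigma = 0.0                # unique minimum at the origin
    elif f == 0:
        sigma = abs(sigma_vev)     # flat direction: any value is allowed
    else:
        sigma = math.inf           # potential unbounded from below
    return {
        "sigma": sigma,
        "scale_broken": sigma != 0,
        "vacuum_energy": effective_potential(sigma, g6),
    }


print(classify(0.0))
print(classify(G_C, sigma_vev=2.0))
print(classify(2 * G_C))
```

The middle case is the one of interest: the expectation value is non-zero, scale invariance is spontaneously broken, and the vacuum energy is still exactly zero because the whole potential vanishes at g6 = g_c.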
When ⟨σ⟩ ≠ 0, and the scale invariance is spontaneously broken, one can
write down the effective theory for energy scales below ⟨σ⟩, and integrate
out the degrees of freedom above that scale. The vacuum energy remains
zero however, and is not proportional to ⟨σ⟩³ as might be expected naively
[24, 27, 28, 29, 30].
For completeness we note that by adding more vector fields one has also
phases in which the internal global O(N ) symmetry is spontaneously broken.

Table 1. Marginal perturbations of the O(N ) model

f(g6)        |⟨σ⟩|   S.B.                   Masses                                         V
f(g6) > 0    0       No                     0                                              0
f(g6) = 0    0       No                     0                                              0
f(g6) = 0    ≠ 0     Yes                    Massless dilaton, N particles of equal mass    0
f(g6) < 0    ∞       Yes, but ill defined   Tachyons or masses of order the cutoff         −∞

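An example is the O(N) × O(N) model [29] with two fields in the vector
representation of O(N), with Lagrangian:
\[ L = \partial_\mu \varphi_1 \cdot \partial^\mu \varphi_1 + \partial_\mu \varphi_2 \cdot \partial^\mu \varphi_2 + \lambda_{6,0} (\varphi_1^{2})^{3} + \lambda_{4,2} (\varphi_1^{2})^{2} (\varphi_2^{2}) + \lambda_{2,4} (\varphi_1^{2}) (\varphi_2^{2})^{2} + \lambda_{0,6} (\varphi_2^{2})^{3} . \tag{80} \]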

Again, the β functions vanish in the strict N → ∞ limit. There are now
two possible scales, one associated with the breakdown of a global symmetry
and another with the breakdown of scale invariance. The possibilities are
summarized by the table below:

O(N)   O(N)   Scale   Massless            Massive   V

+      +      +       all                 none      0
−      +      −       (N − 1) π's, D      N, σ      0        (81)
+      −      −       (N − 1) π's, D      N, σ      0
−      −      −       2(N − 1) π's, D     σ         0

Again, in all cases, the vacuum energy vanishes. Assume a hierarchy of
scales where the scale invariance is broken at a scale much above the scale
at which the O(N) symmetries are broken. One might have argued that one
would then have a low-energy effective Lagrangian for the massless pions and
dilaton, with a vacuum energy given by the scale at which the global symmetry
is broken. This is not true: the vacuum energy remains zero. This system has
a critical surface. On one patch the deep infrared theory contains only one
massless particle: an O(N) × O(N) singlet. For the other patches the deep
infrared theory is described by O(N) massless particles, most of which are not
O(N) singlets.
In general, effective field theories should have all possible symmetries of
the underlying theory, whether they are realized linearly or non-linearly. In
finite scale invariant theories the vacuum energy Evac should be determined
by all scales and symmetries involved. It should have the same value (zero
in this case), in all phases of the system whether or not expectation values
are formed. This punches a hole in Zeldovich-like arguments [31] and offers
a different view on the gravity of the cosmological constant problem [32]. If
the theory has a global scale invariance, which is spontaneously broken, it will
produce a dilaton. The question is: Where is the dilaton? The dilaton should
be a massless field. Several authors [33, 34] tried to check the possibilities
that the dilatons might exist, noting that the dilaton must be a massless
Goldstone boson. Under certain assumptions, one finds out that actually in
certain models having a massless dilaton would not violate experimental data.
Perhaps it even predicts deviations of the equivalence principle from Galileo's
famous experiment of δa/a ∼ 10^{−12}, just below the present experimental
sensitivity.
This is done under the assumption that the dilaton couples in the following
universal fashion
\[ L = F(\Phi)\left\{ R - F^{2} + 2\left[\nabla^{2}\Phi - (\nabla\Phi)^{2}\right] \right\} . \tag{82} \]
It could also happen that the dilaton gets swallowed in some Higgs-like
mechanism. One should also mention that if kinematically a finite scale-invariant
theory is forced by some super-selection rule (such as having a non-trivial
monopole number [35]) into a certain solitonic sector, then the rest energy of
the system should be accounted for, and the vacuum energy will be slightly
lifted from zero.
Let us finish this section by noticing an amusing thing: there are various
solutions that go under the name of Randall and Sundrum. One of the
constructions contains two types of branes. Near the boundary of the space there
is a Planck brane with tension T1, which is fine-tuned so as to have zero
cosmological constant. Then at a certain distance, very deep inside the bulk
theory, one places the TeV brane; it has negative tension T2, and the tension is
again fine-tuned so that the cosmological constant vanishes also on that brane.
The two branes are separated by some distance which in [36] is associated
with a massless particle, which is the dilaton or the radion (see Fig. 17).
In principle, there are circumstances where this distance is not fixed, and
there are several possible situations whose outcome is very similar to the one
discussed in the d = 3 conformal theory. If the sum of the tensions T1 + T2
is arranged to vanish, then the system behaves as a spontaneously broken
system; the magnitude of the vev of the field is the distance between the two
branes.
If T1 + T2 > 0, the two branes actually are attracted to sit one on top of
the other, and when T1 + T2 < 0, the branes repel, the system is unstable and
as a result one of the branes is exiled to infinity.
These three examples are in full correspondence with the conditions on
the coefficients of the (φ)^6 theory that we discussed above. The difference
between the two theories, and an important difference, is that in case of the
(φ)^6 theory we are certain that in the large N limit the theory is indeed finite
quantum mechanically.
For the case of (φ)^4 we don’t have such an assurance, and it would be
nice to find a system for which we are guaranteed to be finite also quantum
mechanically, which exhibits the same type of behavior.

Fig. 17. Planck brane and TeV brane

5.1 Conclusions

• Spontaneous breaking of translational and rotational symmetry is possible.
It fits data for many phases of matter, and it may have a manifestation
in the dynamics of compactification.

• Conformal/scale invariant theories which are stable but have no ground
states indicate a new mechanism of breaking time-translational invariance as
well as supersymmetry.

• A finite scale-invariant theory has the same vanishing vacuum energy in
all its phases.

• It is a great privilege to recognize Gabriele’s outstanding contributions.

Acknowledgements
The author thanks Matteo Cardella for various discussions on this manuscript.
The author wishes to thank his various collaborators on these subjects,
especially W. Bardeen, S. Elitzur, M. Einhorn, A. Forge, A. Giveon, M. Porrati,
A. Schwimmer and G. Veneziano.

References
1. S. Fubini, E. Rabinovici: Nucl. Phys. B 245, 17 (1984)
2. J. Hughes, J. Polchinski: Nucl. Phys. B 278, 147 (1986)
3. L. D. Landau: Phys. Z. Soviet II 26, 545 (1937)
4. S. Elitzur, A. Forge, E. Rabinovici: Nucl. Phys. B 359, 581 (1991)
5. G. Baym, H. A. Bethe, C. J. Pethick: Nucl. Phys. A 175, 25 (1971)
6. S. Alexander: Symmetries and Broken Symmetries in Condensed Matter
Physics, ed by N. Boccara (IDSET, Paris 1981) p. 141
7. A. B. Zamolodchikov: Sov. Phys. JETP Lett. 43, 730 (1986)
8. A. W. W. Ludwig, J. L. Cardy: Nucl. Phys. B 285 [FS19], 687 (1987)
9. S. Elitzur, E. Rabinovici, G. Sarkissian: Nucl. Phys. B 541, 246 (1999)
10. I. Affleck, A. W. W. Ludwig: Phys. Rev. Lett. 67, 161 (1991)
11. D. Friedan, A. Konechny: Phys. Rev. Lett. 93, 030402 (2004)
12. J. A. Harvey, D. Kutasov, E. J. Martinec: “On the relevance of tachyons,”
arXiv:hep-th/0003101
13. V. de Alfaro, S. Fubini, G. Furlan: Nuovo Cim. A 34, 569 (1976)
14. V. P. Akulov, A. I. Pashnev: Theor. Math. Phys. 56, 862 (1983) [Teor. Mat.
Fiz. 56, 344 (1983)]
15. P. Claus, M. Derix, R. Kallosh, J. Kumar, P. K. Townsend, A. Van Proeyen:
Phys. Rev. Lett. 81, 4553 (1998)
16. R. Kallosh: “Black holes and quantum mechanics,” arXiv:hep-th/9902007
17. E. D’Hoker, R. Jackiw: Phys. Rev. Lett. 50, 1719 (1983)
18. C. W. Bernard, B. Lautrup, E. Rabinovici: Phys. Lett. B 134, 335 (1984)
19. I. Affleck, M. Dine, N. Seiberg: Nucl. Phys. B 241, 493 (1984)
20. T. R. Taylor, G. Veneziano, S. Yankielowicz: Nucl. Phys. B 218, 493 (1983)
21. S. Fubini: Nuovo Cim. A 34, 521 (1976)
22. J. Polchinski: Nucl. Phys. B 303, 226 (1988)
23. S. Elitzur, A. Giveon, E. Rabinovici, A. Schwimmer, G. Veneziano: Nucl. Phys.
B 435, 147 (1995)
24. D. J. Amit, E. Rabinovici: Nucl. Phys. B 257, 371 (1985)
25. W. A. Bardeen, M. Moshe, M. Bander: Phys. Rev. Lett. 52, 1188 (1984)
26. W. A. Bardeen, K. Higashijima, M. Moshe: Nucl. Phys. B 250, 437 (1985)
27. D. S. Berman, E. Rabinovici: “Supersymmetric gauge theories,”
arXiv:hep-th/0210044
28. M. B. Einhorn, G. Goldberg, E. Rabinovici: Nucl. Phys. B 256, 499 (1985)
29. E. Rabinovici, B. Saering, W. A. Bardeen: Phys. Rev. D 36, 562 (1987)
30. W. M. Alberico, S. Sciuto: Symmetry and Simplicity in Physics (Proceedings
of the Symposium on the occasion of Sergio Fubini’s 65th Birthday, Turin, Italy,
February 24–26) (World Scientific, Singapore 1994), p. 220
31. Ya. B. Zeldovich: Sov. Phys. Uspekhi 11, 381 (1968)
32. S. Weinberg: Rev. Mod. Phys. 61 (1989)
33. T. Damour, A. M. Polyakov: Nucl. Phys. B 423, 532 (1994)
34. T. Damour, F. Piazza, G. Veneziano: Phys. Rev. D 66, 046007 (2002)
35. E. Rabinovici: unpublished
36. R. Rattazzi, A. Zaffaroni: JHEP 0104, 021 (2001)
Part VII

String/Quantum Gravity, Black Holes and Entropy
The Information Paradox

D. Amati

International School for Advanced Studies (SISSA), Trieste, Italy

Los muertos que vos matasteis gozan de buena salud¹

Abstract. The incompatibility between gravity and quantum coherence represented
by black holes should be solved by a consistent quantum theory that contains
gravity, such as superstring theory. Despite many encouraging results in that
sense, I question here the general feeling of a naïve resolution of the paradox,
and indicate non-trivial physical possibilities towards its solution that are
suggested by string theory and may be further investigated in its context.

1 Introduction

The fact that black holes represent an apparent contradiction between gravity
and quantum mechanics is too well known a problem to need exhaustive recall.
The best way to visualize it is to consider together the formation and evapo-
ration processes. We may envisage a black hole (b.h.) to be formed by a pure
quantum state prepared in a distant flat space (an impinging spherical wave,
two or 25 particles colliding at high energies and small impact parameter,
and so on). If the characteristics of a b.h. – including its evaporation implied
by quantum mechanics – depend only on a few basic parameters (M, Q, J) as
required by general relativity (no hair), it is clear that quantum coherence of
the initial state is totally lost in the process. The contradiction has resisted
attempts at doctored modifications (such as corrections to the thermal character
of Hawking evaporation) and brought distinguished scientists to give up either
quantum mechanics [1] or the relevance of classical (general relativity) solu-
tions in a path integral formulation of the quantum theory of gravitation [2].
On the other hand, the advent of string (actually superstring) theory as a
consistent quantum theory that contains gravity gave confidence that somehow
the paradox should be solved in its framework. Much progress has been
made in studying b.h. regimes in string theories and a remarkable set of
coincidences has been revealed. After briefly recalling those results, I will
argue that the paradox not only is not trivially solved as often claimed, but
manifests its full vitality in compelling some quite novel possibilities in the
generalization to the quantum realm of some classical concepts such as space–time
and its geometry, or in the influence that quantum effects may have on the
actual realization of classical geometrical configurations such as trapped
space–time regions.

¹ José Zorrilla, “Don Juan Tenorio” (1844).

D. Amati: The Information Paradox, Lect. Notes Phys. 737, 609–617 (2008)
DOI 10.1007/978-3-540-74233-6_20 © Springer-Verlag Berlin Heidelberg 2008

2 String Theories and Black Holes

String theories contain arbitrarily massive states within regions characterized
by the string length ls – the basic dimensional parameter of the theory – and
thus states that, classically, would represent black holes. The mass beyond
which those states should be black holes depends [3] on the string coupling
g, the other – dimensionless – basic parameter of the theory. Or, in different
words, for every mass (or excitation energy) there is a small coupling string
regime, and a large coupling b.h. regime.
In the string regime, D-branes in four [4] and five [5] dimensions with
a convenient number of charges have been studied. BPS
(Bogomol'nyi–Prasad–Sommerfield) states have been counted, as well as nearly BPS
states for certain regions of moduli space where perturbative computations are
feasible [6]. Decay rates have been computed [7] – by averaging over the many
degenerate initial states – and shown to have a typical thermal distribution.
The moduli independence of these results allows one to conjecture [8] their
validity beyond the moduli region where they were computed. And their g
independence, also suggested by non-renormalization arguments [9], may imply
their possible continuation beyond the weak coupling regime.
An independent treatment – on totally different grounds – of the strong
coupling regime substantiates that impression. The large g description of the
four- and five-dimensional systems just described is found by solving the 10-d
supergravity equations after reduction on the same compact manifold used for
the D-brane description. The solution generates a metric [10] that depends on
parameters that are related to the charges through the moduli of the compact
manifold. The metric shows an event horizon even in the extreme limit in
which its area gives the Bekenstein–Hawking entropy of extremal b.h. (see
e.g. [11]). This entropy and the ADM (Arnowitt–Deser–Misner) mass coincide
with the (exponentiated) multiplicity and mass of the BPS states with the
same charges, as computed from the D-branes in the small coupling regime.
For nearly extremal b.h. the entropy and the evaporation spectrum – obtained
by solving wave equations in the corresponding metric background – coincide
again [7] with those computed for small g. And, remarkably, even deviations
from black body spectrum seem to agree [12].
The microscopic formulation of the 5-d near extremal b.h. has been further
studied [13] in terms of the D1–D5 brane system. The AdS/CFT
(anti-de Sitter/conformal field theory) correspondence was shown to play a role in
the matching between supergravity results and the microscopic (SCFT) for-
mulation of the b.h. thermodynamics and Hawking radiation, the b.h. being
defined through a density matrix.
All these agreements among such different computations gave confidence
to the g continuation of the theory to a strong coupling regime where b.h.
physics is met. This direct connection between the semiclassical black hole
picture and a unitary quantum approach, has been considered the sign that
the information loss due to b.h. could be somehow recuperated [14]. But how
this may be achieved is yet far from clear. In the computations just referred
to, it appeared clearly that the thermal Hawking radiation was obtained by
averaging over the degenerate microstates that are counted by the b.h. entropy,
while each microstate would have given rise to a complex but non-thermal
radiation with well-defined spectra and correlations that carry the precise
identity of the microstate from which they originated. This is of course a basic
characteristic of a microstate (a pure quantum state) irrespective of g. In
other words, the black hole microstates are not themselves black holes [15].
And this is so not only because of the absolute specificity of their radiation,
but also because no signal of an event horizon is associated with any of them.
This last fact is of course expected by sheer consistency: if a b.h. microstate
were characterized by an event horizon, it would itself have a Bekenstein
entropy and thus would not be a pure quantum state. The b.h. appears indeed as
the macrostate correctly defined by a decoherence procedure (a density matrix)
over the many non-blackholish microstates of the theory [16].
The obvious consequence of the preceding discussion is that a well prepared
quantum state (a spherical shell impinging from large distances, or a two
particle scattering at high energy and small impact parameter, etc.) is not
expected to give rise to a b.h. even if the classical conditions for a gravitational
collapse are apparently satisfied.
The possibility that microstates do not have a horizon has been more re-
cently proposed in a different context [14]: for every wrapping of a D1 brane
(whose number defines one of the charges briefly mentioned before) a profile
function in transverse space is introduced so as to enter into a momentum charge
that contributes to the BPS charge. These profile functions then enter into
the supergravity solutions that are supposed to hold in the strong coupling
regime and change their behaviour at short radius, differently for every differ-
ent profile function. They are not singular at r = 0, and the value of r where
they all start resembling the usual b.h. solution outside the horizon is identi-
fied as the fuzzy “horizon” of the fuzzball proposal for b.h. It is unclear, however,
if and how a trapped region could emerge for the incoherent superposition
characterizing the b.h. macrostate.

3 The Role of Decoherence

If string microstates counted by the entropy of b.h. macrostates are not them-
selves black holes, then decoherence, intrinsic to any classical
limit, should be critical in building b.h. characteristics such as metric singularities
and event horizons. It is not surprising that decoherence may have an impor-
tant role in high excitation string physics due to the very large degeneracy of
states in that regime. Indeed, even for g → 0, i.e. for tree diagrams, the non-
trivial spectrum of emitted particles in the decay of any high-mass excitation
gives rise to a thermal distribution if an average over the very many states
with the same mass is performed [17].
Even if effects of this kind may well be at work also for large couplings,
decoherence should have much more subtle effects in order to generate b.h.
physics from non black-holish microstates. Let me provide some speculative
ideas on how a geometric picture could arise from a decoherence procedure
in the pregeometric string approach. In this theory, indeed, even space and
time are defined through the string; they are operators and not parameters
that could be interpreted as coordinates of a space–time that may subtend
a dynamical geometry. These are all concepts that may arise in a classical
limit of the theory when quantum fluctuations may be neglected. But even in
this limit, the theory contains in principle not only the metric and possibly
matter fields, but also an infinite number of higher-rank tensor fields whose
effect may possibly be ignored only in some conditions. The (infinite number
of) equations that these (infinite number of) fields should satisfy, are given
by the condition of no conformal anomaly (β = 0), and it is in the limit
of small frequencies (in string length units) that only massless fields appear
satisfying Einstein’s equations [18]. But in the presence of a horizon of a metric
solution, the statement of low frequency is not relativistically invariant. Indeed,
an arbitrarily low-frequency wave for a fixed external observer will be perceived
by a freely falling one with a blue shift which increases without bound when
approaching the horizon. This means that to have disregarded contributions
with higher derivatives, or fields with higher tensorial character, would have
been an unwarranted approximation. And even a small effect of those tensors
could have avoided the metric condition that implied the singularity and the
trapped region in the usual Einstein equation. There could be many solutions
involving different field configurations in which the metric and other tensor
fields are classically entangled with relevant phases. And it could well happen
that an incoherent superposition of these different background configurations
could wash out the higher tensorial fields leaving a geometric description with,
eventually, a b.h. metric with its singularity and its event horizon. This could
be a hypothetical way in which non-b.h. microstates could give rise to a b.h.
macrostate.
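The blue-shift argument used above can be illustrated in the simplest setting. The following is only a sketch under stated assumptions that are not part of the original discussion: a Schwarzschild exterior with horizon radius r_s, an outgoing radial wave whose frequency at infinity is omega_inf, and two idealized observers (one static at radius r, one falling radially from rest at infinity). The function names are hypothetical.

```python
import math


def static_observer_frequency(omega_inf, r, r_s=1.0):
    """Frequency measured by a static observer at radius r > r_s.

    Standard gravitational blue shift: omega_inf / sqrt(1 - r_s / r).
    """
    return omega_inf / math.sqrt(1.0 - r_s / r)


def infalling_observer_frequency(omega_inf, r, r_s=1.0):
    """Frequency of an OUTGOING radial wave measured by an observer who
    falls radially from rest at infinity (assumed setup).

    Result of contracting the wave vector with the observer's
    four-velocity: omega_inf / (1 - sqrt(r_s / r)).
    """
    return omega_inf / (1.0 - math.sqrt(r_s / r))


if __name__ == "__main__":
    omega_inf = 1.0e-3  # "low" frequency for the distant observer
    for r in (10.0, 2.0, 1.1, 1.01, 1.0001):
        print(r,
              static_observer_frequency(omega_inf, r),
              infalling_observer_frequency(omega_inf, r))
```

In both cases the locally measured frequency grows without bound as r approaches r_s, so a wave that counts as low frequency far away does not remain so near the horizon, which is the point made in the text about the low-frequency truncation.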
In this case, the apparent contradiction between b.h. in classical general
relativity and quantum coherence is solved in a conceptually simple way:
it is the decoherence procedure, implied in any classical limit, that gives
rise – from a consistent quantum theory of gravitation such as superstrings – to a
classical geometrical space–time description (general relativity) with possible
trapped regions, event horizons and b.h. and, of course, the loss of quantum
coherence.

4 High-energy Collisions in String Theory and Metric Back Reaction

Let us now discuss high-energy scattering. Superstrings provide a computa-
tional perturbative algorithm for S-matrix amplitudes that, if properly re-
summed, allows an explicit analysis of the continuation to the strong coupling
(b.h.) regime. Therefore, as we shall discuss later, the consistent quantum the-
ory may investigate situations in which, semiclassically, the process should be
described by b.h. formation and subsequent evaporation. Thus, hopefully,
the analysis may throw light on how and why it may happen that a coherent
quantum state would not produce a b.h. even if the classical conditions to
form it are met.
Much work has been done to study trans-Planckian collisions in a string
approach [18, 19, 20]. I will recall methods and results that are consis-
tently computable in the string regime and organized in an effective action
form [21] to tackle their extension to a strong coupling regime where, semi-
classically, b.h. formation and subsequent evaporation should be expected.
As already said, string (or actually superstring) theories contain a dimen-
sional scale, the string length $l_s$, and a dimensionless one, g, the string cou-
pling that generates the genus expansion. Gravitational scales, such as the Newton
constant G or the Schwarzschild radius $R_S$ corresponding to an energy E,
are given by
$$G = g^2 l_s^2/\hbar , \qquad R_S = GE . \qquad (1)$$
For simplicity, (1) and other explicit expressions we shall give will refer to the
d = 4 case, even if the analysis we recall has been done for an arbitrary number
d of non-compactified dimensions. The method used in [18] is to consider a
trans-Planckian regime defined by small coupling and large energy,

$$g^2 \ll 1 , \qquad E l_s/\hbar \gg 1 , \qquad (2)$$

so that
$$GE^2/\hbar = g^2 (E l_s/\hbar)^2 > 1 . \qquad (3)$$
In the genus expansion of string amplitudes all terms in which $g^2$ is enhanced
by the large factor as in (3) have to be considered, and resummed. Let us notice
that in the large energy regime of (2) and (3) $R_S/l_s = g^2 E l_s/\hbar$ can be smaller
or larger than 1 and, as we shall see, physics will be different on the two sides
of the inequality. The computation of the collision amplitude in superstring
theory in terms of the energy E and impact parameter b has been organized
in powers of $R_S^2/b^2$. For b larger than both $R_S$ and $l_s$, the two-particle col-
lision amplitude in the high-energy regime as defined by (2), obtained by
the all-order resummation just discussed, has an eikonal form, the eikonal
being a Hermitian operator (thus a unitary S-matrix) in the Fock space of the
two colliding strings. Only for very large values of b, where the amplitude is
perturbative and dominated by the graviton pole, is the scattering elastic,
while for $b < g E l_s^2/\hbar$ the two colliding gravitons are also excited to other
superstring states in the scattering process. The eikonal is large and allows a
classical trajectory interpretation through a saddle point in the Bessel trans-
form to momentum transfer. It reproduces the relation between the deflection
angle and the impact parameter classically experienced by each particle in the
(Aichelburg–Sexl) gravitational field created by the other one. In addition,
while deflecting, each colliding particle may be excited (in a calculable
way) to one of its string recurrences, implying an attenuation of the elastic
amplitude (imaginary phase) that increases, together with the deflection an-
gle, for decreasing b. In the $R_S < l_s$ case, b may decrease to where string effects
become relevant, giving rise to copious inelastic production [22] and thus to a
softening that implies an attenuation of the elastic amplitude and a reduced
deflection angle. In the $R_S > l_s$ case, when b approaches $R_S$, new terms ap-
pear, as said before, in the form of powers of $R_S^2/b^2$, that look like classical
corrections despite their quantum origin. The first term has been computed
in the string framework [16, 23], and an effective action algorithm has been
proposed for computing and resumming them all [21].
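The scales entering (1)–(3) can be tabulated with a few lines of code. The sketch below is only an illustration in d = 4 with units where ħ = 1 by default; the function names, the dictionary keys and the use of plain inequalities as crude stand-ins for "≪" and "≫" are assumptions made here, not part of the original analysis.

```python
def collision_scales(g, l_s, energy, hbar=1.0):
    """Return the gravitational scales of (1)-(3) for given g, l_s and E."""
    newton_g = g**2 * l_s**2 / hbar          # G = g^2 l_s^2 / hbar, eq. (1)
    r_s = newton_g * energy                  # R_S = G E, eq. (1)
    return {
        "G": newton_g,
        "R_S": r_s,
        "GE2_over_hbar": newton_g * energy**2 / hbar,   # left side of eq. (3)
        "R_S_over_l_s": r_s / l_s,                      # = g^2 E l_s / hbar
        "b_inelastic": g * energy * l_s**2 / hbar,      # excitations for b below this
    }


def regime(g, l_s, energy, hbar=1.0):
    """Classify a point (g, l_s, E); '<' and '>' crudely replace '<<' and '>>'."""
    s = collision_scales(g, l_s, energy, hbar)
    in_regime = (g**2 < 1.0
                 and energy * l_s / hbar > 1.0
                 and s["GE2_over_hbar"] > 1.0)
    if not in_regime:
        return "outside the regime of (2)-(3)"
    if s["R_S_over_l_s"] > 1.0:
        return "R_S > l_s: classical-looking corrections in R_S^2/b^2"
    return "R_S < l_s: string effects dominate at small b"


if __name__ == "__main__":
    for energy in (0.5, 50.0, 1000.0):
        print(energy, regime(0.1, 1.0, energy))
```

With g = 0.1 and l_s = 1, an energy E = 50 satisfies (3) but gives R_S/l_s = 0.5, while E = 1000 gives R_S/l_s = 10, illustrating that both sides of the inequality are reachable inside the same trans-Planckian regime.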
This may be interpreted as a metric and dilaton background generated
by the process or, equivalently, as a consistent quantum computation of the back
reaction on the metric, giving effects that become relevant when approach-
ing situations in which b.h. formation is classically expected. It could thus
represent a way of understanding how and why a b.h. is avoided in a well-
defined quantum state such as the one under discussion. It is perhaps unfortunate
that no further effort has been devoted in that direction. I even have a vague
recollection of a sense of frustration shared by the scientist to whom we dedicate these
contributions, Ciafaloni and myself when, many years ago, some prelimi-
nary results could not be forced into the recognition of a horizon. That fact
led us to give up, while today I would consider it the expected
sign revealing novel quantum gravitational effects! Furthermore, if this sort
of back reaction is efficient in avoiding trapped regions in the well-defined
quantum state represented by the two colliding particles, it could perhaps
continue to do so in arbitrary collapse situations. Let me also venture that
this possible effect of quantum back reactions on the metric may allow an
interpretation of the recent Hawking suggestion [2] that the original classi-
cal solution, such as the Schwarzschild metric in a gravitational collapse, may give
an irrelevant contribution to the path integral for the actual gravitational
process.

5 Metric Back Reaction and Possible Avoidance of Black Holes
The idea that standard b.h. may not be the objects realized in nature even
at the macroscopic level, has been recently explored within different con-
texts [24]. In particular, interesting suggestions have been borrowed from ge-
ometric acoustical models that can be studied experimentally and show a
physics that is associated with classical and quantum fields in curved space–
times [25]. The propagation of small disturbances in the flow of even simple fluids
is known to behave equivalently to a linear (classical or quantum) field over
an acoustic space–time endowed with an acoustic metric [26]. Depending on
that metric, an acoustic b.h. (a trapped region corresponding to a super-
sonic regime in the fluid flow) may be created. It has, however, been noted [27] that
Hawking-like radiation does not necessarily imply the formation of a trapped
region; it is sufficient that a sonic point conveniently develops in the asymp-
totic future. The radiation is then controlled by a temperature that contains
both the Hawking one and the rate by which the sonic point is reached for
t → ∞. This critical collapse result suggests an alternative scenario for a semi-
classical collapse and evaporation of “b.h.” objects that – very speculatively –
could be exported to semiclassical gravity. Its interpretation would imply that
some quantum back reaction on the geometry could prevent the surface of the
collapsing star (or impinging matter) from actually crossing the Schwarzschild
radius. At later stages, the evaporation process would become more efficient,
so as to induce a chasing of the would-be horizon by the surface of the star that
could end with complete evaporation and a flat space–time [27].

6 Conclusions and Outlook

I hope to have substantiated my (probably personal) point of view of why
some coincidences between string state multiplicity and average decay spec-
tra, on one hand, and b.h. entropy and evaporation spectra, on the other, are
far from having solved in a naive way the apparent paradox of loss of quan-
tum coherence in b.h. formation and evaporation (the information paradox).
String microstates, in particular, are not b.h. and well defined quantum states
would not generate b.h. even if they would have been expected on classical
grounds. I have discussed two ways to resolve this apparent discrepancy, both
of them accessible to further investigation in the string framework. The first
one starts from the fact that superstring theory is pregeometrical and even the
concept of space–time is induced by the string through a classical limit. Thus
space–time, geometry, event horizons, black holes and the loss of quantum
coherence would all come with the same token, i.e. the decoherence procedure
implied in the classical limit that leads to general relativity. Thus no para-
dox: either bona fide quantum (such as superstrings) or classical space–time with
dynamical geometry and black holes but no a priori quantum coherence. The
other possibility is that the lack of b.h. formation in a quantum state, such as a two-
particle collision, may be due to well-identified quantum contributions that
give rise to apparently classical effects that act as quantum back reactions
on the metric. These effects could remain influential even in classical gravi-
tational collapse processes, thus avoiding metric singularities, trapped regions
and event horizons, so that not even classical b.h. would form, despite the
fact that many external observational properties would not look very dissim-
ilar. Thus no paradox because no real black holes: no trapped region or event
horizon to spoil quantum coherence or information retrieval.

Recognition
I had the chance to enjoy a lively and fruitful collaboration with Gabriele for
many years and on a variety of subjects. We shared, as is also reflected in this
paper, the joy of elaborating original physics, the frustration of unexpected
obstacles and the persisting challenge of different viewpoints on possible de-
velopments. I wish him continued success, surrounded by friends and
collaborators attracted by his scientific and human qualities. People of all
origins and ages ... with me at the oldest end.

References
1. S.W. Hawking: Phys. Rev. D 50, 3982 (1994)
2. S.W. Hawking: Lecture at the 17th Int. Conf. on General Relativity and Grav-
itation, 2004 (http://math.ucr.edu/home/baez/week207.html)
3. M.J. Bowick, L. Smolin, L.C.R. Wijewardhana: Phys. Rev. Lett. 56, 424 (1986);
G. Veneziano: Europhys. Lett. 2, 133 (1986);
L. Susskind: hep-th/9309145
4. J. Maldacena, A. Strominger: Phys. Rev. Lett. 77, 428 (1996);
C. Johnson, R. Khuri, R. Myers: Phys. Lett. B 378, 78 (1996)
5. A. Strominger, C. Vafa: Phys. Lett. B 379, 99 (1996)
6. C. Callan, J. Maldacena: Nucl. Phys. B 472, 591 (1996)
7. S. Das, S. Mathur: Nucl. Phys. B 478, 561 (1996);
S. Gubser, I. Klebanov: Nucl. Phys. B 482, 173 (1996)
8. G. Horowitz, J. Maldacena, A. Strominger: Phys. Lett. B 383, 151 (1996)
9. J. Maldacena: hep-th/961125
10. M. Cvetic, D. Youm: hep-th/9508058, hep-th/9512127;
G. Horowitz, D. Lowe, J. Maldacena: Phys. Rev. Lett. 77, 430 (1996)
11. L. Andrianopoli, R. D’Auria, S. Ferrara, M. Trigiante: Extremal black holes in
supergravity, this volume
12. S. Gubser, I. Klebanov: Phys. Rev. Lett. 77, 4491 (1996);
J. Maldacena, A. Strominger: Phys. Rev. D 55, 861 (1997)
13. J.R. David, G. Mandal, S.R. Wadia: Phys. Rep. 369, 549 (2002)
14. G. Horowitz: gr-qc/9704072
15. D. Amati: Phys. Lett. B 454, 203 (1999)
16. S.D. Mathur: hep-th/0502050
17. D. Amati, J. Russo: Phys. Lett. B 454, 207 (1999)
18. C.G. Callan, E. Martinec, M.J. Perry, D. Friedan: Nucl. Phys. B 262, 593
(1985)
19. D. Amati, M. Ciafaloni, G. Veneziano: Phys. Lett. B 197, 81 (1987); Int. J.
Mod. Phys. A 3, 1615 (1988)
20. D.J. Gross, P.F. Mende: Phys. Lett. B 197, 129 (1987); Nucl. Phys. B 303, 407
(1988);
P.F. Mende, H. Ooguri: Nucl. Phys. B 339, 641 (1990)
21. D. Amati, M. Ciafaloni, G. Veneziano: Nucl. Phys. B 403, 707 (1993)
22. G. Veneziano: JHEP 11, 001 (2004)
23. E. Gava, R. Iengo, C.-J. Zhu: Nucl. Phys. B 323, 585 (1989);
D. Amati, M. Ciafaloni, G. Veneziano: Nucl. Phys. B 347, 550 (1990)
24. T.A. Roman, P.G. Bergmann: Phys. Rev. D 28, 1265 (1983);
C.R. Stephens, G. ’t Hooft, B.F. Whiting: Class. Q. Grav. 11, 621 (1994);
A. Ashtekar, M. Bojowald: gr-qc/050429;
S. Hayward: gr-qc/0506120;
T. Vachaspati, D. Stojkovic, L.M. Krauss: gr-qc/0609024
25. C. Barceló, S. Liberati, M. Visser: Living Rev. Relativity 8, 12 (2005)
26. W.G. Unruh: Phys. Rev. Lett. 46, 1351 (1981)
27. C. Barceló, S. Liberati, S. Sonego, M. Visser: Class. Q. Grav. 23, 5341 (2006);
Phys. Rev. Lett. 97, 171301 (2006)
Cosmological Entropy Bounds

R. Brustein

Department of Physics, Ben-Gurion University, Beer-Sheva 84105, Israel


Abstract. I review some basic facts about entropy bounds in general and about
cosmological entropy bounds. Then I review the causal entropy bound, the condi-
tions for its validity and its application to the study of cosmological singularities.
This article is based on joint work with Gabriele Veneziano and subsequent related
research.

1 To Gabriele

On the occasion of your 65th birthday may you continue to find joy in science
and life as you have always had, and continue to help us understand our
universe with your creative passion and vast knowledge. It is a pleasure and
an honor to contribute to this volume and present one of the subjects among
your many interests. Thank you for explaining to me why entropy bounds are
interesting and for your collaboration on this and other subjects.

2 Introduction

2.1 What Are Entropy Bounds?

The second law of thermodynamics states that the entropy of a closed system
tends to grow toward its largest possible value. But what is this maximal
value? Entropy bounds aim to answer this question.
Bekenstein [1] has suggested that for a system of energy E whose size R is
larger than its gravitational radius, $R > R_g \equiv 2 G_N E$, entropy is bounded by

$$S \le ER/\hbar = R_g R\, l_P^{-2} . \qquad (1)$$

Here $l_P$ is the Planck length. This is known as the Bekenstein entropy bound
(BEB).
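To get a feeling for the size of the bound, the following sketch evaluates it in SI units. It is an illustration only and makes assumptions beyond the text: it restores the numerical coefficient that (1) neglects by using the standard form S ≤ 2πER/(ħc) (entropy in units of Boltzmann's constant), and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m / s
G_N = 6.67430e-11        # Newton constant, m^3 kg^-1 s^-2


def bekenstein_bound_nats(energy_joule, radius_m):
    """Bekenstein bound S <= 2 pi E R / (hbar c), in units of k_B (nats)."""
    return 2.0 * math.pi * energy_joule * radius_m / (HBAR * C)


def bekenstein_bound_bits(energy_joule, radius_m):
    """Same bound expressed in bits."""
    return bekenstein_bound_nats(energy_joule, radius_m) / math.log(2.0)


def black_hole_entropy_nats(mass_kg):
    """Bekenstein-Hawking entropy A c^3 / (4 G hbar) of a Schwarzschild BH."""
    radius = 2.0 * G_N * mass_kg / C**2
    area = 4.0 * math.pi * radius**2
    return area * C**3 / (4.0 * G_N * HBAR)


if __name__ == "__main__":
    # One kilogram of rest energy confined within one metre.
    print(bekenstein_bound_bits(1.0 * C**2, 1.0))
```

Evaluating the bound at R = R_g for a given mass reproduces the Bekenstein–Hawking entropy of a black hole of that mass, which is a quick consistency check on the coefficient assumed here.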

R. Brustein: Cosmological Entropy Bounds, Lect. Notes Phys. 737, 619–659 (2008)
DOI 10.1007/978-3-540-74233-6_21 © Springer-Verlag Berlin Heidelberg 2008
Entropy bounds are closely related to black hole (BH) thermodynamics
and to the interplay of BHs with their “normal” environment. They are also probably
associated with instabilities to forming BHs; however, this has not been proved
in an explicit calculation. The original argument of Bekenstein was based on
the Geroch process: a thought experiment in which a small thermodynamic
system is moved from infinity into a BH. The small system is lowered slowly
until it is just outside the BH horizon, and then falls in. By requiring that the
generalized second law (GSL) will not be violated, one gets inequality (1).
A long debate about the relationship between entropy bounds and the
GSL has been going on. On one side, Unruh, Wald and others [2, 3] have
argued that the GSL holds automatically, so that entropy bounds cannot
be inferred from situations where the law seems to be violated. They argue
that the microphysics will eventually take care of any apparent violation.
Consequently, they argued that the BEB does not have to be postulated as a
separate requirement in addition to the GSL. Responding to their arguments,
Bekenstein [4] has argued that it is not always obvious in a particular example
how the system avoids violating the bound; he analyzed in detail several of
the purported counter-examples of this type and demonstrated in each case
the specific mechanism enforcing the bound.
Holography [5] (see below) suggests that the maximal entropy of any sys-
tem is bounded by $S_{HOL} \le A\, l_P^{-2}$, where A is the area of the space-like surface
enclosing a certain region of space. For systems of limited gravity, $R > R_g$,
and since $A = R^2$, the BEB implies the holography bound. Physics up to
scales of about 1 TeV is very well described in terms of quantum field the-
ory, which uses, roughly, one quantum mechanical degree of freedom (DOF)
for each point in space (the number of DOF is the logarithm of the num-
ber of independent quantum states). This seems to imply that S(V ) ∼ V ,
but the BEB states that S(V ) ≤ A. The BEB does not seem to depend on
the detailed properties of the system and can thus be applied to any vol-
ume V of space in which gravity is not dominant. The bound is saturated
by the Bekenstein–Hawking entropy associated with a BH horizon, stating
that no stable spherical system can have a higher entropy than a BH of
equal size.
A bold interpretation of the BEB was proposed by ’t Hooft and Susskind
[5]: that the number of independent quantum DOF contained in a given
spatial volume V is bounded by the surface area of the region. In a later
formulation by Bousso [6] their conjecture reads “a physical system can be
completely specified by data stored on its boundary without exceeding a den-
sity of one bit per Planck area.” In this sense the world is two-dimensional
and not three-dimensional; for this reason their conjecture is called the holo-
graphic principle. The holographic principle postulates an extreme reduction
in the complexity of physical systems, and is not manifest in a description
of nature in terms of quantum field theories on curved space. It is widely
believed that quantum gravity has to be formulated as a holographic theory.
This point of view has received strong support from the AdS/CFT duality
[7, 8], which defines quantum gravity non-perturbatively in a certain class of
space–times and involves only the physical degrees of freedom admitted by holography.
One way of viewing entropy bounds is that they are new laws of nature
that have to supplement the equations that govern any fundamental theory
of quantum gravity. From this perspective the entropy bounds and the holo-
graphic principle are presumed to be valid for any physical system and their
“true” form has to be unraveled. An alternative perspective is that entropy
bounds will be automatically obeyed by any physical system and will be a con-
sequence of the fundamental dynamical equations. As such, entropy bounds
will not provide additional independent constraints on the system’s evolution.
In the final fundamental theory entropy bounds will be tautologically correct.
My personal view on this issue at the present time is closer to the second
point of view.
My current perspective is that without detailed knowledge of the dynam-
ical equations that govern physics at the shortest distance scales and at the
highest energies it is hard to make detailed quantitative use of entropy bounds.
They are very useful as qualitative tools in the absence of the final fundamental
theory of quantum gravity when one is trying to determine whether a can-
didate theory is correct by studying its consequences. As I will explain, they
are particularly useful in discriminating among cosmologies that are suspected
of being unphysical for various reasons.

2.2 What Are Cosmological Entropy Bounds?

Is it possible to extend entropy bounds to more general situations, for exam-
ple, to cosmology? In 1989 Bekenstein [9] proposed that it might be pos-
sible to apply the BEB to a region as large as the particle horizon $d_p$:
$d_p(t) = a(t) \int_{t_{\rm initial}}^{t} dt'/a(t')$, a(t) being the scale factor of a Friedmann–
Robertson–Walker (FRW) universe. If the entropy of a visible part of the
universe obeys the usual entropy bound from nearly flat space situations,
then Bekenstein suggested that the temperature of the universe is bounded
and therefore certain cosmological singularities are avoided. The proposal to
apply the holographic bound from nearly flat space to cosmology was first
made by Fischler and Susskind [10] and later extended and modified by Bousso
[6]. Verlinde [11] proposed an entirely holographic bound on entropy stating
that the subextensive component of the entropy (the “Casimir entropy”) of a
closed universe has to be less than the entropy of a BH of the same size.
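The particle horizon $d_p(t)$ defined above is easy to evaluate for simple scale factors. The sketch below is an illustration under assumptions not made in the text: a power-law scale factor a(t) = t^p with 0 < p < 1, a small but nonzero initial time, and a plain midpoint rule for the integral. The function names are hypothetical.

```python
def particle_horizon(a, t, t_initial, steps=20_000):
    """d_p(t) = a(t) * integral from t_initial to t of dt'/a(t') (midpoint rule)."""
    h = (t - t_initial) / steps
    integral = sum(h / a(t_initial + (i + 0.5) * h) for i in range(steps))
    return a(t) * integral


def power_law_horizon(p, t, t_initial):
    """Closed form of the same quantity for a(t) = t**p with p != 1."""
    return t**p * (t**(1.0 - p) - t_initial**(1.0 - p)) / (1.0 - p)


if __name__ == "__main__":
    p = 0.5  # radiation-dominated FRW universe, a(t) proportional to t^(1/2)
    numeric = particle_horizon(lambda t: t**p, 1.0, 0.01)
    exact = power_law_horizon(p, 1.0, 0.01)
    print(numeric, exact)
```

For p = 1/2 the closed form tends to 2t as the initial time goes to zero, so the particle horizon of a radiation-dominated universe is of the order of the Hubble radius 1/H = 2t.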
To appreciate the necessity to modify the BEB in some situations, let us
think [12] about a box of relativistic gas in thermal equilibrium at a tem-
perature T. We assume that the gas consists of N independent DOF and is
confined to a box of macroscopic linear size R. We further assume that R
is larger than any fundamental length scale in the system, and in particular
that R is much larger than the Planck length, $R \gg l_P$. The volume of the
box is $V = R^3$. Since the gas is in thermal equilibrium, its energy density
is $\rho = N T^4$ and its entropy density is $s = N T^3$ (here and in the following
we systematically neglect numerical factors). Here we are interested in the
case RT > 1, which means that the size of the box is larger than the thermal
wavelength 1/T. The case RT < 1 has been considered previously in [13]. In
that case the temperature is not relevant; rather, the field theory cutoff Λ was
shown to be the relevant scale.
Under what conditions is this relativistic gas unstable to the creation of
BHs? The simplest criterion which may be used to determine whether an
instability is present is a comparison of the total energy in the box, $E_{Th} =
N T^4 R^3$, to the energy of a BH of the same size, $E_{BH} = M_P^2 R$ ($M_P$ is the
Planck mass). The two energies are equal when $T^4 = \frac{1}{N} M_P^2/R^2$. So thermal
radiation in a box has a lower energy than a BH of the same size if
$$(TR)^4 < \frac{1}{N} M_P^2 R^2 . \qquad (2)$$
Another way to determine the presence of an instability to creation of BHs
is to compare the thermal entropy $S_{Th} = N T^3 R^3$ to the entropy of the BH,
$S_{BH} = M_P^2 R^2$. They are equal when $T^3 = \frac{1}{N} M_P^2/R$. So thermal radiation
in a box has a lower entropy than a BH of the same size if
$$(TR)^3 < \frac{1}{N} M_P^2 R^2 . \qquad (3)$$
From (2) and (3) it is possible to conclude the well-known fact that for fixed
R and N , if the temperature is low enough, the average thermal free energy
is not sufficient to form BHs. For low temperatures the thermal fluctuations
are weak and they do not alter the conclusion qualitatively.
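Conditions (2) and (3) can be compared numerically. The sketch below works in Planck units (M_P = 1 by default) and, like the text, neglects numerical factors; the function names and the sample values of N and R are assumptions chosen only for illustration.

```python
def energy_threshold_temperature(n_species, size, m_planck=1.0):
    """Temperature at which (2) is saturated: T^4 = M_P^2 / (N R^2)."""
    return (m_planck**2 / (n_species * size**2)) ** 0.25


def entropy_threshold_temperature(n_species, size, m_planck=1.0):
    """Temperature at which (3) is saturated: T^3 = M_P^2 / (N R)."""
    return (m_planck**2 / (n_species * size)) ** (1.0 / 3.0)


if __name__ == "__main__":
    n_species, size = 100, 1.0e6   # N species, box size in Planck lengths
    t_energy = energy_threshold_temperature(n_species, size)
    t_entropy = entropy_threshold_temperature(n_species, size)
    print("T R at energy threshold :", t_energy * size)
    print("T R at entropy threshold:", t_entropy * size)
```

In terms of x = M_P² R²/N, the two thresholds are TR = x^{1/4} and TR = x^{1/3}; whenever x > 1, which is the case RT > 1 considered here, the energy criterion is reached at the lower temperature.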
Now imagine raising the temperature of the radiation from some low value
for which conditions (2) and (3) are comfortably satisfied to higher and higher
values such that eventually condition (2) is saturated. Since TR > 1, (2) is
saturated before (3). We assume that the size of the box R is fixed during
this process (recall that the number of species N is also fixed), and estimate
the backreaction of the radiation energy density on the geometry of the box
to determine whether the assumption that the geometry of the box is fixed is
consistent. To obtain a simple estimate, we assume that the box is spherical,
homogeneous and isotropic. Then its expansion or contraction rate is given
by the Hubble parameter $H = \dot{R}/R$, which is determined by the 00 Einstein
equation $H^2 M_P^2 = N T^4$. However, if (2) is saturated, then $\frac{1}{R^2} M_P^2 = N T^4$,
and therefore $HR \sim 1$. The conclusion is that if (2) is saturated, then the
gravitational time scale is comparable to the light crossing time of the box,
and therefore, it is inconsistent to assume that the box has a fixed size which
is independent of the energy density inside it.
Thus we have shown that it is not possible to ignore the backreaction of
the gas on the geometry under all circumstances. Sometimes the backreaction
has to be taken into account. When the BEB is near saturation, we have
found that the basic assumptions have to be changed, so the bound has to be modified
to adapt to an intrinsically time-dependent situation.

2.3 Why Is It Reasonable to Expect Cosmological Entropy Bounds?

Some have argued incorrectly that it is impossible to discuss entropy bounds
in cosmology. They argue that the universe is the whole system and thus one
cannot apply thermodynamical arguments that sometimes rely on separating
a sub-system from a heat reservoir. This argument is false, as the following
braneworld thought experiment explicitly demonstrates [12]. Let us consider
a brane moving in a higher-dimensional BH background. From the brane point
of view it experiences a cosmological evolution and one can imagine that the
brane falls into the BH and disappears from an external observer’s view into
the BH horizon. We are thus in a situation similar to the one envisaged in the
Geroch process: the thought experiment in which a thermodynamic system
is absorbed by a BH. The aim is to design the process such that the energy
absorbed by the BH is minimal. In such a way the entropy that the BH gains
will also be minimal, as both the energy and the entropy of the BH depend
only on its mass after the absorption. We can make the entropy balance during
the process and see under which conditions the GSL is respected.
We can gain some insight by modeling a 4D radiation-dominated (RD)
universe as a brane moving in an AdS$_5$-Schwarzschild space–time. For the
BH in AdS to be the dominant configuration over an AdS space filled with
thermal radiation, as required for our analysis to be relevant, the BH must be
large and hot compared to the surrounding AdS$_5$ [14]. In this limit the closed
4D universe can be treated as flat. The motion of the brane through the bulk
space–time is viewed by a brane observer as a cosmological evolution. According
to the prescription of the RS II model [15], the 4D brane is placed at the $Z_2$
symmetric point of the orbifold. On the other hand, in the so-called mirage
cosmology [16, 17], the brane is treated as a test object following a geodesic
motion. In both cases the evolution of the brane in the AdS$_5$-Schwarzschild
bulk mimics an FRW RD cosmology. From the 5D perspective one may expect
some limits on the entropy of the brane by considering what happens when
the BH swallows the brane.

2.4 What Are Cosmological Entropy Bounds Good for?

Our interest in entropy bounds in general and cosmological entropy bounds in
particular originated from the interest in determining the fate of cosmological
singularities. Specifically, we were interested in finding whether the bounce
that is an essential part of the pre-big-bang (PBB) scenario of string cosmol-
ogy [18] can be physically realized or perhaps there is some principle that
requires the solution to be singular. We needed a general principle because
string theory could not provide an explicit enough model of the hypothetical
bounce transition. The traditional tools for finding such criteria were the en-
ergy conditions that are used in the singularity theorems. However, the use of
energy conditions is limited because there are examples of cosmologies that do
not seem to be problematic in any of their physical properties and for which
the singularity theorems are not applicable because some of the energy condi-
tions are violated. On the other hand, there are examples of cosmologies for
which we expect some problems while the singularity theorems seem perfectly
valid.
Let us consider, for example, the scale factor for a closed deSitter universe.
This is a closed universe containing a positive
cosmological constant Λ. In
D = 4 it is given by a(t) = ( Λ3 )−1/2 cosh Λ3 t, showing a bounce at t = 0.
The bounce is not allowed by the classic singularity theorems. This is not
surprising since the sources of this model violate the strong energy conditions
(SEC). The reliability of the SEC as a criterion of discriminating physical and
unphysical solutions is therefore questionable (as is well known in the context
of inflationary cosmology). Conversely, in a 4D contracting universe filled
with radiation consisting of N species in thermal equilibrium, the singularity
theorems imply that the solution will reach a future singularity. But entropy
bounds indicate expected problems already when $T \sim M_P/N^{1/2}$ as we will
show later.

3 The Causal Entropy Bound

3.1 The Hubble Entropy Bound

Motivated by the necessity to resolve the apparent singularity in the lowest or-
der classical PBB scenario, Veneziano has studied the possible role of entropy
bounds and proposed the Hubble entropy bound (HEB) [19]. The physical
motivations leading to the proposal of the HEB are (i) that in a given region
of space the entropy is maximized by the largest BH that can fit in it and (ii)
that the largest BH that can hold together without falling apart in a cosmo-
logical background has typically the size of the Hubble radius. In the following
we review the basic ideas that led to Veneziano’s proposal of the HEB.
Veneziano considered the possibility that the BEB or holographic bounds
can be applied to an arbitrary sphere of radius R, cut out of a homogeneous
cosmological space. Entropy in cosmology is extensive so it grows like $R^3$, but
the boundary’s area grows like $R^2$. Hence, at sufficiently large R, the (naive)
holography bound must be violated. On the other hand, $S_{BEB} \sim ER \sim R^4$
appears to be safer at large R.
In order to show how inadequate the naive bounds are in cosmology,
Veneziano applied them at the Planck time $t \sim t_P \sim 10^{-43}$ s, within standard
FRW cosmology, to the region of space that has become our visible universe
today. The size of that region at $t \sim t_P$ was about $10^{30}$ in units of the Planck
length $l_P$, and the entropy density was about Planckian. Thus, the actual
entropy of the patch is

$S \sim (10^{30})^3 = 10^{90}$ , (4)

while

$S_{BEB} \sim \rho R^4/\hbar \sim R^4/l_P^4 \sim 10^{120}$ , $S_{HOL} \sim R^2 l_P^{-2} \sim 10^{60}$ . (5)

The actual entropy lies at the geometric mean between the two naive bounds,
making one false and the other quite useless. The two bounds differ by a factor
$(H d_p)^2$. While such a factor is of order unity in FRW-type cosmologies, it can
be huge after a long period of inflation. For this reason the (naive) holographic
entropy bound appears to be stronger than the cosmological version of the
BEB, just the opposite of what we argued to be the case for systems of limited
gravity.
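
The orders of magnitude quoted for the Planck-time estimate can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Planck units ($l_P = 1$); the variable names are illustrative and all order-one factors are dropped, as in the text.

```python
import math

# Order-of-magnitude inputs quoted in the text (Planck units, l_P = 1).
R = 1e30               # size of the patch at t ~ t_P, in Planck lengths
entropy_density = 1.0  # "about Planckian"

S_actual = entropy_density * R**3  # extensive entropy, as in (4)
S_BEB = R**4                       # naive Bekenstein bound, as in (5)
S_HOL = R**2                       # naive holographic bound, as in (5)

# The actual entropy sits at the geometric mean of the two naive bounds.
geometric_mean = math.sqrt(S_BEB * S_HOL)

for name, value in [("S", S_actual), ("S_BEB", S_BEB), ("S_HOL", S_HOL)]:
    print(f"log10 {name} = {math.log10(value):.0f}")
```

The naive holographic bound is violated by thirty orders of magnitude, while the naive BEB overshoots by the same factor.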
A sufficiently homogeneous universe has a local time-dependent Hubble
expansion rate defined, in the synchronous gauge, by $H = \frac{1}{6} \partial_t (\log \det g_{ij})$.
If H does not vary much over distances ∼ 1/H, then the Hubble radius 1/H
corresponds to the scale of causal connection. If on top of this homogeneous
background some isolated lumps of size much smaller than 1/H exist, then
the expansion of the Universe is irrelevant and the situation should be similar
to that of nearly flat space. Veneziano argued that it is possible in this case
that a single Hubble patch contains several BHs. The BHs can coalesce and
in the process their entropy will increase. He argued further that this way of
increasing entropy has some limit since it is hard to imagine that a BH of size
larger than 1/H can form. The different parts of its horizon would be unable
to hold together. Strong arguments in this direction were given long ago in
the literature [20]. Thus, the largest entropy in a region of space larger than
$1/H$ is the one corresponding to one BH per Hubble volume $1/H^3$. Using the
Bekenstein–Hawking formula for the entropy of a BH of size $1/H$ leads to
the proposal of a “Hubble entropy bound” that the entropy is bounded by
$S_{HEB} \equiv n_H S^H$, where $n_H$ is the number of Hubble-size regions within the
volume V, each one carrying maximal entropy $S^H = l_P^{-2} H^{-2}$,

$S(V) < S_{HEB} \equiv n_H S^H = V H^3\, l_P^{-2} H^{-2} = V H l_P^{-2}$ . (6)

The HEB is partly holographic since $S^H$ scales as an area, and partly
extensive since $n_H$ scales as the volume. If the HEB is applied to a region
of size $d_p$, then the bound is the geometric mean of the BEB and the naive
holography bound,

$S_{HEB} = d_p^3 H l_P^{-2} = S_{BEB}^{1/2}\, S_{HOL}^{1/2}$ . (7)

3.2 The Causal Entropy Bound

The causal entropy bound (CEB) [21] aims to improve the HEB. It is a co-
variant bound applicable to entropy on space-like hypersurfaces. We do not
insist, a priori, on a holographic bound, but aim at generality of the hypersur-
face and then investigate how holography may or may not work. For systems
of limited gravity Bekenstein’s bound is the tightest bound, while, in other
situations, the CEB is the strongest one which does not lead to contradictions
for space-like regions.
We shall refer to entropy in a region as to a quantity proportional to
the number of DOF in that region. To be more precise, we shall exclude from
consideration entropy associated with the background gravitational field itself.
We will, however, take into account the entropy of the perturbations of the
gravitational field. Let us first state our proposal, and then motivate and test
it. Consider a generic space-like hypersurface, defined by the equation τ = 0,
and a compact region lying within it defined by σ ≤ 0. We have proposed that
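the entropy contained in this region, $S(\tau = 0, \sigma \le 0)$, is bounded by $S_{CEB}$,

$S_{CEB} = l_P^{-2} \int_{\sigma<0} d^4x \sqrt{-g}\, \delta(\tau) \sqrt{\mathrm{Max}_\pm \left[ (G_{\mu\nu} \pm R_{\mu\nu}) \partial^\mu\tau\, \partial^\nu\tau \right]}$
$\quad = l_P^{-1} \hbar^{-1/2} \int_{\sigma<0} d^4x \sqrt{-g}\, \delta(\tau) \sqrt{\mathrm{Max}_\pm \left[ (T_{\mu\nu} \pm T_{\mu\nu} \mp \frac{1}{2} g_{\mu\nu} T) \partial^\mu\tau\, \partial^\nu\tau \right]}$ . (8)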

Here $G_{\mu\nu}$ and $R_{\mu\nu}$ are the Einstein and the Ricci tensor, respectively, $T_{\mu\nu}$ is
the energy–momentum tensor and T its trace. To derive the second equality,
we have used Einstein equations, $G_{\mu\nu} = 8\pi G_N T_{\mu\nu}$. Note the appearance of the
square root of the energy contained in the region and that (8) is manifestly
covariant, and invariant under reparametrization of the hypersurface equation:
such an invariance requires a square-root of $\partial^\mu\tau\, \partial^\nu\tau$. Reality of $S_{CEB}$
is assured if sources obey the weak energy condition, $T_{\mu\nu} \partial^\mu\tau\, \partial^\nu\tau \ge 0$, since
then the sum of the two combinations in (8), and thus their maximum, are
positive. The weak energy condition is sufficient but not necessary for reality.
We expect that for physical systems reality will always be guaranteed.
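Since (8) applies to any space-like region, it can be written in a local form
rather than in an integrated form by introducing an entropy current $s_\mu$ such
that $S = \int d^4x \sqrt{-g}\, \delta(\tau)\, s_\mu \partial^\mu\tau$. Then (8) becomes equivalent to ($\lambda^\mu$ being an
arbitrary time-like vector):

$s_\mu \lambda^\mu \le l_P^{-1} \hbar^{-1/2} \sqrt{\mathrm{Max}_\pm \left[ (T_{\mu\nu} \pm T_{\mu\nu} \mp \frac{1}{2} g_{\mu\nu} T) \lambda^\mu \lambda^\nu \right]}$ . (9)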

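In the limit in which the hypersurface is light-like, $\partial^\mu\tau\, \partial_\mu\tau = 0$, (8) and
(9) read

$S_{CEB} = l_P^{-1} \hbar^{-1/2} \int_{\sigma<0} d^4x \sqrt{-g}\, \delta(\tau) \sqrt{T_{\mu\nu} \partial^\mu\tau\, \partial^\nu\tau}$ ,

$s_\mu \lambda^\mu \le l_P^{-1} \hbar^{-1/2} \sqrt{T_{\mu\nu} \lambda^\mu \lambda^\nu}$ , $\lambda_\mu \lambda^\mu = 0$ , (10)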

and become closely related to the assumptions made in [22] (1.10). We already
see signs here that the physics at short scales and high energies is important
in determining the value of the maximal entropy because Tμν is generically at
least quadratic in the fields.
The physical motivations leading us to the above proposal are similar to
those used to motivate the HEB: (i) that entropy is maximized, in a given
region of space, by the largest BH that can fit in it, and (ii) that the largest BH
that can hold together without falling apart in a cosmological background has
typically the size of the Hubble radius. The second assumption clearly needs to
be refined and, possibly, to be defined covariantly. With such a goal in mind,
we will proceed as follows: We will start by identifying a critical (“Jeans”)
length scale above which perturbations are causally disconnected so that BH
of larger size, very likely, cannot form. We will first find this causal connection
(CC) scale RCC for the simplest cosmological backgrounds, then extend it to
more general cases and, finally, guess the completely general expression using
general covariance.
In order to identify the CC scale for a homogeneous, isotropic and spa-
tially flat background, let us consider a generic perturbation around such a
background in the hamiltonian approach developed in [23]. The Fourier components
of the (normalized) perturbation and of its (normalized) conjugate
momentum satisfy Schroedinger-like equations

$\Psi_k'' + \left[ k^2 - (S^{1/2})''\, S^{-1/2} \right] \Psi_k = 0$ ,
$\Pi_k'' + \left[ k^2 - (S^{-1/2})''\, S^{1/2} \right] \Pi_k = 0$ ,

where k is the comoving momentum, a prime
denotes differentiation w.r.t. conformal time η, and $S^{1/2}$ is the so-called pump
field, a combination of the various backgrounds which depends on the specific
perturbation under study. The perturbation equations clearly identify a
“Jeans-like” CC comoving momentum

$k_{CC}^2 = \mathrm{Max} \left[ (S^{1/2})''\, S^{-1/2} ,\ (S^{-1/2})''\, S^{1/2} \right]$
$\quad = \mathrm{Max} \left[ \mathcal{K}' + \mathcal{K}^2 ,\ -\mathcal{K}' + \mathcal{K}^2 \right]$ , (11)

where $\mathcal{K} = (S^{1/2})'\, S^{-1/2}$. Equation (11) always defines a real $k_{CC}$ since the
sum of the two quantities appearing on the r.h.s. is positive semi-definite.
Since tensor perturbations are always present, let us restrict our attention
to them. The “pump field” $S^{1/2}$ is simply given, in this case, by the scale
factor a(η) so that $\mathcal{K} \to \mathcal{H} = a'/a$. Equation (11) is immediately converted
into the definition of a proper “Jeans” CC length $R_{CC} = a\, k_{CC}^{-1}$. Substituting
into (11), and expressing the result in terms of proper-time quantities,
we obtain (for tensor perturbations) $R_{CC}^{-2} = \mathrm{Max} \left[ \dot H + 2H^2 ,\ -\dot H \right]$. Before
trying to recast this equation in a more covariant form let us remove
the assumption of spatial flatness by introducing the usual spatial-curvature
parameter κ (κ = 0, ±1). The study of perturbations in non-flat space is
considerably more complicated than in a spatially flat background. The final
result, however, appears to be extremely simple [24, 25], and can be
obtained from the flat case by the following replacements in (11):
$\mathcal{H}^2 \to \mathcal{H}^2 + \kappa$, $\mathcal{H}' \to \mathcal{H}'$. Using this simple rule we arrive at the following
generalization

$R_{CC}^{-2} = \mathrm{Max} \left[ \dot H + 2H^2 + \kappa/a^2 ,\ -\dot H + \kappa/a^2 \right]$ . (12)

At this point we could have introduced anisotropy in our homogeneous
background and studied perturbations with or without spatial curvature. Instead,
we adopt a shortcut route. We observe that the 00 components of
the Ricci and Einstein tensors for our background are given by
$R_{00} = -3(\dot H + H^2)$ and $G_{00} = 3(H^2 + \kappa/a^2)$. Obviously,

$R_{CC}^{-2} = \mathrm{Max}_\mp \left[ \frac{1}{3} (G_{00} \mp R_{00}) \right] = 4\pi G_N\, \mathrm{Max} \left[ \frac{\rho}{3} - p ,\ \rho + p \right]$ , (13)

where we have inserted Einstein equations using as an example a perfect-fluid
energy momentum tensor $T^\mu{}_\nu = \mathrm{diag}(\rho, -p, -p, -p)$. Equation (13) is
guaranteed to define a real $R_{CC}$ if the weak energy condition (reading here
ρ > 0) holds, since the sum of the two combinations is positive in this case.
In general, other perturbations may compete with tensor perturbations and
define a smaller RCC . In this case, the symbol Max in the above equations
also applies to the various types of perturbations. This may help to ensure
reality of RCC in all physical situations.
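
As an illustration, the equivalence of the geometric form (12) and the source form (13) can be checked numerically. This is a minimal sketch, assuming the standard 4D Friedmann equations for a perfect fluid on the expanding branch; the function names and the choice $G_N = 1$ are illustrative only.

```python
import math

def rcc_inv2_geometry(H, Hdot, kappa, a):
    """R_CC^{-2} from eq. (12): Hubble rate, its time derivative, curvature."""
    return max(Hdot + 2 * H**2 + kappa / a**2, -Hdot + kappa / a**2)

def rcc_inv2_sources(rho, p, G_N=1.0):
    """R_CC^{-2} from eq. (13): perfect-fluid energy density and pressure."""
    return 4 * math.pi * G_N * max(rho / 3 - p, rho + p)

def friedmann_background(rho, p, kappa, a, G_N=1.0):
    """(H, Hdot) from the standard 4D Friedmann equations (assumed here)."""
    H_squared = 8 * math.pi * G_N * rho / 3 - kappa / a**2
    Hdot = -4 * math.pi * G_N * (rho + p) + kappa / a**2
    return math.sqrt(H_squared), Hdot

# Radiation, dust, stiff fluid and vacuum energy, for all three curvatures.
for rho, p in [(1.0, 1.0 / 3), (1.0, 0.0), (1.0, 1.0), (1.0, -1.0)]:
    for kappa in (-1, 0, 1):
        H, Hdot = friedmann_background(rho, p, kappa, a=1.0)
        print(rho, p, kappa,
              rcc_inv2_geometry(H, Hdot, kappa, 1.0),
              rcc_inv2_sources(rho, p))
```

For each equation of state the two expressions agree, and the result does not depend on κ once ρ and p are fixed.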
As a final step, let us convert (13) into an explicitly covariant bound on
entropy. Using $R_{CC}$ as the maximal scale for BHs, we get a bound on entropy
which scales like $S \sim V R_{CC}^{-3}\, R_{CC}^{2}\, l_P^{-2} = V R_{CC}^{-1} l_P^{-2}$. We now express $R_{CC}^{-1}$
as in (13) in terms of the components of the Ricci and Einstein tensors in
the direction orthogonal to the hypersurface on which the entropy is being
computed. This can be done covariantly by defining the hypersurface through
the equation τ = 0 and by identifying the normal with the vector $\nabla_\mu \tau$. This
procedure leads immediately to the proposal (8). The local form (9) clearly
follows by shrinking the space-like region to a point. Alternatively, using standard
3 + 1 ADM formalism [26], we can express the relevant components of
the Ricci and Einstein tensors in terms of the intrinsic and extrinsic curvature
of the hypersurface under study and arrive at the following final formula:

$S_{CEB} = l_P^{-2} \int d^3x \sqrt{h}\, \left[ \mathrm{Max}(P, Q) \right]^{1/2}$ , (14)

where $P = \frac{1}{2} R + \dot\theta + \frac{2}{3} \theta^2 + \sigma^2 - A$ and $Q = \frac{1}{2} R - \dot\theta - 3\sigma^2 + A$. Using
standard notations, we have denoted by R the intrinsic 3-curvature scalar, by
θ the expansion rate, by σ the shear, and by A the “acceleration” given (for
vanishing shifts $N_i$) in terms of the lapse function N by $A = N^{-1} N^{,i}{}_{;i}$.

3.3 The CEB in D Dimensions

In order to generalize the CEB to arbitrary dimension D [27], we generalize
the causal-connection scale $R_{CC}$ by looking at perturbation equations in D
dimensions. For gravitons, in the case of flat universe, one finds [28]

$R_{CC}^{-2} = \frac{D-2}{2}\, \mathrm{Max} \left[ \dot H + \frac{D}{2} H^2 ,\ -\dot H + \frac{D-4}{2} H^2 \right]$ . (15)

If $H^2 \gg |\dot H|$, $R_{CC} \propto H^{-1}$ and one recovers HEB with a D-dependent prefactor
scaling as D(D − 2). The above result generalizes to the case of a spatially
curved universe as we have explained previously,

$R_{CC}^{-2} = \frac{D-2}{2}\, \mathrm{Max} \left[ \dot H + \frac{D}{2} H^2 + \frac{D-2}{2} \frac{\kappa}{a^2} ,\ -\dot H + \frac{D-4}{2} H^2 + \frac{D-2}{2} \frac{\kappa}{a^2} \right]$ . (16)

A covariant definition of $R_{CC}$ is obtained by expressing (16) in terms of the
00 components of curvature tensors. We find

$R_{CC}^{-2} = \frac{D-2}{2(D-1)}\, \mathrm{Max} \left[ G_{00} \mp R_{00} \right] = 4\pi G_N\, \mathrm{Max} \left[ \frac{1}{D-1} \rho - p ,\ \frac{2D-5}{D-1} \rho + p \right]$ , (17)

where, to derive the second equality, we have used Einstein’s equations,
$G_{\mu\nu} = 8\pi G_N T_{\mu\nu}$, and a perfect-fluid form for the energy–momentum tensor.
The Bekenstein–Hawking entropy of a Schwarzschild BH of radius $R_{BH}$ in
D dimensions is given by $S = A/(4 l_P^{D-2})$. The generalization of $S_{CEB}$ for a
region of proper volume V is therefore

$S_{CEB} = \beta\, n_H S^{BH} = \beta\, \frac{V}{V(R_{CC})}\, \frac{A}{4 l_P^{D-2}}$ , (18)

where $n_H \equiv V/V(R_{CC})$ is the number of causally connected regions in the volume
considered, V(x) denotes the volume of a region of size x and β is a fudge factor
reflecting current uncertainty on the actual limiting size for BH stability. For
a spherical volume in flat space we have $V(x) = \Omega_{D-2}\, x^{D-1}/(D-1)$, with
$\Omega_{D-2} = 2\pi^{(D-1)/2} / \Gamma\left( \frac{D-1}{2} \right)$. But in general the result is different and depends
on the spatial-curvature radius.
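Following [21], the expression for $S_{CEB}$ in D dimensions can be rewritten
in the explicitly covariant form

$S_{CEB} = \frac{B}{l_P^{D-2}} \int_{\sigma<0} d^Dx \sqrt{-g}\, \delta(\tau) \sqrt{\mathrm{Max}_\pm \left[ (G_{\mu\nu} \pm R_{\mu\nu}) \partial^\mu\tau\, \partial^\nu\tau \right]}$
$\quad = \frac{B (8\pi)^{1/2}}{l_P^{D/2-1}} \int_{\sigma<0} d^Dx \sqrt{-g}\, \delta(\tau) \sqrt{\mathrm{Max}_\pm \left[ (T_{\mu\nu} \pm T_{\mu\nu} \mp \frac{1}{2} g_{\mu\nu} T) \partial^\mu\tau\, \partial^\nu\tau \right]}$ , (19)

where σ < 0 defines the spatial region inside the τ = 0 hypersurface whose
entropy we are discussing, and T is the trace of the energy–momentum
tensor.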
The prefactor B can be fixed by comparing (18) and (19). Let us consider
the expression (18) in the limit $R_{CC} \ll a$, where a is the radius of
the universe. In this case, over a region of size $R_{CC}$ we may neglect spatial
curvature and write $V(R_{CC}) = \Omega_{D-2} R_{CC}^{D-1}/(D-1)$, and the area of the
BH horizon as $A = \Omega_{D-2} R_{BH}^{D-2}$, thus giving (apart from negligible terms of
order $(R_{CC}/a)^2$)

$S_{CEB} = \beta\, \frac{D-1}{4}\, V R_{CC}^{-1}\, l_P^{-(D-2)} = B \sqrt{\frac{2(D-1)}{D-2}}\, V R_{CC}^{-1}\, l_P^{-(D-2)}$ . (20)

This fixes $B = \sqrt{\frac{(D-1)(D-2)}{32}}\, \beta$.
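
The matching of the two normalizations can be verified numerically. The sketch below evaluates the counting form (18) in the flat-space limit with $R_{BH} = R_{CC}$ and compares it with the right-hand side of (20) using $B = \sqrt{(D-1)(D-2)/32}\, \beta$. The function names and test values are illustrative assumptions.

```python
import math

def unit_sphere_area(D):
    """Omega_{D-2} = 2 pi^{(D-1)/2} / Gamma((D-1)/2)."""
    return 2 * math.pi ** ((D - 1) / 2) / math.gamma((D - 1) / 2)

def s_ceb_counting(V, R_cc, D, beta=1.0, l_P=1.0):
    """Eq. (18) for R_CC << a: beta * (V / V(R_CC)) * A / (4 l_P^{D-2})."""
    omega = unit_sphere_area(D)
    volume_cc = omega * R_cc ** (D - 1) / (D - 1)
    horizon_area = omega * R_cc ** (D - 2)  # BH of radius R_BH = R_CC
    return beta * (V / volume_cc) * horizon_area / (4 * l_P ** (D - 2))

def s_ceb_covariant(V, R_cc, D, beta=1.0, l_P=1.0):
    """Right-hand side of eq. (20) with B = beta * sqrt((D-1)(D-2)/32)."""
    B = beta * math.sqrt((D - 1) * (D - 2) / 32)
    return B * math.sqrt(2 * (D - 1) / (D - 2)) * V / (R_cc * l_P ** (D - 2))

for D in range(4, 12):
    print(D, s_ceb_counting(5.0, 0.3, D), s_ceb_covariant(5.0, 0.3, D))
```

The two columns coincide for every D, which confirms that the quoted value of B reproduces the prefactor β(D − 1)/4.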
Since (19) applies to any space-like region, it can be rewritten in a local
form as in the 4D case by introducing an entropy current $s_\mu$ such that
$S = \int d^Dx \sqrt{-g}\, \delta(\tau)\, s_\mu \partial^\mu\tau$. Then (19) becomes equivalent to (with $\lambda^\mu$ an arbitrary
time-like vector)

$s_\mu \lambda^\mu \le l_P^{-D/2+1}\, (8\pi)^{1/2} B \sqrt{\mathrm{Max}_\pm \left[ (T_{\mu\nu} \pm T_{\mu\nu} \mp \frac{1}{2} g_{\mu\nu} T) \lambda^\mu \lambda^\nu \right]}$ . (21)

In the limit of a light-like vector λ we get one of the conditions proposed
by Flanagan et al. [22] in order to recover Bousso’s proposal. Their bound
corresponds (in D = 4) to $B = \frac{1}{4\pi}$ and could be used to fix β (assuming that
it is D-independent).
For systems of limited gravity the BEB is tighter than the CEB, $S_{BEB} < S_{CEB}$.
Therefore, in all systems for which the BEB is obeyed, the CEB will
be obeyed as well. Hence, our bound is most interesting for systems of strong
gravity, and in particular in cosmology.
For general collapsing regions we have limited computational power. While
the local form (9) looks most appropriate for the study of collapsing regions,
most likely the analysis of the general case will need the use of numerical
methods. We can qualitatively check cases that are similar to the cosmological
ones [29], such as homogeneous, isotropic contracting pressureless regions, or
a contracting homogeneous, isotropic region filled with a perfect fluid. The
pressureless case can be described by a Friedman interior and a Schwarzschild
exterior. Since CEB is valid for the analogue cosmological solution, it is also
valid for this case.
A particularly interesting case is that of the (generically non-homogeneous)
collapse of a stiff fluid (p = ρ) which can be mapped by a simple field redef-
inition onto the dilaton-driven inflation of string cosmology [18]. In this case
one finds a constant SCEB in agreement with the HEB result [19]. Hence, no
problem arises in this case, even if one starts from a saturated SCEB at the
onset of collapse. For non-stiff equations of state, the situation appears less
safe if one starts near saturation. However, care must be taken in this case
of perturbations which tend to grow non-linear and form singularities on
rather short time scales. Such cases cannot be described analytically but have
been looked at numerically.

3.4 The CEB in Cosmology

The universe is a system of strong self-gravity. The geometry of the universe
is determined by self-gravity, and the size of the universe is at least
its gravitational radius. The strongest challenges to entropy bounds in gen-
eral, and to the CEB in particular, come from considering (re)collapsing
universes.
In homogeneous and isotropic D-dimensional cosmological backgrounds
we have found the dependence of $R_{CC}$ on the Hubble parameter H(t), its
time derivative $\dot H(t)$ and the scale factor a(t) in (16) and (17),

$R_{CC}^{-2} = \frac{D-2}{2}\, \mathrm{Max} \left[ \dot H + \frac{D}{2} H^2 + \frac{D-2}{2} \frac{\kappa}{a^2} ,\ -\dot H + \frac{D-4}{2} H^2 + \frac{D-2}{2} \frac{\kappa}{a^2} \right]$
$\quad = \frac{4\pi G_N}{D-1}\, \mathrm{Max} \left[ \rho - (D-1)p ,\ (2D-5)\rho + (D-1)p \right]$ , (22)

where κ = 0, ±1 determines the spatial curvature. Notice that $R_{CC}$ is well
defined if ρ is positive because the maximum in (22) is at least as large as the average
of the two entries in the brackets, and the average is equal to (D − 2)ρ.
The following four cases exhaust all possible types of cosmologies [21, 30]:
1. $|\dot H| \sim H^2 \sim |\kappa|/a^2$, or $|\dot H| \sim H^2 \gg |\kappa|/a^2$. In this case effective energy
density and pressure are of the same order, ρ ∼ p. All length scales that
may be considered in entropy bounds, such as particle horizon, apparent
horizon, $R_{CC}$ and the Hubble radius, are parametrically equal. This case
includes non-inflationary FRW universes with matter and radiation.
2. $H^2 \gg |\kappa|/a^2, |\dot H|$. In this case $|\rho + p| \ll \rho$, and the universe is inflationary.
In this case $R_{CC}$ is parametrically equal to $|H|^{-1}$.
3. $|\dot H| \gg H^2, |\kappa|/a^2$. In this case $|\rho| \ll p$. Since ρ and p are the effective
energy density and pressure, there are no problems with causality. This
case occurs, for instance, near the turning point of an expanding universe
which recollapses, or near a bounce of a contracting universe which
re-expands.
4. $\kappa/a^2 \gg |\dot H|, H^2$. In this case the spatial curvature determines the causal
connection scale. This occurs, for example, when both H and $\dot H$ vanish
as in a closed Einstein universe.
We will first describe several cosmological models and explain how they
satisfy the CEB. Then we will present in a general form the conditions on
sources that guarantee the validity of the CEB.

A radiation-dominated Universe

Our first example is a radiation-dominated universe in D dimensions. In this
case ρ = (D − 1)p and the 00 equation for the scale factor is

$H^2 + \frac{\kappa}{a^2} = \frac{16\pi G_N}{(D-1)(D-2)}\, \rho = \frac{16\pi G_N}{(D-1)(D-2)}\, \rho_0 R_0^D\, a^{-D}$ , κ = 0, ±1 . (23)

In terms of the conveniently rescaled conformal time η, defined by a(η)dη =
(D − 2)dt, the solutions can be put in the simple form

$a(\eta) = A^{\frac{1}{D-2}} \begin{cases} [\sin(\eta/2)]^{\alpha} & \kappa = 1 \\ (\eta/2)^{\alpha} & \kappa = 0 \\ [\sinh(\eta/2)]^{\alpha} & \kappa = -1 \end{cases}$ , $A = \frac{16\pi G_N \rho_0 R_0^D}{(D-1)(D-2)}$ , $\alpha = \frac{2}{D-2}$ . (24)

As can be seen from (24) the qualitative behavior of the solutions does not
depend strongly on D. In a (closed, open or flat) RD universe one always has
$R_{00} = G_{00}$; therefore, $R_{CC}^{-2} = \frac{D-2}{2} \left[ -\dot H + \frac{D-4}{2} H^2 + \frac{D-2}{2} \frac{\kappa}{a^2} \right]$. The behavior
of $S_{CEB}$ is easily derived from the explicit solution for the scale factor and
$R_{CC}$. In the case D = 4 it is shown in Fig. 1.
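
One way to check that the solutions (24) satisfy the constraint (23) is to differentiate the scale factor numerically. The following is a minimal sketch under the stated definition a(η)dη = (D − 2)dt, so that H = (D − 2)a′/a², with the right-hand side of (23) written as $A\, a^{-D}$. The function names, the finite-difference step and the sample values of η are illustrative choices.

```python
import math

def scale_factor(eta, kappa, D, A=1.0):
    """Scale factor of eq. (24) for spatial curvature kappa in {1, 0, -1}."""
    alpha = 2.0 / (D - 2)
    if kappa == 1:
        base = math.sin(eta / 2)
    elif kappa == 0:
        base = eta / 2
    else:
        base = math.sinh(eta / 2)
    return A ** (1.0 / (D - 2)) * base ** alpha

def friedmann_residual(eta, kappa, D, A=1.0, h=1e-5):
    """Relative mismatch between the two sides of eq. (23) at conformal time eta."""
    a = scale_factor(eta, kappa, D, A)
    # Central finite difference for da/d(eta).
    da_deta = (scale_factor(eta + h, kappa, D, A)
               - scale_factor(eta - h, kappa, D, A)) / (2 * h)
    hubble = (D - 2) * da_deta / a**2  # since a d(eta) = (D - 2) dt
    lhs = hubble**2 + kappa / a**2
    rhs = A * a ** (-D)
    return (lhs - rhs) / rhs

for D in (4, 5, 10):
    for kappa in (1, 0, -1):
        print(D, kappa, friedmann_residual(1.0, kappa, D))
```

The residuals are at the level of the finite-difference error for all three curvatures and for each D tried.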
A related case is when matter can be modeled by a conformal field theory
(CFT). Kutasov and Larsen [31] pointed out that for weakly coupled CFTs
in a sphere of radius R, the free energy F, the entropy S and the total energy
E can be expanded at weak coupling and large x ≡ 2πRT,

$-FR = f(x) = \sum_{n \ge 0} a_{D-2n}\, x^{D-2n} + \ldots$ , (25)
$S = 2\pi f'(x)$ , (26)
$ER = (x \partial_x - 1) f(x)$ , (27)

where the dots represent non-perturbative contributions.

Fig. 1. $S_{CEB}$ compared with $S_H = (D-2) \frac{HV}{4 G_N}$ and $S_B \equiv 2\pi RE/(D-1)$ in the
expanding phase of a closed D = 4, RD Universe. Here we set $\beta = \frac{D-2}{D-1}$

We can explicitly check under which conditions the entropy of weakly
coupled CFTs obeys the CEB, $S < S_{CEB} = 4B \sqrt{\pi} \sqrt{EV}\, l_P^{-(D-2)/2}$. In the
limit $TR \gg 1$ we find

$\frac{S^2}{S_{CEB}^2} = \frac{\pi a_D D^2}{4B^2 (D-1)\, \Omega_{D-1}}\, (2\pi l_P T)^{D-2}$ . (28)

Thus, CEB is obeyed provided that

$\left( \frac{T}{M_P} \right)^{D-2} < \frac{K(D)}{a_D}$ , (29)

where K(D) is a D-dependent (but CFT-independent) constant. We conclude
that CEB is obeyed as long as temperatures are below $M_P$ by a factor $a_D^{-\frac{1}{D-2}}$.
Since $a_D$ is proportional to the number N of CFT-matter species, we obtain
a bound on temperature which scales as $N^{-\frac{1}{D-2}}$ in Planck units.

We can also explicitly check under which conditions strongly coupled CFTs
possessing AdS duals as considered by Verlinde [11] obey the CEB. For such
CFTs,

$S = \frac{c}{12} \frac{V}{L^{D-1}}$ , (30)

$E = \frac{c}{12} \frac{D-1}{4\pi L} \left( 1 + \frac{L^2}{R^2} \right) \frac{V}{L^{D-1}}$ , (31)

$T = \frac{1}{4\pi L} \left[ D + (D-2) \frac{L^2}{R^2} \right]$ , (32)

where c is the central charge of the CFT and L ∼ 1/T is the AdS radius.
In this case, in the limit $R/L \sim TR \gg 1$ we find

$\frac{S^2}{S_{CEB}^2} = \frac{1}{4(D-1) B^2} \frac{c}{12} \left( \frac{l_P}{L} \right)^{D-2}$ (33)

and thus CEB is obeyed for

$\frac{1}{4(D-1) B^2} \frac{c}{12} \left( \frac{4\pi T}{D M_P} \right)^{D-2} < 1$ . (34)

Since the central charge c is proportional to the number of CFT fields N,
we obtain a bound on temperature which, in Planck units, scales as $N^{-\frac{1}{D-2}}$,
exactly as previously obtained for the weakly coupled case.

For the case $ER \sim a_D$ (which corresponds to RT ∼ 1) the validity of the
CEB is guaranteed by a condition similar to (29).

Finally, we would like to show that CEB holds also when ER ∼ 1. In
this case $S_{CEB} \simeq 4B \sqrt{\pi} \sqrt{V/R}\, l_P^{-(D-2)/2}$ scales as $(R/l_P)^{\frac{D-2}{2}}$. The appropriate
setup for calculating the entropy in this case is the microcanonical ensemble
with the result $S \sim \log a_D \sim \log N$; thus, $S < S_{CEB}$ is guaranteed for a
macroscopic universe as long as

$\left( \frac{R}{l_P} \right)^{\frac{D-2}{2}} > \log N$ . (35)

In a quantum theory of gravity we expect the UV cutoff Λ to be finite and
to represent an upper bound on T (as in the example of superstring theory
and its Hagedorn temperature) and a lower bound on R (as in the minimal
compactification radius). Thus conditions (29) and (34) for the validity of
CEB are satisfied as long as $(\Lambda/M_P)^{D-2} < 1/N$. A bound of the same form
was previously proposed in [9] and [32], and independent arguments in support
of bounds of this sort have also been put forward in [13].

The Inflationary Universe

The inflationary universe is completely compatible with the CEB. To a certain
extent this is not such an interesting case, because the CEB is comfortably
satisfied.
The entropy balance begins for the inflationary universe after the end of
inflation when the energy of the background is converted to matter. This
process is historically called reheating and is associated with a large entropy
production. In the following we will assume that the reheating process is
instantaneous and complete. We will denote by the subscript RH quantities
at the instant of reheating.
Since $\dot H$ is subleading in this case, it follows from (22) that $R_{CC} \sim 1/H$.
In this case the CEB and the HEB are similar,

$S_{CEB}(t_{RH}) = l_P^{-2} H(t_{RH})\, a^3(t_{RH})$ . (36)

Assuming that the energy has been completely converted into radiation, the
energy density of the radiation is $\rho(t_{RH}) = T_{RH}^4$. From the 00 Einstein equation
$l_P^{-2} H^2 = \rho$, thus

$S_{CEB}(t_{RH}) = T_{RH}^4\, a^3(t_{RH})\, \frac{1}{H(t_{RH})} = \frac{T_{RH}}{H(t_{RH})}\, T_{RH}^3\, a^3(t_{RH}) = \frac{T_{RH}}{H(t_{RH})}\, S(t_{RH})$ . (37)

Here we have used the expression for the radiation entropy $S(t_{RH}) = T_{RH}^3\, a^3(t_{RH})$.
Since from the 00 Einstein equation $\frac{T_{RH}}{H(t_{RH})} \sim \sqrt{\frac{M_P}{H(t_{RH})}}$, and
since we expect that the Hubble parameter at reheat be substantially below
the Planck temperature, we conclude that the CEB is comfortably satisfied.

A Universe Near a Turning Point

Let us consider either a flat or closed universe with some perfect fluid in
thermal equilibrium and a constant equation of state p = γρ , 1 > γ > −1,
and with an additional small negative cosmological constant Λ = −λ. The
universe starts out expanding, reaches a maximal size and then contracts
toward a singularity. In this case the matter entropy within a comoving volume
is constant in time. But near the point of maximal expansion the apparent
horizon and the Hubble length diverge causing violation of the HEB. However,
for a fixed comoving volume, $S_{CEB} \sim V R_{CC}^{-1}$, and, since $R_{CC}$ is never larger
than some maximal value, the CEB has a chance of doing better.
To see this explicitly, let us consider a 4D example. In this case we obtain
from (22)

$R_{CC}^{-2} = \frac{1}{3}\, \mathrm{Max} \left[ \frac{1}{2} \rho_0 (1 - 3\gamma)\, a^{-3(1+\gamma)} - 2\lambda ,\ \frac{3}{2} (1 + \gamma) \rho_0\, a^{-3(1+\gamma)} \right]$ , (38)

independently of κ. The initial energy density is $\rho_0$ and a is the ratio of the
scale factor to its initial value. Since the maximum is at least as large as each of the
expressions in the brackets

$R_{CC}^{-2} \ge \frac{1}{2} (1 + \gamma) \rho_0\, a^{-3(1+\gamma)}$ . (39)

It follows that in a fixed comoving volume $S_{CEB}$ scales as $\sim a^3 R_{CC}^{-1} \sim a^{\frac{3}{2}(1-\gamma)}$.
Since γ < 1, this means that $S_{CEB}$ grows during the expansion,
reaches a maximum at the turning point and then starts decreasing. If the
initial conditions are fixed at sufficiently early times when curvature and cos-
mological constant are negligible, the CEB will be obeyed initially provided
energy density and curvature are less than Planckian. But then the evolution
of SCEB that we have found will guarantee that the bound is satisfied at
all times until Planckian density and curvature is reached in the recollapsing
phase. Thus the CEB will be satisfied throughout the classical evolution of
our Universe.
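
The scaling argument can be made concrete with a short numerical sketch of (38) and (39). The units (the factor 4πG_N absorbed as in (38)), the function names and the sample parameters are illustrative assumptions.

```python
import math

def rcc_inv2(a, gamma, rho0=1.0, lam=0.0):
    """Eq. (38): R_CC^{-2} for a fluid p = gamma * rho plus Lambda = -lam (4D)."""
    x = rho0 * a ** (-3 * (1 + gamma))
    return max(0.5 * (1 - 3 * gamma) * x - 2 * lam, 1.5 * (1 + gamma) * x) / 3

def rcc_inv2_lower_bound(a, gamma, rho0=1.0):
    """Eq. (39): the second entry of the maximum, independent of lam."""
    return 0.5 * (1 + gamma) * rho0 * a ** (-3 * (1 + gamma))

def s_ceb_scaling(a, gamma, rho0=1.0):
    """a^3 / R_CC evaluated on the bound (39), up to constant factors."""
    return a**3 * math.sqrt(rcc_inv2_lower_bound(a, gamma, rho0))

for gamma in (-0.5, 0.0, 1.0 / 3, 0.9):
    growth = s_ceb_scaling(2.0, gamma) / s_ceb_scaling(1.0, gamma)
    print(gamma, growth, 2 ** (1.5 * (1 - gamma)))
```

Doubling the scale factor multiplies the estimate by $2^{\frac{3}{2}(1-\gamma)}$, which exceeds one for every γ < 1, in line with the statement that $S_{CEB}$ grows during the expansion.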

A Static Universe

The simplest example of a non-singular cosmology is a static Einstein model

in D dimensions which was discussed in [30]. This model requires positive
curvature, and two types of sources: cosmological constant and dust; we denote
by ρΛ and ρm the energy densities associated with each of the two components.
To provide entropy, we need an additional source, which we choose to be
radiation consisting of N species in thermal equilibrium at temperature T .
The energy density of the radiation is given by $\rho_r = N T^D$, and the entropy
density of the radiation is given by $s_r = N T^{D-1}$ (we ignore here numerical
factors since we will be interested in scaling of quantities). The total entropy
of the system is given entirely by the entropy of the radiation Sr = sr V .
In terms of these sources, Einstein equations can be written in the following
way:

$H^2 + \frac{1}{a^2} = \frac{16\pi G_N}{(D-2)(D-1)}\, \rho_{tot} = \frac{16\pi G_N}{(D-2)(D-1)}\, (\rho_\Lambda + \rho_m + \rho_r)$ , (40)

$\dot H - \frac{1}{a^2} = -\frac{8\pi G_N}{D-2}\, (\rho_{tot} + p_{tot}) = -\frac{8\pi G_N}{(D-2)(D-1)} \left[ D \rho_r + (D-1) \rho_m \right]$ , (41)

where we have used in (41) the equations of state relating pressure to energy
density: $p_\Lambda = -\rho_\Lambda$, $p_m = 0$ and $(D-1) p_r = \rho_r$.
For given $\rho_m$ and $\rho_r$, one can choose $\rho_\Lambda$ and the scale factor a such that H
and $\dot H$ vanish in (40) and (41), and thus obtain a static solution. In particular,
the condition given by (41) determines the scale factor in terms of $\rho_m$ and $\rho_r$,

$a^2 = \frac{(D-2)(D-1)}{8\pi G_N}\, \frac{1}{D \rho_r + (D-1) \rho_m}$ . (42)
Since both H and Ḣ vanish identically, RCC is determined solely by the scale
factor a given in (42), as discussed previously.
We now wish to determine under which conditions (if any) some violations
of CEB may occur in this model. Recall that according to (20) the CEB
bounds the total entropy of a region contained in a comoving volume V by
$S_{CEB} = \alpha (D-1) \frac{V}{G_N R_{CC}}$, and that in the static case under consideration
$R_{CC} = 2a/(D-2)$. The square of the ratio of $S_{CEB}$ and the entropy of the
system $S_r$ is given by

$\left( \frac{S_{CEB}}{S_r} \right)^2 = \left[ \frac{\alpha (D-1)}{s_r R_{CC} G_N} \right]^2 = 2\pi \alpha^2 (D-1)(D-2) \left[ D + (D-1) \frac{\rho_m}{\rho_r} \right] \frac{1}{N} \left( \frac{M_P}{T} \right)^{D-2}$ . (43)

Since the second factor in expression (43) is larger than unity if $\rho_m$ and $\rho_r$
are positive, and neglecting the overall prefactor which is independent of the
sources in the model, we conclude that the CEB is valid provided that

$N \left( \frac{T}{M_P} \right)^{D-2} \le 1$ . (44)

This is the same condition discussed above which should be interpreted as a
requirement that temperatures are sub-Planckian, in the case of a large number
of species N.
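
As a rough illustration of (43) and (44), the squared ratio can be evaluated for sample values. This is a sketch only: the function name is hypothetical, α is set to one, and the numerical factors that the text ignores in $\rho_r$ and $s_r$ are ignored here as well.

```python
import math

def ceb_ratio_squared(T, N, rho_m_over_rho_r, D=4, alpha=1.0, M_P=1.0):
    """(S_CEB / S_r)^2 from eq. (43) for the static Einstein model."""
    prefactor = 2 * math.pi * alpha**2 * (D - 1) * (D - 2)
    sources = D + (D - 1) * rho_m_over_rho_r
    return prefactor * sources * (M_P / T) ** (D - 2) / N

# Sub-Planckian temperature with N (T / M_P)^(D-2) = 0.1 <= 1.
print(ceb_ratio_squared(T=0.1, N=10, rho_m_over_rho_r=0.0))
# Doubling the number of species halves the squared ratio.
print(ceb_ratio_squared(T=0.1, N=20, rho_m_over_rho_r=0.0))
```

With these assumptions the ratio stays above one whenever condition (44) holds, and it decreases as 1/N at fixed temperature.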
Our conclusion is that as long as the temperature of radiation stays well
below Planckian, CEB is upheld. The fact that the model is gravitationally
unstable to matter perturbations does not seem to be particularly relevant to
the issue of validity of the CEB.

Bekenstein’s Non-singular Universe

A time-dependent non-singular cosmological model was found years ago by
Bekenstein [33] (see also [34]). This is a 4D Friedman–Robertson–Walker universe
which is conformal to the closed Einstein Universe. It contains dust,
consisting of N particles of mass μ (N is constant and μ is positive), coupled
to a classical conformal massless scalar field ψ, and N species of radiation in
thermal equilibrium. The action for the dust-ψ system is given by

$S = -\frac{1}{2} \int \sqrt{-g} \left[ (\nabla\psi)^2 + \frac{1}{6} \psi^2 R \right] d^4x - \int (\mu + f\psi)\, d\tau$ . (45)
2 6

It includes in addition to the usual action for free point particles of rest mass
μ, a dust-scalar field interaction whose strength is determined by the coupling
f . Accordingly, we may define the effective mass of the dust particles: μeff =
μ + f ψ.
The total energy density and pressure in Bekenstein’s Universe are given by

ρtot = ρr + ρψ + ρm , ptot = pr + pψ + pm , (46)

where $\{\rho_r, p_r\}$, $\{\rho_\psi, p_\psi\}$ and $\{\rho_m, p_m\}$ are the energy densities and pressures
associated with the radiation, scalar field and dust, respectively. They depend
on the scale factor in the following way

$\rho_r = C N a^{-4} = N T^4$ ,
$\rho_\psi = \frac{1}{2} f^2 N^2 a^{-4}$ , (47)
$\rho_m = N \mu_{eff}\, a^{-3} = N \mu a^{-3} - 2\rho_\psi$ ,

and their equations of state $\gamma_r = p_r/\rho_r$, $\gamma_\psi = p_\psi/\rho_\psi$ and $\gamma_m = p_m/\rho_m$ are the
following:

$\gamma_r = 1/3$ ,
$\gamma_\psi = -1/3$ , (48)
$\gamma_m = 0$ .

The dependence of ψ on a, $\psi = -f N a^{-1}$, yields $\mu_{eff} = \mu - f^2 N a^{-1}$. C is an
integration constant and the only source of entropy is the radiation whose
entropy density is given by $s_r = N T^3$.
The solution for the scale factor a is given in terms of the conformal
time η by

$a(\eta) = a_0 (1 + B \sin\eta)$ . (49)

We assume that a0 , the mean value of the scale factor, is macroscopic, so it is
large in our Planck units. If B = 0, the solution describes a static universe very
similar to the closed Einstein Universe discussed previously. For 0 < B < 1
the solution describes a “bouncing universe”: The universe bounces off at η =
3π/2 when the scale factor is minimal a = amin = a0 (1 − B), expands until it
turns over at η = 5π/2 when its scale factor is maximal, a = amax = a0 (1+B),
and continues to oscillate without ever reaching a singularity. The equations
of motion require that the energy densities of the sources obey the following
equalities at all times [33]:

$2\, \frac{a}{a_0}\, \frac{\rho_\psi - \rho_r}{2\rho_\psi + \rho_m} = 1 - B^2 = \frac{a_{min}\, a_{max}}{a_0^2}$ . (50)

Since $2\rho_\psi + \rho_m = N \mu a^{-3} > 0$, $\rho_r > 0$ and $B^2 < 1$, it follows that a necessary
condition for a bounce is that $\rho_r < \rho_\psi$. This implies that the total pressure
$\frac{1}{3} (\rho_r - \rho_\psi)$ is always negative. Moreover, (50) for $a = a_{min}$ implies that
$\rho_m \le -2\rho_r < 0$ there. But then, the conclusion must be that in order to avoid a
singularity, μeff < 0 at least at the bounce. It is possible, however, to find a
range of initial conditions and parameters such that μeff is positive near the
turnover.
The result that $\rho_r$ and $\rho_\psi$ are manifestly positive definite, but $\rho_m$ can (and
in fact must) be negative some of the time, suggests that it might be possible
to parametrically decrease $\rho_{\rm tot}$ by lowering $\mu_{\rm eff}$ (making it large and negative)
by increasing the coupling strength $f$, so that the amounts of radiation and
entropy are kept constant. As it turns out, this is exactly the case in which the
CEB can potentially be violated. Using the Einstein equations to express $R_{\rm CC}$ in
terms of the total energy density and pressure, we find the ratio $(S_{\rm CEB}/S_r)^2$:
$$\left(\frac{S_{\rm CEB}}{S_r}\right)^2 \sim G_N^{-2} \left(\frac{\rho_r}{N}\right)^{-3/2} \frac{1}{N^2}\, G_N\, {\rm Max}\left[\frac{1}{3}\rho_{\rm tot} - p_{\rm tot},\ \rho_{\rm tot} + p_{\rm tot}\right]. \tag{51}$$
A system for which the ratio above is smaller than one would violate the CEB.
Recalling that the maximum on the r.h.s. of (51) is always larger than the
mean of the two entries and rearranging, we find
$$\left(\frac{S_{\rm CEB}}{S_r}\right)^2 \ge \frac{1}{N}\,\frac{M_P^2}{T^2}\,\frac{\rho_{\rm tot}}{\rho_r}. \tag{52}$$
Since we assume that the model is sub-Planckian, namely that the first factor
is larger than one as in (44), the only way in which CEB could be violated is
if somehow the second factor was parametrically small. As discussed above, it
does seem that the second term ρtot /ρr can be made arbitrarily small by de-
creasing ρtot while keeping ρr constant. Consequently, it is apparently possible
to make the ratio SCEB /Sr smaller than one and obtain a CEB violating cos-
mology. But this can be achieved only if the effective mass of the dust particles
is negative (and large) as can be seen from (46).

Violations of the CEB (and as a matter of fact, of any other entropy

bound) go hand in hand with large negative energy densities in the dust
sector. In the model under discussion, this manifests itself in the form of dust
particles with highly negative effective masses. Occurrence of such negative
energy density would most probably render the model unstable. We argue that
any analysis of entropy bounds should be performed for stable models. This
is particularly relevant for the CEB, whose definition involves explicitly the
largest scale at which stable BHs could be formed. However, the instability
does not necessarily lead to violations of the CEB as in the previous case.
To support this argument, we have outlined possible instabilities in the dust
scalar field system when the dust particles’ mass is negative [30].

The Pre-big-bang Scenario

Veneziano [19] was the first to study entropy bounds in the context of the
PBB scenario. It has been argued [35, 36] that a form of stochastic PBB is
a generic consequence of natural initial conditions corresponding to generic
gravitational and dilatonic waves superimposed on the perturbative vacuum of
critical superstring theory. In the Einstein-frame metric this can be seen as a
chaotic gravitational collapse leading to the formation of BHs of different sizes.
For a string frame observer inside each BH this is viewed as a PBB inflationary
cosmology. The duration of the inflationary phase is controlled by the size of
the BH [35, 36], so from this point of view the observable Universe should
be identified with the region of space that was originally inside a sufficiently
large BH.
In [19], Veneziano studied a 4D PBB model and followed the evolution of
several contributions to the entropy. At time $t = t_i$, corresponding to the first
appearance of a horizon, he used the Bekenstein–Hawking formula to evaluate
the entropy in the collapsed region, $S_{\rm coll}$. Then he used the fact [36] that
the initial size of the BH horizon determines the initial value of the Hubble
parameter and found that

$$S_{\rm coll} \sim (R_{\rm in}/l_{P,{\rm in}})^2 \sim (H_{\rm in}\, l_{P,{\rm in}})^{-2} = S_{\rm HEB}. \tag{53}$$

Thus, initially the entropy is as large as allowed by the HEB (without ﬁne-
tuning). Here it was implicitly assumed the initial string coupling is small.
After a short transient phase, dilaton-driven inﬂation (DDI) should follow
[35, 36] and last until ts , the time at which a string-scale curvature is reached.
We expect this classical process not to generate further entropy. During DDI
SHEB remains constant and the bound continues to be saturated. This follows
from the “conservation law” of string cosmology [18]
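$$\partial_t \left(e^{-\phi} \sqrt{g}\, H\right) = 0; \tag{54}$$

hence,
$$\partial_t \left[\left(\sqrt{g}\, H^3\right)\left(e^{-\phi} H^{-2}\right)\right] = \partial_t \left(n_H S^H\right) = 0. \tag{55}$$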

Veneziano suggested the following interpretation: At the beginning of the

DDI phase the whole entropy is in a single Hubble volume. As DDI pro-
ceeds, the same total amount of entropy becomes equally shared between
very many Hubble volumes until, eventually, each one of them contributes a
small number.
While the coupling is still small, SHEB cannot decrease,
$$\partial_t \left(e^{-\phi} \sqrt{g}\, H\right) \ge 0. \tag{56}$$

It follows that
$$\dot\phi - 3H \le \dot H / H. \tag{57}$$

Veneziano noticed that this constraint may be important. As $\alpha'$ corrections
intervene to stop the growth of $H$, the entropy bound forces $\dot\phi - 3H$ to decrease
and eventually to change sign if H stops growing. But this is just what is
needed to convert the DDI solution into the FRW solution [18].
If the initial conditions are such that the string coupling becomes strong
while the curvature is still small, then Veneziano argued [19] that the HEB
forces a non-singular PBB cosmology as well. This time the entropy produc-
tion by the squeezing of quantum ﬂuctuations is the important factor. This
will be discussed further when we discuss the generalized second law.

3.5 Conditions for the Validity of the CEB in Cosmology

We may summarize the lessons of the previous examples by imposing con-

ditions on sources in a generic cosmological setting such that the CEB is
obeyed.
We consider a cosmic ﬂuid consisting of radiation, an optional cosmological
constant, and additional unspeciﬁed classical dynamical sources which do not
include any contributions from the cosmological constant or radiation. For
simplicity we assume that the additional sources have negligible entropy. This
is the most conservative assumption: If some of the additional sources have
substantial entropy our conclusions can be strengthened. We use the previous
notations for the total, cosmological and radiation energy densities, ρtot , ρΛ
and ρr respectively, and denote by ρ∗ the combined energy density of the
additional sources. Thus

$$\rho_{\rm tot} = \rho_r + \rho_\Lambda + \rho^*. \tag{58}$$

We use the same notation for the corresponding pressures, and for the equation of
state $\gamma^* \equiv p^*/\rho^*$, which may be time dependent.
In terms of these sources, the causal connection scale can be written as

$$R_{\rm CC}^{-2} = \frac{4\pi G_N}{D-1}\, {\rm Max}\Big[ D\rho_\Lambda + \big(1 - (D-1)\gamma^*\big)\rho^*,\ (D-4)\rho_\Lambda + \big((2D-5) + (D-1)\gamma^*\big)\rho^* + 2(D-2)\rho_r \Big]. \tag{59}$$

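We may now express the ratio $(S_{\rm CEB}/S_r)^2$, neglecting as usual prefactors
of order 1:
$$\left(\frac{S_{\rm CEB}}{S_r}\right)^2 \sim \frac{1}{N}\left(\frac{M_P}{T}\right)^{D-2} {\rm Max}\Big[ D\,\frac{\rho_\Lambda}{\rho_r} + \big(1-(D-1)\gamma^*\big)\frac{\rho^*}{\rho_r},\ (D-4)\frac{\rho_\Lambda}{\rho_r} + \big((2D-5)+(D-1)\gamma^*\big)\frac{\rho^*}{\rho_r} + 2(D-2) \Big]. \tag{60}$$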

Any CEB violation requires that this ratio be parametrically smaller than
one. Notice that the ﬁrst factor is larger than one by our requirement that the
radiation energy density be sub-Planckian. Thus the only remaining possibility
for violating CEB is that the second factor be parametrically smaller than
unity. As we show below, this can occur only if at least one of the additional
sources has negative energy density.
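The maximum on the r.h.s. of (60) is no smaller than the average of its two entries, so that
$$\left(\frac{S_{\rm CEB}}{S_r}\right)^2 \ge \frac{1}{N}\left(\frac{M_P}{T}\right)^{D-2} (D-2)\,\frac{\rho_{\rm tot}}{\rho_r}. \tag{61}$$

Therefore, since $\rho_{\rm tot} > 0$, a necessary condition for this expression to be
smaller than unity is that $\rho_{\rm tot} \ll \rho_r$, which we may re-express as
$$\frac{\rho_\Lambda}{\rho_r} \sim -\left(1 + \frac{\rho^*}{\rho_r}\right). \tag{62}$$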

This is not a suﬃcient condition since the equations of motion could dictate,
for example, that the ﬁrst factor on the r.h.s. of (61) could be parametrically
larger than unity at the same time. By substituting condition (62) into (60),
we obtain
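$$\left(\frac{S_{\rm CEB}}{S_r}\right)^2 \sim \frac{1}{N}\left(\frac{M_P}{T}\right)^{D-2} \times {\rm Max}\left[ -\Big( (D-1)(1+\gamma^*)\frac{\rho^*}{\rho_r} + D \Big),\ (D-1)(1+\gamma^*)\frac{\rho^*}{\rho_r} + D \right]. \tag{63}$$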

Therefore, an additional necessary condition for $S_{\rm CEB}/S_r$ to be smaller than
one is that
$$(1+\gamma^*)\rho^* \sim -\frac{D}{D-1}\,\rho_r. \tag{64}$$
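The algebra leading from (60) to (64) can be verified numerically. The following sketch assumes the bracket structure of (60) as reconstructed from the definitions above (first entry Dλ + (1 − (D−1)γ*)x, second entry (D−4)λ + ((2D−5) + (D−1)γ*)x + 2(D−2), with λ = ρΛ/ρr and x = ρ*/ρr); the function names and numerical values are hypothetical.

```python
def entries_60(D, lam, x, gamma):
    """Two entries of Max[...] in (60); lam = rho_Lambda/rho_r, x = rho_*/rho_r."""
    first = D * lam + (1 - (D - 1) * gamma) * x
    second = (D - 4) * lam + ((2 * D - 5) + (D - 1) * gamma) * x + 2 * (D - 2)
    return first, second

def y_63(D, x, gamma):
    """Combination (D-1)(1+gamma*) x + D that appears in both entries of (63)."""
    return (D - 1) * (1 + gamma) * x + D

# Hypothetical illustrative values.
D, x, gamma = 4, -0.7, 0.2

# (61): the mean of the two entries equals (D-2) rho_tot/rho_r for any lam.
lam_generic = 0.3
e1, e2 = entries_60(D, lam_generic, x, gamma)
mean_generic = 0.5 * (e1 + e2)
rho_tot_over_rho_r = 1 + lam_generic + x

# (62) taken as an equality, rho_tot = 0: the entries become -Y and +Y.
lam_62 = -(1 + x)
f1, f2 = entries_60(D, lam_62, x, gamma)
Y = y_63(D, x, gamma)

# (64) taken as an equality: the value of x that makes Y vanish.
x_64 = -D / ((D - 1) * (1 + gamma))

print(mean_generic, (D - 2) * rho_tot_over_rho_r)
print(f1, f2, Y, y_63(D, x_64, gamma))
```

Since Max[−Y, Y] = |Y|, the ratio in (63) can be small only if Y is close to zero, which is the content of (64).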

Condition (64) can be satisﬁed in two ways:

(i) $1 + \gamma^* > 0$ and $\rho^* < 0$. This obviously requires that at least one of
the sources has negative energy density. In this case (barring pathologies) the
magnitude of $\rho^*$ is comparable to that of $\rho_r$.
(ii) $1 + \gamma^* < 0$ and $\rho^* > 0$. However, for classical dynamical sources, this
typically clashes with causality, which requires that the pressure and energy
density of each of the additional dynamical sources obey $|p_i| < |\rho_i|$; hence, if
all $\rho_i > 0$, then necessarily $\gamma^* = \left(\sum_i p_i\right) / \left(\sum_i \rho_i\right) > -1$.
Consequently, condition (64) cannot be satisfied if all of the dynamical
sources have positive energy densities and equations of state |γi | ≤ 1. Beken-
stein’s Universe discussed previously fits well within our framework: The total
energy density is positive, but the overall contribution to ρtot of all the sources,
excluding radiation (since the cosmological constant vanishes in this case), is
negative and almost cancels the contribution of radiation, leaving a small
positive ρtot .
To summarize, if all dynamical sources (different from the cosmological
constant) have positive energy densities ρi > 0 and have causal equations
of state (|γi | ≤ 1), and if radiation temperatures are sub-Planckian, CEB is
upheld.

3.6 The CEB and the Singularity Theorems

The CEB (and entropy bounds in general) refines the classic singularity
theorems. It is satisfied by cosmologies for which the singularity theorems
are not applicable because some of the energy conditions are violated, but
which do not seem to be problematic in any of their properties. Conversely, it
indicates possible problems when the singularity theorems seem perfectly
valid.
In general, the total energy–momentum tensor of a closed “bouncing”
universe violates the SEC, but it can obey the CEB. In order to see this
explicitly, let us consider the “bounce” condition, i.e. H = 0, Ḣ > 0 for a
closed Universe; by using the Einstein equations (40 and 41), we can express
this condition in terms of the sources as follows:

$$\rho_{\rm tot} > 0, \qquad (D-3)\rho_{\rm tot} + (D-1)p_{\rm tot} < 0. \tag{65}$$

The second of these conditions is (in D = 4) precisely the condition for viola-
tion of the SEC. In terms of ρr , ρΛ and ρ∗ this reads

$$2\rho_\Lambda - (D-2)\rho_r - \big[(D-3) + (D-1)\gamma^*\big]\rho^* > 0. \tag{66}$$

In comparison, a necessary condition that the CEB is violated can be obtained

from (62) and (64),

$$2\rho_\Lambda - (D-2)\rho_r - \big[(D-3) + (D-1)\gamma^*\big]\rho^* \sim 0, \tag{67}$$

where the l.h.s. of (67) can be either positive or negative. So we find that there
is a range of parameters for which the CEB can be obeyed in some bouncing
cosmologies but not in others.
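The relation between the bounce condition (65), its source form (66) and the CEB condition (67) can be illustrated with a short numerical check. The sketch assumes that radiation has p = ρ/(D−1), that the cosmological constant has p = −ρ, and that the bracket in (66) groups as [(D−3) + (D−1)γ*]ρ*; the numbers are hypothetical.

```python
def sec_combination(D, rho_r, rho_lam, rho_star, gamma):
    """(D-3) rho_tot + (D-1) p_tot for radiation + Lambda + additional sources."""
    rho_tot = rho_r + rho_lam + rho_star
    p_tot = rho_r / (D - 1) - rho_lam + gamma * rho_star
    return (D - 3) * rho_tot + (D - 1) * p_tot

def lhs_66(D, rho_r, rho_lam, rho_star, gamma):
    """Left-hand side shared by (66) and (67)."""
    return 2 * rho_lam - (D - 2) * rho_r - ((D - 3) + (D - 1) * gamma) * rho_star

# Hypothetical generic sources: (66) is minus the combination in (65).
D, rho_r, rho_lam, rho_star, gamma = 4, 1.0, 0.5, -0.8, 0.3
generic_66 = lhs_66(D, rho_r, rho_lam, rho_star, gamma)
generic_65 = sec_combination(D, rho_r, rho_lam, rho_star, gamma)

# Sources tuned to satisfy (64) and (62) as equalities.
rho_star_64 = -D / ((D - 1) * (1 + gamma)) * rho_r
rho_lam_62 = -(rho_r + rho_star_64)
tuned_67 = lhs_66(D, rho_r, rho_lam_62, rho_star_64, gamma)

print(generic_66, -generic_65, tuned_67)
```

The tuned value vanishes, so a CEB-violating configuration sits on the boundary of the bounce condition (66) rather than safely inside or outside it.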
In a spatially flat universe (κ = 0), the conditions for a bounce are slightly
different: ρtot = 0 and ρtot + ptot < 0. At the bounce these conditions imply
violation of the null energy condition (NEC). As discussed previously, classical
sources are not expected to violate the NEC, but effective quantum sources
(such as Hawking radiation) are known to violate the NEC. In terms of ρr ,
ρΛ and ρ∗ the condition for a bounce reads

$$\left(1 + \frac{1}{D-1}\right)\rho_r + (1+\gamma^*)\rho^* < 0. \tag{68}$$
In comparison, a necessary condition that the CEB is violated can be obtained
from (64),
$$\left(1 + \frac{1}{D-1}\right)\rho_r + (1+\gamma^*)\rho^* \sim 0, \tag{69}$$
where the l.h.s. of (69) can be either positive or negative. So, again, we find
that there is a range of parameters for which the CEB can be obeyed in some
spatially flat bouncing cosmologies but not in others.
The CEB appears to be a more reliable criterion than energy conditions
when trying to decide whether a certain cosmology is reasonable: Taking
again the closed deSitter Universe as an example, we can add a small amount
of radiation to it, and still have a bouncing model if ρΛ is the dominant
source, and SEC will not be obeyed (66). Nevertheless, the general discus-
sion in this section shows that in this case the CEB is not violated as long
as radiation temperatures remain sub-Planckian, despite the presence of a
bounce. This happens, in part, because the CEB is able to discriminate bet-
ter between dynamical and non-dynamical sources (such as the cosmological
constant), and imposes constraints that involve the former ones only, such
as (64).
We have reached the following conclusions by studying the validity of the
CEB for non-singular cosmologies:
1. Violation of the CEB necessarily requires either high temperatures,
$N (T/M_P)^{D-2} \ge 1$, or dynamical sources that have negative energy
densities with a large magnitude, or sources with an acausal equation of state.
Of course, none of the above is sufficient to guarantee violations of
the CEB.
2. Classical sources of this type are suspected of being unphysical or unstable,
but each source has to be checked on a case-by-case basis. In the examples
that we have discussed the sources were indeed found to be unstable or
are strongly suspected to be so.
3. Sources with large negative energy density could, in principle, allow the
entropy within a given volume to be increased while keeping its boundary
area and the total energy constant. This would lead to violation of
all known entropy bounds, and of any entropy bound which depends
in a continuous way on the total energy or on the linear size of the
system.
4. The CEB is more discriminating than singularity theorems. In the ex-
amples we have considered it allows non-singular cosmologies for which
singularity theorems cannot be applied, but does not allow them if they
are associated with speciﬁc dynamical problems.

3.7 Comparison of the CEB to Other Entropy Bounds

Finally, we compare our CEB to other bounds, in particular to Bekenstein's
and Bousso's. For systems of limited gravity whose size exceeds
their Schwarzschild radius, $R > R_g$, Bekenstein's bound is given by
$S < S_{\rm BEB} = l_P^{-2} R R_g$, and Bousso's procedure results in the holography bound,
$S < S_{\rm HOL} = l_P^{-2} R^2$; since $R > R_g$, $S_{\rm BEB} < S_{\rm HOL}$, and therefore Bousso's
bound is less stringent than Bekenstein's. Consider now the CEB applied to
the region of size $R$ containing an isolated system. Expressing CEB in the form
(8) one immediately obtains $S_{\rm CEB} = l_P^{-1} R^{3/2} E^{1/2} \hbar^{-1/2} = (S_{\rm HOL} S_{\rm BEB})^{1/2}$,
implying $S_{\rm BEB} \le S_{\rm CEB} \le S_{\rm HOL}$. We conclude that for isolated systems of
limited self-gravity the Bekenstein bound is the tightest, followed by our CEB
and, finally, by Bousso’s holographic bound. Similar scaling properties for the
HEB were discussed in [19].
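The geometric-mean relation between the three bounds is easy to tabulate. The sketch below drops all factors of order unity and assumes c = 1, G_N = l_P²/ħ and R_g ≈ G_N E; the function name and sample values are hypothetical.

```python
import math

def entropy_bounds(R, E, l_P=1.0, hbar=1.0):
    """Order-of-magnitude bounds for a region of size R containing energy E (c = 1)."""
    G_N = l_P**2 / hbar
    R_g = G_N * E                        # gravitational radius, O(1) factors dropped
    S_BEB = R * R_g / l_P**2             # Bekenstein
    S_HOL = R**2 / l_P**2                # holographic
    S_CEB = R**1.5 * E**0.5 / (l_P * hbar**0.5)
    return S_BEB, S_CEB, S_HOL

# Weakly gravitating system: R_g = 10 is much smaller than R = 1000.
weak = entropy_bounds(R=1.0e3, E=10.0)

# Strongly gravitating region: R_g = 1e5 exceeds R = 1000, and the ordering reverses.
strong = entropy_bounds(R=1.0e3, E=1.0e5)

print("weak  :", weak)
print("strong:", strong)
```

In both regimes the CEB value is the geometric mean of the other two, so it always lies between them.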
For regions of space that contain so much energy that the corresponding
gravitational radius Rg exceeds R, Bekenstein’s bound is the weakest, while
the naive holographic bound is the strongest (but very often wrong). Bousso’s
proposal uses the apparent horizon $R_{\rm AH}$, while CEB uses $R_{\rm CC}$. For homogeneous
cosmologies, $R_{\rm CC} < R_{\rm AH}$, since $R_{\rm CC}^{-2}$, according to (12), is always
larger than the average of the two terms appearing on its r.h.s., which is
precisely $R_{\rm AH}^{-2} = H^2 + \kappa/a^2$. Since, for a fixed volume, the bounds scale
like $R_{\rm AH}^{-1}$ or $R_{\rm CC}^{-1}$, we immediately find that CEB is generally more generous.
An important difference between our proposal and Bousso's covariant
holographic bound [6], which bounds $S/A$, is that there the entropy
$S$ is a flux through light-like hypersurfaces. A detailed comparison with
Bousso's proposal is therefore more subtle because of his use of the apparent
horizon area to bound entropy on light sheets. This can be converted
into a bound on the entropy of the space-like region only in special
cases.
Verlinde [11] argued that the radiation in a closed, radiation-dominated
Universe can be modeled by a CFT, and that its entropy can be evaluated
using a generalized Cardy formula. After an appropriate modification of Ver-
linde’s bound which evades the criticism about its validity for weakly coupled
CFTs, the new bound is exactly equivalent to CEB within the CFT frame-
work.

4 The Generalized Second Law and the Causal Entropy Bound

4.1 The Generalized Second Law in Cosmology

There seems to be a close relationship between entropy bounds and the GSL.
We have proposed a concrete classical and quantum mechanical form of the
GSL in cosmology [32], which is valid also in situations far from thermal equi-
librium. We discuss various entropy sources, such as thermal, “geometric” and
“quantum” entropy, apply GSL to study cosmological solutions and show that
it is compatible with entropy bounds. GSL allows a more detailed description
of how, and if, cosmological singularities are evaded. The proposed GSL is dif-
ferent from GSL for BHs [37], but the idea that in addition to normal entropy
other sources of entropy have to be included has some similarities. We will
discuss here only 4D models. Obviously, it should be possible to generalize
our analysis to higher dimensions in a straightforward manner along the lines
of the generalizations of the CEB to higher dimensions.
The starting point of our classical discussion is the deﬁnition of the total
entropy of a domain containing more than one cosmological horizon [19]. We
have already introduced the number of cosmological horizons within a given
comoving volume $V = a(t)^3$. It is simply the total volume divided by the
volume of a single horizon, $n_H = a(t)^3/|H(t)|^{-3}$. As usual, we will ignore
numerical factors of order unity. Here we use units in which $c = 1$, $G_N = 1/16\pi$,
$\hbar = 1$ and discuss only flat, homogeneous and isotropic cosmologies.
If the entropy within a given horizon is S H , then the total entropy is given
by S = nH S H . Classical GSL requires that the cosmological evolution, even
when far from thermal equilibrium, must obey dS ≥ 0, in addition to Einstein
equations. In particular,

$$n_H\, \partial_t S^H + (\partial_t n_H)\, S^H \ge 0. \tag{70}$$

In general, there could be many sources and types of entropy, and the
total entropy is the sum of their contributions. If, in some epoch, a single type
of entropy makes a dominant contribution to $S^H$, for example, of the form
$S^H = |H|^\alpha$, $\alpha$ being a constant characterizing the type of entropy source, and
therefore $S = (a|H|)^3 |H|^\alpha$, (70) becomes an explicit inequality,
$$3H + (3+\alpha)\frac{\dot H}{H} \ge 0, \tag{71}$$
which can be translated into energy conditions constraining the energy density
ρ, and the pressure p of (eﬀective) sources. Using the FRW equations,
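$$H^2 = \frac{1}{6}\rho, \qquad \dot H = -\frac{1}{4}(\rho + p), \qquad \dot\rho + 3H(\rho+p) = 0, \tag{72}$$
and assuming $\alpha > -3$ (which we will see later is a reasonable assumption)
and of course $\rho > 0$, we obtain
$$\frac{p}{\rho} \le \frac{2}{3+\alpha} - 1 \quad \text{for } H > 0, \tag{73}$$
$$\frac{p}{\rho} \ge \frac{2}{3+\alpha} - 1 \quad \text{for } H < 0. \tag{74}$$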
Adiabatic evolution occurs when the inequalities in (73) and (74) are satu-
rated.
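The step from (71) to (73) and (74) can be made concrete with a few lines of code. This is a minimal sketch, assuming the FRW equations in the form H² = ρ/6 and Ḣ = −(ρ + p)/4 and a constant equation of state p = wρ; the function names are hypothetical.

```python
def adiabatic_eos(alpha):
    """p/rho that saturates (73)/(74) for S^H = |H|**alpha, valid for alpha > -3."""
    if alpha <= -3.0:
        raise ValueError("the derivation assumes alpha > -3")
    return 2.0 / (3.0 + alpha) - 1.0

def hdot_from_eos(H, w):
    """H_dot from the FRW equations (72) with p = w * rho."""
    rho = 6.0 * H**2
    return -0.25 * rho * (1.0 + w)

def gsl_lhs(H, H_dot, alpha):
    """Left-hand side of (71): 3H + (3 + alpha) H_dot / H."""
    return 3.0 * H + (3.0 + alpha) * H_dot / H

# Adiabatic evolution saturates (71) for either sign of H.
for alpha in (-2.0, -1.5, -1.0):
    w = adiabatic_eos(alpha)
    for H in (0.3, -0.3):
        print(alpha, w, H, gsl_lhs(H, hdot_from_eos(H, w), alpha))

# A source stiffer than adiabatic violates (71) when H > 0 but satisfies it when H < 0.
alpha, w_stiff = -1.5, 0.9
print(gsl_lhs(0.3, hdot_from_eos(0.3, w_stiff), alpha),
      gsl_lhs(-0.3, hdot_from_eos(-0.3, w_stiff), alpha))
```

For α = −3/2 the saturating value is p/ρ = 1/3, and for α = −2 it is p/ρ = 1.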
A few remarks about the allowed range of values of α are in order. First, the
usual adiabatic expansion of a radiation-dominated universe with p/ρ = 1/3
corresponds to α = −3/2. Adiabatic evolution with p/ρ < −1 for which the
null energy condition is violated would require a source for which α < −3.
This is problematic since it does not allow a flat space limit of vanishing H
with finite entropy. The existence of an entropy source with α < −2 does not
allow a finite ∂t S in the flat space limit and is therefore suspected of being
unphysical. Finally, the equation of state p = −ρ (deSitter inflation) cannot
be described as adiabatic evolution for any finite α.
Let us discuss in more detail three specific examples. First, as already
noted, we have verified that thermal entropy during radiation-dominated evolution
can be described without difficulties, as expected. In this case, $\alpha = -\frac{3}{2}$
reproduces the well-known adiabatic expansion, but also allows entropy production.
The present era of matter domination requires a more complicated
description since in this case one source provides the entropy, and another
source the energy.
The second case is that of geometric entropy Sg , whose source is the exis-
tence of a cosmological horizon [38, 39]. The concept of geometric entropy is
closely related to the holographic principle and to entanglement entropy (see
below). For a system with a cosmological horizon $S_g^H$ is given by (ignoring
numerical factors of order unity)
$$S_g^H = |H|^{-2} G_N^{-1}. \tag{75}$$
The equation of state corresponding to adiabatic evolution with dominant Sg
is obtained by substituting α = −2 into (73) and (74), leading to p/ρ = 1
for positive and negative H. This equation of state is simply that of a free
massless scalar field, also recognized as the two vacuum branches of PBB string
cosmology [18] in the Einstein frame. In [19] this was found for the (+) branch
in the string frame as an “empirical” observation. In general, for the case of
dominant geometric entropy, GSL requires, for positive H, p ≤ ρ; hence,
deSitter inflation is definitely allowed. For negative H, GSL requires ρ ≤ p,
and therefore forbids, for example, a time-reversed history of our universe
or a contracting deSitter universe with a negative constant H (unless some
additional entropy sources appear).

The third case is that of quantum entropy Sq , associated with quantum

ﬂuctuations. This form of entropy was discussed in [40, 41]. Speciﬁc quantum
entropy for a single physical degree of freedom is approximately given by
(again, ignoring numerical factors of order unity)

$$s_q = \int d^3k\, \ln n_k, \tag{76}$$

where $n_k \gg 1$ are occupation numbers of quantum modes. Quantum entropy

is large for highly excited quantum states, such as the squeezed states obtained
by amplification of quantum fluctuations during inflation. Quantum entropy
does not seem to be expressible in general as SqH = |H|α , since occupation
numbers depend on the whole history of the evolution. We will discuss this
form of entropy in more detail later, when the quantum version of GSL is
proposed.
Geometric entropy is related to the existence of a horizon or more gener-
ally to the existence of a causal boundary. From my current perspective the
geometric entropy corresponds to entanglement entropy of fluctuations whose
wavelength is shorter than the horizon, while “quantum” entropy is probably
related to entanglement entropy of fluctuations whose wavelength is larger
than the horizon (see below).
We would like to show that it is possible to formally define a temperature,
and that the definition is compatible with a generalized form of the first
law of thermodynamics (see also [43]). Recall that the first law for a closed
system states that $T dS = dE + p\,dV = (\rho + p)dV + V d\rho$. Let us now consider
the case of a single entropy source and formally define a temperature $T$,
$T^{-1} = \left(\frac{\partial S}{\partial E}\right)_V = \frac{\partial s}{\partial \rho}$, since $E = \rho V$ and $S = sV$. Using (72), and $s = |H|^{\alpha+3}$, we
obtain $\frac{\partial s}{\partial\rho} = \frac{\alpha+3}{12}|H|^{\alpha+1}$, and therefore,
$$T = \frac{12}{\alpha+3}\,|H|^{-\alpha-1}. \tag{77}$$
To ensure positive temperatures we need $\alpha > -3$, a condition which we have already
encountered. Additionally, for $\alpha > -1$, $T$ diverges in the flat space limit,
and therefore such a source is suspected of being unphysical, leading to the
conclusion that the physical range of $\alpha$ is $-2 \le \alpha \le -1$. A compatibility
check requires $T^{-1} = \frac{\partial s}{\partial t} \big/ \frac{\partial \rho}{\partial t}$, which indeed yields a result in agreement with
(77). Yet another thermodynamic relation, $p/T = \left(\frac{\partial S}{\partial V}\right)_E$, leads to $p = sT - \rho$
and therefore to $p/\rho = \frac{2}{\alpha+3} - 1$ for adiabatic evolution, in complete agreement
with (73) and (74). For $\alpha = -2$, (77) implies $T_g = |H|$, in agreement with
[38], and for ordinary thermal entropy $\alpha = -3/2$ reproduces the known result,
$T = |H|^{1/2}$.
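The temperature formula can be cross-checked by differentiating the entropy density numerically. The sketch assumes (77) in the form T = 12 |H|^(−α−1)/(α+3), together with H² = ρ/6 and s = |H|^(α+3); function names and the sample density are hypothetical.

```python
def hubble(rho):
    """|H| from the first FRW equation in (72), H^2 = rho / 6."""
    return (rho / 6.0) ** 0.5

def entropy_density(rho, alpha):
    """s = |H|**(alpha + 3) for a single entropy source."""
    return hubble(rho) ** (alpha + 3.0)

def temperature_formula(rho, alpha):
    """Assumed closed form (77)."""
    return 12.0 / (alpha + 3.0) * hubble(rho) ** (-alpha - 1.0)

def temperature_numeric(rho, alpha, eps=1e-5):
    """T = (ds/drho)**-1 from a central finite difference."""
    ds = entropy_density(rho * (1 + eps), alpha) - entropy_density(rho * (1 - eps), alpha)
    return 2.0 * eps * rho / ds

def adiabatic_p_over_rho(rho, alpha):
    """p/rho from p = s T - rho."""
    return entropy_density(rho, alpha) * temperature_formula(rho, alpha) / rho - 1.0

rho = 0.6  # hypothetical energy density in the units of the text
for alpha in (-2.0, -1.5, -1.0):
    print(alpha,
          temperature_formula(rho, alpha),
          temperature_numeric(rho, alpha),
          adiabatic_p_over_rho(rho, alpha))
```

The last column reproduces 2/(α+3) − 1, the adiabatic equation of state of (73) and (74).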
Is GSL compatible with entropy bounds? Let us start answering this ques-
tion by considering a universe undergoing decelerated expansion, i.e. H > 0,
Ḣ < 0. For entropy sources with α > −2, going backward in time, H is pre-
vented by the restriction S H ≤ SgH from becoming too large. This requires

that at a certain moment in time Ḣ has reversed sign, or at least vanished.

GSL allows such a transition. Evolving from the past toward the future, and
looking at (71) we see that a transition from an epoch of accelerated expan-
sion H > 0, Ḣ > 0, to an epoch of decelerated expansion H > 0, Ḣ < 0, can
occur without violation of GSL. But later we discuss a new bound appearing
in this situation when quantum effects are included.
For a contracting universe with H < 0, and if sources with α > −2 exist,
the situation is more interesting. Let us check whether in an epoch of accel-
erated contraction H < 0, Ḣ < 0, GSL is compatible with entropy bounds. If
an epoch of accelerated contraction lasts, it will inevitably run into a future
singularity, in conflict with bound S H ≤ SgH . This conflict could perhaps
have been prevented if at some moment in time the evolution had turned
into decelerated contraction with $H < 0$, $\dot H > 0$. But a brief look at (71),
$\dot H \le -\frac{3}{3+\alpha} H^2$, shows that decelerated contraction is not allowed by GSL.
The conclusion is that for the case of accelerated contraction GSL and the
entropy bound are not compatible.
To resolve the conflict between GSL and the entropy bound, we pro-
pose adding a missing quantum entropy term dSQuantum = −μdnH , where
μ(a, H, Ḣ, ...) is a “chemical potential” motivated by the following heuristic
argument. Specific quantum entropy is given by (76), and we consider for the
moment one type of quantum fluctuations that preserves its identity through-
out the evolution. Changes in Sq result from the well-known phenomenon of
freezing and defreezing of quantum fluctuations. For example, quantum modes
whose wavelength is stretched by an accelerated cosmic expansion to the point
that it is larger than the horizon become frozen (“exit the horizon”), and are
lost as dynamical modes, and conversely, quantum modes whose wavelength
shrinks during a period of decelerated expansion (“re-enter the horizon”) thaw
and become dynamical again. Taking into account this “quantum leakage” of
entropy requires that the first law should be modified as in open systems
T dS = dE + P dV − μdN .
Consider a universe going through a period of decelerated expansion, con-
taining some quantum fluctuations which have re-entered the horizon (for
concreteness, it is possible to think about an isotropic background of gravita-
tional waves). In this case, physical momenta simply redshift, but since no new
modes have re-entered, and since occupation numbers do not change by sim-
ple redshift, then within a fixed comoving volume, entropy does not change.
However, if there are some frozen fluctuations outside the horizon “waiting to
re-enter,” then there will be a change in quantum entropy, because the mini-
mal comoving wave number of dynamical modes kmin will decrease due to the
expansion, $k_{\min}(t+\delta t) < k_{\min}(t)$. The resulting change in quantum entropy,
for a single physical degree of freedom, is
$\Delta s_q = \int_{k_{\min}(t+\delta t)}^{k_{\min}(t)} k^2 dk\, \ln n_k$, and since $k_{\min}(t) = a(t)H(t)$,
$$\Delta S_q = \int_{a(t+\delta t)H(t+\delta t)}^{a(t)H(t)} k^2 dk\, \ln n_k = -\Delta(aH)^3\, \ln n_{k=aH},$$
provided $\ln n_k$ is a smooth enough function. Therefore, for $N$ physical DOF,
and since $n_H = (aH)^3$,
$$dS_q = -\mu N\, dn_H, \tag{78}$$
where parameter μ is taken to be positive. Obviously, the result depends on
the spectrum nk , but typical spectra are of the form nk ∼ k β , and therefore we
may take as a reasonable approximation ln nk ∼ constant for all N physical
DOF.
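The approximation behind (78) can be tested by integrating a power-law spectrum directly. This is a rough sketch: the spectral index, the cutoff values and the midpoint-rule integrator are hypothetical choices, and the factor 1/3 from the k² integral is kept explicitly even though the text drops factors of order unity.

```python
import math

def delta_s_q(k_new, k_old, ln_n, steps=1000):
    """Midpoint-rule integral of k**2 * ln_n(k) from k_new up to k_old."""
    h = (k_old - k_new) / steps
    total = 0.0
    for i in range(steps):
        k = k_new + (i + 0.5) * h
        total += k**2 * ln_n(k) * h
    return total

beta = -4.0  # hypothetical spectral index: n_k = k**beta is large for k << 1

def ln_n(k):
    return beta * math.log(k)

# k_min = aH decreases slightly during decelerated expansion.
k_old, k_new = 1.0e-3, 0.999e-3

exact = delta_s_q(k_new, k_old, ln_n)
# -Delta(aH)^3 * ln n evaluated at k = aH, with the 1/3 restored.
approx = -(k_new**3 - k_old**3) / 3.0 * ln_n(k_old)

print(exact, approx)
```

Because ln n_k varies only logarithmically over the narrow band of re-entering modes, the two numbers agree closely, which is what justifies treating ln n_k as a constant μ.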
We adopt proposal (78) in general

$$dS = dS_{\rm Classical} + dS_{\rm Quantum} = dn_H\, S^H + n_H\, dS^H - \mu N\, dn_H, \tag{79}$$

where $S^H$ is the classical entropy within a cosmological horizon. In particular,
for the case that $S^H$ is dominated by a single source $S^H = |H|^\alpha$,
$$\left(3H + 3\frac{\dot H}{H}\right) n_H \left(S^H - \mu N\right) + \alpha\, \frac{\dot H}{H}\, n_H S^H \ge 0. \tag{80}$$

Quantum-modified GSL (80) allows a transition from accelerated to decelerated
contraction. As a check, look at $H < 0$, $\dot H = 0$; in this case modified
GSL requires $3H(S^H - \mu N) \ge 0$, which, if $\mu N \ge S^H$, is allowed. If the
dominant form of entropy is indeed geometric entropy, the transition from accelerated
to decelerated contraction is allowed already at $|H| \sim M_P/\sqrt{N}$. In
models where N is a large number, such as grand uniﬁed theories and string
theory where it is expected to be of the order of 1000, the transition can occur
at a scale much below the Planck scale, at which classical general relativity is
conventionally expected to adequately describe background evolution.
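The turnaround scale can be illustrated numerically. The sketch assumes that (80) reads (3H + 3Ḣ/H) n_H (S^H − μN) + α (Ḣ/H) n_H S^H ≥ 0, that geometric entropy dominates with S_g^H = M_P²/H², and that μ is of order unity; the values of N and μ are hypothetical.

```python
import math

def modified_gsl_lhs(H, H_dot, S_H, mu_N, alpha, n_H=1.0):
    """Assumed left-hand side of the quantum-modified GSL (80)."""
    return ((3.0 * H + 3.0 * H_dot / H) * n_H * (S_H - mu_N)
            + alpha * (H_dot / H) * n_H * S_H)

def geometric_entropy(H, M_P=1.0):
    """S_g^H = |H|**-2 / G_N with 1/G_N ~ M_P**2, O(1) factors dropped."""
    return (M_P / abs(H)) ** 2

N, mu = 1000.0, 1.0                  # hypothetical values
H_star = 1.0 / math.sqrt(mu * N)     # |H| at which S_g^H = mu N, in units M_P = 1

def turnaround_allowed(H):
    """Is H_dot = 0 compatible with (80) at this negative H?"""
    return modified_gsl_lhs(H, 0.0, geometric_entropy(H), mu * N, alpha=-2.0) >= 0.0

print("H_star =", H_star)
print("below H_star:", turnaround_allowed(-0.3 * H_star))
print("above H_star:", turnaround_allowed(-3.0 * H_star))
```

For N of order 1000 the threshold is roughly 0.03 M_P, so the transition becomes possible while the curvature is still well below the Planck scale.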
If we reconsider the transition from accelerated to decelerated expansion
and require that (80) holds, we discover a new bound derived directly from
GSL. It is compatible with, but not relying on, the bound S H ≤ SgH . Consider
the case in which $\dot H$ and $H$ are positive, or $H$ positive and $\dot H$ negative but
$|\dot H| \ll H^2$, relevant to whether the transition is allowed by GSL. In this
case, (80) reduces to $S^H - \mu N \ge 0$, that is, GSL puts a lower bound on
the classical entropy within the horizon. If geometric entropy is the dominant
source of entropy as expected, GSL puts a lower bound on geometric entropy
$S_g^H \ge \mu N$, which yields an upper bound on $H$,
$$H \le \frac{M_P}{\sqrt{N}}. \tag{81}$$
The scale that appeared previously in the resolution of the conflict between en-
tropy bounds and GSL for a contracting universe has reappeared in (81), and
remarkably, (81) is the same bound obtained in [9] using different arguments.
Bound (81) forbids a large class of singular homogeneous, isotropic, spatially
flat cosmologies by bounding the scale of curvature for such a universe.

4.2 The Generalized Second Law in Pre-big-bang String Cosmology

String theory is a consistent theory of quantum gravity, with the power to de-
scribe high curvature regions of space–time [44], and as such, we could expect
it to teach us about the fate of cosmological singularities, with the expec-
tation that singularities are smoothed and turned into brief epochs of high
curvature. However, many attempts to coax an answer out of string theory
regarding cosmological singularities have so far failed to produce a conclusive
answer (see for example [45]). The reason is probably that most technical
advances in string theory rely heavily on supersymmetry, but generic time-
dependent solutions break all supersymmetries and therefore known methods
are less powerful when applied to cosmology.
We have focused [46] on the two sources of entropy defined previously. The
first source is the geometric entropy Sg , and the second source is quantum
entropy Sq . The entropy within a given horizon is S H and the total entropy
is given by $S = n_H S^H$. We will ignore numerical factors, use units in which
$c = 1$, $\hbar = 1$, $G_N = e^{\phi}/16\pi$, $\phi$ being the dilaton, and discuss only flat,
homogeneous and isotropic 4D string cosmologies in the so-called string frame,
in which the lowest order effective action is

$$S_{\rm LO} = \int d^4x\, \sqrt{-g}\, e^{-\phi} \left[ R + (\partial\phi)^2 \right].$$

Obviously, the discussion can be generalized in a straightforward manner to

higher D.
In ordinary cosmology, geometric entropy within a Hubble volume is given
by its area $S_g^H = H^{-2} G_N^{-1}$, and therefore, specific geometric entropy is given
by $s_g = |H| G_N^{-1}$ [32]. A possible expression for specific geometric entropy in
string cosmology is obtained by substituting $G_N = e^{\phi}$, leading to

$$s_g = |H| e^{-\phi}. \tag{82}$$

Reassurance that sg is indeed given by (82) is provided by the following obser-

vation. The action SLO can be expressed in a (3 + 1) covariant form, using the
3-metric gij , the extrinsic curvature Kij , considering only vanishing 3-Ricci
scalar and homogeneous dilaton,

$$S_{\rm LO} = \int d^3x\, dt\, \sqrt{g_{ij}}\, e^{-\phi} \left[ -3K_{ij}K^{ij} - 2g^{ij}\partial_t K_{ij} + K^2 - (\partial_t\phi)^2 \right].$$

Now, $S_{\rm LO}$ is invariant under the symmetry transformation $g_{ij} \to e^{2\lambda} g_{ij}$, $\phi \to \phi + 3\lambda$,
for an arbitrary time-dependent $\lambda$. From the variation of the action
$\delta S = \int d^3x\, dt\, \sqrt{g_{ij}}\, e^{-\phi}\, 4K\dot\lambda$, we may read off the current and conserved charge
$Q = 4a^3 e^{-\phi} K$. The symmetry is exact in the flat homogeneous case, and
it seems plausible that it is a good symmetry even when $\alpha'$ corrections are
present [42]. With definition (82), the total geometric entropy $S_g = a^3|H|e^{-\phi}$
is proportional to the corresponding conserved charge. Adiabatic evolution,
determined by $\partial_t S_g = 0$, leads to a familiar equation, $\frac{\dot H}{H} - \dot\phi + 3H = 0$, satisfied
by the ($\pm$) vacuum branches of PBB string cosmology.
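The adiabaticity condition can be checked on explicit vacuum solutions. The sketch below assumes the commonly quoted isotropic 4D string-frame branches, a = (−t)^(−1/√3), φ = −(1 + √3) ln(−t) for t < 0 and a = t^(1/√3), φ = (√3 − 1) ln t for t > 0; these forms are not given in the text and are used here only as an illustration.

```python
import math

SQRT3 = math.sqrt(3.0)

def plus_branch(t):
    """Assumed (+) branch, t < 0: a = (-t)**(-1/sqrt3), phi = -(1 + sqrt3) ln(-t)."""
    a = (-t) ** (-1.0 / SQRT3)
    phi = -(1.0 + SQRT3) * math.log(-t)
    H = 1.0 / (SQRT3 * (-t))
    H_dot = 1.0 / (SQRT3 * t * t)
    phi_dot = (1.0 + SQRT3) / (-t)
    return a, phi, H, H_dot, phi_dot

def minus_branch(t):
    """Assumed (-) branch, t > 0: a = t**(1/sqrt3), phi = (sqrt3 - 1) ln t."""
    a = t ** (1.0 / SQRT3)
    phi = (SQRT3 - 1.0) * math.log(t)
    H = 1.0 / (SQRT3 * t)
    H_dot = -1.0 / (SQRT3 * t * t)
    phi_dot = (SQRT3 - 1.0) / t
    return a, phi, H, H_dot, phi_dot

def geometric_entropy(a, phi, H):
    """Total geometric entropy S_g = a^3 |H| exp(-phi)."""
    return a**3 * abs(H) * math.exp(-phi)

def adiabatic_combination(H, H_dot, phi_dot):
    """H_dot/H - phi_dot + 3H, which vanishes when d(S_g)/dt = 0."""
    return H_dot / H - phi_dot + 3.0 * H

for branch, times in ((plus_branch, (-5.0, -1.0, -0.2)), (minus_branch, (0.2, 1.0, 5.0))):
    for t in times:
        a, phi, H, H_dot, phi_dot = branch(t)
        print(branch.__name__, t, geometric_entropy(a, phi, H),
              adiabatic_combination(H, H_dot, phi_dot))
```

On both assumed branches S_g stays constant in time and the combination Ḣ/H − φ̇ + 3H vanishes.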
Quantum entropy for a single field in string cosmology is, as in [40, 41, 32],
given by
$$s_q = \int_{k_{\min}}^{k_{\max}} d^3k\, f(k), \tag{83}$$
where for large occupation numbers $f(k) \simeq \ln n_k$. The ultraviolet cutoff $k_{\max}$
is assumed to remain constant at the string scale. The
infrared
√ cutoff
kmin
s(η)
is determined by the perturbation equation ψkc + kc − √
2
ψkc = 0,
s(η)
where η is conformal time = ∂η , and kc is the comoving momentum√related

to physical momentum k(η) as kc = a(η)k(η). Modes for which kc2 ≤ √ss are
“frozen,” and are lost as dynamical modes. The “pump field” s(η) = a2m eφ
depends on the background evolution and on the spin and dilaton coupling
of various fields. We are interested
√
in solutions for which a /a ∼ φ ∼ 1/η,
and therefore, for all particles √ss ∼ 1/η 2 . It follows that kmin ∼ H. In other
phases of cosmological evolution our assumption does not necessarily hold,
but in standard radiation domination (RD) with frozen dilaton all modes re-
enter the horizon. Using the reasonable approximation f (k) ∼ constant, we
obtain, as in [32],
ΔSq −μΔnH . (84)
Parameter μ is positive, and in many cases proportional to the number of
species of particles, taking into account all DOF of the system, perturbative
and non-perturbative. The main contribution to μ comes from light DOF, and
therefore if some non-perturbative objects such as D branes become light, they
will make a substantial contribution to μ.
We now turn to the generalized second law of thermodynamics, taking into
account geometric and quantum entropy. Enforcing $dS \ge 0$, and in particular,
$\partial_t S = \partial_t S_g + \partial_t S_q \ge 0$, leads to an important inequality,

$$\left( H^{-2} e^{-\phi} - \mu \right) \partial_t n_H + n_H\, \partial_t \left( H^{-2} e^{-\phi} \right) \ge 0. \tag{85}$$

When quantum entropy is negligible compared to geometric entropy, GSL (85)
leads to

$$\dot\phi \le \frac{\dot H}{H} + 3H, \tag{86}$$

yielding a bound on $\dot\phi$, and therefore on dilaton kinetic energy, for a given $H$,
$\dot H$. Bound (86) was first obtained in [19], and interpreted as following from a
saturated HEB.
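As a small illustration of how bound (86) can be tested along a numerical solution, the following sketch evaluates the margin $\dot H/H + 3H - \dot\phi$ at a single point of the evolution. The function names and the sample values are hypothetical and are not taken from the numerical study described in this section.

```python
def classical_gsl_margin(H, Hdot, phidot):
    """Return Hdot/H + 3H - phidot; bound (86) holds when this is >= 0."""
    if H == 0:
        raise ValueError("H must be non-zero")
    return Hdot / H + 3.0 * H - phidot


def satisfies_classical_gsl(H, Hdot, phidot, tol=1e-12):
    """True when bound (86) holds, up to a small numerical tolerance."""
    return classical_gsl_margin(H, Hdot, phidot) >= -tol


# At a turning point of H (Hdot = 0) the bound reduces to phidot <= 3H.
print(satisfies_classical_gsl(H=0.1, Hdot=0.0, phidot=0.2))
print(satisfies_classical_gsl(H=0.1, Hdot=0.0, phidot=0.5))
```

The second call illustrates the situation in which $H$ stops growing while $\dot\phi$ is still larger than $3H$, so that (86) fails.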
When quantum entropy becomes relevant, we obtain another bound. We
are interested in a situation in which the universe expands, $H > 0$, and $\phi$
and $H$ are non-decreasing, and therefore $\partial_t \left( H^{-2} e^{-\phi} \right) \le 0$ and $\partial_t n_H > 0$. A
necessary condition for GSL to hold is that

$$H^2 \le \frac{e^{-\phi}}{\mu}, \tag{87}$$

bounding total geometric entropy $H e^{-\phi} \le \frac{e^{-3\phi/2}}{\sqrt{\mu}}$. A bound similar to (87) was
obtained in [19] by considering entropy of re-entering quantum fluctuations.
We stress that to be useful in analysis of cosmological singularities (87) has
to be considered for perturbations that exit the horizon. If the condition (87)
is satisfied, then the cosmological evolution always allows a self-consistent
description using the low-energy effective action approach.
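Bound (87) and the entropy bound that follows from it are simple enough to encode directly. The sketch below is only a numerical restatement of the two inequalities; the function names and the sample values of $\phi$ and $\mu$ are hypothetical.

```python
import math


def max_hubble(phi, mu):
    """Largest H allowed by H^2 <= exp(-phi)/mu, bound (87)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return math.exp(-0.5 * phi) / math.sqrt(mu)


def satisfies_quantum_gsl(H, phi, mu):
    """True when H^2 <= exp(-phi)/mu."""
    return H * H <= math.exp(-phi) / mu


def geometric_entropy_bound(phi, mu):
    """Upper limit exp(-3 phi / 2) / sqrt(mu) on H exp(-phi)."""
    return math.exp(-1.5 * phi) / math.sqrt(mu)


phi, mu = -4.0, 10.0            # weak coupling, ten effective species
H_max = max_hubble(phi, mu)
print(H_max, H_max * math.exp(-phi), geometric_entropy_bound(phi, mu))
```

At saturation, $H e^{-\phi}$ evaluated at the largest allowed $H$ coincides with $e^{-3\phi/2}/\sqrt{\mu}$, so the last two printed numbers agree.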
It is not a priori clear that the form of GSL and entropy sources remains
unchanged when curvature becomes large; in fact, we may expect higher-order
corrections to appear. For example, the conserved charge of the scaling symme-
try of the action will depend in general on higher-order curvature corrections.
Nevertheless, in the following we will assume that specific geometric entropy
is given by (82), without higher-order corrections, and try to verify that, for
some reason yet to be understood, there are no higher-order corrections to
(82). Our results are consistent with this assumption.
We now turn to apply our general analysis to the PBB string cosmology
scenario, in which the universe starts from a state of very small curvature
and string coupling and then undergoes a long phase of dilaton-driven infla-
tion, joining smoothly at later times standard RD cosmology, giving rise to a
singularity-free inflationary cosmology. The high-curvature phase joining DDI
and RD phases is identified with the “big bang” of standard cosmology. A
key issue confronting this scenario is whether, and under what conditions,
the graceful exit transition from DDI to RD can be completed [47]. In particular,
it was argued that curvature is bounded by an algebraic fixed point behavior
when both H and φ̇ are constants and the universe is in a linear-dilaton
de Sitter space [42], and coupling is bounded by quantum corrections [48, 49]. But
it became clear that another general theoretical ingredient is missing, and we
propose that GSL is that missing ingredient.
We have studied numerically examples of PBB string cosmologies to verify
that the overall picture we suggest is valid in cases that can be analyzed
explicitly. We first consider, as in [42, 50], $\alpha'$ corrections to the lowest order
string effective action,

$$S = \frac{1}{16\pi\alpha'} \int d^4x \sqrt{-g}\, e^{-\phi} \left[ R + (\partial\phi)^2 + \frac{1}{2} L_{\alpha'} \right], \tag{88}$$

where

$$L_{\alpha'} = k\alpha' \left[ \frac{1}{2} R_{GB}^2 + A\, (\partial\phi)^4 + D\, \partial^2\phi\, (\partial\phi)^2 + C \left( R^{\mu\nu} - \frac{1}{2} g^{\mu\nu} R \right) \partial_\mu\phi\, \partial_\nu\phi \right], \tag{89}$$

with $C = -(2A + 2D + 1)$, is the most general form of four derivative corrections
that lead to equations of motion with at most second (time) derivatives.
The rationale for this choice was explained in [50]. $k$ is a numerical factor
depending on the type of string theory. Action (88) leads to equations of motion,
$-3H^2 + \dot{\bar\phi}^2 - \bar\rho = 0$, $\bar\sigma - 2\dot H + 2H\dot{\bar\phi} = 0$, $\bar\lambda - 3H^2 - \dot{\bar\phi}^2 + 2\ddot{\bar\phi} = 0$, where $\bar\rho$, $\bar\lambda$ and
$\bar\sigma$ are effective sources parameterizing the contribution of $\alpha'$ corrections [50].
Parameters A and D should, in principle, be determined by string theory; however,
at the moment, it is not possible to calculate them in general. If A and D
were determined, we could simply use the results and check whether the generic
cosmological solutions are non-singular, but since A and D are unavailable at
the moment, we turn to GSL to restrict them.
First, we look at the initial stages of the evolution when the string coupling
and H are very small. We find that not all the values of the parameters A and
D are allowed by GSL. If the only relevant form of entropy is geometric entropy,
GSL on generic solutions at the very early stage of the evolution is equivalent
to the condition $\bar\sigma \ge 0$, which leads to the following condition on A
and D (first obtained by Madden [51]): 40.05A + 28.86D ≤ 7.253. The values
of A and D which satisfy this inequality are labeled “allowed,” and the rest
are “forbidden.” In [50] a condition was found ensuring that $\alpha'$ corrections
make solutions start to turn toward a fixed point at the very early stages of
their evolution, 61.1768A + 40.8475D ≤ 16.083, and such solutions were labeled
“turning the right way.” Both conditions are displayed in Fig. 2. They select
almost the same region of (A, D) space, a gratifying result: GSL “forbids”
actions whose generic solutions are singular and do not reach a fixed point.
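The two linear conditions on (A, D) are easy to compare numerically. The sketch below uses only the coefficients quoted in the text and the relation C = −(2A + 2D + 1); the function names and the scan window (chosen to match the axis ranges of Fig. 2) are hypothetical.

```python
def coefficient_c(A, D):
    """C = -(2A + 2D + 1), fixed by A and D."""
    return -(2.0 * A + 2.0 * D + 1.0)


def gsl_allowed(A, D):
    """Classical GSL condition at the very early stage of the evolution."""
    return 40.05 * A + 28.86 * D <= 7.253


def turns_right_way(A, D):
    """Condition for solutions to start turning toward a fixed point."""
    return 61.1768 * A + 40.8475 * D <= 16.083


def agreement_fraction(a_values, d_values):
    """Fraction of grid points on which the two conditions agree."""
    total = 0
    same = 0
    for A in a_values:
        for D in d_values:
            total += 1
            if gsl_allowed(A, D) == turns_right_way(A, D):
                same += 1
    return same / total


a_grid = [-5.0 + 0.1 * i for i in range(101)]    # A in [-5, 5]
d_grid = [-10.0 + 0.1 * j for j in range(201)]   # D in [-10, 10]
print(agreement_fraction(a_grid, d_grid))
print(gsl_allowed(0.0, 0.3), turns_right_way(0.0, 0.3))
```

The point (A, D) = (0, 0.3) lies in one of the narrow wedges between the two lines, where the conditions disagree; on most of the scanned window they coincide.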

[Figure 2 plot: the (A, D) plane, with A on the horizontal axis and D on the vertical axis]

Fig. 2. Two lines, separating actions whose generic solutions “turn the right way”
at the early stages of evolution (red-dashed), and actions whose generic solutions
satisfy classical GSL while close to the (+) branch vacuum (blue-solid). The dots
represent (A, D) values whose generic solutions reach a fixed point, and are all in
the “allowed” region

We further observe that generic solutions which “turn the wrong way” at
the early stages of their evolution continue their course in a way similar to the
solution presented in Fig. 3. We find numerically that at a certain moment in
time H starts to decrease. At that point $\dot H = 0$ and particle production effects
are still extremely weak, so (86) is the relevant bound, and it is
certainly violated.

[Figure 3 plot: vertical axis $H$, horizontal axis $\dot\phi$]

Fig. 3. Typical solution that “turns the wrong way.” The dashed line is the (+)
branch vacuum
We have scanned the (A, D) plane to check whether a generic solution
that reaches a fixed point respects GSL throughout the whole evolution,
and conversely, whether a generic solution obeying GSL evolves toward a
fixed point. The results are shown in Fig. 2; clearly, the “forbidden” region
does not contain actions whose generic solutions go to fixed points. Never-
theless, there are some (A, D) values located in the small wedges near the
bounding lines, for which the corresponding solutions always satisfy (86),
but do not reach a fixed point, and are singular. This happens because
they meet a cusp singularity. Consistency requires adding higher-order $\alpha'$
corrections when cusp singularities are approached, which we will not
attempt here.
If particle production effects are strong, the quantum part of GSL adds
bound (87), which adds another “forbidden” region in the $(H, \dot{\bar\phi})$ plane, the
region above a straight line parallel to the $\dot{\bar\phi}$ axis. The quantum part of GSL
has therefore a significant impact on corrections to the effective action. On a
fixed point $\phi$ is still increasing, and therefore, the bounding line described by
(87) is moving downward, and when the critical line moves below the fixed
point, GSL is violated. This means that when a certain critical value of the
coupling $e^{\phi}$ is reached, the solution can no longer stay on the fixed point, and
it must move away toward an exit. One way this can happen is if quantum
corrections, perhaps of the type discussed in [48, 49], exist.
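The critical coupling at which a fixed point becomes incompatible with bound (87) follows from saturating the bound, $e^{\phi_c} = 1/(\mu H^2)$. The sketch below illustrates this under the assumption that $\phi$ grows linearly in time on the fixed point (constant $\dot\phi$); the function names and sample values are hypothetical.

```python
import math


def critical_dilaton(H_fixed, mu):
    """Value of phi at which H_fixed^2 = exp(-phi)/mu."""
    if H_fixed <= 0 or mu <= 0:
        raise ValueError("H_fixed and mu must be positive")
    return -math.log(mu * H_fixed ** 2)


def bound_87_holds(H, phi, mu):
    """True when H^2 <= exp(-phi)/mu."""
    return H * H <= math.exp(-phi) / mu


def exit_time(phi_start, phi_rate, H_fixed, mu):
    """Time at which the dilaton reaches its critical value.

    Assumption: phi(t) = phi_start + phi_rate * t with constant phi_rate > 0.
    """
    if phi_rate <= 0:
        raise ValueError("phi_rate must be positive")
    return max(0.0, (critical_dilaton(H_fixed, mu) - phi_start) / phi_rate)


phi_c = critical_dilaton(H_fixed=0.5, mu=4.0)
print(phi_c, exit_time(phi_start=-3.0, phi_rate=0.5, H_fixed=0.5, mu=4.0))
```

Larger $\mu$ lowers the critical coupling, so the more light degrees of freedom contribute, the earlier the solution has to leave the fixed point.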
The full GSL therefore forces actions to have generic solutions that are
non-singular: classical GSL bounds dilaton kinetic energy and quantum GSL
bounds H, and therefore at a certain moment of the evolution $\dot H$ must vanish
(at least asymptotically), and then curvature is bounded. If cusp singularities
are removed by adding higher-order corrections, as might be expected, we can
apply GSL with similar conclusions also in this case. A schematic graceful exit
enforced by GSL is shown in Fig. 4. Our result indicates that if we impose GSL
in addition to equations of motion, then non-singular PBB string cosmology
is quite generic.

[Figure 4 plot: horizontal axis $\dot\phi$]

Fig. 4. Graceful exit enforced by GSL on generic solutions. The horizontal line is
bound (87) and the curve on the right is bound (86); shaded regions indicate GSL
violation

5 Area Entropy, Entanglement Entropy and Entropy Bounds

Classical general relativity predicts space–times with event horizons and other
causal boundaries, such as apparent horizons, cosmological horizons and ac-
celeration horizons. Observers in space–times with causal boundaries can see
very different physics, as demonstrated by comparing the static observer at
infinity and a freely falling observer in the Schwarzschild geometry. For the
first, the horizon is a very special place: Energies of particles diverge and
space–time seems to end there, while for a freely falling observer the horizon
and its vicinity do not look special at all. In cosmological space–times with
causal boundaries the situation is similar. The existence of causal boundaries
is determined by the large-scale properties of space–time, and hence is in-
trinsically a non-local concept. In cosmology, for example, it is hard for a
local observer to determine whether the space–time is de Sitter space that
has a cosmological event horizon, or a Robertson–Walker space which looks
approximately de Sitter.
The interpretation of the thermodynamic properties of BHs and whether
they originate from some underlying, more fundamental, statistical mechanics
remains unclear, in spite of the intense efforts and the progress that has been
achieved over the last 30 years since the discovery by Bekenstein [37].
Quantum field theory (QFT) in the fixed background of space–times with horizons
is a key element in the quantitative understanding of the statistical mechanics
of BHs. QFT in such a background has several interesting and well-known
features. The quantum vacuum states associated with different observers can
be very different from each other, leading to strong particle production effects:
the Hawking effect and the Unruh effect. In addition, the appearance of large
blue shifts of quantum modes near the horizon leads to the trans-Planckian
problem [52]. The proposed resolutions include the brick-wall model [53, 54]
and the stretched-horizon [55] idea. The entropy and thermodynamics are also
observer dependent, as demonstrated by the classic comparison between the
Rindler and Minkowski space observers in the Minkowski vacuum. The accel-
erated observer sees a truly thermal state, while for the Minkowski observer
the temperature vanishes. The tension between the possibility of evaluating
the entropy and other thermodynamic quantities in the semi-classical approx-
imation and their observer dependence and hence their sensitivity to physics
at the highest energy scales is intriguing and is not yet resolved.
My current point of view about the physics of space–times with causal
boundaries is the entanglement point of view. I believe that the statistical
properties of such space–times arise because classical observers in them have
access only to a part of the whole quantum state. When a system is in a pure
state, but one cannot access the complete quantum system, and a measure-
ment is performed, one is instructed by the rules of quantum mechanics to
trace over the classically inaccessible DOF. This leads to a natural framework
for interpreting the physics of spaces with causal boundaries: it is described
by the density matrix which results from tracing over the inaccessible
DOF. In the context of BHs the idea was first proposed by ’t Hooft [54],
and by Sorkin and collaborators [56], and then extended and elaborated by
Srednicki [39] and others.
The entanglement approach considers the fundamental physical objects de-
scribing the physics of space–times with causal boundaries to be their global
quantum state and the unitary evolution operator. The entanglement approach
has several obvious advantages: It naturally leads to area-law entropy,
and it can incorporate the observer dependence of BH thermodynamics and of
the thermodynamics of cosmological space–times with causal boundaries. It
can naturally accommodate the geometric and quantum entropies—the first
resulting from the entanglement entropy of short-wavelength fluctuations and
the second resulting from the entanglement entropy of fluctuations whose
wavelength is larger than the causal connection scale. This interpretation is
also automatically compatible with entropy bounds and the GSL as long as
the evolution equations are “physical” because from a global point of view
it is clear that nothing special occurs when a horizon develops. Obviously,
there are also some unresolved issues that need to be better understood in
this context.
The space–times that are traditionally used to explore the entanglement
point of view are spaces with bifurcating Killing horizons such as the eternal
Schwarzschild BH or Rindler space. Israel [57] has shown that the quantum
Hilbert space of fields in space–times with bifurcating Killing horizons has a
product structure that is isomorphic to the product structure that arises in
thermofield dynamics [58]. In thermofield dynamics one formally doubles the
Hilbert space and evaluates quantum expectation values in the thermofield
double pure state in order to evaluate expectation values in a thermal state of
the original system. In this context the entropy is the entanglement entropy
that is obtained from tracing over one of the two spaces.
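The statement that tracing over one copy of the doubled Hilbert space reproduces thermal entropy can be illustrated with a toy spectrum. The sketch below is a minimal example, assuming a finite set of energy levels (a truncated oscillator with unit spacing) and the standard thermofield double state $|TFD\rangle = Z^{-1/2} \sum_n e^{-\beta E_n/2} |n\rangle|n\rangle$; all function names are hypothetical.

```python
import math


def thermofield_double(energies, beta):
    """Coefficient matrix C[m][n] of the TFD state in the product basis."""
    weights = [math.exp(-0.5 * beta * e) for e in energies]
    norm = math.sqrt(sum(w * w for w in weights))
    size = len(energies)
    return [[weights[m] / norm if m == n else 0.0 for n in range(size)]
            for m in range(size)]


def reduced_density_matrix(C):
    """Trace over the second copy: rho = C C^T for a real coefficient matrix."""
    size = len(C)
    return [[sum(C[i][k] * C[j][k] for k in range(size)) for j in range(size)]
            for i in range(size)]


def entanglement_entropy(rho):
    """-sum p ln p from the diagonal; valid here because rho is diagonal."""
    return -sum(rho[i][i] * math.log(rho[i][i])
                for i in range(len(rho)) if rho[i][i] > 0.0)


def thermal_entropy(energies, beta):
    """Canonical entropy S = beta <E> + ln Z of a single copy."""
    Z = sum(math.exp(-beta * e) for e in energies)
    mean_energy = sum(e * math.exp(-beta * e) for e in energies) / Z
    return beta * mean_energy + math.log(Z)


levels = [float(n) for n in range(20)]   # toy spectrum, unit level spacing
beta = 0.7
rho = reduced_density_matrix(thermofield_double(levels, beta))
print(entanglement_entropy(rho), thermal_entropy(levels, beta))
```

The two printed numbers agree because the reduced density matrix has eigenvalues $e^{-\beta E_n}/Z$, which is the thermal state of the original system.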
One of the main unresolved issues confronting the entanglement interpre-
tation is the ultraviolet (UV) divergence of entanglement entropy and other
entanglement correlation functions near the horizon, and its dependence on
the number of fields [59, 60, 61]. Another issue concerns space–times that do
not have non-degenerate bifurcating Killing horizons. For such spaces, it is un-
clear what is entangled with what, since some of the regions of the extended
space–time are missing.
The entanglement point of view has been discussed in the AdS-CFT context
by Maldacena [62] who studied eternal BHs in AdS. In 4D, the space has
two boundaries that are topologically $S^2 \times S^1$, and the dual FT consists of two
CFTs “living on the boundary.” The product theory in the TFD state defines
the string theory in the bulk, whose low energy limit is the AdS-BH. The FT
side is completely well defined, and its thermodynamics can obviously be in-
terpreted as entanglement thermodynamics. The low energy state in the bulk
is the Hartle–Hawking vacuum. The entanglement point of view suggests the
following perspective. Suppose that the universe is in a pure state and that it
evolves unitarily. Then the entropy of any sub-system of it is entirely in the
eyes of the beholder: a particular classical observer.
We have shown [63] that the entropy resulting from the counting of mi-
crostates of non-extremal BHs using field theory duals of string theories can
be interpreted as arising from entanglement. The conditions for making such
an interpretation consistent were determined. First, we have interpreted the
entropy and thermodynamics of space–times with non-degenerate, bifurcating
Killing horizons as arising from entanglement. We have used a path integral
method to define the Hartle–Hawking vacuum state in such space–times, and
reveal explicitly its entangled nature and its relation to the geometry. If string
theory on such spacetimes has a field theory dual, then, in the low-energy, weak
coupling limit, the field theory state that is dual to the Hartle–Hawking state
is a thermofield double state. This allowed us to compare the entanglement
entropy to the entropy of the field theory dual, and thus to the Bekenstein–
Hawking entropy of the BH.
To further understand the nature of the time evolution of sub-systems in
this context, we have considered [64] a collapsing relativistic spherical shell
in a free quantum field theory. Once the center of the wavefunction of the
shell passes a certain radius rs , the degrees of freedom inside rs are traced
over. We have found that an observer outside this region will determine that
the evolution of the system is non-unitary. The non-unitary evolution occurs
only when the wavefunction is in the process of crossing the boundary and
the amount of non-unitarity is proportional to the area of the boundary.

Acknowledgments
I would like to thank all the collaborators who participated in the research
that is summarized and reviewed in this article. First, I would like to thank
Gabriele Veneziano for interesting me in this subject and for collaboration in
several related projects. I would like to thank David Eichler, Marty Einhorn,
Stefano Foffa, Dick Madden, David Oaknin, Avi Mayo, Riccardo Sturani and
Amos Yarom for fruitful collaborations whose results are presented in this
article.

References
1. J. D. Bekenstein: Phys. Rev. D 23, 287 (1981); J. D. Bekenstein: Phys. Rev. D 49, 1912 (1994)
2. W. G. Unruh, R. M. Wald: Phys. Rev. D 25, 942 (1982)
3. M. A. Pelath, R. M. Wald: Phys. Rev. D 60, 104009 (1999); D. Marolf, R. D. Sorkin: Phys. Rev. D 69, 024014 (2004)
4. J. D. Bekenstein: Phys. Rev. D 70, 121502 (2004); J. D. Bekenstein: Found. Phys. 35, 1805 (2005)
5. G. ’t Hooft: “Dimensional reduction in quantum gravity”, arXiv:gr-qc/9310026; L. Susskind: J. Math. Phys. 36, 6377 (1995)
6. R. Bousso: Rev. Mod. Phys. 74, 825 (2002)
7. J. Maldacena: Adv. Theor. Math. Phys. 2, 231 (1998)
8. O. Aharony, S. S. Gubser, J. M. Maldacena, H. Ooguri, Y. Oz: Phys. Rep. 323, 183 (2000)
9. J. D. Bekenstein: Int. J. Theor. Phys. 28, 967 (1989)
10. W. Fischler, L. Susskind: “Holography and cosmology”, arXiv:hep-th/9806039
11. E. P. Verlinde: “On the holographic principle in a radiation dominated universe”, arXiv:hep-th/0008140; I. Savonije, E. P. Verlinde: Phys. Lett. B 507, 305 (2001)
12. R. Brustein, D. Eichler, S. Foffa: Phys. Rev. D 71, 124015 (2005)
13. R. Brustein, D. Eichler, S. Foffa, D. H. Oaknin: Phys. Rev. D 65, 105013 (2002)
14. E. Witten: Adv. Theor. Math. Phys. 2, 505 (1998)
15. L. Randall, R. Sundrum: Phys. Rev. Lett. 83, 4690 (1999)
16. P. Kraus: JHEP 9912, 011 (1999)
17. A. Kehagias, E. Kiritsis: JHEP 9911, 022 (1999)
18. M. Gasperini, G. Veneziano: Phys. Rep. 373, 1 (2003)
19. G. Veneziano: Phys. Lett. B 454, 22 (1999)
20. B. J. Carr, S. W. Hawking: Mon. Not. Roy. Astron. Soc. 168, 399 (1974); B. J. Carr: Astrophys. J. 201, 1 (1975); I. D. Novikov, A. G. Polnarev: Astron. Zh. 57, 250 (1980) [Sov. Astron. 24, 147 (1980)]
21. R. Brustein, G. Veneziano: Phys. Rev. Lett. 84, 5965 (2000)
22. E. E. Flanagan, D. Marolf, R. M. Wald: Phys. Rev. D 62, 084035 (2000)
23. R. Brustein, M. Gasperini, G. Veneziano: Phys. Lett. B 431, 277 (1998)
24. J. Garriga, X. Montes, M. Sasaki, T. Tanaka: Nucl. Phys. B 513, 343 (1998)
25. A. Ghosh, G. Pollifrone, G. Veneziano: Phys. Lett. B 440, 20 (1998)
26. C. W. Misner, K. S. Thorne, J. A. Wheeler: Gravitation (Freeman, San Francisco, 1970)
27. R. Brustein, S. Foffa, G. Veneziano: Phys. Lett. B 507, 270 (2001)
28. M. Gasperini, M. Giovannini: Phys. Rev. D 47, 1519 (1993)
29. C. W. Misner, K. S. Thorne, J. A. Wheeler: Gravitation (Freeman, San Francisco, 1970), pp. 851–859
30. R. Brustein, S. Foffa, A. E. Mayo: Phys. Rev. D 65, 024004 (2002)
31. D. Kutasov, F. Larsen: JHEP 0101, 001 (2001)
32. R. Brustein: Phys. Rev. Lett. 84, 2072 (2000)
33. J. D. Bekenstein: Phys. Rev. D 11, 2072 (1975)
34. Avraham E. Mayo: “Remarks on Bousso’s covariant entropy bound”, unpublished
35. G. Veneziano: Phys. Lett. B 406, 297 (1997); A. Buonanno, K. A. Meissner, C. Ungarelli, G. Veneziano: Phys. Rev. D 57, 2543 (1998)
36. A. Buonanno, T. Damour, G. Veneziano: Nucl. Phys. B 543, 275 (1999)
37. J. D. Bekenstein: Phys. Rev. D 7, 2333 (1973)
38. G. W. Gibbons, S. W. Hawking: Phys. Rev. D 15, 2738 (1977)
39. M. Srednicki: Phys. Rev. Lett. 71, 666 (1993)
40. R. H. Brandenberger, V. F. Mukhanov, T. Prokopec: Phys. Rev. Lett. 69, 3606 (1992); R. H. Brandenberger, T. Prokopec, V. F. Mukhanov: Phys. Rev. D 48, 2443 (1993)
41. M. Gasperini, M. Giovannini: Phys. Lett. B 301, 334 (1993)
42. M. Gasperini, M. Maggiore, G. Veneziano: Nucl. Phys. B 494, 315 (1997)
43. T. Jacobson: Phys. Rev. Lett. 75, 1260 (1995)
44. J. Polchinski: String Theory (Cambridge University Press, Cambridge, 1998)
45. D. Kutasov: Phys. Scripta T117, 99 (2005)
46. R. Brustein, S. Foffa, R. Sturani: Phys. Lett. B 471, 352 (2000)
47. R. Brustein, G. Veneziano: Phys. Lett. B 329, 429 (1994); N. Kaloper, R. Madden, K. A. Olive: Nucl. Phys. B 452, 677 (1995)
48. R. Brustein, R. Madden: Phys. Lett. B 410, 110 (1997); R. Brustein, R. Madden: Phys. Rev. D 57, 712 (1998)
49. S. Foffa, M. Maggiore, R. Sturani: Nucl. Phys. B 552, 395 (1999)
50. R. Brustein, R. Madden: JHEP 9907, 006 (1999)
51. R. Madden, private communication
52. T. Jacobson: “Introduction to quantum fields in curved spacetime and the Hawking effect”, arXiv:gr-qc/0308048
53. G. ’t Hooft: Nucl. Phys. B 256, 727 (1985)
54. G. ’t Hooft: Int. J. Mod. Phys. A 11, 4623 (1996)
55. L. Susskind, L. Thorlacius, J. Uglum: Phys. Rev. D 48, 3743 (1993)
56. L. Bombelli, R. K. Koul, J. H. Lee, R. D. Sorkin: Phys. Rev. D 34, 373 (1986)
57. W. Israel: Phys. Lett. A 57, 107 (1976)
58. Y. Takahasi, H. Umezawa: Collect. Phenom. 2, 55 (1975)
59. R. M. Wald: Living Rev. Rel. 4, 6 (2001) [arXiv:gr-qc/9912119]
60. D. Marolf: “On the quantum width of a black hole horizon”, arXiv:hep-th/0312059
61. D. Marolf: “A few words on entropy, thermodynamics, and horizons”, arXiv:hep-th/0410168
62. J. M. Maldacena: JHEP 0304, 021 (2003)
63. R. Brustein, M. B. Einhorn, A. Yarom: JHEP 0601, 098 (2006)
64. R. Brustein, M. B. Einhorn, A. Yarom: “Entanglement and nonunitary evolution”, arXiv:hep-th/0609075
Extremal Black Holes in Supergravity∗

L. Andrianopoli¹, R. D’Auria², S. Ferrara³ and M. Trigiante⁴

¹ “Centro Enrico Fermi”, Compendio Viminale, Via Panisperna 89/A, I-00184
Rome, Italy,
Dipartimento di Fisica, Politecnico di Torino, Corso Duca degli Abruzzi 24,
I-10129 Turin, Italy,
Istituto Nazionale di Fisica Nucleare (INFN) Sezione di Torino, Turin, Italy, and
CERN PH-TH Division, CH 1211 Geneva 23, Switzerland
² Dipartimento di Fisica, Politecnico di Torino, Corso Duca degli Abruzzi 24,
I-10129 Turin, Italy, and Istituto Nazionale di Fisica Nucleare Sezione di Torino,
Turin, Italy
³ CERN PH-TH Division, CH 1211 Geneva 23, Switzerland, and Istituto
Nazionale di Fisica Nucleare, Laboratori Nazionali di Frascati, Frascati, Italy
⁴ Dipartimento di Fisica, Politecnico di Torino, Corso Duca degli Abruzzi 24,
I-10129 Turin, Italy, and Istituto Nazionale di Fisica Nucleare Sezione di Torino,
Turin, Italy

Abstract. We present the main features of the physics of extremal black holes
embedded in supersymmetric theories of gravitation, with a detailed analysis of the
attractor mechanism for BPS and non-BPS black-hole solutions in four dimensions.

1 Introduction: Extremal Black Holes from Classical General Relativity to String Theory

The physics of black holes [1], with its theoretical and phenomenological
implications, has a fertile impact on many branches of natural science, such as
astrophysics, cosmology, particle physics and, more recently, mathematical
physics [2] and quantum information theory [3]. This is not so astonishing in
view of the fact that, owing to the singularity theorems of Penrose and
Hawking [4], the existence of black holes seems to be an unavoidable consequence of
Einstein’s theory of general relativity and of its modern generalizations such
as supergravity [5], superstrings, and M-theory [6].

∗ One of the authors (Sergio Ferrara) has explored with Gabriele Veneziano the
role of “duality” in superstring inspired effective Lagrangians. The same duality
plays a central role in the physics of black holes presented in this article.

L. Andrianopoli et al.: Extremal Black Holes in Supergravity, Lect. Notes Phys. 737, 661–727
(2008)
DOI 10.1007/978-3-540-74233-6_22 © Springer-Verlag Berlin Heidelberg 2008

A fascinating aspect of black-hole physics is in their thermodynamic
properties that seem to encode fundamental insights of a not yet established
final theory of quantum gravity. In this context a central role is played by the
Bekenstein–Hawking (in the following, B–H) entropy formula [7]:

$$S_{\text{B–H}} = \frac{k_B}{\ell_P^2}\, \frac{1}{4}\, \mathrm{Area}_H, \tag{1}$$

where $k_B$ is the Boltzmann constant, $\ell_P^2 = G\hbar/c^3$ is the squared Planck length
while $\mathrm{Area}_H$ denotes the area of the horizon surface (from now on we shall
use the natural units $\hbar = c = G = k_B = 1$).
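To get a feeling for the size of the entropy in formula (1), one can restore SI units. The sketch below is illustrative only: it assumes a Schwarzschild black hole, whose horizon radius is $2GM/c^2$, and uses rounded values of the physical constants and of the solar mass; the function names are hypothetical.

```python
import math

# Rounded SI values, used here only for an order-of-magnitude estimate.
G = 6.67430e-11         # Newton constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
SOLAR_MASS = 1.98847e30  # kg


def planck_length_squared():
    """l_P^2 = G hbar / c^3 in m^2."""
    return G * HBAR / C ** 3


def schwarzschild_horizon_area(mass):
    """Area 4 pi r_s^2 with r_s = 2 G M / c^2 (Schwarzschild case assumed)."""
    r_s = 2.0 * G * mass / C ** 2
    return 4.0 * math.pi * r_s ** 2


def bh_entropy_over_kb(area):
    """S / k_B = Area / (4 l_P^2), formula (1)."""
    return area / (4.0 * planck_length_squared())


entropy = bh_entropy_over_kb(schwarzschild_horizon_area(SOLAR_MASS))
print(f"S / k_B for one solar mass: {entropy:.3e}")
```

For a solar-mass black hole the result is of order $10^{77}$, which gives a sense of how many microstates a statistical explanation of the area law has to account for.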
This relation between a thermodynamic quantity (SB–H ) and a geomet-
ric quantity (AreaH ) is a puzzling aspect that motivated much theoretical
work in the last decades. In fact, a microscopic statistical explanation of
the area/entropy formula, related to microstate counting, has been regarded
as possible only within a consistent and satisfactory formulation of quan-
tum gravity. Superstring theory is the most serious candidate for a theory of
quantum gravity and, as such, should eventually provide such a microscopic
explanation of the area law [8]. Since black holes are a typical non-perturbative
phenomenon, perturbative string theory could say very little about their en-
tropy: only non-perturbative string theory could have a handle on it. Progress
in this direction came after 1995 [9], through the recognition of the role of
string dualities. These dualities allow one to relate the strong coupling regime
of one superstring model to the weak coupling regime of another. Interestingly
enough, there is evidence that the (perturbative and non-perturbative) string
dualities are all encoded in the global symmetry group (the U -duality group)
of the low-energy supergravity effective action [10].
Let us introduce a particular class of black-hole solutions, which will be
particularly relevant to our discussion: the extremal black holes. The simplest
instance of these solutions may be found within the class of the so-called
Reissner–Nordström (R–N) space-time [11], whose metric describes a static,
isotropic black hole of mass M and electric (or magnetic) charge Q:

$$ds^2 = dt^2 \left( 1 - \frac{2M}{\rho} + \frac{Q^2}{\rho^2} \right) - d\rho^2 \left( 1 - \frac{2M}{\rho} + \frac{Q^2}{\rho^2} \right)^{-1} - \rho^2 d\Omega^2, \tag{2}$$

where $d\Omega^2 = (d\theta^2 + \sin^2\theta\, d\phi^2)$ is the metric on a 2-sphere. The metric (2)
admits two Killing horizons, where the norm of the Killing vector $\frac{\partial}{\partial t}$ changes
sign. The horizons are located at the two roots of the quadratic polynomial
$\Delta \equiv -2M\rho + Q^2 + \rho^2$:

$$\rho_\pm = M \pm \sqrt{M^2 - Q^2}. \tag{3}$$
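The horizon positions (3) can be evaluated directly. The following sketch is a minimal illustration in natural units, with hypothetical function names; it returns no horizons when $M < |Q|$.

```python
import math


def delta(rho, M, Q):
    """Quadratic polynomial Delta = rho^2 - 2 M rho + Q^2."""
    return rho * rho - 2.0 * M * rho + Q * Q


def rn_horizons(M, Q):
    """Return (rho_plus, rho_minus) from (3), or None when M < |Q|."""
    discriminant = M * M - Q * Q
    if discriminant < 0.0:
        return None   # no real roots: no horizons
    root = math.sqrt(discriminant)
    return M + root, M - root


print(rn_horizons(1.0, 0.6))   # two distinct horizons
print(rn_horizons(1.0, 1.0))   # the two roots coincide
print(rn_horizons(1.0, 1.5))   # no horizon
```

The three calls cover the non-extremal case, the case $M = |Q|$ in which the two roots coincide, and the case without horizons.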
If $M < |Q|$ the two horizons disappear and we have a naked singularity. In
classical general relativity people have postulated the so-called cosmic
censorship conjecture [5, 12]: space–time singularities should always be hidden
inside a horizon. This conjecture implies, in the R–N case, the bound:

$$M \ge |Q|. \tag{4}$$

Of particular interest are the states that saturate the bound (4). If

$$M = |Q|, \tag{5}$$

the two horizons coincide and, setting $\rho = r + M$ (where $r^2 = \vec{x} \cdot \vec{x}$), the
metric (2) can be rewritten as

$$ds^2 = dt^2 \left( 1 + \frac{Q}{r} \right)^{-2} - \left( 1 + \frac{Q}{r} \right)^{2} \left( dr^2 + r^2 d\Omega^2 \right) = H^{-2}(r)\, dt^2 - H^2(r)\, d\vec{x} \cdot d\vec{x} \tag{6}$$

in terms of the harmonic function

$$H(r) = 1 + \frac{Q}{r}. \tag{7}$$
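The change of variable $\rho = r + M$ can be verified numerically. The sketch below compares the $dt^2$ coefficient of (2) at $M = Q > 0$ with $H^{-2}(r)$ from (7), and then looks at small $r$, where $H^{-2}(r)$ approaches $r^2/Q^2$. It takes $Q > 0$ for simplicity, and the function names are hypothetical.

```python
import math


def rn_gtt(rho, M, Q):
    """Coefficient of dt^2 in metric (2)."""
    return 1.0 - 2.0 * M / rho + (Q * Q) / (rho * rho)


def harmonic_function(r, Q):
    """H(r) = 1 + Q/r, equation (7)."""
    return 1.0 + Q / r


Q = 1.0
M = Q   # extremal case, with Q > 0 assumed
for r in (0.01, 0.5, 3.0, 100.0):
    lhs = rn_gtt(r + M, M, Q)
    rhs = harmonic_function(r, Q) ** -2
    print(r, lhs, rhs)

# Near the horizon r -> 0 the dt^2 coefficient approaches r^2 / Q^2.
r_small = 1e-3
print(harmonic_function(r_small, Q) ** -2, r_small ** 2 / Q ** 2)
```

The near-horizon comparison is only approximate at finite $r$; the two numbers differ by a relative amount of order $r/Q$.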

As (6) shows, the extremal R–N configuration may be regarded as a soliton
of classical general relativity, interpolating between two vacua of the theory:
the flat Minkowski space–time, asymptotically reached at spatial infinity
$r \to \infty$, and the Bertotti–Robinson (B–R) metric [13], describing the
conformally flat geometry $AdS_2 \times S^2$ near the horizon $r \to 0$ [5]:

$$ds^2_{\text{B–R}} = \frac{r^2}{M_{\text{B–R}}^2}\, dt^2 - \frac{M_{\text{B–R}}^2}{r^2} \left( dr^2 + r^2 d\Omega^2 \right). \tag{8}$$

Last, let us note that the condition $M = |Q|$ can be regarded as a no-force
condition between the gravitational attraction $F_g = \frac{M}{r^2}$ and the electric repulsion
$F_q = -\frac{Q}{r^2}$ on a unit mass carrying a unit charge.
Until now we have reviewed the concept of extremal black holes as it
arises in classical general relativity. However, extremal black-hole configura-
tions are embedded in a natural way in supergravity theories. Indeed super-
gravity, being invariant under local super-Poincaré transformations, includes
general relativity, i.e. it describes gravitation coupled to other fields in a su-
persymmetric framework. Therefore, it admits black holes among its classical
solutions. Moreover, as black holes describe a physical regime where the
gravitational field is very strong, a complete understanding of their physics seems
to require a theory of quantum gravity, such as superstring theory. In this
respect, as anticipated above, extremal black holes have become objects of
the utmost relevance in the context of superstrings after 1995 [8, 6, 5, 14].
This interest, which is just part of a more general interest in the p-brane
classical solutions of supergravity theories in all dimensions 4 ≤ D ≤ 11 [15, 16],
stems from the interpretation of the classical solutions of supergravity that
preserve a fraction of the original supersymmetries as non-perturbative states,
necessary to complete the perturbative string spectrum and make it invariant
under the many conjectured duality symmetries [10, 17, 18, 19, 20]. Extremal
black holes and their parent p-branes in higher dimensions are then viewed
as additional particle-like states that compose the spectrum of a fundamental
quantum theory. Like the monopoles in gauge theories, these non-perturbative
quantum states originate from regular solutions of the classical field equations,
the same Einstein equations one deals with in classical general relativity and
astrophysics. The essential new ingredient, in this respect, is supersymmetry,
which requires the presence of vector fields and scalar fields in appropriate
proportions. Hence the black holes we are going to discuss are solutions of
generalized Einstein–Maxwell–dilaton equations.
Within the superstring framework, supergravity provides an effective
description that holds at lowest order in the string loop expansion and in
the limit in which the space–time curvature is much smaller than the typical
string scale (string tension). The supergravity description of extremal black
holes is therefore reliable when the radius of the horizon is much larger than
the string scale, and this corresponds to the limit of large charges. Superstring
corrections induce higher derivative terms in the low-energy action and there-
fore the B–H entropy formula is expected to be corrected as well by terms
which are subleading in the small curvature limit. In this paper we will not
consider these higher derivative effects.
Thinking of a black-hole configuration as a particular bosonic background
of an N -extended locally supersymmetric theory gives a simple and natural
understanding of the cosmic censorship conjecture. Indeed, in theories with
extended supersymmetry (N ≥ 2) the bound (4) is just a consequence of the
supersymmetry algebra, and this ensures that in these theories the cosmic
censorship conjecture is always verified, that is there are no naked singular-
ities. When the black hole is embedded in extended supergravity, the model
depends in general also on scalar fields. In this case, as we will see, the electric
charge Q has to be replaced by the maximum eigenvalue of the central charge
appearing in the supersymmetry algebra (depending on the expectation value
of scalar fields and on the electric and magnetic charges). The R–N metric
takes in general a more complicated form.
However, extremal black holes have a peculiar feature: even when the dynamics
depends on scalar fields, the event horizon loses all information about
the scalars; this holds whether or not the solution preserves any
supersymmetry. Thus, as will be discussed extensively in Sect. 4, even if the
extremal black hole is coupled to scalar fields, the near-horizon
geometry is still described by a conformally flat, B–R-type geometry, with a
mass parameter $M_{\text{B–R}}$ depending on the given configuration of electric and
magnetic charges, but not on the scalars. The horizon is in fact an attractor
point [21, 22, 23]: scalar fields, independently of their boundary conditions at
spatial infinity, when approaching the horizon flow to a fixed point given by a
certain ratio of electric and magnetic charges. This may be understood in the
context of Hawking theory. Indeed quantum black holes are not stable: they
emit thermal radiation as a black body, and correspondingly lose their
energy (mass). The only stable black-hole configurations are the extremal
ones, because they have the minimal possible energy compatible with relation
(4) and so they cannot radiate. Indeed, physically they represent the limit
case in which the black-hole temperature, measured by the surface gravity at
the horizon, is sent to zero.
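
As a minimal numerical sketch of this limit, the snippet below evaluates the Hawking temperature of a Reissner–Nordström black hole from its surface gravity. It assumes geometric units (G = c = ħ = k_B = 1), a charge normalization in which the extremality bound reads M ≥ |Q|, and the standard textbook expressions r_± = M ± sqrt(M² − Q²) and κ = (r_+ − r_−)/(2 r_+²). The function name `rn_temperature` is illustrative and not part of the original text.

```python
import math

def rn_temperature(mass: float, charge: float) -> float:
    """Hawking temperature T = kappa / (2 pi) of a Reissner-Nordstrom black hole.

    Assumes geometric units and the bound M >= |Q|; raises if the bound fails.
    """
    if mass < abs(charge):
        raise ValueError("M < |Q|: no horizon (naked singularity)")
    root = math.sqrt(mass**2 - charge**2)
    r_plus = mass + root    # outer (event) horizon radius
    r_minus = mass - root   # inner horizon radius
    kappa = (r_plus - r_minus) / (2.0 * r_plus**2)  # surface gravity
    return kappa / (2.0 * math.pi)

# The temperature decreases to zero as |Q| approaches the extremal value M.
temperatures = [rn_temperature(1.0, q) for q in (0.0, 0.9, 0.999, 1.0)]
```

For Q = 0 the function reduces to the Schwarzschild value 1/(8πM), and at |Q| = M the two horizons coincide, so the surface gravity and hence the temperature vanish.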
Remembering now that the black-hole entropy is given by the area/entropy
B–H relation (1), we see that the entropy of extremal black holes is a topolog-
ical quantity, in the sense that it is fixed in terms of the quantized electric and
magnetic charges, while it does not depend on continuous parameters such as
scalars. The horizon mass parameter MB–R turns out to be given in this case
(extremal configurations) by the maximum eigenvalue Zmax of the central
charge appearing in the supersymmetry algebra, evaluated at the fixed point:
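\[
M_{\text{B–R}} = M_{\text{B–R}}(p, q) = |Z_{\max}(\phi_{\text{fix}}, p, q)| ; \tag{9}
\]
this gives, for the B–H entropy:
\[
S_{\text{B–H}} = \frac{A_{\text{B–R}}(p, q)}{4} = \pi\, |Z_{\max}(\phi_{\text{fix}}, p, q)|^2 . \tag{10}
\]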
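
Relations (9) and (10) can be turned into a short numerical sketch. The function names and the sample value of the central charge are hypothetical; the only input taken from the text is S = A/4 = π|Z_max|² evaluated at the fixed point.

```python
import math

def bh_entropy(z_max_fixed: complex) -> float:
    """Bekenstein-Hawking entropy S = pi |Z_max|^2 at the attractor point, cf. (10)."""
    return math.pi * abs(z_max_fixed) ** 2

def horizon_area(z_max_fixed: complex) -> float:
    """Horizon area A = 4 S, from the area/entropy relation S = A / 4."""
    return 4.0 * bh_entropy(z_max_fixed)

# Illustrative value only: a central charge with |Z| = 5 at the fixed point.
z_sample = 3.0 + 4.0j
entropy = bh_entropy(z_sample)
# Rescaling Z by a factor 2 multiplies the entropy by 4 (degree-2 homogeneity).
entropy_rescaled = bh_entropy(2.0 * z_sample)
```

Only the modulus of the central charge enters, so its phase drops out of the entropy.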

A lot of effort was made in the course of the years to give an explanation
for the topological entropy of extremal black holes in the context of a quan-
tum theory of gravity, such as string theory. A particularly interesting problem
is finding a microscopic, statistical mechanics interpretation of this thermo-
dynamic quantity. Although we will not deal with the microscopic point of
view at all in this paper, it is important to mention that such an interpre-
tation became possible after the introduction of D-branes in the context of
string theory [8, 24]. Following this approach, extremal black holes are inter-
preted as bound states of D-branes in a space–time compactified to four or five
dimensions, and the different microstates contributing to the B–H entropy are,
for instance, related to the different ways of wrapping branes in the internal
directions. Let us mention that all calculations made in particular cases us-
ing this approach provided values for the B–H entropy compatible with those
obtained with the supergravity, macroscopic techniques. The entropy formula
turns out to be in all cases a U -duality-invariant expression (homogeneous of
degree 2) built out of electric and magnetic charges and as such it can be in
fact also computed through certain (moduli-independent) topological quanti-
ties [25], which only depend on the nature of the U -duality groups and the
appropriate representations of electric and magnetic charges [26]. We mention
for completeness that, as previously pointed out, superstring corrections that
take into account higher derivative effects determine a deviation from the area
law for the entropy [27, 28]. Recently, a deeper insight into the microscopic
description of black-hole entropy was gained, in this case, from the fruitful
proposal in [29], describing the microscopic degrees of freedom of black holes
in terms of topological strings.
Originally, the attention was mainly devoted to the so-called BPS-extremal
black holes, i.e. to solutions which saturate the bound in (5). From an abstract
viewpoint BPS-saturated states are characterized by the fact that they pre-
serve a fraction, 1/2 or 1/4 or 1/8, of the original supersymmetries. What this
actually means is that there is a suitable projection operator $S^2 = S$ acting
on the supersymmetry charge $Q_{\mathrm{SUSY}}$, such that:
\[
(S \cdot Q_{\mathrm{SUSY}})\, |\text{BPS state}\rangle = 0 . \tag{11}
\]
Since the supersymmetry transformation rules of any supersymmetric field
theory are linear in the first derivatives of the fields, (11) is actually a system
of first-order differential equations, to be combined with the second-order field
equations of the theory. Translating (11) into an explicit first-order differen-
tial system requires knowledge of the supersymmetry transformation rules of
supergravity. The latter have a rich geometric structure whose analysis will
be the subject of Sect. 3. The BPS saturation condition transfers the geo-
metric structure of supergravity, associated with its scalar sector, into the
physics of extremal black holes. We note that first-order differential equations
$\frac{d\Phi}{dr} = f(\Phi)$ have in general fixed points, corresponding to the values of $r$ for
which $f(\Phi) = 0$. For the BPS black holes, the fixed point is reached precisely
at the black-hole horizon, and this is how the attractor behavior is realized
for this class of extremal black holes.
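
The fixed-point behavior of a first-order flow can be illustrated with a toy example. The sketch below is not the supergravity flow equation: it assumes a hypothetical linear right-hand side f(Φ) = −(Φ − Φ_fix) and a hypothetical radial parameter τ that grows without bound as the horizon is approached. It only shows that very different values at "infinity" are driven to the same value.

```python
def flow_to_horizon(phi_infinity, phi_fixed=0.5, steps=4000, d_tau=0.01):
    """Euler-integrate the toy flow dPhi/dtau = -(Phi - phi_fixed).

    tau is an assumed radial parameter increasing towards the horizon;
    phi_fixed plays the role of the attractor value.
    """
    phi = phi_infinity
    for _ in range(steps):
        phi += d_tau * (-(phi - phi_fixed))
    return phi

# Four different boundary values at infinity, one common value at the horizon.
horizon_values = [flow_to_horizon(phi0) for phi0 in (-3.0, 0.0, 2.0, 10.0)]
```

In this toy model the memory of the initial condition decays exponentially in τ, which is the qualitative content of the attractor statement.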
For BPS configurations, non-renormalization theorems based on super-
symmetry guarantee the validity of the (BPS) bound M = |Q| beyond the
perturbative regime: if the bound is saturated in the classical theory, the
same must be true also when quantum corrections are taken into account
and the theory is in a regime where the supergravity approximation breaks
down. That the BPS state is actually an exact state of non-perturbative string theory
follows from supersymmetry representation theory. The classical BPS state is
by definition an element of a short supermultiplet and, if supersymmetry is
unbroken, it cannot be renormalized to a long supermultiplet. For this class
of extremal black holes, an accurate agreement between the macroscopic and
microscopic calculations was found. For example, in the N = 8 theory the en-
tropy was shown to correspond to the unique quartic E7(7) -invariant built in
terms of the 56-dimensional representation. Actually, topological U -invariants
constructed in terms of the (moduli dependent) central charges and matter
charges can be derived for all N ≥ 2 theories; they can be shown, as expected,
to coincide with the squared ADM mass at fixed scalars.
Quite recently it has been recognized that the attractor mechanism,
which is responsible for the area/entropy relation, has a larger application
[30, 31, 32, 33, 34, 35, 36, 37] beyond the BPS cases, being a peculiarity of
all extremal black-hole configurations, BPS or not. The common feature is
that extremal black-hole configurations always belong to some representation
of supersymmetry, as will be surveyed in Sect. 2 (this is not the case for
non-extremal configurations, since the action of supersymmetry generators
cannot be defined for non-zero temperature [38]). Extremal configurations
that completely break supersymmetry will belong to long representations of
supersymmetry.
Even for these more general cases, because of the topological nature of the
extremality condition, the entropy formula turns out to be still given by a
U -duality-invariant expression built out of electric and magnetic charges. We
will report in Sect. 6 on the classification of all extremal solutions (BPS and
non-BPS) of N -extended supergravity in four dimensions.
For all the N -extended theories in four dimensions, the general feature that
allows us to find the B–H entropy as a topological invariant is the presence of
vectors and scalars in the same representation of supersymmetry. This causes
the electric/magnetic duality transformations on the vector field strengths
(which for these theories are embedded into symplectic transformations) to
also act as isometries on the scalar sectors [39].¹ The symplectic structure of
the various σ-models of N -extended supergravity in four dimensions and the
relevant relations involving the charges obeyed by the scalars will be worked
out in Sect. 3.
As a final remark, let us observe that, since the aim of the present review
is to calculate the B–H entropy of extremal black holes, we will only discuss
solutions which have $S_{\text{B–H}} \neq 0$. For this class of solutions, known as large
black holes, the classical area/entropy formula is valid, as it gives the domi-
nant contribution to the black-hole entropy. For these configurations the area
of the horizon is in fact proportional to a duality-invariant expression con-
structed with the electric and magnetic charges, which for these states is not
vanishing [41]. This will prove to be a powerful computational tool and will
be the subject of Sect. 5.2. As we will see in detail in the following sections,
configurations with non-vanishing horizon area in supersymmetric theories
preserve at most four supercharges (N = 1 supersymmetry) in the bulk of
space–time. Black-hole solutions preserving more supercharges do exist, but
they do not correspond to classical attractors since in that case the classical
area/entropy formula vanishes. These configurations are named small black
holes and require, for finding the entropy, a quantum attractor mechanism
taking into account the presence of higher curvature terms [29, 42, 43].
The paper is organized as follows. Sect. 2 treats the supersymmetry
structure of extremal black-hole solutions of supergravity theories, and the
black-hole configurations are described as massive representations of super-
symmetry. In Sect. 3 we briefly review the properties of four-dimensional ex-
tended supergravity related to its global symmetries. A particular emphasis is
given to the general symplectic structure characterizing the moduli spaces of
these theories. The presence of this structure allows the global symmetries of
extended supergravities to be realized as generalized electric–magnetic sym-
plectic duality transformations acting on the electric and magnetic charges

¹ We note that symplectic transformations outside the U-duality group have a
non-trivial action on the solutions, allowing one to bring a BPS configuration to
a non-BPS one [40].

of dyonic solutions (such as black holes). In Sect. 4 we start reviewing extremal
regular black-hole solutions embedded in supergravity and, for the BPS case,
an explicit solution will be found by solving the Killing spinor equations. In
Sect. 5 we give a general overview of extremal and non-extremal solutions
showing how the attractor mechanism comes about in the extremal case only.
Then a general tool for calculating the Bekenstein–Hawking entropy for both
BPS and non-BPS extremal black holes will be given, based on the observation
that the black-hole potential takes a particularly simple form in the super-
gravity case, which is fixed in terms of the geometric properties of the moduli
space of the given theory. Moreover, for theories based on moduli spaces given
by symmetric manifolds G/H, which is the case of all supergravity theories
with N ≥ 3 extended supersymmetry, but also of several N = 2 models,
the BPS and non-BPS black holes are classified by some U -duality-invariant
expressions, depending on the representation of the isometry group G under
which the electric and magnetic charges are classified. Finally in Sect. 6, by
exploiting the supergravity machinery introduced in Sect. 3 and 4, we shall
give a detailed analysis of the attractor solutions for the various theories of
extended supergravity. Section 7 contains some concluding remarks.
Our discussion will be confined to four-dimensional black holes.

2 Extremal Black Holes as Massive Representations of Supersymmetry
We are going to review in the present section the algebraic structure of the
massive representations of supersymmetry, both for short and long multiplets,
in order to pinpoint, for each supergravity theory, the extremal black-hole
configurations corresponding to a given number of preserved supercharges.
The condition of extremality is in fact independent of the supersymmetry
preserved by the solution, the only difference between the supersymmetric and
the non-supersymmetric case being that the configurations preserving some
supercharges correspond to short multiplets, while the configurations which
completely break supersymmetry will instead belong to long representations
of supersymmetry. The highest spin of the configuration² depends on the
number of supercharges of the theory under consideration [44].
As a result of our analysis we find for example, as far as large BPS black-hole
configurations are considered, that for N = 2 theories the highest spin of
the configuration (which in this case is 1/2-BPS) is $J_{\mathrm{MAX}} = 1/2$, for N = 4
theories (1/4-BPS) it is $J_{\mathrm{MAX}} = 3/2$, while for the N = 8 case (1/8-BPS) it is
$J_{\mathrm{MAX}} = 7/2$. On the other hand, 1/2-BPS multiplets have maximum spin
$J_{\mathrm{MAX}} = N/4$ (N = 2, 4, 8) as for massless representations. They are given in
Tables 1–3. The corresponding black holes (for N > 2) have vanishing classical
entropy (small black holes) [25].

² We confine our analysis here to the minimal highest spin allowed for a given
theory.

The long multiplets corresponding to non-BPS extremal black-hole configurations
have $J_{\mathrm{MAX}} = 1$ in the N = 2 theory, $J_{\mathrm{MAX}} = 2$ in the N = 4 theory
and $J_{\mathrm{MAX}} = 4$ in the N = 8 theory. However, as we will see in detail in the
following, for the non-BPS cases we may have solutions with vanishing or
non-vanishing central charge. Since the central charge $Z_{AB}$ is a complex matrix, it
is not invariant under CPT symmetry, but transforms as $Z_{AB} \to \bar{Z}_{AB}$.³ The
representation then depends on the charge of the configuration: if the solution
has vanishing central charge the long multiplet will be neutral (real), while
if the solution has non-zero central charge the long multiplet will be charged
(complex), with a doubled dimension as required for CPT invariance [44].
We have listed in Tables 1–3 all possible massive representations with
highest spin $J_{\mathrm{MAX}} \le 3/2$ for $N \le 8$. The occurrence of long spin 3/2 multiplets
is only possible for N = 3, 2 and of long spin 1 multiplets for N = 2. In N = 1
there is only one type of massive multiplet (long) since there are no central
charges. Its structure is
\[
\left( J_0 + \tfrac{1}{2} \right) ,\ 2\,(J_0) ,\ \left( J_0 - \tfrac{1}{2} \right) ,
\]
except for $J_0 = 0$ where we have $\left( \tfrac{1}{2} \right) ,\ 2\,(0)$.
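
The N = 1 spin content quoted above can be checked with a few lines of code. This is a minimal sketch with hypothetical function names; it encodes only the structure stated in the text and counts 2J + 1 states for each spin J to confirm that bosonic and fermionic states match.

```python
from fractions import Fraction

def n1_massive_multiplet(j0):
    """Spin content {spin: multiplicity} of the N = 1 massive multiplet built on spin j0."""
    j0 = Fraction(j0)
    half = Fraction(1, 2)
    if j0 == 0:
        return {half: 1, Fraction(0): 2}
    return {j0 + half: 1, j0: 2, j0 - half: 1}

def count_states(content):
    """Return (bosonic, fermionic) numbers of states, counting 2J + 1 per spin J."""
    bosons = fermions = 0
    for spin, mult in content.items():
        n_states = mult * int(2 * spin + 1)
        if spin.denominator == 1:   # integer spin
            bosons += n_states
        else:                       # half-integer spin
            fermions += n_states
    return bosons, fermions

spin_one_vacuum = n1_massive_multiplet(1)   # (3/2), 2(1), (1/2)
```

For J_0 = 1 this reproduces the N = 1 row of Table 1, and for J_0 = 1/2 and J_0 = 0 the N = 1 rows of Tables 2 and 3.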
In the tables we will denote the spin states by (J) and the number in
front of them is their multiplicity. In the fundamental multiplet, with spin
$J_0 = 0$ vacuum, the multiplicity of the spin $(N - q - k)/2$ is the dimension
of the k-fold antisymmetric Ω-traceless representation of $USp(2(N - q))$. For
multiplets with $J_0 \neq 0$ one has to make the tensor product of the fundamental
multiplet with the representation of spin $J_0$. We also indicate if the multiplet
is long or short.

2.1 Massive Representations of the Supersymmetry Algebra

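The D = 4 supersymmetry algebra is given by

\[
\left\{ \bar{Q}_{A\alpha} , \bar{Q}_{B\beta} \right\} = - \left( C\, \gamma^{\mu} \right)_{\alpha\beta} P_{\mu}\, \delta_{AB} + i \left( C\, \mathbb{Z}_{AB} \right)_{\alpha\beta} \qquad (A, B = 1, \dots, 2p) , \tag{12}
\]

where the SUSY charges $\bar{Q}_A \equiv Q_A^{\dagger} \gamma_0 = Q_A^{T} C$ are Majorana spinors, $C$ is
the charge conjugation matrix, $P_\mu$ is the four-momentum operator and the
antisymmetric tensor $\mathbb{Z}_{AB}$ is defined as

\[
\mathbb{Z}_{AB} = \mathrm{Re}(Z_{AB}) + i\, \gamma^5\, \mathrm{Im}(Z_{AB}) , \tag{13}
\]

the complex matrix $Z_{AB} = -Z_{BA}$ being the central charge operator. For
the sake of simplicity, we shall suppress the spinorial indices in the formulae.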

³ We use here a different definition of central charge with respect to [44]: $Z_{AB} \to i Z_{AB}$.

Table 1. Massive spin 3/2 multiplets

N   Massive spin 3/2 multiplet                 Long   Short
8   None
6   2 × [(3/2), 6(1), 14(1/2), 14′(0)]         No     q = 3 (1/2 BPS)
5   2 × [(3/2), 6(1), 14(1/2), 14′(0)]         No     q = 2 (2/5 BPS)
4   2 × [(3/2), 6(1), 14(1/2), 14′(0)]         No     q = 1 (1/4 BPS)
4   2 × [(3/2), 4(1), 6(1/2), 4(0)]            No     q = 2 (1/2 BPS)
3   [(3/2), 6(1), 14(1/2), 14′(0)]             Yes    No
3   2 × [(3/2), 4(1), 6(1/2), 4(0)]            No     q = 1 (1/3 BPS)
2   [(3/2), 4(1), 6(1/2), 4(0)]                Yes    No
2   2 × [(3/2), 2(1), (1/2)]                   No     q = 1 (1/2 BPS)
1   [(3/2), 2(1), (1/2)]                       Yes    No

Using the symmetries of the theory, it can always be reduced to normal form
[45]. For N even it reads

Table 2. Massive spin 1 multiplets

N       Massive spin 1 multiplet       Long   Short
8,6,5   None
4       2 × [(1), 4(1/2), 5(0)]        No     q = 2 (1/2 BPS)
3       2 × [(1), 4(1/2), 5(0)]        No     q = 1 (1/3 BPS)
2       [(1), 4(1/2), 5(0)]            Yes    No
2       2 × [(1), 2(1/2), (0)]         No     q = 1 (1/2 BPS)
1       [(1), 2(1/2), (0)]             Yes    No

Table 3. Massive spin 1/2 multiplets

N           Massive spin 1/2 multiplet   Long   Short
8,6,5,4,3   None
2           2 × [(1/2), 2(0)]            No     q = 1 (1/2 BPS)
1           [(1/2), 2(0)]                Yes    No

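\[
Z_{AB} = \begin{pmatrix}
Z_1\, \epsilon & 0 & \dots & 0 \\
0 & Z_2\, \epsilon & \dots & 0 \\
\dots & \dots & \dots & \dots \\
0 & 0 & \dots & Z_p\, \epsilon
\end{pmatrix} , \tag{14}
\]

where $\epsilon$ is the 2 × 2 antisymmetric matrix (every zero is a 2 × 2 zero matrix)
and the p skew eigenvalues $Z_m$ of $Z_{AB}$ are the central charges. For N odd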
the central charge matrix has the same form as in (14) with p = (N − 1)/2,
except for one extra zero row and one extra zero column. Note that it is not
always possible to reduce ZAB to its normal form with real Zm by means
of symmetries of the theory [45]. This is the case in particular of N = 8
supergravity where the SU (8) R-symmetry does not aﬀect the global phase
of the skew-eigenvalues Zm . Therefore, we shall consider the general situation
in which Zm are complex and deﬁne for each of them the spinorial matrices
which will enter the supersymmetry algebra:

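\[
\begin{aligned}
\mathbb{Z}_m &= \mathrm{Re}(Z_m) + i\, \gamma^5\, \mathrm{Im}(Z_m) , \\
\bar{\mathbb{Z}}_m &= \mathrm{Re}(Z_m) - i\, \gamma^5\, \mathrm{Im}(Z_m) , \qquad m = 1, \dots, p .
\end{aligned} \tag{15}
\]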
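
The normal form (14) of the complex central charge matrix can be made concrete with a small sketch. The helper names are hypothetical, the sign convention ε_{12} = +1 is an assumption, and the spinorial dressing by γ^5 in (15) is not modeled. The code builds the block-diagonal antisymmetric matrix for even or odd N and shows that Z Z† is diagonal with each |Z_m|² appearing twice.

```python
EPS = [[0, 1], [-1, 0]]  # 2 x 2 antisymmetric matrix (assumed sign convention)

def central_charge_normal_form(skew_eigenvalues, n_susy):
    """Build Z_AB = diag(Z_1 eps, ..., Z_p eps), with a zero row/column for odd N."""
    p = n_susy // 2
    if len(skew_eigenvalues) != p:
        raise ValueError("expected p = floor(N/2) skew eigenvalues")
    z = [[0j] * n_susy for _ in range(n_susy)]
    for m, z_m in enumerate(skew_eigenvalues):
        for a in range(2):
            for b in range(2):
                z[2 * m + a][2 * m + b] = z_m * EPS[a][b]
    return z

def z_zdagger(z):
    """Matrix product Z Z^dagger."""
    n = len(z)
    return [[sum(z[i][k] * z[j][k].conjugate() for k in range(n)) for j in range(n)]
            for i in range(n)]

# Illustrative N = 5 example with two complex skew eigenvalues.
Z = central_charge_normal_form([1 + 2j, 3j], n_susy=5)
ZZ = z_zdagger(Z)
```

The moduli |Z_m| read off from the diagonal of Z Z† are the quantities that enter the bound discussed next; the phases of the Z_m are not visible in this product.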

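If we identify each index A, B, . . . with the pair of indices

\[
A = (a, m) ; \qquad a, b, \dots = 1, 2 ; \qquad m, n, \dots = 1, \dots, p , \tag{16}
\]

the matrix $\mathbb{Z}_{AB}$ in the normal frame will have the form

\[
\mathbb{Z}_{AB} = \mathbb{Z}_{am,\, bn} = \mathbb{Z}_m\, \delta_{mn}\, \epsilon_{ab} , \tag{17}
\]

and the superalgebra (12) can be rewritten as

\[
\left\{ \bar{Q}_{am} , \bar{Q}_{bn} \right\} = - \left( C\, \gamma^{\mu} \right) P_{\mu}\, \delta_{ab}\, \delta_{mn} + i\, C\, \epsilon_{ab}\, \mathbb{Z}_m\, \delta_{mn} , \tag{18}
\]

where $\epsilon_{ab}$ is the two-dimensional Levi-Civita symbol. Let us consider a generic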
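unit time-like Killing vector $\zeta^\mu$ ($\zeta^\mu \zeta_\mu = 1$), in terms of which we define the
following projectors acting on both the internal (a, m) and Lorentz indices
(α, β) of the spinors:

\[
\begin{aligned}
S^{(\pm)}_{am,\, bn} &= \frac{1}{2} \left( \delta_{ab}\, \delta_{mn} \pm i\, \zeta_\mu \gamma^\mu\, \frac{\bar{\mathbb{Z}}_m}{|Z_m|}\, \delta_{mn}\, \epsilon_{ab} \right) , \\
\tilde{S}^{(\pm)}_{am,\, bn} &= \frac{1}{2} \left( \delta_{ab}\, \delta_{mn} \pm i\, \zeta_\mu \gamma^\mu\, \frac{\mathbb{Z}_m}{|Z_m|}\, \delta_{mn}\, \epsilon_{ab} \right) ,
\end{aligned} \tag{19}
\]

and define the projected supersymmetry generators:

\[
\bar{Q}^{(\pm)} = \bar{Q}\, S^{(\pm)} . \tag{20}
\]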

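The anticommutation relation (18) can be rewritten in the following form:

\[
\left\{ Q^{(\pm)}_{am} , \bar{Q}^{(\pm)}_{bn} \right\} = \tilde{S}^{(\pm)}_{am,\, bn}\, \zeta_\mu \gamma^\mu \left( \zeta_\nu P^\nu \mp |Z_m| \right) . \tag{21}
\]

In the case in which $\zeta^\mu = (1, 0, 0, 0)$ and we are in the rest frame ($P^0 = M$)
the above relation reads

\[
\left\{ Q^{(\pm)}_{am} , Q^{(\pm)\,\dagger}_{bn} \right\} = \tilde{S}^{(\pm)}_{am,\, bn} \left( M \mp |Z_m| \right) . \tag{22}
\]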

Since the left-hand side of (22) is non-negative deﬁnite, we deduce the BPS
bound required by unitarity of the representations

\[
M \ge |Z_m| \qquad \forall\, Z_m ,\ m = 1, \dots, p . \tag{23}
\]

It is an elementary consequence of the supersymmetry algebra and of the
identification between central charges and topological charges [46].
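
The bound (23) is easy to test numerically for a given mass and set of skew eigenvalues. The function below is a hypothetical helper, not taken from the text, and the tolerance parameter is an assumption needed for floating-point input. It returns the number q of eigenvalues that saturate the bound.

```python
def check_bps_bound(mass, central_charges, tol=1e-12):
    """Return the number q of skew eigenvalues saturating M >= |Z_m|, cf. (23).

    Raises ValueError if the bound is violated for some Z_m.
    """
    q = 0
    for z_m in central_charges:
        if mass < abs(z_m) - tol:
            raise ValueError("BPS bound violated: M < |Z_m|")
        if abs(mass - abs(z_m)) <= tol:
            q += 1
    return q

# Illustrative values: |3 + 4i| = 5 saturates the bound, |1| = 1 does not.
q_saturated = check_bps_bound(5.0, [3 + 4j, 1.0])
```

A return value q > 0 signals a state on which some of the projected supercharges can act trivially, while q = 0 corresponds to a long multiplet.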

Massive BPS Multiplets

Suppose that on a given state $|\mathrm{BPS}\rangle$ the BPS bound (23) is saturated by q
of the p eigenvalues $Z_m$:

\[
M = |Z_1| = |Z_2| = \dots = |Z_q| , \qquad q \le p , \tag{24}
\]

then, from (22) we deduce that

\[
Q^{(+)}_{am}\, |\mathrm{BPS}\rangle = 0 , \qquad m = 1, \dots, q , \tag{25}
\]

namely q of the pairs of creation–annihilation operators, which have abelian
anticommutation relations, annihilate the state. The multiplet obtained by
acting on $|\mathrm{BPS}\rangle$ with the remaining supersymmetry generators is said to be
q/N BPS. Note that $q_{\mathrm{MAX}} = N/2$ for N even and $q_{\mathrm{MAX}} = (N - 1)/2$ for N
odd. The $USp(2N)$ symmetry is now reduced to $USp(2(N - q))$. The short
multiplet has the same number of states as a long multiplet of the N − q
supersymmetry algebra. The fundamental multiplet, with J = 0 vacuum,
contains $2 \cdot 2^{2(N - q)}$ states with $J_{\mathrm{MAX}} = (N - q)/2$. Note the doubling due to
CPT invariance. Generic massive short multiplets can be obtained by making
the tensor product with a spin $J_0$ representation of SU(2).
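
The counting rules of this paragraph can be collected in a short sketch. The function name and the dictionary layout are illustrative assumptions; the formulas for the BPS fraction, the highest spin and the number of states are the ones stated above.

```python
from fractions import Fraction

def short_multiplet(n_susy, q):
    """Summary of the fundamental (J = 0 vacuum) q/N-BPS multiplet."""
    q_max = n_susy // 2   # N/2 for N even, (N - 1)/2 for N odd
    if not 1 <= q <= q_max:
        raise ValueError("q must satisfy 1 <= q <= floor(N/2)")
    return {
        "bps_fraction": Fraction(q, n_susy),
        "j_max": Fraction(n_susy - q, 2),
        "states": 2 * 2 ** (2 * (n_susy - q)),   # factor 2 from CPT doubling
        "residual_symmetry": f"USp({2 * (n_susy - q)})",
    }

large_n8 = short_multiplet(8, 1)   # 1/8-BPS multiplet of the N = 8 theory
```

For N = 4 and q = 1 the function gives 128 states, which agrees with twice the sum of (2J + 1) × multiplicity over the corresponding row of Table 1.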
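If we write the infinitesimal generator of a supersymmetry in the form

\[
\bar{Q}_A\, \epsilon_A = \bar{Q}^{(+)}_A\, \epsilon^{(+)}_A + \bar{Q}^{(-)}_A\, \epsilon^{(-)}_A , \tag{26}
\]

the supersymmetries preserved by $|\mathrm{BPS}\rangle$ are parametrized by $\epsilon^{(+)}_{am}$ with $m \le q$
and thus defined by the condition

\[
\epsilon^{(-)}_{am} = S^{(-)}_{am,\, bn}\, \epsilon_{bn} = 0 ; \qquad m, n \le q , \tag{27}
\]
\[
\epsilon_{am} = 0 ; \qquad m > q , \tag{28}
\]

which can be written in terms of Weyl spinors $\epsilon_A$, $\epsilon^A$ in the following form
(here $\epsilon_{ab}$ is the Levi-Civita symbol of (18), while $\epsilon_{am}$ and $\epsilon^{bm}$ are spinor parameters):

\[
\epsilon_{am} = i\, \frac{Z_m}{|Z_m|}\, \zeta_\mu \gamma^\mu\, \epsilon_{ab}\, \epsilon^{bm} = i\, \frac{Z_m}{|Z_m|}\, \epsilon_{ab}\, \gamma^0\, \epsilon^{bm} ; \qquad m \le q , \tag{29}
\]
\[
\epsilon_{am} = 0 ; \qquad m > q . \tag{30}
\]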

If, in a given supergravity theory, the state $|\mathrm{BPS}\rangle$ corresponds to a background
described by a certain configuration of fields, (25) is translated into
the request that the supersymmetry variations of all the fields are zero in
the background. We consider extremal black-hole solutions for which the
supersymmetry variations of the bosonic fields are identically zero. Then the
condition (25) yields a set of first-order differential equations for the bosonic
fields, called “Killing spinor” equations, to be satisfied on the given configuration

\[
0 = \delta\, \text{fermions} = \text{SUSY rule}\, (\text{bosons}, \epsilon_{am}) , \tag{31}
\]

where the supersymmetry transformations are made with respect to the
residual supersymmetry parameter $\epsilon^{(+)}_{am}$ defined by the conditions (30). These
conditions are important in order to be able to recast (31) into differential
equations involving only the bosonic fields of the solution.

Massive Non-BPS Multiplets

Massive multiplets with $Z_m = 0$ or $Z_m \neq 0$ but $M > |Z_m|$ are called long multiplets
or non-BPS states. They are qualitatively the same, the only difference
being that in the first case the supermultiplets are real, while in the second
one the representations must be doubled in order to have CPT invariance,
since $Z_m \to \bar{Z}_m$ under CPT.
In both cases the supersymmetry algebra can be put in a form with 2N
creation and 2N annihilation operators. It shows explicit invariance under
$SU(2) \times USp(2N)$. The vacuum state is now labeled by the spin representation
of SU(2), $|\Omega\rangle_J$. If J = 0 we have the fundamental massive multiplet with $2^{2N}$
states. These are organized in representations of SU(2) with $J_{\mathrm{MAX}} = N/2$.
With respect to $USp(2N)$ the states with fixed $0 < J < N/2$ are arranged in
the (N − 2J)-fold Ω-traceless antisymmetric representation, [N − 2J].
The general multiplet with a spin J vacuum can be obtained by tensoring
the fundamental multiplet with spin J representation of SU(2). The total
number of states is then $(2J + 1) \cdot 2^{2N}$.
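
The state counting for long multiplets can be verified explicitly. The sketch below assumes the standard dimension formula C(2n, k) − C(2n, k − 2) for the k-fold antisymmetric Ω-traceless representation of USp(2n), which is not written out in the text, and extends the rule stated above to the end points J = N/2 (k = 0) and J = 0 (k = N). Function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def usp_traceless_antisym_dim(n, k):
    """Dimension of the k-fold antisymmetric Omega-traceless irrep of USp(2n), 0 <= k <= n."""
    return comb(2 * n, k) - (comb(2 * n, k - 2) if k >= 2 else 0)

def fundamental_long_multiplet(n_susy):
    """Spin content {J: multiplicity} of the long multiplet with J = 0 vacuum.

    Spin J = (N - k)/2 carries the k-fold traceless antisymmetric irrep of USp(2N).
    """
    return {Fraction(n_susy - k, 2): usp_traceless_antisym_dim(n_susy, k)
            for k in range(n_susy + 1)}

def total_states(content, j_vacuum=0):
    """Number of states after tensoring with a spin j_vacuum representation of SU(2)."""
    base = sum(mult * int(2 * spin + 1) for spin, mult in content.items())
    return int(2 * Fraction(j_vacuum) + 1) * base

n2_long = fundamental_long_multiplet(2)   # (1), 4(1/2), 5(0)
```

For N = 2 the content matches the long multiplet of Table 2, and the totals reproduce 2^{2N} for the fundamental multiplet.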

3 The General Form of the Supergravity Action in Four Dimensions and its BPS Configurations
In this section we begin the study of extremal black-hole solutions of extended
supergravity in four space–time dimensions. To this aim we first have to intro-
duce the main features of four-dimensional N -extended supergravities. These
theories contain in the bosonic sector, besides the metric, a number $n_V$ of
vectors and m of (real) scalar fields. The relevant bosonic action is known to
have the following general form:

\[
S = \int \sqrt{-g}\, d^4x \left( - \frac{1}{2} R + \mathrm{Im}\mathcal{N}_{\Lambda\Gamma}\, F^{\Lambda}_{\mu\nu} F^{\Gamma|\mu\nu} + \frac{1}{2 \sqrt{-g}}\, \mathrm{Re}\mathcal{N}_{\Lambda\Gamma}\, \epsilon^{\mu\nu\rho\sigma} F^{\Lambda}_{\mu\nu} F^{\Gamma}_{\rho\sigma} + \frac{1}{2}\, g_{rs}(\Phi)\, \partial_\mu \Phi^r \partial^\mu \Phi^s \right) , \tag{32}
\]

where $g_{rs}(\Phi)$ ($r, s, \dots = 1, \dots, m$) is the scalar metric on the σ-model
described by the scalar manifold $\mathcal{M}_{\mathrm{scalar}}$ of real dimension m and the vector
kinetic matrix $\mathcal{N}_{\Lambda\Sigma}(\Phi)$ is a complex, symmetric, $n_V \times n_V$ matrix depending
on the scalar fields. The number of vectors and scalars, namely $n_V$ and m,
and the geometric properties of the scalar manifold $\mathcal{M}_{\mathrm{scalar}}$ depend on the
number N of supersymmetries and are summarized in Table 4. The imaginary
part ImN of the vector kinetic matrix is negative definite and generalizes
the inverse of the squared coupling constant appearing in ordinary gauge the-
ories while its real part ReN is instead a generalization of the theta-angle
of quantum chromodynamics. In supergravity theories, the kinetic matrix N
is in general not a constant, its components being functions of the scalar
fields. However, in extended supergravity (N ≥ 2) the relation between the
scalar geometry and the kinetic matrix N has a very general and univer-
sal form. Indeed it is related to the solution of a general problem, namely
how to lift the action of the scalar manifold isometries from the scalar to the
vector fields. Such a lift is necessary because of supersymmetry since scalars
and vectors generically belong to the same supermultiplet and must rotate
coherently under symmetry operations. This problem has been solved in a
general (non-supersymmetric) framework in reference [39] by considering the
possible extension of the Dirac electric–magnetic duality to more general the-
ories involving scalars. In the next subsection we review this approach and
in particular we show how enforcing covariance with respect to such duality
rotations leads to a determination of the kinetic matrix N . The structure of
N enters the black-hole equations in a crucial way so that the topological
invariant associated with the hole, that is its entropy, is an invariant of the
group of electro-magnetic duality rotations, the U -duality group.

3.1 Duality Rotations and Symplectic Covariance

Table 4. Scalar manifolds of N > 2 extended supergravities. In the table, n_V stands
for the number of vectors and m for the number of real scalar fields. In all the cases
the duality group G is embedded in Sp(2n_V, R)

N      Duality group G       Isotropy H      M_scalar                                      n_V     m
3      SU(3, n)              SU(3) × U(n)    SU(3, n) / S(U(3) × U(n))                     3 + n   6n
4      SU(1, 1) ⊗ SO(6, n)   U(4) × SO(n)    SU(1, 1)/U(1) ⊗ SO(6, n)/(SO(6) × SO(n))      6 + n   6n + 2
5      SU(1, 5)              U(5)            SU(1, 5) / S(U(1) × U(5))                     10      10
6      SO*(12)               U(6)            SO*(12) / (U(1) × SU(6))                      16      30
7, 8   E7(7)                 SU(8)           E7(7) / SU(8)                                 28      70

Let us review the general structure of an abelian theory of vectors and scalars
displaying covariance under a group of duality rotations. The basic reference
is the 1981 paper by Gaillard and Zumino [39]. A general presentation in
D = 2p dimensions can be found in [47]. Here we fix D = 4.
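We consider a theory of $n_V$ abelian gauge fields $A^\Lambda_\mu$, in a D = 4 space–time
with Lorentz signature (which we take to be mostly minus). They correspond
to a set of $n_V$ differential 1-forms

\[
A^\Lambda \equiv A^\Lambda_\mu\, dx^\mu \qquad (\Lambda = 1, \dots, n_V) . \tag{33}
\]

The corresponding field strengths and their Hodge duals are defined by⁴

\[
\begin{aligned}
F^\Lambda &\equiv d A^\Lambda \equiv F^\Lambda_{\mu\nu}\, dx^\mu \wedge dx^\nu , \\
F^\Lambda_{\mu\nu} &\equiv \frac{1}{2} \left( \partial_\mu A^\Lambda_\nu - \partial_\nu A^\Lambda_\mu \right) , \\
\left( {}^\star F^\Lambda \right)_{\mu\nu} &\equiv \frac{\sqrt{-g}}{2}\, \varepsilon_{\mu\nu\rho\sigma}\, F^{\Lambda|\rho\sigma} .
\end{aligned} \tag{34}
\]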
The dynamics of a system of abelian gauge ﬁelds coupled to scalars in a gravity
theory is encoded in the bosonic action (32).
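Introducing self-dual and antiself-dual combinations

\[
F^{\pm} = \frac{1}{2} \left( F \pm i\, {}^\star F \right) , \qquad {}^\star F^{\pm} = \mp i\, F^{\pm} , \tag{35}
\]

the vector part of the Lagrangian defined by (32) can be rewritten in the form

\[
\mathcal{L}_{\mathrm{vec}} = i \left[ F^{-T}\, \bar{\mathcal{N}}\, F^{-} - F^{+T}\, \mathcal{N}\, F^{+} \right] . \tag{36}
\]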
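
Relations (35) and (36) can be checked numerically in a simplified setting. The sketch below assumes flat space (so that √−g = 1), the mostly-minus metric, the Levi-Civita normalization ε_{0123} = −1, and a single vector field, so that the kinetic matrix reduces to one complex number with negative imaginary part. It compares (36) with Im𝒩 F·F + Re𝒩 F·⋆F, which is how the vector terms of (32) read under these assumptions. All helper names and sample values are hypothetical.

```python
from itertools import permutations

ETA = (1.0, -1.0, -1.0, -1.0)  # flat mostly-minus metric, diagonal entries

def _parity(perm):
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

# Levi-Civita symbol with lower indices, normalized so that eps_{0123} = -1.
EPS4 = {perm: -_parity(perm) for perm in permutations(range(4))}

def hodge(f):
    """(*F)_{mu nu} = (1/2) eps_{mu nu rho sigma} F^{rho sigma} in flat space."""
    dual = [[0j] * 4 for _ in range(4)]
    for (mu, nu, rho, sigma), sign in EPS4.items():
        dual[mu][nu] += 0.5 * sign * ETA[rho] * ETA[sigma] * f[rho][sigma]
    return dual

def contract(f, g):
    """Full contraction F_{mu nu} G^{mu nu}."""
    return sum(ETA[m] * ETA[n] * f[m][n] * g[m][n] for m in range(4) for n in range(4))

def self_dual_parts(f):
    """Return (F^+, F^-) with F^pm = (F pm i *F) / 2, cf. (35)."""
    d = hodge(f)
    plus = [[0.5 * (f[m][n] + 1j * d[m][n]) for n in range(4)] for m in range(4)]
    minus = [[0.5 * (f[m][n] - 1j * d[m][n]) for n in range(4)] for m in range(4)]
    return plus, minus

def antisymmetric(e, b):
    """Antisymmetric F_{mu nu} filled from six sample components."""
    f = [[0.0] * 4 for _ in range(4)]
    pairs = [(0, 1), (0, 2), (0, 3), (2, 3), (3, 1), (1, 2)]
    for (m, n), value in zip(pairs, list(e) + list(b)):
        f[m][n], f[n][m] = value, -value
    return f

F = antisymmetric((0.3, -1.2, 0.7), (2.0, 0.1, -0.4))
F_plus, F_minus = self_dual_parts(F)
N_sample = 0.8 - 1.5j   # single-vector kinetic "matrix", Im N < 0
lhs = 1j * (N_sample.conjugate() * contract(F_minus, F_minus)
            - N_sample * contract(F_plus, F_plus))
rhs = N_sample.imag * contract(F, F) + N_sample.real * contract(F, hodge(F))
```

With Lorentzian signature the double dual satisfies ⋆⋆F = −F, which is what makes the combinations F^± eigenvectors of the Hodge star with eigenvalues ∓i.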

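Introducing further the new tensors

\[
{}^\star G_{\Lambda|\mu\nu} \equiv \frac{1}{2} \frac{\partial \mathcal{L}}{\partial F^{\Lambda}_{\mu\nu}} = \mathrm{Im}\mathcal{N}_{\Lambda\Sigma}\, F^{\Sigma}_{\mu\nu} + \mathrm{Re}\mathcal{N}_{\Lambda\Sigma}\, {}^\star F^{\Sigma}_{\mu\nu} \quad \leftrightarrow \quad G^{\mp}_{\Lambda|\mu\nu} \equiv \mp \frac{i}{2} \frac{\partial \mathcal{L}}{\partial F^{\mp \Lambda}_{\mu\nu}} , \tag{37}
\]

the Bianchi identities and field equations associated with the Lagrangian (32)
can be written as

\[
\nabla^\mu\, {}^\star F^{\Lambda}_{\mu\nu} = 0 , \qquad \nabla^\mu\, {}^\star G_{\Lambda|\mu\nu} = 0 , \tag{38}
\]

or equivalently

\[
\nabla^\mu\, \mathrm{Im} F^{\pm \Lambda}_{\mu\nu} = 0 , \tag{39}
\]
\[
\nabla^\mu\, \mathrm{Im} G^{\pm}_{\Lambda|\mu\nu} = 0 . \tag{40}
\]

⁴ We use, for the ε tensor, the convention: $\varepsilon_{0123} = -1$.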

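This suggests that we introduce the $2 n_V$ column vector

\[
V \equiv \begin{pmatrix} {}^\star F \\ {}^\star G \end{pmatrix} \tag{41}
\]

and that we consider general linear transformations on such a vector

\[
\begin{pmatrix} {}^\star F \\ {}^\star G \end{pmatrix}' = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} {}^\star F \\ {}^\star G \end{pmatrix} . \tag{42}
\]

For any constant matrix $S = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in GL(2 n_V, \mathbb{R})$ the new vector of magnetic
and electric field strengths $V' = S \cdot V$ satisfies the same equations (38)
as the old one. In a condensed notation we can write

\[
\partial V = 0 \iff \partial V' = 0 . \tag{43}
\]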

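Separating the self-dual and antiself-dual parts

\[
F = F^{+} + F^{-} ; \qquad G = G^{+} + G^{-} , \tag{44}
\]

and taking into account that we have

\[
G^{+} = \mathcal{N} F^{+} ; \qquad G^{-} = \bar{\mathcal{N}} F^{-} , \tag{45}
\]

the duality rotation of (42) can be rewritten as

\[
\begin{pmatrix} F^{+} \\ G^{+} \end{pmatrix}' = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} F^{+} \\ \mathcal{N} F^{+} \end{pmatrix} ; \qquad
\begin{pmatrix} F^{-} \\ G^{-} \end{pmatrix}' = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} F^{-} \\ \bar{\mathcal{N}} F^{-} \end{pmatrix} . \tag{46}
\]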

Now, let us note that, since in the system we are considering, (32), the gauge
fields are coupled to the scalar sector via the scalar-dependent kinetic matrix
N , when a duality rotation is performed on the vector field strengths and their
duals, we have to assume that the scalars get transformed correspondingly,
through the action of some diffeomorphism on the scalar manifold Mscalar . In
particular, the kinetic matrix N (Φ) transforms under a duality rotation. Then,
a duality transformation ξ acts in the following way on the supersymmetric
system:

\[
\xi : \begin{cases}
V \to V'^{\mp} = S_\xi\, V^{\mp} \\
\Phi \to \Phi' = \xi(\Phi) \\
\mathcal{N}(\Phi) \to \mathcal{N}'(\xi(\Phi))
\end{cases} \tag{47}
\]

Thus, the transformation laws of the equations of motion and of $\mathcal{N}$, and so
also the matrix $S_\xi$, will be induced by a diffeomorphism of the scalar fields.
Focusing in particular on the first relation in (47), that explicitly reads

\[
\begin{pmatrix} F^{\pm} \\ G^{\pm} \end{pmatrix}' = \begin{pmatrix} A_\xi F^{\pm} + B_\xi G^{\pm} \\ C_\xi F^{\pm} + D_\xi G^{\pm} \end{pmatrix} , \tag{48}
\]

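we note that it contains the magnetic field strength $G^{\mp}_{\Lambda}$ introduced in (37),
which is defined as a variation of the kinetic Lagrangian. Under the transformations
(47) the Lagrangian transforms in the following way (signs are written in the convention of (36) and (37)):

\[
\mathcal{L}' = i \Big[ \left( A_\xi + B_\xi \bar{\mathcal{N}} \right)^{\Lambda}{}_{\Gamma} \left( A_\xi + B_\xi \bar{\mathcal{N}} \right)^{\Sigma}{}_{\Delta}\, \bar{\mathcal{N}}'_{\Lambda\Sigma}(\Phi')\, F^{-\Gamma} F^{-\Delta}
- \left( A_\xi + B_\xi \mathcal{N} \right)^{\Lambda}{}_{\Gamma} \left( A_\xi + B_\xi \mathcal{N} \right)^{\Sigma}{}_{\Delta}\, \mathcal{N}'_{\Lambda\Sigma}(\Phi')\, F^{+\Gamma} F^{+\Delta} \Big] ; \tag{49}
\]

Equations (47) must be consistent with the definition of $G^{\mp}$ as a variation of
the Lagrangian (49)

\[
G'^{+}_{\Lambda} = \left( C_\xi + D_\xi \mathcal{N} \right)_{\Lambda\Sigma} F^{+\Sigma} \equiv \frac{i}{2} \frac{\partial \mathcal{L}'}{\partial F'^{+\Lambda}} = \left( A_\xi + B_\xi \mathcal{N} \right)^{\Delta}{}_{\Sigma}\, \mathcal{N}'_{\Lambda\Delta}\, F^{+\Sigma} \tag{50}
\]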
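that implies

\[
\mathcal{N}'_{\Lambda\Sigma}(\Phi') = \left[ \left( C_\xi + D_\xi \mathcal{N} \right) \cdot \left( A_\xi + B_\xi \mathcal{N} \right)^{-1} \right]_{\Lambda\Sigma} ; \tag{51}
\]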
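
For a single vector field the transformation law (51) reduces to a fractional linear map of one complex number, which makes its group property easy to check. The sketch below assumes n_V = 1, so that the blocks A, B, C, D are real numbers with AD − BC = 1; the function names and sample values are hypothetical.

```python
def transform_kinetic(n, s):
    """N' = (C + D N) / (A + B N) for n_V = 1, with s = (A, B, C, D), cf. (51)."""
    a, b, c, d = s
    return (c + d * n) / (a + b * n)

def compose(s2, s1):
    """2 x 2 matrix product s2 . s1 (s1 acts first on the vector (F, G))."""
    a1, b1, c1, d1 = s1
    a2, b2, c2, d2 = s2
    return (a2 * a1 + b2 * c1, a2 * b1 + b2 * d1,
            c2 * a1 + d2 * c1, c2 * b1 + d2 * d1)

N0 = 0.3 - 2.0j                     # sample value with Im N < 0
SHIFT = (1.0, 0.0, 0.5, 1.0)        # B = 0: shifts Re N by C
EXCHANGE = (0.0, -1.0, 1.0, 0.0)    # A = D = 0: N -> -1/N

N_shifted = transform_kinetic(N0, SHIFT)
N_exchanged = transform_kinetic(N0, EXCHANGE)
N_both = transform_kinetic(N_shifted, EXCHANGE)
```

In this one-vector setting the imaginary part transforms as Im N' = Im N / |A + B N|², so its sign is preserved, and applying two transformations in sequence agrees with applying their matrix product.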

The condition that the matrix $\mathcal{N}$ is symmetric, and that this property must
be true also in the duality transformed system, gives the constraint

\[
S \in Sp(2 n_V, \mathbb{R}) \subset GL(2 n_V, \mathbb{R}) , \tag{52}
\]

that is:

\[
S^T\, \mathbb{C}\, S = \mathbb{C} , \tag{53}
\]

where $\mathbb{C}$ is the symplectic invariant $2 n_V \times 2 n_V$ matrix:

\[
\mathbb{C} = \begin{pmatrix} 0 & -\mathbf{1} \\ \mathbf{1} & 0 \end{pmatrix} . \tag{54}
\]

It is useful to rewrite the symplectic condition (53) in terms of the $n_V \times n_V$
blocks defining S:

\[
A^T C - C^T A = B^T D - D^T B = 0 ; \qquad A^T D - C^T B = \mathbf{1} . \tag{55}
\]
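
The block conditions (55) can be checked directly for candidate matrices. The sketch below uses plain nested lists for the n_V × n_V blocks; the helper names, the tolerance and the sample matrices are assumptions made for illustration.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def is_symplectic(a, b, c, d, tol=1e-12):
    """Check the block conditions (55) for S = [[A, B], [C, D]]."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    at, bt, ct, dt = (transpose(m) for m in (a, b, c, d))
    residuals = [
        sub(matmul(at, c), matmul(ct, a)),                  # A^T C - C^T A = 0
        sub(matmul(bt, d), matmul(dt, b)),                  # B^T D - D^T B = 0
        sub(sub(matmul(at, d), matmul(ct, b)), identity),   # A^T D - C^T B = 1
    ]
    return all(abs(x) <= tol for r in residuals for row in r for x in row)

ONE = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]
MINUS_ONE = [[-1.0, 0.0], [0.0, -1.0]]
SYMMETRIC_C = [[1.0, 2.0], [2.0, 3.0]]
NON_SYMMETRIC_C = [[1.0, 2.0], [0.0, 3.0]]

shift_ok = is_symplectic(ONE, ZERO, SYMMETRIC_C, ONE)
shift_bad = is_symplectic(ONE, ZERO, NON_SYMMETRIC_C, ONE)
exchange_ok = is_symplectic(ZERO, MINUS_ONE, ONE, ZERO)
```

With A = D = 1 and B = 0 the first condition in (55) reduces to the requirement that C be symmetric, which is why the non-symmetric sample fails.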

The above observation has important implications on the scalar manifold
$\mathcal{M}_{\mathrm{scalar}}$. Indeed, it implies that on the scalar manifold the following
homomorphism is defined:

\[
\mathrm{Diff}(\mathcal{M}_{\mathrm{scalar}}) \to Sp(2 n_V, \mathbb{R}) . \tag{56}
\]

In particular, the presence on the manifold of a function of scalars transforming
with a fractional linear transformation under a duality rotation on the
scalars, induces the existence on Mscalar of a linear structure (inherited from
the vectors). As we are going to discuss in detail in Sect. 3.2, this may be
rephrased by saying that the scalar manifold is endowed with a symplectic
bundle. As the transition functions of this bundle are given in terms of the
constant matrix S, the symplectic bundle is flat. In particular, as we will see
in Sect. 3.2, for the N = 2 four-dimensional theory this implies that the scalar
manifold be a special manifold, that is a Kähler–Hodge manifold endowed with
a flat symplectic bundle.
If we are interested in the global symmetries of the theory (i.e. global sym-
metries of the field equations and Bianchi identities) we will need to restrict
the duality transformations, namely the homomorphism in (56), to the isome-
tries of the scalar manifold, which leave the scalar sector of the action invari-
ant. The transformations (47), which are duality symmetries of the system
field equations/Bianchi identities, cannot be extended in general to be sym-
metries of the Lagrangian. The scalar part of the Lagrangian (32) is invariant
under the action of the isometry group of the metric grs , but the vector part
is in general not invariant. The transformed Lagrangian under the action of
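$S \in Sp(2n_V, \mathbb{R})$ can be rewritten:
\[
\begin{aligned}
\mathrm{Im}\,F^{-\Lambda}G^{-}_{\Lambda} \to \mathrm{Im}\,F'^{-\Lambda}G'^{-}_{\Lambda}
= \mathrm{Im}\Big[ & F^{-\Lambda}G^{-}_{\Lambda} + 2\,(C^{T}B)_{\Lambda}{}^{\Sigma}\,F^{-\Lambda}G^{-}_{\Sigma} \\
& + (C^{T}A)_{\Lambda\Sigma}\,F^{-\Lambda}F^{-\Sigma} + (D^{T}B)^{\Lambda\Sigma}\,G^{-}_{\Lambda}G^{-}_{\Sigma} \Big].
\end{aligned}
\tag{57}
\]
It is evident from (57) that only the transformations with B = C = 0 are
symmetries.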
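For the reader's convenience, the step leading to (57) can be spelled out as follows. This is only a sketch in matrix notation, where we suppress the indices, write $F \equiv F^{-\Lambda}$, $G \equiv G^{-}_{\Lambda}$, and use $F' = AF + BG$, $G' = CF + DG$ together with the block conditions (55):

```latex
\begin{aligned}
F'^{T} G' &= (AF + BG)^{T}(CF + DG) \\
          &= F^{T}A^{T}C\,F + F^{T}A^{T}D\,G + G^{T}B^{T}C\,F + G^{T}B^{T}D\,G \\
          &= F^{T}G + 2\,F^{T}(C^{T}B)\,G + F^{T}(C^{T}A)\,F + G^{T}(D^{T}B)\,G .
\end{aligned}
```

In the last line we used $A^{T}D = \mathbb{1} + C^{T}B$, the identity $G^{T}B^{T}C\,F = F^{T}C^{T}B\,G$ (transposition of a scalar), and the symmetry of $A^{T}C$ and $B^{T}D$. Taking the imaginary part reproduces the four terms of (57).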
If C ≠ 0, B = 0 the Lagrangian varies by a topological term
\[
(C^{T}A)_{\Lambda\Sigma}\,F^{\Lambda}_{\mu\nu}\,{}^{\star}F^{\Sigma|\mu\nu} \tag{58}
\]
corresponding to a redefinition of the function $\mathcal{N}_{\Lambda\Sigma}$; such a transformation,
being a total derivative, leaves classical physics invariant, but it is relevant
in the quantum theory. It is a symmetry of the partition function only
if $\Delta\mathcal{N}_{\Lambda\Sigma} = \frac{1}{2}(C^{T}A)$ is an integer multiple of $2\pi$, and this implies that
$S \in Sp(2n_V, \mathbb{Z}) \subset Sp(2n_V, \mathbb{R})$.
For B ≠ 0 neither the action nor the perturbative partition function is invariant.
Let us observe that in this case the transformation law (51) of the kinetic
matrix $\mathcal{N}$ contains the transformation $\mathcal{N} \to -1/\mathcal{N}$, that is, it exchanges the
weak and strong coupling regimes of the theory. One may then think of such
a quantum field theory as being described by a collection of local Lagrangians,
each defined in a local patch. They are all equivalent once one defines for each
of them what is electric and what is magnetic. Duality transformations map
this set of Lagrangians one into the other.
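The two cases just discussed (B = 0 with C ≠ 0, and B ≠ 0) can be illustrated with the transformation law (51) restricted to a single vector field, where the blocks A, B, C, D are numbers with AD − CB = 1. The sketch below is only an illustration: the function name `transform_n` and the sample value of the coupling are our own assumptions.

```python
# Fractional linear action (51) for n_V = 1; names and values are illustrative.

def transform_n(n, a, b, c, d):
    """Return N' = (C + D N) / (A + B N) for scalar blocks with A D - C B = 1."""
    if a * d - c * b != 1:
        raise ValueError("blocks do not satisfy A D - C B = 1")
    return (c + d * n) / (a + b * n)

n0 = complex(0.3, -2.0)  # arbitrary sample value of the kinetic function

# B = 0, C = 1: a constant shift of N (the topological-term case).
shifted = transform_n(n0, 1, 0, 1, 1)

# A = D = 0, B = 1, C = -1: the inversion N -> -1/N.
dual = transform_n(n0, 0, 1, -1, 0)

print(shifted, dual)
```

Since Im N' = Im N / |A + B N|² when AD − CB = 1, the sign of the imaginary part is preserved, while under the inversion a large |Im N| is mapped to a small one, which is the weak/strong exchange mentioned above.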
At this point we observe that the supergravity bosonic Lagrangian (32)
is exactly of the form considered in this section as far as the matter content
is concerned, so that we may apply the above considerations about duality
rotations to the supergravity case. In particular, the U -duality acts in all
theories with N ≥ 2 supersymmetries, where the vector supermultiplets con-
tain both vectors and scalars. For N = 1 supergravity, instead, vectors and
scalars are still present but they are not related by supersymmetry, and as
a consequence they are not related by U -duality rotations, so that the pre-
vious formalism does not necessarily apply. 5 In the next subsection we will
discuss in a geometric framework the structure of the supergravity theories
for N ≥ 2. In particular, for theories whose σ-model is a coset space (which
includes all theories with N > 2) we will give the expression for the kinetic
vector matrix NΛΣ in terms of the Sp(2nV ) coset representatives embedding
the U -duality group. Furthermore we will show that in the N = 2 case, al-
though the σ-model of the scalars is not in general a coset space, yet it may
be treated in a completely analogous way.

3.2 Duality Symmetries and Central Charges

Let us restrict our attention to N -extended supersymmetric theories coupled

to the gravitational field, that is to supergravity theories, whose bosonic action
has been given in (32). For each theory we are going to analyze the group
theoretical structure and to find the expression of the central charges, together
with the properties they obey. As already mentioned, with the exception of the
N = 1 and N = 2 cases, all supergravity theories in four dimensions contain
scalar fields whose kinetic Lagrangians are described by σ-models of the form
G/H (we have summarized these cases in Table 4). We will first examine the
theories with N > 2, extending then the results to the N = 2 case. Here and
in the following, G denotes a non compact group acting as isometry group on
the scalar manifold while H, the isotropy subgroup, is of the form

\[
H = H_{\mathrm{Aut}} \otimes H_{\mathrm{matter}}, \tag{59}
\]
$H_{\mathrm{Aut}}$ being the automorphism group of the supersymmetry algebra while
$H_{\mathrm{matter}}$ is related to the matter multiplets. (Of course $H_{\mathrm{matter}} = \mathbb{1}$ in all
cases where supersymmetric matter does not exist, namely N > 4.)

5 There are however N = 1 models where the scalar moduli space is given by a
special Kähler manifold. This is the case for example for the compactification of
the heterotic theory on Calabi–Yau manifolds.

We will see that in all the theories the fields are in some representation of
the isometry group G of the scalar fields or of its maximal compact subgroup
H. This is just a consequence of the Gaillard–Zumino duality acting on the
two-form field strengths and their duals, discussed in the preceding section.
The scalar manifolds and the automorphism groups of supergravity the-
ories for any D and N can be found in the literature (see for instance
[47, 48, 49, 50]). As it was discussed in the previous section, the group G
acts linearly in a symplectic representation on the electric and magnetic field
strengths appearing in the gravitational and matter multiplets. Here and in
the following the index Λ runs over the dimensions of some representation of
the duality group G. Since consistency of the quantum theory requires the
electric and magnetic charges to satisfy a quantization condition, the true du-
ality symmetry at the quantum level (U -duality), acting on quantized charges,
is a suitable discrete version of the continuous group G [10]. The moduli space
of these theories is $G(\mathbb{Z})\backslash G/H$.
All the properties of the given supergravity theories for N ≥ 3 are com-
pletely fixed in terms of the geometry of G/H, namely in terms of the coset
representatives L satisfying the relation

\[
L(\Phi') = g\,L(\Phi)\,h(g, \Phi), \tag{60}
\]
where $g \in G$, $h \in H$ and $\Phi' = \Phi'(\Phi)$, $\Phi$ being the coordinates of G/H. Note
that the scalar fields in G/H can be assigned, in the linearized theory, to
linear representations $R_H$ of the local isotropy group H so that
$\dim R_H = \dim G - \dim H$ (in the full theory, $R_H$ is the representation which the vielbein of
G/H belongs to).
With any field strength $F^{\Lambda}$ we may associate a magnetic charge $p^{\Lambda}$ and
an electric charge $q_{\Lambda}$ given, respectively, by
\[
p^{\Lambda} = \frac{1}{4\pi}\int_{S^2} F^{\Lambda}, \qquad q_{\Lambda} = \frac{1}{4\pi}\int_{S^2} G_{\Lambda}, \tag{61}
\]
where $S^2$ is a spatial two-sphere in the space–time geometry of the dyonic solution
(for instance, in Minkowski space–time the two-sphere at radial infinity
$S^2_\infty$). Clearly the presence of dyonic solutions requires the Maxwell equations
(38) to be completed by adding corresponding electric and magnetic currents
on the right-hand side. These charges however are not the physical charges of
the interacting theory; these latter can be computed by looking at the trans-
formation laws of the fermion fields, where the physical field strengths appear
dressed with the scalar fields [51, 50]. It is in terms of these interacting dressed
field strengths that the field theory realization of the central charges occurring
in the supersymmetry algebra (12) is given. Indeed, let us first introduce the
central charges: they are associated with the dressed two-form TAB appear-
ing in the supersymmetry transformation law of the gravitino one-form. The
physical graviphoton may be identified from the supersymmetry transforma-
tion law of the gravitino field in the interacting theory, namely:

\[
\delta\psi_{A} = \nabla\epsilon_{A} + \alpha\,T_{AB|\mu\nu}\,\gamma^{a}\gamma^{\mu\nu}\epsilon^{B}\,V_{a} + \cdots. \tag{62}
\]
Here $\nabla$ is the covariant derivative in terms of the space–time spin connection
and the composite connection of the automorphism group $H_{\mathrm{Aut}}$, $\alpha$ is a coefficient
fixed by supersymmetry, $V^{a}$ is the space–time vielbein, $A = 1, \cdots, N$
is the index acted on by the automorphism group. Here and in the follow-
ing the dots denote trilinear fermion terms which are characteristic of any
supersymmetric theory but do not play any role in the following discussion.
The two-form field strength TAB will be constructed by dressing the bare field
strengths F Λ with the coset representative L(Φ) of G/H, Φ denoting a set of
coordinates of G/H.
Note that the same field strength $T_{AB}$ which appears in the gravitino
transformation law is also present in the dilatino transformation law in the
following way:
\[
\delta\chi_{ABC} = P_{ABCD,\ell}\,\partial_{\mu}\phi^{\ell}\,\gamma^{\mu}\epsilon^{D} + \beta\,T_{[AB|\mu\nu}\,\gamma^{\mu\nu}\epsilon_{C]} + \cdots. \tag{63}
\]
Analogously, when vector multiplets are present, the field strengths $T_I$
appearing in the transformation laws of the gaugino fields, which
are named matter vector field strengths, are linear combinations of the field
strengths dressed with a different combination of the scalars:
\[
\delta\lambda_{IA} = i\,P_{IAB,i}\,\partial_{\mu}\Phi^{i}\,\gamma^{\mu}\epsilon^{B} + \gamma\,T_{I|\mu\nu}\,\gamma^{\mu\nu}\epsilon_{A} + \cdots. \tag{64}
\]
Here $P_{ABCD} = P_{ABCD,\ell}\,d\phi^{\ell}$ and $P^{I}_{AB} = P^{I}_{AB,i}\,d\Phi^{i}$ are the vielbein of the
scalar manifolds spanned by the scalar fields of the gravitational and vector
multiplets, respectively (more precise definitions are given below), and $\beta$ and
$\gamma$ are constants fixed by supersymmetry.
In order to give the explicit dependence on scalars of $T_{AB}$, $T_{I}$, it is necessary
to recall from the previous subsection that, according to the Gaillard–Zumino
construction, the isometry group G of the scalar manifold acts on
the vector $(F^{-\Lambda}, G^{-}_{\Lambda})$ (or its complex conjugate) as a subgroup of $Sp(2n_V, \mathbb{R})$
($n_V$ is the number of vector fields) with duality transformations interchanging
electric and magnetic field strengths:
\[
S \begin{pmatrix} F^{-\Lambda} \\ G^{-}_{\Lambda} \end{pmatrix} = \begin{pmatrix} F^{-\Lambda} \\ G^{-}_{\Lambda} \end{pmatrix}'. \tag{65}
\]

Let now L(Φ) be the coset representative of G in the symplectic representation,
namely as a $2n_V \times 2n_V$ matrix belonging to $Sp(2n_V, \mathbb{R})$ and therefore, in
each theory, it can be described in terms of $n_V \times n_V$ blocks $A_L$, $B_L$, $C_L$, $D_L$
satisfying the same relations (55) as the corresponding blocks of the generic
symplectic transformation S.
Since the fermions of supergravity theories transform in a complex representation
of the R-symmetry group $H_{\mathrm{Aut}} \subset G$, it is useful to introduce
a complex basis in the vector space of $Sp(2n_V, \mathbb{R})$, defined by the action of
the following unitary matrix:6
\[
\mathcal{A} = \frac{1}{\sqrt{2}}\begin{pmatrix} \mathbb{1} & i\,\mathbb{1} \\ \mathbb{1} & -i\,\mathbb{1} \end{pmatrix},
\]
and to introduce a new matrix $\mathbf{V}(\Phi)$ obtained by complexifying the right index
of the coset representative L(Φ), so as to make its transformation properties
under right action of H manifest:
\[
\mathbf{V}(\Phi) = \begin{pmatrix} f & \bar f \\ h & \bar h \end{pmatrix} = L(\Phi)\,\mathcal{A}^{\dagger}, \tag{66}
\]
where
\[
f = \frac{1}{\sqrt{2}}(A_L - iB_L); \qquad h = \frac{1}{\sqrt{2}}(C_L - iD_L).
\]
From the properties of L(Φ) as a symplectic matrix, it is easy to derive the
following properties for $\mathbf{V}$:
\[
\mathbf{V}\,\eta\,\mathbf{V}^{\dagger} = -iC; \qquad \mathbf{V}^{\dagger}C\,\mathbf{V} = i\eta, \tag{67}
\]
where the symplectic invariant matrix $C$ and $\eta$ are defined as follows:
\[
C = \begin{pmatrix} 0 & -\mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}; \qquad \eta = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}, \tag{68}
\]
and, as usual, each block is an $n_V \times n_V$ matrix. The above relations imply on
the matrices f and h the following properties:
\[
\begin{aligned}
i(f^{\dagger}h - h^{\dagger}f) &= \mathbb{1}, \\
(f^{t}h - h^{t}f) &= 0.
\end{aligned}
\tag{69}
\]
The $n_V \times n_V$ blocks f, h of $\mathbf{V}$ can be decomposed with respect to the isotropy
group $H_{\mathrm{Aut}} \times H_{\mathrm{matter}}$ as
\[
\begin{aligned}
f &= (f^{\Lambda}_{AB}, \bar f^{\Lambda}_{\bar I}) \equiv (f^{\Lambda}{}_{M}), \\
h &= (h_{\Lambda AB}, \bar h_{\Lambda\bar I}) \equiv (h_{\Lambda M}),
\end{aligned}
\tag{70}
\]
where AB are indices in the antisymmetric representation of $H_{\mathrm{Aut}} = SU(N) \times U(1)$,
I is an index of the fundamental representation of $H_{\mathrm{matter}}$ and $M = (AB, \bar I)$.
Upper SU(N) indices label objects in the complex conjugate representation
of SU(N): $(f^{\Lambda}_{AB})^{*} = \bar f^{\Lambda AB}$, $(f^{\Lambda}_{I})^{*} = \bar f^{\Lambda}_{\bar I} = \bar f^{\Lambda I}$, etc.

6 We adopt here and in the following a condensed notation where $\mathbb{1}$ denotes the $n_V$
dimensional identity matrix $\mathbb{1}^{M}{}_{N} = \delta^{M}_{N}$. In supergravity calculations, the index
M is often decomposed as M = (AB, I), AB = −BA labeling the rank-two
antisymmetric representation of the R-symmetry group $H_{\mathrm{Aut}}$ and I running over
the $H_{\mathrm{matter}}$ representation of the matter fields. We use the convention that the
sum over the antisymmetric couple AB be free and therefore supplemented by a
factor 1/2 in order to avoid repetitions. In particular with these conventions, when
restricted to the AB indices, the identity reads: $\mathbb{1}^{AB}_{CD} \equiv 2\,\delta^{AB}_{CD} = \delta^{A}_{C}\delta^{B}_{D} - \delta^{A}_{D}\delta^{B}_{C}$.

Let us remark that, in order to make contact with the notation used for
the N = 2 case, in the definition (70) some of the entries ($\bar f^{\Lambda}_{\bar I}$ and $\bar h_{\Lambda\bar I}$)
have been written as complex conjugates of other quantities ($f^{\Lambda}_{I}$ and $h_{\Lambda I}$
respectively). In this way, $f^{\Lambda}_{AB}$ and $f^{\Lambda}_{I}$ are characterized by having Kähler
weight of the same sign. Indeed, for all the matter coupled theories (N =
2, 3, 4) we have, as a general feature, that the entries of the blocks f and
h carrying Hmatter indices have a Kähler weight with an opposite sign with
respect to the corresponding entries with HAut indices. This may be seen from
the supersymmetry transformation rules of the supergravity fields, by virtue of
the fact that gravitinos and gauginos with the same chirality have opposite
Kähler weight. We note however that this notation differs from the one in
previous papers, where the upper and lower parts of the symplectic section
were defined instead as $(f^{\Lambda}_{AB}, f^{\Lambda}_{I})$, $(h_{\Lambda AB}, h_{\Lambda I})$.
It is useful to introduce the following quantities:
\[
V_M = (V_{AB}, \bar V_{\bar I}), \quad\text{where:}\quad
V_{AB} \equiv (f^{\Lambda}_{AB}, h_{\Lambda AB}); \qquad V_{I} \equiv (f^{\Lambda}_{I}, h_{\Lambda I}). \tag{71}
\]
The vectors VM are (complex) symplectic sections of a Sp(2nV , R) bundle
over G/H. As anticipated in the previous subsection, this bundle is actually
flat. The real embedding given by L(Φ) is appropriate for duality transforma-
tions of F ± and their duals G± , according to (46), while the complex embed-
ding in the matrix V is appropriate in writing down the fermion transforma-
tion laws and supercovariant field strengths. The kinetic matrix N , according
to Gaillard–Zumino [39], can be written in terms of the sub-blocks f , h, and
turns out to be
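\[
\mathcal{N} = h\,f^{-1}, \qquad \mathcal{N} = \mathcal{N}^{t}, \tag{72}
\]
transforming projectively under $Sp(2n_V, \mathbb{R})$ duality rotations as already shown in
the previous section. By using (69) and (72) we find that
\[
(f^{t})^{-1} = i(\mathcal{N} - \bar{\mathcal{N}})\bar f, \tag{73}
\]
that is
\[
(f^{-1})^{AB}{}_{\Lambda} = i(\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}\,\bar f^{\Sigma AB}, \tag{74}
\]
\[
(f^{-1})^{\bar I}{}_{\Lambda} = i(\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}\,\bar f^{\Sigma\bar I}. \tag{75}
\]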
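The chain (66), (67), (69), (72), (73) can be verified numerically in the simplest setting of one vector field, where Sp(2, R) coincides with SL(2, R) and f, h are complex numbers. The sketch below uses plain Python; the sample matrix `L_EXAMPLE` (any real 2 × 2 matrix of unit determinant would do) and all helper names are our own illustrative assumptions.

```python
import math

# Numerical check of (66)-(73) for n_V = 1; names and sample values are illustrative.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

SQRT2 = math.sqrt(2.0)
A_MATRIX = [[1 / SQRT2, 1j / SQRT2], [1 / SQRT2, -1j / SQRT2]]  # unitary matrix above (66)
C_MATRIX = [[0, -1], [1, 0]]                                    # symplectic form (68)
ETA = [[1, 0], [0, -1]]                                         # eta of (68)

def complex_embedding(l):
    """V = L A^dagger as in (66), for a real 2 x 2 matrix L of unit determinant."""
    return matmul(l, dagger(A_MATRIX))

L_EXAMPLE = [[2.0, 1.0], [3.0, 2.0]]   # arbitrary element of SL(2, R) = Sp(2, R)
V = complex_embedding(L_EXAMPLE)
f, h = V[0][0], V[1][0]                # first column of V
N = h / f                              # kinetic function (72)

print(close(matmul(matmul(V, ETA), dagger(V)), [[0, 1j], [-1j, 0]]))  # V eta V^dagger = -iC
print(abs(1j * (f.conjugate() * h - h.conjugate() * f) - 1) < 1e-12)  # first line of (69)
print(abs(1 / f - 1j * (N - N.conjugate()) * f.conjugate()) < 1e-12)  # relation (73)
```

For one vector field the second line of (69) is trivially satisfied, so only the first line is tested.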
It can be shown [50] that the dressed graviphotons and matter self-dual field
strengths appearing in the transformation law of gravitino (62), dilatino (63)
and gaugino (64) can be constructed as a symplectic invariant using the f and
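h matrices as follows:
\[
\begin{aligned}
T^{-}_{AB} &= -i(\bar f^{-1})_{AB\Lambda}F^{-\Lambda} = f^{\Lambda}_{AB}(\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}F^{-\Sigma} = h_{\Lambda AB}F^{-\Lambda} - f^{\Lambda}_{AB}G^{-}_{\Lambda}, \\
\bar T^{-}_{\bar I} &= -i(\bar f^{-1})_{\bar I\Lambda}F^{-\Lambda} = \bar f^{\Lambda}_{\bar I}(\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}F^{-\Sigma} = \bar h_{\Lambda\bar I}F^{-\Lambda} - \bar f^{\Lambda}_{\bar I}G^{-}_{\Lambda}, \\
\bar T^{+AB} &= (T^{-}_{AB})^{*}, \\
T^{+}_{I} &= (\bar T^{-}_{\bar I})^{*},
\end{aligned}
\tag{76}
\]
(for N > 4, supersymmetry does not allow matter multiplets and $f^{\Lambda}_{I} = 0 = T_{I}$).
To construct the dressed charges one integrates $T_{AB} = T^{+}_{AB} + T^{-}_{AB}$ and
(for N = 3, 4) $\bar T_{\bar I} = \bar T^{+}_{\bar I} + \bar T^{-}_{\bar I}$ on a large two-sphere. For this purpose we note
that
\[
T^{+}_{AB} = h_{\Lambda AB}F^{+\Lambda} - f^{\Lambda}_{AB}G^{+}_{\Lambda} = 0, \tag{77}
\]
\[
\bar T^{+}_{\bar I} = \bar h_{\Lambda\bar I}F^{+\Lambda} - \bar f^{\Lambda}_{\bar I}G^{+}_{\Lambda} = 0, \tag{78}
\]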

as a consequence of (72) and (45). Therefore, we can introduce the central and
matter charges as the dressed charges obtained by integrating the two forms
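$T_{AB}$ and $\bar T_{\bar I}$:
\[
\begin{aligned}
Z_{AB} &= -\frac{1}{4\pi}\int_{S^2} T_{AB} = -\frac{1}{4\pi}\int_{S^2}(T^{+}_{AB} + T^{-}_{AB}) = -\frac{1}{4\pi}\int_{S^2} T^{-}_{AB} \\
&= f^{\Lambda}_{AB}\,q_{\Lambda} - h_{\Lambda AB}\,p^{\Lambda},
\end{aligned}
\tag{79}
\]
\[
\begin{aligned}
\bar Z_{\bar I} &= -\frac{1}{4\pi}\int_{S^2}\bar T_{\bar I} = -\frac{1}{4\pi}\int_{S^2}(\bar T^{+}_{\bar I} + \bar T^{-}_{\bar I}) = -\frac{1}{4\pi}\int_{S^2}\bar T^{-}_{\bar I} \\
&= \bar f^{\Lambda}_{\bar I}\,q_{\Lambda} - \bar h_{\Lambda\bar I}\,p^{\Lambda} \qquad (N \le 4),
\end{aligned}
\tag{80}
\]
where $p^{\Lambda}$ and $q_{\Lambda}$ were defined in (61) and the sections $(f^{\Lambda}, h_{\Lambda})$ on the right-hand
side now depend on the v.e.v.'s $\Phi_\infty \equiv \Phi(r = \infty)$ of the scalar fields $\Phi^{r}$.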
We see that because of the electric–magnetic duality, the central and matter
charges are given in this case by symplectic-invariant expressions.
The scalar field-dependent combinations of field strengths appearing in
the fermion supersymmetry transformation rules have a profound meaning
and, as we are going to see in the following, they play a key role in the
physics of extremal black holes. The integral of the graviphoton $T_{AB\,\mu\nu}$ gives
the value of the central charge $Z_{AB}$ of the supersymmetry algebra, while by
integration of the matter field strengths $T_{I|\mu\nu}$ one obtains the so-called matter
charges $Z_I$.
We are now able to derive some differential relations among the central
and matter charges using the Maurer–Cartan equations obeyed by the scalars
through the embedded coset representative $\mathbf{V}$. Indeed, let $\Gamma = \mathbf{V}^{-1}d\mathbf{V}$ be the
$Sp(2n_V, \mathbb{R})$ Lie algebra left invariant one-form satisfying
\[
d\Gamma + \Gamma\wedge\Gamma = 0. \tag{81}
\]
In terms of (f, h), Γ has the following form:
\[
\Gamma \equiv \mathbf{V}^{-1}d\mathbf{V} =
\begin{pmatrix}
i(f^{\dagger}dh - h^{\dagger}df) & i(f^{\dagger}d\bar h - h^{\dagger}d\bar f) \\
-i(f^{t}dh - h^{t}df) & -i(f^{t}d\bar h - h^{t}d\bar f)
\end{pmatrix}
\equiv
\begin{pmatrix} \Omega^{(H)} & \bar P \\ P & \bar\Omega^{(H)} \end{pmatrix}, \tag{82}
\]
where the $n_V \times n_V$ sub-blocks $\Omega^{(H)}$ and $P$ embed the H connection and
the vielbein of G/H, respectively. This identification follows from the Cartan
decomposition of the $Sp(2n_V, \mathbb{R})$ Lie algebra.

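From (66) and (82), we obtain the ($n_V \times n_V$) matrix equations:
\[
\begin{aligned}
D(\Omega)f &= \bar f\,P, \\
D(\Omega)h &= \bar h\,P,
\end{aligned}
\tag{83}
\]
together with their complex conjugates. Explicitly, if we define the
$H_{\mathrm{Aut}} \times H_{\mathrm{matter}}$-covariant derivative of the $V_M$ vectors, introduced in (71), as
\[
DV_M = dV_M - V_N\,\omega^{N}{}_{M}, \qquad
\omega = \begin{pmatrix} \omega^{AB}{}_{CD} & 0 \\ 0 & \omega^{I}{}_{J} \end{pmatrix}, \tag{84}
\]
we have
\[
\Omega^{(H)} = i\big[f^{\dagger}(Dh + h\omega) - h^{\dagger}(Df + f\omega)\big] = \omega\,\mathbb{1}, \tag{85}
\]
where we have used
\[
Dh = \bar{\mathcal{N}}\,Df; \qquad h = \mathcal{N}f, \tag{86}
\]
which follow from (83) and the fundamental identity (69). Furthermore, using
the same relations, the embedded vielbein P can be written as follows:
\[
P = -i(f^{t}Dh - h^{t}Df) = i f^{t}(\mathcal{N} - \bar{\mathcal{N}})Df. \tag{87}
\]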

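Using further the definition (70) we have
\[
\begin{aligned}
D(\omega)f^{\Lambda}_{AB} &= f^{\Lambda}_{I}\,P^{I}_{AB} + \frac{1}{2}\bar f^{\Lambda CD}P_{ABCD}, \\
D(\omega)\bar f^{\Lambda}_{\bar I} &= \frac{1}{2}\bar f^{\Lambda AB}P_{AB\bar I} + f^{\Lambda\bar J}P_{\bar J\bar I},
\end{aligned}
\tag{88}
\]
where we have decomposed the embedded vielbein P as follows:
\[
P = \begin{pmatrix} P_{ABCD} & P_{AB\bar J} \\ P_{\bar I CD} & P_{\bar I\bar J} \end{pmatrix}, \tag{89}
\]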

the sub-blocks being related to the vielbein of G/H, written in terms of the
indices of $H_{\mathrm{Aut}} \times H_{\mathrm{matter}}$. In particular, the component $P_{ABCD}$ is completely
antisymmetric in its indices. Note that, since f belongs to the unitary matrix
$\mathbf{V}$, we have: $\bar f^{\Lambda M} = \overline{(f^{\Lambda}_{AB}, \bar f^{\Lambda}_{\bar I})} = (\bar f^{\Lambda AB}, f^{\Lambda\bar I})$. Obviously, the same differential
relations that we wrote for f hold true for the dual matrix h as well.
Using the definition of the charges (79) and (80), we then get the following
differential relations among charges:7
\[
\begin{aligned}
D(\omega)Z_{AB} &= Z_{I}\,P^{I}_{AB} + \frac{1}{2}\bar Z^{CD}P_{ABCD}, \\
D(\omega)\bar Z_{\bar I} &= \frac{1}{2}\bar Z^{AB}P_{AB\bar I} + Z^{\bar J}P_{\bar I\bar J}.
\end{aligned}
\tag{90}
\]
7 Here we are using for the matter charges a different notation with respect to [50],
for instance, in that the quantities $Z_I$ correspond in [50] to $\bar Z^{I}$.

Depending on the coset manifold, some of the sub-blocks of (89) can be
actually zero. For example, in N = 3, the vielbein of
$G/H = \frac{SU(3,n)}{SU(3)\times SU(n)\times U(1)}$ [52] is $P_{\bar I AB}$ (AB antisymmetric),
$I = 1, \cdots, n$; $A, B = 1, 2, 3$ and it turns out that $P_{ABCD} = P_{\bar I\bar J} = 0$.
In N = 4, $G/H = \frac{SU(1,1)}{U(1)} \times \frac{O(6,n)}{O(6)\times O(n)}$ [53], and we have
$P_{ABCD} = \epsilon_{ABCD}\,P$, $P_{\bar I\bar J} = P\,\delta_{IJ}$, where P is the Kählerian vielbein of
$\frac{SU(1,1)}{U(1)}$ ($A, \cdots, D$ SU(4) indices and I, J O(n) indices) and $P_{IAB}$ is the
vielbein of $\frac{O(6,n)}{O(6)\times O(n)}$.
For N > 4 (no matter indices) we have that P coincides with the vielbein
$P_{ABCD}$ of the relevant G/H.
For the purpose of comparison of the previous formalism with the N = 2
supergravity case, where the σ-model is in general not a coset, it is interesting
to note that, if the connection $\Omega^{(H)}$ and the vielbein P are regarded as data
of G/H, then the Maurer–Cartan equations (88) can be interpreted as an
integrable system of differential equations for the section $(V_{AB}, \bar V_{\bar I}, \bar V^{AB}, V^{\bar I})$
of the symplectic fiber bundle constructed over G/H. Namely, the integrable
system (84), which we write explicitly in the following equivalent matrix form:
\[
D\begin{pmatrix} V_{AB} \\ \bar V_{\bar I} \\ \bar V^{AB} \\ V^{\bar I} \end{pmatrix}
=
\begin{pmatrix}
0 & 0 & \frac12 P_{ABCD} & P_{AB\bar J} \\
0 & 0 & \frac12 P_{\bar I CD} & P_{\bar I\bar J} \\
\frac12 \bar P^{ABCD} & \bar P^{AB\bar J} & 0 & 0 \\
\frac12 \bar P^{\bar I CD} & \bar P^{\bar I\bar J} & 0 & 0
\end{pmatrix}
\begin{pmatrix} V_{CD} \\ \bar V_{\bar J} \\ \bar V^{CD} \\ V^{\bar J} \end{pmatrix},
\tag{91}
\]
has $2n_V$ solutions given by $V_M$. The integrability condition (81) means that
Γ is a flat connection of the symplectic bundle. In terms of the geometry of
G/H this in turn implies that the $\mathbb{H}$-curvature associated to the connection
$\Omega^{(H)}$ (and hence, since the manifold is a symmetric space, also the Riemannian
curvature) is constant, being proportional to the wedge product of two
vielbeins.
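Furthermore, besides the differential relations (90) the charges also satisfy
sum rules.
The sum rule has the following form:
\[
\frac{1}{2}Z_{AB}\bar Z^{AB} + Z_{I}\bar Z^{I} = -\frac{1}{2}Q^{t}\mathcal{M}(\mathcal{N})Q, \tag{92}
\]
where $C$ is the symplectic metric while $\mathcal{M}(\mathcal{N})$ and $Q$ are
\[
\begin{aligned}
\mathcal{M}(\mathcal{N}) &=
\begin{pmatrix} \mathbb{1} & -\mathrm{Re}\,\mathcal{N} \\ 0 & \mathbb{1} \end{pmatrix}\cdot
\begin{pmatrix} \mathrm{Im}\,\mathcal{N} & 0 \\ 0 & (\mathrm{Im}\,\mathcal{N})^{-1} \end{pmatrix}\cdot
\begin{pmatrix} \mathbb{1} & 0 \\ -\mathrm{Re}\,\mathcal{N} & \mathbb{1} \end{pmatrix} \\
&=
\begin{pmatrix}
\mathrm{Im}\,\mathcal{N} + \mathrm{Re}\,\mathcal{N}\,(\mathrm{Im}\,\mathcal{N})^{-1}\mathrm{Re}\,\mathcal{N} & -\mathrm{Re}\,\mathcal{N}\,(\mathrm{Im}\,\mathcal{N})^{-1} \\
-(\mathrm{Im}\,\mathcal{N})^{-1}\mathrm{Re}\,\mathcal{N} & (\mathrm{Im}\,\mathcal{N})^{-1}
\end{pmatrix}
= C\,\mathbf{V}\mathbf{V}^{\dagger}C,
\end{aligned}
\tag{93}
\]
and
\[
Q = \begin{pmatrix} p^{\Lambda} \\ q_{\Lambda} \end{pmatrix}. \tag{94}
\]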
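This result is obtained from the fundamental identities (69) and from the
definition of $\mathbf{V}$ and of the kinetic matrix given in (66) and (72). Indeed one
can verify that [50, 54]:
\[
\begin{aligned}
f f^{\dagger} &= -i\,(\mathcal{N} - \bar{\mathcal{N}})^{-1}, \\
h h^{\dagger} &= -i\,(\bar{\mathcal{N}}^{-1} - \mathcal{N}^{-1})^{-1} \equiv -i\,\mathcal{N}(\mathcal{N} - \bar{\mathcal{N}})^{-1}\bar{\mathcal{N}}, \\
h f^{\dagger} &= \mathcal{N} f f^{\dagger}, \\
f h^{\dagger} &= f f^{\dagger}\bar{\mathcal{N}},
\end{aligned}
\tag{95}
\]
so that, using the explicit expression for the charges in (79) and (80), (92) is
easily retrieved.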
In the following, studying the applications of these formulas to extremal
black holes, other relations coming from the same identities listed above will
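also be useful, in particular:
\[
\frac{1}{2}(\mathcal{M} + iC) =
\begin{pmatrix} -h h^{\dagger} & h f^{\dagger} \\ f h^{\dagger} & -f f^{\dagger} \end{pmatrix}
= \frac{1}{2}C\,\mathbf{V}(\mathbb{1} + \eta)\mathbf{V}^{\dagger}C
= -(C\,V)_{M}\,(C\,\bar V)^{M}, \tag{96}
\]
\[
\frac{1}{2}(\mathcal{M} + iC)\,V_M = i\,C\,V_M, \tag{97}
\]
\[
\frac{1}{2}(\mathcal{M} - iC)\,V_M = 0, \tag{98}
\]
\[
\mathcal{M}\,Q = C\,\mathbf{V}\mathbf{V}^{\dagger}C\,Q = -2\,\mathrm{Re}\big[C\,V_M \langle Q, \bar V^{M}\rangle\big], \tag{99}
\]
\[
C\,Q = -i\,C\,\mathbf{V}\,\eta\,\mathbf{V}^{\dagger}C\,Q = -2\,\mathrm{Im}\big[C\,V_M \langle Q, \bar V^{M}\rangle\big]. \tag{100}
\]
The symplectic scalar product appearing in (99) and (100) is defined as
\[
\langle V, W\rangle \equiv V^{t}\,C\,W, \tag{101}
\]
moreover $\bar V_M = (V_M)^{*}$. Using (71), (79), and (80) we can use the following
short-hand notation for the central charge vector:
\[
Z_M = (Z_{AB}, \bar Z_{\bar I}) = \langle Q, V_M\rangle. \tag{102}
\]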
From the above expression and from (96), (92) follows.
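As an illustration of the sum rule (92), consider the simplest situation of a single vector field and no matter charges, where f and h are numbers and, with the conventions of footnote 6, the left-hand side of (92) reduces to |Z|². The following sketch in plain Python is only a numerical check under these assumptions: the function names, the real choice of phase for f, and the sample values of N, p, q are ours.

```python
import math

# Sum rule (92) for n_V = 1; names, phase choice and sample values are illustrative.

def sections_from_kinetic(n):
    """Return (f, h) with h = N f and f f* = -1/(2 Im N), cf. (72) and (95).

    Requires Im N < 0; the overall phase of f is chosen real (an assumption).
    """
    if n.imag >= 0:
        raise ValueError("this sketch assumes Im N < 0")
    f = 1.0 / math.sqrt(-2.0 * n.imag)
    return f, n * f

def central_charge(p, q, n):
    """Z = f q - h p, the n_V = 1 version of (79)."""
    f, h = sections_from_kinetic(n)
    return f * q - h * p

def m_matrix(n):
    """The matrix M(N) of (93) for a single vector field."""
    re, im = n.real, n.imag
    return [[im + re * re / im, -re / im],
            [-re / im, 1.0 / im]]

def quadratic_form(p, q, n):
    """Q^t M(N) Q with Q = (p, q) as in (94)."""
    m = m_matrix(n)
    return m[0][0] * p * p + (m[0][1] + m[1][0]) * p * q + m[1][1] * q * q

n_sample = complex(0.5, -2.0)   # arbitrary value with Im N < 0
p, q = 1.0, 2.0                 # arbitrary magnetic and electric charges
z = central_charge(p, q, n_sample)
print(abs(z) ** 2, -0.5 * quadratic_form(p, q, n_sample))
```

Both numbers coincide, because for one vector field each side equals $|q - \mathcal{N}p|^{2}/(-2\,\mathrm{Im}\,\mathcal{N})$.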

3.3 The N = 2 Theory

The formalism we have developed so far for the D = 4, N > 2 theories is
completely determined by the embedding of the coset representative of G/H
in Sp(2n, R) and by the embedded Maurer–Cartan equations (88). We want
now to show that this formalism, and in particular the identities (69), the
differential relations among charges (90) and the sum rules (92) of N = 2
matter-coupled supergravity [55, 56] can be obtained in a way completely
analogous to the N > 2 cases discussed in the previous subsection, where the
σ-model was a coset space. This follows essentially from the fact that, though
the scalar manifold Mscalar of the N = 2 theory is not in general a coset
manifold, nevertheless it has a symplectic structure identical to the N > 2
theories, as a consequence of the Gaillard–Zumino duality.
In the case of N = 2 supergravity, the requirements imposed by super-
symmetry on the scalar manifold Mscalar of the theory dictate that it should
be the following direct product: Mscalar = MSK ⊗ MQ where MSK is
a special Kähler manifold of complex dimension n and MQ a quaternionic
manifold of real dimension 4nH . Note that n and nH are, respectively, the
number of vector multiplets and hypermultiplets contained in the theory. The
direct product structure imposed by supersymmetry precisely reflects the fact
that the quaternionic and special Kähler scalars belong to different super-
multiplets. In the construction of extremal black holes, it turns out that the
hyperscalars are spectators playing no dynamical role. Hence we do not discuss
here the hypermultiplets any further and we confine our attention to an N = 2
supergravity where the graviton multiplet, containing besides the graviton $g_{\mu\nu}$
also a graviphoton $A^{0}_{\mu}$, is coupled to n vector multiplets. Such a theory has
an action of type (32) where the number of gauge fields is $n_V = 1 + n$ and the
number of (real) scalar fields is m = 2n. We shall use capital Greek indices
to label the vector fields: $\Lambda, \Sigma, \cdots = 0, \ldots, n$. To make the action (32) fully
explicit, we need to discuss the geometry of the manifold $\mathcal{M}_{SK}$ spanned by
the vector-multiplet scalars, namely special Kähler geometry. Since $\mathcal{M}_{SK}$ is
in particular a complex manifold, we shall describe the corresponding scalars
as complex fields: $z^{i}, \bar z^{\bar\imath}$, with $i, \bar\imath = 1, \ldots, n$. We refer to [57] for a detailed analysis.
A special Kähler manifold MSK is a Kähler–Hodge manifold endowed with
an extra symplectic structure. A Kähler manifold $\mathcal{M}$ is a Hodge manifold if
and only if there exists a U(1) bundle $\mathcal{L} \to \mathcal{M}$ such that its first Chern
class equals the cohomology class of the Kähler two-form K:
\[
c_1(\mathcal{L}) = [K]. \tag{103}
\]
In local terms we can write
\[
K = i\,g_{i\bar\jmath}\,dz^{i}\wedge d\bar z^{\bar\jmath}, \tag{104}
\]
where $z^{i}$ are n holomorphic coordinates on $\mathcal{M}_{SK}$ and $g_{i\bar\jmath}$ its metric.
In this case the U(1) Kähler connection is given by
\[
Q = -\frac{i}{2}\left(\partial_{i}\mathcal{K}\,dz^{i} - \partial_{\bar\imath}\mathcal{K}\,d\bar z^{\bar\imath}\right), \tag{105}
\]
where $\mathcal{K}$ is the Kähler potential, so that $K = dQ$.
Let now $\Phi(z, \bar z)$ be a section of the U(1) bundle of weight p. By definition
its covariant derivative is
\[
D\Phi = (d + ipQ)\Phi, \tag{106}
\]
or, in components,
\[
D_{i}\Phi = (\partial_{i} + \tfrac{1}{2}p\,\partial_{i}\mathcal{K})\Phi; \qquad D_{\bar\imath}\Phi = (\partial_{\bar\imath} - \tfrac{1}{2}p\,\partial_{\bar\imath}\mathcal{K})\Phi. \tag{107}
\]
A covariantly holomorphic section is defined by the equation $D_{\bar\imath}\Phi = 0$. Setting
\[
\tilde\Phi = e^{-p\mathcal{K}/2}\,\Phi, \tag{108}
\]
we get
\[
D_{i}\tilde\Phi = (\partial_{i} + p\,\partial_{i}\mathcal{K})\tilde\Phi; \qquad D_{\bar\imath}\tilde\Phi = \partial_{\bar\imath}\tilde\Phi, \tag{109}
\]
so that under this map covariantly holomorphic sections Φ become truly holomorphic
sections.
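The passage from covariantly holomorphic to holomorphic sections can be made explicit in two lines. The following is only a sketch of the computation, obtained by inserting $\Phi = e^{p\mathcal{K}/2}\tilde\Phi$ into the component form of the covariant derivative given above:

```latex
\begin{aligned}
D_{\bar\imath}\Phi &= \left(\partial_{\bar\imath} - \tfrac{1}{2}p\,\partial_{\bar\imath}\mathcal{K}\right)\left(e^{p\mathcal{K}/2}\tilde\Phi\right)
 = e^{p\mathcal{K}/2}\,\partial_{\bar\imath}\tilde\Phi , \\
D_{i}\Phi &= \left(\partial_{i} + \tfrac{1}{2}p\,\partial_{i}\mathcal{K}\right)\left(e^{p\mathcal{K}/2}\tilde\Phi\right)
 = e^{p\mathcal{K}/2}\left(\partial_{i} + p\,\partial_{i}\mathcal{K}\right)\tilde\Phi .
\end{aligned}
```

Hence $D_{\bar\imath}\Phi = 0$ holds if and only if $\partial_{\bar\imath}\tilde\Phi = 0$, and the derivatives acting on $\tilde\Phi$ are those appearing in the relations for $D_{i}\tilde\Phi$ and $D_{\bar\imath}\tilde\Phi$.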
There are several equivalent ways of defining what a special Kähler mani-
fold is. An intrinsic definition is the following. A special Kähler manifold can
be given by constructing a (2n + 2)-dimensional flat symplectic bundle over
the Kähler–Hodge manifold whose generic sections (with weight p = 1)

\[
V = (f^{\Lambda}, h_{\Lambda}), \tag{110}
\]
are covariantly holomorphic
\[
D_{\bar\imath}V = \left(\partial_{\bar\imath} - \tfrac{1}{2}\partial_{\bar\imath}\mathcal{K}\right)V = 0, \tag{111}
\]
and satisfy the further condition
\[
i\langle V, \bar V\rangle = i(\bar f^{\Lambda}h_{\Lambda} - \bar h_{\Lambda}f^{\Lambda}) = 1, \tag{112}
\]
where the $\langle\ ,\ \rangle$ product was defined in (101). Defining
\[
V_{i} = D_{i}V = (f^{\Lambda}_{i}, h_{\Lambda i}), \tag{113}
\]
and introducing a symmetric three-tensor $C_{ijk}$ by
\[
D_{i}V_{j} = i\,C_{ijk}\,g^{k\bar k}\,\bar V_{\bar k}, \tag{114}
\]
the set of differential equations
\[
\begin{aligned}
D_{i}V &= V_{i}, \\
D_{i}V_{j} &= i\,C_{ijk}\,g^{k\bar k}\,\bar V_{\bar k}, \\
D_{i}V_{\bar\jmath} &= g_{i\bar\jmath}\,\bar V, \\
D_{i}\bar V &= 0,
\end{aligned}
\tag{115}
\]
defines a symplectic connection. Requiring that the differential system (115)
is integrable is equivalent to requiring that the symplectic connection is flat.
Since the integrability condition of (115) gives constraints on the base
Kähler–Hodge manifold, we define as special Kähler a manifold whose associated
symplectic connection is flat. At the end of this section, we will give the restrictions
on the manifold imposed by the flatness of the connection.

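It must be noted that, for special Kähler manifolds, the Kähler potential
can be computed as a symplectic invariant from (112). Indeed, introducing
also the holomorphic sections
\[
\Omega = e^{-\mathcal{K}/2}\,V = e^{-\mathcal{K}/2}(f^{\Lambda}, h_{\Lambda}) = (X^{\Lambda}, F_{\Lambda}), \qquad \partial_{\bar\imath}\Omega = 0, \tag{116}
\]
(112) gives
\[
\mathcal{K} = -\ln\big[i\langle\Omega, \bar\Omega\rangle\big] = -\ln\big[i(\bar X^{\Lambda}F_{\Lambda} - X^{\Lambda}\bar F_{\Lambda})\big]. \tag{117}
\]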
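To see the formula for the Kähler potential at work, one can evaluate it on a concrete holomorphic section. The sketch below assumes, purely for illustration, a single modulus z and the section $\Omega = (1, z, -iz, -i)$, which corresponds to the prepotential $F = -iX^{0}X^{1}$; neither this choice nor the function names come from the text. For this section the formula gives $\mathcal{K} = -\ln(4\,\mathrm{Re}\,z)$ and the metric $g_{z\bar z} = \partial_{z}\partial_{\bar z}\mathcal{K} = 1/(2\,\mathrm{Re}\,z)^{2}$, which the code checks by finite differences.

```python
import math

# Kaehler potential from a sample holomorphic section; all choices are illustrative.

def holomorphic_section(z):
    """Omega = (X^0, X^1, F_0, F_1) for the assumed prepotential F = -i X^0 X^1."""
    x0, x1 = complex(1.0, 0.0), complex(z)
    return x0, x1, -1j * x1, -1j * x0

def kahler_potential(z):
    """K = -ln[ i (Xbar^L F_L - X^L Fbar_L) ]; requires Re z > 0 for this section."""
    x0, x1, f0, f1 = holomorphic_section(z)
    arg = 1j * ((x0.conjugate() * f0 + x1.conjugate() * f1)
                - (x0 * f0.conjugate() + x1 * f1.conjugate()))
    return -math.log(arg.real)

def metric(z, eps=1e-4):
    """g = d_z d_zbar K = (1/4) * Laplacian of K, by central finite differences."""
    k = kahler_potential
    laplacian = (k(z + eps) + k(z - eps) + k(z + 1j * eps) + k(z - 1j * eps)
                 - 4.0 * k(z)) / eps ** 2
    return laplacian / 4.0

z_sample = complex(1.5, 0.7)   # arbitrary point with Re z > 0
print(kahler_potential(z_sample), -math.log(4 * z_sample.real))
print(metric(z_sample), 1.0 / (2 * z_sample.real) ** 2)
```

The metric comes out positive, as required for a Kähler metric, and depends only on Re z for this particular section.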

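If we introduce the complex symmetric $(n + 1) \times (n + 1)$ matrix $\mathcal{N}_{\Lambda\Sigma}$ defined
through the relations
\[
h_{\Lambda} = \mathcal{N}_{\Lambda\Sigma}f^{\Sigma}, \qquad h_{\Lambda\bar\imath} = \mathcal{N}_{\Lambda\Sigma}\bar f^{\Sigma}_{\bar\imath}, \tag{118}
\]
then we have
\[
\langle V, \bar V\rangle = (\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}f^{\Lambda}\bar f^{\Sigma} = -i, \tag{119}
\]
so that
\[
\mathcal{K} = -\ln\big[i\,\bar X^{\Lambda}(\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}X^{\Sigma}\big], \tag{120}
\]
and
\[
g_{i\bar\jmath} = -i\langle V_{i}, V_{\bar\jmath}\rangle = -2 f^{\Lambda}_{i}\,\mathrm{Im}\,\mathcal{N}_{\Lambda\Sigma}\,\bar f^{\Sigma}_{\bar\jmath}, \tag{121}
\]
\[
C_{ijk} = \langle D_{i}V_{j}, V_{k}\rangle = 2i\,\mathrm{Im}\,\mathcal{N}_{\Lambda\Sigma}\,f^{\Lambda}_{i}D_{j}f^{\Sigma}_{k}. \tag{122}
\]
We shall also use the following identity which follows from the previous ones:
\[
f^{\Lambda}_{i}\,g^{i\bar\jmath}\,\bar f^{\Sigma}_{\bar\jmath} = -\frac{1}{2}\big[(\mathrm{Im}\,\mathcal{N})^{-1}\big]^{\Lambda\Sigma} - \bar L^{\Lambda}L^{\Sigma}, \tag{123}
\]
where $L^{\Lambda} \equiv f^{\Lambda}$.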
The matrix NΛΣ turns out to be the matrix appearing in the kinetic Lagrangian
of the vectors in N = 2 supergravity. Under coordinate transformations, the
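sections Ω transform as
\[
\tilde\Omega = e^{-f_S(z)}\,S\,\Omega, \tag{124}
\]
where $S = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ is an element of $Sp(2n_V, \mathbb{R})$ and the factor $e^{-f_S(z)}$ is
a U(1) Kähler transformation. We also note that, from the definition of $\mathcal{N}$,
(118):
\[
\tilde{\mathcal{N}}(\tilde X, \tilde F) = [C + D\,\mathcal{N}(X, F)]\,[A + B\,\mathcal{N}(X, F)]^{-1}. \tag{125}
\]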
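We can now define a matrix $\mathbf{V}$ as in (66) satisfying the relations (67), in
terms of the quantities $(f^{\Lambda}, \bar f^{\Lambda}_{\bar\imath}, h_{\Lambda}, \bar h_{\Lambda\bar\imath})$ introduced in (110) and (113). In
order to identify the blocks f and h of $\mathbf{V}$ in (66), we note that in N = 2
theories $H_{\mathrm{Aut}} = SU(2) \times U(1)$, so that the $f^{\Lambda}_{AB}$ and $h_{\Lambda AB}$ entries in (70) are
actually SU(2)-singlets. We can therefore consistently write f and h as the
following $n_V \times n_V$ matrices:
\[
f \equiv (f^{\Lambda}_{AB}, \bar f^{\Lambda}_{\bar I}); \qquad h \equiv (h_{\Lambda AB}, \bar h_{\Lambda\bar I}), \tag{126}
\]
where $f^{\Lambda}_{AB}$, $h_{\Lambda AB}$, and $f^{\Lambda}_{I}$, $h_{\Lambda I}$ are defined as follows:
\[
\begin{aligned}
f^{\Lambda}_{AB} &= f^{\Lambda}\epsilon_{AB}; & h_{\Lambda AB} &= h_{\Lambda}\epsilon_{AB}, \\
f^{\Lambda}_{I} &= f^{\Lambda}_{i}P^{i}_{I}; & h_{\Lambda I} &= h_{\Lambda i}P^{i}_{I},
\end{aligned}
\tag{127}
\]
$P^{i}_{I}$, $P^{\bar\imath}_{\bar I}$ being the inverse of the Kählerian vielbein $P^{I}_{i}$, $\bar P^{\bar I}_{\bar\imath}$ defined by the
relation:
\[
g_{i\bar\jmath} = P^{I}_{i}\,\bar P^{\bar J}_{\bar\jmath}\,\eta_{I\bar J}, \tag{128}
\]
and $\eta_{I\bar J}$ is the flat metric. From the definition (126) and the properties (119),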
(121) it is straightforward to verify that the f and h blocks satisfy the relations
(69), or equivalently that the matrix V satisfies the conditions (67). The rela-
tions (69) therefore encode the set of algebraic relations of special geometry.
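Let us now consider the analogue of the embedded Maurer–Cartan equations
of G/H. We introduce, as before, the matrix one-form $\Gamma = \mathbf{V}^{-1}d\mathbf{V}$
satisfying the relation $d\Gamma + \Gamma\wedge\Gamma = 0$. We further introduce the covariant
derivative of the symplectic section $(f^{\Lambda}, \bar f^{\Lambda}_{\bar I}, \bar f^{\Lambda}, f^{\Lambda}_{I})$ with respect to the
U(1)-Kähler connection Q and the spin connection $\omega^{IJ}$ of $\mathcal{M}_{SK}$:
\[
\begin{aligned}
D(f^{\Lambda}, \bar f^{\Lambda}_{\bar I}, \bar f^{\Lambda}, f^{\Lambda}_{I}) ={}& d(f^{\Lambda}, \bar f^{\Lambda}_{\bar I}, \bar f^{\Lambda}, f^{\Lambda}_{I}) \\
& - (f^{\Lambda}, \bar f^{\Lambda}_{\bar J}, \bar f^{\Lambda}, f^{\Lambda}_{J})
\begin{pmatrix}
-iQ & 0 & 0 & 0 \\
0 & iQ\,\delta^{\bar J}{}_{\bar I} + \omega^{\bar J}{}_{\bar I} & 0 & 0 \\
0 & 0 & iQ & 0 \\
0 & 0 & 0 & -iQ\,\delta^{J}{}_{I} + \omega^{J}{}_{I}
\end{pmatrix},
\end{aligned}
\tag{129}
\]
the Kähler weight of $(f^{\Lambda}, f^{\Lambda}_{I})$ and $(\bar f^{\Lambda}, \bar f^{\Lambda}_{\bar I})$ being p = 1 and p = −1,
respectively. Using the same decomposition as in (82) and (84), (85) we have in the
N = 2 case:
\[
\Gamma = \begin{pmatrix} \Omega & \bar P \\ P & \bar\Omega \end{pmatrix}, \qquad
\Omega = \omega = \begin{pmatrix} -iQ & 0 \\ 0 & iQ\,\delta^{I}{}_{J} + \bar\omega^{I}{}_{J} \end{pmatrix}. \tag{130}
\]
For the sub-block P we obtain
\[
P = -i(f^{t}Dh - h^{t}Df) = i f^{t}(\mathcal{N} - \bar{\mathcal{N}})Df = \begin{pmatrix} 0 & P_{\bar I} \\ P^{J} & P^{J}{}_{\bar I} \end{pmatrix}, \tag{131}
\]
where $P^{J} \equiv \eta^{J\bar I}P_{\bar I}$ is the (1, 0)-form Kählerian vielbein while
\[
P^{J}{}_{\bar I} \equiv i\big(f^{t}(\mathcal{N} - \bar{\mathcal{N}})Df\big)^{J}{}_{\bar I} \tag{132}
\]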

is a one-form which in general, in the cases where the manifold is not a
coset, represents a new geometric quantity on $\mathcal{M}_{SK}$. Note that we get zero
in the first entry of (131) by virtue of the fact that the identity (69) implies
$f^{\Lambda}(\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}f^{\Sigma}_{I} = 0$ and that $f^{\Lambda}$ is covariantly holomorphic. If Ω and P
are considered as data on $\mathcal{M}_{SK}$ then we may interpret $\Gamma = \mathbf{V}^{-1}d\mathbf{V}$ as an
integrable system of differential equations, namely,
\[
D(V, \bar V_{\bar I}, \bar V, V_{I}) = (V, \bar V_{\bar J}, \bar V, V_{J})
\begin{pmatrix}
0 & 0 & 0 & \bar P_{I} \\
0 & 0 & \bar P^{\bar J} & \bar P^{\bar J}{}_{I} \\
0 & P_{\bar I} & 0 & 0 \\
P^{J} & P^{J}{}_{\bar I} & 0 & 0
\end{pmatrix},
\tag{133}
\]
where the flat Kähler indices $I, \bar I, \cdots$ are raised and lowered with the flat
Kähler metric $\eta_{I\bar J}$. Note that (133) coincides with the set of relations (115) if
we trade world indices $i, \bar\imath$ with flat indices $I, \bar I$, provided we also identify
\[
\bar P^{\bar J}{}_{I} = \bar P^{\bar J}{}_{Ik}\,dz^{k} = P^{\bar J,i}\,P_{I}{}^{j}\,C_{ijk}\,dz^{k}. \tag{134}
\]

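Then, the integrability condition $d\Gamma + \Gamma\wedge\Gamma = 0$ is equivalent to the flatness
of the special Kähler symplectic connection and it gives the following
constraints on the Kähler base manifold:
\[
d(iQ) + \bar P_{I}\wedge P^{I} = 0 \;\to\; \partial_{\bar\jmath}\partial_{i}\mathcal{K} = P^{I}{}_{,i}\,\bar P_{I,\bar\jmath} = g_{i\bar\jmath}, \tag{135}
\]
\[
(d\omega + \omega\wedge\omega)^{\bar J}{}_{\bar I} = P_{\bar I}\wedge\bar P^{\bar J} - i\,dQ\,\delta^{\bar J}{}_{\bar I} - \bar P^{\bar J}{}_{L}\wedge P^{L}{}_{\bar I}, \tag{136}
\]
\[
DP^{J}{}_{\bar I} = 0, \tag{137}
\]
\[
\bar P_{J}\wedge P^{J}{}_{\bar I} = 0. \tag{138}
\]
Equation (135) implies that $\mathcal{M}_{SK}$ is a Kähler–Hodge manifold. Equation
(136), written with holomorphic and antiholomorphic curved indices, gives
\[
R_{\bar\imath j\bar k l} = g_{\bar\imath l}\,g_{j\bar k} + g_{\bar k l}\,g_{\bar\imath j} - \bar C_{\bar\imath\bar k\bar m}\,C_{jln}\,g^{\bar m n}, \tag{139}
\]
which is the usual constraint on the Riemann tensor of the special geometry.
The further special geometry constraints on the three-tensor $C_{ijk}$ are then
consequences of (137) and (138), which imply
\[
\begin{aligned}
D_{[l}C_{i]jk} &= 0, \\
D_{\bar l}C_{ijk} &= 0.
\end{aligned}
\tag{140}
\]
In particular, the first of (140) also implies that $C_{ijk}$ is a completely symmetric
tensor.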
In summary, we have seen that the N = 2 theory and the higher N theories
have essentially the same symplectic structure, the only diﬀerence being that
since the scalar manifold of N = 2 is not in general a coset manifold the
symplectic structure allows the presence of a new geometric quantity which
physically corresponds to the anomalous magnetic moments of the N = 2
theory. It goes without saying that, when MSK is itself a coset manifold [58],

then the anomalous magnetic moments Cijk must be expressible in terms of

the vielbein of G/H.
To complete the analogy between the N = 2 theory and the higher N
theories in D = 4, we also give the supersymmetry transformation laws,
the central and matter charges, the differential relations among
them, and the sum rules.
The transformation laws for the chiral gravitino ψA and gaugino λiA fields
are
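\[ \delta\psi_{A\mu} = \nabla_\mu \epsilon_A + \varepsilon_{AB}\,T_{\mu\nu}\gamma^\nu \epsilon^B + \cdots , \tag{141} \]
\[ \delta\lambda^{iA} = i\,\partial_\mu z^i \gamma^\mu \epsilon^A + \frac{i}{2}\,\bar T_{\bar\jmath\,\mu\nu}\gamma^{\mu\nu}\,g^{i\bar\jmath}\,\varepsilon^{AB}\epsilon_B + \cdots , \tag{142} \]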
where
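\[ T \equiv h_\Lambda F^\Lambda - f^\Lambda G_\Lambda , \tag{143} \]
\[ \bar T_{\bar\imath} \equiv \bar T_{\bar I}\,\bar P^{\bar I}{}_{\bar\imath} , \quad \text{with:} \quad \bar T_{\bar I} \equiv \bar h_{\Lambda\bar I} F^\Lambda - \bar f^\Lambda_{\bar I} G_\Lambda , \tag{144} \]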
are, respectively, the graviphoton and the matter vectors, and the position of
the SU (2) automorphism index A (A,B=1,2) is related to chirality (namely
(ψA , λiA ) are chiral, (ψ A , λı̄A ) antichiral). In principle only the (anti) self-dual
part of F and G should appear in the transformation laws of the (anti)chiral
fermi fields; however, exactly as in (77) and (78) for N > 2 theories, from
(115) it follows that:

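\[
T^+ = h_\Lambda F^{+\Lambda} - f^\Lambda G^+_\Lambda = 0 , \qquad
T^-_I = h_{\Lambda I} F^{-\Lambda} - f^\Lambda_I G^-_\Lambda = 0 , \tag{145}
\]

so that $T = T^-$ and $T_I = T_I^+$ (i.e. $\bar T = \bar T^+$, $\bar T_{\bar I} = \bar T_{\bar I}^-$). Note that both the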
graviphoton and the matter vectors are symplectic invariant according to the
fact that the fermions do not transform under the duality group (except for a
possible R-symmetry phase). To define the physical charges let us recall the
definition of the moduli-independent charges in (61). The central charges and
the matter charges are now defined as the integrals over S 2 of the physical
graviphoton and matter vectors

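\[
\begin{aligned}
Z &= -\frac{1}{4\pi}\int_{S^2} T = -\frac{1}{4\pi}\int_{S^2}\left(h_\Lambda F^\Lambda - f^\Lambda G_\Lambda\right) = f^\Lambda(z,\bar z)\,q_\Lambda - h_\Lambda(z,\bar z)\,p^\Lambda , \\
Z_I &= -\frac{1}{4\pi}\int_{S^2} T_I = -\frac{1}{4\pi}\int_{S^2}\left(h_{\Lambda I} F^\Lambda - f^\Lambda_I G_\Lambda\right) = f^\Lambda_I(z,\bar z)\,q_\Lambda - h_{\Lambda I}(z,\bar z)\,p^\Lambda ,
\end{aligned}
\tag{146}
\]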

where $z^i$, $\bar z^{\bar\imath}$ denote the v.e.v. of the moduli fields in a given background. By
virtue of (115) we get immediately:

\[ Z_I = P_I{}^i Z_i ; \qquad Z_i \equiv D_i Z . \tag{147} \]

As a consequence of the symplectic structure, one can derive two sum rules
for Z and ZI :

\[ |Z|^2 + |Z_I|^2 \equiv |Z|^2 + Z_i\,g^{i\bar\jmath}\,\bar Z_{\bar\jmath} = -\frac{1}{2}\,Q^t \mathcal{M} Q \tag{148} \]
where the symmetric matrix M was deﬁned in (93) and Q is the symplectic
vector of electric and magnetic charges deﬁned in (94).
Equation (148) is obtained by using exactly the same procedure as in (92).

4 Supersymmetric Black Holes: General Discussion

We are going to study in this section the peculiarities of extremal black holes
that are solutions of extended supergravity theories.
As anticipated in the introduction, for black-hole configurations that are
particular bosonic backgrounds of N -extended locally supersymmetric theo-
ries, the cosmic censorship conjecture (expressing the request that the space-
time singularities are always hidden by event horizons) finds a simple and
natural understanding. For the Reissner–Nordström black holes this is
codified in the bound (4) on the mass M and charge Q of the solution, which we
recall here
M ≥ |Q|. (149)
In extended supersymmetric theories this bound is just a consequence of the
supersymmetry algebra (21), since
\[ \left\{ Q^{(\pm)}_{am}, Q^{\dagger(\pm)}_{am} \right\} \geq 0 , \tag{150} \]

so that the cosmic censorship conjecture is always veriﬁed.

Another general property of extremal black holes, that will be surveyed
in Sect. 5, is encoded in the so-called no-hair theorem. It states that the
end point of the gravitational collapse of a black hole is independent of the
initial conditions. Then, if one tries to perturb an extremal black hole with
whatever additional hair (some slight mass anisotropy, or a long-range field,
like a scalar) all these features disappear near the horizon, except for those
associated with the conserved quantities of general relativity, namely, for a
non-rotating black hole, its mass and charge. When the black hole is embedded
in an N -extended supergravity theory, the solution depends in general also
on scalar fields. In this case, the electric charge Q has to be replaced by
the central charge appearing in the supersymmetry algebra (which is dressed
with the expectation value of scalar fields). The black-hole metric takes a
generalized form with respect to the Reissner–Nordström one. However, for
the extremal case the event horizon loses all information about the scalar
'hair'. As for the Reissner–Nordström case, the near-horizon geometry is still
described by a conformally flat, Bertotti–Robinson-type geometry, with a mass
parameter MB–R , which only depends on the distribution of charges and not
on the scalar fields. As will be discussed extensively in Sect. 5, this follows
from the fact that the differential equations for the metric and scalar fields of

the extremal black hole (200) and (201) are solved under the condition that
the horizon be an attractor point [2] (see (207)). Scalar fields, independently
of their boundary conditions at spatial infinity, approaching the horizon flow
to a fixed point given by a certain ratio of electric and magnetic charges.
Since the dominant contribution to the black-hole entropy is given (at least
for large black holes) by the area/entropy Bekenstein–Hawking relation (1),
it follows that the entropy of extremal black holes is a topological quantity
fixed in terms of the quantized electric and magnetic charges while it does not
depend on continuous parameters like scalars.
It will be shown that the request that the scalars Φr be regular at the
fixed point (reached at the horizon τ → ∞) implies two important conditions
which have both to be satisfied:
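\[ \left.\frac{d\Phi^r}{d\tau}\right|_{\rm hor} = 0 , \tag{151} \]
\[ \left.\frac{\partial V_{\rm B-H}(\Phi)}{\partial\Phi^i}\right|_{\rm hor} = 0 , \tag{152} \]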

where the function VB–H (Φ, p, q), called the black-hole potential, will be
introduced in (203).
Exploiting (152), a decade ago a general rule was given [22] for finding the
values of fixed scalars, and then the Bekenstein–Hawking entropy, in N = 2
theories, through an extremum principle in moduli space. This follows from
the observation that, when the scalar fields are evaluated at spatial infinity
(τ = 0), VB–H coincides with the squared ADM mass of the black hole.
Then, since (152) does not depend explicitly on the radial variable τ (as the
extremization is done with respect to the scalar fields at any given point)
the expectation values Φ∞ may be chosen as independent variables. Equation
(152) is then reformulated as the statement that the fixed scalars Φfix are the
ones, among all the possible expectation values taken by scalar fields, that
extremize the ADM mass of the black hole in moduli space:
\[ \Phi_{\rm fix} : \quad \left.\frac{\partial M_{\rm ADM}(\Phi_\infty)}{\partial\Phi^r_\infty}\right|_{\Phi_{\rm fix}} = 0 . \tag{153} \]
Correspondingly, the Bekenstein–Hawking entropy is given in terms of that
extremum among the possible ADM masses (given by all possible boundary
conditions that one can impose on scalars at spatial infinity), this last being
identified with the Bertotti–Robinson mass MB–R :

MB-R ≡ MADM (Φﬁx ). (154)

The solutions with the scalar fields constant and everywhere equal to the fixed
value Φfix are called double extremal black holes.
The approach outlined above will prove to be a very useful computational
tool to calculate the B–H entropy since, as will be explained in Sect. 5, in
extended supergravity the explicit dependence of VB–H on the moduli is given.

4.1 BPS Extremal Black Holes

For the case of BPS extremal black holes, the extremum principle (153) may be
explained by means of the Killing spinor equations near the horizon. These
are encoded in some relations on the scalar moduli spaces, discussed in detail
in Sects. 3.2 and 3.3, which express the embedding of the scalar geometry in
a symplectic representation of the U-duality group [59]. For definiteness,
in the rest of this subsection we will refer to the
case N = 2, which is the model originally considered in [21, 22].
The Killing spinor equations expressing the existence of unbroken super-
symmetries are obtained, for the gauginos in the N = 2 case [57], by setting
to zero the r.h.s. of (142) that is, using ﬂat indices:

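\[ \delta\lambda^I_A = P^I{}_{,i}\,\partial_\mu z^i \gamma^\mu\,\varepsilon_{AB}\,\epsilon^B + \bar T^I_{\mu\nu}\gamma^{\mu\nu}\epsilon_A + \cdots = 0 . \tag{155} \]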

As we will see in detail in the next subsection, approaching the black-hole
horizon the scalars $z^i$ reach their fixed values $z_{\rm fix}$ (see footnote 8), so that

∂μ z i = 0 (156)

and (155) is satisﬁed for

TI = 0, (157)
which implies, using integrated quantities:

\[ Z_I = Z_i P_I{}^i = -\frac{1}{4\pi}\int_{S^2} T_I = \left.\left(f^\Lambda_I q_\Lambda - h_{\Lambda I} p^\Lambda\right)\right|_{\rm fix} = 0 . \tag{158} \]
What we have found is that the Killing spinor equation imposes the vanishing
of the matter charges near the horizon. Then, remembering (147), near the
horizon we have
ZI = DI Z = 0 (159)
where Z is the central charge appearing in the N = 2 supersymmetry algebra,
so that:

∂i |Z| = 0. (160)
For an extremal BPS black hole (|Z| = MADM ), (160) coincides with
(153) giving the fixed scalars Φfix ≡ zfix at the horizon. We then see that
the entropy of the black hole is related to the central charge, namely to the
integral of the graviphoton field strength evaluated for very special values of
the scalar fields $z^i$. These special values, the fixed scalars $z^i_{\rm fix}$, are functions
solely of the electric and magnetic charges $\{q_\Sigma, p^\Lambda\}$ of the black hole and are
attained by the scalars $z^i(r)$ at the black-hole horizon r = 0.

8 A point $x_{\rm fix}$ where the phase velocity vanishes, $v(x_{\rm fix}) = 0$, is called a fixed point and
represents the system in equilibrium [22, 23]. The fixed point is said to
be an attractor if $\lim_{t\to\infty} x(t) = x_{\rm fix}$.

Let us discuss in detail the explicit solution of the Killing spinor equation
and the general properties of N = 2 BPS-saturated black holes [21, 60, 61, 62].
As our analysis will reveal, these properties are completely encoded in the
special Kähler geometric structure of the mother theory.
Let us consider a black-hole ansatz for the metric,9 restricting the attention
to static, spherically symmetric solutions:
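\[ ds^2 = e^{2U(r)}\,dt^2 - e^{-2U(r)}\,G_{ij}(r)\,dx^i dx^j ; \qquad r^2 = G_{ij}\,x^i x^j , \quad i, j = 1, 2, 3 \tag{161} \]
and for the vector field strengths:
\[ F^\Lambda = \frac{p^\Lambda}{2r^3}\,\epsilon_{abc}\,x^a\,dx^b\wedge dx^c - \frac{\ell^\Lambda(r)}{r^3}\,e^{2U}\,dt\wedge\vec x\cdot d\vec x . \tag{162} \]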
Note that here r parametrizes the distance from the horizon.
It is convenient to rephrase the same ansatz in the complex formalism
well-adapted to the N = 2 theory. To this eﬀect we begin by constructing a
two-form which is anti-self-dual in the background of the metric (161) and
whose integral on the two-sphere at infinity $S^2_\infty$ is normalized to 4π. A short
calculation yields

\[
E^- = i\,\frac{e^{2U(r)}}{r^3}\,dt\wedge\vec x\cdot d\vec x + \frac{1}{2}\,\frac{x^a}{r^3}\,dx^b\wedge dx^c\,\epsilon_{abc} , \qquad
\int_{S^2_\infty} E^- = 4\pi , \tag{163}
\]

from which one obtains

\[ E^-_{\mu\nu}\gamma^{\mu\nu} = 2i\,\frac{e^{2U(r)}}{r^3}\,\gamma_a x^a \gamma_0\,\frac{1}{2}\left[1 + \gamma_5\right] , \tag{164} \]
which will simplify the unfolding of the supersymmetry transformation rules.
Next, introducing the following complex combination:
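\[ t^\Lambda(r) = \frac{1}{2}\left(p^\Lambda + i\,\ell^\Lambda(r)\right) \tag{165} \]
of the magnetic charges $p^\Lambda = \frac{1}{4\pi}\int_{S^2} F^\Lambda$ and of the functions
$\ell^\Lambda(r) = -\frac{1}{4\pi}\int_{S^2}\tilde F^\Lambda$ introduced in (162), we can rewrite the ansatz (162) as

\[ F^{-|\Lambda} = t^\Lambda E^- , \tag{166} \]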

and we retrieve the original formulae from

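\[
\begin{aligned}
F^\Lambda &= 2\,{\rm Re}\,F^{-|\Lambda} = \frac{p^\Lambda}{2r^3}\,\epsilon_{abc}\,x^a\,dx^b\wedge dx^c - \frac{\ell^\Lambda(r)}{r^3}\,e^{2U}\,dt\wedge\vec x\cdot d\vec x , \\
\tilde F^\Lambda &= -2\,{\rm Im}\,F^{-|\Lambda} = -\frac{\ell^\Lambda(r)}{2r^3}\,\epsilon_{abc}\,x^a\,dx^b\wedge dx^c - \frac{p^\Lambda}{r^3}\,e^{2U}\,dt\wedge\vec x\cdot d\vec x .
\end{aligned}
\tag{167}
\]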
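The step from (165) and (166) to (167) is plain complex arithmetic on the two independent two-form structures, and can be sketched as follows. This is only an illustrative check: the two basis forms are represented by a pair of coefficients, and the names `E_MINUS` and `field_strengths` are assumptions introduced here, not notation from the text.

```python
# Each two-form is stored as a pair of coefficients on the (assumed) basis
#   e_t   = (e^{2U}/r^3) dt ∧ x·dx
#   e_ang = (1/(2 r^3)) eps_abc x^a dx^b ∧ dx^c
# so that E^- = i*e_t + 1*e_ang, as in (163).
E_MINUS = (1j, 1.0)

def field_strengths(p, ell):
    """Return (F, F_dual) as coefficient pairs (e_t, e_ang) for one value of Lambda."""
    t = 0.5 * (p + 1j * ell)                      # complex charge, eq. (165)
    f_minus = tuple(t * c for c in E_MINUS)       # F^- = t E^-, eq. (166)
    F = tuple(2.0 * c.real for c in f_minus)      # F = 2 Re F^-
    F_dual = tuple(-2.0 * c.imag for c in f_minus)  # F~ = -2 Im F^-
    return F, F_dual

# Illustrative values: p = 3, ell = 0.5
F, F_dual = field_strengths(3.0, 0.5)
print(F)       # coefficients (-ell, p)
print(F_dual)  # coefficients (-p, -ell)
```

The coefficient pairs reproduce the pattern of (167): $F^\Lambda$ carries $p^\Lambda$ on the angular structure and $-\ell^\Lambda$ on the $dt$ structure, while the dual field strength carries $-\ell^\Lambda$ and $-p^\Lambda$ respectively.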

9
This ansatz is dictated by the general p-brane solution of supergravity bosonic
equations in any dimensions [15].

Before proceeding further, it is convenient to deﬁne the electric and magnetic

charges of the black hole as it is appropriate in any abelian gauge theory.
Recalling the general form of the field equations and of the Bianchi identities
as given in (38), we see that on-shell the field strengths Fμν and Gμν are both
closed two-forms, since their duals are divergenceless. Hence, by Gauss's
theorem, their integral on a closed space-like two-sphere does not depend
on the radius of the sphere. These integrals are the (constant) electric and
magnetic charges of the black hole defined in (61) that, in a quantum theory,
we expect to be quantized. Using the ansatze (167) and the definition (37),
we find

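\[ q_\Lambda = \frac{1}{4\pi}\int_{S^2} G_\Lambda = {\rm Im}\mathcal{N}_{\Lambda\Sigma}\,\ell^\Sigma + {\rm Re}\mathcal{N}_{\Lambda\Sigma}\,p^\Sigma = 2\,{\rm Re}\left(\mathcal{N}_{\Lambda\Sigma}\,\bar t^\Sigma\right) . \tag{168} \]

From the above equation we can obtain the field dependence of the functions
$\ell^\Lambda(r)$:

\[ \ell^\Lambda(r) = \left({\rm Im}\mathcal{N}^{-1}\right)^{\Lambda\Sigma}\left(q_\Sigma - {\rm Re}\mathcal{N}_{\Sigma\Gamma}\,p^\Gamma\right) . \tag{169} \]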

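Consider now the Killing spinor equations obtained from the supersymmetry
transformation rules (141) and (142):
\[ 0 = \nabla_\mu \xi_A + \varepsilon_{AB}\,T^-_{\mu\nu}\gamma^\nu \xi^B , \tag{170} \]
\[ 0 = i\,\nabla_\mu z^i \gamma^\mu \xi^A + \frac{i}{2}\,g^{i\bar\jmath}\,\bar T^-_{\bar\jmath|\mu\nu}\gamma^{\mu\nu}\,\varepsilon^{AB}\xi_B , \tag{171} \]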
where the Killing spinor ξA (r) is of the form of a single radial function times
a constant spinor satisfying

\[ \xi_A(r) = e^{f(r)}\chi_A , \quad \chi_A = \text{constant} , \qquad \gamma_0\,\chi_A = i\,\frac{Z}{|Z|}\,\varepsilon_{AB}\,\chi^B . \tag{172} \]

We observe that the condition (172) halves the number of supercharges pre-
served by the solution. Inserting (143),(144) and (172) into (170) and (171)
and using the result (164), with a little work we obtain the ﬁrst-order diﬀer-
ential equations:
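\[
\begin{aligned}
\frac{dz^i}{dr} &= -\frac{e^{U(r)}}{r^2}\,\frac{Z}{|Z|}\,g^{i\bar\jmath}\,\bar f^\Lambda_{\bar\jmath}\,(\mathcal{N} - \bar{\mathcal{N}})_{\Lambda\Sigma}\,t^\Sigma \\
&= \frac{e^{U(r)}}{r^2}\,\frac{Z}{|Z|}\,g^{i\bar\jmath}\,D_{\bar\jmath}\bar Z(z,\bar z,p,q) = 2\,\frac{e^{U(r)}}{r^2}\,g^{i\bar\jmath}\,\partial_{\bar\jmath}|Z(z,\bar z,p,q)| ,
\end{aligned}
\tag{173}
\]
\[ \frac{dU}{dr} = \frac{e^{U(r)}}{r^2}\,\left|h_\Sigma p^\Sigma - f^\Lambda q_\Lambda\right| = \frac{e^{U(r)}}{r^2}\,|Z(z,\bar z,p,q)| , \tag{174} \]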

where $\mathcal{N}_{\Lambda\Sigma}(z,\bar z)$ is the kinetic matrix of special geometry defined by (118), and the
vector $V = \left(f^\Lambda(z,\bar z), h_\Sigma(z,\bar z)\right)$, according to (110), is the covariantly
holomorphic section of the symplectic bundle entering the definition of a special
Kähler manifold. Moreover, according to (146),

\[ Z(z,\bar z,p,q) \equiv f^\Lambda q_\Lambda - h_\Sigma p^\Sigma , \tag{175} \]

is the local realization on the scalar manifold SM of the central charge of the
N = 2 superalgebra, and

\[ \bar Z^i(z,\bar z,p,q) \equiv g^{i\bar\jmath}\,D_{\bar\jmath}\bar Z(z,\bar z,p,q) , \tag{176} \]

are the charges associated with the matter vectors, the so-called matter central
charges, written with world indices of the special Kähler manifold. In terms
of the complex charge vector tΛ introduced in (165), the central and matter
charges have the following useful expressions:

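\[ Z = -2i\,f^\Lambda\,{\rm Im}\mathcal{N}_{\Lambda\Sigma}\,t^\Sigma , \tag{177} \]

\[ \bar Z_{\bar\imath} = -2i\,\bar f^\Lambda_{\bar\imath}\,{\rm Im}\mathcal{N}_{\Lambda\Sigma}\,t^\Sigma . \tag{178} \]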

In summary, we have reduced the condition that the black hole should be a
BPS-saturated state to the pair of first-order differential equations (173), (174)
for the metric scale factor U (r) and for the scalar fields z i (r). To obtain explicit
solutions, one should specify the special Kähler manifold one is working with,
namely the specific Lagrangian model. There are, however, some very general
and interesting conclusions that can be drawn in a model-independent way.
They are just consequences of the fact that these BPS conditions are first-order
differential equations. Because of that there are fixed points (see footnote 8),
namely values either of the metric or of the scalar fields which, once attained
in the evolution parameter r (= the radial distance), will persist indefinitely.
The fixed point values are just the zeros of the right-hand side in either of the
coupled equations (174) and (173). The fixed point for the metric equation
(174) is r = ∞, which corresponds to its asymptotic flatness. The fixed point
for the moduli equation (173) is r = 0. So, independently from the initial data
at r = ∞ that determine the details of the evolution, the scalar fields flow
into their fixed point values at r = 0, which, as we will show, turns out to be
a horizon. Indeed in the vicinity of r = 0 also the metric takes the universal
form of the Bertotti–Robinson AdS2 × S 2 metric.
Let us see this more closely. To begin with we consider the equations
determining the fixed point values for the moduli and the universal form
attained by the metric at the moduli fixed point. Using (178), we find
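\[ 0 = \left. g^{i\bar\jmath}\,\bar Z_{\bar\jmath}\right|_{\rm fix} = -2i\left. g^{i\bar\jmath}\,\bar f^\Gamma_{\bar\jmath}\,({\rm Im}\mathcal{N})_{\Gamma\Lambda}\,t^\Lambda\right|_{\rm fix} , \tag{179} \]
\[ \left.\frac{dU}{dr}\right|_{\rm fix} = \left.\frac{e^{U(r)}}{r^2}\,|Z(z,\bar z,p,q)|\right|_{\rm fix} . \tag{180} \]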

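Multiplying (179) by $f_i^\Sigma$, using the identity (123) and the definition (177) of
the central charge we conclude that at the fixed point the following condition
is true:
\[ 0 = \left.\left(t^\Lambda + i\,\bar f^\Lambda Z\right)\right|_{\rm fix} . \tag{181} \]
In terms of the previously defined electric and magnetic charges (see (61) and
(168)), (181) can be rewritten as
\[ p^\Lambda = -i\left.\left(Z\,\bar f^\Lambda - \bar Z\,f^\Lambda\right)\right|_{\rm fix} , \tag{182} \]
\[ q_\Sigma = -i\left.\left(Z\,\bar h_\Sigma - \bar Z\,h_\Sigma\right)\right|_{\rm fix} . \tag{183} \]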

Equations (179), or equivalently (182) and (183), can be regarded as alge-

braic equations determining the value of the scalar fields at the fixed point as
functions of the electric and magnetic charges pΛ , qΣ . Note therefore that, at
the horizon, also the central charge depends only on the quantized charges:
Z(z, z̄, p, q)|fix ≡ Z(p, q).
In the vicinity of the fixed point the differential equation for the metric
becomes
\[ \frac{dU}{dr} = \frac{|Z(p,q)|}{r^2}\,e^{U(r)} \tag{184} \]
which has the approximate solution:

\[ \exp[-U(r)] \;\xrightarrow{\;r\to 0\;}\; \frac{|Z(p,q)|}{r} . \tag{185} \]
Hence, near r = 0 the metric (161) becomes of the Bertotti–Robinson type
(see (8)) with Bertotti–Robinson mass given by
\[ M^2_{\rm B-R} = |Z(p,q)|^2 . \tag{186} \]
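The behavior (185) can be checked numerically. The sketch below is only an illustration under stated assumptions: the central charge is frozen at a constant value `Z` (as at the fixed point), the integration constant is chosen so that $U \to 0$ at large r, for which the exact solution of (184) is $e^{-U} = 1 + |Z|/r$, and the function names are introduced here for the example. The equation is integrated in $s = \ln r$ with a standard fourth-order Runge–Kutta step.

```python
import math

def exact_U(Z, r):
    """Exact solution of (184) for constant |Z| with U -> 0 as r -> infinity."""
    return -math.log(1.0 + Z / r)

def integrate_U(Z, r_start, r_end, steps=4000):
    """RK4 integration of dU/ds = Z*exp(U)/r with s = ln r, i.e. (184) in log-radius."""
    def rhs(s, U):
        return Z * math.exp(U - s)
    s = math.log(r_start)
    h = (math.log(r_end) - s) / steps      # negative step: we move toward r = 0
    U = exact_U(Z, r_start)                # initial condition on the asymptotically flat branch
    for _ in range(steps):
        k1 = rhs(s, U)
        k2 = rhs(s + h / 2, U + h * k1 / 2)
        k3 = rhs(s + h / 2, U + h * k2 / 2)
        k4 = rhs(s + h, U + h * k3)
        U += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return U

Z, r_small = 2.0, 1e-3                     # illustrative values only
U_num = integrate_U(Z, 10.0, r_small)
print(r_small * math.exp(-U_num))          # approaches |Z| as r -> 0, cf. (185)
```

Since $r\,e^{-U} = r + |Z|$ on this branch, the printed value differs from $|Z|$ only by a term of order r, which is the statement that the near-horizon geometry depends on the charges alone.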

In the metric (8) the surface r = 0 is light-like and corresponds to a horizon
since it is the locus where the Killing vector generating time translations $\frac{\partial}{\partial t}$,
which is time-like at spatial infinity r = ∞, becomes light-like. The horizon
r = 0 has a finite area given by

\[ {\rm Area}_H = \int_{r=0}\sqrt{g_{\theta\theta}\,g_{\phi\phi}}\;d\theta\,d\phi = 4\pi\,M^2_{\rm B-R} . \tag{187} \]

Hence, independently of the details of the considered model, the
BPS-saturated black holes in an N = 2 theory have a Bekenstein–Hawking entropy
given by the following horizon area:
\[ \frac{{\rm Area}_H}{4\pi} = |Z(p,q)|^2 , \tag{188} \]
where (186) was used, the value of the central charge being determined by
(182) and (183). Such equations, as we shall see in the next section, can also
be seen as the variational equations for the minimization of the horizon area

as given by (188), if the central charge is regarded as a function of both the

scalar ﬁelds and the charges:

\[ {\rm Area}_H(z,\bar z) = 4\pi\,|Z(z,\bar z,p,q)|^2 , \qquad \frac{\delta\,{\rm Area}_H}{\delta z} = 0 \;\longrightarrow\; z = z_{\rm fix} . \tag{189} \]
δz

5 BPS and Non-BPS Attractor Mechanism: The Geodesic Potential

Quite recently it was noticed that the attractor behavior of extremal black
holes in supersymmetric theories is not peculiar to BPS solutions preserving
some supersymmetries [31], and examples of non-supersymmetric extremal
black holes exhibiting the attractor phenomenon were found [34, 36, 63, 64,
65, 66].
It is then appropriate to introduce an alternative approach to extremality
which does not rely on the existence of supersymmetry [31, 36, 67]. Let us
start by writing the space–time metric of a black hole in terms of a new radial
parameter τ :

\[ ds^2 = e^{2U}dt^2 - e^{-2U}\left[\frac{c^4}{\sinh^4(c\tau)}\,d\tau^2 + \frac{c^2}{\sinh^2(c\tau)}\,d\Omega^2\right] . \tag{190} \]
The coordinate τ is related to the radial coordinate r by the following relation:

\[ \frac{c^2}{\sinh^2(c\tau)} = (r - r_0)^2 - c^2 = (r - r_-)(r - r_+) . \tag{191} \]
sinh2 (cτ )
Here c ≡ 2ST is the extremality parameter of the solution, with S the entropy
and T the temperature of the black hole. When c is non-vanishing, the black
hole has two horizons located at r± = r0 ± c. The outer horizon is located at
rH = r+ corresponding to τ → −∞. The extremality limit at which the two
horizons coincide, rH = r+ = r− = r0 , is c → 0. In this case the metric (190)
takes the simple form in the r coordinate

\[ ds^2 = e^{2U}dt^2 - e^{-2U}\left[dr^2 + (r - r_H)^2\,d\Omega^2\right] . \tag{192} \]
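The change of variables (191) and its extremal limit can be made concrete with a few lines of code. The following is a minimal sketch, assuming the branch τ < 0 outside the outer horizon; the helper name `tau_of_r` and the sample numbers are illustrative and not part of the text.

```python
import math

def tau_of_r(r, r0, c):
    """Invert (191) on the branch tau < 0, valid for r > r0 + c (outside the outer horizon)."""
    if c == 0.0:
        # Extremal limit: (191) reduces to 1/tau^2 = (r - r_H)^2 with r_H = r0
        return -1.0 / (r - r0)
    return -math.asinh(c / math.sqrt((r - r0) ** 2 - c ** 2)) / c

r0 = 1.0
# Non-extremal check of (191) at c = 0.5, r = 3
tau = tau_of_r(3.0, r0, 0.5)
print(0.5 ** 2 / math.sinh(0.5 * tau) ** 2, (3.0 - r0) ** 2 - 0.5 ** 2)

# As c -> 0 the relation approaches tau = -1/(r - r_H)
print(tau_of_r(3.0, r0, 1e-6), tau_of_r(3.0, r0, 0.0))

# Approaching the outer horizon r_+ = r0 + c drives tau toward -infinity
print(tau_of_r(r0 + 0.5 + 1e-7, r0, 0.5))
```

The two numbers on the first printed line agree, the second line shows the smooth extremal limit, and the last value is large and negative, consistent with the outer horizon sitting at τ → −∞.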

In the general case, if we require the horizon to have a ﬁnite area A, the scale
function U in the near-horizon limit should behave as follows:

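\[ e^{-2U} \;\xrightarrow{\;\tau\to-\infty\;}\; \frac{A}{4\pi}\,\frac{\sinh^2(c\tau)}{c^2} = \frac{A}{4\pi}\,\frac{1}{(r - r_-)(r - r_+)} , \tag{193} \]
so that the near-horizon metric reads
\[ ds^2 = \frac{4\pi}{A}\,(r - r_-)(r - r_+)\,dt^2 - \frac{A}{4\pi}\,\frac{dr^2}{(r - r_-)(r - r_+)} - \frac{A}{4\pi}\,d\Omega^2 . \tag{194} \]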

The above metric coincides with the near-horizon metric of a Reissner–Nordström
solution with horizons located at $r_\pm$. It is useful to introduce the
radial coordinate ρ defined as $\rho = 2\,e^{c\tau}$, in terms of which, in the near-horizon
limit, we can write $e^{-2U} \sim \left(\frac{r_H}{\rho\,c}\right)^2$, where $r_H = \sqrt{A/4\pi}$ is the radius of the
(outer) horizon, and the metric becomes
\[ ds^2 = \left(\frac{\rho\,c}{r_H}\right)^2 dt^2 - (r_H)^2\left(d\rho^2 + d\Omega^2\right) . \tag{195} \]
The coordinate ρ measures the physical distance from the horizon, which is
located at ρ = 0, in units of $r_H$. It is important to note that the distance of
a point at some finite $\rho_0$ from the horizon is finite:
\[ d = \int_0^{\rho_0} r_H\,d\rho = r_H\,\rho_0 < \infty . \tag{196} \]

Using this feature, in [36] an intuitive argument was given in order to justify
the absence of a universal behavior for the scalar fields near the horizon of a
non-extremal black hole: the distance from the horizon is not “long enough” in
order for the scalar fields to “lose memory” of their initial values at infinity.
Let us now consider the extremal case c = 0. The relation between τ and
r becomes τ = −1/(r − rH ). In order to have a finite horizon area, U should
behave near the horizon as
\[ e^{-2U} \sim \left(\frac{r_H}{r - r_H}\right)^2 . \tag{197} \]
The physical distance from the horizon is now measured in units of $r_H$ by the
coordinate $\omega = \ln(r - r_H)$, in terms of which the near-horizon metric reads
\[ ds^2 = \frac{1}{(r_H)^2}\,e^{2\omega}\,dt^2 - (r_H)^2\left(d\omega^2 + d\Omega^2\right) . \tag{198} \]
Since now the horizon is located at ω → −∞, the distance of a point at some
finite $\omega_0$ from the horizon is always infinite, as opposed to the non-extremal
case:
\[ d = \int_{-\infty}^{\omega_0} r_H\,d\omega = \infty . \tag{199} \]

Therefore, as observed in [36], the infinite distance from the horizon in the
extremal case justifies the fact that the scalar fields at the horizon “lose
memory” of their initial values at infinity and therefore exhibit a universal
behavior. In order to simplify the notation, in the following we shall use the
coordinate r to denote the distance from the horizon, consistently with our
previous treatment of the BPS black-hole solutions.
Let us consider the field equations for the metric components (see (190))
and for the scalar fields Φr coming from the Lagrangian (32). By eliminating

the vector ﬁelds through their equations of motion, the resulting equations for
the metric and the scalar ﬁelds, written in terms of the evolution parameter
τ , take the following simple form [67]:

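\[ \frac{d^2U}{d\tau^2} = V_{\rm B-H}(\Phi,p,q)\,e^{2U} , \tag{200} \]
\[ \frac{D^2\Phi^r}{D\tau^2} = g^{rs}(\Phi)\,\frac{\partial V_{\rm B-H}(\Phi,p,q)}{\partial\Phi^s}\,e^{2U} , \tag{201} \]
with the constraint
\[ \left(\frac{dU}{d\tau}\right)^2 + \frac{1}{2}\,g_{rs}(\Phi)\,\frac{d\Phi^r}{d\tau}\frac{d\Phi^s}{d\tau} - V_{\rm B-H}(\Phi,p,q)\,e^{2U} = c^2 , \tag{202} \]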

where VB–H (Φ, p, q) is a function of the scalars and of the electric and magnetic
charges of the theory defined by
\[ V_{\rm B-H} = -\frac{1}{2}\,Q^t \mathcal{M}(\mathcal{N})\,Q , \tag{203} \]
where as usual Q is the symplectic vector of quantized electric and magnetic
charges and M(N ) is the symplectic matrix defined in (93) in terms of the
matrix NΛΣ (Φ). Let us note that the field equations (201) can be extracted
from the effective one-dimensional Lagrangian:
\[ L_{\rm eff} = \left(\frac{dU}{d\tau}\right)^2 + \frac{1}{2}\,g_{rs}\,\frac{d\Phi^r}{d\tau}\frac{d\Phi^s}{d\tau} + V_{\rm B-H}(\Phi,p,q)\,e^{2U} , \tag{204} \]

constrained with (202). The extremality condition is c2 → 0.

From (204) we see that the properties of extremal black holes are com-
pletely encoded in the metric of the scalar manifold grs and on the scalar
effective potential VB–H , known as black-hole potential or geodesic potential
[31, 67]. In particular, as it was shown in [31, 36, 67] and as we shall review
below, the area of the event horizon is proportional to the value of VB–H at
the horizon
\[ \frac{A}{4\pi} = V_{\rm B-H}(\Phi_h, p, q) \tag{205} \]
where $\Phi_h$ denotes the value taken by the scalar fields at the horizon (see footnote 10). This
follows from the property that there is an attractor mechanism at work in
the extremal case. To see this, let us consider the set of equations (201) at
c = 0. Regularity of the scalar fields at the horizon, which is located, with
respect to the physical distance parameter ω, at ω → −∞, implies that at the
horizon the first derivative of Φr with respect to ω vanishes: ∂ω Φr|h = 0. Near
the horizon a solution to (201), under the hypothesis that (∂VB–H /∂Φr )h be
finite, behaves as follows:
10 For the sake of clarity in the comparison with equivalent formulas in [36], let us
note that in [36] the definition $\Sigma^r = \frac{d\Phi^r}{d\tau}$ has been used.
\[ \Phi^r \sim \frac{1}{2\,(r_H)^2}\,g^{rs}(\Phi_h)\left.\frac{\partial V_{\rm B-H}}{\partial\Phi^s}\right|_{\Phi_h}\omega^2 + \Phi^r_h . \tag{206} \]

Regularity of $\Phi^r$ at ω → −∞ then further requires that $(\partial V_{\rm B-H}/\partial\Phi^r)|_h = 0$,
implying that the horizon be an attractor point for the scalar fields. We
conclude that in the extremal case the scalar fields tend in the near-horizon
limit to some fixed values $\Phi^r_h$, which extremize the potential $V_{\rm B-H}$:
\[ \omega \to -\infty : \quad \Phi^r(\omega)\ \text{regular} \;\Rightarrow\; \left.\frac{\partial V_{\rm B-H}}{\partial\Phi^r}\right|_{\Phi_h} \to 0 ; \quad \frac{d\Phi^r}{d\omega} \to 0 . \tag{207} \]

These values are functions of the quantized electric and magnetic charges only:
Φrh = Φrh (p, q). Furthermore, let us consider (202). In the extremal limit c = 0,
near the horizon it becomes
\[ \left(\frac{dU}{d\tau}\right)^2 \sim V_{\rm B-H}(\Phi_h(p,q),p,q)\,e^{2U} \tag{208} \]

from which it follows, for the metric components near the horizon
\[ e^{2U} \sim \frac{r^2}{V_{\rm B-H}(\Phi_h)} = \left(\frac{r}{r_H}\right)^2 , \tag{209} \]

that is:
\[ ds^2_{\rm hor} = \frac{r^2}{V_{\rm B-H}(\Phi_h)}\,dt^2 - \frac{V_{\rm B-H}(\Phi_h)}{r^2}\left(dr^2 + r^2\,d\Omega^2\right) . \tag{210} \]
From (208) and (210) we immediately see that the value of the potential
at the horizon measures its area, as anticipated in (205). The metric (210)
describes a Bertotti–Robinson geometry $AdS_2 \times S^2$, with mass parameter
$M^2_{\rm B-R} = V_{\rm B-H}(\Phi_h)$.
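As a short consistency check of the step from (208) to (209), one may verify the near-horizon form of $e^{2U}$ by direct substitution. The lines below are a sketch under the conventions already stated in the text: c = 0, r denotes the distance from the horizon so that τ = −1/r, and $V_h$ is shorthand (introduced here) for $V_{\rm B-H}(\Phi_h)$.

```latex
% Assumptions: extremal case c = 0, \tau = -1/r, V_h \equiv V_{B-H}(\Phi_h) constant.
\begin{align*}
\frac{d}{d\tau} &= r^2\,\frac{d}{dr} , &
U &= \ln r - \tfrac{1}{2}\ln V_h
  \quad\Longleftrightarrow\quad e^{2U} = \frac{r^2}{V_h} , \\
\frac{dU}{d\tau} &= r^2\cdot\frac{1}{r} = r , &
\left(\frac{dU}{d\tau}\right)^2 &= r^2 = V_h\,e^{2U} .
\end{align*}
% Comparing e^{-2U} = V_h / r^2 with the finite-area behaviour (197),
% e^{-2U} \sim r_H^2 / r^2, identifies r_H^2 = V_h, i.e. A / 4\pi = V_h as in (205).
```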
To summarize, we have just shown that the area of the event horizon of
an extremal black hole (and hence its B–H entropy) is given by the black-hole
potential evaluated at the horizon, where it gets an extremum. This justifies
our assertion at the end of the previous section.
Let us briefly comment on the non-extremal case c ≠ 0. For these solutions,
the physical distance is measured by the coordinate ρ introduced in (193) and
the horizon is located at ρ = 0. The requirement of regularity of the scalar
fields at the horizon is less stringent. It just means that the scalars should
admit a Taylor expansion in ρ around ρ = 0 and thus it poses no constraints,
aside from finiteness, on their derivatives at the horizon:
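\[ \Phi^r \sim \Phi^r_h + \left.\frac{\partial\Phi^r}{\partial\rho}\right|_0 \rho + \frac{1}{2\,(r_H)^2}\,g^{rs}(\Phi_h)\left.\frac{\partial V_{\rm B-H}}{\partial\Phi^s}\right|_{\Phi_h}\rho^2 + O(\rho^3) . \tag{211} \]

The horizon is therefore not necessarily an attractor point, since at ρ = 0
$(\partial V_{\rm B-H}/\partial\Phi^r)_{\Phi_h}$ can now be a non-vanishing constant.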

5.1 Extremal Black Holes in Supergravity

For supergravity theories, supersymmetry fixes the black-hole potential $V_{\rm B-H}$
defined in (203) to take a particular form that allows one to find its extremum
in an easy way. Indeed, an expression exactly coinciding with (203) has been
found in Sect. 3 in an apparently different context, as the result of a sum rule
among central and matter charges in supergravity theories (93). So, in every
supergravity theory, the black-hole potential has the general form
\[ V_{\rm B-H} \equiv -\frac{1}{2}\,Q^t \mathcal{M}(\mathcal{N})\,Q = \frac{1}{2}\,Z_{AB}\bar Z^{AB} + Z_I\bar Z^I . \tag{212} \]
By making use of the geometric relations of Sect. 3, the value of the charge
vector $Q = \begin{pmatrix} p^\Lambda \\ q_\Lambda \end{pmatrix}$ in terms of the moduli Φ is given by (99) and (100). Then,
to find the extremum of $V_{\rm B-H}$ we can apply the differential relations (90)
among central and matter charges found in Sect. 3.
Let us now analyze in more detail, for the case of supergravity theories, the
extremality condition c = 0 as it comes from the constraint (202) which has
to be imposed on the solution all over space–time. According to the discussion
given in the previous section, the existence of solutions to (202) does not
rely on supersymmetry; therefore non-supersymmetric extremal black
holes also exhibit an attractor behavior (207) (found at c = 0).
At spatial infinity τ → 0, where the macroscopic features of the black
hole are well defined, we have U → MADM τ , as it follows from the general
definition of ADM mass in General Relativity (see for example [1]). The metric
(161) reduces to the Minkowski one and the constraint (202) becomes

\[ M^2_{\rm ADM} = |Z(\Phi_\infty,p,q)|^2 + |Z_I(\Phi_\infty,p,q)|^2 - \frac{1}{2}\,g_{rs}\,\frac{d\Phi^r_\infty}{d\tau}\frac{d\Phi^s_\infty}{d\tau} . \tag{213} \]
These solutions do not necessarily saturate the BPS bound, since in general,
from (213), $M^2_{\rm ADM} \neq |Z(\Phi_\infty)|^2$. They then completely break supersymmetry.
The behavior at the horizon may nevertheless be easily found thanks to the
expression (212) that the black-hole potential takes in supergravity theories,
by exploiting the condition (207) and in particular $\left.\frac{\partial V_{\rm B-H}}{\partial\Phi^r}\right|_{\Phi_h} \to 0$.
For the cases where the black-hole solution preserves some supersymme-
tries, we are going to ﬁnd that the constraint (202) yields the BPS bound
on the mass of the solution. Indeed in that case one may apply the re-
sults of Sect. 4.1. Let us restrict to the case of N = 2 supergravity, where
VB–H = |Z|2 + |ZI |2 . The Killing spinor equation δ λ = 0 gives equation
(173) that implies
\[ \left|\frac{dz^i}{d\tau}\right|^2 = e^{2U}\left|g^{i\bar\jmath}\,D_{\bar\jmath}\bar Z\right|^2 . \tag{214} \]

By making use of (214), the constraint (202) reduces in the extremal limit
c = 0 to the following equation, valid all over space–time:
\[ \left(\frac{dU}{d\tau}\right)^2 = e^{2U}\,|Z|^2 . \tag{215} \]

At spatial inﬁnity τ → 0, (214) and (215) become

\[ M^2_{\rm ADM} = |Z(\Phi_\infty,p,q)|^2 ; \qquad |Z_I(\Phi_\infty,p,q)|^2 = g_{rs}\,\frac{d\Phi^r_\infty}{d\tau}\frac{d\Phi^s_\infty}{d\tau} . \tag{216} \]
The first equation in (216) may be recognized as the saturation of the BPS
bound on the mass of the solution. On the other hand, near the horizon the
attractor condition holds
\[ \left.\frac{d\Phi^r}{d\tau}\right|_h = 0 , \tag{217} \]
and from (214) it gives $Z_I|_h = 0$, which may be solved to find $\Phi_{\rm fix}(p,q)$
leaving, for the mass parameter at the horizon
\[ \left(\frac{dU}{d\tau}\right)^2_h = M^2_{\rm B-R}(p,q) = |Z(\Phi_{\rm fix},p,q)|^2 . \tag{218} \]

Actually, the extrema of the black-hole potential may be systematically

studied, both for the BPS and non-BPS case, by use of the geometric relations
(90). One ﬁnds that the extrema are given by
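\[
\begin{aligned}
dV_{\rm B-H} &= \frac{1}{2}\,DZ_{AB}\,\bar Z^{AB} + DZ_I\,\bar Z^I + {\rm c.c.} \\
&= \left[\frac{1}{2}\left(\frac{1}{2}\,\bar Z^{AB}\bar Z^{CD}P_{ABCD} + \bar Z^{AB}\bar Z^I P_{ABI}\right) + {\rm c.c.}\right] \\
&\quad + \left[\left(\frac{1}{2}\,\bar Z^{AB}\bar Z^I P_{ABI} + \bar Z^I\bar Z^J P_{IJ}\right) + {\rm c.c.}\right] = 0 .
\end{aligned}
\tag{219}
\]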

Let us remark that the one introduced in (219) is a covariant procedure,

not referring explicitly to the horizon properties for finding the entropy, so it
is not necessary to specify explicitly horizon parameters (like the metric and
the fixed values of scalars at that point), VB–H being a well-defined quantity
over all the space–time.
The conditions (219), defining the extremum of the black-hole potential
and thus the fixed scalars, when restricted to the BPS case have the same
content as, and are therefore completely equivalent to, the relations (173) and
(174) found in the previous subsection from the Killing-spinor conditions. In
particular, extremal black holes preserving one supersymmetry correspond to
N -extended multiplets with

MADM = |Z1 | > |Z2 | · · · > |Z[N/2] | (220)

where Zm , m = 1, · · · , [N/2], are the skew-eigenvalues of the central charge an-

tisymmetric matrix introduced in (14) [68, 69, 50, 51]: Z1 = Z12 ,

Z2 = Z34 , . . . . At the attractor point, where MADM is extremized, super-

symmetry requires the vanishing of each term on the right-hand side of (219).
In particular, we ﬁnd ZI = 0 (recall that ZI does not exist for N > 4) and

\[ \bar Z^{AB}\bar Z^{CD}P_{ABCD} = 0 \;\Rightarrow\; \bar Z^{[AB}\bar Z^{CD]} = 0 . \tag{221} \]

The above condition is satisfied taking $Z_1 = Z_{12} \neq 0$ and $Z_m = 0$, m > 1. A

general property of regular BPS black-hole solutions is that supersymmetry
doubles at the horizon. This is consistent with the fact that the near horizon
geometry is a Bertotti–Robinson space–time of the form AdS2 × S 2 , which is
known to have an unbroken N = 2 supersymmetry [5]. Let us now give an
argument for the vanishing of the supersymmetry variation along $\epsilon_1, \epsilon_2$ of the
fermion fields at the horizon. As far as the dilatino fields are concerned, it is
sufficient to remember that, since $(d\Phi^r/d\tau)_h = 0$, at the horizon the
supersymmetry variation is proportional to $Z_{[AB}\epsilon_{C]}$. However, this expression is
also zero since the only non-vanishing central charge is $Z_1 \equiv Z_{12}$ and
furthermore $Z_{[12}\epsilon_{1]} = Z_{[12}\epsilon_{2]} = 0$. As for the gaugini, their supersymmetry variation
at the horizon is automatically zero since $Z_I = 0$. Finally, let us remark
that the gravitino variation is not actually zero; however, the variation of its
field strength along $\epsilon_1, \epsilon_2$ vanishes because the Bertotti–Robinson
solution is conformally flat and the graviphoton
field strength $T_{AB}$ is Lorentz-covariantly constant at the horizon [22].
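A case-by-case analysis of the BPS and non-BPS black holes in the various
supergravity models, by inspection of the extrema of $V_{B-H}$, will be given
in Sect. 6. As an exemplification of the method, let us anticipate here the
detailed study of the BPS solution of D = 4, N = 4 pure supergravity. The
field content is given by the gravitational multiplet, that is by the graviton
$g_{\mu\nu}$, four gravitini $\psi_{\mu A}$, $A = 1, \cdots, 4$, six vectors
$A_\mu^{[AB]}$, four dilatini $\chi^{[ABC]}$ and a complex scalar
$\phi = a + i e^{\varphi}$ parametrizing the coset manifold
G/H = SU(1, 1)/U(1). The symplectic Sp(12)-sections
$(f^\Lambda_{AB}, h_{\Lambda AB})$ ($\Lambda \equiv [AB] = 1, \cdots, 6$) over
the scalar manifold are given by

\[ f^\Lambda_{AB} = e^{-\varphi/2}\, \delta^\Lambda_{AB} , \qquad
   h_{\Lambda AB} = \phi\, e^{-\varphi/2}\, \delta_{\Lambda AB} , \tag{222} \]

so that

\[ \mathcal{N}_{\Lambda\Sigma} = (h \cdot f^{-1})_{\Lambda\Sigma} = \phi\, \delta_{\Lambda\Sigma} . \tag{223} \]

The central charge matrix is then given by

\[ Z_{AB} = f^\Lambda_{AB}\, q_\Lambda - h_{\Lambda AB}\, p^\Lambda
          = -e^{-\varphi/2} (\phi\, p_{AB} - q_{AB}) . \tag{224} \]

The black-hole potential is therefore

\[ V(\phi, p, q) = \frac{1}{2} e^{-\varphi} (\phi\, p_{AB} - q_{AB})(\bar\phi\, p^{AB} - q^{AB}) \]
\[ = \frac{1}{2} \left[ (a^2 e^{-\varphi} + e^{\varphi})\, p_{AB} p^{AB}
     + e^{-\varphi}\, q_{AB} q^{AB} - 2 a e^{-\varphi}\, q_{AB} p^{AB} \right] \]
\[ \equiv \frac{1}{2}\, (p, q)
   \begin{pmatrix} 1 & -a \\ 0 & 1 \end{pmatrix}
   \begin{pmatrix} e^{\varphi} & 0 \\ 0 & e^{-\varphi} \end{pmatrix}
   \begin{pmatrix} 1 & 0 \\ -a & 1 \end{pmatrix}
   \begin{pmatrix} p \\ q \end{pmatrix} . \tag{225} \]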

By extremizing the potential in the moduli space we get

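\[ \frac{\partial V}{\partial a} = 0 \;\rightarrow\;
   a_h = \frac{q_{AB}\, p^{AB}}{p_{AB}\, p^{AB}} , \]
\[ \frac{\partial V}{\partial \varphi} = 0 \;\rightarrow\;
   e^{\varphi_h} = \frac{\sqrt{|q_{AB} q^{AB}\, p_{CD} p^{CD} - (q_{AB} p^{AB})^2|}}{p_{AB}\, p^{AB}} , \tag{226} \]

from which it follows that the entropy is

\[ S_{B-H} = 4\pi V(\phi_h, p, q) = 4\pi \sqrt{|q_{AB} q^{AB}\, p_{CD} p^{CD} - (q_{AB} p^{AB})^2|} . \tag{227} \]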
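
The extremization leading to (226) and (227) can be checked numerically. The following is only a sketch under stated assumptions: the charges $p_{AB}$, $q_{AB}$ are modelled as real six-component vectors, the contractions $p_{AB}p^{AB}$, $q_{AB}q^{AB}$, $q_{AB}p^{AB}$ are taken to be plain Euclidean dot products, and the function names `potential` and `attractor_point` are hypothetical helpers introduced here for illustration.

```python
import numpy as np


def potential(a, phi, p, q):
    """Black-hole potential V(a, phi) of (225); p, q are real charge vectors."""
    pp, qq, qp = p @ p, q @ q, q @ p
    return 0.5 * ((a**2 * np.exp(-phi) + np.exp(phi)) * pp
                  + np.exp(-phi) * qq
                  - 2.0 * a * np.exp(-phi) * qp)


def attractor_point(p, q):
    """Horizon values (a_h, phi_h) as given by (226)."""
    pp, qq, qp = p @ p, q @ q, q @ p
    a_h = qp / pp
    phi_h = np.log(np.sqrt(abs(qq * pp - qp**2)) / pp)
    return a_h, phi_h


rng = np.random.default_rng(0)
p, q = rng.normal(size=6), rng.normal(size=6)   # illustrative charges

a_h, phi_h = attractor_point(p, q)

# Central finite differences of V at the claimed extremum.
eps = 1e-6
dV_da = (potential(a_h + eps, phi_h, p, q)
         - potential(a_h - eps, phi_h, p, q)) / (2 * eps)
dV_dphi = (potential(a_h, phi_h + eps, p, q)
           - potential(a_h, phi_h - eps, p, q)) / (2 * eps)

# Value of V at the horizon versus the square root appearing in (227).
V_h = potential(a_h, phi_h, p, q)
V_expected = np.sqrt(abs((q @ q) * (p @ p) - (q @ p) ** 2))
print(dV_da, dV_dphi, V_h - V_expected)
```

Both finite-difference derivatives vanish within numerical accuracy, and the horizon value of the potential agrees with the moduli-independent square root, as expected from (227).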

As a final observation, let us note, following [31], that the extremum reached
by the black-hole potential at the horizon is in particular a minimum, unless
the metric of the scalar fields changes sign, corresponding to some sort of
phase transition, where the effective Lagrangian description (204) of the
theory breaks down. This can be seen from the properties of the Hessian of
the black-hole potential. It was shown in [31] for the N = 2, D = 4 case that
at the critical point $\Phi = \Phi_{fix} \equiv \Phi_h$, from the special
geometry properties it follows that

\[ (\partial_{\bar\imath}\, \partial_j |Z|)_{fix} = \frac{1}{2}\, g_{\bar\imath j}\, |Z|_{fix} \tag{228} \]

and then, remembering, from the above discussion, that $V_{fix} = |Z_{fix}|^2$:

\[ (\partial_{\bar\imath}\, \partial_j V)_{fix} = 2\, g_{\bar\imath j}\, |Z_{fix}|^2 . \tag{229} \]

From (229) it follows, for the N = 2 theory, that the minimum is unique.
In the next section we will show one more technique for finding the entropy,
exploiting the fact that it is a “topological quantity” not depending on
scalars. This last procedure is particularly interesting because it refers
only to group theoretical properties of the coset manifolds spanned by
scalars, and does not need the knowledge of any details of the black-hole
horizon.

5.2 B–H Entropy as a U-invariant for Symmetric Spaces

For theories based on moduli spaces given by symmetric manifolds G/H, which
is the case of all supergravity theories with N ≥ 3 extended supersymmetry,
but also of several N = 2 models, the BPS and non-BPS black holes are
classified by some U-duality-invariant expressions depending on the
representation of the group G of G/H under which the electric and magnetic
charges are classified. In this respect, the classification of the N = 2
invariants is entirely similar to the N > 2 cases, where all scalar manifolds
are symmetric spaces.
For theories that have a quartic invariant $I_4$ [70] (this includes all N = 2
symmetric spaces based on cubic prepotentials [71, 72] and N = 4, 6, 8
theories), the B–H entropy turns out to be proportional to its square root

\[ S_{B-H} \propto \sqrt{|I_4|} . \tag{230} \]

The BPS solutions have I4 > 0 while the non-BPS ones (with non-vanishing
central charge) have instead I4 < 0. For all the above theories with the excep-
tion of the N = 8 case, there is also a second non-BPS solution with vanishing
central charge and I4 > 0.
For theories based on symmetric spaces with only a quadratic invariant I2
(this includes N = 2 theories with quadratic prepotentials as well as N = 3
and N = 5 theories), the B–H entropy is

SB-H ∝ |I2 |. (231)

In these cases, beyond the BPS solution which has I2 > 0 there is only one
non-BPS solution, with vanishing central charge and I2 < 0.
All the solutions discussed here give $S_{B-H} \neq 0$ and then fall in the class of
the so-called large black holes, for which the classical area/entropy formula is
valid as it gives the dominant contribution to the black-hole entropy. Solutions
with I4 , I2 = 0 do exist but they do not correspond to classical attractors since
in that case the classical area/entropy formula vanishes. In this case one deals
with small black holes, and a quantum attractor mechanism, including higher
curvature terms, has to be considered for finding the entropy.
The main purpose of this subsection is to provide particular expressions
which give the entropy formula as a moduli-independent quantity in the entire
moduli space and not just at the critical points.
Namely, we are looking for quantities
$S\left(Z_{AB}(\phi), \bar Z^{AB}(\phi), Z_I(\phi), \bar Z^I(\phi)\right)$
such that $\frac{\partial}{\partial\phi^i} S = 0$, $\phi^i$ being^{11} the
moduli coordinates. To this aim, let us first consider invariants $I_\alpha$
of the isotropy group H of the scalar manifold G/H, built with the central and
matter charges. We will take all possible H-invariants up to quartic ones for
four-dimensional theories (except for the N = 3 case, where the invariants of
order higher than quadratic are not irreducible). Then, let us consider a
linear combination $S^2 = \sum_\alpha C_\alpha I_\alpha$ of the H-invariants,
with arbitrary coefficients $C_\alpha$. Now, let us extremize S in the moduli
space, $\frac{\partial S}{\partial\Phi^i} = 0$, for some set of
$\{C_\alpha\}$. Since $\Phi^i \in G/H$, the quantity found in this way (which
in all cases turns out to be unique) is a U-invariant, and is indeed
proportional to the Bekenstein–Hawking entropy.
These formulae generalize the quartic $E_{7(-7)}$ invariant of N = 8
supergravity [70] to all other cases.^{12}

11 The Bekenstein–Hawking entropy $S_{B-H} = \frac{A}{4}$ is actually
   $\pi S$ in our notation.
12 Our analysis is based on general properties of scalar coset manifolds. As a
   consequence, it can be applied straightforwardly also to the N = 2 cases,
   whenever one considers special coset manifolds.

Let us first consider the theories N = 3, 4, where matter can be present
[52, 53].
The U-duality groups^{13} are, in these cases, SU(3, n) and
SU(1, 1) × SO(6, n) respectively. The central and matter charges $Z_{AB}$,
$Z_I$ transform in an obvious way under the isotropy groups

H = SU (3) × SU (n) × U (1) (N = 3), (232)

H = SU (4) × SO(n) × U (1) (N = 4). (233)

Under the action of the elements of G/H the charges may get mixed with
their complex conjugate. The infinitesimal transformation can be read from
the differential relations satisfied by the charges (90) [50] .
For N = 3:

\[ P^{ABCD} = P_{IJ} = 0 , \quad P_{ABI} \equiv \epsilon_{ABC} P^C_I , \quad
   Z_{AB} \equiv \epsilon_{ABC} Z^C . \tag{234} \]

Then the variations are

\[ \delta Z^A = \xi^{AI} Z_I , \tag{235} \]
\[ \delta Z_I = \bar\xi_{AI} Z^A , \tag{236} \]

where $\xi^{AI}$ are infinitesimal parameters of K = G/H.
The possible quadratic H-invariants are

\[ I_1 = Z^A \bar Z_A , \qquad I_2 = Z_I \bar Z^I . \tag{237} \]

So, the U-invariant expression is

\[ S = |Z^A \bar Z_A - Z_I \bar Z^I| . \tag{238} \]

In other words, $D_i S = \partial_i S = 0$, where the covariant derivative is
defined in [50].
Note that at the attractor point ($Z_I = 0$) it coincides with the
moduli-dependent potential (212) computed at its extremum.
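
The invariance of (238) under the variations (235) and (236) can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the original derivation: the charges are random complex vectors, the number of matter multiplets `n` is chosen arbitrarily, and the index placement $\xi^{AI}$, $\bar\xi_{AI} = (\xi^{AI})^*$ is the one assumed here in reading (235) and (236).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # number of matter multiplets (illustrative choice)


def crandn(*shape):
    """Random complex array with standard normal real and imaginary parts."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


ZA = crandn(3)             # central charges Z^A
ZI = crandn(n)             # matter charges Z_I
xi = 1e-6 * crandn(3, n)   # infinitesimal parameters xi^{AI}


def invariant(ZA, ZI):
    """I_1 - I_2 = Z^A Zbar_A - Z_I Zbar^I, cf. (237) and (238)."""
    return np.vdot(ZA, ZA).real - np.vdot(ZI, ZI).real


dZA = xi @ ZI              # (235): delta Z^A = xi^{AI} Z_I
dZI = xi.conj().T @ ZA     # (236): delta Z_I = xibar_{AI} Z^A

# First-order variation of I_1 - I_2, computed analytically.
first_order = 2.0 * (np.vdot(ZA, dZA).real - np.vdot(ZI, dZI).real)

# Actual change of the invariant; only terms of second order in xi survive.
delta = invariant(ZA + dZA, ZI + dZI) - invariant(ZA, ZI)
print(first_order, delta)
```

The first-order variation cancels between the central and matter charge contributions, while $I_1$ and $I_2$ separately change at first order in $\xi$.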
For N = 4

\[ P_{ABCD} = \epsilon_{ABCD} P , \quad P_{IJ} = \eta_{IJ} P , \quad
   P_{ABI} = \frac{1}{2}\, \eta_{IJ}\, \epsilon_{ABCD}\, \bar P^{CDJ} , \tag{239} \]

and the transformations of
$K = \frac{SU(1,1)}{U(1)} \times \frac{O(6,n)}{O(6) \times O(n)}$ are

\[ \delta Z_{AB} = \frac{1}{2}\, \epsilon_{ABCD}\, \xi\, \bar Z^{CD} + \xi_{AB}{}^{I} Z_I , \tag{240} \]
\[ \delta Z_I = \bar\xi\, \eta_{IJ}\, \bar Z^J + \frac{1}{2}\, \bar\xi^{AB}{}_{I}\, Z_{AB} , \tag{241} \]

with $\bar\xi^{AB}{}_{I} = \frac{1}{2}\, \eta_{IJ}\, \epsilon^{ABCD}\, \xi_{CD}{}^{J}$.

13 Here we denote by U-duality group the isometry group U acting on the
   scalars in a symplectic representation, although only a restriction of it
   to integers is the proper U-duality group [10].

The possible H-invariants are

\[ I_1 = Z_{AB} \bar Z^{AB} , \]
\[ I_2 = Z_{AB} \bar Z^{BC} Z_{CD} \bar Z^{DA} , \]
\[ I_3 = \epsilon^{ABCD} Z_{AB} Z_{CD} , \]
\[ I_4 = Z_I Z_I . \tag{242} \]

There are three O(6, n) invariants given by $S_1$, $S_2$, and $\bar S_2$ where

\[ S_1 = \frac{1}{2}\, Z_{AB} \bar Z^{AB} - Z_I \bar Z_I , \tag{243} \]
\[ S_2 = \frac{1}{4}\, \epsilon^{ABCD} Z_{AB} Z_{CD} - Z_I Z_I , \tag{244} \]

and the unique SU(1, 1) × O(6, n) invariant S, DS = 0, is given by

\[ S = \sqrt{|(S_1)^2 - |S_2|^2|} . \tag{245} \]

At the attractor point $Z_I = 0$ and $\epsilon^{ABCD} Z_{AB} Z_{CD} = 0$, so
that S reduces to the square of the BPS mass.
Note that, in the absence of matter multiplets, one recovers the expression
found in the previous subsection by extremizing the black-hole potential.
For N = 5, 6, 8 the U-duality-invariant expression S is the square root of a
unique invariant under the corresponding U-duality groups SU(5, 1), O*(12)
and $E_{7(-7)}$. The strategy is to find a quartic expression $S^2$ in terms
of $Z_{AB}$ such that DS = 0, i.e. S is moduli-independent.
As before, this quantity is a particular combination of the H quartic
invariants.
For SU(5, 1) there are only two U(5) quartic invariants. In terms of the
matrix $A_A{}^B = Z_{AC} \bar Z^{CB}$ they are $(\mathrm{Tr} A)^2$,
$\mathrm{Tr}(A^2)$, where

\[ \mathrm{Tr} A = Z_{AB} \bar Z^{BA} , \tag{246} \]
\[ \mathrm{Tr}(A^2) = Z_{AB} \bar Z^{BC} Z_{CD} \bar Z^{DA} . \tag{247} \]

As before, the relative coefficient is fixed by the transformation properties
of $Z_{AB}$ under $\frac{SU(5,1)}{U(5)}$ elements of infinitesimal parameter
$\xi^C$:

\[ \delta Z_{AB} = \frac{1}{2}\, \xi^C \epsilon_{CABPQ}\, \bar Z^{PQ} . \tag{248} \]

It then follows that the required invariant is

\[ S = \frac{1}{2} \sqrt{|4\, \mathrm{Tr}(A^2) - (\mathrm{Tr} A)^2|} . \tag{249} \]
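
As an illustration of (249), one can evaluate S on a random antisymmetric 5 × 5 complex matrix and compare it with the combination $|Z_1|^2 - |Z_2|^2$ of its skew-eigenvalues. The sketch below makes two assumptions that are not spelled out in the text: $\bar Z^{AB}$ is taken to be the complex conjugate of $Z_{AB}$ entry by entry, and the moduli of the skew-eigenvalues are read off from the singular values of Z, which for an antisymmetric matrix come in equal pairs (plus one zero in odd dimension).

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
Z = M - M.T                       # generic antisymmetric central charge Z_{AB}

A = Z @ Z.conj()                  # A_A^B = Z_{AC} Zbar^{CB}
trA = np.trace(A).real            # Tr A, cf. (246)
trA2 = np.trace(A @ A).real       # Tr(A^2), cf. (247)
S = 0.5 * np.sqrt(abs(4.0 * trA2 - trA**2))   # invariant (249)

# Singular values in descending order: |Z1|, |Z1|, |Z2|, |Z2|, 0.
s = np.linalg.svd(Z, compute_uv=False)
z1, z2 = s[0], s[2]
print(S, z1**2 - z2**2)
```

In terms of proper values the quartic combination collapses to $4(|Z_1|^2 - |Z_2|^2)^2$, so S equals $||Z_1|^2 - |Z_2|^2|$ and reduces to the square of the BPS mass when $Z_2 = 0$.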

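The N = 6 case is the most complicated because under U(6) the left-handed
spinor of O*(12) splits into:

\[ 32_L \rightarrow 15_{1} + \overline{15}_{-1} + 1_{-3} + 1_{3} . \tag{250} \]

The transformations of $\frac{O^*(12)}{U(6)}$ are

\[ \delta Z_{AB} = \frac{1}{4}\, \epsilon_{ABCDEF}\, \xi^{CD} \bar Z^{EF} + \xi_{AB} \bar X , \tag{251} \]
\[ \delta X = \frac{1}{2}\, \xi_{AB} \bar Z^{AB} , \tag{252} \]

where we denote by X the SU(6) singlet.
The quartic U(6) invariants are

\[ I_1 = (\mathrm{Tr} A)^2 \tag{253} \]
\[ I_2 = \mathrm{Tr}(A^2) \tag{254} \]
\[ I_3 = \mathrm{Re}(\mathrm{Pf} Z\, X) = \frac{1}{2^3\, 3!}\, \mathrm{Re}(\epsilon^{ABCDEF} Z_{AB} Z_{CD} Z_{EF} X) \tag{255} \]
\[ I_4 = (\mathrm{Tr} A)\, X \bar X \tag{256} \]
\[ I_5 = X^2 \bar X^2 \tag{257} \]

where the matrix A is, as for the N = 5 case, $A_A{}^B = Z_{AC} \bar Z^{CB}$.
The unique O*(12) invariant is

\[ S = \frac{1}{2} \sqrt{|4 I_2 - I_1 + 32 I_3 + 4 I_4 + 4 I_5|} , \tag{258} \]
\[ DS = 0 . \tag{259} \]

Note that at the BPS attractor point $\mathrm{Pf} Z = 0$, X = 0 and S reduces
to the square of the BPS mass.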
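For N = 8 the SU(8) invariants are^{14}

\[ I_1 = (\mathrm{Tr} A)^2 \tag{260} \]
\[ I_2 = \mathrm{Tr}(A^2) \tag{261} \]
\[ I_3 = \mathrm{Pf} Z = \frac{1}{2^4\, 4!}\, \epsilon^{ABCDEFGH} Z_{AB} Z_{CD} Z_{EF} Z_{GH} . \tag{262} \]

The $\frac{E_{7(-7)}}{SU(8)}$ transformations are

\[ \delta Z_{AB} = \frac{1}{2}\, \xi_{ABCD} \bar Z^{CD} , \tag{263} \]

where $\xi_{ABCD}$ satisfies the reality constraint:

\[ \xi_{ABCD} = \frac{1}{24}\, \epsilon_{ABCDEFGH}\, \bar\xi^{EFGH} . \tag{264} \]

One finds the following $E_{7(-7)}$ invariant [70]:

\[ S = \frac{1}{2} \sqrt{|4\, \mathrm{Tr}(A^2) - (\mathrm{Tr} A)^2 + 32\, \mathrm{Re}(\mathrm{Pf} Z)|} . \tag{265} \]

14 The Pfaffian of a (2n × 2n) antisymmetric matrix is defined as
   $\mathrm{Pf} Z = \frac{1}{2^n\, n!}\, \epsilon^{A_1 \cdots A_{2n}} Z_{A_1 A_2} \cdots Z_{A_{2n-1} A_{2n}}$,
   with the property: $|\mathrm{Pf} Z| = |\det Z|^{1/2}$.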
2

6 Detailed Analysis of Attractors in Extended Supergravities: BPS and Non-BPS Critical Points
The extremum principle was found originally in the context of N = 2 four-
dimensional black holes. However, as we have described in Sect. 4, it has a
more general validity, being true for all N -extended supergravities in four
dimensions (in the cases where the Bekenstein–Hawking entropy is different
from zero) [50]. Indeed, the general discussion of Sect. 3.2 shows that the
coset structure of extended supergravities in four dimensions (for N > 2)
induces the existence, in every theory, of differential relations among central
and matter charges that generalize the ones existing for the N = 2 case.
Furthermore, as far as BPS solutions are considered, Killing-spinor equations
for gauginos and dilatinos analogous to (90) are obtained by setting to zero
the supersymmetry transformation laws of the fermions. Correspondingly, at
the fixed point $\partial_\mu \Phi^i = 0$, for any extended supergravity theory one gets some
conditions that allow one to find the value of the fixed scalars and hence of the B–H
entropy both for BPS and non-BPS black-hole solutions.
We will first discuss in Sect. 6.1 the case of N = 2 supergravity, then in
Sect. 6.2 the case of the other extended theories allowing matter couplings
to the supergravity multiplet, that is N = 3, 4 extended supergravities, and
finally we will pass to analyze in Sect. 6.3 N = 5, 6, 8 theories, which are pure
supergravity models.
For every theory, the strategy adopted to find the extrema will be to solve
the equation dVB–H = 0, as given in general in (219), by setting to zero all the
independent components in the decomposition on a basis of vielbein of the
moduli space [50].
We confine our analysis to large black holes, with finite horizon area.

6.1 N = 2 Attractor Equations

In the original paper [31], the N = 2 attractor conditions were introduced via
an extremum condition on the black-hole potential (203)

\[ V_{B-H} = -\frac{1}{2}\, Q^T \mathcal{M} Q = |Z|^2 + |D_i Z|^2 \tag{266} \]

discussed in Sect. 5. Indeed, by making use of properties of N = 2 special
geometry, the extremum condition was written in the form

\[ \partial_i V_{B-H} = 2 \bar Z D_i Z + i\, C_{ijk}\, g^{j\bar\jmath} g^{k\bar k}\, D_{\bar\jmath} \bar Z\, D_{\bar k} \bar Z = 0 , \tag{267} \]

where use of the special geometry relations (115) was made.
Given (267), it is useful to write the attractor equations in a different
form. Indeed, recalling (99) and (100) [73, 74, 33] (which are true all over
the moduli space) we may write

\[ Q - i\, \mathbb{C}\, \mathcal{M}(\mathcal{N}) \cdot Q = -2i\, \bar V_M Z^M
   = -2i \left( Z \bar V + g^{i\bar\jmath}\, D_{\bar\jmath} \bar Z\, D_i V \right) , \tag{268} \]

where V is the symplectic section introduced in (110); substituting the
extremum condition from (267), (268) gives the value of the charges in terms
of the fixed scalars

\[ \left[ Q - i\, \mathbb{C}\, \mathcal{M}(\mathcal{N}) \cdot Q \right]_{fix}
   = -2i \left( Z \bar V + \frac{i}{2Z}\, \bar C^{ijk}\, D_i V\, D_j Z\, D_k Z \right)_{fix}
   \quad \text{for } Z_{fix} \neq 0 , \]
\[ \left[ Q - i\, \mathbb{C}\, \mathcal{M}(\mathcal{N}) \cdot Q \right]_{fix}
   = -2i \left. g^{i\bar\jmath}\, D_{\bar\jmath} \bar Z\, D_i V \right|_{fix}
   \quad \text{for } Z_{fix} = 0 . \tag{269} \]

The BPS solution corresponds to setting $D_i Z = 0$, in which case, for large
black holes ($Z_{fix} \neq 0$), (269) reduces to (182) and (183).
The attractive nature of the extremum was further seen to come from the
fact that the mass matrix at that point is strictly positive since

\[ \partial_i \partial_j V_{B-H} |_{(\partial_i V_{B-H} = 0)} = 0 ; \qquad
   \partial_i \partial_{\bar\jmath} V_{B-H} |_{(\partial_i V_{B-H} = 0)} = 2 |Z|^2 g_{i\bar\jmath} . \tag{270} \]

Non-supersymmetric extremal black holes with finite horizon area correspond
to solutions of (267) with

\[ D_i Z \neq 0 . \tag{271} \]

These solutions may be divided in two classes

• $D_i Z \neq 0$, $Z \neq 0$,
• $D_i Z \neq 0$, $Z = 0$.
For these more general cases, the horizon mass parameter $M_{B-R}$ which
extremizes the ADM mass in moduli space is then given by

\[ M_{B-R}^2 = V_{B-H} |_{(\partial_i V_{B-H} = 0)}
   = \left[ |Z|^2 + |D_i Z|^2 \right]_{(\partial_i V_{B-H} = 0)}
   > |Z|^2_{(\partial_i V_{B-H} = 0)} . \tag{272} \]

Equation (272) is a special case of the BPS bound on the mass.
If the central charge Z vanishes on the extremum, then $D_i Z$ has to satisfy

\[ C_{ijk}\, g^{j\bar\jmath} g^{k\bar k}\, D_{\bar\jmath} \bar Z\, D_{\bar k} \bar Z = 0 \quad \forall i \tag{273} \]

in order to fulfill (267). Solutions to the above equation, for the case of
special geometries based on symmetric spaces, have been given in [75].

When $Z \neq 0$, $D_i Z \neq 0$, one may obtain some further consequences of
(267). Let us define

\[ Z^{\bar\imath} \equiv g^{i\bar\imath} D_i Z , \qquad \bar Z^i \equiv g^{i\bar\imath} D_{\bar\imath} \bar Z . \tag{274} \]

From (267) we get, by multiplication with $g^{i\bar\imath}$,

\[ Z^{\bar\imath} = -\frac{i}{2\bar Z}\, C^{\bar\imath}{}_{jk}\, \bar Z^j \bar Z^k \tag{275} \]

and, by multiplication with $\bar Z^i$,

\[ |D_i Z|^2 = -\frac{i}{2\bar Z}\, N_3(\bar Z^k) = \frac{i}{2Z}\, N_3(Z^{\bar k}) , \tag{276} \]

where we have introduced the definition
$N_3(\bar Z^k) \equiv C_{ijk} \bar Z^i \bar Z^j \bar Z^k$. Note that, if at the
attractor point $N_3 = 0$, then $Z = 0$ (or $Z \neq 0$ but then
$Z^{\bar\imath} = 0$).
The complex conjugate of (267) may be rewritten, using (275), as

\[ 2 Z D_{\bar\imath} \bar Z = -\frac{i}{4 \bar Z^2}\, C_{\bar\imath\bar\jmath\bar k}\,
   C^{\bar\jmath}{}_{\ell m} \bar Z^\ell \bar Z^m\, C^{\bar k}{}_{pq} \bar Z^p \bar Z^q . \tag{277} \]

By making use of the special geometry relation [76, 77, 75]

\[ C_{\bar\imath\bar\jmath\bar k}\, C^{\bar\jmath}{}_{(\ell m}\, C^{\bar k}{}_{pq)}
   = \frac{4}{3}\, C_{(\ell m p}\, g_{q)\bar\imath} + \bar E_{\bar\imath \ell m p q} , \tag{278} \]

where the tensor $\bar E_{\bar\imath \ell m p q}$ defined by this relation is
related to the covariant derivative of the Riemann tensor and vanishes exactly
for all symmetric spaces,^{15} we may finally rewrite (267) as

\[ 2 \bar Z D_i Z = \frac{i}{3 Z^2}\, D_i Z\, C_{\bar\jmath\bar k\bar\ell}\, Z^{\bar\jmath} Z^{\bar k} Z^{\bar\ell}
   + \frac{i}{4 Z^2}\, E_{i\bar\jmath\bar k\bar\ell\bar m}\, Z^{\bar\jmath} Z^{\bar k} Z^{\bar\ell} Z^{\bar m} . \tag{279} \]

Moreover, using also (276) we obtain

\[ \left( |Z|^2 - \frac{1}{3} |D_i Z|^2 \right) D_i Z
   = \frac{i}{8 Z}\, E_{i\bar\jmath\bar k\bar\ell\bar m}\, Z^{\bar\jmath} Z^{\bar k} Z^{\bar\ell} Z^{\bar m} . \tag{280} \]

For symmetric spaces (280) gives

\[ |D_i Z|^2 = 3 |Z|^2 , \tag{281} \]

implying that for these black holes:
$M_{B-R}^2 = 4 |Z|^2_{(\partial_i V_{B-H} = 0)}$.
This relation, for symmetric spaces, was obtained in [54] and then all
the solutions of this type have been classified in [75]. In particular,
solutions with $C_{ijk} \equiv 0$ correspond to the special series of
symmetric special manifolds $\frac{SU(1,1+n)}{U(1) \times SU(1+n)}$ for which
only non-BPS solutions with Z = 0 may exist.
Solutions of the type in (281) have also been found for non-symmetric
spaces based on cubic prepotentials in [34].
However, because of (280), these cannot be the most general solutions.
For the generic case of non-symmetric special manifolds, we have
instead

\[ |D_i Z|^2 = 3 |Z|^2 + \Delta , \tag{282} \]

where

\[ \Delta = -\frac{3}{4}\,
   \frac{E_{i\bar\jmath\bar k\bar\ell\bar m}\, \bar Z^i Z^{\bar\jmath} Z^{\bar k} Z^{\bar\ell} Z^{\bar m}}{N_3(Z^{\bar k})} \tag{283} \]

and the Bekenstein–Hawking entropy is

\[ S_{B-H} = A/4 = \pi \left( 4 |Z|^2 + \Delta \right) . \tag{284} \]

Note that, for these non-BPS black holes, at the attractor point $\Delta$ is
real and, because of (282), it satisfies $-\Delta < 3 |Z|^2$.
In all the cases, the attractive nature of the solution depends on the Hessian
matrix, which however may have null directions.

15 In this case equation (278) is a consequence of the special geometry
   relation $D_{\bar\imath} C_{jk\ell} = 0$.

6.2 N > 2 Matter Coupled Attractors

The N = 3 Case

The scalar manifold for this theory, as discussed in Sect. 3.2, is the coset
space

\[ G/H = \frac{SU(3, n)}{SU(3) \times SU(n) \times U(1)} \tag{285} \]

and the relations among central and matter charges are (see (90))

\[ D(\omega) Z_{AB} = Z_I P^I_{AB} , \]
\[ D(\omega) Z_I = \frac{1}{2}\, Z_{AB} \bar P^{AB}_I . \tag{286} \]

The extremum condition on the black-hole potential is then

\[ dV_{B-H} = \frac{1}{2}\, D Z_{AB} \bar Z^{AB} + \frac{1}{2}\, Z_{AB} D \bar Z^{AB} + D Z_I \bar Z^I + Z_I D \bar Z^I \]
\[ = P^I_{AB} \bar Z^{AB} Z_I + \text{c.c.} = 0 , \tag{287} \]

and allows two different solutions with non-zero area. This is expected from
Sect. 5.2 because the isometry group of the symmetric space (285) only has a
quadratic invariant

\[ I_2 = \frac{1}{2}\, |Z_{AB}|^2 - |Z_I|^2 . \tag{288} \]

Then,
• either $Z_{AB} \neq 0$, $Z_I = 0$, in which case we have a BPS attractor and
  the black-hole potential becomes
  \[ V_{B-H} |_{attr} = I_2 |_{attr} > 0 , \tag{289} \]
• or $Z_I \neq 0$, $Z_{AB} = 0$, which gives a non-BPS attractor solution with
  black-hole potential
  \[ V_{B-H} |_{attr} = -I_2 |_{attr} > 0 . \tag{290} \]

The N = 4 Case
In this case the scalar manifold is the coset space

\[ G/H = \frac{SU(1, 1)}{U(1)} \times \frac{SO(6, n)}{SO(6) \times SO(n)} \tag{291} \]

and the relations among central and matter charges are (see (90) and the
discussion below)

\[ D(\omega) Z_{AB} = Z_I P^I_{AB} + \frac{1}{2}\, \bar Z^{CD} \epsilon_{ABCD} P , \]
\[ D(\omega) \bar Z_I = \frac{1}{2}\, \bar Z^{AB} P_{ABI} + Z_I P . \tag{292} \]

We recall that for this theory the vielbein $P_{ABI}$ satisfies the reality
condition
$\bar P^{ABI} \equiv (P_{ABI})^* = \frac{1}{2}\, \epsilon^{ABCD} P_{CD}{}^{I}$.
The extremum condition on the black-hole potential is then

\[ dV_{B-H} = \frac{1}{2}\, D Z_{AB} \bar Z^{AB} + \frac{1}{2}\, Z_{AB} D \bar Z^{AB} + D Z_I \bar Z^I + Z_I D \bar Z^I \]
\[ = P_{ABI} \left( \bar Z^{AB} Z_I + \frac{1}{2}\, \epsilon^{ABCD} Z_{CD} \bar Z_I \right)
   + P \left( Z_I Z_I + \frac{1}{4}\, \epsilon_{ABCD} \bar Z^{AB} \bar Z^{CD} \right) \]
\[ + \bar P \left( \bar Z_I \bar Z_I + \frac{1}{4}\, \epsilon^{ABCD} Z_{AB} Z_{CD} \right) = 0 . \tag{293} \]

Equation (293) is satisfied for

\[ \bar Z^{AB} Z^I + \frac{1}{2}\, \epsilon^{ABCD} Z_{CD} \bar Z^I = 0 , \]
\[ Z_I Z_J \delta^{IJ} + \frac{1}{4}\, \epsilon_{ABCD} \bar Z^{AB} \bar Z^{CD} = 0 . \tag{294} \]

Therefore we have, in terms of the proper values $Z_1$, $Z_2$ of the central
charge antisymmetric matrix $Z_{AB}$ (by means of a U(1) ⊂ H transformation
[45], they may always be chosen real and positive) and of the complex matter
charges $Z_I$

\[ \bar Z_1 Z^I + Z_2 \bar Z^I = 0 , \]
\[ Z^I Z^I + 2 \bar Z_1 \bar Z_2 = 0 . \tag{295} \]

• The BPS solution with finite area is found, as discussed in general in
  Sect. 5, for
  \[ Z_I = 0 ; \quad Z_2 = 0 \quad (\text{for } Z_1 > Z_2) \tag{296} \]
  and corresponds to the black-hole potential
  \[ V_{B-H} |_{attr} = (Z_1)^2 . \tag{297} \]
  This solution partially breaks the symmetry of the moduli space, as
  \[ SU(4) \rightarrow SU(2) \times SU(2) \times U(1) , \qquad SO(n) \rightarrow SO(n) . \]

There are also two non-BPS solutions:

• One is found by choosing $Z_I = (z, 0)$,
  \[ Z_1 = Z_2 = \rho , \qquad z = \sqrt{2}\, i \rho , \tag{298} \]
  which gives, for the black-hole potential
  \[ V_{B-H} |_{attr} = (Z_1)^2 + (Z_2)^2 + |z|^2 = 4 \rho^2 . \tag{299} \]
  In this case the isotropy symmetry then becomes
  \[ SU(4) \rightarrow USp(4) , \qquad SO(n) \rightarrow SO(n - 1) . \]

• The other is obtained by choosing instead $Z_I = (k_1, k_2, 0)$ and
  $Z_{AB} = 0$. This solves (295) for $k_1^2 + k_2^2 = 0$, that is for
  $k_2 = \pm i k_1 \equiv \pm i k$, giving
  \[ V_{B-H} |_{attr} = |k_1|^2 + |k_2|^2 = 2 |k|^2 . \tag{300} \]
  For this case, then, the isotropy symmetry preserved is
  \[ SU(4) \rightarrow SU(4) , \qquad SO(n) \rightarrow SO(n - 2) . \]

The analysis of this section is in accord with the discussion on U-invariants
of Sect. 5.2. Indeed, the isometry group of the scalar manifold (291) admits
the quartic invariant (245)

\[ I_4 = S_1^2 - |S_2|^2 , \tag{301} \]

where $S_1$ and $S_2$ are the O(6, n) invariants introduced in (243) and (244)
and we have $S_{B-H} = \sqrt{|I_4|}$.
For the BPS case, $I_4 > 0$. For the non-BPS ones we have, in the first case,
$I_4 = -|S_2|^2 < 0$, and in the second case $I_4 = S_1^2 > 0$.
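
The three N = 4 attractor solutions can be cross-checked against (295) and (301) with a few lines of code. This is a minimal sketch under explicit assumptions: the central charge matrix is taken in its normal form, so that $\frac{1}{2} Z_{AB} \bar Z^{AB} = |Z_1|^2 + |Z_2|^2$ and $\frac{1}{4} \epsilon^{ABCD} Z_{AB} Z_{CD} = 2 Z_1 Z_2$; three matter charges are used for illustration; the function names and the sample values of `rho` and `k` are hypothetical.

```python
import numpy as np


def attractor_eqs(Z1, Z2, ZI):
    """Left-hand sides of the two conditions in (295)."""
    ZI = np.asarray(ZI, dtype=complex)
    first = np.conj(Z1) * ZI + Z2 * np.conj(ZI)
    second = np.sum(ZI * ZI) + 2.0 * np.conj(Z1) * np.conj(Z2)
    return first, second


def potential(Z1, Z2, ZI):
    """V = (1/2) Z_AB Zbar^AB + Z_I Zbar^I in terms of proper values."""
    ZI = np.asarray(ZI, dtype=complex)
    return abs(Z1)**2 + abs(Z2)**2 + np.sum(np.abs(ZI)**2)


def quartic_invariant(Z1, Z2, ZI):
    """I_4 = S_1^2 - |S_2|^2 of (301), with S_1, S_2 from (243)-(244)."""
    ZI = np.asarray(ZI, dtype=complex)
    S1 = abs(Z1)**2 + abs(Z2)**2 - np.sum(np.abs(ZI)**2)
    S2 = 2.0 * Z1 * Z2 - np.sum(ZI * ZI)
    return S1**2 - abs(S2)**2


rho, k = 1.3, 0.7 + 0.2j   # illustrative values
solutions = {
    "BPS": (rho, 0.0, [0.0, 0.0, 0.0]),
    "non-BPS, Z_AB != 0": (rho, rho, [np.sqrt(2) * 1j * rho, 0.0, 0.0]),
    "non-BPS, Z_AB = 0": (0.0, 0.0, [k, 1j * k, 0.0]),
}

results = {}
for name, (Z1, Z2, ZI) in solutions.items():
    first, second = attractor_eqs(Z1, Z2, ZI)
    I4 = quartic_invariant(Z1, Z2, ZI)
    V = potential(Z1, Z2, ZI)
    results[name] = (np.max(np.abs(first)), abs(second), I4, V)
    print(name, results[name])
```

For each solution both conditions in (295) vanish, the potential at the attractor equals $\sqrt{|I_4|}$, and the sign of $I_4$ is positive, negative and positive respectively, in agreement with the classification above.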
The case of the pure N = 4 supergravity model anticipated as an example in
Sect. 5 falls in this classification and corresponds to the BPS solution
(since in that case $Z_I \equiv 0$). It is however interesting to look at the
N = 2 reduction of that model, where only two of the six vector fields
survive, one as the graviphoton and one inside a vector multiplet whose
scalars span the coset $\frac{SU(1,1)}{U(1)}$ (axion–dilaton system).
Correspondingly, the two proper values of the N = 4 central charge play now
two different roles: one, say $Z_1$, is the N = 2 central charge, while the
other, $Z_2$, is the matter charge. Equation (295) has now two distinct
solutions (corresponding to the twice degenerate BPS solution in N = 4): the
BPS one, for $Z_2 = 0$, $M_{ADM} = Z_1$, and a non-BPS one, for $Z_1 = 0$,
$Z_2 \neq 0$. This is understood, in terms of invariants, from the fact that
SU(1, 1) does not have an independent quartic invariant, and in fact, in this
case, one finds that $I_4$ reduces to
$I_4 = \left[ (Z_1)^2 - (Z_2)^2 \right]^2$.

6.3 N > 4 Pure Supergravity Attractors

We are going to discuss here the attractor solutions for the extended theories
with N > 4, where no matter multiplets may be coupled. We will include a
discussion of their relation to N = 2 BPS and non-BPS black holes, already
presented in [71].

The N = 5 Case
The moduli space of this model is

\[ G/H = \frac{SU(1, 5)}{U(5)} , \tag{302} \]

the theory contains 10 graviphotons and the relations among the central
charges are

\[ D(\omega) Z_{AB} = \frac{1}{2}\, \bar Z^{CD} P_{ABCD} . \tag{303} \]

Correspondingly, the extremum condition on the black-hole potential is

\[ dV_{B-H} = \frac{1}{2}\, D Z_{AB} \bar Z^{AB} + \frac{1}{2}\, Z_{AB} D \bar Z^{AB}
   = \frac{1}{4}\, P_{ABCD} \bar Z^{AB} \bar Z^{CD} + \text{c.c.} = 0 . \tag{304} \]

This extremum condition allows only one solution with non-zero area, the
BPS one. Indeed, in terms of the proper values $Z_1$, $Z_2$ of $Z_{AB}$, (304)
becomes

\[ Z_1 Z_2 + \bar Z_1 \bar Z_2 = 0 . \tag{305} \]

However, by means of a U(5) rotation $Z_1$, $Z_2$ may always be chosen real
and non-negative [45], leaving as the only solution with non-zero area
$Z_1 > 0$, $Z_2 = 0$ (or vice versa). The black-hole potential on this
solution is

\[ V_{B-H} |_{attr} = |Z_1|^2 \quad (\text{or } 1 \leftrightarrow 2) . \tag{306} \]

This solution is $\frac{1}{5}$-BPS and breaks the symmetry of the moduli space:

\[ U(5) \rightarrow SU(2) \times SU(3) \times U(1) . \]

However, if we truncate this model N = 5 → N = 2, we have the following
decomposition of the 10 vectors:

\[ 10 \rightarrow 1 + 3 + 6 . \]

The singlet corresponds to the N = 2 graviphoton, while 3 is the
representation of the three vectors in the vector multiplets. The 6 extra
vectors are projected out in the truncation. Correspondingly, the N = 5
central charge $Z_{AB}$ reduces to

\[ Z_{AB} \rightarrow
   \begin{pmatrix} Z_{ab} = Z \epsilon_{ab} & 0 \\ 0 & Z_{IJ} = \epsilon_{IJK} \bar Z^K \end{pmatrix} ,
   \quad a, b = 1, 2 ; \; I, J, K = 1, 2, 3 . \tag{307} \]

The two solutions $Z_1 > 0$, $Z_2 = 0$ and $Z_1 = 0$, $Z_2 > 0$, which were
BPS and degenerate in the N = 5 theory, correspond in the N = 2
interpretation, respectively, to a BPS solution (if we set $Z_1 \equiv Z$)
and to a non-BPS solution with Z = 0, as for the quadratic series discussed
in Sect. 6.1.
Let us inspect these results in terms of the discussion of Sect. 5.2.
The SU(5, 1) invariant is (in terms of the U(5) invariants introduced in
Sect. 5.2):

\[ I_4 = 4\, \mathrm{Tr}(A^2) - (\mathrm{Tr} A)^2 \tag{308} \]

that is, in terms of the proper values of the central charge

\[ I_4 = \left[ (Z_1)^2 - (Z_2)^2 \right]^2 . \tag{309} \]

The solutions with $Z_1 > Z_2$ and with $Z_1 < Z_2$ are separated by the
solution $Z_1 = Z_2$, which corresponds to a small black hole, with
$I_4 = 0$. This is the solution which preserves the maximal amount of
supersymmetry ($\frac{2}{5}$ unbroken), but it does not come from the
attractor equations.

The N = 6 Case

The moduli space is

\[ G/H = \frac{SO^*(12)}{U(6)} , \tag{310} \]

and the theory contains 16 graviphotons, 15 in the twice-antisymmetric
representation of U(6) plus a singlet. The attractor solutions for this theory
have already been presented in [75].
The relations among the central charges are

\[ D(\omega) Z_{AB} = \frac{1}{2}\, \bar Z^{CD} P_{ABCD} + \frac{1}{4!}\, \bar Z\, \epsilon_{ABCDEF}\, \bar P^{CDEF} , \]
\[ D(\omega) Z = \frac{1}{2!\, 4!}\, \bar Z^{AB} \epsilon_{ABCDEF}\, \bar P^{CDEF} . \tag{311} \]

The black-hole potential for this theory is

\[ V_{B-H} = \frac{1}{2}\, Z_{AB} \bar Z^{AB} + Z \bar Z \tag{312} \]

and the extremum condition is then

\[ dV_{B-H} = \frac{1}{2}\, D Z_{AB} \bar Z^{AB} + \frac{1}{2}\, Z_{AB} D \bar Z^{AB} + D Z\, \bar Z + Z\, D \bar Z \]
\[ = \frac{1}{4}\, P_{ABCD} \left( \bar Z^{AB} \bar Z^{CD} + \frac{1}{3!}\, \epsilon^{ABCDEF} Z_{EF} Z \right) + \text{c.c.} = 0 . \tag{313} \]

In terms of the proper values $Z_1$, $Z_2$, $Z_3$ of $Z_{AB}$, which may
always be chosen real and non-negative by a U(6) rotation, the condition to be
satisfied on the extremum is

\[ Z_1 Z_2 + Z Z_3 = 0 \quad (1 \rightarrow 2 \rightarrow 3 \rightarrow 1 \text{ cyclically}) . \tag{314} \]

This equation admits one $\frac{1}{6}$-BPS solution with Z = 0, and two
independent non-BPS solutions, both with $Z \neq 0$.
• The BPS solution is found for
  \[ Z = 0 , \quad Z_2 = Z_3 = 0 , \quad Z_1 \neq 0 , \tag{315} \]
  if we choose $Z_1 \geq Z_2 \geq Z_3$. In this case the black-hole potential
  becomes
  \[ V_{B-H} |_{attr} = |Z_1|^2 \quad (\text{or } 1 \leftrightarrow 2 \leftrightarrow 3) \tag{316} \]
  and corresponds to $I_4 > 0$.
  This solution breaks the symmetry
  \[ U(6) \rightarrow SU(2) \times U(4) \]
  and corresponds to an $\frac{SO^*(12)}{SU(4,2)}$ orbit of the charge vector.
• One non-BPS solution is obtained for
  \[ Z \neq 0 , \quad Z_1 = Z_2 = Z_3 = 0 . \tag{317} \]
  It gives for the black-hole potential
  \[ V_{B-H} |_{attr} = |Z|^2 , \tag{318} \]
  and preserves all the U(6) symmetry of the moduli space. This solution
  corresponds to the orbit $\frac{SO^*(12)}{SU(6)}$. Also for this solution
  the quartic invariant is positive, $I_4 > 0$.
• The third solution is found by setting
  \[ Z_1 = Z_2 = Z_3 = \rho , \quad Z = -\rho . \tag{319} \]
  In this case the black-hole potential becomes
  \[ V_{B-H} |_{attr} = 4 \rho^2 . \tag{320} \]
  This solution breaks the symmetry U(6) → USp(6), and corresponds to the
  charge orbit $\frac{SO^*(12)}{SU^*(6)}$. The quartic invariant for this
  solution is negative, $I_4 < 0$.
It is interesting to note, as already observed in [50, 71, 75], that the
bosonic sector of the N = 6 theory is exactly the same as the one of the N = 2
model coupled with 15 vector multiplets with scalar sector based on the same
coset (310). In the N = 2 interpretation of this model, the singlet charge Z
plays the role of central charge, while the 15 charges $Z_{AB}$ are
interpreted as matter charges.
The interpretation of the three attractor solutions is now different: the
first one, which was $\frac{1}{6}$-BPS in the N = 6 model, is now non-BPS and
breaks supersymmetry, while the second one in this model is
$\frac{1}{2}$-BPS. The third solution, where all the proper forms of the
dressed charges are different from zero, is non-BPS in both interpretations.
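
The three N = 6 solutions can be checked against the O*(12) invariant (258) of Sect. 5.2. The sketch below works directly with proper values and is only illustrative: it assumes the normal form of $Z_{AB}$ with real non-negative $Z_1, Z_2, Z_3$, for which $\mathrm{Tr} A = -2 \sum_i Z_i^2$, $\mathrm{Tr}(A^2) = 2 \sum_i Z_i^4$ and $\mathrm{Pf} Z = Z_1 Z_2 Z_3$, and it identifies the singlet X of (258) with the singlet charge Z of (312). The function name `n6_data` and the value of `rho` are hypothetical.

```python
import numpy as np


def n6_data(Z1, Z2, Z3, X):
    """Quartic combination and S of (258), potential (312), conditions (314)."""
    z = np.array([Z1, Z2, Z3], dtype=float)   # real, non-negative proper values
    trA = -2.0 * np.sum(z**2)                 # Tr A = Z_AB Zbar^BA
    trA2 = 2.0 * np.sum(z**4)                 # Tr(A^2)
    pf = np.prod(z)                           # Pf Z in the normal form
    I1, I2 = trA**2, trA2
    I3 = (pf * X).real
    I4 = trA * abs(X)**2
    I5 = abs(X)**4
    quartic = 4 * I2 - I1 + 32 * I3 + 4 * I4 + 4 * I5
    S = 0.5 * np.sqrt(abs(quartic))
    V = np.sum(z**2) + abs(X)**2
    eqs = np.array([Z1 * Z2 + X * Z3, Z2 * Z3 + X * Z1, Z3 * Z1 + X * Z2])
    return quartic, S, V, eqs


rho = 0.9   # illustrative value
cases = {
    "1/6-BPS": (rho, 0.0, 0.0, 0.0),
    "non-BPS, Z_AB = 0": (0.0, 0.0, 0.0, rho),
    "non-BPS, all charges": (rho, rho, rho, -rho),
}

n6_results = {name: n6_data(*args) for name, args in cases.items()}
for name, (quartic, S, V, eqs) in n6_results.items():
    print(name, quartic, S, V, np.max(np.abs(eqs)))
```

In all three cases the conditions (314) hold and S coincides with the value of the black-hole potential at the attractor; the quartic combination is positive for the first two solutions and negative for the third.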

The N = 8 Case

This model has been studied in detail in [54]. Its scalar manifold is the coset

\[ G/H = \frac{E_{7(7)}}{SU(8)} . \tag{321} \]

The relations among the 28 central charges are

\[ D(\omega) Z_{AB} = \frac{1}{2}\, \bar Z^{CD} P_{ABCD} , \tag{322} \]

where the vielbein $P_{ABCD}$ satisfies the reality condition

\[ \bar P^{ABCD} = \epsilon^{ABCDEFGH} P_{EFGH} . \tag{323} \]

The extremum condition is then

\[ dV_{B-H} = \frac{1}{2}\, D Z_{AB} \bar Z^{AB} + \frac{1}{2}\, Z_{AB} D \bar Z^{AB} \]
\[ = \frac{1}{4}\, P_{ABCD} \left( \bar Z^{AB} \bar Z^{CD} + \frac{1}{4!}\, \epsilon^{ABCDEFGH} Z_{EF} Z_{GH} \right) = 0 . \tag{324} \]

In terms of the central charge proper values $Z_1, \cdots, Z_4$ the condition
for the extremum may be written

\[ \begin{cases}
   Z_1 Z_2 + \bar Z_3 \bar Z_4 = 0 \\
   Z_1 Z_3 + \bar Z_2 \bar Z_4 = 0 \\
   Z_2 Z_3 + \bar Z_1 \bar Z_4 = 0
   \end{cases} \tag{325} \]

and admits two independent attractor solutions:
• The BPS solution is found for
  \[ Z_2 = Z_3 = Z_4 = 0 , \quad Z_1 \neq 0 , \tag{326} \]
  if we choose $Z_1 \geq Z_2 \geq Z_3 \geq Z_4$. In this case the black-hole
  potential becomes
  \[ V_{B-H} |_{attr} = |Z_1|^2 \quad (\text{or } 1 \leftrightarrow 2 \leftrightarrow 3 \leftrightarrow 4) \tag{327} \]
  and corresponds to $I_4 > 0$. This solution breaks the symmetry
  \[ SU(8) \rightarrow SU(2) \times U(6) \]
  and corresponds to an orbit $\frac{E_{7(7)}}{E_{6(2)}}$ of the charge vector.
• The non-BPS solution is obtained for
  \[ Z_1 = Z_2 = Z_3 = Z_4 = e^{i\pi/4} \rho , \quad \rho \in \mathbb{R}^+ . \tag{328} \]
  It gives for the black-hole potential
  \[ V_{B-H} |_{attr} = 4 \rho^2 . \tag{329} \]
  This solution breaks the symmetry SU(8) → USp(8), and corresponds to the
  charge orbit $\frac{E_{7(7)}}{E_{6(6)}}$. The quartic invariant for this
  solution is negative, $I_4 < 0$.
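
The two N = 8 attractor solutions can be compared with the $E_{7(-7)}$ invariant (265) of Sect. 5.2. The following sketch is an illustration under stated assumptions: the central charge matrix is built in its skew-diagonal normal form with $Z_{12} = Z_1$, $Z_{34} = Z_2$, and so on, so that $\mathrm{Pf} Z = Z_1 Z_2 Z_3 Z_4$; $\bar Z^{AB}$ is taken to be the entrywise complex conjugate; the function names and the value of `rho` are hypothetical.

```python
import numpy as np


def normal_form(z):
    """8x8 antisymmetric matrix with Z_12 = z[0], Z_34 = z[1], and so on."""
    Z = np.zeros((8, 8), dtype=complex)
    for k, zk in enumerate(z):
        Z[2 * k, 2 * k + 1] = zk
        Z[2 * k + 1, 2 * k] = -zk
    return Z


def e7_quartic(z):
    """4 Tr(A^2) - (Tr A)^2 + 32 Re(Pf Z), the combination inside (265)."""
    Z = normal_form(z)
    A = Z @ Z.conj()                      # A_A^B = Z_AC Zbar^CB
    trA = np.trace(A).real
    trA2 = np.trace(A @ A).real
    pf = np.prod(np.asarray(z, dtype=complex))   # Pfaffian of the normal form
    return 4.0 * trA2 - trA**2 + 32.0 * pf.real


def extremum_eqs(z):
    """Left-hand sides of the three conditions (325)."""
    z1, z2, z3, z4 = z
    return np.array([z1 * z2 + np.conj(z3 * z4),
                     z1 * z3 + np.conj(z2 * z4),
                     z2 * z3 + np.conj(z1 * z4)])


rho = 1.7   # illustrative value
z_bps = [rho, 0.0, 0.0, 0.0]
z_nonbps = [rho * np.exp(1j * np.pi / 4)] * 4

n8_results = {}
for name, z in (("BPS", z_bps), ("non-BPS", z_nonbps)):
    quartic = e7_quartic(z)
    S = 0.5 * np.sqrt(abs(quartic))
    V = sum(abs(zk)**2 for zk in z)       # V = (1/2) Z_AB Zbar^AB
    n8_results[name] = (quartic, S, V, np.max(np.abs(extremum_eqs(z))))
    print(name, n8_results[name])
```

Both configurations satisfy (325); the quartic combination is positive for the BPS solution and negative for the non-BPS one, and in both cases S equals the attractor value of the potential, $\rho^2$ and $4\rho^2$ respectively.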

7 Conclusions
This survey has presented the main features of the physics of black holes em-
bedded in supersymmetric theories of gravitation. They have an extremely rich
structure and give an interplay between space–time singularities in solutions
of Einstein matter coupled equations and the solitonic, particle-like structure
of these configurations such as mass, spin and charge.
The present analysis may be extended to rotating black holes and to ge-
ometries not necessarily asymptotically flat (such as, for example, asymptot-
ically anti-de Sitter solutions). Furthermore, the concept of entropy may be
extended to theories which include higher curvature and higher derivative
matter terms [27, 28, 42, 43]. This is important in order to make contact with
superstring and M-theory where these terms unavoidably appear. In this con-
text, a remarkable connection has been found between the entropy functional
and the topological string partition function, an approach pioneered in [29].
Black-hole attractors fall in the class of possible superstring vacua, which
in a wide context have led to the study of the so-called landscape [78].
It is a challenging problem to see which new directions towards a funda-
mental theory of nature these investigations may suggest in the future.

Acknowledgements
The present review is partly based on the work and discussions with the
following people: S. Bellucci, A. Ceresole, M. Duff, P. Fré, E. Gimon,
M. Gunaydin, R. Kallosh, M.A. Lledó, J. Maldacena, A. Marrani, and
A. Strominger.
Work supported in part by the European Community’s Human Potential
Program under contract MRTN-CT-2004-005104 “Constituents, fundamental
forces and symmetries of the universe”, in which L.A., R.D’A., and M.T.
are associated to Torino University. The work of S.F. has been supported in
part by European Community’s Human Potential Program under contract
MRTN-CT-2004-005104 “Constituents, fundamental forces and symmetries of
the universe” and the contract MRTN-CT-2004-503369 “The quest for
unification: Theory Confronts Experiments”, in association with INFN Frascati
National Laboratories and by D.O.E. grant DE-FG03-91ER40662, Task C.

References
1. B. De Witt, C. De Witt eds.: Black Holes (Gordon and Breach, New York,
1973); S. W. Hawking, W. Israel: General Relativity (Cambridge University
Press, Cambridge, 1979); R. M. Wald: General Relativity (University of Chicago
Press, Chicago, 1984) 661, 705
2. G. W. Moore: Les Houches lectures on strings and arithmetic, arXiv:hep-
th/0401049; M. R. Douglas, R. Reinbacher, S. T. Yau: Branes, bundles and
attractors: Bogomolov and beyond, arXiv:math.ag/0604597 661, 695
3. M. J. Duff: String triality, black-hole entropy and Cayley’s hyperdeterminant,
arXiv:hep-th/0601134; R. Kallosh, A. Linde: Phys. Rev. D 73, 104033 (2006);
P. Levay: Phys. Rev. D 74, 024030 (2006); M. J. Duff, S. Ferrara: E7 and
the tripartite entanglement of seven qubits, arXiv:quant-ph/0609227; P. Levay:
Strings, black holes, the tripartite entanglement of seven qubits and the Fano
plane, arXiv:hep-th/0610314 661
4. S. W. Hawking, R. Penrose: Proc. Roy. Soc. Lond. A 314, 529 (1970) 661
5. G. Gibbons: in Unified theories of Elementary Particles. Critical Assessment
and Prospects, Proceedings of the Heisenberg Symposium, München, Germany,
1981, ed. by P. Breitenlohner, H. P. Dürr, Lecture Notes in Physics, Vol. 160
(Springer-Verlag, Berlin, 1982); G. W. Gibbons: in Supersymmetry, Supergrav-
ity and Related Topics, Proceedings of the XVth GIFT International Physics
(Girona, Spain 1984), ed. by F. del Aguila, J. de Azcárraga, L. Ibáñez, (World
Scientific, Singapore, 1985), p. 147; P. Breitenlohner, D. Maison, G. W. Gib-
bons: Commun. Math. Phys. 120, 295 (1988); R. Kallosh, A. D. Linde, T. Ortin,
A. W. Peet, A. Van Proeyen: Phys. Rev. D 46, 5278 (1992); R. Kallosh, T. Or-
tin, A. W. Peet: Phys. Rev. D 47, 5400 (1993); R. Kallosh: Phys. Lett. B 282,
80 (1992); R. Kallosh, A. W. Peet: Phys. Rev. D 46, 5223 (1992); R. R. Khuri,
T. Ortin: Nucl. Phys. B 467, 355 (1996); A. Sen: Nucl. Phys. B 440, 421
(1995); A. Sen: Phys. Lett. B 303, 22 (1993); A. Sen: Mod. Phys. Lett. A 10,
2081 (1995); M. Cvetic, C. M. Hull: Nucl. Phys. B 480, 296 (1996); M. Cvetic,
I. Gaida: Nucl. Phys. B 505, 291 (1997); M. Cvetic, D. Youm: arXiv:hep-
th/9512127; M. Cvetic, A. A. Tseytlin: Phys. Rev. D 53, 5619 (1996) 662, 663, 707
Extremal Black Holes in Supergravity 725

6. For reviews on black holes in superstring theory see: J. M. Maldacena:

Black-Holes in String Theory, hep-th/9607235; A. W. Peet: TASI lectures on
black holes in string theory, arXiv:hep-th/0008241; B. Pioline: Class. Quant.
Grav. 23, S981 (2006); A. Dabholkar: Class. Quant. Grav. 23, S957 (2006) 662, 663
7. S. W. Hawking: Phys. Rev. Lett. 26, 1344 (1971); J. D. Bekenstein: Phys. Rev.
D 7, 2333 (1973) 662
8. A. Strominger, C. Vafa: Phys. Lett. B 379, 99 (1996); C. G. . Callan, J. M. Mal-
dacena: Nucl. Phys. B 472, 591 (1996); G. T. Horowitz, A. Strominger: Phys.
Rev. Lett. 77, 2368 (1996); R. Dijkgraaf, E. P. Verlinde, H. L. Verlinde: Nucl.
Phys. B 486, 77 (1997); D. M. Kaplan, D. A. Lowe, J. M. Maldacena, A. Stro-
minger: Phys. Rev. D 55, 4898 (1997); J. M. Maldacena: Phys. Lett. B 403, 20
(1997); J. M. Maldacena, A. Strominger, E. Witten: JHEP 9712, 002 (1997);
M. Bertolini, M. Trigiante: JHEP 0010, 002 (2000) 662, 663, 665
9. E. Witten: Nucl. Phys. B 443, 85 (1995) 662
10. C. M. Hull, P. K. Townsend: Nucl. Phys. B 438, 109 (1995) 662, 664, 680, 710
11. G. Nordström: Proc. Kon. Ned. Akad. Wet. 20, 1238 (1918); H. Reissner: Ann.
Physik 50, 106 (1916) 662
12. R. Penrose: Riv. Nuovo Cim. 1, 252 (1969); Gen. Rel. Grav. 34, 1141 (2002);
R. Penrose: in General Relativity, an Einstein Centenary Survey, ed. by
S. W. Hawking, W. Israel (Cambridge University Press, Cambridge, 1979) 663
13. B. Bertotti: Phys. Rev. 116, 1331 (1959); I. Robinson: Bull. Acad. Pol. Sci.
Ser. Sci. Math. Astron. Phys. 7, 351 (1959) 663
14. R. Kallosh, T. Ortin: Phys. Rev. D 48, 742 (1993); E. Bergshoeff, R. Kallosh,
T. Ortin: Nucl. Phys. B 478, 156 (1996) 663
15. The literature on this topic is quite extended. As a general review, see the
lecture notes: K. Stelle: Lectures on Supergravity p-Branes, presented at 1996
ICTP Summer School, Trieste, arXiv:hep-th/9701088 663, 697
16. M. J. Duff, R. R. Khuri, J. X. Lu: String solitons, Phys. Rep. 259, 213 (1995) 663
17. For recent reviews see: J. H. Schwarz: Nucl. Phys. Proc. Suppl. B 55, 1 (1997);
M. J. Duff: Int. J. Mod. Phys. A 11, 5623 (1996); A. Sen: Nucl. Phys. Proc.
Suppl. 58, 5 (1997) 664
18. J. H. Schwarz, A. Sen, Phys. Lett. B 312, 105 (1993); J. H. Schwarz, A. Sen:
Nucl. Phys. B 411, 35 (1994) 664
19. M. Gasperini, J. Maharana, G. Veneziano: Phys. Lett. B 272, 277 (1991);
J. Maharana, J. H. Schwarz: Nucl. Phys. B 390, 3 (1993) 664
20. J. H. Schwarz: M theory extensions of T duality, arXiv:hep-th/9601077; C. Vafa:
Nucl. Phys. B 469, 403 (1996) 664
21. S. Ferrara, R. Kallosh, A. Strominger: Phys. Rev. D 52, 5412 (1995) 664, 696, 697
22. S. Ferrara, R. Kallosh: Phys. Rev. D 54, 1514 (1996); S. Ferrara, R. Kallosh:
Phys. Rev. D 54, 1525 (1996) 664, 695, 696, 707
23. A. Strominger: Phys. Lett. B 383, 39 (1996) 664, 696
24. J. Polchinski, Y. Cai: Nucl. Phys. B 296, 91 (1988); C. G. . Callan, C. Lovelace,
C. R. Nappi, S. A. Yost: Nucl. Phys. B 308, 221 (1988); A. Sagnotti:
Open strings and their symmetry groups, Cargese Summer Inst. (1987) 0521,
arXiv:hep-th/0208020; M. Bianchi, A. Sagnotti: Phys. Lett. B 247, 517 (1990);
M. Bianchi, A. Sagnotti: Nucl. Phys. B 361, 519 (1991); P. Horava: Nucl. Phys.
B 327, 461 (1989); J. Polchinski: Phys. Rev. Lett. 75, 4724 (1995) 665
25. S. Ferrara, J. M. Maldacena: Class. Quant. Grav. 15, 749 (1998); S. Ferrara,
M. Gunaydin: Int. J. Mod. Phys. A 13, 2075 (1998) 665, 668
726 L. Andrianopoli et al.

26. L. Andrianopoli, R. D’Auria, S. Ferrara: Phys. Lett. B 403, 12 (1997) 665

27. R. M. Wald: Phys. Rev. D 48, 3427 (1993) 665, 723
28. G. Lopes Cardoso, B. de Wit, T. Mohaupt: Phys. Lett. B 451, 309 (1999);
G. Lopes Cardoso, B. de Wit, T. Mohaupt: Nucl. Phys. B 567, ) 87 (2000) 665, 723
29. H. Ooguri, A. Strominger, C. Vafa: Phys. Rev. D 70, 106007 (2004) 665, 667, 723
30. R. R. Khuri, T. Ortin: Phys. Lett. B 373, 56 (1996); T. Ortin: Phys. Lett. B
422, 93 (1998); T. Ortin: Non-supersymmetric (but) extreme black holes, scalar
hair and other open problems, arXiv:hep-th/9705095 666
31. S. Ferrara, G. W. Gibbons, R. Kallosh: Nucl. Phys. B 500, 75 (1997) 666, 701, 703, 708, 713
32. K. Goldstein, N. Iizuka, R. P. Jena, S. P. Trivedi: Phys. Rev. D 72, 124021
(2005) 666
33. R. Kallosh: JHEP 0512, 022 (2005) 666, 714
34. P. K. Tripathy, S. P. Trivedi: JHEP 0603, 022 (2006) 666, 701, 716
35. A. Giryavets: JHEP 0603, 020 (2006) 666
36. R. Kallosh, N. Sivanandam, M. Soroush: JHEP 0603, 060 (2006) 666, 701, 702, 703
37. A. Dabholkar, A. Sen, S. Trivedi: Black hole microstates and attractor without
supersymmetry, arXiv:hep-th/0611143 666
38. G. G. Gibbons: http://www.lpthe.jussieu.fr/sugra30/TALKS/Gibbons.pdf,
talk presented at the conference 30 years of Supergravity, journée Joël Scherk,
held in Paris the 18/10/2006 666
39. M. K. Gaillard, B. Zumino: Nucl. Phys. B 193, 221 (1981) 667, 674, 675, 683
40. R. Kallosh: From BPS to non-BPS black holes canonically, arXiv:hep-
th/0603003 667
41. For a review, see for instance: S. Bellucci, S. Ferrara, A. Marrani: Supersym-
metric Mechanics P Vol. 2: The Attractor Mechanism and Space Time Singu-
larities, Lecture Notes in Physics, Vol. 701 (Springer, Berlin/Heidelberg, 2006),
Proceedings of the SSM05–Winter School on Modern Trends in supersymmetric
Mechanics, INFN-LNF, Italy (2005) 667
42. G. Lopes Cardoso, B. de Wit, J. Kappeli, T. Mohaupt: JHEP 0412, 075 (2004)
667, 723
43. A. Sen: JHEP 0509, 038 (2005) 667, 723
44. L. Andrianopoli, R. D’Auria, S. Ferrara, M. A. Lledo: Nucl. Phys. B 640, 46
(2002) 668, 669
45. B. Zumino: J.Math.Phys. 3, 1055 (1962) 670, 671, 717, 719
46. E. Witten, D. I. Olive: Phys. Lett. B 78, 97 (1978) 672
47. For a review on supergravity see for example: L. Castellani, R. D’Auria, P. Fré,
Supergravity and Superstring Theory: A Geometric Perspective (World Scien-
tific, Singapore, 1990) 675, 680
48. E. Cremmer: in Supergravity ’81, ed. by S. Ferrara, J. G. Taylor, p. 313; B. Julia:
in Superspace & Supergravity, ed. by S. Hawking, M. Rocek (Cambridge, 1981)
p. 331 680
49. A. Salam, E. Sezgin: Supergravities in Diverse Dimensions, ed. by A. Salam,
E. Sezgin (North-Holland, World Scientific, 1989), Vol. 1 680
50. L. Andrianopoli, R. D’Auria, S. Ferrara: Int. J. Mod. Phys. A 13, 431 (1998) 680, 683, 687,
51. L. Andrianopoli, R. D’Auria, S. Ferrara: Int. J. Mod. Phys. A 12, 3759 (1997) 680, 706
52. L. Castellani, A. Ceresole, S. Ferrara, R. D’Auria, P. Fré, E. Maina: Nucl. Phys.
B 268, 317 (1986) 686, 710
53. E. Bergshoeff, I. G. Koh, E. Sezgin: Phys. Lett. B 155, 71 (1985); M. de Roo,
P. Wagemans: Nucl. Phys. B 262, 644 (1985) 686, 710
Extremal Black Holes in Supergravity 727

54. S. Ferrara, R. Kallosh: Phys. Rev. D 73, 125005 (2006) 687, 715, 722
55. S. Ferrara, A. Strominger: in Proceedings of College Station Workshop Strings
‘89’, ed. by Arnowitt et al. (World Scientiﬁc, Singapore, 1989), p. 245;
P. Candelas, X. de la Ossa: Nucl. Phys. B 355, 455 (1991); B. de Wit, A. Van
Proeyen: Nucl. Phys. B 245, 89 (1984); E. Cremmer, C. Kounnas, A. Van
Proeyen, J. P. Derendinger, S. Ferrara, B. de Wit, L. Girardello: Nucl. Phys.
B 250, 385 (1985); B. de Wit, P. G. Lauwers, A. Van Proeyen: Nucl. Phys. B
255, 569 (1985); S. Ferrara, C. Kounnas, D. Lust, F. Zwirner: Nucl. Phys. B
365, 431 (1991); L. Castellani, R. D’Auria, S. Ferrara: Class. Quant. Grav. 7,
1767 (1990); L. Castellani, R. D’Auria, S. Ferrara: Phys. Lett. B 241, 57 (1990)
688
56. A. Strominger: Commun. Math. Phys. 133, 163 (1990) 688
57. L. Andrianopoli, M. Bertolini, A. Ceresole, R. D’Auria, S. Ferrara, P. Fré,
T. Magri: J. Geom. Phys. 23, 111 (1997); L. Andrianopoli, M. Bertolini,
A. Ceresole, R. D’Auria, S. Ferrara, P. Fré: Nucl. Phys. B 476, 397 (1996)
688, 696
58. E. Cremmer, A. Van Proeyen: Class. Quant. Grav. 2, 445 (1985) 692
59. L. Andrianopoli, R. D’Auria, S. Ferrara, P. Fre, M. Trigiante: Nucl. Phys.
B 509, 463 (1998); G. Arcioni, A. Ceresole, F. Cordaro, R. D’Auria, P. Fre,
L. Gualtieri, M. Trigiante: Nucl. Phys. B 542, 273 (1999) 696
60. K. Behrndt, D. Lust, W. A. Sabra: Nucl. Phys. B 510, 264 (1998) 697
61. P. Meessen, T. Ortin: Nucl. Phys. B 749, ) 291 (2006) 697
62. P. Fré: Nucl. Phys. Proc. Suppl. 57, 52 (1997) 697
63. M. Alishahiha, H. Ebrahim: JHEP 0603, 003 (2006); M. Alishahiha,
H. Ebrahim: JHEP 0611, 017 (2006) 701
64. P. Kaura, A. Misra: On the existence of non-supersymmetric black hole at-
tractors for two-parameter Calabi-Yau’s and attractor equations, arXiv:hep-
th/0607132 701
65. S. Bellucci, S. Ferrara, A. Marrani, A. Yeranyan: Mirror Fermat Calabi-Yau
threefolds and Landau-Ginzburg black hole attractors, arXiv:hep-th/0608091 701
66. D. Astefanesei, K. Goldstein, S. Mahapatra: Moduli and (un)attractor black
hole thermodynamics, arXiv:hep-th/0611140 701
67. G. W. Gibbons, R. Kallosh, B. Kol: Phys. Rev. Lett. 77, 4992 (1996) 701, 703
68. S. Ferrara, C. A. Savoy, B. Zumino: Phys. Lett. B 100, 393 (1981) 706
69. A. Ceresole, R. D’Auria, S. Ferrara: Nucl. Phys. Proc. Suppl. 46, 67 (1996) 706
70. R. Kallosh, B. Kol: Phys. Rev. D 53, 5344 (1996) 709, 713
71. S. Ferrara, E. G. Gimon, R. Kallosh: Magic supergravities, N = 8 and black-hole
composites, arXiv:hep-th/0606211 709, 719, 722
72. M. Gunaydin, O. Pavlyk: JHEP 0508, 101 (2005) 709
73. S. Bellucci, S. Ferrara, A. Marrani: Phys. Lett. B 635, 172 (2006) 714
74. S. Ferrara, M. Bodner, A. C. Cadavid: Phys. Lett. B 247, 25 (1990) 714
75. S. Bellucci, S. Ferrara, M. Gunaydin, A. Marrani: Int. J. Mod. Phys. A 21,
5043 (2006) 714, 715, 720, 722
76. E. Cremmer, A. Van Proeyen: Class. Quant. Grav. 2, 445 (1985) 715
77. B. de Wit, F. Vanderseypen, A. Van Proeyen: Nucl. Phys. B 400, 463 (1993) 715
78. M. R. Douglas: JHEP 0305, 046 (2003); F. Denef, M. R. Douglas: Computa-
tional complexity of the landscape. I, arXiv:hep-th/0602072 723
Expectation Values and Vacuum Currents
of Quantum Fields∗

G. A. Vilkovisky

Lebedev Physical Institute, Leninsky Prospect 53, Moscow 119991, Russia

Abstract. Theory of expectation values is presented as an alternative to S-matrix
theory for quantum fields. This change of emphasis is conditioned by a transition
from the accelerator physics to astrophysics and cosmology. The issues discussed are
the time-loop formalism, the Schwinger–Keldysh diagrams, the effective action, the
vacuum currents, and the effect of particle creation.

1 Introduction

High-energy physics will probably have to undergo major changes. The accel-
erators will cease being its experimental base, and it will become a part of
astrophysics. Simultaneously, the S-matrix will cease being the central object
of high-energy theory because the emphasis on this object is entirely owing
to the accelerator setting of the problem. If there is a background radiation
that originates from some initial state in the past, then where is the S-matrix
here? Astrophysics and cosmology offer the evolution problems rather than
the scattering problems. The gravitational collapse is a typical initial-value
problem. It is such by its physical setting irrespective of whether the state of
the system is classical or quantum. The nature of measurement also changes.
No final state is prepared. One measures observables like temperatures or
mechanical deflections and subjects these measurements to a statistical treat-
ment to obtain the value of the observable. This means that one measures
expectation values in the given initial state. S-matrix theory should give way
to expectation-value theory.
There is a proof that accelerator physics is dead: Gabriele Veneziano is
leaving CERN for Collège de France. At this historic moment, my mission
is to convert him into a new faith. The present preaching consists of four
lectures:

1. Formal aspects of expectation-value theory.
2. The in-vacuum state and Schwinger–Keldysh diagrams.
3. The effective action.
4. Vacuum currents and the effect of particle creation.

Literature to Lectures 1 and 2 is in [1]–[16]. Additional literature to Lecture 3
is in [17]–[41] and to Lecture 4 in [42]–[56].

∗ The course of four lectures given at Collège de France in May 2006.

G. A. Vilkovisky: Expectation Values and Vacuum Currents of Quantum Fields, Lect. Notes
Phys. 737, 729–784 (2008)
DOI 10.1007/978-3-540-74233-6_23 © Springer-Verlag Berlin Heidelberg 2008

2 Formal Aspects of Expectation-Value Theory

2.1 Vocabulary

In these lectures,

$$ \hat\varphi^i \tag{1} $$

denotes the quantum field. It is an operator function on a given differentiable
manifold (referred to below as the base manifold), and $i$ is a point of this
manifold. Generally, $\hat\varphi^i$ is a collection of fields, and then $i$ is a set containing
also the indices labelling these fields. The hat designates an operator. The
field $\hat\varphi^i$ is an operator in a Hilbert space which is not granted. The workers have
to build it with their own hands as a representation of the algebra of $\hat\varphi$'s.
For simplicity, $\hat\varphi^i$ will be assumed bosonic and real (self-adjoint) but otherwise
arbitrary.
The starting point is an operator equation for $\hat\varphi^i$,

$$ S_i(\hat\varphi) + J_i = 0 , \tag{2} $$

which is understood as an expansion. It is meant that there is a c-number
function $S_i(\varphi)$ understood as a collection of its Taylor coefficients at some
c-number point of configuration space:

$$ S_i(\varphi) = \sum_{n=0}^{\infty} \frac{1}{n!}\, S_{i j_1 \cdots j_n}(c)\, (\varphi - c)^{j_1} \cdots (\varphi - c)^{j_n} , \tag{3} $$

and one replaces $\varphi^j$ in this expansion with an operator. Which c-number field
$c^j$ will be used for this expansion does not matter because it will always sum
with the operator $(\hat\varphi - c)^j$ to make the full quantum field. The expansion point
$c^j$ is often called the "background field", and there has been much emphasis on it.
In fact it is completely immaterial. I shall never make this expansion explicitly,
but I shall keep explicit the c-number term of the equation: a source $J_i$.
Important are only the following three points.
(1) The function Si (ϕ) is local, i.e., it depends only on ϕ and its finite-order
derivatives at the point i.
(2) The function $S_i(\varphi)$ is a gradient:

$$ S_i(\varphi) = \frac{\delta}{\delta\varphi^i} S(\varphi) , \tag{4} $$

i.e., there exists an action $S(\varphi)$ generating the operator field equations.
For its derivatives the following notation will be used:

$$ S_{i_1 \cdots i_n}(\varphi) = \frac{\delta}{\delta\varphi^{i_1}} \cdots \frac{\delta}{\delta\varphi^{i_n}} S(\varphi) . \tag{5} $$

Of course, only the total action matters:

$$ S_{\rm tot} = S(\varphi) + \varphi^i J_i . \tag{6} $$

(3) There is a special condition on the matrix of second derivatives of $S(\varphi)$. I
shall refer to this continuous matrix as $S_2$:

$$ S_{ij}(\varphi) \equiv S_2(\varphi) . \tag{7} $$

By locality, $S_2$ is the kernel of some differential operator on the base manifold,
for which I shall use the same notation $S_2$. It is required that $S_2$ admit
a well-posed Cauchy problem, in which case it has unique advanced and
retarded inverses (Green's functions) $G^+$ and $G^-$:

$$ S_{ij} G^{\pm jk} = -\delta_i^k , \qquad G^{+jk} = G^{-kj} . \tag{8} $$

Because $S_2$ is symmetric, the advanced inverse is the transpose of the retarded one.

One may think of S2 as of a second-order hyperbolic operator which it will
in fact be below, but the scheme is more general. It is formalism-insensitive.
One’s field equations may have the second-order differential form or the first-
order differential form – the scheme will work anyway. The importance of
the operator S2 is in the fact that it determines the linear term of the field
equations and, therefore, governs the iteration procedures. Commute ϕ̂i with
the field equations. The result is a linear homogeneous equation for the
commutator $[\hat\varphi^i, \hat\varphi^j]$. Consider the respective inhomogeneous equation and its
two iterative solutions: one with the advanced inverse for $S_2$ and the other
with the retarded one. The equation for the commutator is solved by their difference:

$$ [\hat\varphi^i, \hat\varphi^j] = i \bigl( G^{+ij}(c) - G^{-ij}(c) \bigr) + O(\hat\varphi - c) . \tag{9} $$

In this way the algebra of ϕ̂’s is built as an operator expansion. This is the
quantization postulate.
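
As a concrete check of the leading term of (9), here is a short worked example that is not part of the lectures. It assumes a single harmonic oscillator of unit mass and frequency $\omega$, in units with $\hbar = 1$; the base manifold is then just the time axis, and the symbols $q$, $p$, $\omega$ are introduced only for this illustration.

$$ S(q) = \int dt \left( \tfrac{1}{2} \dot q^2 - \tfrac{1}{2} \omega^2 q^2 \right) , \qquad S_2(t, t') = -\left( \frac{d^2}{dt^2} + \omega^2 \right) \delta(t - t') . $$

The condition $S_2 G^\pm = -1$ of (8) reads $(\partial_t^2 + \omega^2)\, G^\pm(t, t') = \delta(t - t')$, with the retarded and advanced solutions

$$ G^-(t, t') = \theta(t - t')\, \frac{\sin\omega(t - t')}{\omega} , \qquad G^+(t, t') = \theta(t' - t)\, \frac{\sin\omega(t' - t)}{\omega} = G^-(t', t) , $$

so that $G^+(t,t') - G^-(t,t') = -\sin\omega(t - t')/\omega$ for all $t$, $t'$. On the other hand, the Heisenberg operator $\hat q(t) = \hat q \cos\omega t + \hat p\, \omega^{-1} \sin\omega t$ with $[\hat q, \hat p] = i$ gives

$$ [\hat q(t), \hat q(t')] = -\,i\, \frac{\sin\omega(t - t')}{\omega} = i \bigl( G^+(t, t') - G^-(t, t') \bigr) . $$

For this linear theory the commutator is a c-number, so the correction terms $O(\hat\varphi - c)$ in (9) are absent.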
By the setting of its Cauchy problem, the operator S2 introduces the con-
cept of causality. If S2 is a second-order hyperbolic operator, this is the usual
relativistic causality. But in any case the base manifold will be foliated with
the Cauchy surfaces of the operator S2 . They will be denoted as Σ.
A function of $\hat\varphi$ that involves $\hat\varphi$ on only one Cauchy surface,

$$ Q(\hat\varphi) = Q\bigl( \hat\varphi \big|_\Sigma \bigr) , \tag{10} $$

will be called a local observable. A state defined as an eigenstate of local
observables,

$$ Q\bigl( \hat\varphi \big|_\Sigma \bigr)\, |\;\rangle = q\, |\;\rangle , \tag{11} $$

will be called a local state. This latter name may be confusing because the state
is, of course, a global concept, and I am using the Heisenberg picture. But the
local state is associated with a given $\Sigma$:

$$ |\;\rangle = |\Sigma, q\rangle . \tag{12} $$

Of course, for it to be defined, one needs a complete set of commuting local
observables. I call the $Q$'s observables, but they may not even be Hermitian.
And I shall consider them linear in $\hat\varphi$. If they are nonlinear, I shall make a
local reparametrization of the field variables so as to make them linear.
In fact, if one has a complete set of commuting local observables, one has
already built a Hilbert space. A linear combination

$$ |\Sigma\rangle = \int dq\, \Psi(q)\, |\Sigma, q\rangle \tag{13} $$

is also a local state associated with $\Sigma$ provided that the function $\Psi(q)$ is
external, i.e., independent of the quantum field $\hat\varphi^i$.
Our goal is to learn how to calculate expectation values of field observables
in a local state, and I shall concentrate on the expectation value

$$ \langle\Sigma|\hat\varphi^i|\Sigma\rangle . \tag{14} $$

However, we shall save effort if we consider another problem first. Namely,
let us recall what we would do in the case of two local states associated with
different Cauchy surfaces:

$$ |\Sigma_1, q_1\rangle = |1\rangle , \qquad |\Sigma_2, q_2\rangle = |2\rangle , \qquad \Sigma_2 > \Sigma_1 . \tag{15} $$

Here and below, "greater" is a notation for "later".

2.2 The Quantum Boundary-Value Problem

In the problem where two local states (15) are given, the field's expectation
value is replaced with the scalar product

$$ \frac{\langle 2|\hat\varphi|1\rangle}{\langle 2|1\rangle} \overset{\rm def}{=} \langle\varphi\rangle , \tag{16} $$

which I shall call the mean field although it is not a mean in any state.
If our goal was the scalar product (16), we would use the Schwinger
principle

$$ \delta\langle 2|1\rangle = i\, \langle 2|\delta S_{\rm tot}|1\rangle \quad \text{or zero} , \tag{17} $$

whose meaning is this. Consider a variation in the Taylor coefficients of the
field equations, i.e., in the functional form of the total action. The solution
for $\hat\varphi^i$ will respond and will induce a change in the functions $Q(\hat\varphi)$ which will
induce a change in their eigenstates, and finally there will be a change in the
amplitude $\langle 2|1\rangle$ induced by a change in the action. The Taylor coefficients
are local. They can be varied in the region between $\Sigma_1$ and $\Sigma_2$ or outside
this region. The Schwinger principle (17) says that, if they are varied outside,
the variation of the amplitude is zero. Otherwise, this variation is expressed
through the variation of the action by (17).
The Schwinger principle is a consequence of the commutation relations,
but it can also be taken for the first principle because one does not need
anything else. For many purposes (but not all) it suffices to use a specific case
of (17): a freedom of varying the source $J$. The result of this use is

$$ \frac{\delta}{\delta iJ_{j_1}} \cdots \frac{\delta}{\delta iJ_{j_n}} \langle 2|1\rangle =
\begin{cases}
\langle 2|\overleftarrow{T}\, \hat\varphi^{j_1} \cdots \hat\varphi^{j_n}|1\rangle , & \text{if } \Sigma_2 > j_1, \ldots, j_n > \Sigma_1 , \\
0 , & \text{otherwise} .
\end{cases} \tag{18} $$

Here $\overleftarrow{T}$ orders the operators $\hat\varphi^k$, $k \in \Sigma_k$, chronologically, i.e., places them in
the order of following of their $\Sigma_k$, and the arrow over $T$ points the direction
of growth of the time $\Sigma$.
Let us come back to the operator field equations. Since all $\hat\varphi$'s in these
equations are at the same point, one can formally insert in (2) the sign of
chronological ordering:

$$ \overleftarrow{T}\, S_i(\hat\varphi) + J_i = 0 . \tag{19} $$

One may worry about additional terms in (19) stemming from the distinction
between the chronological and ordinary operator products, and the noncommutativity
of $\overleftarrow{T}$ with the derivatives in the Taylor coefficients of the equations.
Because the operators in the products are at the same point, these terms are
ambiguous expressions whose handling depends on the formalisms and procedures
used. There is always a happy end: these terms cancel and help to
cancel similar terms appearing in the subsequent calculations. Therefore, it
makes sense to use such formalisms and procedures that these terms do not
appear at all. This is the approach that I shall follow.
Sandwiching (19) between the states $\langle 2|$ and $|1\rangle$, and using (18), one
obtains the following equation for the amplitude:

$$ \left[ S_i\!\left( \frac{\delta}{\delta iJ} \right) + J_i \right] \langle 2|1\rangle = 0 . \tag{20} $$

Multiply it from the left with $\langle 2|1\rangle^{-1}$ and pull the factors $\langle 2|1\rangle$ in the argument
of $S_i$ using the fact that this is a unitary transformation:

$$ \left[ S_i\!\left( \langle 2|1\rangle^{-1}\, \frac{\delta}{\delta iJ}\, \langle 2|1\rangle \right) + J_i \right] 1 = 0 . \tag{21} $$

In the argument, commute the operators:

$$ \left[ S_i\!\left( \frac{\delta \ln\langle 2|1\rangle}{\delta iJ} + \frac{\delta}{\delta iJ} \right) + J_i \right] 1 = 0 \tag{22} $$

and use that by (18)

$$ \frac{\delta \ln\langle 2|1\rangle}{\delta iJ_k} = \langle\varphi^k\rangle . \tag{23} $$

The result is the following equation for the mean field:

$$ \left[ S_i\!\left( \langle\varphi\rangle + \frac{\delta}{\delta iJ} \right) + J_i \right] 1 = 0 . \tag{24} $$

Equation (24) differs from the classical field equation by the operator addition
$\delta/\delta iJ$ to $\langle\varphi\rangle$. When this operator addition acts on 1, its effect is zero, but
it will act also on $\langle\varphi\rangle$ because the summands $\langle\varphi\rangle$ and $\delta/\delta iJ$ do not commute.
Where in (24) is the Planck constant? It is easy to see by dimension that $\hbar$ is
just in front of $\delta/\delta iJ$. Therefore, if one wants to expand the equations in $\hbar$,
one should expand them in $\delta/\delta iJ$.
The problem boils down to expanding a function $f(A + B)$ in $B$ when $A$
and $B$ do not commute. It suffices to expand the exponential function since
one can write

$$ f(A + B) = f\!\left( \frac{d}{dx} \right) e^{(A+B)x} \,\Big|_{x=0} , \tag{25} $$

or, equivalently,

$$ f(A + B) = e^{(A+B)\, d/dx} f(x) \,\Big|_{x=0} . \tag{26} $$

For the exponential function one has the identity

$$ e^{(A+B)x} = e^{Ax} \left( 1 + \int_0^x dy\; e^{-Ay} B\, e^{(A+B)y} \right) , \tag{27} $$

which makes the expansion possible. This all works well if the series of commutators

$$ e^{-A} B e^{A} = B + [B, A] + \frac{1}{2!} [[B, A], A] + \frac{1}{3!} [[[B, A], A], A] + \cdots \tag{28} $$

terminates somewhere as in our case. Indeed, if $\langle\varphi\rangle = A$ and $\delta/\delta iJ = B$, then

$$ [[B, A], A] = 0 . \tag{29} $$

Under condition (29) one obtains for an arbitrary function:

$$ f(A + B) = f(A) + f'(A) B + \frac{1}{2} f''(A) [B, A] + O(B^2) . \tag{30} $$

As compared to the ordinary Taylor expansion, there are several additional
terms with commutators at each order.
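
The expansion (30) can be checked numerically in a small matrix model. The sketch below is an illustration only, not the author's construction: it assumes $f = \exp$ and takes $A = \alpha\,1 + E_{12}$, $B = \epsilon E_{23}$ as 3×3 matrices (a hypothetical choice for which $[[B, A], A] = 0$ and $B^2 = 0$, so the $O(B^2)$ remainder vanishes). The helper names are made up for this example.

```python
# Toy check of f(A+B) = f(A) + f'(A) B + (1/2) f''(A) [B, A] + O(B^2) for f = exp.
# The matrices are illustrative choices satisfying [[B, A], A] = 0.
N = 3

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def mat_add(X, Y, sx=1.0, sy=1.0):
    """Return sx*X + sy*Y."""
    return [[sx * X[i][j] + sy * Y[i][j] for j in range(N)] for i in range(N)]

def commutator(X, Y):
    return mat_add(mat_mul(X, Y), mat_mul(Y, X), 1.0, -1.0)

def mat_exp(X, terms=40):
    """Matrix exponential by its power series (adequate for small matrices)."""
    result = [[float(i == j) for j in range(N)] for i in range(N)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mat_mul(term, X)]
        result = mat_add(result, term)
    return result

def max_abs(X):
    return max(abs(v) for row in X for v in row)

alpha, eps = 0.7, 1e-3
A = [[alpha, 1.0, 0.0], [0.0, alpha, 0.0], [0.0, 0.0, alpha]]  # alpha*1 + E12
B = [[0.0, 0.0, 0.0], [0.0, 0.0, eps], [0.0, 0.0, 0.0]]        # eps * E23

BA = commutator(B, A)               # [B, A]
double_comm = commutator(BA, A)     # [[B, A], A], vanishes for this choice

expA = mat_exp(A)                   # f(A) = f'(A) = f''(A) when f = exp
lhs = mat_exp(mat_add(A, B))
naive = mat_add(expA, mat_mul(expA, B))            # ordinary Taylor terms only
rhs = mat_add(naive, mat_mul(expA, BA), 1.0, 0.5)  # plus (1/2) f''(A) [B, A]

err_full = max_abs(mat_add(lhs, rhs, 1.0, -1.0))
err_naive = max_abs(mat_add(lhs, naive, 1.0, -1.0))
print(err_full, err_naive)
```

Without the commutator term the mismatch is of first order in `eps`; with it, the two sides agree to rounding error in this model.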
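A use of the result above in (24) gives

$$ S_i(\langle\varphi\rangle) + \frac{1}{2} S_{ijk}(\langle\varphi\rangle)\, \frac{\delta\langle\varphi^j\rangle}{\delta iJ_k} + O(\hbar^2) = -J_i , \tag{31} $$

$$ S_{ij}(\langle\varphi\rangle)\, \frac{\delta\langle\varphi^j\rangle}{\delta J_k} = -\delta_i^k + O(\hbar) . \tag{32} $$

Here the second equation is obtained by differentiating the first one, and it tells
us what $\delta\langle\varphi\rangle/\delta J$ is. Up to $O(\hbar)$, it is some Green's function of the operator
$S_2$. Denote this Green's function as

$$ \frac{\delta\langle\varphi^j\rangle}{\delta J_k} = G^{jk} + O(\hbar) . \tag{33} $$

One can work to any order, but I shall stop here. We obtain closed equations
for the mean field:

$$ S_i(\langle\varphi\rangle) + \frac{1}{2i} S_{ijk}(\langle\varphi\rangle)\, G^{jk}(\langle\varphi\rangle) + O(\hbar^2) = -J_i , \tag{34} $$

$$ S_{ij}(\langle\varphi\rangle)\, G^{jk}(\langle\varphi\rangle) = -\delta_i^k . \tag{35} $$

The second term in (34) is the loop

$$ S_i(\langle\varphi\rangle) + \bigl[ \text{diagram: loop with external line } i \bigr] + O(\hbar^2) = -J_i , \tag{36} $$

all elements of the loop being functions of $\langle\varphi\rangle$. But two questions remain to
be answered:
(i) Which Green's function is $G$?
(ii) What are the boundary conditions to the mean-field equations?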
The answers are again in the Schwinger principle. Equation (18) tells us
what $G$ and $\langle\varphi\rangle$ are:

$$ \frac{1}{i} G^{jk} = \frac{\langle 2|\overleftarrow{T}\, \hat\varphi^j \hat\varphi^k|1\rangle}{\langle 2|1\rangle} - \langle\varphi^j\rangle \langle\varphi^k\rangle + O(\hbar) , \tag{37} $$

$$ \langle\varphi^j\rangle = \frac{\langle 2|\hat\varphi^j|1\rangle}{\langle 2|1\rangle} . \tag{38} $$

Multiply these expressions by the coefficients that make the linear $Q$ out of $\varphi$:

$$ Q(\hat\varphi) = k_j \hat\varphi^j , \tag{39} $$

and send $j$ either to $\Sigma_1$ or to $\Sigma_2$. By the definition of the states $|1\rangle$ and $|2\rangle$,
one obtains

$$ Q\bigl( \langle\varphi\rangle \big|_{\Sigma_1} \bigr) = q_1 , \qquad Q\bigl( \langle\varphi\rangle \big|_{\Sigma_2} \bigr) = q_2 , \tag{40} $$

$$ k_j G^{jk} \,\Big|_{j \in \Sigma_1} = 0 , \qquad k_j G^{jk} \,\Big|_{j \in \Sigma_2} = 0 . \tag{41} $$

From (37) it follows also that

$$ G^{jk} = G^{kj} . \tag{42} $$

The Green's function $G$ is symmetric and completely determined by the
boundary conditions (41). This completes the determination of the mean-field
equations (34), and for these equations one arrives at a boundary-value
problem with the boundary conditions (40). As a result, the quantum
boundary-value problem is reduced to a c-number boundary-value problem.
I say "c-number" rather than "classical" because there are differences, and
one is the presence of terms $O(\hbar)$ in the equations, but, as far as the setting
of the problem is concerned, there is no difference. One arrives at the same
boundary-value problem for the observable field as in the case of the classical
states.
Note that the Green's function $G$ and, thereby, the mean-field equations do
not depend on the eigenvalues $q$. The eigenvalues appear only in the boundary
conditions to the equations. However, $G$ depends on the choice of the observables
$Q$ themselves and, through them, on the choice of the states $|1\rangle$ and $|2\rangle$.
Therefore, the mean-field equations are state-dependent.
Although the Green's function $G$ depends on the choice of the states,
it possesses two universal properties. One has already been mentioned: $G$
is always symmetric. The other one is this. Let us make a variation in the
operator $S_2$ and find out how $G$ responds:

$$ S_2 G = -1 , \qquad S_2\, \delta G = -\delta S_2\, G , \qquad \delta G = \, ? $$

To answer this question, one can use the Schwinger principle again. The result
is the following variational law:

$$ \delta G = G\, \delta S_2\, G , \tag{43} $$

and this law is universal. It is the same for all boundary-value problems.
The variational law (43) is remarkable. It is characteristic of finite-dimensional
matrices. If a matrix has a unique inverse, then the inverse obeys
this law. This law is valid, for example, for the inverse of an elliptic operator,
i.e., for the Euclidean Green's function. It is valid also for the advanced and
retarded Green's functions:

$$ \delta G^+ = G^+\, \delta S_2\, G^+ , \qquad \delta G^- = G^-\, \delta S_2\, G^- . \tag{44} $$

But it is not valid generally, and, in the case of $S_2$, it is exceptional.
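
The statement that (43) is characteristic of finite-dimensional matrices can be checked directly. The following is a minimal sketch with made-up 2×2 numbers (none of them come from the text): it builds the unique inverse defined by $S_2 G = -1$, varies $S_2$ slightly, and compares the actual change of $G$ with $G\, \delta S_2\, G$.

```python
# Finite-dimensional check of dG = G dS2 G for the inverse defined by S2 G = -1.
# The 2x2 matrices are arbitrary illustrative numbers.

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def green(S):
    """Return G with S G = -1 for an invertible 2x2 matrix S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[-d / det, b / det], [c / det, -a / det]]

S2 = [[-2.0, 0.5], [0.5, -1.0]]          # symmetric, invertible
dS2 = [[1e-6, -2e-6], [-2e-6, 3e-6]]     # small symmetric variation

G = green(S2)
G_varied = green([[S2[i][j] + dS2[i][j] for j in range(2)] for i in range(2)])

dG_exact = [[G_varied[i][j] - G[i][j] for j in range(2)] for i in range(2)]
dG_law = mul(mul(G, dS2), G)             # right-hand side of (43)

max_dev = max(abs(dG_exact[i][j] - dG_law[i][j])
              for i in range(2) for j in range(2))
print(max_dev)   # of second order in dS2, far below the size of dG itself
```

The residual is quadratic in the variation, as expected for a first-order law.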

The variational law for $G$ has an important implication. Namely, let us
differentiate the left-hand side of the mean-field equations

$$ \Gamma_i(\varphi) \equiv S_i(\varphi) + \frac{1}{2i} S_{imn}(\varphi)\, G^{mn}(\varphi) + O(\hbar^2) \tag{45} $$

to see if the result is symmetric. One obtains

$$ \frac{\delta\Gamma_i(\varphi)}{\delta\varphi^j} - \frac{\delta\Gamma_j(\varphi)}{\delta\varphi^i}
= \frac{1}{2i} \Bigl( S_{imn}\, G^{m\bar m} G^{n\bar n}\, S_{\bar m \bar n j} - (i \leftrightarrow j) \Bigr) + O(\hbar^2)
= 0 + O(\hbar^2) . \tag{46} $$

This means that $\Gamma_i(\varphi)$ is a gradient, i.e., there exists an action generating the
mean-field equations:

$$ \Gamma_i(\varphi) = \frac{\delta\Gamma(\varphi)}{\delta\varphi^i} . \tag{47} $$

There is another way to arrive at the same conclusion. Consider a function of
the mean field defined by the Legendre transformation

$$ \Gamma(\langle\varphi\rangle) = \frac{1}{i} \ln\langle 2|1\rangle - \langle\varphi^k\rangle J_k \tag{48} $$

where $J$ is to be expressed through $\langle\varphi\rangle$ by solving equation (23). It is easy to
see that this function satisfies the equation

$$ \frac{\delta\Gamma(\langle\varphi\rangle)}{\delta\langle\varphi^i\rangle} = -J_i , \tag{49} $$

and, therefore, its gradient is the left-hand side of the mean-field equations.
$\Gamma(\varphi)$ is the effective action. Up to $\hbar^2$ it is of the form

$$ \Gamma(\varphi) = S(\varphi) + \frac{1}{2i} \ln\det G(\varphi) + O(\hbar^2) \tag{50} $$

where the second term is the loop without external lines:

$$ \Gamma(\varphi) = S(\varphi) + \bigl[ \text{diagram: loop without external lines} \bigr] + O(\hbar^2) . \tag{51} $$

The effective action exists for any boundary-value problem, but these actions
are different for different such problems. Only in the classical approximation,
the action and the equations are independent of the boundary conditions.
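
That the gradient of the one-loop term in (50) reproduces the loop term in (45) follows from the variational law (43), and it can be illustrated in a zero-dimensional toy model. The sketch below assumes a single real variable and a hypothetical action $S(\varphi) = -\varphi^2/2 - \varphi^4/24$ (chosen so that $G > 0$); the common factor $1/i$ is dropped on both sides, and all function names are invented for this example.

```python
# Zero-dimensional toy: d/dphi [ (1/2) ln det G ] should equal (1/2) S_3 G,
# where S_2 G = -1. The action S = -phi^2/2 - phi^4/24 is a made-up example.
import math

def S2(phi):
    """Second derivative of the toy action."""
    return -1.0 - 0.5 * phi * phi

def S3(phi):
    """Third derivative of the toy action."""
    return -phi

def G(phi):
    """Green's function of the toy model: S2 * G = -1."""
    return -1.0 / S2(phi)

def loop_action(phi):
    """(1/2) ln det G for one variable; the overall factor 1/i is dropped."""
    return 0.5 * math.log(G(phi))

phi0, h = 0.8, 1e-5
grad_numeric = (loop_action(phi0 + h) - loop_action(phi0 - h)) / (2 * h)
grad_loop = 0.5 * S3(phi0) * G(phi0)    # one-component version of (1/2) S_imn G^mn
print(grad_numeric, grad_loop)
```

Both numbers agree to the accuracy of the finite difference, which is the one-variable content of the step from (50) to (45).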
Let us go over to expectation values.
2.3 The Quantum Initial-Value Problem

In this problem, only one local state is given (which I shall assume normalized).
Since the field operators are now sandwiched between the states associated
with one and the same $\Sigma$:

$$ \langle 1|(\cdots)|1\rangle , \qquad \langle 1|1\rangle = 1 , \tag{52} $$

one cannot apply the Schwinger principle: there is no room for varying the
source. One can create this room artificially by inserting a complete set of
states associated with some later $\Sigma$:

$$ \langle 1|1\rangle = \sum_q \langle 1|2q\rangle \langle 2q|1\rangle , \qquad \Sigma_2 > \Sigma_1 , \tag{53} $$

but this alone will not help because the source is varied in both amplitudes,
and these variations cancel. It will help only if the two amplitudes in (53) are
functions of different sources, i.e., if, instead of (53), one introduces a function
of two independent sources, $J$ and $J^*$:

$$ Z(J^*, J) = \sum_q \langle 1|2q\rangle_{J^*}\, \langle 2q|1\rangle_J . \tag{54} $$

This amounts to considering two copies of the quantum field: one with the
source $J$ and the other one with the source $J^*$, and using in (54) the amplitudes
of both. Then one can vary only one source and, after that, make the
sources coincident. Using the Schwinger principle, one obtains

$$ \frac{\delta^n Z(J^*, J)}{\delta iJ_{j_1} \cdots \delta iJ_{j_n}} \bigg|_{J^* = J} = \langle 1|\overleftarrow{T}\, \hat\varphi^{j_1} \cdots \hat\varphi^{j_n}|1\rangle . \tag{55} $$

In this way the expectation values can be calculated.
The technique of two sources is called the time-loop formalism because in
expression (54) one goes forward in time, from $\Sigma_1$ to some $\Sigma_2$, and then back
from $\Sigma_2$ to $\Sigma_1$ but with another copy of the quantum field.
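
The role of the two independent sources can be seen in the smallest possible model. The sketch below is a toy and not the field-theoretic construction: it assumes a two-level system in which the "field" is the Pauli matrix $X = \sigma_x$, the evolution from $\Sigma_1$ to the moment of observation is a fixed rotation $V$ by an angle `theta`, and the source acts as a single kick $e^{iJX}$. The function names `amplitudes` and `Z` are hypothetical.

```python
# Two-source (time-loop) generating function for a toy two-level system.
# Amplitude with source J:  <q| exp(iJX) V |1>,  X = sigma_x,  |1> = (1, 0).
import math

def amplitudes(J, theta):
    """Return (<0|exp(iJX)V|1>, <1|exp(iJX)V|1>) for a rotation V by theta."""
    psi = (math.cos(theta), math.sin(theta))      # V|1>
    c, s = math.cos(J), 1j * math.sin(J)          # exp(iJX) = cos J + i X sin J
    return (c * psi[0] + s * psi[1], s * psi[0] + c * psi[1])

def Z(J_star, J, theta):
    """Sum over the complete set q of conj(amplitude at J*) * amplitude at J."""
    bra = amplitudes(J_star, theta)
    ket = amplitudes(J, theta)
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

theta, J0, h = 0.3, 0.4, 1e-6

# Vary only J with J* held fixed, then set the sources equal: dZ/d(iJ).
dZ = (Z(J0, J0 + h, theta) - Z(J0, J0 - h, theta)) / (2 * h)
expectation = (dZ / 1j).real

# Vary both sources together: the two variations cancel.
dZ_both = (Z(J0 + h, J0 + h, theta) - Z(J0 - h, J0 - h, theta)) / (2 * h)

print(expectation, math.sin(2 * theta), abs(dZ_both))
```

In this model the expectation value of $X$ in the evolved state is $\sin 2\theta$, and it is recovered only when one source is varied at a time; with coincident sources $Z = 1$ identically.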
For every partial amplitude in (54) we have (20):

$$ \left[ S_i\!\left( \frac{\delta}{\delta iJ} \right) + J_i \right] \langle 2q|1\rangle_J = 0 . \tag{56} $$

Since the other amplitude in (54) does not depend on $J$, we can linearly
combine (56) to obtain

$$ \left[ S_i\!\left( \frac{\delta}{\delta iJ} \right) + J_i \right] Z(J^*, J) = 0 . \tag{57} $$

Only one source is active in this differential equation. The other one is
a parameter. Therefore, we can just repeat the consideration above with
$Z(J^*, J)$ in place of $\langle 2|1\rangle$, and in this way derive the mean-field equations.
We obtain the loop expansion of exactly the same form as before:

$$ S_i(\langle\varphi\rangle) + \frac{1}{2i} S_{ijk}(\langle\varphi\rangle)\, G^{jk}(\langle\varphi\rangle) + O(\hbar^2) = -J_i , \tag{58} $$

$$ S_{ij}(\langle\varphi\rangle)\, G^{jk}(\langle\varphi\rangle) = -\delta_i^k , \tag{59} $$

and in these loops we must make the sources coincident. There are only two
elements in all loops, $\langle\varphi\rangle$ and $G$. Upon setting $J^* = J$, $\langle\varphi\rangle$ becomes the
genuine expectation value

$$ \langle\varphi^k\rangle = \frac{\delta \ln Z(J^*, J)}{\delta iJ_k} \bigg|_{J^* = J} = \langle 1|\hat\varphi^k|1\rangle , \tag{60} $$

and the matrix $G$ is given by the expression

$$ \frac{1}{i} G^{jk} + O(\hbar) = \frac{\delta^2 \ln Z(J^*, J)}{\delta iJ_j\, \delta iJ_k} \bigg|_{J^* = J} = \langle 1|\overleftarrow{T}\, \hat\varphi^j \hat\varphi^k|1\rangle - \langle\varphi^j\rangle \langle\varphi^k\rangle . \tag{61} $$

I am using for it the same letter $G$, but it is now a different Green's function
of the operator $S_2$. Equations (58) with this Green's function in all loops are
the expectation-value equations.
The solution of the expectation-value equations is specified completely by
the initial conditions on Σ1 following from (60), but it is not easy to write
these conditions down in the general terms. Only half of them is obvious: the
Q’s on Σ1 are given. To obtain the other half, one would need to find the
variables canonically conjugate to Q’s and calculate their expectation values
on Σ1 .1 The same concerns the specification of the Green’s function G. This
issue will be considered in the next lecture where a different approach to it
will be used.
Let us consider the state-independent properties of G. First, as seen from
(61), G is symmetric for any initial-value problem:

Gjk = Gkj . (62)

Second, one can apply the Schwinger principle to derive the variational law
for G. At this point, the initial-value problem diﬀers signiﬁcantly from the
boundary-value problem. When the operator S2 is varied in the generating
function (54), one can no longer play with only one source because S2 is the
1
Let Q’s be Hermitian, and let P ’s have c-number commutators with Q’s:
[P, Q] = i. Then the expectation values in the state (13) satisfy the initial condi-
tions , ,
, , ∂
Q, = dq Ψ (q)qΨ (q) , P , = i dq Ψ (q) Ψ (q)
Σ Σ ∂q
where the overline means complex conjugation. If both Q(ϕ̂) and P (ϕ̂) are linear,
these are initial conditions directly for ϕ.
740 G. A. Vilkovisky

same for both copies of the quantum field, and, therefore, both amplitudes
in (54) respond. As a consequence, all four matrices of second derivatives are
generally involved:
δ 2 ln Z δ 2 ln Z δ 2 ln Z δ 2 ln Z
, , , , (63)
δiJj δiJk δiJj∗ δiJk∗ δiJj∗ δiJk δiJj δiJk∗
i.e., the Green’s function Gjk , its complex conjugate, and two Wightman func-
tions:
1|ϕ̂j ϕ̂k |1 and its transpose. The Wightman functions can be expressed
through Gjk and the advanced or retarded Green’s function:
i
1|ϕ̂j ϕ̂k |1 − i
ϕj
ϕk = Gjk − G+jk + O() = Gkj − G−kj + O() . (64)
The result of the calculation is the following variational law for G:
δG = G− δS2 G + GδS2 G+ − G− δS2 G+ . (65)
It is no more the simple law (43), but it is, nevertheless, universal because
G+ and G− are state-independent. The variational law (65) is valid for any
initial-value problem.
The left-hand side of the expectation-value equations has the form (45) as
before but, since the variational law for G is different, the former inference
about the symmetry of δΓi /δϕj needs to be revised. This inference is no longer
valid. The advanced and retarded Green’s functions arrange it so that
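$$\frac{\delta\Gamma_i(\varphi)}{\delta\varphi^j} = 0 \quad\text{when}\quad i < j \tag{66}$$
and
$$\frac{\delta\Gamma_i(\varphi)}{\delta\varphi^j} \neq 0 \quad\text{when}\quad i > j . \tag{67}$$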
It follows that there is no action generating the expectation-value equations.
The nonexistence of an action for the initial-value problem is seen also
from the consideration of the Legendre transform of the generating function
(54). It is now a function of two fields:
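$$\Gamma(\varphi^*,\varphi) = \frac{1}{i}\ln Z(J^*,J) - \varphi J + \varphi^* J^* \tag{68}$$
where
$$\varphi = \frac{\delta\ln Z(J^*,J)}{\delta iJ} ,\qquad \varphi^* = -\frac{\delta\ln Z(J^*,J)}{\delta iJ^*} . \tag{69}$$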
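The expectation-value equations are obtained as
$$\varphi = \langle 1|\hat\varphi|1\rangle : \qquad \left.\frac{\delta\Gamma(\varphi^*,\varphi)}{\delta\varphi^i}\right|_{\varphi^*=\varphi} = -J_i , \tag{70}$$
and, therefore,
$$\Gamma_i(\varphi) = \left.\frac{\delta\Gamma(\varphi^*,\varphi)}{\delta\varphi^i}\right|_{\varphi^*=\varphi} . \tag{71}$$
This is not a gradient.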
Expectation Values and Vacuum Currents 741

3 The In-Vacuum State and Schwinger–Keldysh Diagrams

3.1 Speciﬁcation of the State

In order to proceed, I need to specify the state. This will be done in several
steps.
Step 1. It will be assumed that $S_2$ is a second-order hyperbolic operator, and the energy–momentum tensor of the field of small disturbances $\delta\varphi^i$ with the action
$$\frac{1}{2}\,S_{ij}\,\delta\varphi^i\,\delta\varphi^j \tag{72}$$
satisfies the dominant energy condition.
Step 2. The initial-value surface will be shifted to the remote past:

Σ1 → −∞ . (73)

Consider the operator field equations (2) and (3):
$$J_i + S_i(c) + S_{ij}(c)(\hat\varphi - c)^j + \sum_{n=2}^{\infty}\frac{1}{n!}\,S_{ij_1\cdots j_n}(c)(\hat\varphi - c)^{j_1}\ldots(\hat\varphi - c)^{j_n} = 0 . \tag{74}$$

If $c^i$ is some classical solution:
$$S_i(c) = -J_i , \tag{75}$$
and $\hat\phi^i$ is an operator solution of $S_2$ against the background $c^i$:
$$S_{ij}(c)\hat\phi^j = 0 , \tag{76}$$
then the field
$$\hat\varphi^i = c^i + \hat\phi^i , \qquad i\in\Sigma\to-\infty \tag{77}$$
solves the operator dynamical equations asymptotically in the remote past.
It is a property of S2 that its solution with smooth data having a compact
support or decreasing at the spatial infinity decreases also in the time-like
directions. Then, as i ∈ Σ → −∞, the nonlinear terms in (74) decrease even
faster and are negligible. Thus, to build a Hilbert space of states, it suffices
to build a representation of the algebra of φ̂’s.
Step 3. A Fock space will be built associated with the linear field $\hat\phi^i$. This amounts to expanding $\hat\phi^i$ in some basis of solutions of $S_2(c)$:
$$S_2(c)\chi_A = 0 , \tag{78}$$
$$\hat\phi^i = \chi^i_A\,\hat a^A_{\rm in} + \bar\chi^i_A\,\hat a^{+A}_{\rm in} \tag{79}$$
where the overline means complex conjugation, and the basis functions $\chi^i_A$ are normalized with the aid of the inner product:
$$(\chi_A,\chi_B) = 0 ,\qquad (\bar\chi_A,\chi_B) = \delta_{AB} , \tag{80}$$
$$(\phi_1,\phi_2) \equiv -i\int_\Sigma \phi_1 W_\mu\phi_2\,d\Sigma^\mu . \tag{81}$$
Here $W_\mu$ is the Wronskian of $S_2$. In this way, the concept is introduced of some particles detectable in the past. What kind of particles these are, i.e., what kind of detectors detect them, depends on the choice of the basis of solutions, but, in any case, the following functions will be chosen for the local observables Q:
$$Q^A(\hat\varphi|_\Sigma) = -i\delta^{AB}\int_\Sigma \bar\chi_B W_\mu(\hat\varphi - c)\,d\Sigma^\mu , \qquad \Sigma\to-\infty . \tag{82}$$
One needs these observables only on the initial-value surface, and, there, they coincide with the annihilation operators of the introduced particles:
$$Q^A(\hat\varphi|_\Sigma)\Big|_{\Sigma\to-\infty} = \hat a^A_{\rm in} . \tag{83}$$
The choice of the quantum state will be made in favour of the zero-eigenvalue eigenstate of these observables:
$$\hat a^A_{\rm in}|1\rangle = 0 . \tag{84}$$
This is the vacuum of the introduced particles.
It follows from (77) and (79) that the field's expectation value in the state (84), when taken in the remote past, coincides with the classical solution $c^i$:
$$\langle 1|\hat\varphi^i|1\rangle = c^i , \qquad i\in\Sigma\to-\infty . \tag{85}$$
The ad hoc classical solution $c^i$ can then be eliminated completely both from the asymptotic form of the quantum field
$$\hat\varphi^i = \langle\varphi^i\rangle + \hat\phi^i , \qquad i\in\Sigma\to-\infty \tag{86}$$
and from the equation defining the Fock modes
$$S_{ij}(\langle\varphi\rangle)\hat\phi^j = 0 , \qquad i\in\Sigma\to-\infty . \tag{87}$$
Only the mean field itself figures as a background.
The specification of the state is, however, not completed, because the mean field in the past remains an arbitrary classical solution:
$$S_i(\langle\varphi\rangle) = -J_i , \qquad i\in\Sigma\to-\infty \tag{88}$$
and the state itself remains the vacuum of undefined particles. To make the final determination, one more step is needed.

Step 4. The final choice of the state assumes one more limitation on the
original action. Namely, it will be assumed that the external source Ji and
all the external fields that may be present in the action S are asymptotically
static in the past. This means that, asymptotically in the past, there exists a
vector field ξ μ such that it is nowhere tangent to any of the Cauchy surfaces,
and the Lie derivative in the direction of ξ μ of all external fields is zero.
Specifically,
$$\mathcal{L}_\xi J_i = 0 , \qquad i\in\Sigma\to-\infty . \tag{89}$$
If this limitation is fulfilled, then, among the solutions of (88) for the mean field in the past, there is the static one:
$$\mathcal{L}_\xi\langle\varphi^i\rangle = 0 , \qquad i\in\Sigma\to-\infty . \tag{90}$$
Choose it. Next, use the fact that, with this choice, the operator $S_2(\langle\varphi\rangle)$ commutes with the Lie derivative, and choose for the basis solutions of $S_2(\langle\varphi\rangle)$ the functions that, asymptotically in the past, are eigenfunctions of the Lie derivative:
$$i\mathcal{L}_\xi\chi^i_A = \varepsilon_A\chi^i_A , \qquad \varepsilon_A > 0 , \qquad i\in\Sigma\to-\infty . \tag{91}$$
This fixes both the initial conditions for the mean field and the type of particles whose vacuum is the chosen state. These are particles with definite energies.
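As an illustration of the eigenfunction condition (91), consider the simplest case, which is an assumption made here and not in the text: a flat background in which the asymptotic Killing vector is the time translation and the modes are scalars. The spatial profile $u_A(\mathbf{x})$ is a hypothetical label. The eigenfunctions are then the familiar positive-frequency modes:

```latex
\xi^\mu\partial_\mu = \partial_t , \qquad
\chi_A(t,\mathbf{x}) = e^{-i\varepsilon_A t}\,u_A(\mathbf{x})
\quad\Longrightarrow\quad
i\mathcal{L}_\xi\chi_A = i\partial_t\chi_A = \varepsilon_A\chi_A , \qquad \varepsilon_A > 0 .
```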
Since $S_2$ is a second-order hyperbolic operator, it contains some tensor field, $g^{\mu\nu}$, contracting the second derivatives. The inverse matrix, $g_{\mu\nu}$, can serve and does serve in every respect as a metric on the base manifold. The metric enters the original action S either as a part of the quantum field $\hat\varphi^i$ or as an external field. In both cases it is subject to equation (90). When applied to the metric, this is the Killing equation. Thus, we assume the existence, asymptotically in the past, of a time-like Killing vector $\xi^\mu$.
The specification of the quantum initial data is now completed. The notation for the state defined above is
$$|1\rangle = |\text{in vac}\rangle , \tag{92}$$

and its full name is relative standard in-vacuum state. It is “relative” because
it is relative to the background generated by an asymptotically static source.
It is “standard” because it refers to the standard concept of particles. It is
“in” because these particles are incoming. And it is “vacuum” because these
particles are absent.
The state need not be chosen as the zero-eigenvalue eigenstate. Since the expectation-value equations do not depend on the eigenvalues, they will have the same form for any eigenstate of the annihilation operators, i.e., for any coherent state
$$\hat a^A_{\rm in}|\text{in}\ \alpha\rangle = \alpha^A|\text{in}\ \alpha\rangle . \tag{93}$$
Only the initial conditions for the mean field will be different:
$$\langle\alpha\ \text{in}|\hat\varphi^i|\text{in}\ \alpha\rangle = c^i + \chi^i_A\alpha^A + \bar\chi^i_A\bar\alpha^A , \qquad i\in\Sigma\to-\infty . \tag{94}$$
In addition to the static background $c^i$ generated by a source, the mean field in the past now contains an incoming wave of an arbitrary profile. This is the general setting of the classical evolution problem for an observable field like the electromagnetic or gravitational field. The fact that the nature of the state has changed from classical to quantum did not affect this setting.
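The coherent-state statements (93) and (94) can be checked numerically for a single mode. The following is a minimal sketch under assumptions that are mine and not the text's: one oscillator mode with $\chi = 1$ and $c = 0$, a Fock space truncated at `N` levels, and an arbitrary sample value of `alpha`.

```python
import math
import numpy as np

N = 40                      # Fock-space truncation (an assumption of this sketch)
alpha = 0.7 + 0.4j          # arbitrary sample eigenvalue

# Annihilation operator in the number basis: a|n> = sqrt(n)|n-1>
a = np.diag(np.sqrt(np.arange(1, N)), k=1).astype(complex)

# Coherent-state amplitudes c_n = exp(-|alpha|^2/2) alpha^n / sqrt(n!)
n = np.arange(N)
norms = np.array([math.sqrt(math.factorial(int(j))) for j in n])
state = np.exp(-abs(alpha) ** 2 / 2) * alpha ** n / norms

# Eigenvalue equation a|alpha> = alpha|alpha>, up to truncation error
residual = np.linalg.norm(a @ state - alpha * state)

# Mean field of the single-mode operator a + a^+ : expected alpha + conj(alpha)
mean_field = np.vdot(state, (a + a.conj().T) @ state)

print(residual, mean_field)
```

The residual measures how well the truncated vector satisfies the eigenvalue equation, and the mean value of $\hat a + \hat a^{+}$ reproduces the classical combination $\alpha + \bar\alpha$ that appears in (94).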
It will be useful to keep comparing the initial-value problem with the
boundary-value problem. In the latter case, one can define similarly the out-
vacuum state and specify the quantum boundary data as

$$|1\rangle = |\text{in vac}\rangle , \qquad |2\rangle = |\text{out vac}\rangle . \tag{95}$$

3.2 Perturbation Theory

With this specification of the states, let us come back to the mean-field equa-
tions. There remains to be obtained the Green’s function G(ϕ) that figures in
the loops. We need it for an arbitrary background ϕ, but we have a variational
law, (43) or (65), which may be regarded as a differential equation for G(ϕ)
with respect to ϕ. The only thing that is missing and that depends on the
choice of states is the initial condition to this equation. It suffices, therefore,
to know G for only one background.
Then let us do the simplest: perturbation theory around the trivial background. A second-order hyperbolic operator with the trivial background is the D'Alembert operator with flat metric, $\Box_0$:
$$S_2(\varphi) = \Box_0 + P . \tag{96}$$
The remainder is a perturbation P.

In the case of the boundary-value problem, the variational law is (43), and, therefore, the expansion of $G(\varphi)$ is of the form
$$G(\varphi) = G_0 + G_0PG_0 + G_0PG_0PG_0 + \ldots \tag{97}$$
where $G_0$ is G for the trivial background. This expansion is to be inserted in the loop in the mean-field equations:
$$\frac{1}{2i}\,S_{ijk}(\varphi)\,G^{jk}(\varphi) = [\text{one-loop diagram with external line } i] . \tag{98}$$
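The all-plus signs in the expansion (97) can be checked on a finite-dimensional toy model. The sketch below rests on assumptions of mine: the operators are replaced by small symmetric matrices, and the Green's function is normalized as $S_2 G = -1$, a convention that is consistent with (97) but is not stated in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Toy stand-in for the trivial-background operator: an invertible symmetric matrix
A0 = -(4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n)))
A0 = 0.5 * (A0 + A0.T)

# Small symmetric perturbation P, small enough for the series to converge
P = 0.05 * rng.standard_normal((n, n))
P = 0.5 * (P + P.T)

# Assumed convention: S2 G = -1, i.e. G = -(S2)^{-1}
G0 = -np.linalg.inv(A0)
G_exact = -np.linalg.inv(A0 + P)

def neumann_series(G0, P, order):
    """Partial sum G0 + G0 P G0 + ... up to the given power of P."""
    term = G0.copy()
    total = G0.copy()
    for _ in range(order):
        term = term @ P @ G0
        total = total + term
    return total

G_series = neumann_series(G0, P, order=30)
max_err = np.max(np.abs(G_series - G_exact))
print(max_err)
```

With the opposite normalization, $S_2 G = +1$, the same series would alternate in sign, so the check is only as good as the assumed convention.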

Let for simplicity P be a potential. One obtains the loop expanded in powers of P:
$$[\text{loop diagram with external point } x] = \sum_n\int dy_1\ldots dy_n\,F(x|y_1,\ldots y_n)\,P(y_1)\ldots P(y_n) . \tag{99}$$
The coefficients F will be called formfactors. The formfactors are loop diagrams
$$F(x|y) = [\text{loop diagram with vertices } x \text{ and } y] , \tag{100}$$
$$F(x|y_1,y_2) = [\text{loop diagram with vertices } x,\ y_1,\ y_2] , \tag{101}$$
...........................

with the same propagator for all lines: the trivial-background Green's function
$$[\text{line}] = G_0 . \tag{102}$$

What is G0 ? With the trivial background and the standard in- and out-
vacuum states, it is the Feynman Green’s function:

$$G_0 = G_{\rm Feynman} . \tag{103}$$

Let us do the same thing for the initial-value problem. The loop in the expectation-value equations will, in the same way, be expanded in powers of the perturbation, and the expansion will have the same form (99), but the formfactors will be different because the variational law for G is different. It is now (65) rather than (43). Using this law, one obtains for the formfactors three diagrams in place of one:
$$F(x|y) = [\text{diagram}] + [\text{diagram}] - [\text{diagram}] , \tag{104}$$
five diagrams in place of one:
$$F(x|y_1,y_2) = [\text{diagram}] + [\text{diagram}] + [\text{diagram}] - [\text{diagram}] - [\text{diagram}] , \tag{105}$$
and so on. There are two types of propagators in these diagrams: the trivial-background G, and the trivial-background retarded or advanced Green's function. Respectively, there are two types of lines:
$$[\text{plain line}] = G_0 , \qquad [\text{line with arrow}] = G_0^{-}\ \text{or}\ G_0^{+} . \tag{106}$$
In the latter case, the arrow points in the direction of increasing time. And what is now $G_0$? In terms of the linear field (76) it is
$$\frac{1}{i}\,G_0^{jk} = \langle\text{in vac}|\overleftarrow{T}(\hat\phi^j\hat\phi^k)|\text{in vac}\rangle\Big|_{\text{trivial background}} , \tag{107}$$
and differs from the previous case in that the “⟨out vac|” is replaced by the “⟨in vac|”. But, with the trivial background, the vacuum for the linear field is stable. The out-vacuum coincides with the in-vacuum. Therefore,
$$G_0 = G_{\rm Feynman} \quad\text{(again!)} . \tag{108}$$

The diagrams above are called Schwinger–Keldysh diagrams. There is not more than one Feynman propagator in every diagram. The remaining ones are the retarded and advanced Green's functions organized in a special way and with special signs of the diagrams themselves. There is a mystery in this special arrangement. What do these diagrams want to tell us? We must disclose their secret because working with them directly is not recommended.
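The counting of three and five diagrams can be checked by iterating the variational law (65) around the trivial background with $\delta S_2 = P$. The sketch below assumes that $G^{\pm}$ themselves obey the simple law (43) in the form $\delta G^{\pm} = G^{\pm}\,\delta S_2\,G^{\pm}$; the labels $G^{(1)}$ and $G^{(2)}$ for the first- and second-order terms are introduced here only for illustration.

```latex
\begin{align}
G^{(1)} &= G_0^{-}PG_0 + G_0PG_0^{+} - G_0^{-}PG_0^{+} , \\
G^{(2)} &= G_0^{-}PG_0^{-}PG_0 + G_0^{-}PG_0PG_0^{+} + G_0PG_0^{+}PG_0^{+}
          - G_0^{-}PG_0^{-}PG_0^{+} - G_0^{-}PG_0^{+}PG_0^{+} .
\end{align}
```

Each term contains at most one factor $G_0$, and the signs are two pluses and one minus at first order, three pluses and two minuses at second order.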

3.3 Mystery of the Schwinger–Keldysh Diagrams

One thing is obvious right away. In the diagrams above, there is always a chain
of retarded Green’s functions connecting a given point y with the observation
point x. Therefore, the formfactor vanishes if at least one of the y’s is in the
future of x. This is the retardation property

$$F(x|y_1,\ldots y_n) = 0 \quad\text{when}\quad y_m > x \ \ \text{for at least one } m . \tag{109}$$

But this is true of every Schwinger–Keldysh diagram, and why do they appear
in the special combinations? What is the role of the Feynman propagator?
Let us make a Fourier transformation of the formfactor with respect to the differences $(x - y_m)$ in the Minkowski coordinates:
$$F(x|y_1,\ldots y_n) = \int dk_1\ldots dk_n\,\exp\Bigl(i\sum_{m=1}^{n}k_m(x - y_m)\Bigr)\,f(k_1,\ldots k_n) . \tag{110}$$
How can F possess the retardation property? It suffices that f admit an analytic continuation to the upper half-plane in the time-like components of the k's. Then, for $y_m$ later than x, we shall be able to close the integration contour in the upper half-plane of $k^0_m$, and the integral will vanish. There should be a function of complex momenta $f(z_1,\ldots z_n)$ analytic in the upper half-planes of $z^0_m$ and such that $f(k_1,\ldots k_n)$ is its limiting value on the real axes:
$$f(k_1,\ldots k_n) = f(z_1,\ldots z_n)\Big|_{z^0_m = k^0_m + i\varepsilon} . \tag{111}$$
Let us build this function.
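The link between analyticity in the upper half-plane and retardation can be illustrated numerically in one dimension. The sketch below is a toy model under assumptions of mine: a single time variable, the transform convention $F(t) = \frac{1}{2\pi}\int dk^0\,e^{-ik^0t}f(k^0)$, a finite shift `eps` instead of an infinitesimal one, and the sample function $f = 1/(m^2 - (k^0 + i\varepsilon)^2)$, whose poles lie below the real axis.

```python
import numpy as np

m, eps = 1.0, 0.5          # mass and a finite (not infinitesimal) shift
K, dk = 2000.0, 0.01       # momentum cutoff and grid step
k0 = np.arange(-K, K + dk, dk)

def retarded_resolvent(k0):
    # 1 / (m^2 - (k0 + i eps)^2): poles at k0 = +-m - i eps, below the real axis
    return 1.0 / (m**2 - (k0 + 1j * eps) ** 2)

def to_time(t):
    # F(t) = (1/2pi) * integral dk0 exp(-i k0 t) f(k0), Riemann sum on the grid
    integrand = np.exp(-1j * k0 * t) * retarded_resolvent(k0)
    return np.sum(integrand) * dk / (2 * np.pi)

def closed_form(t):
    # value obtained by closing the contour around the two poles
    return np.exp(-eps * t) * np.sin(m * t) / m if t > 0 else 0.0

F_past = to_time(-2.0)     # argument earlier than the source
F_future = to_time(2.0)    # argument later than the source

print(abs(F_past), F_future.real, closed_form(2.0))
```

Within the discretization error, the transform vanishes for negative time, and for positive time it is real and agrees with the residue calculation. Moving the poles above the real axis would interchange the roles of past and future.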

All diagrams in a given-order formfactor are similar. They all are integrals
over the momentum circulating in the loop, and the integrands are identical.
The diﬀerence is only in the integration contours. Thus any diagram in the
lowest-order formfactor f (k) is of the form

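$$[\text{diagram with external momentum } k] = \int d\mathbf{p}\int_C dp^0\,\frac{\text{polynomial in momenta}}{\bigl(-p^{0\,2} + \mathbf{p}^2\bigr)\bigl(-(p^0 - k^0)^2 + (\mathbf{p} - \mathbf{k})^2\bigr)} . \tag{112}$$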

There are, generally, as many factors in the denominator as there are propa-
gators in the loop, and each factor contains two poles. The contour C passes
round them in accordance with the type of the propagator. One of the three
rules applies to each pair of poles:

retardation rule,

advancement rule,

Feynman rule.

Let us now shift the external momentum $k^0$ to the complex plane. The poles will shift to the complex plane, but we shall also deform the contour smoothly so that it does not cross the poles. In this way one can build a function of complex momenta for each Schwinger–Keldysh diagram. Thus the lowest-order formfactor with complex momentum, $f(z)$, is a sum of three functions:
$$f(z) = \int d\mathbf{p}\int_{C_1}dp^0\,(\ldots) + \int d\mathbf{p}\int_{C_2}dp^0\,(\ldots) - \int d\mathbf{p}\int_{C_3}dp^0\,(\ldots) , \tag{113}$$
and the contours $C_1$, $C_2$, $C_3$ for $z^0$ in the upper half-plane are shown in Fig. 1.
By considering the pinch conditions, i.e., the conditions that the poles pinch
the integration contour, one can check in each case that these functions can
have singularities only on the real axis. Therefore, if we consider them in the
upper half-plane, they are analytic, and their limits on the real axis are our
original diagrams.
It remains to understand what these functions are. Since the integrands are identical, the sum of the integrals in (113) is the integral over the sum of the contours
$$f(z) = \int d\mathbf{p}\int_{C_1 + C_2 - C_3}dp^0\,(\ldots) . \tag{114}$$
Sum up the three contours in Fig. 1. The resultant contour is such that every pair of poles is passed round by the Feynman rule. It may be called the Feynman contour.
Fig. 1. Integration contours in the $p^0$ plane for the three diagrams in the lowest-order formfactor (113). The sum of the contours is the Feynman contour $C_{\rm Feynman}$

But the Feynman contour deﬁnes also the in–out formfactor (100) in which
both propagators are Feynman, except that the in–out formfactor is not the
limit of f (z) from the upper half-plane. It is this limit on only half of the real
axis, and on the other half it is the limit from the lower half-plane. The in–in
and in–out formfactors are diﬀerent boundary values of the same complex
function having a cut on the real axis:
$$\text{in–in}: \qquad f(k) = f(z)\Big|_{z^0 = k^0 + i\varepsilon} , \tag{115}$$
$$\text{in–out}: \qquad f(k) = f(z)\Big|_{z^0 = (1 + i\varepsilon)k^0} , \tag{116}$$
and the function itself is the integral over the Feynman contour
$$f(z) = \int d\mathbf{p}\int_{C_{\rm Feynman}}dp^0\,(\ldots) . \tag{117}$$
The same is true of all n-th order formfactors, and this is a disclosure of
the mystery. In each case, the set of Schwinger–Keldysh diagrams is just a
splitting of one Feynman diagram whose purpose is to display the retardation
property and in this way to tell us which boundary value is to be taken.

3.4 Reduction to the Euclidean Eﬀective Action

The Feynman contour is famous for the fact that, when the external momenta
are on the imaginary axis, the Feynman contour is the imaginary axis itself.
With all the momenta imaginary, both the external ones and the one circu-
lating in the loop, this is the Euclidean formfactor. Then we can start with
the calculation of the Euclidean formfactor and next analytically continue it
in momenta from the imaginary axis to the real axis either in the way shown
in Fig. 2a or in the way shown in Fig. 2b. In the ﬁrst case we shall obtain the
in-out formfactor, and in the second case the in–in formfactor of Lorentzian
theory. It is invaluable that loops can be calculated Euclidean.
Then let us make one more step. A formfactor with the Euclidean momen-
tum can be put in the spectral form
$$f(k) = \int_0^\infty dm^2\,\frac{\rho(m^2)}{m^2 + k^2} + \text{a polynomial in } k^2 , \qquad k^2 > 0 \tag{118}$$
with some spectral weight ρ(m2 ), the resolvent 1/(m2 + k 2 ), and a polynomial
accounting for a possible growth of f (k) at k 2 → ∞. There are similar forms
for the higher-order formfactors. If the formfactor is in the spectral form,
the procedure of analytic continuation boils down merely to replacing the
Euclidean resolvent with the retarded or Feynman resolvent:
$$\text{in–in}: \qquad f(k) = \int_0^\infty dm^2\,\frac{\rho(m^2)}{m^2 - (k^0 + i\varepsilon)^2 + \mathbf{k}^2} + \text{a polynomial in } k^2 , \tag{119}$$
$$\text{in–out}: \qquad f(k) = \int_0^\infty dm^2\,\frac{\rho(m^2)}{m^2 - k^{0\,2} + \mathbf{k}^2 - i\varepsilon} + \text{a polynomial in } k^2 . \tag{120}$$

Fig. 2. Analytic continuation of the Euclidean formfactor in the $k^0$ plane that gives (a) the in–out formfactor and (b) the in–in formfactor of Lorentzian theory

Note that the spectral weight is the same in all cases: the one of the Euclidean
loop. Thus, the problem boils down to obtaining the spectral weights of the
Euclidean formfactors.
Then back from the Fourier-transformed formfactors to the formfactors
themselves, and from the formfactors to the mean-field equations. For the
loop in these equations expanded in powers of the perturbation, we obtain an
expression of the following form:

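$$\begin{aligned}
[\text{loop diagram at } x] ={}& (c_1 + c_2\Box_0 + \ldots)P(x) \\
&+ \int_0^\infty dm^2\,\rho(m^2)\,\frac{1}{m^2 - \Box_0}P(x) \\
&+ \int_0^\infty dm_1^2\,dm_2^2\,dm_3^2\,\rho(m_1^2,m_2^2,m_3^2)\,\frac{1}{m_1^2 - \Box_0}\Bigl[\Bigl(\frac{1}{m_2^2 - \Box_0}P(x)\Bigr)\Bigl(\frac{1}{m_3^2 - \Box_0}P(x)\Bigr)\Bigr] \\
&+ \ldots .
\end{aligned} \tag{121}$$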
Here the first term is local. It comes from the polynomial in the spectral form. The remaining terms are nonlocal but expressed through the resolvent which is a Green's function of the massive operator $\Box_0 - m^2$. It is initially the Euclidean Green's function since we are calculating the Euclidean loop. For the Lorentzian equations, we arrive at the following rule. To obtain the expectation-value equations in the in-vacuum state, replace all the Euclidean resolvents in (121) with the retarded Green's functions. To obtain the mean-field equations for the in–out problem, replace all the Euclidean resolvents with the Feynman Green's functions:
$$\text{All}\quad \frac{1}{m^2 - \Box_0}\ \longrightarrow\ \begin{cases}\text{Euclidean},\\ \text{Retarded},\\ \text{Feynman}.\end{cases} \tag{122}$$
At every level of expectation-value theory, there are proofs that the
expectation-value equations possess two basic properties: they are real and
causal. Causality is the retardation property discussed above. But it is not
enough to have proofs. These properties should be manifestly built into the
working formalism. Expression (121) offers such a formalism. Since the re-
tarded resolvent secures the causality and is real, this expression is manifestly
real and causal.

But even this is not enough. The theory may possess symmetries, and one may want these symmetries to be manifest. To this end it will be noted that, although expansion (121) is obtained in terms of the trivial-background resolvent $1/(m^2 - \Box_0)$, it can be regrouped so as to restore the full-background resolvent
$$\frac{1}{m^2 - S_2} = \frac{1}{m^2 - \Box_0 - P} \tag{123}$$
at each order. It does not matter whether this regrouping will be made in the expectation-value equations or in the Euclidean equations because the retarded and Euclidean Green's functions obey the same variational law (43):
$$\frac{1}{m^2 - \Box_0} = \frac{1}{m^2 - S_2} - \frac{1}{m^2 - S_2}\,P\,\frac{1}{m^2 - S_2} + \ldots . \tag{124}$$
This proves that the rule of replacing resolvents applies to the full-background
resolvents as well as to the trivial-background ones. The latter fact is im-
portant because the Euclidean loops can be calculated covariantly from the
outset, and the transition to the expectation-value equations by replacing
the full-background resolvents does not break the manifest symmetries. The
expectation-value equations are obtained in as good an approximation as the
Euclidean equations are.
There remains to be made a final observation. For the Euclidean equations, there is an effective action:
$$[\text{loop diagram with external line } i] = \frac{\delta}{\delta\varphi^i}\,[\text{loop diagram without external lines}] \tag{125}$$
because the variational law for the Euclidean Green's function is (43). It is invaluable that loops can be calculated without external lines. This reduces the calculations greatly, helps to control symmetries, helps to control renormalizations.
Thus, at the end of the day, we conclude that there is an action that generates the expectation-value equations, but it does so indirectly, i.e., not through the least-action principle. To make this clear, consider (for illustrative purposes only) any quadratic action:
$$\Gamma(\varphi) = \frac{1}{2}\int dx\,\varphi\,f(\Box_0)\,\varphi .$$
Whatever the operator $f(\Box_0)$ is, in the variational derivative it gets symmetrized:
$$\frac{\delta\Gamma(\varphi)}{\delta\varphi} = \frac{1}{2}\bigl(f(\Box_0) + f^{T}(\Box_0)\bigr)\varphi = f^{\rm sym}(\Box_0)\,\varphi .$$
Assuming that the function $f(\Box_0)$ is in the spectral form
$$f(\Box_0) = \int_0^\infty dm^2\,\rho(m^2)\,\frac{1}{m^2 - \Box_0} ,$$
one obtains the variational equations with the symmetrized resolvent:
$$\int_0^\infty dm^2\,\rho(m^2)\Bigl(\frac{1}{m^2 - \Box_0}\Bigr)^{\rm sym}\varphi = -J .$$
These cannot be the expectation-value equations since they are not causal. But, through the derivation above, we know how to correct this: just replace the symmetrized resolvent with the retarded resolvent. The corrected equations
$$\int_0^\infty dm^2\,\rho(m^2)\Bigl(\frac{1}{m^2 - \Box_0}\Bigr)^{\rm ret}\varphi = -J$$
no longer follow directly from any action, although indirectly they do. Only if the action $\Gamma(\varphi)$ is local, i.e., the function $f(\Box_0)$ is polynomial, does the least-action principle hold directly.
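The symmetrization of the kernel under variation can be seen in a discretized toy model. The sketch below is an illustration under assumptions of mine: the time axis is replaced by `n` points, the field by a vector, and the retarded kernel by a random lower-triangular matrix `f_ret`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# A "retarded" kernel on a discretized time axis: lower triangular,
# so (f_ret @ phi)[i] depends only on phi[j] with j <= i.
f_ret = np.tril(rng.standard_normal((n, n)))
phi = rng.standard_normal(n)

def action(field):
    # quadratic action Gamma = (1/2) phi^T f phi
    return 0.5 * field @ f_ret @ field

def numerical_gradient(func, x, h=1e-6):
    # central finite differences, component by component
    grad = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (func(x + step) - func(x - step)) / (2 * h)
    return grad

f_sym = 0.5 * (f_ret + f_ret.T)
grad = numerical_gradient(action, phi)

print(np.allclose(grad, f_sym @ phi, atol=1e-6))
print(np.allclose(grad, f_ret @ phi, atol=1e-6))
```

The gradient of the action reproduces the symmetrized kernel, which has entries above the diagonal and therefore couples each point to its future. The causal equations with `f_ret` are obtained only by the replacement made after the variation, which is the content of the Second Precept.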
Two precepts should be kept in mind when using the formalism above.
First, the replacement rule concerns the resolvents of the formfactors and not
the propagators in the loop. The loop should be calculated Euclidean. Hence
First Precept: First do the loop, next replace the resolvents.
Second, the replacement of resolvents is to be made in the equations and not
in the action. It does not make sense to make it in the action. Hence
Second Precept: First vary the action, next replace the resolvents.
We thus go over to the calculation of the Euclidean eﬀective action.

4 The Eﬀective Action

4.1 The Operator S2

The $\varphi^i$ is a set of fields for which a more explicit notation will now be used:
$$\varphi^i = \varphi^a(x) . \tag{126}$$
The operator $S_2$ acts on a small disturbance of $\varphi^i$ and is a second-order differential operator
$$S_{ij}\,\delta\varphi^j = \bigl(X^{\mu\nu}_{ab}\partial_\mu\partial_\nu + Y^{\mu}_{ab}\partial_\mu + Z_{ab}\bigr)\,\delta\varphi^b(x) . \tag{127}$$
The generality of this operator will, however, be restricted by the condition that the coefficient of the highest-derivative term factorizes as
$$X^{\mu\nu}_{ab} = \omega_{ab}\,g^{\mu\nu} , \qquad \det\omega_{ab} \neq 0 , \qquad \det g^{\mu\nu} \neq 0 . \tag{128}$$
In this case, the operator (127) is said to be diagonal, or minimal, or nonexotic. Condition (128) is too restrictive and not necessary. It can be replaced by a more general condition
$$\det\bigl(X^{\mu\nu}_{ab}n_\mu n_\nu\bigr) = C\,(g^{\mu\nu}n_\mu n_\nu)^d \quad \forall n_\mu , \qquad d = \dim a , \quad C \neq 0 , \quad \det g^{\mu\nu} \neq 0 , \tag{129}$$
and even this condition can be generalized. Higher-order and first-order operators can also be considered but, in all of these cases, the Green's functions of $S_2$ are expressed through the Green's functions of a diagonal second-order operator. The case (128) is basic.
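In the case (128), the matrix $\omega_{ab}$ can be factored out:
$$S_{ij}\,\delta\varphi^j = \omega_{ac}\,H^c{}_b\,\delta\varphi^b(x) , \tag{130}$$
and a covariant derivative can be introduced:
$$\nabla_\mu\delta\varphi^a = \bigl(\delta^a_b\,\partial_\mu + A_\mu{}^a{}_b\bigr)\,\delta\varphi^b \tag{131}$$
so as to absorb the first-order term:
$$H^a{}_b = \delta^a_b\,g^{\mu\nu}\nabla_\mu\nabla_\nu + P^a{}_b . \tag{132}$$
This is the final form of $S_2$. A short notation will be used:
$$H = \Box\hat 1 + \hat P \tag{133}$$
where
$$\Box \equiv g^{\mu\nu}\nabla_\mu\nabla_\nu , \tag{134}$$
and the hat designates a matrix in a, b:
$$\hat 1 = \delta^a_b , \qquad \hat P = P^a{}_b , \qquad \operatorname{tr}\hat P = P^a{}_a , \quad\text{etc.} \tag{135}$$
The matrix $\omega_{ab}$ may be regarded as a local metric in the space of fields. The symmetry of $S_2$ implies that this matrix is symmetric, covariantly constant, and converts $\hat P$ into a symmetric form:
$$\omega_{ab} = \omega_{ba} , \qquad \nabla_\mu\omega_{ab} = 0 , \tag{136}$$
$$P^c{}_a\,\omega_{cb} - P^c{}_b\,\omega_{ca} = 0 . \tag{137}$$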

The dominant energy condition implies that $\omega_{ab}$ is positive definite. The matrix $g^{\mu\nu}$ is the inverse of the metric on the base manifold. Since we are considering Euclidean theory, this metric is positive definite too.
Apart from the algebraic factor $\omega_{ac}$ in (130), the operator $S_2$ contains three background fields:
$$g^{\mu\nu} , \quad \nabla_\mu , \quad \hat P , \tag{138}$$
i.e., the metric, the connection (or covariant derivative), and the matrix potential. And where is the original background $\varphi$ of $S_2(\varphi)$? When $S_2$ is calculated from the action S, the metric, connection, and potential are obtained as functions of the original set of fields $\varphi$, but from now on it does not matter. The effective action is expressed in a universal manner through the fields (138) only.
The strengths of the fields (138) are respectively the Riemann tensor, the commutator of covariant derivatives, and the potential which is its own strength:
$$R_{\alpha\beta\mu\nu} , \qquad [\nabla_\mu,\nabla_\nu] = \hat R_{\mu\nu} , \qquad \hat P . \tag{139}$$
I shall call these field strengths curvatures and use for them the collective notation
$$R_{\alpha\beta\mu\nu} ,\ \hat R_{\mu\nu} ,\ \hat P = \Re . \tag{140}$$
The following contractions of the curvatures will be called currents:
$$\hat J_\mu \equiv \nabla^\nu\hat R_{\mu\nu} , \tag{141}$$
$$J_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R , \qquad J \equiv g^{\mu\nu}J_{\mu\nu} . \tag{142}$$
The currents are conserved:
$$\nabla^\mu\hat J_\mu = 0 , \qquad \nabla^\mu J_{\mu\nu} = 0 . \tag{143}$$
If all the curvatures vanish, the background is trivial. The effective action is a functional of the curvatures (140).

4.2 Redundancy of the Curvatures

The effective action is a nonlocal functional of the curvatures, and this fact
conditions a certain simplification.
Since the commutator curvature is a commutator, it satisfies the Jacobi identity, and so does the Riemann curvature:
$$\nabla_\gamma\hat R_{\mu\nu} + \nabla_\nu\hat R_{\gamma\mu} + \nabla_\mu\hat R_{\nu\gamma} = 0 , \tag{144}$$
$$\nabla_\gamma R_{\alpha\beta\mu\nu} + \nabla_\nu R_{\alpha\beta\gamma\mu} + \nabla_\mu R_{\alpha\beta\nu\gamma} = 0 . \tag{145}$$
Act on these identities with $\nabla^\gamma$. In the first term, the operator $\Box$ forms, and in the remaining terms commute the covariant derivatives. The commutator brings an extra power of the curvature. The equations obtained
$$\Box\hat R_{\mu\nu} + O(\Re^2) = 2\nabla_{[\nu}\hat J_{\mu]} , \tag{146}$$
$$\Box R_{\alpha\beta\mu\nu} + O(\Re^2) = 4\nabla_{[\mu}\nabla_{\langle\alpha}\Bigl(J_{\nu]\beta\rangle} - \frac{1}{2}\,g_{\nu]\beta\rangle}J\Bigr) \tag{147}$$
hold identically and have the form of inhomogeneous wave equations, the role of inhomogeneity being played by the currents. In (146), (147), the brackets of both types [ ] and ⟨ ⟩ denote the antisymmetrization in the respective indices.
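The step from the Jacobi identity (144) to the wave equation (146) can be written out as follows. This is a sketch that keeps only the terms linear in the curvature and uses the definition (141) together with the antisymmetry of $\hat R_{\mu\nu}$; the normalization $A_{[\nu\mu]} = \frac{1}{2}(A_{\nu\mu} - A_{\mu\nu})$ is assumed because it reproduces the factor 2 in (146).

```latex
\begin{align}
0 &= \nabla^\gamma\bigl(\nabla_\gamma\hat R_{\mu\nu} + \nabla_\nu\hat R_{\gamma\mu} + \nabla_\mu\hat R_{\nu\gamma}\bigr) \\
  &= \Box\hat R_{\mu\nu} + \nabla_\nu\nabla^\gamma\hat R_{\gamma\mu} + \nabla_\mu\nabla^\gamma\hat R_{\nu\gamma} + O(\Re^2) \\
  &= \Box\hat R_{\mu\nu} - \nabla_\nu\hat J_\mu + \nabla_\mu\hat J_\nu + O(\Re^2) , \\
\Box\hat R_{\mu\nu} + O(\Re^2) &= \nabla_\nu\hat J_\mu - \nabla_\mu\hat J_\nu = 2\nabla_{[\nu}\hat J_{\mu]} .
\end{align}
```

The Riemann case (147) follows the same pattern, with the contracted Bianchi identity supplying the divergence of the Riemann tensor in terms of the Ricci tensor.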

The equations (146) and (147) are nonlinear, but they can be solved by iteration. The result is that the commutator and Riemann curvatures get expressed in a nonlocal fashion through their currents and an arbitrary solution of the homogeneous wave equation
$$\Box\hat R^{\rm wave}_{\mu\nu} = 0 , \qquad \Box R^{\rm wave}_{\alpha\beta\mu\nu} = 0 . \tag{148}$$
If the metric is Lorentzian, this solution is fixed by initial data which can be given in the remote past. It follows that the commutator and Riemann curvatures are specified by giving an incoming wave and the current J. This fact underlies the Maxwell and Einstein equations. They fix the currents J. Adding initial conditions to these equations specifies the connection and metric.
In the present case, since the metric is Euclidean, there are no wave solutions:
$$\hat R^{\rm wave}_{\mu\nu} = 0 , \qquad R^{\rm wave}_{\alpha\beta\mu\nu} = 0 , \tag{149}$$
and the Green's function $1/\Box$ is unique. Therefore, the commutator and Riemann curvatures are expressed entirely through their currents:
$$\hat R_{\mu\nu} = \frac{1}{\Box}\,2\nabla_{[\nu}\hat J_{\mu]} + O(J^2) , \tag{150}$$
$$R_{\alpha\beta\mu\nu} = \frac{1}{\Box}\,4\nabla_{[\mu}\nabla_{\langle\alpha}\Bigl(J_{\nu]\beta\rangle} - \frac{1}{2}\,g_{\nu]\beta\rangle}J\Bigr) + O(J^2) . \tag{151}$$
Thus, the curvatures are redundant because there are no waves in Euclidean theory. Owing to this fact, the set of field strengths (140) reduces to
$$J_{\mu\nu} , \quad \hat J_\mu , \quad \hat P , \tag{152}$$

and the eﬀective action is a functional of the reduced set.

4.3 The Axiomatic Eﬀective Action

To what class of functionals does the effective action belong? One can say in
advance that this should be a functional analytic in the curvature. Indeed,
the first variational derivative of the effective action taken at the trivial back-
ground should vanish because, in the absence of an external source, the rel-
ative vacuum becomes the absolute vacuum. The trivial background should
solve the mean-field equations in the absolute vacuum. Higher-order varia-
tional derivatives taken at the trivial background determine the correlation
functions in the absolute vacuum. They may not vanish but neither should
they blow up.
The analyticity suggests that the effective action can be built as a sum of nonlocal invariants of N-th order in the curvature:
$$\Gamma = \sum_N\Gamma_N , \qquad \Gamma_N = O[\Re^N] . \tag{153}$$
A nonlocal invariant is, however, an ill-defined concept. Even a local invariant of N-th order in the curvature is a concept that needs to be refined, but this is easy to do. The most general local monomial that can be built out of the available quantities yields an invariant of the form
$$\int dx\,g^{1/2}\,\underbrace{(\nabla_1\ldots\nabla_1)(\nabla_2\ldots\nabla_2)\ldots}_{k}\,\Re_1\Re_2\ldots\Re_N + O[\Re^{N+1}] . \tag{154}$$
This monomial is a product of N curvatures and k covariant derivatives, all indices being contracted by the metric. In (154), the labels 1, 2, . . . point out which derivative acts on which curvature, but all the curvatures are at the same point, and the total number of derivatives is finite. Of course, the curvature sits also in the covariant derivatives and in the metric that contracts the indices. Therefore, the N-th order invariant can only be defined up to terms $O[\Re^{N+1}]$. In particular, the covariant derivatives in (154) can be commuted freely because the contribution of a commutator is already $O[\Re^{N+1}]$.
One may now consider a class of nonlocal invariants that can formally be represented as infinite series of local invariants:
$$\Gamma_N = \int dx\,g^{1/2}\sum_{k=0}^{\infty}c_k\,\underbrace{(\nabla_1\ldots\nabla_1)(\nabla_2\ldots\nabla_2)\ldots}_{k}\,\Re_1\Re_2\ldots\Re_N + O[\Re^{N+1}] . \tag{155}$$
Here $c_k$ are some dimensional constants. It can be seen that this is the needed class.² The number of curvatures in (155) is N, but the number of derivatives is unlimited. Only a finite number of derivatives can contract with the curvatures. The remaining ones can only contract among themselves. If two derivatives acting on the same curvature contract, they make a $\Box$ operator acting on this curvature:
$$\nabla_1^{\,2} = \Box_1 , \qquad \nabla_2^{\,2} = \Box_2 , \quad \ldots . \tag{156}$$
If two derivatives acting on different curvatures contract, the contraction can again be written in terms of the $\Box$ operators:
$$2\nabla_1\nabla_2 = (\nabla_1 + \nabla_2)^2 - \nabla_1^{\,2} - \nabla_2^{\,2} = \Box_{1+2} - \Box_1 - \Box_2 , \tag{157}$$
but there appears a $\Box$ operator acting on the product of two curvatures:
$$\Box_{1+2}\,\Re_1\Re_2\Re_3\ldots = (\Box\,\Re\Re)\,\Re_3\ldots . \tag{158}$$
As a result, (155) takes the form
$$\Gamma_N = \int dx\,g^{1/2}\Bigl(\sum_{k_1,k_2,\cdots=0}^{\infty}c_k\,(\Box_1)^{k_1}(\Box_2)^{k_2}(\Box_{1+2})^{k_3}\ldots\Bigr)\times\underbrace{\nabla\ldots\Re_1\,\nabla\ldots\Re_2\ldots\nabla\ldots\Re_N}_{\text{contraction}} + O[\Re^{N+1}] . \tag{159}$$
There remains an infinite series in the $\Box$ variables, and these variables themselves are operators acting on the curvatures in a given contraction. The remaining series is some function of the $\Box$ variables:
$$\Gamma_N = \int dx\,g^{1/2}\,F(\Box_1,\Box_2,\Box_{1+2},\ldots)\,\underbrace{\nabla\ldots\Re_1\,\nabla\ldots\Re_2\ldots\nabla\ldots\Re_N}_{\text{contraction}} + O[\Re^{N+1}] . \tag{160}$$
This is the general form of a nonlocal invariant of N-th order in the curvature. The function F is a formfactor.

² To see it, consider any diagram with massive propagators and expand it formally in the inverse mass. The method that accomplishes this expansion is known as the Schwinger–DeWitt technique.

There is, in addition, the identity
$$\nabla_1 + \nabla_2 + \ldots + \nabla_N = 0 \tag{161}$$
which reduces the number of variables in the function F. The sum in (161) is a derivative acting on the product of all curvatures, i.e., a total derivative. Total derivatives vanish because the curvatures may be considered as having compact supports. Thus invariants of first order in the curvature can only be local because any derivative is a total derivative. Therefore, the first-order formfactors are constants:
$$N = 1 : \qquad F = \text{const.} \tag{162}$$
At the second order, all formfactors are functions of only one argument because the remaining arguments can be eliminated by integration by parts:
$$N = 2 : \qquad F = F(\Box_1) , \qquad \Box_2 = \Box_1 , \quad \Box_{1+2} = 0 . \tag{163}$$
At the third order, all formfactors are functions of three individual $\Box$'s because the $\Box$'s acting on pairs can be eliminated:
$$N = 3 : \qquad F = F(\Box_1,\Box_2,\Box_3) , \qquad \Box_{1+2} = \Box_3 , \quad \Box_{1+3} = \Box_2 , \quad \Box_{2+3} = \Box_1 . \tag{164}$$
The $\Box$'s acting on pairs appear beginning with the fourth order in the curvature and are parameters of the on-shell scattering amplitudes.
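The integration-by-parts identities behind (163) and (164) can be checked on a lattice. The sketch below is a toy model under assumptions of mine: one dimension, a periodic lattice in place of compactly supported curvatures, the second-difference matrix `box` in place of the covariant $\Box$, and random vectors `R1`, `R2`, `R3` in place of the curvatures.

```python
import numpy as np

n = 64
rng = np.random.default_rng(2)

# Periodic second-difference matrix: a symmetric stand-in for the box operator
# on a closed (boundary-free) lattice, so total derivatives sum to zero.
I = np.eye(n)
box = np.roll(I, 1, axis=0) + np.roll(I, -1, axis=0) - 2 * I

R1, R2, R3 = rng.standard_normal((3, n))

def integral(values):
    # lattice counterpart of the integral over x
    return values.sum()

# Second order: box_2 = box_1 and box_{1+2} = 0 under the integral
lhs_2 = integral(R1 * (box @ R2))
rhs_2 = integral((box @ R1) * R2)
total_derivative = integral(box @ (R1 * R2))

# Third order: box_{1+2} = box_3 under the integral
lhs_3 = integral((box @ (R1 * R2)) * R3)
rhs_3 = integral(R1 * R2 * (box @ R3))

print(abs(lhs_2 - rhs_2), abs(total_derivative), abs(lhs_3 - rhs_3))
```

All three differences vanish to rounding accuracy because the lattice operator is symmetric and annihilates constants, which is the discrete form of discarding total derivatives.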

Nonlocal invariants of a given order form a linear space in which all
possible contractions of N curvatures and their derivatives form a basis, and
the formfactors play the role of the coefficients of a linear combination. The
basis can be built by listing all independent contractions. The effective
action is an expansion in this basis with the formfactors as coefficients:

Γ = ΓI + ΓII + ΓIII + . . . , (165)

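\[
\Gamma_{\rm I} = \int dx\, g^{1/2} \bigl( c_1 R + c_2\, {\rm tr}\, \hat P \bigr) \,,
\tag{166}
\]

\[
\Gamma_{\rm II} = \int dx\, g^{1/2}\, {\rm tr} \Bigl[ R_{\mu\nu} F_1(\Box) R^{\mu\nu}
+ R\, F_2(\Box) R + \hat P\, F_3(\Box) R + \hat P\, F_4(\Box) \hat P
+ \hat R_{\mu\nu} F_5(\Box) \hat R^{\mu\nu} \Bigr] \,,
\tag{167}
\]

\[
\Gamma_{\rm III} = \int dx\, g^{1/2}\, {\rm tr} \Bigl[ F_1(\Box_1, \Box_2, \Box_3)\, \hat P_1 \hat P_2 \hat P_3
+ F_2(\Box_1, \Box_2, \Box_3)\, \hat R_1{}^{\mu}{}_{\alpha}\, \hat R_2{}^{\alpha}{}_{\beta}\, \hat R_3{}^{\beta}{}_{\mu}
+ \cdots
+ F_{29}(\Box_1, \Box_2, \Box_3)\, \nabla_\lambda \nabla_\sigma R_1^{\alpha\beta}\, \nabla_\alpha \nabla_\beta R_2^{\mu\nu}\, \nabla_\mu \nabla_\nu R_3^{\lambda\sigma} \Bigr] \,.
\tag{168}
\]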

In the first-order action (166), there are two basis contractions: the Ricci scalar
and the trace of the matrix potential, and the formfactors are constants. In the
second-order action, there are five independent contractions listed in (167). In
the third-order action, there are 29 basis contractions, examples of which are
given in (168). Here I shall stop because, for the problems of interest, the third
order is sufficient. The reason for that will be explained in the next lecture.
In the expressions above, the basis invariants are written in terms of the
curvatures, but they can be rewritten in terms of the conserved currents. Note
also that the operator arguments of the third-order formfactors F commute
because they act on different objects. Since the arguments commute, the func-
tions F themselves are ordinary functions of three variables.
Thus, even before any calculation, we have an ansatz for the effective
action, with unknown formfactors. We need them in the spectral forms
\[
F_k(\Box) = \int_0^\infty dm^2\, \frac{\rho_k(m^2)}{m^2 - \Box} + \text{a polynomial in } \Box \,,
\tag{169}
\]

\[
F_k(\Box_1, \Box_2, \Box_3) = \int_0^\infty dm_1^2\, dm_2^2\, dm_3^2\,
\frac{\rho_k(m_1^2, m_2^2, m_3^2)}{(m_1^2 - \Box_1)(m_2^2 - \Box_2)(m_3^2 - \Box_3)} \,,
\tag{170}
\]

and then we can proceed directly to the expectation-value equations. Unknown
are only the spectral weights. These are to be calculated from the loop
diagrams, but there is an alternative approach. One can look for the general
limitations on the spectral weights stemming from axiomatic theory. These
limitations may be sufficient to solve one’s expectation-value problem. In this
case, the solution will prove to be independent of the details of the quantum-
field model and the approximations made in it. Moreover, the effective action
above does not refer even to quantum field theory. It is an action for the
observable field, and its implications may be valid irrespective of the under-
lying fundamental theory. Only certain axiomatic properties of the spectral
weights may be important. There is an example in which this approach has
been implemented [53].
Here, the axiomatic approach will not be considered. Let us see how the
effective action is calculated from loops.

4.4 Heat Kernel

Consider any diagram in the eﬀective action

[diagram: a generic loop diagram] , (171)

and, for every propagator, write

\[
[\text{propagator line}] = -\frac{1}{H} = \int_0^\infty ds\, e^{sH} \,.
\tag{172}
\]

The kernel of the exponential operator

\[
e^{sH}\, \delta(x, y) \equiv \hat K(x, y|s)
\tag{173}
\]

(and the operator itself) is called heat kernel, and the parameter s is often
called proper time. Both names are matters of history, and a matter of physics
is the fact that H is negative deﬁnite. The matrix P in (133) may spoil the
negativity but, since it is treated perturbatively, as one of the curvatures, this
does not matter.
Upon the insertion of (172), the diagram remains the same as before but
with the heat kernels in place of the propagators, and the integrations over
the proper times will be left for the last:
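\[
[\text{diagram with the propagators}] = \int_0^\infty ds_1 \ldots \int_0^\infty ds_n\,
[\text{the same diagram with the heat kernels of proper times } s_1, \ldots, s_n] \,.
\tag{174}
\]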
The one-loop eﬀective action is the functional trace of the heat kernel, inte-
grated over s:

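\[
[\text{one-loop diagram}] = \frac{1}{2} \ln\det \frac{1}{H}
= \frac{1}{2} \int_0^\infty \frac{ds}{s} \int dx\, {\rm tr}\, \hat K(x, x|s) \,.
\tag{175}
\]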

Thus, one is left with diagrams with the heat kernels. It will be seen in a
moment why this is better.
The expansion rule for the exponential operator has already been consid-
ered in (27). There remains to be presented the lowest-order approximation
for the heat kernel:
\[
\hat K(x, y|s) = \frac{1}{(4\pi s)^{D/2}}\, e^{-\sigma(x,y)/2s}\, \hat a(x, y) + O[\Re] \,,
\tag{176}
\]

D = dimension of the base manifold. (177)

At the lowest order in the curvature, the potential P does not aﬀect this
expression, but the metric and connection do. As mentioned above, covariant
expansions cannot be rigid. In (176)

\[
2\sigma(x, y) = (\text{geodetic distance between } x \text{ and } y)^2
\tag{178}
\]

in the metric entering the operator H. The connection entering the operator H
deﬁnes a parallel transport along a line. Parallel transport is a linear mapping,
so there exists a propagator of parallel transport (the matrix that accomplishes
this mapping). In (176)

â(x, y) = propagator of the parallel transport from y to x (179)

along the geodesic connecting y and x.
The geodesic comes from the metric, and the parallel transport from the
connection.
The two-point functions (178) and (179) are the main elements of the
Schwinger–DeWitt technique mentioned above and the basic building blocks
for all Green’s functions: of the hyperbolic operator H, and of the elliptic
operator H, and the heat kernel. What is special about the heat kernel?
Special is the fact that, as seen from expression (176), the heat kernel is
ﬁnite at the coincident points. Green’s functions of the hyperbolic and elliptic
operators are singular, and this is normal. Abnormal is the fact that in the
loop diagrams they appear at the coincident points. Finiteness of the heat
kernel at the coincident points is a bonus owing to which all diagrams with
the heat kernels are ﬁnite.

The divergences of the loop diagrams reappear in the proper-time integrals
in (174). These integrals diverge at the lower limits. At this stage, one more
advantage of the heat kernel comes into eﬀect. Namely, the manifold dimension
D enters only the overall factor in (176). Apart from this factor, the expansion
of the heat kernel in the curvature does not contain D explicitly. Therefore,
loops with the heat kernels are calculated once for all dimensions, and then
the knowledge of the analytic dependence on D enables one to apply the
dimensional regularization to the proper-time integrals. One integrates by
parts in s keeping D < 4 and next goes over to the limit D → 4. For
example,
\[
\int_0^\infty \frac{ds}{s^{D/2-1}}\, f(s) = \frac{1}{2 - D/2}\, f(0)
- \int_0^\infty ds\, \ln s\, \frac{df(s)}{ds} + O(2 - D/2) \,.
\tag{180}
\]
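
The expansion (180) can be checked on a simple test function. The sketch below assumes f(s) = exp(-s), which is not taken from the text and is chosen only because both sides are then known in closed form: the left-hand side is the gamma function of 2 - D/2, and the integral with ln s equals minus the Euler constant. The function names are hypothetical.

```python
import math

# Euler-Mascheroni constant: the integral of ln(s) * exp(-s) over (0, inf) is -EULER_GAMMA
EULER_GAMMA = 0.5772156649015329


def lhs(D):
    """Left-hand side of (180) for the test function f(s) = exp(-s): Gamma(2 - D/2)."""
    return math.gamma(2.0 - D / 2.0)


def rhs(D):
    """Right-hand side of (180) for f(s) = exp(-s), with the O(2 - D/2) remainder dropped."""
    eps = 2.0 - D / 2.0
    # f(0) = 1, and  -int ds ln(s) f'(s) = +int ds ln(s) exp(-s) = -EULER_GAMMA
    return 1.0 / eps - EULER_GAMMA


def remainder(D):
    """Difference of the two sides; it should vanish linearly as D approaches 4 from below."""
    return lhs(D) - rhs(D)


if __name__ == "__main__":
    for D in (3.0, 3.9, 3.99):
        print(D, lhs(D), rhs(D), remainder(D))
```

The pole 1/(2 - D/2) multiplies f(0) only, which is the sense in which the logarithmic divergence ends up in a local term.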

The dimensional regularization annihilates all power divergences. Only the
logarithmic divergences survive and take the form of poles in dimension. These
poles affect only the polynomial terms in the spectral representations of the
formfactors. They appear in the coefficients of the polynomials, thereby
making these coefficients indefinite. As a consequence, the local terms of the
effective action will have indefinite coefficients. I shall come back to this
issue.
After the substitution of the heat kernels for the propagators, the calcu-
lation of loops becomes an entertaining geometrical exercise.

4.5 Loops and Geometry

The heat kernel involves σ and â. The derivative of σ

\[
\nabla^\mu \sigma(x, y) \equiv \sigma^\mu(x, y)
\tag{181}
\]
[diagram: the geodesic from y to x with the vector σ^μ(x, y) attached at x]

is the vector tangent to the geodesic connecting y and x, directed outwards,
and normalized to the geodetic distance between y and x:

\[
g_{\mu\nu}\, \sigma^\mu \sigma^\nu = 2\sigma \,, \qquad
\sigma^\mu \Big|_{x=y} = 0 \,, \qquad
\det \nabla_\nu \sigma^\mu \Big|_{x=y} \neq 0 \,.
\tag{182}
\]

The normalization condition is a closed equation for σ which together with
the conditions at the coincident points can serve as the definition of σ. The
defining equation for â together with the condition at the coincident points is

\[
\sigma^\mu \nabla_\mu\, \hat a(x, y) = 0 \,, \qquad \hat a \Big|_{x=y} = \hat 1 \,.
\tag{183}
\]

The determinant

\[
\det \nabla^x_\mu \nabla^y_\nu\, \sigma(x, y) = g^{1/2}(x)\, g^{1/2}(y)\, \Delta(x, y)
\tag{184}
\]

is known as the Van Vleck–Morette determinant. It is responsible, in
particular, for a caustic of the geodesics emanating from x or y.
The vector σ μ can be used to expand any function in a covariant Taylor
series. For a scalar, this series is of the form
\[
f(y) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \sigma^{\mu_1} \ldots \sigma^{\mu_n}\,
\nabla_{\mu_1} \ldots \nabla_{\mu_n} f(x) \,.
\tag{185}
\]

If f is not a scalar, it should at first be parallel transported from y to x:

\[
f(y) = \hat a(y, x) \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \sigma^{\mu_1} \ldots \sigma^{\mu_n}\,
\nabla_{\mu_1} \ldots \nabla_{\mu_n} f(x) \,.
\tag{186}
\]

The covariant Taylor expansion is a regrouping of the ordinary Taylor
expansion. Whatever the connection is, it cancels in this series. The series
can formally be written in the exponential form

\[
f(y) = \hat a(y, x)\, \exp\left( -\sigma^\mu \nabla_\mu \right) f(x)
\tag{187}
\]

which will be of use below. Two-point functions expanded in this way get
expressed through their covariant derivatives at the coincident points. Thus

\[
\Delta(x, y) = 1 + \frac{1}{6}\, R_{\mu\nu}\, \sigma^\mu \sigma^\nu + \ldots \,.
\tag{188}
\]
A loop always involves the ring of â's

\[
\hat a(x, x_1)\, \hat a(x_1, x_2) \ldots \hat a(x_n, x) \,,
\tag{189}
\]
[diagram: a geodetic polygon]

i.e., the parallel transport around a geodetic polygon. The ring of two â's
is the parallel transport there and back along the same path. Therefore,

\[
\hat a(x, x_1)\, \hat a(x_1, x) \equiv \hat 1 \,.
\tag{190}
\]

The ring of three â's is the parallel transport around the geodetic triangle.
It involves the commutator curvature, and the curvature terms can be
calculated:

\[
\hat a(x, x_1)\, \hat a(x_1, x_2)\, \hat a(x_2, x) = \hat 1 + \frac{1}{2}\, \hat R_{\alpha\beta}\, \sigma_1^\alpha \sigma_2^\beta + \ldots \,,
\tag{191}
\]

[diagram: the geodetic triangle with the vertices x, x_1, x_2 and the vectors
σ_1^μ, σ_2^μ at the vertex x] . (192)

This is suﬃcient because any polygon can be broken into triangles:

[diagram: a geodetic polygon broken into triangles] . (193)

Solution of the geodetic triangle is also involved. In the notation of (192),
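\[
\bigl( \sigma^\mu(x_1, x_2) \bigr)^2 = \sigma_1^2 + \sigma_2^2 - 2\sigma_1 \sigma_2
- \frac{1}{3}\, R_{\mu\alpha\nu\beta}\, \sigma_1^\mu \sigma_1^\nu \sigma_2^\alpha \sigma_2^\beta + \ldots \,.
\tag{194}
\]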
Here the ﬁrst two terms make the Pythagorean theorem, the third term ac-
counts for the angle not being the right angle, and the terms with the Riemann
curvature can be calculated.
The above is to give a ﬂavour of what loops imply.

4.6 Calculation of Loops

The heat kernel calculates loops with a remarkable elegance. As an example,
consider the contribution of the second order in the curvature to the effective
action. The respective one-loop diagram contains two curvatures and two
heat kernels with the proper times s_1 and s_2:

\[
[\text{one-loop diagram with two vertices } \Re \text{ and the heat kernels } s_1, s_2] + O[\Re^3]
= \int dx\, g^{1/2} \int dy\, g^{1/2}\, \Re(x)\, \hat K(x, y|s_1)\, \hat K(x, y|s_2)\, \Re(y) + O[\Re^3] \,.
\tag{195}
\]

Suppose that the calculation only needs to be done with accuracy O[ℜ³]. Then
one can insert in (195) the lowest-order approximation for the heat kernels. In
this approximation, the rings of â's collapse to 1̂, and the remaining â's
always transport the ℜ's to the same point arranging their complete
contraction. With the â's and the numerical coefficients omitted, the diagram
(195) is of the form

\[
\int dx\, g^{1/2} \int dy\, g^{1/2}\, \frac{1}{s_1^{D/2}}\, \frac{1}{s_2^{D/2}}\,
\Re(x)\, \exp\left( -\frac{\sigma(x, y)}{2 s_1} \right) \exp\left( -\frac{\sigma(x, y)}{2 s_2} \right) \Re(y) \,.
\tag{196}
\]

But the exponents here simply add, and the two heat kernels turn into one
with a complicated proper-time argument:

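\[
\int dx\, g^{1/2} \int dy\, g^{1/2}\, \frac{1}{(s_1 s_2)^{D/2}}\,
\Re(x)\, \exp\left( -\frac{s_1 + s_2}{2 s_1 s_2}\, \sigma(x, y) \right) \Re(y)
= \frac{1}{(s_1 + s_2)^{D/2}} \int dx\, g^{1/2} \int dy\, g^{1/2}\,
\Re(x)\, K\!\left( x, y \,\Big|\, \frac{s_1 s_2}{s_1 + s_2} \right) \Re(y) \,.
\tag{197}
\]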
One only needs to rewrite this heat kernel in the operator form:

\[
\frac{1}{(s_1 + s_2)^{D/2}} \int dx\, g^{1/2}\, \Re\, \exp\left( \frac{s_1 s_2}{s_1 + s_2}\, \Box \right) \Re \,,
\tag{198}
\]
and the loop is done. The proper-time integral
\[
\int_0^\infty ds_1 \int_0^\infty ds_2\, \frac{1}{(s_1 + s_2)^{D/2}}\,
\exp\left( \frac{s_1 s_2}{s_1 + s_2}\, \Box \right) = F(\Box)
\tag{199}
\]

is the formfactor.
What has happened? The propagators in the loop glued together, and the
loop turned into a tree:

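[diagram: the one-loop diagram with two vertices] → [diagram: the tree with the same two vertices] . (200)

This is what it means to do the loop. It means to turn it into a tree. The role
of the propagator in the tree is played by the formfactor F(□).
Consider now any multiloop diagram with parallel propagators. It turns
into a tree

[diagram: two vertices joined by several parallel propagators] → [diagram: the tree with the same two vertices] (201)

in a completely similar way. The inverse proper times add:

\[
\frac{1}{s_1} + \frac{1}{s_2} + \ldots = \frac{1}{s_{\rm total}}
\]
(the law of parallel conductors). There is nothing to do.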
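
The gluing of two heat kernels into one can be verified numerically. The sketch below assumes the lowest-order kernel of (176) with â and the curvature corrections dropped, so that σ enters as a plain number; the function names are hypothetical. It restores the numerical coefficient (4π)^(-D/2) that the text omits in (196) and (197).

```python
import math


def heat_kernel_lowest_order(sigma, s, D):
    """Lowest-order heat kernel (176) without a-hat: (4 pi s)^(-D/2) exp(-sigma / 2s)."""
    return (4.0 * math.pi * s) ** (-D / 2.0) * math.exp(-sigma / (2.0 * s))


def combined_proper_time(*proper_times):
    """Law of parallel conductors: 1/s_total = 1/s1 + 1/s2 + ..."""
    return 1.0 / sum(1.0 / s for s in proper_times)


def product_of_kernels(sigma, s1, s2, D):
    """Product of two kernels between the same points, as in the loop (196)."""
    return heat_kernel_lowest_order(sigma, s1, D) * heat_kernel_lowest_order(sigma, s2, D)


def single_kernel_form(sigma, s1, s2, D):
    """Right-hand side of (197): one kernel with the proper time s1 s2 / (s1 + s2)."""
    s_total = combined_proper_time(s1, s2)
    prefactor = (4.0 * math.pi * (s1 + s2)) ** (-D / 2.0)
    return prefactor * heat_kernel_lowest_order(sigma, s_total, D)


if __name__ == "__main__":
    print(product_of_kernels(0.7, 0.3, 1.1, 4))
    print(single_kernel_form(0.7, 0.3, 1.1, 4))
```

The two printed numbers agree because the exponents add and (s_1 s_2)^(-D/2) factorizes as (s_1 + s_2)^(-D/2) times s_total^(-D/2).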
For more than two curvatures a more powerful method is used. Consider
the diagram
[diagram: the one-loop triangle with the curvature vertices at x, y_1, y_2] + O[ℜ⁴] , (202)
and suppose again that it is needed only up to the next order in the curvature.
Then, with the â’s and the numerical coeﬃcients omitted, it is of the form

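\[
\int dx\, g^{1/2} \int dy_1\, g^{1/2} \int dy_2\, g^{1/2}\,
\frac{1}{s_1^{D/2}}\, \frac{1}{s_2^{D/2}}\, \frac{1}{s_3^{D/2}}
\times \exp\left( -\frac{\sigma(x, y_1)}{2 s_1} - \frac{\sigma(x, y_2)}{2 s_2} - \frac{\sigma(y_1, y_2)}{2 s_3} \right)
\Re(x)\, \Re(y_1)\, \Re(y_2) \,.
\tag{203}
\]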

Choose one of the vertices, say x, to be the observation point of the effective
Lagrangian. One of the curvatures, ℜ(x), is already there. Shift the remaining
curvatures to x using the covariant Taylor series:

\[
\Re(y_i) = \exp\left( -\sigma_i^\mu \nabla_\mu \right) \Re(x) \,,
\tag{204}
\]

\[
\sigma_i^\mu = \sigma^\mu(x, y_i) \,, \qquad i = 1, 2 \,.
\tag{205}
\]
Next, consider the geodetic triangle with the same vertices as in the diagram.
For the geodesics connecting x with yi , write

\[
2\sigma(x, y_i) = (\sigma_i)^2 \,,
\tag{206}
\]

and, for the geodesic between the y's, use the Pythagorean theorem:

\[
2\sigma(y_1, y_2) = (\sigma_1)^2 + (\sigma_2)^2 - 2\sigma_1 \sigma_2 + O[\Re] \,.
\tag{207}
\]

Finally, replace the integration variables:

\[
y_1^\mu \to \sigma_1^\mu \,, \qquad y_2^\mu \to \sigma_2^\mu \,.
\tag{208}
\]

The Jacobian
\[
\left| \frac{\partial \sigma^\mu(x, y_i)}{\partial y_i^\nu} \right|^{-1}
= \frac{g^{1/2}(x)}{g^{1/2}(y_i)}\, \Delta^{-1}(x, y_i)
= \frac{g^{1/2}(x)}{g^{1/2}(y_i)}\, \bigl( 1 + O[\Re] \bigr)
\tag{209}
\]

removes the measure g^{1/2} from the integral in y_i and brings an extra
g^{1/2} to the integral in x. Expression (203) takes the form

\[
\frac{1}{(s_1 s_2 s_3)^{D/2}} \int dx\, g^{1/2}\, \bigl( g^{1/2}(x) \bigr)^2 \int d\sigma_1\, d\sigma_2\,
\exp\left( -\frac{\sigma_1^2}{4 s_1} - \frac{\sigma_2^2}{4 s_2}
- \frac{\sigma_1^2 + \sigma_2^2 - 2\sigma_1 \sigma_2}{4 s_3}
- \sigma_1^\mu \nabla_\mu^1 - \sigma_2^\mu \nabla_\mu^2 \right)
\Re(x)\, \Re_1(x)\, \Re_2(x) \,.
\tag{210}
\]

Here the labels 1, 2 on ∇_μ and ℜ point out which ∇_μ acts on which ℜ. The
operators ∇_μ figure as parameters in the integral, and, up to the next order
in ℜ, they commute. Since the parameters commute, the integral in σ_1^μ, σ_2^μ
is an ordinary Gaussian integral. Do it. The extra factor (g^{1/2}(x))²
cancels, and the result is

\[
B(s_1, s_2, s_3) \int dx\, g^{1/2}\, \exp\left( \sum_{i,k=1}^{2} b_{ik}(s_1, s_2, s_3)\, \nabla_i \nabla_k \right)
\Re(x)\, \Re_1(x)\, \Re_2(x)
\tag{211}
\]

where B(s1 , s2 , s3 ) is some function of the proper times, and the exponent is
a quadratic form in ∇1 , ∇2 with s-dependent coeﬃcients. The loop is done.
The integral
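\[
\int_0^\infty ds_1\, ds_2\, ds_3\, B(s_1, s_2, s_3)\,
\exp\left( \sum_{i,k=1}^{2} b_{ik}(s_1, s_2, s_3)\, \nabla_i \nabla_k \right)
= F(\nabla_1^2, \nabla_2^2, \nabla_1 \nabla_2)
\tag{212}
\]

is the formfactor. Integration by parts in x brings it to the arguments:

\[
F(\nabla_1^2, \nabla_2^2, \nabla_1 \nabla_2) \to F(\nabla_1^2, \nabla_2^2, \nabla_3^2) \,.
\tag{213}
\]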

The eﬀect of the calculation above is again that the loop is turned into
a tree:
[diagram: the one-loop triangle with three vertices] → [diagram: the tree with one three-point vertex] . (214)

The vertex of the tree is the formfactor F(∇_1², ∇_2², ∇_3²). This method
applies to any diagram with the heat kernels. One only needs to do Gaussian
integrals, and the result is always the exponential of a quadratic combination
of ∇'s. The formfactor is a function of the products ∇_i ∇_k.

4.7 The One-Loop Formfactors

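The result of the proper-time integrations depends essentially on the
dimension D. For D = 4, the one-loop formfactors in the effective action (165)
are as follows.
With one exception, all second-order formfactors are logs:

\[
F_1(\Box) = \frac{1}{60}\, \frac{1}{2(4\pi)^2}\, \ln(-\Box) + \text{const.} \,,
\tag{215}
\]

\[
F_2(\Box) = -\frac{1}{180}\, \frac{1}{2(4\pi)^2}\, \ln(-\Box) + \text{const.} \,,
\tag{216}
\]

\[
F_3(\Box) = \frac{1}{18}\, \frac{1}{2(4\pi)^2} \,,
\tag{217}
\]

\[
F_4(\Box) = \frac{1}{2}\, \frac{1}{2(4\pi)^2}\, \ln(-\Box) + \text{const.} \,,
\tag{218}
\]

\[
F_5(\Box) = \frac{1}{12}\, \frac{1}{2(4\pi)^2}\, \ln(-\Box) + \text{const.}
\tag{219}
\]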

Since

\[
-\ln(-\Box) = \int_0^\infty dm^2\, \frac{1}{m^2 - \Box} + \text{const.} \,,
\tag{220}
\]

these expressions have the spectral forms (169) with definite spectral weights
and indefinite additive constants (polynomials of the zeroth power).
Accordingly, the effective action contains a set of local terms with
unspecified coefficients:

\[
\Gamma = \frac{1}{2(4\pi)^2} \int dx\, g^{1/2} \Bigl[ c_1 R + c_2\, {\rm tr}\, \hat P
+ c_3 R_{\mu\nu} R^{\mu\nu} + c_4 R^2
+ c_5\, {\rm tr}(\hat P \hat P) + c_6\, {\rm tr}(\hat R_{\mu\nu} \hat R^{\mu\nu})
+ \frac{1}{18}\, R\, {\rm tr}\, \hat P \Bigr] + \text{nonlocal terms} \,.
\tag{221}
\]
The nonlocal terms are specified completely.
The third-order formfactors have no polynomial terms and indefinite
coefficients. The simplest third-order formfactor is F_1(□_1, □_2, □_3) in
(168). It has the spectral form (170), and its spectral weight
ρ_1(m_1², m_2², m_3²) is obtained as follows. Consider a triangle of three
spectral masses

[diagram: a triangle with the sides m_1, m_2, m_3] A = area of the triangle.

It can be built only if every mass is smaller than the sum of the two others.
The spectral weight ρ_1 is zero if the triangle cannot be built. Otherwise, it is
proportional to the inverse area of this triangle:

\[
\rho_1(m_1^2, m_2^2, m_3^2) = -\frac{1}{3}\, \frac{1}{2(4\pi)^2}\, \frac{1}{4\pi A}
\times \theta(m_1 + m_2 - m_3)\, \theta(m_1 + m_3 - m_2)\, \theta(m_2 + m_3 - m_1) \,.
\tag{222}
\]
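
The spectral weight (222) is simple enough to evaluate directly. The sketch below is only an illustration: it takes the masses themselves (not their squares) as arguments, computes the area A by Heron's formula, which the text does not mention but which is the standard expression for the area of a triangle with given sides, and returns zero when the triangle cannot be built. The function names are hypothetical, and the degenerate triangle with zero area is assigned zero weight by convention.

```python
import math


def triangle_area(m1, m2, m3):
    """Area of the triangle with sides m1, m2, m3 (Heron's formula); 0.0 if it cannot be built."""
    p = 0.5 * (m1 + m2 + m3)  # half-perimeter
    product = p * (p - m1) * (p - m2) * (p - m3)
    # For positive sides the product is positive exactly when all three
    # triangle inequalities hold, which reproduces the theta-functions in (222).
    return math.sqrt(product) if product > 0.0 else 0.0


def rho1(m1, m2, m3):
    """Spectral weight (222) of the formfactor F_1 as a function of the three masses."""
    area = triangle_area(m1, m2, m3)
    if area == 0.0:
        return 0.0
    return -(1.0 / 3.0) / (2.0 * (4.0 * math.pi) ** 2) / (4.0 * math.pi * area)


if __name__ == "__main__":
    print(rho1(3.0, 4.0, 5.0))  # triangle exists, nonzero weight
    print(rho1(1.0, 1.0, 3.0))  # triangle cannot be built, zero weight
```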

The remaining 28 third-order formfactors are expressed through F_1 and
are tabulated [36]. The tables contain various integral representations of the
formfactors, and their asymptotics.
The loop of the minimal second-order operator with arbitrary metric, con-
nection, and potential is called standard loop because every calculation with
it is done once, and the results can be tabulated. A calculation in any specific
model boils down to combining the standard loops and using the tables. A
number of recipes for the reduction to minimal operators can be found in [24].
Doing loops becomes a business similar to doing integrals.
The fact that some coefficients in the effective action remain unspecified
is no tragedy. The effective action is a phenomenological object intended for
obtaining the values of observables. The spectral weights are certain
phenomenological characteristics of the vacuum like the permittivity of
a medium. They are to be calculated from a more fundamental microscopic
theory. Some microscopic theory of some level is incapable of specifying some
of the coefficients. So what? Classical theory was capable of even less, and,
nevertheless, celestial mechanics has been successfully worked up.3 The only
important question is whether the lack of knowledge affects the problems that
we want to solve. This will be cleared up in the next lecture.

5 Vacuum Currents and the Eﬀect of Particle Creation

5.1 Vacuum Currents

Consider quantum electrodynamics. In this case, ϕ^a(x) is a set of the vector
connection field and the electron–positron field

\[
\text{QED:} \quad \varphi^a = A_\mu \,, \psi \,.
\tag{223}
\]

The commutator curvature is, up to a coeﬃcient, the Maxwell tensor, and the
operator ﬁeld equations are of the form

\[
\nabla^\nu R_{\nu\mu}(\hat A) + J_\mu(\hat\psi) = -J_\mu^{\rm ext}
\tag{224}
\]

where J_μ(ψ̂) is the operator electron–positron current, and J_μ^ext is an
external source. Averaging these equations over the in-vacuum state, one
obtains, according to the general derivation above, the same terms but as
functions of the mean field plus a set of loops:

\[
\nabla^\nu R_{\nu\mu}(\langle A \rangle) + J_\mu(\langle \psi \rangle)
+ [\text{one-loop diagrams: the photon loop, the electron–positron loop, and the loops with the vertices } S_{AA\psi}]
= -J_\mu^{\rm ext} \,.
\tag{225}
\]

There is another such equation, for ψ, but, since ψ has no external source, its
solution is

\[
\langle \psi \rangle = 0 \,.
\tag{226}
\]

Then, in (225), J_μ(⟨ψ⟩) vanishes, and the loops with the vertices S_AAψ vanish.
There are no such vertices in QED but, if there were, as in gravidynamics,
they would be proportional to ⟨ψ⟩ and vanish by (226). The photon loop also
vanishes because there is no vertex S_AAA either, but this is already a specific
property of QED. Only the electron–positron loop survives.
The surviving loop is a function of ⟨A⟩ and, by derivation, is the
electron–positron current averaged over the in-vacuum:

\[
[\text{electron–positron loop}] = J_\mu^{\rm vac}(\langle A \rangle)
= \langle \text{in vac} |\, J_\mu(\hat\psi)\, | \text{in vac} \rangle \,.
\tag{227}
\]
This is the vacuum current. According to (225), the observable electromagnetic
field satisfies the Maxwell equations with an addition of the vacuum current:

\[
\nabla^\nu R_{\nu\mu}(A) = -J_\mu^{\rm vac}(A) - J_\mu^{\rm ext} \,.
\tag{228}
\]

3 Remarkably, without a knowledge of string theory!

We obtain this current by varying the eﬀective action and next replacing the
Euclidean resolvents with the retarded resolvents:
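\[
J_\mu^{\rm vac}(A) = \frac{\delta \Gamma(A)}{\delta A^\mu} \bigg|_{\Box \to \Box_{\rm ret}} \,,
\tag{229}
\]

\[
\Gamma(A) = \int dx\, g^{1/2} \Bigl( R\, F(\Box)\, R + F(\Box_1, \Box_2, \Box_3)\, R_1 R_2 R_3 + \ldots \Bigr) \,.
\tag{230}
\]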

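It is completely similar if ϕ^a(x) is a set of the metric field and any matter
fields

\[
\text{GRAVITY:} \quad \varphi^a = g_{\mu\nu} \,, \psi \,.
\tag{231}
\]

The only difference is that the vertex S_ggg is nonvanishing:

\[
R_{\mu\nu}(\langle g \rangle) - \frac{1}{2}\, \langle g \rangle_{\mu\nu}\, R(\langle g \rangle)
+ [\text{graviton loop}] + [\text{matter loop}] = 8\pi T_{\mu\nu}^{\rm ext} \,,
\tag{232}
\]

\[
\langle \psi \rangle = 0 \,,
\tag{233}
\]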
and it is assumed again that the matter fields have no sources. Again, by
derivation, the matter loop is the energy–momentum tensor of the field ψ̂
averaged over the in-vacuum, but the vacuum current contains, in addition,
the graviton loop:

\[
T_{\mu\nu}^{\rm vac} = \langle \text{in vac} |\, T_{\mu\nu}(\hat\psi)\, | \text{in vac} \rangle + \text{the graviton loop.}
\tag{234}
\]

The Einstein equations are replaced by the expectation-value equations in the
in-vacuum state:

\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = 8\pi T_{\mu\nu}^{\rm vac}(g) + 8\pi T_{\mu\nu}^{\rm ext} \,.
\tag{235}
\]
Since the gravitational field couples to everything, the equation (232)
should contain loops of all matter fields in Nature. The effective actions for
all loops including the graviton loop have the same structure:
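\[
T_{\mu\nu}^{\rm vac}(g) = -\frac{2}{g^{1/2}}\, \frac{\delta \Gamma(g)}{\delta g^{\mu\nu}} \bigg|_{\Box \to \Box_{\rm ret}} \,,
\tag{236}
\]

\[
\Gamma(g) = \int dx\, g^{1/2} \Bigl( R_{..}\, F(\Box)\, R_{..} + F(\Box_1, \Box_2, \Box_3)\, R_{1..} R_{2..} R_{3..} + \ldots \Bigr) \,.
\tag{237}
\]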

Only the coefficients of the formfactors are different. To have the correct
coefficients, one would need to know the full spectrum of particles. Therefore,
in the case of gravity, the axiomatic approach is most suitable.

Now recall that the curvatures are redundant, and the eﬀective action is
in fact a functional of the conserved currents (141) and (142). Owing to this
fact, the expectation-value equations (228) and (235) close with respect to
these currents:
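\[
\nabla^\nu R_{\nu\mu} + f(\Box_{\rm ret})\, \nabla^\nu R_{\nu\mu}
+ O\bigl[ (\nabla^\nu R_{\nu\mu})^2 \bigr] = -J_\mu^{\rm ext} \,,
\tag{238}
\]

\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R
+ f_1(\Box_{\rm ret}) \Bigl( R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R \Bigr)
+ f_2(\Box_{\rm ret}) (\nabla_\mu \nabla_\nu - g_{\mu\nu} \Box) R
+ O\Bigl[ \bigl( R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R \bigr)^2 \Bigr]
= 8\pi T_{\mu\nu}^{\rm ext} \,.
\tag{239}
\]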
Of course, with respect to the mean ﬁelds, these equations are closed from
the outset but, at an intermediate stage, they are closed with respect to the
Maxwell and Einstein currents. When solved with respect to these currents,
they become literally the Maxwell and Einstein equations with some external
sources but not the original ones. To make this clear, use the fact that the
vacuum terms are proportional to the Planck constant and solve the equations
by iteration:
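\[
\nabla^\nu R_{\nu\mu} = -J_\mu^{\rm ext} + f(\Box_{\rm ret})\, J_\mu^{\rm ext}
+ O\bigl[ (J_\mu^{\rm ext})^2 \bigr] \,,
\tag{240}
\]

\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = 8\pi T_{\mu\nu}^{\rm ext}
- f_1(\Box_{\rm ret})\, 8\pi T_{\mu\nu}^{\rm ext}
+ f_2(\Box_{\rm ret}) (\nabla_\mu \nabla_\nu - g_{\mu\nu} \Box)\, 8\pi T^{\rm ext}
+ O\bigl[ (T_{\mu\nu}^{\rm ext})^2 \bigr] \,.
\tag{241}
\]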

These are the Maxwell and Einstein equations with the original sources prop-
agated in a nonlocal and nonlinear manner.
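
The iteration that leads from (238), (239) to (240), (241) can be spelled out to first order in the vacuum terms. The following is a sketch, not the author's derivation: the shorthands X_μ and G_μν and the superscripts (0), (1) are introduced here for illustration, and the trace step assumes D = 4.

```latex
% First-order iteration in the vacuum terms f, f_1, f_2 (proportional to \hbar).
% X_\mu and G_{\mu\nu} are hypothetical shorthands, not notation of the text.
\begin{aligned}
X_\mu &\equiv \nabla^\nu R_{\nu\mu} \,, \qquad
X_\mu + f(\Box_{\rm ret})\, X_\mu = -J^{\rm ext}_\mu + O[X^2] \,, \\
X^{(0)}_\mu &= -J^{\rm ext}_\mu \,, \qquad
X^{(1)}_\mu = -J^{\rm ext}_\mu - f(\Box_{\rm ret})\, X^{(0)}_\mu
            = -J^{\rm ext}_\mu + f(\Box_{\rm ret})\, J^{\rm ext}_\mu \,; \\[4pt]
G_{\mu\nu} &\equiv R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R \,, \qquad
G^{(0)}_{\mu\nu} = 8\pi T^{\rm ext}_{\mu\nu} \,, \\
g^{\mu\nu} G_{\mu\nu} &= -R \ \ (D = 4)
\quad\Longrightarrow\quad R^{(0)} = -8\pi T^{\rm ext} \,, \\
G^{(1)}_{\mu\nu} &= 8\pi T^{\rm ext}_{\mu\nu}
 - f_1(\Box_{\rm ret})\, G^{(0)}_{\mu\nu}
 - f_2(\Box_{\rm ret}) (\nabla_\mu \nabla_\nu - g_{\mu\nu} \Box)\, R^{(0)} \\
&= 8\pi T^{\rm ext}_{\mu\nu}
 - f_1(\Box_{\rm ret})\, 8\pi T^{\rm ext}_{\mu\nu}
 + f_2(\Box_{\rm ret}) (\nabla_\mu \nabla_\nu - g_{\mu\nu} \Box)\, 8\pi T^{\rm ext} \,.
\end{aligned}
```

The sign of the f_2 term flips because the trace of the zeroth-order equation gives the Ricci scalar with the opposite sign to the trace of the source.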
There is an eﬀect in these equations that drives the entire problem.

5.2 Emission of Charges

Consider again QED and suppose that the external source has a compact spa-
tial support. This source is the current of a set of electrically charged particles
moving inside a space–time tube, but, since the observable electromagnetic
ﬁeld is the expectation value, only the total current in (228) or (240) is ob-
servable:
\[
J_\mu^{\rm tot} = J_\mu^{\rm ext} + J_\mu^{\rm vac}(A) \,.
\tag{242}
\]
And the total current has a noncompact spatial support because the vacuum
contribution is nonlocal. One may calculate the ﬂux of charge through the
support tube of J ext and even through a wider tube (see Fig. 3), and it will
be nonvanishing:
\[
e_T(\Sigma_1) - e_T(\Sigma_2) = \frac{1}{4\pi} \int_{\Sigma_1}^{\Sigma_2} J_\mu^{\rm vac}\, dT^\mu \neq 0 \,.
\tag{243}
\]

Fig. 3. Support tube of J^ext and a wider tube (labels in the figure: supp J^ext, Σ_1, Σ_2)

Here e_T(Σ) is the amount of the electric charge contained inside the tube T
at a given instant Σ. The charge inside the tube is not conserved.
If, when moving away from the support of J^ext, the flux (243) falls off
rapidly, then its nonvanishing only means that the boundary of the original
source gets spread. Because of the creation of virtual pairs, this boundary can
never be located precisely. The charges of the external source immersed in
the quantum vacuum are always annihilated and created again in a slightly
different place. There is nothing to worry about. Just step aside a little.
However, one may ask if there is a flux of charge through an infinitely wide
tube:

\[
e(\Sigma_1) - e(\Sigma_2) = \frac{1}{4\pi} \int_{\Sigma_1}^{\Sigma_2} J_\mu^{\rm vac}\, dT^\mu \bigg|_{r \to \infty} \,.
\tag{244}
\]

In this equation, e(Σ) is the total amount of the electric charge in the compact
domain of space at a given instant Σ. For (244) to be nonvanishing, J_μ^vac
should behave as

\[
J_\mu^{\rm vac} = O\!\left( \frac{1}{r^2} \right) , \qquad r \to \infty \,,
\tag{245}
\]

\[
r \propto \sqrt{\text{area of } S}
\tag{246}
\]

where S is the intersection of T with Σ (Fig. 3). In this case, it would turn
out that the charge disappears, i.e., our source is emitting charge. But even
this may not be a point of concern if the current in (244) oscillates with time,
and the oscillations sum to zero for a suﬃciently long period between Σ1 and
Σ2 . The expectation values have uncertainties, and these oscillations are a
quantum noise. Just do not measure (244) too often.
However, one may ask if the charge emitted for the entire history

\[
e(-\infty) - e(+\infty) = \frac{1}{4\pi} \int_{\Sigma \to -\infty}^{\Sigma \to +\infty} J_\mu^{\rm vac}\, dT^\mu \bigg|_{r \to \infty}
\tag{247}
\]

is nonvanishing. There will always be oscillations in the current, but they
may not sum to zero. Since, as r → ∞, all fields fall off, there are, in this
limit, the asymptotic Killing vectors corresponding to all the symmetries of
flat and empty space–time. Therefore, one may ask the same questions about
the emission of energy and any other charges. Thus the quantity

\[
M(-\infty) - M(+\infty) = \int_{\Sigma \to -\infty}^{\Sigma \to +\infty} T_{\mu\nu}^{\rm vac}\, \xi^\nu\, dT^\mu \bigg|_{r \to \infty}
\tag{248}
\]

with ξ ν the asymptotic time-like Killing vector is the energy emitted by the
source for the entire history.
If the total emitted charges are nonvanishing, then this is the real effect,
and then the question emerges: What are the carriers of these charges? There
should be some real agents carrying them away. But the particles of the orig-
inal source stay in the tube. Besides them, there is only the electron–positron
field, but it is in the in-vacuum state. This means that, at least initially, there
are neither electrons nor positrons. There remains to be assumed a miracle:
that either the real electrons or the real positrons – depending on the sign of
the emitted charge – get created. Then they are created by pairs, and, say, the
created positron is emitted while the created electron stays in the compact
domain.
This crazy guess can be checked. We have two ways of calculating the
vacuum currents: through the effective action and by a direct averaging of
the operator currents as in (227) and (234). Specifically, for the in-vacuum of
electrons and positrons we have
\[
T_{\mu\nu}^{\rm vac} = \langle \text{in vac} |\, T_{\mu\nu}(\hat\psi)\, | \text{in vac} \rangle
\tag{249}
\]

where T_μν(ψ̂) is the operator energy–momentum tensor of the electron–positron
field ψ̂. The equation for ψ̂

\[
\left( \gamma^\mu \partial_\mu + \mu - i q\, \gamma^\mu A_\mu \right) \hat\psi = 0
\tag{250}
\]

contains the electromagnetic field which in (249) figures as an external field
but is in fact the mean field solving the expectation-value equations. We know
that, in the past, all mean fields are static. In the future, they become static
again because, if the total emitted charges are finite, then all the processes
should die down. Thus, there are two asymptotically static regions: in the past
and in the future. The carriers of the emitted charges should be detectable in
the future as particles with definite energies. But then the state in which they
are absent is the out-vacuum, whereas their quantum state is the in-vacuum.
It may be the case that the in-vacuum contains the out-particles. This will be
the case if, between the static regions in the past and future, there is a region
where A is nonstatic because then the basis functions of the Fock modes
that are the eigenfunctions of the energy operator in the future and the basis
functions that are such in the past are different solutions of the Dirac equation
(250).
If we expand ψ̂ in the basis solutions of the out-particles, insert this ex-
pansion in (249), and then insert (249) in (248), the result will be
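\[
M(-\infty) - M(+\infty) = \Bigl\langle \text{in vac} \Bigl| \sum_A \varepsilon_A\, \hat a^{+}_{{\rm out}\, A}\, \hat a_{\rm out}^{A} \Bigr| \text{in vac} \Bigr\rangle
\tag{251}
\]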

where εA is the energy of the out-mode A, and similarly for the other charges.
This result needs no comments. Miracles happen.

5.3 Emission of Charges (Continued)

An important point concerning miracles is that they do not always happen. Let
us see what is needed for this particular miracle to happen. For that, it is
necessary to introduce characteristic parameters of the problem. There are
two sets of parameters.
Parameters of the quantum ﬁeld: q, μ.
Parameters of the external source: e, l, ν.
Here, q and μ are the charge and mass of the vacuum particles (e.g., of the
electrons and positrons), e is the charge of the external source, l is the char-
acteristic width of its support tube, and ν is the frequency parameter that
characterizes the nonstationarity of the source.
The vacuum current in (240) is of the form
\[
J^{\rm vac} = \int_0^\infty dm^2\, \rho(m^2)\, \frac{1}{m^2 - \Box_{\rm ret}}\, J^{\rm ext}
+ O\bigl[ (J^{\rm ext})^2 \bigr] \,.
\tag{252}
\]

Here and above, the notation □_ret is to record that the resolvent is to be
taken retarded. The structure of the nonlinear terms in (252) is similar: There
is an overall resolvent acting on a function quadratic in J^ext (see (121)). If
the vacuum particles are massive, the spectral weight will be proportional to
the θ-function:

\[
\rho(m^2) \propto \theta(m^2 - 4\mu^2)
\tag{253}
\]

to tell us that there is a threshold of pair creation. We need to find the
behaviour of J^vac at a large distance from the support of J^ext:

\[
J^{\rm vac} \Big|_{r \gg l} = \, ?
\tag{254}
\]

First we need to calculate the action of the retarded resolvent on a source
J^ext having a compact spatial support. If J^ext is static, the result is

\[
\frac{1}{m^2 - \Box_{\rm ret}}\, J^{\rm ext} \bigg|_{r \gg l} = \frac{C}{r}\, \exp(-m r) \,, \qquad J^{\rm ext} \text{ static.}
\tag{255}
\]

At a large distance from the source, this is the Yukawa potential. Because
the function (255) is static, it does not depend on the spacetime direction in
which the limit r ≫ l is taken. If J^ext is nonstatic, this is no more the case.
The limit r ≫ l is direction-dependent, and there are directions in which the
decrease is slower. Namely, in the directions of the outgoing light rays,

\[
\frac{1}{m^2 - \Box_{\rm ret}}\, J^{\rm ext} \bigg|_{r \gg l} = \frac{C}{r}\, \exp\left( -m \sqrt{r U} \right) , \qquad J^{\rm ext} \text{ nonstatic}
\tag{256}
\]

where U is a function of time4 whose order of magnitude is

\[
U \sim \frac{1}{\nu} \,.
\tag{257}
\]
Expression (256) is to be inserted in the spectral integral (252), and, since
the spectrum is cut off from below, we find that the vacuum current is
suppressed by the factor

\[
J^{\rm vac} \sim \exp\left( -\frac{\mu \sqrt{r}}{\sqrt{\nu}} \right) , \qquad r \gg l \,.
\tag{258}
\]

This is what constrains miracles. However, we find also that the suppressing
factor depends on the frequency of the source and can be removed by raising
the frequency. The farther from the support of J^ext, the greater the frequency
should be for the current to be noticeable. The pair creation starts as soon as
the energy ħν exceeds the threshold

\[
\hbar\nu > 2\mu c^2
\tag{259}
\]

but, for the source to emit charge, the frequency should be even greater:

\[
\hbar\nu > (\mu c^2) \left( \frac{\mu c}{\hbar} \right) l \,.
\tag{260}
\]

4 Of the retarded time since the surfaces Σ to which the outgoing light rays
belong are null.
This is easy to understand. The particles start being created in the support
of the source with small momenta and cannot go far away. The extra factor
(μc/ħ)l in (260) may be interpreted as the number of created particles for
which there is room in the support of the source. If the creation is more
violent, the particles get out of the tube. This is the meaning of condition
(260). The mechanism of emission and conservation of charge is illustrated
in Fig. 4. There are initially the charges of the external source in its support
tube. They repel the like particles of the created pairs and, when the number
of the latter exceeds (μc/ħ)l, push them out of the tube. The unlike particles
stay in the tube and diminish its charge.
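
To get a feeling for the two thresholds, conditions (259) and (260), read as ħν > 2μc² and ħν > (μc²)(μc/ħ)l, can be evaluated for the electron. The sketch below is only an illustration: the SI constants are the CODATA values, the tube width l = 1e-10 m is an arbitrary assumption made here, and the function names are hypothetical.

```python
# Physical constants in SI units (CODATA values)
HBAR = 1.054571817e-34            # reduced Planck constant, J s
C = 299792458.0                   # speed of light, m / s
ELECTRON_MASS = 9.1093837015e-31  # kg
EV = 1.602176634e-19              # one electronvolt in joules


def pair_creation_threshold(mass):
    """Energy (joules) at which pair creation starts, condition (259): 2 mu c^2."""
    return 2.0 * mass * C ** 2


def particles_that_fit(mass, width):
    """The factor (mu c / hbar) l of (260): tube width in reduced Compton wavelengths."""
    return mass * C / HBAR * width


def emission_threshold(mass, width):
    """Energy (joules) above which the source emits charge, condition (260)."""
    return mass * C ** 2 * particles_that_fit(mass, width)


if __name__ == "__main__":
    width = 1.0e-10  # assumed tube width in metres (illustrative only)
    print("pair creation threshold, eV:", pair_creation_threshold(ELECTRON_MASS) / EV)
    print("factor (mu c / hbar) l:     ", particles_that_fit(ELECTRON_MASS, width))
    print("emission threshold, eV:     ", emission_threshold(ELECTRON_MASS, width))
```

Condition (260) is the stronger of the two only when the tube is wider than two reduced Compton wavelengths of the vacuum particle, which is the case for the width assumed here.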
Since the cause of the vacuum instability is the nonstationarity of the
external source, it is interesting to consider the case where the energy ħν
exceeds overwhelmingly all the other energy parameters of the problem. One
can then study the strong effect of particle production. It is assumed, in
particular, that ħν exceeds both the rest energy of the vacuum particle and
its Coulomb energy in the external field:

\[
\hbar\nu \gg \mu c^2 \,,
\tag{261}
\]

\[
\hbar\nu \gg \frac{q e}{l} \,.
\tag{262}
\]

In the limit (261), the flux of charge at a given distance from the source ceases
depending on the mass μ, and the vacuum particles can be considered as
massless. Condition (262) enables one to get rid of the consideration of the
static vacuum polarization which is irrelevant to the problem. The
approximation (261) and (262) is called the high-frequency approximation.

Fig. 4. Mechanism of emission and conservation of charge
The effective action has been calculated above as an expansion in powers
of the curvature, but the conditions of validity of this expansion have not
been discussed. This gap can now be filled. It is the high-frequency
approximation in which this expansion is valid. Indeed, consider the series
(230). Every next term in this series contains an extra power of R, and, by
dimension, its formfactor contains an extra power of □⁻¹. The commutator
curvature is proportional to the charges and to ħ⁻¹:

\[
R \sim \frac{q e}{\hbar\, l^2} \,.
\tag{263}
\]

In the limit r ≫ l along the outgoing light rays, the operator □ contains one
time derivative:

\[
\Box \sim \frac{\nu}{l} \,.
\tag{264}
\]

As a result, every next term of the series contains, as compared to the previous
one, the extra factor

\[
\frac{q e}{\hbar\nu\, l} \ll 1 \,.
\tag{265}
\]

In addition, the formfactors in (230) can be calculated in the massless limit,
as has been done above.
However, the inquest of miracles is not yet completed. Assuming that the
vacuum particles are massless or that the high-frequency regime holds, we
get rid of the suppressing exponential in (258), but we still need to check the
power of decrease of the current. The power should be the one in (245) for
the emission of charge to occur. We can readily check this since we know the
behaviour of the resolvent. Expression (256) is again to be inserted in the
spectral integral (252), but this time assuming that the spectrum begins with
zero mass,
$$J^{\rm vac}\Big|_{r\gg l} = \frac{C}{r}\int_0^\infty dm^2\,\rho(m^2)\,\exp\!\left(-m\sqrt{rU}\right) . \tag{266}$$

We see that, for the current to decrease as O(1/r²), the spectral weight should
have a finite and nonvanishing limit at zero mass:
$$\rho(0) = \text{finite} \neq 0 . \tag{267}$$
For the respective formfactor, this is a condition on its behaviour at small □.
The behaviour should be
$$F(\Box) = \int_0^\infty dm^2\,\frac{\rho(m^2)}{m^2-\Box} \;\xrightarrow[\Box\to 0]{}\; -\rho(0)\ln(-\Box) . \tag{268}$$
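As a quick plausibility check of (267) and (268), the two integrals can be estimated under simplifying assumptions that are not part of the text: the spectral weight is replaced by its zero-mass value ρ(0) in the region that dominates, s is shorthand for the coefficient of m in the exponent of (266) (so s² grows linearly with r), and M² is a hypothetical cutoff below which ρ(m²) ≈ ρ(0). This is only a sketch of the scaling, not the full calculation.

```latex
% Assumption: at large r only small masses contribute, so \rho(m^2) \approx \rho(0).
% s is shorthand for the coefficient of m in the exponent of (266), s^2 \propto r.
\[
\int_0^\infty dm^2\,\rho(0)\,e^{-ms}
  = 2\rho(0)\int_0^\infty dm\,m\,e^{-ms}
  = \frac{2\rho(0)}{s^2}
\quad\Longrightarrow\quad
J^{\rm vac}\Big|_{r\gg l} \approx \frac{C}{r}\,\frac{2\rho(0)}{s^2}
  = O\!\left(\frac{1}{r^2}\right).
\]
% Assumption: \rho(m^2) \approx \rho(0) for m^2 < M^2, with M^2 a hypothetical cutoff.
\[
\int_0^{M^2} dm^2\,\frac{\rho(0)}{m^2-\Box}
  = \rho(0)\left[\ln(M^2-\Box)-\ln(-\Box)\right]
  \;\xrightarrow[\Box\to 0]{}\; -\rho(0)\ln(-\Box) + \rho(0)\ln M^2 .
\]
```

The first line shows why a finite, nonvanishing ρ(0) yields exactly the 1/r² falloff; the second shows that the same ρ(0) is the coefficient of the logarithm, with the cutoff entering only through a constant.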
We arrive at the following consistency condition on the vacuum formfactors.
In the limit where one (any) of the arguments is small and the others
are fixed, the formfactors should not grow faster than ln(−□):
$$F(\Box)\Big|_{\Box\to 0} = \text{const.}\,\ln(-\Box) , \tag{269}$$
$$F(\Box_1,\Box_2,\Box_3)\Big|_{\Box_1\to 0} = f(\Box_2,\Box_3)\,\ln(-\Box_1) , \tag{270}$$
$$\dots$$
If they grow faster, the charges cannot be maintained finite, i.e., an isolated
system cannot exist in such a vacuum. If they grow as ln(−□), the theory of
isolated systems is consistent, but these systems emit charges. If they grow
slower, the charges are conserved.
One can check whether the one-loop formfactors satisfy this consistency
condition. The second-order formfactors (215)–(219) do. The third-order formfactors
behave generally as [35]
$$F(\Box_1,\Box_2,\Box_3)\Big|_{\Box_1\to 0} = \frac{1}{\Box_1}\,f(\Box_2,\Box_3) + g(\Box_2,\Box_3)\,\ln(-\Box_1) + \dots . \tag{271}$$
The alarming terms 1/□ appear only in the arguments acting on the gravitational
curvatures. Therefore, they can affect only the vacuum energy–momentum
tensor, and it has been checked that, in the energy–momentum
tensor, these terms coming from different formfactors cancel. In the currents,
the one-loop formfactors satisfy strictly the consistency condition. Since, in
addition, their asymptotic ln(−□) terms are nonvanishing, the emission of
charges in the high-frequency regime is real. The only thing that remains to
be checked is that this emission is not a pure quantum noise. It will be checked
by a direct calculation.
Now one can answer also the question about the indefinite local terms
in the effective action. The coefficients of these terms are the unspecified
constants in (215)–(219). In the limit □ → 0, the values of these constants
are immaterial. Only the terms ln(−□), □ → 0, of the formfactors work, and,
therefore, the incompleteness of local quantum field theory does not affect the
presently considered problem.
It will be noted that there are now two mechanisms by which an isolated
system can emit energy. One is purely classical: a nonstationary source can
emit the electromagnetic or gravitational waves. The other is quantum: im-
mersed in the vacuum, a nonstationary source can emit also charged particles.
A high-frequency source will generally emit both.

5.4 Particle Creation by External Fields

The problem of particle creation by external fields is a part of the
expectation-value problem. In the context of the foregoing, it can be set as
follows. Consider the quantum field that satisfies a linear second-order equation
$$\left(g^{\mu\nu}\nabla_\mu\nabla_\nu\hat 1 + \hat P\right)\phi = 0 \tag{272}$$
containing three external fields: the metric, the connection, and the potential.
The external fields are asymptotically static in the past and future but
otherwise arbitrary except that their currents
$$J_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2}\,g_{\alpha\beta}R , \tag{273}$$
$$\hat J_\alpha = \nabla^\beta \hat R_{\alpha\beta} , \tag{274}$$
$$\hat Q = \hat P + \frac{1}{6}\,R\,\hat 1 \tag{275}$$
are confined to a space-time tube. The quantum field is in the in-vacuum
state. What is the energy of the quanta of the field φ created by the external
fields for the entire history? In the high-frequency approximation, we have
everything to answer this question.
To formulate the answer, I need some preliminary construction. Every
current has an associated quantity called its radiation moment. It will now be
defined.
Consider a time-like geodesic in the external metric of equation (272). It
enters the domain of nonstationarity of external fields with a definite energy
and goes out of this domain with a definite energy. Let E be its energy per
unit rest mass on going out. I am only interested in the geodesics that escape
to r = ∞. They have E > 1, and, instead of E, I shall use the parameter γ
defined as
$$\gamma = \frac{\sqrt{E^2-1}}{E} , \qquad E>1 , \qquad 0<\gamma<1 . \tag{276}$$
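The definition (276) can be made concrete with a small numerical sketch. It relies on one standard fact that the text does not spell out: at infinity, where the metric is flat, the energy per unit rest mass is the Lorentz factor, so γ coincides with the asymptotic speed in units of c. The function names below are hypothetical helpers, not notation from the text.

```python
import math


def gamma_from_energy(E):
    """Parameter gamma of (276) for a geodesic escaping to infinity.

    E is the energy per unit rest mass on leaving the domain of
    nonstationarity; only E > 1 corresponds to escape.
    """
    if E <= 1.0:
        raise ValueError("escaping geodesics require E > 1")
    return math.sqrt(E * E - 1.0) / E


def energy_from_gamma(gamma):
    """Inverse of (276): E = 1 / sqrt(1 - gamma^2), valid for 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return 1.0 / math.sqrt(1.0 - gamma * gamma)


if __name__ == "__main__":
    for E in (1.001, 1.25, 10.0, 1000.0):
        print(E, gamma_from_energy(E))
```

The printed values approach 0 for E close to 1 (slow, nonrelativistic escape) and approach 1 for large E, which is the null limit discussed for the radiation moments.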
E
At r = ∞, the geodesic has a certain spatial direction, or, equivalently, it
comes to a certain point of the celestial 2-sphere. I shall denote this sphere as
S, its points as θ:
$$\theta = (\theta^1,\theta^2) , \qquad \theta\in S , \tag{277}$$
and the integral over the unit 2-sphere as

$$\int d^2S(\theta)\,(\cdots) . \tag{278}$$

A geodesic with given γ and θ will be called γ, θ-geodesic (see Fig. 5).
A γ, θ-geodesic can be emitted from every point of a compact domain.
Therefore, the γ, θ-geodesics with the same values of γ and θ make a congru-
ence, and it can be proved that this congruence is hypersurface-orthogonal.
Let the orthogonal hypersurfaces be

$$T_{\gamma\theta}(x) = \text{const.} \tag{279}$$

Since the parameters γ, θ fix the congruence, they fix also the family of the
orthogonal hypersurfaces (279), and the “const.” in (279) fixes a member of the
family. The function Tγθ is determined up to a transformation Tγθ → f (Tγθ ).
This arbitrariness will be removed by the normalization condition
$$(\nabla T_{\gamma\theta})^2 = -(1-\gamma^2) \tag{280}$$
and the condition that the vector ∇Tγθ is past directed. It is a property of
the geodetic congruences that the norm in (280) can be chosen constant.
The radiation moment of any scalar current J is the following hypersurface
integral:
$$D = \frac{1}{4\pi}\int dx\,g^{1/2}\,\delta\big(T_{\gamma\theta}(x)-\tau\big)\,J(x) . \tag{281}$$
If the current is not a scalar, it should first be parallel transported from the
integration point to r = ∞ along the respective γ, θ-geodesic. Thus if the
current is a vector, its radiation moment is
$$D^\alpha = \frac{1}{4\pi}\int dx\,g^{1/2}\,\delta\big(T_{\gamma\theta}(x)-\tau\big)\,J^\beta(x)\,a_\beta{}^\alpha(x,\infty) \tag{282}$$
where $a_\beta{}^\alpha(x,\infty)$ is the propagator of parallel transport of vectors to infinity
along the γ, θ-geodesic emanating from x. The radiation moment $D^\alpha$ is then
a vector at infinity. In the same way, the radiation moment is defined for any
current. For the three currents (273)–(275), the radiation moments will be
denoted respectively as
$$J^{\alpha\beta} ,\ \hat J^\alpha ,\ \hat Q \;\longrightarrow\; D^{\alpha\beta} ,\ \hat D^\alpha ,\ \hat D . \tag{283}$$

Fig. 5. A γ, θ-geodesic (figure labels: r → ∞; parameters γ, θ; domain of nonstationarity)

Since the indices of the radiation moments pertain to a point at infinity, their
contractions like
$$\hat D_\alpha \hat D^\alpha = g_{\alpha\beta}\,\hat D^\alpha \hat D^\beta , \quad \text{etc.,} \tag{284}$$
always assume the flat metric $g_{\alpha\beta}$ at infinity. All radiation moments are
functions of four parameters:
$$D = D(\gamma,\theta,\tau) . \tag{285}$$
In the limit γ = 1, the γ, θ-geodesics become null. The orthogonal hyper-
surfaces (279) also become null, and the geodesics themselves become their
generators. For the radiation moments, this is a regular limit. Nothing special
happens to them in this limit except that they become very important. The
radiation moments at γ = 1 govern the emission of waves in classical theory.
Thus if Jα in (274) is an electric current, then the following expression:

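$$\Big[M(-\infty)-M(+\infty)\Big]_{\text{electromagnetic waves}}
= \frac{1}{4\pi}\int_{-\infty}^{\infty} d\tau \int d^2S(\theta)\; g_{\alpha\beta}\left(\frac{d}{d\tau}D^\alpha\right)\left(\frac{d}{d\tau}D^\beta\right)\Bigg|_{\gamma=1} \tag{286}$$

is the energy of the electromagnetic waves emitted by this current for the
entire history. A similar expression with the tensor current (273):

$$\Big[M(-\infty)-M(+\infty)\Big]_{\text{gravitational waves}}
= \frac{1}{4\pi}\int_{-\infty}^{\infty} d\tau \int d^2S(\theta)\; \frac{1}{2}\left(g_{\alpha\mu}g_{\beta\nu}-\frac{1}{2}\,g_{\alpha\beta}g_{\mu\nu}\right)\left(\frac{d}{d\tau}D^{\alpha\beta}\right)\left(\frac{d}{d\tau}D^{\mu\nu}\right)\Bigg|_{\gamma=1} \tag{287}$$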

is the energy of the gravitational waves emitted by the current Jαβ for the
entire history.
The radiation moment is a generating function for the multipole moments.
The multipole expansion is the expansion of D at γ = 0. It makes sense for
nonrelativistic systems since γ is proportional to 1/c.
Expressions (286) and (287) are the solutions of the classical radiation
problem. And here is the solution of the quantum radiation problem [50]:

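$$\Big[M(-\infty)-M(+\infty)\Big]_{\text{created particles}}
= \frac{1}{(4\pi)^2}\int_0^1 d\gamma\,\gamma^2 \int_{-\infty}^{\infty} d\tau \int d^2S(\theta)\;\mathrm{tr}\Bigg[\left(\frac{d^2}{d\tau^2}\hat D\right)^2
- \frac{1}{3}\,\frac{1}{(1-\gamma^2)}\,g_{\alpha\beta}\left(\frac{d}{d\tau}\hat D^\alpha\right)\left(\frac{d}{d\tau}\hat D^\beta\right)
+ \frac{1}{30}\,\hat 1\left(g_{\alpha\mu}g_{\beta\nu}-\frac{1}{3}\,g_{\alpha\beta}g_{\mu\nu}\right)\left(\frac{d^2}{d\tau^2}D^{\alpha\beta}\right)\left(\frac{d^2}{d\tau^2}D^{\mu\nu}\right)\Bigg] . \tag{288}$$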

This is the energy of the quanta of the field φ created by the external fields
for the entire history. As compared to the expressions above, there is an extra
time derivative in the case of the tensor and scalar moments. It accounts for
the dimension of the coupling constant. Also, instead of setting γ = 1, one
needs to integrate over γ. Otherwise, the similarity is striking. The quantum
problem of particle creation becomes almost the same thing as the classical
problem of emission of waves.
The presence in (288) of an integral over γ is not just a technical detail. The
radiation moments have both the longitudinal projections, i.e., the projections
on the direction of the geodesic at infinity and the transverse projections.
Inspecting the contractions of the moments in (286)–(288), one can see that,
at γ = 1, the longitudinal projections drop out of these contractions. In the
integral over γ, also the longitudinal projections survive. Owing to this fact,
spherically symmetric sources cannot emit waves but can produce particles
from the vacuum.
Now I can explain why, when expanding the effective action, I stopped at
the terms cubic in the curvature. In the high-frequency approximation, the
expansion (165) needs to be calculated up to the lowest-order terms that give
a nonvanishing effect. The terms of first order in the curvature are local and
give no effect. The terms of second order in the curvature are nonlocal and
contribute to the energy flux at infinity, but it turns out that their contribution
is a pure quantum noise. The real effect of particle production begins with
the third order in the curvature. Expression (288) results from the triangular
loop diagrams.
Since varying the action destroys one curvature, a cubic action generates
a quadratic current. This gives the radiation energy a chance to be positive
definite. Expression (288) is positive definite indeed:

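$$\Big[M(-\infty)-M(+\infty)\Big]_{\text{created particles}} \ge 0 . \tag{289}$$

In particular, for the matrix contributions, this follows from relations (136),
(137) and the positive definiteness of the matrix $\omega_{ab}$:

$$\mathrm{tr}\left(\frac{d^2}{d\tau^2}\hat D\right)^2 \ge 0 , \qquad \mathrm{tr}\; g_{\alpha\beta}\left(\frac{d}{d\tau}\hat D^\alpha\right)\left(\frac{d}{d\tau}\hat D^\beta\right) \le 0 . \tag{290}$$

The positivity of the gravitational-field contribution can be proved
directly.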

5.5 The Backreaction Problem

The energy emitted by an isolated system (in all forms) should be bounded
both from below and from above: it should be positive and less than the energy
stored in the initial state:

$$0 \le M(-\infty)-M(+\infty) \le M(-\infty) . \tag{291}$$

In expression (288), the positivity is guaranteed, but the energy conservation
is not. The reason is that the setting of the problem with external fields is
physically inconsistent. The vacuum current determines the solution of the
mean-field equations, and the mean field rather than the external field
determines the vacuum current. If the backreaction of the vacuum is neglected, the
conservation laws need not be observed.
One case in which the vacuum backreaction may not be neglected is where
both mechanisms of the energy emission, classical and quantum, are engaged
simultaneously. This concerns particularly the vector connection field. In
expression (288), the integral over γ has a pole (1 − γ)⁻¹ in the term with the
vector moment. The residue of the integrand in this pole is precisely the quan-
tity (286), i.e., the energy of the outgoing waves of the vector connection field.
If it is nonvanishing, e.g., if the external source emits both the electromag-
netic waves and the electrically charged particles, the integral in γ diverges.
The result is a disaster: The radiation energy appears to be infinite. In fact it
should be taken into account that the created charge affects the generation of
the electromagnetic waves, and the respective changes in the electromagnetic
field affect the creation of charge. In the self-consistent solution, the disaster
is removed.
Another example concerns the metric field when it has an event horizon.
In this case, the integral in τ diverges at the upper limit. By construction,
τ is the time of an external observer. As τ → ∞, the source moving in the
tube hits the event horizon. Its proper time does not turn into infinity. The
integrand in (288) is just finite in this limit, and the integral in τ diverges
linearly. This is the Hawking constant flux of radiation from the black hole. If
its backreaction on the metric is neglected, the total emitted energy is infinite.
But even when the quantity (288) is finite, it depends on the frequency of
the source. If the source is external, this frequency is a free parameter. The
energy of created quanta grows with frequency, and, typically, the ratio
$$\frac{M(-\infty)-M(+\infty)}{M(-\infty)}\Bigg|_{\nu\to\infty} \sim \ln\nu \tag{292}$$
also grows so that, at a sufficiently high frequency, the energy conservation
law will be violated. The backreaction should take into account that, when the
source creates real particles, it loses energy and slows down. It then creates
fewer particles, and the process dies away. The conservation laws will then be
restored.
The backreaction problem has been solved only in a few cases [51]–[56].
The examples for which it has been solved show that the solution can be
unexpected and interesting.

References
1. B. DeWitt: The Global Approach to Quantum Field Theory, vols 1,2 (Oxford
University Press, Oxford, New York, 2003)
2. B.S. DeWitt: Dynamical theory of groups and fields. In: Relativity, Groups and
Topology. 1963 Les Houches Lectures, ed. by C. DeWitt, B.S. DeWitt (Gordon
and Breach, New York, 1964) pp. 587–820
3. G. Jona-Lasinio: Nuovo Cimento 34, 1790 (1964)
4. B.S. DeWitt: Phys. Rep. 19, 295 (1975)
5. E.S. Fradkin, G.A. Vilkovisky: Lett. Nuovo Cimento 19, 47 (1977)
6. J. Schwinger: Field theory methods in non-field theory contexts. In: Proc.
1960 Brandeis Summer School (Brandeis University Press, Brandeis, 1960) pp.
282–285
7. J. Schwinger: J. Math. Phys. 2, 407 (1961)
8. L.V. Keldysh: Zh. Eksp. Teor. Fiz. 47, 1515 (1964)
9. Yu.A. Golfand: Yad. Fiz. 8, 600 (1968)
10. P. Hajicek: Time-loop formalism in quantum field theory. In: Proc. 2nd Marcel
Grossmann Meeting on General Relativity (Trieste, 1979), ed. by R. Ruffini
(North Holland, Amsterdam, 1982) pp. 483–491
11. E.S. Fradkin, D.M. Gitman: Fortschr. der Phys. 29, 381 (1981)
12. J.L. Buchbinder, E.S. Fradkin, D.M.Gitman: Fortschr. der Phys. 29, 187 (1981)
13. R.D. Jordan: Phys. Rev. D 33, 44 (1986)
14. E. Calzetta, B.L. Hu: Phys. Rev. D 35, 495 (1987)
15. A.O. Barvinsky, G.A. Vilkovisky: Nucl. Phys. B 282, 163 (1987)
16. R.C. Hwa, V.L. Teplitz: Homology and Feynman Integrals (Benjamin, New
York Amsterdam, 1966)
17. G.A. Vilkovisky: Class. Quantum Grav. 9, 895 (1992)
18. J.S. Schwinger: Phys. Rev. 82, 664 (1951)
19. J.L. Synge: Relativity: The General Theory (North Holland, Amsterdam, 1960)
20. G. ’t Hooft, M. Veltman: Ann. Inst. Henri Poincare XX, 69 (1974)
21. P.B. Gilkey: J. Diff. Geom. 10, 601 (1975)
22. L.S. Brown: Phys. Rev. D 15, 1469 (1977)
23. L.S. Brown, J.P. Cassidy: Phys. Rev. D 15, 2810 (1977)
24. A.O. Barvinsky, G.A. Vilkovisky: Phys. Rep. 119, 1 (1985)
25. G.A. Vilkovisky: Heat kernel: rencontre entre physiciens et mathématiciens.
In: R.C.P. 25, vol. 43 (Publication de l’Institut de Recherche Mathématique
Avancée, Strasbourg, 1992) pp. 203–224
26. A.M. Polyakov: Phys. Lett. B 103, 207 (1981)
27. G.A. Vilkovisky: The Gospel according to DeWitt. In: Quantum Theory of
Gravity, ed. by S.M. Christensen (Hilger, Bristol 1984) pp 169–209
28. A.A. Ostrovsky, G.A. Vilkovisky: J. Math. Phys. 29, 702 (1988)
29. I.G. Avramidi: Yad. Fiz. 49, 1185 (1989)
30. A.O. Barvinsky, G.A. Vilkovisky: Nucl. Phys. B 333, 471 (1990)
31. A.O. Barvinsky, G.A. Vilkovisky: Nucl. Phys. B 333, 512 (1990)
32. A.O. Barvinsky, Yu.V. Gusev, G.A. Vilkovisky, V.V. Zhytnikov: J. Math. Phys.
35, 3525 (1994)
33. A.O. Barvinsky, Yu.V. Gusev, G.A. Vilkovisky, V.V. Zhytnikov: J. Math. Phys.
35, 3543 (1994)
34. A.O. Barvinsky, Yu.V. Gusev, G.A. Vilkovisky, V.V. Zhytnikov: Nucl. Phys. B
439, 561 (1995)
35. A.O. Barvinsky, Yu.V. Gusev, V.V. Zhytnikov, G.A. Vilkovisky: Class. Quantum
Grav. 12, 2157 (1995)
36. A.O. Barvinsky, Yu.V. Gusev, G.A. Vilkovisky, V.V. Zhytnikov: Covariant
perturbation theory (IV). Third order in the curvature. Report, University of
Manitoba, Winnipeg (1993) pp. 1–192
37. A.G. Mirzabekian, G.A. Vilkovisky, V.V. Zhytnikov: Phys. Lett. B 369, 215
(1996)
38. Y. Nambu: Phys. Rev. 100, 394 (1955)
39. N. Nakanishi: Prog. Theor. Phys. 24, 1275 (1960)
40. N. Nakanishi: Graph Theory and Feynman Integrals (Gordon and Breach, New
York, 1970)
41. J. Schwinger: Particles, Sources, and Fields, vol. 2 (Addison-Wesley, Reading,
1973)
42. A.A. Grib, S.G. Mamayev, V.M. Mostepanenko: Quantum Effects in Intense
External Fields (Atomizdat, Moscow, 1980)
43. N.D. Birrell, P.C.W. Davies: Quantum Fields in Curved Space (Cambridge Uni-
versity Press, Cambridge, 1982)
44. N.M.J. Woodhouse: Phys. Rev. Lett. 36, 999 (1976)
45. A.G. Mirzabekian, G.A. Vilkovisky: Phys. Lett. B 317, 517 (1993)
46. A.G. Mirzabekian: Zh. Eksp. Teor. Fiz. 106, 5 (1994) [Engl. trans.: JETP 79,
1 (1994)]
47. A.G. Mirzabekian, G.A. Vilkovisky: Phys. Rev. Lett. 75, 3974 (1995)
48. A.G. Mirzabekian, G.A. Vilkovisky: Class. Quantum Grav. 12, 2173 (1995)
49. A.G. Mirzabekian, G.A. Vilkovisky: Phys. Lett. B 414, 123 (1997)
50. A.G. Mirzabekian, G.A. Vilkovisky: Ann. Phys. 270, 391 (1998)
51. G.A. Vilkovisky: Phys. Rev. D 60, 065012 (1999)
52. G.A. Vilkovisky: Phys. Rev. Lett. 83, 2297 (1999)
53. R. Pettorino, G.A. Vilkovisky: Ann. Phys. 292, 107 (2001)
54. G.A. Vilkovisky: Ann. Phys. 321, 2717 (2006)
55. G.A. Vilkovisky: Phys. Lett. B 634, 456 (2006)
56. G.A. Vilkovisky: Phys. Lett. B 638, 523 (2006)
Part VIII

String Cosmology
Dilaton Cosmology and Phenomenology

M. Gasperini

Dipartimento di Fisica, Università di Bari, Via G. Amendola 173, 70126 Bari,
Italy, and Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy

Abstract. This paper is dedicated to Gabriele Veneziano on his 65th birthday.

Most of the results reported here are known results, due to Gabriele, or obtained
in collaboration with him, or inspired by our joint work on string cosmology. A
few new results are also presented concerning the duality invariance of a non-local
dilaton coupling to the matter sources, and its possible cosmological applications in
the context of the dark energy scenario.

Foreword and Introduction

My collaboration with Gabriele Veneziano has continued, almost uninterruptedly,
for more than 15 years (even now we are preparing a joint contribution
to the book Beyond the Big Bang, which will be published by Springer, as
this book). Our first meeting dates back to 1989, when Gabriele came to the
University of Turin to give a series of talks and seminars. At that time I was
working there as a young researcher at the Department of Theoretical Physics,
and I remember that Sergio Fubini (professor at the same Department) in-
troduced me to Gabriele before a seminar. After the seminar we went to my
office, and we started talking about cosmology, big bang, inflation, and strings.
Gabriele was able to make me feel at ease, in spite of the fact that I was a
bit embarrassed, being face to face with such a world-renowned scientist like
him: before that meeting, indeed, I knew him only for having seen his name
quoted in many books and articles as one of the founders of string theory. I
could not imagine that I was about to embark on the most stimulating and
important adventure of my scientific life.
After that meeting we started collaborating on string cosmology, and I
visited very often the Theory Division (now “TH Unit”) at CERN, living in
Geneva also for long periods. This has given me the opportunity to appreciate
Gabriele not only as a scientist (whose inventiveness, originality, and profundity
of thought will not be stressed here, because they are well known to the physics
community) but also for his human qualities. His tireless enthusiasm for
physics, his generosity in sharing knowledge, and his intellectual honesty always
make working with him a rewarding and enjoyable experience. I have countless
memories of days spent discussing and working out calculations on the
blackboard of his office (see Fig. 1), with short “coffee breaks” every now and
then, talking about physics even during lunch and dinner. Countless are the
things I have learned from him, not only from a scientific but also from a
human point of view. I will be grateful to him forever.

Fig. 1. Gabriele Veneziano (left) and the author (right), talking about dilatons at
CERN (January 1994)

M. Gasperini: Dilaton Cosmology and Phenomenology, Lect. Notes Phys. 737, 787–844 (2008)
DOI 10.1007/978-3-540-74233-6_24
© Springer-Verlag Berlin Heidelberg 2008

Choosing among the lines of research developed in collaboration with
Gabriele, I will concentrate the contribution to this book on the possible
role played by the dilaton in a cosmological context, with particular attention
to the phenomenological aspects of dilaton cosmology. The dilaton is a
fundamental scalar field appearing in all models of superstrings; dilaton
cosmology is probably the most natural and typical form of “string cosmology,”
and a direct/indirect confirmation (or disproof) of its predictions could give
us important experimental information on string theory in general.
The present contribution contains three lectures. The first lecture (Sect. 1)
is devoted to the presentation of a primordial cosmological scenario in which
the background evolution is dominated by the dilaton, and the Universe is
driven through an accelerated phase representing the “dual” counterpart of
the standard, decelerated evolution. The second lecture (Sect. 2) will discuss
the possibility that a cosmic background of relic dilaton radiation could have
survived until present, and could be detectable by the gravitational antennas
that are presently operating (or planned for a near-future operation). Finally,
the third lecture (Sect. 3) will suggest a possible “dilatonic” origin of the dark
energy fluid dominating the cosmic acceleration recently observed on large
scales, stressing the main differences from other, more conventional models of
scalar “quintessence.”

Notations and Conventions

Unless otherwise stated, the following conventions are used throughout this
paper: Greek indices run from 0 to d, Latin indices from 1 to d, where d = D−1
is the number of spatial dimensions of the D-dimensional space–time manifold.
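The metric signature is

$$g_{\mu\nu} = \mathrm{diag}(+,-,-,-,\cdots) .$$

The Riemann tensor and its contractions are defined by

$$R_{\mu\nu\alpha}{}^{\beta} = \partial_\mu\Gamma_{\nu\alpha}{}^{\beta} + \Gamma_{\mu\rho}{}^{\beta}\,\Gamma_{\nu\alpha}{}^{\rho} - (\mu\leftrightarrow\nu) ,$$

$$R_{\nu\alpha} = R_{\mu\nu\alpha}{}^{\mu} , \qquad R = R_\mu{}^{\mu} , \qquad G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R .$$

The conventions for the covariant derivative are

$$\nabla_\mu V_\alpha = \partial_\mu V_\alpha - \Gamma_{\mu\alpha}{}^{\beta}V_\beta , \qquad \nabla_\mu V^\alpha = \partial_\mu V^\alpha + \Gamma_{\mu\beta}{}^{\alpha}V^\beta .$$

Finally, we often use the convenient notation:

$$(\nabla\phi)^2 \equiv \nabla_\mu\phi\,\nabla^\mu\phi , \qquad \nabla^2\phi \equiv \nabla_\mu\nabla^\mu\phi .$$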

1 Dilaton-dominated Inflation: the Pre-big Bang Scenario

If we turn to the “specialized” literature for a description of the birth and
of the first moments of our Universe, we may read, in the (probably) most
ancient and authoritative book, that
In the beginning God created the Heaven and the Earth,
and the Earth was without form, and void;
and the darkness was upon the face of the deep.
And the Breath of God
moved upon the face of the water.
(Genesis, The Holy Bible).
The most impressive aspect of these verses, for a modern cosmologist, is probably
the total absence of any reference to the extremely hot, kinetic, explosive
state that one could expect at (or immediately after) the “big bang” deflagration.
The described state, instead, is somewhat quiet, dark, empty: we can
read, indeed, about “void” and “darkness,” and “the deep” gives us the idea of
something enormously deserted and empty. In this static configuration there is
at most some small fluctuation (the “Breath”), a ripple on the surface of this
vacuum.
It is amusing to note that a state of this type (flat, cold, and vacuum,
only ruffled by quantum fluctuations), can be obtained as the initial state of
our Universe, in a string-cosmology context, under the hypothesis that the
Universe evolves in a “self-dual” way with respect to the symmetries of the
low-energy string effective action [1, 2].
To introduce this result we start by considering the gravi-dilaton sector of the
low-energy effective action. To lowest order in the α′ (higher-derivative) and
$g_s^2$ (higher-loop) expansion the action is the same for all models of superstrings
[3, 4], and is given by

$$S = -\frac{1}{2\lambda_s^{d-1}}\int_\Omega d^{d+1}x\,\sqrt{-g}\;e^{-\phi}\left(R + \partial_\mu\phi\,\partial^\mu\phi + V\right) . \tag{1}$$

Here φ is the dilaton, and $\lambda_s = (2\pi\alpha')^{1/2}$ is the fundamental length parameter
of string theory. We have written the action using the so-called String frame
(S-frame) metric, i.e., the metric to which a “test” string is minimally coupled
and in which its evolution is geodesic. We have also included, for completeness
and for further applications, a (possibly non-perturbative) dilaton potential,
V = V(φ).
This action should be completed by the source term $S_m(g,\phi)$, describing
the matter field contributions, and by the Gibbons–Hawking boundary term
$S_\Sigma$, which is required (as in general relativity) to cancel the variational
contributions of the second derivatives of the metric following from the
Einstein-Hilbert Lagrangian $\sqrt{-g}\,R$. For the S-frame action (1) the boundary term
takes the form [5]

$$S_\Sigma = \frac{1}{2\lambda_s^{d-1}}\int_{\partial\Omega}\sqrt{-g}\;e^{-\phi}\,K^\alpha\,d\Sigma_\alpha , \tag{2}$$

where $K^\alpha = K n^\alpha$. Here K is the trace of the extrinsic curvature of the
d-dimensional hypersurface ∂Ω bounding the hypervolume Ω over which we are
varying the action, and $n^\alpha$ is the unit vector normal to this hypersurface.
The variation of the total action $S + S_m + S_\Sigma$ with respect to $g^{\mu\nu}$ leads
then to the equations
$$G_{\mu\nu} + \nabla_\mu\nabla_\nu\phi + \frac{1}{2}\,g_{\mu\nu}(\nabla\phi)^2 - g_{\mu\nu}\nabla^2\phi - \frac{1}{2}\,g_{\mu\nu}V(\phi) = \lambda_s^{d-1}e^{\phi}\,T_{\mu\nu} , \tag{3}$$
where $G_{\mu\nu}$ is the Einstein tensor and $T_{\mu\nu}$ the gravitational stress tensor of
the matter sources, defined as usual by the functional differentiation of $S_m$ as
$$T_{\mu\nu} = \frac{2}{\sqrt{-g}}\,\frac{\delta S_m}{\delta g^{\mu\nu}} . \tag{4}$$
The variation of the total action with respect to φ leads to the dilaton equation
of motion,
$$R + 2\nabla^2\phi - (\nabla\phi)^2 + V - \frac{\partial V}{\partial\phi} = \lambda_s^{d-1}e^{\phi}\,\sigma , \tag{5}$$
where σ is the (S-frame) density of dilaton charge of the sources, defined by
the functional differentiation of $S_m$ with respect to φ:
$$\sigma = -\frac{2}{\sqrt{-g}}\,\frac{\delta S_m}{\delta\phi} . \tag{6}$$
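Using the dilaton equation to eliminate the scalar curvature, present in the
Einstein tensor, we can eventually rewrite (3) in the convenient (simplified)
form
$$R_{\mu\nu} + \nabla_\mu\nabla_\nu\phi - \frac{1}{2}\,g_{\mu\nu}\frac{\partial V}{\partial\phi} = \lambda_s^{d-1}e^{\phi}\left(T_{\mu\nu} + \frac{1}{2}\,g_{\mu\nu}\,\sigma\right) . \tag{7}$$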
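The elimination step leading to (7) can be spelled out as follows. This is only a sketch of the algebra, using nothing beyond (3), (5), and the definition of the Einstein tensor; the shorthand κ for the combination of string length and dilaton factor is introduced here for brevity and is not notation from the text.

```latex
% Shorthand (an assumption of this sketch): \kappa \equiv \lambda_s^{d-1} e^{\phi}.
% Step 1: solve the dilaton equation (5) for the scalar curvature.
\[
R = -2\nabla^2\phi + (\nabla\phi)^2 - V + \frac{\partial V}{\partial\phi} + \kappa\,\sigma .
\]
% Step 2: insert this into G_{\mu\nu} = R_{\mu\nu} - \tfrac12 g_{\mu\nu} R.
\[
G_{\mu\nu} = R_{\mu\nu} + g_{\mu\nu}\nabla^2\phi - \tfrac12 g_{\mu\nu}(\nabla\phi)^2
  + \tfrac12 g_{\mu\nu}V - \tfrac12 g_{\mu\nu}\frac{\partial V}{\partial\phi}
  - \tfrac12 g_{\mu\nu}\,\kappa\,\sigma .
\]
% Step 3: substitute into (3); the \nabla^2\phi, (\nabla\phi)^2 and V terms cancel pairwise.
\[
R_{\mu\nu} + \nabla_\mu\nabla_\nu\phi - \tfrac12 g_{\mu\nu}\frac{\partial V}{\partial\phi}
  = \kappa\left(T_{\mu\nu} + \tfrac12 g_{\mu\nu}\,\sigma\right) .
\]
```

The last line is (7): only the derivative of the potential and the dilaton charge survive alongside the stress tensor.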

1.1 Scale Factor Duality

We will now consider the particular case in which the space–time manifold
described by the S-frame metric is spatially flat, homogeneous (but not neces-
sarily isotropic), and in which the matter fields can be phenomenologically de-
scribed as perfect fluids, at rest in the comoving frame of the given Robertson–
Walker geometry. We can thus set, in the synchronous gauge,

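$$g_{\mu\nu} = \mathrm{diag}(1, -a_i^2\,\delta_{ij}) , \qquad a_i = a_i(t) , \qquad \phi = \phi(t) ,$$
$$T_\mu{}^{\nu} = \mathrm{diag}(\rho, -p_i\,\delta_i^j) , \qquad \rho = \rho(t) , \qquad p_i = p_i(t) , \qquad \sigma = \sigma(t) . \tag{8}$$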

Separating the time and space components of the gravitational equations we
then obtain, from the (00) component of (3),

$$\dot\phi^2 - 2\dot\phi\sum_i H_i + \Big(\sum_i H_i\Big)^2 - \sum_i H_i^2 - V = 2\lambda_s^{d-1}e^{\phi}\rho \tag{9}$$

(where $H_i = \dot a_i/a_i$). From the (ii) component of (7) we have

$$\dot H_i - H_i\Big(\dot\phi - \sum_k H_k\Big) + \frac{1}{2}\frac{\partial V}{\partial\phi} = \lambda_s^{d-1}e^{\phi}\left(p_i - \frac{\sigma}{2}\right) . \tag{10}$$

From the dilaton equation (5) we are led, finally, to

$$2\ddot\phi - \dot\phi^2 + 2\dot\phi\sum_i H_i - \sum_i\big(2\dot H_i + H_i^2\big) - \Big(\sum_i H_i\Big)^2 + V - \frac{\partial V}{\partial\phi} = \lambda_s^{d-1}e^{\phi}\sigma . \tag{11}$$

We have thus obtained a system of d + 2 equations for the 2d + 3 unknowns
$\{a_i, \phi, \rho, p_i, \sigma\}$: its solution requires the input of d + 1 “equations of state,”
$p_i = p_i(\rho)$, $\sigma = \sigma(\rho)$, specifying the properties of the considered matter sources.

Let us now consider the symmetries of this system of equations. There
are two symmetries, in particular, that are relevant for the discussion of this
section. One of them (also present in the cosmological equations of general
relativity) is the invariance under the time-reversal transformation t → −t,
which implies

Hi → −Hi , Ḣi → Ḣi , φ̇ → −φ̇, φ̈ → φ̈. (12)

Thanks to this invariance property, if the set of variables $S = \{a_i(t), \phi(t), \rho(t)\}$
represents an exact solution of (9)–(11), then the time-reversed set
$\tilde S = \{a_i(-t), \phi(-t), \rho(-t)\}$ also corresponds to an exact solution of the same
equations (with different kinematic properties, in general).
The string-cosmology equations, in the particular case σ = 0 and V =
const, are also invariant under other transformations which have no analogue
in general relativity, and which include the inversion of an arbitrary number
of scale factors of the background geometry (8): the so-called scale-factor
duality transformations [1, 6]. For a simple illustration of this property we
may conveniently rewrite the equations in terms of the “shifted variables”
$\bar\phi$, $\bar\rho$, $\bar p_i$, $\bar\sigma$, defined by
$$\bar\phi = \phi - \ln\prod_i a_i = \phi - \sum_i \ln a_i , \qquad i = 1,\dots,d ,$$
$$\bar\rho = \rho\prod_i a_i , \qquad \bar p_k = p_k\prod_i a_i , \qquad \bar\sigma = \sigma\prod_i a_i . \tag{13}$$

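Equations (9)–(11) then become

$$\dot{\bar\phi}^2 - \sum_i H_i^2 - V = 2\lambda_s^{d-1}e^{\bar\phi}\,\bar\rho , \tag{14}$$

$$\dot H_i - H_i\,\dot{\bar\phi} + \frac{1}{2}\frac{\partial V}{\partial\phi} = \lambda_s^{d-1}e^{\bar\phi}\left(\bar p_i - \frac{\bar\sigma}{2}\right) , \tag{15}$$

$$2\ddot{\bar\phi} - \dot{\bar\phi}^2 - \sum_i H_i^2 + V - \frac{\partial V}{\partial\phi} = \lambda_s^{d-1}e^{\bar\phi}\,\bar\sigma . \tag{16}$$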
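The passage from (9)–(11) to (14)–(16) rests on a few identities that follow directly from the definitions (13). They are collected here as a sketch for the reader; nothing is assumed beyond (13) and the definition of the Hubble rates as logarithmic time derivatives of the scale factors.

```latex
% From \bar\phi = \phi - \sum_i \ln a_i and H_i = \dot a_i / a_i:
\[
\dot{\bar\phi} = \dot\phi - \sum_i H_i , \qquad
\ddot{\bar\phi} = \ddot\phi - \sum_i \dot H_i , \qquad
\dot{\bar\phi}^2 = \dot\phi^2 - 2\dot\phi\sum_i H_i + \Big(\sum_i H_i\Big)^2 .
\]
% From \bar\rho = \rho \prod_i a_i (and likewise for \bar p_i, \bar\sigma):
\[
e^{\phi}\rho = e^{\bar\phi}\Big(\prod_i a_i\Big)\rho = e^{\bar\phi}\,\bar\rho , \qquad
e^{\phi}p_i = e^{\bar\phi}\,\bar p_i , \qquad
e^{\phi}\sigma = e^{\bar\phi}\,\bar\sigma .
\]
```

Inserting the first line into the left-hand sides of (9), (10), and (11), and the second line into their right-hand sides, reproduces (14), (15), and (16) term by term.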

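Under the transformation $a \to \tilde a = a^{-1}$, on the other hand, we have
$$H = \frac{1}{a}\frac{da}{dt} \;\to\; \tilde H = \frac{1}{\tilde a}\frac{d\tilde a}{dt} = a\,\frac{da^{-1}}{dt} = -H . \tag{17}$$
We can then easily check that (14)–(16), in the particular case σ = 0 and
∂V/∂φ = 0, are invariant under the scale-factor duality transformations:

$$a_i \to a_i^{-1} , \qquad \bar\phi \to \bar\phi , \qquad \bar\rho \to \bar\rho , \qquad \bar p_i \to -\bar p_i . \tag{18}$$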
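The invariance can be illustrated numerically on a simple trial background. The sketch below assumes, purely for illustration, a source-free configuration with vanishing potential, power-law scale factors with exponents β_i whose squares sum to one, and a shifted dilaton equal to minus the logarithm of t for t > 0; substitution shows that this satisfies (14)–(16), but it is chosen here as a test case and is not taken from the text. The function names are hypothetical.

```python
import math


def residuals(betas, t):
    """Left-hand sides of (14), (15), (16) for V = 0 and no sources,
    evaluated on the trial background a_i = t**beta_i, phibar = -ln t (t > 0)."""
    H = [b / t for b in betas]              # H_i = (da_i/dt) / a_i
    H_dot = [-b / t**2 for b in betas]      # dH_i/dt
    phibar_dot = -1.0 / t
    phibar_ddot = 1.0 / t**2
    sum_H2 = sum(h * h for h in H)
    r14 = phibar_dot**2 - sum_H2
    r15 = [hd - h * phibar_dot for h, hd in zip(H, H_dot)]
    r16 = 2.0 * phibar_ddot - phibar_dot**2 - sum_H2
    return r14, r15, r16


def invert(betas, n):
    """Scale-factor duality on the first n directions:
    a_i -> 1/a_i, which for power laws means beta_i -> -beta_i."""
    return [-b for b in betas[:n]] + list(betas[n:])


def max_residual(betas, t):
    """Largest absolute residual among (14), (15), (16)."""
    r14, r15, r16 = residuals(betas, t)
    return max(abs(r14), abs(r16), *(abs(r) for r in r15))


if __name__ == "__main__":
    betas = [1 / math.sqrt(3)] * 3   # isotropic example, sum of squares = 1
    for n in range(4):
        print(n, max_residual(invert(betas, n), t=2.5))
```

Because the equations involve the Hubble rates only through their squares or multiplied by their own time derivatives' sign partners, flipping any subset of exponents leaves every residual at rounding level, while the shifted dilaton is untouched, as (18) requires.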

This type of transformation is called “dual” as it generalizes to the case of
time-dependent backgrounds the T-duality transformation inverting the
compactification radius (thus interchanging “winding” and “momentum” modes)
in the spectrum of a closed string, quantized in the presence of compact spatial
dimensions [7]. For the invariance under the transformations (18), however,
there is no need of a compact geometry; what is required, instead, is a
non-trivial transformation of the dilaton. Let us suppose, in fact, that we are
inverting a number n of scale factors, say $a_1, \dots, a_n$, with 1 ≤ n ≤ d: the
condition $\bar\phi \to \bar\phi$ then implies

$$\phi - \sum_{i=1}^{d}\ln a_i = \tilde\phi - \sum_{i=1}^{d}\ln\tilde a_i = \tilde\phi - \sum_{i=1}^{n}\ln a_i^{-1} - \sum_{i=n+1}^{d}\ln a_i , \tag{19}$$

from which

$$\phi \to \tilde\phi = \phi - 2\sum_{i=1}^{n}\ln a_i . \tag{20}$$

In the presence of sources, their energy density is also non-trivially
transformed: the condition $\bar\rho \to \bar\rho$ implies, in fact,

$$\rho\prod_{i=1}^{d}a_i = \tilde\rho\prod_{i=1}^{n}a_i^{-1}\prod_{i=n+1}^{d}a_i , \tag{21}$$

from which
$$\rho \to \tilde\rho = \rho\prod_{i=1}^{n}a_i^2 . \tag{22}$$

The transformation of the pressure is similar, but with an additional
“reflection” of the equation of state along the spatial directions affected by the
duality transformation:
$$p_i \to \tilde p_i = -p_i\prod_{k=1}^{n}a_k^2 , \qquad i = 1,\dots,n . \tag{23}$$

In any case, given a set of variables S = {ai (t), φ(t), ρ(t), pi } representing an
exact solution of (14)–(16), a new solution can be obtained by inverting an
arbitrary number n (between 1 and d) of scale factors, and is represented by

$$ \tilde S = \{a_1^{-1}, a_2^{-1}, \dots, a_n^{-1}, a_{n+1}, \dots, a_d, \tilde\phi, \tilde\rho, \tilde p_1, \dots, \tilde p_n, p_{n+1}, \dots, p_d\}, \tag{24} $$

where $\tilde\phi$, $\tilde\rho$, and $\tilde p_i$ are given by (20), (22), and (23), respectively.
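The rules (20), (22), and (23) can be checked numerically. The following is a minimal sketch: the function names `shifted` and `dualize` and the sample numbers are illustrative assumptions, not part of the original text. It inverts the first n scale factors and compares the shifted variables of (13) before and after the transformation.

```python
import math

def shifted(a, phi, rho, p):
    """Shifted variables of eq. (13) for scale factors a = [a_1, ..., a_d]."""
    vol = math.prod(a)                       # prod_i a_i
    return phi - math.log(vol), rho * vol, [pi * vol for pi in p]

def dualize(a, phi, rho, p, n):
    """Invert the first n scale factors using eqs. (20), (22), (23).

    Returns the new scale factors, dilaton, energy density, and the
    pressures along the n inverted directions only.
    """
    factor = math.prod(x * x for x in a[:n])                 # prod_{k<=n} a_k^2
    a_new = [1.0 / x for x in a[:n]] + list(a[n:])
    phi_new = phi - 2.0 * sum(math.log(x) for x in a[:n])    # eq. (20)
    rho_new = rho * factor                                   # eq. (22)
    p_new = [-pi * factor for pi in p[:n]]                   # eq. (23)
    return a_new, phi_new, rho_new, p_new

# Illustrative values (arbitrary): d = 3, invert n = 2 directions.
a, phi, rho, p = [2.0, 1.5, 3.0], 0.7, 1.3, [0.4, 0.1, 0.2]
a2, phi2, rho2, p2 = dualize(a, phi, rho, p, n=2)
phibar, rhobar, pbar = shifted(a, phi, rho, p)
phibar2, rhobar2, pbar2 = shifted(a2, phi2, rho2, p2)
```

In this sketch `phibar2` and `rhobar2` agree with `phibar` and `rhobar`, while the shifted pressures along the inverted directions change sign, as stated in (18).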
The invariance under the transformations (18) is only a particular case
of a more general O(d, d) symmetry of the tree-level string cosmology equa-
tions [8] (see also the contribution of Meissner [9] to this volume), and can
be extended so as to include the NS–NS two-form Bμν in the eﬀective action.
Such an extension is also possible in the presence of fluid sources: a homogeneous gas of strings, in particular, provides a realistic example of sources that
are automatically compatible with the O(d, d) symmetry of the background
equations [10].

In addition, the invariance under the transformations (18) can be extended

to the case of non-trivial potentials, ∂V/∂φ ≠ 0, and non-zero dilaton couplings to the matter sources, σ ≠ 0. In both cases, however, we need to
generalize those parts of the action describing the self-coupling of the dilaton
and the coupling of the dilaton to the matter fields present in $S_m$.
In the case of the dilaton potential it is well-known [8, 9, 10] that the
invariance under the transformations (18) holds for non-trivial V , provided
V depends on φ through the variable $\bar\phi$. Such a variable, unlike φ, is not a
scalar under general coordinate transformations (as evident from the definition
(13)): it is thus impossible, in a generic background, to define a potential
which is a function of $\bar\phi$ and which can be directly inserted as a scalar into
the covariant action (1). However, as first pointed out in [11], the action and
the corresponding equations of motion can be written in a generalized form
which is invariant under general coordinate transformations in any metric
background, using for the potential a non-local variable which exactly reduces
to $\bar\phi$ in the limit of a homogeneous geometry.
Here it will be shown that the invariance under the duality transformations
(18) can be restored also in the presence of the dilaton charge σ, provided the
dilaton coupling to the matter sources is parametrized by a non-local variable,
as in the case of the potential. This result is new, and will be explicitly derived
in the following subsection.

1.2 Non-local Dilaton Interactions

The formalism introduced in [11] is based on the non-local variable ξ(x),

deﬁned by
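$$ \xi(x) \equiv \xi[\phi(x)] = -\ln\int\frac{d^{d+1}y}{\lambda_s^{d}}\left(\sqrt{-g}\, e^{-\phi}\sqrt{\epsilon(\nabla\phi)^2}\right)_y \delta(\phi_x - \phi_y), \tag{25} $$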

where we have explicitly inserted the parameter

$$ \epsilon = \mathrm{sign}\{(\nabla\phi)^2\} = \begin{cases} 1, & (\nabla\phi)^2 > 0, \\ -1, & (\nabla\phi)^2 < 0, \end{cases} \tag{26} $$

so as to include in the formalism both time-like and space-like dilaton gradi-

ents. Note that we are using the convenient notation in which an index ap-
pended to round brackets, (. . . )x , means that all quantities inside the brackets
are functions of the appended variable. Similarly, φx ≡ φ(x). We can immedi-
ately check that, for a homogeneous background of the type (8) with spatial
sections of finite comoving volume ($\int d^d y = V_d = \mathrm{const} < \infty$), the variable
ξ exactly reduces to the variable $\bar\phi$ of (13). In that case, in fact, an explicit
integration gives
$$ \xi = \phi - \ln\prod_i a_i - \ln\frac{V_d}{\lambda_s^d}, \tag{27} $$

and the constant volume factor can be simply absorbed by rescaling φ, so that
ξ ≡ $\bar\phi$.
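The explicit integration behind (27) can be sketched as follows. This is a filled-in intermediate step, assuming that φ(t) is monotonic (so that the delta function has a single root), that the background (8) has $\sqrt{-g} = \prod_i a_i$, and that the dilaton gradient is time-like, so that $\sqrt{\epsilon(\nabla\phi)^2} = |\dot\phi|$:

```latex
\begin{aligned}
e^{-\xi(t)}
 &= \frac{1}{\lambda_s^{d}} \int d^{d}y \int dt'\,
    \prod_i a_i(t')\, e^{-\phi(t')}\, |\dot\phi(t')|\,
    \delta\big(\phi(t) - \phi(t')\big) \\
 &= \frac{V_d}{\lambda_s^{d}} \int dt'\,
    \prod_i a_i(t')\, e^{-\phi(t')}\, |\dot\phi(t')|\,
    \frac{\delta(t - t')}{|\dot\phi(t')|} \\
 &= \frac{V_d}{\lambda_s^{d}}\; e^{-\phi(t)} \prod_i a_i(t) .
\end{aligned}
```

Taking minus the logarithm of the last line reproduces (27).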
Let us now suppose that the matter couplings and the self-coupling of the
dilaton are both parametrized by ξ, according to the eﬀective action

$$ S = -\frac{1}{2\lambda_s^{d-1}}\int d^{d+1}x\,\sqrt{-g}\, e^{-\phi}\left[R + (\nabla\phi)^2 + V(e^{-\xi})\right] + \int d^{d+1}x\,\sqrt{-g}\, L_m(e^{-\xi}) + S_\Sigma, \tag{28} $$

which is a (generally covariant) scalar functional of the non-local variable ξ.

Note that, without loss of generality, we have written both the potential and
the matter Lagrangian Lm as a function of exp(−ξ). In higher-dimensional
manifolds with compact spatial sections, in fact, the exponential of the shifted
dilaton plays the role of a “dimensionally reduced” coupling parameter, and we
may thus expect (at least in a perturbative regime) that dilaton interactions
appear as a power expansion (or as a simple function) of such an exponential
[11].
The generalized equations of motion can now be obtained by computing
the functional derivative of the action (28) with respect to g μν and φ. The
derivative with respect to the metric, using the standard deﬁnition of gravi-
tational stress tensor, (4), and the properties of the delta distribution, leads
to the (integro-diﬀerential) equations of motion
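$$ G_{\mu\nu} + \nabla_\mu\nabla_\nu\phi + \frac{1}{2} g_{\mu\nu}\left[(\nabla\phi)^2 - 2\nabla^2\phi - V\right] - \frac{1}{2}\gamma_{\mu\nu}\sqrt{\epsilon(\nabla\phi)^2}\left(e^{-\phi} I_V - 2\lambda_s^{d-1} I_m\right) = \lambda_s^{d-1} e^{\phi}\, T_{\mu\nu}, \tag{29} $$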
which generalize (3) (see Appendix A for the details of the derivation). Here

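$$ \gamma_{\mu\nu} = g_{\mu\nu} - \frac{\nabla_\mu\phi\,\nabla_\nu\phi}{(\nabla\phi)^2}, \tag{30} $$

$$ I_V(x) = \lambda_s^{-d}\int d^{d+1}y\,\left(\sqrt{-g}\, V'\right)_y \delta(\phi_y - \phi_x), \tag{31} $$

$$ I_m(x) = \lambda_s^{-d}\int d^{d+1}y\,\left(\sqrt{-g}\, L_m'\right)_y \delta(\phi_y - \phi_x), \tag{32} $$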

where the prime denotes the derivative with respect to the argument exp(−ξ),
namely:
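$$ V' = \frac{\partial V}{\partial(e^{-\xi})} = -e^{\xi}\frac{\partial V}{\partial\xi}, \qquad L_m' = \frac{\partial L_m}{\partial(e^{-\xi})} = -e^{\xi}\frac{\partial L_m}{\partial\xi}. \tag{33} $$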

The functional derivative with respect to φ leads to the dilaton equation of

motion,

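$$ R + 2\nabla^2\phi - (\nabla\phi)^2 + V + \frac{\gamma_{\mu\nu}\nabla^\mu\nabla^\nu\phi}{\sqrt{\epsilon(\nabla\phi)^2}}\left(e^{-\phi} I_V - 2\lambda_s^{d-1} I_m\right) + \left(e^{-\xi} - e^{-\phi} J\right)\left(V' - 2\lambda_s^{d-1} e^{\phi} L_m'\right) = 0, \tag{34} $$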

generalizing (5). Here

$$ J(x) = \lambda_s^{-d}\int d^{d+1}y\,\left(\sqrt{-g}\,\sqrt{\epsilon(\nabla\phi)^2}\right)_y \delta'(\phi_x - \phi_y), \tag{35} $$

where δ′ denotes the derivative of the delta function with respect to its argument (see Appendix A). The combination of (29) and (34) finally leads to the
equation

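$$ R_{\mu\nu} + \nabla_\mu\nabla_\nu\phi + \frac{1}{2}\left(e^{-\phi} I_V - 2\lambda_s^{d-1} I_m\right)\left[g_{\mu\nu}\frac{\gamma_{\alpha\beta}\nabla^\alpha\nabla^\beta\phi}{\sqrt{\epsilon(\nabla\phi)^2}} - \gamma_{\mu\nu}\sqrt{\epsilon(\nabla\phi)^2}\right] + \frac{1}{2} g_{\mu\nu}\left(e^{-\xi} - e^{-\phi} J\right)\left(V' - 2\lambda_s^{d-1} e^{\phi} L_m'\right) = \lambda_s^{d-1} e^{\phi}\, T_{\mu\nu}, \tag{36} $$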
generalizing (7).
We can easily check that these new equations, written for a homogeneous
background, are invariant under scale-factor duality transformations even in
the presence of non-trivial potentials and dilaton couplings, i.e., for ∂V/∂ξ ≠ 0,
∂L_m/∂ξ ≠ 0. Consider, for instance, the background configuration of (8)
with time-like dilaton gradients, for which ε = 1. From (30) we obtain

$$ \gamma_{00} = 0, \qquad \gamma_i{}^j = \delta_i^j. \tag{37} $$

The (0, 0) component of (29) thus coincides with the (0, 0) component of (3),
and is given by (9), as before.
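For the spatial components we first note that, performing the homogeneous
limit in which ξ → $\bar\phi$, we are led to the identities

$$ \sqrt{\epsilon(\nabla\phi)^2}\left(e^{-\phi} I_V - 2\lambda_s^{d-1} I_m\right) \equiv e^{-\xi}\left(V' - 2\lambda_s^{d-1} e^{\phi} L_m'\right) \longrightarrow -\left(\frac{\partial V}{\partial\bar\phi} - 2\lambda_s^{d-1} e^{\phi}\frac{\partial L_m}{\partial\bar\phi}\right); \tag{38} $$

$$ \frac{\gamma_{\alpha\beta}\nabla^\alpha\nabla^\beta\phi}{\sqrt{\epsilon(\nabla\phi)^2}}\left(e^{-\phi} I_V - 2\lambda_s^{d-1} I_m\right) \equiv e^{-\phi} J\left(V' - 2\lambda_s^{d-1} e^{\phi} L_m'\right) \longrightarrow -\frac{\sum_i H_i}{\dot\phi}\left(\frac{\partial V}{\partial\bar\phi} - 2\lambda_s^{d-1} e^{\phi}\frac{\partial L_m}{\partial\bar\phi}\right). \tag{39} $$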

Using such identities we ﬁnd that the dependence on V and Lm completely
disappears from the spatial components of (36) with ε = 1, and we obtain the
condition

$$ R_i{}^j + \nabla_i\nabla^j\phi = \lambda_s^{d-1} e^{\phi}\, T_i{}^j. \tag{40} $$
Written explicitly, the new spatial equation is given by

$$ \dot H_i - H_i\left(\dot\phi - \sum_k H_k\right) = \lambda_s^{d-1} e^{\phi}\, p_i, \tag{41} $$

and is thus crucially simpliﬁed with respect to the corresponding (local) spatial
equation (10).
The dilaton equation (34) also simpliﬁes in the homogeneous limit, thanks
to the identities (38) and (39) from which we obtain

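$$ R + 2\nabla^2\phi - (\nabla\phi)^2 + V - \frac{\partial V}{\partial\bar\phi} = -2\lambda_s^{d-1} e^{\phi}\frac{\partial L_m}{\partial\bar\phi}. \tag{42} $$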
The explicit form is
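$$ 2\ddot\phi - \dot\phi^2 + 2\dot\phi\sum_i H_i - \sum_i\left(2\dot H_i + H_i^2\right) - \Big(\sum_i H_i\Big)^2 + V(\bar\phi) - \frac{\partial V}{\partial\bar\phi} = \lambda_s^{d-1} e^{\phi}\,\sigma(\bar\phi), \tag{43} $$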

where we have deﬁned, by analogy with (6),

$$ \sigma(\bar\phi) = -2\,\frac{\partial L_m}{\partial\bar\phi}. \tag{44} $$

The new set of equations (9), (41), and (43) is compatible with scale-
factor duality for any V and σ, as can be shown by rewriting the equations in
terms of the shifted variables of (13). With such variables, (9), (41), and (43)
become, respectively,
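$$ \dot{\bar\phi}^2 - \sum_i H_i^2 - V = 2\lambda_s^{d-1} e^{\bar\phi}\,\bar\rho, \tag{45} $$
$$ \dot H_i - H_i\dot{\bar\phi} = \lambda_s^{d-1} e^{\bar\phi}\,\bar p_i, \tag{46} $$
$$ 2\ddot{\bar\phi} - \dot{\bar\phi}^2 - \sum_i H_i^2 + V(\bar\phi) - \frac{\partial V}{\partial\bar\phi} = \lambda_s^{d-1} e^{\bar\phi}\,\bar\sigma(\bar\phi). \tag{47} $$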

They are manifestly invariant under the generalized transformations

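$$ a_i \to a_i^{-1}, \qquad \bar\phi \to \bar\phi, \qquad \bar\rho \to \bar\rho, \qquad \bar p_i \to -\bar p_i, \qquad \bar\sigma \to \bar\sigma, \tag{48} $$

preserving the shifted version of the dilaton-charge density, $\bar\sigma$.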


1.3 The Pre-big Bang Scenario

Let us come back to the cosmological applications of scale-factor duality. Even

without using its non-local extensions, the duality symmetry of the equations
allows introducing a “dual complement” of the standard cosmological solu-
tions, and suggests new possible scenarios for the primordial evolution of our
Universe.
For a simple illustration of this possibility it will be enough to consider
a homogeneous, isotropic and spatially flat metric background, sourced by a
barotropic perfect fluid with equation of state p/ρ = γ = const, with negligible
dilaton charge. By imposing σ = 0, V = 0, and assuming γ ≠ 0, one easily
finds that (45)–(47) are satisfied by the following particular exact solution:
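$$ a = \left(\frac{t}{t_0}\right)^{\frac{2\gamma}{1+d\gamma^2}}, \qquad \bar\rho = \rho_0\, a^{-d\gamma}, \qquad \bar\phi = -\frac{2}{1+d\gamma^2}\ln\frac{t}{t_0} + \mathrm{const}, \tag{49} $$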

where t > 0, and t0 , ρ0 are positive integration constants. In terms of the

non-shifted variables:

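$$ a = \left(\frac{t}{t_0}\right)^{\frac{2\gamma}{1+d\gamma^2}}, \qquad \rho = \rho_0\, a^{-d(1+\gamma)}, \qquad p = \gamma\rho, $$
$$ \phi = 2\,\frac{d\gamma - 1}{1+d\gamma^2}\ln\frac{t}{t_0} + \mathrm{const}, \qquad t > 0. \tag{50} $$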

This solution, defined over the real positive semi-axis t > 0, describes a Universe
evolving from a past curvature singularity at t → 0+ to an asymptotically
flat configuration at t → +∞. For γ > 0 we have a phase of decelerated
expansion and decreasing curvature,

ȧ > 0, ä < 0, Ḣ < 0, (51)

typical of the standard cosmological scenario. Also, for a “realistic” equation

of state with γd ≤ 1, the dilaton turns out to be non-increasing (φ̇ ≤ 0);
in particular, for a radiation ﬂuid with γ = 1/d, one recovers the radiation-
dominated solution at constant dilaton,
$$ p = \frac{\rho}{d}, \qquad a = \left(\frac{t}{t_0}\right)^{\frac{2}{1+d}}, \qquad \rho = \rho_0\, a^{-(1+d)}, \qquad \phi = \mathrm{const}, \tag{52} $$

which is also an exact solution of the standard Einstein equations.
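The power-law solution (49) can be checked against (45)–(47) by comparing coefficients, since with V = 0 and σ = 0 every term scales as 1/t². The sketch below does this numerically; the function names and the sampled values of γ are illustrative assumptions.

```python
def power_law_exponents(gamma, d=3):
    """Exponents of eq. (49): a ~ t**alpha, shifted dilaton = -beta*ln(t)."""
    denom = 1.0 + d * gamma ** 2
    return 2.0 * gamma / denom, 2.0 / denom

def residuals(gamma, d=3):
    """Coefficient check of (45)-(47) with V = 0 and sigma = 0.

    With H = alpha/t and d(phibar)/dt = -beta/t, all terms are ~ 1/t**2,
    so only the numerical coefficients need to be compared.
    """
    alpha, beta = power_law_exponents(gamma, d)
    # Eq. (47): 2*phibar'' - phibar'**2 - d*H**2 = 0
    r47 = 2.0 * beta - beta ** 2 - d * alpha ** 2
    # Source term lambda_s^(d-1) * e^phibar * rhobar * t^2, obtained two ways:
    s_from_45 = 0.5 * (beta ** 2 - d * alpha ** 2)     # from eq. (45)
    s_from_46 = alpha * (beta - 1.0) / gamma           # from eq. (46), pbar = gamma*rhobar
    return r47, s_from_45 - s_from_46

def dilaton_exponent(gamma, d=3):
    """Coefficient of ln(t/t0) in the unshifted dilaton of eq. (50)."""
    return 2.0 * (d * gamma - 1.0) / (1.0 + d * gamma ** 2)
```

Both residuals vanish (to rounding) for any γ ≠ 0, and for the radiation case γ = 1/d the dilaton exponent is zero, in agreement with (52).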

Thanks to the symmetries of the string cosmology equations we can now
obtain new, different solutions (which have no analogue in the context of the
Einstein equations) by performing a time reflection t → −t and, simultane-
ously, a dual transformation defined by (18). Starting in particular from (50)
we are led to the background
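$$ a = \left(-\frac{t}{t_0}\right)^{-\frac{2\gamma}{1+d\gamma^2}}, \qquad \rho = \rho_0\, a^{-d(1-\gamma)}, \qquad p = -\gamma\rho, $$
$$ \phi = -2\,\frac{1+d\gamma}{1+d\gamma^2}\ln\left(-\frac{t}{t_0}\right) + \mathrm{const}, \qquad t < 0, \tag{53} $$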

which is still a particular exact solution of (45)–(47). It is deﬁned on the

negative real semi-axis t < 0, and for γ > 0 it describes a phase of accelerated
(i.e., inﬂationary) expansion and growing curvature:

ȧ > 0, ä > 0, Ḣ > 0. (54)
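The signs in (51) and (54) follow directly from the power-law form of the scale factor. As a minimal sketch (the helper names `kinematics` and `exponent` are assumptions introduced here), one can evaluate the derivatives of a(t) = |t|^power analytically on both semi-axes:

```python
import math

def exponent(gamma, d=3):
    """Power 2*gamma/(1 + d*gamma^2) appearing in eqs. (50) and (53)."""
    return 2.0 * gamma / (1.0 + d * gamma ** 2)

def kinematics(power, t):
    """Return (da/dt, d2a/dt2, dH/dt) for a(t) = |t|**power, t != 0.

    For t > 0 this is a = t**power; for t < 0 it is a = (-t)**power.
    """
    sign = math.copysign(1.0, t)
    a_dot = sign * power * abs(t) ** (power - 1.0)
    a_ddot = power * (power - 1.0) * abs(t) ** (power - 2.0)
    h_dot = -power / t ** 2          # H = power / t on both semi-axes
    return a_dot, a_ddot, h_dot

alpha = exponent(1.0 / 3.0)                 # radiation in d = 3
post_big_bang = kinematics(alpha, 2.0)      # eq. (50), t > 0
pre_big_bang = kinematics(-alpha, -2.0)     # eq. (53), t < 0
```

For γ > 0 the post-big bang branch gives the sign pattern of (51), and the dual branch gives that of (54).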

In this case the Universe evolves from an asymptotically flat initial configuration
at t → −∞ towards a curvature singularity at t → 0−. The dilaton
is always growing (φ̇ > 0) for t → 0− , even if we consider the dual of the
radiation-dominated solution (52):
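$$ p = -\frac{\rho}{d}, \qquad a = \left(-\frac{t}{t_0}\right)^{-\frac{2}{1+d}}, \qquad \rho = \rho_0\, a^{-d+1}, \qquad \phi = -\frac{4d}{d+1}\ln\left(-\frac{t}{t_0}\right). \tag{55} $$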

This interesting property of the low-energy string-cosmology equations—

i.e., the presence of an inﬂationary “partner” associated to any standard decel-
erated solution—is also valid in the absence of sources. Consider, for instance,
(45)–(47) with p = 0 = ρ, and V = 0. In the isotropic limit we ﬁnd the
particular exact solution
$$ a = \left(\frac{t}{t_0}\right)^{1/\sqrt{d}}, \qquad \phi = \left(\sqrt{d} - 1\right)\ln\frac{t}{t_0}, \qquad t > 0, \tag{56} $$

describing decelerated expansion and decreasing curvature. By applying the

transformations (18) we are led to the dual solution
$$ a = \left(-\frac{t}{t_0}\right)^{-1/\sqrt{d}}, \qquad \phi = -\left(\sqrt{d} + 1\right)\ln\left(-\frac{t}{t_0}\right), \qquad t < 0, \tag{57} $$

describing accelerated expansion, growing curvature and growing dilaton. Ac-

tually, both the vacuum and the fluid-dominated solutions can be obtained
as asymptotic limits of the general exact solution of the system of equations
(45)–(47) (for barotropic sources, with V = 0 and σ = 0), in the large-
curvature and small-curvature limits, respectively [12, 13].
It is well-known that the decelerated configurations, typical of the standard
cosmological scenario, cannot be extended back in time without limits: the
range of the time coordinate is bounded from below by the presence of the
initial singularity (indeed, going back in time, the growth of H is unbounded,
and the curvature blows up to infinity in a finite proper-time interval).

The standard scenario, however, is certainly incomplete because it ex-

cludes inflation. The inclusion of inflation, on the other hand, modifies the
behavior of the curvature scale: during a phase of “slow-roll” inflation [14],
for instance, the background geometry can be approximately described by a
de Sitter-like metric where Ḣ ≃ 0, and in which the curvature tends to settle
at a constant. One might think, therefore, that a complete (and realistic)
cosmological scenario could avoid the initial singularity, replacing it with a
primordial inflationary phase at constant curvature.
Unfortunately, however, an epoch of accelerated expansion at constant
curvature, described by the Einstein equations, and dominated by the poten-
tial energy of some “inflaton” scalar field satisfying causality and weak-energy
conditions, cannot be “past eternal,” as proved in [15]. Thus, the conventional
inflationary scenario mitigates the rapid growth of the curvature typical of the
standard cosmological evolution, and shifts back in time the position of the
initial singularity, without completely removing it, however (namely, without
extending in a geodesically complete way the model, back in time, to infinity).
If a constant curvature phase is not appropriate to construct a regular
model (fully extended over the whole temporal axis), the alternative we are
left with is a model in which the curvature, as we go back in time, after reaching a
maximum, at some point, starts decreasing; in other words, a model in which
the standard evolution is completed and complemented by a primordial phase
with a specular behavior of H with respect to the standard one. Remarkably,
this is exactly what can be obtained assuming that the cosmological evolution
satisfies a principle of “self-duality”—i.e., assuming that the past evolution of
our Universe is described by the “dual complement” of the present one [2].
More precisely, if we consider a cosmological model satisfying (at least approximately)
the self-dual condition $a(t) = a^{-1}(-t)$, such that the standard
decelerated regime at t > 0 smoothly evolves, back in time, into the acceler-
ated partner at t < 0, we can then obtain a scenario in which the singularity
is automatically regularized, and the initial evolution is automatically of the
inflationary type. In such a context, the big bang singularity is replaced by an
epoch of high (but finite) curvature, characterizing the transition between the
standard cosmological phase (Ḣ < 0) and its dual (Ḣ > 0): it is natural,
in such a context, to call pre-big bang the initial phase (t < 0) at growing
curvature and growing dilaton, in contrast to the subsequent post-big bang
phase (t > 0), describing the standard cosmological evolution.
The dilaton, on the other hand, provides an exponential parametrization of
the (tree-level) string coupling gs = exp(φ/2), controlling the relative strength
of all (gravitational and gauge) interactions [3, 4]. The principle of self-duality
thus suggests that the Universe is led to its present state after a long evolution
started from an extremely simple—almost trivial—configuration, character-
ized by a nearly flat geometry and by a very small coupling parameter,

$$ H^2 \to 0, \qquad \phi \to -\infty, \qquad g_s^2 = e^{\phi} \to 0, \tag{58} $$


the so-called string perturbative vacuum (see Fig. 2). In this case, the initial
Universe is characterized by a regime of extremely low energies in which the
“curvatures” (i.e., the field gradients) are small ($\lambda_s^2 H^2 \ll 1$, $\lambda_s^2\dot\phi^2 \ll 1$, . . . ),
the couplings are weak ($g_s^2 \ll 1$), and the background dynamics can be appropriately
described by the lowest-order string effective action, at tree level in
the α′ and quantum loop expansion (also in agreement with the hypothesis of
“asymptotic past triviality” [16]). We can talk of “birth of the Universe from
the string perturbative vacuum,” as also pointed out in a quantum cosmology
context (see, e.g., [17, 18]).
This picture is in remarkable contrast with the standard (even inflationary)
picture in which the Universe starts evolving from a highly curved geomet-
ric state: the more we go back in time, in that context, the more we enter
a Planckian and (possibly) trans-Planckian [19] non-perturbative regime of
ultra-high energies, requiring the full inclusion of quantum gravity effects, to
all orders, for a correct description.
The principle of self-duality, on the contrary, suggests a picture in which
the more we go back in time (after crossing the epoch of maximal curvature),

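[Figure 2: two panels plotted against cosmic time t, with the pre-big bang phase at t < 0 (from t → −∞) and the post-big bang phase at t > 0. Upper panel labels: “string perturbative vacuum”, “BIG BANG”, “standard cosmological scenario”. Lower panel ($g_s^2 = e^{\phi}$) labels: “string perturbative vacuum”, “strong coupling”, “standard-model configuration”.]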

Fig. 2. Qualitative time-evolution of the curvature scale (upper panel) and of the
string coupling (lower panel), for a typical self-dual background which smoothly
interpolates between the pre-big bang and the post-big bang phase, starting from
the string perturbative vacuum

the more we approach a flat, cold, and vacuum configuration (strongly reminis-
cent of the “biblical” scenario quoted at the beginning of Sect. 1), which can
be appropriately described by the classical background equations obtained
from the action (1). Quantum effects, in the form of higher-curvature and
higher-loop contributions, are expected to become important only toward the
end of the pre-big bang phase, when the background approaches the string
scale at t → 0− . Actually, all studies performed so far have shown that such
corrections must become dominant, eventually, in order to stop the growth
of the curvature [20] and possibly trigger a smooth transition to the post-big
bang regime [21].

1.4 A Smooth “Bounce”

The lowest-order string effective action can appropriately describe the phase
of primordial background evolution typical of the pre-big bang scenario, but
not the transition to the standard decelerated regime occurring at high cur-
vatures and strong coupling, and requiring the introduction of higher-order
corrections. Referring the reader to the existing literature for a detailed re-
view of the transition models studied so far (see, for instance, [22]), we shall
present here only two simple phenomenological examples, by applying, to this
purpose, the formalism introduced in Sect. 1.2 (and Appendix A). In these
examples, in fact, the bouncing transition is induced by the presence of a
non-local effective potential $V(\bar\phi)$, expected to simulate the backreaction of
the quantum loop corrections in higher-dimensional manifolds with compact
spatial sections [11].
The first example is based on a potential which, in the homogeneous limit,
takes the form
$$ V(\bar\phi) = -V_0\, e^{4\bar\phi}, \qquad V_0 > 0, \tag{59} $$
and which may thus be perturbatively interpreted as a four-loop potential. With
this potential, the duality-invariant equations (45)–(47), in vacuum (ρ = p =
σ = 0), and in the isotropic limit, are solved by the particular exact solution
[18]:
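$$ a(t) = a_0\left[\frac{t}{t_0} + \left(1 + \frac{t^2}{t_0^2}\right)^{1/2}\right]^{1/\sqrt{d}}, $$
$$ \bar\phi = -\frac{1}{2}\ln\left[t_0 V_0\left(1 + \frac{t^2}{t_0^2}\right)\right] + \mathrm{const}, $$
$$ \phi = \ln\frac{\left[t/t_0 + \left(1 + t^2/t_0^2\right)^{1/2}\right]^{\sqrt{d}}}{\left(1 + t^2/t_0^2\right)^{1/2}} + \mathrm{const}, \tag{60} $$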
where t0 and a0 are positive integration constants. This regular “bouncing”
solution is exactly self-dual, as it satisfies $a(t)/a_0 = a_0/a(-t)$, and is characterized
by a bounded, “bell-like” shape of the curvature and of the dilaton

kinetic energy (see Fig. 3). The solution smoothly interpolates between the
pre- and post-big bang vacuum solutions (57) and (56) (corresponding to the
dashed curves of Fig. 3), which are recovered in the asymptotic limits t → −∞
and t → +∞, respectively. The bounce of the curvature, and the smooth tran-
sition between the two branches of the low-energy solutions, is induced and
controlled by the potential (59) which dominates the background evolution
in the high-curvature limit |t| → 0, and which becomes rapidly negligible as
t → ±∞, as illustrated in Fig. 3.
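The bouncing solution (60) can be tested directly against (45)–(47) in vacuum with the potential (59). The sketch below uses analytic derivatives; the function names are illustrative, and the additive constant of the shifted dilaton is fixed here by the assumption $V_0 t_0^2 e^{4\bar\phi}(1 + t^2/t_0^2)^2 = 1$, which is the normalization under which the constraint (45) holds for this ansatz (for t0 = V0 = 1, the values used in Fig. 3, this means a vanishing constant).

```python
import math

def bounce(t, t0=1.0, V0=1.0, d=3):
    """Background quantities for the solution (60) in the isotropic case."""
    tau = t / t0
    s = 1.0 + tau * tau
    H = 1.0 / (math.sqrt(d) * t0 * math.sqrt(s))
    dH = -tau / (math.sqrt(d) * t0 * t0 * s ** 1.5)
    phibar = -0.5 * math.log(t0 * math.sqrt(V0) * s)   # assumed normalization
    dphibar = -tau / (t0 * s)
    ddphibar = -(1.0 - tau * tau) / (t0 * t0 * s * s)
    V = -V0 * math.exp(4.0 * phibar)                   # eq. (59)
    return H, dH, dphibar, ddphibar, V

def residuals(t, t0=1.0, V0=1.0, d=3):
    """Left-hand sides of (45), (46), (47) with rho = p = sigma = 0."""
    H, dH, dphibar, ddphibar, V = bounce(t, t0, V0, d)
    dV = 4.0 * V                                       # dV/dphibar for eq. (59)
    r45 = dphibar ** 2 - d * H ** 2 - V
    r46 = dH - H * dphibar
    r47 = 2.0 * ddphibar - dphibar ** 2 - d * H ** 2 + V - dV
    return r45, r46, r47

def scale_factor(t, t0=1.0, a0=1.0, d=3):
    """a(t) of eq. (60)."""
    tau = t / t0
    return a0 * (tau + math.sqrt(1.0 + tau * tau)) ** (1.0 / math.sqrt(d))
```

All three residuals vanish to rounding accuracy at any t, and `scale_factor(t) * scale_factor(-t)` equals a0², which is the self-duality property quoted in the text.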
It should be noted that in this solution the dilaton keeps growing, mono-
tonically, even in the limit t → +∞. In more realistic examples, however,
such a growth is expected to be damped by the interaction with the mat-
ter/radiation post-big bang sources [23], and/or by the action of a suitable
non-perturbative potential appearing in the strong coupling regime.
The second example of bounce is based on a general integration of the
duality-invariant equations (45)–(47), in the presence of isotropic fluid sources
with σ = 0 and of a two-loop (non-local) potential which in the homogeneous
limit takes the form
$$ V(\bar\phi) = -V_0\, e^{2\bar\phi}, \qquad V_0 > 0. \tag{61} $$
In this case the equations can be integrated exactly not only for barotropic
equations of state (p/ρ = γ = const), but also for any ratio p/ρ which is an
integrable function of an appropriately defined time-like parameter [2].
An interesting example (motivated by the study of the equation of state
of a string gas in rolling backgrounds [24]) is the case in which p/ρ smoothly
evolves from the value γ = −1/d at t = −∞ to the value γ = 1/d at t = +∞,
thus connecting the radiation equation of state to its dual partner, according
to the law:
$$ \frac{p}{\rho} = \frac{1}{d}\,\frac{x}{\sqrt{x_1^2 + x^2}}. \tag{62} $$
Here x1 is an arbitrary integration constant, and x is a (dimensionless) time-
like coordinate defined by

[Figure 3: curves labeled $\dot{\bar\phi}$, $H$, and $V$, plotted against $t$.]

Fig. 3. Plot of the curvature, of the dilaton kinetic energy, and of the potential
V (φ), for the bouncing solution (60). The dashed curves represent the (singular)
vacuum solutions (56), (57), obtained with V = 0. All curves are plotted for t0 = 1,
V0 = 1, and d = 3

$$ \frac{dx}{dt} = \frac{L}{2}\,\bar\rho, \tag{63} $$
where L is a constant with dimensions of length (we are using units in which
$2\lambda_s^{d-1} = 1$, so that $[\bar\rho] = L^{-2}$). Using (61)–(63), and choosing a simplifying
set of integration constants (appropriate to the pedagogical purpose of this
paper), we can then obtain the following particular exact solution [2]:
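$$ a = a_0\left[x + \sqrt{x^2 + x_1^2}\right]^{2/(d-1)}, $$
$$ e^{\phi} = a_0^d\, e^{\phi_0}\left[1 + \frac{x}{\sqrt{x^2 + x_1^2}}\right]^{2d/(d-1)}, $$
$$ \rho\, e^{\phi} = \frac{d-1}{dL^2}\, e^{2\phi_0}\left(x^2 + x_1^2\right)^{-(d+1)/(d-1)}, $$
$$ p\, e^{\phi} = \frac{d-1}{d^2L^2}\, e^{2\phi_0}\, x\left(x^2 + x_1^2\right)^{-(3d+1)/2(d-1)}, \tag{64} $$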
where a0 and φ0 are integration constants. The smooth and bouncing behavior
of this solution is illustrated in Fig. 4.
The above solution is self-dual, in the sense that $\bar\phi(x) = \bar\phi(-x)$, $\bar\rho(x) = \bar\rho(-x)$, and
$$ \frac{a(x)}{a_0\, x_1^{2/(d-1)}} = \left[\frac{a(-x)}{a_0\, x_1^{2/(d-1)}}\right]^{-1} \tag{65} $$
(with an appropriate choice of the integration constant a0 it is always possible
to set to 1 the ﬁxed point of scale-factor inversion). The solution satisﬁes,
asymptotically,
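$$ x \to -\infty \;\Rightarrow\; a \sim (-x)^{-2/(d-1)} \sim \bar\rho \sim \frac{dx}{dt}, $$
$$ x \to +\infty \;\Rightarrow\; a \sim x^{2/(d-1)} \sim \frac{1}{\bar\rho} \sim \frac{dt}{dx}. \tag{66} $$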

[Figure 4: curves labeled $H^2$, $\rho e^{\phi}$, and $p e^{\phi}$, plotted against $t$.]

Fig. 4. Plot of the curvature, of the string coupling, of the eﬀective energy density
and of the eﬀective pressure for the self-dual solution (64). The curves are plotted
for d = 3, L = 1, x1 = 1, φ0 = 0, and a0 = exp(−2/3)

Re-expressing a, φ, ρ, and p, in the asymptotic limits x → ±∞ in terms of

the cosmic time t, we can check that this solution smoothly interpolates be-
tween the pre-big bang configuration (55) describing accelerated expansion,
growing dilaton, negative pressure, and the final post-big bang configuration
(52), describing the radiation-dominated state with frozen dilaton and decel-
erated expansion. As in the previous case the smoothing out of the tree-level
singularity, and the appearance of a bouncing transition, is a consequence of
the effective potential (61).
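Some of the stated properties of the solution (64) are easy to confirm numerically: the equation of state (62), the inversion property (65), and the evenness of the shifted dilaton. The following sketch evaluates (64) as written; the function name and the default parameter values are illustrative assumptions.

```python
import math

def self_dual_solution(x, d=3, L=1.0, x1=1.0, phi0=0.0, a0=1.0):
    """Evaluate eq. (64): returns (a, e^phi, rho*e^phi, p*e^phi)."""
    q = x * x + x1 * x1
    r = math.sqrt(q)
    a = a0 * (x + r) ** (2.0 / (d - 1))
    exp_phi = a0 ** d * math.exp(phi0) * (1.0 + x / r) ** (2.0 * d / (d - 1))
    rho_e = (d - 1) / (d * L * L) * math.exp(2.0 * phi0) * q ** (-(d + 1) / (d - 1))
    p_e = ((d - 1) / (d * d * L * L) * math.exp(2.0 * phi0)
           * x * q ** (-(3 * d + 1) / (2.0 * (d - 1))))
    return a, exp_phi, rho_e, p_e

def pressure_ratio(x, d=3, x1=1.0):
    """Equation of state (62): p/rho = x / (d * sqrt(x1^2 + x^2))."""
    return x / (d * math.sqrt(x1 * x1 + x * x))
```

With these definitions the ratio `p_e / rho_e` reproduces `pressure_ratio(x)`, the product a(x) a(−x) equals $a_0^2 x_1^{4/(d-1)}$, and $e^{\bar\phi} = e^{\phi}/a^d$ takes the same value at x and −x.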

1.5 Cosmological Perturbations

The phase of pre-big bang evolution, being accelerated, can amplify the quan-
tum fluctuations of the metric tensor (and of other background fields) just like
any other type of inflationary evolution. However, because of the kinematic
properties of pre-big bang inflation (associated to the shrinking of the Hubble
horizon $H^{-1}$), the spectral distribution of the metric fluctuations, after their
amplification, tends to grow with frequency [25]. This peculiar aspect of the
spectrum may be regarded as representing both an advantage and a difficulty
of pre-big bang models with respect to other models of inflation.
The advantage is of phenomenological nature, and refers to the transverse
and traceless tensor part of the metric fluctuations. Their amplification leads
to the formation of a stochastic background of relic gravitational waves whose
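spectral energy density, $\Omega_g$, grows with frequency:
$$ \Omega_g(\omega, t) = \Omega_r(t)\left(\frac{H_1}{M_P}\right)^2\left(\frac{\omega}{\omega_1}\right)^{\delta}, \qquad \delta > 0, \quad \omega \le \omega_1. \tag{67} $$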

Here $M_P = (8\pi G)^{-1/2} = \lambda_P^{-1}$ is the Planck mass, $H_1 \simeq M_s = \lambda_s^{-1}$ the inflation–radiation
transition scale (expected to be controlled by the string mass scale
Ms ), Ωr = ρr /ρc is the fraction of critical energy density in radiation, ω1 is
the ultraviolet cutoff (i.e., the maximal amplified frequency) of the spectrum,
and δ a model-dependent parameter depending on the background kinematics
[25, 26, 27] (see also the contribution of Buonanno and Ungarelli [28] to this
volume).
Thanks to the growth of the spectrum, the cosmic graviton background
present today as a relic of the inflationary epoch is higher at higher frequen-
cies (in particular, higher than the backgrounds predicted by conventional
models of inflation), and thus more easily detectable by current gravitational
antennas (see, e.g., [22]). Conversely, however, the spectrum is strongly sup-
pressed in the low-frequency regime: we should thus expect, in particular, a
negligible contribution of tensor metric perturbations to the observed CMB
anisotropy on large scales (as in the case of the ekpyrotic [29] and “new ekpyrotic” [30]
scenarios where, however, the gravitational background is expected
to be low even in the low-frequency regime [31]). It may be stressed, in this
connection, that the possible absence of tensor contributions at large scales

emerging from (planned) future measurements of the CMB polarization (such

as those of WMAP, PLANCK), in combination with a positive signal possibly
detected at high frequency by the next generation of gravitational antennas
(such as LIGO/VIRGO, LISA, BBO, and DECIGO), could represent a strong
experimental signal in favor of models of pre-big bang inflation (see, e.g., [32]).
The difficulties associated to a growing spectrum refer to the scalar part of
the metric perturbations. In fact, a growing scalar spectrum cannot account
for the observed peak structure of the temperature anisotropies of the CMB
radiation, which requires, instead, a nearly flat (or “scale-invariant”) primordial
distribution: $\Omega_s(\omega) \sim \omega^{n_s - 1}$, with $n_s \approx 1$. There are two possible ways
out of this problem.
A first possibility relies on the growth of the dilaton (and thus of the string
coupling $g_s^2 = \exp\phi$) during the phase of pre-big bang inflation. Even starting
at weak coupling, a pre-big bang background unavoidably evolves toward the
strong coupling regime gs ∼ 1. If the bounce is not immediate then the Uni-
verse, before the transition to the standard regime, enters a strong coupling
phase where higher-dimensional extended objects like Dirichlet branes and
antibranes [4] (whose tension is proportional to the inverse of the string cou-
pling) become light, and can be copiously produced [33]. The cosmic evolution
may become dominated by the presence of these higher-dimensional sources
[34] and, in that context, a phase of conventional slow-roll inflation can be
triggered by the interaction (and eventual collision) of a brane–antibrane pair
[35] (see also the contribution of Tye [36] to this volume). This new inflation-
ary regime may efficiently dilute all pre-existing inhomogeneities and generate
a new spectrum of scale-invariant, adiabatic scalar perturbations, as required
for a successful explanation of the observed anisotropy. This may resolve the
incompatibility between a (growing) spectrum of pre-big bang perturbations
and present large-scale observations.
There is, however, a second possibility which avoids introducing additional
inflationary epochs besides the initial dilaton-dominated one, and which is
based on the so-called “curvaton mechanism” [37]. According to this mecha-
nism the (flat, adiabatic) spectrum of scalar metric perturbations, responsible
for the observed anisotropies, is not produced during the primordial evolution:
instead, it is the outcome of the post-inflationary decay of a massive scalar
field (the curvaton), whose quantum fluctuations are amplified during inflation
with a nearly flat spectrum, and are converted into curvature perturbations
after its decay. In the context of the pre-big bang scenario the role of the
curvaton is possibly played by the Kalb–Ramond axion σ [38], associated—
by space–time duality—to the four components of the NS-NS two-form Bμν
present in the massless multiplet of the string spectrum.
For a brief discussion of this possibility we should explain, first of all,
why axion fluctuations can be amplified by pre-big bang inflation with a flat
spectrum [39], unlike metric fluctuations. The reason is that the slope of the
spectrum is directly related to the kinematic behavior of the effective “pump field”

responsible for the amplification, and that metric and axion fluctuations have
different pump fields, even in the same given background.
In order to clarify this point let us complete the low-energy action (1) by
adding the contribution of the antisymmetric field Bμν , considering (for sim-
plicity) a model already dimensionally reduced to four space–time dimensions:

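$$ S = -\frac{1}{2\lambda_s^2}\int_\Omega d^4x\,\sqrt{-g}\, e^{-\phi}\left[R + \partial_\mu\phi\,\partial^\mu\phi + V - \frac{1}{12} H_{\mu\nu\alpha}H^{\mu\nu\alpha}\right], $$
$$ H_{\mu\nu\alpha} = \partial_\mu B_{\nu\alpha} + \partial_\nu B_{\alpha\mu} + \partial_\alpha B_{\mu\nu}. \tag{68} $$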

In the absence of sources the equations of motion for Bμν are automatically
satisﬁed by introducing the “dual” axion ﬁeld σ, such that

$$ H^{\mu\nu\alpha} = \frac{e^{\phi}}{\sqrt{-g}}\,\epsilon^{\mu\nu\alpha\beta}\,\partial_\beta\sigma, \tag{69} $$

and the last term of the action (68) can be replaced by

$$ S = \frac{1}{4\lambda_s^2}\int d^4x\,\sqrt{-g}\, e^{\phi}(\nabla\sigma)^2. \tag{70} $$
Perturbing the metric and the axion ﬁeld,

$$ g_{\mu\nu} \to g_{\mu\nu} + h_{\mu\nu}, \qquad \sigma \to \sigma + \delta\sigma, \tag{71} $$

around a homogeneous, conformally ﬂat metric background, using the con-

formal time coordinate η (such that dt = adη), and applying the standard
formalism of linear cosmological perturbations (see, e.g., [40]), we obtain for
tensor metric and axion fluctuations, respectively, the following quadratic ac-
tions:

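$$ S_h = \frac{1}{2}\int d^3x\, d\eta\; z_h^2(\eta)\left(h'^2 + h\nabla^2 h\right), \qquad z_h = \frac{a}{\sqrt{2}\,\lambda_s}\, e^{-\phi/2}, \tag{72} $$

$$ S_\sigma = \frac{1}{2}\int d^3x\, d\eta\; z_\sigma^2(\eta)\left(\delta\sigma'^2 + \delta\sigma\nabla^2\delta\sigma\right), \qquad z_\sigma = \frac{a}{\sqrt{2}\,\lambda_s}\, e^{\phi/2}. \tag{73} $$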
Here h is one of the two physical polarization states of tensor perturbations,
the primes denote differentiation with respect to η, and ∇2 is the flat-space
Laplace operator, ∇2 = δ ij ∂i ∂j . The variation of these actions with respect
to h and δσ leads to the equations of motion, which can be written in terms
of the canonical variables u = (hzh ) and v = (δσzσ ) as follows:

$$ (h z_h)'' - \left(\nabla^2 + \frac{z_h''}{z_h}\right)(h z_h) = 0, \tag{74} $$

$$ (\delta\sigma z_\sigma)'' - \left(\nabla^2 + \frac{z_\sigma''}{z_\sigma}\right)(\delta\sigma z_\sigma) = 0. \tag{75} $$

The canonical equations are the same for u and v, but the pump fields, zh
and zσ , are different.
Consider, for instance, the axion equation (75), and recall that during
inflation the accelerated evolution of the pump field can be parametrized as
a power-law evolution in the negative range of the conformal-time parameter
[22, 40], i.e.,
$$ z_\sigma(\eta) = \frac{M_P}{\sqrt{2}}\left(-\frac{\eta}{\eta_1}\right)^{\alpha_\sigma}, \qquad -\infty \le \eta < 0, \tag{76} $$

where η1 > 0 is some appropriate reference timescale. Expanding in Fourier

modes, (75) becomes a Bessel equation for the mode vk ,

$$ v_k'' + \left[k^2 - \frac{\nu_\sigma^2 - 1/4}{\eta^2}\right]v_k = 0, \qquad \nu_\sigma = \frac{1}{2} - \alpha_\sigma, \tag{77} $$

and its general solution can be conveniently written as a combination of ﬁrst-

kind and second-kind Hankel functions [41], of argument kη and index νσ , as
follows:
$$ v_k = (-\eta)^{1/2}\left[A_+(k)\, H^{(2)}_{\nu_\sigma}(k\eta) + A_-(k)\, H^{(1)}_{\nu_\sigma}(k\eta)\right]. \tag{78} $$

We shall now canonically normalize this general solution by imposing that
the initial state of the fluctuations corresponds to a spectrum of quantum
vacuum fluctuations [22, 40]. More explicitly, we shall require that the mode
vk, on the initial spatial hypersurface at η → −∞, may represent freely oscil-
lating, positive frequency modes satisfying the canonical normalization

$$v_k\, v_k'^{*} - v_k'\, v_k^{*} = i, \tag{79}$$

from which
$$v_k \to \frac{e^{-ik\eta}}{\sqrt{2k}}, \qquad \eta \to -\infty \tag{80}$$
(modulo an arbitrary phase). Using the large argument limit of the Hankel
functions [41],

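$$H^{(2)}_\nu(k\eta) = \sqrt{\frac{2}{\pi k\eta}}\; e^{-ik\eta - i\epsilon_\nu}, \qquad H^{(1)}_\nu(k\eta) = \sqrt{\frac{2}{\pi k\eta}}\; e^{ik\eta + i\epsilon_\nu} \tag{81}$$

($\epsilon_\nu = -\pi/4 - \nu\pi/2$), we obtain $A_+ = \sqrt{\pi/4}$ and $A_- = 0$. The normalized
exact solution for the axion fluctuations δσk can finally be written as
$$\delta\sigma_k = \frac{v_k}{z_\sigma} = \frac{e^{i\theta_k}}{M_P}\left(\frac{\pi\eta_1}{2}\right)^{1/2}\left(\frac{\eta}{\eta_1}\right)^{\nu_\sigma} H^{(2)}_{\nu_\sigma}(k\eta), \tag{82}$$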

where θk is an arbitrary phase determined by the choice of the initial

conditions.
In order to determine the spectrum of the fluctuations after their inflation-
ary amplification, we must then consider the limit η → 0−, in which |kη| ≪ 1
and the amplitude of the mode k is stretched “outside the horizon.” We can
use, for this purpose, the small argument limit of the Hankel functions [41],
which reads (for ν ≠ 0),
$$H^{(2)}_\nu(k\eta) = p_\nu^*\,(k\eta)^{\nu} - i q_\nu\,(k\eta)^{-\nu} + \dots \tag{83}$$
where qν and pν are complex (ν-dependent) coefficients (for ν = 0 there are
additional logarithmic corrections). We obtain, in this limit,
$$\delta\sigma_k = \frac{v_k}{z_\sigma} \to \frac{e^{i\theta_k}}{M_P}\left(\frac{\pi\eta_1}{2}\right)^{1/2}\left[-i q_{\nu_\sigma}(k\eta_1)^{-\nu_\sigma} + p^*_{\nu_\sigma}(k\eta_1)^{\nu_\sigma}\left(\frac{\eta}{\eta_1}\right)^{2\nu_\sigma}\right]. \tag{84}$$

The cases we are interested in here are limited to “conventional” inflationary
backgrounds with ασ ≤ 1/2, i.e., νσ ≥ 0 (see [32] for a detailed discussion of
all possibilities). For such backgrounds the time dependence of δσ k tends to
disappear as η → 0− , the fluctuations become frozen, asymptotically, and their
(dimensionless) spectral amplitude k 3 |δσ k |2 , controlling the typical amplitude
of the perturbations on a comoving length scale r = k −1 [40], has the following
k-dependence:
$$k^3|\delta\sigma_k|^2 \sim k^{3-2\nu_\sigma} = k^{2+2\alpha_\sigma}. \tag{85}$$
This result also holds in the limiting case ασ = 1/2 with the only addition of
a mild logarithmic correction [26, 27], i.e., $k^3|\delta\sigma_k|^2 \sim k^3 \ln^2(k\eta_1)$.
The above calculations can be exactly repeated, in the same form, for
the tensor perturbation variable, starting from (74): the resulting spectrum is
formally the same,
$$k^3|h_k|^2 \sim k^{3-2\nu_h} = k^{2+2\alpha_h}, \tag{86}$$
with the difference that the spectral slope is now determined by the power
αh , controlling the evolution of the tensor pump field zh through an equation
analogous to (76).
We are now in the position of discussing the possible pre-big bang pro-
duction of a flat spectrum of axion fluctuations, even if the associated metric
fluctuations are amplified (in the same background) with a growing spectrum.
Let us consider, to this purpose, an exact anisotropic solution of the string
cosmology equations (9)–(11), in vacuum, and without dilaton potential. The
solution describes a phase of pre-big bang inflation characterized by the ac-
celerated (isotropic) expansion of three spatial dimensions, with scale factor
a(η), and by the accelerated contraction of n “internal” spatial dimensions,
with scale factors bi (η), i = 1, . . . , n. In conformal time, such a solution can
be parametrized for η → 0− as [13, 22]
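$$a = \left(-\frac{\eta}{\eta_1}\right)^{\beta_0/(1-\beta_0)}, \qquad b_i = \left(-\frac{\eta}{\eta_1}\right)^{\beta_i/(1-\beta_0)}, \qquad \phi_{4+n} = \frac{\sum_i \beta_i + 3\beta_0 - 1}{1-\beta_0}\,\ln\left(-\frac{\eta}{\eta_1}\right), \tag{87}$$
where the constant coefficients β0, βi satisfy the Kasner-like condition

$$\sum_i \beta_i^2 + 3\beta_0^2 = 1, \tag{88}$$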

and φ4+n is the higher-dimensional dilaton appearing in the full (4 + n)-
dimensional effective action. The four-dimensional dilaton φ is related to φ4+n
by
$$e^{-\phi} = V_n\, e^{-\phi_{4+n}} \equiv e^{-\phi_{4+n}} \prod_i b_i, \tag{89}$$
namely by
$$\phi = \phi_{4+n} - \sum_i \ln b_i = \frac{3\beta_0 - 1}{1-\beta_0}\,\ln\left(-\frac{\eta}{\eta_1}\right). \tag{90}$$
Let us compute, for this background, the kinematic powers αh and ασ
controlling the evolution of the pump fields (72) and (73):
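$$z_h \sim a\, e^{-\phi/2} \sim (-\eta)^{\alpha_h}, \qquad \alpha_h = \frac{1}{2}, \tag{91}$$
$$z_\sigma \sim a\, e^{\phi/2} \sim (-\eta)^{\alpha_\sigma}, \qquad \alpha_\sigma = \frac{5\beta_0 - 1}{2(1-\beta_0)}. \tag{92}$$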
It follows, according to (86), that the spectrum of tensor (as well as of scalar)
metric perturbations is always characterized by a slope which is cubic (mod-
ulo log corrections) [13, 26, 27], and which is also “universal,” in the sense
that it is insensitive to the background parameters (n, β0 , βi ). For the ax-
ion fluctuations, on the contrary, we find from (85) that the spectral slope is
strongly dependent on such parameters, and that a scale-invariant spectrum
with 2 + 2ασ = 0 is allowed, in particular, provided β0 = −1/3.
We may note, in the special case in which the background is fully isotropic
and expanding (i.e., β0 = βi < 0), that the Kasner condition (88) implies
β0 = −1/√d, so that a scale-invariant spectrum corresponds to d = 9, i.e.,
just to the number of spatial dimensions determined by critical superstring
theory [3, 4].
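The dependence of the two spectral slopes on the background can be checked numerically. The sketch below simply evaluates (92) and the exponents of (85) and (86); the function names are illustrative and do not appear in the original text, and exact rational arithmetic is used only to avoid rounding in the β0 = −1/3 check.

```python
from fractions import Fraction
import math


def alpha_sigma(beta0):
    """Axion pump-field power of Eq. (92): (5*beta0 - 1) / (2*(1 - beta0))."""
    return (5 * beta0 - 1) / (2 * (1 - beta0))


def axion_spectral_exponent(beta0):
    """Exponent of k in k^3 |delta sigma_k|^2 ~ k^(2 + 2*alpha_sigma), Eq. (85)."""
    return 2 + 2 * alpha_sigma(beta0)


def isotropic_beta0(d):
    """Fully isotropic, expanding Kasner solution in d spatial dimensions."""
    return -1.0 / math.sqrt(d)


# Scale-invariant axion spectrum for beta0 = -1/3 (exact arithmetic).
flat = axion_spectral_exponent(Fraction(-1, 3))

# Same check through the isotropic relation beta0 = -1/sqrt(d) with d = 9.
iso9 = axion_spectral_exponent(isotropic_beta0(9))

# Tensor slope from Eq. (91): alpha_h = 1/2 for every background, hence cubic.
tensor_exponent = 2 + 2 * Fraction(1, 2)
```

With these definitions `flat` vanishes identically, `iso9` vanishes up to floating-point rounding, and `tensor_exponent` equals 3 whatever the values of β0 and βi.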
In the less special case in which the spatial geometry can be factorized as
the product of a three-dimensional and an n-dimensional isotropic subspace
we have, instead, βi = β ≠ β0, and 3β0² + nβ² = 1. The spectral slope, in this
case, can be expressed in terms of the parameter
$$r = \frac{1}{2}\,\frac{\dot V_n}{V_n}\left(\frac{\dot V_3}{V_3}\right)^{-1} = \frac{n\beta}{6\beta_0}, \tag{93}$$

controlling the relative time evolution of the proper volumes of the internal
and external spaces. Eliminating β in terms of β0 through the Kasner condi-
tion, and replacing β0 with r in (92), one can then parametrize the deviations
from a flat axion spectrum as the relative shrinking or expansion of the two
subspaces [42].
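The elimination step can be made explicit. The following is a sketch of the algebra, assuming only (88), (92), and (93) with n internal dimensions and the expanding branch β0 < 0.

```latex
\begin{aligned}
\beta = \frac{6 r \beta_0}{n}, \qquad
3\beta_0^2 + n\beta^2 = 1
\;&\Longrightarrow\;
\beta_0 = -\left(3 + \frac{36\, r^2}{n}\right)^{-1/2},\\
2 + 2\alpha_\sigma &= \frac{1 + 3\beta_0}{1 - \beta_0}.
\end{aligned}
```

For n = 6 and r = 1 (that is, β = β0) this gives β0 = −1/3 and a vanishing exponent, recovering the flat spectrum of the fully isotropic case; any other value of r tilts the axion spectrum.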
Given a sufficiently flat spectrum of axion fluctuations, amplified by the
phase of pre-big bang inflation, we are then led to a post-big bang configura-
tion which is initially characterized (at some given time scale ηi ) by a primor-
dial sea of “isocurvature” scalar perturbations, dominated on super-horizon
scales by the axion fluctuations δσ (the metric fluctuations are subdominant
on such large scales, being strongly suppressed by the steep slope of their
spectrum). The axion can play the role of the curvaton provided that the ini-
tial configuration, besides containing the initial fluctuations δσ i , also contains
a non-vanishing axion background, σi ≠ 0, whose energy density ρσ, even
if subdominant, is initially determined by an appropriate potential (possibly
approximated by Vσ ∼ m2 σ 2 ). In that case the background evolution, after
an initial slow-roll regime, leads to a phase where the axion background starts
oscillating with proper frequency m, at a curvature scale H ∼ m, simulating
a dust fluid (ρσ ∼ a^{−3}) which may become dominant with respect to the
radiation fluid, and eventually decay at the typical scale H ∼ λ_P^2 m^3.
In such a type of background the axion fluctuations δσ become linearly
coupled to scalar metric perturbations, and may act as sources for the so-
called Bardeen potential Ψ . New metric perturbations can then be generated,
starting from Ψ (ηi ) = 0, with the same spectral slope as the axion one, and
with a spectral amplitude not smaller, in general, than the axion amplitude.
Referring to the literature for a detailed computation [37, 38], we shall recall
here that the final spectrum (after the axion decay) of the super-horizon
Bardeen potential is related to the initial axion perturbations by

|Ψk | = λP f (σi ) |δσ k (ηi )| ,

Ωσ
f (σi ) = c1 + c2 + c3 λP σi (94)
λP σ i

(the λP factors are due to the canonical normalization of the axion field and
of its fluctuations). Here σi is the initial amplitude of the axion background,
Ωσ ∼ 1 is the axion fraction of critical density at the axion decay epoch, and
c1 , c2 , c3 are dimensionless numbers of order one (Ωσ cannot be much smaller
than one, to avoid too strong a “non-Gaussianity” of the spectrum [43]).
Thanks to its structure, the “form factor” f (σi ) has a minimum of order one
around λP σi ∼ 1. A (nearly) scale-invariant axion spectrum thus reproduces
a (nearly) scale-invariant spectrum of scalar metric perturbations.
As discussed in the literature, a curvaton-induced spectrum of scalar met-
ric perturbations provides the right “adiabatic” initial conditions for repro-
ducing the observed temperature anisotropies of the CMB radiation, exactly
as in the case of the slow-roll scenario. The only difference is the “indirect”
(i.e., post-inflationary) production of the scalar spectrum, triggered by the
presence of a non-vanishing axion background. It must be stressed, however,
that the direct connection (94) with the axion spectrum of primordial origin
gives us the possibility of extracting, from present CMB observations, impor-
tant constraints on the parameters of pre-big bang models of inﬂation [38].
In particular, using the experimental normalization of the anisotropy spec-
trum, and the direct relation between the pre-big bang inﬂation scale H1 and
the string scale Ms , one can speculate about the possibility of “weighing the
string mass with the CMB data” [44]. Another application concerns the slope
of the scalar perturbation spectrum which, according to most recent WMAP
results [45], is given by

$$n_s \equiv 3 + 2\alpha_\sigma = 0.951^{+0.015}_{-0.019}. \tag{95}$$

Using (92), and the Kasner condition (88), one obtains

$$\beta_0 \simeq -0.355, \qquad \sum_i \beta_i^2 \simeq 0.62. \tag{96}$$

With d = 9 dynamical dimensions this result seems to point to the existence
of a small anisotropy between the kinematics of the external and internal
spaces during pre-big bang inflation (a fully isotropic expansion would corre-
spond, in fact, to β0 = −1/√9 ≃ −0.33 and Σ_i β_i^2 = 6/9 ≃ 0.66). It should
be noted, however, that other interpretations of the data are also possible.
For instance, the result (95) is also compatible with β0 = −1/√8 ≃ −0.3535,
describing the isotropic expansion of d = 8 spatial dimensions! Incidentally,
the number (and the kinematics) of the extra spatial dimensions play a crucial
role also in the possible production of primordial “seeds” for the large-scale
magnetic fields [46].
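The numbers quoted in (96) follow from inverting (95) with the help of (92) and (88). The sketch below reproduces that arithmetic; the function names are illustrative, and the closed form β0 = (ns − 2)/(ns + 2) is obtained here by solving ns = 3 + 2ασ for β0 rather than quoted from the text.

```python
import math


def beta0_from_ns(ns):
    """Invert n_s = 3 + 2*alpha_sigma, with alpha_sigma from Eq. (92)."""
    return (ns - 2.0) / (ns + 2.0)


def ns_from_beta0(beta0):
    """Forward relation: n_s = 3 + (5*beta0 - 1)/(1 - beta0) = 2*(1 + beta0)/(1 - beta0)."""
    return 2.0 * (1.0 + beta0) / (1.0 - beta0)


def internal_sum(beta0):
    """Sum over internal dimensions of beta_i^2 from the Kasner condition (88)."""
    return 1.0 - 3.0 * beta0 ** 2


beta0 = beta0_from_ns(0.951)            # central value of Eq. (95)
sum_bi2 = internal_sum(beta0)           # compare with Eq. (96)
ns_d8 = ns_from_beta0(-1.0 / math.sqrt(8))   # isotropic expansion, d = 8
ns_d9 = ns_from_beta0(-1.0 / 3.0)            # isotropic expansion, d = 9
```

The central value of (95) gives β0 close to −0.355 and a Kasner sum close to 0.62, while the isotropic d = 8 background yields a spectral index inside the quoted interval and d = 9 gives ns = 1.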
Finally, we should mention a possible non-Gaussian “contamination”
of the statistical properties of the anisotropy spectrum, possibly present in
curvaton models with Ωσ ≪ 1 [43] (see (94)). A possible detection of
non-Gaussianity, in future CMB measurements, could provide support for the
curvaton mechanism, and could be used for a direct discrimination between this
scenario and other, more standard scenarios based on slow-roll inflation.

2 The Relic Dilaton Background

The accelerated evolution of the Universe, during the phase of pre-big bang in-
flation, amplifies the quantum fluctuations of all fields present in the string ef-
fective action: thus, in particular, it amplifies the dilaton fluctuations, δφ ≡ χ.
The formation of a stochastic background of relic gravitational waves, asso-
ciated to the amplification of the tensor part of metric fluctuations, is thus
accompanied by the simultaneous formation of a cosmic background of relic
dilatons [47], whose primordial (high-energy) spectral distribution tends to
follow that of tensor metric perturbations [13].

There is, however, a possible important difference in the present
intensity of the two cosmic backgrounds, due to the fact that dilatons
(unlike gravitons) could become massive in the course of the standard
(post-inflationary) evolution. Actually, dilatons must become massive if they are
non-universally coupled to ordinary matter with gravitational strength (or
higher) [48, 49], to avoid the presence of long-range scalar forces which are
excluded by the standard gravitational phenomenology (in particular, by the
high-precision tests of the equivalence principle). The induced mass may dras-
tically modify the amplitude and the slope of the dilaton spectrum, in the
frequency band associated to its non-relativistic sector.
For a simple illustration of the effects of the mass on the spectrum we will
consider here the model of vacuum, dilaton-dominated pre-big bang back-
ground described by (57), smoothly joined at η = −η1 < 0 to the standard
radiation-dominated background with frozen dilaton, described by (52) (we
shall work in d = 3 spatial dimensions). Perturbing the background equa-
tions [13] one finds, in this case, that the dilaton pump field is the same field
zh ∼ a exp(−φ/2) governing the amplification of metric fluctuations. Taking
into account a possible mass contribution, m2 = ∂ 2 V /∂φ2 , one then obtains
for the Fourier modes χk the canonical equation:

$$(\chi z_h)_k'' + \left(k^2 + m^2 a^2 - \frac{z_h''}{z_h}\right)(\chi z_h)_k = 0. \tag{97}$$

During the initial pre-big bang regime the potential is negligible (m² = 0),
and the canonically normalized solution for χk is that of (82) (with νσ replaced
by νh). In the subsequent radiation-dominated era, φ stabilizes to a constant,
so that zh ∼ a ∼ η and the effective potential zh''/zh is vanishing. Assuming
that the dilaton mass is small enough in string units, and considering the
high-frequency sector of the spectrum, associated to the relativistic modes of
proper momentum p = (k/a) ≫ m, we can neglect also the mass term of (97),
to obtain the general solution
$$\chi_k = \frac{1}{a\sqrt{2k}}\left[c_+(k)\,e^{-ik\eta} + c_-(k)\,e^{ik\eta}\right], \qquad \eta \geq -\eta_1. \tag{98}$$
Matching χ and χ′ with the pre-big bang solution (82) at η = −η1, for super-horizon
modes with (kη1) ≪ 1, we are led to

$$c_\pm(k) = \pm c(k)\,e^{\mp ik\eta_1}, \qquad |c(k)| \sim (k\eta_1)^{-\nu_h - 1/2} \tag{99}$$

(modulo numerical factors with modulus of order 1). Thus, at large times
η ≫ η1,
$$\chi_k \sim \frac{c(k)}{a\sqrt{k}}\,\sin k\eta. \tag{100}$$
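The spectral energy density for the relativistic sector of the dilaton
background, in the radiation era, is then determined by
$$k\,\frac{d\rho_\chi}{dk} = \frac{k^3}{2a^2}\left(|\chi_k'|^2 + k^2|\chi_k|^2\right) \sim \left(\frac{k}{a}\right)^4 |c(k)|^2 \sim \left(\frac{k}{a}\right)^4\left(\frac{k}{k_1}\right)^{-2\nu_h - 1} = p^4\left(\frac{p}{p_1}\right)^{-2\nu_h - 1}, \tag{101}$$
where k1 ∼ η1^{−1} is the high-frequency cutoff scale. In units of critical energy
density, ρc = 3 M_P^2 H^2,
$$\Omega_\chi(p,t) = \frac{p}{\rho_c}\frac{d\rho_\chi}{dp} \sim \left(\frac{H_1}{M_P}\right)^2\left(\frac{H_1}{H}\right)^2\left(\frac{a_1}{a}\right)^4\left(\frac{p}{p_1}\right)^{\delta}, \qquad m < p < p_1, \tag{102}$$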
where we have defined the (model-dependent) slope parameter δ = 3 − 2νh > 0,
and we have introduced the (time-dependent) proper momentum associated
to the cutoff scale, p1 = k1/a = H1 a1/a, determined by the background
curvature scale H1 at the end of inflation. In general, (H1/H)^2 (a1/a)^4 ≡
ρr(t)/ρc(t) ≡ Ωr(t), and we may thus conclude that the relativistic sector of
the dilaton spectrum, in the radiation era, is exactly the same as the spectrum
of tensor metric perturbations (see (67)), in the same model of background.
However, even if the mass is small, and initially negligible, the proper
momentum p = k/a(t) is continuously redshifted with respect to m during
the subsequent cosmological evolution, so that all modes tend to become
non-relativistic, p < m. For non-relativistic modes the solution (98) is no longer
valid, and the correct spectrum must refer to the exact solutions of (97) with
m ≠ 0. In the radiation era such a solution can be given in terms of the Weber
cylinder functions [50], and one finds that the non-relativistic sector of the
spectrum splits into two branches, with different slopes: a first branch of modes
becoming non-relativistic at a timescale tnr when they are already inside the
horizon, with proper momentum p such that p(tnr) ∼ m ≫ H(tnr); and a
second branch of modes becoming non-relativistic when they are still outside
the horizon, with p(tnr) ∼ m ≪ H(tnr). The two branches are separated by
the momentum scale pm of the mode becoming non-relativistic just at the
time of horizon crossing, i.e., p(tnr) = m = H(tnr), and thus related to the
cutoff scale p1 by
$$\frac{p_m}{p_1} = \frac{m}{H_1}\,\frac{a_{\rm nr}}{a_1} = \frac{m}{H_1}\left(\frac{H_1}{H_{\rm nr}}\right)^{1/2} = \left(\frac{m}{H_1}\right)^{1/2}. \tag{103}$$
Without resorting to the explicit form of the massive solutions of (97), a
quick estimate of the non-relativistic spectrum can be obtained [51] by noting
that, if pnr > H(tnr), the number of produced dilatons is the same as in
the relativistic case, and the only effect of the non-relativistic transition is a
rescaling of the energy density, i.e.,

$$\Omega_\chi^{\rm rel} \to \Omega_\chi^{\rm nr} = \frac{m}{p}\,\Omega_\chi^{\rm rel}. \tag{104}$$
For this branch of the spectrum we then obtain, from (102),
$$\Omega_\chi(p,t) \sim \frac{m}{H_1}\left(\frac{H_1}{M_P}\right)^2\left(\frac{H_1}{H}\right)^2\left(\frac{a_1}{a}\right)^3\left(\frac{p}{p_1}\right)^{\delta-1}, \qquad p_m < p < m. \tag{105}$$
In the case pnr < Hnr, on the contrary, the slope of the spectrum
(determined by the background kinematics at the time of horizon exit) has
to be the same as that of the relativistic sector, while the time dependence
has to be the non-relativistic one (ρχ ∼ a^{−3}) of (105). Continuity with the
branch (105) at p = pm then gives
$$\Omega_\chi(p,t) \sim \left(\frac{m}{H_1}\right)^{1/2}\left(\frac{H_1}{M_P}\right)^2\left(\frac{H_1}{H}\right)^2\left(\frac{a_1}{a}\right)^3\left(\frac{p}{p_1}\right)^{\delta}, \qquad p_{\rm eq} < p < p_m. \tag{106}$$
The lower limit peq < p has been inserted here to recall that we are
neglecting the effects of the transition to the matter-dominated phase, i.e., we
are considering modes re-entering the horizon during the radiation era, with
p > peq = Heq ∼ 10^{−27} eV. We should recall, also, that the spectrum has been
computed in a radiation-dominated background, and thus is valid, strictly
speaking, only for t < teq.
The three branches (102), (105), and (106) describe the spectrum (between
peq and p1 ) of primordial dilatons produced in the simple example of “mini-
mal” pre-big bang model that we have considered. We refer to the literature
for a more detailed computation, for a discussion of its transmission to the
present epoch t0 , and for the possible modifications induced by generalized
background evolutions (see, e.g., [32]). For the pedagogical purpose of this
paper, this example provides a sufficiently clear illustration of the effects of
the mass on the spectrum: in particular, it clearly illustrates the enhance-
ment produced at lower frequencies because of the reduced spectral slope of
the branch (105), which may become even decreasing if δ < 1 (see Fig. 5).
In such a context one is naturally led to investigate whether this enhanced
intensity might favor the detection of a non-relativistic dilaton background,
with respect to other, relativistic types of cosmic radiation (such as the relic
graviton background).
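The three branches (102), (105), and (106) can be collected into a single piecewise function of p at fixed time. The sketch below is only illustrative: the function name and the parameter `amplitude` (standing for the common time-dependent prefactor of (102)) are assumptions of this example, the lower cutoff peq is ignored, and the non-relativistic branches are written as the relativistic one rescaled by m/p and m/pm, which is what (105) and (106) reduce to once p1 = H1 a1/a and pm = p1 (m/H1)^{1/2} are used.

```python
def dilaton_spectrum_shape(p, p1, m, pm, delta, amplitude=1.0):
    """Piecewise shape of Omega_chi(p) at fixed time, following (102), (105), (106).

    `amplitude` is a hypothetical stand-in for the time-dependent prefactor
    (H1/MP)^2 (H1/H)^2 (a1/a)^4 of Eq. (102).
    """
    if not (0.0 < pm < m < p1):
        raise ValueError("expected 0 < pm < m < p1")
    if not (0.0 < p <= p1):
        raise ValueError("p must lie in (0, p1]")
    relativistic = amplitude * (p / p1) ** delta   # Eq. (102), slope delta
    if p >= m:
        return relativistic
    if p >= pm:
        return relativistic * m / p                # Eq. (105), slope delta - 1
    return relativistic * m / pm                   # Eq. (106), slope delta


# Example with delta < 1: the middle branch decreases with p,
# so the spectrum is peaked at p = pm rather than at p1.
p1, m, pm, delta = 1.0, 1e-2, 1e-5, 0.5
peak = dilaton_spectrum_shape(pm, p1, m, pm, delta)
at_mass = dilaton_spectrum_shape(m, p1, m, pm, delta)
at_cutoff = dilaton_spectrum_shape(p1, p1, m, pm, delta)
```

By construction the function is continuous at p = m and at p = pm, mirroring the continuity argument used in the text to fix the normalization of (106).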

2.1 Light but Non-relativistic Dilatons

For a phenomenological discussion of this possibility we must start with two
important assumptions. The first is that the produced dilatons are light enough
to have survived until the present epoch. Supposing that massive dilatons have
a dominant decay mode into radiation (e.g., two photons), with gravitational
coupling strength, i.e., with a decay rate Γ ∼ λ_P^2 m^3, it follows that the primor-
dial dilaton background is still “alive” in the present Universe (characterized
by the timescale H_0^{−1}) provided H_0^{−1} < Γ^{−1}, i.e.,
$$m \lesssim 10^2\ \mathrm{MeV}. \tag{107}$$
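The order of magnitude of (107) can be recovered from Γ ∼ m³/M_P² < H0. The sketch below is a rough estimate under stated assumptions: the numerical values of H0 and of the (reduced) Planck mass are not given in the text and are inserted here by hand, and all order-one factors in the decay rate are ignored.

```python
# Assumed numerical inputs (not quoted in the text).
H0_EV = 1.5e-33        # present Hubble rate for h ~ 0.7, in eV
M_PLANCK_EV = 2.4e27   # reduced Planck mass in eV; 1.2e28 eV raises the bound by ~3


def max_surviving_mass_ev(h0_ev=H0_EV, m_planck_ev=M_PLANCK_EV):
    """Largest mass with Gamma = m^3 / M_P^2 below H0: m < (H0 * M_P^2)^(1/3)."""
    return (h0_ev * m_planck_ev ** 2) ** (1.0 / 3.0)


bound_mev = max_surviving_mass_ev() / 1.0e6
bound_mev_unreduced = max_surviving_mass_ev(m_planck_ev=1.2e28) / 1.0e6
```

Either choice of Planck mass gives a bound of a few tens of MeV, within an order of magnitude of the value quoted in (107).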
The second assumption we need is that the total energy density of the
dilaton background, integrated over all modes, turns out to be dominated by
its non-relativistic sector. Only in this case we can evade the stringent bound
imposed by the nucleosynthesis, which applies to the relativistic part of any
cosmic background of primordial origin.
The energy density of a relativistic background, in fact, evolves in time like
the radiation energy density, ρ^rel/ρ_rad = Ω^rel/Ω_rad = const: the present value
of their ratio is thus the same as the value of the ratio at the nucleosynthesis
epoch. To avoid disturbing the nuclear processes occurring at that epoch,
on the other hand, one must require that Ω^rel/Ω_rad ≲ 0.1 [52]. Using the
present value of Ω_rad, one is then led to the constraint Ω^rel(t0) ≲ 5 × 10^{−6},
which severely limits all relativistic primordial backgrounds.
In particular, it imposes an upper limit on the peak value of the graviton
background produced in models of pre-big bang inﬂation, thus determining
the minimal level of sensitivity required for its detection [22].
The energy density of a non-relativistic background, on the contrary,
evolves like the dark matter density, and grows in time with respect to the
radiation background: Ω nr /Ωrad ∼ a. As a consequence, the value of Ω nr can
be very large today, even if negligible at the nucleosynthesis epoch. The only
constraint we must apply, in this case, is the critical density bound,
$$h^2\,\Omega_\chi(t) = h^2\int^{p_1} d(\ln p)\;\Omega_\chi(p,t) < 1, \tag{108}$$

to be imposed at any time t, to avoid a Universe overdominated by such a
cosmic background of dust matter. Here h ≃ 0.73 is the present value of the
Hubble parameter H0 in units of 100 km s^{−1} Mpc^{−1}.
For the dilaton spectrum of (102)–(106) there are, in particular, two diﬀer-
ent cases in which the total energy density is dominated by the non-relativistic
modes. A ﬁrst (obvious) possibility is the case in which all modes of the spec-
trum are presently non-relativistic, namely p1 (t0 ) < m (in this case the branch
(105) extends from pm to p1 ). This implies, however, that
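$$m > \frac{H_1 a_1}{a_0} = H_1\,\frac{a_1}{a_{\rm eq}}\,\frac{a_{\rm eq}}{a_0} = H_1\left(\frac{H_{\rm eq}}{H_1}\right)^{1/2}\left(\frac{H_0}{H_{\rm eq}}\right)^{2/3} \simeq \left(\frac{H_1}{M_P}\right)^{1/2} 10^{-4}\ \mathrm{eV}. \tag{109}$$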
MP

For a typical string-inﬂation scale, H1 ∼ Ms , we obtain a lower limit on

m which is well compatible with the upper limit (107), but which requires
mass values too high to be compatible with the sensitivity band of present
gravitational antennas (see Sect. 2.2).
The second (more interesting) possibility is the case in which m < p1 (t0 ),
but the parameter δ is smaller than one, and the slope is ﬂat enough, so that
the spectrum is peaked not at p1 but at pm = p1 (m/H1 )1/2 (see Fig. 5).
In that case, the momentum integral (108) is dominated by the peak value
Ωχ (pm ), and the critical density bound can be approximated by the condition
Ωχ(pm, t0) ≲ 1. Using (105), and noting that in the matter-dominated era
(t > teq) the value of the non-relativistic spectrum stays frozen at the equality
value Ωχ(teq), we are led to the condition Ωχ(teq, pm) ≲ 1, which implies

$$m \lesssim \left(H_{\rm eq}\, M_P^4\, H_1^{\delta-4}\right)^{1/(\delta+1)}. \tag{110}$$

[Figure 5: log Ωχ plotted against p, with three branches: slope p^δ (non-relativistic, outside horizon) for p < pm; slope p^(δ−1) (non-relativistic, inside horizon) for pm < p < m; slope p^δ (relativistic) for m < p < p1.]

Fig. 5. Example of dilaton spectrum dominated by the non-relativistic sector. The
spectrum is peaked at p = pm, and the slope parameter satisfies the condition δ < 1

For H1 ∼ Ms, and δ → 0, this bound can be saturated by masses as small as
$$m \sim H_{\rm eq}\left(\frac{M_P}{M_s}\right)^4 \sim 10^{-23}\ \mathrm{eV}. \tag{111}$$
It is quite possible, therefore, to have a dilaton mass small enough to
fall within the sensitivity range of present gravitational detectors, even if the
energy density of the dilaton background is dominated by non-relativistic
modes (thus evading the relativistic upper bound Ω^rel ≲ 10^{−6}), and even
if the background intensity is large enough to saturate the critical density
bound, Ωχ ∼ 1.
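The critical-density bound on the mass can be evaluated for any slope parameter. The sketch below implements the inequality (110) as an upper limit on m; the function name, the default value of the Planck mass, and the choice Ms/MP = 0.1 used in the example are assumptions of this illustration, while Heq ∼ 10^{−27} eV is the value quoted in the text.

```python
def max_mass_critical_bound_ev(delta, h1_over_mp, h_eq_ev=1.0e-27, m_planck_ev=2.4e27):
    """Upper limit on m from Eq. (110), with every scale expressed in eV.

    (H_eq * M_P^4 * H_1^(delta-4))^(1/(delta+1)) is rewritten as
    (H_eq * (M_P/H_1)^4 * H_1^delta)^(1/(delta+1)) to keep numbers moderate.
    """
    if delta <= -1.0:
        raise ValueError("delta must be larger than -1")
    h1_ev = h1_over_mp * m_planck_ev
    exponent = 1.0 / (delta + 1.0)
    return (h_eq_ev * h1_over_mp ** -4 * h1_ev ** delta) ** exponent


# Limit delta -> 0 with H1 ~ Ms and an assumed ratio Ms/MP = 0.1.
m_flat_ev = max_mass_critical_bound_ev(delta=0.0, h1_over_mp=0.1)

# A steeper slope relaxes the bound considerably.
m_delta1_ev = max_mass_critical_bound_ev(delta=1.0, h1_over_mp=0.1)
```

For δ = 0 the result is independent of the assumed Planck mass and reproduces the 10^{−23} eV of (111); for larger δ the allowed mass grows rapidly.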
Such small mass values, however, are necessarily associated with long-range
dilaton forces: in particular, if the mass satisfies the condition m < p1(t0) ∼
(Ms/MP)^{1/2} 10^{−4} eV (as in the example illustrated in Fig. 5), the corresponding
force has a range exceeding the centimeter. This might imply macroscopic
violations of the equivalence principle (due to the non-universality of the dila-
ton coupling [48]), and macroscopic deviations from the standard Newtonian
form of the low-energy gravitational interactions (which seem to be excluded,
however, by present experimental results [53, 54]).
We should recall, in fact, that in the presence of long-range dilaton fields
the motion of a macroscopic test body with non-zero dilaton charge is no
longer described by a geodesic. There are forces on the test body due to
the gradients of the dilaton field, according to the generalized conservation
equation
$$\nabla_\nu T_\mu{}^{\nu} = \frac{\sigma}{2}\,\nabla_\mu\phi, \tag{112}$$
following from the application of the contracted Bianchi identity to the gravi-
dilaton equations (3) and (101). The integration of this conservation equation
over a (space-like) t = const hypersurface then gives, in the point particle (or
monopole) approximation, the non-geodesic equation of motion [55]

$$\frac{du^\mu}{d\tau} + \Gamma_{\alpha\beta}{}^{\mu}\, u^\alpha u^\beta = q\,\nabla^\mu\phi, \tag{113}$$
where q is a dimensionless ratio representing the relative intensity of scalar to
tensor forces (i.e., the effective dilaton charge per unit of gravitational mass
of the test body).
For the fundamental components of macroscopic matter, such as quark and
lepton fields, the value of q (or of the charge density σ) is to be determined
from an effective action which includes all relevant dilaton loop corrections
[48, 13], and which is of the form

$$S = \frac{1}{2\lambda_s^2}\int d^4x\,\sqrt{-g}\,\Big[-Z_R(\phi)\,R - Z_\phi(\phi)(\nabla\phi)^2 - V(\phi) + Z_k^i(\phi)(\nabla\psi_i)^2 - M_i^2\, Z_m^i(\phi)\,\psi_i^2\Big]. \tag{114}$$

Here we have used, for simplicity, a scalar model of matter fields ψi , and we
have called Z the dilaton “form factors” arising from the loop corrections.
The effective dilaton charge, therefore, turns out to be frame-dependent (the
charge q appearing in (113), for instance, is referred to the S-frame action
and to the S-frame equations (112)). The reason of such a frame dependence
is that, in a generic frame, the metric and the dilaton fields are non-trivially
mixed through the ZR and Zφ coupling functions, so that the associated
dilaton charge actually controls the matter coupling not to the pure scalar
part, but to a mixture of scalar and tensor part of the gravi-dilaton field.
A frame-independent and unambiguous definition of the dilaton coupling
strengths can be given, however, in the canonically rescaled Einstein frame
(E-frame), where the full kinetic part of the action (114) (including the matter
and gravi-dilaton sector) is diagonalized in terms of the canonically normal-
ized fields gμν , φ and ψi [13]. Assuming that the dilaton is stabilized by its
potential, and expanding the Lagrangian term describing the interaction be-
tween φ and ψi around the value φ0 which extremizes the potential, we can
define, in this rescaled frame, the effective masses m i and charges qi for the
canonical fields ψi. In the weak coupling limit in which ZR ≃ Zφ ≃ exp(−φ)
one then finds, in particular, that the canonical dilaton charge qi deviates
from the standard “gravitational charge” by the dimensionless factor [13]

i
qi ∂ Zm
qi ≡ √ 1+ ln . (115)
4πG m i ∂φ Zki φ=φ0
For a pure Brans–Dicke model of scalar–tensor gravity one has, for

instance, q i = 1 (because there is no dilaton coupling to the matter fields
in the Jordan frame, where ∂Z i /∂φ = 0). For a string model, on the contrary,
the coupling parameters q i deviate from 1 and are non-universal, in general,
since the loop form factors Z i tend to be different for different fields ψi . In
particular, in the conventional scenario which assumes that the loop correc-
tions determining the coupling are the same determining also the effective
mass of the given particle, one obtains large dilaton charges (q i ∼ 50) for
the confinement-generated components of the hadronic masses [48, 49], and
smaller charges (q i ∼ 1) for the leptonic components. In that case, the total
dilaton charge of a macroscopic body tends to be large (in gravitational units)
and composition-dependent [55], so that a large dilaton mass (m ≳ 10^{−4} eV)
is required to avoid conflicting with known gravitational phenomenology.
This conclusion can be avoided if the loop corrections combine to produce
a cancellation, in such a way that the value of the coupling parameters q i
turns out to be highly suppressed with respect to the natural value of order
one (a scenario of this type has been proposed, for instance, in [56]). In that
case q i ≪ 1, and light dilaton masses (as required, for instance, for a resonant
interaction with gravitational antennas) may be allowed, without clashing
with experimental observations.
In the rest of this section we will focus our attention on this possibility,
considering the response of the gravitational detectors to a cosmic background
of massive, non-relativistic dilatons, assuming that the background energy
density corresponds to a large fraction of critical density, and that the dilatons
are arbitrarily light and very weakly coupled to ordinary matter.

2.2 Dilaton Signals in Gravitational Antennas

The operation mechanism of all gravitational antennas is based on the
so-called equation of “geodesic deviation” (see, e.g., [57]), which governs the
response of the detector to the incident radiation. Such an equation is obtained
by computing the relative acceleration between the world lines of two nearby
test particles, separated by the infinitesimal space-like vector η μ , and evolving
geodesically in the given gravitational background. The interaction with a
dilaton background can be easily included, in this context, by replacing the
geodesic paths of the test particles with the world lines described by (113):
one is led, in this way, to a generalized equation of deviation [55],
$$\frac{D^2\eta^\mu}{D\tau^2} + R_{\nu\alpha\beta}{}^{\mu}\,\eta^\nu u^\alpha u^\beta = q\,\eta^\nu\,\nabla_\nu\nabla^\mu\phi, \tag{116}$$
which underlies the response of a detector to a background of gravi-dilaton
radiation (the symbol D denotes covariant differentiation along a curve
parametrized by the affine time-like variable τ).
This equation implies that a gravitational detector can interact with the
scalar radiation in two ways: either
(i) directly, through the non-geodesic coupling of its scalar charge to the
second derivatives of the scalar background [55, 58]; or
(ii) indirectly, through the geodesic coupling of its gravitational charge to the
scalar part of the metric ﬂuctuations induced by the dilaton, and contained
inside the Riemann tensor [59].
For a precise discussion of the response of the detector we need to compute
the “physical strain” h(t) induced by the scalar radiation, which is expressed
in terms of the so-called antenna pattern functions F (θ, φ), describing the
detector sensitivity along the different angular directions. To this purpose, we
shall rewrite (116) in the approximation of small displacements ξ^μ around the
unperturbed path of the test bodies, by setting η^μ = L^μ + ξ^μ(τ), with L^μ =
const. We then obtain, in the non-relativistic limit,

$$\ddot\xi^{\,i} = -L^k M_k{}^{i}, \tag{117}$$

where
$$M_k{}^{i} = R_{k00}{}^{i} + q\,\partial_k\partial^i\phi \tag{118}$$
is the total (scalar–tensor) stress tensor describing the “tidal” forces due to
the incident radiation. For the pedagogical purpose of this paper we shall
assume that the tensor (i.e., gravity wave) part of the radiation is absent, and
that the scalar radiation can be simply described as a linear ﬂuctuation of the
Minkowski metric background ημν and of a constant dilaton background φ0 :
thus, in the longitudinal gauge,

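$$ds^2 = (\eta_{\mu\nu} + \delta g_{\mu\nu})\,dx^\mu dx^\nu = (1 + 2\psi)\,dt^2 - (1 - 2\varphi)\,\delta_{ij}\,dx^i dx^j, \qquad \phi = \phi_0 + \chi, \tag{119}$$

so that
$$M_{ij} = \partial_i\partial_j\varphi - \delta_{ij}\,\ddot\psi - q\,\partial_i\partial_j\chi. \tag{120}$$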
To discuss the detection of a stochastic background of massive scalar ra-
diation, it is also convenient to expand the fluctuations in Fourier modes of
proper momentum $\vec p = p\,\hat n$ and frequency ν = E(p) = (p² + m²)^{1/2}, where
the unit vector $\hat n$ specifies the propagation direction of the given mode on the
angular two sphere Ω2. We obtain

$$M_{ij} = \frac{1}{2}\int_{-\infty}^{\infty} dp \int_{\Omega_2} d^2\hat n\;(2\pi E)^2\left[\delta_{ij}\,\psi(p,\hat n) - n_i n_j\,\varphi(p,\hat n) + \frac{m^2}{E^2}\,n_i n_j\,\varphi(p,\hat n) + q\,\frac{p^2}{E^2}\,n_i n_j\,\chi(p,\hat n)\right] e^{2\pi i(p\,\hat n\cdot\vec x - Et)} + \mathrm{h.c.} \tag{121}$$

(note that we are using “unconventional” units in which h = 1, i.e., ħ = 1/2π,
for an easier comparison with the experimental variables). We will also assume
that the dilaton is the only source of scalar metric perturbations, so that
ϕ = ψ [40]. Introducing the transverse and longitudinal projectors of the
scalar stresses, defined, respectively, by

Tij = δij − ni nj , Lij = ni nj , (122)

defining Mij = −F̈ij , and projecting the stress tensor onto the detector tensor
Dij (specifying the geometric configuration and the orientation of the arms
of the detector), we finally obtain the scalar strain as [58, 60, 61]

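$$h(t) \equiv D^{ij}F_{ij} = \frac{1}{2}\int_{-\infty}^{\infty} dp \int_{\Omega_2} d^2\hat n\,\left[F^{\rm geo}(\hat n)\,\psi(p,\hat n) + F^{\rm ng}(\hat n)\,\chi(p,\hat n)\right] e^{2\pi i(p\,\hat n\cdot\vec x - Et)} + \mathrm{h.c.} \tag{123}$$

Here

$$F^{\rm geo} = D^{ij}\left(T_{ij} + \frac{m^2}{E^2}\,L_{ij}\right), \tag{124}$$
$$F^{\rm ng} = q\,\frac{p^2}{E^2}\,D^{ij}L_{ij}, \tag{125}$$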
are the antenna pattern functions corresponding, respectively, to the geodesic
(or indirect) and non-geodesic (or direct) interaction of the detector with the
scalar radiation background.
It should be noted that the scalar radiation, differently from the case of
the tensor component, contributes to the response of the detector also with
its longitudinal polarization states. The longitudinal contribution is present
also in the ultra-relativistic limit m → 0, p → E, thanks to the non-geodesic
coupling (125). In the opposite, non-relativistic limit p → 0, E → m, the
geodesic strain tends to become isotropic, Tij + (m/E)2 Lij → δij , while the
non-geodesic one becomes sub-leading.
The result (123) is valid for any type of detector described by the
response tensor Dij, and is formally similar to the expression for the strain
obtained in the case of tensor gravitational radiation—modulo the presence
of different pattern functions, due to the different polarization properties. The
scalar strain (123) can thus be processed, following the standard procedure,
to correlate the outputs of two detectors and to extract the so-called signal-
to-noise ratio (SNR), representing the experimentally relevant variable for the
detection of a stochastic background of cosmic radiation [62].
For our scalar massive background, with spectral energy density Ω(p), we
obtain [58, 60, 61], in particular,
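\[
\mathrm{SNR} = \frac{3 N H_0^2}{8\pi^3}\,\sqrt{2T}
\left[ \int_0^{\infty} dp\,
\frac{\gamma^2(p)\,\Omega^2(p)}
{p^3 (p^2+m^2)^{3/2}\, P_1\!\left(\sqrt{p^2+m^2}\right) P_2\!\left(\sqrt{p^2+m^2}\right)}
\right]^{1/2} \tag{126}
\]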

(see also [32] for a detailed computation). Here T is the total (experimental)
correlation time, N an (irrelevant) normalization factor, P1 and P2 the noise
power spectra of the two detectors, and γ(p) the so-called overlap reduction
function, which modulates the correlated signal according to the relative ori-
entation and distance of the detectors, located at the positions x1 and x2 :

\[
\gamma(p) = \frac{1}{N}\int_{\Omega_2} d^2 n\, F_1(\hat n)\, F_2(\hat n)\,
e^{2\pi i p\,\hat n\cdot(x_1 - x_2)}. \tag{127}
\]

The overlap is to be calculated with the geodesic pattern function F_i^geo of
(124) if we are considering the indirect signal due to a spectrum of scalar
metric fluctuations, Ωψ(p); it is to be calculated with the non-geodesic pattern
function F_i^ng of (125) if we are considering, instead, the direct signal due to
a spectrum of dilaton fluctuations, Ωχ(p).
We are now in the position of stressing another important difference from
the case of pure tensor radiation, due to the presence of the mass in the
noise power spectra Pi . For a typical power spectrum, in fact, the minimum
level of noise is reached around a rather narrow frequency band ν0 : outside
that band the noise rapidly diverges, and the signal (126) tends to zero. As
ν = (p2 + m2 )1/2 we have, in principle, three possibilities.
(1) If m ≫ ν0 then the noise is always outside the sensitivity band Pi(ν0), and
the signal is always negligible.
(2) If m ≪ ν0 then the sensitivity band may only overlap with the relativistic
sector of the spectrum, for p ∼ ν ∼ ν0.
(3) If m ∼ ν0, finally, the whole non-relativistic part of the spectrum p ≲ m
satisfies the condition Pi(ν) ∼ Pi(m) ∼ Pi(ν0).
It is thus possible to obtain a resonant response to a massive, non-relativistic
background of scalar particles, provided the mass lies in the band of maximal
sensitivity of the two detectors [58, 60]. Considering the present, Earth-based
gravitational antennas, operating between the hertz and the kilohertz range,
it follows that the maximal sensitivity is presently in the mass range

\[
10^{-15}\ \mathrm{eV} \lesssim m \lesssim 10^{-12}\ \mathrm{eV}. \tag{128}
\]
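The mass window quoted above is the hertz-to-kilohertz band rewritten through m = hν. A minimal sketch of that conversion follows; the function name and the choice of 1 Hz and 1 kHz as band edges are illustrative assumptions, and the Planck constant is the standard value in eV s.

```python
# Planck constant in eV*s (the text uses units with h = 1, so m and nu coincide).
H_PLANCK_EV_S = 4.135667696e-15


def mass_from_frequency(nu_hz: float) -> float:
    """Return the mass (in eV) whose rest energy equals h * nu."""
    return H_PLANCK_EV_S * nu_hz


if __name__ == "__main__":
    # Assumed band edges: 1 Hz and 1 kHz, as quoted in the text.
    for nu_hz in (1.0, 1.0e3):
        print(f"nu = {nu_hz:7.1f} Hz  ->  m = {mass_from_frequency(nu_hz):.1e} eV")
```

Both edges land within a factor of a few of the bounds in the mass range, which is all an order-of-magnitude estimate of this kind can claim.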

Amusingly enough, it turns out that such small values are not so unrealistic
if the dilaton mass is perturbatively generated by the mechanism of radiative
corrections. For a scalar particle, gravitationally coupled to fermions of mass
Mf with dimensionless strength q, there are, in fact, quantum loop correc-
tions to the mass of order qMf (Λ/MP ), where Λ is the cutoﬀ, which we shall
assume typically localized at the tera electron volt scale (see, for instance,
[63]). Considering the dilaton coupling to ordinary baryonic matter (Mf ∼ 1
GeV) the induced mass is then

\[
m \sim q \left( \frac{\Lambda}{1\ \mathrm{TeV}} \right)
\left( \frac{M_f}{1\ \mathrm{GeV}} \right) \times 10^{-6}\ \mathrm{eV}. \tag{129}
\]

Thus, a value of q smaller than (but not very far from) the present upper
limits [53] (which impose q ≲ 10^{-4} in the relevant mass range (128)) is perfectly
compatible with the possibility of a resonant response of the present detectors.
Quite independently from the possible origin of the dilaton mass, if we
assume that the mass is in the resonant range (128), and that the bounds on
q are satisfied, we find that a cosmic background of non-relativistic dilatons
is possibly detectable by the interferometric antennas of second generation—
such as Advanced and Enhanced LIGO—provided the background energy
density is sufficiently close to the saturation of the critical density bound
[58, 60]. This interesting possibility can be illustrated by considering, for an
approximate estimate, the simplified situation of two identical detectors with
P1 = P2 = P, responding non-geodesically with maximal allowed overlap
Nγ^ng ≃ q^2 (4π/15) (the numerical factor refers to the particular case of
interferometric antennas). Let us suppose, also, that the SNR integral (126) is
dominated by the peak value Ωm of the non-relativistic dilaton spectrum, and
that such value is reached around p = m (otherwise the response is suppressed
by the factor (p/m)^4 [60]). Equation (126) gives, in this case,
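\[
\mathrm{SNR} \sim \frac{\sqrt{2T}\, q^2 H_0^2\, \Omega_m}{10\pi^2\, m^{5/2} P(m)}, \tag{130}
\]

and the condition of detectable background (SNR ≳ 1) implies

\[
m^{5/2} P(m) \lesssim q^2 h^2 \Omega_m
\left( \frac{T}{4\times 10^{7}\ \mathrm{s}} \right)^{1/2}
\times 10^{-33}\ \mathrm{Hz}^{3/2}. \tag{131}
\]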
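The numerical coefficient in the detectability condition can be checked directly from the SNR estimate. The sketch below assumes the standard conversion H0 = h × 100 km/s/Mpc with 1 Mpc = 3.0857 × 10^22 m; the function names are illustrative, not taken from the cited references.

```python
import math

# 100 km/s/Mpc in s^-1, so that H0 = h * H100 (standard conversion, 1 Mpc = 3.0857e22 m).
H100 = 1.0e5 / 3.0857e22


def overlap_prefactor():
    """Prefactor of (130): 3/(8 pi^3) from (126) times 4 pi/15 from N * gamma^ng / q^2."""
    return 3.0 / (8.0 * math.pi**3) * (4.0 * math.pi / 15.0)


def detectability_bound(T_s, q2_h2_omega_m):
    """Upper limit on m^(5/2) P(m) in Hz^(3/2), obtained by imposing SNR >= 1 in (130)."""
    return overlap_prefactor() * math.sqrt(2.0 * T_s) * H100**2 * q2_h2_omega_m


if __name__ == "__main__":
    print(f"prefactor * 10 pi^2 = {overlap_prefactor() * 10.0 * math.pi**2:.6f}")
    print(f"bound for T = 4e7 s, q^2 h^2 Omega_m = 1: {detectability_bound(4.0e7, 1.0):.2e} Hz^(3/2)")
```

For T = 4 × 10^7 s the bound comes out close to 10^-33 Hz^(3/2) per unit q^2 h^2 Ωm, in line with the quoted condition.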
This condition is compared in Fig. 6 with the expected spectral noise of
the three LIGO generations (see, e.g., [64]), for T = 4 × 107 s. The region of
the plane {m, P } corresponding to a detectable background is located above
the bold noise curves (labeled by LIGO I, LIGO II, and LIGO III), and be-
low the dashed lines, representing the upper limit (131) for diﬀerent constant
values of the parameter q 2 h2 Ωm . This limit may be interpreted either as a
constraint on the intensity Ωm , for backgrounds geodesically coupled (q 2 = 1)
to the detectors, or as a limit on the non-geodesic coupling strength q 2 , for
backgrounds of given energy density Ωm . As shown in the picture, phenomeno-
logically allowed backgrounds are in principle accessible to the sensitivity of
next-generation interferometers (see also [32] for a more detailed discussion).

Fig. 6. The noise power spectra of the three LIGO generations (bold curves), and
the condition of detectable dilaton background (dashed lines), plotted at different
values of the parameter q^2 h^2 Ωm (ranging from 10^{-5} to 10^{-11}).
[Plot: Log(P/Hz^{-1}) versus Log(m/Hz), with curves labeled LIGO I, LIGO II, LIGO III.]

2.3 Enhanced Signals for Flat Non-relativistic Spectra

The result reported in (130) is generally valid for a growing spectrum with a
steep enough slope, as typically obtained in “minimal” models of pre-big bang
inflation. However, the cross-correlated signal may be strongly enhanced
with respect to (130) if the dilaton spectrum is sufficiently flat, and if the
considered pair of detectors satisfies the condition γ(p) → const ≠ 0 for p → 0.
Let us consider, in fact, the SNR integral (126), which can be written as

\[
(\mathrm{SNR})^2 \sim T \int_0^{p_1} dp\,
\frac{\gamma^2(p)\,\Omega^2(p)}{p^3 E^3 P_1(E) P_2(E)}, \tag{132}
\]
where E = (p2 + m2 )1/2 , and where we can assume that Ω(p) is a power-law
function of p, with an ultraviolet cutoff at p = p1 . For a massless spectrum
(p = E), this integral is always convergent (for any slope), even in the infrared
limit p = E → 0: in fact, when p → 0, the physical strains are produced
outside the sensitivity band of the detectors, and the noises blow up to infinity,
Pi(E) → Pi(0) → ∞. For m ≠ 0, on the contrary, in the infrared limit p → 0
the noises keep frozen at the frequency scale determined by the mass of the
scalar background, Pi(E) → Pi(m) = const. In this second case, the behavior
of the integral depends on γ(p) and Ω(p).
Suppose now that γ(p) → γ0 = const for p → 0, and that Ω(p) ∼ p^δ, for
p < m. For δ < 1 we find that the integral is dominated by the infrared limit,
and gives
\[
(\mathrm{SNR})^2 \sim \frac{T \gamma_0^2}{m^3 P_1(m) P_2(m)} \int_0^m \frac{dp}{p^3}\, \Omega^2(p)
= \frac{T \gamma_0^2}{m^3 P_1 P_2} \left[ p^{2(\delta-1)} \right]_0^m . \tag{133}
\]

Thus, the integral is infrared divergent [65] for all spectra (even if blue, δ > 0)
with δ < 1 !
This divergence is obviously unphysical, and can be removed by noting
that the observation time T is finite, and is thus associated to a minimum
resolvable frequency interval Δν = ΔE = Δ(p^2/2m) ≳ T^{-1}, defining the
minimum momentum scale
\[
p_{\min} = (2m/T)^{1/2} > 0, \tag{134}
\]
acting as effective infrared cutoff for the integral (133). This implies a modified
dependence of SNR on the correlation time T in the case of flat enough spectra:
\[
\mathrm{SNR} \sim T^{1/2} \left[ p^{\delta-1} \right]_{p_{\min}}^{m} \sim
\begin{cases}
T^{1/2}, & \delta > 1,\\
T^{1-\delta/2}, & \delta < 1.
\end{cases} \tag{135}
\]
For δ < 1, in particular, there is a faster growth of SNR with T , which may
produce an important enhancement of the sensitivity to a cosmic background
of non-relativistic scalar particles, as discussed in [61, 65].
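The two scaling laws for SNR versus T can be checked numerically. The sketch below keeps only the T-dependent part of the infrared integral, with Ω(p) = p^δ and the cutoff p_min = (2m/T)^{1/2}; the mass value m = 100 Hz and the function names are assumptions made purely for illustration, and δ = 1 is excluded because the closed form used here is singular there.

```python
import math


def snr(T, delta, m=100.0):
    """T-dependent part of SNR: sqrt(T * integral of p**(2*delta - 3) from p_min to m)."""
    p_min = math.sqrt(2.0 * m / T)          # infrared cutoff set by the observation time
    k = 2.0 * delta - 2.0                   # exponent left after integrating p**(2*delta - 3)
    integral = (m**k - p_min**k) / k        # closed form, valid for delta != 1
    return math.sqrt(T * integral)


def log_slope(delta, T1=1.0e7, T2=1.0e8):
    """Effective exponent d ln(SNR) / d ln(T) between two observation times (in seconds)."""
    return math.log(snr(T2, delta) / snr(T1, delta)) / math.log(T2 / T1)


if __name__ == "__main__":
    for delta in (0.0, 0.5, 1.5):
        print(f"delta = {delta}: slope = {log_slope(delta):.4f}")
```

For δ > 1 the slope stays at the standard value 1/2, while for δ < 1 it approaches 1 − δ/2, which is the faster growth discussed above.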
It is important to stress that the case γ(p) → γ0 = const for p → 0 has not
been “invented” ad hoc: it can be implemented, in practice, with detectors
already existing and operative (or with detectors planned to be working in
the near future, like resonant spheres). A first simple example, studied in [65],
refers in fact to spherical, resonant-mass detectors, whose monopole mode
is characterized by the “trivial” response tensor Dij = δ ij . In that case the
geodesic pattern function (124) is isotropic,
\[
F^{\rm geo} = \frac{2p^2 + 3m^2}{p^2 + m^2}, \tag{136}
\]
and the geodesic overlap function (127), for two identical spheres with spatial
separation |x1 − x2 | = d, is given by
\[
\gamma(p) = \frac{2}{N} \left( \frac{2p^2 + 3m^2}{p^2 + m^2} \right)^2 \frac{\sin(2\pi p d)}{p d}. \tag{137}
\]
This function clearly satisfies the requirement γ(p) → γ0 = const for p → 0.
A second example, studied in [61], refers to the so-called common mode of
the interferometric antennas, characterized by the response tensor
\[
D_+^{ij} = u^i u^j + v^i v^j, \tag{138}
\]
where u^i and v^i are the unit vectors specifying the spatial orientation of
the axes of the interferometer. Let us consider, for instance, a geometrical
configuration where the vectors u and v are coaligned with the x1 and x2 axes
of a Cartesian frame, respectively, and the direction n of the incident radiation
is specified (with respect to the axes x1, x2 and x3) by the azimuthal and polar
angles ϕ and θ. The computation of the geodesic pattern function (124) gives,
in that case,
\[
F_+^{\rm geo} = 2 - \left( \frac{p}{E} \right)^2 \sin^2\theta. \tag{139}
\]
The geodesic overlap function (127), for two coplanar interferometers with
spatial separation |Δx| = d, is [61]
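\[
\gamma_+^{\rm geo}(p) = \frac{4\pi}{N} \left[
\left( 4 - 4\frac{p^2}{E^2} + \frac{p^4}{E^4} \right) j_0(\alpha)
+ \frac{1}{\alpha} \left( 4\frac{p^2}{E^2} - 2\frac{p^4}{E^4} \right) j_1(\alpha)
+ \frac{3}{\alpha^2} \left( \frac{p}{E} \right)^4 j_2(\alpha) \right], \tag{140}
\]

where α = 2πpd, and j0, j1, j2 are spherical Bessel functions. Thus, also in
this case, γ → 16π/N = const for p → 0.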
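The common-mode results can be cross-checked by brute force. The sketch below assumes arms along x1 and x2 (so that the response tensor has unit entries on the first two diagonal slots), a detector separation lying along x1, and β = (p/E)^2; it evaluates the pattern function from its definition (124) and compares a direct angular integration of (127) with the closed form (140). The function names and the quadrature settings are illustrative choices.

```python
import math

# Common-mode response tensor for arms along x1 and x2 (assumed configuration).
D_PLUS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]


def pattern_geo(D, n, beta):
    """F^geo = D^{ij} [T_ij + (m/E)^2 L_ij], with beta = (p/E)^2 = 1 - (m/E)^2."""
    total = 0.0
    for i in range(3):
        for j in range(3):
            L_ij = n[i] * n[j]
            T_ij = (1.0 if i == j else 0.0) - L_ij
            total += D[i][j] * (T_ij + (1.0 - beta) * L_ij)
    return total


def overlap_numeric(beta, alpha, n_mu=400, n_phi=64):
    """N * gamma(p): Simpson rule in cos(theta), uniform rule in the azimuth."""
    h = 2.0 / n_mu
    total = 0.0
    for a in range(n_mu + 1):
        mu = -1.0 + a * h
        weight = 1.0 if a in (0, n_mu) else (4.0 if a % 2 else 2.0)
        sin_t = math.sqrt(max(0.0, 1.0 - mu * mu))
        ring = 0.0
        for b in range(n_phi):
            phi = 2.0 * math.pi * b / n_phi
            n = (sin_t * math.cos(phi), sin_t * math.sin(phi), mu)
            # Separation assumed along x1; the imaginary part cancels by symmetry.
            ring += pattern_geo(D_PLUS, n, beta) ** 2 * math.cos(alpha * n[0])
        total += weight * ring * (2.0 * math.pi / n_phi)
    return total * h / 3.0


def overlap_closed(beta, alpha):
    """N * gamma(p) from the closed form (140), with alpha = 2 pi p d."""
    j0 = math.sin(alpha) / alpha
    j1 = math.sin(alpha) / alpha**2 - math.cos(alpha) / alpha
    j2 = (3.0 / alpha**2 - 1.0) * math.sin(alpha) / alpha - 3.0 * math.cos(alpha) / alpha**2
    return 4.0 * math.pi * ((4.0 - 4.0 * beta + beta**2) * j0
                            + (4.0 * beta - 2.0 * beta**2) * j1 / alpha
                            + 3.0 * beta**2 * j2 / alpha**2)
```

Comparing `overlap_numeric(0.6, 2.0)` with `overlap_closed(0.6, 2.0)` tests the full angular structure, while `overlap_closed(0.0, 1e-3)` probes the limit p → 0, where the result tends to 16π.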

3 Late-time Cosmology: Dilaton Dark Energy

In this third lecture we will discuss the possibility that a homogeneous, large-
scale dilaton field may be the source of the so-called dark energy which pro-
duces the cosmic acceleration first observed at the end of the last century [66],
and confirmed by most recent supernovae data [67, 68].
Let us recall, to this purpose, that the initial phase of pre-big bang inflation
is characterized by the monotonic growth of the dilaton and of the string
coupling gs (see Sect. 1.3): the subsequent epoch of standard evolution thus
opens up in the strong coupling regime, and should be described by an action
which includes all relevant loop corrections. Late enough, i.e., at sufficiently
low-curvature scales, the higher-derivative corrections can be neglected, and
the action can be written in the form of (114). In that context, the loop
form factors Z(φ), and the dilaton potential V (φ), may play a crucial role in
determining the late-time cosmological evolution.
There are, in principle, two possible alternative scenarios.
(i) The dilaton is stabilized by the potential at a constant value φ = φ0
which extremizes V (φ). In this case, the loop corrections induce a constant
renormalization of the effective dilaton couplings (as discussed in Sect. 2.1),
and the Universe may approach a late-time configuration dominated by the
dilaton potential, with H 2 ∼ V (φ0 ).
(ii) The dilaton fails to be trapped in a minimum of the potential, and keeps
running even during the post-big bang evolution. In this case the late-time
cosmological evolution is crucially dependent on the asymptotic behavior
of the factors Z(φ).
These two different possibilities have different impact on the so-called coinci-
dence problem (i.e., on the problem of explaining why the dark matter and
dark energy densities are of the same order just at the present epoch), as we
shall discuss in the following subsections.

3.1 Frozen Dilaton in the Moderate Coupling Regime

The ﬁrst type of scenario can be easily implemented [69] using a generic non-
perturbative potential which is instantonically suppressed (V ∼ exp(−1/gs2 ))
in the weak coupling limit gs2 → 0, and which develops a non-trivial structure
with a (semi-perturbative) minimum gs2 ∼ αGU T ∼ (Ms /MP )2 ∼ 0.1–0.01 in
the regime of moderate string coupling. A typical example is the “minimal”
potential given, in the E-frame, by [70]

\[
\tilde V(\phi) = m_V^2 \left[ e^{k_1(\phi-\phi_1)} + \beta\, e^{-k_2(\phi-\phi_1)} \right]
e^{-\epsilon \exp[-\gamma(\phi-\phi_1)]}, \tag{141}
\]

where k1, k2, β, ε, and γ are dimensionless parameters of order 1 (see Fig. 7).
The presence of a local minimum at φ0 ≃ φ1 allows solutions with φ = const
during the radiation-dominated phase, and (for appropriate values of mV)
may also lead to a late phase of accelerated expansion driven by the potential
energy V(φ0), provided the dilaton is not permanently shifted away from the
minimum φ0 by the transition to the matter-dominated epoch [69].

Fig. 7. Plot of the potential (141) for k1 = k2 = β = γ = 1, ε = 0.1, φ1 = −3,
mV = 0.1, and a local minimum (independent of mV) at φ0 = −3.112, corresponding
to g_s^2 = exp(φ0) ≃ 0.045. [Plot: Ṽ versus φ, from weak coupling (left) to
strong coupling (right).]
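The position of the local minimum quoted in the caption of Fig. 7 can be reproduced with a few lines of code. The sketch below uses the parameter values of that caption and a golden-section search on the interval [−4, −2], which is an assumption chosen so that the potential has a single minimum inside it; the function names are illustrative.

```python
import math


def potential_141(phi, k1=1.0, k2=1.0, beta=1.0, eps=0.1, gamma=1.0, phi1=-3.0, m_v=0.1):
    """E-frame potential (141) with the parameter values of Fig. 7 as defaults."""
    x = phi - phi1
    bracket = math.exp(k1 * x) + beta * math.exp(-k2 * x)
    return m_v**2 * bracket * math.exp(-eps * math.exp(-gamma * x))


def golden_section_minimum(f, lo, hi, tol=1.0e-7):
    """Locate the minimum of a function that is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d      # minimum lies in [a, d]
        else:
            a = c      # minimum lies in [c, b]
    return 0.5 * (a + b)


PHI_0 = golden_section_minimum(potential_141, -4.0, -2.0)

if __name__ == "__main__":
    print(f"phi_0 = {PHI_0:.3f}, g_s^2 = exp(phi_0) = {math.exp(PHI_0):.3f}")
```

Since mV only rescales the potential, changing `m_v` leaves `PHI_0` unchanged, which matches the remark that the minimum is independent of mV.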
Let us consider, in fact, the equation of motion of a homogeneous dilaton
field φ(t) in the conformally rescaled E-frame (with metric g̃), where the
graviton kinetic energy is canonically normalized, and let us assume that the
rescaled matter sources can be described as a perfect fluid of energy density
ρ̃, pressure p̃, and dilaton charge σ̃. Starting from an action of the type (114)
we find that the generalized dilaton equation, for a cosmological background,
takes the form
\[
A(\phi)\left( \ddot\phi + 3\tilde H \dot\phi \right) + B(\phi)\,\dot\phi^2
+ \frac{\partial \tilde V}{\partial \phi}
+ \lambda_P^2 \left[ C(\phi)\left( \tilde\rho - 3\tilde p \right) + \tilde\sigma \right] = 0, \tag{142}
\]
where A, B, and C are functions describing the rescaled (E-frame) loop
corrections. For a minimally coupled field, for instance, A = 1, B = C = σ̃ = 0;
for the dilaton, at tree level in the string coupling, A = C = 1, B = 0. In the
most general case we find that a stable dilaton configuration with φ̇ = 0 = φ̈ is
possible, in the radiation era (ρ̃ = 3p̃), if the scalar charge of the fluid is
negligible, σ̃ = 0, and the dilaton extremizes the E-frame potential, ∂Ṽ/∂φ = 0.
When the Universe becomes matter-dominated (p̃ = 0), however, a new
acceleration φ̈ = −A^{-1} λ_P^2 C ρ̃ is suddenly generated, which tends to push
the dilaton away from its equilibrium position. Such an acceleration is in
competition with the restoring force φ̈ = −A^{-1} (∂Ṽ/∂φ) (see (142)). The
possibility that the dilaton may bounce back to the stable minimum φ = φ0,
driving the Universe towards a final phase of accelerated, potential-dominated
expansion, thus crucially depends on the values of two parameters: the
(loop-corrected) strength λ_P^2 C(φ0) of the dilaton coupling to dark matter, and
the slope of the dilaton potential (141), determined by the mass scale mV which
also controls the amplitude of the minimum, V(φ0) ∼ m_V^2. Such an amplitude,
on the other hand, should correspond to the present Hubble scale (V(φ0) ∼
H_0^2), in a realistic model able to describe the present phase of accelerated
expansion.
It can be shown, with a simple numerical analysis, that the values of
the coupling strength allowed by present gravitational phenomenology are
compatible with a late-time phase dominated by the potential only for a finite
range of values of V(φ0), depending on the value of the dilaton coupling at
the equality epoch [69]. Using the phenomenological upper limit |Ceq| ≃ 0.1
one finds that the dilaton, after a small shift at t = teq, bounces back to the
minimum provided 10^{-7} Heq ≲ mV ≲ Heq (which includes the realistic case
mV ∼ H0 ∼ 10^{-6} Heq) [69]. Smaller values of |Ceq| correspond to a larger
mass interval. We can say, therefore, that the coincidence problem (i.e., why
V (φ0 ) ∼ H02 ), in this context remains, but is somewhat alleviated because—
thanks to the dynamical correlation between the amplitude V (φ0 ) and the
matter-dilaton coupling—only a restricted range of values is allowed for V (φ0 ).

3.2 Running Dilaton: Saturation of the Loop Corrections and Asymptotic “Freezing”

The second possibility, which will be discussed here in more detail, is the case
in which the dilaton is not stopped by the structures formed by the potential
around g_s^2 = 1, and keeps rolling towards +∞ along a smoothly decreasing
potential. A possible example of non-perturbative potential of this type is
given, in the E-frame, by [71]
\[
\tilde V = c_1^4\, m_V^2 \left[ e^{-\beta_1 \exp(-\phi)} - e^{-\beta_2 \exp(-\phi)} \right]
\left( \frac{e^{\phi}}{b_1 + c_1^2 e^{\phi}} \right)^2, \tag{143}
\]

where b1 , c1 , β1 , and β2 are dimensionless parameters, with 0 < β1 < β2 . This

potential is instantonically suppressed in the weak coupling limit φ → −∞,
and is exponentially decaying as

\[
\tilde V = m_V^2 (\beta_2 - \beta_1)\, e^{-\phi} + O\!\left( e^{-2\phi} \right) \tag{144}
\]

in the limit φ → +∞ (see Fig. 8). In this case, as we shall see, we can obtain a
scenario of “coupled quintessence” [72] in which the late Universe approaches
a (possibly accelerated) state dominated by a mixture of kinetic and potential
energy density, and the coincidence problem may find a satisfactory solution
thanks to the dilaton–dark matter interactions.
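The two limits of the potential (143) are easy to verify numerically. The sketch below takes the parameter values quoted in the caption of Fig. 8 as defaults and compares the full expression with the leading exponential tail (144) at large φ; the function names are illustrative.

```python
import math


def potential_143(phi, b1=1.0, c1=10.0, beta1=0.1, beta2=0.2, m_v=1.0):
    """E-frame potential (143); the defaults are the values quoted for Fig. 8."""
    loop = (math.exp(phi) / (b1 + c1**2 * math.exp(phi))) ** 2
    instanton = math.exp(-beta1 * math.exp(-phi)) - math.exp(-beta2 * math.exp(-phi))
    return c1**4 * m_v**2 * loop * instanton


def asymptotic_144(phi, beta1=0.1, beta2=0.2, m_v=1.0):
    """Leading term of (144), valid for phi -> +infinity."""
    return m_v**2 * (beta2 - beta1) * math.exp(-phi)


if __name__ == "__main__":
    for phi in (-6.0, 0.0, 4.0, 8.0):
        ratio = potential_143(phi) / asymptotic_144(phi)
        print(f"phi = {phi:5.1f}: V = {potential_143(phi):.3e}, V / tail = {ratio:.6f}")
```

At strong coupling the ratio tends to 1, while at weak coupling the double exponential makes the potential negligibly small, which is the instantonic suppression mentioned above.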
Fig. 8. Plot of the potential (143) for b1 = 1, c1 = 10, β1 = 0.1, β2 = 0.2, and
mV = 1. The dilaton is monotonically growing from the string perturbative vacuum
along a “bell-like” non-perturbative potential. [Plot: Ṽ versus φ, from weak
coupling (left) to strong coupling (right).]

In this case, however, a realistic scenario requires some mechanism of
saturation of the loop corrections, so as to keep the present effective values
of gravitational and gauge couplings approximately constant and sufficiently
“small,” even in the large “bare coupling” limit φ → +∞. As discussed in [73],
such a saturation can be obtained thanks to the large number of fields (e.g.,
gauge bosons) entering the loop corrections, assuming (as in models of “induced
gravity”) that the loop form factors of (114) have a finite limit for φ → +∞,
and that they can be approximated by a Taylor expansion in inverse powers of
the bare coupling g_s^2 = exp φ. Applying these assumptions to the gravi-dilaton
form factors, to the potential, and to the dimensionless parameters qi(φ)
controlling the dilaton charge density of the various matter fields, we can set,
for φ → +∞,

\[
\begin{aligned}
Z_R(\phi) &= c_1^2 + b_1 e^{-\phi} + O(e^{-2\phi}),\\
Z_\phi(\phi) &= -c_2^2 + b_2 e^{-\phi} + O(e^{-2\phi}),\\
V(\phi) &= V_0\, e^{-\phi} + O(e^{-2\phi}),\\
q_i(\phi) &= q_{0i} + O(e^{-2\phi}).
\end{aligned} \tag{145}
\]

The dimensionless coefficients c_1^2 and c_2^2 of this expansion are typically
of order N ∼ 10^2, because of their quantum-loop origin and of the large
number N of gauge bosons in GUT groups like E8. This is in agreement with
the fact that c21 controls (according to the action (114)) the asymptotic value
of the ratio between the string and the Planck length scale, c21 = (λs /λP )2 ,
which is indeed expected to be a number of the above order. The coefficients
b1 , b2 . . . , on the contrary, are numbers of order 1. Note that the expansion of
V (φ) agrees with the asymptotic form of the potential (144).
We should note, finally, that the asymptotic values of the dilaton charges,
q0i , have to be strongly suppressed for the ordinary components of matter
(such as baryons) and for electromagnetic radiation: if we want a dilaton field
active on a cosmological scale of distances, in fact, we need long-range inter-
actions, and we must avoid unacceptable deviations from the standard gravi-
tational phenomenology by suppressing the dilaton couplings, as discussed in
Sect. 2.1. For the (possibly exotic) components of dark matter, however, there
is no strict phenomenological bound imposing such suppression: in that case,
the asymptotic charge q0 could be non-vanishing, and of order 1, leading to
interesting late-time deviations from the standard cosmological scenario.
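For a simpler illustration of this possibility it is convenient to work in
the diagonalized E-frame, obtained from the metric g of (114) through the
rescaling
\[
g_{\mu\nu} = c_1^2\, Z_R^{-1}\, \tilde g_{\mu\nu}. \tag{146}
\]
The action (114) becomes, in this new frame,
\[
S = \frac{1}{2\lambda_P^2} \int d^4x \sqrt{-\tilde g}
\left[ -\tilde R + \frac{1}{2} k^2(\phi) \left( \tilde\nabla \phi \right)^2 - \tilde V(\phi) \right]
+ S_m(\tilde g, \phi, \mathrm{matter}), \tag{147}
\]
where
\[
k^2(\phi) = 3 \left( \frac{\partial \ln Z_R}{\partial \phi} \right)^2 - 2\,\frac{Z_\phi}{Z_R},
\qquad \tilde V(\phi) = c_1^4\, Z_R^{-2}\, V. \tag{148}
\]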
Assuming that the matter action Sm describes a perfect fluid with a dark
matter component ρ̃m, a baryon component ρ̃b, and a radiation component
ρ̃r = 3p̃r, the cosmological Einstein equations for the action (147) can then
be written (omitting the tilde, and in units 2λ_P^2 = 1) as

\[
\begin{aligned}
6H^2 &= \rho_r + \rho_b + \rho_m + \rho_\phi,\\
4\dot H + 6H^2 &= -\frac{\rho_r}{3} - p_\phi,
\end{aligned} \tag{149}
\]
where
\[
\rho_\phi = \frac{k^2(\phi)}{2}\,\dot\phi^2 + V, \qquad
p_\phi = \frac{k^2(\phi)}{2}\,\dot\phi^2 - V. \tag{150}
\]
The associated dilaton equation, assuming a negligible density of dilaton
charge for baryons and radiation (σr = 0 = σb), can be written as [71]
\[
k^2 \left( \ddot\phi + 3H\dot\phi \right) + k k' \dot\phi^2 + V'
+ \frac{1}{2} \left[ \psi' (\rho_b + \rho_m) + \sigma_m \right] = 0, \tag{151}
\]
where we have defined ψ = − ln ZR, and the prime denotes differentiation with
respect to φ. The combination of (149)–(151) leads, finally, to the equations
of energy–momentum conservation for the various fluid components:

\[
\begin{aligned}
&\dot\rho_r + 4H\rho_r = 0,\\
&\dot\rho_b + 3H\rho_b - \frac{\psi'}{2}\,\dot\phi\,\rho_b = 0,\\
&\dot\rho_m + 3H\rho_m - \frac{\psi'}{2}\,\dot\phi\,\rho_m - \frac{\sigma_m}{2}\,\dot\phi = 0,\\
&\dot\rho_\phi + 3H(\rho_\phi + p_\phi) + \frac{1}{2}\,\dot\phi \left[ \psi' (\rho_b + \rho_m) + \sigma_m \right] = 0
\end{aligned} \tag{152}
\]
(the last equation is simply the dilaton equation (151), rewritten in
fluid-dynamical form).
Let us now concentrate on the coupled dark matter/dilaton system, and
note that there are two types of interactions between these two cosmic sources:
a first one, specific to the particular type of dark matter field, generated
by the “intrinsic” dilaton charge σm; and a second one, more “universal,”
generated by the standard dilaton coupling to the trace of the stress tensor,
and associated to the ψ′ terms of the above equations. Both types of coupling
are renormalized by the loop corrections, but with opposite effect according to
the asymptotic limits of (145). In fact, the dilaton charge tends to grow, and
to reach a constant asymptotic value as φ → +∞. The coupling parameter
ψ′, on the contrary, tends to be exponentially suppressed as

\[
\psi' = -\left( \ln Z_R \right)' \to \frac{b_1}{c_1^2}\, e^{-\phi}, \qquad \phi \to +\infty. \tag{153}
\]
As a consequence, after the transition to the matter-dominated phase, the
Universe may enter two different types of dynamical regimes [71].
(1) If the dark matter charge σm is still negligible at the beginning of
the matter-dominated phase (as well as the dilaton potential, expected to
become important only near the present epoch), then the Universe enters the
so-called dragging regime, in which ρm is coupled to φ through the ψ′ terms
of (153), and the evolution of the (still subdominant) dilaton kinetic energy
ρφ is “dragged” by ρm.
The cosmic evolution, during this regime, can be analytically described (in
an approximate way) by noting that the loop factor k(φ) goes to a constant
at late enough timescales,
\[
k(\phi) \to k_0 = \sqrt{2}\,\frac{c_2}{c_1}, \qquad \phi \to +\infty, \tag{154}
\]

according to (148) and (145). Introducing the canonical variable φ̂ = k0 φ (see
the action (147)), and neglecting the subdominant contributions of ρr and ρb,
we can then rewrite the coupled equations (151) and (152), for the dragging
regime, as follows:
\[
\ddot{\hat\phi} + 3H\dot{\hat\phi} + \frac{\epsilon}{2}\,\rho_m = 0, \tag{155}
\]
\[
\dot\rho_m + 3H\rho_m - \frac{\epsilon}{2}\,\rho_m\,\dot{\hat\phi} = 0, \tag{156}
\]
where ε = ψ′/k0 ≃ e^{−φ}/(√2 c1 c2) ≪ 1 is the effective coupling parameter.
Neglecting the time dependence of ε with respect to that of H and of the
velocity of φ̂ (for small enough time intervals), we find that the system of
equations (149) and (155) is satisfied by
\[
\dot{\hat\phi} \simeq -2\epsilon H. \tag{157}
\]
Thus, from (156),

\[
\rho_m \sim a^{-(3+\epsilon^2)} \sim H^2 \sim \dot{\hat\phi}^2 \sim \rho_\phi, \qquad
a \sim t^{2/(3+\epsilon^2)}. \tag{158}
\]

During this phase the dark matter and the (kinetic) dilaton dark energy densi-
ties are characterized by the same time evolution, which slightly deviates from
the standard behavior of a dust-dominated Universe (ρ ∼ a−3 , a ∼ t2/3 ). The
kinematics, however, remains decelerated (as ε = ψ′/k0 ≪ 1).
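The dragging solution can be tested as an attractor. The sketch below is a reduction worked out here under the assumptions stated in the text (constant coupling ε = ψ′/k0, negligible potential, radiation and baryons dropped): writing u for the derivative of the canonical field k0 φ with respect to ln a, equations (149) and (155) collapse to the single first-order equation coded in `du_dN`, whose fixed point is u = −2ε. The function names and the RK4 settings are illustrative.

```python
def du_dN(u, eps):
    """Right-hand side of du/dN, N = ln a, obtained from (149) and (155) at constant eps.

    Uses Hdot/H^2 = -3/2 - u^2/8 and rho_m = 6 H^2 (1 - u^2/12), valid when the
    dilaton energy is purely kinetic and radiation and baryons are neglected.
    """
    return -(1.5 - u * u / 8.0) * u - 3.0 * eps * (1.0 - u * u / 12.0)


def evolve_u(eps, u0=0.0, n_efolds=20.0, steps=2000):
    """Integrate u(N) with a fixed-step fourth-order Runge-Kutta scheme."""
    h = n_efolds / steps
    u = u0
    for _ in range(steps):
        k1 = du_dN(u, eps)
        k2 = du_dN(u + 0.5 * h * k1, eps)
        k3 = du_dN(u + 0.5 * h * k2, eps)
        k4 = du_dN(u + h * k3, eps)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u


def matter_exponent(u, eps):
    """Return n such that rho_m ~ a**(-n), read off from (156)."""
    return 3.0 - 0.5 * eps * u


if __name__ == "__main__":
    eps = 0.1
    u_late = evolve_u(eps)
    print(f"u_late = {u_late:.6f}, -2 eps = {-2.0 * eps:.6f}")
    print(f"matter exponent = {matter_exponent(u_late, eps):.6f}")
```

Starting from a dilaton at rest, the field velocity relaxes to −2εH within a few e-folds, and the dark matter density then dilutes as a^{−(3+ε^2)}, as in (157) and (158).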
(2) A second, possibly accelerated, freezing regime is eventually reached
in the limit in which the dilaton potential comes into play, and the coupling
induced by the intrinsic charge density σm becomes dominant with respect to
the exponentially suppressed coupling due to ψ′.
Using again the canonical variable φ̂ = k0 φ, assuming that σm = q(φ)ρm (for a
homogeneous fluid), and considering the asymptotic limits q(φ) → q0, V =
V0 exp(−φ) of (145), we can rewrite the coupled dilaton–dark matter equations
(152), for the freezing regime, as follows:
\[
\begin{aligned}
&\dot\rho_m + 3H\rho_m - \frac{q_0}{2k_0}\,\rho_m\,\dot{\hat\phi} = 0,\\
&\dot\rho_\phi + 6H\rho_k + \frac{q_0}{2k_0}\,\rho_m\,\dot{\hat\phi} = 0.
\end{aligned} \tag{159}
\]
We have defined the kinetic and potential energy densities, ρk and ρV,
respectively, as

\[
\rho_k = \frac{\dot{\hat\phi}^2}{2}, \qquad
\rho_V = V(\hat\phi) = V_0\, e^{-\hat\phi/k_0}, \qquad
\rho_\phi = \rho_k + \rho_V. \tag{160}
\]
The system of equations (159) and (149) (with ρr = ρb = 0) can be solved
by a late-time configuration in which ρm, ρφ, V and H^2 scale in time in the
same way, so that the critical fractions of dark matter density, Ωm = ρm/6H^2,
dilaton kinetic energy, Ωk = ρk/6H^2, and potential energy, ΩV = V/6H^2, are
separately frozen at constant values determined by k0 and q0 only (i.e., by
the parameters c1, c2, and q0 of the asymptotic expansion (145)). A simple
analysis gives [71]

\[
\begin{aligned}
\Omega_k &= \frac{3k_0^2}{(q_0+2)^2}, &
\Omega_V &= \frac{3k_0^2 + q_0(q_0+2)}{(q_0+2)^2},\\
\Omega_\phi &= \Omega_k + \Omega_V, &
\Omega_m &= 1 - \Omega_\phi,
\end{aligned} \tag{161}
\]

where k0 is given by (154) (see also [32] for a detailed computation).

In this asymptotic state the Universe is thus dominated by a fixed mixture
of dark matter and dilaton (kinetic plus potential) energy density. The dilaton
fluid has equation of state
\[
w = \frac{p_\phi}{\rho_\phi} = \frac{\Omega_k - \Omega_V}{\Omega_k + \Omega_V}
= -\frac{q_0(q_0+2)}{6k_0^2 + q_0(q_0+2)}, \tag{162}
\]
and can play the role of the dark energy fluid responsible for the observed
cosmic acceleration, provided q0 > 1.
In fact, by rewriting the Einstein equations (149) for Ḣ in the form
\[
1 + \frac{2\dot H}{3H^2} = \Omega_V - \Omega_k, \tag{163}
\]
we obtain
\[
\frac{\ddot a}{aH^2} = 1 + \frac{\dot H}{H^2}
= \frac{3}{2}\left( \Omega_V - \Omega_k \right) - \frac{1}{2}
= \frac{q_0 - 1}{q_0 + 2}. \tag{164}
\]
The expansion is accelerated (ä > 0) for q0 > 1 or q0 < −2. The second case
(corresponding to an acceleration of superinflationary type, with Ḣ > 0) is to
be excluded, however, in our context, as it would imply Ωm < 0 according to
(161). Thus, acceleration is only possible for q0 > 1. The explicit form of this
asymptotic solution can be finally obtained through the integration of (164),
which gives
\[
a \sim t^{(q_0+2)/3}, \qquad H \sim a^{-3/(q_0+2)}, \tag{165}
\]
from which
\[
\rho_m \sim H^2 \sim \frac{\dot{\hat\phi}^2}{2} \sim V_0\, e^{-\hat\phi/k_0} \sim a^{-6/(q_0+2)}. \tag{166}
\]
To illustrate the smooth background evolution from the initial radiation
phase to the intermediate dragging phase, and to the final freezing regime, we
shall conclude this subsection by presenting the results of an exact numerical
integration of the string cosmology equations (149)–(152). For our illustrative
purpose, we will assume that ZR and Zφ are given by the expansion (145)
truncated to first order in exp(−φ), with b1 = b2 = 1, c_1^2 = 100 and c_2^2 = 30.
We will adopt the model of dilaton coupling already used in [71], parametrized
by the time-dependent charge
\[
q(\phi) = q_0\, \frac{e^{q_0 \phi}}{c^2 + e^{q_0 \phi}}, \tag{167}
\]
with c^2 = 150 and q0 = 2.5. We will also use the E-frame potential (144),
with β1 = 0.1, β2 = 0.2, and c_1^2 mV = 10^{-3} Heq. The last choice, which implies
mV ∼ H0, is crucial to obtain a realistic scenario in which the asymptotic
accelerated regime starts at a phenomenologically acceptable epoch (see [71,
32] for a discussion of the mass scale of the non-perturbative dilaton potential,
and of the degree of fine-tuning possibly required for realistic cosmological
applications). Finally, we will integrate our equations imposing the initial
conditions ρφ(ti) = ρr(ti), ρm(ti) = 10^{-20} ρr(ti), ρb(ti) = 7 × 10^{-21} ρr(ti),
φ(ti) = −2, at the initial scale H(ti) = 10^{40} Heq.
The obtained scaling evolution ρ = ρ(a) is illustrated in Fig. 9 for the
various cosmic components. We can note that, at large enough times, baryons
(full line) and radiation (dotted line) are fully decoupled from the dilaton, and
obey the standard scaling behavior (ρr ∼ a−4 , ρb ∼ a−3 ). The late-time dark
matter evolution, on the contrary, is closely tied to the dilaton evolution,
and the ratio of their energy densities becomes asymptotically frozen at a
constant. With the particular numerical values used in this example we obtain
an asymptotic configuration characterized by Ωφ ≃ 0.73 and Ωm ≃ 0.27, with
a dark energy equation of state w ≃ −0.76.
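The asymptotic values quoted for this example follow directly from (154), (161), (162), and (164). A minimal sketch, with an illustrative function name and the numerical inputs c1^2 = 100, c2^2 = 30, q0 = 2.5 taken from the text:

```python
def freezing_fractions(c1_sq, c2_sq, q0):
    """Asymptotic 'local' freezing configuration from (154), (161), (162), (164)."""
    k0_sq = 2.0 * c2_sq / c1_sq                       # k0^2 = 2 c2^2 / c1^2
    omega_k = 3.0 * k0_sq / (q0 + 2.0) ** 2
    omega_v = (3.0 * k0_sq + q0 * (q0 + 2.0)) / (q0 + 2.0) ** 2
    omega_phi = omega_k + omega_v
    return {
        "omega_k": omega_k,
        "omega_v": omega_v,
        "omega_phi": omega_phi,
        "omega_m": 1.0 - omega_phi,
        "w": (omega_k - omega_v) / omega_phi,          # dilaton equation of state
        "accel": 1.5 * (omega_v - omega_k) - 0.5,      # addot / (a H^2)
    }


if __name__ == "__main__":
    for name, value in freezing_fractions(100.0, 30.0, 2.5).items():
        print(f"{name:10s} = {value:+.4f}")
```

The acceleration parameter returned here equals (q0 − 1)/(q0 + 2), so lowering `q0` below 1 in the call makes the sign of `accel` flip, as stated below (164).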
Fig. 9. Late-time evolution (on a logarithmic scale) of the various components of
the cosmic energy density. The plots are the result of a numerical integration of
(149)–(152). [Plot: ln ρ versus ln a (lower axis) and redshift z (upper axis), with
curves for radiation, baryons, dark matter, and dilaton, and with the dragging
and freezing phases marked.]

3.3 Non-local Coupling and Pressure Back Reaction

Another interesting asymptotic conﬁguration can be obtained in the case in

which the dilaton is non-locally coupled to the dark matter components, as
discussed in Sect. 1.2. In that case, the fractions Ωm and Ωφ may also become
frozen at constant asymptotic values, but the background evolution turns out
to be decelerated for any values of q0 , as the dark matter develops an effective
pressure which tends to compensate the accelerating action of the dilaton
potential. This effect is new, and will be illustrated in some detail in this
subsection.
We start by assuming that, in the matter part of the action (114), the dilaton
is non-locally coupled to the sources through the variable ξ(φ), as in the
action (168). However, differently from the action (28), we will assume (for
simplicity) that the dilaton potential is local, V = V (φ): the gravi-dilaton part
of the action is thus identical to that of (114), and our model is described by

\[
S = -\frac{1}{2\lambda_s^2} \int d^4x \sqrt{-g}
\left[ Z_R(\phi) R + Z_\phi(\phi) (\nabla\phi)^2 + V(\phi) \right]
+ \int d^4x \sqrt{-g}\, L_m(e^{-\xi}). \tag{168}
\]

Let us vary this action with respect to g and φ, and evaluate the resulting
(general covariant) field equations in the limit of a homogeneous, isotropic,
spatially flat background, using the results of Sect. 1.2. We obtain a set of
equations similar to (45)–(47) for what concerns the dilaton charge density
σ(φ), but different for the potential (which now is local), and for the presence
of the loop corrections ZR , Zφ .
Let us finally transform the equations in the E-frame (using the rescaling
(146)), and consider the asymptotic limit in which ρr, ρb are negligible, and
the dark matter is coupled to the dilaton only through its intrinsic dilaton
charge (namely, the limit in which ψ′ ≃ 0). The resulting equations (omitting
the tilde, in units 2λ_P^2 = 1) can be written as

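\[
6H^2 = \rho_m + \rho_\phi, \tag{169}
\]
\[
4\dot H + 6H^2 = -p_\phi - \frac{\sigma_m}{2}, \tag{170}
\]
\[
\dot\rho_m + 3H\left( \rho_m + \frac{\sigma_m}{2} \right) - \frac{\sigma_m}{2}\,\dot\phi = 0, \tag{171}
\]
\[
\dot\rho_\phi + 3H(\rho_\phi + p_\phi) + \frac{\sigma_m}{2}\,\dot\phi = 0, \tag{172}
\]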
with ρφ and pφ defined by (150), as before. A comparison with the asymptotic
limit of (149) and (152), shows that the genuinely new effect of the non-
local interactions is the appearance of an effective pressure term σm /2 for the
dark matter component. Indeed, the new terms present in (170) and (171),
can also be obtained from the standard Einstein equations through the shift
pm → pm + σm /2.
We are now in the position of asking whether or not this modification
(of non-local origin) may change the results of the previous subsection, in
particular those concerning the asymptotic freezing configuration. We shall
consider, to this purpose, the limit in which σm → q0 ρm and V = V0 exp(−φ),
using the canonical variable φ as in the previous computations.
Let us look for solutions of (169)–(172) by requiring for ρm , ρk , and ρV
the same scaling behavior, and thus imposing, as a first condition, that
$$\frac{\dot\rho_m}{\rho_m} = \frac{\dot\rho_\phi}{\rho_\phi}. \tag{173}$$

Using (171) and (172) for ρm and ρφ , and the Einstein equation (169), we
obtain
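$$\frac{\dot\phi}{H} = \frac{6k_0}{q_0}\left[\bar\Omega_V\left(1 + \frac{q_0}{2}\right) - \bar\Omega_k\left(1 - \frac{q_0}{2}\right)\right]. \tag{174}$$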
We are denoting with a bar the fractions of critical energies for the new
configuration associated to the non-local equations, to distinguish it from the
“local” freezing solution of (161). We also impose, as a second condition, that
$$\frac{\dot\rho_m}{\rho_m} = \frac{\dot\rho_V}{\rho_V}. \tag{175}$$

The definition (160) of $\rho_V$, together with (171), then gives
$$\frac{\dot\phi}{H} = 3k_0, \tag{176}$$
which, combined with (174), leads to
$$\bar\Omega_V = \frac{2 - q_0}{2 + q_0}\,\bar\Omega_k + \frac{q_0}{q_0 + 2}. \tag{177}$$

From the definition of $\bar\Omega_k$ and (176), on the other hand, we have
$$\bar\Omega_k = \frac{\dot\phi^2}{12H^2} = \frac{3}{4}k_0^2. \tag{178}$$
The insertion of this result into (177) finally gives
$$\bar\Omega_V = \frac{3k_0^2(2 - q_0) + 4q_0}{4(2 + q_0)}. \tag{179}$$

The combination of $\bar\Omega_k$ and $\bar\Omega_V$ now provides the values of $\bar\Omega_\phi$, $\bar\Omega_m$, and
the equation of state $w$, according to the definitions (161) and (162). As we are
interested in the kinematical properties of the solution we shall compute, in
particular, the acceleration parameter $\ddot a/(aH^2)$: dividing the modified
equation (170) by $6H^2$ we obtain
$$\frac{\dot H}{H^2} = \frac{3}{2}\left(\bar\Omega_V - \bar\Omega_k\right) - \frac{3}{4}\,q_0\,\bar\Omega_m - \frac{3}{2}, \tag{180}$$
from which
$$\frac{\ddot a}{aH^2} \equiv 1 + \frac{\dot H}{H^2} = -\frac{1}{2}, \tag{181}$$
quite independently of the values of $k_0$ and $q_0$! The integration of $\dot H$ finally
provides $a \sim t^{2/3}$ and $H^2 \sim \rho \sim a^{-3}$, as in the standard phase of dark
matter-dominated evolution.
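As a cross-check of the algebra leading to (181), the following minimal Python sketch evaluates the right-hand side of (180) with exact rational arithmetic. It uses (178) and (179) for the barred fractions and assumes that the dark matter fraction follows from (169) as 1 minus the other two. The function names and the sample values of k0 and q0 are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def freezing_fractions(k0, q0):
    """Return (Omega_k, Omega_V, Omega_m) for the non-local freezing solution."""
    k0, q0 = Fraction(k0), Fraction(q0)
    omega_k = Fraction(3, 4) * k0**2                              # Eq. (178)
    omega_v = (3 * k0**2 * (2 - q0) + 4 * q0) / (4 * (2 + q0))    # Eq. (179)
    omega_m = 1 - omega_k - omega_v                               # assumed from Eq. (169)
    return omega_k, omega_v, omega_m


def hdot_over_h2(k0, q0):
    """Right-hand side of Eq. (180)."""
    omega_k, omega_v, omega_m = freezing_fractions(k0, q0)
    q0 = Fraction(q0)
    return (Fraction(3, 2) * (omega_v - omega_k)
            - Fraction(3, 4) * q0 * omega_m
            - Fraction(3, 2))


# Illustrative parameter choices (hypothetical values giving fractions in [0, 1]).
samples = [(Fraction(1, 2), 1), (Fraction(1, 3), Fraction(5, 2)), (Fraction(1, 4), Fraction(1, 2))]
for k0, q0 in samples:
    acceleration = 1 + hdot_over_h2(k0, q0)                       # Eq. (181)
    print(f"k0={k0}, q0={q0}: fractions={freezing_fractions(k0, q0)}, a''/(aH^2)={acceleration}")
```

For every sampled pair the acceleration parameter comes out as exactly -1/2, in agreement with (181).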
The considered model of non-local coupling is thus associated to an asymptotic
freezing phase which is decelerated, and in which the dilaton energy density
has the same dynamical behavior as a dust fluid, $\rho_\phi \sim \rho_m \sim a^{-3}$, in spite
of a pressure which is non-vanishing, in general:
$$w = \frac{q_0}{2}\,\frac{3k_0^2 - 2}{3k_0^2 + q_0}. \tag{182}$$

This result can be understood by noting that (180) and (181) together imply
$$\bar\Omega_k - \bar\Omega_V + \frac{q_0}{2}\,\bar\Omega_m \equiv \frac{1}{6H^2}\left(p_\phi + \frac{\sigma_m}{2}\right) = 0, \tag{183}$$
namely a zero total pressure for the coupled dilaton–dark matter fluid (see
(170)). The dark matter pressure associated to the non-local effects thus
generates a backreaction which exactly compensates, at least in this model, the
dilaton pressure, leading the system to restore, asymptotically, the standard
dust matter configuration.
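The two statements above, the value of w in (182) and the vanishing total pressure in (183), can be checked together. The sketch below is only a consistency check under stated assumptions: it takes w to be the ratio of dilaton pressure to dilaton energy density, with the pressure proportional to the kinetic fraction minus the potential fraction (as suggested by (183)), and the density proportional to their sum. All names and sample values are hypothetical.

```python
from fractions import Fraction


def w_closed_form(k0, q0):
    """Dilaton equation of state as written in Eq. (182)."""
    k0, q0 = Fraction(k0), Fraction(q0)
    return (q0 / 2) * (3 * k0**2 - 2) / (3 * k0**2 + q0)


def w_and_total_pressure(k0, q0):
    """Return (w, total pressure / 6H^2) computed directly from the fractions."""
    k0, q0 = Fraction(k0), Fraction(q0)
    omega_k = Fraction(3, 4) * k0**2                              # Eq. (178)
    omega_v = (3 * k0**2 * (2 - q0) + 4 * q0) / (4 * (2 + q0))    # Eq. (179)
    omega_m = 1 - omega_k - omega_v                               # assumed from Eq. (169)
    # Assumption: p_phi ~ Omega_k - Omega_V and rho_phi ~ Omega_k + Omega_V.
    w_direct = (omega_k - omega_v) / (omega_k + omega_v)
    total_pressure = omega_k - omega_v + (q0 / 2) * omega_m       # left side of Eq. (183)
    return w_direct, total_pressure


w_direct, total_pressure = w_and_total_pressure(Fraction(1, 2), 1)
print(w_direct, w_closed_form(Fraction(1, 2), 1), total_pressure)
```

With these sample values the dilaton has a non-vanishing equation of state while the combined pressure of the coupled fluid is exactly zero.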

3.4 Main Differences from Uncoupled Models

Let us come back to the class of models in which the dilaton is locally cou-
pled to the dark matter components, as discussed in Sect. 3.2. If we identify
the accelerated freezing phase with our present cosmological phase, and thus
the energy density of the dilaton field with the “dark energy” density responsible
for the present cosmic acceleration, we are led to a dilaton model of
dark energy which is substantially different from the conventional models of
quintessence [74] based on a rolling scalar field, uncoupled to dark matter.
A first, important (conceptual) difference concerns the mentioned prob-
lem of the cosmic coincidence. In the considered class of dilaton models this
problem, if not solved, is at least relaxed: in fact, the dark energy and dark
matter densities are of the same order not only today but also in the future
(forever), and also in the past for a significant amount of time, depending
on the beginning of the freezing epoch (see below).
A second, more phenomenological difference concerns the scaling behavior
of the baryonic and dark matter components of the dust fluid during the
freezing epoch. Because of the coupling to the dilaton, the dilution in time
of the dark matter density ρm is slower than the standard baryon dilution,
$\rho_b \sim a^{-3}$: in particular, the ratio $\rho_b/\rho_m$ decreases in time as
$$\frac{\rho_b}{\rho_m} \sim a^{-3q_0/(2+q_0)} \tag{184}$$
(see (166)). This could explain why the present fraction of baryons is small
($\sim 10^{-2}$) in critical units, provided the accelerated epoch has an early enough
beginning. Direct/indirect measurements of the past value of the ratio ρb /ρm ,
compared with its present value, could provide unambiguous tests of this class
of models.
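To give a feeling for the size of the effect in (184), the short sketch below converts the scaling in the scale factor into a scaling in redshift, using a = a0/(1+z). It assumes that the freezing regime holds all the way from redshift z to today; the function name and the sample values are illustrative only.

```python
def baryon_ratio_enhancement(z, q0):
    """Factor by which rho_b/rho_m at redshift z exceeds its present value.

    Follows from Eq. (184) with a/a0 = 1/(1+z), assuming the freezing
    regime holds over the whole interval (an assumption of this sketch).
    """
    return (1.0 + z) ** (3.0 * q0 / (2.0 + q0))


# Hypothetical sample values of the coupling q0 and of the redshift.
for q0 in (0.5, 1.0, 2.0):
    print(q0, baryon_ratio_enhancement(3.0, q0))
```

For instance, with q0 = 1 the ratio at z = 3 would be four times its present value, which illustrates why measurements of its past value could discriminate among models.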
Finally, concerning the beginning of the accelerated epoch, it is important
to stress that in dilaton models the acceleration can start much earlier than
in models of uncoupled quintessence [75, 76].
For a simple illustration of this point we may consider a model in which,
during the accelerated regime, there are two types of sources with different
dynamical behavior: (i) an uncoupled component $\rho_u$, with pressure $p_u = 0$
and scaling behavior $\rho_u \sim a^{-3}$ (represented by baryons and, possibly, by a
fraction of non-baryonic dark matter uncoupled to the dilaton); (ii) a coupled
component $\rho_c$, with pressure $p_c = w\rho_c$, and a slower scaling behavior
$\rho_c \sim a^{-3(1+w)}$ (represented by the dilaton and by the fraction of dark matter
coupled to the dilaton). Thus, even if today $\rho_c$ dominates, and drives an
accelerated evolution, at early enough times the Universe was dominated by $\rho_u$,
and decelerated. From the Einstein equations
$$6H^2 = \rho_u + \rho_c, \qquad 4\dot H + 6H^2 = -p_c = -w\rho_c, \tag{185}$$
we obtain that the acceleration switches off at the scale aacc such that
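$$\left(\frac{\ddot a}{aH^2}\right)_{\rm acc} = 1 + \left(\frac{\dot H}{H^2}\right)_{\rm acc} = -\frac{1}{2}\left[\Omega_u - (1 + 3w)(\Omega_u - 1)\right]_{\rm acc} = 0, \tag{186}$$
where $\Omega_u = \rho_u/6H^2$, $\Omega_c = 1 - \Omega_u$. In terms of the present values $\Omega_u^0$, $\Omega_c^0$ of
these fractions the above condition becomes
$$\Omega_u^0\left(\frac{a_{\rm acc}}{a_0}\right)^{-3} = (1 + 3w)\left(\Omega_u^0 - 1\right)\left(\frac{a_{\rm acc}}{a_0}\right)^{-3(1+w)}, \tag{187}$$
and fixes the beginning of the acceleration at the redshift scale $z_{\rm acc}$ such that
$$z_{\rm acc} \equiv \frac{a_0}{a_{\rm acc}} - 1 = \left[(1 + 3w)\,\frac{\Omega_u^0 - 1}{\Omega_u^0}\right]^{-1/3w} - 1. \tag{188}$$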
Consider now a model of uncoupled quintessence, in which the uncoupled
component corresponds to the totality of the dark matter fluid (plus
subdominant contributions), i.e., $\Omega_u^0 = \Omega_m^0$. Using the recent analysis of the
SNLS Collaboration [68], based on present observations of supernovae Ia and
large-scale structure, one finds $0.2 \le \Omega_m^0 \le 0.4$, and $-1.2 \le w \le -0.8$. One
then obtains, from (188), $0.4 \lesssim z_{\rm acc} \lesssim 1$ (see Fig. 10, top panel).

[Figure 10: panel (a), $z_{\rm acc}$ versus $w$ for $\Omega_m = 0.2$ and $\Omega_m = 0.4$; panel (b), $z_{\rm acc}$ versus $w$ for $\Omega_b = 0.04$ and $\Omega_b = 0.05$.]

Fig. 10. Beginning of the accelerated epoch for dark-energy models with uncoupled
(top) and fully coupled (bottom) dark matter, according to the observations of
type Ia supernovae in the SNLS data set. The plotted curves are obtained from
(188), for constant values of the present fraction of the uncoupled dust matter $\Omega_u^0$

If we consider instead a model of dilaton dark energy, then the uncoupled
component may range from the baryon component Ωb to some fraction of
the dark matter component Ωm . In the “maximally coupled” version of the
model, in which $\Omega_u^0 = \Omega_b^0$, we can re-apply the supernovae results of SNLS
with $\Omega_b^0 \simeq 0.04$–$0.05$, to obtain $w \simeq -0.65 \pm 0.15$. Equation (188) then
implies [76] $z_{\rm acc} \simeq 3$–$4$ (see Fig. 10, bottom panel).
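The numbers quoted for the two classes of models can be reproduced directly from (188). The following sketch is a plain transcription of that formula; the function name and the error handling are illustrative additions, and the sample inputs are the edge values of the uncoupled dust fraction and of w mentioned in the text.

```python
def z_acc(omega_u0, w):
    """Redshift at which the acceleration begins, from Eq. (188).

    omega_u0 : present fraction of uncoupled dust matter (0 < omega_u0 < 1)
    w        : constant equation of state of the coupled component
    """
    if w == 0:
        raise ValueError("Eq. (188) is undefined for w = 0")
    base = (1.0 + 3.0 * w) * (omega_u0 - 1.0) / omega_u0
    if base <= 0:
        # Happens for w >= -1/3: the coupled component never accelerates.
        raise ValueError("no accelerated phase for these parameters")
    return base ** (-1.0 / (3.0 * w)) - 1.0


# Uncoupled quintessence: all dark matter is uncoupled (illustrated at w = -1).
print(z_acc(0.2, -1.0), z_acc(0.4, -1.0))

# "Maximally coupled" dilaton model: only baryons are uncoupled.
print(z_acc(0.04, -0.65), z_acc(0.05, -0.65))
```

With w = -1 the uncoupled case gives values between roughly 0.4 and 1, while the maximally coupled case with w = -0.65 gives values between roughly 3.4 and 4, consistent with the ranges quoted above.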
Thus, dilaton models of dark energy are compatible with a beginning of
the cosmic acceleration at epochs much earlier than those suggested by other
models, according to the most recent supernovae data. The extension back in
time of the accelerated regime might have a significant impact on the dilution
of baryons, according to (184). Finally, strongly coupled models tend to be
compatible with a “less negative” parameter w (see Fig. 10), thus alleviating
the need for “phantom” dark energy [77] with “supernegative” (w < −1)
equation of state.

Acknowledgements
I am very grateful to all friends and collaborators contributing to the results
reported in this paper. First of all, I would like to thank Gabriele Veneziano
for many years of collaboration, friendship, and support. In addition, I am
grateful to Valerio Bozza, Massimo Giovannini, and Jnan Maharana for their
collaboration on the results presented in the first lecture; to Nicola Bonasia,
Eugenio Coccia and Carlo Ungarelli for their collaboration on the results
presented in the second lecture; to Luca Amendola, Federico Piazza, and Carlo
Ungarelli for their collaboration on the results presented in the third lecture.

Appendix A
In this appendix we present a detailed derivation of the equations of motion
(29) and (34), starting from the action (28) which includes non-local dilaton
interactions.
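The functional derivative of the action with respect to the metric $g^{\mu\nu}(x)$
contains, besides the standard contributions leading to (3), the new non-local
contributions $V_{\mu\nu}(x)$ and $L_{\mu\nu}(x)$, and can be written as follows:
$$\frac{\delta S}{\delta g^{\mu\nu}(x)} = -\frac{\left(\sqrt{-g}\,e^{-\phi}\right)_x}{2\lambda_s^{d-1}}\left[G_{\mu\nu} + \nabla_\mu\nabla_\nu\phi + \frac{1}{2}\,g_{\mu\nu}\left((\nabla\phi)^2 - 2\nabla^2\phi - V\right)\right]_x + \frac{1}{2}\sqrt{-g}\,T_{\mu\nu}(x) + V_{\mu\nu}(x) + L_{\mu\nu}(x), \tag{A.1}$$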
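where
$$V_{\mu\nu}(x) = -\frac{1}{2\lambda_s^{d-1}}\int d^{d+1}x'\,\left(\sqrt{-g}\,e^{-\phi}\,V'\right)_{x'}\frac{\delta}{\delta g^{\mu\nu}(x)}\,e^{-\xi(x')}, \tag{A.2}$$
$$L_{\mu\nu}(x) = \int d^{d+1}x'\,\left(\sqrt{-g}\,L_m'\right)_{x'}\frac{\delta}{\delta g^{\mu\nu}(x)}\,e^{-\xi(x')} \tag{A.3}$$
($V'$ and $L_m'$ are defined by (33)). We now need to compute the functional
derivative of $\exp(-\xi)$. Using the definition (25) we obtain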

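$$\frac{\delta}{\delta g^{\mu\nu}(x)}\,e^{-\xi(x')} = \frac{1}{\lambda_s^d}\int d^{d+1}y\;\delta^{d+1}(x - y)\,\delta(\phi_{x'} - \phi_y)\,e^{-\phi_y}\left[-\frac{1}{2}\sqrt{-g}\,g_{\mu\nu}\sqrt{(\nabla\phi)^2} + \frac{1}{2}\sqrt{-g}\,\sqrt{(\nabla\phi)^2}\,\frac{\partial_\mu\phi\,\partial_\nu\phi}{(\nabla\phi)^2}\right]_y = -\frac{1}{2\lambda_s^d}\left(\gamma_{\mu\nu}\sqrt{-g}\,\sqrt{(\nabla\phi)^2}\,e^{-\phi}\right)_x\delta(\phi_{x'} - \phi_x), \tag{A.4}$$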

where γμν is deﬁned in (30). Thus,

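$$V_{\mu\nu} = \frac{1}{2\lambda_s^{d-1}}\,\sqrt{-g}\,e^{-2\phi}\,\frac{1}{2}\,\gamma_{\mu\nu}\sqrt{(\nabla\phi)^2}\;I_V, \tag{A.5}$$
$$L_{\mu\nu} = -\sqrt{-g}\,e^{-\phi}\,\frac{1}{2}\,\gamma_{\mu\nu}\sqrt{(\nabla\phi)^2}\;I_m, \tag{A.6}$$
where $I_V$ and $I_m$ are defined in (31) and (32). Inserting these results into
(A.1), multiplying by $(-2\lambda_s^{d-1})\exp(-\phi)/\sqrt{-g}$, and imposing the condition of
zero functional derivative, one is finally led to (29).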
Let us now consider the functional derivative with respect to φ(x). Sepa-
rating the local and non-local terms, as before, we obtain
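$$\frac{\delta S}{\delta\phi(x)} = \frac{\left(\sqrt{-g}\,e^{-\phi}\right)_x}{2\lambda_s^{d-1}}\left[R + 2\nabla^2\phi - (\nabla\phi)^2 + V\right]_x + A(x) + B(x), \tag{A.7}$$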

where

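$$A(x) = -\frac{1}{2\lambda_s^{d-1}}\int d^{d+1}x'\,\left(\sqrt{-g}\,e^{-\phi}\,V'\right)_{x'}\frac{\delta}{\delta\phi(x)}\,e^{-\xi(x')}, \tag{A.8}$$
$$B(x) = \int d^{d+1}x'\,\left(\sqrt{-g}\,L_m'\right)_{x'}\frac{\delta}{\delta\phi(x)}\,e^{-\xi(x')}. \tag{A.9}$$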
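The functional derivative of the variable (25) leads to
$$\frac{\delta}{\delta\phi(x)}\,e^{-\xi(x')} = \frac{1}{\lambda_s^d}\int d^{d+1}y\,\Bigg\{-\left(\sqrt{-g}\,e^{-\phi}\sqrt{(\nabla\phi)^2}\right)_y\delta(\phi_{x'} - \phi_y)\,\delta^{d+1}(x - y) + \left(\sqrt{-g}\,e^{-\phi}\sqrt{(\nabla\phi)^2}\right)_y\delta'(\phi_{x'} - \phi_y)\left[\delta^{d+1}(x - x') - \delta^{d+1}(x - y)\right] - \partial_\mu\left[\left(\frac{\sqrt{-g}\,e^{-\phi}\,\partial^\mu\phi}{\sqrt{(\nabla\phi)^2}}\right)_y\delta(\phi_{x'} - \phi_y)\right]\delta^{d+1}(x - y)\Bigg\}, \tag{A.10}$$
where $\partial_\mu = \partial/\partial y^\mu$, and $\delta'$ denotes the derivative of the delta function with
respect to its argument.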
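The first term of this integral exactly cancels the term containing $\partial_\mu e^{-\phi}$
in the last part of the integral; also, the third term exactly cancels the term
containing $\partial_\mu[\delta(\phi_{x'} - \phi_y)]$ in the last part of the integral. Thus, we are left
with
$$\frac{\delta}{\delta\phi(x)}\,e^{-\xi(x')} = \frac{\delta^{d+1}(x - x')}{\lambda_s^d}\int d^{d+1}y\,\left(\sqrt{-g}\,e^{-\phi}\sqrt{(\nabla\phi)^2}\right)_y\delta'(\phi_{x'} - \phi_y) - \frac{e^{-\phi}}{\lambda_s^d}\,\delta(\phi_{x'} - \phi_x)\,\partial_\mu\left(\frac{\sqrt{-g}\,\partial^\mu\phi}{\sqrt{(\nabla\phi)^2}}\right)_x. \tag{A.11}$$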

The second term on the right-hand side of the above equation can be
conveniently rewritten as

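$$-\frac{e^{-\phi}}{\lambda_s^d}\,\delta(\phi_{x'} - \phi_x)\,\sqrt{-g}\;\nabla_\mu\left(\frac{\partial^\mu\phi}{\sqrt{(\nabla\phi)^2}}\right)_x = -\frac{e^{-\phi}\sqrt{-g}}{\lambda_s^d\,\sqrt{(\nabla\phi)^2}}\,\delta(\phi_{x'} - \phi_x)\,\gamma_{\mu\nu}\nabla^\mu\nabla^\nu\phi. \tag{A.12}$$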
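For the first term containing $\delta'$ we can exploit the properties of the delta
function, and the identities
$$dy_0 = \frac{1}{\dot\phi_y}\,d\phi_y, \qquad \frac{d}{d\phi_y} = \frac{1}{\dot\phi_y}\,\frac{d}{dy_0}, \tag{A.13}$$
to obtain
$$\begin{aligned}
\lambda_s^{-d}\int d^dy\,\frac{d\phi_y}{\dot\phi_y}\left(\sqrt{-g}\,e^{-\phi}\sqrt{(\nabla\phi)^2}\right)_y\delta'(\phi_x - \phi_y)
&= \lambda_s^{-d}\int d^dy\,\frac{d\phi_y}{\dot\phi_y}\,\delta(\phi_x - \phi_y)\,\frac{d}{dy_0}\left(\frac{\sqrt{-g}\,e^{-\phi}\sqrt{(\nabla\phi)^2}}{\dot\phi}\right)_y \\
&= -e^{-\xi(x)} + \lambda_s^{-d}\,e^{-\phi(x)}\int d^dy\,\frac{d\phi_y}{\dot\phi_y}\left(\sqrt{-g}\,\sqrt{(\nabla\phi)^2}\right)_y\delta'(\phi_x - \phi_y) \\
&= -e^{-\xi(x)} + e^{-\phi(x)}J(x),
\end{aligned} \tag{A.14}$$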

where $J$ is the integral defined in (35). Inserting the results (A.12) and (A.14)
into (A.11), using the definitions of $A$ and $B$, and integrating over $x'$, we
finally obtain
$$A(x) + B(x) = \left(\frac{\sqrt{-g}\,e^{-\phi}}{2\lambda_s^{d-1}}\,V' - \sqrt{-g}\,L_m'\right)_x\left(e^{-\xi} - e^{-\phi}J\right)_x + \frac{\sqrt{-g}\,e^{-\phi}}{\sqrt{(\nabla\phi)^2}}\,\gamma_{\mu\nu}\nabla^\mu\nabla^\nu\phi\,\left(\frac{e^{-\phi}}{2\lambda_s^{d-1}}\,I_V - I_m\right)_x. \tag{A.15}$$

Summing this contribution to the other terms of (A.7), multiplying by
$2\lambda_s^{d-1}\exp(\phi)/\sqrt{-g}$, and imposing the vanishing of the functional derivative, we are
led to the equation of motion (34) for the dilaton.

References
1. G. Veneziano: Phys. Lett. B 265, 287 (1991) 790, 792
2. M. Gasperini, G. Veneziano: Astropart. Phys. 1, 317 (1993) 790, 800, 803, 804
3. M. B. Green, J. H. Schwarz, E. Witten: Superstring Theory (Cambridge University
Press, Cambridge, 1987) 790, 800, 810
4. J. Polchinski: String Theory (Cambridge University Press, Cambridge, 1998) 790, 800, 806, 8
5. M. Gasperini: in Gravitational Waves, ed. by I. Ciufolini et al. (IOP Publishing,
Bristol, 2001), p. 280 790
6. A. A. Tseytlin: Mod. Phys. Lett. A 6 1721 (1991) 792
7. K. Kikkawa, M. Y. Yamasaki: Phys. Lett. B 149 357 (1984) 793
8. K. A. Meissner, G. Veneziano: Mod. Phys. Lett. A 6, 3397 (1991); Phys. Lett.
B 267, 33 (1991) 793
9. K. Meissner: Dualities in string cosmology, this volume 793
10. M. Gasperini, G. Veneziano: Phys. Lett. B 277, 256 (1992) 793
11. M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B 569, 113 (2003);
Nucl. Phys. B 694, 206 (2004) 794, 795, 802
12. M. Gasperini, G. Veneziano: Mod. Phys. Lett. A 8, 3701 (1993) 799
13. M. Gasperini, G. Veneziano: Phys. Rev. D 50, 2519 (1994) 799, 809, 810, 812, 813, 818
14. A. D. Linde: Phys. Lett. B 129, 177 (1983) 800
15. A. Vilenkin: Phys. Rev. D 46, 2355 (1992); A. Borde, A. Vilenkin: Phys. Rev.
Lett. 72, 3305 (1994) 800
16. A. Buonanno, T. Damour, G. Veneziano: Nucl. Phys. B 543, 275 (1999) 801
17. M. Gasperini, G. Veneziano: Gen. Rel. Grav. 28, 1301 (1996) 801
18. M. Gasperini, J. Maharana, G. Veneziano: Nucl. Phys. B 472, 349 (1996) 801, 802
19. R. H. Brandenberger, J. Martin: Phys. Rev. D 71, 023504 (2005) 801
20. M. Gasperini, M. Maggiore, G. Veneziano: Nucl. Phys. B 494, 315 (1997) 802
21. R. Brustein, R. Madden: Phys. Lett. B 410, 110 (1997); Phys. Rev. D 57, 712
(1998); C. Cartier, E. J. Copeland, R. Madden, JHEP 0001, 035 (2000) 802
22. M. Gasperini, G. Veneziano: Phys. Rep. 373 1 (2003) 802, 805, 808, 809, 816
23. M. Gasperini: Mod. Phys. Lett. A 14, 1059 (1999) 803
24. M. Gasperini, M. Giovannini, K. A. Meissner, G. Veneziano: Nucl. Phys. (Proc.
Suppl.) B 49, 70 (1996) 803
25. M. Gasperini, M. Giovannini: Phys. Lett. B 282, 36 (1992); Phys. Rev. D 47,
1519 (1993) 805
26. R. Brustein, M. Gasperini, M. Giovannini, V. Mukhanov, G. Veneziano: Phys.
Rev. D 51, 6744 (1995) 805, 809, 810
27. R. Brustein, M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B 361,
45 (1995) 805, 809, 810
28. A. Buonanno and C. Ungarelli: Primordial gravitational radiation in string
cosmology, this volume 805
29. J. Khoury, B. A. Ovrut, P. J. Steinhardt, N. Turok: Phys. Rev. D 64, 123533
(2001) 805
30. E. I. Buchbinder, J. Khoury, B. A. Ovrut: New ekpyrotic cosmology, hep-
th/0702154 805
31. L. A. Boyle, P. J. Steinhardt, N. Turok: Phys. Rev. D 69, 127302 (2002) 805
32. M. Gasperini: Elements of String Cosmology (Cambridge University Press,
Cambridge, 2007) 806, 809, 815, 822, 823, 832, 833
33. M. Maggiore, A. Riotto: Nucl. Phys. B 548, 427 (1999) 806
34. S. Alexander, R. Brandenberger, D. Easson: Phys. Rev. D 62, 103509 (2000) 806
35. C. Burgess et al: JHEP 0107, 047 (2001); S. Kachru et al.: JCAP 0310, 013
(2003) 806
36. H. Tye: Brane inﬂation: string theory viewed from the cosmos, this volume 806
37. K. Enqvist, M. Sloth: Nucl. Phys. B 626, 395 (2002); D. H. Lyth, D. Wands:
Phys. Lett. B 524, 5 (2002); T. Moroi, T. Takahashi: Phys. Lett. B 522, 215
(2001) 806, 811
38. V. Bozza, M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B 543, 14
(2002); Phys. Rev. D 67, 063514 (2003) 806, 811, 812
39. E. J. Copeland, R. Easther, D. Wands: Phys. Rev. D 56, 874 (1997); E. J.
Copeland, J. E. Lidsey, D. Wands: Nucl. Phys. B 506, 407 (1997) 806
40. V. F. Mukhanov, H. A. Feldman, R. H. Brandenberger: Phys. Rep. 215, 203
(1992) 807, 808, 809, 821
41. M. Abramowitz, I. A. Stegun: Handbook of Mathematical Functions (Dover,
New York, 1972) 808, 809
42. A. Melchiorri, F. Vernizzi, R. Durrer, G. Veneziano: Phys. Rev. Lett. 83, 4464
(1999) 811
43. D. Lyth, C. Ungarelli, D. Wands: Phys. Rev. D 67, 023503 (2003) 811, 812
44. M. Gasperini: in Proc. of the Fifth Paris Cosmology Colloquium, ed. by H. J.
De Vega and N. Sanchez (Publication Observatoire de Paris, Paris, 1999), p.
317 812
45. D. N. Spergel et al.: astro-ph/0603449 812
46. M. Gasperini, M. Giovannini, G. Veneziano: Phys. Rev. D 52, 6651 (1995); M.
Gasperini, M. Giovannini, G. Veneziano: Phys. Rev. Lett. 75, 3796 (1995); D.
Lemoine, M. Lemoine: Phys. Rev. D 52, 1995 (1995); M. Gasperini, S. Nicotri:
Phys. Lett. B 633, 155 (2006) 812
47. M. Gasperini: Phys. Lett. B 327, 314 (1994) 812
48. T. Taylor, G. Veneziano: Phys. Lett. B 213, 459 (1988) 813, 817, 818, 819
49. J. Ellis et al: Phys. Lett. B 228, 264 (1989) 813, 819
50. R. Durrer, M. Gasperini, M. Sakellariadou, G. Veneziano: Phys. Rev D 59,
43511 (1999) 814
51. M. Gasperini, G. Veneziano: Phys. Rev. D 59, 43503 (1999) 814
52. R. Brustein, M. Gasperini, G. Veneziano: Phys. Rev. D 55, 3882 (1997) 816
53. E. Fischbach, C. Talmadge: Nature 356, 207 (1992) 817, 823
54. C. D. Hoyle et al: Phys. Rev. D 70, 042004 (2004) 817
55. M. Gasperini: Phys. Lett. B 470, 67 (1999) 818, 819, 820
56. T. Damour, A. M. Polyakov: Nucl. Phys. B 423, 352 (1994); Gen. Rel. Grav.
26, 1171 (1994) 819
57. C. Misner, K. Thorne, J. A. Wheeler: Gravitation (Freeman, San Francisco
1973) 819
58. M. Gasperini: Phys. Lett. B 477, 242 (2000) 820, 821, 822, 823
59. M. Bianchi et al.: Phys. Rev. D 57, 4525 (1998); M. Maggiore, A. Nicolis: Phys.
Rev. D 62, 024004 (2000) 820
60. M. Gasperini, C. Ungarelli: Phys. Rev. D 64, 064009 (2001) 821, 822, 823
61. N. Bonasia, M. Gasperini: Phys. Rev. D 71, 104020 (2005) 821, 825
62. B. Allen, J. D. Romano: Phys. Rev. D 59, 102001 (1999) 821
63. M. Doran, J. Jackel: Phys. Rev. D 66, 043519 (2002) 822
64. B. J. Owen, B. S. Sathyaprakash: Phys. Rev. D 60, 022002 (1999) 823
65. E. Coccia, M. Gasperini, C. Ungarelli: Phys. Rev. D 65, 067101 (2002) 824, 825
66. S. Perlmutter et al.: Nature 391, 51 (1998); A. G. Riess et al.: Astron. J. 116,
1009 (1998) 826
67. A. G. Riess et al.: Astrophys. J. 607, 665 (2004) 826
68. P. Astier et al.: Astron. Astrophys. 447, 31 (2006) 826, 839
69. M. Gasperini: Phys. Rev. D 64, 043510 (2001) 826, 827, 828
70. N. Kaloper, K. A. Olive: Astropart. Phys. 1, 185 (1993) 826
71. M. Gasperini, F. Piazza, G. Veneziano: Phys. Rev. D 65, 023508 (2002) 828, 830, 831, 832, 8
72. L. Amendola: Phys. Rev. D 62, 043511 (2000); L. Amendola, D. Tocchini-
Valentini: Phys. Rev. D 64, 04359 (2001); Phys. Rev. D 66 043528 (2002) 828
73. G. Veneziano: JHEP 0206, 051 (2002) 828
74. B. Ratra, P. J. E. Peebles: Phys. Rev. D 37 3406 (1988); C. Wetterich: Nucl.
Phys. B 302, 668 (1988); M. S. Turner, C. White: Phys. Rev. D 56, 4439 (1997);
R. R. Caldwell, R. Dave, P. J. Steinhardt: Phys. Rev. Lett. 80, 1582 (1998); I.
Zlatev, L. Wang, P. J. Steinhardt: Phys. Rev. Lett. 82, 896 (1999); Phys. Rev.
D 59, 123504 (1999) 837
75. L. Amendola, M. Gasperini, D. Tocchini-Valentini, C. Ungarelli: Phys. Rev.
D 67, 043512 (2003); L. Amendola, M. Gasperini, F. Piazza: JCAP 09 , 014
(2004) 837
76. L. Amendola, M. Gasperini, F. Piazza: Phys. Rev. D 74, 127302 (2006) 837, 839
77. R. R. Caldwell, M. Kamionkowski, N. N. Weinberg: Phys. Rev. Lett. 91, 07130
(2003) 839
Relic Gravitons and String Pre-big-bang Cosmology

A. Buonanno¹ and C. Ungarelli²

¹ Physics Department, University of Maryland, College Park, MD 20742, USA
[email protected]
² Physics Department “Enrico Fermi”, University of Pisa, Largo Pontecorvo 3,
56127 Pisa, Italy and Geosciences and Earth Resources Institute, CNR,
via Moruzzi 1, 56124 Pisa, Italy
[email protected]

Abstract. In this paper, after discussing the mechanism of graviton production
during an early phase of accelerated expansion, we will review the main features
of the spectrum of primordial gravitational radiation for the class of string-inspired
models called pre-big-bang models. Furthermore, we will also outline the
implications for pre-big-bang models of current and future searches for gravitational
waves with ground-based detectors.

Foreword

This contribution reviews one of the many research topics originally pioneered
by Gabriele Veneziano in String theory and fundamental interactions. We are
thankful to Gabriele for having taught us not only physics, but also how to
be good physicists, choosing and tackling problems with depth and seriousness.
He will continue to be a precious source of inspiration for us.

1 Introduction

In recent years, a number of detectors have been designed and built to search
for gravitational waves (GWs). Ground-based interferometers aimed at detect-
ing GWs in the frequency range between 10 Hz and 1 kHz, such as LIGO [1],
VIRGO [2], GEO600 [3], and TAMA [4] are now operating at design sensitivity
(or close to it in the case of VIRGO). A space-based three-arm interferometer
currently being designed, the Laser Interferometer Space Antenna (LISA) [5],
will explore the frequency window between 0.1 and 10 mHz. A second generation
of space-based detectors probing primordial GWs [6, 7] is being planned.
Following earlier theoretical works [8], prototypes for detecting high-frequency
GWs, in the megahertz band, have been developed [9]. Finally, the large number
of millisecond pulsars detectable with the Square Kilometre Array (SKA) [10]
would provide an ensemble of clocks that can be used as multiple arms of a
GW detector in the frequency range around $10^{-9}$ Hz.

A. Buonanno and C. Ungarelli: Relic Gravitons and String Pre-big-bang Cosmology, Lect.
Notes Phys. 737, 845–861 (2008)
DOI 10.1007/978-3-540-74233-6_25 © Springer-Verlag Berlin Heidelberg 2008

One of the possible targets of such searches is a stochastic gravitational-wave
background (SGWB). Depending on its origin, the stochastic background can
be broadly divided into two classes (for a review see, e.g., [11, 12]): the astro-
physically generated background due to the incoherent superposition of grav-
itational radiation emitted by large populations of astrophysical sources that
cannot be resolved individually, and the primordial GW background gener-
ated by processes taking place in the early stages of the Universe. A primordial
component of such background is especially interesting, since it would carry
unique information about the state of the primordial Universe. Here we fo-
cus our attention on a particular type of primordial stochastic background,
namely the relic radiation produced by the parametric amplification of metric
tensor perturbations during an early stage of accelerated expansion (inflation-
ary stage) [13]. Leaving the detailed analysis of the production mechanism to
the following section, the energy and spectral content of such radiation is
encoded in the spectrum, defined as follows:
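$$\Omega_{\rm GW} = \frac{1}{\rho_c}\,f\,\tilde\rho_{\rm GW}(f), \tag{1}$$
where $f$ is the frequency, $\rho_c$ is the critical energy density of the Universe
($\rho_c = 3H_0^2/8\pi G$) and $\tilde\rho_{\rm GW}$ is the GW energy density per unit frequency, i.e.,
$$\rho_{\rm GW} = \int_0^\infty df\,\tilde\rho_{\rm GW}(f). \tag{2}$$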
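For orientation, the normalization in (1) can be made concrete with a few lines of Python. The sketch below evaluates the critical density in SI units and then applies definition (1). It assumes the usual convention that the present Hubble parameter is 100 h0 km/s/Mpc; the numerical values of G and of the megaparsec are standard reference values rounded to four digits, and all names are illustrative.

```python
import math

G_NEWTON = 6.674e-11      # m^3 kg^-1 s^-2 (rounded reference value)
MPC_IN_M = 3.0857e22      # metres per megaparsec (rounded reference value)


def critical_density(h0):
    """Critical mass density 3 H0^2 / (8 pi G) in kg/m^3.

    Assumes H0 = 100 * h0 km/s/Mpc (convention not spelled out in the text).
    """
    hubble = 100.0 * h0 * 1.0e3 / MPC_IN_M   # H0 in s^-1
    return 3.0 * hubble**2 / (8.0 * math.pi * G_NEWTON)


def omega_gw(frequency, rho_tilde, h0):
    """Eq. (1): f * rho_tilde(f) / rho_c, with rho_tilde in the same units as rho_c per Hz."""
    return frequency * rho_tilde / critical_density(h0)


print(critical_density(1.0))   # about 1.88e-26 kg/m^3 for h0 = 1
```

Because the critical density scales as the square of h0, the combination of h0 squared times the spectrum used in the bounds quoted below does not depend on the measured value of the Hubble constant.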

For a background produced during an early stage of slow-roll inflation, the
spectrum decreases as $f^{-2}$ in the frequency window $10^{-18}$–$10^{-16}$ Hz, and then
slowly decreases up to a frequency corresponding to modes whose physical
frequency becomes less than the maximum causal distance during the reheating
phase (which is of order of a few gigahertz). For this class of models, the
spectral content of the SGWB is fixed in terms of the shape parameters of
the inflaton potential. Its magnitude depends on both the value of the Hub-
ble parameter during inflation and a number of features characterizing the
Universe evolution after the inflationary era—for example, tensor anisotropic
stress due to free-streaming relativistic particles, equations of state [14, 15],
and so on. An upper bound on the spectrum can be obtained from the mea-
surement of the quadrupole anisotropy of the cosmic microwave background
(CMB). Through the Sachs–Wolfe effect, a SGWB at large scales (i.e., at wave-
lengths comparable to the present value of the Hubble radius) would induce
stochastic anisotropies in the CMB temperature. This yields an upper limit of
$h_0^2\Omega_{\rm GW} \sim 5 \times 10^{-15}$ at $f \sim 10^{-16}$ Hz [16]. Since for a generic slow-roll
inflationary model the spectrum is (weakly) decreasing with frequency (for a recent
review see, e.g., [15, 17]), this implies an upper bound $h_0^2\Omega_{\rm GW} \sim 5 \times 10^{-16}$ at
frequencies around $f \sim 100$ Hz, where ground-based detectors such as LIGO
reach the best sensitivity. For a flat spectrum, the recent LIGO results [18] set
an upper limit $h_0^2\Omega_{\rm GW} < 6.5 \times 10^{-5}$. For frequency-independent spectra, the
expected upper limit for the current LIGO configuration is $h_0^2\Omega_{\rm GW} < 5 \times 10^{-6}$,
while the advanced LIGO project design sensitivity is $h_0^2\Omega_{\rm GW} \sim 8 \times 10^{-9}$ (see,
e.g., [19]).
The spectrum predicted by the class of single-field inflationary models is
then too low to be observed by ground-based detectors. It is therefore evident
that a background satisfying the bound imposed by the observed amount of
CMB anisotropies at large scales could be detected at frequencies relevant for
ground-based GW detectors provided that its spectrum grows significantly
with frequency.
Pre-big-bang (PBB) models, originally proposed by Veneziano [20], and
then Gasperini and Veneziano [21] (for a detailed review, see [22]), represent
an interesting class of inflationary models alternative to the standard slow-
roll ones. In particular, the presence in the inflationary phase of fields like the
dilaton or moduli can have important consequences on the spectral properties
of the SGWB, thus affecting the possibility of detection by earth-based
interferometers. As first shown in [23], at low frequencies, say $f \gtrsim 10^{-16}$ Hz,
the SGWB spectrum grows as $\Omega_{\rm GW} \sim f^3$. Hence, the COBE bound is easily
evaded and the spectrum can peak at frequencies around $10$–$10^3$ Hz, still
satisfying the bound from big-bang nucleosynthesis (BBN) [24] and CMB [25].
The aim of this paper is both to review the general mechanism of cos-
mological graviton production, describing its key features, and discuss the
prospect of detection within the class of PBB cosmological models. The pa-
per is organized as follows. In Sect. 2 we review the mechanism of parametric
amplification of metric perturbations, and discuss examples in De Sitter and
slow-roll inflation. In Sect. 3 we compute the SGWB in minimal PBB
models and discuss the main modifications in non-minimal models. In Sect. 4
we review the implications of current and future results of ground-based
detectors (in particular LIGO) for PBB models. Finally, in Sect. 5 we draw
some conclusions.

2 Graviton Production in Cosmology

One of the most relevant aspects of inflationary models is that they pro-
vide a natural mechanism for generating perturbations in all matter fields.
Such primordial perturbations can then be considered as seeds for the observed
CMB anisotropies and large-scale structures, and can also give rise to a
SGWB. Those observable consequences are all related to the well-known
phenomenon of amplification of quantum–vacuum fluctuations in cosmological
backgrounds [13]. In this section, starting with the simple toy model of a
one-dimensional harmonic oscillator [26], we shall compute the SGWB in De
Sitter inflation and slow-roll inflation.
Let us consider a one-dimensional harmonic oscillator moving in an expanding
background described by a scale factor $a(t)$. The Lagrangian is
$$L = \frac{a^2 m}{2}\left(\dot x^2 - \omega^2 x^2\right), \tag{3}$$
the canonical momentum and the corresponding Hamiltonian, computed as
the Legendre transformation of the Lagrangian (3), read
$$p = a^2 m\,\dot x, \tag{4}$$
$$H = \frac{1}{2}\left(\frac{p^2}{a^2 m} + a^2 m\,\omega^2 x^2\right). \tag{5}$$
The corresponding equations of motion are
$$\ddot x + 2\,\frac{\dot a}{a}\,\dot x + \omega^2 x = 0, \tag{6}$$
$$\ddot y + \left(\omega^2 - \frac{\ddot a}{a}\right)y = 0, \tag{7}$$
where we denote with a dot the derivative with respect to the cosmic time
t and y = a x is the proper physical amplitude of the harmonic oscillator.
Without specifying the details of the cosmological evolution, the properties of
the solutions of (6) and (7) can be derived by analyzing their behavior in two
diﬀerent regimes:
(a) When $\omega^2 \gg \ddot a/a$, the comoving amplitude and momentum are adiabatically
damped
$$x \sim \frac{1}{a}\,e^{i\omega t}, \qquad p \sim a\,\omega\,e^{i\omega t}. \tag{8}$$
Hence, in this regime the proper physical amplitude and momentum are
approximately constant (as well as the Hamiltonian (5));
(b) For $\omega^2 \ll \ddot a/a$ the comoving amplitude and momentum are frozen
$$x \sim B + C\int_0^t \frac{dt'}{a^2(t')}, \qquad p \sim C. \tag{9}$$
Notice that in this freeze-out regime
$$\frac{d}{dt}\left(\frac{\lambda_{\rm phys}}{H^{-1}}\right) > 0, \tag{10}$$

where λphys = 2πa/ω is the proper physical wavelength of the oscillator and
H = ȧ/a is the expansion rate. The condition (10) implies that the back-
ground expansion is accelerating (as it occurs during an inﬂationary phase)
and the proper wavelength characteristic of the oscillator is expanding faster
than the maximum causal distance $H^{-1}$. Furthermore, during such freeze-out
regime the value of the energy is asymptotically dominated by the term
proportional to $x^2$ and is due to the stretching of the oscillator produced by the
rapidly accelerated expansion. Let us now consider a cosmological evolution
characterized by the following three different phases:
$$\omega^2 > \ddot a/a, \qquad t < t_{\rm ex}, \quad t > t_{\rm re}, \tag{11}$$
$$\omega^2 < \ddot a/a, \qquad t_{\rm ex} < t < t_{\rm re}. \tag{12}$$

By smoothly joining the solutions of the equations of motion (6) and (7) in the
three different phases, it is straightforward to show that the final energy $E_{\rm fin}$
of the harmonic oscillator (which is asymptotically constant during the initial
and final phases) is enhanced during the intermediate, accelerating phase by
a factor proportional to $[a(t_{\rm re})/a(t_{\rm ex})]^2$:
$$E_f \sim E_{\rm in}\left[\frac{a(t_{\rm re})}{a(t_{\rm ex})}\right]^2. \tag{13}$$
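The enhancement factor quoted above can be illustrated numerically. The sketch below integrates Hamilton's equations for the oscillator Hamiltonian given earlier with a hand-written fourth-order Runge-Kutta scheme through a freeze-out phase. It assumes, purely for illustration, an exponential scale factor a(t) = exp(H t) with expansion rate H much larger than ω (so that ω² is smaller than ä/a = H²), units m = ω = 1, and an oscillator that starts at maximal displacement with p = 0. None of these choices is taken from the text, and only the intermediate phase is integrated, since the energy is approximately constant before and after it.

```python
import math


def amplification(omega=1.0, mass=1.0, hubble=20.0, efolds=3.0, steps=5000):
    """Return (E_final / E_initial, [a(t_re)/a(t_ex)]^2) for a(t) = exp(hubble * t).

    All parameter values are hypothetical; a(t_ex) = 1 at t = 0.
    """
    t_end = efolds / hubble
    dt = t_end / steps

    def rhs(t, x, p):
        a2 = math.exp(2.0 * hubble * t)
        # Hamilton's equations: dx/dt = p / (a^2 m), dp/dt = -a^2 m omega^2 x
        return p / (a2 * mass), -a2 * mass * omega**2 * x

    def energy(t, x, p):
        a2 = math.exp(2.0 * hubble * t)
        return 0.5 * (p**2 / (a2 * mass) + a2 * mass * omega**2 * x**2)

    t, x, p = 0.0, 1.0, 0.0            # maximal displacement, zero momentum
    e_initial = energy(t, x, p)
    for _ in range(steps):
        k1x, k1p = rhs(t, x, p)
        k2x, k2p = rhs(t + dt / 2, x + dt / 2 * k1x, p + dt / 2 * k1p)
        k3x, k3p = rhs(t + dt / 2, x + dt / 2 * k2x, p + dt / 2 * k2p)
        k4x, k4p = rhs(t + dt, x + dt * k3x, p + dt * k3p)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        p += dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        t += dt
    return energy(t, x, p) / e_initial, math.exp(2.0 * efolds)


ratio, expected = amplification()
print(ratio, expected, ratio / expected)
```

In this setup the comoving amplitude stays nearly frozen while the potential term grows with the square of the scale factor, so the energy ratio should track the squared ratio of scale factors to within about a percent. A different initial phase (for instance x = 0 with nonzero p) would change the prefactor, which is why the text states only proportionality.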

Note that for a classical oscillator initially at rest ($x_{\rm in} = p_{\rm in} = 0$) the initial
energy is zero and no amplification takes place. Within the same cosmological
evolution described by (11) and (12), let us consider instead a one-dimensional
quantum mechanical oscillator initially in the ground state. The initial wave
function is
$$\psi_{\rm in}(x) = \left(\frac{\alpha_{\rm in}}{\pi}\right)^{1/4} e^{-\alpha_{\rm in}x^2/2}, \tag{14}$$
where $\alpha_{\rm in} = a^2(t_{\rm ex})\,m\,\omega^2/\hbar$. In the final stage of the cosmological evolution,
the harmonic oscillator will be in a high occupation number state (more
precisely, in a squeezed state). This can be shown by computing the expectation
value of the Hamiltonian (defined in the final stage) with respect to the initial
vacuum state defined by the wave function (14). In the final stage of the
cosmological evolution, the Hamiltonian operator can be approximated as
$$\hat H_f = -\frac{\hbar^2}{2m_{\rm ex}}\,\frac{d^2}{dx^2} + \frac{m_{\rm ex}\,\omega^2 x^2}{2}, \tag{15}$$
where $m_{\rm ex} = a^2(t_{\rm ex})\,m$. Expressing the wavefunction (14) in terms of the
eigenfunctions of (15), one obtains (see, e.g., [11])
$$\psi_{\rm in}(x) = \sum_{n=0}^{\infty}\beta_{2n}\,\psi_{\rm fin}^{(n)}(x) \tag{16}$$

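$$\beta_{2n} = (\alpha_{\rm in}\alpha_{\rm fin})^{1/4}\,\sqrt{\frac{2\,(2n)!}{\alpha_{\rm in} + \alpha_{\rm fin}}}\;\frac{1}{n!}\left[\frac{\omega_{\rm fin} - \omega_{\rm in}}{2(\omega_{\rm fin} + \omega_{\rm in})}\right]^n, \tag{17}$$
$$\psi_{\rm fin}^{(n)}(x) = N_n\,H_n\!\left(\sqrt{\alpha_{\rm fin}}\,x\right)e^{-\alpha_{\rm fin}x^2/2}, \tag{18}$$
where $\alpha_{\rm fin} = a(t_{\rm re})^2\,m\,\omega^2/\hbar$, $\omega_{\rm in,fin} = \omega/a(t_{\rm ex,re})$, $H_n$ are Hermite
polynomials, and $N_n$ is a normalization constant. Using (18) it is straightforward to
show that the expectation value of the Hamiltonian (15) on the initial state
described by the wavefunction (14) is given by
$$E_{\rm fin} = \hbar\,\omega_{\rm fin}\left[\frac{1}{2} + \left(\frac{\omega_{\rm fin} - \omega_{\rm in}}{2\sqrt{\omega_{\rm in}\,\omega_{\rm fin}}}\right)^2\right]. \tag{19}$$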

Hence, for a sufficiently long intermediate phase (for which $\omega_{\rm fin} \ll \omega_{\rm in}$) the
harmonic oscillator final state is a semiclassical state characterized by a large
number of created quanta
$$N_f \sim \frac{1}{4}\,\frac{\omega_{\rm in}}{\omega_{\rm fin}}. \tag{20}$$
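The expression for the final energy and the estimate of the number of created quanta can be cross-checked with elementary Gaussian moments. The sketch below assumes the textbook situation of two oscillators with the same mass and different frequencies, with the initial ground state characterized by the mean squares of position and momentum of a Gaussian, and sets ħ = 1 by default. The function names are illustrative, and identifying the number of quanta with the final energy divided by ħ times the final frequency, minus one half, is an assumption of this sketch.

```python
import math


def final_energy(omega_in, omega_fin, hbar=1.0):
    """Closed-form expectation value of the final Hamiltonian quoted above."""
    squeeze = (omega_fin - omega_in) / (2.0 * math.sqrt(omega_in * omega_fin))
    return hbar * omega_fin * (0.5 + squeeze**2)


def final_energy_from_moments(omega_in, omega_fin, mass=1.0, hbar=1.0):
    """Same quantity from <x^2> and <p^2> of the initial Gaussian ground state."""
    mean_x2 = hbar / (2.0 * mass * omega_in)
    mean_p2 = hbar * mass * omega_in / 2.0
    return mean_p2 / (2.0 * mass) + 0.5 * mass * omega_fin**2 * mean_x2


def created_quanta(omega_in, omega_fin, hbar=1.0):
    """Occupation number: E_fin / (hbar * omega_fin) minus the zero-point 1/2."""
    return final_energy(omega_in, omega_fin, hbar) / (hbar * omega_fin) - 0.5


omega_in, omega_fin = 100.0, 1.0      # hypothetical values with omega_fin << omega_in
print(final_energy(omega_in, omega_fin), final_energy_from_moments(omega_in, omega_fin))
print(created_quanta(omega_in, omega_fin), omega_in / (4.0 * omega_fin))
```

For this frequency ratio of 100 the exact occupation number is 24.5025, against 25 from the asymptotic estimate, so the approximation is already accurate at the two percent level.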

This simple example shows how a period of accelerated expansion (i.e., an
inflationary phase) can generate large scale inhomogeneities and anisotropies.
Barring some technical issues (the conditions that guarantee the existence
of a well-defined vacuum state at the onset of the inflationary phase), ev-
ery quantum field in the vacuum state can be described as a collection of
harmonic oscillators. The occurrence of a phase of accelerated expansion pro-
duces an amplification of the vacuum energy of those oscillators, stretching
their corresponding wavelengths to scales larger than the horizon (k < aH).
Such modes then eventually re-enter the horizon (k > aH) later, during the
radiation/matter-dominated phases of the Universe evolution. Since those
matter fields are gravitationally coupled (at least minimally) to the gravi-
tational field, such enhancement of vacuum fluctuations is transferred to the
background metric, therefore yielding to perturbations. Within the class of
homogeneous and isotropic cosmological models, the perturbations can be
classified in terms of their properties under coordinate transformations cor-
responding to the space–time isometries. (The latter are represented by the
group SO(3) of three-dimensional rotations, for a review, see [28].) In partic-
ular, scalar perturbations (described by fields invariant under rotations) are
coupled to ordinary matter and radiation fields during the radiation/matter
dominated phases, thus they are the seeds for large-scale structures and CMB
anisotropies. Tensorial perturbations which are described by a field that trans-
forms as a rank 2 tensor under rotations, produce a characteristic spectrum
of stochastic gravitational radiation, whose energy and spectral content both
depend on the background evolution and carries a unique imprint of the in-
flationary phase.
Relic Gravitons and String Pre-big-bang Cosmology 851

2.1 De Sitter Inflation

For the sake of simplicity, let us consider a simplified two-stage cosmological
model where the epoch of accelerated expansion is described by a De Sitter
phase smoothly connected to a standard radiation-dominated phase. In
conformal coordinates, the space–time metric is given by

\[ ds^2 = a(\eta)^2\,(d\eta^2 - d\mathbf{x}^2) , \tag{21} \]

where $\eta$ is the conformal time. The background evolution is specified by the
following expressions for the scale factor:

\[ a(\eta) = -\frac{1}{H_{\rm ds}\,\eta} , \qquad \eta < \eta_1 < 0 , \tag{22} \]

\[ a(\eta) = \frac{1}{H_{\rm ds}\,\eta_1^2}\,(\eta - 2\eta_1) , \qquad \eta > \eta_1 . \tag{23} \]

For each comoving wave number k, we add transverse and traceless fluctuations
of the metric described by the following tensor:

\[ h_{ab}^{(A)}(\mathbf{k},\eta) = e_{ab}^{(A)}(\mathbf{k})\,h_k^{(A)}(\eta)\,e^{i\mathbf{k}\cdot\mathbf{x}} , \tag{24} \]

where a, b = 1, 2, 3, A = +, × labels the polarization state described by the
tensor $e_{ab}^{(A)}$, and $\mathbf{k}$ is the comoving wave vector. The amplitude $h_k^{(A)}$ satisfies
the following equation:

\[ \left[\frac{d^2}{d\eta^2} + 2\mathcal{H}\,\frac{d}{d\eta} + |\mathbf{k}|^2\right] h_k^{(A)} = 0 , \tag{25} \]

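where $\mathcal{H} = (1/a)\,da/d\eta$. The general solution of (25) can be written in terms of
elementary functions (in this case half-integer Hankel functions). In particular,
imposing that for $\eta \to -\infty$ the solution of (25) corresponds to a vacuum state,
one obtains [27]

\[ h_k^{(A)} = \frac{a(\eta_1)}{a(\eta)}\left[1 + i\,H_{\rm ds}\,\omega^{-1}\right] e^{-ik(\eta-\eta_1)} , \qquad \eta < \eta_1 , \tag{26} \]

\[ h_k^{(A)} = \frac{a(\eta_1)}{a(\eta)}\left[\alpha_k\,e^{-ik(\eta-\eta_1)} + \beta_k\,e^{ik(\eta-\eta_1)}\right] , \qquad \eta > \eta_1 , \tag{27} \]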
where $\omega = ck/a$ and $\alpha_k$, $\beta_k$ are the so-called Bogoliubov coefficients relative
to the transition from the De Sitter to the radiation-dominated regime. In
particular, for $\eta \to +\infty$, $|\beta_k|^2$ represents the number of gravitons created per
unit cell of phase space. The Bogoliubov coefficients can be computed
by imposing the continuity of the amplitude and of its time derivative on the
space-like surface $\eta = \eta_1$ [27]:

\[ \alpha_k = 1 + i\,\frac{\sqrt{H_0 H_{\rm ds}}}{\omega} - \frac{H_0 H_{\rm ds}}{2\omega^2} , \tag{28} \]

\[ \beta_k = \frac{H_0 H_{\rm ds}}{2\omega^2} . \tag{29} \]
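As a quick consistency check, Bogoliubov coefficients must satisfy $|\alpha_k|^2 - |\beta_k|^2 = 1$. The sketch below evaluates the two coefficients quoted above for a few frequencies and verifies this normalization; the function name `bogoliubov` and the sample values (in arbitrary units) are illustrative assumptions only.

```python
def bogoliubov(omega, h0, h_ds):
    """Return (alpha_k, beta_k) for the De Sitter -> radiation transition,
    using alpha = 1 + i sqrt(H0 Hds)/omega - H0 Hds/(2 omega^2) and
    beta = H0 Hds/(2 omega^2)."""
    x = h0 * h_ds / omega**2
    alpha = complex(1.0 - 0.5 * x, x**0.5)
    beta = 0.5 * x
    return alpha, beta


if __name__ == "__main__":
    for omega in (0.5, 1.0, 10.0):
        alpha, beta = bogoliubov(omega, h0=1.0, h_ds=2.0)
        # |alpha|^2 - |beta|^2 should equal 1 for every frequency
        print(omega, abs(alpha) ** 2 - beta**2, beta**2)
```

The printed values of $|\beta_k|^2$ grow as $\omega^{-4}$ toward low frequencies, which is the origin of the $d\omega/\omega$ form of the graviton energy density.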
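The graviton energy density per unit cell of phase space is therefore

\[ d\rho_{\rm GW} = 2\omega\left(\frac{\omega^2\,d\omega}{2\pi^2}\right)|\beta_k|^2 = \frac{H_0^2 H_{\rm ds}^2}{4\pi^2}\,\frac{df}{f} , \tag{30} \]

where $f = \omega/2\pi$ is the physical frequency and $H_0$ is the present value of
the Hubble constant. Using the definition (1) the spectrum turns out to be
scale-invariant and its value reads

\[ \Omega_{\rm GW} = \frac{16}{9}\left(\frac{M_{\rm infl}}{M_{\rm pl}}\right)^{4} , \tag{31} \]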

where $M_{\rm pl} = G^{-1/2} = 1.22 \times 10^{19}$ GeV is the Planck mass and $M_{\rm infl}$ is the
inflationary scale defined by $H_{\rm ds}^2 = 8\pi M_{\rm infl}^4/(3M_{\rm pl}^2)$. This result cannot be
directly compared with experimental sensitivities, since the presence of a
matter-dominated and a dark-energy phase is not properly taken into account.
However, for modes that at the time of radiation–matter equality have
physical wavelengths smaller than the horizon (corresponding to frequencies today
$f > (H_0/2\pi)(1 + z_{\rm eq})^{1/2}$, $z_{\rm eq}$ being the redshift of matter–radiation equality),
the frequency dependence is not affected by the presence of matter- and
dark-energy-dominated eras. The corresponding spectrum is reduced by a factor
$1/(1 + z_{\rm eq})$ with respect to (31) and is given by

\[ \Omega_{\rm GW} = \frac{16}{9}\left(\frac{M_{\rm infl}}{M_{\rm pl}}\right)^{4}\frac{\Omega_r}{1 - \Omega_{\rm de}} , \tag{32} \]

where $\Omega_r$ and $\Omega_{\rm de}$ are the current fractions of radiation and dark energy
densities in units of the critical energy density, respectively. Current WMAP
data place an upper limit on the inflation scale around $M_{\rm infl} \sim 2 \times 10^{16}$ GeV [29].
Since the total energy in radiation is approximately $h_0^2\Omega_r \sim 4.15 \times 10^{-5}$, assuming
$\Omega_{\rm de} = 0.7$, one finds for the spectrum (32)

\[ h_0^2\,\Omega_{\rm GW} < 1.7 \times 10^{-15} . \tag{33} \]
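The order of magnitude of this bound can be reproduced by inserting the quoted numbers into the De Sitter spectrum. The short sketch below does just that; the function name `h2_omega_gw_de_sitter` is a hypothetical helper, and the inputs are the rounded values given in the text, so the result should only be expected to agree with the quoted limit at the level of that rounding.

```python
M_PL_GEV = 1.22e19  # Planck mass in GeV, as quoted in the text


def h2_omega_gw_de_sitter(m_infl_gev, h2_omega_r=4.15e-5, omega_de=0.7):
    """h0^2 Omega_GW = (16/9) (M_infl/M_pl)^4 h0^2 Omega_r / (1 - Omega_de)."""
    ratio = m_infl_gev / M_PL_GEV
    return (16.0 / 9.0) * ratio**4 * h2_omega_r / (1.0 - omega_de)


if __name__ == "__main__":
    # Upper limit on the inflationary scale quoted from WMAP data
    print(h2_omega_gw_de_sitter(2.0e16))
```

Because the spectrum scales as the fourth power of $M_{\rm infl}$, lowering the inflationary scale by a factor of 10 lowers the predicted background by four orders of magnitude.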

2.2 Slow-roll Inflation

In the previous section we focused on a simplified inflationary
model where the phase of accelerated expansion is driven by a constant
energy density. However, a more general class of inflationary models is
characterized by a scalar field $\Phi$ slowly rolling in a potential $V(\Phi)$. For such models,
the expansion rate is not constant during the accelerating period and this
feature produces a small tilt in the spectrum. In particular, for frequencies
$f > (H_0/2\pi)(1 + z_{\rm eq})^{1/2}$ the spectrum has a power-law frequency dependence

\[ \Omega_{\rm GW} \sim f^{\,n_T} . \tag{34} \]

For single-field models characterized by a slowly varying potential the spectral
slope $n_T$ is given by [30]

\[ n_T = -\frac{M_{\rm pl}^2}{8\pi}\left(\frac{V_*'}{V_*}\right)^{2} , \tag{35} \]

where $V_*$ is the value of the inflationary potential when the scale associated with
the present size of the horizon (corresponding to a frequency $f_0 = (1/2\pi)H_0$)
crossed the horizon during the inflationary phase, and $V_*'$ is the first derivative
of the inflaton potential at that point. Taking into account the frequency
dependence of the spectral slope (35), for frequencies $f \gg (H_0/2\pi)(1 + z_{\rm eq})^{1/2}$
the spectrum reads [30]

\[ \Omega_{\rm GW} = \frac{5}{2}\left(\frac{M_*}{M_{\rm pl}}\right)^{4}\frac{\Omega_r}{1 - \Omega_{\rm de}}\left(\frac{f}{f_0}\right)^{n_{\rm GW}} , \tag{36} \]

where $M_* = V_*^{1/4}$ and

\[ n_{\rm GW} = n_T\left\{1 - \frac{1}{2}\left[(n_S - 1) - n_T\right]\log\frac{f}{f_0}\right\} , \tag{37} \]

where $n_S$ is the spectral index for scalar perturbations. A detailed analysis [16]
using solutions of inflationary flow equations shows that for single-field
slow-roll inflationary models the maximum of the spectrum (36) compatible with
WMAP data is $h_0^2\,\Omega_{\rm GW} \sim 5 \times 10^{-16}$ for frequencies $f \sim 100$ Hz (see also
[14, 17]).
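The running of the effective tensor slope with frequency is simple to tabulate. The sketch below evaluates $n_{\rm GW} = n_T\{1 - \tfrac12[(n_S-1)-n_T]\log(f/f_0)\}$ under the assumption that "log" denotes the natural logarithm; the function name `n_gw` and the sample values of $n_T$, $n_S$ and $f_0$ are illustrative choices, not numbers taken from the text.

```python
import math


def n_gw(n_t, n_s, f, f0):
    """Effective tensor slope at frequency f, with pivot frequency f0.
    The natural logarithm is assumed for 'log'."""
    return n_t * (1.0 - 0.5 * ((n_s - 1.0) - n_t) * math.log(f / f0))


if __name__ == "__main__":
    n_t, n_s = -0.01, 0.96      # illustrative slopes only
    f0 = 1.0e-18                # illustrative pivot frequency in Hz
    for f in (f0, 1.0e-3, 1.0e2):
        print(f, n_gw(n_t, n_s, f, f0))
```

At the pivot frequency the function returns $n_T$ itself; the logarithmic correction only matters over the many decades separating horizon scales from the detector band.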

3 Gravitational-wave Background in Pre-big-bang Inflation

In slow-roll inflation, the horizon and flatness problems are solved by
postulating the presence of an epoch during which the energy–momentum tensor
is dominated by the potential energy of a scalar field. This potential energy
drives the phase of accelerated expansion, during which the field slowly rolls
towards the minimum of the potential. In the 1990s, several attempts to build
such a cosmological setup in string theory encountered a number of
problems [31].$^2$ Due to the presence of other fields, superstring theory at low energy
does not give Einstein general relativity; e.g., heterotic string theory in four
dimensions is described by the action

\[ \Gamma_{\rm eff} = \frac{1}{2\lambda_s^2}\int d^4x\,\sqrt{|g|}\;e^{-\varphi}\left[R + g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi - \frac{1}{12}(dB)^2 - V(\varphi)\right] , \tag{38} \]

$^2$ For more recent successful attempts to build slow-roll inflationary models within
string theory see, e.g., [32].

where $\varphi$ is the dilaton field, related to the string coupling by $g^2 = e^{\varphi}$;
$dB = \partial_\mu B_{\nu\rho} + \partial_\nu B_{\rho\mu} + \partial_\rho B_{\mu\nu}$, where $B_{\mu\nu}$ is the two-form gauge field or
antisymmetric field; $V(\varphi)$ is a non-perturbative potential; and $\lambda_s$ is
the string scale. In writing (38) we disregard for simplicity the internal
dimensions, whose dynamics can be described in terms of moduli fields [22].
Henceforth, we limit the discussion to the homogeneous and isotropic case
with $B = 0$ and $V = 0$ [$ds^2 = -dt^2 + a^2(t)\,d\mathbf{x}^2$, $\varphi = \varphi(t)$].
In 1991 Veneziano [20] discovered that the solutions of the low-energy
string-effective action (38) satisfy the scale factor duality symmetry:
$a(t) \to 1/a(t)$, $\varphi(t) \to \varphi(t) - 6\log a(t)$,$^3$ with $a(t) \sim t^{1/\sqrt{3}}$
and $\varphi(t) \sim -\log t$. Noticing
this property, Veneziano [20] conceived the idea of implementing the inflationary
phase at times before the would-be big-bang singularity, proposing the
pre-big-bang scenario. Indeed, it can be easily shown that for $t < 0$, $\dot a > 0$, $\ddot a > 0$;
thus the Universe undergoes a (super) inflationary phase. Two different but
physically equivalent descriptions of the PBB phase exist: either the
string-frame picture described by (38), where the Universe undergoes an accelerated
expansion ($H > 0$, $\dot H > 0$, $\dot\varphi > 0$), or the Einstein-frame picture, where the
action (38) has the standard Hilbert–Einstein form and the evolution of the
Universe is described by an accelerated contraction, or gravitational collapse
($H < 0$, $\dot H < 0$, $\dot\varphi > 0$).
This new kind of inflation, which can be shown to solve the homogeneity
and flatness conundra, is driven by the kinetic energy of the dilaton field and
forces both the string coupling ($\dot g > 0$) and the spacetime curvature to grow
toward the future (i.e., toward the stringy phase). As a consequence, at least
in the homogeneous case, the inflationary stage extends indefinitely into the
past ($t \to -\infty$) and the initial state of the Universe is nearly flat, cold, and
decoupled: $g \ll 1$, $R\lambda_s^2 \ll 1$.
The scale factor duality symmetry has constituted the basis of a class of in-
flationary models subsequently investigated by Gasperini and Veneziano [21],
and it is also at the basis of the so-called ekpyrotic cosmological scenario
[33, 34].
As far as metric perturbations are concerned, the most striking feature of
the PBB spectra is a strong tilt toward high frequencies [36]. In [23], Veneziano
and collaborators estimated the SGWB, obtaining

\[ \Omega_{\rm GW} \sim \frac{1}{z_{\rm eq}}\,g_s^2\left(\frac{f}{f_s}\right)^{3}\left[1 + \frac{1}{2}\log\frac{f_s}{f} + \frac{1}{z_s^3}\left(\frac{g_1}{g_s}\right)^{2}\right] , \qquad (f < f_s) , \tag{39} \]

\[ \Omega_{\rm GW} \sim \frac{g_1^2}{z_{\rm eq}}\left[\left(\frac{f}{f_1}\right)^{6-2\beta} + \left(\frac{f}{f_1}\right)^{2\beta}\right] , \qquad (f_s < f < f_1) , \tag{40} \]

where $g_s$ is the value of the string coupling at the onset of the high-curvature
stringy phase, $f_s$ is the frequency corresponding to the lowest scale exiting
during the dilaton-driven phase, $f_1 \sim 10^{11}$ Hz is the ultraviolet cutoff, $g_1 =
M_s/M_{\rm pl}$, $z_s$ is the redshift of the stringy phase, $\beta = -\log(g_s/g_1)/\log z_s$, and
$z_{\rm eq}$ is the redshift of matter–radiation equality.

$^3$ Here for convenience we fix the origin of time at t = 0.

3.1 Pre-big-bang Minimal Model

Here, we derive the SGWB in more detail, following [35]. In the string frame,
where strings follow geodesic trajectories, it is straightforward to show that
the canonical variable $\Psi_{\mu\nu}$ associated with tensor perturbations is related to the
metric by

\[ g_{\mu\nu} = a^2(\eta_{\mu\nu} + h_{\mu\nu}) = a^2\left(\eta_{\mu\nu} + \frac{g}{a}\,\Psi_{\mu\nu}\right) . \tag{41} \]

The Fourier modes of the two physical traceless and transverse polarization
states satisfy the following wave equation:

\[ \Psi_k'' + (k^2 - V)\,\Psi_k = 0 , \tag{42} \]

where a prime denotes differentiation with respect to the conformal time and
$V = (a/g)''/(a/g)$. In the following, we shall restrict our attention to a class
of minimal PBB models characterized by an initial accelerated, dilaton-driven
phase followed by a stringy phase (during which $H$ and $d\varphi/dt$ are assumed
to be approximately constant [37]) eventually evolving towards a standard
radiation-dominated phase. During the dilaton-dominated regime ($-\infty < \eta <
\eta_s < 0$) the scale factor and the dilaton field read
\[ a(\eta) = -\frac{1}{H_s\,\eta_s}\left[\frac{\eta - (1-\alpha)\eta_s}{\alpha\,\eta_s}\right]^{-\alpha} , \tag{43} \]

\[ \varphi(\eta) = \varphi_s - \gamma\,\log\frac{\eta - (1-\alpha)\eta_s}{\alpha\,\eta_s} , \tag{44} \]

where $\alpha = 1/(1 + \sqrt{3})$, $\gamma = \sqrt{3}$. During the stringy phase ($\eta_s < \eta < \eta_1$)
one expects that higher-order terms saturate the growth of the curvature [37].
Hence during this phase the scale factor and the dilaton can be parametrized
as follows:

\[ a(\eta) = -\frac{1}{H_s\,\eta} , \tag{45} \]

\[ \varphi(\eta) = \varphi_s - 2\beta\,\ln\frac{\eta}{\eta_s} . \tag{46} \]

Finally, assuming that a non-perturbative dilaton potential sets in, stabilizing
the dilaton, the radiation phase ($\eta_1 < \eta < \eta_r$) is described by

\[ a(\eta) = \frac{1}{H_s\,\eta_1^2}\,(\eta - 2\eta_1) . \tag{47} \]
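For those three different phases the potential V reads

\[ V(\eta) = \frac{1}{4}\,(4\nu^2 - 1)\,[\eta - (1-\alpha)\eta_s]^{-2} , \qquad -\infty < \eta < \eta_s , \tag{48} \]

\[ V(\eta) = \frac{1}{4}\,(4\mu^2 - 1)\,\eta^{-2} , \qquad \eta_s < \eta < \eta_1 , \tag{49} \]

\[ V(\eta) = 0 , \qquad \eta_1 < \eta < \eta_r , \tag{50} \]

where $2\nu = |2\alpha - \gamma + 1|$, $2\mu = |2\beta - 3|$. The exact solutions of (42) in the
three phases are

\[ \Psi_k = \sqrt{|\eta - (1-\alpha)\eta_s|}\;H_\nu^{(2)}(k|\eta - (1-\alpha)\eta_s|) , \qquad -\infty < \eta < \eta_s , \tag{51} \]

\[ \Psi_k = \sqrt{|\eta|}\left[A_+ H_\mu^{(2)}(k|\eta|) + A_- H_\mu^{(1)}(k|\eta|)\right] , \qquad \eta_s < \eta < \eta_1 , \tag{52} \]

\[ \Psi_k = i\,\sqrt{\frac{2}{\pi k}}\left[B_+ e^{-ik\eta} - B_- e^{ik\eta}\right] , \qquad \eta_1 < \eta < \eta_r , \tag{53} \]

where $H_{\mu,\nu}^{(1,2)}$ are Hankel functions of the first and second kind. The
Bogoliubov coefficients $A_\pm$, $B_\pm$ can be computed by requiring the continuity of the
Fourier modes and of their first derivatives on the space-like surfaces $\eta = \eta_s$ and
$\eta = \eta_1$. The result for the spectrum is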
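\[ \Omega_{\rm GW}(f) = a(\mu)\,\frac{(2\pi f_s)^4}{H_0^2 M_{\rm pl}^2}\left(\frac{f_1}{f_s}\right)^{2\mu+1}\left(\frac{f}{f_s}\right)^{5-2\mu}\left| H_\nu^{(2)}\!\left(\frac{\alpha f}{f_s}\right) J_\mu'\!\left(\frac{f}{f_s}\right) - H_\nu^{(2)\prime}\!\left(\frac{\alpha f}{f_s}\right) J_\mu\!\left(\frac{f}{f_s}\right) - \frac{(1-\alpha)}{2\alpha}\,\frac{f_s}{f}\,H_\nu^{(2)}\!\left(\frac{\alpha f}{f_s}\right) J_\mu\!\left(\frac{f}{f_s}\right)\right|^{2} , \tag{54} \]

where

\[ a(\mu) = \frac{\alpha}{48}\,2^{2\mu}\,(2\mu - 1)^2\,\Gamma^2(\mu) . \]

For the class of cosmological models under consideration $\nu = 0$; hence, using
the identity $H_0^{(2)\prime}(z) = -H_1^{(2)}(z)$, the spectrum is given by

\[ \Omega_{\rm GW}(f) = a(\mu)\,\frac{(2\pi f_s)^4}{H_0^2 M_{\rm pl}^2}\left(\frac{f_1}{f_s}\right)^{2\mu+1}\left(\frac{f}{f_s}\right)^{5-2\mu}\left| H_0^{(2)}\!\left(\frac{\alpha f}{f_s}\right) J_\mu'\!\left(\frac{f}{f_s}\right) + H_1^{(2)}\!\left(\frac{\alpha f}{f_s}\right) J_\mu\!\left(\frac{f}{f_s}\right) - \frac{(1-\alpha)}{2\alpha}\,\frac{f_s}{f}\,H_0^{(2)}\!\left(\frac{\alpha f}{f_s}\right) J_\mu\!\left(\frac{f}{f_s}\right)\right|^{2} . \tag{55} \]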
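The statement $\nu = 0$ and the smooth matching of the three background phases can be verified with a few lines of arithmetic. The sketch below is only illustrative: the function names and the sample values of $\eta_s$, $\eta_1$ and $H_s$ (arbitrary units) are assumptions. It evaluates $2\nu = |2\alpha - \gamma + 1|$ for $\alpha = 1/(1+\sqrt{3})$, $\gamma = \sqrt{3}$, the index $\mu$ from $2\mu = |2\beta - 3|$, and the scale factor on both sides of each transition.

```python
import math

ALPHA = 1.0 / (1.0 + math.sqrt(3.0))
GAMMA = math.sqrt(3.0)


def nu(alpha=ALPHA, gamma=GAMMA):
    """Index of the dilaton-phase Hankel function, 2 nu = |2 alpha - gamma + 1|."""
    return abs(2.0 * alpha - gamma + 1.0) / 2.0


def mu(beta):
    """Index of the stringy-phase Hankel functions, 2 mu = |2 beta - 3|."""
    return abs(2.0 * beta - 3.0) / 2.0


def a_dilaton(eta, eta_s, h_s, alpha=ALPHA):
    """Scale factor in the dilaton-driven phase (eta < eta_s < 0)."""
    return -1.0 / (h_s * eta_s) * ((eta - (1.0 - alpha) * eta_s) / (alpha * eta_s)) ** (-alpha)


def a_string(eta, h_s):
    """Scale factor in the stringy phase (eta_s < eta < eta_1 < 0)."""
    return -1.0 / (h_s * eta)


def a_radiation(eta, eta_1, h_s):
    """Scale factor in the radiation phase (eta > eta_1)."""
    return (eta - 2.0 * eta_1) / (h_s * eta_1**2)


if __name__ == "__main__":
    eta_s, eta_1, h_s = -10.0, -1.0, 1.0   # illustrative values only
    print("nu =", nu())
    print("mu(beta=0.5) =", mu(0.5))
    print(a_dilaton(eta_s, eta_s, h_s), a_string(eta_s, h_s))
    print(a_string(eta_1, h_s), a_radiation(eta_1, eta_1, h_s))
```

With $\beta$ between 0 and 3 the index $\mu$ stays in the interval $[0, 3/2]$.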
Assuming that the curvature scale at the onset of the stringy phase is $H_s \sim
1/\lambda_s \sim g_{\rm gut} M_{\rm pl} \sim 0.15\,M_{\rm pl}$ and that the cosmic time at which the
stringy phase ends is $t_1 \sim \lambda_s$, the peak frequency is $f_1 \sim 4.3 \times 10^{10}$ Hz. Hence
the spectrum depends on two arbitrary parameters, $f_s$ and $\beta$. (Note that
(40) can be recovered with the following mapping: $z_s = f_1/f_s$ and $g_s/g_1 =
(f_s/f_1)^{\beta}$, with $\beta$ given by $2\mu = |2\beta - 3|$.)
From (55), one finds that the maximum value of the spectrum compatible
with the BBN and CMB bounds is

\[ h_0^2\,\Omega_{\rm GW}^{\rm max} \sim 3.0 \times 10^{-7} . \tag{56} \]

Such a value is quite interesting, since it is about one order of magnitude below
the sensitivity of first-generation LIGO interferometers, and well above the
sensitivity of second-generation interferometers, such as Advanced LIGO.

3.2 Pre-big-bang Non-minimal Models

The SGWB in the minimal PBB model was originally evaluated neglecting
the higher-curvature corrections in the equation of tensorial fluctuations dur-
ing the stringy phase. Gasperini [38] evaluated the higher-order equation for
tensorial fluctuations and showed that these corrections modify the amplitude
of the perturbation only by a factor of order 1.
In [39, 40] the authors examined the effect of radiation produced during
reheating processes occurring below the string scale. Such processes may be
needed in the PBB model to dilute relic particles produced during (or at the
very end of) the PBB phase. The abundance of those particles could spoil the
BBN predictions [41]. Depending on when and for how long the entropy is
produced, it can change the shape and reduce the amplitude of the SGWB. If
we assume that the reheating process occurs at the end of the stringy phase
(i.e., all the entropy is produced at the end of the stringy phase), then the
effect of the process is a simple rescaling of the original spectrum by the factor
$(1 - \delta s)^{4/3}$, where $\delta s$ is the fraction of the present thermal entropy density
that the reheating process produced.
Finally, as first noticed in [39], it is quite possible that many more
cosmological phases are present between the pre- and the post-big-bang cosmological
phases (see, e.g., [39, 42]). If this is the case, the GW spectra produced during the
high-curvature and/or strong-coupling regime will be characterized by several
branches with increasing and decreasing slopes. Because the spectra then depend
on a larger number of parameters, it would be more difficult to
constrain these non-minimal scenarios using GW detectors.

4 Accessibility of LIGO to Pre-big-bang Models

We shall now discuss the implications of recent and future analyses of LIGO
data for PBB models, notably for their parameter space.

Fig. 1. The f1 − μ plane with fs = 30 Hz [18]. The shaded regions are excluded
by the LIGO S3 upper limit (darker) and by the LIGO S4 limit. The hatched
regions are accessible to future LIGO runs, assuming an observation time of 1 year:
the predicted sensitivity for the H1L1 pair, assuming first design configuration (−);
expected LIGO sensitivity for the H1H2 pair, assuming first design configuration (\);
expected Advanced LIGO sensitivity for the H1H2 pair, assuming interferometer
configuration optimized for the binary neutron star inspiral search (/). The solid
black curve is the exclusion curve consistent with the nucleosynthesis limit (the
excluded region is above the curve). The horizontal dashed line denotes the value
$f_1 = 4.3 \times 10^{10}$ Hz (courtesy of LIGO)

As already pointed out in the previous section, in minimal PBB models
the spectrum (55) is characterized by the following parameters: (i) the
dimensionless quantity $\mu$, defined by $2\mu = |2\beta - 3|$, which is positive definite and
constrained to be $\mu \le 3/2$, since for $\mu > 3/2$ the spectrum would be incompatible with
the existing experimental bounds, and (ii) the frequency parameter $f_s$, which
is defined so that $0 < f_s < f_1$. Since the spectrum (55) behaves as $(f/f_s)^3$
for $f < f_s$, a comparison with LIGO data is insensitive to values of $f_s$ above
the interferometer band. In particular, one finds that LIGO data can scan
values $f_s < 30$ Hz [18]. Furthermore, in the high-frequency limit, the value of
the spectrum is independent of $f_s$. The other parameter is the frequency $f_1$
(which defines the peak frequency of the spectrum)$^4$

\[ f_1 \sim 4.3 \times 10^{10}\,{\rm Hz}\,\left(\frac{H_s}{0.15\,M_{\rm pl}}\right)\left(\frac{t_1}{\lambda_s}\right)^{1/2} , \tag{57} \]

where $H_s$ is the curvature scale at the onset of the intermediate stringy phase
and $t_1$ is the value of the cosmic time at which such phase ends. Even assuming
$H_s \sim \lambda_s^{-1}$, $t_1 \sim \lambda_s$, since the spectrum (55) has a strong dependence on $f_1$
($\Omega_{\rm GW} \propto f_1^4$), an order-of-magnitude combined variation in $t_1$, $H_s$ can yield
quite a large variation in the spectrum.
Based on a previous analysis carried out in [19], during the last scientific
run the LIGO Scientific Collaboration scanned the three-dimensional
parameter space ($\mu$, $f_s$, $f_1$), using 192 s long intervals with 1/32 Hz resolution,
assessing the accessibility of LIGO to each of the PBB parameters describing
the spectrum (55). Furthermore, the design sensitivity of the initial and
advanced LIGO configurations was taken into account. A summary of the results
is shown in Fig. 1, where the $f_1 - \mu$ plane is considered (fixing $f_s = 30$ Hz).
The results pertaining to the third (S3) and fourth (S4) LIGO scientific runs
provide a first, albeit quite restricted, scanning of the parameter space. The
indirect BBN bound is still quite a strong constraint, but future and longer
runs of the LIGO interferometers are expected to enlarge the accessible part
of the parameter space, eventually overcoming the BBN bound.

$^4$ In the more common version of the minimal PBB model [22, 23, 42], the
frequency $f_1$ is obtained by imposing that the energy density becomes critical
at the beginning of the radiation phase and that the photons we observe
today originated from the amplified vacuum fluctuations during the
dilaton-driven inflationary phase. Within these assumptions (57) can be re-written as
$f_1 \simeq g_1^{1/2}\,(H_s/(0.15\,M_{\rm Pl}))^{1/2}\,(H_0 M_{\rm Pl})^{1/2}\,\Omega_\gamma^{1/4}$, where $\Omega_\gamma = 4 \times 10^{-5}\,h_0^{-2}$ and $g_1$
is the string coupling at the end of the stringy phase.

5 Conclusions

The most interesting and robust feature of the relic background of
gravitational radiation predicted by the PBB model is the positive spectral slope
($n_T = 3$) at low frequency, i.e., for modes that exit the horizon during
the dilaton-driven (super) inflationary phase and re-enter during the
radiation-dominated era. Such an attribute is a consequence of the Universe's equation of
state during the (super) inflationary PBB phase and is shared by other
non-conventional cosmological models. For example, in the quintessential inflationary
model [43], where the standard radiation-dominated era is preceded by a phase
characterized by a stiffer equation of state, the SGWB increases linearly with
frequency. A blue primordial spectrum of GWs could be produced in a class of
models with superluminal ($w = p/\rho < 1$) equation of state [44], as discussed
in [45], where the inflaton field is characterized by a non-local Lagrangian.
Furthermore, in other cosmological setups based upon superstring theory, such as
the cyclic/ekpyrotic models [46], the GW spectrum increases as a function
of frequency, although its amplitude normalization makes it unobservable by
ground- and space-based detectors.

The blue spectrum predicted by PBB models implies a lack of tensorial
contributions to the CMB temperature and polarization anisotropies.
Therefore, the detection of a tensorial component of primordial origin in the CMB
fluctuations would rule out the current version of the PBB model.
Finally, it is worth noticing that during the initial dilaton-driven
inflationary phase characteristic of PBB models the initial (classical) tensor
inhomogeneities are not de-amplified [47], as occurs in slow-roll inflation. This
result, though paradoxical, does not imply that the initial value of tensor
inhomogeneities must be fine-tuned to an unnaturally small value. Indeed, it
can be shown [47] that the energy density of such classical tensor fluctuations
is de-amplified. However, in order to solve the homogeneity problem
in those superstring-inspired models, tighter constraints [47] than in
slow-roll inflation models ought to be imposed.
All that said, future ground- and space-based detectors will be in the
unique and privileged position to either detect the PBB SGWB or place relevant
bounds on the parameter space of the string–cosmology scenario originally
proposed by Gabriele Veneziano.

References
1. http://www.ligo.org
2. http://www.virgo.pi.infn.it
3. http://www.geo600.uni-hannover.de
4. http://www.tama.mtk.nao.ac.jp
5. http://lisa.jpl.nasa.gov
6. http://science.hq.nasa.gov/universe/science/bang.html
7. N. Seto, S. Kawamura, T. Nakamura: Phys. Rev. Lett. 87, 221103 (2001)
8. E. Iacopini, E. Picasso, F. Pegoraro, L.A. Radicati: Phys. Lett. A 73, 140
(1979); C.M. Caves: Phys. Lett. B 80, 323 (1979)
9. A.M. Cruise: Class. Quant. Grav. 17, 2525 (2000); A.M. Cruise, R. Ingley:
Class. Quant. Grav. 22, S497 (2004)
10. http://www.skatelescope.org
11. B. Allen: “The stochastic gravity-wave background: sources and detection,”
[gr-qc/9604033]
12. M. Maggiore: Phys. Rep. 331, 283 (2000)
13. L.P. Grishchuk: Sov. Phys. JETP 40, 409 (1975); A.A. Starobinski: JETP Lett.
30, 682 (1979); V.A. Rubakov, M. Sazhin, A. Veryaskin: Phys. Lett. B 115,
189 (1982); R. Fabbri, M. Pollock: Phys. Lett. B 125, 445 (1983); L.F. Abbott,
D.D. Harari: Nucl. Phys. B 264, 487 (1986); B. Allen: Phys. Rev. D 37, 2078
(1988); V. Sahni: Phys. Rev. D 42, 453 (1990)
14. L.A. Boyle, P.J. Steinhardt, N. Turok: Phys. Rev. Lett. 96, 311101 (2006)
15. L.A. Boyle, P.J. Steinhardt: “Probing the early universe with inflationary
gravitational waves,” [astro-ph/0512014]
16. B.C. Friedman, A. Cooray, A. Melchiorri: “WMAP-normalized inflationary
model predictions and the search for primordial gravitational waves with direct
detection experiments,” [astro-ph/0610220]
17. T.L. Smith, M. Kamionkowski, A. Cooray: Phys. Rev. D 73, 023504 (2006)
18. B. Abbott et al.: “Searching for a stochastic background of gravitational waves
with LIGO,” [astro-ph/0608606]
19. V. Mandic, A. Buonanno: Phys. Rev. D 73, 063008 (2006)
20. G. Veneziano: Phys. Lett. B 265, 287 (1991)
21. M. Gasperini, G. Veneziano: Astropart. Phys. 1, 317 (1993); Mod. Phys. Lett.
A 8, 3701 (1993); Phys. Rev. D 50, 251 (1994)
22. M. Gasperini, G. Veneziano: Phys. Rep. 373, 1 (2003)
23. R. Brustein, M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B 361,
45 (1995)
24. C.J. Copi, D.N. Schramm, M.S. Turner: Phys. Rev. D 55, 3389 (1997)
25. T. Smith, E. Pierpaoli, M. Kamionkowski: Phys. Rev. Lett. 97, 021301 (2006)
26. G. Veneziano: “String cosmology: concepts and consequences,”
[hep-th/9512091]
27. B. Allen: Phys. Rev. D 37, 2078 (1988)
28. H. Kodama, M. Sasaki: Suppl. Prog. Theor. Phys. 78, 1 (1984);
V.F. Mukhanov, H.A. Feldman, R.H. Brandenberger: Phys. Rep. 215, 203
(1992)
29. W.H. Kinney, E.W. Kolb, A. Melchiorri, A. Riotto: Phys. Rev. D 74, 023502
(2006)
30. M. Turner: Phys. Rev. D 55, 435 (1997)
31. B.A. Campbell, A.D. Linde, K.A. Olive: Nucl. Phys. B 335, 146 (1991);
R. Brustein, P.J. Steinhardt: Phys. Lett. B 302, 196 (1993)
32. S. Kachru, R. Kallosh, A. Linde, J. Maldacena, L. McAllister, S.P. Trivedi:
JHEP 0408, 030 (2004)
33. J. Khoury, B.A. Ovrut, P.J. Steinhardt, N. Turok: Phys. Rev. D 64, 123522
(2001)
34. J. Khoury, B.A. Ovrut, N. Seiberg, P.J. Steinhardt, N. Turok: Phys. Rev.
D 65, 086007 (2002)
35. A. Buonanno, M. Maggiore, C. Ungarelli: Phys. Rev. D 55, 3330 (1997)
36. R. Brustein, M. Gasperini, M. Giovannini, V.F. Mukhanov, G. Veneziano:
Phys. Rev. D 51, 6744 (1995)
37. M. Gasperini, M. Maggiore, G. Veneziano: Nucl. Phys. B 494, 315 (1996)
38. M. Gasperini: Phys. Rev. D 56, 4815 (1997)
39. M. Gasperini: “Relic gravitons from the pre-big bang: what we know and what
we do not know,” [hep-th/9607146]
40. R. Brustein, M. Gasperini, G. Veneziano: Phys. Rev. D 55, 3882 (1997)
41. A. Buonanno, M. Lemoine, K.A. Olive: Phys. Rev. D 62, 083513 (2000)
42. A. Buonanno, K. Meissner, C. Ungarelli, G. Veneziano: JHEP 001, 004 (1998)
43. P.J.E. Peebles, A. Vilenkin: Phys. Rev. D 59, 063505 (1999); M. Giovannini:
Phys. Rev. D 60, 123511 (1999)
44. L. Grishchuk: “Relic gravitational waves and cosmology,” [astro-ph/0504018]
45. M. Baldi, F. Finelli, S. Matarrese: Phys. Rev. D 72, 083504 (2005)
46. L.A. Boyle, P. Steinhardt, N. Turok: Phys. Rev. D 69, 127302 (2004)
47. A. Buonanno, T. Damour: Phys. Rev. D 50, 3713 (2001)
Magnetic Fields, Strings and Cosmology

M. Giovannini

Centro “Enrico Fermi”, Via Panisperna 89/A, 00184 Rome, Italy,

and Department of Physics, Theory Division, CERN, 1211 Geneva 23, Switzerland
[email protected]

Abstract. The main motivations and challenges related to the physics of
large-scale magnetic fields are briefly reviewed. The interplay between large-scale magnetic
fields and scalar CMB anisotropies is addressed with specific attention to recent
progress.

1 Half a Century of Large-Scale Magnetic Fields

1.1 A Premise

The present contribution is devoted to large-scale magnetic
fields, whose origin, evolution and implications constitute today a rather
intriguing triple point in the phase diagram of physical theories. Indeed, sticking
to the existing literature (and refraining from dramatic statements on the
historical evolution of theoretical physics), it appears that the subject of
large-scale magnetization thrives and prospers at the crossroads of astrophysics,
cosmology and theoretical high-energy physics.
Following the kind invitation of Jnan Maharana and Maurizio Gasperini, I
am delighted to contribute to this set of lectures whose guideline is dictated by
the inspiring efforts of Gabriele Veneziano in understanding the fundamental
forces of Nature. My voice joins the choir of gratitude proceeding from the
whole physics community for the novel and intriguing results obtained by
Gabriele through the various stages of his manifold activity. I finally ought
to convey my personal thankfulness for the teachings, advice and generous
clues received during the last 15 years.

1.2 Length Scales

The typical magnetic field strengths in the Universe range from a few μG
(in the case of galaxies and clusters), to a few Gauss (in the case of
planets, like the Earth or Jupiter), and up to $10^{12}$ G in neutron stars. Magnetic
fields are not only observed in planets and stars but also in the interstellar
medium, in the intergalactic medium and, last but not least, in the intracluster
medium.

M. Giovannini: Magnetic Fields, Strings and Cosmology, Lect. Notes Phys. 737, 863–939
(2008)
DOI 10.1007/978-3-540-74233-6_26 © Springer-Verlag Berlin Heidelberg 2008
Magnetic fields whose correlation length is larger than the astronomical
unit (1 AU = $1.49 \times 10^{13}$ cm) will be named large-scale magnetic fields. In fact,
magnetic fields with approximate correlation scale comparable with the earth–
sun distance are not observed (on the contrary, both the magnetic field of the
sun and the one of the earth have a clearly distinguishable localized structure).
Moreover, in magnetohydrodynamics (MHD), the magnetic diffusivity scale
(i.e. the scale below which magnetic fields are diffused because of the finite
value of the conductivity) turns out to be, amusingly enough, of the order of
the AU.

1.3 The Early History

In the 1940s there was no empirical evidence for large-scale magnetic fields. For
instance, there was no evidence of magnetic fields associated with the galaxy
as a whole with a rough correlation scale of 30 kpc.$^1$ More specifically, the
theoretical situation can be summarized as follows. The seminal contributions
of Alfvén [1] convinced the community that magnetic fields can have a very
long lifetime in a highly conducting plasma. Later on, in 1970, Alfvén
was awarded the Nobel prize “for fundamental work and discoveries in
magnetohydrodynamics with fruitful applications in different parts of plasma
physics”.
Using the discoveries of Alfvén, Fermi [2] postulated, in 1949, the
existence of a large-scale magnetic field permeating the galaxy with approximate
intensity of a micro-Gauss and, hence, in equilibrium with the cosmic rays.$^2$
Alfvén [3] did not react positively to the proposal of Fermi, insisting, from a
somewhat opposite perspective, that cosmic rays are in equilibrium with stars
and disregarding completely the possibility of a galactic magnetic field. Today
we do know that this may be the case for low-energy cosmic rays but certainly
not for the most energetic ones around, and beyond, the knee in the cosmic
ray spectrum.
At the historical level it is amusing to notice that the mentioned
controversy can be fully understood from volume 75 of Physical Review, where it is
possible to consult the article of Fermi [2], the article of Alfvén [3] and even a
paper by Richtmyer and Teller [4] supporting the views and doubts of Alfvén.

$^1$ Recall that 1 kpc = $3.085 \times 10^{21}$ cm. Moreover, 1 Mpc = $10^3$ kpc. The present size
of the Hubble radius is $H_0^{-1} = 1.2 \times 10^{28}$ cm $\equiv 4.1 \times 10^3$ Mpc for h = 0.73.
$^2$ In this contribution magnetic fields will be expressed in Gauss. In SI units
1 T = $10^4$ G. For practical reasons, in cosmic ray physics and in cosmology it is also
useful to express the magnetic field in GeV$^2$ (in units $\hbar = c = 1$). Recalling that
the Bohr magneton is about $5.7 \times 10^{-11}$ MeV/T, the conversion factor will then be
1 G = $1.95 \times 10^{-20}$ GeV$^2$. The use of Gauss (G) instead of Tesla (T) is justified by
the existing astrophysical literature where magnetic fields are typically expressed
in Gauss.
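The unit conversions quoted in the footnotes of this section can be reproduced with a short calculation. The sketch below is a hedged illustration: the constants are standard reference values supplied here (not taken from the text), Heaviside-Lorentz natural units with $e = \sqrt{4\pi\alpha_{\rm em}}$ are assumed, and the function names are hypothetical. It uses $\mu_B = e/(2m_e)$ to convert one Gauss to GeV$^2$ and evaluates the Hubble radius $c/H_0$ for h = 0.73.

```python
import math

# Reference values assumed for this illustration (not taken from the text)
MU_BOHR_MEV_PER_T = 5.788e-11     # Bohr magneton in MeV/T
M_ELECTRON_MEV = 0.511            # electron mass in MeV
ALPHA_EM = 1.0 / 137.036          # fine-structure constant
C_KM_PER_S = 299792.458           # speed of light in km/s
CM_PER_MPC = 3.0857e24            # centimetres in one megaparsec


def gauss_in_gev2():
    """1 G in GeV^2 for hbar = c = 1, Heaviside-Lorentz units.
    From mu_B = e/(2 m_e): B = 2 m_e mu_B B / e."""
    e_charge = math.sqrt(4.0 * math.pi * ALPHA_EM)
    tesla_in_mev2 = 2.0 * M_ELECTRON_MEV * MU_BOHR_MEV_PER_T / e_charge
    tesla_in_gev2 = tesla_in_mev2 * 1.0e-6    # MeV^2 -> GeV^2
    return tesla_in_gev2 * 1.0e-4             # 1 G = 1e-4 T


def hubble_radius_mpc(h):
    """c/H0 in Mpc for H0 = 100 h km/s/Mpc."""
    return C_KM_PER_S / (100.0 * h)


if __name__ == "__main__":
    print("1 G in GeV^2:", gauss_in_gev2())
    r_mpc = hubble_radius_mpc(0.73)
    print("Hubble radius:", r_mpc, "Mpc =", r_mpc * CM_PER_MPC, "cm")
```

Both results agree with the quoted figures at the level of the rounding used there.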
In 1949 Hiltner [5] and, independently, Hall [6] observed polarization of
starlight which was later on interpreted by Davis and Greenstein [7] as an
effect of galactic magnetic field aligning the dust grains.
According to this chain of events it is legitimate to conclude that
• the discoveries of Alfvén were essential to the proposal of Fermi, who was
already pondering the origin of cosmic rays in 1938, before leaving Italy$^3$ because
of the infamous fascist legislation, and
• the idea that cosmic rays are in equilibrium with the galactic magnetic
fields (and hence that the galaxy possesses a magnetic field) was essential
in the correct interpretation of the first, fragile, optical evidence of galactic
magnetization.
The origin of the galactic magnetization, according to [2], had to be somehow
primordial. It should be noticed, for the sake of completeness, that the
observations of Hiltner [5] and Hall [6] took place from November 1948 to January
1949. The paper of Fermi [2] was submitted in January 1949, but it contains
no reference to the work of Hiltner and Hall. This indicates that Fermi was
probably not aware of these optical measurements.
The idea that large-scale magnetization should somehow be the remnant
of the initial conditions of the gravitational collapse of the protogalaxy
was further pursued by Fermi in collaboration with Chandrasekhar [8, 9] who
tried, rather ambitiously, to connect the magnetic field of the galaxy to its
angular momentum.

1.4 The Middle Ages

In the 1950s various observations of the polarization of the Crab nebula suggested
that the Milky Way (MW) is not the only magnetized structure in the sky.
The effective new twist in the observations of large-scale magnetic fields was
the development (through the 1950s and 1960s) of radio-astronomical tech-
niques. From these measurements, the first unambiguous evidence of radio-
polarization from the Milky Way was obtained (see [10] and references therein
for an account of these developments).
It was also soon realized that the radio-Zeeman effect (counterpart of the
optical Zeeman splitting employed to determine the magnetic field of the sun)
could offer accurate determination of (locally very strong) magnetic fields
in the galaxy. The observation of Lyne and Smith [11] that pulsars could
be used to determine the column density of electrons along the line of sight
opened the possibility of using not only synchrotron emission as a diagnostic
of the presence of a large-scale magnetic field, but also Faraday rotation. For

3
The author is indebted to Prof. G. Cocconi who was so kind as to share his personal
recollections of the scientific discussions with E. Fermi.

a masterly introduction to pulsar physics the reader may consult the
book of Lyne and Smith [12].
In the 1970s all the basic experimental tools for the analysis of galactic
and extragalactic magnetic fields were ready. Around this epoch extensive
reviews of the experimental endeavors also started appearing; a very nice
account can be found, for instance, in the review of Heiles [13].
It became gradually evident in the early 1980s that measurements of large-
scale magnetic fields in the MW and in the external galaxies are two comple-
mentary aspects of the same problem. While MW studies can provide valuable
information concerning the local structure of the galactic magnetic field, the
observation of external galaxies provides the only viable tool for the recon-
struction of the global features of the galactic magnetic fields.
Since the early 1970s, considerable attention has been paid not only
to the magnetic fields of the galaxies but also to the magnetic fields of the
clusters. A cluster is a gravitationally bound system of galaxies. The local
group (i.e. our cluster containing the MW and Andromeda together with about
fifty other galaxies) is an irregular cluster in the sense that it contains fewer galaxies
than typical clusters in the Universe. Other clusters (like Coma, Virgo) are
more typical and are then called regular or Abell clusters. As an order of
magnitude estimate, Abell clusters can contain $10^{3}$ galaxies.

1.5 New Twists

In the 1990s magnetic fields were measured in single Abell clusters, but
around the turn of the century these estimates became more reliable, thanks
to improved experimental techniques. In order to estimate magnetic fields in
clusters, an independent knowledge of the electron density along the line of
sight is needed. Recently, Faraday rotation measurements obtained by radio
telescopes (like the VLA$^4$) have been combined with independent measurements
of the electron density in the intracluster medium. This was made possible by
the maps of the x-ray sky obtained with satellite measurements (in particular
ROSAT$^5$). This improvement in the experimental capabilities seems to have
partially settled the issue, confirming the measurements of the early 1990s and
implying that clusters are also endowed with a magnetic field of micro-Gauss
strength which is not associated with individual galaxies [14, 15].
While entering the new millennium the capabilities of the observers are
really confronted with a new challenge: the possibility that also superclus-
ters are endowed with their own magnetic field. Superclusters are (loosely)
gravitationally bound systems of clusters. An example is the local superclus-
ter formed by the local group and by the VIRGO cluster. Recently a large
4
The Very Large Array Telescope consists of 27 parabolic antennas spread over a
surface of 20 km$^2$ in Socorro (New Mexico).
5
The Röntgen SATellite (flying from June 1991 to February 1999) provided maps
of the x-ray sky in the range 0.1–2.5 keV. A catalog of x-ray bright Abell clusters
was compiled.

new sample of Faraday rotation measures of polarized extragalactic sources
has been compared with galaxy counts in Hercules and Perseus-Pisces (two
nearby superclusters) [16]. First attempts to detect magnetic fields associated
with superclusters have been reported [17]. A cautious and conservative
approach suggests that this fragile evidence must be corroborated by more
conclusive observations (especially in light of the, sometimes dubious,
independent determination of the electron density$^6$). However, it is not excluded
that, as the 1990s gave us firmer evidence of cluster magnetism, the new
millennium may give us a more solid understanding of supercluster magnetism.
In the present historical introduction various experimental techniques have
been swiftly mentioned. A more extensive introductory description of these
techniques can be found in [18].

1.6 Hopes for the Future

The hope for the near future is connected with the possibility of a
next-generation radio-telescope. Along this line the SKA (square kilometer array) has
been proposed [15] (see also [19]). While the technical features of the
instrument cannot be thoroughly discussed in the present contribution, it suffices
to notice that the collecting area of the instrument, as the name suggests, will
be of $10^{6}\,\mathrm{m}^2$. The specifications for the SKA require an angular resolution of
0.1 arcsec at 1.4 GHz, a frequency capability of 0.1–25 GHz and a field of
view of at least 1 deg$^2$ at 1.4 GHz [19]. The number of independent beams is
expected to be larger than 4 and the number of instantaneous pencil beams
will be roughly 100 with a maximum primary beam separation of about 100
deg at low frequencies (becoming 1 deg at high frequencies, i.e. of the order of
1 GHz). These specifications will probably allow full-sky surveys of Faraday
rotation.
The frequency range of SKA is rather suggestive if we compare it with
the one of the Planck experiment [20]. Planck will operate in nine frequency
channels from 30 to, approximately, 900 GHz. While the three low-frequency
channels (from 30 to 70 GHz) are not sensitive to polarization, the six high-
frequency channels (between 100 and 857 GHz) will be definitely sensitive to
CMB polarization. Now, it should be appreciated that the Faraday rotation
signal decreases with the frequency $\nu$ as $\nu^{-2}$. Therefore, for lower frequencies
the Faraday rotation signal will be larger than in the six high-frequency
channels. Consequently, it is legitimate to hope for a fruitful interplay between the
next generation of SKA-like radio-telescopes and CMB satellites. Indeed, as
suggested above, the upper branch of the frequency capability of SKA almost

6
In [21] it was cleverly argued that information on the plasma densities from
direct observations can be gleaned from detailed multifrequency observations of
a few giant radio-galaxies (GRG) having dimensions up to 4 Mpc. The estimates
based on this observation suggest column densities of electrons between $10^{-6}$ and
$10^{-5}\,\mathrm{cm}^{-3}$.

overlaps with the lower frequency of Planck so that possible effects of large-scale
magnetic fields on CMB polarization could be, with some luck, addressed
with the combined action of both instruments. In fact, the same mechanism
leading to the Faraday rotation in the radio leads to a Faraday rotation of the
CMB provided the CMB is linearly polarized. These considerations suggest, as
emphasized in a recent topical review, that CMB anisotropies are germane to
several aspects of large-scale magnetization [18]. The considerations reported
so far suggest that during the next decade the destiny of radio-astronomy and
CMB physics will probably be linked together and not only for reasons of
convenience.
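As a rough illustration of the $\nu^{-2}$ scaling invoked above, the following sketch compares the Faraday rotation signal at the SKA reference frequency of 1.4 GHz with the signal at some of the Planck frequencies quoted in the text. Only the relative scaling is used: the overall normalization (which depends on the magnetic field and on the electron column density) is deliberately left out, and the function name is purely illustrative.

```python
def faraday_ratio(nu_low_ghz: float, nu_high_ghz: float) -> float:
    """Ratio (signal at nu_low) / (signal at nu_high), assuming a pure nu**-2 law."""
    return (nu_high_ghz / nu_low_ghz) ** 2


SKA_REFERENCE_GHZ = 1.4                       # reference frequency quoted for SKA
PLANCK_CHANNELS_GHZ = [30.0, 70.0, 100.0, 857.0]  # frequencies quoted for Planck

for nu in PLANCK_CHANNELS_GHZ:
    ratio = faraday_ratio(SKA_REFERENCE_GHZ, nu)
    print(f"signal at {SKA_REFERENCE_GHZ} GHz / signal at {nu} GHz = {ratio:.3g}")
```

Under this assumption the signal at 1.4 GHz exceeds the one at 100 GHz by more than three orders of magnitude, which is the quantitative content of the remark on the interplay between radio and CMB observations.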

1.7 Few Burning Questions

In this general and panoramic view of the history of the subject we moved
from the relatively old controversy opposing E. Fermi to H. Alfvén to the
still uncertain but foreseeable future developments. While the nature of the
future developments is inextricably connected with the advent of new
instrumental capabilities, it is legitimate to remark that, in more than 50 years,
magnetic fields have been detected over scales that are progressively larger.
From the historical development of the subject a series of questions arises
naturally:
• What is the origin of large-scale magnetic fields?
• Are magnetic fields primordial as assumed by Fermi more than 50 years
ago?
• Even assuming that large-scale magnetic fields are primordial, is there a
theory for their generation?
• Is there a way to understand if large-scale magnetic fields are really pri-
mordial?
In what follows we will not give definite answers to these important questions,
but we shall be content with outlining possible avenues of new developments.
The plan of the present lecture will be the following. In Sect. 2 the main
theoretical problems connected with the origin of large-scale magnetic fields
will be discussed. In Sect. 3 the attention will be focused on the problem of
large-scale magnetic field generation in the framework of string cosmological
models, a subject where the pre-big-bang model, in its various incarnations,
plays a crucial role. But, finally, are large-scale magnetic fields really
primordial? Were they really present prior to matter–radiation equality? A modest
approach to these important questions suggests studying the physics of
magnetized CMB anisotropies which will be introduced, in its essential lines, in
Sect. 4. The concluding remarks are collected in Sect. 5.

2 Magnetogenesis
While in the previous section the approach has been purely historical, the
experimental analysis of large-scale magnetic fields prompts a collection of
interesting theoretical problems. They can be summarized by the following
chain of evidences (see also [18]):
• In spiral galaxies magnetic fields follow the orientation of the spiral arms,
where matter is clustered because of differential rotation. While there may
be an asymmetry in the intensities of the magnetic field in the northern
and southern hemisphere (as happens in the case of the Milky Way),
the typical strength is in the range of the micro-Gauss.
• Locally magnetic fields may even be in the milli-Gauss range and, in this
case, they may be detected through Zeeman splitting techniques.
• In spiral galaxies the magnetic field is predominantly toroidal with a
poloidal component present around the nucleus of the galaxy and extend-
ing for, roughly, 100 pc.
• The correlation scale of the magnetic field in spirals is of the order of 30
kpc.
• In elliptical galaxies magnetic fields have been measured at the micro-
Gauss level, but the correlation scale is shorter than in the case of spirals:
this is due to the different evolutionary history of elliptical galaxies and
to their lack of differential rotation.
• Abell clusters of galaxies exhibit magnetic fields present in the so-called
intracluster medium: these fields, always at the micro-Gauss level, are not
associated with individual galaxies.
• Superclusters might also be magnetized even if, at the moment, conclusions
are premature, as partially explained in Sect. 1 (see also [17] and [18]).
The statements collected above rest on various detection techniques ranging
from Faraday rotation, to synchrotron emission, to Zeeman splitting of
clouds of molecules with an unpaired electron spin. The experimental
evidence briefly summarized above seems to suggest that different and distant
objects have magnetic fields of comparable strength. The second suggestion
seems to be that the strength of the magnetic fields is, in the first
(simplistic) approximation, independent of the physical scale.
These empirical coincidences are somewhat reminiscent of one of the motivations of the
standard hot big-bang model, namely, the observation that the light elements
are equally abundant in rather different parts of our Universe. The
approximate equality of the abundances implies that, unlike the heavier elements, the
light elements have primordial origin. The four light isotopes D, $^3$He, $^4$He and
$^7$Li are mainly produced at a specific stage of the hot big bang model named
nucleosynthesis occurring below the typical temperature of 0.8 MeV when
neutrinos decouple from the plasma and the neutron abundance evolves via
free neutron decay [23]. The abundances calculated in the simplest big-bang
nucleosynthesis model agree fairly well with the astronomical observations.

In similar terms it is plausible to argue that large-scale magnetic fields
have comparable strengths at large scales because the initial conditions for
their evolutions were the same, for instance at the time of the gravitational
collapse of the protogalaxy. The way the initial conditions for the evolution of
large-scale magnetic fields are set is generically named magnetogenesis [18].
There is another comparison which might be useful. Back in the 1970s the
so-called Harrison–Zeldovich spectrum was postulated. Later, with the devel-
opments of inflationary cosmology the origin of a flat spectrum of curvature
and density profiles has been justified on the basis of a period of quasi-de Sitter
expansion named inflation. It is plausible that in some inflationary models not
only the fluctuations of the geometry are amplified but also the fluctuations of
the gauge fields. This happens if, for instance, gauge couplings are effectively
dynamical. As the Harrison–Zeldovich spectrum can be used as initial condi-
tion for the subsequent Newtonian evolution, the primordial spectrum of the
gauge fields can be used as initial condition for the subsequent MHD evolution
which may lead, eventually, to the observed large-scale magnetic fields. The
plan of the present section is the following. In Sect. 2.1 some general ideas of
plasma physics will be summarized with particular attention to those tools
that will be more relevant for the purposes of this lecture. In Sect. 2.2 the
concept of dynamo amplification will be introduced in a simplified perspec-
tive. In Sect. 2.3 it will be argued that the dynamo amplification, in one of its
potential incarnations, necessitates some initial conditions or as we say in the
jargon, some seed field. In Sect. 2.4 a panoramic view of astrophysical seeds
will be presented with the aim of stressing the common aspects of, sometimes
diverse, physical mechanisms. In Sects. 2.5 and 2.6 the two basic approaches to
cosmological magnetogenesis will be illustrated. In the first case (see Sect.
2.5) magnetic fields are produced inside the Hubble radius at a given stage in
the life of the Universe. In the second case (see Sect. 2.6) vacuum fluctuations
of the hypercharge field are amplified during an inflationary stage of expan-
sion. Section 2.7 deals with the major problem of inflationary magnetogenesis,
namely, conformal (Weyl) invariance whose breaking will be one of the themes
of string cosmological mechanisms for the generation of large-scale magnetic
fields.

2.1 Magnetized Plasmas

Large-scale magnetic fields evolve in a plasma, i.e. a system often described
as the fourth state of matter. As we can walk in the phase diagram of a given
chemical element by going from the solid to the liquid and to the gaseous
state with a series of diverse phase transitions, a plasma can be obtained
by ionizing a gas. A typical example of weakly coupled plasma is therefore
an ionized gas. Examples of strongly coupled plasmas can be found also in
solid-state physics. An essential physical scale that has to be introduced in
the description of plasma properties is the so-called Debye length that will be
discussed in the following paragraph.

Different descriptions of a plasma exist and they range from effective fluid
models of charged particles [24, 25, 26, 27] to kinetic approaches like the ones
pioneered by Vlasov [28] and Landau [29]. From a physical point of view, a
plasma is a system of charged particles which is globally neutral for typical
length scales larger than the Debye length $\lambda_{\mathrm{D}}$:
$$\lambda_{\mathrm{D}} = \sqrt{\frac{T_0}{8\pi n_0 e^2}}, \tag{1}$$
where $T_0$ is the kinetic temperature and $n_0$ the mean charge density of the
electron–ion system, i.e. $n_e \simeq n_i = n_0$. For a test particle the Coulomb
potential will then have the usual Coulomb form, but it will be suppressed, at large
distances, by a Yukawa term, i.e. $e^{-r/\lambda_{\mathrm{D}}}$. In the interstellar medium there are
three kinds of regions which are conventionally defined:
• H$_2$ regions, where the hydrogen is predominantly in molecular form;
• H$^0$ regions, where hydrogen is in atomic form (also denoted by HI);
• and H$^+$ regions, where hydrogen is ionized (also denoted by HII).
In the H$^+$ regions the typical temperature $T_0$ is of the order of 10–20 eV while
for $n_0$ let us take, for instance, $n_0 \sim 3 \times 10^{-2}\,\mathrm{cm}^{-3}$. Then $\lambda_{\mathrm{D}} \sim 30$ km.
For $r \gg \lambda_{\mathrm{D}}$ the Coulomb potential is screened by the global effect of the
other particles in the plasma. Suppose now that particles exchange momentum
through two-body interactions. Their cross section will be of the order of
$\alpha_{\mathrm{em}}^2/T_0^2$ and the mean free path will be $\ell_{\mathrm{mfp}} \sim T_0^2/(\alpha_{\mathrm{em}}^2 n_0)$, i.e. recalling
(1), $\lambda_{\mathrm{D}} \ll \ell_{\mathrm{mfp}}$. This means that the plasma is a weakly collisional system
which is, in general, not in local thermodynamical equilibrium and this is
the reason why we introduced $T_0$ as the kinetic (rather than thermodynamic)
temperature.
The last observation can be made even more explicit by defining another
important scale, namely, the plasma frequency which, in the system under
discussion, is given by
$$\omega_{pe} = \sqrt{\frac{4\pi n_0 e^2}{m_e}} \simeq 2 \left(\frac{n_0}{10^{3}\,\mathrm{cm}^{-3}}\right)^{1/2} \mathrm{MHz}, \tag{2}$$

where $m_e$ is the electron mass. Notice that, in the interstellar medium (i.e. for
$n_0 \simeq 10^{-2}\,\mathrm{cm}^{-3}$), (2) gives a plasma frequency in the kilohertz range. This
observation is important, for instance, in the treatment of Faraday rotation
since the plasma frequency is typically much larger than the Larmor frequency,
i.e.
$$\omega_{Be} = \frac{e B_0}{m_e} \simeq 18.08 \left(\frac{B_0}{10^{-3}\,\mathrm{G}}\right) \mathrm{kHz}, \tag{3}$$
implying, for $B_0 \simeq \mu\mathrm{G}$, $\omega_{Be} \simeq 20$ Hz. The same hierarchy holds also when the
(free) electron density is much larger than in the interstellar medium, and, for
instance, at the last scattering between electrons and photons for a redshift
$z_{\mathrm{dec}} \simeq 1100$ (see Sect. 4).
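The hierarchy between the plasma and the Larmor frequencies can be checked with a few lines of code. The sketch below evaluates (2) and (3) in Gaussian CGS units, where the speed of light has to be restored in the Larmor frequency ($\omega_{Be} = e B_0/(m_e c)$). The numerical constants are standard values that are not taken from the text, and the function names are illustrative only.

```python
import math

# Gaussian CGS constants (standard values, assumed here; not quoted in the text)
E_CHARGE = 4.80320e-10    # electron charge [esu]
M_ELECTRON = 9.10938e-28  # electron mass [g]
C_LIGHT = 2.99792e10      # speed of light [cm/s]


def plasma_frequency(n0_cm3: float) -> float:
    """Electron plasma frequency sqrt(4 pi n0 e^2 / m_e), in rad/s."""
    return math.sqrt(4.0 * math.pi * n0_cm3 * E_CHARGE**2 / M_ELECTRON)


def larmor_frequency(b0_gauss: float) -> float:
    """Electron Larmor frequency e B0 / (m_e c), in rad/s."""
    return E_CHARGE * b0_gauss / (M_ELECTRON * C_LIGHT)


# Reference values appearing in the prefactors of (2) and (3)
print(f"omega_pe(n0 = 1e3 cm^-3) = {plasma_frequency(1e3):.3e} rad/s")
print(f"omega_Be(B0 = 1e-3 G)    = {larmor_frequency(1e-3):.3e} rad/s")

# Hierarchy for interstellar values: n0 ~ 1e-2 cm^-3 and B0 ~ 1 micro-Gauss
ratio = plasma_frequency(1e-2) / larmor_frequency(1e-6)
print(f"omega_pe / omega_Be      = {ratio:.3g}")
```

With these inputs the two reference values come out close to the rounded prefactors of (2) and (3), and the ratio $\omega_{pe}/\omega_{Be}$ for interstellar values is of the order of a few hundred.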
The plasma frequency is the oscillation frequency of the electrons when
they are displaced from their equilibrium configuration in a background of
approximately fixed ions. Recalling that $v_{\mathrm{ther}} \simeq \sqrt{T_0/m_e}$ is the thermal velocity
of the charge carriers, the collision frequency $\omega_c \simeq v_{\mathrm{ther}}/\ell_{\mathrm{mfp}}$ is always much
smaller than $\omega_{pe} \simeq v_{\mathrm{ther}}/\lambda_{\mathrm{D}}$. Thus, in the idealized system described so far,
the following hierarchy of scales holds
$$\lambda_{\mathrm{D}} \ll \ell_{\mathrm{mfp}}, \qquad \omega_c \ll \omega_{pe}, \tag{4}$$
which means that before undergoing one collision the system performs many
oscillations, or, in other words, that the mean free path is not the shortest scale in
the problem. Usually one also defines the plasma parameter $\mathcal{N} = n_0^{-1} \lambda_{\mathrm{D}}^{-3}$, i.e.
the inverse of the number of particles in the Debye sphere. In the approximation of weakly
coupled plasma, $\mathcal{N} \ll 1$, which also implies that the mean kinetic energy of the
particles is larger than the mean inter-particle potential.
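The link between the hierarchy $\lambda_{\mathrm{D}} \ll \ell_{\mathrm{mfp}}$ and the smallness of the plasma parameter can be made explicit with a short order-of-magnitude estimate. The steps below assume Gaussian natural units ($\hbar = c = 1$), so that $e^2 = \alpha_{\mathrm{em}}$; this identification is an assumption of the sketch, and numerical factors should not be taken too literally since the mean free path is itself only an estimate.

```latex
% From (1), with e^2 = \alpha_{em}:
T_0 = 8\pi\, n_0\, \alpha_{\mathrm{em}}\, \lambda_{\mathrm{D}}^{2}.
% Inserting this into the estimate of the mean free path:
\ell_{\mathrm{mfp}} \sim \frac{T_0^{2}}{\alpha_{\mathrm{em}}^{2}\, n_0}
  = \frac{64\pi^{2}\, n_0^{2}\, \alpha_{\mathrm{em}}^{2}\, \lambda_{\mathrm{D}}^{4}}{\alpha_{\mathrm{em}}^{2}\, n_0}
  = 64\pi^{2}\, n_0\, \lambda_{\mathrm{D}}^{4}.
% Hence, in terms of the plasma parameter \mathcal{N} = n_0^{-1}\lambda_D^{-3}:
\frac{\ell_{\mathrm{mfp}}}{\lambda_{\mathrm{D}}} \sim 64\pi^{2}\, n_0\, \lambda_{\mathrm{D}}^{3}
  = \frac{64\pi^{2}}{\mathcal{N}}.
```

In this estimate the mean free path exceeds the Debye length by a factor that grows as the inverse of the plasma parameter, so that the two conditions of weak coupling and weak collisionality go together.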
The spectrum of plasma excitations is a rather vast subject and it will
not be strictly necessary for the following considerations (for further details see
[24, 25, 26]). It is sufficient to remark that we can envisage, broadly speaking,
two regimes that are physically different:
• typical length-scales much larger than $\lambda_{\mathrm{D}}$ and typical frequencies much
smaller than $\omega_{pe}$;
• typical length-scales smaller than (or comparable with) $\lambda_{\mathrm{D}}$ and typical
frequencies much larger than $\omega_{pe}$.
In the first situation reported above it can be shown that a single-fluid de-
scription suffices. The single-fluid description is justified, in particular, for the
analysis of the dynamo instability which occurs for dynamical times of the
order of the age of the galaxy and length-scales larger than the kiloparsec. In
the opposite regime, i.e. $\omega \gtrsim \omega_{pe}$ and $L \lesssim \lambda_{\mathrm{D}}$, the single-fluid approach breaks
down and a multi-fluid description is mandatory. This is, for instance, the
branch of the spectrum of plasma excitation where the displacement current
(and the related electromagnetic propagation) cannot be neglected. A more
reliable description is provided, in this regime, by the Vlasov–Landau (i.e.
kinetic) approach [28, 29] (see also [25]).
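Consider, therefore, a two-fluid system of electrons and protons. This
system will be described by the continuity equations of the density of particles,
i.e.
$$\frac{\partial n_e}{\partial t} + \nabla \cdot (n_e \mathbf{v}_e) = 0, \qquad \frac{\partial n_p}{\partial t} + \nabla \cdot (n_p \mathbf{v}_p) = 0, \tag{5}$$
and by the momentum conservation equations
$$m_e n_e \left[\frac{\partial}{\partial t} + \mathbf{v}_e \cdot \nabla\right] \mathbf{v}_e = -e n_e \left[\mathbf{E} + \mathbf{v}_e \times \mathbf{B}\right] - \nabla p_e - C_{ep}, \tag{6}$$
$$m_p n_p \left[\frac{\partial}{\partial t} + \mathbf{v}_p \cdot \nabla\right] \mathbf{v}_p = e n_p \left[\mathbf{E} + \mathbf{v}_p \times \mathbf{B}\right] - \nabla p_p - C_{pe}. \tag{7}$$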

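Equations (5), (6) and (7) must be supplemented by Maxwell equations
reading, in this case,
$$\nabla \cdot \mathbf{E} = 4\pi e (n_p - n_e), \tag{8}$$
$$\nabla \cdot \mathbf{B} = 0, \tag{9}$$
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0, \tag{10}$$
$$\nabla \times \mathbf{B} = \frac{\partial \mathbf{E}}{\partial t} + 4\pi e (n_p \mathbf{v}_p - n_e \mathbf{v}_e). \tag{11}$$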
The two-fluid system of equations is rather useful to discuss various phe-
nomena like the propagation of electromagnetic excitations at finite charge
density both in the presence and in the absence of a background magnetic
field [24, 25, 26]. The previous observation implies that a two-fluid treatment
is mandatory for the description of Faraday rotation of the cosmic microwave
background (CMB) polarization. This subject will not be specifically discussed
in the present lecture (see, for further details, [30] and references therein).
Instead of treating the two fluids as separated, the plasma may be
considered as a single fluid defined by an appropriate set of global variables:
$$\mathbf{J} = e (n_p \mathbf{v}_p - n_e \mathbf{v}_e), \tag{12}$$
$$\rho_q = e (n_p - n_e), \tag{13}$$
$$\rho_m = (m_e n_e + m_p n_p), \tag{14}$$
$$\mathbf{v} = \frac{m_e n_e \mathbf{v}_e + n_p m_p \mathbf{v}_p}{m_e n_e + m_p n_p}, \tag{15}$$
where $\mathbf{J}$ is the global current, $\rho_q$ is the global charge density, $\rho_m$ is the
total mass density and $\mathbf{v}$ is the so-called bulk velocity of the plasma. From the
definition of the bulk velocity it is clear that $\mathbf{v}$ is the center-of-mass velocity
of the electron–ion system. The interesting case is the one where the plasma
is globally neutral, i.e. $n_e \simeq n_p = n_0$, implying, from the Maxwell and continuity
equations, the following equations
$$\nabla \cdot \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{J} = 0, \qquad \nabla \cdot \mathbf{B} = 0. \tag{16}$$
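The step leading from the two continuity equations to the solenoidal conditions can be spelled out as follows. This is only a sketch of the standard argument, written in terms of the global variables (12) and (13); no assumption is made beyond global neutrality.

```latex
% Multiply both equations in (5) by e and subtract the electron equation from the proton one:
\frac{\partial}{\partial t}\bigl[e(n_p - n_e)\bigr]
  + \nabla \cdot \bigl[e(n_p \mathbf{v}_p - n_e \mathbf{v}_e)\bigr] = 0
\quad\Longrightarrow\quad
\frac{\partial \rho_q}{\partial t} + \nabla \cdot \mathbf{J} = 0.
% For a globally neutral plasma, n_e \simeq n_p = n_0, i.e. \rho_q = 0:
\nabla \cdot \mathbf{J} = 0,
\qquad
\nabla \cdot \mathbf{E} = 4\pi \rho_q = 0 .
```

The first relation is the charge conservation law of the one-fluid description; the condition on the electric field then follows from (8), while the condition on the magnetic field is (9) itself.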
The equations reported in (16) are the first characterization of MHD
equations, i.e. a system where the total current as well as the electric and magnetic
fields are all solenoidal. The remaining equations allow one to obtain the relevant
set of conditions describing the long-wavelength modes of the magnetic field,
i.e.
$$\nabla \times \mathbf{B} = 4\pi \mathbf{J}, \tag{17}$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}. \tag{18}$$
In (17), the contribution of the displacement current has been neglected for
consistency with the solenoidal nature of the total current (16). Two other rel-
evant equations can be obtained by summing and subtracting the momentum
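conservation equations, i.e. (6) and (7). The result of this procedure is
$$\rho_m \left[\frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v}\right] = \mathbf{J} \times \mathbf{B} - \nabla P, \tag{19}$$
$$\mathbf{E} + \mathbf{v} \times \mathbf{B} = \frac{\mathbf{J}}{\sigma} + \frac{1}{e n_q} (\mathbf{J} \times \mathbf{B} - \nabla p_e), \tag{20}$$
where $n_q \simeq n_0 \simeq n_e$ and $P = p_e + p_p$. Equation (19) is derived from the sum
of (6) and (7) and in (19) $\mathbf{J} \times \mathbf{B}$ is the Lorentz force term which is quadratic
in the magnetic field. In fact, using (17),
$$\mathbf{J} \times \mathbf{B} = \frac{1}{4\pi} (\nabla \times \mathbf{B}) \times \mathbf{B}. \tag{21}$$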
Note that to derive (20) the limit me /mp → 0 must be taken, at some point.
There are some caveats related to this procedure since viscous and collisional
effects may be relevant [25]. Equation (20) is sometimes called one-fluid gen-
eralized Ohm law. In (20) the term J × B is nothing but the Hall current and
∇pe is often called thermoelectric term. Finally, the term J /σ is the resis-
tivity term and σ is the conductivity of the one-fluid description. In (20) the
pressure has been taken to be isotropic. Neglecting the Hall and thermoelec-
tric terms (that may play, however, a role in the Biermann battery mechanism
for magnetic field generation), the Ohm law takes the form

$$\mathbf{J} = \sigma (\mathbf{E} + \mathbf{v} \times \mathbf{B}). \tag{22}$$

Using (22) together with (17) it is easy to show that the Ohmic electric field
is given by
$$\mathbf{E} = \frac{\nabla \times \mathbf{B}}{4\pi\sigma} - \mathbf{v} \times \mathbf{B}. \tag{23}$$
Substituting then (23) into (18) and exploiting known vector identities, we
can get the canonical form of the magnetic diffusivity equation
$$\frac{\partial \mathbf{B}}{\partial t} = \nabla \times (\mathbf{v} \times \mathbf{B}) + \frac{1}{4\pi\sigma} \nabla^2 \mathbf{B}, \tag{24}$$
which is the equation to be used to discuss the general features of the dynamo
instability.
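For completeness, the intermediate steps between the Ohmic electric field and the magnetic diffusivity equation can be sketched as follows. The only assumption added here is that the conductivity $\sigma$ is spatially uniform, so that it can be taken out of the curl.

```latex
% Insert the Ohmic electric field (23) into (18):
\frac{\partial \mathbf{B}}{\partial t} = - \nabla \times \mathbf{E}
  = \nabla \times (\mathbf{v} \times \mathbf{B})
    - \frac{1}{4\pi\sigma}\, \nabla \times (\nabla \times \mathbf{B}).
% Vector identity, together with \nabla \cdot \mathbf{B} = 0:
\nabla \times (\nabla \times \mathbf{B})
  = \nabla (\nabla \cdot \mathbf{B}) - \nabla^{2} \mathbf{B}
  = - \nabla^{2} \mathbf{B}.
% Combining the two lines gives (24):
\frac{\partial \mathbf{B}}{\partial t}
  = \nabla \times (\mathbf{v} \times \mathbf{B}) + \frac{1}{4\pi\sigma}\, \nabla^{2} \mathbf{B}.
```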
MHD can be studied in two different (but complementary) limits:
• the ideal (or superconducting) limit where the conductivity is set to infinity
(i.e. the $\sigma \to \infty$ limit) and
• the real (or resistive) limit where the conductivity is finite.
The plasma description following from MHD can also be phrased in terms
of the conservation of two interesting quantities, i.e. the magnetic flux and
the magnetic helicity [27, 31]:
$$\frac{d}{dt} \int_{\Sigma} \mathbf{B} \cdot d\mathbf{\Sigma} = -\frac{1}{4\pi\sigma} \int_{\Sigma} \nabla \times \nabla \times \mathbf{B} \cdot d\mathbf{\Sigma}, \tag{25}$$
$$\frac{d}{dt} \int_{V} d^3x\, \mathbf{A} \cdot \mathbf{B} = -\frac{1}{4\pi\sigma} \int_{V} d^3x\, \mathbf{B} \cdot \nabla \times \mathbf{B}. \tag{26}$$

In (25), $\Sigma$ is an arbitrary closed surface that moves with the plasma. In
the ideal MHD limit the magnetic flux is exactly conserved and the flux is
sometimes said to be frozen into the plasma element. In the same limit also
the magnetic helicity is conserved. In the resistive limit the magnetic flux and
helicity are dissipated with a rate proportional to $1/\sigma$ which is small provided
the conductivity is sufficiently high. The term appearing at the right-hand
side of (26) is called magnetic gyrotropy.
The conservation of the magnetic helicity is a statement on the conserva-
tion of the topological properties of the magnetic flux lines. If the magnetic field
is completely stochastic, the magnetic flux lines will be closed loops evolving
independently in the plasma and the helicity will vanish. There could be, how-
ever, more complicated topological situations where a single magnetic loop is
twisted (like some kind of Möbius strip) or the case where the magnetic loops
are connected like the rings of a chain. In both cases the magnetic helicity
will not be zero since it measures, essentially, the number of links and twists
in the magnetic flux lines. The conservation of the magnetic flux and of the
magnetic helicity is a consequence of the fact that, in ideal MHD, the Ohmic
electric field is always orthogonal both to the bulk velocity field and to the
magnetic field. In the resistive MHD approximation this is no longer true [27].

2.2 Dynamos

The dynamo theory has been developed starting from the early 1950s through
the 1980s and various extensive presentations exist in the literature [32, 33,
34]. Generally speaking, a dynamo is a process where the kinetic energy of the
plasma is transferred to magnetic energy. There are different sorts of dynamos.
Some of the dynamos that are currently addressed in the existing literature
are large-scale dynamos, small-scale dynamos, nonlinear dynamos, α-dynamos
etc.
It would be difficult, in the present lecture, even to review such a vast lit-
erature and, therefore, it is more appropriate to refer to some review articles
where the modern developments in dynamo theory and in mean field elec-
trodynamics are reported [35, 36]. As a qualitative example of the dynamo
action it is practical to discuss the magnetic diffusivity equation obtained,
from general considerations, in (24).
Equation (24) simply stipulates that the first time derivative of the
magnetic field intensity results from the balance of two (physically different)
contributions. The first term at the right-hand side of (24) is the dynamo
term and it contains the bulk velocity of the plasma $\mathbf{v}$. If this term dominates,
the magnetic field may be amplified, thanks to the differential rotation of the
plasma. The dynamo term then provides the coupling allowing the transfer of
the kinetic energy into magnetic energy. The second term at the right-hand
side of (24) is the magnetic diffusivity term, whose effect is to damp the magnetic
field intensity. Defining then as $L$ the typical scale of spatial variation of the
magnetic field intensity, the typical time-scale of resistive phenomena turns
out to be
$$t_\sigma \simeq 4\pi\sigma L^2. \tag{27}$$
In a nonrelativistic plasma the conductivity $\sigma$ goes typically as $T^{3/2}$ [24, 25]. In
the case of planets, like the earth, one can wonder why a sizable magnetic field
can still be present. One of the theories is that the dynamo term continuously
regenerates the magnetic field which is dissipated by the diffusivity term
[32]. In the case of the galactic disk the value of the conductivity$^7$ is given by
$\sigma \simeq 7 \times 10^{-7}$ Hz. Thus, for $L \simeq$ kpc, $t_\sigma \simeq 10^{9} (L/\mathrm{kpc})^2$ sec.
Equation (27) can also give the typical resistive length-scale once the
time-scale of the system is specified. Suppose that the time-scale of the system is
given by $t_U \sim H_0^{-1} \sim 10^{18}$ sec where $H_0$ is the present order of magnitude of
the Hubble parameter. Then
$$L_\sigma = \sqrt{\frac{t_U}{\sigma}}, \tag{28}$$
leading to $L_\sigma \sim$ AU. The scale (28) gives then the upper limit on the diffusion
scale for a magnetic field whose lifetime is comparable with the age of the
Universe at the present epoch. Magnetic fields with typical correlation scale
larger than $L_\sigma$ are not affected by resistivity. On the other hand, magnetic
fields with typical correlation scale $L < L_\sigma$ are diffused. The value $L_\sigma \sim$ AU
is consistent with the phenomenological evidence that there are no magnetic
fields coherent over scales smaller than $10^{-5}$ pc.
The dynamo term may be responsible for the origin of the magnetic field
of the galaxy. The galaxy has a typical rotation period of $3 \times 10^{8}$ years and
comparing this figure with the typical age of the galaxy, $\mathcal{O}(10^{10}$ years$)$, it can
be appreciated that the galaxy performed about 30 rotations since the time
of the protogalactic collapse.
The effectiveness of the dynamo action depends on the physical properties
of the bulk velocity field. In particular, a necessary requirement to have a
potentially successful dynamo action is that the velocity field is
non-mirror-symmetric or that, in other words,
$\langle \mathbf{v} \cdot \nabla \times \mathbf{v} \rangle \neq 0$. Let us see how this
statement can be made reasonable in the framework of (24). From (24) the
usual structure of the dynamo term may be derived by carefully averaging over
the velocity field according to the procedure of [37, 38]. By assuming that the
motion of the fluid is random and with zero mean velocity the average is
taken over the ensemble of the possible velocity fields. In more physical terms

7
It is common use in the astrophysical applications to work directly with $\eta =
(4\pi\sigma)^{-1}$. In the case of the galactic disks $\eta = 10^{26}\,\mathrm{cm}^2$ Hz.

this averaging procedure of (24) is equivalent to averaging over scales and times
exceeding the characteristic correlation scale and time $\tau_0$ of the velocity field.
This procedure assumes that the correlation scale of the magnetic field is much
bigger than the correlation scale of the velocity field which is required to be
divergence-less ($\nabla \cdot \mathbf{v} = 0$). In this approximation the magnetic diffusivity
equation can be written as
$$\frac{\partial \mathbf{B}}{\partial t} = \alpha (\nabla \times \mathbf{B}) + \frac{1}{4\pi\sigma} \nabla^2 \mathbf{B}, \tag{29}$$
where
$$\alpha = -\frac{\tau_0}{3} \langle \mathbf{v} \cdot \nabla \times \mathbf{v} \rangle, \tag{30}$$
is the so-called $\alpha$-term in the absence of vorticity. In (29) and (30) $\mathbf{B}$ is
the magnetic field averaged over times longer than $\tau_0$ which is the typical
correlation time of the velocity field.
The fact that the velocity field must be globally non-mirror-symmetric
[33] suggests, already at this qualitative level, the deep connection between
dynamo action and fully developed turbulence. In fact, if the system were,
globally, invariant under parity transformations, then the α term would simply
vanish. This observation may also be related to the turbulent features
of cosmic systems. In cosmic turbulence the systems are usually rotating and,
moreover, they possess a gradient in the matter density (think, for instance,
of the case of the galaxy). It is then plausible that parity is broken at the
level of the galaxy since terms like ∇ρm · ∇ × v are not vanishing [33].
The dynamo term, as it appears in (29), has a simple electrodynamical
meaning, namely, it can be interpreted as a mean Ohmic current directed
along the magnetic field
J = −αB. (31)
Equation (31) stipulates that an ensemble of screw-like vortices with zero mean
helicity is able to generate loops in the magnetic flux tubes in a plane orthogonal
to the one of the original field. As a simple (and known) application of
(29), it is appropriate to consider the case where the magnetic field profile is
given by a sort of Chern–Simons wave

$$B_x(z,t) = f(t) \sin kz, \qquad B_y(z,t) = f(t) \cos kz, \qquad B_z(z,t) = 0. \tag{32}$$

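For this profile the magnetic gyrotropy is nonvanishing, i.e. $\mathbf{B} \cdot \nabla \times \mathbf{B} = k f^2(t)$.
Since $\nabla \times \mathbf{B} = k\mathbf{B}$ and $\nabla^2 \mathbf{B} = -k^2 \mathbf{B}$ for the profile (32), it follows
from (29) that f(t) obeys the equation

$$\frac{df}{dt} = \left( k\alpha - \frac{k^2}{4\pi\sigma} \right) f, \tag{33}$$

which admits exponentially growing solutions for sufficiently large scales, i.e. k <
4π|α|σ. Notice that in this naive example the α term is assumed to be constant.
However, as the amplification proceeds, α may develop a dependence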
upon $|\mathbf{B}|^2$, i.e. $\alpha \to \alpha_0 (1 - \xi |\mathbf{B}|^2) \simeq \alpha_0 [1 - \xi f^2(t)]$. In the case of (33) this modification
will introduce nonlinear terms whose effect will be to stop the growth
of the magnetic field. This regime is often called saturation of the dynamo
and the nonlinear equations appearing in this context are sometimes called
Landau equations [33] in analogy with the Landau equations appearing in
hydrodynamical turbulence.
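
The growth and saturation just described can be illustrated numerically. The following is a minimal sketch, not the author's computation: it integrates (33) with the quenched coefficient α → α0[1 − ξ f²(t)] in arbitrary units. The function names (`growth_rate`, `saturation_amplitude`, `evolve`), the fourth-order Runge–Kutta stepping and all parameter values are assumptions made for illustration only.

```python
import math


def growth_rate(k, alpha, sigma):
    """Linear growth rate read off from (33): k*alpha - k^2/(4*pi*sigma)."""
    return k * alpha - k * k / (4.0 * math.pi * sigma)


def saturation_amplitude(k, alpha0, sigma, xi):
    """Amplitude at which the quenched growth rate vanishes (0 if the mode decays)."""
    gamma = growth_rate(k, alpha0, sigma)
    if gamma <= 0.0:
        return 0.0
    return math.sqrt(gamma / (k * alpha0 * xi))


def evolve(f0, k, alpha0, sigma, xi, t_end, steps=20000):
    """Integrate df/dt = [k*alpha0*(1 - xi*f^2) - k^2/(4*pi*sigma)] * f with RK4."""
    def rhs(f):
        alpha = alpha0 * (1.0 - xi * f * f)  # quenched alpha term (assumed form)
        return growth_rate(k, alpha, sigma) * f

    dt = t_end / steps
    f = f0
    for _ in range(steps):
        k1 = rhs(f)
        k2 = rhs(f + 0.5 * dt * k1)
        k3 = rhs(f + 0.5 * dt * k2)
        k4 = rhs(f + dt * k3)
        f += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return f


if __name__ == "__main__":
    alpha0, sigma, xi = 1.0, 1.0, 1.0           # arbitrary illustrative units
    k_crit = 4.0 * math.pi * abs(alpha0) * sigma
    for k in (1.0, 20.0):                        # one mode below k_crit, one above
        print(k, k < k_crit, growth_rate(k, alpha0, sigma),
              evolve(1e-6, k, alpha0, sigma, xi, t_end=40.0))
```

In this sketch a mode with k below 4π|α0|σ grows from a tiny seed and then settles at the amplitude for which the quenched growth rate vanishes, while a mode above that threshold simply decays.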
In spite of the fact that in the previous example the velocity field has
been averaged, its evolution obeys the Navier–Stokes equation which we have
already written but without the diffusion term

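$$\rho_m \left[ \frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{v} - \nu \nabla^2 \mathbf{v} \right] = -\nabla P + \mathbf{J} \times \mathbf{B}, \tag{34}$$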
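where ν is the thermal viscosity coefficient. There are idealized cases where the
Lorentz force term can be neglected. This is the so-called force-free approximation.
Defining the vorticity as Ω = ∇ × v, the magnetic diffusivity
and Navier–Stokes equations can be written in a rather simple and symmetric
form
$$\begin{aligned}
\frac{\partial \mathbf{B}}{\partial t} &= \nabla \times (\mathbf{v} \times \mathbf{B}) + \frac{1}{4\pi\sigma} \nabla^2 \mathbf{B}, \\
\frac{\partial \boldsymbol{\Omega}}{\partial t} &= \nabla \times (\mathbf{v} \times \boldsymbol{\Omega}) + \nu \nabla^2 \boldsymbol{\Omega}.
\end{aligned} \tag{35}$$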
In MHD various dimensionless ratios can be defined. The most frequently
used are the magnetic Reynolds number, the kinetic Reynolds number and
the Prandtl number:

$$R_m = v L_B \sigma, \tag{36}$$
$$R = \frac{v L_v}{\nu}, \tag{37}$$
$$\mathrm{Pr} = \frac{R_m}{R} = \nu \sigma \frac{L_B}{L_v}, \tag{38}$$
where $L_B$ and $L_v$ are the typical scales of variation of the magnetic and velocity
fields. If $R_m \gg 1$ the system is said to be magnetically turbulent. If $R \gg 1$
the system is said to be kinetically turbulent. In realistic situations the plasma
is both kinetically and magnetically turbulent and, therefore, the ratio of the
two Reynolds numbers will tell which is the dominant source of turbulence.
There have been, in recent years, various studies on the development of magnetized
turbulence (see, for instance, [27]) whose features differ slightly from
the ones of hydrodynamic turbulence. While the details of this discussion will
be left aside, it is relevant to mention that, in the early Universe, turbulence
may develop. In this situation a typical phenomenon, called inverse cascade,
can take place. A direct cascade is a process where energy is transferred from
large to small scales. Even more interesting, for the purposes of the present
lecture, is the opposite process, namely the inverse cascade where the energy
transfer goes from small to large length-scales. One can also generalize the
concept of energy cascade to the cascade of any conserved quantity in the
plasma, like, for instance, the helicity. Thus, in general terms, the transfer
process of a conserved quantity is a cascade.
The concept of cascade (either direct or inverse) is related with the concept
of turbulence, i.e. the class of phenomena taking place in fluids and plasmas at
high Reynolds numbers. It is very difficult to reach, with terrestrial plasmas,
the physical situation where the magnetic and the kinetic Reynolds numbers
are both large but in such a way that their ratio is also large, i.e.
$$R_m \gg 1, \qquad R \gg 1, \qquad \mathrm{Pr} = \frac{R_m}{R} \gg 1. \tag{39}$$
The physical regime expressed through (39) is rather common in the early Universe.
Thus, MHD turbulence is probably one of the key aspects of magnetized
plasma dynamics at very high temperatures and densities. Consider, for in-
stance, the plasma at the electroweak epoch when the temperature was of the
order of 100 GeV. One can compute the Reynolds numbers and the Prandtl
number from their definitions given in (36)–(38). In particular,

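$$R_m \sim 10^{17}, \qquad R = 10^{11}, \qquad \mathrm{Pr} \simeq 10^{6}, \tag{40}$$

which can be obtained from (36)–(38) using as fiducial parameters $v \simeq 0.1$,
$\sigma \simeq T/\alpha$, $\nu \simeq (\alpha T)^{-1}$ and $L \simeq 0.01\, H_{\rm ew}^{-1} \simeq 0.03$ cm for $T \simeq 100$ GeV.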
If an inverse energy cascade takes place, many (energetic) magnetic do-
mains coalesce giving rise to a magnetic domain of larger size but of smaller
energy. This phenomenon can be viewed, in more quantitative terms, as an
effective increase of the correlation scale of the magnetic field. This consider-
ation plays a crucial role for the viability of mechanisms where the magnetic
field is produced in the early Universe inside the Hubble radius (see Sect. 2.5).

2.3 Initial Conditions for Dynamos

According to the qualitative description of the dynamo instability presented
in the previous subsection, the origin of large-scale magnetic fields in spiral
galaxies can be reduced to the three keywords: seeding, amplification and
ordering. The first stage, i.e. the seeding, is the most controversial one and
will be briefly reviewed in the following sections of the present review. In more
quantitative terms the amplification and the ordering may be summarized as
follows:
• During the 30 rotations performed by the galaxy since the protogalactic
collapse, the magnetic field should be amplified by about 30 e-folds;
• If the large-scale magnetic field of the galaxy is, today, O(μG), the magnetic
field at the onset of galactic rotation might have been even 30 e-folds
smaller, i.e. $O(10^{-19})$ G over a typical scale of 30–100 kpc;
• Assuming perfect flux freezing during the gravitational collapse of the
protogalaxy (i.e. σ → ∞), the magnetic field at the onset of gravitational
collapse should be $O(10^{-23})$ G over a typical scale of 1 Mpc.
This picture is oversimplified and each of the three steps mentioned above can
be questioned. In what follows the main sources of debate that have emerged in the last
10 years will be briefly discussed.
There is a simple way to relate the value of the magnetic fields right after
gravitational collapse to the value of the magnetic field right before gravita-
tional collapse. Since the gravitational collapse occurs at high conductivity,
the magnetic flux and the magnetic helicity are both conserved (see, in par-
ticular, (25)). Right before the formation of the galaxy a patch of matter of
roughly 1 Mpc collapses by gravitational instability. Right before the collapse
the mean energy density of the patch, stored in matter, is of the order of the
critical density of the Universe. Right after collapse the mean matter density
of the protogalaxy is, approximately, six orders of magnitude larger than the
critical density.
Since the physical size of the patch decreases from 1 Mpc to 30 kpc, the
magnetic field increases, because of flux conservation, by a factor $(\rho_a/\rho_b)^{2/3} \sim 10^{4}$,
where $\rho_a$ and $\rho_b$ are, respectively, the energy densities right after and right
before gravitational collapse. The correct initial condition in order to turn on
the dynamo instability would be $|\mathbf{B}| \sim 10^{-23}$ G over a scale of 1 Mpc,
right before gravitational collapse.
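
The chain of estimates above (number of rotations, 30 e-folds of dynamo growth, compressional amplification) is simple arithmetic and can be checked in a few lines. This is a hedged sketch: the function names `required_seed` and `compression_factor` are illustrative, and the inputs (one e-fold per rotation, a present field of 1 μG, a density contrast of 10⁶) are the idealized values assumed in the text.

```python
import math


def required_seed(b_today, e_folds):
    """Field needed before the dynamo acts, assuming pure exponential growth."""
    return b_today * math.exp(-e_folds)


def compression_factor(density_ratio):
    """Flux-freezing amplification (rho_a / rho_b)**(2/3) during collapse."""
    return density_ratio ** (2.0 / 3.0)


if __name__ == "__main__":
    n_rotations = 1.0e10 / 3.0e8                     # age / rotation period
    b_after = required_seed(1.0e-6, 30.0)            # gauss, right after collapse
    b_before = b_after / compression_factor(1.0e6)   # gauss, right before collapse
    print(f"rotations            ~ {n_rotations:.0f}")
    print(f"seed after collapse  ~ {b_after:.1e} G")
    print(f"seed before collapse ~ {b_before:.1e} G")
```

Under these assumptions the sketch reproduces the orders of magnitude quoted in the text: roughly 30 rotations, a seed of order 10⁻¹⁹ G after collapse and of order 10⁻²³ G before it.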
The estimates presented in the last paragraph are based on the (rather
questionable) assumption that the amplification occurs over 30 e-folds while
the magnetic flux is completely frozen in. In the real situation, the achievable
amplification is much smaller. Typically a good seed would not be $10^{-19}$ G after
collapse (as we assumed for the simplicity of the discussion) but rather [35]

$$|\mathbf{B}| \geq 10^{-13}\ \mathrm{G}. \tag{41}$$

The galactic rotation period is of the order of $3 \times 10^{8}$ years. This scale
should be compared with the typical age of the galaxy. All along this rather
large dynamical time-scale the effort has been directed, from the 1950s, to
the justification that a substantial portion of the kinetic energy of the system
(provided by the differential rotation) may be converted into magnetic energy
amplifying, in this way, the seed field up to the observed value of the magnetic
field, for instance in galaxies and in clusters. In recent years a lot of progress
has been made both in the context of the small- and of large-scale dynamos [36,
39] (see also [40, 41, 42]). This progress was also driven by the higher resolution
of the numerical simulations and by the improvement in the understanding
of the largest magnetized system that is rather close to us, i.e. the sun [36].
More complete accounts of this progress can be found in the second article
of [39] and, more comprehensively, in [36]. Apart from the aspects involving
solar physics and numerical analysis, better physical understanding of the role
of the magnetic helicity in the dynamo action has been reached. This point
is crucially connected with the two conservation laws arising in MHD, i.e. the
magnetic flux and magnetic helicity conservations whose relevance has been
already emphasized, respectively, in (25) and (26). Even if the rich interplay
between small- and large-scale dynamos is rather important, let us focus on
the problem of large-scale dynamo action that is, at least superficially, more
central for the considerations developed in the present lecture.
Already at a qualitative level it is clear that there is a clash between the
absence of mirror-symmetry of the plasma, the quasi-exponential amplification
of the seed and the conservation of magnetic flux and helicity in the high (or
more precisely infinite) conductivity limit. The easiest clash to understand,
intuitively, is the flux conservation versus the exponential amplification: both
flux freezing and exponential amplification have to take place in the same
superconductive (i.e. $\sigma^{-1} \to 0$) limit. The clash between helicity conservation
and dynamo action can also be understood in general terms: the dynamo
action implies a topology change of the configuration since the magnetic flux
lines cross each other constantly [39].
One recent advance in this framework is a more consistent formulation
of the large-scale dynamo problem [39]: large-scale dynamos produce
small-scale helical fields that quench (i.e. prematurely saturate) the α effect.
In other words, the conservation of the magnetic helicity can be seen, accord-
ing to the recent view, as a fundamental constraint on the dynamo action.
In connection with the last point, it should be mentioned that, in the past, a
rather different argument was suggested [43]: it was argued that the dynamo
action leads to the amplification not only of the large-scale field but also of the
random field component. The random field would then suppress strongly the
dynamo action. According to the considerations based on the conservation of
the magnetic helicity, this argument seems to be incorrect since the increase
of the random component would also entail an increase of the rate of the
topology change, i.e. a magnetic helicity nonconservation.
The possible application of the dynamo mechanism to clusters is still under
debate and seems more problematic. The typical scale of the gravitational
collapse of a cluster is larger (roughly by one order of magnitude) than the
scale of gravitational collapse of the protogalaxy. Furthermore, the mean mass
density within the Abell radius ($\simeq 1.5\, h^{-1}$ Mpc) is roughly $10^{3}$ times larger than the
critical density. Consequently, clusters rotate much less than galaxies. Recall
that clusters are formed from peaks in the density field. The present overdensity
of clusters is of the order of $10^{3}$. Thus, in order to get the intracluster
magnetic field, one could think that magnetic flux is exactly conserved and,
then, from an intergalactic magnetic field $|\mathbf{B}| > 10^{-9}$ G an intracluster magnetic
field $|\mathbf{B}| > 10^{-7}$ G can be generated. This simple estimate shows why it
is rather important to improve the accuracy of magnetic field measurements
in the intracluster medium: The change of a single order of magnitude in the
estimated magnetic field may imply rather different conclusions for its origin.
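
The factor relating the intergalactic and intracluster fields in this estimate follows from flux freezing. As a worked sketch, assuming an idealized isotropic collapse of a patch of size L in which both the mass and the magnetic flux are conserved (the subscripts a and b denote, as for the protogalaxy, the values after and before collapse):

```latex
\begin{align}
  B L^{2} = \text{const}, \quad \rho L^{3} = \text{const}
    &\;\Longrightarrow\; B \propto L^{-2} \propto \rho^{2/3}, \\
  \frac{B_a}{B_b} = \left( \frac{\rho_a}{\rho_b} \right)^{2/3},
    \qquad \frac{\rho_a}{\rho_b} \sim 10^{3}
    &\;\Longrightarrow\; \frac{B_a}{B_b} \sim 10^{2},
    \qquad 10^{-9}\,\mathrm{G} \to 10^{-7}\,\mathrm{G}.
\end{align}
```

The same relation with a density contrast of order $10^{6}$ gives the factor $10^{4}$ used for the protogalaxy.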

2.4 Astrophysical Mechanisms

Many (if not all) of the astrophysical mechanisms proposed so far are related
to what is called, in the jargon, a battery. In short, the idea is the following.
The explicit form of the generalized Ohmic electric field in the presence of
thermoelectric corrections can be written as in (20) where we set $n_q = n_e$ to
stick to the usual conventions⁸
$$\mathbf{E} = -\mathbf{v} \times \mathbf{B} + \frac{\nabla \times \mathbf{B}}{4\pi\sigma} - \frac{\nabla P_e}{e n_e}. \tag{42}$$

By comparing (23) with (42), it is clear that the additional term at the right-
hand side receives contribution from a temperature gradient. In fact, restoring
for a moment the Boltzmann constant kB we have that since Pe = kB ne Te ,
the additional term depends upon the gradients of the temperature, hence
the name thermoelectric. It is interesting to see under which conditions the
curl of the electric field receives a contribution from the thermoelectric effect.
Taking the curl of both sides of (42), and using ∇ × (∇ × B) = −∇²B, we obtain

$$\nabla \times \mathbf{E} = -\left[ \frac{1}{4\pi\sigma} \nabla^2 \mathbf{B} + \nabla \times (\mathbf{v} \times \mathbf{B}) - \frac{\nabla n_e \times \nabla P_e}{e n_e^2} \right] = -\frac{\partial \mathbf{B}}{\partial t}, \tag{43}$$

where the second equality is a consequence of Maxwell’s equations. From (43)
it is clear that the evolution of the magnetic field inherits a source term iff the
gradients in the pressure and electron density are not parallel. If $\nabla P_e \parallel \nabla n_e$,
a fully valid solution of (43) is B = 0. In the opposite case a seed magnetic
field is naturally provided by the thermoelectric term. The usual (and rather
general) observation that one can make in connection with the geometrical
properties of the thermoelectric term is that cosmic ionization fronts may
play an important role. For instance, when quasars emit ultraviolet photons,
cosmic ionization fronts are produced. Then the intergalactic medium may be
ionized. It should also be recalled, however, that the temperature gradients
are usually normal to the ionization front. In spite of this, it is also plausible
to think that density gradients can arise in arbitrary directions due to the
stochastic nature of density fluctuations.
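
The geometrical condition on the thermoelectric term can be made concrete with a small calculation. The sketch below is illustrative only: it evaluates ∇n_e × ∇P_e/(e n_e²) at a single point, with P_e = n_e T_e (k_B = 1) and e = 1, so the overall sign and units are assumptions and only the vanishing or non-vanishing of the result matters. The helper names `cross`, `pressure_gradient` and `battery_term` are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pressure_gradient(n_e, temp, grad_n, grad_t):
    """Gradient of P_e = n_e * T_e (k_B = 1): T_e grad(n_e) + n_e grad(T_e)."""
    return tuple(temp * gn + n_e * gt for gn, gt in zip(grad_n, grad_t))


def battery_term(grad_n, grad_p, n_e, e=1.0):
    """Thermoelectric source grad(n_e) x grad(P_e) / (e * n_e**2)."""
    return tuple(c / (e * n_e ** 2) for c in cross(grad_n, grad_p))


if __name__ == "__main__":
    n_e, temp = 4.0, 3.0
    grad_n = (1.0, 0.0, 0.0)
    for grad_t in ((2.0, 0.0, 0.0),    # temperature gradient parallel to grad(n_e)
                   (0.0, 2.0, 0.0)):   # temperature gradient orthogonal to grad(n_e)
        grad_p = pressure_gradient(n_e, temp, grad_n, grad_t)
        print(grad_t, battery_term(grad_n, grad_p, n_e))
```

Because ∇P_e = T_e ∇n_e + n_e ∇T_e, the source reduces to ∇n_e × ∇T_e/(e n_e): it vanishes when the density and temperature gradients are aligned and is nonzero otherwise, which is why density fluctuations in arbitrary directions across an ionization front matter.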
In one way or in another, astrophysical mechanisms for the generation
of magnetic fields use an incarnation of the thermoelectric effect [44] (see
also [45, 46]). In the 1960s and 1970s, for instance, it was rather popular to
think that the correct “geometrical” properties of the thermoelectric term may
be provided by a large-scale vorticity. As will also be discussed later, this
assumption seems to be, at least naively, in contradiction with the formulation
of inflationary models whose prediction would actually be that the large-scale
vector modes are completely washed out by the expansion of the Universe.
Indeed, throughout the 1980s and 1990s the idea of primordial vorticity received
only minor attention.

⁸ For simplicity, we shall neglect the Hall contribution arising in the generalized
Ohm law. The Hall contribution would produce in (42) a term $\mathbf{J} \times \mathbf{B}/(n_e e)$ that
is of higher order in the magnetic field and that is proportional to the Lorentz
force. The Hall term will play no role in the subsequent considerations. However,
it should be borne in mind that the Hall contribution may be rather interesting
in connection with the presence of strong magnetic fields like the ones of neutron
stars (i.e. $10^{13}$ G). This occurrence is even more interesting since in the outer
regions of neutron stars strong density gradients are expected.

The attention then focused on the possibility that objects of rather small
size may provide intense seeds. After all we do know that these objects may
exist. For instance, the Crab nebula has a typical size of roughly 1 pc and
a magnetic field that is a fraction of a mG. These seeds will then
combine and diffuse leading, ultimately, to a weaker seed but with large cor-
relation scale. This aspect may be, physically, a bit controversial since we
do observe magnetic fields in galaxies and clusters that are ordered over very
large length-scales. It would then seem necessary that the seed fields produced
in a small object (or in several small objects) undergo some type of dynamical
self-organization whose final effect is a seed coherent over length-scales 4 or 5
orders of magnitude larger than the correlation scale of the original battery.
An interesting idea could be that qualitatively different batteries lead to
some type of conspiracy that may produce a strong large-scale seed. In [44] it
has been suggested that Population III stars may become magnetized, thanks
to a battery operating at stellar scale. Then if these stars would explode as
supernovae (or if they would eject a magnetized stellar wind), the pregalactic
environment may be magnetized and the remnants of the process incorporated
in the galactic disk. In a complementary perspective, a similar chain of events
may take place over a different physical scale. A battery could arise in fact
in active galactic nuclei at high redshift. Then the magnetic field could be
ejected leading to intense fields in the lobes of “young” radio-galaxies. These
fields will be somehow inherited by the “older” disk galaxies and the final seed
field may be, according to [44], as large as $10^{-9}$ G at the pregalactic stage.
In summary, we can therefore say that
• both the primordial and the astrophysical hypotheses for the origin of the
seeds demand an efficient (large-scale) dynamo action;
• due to the constraints arising from the conservation of magnetic helicity
and magnetic flux, the values of the required seed fields may turn out to be
larger than previously thought, at least in the case when the amplification
is only driven by a large-scale dynamo action;⁹
• magnetic flux conservation during gravitational collapse of the protogalaxy
may increase, by compressional amplification, the initial seed by as much as 4
orders of magnitude;
• compressional amplification and large-scale dynamo are much less effective
in clusters: therefore, the magnetic field of clusters is probably connected
to the specific way the dynamo saturates and is, in this sense, harder to
predict from a specific value of the initial seed.

⁹ The situation may change if the magnetic fields originate from the combined
action of small- and large-scale dynamos like in the case of the two-step process
described in [44].

2.5 Magnetogenesis: Inside the Hubble Radius

One of the weaknesses of the astrophysical hypothesis is connected with the
smallness of the correlation scale of the obtained magnetic fields. This type
of impasse led the community to consider the option that the initial condi-
tions for the MHD evolution are dictated not by astrophysics but rather by
cosmology. The first ones to think about cosmology as a possible source of
large-scale magnetization were Zeldovich [47, 48] and Harrison [49, 50, 51].
The emphasis of these two authors was clearly different. While Zeldovich
thought about a magnetic field which is uniform (i.e. homogeneous and ori-
ented, for instance, along a specific Cartesian direction), Harrison somehow
anticipated the more modern view by considering the possibility of an inho-
mogeneous magnetic field. In the scenario of Zeldovich the uniform magnetic
field would induce a slight anisotropy in the expansion rate along the direction in which the
magnetic field is aligned. So, for instance, by considering a constant (and uniform)
magnetic field pointing along the x̂ Cartesian axis, the induced geometry
compatible with such a configuration will fall into the Bianchi I class

$$ds^2 = dt^2 - a^2(t)\,dx^2 - b^2(t)\,[dy^2 + dz^2]. \tag{44}$$

By solving Einstein equations in this background geometry, it turns out that,
during a radiation-dominated epoch, the expansion rates along the x̂ axis and in the
ŷ–ẑ plane change and their difference is proportional to the magnetic energy
density [47, 48]. This observation is not only relevant for magnetogenesis but
also for cosmic microwave background anisotropies since the difference in the
expansion rate turns out to be proportional to the temperature anisotropy.
While we will get back to this point later, in Sect. 4, as far as magnetization
is concerned we can just remark that the idea of Zeldovich was that a uniform
magnetic field would modify the initial condition of the standard hot big bang
model where the Universe would start its evolution already in a radiation-
dominated phase.
The model of Harrison [49, 50, 51] is, in a sense, more dynamical. Fol-
lowing earlier work of Biermann [52], Harrison thought that inhomogeneous
MHD equations could be used to generate large-scale magnetic fields provided
the velocity field was turbulent enough. The Biermann battery was simply a
battery (like the ones described above in this section) but operating prior to
decoupling of matter and radiation. The idea of Harrison was instead that
vorticity was already present so that the effective MHD equations will take
the form
$$\frac{\partial}{\partial \tau} \left( a^2 \boldsymbol{\Omega} + \frac{e}{m_p} \mathbf{B} \right) = \frac{e}{4\pi\sigma m_p} \nabla^2 \mathbf{B}, \tag{45}$$
where, as previously defined, Ω = ∇ × v and $m_p$ is the ion mass. Equation (45)
is written in a conformally flat Friedmann–Robertson–Walker (FRW) metric
of the form
$$ds^2 = G_{\mu\nu}\, dx^\mu dx^\nu = a^2(\tau) [d\tau^2 - d\mathbf{x}^2], \tag{46}$$
where τ is the conformal time coordinate and where, in the conformally flat
case, $G_{\mu\nu} = a^2(\tau) \eta_{\mu\nu}$, $\eta_{\mu\nu}$ being the four-dimensional Minkowski metric. If we
now postulate that some vorticity was present prior to decoupling, then (45)
can be solved and the magnetic field can be related to the initial vorticity as
$$\mathbf{B} \sim -\frac{m_p}{e}\, \boldsymbol{\omega}_i \left( \frac{a_i}{a} \right)^2. \tag{47}$$

If the estimate of the vorticity is made prior to equality (as originally
suggested by Harrison [49]) or after decoupling as also suggested, a bit later, in
[53], the result can change even by two orders of magnitude. Prior to equality
$|\boldsymbol{\Omega}(t)| \simeq 0.1/t$ and, therefore, $|\mathbf{B}_{\rm eq}| \sim 10^{-21}$ G. If a similar estimate is made
after decoupling, the typical value of the generated magnetic field is of the
order of $10^{-18}$ G. So, in this context, the problem of the origin of magnetic
fields is circumvented by postulating an appropriate form of vorticity whose
origin must be explained.
The Harrison mechanism is just one of the first examples of magnetic field
generation inside the Hubble radius. In cosmology we define the Hubble radius
as the inverse of the Hubble parameter, i.e. $r_H = H^{-1}(t)$. The first possibility
we can think of implies that magnetic fields are produced, at a given epoch
in the life of the Universe, inside the Hubble radius, for instance by a phase
transition or by any other phenomenon able to generate a charge separation
and, ultimately, an electric current. In this context, the correlation scale of the
field is much smaller than the typical scale of the gravitational collapse of the
protogalaxy which is of the order of mega parsecs. In fact, if the Universe is
decelerating and if the correlation scale evolves as the scale factor, the Hubble
radius grows much faster than the correlation scale. Of course, one might
invoke the possibility that the correlation scale of the magnetic field evolves
more rapidly than the scale factor. A well-founded physical rationale for this
occurrence is what is normally called inverse cascade, i.e. the possibility that
magnetic (as well as kinetic) energy density is transferred from small to large
scales. This implies, in real space, that (highly energetic) small-scale magnetic
domains may coalesce to form magnetic domains of smaller energy but over
larger scales. Even in the best of all possible situations, i.e. when the inverse cascade
is very effective, it seems rather hard to justify a growth of the correlation
scale that would eventually end up into a mega parsec scale at the onset of
gravitational collapse.
In Fig. 1 we report a schematic illustration of the evolution of the Hubble
radius RH and of the correlation scale of the magnetic field as a function of
the scale factor. In Fig. 1 the horizontal dashed line simply marks the end of
the radiation–dominated phase and the onset of the matter-dominated phase:
while above the dashed line the Hubble radius evolves as $a^{2}$ (where a is the
scale factor), below the dashed line the Hubble radius evolves as $a^{3/2}$.

[Figure 1: panel title “Inside the Hubble radius”; curves $R(a) = H^{-1}$ (scaling as $a^{2}$ and $a^{3/2}$) and λ(a) (scaling as a or $a^{5/3}$); marked scales: 3 cm at T ∼ 100 GeV and 100 AU.]

Fig. 1. Evolution of the correlation scale for magnetic fields produced inside the
Hubble radius. The horizontal thick dashed line marks the end of the radiation-dominated
phase and the onset of the matter-dominated phase. The horizontal thin
dashed line marks the moment of e⁺–e⁻ annihilation (see also footnote 2). The
full (vertical) lines represent the evolution of the Hubble radius during the different
stages of the life of the Universe. The dashed (vertical) lines illustrate the evolution
of the correlation scale of the magnetic fields. In the absence of inverse cascade the
evolution of the correlation scale is given by the (inner) vertical dashed lines. If
inverse cascade takes place, the evolution of the correlation scale is faster than the
first power of the scale factor (for instance $a^{5/3}$) but always slower than the Hubble
radius

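
The two scalings of the Hubble radius with the scale factor can be checked from the Friedmann equation. The following is a short worked sketch, assuming a spatially flat FRW background and the standard dilution laws for radiation and matter (standard results that the text uses without derivation):

```latex
\begin{align}
  H^{2} \propto \rho
    &\;\Longrightarrow\; r_H = H^{-1} \propto \rho^{-1/2}, \\
  \text{radiation: } \rho \propto a^{-4}
    &\;\Longrightarrow\; r_H \propto a^{2}, \\
  \text{matter: } \rho \propto a^{-3}
    &\;\Longrightarrow\; r_H \propto a^{3/2}, \\
  \lambda \propto a
    &\;\Longrightarrow\; \frac{\lambda}{r_H} \propto a^{-1} \ \text{(radiation)},
    \qquad \frac{\lambda}{r_H} \propto a^{-1/2} \ \text{(matter)}.
\end{align}
```

A correlation scale that merely follows the scale factor therefore occupies a progressively smaller fraction of the Hubble radius in both epochs.
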
We consider, for simplicity, a magnetic field whose typical correlation scale
is as large as the Hubble radius at the electroweak epoch when the temper-
ature of the plasma was of the order of 100 GeV. This is roughly the regime
contemplated by the considerations presented around (40). If the correlation
scale evolves as the scale factor, the Hubble radius at the electroweak epoch
(roughly 3 cm) projects today over a scale of the order of the astronomical
unit. If inverse cascades are invoked, the correlation scale may grow, depend-
ing on the specific features of the cascade, up to 100 AU or even up to 100 pc.
In both cases the final scale is too small if compared with the typical scale of
the gravitational collapse of the protogalaxy. In Fig. 1 a particular model for
the evolution of the correlation scale λ(a) has been reported.¹⁰

¹⁰ Notice, as will be discussed later, that the inverse cascade lasts, in principle,
only down to the time of e⁺–e⁻ annihilation (see also the thin dashed horizontal
line in Fig. 1) since for temperatures smaller than $T_{e^{+}e^{-}}$ the Reynolds number
drops below 1. This is the result of the sudden drop in the number of charged
particles that leads to a rather long mean free path for the photons.

2.6 Inflationary Magnetogenesis

If magnetogenesis takes place inside the Hubble radius, the main problem is
therefore the correlation scale of the obtained seed field. The cure for this
problem is to look for a mechanism producing magnetic fields that are coherent
over large scales (i.e. mega parsec and, in principle, even larger). This
possibility may arise in the context of inflationary models. Inflationary models
may be conventional (i.e. based on a quasi-de Sitter stage of expansion) or
unconventional (i.e. not based on a quasi-de Sitter stage of expansion).
Unconventional inflationary models are, for instance, pre-big-bang models that
will be discussed in more depth in Sect. 3.
The rationale for the previous statement is that, in inflationary models,
the zero-point (vacuum) fluctuations of fields of various spin are amplified,
typically fluctuations of spin 0 and spin 2 fields. The spin 1 fields, however, enjoy
a property, called Weyl invariance, that seems to forbid the amplification
of these fields. While Weyl invariance and its possible breaking will be the
specific subject of the following subsection, it is useful for the moment to look
at the kinematical properties by assuming that, indeed, spin 1 fields can also
be amplified.
Since during inflation the Hubble radius is roughly constant (see Fig. 2),
the correlation scale evolves much faster than the Hubble radius itself and,
therefore, large-scale magnetic domains can naturally be obtained. Notice
that, in Fig. 2, the (vertical) dashed lines illustrate the evolution of the Hubble
radius (that is roughly constant during inflation) while the full line denotes
the evolution of the correlation scale. Furthermore, the horizontal (dashed)
lines mark, from top to bottom, the end of the inflationary phase and the onset
of the matter-dominated phase. This phenomenon can be understood as the
gauge counterpart of the superadiabatic amplification of the scalar and tensor
modes of the geometry. The main problem, in such a framework, is to get large
amplitudes for scales of the order of a mega parsec at the onset of gravitational
collapse. Models where the gauge couplings are effectively dynamical (breaking,
consequently, the Weyl invariance of the evolution equations of Abelian
gauge modes) may provide rather intense magnetic fields.
The two extreme possibilities mentioned above may sometimes be combined.
For instance, it can happen that magnetic fields are produced by superadiabatic
amplification of vacuum fluctuations during an inflationary stage of
expansion. After exiting the horizon, the gauge modes will reenter at different
moments all along the radiation- and matter-dominated epochs. The spectrum
of the primordial gauge fields after reentry will be determined not only by the
amplification mechanism but also by plasma effects. As soon as the magnetic
inhomogeneities reenter, some other physical process, taking place inside
the Hubble radius, may be triggered by the presence of large-scale magnetic
fields. An example, in this context, is the production of topologically nontrivial
configurations of the hypercharge field (hypermagnetic knots) from a
stochastic background of hypercharge fields with vanishing helicity [54, 55, 56]
(see also [57, 58, 59, 60, 61]).

[Figure 2: panel title “Superadiabatic amplification”; curves R(a) (roughly constant during inflation, then scaling as $a^{2}$ and $a^{3/2}$) and λ(a); marked scale: $10^{24}$ cm.]

Fig. 2. Evolution of the correlation scale if magnetic fields would be produced by
superadiabatic amplification during a conventional inflationary phase. The dashed
vertical lines denote, in the present figure, the evolution of the Hubble radius, while
the full line denotes the evolution of the correlation scale (typically selected to
be smaller than the Hubble radius during inflation)

2.7 Breaking of Conformal Invariance

Consider the action for an Abelian gauge ﬁeld in four-dimensional curved

space–time
1 √
Sem = − d4 x −GFμν F μν . (48)
4
Suppose, also, that the geometry is characterized by a conformally ﬂat line
element of Friedmann–Robertson–Walker type as the one introduced in (46).
The equations of motion derived from (48) can be written as

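$$ \partial_{\mu}\left( \sqrt{-G}\, F^{\mu\nu} \right) = 0. \tag{49} $$
Using (46) and recalling that $\sqrt{-G} = a^{4}(\tau)$, we will have
$$ \sqrt{-G}\, F^{\mu\nu} = a^{4}(\tau)\, \frac{\eta^{\mu\alpha}}{a^{2}(\tau)}\, \frac{\eta^{\nu\beta}}{a^{2}(\tau)}\, F_{\alpha\beta} = F^{\mu\nu}, \tag{50} $$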

where the second equality follows from the explicit form of the metric. Equa-
tion (50) shows that the evolution equations of Abelian gauge fields are the
same in flat space–time and in a conformally flat FRW space–time. This property
is correctly called Weyl invariance or, more ambiguously, conformal invariance.
Weyl invariance is also realized for chiral (massless) fermions, again in
conformally flat space–times.
One of the reasons for the success of inflationary models in making predictions
is deeply related to the lack of conformal invariance of the evolution
equations of the fluctuations of the geometry. In particular it can be shown
that the tensor modes of the geometry (spin 2) as well as the scalar modes
(spin 0) obey evolution equations that are not conformally invariant. This
means that these modes of the geometry can be amplified and eventually af-
fect, for instance, the temperature autocorrelations as well as the polarization
power spectra in the microwave sky.
To amplify large-scale magnetic fields, therefore, we would like to break
conformal invariance. Before considering this possibility, let us discuss an even
more conservative approach consisting in studying the evolution of Abelian
gauge fields coupled to another field whose evolution is not Weyl invariant.
An elegant way to achieve this goal is to couple the action of the hypercharge
field to the one of a complex scalar field (the Higgs field). The Abelian–Higgs
model, therefore, leads to the following action

$$ S = \int d^{4}x \, \sqrt{-G} \left[ G^{\mu\nu} (D_{\mu}\phi)^{*} D_{\nu}\phi - m^{2} \phi^{*}\phi - \frac{1}{4} F_{\mu\nu}F^{\mu\nu} \right], \tag{51} $$

where $D_{\mu} = \partial_{\mu} - i e A_{\mu}$ and $F_{\mu\nu} = \partial_{[\mu} A_{\nu]}$. Substituting (46) into (51) and
assuming that the complex scalar field (as well as the gauge fields) is not a
source of the background geometry, the canonical action for the normal modes
of the system can be written as

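$$ S = \int d^{3}x \, d\tau \left[ \eta^{\mu\nu} (D_{\mu}\Phi)^{*} D_{\nu}\Phi + \left( \frac{a''}{a} - m^{2}a^{2} \right) \Phi^{*}\Phi - \frac{1}{4} F_{\alpha\beta}F^{\alpha\beta} \right], \tag{52} $$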

where $\Phi = a\phi$, $D_{\mu} = \partial_{\mu} - i e A_{\mu}$ and $F_{\mu\nu} = \partial_{[\mu} A_{\nu]}$. From (52) it is clear
that, even when the Higgs field is massless, the coupling to the geometry
explicitly breaks Weyl invariance. Therefore, current density and charge density
fluctuations will be induced. Then, by employing a similar Vlasov–Landau
description, the resulting magnetic field will be of the order of $B_{\rm dec} \sim 10^{-40}\, T_{\rm dec}^{2}$
[62] which is, by far, too small to seed any observable field even assuming,
optimistically, perfect flux freezing and maximal efficiency for the dynamo action.
The results of [62] disproved earlier claims (see [63] for a critical review)
that neglected the role of the conductivity in the evolution of large-scale magnetic
fields after inflation.
The first attempts to analyze the Abelian–Higgs model in de Sitter space
have been made by Turner and Widrow [66] who just listed such a possibil-
ity as an open question. These two authors also analyzed different scenarios

where conformal invariance for spin 1 fields could be broken in four space–
time dimensions. Their first suggestion was that conformal invariance may be
broken, at an effective level, through the coupling of photons to the geometry
[67]. Typically, the breaking of conformal invariance occurs through products
of gauge-field strengths and curvature tensors, i.e.
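$$ \frac{1}{m^{2}} F_{\mu\nu}F_{\alpha\beta}R^{\mu\nu\alpha\beta}, \qquad \frac{1}{m^{2}} R_{\mu\nu}F^{\mu\beta}F^{\nu\alpha}g_{\alpha\beta}, \qquad \frac{1}{m^{2}} F_{\alpha\beta}F^{\alpha\beta}R, \tag{53} $$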
where $m$ is the appropriate mass scale, $R_{\mu\nu\alpha\beta}$ and $R_{\mu\nu}$ are the Riemann and
Ricci tensors and $R$ is the Ricci scalar. If the evolution of gauge fields is
studied during a phase of de Sitter (or quasi-de Sitter) expansion, then the
amplification of the vacuum fluctuations induced by the couplings listed in
(53) is minute. The price to pay in order to get large amplification should be,
according to [66], an explicit breaking of gauge invariance by direct coupling of
the vector potential to the Ricci tensor or to the Ricci scalar, i.e.
$$ R A_{\mu}A^{\mu}, \qquad R_{\mu\nu}A^{\mu}A^{\nu}. \tag{54} $$
In [66] two other different models were proposed (but not scrutinized in detail),
namely, scalar electrodynamics and the axionic coupling to the Abelian field
strength.
Dolgov [68] considered the possible breaking of conformal invariance due to
the trace anomaly. The idea is that the conformal invariance of gauge fields is
broken by the triangle diagram where two photons in the external lines couple
to the graviton through a loop of fermions. The local contribution to the
effective action leads to the vertex $(\sqrt{-g})^{1+\epsilon} F_{\alpha\beta}F^{\alpha\beta}$, where $\epsilon$ is a numerical
coefficient depending upon the number of scalars and fermions present in the
theory. The evolution equation for the gauge fields can be written, in Fourier
space, as
$$ A_{k}'' + \frac{\epsilon}{8}\, \mathcal{H} A_{k}' + k^{2} A_{k} = 0, \tag{55} $$
and it can be shown that the gauge fields are amplified only if $\epsilon > 0$. Furthermore,
substantial amplification of gauge fields is possible only if $\epsilon \sim 8$.
In a series of papers [69, 70, 71] the possible effect of the axionic coupling
to the amplification of gauge fields has been investigated. The idea here is that
conformal invariance is broken through the explicit coupling of a pseudoscalar
field to the gauge field (see Sect. 5), i.e.
$$ \sqrt{-g}\, c_{\psi\gamma}\, \alpha_{\rm em}\, \frac{\psi}{8\pi M}\, F_{\alpha\beta}\tilde{F}^{\alpha\beta}, \tag{56} $$
where $\tilde{F}^{\alpha\beta}$ is the dual field strength and $c_{\psi\gamma}$ is a numerical factor of order 1.
Consider now the case of a standard pseudoscalar potential, for instance $m^{2}\psi^{2}$,
evolving in a de Sitter (or quasi-de Sitter) space–time. It can be shown, rather
generically, that the vertex given in (56) leads to negligible amplification at
large length-scales. The coupled system of evolution equations to be solved
in order to get the amplified field is
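$$ \vec{B}'' - \nabla^{2}\vec{B} - \frac{\alpha_{\rm em}}{2\pi M}\, \psi'\, \vec{\nabla}\times\vec{B} = 0, \tag{57} $$
$$ \psi'' + 2\mathcal{H}\psi' + m^{2}a^{2}\psi = 0, \tag{58} $$
where $\vec{B} = a^{2}\vec{\mathcal{B}}$. From (57), there is a maximally amplified physical frequency
$$ \omega_{\rm max} \simeq \frac{\alpha_{\rm em}}{2\pi M}\, \dot{\psi}_{\rm max} \simeq \frac{\alpha_{\rm em}}{2\pi}\, m, \tag{59} $$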

where the second equality follows from $\psi \sim a^{-3/2} M \cos{mt}$ (i.e. $\dot{\psi}_{\rm max} \sim m M$).
The amplification for $\omega \sim \omega_{\rm max}$ is of the order of $\exp{[m\,\alpha_{\rm em}/(2\pi H)]}$ where
$H$ is the Hubble parameter during the de Sitter phase of expansion. From
the above expressions one can argue that the modes which are substantially
amplified are the ones for which $\omega_{\rm max} \gg H$. The modes interesting for the
large-scale magnetic fields are the ones which are in the opposite range, i.e.
$\omega_{\rm max} \ll H$. Clearly, by lowering the curvature scale of the problem, the
produced seeds may be larger and the conclusions much less pessimistic [71].
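To get a feeling for why the axionic vertex is inefficient at large scales, the estimate quoted above can be evaluated numerically. The following is only a minimal sketch: it assumes the present-day value of the fine-structure constant, $\alpha_{\rm em} \simeq 1/137$ (an assumption made here for illustration, not a statement about its value during inflation), and the function names are hypothetical.

```python
import math

# Assumption: present-day fine-structure constant, used only for illustration.
ALPHA_EM = 1.0 / 137.036


def omega_max_over_H(m_over_H, alpha_em=ALPHA_EM):
    """Maximally amplified frequency omega_max ~ alpha_em * m / (2 pi), in units of H."""
    return m_over_H * alpha_em / (2.0 * math.pi)


def axion_amplification(m_over_H, alpha_em=ALPHA_EM):
    """Amplification factor exp[m * alpha_em / (2 pi H)] for omega ~ omega_max."""
    return math.exp(omega_max_over_H(m_over_H, alpha_em))


# Scan a few values of the ratio m/H between the pseudoscalar mass
# and the Hubble rate of the de Sitter phase.
for ratio in (1.0, 10.0, 1.0e3):
    print(ratio, omega_max_over_H(ratio), axion_amplification(ratio))
```

Under these assumptions a single e-fold of amplification requires $m/H \simeq 2\pi/\alpha_{\rm em}$, i.e. a few hundred to a thousand, which is precisely the point where $\omega_{\rm max}$ becomes comparable to $H$.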
Another interesting idea pointed out by Ratra [72] is that the electromagnetic
field may be directly coupled to the inflaton field. In this case the
coupling is specified through a parameter $\alpha$, i.e. $e^{\alpha\varphi} F_{\alpha\beta}F^{\alpha\beta}$ where $\varphi$ is the
inflaton field in Planck units. In order to get sizable large-scale magnetic fields
the effective gauge coupling must be larger than one during inflation (recall
that ϕ is large, in Planck units, at the onset of inflation).
In [73] it has been suggested that the evolution of the Abelian gauge cou-
pling during inflation induces the growth of the two-point function of magnetic
inhomogeneities. This model is different from the one previously discussed
[72]. Here the dynamics of the gauge coupling is not related to the dynamics
of the inflaton which is not coupled to the Abelian field strength. In particular,
$r_{B}(\mathrm{Mpc})$ can be as large as $10^{-12}$. In [73] the MHD equations have been
generalized to the case of evolving gauge coupling. Recently, a scenario similar
to [73] has been discussed in [74].
With a view to generating large-scale magnetic fields, Gasperini
[75] suggested considering the possible mixing between the photon and the
graviphoton field appearing in supergravity theories (see also, in a related
context, [76]). The graviphoton is the massive vector component of the
gravitational supermultiplet and its interaction with the photon is specified by
an interaction term of the type $\lambda F_{\mu\nu}G^{\mu\nu}$, where $G_{\mu\nu}$ is the field strength of
the massive vector. Large-scale magnetic fields with $r_{B}(\mathrm{Mpc}) \geq 10^{-34}$ can be
obtained if $\lambda \sim \mathcal{O}(1)$ and for a mass of the vector $m \sim 10^{2}$ TeV.
Bertolami and Mota [77] argue that if Lorentz invariance is sponta-
neously broken, then photons acquire naturally a coupling to the geometry
which is not gauge-invariant and which is similar to the coupling considered
in [66].

3 Why String Cosmology?

The moment has come to review my personal interaction with Gabriele
Veneziano on the study of large-scale magnetic fields. While Gabriele and I have
written 15 other joint papers (with different combinations of authors),
two papers [80, 81] (both in collaboration with Maurizio Gasperini)
are directly related to large-scale magnetic fields. Both papers reported in
[80, 81] appeared in 1995 while I was completing my PhD at the Theory
Division of CERN.
My scientific exchange with Gabriele Veneziano started at least 4 years
earlier and the first person mentioning Gabriele to me was Sergio Fubini. At
that time Sergio was Professor of Theoretical Physics at the University of
Turin and I had the great opportunity of discussing physics with him at least
twice a month. Sergio was rather intrigued by the possibility of getting precise
measurements on macroscopic quantum phenomena like superfluidity, super-
conductivity, and quantization of the resistivity in the (quantum) Hall effect.
I started working, under the supervision of Maurizio Gasperini, on the spec-
tral properties of relic gravitons and we bumped into the concept of squeezed
state [82], a generalization of the concept of coherent state (see, for instance,
[83, 84, 85]). Sergio got very interested and, I think, he was independently
thinking about possible applications of squeezed states to superconductivity,
a topic that later became the subject of a paper [86]. Sergio even suggested
a review by Rodney Loudon [87], an author that I knew already because of his
inspiring book on quantum optics [88]. Reference [87] together with a physics
report of Schumaker [89] was very useful for my understanding of the sub-
ject. Nowadays a very complete and thorough presentation of the intriguing
problems arising in quantum optics can be found in the book of Mandel and
Wolf [90].
It is amusing to notice the following parallelism between quantum optics
and the quantum treatment of gravitational fluctuations. While quantum op-
tics deals with the coherence properties of systems of many photons, we deal,
in cosmology, with the coherence properties of many gravitons (or phonons)
excited during the time evolution of the background fields. The background
fields act, effectively, as a “pump field.” This terminology, now generally
accepted, is borrowed directly from quantum optics, where the pump field is a laser.
In the 1960s and 1970s the main problem of optics could be summarized by the
following question: Why is classical optics so precise? Put into different words,
it is known that the interference of the amplitudes of the radiation field (the
so-called Young interferometry) can be successfully treated at a classical level.
Quantum effects, in optics, arise not from the first-order interference effects
(Young interferometry) but from the second-order interference effects, i.e. the
so-called Hanbury–Brown–Twiss interferometry [90], where the quantum na-
ture of the radiation field is manifest since it leads, in the jargon introduced
by Mandel [90], to light which is either bunched or antibunched. A similar
problem also arises in the treatment of cosmological perturbations when we

ask the question of the classical limit of a quantum mechanically generated
fluctuation (for instance relic gravitons).
The interaction with Sergio led, a few years later, to a talk that I presented at
the physics department of the University of Torino. The title was Correlation
properties of many photons systems. I mentioned my interaction with Sergio
Fubini since it was Sergio who suggested that, eventually, I should talk to
Gabriele about squeezed states.
During the first few months of 1991, Gabriele submitted a seminal paper
on the cosmological implications of the low-energy string effective action [91].
This paper, together with another one written in collaboration with Mau-
rizio Gasperini [92], represents the first formulation of pre–big-bang mod-
els. A relatively recent introduction to pre–big-bang models can be found
in [93].
In [80, 81] it was argued that the string cosmological scenario provided
by pre-big-bang models [91, 92] would be ideal for the generation of large-
scale magnetic fields. The rationale for this statement relies on two different
observations:
• in the low-energy string effective action gauge fields are coupled to the dila-
ton whose expectation value, at the string energy scale, gives the unified
value of the gauge and gravitational coupling;
• from the mathematical analysis of the problem it is clear that to achieve
a sizable amplification of large-scale magnetic fields it is necessary to
have a pretty long phase where the gauge coupling is sharply growing
in time [80].
Let us therefore elaborate on the two mentioned points. In the string
frame the low-energy string effective action can be schematically written as
[94, 95, 96]
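$$ S_{\rm eff} = -\int d^{4}x \sqrt{-G} \left[ \frac{e^{-\varphi}}{2\lambda_{s}^{2}} \left( R + G^{\alpha\beta}\partial_{\alpha}\varphi\,\partial_{\beta}\varphi - \frac{1}{12} H_{\mu\nu\alpha}H^{\mu\nu\alpha} \right) + \frac{e^{-\varphi}}{4} F_{\alpha\beta}F^{\alpha\beta} + e^{-\varphi} \left( \frac{i}{2}\, \overline{\psi}\gamma^{a}D_{a}\psi + \mathrm{h.c.} \right) + R^{2} + \ldots \right] + \mathcal{O}(g^{2}) + \ldots \tag{60} $$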

In (60) the ellipses stand, respectively, for an expansion in powers of $(\lambda_{s}/L)^{2}$
and for an expansion in powers of the gauge coupling constant $g^{2} = e^{\varphi}$. This
action is written in the so-called string frame metric where the dilaton field
$\varphi$ is coupled to the Einstein–Hilbert term.
Concerning the action (60) a few general comments are in order:
• the relation between the Planck and string scales depends on time and,
in particular, $\ell_{P}^{2} = e^{\varphi}\lambda_{s}^{2}$; the present ratio between the Planck and string
scales gives the present value of the gauge coupling, i.e. $g(\tau_{0}) = e^{\varphi_{0}/2} = \ell_{P}(\tau_{0})/\lambda_{s}$;

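• in four space–time dimensions the antisymmetric tensor field $H^{\mu\nu\alpha}$ can be
written in terms of a pseudoscalar field, i.e.
$$ H^{\mu\nu\alpha} = e^{\varphi}\, \frac{\epsilon^{\mu\nu\alpha\rho}}{\sqrt{-G}}\, \partial_{\rho}\sigma. \tag{61} $$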
In critical superstring theory the dilaton field must have a potential that
vanishes in the weak coupling limit (i.e. $\varphi \to -\infty$). Moreover, direct
tests of Newton's law at short distances require the mass of the dilaton
to satisfy $m_{\varphi} > 10^{-4}$ eV. This requirement may be relaxed by
envisaging nonperturbative mechanisms where the dilaton is effectively decoupled
from the matter fields and where a massless dilaton leads to observable
violations of the equivalence principle.
From the structure of the action (60), Abelian gauge fields are amplified
if the gauge coupling is dynamical. Consider, in fact, the equations of motion
for the hypercharge field strength

$$ \partial_{\mu}\left( e^{-\varphi} \sqrt{-G}\, F^{\mu\nu} \right) = 0, \tag{62} $$

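where $F_{\mu\nu} = \partial_{[\mu} A_{\nu]}$. In the Coulomb gauge, where $A_{0} = 0$ and $\vec{\nabla}\cdot\vec{A} = 0$,
the equation for the rescaled vector potential $e^{-\varphi/2} A_{\mu}$ (whose Fourier
amplitude is denoted by $A_{k}$) becomes, for each
independent polarization and in Fourier space,
$$ A_{k}'' + \left[ k^{2} - g \left( \frac{1}{g} \right)'' \right] A_{k} = 0, \tag{63} $$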
where, as usual, the prime denotes a derivative with respect to the conformal
time coordinate. In (63) $k$ denotes the comoving wave-number. From
the structure of (63) there exist two different regimes. For $k^{2} \gg |g (g^{-1})''|$ the
solution of (63) is oscillatory. In the opposite limit, i.e. $k^{2} \ll |g (g^{-1})''|$, the
general solution can be written as
$$ A_{k}(\tau) = \frac{C_{1}(k)}{g(\tau)} + \frac{C_{2}(k)}{g(\tau)} \int^{\tau} g^{2}(\tau')\, d\tau', \tag{64} $$
where C1 (k) and C2 (k) are two arbitrary constants. These two constants can
be fixed by imposing quantum mechanical initial conditions for τ → −∞.
Thus, depending on the evolution of g(τ ) the Fourier amplitude Ak can be
amplified.
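As a quick consistency check (a sketch added for illustration, not part of the original derivation), one can verify that both terms of the long-wavelength solution solve the mode equation once $k^{2}$ is neglected. The shorthand $f = 1/g$ is introduced here only for convenience, so that the pump field reads $g (g^{-1})'' = f''/f$ and $g^{2} = 1/f^{2}$.

```latex
% Long-wavelength limit of the mode equation: A_k'' = (f''/f) A_k, with f = 1/g.
\begin{align}
A_k^{(1)} &= f: &
\bigl(A_k^{(1)}\bigr)'' &= f'' = \frac{f''}{f}\, A_k^{(1)}, \\
A_k^{(2)} &= f \int^{\tau} \frac{d\tau'}{f^{2}(\tau')}: &
\bigl(A_k^{(2)}\bigr)' &= f' \int^{\tau} \frac{d\tau'}{f^{2}(\tau')} + \frac{1}{f}, \\
 & &
\bigl(A_k^{(2)}\bigr)'' &= f'' \int^{\tau} \frac{d\tau'}{f^{2}(\tau')}
 + \frac{f'}{f^{2}} - \frac{f'}{f^{2}}
 = \frac{f''}{f}\, A_k^{(2)} .
\end{align}
```

The first solution simply follows the inverse coupling, while the second one grows whenever the integral of $g^{2}$ grows, which is why a sharply increasing gauge coupling is needed for a sizable amplification.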
It can be shown [80, 81] that the amplified magnetic energy density de-
pends on the ratio between the value of the gauge coupling at the reentry and
at the exit of the typical scale of the gravitational collapse, i.e.
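$$ r(k) = \frac{1}{\rho_{\gamma}} \frac{d\rho_{B}}{d\ln{k}} \simeq \frac{k^{4}}{a^{4}\rho_{\gamma}} \left( \frac{g_{\rm re}}{g_{\rm ex}} \right)^{2}. \tag{65} $$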
The parameter r(k) measures the relative weight of the magnetic energy den-
sity in units of the radiation background. To turn on the galactic dynamo in

its simplest realization, one should require that $r(k_{G}) \geq 10^{-34}$ for a typical
comoving wave-number corresponding to the typical scale of the gravitational
collapse of the protogalaxy. As explained before, this requirement seems to
be too optimistic in light of the most recent understanding of the dynamo
theory. The limit $r(k_{G}) \geq 10^{-24}$ seems more reasonable.
The fact that the gauge coupling must be sharply growing in order to
produce large-scale magnetic fields fits extremely well with the pre–big-bang
dynamics where, indeed, the gauge coupling is expected to grow. The second
requirement to obtain a phenomenologically viable mechanism for the ampli-
fication of large-scale gauge fields turned out to be the existence of a pretty
long stringy phase.
The “stringy” phase is simply the epoch where quadratic curvature cor-
rections start being important and lead to an effective dynamics where
the dilaton field is linearly growing in the cosmic time coordinate (see [93]
and references therein). Towards the end of the stringy phase the dilaton
freezes to its (constant) value and the Universe gets dominated by radia-
tion. One possibility for achieving the transition to radiation is represented
by the back-reaction effects of the produced particles [102]. In particular,
the short-wavelength modes play, in this context, a crucial role. It is
interesting that the magnetic energy spectrum produced during the stringy
phase is quasi-flat and that the value of $r(k_{G})$ can be as large as $10^{-8}$,
implying a protogalactic magnetic field of the order of $10^{-10}$ G. Under these
conditions the dynamo mechanism would even be superfluous since the
compressional amplification alone can amplify the seed field to its observed
value.
The results reported above may be “tested” in a framework where the pre-
big-bang dynamics is solvable. Consider, in particular, the situation where
the evolution of the dilaton field as well as the one of the geometry is
treated in the presence of a nonlocal dilaton potential [97, 98, 99, 100,
101].
In the Einstein frame description, the asymptotics of the (four-dimensional)
pre-big-bang dynamics can be written as [102]
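$$ a(\tau) \simeq a_{-} \sqrt{-\frac{\tau}{2\tau_{0}}}, \qquad a_{-} = e^{-\varphi_{0}/2} \sqrt{\frac{2(\sqrt{3}+1)}{\sqrt{3}}}, $$
$$ \varphi_{-} = \varphi_{0} - \ln{2} - \sqrt{3}\, \ln{\left( \frac{\sqrt{3}+1}{\sqrt{3}} \right)} - \sqrt{3}\, \ln{\left( -\frac{\tau}{2\tau_{0}} \right)}, $$
$$ \mathcal{H}_{-} = \frac{1}{2\tau}, \qquad \varphi_{-}' = -\frac{\sqrt{3}}{\tau}, \tag{66} $$
for $\tau \to -\infty$, and
$$ a(\tau) \simeq a_{+} \sqrt{\frac{\tau}{2\tau_{0}}}, \qquad a_{+} = e^{\varphi_{0}/2} \sqrt{\frac{2(\sqrt{3}-1)}{\sqrt{3}}}, $$
$$ \varphi_{+} = \varphi_{0} - \ln{2} - \sqrt{3}\, \ln{\left( \frac{\sqrt{3}-1}{\sqrt{3}} \right)} + \sqrt{3}\, \ln{\left( \frac{\tau}{2\tau_{0}} \right)}, $$
$$ \mathcal{H}_{+} = \frac{1}{2\tau}, \qquad \varphi_{+}' = \frac{\sqrt{3}}{\tau}, \tag{67} $$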
for $\tau \to +\infty$. In (66) and (67), $\mathcal{H} = a'/a$ and, as usual, the prime denotes
a derivative with respect to $\tau$. The branch of the solution denoted by minus
describes, in the Einstein frame, an accelerated contraction, since the first
derivative of the scale factor is negative, while the second is positive. The
branch of the solution denoted with plus describes, in the Einstein frame, a
decelerated expansion, since the first derivative of the scale factor is positive
while the second derivative is negative. In both branches the dilaton grows and its
derivative is always positive-definite (i.e. $\varphi_{\pm}' > 0$) as required by the present
approach to bouncing solutions. The numerical solution corresponding to the
asymptotics given in (66) and (67) is reported in Fig. 3.
In the Schrödinger description the vacuum state evolves, unitarily, to a
multimode squeezed state, in full analogy with what happens in the case
of relic gravitons [82, 103, 104]. In the following the same process will be
discussed within the Heisenberg representation. The two physical polariza-
tions of the photon can then be quantized according to the standard rules of
quantization in the radiation gauge in curved space–times:

[Figure 3: plot of the scale factor $a(\tau)$ (vertical axis) versus the conformal time $\tau$ (horizontal axis, from −100 to 300).]
Fig. 3. The evolution of the scale factor in conformal time for a bouncing model
regularized via nonlocal dilaton potential in the Einstein frame

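$$ \hat{A}_{i}(\vec{x}, \tau) = \sum_{\alpha} \int \frac{d^{3}k}{(2\pi)^{3/2}} \left[ \hat{a}_{k,\alpha}\, e_{i}^{\alpha}\, A_{k}(\tau)\, e^{-i\vec{k}\cdot\vec{x}} + \hat{a}_{k,\alpha}^{\dagger}\, e_{i}^{\alpha}\, A_{k}(\tau)^{*}\, e^{i\vec{k}\cdot\vec{x}} \right], \tag{68} $$
and
$$ \hat{\pi}_{i}(\vec{x}, \tau) = \sum_{\alpha} \int \frac{d^{3}k}{(2\pi)^{3/2}} \left[ \hat{a}_{k,\alpha}\, e_{i}^{\alpha}\, \Pi_{k}(\tau)\, e^{-i\vec{k}\cdot\vec{x}} + \hat{a}_{k,\alpha}^{\dagger}\, e_{i}^{\alpha}\, \Pi_{k}(\tau)^{*}\, e^{i\vec{k}\cdot\vec{x}} \right], \tag{69} $$
where $e_{i}^{\alpha}(k)$ describe the polarizations of the photon and
$$ \Pi_{k}(\tau) = A_{k}'(\tau), \qquad [\hat{a}_{k,\alpha}, \hat{a}_{p,\beta}^{\dagger}] = \delta_{\alpha\beta}\, \delta^{(3)}(\vec{k} - \vec{p}). \tag{70} $$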

The evolution equation for the mode functions will then be, in Fourier space,

$$ A_{k}'' + \left[ k^{2} - g (g^{-1})'' \right] A_{k} = 0, \tag{71} $$

i.e. exactly the same equation obtained in (63). The pump field can also be
expressed as
$$ g (g^{-1})'' = \frac{\varphi'^{2}}{4} - \frac{\varphi''}{2}. \tag{72} $$
The maximally amplified modes are then the ones for which
$$ k_{\rm max}^{2} \simeq |g (g^{-1})''|. \tag{73} $$

The Fourier modes appearing in (71) have to be normalized while they are
inside the horizon for large and negative τ . In this limit the initial conditions
provided by quantum mechanics are

$$ A_{k}(\tau) = \frac{1}{\sqrt{2k}}\, e^{-ik\tau}, \qquad \Pi_{k}(\tau) = -i \sqrt{\frac{k}{2}}\, e^{-ik\tau}. \tag{74} $$
In the limit τ → +∞ the positive and negative frequency modes will be mixed,
so that the solution will be represented in the plane-wave orthonormal basis as

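$$ A_{k}(\tau) = \frac{1}{\sqrt{2k}} \left[ c_{+}(k)\, e^{-ik\tau} + c_{-}(k)\, e^{ik\tau} \right], $$
$$ A_{k}'(\tau) = -i \sqrt{\frac{k}{2}} \left[ c_{+}(k)\, e^{-ik\tau} - c_{-}(k)\, e^{ik\tau} \right], \tag{75} $$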

where $c_{\pm}(k)$ are the (constant) mixing coefficients. The following two relations
fully determine the square modulus of each of the two mixing coefficients in
terms of the complex wave-functions obeying (71):
$$ |c_{+}(k)|^{2} - |c_{-}(k)|^{2} = i \left( A_{k}^{*}\Pi_{k} - A_{k}\Pi_{k}^{*} \right), \tag{76} $$
$$ |c_{+}(k)|^{2} + |c_{-}(k)|^{2} = \frac{1}{k} \left( |\Pi_{k}|^{2} + k^{2} |A_{k}|^{2} \right). \tag{77} $$
After having numerically computed the time evolution of the properly nor-
malized mode functions, (76) and (77) can be used to infer the value of the
relevant mixing coefficient (i.e. c− (k)). Equation (76) is, in fact, the Wronskian
of the solutions. If the second-order differential equation is written in the form
(71), the Wronskian is always conserved throughout the time evolution of the
system. Since, from (74), the Wronskian is equal to 1 initially, it will be equal
to 1 all along the time evolution. Thus, from (76) |c+ (k)|2 = |c− (k)|2 + 1.
The fact that the Wronskian must always be equal to 1 is the measure of the
precision of the algorithm.
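The numerical strategy just described can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the calculation behind Figs. 4 and 5: the bounce is modelled by the toy profile $\varphi' = \sqrt{3}/\sqrt{\tau^{2} + \tau_{0}^{2}}$, chosen here only because it reproduces the asymptotics $\varphi_{\pm}' \simeq \pm\sqrt{3}/\tau$ (it is not the nonlocal-potential solution of Fig. 3); the function names, the integration range and the step size are likewise hypothetical. The difference of the squared mixing coefficients is evaluated from the Wronskian, and their sum from $(|\Pi_{k}|^{2} + k^{2}|A_{k}|^{2})/k$, which is what one obtains by inserting the plane-wave decomposition (75) into $|\Pi_{k}|^{2} + k^{2}|A_{k}|^{2}$.

```python
import cmath
import math

SQRT3 = math.sqrt(3.0)


def pump(tau, tau0=1.0):
    """Toy pump field g (1/g)'' = phi'^2/4 - phi''/2.

    Assumption: phi' = sqrt(3)/sqrt(tau^2 + tau0^2), a smooth interpolation
    between the two asymptotic branches (not the solution used in the text).
    """
    s = tau * tau + tau0 * tau0
    dphi = SQRT3 / math.sqrt(s)
    d2phi = -SQRT3 * tau / s ** 1.5
    return 0.25 * dphi * dphi - 0.5 * d2phi


def evolve_mode(k, tau0=1.0, t_max=100.0, h=0.01):
    """Integrate A'' + [k^2 - pump] A = 0 from -t_max to +t_max with RK4."""
    tau = -t_max
    # Quantum mechanical initial conditions: positive-frequency plane wave.
    a = cmath.exp(-1j * k * tau) / math.sqrt(2.0 * k)
    p = -1j * math.sqrt(k / 2.0) * cmath.exp(-1j * k * tau)

    def rhs(t, a_, p_):
        # First-order system: A' = Pi, Pi' = -(k^2 - pump) A.
        return p_, -(k * k - pump(t, tau0)) * a_

    for _ in range(int(round(2.0 * t_max / h))):
        k1a, k1p = rhs(tau, a, p)
        k2a, k2p = rhs(tau + h / 2, a + h / 2 * k1a, p + h / 2 * k1p)
        k3a, k3p = rhs(tau + h / 2, a + h / 2 * k2a, p + h / 2 * k2p)
        k4a, k4p = rhs(tau + h, a + h * k3a, p + h * k3p)
        a += h / 6 * (k1a + 2 * k2a + 2 * k3a + k4a)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        tau += h
    return a, p


def mixing_coefficients(k, **kwargs):
    """Return (Wronskian, |c_-|^2) evaluated at the end of the evolution."""
    a, p = evolve_mode(k, **kwargs)
    # |c_+|^2 - |c_-|^2: the Wronskian, which should stay equal to 1.
    wronskian = (1j * (a.conjugate() * p - a * p.conjugate())).real
    # |c_+|^2 + |c_-|^2 from the plane-wave decomposition.
    total = (abs(p) ** 2 + k * k * abs(a) ** 2) / k
    return wronskian, 0.5 * (total - wronskian)


if __name__ == "__main__":
    for k in (1.0, 0.1):
        print(k, mixing_coefficients(k))
```

Monitoring how far the first returned number drifts from 1 gives a direct handle on the accuracy of the integration, exactly in the spirit of the Wronskian check mentioned above; halving the step size `h` is a simple way of testing convergence.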
In Figs. 4 and 5 the numerical calculation of the spectrum is illustrated for
different values of $k$. In Fig. 5 the mixing coefficients are reported for modes
$k \ll k_{\rm max}$. In Fig. 4 the mixing coefficients are reported for modes around
$k_{\rm max}$. Clearly, from Fig. 5 a smaller $k$ leads to a larger mixing coefficient
which means that the spectrum is rather blue. Furthermore, by comparing
the amplification of different modes, it is easy to infer that the scaling law
is $|c_{+}(k)|^{2} + |c_{-}(k)|^{2} \propto (k/k_{\rm max})^{-n_{g}}$, with $n_{g} \sim 3.46$, which is in excellent
agreement with the analytical determination of the mixing coefficients leading
to $n_{g} = 2\sqrt{3} \sim 3.46$ [see below (88)].
The second piece of information that can be drawn from Fig. 4 concerns $k_{\rm max}$,
whose specific value
$$ k_{\rm max} \simeq \frac{\sqrt{5} - 0.5}{\tau_{0}} \tag{78} $$
can be determined numerically for different values of $\tau_{0}$.
For the value of $k_{\rm max}$ reported in (78), the obtained mixing coefficient is
1, i.e. $|c_{-}(k_{\rm max})| \simeq 1$. According to Fig. 4 as we move from $k_{\rm max}$ to larger

[Figure 4: plot of $\log(|c_{+}|^{2} + |c_{-}|^{2})$ and $\log(|c_{+}|^{2} - |c_{-}|^{2})$ for the curves $k = k_{\rm max} - 1$, $k_{\rm max}$ and $k = k_{\rm max} + 1$; horizontal axis from −100 to 100.]
Fig. 4. The evolution of the mixing coefficients for $k \simeq k_{\rm max}$ in units of $\tau_{0}$

[Figure 5: plot of $\log(|c_{+}|^{2} + |c_{-}|^{2})$ and $\log(|c_{+}|^{2} - |c_{-}|^{2})$ for the curves $k\tau_{0} = 10^{-5}$, $10^{-4}$, $10^{-3}$ and $10^{-2}$; horizontal axis labelled $\tau \times 10^{5}$.]
Fig. 5. The numerical estimate of the mixing coefficients in the case $k\tau_{0} \ll 1$

$k$, $(|c_{+}(k)|^{2} + |c_{-}(k)|^{2}) \simeq (|c_{+}(k)|^{2} - |c_{-}(k)|^{2})$, implying that $|c_{-}(k)| \sim 0$.
Moreover, from the left plot of Fig. 5 it can be appreciated that
$$ |c_{-}(k_{\rm max})|^{2} = 1, \qquad \log{\left( |c_{+}(k_{\rm max})|^{2} + |c_{-}(k_{\rm max})|^{2} \right)} = \log{3} \simeq 0.477. \tag{79} $$
Thus the absolute normalization and slope of the relevant mixing coefficient
can be numerically determined to be
$$ |c_{-}(k)|^{2} = \left( \frac{k}{k_{\rm max}} \right)^{-2\sqrt{3}}. \tag{80} $$

It can be concluded that (80) is rather accurate as far as both the slope and
the absolute normalization are concerned. The numerical estimates presented
so far can also be corroborated by the usual analytical treatment based on
the matching of the solutions for the mode functions before and after the
bounce. The evolution of the modes described by (71) can be approximately
determined from the exact asymptotic solutions given in (66) and (67),
implying that $\varphi_{\pm}' \simeq \pm\sqrt{3}/\tau$. Thus the solutions of (71) can be obtained in
the two asymptotic regimes, i.e. for $\tau \leq -\tau_{1}$
$$ A_{k,-}(\tau) = \frac{\sqrt{-\pi\tau}}{2}\, e^{i\frac{\pi}{2}(\nu + 1/2)}\, H_{\nu}^{(1)}(-k\tau), \tag{81} $$
and for $\tau \geq \tau_{1}$
$$ A_{k,+}(\tau) = \frac{\sqrt{\pi\tau}}{2}\, e^{i\frac{\pi}{2}(\mu + 1/2)} \left[ c_{-}\, H_{\mu}^{(1)}(k\tau) + c_{+}\, e^{-i\pi(\mu + 1/2)}\, H_{\mu}^{(2)}(k\tau) \right], \tag{82} $$
where $H_{\alpha}^{(1,2)}(z)$ are Hankel functions of first and second kind whose related
indices are
$$ \nu = \frac{\sqrt{3} - 1}{2}, \qquad \mu = \frac{\sqrt{3} + 1}{2}. \tag{83} $$
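For completeness, the values of the two indices can be traced back to the pump field (72) evaluated on the asymptotic solutions (66) and (67). The short derivation below is a sketch added for illustration; the generic index $\alpha$ is introduced here only to denote the order of the Hankel functions, using the fact that $\sqrt{\tau}\, H_{\alpha}^{(1,2)}(k\tau)$ solves a mode equation with potential $(\alpha^{2} - 1/4)/\tau^{2}$.

```latex
% Pump field (72) on the asymptotic branches: upper sign for tau -> +infinity,
% lower sign for tau -> -infinity.
\begin{align}
\varphi_{\pm}' = \pm\frac{\sqrt{3}}{\tau}, \quad
\varphi_{\pm}'' = \mp\frac{\sqrt{3}}{\tau^{2}}
\quad &\Longrightarrow \quad
g\,(g^{-1})'' = \frac{\varphi_{\pm}'^{2}}{4} - \frac{\varphi_{\pm}''}{2}
= \frac{3 \pm 2\sqrt{3}}{4\,\tau^{2}}, \\
A_k'' + \left[ k^{2} - \frac{\alpha^{2} - 1/4}{\tau^{2}} \right] A_k = 0
\quad &\Longrightarrow \quad
\alpha^{2} = \frac{4 \pm 2\sqrt{3}}{4}
= \left( \frac{\sqrt{3} \pm 1}{2} \right)^{2}.
\end{align}
```

The lower sign (pre-bounce branch) gives $\nu$ and the upper sign (post-bounce branch) gives $\mu$, so that $\mu + \nu = \sqrt{3}$ and $\mu - \nu = 1$.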
The time-scale $\tau_{1}$ defines the width of the bounce and, typically, $\tau_{1} \sim \tau_{0}$.
The phases appearing in (81) and (82) are carefully chosen so that
$$ \lim_{\tau \to -\infty} A_{k} = \frac{1}{\sqrt{2k}}\, e^{-ik\tau}. \tag{84} $$
Using then the appropriate matching conditions

$$ A_{k,-}(-\tau_{1}) = A_{k,+}(\tau_{1}), \qquad A_{k,-}'(-\tau_{1}) = A_{k,+}'(\tau_{1}), \tag{85} $$

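and defining $x_{1} = k\tau_{1}$, the obtained mixing coefficients are
$$ c_{+}(k) = i\, \frac{\pi}{4}\, x_{1}\, e^{i\pi(\nu+\mu+1)/2} \left[ -\frac{\nu+\mu+1}{x_{1}}\, H_{\mu}^{(1)}(x_{1}) H_{\nu}^{(1)}(x_{1}) + H_{\mu}^{(1)}(x_{1}) H_{\nu+1}^{(1)}(x_{1}) + H_{\mu+1}^{(1)}(x_{1}) H_{\nu}^{(1)}(x_{1}) \right], \tag{86} $$
$$ c_{-}(k) = i\, \frac{\pi}{4}\, x_{1}\, e^{i\pi(\nu-\mu)/2} \left[ -\frac{\nu+\mu+1}{x_{1}}\, H_{\mu}^{(2)}(x_{1}) H_{\nu}^{(1)}(x_{1}) + H_{\mu}^{(2)}(x_{1}) H_{\nu+1}^{(1)}(x_{1}) + H_{\mu+1}^{(2)}(x_{1}) H_{\nu}^{(1)}(x_{1}) \right], \tag{87} $$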
satisfying the exact Wronskian normalization condition $|c_{+}(k)|^{2} - |c_{-}(k)|^{2} = 1$.
In the small argument limit, i.e. $k\tau_{1} \sim k\tau_{0} \ll 1$, the leading term in (87)
leads to
$$ c_{-}(k) \simeq \frac{i\, 2^{\mu+\nu}}{4\pi}\, e^{i\pi(\nu-\mu)/2}\, x_{1}^{-\mu-\nu}\, (\nu + \mu - 1)\, \Gamma(\mu)\Gamma(\nu). \tag{88} $$
If we now insert the values given in (83), it turns out that $|c_{-}(k)| \simeq 0.41\, |k\tau_{1}|^{-\sqrt{3}}$.
The spectral slope agrees with the numerical estimate, as already stressed.
The absolute normalization cannot be determined from (88), where the small
argument limit has already been taken. In order to determine the absolute
normalization, the specific value of $k_{\rm max}\tau_{1}$ has to be inserted in (87). The
result of this procedure, taking $\tau_{1} \sim \tau_{0}$, is $|c_{-}(k_{\rm max})|^{2} = 0.14$, which is roughly
a factor of 10 smaller than the interpolating formula given in (80).
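The numerical prefactor and the slope of the small-argument result can be checked with a few lines of code. This is only an illustrative sketch: it evaluates the modulus of the leading term (88) for the indices (83) with the Gamma function of the Python standard library, and the function name is hypothetical.

```python
import math

# Hankel indices of the pre- and post-bounce solutions.
nu = (math.sqrt(3.0) - 1.0) / 2.0
mu = (math.sqrt(3.0) + 1.0) / 2.0


def c_minus_small_argument(x1):
    """Modulus of the leading small-argument term, with x1 = k * tau_1."""
    prefactor = (2.0 ** (mu + nu) * (nu + mu - 1.0)
                 * math.gamma(mu) * math.gamma(nu) / (4.0 * math.pi))
    return prefactor * x1 ** (-(mu + nu))


# At x1 = 1 the function returns the bare numerical prefactor.
prefactor = c_minus_small_argument(1.0)
# Slope of |c_-|^2 as a function of k, to be compared with -2 sqrt(3).
slope = -2.0 * (mu + nu)
print(prefactor, slope)
```

The prefactor should come out close to the value quoted in the text (slightly above 0.41), while the slope of $|c_{-}(k)|^{2}$ is $-2(\mu + \nu) = -2\sqrt{3}$, in agreement with the numerical scaling law.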
The observation that a dynamical gauge coupling implies a viable mech-
anism for the production of large-scale magnetic fields can be interesting in
general terms and, more specifically, in the context of the pre-big bang models.
In fact, in pre-big bang models, not only the fluctuations of the hypercharge

field are amplified. In the minimal case we will have to deal with the fluctua-
tions of the tensor [105, 138] and scalar [106] modes of the geometry and with
the fluctuations of the antisymmetric tensor field [107, 108].
The amplified tensor modes of the geometry lead to a stochastic back-
ground of gravitational waves (GW) with violet spectrum in both the GW
amplitude and energy density. In Fig. 6 the GW signal is parametrized in
terms of the logarithm of $\Omega_{\rm GW} = \rho_{\rm GW}/\rho_{c}$, i.e. the fraction of critical energy
density present (today) in GW. On the horizontal axis of Fig. 6 the logarithm
of the present (physical) frequency $\nu$ is reported. In conventional inflationary
models, for $\nu \geq 10^{-16}$ Hz, $\Omega_{\rm GW}$ is constant (or slightly decreasing) as
a function of the present frequency. In the case of string cosmological models,
$\Omega_{\rm GW} \propto \nu^{3} \ln{\nu}$, which also implies a steeply increasing power spectrum.
This possibility spurred various experimental groups to analyze possible direct
limits on the scenario arising from specific instruments such as resonant
mass detectors [109] and microwave cavities [110, 111]. These attempts are
justified since the signal of pre-big bang models may be rather strong at high
frequencies and, anyway, much stronger than the conventional inflationary
prediction.
The sensitivity of a pair of VIRGO detectors to string cosmological gravi-
tons has been specifically analyzed [112] with the conclusion that a VIRGO
pair, in its upgraded stage, can certainly probe wide regions of the parameter
space of these models. If we maximize the overlap between the two detec-
tors [112] or if we reduce (selectively) the pendulum and pendulum’s internal
modes contribution to the thermal noise of the instruments, the visible region
(after 1 year of observation and with SNR = 1) of the parameter space will get
even larger. Unfortunately, as in the case of the advanced LIGO detectors, the

[Figure 6: plot of $\log{h^{2}\Omega_{\rm GW}}$ (vertical axis, from −20 to −5) versus $\log(\nu/\mathrm{Hertz})$ (horizontal axis, from −15 to 10), with the labels: Big-bang nucleosynthesis bound, Advanced LIGO/VIRGO, Pulsar timing bound, CMB bound, LISA, quintessential inflation, Pre-big bang, Conventional inflation, Ekpyrotic model, Planck explorer (advanced).]
Fig. 6. The spectrum of relic gravitons from various cosmological models presented
in terms of $h^{2}\Omega_{\rm GW}$

sensitivity to a flat $\Omega_{\rm GW}$ will be irrelevant for ordinary inflationary models
also with the advanced VIRGO detector. It is worth mentioning that growing
energy spectra of relic gravitons can also arise in the context of quintessential
inflationary models [113, 114]. In this case $\Omega_{\rm GW} \propto \nu \ln^{2}{\nu}$ (see [114] for a full
discussion).
The spectra of gravitational waves have features that are, in some sense,
complementary to the ones of the large-scale magnetic fields. The parameter
space leading to a possible signal of relic (pre-big bang) gravitons with wide-
band interferometers has only a small overlap with the region of the parameter
space leading to sizable large-scale magnetic fields. This conclusion can be
evaded if the coupling of the dilaton to the hypercharge field is, in the action,
of the type $e^{-\beta\varphi} F_{\mu\nu}F^{\mu\nu}$ [115] where the parameter $\beta$ has values 1 and 1/2,
respectively, for heterotic and type I superstrings. In particular, in the case
β = 1/2, it is possible to find regions where both large-scale magnetic fields
and relic gravitons are copiously produced.
Let us finally discuss the scalar fluctuations of the geometry. The spec-
trum of the scalar modes is determined by the spectrum of the Kalb–Ramond
axion(s). If the axions were neglected, the spectrum of the curvature
fluctuations would be sharply increasing or, as we say in the jargon, the spectrum
would be violet, in full analogy with the spectrum of the tensor modes of the
geometry. This result [106] has been analyzed in light of a recent
controversy (see [97, 98] and references therein).
If the Kalb–Ramond axions are consistently included in the calculation,
it is found that the large-scale spectrum of curvature perturbations becomes
flat [108] and essentially inherits the spectrum of the Kalb–Ramond axions.
If the axions decay (after a phase of coherent oscillations), the curvature
perturbations will be adiabatic as in the case of conventional inflationary
models but with some important quantitative differences [108] since, in this
case, the CMB normalization is explained in terms of the present value of
the string curvature scale and in terms of the primordial slope of the axion
spectrum.

4 Primordial or Not Primordial, This Is the Question...

While diverse theoretical models for the origin of large-scale magnetism can
certainly be questioned on the basis of purely theoretical considerations, di-
rect observations can tell us something more specific concerning the epoch
of formation of large-scale magnetic fields. It would be potentially useful to
give some elements of response to the following burning question: Are
magnetic fields really primordial?
The plan of the present section is the following. In Sect. 4.1 different mean-
ings of the term primordial will be discussed. It will be argued that CMB
physics can be used to constrain large-scale magnetic fields possibly present
prior to matter–radiation equality. In Sect. 4.2 the scalar CMB anisotropies

will be specifically discussed by deriving the appropriate set of evolution
equations accounting for the presence of a fully inhomogeneous magnetic field. In
Sect. 4.3 the evolution of the different species composing the pre-decoupling
plasma will be solved, in the tight-coupling approximation and in the pres-
ence of a fully inhomogeneous magnetic field. Finally Sect. 4.4 contains various
numerical results and a strategy for parameter extraction.

4.1 Pre-equality Magnetic Fields

The term primordial seems to have slightly different meanings depending on
the perspective of the various communities converging on the study of large-
scale magnetic fields. Radio-astronomers have the hope that by scrutinizing
the structure of magnetic fields in distant galaxies it would be possible, in the
future, to understand if the observed magnetic fields are the consequence of a
strong dynamo action or if their existence precedes the formation of galaxies.
If the magnetic field does not flip its sign from one spiral arm to the other,
then a strong dynamo action can be suspected [116]. In the opposite case
the magnetic field of galaxies should be primordial, i.e. present already at
the onset of gravitational collapse. In this context, primordial simply means
protogalactic. An excellent review on the evidence of magnetism in nearby
galaxies can be found in [117]. In Fig. 7 a schematic view of the Milky Way is
presented. The magnetic field follows the spiral arm. There have been claims,
in the literature, of three to five field reversals. The arrows in Fig. 7 indicate
one of the possible field reversals. One reversal is certain beyond any doubt.
Another indication that would support the primordial nature of the magnetic
field of galaxies would be, for instance, the evidence that not only spirals but
also elliptical galaxies are magnetized (even if the magnetic field seems to have
a shorter correlation scale than in the case of spirals). Since elliptical galaxies
have a much less efficient rotation, it seems difficult to postulate a strong
dynamo action. We will not pursue here the path of specific astrophysical
signatures of a truly pregalactic magnetic field and we refer the interested
reader to [116, 117].
As a side remark, it should also be mentioned that magnetic fields may
play a role in the analysis of rotation curves of spiral galaxies. This aspect has
been investigated in great depth by Battaner, Florido and collaborators also
in connection with possible effects of large-scale magnetic fields on structure
formation [119, 120, 121, 122] (see also [123] and references therein).
The large-scale magnetic fields produced via the parametric amplification
of quantum fluctuations discussed earlier in the present lecture may also be
called primordial but, in this case, the term primordial has a much broader
meaning, embracing the whole epoch that precedes the equality between
matter and radiation, which takes place, approximately, at a redshift zeq = 3230 for
h²Ωm0 = 0.134 and h²Ωr0 = 4.15 × 10⁻⁵. Consequently, large-scale magnetic
fields may affect, potentially, CMB anisotropies [18]. Through the years, vari-
ous studies have been devoted to the effect of large-scale magnetic fields on the
vector and tensor CMB anisotropies [124, 125] (see also [126] and references
therein for some recent review articles).
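
The equality redshift quoted above can be recovered with a minimal sketch, assuming only that the matter density dilutes as (1 + z)³ and the radiation density as (1 + z)⁴. The function name is illustrative and not part of the original text.

```python
# Redshift of matter-radiation equality from present-day density parameters.
# Assumption: matter scales as (1+z)^3 and radiation as (1+z)^4, so that
# equality occurs when 1 + z_eq = (h^2 Omega_m0) / (h^2 Omega_r0).

def equality_redshift(h2_omega_m0: float, h2_omega_r0: float) -> float:
    """Return z_eq for the given matter and radiation density parameters."""
    return h2_omega_m0 / h2_omega_r0 - 1.0


# Values quoted in the text.
z_eq = equality_redshift(0.134, 4.15e-5)
print(f"z_eq = {z_eq:.0f}")  # close to the rounded value 3230 quoted above
```

The result agrees with the quoted value once rounded to three significant figures.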
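[Figure: schematic face-on map of the Milky Way, with both axes in kpc (ticks from −10 to 10) and the Galactic Center at the origin. The Sun is located at (x = 0, y = 8 kpc). Labeled spiral arms: Perseus, Sagittarius, Scutum, Norma, Carina and Crux. Spiral parametrization quoted in the figure: x = r cos θ, y = r sin θ, r = r₀ e^{κ(θ − θ₀)}, with r₀ = 2.3 kpc.]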

Fig. 7. Schematic map of the Milky Way. Following [118], the origin
of the two-dimensional coordinate system is at the galactic center. The two large
arrows indicate one of the possible (three or five) field reversals observed so far

The implications of fully inhomogeneous magnetic fields on the scalar
modes of the geometry remain comparatively less explored. By fully inhomogeneous
we mean stochastically distributed fields that do not break the
spatial isotropy of the background [22, 23].
CMB anisotropies are customarily described in terms of a set of carefully
chosen initial conditions for the evolution of the brightness perturbations of
the radiation field. One set of initial conditions corresponds to a purely adi-
abatic mode. There are, however, more complicated situations where, on top
of the adiabatic mode there is also one (or more) nonadiabatic mode(s). A
mode, in the present terminology, simply means a consistent solution of the
governing equations of the metric and plasma fluctuations, i.e. a consistent
solution of the perturbed Einstein equations and of the lower multipoles of
the Boltzmann hierarchy.
The simplest set of initial conditions for CMB anisotropies implies, in a
ΛCDM framework, that a nearly scale-invariant spectrum of adiabatic fluc-
tuations is present after matter–radiation equality (but before decoupling)
for typical wavelengths larger than the Hubble radius at the corresponding
epoch [127].
It became relevant, through the years, to relax the assumption of exact
adiabaticity and to scrutinize the implications of a more general mixture of
adiabatic and nonadiabatic initial conditions (see [128, 129, 130] and refer-
ences therein). In what follows it will be argued, along a similar perspective,
that large-scale magnetic fields slightly modify the adiabatic paradigm so that
their typical strengths may be constrained. To achieve such a goal, the first
step is to solve the evolution equations of magnetized cosmological pertur-
bations well before matter–radiation equality. The second step is to follow
the solution through equality (and up to decoupling). On a more technical
ground, the second step amounts to the calculation of the so-called transfer
matrix [131] whose specific form is one of the subjects of the present analysis.

4.2 Basic Equations

Consider then the system of cosmological perturbations of a flat Friedmann–Robertson–Walker
Universe, characterized by the scale factor a(τ) written in terms of the conformal time τ
(see (46)), and consisting of a mixture of photons, baryons, CDM particles
and massless neutrinos. In the following the basic set of equations used in
order to describe the magnetized curvature perturbations will be introduced
and discussed. The perspective adopted here is closely related to the recent
results obtained in [132, 133] (see also [134, 135] for interesting developments).
In the conformally Newtonian gauge [136, 137, 138, 139, 140], the scalar
fluctuations of the metric tensor Gμν = a²(τ)ημν are parametrized in terms
of the two longitudinal fluctuations, i.e.

$$ \delta G_{00} = 2a^2(\tau)\phi(\tau,\vec{x}), \qquad \delta G_{ij} = 2a^2(\tau)\psi(\tau,\vec{x})\delta_{ij}, \tag{89} $$

where δij is the Kronecker δ. While the spatial curvature will be assumed
to vanish, it is straightforward to extend the present considerations to the
case when the spatial curvature is not negligible.
In spite of the fact that the present discussion will be conducted within
the conformally Newtonian gauge, it can be shown that gauge-invariant de-
scriptions of the problem are possible [133]. Moreover, specific nonadiabatic
modes (like the ones related to the neutrino system) may be more usefully
described in different gauges (like the synchronous gauge). The rationale for
the last statement is that the neutrino isocurvature modes may be singular
in the conformally Newtonian gauge. These issues will not be addressed here
but have been discussed in the existing literature (see, for instance, [139, 140]
and references therein). Furthermore, for the benefit of the interested reader
it is appropriate to mention that the theoretical tools used in the
present and in the following paragraphs follow the conventions of a recent
review [140].

Hamiltonian and Momentum Constraints

The Hamiltonian and momentum constraints, stemming from the (00) and
(0i) components of the perturbed Einstein equations are
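
$$ \nabla^2\psi - 3\mathcal{H}(\mathcal{H}\phi + \psi') = 4\pi G a^2[\delta\rho_t + \delta\rho_B], \tag{90} $$

$$ \nabla^2(\mathcal{H}\phi + \psi') = -4\pi G a^2\left[(p_t + \rho_t)\theta_t + \frac{\vec{\nabla}\cdot(\vec{E}\times\vec{B})}{4\pi a^4}\right], \tag{91} $$

where ℋ = a′/a and the prime denotes a derivative with respect to the
conformal time coordinate τ. In writing (90) and (91) the following set of
conventions has been adopted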

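$$ \delta\rho_t(\tau,\vec{x}) = \delta\rho_\gamma(\tau,\vec{x}) + \delta\rho_\nu(\tau,\vec{x}) + \delta\rho_c(\tau,\vec{x}) + \delta\rho_b(\tau,\vec{x}), \tag{92} $$

$$ \delta\rho_B(\tau,\vec{x}) = \frac{B^2(\vec{x})}{8\pi a^4(\tau)}, \tag{93} $$

$$ (p_t + \rho_t)\theta_t(\tau,\vec{x}) = (p_\gamma + \rho_\gamma)\theta_\gamma(\tau,\vec{x}) + (p_\nu + \rho_\nu)\theta_\nu(\tau,\vec{x}) + (p_c + \rho_c)\theta_c(\tau,\vec{x}) + (p_b + \rho_b)\theta_b(\tau,\vec{x}). \tag{94} $$
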
Concerning (92), (93) and (94) the following comments are in order:
• In (92) the total density fluctuation of the plasma, i.e. δρt (τ, x) receives
contributions from all the species of the plasma.
• In (93) the fluctuation of the magnetic energy density δρB (τ, x) is quadratic
in the magnetic field intensity.
• In (94) θt(τ, x) = ∂_i v_t^i is the divergence of the total peculiar velocity while
θγ (τ, x), θν (τ, x), θc (τ, x) and θb (τ, x) are the divergences of the peculiar
velocities of each individual species, i.e. photons, neutrinos, CDM particles
and baryons.
The second term appearing at the right-hand side of (91) is the divergence
of the Poynting vector. In MHD the Ohmic electric field is subleading and, in
particular, from the MHD expression of the Ohm law we will have

$$ \vec{E}\times\vec{B} \simeq \frac{(\vec{\nabla}\times\vec{B})\times\vec{B}}{4\pi\sigma}. \tag{95} $$

Since the Universe, prior to decoupling, is a very good conductor, the ideal
MHD limit can be safely adopted in the first approximation (see also [130]);
thus for σ → ∞ (i.e. infinite conductivity limit) the contribution of the
Poynting vector vanishes. In any case, even if σ were finite but large,
the second term at the right-hand side of (91) would be suppressed in
comparison with the contribution of the divergence of the total velocity
field.
The total (unperturbed) energy density and pressure of the mixture, i.e.

$$ \rho_t = \rho_\gamma + \rho_\nu + \rho_c + \rho_b + \rho_\Lambda, \qquad p_t = p_\gamma + p_\nu + p_c + p_b + p_\Lambda, \tag{96} $$

determine the evolution of the background geometry according to Friedmann
equations:

$$ \mathcal{H}^2 = \frac{8\pi G}{3} a^2 \rho_t, \tag{97} $$

$$ \mathcal{H}^2 - \mathcal{H}' = 4\pi G a^2(\rho_t + p_t), \tag{98} $$

$$ \rho_t' + 3\mathcal{H}(\rho_t + p_t) = 0. \tag{99} $$

Notice that in (96) the contribution of the cosmological constant has been
included. If the dark energy is parametrized in terms of a cosmological con-
stant (i.e. pΛ = −ρΛ ), then, δρΛ = 0. Furthermore, the contribution of ρΛ to
the background evolution is negligible prior to decoupling. Slightly different
situations (not contemplated by the present analysis) may arise if the dark
energy is parametrized in terms of one or more scalar degrees of freedom with
suitable potentials.

Dynamical Equation and Anisotropic Stress(es)

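The spatial components of the perturbed Einstein equations imply instead

$$ \left[\psi'' + \mathcal{H}(\phi' + 2\psi') + (\mathcal{H}^2 + 2\mathcal{H}')\phi + \frac{1}{2}\nabla^2(\phi - \psi)\right]\delta_i^j - \frac{1}{2}\partial_i\partial^j(\phi - \psi) = 4\pi G a^2\left[(\delta p_t + \delta p_B)\delta_i^j - \Pi_i^j - \tilde{\Pi}_i^j\right]. \tag{100} $$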

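Equation (100) contains, as source terms, not only the total fluctuation of the
pressure of the plasma, i.e. δpt, but also

$$ \delta p_B(\tau,\vec{x}) = \frac{B^2(\vec{x})}{24\pi a^4(\tau)} = \frac{\delta\rho_B(\tau,\vec{x})}{3}, \tag{101} $$

$$ \tilde{\Pi}_i^j(\tau,\vec{x}) = \frac{1}{4\pi a^4}\left[B_i B^j - \frac{1}{3}B^2\delta_i^j\right]. \tag{102} $$
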
Moreover, in (100), Πij(τ, x) is the anisotropic stress of the fluid. As will
be mentioned in a moment (and heavily used later on), the main source of
anisotropic stress of the fluid is provided by neutrinos, which free-stream for
temperatures smaller than the MeV scale. Notice that both the anisotropic
stress of the fluid, i.e. Πij(τ, x), and the magnetic anisotropic stress, i.e.
Π̃ij(τ, x), are, by definition, traceless.
Using this last observation, (100) can be separated into two independent
equations. Taking the trace of (100) we get

$$ \psi'' + \mathcal{H}(\phi' + 2\psi') + (2\mathcal{H}' + \mathcal{H}^2)\phi + \frac{1}{3}\nabla^2(\phi - \psi) = 4\pi G a^2(\delta p_t + \delta p_B). \tag{103} $$

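By taking the difference between (100) and (103), the following (traceless)
relation can be obtained:

$$ \partial_i\partial^j(\phi - \psi) - \frac{1}{3}\delta_i^j\nabla^2(\phi - \psi) = 8\pi G a^2(\Pi_i^j + \tilde{\Pi}_i^j). \tag{104} $$

By applying the differential operator ∂_j ∂^i to both sides of (104), we obtain
the following interesting relation:

$$ \nabla^4(\phi - \psi) = 12\pi G a^2\left[(p_\nu + \rho_\nu)\nabla^2\sigma_\nu + (p_\gamma + \rho_\gamma)\nabla^2\sigma_B\right], \tag{105} $$

where the parametrization

$$ \partial_j\partial^i\Pi_i^j = (p_\nu + \rho_\nu)\nabla^2\sigma_\nu, \qquad \partial_j\partial^i\tilde{\Pi}_i^j = (p_\gamma + \rho_\gamma)\nabla^2\sigma_B, \tag{106} $$
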
has been adopted. In (105) σν(τ, x) is related to the quadrupole moment
of the (perturbed) neutrino phase–space distribution, while σB(τ, x)
parametrizes the (normalized) magnetic anisotropic stress. It is relevant to
remark at this point that in the MHD approximation adopted here the two
main sources of scalar anisotropy associated with magnetic fields can be
parametrized in terms of σB(τ, x) and in terms of the dimensionless ratio

$$ \Omega_B(\tau,\vec{x}) = \frac{\delta\rho_B(\tau,\vec{x})}{\rho_\gamma(\tau)}. \tag{107} $$

Since both ΩB (τ, x) and σB (τ, x) are quadratic in the magnetic field intensity,
a non-Gaussian contribution may be expected. ΩB (τ, x) is the magnetic energy
density referred to the photon energy density, and it is constant to a very good
approximation if magnetic flux is frozen into the plasma element.
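
Since both δρB and ργ scale as a⁻⁴, the ratio ΩB can be estimated from present-day quantities. The sketch below is only an order-of-magnitude illustration: the 1 nG comoving field strength is a hypothetical input, the CMB temperature of 2.725 K is an assumed value, and Gaussian units are used for the magnetic energy density B²/(8π).

```python
import math

# Radiation constant a_rad = 4 sigma_SB / c, in erg cm^-3 K^-4.
RADIATION_CONSTANT = 7.5657e-15
# Present CMB temperature in kelvin (assumed value for this illustration).
T_CMB = 2.725


def omega_b(b_gauss: float, t_cmb: float = T_CMB) -> float:
    """Ratio of magnetic to photon energy density for a comoving field in gauss."""
    rho_b = b_gauss**2 / (8.0 * math.pi)        # erg cm^-3, Gaussian units
    rho_gamma = RADIATION_CONSTANT * t_cmb**4   # erg cm^-3
    return rho_b / rho_gamma


# Hypothetical comoving field of 1 nG.
omega_b_1ng = omega_b(1.0e-9)
print(f"Omega_B(1 nG) = {omega_b_1ng:.2e}")
```

Because ΩB is quadratic in the field intensity, doubling the field strength multiplies the ratio by four.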
There is, in principle, a third contribution to the scalar problem coming
from magnetic fields. Such a contribution arises in the evolution equation of
the photon–baryon peculiar velocity and amounts to the divergence of the
Lorentz force. While the mentioned equation will be derived later in this
section, it is relevant to point out here that the MHD Lorentz force can be
expressed solely in terms of σB (τ, x) and ΩB (τ, x). In fact a well-known vector
identity stipulates that
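
$$ \partial_i B_j\,\partial^j B^i = \vec{\nabla}\cdot[(\vec{\nabla}\times\vec{B})\times\vec{B}] + \frac{1}{2}\nabla^2 B^2. \tag{108} $$

From the definition of σB in terms of Π̃ij, i.e. (106), it is easy to show that

$$ \nabla^2\sigma_B = \frac{3}{16\pi a^4\rho_\gamma}\partial_i B_j\,\partial^j B^i - \frac{1}{2}\nabla^2\Omega_B. \tag{109} $$

Inserting then (108) into (109) and recalling that

$$ 4\pi\vec{\nabla}\cdot[\vec{J}\times\vec{B}] = \vec{\nabla}\cdot[(\vec{\nabla}\times\vec{B})\times\vec{B}], \tag{110} $$

we obtain

$$ \nabla^2\sigma_B = \frac{3}{16\pi a^4\rho_\gamma}\vec{\nabla}\cdot[(\vec{\nabla}\times\vec{B})\times\vec{B}] + \frac{\nabla^2\Omega_B}{4}. \tag{111} $$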
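
The step leading to (111) can be spelled out as follows, using only the definitions δρB = B²/(8πa⁴) and ΩB = δρB/ργ given above and treating ργ as spatially homogeneous (it is a background quantity):

```latex
\nabla^2\sigma_B
  = \frac{3}{16\pi a^4\rho_\gamma}
    \Bigl\{\vec{\nabla}\cdot[(\vec{\nabla}\times\vec{B})\times\vec{B}]
           + \tfrac{1}{2}\nabla^2 B^2\Bigr\}
    - \tfrac{1}{2}\nabla^2\Omega_B ,
\qquad
\frac{3}{32\pi a^4\rho_\gamma}\nabla^2 B^2
  = \frac{3}{4}\nabla^2\!\left(\frac{B^2}{8\pi a^4\rho_\gamma}\right)
  = \frac{3}{4}\nabla^2\Omega_B ,
\qquad
\Rightarrow\quad
\nabla^2\sigma_B
  = \frac{3}{16\pi a^4\rho_\gamma}
    \vec{\nabla}\cdot[(\vec{\nabla}\times\vec{B})\times\vec{B}]
    + \Bigl(\tfrac{3}{4}-\tfrac{1}{2}\Bigr)\nabla^2\Omega_B .
```

The coefficient 3/4 − 1/2 = 1/4 is the one multiplying ∇²ΩB in the final expression for ∇²σB.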

Curvature Perturbations

Two important quantities must now be introduced. The first one, conventionally
denoted by ζ, is the density contrast on uniform curvature hypersurfaces
(see footnote 11), i.e.

$$ \zeta = -\psi - \mathcal{H}\frac{\delta\rho_t + \delta\rho_B}{\rho_t'}. \tag{112} $$

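The definition (112) is invariant under infinitesimal coordinate transformations.
In fact, while δρB is automatically gauge-invariant (since the magnetic
field vanishes at the level of the background), ψ and δρt transform as [140]

$$ \psi \to \tilde{\psi} = \psi + \mathcal{H}\epsilon_0, \qquad \delta\rho_t \to \delta\tilde{\rho}_t = \delta\rho_t - \rho_t'\epsilon_0, \tag{113} $$

for

$$ \tau \to \tilde{\tau} = \tau + \epsilon_0, \qquad x^i \to \tilde{x}^i = x^i + \partial^i\epsilon. \tag{114} $$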

Recalling (99), (112) can also be written as

$$ \zeta = -\psi + \frac{\delta\rho_t + \delta\rho_B}{3(\rho_t + p_t)}. \tag{115} $$

The second variable we want to introduce, conventionally denoted by R, is
the curvature perturbation on comoving orthogonal hypersurfaces (see footnote 12), i.e.

$$ \mathcal{R} = -\psi - \frac{\mathcal{H}(\mathcal{H}\phi + \psi')}{\mathcal{H}^2 - \mathcal{H}'}. \tag{116} $$

Inserting (115) and (116) into (90), the Hamiltonian constraint takes then the
form

$$ \zeta = \mathcal{R} + \frac{\nabla^2\psi}{12\pi G a^2(p_t + \rho_t)}. \tag{117} $$

Equation (117) is rather interesting in its own right and it tells that, in the
long wavelength limit,

$$ \zeta \simeq \mathcal{R} + \mathcal{O}(k^2\tau^2). \tag{118} $$

When the relevant wavelengths are larger than the Hubble radius (i.e. kτ ≪ 1),
the density contrast on uniform curvature hypersurfaces and the curvature
fluctuations on comoving orthogonal hypersurfaces coincide. Since the ordinary
Sachs–Wolfe contribution to the gauge-invariant temperature fluctuation
is dominated by wavelengths that are larger than the Hubble radius after
matter–radiation equality (but before radiation decoupling), the calculation of
ζ (or R), in the long-wavelength limit, will essentially give us the Sachs–Wolfe
(SW) plateau.

Footnote 11: Since, as it will be discussed, ζ is gauge-invariant, we can also interpret it as the
curvature fluctuation on uniform density hypersurfaces, i.e. the fluctuation of the
scalar curvature on the hypersurface where the total density is uniform.

Footnote 12: It is clear, from the definition (116), that the second term at the right-hand side
is proportional, by the momentum constraint (91), to the total peculiar velocity
of the plasma which is vanishing on comoving (orthogonal) hypersurfaces.
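
As an illustration of (117) and (118), consider a radiation-dominated background, an assumption made here only for concreteness. In that case a(τ) ∝ τ, so that ℋ = 1/τ and ℋ² − ℋ′ = 2/τ²; the background equation (98) then fixes 4πGa²(ρt + pt) = 2/τ². For a single Fourier mode, where ∇² → −k², the Hamiltonian constraint (117) becomes:

```latex
12\pi G a^2 (p_t+\rho_t) = 3\,(\mathcal{H}^2-\mathcal{H}') = \frac{6}{\tau^2},
\qquad
\zeta_k = \mathcal{R}_k - \frac{k^2\,\psi_k}{12\pi G a^2 (p_t+\rho_t)}
        = \mathcal{R}_k - \frac{k^2\tau^2}{6}\,\psi_k .
```

The difference between ζ and R is therefore explicitly of order k²τ² and drops out for kτ ≪ 1.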
A remark on the definition given in (112) is in order. The variable ζ must
contain the total fluctuation of the energy density. This is crucial since the
Hamiltonian constraint is sensitive to the total fluctuation of the energy den-
sity. If the magnetic energy density δρB is correctly included in the definition
of ζ, then the Hamiltonian constraint (117) maintains its canonical form.
Equations (117) and (118) can be used to derive the appropriate transfer
matrices, allowing, in turn, the estimate of the Sachs–Wolfe plateau. For this
purpose it is important to deduce the evolution equation for ζ. The evolution of
ζ can be obtained from the evolution equation of the total density fluctuation
which reads, in the conformally Newtonian gauge,

$$ \delta\rho_t' - 3\psi'(p_t + \rho_t) + (p_t + \rho_t)\theta_t + 3\mathcal{H}(\delta p_t + \delta\rho_t) + 3\mathcal{H}\delta p_{\rm nad} = \frac{\vec{E}\cdot\vec{J}}{a^4}. \tag{119} $$

The technique is now rather simple. We can extract δρt from (115)

$$ \delta\rho_t = 3(\rho_t + p_t)(\zeta + \psi) - \delta\rho_B. \tag{120} $$

Inserting (120) into (119) we obtain the desired evolution equation for ζ. Before
doing that it is practical to discuss the case when the relativistic fluid receives
contributions from different species that are simultaneously present. In the
realistic case, considering that the cosmological constant does not fluctuate,
we will have four different species.
For deriving the evolution equation of ζ, it is practical (and, to some
extent, conventional) to separate the pressure fluctuation into an adiabatic
component supplemented by a nonadiabatic contribution:

$$ \delta p_t = \left(\frac{\delta p_t}{\delta\rho_t}\right)_{\varsigma}\delta\rho_t + \left(\frac{\delta p_t}{\delta\varsigma}\right)_{\rho_t}\delta\varsigma. \tag{121} $$

In a relativistic description of gravitational fluctuations, the pressure fluctuates
both because the energy density fluctuates (first term at the right-hand
side of (121)) and because the specific entropy of the plasma, i.e. ς, fluctuates
(second term at the right-hand side of (121)). The subscripts appearing
in the two terms at the right-hand side of (121) simply mean that the two
different variations must be taken, respectively, at constant ς (i.e. δς = 0) and
at constant ρt (i.e. δρt = 0).
Here is an example of the usefulness of this decomposition. Consider, for
instance, a mixture of CDM particles and radiation. In this case the coefficient
of the first term at the right-hand side of (121) can be written as

$$ \left(\frac{\delta p_t}{\delta\rho_t}\right)_{\varsigma} = \frac{1}{3}\left(\frac{\delta\rho_r}{\delta\rho_c + \delta\rho_r}\right)_{\varsigma}, \tag{122} $$

where we simply used the fact that δpr = δρr /3 and that δρt = δρr + δρc .
Now, the quantity appearing in (122) must be evaluated at constant ς, i.e.
for δς = 0. The specific entropy, in the CDM-radiation system, is given by
ς = T³/nc where T is the temperature and nc is the CDM concentration. The
relative fluctuations of the specific entropy can then be defined and they are

$$ S = \frac{\delta\varsigma}{\varsigma} = \frac{3}{4}\frac{\delta\rho_r}{\rho_r} - \frac{\delta\rho_c}{\rho_c}, \tag{123} $$

where it has been used that ρr ∝ T⁴ and that ρc ≃ m nc (m is here the
typical mass of the CDM particle). Requiring now that S = 0 we get
δρc = (3/4)(ρc/ρr)δρr. Thus, inserting δρc into (122), the following relation
can be easily obtained:

$$ \left(\frac{\delta p_t}{\delta\rho_t}\right)_{\varsigma} = \frac{4\rho_r}{3(3\rho_c + 4\rho_r)} \equiv \frac{p_t'}{\rho_t'} = c_s^2. \tag{124} $$

The second and third equalities in (124) follow from the definition of the
total sound speed for the CDM-radiation system. This occurrence is general
and it is not a peculiarity of the CDM-radiation system so that we can write,
for an arbitrary mixture of relativistic fluids:

$$ \left(\frac{\delta p_t}{\delta\rho_t}\right)_{\varsigma} = \frac{p_t'}{\rho_t'} = c_s^2. \tag{125} $$
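
The equality between the closed form of the sound speed and the ratio pt′/ρt′ can be checked numerically for the CDM-radiation mixture. The sketch below assumes the continuity equations ρr′ = −4ℋρr and ρc′ = −3ℋρc together with pr = ρr/3 and pc = 0; the function names and the sample densities are illustrative.

```python
def sound_speed_squared(rho_c: float, rho_r: float) -> float:
    """Closed form c_s^2 = 4 rho_r / [3 (3 rho_c + 4 rho_r)]."""
    return 4.0 * rho_r / (3.0 * (3.0 * rho_c + 4.0 * rho_r))


def sound_speed_from_derivatives(rho_c: float, rho_r: float,
                                 hubble: float = 1.0) -> float:
    """c_s^2 = p_t' / rho_t' from the continuity equations of each species."""
    p_t_prime = -4.0 * hubble * rho_r / 3.0                    # p_r' = rho_r' / 3, p_c = 0
    rho_t_prime = -4.0 * hubble * rho_r - 3.0 * hubble * rho_c  # rho_r' + rho_c'
    return p_t_prime / rho_t_prime


# Pure radiation gives 1/3; at equality (rho_c = rho_r) the value drops to 4/21.
cs2_radiation = sound_speed_squared(0.0, 1.0)
cs2_equality = sound_speed_squared(1.0, 1.0)
print(cs2_radiation, cs2_equality)
```

The conformal Hubble rate cancels in the ratio, so its numerical value is irrelevant for the check.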
The definition of relative entropy fluctuation proposed in (123) is invariant
under infinitesimal gauge transformations [140] and it can be generalized by
introducing two interesting variables, namely,

$$ \zeta_r = -\psi - \mathcal{H}\frac{\delta\rho_r}{\rho_r'} \quad\text{and}\quad \zeta_c = -\psi - \mathcal{H}\frac{\delta\rho_c}{\rho_c'}. \tag{126} $$

Using the continuity equations for the CDM and for radiation, i.e. ρr′ = −4ℋρr
and ρc′ = −3ℋρc, (126) can be also written as

$$ \zeta_r = -\psi + \frac{\delta_r}{4}, \qquad \zeta_c = -\psi + \frac{\delta_c}{3}, \tag{127} $$

where δr = δρr/ρr and δc = δρc/ρc. Thus, using (127), the relative fluctuation
in the specific entropy introduced in (123) can also be written as

$$ S = -3(\zeta_c - \zeta_r). \tag{128} $$

It is a simple exercise to verify that (123) and (128) have indeed the same
physical content.
Up to now the coefficient of the first term at the right-hand side of (121)
has been computed. Let us now discuss the second term appearing at the right-
hand side of (121). Conventionally, the whole second term is often denoted by
δpnad , i.e. nonadiabatic pressure variation. From (123) defining the relative
fluctuation in the specific entropy, i.e. S = δς/ς, the following equation can
be written:
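
$$ \delta p_{\rm nad} = \left(\frac{\delta p_t}{\delta\varsigma}\right)_{\rho_t}\delta\varsigma \equiv \left(\frac{\delta p_t}{S}\right)_{\rho_t} S. \tag{129} $$

Now, S must be evaluated, inside the round bracket, for δρt = 0. The result
will be

$$ \left(\frac{\delta p_t}{S}\right)_{\rho_t} = \frac{4}{3}\frac{\rho_c\rho_r}{3\rho_c + 4\rho_r}. \tag{130} $$

Recalling the definition of sound speed and inserting (130) into (129), we get

$$ \delta p_{\rm nad} = c_s^2\rho_c S. \tag{131} $$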
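
The two-fluid decomposition can be checked numerically: for arbitrary density fluctuations of radiation and CDM, the adiabatic piece c_s²δρt plus the nonadiabatic piece c_s²ρcS should reproduce the total pressure fluctuation δpt = δρr/3. The sketch below is illustrative; the function names and the sample numbers are not taken from the text.

```python
def entropy_fluctuation(d_rho_r: float, d_rho_c: float,
                        rho_r: float, rho_c: float) -> float:
    """Relative entropy fluctuation S = (3/4) d_rho_r/rho_r - d_rho_c/rho_c."""
    return 0.75 * d_rho_r / rho_r - d_rho_c / rho_c


def pressure_split(d_rho_r: float, d_rho_c: float,
                   rho_r: float, rho_c: float) -> tuple[float, float]:
    """Return the (adiabatic, nonadiabatic) parts of the pressure fluctuation."""
    cs2 = 4.0 * rho_r / (3.0 * (3.0 * rho_c + 4.0 * rho_r))
    s = entropy_fluctuation(d_rho_r, d_rho_c, rho_r, rho_c)
    adiabatic = cs2 * (d_rho_r + d_rho_c)   # c_s^2 * delta rho_t
    nonadiabatic = cs2 * rho_c * s          # c_s^2 * rho_c * S
    return adiabatic, nonadiabatic


# Arbitrary sample values (hypothetical, any units).
adiabatic, nonadiabatic = pressure_split(0.02, -0.05, 1.3, 0.4)
total = adiabatic + nonadiabatic
print(total, 0.02 / 3.0)  # the two numbers should agree
```

When the fluctuations satisfy δρc = (3/4)(ρc/ρr)δρr the entropy fluctuation vanishes and only the adiabatic piece survives.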

If the mixture of fluids is more complicated, the discussion presented so
far can be easily generalized. If more than two fluids are present, we can still
separate, formally, the pressure fluctuation as

$$ \delta p_t = c_s^2\delta\rho_t + \delta p_{\rm nad}. \tag{132} $$

However, if more than two fluids are present, the nonadiabatic pressure
fluctuation has a more complicated form that reduces to the one previously
computed in the case of two fluids:

$$ \delta p_{\rm nad} = \frac{1}{6\mathcal{H}\rho_t'}\sum_{ij}\rho_i'\rho_j'\,(c_{s\,i}^2 - c_{s\,j}^2)\,S_{ij}, \qquad S_{ij} = -3(\zeta_i - \zeta_j), \qquad c_{s\,i}^2 = \frac{p_i'}{\rho_i'}, \tag{133} $$

where Sij are the relative fluctuations in the entropy density that can be
computed in terms of the density contrasts of the individual fluids. The indices
i and j run over all the components of the plasma. Assuming a plasma formed
by photons, neutrinos, baryons and CDM particles, we will have that various
entropy fluctuations are possible. For instance

$$ S_{\gamma c} = -3(\zeta_\gamma - \zeta_c), \qquad S_{\gamma\nu} = -3(\zeta_\gamma - \zeta_\nu), \qquad \ldots \tag{134} $$

where the ellipses stand for all the other possible combinations. From the
definition of relative entropy fluctuations it appears that Sγν = −Sνγ. Finally,
with obvious notations, while c²s denotes the total sound speed, c²s i and c²s j
denote the sound speeds of a generic pair of fluids contributing Sij to δpnad,
i.e.

$$ c_s^2 = \frac{p_t'}{\rho_t'}, \qquad c_{s\,i}^2 = \frac{p_i'}{\rho_i'}, \qquad c_{s\,j}^2 = \frac{p_j'}{\rho_j'}. \tag{135} $$
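
The reduction of the multi-fluid expression (133) to the two-fluid result (131) can be verified numerically. The sketch below rests on two assumptions that are not spelled out in the text: each fluid has a constant barotropic index w = p/ρ, so that its sound speed squared equals w and ρ′ = −3ℋ(1 + w)ρ, and the double sum runs over all ordered pairs (i, j). All names and sample values are illustrative.

```python
def delta_p_nad(rho: dict, w: dict, zeta: dict, hubble: float = 1.0) -> float:
    """Multi-fluid nonadiabatic pressure fluctuation, summed over ordered pairs.

    Assumes constant w_i = p_i / rho_i for every fluid, so that
    rho_i' = -3 H (1 + w_i) rho_i and c_{s i}^2 = w_i.
    """
    rho_prime = {i: -3.0 * hubble * (1.0 + w[i]) * rho[i] for i in rho}
    rho_t_prime = sum(rho_prime.values())
    total = 0.0
    for i in rho:
        for j in rho:
            s_ij = -3.0 * (zeta[i] - zeta[j])
            total += rho_prime[i] * rho_prime[j] * (w[i] - w[j]) * s_ij
    return total / (6.0 * hubble * rho_t_prime)


# Two-fluid test case: radiation (w = 1/3) and CDM (w = 0), sample values.
rho = {"r": 1.0, "c": 2.0}
w = {"r": 1.0 / 3.0, "c": 0.0}
zeta = {"r": 0.1, "c": 0.4}

general = delta_p_nad(rho, w, zeta)

# Two-fluid closed form: delta p_nad = c_s^2 rho_c S with S = -3 (zeta_c - zeta_r).
cs2 = 4.0 * rho["r"] / (3.0 * (3.0 * rho["c"] + 4.0 * rho["r"]))
two_fluid = cs2 * rho["c"] * (-3.0 * (zeta["c"] - zeta["r"]))
print(general, two_fluid)
```

With these inputs both expressions give the same number, and the conformal Hubble rate drops out of the result.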
In light of (134), also the physical interpretation of (132) becomes more clear.
The contribution of δpnad arises because of the inherent multiplicity of fluids
present in the plasma. Thanks to (132), using (120) in (119), we can obtain
the evolution equation for ζ which becomes

$$ \zeta' = -\frac{\mathcal{H}}{p_t + \rho_t}\delta p_{\rm nad} + \frac{\mathcal{H}}{p_t + \rho_t}\left(c_s^2 - \frac{1}{3}\right)\delta\rho_B - \frac{\theta_t}{3}. \tag{136} $$

The evolution equation for R can also be directly obtained by taking the
first time derivative of (117), i.e.

$$ \zeta' = \mathcal{R}' + \frac{\nabla^2\psi'}{12\pi G a^2(p_t + \rho_t)} + \frac{\mathcal{H}(3c_s^2 + 1)\nabla^2\psi}{12\pi G a^2(p_t + \rho_t)}. \tag{137} $$

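By now inserting (137) into (136) and by using the momentum constraint of
(91) to eliminate θt we get the following expression:

$$ \mathcal{R}' = -\frac{\mathcal{H}}{p_t + \rho_t}\delta p_{\rm nad} + \frac{\mathcal{H}}{p_t + \rho_t}\left(c_s^2 - \frac{1}{3}\right)\delta\rho_B - \frac{\mathcal{H}c_s^2\nabla^2\psi}{4\pi G a^2(p_t + \rho_t)} + \frac{\mathcal{H}\nabla^2(\phi - \psi)}{12\pi G a^2(p_t + \rho_t)}. \tag{138} $$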

It could be finally remarked that (138) can be directly derived from (103).
For this purpose, the definition (116) can be differentiated once with respect to τ.
The obtained result, once inserted back into (103), reproduces (138).

4.3 Evolution of Different Species

Up to now the global variables defining the evolution of the system have been
discussed in a unified perspective. The evolution of the global variables is
determined by the evolution of the density contrasts and peculiar velocities of
the different species. Consequently, in the following paragraphs, the evolution
of the different species will be addressed.

Photons and Baryons

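The evolution equations of the lowest multipoles of the photon–baryon system
amount, in principle, to the following two sets of equations:

$$ \delta_b' = 3\psi' - \theta_b, \tag{139} $$

$$ \theta_b' + \mathcal{H}\theta_b = -\nabla^2\phi + \frac{\vec{\nabla}\cdot[\vec{J}\times\vec{B}]}{a^4\rho_b} + \frac{4}{3}\frac{\rho_\gamma}{\rho_b}\,a n_e x_e\sigma_T(\theta_\gamma - \theta_b), \tag{140} $$

and

$$ \delta_\gamma' = 4\psi' - \frac{4}{3}\theta_\gamma, \tag{141} $$

$$ \theta_\gamma' + \frac{\nabla^2\delta_\gamma}{4} + \nabla^2\phi = a n_e x_e\sigma_T(\theta_b - \theta_\gamma). \tag{142} $$
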
Equation (140) contains, as a source term, the divergence of the Lorentz force
that can be expressed in terms of σB(τ, x) and ΩB(τ, x), as already pointed
out in (111).
At early times photons and baryons are tightly coupled by Thomson
scattering, as it is clear from (140) and (142) where σT denotes the Thomson
cross section and ne xe the concentration of ionized electrons. To cast light on
the physical nature of the tight-coupling approximation, let us subtract (140)
from (142). The result will be

$$ (\theta_\gamma' - \theta_b') + a n_e x_e\sigma_T\left(1 + \frac{4}{3}\frac{\rho_\gamma}{\rho_b}\right)(\theta_\gamma - \theta_b) = -\frac{\nabla^2\delta_\gamma}{4} + \mathcal{H}\theta_b - \frac{\vec{\nabla}\cdot[\vec{J}\times\vec{B}]}{a^4\rho_b}. \tag{143} $$

From (143) it is clear that any deviation of (θγ − θb ) swiftly decays away.
In fact, from (143), the characteristic time for the synchronization of the
baryon and photon velocities is of the order of (xe ne σT )−1 which is small
compared with the expansion time. In the limit σT → ∞ the tight coupling is
exact and the photon–baryon velocity field is a unique physical entity which
will be denoted by θγb . From the structure of (143), the contribution of the
magnetic fields in the MHD limit only enters through the Lorentz force, while
the damping term is always provided by Thomson scattering.
To derive the evolution equations for the photon–baryon system in the
tight-coupling approximation, we can add (142) to (140) multiplied by
Rb = 3ρb/(4ργ), so that the scattering terms cancel, taking into account
that θb ≃ θγ = θγb. Of course, also the evolution equations of the density
contrasts will depend upon θγb. Consequently, the full set of tightly coupled
evolution equations for the photon–baryon fluid can be written as

$$ \delta_\gamma' = 4\psi' - \frac{4}{3}\theta_{\gamma b}, \tag{144} $$

$$ \delta_b' = 3\psi' - \theta_{\gamma b}, \tag{145} $$

$$ \theta_{\gamma b}' + \frac{\mathcal{H}R_b}{1 + R_b}\theta_{\gamma b} + \frac{\nabla^2\delta_\gamma}{4(1 + R_b)} + \nabla^2\phi = \frac{3}{4}\frac{\vec{\nabla}\cdot[\vec{J}\times\vec{B}]}{a^4\rho_\gamma(1 + R_b)}, \tag{146} $$

where

$$ R_b(\tau) = \frac{3}{4}\frac{\rho_b(\tau)}{\rho_\gamma(\tau)} = \left(\frac{698}{z + 1}\right)\left(\frac{h^2\Omega_b}{0.023}\right). \tag{147} $$

The set of equations (144), (145) and (146) has to be used in order to obtain
the correct initial conditions to be imposed on the evolution for the integration
of the brightness perturbations.
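
To get a feeling for the size of Rb, the redshift dependence given in (147) can be evaluated directly. In the sketch below the function name is illustrative, and the redshifts used in the example (z = 1089 for decoupling and z = 10⁶ for an epoch deep in the radiation era) are assumed values chosen only for illustration.

```python
def baryon_to_photon_ratio(z: float, h2_omega_b: float = 0.023) -> float:
    """R_b = (3/4) rho_b / rho_gamma as a function of redshift."""
    return 698.0 / (z + 1.0) * (h2_omega_b / 0.023)


# Assumed illustrative redshifts: around decoupling and deep in the radiation era.
r_b_decoupling = baryon_to_photon_ratio(1089.0)
r_b_early = baryon_to_photon_ratio(1.0e6)
print(f"R_b(z=1089) = {r_b_decoupling:.3f}, R_b(z=1e6) = {r_b_early:.1e}")
```

The early-time value is much smaller than one, in line with the approximation Rb ≪ 1 used when the initial conditions are set well before matter–radiation equality.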
If we assume, effectively, that σT → ∞ we are working to lowest order
in the tight-coupling approximation. This means that the CMB is effectively
isotropic in the baryon rest frame. To discuss CMB polarization in the presence
of magnetic fields, one has to go to higher order in the tight-coupling
expansion. However, as far as the problem of initial conditions is concerned,
the lowest order treatment suffices, as it will be apparent from the subsequent
discussion.

Neutrinos

After neutrino decoupling the (perturbed) neutrino phase–space distribution
evolves according to the collisionless Boltzmann equation. This occurrence implies
that, to have a closed system of equations describing the initial conditions,
it is mandatory to improve the fluid description by adding, to the evolution
of the monopole (i.e. the neutrino density contrast) and of the dipole (i.e.
the neutrino peculiar velocity), also the evolution of the quadrupole, i.e. the quantity
denoted by σν and appearing in the expression of the anisotropic stress of the
fluid (see (105) and (106)).
The derivation of the various multipoles of the perturbed neutrino phase–
space distribution is a straightforward (even if a bit lengthy) calculation and
it has been performed, for the set of conventions employed in the present
lecture, in [140]. The result is, in Fourier space,
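
$$ \delta_\nu' = 4\psi' - \frac{4}{3}\theta_\nu, \tag{148} $$

$$ \theta_\nu' = \frac{k^2}{4}\delta_\nu + k^2\phi - k^2\sigma_\nu, \tag{149} $$

$$ \sigma_\nu' = \frac{4}{15}\theta_\nu - \frac{3}{10}k\mathcal{F}_{\nu 3}. \tag{150} $$
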
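In (150) Fν3 is the octupole of the (perturbed) neutrino phase–space distribution.
The precise relation of the multipole moments of Fν with the density
contrast and the other plasma quantities is as follows:

$$ \delta_\nu = \mathcal{F}_{\nu 0}, \qquad \theta_\nu = \frac{3}{4}k\mathcal{F}_{\nu 1}, \qquad \sigma_\nu = \frac{\mathcal{F}_{\nu 2}}{2}. \tag{151} $$

For multipoles larger than the quadrupole, i.e. ℓ > 2, the Boltzmann hierarchy
reads

$$ \mathcal{F}_{\nu\ell}' = \frac{k}{2\ell + 1}\left[\ell\,\mathcal{F}_{\nu(\ell-1)} - (\ell + 1)\,\mathcal{F}_{\nu(\ell+1)}\right]. \tag{152} $$
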
In principle, to give initial conditions, we should specify, at a given time after
neutrino decoupling, the values of all the multipoles of the neutrino phase–
space distribution. In practice, if the initial conditions are set deep in the
radiation epoch, the relevant variables only extend, for the purpose of the
initial conditions, up to the octupole. Specific examples will be given in a
moment.
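
The structure of the hierarchy for ℓ > 2 can be made concrete with a short sketch. The hard truncation used below, where the first multipole beyond the stored ones is set to zero, is an assumption introduced here for illustration and is not the prescription of the text; the function name is likewise hypothetical.

```python
def hierarchy_rhs(multipoles: list[float], k: float) -> dict[int, float]:
    """Right-hand side of the neutrino hierarchy for 3 <= l <= l_max.

    multipoles[l] holds F_nu_l for l = 0 .. l_max. The multipole l_max + 1
    is set to zero (a crude truncation, assumed here only for illustration).
    """
    l_max = len(multipoles) - 1
    rhs = {}
    for ell in range(3, l_max + 1):
        upper = multipoles[ell + 1] if ell + 1 <= l_max else 0.0
        rhs[ell] = k / (2 * ell + 1) * (ell * multipoles[ell - 1] - (ell + 1) * upper)
    return rhs


# Only the quadrupole is populated: it feeds the octupole at the next instant.
rhs_from_quadrupole = hierarchy_rhs([0.0, 0.0, 1.0, 0.0, 0.0], k=1.0)
print(rhs_from_quadrupole)
```

Each multipole is sourced by its lower neighbour and drained by its upper one, which is why power initially stored in the low multipoles propagates up the hierarchy only gradually when kτ is small.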
CDM Component

The CDM component is, in some sense, the easiest. In the standard case the
evolution equations contain neither the magnetic field contribution
nor the anisotropic stress. The evolution of the density contrast and of the
peculiar velocity is simply given, in Fourier space, by the following pair of
equations:

$$ \delta_c' = 3\psi' - \theta_c, \tag{153} $$

$$ \theta_c' + \mathcal{H}\theta_c = k^2\phi. \tag{154} $$

Magnetized Adiabatic and Nonadiabatic Modes

The evolution equations of the fluid and metric variables will now be solved
deep in the radiation-dominated epoch and for wavelengths much larger than
the Hubble radius, i.e. |kτ| ≪ 1. In the present lecture only the magnetized
adiabatic mode will be discussed. However, the treatment can be usefully
extended to the other nonadiabatic modes. For this purpose we refer the in-
terested reader to [132] (see also [139]). Moreover, since this lecture has been
conducted within the conformally Newtonian gauge, there is no reason to
change. However, it should be noticed that fully gauge-invariant approaches
are possible [133]. To give the flavor of the possible simplifications obtain-
able in a gauge-invariant framework, we can just use gauge-invariant concepts
to classify more precisely the adiabatic and nonadiabatic modes. For this
purpose, in agreement with (126), let us define the gauge-invariant density
contrasts on uniform curvature hypersurfaces for the different species of the
pre-decoupling plasma:
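
$$ \zeta_\gamma = -\psi + \frac{\delta_\gamma}{4}, \qquad \zeta_\nu = -\psi + \frac{\delta_\nu}{4}, \tag{155} $$

$$ \zeta_c = -\psi + \frac{\delta_c}{3}, \qquad \zeta_b = -\psi + \frac{\delta_b}{3}. \tag{156} $$

In terms of the variables of (155) and (156) the evolution equations for the
density contrasts, i.e. (144), (145), (148) and (153), acquire a rather symmetric
form:

$$ \zeta_\gamma' = -\frac{\theta_{\gamma b}}{3}, \qquad \zeta_\nu' = -\frac{\theta_\nu}{3}, \tag{157} $$

$$ \zeta_c' = -\frac{\theta_c}{3}, \qquad \zeta_b' = -\frac{\theta_{\gamma b}}{3}. \tag{158} $$
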
From (157) and (158) we can easily deduce a rather important property of fluid
mixtures: in the long-wavelength limit the relative fluctuations in the specific
entropy are conserved. Consider, for instance, the CDM-radiation mode. In
this case the nonvanishing entropy fluctuations are

$$ S_{\gamma c} = -3(\zeta_\gamma - \zeta_c), \qquad S_{\nu c} = -3(\zeta_\nu - \zeta_c). \tag{159} $$

Using (157) and (158) the evolution equations for Sγc and Sνc can be readily
obtained and they are

$$ S_{\gamma c}' = \theta_{\gamma b} - \theta_c, \qquad S_{\nu c}' = \theta_\nu - \theta_c. \tag{160} $$

Outside the horizon the divergence of the peculiar velocities is O(|kτ |2 ), so the
fluctuations in the specific entropy are approximately constant in this limit.
This conclusion implies that if the fluctuations in the specific entropy are zero,
they will still vanish at later times. Such a conclusion can be evaded if the
fluids of the mixture have a relevant energy–momentum exchange or if bulk
viscous stresses are present [143, 144].
A mode is therefore said to be adiabatic iff ζγ = ζν = ζc = ζb . Denoting
by ζi and ζj two generic gauge-invariant density contrasts of the fluids of the
mixture, we say that the initial conditions are nonadiabatic if we
can find at least one pair of fluids for which ζi ≠ ζj.
As an example, let us work out the specific form of the magnetized adia-
batic mode. Let us consider the situation where the Universe is dominated by
radiation after weak interactions have fallen out of thermal equilibrium but
before matter–radiation equality. This is the period of time where the initial
conditions of CMB anisotropies are usually set both in the presence and in
the absence of a magnetized contribution. Since the scale factor goes, in conformal
time, as a(τ) ∝ τ and ℋ ≃ τ⁻¹, (90) can be solved for |kτ| ≪ 1. The
density contrasts can then be determined, in Fourier space, to lowest order in
kτ as

$$ \delta_\gamma = \delta_\nu = -2\phi_i - R_\gamma\Omega_B, \qquad \delta_b = \delta_c = -\frac{3}{2}\phi_i - \frac{3}{4}R_\gamma\Omega_B, \tag{161} $$

where the fractional contribution of photons to the radiation plasma, i.e.
Rγ, has been introduced; it is related to Rν, i.e. the fractional contribution
of massless neutrinos, as

$$ R_\gamma = 1 - R_\nu, \qquad R_\nu = \frac{r}{1 + r}, \qquad r = \frac{7}{8}N_\nu\left(\frac{4}{11}\right)^{4/3} \equiv 0.681\left(\frac{N_\nu}{3}\right). \tag{162} $$

In (161) φi(k) denotes the initial value of the metric fluctuation in Fourier
space. It is useful to remark that we have treated neutrinos as part of the
radiation background. If neutrinos have a mass in the meV range, they are
nonrelativistic today, but they will be counted as radiation prior to matter–
radiation equality. Concerning (161) the last remark is that, of course, we just
kept the lowest order in |kτ | < 1. It is possible, however, to write the solution
to arbitrary order in |kτ | as explicitly shown in [139].

Let us then write (105) in Fourier space and let us take into account that
the background is dominated by radiation. The neutrino quadrupole is then
determined to be
\[ \sigma_\nu = -\frac{R_\gamma}{R_\nu}\sigma_B + \frac{k^2\tau^2}{6R_\nu}(\psi_i - \phi_i), \tag{75} \]
where ψi (k) is the initial (Fourier space) value of the metric fluctuation defined
in (89).
Let us then look for the evolution of the divergences of the peculiar veloc-
ities of the different species. Let us therefore write (146), (149) and (153) in
Fourier space. By direct integration, the following result can be obtained:
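\[ \theta_{\gamma b} = \frac{k^2\tau}{4}\left[2\phi_i + R_\nu\Omega_B - 4\sigma_B\right], \tag{76} \]
\[ \theta_\nu = \frac{k^2\tau}{2}\left[\phi_i - \frac{R_\gamma\Omega_B}{2}\right] + k^2\tau\frac{R_\gamma}{R_\nu}\sigma_B, \tag{77} \]
\[ \theta_c = \frac{k^2\tau}{2}\phi_i. \tag{78} \]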
As a consistency check of the solution, (164), (165) and (166) can be inserted
into (91). Let us therefore write (91) in Fourier space

\[ k^2 H \phi_i = 4\pi G a^2 \left[\frac{4}{3}\rho_\gamma(1 + R_b)\theta_{\gamma b} + \frac{4}{3}\rho_\nu\theta_\nu + \rho_c\theta_c\right], \tag{79} \]
where we used that ψ′i = 0 and we also used the tight-coupling approximation,
since θγ = θb = θγb. Notice that in (91) the term arising from the Poynting
vector has been neglected. This approximation is rather sound within the
present MHD treatment. In (167) Rb ≪ 1 (see (147) for the definition of Rb),
since we are well before matter–radiation equality. The same observation can
be made for the CDM contribution which is negligible in comparison with
the radiative contribution provided by photons and neutrinos. Taking into
account these two observations, we can rewrite (167) as

\[ k^2 H \phi_i = 2H^2 (R_\gamma\theta_{\gamma b} + R_\nu\theta_\nu), \tag{80} \]

where (97) and (98) have been used. Inserting then (164) and (165) into (168),
it can be readily obtained that the left-hand side exactly equals the right-hand
side, so that the momentum constraint is enforced.
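The consistency check described above can also be carried out numerically. The sketch below is only an illustration: it evaluates the lowest-order velocity fields of the magnetized adiabatic mode and the residual of the rewritten momentum constraint, taking H = 1/τ for the radiation-dominated background. The function names and the sample parameter values are assumptions made for the example.

```python
def velocities(k, tau, phi_i, omega_b, sigma_b, r_nu):
    """Lowest-order theta_gamma_b and theta_nu of the magnetized adiabatic mode."""
    r_gamma = 1.0 - r_nu
    theta_gb = 0.25 * k**2 * tau * (2.0 * phi_i + r_nu * omega_b - 4.0 * sigma_b)
    theta_nu = (0.5 * k**2 * tau * (phi_i - 0.5 * r_gamma * omega_b)
                + k**2 * tau * (r_gamma / r_nu) * sigma_b)
    return theta_gb, theta_nu


def momentum_constraint_residual(k, tau, phi_i, omega_b, sigma_b, r_nu):
    """LHS minus RHS of k^2 H phi_i = 2 H^2 (R_gamma theta_gb + R_nu theta_nu)."""
    hub = 1.0 / tau  # conformal Hubble rate during radiation domination
    theta_gb, theta_nu = velocities(k, tau, phi_i, omega_b, sigma_b, r_nu)
    lhs = k**2 * hub * phi_i
    rhs = 2.0 * hub**2 * ((1.0 - r_nu) * theta_gb + r_nu * theta_nu)
    return lhs - rhs


# Illustrative (hypothetical) values; any choice should give a vanishing residual.
res = momentum_constraint_residual(k=0.01, tau=1.0, phi_i=1.0,
                                   omega_b=1.0e-3, sigma_b=3.0e-4, r_nu=0.405)
print(f"residual = {res:.2e}")
```

The magnetic terms cancel between the photon-baryon and neutrino velocities, so the residual vanishes up to floating-point rounding for any ΩB and σB.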
The ﬁnal equation to be solved is the one describing the evolution of the
anisotropic stress, i.e. (150). Inserting (150) and (165) into (62), we do get
an interesting constraint on the initial conditions on the two longitudinal
ﬂuctuations of the geometry introduced in (89), namely:

\[ \psi_i = \phi_i\left(1 + \frac{2}{5}R_\nu\right) + \frac{R_\gamma}{5}(4\sigma_B - R_\nu\Omega_B). \tag{81} \]
Concerning the magnetized adiabatic mode, the following comments are in order:
• The peculiar velocities are always suppressed, with respect to the other
terms of the solution, by a factor |kτ | which is smaller than 1 when the
wavelength is larger than the Hubble radius.
• In the limit σB → 0 and ΩB → 0 the magnetized adiabatic mode presented
here reproduces the well-known standard results (see for instance [138]).
• The difference between the two longitudinal fluctuations of the metric is
due to both the presence of magnetic and fluid anisotropic stresses.
• The longitudinal fluctuations of the geometry are both constant outside
the horizon and prior to matter–radiation equality; this result still holds
in the presence of a magnetized contribution as it is clearly demonstrated
by the analytic solution presented here.
The last interesting exercise we can do with the obtained solution is to
compute the important variables R and ζ introduced, respectively, in (165)
and (166). Since both ψ and φ are constants for |kτ | < 1 and for τ < τeq , also
R will be constant. In particular, by inserting (169) into (116), the following
expression can be obtained:

\[ R_i = -\frac{3}{2}\left(1 + \frac{4}{15}R_\nu\right)\phi_i - \frac{R_\gamma}{5}(4\sigma_B - R_\nu\Omega_B), \tag{82} \]
where Ri (k) denotes the initial value, in Fourier space, of the curvature per-
turbations. In numerical studies it is sometimes useful to relate the initial
values of φ and ψ, i.e. φi and ψi to Ri . This relation is expressed by the fol-
lowing pair of formulae that can be derived by inverting (170) and by using
(169):
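\[ \phi_i = -\frac{10}{15 + 4R_\nu}R_i - \frac{2R_\gamma(4\sigma_B - R_\nu\Omega_B)}{15 + 4R_\nu}, \]
\[ \psi_i = -2\,\frac{5 + 2R_\nu}{15 + 4R_\nu}R_i - \frac{2}{5}\,\frac{R_\gamma(5 + 2R_\nu)}{15 + 4R_\nu}(4\sigma_B - R_\nu\Omega_B). \tag{83} \]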
From the Hamiltonian constraint written in the form (117) it is easy to deduce,
in the limit |kτ| ≪ 1, that ζi(k) = Ri(k). The same result can be obtained
through a different, but also instructive, path. Consider the definition of ζ
given either in (112) or in (115). The variable ζ can be expressed in terms of
the partial density contrasts defined in (155) and (156). More precisely, from
the definitions of the two sets of variables it is easy to show that
\[ \zeta = \frac{\rho_\nu\zeta_\nu + \rho_\gamma\zeta_\gamma + \rho_c\zeta_c + \rho_b\zeta_b}{\rho_t} + \zeta_B, \qquad \zeta_B = \frac{\delta\rho_B}{3(p_t + \rho_t)}. \tag{84} \]
Thus, to obtain ζ it suffices to find ζγ , ζν , ζb and ζc evaluated at the initial
time and on the adiabatic solution. Using (161) and (169) into (155) and (156)
we obtain, as expected,
\[ \zeta_\gamma = \zeta_\nu = \zeta_c = \zeta_b = -\left(\psi_i + \frac{\phi_i}{2} + \frac{R_\gamma}{4}\Omega_B\right). \tag{85} \]

This result was expected, since, as previously stressed, for the adiabatic mode
all the partial density contrasts must be equal. Inserting now (173) into (172)
and recalling that the CDM and baryon contributions vanish deep in the
radiation epoch, we do get

\[ \zeta = -\left(\psi_i + \frac{\phi_i}{2}\right) = R_i, \tag{86} \]

where the last equality follows from the deﬁnition of (116) evaluated deep in
the radiation epoch and for the adiabatic solution derived above.
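The chain of relations leading to ζ = Ri can be verified with a few lines of Python. The sketch below builds ψi and Ri from φi and the magnetic parameters, forms the partial ζ variables from the lowest-order density contrasts, and adds the magnetic term. Two assumptions are made and should be checked against the definitions referred to in the text: the partial variables are taken to be ζγ = −ψ + δγ/4 and ζc = −ψ + δc/3, and δρB = ΩB ργ, so that ζB = Rγ ΩB/4 deep in the radiation epoch. The function name is hypothetical.

```python
def adiabatic_initial_data(phi_i, omega_b, sigma_b, r_nu):
    """Initial data of the magnetized adiabatic mode to lowest order in k*tau."""
    r_gamma = 1.0 - r_nu
    mag = 4.0 * sigma_b - r_nu * omega_b            # recurring magnetic combination
    psi_i = phi_i * (1.0 + 0.4 * r_nu) + 0.2 * r_gamma * mag
    curv_i = -1.5 * (1.0 + 4.0 * r_nu / 15.0) * phi_i - 0.2 * r_gamma * mag
    delta_gamma = -2.0 * phi_i - r_gamma * omega_b
    delta_c = -1.5 * phi_i - 0.75 * r_gamma * omega_b
    # Assumed definitions of the partial zeta variables (see lead-in).
    zeta_gamma = -psi_i + delta_gamma / 4.0
    zeta_c = -psi_i + delta_c / 3.0
    # Assumed magnetic contribution: zeta_B = Omega_B rho_gamma / (4 rho_t).
    zeta_mag = r_gamma * omega_b / 4.0
    zeta = zeta_gamma + zeta_mag                    # CDM and baryons negligible
    return {"psi_i": psi_i, "R_i": curv_i, "zeta_gamma": zeta_gamma,
            "zeta_c": zeta_c, "zeta": zeta}


data = adiabatic_initial_data(phi_i=1.0, omega_b=1.0e-3, sigma_b=3.0e-4, r_nu=0.405)
print(data["zeta"], data["R_i"])
```

The two printed numbers agree to rounding error, and ζγ coincides with ζc, as required for an adiabatic mode.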
Up to now, as explained, attention has been given to the magnetized adia-
batic mode. There are, however, also other nonadiabatic modes that can enter
the game. We will not go, in this lecture, through the derivation of the various
nonadiabatic modes. It is, however, useful to give at least the result in the
case of the magnetized CDM-radiation mode. In such a case the full set of
equations admitting the adiabatic solution can be solved, in the limit τ < τ1
and kτ < 1, by

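\[ \phi = \phi_1\frac{\tau}{\tau_1}, \qquad \psi = \psi_1\frac{\tau}{\tau_1}, \]
\[ \delta_\gamma = \delta_\nu = 4\psi_1\frac{\tau}{\tau_1} - R_\gamma\Omega_B, \]
\[ \delta_c = -\left[S_* + \frac{3}{4}R_\gamma\Omega_B\right] + 3\psi_1\frac{\tau}{\tau_1}, \]
\[ \delta_b = 3\psi_1\frac{\tau}{\tau_1} - \frac{3}{4}R_\gamma\Omega_B, \]
\[ \theta_c = \frac{k^2\tau_1}{3}\phi_1\left(\frac{\tau}{\tau_1}\right)^2, \]
\[ \theta_{\gamma b} = \frac{k^2\tau_1}{2}(\phi_1 + \psi_1)\left(\frac{\tau}{\tau_1}\right)^2 + \frac{k^2\tau}{4}\left[R_\nu\Omega_B - 4\sigma_B\right], \]
\[ \theta_\nu = \frac{k^2\tau_1}{2}(\phi_1 + \psi_1)\left(\frac{\tau}{\tau_1}\right)^2 + \frac{k^2\tau}{4}\left[4\frac{R_\gamma}{R_\nu}\sigma_B - \Omega_B\right], \]
\[ F_{\nu 3} = \frac{8}{9}k\tau\left[4\frac{R_\gamma}{R_\nu}\sigma_B - \Omega_B\right], \]
\[ \sigma_\nu = -\frac{R_\gamma}{R_\nu}\sigma_B + \frac{k^2\tau_1^2}{6R_\nu}(\psi_1 - \phi_1)\left(\frac{\tau}{\tau_1}\right)^3, \tag{87} \]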

where
\[ \psi_1 = \frac{15 + 4R_\nu}{8(15 + 2R_\nu)}\left[S_* + \frac{3}{4}R_\gamma\Omega_B\right], \]
\[ \phi_1 = \frac{15 - 4R_\nu}{8(15 + 2R_\nu)}\left[S_* + \frac{3}{4}R_\gamma\Omega_B\right]. \tag{88} \]
In (175) the following notation for the nonvanishing entropy fluctuations has
been employed:
Scγ = Scν = S∗ . (89)
In deriving (175) it is practical to use a form of the scale factor (obtained
by solving (97), (98) and (99) for a mixture of matter and radiation) which
explicitly interpolates between a radiation-dominated regime and a matter-
dominated regime:
\[ a(\tau) = a_{eq}\left[\left(\frac{\tau}{\tau_1}\right)^2 + 2\,\frac{\tau}{\tau_1}\right], \qquad 1 + z_{eq} = \frac{1}{a_{eq}} = \frac{h^2\Omega_{m0}}{h^2\Omega_{r0}}, \tag{90} \]
where Ωm0 and Ωr0 are evaluated at the present time and the scale factor is
normalized in such a way that a0 = 1. In (178) τ1 = (2/H0) √(aeq/Ωm0). In
terms of τ1 the equality time is
\[ \tau_{eq} = (\sqrt{2} - 1)\tau_1 = 119.07\left(\frac{h^2\Omega_{m0}}{0.134}\right)^{-1}\ \mathrm{Mpc}, \tag{91} \]
i.e. 2τeq ≃ τ1. In this framework the total optical depth from the present to
the critical recombination epoch, i.e. 800 < z < 1200, can be approximated
analytically, as discussed in [145]. By defining the redshift of decoupling as
the one where the total optical depth is of order 1, i.e. κ(zdec, 0) ≃ 1, we will
have approximately
\[ z_{dec} \simeq 1139\left(\frac{\Omega_b}{0.0431}\right)^{-\alpha_1}, \qquad \alpha_1 = \frac{0.0268}{0.6462 + 0.1125\ln(\Omega_b/0.0431)}, \tag{92} \]
where h = 0.73. From (180) and (178) it follows that for 1100 ≤ zdec ≤ 1139,
275 Mpc ≤ τdec ≤ 285 Mpc.
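The time scales quoted above can be reproduced with a short calculation. The sketch below is a rough illustration and not the code used for the numerical results of the text: it evaluates τ1, τeq, the conformal time at a given redshift (by inverting the interpolating scale factor) and the fitted decoupling redshift. It assumes that the present radiation density consists of photons plus Nν = 3 massless neutrinos, and it uses the values h²Ωγ = 2.47 × 10⁻⁵, h²Ωm0 = 0.134, h²Ωb = 0.023 and h = 0.73 quoted in the text; all function and constant names are hypothetical.

```python
import math

C_KM_S = 299792.458        # speed of light in km/s
OMEGA_GAMMA_H2 = 2.47e-5   # h^2 Omega_gamma quoted in the text
OMEGA_M_H2 = 0.134         # h^2 Omega_m0 quoted in the text
OMEGA_B_H2 = 0.023         # h^2 Omega_b quoted in the text
H_LITTLE = 0.73            # dimensionless Hubble parameter h


def a_equality(n_nu=3.0):
    """a_eq = Omega_r0/Omega_m0, assuming photons plus n_nu massless neutrinos."""
    r = (7.0 / 8.0) * n_nu * (4.0 / 11.0) ** (4.0 / 3.0)
    return OMEGA_GAMMA_H2 * (1.0 + r) / OMEGA_M_H2


def tau_1_mpc(n_nu=3.0):
    """tau_1 = (2/H0) sqrt(a_eq/Omega_m0) in Mpc (h cancels in the combination)."""
    hubble_radius_h = C_KM_S / 100.0   # c/H0 in units of Mpc/h
    return 2.0 * hubble_radius_h * math.sqrt(a_equality(n_nu) / OMEGA_M_H2)


def tau_of_redshift(z, n_nu=3.0):
    """Invert a = a_eq[(tau/tau_1)^2 + 2 tau/tau_1] with a = 1/(1 + z)."""
    y = 1.0 / ((1.0 + z) * a_equality(n_nu))   # a/a_eq
    return tau_1_mpc(n_nu) * (math.sqrt(1.0 + y) - 1.0)


def z_decoupling(omega_b):
    """Fitting formula for the decoupling redshift quoted in the text."""
    alpha_1 = 0.0268 / (0.6462 + 0.1125 * math.log(omega_b / 0.0431))
    return 1139.0 * (omega_b / 0.0431) ** (-alpha_1)


tau_eq = (math.sqrt(2.0) - 1.0) * tau_1_mpc()
z_dec = z_decoupling(OMEGA_B_H2 / H_LITTLE**2)
print(f"tau_1 = {tau_1_mpc():.1f} Mpc, tau_eq = {tau_eq:.1f} Mpc")
print(f"z_dec = {z_dec:.0f}, tau(z=1100) = {tau_of_redshift(1100.0):.1f} Mpc")
```

With these inputs τeq comes out within about one percent of the quoted 119.07 Mpc (the small difference depends on the adopted radiation density), and the conformal times at z = 1100 and z = 1139 fall in the quoted interval.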
Equations (179) and (180) will turn out to be relevant for the effective
numerical integration of the brightness perturbations which will be discussed
later on. For numerical purposes the late-time cosmological parameters will
be fixed, for a spatially flat Universe, as (see footnote 13)

\[ \omega_\gamma = 2.47\times 10^{-5}, \qquad \omega_b = 0.023, \qquad \omega_c = 0.111, \qquad \omega_m = \omega_b + \omega_c, \tag{93} \]

where ωX = h²ΩX and ΩΛ = 1 − Ωm; the present value of the Hubble parameter
H0 will be fixed, for numerical estimates, to 73 in units of km/(sec Mpc).

Footnote 13: The values of the cosmological parameters introduced in (181) are
compatible with the ones estimated from WMAP-3 [127, 146, 147] in combination
with the “Gold” sample of SNIa [148] consisting of 157 supernovae (the furthest
being at redshift z = 1.75). We are aware of the fact that WMAP-3 data alone
seem to favor a slightly smaller value of ωm (i.e. 0.126). Moreover, WMAP-3
data may also have slightly different implications if combined with supernovae
of the SNLS project [149]. The values given in (181) will just be used for a
realistic numerical illustration of the methods developed in the present
investigation.

Transfer Matrix and Sachs–Wolfe Plateau

Before presenting some numerical approaches suitable for the analysis of mag-
netized CMB anisotropies, it is useful to discuss a class of analytical estimates
that allow the calculation of the so-called Sachs–Wolfe plateau. The idea, in
short, is very simple. We have the evolution equation for ζ given in (136). This
evolution equation can be integrated across the matter–radiation transition
using the interpolating form of the scale factor proposed in (178).
Consider, first, the case of the magnetized adiabatic mode where δpnad = 0.
Deep in the radiation-dominated epoch, for τ ≪ τeq, c_s² → 1/3 and, from
(136), ζ′ = 0, so that

\[ \zeta = \zeta_i \simeq R_i, \qquad \zeta_i = -\frac{3}{2}\phi_i\left(1 + \frac{4}{15}R_\nu\right) - \frac{R_\gamma}{5}(4\sigma_B - R_\nu\Omega_B). \tag{94} \]

When the Universe becomes matter-dominated, after τeq, c_s² → 0 and the
second term at the right-hand side of (136) does contribute significantly at
decoupling (recall that for h²Ωmatter = 0.134, τdec = 2.36 τeq). Consequently,
from (136), recalling that c_s² = 4aeq/[3(3a + 4aeq)], we obtain

\[ \zeta_f = \zeta_i - \frac{3aR_\gamma\Omega_B}{4(3a + 4a_{eq})}, \qquad \Omega_{B f} = \Omega_{B i}. \tag{95} \]

The inclusion of one or more nonadiabatic modes changes the form of (136)
and, consequently, the related solution (183). For instance, in the case of
the CDM-radiation nonadiabatic mode the relevant terms arising in the sum
(133) are Scγ = Scν = Si, where Si is the (constant) fluctuation in the relative
entropy density initially present (i.e. for τ ≪ τeq). If this is the case,
δpnad = c_s² ρc Si and (136) can be easily solved. The transfer matrix for
magnetized CMB anisotropies can then be written as
\[ \begin{pmatrix} \zeta_f \\ S_f \\ \Omega_{B f} \end{pmatrix} = \begin{pmatrix} M_{\zeta\zeta} & M_{\zeta S} & M_{\zeta B} \\ 0 & M_{SS} & M_{SB} \\ 0 & 0 & M_{BB} \end{pmatrix} \begin{pmatrix} \zeta_i \\ S_i \\ \Omega_{B i} \end{pmatrix}. \tag{96} \]

In the case of a mixture of (magnetized) adiabatic and CDM-radiation modes,
we find, for a > aeq,

\[ M_{\zeta\zeta} \to 1, \qquad M_{\zeta S} \to -\frac{1}{3}, \qquad M_{\zeta B} \to -\frac{R_\gamma}{4}, \qquad M_{SS} \to 1, \qquad M_{SB} \to 0, \tag{97} \]

and MBB → 1. Equations (184) and (185) may be used, for instance, to obtain
the magnetized curvature and entropy fluctuations at photon decoupling
in terms of the same quantities evaluated for τ ≪ τeq. A full numerical
analysis of the problem confirms the analytical results summarized by (184) and
(185). The most general initial condition for CMB anisotropies will then be
a combination of (correlated) fluctuations receiving contribution from δpnad
and from the fully inhomogeneous magnetic field. To illustrate this point, the
form of the Sachs–Wolfe plateau in the sudden decoupling limit will now be
discussed.
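The action of the transfer matrix is simple enough to be sketched in code. The following illustrative Python fragment (all names are hypothetical) builds the asymptotic matrix for a mixture of magnetized adiabatic and CDM-radiation modes, applies it to a vector of initial data, and compares the result with the explicit solution for the purely adiabatic magnetized case evaluated long after equality.

```python
def transfer_matrix(r_gamma):
    """Asymptotic transfer matrix acting on (zeta_i, S_i, Omega_B_i)."""
    return [
        [1.0, -1.0 / 3.0, -r_gamma / 4.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]


def propagate(matrix, initial):
    """Matrix-vector product giving (zeta_f, S_f, Omega_B_f)."""
    return [sum(m * v for m, v in zip(row, initial)) for row in matrix]


def zeta_adiabatic(zeta_i, omega_b, r_gamma, a_over_aeq):
    """zeta_f = zeta_i - 3 a R_gamma Omega_B / [4 (3 a + 4 a_eq)]."""
    y = a_over_aeq
    return zeta_i - 3.0 * y * r_gamma * omega_b / (4.0 * (3.0 * y + 4.0))


r_gamma = 0.595   # photon fraction for three massless neutrino families
zeta_f, s_f, omega_f = propagate(transfer_matrix(r_gamma), [1.0e-5, 0.0, 1.0e-6])
print(zeta_f, zeta_adiabatic(1.0e-5, 1.0e-6, r_gamma, a_over_aeq=1.0e8))
```

In the limit a ≫ aeq the explicit solution approaches ζi − Rγ ΩB/4, which is exactly the entry MζB → −Rγ/4 of the matrix.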
To compute the SW contribution, we need to solve the evolution equation
of the monopole of the temperature fluctuations in the tight-coupling limit,
i.e. from (145) and (146),

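\[ \delta_\gamma'' + \frac{HR_b}{1 + R_b}\delta_\gamma' + \frac{k^2}{3(1 + R_b)}\delta_\gamma = 4\left[\psi'' + \frac{HR_b}{1 + R_b}\psi' - \frac{k^2}{3}\phi\right] - \frac{k^2}{3(1 + R_b)}(\Omega_B - 4\sigma_B). \tag{98} \]
In the sudden decoupling approximation the visibility function, i.e. K(τ) =
κ′(τ)e^{−κ(τ)}, and the optical depth, i.e. e^{−κ(τ)}, are approximated,
respectively, by δ(τ − τdec) and by θ(τ − τdec) (see [150, 151] for an estimate
of the width of the last scattering surface). The power spectra of ζ, S and ΩB
are given, respectively, by
\[ P_\zeta(k) = A_\zeta\left(\frac{k}{k_p}\right)^{n_r - 1}, \qquad P_S(k) = A_S\left(\frac{k}{k_p}\right)^{n_s - 1}, \tag{99} \]
\[ P_\Omega(k) = \mathcal{F}(\varepsilon)\,\overline{\Omega}_{BL}^{\,2}\left(\frac{k}{k_L}\right)^{2\varepsilon}, \tag{100} \]
where Aζ, AS and Ω̄BL are constants and

\[ \mathcal{F}(\varepsilon) = \frac{4(6 - \varepsilon)(2\pi)^{2\varepsilon}}{\varepsilon(3 - 2\varepsilon)\Gamma^2(\varepsilon/2)}, \]
\[ \overline{\Omega}_{BL} = \frac{\rho_{BL}}{\overline{\rho}_\gamma}, \qquad \rho_{BL} = \frac{B_L^2}{8\pi}, \qquad \overline{\rho}_\gamma = a^4(\tau)\rho_\gamma(\tau). \tag{101} \]

To deduce (187), (188) and (189) the magnetic field has been regularized,
according to a common practice [22, 124, 126], over a typical comoving scale
L = 2π/kL with a Gaussian window function and it has been assumed that
the magnetic field intensity is stochastically distributed as

\[ \langle B_i(\vec{k}, \tau) B^j(\vec{p}, \tau)\rangle = \frac{2\pi^2}{k^3} P_i^j(k) P_B(k, \tau)\,\delta^{(3)}(\vec{k} + \vec{p}), \tag{102} \]
where
\[ P_i^j(k) = \delta_i^j - \frac{k_i k^j}{k^2}, \qquad P_B(k, \tau) = A_B\left(\frac{k}{k_p}\right)^{\varepsilon}. \tag{103} \]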

As a consequence of (190) the magnetic field does not break the spatial
isotropy of the background geometry. The quantity kp appearing in (187)
and (191) is the conventional pivot scale, 0.05 Mpc⁻¹ (see [128, 129, 130]
for a discussion of other possible choices). Equations (188) and (189) hold
for 0 < ε < 1. In this limit PΩ(k) (see (188)) is nearly scale-invariant
(but slightly blue). This means that the magnetic and thermal
diffusivity scales (related, respectively, to the finite value of the conductivity
and of the thermal diffusivity coefficient) do not affect the spectrum [22]. In
the opposite limit, i.e. ε ≫ 1, the value of the mode-coupling integral
appearing in the two-point function of the magnetic energy density (and of the
magnetic anisotropic stress) is dominated by ultraviolet effects related to the
mentioned diffusivity scales [22]. Using then (187), (188) and (189), the Cℓ can
be computed for the region of the SW plateau (i.e. for multipoles ℓ < 30):

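\[ C_\ell = \frac{A_\zeta}{25} Z_1(n_r, \ell) + \frac{9}{100} R_\gamma^2\,\overline{\Omega}_{BL}^{\,2} Z_2(\varepsilon, \ell) - \frac{4}{25}\sqrt{A_\zeta A_S}\, Z_1(n_{rs}, \ell)\cos\gamma_{rs} \]
\[ \qquad + \frac{4}{25} A_S Z_1(n_s, \ell) - \frac{3}{25}\sqrt{A_\zeta}\, R_\gamma \overline{\Omega}_{BL} Z_3(n_r, \varepsilon, \ell)\cos\gamma_{br} \]
\[ \qquad + \frac{6}{25}\sqrt{A_S}\, R_\gamma \overline{\Omega}_{BL} Z_3(n_s, \varepsilon, \ell)\cos\gamma_{bs}, \tag{104} \]

where the functions Z1, Z2 and Z3

\[ Z_1(n, \ell) = \frac{\pi^2}{4}\left(\frac{k_0}{k_p}\right)^{n-1} 2^n\, \frac{\Gamma(3 - n)\,\Gamma\left(\ell + \frac{n-1}{2}\right)}{\Gamma^2\left(2 - \frac{n}{2}\right)\Gamma\left(\ell + \frac{5}{2} - \frac{n}{2}\right)}, \tag{105} \]

\[ Z_2(\varepsilon, \ell) = \frac{\pi^2}{2}\, 2^{2\varepsilon} \mathcal{F}(\varepsilon)\left(\frac{k_0}{k_L}\right)^{2\varepsilon} \frac{\Gamma(2 - 2\varepsilon)\,\Gamma(\ell + \varepsilon)}{\Gamma^2\left(\frac{3}{2} - \varepsilon\right)\Gamma(\ell + 2 - \varepsilon)}, \tag{106} \]

\[ Z_3(n, \varepsilon, \ell) = \frac{\pi^2}{4}\, 2^{\varepsilon}\, 2^{\frac{n+1}{2}} \mathcal{F}(\varepsilon)\left(\frac{k_0}{k_L}\right)^{\varepsilon}\left(\frac{k_0}{k_p}\right)^{\frac{n+1}{2}} \frac{\Gamma\left(\frac{5}{2} - \varepsilon - \frac{n}{2}\right)\Gamma\left(\ell + \frac{\varepsilon}{2} + \frac{n}{4} - \frac{1}{4}\right)}{\Gamma^2\left(\frac{7}{4} - \frac{\varepsilon}{2} - \frac{n}{4}\right)\Gamma\left(\frac{9}{4} + \ell - \frac{\varepsilon}{2} - \frac{n}{4}\right)}, \tag{107} \]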

are defined in terms of the magnetic tilt ε and of a generic spectral index n
which may correspond, depending on the specific contribution, either to nr
(adiabatic spectral index), or to ns (nonadiabatic spectral index) or even to
nrs = (nr + ns)/2 (spectral index of the cross-correlation). In (192) γrs, γbr
and γbs are the correlation angles. In the absence of magnetic and nonadiabatic
contributions, (192) and (193) imply that for nr = 1 (Harrison–Zeldovich
spectrum) ℓ(ℓ + 1)Cℓ/2π = Aζ/25, and WMAP data [127] would
imply that Aζ = 2.65 × 10⁻⁹. Consider then the physical situation where on
top of the adiabatic mode there is a magnetic contribution. If there is no
correlation between the magnetized contribution and the adiabatic contribution,
i.e. γbr = π/2, the SW plateau will be enhanced in comparison with the case
when magnetic fields are absent. The same situation arises when the two
components are anticorrelated (i.e. cos γbr < 0). However, if the fluctuations are
positively correlated (i.e. cos γbr > 0), the cross-correlation adds negatively to
the sum of the two autocorrelations of ζ and ΩB so that the total result may
be an overall reduction of the power with respect to the case γbr = π/2. In
(193), (194) and (195) k0 = τ0⁻¹ where τ0 is the present observation time.
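The Harrison–Zeldovich normalization quoted above follows directly from the Gamma-function structure of Z1 and can be checked numerically. The sketch below is a minimal illustration using only the Python standard library; it implements Z1, F(ε) and Z2 as reconstructed here and combines them for the case of an adiabatic mode plus an uncorrelated magnetic contribution (AS = 0, γbr = π/2). The function names are hypothetical, and the default ratios k0/kp = k0/kL = 1 are placeholders rather than physical values.

```python
import math


def z1(n, ell, k0_over_kp=1.0):
    """Z_1(n, ell) for a power-law spectrum of index n."""
    return (math.pi**2 / 4.0) * k0_over_kp**(n - 1.0) * 2.0**n \
        * math.gamma(3.0 - n) * math.gamma(ell + (n - 1.0) / 2.0) \
        / (math.gamma(2.0 - n / 2.0)**2 * math.gamma(ell + 2.5 - n / 2.0))


def f_eps(eps):
    """F(eps) entering the magnetic energy density spectrum (0 < eps < 1)."""
    return 4.0 * (6.0 - eps) * (2.0 * math.pi)**(2.0 * eps) \
        / (eps * (3.0 - 2.0 * eps) * math.gamma(eps / 2.0)**2)


def z2(eps, ell, k0_over_kl=1.0):
    """Z_2(eps, ell) for the magnetic autocorrelation."""
    return (math.pi**2 / 2.0) * 2.0**(2.0 * eps) * f_eps(eps) \
        * k0_over_kl**(2.0 * eps) \
        * math.gamma(2.0 - 2.0 * eps) * math.gamma(ell + eps) \
        / (math.gamma(1.5 - eps)**2 * math.gamma(ell + 2.0 - eps))


def c_ell_uncorrelated(ell, a_zeta, n_r, r_gamma, omega_bl, eps,
                       k0_over_kp=1.0, k0_over_kl=1.0):
    """SW plateau for A_S = 0 and gamma_br = pi/2 (no cross terms)."""
    return (a_zeta / 25.0) * z1(n_r, ell, k0_over_kp) \
        + 0.09 * r_gamma**2 * omega_bl**2 * z2(eps, ell, k0_over_kl)


ell = 10
plateau = ell * (ell + 1) * c_ell_uncorrelated(ell, 2.65e-9, 1.0, 0.595, 0.0, 0.1) \
    / (2.0 * math.pi)
print(plateau, 2.65e-9 / 25.0)
```

For nr = 1 and a vanishing magnetic field the two printed numbers coincide. Since Z2 is positive, switching on an uncorrelated magnetic contribution can only raise the plateau, in agreement with the discussion above.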

4.4 Numerical Analysis

The main idea of the numerical analysis is rather simple. Its implementation,
however, may be rather complicated. In order to capture the simplicity out of
the possible complications, we will proceed as follows. We will ﬁrst discuss a
rather naive approach to the integration of CMB anisotropies. Then, building
up on this example, the results obtainable in the case of magnetized scalar
modes will be illustrated.

Simplest Toy Model

Let us therefore apply Occam's razor and consider the simplest situation
we can imagine, that is to say the case where
• magnetic ﬁelds are absent;
• neutrinos are absent;
• photons and baryons are described within the tight-coupling approxima-
tion to lowest order (i.e. σT → ∞);
• initial conditions are set either from the adiabatic mode or from the CDM-
radiation mode.
This is clearly the simplest situation we can envisage. Since neutrinos are
absent, there is no source of anisotropic stress and the two longitudinal ﬂuc-
tuations of the metric are equal, i.e. φ = ψ. Consequently, the system of
equations to be solved becomes

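\[ R' = \frac{k^2 c_s^2 H}{H^2 - H'}\psi - \frac{H}{p_t + \rho_t}\delta p_{nad}, \tag{108} \]

\[ \psi' = -\left(2H - \frac{H'}{H}\right)\psi - \left(H - \frac{H'}{H}\right)R, \tag{109} \]

\[ \delta_\gamma' = 4\psi' - \frac{4}{3}\theta_{\gamma b}, \tag{110} \]

\[ \theta_{\gamma b}' = -\frac{HR_b}{R_b + 1}\theta_{\gamma b} + \frac{k^2}{4(1 + R_b)}\delta_\gamma + k^2\psi, \tag{111} \]

\[ \delta_c' = 3\psi' - \theta_c, \tag{112} \]

\[ \theta_c' = -H\theta_c + k^2\psi. \tag{113} \]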

We can now use the explicit form of the scale factor discussed in (178) which
implies:

\[ H = \frac{1}{\tau_1}\,\frac{2(x + 1)}{x(x + 2)}, \]
\[ H' = -\frac{2}{\tau_1^2}\,\frac{x^2 + 2x + 2}{x^2(x + 2)^2}, \]
\[ H^2 - H' = \frac{1}{\tau_1^2}\,\frac{2(3x^2 + 6x + 4)}{x^2(x + 2)^2}, \tag{114} \]

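where x = τ/τ1. With these specifications the evolution equations given in
(196)–(201) become

\[ \frac{dR}{dx} = \frac{4}{3}\,\frac{x(x + 1)(x + 2)}{(3x^2 + 6x + 4)^2}\,\kappa^2\psi, \tag{115} \]

\[ \frac{d\psi}{dx} = -\frac{3x^2 + 6x + 4}{x(x + 1)(x + 2)}R - \frac{5x^2 + 10x + 6}{x(x + 1)(x + 2)}\psi, \tag{116} \]

\[ \frac{d\delta_\gamma}{dx} = -\frac{4(3x^2 + 6x + 4)}{x(x + 1)(x + 2)}R - \frac{4(5x^2 + 10x + 6)}{x(x + 1)(x + 2)}\psi - \frac{4}{3}\tilde\theta_{\gamma b}, \tag{117} \]

\[ \frac{d\tilde\theta_{\gamma b}}{dx} = -\frac{2R_b}{R_b + 1}\,\frac{x + 1}{x(x + 2)}\tilde\theta_{\gamma b} + \frac{\kappa^2}{4(1 + R_b)}\delta_\gamma + \kappa^2\psi, \tag{118} \]

\[ \frac{d\delta_c}{dx} = -\frac{3(3x^2 + 6x + 4)}{x(x + 1)(x + 2)}R - \frac{3(5x^2 + 10x + 6)}{x(x + 1)(x + 2)}\psi - \tilde\theta_c, \tag{119} \]

\[ \frac{d\tilde\theta_c}{dx} = -\frac{2(x + 1)}{x(x + 2)}\tilde\theta_c + \kappa^2\psi. \tag{120} \]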

In (203)–(208) the following rescalings have been used:

\[ \kappa = k\tau_1, \qquad \tilde\theta_{\gamma b} = \tau_1\theta_{\gamma b}, \qquad \tilde\theta_c = \tau_1\theta_c. \tag{121} \]

The system of equations (203)–(208) can be readily integrated by giving initial
conditions at xi ≪ 1. In the case of the adiabatic mode (which is the one
contemplated by (203)–(208) since we set δpnad = 0) the initial conditions are
as follows:
\[ R(x_i) = R_*, \qquad \psi(x_i) = -\frac{2}{3}R_*, \]
\[ \delta_\gamma(x_i) = -2\psi_*, \qquad \tilde\theta_{\gamma b}(x_i) = 0, \]
\[ \delta_c(x_i) = -\frac{3}{2}\psi_*, \qquad \tilde\theta_c(x_i) = 0. \tag{122} \]
It can be shown by direct numerical integration that the system (203)–(208)
gives a reasonable semiquantitative description of the acoustic oscillations. To
simplify initial conditions even further, we can indeed assume a ﬂat Harrison–
Zeldovich spectrum and set R∗ = 1.
The same philosophy used to get to this simpliﬁed form can be used to
integrate the full system. In this case, however, we would miss the impor-
tant contribution of polarization since, to zeroth order in the tight-coupling
expansion, the CMB is not polarized.
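A direct integration of the toy system can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions and not the integrator used for the results of this section: it uses a fixed-step fourth-order Runge–Kutta scheme in the variable s = ln x (a choice made here so that the 1/x terms stay well behaved near xi), takes ψ∗ = ψ(xi), and assumes for the baryon-to-photon ratio the standard form Rb = (3/4)ρb/ργ, which without neutrinos gives Rb = (3/4)(ωb/ωm) x(x + 2). All function names are hypothetical.

```python
import math


def rb_toy(x):
    """Assumed R_b = (3/4) rho_b/rho_gamma = (3/4)(omega_b/omega_m) x (x + 2)."""
    return 0.75 * (0.023 / 0.134) * x * (x + 2.0)


def rhs(x, y, kappa, rb):
    """Right-hand sides of the rescaled system; y = (R, psi, d_g, th_gb, d_c, th_c)."""
    curv, psi, d_g, th_gb, d_c, th_c = y
    q = x * (x + 1.0) * (x + 2.0)
    p = 3.0 * x * x + 6.0 * x + 4.0
    hub = 2.0 * (x + 1.0) / (x * (x + 2.0))          # tau_1 times H
    r_b = rb(x)
    d_curv = (4.0 / 3.0) * q / p**2 * kappa**2 * psi
    d_psi = -(p / q) * curv - ((5.0 * x * x + 10.0 * x + 6.0) / q) * psi
    d_dg = 4.0 * d_psi - (4.0 / 3.0) * th_gb
    d_thgb = (-(r_b / (1.0 + r_b)) * hub * th_gb
              + kappa**2 * d_g / (4.0 * (1.0 + r_b)) + kappa**2 * psi)
    d_dc = 3.0 * d_psi - th_c
    d_thc = -hub * th_c + kappa**2 * psi
    return [d_curv, d_psi, d_dg, d_thgb, d_dc, d_thc]


def integrate(kappa, x_i=1.0e-3, x_f=20.0, ds=0.01, r_star=1.0, rb=rb_toy):
    """RK4 integration in s = ln x from x_i to x_f with adiabatic initial data."""
    psi_i = -2.0 * r_star / 3.0
    y = [r_star, psi_i, -2.0 * psi_i, 0.0, -1.5 * psi_i, 0.0]
    s, s_f = math.log(x_i), math.log(x_f)
    n_steps = int(math.ceil((s_f - s) / ds))
    h = (s_f - s) / n_steps

    def f(s_val, y_val):
        x = math.exp(s_val)
        return [x * d for d in rhs(x, y_val, kappa, rb)]   # dy/ds = x dy/dx

    for _ in range(n_steps):
        k1 = f(s, y)
        k2 = f(s + h / 2.0, [a + h / 2.0 * b for a, b in zip(y, k1)])
        k3 = f(s + h / 2.0, [a + h / 2.0 * b for a, b in zip(y, k2)])
        k4 = f(s + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
        s += h
    return y


state = integrate(kappa=1.0e-3)
print(f"R = {state[0]:.4f}, psi = {state[1]:.4f}")
```

For a mode that stays outside the horizon (small κ) the curvature perturbation R remains constant, while ψ relaxes from −2R∗/3 toward the matter-era value −3R∗/5. Larger values of κ display the acoustic oscillations of δγ, but they require a smaller step ds to resolve the oscillations accurately.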

Integration of Brightness Perturbations

To discuss the polarization, we have to go (at least) to first order in the tight-
coupling expansion [152, 153, 154]. For this purpose, it is appropriate to intro-
duce the evolution equations of the brightness perturbations of the I, Q and U
Stokes parameters characterizing the radiation field. Since the Stokes parameters
Q and U are not invariant under rotations about the axis of propagation,
the degree of polarization P = (Q² + U²)^{1/2} is customarily introduced [155,
156]. The relevant brightness perturbations will then be denoted as ΔI, ΔP.
This description reproduces to zeroth order in the tight coupling expansion,
the fluid equations that have been presented before to set initial conditions
prior to equality. For instance, the photon density contrast and the divergence
of the photon peculiar velocity are related, respectively, to the monopole and
to the dipole of the brightness perturbation of the intensity field, i.e. δγ = 4ΔI0
and θγ = 3kΔI1 . The evolution equations of the brightness perturbations can
then be written, within the conventions set by (89)

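\[ \Delta_I' + (ik\mu + \kappa')\Delta_I + ik\mu\phi = \psi' + \kappa'\left[\Delta_{I0} + \mu v_b - \frac{1}{2}P_2(\mu)S_P\right], \tag{123} \]

\[ \Delta_P' + (ik\mu + \kappa')\Delta_P = \frac{\kappa'}{2}\left[1 - P_2(\mu)\right]S_P, \tag{124} \]

\[ v_b' + Hv_b + ik\phi + \frac{ik}{4R_b}\left[\Omega_B - 4\sigma_B\right] + \frac{\kappa'}{R_b}(v_b + 3i\Delta_{I1}) = 0. \tag{125} \]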
Equation (213) is nothing but the second relation obtained in (140), having
introduced the quantity ikvb = θb. The source terms appearing in (211) and
(212) include a dependence on P2(μ) = (3μ² − 1)/2 (Pℓ(μ) denotes, in this
framework, the ℓ-th Legendre polynomial); μ = k̂ · n̂ is simply the projection
of the Fourier wave-number on the direction of the photon momentum. In
(211) and (212) the source term SP is defined as
\[ S_P(k, \tau) = \Delta_{I2}(k, \tau) + \Delta_{P0}(k, \tau) + \Delta_{P2}(k, \tau). \tag{126} \]
The evolution equations in the tight-coupling approximation will now be
integrated numerically. More details on the tight coupling expansion in the
presence of a magnetized contribution can be found in [132].
The normalization of the numerical calculation is enforced by evaluat-
ing, analytically, the Sachs–Wolfe plateau and by deducing, for a given set
of spectral indices of curvature and entropy perturbations, the amplitude of
the power spectra at the pivot scale. Here is an example of this strategy. The
Sachs–Wolfe plateau can be estimated analytically from the evolution equa-
tion of R (or ζ) by using the technique of the transfer matrix appropriately
generalized to the case where on top of the adiabatic and nonadiabatic con-
tributions the magnetic fields are consistently taken into account. The main
result is expressed by (192).
If the SW plateau is determined by an adiabatic component supplemented
by a (subleading) nonadiabatic contribution both correlated with the mag-
netic field intensity, the obtainable bound may not be so constraining (even
well above the nano-Gauss range) due to the proliferation of parameters. A
possible strategy is therefore to fix the parameters of the adiabatic mode to
the values determined by WMAP-3 and then explore the effect of a mag-
netized contribution which is not correlated with the adiabatic mode. This
implies in (192) that AS = 0 and γbr = π/2. Under this assumption, in Figs.
8 and 9 the bounds on BL are illustrated. The nature of the constraint de-
pends, in this case, both on the amplitude of the protogalactic field (at the
present epoch and smoothed over a typical comoving scale L = 2π/kL ) and
on its spectral slope, i.e. ε. In the case ε < 0.5 the magnetic energy spectrum
is nearly scale-invariant. In this case, diffusivity effects are negligible (see, for
instance, [18, 126]). As already discussed, if ε ≫ 1, the diffusivity effects (both
thermal and magnetic) dominate the mode-coupling integral that leads to the
magnetic energy spectrum [18, 126].
In Fig. 8 the magnetic field intensity should be below the different curves
if the adiabatic contribution dominates the SW plateau. Different choices of
the pivot scale kp and of the smoothing scale kL are also illustrated. In Fig. 8
the scalar spectral index is fixed to nr = 0.951 [144]. In Fig. 9 the two curves
corresponding, respectively, to nr = 0.8 and nr = 1 are reported.
[Figure 8: log BL/nG versus ε (0 ≤ ε ≤ 0.9) for nr = 0.951 and γrb = π/2; curves for (kp, kL) = (0.05 Mpc⁻¹, 1 Mpc⁻¹), (0.002 Mpc⁻¹, 0.5 Mpc⁻¹) and (0.05 Mpc⁻¹, 2 Mpc⁻¹).]

Fig. 8. Bounds on the protogalactic field intensity as a function of the magnetic
spectral index ε for different values of the parameters defining the adiabatic
contribution to the SW plateau

[Figure 9: log BL/nG versus ε (0 ≤ ε ≤ 0.9) for kp = 0.05 Mpc⁻¹, kL = 1 Mpc⁻¹ and γbr = π/2; curves for nr = 1 and nr = 0.8.]

Fig. 9. Same plot as in Fig. 8 but with emphasis on the variation of nr

If ε < 0.2, the bounds are comparatively less restrictive than in the case
ε ≃ 0.9. The cause of this occurrence is that we are here just looking at the largest
wavelengths of the problem. As it will become clear in a moment, intermediate
scales will be more sensitive to the presence of fully inhomogeneous magnetic
fields.
According to Figs. 8 and 9 for a given value of the magnetic spectral index
and of the scalar spectral index, the amplitude of the magnetic field has to
be sufficiently small not to affect the dominant adiabatic nature of the SW
plateau. Therefore Figs. 8 and 9 (as well as other similar plots) can be used to
normalize the numerical calculations for the power spectra of the brightness
perturbations, i.e.

\[ \frac{k^3}{2\pi^2}|\Delta_I(k, \tau)|^2, \qquad \frac{k^3}{2\pi^2}|\Delta_P(k, \tau)|^2, \qquad \frac{k^3}{2\pi^2}|\Delta_I(k, \tau)\Delta_P(k, \tau)|. \tag{127} \]
Let us then assume, for consistency with the cases reported in Figs. 8 and
9, that we are dealing with the situation where the magnetic field is not
correlated with the adiabatic mode. It is then possible to choose a definite
value of the magnetic spectral index (for instance ε = 0.1) and a definite value
of the adiabatic spectral index, i.e. nr (for instance nr = 0.951, in agreement
with [144]). By using the SW plateau the normalization can be chosen in
such a way that the adiabatic mode dominates over the magnetic contribution.
In the mentioned case, Fig. 8 implies BL < 1.14 × 10⁻⁸ G for a pivot scale
kp = 0.002 Mpc⁻¹. Since the relative weight of the power spectra given in
(187) and (188) is fixed, it is now possible to set initial conditions for the
adiabatic mode according to (161)–(163), (164)–(166) and (167) deep in the
radiation-dominated phase. The initial time of integration will be chosen as
τi = 10⁻⁶ τ1 in the notations discussed in (178). According to (179), this choice
implies that τi ≪ τeq.
The power spectra of the brightness perturbations, i.e. (215), can be then
computed by numerical integration. Clearly, the calculation will depend upon
the values of ωm , ωb , ωc and Rν . We will simply fix these parameters to their
fiducial values reported in (181) (see also (147)) and we will take Nν = 3 in
(162) determining in this way the fractional contribution of the neutrinos to
the radiation plasma.
The first interesting exercise, for the present purposes, is reported in Fig.
10 where the power spectra of the brightness perturbations are illustrated for
a wave-number k = 0.1 Mpc⁻¹. Concerning the results reported in Fig. 10
different comments are in order:
• For ε = 0.1 and nr = 0.951, the SW plateau imposes BL < 1.14 × 10⁻⁸ G;
from Fig. 10 it follows that a magnetic field of only 30 nG (i.e. marginally
incompatible with the SW bound) has a large effect on the brightness
perturbations, as can be argued by comparing, in Fig. 10, the dashed
curves (corresponding to 30 nG) to the full curves which illustrate the
case of vanishing magnetic fields.
[Figure 10: three panels showing k³|ΔI(k,τ)|²/(2π²), k³|ΔP(k,τ)|²/(2π²) and k³|ΔP(k,τ) ΔI(k,τ)|/(2π²) versus τ/Mpc (0 to 300) for BL = 30 nG, BL = 10 nG and BL = 0; k = 0.1 Mpc⁻¹, nr = 0.951, ε = 0.1.]

Fig. 10. The power spectra of the brightness perturbations for a typical
wave-number k = 0.1 Mpc⁻¹. The values of the parameters are specified in the legends.
The pivot scale is kp = 0.002 Mpc⁻¹ and the smoothing scale is kL = Mpc⁻¹ (see
Figs. 8 and 9)

• The situation where BL > nG cannot be simply summarized by saying that

the amplitudes of the power spectra get larger since there is a combined
effect which both increases the amplitudes and shifts slightly the phases
of the oscillations.
• From the qualitative point of view, it is still true that the intensity oscil-
lates as a cosine, the polarization as a sine.
• The phases of the cross-correlations are, comparatively, the most affected
by the presence of the magnetic field.
The features arising in Fig. 10 can be easily illustrated for other values of ε
and for different choices of the pivot or smoothing scales. The general lesson
that can be drawn is that the constraints derived only by looking at the SW
plateau are only a necessary condition on the strength of the magnetic field.
They are, however, not sufficient to exclude observable effects at smaller scales.
This aspect is illustrated in the top plot of Fig. 11, which captures a
detail of the cross-correlation. The case when BL = 0 can still be distinguished
from the case BL = 0.5 nG. Therefore, recalling that for the same choice of
parameters the SW plateau implied that BL < 11.4 nG, it is apparent that
the intermediate scales lead to more stringent conditions even for nearly
scale-invariant spectra of magnetic energy density. For the range of parameters of
Fig. 11 we will have that BL < 0.5 nG, which is more stringent than the
condition deduced from the SW plateau by, roughly, one order of magnitude.

[Figure 11: top panel, k³|ΔP(k,τ) ΔI(k,τ)|/(2π²) versus τ/Mpc (276.6 to 277) for BL = 0.5 nG and BL = 0, with k = 0.1 Mpc⁻¹, nr = 0.951, ε = 0.1; bottom panel, log k³|ΔI(k,τdec)|²/(2π²) versus ε (0.1 to 0.5) for BL = 10 nG and BL = 0, with k = 0.1 Mpc⁻¹, nr = 0.951.]

Fig. 11. A detail of the cross-correlation (top). The autocorrelation of the intensity
at τdec as a function of ε, i.e. the magnetic spectral index (bottom)

If ε increases to higher values (but always with ε < 0.5) by keeping fixed
BL (i.e. the strength of the magnetic field smoothed over a typical lengthscale
L = 2π/kL ), the amplitude of the brightness perturbations gets larger in
comparison with the case when the magnetic field is absent. This aspect is
illustrated in the bottom plot of Fig. 11 where the logarithm (to base 10)
of the intensity autocorrelation is evaluated at a fixed wave-number (and at
τdec ) as a function of ε. The full line (corresponding to a BL = 10 nG) is
progressively divergent from the dashed line (corresponding to BL = 0) as ε
increases.
In Fig. 12 the power spectra of the brightness perturbations are reported
at τdec and as a function of k. In the two plots at the top the autocorrelation
of the intensity is reported for different values of BL (left plot) and for dif-
ferent values of ε at fixed BL (right plot). In the two plots at the bottom the
polarization power spectra are reported always at τdec and for different values
of BL at fixed ε. The position of the first peak of the autocorrelation of the
intensity is, approximately, kd ≃ 0.017 Mpc⁻¹. The position of the first peak
of the cross-correlation is, approximately, 3/4 of kd. From this consideration,
again, we can obtain that BL < 0.3 nG, which is more constraining than the
SW condition.
Up to now the adiabatic mode has been considered in detail. We could
easily add, however, nonadiabatic modes that are partially correlated with
the adiabatic mode. It is rather plausible, in this situation, that by adding
new parameters, the allowed value of the magnetic field may also increase.
Similar results can be achieved by deviating from the assumption that the
magnetic field and the curvature perturbations are uncorrelated. This aspect
can be understood already from the analytical form of the SW plateau (192). If
there is no correlation between the magnetized contribution and the adiabatic
contribution, i.e. γbr = π/2, the SW plateau will be enhanced in comparison
with the case when magnetic fields are absent. The same situation arises
when the two components are anticorrelated (i.e. cos γbr < 0). However, if the
fluctuations are positively correlated (i.e. cos γbr > 0), the cross-correlation
adds negatively to the sum of the two autocorrelations of R and ΩB so that
the total result may be an overall reduction of the power with respect to the
case γbr = π/2.
From Fig. 12 various features can be appreciated. The presence of magnetic
fields, as already pointed out, does not affect only the amplitude but also the
phases of oscillations of the various brightness perturbations. Moreover, an
increase in the spectral index ε also implies a quantitative difference in the
intensity autocorrelation.
[Figure 12: four panels versus log(k Mpc) (−4 to −0.5) with kp = 0.002 Mpc⁻¹, kL = 1 Mpc⁻¹, nr = 0.951. Top left: k³|ΔI(k,τdec)|²/(2π²) for BL = 30 nG, BL = 10 nG and BL = 0 at ε = 0.1. Top right: k³|ΔI(k,τdec)|²/(2π²) for ε = 0.1 and ε = 0.5. Bottom left: k³|ΔP(k,τdec)|²/(2π²) for BL = 10 nG and BL = 0 at ε = 0.1. Bottom right: k³|ΔP(k,τdec)|²/(2π²) for BL = 30 nG and BL = 0 at ε = 0.1.]

Fig. 12. The power spectra of the brightness perturbations at τdec for the
parameters reported in the legends

5 Concluding Remarks
There is little doubt that large-scale magnetic fields exist in nature. These fields
have been observed in a number of different astrophysical systems. The main
question therefore concerns their origin. String cosmological models of pre-
big-bang type still represent a viable and well-motivated theoretical option.
Simple logic dictates that if the origin of the large-scale magnetic fields is
primordial (as opposed to astrophysical), it is plausible to expect the presence
of magnetic fields in the primeval plasma also before the decoupling of radiation
from matter. CMB anisotropies are germane to several aspects of large-scale
magnetization. CMB physics may be the tool that will finally enable us
either to confirm or to rule out the primordial nature of galactic and cluster
magnetic field seeds. In the next 5 to 10 years the forthcoming CMB precision
polarization experiments will be sensitive in various frequency channels
between 30 GHz and roughly 900 GHz. The observations will be conducted
both via satellites (like the Planck satellite) and via ground-based detectors
(as in the case of the QUIET arrays). In a complementary view, the SKA
telescope will provide full-sky surveys of Faraday rotation that may even get
close to 20 GHz.
In an optimistic perspective the forthcoming experimental data together
with the steady progress in the understanding of the dynamo theory will
hopefully explain the rationale for the ubiquitous nature of large-scale
magnetization. In a pessimistic perspective, the primordial nature of magnetic
seeds will neither be confirmed nor be ruled out. It is wise to adopt a
model-independent approach by sharpening those theoretical tools that may
allow, in the near future, a direct observational test of the effects of large-scale
magnetic fields on CMB anisotropies. Some efforts along this perspective have
been reported in the present lecture. In particular, the following results have
been achieved:
• Scalar CMB anisotropies have been described in the presence of a fully
inhomogeneous magnetic field.
• The employed formalism allows the extension of the usual CMB initial
conditions to the case when large-scale magnetic fields are present.
• By going to higher order in the tight-coupling expansion the evolution of
the brightness perturbations has been computed numerically.
• It has been shown that the magnetic fields may affect not only the ampli-
tude but also the relative phases of the Doppler oscillations.
• From the analysis of the cross-correlation power spectra it is possible to
distinguish, numerically, the effects of a magnetic field as small as 0.5 nG.
It is interesting to notice that a magnetic field in the range 10−10 –10−11 G
is still viable according to the present considerations. It is, therefore, not ex-
cluded that large-scale magnetic fields may come from a primordial field of the
order of 0.1–0.01 nG present prior to gravitational collapse of the protogalaxy.
Such a field, depending upon the details of the gravitational collapse, may be
amplified to the observable level by compressional amplification. The present
problems in achieving a large dynamo amplification may therefore be less relevant
than for the case when the seed field is in the range 10−9–10−18 nG.
To confirm this type of scenario, it will be absolutely essential to introduce the
magnetic field background into the current strategies of parameter extraction.
The considerations reported in the present lecture already provide the
framework for such an introduction. In particular, from a minimalist perspective,
the inclusion of the magnetic field background boils down to adding
two extra parameters: the spectral slope and the amplitude of the magnetic
field (conventionally smoothed over a typical comoving scale of megaparsec
size). The magnetic field contribution will then slightly modify the adiabatic
paradigm by introducing, already at the level of initial conditions, a subleading
non-Gaussian (and quasi-adiabatic) correction.

References
1. H. Alfvén: Arkiv. Mat. F. Astr., o. Fys. 29 B, 2 (1943) 864
2. E. Fermi: Phys. Rev. 75, 1169 (1949) 864, 865
3. H. Alfvén: Phys. Rev. 75, 1732 (1949) 864, 865
4. R. D. Richtmyer, E. Teller: Phys. Rev. 75, 1729 (1949) 865

5. W. A. Hiltner: Science 109, 165 (1949) 865
6. J. S. Hall: Science 109, 166 (1949) 865
7. L. J. Davis, J. L. Greenstein: Astrophys. J. 114, 206 (1951) 865
8. E. Fermi, S. Chandrasekhar: Astrophys. J. 118, 113 (1953) 865
9. E. Fermi, S. Chandrasekhar: Astrophys. J. 118, 116 (1953) 865
10. R. Wielebinski, J. Shakeshaft: Nature 195, 982 (1962) 865
11. A. G. Lyne, F. G. Smith: Nature 218, 124 (1968) 865
12. A. G. Lyne, F. G. Smith: Pulsar Astronomy (Cambridge University Press,
Cambridge, 1998) 866
13. C. Heiles: Annu. Rev. Astron. Astrophys. 14, 1 (1976) 866
14. F. Govoni, L. Feretti: Int. J. Mod. Phys. D 13, 1549 (2004) 866
15. B. M. Gaensler, R. Beck, L. Feretti: New Astron. Rev. 48, 1003 (2004) 866, 867
16. Y. Xu, P. P. Kronberg, S. Habib, Q. W. Dufton: Astrophys. J. 637, 19 (2006) 867
17. P. P. Kronberg: Astron. Nachr. 327, 517 (2006) 867, 869
18. M. Giovannini: Int. J. Mod. Phys. D 13, 391 (2004) 867, 868, 869, 870, 903, 928
19. http://www.skatelescope.org 867
20. http://www.rssd.esa.int 867
21. P. P. Kronberg: Rep. Prog. Phys. 57, 325 (1994) 867
22. M. Giovannini: Class. Quant. Grav. 23, R1 (2006) 904, 923, 924
23. J. Bernstein, L. S. Brown, G. Feinberg: Rev. Mod. Phys. 61, 25 (1989) 869, 904
24. T. J. M. Boyd, J. J. Sanderson: The Physics of Plasmas (Cambridge University
Press, Cambridge, 2003) 871, 872, 873, 876
25. N. A. Krall, A. W. Trivelpiece: Principles of Plasma Physics (San Francisco
Press, San Francisco, 1986) 871, 872, 873, 874, 876
26. F. Chen: Introduction to Plasma Physics (Plenum Press, New York, 1974) 871, 872, 873
27. D. Biskamp: Non-linear Magnetohydrodynamics (Cambridge University Press,
Cambridge, 1994) 871, 874, 875, 878
28. A. Vlasov: Zh. Éksp. Teor. Fiz. 8, 291 (1938); J. Phys. 9, 25 (1945) 871, 872
29. L. D. Landau: J. Phys. U.S.S.R. 10, 25 (1945) 871, 872
30. M. Giovannini: Phys. Rev. D 71, 021301 (2005) 873
31. M. Giovannini: Phys. Rev. D 58, 124027 (1998) 874
32. E. N. Parker: Cosmical Magnetic Fields (Clarendon Press, Oxford, 1979) 875, 876
33. Ya. B. Zeldovich, A. A. Ruzmaikin, D. D. Sokoloff: Magnetic Fields in Astrophysics
(Gordon and Breach Science, New York, 1983) 875, 877, 878
34. A. A. Ruzmaikin, A. M. Shukurov, D. D. Sokoloff: Magnetic Fields of Galaxies
(Kluwer Academic Publisher, Dordrecht, 1988) 875
35. R. Kulsrud: Annu. Rev. Astron. Astrophys. 37, 37 (1999) 875, 880
36. A. Brandenburg, K. Subramanian: Phys. Rept 417, 1 (2005) 880
37. S. I. Vainshtein, Ya. B. Zeldovich: Usp. Fiz. Nauk. 106, 431 (1972) 876
38. W. H. Matthaeus, M. L. Goldstein, S. R. Lantz: Phys. Fluids 29, 1504 (1986) 876
39. A. Lazarian, E. Vishniac, J. Cho: Astrophys. J. 603, 180 (2004); Lect. Notes
Phys. 614, 376 (2003) 880, 881
40. A. Brandenburg, A. Bigazzi, K. Subramanian: Mon. Not. Roy. Astron. Soc.
325, 685 (2001) 880
41. K. Subramanian, A. Brandenburg: Phys. Rev. Lett. 93, 205001 (2004) 880
42. A. Brandenburg, K. Subramanian: Astron. Astrophys. 439, 835 (2005) 880
43. R. Kulsrud, S. W. Anderson: Astrophys. J. 396, 606 (1992) 881
44. M. J. Rees: Lect. Notes Phys. 664, 1 (2005) 882, 883
45. K. Subramanian, D. Narashima, S. Chitre: Mon. Not. Roy. Astron. Soc. 271,
L15 (1994) 882
46. N. Y. Gnedin, A. Ferrara, E. G. Zweibel: Astrophys. J. 539, 505 (2000) 882
47. Ya. Zeldovich, I. Novikov: The Structure and Evolution of the Universe (Chicago
University Press, Chicago, 1971), Vol. 2 884
48. Ya. Zeldovich: Sov. Phys. JETP 21, 656 (1965) 884
49. E. Harrison: Phys. Rev. Lett. 18, 1011 (1967) 884, 885
50. E. Harrison: Phys. Rev. 167, 1170 (1968) 884
51. E. Harrison: Mon. Not. R. Astr. Soc. 147, 279 (1970) 884
52. L. Biermann: Z. Naturf. 5A, 65 (1950) 884
53. I. Mishustin, A. Ruzmaikin: Sov. Phys. JETP 34, 223 (1972) 885
54. M. Giovannini: Phys. Rev. D 61, 063004 (2000) 888
55. M. Giovannini: Phys. Rev. D 61, 063502 (2000) 888
56. G. Piccinelli, A. Ayala: Lect. Notes Phys. 646, 293 (2004) 888
57. D. Boyanovsky, H. J. de Vega, M. Simionato: Phys. Rev. D 67, 123505 (2003) 888
58. D. Boyanovsky, M. Simionato, H. J. de Vega: Phys. Rev. D 67, 023502 (2003) 888
59. M. Giovannini, M. E. Shaposhnikov: Phys. Rev. D 57, 2186 (1998) 888
60. K. Bamba: arXiv:hep-ph/0611152 888
61. A. Sanchez, A. Ayala, G. Piccinelli: arXiv:hep-th/0611337 888
62. M. Giovannini, M. E. Shaposhnikov: Phys. Rev. D 62, 103512 (2000) 889
63. M. Giovannini, M. Shaposhnikov: Proc. CAPP2000 (July 2000, Verbier Switzer-
land), eprint Archive [hep-ph/0011105] 889
64. E. Calzetta, A. Kandus, F. Mazzitelli: Phys. Rev. D 57, 7139 (1998)
65. A. Kandus, E. Calzetta, F. Mazzitelli, C. Wagner: Phys. Lett. B 472, 287 (2000)
66. M. S. Turner, L. M. Widrow: Phys. Rev. D 37, 2734 (1988) 889, 890, 891
67. I. Drummond, S. Hathrell: Phys. Rev. D 22, 343 (1980) 890
68. A. Dolgov: Phys. Rev. D 48, 2499 (1993) 890
69. S. Carroll, G. Field, R. Jackiw: Phys. Rev. D 41, 1231 (1990) 890
70. W. D. Garretson, G. Field, S. Carroll: Phys. Rev. D 46, 5346 (1992) 890
71. G. Field, S. Carroll: Phys. Rev. D 62, 103008 (2000) 890, 891
72. B. Ratra: Astrophys. J. Lett. 391, L1 (1992) 891
73. M. Giovannini: Phys. Rev. D 64, 061301 (2001) 891
74. K. Bamba, J. Yokoyama: e-print Archive [astro-ph/0310824] 891
75. M. Gasperini: Phys. Rev. D 63, 047301 (2001) 891
76. L. Okun: Sov. Phys. JETP 56, 502 (1982) 891
77. O. Bertolami, D. Mota: Phys. Lett. B 455, 96 (1999) 891
78. M. Giovannini: Phys. Rev. D 62, 123505 (2000)
79. L. H. Ford: Phys. Rev. D 31, 704 (1985)
80. M. Gasperini, M. Giovannini, G. Veneziano: Phys. Rev. Lett. 75, 3796 (1995) 892, 893, 894
81. M. Gasperini, M. Giovannini, G. Veneziano: Phys. Rev. D 52, 6651 (1995) 892, 893, 894
82. M. Gasperini, M. Giovannini: Phys. Rev. D 47, 1519 (1993) 892, 896
83. D. Stoler: Phys. Rev. D 1, 3217 (1970); D. Stoler: Phys. Rev. D 4, 2309 (1971)
892
84. A. O. Barut, L. Girardello: Commun. Math. Phys. 21, 41 (1971) 892
85. H. Yuen: Phys. Rev. A 13, 2226 (1976) 892
86. S. Fubini, A. Molinari: Nucl. Phys. Proc. Suppl. 33C, 60 (1993) 892
87. R. Loudon: J. Mod. Opt. 34, 709 (1987) 892
88. R. Loudon: The Quantum Theory of Light (Clarendon Press, Oxford, 1983) 892
89. B. L. Schumaker: Phys. Rep. 135, 318 (1986) 892
90. L. Mandel, E. Wolf: Optical Coherence and Quantum Optics (Cambridge Uni-
versity Press, Cambridge, 1995) 892
91. G. Veneziano: Phys. Lett. B 265, 287 (1991) 893
92. M. Gasperini, G. Veneziano: Astropart. Phys. 1, 317 (1993) 893
93. M. Gasperini, G. Veneziano: Phys. Rep. 373, 1 (2003) 893, 895
94. C. Lovelace: Phys. Lett. B 135, 75 (1984) 893
95. E. Fradkin, A. Tseytlin: Nucl. Phys. B 261, 1 (1985) 893
96. C. Callan et al.: Nucl. Phys. B 262, 593 (1985) 893
97. M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B 569, 113 (2003) 895, 902
98. M. Gasperini, M. Giovannini, G. Veneziano: Nucl. Phys. B 694, 206 (2004) 895, 902
99. K. A. Meissner, G. Veneziano: Mod. Phys. Lett. A 6, 3397 (1991) 895
100. K. A. Meissner, G. Veneziano: Phys. Lett. B 267, 33 (1991) 895
101. M. Gasperini, J. Maharana, G. Veneziano: Nucl. Phys. B 472, 349 (1996) 895
102. M. Giovannini: Class. Quant. Grav. 21, 4209 (2004) 895
103. M. Gasperini, M. Giovannini: Phys. Lett. B 301, 334 (1993) 896
104. M. Giovannini: Phys. Rev. D 61, 087306 (2000) 896
105. R. Brustein, M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B 361,
45 (1995) 901
106. R. Brustein, M. Gasperini, M. Giovannini, V. F. Mukhanov, G. Veneziano,
Phys. Rev. D 51, 6744 (1995) 901, 902
107. K. Enqvist, M. S. Sloth: Nucl. Phys. B 626, 395 (2002); M. S. Sloth: Nucl.
Phys. B 656, 239 (2003) 901
108. V. Bozza, M. Gasperini, M. Giovannini, G. Veneziano: Phys. Rev. D 67 (2003)
063514; V. Bozza, M. Gasperini, M. Giovannini, G. Veneziano: Phys. Lett. B
543, 14 (2002) 901, 902
109. P. Astone et al.: Astron. Astrophys. 351, 811 (1999) 901
110. Ph. Bernard, G. Gemme, R. Parodi, E. Picasso: Rev. Sci. Instrum. 72, 2428
(2001) 901
111. A. M. Cruise: Class. Quantum Grav. 17, 2525 (2000); A. M. Cruise: Mon. Not.
R. Astron. Soc 204, 485 (1983) 901
112. D. Babusci and M. Giovannini: Int. J. Mod. Phys. D 10 477 (2001); D. Babusci
and M. Giovannini: Class. Quant. Grav. 17, 2621 (2000) 901
113. P. J. E. Peebles, A. Vilenkin: Phys. Rev. D 59, 063505 (1999) 902
114. M. Giovannini: Class. Quant. Grav. 16, 2905 (1999); M. Giovannini: Phys.
Rev. D 60, 123511 (1999); D. Babusci and M. Giovannini: Phys. Rev. D 60,
083511 (1999); M. Giovannini: Phys. Rev. D 58, 083504 (1998) 902
115. M. Gasperini, S. Nicotri: Phys. Lett. B 633, 155 (2006) 902
116. R. Beck: Astron. Nachr. 327, 512 (2006) 903
117. R. Beck, A. Brandenburg, D. Moss, A. Shukurov, D. Sokoloff: Annu. Rev. Astron.
Astrophys. 34, 155 (1996) 903
118. J. Vallée: Astrophys. J. 566, 261 (2002) 904
119. E. Battaner, E. Florido: Mon. Not. R. Astron. Soc 277, 1129 (1995) 903
120. E. Battaner, E. Florido, J. Jimenez-Vincente: Astron. Astrophys. 326, 13
(1997) 903
121. E. Florido, E. Battaner: Astron. Astrophys. 327, 1 (1997) 903
122. E. Florido et al.: arXiv:astro-ph/0609384 903
123. E. Battaner, E. Florido: Fund. Cosmic Phys. 21, 1 (2000) 903
124. J. Barrow, K. Subramanian: Phys. Rev. Lett. 81, 3575 (1998); J. Barrow,
K. Subramanian: Phys. Rev. D 58, 83502 (1998); C. Tsagas, R. Maartens:
Phys. Rev. D 61, 083519 (2000); A. Mack, T. Kahniashvili, A. Kosowsky:
Phys. Rev. D 65, 123004 (2002); A. Lewis: Phys. Rev. D 70, 043518 (2004);
T. Kahniashvili, B. Ratra: Phys. Rev. D 71, 103006 (2005) 903, 923
125. G. Chen et al.: Astrophys. J. 611, 655 (2004); P. D. Naselsky et al.: Astrophys.
J. 615, 45 (2004); L. Y. Chiang, P. Naselsky: Int. J. Mod. Phys. D 14, 1251
(2005); L. Y. Chiang, P. D. Naselsky, O. V. Verkhodanov, M. J. Way: Astrophys.
J. 590, L65 (2003); D. G. Yamazaki et al.: Astrophys. J. 625, L1 (2005) 903
126. K. Subramanian: Astron. Nachr. 327, 399 (2006) 903, 923, 928
127. H. V. Peiris et al. [WMAP Collaboration]: Astrophys. J. Suppl. 148, 213 (2003)
904, 921, 925
128. K. Enqvist, H. Kurki-Suonio, J. Valiviita: Phys. Rev. D 62, 103003 (2000) 905, 924
129. H. Kurki-Suonio, V. Muhonen, J. Valiviita: Phys. Rev. D 71, 063005 (2005) 905, 924
130. K. Moodley, M. Bucher, J. Dunkley, P. G. Ferreira, C. Skordis: Phys. Rev. D
70, 103520 (2004) 905, 906, 924
131. M. Giovannini: Phys. Rev. D 73, 101302 (2006) 905
132. M. Giovannini: Phys. Rev. D 74, 063002 (2006) 905, 916, 928
133. M. Giovannini: Class. Quant. Grav. 23, 4991 (2006) 905, 916
134. J. D. Barrow, R. Maartens, C. G. Tsagas: arXiv:astro-ph/0611537 905
135. T. Kahniashvili, B. Ratra: arXiv:astro-ph/0611247. 905
136. E. Harrison: Rev. Mod. Phys. 39, 862 (1967) 905
137. J. M. Bardeen: Phys. Rev. D 22, 1882 (1980) 905
138. C.-P. Ma, E. Bertschinger: Astrophys. J. 455, 7 (1995) 901, 905, 919
139. M. Giovannini: Phys. Rev. D 70, 123507 (2004) 905, 916, 917
140. M. Giovannini: Int. J. Mod. Phys. D 14, 363 (2005) 905, 909, 911, 915
141. J. Bardeen, P. Steinhardt, M. Turner: Phys. Rev. D 28, 679 (1983)
142. R. Brandenberger, R. Kahn, W. Press: Phys. Rev. D 28, 1809 (1983)
143. M. Giovannini: Phys. Lett. B 622, 349 (2005) 917
144. M. Giovannini: Class. Quant. Grav. 22, 5243 (2005) 917, 928, 930
145. D. Spergel et al. [WMAP Collaboration]: arXiv:astro-ph/0603449 921
146. W. Hu, N. Sugiyama: Astrophys. J. 444, 489 (1995); ibid. 471, 30 (1996) 921
147. L. Page et al. [WMAP collaboration]: arXiv:astro-ph/0603450 921
148. A. G. Riess et al.: Astrophys. J. 607, 665 (2005) 921
149. P. Astier et al.: astro-ph/0510447 921
150. P. Naselsky, I. Novikov: Astrophys. J. 413, 14 (1993) 923
151. H. Jorgensen, E. Kotok, P. Naselsky, I. Novikov: Astron. Astrophys. 294, 639
(1995) 923
152. P. J. E. Peebles, J. T. Yu: Astrophys. J. 162, 815 (1970) 927
153. A. G. Doroshkevich, Ya. B. Zeldovich, R. A. Sunyaev: Sov. Astron. 22, 523
(1978) 927
154. M. Zaldarriaga, D. D. Harari: Phys. Rev. D 52 (1995) 3276. 927
155. S. Chandrasekhar: Radiative Transfer (Dover, New York, 1966) 927
Cosmological Singularities and a Conjectured
Gravity/Coset Correspondence

T. Damour

Institut des Hautes Etudes Scientifiques, 35 route de Chartres,
F-91440 Bures-sur-Yvette, France

Abstract. We review the recently discovered connection between the
Belinsky–Khalatnikov–Lifshitz-like “chaotic” structure of generic cosmological singularities
in 11-dimensional supergravity and the “last” hyperbolic Kac–Moody algebra E10 .
This intriguing connection suggests the existence of a hidden “correspondence” be-
tween supergravity (or even M -theory) and null geodesic motion on the inﬁnite-
dimensional coset space E10 /K(E10 ). If true, this gravity/coset correspondence
would oﬀer a new view of the (quantum) fate of space (and matter) at cosmological
singularities.

1 Introduction
It is a pleasure to participate in the celebration of the seminal accomplish-
ments of Gabriele Veneziano. I will try to do so by reviewing a line of research
which is intimately connected with several of Gabriele’s important contribu-
tions, being concerned with the cardinal problem of String Cosmology: the fate
of the Einstein-like space–time description at big crunch/big bang cosmologi-
cal singularities. Actually, the work described below started as a by-product of
the string cosmology program initiated by Gasperini and Veneziano [1]. While
collaborating with Gabriele on the possible birth of “pre–big bang bubbles”
from the gravitational collapse instability of a generic string vacuum made of
a stochastic bath of incoming gravitational and dilatonic waves [2], an issue
raised itself: What is the structure of a generic spacelike (i.e. big crunch or big
bang) singularity within the effective field theory approximation of (super-)
string theory (when keeping all fields, and not only the metric and the dilaton)?
The answer turned out to be surprisingly complex, and rich in hidden
structures. It was first found [3, 4] that the general solution, near a space-like
singularity, of the massless bosonic sector of all superstring models (D = 10,
IIA, IIB, I, HE, HO), as well as that of M theory (D = 11 supergravity),
exhibits a never-ending oscillatory behaviour of the Belinsky–Khalatnikov–Lifshitz
(BKL) type [5]. However, it was later realized that behind this seemingly
entirely chaotic behaviour there was a hidden symmetry structure [6, 7, 8].

T. Damour: Cosmological Singularities and a Conjectured Gravity/Coset Correspondence,
Lect. Notes Phys. 737, 941–948 (2008)
DOI 10.1007/978-3-540-74233-6_26 © Springer-Verlag Berlin Heidelberg 2008

This led to the conjecture of the existence of a hidden equivalence (i.e. a
correspondence) between two seemingly very different dynamical systems: on
the one hand, 11-dimensional supergravity (or even, hopefully, “M -theory”),
and, on the other hand, a one-dimensional E10 /K(E10 ) nonlinear σ model,
i.e. the geodesic motion of a massless particle on the infinite-dimensional coset
space1 E10 /K(E10 ) [8]. The intuitive hope behind this conjecture is that the
BKL-type near spacelike singularity limit might act as a tool for revealing a
hidden structure, in analogy to the much better established AdS/CFT cor-
respondence [9], where the consideration of the near horizon limit of certain
black D-branes has revealed a hidden equivalence between 10-dimensional
string theory in AdS space–time on one side, and a lower-dimensional CFT
on the other side. If the (much less firmly established) “gravity/coset corre-
spondence” were confirmed, it might provide both the basis of a new definition
of M -theory, and a description of the “de-emergence” of space near a cosmo-
logical singularity (see [10] and below).

2 Cosmological Billiards

Let us start by summarizing the BKL-type analysis of the “near spacelike
singularity limit”, that is, of the asymptotic behaviour of the metric $g_{\mu\nu}(t, x)$,
together with the other fields (such as the 3-form $A_{\mu\nu\lambda}(t, x)$ in supergravity),
near a singular hypersurface. The basic idea is that, near a spacelike
singularity, the time derivatives are expected to dominate over spatial derivatives.
More precisely, BKL found that spatial derivatives introduce terms in
the equations of motion for the metric which are similar to the “walls” of
a billiard table [5]. To see this, it is convenient [11] to decompose the
D-dimensional metric $g_{\mu\nu}$ into nondynamical (lapse $N$, and shift $N^i$, here set
to zero) and dynamical ($e^{-2\beta^a}$, $\theta^a_i$) components. They are defined so that the
line element reads

$$ds^2 = -N^2 dt^2 + \sum_{a=1}^{d} e^{-2\beta^a} \theta^a_i \theta^a_j \, dx^i dx^j . \qquad (1)$$

Here d ≡ D−1 denotes the spatial dimension (d = 10 for SUGRA11, and d = 9
for string theory), $e^{-2\beta^a}$ represent (in an Iwasawa decomposition) the “diagonal”
components of the spatial metric $g_{ij}$, while the “off diagonal” components
are represented by the $\theta^a_i$, defined to be upper triangular matrices with 1’s on
the diagonal (so that, in particular, det θ = 1).
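
The decomposition of the spatial metric used in (1) can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes a finite, positive-definite spatial metric at one point and uses the fact that writing $g_{ij} = \sum_a e^{-2\beta^a} \theta^a_i \theta^a_j$ with unit upper triangular θ amounts to an LDL^T factorization. The function names `iwasawa_decompose` and `rebuild_metric` are hypothetical.

```python
import math


def iwasawa_decompose(g):
    """Split a symmetric positive-definite matrix g_ij into (beta, theta) with

        g_ij = sum_a exp(-2 * beta[a]) * theta[a][i] * theta[a][j],

    where theta is upper triangular with 1's on the diagonal.
    This is an LDL^T factorization: theta is the transpose of the unit
    lower-triangular factor and exp(-2 * beta[a]) are the pivots.
    """
    n = len(g)
    pivots = [0.0] * n
    theta = [[0.0] * n for _ in range(n)]
    for a in range(n):
        theta[a][a] = 1.0
        # Diagonal entry: g_aa = pivot_a + sum_{k<a} pivot_k * theta[k][a]^2
        pivots[a] = g[a][a] - sum(pivots[k] * theta[k][a] ** 2 for k in range(a))
        if pivots[a] <= 0.0:
            raise ValueError("g must be symmetric positive definite")
        for i in range(a + 1, n):
            # Off-diagonal entry: g_ai = pivot_a * theta[a][i] + lower-order terms
            cross = sum(pivots[k] * theta[k][a] * theta[k][i] for k in range(a))
            theta[a][i] = (g[a][i] - cross) / pivots[a]
    beta = [-0.5 * math.log(p) for p in pivots]
    return beta, theta


def rebuild_metric(beta, theta):
    """Recompose g_ij from the logarithmic scale factors and theta."""
    n = len(beta)
    return [
        [
            sum(math.exp(-2.0 * beta[a]) * theta[a][i] * theta[a][j] for a in range(n))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Illustrative 2x2 "spatial metric" (made-up numbers).
example_metric = [[4.0, 2.0], [2.0, 3.0]]
example_beta, example_theta = iwasawa_decompose(example_metric)
print(example_beta, example_theta)
print(rebuild_metric(example_beta, example_theta))
```

Since θ has unit diagonal, det θ = 1 holds automatically, and all the information about the volume of space sits in the β's.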
The Hamiltonian constraint, at a given spatial point, reads (with
$\tilde N \equiv N/\sqrt{\det g_{ij}}$ denoting the “rescaled lapse”)

$$H(\beta^a, \pi_a, P, Q) = \tilde N \left[ \frac{1}{2} G^{ab} \pi_a \pi_b + \sum_A c_A(Q, P, \partial\beta, \partial^2\beta, \partial Q) \exp\big(-2 w_A(\beta)\big) \right] . \qquad (2)$$

1 Here K(E10) denotes the (formal) “maximal compact subgroup” of the hyperbolic
Kac–Moody group E10.

Here $\pi_a$ (with a = 1, ..., d) denote the canonical momenta conjugate to
the “logarithmic scale factors” $\beta^a$, while Q denote the remaining configuration
variables ($\theta^a_i$, 3-form components $A_{ijk}(t, x)$ in supergravity), and P
their canonically conjugate momenta ($P^i_a$, $\pi^{ijk}$). The symbol ∂ denotes spatial
derivatives. The (inverse) metric $G^{ab}$ in (2) is the DeWitt “superspace”
metric induced on the β’s by the Einstein–Hilbert action. It endows the
D-dimensional² β space with a Lorentzian structure $G_{ab} \dot\beta^a \dot\beta^b$.
One of the crucial features of (2) is the appearance of Toda-like exponential
potential terms ∝ $\exp(-2 w_A(\beta))$, where the $w_A(\beta)$ are linear forms in the
logarithmic scale factors: $w_A(\beta) \equiv w_{Aa} \beta^a$. The range of labels A and the
specific “wall forms” $w_A(\beta)$ that appear depend on the considered model. For
instance, in SUGRA11 there appear: “symmetry wall forms” $w^S_{ab}(\beta) \equiv \beta^b - \beta^a$
(with a < b), “gravitational wall forms” $w^g_{abc}(\beta) \equiv 2\beta^a + \sum_{e \neq a,b,c} \beta^e$
($a \neq b$, $b \neq c$, $c \neq a$), “electric 3-form wall forms”, $e_{abc}(\beta) \equiv \beta^a + \beta^b + \beta^c$
($a \neq b$, $b \neq c$, $c \neq a$), and “magnetic 3-form wall forms”,
$m_{a_1 \ldots a_6} \equiv \beta^{a_1} + \beta^{a_2} + \ldots + \beta^{a_6}$ (with indices all different).
One then finds that the near-spacelike-singularity limit amounts to considering
the large β limit in (2). In this limit a crucial role is played by the linear
forms $w_A(\beta)$ appearing in the “exponential walls”. Actually, these walls enter
in successive “layers”. A first layer consists of a subset of all the walls called
the dominant walls $w_i(\beta)$. The effect of these dynamically dominant walls is
to confine the motion in β-space to a fundamental billiard chamber defined by
the inequalities $w_i(\beta) \geq 0$. In the case of SUGRA11, one finds that there are 10
dominant walls: 9 of them are the symmetry walls $w^S_{12}(\beta)$, $w^S_{23}(\beta)$, ..., $w^S_{910}(\beta)$,
and the 10th is an electric 3-form wall $e_{123}(\beta) = \beta^1 + \beta^2 + \beta^3$. As noticed in
[6] a remarkable fact is that the fundamental cosmological billiard chamber
of SUGRA11 (as well as type II string theories) is the Weyl chamber of the
hyperbolic Kac–Moody algebra E10. More precisely, the 10 dynamically dominant
wall forms $w^S_{12}(\beta)$, $w^S_{23}(\beta)$, ..., $w^S_{910}(\beta)$, $e_{123}(\beta)$ can be identified with
the 10 simple roots {α1(h), α2(h), ..., α10(h)} of E10. Here h parametrizes a
generic element of a Cartan subalgebra (CSA) of E10. [Let us also note that
for heterotic and type I string theories the cosmological billiard is the Weyl
chamber of another rank 10 hyperbolic Kac–Moody algebra, namely BE10].
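
The identification of the 10 dominant wall forms with simple roots can be checked by computing their mutual scalar products. The sketch below is illustrative only and rests on an assumption that is not spelled out in the text: the inverse DeWitt metric on linear forms is taken to be $G^{ab} u_a v_b = \sum_a u_a v_a - (\sum_a u_a)(\sum_a v_a)/(d-1)$ with d = 10. All function names are hypothetical.

```python
from fractions import Fraction

D_SPATIAL = 10  # d = 10 for SUGRA11


def form_dot(u, v, d=D_SPATIAL):
    """Scalar product of two linear forms u_a beta^a and v_a beta^a.

    ASSUMPTION (not stated in the text): the inverse DeWitt metric is
    G^{ab} u_a v_b = sum_a u_a v_a - (sum_a u_a)(sum_a v_a) / (d - 1).
    """
    return sum(Fraction(x) * y for x, y in zip(u, v)) - Fraction(sum(u) * sum(v), d - 1)


def symmetry_wall(a, b, d=D_SPATIAL):
    """Coefficients of w^S_ab(beta) = beta^b - beta^a, with 1 <= a < b <= d."""
    w = [0] * d
    w[b - 1] = 1
    w[a - 1] = -1
    return w


def electric_wall(a, b, c, d=D_SPATIAL):
    """Coefficients of e_abc(beta) = beta^a + beta^b + beta^c."""
    return [1 if e in (a, b, c) else 0 for e in range(1, d + 1)]


def dominant_walls():
    """The 9 symmetry walls w^S_12, ..., w^S_910 followed by e_123."""
    walls = [symmetry_wall(a, a + 1) for a in range(1, D_SPATIAL)]
    walls.append(electric_wall(1, 2, 3))
    return walls


def gram_matrix(walls):
    """Matrix of scalar products (w_i | w_j) of the given wall forms."""
    return [[form_dot(u, v) for v in walls] for u in walls]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


cartan = gram_matrix(dominant_walls())
for row in cartan:
    print([int(x) for x in row])
print("determinant:", determinant(cartan))
```

Under the stated assumption every dominant wall form has squared length 2, so the Gram matrix plays the role of a Cartan matrix: neighbouring symmetry walls have scalar product −1, and the electric wall is linked only to the third symmetry wall, which is the pattern of the E10 Dynkin diagram. The negative determinant reflects the Lorentzian signature of β-space.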
In the Dynkin diagram of E10, Fig. 1, the 9 “horizontal” nodes correspond
to the 9 symmetry walls, while the characteristic “exceptional” node sticking
out “vertically” corresponds to the electric 3-form wall $e_{123} = \beta^1 + \beta^2 + \beta^3$.
(The fact that this node stems from the 3rd horizontal node is then seen to
be directly related to the presence of the 3-form $A_{\mu\nu\lambda}$, with electric kinetic
energy ∝ $g^{i\ell} g^{jm} g^{kn} \dot A_{ijk} \dot A_{\ell mn}$.)

2 10 dimensional for SUGRA11; but the various superstring theories also lead to a
10-dimensional Lorentz space because one must add the (positive) kinetic term
of the dilaton ϕ ≡ β10 to the nine-dimensional DeWitt metric corresponding to
the nine spatial dimensions.

        o α10
        |
o---o---o---o---o---o---o---o---o
α1  α2  α3  α4  α5  α6  α7  α8  α9

Fig. 1. Dynkin diagram of E10

The appearance of E10 in the BKL behaviour of SUGRA11 revived an old
suggestion of Julia [12] about the possible role of E10 in a one-dimensional
reduction of SUGRA11 . A posteriori, one can view the BKL behaviour as a
kind of spontaneous reduction to one dimension (time) of a multidimensional
theory. Note, however, that we are always discussing generic inhomogeneous
11-dimensional solutions, but that we examine them in the near-spacelike-singularity
limit where the spatial derivatives are subdominant: $\partial_x \ll \partial_t$.
Note also that the discrete E10(Z) was proposed as a U-duality group of the
full ($T^{10}$) spatial toroidal compactification of M-theory by Hull and Townsend
[13].

3 Gravity/Coset Correspondence
References [8, 14] went beyond the leading-order BKL analysis just recalled
by including the first three “layers” of spatial-gradient-related subdominant
walls ∝ exp(−2wA (β)) in (2). The relative importance of these subdominant
walls, which modify the leading billiard dynamics defined by the 10 dominant
walls wi (β), can be ordered by means of an expansion which counts how many
dominant wall forms wi (β) are contained in the exponents of the subdominant
wall forms wA (β), associated to higher spatial gradients. By mapping the
dominant gravity wall forms wi (β) onto the corresponding E10 simple roots
αi (h), i = 1, ..., 10, the just described BKL-type gradient expansion becomes
mapped onto a Lie algebraic height expansion in the roots of E10. Remarkably, it
was found that, up to height 30 (i.e. up to small corrections to the
billiard dynamics associated to the product of 30 leading walls $e^{-2w_i(\beta)}$),
the SUGRA11 dynamics for $g_{\mu\nu}(t, x)$, $A_{\mu\nu\lambda}(t, x)$, considered at some given
spatial point $x_0$, could be identified with the geodesic dynamics of a massless
particle moving on the (infinite-dimensional) coset space E10/K(E10). Note
the “holographic” nature of this correspondence between an 11-dimensional
dynamics on one side, and a 1-dimensional one on the other side.
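
The counting of dominant wall forms inside a subdominant one can be made concrete. The sketch below is illustrative and assumes the usual definition of height (the sum of the coefficients of a root on the simple roots) and reads off the GL(10) level as the coefficient of α10; the helper names are hypothetical. It expands a wall form of SUGRA11 on the 10 dominant wall forms by solving a linear system exactly.

```python
from fractions import Fraction

D = 10  # number of logarithmic scale factors beta^a in SUGRA11


def symmetry_wall(a, b):
    """Coefficients of w^S_ab = beta^b - beta^a (a < b)."""
    w = [0] * D
    w[b - 1], w[a - 1] = 1, -1
    return w


def electric_wall(a, b, c):
    """Coefficients of e_abc = beta^a + beta^b + beta^c."""
    return [1 if e in (a, b, c) else 0 for e in range(1, D + 1)]


def magnetic_wall(indices):
    """Coefficients of m_{a1...a6} = beta^a1 + ... + beta^a6."""
    chosen = set(indices)
    return [1 if e in chosen else 0 for e in range(1, D + 1)]


def gravitational_wall(a, b, c):
    """Coefficients of w^g_abc = 2 beta^a + sum over e not in {a, b, c} of beta^e."""
    w = [0 if e in (b, c) else 1 for e in range(1, D + 1)]
    w[a - 1] = 2
    return w


# Dominant walls, identified with the simple roots alpha_1, ..., alpha_10.
SIMPLE_ROOTS = [symmetry_wall(i, i + 1) for i in range(1, D)] + [electric_wall(1, 2, 3)]


def root_coefficients(wall):
    """Solve wall = sum_i n_i alpha_i for the n_i by Gauss-Jordan elimination."""
    n = len(SIMPLE_ROOTS)
    aug = [
        [Fraction(SIMPLE_ROOTS[i][k]) for i in range(n)] + [Fraction(wall[k])]
        for k in range(n)
    ]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def height(wall):
    """Number of dominant wall forms contained in the given wall form."""
    return sum(root_coefficients(wall))


def level(wall):
    """Coefficient of alpha_10 (the electric wall e_123)."""
    return root_coefficients(wall)[-1]


for name, wall in [
    ("w^S_13", symmetry_wall(1, 3)),
    ("e_124", electric_wall(1, 2, 4)),
    ("m_123456", magnetic_wall(range(1, 7))),
    ("w^g_1,9,10", gravitational_wall(1, 9, 10)),
]:
    print(name, "level", level(wall), "height", height(wall))
```

In this sketch the electric, magnetic and gravitational wall forms come out at levels 1, 2 and 3 respectively, which is the grading by powers of the exceptional root that organizes the comparison with the coset model.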
A point on the coset space E10(R)/K(E10(R)) is coordinatized by a time-dependent
(but spatially independent) element of the E10(R) group of the
(Iwasawa) form: $g(t) = \exp h(t) \exp \nu(t)$. Here, $h(t) = \beta^a_{\rm coset}(t) H_a$ belongs to
the 10-dimensional CSA of E10, while $\nu(t) = \sum_{\alpha>0} \nu^\alpha(t) E_\alpha$ belongs to a
Borel subalgebra of E10 and has an infinite number of components labelled
by a positive root α of E10. The (null) geodesic action over the coset space
E10/K(E10) takes the simple form

$$S_{E_{10}/K(E_{10})} = \int \frac{dt}{n(t)} \, (v^{\rm sym} | v^{\rm sym}) , \qquad (3)$$

where $v^{\rm sym} \equiv \frac{1}{2}(v + v^T)$ is the “symmetric”³ part of the “velocity”
$v \equiv (dg/dt) g^{-1}$ of a group element g(t) running over E10(R).
The correspondence between the gravity (2) and coset (3) dynamics is best
exhibited by decomposing (the Lie algebra of) E10 with respect to (the Lie
algebra of) the GL(10) subgroup defined by the horizontal line in the Dynkin
diagram of E10. This allows one to grade the various components of g(t) by
their GL(10) level ℓ. One finds that, at the ℓ = 0 level, g(t) is parametrized by
the Cartan coordinates $\beta^a_{\rm coset}(t)$ together with a unimodular upper triangular
zehnbein $\theta^a_{{\rm coset}\, i}(t)$. At level ℓ = 1, one finds a 3-form $A^{\rm coset}_{ijk}(t)$; at level ℓ = 2, a
6-form $A^{\rm coset}_{i_1 i_2 \ldots i_6}(t)$, and at level ℓ = 3 a 9-index object $A^{\rm coset}_{i_1|i_2 \ldots i_9}(t)$ with
Young-tableau symmetry {8, 1}. The coset action (3) then defines a coupled set of
equations of motion for $\beta^a_{\rm coset}(t)$, $\theta^a_{{\rm coset}\, i}(t)$, $A^{\rm coset}_{ijk}(t)$, $A^{\rm coset}_{i_1 \ldots i_6}(t)$, $A^{\rm coset}_{i_1|i_2 \ldots i_9}(t)$.
By explicit calculations, it was found that these coupled equations of motion
could be identified (modulo terms corresponding to potential walls of height
at least 30) with the SUGRA11 equations of motion, considered at some given
spatial point $x_0$.
The dictionary between the two dynamics says essentially that
(i) $\beta^a_{\rm gravity}(t, x_0) \leftrightarrow \beta^a_{\rm coset}(t)$, $\theta^a_i(t, x_0) \leftrightarrow \theta^a_{{\rm coset}\, i}(t)$,
(ii) $\partial_t A^{\rm coset}_{ijk}(t)$ corresponds to the electric components of the
11-dimensional field strength $F_{\rm gravity} = dA_{\rm gravity}$ in a certain frame $e^i$,
(iii) the conjugate momentum of $A^{\rm coset}_{i_1 \ldots i_6}(t)$ corresponds to the dual
(using $\varepsilon^{i_1 i_2 \ldots i_{10}}$) of the “magnetic” frame components of the 4-form
$F_{\rm gravity} = dA_{\rm gravity}$, and
(iv) the conjugate momentum of $A_{i_1|i_2 \ldots i_9}(t)$ corresponds to the $\varepsilon^{10}$ dual
(on jk) of the structure constants $C^i_{jk}$ of the coframe $e^i$
($d e^i = \frac{1}{2} C^i_{jk} e^j \wedge e^k$).
The fact that at levels ℓ = 2 and ℓ = 3 the dictionary between supergravity
and coset variables maps the first spatial gradients of the SUGRA variables
$A_{ijk}(t, x)$ and $g_{ij}(t, x)$ onto (time derivatives of) coset variables suggested
the conjecture [8] of a hidden equivalence between the two models, i.e. the
existence of a dynamics-preserving map between the infinite tower of (spatially
independent) coset variables ($\beta^a_{\rm coset}$, $\nu^\alpha$), together with their conjugate
momenta ($\pi^{\rm coset}_a$, $p_\alpha$), and the infinite sequence of spatial Taylor coefficients
($\beta(x_0)$, $\pi(x_0)$, $Q(x_0)$, $P(x_0)$, $\partial Q(x_0)$, $\partial^2 \beta(x_0)$, $\partial^2 Q(x_0)$, ..., $\partial^n Q(x_0)$, ...)
formally describing the dynamics of the gravity variables ($\beta(x)$, $\pi(x)$, $Q(x)$,
$P(x)$) around some given spatial point $x_0$.⁴

3 Here the transpose operation T denotes the negative of the Chevalley involution
ω defining the real form E10(10) of E10. It is such that the elements k of the Lie
subalgebra of K(E10) are “T-antisymmetric”: $k^T = -k$, which is equivalent to
them being fixed under ω: ω(k) = +k.

It has been possible to extend the correspondence between the two models
to the inclusion of fermionic terms on both sides [15, 16, 17]. Moreover, [18]
found evidence for a nice compatibility between some high-level contributions
(height −115!) in the coset action, corresponding to imaginary roots,5 and
M -theory one-loop corrections to SUGRA11 , notably the terms quartic in the
curvature tensor. (See also [19] for a study of the compatibility of an under-
lying Kac–Moody symmetry with quantum corrections in various models.)

4 A New View of the (quantum) Fate of Space at a Cosmological Singularity
Let us now, following [10], sketch the physical picture suggested by the grav-
ity/coset correspondence. That is, let us take seriously the idea that, upon
approaching a spacelike singularity, the description in terms of a spatial contin-
uum, and space–time based (quantum) field theory breaks down, and should
be replaced by a purely abstract Lie algebraic description. More precisely, we
suggest that the information previously encoded in the spatial variation of
the geometry and of the matter fields gets transferred to an infinite tower of
spatially independent (but time-dependent) Lie algebraic variables. In other
words, we are led to the conclusion that space actually “disappears” (or “de-
emerges”) as the singularity is approached.6 In particular (and this would
be bad news for Gabriele’s pre–big bang scenario), we suggest no (quantum)
“bounce” from an incoming collapsing universe to some outgoing expanding
universe. Rather it is suggested that “life continues” for an infinite “affine
time” at a singularity, with the double understanding, however, that (i) life
continues only in a totally new form (as in a kind of “transmigration”) and
(ii) an infinite affine time interval (measured, say, in the coordinate t of (3)
with a coset lapse function n(t) = 1) corresponds to a sub-Planckian interval
of geometrical proper time.7
Let us also comment on some expected aspects of the “duality” between
the two models. It seems probable (from the AdS/CFT paradigm) that, even

4
One, however, expects the map between the two models to become spatially non-
local for heights ≥ 30.
5
i.e. such that (α, α) < 0, by contrast to the “real” roots, (α, α) = +2, which enter
the checks mentioned above.
6
We have in mind here a “big crunch”, i.e. we conventionally consider that we
are tending towards the singularity. Mutatis mutandis, we would say that space
“appears” or “emerges” at a big bang.
7
Indeed, it is found that the coset time t (with n(t) = 1) corresponds to a “Zeno-like”
gravity coordinate time (with rescaled lapse $\tilde N = N/\sqrt{g} = 1$) which tends
to +∞ as the proper time tends to zero.
Cosmological Singularities 947

if the equivalence between the “gravity” and the “coset” descriptions is for-
mally exact, each model has a natural domain of applicability in which the
corresponding description is sufficiently “weakly coupled” to be trustable as
is, even in the leading approximation. For the gravity description this domain
is clearly that of curvatures smaller than the Planck scale. One then expects
that the natural domain of validity of the dual coset model would correspond
(in gravity variables) to that of curvatures larger than the Planck scale. In ad-
dition, it is possible that the coset description should primarily be considered
as a quantum model, as now sketched.
The coset action (3) describes the classical motion of a massless particle on
the symmetric space E10 (R)/K(E10 (R)). Quantum mechanically, one should
consider a quantum massless particle, i.e., if we neglect polarization effects,8
a Klein–Gordon equation,

$\Box \Psi(\beta^a, \nu^\alpha) = 0$ , (4)

where $\Box$ denotes the (formal) Laplace–Beltrami operator on the infinite-dimensional
Lorentz-signature curved coset manifold E10 (R)/K(E10 (R)).
Equation (4) would apply to the case considered here of uncompactified M -
theory. In the case where all spatial dimensions are toroidally compactified,
it has been suggested [20, 21] that Ψ satisfy (4) together with a condition
of periodicity over the discrete group E10 (Z). In other words, Ψ would be a
“modular wave form” on E10 (Z)\E10 (R)/K(E10 (R)).
Let us emphasize (still following [10]) that all reference to space and time
has disappeared in (4). The disappearance of time is common between (4)
and the usual Wheeler-DeWitt equation in which the “wave function(al) of
the universe” Ψ [gij (x)] no longer depends on any extrinsic time parameter.
(As usual, one needs to choose among all the dynamical variables a specific
“clock field” to be used as an intrinsic time variable parametrizing the dy-
namics of the remaining variables.) The interesting new feature of (4) (when
compared to a Wheeler–DeWitt type equation) is the disappearance of any
notion of geometry gij (x) and its replacement by the infinite tower of Lie
algebraic variables (β a , ν α ).9 This quantum de-emergence of space, and the
emergence of an infinite-dimensional symmetry group E10 10 which deeply in-
tertwines space-time with matter degrees of freedom, might be radical enough
to get us closer to an understanding of the fate of space–time and matter at
cosmological singularities.

8
Actually, [15, 16, 17] indicate the need to consider a spinning massless particle,
i.e. some kind of Dirac equation on E10 /K(E10 ).
9
Note that this is conceptually very different from the E11 -based proposal of [22].
10
Let us note that E10 enjoys a similarly distinguished status among the (infinite-
dimensional) hyperbolic Kac–Moody Lie groups as E8 does in the Cartan–Killing
classification of the finite-dimensional simple Lie groups [23].

Acknowledgments
It is a pleasure to dedicate this review to Gabriele Veneziano, a dear friend
and a great physicist from whom I have learned a lot. I am also very grateful
to my collaborators Marc Henneaux and Hermann Nicolai for the (continuing)
E10 adventure. I also wish to thank Maurizio Gasperini and Jnan Maharana
for their patience.

References
1. M. Gasperini, G. Veneziano: Phys. Rep. 373, 1 (2003)
2. A. Buonanno, T. Damour, G. Veneziano: Nucl. Phys. B 543, 275 (1999)
3. T. Damour, M. Henneaux: Phys. Rev. Lett. 85, 920 (2000)
4. T. Damour, M. Henneaux: Phys. Lett. B 488, 108 (2000) [Erratum-ibid. B 491,
377 (2000)]
5. V. A. Belinsky, I. M. Khalatnikov, E. M. Lifshitz: Adv. Phys. 19, 525 (1970)
6. T. Damour, M. Henneaux: Phys. Rev. Lett. 86, 4749 (2001)
7. T. Damour, M. Henneaux, B. Julia, H. Nicolai: Phys. Lett. B 509, 323 (2001)
8. T. Damour, M. Henneaux, H. Nicolai: Phys. Rev. Lett. 89, 221601 (2002)
9. O. Aharony, S. S. Gubser, J. M. Maldacena, H. Ooguri, Y. Oz: Phys. Rep. 323,
183 (2000)
10. T. Damour, H. Nicolai: “Symmetries, Singularities and the De-emergence of
Space”, essay submitted to the Gravity Research Foundation (March 2007)
11. T. Damour, M. Henneaux, H. Nicolai: Class. Quant. Grav. 20, R145 (2003)
12. B. Julia: in Lectures in Applied Mathematics, Vol. 21 (1985), AMS-SIAM, p.
335; preprint LPTENS 80/16
13. C. M. Hull, P. K. Townsend: Nucl. Phys. B 438, 109 (1995)
14. T. Damour, H. Nicolai: arXiv:hep-th/0410245
15. T. Damour, A. Kleinschmidt, H. Nicolai: Phys. Lett. B 634, 319 (2006)
16. S. de Buyl, M. Henneaux, L. Paulot: JHEP 0602, 056 (2006)
17. T. Damour, A. Kleinschmidt, H. Nicolai: JHEP 0608, 046 (2006)
18. T. Damour, H. Nicolai: Class. Quant. Grav. 22, 2849 (2005)
19. T. Damour, A. Hanany, M. Henneaux, A. Kleinschmidt, H. Nicolai: Gen. Rel.
Grav. 38, 1507 (2006)
20. O. J. Ganor: arXiv:hep-th/9903110
21. J. Brown, O. J. Ganor, C. Helfgott: JHEP 0408, 063 (2004)
22. P. C. West: Class. Quant. Grav. 18, 4443 (2001)
23. V. G. Kac: Infinite Dimensional Lie Algebras, 3rd edition (Cambridge University
Press, Cambridge, 1990)
Brane Inflation: String Theory Viewed from the Cosmos∗

S.-H. H. Tye

Newman Laboratory for Elementary Particle Physics, Cornell University, Ithaca,
NY 14853, USA

Abstract. Brane inflation is a specific realization of the inflationary universe scenario
in the early universe within the brane world framework in string theory. The
naturalness and robustness of this realistic scenario are explained. Its predictions
on the cosmological observables in the cosmic microwave background radiation, es-
pecially possible distinct stringy features, such as large non-Gaussianity or large
tensor mode that deviates from that predicted in the slow-roll scenario, are dis-
cussed. Stringy Kaluza–Klein (KK) modes as hidden dark matter is also a possi-
bility. Another generic consequence of brane inflation is the production of cosmic
strings towards the end of inflation. These cosmic strings are nothing but super-
strings stretched to cosmological sizes. The properties of these cosmic superstrings
and their subsequent cosmological evolution into a scaling network open up their
possible detections in the near future, via cosmological, astronomical and/or gravita-
tional wave measurements. At the moment, cosmological data are already imposing
strong constraints on the details of the scenario. Finding distinctive stringy signa-
tures in cosmological observations will go a long way in revealing the specific brane
inflationary scenario and validating string theory as well as the brane world picture.
Precision measurements may even reveal the structures of the flux compactification.
Irrespective of the final outcome, we see that string theory is confronting data and
making predictions.

1 Introduction
It is believed by many that superstring theory is the fundamental theory of all
matter and forces, including a consistent quantum gravity sector. In fact, it is
the only known theory that incorporates general relativity in a quantum me-
chanically consistent way around the near Minkowski spacetime that describes
our universe today. The theory is also extraordinarily intricate, revealing nu-
merous deep and rich mathematical and physical structures. However, the
string scale is believed to be so high that it is almost hopeless to find stringy
signatures in any high-energy experiment in the conceivable future. Since
∗
In celebration of the 65th birthday of Gabriele Veneziano, teacher and friend.

S.-H. Henry Tye: Brane Inflation: String Theory Viewed from the Cosmos, Lect. Notes Phys.
737, 949–974 (2008)
DOI 10.1007/978-3-540-74233-6_26
© Springer-Verlag Berlin Heidelberg 2008

such a high-energy scale was probably once reached in the early universe, it is
natural to look for stringy signatures in cosmology. Looking towards the sky
for information and tests on fundamental physics has a long tradition. This
follows the route taken by, for example, the discovery of Newton’s gravita-
tional force law and Einstein’s theory of general relativity.
The inflationary universe was proposed to solve a number of fine-tuning
problems such as the flatness problem, the horizon problem and the defect
problem [1]. Besides providing an origin for the hot big bang (the ultimate
free lunch), its prediction of an almost scale-invariant density perturbation
power spectrum (which is responsible for structure formation in our universe)
has received strong observational support from the temperature fluctuation
and polarization in the cosmic microwave background radiation (CMBR), e.g.,
COBE [2] and WMAP [3]. However, the origin of the key ingredients of the
inflationary scenario, namely, the scalar field known as the inflaton and its po-
tential, remains undetermined. In this sense, the inflationary universe scenario
is considered by many to be a paradigm or framework, not quite a theory. As
the cosmological data keep improving in a very impressive fashion, it becomes
urgent to find a specific model that has a solid theoretical foundation.
If string theory is the theory of everything, we should be able to find a
natural inflationary scenario there. This will allow us to identify the inflaton
and its properties, while at the same time cosmological measurements will
help us to determine the precise stringy description of our universe. With
some luck, we may even find distinct stringy signatures in this framework in
the cosmological data to confirm our faith in the theory. Since the inflationary
scale turns out to be comparable to the string scale, such an investigation
is clearly very worthwhile. If the scenario is natural, one should be able to
explain why many e-folds of expansion are generic (without fine-tuning). A
good test requires the scenario/model to be over-constrained, i.e., the number
of measurements should eventually exceed the number of parameters in the
model. We shall explain how (and in what sense) brane inflation, a specific
realization of the inflationary universe scenario in the early universe within
the brane world framework in string theory, satisfies these two criteria; that
is, it is both natural and testable.
Since the discovery of D-branes in string theory [4], a natural realization
of nature in string theory is the brane world. In the brane world, all standard
model particles are open string modes. Since each end of an open string must
end on a brane, the standard model (SM) particles (being light) are stuck
on a stack of Dp-branes, where 3 of the p dimensions span our universe of
standard model particles, while the remaining p − 3 dimensions are wrapping
some cycles in the bulk (the remaining 9 − p spatial dimensions) where closed
string modes such as the graviton live (Fig. 1a). Suppose our universe today
is described by such a brane world solution in string theory. A simple, realistic
and well-motivated inflationary model is brane inflation, where the inflaton
is simply the position of a Dp-brane moving in the bulk [5]. In the simple D3-
D̄3-brane inflation [6], inflation takes place while the D3-brane is moving

Fig. 1. (a) The brane world scenario. Here, as light open string modes with each
end of an open string ending on a brane, the standard model particles are stuck to
the branes, while closed string modes such as a graviton are free to roam the bulk.
(b) During brane inflation, a tiny region of the branes (i.e., our universe) grows
by an exponentially large factor. Fluctuations such as defects, radiation or matter
will be inflated away. Also, the differences in spacing between branes as well as the
curvature decrease rapidly

towards the D̄3-brane (i.e., anti-D3-brane, which has the same tension but
opposite RR charge as a D3-brane) inside the six-dimensional bulk (due to
the attractive force between them), and inflation ends when they collide and
annihilate each other. Fluctuations that are present before inflation, such as
defects, radiation or matter, will be inflated away (see Fig. 1b). Here, the
relative D3-D̄3-brane position φ is the inflaton and the inflaton potential
V (φ) comes from their tensions and interactions. The annihilation releases
the brane tension energy that heats up the universe to start the hot big bang
epoch. Typically, strings of all sizes and types may be produced during the
collision. Large fundamental strings and/or D1-branes (or D-strings) become
cosmic superstrings.
In a more realistic brane world scenario, all moduli of the six extra spatial
dimensions are dynamically stabilized via flux compactification [7, 8], and the
presence of RR fluxes introduces intrinsic torsion and warped geometry, so
there are regions in the bulk with warped throats. They are six-dimensional
versions of the Randall–Sundrum (RS) warped geometry. There are numer-
ous such solutions in string theory, some with a small positive vacuum energy
(cosmological constant). This is known as the string landscape. Presumably
the standard model particles are open string modes; they can live either on
D7-branes wrapping a 4-cycle in the bulk or (anti-)D3-branes at the bottom

of a warped throat (Fig. 2). In the early universe, there is an extra pair of
D3-D̄3-branes. Due to the attractive forces present, the D̄3-brane is expected
to sit at the bottom of a throat. Here again, inflation takes place as the
D3-brane moves down the throat towards the D̄3-brane, and inflation ends
when they collide and annihilate each other, allowing the universe to settle
down to the string vacuum state that describes our universe today. This is the
KKLMMT scenario [9]. Although the original toy model version encounters
some fine-tuning problems, the scenario becomes substantially better as we
make it more realistic: It is surprisingly robust, that is, many e-folds of in-
flation are a generic feature. This is very encouraging. Briefly speaking, there
are two key stringy ingredients that come into play:
• Because of the warped geometry, a consequence of flux compactification,
a mass M in the bulk becomes $h_A M$ at the bottom of a warped throat, where
$h_A \ll 1$ is the warp factor (Fig. 2). This warped geometry tends to flatten,
by orders of magnitude, the inflaton potential V (φ), so the attractive D3-
D̄3-brane potential is rendered exponentially weak in the warped throat. The
potential takes the form

$V(\phi) = V_K + V_A + V_{D\bar D} = \frac{1}{2}\beta H^2\phi^2 + 2T_3 h_A^4\left(1 - \frac{1}{N_A}\frac{\phi_A^4}{\phi^4}\right) + \ldots$ (1)

where the first term $V_K(\phi) = m^2\phi^2/2 + \ldots$ receives contributions from the
Kähler potential and various interactions in the superpotential [9] as well as

Fig. 2. A pictorial sketch of the compactified bulk. Besides some warped throats,
there are D7-branes wrapping a 4-cycle. The D3-D̄3-brane inflationary scenario in
a generic flux compactified six-dimensional bulk. The blue dots stand for mobile
D3-branes, while the red dots are D̄3-branes sitting at the bottoms of throats. After
inflation and the annihilation of the last D3-brane with the D̄3-brane in A-throat,
the remaining D̄3-branes in S-throat may be the standard model branes

possible D-terms [10]. H is the (initial) Hubble parameter, so this interaction
term behaves like a conformal coupling. Here, β, and more generally $V_K$,
probes the structure of the flux compactification [11, 12]. The warp factor depends
on the details of the throat. Crudely, $h(\phi) \sim \phi/\phi_{\rm edge}$, where $\phi = \phi_{\rm edge}$
when the D3-brane is at the edge of the throat, so $h(\phi_{\rm edge}) \simeq 1$. At the bottom
of the throat, where $\phi = \phi_A$, $h_A = h(\phi_A) = \phi_A/\phi_{\rm edge}$. $T_3$ is the D3-brane
tension and the effective tension is warped to a very small value $T_3 h_A^4$ (as
we shall see, $h_A \sim 10^{-2}$). The attractive gravitational (plus RR) potential
is further warped to a very small value: $N_A \gg 1$ is the D3 charge of the
throat. If the last 55 e-folds of inflation take place inside the throat, then
$\phi_{\rm edge} \ge \phi \ge \phi_A$ during this period of inflation. Note that β is expected to be
of order unity, β ∼ 1. Despite the warped geometry effect, the above potential
yields enough inflation only if β is small enough, $\beta \lesssim 1/5$ [13]. We see in Fig. 3
that the data can easily over-constrain the model. However, this is not the
end of the story.

• Because the inflaton is an open string mode, its kinetic term appears
inside the Dirac–Born–Infeld (DBI) action. For slow-roll, this term reduces to
the usual kinetic term. However, when the inflaton is moving relativistically,
the full effect of the DBI action must be taken into account [14]. The DBI
action in brane inflation leads to the “Lorentz factor”

$\gamma(\phi) = \frac{1}{\sqrt{1 - \dot\phi^2/T(\phi)}}$ , (2)

where $T(\phi) = T_3 h(\phi)^4$ is the warped D3-brane tension and the limiting speed,
$c(\phi) = \sqrt{T(\phi)}$, is decreasing rapidly as the D3-brane moves down the throat,
$c \sim \phi^2 \to \phi_A^2$. This means the speed $\dot\phi$ of φ is limited by the rapidly
decreasing limiting speed irrespective of the steepness of the inflaton potential.
In the warped throat, even for a steep potential, the inflaton motion must
slow down considerably towards the bottom of the throat as it is becoming
ultra-relativistic, so it takes a while before it reaches the bottom of the throat.
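
The role of the warped limiting speed can be made concrete with a minimal numerical sketch. It assumes the crude warp factor h(φ) ∼ φ/φedge quoted above and works in units where T3 = 1; the function names and sample values are illustrative choices and are not taken from the references.

```python
import math


def warp_factor(phi, phi_edge):
    """Crude warp factor h(phi) ~ phi / phi_edge (the rough form quoted in the text)."""
    return phi / phi_edge


def warped_tension(phi, phi_edge, t3=1.0):
    """Warped D3-brane tension T(phi) = T3 * h(phi)**4."""
    return t3 * warp_factor(phi, phi_edge) ** 4


def limiting_speed(phi, phi_edge, t3=1.0):
    """Limiting speed c(phi) = sqrt(T(phi)), which scales like phi**2."""
    return math.sqrt(warped_tension(phi, phi_edge, t3))


def lorentz_factor(phi_dot, phi, phi_edge, t3=1.0):
    """Lorentz factor gamma = 1 / sqrt(1 - phi_dot**2 / T(phi)) of eq. (2)."""
    ratio = phi_dot ** 2 / warped_tension(phi, phi_edge, t3)
    if ratio >= 1.0:
        raise ValueError("phi_dot must stay below the limiting speed sqrt(T)")
    return 1.0 / math.sqrt(1.0 - ratio)


if __name__ == "__main__":
    phi_edge = 1.0  # illustrative units, not a value from the text
    for phi in (1.0, 0.5, 0.1):
        c = limiting_speed(phi, phi_edge)
        g = lorentz_factor(0.6 * c, phi, phi_edge)
        print(f"phi = {phi:4.2f}  limiting speed = {c:.4f}  gamma at 0.6 c = {g:.3f}")
```

In this sketch, moving from φ = φedge to φ = 0.1 φedge lowers the limiting speed by a factor of 100, which is the effect that slows the brane near the bottom of the throat.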
As a result, the warped geometry of the throat combined with the DBI
action generically allows for many e-folds of inflation. Robustness of the overall
scenario suggests that we are on the right track. A few comments are
in order here:
(i) Since the inflaton is an open string mode that stretches between the branes,
it no longer exists as a physical degree of freedom after the D3-D̄3-brane
annihilation.
(ii) The above scenario does not guarantee enough inflation; however, it does
yield enough inflation for a large region in the parameter space. Once CMBR
and other cosmological data are introduced, constraints on the parameters
will sharpen the predictions. At the moment, data are already putting strong
constraints on the parameter space. Future data will constrain the parameters
further and tell us about the structure of the bulk as well as the throat.
Fig. 3. The predictions of the slow-roll brane inflationary scenario [9, 13], each
plotted as a function of β: the cosmic string tension μ (panel Log(Gμ)), the power
spectrum index ns (panel ns − 1), the ratio r of the tensor to the scalar density
perturbations (panel Log r) and the running of ns (panel d ns /d ln k)

(iii) The presence of a D3-D̄3-brane pair explicitly breaks supersymmetry.
Although this breaking is large, it is very soft, as we shall see. Furthermore,
the warping exponentially suppresses the breaking terms. So it is justified
to study the scenario within the supergravity approximation when the string
scale is much smaller than the Planck scale.
(iv) The interplay between cosmology and gauge/gravity duality should re-
ceive more attention, since cosmological data may provide valuable informa-
tion about strongly coupled gauge theory (via structures of throats and cosmic
string properties).
(v) There are many variations of the above scenario. For large m [15], or for
a modified warped throat [16], enough inflation can be obtained without the
D̄3-brane. Multi-throat and/or multi-brane scenarios are also very easy to
envision [17, 18]. It is beyond the scope of this review to discuss the large set of
multi-brane inflationary models under the name “assisted inflation”. Clearly,
they should be fully explored.
(vi) The six-dimensional (or seven-dimensional in M theory) compactification
typically introduces many light closed string modes known as moduli. The re-
sulting effective potential involving these bulk modes is in general complicated
enough so, with some fine-tuning, one can find a flat enough direction to carry
out inflation. It is entirely possible that nature takes this path and moduli
inflation should be and has been extensively studied. However, the moduli
inflationary scenario does not seem to have distinct stringy signatures, or to be
as compelling and predictive as brane inflation.

The rest of this chapter discusses the various aspects of the above scenario:
• Inflation. For small m or β, the model reduces to the slow-roll scenario
(Fig. 3). In this case, WMAP and other cosmological data impose the constraint
β < 0.05 [19]. That is, 0.05 ≲ β ≲ 0.2 is ruled out. For large inflaton
mass m, the DBI action comes into play and new stringy features such as non-
Gaussianity will appear [15]. Furthermore, the three-point correlation function
(or bispectrum) has a distinct distribution that is clearly different from what
may appear in a slow-roll scenario [20, 21]. For intermediate values of m, the
tensor mode perturbation may be large [22]. It can also be distinguished from
that coming from the slow-roll scenario. This is encouraging since, unlike the
scalar mode perturbation, the metric perturbation directly probes the very
early universe.
• Heating at the end of inflation. The D3-D̄3-brane annihilation produces
only closed strings, with the graviton as the lightest mode. The transfer of
energy from closed string modes to the standard model particles which are
open string modes seems problematic, since gravitational radiation can make
up at most a few percent of the density of the standard model particles during
big bang nucleosynthesis. Naively, this problem seems most severe if inflation
takes place in one throat (the A-throat), while the standard model branes
are in another throat (the S-throat). It is satisfying that an analysis of what
happens indicates that heating will work out nicely. In fact, the situation im-
proves dramatically when one considers a realistic (i.e., flux compactification)
scenario instead of a toy model version based on the Randall–Sundrum sce-
nario. It also offers some possibilities of specific features (such as KK modes
as hidden dark matter [23]) that may be tested.
• Production and properties of cosmic strings.
• Evolution of the cosmic string network and its possible detection. Here, we
discuss our present knowledge of the scaling cosmic string network and some
of its observational consequences.
The history of cosmic strings is a long one [24, 25]. First proposed by Kibble
and others, they were applied to generate the density perturbations that seeded
structure formation. This requires a tension of $G\mu \sim 10^{-6}$, which was ruled
out by the CMBR data. The possibility of superstrings as cosmic strings
was first studied by Witten [26]. However, in the heterotic string framework,
$G\mu \sim 10^{-3}$, which is far too big to be compatible with observations. In any
case, either these cosmic strings would have been inflated away, or they are
unstable to breakage. In brane inflation in Type IIB theory, we see that they are
produced after inflation [27, 28], with much lower tensions due to the warped
geometry [9, 13]. They are stable under a variety of situations [29, 30], so
they can survive to form a scaling cosmic string network. Cosmic superstrings
will also have a non-trivial tension spectrum, and junctions can appear [29]. Of
course, the presence of a cosmic string network is not guaranteed. However, if
they are around, the chances of detecting them are very promising. Irrespec-
tive of the final outcome, we see that string theory is confronting data and
making predictions.

2 Brane Inflation
It is possible (in fact one may argue likely) that the inflaton potential has
relatively flat directions outside the throat, allowing substantial inflation. Un-
fortunately, the precise potential is rather dependent on the detailed structures
of the compactification and remains to be explored more carefully. To avoid
this issue, we shall assume here that the D3-brane starts close to or inside the
throat. If we have enough e-folds in the throat, then the physics outside the
throat need not concern us. As explained earlier, this is an easy condition to
satisfy.
First, let us consider the potential V (y) per unit volume between a parallel
Dp-D̄p-brane pair separated by a distance y, where the Dp-branes are BPS
with respect to each other. We shall consider p < 7, where Tp is the Dp-
brane tension. We may view V (y) as coming from the closed string exchanges
between the branes (Fig. 4a). In the closed string channel, at large y, when
the massive mode exchanges are Yukawa-suppressed,

$V(y) \simeq -\frac{\kappa^2 T_p^2}{\pi^{(9-p)/2}}\,\Gamma\!\left(\frac{7-p}{2}\right)\frac{1}{y^{7-p}}$ , (1)

where $\kappa^2 = 8\pi G_{10}$ and $T_p = (2\pi\alpha')^{-(p+1)/2}$ is the Dp-brane tension. Here
$\alpha' = m_s^{-2}$ is the Regge slope and $m_s$ is the string scale. For p < 7, V (y)
vanishes as y → ∞. This is simply the attractive gravitational (NS-NS) plus
massless RR interaction between the branes. At short distances, the exchanges
of the massive closed string modes are not Yukawa-suppressed and the evaluation
of V (y) is somewhat subtle. Because of the exponentially growing degeneracy
(as a function of mass) in the closed string spectrum, a naive summation
yields an oscillating divergent result. Looking at Fig. 4a, we see that we may
evaluate V (y) as a one-loop radiative correction in the open string channel by
including the whole tower of open string modes. The particular way of group-
ing the contributions should be dictated by the soft supersymmetry breaking
[31, 32]. When the two branes are parallel, there is no potential between them

Fig. 4. (a) The exchange of closed strings between two branes. In the dual channel,
this describes the one-loop radiative effect of the open strings stretching between
two branes. (b) The potential V (y) between the D3-brane and the D̄3-brane due to
the diagram (a), as a function of the separation y for the brane pair, where α′ = 1
[32]. The dashed curve is the imaginary part of V (y). The thick line is the real part
of V (y). The Coulombic potential (the thin red curve) is shown for comparison.
(c) The potential V (φ, T ) as a function of the inflaton y ∼ φ and the tachyon
expectation value T [33]. Brane inflation is a hybrid inflationary scenario

because of supersymmetry. Each mass level contains a set of supermultiplets.

The contribution to the potential V (y) from the open string bosons is exactly
cancelled by the contribution from the open string fermions, mass level by
mass level. Now we consider the D̄p-brane as a Dp-brane rotated by π. Su-
persymmetry broken by this rotation is large, in the sense that level crossings
take place. However, the supersymmetry breaking is very soft, that is, the open
string spectrum follows the spectral flow. For each broken supermultiplet,

$\sum_i (-1)^F m_i^{2n} = 0$ , n = 1, 2, 3, (2)

where i runs over the spectrum in each large but “softly broken” supermultiplet
(and F is the fermion number). Keeping this grouping in the sum over
the open string spectrum yields a finite V (y) (Fig. 4b). This very soft SUSY
breaking also justifies the continued use of the supergravity formulation.
In the open string one-loop channel, a tachyon appears at short distances,

$\alpha' m^2_{\rm tachyon} = \frac{y^2}{4\pi^2\alpha'} - \frac{1}{2}$ , (3)

which contributes an imaginary part to V (y). We see that the Coulombic
form is a very good approximation before the tachyon appears, by which
time inflation is over anyway. With $\phi = \sqrt{T_3}\, y$, the tachyon appears when
φ = φE , and the annihilation process begins. The potential V (φ, T ) in Fig.
4c is evaluated using boundary superstring field theory method [33]. So we
have φi > φ55 > φE > φA , where φi is the initial D3-brane position when
inflation starts and φ55 is the value of φ at 55 e-folds before inflation ends. So
the scenario is a hybrid inflation. In the more realistic KKLMMT scenario,
V (φ) becomes VDD̄ (φ) given in (1).
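
As a worked illustration of (3), setting the tachyon mass squared to zero gives y² = 2π²α′, i.e. a critical separation of π√(2α′). The short sketch below evaluates this. It assumes the flat, unwarped formula (3) as written and the relation φ = √T3 y; the function names are hypothetical and chosen only for this example.

```python
import math


def tachyon_mass_squared(y, alpha_prime=1.0):
    """m^2 of the lowest open string mode, from alpha' m^2 = y^2/(4 pi^2 alpha') - 1/2."""
    return (y ** 2 / (4.0 * math.pi ** 2 * alpha_prime) - 0.5) / alpha_prime


def critical_separation(alpha_prime=1.0):
    """Separation where m^2 changes sign: y_c = pi * sqrt(2 * alpha')."""
    return math.pi * math.sqrt(2.0 * alpha_prime)


def phi_at_tachyon_onset(t3, alpha_prime=1.0):
    """Inflaton value phi_E = sqrt(T3) * y_c, using phi = sqrt(T3) * y."""
    return math.sqrt(t3) * critical_separation(alpha_prime)


if __name__ == "__main__":
    y_c = critical_separation()  # alpha' = 1, as in Fig. 4b
    print("critical separation y_c:", y_c)
    print("m^2 at y = 0:", tachyon_mass_squared(0.0))
    print("m^2 at y = 2 y_c:", tachyon_mass_squared(2.0 * y_c))
```

For separations larger than y_c the mode is massive and the Coulombic form applies; below y_c the mass squared is negative and the annihilation process begins.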
Warped throats such as the Klebanov–Strassler (KS) warped deformed
conifold [34] are generic in any flux compactification that stabilizes the mod-
uli. The DBI action for the inflaton field follows simply because the inflaton
is an open string mode. By now it is clear that enough inflation is generic in
this scenario, thanks to (i) the warped geometry of the throat in a realistic
string compactification, which tends to flatten (by orders of magnitude) the
attractive Coulombic potential between the D3-brane and the D̄3-brane [9].
The warped geometry also reduces the vacuum energy that breaks supersym-
metry, so the supergravity approximation is expected to be valid. (ii) The
warped geometry of the throat combined with the DBI action, which forces
the inflaton to move slowly as it falls towards the bottom of the throat, as
pointed out by Silverstein and Tong [14]. In fact, one may get enough e-folds
just from around the bottom of the throat [35].
Inside the throat, the metric takes the form

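$ds^2 = h^2(r)\left(-dt^2 + a(t)^2 d\mathbf{x}^2\right) + h^{-2}(r)\left(dr^2 + r^2 ds_5^2\right)$ , (4)

and the potential takes the simple approximate form (1),

$V(\phi) = V_K(\phi) + V_0 + V_{D\bar D}(\phi) \simeq \frac{m^2}{2}\phi^2 + V_0\left(1 - \frac{v V_0}{4\pi^2}\frac{1}{\phi^4}\right)$ , (5)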

where the constant term $V_0 = 2T_3 h_A^4 = 2T_3 h(\phi_A)^4$ is the effective vacuum
energy. The factor v depends on the properties of the warped throat, with
v = 27/16 for the KS throat. With some warping (say, $h_A \sim 1/5$ to $10^{-3}$),
the attractive Coulombic potential $V_C(\phi)$ can be very weak (i.e., flat). The
quadratic term $V_K(\phi)$ receives contributions from a number of sources and
is rather model-dependent. However, $m^2$ is expected to be comparable to
$H_0^2 = V_0/(3M_p^2)$, where $M_p$ is the reduced Planck mass ($G^{-1} = 8\pi M_p^2$). This
sets the canonical value for the inflaton mass $m_0 = H_0$ (which turns out to
be around $10^{-7} M_p$).
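
To see why a mass of order H0 is a problem for slow-roll, the sketch below evaluates the potential (5) and the standard slow-roll parameter η = Mp² V″/V, which is a textbook definition and not a formula from this text. It assumes m² = βH0², obtained by comparing the quadratic terms of the two forms of the potential with H = H0, and it uses illustrative numbers in units of Mp = 1. Where the Coulombic term is negligible one finds η ≈ β/3, so β ∼ 1 spoils slow-roll.

```python
import math


def hubble_squared(v0, mp=1.0):
    """H0^2 = V0 / (3 Mp^2), with Mp the reduced Planck mass."""
    return v0 / (3.0 * mp ** 2)


def potential(phi, m2, v0, v=27.0 / 16.0):
    """Approximate throat potential of eq. (5); v = 27/16 is the KS value."""
    return 0.5 * m2 * phi ** 2 + v0 * (1.0 - v * v0 / (4.0 * math.pi ** 2 * phi ** 4))


def potential_second_derivative(phi, m2, v0, v=27.0 / 16.0):
    """Analytic second derivative of eq. (5): V'' = m^2 - 5 v V0^2 / (pi^2 phi^6)."""
    return m2 - 5.0 * v * v0 ** 2 / (math.pi ** 2 * phi ** 6)


def eta(phi, m2, v0, v=27.0 / 16.0, mp=1.0):
    """Standard slow-roll parameter eta = Mp^2 V'' / V (textbook definition)."""
    return mp ** 2 * potential_second_derivative(phi, m2, v0, v) / potential(phi, m2, v0, v)


if __name__ == "__main__":
    v0 = 3e-14    # illustrative: gives H0 = 1e-7 in units of Mp
    beta = 0.1    # illustrative value of beta
    m2 = beta * hubble_squared(v0)  # assumption: m^2 = beta * H0^2
    phi = 0.1     # illustrative field value where the Coulombic term is negligible
    print("H0 / Mp:", math.sqrt(hubble_squared(v0)))
    print("eta:", eta(phi, m2, v0), " beta / 3:", beta / 3.0)
```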
The scale of the throat R is given by
$R^4 = \frac{27}{4}\pi g_s N_A \alpha'^2$ . (6)

For a generic value of m, usual slow-roll inflation will not yield enough e-folds
of inflation. Reference [13] shows that $m \lesssim m_0/3$ will be needed. Naïvely,
a substantially larger m will be disastrous, since the inflaton will roll fast,
resulting in very few e-folds in this case. However, for a fast-roll inflaton, string
theory dictates that we must include higher powers of the time derivative of
φ, in the form of the DBI action

$S = -\int d^4x\, a^3(t)\left[T\sqrt{1 - \dot\phi^2/T} + V(\phi) - T\right]$ , (7)

where $T(\phi) = T_3 h(\phi)^4$ is the warped D3-brane tension at φ. For the usual
slow-roll, $T\sqrt{1 - \dot\phi^2/T} - T \simeq -\dot\phi^2/2$, reproducing the standard kinetic term.
It is quite amazing that the DBI action now allows enough e-folds even when
the inflaton potential is steep [14, 15]. As the D3-brane approaches the D̄3-brane,
$\phi$ and $T(\phi)$ decrease, and $h(\phi) \to h(\phi_A)$. The key is that $\dot{\phi}$ is bounded by the
limiting speed, and this bound gets tighter as $T(\phi)$ decreases. This happens
even if the potential is steep, for example, when $m > H_0$. So the inflaton rolls
slowly either because the potential is relatively flat (so $\gamma \simeq 1$ in the usual
slow-roll case) or because the warped tension $T(\phi)$ is small (so $1 \ll \gamma < \infty$).
As a result, it can take many e-folds for $\phi$ to reach the bottom of the throat.
When $\gamma \gg 1$, the kinetic energy is enhanced by a Lorentz factor of $\gamma$. Note
that the inflaton is actually moving slowly down the throat even in the
ultra-relativistic limit. However, the characteristics of this scenario are very different
from the usual slow-roll limit, where $\gamma \simeq 1$. To draw a distinction, we call this
the ultra-relativistic regime.
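
The speed limit and the slow-roll limit of the DBI kinetic term can be illustrated numerically. The following is a minimal sketch, assuming the standard definition $\gamma = 1/\sqrt{1 - \dot{\phi}^2/T}$ for the Lorentz factor (the text above uses $\gamma$ without spelling this out); the function names and the sample values are purely illustrative.

```python
import math

def lorentz_factor(phi_dot, T):
    """Assumed definition: gamma = 1/sqrt(1 - phi_dot^2/T).

    The square root in the DBI action is real only for phi_dot^2 < T,
    which is the limiting speed set by the warped tension T(phi).
    """
    x = phi_dot**2 / T
    if x >= 1.0:
        raise ValueError("phi_dot^2 must stay below the warped tension T")
    return 1.0 / math.sqrt(1.0 - x)

def dbi_kinetic(phi_dot, T):
    """Kinetic part of the DBI Lagrangian, -(T*sqrt(1 - phi_dot^2/T) - T)."""
    return T - T * math.sqrt(1.0 - phi_dot**2 / T)

# Slow-roll limit: the DBI kinetic term approaches phi_dot^2 / 2.
print(dbi_kinetic(1e-3, 1.0), 0.5 * 1e-3**2)

# Smaller warped tension at fixed phi_dot pushes gamma up.
for T in (1.0, 0.5, 0.37):
    print(T, lorentz_factor(0.6, T))
```

Lowering $T$ at fixed $\dot{\phi}$ drives $\gamma$ up, which is the sense in which the bound tightens as the brane moves down the throat.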
In general, there are three parameters, namely, m, λ and φA (note that
V0 is a function of λ and φA ), plus the constraint that the D3-brane should
be inside the throat. We find that the power spectrum can be red-tilted in all
three scenarios.
(1) $\beta \ll 1$, $\gamma \simeq 1$, the slow-roll case, when $m^2 \simeq 0$. Here, there are essentially
two parameters: $m$ and $V_0$. After fitting the COBE density perturbation data
[2], the predictions reduce to a one-parameter analysis in $\beta$ [13].
For small $\beta$, $n_s \sim 0.98 + \beta$, $\log r \sim -8.8 + 60\beta$, $\log G\mu \sim -9.4 + 30\beta$. The
cosmological data restrict the relevant range to $0 \le \beta < 0.05$ [19].
(2) $\beta \sim 1$, $\gamma \simeq 1$ at $N_e \sim 55$, but $\gamma$ increases to a large value towards the end of
inflation; this corresponds to some intermediate values of $m^2$. In this case, the
tensor mode can be large, i.e., large enough to saturate the present observational
bound $r < 0.3$ [3]. Here, the DBI action introduces a deviation from the slow-roll
relation between $r$ and the tensor power spectrum index $n_t$ [22],

$$n_t = -\frac{r}{8}\,\frac{\gamma}{1 - \epsilon - \kappa}, \tag{8}$$
where $\epsilon$ is the usual slow-roll parameter divided by $\gamma$ and $\kappa$ measures the running
of $\gamma$. For large $\phi$, the parameterization of the potential should probably
include a $\phi^4$ term.
(3) $\beta \gg 1$, $\gamma$ is large throughout. In this ultra-relativistic case, $m$ is large so
$V_K$ dominates (i.e., $V_0$ can be ignored), and the model is again reduced to the
above three parameters before imposing the COBE normalization. In this scenario,
ensuring that all 55 e-folds of inflation take place while the D3-brane
is inside the throat becomes a strong constraint; that is, the "initial" position
$\phi_i$ (at 55 e-folds before the end of inflation) should satisfy $\phi_i \le \phi_e$, where
$\phi_e$ is the value at the edge of the throat, i.e., $h(\phi_e) \simeq 1$. To implement this
condition, we need to introduce the D3-brane tension $T_3$, or the string scale
$\alpha'$. Since $V_0$ can be ignored in this case, one may obtain all the inflationary
properties without the D̄3-brane. $|f_{NL}| \simeq 0.32\gamma^2 \lesssim 300$ yields $\gamma \lesssim 31$ [36].
However, one should check whether reheating or preheating can be successfully
realized in such a scenario. The structure of the non-Gaussianity from this UV
DBI model is different from that due to slow-roll. The three-point correlation
function $A(k_1, k_2)/k_1 k_2 k_3$ (where $k_1 + k_2 + k_3 = 0$) [20] is shown in Fig. 5.
Note that $n_s$ is quite sensitive to the warp factor. This point is clearly
illustrated by the two different predictions of $n_s$ using two different approximations
to the KS warp factor: an AdS cut-off (very slightly blue tilt) [22]
and a mass gap cut-off (red tilt) [35]. For large $R$, we have to consider a highly
orbifolded version of the throat in order to fit it inside the bulk.
Fig. 5. The shape of the three-point correlation function in the DBI model [20]. For
comparison, the shape at the upper left corner is (negative of) that from a standard
slow-roll model

(4) For tachyonic inflaton mass ($m^2 < 0$), the scenario becomes the multi-throat
brane inflation scenario proposed by Chen [17, 38]. The Coulombic
term $V_{D\bar{D}}$ is negligible and inflation takes place as the D3-brane moves out
of a throat (see Fig. 2). For small tachyonic mass, this is simply a slow-roll
model. This IR DBI inflation can happen when the inflaton mass takes a generic
value, $m \approx H$ ($\beta \approx 1$). The distance the inflaton travels during
inflation, $\Delta\phi \approx H R^2 \sqrt{T_3}$, is always sub-Planckian. This model may be realized
in a multi-throat compactification starting with a number of antibranes
settled down at the ends of various throats. These antibranes are classically
stable but can annihilate against the fluxes quantum mechanically [37]. The
end products of such a phase transition are many D3-branes in, say, the B-throat,
which is sufficiently long (typically more than twice as long as the
A-throat). The IR model predicts large non-Gaussianity with the same shape
as in the UV model. The difference is the running: $f_{NL} \approx 0.036\,\beta^2 N_e^2$, that
is, $f_{NL}$ decreases with $k$, while $f_{NL}$ increases with $k$ in the UV model. The
power spectrum index undergoes an interesting phase transition at a critical
e-fold from red ($n_s - 1 \approx -4/N_e$) at small scales to blue ($n_s - 1 \sim 4/N_e$) at
large scales [38, 39]. This transition is due to the Hagedorn phase when the
red-shifted string scale drops below the Hubble constant. If such a transition
falls into the observable range of CMBR, it predicts a large running of $n_s$
around the transition point, i.e., a large negative $dn_s/d\ln k$. Outside of this
transition region, $dn_s/d\ln k$ is unobservably small.
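
The numerical relations quoted for the scenarios above are easy to evaluate. The sketch below simply codes up the fitted slow-roll relations, the UV bound on $\gamma$ from $|f_{NL}|$, and the IR estimate of $f_{NL}$. It assumes that "log" in the slow-roll fits means the base-10 logarithm; all function names are illustrative and not taken from the cited references.

```python
import math

def slow_roll_predictions(beta):
    """Fits quoted for scenario (1); base-10 logarithms are an assumption."""
    n_s = 0.98 + beta
    r = 10 ** (-8.8 + 60 * beta)
    G_mu = 10 ** (-9.4 + 30 * beta)
    return n_s, r, G_mu

def gamma_max_uv(f_nl_bound=300.0, coeff=0.32):
    """Largest gamma allowed by |f_NL| ~ coeff * gamma^2 <= f_nl_bound."""
    return math.sqrt(f_nl_bound / coeff)

def f_nl_ir(beta, n_e):
    """IR DBI estimate, f_NL ~ 0.036 * beta^2 * N_e^2."""
    return 0.036 * beta**2 * n_e**2

for beta in (0.0, 0.025, 0.05):
    n_s, r, G_mu = slow_roll_predictions(beta)
    print(f"beta={beta:.3f}  n_s={n_s:.3f}  r={r:.2e}  G_mu={G_mu:.2e}")

print("UV bound on gamma:", gamma_max_uv())
print("IR f_NL at beta=1, N_e=55:", f_nl_ir(1.0, 55))
```

Across the allowed range $0 \le \beta < 0.05$, the fitted $r$ and $G\mu$ each span several orders of magnitude, which is why a single parameter still leaves a wide range of predictions.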
If the brane inflationary scenario is correct, it will provide a valuable probe
of both the origin of our early universe and the particular compactification
in string theory, i.e., where we are in the cosmic landscape. For example, the
inflaton is actually a six-component field. So far, we have only considered the
radial mode. When a 4-cycle is close to the A-throat, the symmetry of the
throat ($S^3 \times S^2$ for the KS case) would be broken by the 4-cycle's position,
shape and orientation, generating a richer inflaton potential [12]. This may
also tell us whether eternal inflation is happening or not. Since $\phi$ is bounded
by the size of the bulk, eternal inflation is far from a given in brane inflation
[40].

3 Graceful Exit
The crucial step that links the inflationary epoch to the hot big bang epoch
is the heating at the end of inflation. This is known as the graceful exit:
how can the inflationary energy be efficiently transferred to heat up
the standard model particles in a way that is compatible with the well-understood
late-time cosmological evolution? This is the heating problem (also called the
reheating or preheating problem). To see why this is quite a non-trivial issue,
we first look at the end process of brane inflation.
In the above brane inflationary scenario, inflation ends when the D3-brane
annihilates with the D̄3-brane. Significant insights have been gained into such
a process [41]. Tachyonic modes appear when the brane–antibrane distance
approaches the string scale and the annihilation process may be described by
tachyon rolling [42, 43]. (The decay width is signified by the imaginary part
of the potential $V(\phi)$.) Whether or not there are adjacent extra branes
surviving such an annihilation (e.g., a D3-brane colliding with a stack of D̄3-branes),
the initial end product is expected to be dominated by non-relativistic
heavy closed strings [44, 45, 46]. These will then decay to lighter closed strings,
light KK modes, gravitons and open strings. We know from observations that,
during big bang nucleosynthesis (BBN), the density of gravitons can be no
more than a few percent of the total energy density of the universe. The rest is
contributed by the standard model particles (mostly photons, neutrinos and
electrons), which are open strings attached to a stack of SM (anti-)branes. We
also know that the density of any non-relativistic relics can be no more than
about 10 times that of the baryons. Therefore, the question becomes how the
brane annihilation products, originally dominated by the closed string degrees
of freedom, can eventually become the required light open string degrees of
freedom living on the SM branes, with a negligible graviton density and a
non-lethal amount of stable relics. This question is particularly sharp in the
multi-throat scenario, where the inflationary branes annihilate in one throat
(A-throat), while the SM branes are sitting in another throat (S-throat). Let
us discuss this case and then comment on the other cases.
A number of studies have been done to address this heating problem
[47, 48, 49, 50]. An important observation is that, because the KK mode wave
function is peaked at the bottom of the throat, its interaction with particles
located there is much enhanced compared to that with the graviton, whose
wavefunction spreads throughout the bulk. This is essentially along the line
of Randall–Sundrum warped geometry. Because of this, the graviton emission
branching ratio during the brane decay and KK evolution is suppressed by
powers of warp factors [51]. In a realistic compactification, throats are typically
separated in the bulk, which tends to generate resonance effects in the
tunnelling from one throat to another. We expect the compactification volume
to be dominated by the bulk, with typical size $L \gg R$, another important
ingredient in the success of the graceful exit. Again, the realistic scenario of
heating improves in a number of ways over the RS scenario. The discussion
below follows [23] and relies on Fig. 2.
First, we note that the cross sections for KK self-interaction and KK
interactions with SM particles in a throat with size $R$ and a warp factor $h$
go like
$$\sigma \sim \left(\frac{L}{R}\right)^6 \frac{1}{M_P^2 h^2}. \tag{11}$$
This is much bigger than that for the graviton, where the corresponding
$\sigma \sim M_P^{-2}$. Note that the factor $(L/R)^6$ comes from the six-dimensional bulk. Next,
it is important to follow the thermal history of the KK modes as the universe
expands. Because of the above warped enhanced KK self-interactions, it is
easy to see that the KK modes become non-relativistic before they decouple.
So, instead of a tower of non-relativistic KK modes, only the lightest few stable
KK modes remain. As a result, their relic density is very much suppressed.
The qualitative picture of heating goes as follows.
Massive closed string modes produced during the D3-D̄3-brane annihilation
rapidly decay to light KK modes and gravitons. Among the light KK
modes in a throat are ones with conserved angular momenta, so they are quite
stable against further decay, with typical mass of order $h_A/R$. Due to the
self-interaction, the relic density in the non-relativistic KK modes is very much
suppressed. Due to the red-shift and the low tunnelling rate, the universe enters
a matter-dominated phase with these KK modes, which then tunnel to
the S-throat and other throats, if present. To ameliorate the hierarchy problem,
we expect the S-throat to have a much smaller warp factor, $h_S \ll h_A$.
Generically, we expect the tunnelling rate from A-throat to S-throat to be
enhanced by the bulk resonance effect (for $R/L \lesssim h_A$) [52],

$$\Gamma_{A\to S} \sim h_A^9/R \gg h_A^{17}/R, \tag{12}$$
where the second rate is that for the case when there is no bulk resonance
effect. Once the KK modes reach the S-throat, they rapidly decay to open
string modes and heat up the universe, starting the hot big bang epoch. For a
successful scenario, (i) the matter-dominated duration should be long enough
to red-shift away the gravitational radiation, but not so long as to over-cool
the universe. This condition requires $h_A \sim 10^{-1}$ to $10^{-3}$. It is very encouraging
that these values are precisely those required to fit the CMBR data. (ii) The
decay of KK modes in the S-throat should go to open string modes instead
of to gravitational radiation. This is guaranteed because the coupling of KK
modes to gravitons is dictated by Newton's constant $G_4 = 1/(8\pi M_P^2)$, while
their couplings to open string modes, i.e., SM particles, are enhanced by the
localization of both the KK modes and the SM branes in the throat, as shown
in (11).
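
To get a feel for the sizes involved, the scalings quoted above for the KK cross section and for the tunnelling rate can be evaluated directly. This is only an order-of-magnitude sketch: the value $L/R = 10$ is a hypothetical illustration (the text only requires $L \gg R$), rates are in units of $1/R$, and the function names are not from the cited references.

```python
def kk_cross_section_enhancement(L_over_R, h):
    """Ratio of the KK cross section, (L/R)^6 / (M_P^2 h^2),
    to the graviton value 1/M_P^2."""
    return L_over_R**6 / h**2

def tunnelling_rates(h_A, R=1.0):
    """Return (rate with bulk resonance, rate without), i.e.
    (h_A^9 / R, h_A^17 / R), both in units where R = 1 by default."""
    return h_A**9 / R, h_A**17 / R

for h_A in (1e-1, 1e-2, 1e-3):
    with_res, without_res = tunnelling_rates(h_A)
    print(f"h_A={h_A:.0e}  resonant={with_res:.1e}  "
          f"non-resonant={without_res:.1e}  ratio={with_res / without_res:.1e}")

# Hypothetical bulk-to-throat size ratio L/R = 10 with h = 1e-2.
print(kk_cross_section_enhancement(10.0, 1e-2))
```

The resonance enhancement is a factor $h_A^{-8}$, so it grows very quickly as the warping of the A-throat increases.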
It is interesting to point out some novel features in this heating scenario:
• There is a matter-dominated epoch between the end of inflation and the
beginning of the hot big bang era. The cosmic scale factor can grow by a large
factor ($10^5$ or more) during this epoch. As a result, both the gravitational
radiation and the gravitino density will be substantially suppressed. It will be
interesting to study other cosmological consequences of such an epoch.
• There is a dynamical process that selects a long throat to be heated. This
is because the dense spectrum in a long throat makes the level matching of
the energy eigenstates, a necessary condition for tunnelling between throats,
easier to satisfy. This may provide a dynamical explanation of the selection of
the RS type (i.e., with very large warping that solves the hierarchy problem)
warp space as our standard model throat in the early universe.
• Although KK modes as dark matter have been considered in the literature,
we see the possibility of KK modes as hidden dark matter. These are almost
stable KK modes in another throat (say, the B-throat in Fig. 2), which interact
with SM particles only via gravitons. This hidden dark matter has
many unusual properties compared to the usual dark matter candidates, e.g.,
it may tunnel to the S-throat and generate a cosmic ray that violates the
GZK bound.

4 Production and Properties of Cosmic Superstrings

Although the production of domain walls and monopoles at the grand unified
(GUT) scale will over-close the universe by many orders of magnitude,
cosmic strings do not suffer from the same problem. This is a consequence
of the intercommutation properties of strings, which lead to a scaling cosmic
string network that tracks the radiation (matter) during the radiation-
(matter)-dominated era. A key property of a cosmic string is its tension $\mu$. In
fact, cosmic strings around the GUT scale, i.e., $G\mu \sim 10^{-6}$, were originally
proposed as an alternative to inflation in generating density perturbations for
structure formation [25]. However, the properties of CMBR data, in particular
the acoustic peaks, ruled out this possibility. It is these same data that
strongly support inflation. In fact, all defects present before inflation would
have been inflated away. So we need to consider only defects that are produced
after inflation.
The topological properties of defect formation in tachyon condensation are
well understood in superstring theory [53]. The spontaneous symmetry breaking
will support defects with even codimension (i.e., $2k$), as classified by K-theory.
In particular, D3-D̄3-brane annihilation yields D1-branes and fundamental
F1-strings, the large massive ones of which appear as cosmic strings in
our universe [27, 28, 29]. Qualitatively, it is easy to see how this takes place.
There is a $U(1)$ gauge theory associated with each brane, and the tachyon couples
to one combination, $U(1)_-$. This is simply the Abelian Higgs model in the
field theory approximation. Tachyon rolling results in spontaneous symmetry
breaking and the resulting vortices are D1-strings. So they are cosmologically
produced via the Kibble mechanism. The other combination, $U(1)_+$, becomes confining, and
the resulting flux tubes become the fundamental closed strings [54]. So cosmic
strings are generically produced towards the end of brane inflation. It is
quite amazing that string theory dictates that the dangerous domain walls
and monopole-like defects are not produced. In the Type IIB theory that we
are studying, there are simply no D0- or D2-branes.
We find that the cosmic string tension $\mu$ roughly satisfies $10^{-13} < G\mu < 10^{-6}$.
The fundamental string (F-string) tension in 10 dimensions defines the
string scale $\alpha'$ via $T_{F1} = 1/2\pi\alpha'$. In Type IIB theory, there are branes including
D1-branes, or D-strings, with tension $T_{D1} = 1/2\pi\alpha' g_s$, where $g_s$ is
the string coupling. In the light of all the progress coming from dualities in
string theory, we now know that the D-strings and the F-strings should be
considered on the same footing and a general string state in Type IIB is the
bound state of these two types of strings. In 10 flat dimensions, supersymmetry
dictates that the tension of the bound state of $p$ F-strings and $q$ D-strings
is given by [55],
$$T_{p,q} = T_{F1}\sqrt{p^2 + \frac{q^2}{g_s^2}}. \tag{13}$$

This tension spectrum (for coprime (p, q)) allows junctions to be formed [29].
Since the D3-D̄3-brane annihilation most likely takes place at the bottom of
a throat, that will be where the cosmic superstrings are. To be specific, we
consider the KS throat [34] whose properties are relatively well understood.
On the gravity side, this is a warped deformed conifold. Inside the throat, the
geometry is a shrinking $S^2$ fibred over an $S^3$. The tensions of the bound state
of $p$ F-strings and that of $q$ D-strings were individually computed for the KS
throat [56]. The tension formula for the $(p, q)$ bound states is given by [57]

$$T_{p,q} \simeq \frac{h_A^2}{2\pi\alpha'}\sqrt{\frac{q^2}{g_s^2} + \left(\frac{bM}{\pi}\right)^2 \sin^2\!\left(\frac{\pi p}{M}\right)}, \tag{14}$$

where $b = 0.93$ is a number numerically close to one and $M$ is the number
of fractional D3-branes, that is, the units of 3-form RR flux $F_3$ through the
$S^3$. For $M \to \infty$ and $b = h_A = 1$, it reduces to (13). Very interestingly, the
F-strings are charged in $Z_M$ and are non-BPS. The D-strings, on the other
hand, are charged in $Z$ and are BPS with respect to each other. Because $p$ is
$Z_M$-charged with non-zero binding energy, binding can take place even if $(p, q)$
are not coprime. Since the tension is subadditive in $p$, i.e., $T_{p+p'} < T_p + T_{p'}$, the
$p$-string will not decay into strings with smaller $p$. The interpretation of these
strings in the gauge theory dual is known. The F-string is dual to a confining
string between a quark and an anti-quark, while the D-string is dual to an
axionic string. $M$ fundamental strings can terminate on a point-like baryon
(with mass $\sim M h_A/\sqrt{\alpha'}$), irrespective of the number of D-strings around.
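
The two tension formulas can be compared numerically. The sketch below evaluates the flat-space $(p, q)$ tension and the KS-throat tension in units of $T_{F1} = 1/2\pi\alpha'$, checks that the throat formula approaches the flat-space one for large $M$ with $b = h_A = 1$, and checks the binding inequality for F-strings. The parameter values ($g_s = 0.1$, $M = 10$) are illustrative assumptions, and the function names are not from the cited references.

```python
import math

def tension_flat(p, q, g_s, T_F1=1.0):
    """Flat 10-dimensional (p, q) tension: T_F1 * sqrt(p^2 + q^2/g_s^2)."""
    return T_F1 * math.sqrt(p**2 + q**2 / g_s**2)

def tension_ks(p, q, g_s, M, h_A=1.0, b=0.93, T_F1=1.0):
    """KS-throat (p, q) tension, with T_F1 = 1/(2 pi alpha') as the unit."""
    f_part = (b * M / math.pi) ** 2 * math.sin(math.pi * p / M) ** 2
    return h_A**2 * T_F1 * math.sqrt(q**2 / g_s**2 + f_part)

g_s = 0.1  # illustrative string coupling

# Binding in flat space: a (1, 1) string is lighter than (1, 0) plus (0, 1).
print(tension_flat(1, 1, g_s), tension_flat(1, 0, g_s) + tension_flat(0, 1, g_s))

# Large-M limit with b = h_A = 1 approaches the flat-space value.
print(tension_ks(3, 2, g_s, M=10**6, b=1.0), tension_flat(3, 2, g_s))

# Binding for F-strings in the throat (M = 10 is a hypothetical value).
print(tension_ks(2, 0, g_s, M=10), 2 * tension_ks(1, 0, g_s, M=10))
```

Because of the $\sin^2(\pi p/M)$ dependence, the F-string part of the throat tension vanishes at $p = M$, which is the numerical counterpart of $M$ fundamental strings being able to end on a baryon.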
Besides the above Kibble and confining mechanisms, there are other pos-
sible ways to produce cosmic strings which may evolve to a cosmic string
network :
• Consider another throat with warp factor $h_C$. If the temperature at the
beginning of the hot big bang is $T_i$, then strings in the C-throat will be excited if
$T_i > h_C m_s$.
• D-strings can be stable inside D3-branes [30]. Such D-strings can be pair-produced
inside the horizon at the end of inflation when a small stack of
D3-branes collides with a larger stack of D̄3-branes.
• One may also consider the situation when a single brane moves towards the
bottom of the A-throat. Assuming that heating is not a problem for such a
scenario, stable D-strings can be pair-produced if $T_i > m_s h_A$.
Isolated loops would just decay via gravitational radiation. However, if the
density of loops is high enough so that they overlap and tangle with each other,
then their reconnections will yield long strings and lead to a scaling cosmic
string network. For the C-throat, this probably requires $T_i \gg h_C m_s$. This is
more likely for small $G\mu$, since the decay rate is proportional to $G\mu \sim G m_s^2 h_C^2$,
so low-tension string loops will be quite long lived. In addition to cosmic
strings in the A-throat, the universe may have cosmic strings with much smaller
tensions if throats with large warping exist in the bulk. These cosmic strings
interact very weakly with cosmic strings in the A-throat.

5 Evolution and Detection of Cosmic Superstrings

The cosmological evolution of cosmic superstrings is a very challenging problem.
For slow-moving cosmic strings that stretch across the horizon, the energy
density naively scales like $a^{-2}$. For cosmic string loops, the naive energy
density is similar to that for monopoles, scaling like $a^{-3}$. So, naively, the
cosmic string density is a problem. However, their interactions substantially
suppress the density. The intercommutation of intersecting cosmic strings and
the decay of the resulting cosmic string loops (to gravitational waves) reduce
the density so that it decreases like radiation (matter) during the radiation-
(matter)-dominated era [25]. Furthermore, the resulting scaling cosmic string
network energy density is insensitive to the initial density, i.e., the network
rapidly approaches the scaling solution. As a consequence, the physics is essentially
dictated by the single parameter $G\mu$ in the Nambu–Goto or the Abelian
Higgs model, and by the tension spectrum for a more complicated model.
Although the cosmic string network reaches a scaling solution, the fraction
of energy density in cosmic string loops has been an outstanding question
[25]. Early simulations did not reach fine enough resolution to determine the
role played by string loops [58]. The basic assumption was that once a loop
is produced by the intersection of long strings (including self-intersection),
it decays quickly via gravitational radiation. More recent analysis seems to
change the story.
Let us first consider the Nambu–Goto case. The fraction of energy density
in the string network is given by

$$\Omega_s = \Omega_\infty + \Omega_{\rm loops} \sim \Gamma G\mu + \chi\sqrt{\alpha G\mu}, \tag{15}$$
where the first term is the contribution of long strings, with $\Gamma \sim 10^2$ for
Nambu–Goto strings. The second term is the contribution of string loops
within the horizon. Very crudely speaking, $\chi \sim 10^3$. The value of $\alpha$, the ratio
of characteristic loop size to the horizon size, is poorly understood. It has been
estimated to be as small as $\alpha < 10^{-12}$, or $\alpha \sim G\mu$ or even $(G\mu)^{5/2}$. Recently,
both numerical simulations [59, 60] and analytic studies [61] have indicated
that there is more energy in the string loops than previously thought. That
is, $\alpha$ may be as big as 0.25, although $\alpha \sim 10^{-4}$ seems to be more likely. For
small $G\mu$, the increase in the energy density in the string network can be very
substantial.
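
The relative size of the two contributions can be tabulated for the loop-size values mentioned above. This is a rough sketch that assumes the loop term scales as $\chi\sqrt{\alpha G\mu}$, with the crude values $\Gamma \sim 10^2$ and $\chi \sim 10^3$ quoted in the text; the function name and the sample tensions are illustrative.

```python
import math

def omega_strings(G_mu, alpha, Gamma=1e2, chi=1e3):
    """Return (long-string term, loop term) of the string energy fraction.

    Assumes Omega_infinity ~ Gamma * G_mu and
    Omega_loops ~ chi * sqrt(alpha * G_mu).
    """
    return Gamma * G_mu, chi * math.sqrt(alpha * G_mu)

for G_mu in (1e-7, 1e-9, 1e-11):
    for alpha in (1e-4, 0.25):
        long_term, loop_term = omega_strings(G_mu, alpha)
        print(f"G_mu={G_mu:.0e}  alpha={alpha:<6}  long={long_term:.1e}  "
              f"loops={loop_term:.1e}  loops/long={loop_term / long_term:.1e}")
```

Since the loop term falls only as $\sqrt{G\mu}$ while the long-string term falls as $G\mu$, the ratio of the two grows as the tension decreases.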
Fig. 6. (a) The (p, q) string binding generates junctions [29]. (b) (p, q) string network
evolution as a function of the cosmic scale factor. The top three lines stand for total
density, while the bottom three lines stand for the corresponding (p, q) = (1, 0) string
density. We see that, irrespective of the initial densities, both the total density and
the (1, 0) density approach rapidly the scaling solutions [63]

As mentioned earlier, cosmic superstrings will have different properties
than vortices in the Abelian Higgs model. Although a simulation is not available,
one can analyse the evolution of the string network by solving a set of
coupled equations. As shown in Fig. 6b, recent analysis of the tension spectrum
(i) strongly suggests that cosmic superstrings also evolve dynamically
to a scaling solution (with a stable relative distribution of strings with different
quantum numbers) [62, 63], very much like usual cosmic strings (either
coming from the Abelian Higgs model or of Nambu–Goto type) [25]. This
is due to the rapid decrease in the density of strings with large tensions,
which goes roughly like $\mu(p, q)^{-N}$, where $N \sim 8$. We shall consider a scenario
where the cosmic strings are stable enough to allow such a scaling solution.
The inter-commutation probability of vortices is known to be around unity,
$P \simeq 1$, while that of superstrings is rather complicated, but $P \sim g_s^2$ [62],
where the string coupling $g_s \sim 1/10$. Also, the tension spectrum tells us that
cosmic superstrings will come in a variety of tensions and charges. A simple
analysis indicates that a number of species of cosmic strings will be around
in the string network [63], so
$$\Omega_s \to \frac{n}{P}\,\Omega_s,$$
where $n$ is the effective number of types, $n \sim 5$. For very small $P$, it is argued
that $1/P \to 1/P^{2/3}$ [60]. It is not clear how the presence of baryons in the
tension spectrum (ii) will impact the evolution of the string network. It is
clear that further studies of the properties of the cosmic string spectrum (including
baryons), their production, stability and interactions, and the cosmic evolution
of the network, as well as their possible detection, will be most interesting
to watch. It is reasonable to be optimistic about the detectability of cosmic
superstrings, but this is far from guaranteed.
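
The size of the rescaling factor $n/P$ is easy to estimate from the numbers quoted above. The sketch below takes $P = g_s^2$ exactly (the text only says $P \sim g_s^2$), with $g_s = 0.1$ and $n = 5$, and compares the naive $1/P$ scaling with the $1/P^{2/3}$ scaling argued for very small $P$. The function name and its flag are illustrative.

```python
def network_enhancement(g_s=0.1, n=5, small_P_scaling=False):
    """Factor multiplying Omega_s: n/P, or n/P^(2/3) for very small P.

    P = g_s^2 is taken as an equality here purely for illustration.
    """
    P = g_s**2
    exponent = 2.0 / 3.0 if small_P_scaling else 1.0
    return n / P**exponent

print(network_enhancement())                       # naive n/P
print(network_enhancement(small_P_scaling=True))   # n/P^(2/3)
```

Either way the string density is enhanced by a factor of order $10^2$ relative to a single species with $P \simeq 1$, although the precise number depends on which scaling applies.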
Since cosmic strings were originally proposed as an alternative to inflation, their
detection has been extensively studied [25]. Since the cosmic superstrings interact
with the SM particles only via gravity, all detection involves the gravitational
interactions of cosmic strings. The recent understanding of the importance
of string loops will certainly enhance the detectability of cosmic strings. Since
the particular brane inflationary scenario is not yet known, the cosmic string
tensions are only loosely constrained. We shall be open-minded in comparing
with observation. Many ways to detect cosmic strings have been suggested.
Here let us discuss some of them:
• Gravitational lensing is probably the most direct. A cosmic string introduces a
deficit angle, so a galaxy behind a long cosmic string will appear as a double
(undistorted) image. The image separation is roughly $5 \times 10^6\, G\mu$ arcsec. For
$G\mu \lesssim 10^{-7}$, this approach becomes very challenging. Finding lensing by a
junction will be quite definitive [29, 64].
• Micro-lensing. This was first studied in [65]. For small string tension, string
loops are expected to be dominant. They can lens stars, and can be detected by
watching the brightness of a star double for a short period of time. Since there are more string
loops for smaller tension, non-observation may put a lower bound on the string
tension [66].
• In brane inflation, the density perturbation (and CMBR anisotropy) comes
from two sources: the usual quantum fluctuation (scalar and tensor modes)
during inflation and the fluctuations (scalar and vector modes) induced by
the cosmic string network. The density perturbation coming from the cosmic
string network is active and incoherent, so it has none of the acoustic peaks that are
prominent in the density perturbation coming from inflation. The COBE data
roughly yield $G\mu \simeq 10^{-6}$ if the scaling solution of the cosmic string network
is the sole source of the density perturbation. Using WMAP data, one finds
that the contribution from cosmic strings is bounded by about 10%, which
translates to about $G\mu \sim 7 \times 10^{-7}$. So the cosmic string production towards
the end of brane inflation is perfectly compatible with the present CMBR
data [3], while future data may be able to test this scenario [67, 68].
• Since the density perturbation coming from cosmic strings is continuously
being produced, its magnitude in CMBR anisotropy at large $l$ will not be
attenuated as much as that coming from inflation. For $G\mu \sim 7 \times 10^{-7}$, the
contribution from cosmic strings may become comparable to (bigger than)
that from inflation at $l > 2000$ ($l > 3000$). This may be measurable if
$G\mu$ is not too small. Polarization in the CMB will also be measured. In particular,
the B (i.e., curl) mode due to the tensor mode perturbation will
be tested, reaching $\Delta T \simeq 0.5\,\mu$K. Here the gravitational wave anisotropy
density is much higher than that in a pure inflationary scenario, so passage
through space will presumably yield a B-mode polarization clearly larger than
that coming from a purely inflationary scenario [68]. Figure 7 illustrates this
possibility.
• As a cosmic string moves with velocity $v$ across the sky, a shift in the CMB
temperature may be observed, $\Delta T/T \simeq 8\pi G\mu v\gamma$ [69]. A careful analysis of
the CMBR data may probe down to $G\mu \simeq 10^{-10}$. It is important to see what bound
on $G\mu$ the data can eventually reach. Detection may be possible for tensions as small
as $G\mu \simeq 10^{-13}$.

Fig. 7. The CMBR power spectrum from WMAP [3]. They are (from top) the
temperature T T correlation (black), the temperature-electric-mode polarization T E
correlation (red), the EE correlation (green), possible B-mode polarization BB cor-
relation (blue) and possible BB correlation (red/blue) from cosmic strings [68]. The
dashed lines are likely background/foreground that should be subtracted

Fig. 8. The detectability of cosmic strings by LISA via gravitational radiation,
both background and bursts for Nambu–Goto strings [72]

• The cosmic string network also generates gravitational waves that may be
observable. This has been studied extensively in the literature. The stochastic
gravitational wave spectrum has an almost flat region that extends from
$f \sim 10^{-8}$ Hz to $f \sim 10^{10}$ Hz. Within this frequency range, both Advanced
LIGO/VIRGO (sensitive at $f \sim 10^2$ Hz) and LISA (sensitive at $f \sim 10^{-3}$ Hz)
may have a chance. Following [70], we obtain $\Omega_{gw} h^2 \simeq 0.04\, G\mu$ coming from
long strings. Since LIGO II/VIRGO can reach $\Omega_{gw} h^2 \simeq 10^{-10}$ at $f \simeq 100$
Hz, it can reach $G\mu \ge 2 \times 10^{-9}$. Such a stochastic gravitational wave background also
influences the very precise pulsar timing measurements. Although the present pulsar
timing measurement is compatible with $G\mu < 10^{-6}$, a modest improvement
in the accuracy may detect a network of cuspy cosmic string loops down to
$G\mu \simeq 10^{-11}$.
Cusps and kinks are quite common in oscillating cosmic strings. Strongly
focused beams of relatively high-frequency gravitational waves are emitted
by these cusps and kinks. The sharp bursts of gravitational waves have
very distinctive waveforms: $t^{1/3}$ (cusps) and $t^{2/3}$ (kinks) [71]. Advanced
LIGO/VIRGO may detect them for values down to $G\mu \ge 10^{-13}$ and LISA to
$10^{-15}$ [71, 72, 73], so this may be the most sensitive test of cosmic strings.
At the moment, theoretical uncertainties (such as string tension, tension
spectrum, interactions and cosmic string loops) must be better understood.
Figure 8 takes into account the recent analysis where the string loops are important.
• Cusps also introduce temperature shifts in the CMBR that should be
searched for. They may appear as a sharp down and then up temperature shift
that is quite distinctive [66, 74].
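
Several of the numbers quoted in the list above follow from simple arithmetic, sketched here in Python. The lensing estimate assumes that the image separation is of the order of the full deficit angle $8\pi G\mu$ of a straight string (the actual separation depends on the geometry), the velocity $v$ is taken in units of the speed of light, and the function names are illustrative.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def deficit_angle_arcsec(G_mu):
    """Deficit angle 8 pi G mu of a straight string, in arcseconds."""
    return 8.0 * math.pi * G_mu * ARCSEC_PER_RADIAN

def cmb_temperature_shift(G_mu, v):
    """Delta T / T ~ 8 pi G mu v gamma, with v in units of c."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return 8.0 * math.pi * G_mu * v * gamma

def g_mu_reach(omega_gw_h2, coeff=0.04):
    """Tension probed by a detector, inverting Omega_gw h^2 ~ coeff * G mu."""
    return omega_gw_h2 / coeff

print(deficit_angle_arcsec(1.0))        # coefficient multiplying G mu
print(deficit_angle_arcsec(1e-7))       # separation scale for G mu = 1e-7
print(cmb_temperature_shift(1e-7, 0.6)) # sample velocity v = 0.6
print(g_mu_reach(1e-10))                # LIGO II/VIRGO sensitivity quoted above
```

The coefficient of the deficit angle comes out close to the $5 \times 10^6\, G\mu$ arcsec quoted for the image separation, and inverting the gravitational wave estimate gives a reach of a few times $10^{-9}$, in line with the value quoted above.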

6 Remarks
Brane inflation is a natural realization of inflation in the brane world scenario
in string theory. If the string scale is close to the GUT scale, as expected,
cosmology offers a powerful approach to study and test string theory. We see
that brane inflation offers a variety of possible distinct stringy signatures to
be detected. Existing data are perfectly compatible with brane inflation. It
is exciting that near-future experiments/observations will likely provide non-
trivial tests of the scenario.
Many interesting problems remain. Here is a partial list. On the theoreti-
cal side:
• Search for other inflationary scenarios in string theory.
• Search for other distinct stringy signatures that can be detected.
• We have seen that the structure of the bulk as well as the properties of the
warped deformed throat impacts on the CMBR predictions, e.g., the power
spectral index. Flux compactifications must be studied in much greater detail
than currently known.
• The gauge/gravity duality has played an important role in studying the
properties of throats and the cosmic string tension spectrum. One may actually
apply cosmology to study strongly coupled gauge theory via gauge/gravity
duality.
• Non-Gaussianity in CMBR and its more detailed properties.
• Understand better the properties of cosmic strings, such as the tension
spectrum and their interactions, their production and stability, and the cos-
mological evolution of the string network that may include baryons and/or
light domain walls bounded by the cosmic strings.
• Gott found that closed time-like curves appear when two cosmic strings move
ultra-relativistically towards each other [75]. He proposed using this as a time
machine. It is argued that energetics would prevent the appearance of such
closed time-like curves in our universe under any realistic situation [76]. This
important issue certainly deserves further analysis.
On the observational side:
• Searching for cosmic string signatures, large tensor mode and/or non-
Gaussianity that differs from that predicted in slow-roll inflation in CMBR
will be important.
• Astronomical searches for lensing, micro-lensing, temperature shifts due to
moving strings and string cusps can be both challenging and exciting. Some of
these searches need not be dedicated searches, i.e., they can be part of other
programs.
• Gravitational wave detection of the stochastic background gravitational ra-
diation due to cosmic strings as well as bursts coming from string cusps will
be valuable.
One should consider the discovery of cosmic strings as another verifica-
tion of the inflationary paradigm. This will shed light on the specific brane
inflationary scenario that took place, providing a valuable probe to the brane
world picture before inflation. That is, information on the early universe before
inflation may not be totally lost. To my knowledge, this is the best observational
window into superstring theory. Irrespective of the final outcome,
whether brane inflation or some other stringy scenario is eventually proved
correct or not, we see that string theory is confronting data and making a
number of distinctive predictions that can be tested in the near future. This
is exciting.

Acknowledgment

I thank Rachel Bean, Xingang Chen, David Chernoff, Gia Dvali, Hassan
Firouzjahi, Girma Hailu, Nick Jones, Louis Leblond, Levon Pogosian, Sash
Sarangi, Sarah Shandera, Gary Shiu, Ben Shlaer, Horace Stoica, Ira
Wasserman, Mark Wyman and Jiajun Xu for collaborations and valuable
discussions. Discussions with Cliff Burgess, Jim Cline, Shamit Kachru, Renata
Kallosh, Igor Klebanov, Andrei Linde, Liam McAllister, Juan Maldacena,
Irit Maor, Ken Olum, Joe Polchinski, Fernando Quevedo, Eva Silverstein, Bret
Underwood and Alex Vilenkin are gratefully acknowledged. This work is supported
by the National Science Foundation under grant PHY-0355005.

References
1. A. H. Guth: Phys. Rev. D 23, 347 (1981); A. D. Linde: Phys. Lett. B 108, 389
(1982); A. Albrecht, P. J. Steinhardt: Phys. Rev. Lett. 48, 1220 (1982) 950
2. G. F. Smoot et al.: Astrophys. J. 396, L1 (1992); C. L. Bennett et al.: Astro-
phys. J. 464, L1 (1996) 950, 959
3. D. N. Spergel et al.: astro-ph/0603449 950, 959, 968, 969
4. J. Polchinski: Phys. Rev. Lett. 75, 4727 (1995) 950
5. G. R. Dvali, S.-H. H. Tye: Phys. Lett. B 450, 72 (1999) 950
6. C. P. Burgess, M. Majumdar, D. Nolte, F. Quevedo, G. Rajesh, R. J. Zhang:
JHEP 0107, 047 (2001); G. R. Dvali, Q. Shafi, S. Solganik: hep-th/0105203;
S. Buchan, B. Shlaer, H. Stoica, S.-H. H. Tye: JCAP 0402, 013 (2004) 950
7. S. B. Giddings, S. Kachru, J. Polchinski: Phys. Rev. D 66, 106006 (2002) 951
8. S. Kachru, R. Kallosh, A. Linde, S. P. Trivedi: Phys. Rev. D 68, 046005 (2003)
951
9. S. Kachru, R. Kallosh, A. Linde, J. Maldacena, L. McAllister, S. P. Trivedi:
JCAP 0310, 013 (2003) 952, 954, 956, 958
10. C. P. Burgess, R. Kallosh, F. Quevedo: JHEP 0310, 056 (2003) 953
11. M. Berg, M. Haack, B. Kors: Phys. Rev. D 71, 026005 (2005); hep-th/0409282;
JHEP 0511, 030 (2005) 953
12. D. Baumann, A. Dymarsky, I. R. Klebanov, J. Maldacena, L. McAllister,
A. Murugan: hep-th/0607050 953, 961
13. H. Firouzjahi, S.-H. H. Tye: JCAP 0503, 009 (2005) 953, 954, 956, 958, 959
14. E. Silverstein, D. Tong: Phys. Rev. D 70, 103505 (2004) 953, 958, 959
15. M. Alishahiha, E. Silverstein, D. Tong: Phys. Rev. D 70, 123505 (2004) 954, 955, 959
16. A. Dymarsky, I. R. Klebanov, N. Seiberg: JHEP 0601, 155 (2006) 954
17. X. Chen: Phys. Rev. D 71, 063506 (2005) 954, 960
18. S. Dimopoulos, S. Kachru, J. McGreevy, J. G. Wacker: hep-th/0507205 954
19. U. Seljak, A. Slosar: Phys. Rev. D 74, 063523 (2006) 955, 959
20. X. Chen, M. X. Huang, S. Kachru, G. Shiu: hep-th/0605045 955, 960
21. J. M. Maldacena: JHEP 0305, 013 (2003) 955
22. S. E. Shandera, S.-H. H. Tye: JCAP 0605, 007 (2006) 955, 959, 960
23. X. Chen, S.-H. H. Tye: JCAP 0606, 011 (2006) 955, 962
24. E. W. Kolb, M. S. Turner: The Early Universe (Addison-Wesley Publ. Co.,
Redwood City, 1990) 955
25. A. Vilenkin, E. P. S. Shellard: Cosmic Strings and Other Topological Defects
(Cambridge University Press, Cambridge, 2000) 955, 964, 966, 967
26. E. Witten: Phys. Lett. B 153, 243 (1985) 955
27. N. Jones, H. Stoica, S.-H. H. Tye: JHEP 0207, 051 (2002); S. Sarangi,
S.-H. H. Tye: Phys. Lett. B 536, 185 (2002); N. T. Jones, H. Stoica,
S.-H. H. Tye: Phys. Lett. B 563, 6 (2003) 956, 964
28. G. Dvali, A. Vilenkin: JCAP 0403, 010 (2004) 956, 964
29. E. J. Copeland, R. C. Myers, J. Polchinski: JHEP 0406, 013 (2004) 956, 964, 965, 967, 968
30. L. Leblond, S.-H. H. Tye: JHEP 0403, 055 (2004) 956, 965
31. J. Garcia-Bellido, R. Rabadan, F. Zamora: JHEP 0201, 036 (2002) 956
32. S. Sarangi, S.-H. H. Tye: Phys. Lett. B 573, 181 (2003) 956, 957
33. N. T. Jones, S.-H. H. Tye: JHEP 0301, 012 (2003) 957, 958
34. I. R. Klebanov, M. J. Strassler: JHEP 0008, 052 (2000) 958, 965
35. S. Kecskemeti, J. Maiden, G. Shiu, B. Underwood: JHEP 0609, 076 (2006) 958, 960
36. P. Creminelli, A. Nicolis, L. Senatore, M. Tegmark, M. Zaldarriaga: JCAP
0605, 004 (2006) 960
37. S. Kachru, J. Pearson, H. L. Verlinde: JHEP 0206, 021 (2002) 961
38. X. Chen: JHEP 0508, 045 (2005) 960, 961
39. X. Chen: Phys. Rev. D 72, 123518 (2005) 961
40. X. Chen, S. Sarangi, S.-H. H. Tye, J. Xu: hep-th/0608082 961
41. A. Sen: JHEP 0204, 048 (2002); JHEP 0207, 065 (2002) 961
42. G. Shiu, S.-H. H. Tye, I. Wasserman: Phys. Rev. D 67, 083517 (2003) 961
43. J. M. Cline, H. Firouzjahi, P. Martineau: JHEP 0211, 041 (2002) 961
44. N. Lambert, H. Liu, J. Maldacena: hep-th/0303139 962
45. X. Chen: Phys. Rev. D 70, 086001 (2004) 962
46. L. Leblond: JHEP 0601, 033 (2006) 962
47. N. Barnaby, C. P. Burgess, J. M. Cline: JCAP 0504, 007 (2005) 962
48. L. Kofman, P. Yi: Phys. Rev. D 72, 106001 (2005) 962
49. D. Chialva, G. Shiu, B. Underwood: JHEP 0601, 014 (2006) 962
50. A. R. Frey, A. Mazumdar, R. Myers: Phys. Rev. D 73, 026003 (2006) 962
51. S. Dimopoulos, S. Kachru, N. Kaloper, A. E. Lawrence, E. Silverstein: Phys.
Rev. D 64, 121702 (2001) 962
52. H. Firouzjahi, S.-H. H. Tye: JHEP 0601, 136 (2006) 963
53. A. Sen: JHEP 9808, 010 (1998); JHEP 9809, 023 (1998); E. Witten: JHEP
9812, 019 (1998); P. Horava: Adv. Theor. Math. Phys. 2, 1373 (1999) 964
54. O. Bergman, K. Hori, P. Yi: Nucl. Phys. B 580, 289 (2000) 964
55. J. H. Schwarz: Phys. Lett. B 360, 13 (1995) [Erratum-ibid. B 364, 252 (1995)]
964
56. S. S. Gubser, C. Herzog, I. R. Klebanov: JHEP 0409, 036 (2004) 965
57. H. Firouzjahi, L. Leblond, S.-H. H. Tye: JHEP 0605, 047 (2006) 965
58. A. Albrecht, N. Turok: Phys. Rev. Lett. 54, 1868 (1985); D. P. Bennett,
F. R. Bouchet: Phys. Rev. Lett. 60, 257 (1988); B. Allen, E. P. S. Shellard:
Phys. Rev. Lett. 64, 119 (1990) 966
59. V. Vanchurin, K. Olum, A. Vilenkin: Phys. Rev. D 72, 063514 (2005); gr-
qc/0511159; C. Ringeval, M. Sakellariadou, F. Bouchet: astro-ph/0511646;
C. J. A. Martins, E. P. S. Shellard: Phys. Rev. D 73, 043515 (2006) 966
60. A. Avgoustidis, E. P. S. Shellard: Phys. Rev. D 73, 041301 (2006) 966, 967
61. J. Polchinski, J. V. Rocha: Phys. Rev. D 74, 083504 (2006) 966
62. M. G. Jackson, N. T. Jones, J. Polchinski: JHEP 0510, 013 (2005) 966, 967
63. S.-H. H. Tye, I. Wasserman, M. Wyman: Phys. Rev. D 71, 103508 (2005)
[Erratum-ibid. D 71, 129906 (2005)] 966, 967
64. B. Shlaer, M. Wyman: Phys. Rev. D 72, 123504 (2005) 968
65. C. Hogan, R. Narayan: MNRAS 211, 575 (1984) 968
66. D. Chernoff, S.-H. H. Tye: to appear. 968, 970
67. M. Landriau, E. P. S. Shellard: Phys. Rev. D 69, 023003 (2004); L. Pogosian,
M. C. Wyman, I. Wasserman: astro-ph/0403268; astro-ph/0604141; E. Jeong,
G. F. Smoot: Astrophys. J. 624, 21 (2005) 968
68. L. Pogosian, S.-H. H. Tye, I. Wasserman, M. Wyman: Phys. Rev. D 68, 023506
(2003) 968, 969
69. N. Kaiser, A. Stebbins: Nature 310, 391 (1984); J. R. Gott: Astrophys. J. 288, 422
(1985) 968
70. R. R. Caldwell, B. Allen: Phys. Rev. D 45, 3447 (1992) 970
71. T. Damour, A. Vilenkin: Phys. Rev. D 64, 064008 (2001); Phys. Rev. D 71,
063510 (2005) 970
72. C. J. Hogan: Phys. Rev. D 74, 043526 (2006) 969, 970
73. X. Siemens, J. Creighton, I. Maor, S. R. Majumder, K. Cannon, J. Read: Phys.
Rev. D 73, 105001 (2006) 970
74. A. A. de Laix, T. Vachaspati: Phys. Rev. D 54, 4780 (1996); A. Stebbins:
Astrophys. J. 327, 584 (1988); F. Bernardeau, J.-P. Uzan: Phys. Rev. D 63, 023004
(2001); Phys. Rev. D 63, 023005 (2001) 970
75. J. R. Gott: Phys. Rev. Lett. 66, 1126 (1991) 971
76. B. Shlaer, S.-H. H. Tye: Phys. Rev. D 72, 043532 (2005) 971
