Distributions in Physics and Engineering - Saichev, Woyczyński


To TANYA and LIZ-

with love and respect


Applied and Numerical Harmonic Analysis

Distributions in the Physical and Engineering Sciences

Applied and Numerical Harmonic Analysis

Series Editor
JOHN J. BENEDETTO
University of Maryland

Editorial Board
Akram Aldroubi: NIH, Biomedical Engineering/Instrumentation
Douglas Cochran: Arizona State University
Ingrid Daubechies: Princeton University
Hans G. Feichtinger: University of Vienna
Christopher Heil: Georgia Institute of Technology
Murat Kunt: Swiss Federal Institute of Technology, Lausanne
James McClellan: Georgia Institute of Technology
Wim Sweldens: Lucent Technologies, Bell Laboratories
Michael Unser: NIH, Biomedical Engineering/Instrumentation
Martin Vetterli: Swiss Federal Institute of Technology, Lausanne
Victor Wickerhauser: Washington University

Alexander I. SAICHEV
University of Nizhniy Novgorod
and
Wojbor A. WOYCZYNSKI
Case Western Reserve University

DISTRIBUTIONS IN THE
PHYSICAL
AND ENGINEERING
SCIENCES

Volume 1
Distributional and Fractal
Calculus,
Integral Transforms and
Wavelets

BIRKHÄUSER
Boston Basel Berlin

Alexander I. Saichev
Radio Physics Department
University of Nizhniy Novgorod
Nizhniy Novgorod, 603022
Russia

Wojbor A. Woyczynski
Department of Statistics and Center for Stochastic and Chaotic Processes
in Science and Technology
Case Western Reserve University
Cleveland, Ohio 44106
U.S.A.

Library of Congress Cataloging-in-Publication Data

Woyczynski, W. A. (Wojbor Andrzej), 1943-
Distributions in the physical and engineering sciences / Wojbor A.
Woyczynski, Alexander I. Saichev.
p. cm. -- (Applied and numerical harmonic analysis)
Includes bibliographical references and index.
Contents: V. 1. Distributional and fractal calculus, integral
transforms, and wavelets.
ISBN-13: 978-1-4612-8679-0 e-ISBN-13: 978-1-4612-4158-4
DOI: 10.1007/978-1-4612-4158-4
1. Theory of distributions (Functional analysis) I. Saichev, A.
I. II. Title. III. Series.
QA324.W69 1996
515'.782'0245--dc20 96-39028
CIP

Printed on acid-free paper


Birkhäuser

© 1997 Birkhäuser Boston

Copyright is not claimed for works of U.S. Government employees.


All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording,
or otherwise, without prior permission of the copyright owner.

Permission to photocopy for internal or personal use of specific clients is granted by
Birkhäuser Boston for libraries and other users registered with the Copyright Clearance
Center (CCC), provided that the base fee of $6.00 per copy, plus $0.20 per page is paid directly
to CCC, 222 Rosewood Drive, Danvers, MA 01923, U.S.A. Special requests should be
addressed directly to Birkhäuser Boston, 675 Massachusetts Avenue, Cambridge, MA 02139,
U.S.A.

ISBN-13: 978-1-4612-8679-0

Camera-ready text prepared in LaTeX by T & T TechWorks Inc., Coral Springs, FL.

9 8 7 6 5 4 3 2 1
Contents

Introduction

Notation

Part I DISTRIBUTIONS AND THEIR BASIC APPLICATIONS

1 Basic Definitions and Operations
1.1 The "delta function" as viewed by a physicist and an engineer
1.2 A rigorous definition of distributions
1.3 Singular distributions as limits of regular functions
1.4 Derivatives; linear operations
1.5 Multiplication by a smooth function; Leibniz formula
1.6 Integrals of distributions; the Heaviside function
1.7 Distributions of composite arguments
1.8 Convolution
1.9 The Dirac delta on R^n, lines and surfaces
1.10 Linear topological space of distributions
1.11 Exercises

2 Basic Applications: Rigorous and Pragmatic
2.1 Two generic physical examples
2.2 Systems governed by ordinary differential equations
2.3 One-dimensional waves
2.4 Continuity equation
2.5 Green's function of the continuity equation and Lagrangian coordinates
2.6 Method of characteristics
2.7 Density and concentration of the passive tracer
2.8 Incompressible medium
2.9 Pragmatic applications: beyond the rigorous theory of distributions
2.10 Exercises

Part II INTEGRAL TRANSFORMS AND DIVERGENT SERIES

3 Fourier Transform
3.1 Definition and elementary properties
3.2 Smoothness, inverse transform and convolution
3.3 Generalized Fourier transform
3.4 Transport equation
3.5 Exercises

4 Asymptotics of Fourier Transforms
4.1 Asymptotic notation, or how to get a camel to pass through a needle's eye
4.2 Riemann-Lebesgue Lemma
4.3 Functions with jumps
4.4 Gamma function and Fourier transforms of power functions
4.5 Generalized Fourier transforms of power functions
4.6 Discontinuities of the second kind
4.7 Exercises

5 Stationary Phase and Related Methods
5.1 Finding asymptotics: a general scheme
5.2 Stationary phase method
5.3 Fresnel approximation
5.4 Accuracy of the stationary phase method
5.5 Method of steepest descent
5.6 Exercises

6 Singular Integrals and Fractal Calculus
6.1 Principal value distribution
6.2 Principal value of Cauchy integral
6.3 A study of monochromatic wave
6.4 The Cauchy formula
6.5 The Hilbert transform
6.6 Analytic signals
6.7 Fourier transform of Heaviside function
6.8 Fractal integration
6.9 Fractal differentiation
6.10 Fractal relaxation
6.11 Exercises

7 Uncertainty Principle and Wavelet Transforms
7.1 Functional Hilbert spaces
7.2 Time-frequency localization and the uncertainty principle
7.3 Windowed Fourier transform
7.4 Continuous wavelet transforms
7.5 Haar wavelets and multiresolution analysis
7.6 Continuous Daubechies' wavelets
7.7 Wavelets and distributions
7.8 Exercises

8 Summation of Divergent Series and Integrals
8.1 Zeno's "paradox" and convergence of infinite series
8.2 Summation of divergent series
8.3 Tiring Achilles and the principle of infinitesimal relaxation
8.4 Achilles chasing the tortoise in presence of head winds
8.5 Separation of scales condition
8.6 Series of complex exponentials
8.7 Periodic Dirac deltas
8.8 Poisson summation formula
8.9 Summation of divergent geometric series
8.10 Shannon's sampling theorem
8.11 Divergent integrals
8.12 Exercises

A Answers and Solutions
A.1 Chapter 1. Definitions and operations
A.2 Chapter 2. Basic applications
A.3 Chapter 3. Fourier transform
A.4 Chapter 4. Asymptotics of Fourier transforms
A.5 Chapter 5. Stationary phase and related methods
A.6 Chapter 6. Singular integrals and fractal calculus
A.7 Chapter 7. Uncertainty principle and wavelet transform
A.8 Chapter 8. Summation of divergent series and integrals

B Bibliographical Notes

Index

Introduction

Goals and audience


The usual calculus/differential equations sequence taken by the physical sciences
and engineering majors is too crowded to include an in-depth study of many widely
applicable mathematical tools which should be a part of the intellectual arsenal of
any well educated scientist and engineer. So it is common for the calculus sequence
to be followed by elective undergraduate courses in linear algebra, probability and
statistics, and by a graduate course that is often labeled Advanced Mathematics for
Engineers and Scientists. Traditionally, it contains such core topics as equations
of mathematical physics, special functions, and integral transforms. This book is
designed as a text for a modern version of such a graduate course and as a reference
for theoretical researchers in the physical sciences and engineering. Nevertheless,
inasmuch as it contains basic definitions and detailed explanations of a number of
traditional and modern mathematical notions, it can be comfortably and profitably
taken by advanced undergraduate students.
It is written from the unifying viewpoint of distribution theory and enriched
by such modern topics as wavelets, nonlinear phenomena and white noise theory,
which became very important in the practice of physical scientists. The aim of
this text is to give the readers a major modern analytic tool in their research.
Students will be able to independently attack problems where distribution theory
is of importance.
Prerequisites include a typical science or engineering 3-4 semester calculus
sequence (including elementary differential equations, Fourier series, complex
variables and linear algebra; we review the basic definitions and facts as needed).
No probability background is necessary as all the concepts are explained from
scratch. In solving some problems, familiarity with basic computer programming
methods is necessary although using a symbolic manipulation language such as
Mathematica, MATLAB or Maple would suffice. These skills should be acquired
during freshman and sophomore years.
The book can also form the basis of a special one/two semester course on the
theory of distributions and its physical and engineering applications, and serve as a
supplementary text in a number of standard mathematics, physics and engineering
courses such as Signals and Systems, Transport Phenomena, Fluid Mechanics,
Equations of Mathematical Physics, Theory of Wave Propagation, Electrodynam-
ics, Partial Differential Equations, Probability Theory, and so on, where, regret-
tably, the distribution-theoretic side of the material is often superficially treated,
dismissed with the generic statement "... and this can be made rigorous within the
distribution theory ..." or omitted altogether.
Finally, we should make it clear that the book is not addressed to pure mathe-
maticians who plan to pursue research in distributions theory. They do have many
other excellent sources; some of them are listed in the Bibliographical Notes.
Typically, a course based on this text would be taught in a Mathematics/Applied
Mathematics Department. However, in many schools, some non-mathematical
sciences departments (such as Physics and Astronomy, Electrical, Systems, Me-
chanical and Chemical Engineering) could assume responsibility.

Philosophy
The book covers distribution theory from the applied viewpoint; abstract
functional-theoretic constructions are reduced to a minimum. The unifying theme
is the Dirac delta and related one- and multidimensional distributions. To be sure,
these are the distributions that appear in the vast majority of problems encountered
in practice.
Our choice was based on the long experience in teaching mathematics gradu-
ate courses to physical scientists and engineers which indicated that distributions,
although commonly used in their faculty's professional work, are very seldom
learned by students in a systematic fashion; there is simply not enough room in
the engineering curricula. This induced us to weave distributions into an exposi-
tion of integral transforms (including wavelets and fractal calculus), equations of
mathematical physics and random fields and signals, where they enhance the pre-
sentation and permit achieving both, an additional insight into the subject matter
and a computational efficiency.
Distribution theory in its full scope is quite a complex, subtle and difficult branch
of mathematical analysis requiring a sophisticated mathematical background. Our
goal was to restrict exposition to parts that are obviously effective tools in the above
mentioned areas of applied mathematics. Thus many arcane subjects such as the
nuclear structure of locally convex linear topological spaces of distributions are
not included.
We made an effort to be reasonably rigorous and general in our exposition:
results are proved and assumptions are formulated explicitly, and in such a way that
the resulting proofs are as simple as possible. Since in realistic situations similar
sophisticated assumptions may not be valid, we often discuss ways to expand the
area of applicability of the results under discussion. Throughout we endeavor to
favor constructive methods and to derive concrete relations that permit us to arrive
at numerical solutions. Ultimately, this is the essence of most problems in
the applied sciences.
As a by-product, the book should help in improving communication between
applied scientists on the one hand, and mathematicians on the other. The first group
is often only vaguely aware of the variety of modern mathematical tools that can be
applied to physical problems, while the second is often innocent of how physicists
and engineers reason about their problems and how they adapt pure mathematical
theories to become effective tools. Experts in one narrow area often do not see
the vast chasm between mathematical and physical mentalities. For instance, a
mathematician rigorously proves that

$$\lim_{x \to \infty} \log(\log x) = \infty,$$

while a physicist, usually, would not be disposed to follow the same logic. He
might say:
"Wait a second, let's check the number $10^{100}$, which is bigger than most physical
quantities; I know that the number of atoms in our Galaxy is less than $10^{70}$.
The iterated logarithm of $10^{100}$ is only 2, and this seems to be pretty far from
infinity."
This little story illustrates psychological difficulties which one encounters in
writing a book such as this one.
Finally, it is worth mentioning that some portions of material, especially the
parts dealing with the basic distributional formalism, can be treated within the
context of symbolic manipulation languages such as Maple or Mathematica where
the package DiracDelta.m is available. Their use in student projects can enhance
the exposition of the material contained in this book, both in terms of symbolic
computation and visualization. We used them successfully with our students.

Organization
Major topics included in the book are split between two parts:
Part 1. Distributions and their basic physical applications, containing the basic
formalism and generic examples, and
Part 2. Integral transforms and divergent series which contains chapters on
Fourier, Hilbert and wavelet transforms and an analysis of the uncertainty principle,
divergent series and singular integrals.
A related volume (Distributions in the Physical and Engineering Sciences, Vol-
ume 2: Partial Differential Equations, Random Signals and Fields, to appear in
1997) is also divided into two parts:
Part 1. Partial differential equations, with chapters on elliptic, parabolic, hy-
perbolic and nonlinear problems, and
Part 2. Random signals and fields, including an exposition of the probability
theory, white noise, stochastic differential equation and generalized random fields
along with more applied problems such as statistics of a turbulent fluid.

The needs of the applied sciences audience are addressed by a careful and rich
selection of examples arising in real-life industrial and scientific labs. They form
a background for our discussions as we proceed through the material. Numerous
illustrations (62) help better understanding of the core concepts discussed in the
text. A large number (125) of exercises (with answers and solutions provided in a
separate chapter) expands on themes developed in the main text.
A word about notations and the numbering system for formulas. The list of
notation is provided following this introduction. The formulas are numbered sep-
arately in each section to reduce clutter, but, outside the section in which they
appear, referred to by three numbers. For example, formula (4) in section 3 of
chapter 1 will be referred to as formula (1.3.4) outside Section 1.3. Sections and
chapters can be easily located via the running heads.

Acknowledgments
The authors would like to thank Dario Gasparini (Civil Engineering Depart-
ment), David Gurarie (Mathematics Department), Dov Hazony (Electrical Engi-
neering and Applied Physics Department), Philip L. Taylor (Physics Department)
of the Case Western Reserve University, Valery I. Klyatskin of the Institute for At-
mospheric Physics, Russian Academy of Sciences, Askold Malakhov and Gennady
Utkin of the Radiophysics Faculty of the Nizhny Novgorod University, George
Zaslavsky of the Courant Institute at New York University, and Kathi Selig of the
Fachbereich Mathematik, Universität Rostock, who read parts of the book and
offered their valuable comments. A CWRU graduate student Rick Rarick also
took upon himself to read carefully parts of the book from a student viewpoint and
his observations were helpful in focusing our exposition. Finally, the anonymous
referees issued reports on the original version of the book that we found extremely
helpful and that led to a complete revision of our initial plan. Birkhäuser editors
Ann Kostant and Wayne Yuhasz took the book under their wings and we are
grateful to them for their encouragement and help in producing the final copy.
The second named author also acknowledges the early distribution-theoretic influences
of his teachers; as a graduate student at Wroclaw University he learned
some of the finer points of the subject (such as Gevrey classes theory and hypoelliptic
convolution equations) from Zbigniew Zieleźny (now at SUNY at Buffalo)
who earlier also happened to be his first college calculus teacher at the Wroclaw
Polytechnic. Working with Kazimierz Urbanik (who in the 50s, simultaneously
with Gelfand, created the framework for generalized random processes) as a thesis
advisor also kept the functional perspective in constant view. Those interests were
kept alive with the early 70s visits to Séminaire Laurent Schwartz at Paris École
Polytechnique.

Authors
Alexander I. SAICHEV received his B.S. in the Radio Physics Faculty at Gorky
State University, Gorky, Russia, in 1969, a Ph.D. from the same faculty in 1975 for
a thesis on Kinetic equations of nonlinear random waves, and his D.Sc. from the
Gorky Radiophysical Research Institute in 1983 for a thesis on Propagation and
backscattering of waves in nonlinear and random media. Since 1980 he has held
a number of faculty positions at Gorky State University (now Nizhniy Novgorod
University) including the senior lecturer in statistical radio physics, professor of
mathematics and chairman of the mathematics department. Since 1990 he has
visited a number of universities in the West including the Case Western Reserve
University, University of Minnesota, etc. He is a co-author of a monograph Non-
linear Random Waves and Turbulence in Nondispersive Media: Waves, Rays and
Particles and served on editorial boards of Waves in Random Media and Radio-
physics and Quantum Electronics. His research interests include mathematical
physics, applied mathematics, waves in random media, nonlinear random waves
and the theory of turbulence. He is currently Professor of Mathematics at the Radio
Physics Faculty of the Nizhniy Novgorod University.

Wojbor A. WOYCZYNSKI received his B.S./M.Sc. in Electrical and Computer
Engineering from Wroclaw Polytechnic in 1966 and a Ph.D. in Mathematics in
1968 from Wroclaw University, Poland. He moved to the U.S. in 1970, and
since 1982, has been Professor of Mathematics and Statistics at Case Western
Reserve University in Cleveland, and served as chairman of the department there
from 1982 to 1991. Before that, he held tenured faculty positions at Wroclaw
University, Poland, and at Cleveland State University, and visiting appointments
at Carnegie-Mellon University, Northwestern University, University of North Carolina,
University of South Carolina, University of Paris, Göttingen University,
Aarhus University, Nagoya University, University of Minnesota and the University
of New South Wales in Sydney. He is also (co-)author and/or editor of seven books
on probability theory, harmonic and functional analysis, and applied mathematics,
and serves as a member of editorial boards of the Annals of Applied Probability,
Probability Theory and Mathematical Statistics, and the Stochastic Processes and
Their Applications. His research interests include probability theory, stochastic
models, functional analysis and partial differential equations and their applica-
tions in statistics, statistical physics, surface chemistry and hydrodynamics. He is
currently Director of the CWRU Center for Stochastic and Chaotic Processes in
Science and Technology.
Notation

$\lceil a \rceil$   least integer greater than or equal to $a$
$\lfloor a \rfloor$   greatest integer less than or equal to $a$
$C$   concentration
$\mathbf{C}$   complex numbers
$C(x)$   $= \int_0^x \cos(\pi t^2/2)\,dt$, Fresnel cosine integral
$C^\infty$   space of smooth (infinitely differentiable) functions
$\mathcal{D} = C_0^\infty$   space of smooth functions with compact support
$\mathcal{D}'$   dual space to $\mathcal{D}$, space of distributions
$\bar{D}$   the closure of domain $D$
$D/Dt$   $= \partial/\partial t + v \cdot \nabla$, substantial derivative
$\delta(x)$   Dirac delta centered at 0
$\delta(x - a)$   Dirac delta centered at $a$
$\Delta$   Laplace operator
$\mathcal{E} = C^\infty$   space of smooth functions
$\mathcal{E}'$   dual to $\mathcal{E}$, space of distributions with compact support
erf$(x)$   $= (2/\sqrt{\pi}) \int_0^x \exp(-s^2)\,ds$, the error function
$\tilde{f}(\omega)$   Fourier transform of $f(t)$
$\{f(x)\}$   smooth part of function $f$, see page 104
$\lfloor f(x) \rceil$   jump of function $f$ at $x$
$\phi, \psi$   test functions
$\gamma(x)$   canonical Gaussian density
$\gamma_\varepsilon(x)$   Gaussian density with variance $\varepsilon$
$\Gamma(s)$   $= \int_0^\infty e^{-t} t^{s-1}\,dt$, gamma function
$(h, g)$   $= \int h(x) g(x)\,dx$, the Hilbert space inner product
$\chi(x)$   canonical Heaviside function, unit step function
$\hat{H}$   the Hilbert transform operator
$j, J$   Jacobians
$I_A(x)$   the indicator function of set $A$ ($=1$ on $A$, $=0$ off $A$)
Im $z$   the imaginary part of $z$
$\lambda_\varepsilon(x)$   $= \pi^{-1} \varepsilon (x^2 + \varepsilon^2)^{-1}$, Cauchy density
$L^p(A)$   Lebesgue space of functions $f$ with $\int_A |f(x)|^p\,dx < \infty$
$\mathbf{N}$   nonnegative integers
$\phi = O(\psi)$   $\phi$ is of the order not greater than $\psi$
$\phi = o(\psi)$   $\phi$ is of the order smaller than $\psi$
PV   principal value of the integral
$\mathbf{R}$   real numbers
$\mathbf{R}^d$   $d$-dimensional Euclidean space
Re $z$   the real part of $z$
$\rho$   density
sign$(x)$   $1$ if $x > 0$, $-1$ if $x < 0$, and $0$ if $x = 0$
sinc $\omega$   $= \sin \pi\omega / \pi\omega$
$\mathcal{S}$   space of rapidly decreasing smooth functions
$\mathcal{S}'$   dual to $\mathcal{S}$, space of tempered distributions
$S(x)$   $= \int_0^x \sin(\pi t^2/2)\,dt$, Fresnel sine integral
$T, S$   distributions
$T[\phi]$   action of $T$ on test function $\phi$
$T_f$   distribution generated by function $f$
$\tilde{T}$   generalized Fourier transform of $T$
$z^*$   complex conjugate of number $z$
$\mathbf{Z}$   integers
$\nabla$   gradient operator
$\longmapsto$   Fourier map
$\to$   converges to
$\Rightarrow$   uniformly converges to
$*$   convolution
$[\![\,\cdot\,]\!]$   physical dimensionality of a quantity
$\emptyset$   empty set
$\bullet$   end of proof, example
Part I
DISTRIBUTIONS AND THEIR BASIC
APPLICATIONS
Chapter 1
Basic Definitions and Operations

1.1 The "delta function" as viewed by a physicist and an engineer
The notion of a distribution (or a generalized function, the term often used
in other languages) is a comparatively recent invention, although the concept is
one of the most important in mathematical areas with physical applications. By
the middle of the 20th century, the theory took final shape, and distributions are
commonly used by physicists and engineers today.
This book presents an exposition of the theory of distributions, their range of
applicability, and their advantages over familiar smooth functions. The Dirac
delta function, more often called simply the delta function, is the most fundamental
distribution, introduced by physicists as a convenient "automation" tool for
handling unwieldy calculations. Its introduction was preceded by the practical use
of another standard discontinuous function, the so-called Heaviside function, which
was applied in the analysis of electrical circuits. However, as is the case with many
mathematical techniques that are heuristically applied by physicists and engineers,
such as the nabla operator or operational calculus, intuitive use can sometimes lead
to false conclusions, which explains the need for a rigorous mathematical theory.
Let us begin by describing the way in which distributions and, in particular,
the "delta function", are usually introduced in the physical sciences.
Typically, the delta function is defined as a limit, as $\varepsilon \to 0$, of certain rectangular
functions (see Fig. 1.1.1),

$$f_\varepsilon(x) = \begin{cases} 1/(2\varepsilon), & \text{for } |x| < \varepsilon; \\ 0, & \text{for } |x| > \varepsilon. \end{cases} \qquad (1)$$

FIGURE 1.1.1
A naive representation of the delta function as a limit of rectangular functions.

As $\varepsilon \to 0$, the rectangles become narrower, but taller. However, their areas always
remain constant, since for any $\varepsilon$,

$$\int f_\varepsilon(x)\,dx = 1. \qquad (2)$$

In other words, the delta function is being defined as a pointwise limit

$$\delta(x) = \lim_{\varepsilon \to 0} f_\varepsilon(x). \qquad (3)$$

This pointwise limit, as can be easily seen, is zero everywhere except at the point
$x = 0$ where it is infinity. Therefore, the definition of the delta function common
in the applied literature is

$$\delta(x) = \begin{cases} \infty, & \text{for } x = 0; \\ 0, & \text{for } |x| > 0, \end{cases} \qquad (4)$$

under the additional condition that the area beneath it is equal to one. This, in
particular, yields the well-known probing property of the delta function when
convolved with any continuous function:

$$\int \delta(x - a)\phi(x)\,dx = \phi(a). \qquad (5)$$

In other words, integrating $\phi$ against a delta function we recover the value of $\phi$ at
the (only) point where the delta function is not equal to zero. Here, and throughout
the remainder of the book, an integral written without the limits will indicate
integration over the entire infinite line (plane, space, etc.), that is, from $-\infty$ to
$+\infty$.
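The probing property (5) can be checked numerically by replacing the delta function with the rectangles of formula (1). The following is a minimal sketch, not part of the original text: the helper names `rect`, `integrate` and `probe`, the midpoint quadrature, and the choice of $\phi = \cos$ and $a = 0.7$ are all illustrative assumptions.

```python
import math

def rect(x, eps):
    """Rectangular function f_eps of formula (1): height 1/(2*eps) for |x| < eps."""
    return 1.0 / (2.0 * eps) if abs(x) < eps else 0.0

def integrate(g, lo, hi, n=20000):
    """Midpoint rule on [lo, hi]; an assumed, deliberately simple quadrature."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def probe(phi, a, eps):
    """Integral of f_eps(x - a) * phi(x); the integrand vanishes outside [a - eps, a + eps]."""
    return integrate(lambda x: rect(x - a, eps) * phi(x), a - eps, a + eps)

a = 0.7  # hypothetical probing point
area = integrate(lambda x: rect(x, 0.1), -0.1, 0.1)  # unit area, formula (2)
values = {eps: probe(math.cos, a, eps) for eps in (1.0, 0.1, 0.01)}

for eps, value in values.items():
    print(f"eps = {eps:5.2f}   integral = {value:.6f}   cos(a) = {math.cos(a):.6f}")
```

As the rectangles narrow, the printed integrals should approach $\cos(0.7)$, which is what formula (5) asserts in the limit.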
At this point we would also like to bring up the question of dimensionality which
is always of utmost importance to physicists and engineers, but usually neglected
by mathematicians. The delta function is one of the few self-similar functions
whose argument can be a dimensional variable, for example a spatial coordinate $x$
or time $t$, and depending on the dimension of its argument, the delta function itself
has a nonzero dimension. For instance, the dimension of the delta function of time
is equal to the inverse time,

$$[\![\delta(t)]\!] = 1/T,$$

i.e., the dimension of frequency since, by definition, the integral of the delta function
of time with respect to time is equal to one, a dimensionless quantity.
Notice that $\delta(t)$ has the same dimension as the inverse power function $1/t$. In
what follows (see Section 6.2) we will derive formulas, important for physical
applications, which provide a deeper inner connection between such seemingly
unrelated functions.

1.2 A rigorous definition of distributions


The physical definition of the delta function introduced in the previous section is
not mathematically correct. Even if we skip over the question of whether functions
can take $\infty$ as a value, the integral of the delta function given by equality (1.1.4)
is either not well defined if understood as a Riemann integral, or equals zero if
understood as a Lebesgue integral. Observe, however, that for each $\varepsilon > 0$, the
integral

$$T_\varepsilon[\phi] = \int f_\varepsilon(x)\phi(x)\,dx \qquad (1)$$

exists for any fixed continuous test function $\phi$, and as $\varepsilon \to 0+$, it converges to the
value of the test function at zero:

$$\lim_{\varepsilon \to 0+} T_\varepsilon[\phi] = \phi(0). \qquad (2)$$

As we show below, one of the possible mathematically correct definitions of the
delta function can be based on integral equalities of type (2) and their interpretation
as limits of integrals of type (1) rather than on the pointwise limits of ordinary
functions. Recall that the integral (1) represents what in mathematics is called a
linear functional on test functions $\phi(x)$, generated by the function $f_\varepsilon(x)$ which
determines all the functional's properties and is called the kernel of a functional.
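The convergence of the integrals of type (1) to the value of the test function at zero can be spelled out in one line. The worked step below is a sketch that uses only the rectangular kernel of formula (1.1.1) and the mean value theorem for integrals; the intermediate point $\xi_\varepsilon$ is notation introduced here for illustration.

```latex
T_\varepsilon[\phi]
  = \int f_\varepsilon(x)\,\phi(x)\,dx
  = \frac{1}{2\varepsilon}\int_{-\varepsilon}^{\varepsilon}\phi(x)\,dx
  = \phi(\xi_\varepsilon)
  \quad\text{for some } \xi_\varepsilon \in [-\varepsilon,\varepsilon],
\qquad\text{hence}\qquad
\lim_{\varepsilon\to 0+} T_\varepsilon[\phi]
  = \lim_{\varepsilon\to 0+}\phi(\xi_\varepsilon)
  = \phi(0).
```

The last equality uses only the continuity of $\phi$ at zero, since $|\xi_\varepsilon| \le \varepsilon$ forces $\xi_\varepsilon \to 0$.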
The notion of a functional is more general than that of a function of a real
variable. A functional depends on a variable which is a function itself but its values
are real numbers. This modern mathematical notion will help us develop a rigorous
definition of the delta function. It should be noted that Paul DIRAC, "father" of
the delta function and one of the creators of quantum mechanics, recognized the
necessity of a functional approach to distributions earlier than most of his fellow
physicists. For this reason the rigorous version of the intuitive delta function will
be called henceforth the Dirac delta distribution, or simply the Dirac delta. We
will use that term to emphasize that the delta function is not a function.
Let us consider a linear functional $T[\phi]$ on test functions $\phi$, generated by an
integral

$$T[\phi] = \int f(x)\phi(x)\,dx \qquad (3)$$

with kernel $f(x)$. Test functions $\phi$ will come from a certain set $\mathcal{D}$ of test functions
which will be selected later. Once this set of test functions is chosen, the set of linear
functionals on $\mathcal{D}$, called the dual space of $\mathcal{D}$ and denoted $\mathcal{D}'$, will be automatically
determined. It is these functionals that will be identified later with distributions.
The functional which assigns to each test function its value at 0 will correspond to
the Dirac delta distribution. It is worthwhile to observe that the narrower the set
of the test functions, the broader the set of linear functionals defined on the latter,
and vice versa. Therefore, as a rule, to obtain a large set of distributions, we have
to impose rather strict constraints on the set of test functions. At the same time
the set of test functions should not be too small. This would restrict the range of
problems where the distribution theoretic tools can be used.
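The distinction between a function and a functional can be made concrete in code: a functional is a procedure that takes a whole function as input and returns a number. The sketch below is only an illustration under stated assumptions. The names `regular_functional`, `dirac_delta`, `phi`, `psi` and `combo` are hypothetical, the integral (3) is approximated by a midpoint rule on $[-1, 1]$, and the two stand-in test functions are smooth bumps vanishing outside that interval.

```python
import math

def integrate(g, lo, hi, n=20000):
    """Midpoint rule on [lo, hi]; an assumed, deliberately simple quadrature."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def regular_functional(f, lo=-1.0, hi=1.0):
    """Functional of type (3) with kernel f; test functions are assumed to vanish outside [lo, hi]."""
    return lambda phi: integrate(lambda x: f(x) * phi(x), lo, hi)

def dirac_delta(phi):
    """The functional assigning to each test function its value at 0."""
    return phi(0.0)

def phi(x):
    """Stand-in test function: smooth and zero outside (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def psi(x):
    """A second stand-in test function, supported on (-0.25, 0.75)."""
    return phi(2.0 * x - 0.5)

def combo(x):
    """The linear combination 2*phi - 3*psi."""
    return 2.0 * phi(x) - 3.0 * psi(x)

T_cos = regular_functional(math.cos)

# Both functionals are linear: T[2*phi - 3*psi] = 2*T[phi] - 3*T[psi].
print(T_cos(combo), 2.0 * T_cos(phi) - 3.0 * T_cos(psi))
print(dirac_delta(combo), 2.0 * dirac_delta(phi) - 3.0 * dirac_delta(psi))
```

Note that `dirac_delta` needs no kernel at all; it is defined directly by its action on test functions, which is the point of view adopted in this section.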
There are a few natural demands on the set $\mathcal{D}$ of test functions. In particular,
it has to be broad enough to identify usual continuous kernels $f$ via the integral
functional (3). In other words, once the values of functional $T[\phi]$ are known for
all $\phi \in \mathcal{D}$, kernel $f$ has to be uniquely determined. Paraphrasing, we can say that
we require that the set $\mathcal{D}'$ of distributions be rich enough to include all continuous
functions.
It turns out that the family of all infinitely differentiable functions with compact
support is a good candidate for the space $\mathcal{D}$ of test functions. From now on, we
shall reserve $\mathcal{D}$ for this particular space. Recall that a function is said to be of
compact support if it is equal to zero outside a certain bounded set on the $x$-axis.
The support of $f$ itself, denoted supp $f$, is by definition the closure of the set of
$x$'s such that $f(x) \neq 0$.
Let us show that the value of a continuous function $f$ at any point $x$ is determined
by the values of functional (3) on all test functions $\phi \in \mathcal{D}$.
Consider the function

$$\omega(x) = \begin{cases} C \exp\{-(1 - x^2)^{-1}\}, & \text{for } |x| < 1; \\ 0, & \text{for } |x| \ge 1, \end{cases} \qquad (4)$$

where the constant $C$ is selected in such a way that the normalization condition

$$\int \omega(x)\,dx = 1$$

is satisfied. It turns out that $C \approx 2.25$, and the bell-shaped function is pictured in
Fig. 1.2.1. It can be easily shown that the function $\omega$ is an infinitely differentiable
function with compact support. Indeed, it vanishes outside the $[-1, 1]$ interval
which is bounded, and it has derivatives of arbitrary order everywhere, including the
two delicate points $+1$ and $-1$, where at $+1$ one checks that all the left derivatives
are zero and the right derivatives are obviously identically zero, and one proceeds
similarly at $-1$.

FIGURE 1.2.1
Graph of the bell-shaped function $\omega(x)$, which is both very smooth and has compact
support.

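Rescaling the function $\omega$, for each $\varepsilon > 0$, we can produce a new function

$$\omega_\varepsilon(x) = \frac{1}{\varepsilon}\,\omega\!\left(\frac{x}{\varepsilon}\right).$$

Clearly, it has compact support as it vanishes outside the interval $[-\varepsilon, \varepsilon]$, and it is
also infinitely differentiable, as can be checked by an application of the chain rule.
Moreover, changing the variables one can check that

$$\int \omega_\varepsilon(x)\,dx = 1.$$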

It follows from the generalized mean value theorem for integrals, and from the
continuity of the function $f(x)$, that, as $\varepsilon \to 0$, the value of the functional

$$T_\varepsilon[f] = \int f(x)\,\omega_\varepsilon(x)\,dx \to f(0).$$

Thus the value of $f$ at 0 can be recovered by evaluating the functionals $T_\varepsilon$ at $f$. Values
of $f$ at other points $y$ can be recovered by evaluating the integral functionals on test
functions $\omega_\varepsilon$ shifted by $y$. This gives a proof of our statement. •
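
The recovery argument above can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes a plain midpoint rule for the integrals, and the helper names (`bump`, `integrate`, `omega_eps`, `T_eps`) are illustrative choices. It estimates the constant $C$ of formula (4) and evaluates the integral of $f$ against the rescaled bell-shaped function, centered at 0 and at a shifted point.

```python
import math

def bump(x):
    """Unnormalized bell shape exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Normalizing constant C of formula (4): reciprocal of the area under the bump.
C = 1.0 / integrate(bump, -1.0, 1.0)

def omega_eps(x, eps):
    """Rescaled function (1/eps) * omega(x/eps), supported on [-eps, eps]."""
    return C * bump(x / eps) / eps

def T_eps(f, eps, y=0.0):
    """Integral of f against omega_eps shifted to the point y."""
    return integrate(lambda x: f(x) * omega_eps(x - y, eps), y - eps, y + eps)

print("C =", C)
for eps in (1.0, 0.1, 0.01):
    # The two columns should approach cos(0) = 1 and exp(1) as eps shrinks.
    print(eps, T_eps(math.cos, eps), T_eps(math.exp, eps, y=1.0))
```

The choice of `math.cos` and `math.exp` as sample continuous functions is arbitrary; any continuous $f$ should show the same behavior as $\varepsilon$ decreases.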

By definition, any linear functional $T[\phi]$ which is continuous on the set $\mathcal{D}$ of
infinitely differentiable functions with compact support is called a distribution.
The set of all distributions on $\mathcal{D}$, that is, the dual space $\mathcal{D}'$, is often called the
Sobolev-Schwartz space.
The Russian mathematician Sergei SOBOLEV laid the foundation of the rigorous
theory of distributions in the 1930s while looking for generalized solutions
for partial differential equations. Laurent SCHWARTZ, a French mathematician,
completed the work on the foundations by building a precise structure for the dis-
tribution theory based on concepts that are called locally convex topological vector
spaces. He was awarded the Fields Medal for his work in 1950. Thus, for the first
time since Newton, ideas about differentiability underwent a major revision.
A few additional comments about the above definition are warranted, and some
of its statements have to be made more precise. A functional $T$ is said to be linear
on $\mathcal{D}$ if it satisfies the equality

$$T[\alpha\phi + \beta\psi] = \alpha T[\phi] + \beta T[\psi]$$

for any test functions $\phi$ and $\psi$ in $\mathcal{D}$ and arbitrary real (or complex) numbers $\alpha$ and
$\beta$. A functional $T$ on $\mathcal{D}$ is called continuous if, for any sequence of functions $\phi_k(x)$
from $\mathcal{D}$ which converges to a test function $\phi(x)$, the numbers $T[\phi_k]$, representing the
values of the functional $T$ on the $\phi_k$'s, converge to the number $T[\phi]$. The convergence
of the sequence $\phi_k$ of test functions in $\mathcal{D}$ to $\phi$ is understood in this case as meaning that
(1) The supports of the $\phi_k$'s, that is, the (closures of) sets of $x$'s where $\phi_k \neq 0$,
are all contained in a fixed bounded set on the $x$-axis, and
(2) As $k \to \infty$, the functions $\phi_k$ themselves, and all their derivatives
$\phi_k^{(n)}(x)$, $n = 1, 2, \dots$, converge uniformly to the corresponding derivatives of
the limit test function $\phi(x)$, that is, for each $n = 0, 1, 2, \dots$,

$$\sup_x \left|\phi_k^{(n)}(x) - \phi^{(n)}(x)\right| \to 0. \tag{5}$$

Let us give several examples of such functionals.



Example 1. Let the function $f(x)$ appearing on the right-hand side of (3) be
locally integrable, that is, integrable over any finite interval of the $x$-axis. A
continuous function on the whole axis is an example of such a function, as is a
function which is simply integrable. Then the right-hand side of (3) is well defined
for any $\phi \in \mathcal{D}$ and clearly defines a linear functional on it. Its continuity in the
sense defined above is immediately verifiable. A distribution defined in such a way,
with the help of a standard "good" function $f$, will be called a regular distribution
and denoted $T_f$. In this context, we can say that locally integrable functions can
be identified with certain distributions in the distribution space $\mathcal{D}'$. •
However, some linear continuous functionals (distributions) on $\mathcal{D}$ cannot be
identified with locally integrable kernels, and are then called singular distributions.
Example 2. The simplest example of a singular distribution is a functional
that assigns to each test function $\phi$ in $\mathcal{D}$ its value at $x = 0$. This distribution is
traditionally denoted by $\delta$, and thus by definition,

$$\delta[\phi] = \phi(0). \tag{6}$$

It is not a regular distribution generated by a locally integrable function, and it is
called the Dirac delta. The above defining equation is sometimes written heuristically
in the integral form

$$\int \delta(x)\phi(x)\,dx = \phi(0),$$

although formally the integral on the left-hand side does not make any sense.
However, the above equation can serve as an intuitive mnemonic rule that, if
used judiciously, will greatly facilitate actual calculations involving Dirac delta
distributions. By now, it should also be clear to the reader that the name "delta
function" is a misnomer. The delta function is not a function but a singular distribution
functional, and one should not talk lightly about its value at $x$. Writing the
argument $x$ is, however, convenient since in the future it will permit us to talk about
the distribution functional $\delta(x - a)$ defined by the formula

$$\int \delta(x - a)\phi(x)\,dx = \phi(a); \tag{7}$$

it could be thought of as the Dirac delta shifted by $a$. Another, more rigorous
possibility would be to denote by $\delta_a$ the Dirac delta centered at $a$, but this notation
becomes unwieldy if $a$ has to be replaced in the subscript by a more complex
expression. •
In what follows, in addition to routine (ab)use of the integrals (3) and (7)
to denote the action of the distribution functional on test functions, we will utilize
another convenient compact notation

$$T_f[\phi] = \int f(x)\phi(x)\,dx. \tag{8}$$

Remark 1. When discussing both regular functions and singular distributions, a
vital role was played by the notion of the support of a function. Recall that the support
of a regular function $f(x)$ was defined as the closure of the set of $x$'s where $f(x)$ is
different from 0. Thus the support of the bell-shaped function from (4) is the
segment $[-1, 1]$. Similarly, one can define the notion of support of a distribution
functional. The distribution $T$ is considered to be equal to zero in the open region $B$
on the $x$-axis if $T[\phi] = 0$ for all the test functions $\phi$ with supports contained in $B$.
The complement of the largest open region in which the distribution $T$ is equal to zero
will be called the support of the distribution $T$ and denoted $\operatorname{supp} T$. It immediately
follows from the above definition that the support of the delta function consists of
a single point $x = 0$, that is,

$$\operatorname{supp} \delta = \{0\}.$$

1.3 Singular distributions as limits of regular functions
Although the Dirac delta distribution itself cannot be represented in the form
of an integral functional, it can be obtained as a limit of a sequence of integral
functionals

$$T_k[\phi] = \int f_k(x)\phi(x)\,dx \tag{1}$$

with respect to kernels that are regular functions, for example, the rectangular
functions introduced in Section 1.1. In a sense the distribution $\delta$ can then be
understood as being represented by such an approximating sequence $\{f_k(x)\}$, and
many properties of $\delta$ can be derived from the properties of the sequence $f_k$. The
approximation of $\delta$ by $f_k$ in the sense that, for each test function $\phi$ in $\mathcal{D}$,

$$T_k[\phi] \to \delta[\phi] = \phi(0)$$

as $k \to \infty$, is called the weak approximation, and the corresponding convergence
is called the weak convergence.
The choice of a weakly convergent sequence of distributions $\{T_k\}$ represented
by regular functions $\{f_k\}$ is clearly not unique, and instead of the rectangular functions
from Section 1.1, it is always possible and often more convenient to select
them in such a way that the functions $f_k(x)$ are infinitely differentiable (although not
necessarily with compact support).
Example 1. Consider the family of Gaussian functions (see Fig. 1.3.1)

$$\gamma_\varepsilon(x) = \frac{1}{\sqrt{2\pi\varepsilon}} \exp\left(-\frac{x^2}{2\varepsilon}\right) \tag{2}$$

parametrized by a parameter $\varepsilon > 0$, and take as a weakly approximating sequence
$f_k(x) = \gamma_{1/k}(x)$, $k = 1, 2, \dots$. Notice that the constant in front of the exponential
function in (2) has been selected in such a way that

$$\int \gamma_\varepsilon(x)\,dx = 1.$$

FIGURE 1.3.1
Graphs of the first two elements of the sequence of Gaussian functions weakly
convergent to the Dirac delta.

As $k \to \infty$, we have that $\varepsilon \to 0$, and the approximating Gaussian functions
become higher and higher peaks, more and more concentrated around $x = 0$, while
preserving the total area underneath them, in accordance with the above normalization
condition. •
Example 2. Let us consider another sequence of regular functionals converging
weakly to the Dirac delta, determined by the kernels $f_k = \lambda_{1/k}$, where

$$\lambda_\varepsilon(x) = \frac{1}{\pi}\,\frac{\varepsilon}{x^2 + \varepsilon^2}. \tag{3}$$

Physicists often call these functions Lorentz curves, and mathematicians call them
Cauchy densities. •
Although at first sight Lorentz functions look somewhat like Gaussian functions
(see Fig. 1.3.1), there are some significant differences. Both are infinitely differentiable
and integrable functions (satisfying the above normalization condition) on
the entire $x$-axis, since the indefinite integral

$$\int \frac{1}{x^2 + 1}\,dx = \arctan x$$

has finite limits at $+\infty$ and $-\infty$, and both have values at zero that blow up to $+\infty$
as $\varepsilon \to 0$, since

$$\gamma_\varepsilon(0) = \frac{1}{\sqrt{2\pi\varepsilon}}, \quad \text{and} \quad \lambda_\varepsilon(0) = \frac{1}{\pi\varepsilon}.$$

But whereas a Gaussian function decays exponentially to 0 as $x \to \pm\infty$, the
asymptotic behavior of the Lorentz functions at $x \to \pm\infty$ is only

$$\lambda_\varepsilon(x) \sim \frac{\varepsilon}{\pi x^2},$$

so that they decay to 0 much less rapidly than the Gaussian functions, and the areas
underneath their graphs are much less concentrated around the origin $x = 0$ than
those of Gaussian functions. Hence, in particular, if $f(x) = x^2$ then

$$T_f[\gamma_\varepsilon] = \int x^2 \frac{1}{\sqrt{2\pi\varepsilon}} \exp\left(-\frac{x^2}{2\varepsilon}\right) dx = \varepsilon \tag{4}$$

is well defined, while

$$T_f[\lambda_\varepsilon] = \frac{1}{\pi} \int \frac{\varepsilon x^2}{x^2 + \varepsilon^2}\,dx$$

is not, since the integral on the right diverges. However, for all test functions
$\phi \in \mathcal{D}$, and for $\varepsilon \to 0$,

$$\int \lambda_\varepsilon(x)\phi(x)\,dx \to \phi(0),$$

in view of the compact support of test functions.
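
The contrast between the two kernels can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything from the original text: the test function `phi` (a bump on $[-1, 1]$ with `phi(0) = 1`), the midpoint rule, and all function names are choices made here. It pairs both kernels with `phi` for decreasing $\varepsilon$, and then compares the second moments truncated to $[-L, L]$: the Gaussian value settles near $\varepsilon$, while the Lorentz value keeps growing with the cutoff $L$.

```python
import math

def gauss(x, eps):
    """Gaussian kernel (2): variance eps, unit area."""
    return math.exp(-x * x / (2.0 * eps)) / math.sqrt(2.0 * math.pi * eps)

def lorentz(x, eps):
    """Lorentz (Cauchy) kernel (3): half-width eps, unit area."""
    return eps / (math.pi * (x * x + eps * eps))

def phi(x):
    """Illustrative test function supported on [-1, 1] with phi(0) = 1."""
    if abs(x) >= 1.0:
        return 0.0
    return (1.0 + x) * math.exp(1.0 - 1.0 / (1.0 - x * x))

def integrate(g, a, b, n=50000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def pairing(kernel, eps):
    """Integral of kernel(x, eps) * phi(x); phi vanishes outside [-1, 1]."""
    return integrate(lambda x: kernel(x, eps) * phi(x), -1.0, 1.0)

def truncated_second_moment(kernel, eps, cutoff):
    """Integral of x^2 * kernel(x, eps) over [-cutoff, cutoff]."""
    return integrate(lambda x: x * x * kernel(x, eps), -cutoff, cutoff)

# Both pairings should approach phi(0) = 1 as eps decreases.
for eps in (0.1, 0.01, 0.001):
    print(eps, pairing(gauss, eps), pairing(lorentz, eps))

# With eps = 1 the Gaussian column stabilizes, the Lorentz column does not.
for cutoff in (10.0, 100.0, 1000.0):
    print(cutoff,
          truncated_second_moment(gauss, 1.0, cutoff),
          truncated_second_moment(lorentz, 1.0, cutoff))
```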


Here a note of caution is in order lest the reader get the impression that regular
functions weakly approximating the Dirac delta must concentrate their nonzero
values in the neighborhood of x = O. This is not the case once we abandon the
restriction (which appeared without mentioning it in the above two examples) that
the weakly approximating regular functions be positive or even real-valued.

Example 3. Consider complex-valued oscillating functions

$$f_\varepsilon(x) = \sqrt{\frac{i}{2\pi\varepsilon}} \exp\left(-\frac{i x^2}{2\varepsilon}\right), \tag{5}$$

parametrized by $\varepsilon > 0$. Their real parts are pictured in Fig. 1.3.2. They are
frequently encountered in quantum mechanics and quasi-optics. In quasi-optics
they appear as the Green's functions of a monochromatic wave in the Fresnel
approximation. The modulus of these functions is constant,

$$|f_\varepsilon(x)| = \frac{1}{\sqrt{2\pi\varepsilon}},$$

for any $x$, and it diverges to $\infty$ as $\varepsilon \to 0$.

FIGURE 1.3.2
Graph of the real part of an element of the family of functions $f_\varepsilon(x)$ from (5), which do not converge
to 0 for $x \neq 0$ as $\varepsilon \to 0$, and still weakly converge to the Dirac delta.

Nevertheless, as $\varepsilon \to 0$, these functions converge weakly to the Dirac delta. In
physical terms, this can be explained by the fact that the function $f_\varepsilon(x)$ defined by (5)
oscillates at a higher and higher rate the smaller $\varepsilon$ becomes. As a result, the integrals
of their products with any test function $\phi(x)$ supported by a region which excludes
the point $x = 0$ converge to $\phi(0) = 0$ as $\varepsilon \to 0$. •
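
The cancellation caused by the rapid oscillations can be observed numerically. The following sketch assumes the form $\sqrt{i/(2\pi\varepsilon)}\,\exp(-ix^2/(2\varepsilon))$ for the kernel (5), with the principal branch of the square root as computed by `cmath.sqrt`. The test function `phi`, its optional `center` argument, and the midpoint rule are illustrative choices, not taken from the text. The first pairing uses a test function centered at 0, the second one a test function supported on $[1, 3]$, away from the origin.

```python
import cmath
import math

def fresnel(x, eps):
    """Oscillating kernel (5): sqrt(i / (2 pi eps)) * exp(-i x^2 / (2 eps))."""
    return cmath.sqrt(1j / (2.0 * math.pi * eps)) * cmath.exp(-1j * x * x / (2.0 * eps))

def phi(x, center=0.0):
    """Illustrative test function on [center - 1, center + 1], equal to 1 at center."""
    u = x - center
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(1.0 - 1.0 / (1.0 - u * u))

def pairing(eps, center=0.0, n=100000):
    """Midpoint-rule value of the integral of fresnel(x, eps) * phi(x, center)."""
    a, b = center - 1.0, center + 1.0
    h = (b - a) / n
    total = 0j
    for i in range(n):
        x = a + (i + 0.5) * h
        total += fresnel(x, eps) * phi(x, center)
    return h * total

# The modulus does not depend on x and equals 1 / sqrt(2 pi eps).
print(abs(fresnel(0.3, 0.002)), 1.0 / math.sqrt(2.0 * math.pi * 0.002))

# First column should approach phi(0) = 1, second column should approach 0.
for eps in (0.05, 0.01, 0.002):
    print(eps, pairing(eps), pairing(eps, center=2.0))
```

The grid is chosen fine enough to resolve the oscillations for the values of $\varepsilon$ used here; much smaller values of $\varepsilon$ would need a finer grid.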
Notice that all of the above examples of weakly approximating families $f_\varepsilon$
for the Dirac delta have been constructed with the help of a single function $f$,
be it Gaussian, Lorentz or oscillating complex-valued, which was later rescaled
following the same rule:

$$f_\varepsilon(x) = \frac{1}{\varepsilon} f\left(\frac{x}{\varepsilon}\right).$$

The properties of the limiting Dirac delta really do not depend on the particular
analytic form of the original regular function $f$. Practically, any smooth enough
function satisfying the normalization condition $\int f(x)\,dx = 1$ will do. In particular,
the function $f$ need not be symmetric (even). The sequence produced by rescaling
the function

$$f(x) = \begin{cases} x^{-2}\exp(-1/x), & \text{for } x > 0; \\ 0, & \text{for } x \le 0, \end{cases} \tag{6}$$

whose plot is represented in Fig. 1.3.3, will also weakly approximate the delta
function. However, in what follows, we shall see that the fine structure of the function
$f$ should not always be ignored by physicists, and that it can affect the final
physical result.

FIGURE 1.3.3
An example of a function $f$ which is not even but such that $f(x/\varepsilon)/\varepsilon$ weakly
converges to $\delta$ as $\varepsilon \to 0$.

1.4 Derivatives; linear operations


The infinite differentiability of the chosen set $\mathcal{D}$ of test functions $\phi(x)$ allows us
to define, for any distribution $T \in \mathcal{D}'$, a derivative of arbitrary order, thus freeing us
from a constant worry about differentiability within the class of regular functions.

It is one of the main advantages the theory of distributions has over the classical
calculus of regular functions. Before we provide a general definition, let us observe
that the familiar integration-by-parts formula in the integral calculus applied to a
differentiable function $f(x)$ and a test function $\phi(x) \in \mathcal{D}$ reduces to

$$\int f'(x)\phi(x)\,dx = -\int f(x)\phi'(x)\,dx, \tag{1}$$

since the boundary term

$$f(x)\phi(x)\Big|_{-\infty}^{\infty} = 0,$$

because the test function $\phi$ is zero outside a certain bounded set on the $x$-axis. If we
think about the regular function $f$ as representing a distribution $T_f$, then equation
(1) can be rewritten as

$$(T_f)'[\phi] = -T_f[\phi'], \tag{2}$$

which is valid for any test function $\phi$. We can take the above equality as a definition
of the functional on the left-hand side and call it the derivative of the distribution
$T_f$; notice that the right-hand side does not depend on the differentiability of $f$.
This idea can be extended to any distribution.
If $T$ is a distribution in $\mathcal{D}'$ then its derivative $T'$ is defined as a distribution in
$\mathcal{D}'$ which is determined by its values (as a functional) on test functions $\phi \in \mathcal{D}$ by
the equality

$$T'[\phi] = -T[\phi'].$$

It is always well defined, since it is a linear and continuous functional on $\mathcal{D}$.
Derivatives of higher order are defined by consecutive application of the operation
of the first derivative. Hence, by definition, if $T$ is a distribution in $\mathcal{D}'$ then its
$n$-th derivative $T^{(n)}$ is again a distribution in $\mathcal{D}'$ determined by its values on test
functions $\phi \in \mathcal{D}$ by

$$T^{(n)}[\phi] = (-1)^n T[\phi^{(n)}].$$
So, distributions always have derivatives of all orders. It is a very nice universe,
indeed. Let us illustrate the concept of the distributional derivative on the Dirac
delta distribution.
Example 1. Consider the distribution $\delta(x - a)$, which is defined by the probing
property at the point $x = a$:

$$\delta(x - a)[\phi] = \phi(a),$$

or, writing informally, by the condition

$$\int \delta(x - a)\phi(x)\,dx = \phi(a).$$

Hence its $n$th derivative $(\delta(x - a))^{(n)}$ is defined by the equality

$$(\delta(x - a))^{(n)}[\phi] = (-1)^n \phi^{(n)}(a).$$

In particular, the first derivative $\delta'$ of the Dirac delta is the functional on $\mathcal{D}$ defined
by the equality

$$\delta'[\phi] = -\phi'(0).$$

The weak approximation of $\delta'$ by regular functions can be accomplished, for example,
by taking a sequence of derivatives $\gamma'_\varepsilon$ of Gaussian functions from formula
(1.3.2) (see Fig. 1.4.1), or of any other smooth weak approximants, since

$$\int \gamma'_\varepsilon(x)\phi(x)\,dx = -\int \gamma_\varepsilon(x)\phi'(x)\,dx \to -\delta[\phi'] = \delta'[\phi]$$

as $\varepsilon \to 0$.

FIGURE 1.4.1
Approximating functions of the first derivative of the Dirac delta.
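
As a rough numerical check of this weak approximation, the sketch below pairs the derivative of the Gaussian kernel with a test function and compares the result with $-\phi'(0)$. The test function `phi`, constructed here so that $\phi'(0) = 1$, and the midpoint rule are illustrative assumptions and do not come from the text.

```python
import math

def gauss_prime(x, eps):
    """Derivative of the Gaussian kernel (1.3.2): -(x / eps) * gamma_eps(x)."""
    gamma = math.exp(-x * x / (2.0 * eps)) / math.sqrt(2.0 * math.pi * eps)
    return -(x / eps) * gamma

def phi(x):
    """Illustrative test function on [-1, 1] with phi(0) = 1 and phi'(0) = 1."""
    if abs(x) >= 1.0:
        return 0.0
    return (1.0 + x) * math.exp(1.0 - 1.0 / (1.0 - x * x))

def pairing(eps, n=100000):
    """Midpoint-rule value of the integral of gauss_prime(x, eps) * phi(x) over [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        total += gauss_prime(x, eps) * phi(x)
    return h * total

# The values should approach -phi'(0) = -1 as eps decreases.
for eps in (0.1, 0.01, 0.001):
    print(eps, pairing(eps))
```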
Notice that the operation of differentiation is a linear operation on the space of
distributions in the sense that if we define the linear combination of two distributions
$T$ and $S$ from $\mathcal{D}'$ by the equality

$$(\alpha T + \beta S)[\phi] = \alpha T[\phi] + \beta S[\phi],$$

where $\alpha$ and $\beta$ are numbers, then

$$(\alpha T + \beta S)' = \alpha T' + \beta S'.$$

The proof of this fact is immediate from the above basic definitions.

1.5 Multiplication by a smooth function; Leibniz formula


Another linear operation, which produces a new distribution from a distribution
$T$ and an infinitely differentiable function $g$, is the multiplication of $T$ by $g$. Denote
the set of all infinitely differentiable functions (but not necessarily with compact
support) on the $x$-axis by $C^\infty$.
By definition, the product $gT$ of a function $g \in C^\infty$ by a distribution $T \in \mathcal{D}'$ is
a distribution in $\mathcal{D}'$ determined by

$$(gT)[\phi] = T[g\phi], \quad \phi \in \mathcal{D}. \tag{1}$$

The right-hand side is well defined since the product of an infinitely differentiable
$g(x)$ by a test function from $\mathcal{D}$ is again a function from $\mathcal{D}$, and in particular it
has compact support. The above formula obviously corresponds to a formula for
regular functions:

$$\int \big(g(x)f(x)\big)\phi(x)\,dx = \int f(x)\big(g(x)\phi(x)\big)\,dx.$$

The above definition, in the particular case of a constant function $g(x) = c$,
which certainly is infinitely differentiable, provides a definition of $cT$, a product
of the number $c$ by the distribution $T \in \mathcal{D}'$:

$$(cT)[\phi] = T[c\phi].$$

Example 1. Let us calculate the product of an arbitrary infinitely differentiable
$g$ with the delta function $\delta(x - a)$. By definition,

$$(g\,\delta(x - a))[\phi] = \delta(x - a)[g\phi] = g(a)\phi(a) = g(a)\,\delta(x - a)[\phi],$$

and we have demonstrated that the distribution

$$g\,\delta(x - a) = g(a)\,\delta(x - a).$$

Observe that our definition does not allow multiplication of distributions by
functions that are not infinitely differentiable. The product $g\phi$ on the right-hand
side of the defining formula (1) has to be infinitely differentiable if we are to apply
the functional $T$ to it, and that cannot be guaranteed unless $g$ itself is infinitely
differentiable. This is an essential restriction that has to be kept in mind.
The differentiation of distributions and their multiplication by a smooth function
are tied together by an analogue of the classical Leibniz formula for the derivative
of a product of two functions.
If $g$ is a function from $C^\infty$ and $T$ is a distribution from $\mathcal{D}'$ then

$$(gT)' = g'T + gT'. \tag{2}$$

Indeed, by (1.4.2), applying the left-hand side to a test function $\phi$ we get that

$$\begin{aligned}
(gT)'[\phi] &= -(gT)[\phi'] = -T[g\phi'] = -T[(g\phi)' - g'\phi] \\
&= -T[(g\phi)'] + T[g'\phi] = T'[g\phi] + (g'T)[\phi] \\
&= (gT')[\phi] + (g'T)[\phi] = (g'T + gT')[\phi]. \quad \bullet
\end{aligned}$$

Similarly, one can prove a general Leibniz formula

$$(gT)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} g^{(k)}\,T^{(n-k)} \tag{2a}$$

for distributions. It is well known for smooth functions from the standard calculus
courses.
Formulas (2) and (2a) may look nice and elegant, but in practice, different
portions of the above chain of equalities may turn out to be more useful in the
evaluation of the derivative of a product $gT$.

Example 2. Applying the Leibniz formula to the product of $g$ and a distribution
$\delta(x - a)$, we immediately get from the second equality in the above chain that

$$(g\,\delta(x - a))'[\phi] = -g(a)\phi'(a).$$

Thus, in particular, for a constant $c$,

$$(c\,\delta(x - a))'[\phi] = -c\,\phi'(a).$$

The above formula can be obtained in a more straightforward manner by observing
that

$$(g\,\delta(x - a))'[\phi] = -(g\,\delta(x - a))[\phi'],$$

and then using the calculation of $g\,\delta(x - a) = g(a)\,\delta(x - a)$ from Example 1. In
a similar fashion, by the repeated use of the above argument one can show that

$$(g\,\delta(x - a))^{(n)} = g(a)\,\delta^{(n)}(x - a). \tag{3}$$

This equality expresses again the remarkable multiplier probing property of the
Dirac delta distribution, complementing the equality $\delta[\phi] = \phi(0)$ discussed before.
It can also be expressed as follows: a function multiplier of the Dirac delta can
be viewed as a constant which can be factored outside the test functional. In
the future we will often refer to the multiplier probing property when analyzing various
applied problems. •
Example 3. Let us find a distribution equal to the product of a function $g \in C^\infty$
and the derivative of the Dirac delta $\delta'(x - a)$. Following the above rules of
differentiation of distributions and their multiplication by smooth functions, we
get

$$\begin{aligned}
(g\,\delta'(x - a))[\phi] &= \delta'(x - a)[g\phi] = -\delta(x - a)[(g\phi)'] = -\delta(x - a)[g'\phi + g\phi'] \\
&= -g'(a)\phi(a) - g(a)\phi'(a).
\end{aligned}$$

Thus, the derivative of the Dirac delta loses the multiplier probing property of the
Dirac delta itself: the result is a linear combination of values of both $g$ and $g'$ at the point
$x = a$. In the particular case of $g(x) = x$ and $a = 0$ the above calculation gives

$$x\,\delta'(x) = -\delta(x). \tag{4}$$

A by-product of the above example is that the Dirac delta is a generalized
distributional solution of the differential equation

$$xT' = -T.$$

This fact, as we shall see later, will have useful consequences for solving real-life
physical and engineering problems. The elegant equation (4), as well as the more
general formula

$$g(x)\,\delta'(x - a) = g(a)\,\delta'(x - a) - g'(a)\,\delta(x - a),$$

cannot be derived if one sticks to the intuitive understanding of the delta function
described in Section 1.1, and it shows the power of mathematical tools introduced
on the last few pages.
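
The result of Example 3 can be checked approximately by replacing $\delta'(x - a)$ with the derivative of a narrow Gaussian. The sketch below is illustrative only: the multiplier `g`, the point `a`, the test function `phi`, the finite-difference helper `derivative`, and the midpoint rule are all assumptions made for the demonstration.

```python
import math

def gauss_prime(x, eps):
    """Derivative of the Gaussian kernel with variance eps."""
    gamma = math.exp(-x * x / (2.0 * eps)) / math.sqrt(2.0 * math.pi * eps)
    return -(x / eps) * gamma

def phi(x):
    """Illustrative test function supported on [-1, 1] with phi(0) = 1."""
    if abs(x) >= 1.0:
        return 0.0
    return (1.0 + x) * math.exp(1.0 - 1.0 / (1.0 - x * x))

def derivative(fn, x, h=1e-5):
    """Central finite difference, used only to evaluate g'(a) and phi'(a)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

def pair_g_delta_prime(g, a, eps, n=100000):
    """Integral of g(x) * gauss_prime(x - a, eps) * phi(x) over [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        total += g(x) * gauss_prime(x - a, eps) * phi(x)
    return h * total

def predicted(g, a):
    """Right-hand side -g'(a) phi(a) - g(a) phi'(a) from Example 3."""
    return -derivative(g, a) * phi(a) - g(a) * derivative(phi, a)

def g(x):
    """Hypothetical smooth multiplier."""
    return 2.0 + math.sin(x)

a = 0.3
print(pair_g_delta_prime(g, a, 1e-4), predicted(g, a))

# Special case (4): g(x) = x and a = 0 should give a value close to -phi(0) = -1.
print(pair_g_delta_prime(lambda x: x, 0.0, 1e-4))
```

Note that the pointwise functions $x\gamma'_\varepsilon(x)$ and $-\gamma_\varepsilon(x)$ are not equal; only their pairings with test functions agree in the limit, which is the sense in which equation (4) holds.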

A word of warning is in order here. Under no circumstances can both sides of
formula (4) be divided by $x$ since, obviously,

$$\delta'(x) \neq -\frac{\delta(x)}{x},$$

not to mention the fact that the right-hand side is not well defined because the
function $1/x$ does not belong to $C^\infty$. This illuminates difficulties with the operation
of division of a distribution by functions that vanish at a certain point. We shall
return to this problem later.
On the other hand, the above properties of the Dirac delta distributions allow
us to sometimes solve a different division problem: finding a distribution $T$
from $\mathcal{D}'$ which satisfies the equation $gT = 0$, where $g$ is a known smooth function.
If $T$ represents a regular function $f$, such an equation obviously has a multitude
of solutions, as it implies only that $f(x)$ and $g(x)$ cannot be different from 0 at
the same point $x$. In other words, the intersection of the supports of $f$ and $g$ has to be
empty:

$$f(x)g(x) = 0 \iff \operatorname{supp} f \cap \operatorname{supp} g = \emptyset.$$

In the generalized sense, however, such equations may appear in different applications
(for example, in the analysis of the propagation of waves in dispersive
media) and may have nontrivial solutions. As an exercise one can check that the
distribution of the form

$$T = c_0\,\delta(x) + \dots + c_{n-1}\,\delta^{(n-1)}(x), \tag{5}$$

where $c_0, c_1, \dots, c_{n-1}$ are arbitrary constants, is a solution of the equation

$$x^n T = 0. \tag{6}$$

Hence, in the above sense, it solves the problem of division of zero.

1.6 Integrals of distributions; the Heaviside function
By analogy with classical calculus, one could define an (indefinite) integral of
a distribution $T$ as a distribution $S$ such that $S' = T$. Without searching for the
general solution of this problem, let us observe that its solution for the Dirac delta
is easy. Consider the so-called Heaviside or unit step function

$$\chi(x) = \begin{cases} 1, & \text{for } x \ge 0; \\ 0, & \text{for } x < 0, \end{cases} \tag{1}$$

often encountered in physical applications and pictured in Fig. 1.6.1.

FIGURE 1.6.1
The graph of the shifted Heaviside function $\chi(x - a)$.

In the sense of classical analysis it has no derivative at $x = 0$, but its distributional
derivative is well defined, and it is easy to see that

$$\chi' = (T_\chi)' = \delta. \tag{2}$$

Indeed, checking the values of the left-hand side as a functional on test functions,
we get that

$$(T_\chi)'[\phi] = -T_\chi[\phi'] = -\int \chi(x)\phi'(x)\,dx = -\int_0^\infty \phi'(x)\,dx = \phi(0) = \delta[\phi].$$

Having found the derivative of the Heaviside function, one can easily compute
the distributional derivative of any piecewise-smooth function $f(x)$ which has
jump discontinuities at points $x_k$, $k = 1, 2, \dots, n$. Such a function can always be
represented as a sum of its continuous piecewise-smooth part $f_s$ without jumps
and a pure jump part, in the following form:

$$f(x) = f_s(x) + \sum_{k=1}^{n} \big(f(x_k + 0) - f(x_k - 0)\big)\,\chi(x - x_k),$$

or in a more compact form

$$f(x) = f_s(x) + \sum_{k=1}^{n} \lfloor f_k \rceil\,\chi(x - x_k),$$

where

$$\lfloor f_k \rceil = f(x_k + 0) - f(x_k - 0)$$

denotes the size of the corresponding jump (see Fig. 1.6.2).

FIGURE 1.6.2
Graphs of a function $f(x)$ with jumps at the points $x_1$, $x_2$, $x_3$ and the corresponding continuous
function $f_s(x)$, which has been obtained from $f$ by the removal of its jumps.

Since the derivative is a linear operation we immediately see that

$$f' = \{f'\} + \sum_{k=1}^{n} \lfloor f_k \rceil\,\delta(x - x_k),$$

which, read in the reverse order, gives a formula for an indefinite integral of any
distribution of the following form: a locally integrable function plus a linear combination
of Dirac deltas centered at points of jumps.
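
The jump formula can be tested on a concrete example by comparing both sides as functionals. In the sketch below, $\{f'\}$ is read as the ordinary derivative of $f$ away from the jump points, which is an interpretation made here. The particular function `f`, the jump locations in `JUMPS`, the test function `phi`, and the midpoint rule are likewise hypothetical choices for illustration.

```python
import math

# Hypothetical (location x_k, jump size) pairs.
JUMPS = [(-0.3, -1.0), (0.5, 2.0)]

def heaviside(x):
    """Unit step function: 1 for x >= 0, 0 for x < 0."""
    return 1.0 if x >= 0.0 else 0.0

def f(x):
    """Smooth part sin(x) plus a pure jump part built from shifted steps."""
    return math.sin(x) + sum(size * heaviside(x - xk) for xk, size in JUMPS)

def f_classical_derivative(x):
    """Ordinary derivative of f away from the jump points."""
    return math.cos(x)

def phi(x):
    """Illustrative test function supported on [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(1.0 - 1.0 / (1.0 - x * x))

def phi_prime(x):
    """Exact derivative of phi, obtained with the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return -2.0 * x / (1.0 - x * x) ** 2 * phi(x)

def integrate(g, a, b, n=100000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Left side: the distributional derivative acting on phi, that is, -T_f[phi'].
lhs = -integrate(lambda x: f(x) * phi_prime(x), -1.0, 1.0)

# Right side: regular part paired with phi, plus jump sizes times phi(x_k).
rhs = (integrate(lambda x: f_classical_derivative(x) * phi(x), -1.0, 1.0)
       + sum(size * phi(xk) for xk, size in JUMPS))

print(lhs, rhs)
```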
It can happen that a function has its first $n - 1$ derivatives in the classical sense,
and only the derivative of order $n - 1$ displays some discontinuities, so that its derivative
has to be considered in the distributional sense. Before providing an example,
let us define the function $\operatorname{sign}(x)$, which is another of those special discontinuous
functions that we will encounter often in what follows. By definition,

$$\operatorname{sign}(x) = \begin{cases} +1, & \text{for } x > 0; \\ 0, & \text{for } x = 0; \\ -1, & \text{for } x < 0. \end{cases} \tag{3}$$

Its graph is presented in Fig. 1.6.3. By a computation similar to that above, one
can check that $\operatorname{sign}'(x) = 2\delta(x)$.

FIGURE 1.6.3
Graph of the function $\operatorname{sign}(x)$.

Example 1. Consider the function $f(x) = x^2 \operatorname{sign}(x)$. It is differentiable in
the classical sense and

$$f'(x) = 2|x|$$

for any point $x$. The derivative, however, is not differentiable at $x = 0$, but in the
distributional sense it is easy to check that

$$f''(x) = 2\operatorname{sign}(x),$$

so that

$$f'''(x) = 2\operatorname{sign}'(x) = 4\delta(x). \quad \bullet$$
At this point it should be observed that there is some flexibility in computing
a function whose distributional derivative is equal to the Dirac delta. The distributional
derivative is determined by its functional action on test functions. So if
we change the value of the Heaviside function at a single point, we also obtain an
indefinite integral of the Dirac delta. As a consequence, the function

$$f(x) = \frac{\operatorname{sign}(x)}{2},$$

which differs from the Heaviside function only by the additive constant $-1/2$ and by
its value at the single point $x = 0$, also satisfies the equality $f' = \delta$.


As far as the definite integral of a distribution on the entire real line is concerned,
it is clear that some additional assumptions are necessary. One such possible
restriction is that the distribution $T$ has compact support. Any such distribution
can be identified with a continuous linear functional on the space $C^\infty$, in the
sense that its value $T[\phi]$ is defined not only on any infinitely differentiable test
function with compact support $\phi \in \mathcal{D}$, but also on any infinitely differentiable
function $\phi \in C^\infty$. Since for a distribution $T_f$ representing a regular function $f$
with compact support

$$T_f[\phi] = \int f(x)\phi(x)\,dx,$$

it is natural, for a distribution $T$ with compact support, to define

$$\int T = T[1],$$

where 1 on the right-hand side stands for a function identically equal to 1. In
particular,

$$\int \delta = \delta[1] = 1.$$

This line of thinking can be extended to introduce another linear operation on
distributions, namely, their convolution with a smooth function. This will be done
in Section 1.8.

1.7 Distributions of composite arguments
The reader should have already noticed that the only singular distribution explicitly
defined so far was the Dirac delta distribution and whatever we could obtain
from it by the linear operations of differentiation and multiplication by a smooth
function from $C^\infty$. In this section we continue using this method of producing
new distributions from the ones already constructed by introducing new linear operations
on general distributions. As usual, our guide will be how the analogous
operation on regular functions can be expressed in terms of the integral functional.

Let us begin with distributions of a composite argument, that is, a composition
of a distribution with a function of the $x$-variable. In the case of a regular function
$f(x)$, the interplay between the integration and composite arguments is expressed
by the usual change-of-variable formula

$$\int f(\alpha(x))\phi(x)\,dx = \int f(y)\,\phi(\beta(y))\,|\beta'(y)|\,dy,$$

where $y = \alpha(x)$, and $x = \beta(y)$ represents the function inverse to $\alpha(x)$, such that
$\beta(\alpha(x)) = x$. An assumption guaranteeing validity of the above formula is that
the function $\alpha(x)$ is strictly monotone and that it maps the $x$-axis onto the entire
$y$-axis.
If we want to use this equality in the functional setting, it is clear that further
restrictions on the composite argument $\alpha(x)$ are necessary. Namely, to assure
that the factor $\phi(\beta(y))|\beta'(y)|$ on the right-hand side is a function in $\mathcal{D}$, it is not
sufficient to assume that the function $\alpha(x)$ is strictly monotone and that it maps the
$x$-axis onto the entire $y$-axis. We also need $\beta(y)$ to be an infinitely differentiable
function $\mathbf{R} \to \mathbf{R}$.¹
So, under the above restrictions on the composite argument $\alpha(x)$, it is clear how
we should proceed in the case of distributions.
By definition, the formula

$$T(\alpha(x))[\phi(x)] = T[\phi(\beta(y))\,|\beta'(y)|] \tag{1}$$

determines the composition of the distribution $T$ with the function $\alpha(x)$.

¹What to do if $\alpha(x)$ is not one-to-one (for example, $\alpha(x) = x^2$) will be discussed elsewhere in
this book.

Example 1. Consider a shift function $\alpha(x) = x - a$. Composition of this
function with the Dirac delta clearly gives

$$\delta(\alpha(x)) = \delta(x - a),$$

where $\delta(x - a)$ was introduced earlier.

Example 2. Consider the distribution $\delta(\alpha(x) - a)$ defined by the equality

$$\delta(\alpha(x) - a)[\phi(x)] = \delta(y - a)[\phi(\beta(y))\,|\beta'(y)|] = \phi(\beta(a))\,|\beta'(a)|,$$

which can be symbolically written as

$$\delta(\alpha(x) - a) = \frac{\delta(x - \beta(a))}{|\alpha'(\beta(a))|}, \tag{2}$$

where we have taken into account the fact that $\beta'(a) = 1/\alpha'(\beta(a))$. This formula is
most frequently applied to a linear composition function $\alpha(x) = cx$, where $c \neq 0$.
In this case, we get that

$$\delta(cx) = \frac{\delta(x)}{|c|}.$$

The above equality expresses the previously mentioned self-similarity property of
the Dirac delta distribution.
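
Formula (2) and its linear special case can be checked approximately by substituting a narrow Gaussian for the Dirac delta. The following sketch rests on several assumptions made here for illustration: the map $\alpha(x) = x^3 + x$ (strictly increasing, with a smooth inverse), the point $a = 2$ (so that $\beta(a) = 1$), the constant $c = -3$, the test function `phi` supported on $[-2, 2]$, and the midpoint rule.

```python
import math

def gauss(x, eps):
    """Gaussian kernel with variance eps and unit area."""
    return math.exp(-x * x / (2.0 * eps)) / math.sqrt(2.0 * math.pi * eps)

def phi(x):
    """Illustrative test function supported on [-2, 2] with phi(0) = 1."""
    u = x / 2.0
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(1.0 - 1.0 / (1.0 - u * u))

def alpha(x):
    """Strictly increasing smooth map of the line onto itself."""
    return x ** 3 + x

def alpha_prime(x):
    """Derivative of alpha; it is at least 1 everywhere."""
    return 3.0 * x * x + 1.0

def pair_composite(inner, eps, n=100000):
    """Integral of gauss(inner(x), eps) * phi(x) over [-2, 2]."""
    h = 4.0 / n
    total = 0.0
    for i in range(n):
        x = -2.0 + (i + 0.5) * h
        total += gauss(inner(x), eps) * phi(x)
    return h * total

a = 2.0       # alpha(1) = 2, hence beta(a) = 1
beta_a = 1.0
c = -3.0

# delta(alpha(x) - a) against phi should be close to phi(beta(a)) / |alpha'(beta(a))|.
print(pair_composite(lambda x: alpha(x) - a, 1e-4), phi(beta_a) / abs(alpha_prime(beta_a)))

# delta(c x) against phi should be close to phi(0) / |c|.
print(pair_composite(lambda x: c * x, 1e-4), phi(0.0) / abs(c))
```

The negative value of `c` is chosen to show that the absolute value in the denominator matters.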

As any other distribution, the distribution of a composite argument can be differentiated
and, in general, standard formulas from classical analysis can be applied.
Let us demonstrate the above statement by rewriting the relation (2) in another
equivalent form. Assuming, for definiteness, that $\alpha(x)$ is a strictly increasing
function, the absolute value signs can be dropped in (2), and it can be written,
using the multiplier probing property of the Dirac delta, as

$$\delta(\alpha(x) - y) = \frac{\delta(x - \beta(y))}{\alpha'(\beta(y))} = \frac{\delta(x - \beta(y))}{\alpha'(x)},$$

which gives

$$\alpha'(x)\,\delta(\alpha(x) - y) = \delta(x - \beta(y)). \tag{3}$$

If we differentiate both sides of the above equality with respect to $y$ we get

$$\alpha'(x)\,\frac{\partial}{\partial y}\delta(\alpha(x) - y) = \frac{\partial}{\partial y}\delta(\beta(y) - x).$$

On the other hand, by the classical rules of calculus,

$$\frac{\partial}{\partial x}\delta(\alpha(x) - y) = -\alpha'(x)\,\frac{\partial}{\partial y}\delta(\alpha(x) - y).$$

Hence we arrive at the useful relation

$$\frac{\partial}{\partial x}\delta(\alpha(x) - y) + \frac{\partial}{\partial y}\delta(\beta(y) - x) = 0. \tag{4}$$

Bear in mind that both variables $x$ and $y$ above have the same status, and that the
above distributional equality can be tested with test functions $\phi(x)$ and $\phi(y)$. Once
this is done, we recover the familiar chain rules of differential calculus:

$$\frac{d}{dy}\phi(\beta(y)) = \beta'(y)\,\frac{d}{d\beta}\phi(\beta),$$

and

$$\frac{d}{dx}\phi(\alpha(x)) = \alpha'(x)\,\frac{d}{d\alpha}\phi(\alpha).$$
1.8 Convolution
A combination of the shift transformation and integration gives rise to another
important linear operation on distributions: the convolution with a function $\phi \in \mathcal{D}$.
By definition, the convolution $T * \phi$ is a regular $C^\infty$ function defined by the
formula

$$(T * \phi)(x) = T_t[\phi(x - t)].$$

Notice that it is defined point-wise for every $x$ separately, and that $t$ is the running
argument of the distribution $T$ and the test function on the right-hand side. In
particular,

$$(T * \phi)(0) = T[\tilde\phi],$$

where $\tilde\phi(t) = \phi(-t)$, so that

$$\delta * \phi = \phi.$$

In other words, the Dirac delta behaves as a unity for convolution "multiplication".
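
The role of the Dirac delta as a unit for convolution can be illustrated by convolving a test function with a narrow Gaussian, which stands in for $\delta$. The sketch below also convolves with the derivative of the Gaussian. Applying the definition above to $T = \delta'$ gives $\delta'_t[\phi(x - t)] = \phi'(x)$, so that result should be close to $\phi'(x)$. The test function `phi`, the evaluation point, and the midpoint rule are assumptions made for this illustration.

```python
import math

def gauss(t, eps):
    """Gaussian kernel with variance eps and unit area."""
    return math.exp(-t * t / (2.0 * eps)) / math.sqrt(2.0 * math.pi * eps)

def gauss_prime(t, eps):
    """Derivative of the Gaussian kernel."""
    return -(t / eps) * gauss(t, eps)

def phi(x):
    """Illustrative test function supported on [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return (1.0 + x) * math.exp(1.0 - 1.0 / (1.0 - x * x))

def phi_prime(x):
    """Exact derivative of phi, by the product and chain rules."""
    if abs(x) >= 1.0:
        return 0.0
    core = math.exp(1.0 - 1.0 / (1.0 - x * x))
    return core + (1.0 + x) * core * (-2.0 * x / (1.0 - x * x) ** 2)

def convolve(kernel, x, eps, n=100000):
    """(kernel * phi)(x): integral over t of kernel(t, eps) * phi(x - t)."""
    # phi(x - t) vanishes unless x - 1 < t < x + 1.
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        t = x - 1.0 + (i + 0.5) * h
        total += kernel(t, eps) * phi(x - t)
    return h * total

x0 = 0.4
print(convolve(gauss, x0, 1e-4), phi(x0))              # close to phi(x0)
print(convolve(gauss_prime, x0, 1e-4), phi_prime(x0))  # close to phi'(x0)
```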
Remark 1. The convolution operation can be similarly defined for a distribution
$T$ with compact support and an arbitrary infinitely differentiable function $\phi$.
If we want to extend the above operation to permit convolution of two distributions,
a "weak" approach is necessary.
If $T$ is a distribution with compact support and $S$ is an arbitrary distribution
in $\mathcal{D}'$, then their convolution $T * S$ is a distribution in $\mathcal{D}'$ acting on test functions
$\phi \in \mathcal{D}$ as follows:

$$(T * S)[\phi] = T_x[S_y[\phi(x + y)]],$$

or equivalently

$$(T * S)[\phi] = (T * (S * \tilde\phi))(0).$$

One easily checks that the Dirac delta is the unit element for this more general
operation of convolution multiplication as well, that is

$$\delta * S = S.$$

If one differentiates the convolution of two distributions one gets that

$$(S * T)^{(k)} = S^{(k)} * T = T^{(k)} * S.$$
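The unit property and the differentiation rule can be illustrated numerically for regular functions. The sketch below is an illustration under stated assumptions, not the book's method: the Dirac delta is replaced by a narrow Gaussian `gauss`, the test function is taken to be $e^{-x^2}$, and the convolution integral is approximated by the midpoint rule. All names are hypothetical.

```python
import math

def gauss(u, eps):
    """Gaussian approximation of the Dirac delta."""
    return math.exp(-u * u / (2 * eps * eps)) / (math.sqrt(2 * math.pi) * eps)

def dgauss(u, eps):
    """Derivative of gauss with respect to u."""
    return -u / (eps * eps) * gauss(u, eps)

def phi(x):
    return math.exp(-x * x)

def dphi(x):
    return -2 * x * math.exp(-x * x)

def convolve(f, g, x, lo=-8.0, hi=8.0, n=16000):
    """Midpoint-rule value of (f * g)(x) = integral of f(t) g(x - t) dt."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * h
        total += f(t) * g(x - t)
    return h * total

eps, x0 = 0.05, 0.7
unit = convolve(lambda t: gauss(t, eps), phi, x0)       # approximates (delta * phi)(x0)
d_left = convolve(lambda t: dgauss(t, eps), phi, x0)    # derivative placed on the first factor
d_right = convolve(lambda t: gauss(t, eps), dphi, x0)   # derivative placed on the second factor
print(unit, phi(x0))
print(d_left, d_right)
```

The first pair of numbers shows the approximate delta acting as a unit; the second pair shows that the derivative of the convolution can be moved to either factor.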



The convolution is a linear operation since

$$T * (aS_1 + bS_2) = a\,(T * S_1) + b\,(T * S_2).$$

It is also commutative since

$$S * T = T * S,$$

and associative, i.e.,

$$(S * T) * U = S * (T * U).$$

Finally, another important property of the convolution is that

$$\operatorname{supp}(S * T) \subset \operatorname{supp} S + \operatorname{supp} T,$$

where for two sets $A$, $B$, by definition $A + B = \{x + y : x \in A,\ y \in B\}$. The same
relationship is true for singular supports.

1.9 The Dirac delta on $\mathbf{R}^n$, lines and surfaces


By analogy with distributions on $\mathbf{R}$, distributions on $\mathbf{R}^n$ are defined as linear
continuous functionals on the space $\mathcal{D}(\mathbf{R}^n)$ of infinitely differentiable test functions
$\phi(\mathbf{x})$ of compact support in $\mathbf{R}^n$.
Again, if a function $f(\mathbf{x})$ of an $n$-dimensional variable $\mathbf{x} = (x_1, \dots, x_n)$ is
locally integrable, then it defines a distribution on $\mathbf{R}^n$ by the formula

$$T_f[\phi] = \int \cdots \int f(\mathbf{x})\,\phi(\mathbf{x})\,d^n x,$$

where the integral is an $n$-tuple integral with respect to the differential
$d^n x = dx_1 \cdots dx_n$. In the future, to avoid unwieldy formulas, we will denote the multiple
integral $\int \cdots \int$ by a single integral sign $\int$ without any risk of confusion. The
dimension will be clear from what appears under the integral sign.
It turns out that all the conclusions about distributions in $\mathcal{D}' = \mathcal{D}'(\mathbf{R})$ can be
extended, with obvious adjustments, to distributions on multidimensional spaces.
In particular, we will define a Dirac delta distribution $\delta(\mathbf{x} - \mathbf{a})$ by

$$\delta(\mathbf{x} - \mathbf{a})[\phi] = \phi(\mathbf{a}).$$

With the help of the above Dirac delta we can, for example, define the singular
dipole function as

$$-p\,(\mathbf{n} \cdot \nabla)\,\delta(\mathbf{x} - \mathbf{a}),$$

where $\mathbf{n} = \mathbf{p}/p$ is the unit vector in the direction of the dipole and the operator of
directional derivative $\mathbf{n} \cdot \nabla$ acts on the delta function via the equality

$$-\bigl(\mathbf{n} \cdot \nabla \delta(\mathbf{x} - \mathbf{a})\bigr)[\phi] = \mathbf{n} \cdot \nabla \phi(\mathbf{a}).$$


In view of these general similarities, we will not go through a detailed introduction of
multidimensional distributions, and concentrate instead on a few issues reflecting
the special nature of the multidimensional spaces. We also restrict our attention to
the Dirac delta distribution on the 3-D space.
As in the 1-D case, the Dirac delta $\delta(\mathbf{x})$ can be obtained as a weak limit of
distributions represented by regular functions $f_k(\mathbf{x})$ on $\mathbf{R}^3$. For example, it is
convenient to take

$$f_k(\mathbf{x}) = g_k(x_1)\,g_k(x_2)\,g_k(x_3),$$

where $g_k(x_i)$ are regular functions of one variable approximating the
one-dimensional Dirac delta $\delta(x_i)$. In this context, the 3-D Dirac delta can be intuitively
viewed as a simple product of 1-D Dirac deltas

$$\delta(\mathbf{x}) = \delta(x_1)\,\delta(x_2)\,\delta(x_3),$$

although that operation was never formally defined. The above picture, however,
hides the important property of isotropy of the 3-D Dirac delta, which can be
expressed as an invariance with respect to the group of rotations of $\mathbf{R}^3$. This
isotropy becomes more transparent if we take the Gaussian function

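$$\gamma_\varepsilon(x) = \frac{1}{\sqrt{2\pi}\,\varepsilon}\exp\left(-\frac{x^2}{2\varepsilon^2}\right)$$

with $\varepsilon = 1/k$ as the approximating one-dimensional regular function of $\delta$. Its
coordinatewise product

$$\gamma_\varepsilon(x_1)\,\gamma_\varepsilon(x_2)\,\gamma_\varepsilon(x_3) = \frac{1}{(\sqrt{2\pi}\,\varepsilon)^3}\exp\left(-\frac{r^2}{2\varepsilon^2}\right)$$

depends only on the magnitude (norm)

$$r = |\mathbf{x}| = \sqrt{x_1^2 + x_2^2 + x_3^2}$$

of vector $\mathbf{x}$ and not on its orientation in space.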
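The isotropy of the Gaussian product can be seen in a few lines of code. This is a small sketch with hypothetical names and an arbitrarily chosen width `eps`: it evaluates the coordinatewise product at two points of equal norm but different direction, and compares both with the radial closed form.

```python
import math

def gauss(u, eps):
    """One-dimensional Gaussian approximation of the Dirac delta."""
    return math.exp(-u * u / (2 * eps * eps)) / (math.sqrt(2 * math.pi) * eps)

def delta3_approx(x, eps):
    """Coordinatewise product of three 1-D Gaussians."""
    return gauss(x[0], eps) * gauss(x[1], eps) * gauss(x[2], eps)

eps = 0.5
p = (0.3, 0.4, 1.2)        # norm 1.3
q = (1.3, 0.0, 0.0)        # same norm, different direction
r = math.sqrt(sum(c * c for c in p))
radial = math.exp(-r * r / (2 * eps * eps)) / (math.sqrt(2 * math.pi) * eps) ** 3
print(delta3_approx(p, eps), delta3_approx(q, eps), radial)
```

All three printed values coincide up to rounding, since the product depends on the point only through its norm.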


Let us also observe that, in a similar fashion, we can think of the Dirac delta in
space-time as the product

$$\delta(\mathbf{x}, t) = \delta(\mathbf{x})\,\delta(t).$$

As the next step, compute the Dirac delta $\delta(\alpha(\mathbf{x}) - \mathbf{a})$ on $\mathbf{R}^3$ of the composite
argument $\alpha(\mathbf{x})$, which corresponds to finding out how the Dirac delta is transformed
under a change of the coordinate system $\mathbf{y} = \alpha(\mathbf{x})$, where coordinatewise

$$y_i = \alpha_i(\mathbf{x}).$$

If we assume that $\alpha(\mathbf{x})$ is a one-to-one function which satisfies the required
differentiability conditions, then the equality

$$\delta(\alpha(\mathbf{x}) - \mathbf{a}) = \frac{\delta(\beta(\mathbf{a}) - \mathbf{x})}{|J(\mathbf{x})|} \tag{1}$$

is valid with $\mathbf{x} = \beta(\mathbf{y})$ representing the coordinate transformation inverse to $\alpha(\mathbf{x})$, and
$J$ standing for the Jacobian

$$J(\mathbf{x}) = \left|\frac{\partial \alpha_i(\mathbf{x})}{\partial x_j}\right|$$

of the transformation from $\mathbf{y}$-coordinates to $\mathbf{x}$-coordinates.
Equation (1) extrapolates to the Dirac delta the classical change of variables
formula for integrals of functions of several variables:

$$\int f(\mathbf{y} - \mathbf{a})\,\phi(\mathbf{y})\,d^3y = \int f(\alpha(\mathbf{x}) - \mathbf{a})\,\phi(\alpha(\mathbf{x}))\,|J(\mathbf{x})|\,d^3x.$$

The absolute value of the Jacobian determinant describes the compression ($|J| < 1$)
and stretching ($|J| > 1$) of the elementary volume $d^3y$ in comparison with the
original elementary volume $d^3x$. Symbolically, we can write this fact as a heuristic
equation

$$d^3y = |J(\mathbf{x})|\,d^3x.$$

The above discussion of the Dirac delta distribution on $\mathbf{R}^3$ with a single point
support can be extended to introduce Dirac delta distributions whose singular supports
are lines or surfaces in $\mathbf{R}^3$.
So, if $\sigma$ is a surface in $\mathbf{R}^3$, then the surface Dirac delta $\delta_\sigma$ is defined by

$$\delta_\sigma[\phi] = \int_\sigma \phi(\mathbf{x})\,d\sigma,$$

where $\phi \in \mathcal{D}(\mathbf{R}^3)$, and the integral on the right-hand side is the surface integral.
In the same fashion, one defines a line Dirac delta for a curve $\ell$ in $\mathbf{R}^3$, by the
condition

$$\delta_\ell[\phi] = \int_\ell \phi(\mathbf{x})\,d\ell,$$

with the line integral on the right-hand side.
These distributions are often applied in physics, for example, as a mathematical
model of electrically charged surfaces and strings. Standard operations with
distributions can be extended to the above Dirac deltas in a natural manner, e.g.,
the operation of multiplication by an infinitely differentiable function (surface or linear
charge density), differentiation, etc. Thus, in electrodynamics, the Dirac delta

$$\frac{\partial}{\partial n}\bigl(f(\mathbf{x})\,\delta_\sigma\bigr)$$

of a double layer is often encountered. It is functionally defined by the condition

$$-\int \frac{\partial}{\partial n}\bigl(f(\mathbf{x})\,\delta_\sigma\bigr)\,\phi(\mathbf{x})\,d\mathbf{x} = \int_\sigma f\,(\mathbf{n} \cdot \nabla\phi)\,d\sigma,$$

where $\mathbf{n}$ is the normal unit vector to the surface $\sigma$ which describes, for example, a
dipole surface.
In particular cases, the notation introduced above for surface and line Dirac
deltas is not always used since the latter can sometimes be constructed from the
usual one-dimensional Dirac delta. Thus, a surface Dirac delta corresponding to the
surface $x_1 = 0$ can be more readily interpreted as the usual Dirac delta $\delta(x_1)$, and
the line Dirac delta concentrated on the $x_3$-axis can be written in the form of a
product of Dirac deltas $\delta(x_1)\delta(x_2)$. In the same manner, the field of a spherical
wave, propagating away from the origin with velocity $c$, can be expressed with the
help of the one-dimensional Dirac delta as follows:

$$U(\mathbf{x}, t) = \frac{1}{|\mathbf{x}|}\,\delta(|\mathbf{x}| - ct).$$

1.10 Linear topological space of distributions
As we mentioned in Section 1.2, the set $\mathcal{D}$ of test functions forms a linear space,
i.e., for any complex numbers $a$, $b$, and any test functions $\phi$, $\psi$ from $\mathcal{D}$, their linear
combination

$$a\phi + b\psi \tag{1}$$

is also a test function in $\mathcal{D}$. A function identically equal to 0 plays the role of a
neutral element for addition. Moreover, we defined in $\mathcal{D}$ a notion of convergence
of sequences of test functions (in other words, a topology on $\mathcal{D}$ ²), with respect
to which the above linear combinations are continuous, thus determining what is
called the structure of a linear topological space for $\mathcal{D}$.
A similar structure can be established for the set $\mathcal{D}'$ of distributions. Hence, for
any complex numbers $a$, $b$ and any distributions $T$, $S$, the linear combination

$$aT + bS$$

is again a distribution in $\mathcal{D}'$ defined by its action on test functions from $\mathcal{D}$ by

$$(aT + bS)[\phi] = aT[\phi] + bS[\phi].$$

The zero distribution, defined by the condition $T[\phi] = 0$ for any $\phi \in \mathcal{D}$, plays the role
of a neutral element for addition of distributions. The topology of $\mathcal{D}'$ is determined
by the following definition of the convergence of a sequence $T_k$ of distributions.
We shall say that, as $k \to \infty$,

$$T_k \to T$$

in $\mathcal{D}'$ if, for each test function $\phi \in \mathcal{D}$, the (complex) numbers

$$T_k[\phi] \to T[\phi].$$

It is immediate to check that linear combinations of distributions are continuous
in this topology, or in other words,

$$\lim_{k \to \infty}(aT_k + bS_k) = a\lim_{k \to \infty} T_k + b\lim_{k \to \infty} S_k.$$

The above topology is called the weak topology, or the dual topology to the topology
of $\mathcal{D}$, and it will be the only convergence considered on $\mathcal{D}'$. The reader will
recognize that an approximation of the Dirac delta by regular functions considered
in Section 1.3 was conducted in the spirit of weak convergence.

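² To be more precise, the topology is defined by convergence of all, not necessarily countable,
"sequences", but we will not dwell on that in this book.

Example 1. Consider distributions

$$T_\varepsilon = \frac{\varepsilon}{2}\,|x|^{\varepsilon - 1}, \qquad \varepsilon > 0.$$

Indeed, these functions are locally integrable, and they represent distributions in
$\mathcal{D}'$, as long as $\varepsilon > 0$. Notice, however, that $|x|^{-1}$ is not a locally integrable function
around $x = 0$, so it does not represent a distribution from $\mathcal{D}'$. Let us compute the
limit

$$\lim_{\varepsilon \to 0+} T_\varepsilon.$$

In view of the above, it is obvious that if the above limit exists, it cannot be related
to $|x|^{-1}$. The way to proceed is to check values of $T_\varepsilon$ on test functions from $\mathcal{D}$.
Let $\phi$ be a fixed test function from $\mathcal{D}$. Then, for some positive number $M$,

$$T_\varepsilon[\phi] = \frac{\varepsilon}{2}\int_{-M}^{M} |x|^{\varepsilon - 1}\phi(x)\,dx,$$

because $\phi$ has compact support. On the other hand, by the mean value theorem,
for each $x$,

$$\phi(x) = \phi(0) + x\,\phi'(\theta_x)$$

for a certain $\theta_x$ between $0$ and $x$, and hence in the interval $[-M, M]$. Hence,

$$T_\varepsilon[\phi] = \phi(0)\,\frac{\varepsilon}{2}\int_{-M}^{M} |x|^{\varepsilon - 1}\,dx + \frac{\varepsilon}{2}\int_{-M}^{M} x\,|x|^{\varepsilon - 1}\phi'(\theta_x)\,dx. \tag{2}$$

Since

$$\int_0^M |x|^{\varepsilon - 1}\,dx = \frac{1}{\varepsilon}M^\varepsilon,$$

the first term on the right-hand side of (2) converges to $\phi(0)$ as $\varepsilon \to 0$. The second
term converges to $0$ since $\phi'$ remains bounded on $[-M, M]$, and since

$$\int_{-M}^{M} |x|^{\varepsilon}\,dx$$

is bounded as well. Therefore, we get that $T_\varepsilon[\phi] \to \phi(0)$, which gives

$$\lim_{\varepsilon \to 0+} T_\varepsilon = \delta.$$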
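The convergence $T_\varepsilon[\phi] \to \phi(0)$ can also be observed numerically. The sketch below is an illustration only: the test function `phi` (a standard bump function on $[-1, 1]$ multiplied by $1 + x$) and the quadrature are assumptions of this example, with $M = 1$. The substitution $u = |x|^\varepsilon$ is used to remove the integrable singularity at the origin before applying the midpoint rule.

```python
import math

def bump(x):
    """Smooth test function supported on [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def phi(x):
    return (1.0 + x) * bump(x)    # deliberately not even, so the odd part is exercised too

def T_eps(eps, n=100000):
    """T_eps[phi] = (eps/2) * integral of |x|^(eps-1) phi(x) over [-1, 1].

    With u = |x|^eps this becomes
    (1/2) * integral over u in [0, 1] of [phi(u^(1/eps)) + phi(-u^(1/eps))].
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = ((i + 0.5) * h) ** (1.0 / eps)
        total += phi(x) + phi(-x)
    return 0.5 * h * total

for eps in (0.5, 0.1, 0.01):
    print(eps, T_eps(eps), phi(0.0))
```

As `eps` decreases, the middle column approaches `phi(0.0)`, with a discrepancy that shrinks roughly in proportion to `eps`.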


Mixing the weak limits and linear combinations, one can produce additional
nontrivial distributions.
Example 2. For an arbitrary sequence $\{a_n\}$ of complex numbers, the series

$$\sum_{n=-\infty}^{\infty} a_n\,\delta(x - n)$$

defines a distribution in $\mathcal{D}'$ even when the numerical series $\sum_{n=-\infty}^{\infty} a_n$ does not
converge. Indeed, the series always converges weakly in $\mathcal{D}'$ since action on a fixed
test function will always cut out all but finitely many terms of the series. •

In addition to the space $\mathcal{D}'$ of distributions, we will also consider some other
spaces and equip them with the structure of a linear topological space. One such
space, the space $\mathcal{E}'$ of distributions of compact support, has been introduced before.
Its weak topology is determined by convergence on test functions from $C^\infty$, not
just on $C^\infty$ functions with compact support, as is the case for weak convergence
in $\mathcal{D}'$. Hence, it is more difficult to achieve.

Remark 1. The Dirac deltas and their derivatives are dense in the space of all
distributions in the sense that any $T \in \mathcal{D}'$ is a weak limit of distributions of the
form

Remark 2. If a distribution $T$ has a support equal to $\{0\}$, then it is a finite sum
of derivatives of the Dirac delta, that is

$$T = \sum_{k=0}^{n} a_k\,\delta^{(k)}(x)$$

for some constants $a_k$.

Remark 3. Nevertheless, one can find a family of test functions $\phi$ such that
weakly, with respect to that family,

1.11 Exercises
1. Find the weak limits, as $k \to \infty$, of the following sequences of functions:
(a) $-k^3 x \exp(-k^2 x^2)$.
(b) $k^3 x/((kx)^2 + 1)^2$.
(c) $1/(1 + k^2 x^2)$.
(d) $\exp(-e^{-kx})$.

2. What conditions have to be imposed on a function $f(x)$ to guarantee that the sequence
$\{k^2 f(kx)\}$ weakly converges to $\delta'(x)$?

3. Let $a > 0$. Find the derivative of the Heaviside function of the following composite
arguments:
(a) $\chi(ax)$.
(b) $\chi(e^{\lambda x} \sin ax)$.
4. Calculate the distributional derivative $f'(x)$ of the function $f(x) = \chi(x^4 - 1)$, and find
the weak limit of $f'(x/\varepsilon)/\varepsilon^2$, as $\varepsilon \to 0+$.
5. Find distributional solutions of the equation $(x^3 + 2x^2 + x)\,y(x) = 0$.
6. Find the $n$th distributional derivative of the function $e^{\lambda x}\chi(x)$.
7. Prove the identity

m ~n.

8. Using the standard one-dimensional Dirac delta distribution, construct a surface
Dirac delta corresponding to the level surface $\sigma : g(\mathbf{x}) = a \in \mathbf{R}$, $\mathbf{x} \in \mathbf{R}^3$, of a smooth
function $g$ on $\mathbf{R}^3$.
9. Using the standard one-dimensional Dirac delta distribution, construct a line Dirac
delta corresponding to the curve $\ell$ of intersection of two level surfaces $g_1(\mathbf{x}) = a_1$,
$g_2(\mathbf{x}) = a_2$, $\mathbf{x} \in \mathbf{R}^3$.

10. Find the length of the level curve $\ell : \psi(\mathbf{x}) = c$, $\mathbf{x} \in \mathbf{R}^2$, without finding a
parametric description of it.
11. Define a vector-valued distribution

$$\mathbf{P} = \delta(\psi(\mathbf{x}) - c)\,\nabla\psi(\mathbf{x}),$$

by its action on an arbitrary vector-valued test function $\boldsymbol{\phi}$

$$\mathbf{P}[\boldsymbol{\phi}] = \int \delta(\psi(\mathbf{x}) - c)\,\bigl(\nabla\psi(\mathbf{x}) \cdot \boldsymbol{\phi}(\mathbf{x})\bigr)\,d^3x.$$

What is the physical meaning of the functional action $\mathbf{P}[\boldsymbol{\phi}]$?


Chapter 2
Basic Applications: Rigorous and
Pragmatic

2.1 Two generic physical examples


In this section we give a couple of seemingly naive physical examples. Keeping
them in mind, however, reinforces appropriate intuitive images of distributions
that help solidify a formal mathematical understanding of the theory, and make it
easier to grasp the automation of computations that can be achieved with the help of
distribution theory.
Example 1. Beads on a string. This rather elementary example illustrates how
many calculations can be simplified if one uses the notion of the Dirac delta
distribution. Let us try to describe the linear mass density of beads with masses
$m_k$, $k = 1, \dots, n$, strung along a tight string which coincides with the $x$-axis. Recall
that the linear density $\rho_0(x)$ of the string itself is defined as the ratio $\Delta m/\Delta l$, where
$\Delta m$ is the mass of an infinitesimal string segment $\Delta l$. If the size of the beads is
small relative to other scales (string length, distances between beads, etc.), then the
inner structure of the beads is insignificant for most calculations: beads can be
assumed to be material points, and the linear mass density $\rho(x)$ can be accurately
described with the help of a sum of Dirac deltas:

$$\rho(x) = \rho_0(x) + \sum_{k=1}^{n} m_k\,\delta(x - x_k), \tag{1}$$

where $x_1, \dots, x_n$ are coordinates of the locations of the beads.
The generalized string density plot is displayed in Fig. 2.1.1, where arrows
symbolically indicate the delta-shaped bead densities. Their different heights reflect
variations among the masses of the beads. This generalized density is extremely
convenient in calculations of various physical quantities, such as the string's mass
center, moment of inertia, and many others.

FIGURE 2.1.1
A schematic graph of the density of mass for beads on a string.

If, for instance, the string of beads is placed in the force field $f(x)$, then the total
force acting on the system can be calculated by means of

$$F = \int \rho(x)\,f(x)\,dx = \int \rho_0(x)\,f(x)\,dx + \sum_{k=1}^{n} m_k\,f(x_k). \quad •$$
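The force formula translates directly into code: the regular part of the density is integrated numerically, and each Dirac delta contributes a single evaluation of the force field. The sketch below is only an illustration; the bead data, the uniform string density and the uniform force field are made-up values, and `total_force` is a hypothetical helper.

```python
def total_force(beads, f, rho0, lo, hi, n=20000):
    """F = integral of rho0(x) f(x) dx + sum of m_k f(x_k),
    for the generalized density rho = rho0 + sum of m_k delta(x - x_k)."""
    h = (hi - lo) / n
    string_part = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        string_part += rho0(x) * f(x)
    string_part *= h
    bead_part = sum(m * f(x) for m, x in beads)   # probing property of each delta
    return string_part + bead_part

beads = [(0.2, 1.0), (0.5, 2.5), (0.1, 4.0)]     # hypothetical (mass, position) pairs
rho0 = lambda x: 0.05                             # uniform string density (assumed)
f = lambda x: -9.8                                # uniform force per unit mass (assumed)
F = total_force(beads, f, rho0, 0.0, 5.0)
print(F)
```

For a uniform field the result is simply the field strength times the total mass of string and beads, which gives a quick sanity check on the implementation.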


Example 2. Dipole in an electrostatic field. Let us discuss the behavior
of a molecule in an electrostatic field. It is often sufficient to consider it as an
infinitesimal dipole with a given dipole moment $p$ and not to worry about its
internal microscopic structure. For simplicity, we shall again assume that the
dipole is located at position $a$ on the $x$-axis. The charge distribution of a single
dipole can then be described with the help of the Dirac delta's derivative:

$$\rho(x) = -p\,\delta'(x - a). \tag{2}$$

Hence, if a dipole is placed in an electric field $E(x)$ directed along the $x$-axis, then
the force acting on the dipole is equal to

$$F = \int \rho(x)\,E(x)\,dx = p\,E'(a).$$

The formula makes it clear that, in an inhomogeneous (space-dependent) field, the
dipole is moving in the direction of a stronger field. •
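The dipole force can be checked numerically by replacing $\delta'$ with the derivative of a narrow Gaussian. This is a sketch under stated assumptions: the field `E`, the dipole moment, the position and the regularization width `eps` are all hypothetical values chosen for the illustration.

```python
import math

def dgauss(u, eps):
    """Derivative of a Gaussian approximation of the Dirac delta."""
    g = math.exp(-u * u / (2 * eps * eps)) / (math.sqrt(2 * math.pi) * eps)
    return -u / (eps * eps) * g

def dipole_force(p, a, E, eps=0.02, half_width=0.5, n=20000):
    """Midpoint-rule value of the integral of rho(x) E(x) with rho(x) = -p * delta'(x - a)."""
    lo = a - half_width
    h = 2 * half_width / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += -p * dgauss(x - a, eps) * E(x)
    return h * total

E = lambda x: x * x + 3.0 * x      # hypothetical field, with E'(x) = 2x + 3
p, a = 1.5, 2.0
F = dipole_force(p, a, E)
print(F, p * (2 * a + 3.0))        # numerical force versus p * E'(a)
```

The sign of the result follows the sign of $E'(a)$: the dipole is pushed toward the region where the field is stronger.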

2.2 Systems governed by ordinary differential equations


We are now adequately prepared to consider one of the most fundamental areas
of applying distribution theory, namely, integration of linear ordinary differential
equations. The latter are the main mathematical tool in studying a variety of
physical problems. In particular, equations describing the signal transfer through
a linear system are typically inhomogeneous linear differential equations of the
form

$$L_n\!\left(\frac{d}{dt}\right) x(t) = g(t), \tag{1}$$

where

$$L_n(p) = a_0 + a_1 p + \cdots + a_n p^n \tag{2}$$

is a given polynomial of degree $n$ ($a_n \neq 0$), $p^k = (d/dt)^k$ is taken to mean $d^k/dt^k$,
and $g(t)$ is a known function of time $t$ called the input signal. Notice that if the
input signal $g(t)$ belongs to the space of test functions then the identity

$$g(t) = \int \delta(t - \tau)\,g(\tau)\,d\tau \tag{3}$$

is satisfied. It turns out that the solution of (1) which, in the systems engineering
terminology, will be called the output signal, can be written in the form of the
convolution integral

$$x(t) = \int H(t - \tau)\,g(\tau)\,d\tau, \tag{4}$$

which expresses the time invariance of the system's properties. The distribution
$H$ is called the transfer function of the system. Substituting (3) and (4) into (1),
we get that

$$L_n\!\left(\frac{d}{dt}\right) H(t) = \delta(t). \tag{5}$$

In other words, the transfer function $H(t)$ is the response of the system to the
Dirac delta input signal. In this context, $H(t)$ is also often called the fundamental
solution or the Green's function of equation (1).
Physically, it is also clear that the solution of equation (5) should satisfy the
causality principle according to which the system's response cannot occur prior to
the appearance of the input signal, i.e.,

$$H(t) = 0 \quad \text{for } t < 0. \tag{6}$$

So, let us try to construct a solution to equation (5) satisfying the causality condition
(6). To this end, consider an auxiliary homogeneous differential equation

$$L_n\!\left(\frac{d}{dt}\right) y(t) = 0, \tag{7}$$

with initial conditions

$$y(0) = y'(0) = \cdots = y^{(n-2)}(0) = 0, \qquad y^{(n-1)}(0) = 1/a_n, \tag{8}$$

where $a_n$ is the leading coefficient in (2). The function

$$H(t) = y(t)\,\chi(t) \tag{9}$$

obviously satisfies the causality condition (6). We shall check that $H$ also satisfies
equation (5). Indeed, following the differentiation rules for distributions,

$$H'(t) = y'(t)\,\chi(t) + y(t)\,\delta(t).$$

However, the first initial condition and the Dirac delta's multiplier probing property
imply that $y(t)\delta(t) = y(0)\delta(t) = 0$ so that, actually, $H'(t) = y'(t)\chi(t)$.
Repeating this argument we get that

$$H^{(k)}(t) = y^{(k)}(t)\,\chi(t), \qquad k = 0, 1, 2, \dots, n - 1. \tag{10}$$

Using again the Dirac delta's probing property and the last initial condition, we
also obtain the formula for the highest derivative:

$$H^{(n)}(t) = y^{(n)}(t)\,\chi(t) + \frac{1}{a_n}\,\delta(t). \tag{11}$$

Substituting (10) and (11) into (5), we obtain

$$\chi(t)\,L_n\!\left(\frac{d}{dt}\right) y(t) + \delta(t) = \delta(t).$$

This demonstrates that $H$ is the desired fundamental solution since $y$ is a solution
of the homogeneous equation (7).
Substituting the above formula for $H$ in the convolution (4), we arrive at an
explicit expression for the output signal (solution of equation (1)), in the form

$$x(t) = \int_{-\infty}^{t} y(t - \tau)\,g(\tau)\,d\tau, \tag{12}$$

which is often called the Duhamel integral. Equivalently we can write

$$x(t) = \int_0^\infty y(\tau)\,g(t - \tau)\,d\tau.$$

Example 1. Harmonic oscillator with damping. Let us apply the above general
scheme to the equation

$$\ddot x + 2\alpha\dot x + \omega^2 x = g(t).$$

In this case the transfer function is completely determined by the solution of the
corresponding homogeneous equation

$$\ddot y + 2\alpha\dot y + \omega^2 y = 0$$

satisfying initial conditions $y(0) = 0$, $\dot y(0) = 1$. The solution is well known to
be of the form

$$y(t) = \frac{1}{\omega_1}\,e^{-\alpha t}\sin(\omega_1 t),$$

where $\omega_1 = \sqrt{\omega^2 - \alpha^2}$. One can check by direct differentiation that the function
$y \in C^\infty$. Therefore, in this case, the output signal is described by the Duhamel
integral

$$x(t) = \frac{1}{\omega_1}\int_0^\infty e^{-\alpha\tau}\sin(\omega_1\tau)\,g(t - \tau)\,d\tau.$$

The corresponding fundamental solution, that is, a response of the system to the
Dirac delta input, is plotted in Fig. 2.2.1. •
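The Duhamel integral for the damped oscillator can be evaluated directly. The following sketch is an illustration with assumed parameter values ($\alpha = 0.5$, $\omega = 2$) and a unit-step input switched on at $t = 0$; the integral is truncated at a finite upper limit `tau_max`, which is harmless because the kernel decays exponentially. For a step input the response should settle at the static value $1/\omega^2$.

```python
import math

def y(t, alpha, omega):
    """Solution of y'' + 2*alpha*y' + omega^2*y = 0 with y(0) = 0, y'(0) = 1 (alpha < omega)."""
    w1 = math.sqrt(omega * omega - alpha * alpha)
    return math.exp(-alpha * t) * math.sin(w1 * t) / w1

def duhamel(g, t, alpha, omega, tau_max=60.0, n=60000):
    """x(t) = integral over tau in [0, tau_max] of y(tau) g(t - tau), by the midpoint rule."""
    h = tau_max / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        total += y(tau, alpha, omega) * g(t - tau)
    return h * total

step = lambda t: 1.0 if t >= 0 else 0.0    # unit-step input (assumed for the illustration)
alpha, omega = 0.5, 2.0
x_late = duhamel(step, 50.0, alpha, omega)
print(x_late, 1.0 / omega ** 2)
```

Replacing `step` by any other input function gives the corresponding output signal without solving the differential equation again, which is the practical appeal of the fundamental solution.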

The coefficients in equation (1) were constant. However, it is worth mentioning
that similar distributional arguments apply also to equations with time-dependent
coefficients. Without exposition of the full theory, let us illustrate this fact in a
simple example.
Example 2. Time-dependent coefficients. Consider a first order equation

$$a(t)\,\dot x + b(t)\,x = g(t). \tag{13}$$


FIGURE 2.2.1
The fundamental solution of the harmonic oscillator equation with damping.

Let $a(t), b(t) \in C^\infty(\mathbf{R})$, and assume additionally that $a(t)$ never vanishes. Then,
by analogy with (4), we can look for a special solution of the form

$$x(t) = \int H(t, \tau)\,g(\tau)\,d\tau, \tag{14}$$

where the Green's function $H(t, \tau)$ satisfies the equation

$$a(t)\,\dot H + b(t)\,H = \delta(t - \tau).$$

Utilizing properties of the Dirac delta, it is easy to check that the above equation
has a solution which satisfies the causality principle and is of the form

$$H(t, \tau) = \chi(t - \tau)\,y(t, \tau),$$

where $y(t, \tau)$ is a solution of the homogeneous Cauchy problem

$$a(t)\,\dot y + b(t)\,y = 0, \qquad y(t = \tau, \tau) = 1/a(\tau).$$

Hence, substituting

$$H(t, \tau) = \frac{\chi(t - \tau)}{a(\tau)}\exp\left[-\int_\tau^t \frac{b(t')}{a(t')}\,dt'\right]$$

into (14) we obtain the desired solution of equation (13):

$$x(t) = \int_{-\infty}^{t} \frac{g(\tau)}{a(\tau)}\exp\left[-\int_\tau^t \frac{b(t')}{a(t')}\,dt'\right] d\tau.$$
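The Green's function formula above can be tested on a case with a known closed form. In the sketch below, the coefficients $a(t) = 1 + t^2$, $b(t) = 2t$ and the input $g(t) = e^{-t^2}$ are assumptions chosen so that the equation reads $(a x)' = g$, whose causal solution is $x(t) = \frac{\sqrt{\pi}}{2}\,(1 + \operatorname{erf} t)/(1 + t^2)$. The helper names and the truncation of the lower limit are likewise choices made for this illustration.

```python
import math

def a(t):
    return 1.0 + t * t

def b(t):
    return 2.0 * t               # b = a', so the equation is (a x)' = g

def g(t):
    return math.exp(-t * t)

def midpoint(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def green(t, tau):
    """H(t, tau) = chi(t - tau) / a(tau) * exp(-integral from tau to t of b/a)."""
    if t < tau:
        return 0.0
    return math.exp(-midpoint(lambda s: b(s) / a(s), tau, t, 200)) / a(tau)

def x(t, lower=-6.0, n=400):
    """x(t) = integral of H(t, tau) g(tau) d tau; g is negligible below `lower`."""
    return midpoint(lambda tau: green(t, tau) * g(tau), lower, t, n)

t = 1.0
closed_form = 0.5 * math.sqrt(math.pi) * (1.0 + math.erf(t)) / a(t)
print(x(t), closed_form)
```

The nested quadrature is deliberately naive; it is meant to mirror the structure of the formula rather than to be efficient.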

2.3 One-dimensional waves


The emergence of distribution theory greatly extended the boundaries of rigorous
theoretical analysis of mathematical physics problems. Let us consider a simple
example which illustrates how distribution theory helps in dealing with typical
physical situations.
Consider a field $u(x, t)$ satisfying the 1-D wave equation

$$\frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2} \tag{1}$$

with initial conditions

$$u(x, t = 0) = g(x), \qquad \left.\frac{\partial u(x, t)}{\partial t}\right|_{t=0} = h(x). \tag{2}$$

In order to obtain a solution to this initial value problem in the classical sense,
functions $g$ and $h$ appearing in the initial conditions must be continuously
differentiable at least two times. However, it is very often interesting to know how the
initial rectangular pulse

$$g(x) = \chi(x + a) - \chi(x - a) \tag{3}$$

propagates (if $h \equiv 0$). The initial condition (3) has no classical derivatives at points
$x = \pm a$. So ordinary calculus is not helpful here. However, it is well known that
the sum of two pulses

$$u(x, t) = \frac{1}{2}\bigl(g(x - ct) + g(x + ct)\bigr), \tag{4}$$

traveling in opposite directions at speed $c$ provides a solution to the above problem.
One way to arrive at such a solution would be to smooth out the initial pulse's
edges in the $\varepsilon$-neighborhood of the jumps to get differentiable initial conditions,
solve the equation, and then find the limit of the solution as $\varepsilon \to 0$.
A distributional approach permits us to avoid this unwieldy technique altogether
and we shall check, by direct substitution, that function (4) satisfies the equation
(1). Actually, in view of the linearity of equation (1), it suffices to check that
one of the components of traveling waves in (4), say $\chi(x - ct + a)$, satisfies the
wave equation (1). Taking the distributional chain rule into account, the second
derivative of the above function with respect to $t$ turns out to be

$$\frac{\partial^2}{\partial t^2}\chi(x - ct + a) = c^2\,\delta'(x - ct + a),$$

and, on the other hand,

$$c^2\,\frac{\partial^2}{\partial x^2}\chi(x - ct + a) = c^2\,\delta'(x - ct + a)$$

as well, so that the verification is complete.


Let us remark that the general solution of the initial value problem (1-2) is the
well known D'Alembert solution

$$u(x, t) = \frac{1}{2}\bigl(g(x - ct) + g(x + ct)\bigr) + \frac{1}{2c}\int_{x - ct}^{x + ct} h(y)\,dy. \tag{5}$$

Even in this well known case, distribution theory is useful in extending the class
of admissible functions $g(x)$ and $h(x)$, and in facilitating a rigorous interpretation
of (5) as a "generalized D'Alembert solution" of the wave equation.
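Formula (5) can be evaluated pointwise even for discontinuous initial data such as the rectangular pulse (3). The sketch below is a minimal illustration with assumed values $a = c = 1$; the function names are hypothetical, the integral of $h$ is approximated by the midpoint rule, and the sample points are chosen away from the jumps so that the convention for the Heaviside function at zero plays no role.

```python
def chi(x):
    """Heaviside unit step; its value at x = 0 is a convention and is not used below."""
    return 1.0 if x > 0 else 0.0

def dalembert(g, h, x, t, c, n=2000):
    """u(x,t) = (g(x-ct) + g(x+ct))/2 + (1/(2c)) * integral of h over [x-ct, x+ct]."""
    lo, hi = x - c * t, x + c * t
    step = (hi - lo) / n
    integral = step * sum(h(lo + (i + 0.5) * step) for i in range(n))
    return 0.5 * (g(lo) + g(hi)) + integral / (2.0 * c)

a, c = 1.0, 1.0
pulse = lambda x: chi(x + a) - chi(x - a)    # rectangular pulse (3)
at_rest = lambda x: 0.0                       # zero initial velocity

u_initial = dalembert(pulse, at_rest, 0.0, 0.0, c)   # inside the initial pulse
u_right = dalembert(pulse, at_rest, 2.5, 2.0, c)     # inside the right-moving half-pulse
u_gap = dalembert(pulse, at_rest, 0.0, 2.0, c)       # between the two separated halves
print(u_initial, u_right, u_gap)
```

The three sample values show the initial pulse of unit height splitting into two half-height pulses that move apart, leaving the field at rest between them.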

2.4 Continuity equation


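2.4.1. Continuity equation for the density of a single particle. In this section
we discuss an example of the Dirac delta's application which is intuitively unexpected
but important for physicists. It involves differentiation of the Dirac delta of a
composite argument and utilizes its multiplier probing property.
Consider a gas of moving particles. Denote the velocity of a particle which is
located at point $\mathbf{x} \in \mathbf{R}^3$ by $\mathbf{v}(\mathbf{x}, t)$. Then the motion of that particle satisfies a
vector nonlinear ordinary differential equation

$$\frac{d\mathbf{b}(t)}{dt} = \mathbf{v}(\mathbf{b}(t), t), \tag{1}$$

where $\mathbf{b}(t)$ is the position vector of a particle at a given instant $t$. Leaving aside the
question of the inner structure of the particle, assume that its size is infinitesimal,
and define its density by

$$\rho(\mathbf{x}, t) = m\,\delta(\mathbf{b}(t) - \mathbf{x}), \tag{2}$$

where $m$ is the particle mass. To derive an equation for the particle density, let us
differentiate (2) with respect to $t$ to get

$$\frac{\partial\rho}{\partial t} = m\,\frac{\partial}{\partial t}\delta(\mathbf{b}(t) - \mathbf{x}).$$

By the chain rule,

$$\frac{\partial}{\partial t}\delta(\mathbf{b}(t) - \mathbf{x}) = -\left(\frac{d\mathbf{b}(t)}{dt} \cdot \nabla_{\mathbf{x}}\right)\delta(\mathbf{b}(t) - \mathbf{x});$$

so, taking into account equation (1) of the particle motion, we obtain

$$\frac{\partial}{\partial t}\delta(\mathbf{b}(t) - \mathbf{x}) + \bigl(\mathbf{v}(\mathbf{b}(t), t) \cdot \nabla_{\mathbf{x}}\bigr)\,\delta(\mathbf{b}(t) - \mathbf{x}) = 0. \tag{3}$$

Since the term $\mathbf{v}(\mathbf{b}(t), t)$ is independent of $\mathbf{x}$, we take the velocity vector inside the
$\nabla_{\mathbf{x}}$ operator:

$$\bigl(\mathbf{v}(\mathbf{b}(t), t) \cdot \nabla_{\mathbf{x}}\bigr)\,\delta(\mathbf{b}(t) - \mathbf{x}) = \bigl(\nabla_{\mathbf{x}} \cdot \mathbf{v}(\mathbf{b}(t), t)\,\delta(\mathbf{b}(t) - \mathbf{x})\bigr).$$

Now, in view of the Dirac delta's multiplier probing property,

$$\mathbf{v}(\mathbf{b}(t), t)\,\delta(\mathbf{b}(t) - \mathbf{x}) = \mathbf{v}(\mathbf{x}, t)\,\delta(\mathbf{b}(t) - \mathbf{x}).$$

As a result, equality (3) can be rewritten in the form

$$\frac{\partial}{\partial t}\delta(\mathbf{b}(t) - \mathbf{x}) + \bigl(\nabla_{\mathbf{x}} \cdot \mathbf{v}(\mathbf{x}, t)\,\delta(\mathbf{b}(t) - \mathbf{x})\bigr) = 0.$$

Multiplying both sides by the particle mass, and recalling the definition (2) of
particle density, we arrive at

$$\frac{\partial\rho}{\partial t} + \bigl(\nabla_{\mathbf{x}} \cdot \mathbf{v}(\mathbf{x}, t)\,\rho(\mathbf{x}, t)\bigr) = 0, \tag{4}$$

which is the traditional continuity equation, and is often written in the physically
more transparent divergence notation:

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\mathbf{v}\rho) = 0. \tag{5}$$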

This simple equation occupies one of the central places in physics and deserves
additional commentary. The above derivation strikes many physicists as unexpected
because the continuity equation is typically derived as a corollary to the
mass conservation law in an ideal continuous medium, thus ignoring its particle
nature. This is how the name "continuity equation" originated. The traditional
derivation requires not only continuity, but also sufficient smoothness of the fields
$\rho(\mathbf{x}, t)$ and $\mathbf{v}(\mathbf{x}, t)$ describing the motion of an ideal continuous medium. Our
derivation shows that the continuity equation remains valid for a singular density
of each separate particle in the medium as well.
Hence, it even makes sense to talk about the continuity equation for a gas
consisting of just one particle. Such a degenerate gas should not elicit skepticism
among physicists. Mathematicians study similar objects quite seriously, having in
mind not material particles but a gas of points in phase space. The density of a
point gas in phase space, like that of any real gas, satisfies the continuity equation. Notice
that equation (5) can be viewed as an equation for the density of a point in phase
space, since the mathematical phase space of solutions of equation (1) coincides,
in that case, with the physical space $\mathbf{R}^3$.

2.4.2. Mass conservation law. A smooth mass density $\rho(\mathbf{x}, t)$ of an ideal continuous
medium satisfies the mass conservation law, and so does the singular density
(2) as

$$\int \rho(\mathbf{x}, t)\,d^3x = m\int \delta(\mathbf{b}(t) - \mathbf{x})\,d^3x = m.$$

This follows from the Dirac delta's basic properties. One says that the mass conservation
law is guaranteed by the divergence form of the continuity equation (5). We
shall show this by integrating all summands of equation (5) over a certain fixed
(time-independent) space region $\Omega$. As a result, we get that

$$\frac{d}{dt}m_\Omega + \int_\Omega \operatorname{div}(\mathbf{v}\rho)\,d^3x = 0, \tag{6}$$

where

$$m_\Omega = \int_\Omega \rho(\mathbf{x}, t)\,d^3x$$

is the mass of the medium contained in region $\Omega$. Now recall that the Gauss divergence
theorem states that if $\sigma$ is a smooth surface enclosing a bounded three-dimensional
region $\Omega$, and $\mathbf{n}$ is the external normal vector to $\sigma$ then, for any smooth vector function $\mathbf{f}$
on $\mathbf{R}^3$,

$$\oint_\sigma (\mathbf{f} \cdot \mathbf{n})\,d\sigma = \int_\Omega \operatorname{div}\mathbf{f}\,d^3x. \tag{7}$$

This theorem can be readily extended to the case where the function $\mathbf{f}$ is replaced
by a vector distribution. Now, if we transform the second term in (6) using this
generalized Gauss divergence theorem, we get

$$\int_\Omega \operatorname{div}(\mathbf{v}\rho)\,d^3x = \oint_\sigma \rho\,(\mathbf{v} \cdot \mathbf{n})\,d\sigma. \tag{8}$$

If, during the time interval $[t_1, t_2]$, no particle crosses the surface $\sigma$ which bounds
region $\Omega$, then the surface integral in (8) is automatically equal to 0, and equation
(6) gives the condition

$$\frac{d}{dt}m_\Omega = 0,$$

which implies the mass conservation law

$$m_\Omega = \text{const}.$$

2.4.3. Continuity equation for continuous media. Continuity equation (5) was
derived for the density of a single particle. However, in view of its linearity, the
continuity equation is also satisfied by a superposition of densities of different
particles and thus remains valid for the full microscopic density of the medium

$$\rho(\mathbf{x}, t) = \sum_i m_i\,\delta(\mathbf{b}_i(t) - \mathbf{x}), \tag{9}$$

where the summation extends over all the medium particles, and $m_i$ and $\mathbf{b}_i(t)$
denote, respectively, the mass and position of the $i$th particle.
A physicist who has read the above material, and until now assumed $\rho(\mathbf{x}, t)$
to be a smooth function, may be tempted to swing to the other extreme and start
suspecting that all physically realizable solutions of the continuity equation are
singular and have a structure similar to (9). This is not necessarily so since we
will show that, in the analysis of the motions of a macroscopic medium, at scales
very large compared to the distances between adjacent particles, it is still natural
to assume $\rho(\mathbf{x}, t)$ to be a smooth function.
Let us begin by rewriting density (2) of a single particle in the following form

$$\rho(\mathbf{x}, t; \mathbf{y}) = m\,\delta(\mathbf{b}(\mathbf{y}, t) - \mathbf{x}), \tag{10}$$

which explicitly involves the initial particle position

$$\mathbf{b}(\mathbf{y}, t = 0) = \mathbf{y}.$$

In practice, the particle's initial position is never known accurately. This fact can
be modeled mathematically by taking the convolution

$$\bar\rho(\mathbf{x}, t) = \frac{1}{l^3}\int g\!\left(\frac{\mathbf{y} - \mathbf{z}}{l}\right)\rho(\mathbf{x}, t; \mathbf{z})\,d^3z, \tag{11}$$

of density (10) with an "initial position uncertainty function" $g(\mathbf{y}/l)/l^3$. Here $g(\mathbf{z})$
is a normalized ($\int g(\mathbf{z})\,d^3z = 1$) function of a dimensionless variable $\mathbf{z}$, while $l$
can be viewed as a measure of uncertainty about the initial particle position. It is
clear that the "averaged" density $\bar\rho(\mathbf{x}, t)$ obtained in such a way also satisfies the
continuity equation

$$\frac{\partial\bar\rho}{\partial t} + \operatorname{div}(\mathbf{v}\bar\rho) = 0. \tag{12}$$

However, in contrast to the singular density (2), $\bar\rho$ also satisfies a smooth initial
condition

$$\bar\rho(\mathbf{x}, t = 0) = \frac{m}{l^3}\,g\!\left(\frac{\mathbf{y} - \mathbf{x}}{l}\right).$$

Similarly, the averaged density (9) satisfies equation (12) and a smooth initial
condition

$$\bar\rho(\mathbf{x}, t = 0) = \frac{1}{l^3}\sum_i m_i\,g\!\left(\frac{\mathbf{y}_i - \mathbf{x}}{l}\right). \tag{13}$$

From the physicist's viewpoint, function $g$ describes an uncertain initial particle
position, and $l$ accounts for a macroscopic maximum accuracy of the measuring
instrument. We should add that the concrete shape of the function $g(\mathbf{z})$ is not
significant. One can rigorously prove that, under fairly general assumptions, and
for $l$'s large in comparison to the distances between adjacent particles, in the limit
of infinitely many particles, the sum in (13) approximates a "generic" macroscopic initial
density $\bar\rho_0(\mathbf{x})$, whose form is independent of the form of the individual summands.
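The smoothing in (13) is easy to reproduce in one dimension. The sketch below is an illustration, not the computation behind the book's figure: the particle masses and positions are made up, the uncertainty function is taken to be a normalized Gaussian, and the prefactor is $1/l$ rather than $1/l^3$ because the example is one-dimensional. It also checks that averaging does not change the total mass.

```python
import math

def g(z):
    """Normalized 1-D uncertainty function (a Gaussian, chosen as an assumption)."""
    return math.exp(-z * z) / math.sqrt(math.pi)

def averaged_density(x, particles, l):
    """1-D analogue of (13): (1/l) * sum over i of m_i * g((y_i - x) / l)."""
    return sum(m * g((y - x) / l) for m, y in particles) / l

particles = [(1.0, -0.8), (1.0, -0.3), (1.0, 0.4), (1.0, 0.9)]   # hypothetical (m_i, y_i)
l = 0.5

lo, hi, n = -10.0, 10.0, 20000
h = (hi - lo) / n
total_mass = h * sum(averaged_density(lo + (i + 0.5) * h, particles, l) for i in range(n))
print(total_mass, sum(m for m, _ in particles))
```

Increasing `l` relative to the particle spacing makes the averaged density smoother, while the integral stays equal to the sum of the masses.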
FIGURE 2.4.1
The figure shows symbolically, in the 1-D case, the initial singular density
$\rho_0(x)$ of particles with identical mass $m$, located at points $y_i = \beta(i\Delta)$, where
$\beta(y) = 2y(|y| + s)/(2|y| + s)$, $\Delta = s/10$, and the corresponding averaged density
$\bar\rho_0(x)$ (continuous line) obtained with the help of the uncertainty function
$g(y/l)/l = \exp(-y^2/l^2)/(\sqrt{\pi}\,l)$, $l = 4\Delta$. It is clear from the graphs that the
averaged density has a minimum at the origin where the density of particles
is smaller.

2.5 Green's function of the continuity equation and Lagrangian coordinates
The preceding section makes it clear why it is important for physicists and
engineers to solve the initial value problem

-ap
at +
d·IV (pv) = 0 , (1)

p(x, t = 0) = Po (x) ,

with smooth initial data. However, as we will see in this section, the physical sin·
gular density of microparticles (2.4.10), while remaining very important, acquires
a new, mathematical, interpretation. It turns out that

G(x, t,y) = ~(b(y, t) - x) (2)


50 Chapter 2. Basic applications

is Green's function of the continuity equation. This means that the solution of the
initial value problem (1) for an arbitrary initial condition Po(x) can be written in

f
the integral form
p(x, t) = Po(y)8(b(y, t) - x)d 3 y. (3)

Let us transform this expression with the help of formula (1.9.1) which gives that

8(b(y, t) - x) = j (x, t)8(a(x, t) - y). (4)

In Section 1.9., functions


y = a(x, t), (5)

and
x = b(y, t), (6)

are inverse functions of each other, and they provide a connection between the
initial position y and the current position x of the same particle of the medium.
This seems to be the right place to recall traditional hydrodynamics terminol-
ogy. Current coordinates x of a particle in a certain fixed coordinate system are
traditionally called the Eulerian coordinates. Initial coordinates y of the particle
are then called its Lagrangian coordinates. Both the Eulerian and the Lagrangian
coordinates determine spatial position of the particle. But whereas an external ob-
server may find it preferable to use the Eulerian coordinates, the observer traveling
with the particle itself may find the Lagrangian coordinates more convenient. The
situation is similar to that of a person who prefers to identify himself by his birth-
place rather than by the listing of his current address. Formulas (5) and (6) define
laws of transformation of Eulerian coordinates of a particle into its Lagrangian
coordinates, and vice versa. Function

$$j(x, t) = \left|\frac{\partial a(x, t)}{\partial x}\right| \tag{7}$$

appearing in (4) is the Jacobian of the transformation of Lagrangian into Eulerian
coordinates.
Let us return now to expression (3) for the density field. Substituting into (3) the
right-hand side of equality (4), and applying the Dirac delta's multiplier probing
property, we find that density

$$\rho(x, t) = \rho_0(a(x, t))\,j(x, t), \tag{8}$$

is proportional to the initial density $\rho_0(y)$ in the neighborhood of a particle which
is at $x$ at time $t$, and to the Jacobian $j(x, t)$, which takes into account changes
in density caused by infinitesimal deformations of the medium's volume in the
neighborhood of a particle.
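The equivalence of the Green's function representation (3) and the closed formula (8) can be checked numerically. The sketch below is only an illustration under assumptions not made in the text: a 1-D uniformly expanding flow b(y, t) = y(1 + t), a Gaussian initial density, and a narrow Gaussian (width `eps`) standing in for the Dirac delta.

```python
import math

def rho0(y):
    # Assumed initial density (unit-mass Gaussian), for illustration only.
    return math.exp(-y * y) / math.sqrt(math.pi)

def b(y, t):
    # Assumed Lagrangian-to-Eulerian map: uniform expansion.
    return y * (1.0 + t)

def a(x, t):
    # Inverse map, Eulerian-to-Lagrangian.
    return x / (1.0 + t)

def jacobian(x, t):
    # j(x, t) = |da/dx| for the assumed flow.
    return 1.0 / (1.0 + t)

def delta_eps(s, eps):
    # Narrow Gaussian approximating the Dirac delta.
    return math.exp(-(s / eps) ** 2) / (math.sqrt(math.pi) * eps)

def rho_green(x, t, eps=0.02, y_max=8.0, n=40000):
    # Formula (3): integrate rho0(y) * delta(b(y, t) - x) over y (midpoint rule).
    h = 2.0 * y_max / n
    total = 0.0
    for i in range(n):
        y = -y_max + (i + 0.5) * h
        total += rho0(y) * delta_eps(b(y, t) - x, eps)
    return total * h

def rho_formula(x, t):
    # Formula (8): rho0(a(x, t)) * j(x, t).
    return rho0(a(x, t)) * jacobian(x, t)

print(rho_green(0.7, 1.0), rho_formula(0.7, 1.0))
```

The two printed values should agree up to a small error controlled by the smoothing width `eps`.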

2.6 Method of characteristics


In this section we will present the method of characteristics, the standard general
method of solving first-order partial differential equations. Our exposition
will be restricted to the continuity equation. In this case the physical appeal of the
method is that the characteristics are physically observable paths of particles.
Often, an effective mathematical approach reduces the problem at hand to a
simpler problem which has been solved earlier (one can say that by "moving
backwards" mathematicians manage to move forward and obtain new results).
The method of characteristics follows a similar path by reducing partial differential
equations to more familiar ordinary differential equations. To see how this works,
let us rewrite the continuity equation (2.5.1) in the form

$$\frac{\partial \rho}{\partial t} + (v \cdot \nabla)\rho = -u\rho, \tag{1}$$

where

$$u(x, t) = (\nabla \cdot v) = \operatorname{div} v(x, t) \tag{2}$$

is an auxiliary scalar field. We will solve equation (1) under the above mentioned
initial condition $\rho(x, t = 0) = \rho_0(x)$.
A solution to any partial differential equation is, by definition, a function of
several independent variables. In the case of density field $\rho(x, t)$, there are three
spatial coordinates and the time variable. The method of characteristics makes
an assumption that all of these variables depend on a single common parameter,
and dependence is selected in such a way that the partial differential equation is
transformed into an ordinary differential equation. For the continuity equation, it
is convenient to select the time t as that common parameter and assume that the
spatial coordinates x are functions of time, that is x = b(t). In this case, by the
chain rule, as long as function b(t) satisfies

$$\frac{db}{dt} = v(b, t), \tag{3}$$

the left-hand side of (1) turns out to be an ordinary derivative, and the continuity
equation itself becomes an ordinary differential equation

$$\frac{d\rho}{dt} = -u\rho. \tag{4}$$

The system (3-4) of ordinary differential equations is called the characteristic
equations.
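As an illustration of how the characteristic equations (3-4) can be integrated in practice, here is a minimal sketch using the classical fourth-order Runge-Kutta scheme. The 1-D velocity field v(x, t) = -x (so that u = -1) is an assumption chosen only because the exact answer, b = y e^{-t} and ρ = ρ_0(y) e^{t}, is available for comparison; the function names are hypothetical.

```python
import math

def v(x, t):
    # Assumed velocity field (illustration only): uniform compression.
    return -x

def u(x, t):
    # u = div v for the assumed field above.
    return -1.0

def integrate_characteristic(y, rho_init, t_end, steps=1000):
    """RK4 for db/dt = v(b, t), d(rho)/dt = -u(b, t) * rho,
    with b(0) = y and rho(0) = rho_init."""
    def rhs(t, b, rho):
        return v(b, t), -u(b, t) * rho

    h = t_end / steps
    t, b, rho = 0.0, y, rho_init
    for _ in range(steps):
        k1b, k1r = rhs(t, b, rho)
        k2b, k2r = rhs(t + h / 2, b + h / 2 * k1b, rho + h / 2 * k1r)
        k3b, k3r = rhs(t + h / 2, b + h / 2 * k2b, rho + h / 2 * k2r)
        k4b, k4r = rhs(t + h, b + h * k3b, rho + h * k3r)
        b += h / 6 * (k1b + 2 * k2b + 2 * k3b + k4b)
        rho += h / 6 * (k1r + 2 * k2r + 2 * k3r + k4r)
        t += h
    return b, rho

b_end, rho_end = integrate_characteristic(y=2.0, rho_init=3.0, t_end=1.0)
print(b_end, 2.0 * math.exp(-1.0))    # position along the characteristic
print(rho_end, 3.0 * math.exp(1.0))   # density along the characteristic
```

In this compressing flow the particle moves toward the origin while the density along its path grows, as (4) predicts for negative u.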

In our particular case, the system splits into two groups of equations for two
different types of functions. The first group consists of equations (3), whose
solutions represent characteristics, i.e., paths x = b(t) in phase space. The last
group consists of a single equation (4) which describes the density evolution along
the characteristics. Observe that the characteristic equations (3) coincide with
equations of motion (2.4.1) for the medium's particle. For that reason, in hydro-
dynamics, the ordinary derivative with respect to time in equation (4) is called the
substantial derivative and the special notation

$$\frac{D}{Dt} = \frac{\partial}{\partial t} + v \cdot \nabla$$

is used.
By complementing equations (3) with the initial condition

$$b(t = 0) = y, \tag{5}$$

we fix the point in space where the characteristic originates, i.e., the point from
which the particle starts its motion. Therefore, the corresponding initial condition
in equation (4) should be the initial density $\rho_0(x)$ at point $y$, that is

$$\rho(t = 0) = \rho_0(y). \tag{6}$$

FIGURE 2.6.1
1-D case. Set of characteristic curves of the continuity equation mapping
Lagrangian coordinates into Eulerian coordinates and vice versa.



From the hydrodynamic viewpoint, initial conditions (5) fix the Lagrangian coor-
dinate of a particle. Consequently, the equation (4) determines evolution of the
density in the neighborhood of a particle with a given Lagrangian coordinate, and
is the continuity equation in the Lagrangian coordinate system. Thus, solutions

$$\rho = R(y, t), \qquad b(y, t), \tag{7}$$

of the initial value problem (3-6) describe the density field in the Lagrangian
coordinate system.
However, if we observe the evolving state of a hydrodynamic field by sensors
in fixed spatial positions, then we are interested in a description of the density
field $\rho(x, t)$ in the Eulerian coordinate system. The problem is that, for a given $y$,
field $R(y, t)$ determines density at a point with coordinates $b(y, t)$ which may not
coincide with Eulerian coordinates $x$ at the point of observation. In the 1-D case,
the above situation is pictured in Fig. 2.6.1.
The situation is saved by the fact that, by varying y, we obtain a family of char-
acteristics with, hopefully, one of them hitting point x at time t. Mathematically
speaking, for a given mapping $y \in \mathbb{R}^3 \mapsto x \in \mathbb{R}^3$ given by the formula

$$x = b(y, t), \tag{8}$$

we have to find an inverse transformation

y = a(x, t), (9)

determining the Lagrangian coordinate of the particle which at time $t$ is located
at point $x$. Substituting that inverse transformation into $R(y, t)$, we obtain the
required density field

$$\rho(x, t) = R(a(x, t), t) \tag{10}$$

in the Eulerian coordinate system.


It is easy to see that the solution of continuity equation (4) in the Lagrangian
coordinate system can be written in the form

$$R(y, t) = \rho_0(y) \exp\left[-\int_0^t u(b(y, \tau), \tau)\,d\tau\right]. \tag{11}$$
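For completeness, here is one way to see why (11) solves the Lagrangian continuity equation (4) with the initial condition (6). This is a short sketch of the standard separation-of-variables argument, written for R > 0 and with τ as a dummy integration variable:

```latex
\begin{aligned}
\frac{dR}{dt} &= -u(b(y,t),t)\,R
  \;\Longrightarrow\;
  \frac{d}{dt}\ln R(y,t) = -u(b(y,t),t), \\
\ln R(y,t) - \ln R(y,0) &= -\int_0^t u(b(y,\tau),\tau)\,d\tau, \\
R(y,t) &= \rho_0(y)\,\exp\Big[-\int_0^t u(b(y,\tau),\tau)\,d\tau\Big],
  \qquad R(y,0) = \rho_0(y).
\end{aligned}
```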

As a result the Eulerian density field is given by the expression

$$\rho(x, t) = \rho_0(a(x, t))\,j(x, t), \tag{12}$$



where, as is clear from (11),

$$j(x, t) = \exp\left[-\int_0^t u(b(y, \tau), \tau)\,d\tau\right]\Bigg|_{y = a(x, t)}, \tag{13}$$

is the Jacobian of transformation (2.5.7) of Lagrangian into Eulerian coordinates.
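The two expressions for the Jacobian, the exponential form (13) and the derivative form (2.5.7), can be compared on a flow where everything is known in closed form. The sketch below assumes, purely for illustration, the 1-D field v(x) = -x² on x > 0, for which b(y, t) = y/(1 + yt), a(x, t) = x/(1 - xt) (valid while xt < 1) and u = -2x.

```python
import math

def b(y, t):
    # Path of the particle starting at y > 0 in the assumed field v = -x**2.
    return y / (1.0 + y * t)

def a(x, t):
    # Inverse map (Lagrangian coordinate), valid for x * t < 1.
    return x / (1.0 - x * t)

def u(x, t):
    # u = dv/dx for the assumed field.
    return -2.0 * x

def jacobian_from_13(x, t, n=20000):
    # exp[-integral of u along the characteristic], evaluated at y = a(x, t).
    y = a(x, t)
    h = t / n
    integral = sum(u(b(y, (i + 0.5) * h), (i + 0.5) * h) for i in range(n)) * h
    return math.exp(-integral)

def jacobian_from_derivative(x, t, dx=1e-6):
    # Central finite difference for da/dx.
    return (a(x + dx, t) - a(x - dx, t)) / (2.0 * dx)

print(jacobian_from_13(0.5, 1.0), jacobian_from_derivative(0.5, 1.0))
```

For this example both values should be close to 1/(1 - xt)², which equals 4 at x = 0.5, t = 1.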


Until now, to simplify the exposition, we left out an important mathematical
question of the existence and uniqueness of an inverse transformation (9) for all
$x \in \mathbb{R}^3$. Now is a good time to address this question.
As is well known, transformation (8) is a diffeomorphism (i.e., a continuously
differentiable, and one-to-one mapping with a continuously differentiable inverse)
if and only if the Jacobian $j$ in (2.5.7) is bounded and positive for any $x \in \mathbb{R}^3$. It
is clear from (13) that a sufficient condition for (8) to be a diffeomorphism is that
the field u(x, t) of (2) be bounded. Actually this assumption was made implicitly
above when we omitted the sign of absolute value around j in (2.5.4) (which the
general formula requires).
Let us also remark that the method of characteristics just described is embedded,
as if in the genetic code, in Green's function (2.5.2) and in its probing properties. As
a consequence, solution (12)-(13) can be obtained without resorting to the method
of characteristics. But the time and space used to describe that alternative method
have not been wasted, as it sheds light on the problem from a different viewpoint.

2.7 Density and concentration of the passive tracer


The continuity equation plays an important role in the study of ecological problems
related to dispersion of pollutants in the environment. Obviously, in addition
to the medium density, the density of any passive tracer suspended in and
carried by the medium also satisfies the continuity equation. A suitable example here
is the density of smoke particles released by the smokestack of a power station.
The continuity equation describes evolution of the passive tracer density if $\rho_0(x)$
in (2.5.1) is replaced by the initial density of the passive tracer.
Sometimes, for instance in the analysis of chemical reactions, the density itself is
not as important as the concentration $C(x, t)$. It measures not the absolute, as the
density does, but the relative proportion of the tracer in a physically infinitesimal
unit of the medium's volume. That's why, while the density increases when the
medium is compressed and decreases when the medium expands, the concentration
preserves its value in the neighborhood of an arbitrary fixed particle. In other words,
in the Lagrangian coordinate system, the concentration does not depend on time,
i.e.,
$$C(y, t) = C_0(y) = \text{const}, \tag{1}$$

where $C_0(x)$ is the initial concentration field, and it satisfies equation

$$\frac{DC}{Dt} = 0, \tag{2}$$

where, in Eulerian coordinates, $D/Dt = \partial/\partial t + v \cdot \nabla$ is the substantial derivative
introduced in Section 2.6. Thus, the Eulerian field of concentration satisfies
equation

$$\frac{\partial C}{\partial t} + (v \cdot \nabla)C = 0. \tag{3}$$

Its solution can be obtained by expressing the Lagrangian coordinates in (1) through
the Eulerian coordinates. As a result we get that

$$C(x, t) = C_0(a(x, t)). \tag{4}$$

Let us also observe that the equation (3) is an obvious consequence of a mathematically
more important initial value problem

$$\frac{\partial y}{\partial t} + (v \cdot \nabla)y = 0, \qquad y(x, t = 0) = x, \tag{5}$$

whose solution $y = a(x, t)$ describes a transformation of the Eulerian coordinates
into the Lagrangian coordinates.

2.8 Incompressible medium


The Green's functions of both the concentration equation (2.7.3) and the equation for the
Lagrangian coordinates (2.7.5) are equal to the Dirac delta which appears on the
right-hand side of equality (2.5.4). Hence, we can write

$$C(x, t) = \int C_0(y)\,\delta(a(x, t) - y)\,d^3y. \tag{1}$$

If a medium is such that

$$j(x, t) \equiv 1, \tag{2}$$

then, as is clear from (2.5.4), the Green's function of the continuity equation coincides
with the Green's functions of equations (2.7.3) and (2.7.5), and the continuity

equation itself takes the form

$$\frac{\partial \rho}{\partial t} + (v \cdot \nabla)\rho = 0. \tag{3}$$

A medium where the identity (2) is satisfied is an incompressible medium. Condition
(2) implies that the volume of a region "frozen" into the medium does not
change in time. It is clear from (2.6.2) and (2.6.13) that the condition

$$\nabla \cdot v = \operatorname{div} v(x, t) \equiv 0 \tag{4}$$

is necessary and sufficient for the medium to be incompressible. The above identity
means that the velocity field of an incompressible medium is purely rotational.
Indeed, by the fundamental theorem of vector calculus, any smooth vector field
can be represented as a sum of potential and rotational components, that is

$$v = v_p + v_r, \tag{5}$$

where

$$v_p = \nabla\varphi = \operatorname{grad} \varphi(x, t), \tag{6}$$

for a certain $\varphi$ called the scalar potential of the vector field $v$, and

$$v_r = \nabla \times \psi = \operatorname{rot} \psi(x, t) \tag{7}$$

for a certain $\psi$ which is called the vector potential of the same field $v$.
The divergence of the rotational part is

$$\nabla \cdot v_r = \nabla \cdot (\nabla \times \psi) = (\nabla \times \nabla) \cdot \psi \equiv 0, \tag{8}$$

a condition that coincides with the incompressibility condition (4). The divergence
of the potential component is equal to

$$\nabla \cdot v_p = \nabla \cdot \nabla\varphi = (\nabla \cdot \nabla)\varphi = \Delta\varphi, \tag{9}$$

where the letter $\Delta$ is used to denote the Laplace operator which, in the 3-D Cartesian
coordinate system, has the form

$$\Delta = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}. \tag{10}$$

If the Laplacian of a function $\varphi$ is identically zero in the whole space and its gradient
is bounded, then function $\varphi$ is affine everywhere, that is

$$\varphi(x) = a + v \cdot x.$$

Here the first term $a$ is just a constant; a potential is determined by condition (6)
only up to an arbitrary constant. The second term describes a parallel uniform
motion which is not affected by compressibility. This means that, up to such a uniform
motion, the velocity field in an incompressible medium is purely rotational.
Finally, let us observe that in the two-dimensional medium ($x \in \mathbb{R}^2$), which is
used as a model of surface, or oceanographic phenomena, the vector potential of a
purely rotational velocity field degenerates into a scalar function $\psi(x, t)$ called the
stream function, and components of the velocity field in the Cartesian coordinate
system can then be expressed through the stream function by formulas

$$v_1 = \frac{\partial \psi}{\partial x_2}, \qquad v_2 = -\frac{\partial \psi}{\partial x_1}. \tag{11}$$
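That a velocity field derived from a stream function is automatically divergence-free can be checked with finite differences. The sketch below assumes the sign convention v_1 = ∂ψ/∂x_2, v_2 = -∂ψ/∂x_1 (the one obtained from (7) when the vector potential is directed along the third axis) and a hypothetical stream function ψ = sin(x_1) cos(x_2); both are illustrative choices, not taken from the text.

```python
import math

def psi(x1, x2):
    # Hypothetical stream function, for illustration only.
    return math.sin(x1) * math.cos(x2)

def velocity(x1, x2, h=1e-3):
    # Assumed convention: v1 = d(psi)/dx2, v2 = -d(psi)/dx1 (central differences).
    v1 = (psi(x1, x2 + h) - psi(x1, x2 - h)) / (2.0 * h)
    v2 = -(psi(x1 + h, x2) - psi(x1 - h, x2)) / (2.0 * h)
    return v1, v2

def divergence(x1, x2, h=1e-3):
    # div v = dv1/dx1 + dv2/dx2, again by central differences.
    dv1 = (velocity(x1 + h, x2)[0] - velocity(x1 - h, x2)[0]) / (2.0 * h)
    dv2 = (velocity(x1, x2 + h)[1] - velocity(x1, x2 - h)[1]) / (2.0 * h)
    return dv1 + dv2

print(velocity(0.3, 0.4))
print(divergence(0.3, 0.4))   # zero up to rounding error
```

The divergence vanishes because the two mixed second derivatives of ψ cancel, which is the discrete counterpart of identity (8).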

2.9 Pragmatic applications: beyond the rigorous theory of distributions
The rigorous distribution theory discussed up to now relied on narrow spaces
of test functions like $\mathcal{D}$, $\mathcal{E}$ (or the space $\mathcal{S}$ of rapidly decreasing functions to be
introduced in later chapters). However, in various physical and engineering applied
problems one is often tempted to go beyond those well defined spaces and apply
results of the rigorous theory in a formal fashion. This sometimes leads to correct
results but it has to be done carefully lest erroneous conclusions are arrived at.
Basically there are two ways to assure a correct outcome.
The first method is to build from scratch a distribution theory suitable for the
problem under investigation by selecting a specialized space of test functions, and
then to follow the general scheme developed in Chapter 1. Such an approach is
worth the time and effort required only if the potential application area is large
enough. We will pursue this path ourselves by introducing the space S of rapidly
decreasing smooth functions in connection with the study of Fourier integrals.
However, this approach is not practical in every situation; the construction of a
plethora of different distribution spaces adapted to each new situation is best left
to theoretical mathematicians. For an applied scientist such "micromanagement"
would often be confusing.
Thus, the second, more pragmatic approach is often used. One borrows and
formally applies relations and formulas from the rigorous distribution theory, and
then one checks their validity in each particular case. If this is done with care, no
harm will result and the desired calculations can be completed expeditiously. The
general rule is that any extension of the test function space results in a narrower
class of linear functionals on it (distributions). This can entail a loss of some
of the crucial properties of distributions, such as their infinite differentiability.
To avoid unexpected pitfalls, it is also essential to develop an intuition based on
experience in pragmatic applications.
In this section we analyze five typical examples of the second approach:

• distributions on a finite interval;


• differential equations with singular coefficients;
• nonmonotonic composite arguments of the Dirac delta;
• nonlinear functions of distributions;
• supersingular distributions.

All five present delicate mathematical questions when framed within the rigorous
distribution theory. Yet, in physical and engineering practice, we see them suc-
cessfully handled in a nonrigorous fashion. So, it is only reasonable to discuss
them openly without pretending that they do not exist, see how it is being done,
and how to avoid related potential dangers.
2.9.1. Distributions on a finite interval. Many applied problems require calcu-
lating integrals with finite limits such as

$$\int_a^b f(x)\varphi(x)\,dx. \tag{1}$$

Often, one is tempted to operate with analogous functionals but with function $f(x)$
replaced by a distribution and, in particular, by the Dirac delta. We have already
met such a situation in Example 2.1.1 of beads threaded on a finite length string.
The above integral is not well defined within distribution theory even if $\varphi(x)$ is a
test function in $\mathcal{D}$. Indeed, finite integral limits actually mean that we use as a test
function the truncated function

$$\tilde\varphi(x) = \varphi(x)\,(\chi(x - a) - \chi(x - b)),$$

where $\chi(x)$ is the Heaviside function. Nevertheless, the functional seems to be
well defined if restricted to distributions with support contained in the interval of
integration. Such is the case of $\delta(x - c)$ with $a < c < b$ or its derivatives. It is
relatively easy to extend the standard distribution theory on the entire real line to
the case of distributions with supports inside a given open interval.
The rigorous theory developed in Chapter 1 is not, however, able to suggest
what should be expected if the support of the Dirac delta is one of the endpoints
of the integration interval. Let us consider this situation in more detail in a typical

example of the functional

$$J = \int \delta(x)\chi(x)\,dx. \tag{2}$$

Having absolutely no idea of what the meaning of the above integral is, one could
start with its evaluation by a formal change of variables $y = \chi(x)$, notice that
$\delta(x)\,dx = dy$, and conclude that

$$J = \int_0^1 y\,dy = \frac{1}{2}.$$

This heuristic answer, which is extensively used in applications, can be justified
more formally.
To get a deeper insight into the situation, let us follow a more rigorous path of
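evaluating the functional (2) as a limit $J = \lim_{k\to\infty} J_k$ of ordinary integrals

$$J_k = \int \delta_k(x)\chi_k(x)\,dx, \tag{3}$$

where $\delta_k(x)$ and $\chi_k(x)$ are regular functions, weakly converging to distributions
$\delta(x)$ and $\chi(x)$, respectively. As examples of such weakly converging sequences
let us take

$$\delta_k(x) = k\lambda(kx - a), \quad\text{and}\quad \chi_k(x) = \Lambda(kx), \tag{4}$$

where

$$\Lambda(x) = \int_{-\infty}^{x} \lambda(y)\,dy,$$

with

$$\lambda(x) = \frac{1}{\pi}\,\frac{1}{x^2 + 1}$$

being the Lorentz kernel, so that

$$\Lambda(x) = \frac{1}{\pi}\left(\arctan(x) + \frac{\pi}{2}\right). \tag{5}$$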

It can be easily seen that the values of integrals $J_k = J(a)$ do not depend on $k$,
and range over the interval (0, 1) as $a$ varies from $-\infty$ to $+\infty$. Therefore, unlike
the values of functionals on the test function space $\mathcal{D}$, the value of functional (2)
depends on the "inner structure" of a distribution. Functions $\chi_k(x) - 1/2$ are
clearly odd functions and, in physical applications, one often "imagines" the Dirac
delta $\delta(x)$ as an "even function". Under such an assumption, we would always
have $J = 1/2$. Because of this, physicists are accustomed to using the following
working formula: For any continuous function $\varphi(x)$,

$$\int_a^b \delta(x - c)\varphi(x)\,dx = \begin{cases} \varphi(c), & \text{if } c \in (a, b); \\ 0, & \text{if } c \notin [a, b]; \\ \varphi(c)/2, & \text{if } c = a \text{ or } b. \end{cases} \tag{6}$$

Of course, the above physical assumption of the Dirac delta's evenness has to be
taken with a grain of salt, and if used recklessly it can result in an incorrect solution
of the particular physical problem. Thus, in the case of beads on a string occupying
interval (0, 1), the bead at the left endpoint of the interval with density $\rho = m\delta(x)$
either is assumed to be a part of the string or not. Therefore, the contribution of
that bead to the force acting on the string has to be taken either as $f(0)m$ or zero,
depending on the actual physical situation. No middle ground is sensible here.
In a more general case of weakly converging sequences (4), parameter $a$ is used
to describe an infinitesimal quantity, comparable to the "thickness" of the Dirac
delta, displacing it to the left ($a < 0$) or to the right ($a > 0$) from its basic location
at $x = 0$. Under zero displacement, we obtain from (4) that $J = 1/2$, and for
$a \to -\infty$, when the support of the Dirac delta becomes completely separated
from the region of nonzero values of the Heaviside function, we have $J \to 0$.
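The dependence of J on the displacement parameter a, and its independence of k, can be explored numerically for the Lorentz-type sequences (4). The following sketch evaluates the integral of δ_k(x) χ_k(x) over the whole line using the substitution x = tan θ and a midpoint rule; the function names and the quadrature are illustrative choices, not part of the text.

```python
import math

def lorentz(x):
    # Lorentz kernel lambda(x) = 1 / (pi * (x**2 + 1)).
    return 1.0 / (math.pi * (x * x + 1.0))

def big_lambda(x):
    # Its integral from -infinity to x, formula (5).
    return math.atan(x) / math.pi + 0.5

def J(a, k=1.0, n=200000):
    # Integral over the real line of k*lambda(k*x - a) * Lambda(k*x),
    # computed with x = tan(theta), theta in (-pi/2, pi/2), midpoint rule.
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * h
        x = math.tan(theta)
        total += k * lorentz(k * x - a) * big_lambda(k * x) / math.cos(theta) ** 2
    return total * h

for a in (-5.0, 0.0, 2.0, 5.0):
    print(a, J(a), J(a, k=5.0))
```

For these particular sequences one can check that J(a) = 1/2 + arctan(a/2)/π, so the printed values should not change with k, should equal 1/2 at a = 0, and should approach 0 and 1 as a goes to minus and plus infinity.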

2.9.2. Differential equations with singular coefficients. Let us discuss another
interesting and instructive example demonstrating a limited nature of the range
of applications of formula (6). Consider the following differential equation for
function $f(x)$:

$$f' = p\,\delta(x - c)f, \qquad f(0) = f_0. \tag{7}$$

Here $p$ is a known constant and $c > 0$. A formal solution of this equation
obtained by the separation of variables has a discontinuity at $x = c$ with the size
of the jump

$$\Delta f = f_0(e^p - 1). \tag{8}$$

Hence, distribution theory indicates that the solution of equation (7) can be written
in the form

$$f(x) = f_0 + \Delta f\,\chi(x - c),$$

and that its derivative is

$$f'(x) = \Delta f\,\delta(x - c).$$

Substituting the above formulas in equation (7), rearranging the terms, and inte-
grating over all x's, we get that the value of the integral

$$J = \int \delta(x - c)\chi(x - c)\,dx = \frac{\Delta f - p f_0}{p\,\Delta f} = \frac{e^p - 1 - p}{p(e^p - 1)}.$$

The graph of J as a function of p is presented in Fig. 2.9.1.

FIGURE 2.9.1
Possible values of integral J as a function of p.

The above "calculation" makes it clear that as $p$ varies from $-\infty$ to $+\infty$, $J$
decreases from 1 to 0, and as $p \to 0$, the functional $J$ converges to 1/2. The
latter situation corresponds to what the physicists call the solution of (7) in the
first perturbation approximation (or, the Born approximation). It is obtained by
replacing $f$ by $f_0$ on the right-hand side.
Let us also remark that expressions like (8) for the jump of a solution at the point
of coefficient's singularity are used in complex problems which require gluing two
classical solutions on either side of a singularity point.
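The behaviour of J as a function of p described above is easy to tabulate. The sketch below assumes the expression J(p) = (e^p - 1 - p) / (p (e^p - 1)), which follows from substituting the jump (8) into equation (7) and integrating; `math.expm1` is used only to reduce rounding error for small p, and the function name is hypothetical.

```python
import math

def J(p):
    # J(p) = (e^p - 1 - p) / (p * (e^p - 1)); the limit p -> 0 equals 1/2.
    if p == 0.0:
        return 0.5
    jump = math.expm1(p)          # relative jump size (e^p - 1) = delta_f / f0
    return (jump - p) / (p * jump)

for p in (-20.0, -5.0, -1e-3, 0.0, 1e-3, 5.0, 20.0):
    print(p, J(p))
```

The table decreases monotonically from values near 1 for large negative p to values near 0 for large positive p, passing through 1/2 at p = 0, in agreement with the Born approximation remark.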
2.9.3. Nonmonotonic composite arguments of delta functions. Another example
of a situation where we are forced to abandon the framework of the rigorous
distribution theory is related to the necessity of operating with the Dirac delta
$\delta(a(x) - y)$ of a nonmonotone composite argument $a(x)$. In this case, equation
$y = a(x)$ may either have no roots at all, or may have multiple roots (see Fig.
2.9.2) $x_k = \beta_k(y)$, $k = 1, 2, \dots, n(y)$.

FIGURE 2.9.2
An example of a nonmonotonic Dirac delta argument.

Accordingly, in applications, one often uses the formula

$$\delta(a(x) - y) = \sum_{k=1}^{n} \frac{\delta(x - x_k)}{|a'(x_k)|}, \tag{9}$$

or, in view of the Dirac delta's multiplier probing property,

$$\delta(a(x) - y) = \frac{1}{|a'(x)|} \sum_{k=1}^{n} \delta(x - x_k),$$

which turns out to also function well in the case of nonmonotonic functions $a(x)$
as long as all its terms are well defined, i.e., provided function $a(x)$ has continuous
nonzero derivatives in the vicinity of points $x_k$. In the example that goes back to
Dirac himself, the relationship

$$\delta(x^2 - a^2) = \frac{1}{2|a|}\left[\delta(x - a) + \delta(x + a)\right] \tag{10}$$

remains valid for all $|a| > 0$. Sometimes, one can even remove the above mentioned
requirement that $|a'(x_k)| > 0$. Indeed, the relationship

is satisfied for any $a$ since the multiplier $2x$ removes the singularity of the corresponding
functional at point $x = 0$.
Another typical situation where relations like (9) are useful arises when one tries
to count the number of intersections of the level y by the graph of function a(x)
over the interval $(0, z)$. Indeed, by (9), the integral

$$N(y, z) = \int_0^z |a'(x)|\,\delta(a(x) - y)\,dx \tag{11}$$

gives the desired crossing number. Similarly,

$$N_+(y, z) = \int_0^z a'(x)\chi(a'(x))\,\delta(a(x) - y)\,dx \tag{12}$$

counts the number of upcrossings of the level $y$ by function $a(x)$, and

$$N_-(y, z) = -\int_0^z a'(x)\chi(-a'(x))\,\delta(a(x) - y)\,dx \tag{13}$$

counts the number of downcrossings of the level $y$ by the same function.
Counters of this type are used extensively in processing observational data.
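The counters (11)-(13) can be imitated numerically by replacing the Dirac delta with a narrow bump. In the sketch below the delta is approximated by a Gaussian of width `eps`, the integrals are computed with a midpoint rule, and the test signal a(x) = sin x with level y = 0.5 on (0, 10) is an arbitrary example; all names are hypothetical.

```python
import math

def delta_eps(s, eps):
    # Narrow Gaussian approximating the Dirac delta.
    return math.exp(-(s / eps) ** 2) / (math.sqrt(math.pi) * eps)

def crossing_counts(a, da, y, z, eps=0.01, n=200000):
    """Approximate N, N_plus, N_minus of formulas (11)-(13) on (0, z).
    a is the function, da its derivative, y the level."""
    h = z / n
    total = up = down = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        slope = da(x)
        weight = delta_eps(a(x) - y, eps) * h
        total += abs(slope) * weight
        if slope > 0:
            up += slope * weight       # upcrossings: a'(x) > 0
        else:
            down -= slope * weight     # downcrossings: a'(x) < 0
    return total, up, down

n_all, n_up, n_down = crossing_counts(math.sin, math.cos, y=0.5, z=10.0)
print(round(n_all), round(n_up), round(n_down))
```

On (0, 10) the sine crosses the level 0.5 four times, twice upward and twice downward. The approximation is reliable only when the level is not too close to an extremum of a(x) and `eps` is small compared with the distance to it.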

2.9.4. Nonlinear transformations of distributions. One of the most obvious
and essential difficulties encountered in distribution theory is the problem of nonlinear
transformations. One has to remember that the ability to analyze nonlinear
functions such as $\exp(f(x))$ or $f^2(x)$ is a major appeal of classical analysis. But
if we replace function f by the Dirac delta or its derivatives, the above expressions
do not make sense. That's why distribution theory is most effective when applied
to linear problems.
Nonlinear transformations of distributions are, however, not always meaningless
and, even if applied heuristically, lead sometimes to correct physical results. In
those cases, the rule of thumb is: the less singular the distributions, the more free-
dom one has in handling their nonlinear transformations. For example, equalities

$$g(\chi(x)) = \begin{cases} g(1), & \text{for } x \ge 0; \\ g(0), & \text{for } x < 0, \end{cases}$$

and, in particular,

are completely rigorous and can be established by pointwise inspection. When
dealing with more singular functions, e.g., the Dirac delta, more attention should

be paid, as was emphasized earlier, to the symmetry and the inner structure of
distributions. Hence, when facing the integral

(14)

one tries to make sense out of it by making certain assumptions (resulting usually
from the physics of the problem) about the relationship of smooth approximants to
the singular distribution $\delta(x)$ and the Heaviside function $\chi(x)$. If, for example,
sequences converging weakly to these distributions satisfy the following consistency
condition

then one can use the above mentioned change of variables y = X(x) to obtain

Also, it is relatively easy to rigorously define a product of two distributions if
their singular supports are disjoint. However, the corresponding relationships turn
out to be, as a rule, rather trivial and are of the type of the formula

$$\delta(x - a)\delta(x - b) = 0, \qquad a \ne b.$$

There exists, however, a possibility to rather rigorously define nontrivial nonlinear
combinations of singular distributions. It is related to parameter-dependent
distributions. Let $T(x, y)$ be a distribution in variable $x$ depending on parameter $y$.
Distribution $\delta(x - y)$ can serve here as an example. A functional

$$T(x, y)[\varphi(x)] = \psi(y) \tag{15}$$

maps the set $\mathcal{D}$ of test functions $\varphi(x)$ onto a set $L$ of functions $\psi(y)$. Provided
$L \subset \mathcal{D}$, function (15) can be used itself as a test function for another distribution
$S$, that is we can evaluate $S(y)[\psi(y)]$. The last expression can be used to define

$$\int S(y)T(x, y)\,dy$$

as a distribution determined by the following functional action:

$$\int S(y)T(x, y)\,dy\,[\varphi(x)] = S[\psi(y)].$$

A convolution of two distributions, which was introduced rigorously earlier, is a
special example of such a quadratic form often encountered in applications.
Another nontrivial and fruitful type of multiplication of singular distributions is
the direct product of distributions which permits a construction of distributions on
multidimensional spaces. For example, the direct product $\delta(x)\delta(y)$ can be defined
rigorously by

$$\int\!\!\int \delta(x)\delta(y)\varphi(x, y)\,dx\,dy = \varphi(0, 0).$$

Distributions on $d$-dimensional spaces are, however, a separate and interesting
matter discussed in Section 1.9.
2.9.5. Supersingular distributions. It can happen that the function (15) does not
have any meaning within the framework of the standard distribution theory, while
the functional
$$T(x, y)[\psi(y)] = g(x), \tag{16}$$

is well defined for test functions $\psi(y) \in \mathcal{D}$, for each value of parameter $x$. Then
it makes sense to say that the functional (15) determines a new distribution $R$:

$$T(x, y)[\varphi(x)] = R(y). \tag{17}$$

Its algorithmic action postulates an extension to the functional case of the Fubini
Theorem on preservation of the value of the double integral under change of the
integration order:

$$T(x, y)[\psi(y)][\varphi(x)] = R(y)[\psi(y)],$$

and, consequently,

$$R(y)[\psi(y)] = \int g(x)\varphi(x)\,dx.$$

Let us illustrate the above concept in the typical example of

$$T(x, y) = \delta(\chi(x) - y). \tag{18}$$

Since the Heaviside function $\chi(x)$, appearing as an argument of the Dirac delta (18),
clearly violates assumptions on thus far allowable arguments of distributions, the
formula (18) does not define a distribution in a rigorous sense. Moreover, applying
to (18) the pragmatic equality (9) also leads to a nonsensical result. Nevertheless,
equation (18) rigorously defines a distribution (depending on parameter $x$) on the
set of test functions $\{\psi(y)\}$ such that

$$\delta(\chi(x) - y)[\psi(y)] = \psi(\chi(x)). \tag{19}$$

Multiplying this equality by $\varphi(x)$ and integrating it over all $x$'s, we find the algorithm
of action of the distribution $R(y)$ (17) in our special case (18):

$$R(y)[\psi(y)] = \psi(0)\int_{-\infty}^{0} \varphi(x)\,dx + \psi(1)\int_{0}^{\infty} \varphi(x)\,dx.$$

Hence, the sought distribution is

$$R(y) = \delta(\chi(x) - y)[\varphi(x)] = \delta(y)\int_{-\infty}^{0} \varphi(x)\,dx + \delta(y - 1)\int_{0}^{\infty} \varphi(x)\,dx. \tag{20}$$

This equality displays the algorithm of action of distribution (18) on the family
of test functions $\{\varphi(x)\}$. It is clear from (20) that this distribution can be called
neither regular nor singular. Indeed, the right-hand side of (20) is not a continuous
operation on the set of test functions $\{\varphi(x)\}$. For this reason a new term is needed
and we will call distributions of type (18) the supersingular distributions. Notice
that the supersingular Dirac delta (18) no longer enjoys the usual probing property
since it depends on the values of the test function $\varphi(x)$ on the entire $x$-axis.
In the latter part of this book we will discover that equalities like (20) find an
application in solving stochastic problems and provide a clear-cut probabilistic
interpretation. In this section we will illustrate the situation similar to (20) in an
example of gas dynamics discussed in Section 2.5.
Example 1. Density of the gas of sticky particles. Particles of the
1-D gas, initially (for $t = 0$) distributed on the $x$-axis with density $\rho_0(x)$, move
with constant velocity $v(x)$. In this case, the relation between the Lagrangian
coordinates $y$ and the Eulerian coordinates $x$ of the particles is given by the obvious
equality

$$x = b(y, t) = y + v(y)t. \tag{21}$$

Assume that $v(x)$ is a continuously differentiable function such that

$$\min v'(x) = -u < 0, \qquad u > 0.$$


The flow's evolution can be split into two qualitatively distinct stages.
Up to time

$$t^* = 1/u$$

the particles preserve their spatial order and move in the single-stream regime.
The function $x = b(y, t)$ (21), and its inverse function $y = a(x, t)$ are strictly
monotone and continuously differentiable, while the particle density is given by
the rigorous formula

$$\rho(x, t) = \int \rho_0(y)\,\delta(b(y, t) - x)\,dy, \tag{22}$$

analogous to the equation (2.5.3).


For $t > t^*$, there appears a new, multi-stream, regime of motion when some
particles catch up with other particles, and there are points on the $x$-axis where
simultaneously several particles are located moving with different velocities. At
that stage, the functional (22) has to be calculated by means of the pragmatic
formula (2.9.9) which leads to

$$\rho(x, t) = \sum_{n=1}^{N} \rho_0(a_n(x, t))\,|j_n(x, t)|, \tag{23}$$

where the summation is carried over all $N$ ($N \ge 1$) roots $y = a_n(x, t)$ of the
equation

$$b(y, t) = x,$$

solved for $y$ for fixed $x$ and $t$, and

$$j_n(x, t) = \frac{\partial}{\partial x} a_n(x, t).$$

Expression (23) has a clear-cut physical meaning: the density of particles at the
point $x$ is equal to the sum of densities (2.5.8) of all streams arriving at this point.
The graph of the function $x = b(y, t)$ in the single-stream and multi-stream regimes
is shown in Fig. 2.9.3(a).
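The multi-stream formula (23) can be sketched in a few lines for a concrete flow. The example below assumes, for illustration only, the velocity profile v(y) = -sin y (so that u = 1 and t* = 1) and a uniform initial density; the roots of b(y, t) = x are located by scanning a grid for sign changes and refining by bisection, and each stream contributes ρ_0 / |∂b/∂y|, which equals ρ_0 |j_n| because j_n is the derivative of the inverse map.

```python
import math

def v(y):
    # Assumed initial velocity profile (illustration only); min v' = -1, so t* = 1.
    return -math.sin(y)

def dv(y):
    return -math.cos(y)

def rho0(y):
    # Assumed uniform initial density.
    return 1.0

def b(y, t):
    # Formula (21): free motion with constant velocity v(y).
    return y + v(y) * t

def lagrangian_roots(x, t, y_min=-10.0, y_max=10.0, n=2001):
    # All roots y = a_n(x, t) of b(y, t) = x on a finite window.
    roots = []
    h = (y_max - y_min) / n
    for i in range(n):
        lo = y_min + i * h
        hi = lo + h
        f_lo = b(lo, t) - x
        f_hi = b(hi, t) - x
        if f_lo * f_hi < 0:
            for _ in range(60):          # bisection refinement
                mid = 0.5 * (lo + hi)
                f_mid = b(mid, t) - x
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return roots

def density(x, t):
    # Formula (23): sum over streams of rho0(a_n) * |j_n|, with |j_n| = 1 / |db/dy|.
    return sum(rho0(y) / abs(1.0 + dv(y) * t) for y in lagrangian_roots(x, t))

print(len(lagrangian_roots(0.0, 0.5)), density(0.0, 0.5))   # single stream
print(len(lagrangian_roots(0.0, 2.0)), density(0.0, 2.0))   # three streams
```

Before t* only one stream reaches x = 0, while at t = 2 three streams overlap there and their densities add up. The scan can miss roots where b(y, t) - x merely touches zero, that is, at the caustics where (23) itself becomes singular.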
Now let us consider the situation where the particles are sticky and are forbidden
to overtake each other. After a "collision" they move together. Mathematically,
the phenomenon of particles sticking together corresponds to the passage from a
nonmonotone (for t > t*) function b(y, t) to a monotone function b(y, t), where
the nonmonotone piece of the former was replaced by the horizontal piece of
the latter. The position x*(t), the coordinate of adhesion of sticky particles, is
determined, for example, by the momentum conservation law. The typical graph
of function b(y, t) is shown in Fig. 2.9.3(b).
The corresponding particle density is described by

$$\rho(x, t) = \int \rho_0(y)\,\delta(b(y, t) - x)\,dy \tag{24}$$

containing a supersingular Dirac delta. To find its action on the test function $\rho_0(y)$
note that, as in (19),

$$\delta(b(y, t) - x)[\varphi(x)] = \varphi(b(y, t)).$$

FIGURE 2.9.3
The graphs of functions $x = b(y, t)$ representing the Lagrangian coordinates $y$ and the
Eulerian coordinates $x$ in the two cases of: (a) noninteracting particles with
resolved multi-stream motion for $t > t^*$; (b) sticky particles.

Multiplying this equality by Po(y) and integrating it over all y we obtain, in view
of the Fubini postulate (see (17) and the following comments), that

f p(x, t)t/J(x) dx = t/J(x*(t» f


Y2(t)

Yl (t)
Po(y) dy + f +f
[ Yl (t)

-00 Y2(t)
00]
t/J(b(y, t»Po(y) dy.

Here, $y_1(t) < y_2(t)$ are the edges of the plateau of the function $b(y, t)$, where $b = x^*(t)$
(see Fig. 2.9.3(b)). Now, let $y = a(x, t)$ be the inverse function to the monotone
function $x = b(y, t)$. Choosing a new integration variable $x$, connected with $y$ via
the equality $y = a(x, t)$, we merge the last two integrals into a single integral:

$$\int \rho(x, t)\,\phi(x)\,dx = m(t)\,\phi(x^*) + \int \phi(x)\,\rho_0(a(x, t))\,j(x, t)\,dx, \tag{25}$$

where

$$m(t) = \int_{y_1(t)}^{y_2(t)} \rho_0(y)\,dy$$

is the total mass of particles glued in the cluster at point $x = x^*(t)$, and

$$j(x, t) = \left\{\frac{\partial a(x, t)}{\partial x}\right\}.$$

The braces are used to indicate that $j(x, t)$ is the derivative of function $a(x, t)$
for all $x \ne x^*(t)$, where $a(x, t)$ is a smooth function of variable $x$. At the point
$x = x^*$ one can assign to $j(x^*, t)$ any finite value.
It is clear that if we insert the formula

$$\rho(x, t) = m(t)\,\delta(x - x^*(t)) + \rho_0(a(x, t))\,j(x, t) \tag{26}$$

in the functional

$$\int \rho(x, t)\,\phi(x)\,dx = \rho(x, t)[\phi(x)],$$

the relation (25) is recovered. It means that (26) gives the sought generalized
density of the gas of sticky particles and takes into account the adhesion process.
Recall that the first summand on the right-hand side of (26) is the singular density
of the sticking particles' cluster, while the second summand is the smooth density
of nonsticking particles outside the cluster (see Fig. 2.9.4).

[Figure 2.9.4: density plotted against x, with a vertical arrow labeled m δ(x − x*) at the point x = x*.]

FIGURE 2.9.4
Plot of the generalized density (26). The vertical arrow indicates symbolically
the singular density of the cluster of glued particles.

We stress that, as expected, the generalized density (24) satisfies the physical
mass conservation law:

$$m = \int \rho(x, t)\,dx = \int \rho_0(y)\,dy = \text{const}.$$
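
The conservation law can be checked directly from (26). As a short sketch: integrating (26) over all $x$, and returning to the Lagrangian variable $y = a(x, t)$ in the smooth part outside the cluster, gives

$$\int \rho(x, t)\,dx = m(t) + \left[\int_{-\infty}^{y_1(t)} + \int_{y_2(t)}^{\infty}\right] \rho_0(y)\,dy = \int \rho_0(y)\,dy,$$

since $m(t)$ is exactly the integral of $\rho_0$ over the plateau $[y_1(t), y_2(t)]$. The mass that disappears from the smooth part of the density is thus accounted for by the weight of the delta term.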

2.10 Exercises
Ordinary differential equations
1. Convert the homogeneous differential equation $\ddot{y} + \gamma\dot{y} + \omega^2 y = 0$ with the initial
conditions $y(0) = a$, $\dot{y}(0) = b$ into a nonhomogeneous equation whose solution
satisfies the causality principle and, for $t > 0$, coincides with the solution of the original
initial value problem.
2. What initial conditions should be imposed on the corresponding homogeneous
equation in order that, for $t > 0$, its solution coincides with the solution (satisfying the
causality principle) of the equation $\ddot{y} + \gamma\dot{y} + \omega^2 y = \delta(t)$?
3. Find an even solution of equation y+y = $\delta(t)$.
4. Find a solution of equation y + y = $a\,\chi(t) + b\,\delta(t)$ satisfying the causality principle.
5. What equation can replace equation y+ y = $(1/\varepsilon) f(x/\varepsilon)$ for "microscopically
small" $\varepsilon$? ($\int f(x)\,dx = 1$.)
6. Find a general solution of equation ty = thy.

Wave equation
7. A special solution of the wave equation $\partial^2 u/\partial t^2 = c^2\,\partial^2 u/\partial x^2 + f(x, t)$ is expressed
via the Green function $G(x - y, t - \tau)$ by the following formula:

$$u(x, t) = \int d\tau \int dy\, f(y, \tau)\,G(x - y, t - \tau).$$

Find the Green function of the wave equation.

8. Write the D'Alembert formula utilizing the Green function.

Continuity equation
9. With the help of the Dirac delta find the 1-D density and concentration of a passive tracer
whose velocity depends on the distance from the origin via the formula $v = gx$. Assume
that the initial densities and concentrations are $\rho_0(x)$ and $C_0(x)$.
10. Solve the previous problem in the case where the particles move by inertia, and for
$t = 0$ their velocity depends linearly on their coordinate.
11. Find the dependence on altitude (measured as a distance from the cloud) of the
density of rain drops assuming that their initial density is $\rho_0$, initial velocity is $v$, and that
they fall freely under the gravity force acceleration $g$.
12. Obtain the same result as a solution of the continuity equation for densities $\rho(x)$
that stabilize in time, such that

$$\frac{\partial}{\partial x}\bigl(v(x)\,\rho(x)\bigr) = 0.$$

13. What happens if one assumes that $g < 0$?


14. Let the stream function $\psi$ of a 2-D incompressible fluid be independent of time,
i.e., $\psi = \psi(x_1, x_2)$. What should be the passive tracer distribution so that its density
would not depend on time?
15. One of the possible distributions of the passive tracer described in the above problem
has density $\rho(x) = a\,\delta(\psi(x) - b)$. What is the dimension of factor $a$? Find the mass of
passive tracer with that distribution.
16. Derive the continuity equation in the 6-D phase space $(x, v)$ for the density $f(x, v, t)$
of a collection of particles with positions and velocities satisfying equations

$$\frac{dX_i}{dt} = V_i, \qquad \frac{dV_i}{dt} = g(X_i, V_i), \qquad i = 1, 2, \ldots, n, \ldots$$

17. Solve the above equation by the Green function method in the case when particles
move in a viscous medium ($g = -hv$) and their initial density is $f_0(x, v)$.
18. Consider again the above problem but under the assumption that all of the
particles have mass $m$ and at time $t = 0$ are concentrated at the origin with the initial
density $f_0(x, v) = m\,\delta(x)\,w(v)$, where function $w(v)$ satisfies the norming condition
$\int w(v)\,d^3v = 4\pi \int_0^\infty w(v)\,v^2\,dv = 1$. Find the density $\rho(x, t)$ of particles in the physical
space $x \in R^3$.
19. What is the weak limit of density $f(x, v, t)$ from the previous problem as $t \to \infty$?

20. In the simplest example of the inertial motion of particles the Liouville equation
takes the form

$$\frac{\partial f}{\partial t} + (v \cdot \nabla_x) f = 0.$$

In the hydrodynamic approximation the initial condition for this equation is
$f_0(x, v) = \rho_0(x)\,\delta(v - v_0(x))$. It expresses mathematically the fact that the particles with the same
position $x$ have to have the same velocity $v_0(x)$. We shall seek a solution of the Liouville
equation in the similar form

$$f(x, v, t) = \rho(x, t)\,\delta(v - v(x, t)).$$

In this situation find equations for $\rho(x, t)$ and $v(x, t)$.

21. Assume that the velocities of particles in the hydrodynamic flow satisfy identical
equations $dV/dt = g(X, V, t)$. Write these equations for the velocity field $v(x, t)$ in
Eulerian coordinates.
22. The density of particles $\rho(x, t) = \int f(x, v, t)\,d^3v$ always satisfies the continuity
equation

$$\frac{\partial \rho}{\partial t} + \operatorname{div}\,(v(x, t)\,\rho) = 0.$$

Find an expression for the velocity field $v(x, t)$ in terms of the density $f(x, v, t)$ in the
6-D phase space.

23. Let the initial 1-D density (see Fig. 2.3) be

$$\rho_0(x) = m \sum_{i=-\infty}^{\infty} \delta(x - \beta(i\Delta s)),$$

where $\beta(y) = 2y(|y| + s)/(2|y| + s)$. Find the averaged density $\bar\rho_0(x)$ (see (2.4.13)) in
the limit case $\Delta \to 0$, $l \to 0$, $m = \Delta s \rho$, $\Delta = o(l)$.

Pragmatic approach
24. From the pragmatic viewpoint, what is the coefficient $\mu$ in the equality
$\delta(f(x)) = \mu\,\delta(x)$, if $f(x) = \alpha x$ for $x \ge 0$ and $f(x) = \beta x$ for $x < 0$?

25. What pragmatic value would one assign to the functional $\int f(x)\,\delta'(x)\,dx$, where
$f(x)$ is the function from the previous exercise?

26. Find the matching (glueing) conditions for the solution of the equation

$$\ddot{y} + \gamma\,\delta(t)\,\dot{y} + \omega^2 y = 0.$$

27. Find the matching (glueing) conditions for the solution of the equation

$$y'' + \gamma\,(\delta(t)\,y)' + \omega^2 y = 0.$$

28. Find the distribution $R(y) = \delta(a(x) - y)[\phi(x)]$ in the following cases:
(a) $a(x) = x\,\chi(x)$;
(b) $a(x) = (|x| + 1)\,\mathrm{sign}(x)$;
(c) $a(x) = x + (1/2)(|x - 1| - |x + 1|)$.
29. Find a generalized particle density if the particles stick at point x = 0, and up
to that point move with constant velocity v(y) = -w/4y, w > 0, where y is the
Lagrangian coordinate of the particle. Assume that the initial density of particles was the
same everywhere and equal to $\rho_0$.
Part II
INTEGRAL TRANSFORMS AND
DIVERGENT SERIES
Chapter 3
Fourier Transform

3.1 Definition and elementary properties


In this chapter we study the Fourier transform and investigate its properties for
functions f(t) depending on a single variable t which will be interpreted as time.
The Fourier transform (or Fourier image) $\tilde f(\omega)$ of $f(t)$ is defined by

$$\tilde f(\omega) = \frac{1}{2\pi}\int f(t)\,e^{-i\omega t}\,dt, \tag{1}$$

whenever the integral on the right-hand side exists.


Notice that if $f$ is absolutely integrable on the whole real line the above integral
is well-defined. In particular, if $f(t)$ is bounded and decays at infinity faster than
$1/|t|^{1+\varepsilon}$ for some $\varepsilon > 0$, then the integrability condition is satisfied and the Fourier
transform is well-defined as a function of $\omega$. On the other hand, constant functions such as
$f(t) = 1$ do not have well-defined Fourier transforms in the above sense. We will
return to this difficulty later on in the section on generalized Fourier transforms.
Formula (1) describes an operation $f \mapsto \tilde f$ called the Fourier transformation,
which transforms a function $f$ of the time variable $t$ into a function $\tilde f$ of another
variable $\omega$; in this context $\omega$ will be called angular frequency. It is clear from the
defining formula (1) that the Fourier transformation is a linear operation, that is,
for any constants $A$, $B$,

$$(Af + Bg)^{\sim} = A\tilde f + B\tilde g. \tag{2}$$

Other connections between simple operations on the original function $f(t)$ and
corresponding modifications of its Fourier image also follow immediately from
the formula (1). So, if $f(t) \mapsto \tilde f(\omega)$, then

$$f(t + \tau) \mapsto \tilde f(\omega)\,e^{i\omega\tau}, \tag{3a}$$

$$f(t)\,e^{i\Omega t} \mapsto \tilde f(\omega - \Omega), \tag{3b}$$

$$f(-t) \mapsto \tilde f(-\omega), \tag{3c}$$

$$f^*(t) \mapsto \tilde f^*(-\omega). \tag{3d}$$

The first two equalities concern the shifts in variables t and w and the last two
describe what happens under the change of sign of the same variables. If function
f(t) is real-valued then its Fourier image satisfies the symmetry condition

$$\tilde f(-\omega) = \tilde f^*(\omega). \tag{4}$$

This formula permits us to deal with positive frequencies alone.
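
The shift rule (3a) and the symmetry condition (4) are easy to check numerically. The following is a minimal sketch, assuming a Gaussian test signal and a plain midpoint sum in place of the integral in (1); the helper name `fourier`, the signal `gauss`, and the values of `omega` and `tau` are illustrative choices, not part of the text.

```python
import cmath
import math

def fourier(f, omega, half_width=20.0, n=4000):
    """Approximate (1/2π) ∫ f(t) e^{-iωt} dt by a midpoint sum on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        t = -half_width + (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * h / (2.0 * math.pi)

def gauss(t):
    # Illustrative test signal: real-valued and rapidly decaying.
    return math.exp(-t * t / 2.0)

omega, tau = 1.3, 0.7

base = fourier(gauss, omega)
shifted = fourier(lambda t: gauss(t + tau), omega)

# (3a): the transform of f(t + τ) is the transform of f times e^{iωτ}.
shift_error = abs(shifted - base * cmath.exp(1j * omega * tau))

# (4): for real f, the transform at -ω is the complex conjugate of the transform at ω.
symmetry_error = abs(fourier(gauss, -omega) - base.conjugate())

print(shift_error, symmetry_error)
```

Both printed errors should be at the level of rounding noise. The truncation to a finite interval is harmless here only because the Gaussian decays so quickly; a slowly decaying signal would need a wider interval or a different quadrature.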


Numerous applications of the Fourier transform depend on its fundamental connection
with the notions of homogeneity of time and space. Indeed, the complex
Fourier integral kernel $f(t) = e^{i\omega t}$ satisfies the remarkable functional equation

$$f(t + \tau) = f(t)\,f(\tau). \tag{5}$$

Equation (5) reflects the invariance of the exponential function $\exp(t)$ under time
shifts: when the time variable $t$ is shifted by a fixed $\tau$, function $\exp(t)$ does not
change except for a constant multiplier $\exp(\tau)$. By contrast with a purely real
solution $e^{rt}$ of the same functional equation (5), increasing if $r > 0$ and decreasing
if $r < 0$, for the complex exponential

$$|e^{i\omega t}| = 1,$$

that is, the modulus of function $e^{i\omega t}$ remains constant. Also, since for any integer $k$

$$e^{2\pi i k} = 1,$$

function $e^{i\omega t}$ is invariant (i.e. transformed into itself) under time-shifts $T = 2\pi k/\omega$, that is,

$$\exp(i\omega(t + 2\pi k/\omega)) = \exp(i\omega t).$$

In other words, function $e^{i\omega t}$ is periodic with period $T = 2\pi k/\omega$, and it can be
viewed as a clock hand performing a full revolution in time $T$, or as a ruler of
length $T$ applied to the time axis.
The variable $\omega$ is usually interpreted as the angular frequency and in the physical
sciences one often uses the quantity

$$\nu = \omega/2\pi$$

which is called frequency. In what follows, we will only utilize angular frequency
$\omega$, which is more convenient from the viewpoint of mathematical exposition, and,
without danger of misunderstanding, we will refer to it as frequency $\omega$.
Another remarkable property of the Fourier transform's kernel $e^{i\omega t}$ is particularly
important in the analysis of linear systems: a linear combination of arbitrarily time
shifted exponential functions is again, up to a constant coefficient, such a function.
More precisely,

$$A \exp[i\omega(t + \tau_1)] + B \exp[i\omega(t + \tau_2)] = C e^{i\omega t},$$

where

$$C = A \exp(i\omega\tau_1) + B \exp(i\omega\tau_2).$$

Consider a physical linear system. Recall that the system is said to be linear if its
response to an outside input obeys the superposition principle, that is, the system's
response to the sum of input signals is equal to the sum of independent responses
to the components of the sum. Mathematically, this means that the output signal,
which will be denoted by $g(t)$, is a linear operator on the input signal $\phi(t)$. In the
general case such an operator is of the form¹

$$g(t) = \int h(t, \tau)\,\phi(\tau)\,d\tau,$$

where $h(t, \tau)$ is the response to the $\delta$-pulse input, which thus completely determines
properties of the linear system. Suppose that properties of the system are invariant
in time. Then, under an arbitrary time shift of the input signal, the output signal
should not change except for the same time shift. Such a property of invariance
with respect to time shifts will be satisfied if the response to the $\delta$-pulse input
depends only on one variable, i.e., $h(t, \tau) = h(t - \tau)$. In this case the relationship
between the input and output signals is given by the familiar convolution integral

$$g(t) = \int h(\tau)\,\phi(t - \tau)\,d\tau.$$

¹The area of mathematics studying such general operators is called functional analysis. In particular,
it provides the above representation formula.

If the input signal is of the form $\phi(t) = e^{\gamma t}$, then its shape will not be distorted
by the time-invariant system, since $g(t) = K\phi(t)$. The complex factor

$$K = \int h(\tau)\,\phi(-\tau)\,d\tau$$

describes the attenuation of the output signal in comparison with the input signal.
In the particular case of $\phi = e^{i\omega t}$, we see that $K = 2\pi\tilde h(\omega)$, so that, up to the
factor $2\pi$, we get the Fourier transform of $h(t)$. This property makes the Fourier
transform technique effective in physical and engineering applications.
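
The statement that a harmonic input passes through a time-invariant linear system unchanged in shape can be illustrated numerically. The sketch below assumes a hypothetical causal impulse response $h(\tau) = e^{-\tau}$ for $\tau \ge 0$ (not taken from the text), for which $\int h(\tau)e^{-i\omega\tau}d\tau = 1/(1 + i\omega)$; the function names and the trapezoidal quadrature are illustrative choices.

```python
import cmath
import math

def h(tau):
    # Hypothetical causal impulse response: h(τ) = e^{-τ} for τ >= 0, zero otherwise.
    return math.exp(-tau) if tau >= 0.0 else 0.0

def response(phi, t, upper=40.0, n=40000):
    """Approximate g(t) = ∫ h(τ) φ(t - τ) dτ over [0, upper] by the trapezoidal rule."""
    step = upper / n
    total = 0.5 * (h(0.0) * phi(t) + h(upper) * phi(t - upper))
    for k in range(1, n):
        tau = k * step
        total += h(tau) * phi(t - tau)
    return total * step

omega = 2.0

def phi(t):
    # Harmonic input e^{iωt}.
    return cmath.exp(1j * omega * t)

# For this h, the factor K = ∫ h(τ) e^{-iωτ} dτ equals 1 / (1 + iω).
K_exact = 1.0 / (1.0 + 1j * omega)

for t in (0.0, 0.5, 3.0):
    ratio = response(phi, t) / phi(t)   # output divided by input
    print(t, ratio, abs(ratio - K_exact))
```

The ratio of output to input comes out the same at every sampled time and agrees with `K_exact` to within the quadrature error, which is the content of $g(t) = K\phi(t)$ with $K = 2\pi\tilde h(\omega)$.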
To get a better feel for the remarkable function $e^{i\omega t}$, we suggest contemplating
the following erroneous line of reasoning which students are sometimes tempted
to follow. Observe that $\exp(i2\pi\nu t) = [\exp(i2\pi)]^{\nu t}$, where $\nu = \omega/2\pi$ is the
frequency (measured in hertz). Since $\exp(i2\pi) = 1$, conclude that
$\exp(i\omega t) = 1^{\nu t} = 1$, and replace the integral (1) by a simpler integral $(1/2\pi)\int f(t)\,dt$. What
went wrong?

3.2 Smoothness, inverse transform and convolution


We begin with the relationship between the smoothness of function $f(t)$ and the
asymptotic behavior of its Fourier transform $\tilde f(\omega)$ as $\omega \to \infty$. Assume that $f(t)$
is n-times continuously differentiable on $R$ and absolutely integrable together with
its first $n$ derivatives. Multiply (3.1.1) by $(-i\omega)^n$ and note that $(-i\omega)^n e^{-i\omega t}$ is the
nth derivative of function $e^{-i\omega t}$ with respect to $t$. Hence, integrating $n$ times by
parts, we arrive at the equality

$$(i\omega)^n \tilde f(\omega) = \frac{1}{2\pi}\int f^{(n)}(t)\,e^{-i\omega t}\,dt, \tag{1}$$

which expresses one of the most useful consequences of the invariance of the
Fourier transform's kernel with respect to shifts of its variables: differentiation
of the original function corresponds to multiplication of its Fourier image by $i\omega$.
Absolute integrability of the integrand in (1) implies that the expression on the
left-hand side is bounded so that the modulus $|\tilde f(\omega)|$ of the Fourier transform
decays at infinity not slower than $|\omega|^{-n}$. This is the basic connection between
the original function's degree of smoothness and its Fourier transform's speed of
decay as $|\omega| \to \infty$. In view of the extraordinary importance of this property we
will formulate it explicitly as the following principle:
Sufficiently smooth functions with absolutely integrable nth derivative have
Fourier transforms that decay at infinity not slower than $|\omega|^{-n}$.
In view of the symmetry between the Fourier transformation and its inverse (to
be established below), the inverse implication is also true:
If the Fourier transform of the original function is a smooth function with an
absolutely integrable nth derivative (with respect to $\omega$), then the original function
decays at infinity not slower than $|t|^{-n}$.

Now let us turn to the inversion formula which permits recovery of a function
from its Fourier transform. We will assume that $f(t)$ is absolutely integrable and
sufficiently smooth so that $\tilde f(\omega)$ is absolutely integrable as well. Multiplying
(3.1.1) by $\tilde\phi(\omega)e^{i\omega\tau}$, where $\tilde\phi(\omega)$ is an absolutely integrable function, and integrating
both sides of the equality with respect to $\omega$, we obtain that

$$\int \tilde f(\omega)\,\tilde\phi(\omega)\,e^{i\omega\tau}\,d\omega = \frac{1}{2\pi}\int f(t)\left[\int \tilde\phi(\omega)\,e^{i\omega(\tau - t)}\,d\omega\right]dt. \tag{2}$$

Note that we have changed the order of integration on the right-hand side. This is
justified by the absolute integrability of integrands. Let us put

$$\tilde\phi(\omega) = \exp\left(-\frac{\varepsilon^2\omega^2}{2}\right)$$

and evaluate the inner integral on the right-hand side using the well-known formula

$$\int \exp(-bx^2 + ikx)\,dx = \sqrt{\frac{\pi}{b}}\,\exp\left(-\frac{k^2}{4b}\right), \tag{3}$$

valid for any $\mathrm{Re}\,b \ge 0$, $b \ne 0$. As a result, equality (2) is transformed into the equality

$$\int \tilde f(\omega)\exp\left(-\frac{\varepsilon^2\omega^2}{2} + i\omega\tau\right)d\omega = \int f(t)\,\frac{1}{\sqrt{2\pi}\,\varepsilon}\exp\left(-\frac{(t - \tau)^2}{2\varepsilon^2}\right)dt.$$

For $\varepsilon \to 0$, the second function in the right-hand side integral weakly converges
to a shifted Dirac delta $\delta(t - \tau)$, and its probing property recovers the value of
$f(t)$ at $t = \tau$. On the left-hand side, in view of the absolute integrability of $\tilde f(\omega)$,
we can just set $\varepsilon = 0$. Finally, replacing $\tau$ by $t$, we arrive at the Fourier integral

$$f(t) = \int \tilde f(\omega)\,e^{i\omega t}\,d\omega, \tag{4}$$

which expresses function $f(t)$ through its Fourier transform.
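
As a numerical illustration of the pair (3.1.1) and (4), one can transform a signal and then transform it back. The sketch below is a minimal round trip under stated assumptions: the signal is a shifted Gaussian chosen for illustration, both integrals are replaced by midpoint sums on truncated grids, and the names `midpoints`, `f_tilde`, and `f_back` are hypothetical helpers.

```python
import cmath
import math

def midpoints(half_width, n):
    """Midpoints and step of a uniform partition of [-half_width, half_width]."""
    step = 2.0 * half_width / n
    return [-half_width + (k + 0.5) * step for k in range(n)], step

def f(t):
    # Illustrative signal (an assumption, not from the text): a Gaussian centred at t = 1.
    return math.exp(-(t - 1.0) ** 2 / 2.0)

ts, dt = midpoints(10.0, 400)
ws, dw = midpoints(10.0, 400)

# Direct transform (3.1.1), sampled on the frequency grid: (1/2π) ∫ f(t) e^{-iωt} dt.
f_tilde = [sum(f(t) * cmath.exp(-1j * w * t) for t in ts) * dt / (2.0 * math.pi)
           for w in ws]

def f_back(t):
    """Fourier integral (4): ∫ f~(ω) e^{iωt} dω, approximated on the same grid."""
    return sum(ft * cmath.exp(1j * w * t) for ft, w in zip(f_tilde, ws)) * dw

for t in (-1.0, 0.0, 1.0, 2.5):
    print(t, f(t), f_back(t).real, abs(f_back(t) - f(t)))
```

Note where the factor $1/2\pi$ sits: with the convention of (3.1.1) it appears only in the direct transform, so the inverse integral (4) carries no prefactor.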


We would like to stress that the direct Fourier transform (3.1.1) and the inverse
Fourier transform (4) are symmetric in the sense that if $\tilde f(\omega)$ is the Fourier transform
of $f(t)$ then $f(-\omega)/2\pi$ is the Fourier transform of function $\tilde f(t)$. Using the
notation introduced in (3.1.2) we can write this statement as follows:

$$\text{If } f(t) \mapsto \tilde f(\omega) \text{ then } \tilde f(t) \mapsto f(-\omega)/2\pi. \tag{5}$$

The above relationships can generate a protest among engineers and physicists
since frequency $\omega$ and time $t$ have different dimensionalities. That is why, whenever
necessary, we will assume that the frequency and the time are nondimensionalized.
In view of (4), the integral in the brackets in (2) can be replaced by $\phi(\tau - t)$,
and we obtain that

$$\int \tilde f(\omega)\,\tilde\phi(\omega)\,e^{i\omega\tau}\,d\omega = \frac{1}{2\pi}\int f(t)\,\phi(\tau - t)\,dt. \tag{6}$$

The integral on the right-hand side is the convolution of functions $f$ and $\phi$:

$$\int f(t)\,\phi(\tau - t)\,dt = f(t) * \phi(t). \tag{7}$$

Comparing (6) with (4) we conclude that the Fourier transform of a convolution
of functions is, up to the constant $2\pi$, a product of their Fourier transforms, that is

$$f(t) * \phi(t) \mapsto 2\pi\,\tilde f(\omega)\,\tilde\phi(\omega). \tag{8}$$

Multiplying (4) by $\phi(t)e^{-i\omega t}$, integrating it with respect to $t$, and invoking the
defining formula (3.1.1) to evaluate the integral on the right-hand side, we arrive at

$$\int f(t)\,\phi(t)\,e^{-i\omega t}\,dt = 2\pi\int \tilde f(\Omega)\,\tilde\phi(\omega - \Omega)\,d\Omega, \tag{9}$$

dual to the equality (6). The formula implies, in particular, that the Fourier transform
of a product of two functions is the convolution of their Fourier images:

$$f(t)\,\phi(t) \mapsto \tilde f(\omega) * \tilde\phi(\omega). \tag{10}$$

We shall call formulas (6) and (9) Parseval equalities, although that name is often
reserved for their special case which is obtained from (6) by setting $\tau = 0$ and
substituting $f^*(-t)$ for $\phi(t)$. As a result, we get that

$$\int |f(t)|^2\,dt = 2\pi\int |\tilde f(\omega)|^2\,d\omega. \tag{11}$$

In engineering applications, $P(t) = |f(t)|^2$ often represents the signal's power
function, so that the integral on the left-hand side of (11) is the signal's total
energy. Thus, in view of the Parseval equality (11), the function $|\tilde f(\omega)|^2$ shows
how energy is distributed over different frequencies.
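
As a quick consistency check of the Parseval equality and of the placement of the factor $2\pi$, take the Gaussian $f(t) = e^{-t^2/2}$ (an illustrative choice, not from the text). The Gaussian integral formula (3) with $b = 1/2$ and $k = -\omega$ gives $\tilde f(\omega) = e^{-\omega^2/2}/\sqrt{2\pi}$, and then

$$\int |f(t)|^2\,dt = \int e^{-t^2}\,dt = \sqrt{\pi}, \qquad 2\pi\int |\tilde f(\omega)|^2\,d\omega = 2\pi\int \frac{e^{-\omega^2}}{2\pi}\,d\omega = \sqrt{\pi},$$

so the total energy computed in the time domain equals $2\pi$ times the integral of $|\tilde f(\omega)|^2$ over all frequencies.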

3.3 Generalized Fourier transform


Let us try to extend the Fourier transform's domain beyond the class of absolutely
integrable functions. Such a generalized Fourier transform will be introduced in
the same way the distributions were in Chapter 1. Recall that a distribution was
defined as a linear functional which assigns a number to each function $\phi$ from a
certain set of test functions. In other words, you can tell a distribution by its action
on test functions.
Equality (3.2.6) offers a similar opportunity to extend the Fourier transform's
domain. We will call $\tilde f(\omega)$ the generalized Fourier transform of function $f(t)$ if
the integral on the right-hand side is equal to the functional on the left-hand side
for each test function $\phi(t)$ from a certain class $\mathcal{S}$ of test functions to be determined
later. Since the convolution on the right-hand side is equally well defined for
function $f$ replaced by a distribution $T$, and it remains a smooth function in the
latter case, we can define the generalized Fourier transform $\tilde T$ of a distribution $T$
by the condition that

$$2\pi\int \tilde T(\omega)\,\tilde\phi(\omega)\,e^{i\omega\tau}\,d\omega = T * \phi(\tau). \tag{1}$$

Example 1. To find the Fourier transform of the Dirac delta, notice that the
right-hand side of (1) becomes $\delta * \phi(\tau) = \phi(\tau)$. The next step is to look at the
left-hand side and search for a $\tilde T$ that will make the left-hand side equal to $\phi(\tau)$.
But the inversion formula (3.2.4) makes it clear that such a $\tilde T$ has to be the function
identically equal to $1/2\pi$. Hence,

$$\tilde\delta(\omega) = \frac{1}{2\pi}. \tag{2}$$

This fact can be also symbolically written as:

$$\delta(t) = \frac{1}{2\pi}\int e^{i\omega t}\,d\omega. \tag{3}$$

Conversely, a function which gives value $\delta[\tilde\phi] = \tilde\phi(0)$ to the integral on the
left-hand side of (3.2.6) is equal to the generalized Fourier transform of the constant
function $f(t) = 1$. This is clear from comparison of the right-hand side of formula
(3.2.6) with (3.1.1). In other words,

$$\tilde 1 = \delta(\omega).$$

A mathematical aside: tempered distributions.² Before we continue with examples of other
generalized Fourier transforms, let us discuss some mathematical problems emerging here.
The original distribution space $\mathcal{D}'$ was introduced as a set of functionals $T$ on the set $\mathcal{D}$
of infinitely differentiable test functions $\phi(t)$ with compact support. If the generalized Fourier
transforms $\tilde T$ are to be defined by linear functionals on the left-hand side of equality (1) then Fourier
transforms $\tilde\phi(\omega)$ will have to play the role of test functions. However, the Fourier transforms $\tilde\phi$,
for $\phi \in \mathcal{D}$, are not a rich enough set to determine $\tilde T$ in the above mentioned sense. In particular,
they do not have compact supports. As a result we lose the symmetry of the generalized Fourier
transform. Indeed, it is impossible to assign to each distribution $T \in \mathcal{D}'$ its generalized Fourier
transform which would be a continuous linear functional on the set $\{\tilde\phi(\omega) : \phi \in \mathcal{D}\}$.
The solution is to expand the space of test functions $\{\phi(t)\}$, by requiring a symmetry in equality
(1), or more precisely, by demanding that the sets of $\phi(t)$'s and of $\tilde\phi(\omega)$'s be equal. It turns out that
the right space of test functions is the set $\mathcal{S}$ of all infinitely differentiable functions
$\phi(x)$ which decrease at infinity, together with all their derivatives, faster than an arbitrary power
function $|x|^{-n}$. Such functions are often called rapidly decreasing. In other words, $\phi \in \mathcal{S}$ if, for
any integers $n, m \ge 0$, we can find constants $K_{mn}$ such that, for arbitrary $x$,

$$|x^m \phi^{(n)}(x)| \le K_{mn}. \tag{4}$$

It is easy to show that if function $\phi(t) \in \mathcal{S}$ then its Fourier transform $\tilde\phi(\omega)$ is also infinitely
differentiable and rapidly decreasing. Indeed, according to (3.2.1), the infinite differentiability of
$\phi(t)$ implies that its Fourier transform decays faster than an arbitrary power $|\omega|^{-n}$, and vice versa,
the speed of decay of $\phi(t)$ as $|t| \to \infty$ implies that its Fourier transform is infinitely differentiable.
Expansion of the test function space from $\mathcal{D}$ to $\mathcal{S}$ obviously narrows the set of corresponding
distributions from $\mathcal{D}'$ to $\mathcal{S}' \subset \mathcal{D}'$. Traditionally, the set $\mathcal{S}'$ of continuous linear functionals on $\mathcal{S}$
with convergence related to the conditions (4) is called the space of tempered distributions.
Notice that the selection of this new and smaller distribution space does not affect us too
seriously. All the examples of distributions from $\mathcal{D}'$ discussed so far are also continuous functionals
on set $\mathcal{S}$. All the distributions with compact support, including the Dirac delta and all its derivatives,
are tempered distributions. In other words,

$$\mathcal{S}' \subset \mathcal{D}'.$$

Also, any function $f(t)$ which grows slower than a certain power of $t$ defines a continuous
functional on $\mathcal{S}$ (that is, a distribution $T_f \in \mathcal{S}'$) by the formula $\int f(t)\phi(t)\,dt$, $\phi \in \mathcal{S}$. Hence, the
name tempered distributions.
To get a better feel for the narrowness of the set of distributions introduced by the above
expansion of the set of test functions, notice that both functions $\exp(t^2)$ and $e^t$ represent distributions
in the space $\mathcal{D}'$ but not in $\mathcal{S}'$. On the other hand, for any distribution in $\mathcal{S}'$ one can define its
Fourier transform which is also a distribution in $\mathcal{S}'$, and this is the real reason for the usefulness
of tempered distributions.

²This material may be skipped by the first-time reader.

All the operations applicable to ordinary Fourier transforms remain valid for
their generalized cousins. So, in view of (3.1.3a), the shifted Dirac delta $\delta(t - \tau)$
has Fourier transform $e^{-i\omega\tau}/2\pi$, and (3.2.1) implies that the Fourier transform
of $\delta^{(n)}(t)$ is equal to $(i\omega)^n/2\pi$. Two additional formulas involving generalized
Fourier transforms will be useful:

(5a)

(5b)

As was the case for ordinary distributions, the generalized Fourier transforms
could be defined as weak limits of regular Fourier transforms. For example, let us
show that the functions

$$\tilde f(\omega, \lambda) = \frac{1}{\pi}\,\frac{\sin(\omega\lambda)}{\omega}, \tag{6}$$

dependent on parameter $\lambda$, converge to $\delta(\omega)$ as $\lambda \to \infty$. For that purpose consider
the functional

$$F(\lambda) = \int \tilde\phi(\omega)\,\frac{\sin(\omega\lambda)}{\omega}\,d\omega,$$

which is called the Dirichlet integral. Differentiating the above equality with
respect to $\lambda$, we get that

$$F'(\lambda) = \int \tilde\phi(\omega)\cos(\omega\lambda)\,d\omega = \frac{1}{2}\,[\phi(+\lambda) + \phi(-\lambda)].$$

Now, taking definite integrals of both sides over the interval $(0, \lambda)$, and noticing
that $F(0) = 0$, we get that

$$F(\lambda) = \frac{1}{2}\int_{-\lambda}^{\lambda}\phi(t)\,dt.$$

If the Fourier transform $\tilde\phi(\omega)$ is sufficiently smooth, then the original function $\phi(t)$
is absolutely integrable and the above integral converges, as $\lambda \to \infty$, to

$$F(\infty) = \frac{1}{2}\int \phi(t)\,dt = \pi\tilde\phi(0),$$

which is the value of the test function at 0 multiplied by $\pi$. This proves that,
weakly,

$$\frac{1}{\pi}\,\frac{\sin(\omega\lambda)}{\omega} \to \delta(\omega)\,\mathrm{sign}(\lambda) \qquad (\lambda \to \infty), \tag{7}$$

although the same function does not converge to zero for any $\omega \ne 0$, as it "fills
out" the area between the four branches of the hyperbola $\pm 1/\pi|\omega|$ (see Fig. 3.3.1).
[Figure 3.3.1: plot of sin(ωλ)/ω against ω.]

FIGURE 3.3.1
The graph of kernels (6) in the Dirichlet integral. The kernels weakly converge
to $\pi\delta(\omega)$ as $\lambda \to \infty$, filling up the area between hyperbolas $\pm 1/|\omega|$.
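
The weak convergence of the Dirichlet kernels can be watched numerically by applying them to a single test function. The sketch below is illustrative only: it assumes a Gaussian test function with value 1 at the origin, uses the normalized kernel $\sin(\omega\lambda)/(\pi\omega)$, and replaces the integral by a midpoint sum; for this particular Gaussian the exact value of the functional is known to be $\mathrm{erf}(\lambda/\sqrt{2})$, which is printed for comparison.

```python
import math

def dirichlet_functional(phi, lam, half_width=12.0, n=24000):
    """Midpoint-rule value of ∫ φ(ω) sin(ωλ)/(πω) dω; the midpoints never hit ω = 0."""
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        w = -half_width + (k + 0.5) * step
        total += phi(w) * math.sin(w * lam) / (math.pi * w)
    return total * step

def phi(w):
    # Illustrative rapidly decreasing test function with φ(0) = 1.
    return math.exp(-w * w / 2.0)

for lam in (0.5, 1.0, 2.0, 5.0):
    value = dirichlet_functional(phi, lam)
    # For this Gaussian the exact value is erf(λ/√2), which tends to φ(0) = 1.
    print(lam, value, math.erf(lam / math.sqrt(2.0)))
```

The printed values approach $\phi(0) = 1$ as $\lambda$ grows, even though the kernel itself keeps oscillating with undiminished amplitude away from the origin; the convergence is a property of the integral against a smooth test function, not of the kernel pointwise.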

3.4 Transport equation


Many applications of the Fourier transform are based on the fact that it often
simplifies solving functional equations. In particular, it reduces some differential
and integral equations to algebraic equations. Numerous examples of this method
will be discussed in the following chapters. For now, to give a simple illustration
of what we have in mind, we will just consider a 1-D transport equation

$$\frac{\partial f}{\partial t} + v\,\frac{\partial f}{\partial x} = \int g(u)\,f(x, v - u, t)\,du, \tag{1}$$

with the initial condition

$$f(x, v, t = 0) = f_0(x, v).$$

Similar equations arise in scattering theory. The collision (scattering) integral on
the right-hand side takes into account the process of interaction of particles with the
medium, and function $f(x, v, t)$ is the particle density in the phase space $(x, v)$.
The density of particles in the actual physical space, in our case the $x$ axis, can
then be found by integrating $f$ over the velocities

$$\rho(x, t) = \int f(x, v, t)\,dv. \tag{2}$$

To begin with, we will show an important consequence of equation (1). Integrating
it over all $v$'s gives

$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x}\int v\,f(x, v, t)\,dv = \rho(x, t)\int g(v)\,dv.$$

Integrating it next over all $x$'s, we obtain an equation for the total mass of particles

$$M = \int \rho(x, t)\,dx,$$

and for its rate of change

$$\frac{\partial M}{\partial t} = M\int g(v)\,dv.$$

Thus, to fulfill the mass conservation law, it is necessary to impose the condition

$$\int g(v)\,dv = 0. \tag{3}$$

Let us now return to the transport equation (1). Its right-hand side contains a
convolution integral, whose Fourier transform is equal to the product of Fourier
images of the factors under the integral sign. That means that by passing to the
Fourier image

$$F(x, \mu, t) = \frac{1}{2\pi}\int f(x, v, t)\,e^{-i\mu v}\,dv \tag{4}$$

of $f$ with respect to $v$, we transform the integro-differential equation (1) into a
purely differential second-order equation

$$\frac{\partial F}{\partial t} + i\,\frac{\partial^2 F}{\partial x\,\partial\mu} = 2\pi\,\tilde g(\mu)\,F(x, \mu, t).$$

Here

$$\tilde g(\mu) = \frac{1}{2\pi}\int g(v)\,e^{-i\mu v}\,dv. \tag{5}$$

An application of one more Fourier transformation

$$\Phi(\kappa, \mu, t) = \frac{1}{2\pi}\int F(x, \mu, t)\,e^{-i\kappa x}\,dx, \tag{6}$$

this time with respect to $x$, gives the first-order partial differential equation

$$\frac{\partial\Phi}{\partial t} - \kappa\,\frac{\partial\Phi}{\partial\mu} = 2\pi\,\tilde g(\mu)\,\Phi(\kappa, \mu, t), \tag{7}$$

with the initial condition

$$\Phi(\kappa, \mu, t = 0) = \Phi_0(\kappa, \mu).$$

This equation can be easily solved by the method of characteristics described in
Section 2.6. Recall that the method assumes that the independent variable $\mu$ is a
function of $t$, and that it satisfies the equation

$$\frac{d\mu}{dt} = -\kappa, \qquad \mu(t = 0) = \nu. \tag{8}$$

In this case, the left-hand side of equation (7) turns out to be a total derivative of
$\Phi(t) = \Phi(\kappa, \mu(t), t)$ with respect to time $t$:

$$\frac{d\Phi}{dt} = 2\pi\,\tilde g(\mu)\,\Phi(t), \tag{9}$$

$$\Phi(t = 0) = \Phi_0(\kappa, \nu).$$

The solution of the characteristic equations (8) and (9) is of the form

$$\mu = \nu - \kappa t, \tag{10}$$

$$\Phi(\kappa, \nu, t) = \Phi_0(\kappa, \nu)\exp\left(2\pi\int_0^t \tilde g(\nu - \kappa\tau)\,d\tau\right). \tag{11}$$

Notice that (11) is the solution of equation (7) along the line $\mu = \nu - \kappa t$ in the
$(\mu, t)$ plane (Fig. 3.4.1).
As the parameter $\nu$ varies, we obtain a family of foregoing lines which cover
the entire plane. Point $(\mu, t)$ corresponds to a line with parameter $\nu = \mu + \kappa t$.
Substituting in (11), we find the solution of equation (7) with given $\mu$ and $t$:

$$\Phi(\kappa, \mu, t) = \Phi_0(\kappa, \mu + \kappa t)\exp\left(2\pi\int_0^t \tilde g(\mu + \kappa\tau)\,d\tau\right). \tag{12}$$
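
That the formula obtained by the method of characteristics really solves equation (7) can be checked numerically. The sketch below is a minimal verification under assumptions that are not in the text: the collision term is taken of diffusion type, $2\pi\tilde g(\mu) = -D\mu^2$ with a hypothetical constant `D`, the initial condition is a Gaussian in both arguments, and the derivatives in (7) are replaced by central differences.

```python
import math

# Illustrative choices (assumptions, not from the text):
#   2π g~(μ) = -D μ²          (diffusion-type collision term, hypothetical constant D)
#   Φ0(κ, μ) = exp(-κ² - μ²)  (arbitrary smooth initial condition)
D = 0.8

def two_pi_g(mu):
    return -D * mu * mu

def phi0(kappa, mu):
    return math.exp(-kappa * kappa - mu * mu)

def phi(kappa, mu, t):
    """Solution formula (12); the τ-integral of 2π g~(μ + κτ) over [0, t] is done in closed form."""
    integral = -D * (mu * mu * t + mu * kappa * t * t + kappa * kappa * t ** 3 / 3.0)
    return phi0(kappa, mu + kappa * t) * math.exp(integral)

def residual(kappa, mu, t, eps=1e-5):
    """Left side minus right side of (7), with derivatives taken by central differences."""
    d_t = (phi(kappa, mu, t + eps) - phi(kappa, mu, t - eps)) / (2.0 * eps)
    d_mu = (phi(kappa, mu + eps, t) - phi(kappa, mu - eps, t)) / (2.0 * eps)
    return d_t - kappa * d_mu - two_pi_g(mu) * phi(kappa, mu, t)

points = [(0.3, -0.4, 0.5), (1.0, 0.2, 1.5), (-0.7, 0.9, 0.1)]
for point in points:
    print(point, residual(*point))
```

The residuals are at the level of the finite-difference error, and at $t = 0$ the function `phi` reduces to `phi0`, so both the equation and the initial condition are satisfied for this choice of data.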
[Figure 3.4.1: characteristic lines in the (μ, t) plane.]

FIGURE 3.4.1
Characteristic lines for equation (7).

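To get the desired solution of the original equation (1), we need to calculate the
double inverse Fourier transform

$$f(x, v, t) = \iint \Phi(\kappa, \mu, t)\exp(i\mu v + i\kappa x)\,d\kappa\,d\mu.$$

However, if we are interested only in finding the density of particles on the $x$ axis,
there is no need to evaluate this double integral. Indeed, it suffices to put $\mu = 0$
in (12), which, as is obvious from (4), corresponds to integration of $f$ over all $v$'s,
and then just to find the inverse transform with respect to $\kappa$:

$$\rho(x, t) = 2\pi\int \Phi_0(\kappa, \kappa t)\exp\left(2\pi\int_0^t \tilde g(\kappa\tau)\,d\tau + i\kappa x\right)d\kappa. \tag{13}$$

Let us take a look at properties of the density function in the case of a particle
stream with the initial density $\rho(x)$. In addition, we shall assume that at each
stream point the particle velocities are distributed with a normalized density $p(v)$.
Thus

$$f_0(x, v) = \rho(x)\,p(v), \qquad \int p(v)\,dv = 1.$$

Respectively,

$$\Phi_0(\kappa, \mu) = \tilde\rho(\kappa)\,\tilde p(\mu),$$

and the expression (13) takes the form

$$\rho(x, t) = 2\pi\int \tilde\rho(\kappa)\,\tilde p(\kappa t)\exp\left(2\pi\int_0^t \tilde g(\kappa\tau)\,d\tau + i\kappa x\right)d\kappa. \tag{14}$$

Introducing a new variable of integration $\gamma = \kappa t$, the above integral is transformed
into

$$\rho(x, t) = \frac{2\pi}{t}\int \tilde\rho(\gamma/t)\,\tilde p(\gamma)\exp\bigl(tG(\gamma) + i\gamma x/t\bigr)\,d\gamma, \tag{15}$$

where

$$G(\gamma) = \frac{2\pi}{\gamma}\int_0^{\gamma}\tilde g(\mu)\,d\mu = \int g(v)\,\frac{1 - e^{-i\gamma v}}{i\gamma v}\,dv. \tag{16}$$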

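To investigate the asymptotic behavior of the density as $t \to \infty$, observe that
the first factor under the integral sign can be taken outside

$$\rho(x, t) = \frac{M}{t}\int \tilde p(\gamma)\exp\bigl(tG(\gamma) + i\gamma x/t\bigr)\,d\gamma,$$

and, as far as the function $G(\gamma)$ is concerned, we can restrict ourselves to the first
nonvanishing term of the power series expansion

$$G(\gamma) = \langle v^0\rangle - \frac{i\gamma}{2}\langle v\rangle - \frac{\gamma^2}{6}\langle v^2\rangle + \ldots \tag{17}$$

In the above formula

$$\langle v^n\rangle = \int v^n g(v)\,dv.$$

In view of (3), the first summand on the right-hand side of (17) is equal to zero.
Assuming that the collision integral kernel $g(v)$ is symmetric, the second term in
(17) vanishes as well. Let us keep in the expansion (17) only the quadratic term:

$$G(\gamma) \approx -\frac{\gamma^2}{6}\langle v^2\rangle.$$

It corresponds to the so-called diffusion approximation which replaces kernel $g(v)$
in the original equation (1) by the distribution

$$g(v) = \frac{\langle v^2\rangle}{2}\,\delta''(v).$$

Then the transport equation becomes the differential equation

$$\frac{\partial f}{\partial t} + v\,\frac{\partial f}{\partial x} = \frac{\langle v^2\rangle}{2}\,\frac{\partial^2 f}{\partial v^2},$$

and the expression for density (15) takes the form

$$\rho(x, t) = \frac{2\pi}{t}\int \tilde\rho(\gamma/t)\,\tilde p(\gamma)\exp\left(-\frac{\langle v^2\rangle t\gamma^2}{6} + \frac{i\gamma x}{t}\right)d\gamma.$$

For very large times, it asymptotically converges to

$$\rho(x, t) = M\sqrt{\frac{3}{2\pi\langle v^2\rangle t^3}}\,\exp\left(-\frac{3x^2}{2\langle v^2\rangle t^3}\right). \tag{18}$$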
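
One way to read the asymptotic formula (18), offered here as a short sketch: it is a Gaussian in $x$ with zero mean, and introducing the auxiliary symbol $\sigma^2(t)$ (not used in the text) for its variance, one has

$$\rho(x, t) = \frac{M}{\sqrt{2\pi\sigma^2(t)}}\exp\left(-\frac{x^2}{2\sigma^2(t)}\right), \qquad \sigma^2(t) = \frac{\langle v^2\rangle\,t^3}{3}.$$

In this form it is evident that $\int\rho(x, t)\,dx = M$ for all $t$, in agreement with the mass conservation condition (3), and that the width of the particle cloud grows like $t^{3/2}$, that is, faster than the $t^{1/2}$ growth of ordinary diffusion in coordinate space, because here it is the velocity that diffuses.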

To complete the picture, we will provide another useful form of the density which
follows from equality (13). First, notice that setting $g = 0$ in (13) gives the particle
density in the absence of scattering. Denote it by $\rho^0(x, t)$. The corresponding
density $f^0(x, v, t)$ in the phase space satisfies the equation

$$\frac{\partial f^0}{\partial t} + v\,\frac{\partial f^0}{\partial x} = 0, \qquad f^0(x, v, t = 0) = f_0(x, v).$$

Here, the use of the Fourier transform is not necessary since, obviously,

$$f^0(x, v, t) = f_0(x - vt, v).$$

Hence, the density of nonscattered particles

$$\rho^0(x, t) = \int f_0(x - vt, v)\,dv.$$

In particular, if all the particles were motionless at $t = 0$, that is, if

$$f_0(x, v) = \rho(x)\,\delta(v),$$

then

$$\rho^0(x, t) = \rho(x).$$

On the other hand, according to (13), for $g \equiv 0$,

$$\rho^0(x, t) = 2\pi\int \Phi_0(\kappa, \kappa t)\,e^{i\kappa x}\,d\kappa.$$

Taking the inverse Fourier transform we get that

$$\Phi_0(\kappa, \kappa t) = \frac{1}{4\pi^2}\int \rho^0(y, t)\,e^{-i\kappa y}\,dy,$$

and substituting this expression for $\Phi_0(\kappa, \kappa t)$ in (13), we get the convolution integral

$$\rho(x, t) = \int \rho^0(y, t)\,G(x - y, t)\,dy, \tag{19}$$

where function

$$G(x, t) = \frac{1}{2\pi}\int \exp\left(2\pi\int_0^t \tilde g(\kappa\tau)\,d\tau + i\kappa x\right)d\kappa$$

describes the influence of the scattering processes on the density. The behavior of
$G(x, t)$ depends on the form of function $\tilde g(\mu)$ (5), which expresses the nature of
scattering. In the diffusion approximation, $G(x, t)$ is given by the right-hand side
of the equality (18), with $M$ deleted.

3.5 Exercises
1. Find the Fourier transform of the function $f(t) = \chi(t)e^{-\gamma t}$, $\gamma > 0$.
2. Find the Fourier transform of the function $f(t) = e^{-\gamma|t|}$, $\gamma > 0$.
3. Taking into account the symmetry between the Fourier transform and its inverse, and
the answer to the preceding problem, find the Fourier transform of $f(t) = 1/(\gamma^2 + t^2)$.
4. Find the Fourier transform of the function $f(t) = t^2/(\gamma^2 + t^2)$.
5. Find the Fourier transform of the function

$$f(t) = \sum_{n=0}^{\infty} \delta(t+n)\,e^{-\gamma n}, \qquad \gamma > 0.$$

6. Find the Fourier transforms of the odd component $f_o(t) = (1/2)[f(t) - f(-t)]$
and of the even component $f_e(t) = (1/2)[f(t) + f(-t)]$ of the function $f$ from the previous
exercise.
7. Find $r(\gamma) = \max|\tilde f_e| / \min|\tilde f_e|$ and study its behavior as $\gamma \to 0+$.
8. Find a function $f(t)$ such that $\tilde f(\omega) = \sin\omega/\omega$. (Hint: First find $f'(t)$.)
9. Find the Fourier transform of the function $f(t) = 1 - t^2$ for $|t| \le 1$, and $f(t) = 0$ for $|t| > 1$.
(Hint: First find $f''(t)$.)
10. Find the Fourier image of the function

$$f(t) = \begin{cases} 2 - t^2, & \text{for } |t| < 1; \\ (2 - |t|)^2, & \text{for } 1 \le |t| < 2; \\ 0, & \text{for } 2 \le |t|. \end{cases}$$

Hint: Begin with calculation of the Fourier image of the third derivative of $f(t)$.
11. The values of the function $f(t)$ are known at the points $t_n = \Delta n$, $-\infty < n < \infty$. Find
the Fourier image of the linear interpolation function

$$f_l(t) = f(n\Delta) + [f((n+1)\Delta) - f(n\Delta)]\,\frac{t - n\Delta}{\Delta}, \qquad n\Delta < t < (n+1)\Delta.$$

Hint: Begin with calculation of the Fourier image of the second derivative of the interpolated function.
12. Find the Fourier image of the function $f(t) = \int h(\tau)h(\tau + t)\,d\tau$, where

$$h(t) = \begin{cases} t, & \text{for } 0 < t < \theta; \\ 0, & \text{for } \theta \le t \text{ and } t \le 0. \end{cases}$$

13. Taking into account the form of the solution to the preceding exercise, find the 4th
order derivative of $f(t)$.
14. Using the fact that the function

$$f(t;p) = \begin{cases} \cos^p(t), & \text{for } |t| < \pi/2; \\ 0, & \text{for } |t| \ge \pi/2; \end{cases}$$

satisfies, for $p \ge 2$, the recurrence relation

$$\frac{d^2}{dt^2} f(t;p) + p^2 f(t;p) = p(p-1) f(t;p-2),$$

find the Fourier image $\tilde f(\omega;p)$ for any integer $p \ge 1$.


Chapter 4
Asymptotics of Fourier Transforms

In Chapter 3 we demonstrated that the Fourier transform $\tilde f(\omega)$ of a smooth function
$f(t)$ rapidly decays to zero as $\omega \to \infty$. However, smoothness is rare in natural
phenomena and one often encounters processes that are either discontinuous or
violate the smoothness assumption in other ways. Such phenomena include, for
example, shock fronts generated by large amplitude acoustic waves, ocean waves,
or desert dunes with their characteristic sharp crests. These and many other examples
explain the importance of the Fourier analysis of nonsmooth processes.
Roughly speaking, values of the Fourier transform $\tilde f$ at angular frequency $\omega$ are
determined by the behavior of the function $f(t)$ at time scales of the order $2\pi/\omega$.
The latter quantity decreases as $\omega$ increases. Hence, violations of smoothness
which, by their very nature, have a local character, are related mostly to the behavior
of the Fourier transform at large values of $\omega$. The larger $\omega$ is, the more the
impact of nonsmoothness is felt. Mathematically speaking, the nonsmoothness of
the original function dictates the asymptotic behavior of its Fourier transform as
$\omega \to \infty$. In the present chapter we will study these asymptotics.

4.1 Asymptotic notation, or how to get a camel to pass through a needle's eye
We begin by recalling the standard and widely used asymptotic notation.
• If the fraction $\varphi(x)/\psi(x)$ converges to 1 as $x \to \infty$, then this fact is denoted
in the form of an equivalence relation

$$\varphi(x) \sim \psi(x) \quad (x \to \infty),$$

and we say that $\varphi(x)$ is asymptotically equivalent to the function $\psi(x)$ for $x \to \infty$.
If, for a positive constant $a$,

$$\varphi(x) \sim a\psi(x) \quad (x \to \infty),$$

then we shall say that $\varphi(x)$ and $\psi(x)$ are of the same order at infinity.
• If the fraction $\varphi(x)/\psi(x)$ converges to 0 as $x \to \infty$, then we write

$$\varphi(x) = o\{\psi(x)\} \quad (x \to \infty),$$

and say that the function $\varphi(x)$ is at infinity of the order smaller than $\psi(x)$. In particular,
the notation $\varphi(x) = o\{1\}$ $(x \to \infty)$ means that the function $\varphi(x)$ converges to 0 at
infinity.
• Finally, if there exist positive constants $a$ and $M$ such that

$$|\varphi(x)/\psi(x)| \le M \quad \text{for } a \le x < \infty,$$

then we write that

$$\varphi(x) = O\{\psi(x)\} \quad (x \to \infty),$$

and say that $\varphi(x)$ is at infinity of the order not greater than $\psi(x)$.
In a similar fashion, analogous asymptotic notation can be introduced for $x \to 0$,
or for any other limit point.
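The three definitions can be probed numerically by tabulating the ratio $\varphi(x)/\psi(x)$ for growing $x$. The following minimal Python sketch does this for three pairs of functions that are our own illustrative choices, not taken from the text; it only suggests the limiting behavior and proves nothing.

```python
import math

def ratio(phi, psi, x):
    """Return phi(x)/psi(x), the quantity whose behavior defines ~, o and O."""
    return phi(x) / psi(x)

xs = [1e1, 1e3, 1e5, 1e7]

# phi ~ psi: x^2 + x is asymptotically equivalent to x^2 (ratio tends to 1).
equiv = [ratio(lambda x: x * x + x, lambda x: x * x, x) for x in xs]

# phi = o{psi}: ln x is of smaller order than x (ratio tends to 0).
small_o = [ratio(math.log, lambda x: x, x) for x in xs]

# phi = O{psi}: |x sin(x) / x| stays bounded by M = 1 although it has no limit.
big_o = [abs(ratio(lambda x: x * math.sin(x), lambda x: x, x)) for x in xs]

for x, e, s, b in zip(xs, equiv, small_o, big_o):
    print(f"x = {x:8.0e}   ratio~ = {e:.7f}   ratio_o = {s:.2e}   |ratio_O| = {b:.3f}")
```

The third pair shows that $O\{\cdot\}$ requires only boundedness of the ratio, not the existence of a limit.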
We shall illustrate how the asymptotic notation works by solving two simple, and
somewhat light-hearted problems. The intuitively surprising solutions are obtained
via standard asymptotic analysis.
Example 1. Camel passing through a needle's eye. Let us start with a familiar
elementary school brain teaser. Assume that Earth is an ideal ball of radius $R =
6400$ km and that it is wrapped tightly at the Equator with a length of rope. Cut
the rope in one place, splice into it another piece of rope of length $L = 1$ m, and
stretch it, keeping it at a uniform height $h$ above the Earth's surface. The question
usually asked is: What is that height? The obvious answer is that $h = L/2\pi \approx 16$
cm. In particular, $h$ is independent of the radius of the Earth (it would be the same
on the Moon, Jupiter, a tennis ball, etc.). So, with just an extra 1 m piece of rope,
cats all over the Equator would have the freedom to walk underneath the rope.
A young friend of ours spent a sleepless night puzzling over the philosophical
ramifications of the above answer. But he was even more perplexed by the solution
to the following related problem. So, perhaps, the insomniacs should skip this page
and the rest should travel to the Sahara Desert and hoist the spliced rope above
the Earth's surface as high as possible. The question is: How high would that be?
Denote this height by $H$. The situation is schematically pictured in Fig. 4.1.1.

FIGURE 4.1.1
A schematic picture of the spliced rope hoisted above the Earth's surface at
one point of the Equator.

To better expose the mathematical contents of the problem, let us assume initially
that the Earth's radius is equal to 1. Then the height $\eta$ depends on the angle $\theta$ by
the formula

$$\eta = \frac{1}{\cos\theta} - 1,$$

and the length $l$ of the additional piece of rope, in terms of the angle $\theta$, is

$$l = 2(\tan\theta - \theta).$$

Our case corresponds to the situation where the length of the additional piece of
rope is very small in comparison to the Earth's radius ($l \ll 1$). So we can replace
the above exact expressions by asymptotically equivalent expressions valid for
$\theta \to 0$ to get

$$\eta \sim \frac{\theta^2}{2}, \qquad l \sim \frac{2\theta^3}{3}.$$

Eliminating $\theta$ from these equivalence relations, we get a relationship connecting
$\eta$ and $l$:

$$\eta \sim \mu l^{\alpha}, \tag{1}$$

where

$$\alpha = \frac{2}{3}, \qquad \mu = \frac{1}{2}\left(\frac{3}{2}\right)^{\alpha} \approx 0.655.$$

It is clear from (1) that $l = o(\eta)$ $(\eta \to 0)$, that is, $l$ is of an order smaller than $\eta$ as
$\eta \to 0$. Actually, this means that the height $\eta$ of the rope hoisted over the Sahara
Desert is much larger than the length $l$ of the inserted piece of rope, and the ratio
of these two quantities gets larger as $l$ gets smaller.
Finally, going back to our original "real" Earth example and taking $l = L/R =
1.5625 \cdot 10^{-7}$, we get from the asymptotic relation (1) that the sought height

$$H = \eta \cdot R \approx 121.6 \text{ m}.$$

Now, if we splice into our original, tightly fitting rope a tiny piece of rope of length
$L = 3$ mm (just the size of the eye of a needle), the asymptotic formula (1) indicates
that we can hoist the rope to the height of $H \approx 2.53$ m over the Sahara Desert (enough
for a camel to pass under it). •
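As a cross-check of the asymptotic formula (1), one can solve the exact relations $l = 2(\tan\theta - \theta)$, $\eta = 1/\cos\theta - 1$ numerically and compare. The sketch below is only an illustration: the function names are ours, and the bisection solver is one arbitrary way to invert the relation for $\theta$.

```python
import math

def exact_height(l):
    """Height eta (Earth radius = 1) for extra rope length l, from the exact
    relations l = 2(tan(theta) - theta), eta = 1/cos(theta) - 1."""
    lo, hi = 0.0, 1.0              # tan(theta) - theta increases on (0, pi/2)
    for _ in range(200):           # plain bisection for theta
        mid = 0.5 * (lo + hi)
        if 2.0 * (math.tan(mid) - mid) < l:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    return 1.0 / math.cos(theta) - 1.0

def asymptotic_height(l):
    """Principal term eta ~ mu * l**alpha from formula (1)."""
    alpha = 2.0 / 3.0
    mu = 0.5 * 1.5 ** alpha
    return mu * l ** alpha

R = 6.4e6  # Earth's radius in meters
for L in (1.0, 0.003):             # spliced-in rope length in meters
    l = L / R
    print(f"L = {L} m: asymptotic H = {R * asymptotic_height(l):.3f} m, "
          f"exact H = {R * exact_height(l):.3f} m")
```

For rope pieces this short the two heights agree to several significant digits, which is why the principal term alone suffices in the example.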
Example 2. Beetle on a rubber string. A beetle crawls along a rubber string of
length 1 m. As soon as it covers a distance of 1 cm the string is stretched by 1 m.
After it covers another centimeter, the string is stretched by another meter, and
so on. For convenience, we'll call each centimeter covered by the beetle a step.
Question: How many steps does the beetle need to reach the end of the string?
Many people would answer with great conviction that the beetle will never reach
the end of the string and that, in fact, its distance from the end of the string will
increase indefinitely. Asymptotic analysis shows, however, that the number of
needed steps is finite and helps to evaluate it with high precision. The legend has
it that Andrei Sakharov, father of the Russian H-bomb and later a famous dissident,
solved the problem in one minute during a solemn anniversary celebration of the
Soviet Academy, working on the back of the invitation card. This record is not
included in the Guinness Book.
Let $\delta$ be the fraction of the string length covered by the beetle by the time
the string is first stretched. In the above setting $\delta = 0.01$. Our goal is to find an
asymptotic formula, as $\delta \to 0$, for the required number $N$ of steps as a function
of $\delta$. During the $n$th step the string is $n$ meters long, so the beetle covers a fraction
of the string length equal to $\delta/n$.
Hence the total fraction of the string length covered by the beetle in $N$ steps is

$$L(N) = \delta\left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \dots + \frac{1}{N}\right). \tag{2}$$

The harmonic series on the right-hand side diverges and $L(N) \to \infty$ as $N \to \infty$,
so the beetle will reach the string's end in a finite number of steps. Its arrival at the
string's end in $N$ steps means that $L(N) \approx 1$ for a certain large $N$. To find that $N$
approximately, we will need an asymptotic formula for the partial sum

$$H(N) = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \dots + \frac{1}{N}$$

of the harmonic series. For that purpose consider an auxiliary numerical sequence

$$u(n) = \frac{1}{n} - \ln\frac{n+1}{n}.$$

Summing up the first $N$ terms of this sequence gives

$$\sum_{n=1}^{N} u(n) = H(N) - \ln(N+1). \tag{3}$$

Let us show that, as $N \to \infty$, the corresponding series converges absolutely.
Indeed, observe that, for any $x$ such that $-1 < x < \infty$,

$$\ln(1+x) \le x.$$

Hence

$$\frac{1}{n} \ge \ln\frac{1+n}{n} \ge \frac{1}{n+1}.$$

Therefore,

$$0 \le u(n) \le \frac{1}{n} - \frac{1}{n+1} < \frac{1}{n^2}.$$

The majorizing series $\sum 1/n^2$ converges absolutely, and so does $\sum u(n)$. As a
result, we obtain the following asymptotic relation:

$$\sum_{n=1}^{N} u(n) = \gamma + o\{1\} \quad (N \to \infty),$$

where $\gamma$ is the limit value of the partial sums. Hence, the right-hand side of equality
(3) has to satisfy the same asymptotic relation, and we get that

$$H(N) \sim \ln(N+1) + \gamma. \tag{4}$$

It is known that

$$\gamma = 0.57721566490\ldots, \tag{5}$$

and, traditionally, it is called the Euler constant. Unlike the more familiar constants
$\pi$ and $e$, which are transcendental, it is not even known whether $\gamma$ is irrational.
Substituting (4) into (2), we arrive at the asymptotic formula

$$L(N) \sim \delta[\ln(N) + \gamma],$$

since replacement of $\ln(N+1)$ by $\ln(N)$ does not affect the asymptotics. Finally,
the right-hand side is approximately 1 when

$$N \sim \exp\left(\delta^{-1} - \gamma\right). \tag{6}$$

For $1/\delta = 100$ we get that the beetle needs $N \approx 1.51 \cdot 10^{43}$ steps to reach the end of the
string.
At this point, people who guessed that the beetle would never reach the end of
the string could argue that, after all, they were "almost" right and that the above
$N$ is physically as good as "never," given that the age of the universe is some 20
billion years $= 6.31 \cdot 10^{17}$ s, and that the common educated guess is that it will be
in existence only for another 20 billion odd years. If the beetle moved with the
speed of 1 cm/s, and started out at the Beginning of Time, by now it would have
covered only 0.415 of the string's length. Even a bionic Superbeetle, traveling at
jet fighter speed, would not be able to complete the journey before the End of the
World.
Let us conclude the analysis of the example with a remark that the approximate
formula (6) is amazingly accurate. Derived as an asymptotic formula for $\delta \to 0$, it
also works very well for arguments of order 1. For example, for $\delta = 1/2$, the exact
formula (2) gives $N = 4$, whereas the asymptotic formula (6) gives $N \approx 4.15$.
Similarly, for $\delta = 1/3$, formula (2) gives $N = 11$ and (6) gives $N \approx 11.27$. For
$\delta = 1/4$, we get respectively $N = 31$ and $N \approx 30.65$. •
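The comparison between the exact count from (2) and the estimate (6) is easy to reproduce for moderate values of $1/\delta$. The following Python sketch sums the harmonic series directly; the function names are ours, and the direct summation is feasible only for small $1/\delta$ (certainly not for $1/\delta = 100$).

```python
import math

EULER_GAMMA = 0.5772156649  # Euler constant, value (5)

def exact_steps(delta):
    """Smallest N with L(N) = delta * (1 + 1/2 + ... + 1/N) >= 1, formula (2)."""
    total, n = 0.0, 0
    while delta * total < 1.0:
        n += 1
        total += 1.0 / n
    return n

def asymptotic_steps(delta):
    """Asymptotic estimate N ~ exp(1/delta - gamma), formula (6)."""
    return math.exp(1.0 / delta - EULER_GAMMA)

for inv_delta in (2, 3, 4, 10):
    delta = 1.0 / inv_delta
    print(f"1/delta = {inv_delta}: exact N = {exact_steps(delta)}, "
          f"asymptotic N = {asymptotic_steps(delta):.2f}")
```

For $1/\delta = 2, 3, 4$ this reproduces the pairs $(4,\ 4.15)$, $(11,\ 11.27)$ and $(31,\ 30.65)$ quoted above.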

4.2 Riemann-Lebesgue Lemma


In this section we turn to the principal topic of the present chapter: a study
of the asymptotic behavior of Fourier transforms of nonsmooth functions. The
following Riemann-Lebesgue Lemma, formulated below in a simplified version that is
sufficient for physical applications, is the key.
Assume that the function $f(t)$ is continuous on a finite interval $[0, T]$. Then,

$$\lim_{\omega \to \infty} \int_0^T f(t)e^{-i\omega t}\,dt = 0. \tag{1}$$

To prove the above, recall that a continuous function $f(t)$ on a closed interval
$[0, T]$ is uniformly continuous. This means that, for any $\varepsilon > 0$, one can find a
$\delta = T/n > 0$, $n < \infty$, such that, for all $m = 0, 1, \dots, n-1$,

$$|f(t) - f(m\delta)| < \varepsilon/2T \quad \text{whenever } m\delta < t < (m+1)\delta. \tag{2}$$

Splitting the interval of integration in (1) into $n$ subintervals $(m\delta, (m+1)\delta)$, we
obtain

$$\int_0^T f(t)e^{-i\omega t}\,dt = \sum_{m=0}^{n-1} f(m\delta)\int_{m\delta}^{(m+1)\delta} e^{-i\omega t}\,dt + \sum_{m=0}^{n-1}\int_{m\delta}^{(m+1)\delta} e^{-i\omega t}[f(t) - f(m\delta)]\,dt. \tag{3}$$

Conditions (2) imply that the integrals in the second sum on the right-hand side of (3)
satisfy the inequalities

$$\left|\int_{m\delta}^{(m+1)\delta} e^{-i\omega t}[f(t) - f(m\delta)]\,dt\right| < \frac{\varepsilon\delta}{2T}.$$

Adding up these $n$ terms, and taking into account the fact that $n\delta = T$, we obtain
that the second sum on the right-hand side of (3) is less than $\varepsilon/2$ in absolute value.
The integrals in the first sum are easily evaluated to give

$$\int_{m\delta}^{(m+1)\delta} e^{-i\omega t}\,dt = i\,\frac{e^{-i\omega(m+1)\delta} - e^{-i\omega m\delta}}{\omega}.$$

The moduli of these integrals are bounded from above by $2/|\omega|$. By the First
Weierstrass Theorem, a continuous function on a closed interval is bounded so that

$$|f(t)| < F < \infty, \qquad t \in [0, T],$$

and we obtain that the first sum in (3) is less than $2Fn/|\omega|$ in absolute value. Therefore,
the total integral in (1) satisfies the inequality

$$\left|\int_0^T f(t)e^{-i\omega t}\,dt\right| < \frac{2Fn}{|\omega|} + \frac{\varepsilon}{2}.$$

Now, it is clear that, for an arbitrary $\varepsilon > 0$, we can select a finite $\Omega = 4Fn/\varepsilon$
such that the modulus of the investigated integral is less than $\varepsilon$ for any $\omega > \Omega$. This
proves the Riemann-Lebesgue Lemma. •
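The statement of the lemma can be observed numerically. The sketch below is an illustration under our own assumptions: the test function $f(t) = e^t$ on $[0, 1]$ and the composite Simpson rule are arbitrary choices, and the closed form $(e^{1 - i\omega} - 1)/(1 - i\omega)$ of the integral serves only as a reference value.

```python
import cmath
import math

def fourier_integral(f, T, omega, n=20000):
    """Composite Simpson approximation of the integral of f(t) exp(-i omega t)
    over [0, T]; n must be even."""
    h = T / n
    total = f(0.0) + f(T) * cmath.exp(-1j * omega * T)
    for k in range(1, n):
        t = k * h
        weight = 4.0 if k % 2 else 2.0
        total += weight * f(t) * cmath.exp(-1j * omega * t)
    return total * h / 3.0

f = math.exp   # a continuous function on [0, 1], chosen only for illustration
T = 1.0
for omega in (1.0, 10.0, 100.0, 1000.0):
    numeric = fourier_integral(f, T, omega)
    exact = (cmath.exp(1.0 - 1j * omega) - 1.0) / (1.0 - 1j * omega)
    print(f"omega = {omega:7.1f}   |numeric| = {abs(numeric):.6f}   |exact| = {abs(exact):.6f}")
```

The modulus of the integral shrinks roughly like $1/\omega$ here because $f$ is smooth inside the interval and the only contributions come from its end points.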
Additional remarks are in order here:
Remark 1. In physical applications, the upper limit in the integral (1) is often
infinite, and it is useful to generalize the Riemann-Lebesgue Lemma to that case.
This can be easily done if the function $f(t)$ is assumed not only continuous on $(0, \infty)$,
but also absolutely integrable on the half-line. Then, for any $\varepsilon > 0$, we can find a
$T$ such that

$$\int_T^\infty |f(t)|\,dt < \frac{\varepsilon}{3}.$$

Replacing $\varepsilon/2$ in (2) by $\varepsilon/3$, and putting $\Omega = 6Fn/\varepsilon$, we can immediately check
the validity of the Riemann-Lebesgue Lemma in the case of the infinite upper integration
limit.

Remark 2. The above argument indicates that the absolute convergence of the
above improper integral is not necessary for the validity of the Riemann-Lebesgue
Lemma. A sufficient condition is that the integral

$$\int_0^\infty f(t)e^{-i\omega t}\,dt$$

converges uniformly for large $\omega$'s. Recall that the integral converges uniformly
for $|\omega| > \Omega$ if, for any $\varepsilon > 0$, one can find a number $L(\varepsilon) > T$, independent of
$\omega$ and such that, for all $|\omega| > \Omega$ and $l > L$,

$$\left|\int_l^\infty f(t)e^{-i\omega t}\,dt\right| < \varepsilon.$$

We will omit the proof of this statement but the following example illuminates the
situation.

Example 1. Consider the integral

$$\int_0^\infty \frac{\cos(\omega t)}{\sqrt{1+t^2}}\,dt = K_0(\omega),$$

which is called the modified Bessel function of the third kind. The integral does
not converge absolutely since the integrand decays at infinity too slowly (only as
$1/t$). However, it is not difficult to show that the integral converges uniformly for
$\omega > 0$, so that the Riemann-Lebesgue Lemma is applicable. Indeed, in the theory
of Bessel functions one demonstrates that the above integral rapidly converges to
0 as $\omega \to \infty$. More precisely,

$$K_0(\omega) \sim \sqrt{\frac{\pi}{2\omega}}\,e^{-\omega} \quad (\omega \to \infty).$$

The absence of absolute convergence of the integral is reflected by its behavior for
$\omega \to 0$, where $K_0(\omega) \sim -\ln(|\omega|)$.

4.3 Functions with jumps


4.3.1. Discontinuities of the first kind. Having armed ourselves with the
Riemann-Lebesgue Lemma, we can now proceed with an analysis of the asymptotic
behavior of Fourier transforms of functions with discontinuities of the first kind
(that is, jumps). We begin with a look at the simplest situation where the original
function $f(t)$ has a jump at the point $t = \tau$ and is continuous, together with its first
derivative, for $t \ne \tau$. The additional assumption is that both the function $f(t)$ and its
derivative $f'(t)$ are absolutely integrable over the entire real axis.
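Let us split the Fourier integral at the jump point to get

$$\tilde f(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt = \frac{1}{2\pi}\int_{\tau}^{\infty} f(t)e^{-i\omega t}\,dt + \frac{1}{2\pi}\int_{-\infty}^{\tau} f(t)e^{-i\omega t}\,dt, \tag{1}$$

and then integrate by parts each of the two terms on the right-hand side. The first
term

$$\frac{1}{2\pi}\int_{\tau}^{\infty} f(t)e^{-i\omega t}\,dt = f(\tau + 0)\,\frac{e^{-i\omega\tau}}{2\pi i\omega} + \frac{1}{2\pi i\omega}\int_{\tau}^{\infty} f'(t)e^{-i\omega t}\,dt. \tag{2}$$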
In view of the assumptions, the function $f'(t)$ is continuous and absolutely integrable.
Hence, the Riemann-Lebesgue Lemma is applicable to the integral in the second term on the
right-hand side of (2), and implies that, as $\omega \to \infty$,
this term is of the order smaller than the function $1/\omega$. Thus, we have the asymptotic
relation

$$\frac{1}{2\pi}\int_{\tau}^{\infty} f(t)e^{-i\omega t}\,dt = f(\tau + 0)\,\frac{e^{-i\omega\tau}}{2\pi i\omega} + o\left\{\frac{1}{\omega}\right\}.$$

A similar relation

$$\frac{1}{2\pi}\int_{-\infty}^{\tau} f(t)e^{-i\omega t}\,dt = -f(\tau - 0)\,\frac{e^{-i\omega\tau}}{2\pi i\omega} + o\left\{\frac{1}{\omega}\right\}$$

holds true for the second integral in (1). Putting these two results together we get
the desired asymptotic formula for the Fourier transform of a function with a jump
at the point $\tau$:

$$\tilde f(\omega) = \lfloor f \rceil\,\frac{e^{-i\omega\tau}}{2\pi i\omega} + o\left\{\frac{1}{\omega}\right\} \quad (\omega \to \infty), \tag{3}$$
where we used the notation

$$\lfloor f(\tau) \rceil = f(\tau + 0) - f(\tau - 0)$$

to denote the jump size of $f$ at the point $\tau$.

FIGURE 4.3.1
Function $f(t)$ is smooth outside $t = \tau$ and has a discontinuity of the first kind
at $t = \tau$.
Remark 1. There is a clear-cut connection between the asymptotic formula
(3) and the generalized Fourier transform. Formula (3) describes the asymptotic
behavior of the classical Fourier transform of a discontinuous function. However,
if we multiply both sides in (3) by $i\omega$ we obtain that

$$i\omega\tilde f(\omega) = \lfloor f \rceil\,\frac{e^{-i\omega\tau}}{2\pi} + o\{1\},$$

which, according to (3.3.2), describes the asymptotic behavior of the Fourier transform
of the generalized derivative $f'(t)$. Indeed, its first term corresponds to the Fourier
transform of the shifted Dirac delta $\lfloor f(\tau) \rceil\delta(t - \tau)$.
4.3.2. Remainder terms of the asymptotics. Similar asymptotic relations are
often encountered in physical and engineering problems. However, applying them
in practice, we immediately face the following fundamental dilemma: How large
need $\omega$ be so that we can ignore the summand of order $o\{1/\omega\}$, and retain on the
right-hand side of (3) only the "principal" first term? A common sense physical
answer is: The values of $\omega$ have to be much larger than 1. This obviously is not a
rigorous answer, and on one particular occasion one of the authors observed two
distinguished physicists seriously arguing whether the number 4 is much larger than
1.
The rigorous approach to the problem reformulates the question and asks for
estimates of the remainder term's magnitude, which are specific to each asymptotic
formula. To explain the notion of remainder term, let us replace formula (3) by an
exact equality

$$\tilde f(\omega) = \lfloor f \rceil\,\frac{e^{-i\omega\tau}}{2\pi i\omega} + R_1(\omega). \tag{4}$$

Here, $R_1$ is called the remainder term and it is equal to the difference between the
accurate value of $\tilde f$ and the value of its principal asymptotic term. In our case,

$$R_1(\omega) = \frac{1}{2\pi i\omega}\int_{-\infty}^{\infty} \{f'(t)\}e^{-i\omega t}\,dt. \tag{5}$$

By $\{f'(t)\}$ we mean a function which is equal to the derivative of $f(t)$ for all values
$t \ne \tau$, where it exists in the classical sense, and which is arbitrarily defined at $t = \tau$
(this does not influence the integral's value). Then the question of estimation of
the remainder term's magnitude is reduced to searching for inequalities of the type

$$|R_1(\omega)| < M(\omega),$$

where, hopefully, the function $M(\omega)$ has a simpler structure than $R_1(\omega)$. If that
is the case, the majorant $M(\omega)$ permits a quantitative evaluation of the error committed
by using just the principal term in the asymptotic formula. Sometimes the
search for an accurate majorant, which does not exaggerate the true error of an
asymptotic formula, requires a lot of mathematical virtuosity. However, it is often
possible to use the standard tool of asymptotic expansions which will be described
below.

4.3.3. Asymptotic expansions. Consider the Fourier transform of a function $f(t)$
with a single jump at $t = \tau$. The additional assumption is that, for $t \ne \tau$, $f(t)$ has
$n$ continuous derivatives which are absolutely integrable on $(-\infty, \tau)$ and $(\tau, \infty)$.
In this case, by repeated integration by parts, we obtain that

$$\tilde f(\omega) = \frac{e^{-i\omega\tau}}{2\pi}\sum_{m=0}^{n-1}\left(\frac{1}{i\omega}\right)^{m+1}\lfloor f^{(m)} \rceil + R_n(\omega), \tag{6}$$

with the remainder term

$$R_n(\omega) = \frac{1}{2\pi}\left(\frac{1}{i\omega}\right)^{n}\int_{-\infty}^{\infty} \{f^{(n)}(t)\}e^{-i\omega t}\,dt. \tag{7}$$

It is clear that the above formula generalizes formula (4). The familiar symbol
$\lfloor f^{(m)}(\tau) \rceil$ denotes the jump size of $f^{(m)}(t)$ at $t = \tau$.
By the Riemann-Lebesgue Lemma, the remainder term in (6) is of order smaller
than the last term in the sum, which is of order $O\{1/\omega^n\}$. However, the Riemann-Lebesgue
Lemma does not provide a recipe for the quantitative estimate of the
remainder term. The situation becomes simpler if, as in practical computations,
one retains only terms up to the $(n-1)$th power of the quantity $1/\omega$. Then one obtains
a rough estimate of the remainder term based on the inequality

$$\left|\int_{-\infty}^{\infty} \{f^{(n)}(t)\}e^{-i\omega t}\,dt\right| \le \int_{-\infty}^{\infty} \left|\{f^{(n)}(t)\}\right|dt = I_n.$$

The integral on the right-hand side is completely independent of $\omega$ and, as a rule, it
is easily evaluated either analytically or numerically to a desired degree of accuracy.
If the number $I_n$ is known then the remainder term has an explicit estimate

$$|R_n(\omega)| \le M(\omega) = \frac{I_n}{2\pi|\omega|^n}.$$

Roughness of the estimate is caused by the fact that the majorizing function $M(\omega)$
is of the same order of magnitude as the last term $1/\omega^n$ in (6). However, putting
$M(\omega)$ together with the last term of the sum, we can find a majorant for the
remainder term $R_{n-1}(\omega)$ which is of order smaller than the order of the last term
of the truncated asymptotic formula

$$\tilde f(\omega) = \frac{e^{-i\omega\tau}}{2\pi}\sum_{m=0}^{n-2}\left(\frac{1}{i\omega}\right)^{m+1}\lfloor f^{(m)} \rceil + R_{n-1}(\omega). \tag{8}$$

In particular, a majorant of the remainder term in (4) found in this fashion, which
contains only the principal term of asymptotic expansion, gives

$$|R_1(\omega)| \le \frac{1}{2\pi\omega^2}\left\{\left|\lfloor f' \rceil\right| + I_2\right\} = O\left\{\frac{1}{\omega^2}\right\}.$$

The inequality demonstrates that, as $\omega$ increases, the first term in (4), which is
$O\{1/\omega\}$, gives a better and better description of the asymptotic behavior of the
Fourier transform.
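To see the principal term and the majorant at work, consider a function for which everything is available in closed form. The sketch below uses $f(t) = \chi(t)e^{-t}$, which is our own illustrative choice: it has a unit jump at $\tau = 0$, the jump of its derivative is $-1$, $I_2 = 1$, and its transform in the convention of this chapter is $1/[2\pi(1 + i\omega)]$. The function names are hypothetical.

```python
import math

def f_tilde(omega):
    """Exact transform (1/2pi) * integral of chi(t) exp(-t) exp(-i omega t) dt."""
    return 1.0 / (2.0 * math.pi * (1.0 + 1j * omega))

def principal_term(omega, jump=1.0, tau=0.0):
    """First term of (4): jump * exp(-i omega tau) / (2 pi i omega)."""
    phase = complex(math.cos(omega * tau), -math.sin(omega * tau))
    return jump * phase / (2.0j * math.pi * omega)

def remainder_bound(omega, jump_of_derivative=-1.0, I2=1.0):
    """Majorant (|jump of f'| + I_2) / (2 pi omega^2) for |R_1(omega)|."""
    return (abs(jump_of_derivative) + I2) / (2.0 * math.pi * omega ** 2)

for omega in (2.0, 4.0, 10.0, 100.0):
    r1 = abs(f_tilde(omega) - principal_term(omega))
    rel = r1 / abs(principal_term(omega))
    print(f"omega = {omega:6.1f}   |R1| = {r1:.3e}   bound = {remainder_bound(omega):.3e}   "
          f"relative error of principal term = {rel:.3f}")
```

For this particular function the relative error of the principal term equals $1/\sqrt{1 + \omega^2}$, so at $\omega = 4$ it is still about 24 percent, which puts the physicists' argument about the number 4 in a quantitative light.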
Example 1. Consider two real-valued integrals which can be evaluated in closed
form:

$$S = \frac{1}{\pi}\int_0^\infty e^{-ht}\sin(\omega t)\,dt = \frac{1}{\pi}\,\frac{\omega}{\omega^2 + h^2} \tag{9}$$

and

$$C = \frac{1}{\pi}\int_0^\infty e^{-ht}\cos(\omega t)\,dt = \frac{1}{\pi}\,\frac{h}{\omega^2 + h^2}. \tag{10}$$

We shall find their asymptotics for large $\omega$ by expanding the algebraic expressions
on the right-hand sides into power series in $1/\omega$. Note that their principal asymptotic
terms are, respectively, $1/\pi\omega$ and $h/\pi\omega^2$. Thus, as $\omega \to \infty$, the second
integral turns out to be of order smaller than that of the first integral.
Let us take a closer look at that disparity of asymptotics of the two seemingly
similar integrals. Like an old war veteran who, on a rainy day, feels that his
amputated leg is still there, our integrals "feel" the influence of the full Fourier
integrals

$$S = \frac{i}{2\pi}\int_{-\infty}^{\infty} \operatorname{sign}(t)\,e^{-h|t|}e^{-i\omega t}\,dt, \qquad C = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-h|t|}e^{-i\omega t}\,dt. \tag{11}$$

Functions which are being Fourier-transformed in (11) are, respectively, odd and
even extensions to the half-line $(-\infty, 0)$ of the function $e^{-ht}$ from (9-10). Note
that, in the first case, the integrand has a jump discontinuity of the size $2i$ at $t = 0$.
Therefore, according to the asymptotic formula (4), the principal asymptotic term
of the integral $S$ is $1/\pi\omega$. On the other hand, the even extension of $e^{-ht}$ in the
integral $C$ is everywhere continuous, but has a discontinuous derivative with a
jump of size $\lfloor f'(0) \rceil = -2h$. This means that, in our case, the first nonvanishing
(principal) term in the asymptotic expansion (6) of $C$ in a power series in $1/\omega$ is
the second term, which is equal to $h/\pi\omega^2$. •

Example 2. The contrast is even stronger in the behavior of the integrals

$$C = \int_0^\infty \exp(-t^2)\cos(\omega t)\,dt = \frac{\sqrt{\pi}}{2}\exp\left(-\omega^2/4\right), \qquad S = \int_0^\infty \exp(-t^2)\sin(\omega t)\,dt. \tag{12}$$

The first integral, which is proportional to the Fourier transform of an infinitely
differentiable function, decays to 0 (as $\omega$ increases) faster than the exponential
function. The second integral, in view of the above asymptotic formulas, satisfies
the asymptotic relation

$$S \sim \frac{1}{\omega}. \tag{13}$$

It has a milder power-type decay at infinity (see Fig. 4.3.2). •
4.3.4. Log-log scales and power-type behavior. The above differences in asymptotic
behavior are well illustrated in Fig. 4.3.2 by the graphs of $C$ and $S$ as functions
of $\omega$. The graph (a) is in the usual, linear scales. However, in the areas of physics
where power functional relations are common, it is more convenient to employ the
logarithmic scales which are used on the graph (b).

FIGURE 4.3.2
Dependence of integrals $C$ and $S$ from formula (12) on $\omega$. (a) Linear scales,
(b) Logarithmic scales.
The logarithmic scales are of extraordinary importance in the physical sciences
and engineering, and deserve a few detailed comments. For the sake of concreteness,
we will illustrate the situation in the case of acoustic pressure $P$ which is often
measured in units called decibels (or, in short, dB), and found from the formula

$$D = 10\log(P^2/P_0^2).$$

A threshold value $P_0$ is selected on the basis of physical considerations. For
example, in acoustics it is selected to be equal to $2 \cdot 10^{-5}$ Pa. The squares of
the quantities under the logarithm are introduced by the physicists so that the
energetic characteristics of the processes, which are proportional to the square of
the pressure, can be measured in decibels. Also, the squares ensure that the decibels
are well defined for arbitrary, even negative values of $P$. Taking advantage of
the logarithm's properties, the above formula for the number of decibels can be
rewritten in the form

$$D = 20\log|P| - 20\log P_0.$$

Assume that an experiment found two values of acoustic pressure, $P_1$ and $P_2$,
generated by a car noise and a jet aircraft noise. The difference between the
corresponding numbers of decibels does not depend on the selection of units of
pressure for the initial measurements, be they Pascals or millimeters of mercury.
The difference is also independent of the threshold value $P_0$. This is the reason
why the decibels are so useful in comparing measurement results. If only the
relative, and not the absolute number of decibels is important, then one can utilize
a truncated formula

$$D = 20\log(|P|).$$
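The independence of decibel differences from the pressure units and from the threshold is easy to check directly. In the sketch below the two pressure amplitudes and the conversion factor to millimeters of mercury are illustrative values chosen by us, not data from the text.

```python
import math

def decibels(P, P0=2e-5):
    """D = 10 log10(P^2 / P0^2) = 20 log10|P| - 20 log10(P0)."""
    return 20.0 * math.log10(abs(P)) - 20.0 * math.log10(P0)

# Two hypothetical pressure amplitudes in pascals (illustrative values only).
P1, P2 = 0.2, 20.0

diff_pa = decibels(P2) - decibels(P1)

# The same measurements in millimeters of mercury, with a different threshold.
PA_PER_MMHG = 133.322
diff_mmhg = decibels(P2 / PA_PER_MMHG, P0=1.0) - decibels(P1 / PA_PER_MMHG, P0=1.0)

print(f"difference in dB (Pa, P0 = 2e-5):  {diff_pa:.6f}")
print(f"difference in dB (mmHg, P0 = 1):   {diff_mmhg:.6f}")
```

Both differences equal $20\log(P_2/P_1)$, since the unit conversion and the threshold contribute the same additive constant to each measurement.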

As an example, consider a power-type Fourier transform

$$\tilde f(\omega) = A\omega^{-r}$$

of a time-dependent function $f(t)$. In decibels (that is, in the logarithmic scale) its
values are expressed by

$$D = -20r\log(\omega) + 20\log(|A|).$$

If the frequency $\omega$ itself is also measured in the logarithmic scale, that is, if we
introduce $s = \log(\omega)$, then in the $(D, s)$-plane the power law is represented by
a straight line with the slope equal to $-20r$ determined by the power exponent
$r$. Hence, if the graphs are drawn in the logarithmic coordinate scales, then the
presence, and even the magnitude of the exponent $r$ can be readily detected. Fig.
4.3.2 (b) clearly demonstrates, for large $\omega$, the absence of the power law for $C$,
and its presence for $S$. The graph of $S$ looks a little bit like a boomerang with
two linear pieces symmetrically angled towards the $S$ axis. The decreasing piece
corresponds to the power law asymptotics (13), and the linear growth for small $\omega$
is connected to the principal asymptotics $S(\omega) \sim \omega/2$ of the integral $S$ in (12) for
$\omega \to 0$.
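The slopes visible on a log-log plot can also be estimated numerically. The following sketch evaluates $S$ from (12) by a composite Simpson rule on the truncated interval $[0, 8]$ (the truncation point and the number of nodes are our own choices) and compares local log-log slopes of $S$ and $C$.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0

def S(omega):
    """Integral of exp(-t^2) sin(omega t) over (0, inf), truncated at t = 8."""
    return simpson(lambda t: math.exp(-t * t) * math.sin(omega * t), 0.0, 8.0, 20000)

def C(omega):
    """Closed form from (12): (sqrt(pi)/2) exp(-omega^2 / 4)."""
    return 0.5 * math.sqrt(math.pi) * math.exp(-omega * omega / 4.0)

def loglog_slope(func, w1, w2):
    """Slope of log(func) versus log(omega) between two frequencies."""
    return (math.log(func(w2)) - math.log(func(w1))) / (math.log(w2) - math.log(w1))

print("S slope for large omega:", loglog_slope(S, 20.0, 40.0))
print("S slope for small omega:", loglog_slope(S, 0.01, 0.02))
print("C slope between 10 and 20:", loglog_slope(C, 10.0, 20.0))
```

The slope of $S$ is close to $-1$ for large $\omega$ and close to $+1$ for small $\omega$, the two arms of the "boomerang", while the slope of $C$ keeps steepening and never settles, as expected for a function without power-law behavior.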

4.3.5. Fourier transforms of pulse functions and optimization of directional
antennas. A study of the properties of Fourier transforms of discontinuous functions
would be incomplete without mentioning rectangular functions describing
pulse signals or indicator functions of intervals. They are the simplest discontinuous
functions with an obvious symmetry, useful in applications. Since we are not
going to discuss direct engineering problems here, assume that the argument $t$ is
dimensionless, and concentrate on the specific example.

Example 3. Define the function

$$\Pi(t) = [\chi(t+1) - \chi(t-1)] = \begin{cases} 1, & |t| < 1; \\ 0, & |t| > 1. \end{cases}$$

Its Fourier transform

$$\tilde\Pi(\omega) = \frac{1}{2\pi}\int_{-1}^{1} e^{-i\omega t}\,dt = \frac{1}{\pi}\operatorname{sinc}\left(\frac{\omega}{\pi}\right), \tag{14}$$

where, in a commonly accepted notation,

$$\operatorname{sinc}(\omega) = \frac{\sin(\pi\omega)}{\pi\omega}.$$

The graph of the absolute value of the function $\operatorname{sinc}\omega$ is plotted in Fig. 4.3.3. A
slow decay of the local maxima of $|\operatorname{sinc}\omega|$, as $\omega$ increases, reflects the slow decay
asymptotics ($\sim 1/\omega$) of the Fourier transform of $\Pi(t)$ related to the jumps of $\Pi(t)$
at $t = \pm 1$.
In the theory of antennas, the graph in Fig. 4.3.3 represents the antenna's
directional pattern as a function of an azimuth-dependent coordinate $\omega$. Existence
of the far away "lobes" with large amplitudes is not desirable as it lessens the
angular resolution of a radar system. For that reason, in real systems, one tries to
dampen the lobes, while preserving some of the character of the original function
$\Pi(t)$. •

FIGURE 4.3.3
Graph of the absolute value of function $\operatorname{sinc} x$.

4.3.6. Smoothing and filtering. We shall mention two methods which help to accelerate
the decay of slowly decreasing tails of Fourier transforms of discontinuous
functions. They both rely on the idea of discontinuity smoothing.
The first smoothing method, which is commonly used in linear filtration of pulse
signals, relies on the convolution

$$g(t) = \int h(\tau)f(t - \tau)\,d\tau$$

of the original function $f(t)$ with a filtering function $h(t)$. Its Fourier transform,
according to (3.2.8), is equal to

$$\tilde g(\omega) = 2\pi\tilde h(\omega)\tilde f(\omega). \tag{15}$$

So if the normalized Gaussian function

$$h(t) = \frac{1}{\sqrt{2\pi}\,\varepsilon}\exp\left(-\frac{t^2}{2\varepsilon^2}\right) \tag{16}$$

is taken as the filtering function, its Fourier transform is

$$\tilde h(\omega) = \frac{1}{2\pi}\exp\left(-\frac{\varepsilon^2\omega^2}{2}\right),$$

and formula (15) takes the form

$$\tilde g(\omega) = \tilde f(\omega)\exp\left(-\frac{\varepsilon^2\omega^2}{2}\right).$$

FIGURE 4.3.4
Graph of the pulse function $\Pi(t)$ before and after smoothing.

This particular filtering procedure does not significantly change the form of
the Fourier transform $\tilde f(\omega)$ for not too large frequencies (say, for $|\omega| < 1/\varepsilon$),
but it dramatically dampens the slowly decaying maxima of the lobes for large
frequencies (say, $|\omega| \gg 1/\varepsilon$).
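The effect of the Gaussian filter on the lobes of the pulse transform (14) can be tabulated directly. The sketch below assumes that, with the normalization $\tilde f(\omega) = (1/2\pi)\int f(t)e^{-i\omega t}dt$ used in this chapter, filtering by (16) multiplies the transform by $\exp(-\varepsilon^2\omega^2/2)$; the value $\varepsilon = 0.1$ and the sample frequencies are arbitrary illustrative choices.

```python
import math

def rect_transform(omega):
    """Fourier transform (14) of the pulse: sin(omega) / (pi omega)."""
    return 1.0 / math.pi if omega == 0.0 else math.sin(omega) / (math.pi * omega)

def smoothed_transform(omega, eps):
    """Transform of the pulse convolved with the Gaussian (16) of width eps:
    by (15), the original transform times exp(-eps^2 omega^2 / 2)."""
    return rect_transform(omega) * math.exp(-0.5 * (eps * omega) ** 2)

eps = 0.1  # smoothing scale, an illustrative choice
# One low frequency and two frequencies near lobe maxima (|sin(omega)| = 1).
for omega in (1.0, 0.5 * math.pi + 4.0 * math.pi, 0.5 * math.pi + 16.0 * math.pi):
    before = abs(rect_transform(omega))
    after = abs(smoothed_transform(omega, eps))
    print(f"omega = {omega:7.3f}   before = {before:.3e}   after = {after:.3e}   "
          f"ratio = {after / before:.3e}")
```

For $|\omega| < 1/\varepsilon = 10$ the transform is almost untouched, while the lobe near $\omega \approx 52$ is suppressed by about six orders of magnitude.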
The filtration method, so effective in signal processing, is not optimal for antenna
problems. Filtration washes out the function, and makes $g(t)$ "last longer" than
$f(t)$. An antenna has a finite spatial extent, and the function $f(t)$, which describes
the distribution of sources on the antenna as a function of the spatial parameter
$t$, must be zero outside it. In mathematical terms, the engineering problem can
be formulated as follows: Find a smooth function, with support $[-1, 1]$, which
optimizes, according to a chosen criterion, the Fourier transform $\tilde f(\omega)$.
In engineering practice, experience would dictate a selection appropriate
for a given concrete situation. Here, without getting too deeply involved in
mathematical intricacies, we will provide a few examples of functions which have
fast-decaying Fourier transforms while preserving some of the characteristics of
the impulse function.

Example 4. Consider a triangular function

$$f(t) = \begin{cases} 1 - |t|, & |t| < 1, \\ 0, & |t| \ge 1. \end{cases} \tag{17}$$

It is a continuous function, but its derivative has jumps of size 1 at $t = \pm 1$, and of
size $-2$ at $t = 0$. Therefore, by analogy with formula (8),

$$\tilde f(\omega) \sim -\frac{1}{2\pi\omega^2}\left(e^{-i\omega} - 2 + e^{i\omega}\right).$$

Since the second derivative of the triangular function is zero outside the points
$t = -1, 0, 1$, the remainder term $R_2(\omega)$ (7) is equal to zero as well. This means
that the asymptotic formula

$$\tilde f(\omega) = \frac{1}{2\pi}\operatorname{sinc}^2\left(\frac{\omega}{2\pi}\right) \tag{18}$$

is exact. This fact could have been guessed if we had noticed that the triangular
function (17) is the convolution of the rectangular function $\Pi(2t)$ with itself.
Hence, the Fourier transform of the triangular function is equal to the square of
the Fourier transform of $\Pi(2t)$ (multiplied by $2\pi$), which turns out to be equal to
$\tilde\Pi(\omega/2)/2$. •
The absence of discontinuities in the triangular function guarantees a relatively
strong, by comparison with the rectangular function, damping of the Fourier trans-
form's lobes. Nevertheless, in applications one often selects even smoother sub-
stitutes of the pulse function.
Example 5. Consider function

$$ f(t) = \begin{cases} \cos^2(\pi t/2) = [1 + \cos(\pi t)]/2, & \text{for } |t| < 1; \\ 0, & \text{for } |t| \ge 1. \end{cases} \tag{19} $$

As an exercise, we shall go through a detailed computation of the Fourier transform
of $f(t)$. First, let us get rid of the constant term in (19) by taking the derivative

$$ g(t) = f'(t) = -\frac{\pi}{2}\sin(\pi t) $$

for $|t| < 1$. It follows from formula (3.1.2) that

$$ \tilde g(\omega) = i\,\frac{\pi}{4}\left[\tilde\Pi(\omega - \pi) - \tilde\Pi(\omega + \pi)\right] = \frac{i}{4}\left[\frac{\sin(\omega - \pi)}{\omega - \pi} - \frac{\sin(\omega + \pi)}{\omega + \pi}\right] $$

is proportional to the difference of shifted Fourier transforms of rectangular
functions. Taking $\sin(\omega \pm \pi) = -\sin(\omega)$ outside the brackets, and writing the remaining
fractions over the common denominator, we get that

$$ \tilde g(\omega) = i\,\frac{\pi}{2}\,\frac{\sin(\omega)}{\pi^2 - \omega^2}. $$

Returning to the Fourier transform of the original function (19), we find that

$$ \tilde f(\omega) = \frac{\tilde g(\omega)}{i\omega} = \frac{\pi}{2}\,\frac{\sin(\omega)}{\omega(\pi^2 - \omega^2)}. \tag{20} $$

For $\omega \to \infty$, this Fourier transform decays as $1/\omega^3$, the better behavior resulting
from the function itself having discontinuities only in the second derivative.
Similarly, the function

$$ f(t) = \begin{cases} \cos^4(\pi t/2) = \frac{3}{8} + \frac{1}{2}\cos(\pi t) + \frac{1}{8}\cos(2\pi t), & \text{for } |t| < 1; \\ 0, & \text{for } |t| \ge 1, \end{cases} \tag{21} $$

which has discontinuities only in the fourth derivative, has the Fourier transform

$$ \tilde f(\omega) = \frac{3\pi^3}{2}\,\frac{\sin\omega}{\omega(\omega^2 - \pi^2)(\omega^2 - 4\pi^2)}, \tag{22} $$

which decays at infinity as $1/\omega^5$. Graphs of functions (19) and (21) are plotted in
Fig. 4.3.5. It is almost impossible to distinguish them by naked-eye inspection
and tell which one is smoother in the neighborhood of $t = \pm 1$. On the other hand,
their Fourier transforms, shown in Fig. 4.3.6 with logarithmic scales on the
ordinate axes, are very sensitive to the existence of "hidden" discontinuities of the
original functions. •
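The sensitivity of the Fourier transform to hidden discontinuities can also be seen numerically. The sketch below, written under the same $1/2\pi$ convention, compares a midpoint-rule quadrature with the closed forms (20) and (22); all function names are ours and purely illustrative.

```python
import cmath
import math


def fourier_transform(f, omega, n=20000):
    """Approximate (1/2π) ∫_{-1}^{1} f(t) e^{-iωt} dt with the midpoint rule."""
    step = 2.0 / n
    total = 0j
    for k in range(n):
        t = -1.0 + (k + 0.5) * step
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * step / (2 * math.pi)


def cos2_pulse(t):
    """Function (19) on its support |t| < 1."""
    return math.cos(math.pi * t / 2) ** 2


def cos4_pulse(t):
    """Function (21) on its support |t| < 1."""
    return math.cos(math.pi * t / 2) ** 4


def cos2_transform(w):
    """Closed form (20); not to be evaluated at w = 0 or w = ±π."""
    return (math.pi / 2) * math.sin(w) / (w * (math.pi ** 2 - w ** 2))


def cos4_transform(w):
    """Closed form (22); not to be evaluated at w = 0, ±π, ±2π."""
    return 1.5 * math.pi ** 3 * math.sin(w) / (
        w * (w ** 2 - math.pi ** 2) * (w ** 2 - 4 * math.pi ** 2))


# Discrepancy between quadrature and closed forms at a few sample frequencies.
for w in (5.0, 20.5, 80.5):
    print(w,
          abs(fourier_transform(cos2_pulse, w) - cos2_transform(w)),
          abs(fourier_transform(cos4_pulse, w) - cos4_transform(w)))
```

At large frequencies the ratio of the two closed forms is $3\pi^2/|\omega^2 - 4\pi^2|$, which makes the extra two powers of decay of (22) explicit.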

4.4 Gamma function and Fourier transforms of power functions
The previous section discussed Fourier transforms of functions with isolated
discontinuities of the first kind (jumps); finite, albeit different at some points,
one-sided limits were assumed to exist. If a function has a discontinuity at an
isolated point t which is not of the first kind, then we say that it is a discontinuity
of the second kind. At such a point, either the one-sided limits are infinite (as in
$\lim_{t\to 0+} t^{-1}$) or they do not exist (as in $\lim_{t\to 0+}\sin(t^{-1})$).

FIGURE 4.3.5
(a) Graph of function (19). (b) Graph of function (21).

In this section we will study the Fourier transforms of functions $f(t) = 0$ for
$t \le 0$, smooth for $t > 0$ and sufficiently rapidly decaying for $t \to \infty$, with the
asymptotics at the origin

$$ f(t) \sim t^{a-1} \qquad (t \to 0+), \tag{1} $$

where

$$ a > 0. \tag{2} $$

For $0 < a < 1$, the functions $f$ themselves have a discontinuity of the second
kind, and for fractional $a > 1$, it is their derivatives of order $n = \lfloor a \rfloor$ (the greatest
integer less than or equal to $a$) and greater that are discontinuous. Our analysis,
with obvious adjustments, applies equally to functions with shifted singularities of
the type

$$ f(t) \sim (t - \tau)^{a-1} \qquad (t \to \tau + 0). $$

FIGURE 4.3.6
(a) Graph of the absolute value of the Fourier image (20) of function (19).
(b) Graph of the Fourier image (22) of function (21).

Condition (2) guarantees that $f(t)$ is locally integrable in any neighborhood of
point $t = 0$, and that the Fourier transform of $f(t)$ exists in the classical sense.
Integration by parts, which was so effective in finding asymptotics for Fourier
transforms of step functions, is not helpful in this case. More efficient is a comparison
of the asymptotic laws of the Fourier transforms of (1) with asymptotics
of certain "gauge" functions. To construct these gauge functions we need to recall
properties of the Gamma function

$$ \Gamma(s) = \int_0^\infty e^{-t}t^{s-1}\,dt. \tag{3} $$

The integral on the right-hand side is also called the Euler integral.
The Gamma function provides an interpolation of the factorial function $(n - 1)!$
to noninteger arguments. Indeed, for positive integer arguments, one can check by
induction that

$$ \Gamma(n + 1) = 1 \cdot 2 \cdot 3 \cdots n = n! $$

For general (even complex) values of its argument, the Gamma function satisfies
the recurrence relation

$$ \Gamma(z + 1) = z\,\Gamma(z). $$

Also, we have a symmetrization formula

$$ \Gamma(z)\,\Gamma(1 - z) = \pi/\sin(\pi z), $$

from which it follows immediately that $\Gamma(1/2) = \sqrt{\pi}$. So, in this sense,

$$ (-1/2)! = \sqrt{\pi}. $$
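These properties are easy to confirm with the standard-library function `math.gamma`. The short sketch below is only a numerical sanity check; the helper names `reflection_gap` and `recurrence_gap` are ours.

```python
import math


def reflection_gap(z):
    """Difference between Γ(z)Γ(1-z) and π/sin(πz) for non-integer real z."""
    return math.gamma(z) * math.gamma(1 - z) - math.pi / math.sin(math.pi * z)


def recurrence_gap(z):
    """Difference between Γ(z+1) and zΓ(z)."""
    return math.gamma(z + 1) - z * math.gamma(z)


# Γ(n+1) against n! for the first few positive integers.
for n in range(1, 8):
    print(n, math.gamma(n + 1), math.factorial(n))

# Γ(1/2) against √π, then the two identities at sample non-integer points.
print(math.gamma(0.5), math.sqrt(math.pi))
print(reflection_gap(0.3), recurrence_gap(2.7))
```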

To study the Fourier transform of (1), we will begin with a more general line
integral

$$ \int_C z^{a-1}e^{-pz}\,dz $$

over a contour $C$ in the complex $z$-plane which is closely related to the Gamma
function. Here, $a$ and $p$ are positive numbers. The integrand is analytic in an
arbitrary bounded domain of the complex plane which does not contain $z = 0$.
As contour $C$, see Fig. 4.4.1, we select the contour formed by a segment of
the real axis from $\varepsilon$ to $R$, followed by a segment of the circular arc from
$z = R\exp(i0) = R$ to $z = R\exp(i\beta)$, then by a radial segment from $z = R\exp(i\beta)$ to
$z = \varepsilon\exp(i\beta)$ at an angle $\beta$ to the real axis ($0 < \beta < \pi/2$), and completed by a
small arc from $z = \varepsilon\exp(i\beta)$ to $z = \varepsilon\exp(i0) = \varepsilon$. We travel along the contour
in the positive direction, leaving the enclosed domain (which does not contain 0)
on the left-hand side.

FIGURE 4.4.1
Contour of integration leading to formula (4).

By the Cauchy Theorem (see Section 6.4), the integral of $z^{a-1}e^{-pz}$ over $C$
vanishes. Splitting it into four obvious pieces corresponding to the above segments
of contour $C$ we get that

$$ 0 = \int_\varepsilon^R t^{a-1}e^{-pt}\,dt + iR^a\int_0^\beta \exp\left(ia\varphi - pRe^{i\varphi}\right)d\varphi - e^{i\beta a}\int_\varepsilon^R t^{a-1}\exp\left(-pe^{i\beta}t\right)dt - i\varepsilon^a\int_0^\beta \exp\left(ia\varphi - p\varepsilon e^{i\varphi}\right)d\varphi. \tag{4} $$

On the first segment, which is a subset of the real axis, we replaced the variable
of integration $z$ by the real variable of integration $t$; on the first arc we made a
substitution $z = R\exp(i\varphi)$, $0 \le \varphi \le \beta$; on the radial segment, $z = t\exp(i\beta)$;
and on the small arc, $z = \varepsilon\exp(i\varphi)$.
Let $\varepsilon \to 0$ and $R \to \infty$. In the limit, the integral along the small arc vanishes
because the factor $\varepsilon^a \to 0$, parameter $a$ being positive.
As $R \to \infty$, the integral over the large arc also converges to 0, and a proof of
this fact is equivalent to a proof of the well known Jordan Lemma. We will sketch
it beginning with an obvious inequality

$$ R^a\left|\int_0^\beta \exp\left(ia\varphi - pRe^{i\varphi}\right)d\varphi\right| \le R^a\int_0^\beta \exp(-pR\cos\varphi)\,d\varphi. \tag{5} $$

The convexity of the graph of $\cos\varphi$ for $0 \le \varphi \le \pi/2$ implies a geometrically
obvious (see Fig. 4.4.2) inequality

$$ \cos\varphi \ge 1 - \frac{1 - \cos\beta}{\beta}\,\varphi, \qquad 0 \le \varphi \le \beta \le \pi/2, $$

which, together with inequality (5), gives

$$ R^a\left|\int_0^\beta \exp\left(ia\varphi - pRe^{i\varphi}\right)d\varphi\right| \le R^a\exp(-pR)\int_0^\beta \exp\left(pR\,\frac{1 - \cos\beta}{\beta}\,\varphi\right)d\varphi = \frac{R^{a-1}\beta}{p(1 - \cos\beta)}\left[\exp(-pR\cos\beta) - \exp(-pR)\right]. $$

FIGURE 4.4.2
Convexity of the function $\cos\varphi$.

For $|\beta| < \pi/2$, we have $\cos\beta > 0$, and the function in brackets decays exponentially
to zero as $R$ increases, thus offsetting the polynomial increase of the factor
$R^{a-1}$. This guarantees that, for any $a$, the integral along the large arc converges
to 0 as $R \to \infty$. A useful comment is in order here: For $\beta = \pi/2$, when the first
summand in brackets is equal to 1, and for $a < 1$, the integral over the large arc
also converges to 0 because of the factor $R^{a-1}$.
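The estimate of the large-arc integral can be illustrated numerically. The sketch below evaluates the left-hand side of (5) by the midpoint rule and compares it with the closed-form bound $R^{a-1}\beta\,[\exp(-pR\cos\beta) - \exp(-pR)]/[p(1 - \cos\beta)]$; the function names and the sample parameter values are our own choices, not part of the derivation.

```python
import cmath
import math


def large_arc_term(a, p, radius, beta, n=20000):
    """R^a |∫_0^β exp(iaφ - pR e^{iφ}) dφ|, the left-hand side of (5), by the midpoint rule."""
    step = beta / n
    total = 0j
    for k in range(n):
        phi = (k + 0.5) * step
        total += cmath.exp(1j * a * phi - p * radius * cmath.exp(1j * phi))
    return radius ** a * abs(total * step)


def large_arc_bound(a, p, radius, beta):
    """Closed-form bound R^{a-1} β [exp(-pR cos β) - exp(-pR)] / (p (1 - cos β))."""
    bracket = math.exp(-p * radius * math.cos(beta)) - math.exp(-p * radius)
    return radius ** (a - 1) * beta * bracket / (p * (1 - math.cos(beta)))


def chord_gap(phi, beta):
    """cos φ minus the chord 1 - (1 - cos β) φ / β; non-negative for 0 ≤ φ ≤ β ≤ π/2."""
    return math.cos(phi) - (1 - (1 - math.cos(beta)) * phi / beta)


# Borderline case β = π/2 with a < 1: the bound decays only like R^{a-1}.
for radius in (5.0, 20.0, 80.0):
    print(radius,
          large_arc_term(0.5, 1.0, radius, math.pi / 2),
          large_arc_bound(0.5, 1.0, radius, math.pi / 2))
```

With $\beta < \pi/2$ the same bound decays exponentially in $R$, for any $a$.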
So, in the limits $\varepsilon \to 0$ and $R \to \infty$, equality (4) takes the form

$$ \int_0^\infty t^{a-1}e^{-pt}\,dt = e^{i\beta a}\int_0^\infty t^{a-1}\exp\left(-pe^{i\beta}t\right)dt. $$

Multiplying both sides by $p^a$, introducing on the left-hand side a new variable of
integration $t' = pt$, and noticing that the transformed integral on the left-hand side
coincides with the Gamma function (3), we get an equality

$$ \Gamma(a) = p^a e^{i\beta a}\int_0^\infty t^{a-1}\exp\left(-pe^{i\beta}t\right)dt, $$

which, after a change of variables,

$$ u = pe^{i\beta} = h + i\omega, \qquad h = p\cos\beta, \qquad \omega = p\sin\beta, \tag{6} $$

becomes an equality

$$ \Gamma(a) = u^a\int_0^\infty t^{a-1}\exp(-ut)\,dt, \tag{7} $$

valid for $a > 0$, and $h = \mathrm{Re}\,u > 0$. For a fractional power function $u^a$, its
principal branch must be selected.
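Formula (7) lends itself to a direct numerical test with a complex $u$. In the sketch below we assume the substitution $t = x^2$ (our choice, made only to tame the endpoint singularity of $t^{a-1}$ for $a \ge 1/2$) and rely on Python's complex power, which uses the principal branch; the function name is hypothetical.

```python
import cmath
import math


def gamma_via_formula_7(a, u, x_max=8.0, n=40000):
    """Evaluate u^a ∫_0^∞ t^{a-1} e^{-ut} dt, which (7) asserts equals Γ(a).

    The substitution t = x² turns the integral into ∫_0^∞ 2 x^{2a-1} e^{-u x²} dx,
    which removes the endpoint singularity of t^{a-1} when a ≥ 1/2.
    The upper limit x_max assumes Re u is not much smaller than 1/2.
    """
    step = x_max / n
    total = 0j
    for k in range(n):
        x = (k + 0.5) * step
        total += 2 * x ** (2 * a - 1) * cmath.exp(-u * x * x)
    return u ** a * total * step  # complex ** float takes the principal branch


# Compare with Γ(a) for a few exponents and two complex values of u.
for a in (0.5, 1.5, 2.0):
    for u in (1 + 2j, 0.5 - 3j):
        print(a, u, gamma_via_formula_7(a, u), math.gamma(a))
```

The imaginary part of the computed value should vanish to quadrature accuracy, even though both $u^a$ and the integral are genuinely complex.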
For $0 < a < 1$, we can put $h = 0$ ($\beta = \pi/2$) (see the comment following the
analysis of the integral over the large arc) to obtain the formula

$$ \Gamma(a) = (i\omega)^a\int_0^\infty t^{a-1}\exp(-i\omega t)\,dt. \tag{8} $$

Formulas (7) and (8) have an easy interpretation in terms of Fourier transforms.
The integral on the right-hand side of (8), up to a factor $1/2\pi$, coincides with the
Fourier transform of function

$$ g(t; a) = \chi(t)\,t^{a-1}, \qquad 0 < a < 1, \tag{9} $$

so its Fourier transform is

$$ \tilde g(\omega; a) = \frac{\Gamma(a)}{2\pi(i\omega)^a}. \tag{10} $$

In particular, for $a = 1/2$, we have a remarkable formula

$$ \frac{\chi(t)}{\sqrt{t}} \longmapsto \frac{1}{(1 + i)\sqrt{2\pi\omega}}. \tag{11} $$

Remark 1. Note that it is easy to obtain the power law $\tilde g(\omega) \sim 1/\omega^a$ in (10)
from dimensional analysis, although this type of argument will not yield the precise
value $\Gamma(a)/2\pi$ of the numerical coefficient.

Remark 2. The validity of the same power law across the whole frequency
range of the Fourier transform reflects two fundamentally different behaviors of
the original time function. Its validity for $\omega \to \infty$ indicates the presence of a
discontinuity of the second kind of the original function at time $t = 0$, while its
validity for $\omega \to 0$ is a consequence of the moderate ($\sim t^{a-1}$) decay of the tail of the
original function as $t \to \infty$.
Remark 3. Equalities (7-8) were derived with the help of an integration contour
in the upper half-plane (see Fig. 4.4.1). Hence (see (6)), frequency $\omega = p\sin\beta$
appearing in these formulas is positive. However, it is easy to show that the above
proof remains in force if the contour is reflected into the lower half-plane. Thus,
the formulas (7-11) remain valid for all frequencies $-\infty < \omega < \infty$. In particular,

$$ \frac{\chi(t)}{\sqrt{t}} \longmapsto \begin{cases} (1 - i)/\sqrt{8\pi|\omega|}, & \text{for } \omega > 0; \\ (1 + i)/\sqrt{8\pi|\omega|}, & \text{for } \omega < 0. \end{cases} \tag{12} $$

Remark 4. Time reversal in the original function results in the frequency changing
sign in the Fourier transform (see (3.1.3c)). Hence, it follows from (12) that

$$ \frac{\chi(-t)}{\sqrt{|t|}} \longmapsto \begin{cases} (1 + i)/\sqrt{8\pi|\omega|}, & \text{for } \omega > 0; \\ (1 - i)/\sqrt{8\pi|\omega|}, & \text{for } \omega < 0. \end{cases} $$

Combining this relationship with (12), we find the Fourier transform of the function
$1/\sqrt{|t|}$, which is symmetric in time and displays a discontinuity of the second kind:

$$ \frac{1}{\sqrt{|t|}} \longmapsto \frac{1}{\sqrt{2\pi|\omega|}}. \tag{13} $$

Now, let us return to the discussion of the formula (7). With the help of notation
introduced in (6), we can rewrite (7) in the form

$$ \Gamma(a) = (h + i\omega)^a\int_0^\infty t^{a-1}\exp(-ht - i\omega t)\,dt. \tag{14} $$

In other words, function

$$ g(t; a, h) = \chi(t)\,t^{a-1}e^{-ht} \tag{15} $$

has the Fourier transform

$$ \tilde g(\omega; a, h) = \frac{\Gamma(a)}{2\pi(h + i\omega)^a}. \tag{16} $$

Its principal asymptotics at infinity is described by the relation

$$ \tilde g(\omega; a, h) \sim \frac{\Gamma(a)}{2\pi(i\omega)^a} \qquad (|\omega| \to \infty), \tag{17} $$

which has a form identical to (10), but is valid for any $a > 0$. For fractional
$a$ this asymptotic formula is a consequence of the original function's (see (15))
discontinuities of the second kind, or of similar discontinuities of its derivatives.
For the integer values of $a$ the formula agrees with the asymptotic behavior of
Fourier transforms of functions with explicit (in the function itself), or hidden (in
the derivatives) discontinuities of the first kind which were discussed in Section
4.3.
Despite asymmetry of the function (15) (it is identically equal to 0 for $t < 0$)
one can still consider its even and odd components. For an arbitrary function $g(t)$,
these two components are given by

$$ g_{\rm even}(t) = \frac{1}{2}[g(t) + g(-t)], \qquad g_{\rm odd}(t) = \frac{1}{2}[g(t) - g(-t)]. \tag{18} $$

Clearly,

$$ g_{\rm even}(t) + g_{\rm odd}(t) = g(t). $$

It follows from properties (3.1.3c) and (3.1.4) of the Fourier transform of real
functions that the Fourier transforms of even and odd components of $g$ correspond,
respectively, to the real and the imaginary parts of the Fourier transform of the
original function $g$. More formally,

$$ \tilde g_{\rm even}(\omega) = \mathrm{Re}\,\tilde g(\omega), \qquad \tilde g_{\rm odd}(\omega) = i\,\mathrm{Im}\,\tilde g(\omega). \tag{19} $$

Separating the real and imaginary parts of the Fourier transform (16), we obtain
that

$$ \tilde g_{\rm even}(\omega) = \frac{\Gamma(a)\cos(a\kappa)}{2\pi(h^2 + \omega^2)^{a/2}}, \qquad \tilde g_{\rm odd}(\omega) = -i\,\frac{\Gamma(a)\sin(a\kappa)}{2\pi(h^2 + \omega^2)^{a/2}}. \tag{20} $$

The argument $\kappa$ of a complex number $h + i\omega$ introduced above depends on the
dimensionless frequency $\gamma$ via the formula

$$ \kappa = \arctan\gamma, \qquad \gamma = \omega/h. \tag{21} $$

For $\gamma \to \pm\infty$ and $\kappa \to \pm\pi/2$, the Fourier transforms of even and odd parts of
the original function (15) have the following principal asymptotics:

$$ \tilde g_{\rm even}(\omega) \sim \frac{\Gamma(a)\cos(a\pi/2)}{2\pi|\omega|^a}, \tag{22a} $$

$$ \tilde g_{\rm odd}(\omega) \sim -i\,\frac{\Gamma(a)\sin(a\pi/2)}{2\pi|\omega|^a}\,\mathrm{sign}(\omega) \qquad (|\omega| \to \infty). \tag{22b} $$
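The high-frequency behavior of the even and odd parts can be read off numerically from the exact transform (16). The sketch below rescales $\mathrm{Re}\,\tilde g$ and $\mathrm{Im}\,\tilde g$ by $2\pi|\omega|^a/\Gamma(a)$, so that, since $(i\omega)^{-a} = |\omega|^{-a}\exp(-ia\pi\,\mathrm{sign}(\omega)/2)$ on the principal branch, the rescaled values should approach $\cos(a\pi/2)$ and $-\sin(a\pi/2)\,\mathrm{sign}(\omega)$. The function names are ours.

```python
import math


def gauge_transform(omega, a, h):
    """Fourier transform (16) of χ(t) t^{a-1} e^{-ht}: Γ(a) / (2π (h + iω)^a)."""
    return math.gamma(a) / (2 * math.pi * complex(h, omega) ** a)


def normalized_parts(omega, a, h):
    """Return (Re g̃, Im g̃) multiplied by 2π |ω|^a / Γ(a)."""
    scale = 2 * math.pi * abs(omega) ** a / math.gamma(a)
    value = gauge_transform(omega, a, h) * scale
    return value.real, value.imag


# Fractional a: both parts survive. Odd integer a: the even part drops out.
for a in (1.5, 3.0):
    print(a, normalized_parts(1e5, a, 1.0),
          (math.cos(a * math.pi / 2), -math.sin(a * math.pi / 2)))
```

For $a = 3$ the rescaled real part tends to zero, so the even component has asymptotics of order smaller than $1/\omega^3$.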

Remark 5. In engineering and physical applications, these asymptotic formulas
are much more useful and important than the exact formulas (20). Formulas (20)
give Fourier transforms of a narrow class of gauge functions, whereas formulas (22)
describe asymptotics of Fourier transforms of a much broader class of functions
which, at some arbitrary instants of time $t_k$, have local singularities $\sim (t - t_k)^{a-1}$.

Remark 6. For odd values of $a$, the cosine in asymptotic formula (22a) becomes
0, and for even $a$ the sine in (22b) vanishes. This means that the asymptotics
of the corresponding functions is of order smaller than $1/\omega^a$. This phenomenon
is similar to the one already encountered for functions $C$ and $S$ in (4.3.4-5). Its
essence can be explained with the help of two functions: $\chi(t)t^2$ and $\chi(t)t^3$. The
former, extended to an odd function, becomes $\mathrm{sign}(t)\,t^2$, which has a discontinuity
of the second derivative, whereas its even extension is an infinitely differentiable
function $t^2$. The latter becomes infinitely differentiable under the odd extension
but has a discontinuity in the third derivative under the even extension.

The above functions have Fourier transforms only in the generalized, distributional
sense. However, the general principle stating that Fourier transforms of
infinitely differentiable functions decay, for $\omega \to \infty$, faster than any power of
$\omega$, extends to them as well. Consequently, the generalized Fourier transforms of
the even function $t^2$ and the odd function $t^3$ do not have power asymptotics. In
other words, all the coefficients in their asymptotic expansions in powers of $1/\omega$
are equal to 0. This will become clear when we recall the generalized Fourier
transforms (3.3.5) of the above two functions.
In the case of fractional $a$, the principal asymptotics, as $|\omega| \to \infty$, of Fourier
transforms of both even and odd parts of function (9) are the same and of order
$1/\omega^a$. Neither extension of function (9) to the negative half-line removes
its characteristic absence of smoothness at the origin. Here, the reader may feel
that the asymptotics of the Fourier transform of function (9), established rigorously
above for any $a$, contradicts geometric common sense, which seems to be telling us
that the graphs of even and odd components of (9) are qualitatively different (see
Fig. 4.4.3 (a) and (b)). The graph of the even component of

$$ g(t) = \chi(t)\sqrt{t}\,e^{-ht}, \tag{23} $$

with a characteristic cusp at $t = 0$, gives us an impression of a function that is
much less smooth than that of the odd component, although the latter also has a
vertical slope at $t = 0$. Nevertheless, in spite of our geometric intuition, both of
them have the same Fourier transform asymptotics $1/\omega^{3/2}$. The Fourier transform
of function (23) can be calculated explicitly with the help of formula (20), but an
alternative derivation provided below is more direct.

FIGURE 4.4.3
Graphs of even (a) and odd (b) components of function $g(t) = \chi(t)\sqrt{t}\,e^{-ht}$,
corresponding to $a = 3/2$.

Since $\Gamma(3/2) = (1/2)! = \sqrt{\pi}/2$, it follows from (16) that

$$ \tilde g(\omega) = \sqrt{\frac{1}{\pi h^3}}\,\frac{1}{4(1 + i\gamma)\sqrt{1 + i\gamma}}, \qquad \gamma = \omega/h. $$

Separating the real and imaginary parts is easy if we use algebraic identities instead
of the general relation (22) which contains trigonometric functions. Squaring both
sides of equality $\sqrt{1 + i\gamma} = x + iy$, and solving the resulting algebraic equations
with respect to $x$ and $y$, we obtain the main branch of the radical

$$ \sqrt{1 + i\gamma} = \sqrt{\frac{\sqrt{1 + \gamma^2} + 1}{2}} + i\,\mathrm{sign}(\gamma)\sqrt{\frac{\sqrt{1 + \gamma^2} - 1}{2}}. $$

Further, standard but tedious calculations give that $\tilde g(\omega)$ is equal to

$$ \frac{1}{4\sqrt{\pi h^3}}\,\frac{1}{(1 + \gamma^2)^{3/2}}\sqrt{\frac{1 + \sqrt{1 + \gamma^2}}{2}} \times \left[\left(1 - \frac{\gamma^2}{1 + \sqrt{1 + \gamma^2}}\right) - i\gamma\left(1 + \frac{1}{1 + \sqrt{1 + \gamma^2}}\right)\right]. $$
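Since the algebra leading to the last expression is tedious, a numerical cross-check is reassuring. The sketch below compares formula (16) for $a = 3/2$, evaluated with Python's principal-branch complex power, against a purely algebraic form built from the bracketed real and imaginary parts with the prefactor $\sqrt{(1 + \sqrt{1 + \gamma^2})/2}\,/\,[4\sqrt{\pi h^3}(1 + \gamma^2)^{3/2}]$. That prefactor is our reconstruction from (16), and the function names are hypothetical.

```python
import math


def gauge_transform_direct(omega, h):
    """Formula (16) with a = 3/2: Γ(3/2) / (2π (h + iω)^{3/2}), principal branch."""
    return math.gamma(1.5) / (2 * math.pi * complex(h, omega) ** 1.5)


def gauge_transform_algebraic(omega, h):
    """The same quantity written with square roots only, using γ = ω/h."""
    gamma_ratio = omega / h
    root = math.sqrt(1 + gamma_ratio ** 2)
    prefactor = math.sqrt((1 + root) / 2) / (
        4 * math.sqrt(math.pi * h ** 3) * (1 + gamma_ratio ** 2) ** 1.5)
    real_part = 1 - gamma_ratio ** 2 / (1 + root)
    imag_part = -gamma_ratio * (1 + 1 / (1 + root))
    return prefactor * complex(real_part, imag_part)


# The two columns should agree for positive, zero and negative frequencies.
for omega in (-3.0, 0.0, 0.7, 10.0):
    print(omega, gauge_transform_direct(omega, 2.0), gauge_transform_algebraic(omega, 2.0))
```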

4.5 Generalized Fourier transforms of power functions


In the previous section we demonstrated (4.4.15-16) that

$$ \chi(t)\,t^{a-1}e^{-ht} \longmapsto \frac{\Gamma(a)}{2\pi(h + i\omega)^a}. $$

In the limit $h \to 0$ the above formula generates the whole family of generalized
Fourier transforms, many of them not encountered thus far. We shall study them
in this section.
Recall (Section 3.2) that $\tilde g(\omega)$ is said to be a generalized Fourier transform of
function $g(t)$ if, for any test function $\varphi(t) \in S$ and its Fourier transform $\tilde\varphi(\omega)$
(also in $S$),

$$ \int \tilde g(\omega)\tilde\varphi(\omega)\,d\omega = \frac{1}{2\pi}\int g(t)\varphi(-t)\,dt. $$

Note that the above formula corresponds to formula (3.2.6) with $\tau = 0$, but is
sufficient to uniquely determine distribution $\tilde g(\omega)$.
By definition, the Fourier transform of function $g$ from (4.4.15) is equal to

$$ \tilde g(\omega; a, h) = \frac{1}{2\pi}\int_0^\infty t^{a-1}e^{-ht - i\omega t}\,dt. \tag{1} $$

Multiplication of both sides of (1) by a function $\tilde\varphi \in S$, and integration over the
entire $\omega$-axis, gives that

$$ 2\pi\int \tilde g(\omega; a, h)\tilde\varphi(\omega)\,d\omega = \int_0^\infty t^{a-1}e^{-ht}\left[\int \tilde\varphi(\omega)e^{-i\omega t}\,d\omega\right]dt. $$

The integral in the brackets is equal to function $\varphi(-t) \in S$ which is absolutely
integrable and rapidly (that is, faster than any power) decreases to 0 at infinity. For
that reason, the integral on the right-hand side of the equality

$$ 2\pi\int \tilde g(\omega; a, h)\tilde\varphi(\omega)\,d\omega = \int_0^\infty t^{a-1}e^{-ht}\varphi(-t)\,dt $$

exists in the classical sense, for any $h \ge 0$ and $a > 0$. In particular, for $h = 0$ we
arrive at a symbolic equality

$$ 2\pi\int \tilde g(\omega; a)\tilde\varphi(\omega)\,d\omega = \int_0^\infty t^{a-1}\varphi(-t)\,dt. \tag{2} $$

The left-hand side is a functional of a product of the generalized Fourier transform
$\tilde g(\omega; a)$ and a test function $\tilde\varphi(\omega)$. The right-hand side is a regular linear continuous
functional of function $\varphi \in S$. Hence, for $h \to 0+$, the distribution $\tilde g(\omega; a, h)$
weakly converges to the distribution $\tilde g(\omega; a) \in S'$, which is determined as a
functional on $S$ by the right-hand side of equality (2). For this distribution, in
analogy with (4.4.16), we employ a symbolic notation

$$ \tilde g(\omega; a) = \frac{\Gamma(a)}{2\pi(i\omega + 0)^a} \qquad (a > 0). $$

The above distribution is a generalized Fourier transform of function $\chi(t)t^{a-1}$, and
we record this fact in the form of relation

$$ \chi(t)\,t^{a-1} \longmapsto \frac{\Gamma(a)}{2\pi(i\omega + 0)^a} \qquad (a > 0). \tag{3} $$

Let us establish some properties of the distribution $\tilde g(\omega; a)$. Differentiating both
sides of (1) $m$ times with respect to $\omega$ we arrive at

$$ (i)^m\frac{d^m}{d\omega^m}\,\tilde g(\omega; a, h) = \tilde g(\omega; a + m, h), $$

which, as $h \to 0$, becomes (in the sense of weak convergence) a distributional
equality

$$ (i)^m\frac{d^m}{d\omega^m}\,\tilde g(\omega; a) = \tilde g(\omega; a + m). \tag{4} $$

It shows that, for values $a > 1$, one can express distributions $\tilde g(\omega; a)$ through
distributions $\tilde g(\omega; a)$ with values of the parameter $a$ in the interval $0 < a \le 1$.

The case $a = 1$; Fourier transform of the Heaviside function. The limit case of
$a = 1$, when

$$ \tilde g(\omega; 1) = \frac{1}{2\pi(i\omega + 0)}, \tag{5} $$

has to be considered separately. It is convenient to establish the functional action
of this distribution on test functions by a detailed analysis of the corresponding
regular Fourier transforms

$$ \tilde g(\omega; 1, h) = \frac{1}{2\pi(i\omega + h)} $$

for $h > 0$. Separating their real and imaginary parts we get that

$$ \tilde g(\omega; 1, h) = \frac{1}{2\pi}\,\frac{h}{h^2 + \omega^2} - \frac{i}{2\pi}\,\frac{\omega}{h^2 + \omega^2}. $$

The regular limit, as $h \to 0$, of functions on the right-hand side does not exist.
However, this is not a serious obstacle as we are interested in the weak convergence,
that is, in the result of integration of these functions against an arbitrary test function
$\tilde\varphi(\omega)$. In that sense, the real part, which is the familiar Lorentz function, converges
to $\delta(\omega)/2$. The imaginary part, when integrated against an arbitrary smooth and
absolutely integrable function $\tilde\varphi(\omega)$, gives that

$$ \lim_{h \to 0}\int \frac{\omega\,\tilde\varphi(\omega)}{h^2 + \omega^2}\,d\omega = PV\!\int \frac{\tilde\varphi(\omega)}{\omega}\,d\omega, $$

where $PV\!\int$ stands for the principal value of the integral. At this point, we will
not dwell on the notion of the principal value of the integral as it is going to be
discussed in depth (together with its physical applications) in Chapter 6. We shall
only show that the above principal value integral induces a new distribution which,
symbolically, will be denoted

$$ PV\,\frac{1}{\omega}. $$

In this notation, distribution $\tilde g(\omega; 1)$ is given by equality

$$ \tilde g(\omega; 1) = \frac{1}{2}\,\delta(\omega) + \frac{1}{2\pi}\,PV\,\frac{1}{i\omega}, \tag{6} $$

which is often written in the form

$$ \frac{1}{i\omega + 0} = PV\,\frac{1}{i\omega} + \pi\delta(\omega). \tag{7} $$

For $a = 1$, our general original function $\chi(t)t^{a-1}$ degenerates to the Heaviside
function $\chi(t)$, so that (6) becomes the generalized Fourier transform of the Heaviside
function. In this case, its real part is equal to the Fourier transform of the even
component of the Heaviside function, which is simply equal to 1/2, and the imaginary
part is the Fourier transform of the odd component, which is $\mathrm{sign}(t)/2$.
Utilizing the recurrence formulas (4), for any integer $m$, we can express the
generalized Fourier transforms of functions $\chi(t)t^m$ via the Fourier transform of
the Heaviside function:

$$ \tilde g(\omega; m + 1) = (i)^m\frac{d^m}{d\omega^m}\,\tilde g(\omega; 1). $$

For $m = 2$, this formula gives the Fourier transform

$$ \tilde g(\omega; 3) = -\frac{1}{2}\,\delta''(\omega) + \frac{i}{2\pi}\,\frac{d^2}{d\omega^2}\,PV\,\frac{1}{\omega} $$

of function $\chi(t)t^2$ discussed in Remark 4.4.6. Its real part is the generalized
Fourier transform of the even component $t^2/2$, an infinitely differentiable function.
It is identically equal to 0 for arbitrary $\omega \ne 0$ and, as was discussed before, does
not have a power asymptotics for $\omega \to \infty$.
The case $a = 0$; Fourier transform of function $\chi(t)/t$: a physical approach.
The case of the Fourier transform of function $\chi(t)/t$, corresponding to the value
$a = 0$, has to be considered separately from other generalized Fourier transforms
of power functions. Copying formally the approach that was so successful in
the analysis of generalized Fourier transforms for $a > 0$, we will try to determine
the Fourier transform of $\chi(t)/t$ as the weak limit (for $h \to 0$) of the integral

$$ \tilde g(\omega; 0, h) = \frac{1}{2\pi}\int_0^\infty \frac{1}{t}\exp(-ht - i\omega t)\,dt. \tag{8} $$

Just a passing glance at (8) permits an observation that, for any $\omega$ and any $h$, the
integral on the right-hand side is infinite, in view of the nonintegrable singularity
at the lower limit. Nevertheless, neither mathematicians nor physicists throw up
their hands in despair in such a situation. Mathematicians introduce new distributions
that assign well defined values to integrals (8). Physicists quote additional
physical arguments, which also give a finite answer. The ideas of mathematicians
and physicists, although different in details and method of argumentation, are similar
in essence. Various ways of computing integrals of type (8) can be grouped
under a unifying umbrella of renormalization techniques. Most often, renormalization
techniques are applied in quantum electrodynamics where the computation
of physical quantities by the perturbation method leads to divergent integrals.
Without reference to the physical processes that are described by function $\chi(t)/t$,
we will use only very general renormalization ideas. A reasonably accurate measurement
of the Fourier transform of a real physical process at frequency $\omega$ requires
much longer time than the period $T = 2\pi/\omega$ of the corresponding oscillation. For
that reason, the Fourier transform at frequency $\omega = 0$ is not an observable quantity
as it would require for its measurement an infinitely long observation interval.
Hence, it is natural to exclude it from considerations, reading its value off other
values of the Fourier transform. For $\omega = 0$, the Fourier transform (8)

$$ \tilde g(0; 0, h) = \frac{1}{2\pi}\int_0^\infty \frac{1}{t}\exp(-ht)\,dt $$

is infinite. Nevertheless, subtracting it from (8), we arrive at a renormalized Fourier
transform

$$ \tilde f(\omega; h) = \frac{1}{2\pi}\int_0^\infty \frac{1}{t}\,e^{-ht}\left[e^{-i\omega t} - 1\right]dt, \tag{9} $$

which assumes finite values and correctly reflects the dependence of the Fourier
transform $\tilde g$ on frequency.
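Passing in (9) to a new variable of integration $\tau = \omega t$, we get

$$ \tilde f(\omega; h) = \frac{1}{2\pi}\int_0^\infty \frac{1}{\tau}\,e^{-\mu\tau}\left[e^{-i\tau} - 1\right]d\tau, $$

where $\mu = h/\omega$ is a dimensionless parameter. Differentiating both sides of the
above equality with respect to that parameter, we obtain that

$$ \frac{d\tilde f}{d\mu} = -\frac{1}{2\pi}\int_0^\infty \left[e^{-\tau(\mu + i)} - e^{-\tau\mu}\right]d\tau = \frac{1}{2\pi}\left[\frac{1}{\mu} - \frac{1}{\mu + i}\right]. $$

Utilizing the fact that if $\mu \to \infty$ then $\tilde f \to 0$, we can compute $\tilde f$ from

$$ \tilde f = -\frac{1}{2\pi}\int_\mu^\infty \left[\frac{1}{s} - \frac{1}{s + i}\right]ds = \mathrm{Re}\,\tilde f + i\,\mathrm{Im}\,\tilde f, \tag{10} $$

where

$$ \mathrm{Re}\,\tilde f = \frac{1}{2\pi}\ln\frac{|\mu|}{\sqrt{1 + \mu^2}}, \tag{11a} $$

$$ \mathrm{Im}\,\tilde f = -\frac{1}{4}\left[1 - \frac{2}{\pi}\,\mathrm{arctg}(|\mu|)\right]\mathrm{sign}(\mu). \tag{11b} $$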

The absolute value under the logarithm and function $\mathrm{sign}(\mu)$ in the imaginary part
make these expressions valid for negative $\mu$ as well.
Now, to find a generalized renormalized Fourier transform of function $\chi(t)/t$,
it suffices to let $h \to 0$ in expressions (10) and (11). At the beginning, we will do
that for the imaginary part. Observing that in this case $\mu \to 0+$ for $\omega > 0$, and
$\mu \to 0-$ for $\omega < 0$, we get that

$$ \mathrm{Im}\,\tilde f = -\frac{1}{4}\,\mathrm{sign}(\omega). $$

The real part of the renormalized Fourier transform of $\chi(t)/t$ requires a more thoughtful
treatment. Recall that $\mu = h/\omega$, and observe that $h$ in the first equality in (11)
cannot be taken to converge to 0 as its right-hand side then becomes infinite. For
that reason, we will impose a restriction $\omega \gg h$. Then $\mu \ll 1$, and we can utilize
a simpler approximate expression

$$ \mathrm{Re}\,\tilde f = \frac{1}{2\pi}\ln(|\mu|) = -\frac{1}{2\pi}\ln(|\omega|) + C, \qquad C = \frac{1}{2\pi}\ln(h). $$

The constant $C$ above will be called the calibrating constant, as its value should
be selected on the basis of comparison with results of the measurements and the
choice of a frequency scale.
Combining the last two formulas we arrive at a remarkable relation

$$ \frac{\chi(t)}{t} \longmapsto -\frac{1}{2\pi}\ln(|\omega|) + C - \frac{i}{4}\,\mathrm{sign}(\omega). \tag{12} $$
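The renormalized integral (9) is perfectly regular, so it can be evaluated by brute force and compared with the closed form $\mathrm{Re}\,\tilde f = \frac{1}{2\pi}\ln(|\mu|/\sqrt{1 + \mu^2})$, $\mathrm{Im}\,\tilde f = -\frac{1}{4}[1 - \frac{2}{\pi}\arctan|\mu|]\,\mathrm{sign}(\mu)$, $\mu = h/\omega$. This is a minimal sketch with hypothetical function names; the truncation point `t_max` assumes $h$ of order one.

```python
import cmath
import math


def renormalized_numeric(omega, h, t_max=40.0, n=40000):
    """Integral (9): (1/2π) ∫_0^∞ e^{-ht} (e^{-iωt} - 1) / t dt by the midpoint rule."""
    step = t_max / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * step
        total += math.exp(-h * t) * (cmath.exp(-1j * omega * t) - 1) / t
    return total * step / (2 * math.pi)


def renormalized_closed(omega, h):
    """Real and imaginary parts in closed form, with μ = h/ω (ω must be nonzero)."""
    mu = h / omega
    real = math.log(abs(mu) / math.sqrt(1 + mu * mu)) / (2 * math.pi)
    imag = -0.25 * (1 - (2 / math.pi) * math.atan(abs(mu))) * math.copysign(1.0, mu)
    return complex(real, imag)


# Quadrature against the closed form, for both signs of the frequency.
for omega in (2.0, -3.0, 0.5):
    print(omega, renormalized_numeric(omega, 1.0), renormalized_closed(omega, 1.0))
```

Letting $h \to 0$ in the closed form reproduces the two ingredients of (12): the imaginary part tends to $-\mathrm{sign}(\omega)/4$, and the real part differs from $-\ln(|\omega|)/2\pi$ by the calibrating constant $\ln(h)/2\pi$.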

The above "physical" approach to calculating the Fourier integral proved successful
even in the case which, taken at its face value, diverged for any frequency.
The case $a = 0$; Fourier transform of $1/t$: a mathematical approach. We shall
now show how one can deal with this situation from the mathematical viewpoint.
For simplicity, we shall restrict ourselves to a calculation of the Fourier transform
of an even function $1/|t|$. In this situation, one defines a new distribution

$$ T = PV\,\frac{1}{|t|}, $$

directly as a functional on test functions. Its continuity will be guaranteed by an
exclusion of singularities in the corresponding integral. In the case under consideration,
this functional is defined by the equality

$$ T[\varphi] = \int_{-1}^{1}\frac{\varphi(t) - \varphi(0)}{|t|}\,dt + \int_{|t| > 1}\frac{\varphi(t)}{|t|}\,dt. \tag{13} $$

Such regularization of divergent integrals is justified, from the viewpoint of
mathematicians, by the fact that for any test function $\varphi(t) \in S$ with $\varphi(0) = 0$, the value
of the functional $T[\varphi]$ (13) coincides with the original integral $\int \varphi(t)/|t|\,dt$.
Let $\tilde T(\omega)$ be the Fourier transform of our distribution. By Parseval formula
(3.2.6) (for $\tau = 0$), it has to satisfy equality

$$ \tilde T[\tilde\varphi] = \frac{1}{2\pi}\int PV\,\frac{1}{|t|}\,\varphi(-t)\,dt = \frac{1}{2\pi}\left[\int_{-1}^{1}\frac{\varphi(-t) - \varphi(0)}{|t|}\,dt + \int_{|t| > 1}\frac{\varphi(-t)}{|t|}\,dt\right]. $$

If we transform the integrals in the brackets by expressing test function $\varphi$
in terms of its own Fourier transform, and change the order of integration, we get
that

$$ \tilde T[\tilde\varphi] = \frac{1}{\pi}\int \tilde\varphi(\omega)\left[\int_0^1 \frac{\cos(\omega t) - 1}{t}\,dt + \int_1^\infty \frac{\cos(\omega t)}{t}\,dt\right]d\omega. $$

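Passing to the new variable of integration $\tau = \omega t$ in the inner integrals gives

$$ \tilde T[\tilde\varphi] = -\frac{1}{\pi}\int \tilde\varphi(\omega)\left[\mathrm{Cin}(\omega) + \mathrm{Ci}(\omega)\right]d\omega, $$

where

$$ \mathrm{Ci}(z) = -\int_z^\infty \frac{\cos(s)}{s}\,ds $$

is a special function called the integral cosine, and

$$ \mathrm{Cin}(z) = \int_0^z \frac{1 - \cos(s)}{s}\,ds $$

is another related special function. Neither of them, separately, can be expressed
in terms of elementary functions, but their sum, up to a constant, is equal to the
logarithmic function. Indeed, if we write function Cin of a real argument $\omega$ in the
form

$$ \mathrm{Cin}(\omega) = \int_0^\omega \frac{1 - \cos s}{s}\,ds = \int_0^1 \frac{1 - \cos s}{s}\,ds + \int_1^\omega \frac{1 - \cos s}{s}\,ds, $$

and the last integral above in the form

$$ \int_1^\omega \frac{1 - \cos s}{s}\,ds = \int_1^\omega \frac{ds}{s} - \int_1^\omega \frac{\cos s}{s}\,ds = \ln(|\omega|) - \int_1^\infty \frac{\cos s}{s}\,ds - \mathrm{Ci}(\omega), $$

then we see that

$$ \mathrm{Cin}(\omega) = \ln(|\omega|) + \gamma - \mathrm{Ci}(\omega), $$

where constant

$$ \gamma = \int_0^1 \frac{1 - \cos s}{s}\,ds - \int_1^\infty \frac{\cos s}{s}\,ds. $$

One can prove that constant $\gamma$ coincides with the Euler constant (4.1.5). Hence

$$ \mathrm{Cin}(\omega) + \mathrm{Ci}(\omega) = \ln(|\omega|) + \gamma, $$

which gives the following addition to our tables of generalized Fourier transforms:

$$ PV\,\frac{1}{|t|} \longmapsto -\frac{1}{\pi}\left[\ln(|\omega|) + \gamma\right]. $$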
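The identification of the constant $\gamma = \int_0^1 \frac{1 - \cos s}{s}ds - \int_1^\infty \frac{\cos s}{s}ds$ with the Euler constant can be made plausible numerically. In the sketch below, the conditionally convergent second integral is first integrated by parts twice (our own device, not part of the argument above), and the reference value of the Euler constant is hard-coded; all names are hypothetical.

```python
import math

EULER_CONSTANT = 0.5772156649015329  # reference value of Euler's constant


def midpoint(f, a, b, n):
    """Composite midpoint rule for ∫_a^b f(s) ds."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))


def gamma_from_integrals(s_max=200.0):
    """γ = ∫_0^1 (1 - cos s)/s ds - ∫_1^∞ cos(s)/s ds.

    The second integral converges only conditionally, so it is integrated by
    parts twice: ∫_1^∞ cos(s)/s ds = -sin 1 + cos 1 - 2 ∫_1^∞ cos(s)/s³ ds,
    and the absolutely convergent remainder is truncated at s_max.
    """
    first = midpoint(lambda s: (1 - math.cos(s)) / s, 0.0, 1.0, 2000)
    remainder = midpoint(lambda s: math.cos(s) / s ** 3, 1.0, s_max, 200000)
    second = -math.sin(1.0) + math.cos(1.0) - 2 * remainder
    return first - second


print(gamma_from_integrals(), EULER_CONSTANT)
```

The truncation error of the remainder is of order $1/s_{\max}^3$, so the two printed numbers should agree to about six decimal places.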

4.6 Discontinuities of the second kind


Let us return to one of the main topics of this chapter: analysis of the asymptotics
of Fourier transforms of functions with discontinuities of the second kind through
a study of the gauge functions (4.4.9) and (4.4.15). Consider the integral

$$ \int_0^\infty f(t)e^{-i\omega t}\,dt, $$

where, for $t > 0$, $f(t)$ is a sufficiently smooth function, and for $t \to 0$, it has a
singularity of the type

$$ f(t) \sim A\,t^{a-1}, \qquad a > 0. $$

To make the situation more concrete assume that $f(t) \equiv 0$ for $t < 0$, so that the
above integral is equal, up to a factor of $1/2\pi$, to the Fourier transform of function
$f(t)$. In the previous two sections, a detailed analysis of gauge functions (4.4.9)
and (4.4.15) gave us some hints that the Fourier transform $\tilde f(\omega)$ of function $f(t)$
has, for $\omega \to \infty$, principal asymptotics of the order $1/\omega^a$. This fact has yet to be
proved rigorously, and explicit necessary conditions on function $f(t)$ have to be
spelled out.
We will adopt a "patching-up" method which relies on removing singularities
from the original function by superposing on it a "patching-up" function with the
same singularity. More exactly, we will consider an auxiliary function

$$ v(t) = f(t) - A\,g(t; a, h). \tag{1} $$

Since $g(t; a, h) \sim t^{a-1}$ $(t \to 0)$, function $v(t)$ is of order smaller than $f(t)$,
that is, $v(t) = o\{t^{a-1}\}$ $(t \to 0)$. Hence, it is natural to expect asymptotics of its
Fourier transform to be of order smaller than the expected principal asymptotics of
the Fourier transform $\tilde f(\omega)$, that is, $\tilde v(\omega) = o\{1/\omega^a\}$ $(\omega \to \infty)$. If this is indeed
the case, then the principal asymptotics of the Fourier transform of function

$$ f(t) = A\,g(t; a, h) + v(t) $$

coincides with the main asymptotics (4.4.17) of the gauge (or patching-up) function
$g(t; a, h)$ multiplied by $A$, and the Fourier transform of function $f(t)$ has
asymptotics

$$ \tilde f(\omega) = \frac{A\,\Gamma(a)}{2\pi(i\omega)^a} + o\left\{\frac{1}{\omega^a}\right\} \qquad (\omega \to \infty). \tag{2} $$

The rigorous proof is based on the following modification of the more general
Riemann-Lebesgue Lemma:
Let function $v(t)$ be continuous for $t > 0$ and absolutely integrable on the infinite
interval $(0, \infty)$. Furthermore, assume the same is true for all the derivatives of
$v(t)$ of order up to $n = \lfloor a + 1 \rfloor \ge a > 0$. Additionally, assume that, for $t \to 0$,
function $v(t)$, and all its derivatives up to order $n$, satisfy the asymptotic relations

$$ v^{(m)}(t) = o\{t^{a-m-1}\} \qquad (t \to 0). \tag{3} $$

Then the integral

$$ F(\omega) = \int_0^\infty v(t)e^{-i\omega t}\,dt \tag{4} $$

has the asymptotics

$$ F(\omega) = o\left\{\frac{1}{\omega^a}\right\} \qquad (|\omega| \to \infty). \tag{5} $$

First, let us sketch a proof of this modification. Asymptotic relations (3) are
equivalent to the following statement: For any $\varepsilon > 0$ we can find a $\kappa > 0$ such
that, for $t < \kappa$,

$$ |v^{(m)}(t)| < \varepsilon\,t^{a-m-1}, \qquad m = 0, 1, 2, \ldots, n. \tag{6} $$

We shall apply these inequalities to the integral (4). Select $\omega > 1/\kappa$, and split the
interval of integration into subintervals $(0, 1/\omega)$ and $(1/\omega, \infty)$. For the integral
over the first subinterval,

$$ \left|\int_0^{1/\omega} v(t)e^{-i\omega t}\,dt\right| \le \int_0^{1/\omega}|v(t)|\,dt \le \varepsilon\int_0^{1/\omega} t^{a-1}\,dt = \frac{\varepsilon}{a}\,\frac{1}{\omega^a}, \tag{7} $$

which means that this piece of integral (4) is, as $|\omega| \to \infty$, of order smaller than
$1/\omega^a$.
by repeated integration by parts. The first integration by parts gives that

1 l/fJJ
00 .
v(t)e-1fJJtdt e- i ( -1 )
= -.-v
la> a>
+ -;-
1 1 00

la> l/fJJ
.
v'(t)e-1fJJtdt.

In view of (6), the first summand on the right-hand side is ~ ela>a and, as a
result, it is of order smaller than 1/a>a for 1a>1 ~ 00. Repeating the integration by
parts another n - 1 times, and making a similar observation to the effect that the
nonintegral terms are o{1/a>a}, we are lead in the end to an asymptotic equality

1 00 v(t)e- 1fJJt
. dt = ( 1
-;- )n 1 v(n) (t)e-1fJJt dt +
00
. 0 { -1 }.
l/fJJ la> l/fJJ wa

Let us split the integral on the right-hand side

In view of (6), the contribution of the first summand

1
( -;- )n 1 1//( I
v(n)(t)e-ifJJtdt ~ _IE 1 1//(
t a - n- 1dt < - IE- . -1.
la> l/fJJ ron l/fJJ n- CJt a>a

{2..} .
So
(00 v (t)e- ifJJt dt = (~)n (00 V(n) (t)e-ifJJt dt +0
J1/fJJ la> A//( wa

It follows from the Riemann-Lebesgue Lemma of Section 4.2 that the remaining
integral, which contains a continuous and absolutely integrable function $v^{(n)}(t)$,
converges to 0 as $|\omega| \to \infty$. This, together with (7), proves the validity of asymptotic
relation (5). •

Now, the proof of (2) immediately follows from the above modification of the
Riemann-Lebesgue Lemma, since the auxiliary function v(t) in (1) satisfies con-
ditions of the lemma.

Example 1. Consider the asymptotics of the Fourier transform of function

    $f(t) = \exp(-|t|^{\alpha})$,   $\alpha > 0$.   (8)

In contrast to functions considered above, for $t = 0$, it assumes a non-zero finite
value $f(0) = 1$. As a result, the auxiliary integral

    $\int_0^{\infty} f(t)\, e^{-i\omega t}\, dt$

has the principal asymptotics of the order $O(1/\omega)$ which is absent in the actual
asymptotics of the full Fourier transform of function (8). To exclude this asymptotics
and to find the asymptotic behavior of the full Fourier transform, first consider
the derivative of function (8)

    $f'(t) = -\alpha |t|^{\alpha-1} \exp(-|t|^{\alpha})\, \mathrm{sign}(t)$.   (9)

Let us form an auxiliary function equal to 0 for $t < 0$ and, for $t > 0$, given by
equality

    $v(t) = f'(t) + \alpha\, g(t; \alpha, h)$.

It is easy to see that it satisfies all the requirements of the lemma proved above.
Hence, the principal asymptotics, for $|\omega| \to \infty$, of the Fourier transform of the
"one-sided" derivative $f'(t)\chi(t)$ is described by formula (2) with $A = -\alpha$. The
actual derivative (9) of the original function (8) is an odd function of $t$. Consequently,
the asymptotics of its Fourier transform is $2i$ times the imaginary part of the
asymptotics (2), that is

    $2i\alpha\, \frac{\Gamma(\alpha) \sin(\pi\alpha/2)}{2\pi |\omega|^{\alpha}}\, \mathrm{sign}(\omega)$.

Finally, the asymptotics of the Fourier transform of the original function (8) can be
found by dividing the above expression by $i\omega$, thus obtaining

    $\tilde f(\omega) \sim \frac{\Gamma(\alpha + 1) \sin(\pi\alpha/2)}{\pi |\omega|^{\alpha+1}}$.   (10)
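As a sanity check on (10), one can compare it with a case where the Fourier transform is known in closed form. The sketch below assumes the convention $\tilde f(\omega) = \frac{1}{2\pi}\int f(t) e^{-i\omega t} dt$ used throughout this chapter; for $\alpha = 1$ the transform of $\exp(-|t|)$ is then $1/[\pi(1+\omega^2)]$. The function names are illustrative only.

```python
import math

def exact_transform_alpha1(omega):
    """Closed-form (1/2pi) * int exp(-|t|) exp(-i*omega*t) dt, i.e. the case alpha = 1."""
    return 1.0 / (math.pi * (1.0 + omega ** 2))

def asymptotic_10(omega, alpha):
    """Right-hand side of (10): Gamma(alpha+1) sin(pi*alpha/2) / (pi |omega|^(alpha+1))."""
    return (math.gamma(alpha + 1.0) * math.sin(math.pi * alpha / 2.0)
            / (math.pi * abs(omega) ** (alpha + 1.0)))

# Ratio exact / asymptotic for growing omega; it should tend to 1.
ratios = [exact_transform_alpha1(w) / asymptotic_10(w, 1.0) for w in (10.0, 100.0, 1000.0)]
for w, ratio in zip((10.0, 100.0, 1000.0), ratios):
    print(w, ratio)
```

For $\alpha = 2$ the factor $\sin(\pi\alpha/2)$ vanishes, which agrees with the fact that the Gaussian $\exp(-t^2)$ is smooth at the origin and its transform decays faster than any power.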

4.7 Exercises
1. Find the main power asymptotics (as $x \to 0$) of function $f(x) = (\sin 2x - 2\sin x)/x$.

2. Investigate asymptotic behavior, as $x \to 0$, and as $x \to \infty$, of function $f(x) = (x - \tanh x)/x^2$.

3. Investigate asymptotic behavior, as $x \to 0$, of function $f(x) = (1 - x \cot x)/x$.

4. Utilizing the answer to the above exercise, find the asymptotic behavior, as $a \to \infty$, of the
root of the transcendental equation $a - x \cot x = 0$, $0 < x < \pi/2$.

5. Investigate asymptotic behavior, as $N \to \infty$, of the expression $f(N) = \prod_{n=1}^{N} (1 + a_n/N)$,
where $\{a_n\}$ is a bounded sequence.

6. Find the asymptotics (as $N \to \infty$) of expression $\prod_{n=1}^{N} (1 + \Delta\, \varphi(n\Delta))$, where
$\Delta = t/N$, and $\varphi(\tau)$ is a function integrable in the interval $\tau \in (0, t)$. Provide an upper
estimate of the remainder term in the obtained asymptotic formula.

7. Determine the character of convergence to 0, as $\omega \to \infty$, of the following two
integrals:

S(w) = rooSinwtSin(
10 atp4 )dt,C(W)=
1+ t 10roo coswtSin(~)dt.
1 + pt

8. Find the principal asymptotics, for $\omega \to \infty$, of the Fourier transform of function
$f(t) = \exp(-\alpha |t|^3)$, and evaluate the infinitesimal order of the remainder term.

9. Assume that $f(t)$ is an even, infinitely differentiable function on the interval
$t \in (-\tau, \tau)$, which is identically equal to zero outside this interval. Furthermore, suppose that
there exists limit $\lim_{t \to \tau - 0} f(t)/(\tau - t)^n = A$, where $A > 0$ and $n$ is a positive integer.
Find the principal asymptotics of the Fourier transform of this function for $\omega \to \infty$.

10. Find the principal asymptotics, for $\omega \to \infty$, of the integral

    $J(\omega) = \int_{-1}^{1} \ln\left(\frac{2 + t^2}{1 + 2t^2}\right) \cos \omega t\, dt$.

11. Find the principal asymptotics, for $\omega \to \infty$, of the Fourier transform of function
$f(t)$, with the graph shown in Fig. 4.7.1.

FIGURE 4.7.1

12. What is the principal asymptotics (as $\omega \to \infty$) of the Fourier transform of function

    $f(t) = e^{-t^2} \begin{cases} 1, & t < 0; \\ 1 - (t/(1+t))^{\beta}, & t > 0, \end{cases}$   $(0 < \beta < 1)$.

Its graph, reminiscent of the profile of an ocean wave (or a sand dune), appears in Fig.
4.7.2.

FIGURE 4.7.2

13. Find the principal asymptotics, for $\omega \to \infty$, of the Fourier transform of the semicircle
function $f(t)$, equal to zero outside the interval $t \in (-1, 1)$, and equal to $\sqrt{1 - t^2}$
inside that interval.
Chapter 5
Stationary Phase and Related Methods

In this chapter we will use methods developed in Chapter 4 to provide a general


scheme for finding asymptotics. The remarkable Kelvin's method of stationary
phase will be employed as well.

5.1 Finding asymptotics: a general scheme


Consider the integral

    $I = I(x) = \int_0^{B} f(t) \exp[-ixp(t)]\, dt$,   (1)

where $f(t)$ is a continuously differentiable function on the interval $[0, B]$ and such
that $f(0) = f \ne 0$. The function $p(t)$ appearing in the exponent will be assumed
twice differentiable on $(0, B)$ and monotonically increasing with $p'(t) > 0$.
To apply to (1) the standard methods of asymptotic analysis described in Chapter
4, we will change the variable of integration to

    $s = p(t) - p(0)$.

Denote the monotonically increasing inverse of the above function by

    $t = q(s)$   $(q(Q) = B)$.

After this change of variables, the integral (1) assumes the familiar form of the Fourier
integral

    $I = \exp[-ixp(0)] \int_0^{Q} F(s)\, q'(s)\, e^{-ixs}\, ds$,

where $F(s) = f(q(s))$ is a continuously differentiable function on the interval $[0, Q]$
such that $F(0) = f \ne 0$. Let us investigate the asymptotic behavior of $I$ as
$x \to \infty$.
First, observe that the factor in front of the integral has modulus 1 and has no
effect on asymptotics. So, from now on, we will omit it (putting $p(0) = 0$) and
consider

    $I = \int_0^{Q} F(s)\, q'(s)\, e^{-ixs}\, ds$.   (2)

The function $q'(s)$ is also continuously differentiable on the interval $(0, Q)$. If,
in addition, it had a finite limit for $s \to 0$, then a further study of the integral $I$
would repeat the previously developed asymptotic analysis of Fourier images of
functions with discontinuities of the first kind.
More interesting, and physically more important, is the case of functions $p(t)$
which have the asymptotics

    $p(t) \sim P t^{\alpha}$   $(t \to 0)$   (3)

for a certain $\alpha > 0$, which we will consider in some detail. In this situation

    $q'(s) \sim G s^{\beta-1}$   $(s \to 0)$,   (4)

where

    $\beta = \frac{1}{\alpha}$,   $G = \beta P^{-\beta}$.

The above asymptotics ($s \to 0$) of the integrand and experience gained in the
previous chapter suggest the following asymptotics for the integral (2):

    $I \sim \Gamma(\beta)\, f\, G \left(\frac{1}{ix}\right)^{\beta} = \frac{f}{\alpha}\, \Gamma\!\left(\frac{1}{\alpha}\right) \left(\frac{1}{iPx}\right)^{1/\alpha}$   $(|x| \to \infty)$.   (5)

On the other hand, if $F(Q) = f(B) \ne 0$, then the behavior of the integrand
close to the upper limit gives the asymptotics

    $I \sim f(B)\, q'(Q)/ix$   $(|x| \to \infty)$,

which, for $\alpha > 1$, is of order smaller than (5). Thus, the main term of the asymptotic
expansion of integral (2) is given by formula (5). Reinserting the factor omitted
earlier, and returning to the notation of the original integral (1), we can finally write
that, for any $\alpha > 1$, as $|x| \to \infty$,

    $\int_A^{\infty} f(t) \exp[-ixp(t)]\, dt \sim \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left(\frac{1}{iPx}\right)^{1/\alpha} f(A) \exp[-ixp(A)]$.   (6)

The upper limit in (6) was deliberately set to be infinite to avoid distraction caused
by smaller order asymptotics generated by a finite upper limit. The lower limit
was kept arbitrary for the sake of generality.
Observe that the assumption that function p(t) be strictly increasing is not nec-
essary for the above result. A strict monotonicity suffices as the two cases can be
transformed into each other by a nonessential replacement of i into -i.
Let us indicate a number of consequences of formula (6) that are important in
applications:
(1) If function $p(t)$ is symmetric in the neighborhood of $A$, then the asymptotics

    $p(t) \sim P |t - A|^{\alpha}$,   (7)

for an $\alpha > 0$, implies the doubled asymptotics (6) for the integral with infinite
limits:

    $\int f(t) \exp[-ixp(t)]\, dt \sim 2\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left(\frac{1}{iPx}\right)^{1/\alpha} f(A) \exp[-ixp(A)]$.   (8)

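(2) If $p(t)$ is antisymmetric in the neighborhood of $A$, then the contributions of the two
sides of $A$ are complex conjugates of each other, so only the (doubled) real part of the
factor $(1/iPx)^{1/\alpha}$ is preserved in the asymptotics and

    $\int f(t) \exp[-ixp(t)]\, dt \sim 2\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \mathrm{Re}\left(\frac{1}{iPx}\right)^{1/\alpha} f(A) \exp[-ixp(A)]$.   (9)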
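(3) In terms of distribution theory the equality (8) means that, for any $\alpha > 1$,
the family of functions of variable $t$

    $\exp[-ixP|t - A|^{\alpha}] \Big/ \left[ 2\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left(\frac{1}{iPx}\right)^{1/\alpha} \right]$,   $x \in \mathbf{R}$,

weakly converges to the Dirac delta $\delta(t - A)$ as $x \to \infty$. The particular case
$\alpha = 2$ corresponds to the familiar function (1.3.5).
(4) If $f(t)$ is constant and $p(t) = P(t - A)^{\alpha}$, then the asymptotic relation (6)
becomes an exact equality. Specifying $f = 1$, $A = 0$, and $x = 1$, and separating
the real and imaginary parts we arrive at the following standard integral formulas
valid for $\alpha > 1$ and $P > 0$:

    $\int_0^{\infty} \cos(P t^{\alpha})\, dt = \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left(\frac{1}{P}\right)^{1/\alpha} \cos\left(\frac{\pi}{2\alpha}\right)$,   (10a)

    $\int_0^{\infty} \sin(P t^{\alpha})\, dt = \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left(\frac{1}{P}\right)^{1/\alpha} \sin\left(\frac{\pi}{2\alpha}\right)$.   (10b)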
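The standard integrals (10a) and (10b) are easy to evaluate with the gamma function from the Python standard library. The sketch below is only a check of the formulas as reconstructed above; it compares the case $\alpha = 2$, $P = 1$ with the classical value $\int_0^\infty \cos t^2\, dt = \int_0^\infty \sin t^2\, dt = \sqrt{\pi/8}$. The function names are illustrative.

```python
import math

def cos_integral_10a(P, alpha):
    """Right-hand side of (10a): value of int_0^inf cos(P t^alpha) dt for alpha > 1, P > 0."""
    return (math.gamma(1.0 / alpha + 1.0) * P ** (-1.0 / alpha)
            * math.cos(math.pi / (2.0 * alpha)))

def sin_integral_10b(P, alpha):
    """Right-hand side of (10b): value of int_0^inf sin(P t^alpha) dt for alpha > 1, P > 0."""
    return (math.gamma(1.0 / alpha + 1.0) * P ** (-1.0 / alpha)
            * math.sin(math.pi / (2.0 * alpha)))

# Classical Fresnel value for alpha = 2, P = 1.
fresnel_value = math.sqrt(math.pi / 8.0)
print(cos_integral_10a(1.0, 2.0), sin_integral_10b(1.0, 2.0), fresnel_value)
```

The scaling in $P$ can be read off as well: replacing $P = 1$ by $P = 4$ with $\alpha = 2$ halves both integrals, as the substitution $t \to t/2$ requires.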

5.2 Stationary phase method


Assumptions (5.1.3) and (5.1.7) which secured the above asymptotics may seem
artificially chosen, just to make mathematics rigorous. Actually, many of them, and
in particular the case $\alpha = 2$, emerge perfectly naturally in the physical phenomena.
Consider the integral

    $\int_A^{B} f(t) \exp[-ixp(t)]\, dt$,   (1)

where p (t) is an arbitrary function twice differentiable on the interval of integration.


Function $f(t)$ will be assumed continuously differentiable. It turns out that in this
fairly general situation the asymptotics of (1) corresponds to the special case $\alpha = 2$.
We shall begin the asymptotic ($x \to \infty$) analysis of (1) by finding stationary
points of $p(t)$ where

    $p'(t) = 0$.

Denote the roots of this equation by $\tau_m$, $m = 1, \dots, N < \infty$. Assume that all
of them correspond to simple extrema of function $p(t)$ with $p''(\tau_m) \ne 0$. In their
neighborhood, $p(t)$ has a parabolic behavior

    $p(t) \approx p(\tau_m) + \frac{1}{2} p''(\tau_m) (t - \tau_m)^2$.   (2)

Consequently, the integral (1) has automatically the symmetric asymptotics (5.1.7)
with

    $\alpha = 2$,   $P = p''(\tau_m)/2$.
Let us partition the interval of integration into disjoint intervals, each containing
just one of the stationary points. In our case $\alpha = 2 > 1$, and the asymptotics
contributed by the boundary points of the intervals are of order smaller than the
asymptotics generated by the stationary points (5.1.8). For this reason, in final
formulas only the latter appear. Summing contributions of all the stationary points
we arrive at the asymptotic ($x \to \infty$) formula

    $\int_A^{B} f(t) \exp[-ixp(t)]\, dt \sim \sum_{m=1}^{N} f(\tau_m) \sqrt{\frac{2\pi}{ixp''(\tau_m)}}\, \exp[-ixp(\tau_m)]$.   (3)

If some stationary points coincide with the endpoints of the interval of integration
then the corresponding summands will appear with coefficient 1/2.
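Formula (3) can be tried out on a concrete integral. The following sketch is an illustration under our own choice of example, not one taken from the text: $f(t) = 1$ and $p(t) = \sin t$ on $(-\pi, \pi)$, with stationary points $\tau = \pm\pi/2$ where $p''(\tau) = \mp 1$. The integrand is periodic, so a plain trapezoidal sum is used for the reference value, and the square root in (3) is taken on the principal branch via `cmath.sqrt`.

```python
import cmath
import math

def oscillatory_integral(x, n=4000):
    """Trapezoidal sum for int_{-pi}^{pi} exp(-i x sin t) dt (the integrand is 2pi-periodic)."""
    h = 2.0 * math.pi / n
    total = sum(cmath.exp(-1j * x * math.sin(-math.pi + k * h)) for k in range(n))
    return total * h

def stationary_phase_sum(x):
    """Right-hand side of (3) for f = 1 and p(t) = sin t on (-pi, pi)."""
    # Pairs (tau_m, p''(tau_m)) for the two simple stationary points of sin t.
    points = [(math.pi / 2.0, -1.0), (-math.pi / 2.0, 1.0)]
    return sum(cmath.sqrt(2.0 * math.pi / (1j * x * p2)) * cmath.exp(-1j * x * math.sin(tau))
               for tau, p2 in points)

for x in (50.0, 200.0):
    exact = oscillatory_integral(x)
    approx = stationary_phase_sum(x)
    amplitude = 2.0 * math.sqrt(2.0 * math.pi / x)  # size of the leading term
    print(x, exact.real, approx.real, abs(exact - approx) / amplitude)
```

The two contributions combine into $2\sqrt{2\pi/x}\,\cos(x - \pi/4)$, and the printed relative discrepancy shrinks as $x$ grows, in line with the remarks on accuracy in Section 5.4.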

5.3 Fresnel approximation


Let us take a look at the asymptotics (5.2.3) from a slightly different viewpoint
and focus our attention on the case where there is only one stationary point $\tau$ and
the limits of integration are infinite. Then (5.2.3) reduces to the asymptotic equality

    $\int f(t) \exp[-ixp(t)]\, dt \sim f(\tau) \sqrt{\frac{2\pi}{ixp''(\tau)}}\, \exp[-ixp(\tau)]$,   $(x \to \infty)$.   (1)

We shall attempt to find the "hidden springs" of the stationary phase method by
analyzing this example in some depth. The totally rigorous mathematical derivation
of (1) seems bland and incomplete to physicists and engineers if it is given without
that extra insight that comes from a perhaps imprecise but revealing heuristic
arguments. Actually, mathematicians also often gain a deeper understanding of
their subject by accumulating a store of sometimes imprecise analogies acquired
in "real-life" experiences and physical "thought" experiments.
The stationary phase method can also be elucidated by such "real-life" arguments:
the fast oscillation

    $\exp[-ixp(t)]$   (2)

has current (time-dependent) frequency $\omega = xp'(t)$ and period $T = 2\pi/|\omega|$ which
decays like $1/x$. If $f(t)$ has a characteristic scale $a$ then, in the domain of integration
where $T \ll a$, the adjacent crests and troughs of the integrated process
compensate, the better the bigger $x$, and only close to the stationary point $t = \tau$,
where $\omega(\tau) = 0$, does that compensation become less effective. As a result, a
small neighborhood of point $\tau$, shrinking as $x$ grows, gives the main
contribution to the integral. In the neighborhood of the simple stationary point $\tau$,
function $p(t)$ is well approximated by the parabola

    $p(t) = p(\tau) + r(t - \tau)^2/2$,   $r = p''(\tau)$.

Outside that small neighborhood, function (2) can be replaced by

    $\exp\left[-ixp(\tau) - ixr\, \frac{(t - \tau)^2}{2}\right]$,

without changing the value of the integral significantly, since both functions oscillate
quickly and give a small contribution to the integral. For this reason, the
original integral (1) can be replaced by an asymptotically equivalent expression

    $\exp[-ixp(\tau)] \int f(t) \exp\left[-ixr\, \frac{(t - \tau)^2}{2}\right] dt$.   (3)

Furthermore, observe that for $x \to \infty$, in the neighborhood of the stationary point
essential for the integral, function $f(t)$ practically coincides with constant $f(\tau)$.
The latter can be taken outside the integral sign and the remaining integral can be
calculated with the help of the standard formula

    $\int \exp\left[-ixr\, \frac{(t - \tau)^2}{2}\right] dt = \sqrt{\frac{2\pi}{ixr}}$.

Remark 1. Physicists often stop short of the asymptotic relation (1) and operate
with the integral (3). Analogously with optics, where such integrals appear in the
so-called "Fresnel approximation", the approximation of the integral on the left-
hand side of (1) by the integral (3) will be also called the Fresnel approximation.
Later on, discussing optics applications, we shall show that the asymptotic formulas
(5.2.3) and (1) correspond to the crude geometric optics approximation.

Remark 2. The Fresnel approximations of integrals of type (3) are closely related
to the Fresnel sine and cosine integral special functions

    $C(z) = \int_0^{z} \cos(\pi t^2/2)\, dt$,   (4a)

    $S(z) = \int_0^{z} \sin(\pi t^2/2)\, dt$,   (4b)

often encountered in wave problems. They are both odd functions of $z$, with limits

    $C(\infty) = S(\infty) = 1/2$.   (5)

The graphs of Fresnel integral functions are shown in Fig. 5.3.1.

5.4 Accuracy of the stationary phase approximation


To enhance the practical value of the main-asymptotics formulas (5.2.3) and
(5.3.1) for integrals of rapidly oscillating functions we need estimates of the remainder
terms. Their magnitude strongly depends on functions $f(t)$ and $p(t)$.
Moreover, no universal method of finding precise estimates is known. For that rea-
son we will analyze the accuracy of asymptotics (5.3.1) in just two generic cases,
hoping that detailed analysis of a few concrete examples will better illuminate the
essence of the problem than plowing through a laborious general argument.
FIGURE 5.3.1
Graphs of the Fresnel integral functions C(z) and S(z).

Example 1. First, let us find the magnitude of error created by replacing the
integral (5.3.3) by the right-hand side of (5.3.1) in the case when $f(t)$ is a continuously
differentiable (smooth) Gaussian function. Then (5.3.3) becomes the
standard integral

    $\int \exp\left[-\frac{t^2}{2\sigma^2} - ixr\, \frac{t^2}{2}\right] dt = \sqrt{\frac{2\pi}{ixr}}\; \frac{1}{\sqrt{1 - (i/\sigma^2 xr)}}$.   (1)

For simplicity's sake put $\tau = 0$; consideration of the more complex general case
contributes little to the understanding of the essence of the situation. The first factor
on the right-hand side of (1) corresponds to the main asymptotics (5.3.1), and the
second describes the deviation from it. Thus, the relative error (of the value
obtained via formula (5.3.1)) is of the order

    $\left| 1 - \frac{1}{\sqrt{1 - (i/\sigma^2 xr)}} \right| \sim \frac{1}{2\sigma^2 xr}$,   $(x \to \infty)$.   (2)

For a general smooth function $f(t)$ the accuracy of formulas like (5.3.1) can be estimated
by replacing $\sigma$ with the characteristic scale of function $f(t)$, an admittedly
nonrigorous but heuristically useful approach. •
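The estimate (2) of Example 1 can be checked directly, since the deviation factor in (1) is available in closed form. The sketch below assumes the Gaussian $f(t) = \exp(-t^2/2\sigma^2)$ of the example; the function names and the default values $r = \sigma = 1$ are our own choices for illustration.

```python
import cmath

def relative_error(x, r=1.0, sigma=1.0):
    """Exact relative deviation |1 - 1/sqrt(1 - i/(sigma^2 x r))| read off from (1)."""
    return abs(1.0 - 1.0 / cmath.sqrt(1.0 - 1j / (sigma ** 2 * x * r)))

def leading_estimate(x, r=1.0, sigma=1.0):
    """Right-hand side of (2): 1 / (2 sigma^2 x r)."""
    return 1.0 / (2.0 * sigma ** 2 * x * r)

# The ratio of the exact deviation to the estimate should approach 1 as x grows.
for x in (10.0, 100.0, 1000.0):
    print(x, relative_error(x), leading_estimate(x), relative_error(x) / leading_estimate(x))
```

Halving $\sigma$ multiplies the error by four, which illustrates why a narrow amplitude $f(t)$ requires a larger $x$ before the stationary phase formula becomes accurate.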

Example 2. Consider the jump function $f(t) = \chi(a - |t|)$. Then the calculation
is reduced to evaluation of the integral

    $I = \int_{-a}^{a} \exp\left[-ixr\, \frac{t^2}{2}\right] dt = 2\sqrt{\frac{\pi}{xr}}\, \big(C(z) - iS(z)\big)$,   (3)

where

    $z = a\sqrt{\frac{xr}{\pi}}$.

To investigate the asymptotic behavior of the Fresnel integrals that appear in (3)
we shall write the first of them in the form

    $C(z) = \frac{1}{2} - c(z)$,

where

    $c(z) = C(\infty) - C(z) = \int_z^{\infty} \cos(\pi t^2/2)\, dt$.

Changing to a new variable of integration $y = t^2$, we get that

    $c(z) = \frac{1}{2} \int_{z^2}^{\infty} \frac{1}{\sqrt{y}} \cos(\pi y/2)\, dy$,

and integration by parts gives

    $c(z) = -\frac{1}{\pi z} \sin(\pi z^2/2) + \frac{1}{2\pi} \int_{z^2}^{\infty} \frac{1}{y\sqrt{y}} \sin(\pi y/2)\, dy$.

Another integration by parts shows that the remaining integral is $o(1/z)$ as $z \to \infty$.
Therefore, $C(z)$ satisfies the asymptotic relation

    $C(z) = \frac{1}{2} + \frac{1}{\pi z} \sin(\pi z^2/2) + o(1/z)$,   $(z \to \infty)$.

An analogous relation is valid for $S(z)$:

    $S(z) = \frac{1}{2} - \frac{1}{\pi z} \cos(\pi z^2/2) + o(1/z)$,   $(z \to \infty)$.

Substituting these asymptotics into (3) we get that

    $I = \sqrt{\frac{2\pi}{ixr}} \left( 1 - \frac{1 - i}{\pi z} \exp[-i\pi z^2/2] + o(1/z) \right)$,   $(z \to \infty)$.

Now it is clear that the relative error of replacing $I$ by the right-hand side of (5.3.1)
is of magnitude

    $\frac{\sqrt{2}}{\pi z} = \frac{1}{a} \sqrt{\frac{2}{\pi xr}}$.

Note that it is of the order $\sim 1/\sqrt{x}$ and not $\sim 1/x$, the latter being the case for
smooth functions $f(t)$ (see (2)). •
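The asymptotic relations for $C(z)$ and $S(z)$ used in Example 2 can be compared with a direct numerical evaluation of the Fresnel integrals $C(z) = \int_0^z \cos(\pi t^2/2)\,dt$ and $S(z) = \int_0^z \sin(\pi t^2/2)\,dt$. The sketch below uses a hand-written composite Simpson rule (our own helper, chosen only to keep the example free of external libraries) and prints the discrepancy, which should be small compared with $1/z$.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    s += 4.0 * sum(g(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2.0 * sum(g(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3.0

def fresnel_C(z):
    """Numerical value of C(z) = int_0^z cos(pi t^2 / 2) dt."""
    return simpson(lambda t: math.cos(math.pi * t * t / 2.0), 0.0, z)

def fresnel_S(z):
    """Numerical value of S(z) = int_0^z sin(pi t^2 / 2) dt."""
    return simpson(lambda t: math.sin(math.pi * t * t / 2.0), 0.0, z)

def C_asymptotic(z):
    """Leading terms 1/2 + sin(pi z^2 / 2) / (pi z)."""
    return 0.5 + math.sin(math.pi * z * z / 2.0) / (math.pi * z)

def S_asymptotic(z):
    """Leading terms 1/2 - cos(pi z^2 / 2) / (pi z)."""
    return 0.5 - math.cos(math.pi * z * z / 2.0) / (math.pi * z)

for z in (2.0, 5.0):
    print(z, fresnel_C(z) - C_asymptotic(z), fresnel_S(z) - S_asymptotic(z))
```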

5.5 Method of steepest descent


Formula (5.2.3) contains the main ingredient of the stationary phase method. It
is related to the steepest descent method (or Laplace's method) which is applicable
to purely real integrals

    $\int_A^{B} f(t) \exp[-xp(t)]\, dt$.   (1)

Without repeating considerations that led us to (5.2.3) we shall give the final formula
for the main asymptotics of integral (1).
Let $p(t)$ be a sufficiently smooth function which has only simple minima at
points $\tau_m$ located inside the interval $(A, B)$. It turns out that just rewriting (5.2.3)
without the imaginary unit $i$ gives the correct asymptotics

    $\int_A^{B} f(t) \exp[-xp(t)]\, dt \sim \sum_{m=1}^{N} f(\tau_m) \sqrt{\frac{2\pi}{xp''(\tau_m)}}\, \exp[-xp(\tau_m)]$,   $(x \to \infty)$.   (2)
Despite its superficial similarity, the above formula differs from formula (5.2.3) in
an essential way. First of all, the summation in it is not over all extrema but just
over all minima of the exponent. Secondly, and this is the main point here, for
different values of the minima, the exponential factors in the sum have different
magnitudes, and these differences increase with the growth of x. For that reason
the main asymptotics of integral (2) contains only one term corresponding to the
absolute minimum of function p(t) in the interval of integration

    $\int_A^{B} f(t) \exp[-xp(t)]\, dt \sim f(\tau) \sqrt{\frac{2\pi}{xp''(\tau)}}\, \exp[-xp(\tau)]$,   $(x \to \infty)$,   (2)

where $p(t) \ge p(\tau)$, for all $t \in [A, B]$.
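The single-minimum formula above can be illustrated numerically. The sketch below is our own example, not one from the text: $f(t) = 1$, $p(t) = \cosh t$ on $[A, B] = [-3, 3]$, with the absolute minimum at $\tau = 0$, where $p(\tau) = 1$ and $p''(\tau) = 1$. To avoid underflow, the common factor $\exp(-x)$ is divided out of both sides. The Simpson helper is a hand-written assumption used only to keep the example self-contained.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    s += 4.0 * sum(g(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2.0 * sum(g(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3.0

def scaled_integral(x, A=-3.0, B=3.0):
    """exp(x) * int_A^B exp(-x cosh t) dt, computed as int exp(-x (cosh t - 1)) dt."""
    return simpson(lambda t: math.exp(-x * (math.cosh(t) - 1.0)), A, B)

def scaled_laplace_estimate(x):
    """exp(x) times the main term: sqrt(2 pi / (x p''(0))) with p''(0) = 1."""
    return math.sqrt(2.0 * math.pi / x)

# The ratio should approach 1 as x grows.
for x in (10.0, 50.0, 200.0):
    print(x, scaled_integral(x) / scaled_laplace_estimate(x))
```

The endpoints $t = \pm 3$ contribute terms smaller by a factor of order $\exp[-x(\cosh 3 - 1)]$, which is why only the minimum matters here.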

5.6 Exercises
1. The stationary phase method is often useful in problems of wave propagation. In
particular, in Chapter 9 we will encounter the integral

    $G(\rho) = \frac{1}{2\pi} \int_0^{\infty} \exp[-ik\rho \cosh(t)]\, dt$,

which describes complex amplitude of a cylindrical wave. Analyze this integral using the
stationary phase method.

2. The real Anger function $J_\omega(z)$ and Weber function $E_\omega(z)$ (here, we assume that $\omega$
and $z$ are real) are uniquely determined by the equation

    $D_\omega(z) = J_\omega(z) - iE_\omega(z) = \frac{1}{\pi} \int_0^{\pi} \exp[-i(\omega\psi - z \sin\psi)]\, d\psi$   (1)

and are often encountered in problems of mathematical physics. Find the main asymptotics
of Anger and Weber functions for a fixed $z$ and $\omega \to \infty$.

3. Find the main asymptotics of Anger and Weber functions (see Exercise 2) for fixed
$\omega$ and $z \to \infty$.

4. Find the Fresnel approximation of function $D_\omega(z)$ for $z \gg 1$.

5. In Exercises 3 and 4 we have explored the asymptotic behavior of Anger and Weber
functions along the $\omega$ and $z$ axes of the $(\omega, z)$-plane. What happens in the rest of the plane?
More precisely, study the asymptotic behavior of Anger and Weber functions along the
rays $z = \rho\omega$ for $0 < \rho < 1$.

6. Study the asymptotics of (1) for $\rho = 1$, $\omega \to \infty$. Hint: As $\rho \to 1$ the expression
on the right-hand side of the relation (5.5) in the Answers and Solutions chapter diverges
to infinity. This fact indicates that for $\rho = 1$ the asymptotics is of a different order.

7. Complete investigation of the integral $D_\omega(\rho\omega) = (1/\pi) \int_0^{\pi} e^{-i\omega p(\psi)}\, d\psi$, where
$p(\psi) = \psi - \rho \sin\psi$, by checking its asymptotic behavior for $\omega \to \infty$ if $\rho > 1$.

Remarks on Exercises 2-7: The integral in Exercise 7 clearly has two different types of
asymptotic behavior in the $(z, \omega)$-plane: one in the octant $0 < z < \omega$ ($\rho < 1$) and another
in the octant $0 < \omega < z$ ($\rho > 1$). Moreover, its asymptotic behavior on the boundary
$z = \omega$ ($\rho = 1$) of these octants is qualitatively different from its asymptotic behavior in

either of them. The total picture can be summarized as follows: the integral in Exercise
7 obeys the asymptotic power law of order $\alpha(\rho)$ where

    $\alpha(\rho) = \begin{cases} 1, & \text{if } 0 \le \rho < 1; \\ 1/3, & \text{if } \rho = 1; \\ 1/2, & \text{if } 1 < \rho. \end{cases}$   (2)

The function $\alpha(\rho)$ which determines asymptotics of the integral in Exercise 7 has jumps
as we move from one $(z, \omega)$-region to another. An infinitesimal change of parameter $\rho$
can cause a major change in the asymptotic behavior of that integral. Such phenomena
are called phase transitions and they correspond to the physical phase transitions like
melting, evaporation or crystallization, where small changes in temperature (or other
physical parameters) can cause large and sudden changes in the physical properties of
matter.
At first sight, the phase transition for the integral in Exercise 7 in the vicinity of the
critical point $\rho = 1$ could be puzzling. The next exercise provides an additional insight
into why it occurs.

8. Consider the integral

    $I(\omega) = \frac{1}{2\pi} \int_0^{\infty} \frac{e^{-i\omega t}\, dt}{\sqrt{t + \tau}}$,   $\tau \ge 0$.   (3)

It follows from (4.3.3) that, for any $\tau > 0$,

    $I(\omega) \sim \frac{1}{2\pi i\omega\sqrt{\tau}}$,   $(\omega \to \infty)$   (4)

i.e., (3) obeys the asymptotic power decay law of the order $\alpha = 1$. On the other hand, it
follows from (4.6.2) that, for $\tau = 0$,

    $I(\omega) = \frac{1}{2\sqrt{\pi i\omega}}$,   (5)

i.e., (3) obeys the asymptotic power decay law of the order $\alpha = 1/2$. Study the "phase
transition" in the asymptotic behavior of (3) as $\tau \to 0$.

9. Utilize the method of steepest descent sketched in Section 5.5 to derive Stirling's
approximate formula for the factorial:

    $n! = n \cdot (n - 1) \cdots 2 \cdot 1 \sim \sqrt{2\pi n}\; n^{n} e^{-n}$   $(n \to \infty)$.

10. Consider the Riemann equation

    $\frac{\partial v}{\partial t} + v \frac{\partial v}{\partial x} = 0$,   $v(x, t = 0) = v_0(x)$,   (6)

where $v_0(x)$ is an infinitely differentiable and absolutely integrable function whose derivative
attains its minimum at a certain point $z$, i.e.,

    $\inf_{-\infty < x < \infty} v_0'(x) = v_0'(z) = -u$,   $u > 0$.   (7)

The equation arises in various physical problems (see Volume 2, Chapter 12 on nonlinear
partial differential equations). The solution of this equation is known only in the implicit
form

    $v(x, t) = v_0(x - t\, v(x, t))$.   (8)

Find the asymptotics ($\kappa \to \infty$) at the time $t = 1/u$ of the spatial Fourier transform

    $\tilde v(\kappa, t) = \frac{1}{2\pi} \int v(x, t)\, e^{-i\kappa x}\, dx$   (9)

of the solution $v$. As an example consider the initial condition

    $v_0(x) = -x \exp(-x^2)$.   (10)

Chapter 6
Singular Integrals and Fractal Calculus

This chapter is devoted to integrals similar to the familiar divergent Cauchy integral

    $\int \frac{\varphi(s)}{s - x}\, ds$.   (1)

Such integrals are often encountered in physical applications. If the function $\varphi(s)$
does not vanish at $s = x$ then the integrand in (1) has a nonintegrable singularity
at that point. In practice physicists, using their intuition as a guide, often assign
certain finite values to these integrals anyway. Then, it is a mathematician's job to
justify rigorously these "renormalizations of infinities", translate additional physi-
cal requirements into mathematical terms and point out how different assumptions
lead to different values of integral (1). The situation is fairly typical in collaboration
of physicists and mathematicians.
The first two sections of this chapter will survey various "recipes" for evaluation
of singular integrals and later a typical physical example will be provided. At that
time, natural physical considerations will help us select the unique solution.

6.1 Principal value distribution


Integral (6.0.1) will be studied via the notion of its principal value which, for
$x = 0$, is defined as a symmetric limit

    $\mathcal{PV} \int \frac{\varphi(s)}{s}\, ds = \lim_{\varepsilon \to 0+} \left[ \int_{-\infty}^{-\varepsilon} \frac{\varphi(s)}{s}\, ds + \int_{\varepsilon}^{\infty} \frac{\varphi(s)}{s}\, ds \right]$,   (1)

with an analogous definition for other values of $x$. The letters $\mathcal{PV}$ in front of the
integral indicate that the integral is taken in the sense of its principal value. In
terms of the distribution theory, equality (1) defines distribution $T_{1/s}$, which acts
on a test function $\varphi$ using the expression on the right-hand side of (1).
Another way to calculate the principal value of integral (6.0.1) can be proposed
if we restrict our attention to a test function $\varphi(x) \in \mathcal{D}$ which has compact support.
Then one can find a number $R$ such that $\varphi(x) \equiv 0$ for $|x| > R$ and

    $\mathcal{PV} \int \frac{\varphi(s)}{s}\, ds = \mathcal{PV} \int_{-R}^{R} \frac{\varphi(s)}{s}\, ds$.

Representing function $\varphi(x)$ in the form

    $\varphi(x) = [\varphi(x) - \varphi(0)] + \varphi(0)$,

and noticing that the principal value of the integral containing the constant term
$\varphi(0)$ is zero, we get that

    $\mathcal{PV} \int \frac{\varphi(s)}{s}\, ds = \int_{-R}^{R} \frac{\varphi(s) - \varphi(0)}{s}\, ds$.   (2)

Now the integral on the right-hand side can be understood in the ordinary sense
since its singularity at 0 has been removed. In other words, the integrand has
been regularized. This immediately follows from the fact that any test function
$\varphi(s) \in \mathcal{D}$ satisfies the Lipschitz condition

    $|\varphi(a) - \varphi(b)| < K|a - b|$.   (3)

If function $\varphi(s)$ does not have compact support, then one can still use equality

    $\mathcal{PV} \int \frac{\varphi(s)}{s}\, ds = \lim_{R \to \infty} \int_{-R}^{R} \frac{\varphi(s) - \varphi(0)}{s}\, ds$,

instead of (2) to define the principal value of integrals with infinite limits.
Finally, let us point out another obvious but useful representation of the principal-value
integral (1):

    $\mathcal{PV} \int \frac{\varphi(s)}{s}\, ds = \int_0^{\infty} \frac{\varphi(s) - \varphi(-s)}{s}\, ds$.


Standard operations on "well-behaved" convergent integrals, such as the change


of variables, differentiation with respect to a parameter, etc., can also be used
in analysis of principal-value integrals. The following example illustrates the
situation.
Example 1. Consider the singular integral

    $I(x, b) = \mathcal{PV} \int \frac{\exp(-b^2 s^2)}{s - x}\, ds$,   (4)

and observe that a change of variables $\tau = bs$ transforms it into the integral

    $I(x, b) = I(bx)$,

where

    $I(x) = \mathcal{PV} \int \frac{\exp(-\tau^2)}{\tau - x}\, d\tau$.

Introducing variable of integration $y = \tau - x$, we obtain that

    $I(x) = \exp(-x^2)\, J(x)$,

where

    $J(x) = \mathcal{PV} \int \frac{\exp(-y^2 - 2xy)}{y}\, dy$.

Differentiation of the above expression with respect to parameter $x$ gives

    $J'(x) = -2 \int \exp(-y^2 - 2yx)\, dy = -2\sqrt{\pi}\, \exp(x^2)$.

Taking into account the fact that $J(0) = 0$, one can get that

    $I(x) = -2\sqrt{\pi}\, D(x)$,

where

    $D(x) = \exp(-x^2) \int_0^{x} \exp(y^2)\, dy$

is the so-called Dawson integral. Thus, finally, we arrive at the formula

    $\mathcal{PV} \int \frac{\exp(-b^2 s^2)}{s - x}\, ds = -2\sqrt{\pi}\, D(bx)$.

Let us remark that for $|x| \to 0$, we have that $D(x) \sim x$, and that for $|x| \to \infty$ the
Dawson integral has the asymptotics $D(x) \sim 1/2x$.
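The result of Example 1 lends itself to a numerical cross-check. The sketch below evaluates the principal value through the folded representation $\int_0^\infty [\varphi(x+y) - \varphi(x-y)]/y\, dy$ (the last representation of this section applied to the shifted function) with $\varphi(s) = \exp(-s^2)$, and compares it with $-2\sqrt{\pi}\, D(x)$. The Simpson helper, the cutoff at $y = 10$ and the function names are assumptions made for this illustration.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    s += 4.0 * sum(g(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2.0 * sum(g(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3.0

def pv_gaussian(x, cutoff=10.0):
    """PV int exp(-s^2)/(s - x) ds via the regular integrand [phi(x+y) - phi(x-y)]/y."""
    def phi(s):
        return math.exp(-s * s)

    def folded(y):
        if y == 0.0:
            # Limit of the difference quotient at y = 0 is 2 phi'(x).
            return -4.0 * x * math.exp(-x * x)
        return (phi(x + y) - phi(x - y)) / y

    return simpson(folded, 0.0, cutoff)

def dawson(x):
    """Dawson integral D(x) = exp(-x^2) int_0^x exp(y^2) dy."""
    return math.exp(-x * x) * simpson(lambda y: math.exp(y * y), 0.0, x)

for x in (0.5, 1.0, 2.0):
    print(x, pv_gaussian(x), -2.0 * math.sqrt(math.pi) * dawson(x))
```

Because the folded integrand is bounded near $y = 0$, no special treatment of the singularity is needed beyond supplying its limiting value.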

FIGURE 6.1.1
Graph of the Dawson integral.

6.2 Principal value of Cauchy integral


Another approach to evaluation of integral (6.0.1) depends on its interpretation
as the limit

    $\int \frac{\varphi(s)}{s - x}\, ds = \lim_{y \to 0} \int \frac{\varphi(s)}{s - z}\, ds$,   (1)

where

    $z = x + iy$

is a complex parameter. Moving into the complex plane vicinity of the real axis
removes the singularity.
Let us find the above limit by separating explicitly the real and the imaginary
parts of the expression

    $\frac{1}{s - z} = \frac{s - x}{(s - x)^2 + y^2} + \frac{iy}{(s - x)^2 + y^2}$.

Substituting this sum into (1) and noticing that the integral of the real part converges
to the principal value as $y \to 0$, we obtain that

    $\lim_{y \to 0} \int \frac{\varphi(s)}{s - z}\, ds = \mathcal{PV} \int \frac{\varphi(s)}{s - x}\, ds + i \lim_{y \to 0} \int \frac{y}{(s - x)^2 + y^2}\, \varphi(s)\, ds$.   (2)

Notice that the factor in front of function $\varphi(s)$ in the last integral coincides, up to
$\pi$, with the familiar Lorentz curve (1.3.3)

    $\frac{1}{\pi}\, \frac{y}{(x - s)^2 + y^2}$,

which weakly converges to $\delta(x - s)$ as $y \to 0+$. Hence, the evaluation of the
integral (6.0.1) using the limit procedure (1) leads to an identity

    $\int \frac{\varphi(s)}{s - x}\, ds = \mathcal{PV} \int \frac{\varphi(s)}{s - x}\, ds \pm i\pi\varphi(x)$.

The plus sign corresponds to the limit $y \to 0+$ with $z$ restricted to the upper
half-plane and the minus sign to the limit $y \to 0-$ with $z$ restricted to the lower
half-plane. Thus equality (2) determines two distributions:

    $\frac{1}{s - x - i0} = \mathcal{PV} \frac{1}{s - x} + i\pi\delta(x - s)$   (3)

and

    $\frac{1}{s - x + i0} = \mathcal{PV} \frac{1}{s - x} - i\pi\delta(x - s)$.   (4)

Although assigning complex values to real-valued integrals may seem strange at
first glance, these formulas often give the correct physical answer. The point is
that their imaginary parts reflect the causality principle which was not spelled out
explicitly when the original physical problem was posed but which, as we will see
later on, plays an important role. Obvious physical arguments then permit us to
indicate which of the formulas (3-4) exactly corresponds to the physical problem
under consideration.
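The limit procedure of this section can be watched at work numerically. The sketch below takes $\varphi(s) = \exp(-s^2)$ (our own choice of test function), evaluates $\int \varphi(s)/(s - x - iy)\, ds$ for a small positive $y$ by splitting $1/(s - z)$ into its real and imaginary parts as in the text, and compares the imaginary part with $\pi\varphi(x)$. The Simpson helper, the truncation of the integral to $[-8, 8]$ and the grid size are assumptions made for the illustration.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    s += 4.0 * sum(g(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2.0 * sum(g(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3.0

def shifted_integral(x, y, half_width=8.0, n=100000):
    """int exp(-s^2) / (s - x - i y) ds, split into real and imaginary parts."""
    def real_part(s):
        return math.exp(-s * s) * (s - x) / ((s - x) ** 2 + y * y)

    def imag_part(s):
        # y / ((s - x)^2 + y^2) is pi times the Lorentz curve of width y.
        return math.exp(-s * s) * y / ((s - x) ** 2 + y * y)

    return complex(simpson(real_part, -half_width, half_width, n),
                   simpson(imag_part, -half_width, half_width, n))

x = 0.5
for y in (0.1, 0.01):
    value = shifted_integral(x, y)
    print(y, value.real, value.imag, math.pi * math.exp(-x * x))
```

As $y$ decreases, the imaginary part settles at $\pi\varphi(x)$, while the real part approaches the principal value of the integral, which is negative for this choice of $x$.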

6.3 A study of monochromatic wave


Formulas of Section 2 give, for example, the correct physical answer in the
problem of radiation by a monochromatic wave source with complex amplitude.
For simplicity, we will discuss only the 1-D case. Then, wave radiation is described
by the nonhomogeneous wave equation

    $\frac{\partial^2 E}{\partial x^2} - \frac{1}{c^2} \frac{\partial^2 E}{\partial t^2} = D(x, t)$.

If the wave source is monochromatic, that is

    $D(x, t) = w(x) \cos \omega t = \mathrm{Re}\, w(x)\, e^{i\omega t}$,

then the radiated wave is also monochromatic and can be written in the form

    $E(x, t) = \mathrm{Re}\, u(x)\, e^{i\omega t}$,

where the complex amplitude $u(x)$ of the propagating wave satisfies the Helmholtz
equation

    $\frac{d^2 u}{dx^2} + k^2 u = w(x)$.   (1)

Here, $k = \omega/c$ is the so-called wavenumber. The equation can be solved with
the help of Green's function, an approach that will be discussed later on. Here,
we will solve it by passing to the frequency domain and considering the Fourier
transform

    $\tilde u(\kappa) = \frac{1}{2\pi} \int u(x)\, e^{-i\kappa x}\, dx$.

The inverse Fourier transform is then given by

    $u(x) = \int \tilde u(\kappa)\, e^{i\kappa x}\, d\kappa$.

Taking the Fourier transforms of both sides of equation (1) we get

    $\tilde u(\kappa) = \frac{\tilde w(\kappa)}{k^2 - \kappa^2} = \frac{\tilde w(\kappa)}{2k} \left[ \frac{1}{\kappa + k} - \frac{1}{\kappa - k} \right]$,

where $\tilde w(\kappa)$ is the Fourier transform of the source function $w(x)$. We will assume
that function $\tilde w(\kappa)$ is sufficiently smooth and rapidly decaying to 0 for $|\kappa| \to \infty$.
The sought complex amplitude of the propagating wave can now be found by the
inverse Fourier transform:

    $u(x) = \frac{1}{2k} \left[ \int \frac{\tilde w(\kappa)}{\kappa + k}\, e^{i\kappa x}\, d\kappa - \int \frac{\tilde w(\kappa)}{\kappa - k}\, e^{i\kappa x}\, d\kappa \right]$.

Changing the variable of integration $\kappa$ in the first integral to $-\kappa$ and observing that
$\tilde w(-\kappa) = \tilde w^*(\kappa)$, where the asterisk denotes the complex conjugate, we obtain
that

    $u(x) = -\frac{1}{k} \int \frac{\mathrm{Re}[\tilde w(\kappa)\, e^{i\kappa x}]}{\kappa - k}\, d\kappa$.   (2)

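First of all, let us calculate the principal value

    $\mathcal{PV}[u(x)] = -\frac{1}{k}\, \mathrm{Re} \left[ \mathcal{PV} \int \frac{\tilde w(\kappa)}{\kappa - k}\, e^{i\kappa x}\, d\kappa \right]$

of that integral by splitting it into the sum of two components

    $\mathcal{PV} \int \frac{\tilde w(\kappa)}{\kappa - k}\, e^{i\kappa x}\, d\kappa = e^{ikx} \left[ \mathcal{PV} \int \tilde w(\kappa)\, \frac{\cos[(\kappa - k)x]}{\kappa - k}\, d\kappa + i\, \mathcal{PV} \int \tilde w(\kappa)\, \frac{\sin[(\kappa - k)x]}{\kappa - k}\, d\kappa \right]$,   (3)

and studying the asymptotic behavior of each of them separately as $|x| \to \infty$.
Notice that, according to (3.3.7),

    $\frac{\sin[(\kappa - k)x]}{\kappa - k} \to \pi\delta(\kappa - k)\, \mathrm{sign}(x)$,

weakly as $|x| \to \infty$. Consequently,

    $\lim_{|x| \to \infty} \mathcal{PV} \int \tilde w(\kappa)\, \frac{\sin[(\kappa - k)x]}{\kappa - k}\, d\kappa = \pi \tilde w(k)\, \mathrm{sign}(x)$.

Now consider the first integral on the right-hand side of (3). Adding and subtracting
number $\tilde w(k)$ from function $\tilde w(\kappa)$ we arrive at the equality

    $\mathcal{PV} \int \frac{\tilde w(\kappa)}{\kappa - k} \cos[(\kappa - k)x]\, d\kappa = \int \psi(\kappa, k) \cos[(\kappa - k)x]\, d\kappa + \tilde w(k)\, \mathcal{PV} \int \frac{\cos[(\kappa - k)x]}{\kappa - k}\, d\kappa$,

where

    $\psi(\kappa, k) = \frac{\tilde w(\kappa) - \tilde w(k)}{\kappa - k}$.

By previous assumptions, $\psi(\kappa, k)$ is a continuous function of $\kappa$. The second
integral on the right-hand side of the above equality vanishes because the integrand
is odd. The first integral converges uniformly for $|x| > 0$, and by the Riemann-Lebesgue
Lemma, its value converges to 0 as $|x| \to \infty$. Thus we arrive at the
asymptotic formula

    $\mathcal{PV} \int \frac{\tilde w(\kappa)}{\kappa - k}\, e^{i\kappa x}\, d\kappa = i\pi \tilde w(k)\, e^{ikx}\, \mathrm{sign}(x) + o(1)$   $(|x| \to \infty)$,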

which permits us to drop the corresponding, converging to zero, term to get that

'PV[U(x)]""" IIm[w(k)e ikx sign (x)], (Ixl -+ (0). (4)

The above expression contradicts the radiation condition and is physically not
acceptable.¹ To save the situation we will turn to formulas of Section 2 which give
that

$$u(x) = \mathcal{PV}[u(x)] \pm \frac{i\pi}{k}\,\mathrm{Re}\,[\tilde w(k)\,e^{ikx}]. \tag{5}$$

Formulas (4) and (5) imply that if we select the plus sign in the formulas of Section
2 then we shall arrive at the asymptotic formula

$$u(x) \sim \begin{cases} \dfrac{i\pi}{k}\,\tilde w^*(k)\,e^{-ikx}, & x > 0, \\[2mm] \dfrac{i\pi}{k}\,\tilde w(k)\,e^{ikx}, & x < 0, \end{cases}$$

which does satisfy the radiation condition. Here, the physically acceptable answer
corresponds to the distribution

$$\frac{1}{\kappa - k + i0} = \mathcal{PV}\,\frac{1}{\kappa - k} - i\pi\,\delta(\kappa - k). \tag{6}$$
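Relation (6) can also be probed numerically. The sketch below is a hedged illustration, not part of the original argument: the Gaussian test function, the pole position `k = 0.5`, the small parameter `eps`, and the helper names are all our own choices. It compares the integral with the regularized denominator $\kappa - k + i\varepsilon$ against the principal value minus $i\pi$ times the value of the test function at the pole; the two should agree up to an error of order $\varepsilon$.

```python
import math

def gaussian(kappa):
    # Hypothetical smooth, rapidly decaying test function.
    return math.exp(-kappa * kappa)

def regularized_integral(f, k, eps, half_width=10.0, n=200000):
    """Integral of f(kappa) / (kappa - k + i*eps) by the midpoint rule.

    The step must stay much smaller than eps so that the narrow
    Lorentzian-type peak at kappa = k is resolved.
    """
    h = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for j in range(n):
        kappa = k - half_width + (j + 0.5) * h
        total += f(kappa) / complex(kappa - k, eps)
    return total * h

def principal_value(f, k, half_width=10.0, n=20000):
    """PV integral of f(kappa) / (kappa - k), folded about the pole:
    integral over u > 0 of [f(k + u) - f(k - u)] / u."""
    h = half_width / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        total += (f(k + u) - f(k - u)) / u
    return total * h

if __name__ == "__main__":
    k, eps = 0.5, 0.002
    lhs = regularized_integral(gaussian, k, eps)
    rhs = complex(principal_value(gaussian, k), -math.pi * gaussian(k))
    print(lhs, rhs)
```

Decreasing `eps` (together with the step of the first quadrature) brings the two printed complex numbers closer, which is the content of (6) read as a limit.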

Let us consider the physical arguments in favor of such a choice. There are no
purely monochromatic wave sources in nature emitting radiation for infinitely long
times. All real-world phenomena begin and end in finite time. The fact that the
source was turned on sometime in the past can be taken into account by assuming
that its time dependence reflects asymptotically negligible intensity of the source
at time $-\infty$. That is exactly what the replacement of $k = \omega/c$ in the preceding
formulas by

$$\frac{\omega}{c} - i\,\frac{\gamma}{c} = k - i0 \tag{7}$$

accomplishes and what justifies utilization of the distribution (6).


Our choice was based on the causality principle which asserts that it is impossible
to receive the wave before the wave source is turned on. The same result can be
obtained if a principle of infinitesimal relaxation is applied. According to this
principle, any medium (even the vacuum) damps waves. This means that, for
example, the monochromatic wave $\exp(i\omega t - ikx)$ propagating to the right is
attenuated as $x$ grows. Again we are led to the conclusion that the real $k$
in (2) should be replaced by the formula (7).

¹Notice that the radiation condition demands that far from the source (i.e., for $|x| \to \infty$) there
only exist waves running away from the source, like $\exp(i(\omega t - k|x|))$.

6.4 The Cauchy formula


The results of Section 2 are closely related to the Cauchy formula from the theory
of functions of a complex variable. The formula asserts that for any function $f(z)$,
analytic in a simply connected domain $D$ in the complex plane $\mathbf{C}$ and continuous
on its closure $\bar D$ including the boundary contour $C$,

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(\zeta)\, d\zeta}{\zeta - z},$$

where $z$ is a point in the interior of $D$ and contour $C$ is oriented counterclockwise.


If $z$ is an arbitrary point of the complex plane then the above integral defines a new
function

$$F(z) = \frac{1}{2\pi i}\oint_C \frac{f(\zeta)\, d\zeta}{\zeta - z}. \tag{1}$$

For $z$ inside the contour of integration

$$F(z) = f(z), \qquad z \in D.$$

If $z \notin \bar D$ then the integrand is analytic everywhere in $D$ and, by Cauchy's
Theorem, $F(z) \equiv 0$. For a boundary point $z = \zeta_0 \in C$ the integral (1) is singular
and $F(z)$ will be understood in the principal-value sense. In the present context,
this will mean that

$$F(\zeta_0) = \mathcal{PV}[F(\zeta_0)] = \lim_{r \to 0}\frac{1}{2\pi i}\int_{C_r} \frac{f(\zeta)\, d\zeta}{\zeta - \zeta_0}, \tag{2}$$

where $C_r$ is a curve obtained from contour $C$ by removing its part contained in a
disc of radius $r \to 0$ centered at $\zeta_0$ (see Fig. 6.4.1).
Observe that, for any function $f$ continuous on the contour, function $F(z)$ is well
defined everywhere.
Formula (2) generalizes the concept of the principal value of a singular integral
on the real axis and we will study it for a function $f(z)$ analytic inside contour $C$
and continuous on it. To that end substitute the identity

$$f(\zeta) = [f(\zeta) - f(\zeta_0)] + f(\zeta_0)$$

into (2) and split the integral into two parts:

$$\mathcal{PV}[F(\zeta_0)] = \frac{1}{2\pi i}\oint_C \frac{[f(\zeta) - f(\zeta_0)]\, d\zeta}{\zeta - \zeta_0} + \lim_{r \to 0}\frac{f(\zeta_0)}{2\pi i}\int_{C_r} \frac{d\zeta}{\zeta - \zeta_0}.$$

FIGURE 6.4.1
A schematic illustration of contour $C_r$.

We deliberately replaced $C_r$ by $C$ in the first integral since for $f(z)$ satisfying the
Lipschitz condition (6.1.3) the integral is no longer singular and one can integrate
over the whole closed contour $C$. Since its integrand is analytic inside the contour
and continuous on the contour, the first integral vanishes by Cauchy's theorem. Thus

$$\mathcal{PV}[F(\zeta_0)] = \lim_{r \to 0}\frac{f(\zeta_0)}{2\pi i}\int_{C_r} \frac{d\zeta}{\zeta - \zeta_0}. \tag{3}$$

The last integral can be evaluated assuming that contour $C$ is smooth in the vicinity
of point $\zeta_0$, which is then called a regular point of the contour. Let us add to and subtract
from (3) the integral over the portion $c_r$, located in $D$, of the circle of radius $r$ with center
at $\zeta_0$. Since the integral over the closed contour $C_r + c_r$ is equal to zero, equality
(3) can be rewritten in the form

$$\mathcal{PV}[F(\zeta_0)] = -\lim_{r \to 0}\frac{f(\zeta_0)}{2\pi i}\int_{c_r} \frac{d\zeta}{\zeta - \zeta_0}.$$

The latter integral is easy to evaluate:

$$\int_{c_r} \frac{d\zeta}{\zeta - \zeta_0} = \int_{\pi}^{0} \frac{i r e^{i\varphi}\, d\varphi}{r e^{i\varphi}} = -i\pi.$$

Hence, we obtain that

$$\mathcal{PV}[F(\zeta_0)] = f(\zeta_0)/2,$$

and the principal value of the Cauchy integral of the analytic function is equal to

$$\mathcal{PV}\oint_C \frac{f(\zeta)\, d\zeta}{\zeta - \zeta_0} = i\pi f(\zeta_0), \tag{4}$$

that is, the value of function $f$ at the singular point of the integrand multiplied by
$i\pi$. Thus, the behavior of the Cauchy integral in the neighborhood of a regular
point $\zeta_0$ of contour $C$ can be summarized as follows:

$$\oint_C \frac{f(\zeta)\, d\zeta}{\zeta - z} = 2\pi i \begin{cases} f(\zeta_0), & z \to \zeta_0+, \\ \tfrac{1}{2} f(\zeta_0), & z = \zeta_0, \\ 0, & z \to \zeta_0-, \end{cases} \tag{5}$$

where the plus sign corresponds to the limit value of the integral while $\zeta_0$ is
approached from the inside of contour $C$, and the minus sign corresponds to the
approach from outside.
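The three cases in (5) can be illustrated numerically. The sketch below is only a sanity check under our own assumptions: the contour is the unit circle, $f$ is the exponential function, and `cauchy_integral` is a hypothetical helper. The quadrature nodes are placed symmetrically about the boundary point $\zeta_0 = 1$, so for $z = 1$ the midpoint sum approximates the principal value rather than a divergent integral.

```python
import cmath
import math

def cauchy_integral(f, z, n=256):
    """Approximate F(z) = (1/(2*pi*i)) * integral of f(zeta)/(zeta - z)
    over the unit circle, traversed counterclockwise.

    Midpoint rule in the angle theta; the nodes (j + 1/2) * h are symmetric
    about theta = 0, which yields the principal value when z = 1.
    """
    h = 2.0 * math.pi / n
    total = 0.0 + 0.0j
    for j in range(n):
        zeta = cmath.exp(1j * (j + 0.5) * h)
        # d(zeta) = i * zeta * d(theta) on the unit circle
        total += f(zeta) * 1j * zeta / (zeta - z) * h
    return total / (2j * math.pi)

if __name__ == "__main__":
    print(cauchy_integral(cmath.exp, 0.3 + 0.2j))  # inside: close to exp(z)
    print(cauchy_integral(cmath.exp, 2.0))         # outside: close to 0
    print(cauchy_integral(cmath.exp, 1.0))         # on the contour: close to e/2
```

The boundary value comes out as one half of $f(\zeta_0)$, midway between the inside and outside limits, as formula (5) states.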
Example 1. Let us consider a contour $C$ consisting of the interval $[-R, R]$ on
the real axis and the semicircle $C_R$ of radius $R$ with the center at point $z = 0$
located in the upper half-plane. If $f(z)$ is analytic in the upper half-plane and such
that the integral over $C_R$ uniformly converges to zero as $R \to \infty$, then the Cauchy
formula is transformed into

$$\int \frac{f(s)}{s - z}\, ds = 2\pi i f(z), \qquad z = x + iy, \quad y > 0, \tag{6}$$

and equality (4) assumes the form

$$\mathcal{PV}\int \frac{f(s)}{s - x}\, ds = i\pi f(x). \tag{7}$$

Recall that, by the well known Jordan Lemma of complex function theory, the
function

$$f(z) = \exp(i\lambda z), \qquad \lambda > 0, \tag{8}$$

satisfies the conditions mentioned earlier so that the above formula implies that

$$\mathcal{PV}\int \frac{e^{i\lambda s}}{s - x}\, ds = \pi i\, e^{i\lambda x}. \tag{9}$$

In particular, it follows that for $x = 0$

$$-i\,\mathcal{PV}\int \frac{e^{i\lambda s}}{s}\, ds = \int \frac{\sin(\lambda s)}{s}\, ds = \pi. \tag{10}$$

Notice that the Cauchy formula (6) interpreted in the spirit of the distribution theory
defines a new distribution

$$\hat T(s - z) = \frac{1}{2\pi i}\,\frac{1}{s - z}, \tag{11}$$

which is called the analytic representation of the Dirac delta. Its functional action
assigns to a function $f(s)$, analytic in the upper half-plane and rapidly decaying
at infinity, its value at the point $z$. Applied to any usual "well-behaved" function
of the real variable $s$, it defines a new function of complex variable $z$ which is
analytic everywhere with the possible exception of the real axis. Crossing the real
axis at the point $z = x$ the functional $\hat T(s - z)[f]$ has a jump of size $f(x)$; this
follows from formulas of Section 2. We will extract this jump by introducing a
new distribution

$$\hat\delta(s - z) = \hat T(s - z) + \hat T^*(s - z) = \frac{1}{\pi}\,\frac{y}{(s - x)^2 + y^2},$$

which is harmonic for $y \ne 0$ and which converges to the usual Dirac delta $\delta(s - x)$ as
$y \to 0+$. As we will see in Volume 2, the corresponding functional $\hat\delta(s - z)[f(s)]$
solves the Dirichlet problem for the 2-D Laplace's equation in the upper half-plane
$y > 0$. •

6.5 The Hilbert transform


The integral Hilbert transform

$$\psi(t) = \frac{1}{\pi}\,\mathcal{PV}\int \frac{\varphi(s)}{s - t}\, ds \tag{1}$$

of function $\varphi(t)$ is also defined in terms of the principal value of the integral
involved. We shall find the inversion formula for the Hilbert transform by applying
the Fourier transform to both sides of definition (1). The left-hand side is
transformed into

$$\tilde\psi(\omega) = \frac{1}{2\pi}\int \psi(t)\, e^{-i\omega t}\, dt,$$

and the right-hand side, after a change of the integration order and other simple
manipulations, assumes the form

$$\frac{1}{2\pi^2}\int \varphi(s)\, e^{-i\omega s}\left[\mathcal{PV}\int \frac{e^{i\omega(s - t)}}{s - t}\, dt\right] ds.$$

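Since

$$\mathcal{PV}\int \frac{e^{i\omega(s - t)}}{s - t}\, dt = i\int \frac{\sin(\omega t)}{t}\, dt = i\pi\,\mathrm{sign}(\omega),$$

we get that

$$\tilde\psi(\omega) = i\,\tilde\varphi(\omega)\,\mathrm{sign}(\omega). \tag{2}$$

Now let $g(t)$ be the Hilbert transform of function $\psi(t)$. Then, according to the
above formula, its Fourier transform is

$$\tilde g(\omega) = i\,\tilde\psi(\omega)\,\mathrm{sign}(\omega) = -\tilde\varphi(\omega)$$

so that $g(t) = -\varphi(t)$. Thus, the inverse Hilbert transform has the form

$$\varphi(t) = -\frac{1}{\pi}\,\mathcal{PV}\int \frac{\psi(s)}{s - t}\, ds. \tag{3}$$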
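The pair (1), (3) can be tried out on a function whose transform is known in closed form. In the Python sketch below, which is an illustration under our own assumptions rather than a general-purpose routine, the input is the Lorentzian $1/(1+s^2)$; a partial-fraction computation with the kernel $1/(s-t)$ of (1) gives the transform $-t/(1+t^2)$. The helper names and the quadrature (folding about the singularity and mapping the half-line by $u = \tan\theta$) are hypothetical choices made for the example.

```python
import math

def hilbert_transform(phi, t, n=20000):
    """psi(t) = (1/pi) * PV integral of phi(s) / (s - t) ds, as in (1).

    The principal value is folded about s = t,
        PV int phi(s)/(s - t) ds = int_0^inf [phi(t + u) - phi(t - u)] / u du,
    and the half-line is mapped to (0, pi/2) by u = tan(theta).
    """
    h = 0.5 * math.pi / n
    total = 0.0
    for j in range(n):
        u = math.tan((j + 0.5) * h)
        total += (phi(t + u) - phi(t - u)) / u * (1.0 + u * u)
    return total * h / math.pi

def inverse_hilbert_transform(psi, t, n=20000):
    """Formula (3): the same singular integral taken with the opposite sign."""
    return -hilbert_transform(psi, t, n)

def lorentzian(s):
    return 1.0 / (1.0 + s * s)

def lorentzian_transform(t):
    # Closed-form Hilbert transform of the Lorentzian under convention (1).
    return -t / (1.0 + t * t)

if __name__ == "__main__":
    for t in (-1.3, 0.0, 0.7):
        print(t, hilbert_transform(lorentzian, t), lorentzian_transform(t))
        print(t, inverse_hilbert_transform(lorentzian_transform, t), lorentzian(t))
```

Applying the transform and then the inverse returns the original function, while applying the direct transform twice changes the sign, in agreement with $g(t) = -\varphi(t)$ above.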

One of the most important applications of the Hilbert transform is related to the
causality principle. We shall illustrate it in the example of an absolutely integrable
function $h(t)$ describing a response of a linear physical system to the Dirac delta
impulse $\delta(t)$. The Fourier transform $\tilde h(\omega)$ appears in many physical applications.
Let us write it with the real frequency $\omega$ replaced by a complex number $\lambda = \omega + i\alpha$:

$$\tilde h(\lambda) = \frac{1}{2\pi}\int_0^{\infty} h(t)\, e^{-i\lambda t}\, dt. \tag{4}$$

The zero lower limit takes into account the causality principle which requires that
the system's response cannot appear before the action of the impulse: $h(t < 0) \equiv 0$.
It is obvious from (4) that function $\tilde h(\lambda)$ is analytic in the lower half-plane and
continuous on the real axis $\alpha = 0$. This, in turn, means that $\tilde h(\omega)$ satisfies the
relation

$$-i\pi\,\tilde h(\omega) = \mathcal{PV}\int \frac{\tilde h(\kappa)}{\kappa - \omega}\, d\kappa,$$

similar to (6.4.7). The complex function $\tilde h(\omega)$ can be represented as a sum of its
real and imaginary parts: $\tilde h(\omega) = \varphi(\omega) + i\psi(\omega)$. Substituting them into the last
equality and comparing separately the real and the imaginary parts, we discover
that $\varphi$ and $\psi$ are related to each other by the Hilbert transform. The connecting
formulas (1) and (3) applied to the real and imaginary parts of the Fourier transform
of the response function are called in physics the dispersion relations and are widely
used in the theory of wave propagation in dispersing media.

6.6 Analytic signals


Another physical application of the Hilbert transform is related to the notion of
analytic signal, which appears in various areas of physical sciences, from electrical
engineering to quantum optics. It can be introduced in the following way. To each
real process $\xi(t)$, which will be assumed to be absolutely integrable, we will assign
a complex signal $\zeta(t)$ with the real part equal to $\xi(t)$ and the imaginary part $\eta(t)$
defined by the condition that $\zeta(z)$ is an analytic function of the complex variable
$z = t + i\beta$ in the upper half-plane. The analyticity can be achieved by making
the Fourier transform of the signal $\xi(t)$ vanish for negative frequencies, that is, by
replacing $\tilde\xi(\omega)$ by $2\chi(\omega)\tilde\xi(\omega)$. The last expression can be also written in the form

$$\tilde\zeta(\omega) = 2\chi(\omega)\,\tilde\xi(\omega) = \tilde\xi(\omega) + \tilde\xi(\omega)\,\mathrm{sign}(\omega). \tag{1}$$

The first component on the right-hand side is the Fourier transform of the original
signal and the second is the Fourier transform of the imaginary part of the analytic
signal $\zeta(t)$ (multiplied by $i$). Formula (6.5.2), relating the Fourier transforms of a function and its
Hilbert transform, implies that the imaginary part of the analytic signal of real
variable is expressed through its real part by

$$\eta(t) = -\frac{1}{\pi}\,\mathcal{PV}\int \frac{\xi(s)}{s - t}\, ds. \tag{2}$$

The concept of analytic signal helps to solve the crucial electrical engineering
problem of defining the amplitude and phase of a narrow band signal whose
Fourier transform is concentrated in a small neighborhood of the carrier
frequency $\Omega$. Such a signal is usually represented in the form

$$\xi(t) = A(t)\cos[\Omega t + \psi(t)], \tag{3}$$

where $A(t)$ and $\psi(t)$ are slowly varying within the period $T = 2\pi/\Omega$. The standard
engineering problem of finding $A(t)$ and $\psi(t)$ from the known form of $\xi(t)$ is
not well posed mathematically, since it reduces to solving one equation for two
unknowns $A$ and $\psi$. That fundamental difficulty makes, for example, comparing
the accuracy of phase measurements by different phase detectors questionable. From
the theoretical viewpoint the best prescription is to uniquely define the amplitude
and the phase via the concept of the analytic signal, the imaginary part providing
the missing second equation

$$\eta(t) = A(t)\sin[\Omega t + \psi(t)]. \tag{4}$$
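The recipe above (suppress the negative frequencies, double the positive ones, read off amplitude and phase from the resulting complex signal) has a direct discrete analogue. The sketch below is a minimal illustration under assumptions of our own: a periodic sampled signal, a plain $O(N^2)$ discrete Fourier transform written out by hand to avoid dependencies, and hypothetical names such as `analytic_signal`. It is not meant as a production implementation.

```python
import cmath
import math

def analytic_signal(samples):
    """Discrete analogue of formula (1): double the positive-frequency
    bins, zero the negative-frequency bins, keep the zero and Nyquist
    bins unchanged, and transform back."""
    n = len(samples)
    spectrum = [
        sum(samples[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2.0
        elif k > n / 2:
            spectrum[k] = 0.0
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
        for m in range(n)
    ]

def narrow_band_example(n=128, carrier=16):
    """A slowly modulated carrier: amplitude A, total phase, and real signal."""
    amplitude = [1.0 + 0.5 * math.cos(2.0 * math.pi * m / n) for m in range(n)]
    phase = [2.0 * math.pi * carrier * m / n + 0.3 for m in range(n)]
    signal = [a * math.cos(p) for a, p in zip(amplitude, phase)]
    return amplitude, phase, signal

if __name__ == "__main__":
    amplitude, phase, signal = narrow_band_example()
    zeta = analytic_signal(signal)
    print(max(abs(abs(z) - a) for z, a in zip(zeta, amplitude)))
```

For this example the modulus of the complex signal reproduces the amplitude $A(t)$ and its imaginary part equals $A(t)\sin[\Omega t + \psi(t)]$, as in (4), up to rounding errors.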

6.7 Fourier transform of Heaviside function


Having armed ourselves with the notion of the principal value of singular integral,
we are now in a position to explore Fourier transforms of the Heaviside function and
related distributions. This topic was only briefly mentioned in Chapter 4. We shall
begin by rewriting the Hilbert transform (6.5.1) in the language of convolutions:

$$\psi(t) = \varphi(t) * \left[-\frac{1}{\pi}\,\mathcal{PV}\left(\frac{1}{t}\right)\right].$$

Comparing the Fourier transform (6.5.2) of this function with the formula
$f(t) * \varphi(t) \mapsto 2\pi \tilde f(\omega)\tilde\varphi(\omega)$ (3.2.8), after simple transformations we obtain that

$$\mathcal{PV}\left(\frac{1}{t}\right) \mapsto -\frac{i}{2}\,\mathrm{sign}(\omega).$$

Inverting this expression with the help of formula (3.2.5) we get that

$$\mathrm{sign}(t) \mapsto \frac{1}{\pi}\,\mathcal{PV}\left(\frac{1}{i\omega}\right). \tag{1}$$

Now, we are ready to recover the Fourier transform of the Heaviside function. To
do that we will represent the latter in the form

$$\chi(t) = \frac{1}{2} + \frac{1}{2}\,\mathrm{sign}(t).$$

The Fourier transform of the first summand is equal to $\delta(\omega)/2$ so that for the Fourier
transform of the Heaviside function we have

$$\chi(t) \mapsto \frac{1}{2}\,\delta(\omega) + \frac{1}{2\pi}\,\mathcal{PV}\left(\frac{1}{i\omega}\right).$$

It is convenient to write this relation with the help of the distributions (6.2.3):

$$\chi(t) \mapsto \frac{1}{2\pi i}\,\frac{1}{\omega - i0}. \tag{2}$$

Once the Fourier transform of the Heaviside function has been calculated we
can evaluate Fourier transforms of a wide class of functions

$$F(t) = \int_{-\infty}^{t} f(s)\, ds \tag{3}$$

representable by integrals with variable upper limit of absolutely integrable
functions $f(t)$. Indeed, if we represent $F(t)$ as a sum

$$F(t) = F(\infty)\chi(t) + G(t),$$

where

$$G(t) = F(t) - F(\infty)\chi(t),$$

then, according to (2), the Fourier transform of $F(t)$ can be expressed in terms of
the Fourier transform of $G(t)$ by

$$\tilde F(\omega) = F(\infty)\,\frac{1}{2\pi i}\,\frac{1}{\omega - i0} + \tilde G(\omega). \tag{4}$$

Example 1. Consider an absolutely integrable function $f(t) = \chi(t)e^{-pt}$, $p > 0$.
Then $G(t) = -\chi(t)e^{-pt}/p$. Its Fourier transform exists in the classical sense.
Thus, equation (4) implies that

$$\tilde F(\omega) = \frac{1}{2\pi i p}\left[\frac{1}{\omega - i0} - \frac{1}{\omega - ip}\right]. \tag{5}$$

Integration of (3) leads, in turn, to a new function which linearly increases as
$t \to \infty$. We shall learn how to find the Fourier transform of such functions by first
evaluating the Fourier transform of the absolute value function $|t|$.

Example 2. Let us write the absolute value function in the product form $|t| = t\,\mathrm{sign}(t)$.
The Fourier transforms of each of the two factors are already known
to us. Recall that $t \mapsto i\delta'(\omega)$ and that the Fourier transform of $\mathrm{sign}(t)$ is given
by formula (6.7.1). In this fashion, with the help of formula (3.2.10) according
to which the Fourier transform of a product is equal to convolution of the Fourier
transforms of factors, we get that

$$|t| = t\,\mathrm{sign}(t) \mapsto i\delta'(\omega) * \frac{1}{\pi}\,\mathcal{PV}\left(\frac{1}{i\omega}\right) = \frac{1}{\pi}\,\frac{d}{d\omega}\,\mathcal{PV}\left(\frac{1}{\omega}\right). \tag{6}$$

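Let us identify the new distribution arising on the right-hand side through its action
as a functional on an arbitrary test function $\varphi \in \mathcal{S}$:

$$\int \varphi(\omega)\,\frac{d}{d\omega}\,\mathcal{PV}\left(\frac{1}{\omega}\right) d\omega = -\mathcal{PV}\int \frac{\varphi'(\omega)}{\omega}\, d\omega.$$

Recall that the principal value of the above integral is, by definition, equal to

$$\mathcal{PV}\int \frac{\varphi'(\omega)}{\omega}\, d\omega = \lim_{\varepsilon \to 0}\left[\int_{-\infty}^{-\varepsilon} \frac{\varphi'(\omega)}{\omega}\, d\omega + \int_{\varepsilon}^{\infty} \frac{\varphi'(\omega)}{\omega}\, d\omega\right].$$

For the first integral in the brackets, integrating by parts, we obtain that

$$\int_{-\infty}^{-\varepsilon} \frac{\varphi'(\omega)}{\omega}\, d\omega = \int_{-\infty}^{-\varepsilon} \frac{\varphi(\omega) - \varphi(-\varepsilon)}{\omega^2}\, d\omega.$$

A similar transformation of the second integral yields eventually that

$$\mathcal{PV}\int \frac{\varphi'(\omega)}{\omega}\, d\omega = \lim_{\varepsilon \to 0}\left[\int_{-\infty}^{-\varepsilon} \frac{\varphi(\omega) - \varphi(-\varepsilon)}{\omega^2}\, d\omega + \int_{\varepsilon}^{\infty} \frac{\varphi(\omega) - \varphi(\varepsilon)}{\omega^2}\, d\omega\right].$$

The latter limit defines a new distribution

$$\mathcal{PV}\left(\frac{1}{\omega^2}\right),$$

which functionally acts via the formula

$$\mathcal{PV}\left(\frac{1}{\omega^2}\right)[\varphi(\omega)] = \lim_{\varepsilon \to 0}\left[\int_{-\infty}^{-\varepsilon} \frac{\varphi(\omega) - \varphi(-\varepsilon)}{\omega^2}\, d\omega + \int_{\varepsilon}^{\infty} \frac{\varphi(\omega) - \varphi(\varepsilon)}{\omega^2}\, d\omega\right].$$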

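Obviously, the last equality can be written in the form

$$\mathcal{PV}\left(\frac{1}{\omega^2}\right)[\varphi(\omega)] = \mathcal{PV}\int \frac{\varphi(\omega) - \varphi(0)}{\omega^2}\, d\omega, \tag{7}$$

or in an equivalent regularized form

$$\mathcal{PV}\left(\frac{1}{\omega^2}\right)[\varphi(\omega)] = \int_0^{\infty} \frac{\varphi(\omega) + \varphi(-\omega) - 2\varphi(0)}{\omega^2}\, d\omega. \tag{8}$$

Thus, we derived a new distribution-theoretic formula

$$\frac{d}{d\omega}\,\mathcal{PV}\left(\frac{1}{\omega}\right) = -\mathcal{PV}\left(\frac{1}{\omega^2}\right),$$

which yields the Fourier transform of function $|t|$:

$$|t| \mapsto -\frac{1}{\pi}\,\mathcal{PV}\left(\frac{1}{\omega^2}\right). \tag{9}$$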
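The regularized form (8) is an ordinary convergent integral and can be evaluated directly. The Python sketch below is an illustration only; the Gaussian test function and the helper name `pv_inverse_square` are our own assumptions. For $\varphi(\omega) = e^{-\omega^2}$ the value can be found by hand: integrating $\varphi'(\omega)/\omega = -2e^{-\omega^2}$ over the real line gives $-2\sqrt{\pi}$, which the quadrature should reproduce.

```python
import math

def pv_inverse_square(phi, n=20000):
    """Action of PV(1/omega^2) on a test function phi via the regularized
    form (8): the integral over (0, inf) of
        [phi(w) + phi(-w) - 2*phi(0)] / w^2,
    mapped to (0, pi/2) by w = tan(theta) and summed by the midpoint rule.
    """
    h = 0.5 * math.pi / n
    phi0 = phi(0.0)
    total = 0.0
    for j in range(n):
        w = math.tan((j + 0.5) * h)
        total += (phi(w) + phi(-w) - 2.0 * phi0) / (w * w) * (1.0 + w * w)
    return total * h

def gaussian_test_function(w):
    return math.exp(-w * w)

if __name__ == "__main__":
    print(pv_inverse_square(gaussian_test_function), -2.0 * math.sqrt(math.pi))
```

The negative value shows that this distribution, unlike the function $1/\omega^2$ away from the origin, is not a positive functional.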

6.8 Fractal integration


In the last three sections of this chapter we develop another class of important
singular integrals which arise when one tries to extend the notion of n-tuple integrals
and of n-th order derivatives of classical calculus to noninteger (or fractional) n.
We begin with the concept of fractal or fractional integration. It is natural to
introduce it as a generalization of the Cauchy formula

$$\int_{-\infty}^{t} dt_1 \int_{-\infty}^{t_1} dt_2 \cdots \int_{-\infty}^{t_{n-1}} g(t_n)\, dt_n = \frac{1}{(n-1)!}\int_{-\infty}^{t} (t - s)^{n-1} g(s)\, ds, \tag{1}$$

which expresses the result of n-tuple integration of function $g(t)$ of a single variable
via the single integration operator.
Before we move on to fractal integrals, let us take a closer look at the Cauchy
formula (1). It is valid for absolutely integrable functions $g(t)$ which decay for
$t \to -\infty$ sufficiently rapidly to guarantee the existence of the integral on the
right-hand side of (1). Assuming that the integrand $g(t)$ vanishes for $t < 0$, the
Cauchy formula can be rewritten in the form

$$\int_{0}^{t} dt_1 \int_{0}^{t_1} dt_2 \cdots \int_{0}^{t_{n-1}} g(t_n)\, dt_n = \frac{1}{(n-1)!}\int_{0}^{t} (t - s)^{n-1} g(s)\, ds. \tag{2}$$

Remark 1. The Cauchy formula can be viewed as an illustration of the general
Riesz Theorem about representation of any linear continuous (in a certain precise
sense) operator $L$ transforming function $g$ of one real variable into another function
$L[g]$ as an integral operator

$$L[g](t) = \int h(t, s)\, g(s)\, ds \tag{3}$$

with an appropriate kernel $h(t, s)$. In our case, the linear operator of n-tuple
integration in (1) has a representation via the single integral operator with kernel
$h(t, s) = (t - s)^{n-1}/(n - 1)!$ for $s \le t$ and $h(t, s) = 0$ for $s > t$.

Let us check the validity of the Cauchy formula (1) by observing that the n-tuple
integral in (1) is a solution of the differential equation

$$\frac{d^n}{dt^n}\, x(t) = g(t) \tag{4}$$

satisfying the causality principle. Such a solution, in view of (2.2.4), can be written
as convolution

$$x(t) = \int k_n(t - s)\, g(s)\, ds, \tag{5}$$

where

$$k_n(t) = \chi(t)\, y(t),$$

and $y(t)$ is the solution of the corresponding homogeneous equation

$$\frac{d^n}{dt^n}\, y(t) = 0$$

with the initial conditions

$$y(0) = y'(0) = \dots = y^{(n-2)}(0) = 0, \qquad y^{(n-1)}(0) = 1.$$

Solving the above initial-value problem we get

$$k_n(t) = \frac{1}{(n-1)!}\, t^{n-1} \chi(t),$$

the kernel that appears in the Cauchy formula (1).


This is a good point to introduce fractal integrals. Replacing integer $n$ in kernel
$k_n$ by an arbitrary positive real number $\alpha$ and the factorial $(n - 1)!$ by the gamma
function $\Gamma(\alpha)$ (see (4.4.3)) we arrive at the generalized kernel

$$k_\alpha(t) = \frac{1}{\Gamma(\alpha)}\, t^{\alpha - 1} \chi(t). \tag{6}$$

So it is natural to call the convolution operator

$$(I^\alpha g)(t) = k_\alpha(t) * g(t) \tag{7}$$

the fractal integration operator of order $\alpha$.
In the case when function $g(t) \equiv 0$ for $t < 0$ the convolution (7) reduces to the
integral

$$(I^\alpha g)(t) = \frac{1}{\Gamma(\alpha)}\int_0^t (t - s)^{\alpha - 1} g(s)\, ds, \tag{8}$$

where the upper integration limit reflects the causality property of the operator of
fractal integration.
Let us establish some of the important properties of the fractal integration
operator assuming, for simplicity, that function $g(t)$ in (8) is bounded and continuous.
Existence of fractal integrals. For $\alpha \ge 1$ the integrand in (8) is bounded and
continuous, and the integral exists in the Riemann sense. For $0 < \alpha < 1$, the
kernel $(t - s)^{\alpha - 1}$ is singular but the singularity is integrable and the integral is an
absolutely convergent improper integral.
Zero-order integration. As $\alpha \to 0+$ the operators $I^\alpha$ tend to the identity
operator. Indeed, in view of the recurrence formula $\Gamma(\alpha + 1) = \alpha\Gamma(\alpha)$ for the
gamma function and the fact that $\Gamma(1) = 1$ we have the asymptotics $\Gamma(\alpha) \sim 1/\alpha$
$(\alpha \to 0+)$. Hence,

$$\lim_{\alpha \to 0+} (I^\alpha g)(t) = \lim_{\alpha \to 0+} \int g(s)\, \chi(t - s)\, \alpha (t - s)^{\alpha - 1}\, ds.$$

Example 1 in Section 1.9 shows that the function $\chi(t - s)\alpha(t - s)^{\alpha - 1}$ weakly
converges to the Dirac delta $\delta(t - s - 0)$ as $\alpha \to 0+$.² Consequently,

$$\lim_{\alpha \to 0+} (I^\alpha g)(t) = g(t),$$

or, equivalently, in the distributional language,

$$k_0(t) = \delta(t). \tag{9}$$

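Iteration of fractal integrals. As in the case of usual n-tuple integrals, repeated
application of fractal integrals is subject to the rule

$$I^\alpha I^\beta = I^{\alpha + \beta}. \tag{10}$$

To see this it suffices to check that, for any $\alpha, \beta > 0$,

$$k_\alpha * k_\beta = k_{\alpha + \beta}. \tag{11}$$

Indeed, the left-hand side of (11), in view of (6), equals

$$k_\alpha * k_\beta = \frac{\chi(t)}{\Gamma(\alpha)\Gamma(\beta)}\int_0^t s^{\alpha - 1}(t - s)^{\beta - 1}\, ds,$$

so that, passing to the new dimensionless variable of integration $\tau = s/t$,

$$k_\alpha * k_\beta = \frac{\chi(t)\, t^{\alpha + \beta - 1}}{\Gamma(\alpha)\Gamma(\beta)}\, B(\alpha, \beta), \tag{12}$$

where

$$B(\alpha, \beta) = \int_0^1 \tau^{\alpha - 1}(1 - \tau)^{\beta - 1}\, d\tau$$

is the beta function. It can be expressed in terms of the gamma function by

$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}.$$

Substituting it into (12) we obtain equality (11).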
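Formula (8) and the iteration rule can be tried out numerically. The following Python sketch is a hedged illustration, not the authors' method: the substitution $v = (t-s)^\alpha$, which removes the integrable singularity of the kernel, and the helper name `fractional_integral` are our own choices. For $g(t) = \chi(t)\,t$ the result can be compared with $\Gamma(2)\,t^{1+\alpha}/\Gamma(2+\alpha)$, which follows from (11) because $\chi(t)\,t = k_2(t)$.

```python
import math

def fractional_integral(g, alpha, t, n=2000):
    """(I^alpha g)(t) from formula (8), for g vanishing on the negative half-axis.

    With v = (t - s)**alpha the integral in (8) becomes
        (1/alpha) * integral over 0 < v < t**alpha of g(t - v**(1/alpha)) dv,
    a regular integral that is summed by the midpoint rule.
    """
    if t <= 0.0:
        return 0.0
    upper = t ** alpha
    h = upper / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * h
        total += g(t - v ** (1.0 / alpha))
    return total * h / (alpha * math.gamma(alpha))

def ramp(s):
    # g(t) = t for t > 0 (the Heaviside factor is implied by the limits in (8)).
    return s

if __name__ == "__main__":
    t = 2.0
    print(fractional_integral(ramp, 0.5, t),
          math.gamma(2.0) / math.gamma(2.5) * t ** 1.5)

    # Two half-order integrations should amount to one ordinary integration.
    def half(s):
        return fractional_integral(ramp, 0.5, s, n=400)
    print(fractional_integral(half, 0.5, t, n=400), t * t / 2.0)
```

The second check nests one quadrature inside another, so it is deliberately run with fewer nodes; the agreement with $t^2/2$ is correspondingly less precise than in the first check.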

²This notation emphasizes that the support of this Dirac delta lies inside the interval $(0, t)$ so that
$\int_0^t \delta(t - s - 0)\, ds = 1$ and not 1/2 as in (2.9.6). A similar situation was encountered in formula
(2.9.4) for $\alpha \to +\infty$.

Fractal integrals as continuous operators. The following two inequalities show
that integration of fractal order has some continuity properties as a linear operation.
These properties will find an application in our construction of Brownian motion
in Chapter 14 of Volume 2, and for those purposes it will suffice to assume that
$0 < \alpha < 1$. First, observe that, for $\alpha > 1/2$, $I^\alpha$ is a continuous operator from
$L^2[0, 1]$ into $L^p[0, 1]$ for each $p < \infty$.³ Indeed, by the Schwarz Inequality,

$$\|I^\alpha f\|_p^p = \int_0^1 \left|\int_0^t k_\alpha(t - s) f(s)\, ds\right|^p dt \le \int_0^1 \left(\int_0^t f^2(s)\, ds\right)^{p/2} \left(\int_0^t k_\alpha^2(s)\, ds\right)^{p/2} dt$$

$$\le c(\alpha, p)\left(\int_0^1 f^2(s)\, ds\right)^{p/2} = c(\alpha, p)\,\|f\|_2^p, \tag{13}$$

where $c(\alpha, p)$ is a constant depending only on $\alpha$ and $p$.
Additionally, by a similar argument, but using the Hölder Inequality with
$1/p + 1/q = 1$,

$$\|I^\beta f\|_\infty = \sup_{0 \le t \le 1}\left|\int_0^t k_\beta(t - s) f(s)\, ds\right| \le \left(\int_0^1 |k_\beta(s)|^q\, ds\right)^{1/q} \left(\int_0^1 |f(s)|^p\, ds\right)^{1/p} = c(\beta, p)\,\|f\|_p, \tag{14}$$

so that for any $\beta > 1/p$, the operator $I^\beta$ is continuous from $L^p[0, 1]$ into the space
of continuous functions $C[0, 1]$.

³Recall that $L^p[0, 1]$ denotes the Lebesgue space of functions $f$ on the interval $[0, 1]$ which have
$p$th powers integrable, i.e., for which the norm $\|f\|_p := \left(\int_0^1 |f(s)|^p\, ds\right)^{1/p} < \infty$.

6.9 Fractal differentiation


The operator $D^\alpha$ of fractal or fractional differentiation is defined as the inverse
of the operator $I^\alpha$ of the fractal integration, that is, via the operator equation

$$D^\alpha I^\alpha = \mathrm{Id}, \tag{1}$$

where Id denotes the identity operator. Similarly to the fractal integration operator,
the fractal differentiation operator has an integral representation

$$(D^\alpha g)(t) = \int r_\alpha(t - s)\, g(s)\, ds, \tag{2}$$

where $r_\alpha(t)$ is the convolution kernel which will be identified next. To do that notice
that the operator equation (1) is equivalent to the convolution algebra equation

$$r_\alpha(t) * k_\alpha(t) = \delta(t), \tag{3}$$

where $k_\alpha$ is the fractal integration kernel (6.8.6). Denote by $\gamma$ the solution of the
equation $\alpha + \gamma = n$, where $n = \lceil \alpha \rceil$ is the smallest integer greater than or equal
to $\alpha$. In other words,

$$n - 1 < \alpha \le n, \qquad \gamma = n - \alpha, \qquad 0 \le \gamma < 1. \tag{4}$$

Applying the fractal integration operator $I^\gamma$ to both sides of (3) we get

$$r_\alpha(t) * k_n(t) = k_\gamma(t).$$

The expression on the left-hand side represents the usual n-tuple integral of the
fractal differentiation kernel $r_\alpha$. Thus, if we differentiate it $n$ times we arrive at the
explicit formula

$$r_\alpha(t) = k_\gamma^{(n)}(t), \qquad \gamma = n - \alpha \ge 0, \tag{5}$$

for the fractal differentiation kernel. The corresponding fractal differentiation
operator is then given by the convolution

$$(D^\alpha g)(t) = k_\gamma^{(n)}(t) * g(t). \tag{6}$$

For $t > 0$, the n-th derivative of the kernel $k_\gamma(t)$ appearing in (5) exists in the
classical sense and

$$r_\alpha(t) = \frac{1}{\Gamma(-\alpha)}\, t^{-\alpha - 1} = k_{-\alpha}(t), \qquad t > 0. \tag{7}$$

So, the fractal differentiation kernel is equal to the fractal integration kernel with
the opposite index $-\alpha$. Here, the gamma function $\Gamma(-\alpha)$ of negative noninteger
variable is defined via the above mentioned recurrence property as follows:

$$\Gamma(-\alpha) = \frac{\Gamma(n - \alpha)}{(n - \alpha - 1) \cdot \ldots \cdot (-\alpha)}.$$

Therefore, the fractal differentiation operator can be treated as the fractal
integration operator of the negative order:

$$D^\alpha = I^{-\alpha}, \tag{8}$$

which adds attractive symmetry to the fractal calculus.
Consequently, for a function $g(t)$ which vanishes on the negative half-axis,
equation (6) can be rewritten in the following symbolic integral form:

$$(D^\alpha g)(t) = \frac{1}{\Gamma(-\alpha)}\int_0^t (t - s)^{-\alpha - 1} g(s)\, ds. \tag{9}$$

For $\alpha \ge 0$, the above improper integral diverges in view of the nonintegrable
singularity of the integrand in the vicinity of the upper limit of integration. Therefore,
its values have to be taken as the values of the corresponding regularized integral
which can be found treating equality (6) as the convolution of distributions. In view
of properties of the distributional convolution, the operation of n-fold differentiation
can be shifted from the first convolution factor to the second so that

$$(D^\alpha g)(t) = k_\gamma(t) * g^{(n)}(t), \tag{10}$$

with converging integral on the right-hand side. In particular, for integer $\alpha = n$,
taking (6.8.9) into account, we obtain (as expected) that

$$(D^n g)(t) = g^{(n)}(t).$$

Example 1. Let us now consider the special case of a function $g(t)$ which
vanishes identically for $t < 0$ and is of the form

$$g(t) = \chi(t)\varphi(t), \tag{11}$$

where $\varphi(t)$ is an arbitrary infinitely differentiable function. Differentiating (11) $n$
times and taking into account the multiplier probing property (1.5.3) of the Dirac
delta, we obtain

$$g^{(n)}(t) = \chi(t)\varphi^{(n)}(t) + \sum_{m=0}^{n-1} \delta^{(m)}(t)\,\varphi^{(n-m-1)}(0).$$

Substituting the above formula in (10), and remembering that $\gamma = n - \alpha$ and

$$k_\gamma(t) * \delta^{(m)}(t) = k_\gamma^{(m)}(t) = k_{n-m-\alpha}(t), \qquad t > 0,$$

we finally get that

$$(D^\alpha g)(t) = \frac{1}{\Gamma(n - \alpha)}\int_0^t (t - s)^{n - \alpha - 1}\varphi^{(n)}(s)\, ds + \sum_{m=0}^{n-1} \varphi^{(n-m-1)}(0)\, k_{n-m-\alpha}(t), \qquad t > 0. \tag{12}$$

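In particular,

$$D^\alpha k_\beta = k_{\beta - \alpha}.$$

Also

$$D^\alpha\left(\chi(s)\, s^\beta \log|s|\right) = \frac{\Gamma(\beta + 1)}{\Gamma(\beta - \alpha + 1)}\,\chi(s)\, s^{\beta - \alpha}\left[\log|s| + C\right],$$

where the constant (see the literature in the Bibliography for its derivation and
other formulas of the fractal calculus)

$$C = (\log\Gamma)'(\beta + 1) - (\log\Gamma)'(\beta - \alpha + 1).$$

Furthermore, for $0 < \alpha < 1$, we have

$$(D^\alpha g)(t) = \frac{1}{\Gamma(1 - \alpha)}\int_0^t \frac{\varphi'(s)}{(t - s)^\alpha}\, ds + \varphi(0)\,\frac{t^{-\alpha}}{\Gamma(1 - \alpha)}, \qquad t > 0. \tag{13}$$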

Observe that, in contrast to (9), the singular integrals on the right-hand sides of (12)
and (13) converge absolutely, and that the regularizations (10), (12) define a new
distribution, the principal value of function $r_\alpha(t)\chi(t)$ (7):

$$\mathcal{PV}\,\chi(t)\,\frac{1}{\Gamma(-\alpha)}\, t^{-\alpha - 1}, \qquad \alpha \ge 0. \tag{14}$$

Its convolution

$$(D^\alpha g)(t) = \frac{1}{\Gamma(-\alpha)}\,\mathcal{PV}\int_0^t (t - s)^{-\alpha - 1} g(s)\, ds$$

with any function of the form (11) has a distributional interpretation via the
right-hand side of formula (12). •
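Formula (13) lends itself to a direct numerical check. The Python sketch below is an illustration under assumptions of our own: the polynomial $\varphi(t) = 1 + 3t + t^2$, the substitution $v = (t-s)^{1-\alpha}$ that regularizes the kernel, and the helper names are all hypothetical. The reference value comes from the rule $D^\alpha k_\beta = k_{\beta-\alpha}$ stated above, applied to $\chi = k_1$, $\chi t = k_2$ and $\chi t^2 = 2k_3$.

```python
import math

def fractional_derivative(phi, dphi, alpha, t, n=2000):
    """(D^alpha g)(t) for g = chi * phi and 0 < alpha < 1, following (13).

    phi and dphi are the smooth factor and its first derivative. With
    v = (t - s)**(1 - alpha) the singular integral in (13) turns into
        (1/(1 - alpha)) * integral over 0 < v < t**(1 - alpha)
                          of dphi(t - v**(1/(1 - alpha))) dv.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch covers 0 < alpha < 1 only")
    if t <= 0.0:
        raise ValueError("formula (13) is stated for t > 0")
    beta = 1.0 - alpha
    upper = t ** beta
    h = upper / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * h
        total += dphi(t - v ** (1.0 / beta))
    integral = total * h / beta
    # The second term is the slowly decaying tail produced by the jump phi(0).
    return (integral + phi(0.0) * t ** (-alpha)) / math.gamma(beta)

def poly(s):
    return 1.0 + 3.0 * s + s * s

def dpoly(s):
    return 3.0 + 2.0 * s

def poly_derivative_exact(alpha, t):
    # Term-by-term use of D^alpha k_beta = k_(beta - alpha).
    return (t ** (-alpha) / math.gamma(1.0 - alpha)
            + 3.0 * t ** (1.0 - alpha) / math.gamma(2.0 - alpha)
            + 2.0 * t ** (2.0 - alpha) / math.gamma(3.0 - alpha))

if __name__ == "__main__":
    for alpha in (0.3, 0.5):
        print(alpha, fractional_derivative(poly, dpoly, alpha, 1.5),
              poly_derivative_exact(alpha, 1.5))
```

The term proportional to $t^{-\alpha}$ does not vanish for large $t$ as quickly as any local quantity would, which is a first glimpse of the nonlocal character of fractal derivatives.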

The following properties of the operation of differentiation of fractal order
highlight its peculiarities.

Nonlocal character. Values $D^n g(t)$ of the usual derivatives of integer orders
depend only on values of function $g(t)$ in the immediate and arbitrarily small
(infinitesimal) vicinity of the point $t$. By contrast, the fractal (noninteger) derivatives
are nonlocal operators since the value $D^\alpha g(t)$ depends on the values of $g(t')$ for
all $t' < t$. In particular, this fact explains why a function's discontinuity at a
certain point ($t = 0$ for function (11)) generates slowly decaying "tails" in its fractal
derivatives (the last sum on the right-hand side of formula (12)).

Causality. Fractal derivatives enjoy the causality property: If function $g(t)$ is
identically equal to zero for $t < t_0$ then so is its fractal derivative.

Scale invariance. Like usual derivatives, fractal derivatives are scale invariant.
This means that differentiation of the compressed ($\kappa > 1$) function $g_\kappa(t) = g(\kappa t)$
just requires multiplication of the compressed derivative by a power of the compression factor:

$$(D^\alpha g_\kappa)(t) = \kappa^\alpha (D^\alpha g)(\kappa t). \tag{15}$$

Fourier transform. Under Fourier transformation, fractal derivatives behave just
like the ordinary derivatives. In the distributional sense

$$(D^\alpha g)(t) \mapsto (i\omega)^\alpha\, \tilde g(\omega). \tag{16}$$

This formula follows directly from (10) and from the results of Section 4.4.

Remark 1. Fractal Laplacians. The above definitions of fractal differential
operators for functions of one variable can be extended to fractal partial differential
operators for functions of several variables (see the literature at the end for further
details). For example, the fractal Laplacian can be defined through the Fourier
transform approach as follows: For any $\varphi \in \mathcal{D}(\mathbf{R}^d)$,

The following integration by parts formula is then obtained via the Parseval equality

It is also clear that the fundamental solution (Green's function) $G_\alpha$ of the equation
$-\Delta_\alpha u = \delta$ has the Fourier transform

The explicit inversion depends on the dimension $d$ of the space. If $\alpha = 2$, $d \ge 3$,
or $0 < \alpha < 2$, $d \ge 2$, or $0 < \alpha < 1$, $d = 1$, then

6.10 Fractal relaxation


Recently, physicists have found more and more applications for the fractal
calculus, which permits construction of generalized mathematical models of such
phenomena as relaxation, diffusion and wave propagation. In this section we will
illustrate these possibilities in the example of relaxation processes.
Informally, one says that a physical system has the relaxation property if within
finite time $\tau$ (called the relaxation time) of cessation of external perturbation the
system "forgets" the perturbation and returns to its original state. Such systems are
encountered in a wide spectrum of applied problems from physics and electrical
engineering to biology and economics. In the simplest case, the mathematical
model of a linear relaxing system is the first-order ordinary differential equation

$$x' + px = g(t), \tag{1}$$

where function $g(t)$ describes the external perturbation of the system and
$p = 1/\tau > 0$ is called the relaxation frequency. The response $x(t)$ of the system (1) to
the external perturbation which satisfies the causality principle is given by formula
(2.2.4):

$$x(t) = H(t) * g(t). \tag{2}$$

The fundamental solution $H(t)$ entering (2) satisfies equation (2.2.5) which, in our
particular case, is of the form

$$H' + pH = \delta(t), \qquad H(t) = 0 \ \text{ for } t < 0. \tag{3}$$

Its well known solution is

$$H(t) = \chi(t)\exp(-pt). \tag{4}$$

From the physicist's perspective the main feature of the relaxation model (1) is
the presence of a unique (for this system) characteristic relaxation time $\tau = 1/p$.
The model itself is just a special case of the whole family of justifiable models
described by fractal differential equations

$$D^\alpha x + p^\alpha x = g(t), \qquad 0 < \alpha \le 1. \tag{5}$$

For obvious dimensional reasons, to preserve the frequency dimension of $p$,
equation (5) contains the $\alpha$-th power of $p$. Equation (5) appears in the physical literature
as an adequate model of relaxation processes in viscoelastic materials.
As was the case for equation (1), the solution of equation (5) satisfying the
causality principle is given by the expression (2)

$$x(t) = H(t) * g(t),$$

where the fundamental solution $H(t)$ satisfies the fractal differential equation

$$D^\alpha H + p^\alpha H = \delta(t), \qquad H(t) = 0 \ \text{ for } t < 0. \tag{6}$$

We will find the needed fundamental solution by the recursive method. As the
first step, let us apply the operator of fractal integration $I^\alpha$ to both sides of (6) to
obtain equation

$$H(t) + p^\alpha\, k_\alpha(t) * H(t) = k_\alpha(t), \tag{7}$$

where $k_\alpha(t)$ (6.8.6) is the kernel of the fractal integration operator. We shall
represent the solution of (7) as a power series

$$H(t) = \sum_{n=0}^{\infty} p^{n\alpha} H_n(t) \tag{8}$$

in parameter $p$, which will be substituted in (7) to compare coefficients for the
same powers $p^{n\alpha}$. As a result we obtain an infinite system of recurrence relations

$$H_0(t) = k_\alpha(t), \qquad H_{n+1}(t) = -k_\alpha(t) * H_n(t), \qquad n = 0, 1, 2, \ldots.$$

Taking into account property (6.8.10) of the fractal integration kernels, we can
solve the above system to get that

$$H_n(t) = (-1)^n k_{(n+1)\alpha}(t).$$

Substituting this expression into (8) we arrive at the final formula for the
fundamental solution of the fractal differential equation (5):

$$H(t) = \chi(t)\,\frac{t^{\alpha - 1}}{\Gamma(\alpha)}\, R(\mu, \alpha), \tag{9}$$

where

$$R(\mu, \alpha) = \Gamma(\alpha)\sum_{n=0}^{\infty} \frac{(-1)^n \mu^n}{\Gamma(n\alpha + \alpha)} \tag{10}$$

is a function of the dimensionless variable $\mu = (pt)^\alpha$. It describes the fundamental
laws of fractal relaxation. The Heaviside function on the right-hand side of (9)
ensures that the causality principle is satisfied for the fundamental solution and the
power factor $t^{\alpha - 1}$ secures the correct dimensionality of $H$.
One can easily prove that the series (10) converges absolutely for any $\mu$. This,
in particular, implies that the function (9) is indeed a solution of the equation (6).
For $\alpha = 1$, series (10) becomes the Taylor series for the standard exponential
relaxation

$$R(\mu, 1) = \exp(-\mu). \tag{11}$$

For other values of $\alpha$, function $R(\mu, \alpha)$ is a special example of the so-called
Mittag-Leffler functions which relatively seldom appear in applied problems. For
certain particular values of parameter $\alpha$, function $R(\mu, \alpha)$ can be expressed in
terms of more familiar special functions.
Example 1: Relaxation of order 1/2. For $\alpha = 1/2$, series (10) can be split

$$R = R_{\text{even}} - R_{\text{odd}} \tag{12}$$

into its even and odd parts, and the odd part series can be summed easily to get

$$R_{\text{odd}} = \sqrt{\pi}\,\mu \sum_{m=0}^{\infty} \frac{\mu^{2m}}{m!} = \sqrt{\pi}\,\mu\, e^{\mu^2}, \tag{13}$$

since $\Gamma(1/2) = \sqrt{\pi}$. The even part is

$$R_{\text{even}} = 1 + \sqrt{\pi} \sum_{m=1}^{\infty} \frac{\mu^{2m}}{\Gamma(m + 1/2)}.$$

Using the recurrence property of the gamma function, the denominator can be written in the form

$$\Gamma(m + 1/2) = \frac{1 \cdot 3 \cdots (2m - 1)}{2^m}\,\sqrt{\pi}. \tag{14}$$

Finally, changing the index of summation to $n = m - 1$, we get

$$R_{\text{even}} = 1 + 2\mu \sum_{n=0}^{\infty} \frac{\mu^{2n+1}\, 2^n}{1 \cdot 3 \cdots (2n + 1)}.$$

Browsing through mathematical tables$^4$ we run into the expansion

$$\operatorname{erf}(z) := \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt = \frac{2}{\sqrt{\pi}}\, e^{-z^2} \sum_{n=0}^{\infty} \frac{z^{2n+1}\, 2^n}{1 \cdot 3 \cdots (2n + 1)} \tag{15}$$

for the well-known error function, so that the even part can now be written in the form

$$R_{\text{even}} = 1 + \sqrt{\pi}\,\mu\, e^{\mu^2} \operatorname{erf}(\mu). \tag{16}$$

Substituting (16) and (13) into (12) we arrive at the final expression for the 1/2-fractal relaxation function:

$$R(\mu, 1/2) = 1 - \sqrt{\pi}\,\mu\, e^{\mu^2} \operatorname{erfc}(\mu), \tag{17}$$

where

$$\operatorname{erfc}(z) = 1 - \operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_z^{\infty} e^{-t^2}\,dt \tag{18}$$

is the complementary error function. The graphs of the exponential and 1/2-fractal relaxation functions are compared in Fig. 6.10.1.

FIGURE 6.10.1
Comparison of exponential, fractal and Kohlrausch relaxation functions for $\alpha = 1/2$.
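The closed forms (11) and (17) can be checked numerically against partial sums of the series (10). The following Python sketch is only an illustration. The function names `fractal_relaxation` and `half_order_closed_form` and the truncation at 120 terms are assumptions made here, not part of the text, and the logarithm of the gamma function is used merely to avoid overflow in the individual terms.

```python
import math

def fractal_relaxation(mu, alpha, terms=120):
    """Partial sum of series (10): Gamma(alpha) * sum_n (-1)^n mu^n / Gamma(n*alpha + alpha)."""
    if mu < 0.0:
        raise ValueError("mu = (p t)^alpha must be nonnegative")
    if mu == 0.0:
        return 1.0  # only the n = 0 term survives
    total = 0.0
    for n in range(terms):
        # mu^n / Gamma(n*alpha + alpha), computed through logarithms to avoid overflow
        magnitude = math.exp(n * math.log(mu) - math.lgamma(n * alpha + alpha))
        total += magnitude if n % 2 == 0 else -magnitude
    return math.gamma(alpha) * total

def half_order_closed_form(mu):
    """Right-hand side of (17): 1 - sqrt(pi) * mu * exp(mu^2) * erfc(mu)."""
    return 1.0 - math.sqrt(math.pi) * mu * math.exp(mu * mu) * math.erfc(mu)

if __name__ == "__main__":
    for mu in (0.5, 1.0, 2.0):
        print(mu,
              fractal_relaxation(mu, 1.0), math.exp(-mu),
              fractal_relaxation(mu, 0.5), half_order_closed_form(mu))
```

Because the series alternates, the partial sums lose accuracy through cancellation once $\mu$ becomes large, so for large arguments the closed form (17) is the safer choice.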


$^4$ E.g., Handbook of Mathematical Functions, M. Abramowitz and I. A. Stegun, Eds., formula 7.1.6.

Remark 1. Asymptotics of fractal relaxation. For $\alpha < 1$, the fractal relaxation systems display slow, power-type long-range decay rates in response to Dirac delta pulse perturbations, in contrast to the classical exponential relaxation function. We shall check this phenomenon for the particular case $\alpha = 1/2$ discussed in the above example. Since

$$\sqrt{\pi}\, z\, e^{z^2} \operatorname{erfc}(z) \sim 1 + \sum_{m=1}^{\infty} (-1)^m \frac{1 \cdot 3 \cdots (2m - 1)}{(2z^2)^m}, \qquad (z \to \infty),$$

(see the asymptotic formula 7.1.23 in the above-mentioned mathematical tables) the main asymptotics of the 1/2-fractal relaxation function (17) is

$$R \sim \frac{1}{2\mu^2}, \qquad (\mu \to \infty).$$

The corresponding asymptotics of the fundamental solution $H(t)$ (9) for $\alpha = 1/2$ and large times $t \gg 1/p$ is described by the expression

$$H(t) \sim \frac{1}{2pt\sqrt{\pi t}}, \qquad (t \to \infty).$$
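The power-law tail can be seen directly from the closed form of the 1/2-fractal relaxation function, $R(\mu, 1/2) = 1 - \sqrt{\pi}\,\mu\,e^{\mu^2}\operatorname{erfc}(\mu)$. The sketch below (Python, with helper names that are illustrative assumptions) compares this function with the leading term $1/(2\mu^2)$ and with the exponential relaxation $\exp(-\mu)$ at a few moderately large values of $\mu$. Much larger arguments would overflow `math.exp(mu * mu)` in double precision, so the range is kept modest.

```python
import math

def half_order_relaxation(mu):
    """Closed form of R(mu, 1/2): 1 - sqrt(pi) * mu * exp(mu^2) * erfc(mu)."""
    return 1.0 - math.sqrt(math.pi) * mu * math.exp(mu * mu) * math.erfc(mu)

def leading_asymptotics(mu):
    """Leading term 1 / (2 mu^2) of R(mu, 1/2) as mu -> infinity."""
    return 1.0 / (2.0 * mu * mu)

if __name__ == "__main__":
    for mu in (3.0, 5.0, 10.0):
        exact = half_order_relaxation(mu)
        # columns: mu, exact value, power-law approximation, exponential relaxation
        print(mu, exact, leading_asymptotics(mu), math.exp(-mu))
```

The ratio of the first two printed columns approaches 1 as $\mu$ grows, while the exponential column falls far below both.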

Remark 2. Kohlrausch relaxation function. Modelers of relaxation phenomena also often use the Kohlrausch relaxation functions (also called stretched exponentials)

$$P_\alpha(t) = \exp[-(pt)^\alpha], \qquad 0 < \alpha < 1,$$

which, for large times $t$, decay slower than the classical exponential relaxation function but faster than the corresponding $\alpha$-fractal relaxation functions (see Fig. 6.10.1). It is easy to see that the function $P_\alpha(t)$ satisfies the differential equation

$$\frac{dP_\alpha(t)}{dt} + \alpha p^\alpha t^{\alpha-1} P_\alpha(t) = 0,$$

or equivalently (see (1.7.3))

$$\frac{dP_\alpha}{d(t^\alpha)} + p^\alpha P_\alpha = 0.$$

Thus, it can be seen as a version of the classical exponential relaxation function in $\alpha$-fractally rescaled time or, to coin a new term, as an $\alpha$-fractal time relaxation function. In this terminology, the function $H$, which was just called the $\alpha$-fractal relaxation function, should perhaps be called the $\alpha$-fractal frequency relaxation function in view of its Fourier transform properties.

6.11 Exercises
Principal value
1. Find the solution $u(x, t)$ of the forced wave equation

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \delta(x)\,\frac{df(t)}{dt},$$

where $f(t)$ is an absolutely integrable function. Hint: Pass to the Fourier image $\tilde u(k, \omega) = (1/4\pi^2) \int\!\!\int u(x, t) \exp(-ikx - i\omega t)\,dx\,dt$ of the solution, and then take the inverse Fourier transform taking into account the causality principle.
Hilbert transform
2. Find the Hilbert transform of the rectangular impulse $\varphi(t) = \chi(t) - \chi(t - \tau)$.
3. Find the Hilbert transform of the Lorentz-type function
$$\varphi(t) = \frac{1}{t^2 + \tau^2}.$$
4. What is the Hilbert transform of $\varphi(t) = e^{i\Omega t}$?
5. Find the Hilbert transform of $\varphi(t) = \sin(\nu t)/\nu t$.

6. Prove that if $\psi(t)$ is the Hilbert transform of $\varphi(t)$ (in brief, $\varphi(t) \overset{H}{\mapsto} \psi(t)$), then:
(a) $\varphi(t + \tau) \overset{H}{\mapsto} \psi(t + \tau)$;
(b) $\varphi(at) \overset{H}{\mapsto} \psi(at)$, $a > 0$;
(c) $\varphi(-t) \overset{H}{\mapsto} -\psi(-t)$, and, in particular, if $\varphi$ is an even function then $\psi$ is odd and vice versa;
(d) If $\varphi(t)$ is an $n$-times differentiable function, then
$$\varphi^{(m)}(t) \overset{H}{\mapsto} \psi^{(m)}(t), \qquad m = 1, 2, \dots, n;$$
(e) "Energies" (or, in other words, $L^2$ norms) of $\varphi(t)$ and $\psi(t)$ are identical:
$$\int |\varphi(t)|^2\,dt = \int |\psi(t)|^2\,dt;$$
(f) The convolution $\varphi(t) * \varphi(t) = -\psi(t) * \psi(t)$;
(g) If $\int \varphi(t)\,dt = 2\pi\tilde\varphi(0) \ne 0$, then $\psi(t) \sim -2\tilde\varphi(0)/t$, $(t \to \infty)$;
(h) If the function $\varphi(t)$ has a jump $\lfloor\varphi\rceil = \varphi(\tau + 0) - \varphi(\tau - 0)$ at the point $t = \tau$, then $\psi(t)$ has at that point a logarithmic singularity:
$$\psi(t) \sim -\frac{1}{\pi}\,\lfloor\varphi\rceil \ln|t - \tau|, \qquad (t \to \tau);$$
(i) If $\varphi(t)$ is even, smooth and bounded, then
$$\psi(t) \sim -ct, \qquad (t \to 0),$$
where
$$c = \frac{2}{\pi} \int_0^{\infty} \frac{\varphi(0) - \varphi(t)}{t^2}\,dt.$$

Analytic signals
7. Find the dependence of an analytic signal $\zeta(t) = \xi(t) + i\eta(t)$ on the real time variable if its real part is
$$\xi(t) = \frac{\cos\Omega t}{t^2 + \tau^2}.$$
8. Let $\xi(t) = (\sin\nu t/\nu t) \cos\Omega t$. Find $\zeta(t)$.
9. Let
$$\xi(t) = f(t)\cos\Omega t + g(t)\sin\Omega t, \qquad \Omega > 0, \tag{1}$$
where the functions $f(t)$ and $g(t)$ have finite-support Fourier images, identically equal to zero for $|\omega| \ge \Omega$. Then the corresponding analytic signal is equal to
$$\zeta(t) = [f(t) - ig(t)]\,e^{i\Omega t}. \tag{2}$$
Prove it.
Remark 1. Recall that the imaginary part of the analytic signal (6.6.2) is equal to minus the Hilbert transform (6.5.1) of $\xi(t)$. This implies the following corollary to the above result: If $\varphi = f\cos\Omega t + g\sin\Omega t$, then its Hilbert transform is $\psi = g\cos\Omega t - f\sin\Omega t$.
Remark 2. Signals with finite-support Fourier image seldom appear in electrical engineering applications. However, for narrow-band signals the replacement of the actual analytic signal by the expression (6.9.3) gives a rather good approximation. For example, the signal from Exercise 7 has an unbounded Fourier image but is narrow-band if $\Omega\tau \gg 1$. In this case, it is easy to see that the approximate expression
$$\zeta_a(t) = \frac{1}{t^2 + \tau^2}\,e^{i\Omega t}$$
is very close to $\zeta(t)$.

10. Let $\xi(t) = t\sin\Omega t$, $\Omega > 0$. Find $\zeta(t)$.


11. Use the concept of analytic signal to find the instantaneous amplitude, phase and
frequency of

12. Let $\xi(t) = \chi(t)\sin\Omega t$. Find the imaginary part $\eta(t)$ of the corresponding analytic signal.
13. Let $\xi(t) = \chi(t)\,t^{\alpha-1}$, $0 < \alpha < 1$. Find $\zeta(t)$.
14. Find the analytic signal $\zeta_e(t)$ which corresponds to the even function $\xi_e(t) = |t|^{\alpha-1}$.
15. Let $\xi(t) = \delta'(t)$. Find the imaginary part $\eta(t)$ of the corresponding analytic signal.
16. Let $H$ be the Hilbert transform operator
$$H\phi(t) = \frac{1}{\pi}\,\mathcal{PV}\!\int \frac{\phi(s)}{s - t}\,ds.$$
Find the set of functions $\phi(t)$ on which $H$ acts as just the shift operator, i.e., $H\phi(t) = \phi(t + T)$, for a certain $T$.
Fractal calculus
17. Extend the Cauchy formula to the case of the $n$-tuple integral

$$x(t) = \int_{-\infty}^{t} dt_1\, a_1(t_1) \int_{-\infty}^{t_1} dt_2\, a_2(t_2) \cdots \int_{-\infty}^{t_{n-1}} dt_n\, a_n(t_n)\, g(t_n),$$

where $a_1(t), a_2(t), \dots, a_n(t)$ are known functions such that the above $n$-tuple integral converges absolutely for any integrable function $g(t)$.
18. Let $A \subset R^n$. Express the multiple integral

$$I = \int \cdots \int_A a(x_1, x_2, \dots, x_n)\, g(b(x_1, x_2, \dots, x_n))\,dx_1 \dots dx_n$$

via a single integral of the function $g(u)$.


Chapter 7
Uncertainty Principle and Wavelet
Transforms

The method of wavelet transforms, which provides a decomposition of functions in terms of a fixed family of functions of constant shape but varying scales and locations, has recently acquired broad significance in the analysis of signals and of experimental data from various physical phenomena. It is clear that the potential of this method has not yet been fully tapped. Nevertheless, its value for the whole spectrum of problems in many areas of science and engineering, including the study of electromagnetic and turbulent hydrodynamic fields, image reconstruction algorithms, prediction of earthquakes and tsunami waves, and statistical analysis of economic data, is by now quite obvious.
Although the systematic ideas of wavelet transforms have been developed only since the early 80s, to get the proper intuitions about the sources of their effectiveness it is necessary to become familiar with a few more traditional ideas, tools and methods. One of those is the celebrated uncertainty principle for the Fourier transforms, which will be given special attention in this chapter. A close relative of the wavelet transform, the windowed Fourier transform, will also be studied in this context. We begin, though, with a brief sketch of the notion of the functional Hilbert space, which provides a convenient framework for our analysis.

7.1 Functional Hilbert spaces


The extension of classic 3-D Euclidean geometry concepts such as the space, vector, composition, multiplication of vectors by scalars, inner product of vectors, angle, orthogonality and parallelism, to a broad class of mathematical objects, was one of the success stories of twentieth century mathematics. As a result, a multitude of functional spaces were introduced, studied and added to our permanent arsenal. The linear topological spaces of distributions briefly described in Section 1.9 are one such example. In this section, we discuss another class of functional spaces called Hilbert spaces.

FIGURE 7.1.1
Composition of 2-D vectors.

At first, recall the basic notions of the usual 3-D geometry, where each point of the space is identified, in a fixed Cartesian coordinate system, with a vector $a$ anchored at the origin and with its tip at the given point. Each such vector is uniquely described by its coordinates $(a_1, a_2, a_3)$, an ordered triple of real numbers. If $b = (b_1, b_2, b_3)$ is another 3-D vector, then the inner or scalar product of these two vectors is defined by the equality

$$(a, b) = a_1 b_1 + a_2 b_2 + a_3 b_3. \tag{1}$$

Since, alternatively,
$$(a, b) = \|a\|\,\|b\| \cos\alpha,$$
where, by Pythagoras' theorem,

$$\|a\|^2 = (a, a) \tag{2}$$

is the square of the vector's norm (length, magnitude) and $\alpha$ is the angle between the two vectors, the inner product of two vectors clearly depends on their mutual orientation. In particular,

$$a \perp b \quad \text{if and only if} \quad (a, b) = 0.$$

The geometric composition of vector $a$ with vector $b$ (see Fig. 7.1.1) corresponds to the algebraic operation

$$c = a + b \tag{3}$$

of vector addition. The vector $c = (c_1, c_2, c_3)$ is called the sum of vectors $a$ and $b$ if its coordinates are sums of the corresponding coordinates of the summand vectors: $c_n = a_n + b_n$, $n = 1, 2, 3$. Such an addition operation is obviously commutative, that is,
$$a + b = b + a. \tag{4}$$
Besides addition, one introduces the operation of multiplication of a vector by a scalar:
$$ra = (ra_1, ra_2, ra_3), \tag{5}$$

which geometrically represents vector contraction for $|r| < 1$, vector dilation if $|r| > 1$, and vector reflection in the case of $r = -1$.
The following three properties of the inner product and the norm are fundamental for the geometric properties of the Euclidean space:
(i) The norm is homogeneous, that is, for any scalar $r$ and vector $a$,

$$\|ra\| = |r|\,\|a\|. \tag{6}$$

(ii) The norm and the addition operation are related via the triangle inequality, that is, for any two vectors $a$ and $b$,

$$\|a + b\| \le \|a\| + \|b\|. \tag{7}$$

(iii) The norms and the inner product are related by the Schwarz inequality

$$|(a, b)| \le \|a\|\,\|b\|. \tag{8}$$

The first step in the generalization of the geometry of 3-D Euclidean spaces to abstract functional Hilbert spaces consists of two observations:
(a) The concept of the inner product (and the related geometry of the space) can be immediately extended to $d$-dimensional Euclidean spaces by defining, for any $a = (a_1, \dots, a_d)$ and $b = (b_1, \dots, b_d)$,

$$(a, b) = \sum_{n=1}^{d} a_n b_n.$$

(b) If one wants to operate with vectors with complex (rather than real) coordinates and preserve the positivity of the norm, the only adjustment in the definition of the inner product is as follows:

$$(a, b) = \sum_{n=1}^{d} a_n^* b_n, \tag{9}$$

where, as usual, the asterisk denotes complex conjugation. Then, the square

$$\|a\|^2 = \sum_{n=1}^{d} a_n^* a_n = \sum_{n=1}^{d} |a_n|^2 \ge 0 \tag{10}$$

defines a positive norm

$$\|a\| = \sqrt{(a, a)}. \tag{11}$$

In the complex case, the symmetry of the inner product is replaced by the Hermitian property
$$(a, b) = (b, a)^*. \tag{12}$$

The linearity with respect to the second variable in the inner product is preserved, though, as for any complex number $z$,

$$(a, zb) = z(a, b). \tag{13}$$
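The complex inner product and its basic properties are easy to experiment with. The short Python sketch below represents vectors as plain lists of complex numbers. The function names `inner` and `norm` and the sample vectors are illustrative assumptions, and the convention of conjugating the first argument follows the linearity in the second variable stated above.

```python
import math

def inner(a, b):
    """Inner product: sum of conj(a_n) * b_n, linear in the second argument."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x.conjugate() * y for x, y in zip(a, b))

def norm(a):
    """Norm: square root of (a, a), which is real and nonnegative."""
    return math.sqrt(inner(a, a).real)

if __name__ == "__main__":
    a = [1 + 2j, -1j, 3 + 0j]
    b = [2 - 1j, 4 + 0j, 1j]
    z = 0.5 - 2j
    print(inner(a, b), inner(b, a).conjugate())            # Hermitian property
    print(inner(a, [z * y for y in b]), z * inner(a, b))   # linearity in the second variable
    print(abs(inner(a, b)), norm(a) * norm(b))             # Schwarz inequality: first <= second
    print(norm([x + y for x, y in zip(a, b)]), norm(a) + norm(b))  # triangle inequality
```

Each printed pair illustrates one property: the first two pairs coincide, and in the last two the first number does not exceed the second.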

Example 1. Inner product space of polynomials. Let us consider the set $l_N^2$ of all polynomials

$$a_0 + a_1 x + \dots + a_{N-1} x^{N-1}$$

of degree at most $N - 1$ with complex coefficients. Each of these polynomials is uniquely determined by its coefficients for different powers of $x$, that is, by the (complex) vector $a = (a_0, \dots, a_{N-1})$. The sum of such polynomials is again a polynomial of the above type, and the same is true for a product of a complex number and such a polynomial. Moreover, the summation and multiplication by scalars in the family $l_N^2$ of such polynomials correspond to the analogous operations on the coefficient vectors and identify the vector space structure of the family $l_N^2$ of all polynomials of degree at most $N - 1$ as that of an $N$-dimensional complex vector space. The natural inner product leads to the notion of the "distance" $l = \|a - b\|$ between polynomials with coefficient vectors $a$ and $b$. •
Example 2. Inner product space of complex exponentials. Consider the set of all infinite sums of complex exponentials

$$E(x) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} a_n e^{inx}, \qquad x \in [-\pi, \pi], \tag{14}$$

with complex coefficients. This set forms a natural vector space under termwise addition and multiplication by scalars and can be identified with the (infinite-dimensional) vector space of coefficient vectors (sequences) $a = (\dots, a_{-1}, a_0, a_1, a_2, \dots)$. However, if we wanted to introduce the inner product in such a space associated with the norm

$$\|E\| = \left( \sum_{n=-\infty}^{\infty} |a_n|^2 \right)^{1/2},$$

we immediately run into the question of convergence of the above series, and to proceed we have to assume additionally that

$$\sum_{n=-\infty}^{\infty} |a_n|^2 < \infty. \tag{15}$$

This is the first fundamental difference with the finite dimensional spaces. The subset $l^2$ of sums $E$ satisfying condition (15) remains closed under the operations of termwise addition and multiplication by scalars since $\|zE\| = |z|\,\|E\|$ and

$$\|E + F\| \le \|E\| + \|F\|$$

for any sums $E$, $F$ and scalar $z$.

Attempts to generalize the above examples immediately lead us to the idea of the functional Hilbert space $L^2(R)$ of complex-valued functions $f(x)$ defined on the entire real axis $R$ and such that

$$\int |f(x)|^2\,dx < \infty. \tag{16}$$

Now, we can introduce the inner product in $L^2(R)$ via the formula

$$(f, g) = \int f^*(x)\,g(x)\,dx \tag{17}$$

and the related norm

$$\|f\| = \sqrt{(f, f)} = \left( \int |f(x)|^2\,dx \right)^{1/2}. \tag{18}$$

In view of the classical integral Schwarz inequality

$$|(f, g)| = \left| \int f^*(x)\,g(x)\,dx \right| \le \left( \int |f(x)|^2\,dx \right)^{1/2} \left( \int |g(x)|^2\,dx \right)^{1/2} = \|f\|\,\|g\|, \tag{19}$$

condition (16) assures that the inner product (17) is well defined (i.e., that $f^* g$ is integrable). The Schwarz inequality also immediately leads to (see Exercises) the triangle inequality

$$\|f + g\| \le \|f\| + \|g\|, \qquad f, g \in L^2(R), \tag{20}$$

which, incidentally, implies that the Hilbert space $L^2(R)$ is closed under the usual pointwise addition of functions. It is also closed under multiplication by scalars since, by (18),

$$\|zf\| = |z|\,\|f\|.$$

The above inner product (17) is Hermitian, that is,

$$(f, g) = (g, f)^*,$$

and homogeneous in the second variable, as

$$(f, zg) = z(f, g)$$

for any complex constant $z$ and $f, g \in L^2(R)$.

Of course, the norm in $L^2(R)$ is nonnegative, that is, for any $f \in L^2(R)$

$$\|f\| \ge 0,$$

and if $f(x) \equiv 0$ then $\|f\| = 0$.

A mathematical aside: functions with vanishing norm and the Lebesgue integral.$^1$ It is quite clear that besides the function identically equal to zero there are other functions $f(x) \not\equiv 0$ such that $\|f\| = 0$. In other words, $\|f\| = 0$ does not necessarily imply $f(x) \equiv 0$. For example, any function different from zero at a finite or countable number of points would have norm zero. This creates the somewhat unpleasant situation of having two different functions $f(x) \not\equiv g(x)$ for which the distance $\|f - g\| = 0$. The satisfactory resolution of this problem is not possible within the framework of the Riemann integral which we implicitly used throughout the preceding chapters (and which is sufficient for our other purposes). It requires the introduction of the more general Lebesgue integral (hence the letter $L$ in the notation of the functional Hilbert space), which permits integration of a much broader class of functions than the Riemann integral (see the bibliographical notes at the end). For example, the Dirichlet function

$$D(x) = \begin{cases} 0, & \text{if } x \text{ is irrational;} \\ 1, & \text{if } x \text{ is rational;} \end{cases}$$

is not integrable on $[0, 1]$ in the Riemann sense since the upper approximating sums (always equal to 1) and the lower approximating sums (always equal to 0) do not converge to the same number. However, it is integrable in the Lebesgue sense and its Lebesgue integral is equal to 0.
Interpreting the integrals in (16-18) as Lebesgue integrals, it is customary to formally define the functional Hilbert space $L^2(R)$ not just as the space of square-integrable functions but as the space of equivalence classes of square-integrable functions, where two functions $f$, $g$ are understood as equivalent (written $f = g$) if and only if $\|f - g\| = 0$. It is easy to see that, if we define the Lebesgue measure $|A|$ of the set $A \subset R$ as the Lebesgue integral of its indicator function $I_A(x)$ (equal to 1 on $A$ and to 0 off $A$), then two functions belong to the same equivalence class if and only if they differ on a set of Lebesgue measure zero (or, in measure theory jargon, are equal almost everywhere).
Now, the norms and inner products are the same for all functions in a given equivalence class, so practically one always does computations on concrete functions, but if elements $f$ of the Hilbert space $L^2(R)$ are meant as equivalence classes of functions equal almost everywhere, then the desired strict positivity of the norm is achieved, that is,

$$\|f\| = 0 \quad \text{if, and only if,} \quad f = 0.$$

$^1$ This material may be skipped by the first time reader.

The norm $\|f\|$ in the functional Hilbert space $L^2(R)$, and the related distance $\|f - g\|$ of (equivalence classes of) functions $f$, $g$ permit us to introduce the notion of the limit of functions in $L^2(R)$. Namely, we say that

$$f_n \to f \tag{21}$$

if
$$\lim_{n \to \infty} \|f_n - f\| = 0.$$
This notion permits us to study approximation problems in the functional Hilbert space.
Having introduced in the functional Hilbert space the algebraic structure (addi-
tion and multiplication by scalars), the compatible inner product structure and the
related metric (norm) structure one could sensibly introduce in L2(R) and study
the geometric concepts such as the angles between functions, their orthogonality,
etc. This very fruitful approach is the essence of the branch of mathematics called
functional analysis.

Remark 2. Completeness of the functional Hilbert space. The finite dimensional inner product spaces introduced at the beginning of this section enjoyed the important property of being complete, that is, the Cauchy criterion of convergence remained valid for them. The same criterion happens to hold true for the space $L^2(R)$. In other words, the functional Hilbert space is a complete inner product space, that is, for any sequence of functions $f_n \in L^2(R)$, $n = 1, 2, \dots$, such that

$$\lim_{n, m \to \infty} \|f_n - f_m\| = 0,$$

there exists a function $f \in L^2(R)$ such that

$$\lim_{n \to \infty} \|f_n - f\| = 0.$$

Remark 3. Other Hilbert and normed functional spaces. The above discussion applies, with obvious adjustments, to Hilbert spaces of functions over other subsets of the real axis such as $L^2([a, b])$, $L^2([0, \infty))$, etc., as well as to their natural analogues $L^2(R^d)$ defined for functions of several variables.
On the other hand, it is often necessary to consider functional spaces where the introduction of the inner product structure is impossible and one has to be satisfied only with the norm structure. Examples of non-Hilbertian functional normed spaces, such as $L^p(R)$, $1 \le p < \infty$, $p \ne 2$, which consist of all (equivalence classes of) functions for which

$$\|f\|_p = \left( \int |f(x)|^p\,dx \right)^{1/p} < \infty,$$

have been mentioned before in Chapter 6 (also, see the Bibliographical Notes). In particular, the space $L^1 = L^1(R)$ consists of all absolutely integrable functions on the real axis.

7.2 Time-frequency localization and the uncertainty principle


Consider a (perhaps complex-valued) signal $f(t)$ such that

$$\int |f(t)|^2\,dt = 1. \tag{1}$$

The quantity $|f(t)|^2$ can be thought of as the signal's "mass" density and describes its distribution in time. If the signal $f(t)$ is square integrable but (1) is not satisfied, then one can always normalize it by considering $f(t)/(\int |f(t)|^2\,dt)^{1/2}$. In this context, the quantity

$$\int t\,|f(t)|^2\,dt$$

can be interpreted as the location in time of the signal's "center of gravity," or its mean location. For the purposes of this section, and without loss of generality, we will assume that its mean location is at 0 or, in other words, that $\int t\,|f(t)|^2\,dt = 0$. In this case, the quantity

$$\sigma^2[f] = \int t^2 |f(t)|^2\,dt \tag{2}$$

measures the average square deviation from the mean time location, or the degree of localization of the signal around its mean in the time domain.
On the other hand, the Fourier transform

$$\tilde f(\omega) = \frac{1}{2\pi} \int f(t)\,e^{-i\omega t}\,dt$$

displays no direct information about the signal's time localization, but has explicit information about its frequency localization. The square of its modulus $|\tilde f(\omega)|^2$ is the frequency domain counterpart of the time density $|f(t)|^2$. Note that, by Parseval's formula (3.2.11),

$$\int |f(t)|^2\,dt = 2\pi \int |\tilde f(\omega)|^2\,d\omega,$$

so that $2\pi |\tilde f(\omega)|^2$ can be viewed as the signal's normalized density in the frequency domain. Assume (again, without loss of generality) that the mean frequency

$$2\pi \int \omega\,|\tilde f(\omega)|^2\,d\omega = 0.$$

Then the quantity

$$\sigma^2[\tilde f] = 2\pi \int \omega^2 |\tilde f(\omega)|^2\,d\omega \tag{3}$$

measures the mean square deviation from the mean frequency location, or the degree of localization of the signal in the frequency domain.
The uncertainty principle asserts that there exists a lower bound on the simultaneous localization of the signal in the time and frequency domains. More precisely, it states that
$$\sigma^2[f]\,\sigma^2[\tilde f] \ge \frac{1}{4} \tag{4}$$
whenever the variances $\sigma^2[f]$ and $\sigma^2[\tilde f]$ are well defined. Note the universal constant 1/4.
To see why the uncertainty principle holds true, consider the integral

$$I(x) = \int |x t f(t) + f'(t)|^2\,dt \ge 0, \tag{5}$$

where $x$ is a real parameter. Then, since

$$|x t f(t) + f'(t)|^2 = (x t f + f')(x t f^* + (f')^*),$$

we get that

$$I(x) = x^2 \int t^2 |f|^2\,dt + x \int t\,\bigl(f (f')^* + f^* f'\bigr)\,dt + \int |f'|^2\,dt. \tag{6}$$

The first integral in (6) is just $\sigma^2[f]$ (by definition (2)). The second integral is equal to

$$\int t\,\frac{d|f|^2}{dt}\,dt = -\int |f|^2\,dt = -1,$$

since $t|f|^2$ decays to zero at $\pm\infty$ in view of the assumption $\sigma^2[f] < \infty$. Finally, the third integral is equal to

$$\int |f'(t)|^2\,dt = 2\pi \int \omega^2 |\tilde f(\omega)|^2\,d\omega = \sigma^2[\tilde f]$$

because of Parseval's formula (3.2.11) and the fact that the Fourier transform of $f'$ is $i\omega \tilde f(\omega)$. As a result, the integral
$$I(x) = x^2 \sigma^2[f] - x + \sigma^2[\tilde f]. \tag{7}$$

This is a quadratic polynomial in the variable $x$ and, in view of (5), it is nonnegative for all values of $x$. As such, it has a nonpositive discriminant

$$1 - 4\,\sigma^2[f]\,\sigma^2[\tilde f] \le 0,$$

which immediately yields the uncertainty principle (4).
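The inequality can be checked numerically without computing any Fourier transform: by Parseval's formula, the frequency variance of a signal equals $\int |f'(t)|^2\,dt$ divided by $\int |f(t)|^2\,dt$. The Python sketch below is a rough illustration under stated assumptions: the signals are real and even (so both means vanish), the integrals are replaced by Riemann sums on a truncated grid, and the helper names and grid parameters are choices made here. It evaluates the product of variances for the Gaussian $\pi^{-1/4}\exp(-t^2/2)$, for which the exact product is 1/4, and for the hyperbolic secant $1/\cosh t$, for which direct integration gives $\pi^2/36 \approx 0.274$.

```python
import math

def variances(f, half_width=20.0, n=20001):
    """Return (time variance, frequency variance) of a zero-mean signal f.

    time variance      = integral of t^2 |f|^2 dt / integral of |f|^2 dt
    frequency variance = integral of |f'|^2 dt    / integral of |f|^2 dt  (Parseval)
    """
    h = 2.0 * half_width / (n - 1)
    ts = [-half_width + k * h for k in range(n)]
    fs = [f(t) for t in ts]
    energy = sum(abs(v) ** 2 for v in fs) * h
    var_time = sum(t * t * abs(v) ** 2 for t, v in zip(ts, fs)) * h / energy
    # central differences approximate f'(t) at the interior grid points
    var_freq = sum(abs((fs[k + 1] - fs[k - 1]) / (2.0 * h)) ** 2
                   for k in range(1, n - 1)) * h / energy
    return var_time, var_freq

def gaussian(t):
    return math.pi ** -0.25 * math.exp(-t * t / 2.0)

def sech(t):
    return 1.0 / math.cosh(t)

if __name__ == "__main__":
    for name, signal in (("gaussian", gaussian), ("sech", sech)):
        vt, vf = variances(signal)
        print(name, vt, vf, vt * vf)
```

Dividing by the energy inside `variances` performs the normalization (1), so the test signals need not have unit norm.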


Remark 1. The Heisenberg uncertainty principle in quantum mechanics. The (3-D version of the) above uncertainty principle concerning time-frequency localization has a celebrated interpretation in quantum mechanics, where the principle asserts that the position and the momentum of a particle cannot be simultaneously measured with arbitrary accuracy. Indeed, in quantum mechanics the particle is represented by a complex wave function $f(x)$, where $|f(x)|^2$ is the probability density of its position in space. The observables are represented by operators $A$ on wave functions; the mean value of the observable is

$$\int (Af)(x)\,f^*(x)\,dx.$$

The position observable is represented by multiplication by the variable (vector) $x$, and the momentum observable is represented by the operation of differentiation $\partial/\partial x$. However, via the Fourier transform, the latter also becomes an operation of multiplication, but by an independent variable (vector) $\omega$ in the frequency domain. Thus the uncertainty principle (4) gives the universal lower bound for the product of variances of the probability distributions of the position and of the momentum. In the three-dimensional space, and in physical units, the lower bound 1/4 in (4) has to be replaced by a different mathematical constant multiplied by a universal physical constant called the Planck constant. The probabilistic concepts of means and variances employed above will be further studied in Chapter 13.

Remark 2. One can check that the equality in the uncertainty principle (4) obtains only for the Gaussian function $f(t) = \pi^{-1/4} \exp(-t^2/2)$. Thus the optimal simultaneous time and frequency localization is attained for a Gaussian-shaped signal.

7.3 Windowed Fourier transform


7.3.1. Forward windowed Fourier transform. The uncertainty principle discussed in Section 7.2 is a basic law of mathematics, and it is impossible to fool nature by measuring the frequency of the incoming signal with an arbitrary precision in a finite time interval. Moreover, for most of the signals we have to deal with in practical problems, such as speech, musical sounds, radar signals, the situation is often much worse than the basic uncertainty inequality permits and

$$\sigma[f]\,\sigma[\tilde f] \gg 1. \tag{1}$$

Nevertheless, it is often possible to process these signals in such a way that, with-
out violation of the uncertainty principle, one can obtain information about the
signal's "current" frequency and its time evolution. These various practical signal
processing methods are adapted to different kinds of signals and pursue different
goals. In this section we will take a look at one of these methods called the win-
dowed Fourier transform which is closest perhaps to the spirit of the usual Fourier
transform.
In what follows, to better grasp the mechanisms behind the windowed Fourier transform, it will be instructive to test them on a sample signal that we will call the simplest tune. Mathematically, it is described by the real part of the complex function
$$f(t) = \exp(i\Phi(t)), \tag{2}$$
where
$$\Phi(t) = \omega_0 t + \frac{\Omega}{\nu} \sin(\nu t) \tag{3}$$

is the signal phase. The simplest tune is plotted in Fig. 7.3.1.

FIGURE 7.3.1
Plot of the simplest tune $\mathrm{Re}\,f(t)$ in the case of $\omega_0 = 10\nu$ and $\beta = \Omega/\nu = 5$.
It is customary to say in the theoretical physics context that the simplest tune has the instantaneous frequency (admittedly, an oxymoron)

$$\omega_{\text{inst}}(t) = \frac{d\Phi(t)}{dt} = \omega_0 + \Omega \cos(\nu t), \tag{4}$$

which oscillates with period $T = 2\pi/\nu$ between its high value $\omega_0 + \Omega$ and low value $\omega_0 - \Omega$. By contrast with a theoretician, an experimenter has to deal not with mathematical formulas but with real signals, and his job is to come up with a signal processing method that will discover the existence of frequency oscillations in the simplest tune.
The mathematical tool that is helpful in this situation is called the windowed Fourier transform, which is just the usual Fourier transform

$$\tilde f(\omega, \tau) = \frac{1}{2\pi} \int f(t)\,g(t - \tau)\,e^{-i\omega t}\,dt \tag{5}$$

of the time-windowed signal $f(t)g(t - \tau)$, where $g(t)$ is the windowing function that usually is chosen to have value equal to 1 in a vicinity of the origin $t = 0$ (say, inside an interval of length $\lambda$), and that either vanishes or has values very close to 0 outside this neighborhood. This windowing function property will assure effective time-localization.
Usually, one defines the windowing function $g(t)$ via a windowing shape function $g_0(x)$ of a dimensionless variable $x$ and the formula

$$g(t) = g_0(t/\lambda), \tag{6}$$

where $\lambda$ is a scaling parameter. Some typical examples of normalized ($\|g_0\| = 1$) windowing shape functions are (see Fig. 7.3.2):
(a) Finite memory window

$$g_0(x) = \chi(x + 1) - \chi(x); \tag{7a}$$

(b) Relaxation window

$$g_0(x) = 2\chi(-x)\exp(2x); \tag{7b}$$

(c) Gaussian window

$$g_0(x) = \pi^{-1/4} \exp(-x^2/2). \tag{7c}$$

The shift $\tau$ centers the window at different locations on the time-axis $t$. If $f(t)$ is a time-dependent signal and processing is performed in real time, then $\tau$ is just the current time of the experiment and the time-window $g(t)$ has to satisfy the causality principle, i.e., $g(t) \equiv 0$ for $t > 0$. So, in this case, the finite memory and relaxation windows are appropriate but the Gaussian window is not. If the whole signal is recorded before processing, or the variable $t$ has another interpretation (e.g., space, or angle variable), then the experimenter has more freedom in selecting the windowing shape function, and very often the Gaussian window is a good candidate.
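To see how the windowed Fourier transform (5) tracks the oscillating frequency (4) of the simplest tune, one can evaluate it by direct numerical integration. The following Python sketch is an illustration under assumptions made here, not a prescription from the text: the Gaussian shape is taken as $g_0(x) = \pi^{-1/4}\exp(-x^2/2)$, the parameters are $\nu = 1$, $\omega_0 = 10$, $\Omega = 5$ and $\lambda = 0.5$, the integral is truncated to eight window scales on each side of $\tau$, and all function names are hypothetical.

```python
import cmath
import math

OMEGA0, BIG_OMEGA, NU = 10.0, 5.0, 1.0   # omega_0 = 10 nu, beta = Omega / nu = 5
LAM = 0.5                                 # window scale lambda (assumed value)

def tune(t):
    """Simplest tune exp(i * Phi(t)) with Phi(t) = omega_0 t + (Omega / nu) sin(nu t)."""
    return cmath.exp(1j * (OMEGA0 * t + (BIG_OMEGA / NU) * math.sin(NU * t)))

def gaussian_window(t):
    """g(t) = g0(t / lambda) with the Gaussian shape g0(x) = pi^(-1/4) exp(-x^2 / 2)."""
    x = t / LAM
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

def windowed_ft(omega, tau, half_width=8.0 * LAM, n=400):
    """Midpoint-rule approximation of (5) over [tau - half_width, tau + half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        t = tau - half_width + (k + 0.5) * h
        total += tune(t) * gaussian_window(t - tau) * cmath.exp(-1j * omega * t)
    return total * h / (2.0 * math.pi)

def peak_frequency(tau, omegas):
    """Frequency on the grid where the modulus of the windowed transform is largest."""
    return max(omegas, key=lambda w: abs(windowed_ft(w, tau)))

if __name__ == "__main__":
    grid = [0.25 * k for k in range(81)]          # frequencies from 0 to 20
    for tau in (0.0, math.pi / (2 * NU), math.pi / NU):
        print(tau, peak_frequency(tau, grid), OMEGA0 + BIG_OMEGA * math.cos(NU * tau))
```

With these parameters the peak sits near $\omega_0 + \Omega$ at $\tau = 0$ and near $\omega_0 - \Omega$ half a period later, though not exactly at those values, because the window has finite duration and averages the instantaneous frequency over it. A longer window sharpens the frequency resolution but blurs the oscillation, which is the trade-off imposed by the uncertainty principle.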
7.3.2. Frequency localization. The time-window $g(t)$ was designed to separate well the time-localized pieces of duration $\lambda$ of the incoming signal $f(t)$. Luckily, it turns out that the Fourier image of the time-window $g(t)$ can help in frequency localization. To see how this happens let us express the original signal $f(t)$ through its Fourier transform:

$$f(t) = \int \tilde f(\omega')\,e^{i\omega' t}\,d\omega', \tag{8}$$

and substitute it into the right-hand side of (5). Note that, in the case of the simplest tune (2), $\tilde f(\omega)$ exists only in the distributional sense. The change of the integration order gives that

$$\tilde f(\omega, \tau) = e^{-i\omega\tau} \int \tilde f(\omega')\,\tilde g(\omega - \omega')\,e^{i\omega'\tau}\,d\omega'. \tag{9}$$

Remarkably, except for the nonessential factor in front of the integral, this expression looks like the symmetric counterpart of (5) in the frequency domain. Now, the role of the signal is played by its Fourier image $\tilde f(\omega)$ and the time-window has been replaced by the frequency-window $\tilde g(\omega)$.

FIGURE 7.3.2
Examples of windowing shape functions. (a) Finite memory window; (b) Relaxation window; (c) Gaussian window.
The uncertainty principle (7.2.4) tells us that if the effective duration of the time-window is $\lambda$, then one can expect the effective width of the frequency-window to be of order at least $1/\lambda$. In terms of the dimensionless window shapes $g_0(x)$ and $\tilde g_0(y)$ where, similarly to (6),

$$\tilde g(\omega) = \lambda\,\tilde g_0(\lambda\omega), \tag{10}$$

both $g_0(x)$ and $\tilde g_0(y)$ have to have similar effective widths $\sim 1$.


However, the actual situation is a bit more complicated than the above juggling of the uncertainty principle may indicate. When engineers talk about an effectively localized frequency-window, they think about the compact support of the frequency-windowing shape function $\tilde g_0(y)$ or at least about its rapid decay outside a finite frequency band. However, we know from the properties of the Fourier transform that it is impossible for both the function and its Fourier transform to have compact supports. Furthermore, the frequency windowing shape function will decay rapidly for $|y| > 1$ only if the time windowing shape function is smooth. This fact eliminates the time windowing shape functions (7a) and (7b), which have good time-localization properties, as good candidates for good frequency localization by their Fourier transforms. Abrupt truncations in them introduce discontinuities of the first kind which slow the decay of their Fourier transforms. For example, the modulus of the Fourier image of the relaxation window (7b)

$$|\tilde g_0(y)| = \frac{1}{\pi}\,\frac{1}{\sqrt{4 + y^2}} \tag{11}$$

decays to zero slowly as $|\tilde g_0(y)| \sim 1/(\pi|y|)$, $(y \to \infty)$.

So, to achieve better frequency localization one has to take smoother windowing shape functions like, for example, the functions described in Section 4.3.
Example 1. Compact time-window, power-law decay of the frequency windOw.
Take the windowing shape function

8o(X) = ~[X(X + 2) - x (X)] sin2 (lr;). (12)


198 Chapter 7. Uncertainty principle and wavelet transforms

corresponding to function (4.3.19), normalized appropriately and shifted to satisfy


the causality principle. Its frequency counterpart

_ ·4 siny
go(y) = e'Y -rr--;:--"-~ (13)
3 y(rr2 _ y2)

decays as l/lyl3, faster than (11), which produces tolerable frequency localiza-
tion while preserving perfect time localization. The power law of the frequency
windowing shape (13) decay was caused by hidden discontinuities (in the second
derivative) of the time windowing shape (12). •
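The claim of Example 1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the window shape (12) with the normalizing factor $2/\sqrt{3}$ and the asymmetric Fourier transform with the factor $1/2\pi$, integrates by a plain midpoint rule, and compares the result with the closed form (13). The function names and the sample values of $y$ are chosen here for illustration and do not come from the text.

```python
import cmath
import math

def g0(x):
    """Window shape (12): supported on (-2, 0) and normalized to unit L2-norm."""
    if -2.0 < x < 0.0:
        return 2.0 / math.sqrt(3.0) * math.sin(math.pi * x / 2.0) ** 2
    return 0.0

def fourier_image(y, n=4000):
    """(1/2pi) * integral of g0(x) * exp(-i*x*y) over (-2, 0), midpoint rule."""
    h = 2.0 / n
    total = 0j
    for k in range(n):
        x = -2.0 + (k + 0.5) * h
        total += g0(x) * cmath.exp(-1j * x * y)
    return total * h / (2.0 * math.pi)

def closed_form(y):
    """Right-hand side of (13); not defined at y = 0 and y = +-pi."""
    return (cmath.exp(1j * y) * math.pi / math.sqrt(3.0)
            * math.sin(y) / (y * (math.pi ** 2 - y ** 2)))

for y in (0.5, 2.0, 7.5, 20.5):
    numeric = fourier_image(y)
    # Last column: |image| * y^3 stays bounded, i.e. the decay is like 1/|y|^3.
    print(y, abs(numeric), abs(closed_form(y)), abs(numeric) * y ** 3)
```

The product in the last column never exceeds $\pi/\sqrt{3}$ by much, which is the numerical face of the $1/|y|^3$ decay.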
Example 2. Gaussian time and frequency windows; Gabor transform. Since the
Fourier image of a Gaussian time windowing shape gives a Gaussian frequency
windowing shape, in this case we have excellent localization in both time and
frequency domains. Indeed, if $g_0(x)$ is given by (7c) then

$$\tilde g_0(y) = \frac{1}{\sqrt{2\pi}}\,g_0(y) = \frac{\pi^{-1/4}}{\sqrt{2\pi}}\exp(-y^2/2). \tag{14}$$

The extra factor $1/\sqrt{2\pi}$ is the result of our asymmetric, but physical, definition
(3.1.1) of the Fourier transform. In mathematics, for esthetic reasons, one often
prefers a symmetric definition of the Fourier transform and its inverse:

$$\tilde f(y) = \frac{1}{\sqrt{2\pi}}\int f(x)e^{-ixy}\,dx, \qquad f(x) = \frac{1}{\sqrt{2\pi}}\int \tilde f(y)e^{ixy}\,dy. \tag{15}$$

In this case, $\tilde g_0(y) \equiv g_0(y)$. The windowed Fourier transform

(16)

based on the Gaussian window is called the Gabor transform in honor of the
physicist who introduced it for studying quantum-mechanical problems. •

7.3.3. Energy density in the time-frequency domain. In applied problems the
quantity of interest is usually not the complex function $\tilde f(\omega, \tau)$ itself but its
squared modulus

$$E(\omega, \tau) = 2\pi|\tilde f(\omega, \tau)|^2. \tag{17}$$

It follows from (5) that

$$E(\omega, \tau) = \frac{1}{2\pi}\iint dt\,d\theta\; e^{-i\omega\theta} f^*(t)g^*(t - \tau)f(t + \theta)g(t + \theta - \tau). \tag{18}$$


For the sake of symmetry between the time and frequency domains we permit both
functions $f(t)$ and $g(t)$ to be complex-valued. Integrating the above equality over
all $\omega$, we get

$$\int E(\omega, \tau)\,d\omega = \int |f(t)|^2\,|g(t - \tau)|^2\,dt. \tag{19}$$

Observe that the three unwieldy integrals on the right-hand side were reduced to
an elegant single integral by noticing first that

$$\frac{1}{2\pi}\int e^{-i\omega\theta}\,d\omega = \delta(\theta), \tag{20}$$

and then using the probing property of the Dirac delta to get rid of another integral.
After integration of (19) over all $\tau$ we have

$$\iint E(\omega, \tau)\,d\omega\,d\tau = \|f\|^2\,\|g\|^2, \tag{21}$$

where the norms on the right-hand side are the Hilbertian L2-norms introduced in
(7.1.18).
Since we assumed at the beginning of this section that the time windowing
shape function $g_0$ is normalized, i.e., $\|g_0\| = 1$, the squared $L^2$-norm of the time
windowing function itself is

$$\|g\|^2 = \int |g_0(t/\lambda)|^2\,dt = \lambda, \tag{22}$$

i.e., it is equal to the duration of the time-window. Remembering that the squared
norm

$$\|f\|^2 = \int |f(t)|^2\,dt$$

represents the energy $E_f$ of the original signal $f(t)$, formula (21) implies the
following energetic relation:

$$E_f = \frac{1}{\lambda}\iint d\omega\,d\tau\; E(\omega, \tau).$$

Thus the function $E(\omega, \tau)/\lambda$ has a physical interpretation as the joint frequency-
time density of the signal's energy.
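The reduction of the frequency integral of $E(\omega, \tau)$ to a single time integral can be illustrated numerically. The following is a minimal sketch, assuming the transform $\tilde f(\omega,\tau) = \frac{1}{2\pi}\int f(t)g(t-\tau)e^{-i\omega t}dt$ with the window $g(t) = g_0(t/\lambda)$ and the Gaussian shape $g_0(x) = \pi^{-1/4}e^{-x^2/2}$. The test signal, the values of $\lambda$ and $\tau$, and the grids are arbitrary illustrative choices.

```python
import cmath
import math

LAM = 0.7   # window duration lambda (illustrative)
TAU = 0.4   # window position tau (illustrative)

def f(t):
    """Illustrative complex test signal: Gaussian envelope times a harmonic."""
    return math.exp(-t * t / 2.0) * cmath.exp(3j * t)

def g(t):
    """Gaussian window g(t) = g0(t / lambda) with g0(x) = pi**(-1/4) * exp(-x*x/2)."""
    x = t / LAM
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

DT = 0.02
T_GRID = [-8.0 + k * DT for k in range(801)]
FG = [f(t) * g(t - TAU) for t in T_GRID]      # windowed signal f(t) g(t - tau)

def windowed_ft(w):
    """(1/2pi) * integral f(t) g(t - tau) exp(-i*w*t) dt as a Riemann sum."""
    s = 0j
    for t, v in zip(T_GRID, FG):
        s += v * cmath.exp(-1j * w * t)
    return s * DT / (2.0 * math.pi)

DW = 0.05
# Left side: integral over w of E(w, tau) = 2*pi*|f~(w, tau)|^2, for w in [-9, 15].
lhs = sum(2.0 * math.pi * abs(windowed_ft(-9.0 + k * DW)) ** 2 for k in range(481)) * DW
# Right side: integral over t of |f(t)|^2 |g(t - tau)|^2.
rhs = sum(abs(v) ** 2 for v in FG) * DT
print(lhs, rhs)
```

Both numbers agree to many digits, because the double integral over $t$ and $\theta$ collapses through the delta function exactly as described above.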

7.3.4. Mean frequency and standard deviation of the windowed Fourier transform.
Recall that the windowed Fourier transform was introduced earlier in this
section to track (with a precision determined by the duration $\lambda$ of the time window)
the time evolution of the "instantaneous frequency" $\omega_{inst}(\tau)$ of signal $f(t)$. The
latter was sufficiently clearly defined for the simplest tune signal, but for the general
situation we need a more rigorous definition. A good, and analytically convenient,
definition is the mean value

$$\bar\omega(\tau) = c(\tau)\int \omega E(\omega, \tau)\,d\omega, \tag{23}$$

where

$$c(\tau) = 1\Big/\int |f(t)|^2\,|g(t - \tau)|^2\,dt \tag{24}$$

is the normalizing constant, although some physicists would perhaps prefer to use,
as the definition of the instantaneous frequency at time $\tau$, the value $\omega_{max} = \omega_{max}(\tau)$
which maximizes the joint energy density $E(\omega, \tau)$.
However, before advising the reader to go ahead and apply the above definition
in research problems, let us step back and see what happens in a typical situation
where $f(t)$ and $g(t)$ are real functions. Then, in accordance with the Fourier
transform properties,
$$\tilde f(-\omega, \tau) = \tilde f^*(\omega, \tau),$$
so the energy density $E(\omega, \tau)$ is an even function of $\omega$ for each $\tau$ and, necessarily,
$\bar\omega(\tau) \equiv 0$. In this context, the above notion of the "instantaneous frequency" is
useless.
The situation is different and more promising for analytic signals

$$f(t) = A(t)\exp(i\Phi(t)), \tag{25}$$

where $A(t)$ and $\Phi(t)$ are, respectively, the signal's real-valued amplitude and phase.
In physical and engineering applications, the real signal $f(t)$ is often a narrow-band
process which can be written in the form

$$f(t) = A(t)\cos(\omega_0 t + \varphi(t)), \tag{26}$$

where $A(t)$ and $\varphi(t)$ are slowly varying in comparison to $\cos(\omega_0 t)$. The "simplest
tune" signal introduced earlier in this section is narrow-band if

$$\omega_0 \gg \Omega \quad\text{and}\quad \omega_0 \gg \nu. \tag{27}$$

In such cases we will consider "approximately analytic" signals, replacing in (25)
the exact amplitude and phase by the amplitude $A(t)$ and phase

$$\Phi(t) = \omega_0 t + \varphi(t) \tag{28}$$

of a narrow-band process.
Let us calculate the current frequency $\bar\omega(\tau)$ (23) of an analytic signal (25).
Multiplying (18) by $\omega$, integrating, and keeping in mind that the differentiation of
(20) with respect to $\theta$ gives

$$\frac{1}{2\pi}\int \omega e^{-i\omega\theta}\,d\omega = i\delta'(\theta),$$

we obtain

$$\bar\omega(\tau) = -ic(\tau)\int f^*(t)g^*(t - \tau)\frac{d}{dt}[f(t)g(t - \tau)]\,dt.$$

Substitute in this formula the expression (25) for the analytic signal and take into
account that the window $g(t)$ is a real-valued function to arrive at the formula

$$\bar\omega(\tau) = \bar\omega_{inst}(\tau) = \int \omega_{inst}(t)P(t; \tau)\,dt, \tag{29}$$

where
$$\omega_{inst}(t) = \frac{d\Phi(t)}{dt} \tag{30}$$
is the "instantaneous frequency" of the analytic signal, and

$$P(t; \tau) = c(\tau)\,|f(t)|^2\,g^2(t - \tau) \tag{31}$$

is the normalized power of the signal taking into account the window's weight
$|g(t - \tau)|^2$. If $|f(t)|^2$ is constant, as in the case of the simplest tune signal (2-4),
then the power
$$P(t; \tau) = \frac{1}{\lambda}\,g^2(t - \tau) = \frac{1}{\lambda}\,g_0^2\!\left(\frac{t - \tau}{\lambda}\right).$$

For $\lambda \to 0$, the power function $P(t; \tau)$ weakly converges to $\delta(t - \tau)$ and $\bar\omega_{inst}(\tau)$
converges to the instantaneous frequency $\omega_{inst}(\tau)$.
Unfortunately, the above conclusion does not imply that the windowed Fourier
transform permits, in the limit $\lambda \to 0$, the precise measurement of the signal's
instantaneous frequency. Actually, the accuracy of such measurement is determined
by the frequency deviation $\sigma(\tau) = \sqrt{D(\tau)}$, where

$$D(\tau) = c(\tau)\int (\omega - \bar\omega(\tau))^2 E(\omega, \tau)\,d\omega. \tag{32}$$

Simple algebra shows that

$$D(\tau) = \overline{\omega^2}(\tau) - (\bar\omega(\tau))^2, \tag{33}$$


where the second frequency moment

$$\overline{\omega^2}(\tau) = c(\tau)\int \omega^2 E(\omega, \tau)\,d\omega. \tag{34}$$

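Calculations similar to those that brought us to the expression (29) for the current
frequency give

$$\overline{\omega^2}(\tau) = c(\tau)\int \left|\frac{d}{dt}[f(t)g(t - \tau)]\right|^2 dt$$

or, after substitution of the analytic signal (25),

$$\overline{\omega^2}(\tau) = \overline{\omega^2_{inst}}(\tau) + D_0(\tau). \tag{35}$$

Here, in agreement with notation from formula (29),

$$\overline{\omega^2_{inst}}(\tau) = \int \omega^2_{inst}(t)P(t; \tau)\,dt, \tag{36}$$

$$D_0(\tau) = c(\tau)\int \left[\frac{d}{dt}|f(t)g(t - \tau)|\right]^2 dt. \tag{37}$$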

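Finally, substituting (35) into (33) we obtain

$$D(\tau) = D_{inst}(\tau) + D_0(\tau), \tag{38}$$

where
$$D_{inst}(\tau) = \overline{\omega^2_{inst}}(\tau) - (\bar\omega_{inst}(\tau))^2. \tag{39}$$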

Both terms on the right-hand side of (38) have an obvious physical interpretation.
The first term (39) takes into account the error in determining the instantaneous
frequency caused by averaging over the time-window of duration $\lambda$, and it
converges to 0 as $\lambda \to 0$. On the other hand, the second term, (37), blows up to $\infty$
as $\lambda \to 0$ and has a more fundamental nature related to the uncertainty principle
considered in Section 7.2.
Example 3. Current frequency for the simplest tune seen through a Gaussian
window. Let us consider in some detail the behavior of the current frequency
$\bar\omega_{inst}(\tau)$ (29), and the competition between two components of the current
frequency deviation $D(\tau)$ (38), in the case of the simplest tune signal (4) and the
Gaussian windowing shape (7c).

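Taking into account that $|f(t)|^2 = 1$ for the simplest tune, elementary calculations
give

$$\bar\omega_{inst}(s) = \omega_0 + \Omega e^{-\alpha^2/4}\cos s, \tag{40a}$$

$$D_{inst}(s) = \Omega^2\left[\tfrac{1}{2}\left(1 + e^{-\alpha^2}\cos 2s\right) - e^{-\alpha^2/2}\cos^2 s\right], \tag{40b}$$

and
$$D_0 = \frac{1}{2\lambda^2}, \tag{40c}$$

with the dimensionless parameters

$$\alpha = \nu\lambda, \qquad s = \nu\tau. \tag{41}$$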
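The window averages in Example 3 are easy to reproduce numerically. The sketch below is an illustration under explicit assumptions: the instantaneous frequency of the simplest tune is taken as $\omega_{inst}(t) = \omega_0 + \Omega\cos\nu t$, the power function is $P(t;\tau) = \frac{1}{\lambda}g_0^2((t-\tau)/\lambda)$ with $g_0(x) = \pi^{-1/4}e^{-x^2/2}$, and the closed forms are the Gaussian averages written with $\alpha = \nu\lambda$, $s = \nu\tau$. The parameter values and function names are illustrative only.

```python
import math

W0, OMEGA, NU = 50.0, 5.0, 1.0   # omega_0, Omega, nu (illustrative values)

def w_inst(t):
    """Assumed instantaneous frequency of the simplest tune."""
    return W0 + OMEGA * math.cos(NU * t)

def power(t, tau, lam):
    """P(t; tau) = g0((t - tau)/lam)**2 / lam for the Gaussian shape g0."""
    x = (t - tau) / lam
    return math.exp(-x * x) / (lam * math.sqrt(math.pi))

def window_moments(tau, lam, n=4001, span=8.0):
    """Window-averaged frequency (29) and its dispersion (39) by a Riemann sum."""
    h = 2.0 * span * lam / (n - 1)
    m1 = m2 = 0.0
    for k in range(n):
        t = tau - span * lam + k * h
        p = power(t, tau, lam) * h
        m1 += w_inst(t) * p
        m2 += w_inst(t) ** 2 * p
    return m1, m2 - m1 * m1

def mean_formula(tau, lam):
    alpha, s = NU * lam, NU * tau
    return W0 + OMEGA * math.exp(-alpha ** 2 / 4.0) * math.cos(s)

def dispersion_formula(tau, lam):
    alpha, s = NU * lam, NU * tau
    return OMEGA ** 2 * (0.5 * (1.0 + math.exp(-alpha ** 2) * math.cos(2.0 * s))
                         - math.exp(-alpha ** 2 / 2.0) * math.cos(s) ** 2)

for lam in (0.1, 1.0, 2.0):
    print(lam, window_moments(0.7, lam), mean_formula(0.7, lam), dispersion_formula(0.7, lam))
```

For small $\lambda$ the averaging dispersion is close to zero, and it grows toward $\Omega^2/2$ as the window widens, which is the trade-off against the term $D_0 = 1/(2\lambda^2)$.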

Fig. 7.3.3 shows, in cases when $\alpha = 1$ and $\alpha = 2$, the (dimensionless) time
dependence of the current frequency $\bar\omega_{inst}(\tau)$ measured by the windowed Fourier
transform (9), and the instantaneous frequency deviation $\sigma_{inst}(\tau) = (D_{inst}(\tau))^{1/2}$
due to time-window averaging.

FIGURE 7.3.3
Time dependence of (a) the current frequency, and (b) the frequency deviation,
for the simplest tune signal for two values, 1 and 2, of the dimensionless
parameter $\alpha$.

Recall that for $\alpha = 0$, the current frequency $\bar\omega_{inst}(\tau)$ coincides with the
instantaneous frequency $\omega_{inst}(\tau)$. The amplitude of $\bar\omega_{inst}(\tau)$ is smaller in the case
$\alpha = 2$ than in the case $\alpha = 1$ which is a consequence of the smoothing action
of time-averaging. Fig. 7.3.3(b) shows that the instantaneous frequency deviation
$\sigma_{inst}(\tau)$ has peaks at the times when the instantaneous frequency changes quickly,
and valleys when it changes slowly.
To evaluate the efficiency of instantaneous frequency measurement via the
windowed Fourier transform we will consider the ratio of the full frequency dispersion
(38) at $s = 0$ to its limit value $D_\infty = \Omega^2/2$ for $\lambda \to \infty$:

$$\rho^2(0) = \frac{D(0)}{D_\infty} = \frac{1}{4\alpha^2\beta^2} + \left(1 - e^{-\alpha^2/2}\right)^2, \tag{42}$$

where the new dimensionless parameter

$$\beta = \Omega/\nu. \tag{43}$$

Note that $s = 0$ is in a sense the "best" case since at that time the instantaneous
frequency changes slowly and $D(\tau)$ has a minimum. The graphs of dependence
of $\rho(0)$ on parameter $\alpha$ are shown in Fig. 7.3.4 for several values of parameter $\beta$.

FIGURE 7.3.4
The graphs of dependence of $\rho(0)$ on parameter $\alpha$ for several values of
parameter $\beta$.

As we explained before, the blow-up of the graphs to infinity as $\alpha \to 0$ is
due to the uncertainty principle effects which guarantee that the window of a
shorter duration will lead to greater indeterminacy of the measured frequency.
As $\alpha$ increases, the uncertainty principle effects become negligibly small, but the
instantaneous frequency measurement error due to the time-averaging increases.
The value $\alpha^*$ for which $\rho$ attains its minimal value $\rho_{min}$ determines the optimal
duration $\lambda^* = 2\alpha^*/\nu$ of the time-window. It is clear from Fig. 7.3.4 that only if
$\beta \gg 1$ is $\rho_{min} \ll 1$ and, as a result, the windowed Fourier transform algorithm
is capable of accurately tracking the instantaneous frequency. The latter values
correspond to slow and/or large changes of the instantaneous frequency. •
To conclude our windowed Fourier analysis of the simplest tune, a word of
caution is necessary. Conclusions based on simple integral characteristics of the
joint frequency-time density $E(\omega, \tau)$ (17), such as the mean frequency (23) and
deviation (32), can sometimes be misleading. They are quite coarse and lose a lot
of information contained in the joint density. So, it is useful to indicate their area
of applicability, which luckily is possible for the simplest tune signal in view of the
relatively simple structure of its windowed Fourier transform $\tilde f(\omega, \tau)$.
First, note the formula

$$e^{ia\sin b} = \sum_{n=-\infty}^{\infty} J_n(a)e^{inb}, \tag{44}$$

where
$$J_n(a) = \frac{1}{\pi}\int_0^{\pi}\cos(a\sin x - nx)\,dx \tag{45}$$
are Bessel functions of integer order $n$ which will be encountered often throughout
the remainder of this book. Substituting $a = \Omega/\nu$ and $b = \nu t$ in (44) and
multiplying it by $e^{i\omega_0 t}$ we will obtain for the simplest tune signal (2) the formula

$$f(t) = \sum_{n=-\infty}^{\infty} J_n(\Omega/\nu)\,e^{i(\omega_0 + n\nu)t}. \tag{46}$$
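The harmonic expansion (46) can be verified with a few lines of code. This is a minimal sketch, assuming the simplest tune has the form $f(t) = \exp\{i(\omega_0 t + (\Omega/\nu)\sin\nu t)\}$, which is what (44) produces after the substitution described above. The Bessel coefficients are computed directly from the integral (45) by the midpoint rule; the parameter values, the truncation order and the function names are illustrative assumptions.

```python
import cmath
import math

def bessel_j(n, a, steps=400):
    """J_n(a) from the integral (45), evaluated by the midpoint rule on [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += math.cos(a * math.sin(x) - n * x)
    return total * h / math.pi

W0, OMEGA, NU = 20.0, 10.0, 1.0      # omega_0, Omega, nu (illustrative values)
N_MAX = 40                           # harmonics kept on each side of omega_0
COEFFS = {n: bessel_j(n, OMEGA / NU) for n in range(-N_MAX, N_MAX + 1)}

def tune_direct(t):
    """Simplest tune, assumed here to be exp(i*(w0*t + (Omega/nu)*sin(nu*t)))."""
    return cmath.exp(1j * (W0 * t + OMEGA / NU * math.sin(NU * t)))

def tune_from_harmonics(t):
    """Truncated sum (46) over the harmonics w0 + n*nu."""
    return sum(c * cmath.exp(1j * (W0 + n * NU) * t) for n, c in COEFFS.items())

for t in (0.0, 0.7, 2.5):
    print(t, abs(tune_from_harmonics(t) - tune_direct(t)))
```

The coefficients $J_n(\Omega/\nu)$ become negligible once $|n|$ exceeds $\Omega/\nu$ by a comfortable margin, so a modest truncation already reproduces the signal to rounding accuracy.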

Hence, $f(t)$ has the following distributional Fourier image:

$$\tilde f(\omega) = \sum_{n=-\infty}^{\infty} J_n(\Omega/\nu)\,\delta(\omega - \omega_0 - n\nu). \tag{47}$$

Applying it to the definition (9) of the windowed Fourier transform we get, in view
of (10), that

$$\tilde f(\omega, \tau) = \frac{\alpha}{\nu}\sum_{n=-\infty}^{\infty} J_n(\beta)\,\tilde g_0(\alpha(y - n))\,e^{-i(y-n)s}, \tag{48}$$

where $y = (\omega - \omega_0)/\nu$. It follows from (48) that our integral characteristics-based
analysis of the joint density $E(\omega, \tau)$ is certainly not applicable for $\alpha \gg 1$ when
the right-hand side of (48) collapses to a sum of separate peaks and $E(\omega, \tau)$ (see
(17)) becomes a polymodal function of variable $\omega$.

FIGURE 7.3.5
Plots of the simplest tune's joint density $E(\omega, \tau)$ in the case of $\beta = 10$ and
(a) $\alpha = \alpha^* = 0.27$, (b) $\alpha = 1$, (c) $\alpha = 2$. The unimodality disappears and
multimodality sets in at $\alpha \approx 1$.
7.3.5. Inverse windowed Fourier transform. As for any other integral transform,
the question whether there is enough information contained in the windowed
Fourier transform $\tilde f(\omega, \tau)$ to recover from it the original function $f(t)$ is
of paramount importance. To answer this question let us keep in mind that the
windowed Fourier transform of $f(t)$ is nothing but the ordinary Fourier transform
of the windowed function $f(t)g(t - \tau)$. Hence, applying the usual formula (3.2.4)
for the inverse Fourier transform we immediately get

$$f(t)g(t - \tau) = \int \tilde f(\omega, \tau)e^{i\omega t}\,d\omega. \tag{49}$$

This identity permits the recovery of values of function $f(t)$ only inside the window
$g(t - \tau)$, or in practical terms, only where $g(t - \tau)$ is not too small.

To remove this limitation let us select a function $h^*(t)$, multiply both sides of
(49) by its shift by $\tau$ and integrate them over all $\tau$. The result is

$$A f(t) = \int d\tau \int d\omega\, \tilde f(\omega, \tau)h^*(t - \tau)e^{i\omega t}, \tag{50}$$

where
$$A = \int h^*(\theta)g(\theta)\,d\theta. \tag{51}$$

If the auxiliary function $h(\theta)$ is chosen so that the constant $A \neq 0$, then the desired
inverse windowed Fourier transform formula takes the form

$$f(t) = \frac{1}{A}\int d\tau \int d\omega\, \tilde f(\omega, \tau)h^*(t - \tau)e^{i\omega t}. \tag{52}$$

Clearly, the above inverse formula is not unique as it depends on the choice of
function $h(\theta)$. For example, if we take

$$h(t - \tau) = \delta(t - \tau - \tau_{max}),$$

where $\tau_{max}$ is the time when $g(t)$ has its maximum value, i.e.,

$$g(t) \le g_{max} = g(\tau_{max}),$$

then $A = g_{max}$ and the inverse formula takes the form

$$f(t) = \frac{1}{g_{max}}\int \tilde f(\omega, t - \tau_{max})e^{i\omega t}\,d\omega. \tag{53}$$
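The single-integral inversion (53) lends itself to a direct numerical check. The sketch below is a hedged illustration, not a production algorithm: it assumes the transform $\tilde f(\omega,\tau) = \frac{1}{2\pi}\int f(t)g(t-\tau)e^{-i\omega t}dt$, a Gaussian window with its maximum at $\tau_{max} = 0$, and plain Riemann sums on grids wide enough for the chosen test signal. The signal, $\lambda$, and the grid steps are arbitrary choices made for this example.

```python
import cmath
import math

LAM = 0.5              # window duration lambda (illustrative)
DT, DW = 0.02, 0.05    # time and frequency steps of the Riemann sums

def f(t):
    """Illustrative real test signal."""
    return math.exp(-t * t / 8.0) * math.cos(3.0 * t)

def g(t):
    """Gaussian window g(t) = g0(t / lambda); its maximum g_max = g(0) is at tau_max = 0."""
    x = t / LAM
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

def windowed_ft(w, tau):
    """(1/2pi) * integral f(t) g(t - tau) exp(-i*w*t) dt over the window's effective support."""
    k0 = int(round((tau - 8.0 * LAM) / DT))
    n_pts = int(round(16.0 * LAM / DT)) + 1
    s = 0j
    for k in range(k0, k0 + n_pts):
        t = k * DT
        s += f(t) * g(t - tau) * cmath.exp(-1j * w * t)
    return s * DT / (2.0 * math.pi)

def reconstruct(t):
    """Inverse formula (53) with tau_max = 0: integrate over w in [-20, 20]."""
    s = 0j
    for k in range(-400, 401):
        w = k * DW
        s += windowed_ft(w, t) * cmath.exp(1j * w * t)
    return s * DW / g(0.0)

for t in (-1.0, 0.3):
    print(t, f(t), reconstruct(t))
```

Only the slice $\tau = t - \tau_{max}$ of the two-dimensional transform is used here, which makes the redundancy of $\tilde f(\omega, \tau)$ quite tangible.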

The above nonuniqueness of the inverse formulas is caused by the obvious
redundancy of the windowed Fourier transform, which maps function $f(t)$ of a single
variable into function $\tilde f(\omega, \tau)$ of two variables. Sometimes, however, such an
overdetermination is useful. For instance, if one has to reconstruct the entire signal
$f(t)$ from incomplete information about its transform, the overdetermination
present in the windowed Fourier transform can be helpful.
Among all the possible inverse formulas (52) one can try to find the optimal
one in the sense that it would maximize the value of coefficient $A$ (important in
numerical computations) among all the auxiliary functions $h(\theta)$, such that

$$\|h\| = \|g\|. \tag{54}$$

We assume that the window $g(t)$ is given and is of finite energy, i.e., both $g, h \in L^2$.
In terms of Section 7.1, $A$ is just the inner product $(h, g)$ and, by the Schwarz
inequality,

$$|A| = |(h, g)| \le \|h\|\,\|g\| = \|g\|^2.$$

Thus, obviously, the greatest possible value of $A = \|g\|^2$ is attained if $h(t) = g(t)$,
and the optimal (in the above sense) inverse formula is

$$f(t) = \frac{1}{\|g\|^2}\int d\tau \int d\omega\, \tilde f(\omega, \tau)g^*(t - \tau)e^{i\omega t},$$

or, in view of (6) and (22),

$$f(t) = \frac{1}{\lambda}\int d\tau\, g_0\!\left(\frac{t - \tau}{\lambda}\right)\int d\omega\, \tilde f(\omega, \tau)e^{i\omega t}. \tag{55}$$

The asterisk was dropped because the windowing shape function $g_0(x)$ was real-
valued.

7.3.6. From windows to wavelets. Although windowing was adopted in this
section as our favored method, it is only fair to take now a last parting look at
the windowed Fourier transform

$$\tilde f(\omega, \tau) = \frac{1}{2\pi}\int f(t)\,g_0\!\left(\frac{t - \tau}{\lambda}\right)e^{-i\omega t}\,dt, \tag{56}$$

to assess impartially its merits and shortcomings.

First of all, note that the right-hand side of (56) contains three free parameters
$\tau$, $\omega$, and $\lambda$, but only two of them, the current time $\tau$ and the frequency $\omega$, are
variables. The remaining parameter, the window duration $\lambda$, is usually assumed
to be constant. This line of thinking is tied to the intuition that the windowed
Fourier transform is just a parameterized version of the regular Fourier transform
introduced for the purpose of tracking instantaneous frequencies of the signal.
Such motivation is, however, also the source of limitations of the windowed Fourier
transform, and keeping $\lambda$ constant restricts our ability to simultaneously analyze
the time-frequency properties of the signal. To see more precisely what we mean,
let us consider the signal
$$f(t) = f_0(t/a)$$
of a given shape $f_0(x)$ but with variable width governed by the parameter $a$.
Suppose that both $f_0(x)$ and $g_0(x)$ are well localized in the vicinity of $x = 0$.
Then, for $a \ll \lambda$, approximately

$$\tilde f(\omega, \tau) \approx \tilde f(\omega)\,g_0(-\tau/\lambda).$$

This means that, for $a \ll \lambda$, the windowed Fourier transform really measures the
signal's ordinary Fourier image and is not very good at doing its localization-in-
time-and-frequency job. The window simply becomes too broad to be of any value.
In the opposite case, when the window becomes very narrow ($\lambda \ll a$),

$$\tilde f(\omega, \tau) \approx f(\tau)\,\tilde g(\omega)\,e^{-i\omega\tau},$$

the windowed Fourier transform does an excellent job at time-frequency localization
but fails to provide any information about its spectral properties.
So, the windowed Fourier transform has a limited applicability field and seems
to be most suitable for time-frequency analysis of narrow-band signals with phase
$\Phi(t)$ subject to strong but slow nonlinear time-evolution. The example here is
the simplest tune (2-4) for $\Omega \gg \nu$ ($\omega_0 \gg \Omega$). However, most of the signals of
interest to scientists and engineers, from cardiograms and seismograms to stock
market quotations and turbulent velocity fields, do not resemble simple tunes (Fig.
7.3.1) and have a much richer structure which often includes the appearance of a
wide range of scales. Tools more flexible than the windowed Fourier transform are
necessary for their satisfactory analysis, and it is clear from the very beginning that
the scale parameter $\lambda$ has to be treated as one of the primary variables. This leads
to a suggestion of the new signal processing algorithm described by

$$\hat f(\lambda, \tau) = A(\lambda)\int f(t)\,\psi^*\!\left(\frac{t - \tau}{\lambda}\right)dt,$$

which is called the continuous wavelet transform and which takes into account both
the location and the scale properties via parameters $\tau$ and $\lambda$, respectively. The shape
function $\psi(x)$ of the wavelet transform kernel is usually called the mother wavelet.
In contrast with the windowed Fourier transform where the window's shape $g_0$
plays a minor role, the choice of the mother wavelet is of utmost importance and
we will devote a lot of attention to it in the following sections.
The reader has probably noticed already that the notion of frequency has been
lost in the process. This is not accidental, and abandoning the frequency paradigm
(closely tied to selecting the mother wavelet containing trigonometric functions
$e^{-i\omega t}$) in favor of the scale paradigm (which permits full flexibility in selecting the
mother wavelet) turns out to be not a weakness but the main strength of the wavelet
transforms. The wavelet transforms can be tuned to the peculiarities of the signal
we have to work with. If the signal is narrow-band then we can take an oscillating
wavelet
$$\psi(x) = e^{iQx}g_0(x)$$
giving rise to a wavelet transform similar to the windowed Fourier transform but
with different emphasis. The wavelet transform acts like a microscope, narrowing
the visible area of the signal with the growth of the wavelet's frequency $\omega = Q/\lambda$.


7.4 Continuous wavelet transforms


7.4.1. Definition and properties of continuous wavelet transform. In this section
we take a general look at the continuous wavelet transform both theoretically
and as it relates to physical and engineering problems. Mathematical questions
concerning particular wavelet systems will be dealt with in the last three sections
of this chapter.
The continuous wavelet image of signal $f(t)$ is defined by

$$\hat f(\lambda, \tau) = A(\lambda)\int f(t)\,\psi^*\!\left(\frac{t - \tau}{\lambda}\right)dt, \tag{1}$$

where $\psi(x)$ is a certain function called the mother wavelet and $\lambda$ and $\tau$ are called,
respectively, the scale variable and the location variable. Function $A(\lambda)$ will be
specified later. Note that, to distinguish it from the Fourier transform

$$\tilde f(\omega) = \frac{1}{2\pi}\int f(t)e^{-i\omega t}\,dt, \tag{2}$$

the continuous wavelet transform will be denoted by applying a "hat" $\hat f$ to the
original function $f$.
Observe that the mother wavelet $\psi(x)$ plays the role of complex exponentials
$\exp(iz)$ in the Fourier transform (then, also $A = 1/2\pi$). Varying in (2) the
frequency $\omega$ by compressing or dilating the complex exponential function, we obtain,
after integration, a new function $\tilde f(\omega)$ which represents the complex amplitude of
the corresponding harmonic component of the original signal:

$$f(t) = \int \tilde f(\omega)e^{i\omega t}\,d\omega. \tag{3}$$

Consequently, the Fourier image $\tilde f(\omega)$ measures the contribution of different
harmonics to the, in general nonharmonic, signal $f(t)$.
A similar compression and dilation of the mother wavelet is accomplished for
the continuous wavelet transform by the scaling parameter $\lambda$. Its exact analog for
the Fourier transform is the period $T = 2\pi/|\omega|$. In a sense, one can interpret the
value of the continuous wavelet transform $\hat f(\lambda, \tau)$ as a measure of the contribution
of the mother wavelet rescaled by $\lambda$, $\psi((t - \tau)/\lambda)$, to the signal $f(t)$.
The coefficient $A(\lambda)$ can be selected arbitrarily so as to magnify or reduce
sensitivity of the transform to different scales. However, very often it is simply selected
as
$$A(\lambda) = 1/\sqrt{\lambda}, \tag{4}$$
so that
$$\hat f(\lambda, \tau) = \frac{1}{\sqrt{\lambda}}\int f(t)\,\psi^*\!\left(\frac{t - \tau}{\lambda}\right)dt. \tag{5}$$

This choice guarantees that the arbitrary rescaling of the mother wavelet preserves
the mother wavelet's $L^2$-norm. Indeed,

$$\left\|\frac{1}{\sqrt{\lambda}}\,\psi^*\!\left(\frac{t - \tau}{\lambda}\right)\right\|^2 = \frac{1}{\lambda}\int \left|\psi\!\left(\frac{t - \tau}{\lambda}\right)\right|^2 dt = \int |\psi(x)|^2\,dx = \|\psi(x)\|^2. \tag{6}$$

One could say that with this choice of $A(\lambda)$ all the scales carry equal weight.
As we already mentioned in the previous section, for all its great features
discussed at length in Chapter 3, the Fourier transform has, from the point of view of a
physicist, one essential shortcoming: its "mother wavelet" $\exp(iz)$ has unbounded
support. As a result, based on information contained in the Fourier image $\tilde f(\omega)$ it
is difficult to assess where signal $f(t)$ (or its special features) is located on the $t$
axis and where it is equal to 0. In particular, this type of information is totally lost
in the "spectral density" $|\tilde f(\omega)|^2$ of the distribution of harmonic components over
the frequency $\omega$ axis. That drawback will be removed in the continuous wavelet
transform by selecting a localized mother wavelet $\psi(z)$ which decays rapidly to
zero as $z \to \pm\infty$. Consequently, in the continuous wavelet transform, in addition
to the scale parameter $\lambda$, there appears another primary parameter, the location
shift $\tau$. Varying it we can track the time $t$ evolution of the "events."
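Example 1. Wavelet transform expressed via the Fourier transform. To complete
the general picture one should note that the continuous wavelet transform can be,
obviously, expressed in terms of the Fourier images of the original function and
the mother wavelet:

$$\hat f(\lambda, \tau) = 2\pi\lambda A(\lambda)\int \tilde f(\omega)\,\tilde\psi^*(\lambda\omega)\,e^{i\omega\tau}\,d\omega. \tag{7}$$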

In particular, substituting the distributional Fourier image of function $\psi(z) =
\exp(iz)$, we immediately get

$$\tilde\psi(\omega\lambda) = \delta(\omega\lambda - 1) = \delta(\omega - 1/\lambda)/\lambda,$$

and setting $A(\lambda) = 1/2\pi$ we obtain

$$\hat f(\lambda, \tau) = \tilde f(1/\lambda)\exp(i\tau/\lambda). \tag{8}$$

Example 2. Morlet wavelets. The mother wavelet often encountered in practical
applications,
$$\psi(z) = e^{iQz}\varphi(z), \tag{9}$$
with the Gaussian windowing function

$$\varphi(z) = \exp(-z^2/2), \tag{10}$$

is traditionally called the complex-valued Morlet wavelet (the plot of its real part,
for $Q = 10$, is shown in Fig. 7.4.1). As a result, the Fourier image of $\psi(z)$ is also
Gaussian:
$$\tilde\psi(\omega) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(\omega - Q)^2}{2}\right). \tag{11}$$
Recall that the Gaussian shape of the windowing function is the minimizer in
the uncertainty principle (7.2.4) and, consequently, it optimizes the joint resolution
in time $t$ and frequency $\omega$. Indeed, the continuous wavelet image

$$\hat f(\lambda, \tau) = A(\lambda)\int f(t)\exp\left(-i\frac{Q}{\lambda}(t - \tau) - \frac{(t - \tau)^2}{2\lambda^2}\right)dt \tag{12}$$

contains information about the original (not too fast increasing) function $f(t)$ in
the window of effective length $\sim \sigma[\varphi] = \lambda/\sqrt{2}$. Expressing, as in (7), $\hat f(\lambda, \tau)$
through the Fourier images of the analyzed functions we get

$$\hat f(\lambda, \tau) = \sqrt{2\pi}\,\lambda A(\lambda)\int \tilde f(\omega)\exp\left(-\frac{(\lambda\omega - Q)^2}{2} + i\omega\tau\right)d\omega. \tag{13}$$

FIGURE 7.4.1
The plot of the real part of the Morlet mother wavelet for $Q = 10$.

This means that $\hat f(\lambda, \tau)$ depends on the values of the Fourier image $\tilde f(\omega)$ in the
frequency band of width $\sigma[\tilde\psi] = 1/(\lambda\sqrt{2})$ centered at the frequency

$$\Omega = Q/\lambda. \tag{14}$$

In other words, $\hat f(\lambda, \tau)$ supplies information about the spectral properties of the
original function with resolution $1/(\lambda\sqrt{2})$. The arbitrary parameter $Q$ entering in
the definition (9) of the Morlet wavelet could be called the efficiency factor of the
Morlet wavelet since the quantity $Q/2\pi$ is of the order of the number of the Morlet
wavelet's periods contained in its window. •
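The tuning of the Morlet transform to the frequency $Q/\lambda$ can be seen on a pure tone. The following is a minimal sketch, assuming the normalization $A(\lambda) = 1/\sqrt{\lambda}$ of (5), the Morlet wavelet (9)-(10), and the test signal $f(t) = e^{i\omega_0 t}$. For this signal the Gaussian integral in (12) can be done by hand and gives $\hat f(\lambda,\tau) = e^{i\omega_0\tau}\sqrt{2\pi\lambda}\,\exp(-(\lambda\omega_0 - Q)^2/2)$, which the code uses as a cross-check. The values of $Q$ and $\omega_0$, and all function names, are illustrative.

```python
import cmath
import math

Q = 10.0        # efficiency factor of the Morlet wavelet (illustrative)
W_TONE = 4.0    # frequency omega_0 of the test tone (illustrative)

def morlet(z):
    """Morlet mother wavelet (9)-(10): exp(i*Q*z) * exp(-z*z/2)."""
    return cmath.exp(1j * Q * z) * math.exp(-z * z / 2.0)

def tone(t):
    return cmath.exp(1j * W_TONE * t)

def cwt(signal, lam, tau, n=2001, span=8.0):
    """Formula (5): (1/sqrt(lam)) * integral signal(t) * conj(psi((t - tau)/lam)) dt."""
    h = 2.0 * span * lam / (n - 1)
    s = 0j
    for k in range(n):
        t = tau - span * lam + k * h
        s += signal(t) * morlet((t - tau) / lam).conjugate()
    return s * h / math.sqrt(lam)

def closed_form(lam, tau):
    """Hand-evaluated Gaussian integral for the pure tone."""
    return (cmath.exp(1j * W_TONE * tau) * math.sqrt(2.0 * math.pi * lam)
            * math.exp(-(lam * W_TONE - Q) ** 2 / 2.0))

scales = [1.5 + 0.02 * k for k in range(101)]
best_scale = max(scales, key=lambda lam: abs(cwt(tone, lam, 0.0)))
print(best_scale, Q / W_TONE)
```

The modulus of the transform peaks at a scale close to $Q/\omega_0$; the small offset comes from the factor $\sqrt{\lambda}$ in front of the Gaussian.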

Just as the first automobiles of the last century took inspiration from and mim-
icked the horse-drawn carriages, and only later developed their own identity, the
wavelets underwent a similar evolution which started with their identity as "im-
proved" versions of the Fourier transform and only gradually developed into being
recognized for their own outstanding capabilities. These capabilities, still far from
being fully tapped, are related to the fact that the mathematical theory of wavelets,
as we will see later on, imposes very few restrictions on the choice of the mother
wavelet's shape. We will illustrate them on concrete applications in the rest of this
section.
One of the powerful applications of the continuous wavelet transform is the
study of open and hidden singularities in the incoming signal f(t). Usually, the
singularities are caused by physical (biological, economic, etc.) laws, whose valid-
ity the experimenter is trying to confirm, or come from the existence of the sharp
boundaries between the regions where the process f(t) evolves smoothly. The
mother wavelets that are useful in this context are quite unlike the Morlet wavelet
(9-10).
Example 3. Mexican hat wavelet. Differentiation can bring to the surface a function's
hidden singularities. For this reason one often selects mother wavelets so that the
corresponding continuous wavelet transform converges, for $\lambda \to 0$, to a desired
derivative of the function being analyzed. One such example is the Mexican
hat
(15)

which is just the second derivative of the Gaussian function (10). Its Fourier image
is

(16)
FIGURE 7.4.2
The Mexican hat mother wavelet.

Substituting (15) into (1), and integrating by parts twice, we get

$$\hat f(\lambda, \tau) = A(\lambda)\lambda^2\int \varphi\!\left(\frac{t - \tau}{\lambda}\right)\frac{d^2}{dt^2}f(t)\,dt. \tag{17}$$

It is customary to select
$$A(\lambda) = 1/(\lambda^3\sqrt{2\pi}), \tag{18}$$

so that $\hat f(\lambda, \tau)$ converges, for $\lambda \to 0$, to exactly $f''(\tau)$.

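Another property of the continuous wavelet transform essential to understand
its mechanism is based on the Schwarz inequality (7.1.19)

$$\left|\int f(t)g^*(t)\,dt\right|^2 \le \int |f(t)|^2\,dt \int |g(t)|^2\,dt, \tag{19}$$

which applied to the function

$$g(t) = \frac{1}{\sqrt{\lambda}}\,\psi\!\left(\frac{t - \tau}{\lambda}\right) \tag{20}$$

yields the inequality

$$|\hat f(\lambda, \tau)|^2 \le \|f\|^2\,\|\psi\|^2. \tag{21}$$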

The inequality provides an upper bound on possible values of the modulus of the
continuous wavelet transform (5) of $f(t)$. Let us assume, without loss of generality,
that both the signal and the mother wavelet are normalized so that $\|f\| = \|\psi\| = 1$.
It is clear that the maximum values are achieved, and the inequality (21) becomes
an equality, if the original function $f(t)$ is equal, for certain $\lambda = \lambda_0$ and $\tau = \tau_0$,
to the wavelet

$$f(t) = \frac{1}{\sqrt{\lambda_0}}\,\psi\!\left(\frac{t - \tau_0}{\lambda_0}\right). \tag{22}$$

Informally, we can say that the continuous wavelet transform is best tuned to, or
resonates with, signals that have shapes similar to that of the mother wavelet.

FIGURE 7.4.3
The grey-scale plot of the wavelet image of $f(t) = \exp(-|t|)$ in the case of the
Mexican hat mother wavelet. The horizontal axis represents the $\tau$-variable
and the vertical the $\lambda$-variable. The grey-scale level changes from black to white
as the values of the wavelet image increase. The black oval spot in the lower
middle portion of the plot is a consequence of the singularity of the original
function's second derivative at $t = 0$.
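The resonance property can be made concrete with a short numerical experiment. The sketch below is an illustration under assumptions: the mother wavelet is taken to be the second derivative of the Gaussian, $(z^2 - 1)e^{-z^2/2}$ (its overall sign plays no role for the moduli computed here), it is normalized to unit $L^2$-norm, the transform uses $A(\lambda) = 1/\sqrt{\lambda}$ as in (5), and the signal is the rescaled wavelet (22) with arbitrarily chosen $\lambda_0$ and $\tau_0$.

```python
import math

def psi(z):
    """Second derivative of exp(-z*z/2); the overall sign does not affect the moduli below."""
    return (z * z - 1.0) * math.exp(-z * z / 2.0)

NORM = math.sqrt(0.75 * math.sqrt(math.pi))   # L2-norm of psi

def atom(t, lam, tau):
    """Unit-norm rescaled wavelet psi((t - tau)/lam) / sqrt(lam)."""
    return psi((t - tau) / lam) / (NORM * math.sqrt(lam))

def cwt_modulus(signal, lam, tau, n=6001, lo=-30.0, hi=30.0):
    """Modulus of (5) for a real signal and the real, normalized wavelet above."""
    h = (hi - lo) / (n - 1)
    return abs(sum(signal(lo + k * h) * atom(lo + k * h, lam, tau) for k in range(n)) * h)

LAM0, TAU0 = 1.5, 0.8                         # illustrative scale and location

def f(t):
    """Signal (22): the wavelet itself at scale LAM0 and location TAU0."""
    return atom(t, LAM0, TAU0)

print(cwt_modulus(f, LAM0, TAU0))   # attains the upper bound of (21)
print(cwt_modulus(f, 3.0, TAU0))    # mismatched scale: strictly smaller
print(cwt_modulus(f, LAM0, 2.0))    # mismatched location: strictly smaller
```

With both norms equal to one, the bound (21) is reached only at the matching pair $(\lambda_0, \tau_0)$; any mismatch in scale or location lowers the response.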
Note that the more complex-structured the mother wavelet (20) and the
resonating signal (22) are, the more pronounced the above resonance property of the
corresponding continuous wavelet transform is. To make things a bit more formal
let us define the signal as complex-structured if its time (7.2.2) and frequency
(7.2.3) localizations satisfy the "strong" uncertainty principle:

(23)

Example 4. Complex-structured signal. Let us consider signal $f(t)$ whose
Fourier image is the familiar Gaussian function

$$\tilde f(\omega) = \frac{\pi^{-1/4}}{\sqrt{2\pi\mu}}\exp\left(-\frac{\omega^2}{2\mu^2}(1 + i\gamma)\right), \tag{24}$$

where $\gamma$ is a real number and the constant $\mu$ (with the dimension of frequency)
has the meaning of effective width of the Fourier image. The coefficient in front
of the exponential function has been selected so that the normalization condition
$\|f\| = 1$ is satisfied. With the help of the integral formula (3.2.3), we compute the
inverse Fourier transform

$$f(t) = \frac{\sqrt{\mu}}{\pi^{1/4}\sqrt{1 + i\gamma}}\exp\left(-\frac{\mu^2 t^2}{2(1 + i\gamma)}\right). \tag{25}$$

In turn, using the integral formula

we find the frequency (7.2.3) and time (7.2.2) localizations of the complex-valued
signal (25):
$$\sigma^2[\tilde f] = \frac{\mu^2}{2}, \qquad \sigma^2[f] = \frac{1 + \gamma^2}{2\mu^2}. \tag{26}$$

Substituting these expressions into (23) we get the following condition for the
signal $f(t)$ (25) to be complex-structured:

$$\gamma \gg 1. \tag{27}$$
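The localizations of Example 4 can be confirmed by brute-force integration. The sketch below rests on an assumption about definitions (7.2.2) and (7.2.3), which are not restated in this section: the localization $\sigma^2$ of a function is taken to be the second moment of its squared modulus divided by its total energy. The signal and its Fourier image are coded from (25) and (24), with the real parameter written as `gamma`; the values of $\mu$ and $\gamma$ are illustrative.

```python
import cmath
import math

def signal(t, mu, gamma):
    """Formula (25)."""
    c = 1.0 + 1j * gamma
    return (math.sqrt(mu) / (math.pi ** 0.25 * cmath.sqrt(c))
            * cmath.exp(-mu * mu * t * t / (2.0 * c)))

def image(w, mu, gamma):
    """Formula (24)."""
    return (math.pi ** -0.25 / math.sqrt(2.0 * math.pi * mu)
            * cmath.exp(-w * w * (1.0 + 1j * gamma) / (2.0 * mu * mu)))

def norm_and_variance(func, half_width, n=4001):
    """Return (integral of |func|^2, normalized second moment of |func|^2)."""
    h = 2.0 * half_width / (n - 1)
    m0 = m2 = 0.0
    for k in range(n):
        x = -half_width + k * h
        p = abs(func(x)) ** 2
        m0 += p
        m2 += x * x * p
    return m0 * h, m2 / m0

MU, GAMMA = 2.0, 5.0
sigma_t = math.sqrt((1.0 + GAMMA ** 2) / 2.0) / MU
energy, var_t = norm_and_variance(lambda t: signal(t, MU, GAMMA), 12.0 * sigma_t)
_, var_w = norm_and_variance(lambda w: image(w, MU, GAMMA), 12.0 * MU)
print(energy, var_t, (1.0 + GAMMA ** 2) / (2.0 * MU ** 2), var_w, MU ** 2 / 2.0)
```

The product of the two localizations grows like $\sqrt{1 + \gamma^2}/2$, so a large $\gamma$ is exactly what pushes the signal far above the uncertainty bound.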

Remark 1. To better see the reasons why signal (25) turned out to be complex-
structured let us write the complex-valued Fourier image $\tilde f(\omega)$ of an arbitrary
signal $f(t)$ in the exponential form

$$\tilde f(\omega) = A(\omega)\exp(-i\Phi(\omega)), \tag{28}$$

where $A(\omega) = |\tilde f(\omega)|$ is the nonnegative amplitude and $\Phi(\omega)$ the real phase of
the complex Fourier image $\tilde f(\omega)$. The amplitude and phase of the Fourier image
(24) of signal (25) are

$$A(\omega) = \frac{\pi^{-1/4}}{\sqrt{2\pi\mu}}\exp\left(-\frac{\omega^2}{2\mu^2}\right), \qquad \Phi(\omega) = \frac{\gamma\omega^2}{2\mu^2}. \tag{29}$$

The complex structure of signal (25) was conditioned on the fast nonlinear variation
of the phase of the Fourier image (24) as a function of $\omega$. Indeed, according to (3),
the signal can be written in the form

$$f(t) = \int A(\omega)\exp\big(i[\omega t - \Phi(\omega)]\big)\,d\omega.$$

Employing the stationary phase method, asymptotically ($\gamma \to \infty$), the value of
signal $f(t)$ at a given instant $t$ is determined by the integral contribution in the
small neighborhood of the stationary point, in our case $\Omega = t\mu^2/\gamma$. Substituting
here, instead of $\Omega$, the effective width $\mu$ of the Fourier image we shall find the
effective duration of the complex-structured signal:

$$T \approx \gamma/\mu, \qquad (T\mu \gg 1).$$

Remark 2. The approximate estimate of the duration of signal (25) obtained above
via the stationary phase method may seem unnecessary at first sight since we
already know the exact form of the signal and the exact formula for its time
localization:
$$\sigma[f] = \frac{\sqrt{1 + \gamma^2}}{\sqrt{2}\,\mu}. \tag{30}$$

Nevertheless, the above argument has a heuristic value, emphasizing the principal
role of the phase in complex-structured signal formation. It also shows a universal
method of calculation of its form and duration.
Example 5. Complex-structured mother wavelet. As another example of a mother
wavelet let us take function $\psi(z)$ coinciding with the complex-structured signal
$f(z)$ (25). The continuous wavelet image $\hat f(\lambda, \tau)$ of function $f(t)$ to which the
mother wavelet is perfectly tuned is

$$K(\lambda, \tau) = \frac{1}{\sqrt{\lambda}}\int f(t)\,f^*\!\left(\frac{t - \tau}{\lambda}\right)dt. \tag{31}$$

Recall that the form (5) of the continuous wavelet transform selected here
guarantees that, for any $\lambda$, the normalization condition

is satisfied. Notice that we also introduced special notation $K(\lambda, \tau)$ for the special
continuous wavelet image of the mother wavelet itself. Function $K(\lambda, \tau)$ is
sometimes called the wideband ambiguity function of the mother wavelet and it plays
an important role in wavelet theory. In terms of the Fourier images

$$K(\lambda, \tau) = 2\pi\sqrt{\lambda}\int \tilde f(\omega)\,\tilde f^*(\omega\lambda)\,e^{i\omega\tau}\,d\omega, \tag{32}$$

so that substituting (24) we obtain

$$K(\lambda, \tau) = \frac{1}{\mu}\sqrt{\frac{\lambda}{\pi}}\int \exp\left(-\frac{1}{2}\,p\,\frac{\omega^2}{\mu^2} + i\omega\tau\right)d\omega, \tag{33}$$

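where
$$p = 1 + \lambda^2 + i\gamma(1 - \lambda^2). \tag{34}$$

Finally, evaluation of the integral (33) gives

$$K(\lambda, \tau) = \sqrt{\frac{2\lambda}{p}}\exp\left(-\frac{\mu^2\tau^2}{2p}\right).$$

The above function has a maximum at $\tau = 0$, and its modulus square has the
following dependence on $\lambda$:

$$I(\lambda) = |K(\lambda, 0)|^2 = \frac{2\lambda}{\sqrt{(1 + \lambda^2)^2 + \gamma^2(1 - \lambda^2)^2}}.$$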

It is natural to interpret function $I(\lambda)$ as a sort of resonance curve which
characterizes the response efficiency of the continuous wavelet transform as a function
of the scale parameter $\lambda$. Fig. 7.4.4 shows graphs of function $I(\lambda)$ for signals of
different complexity, as measured by parameter $\gamma$. It is clear from the illustrations
that the resonance is best emphasized for large values of $\gamma$, that is for signals of
large complexity.
The maximal value of I (1) is achieved for A = 1. It is related to the fact that
for A = 1 function (31) becomes the autocorrelation function

K(1') = ! f(t)f*(t - 1')dt

of the original signal. The autocorrelation function has some remarkable properties.
In particular, it transforms any signal, however complex, into a simple signal whose
Fourier image,

$\tilde K(\omega) = 2\pi |\tilde f(\omega)|^2, \qquad (36)$

is real and nonnegative, with the phase $\Phi(\omega) \equiv 0$. In electrical engineering one
often says that all the harmonics of the autocorrelation function $K(\tau)$ have identical
phases.

FIGURE 7.4.4
Graphs of the function $I(\lambda)$ for signals of different complexity, as measured by $\gamma$.

The autocorrelation function achieves its maximum at $\tau = 0$ and decays relatively
rapidly as $|\tau|$ increases. In particular, it is easy to see that its localization
properties are determined by

$\sigma[K] = \frac{1}{\mu\sqrt{2}},$

so that, in view of (30), it is clear that for $\gamma \gg 1$ the autocorrelation function
$K(\tau) = K(\lambda = 1, \tau)$ is much better localized on the $\tau$-axis than the original
signal $f(t)$ (25) on the $t$-axis. •
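The two properties of the autocorrelation function noted above can be checked numerically. The sketch below is only an illustration: it assumes a Gaussian pulse with a quadratic phase (a chirp) as a stand-in for the complex-structured signal (25), and the chirp rate `beta`, the grid, and the helper `rms_width` are arbitrary choices made for the example. It computes the autocorrelation directly from its definition, confirms that its discrete Fourier image is real and nonnegative, and compares the widths of the signal and of its autocorrelation.

```python
import numpy as np

def rms_width(x, density):
    """Root-mean-square width of a nonnegative density sampled on the grid x."""
    w = density / density.sum()
    mean = (x * w).sum()
    return np.sqrt((((x - mean) ** 2) * w).sum())

dt = 0.005
t = np.arange(-6.0, 6.0, dt)
beta = 10.0                                   # chirp rate (illustrative assumption)
f = np.exp(-t**2 / 2) * np.exp(1j * beta * t**2)

# K(tau) = integral of f(t) f*(t - tau) dt; np.correlate conjugates its second argument
K = np.correlate(f, f, mode="full") * dt
tau = (np.arange(K.size) - (f.size - 1)) * dt

# Discrete Fourier image of K: move the zero lag to index 0 before the FFT
spectrum = np.fft.fft(np.fft.ifftshift(K))

width_f = rms_width(t, np.abs(f) ** 2)
width_K = rms_width(tau, np.abs(K) ** 2)
print("max |Im| / max Re of the spectrum:", np.abs(spectrum.imag).max() / spectrum.real.max())
print("width of f:", width_f, " width of K:", width_K)
```

For a strongly chirped pulse the second width comes out much smaller than the first, in line with the statement that the autocorrelation is far better localized than the signal itself.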

7.4.2. Inversion of the continuous wavelet transform. As for any other integral
transform, the basic question is: Does the continuous wavelet image $\hat f(\lambda, \tau)$ contain
sufficient information to permit recovery of the original function $f(t)$? In more
practical terms: Does there exist an inversion formula for the continuous wavelet
transform?
To answer these questions let us multiply equality (1) by $\psi((\theta - \tau)/\lambda)$ and
integrate over all $\tau$. The result is the auxiliary integral

$I(\lambda, \theta) = \int \hat f(\lambda, \tau)\, \psi\!\left(\frac{\theta - \tau}{\lambda}\right) d\tau. \qquad (37)$

Equivalently,

$I(\lambda, \theta) = A(\lambda) \int dt\, f(t) \int d\tau\, \psi^*\!\left(\frac{t - \tau}{\lambda}\right) \psi\!\left(\frac{\theta - \tau}{\lambda}\right). \qquad (38)$

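It is easy to see that the inner integral can be expressed via the autocorrelation
function (35)

$K(z) = \int \psi(s)\, \psi^*(s - z)\, ds \qquad (39)$

of the mother wavelet as follows:

$\int \psi^*\!\left(\frac{t - \tau}{\lambda}\right) \psi\!\left(\frac{\theta - \tau}{\lambda}\right) d\tau = \lambda K\!\left(\frac{\theta - t}{\lambda}\right). \qquad (40)$

As a result,

$I(\lambda, \theta) = \lambda A(\lambda) \int f(t)\, K\!\left(\frac{\theta - t}{\lambda}\right) dt. \qquad (41)$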

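To solve this integral equation for $f(t)$ let us multiply (41) by a function $B(\lambda)$, to
be selected later, and integrate over all $\lambda$:

$\int_0^\infty I(\lambda, \theta)\, B(\lambda)\, d\lambda = \int f(t)\, g(\theta - t)\, dt, \qquad (42)$

where

$g(s) = \int_0^\infty K(s/\lambda)\, C(\lambda)\, d\lambda, \qquad (43)$

and

$C(\lambda) = \lambda A(\lambda) B(\lambda). \qquad (44)$

Clearly, the right-hand side of (42) would be reduced to $f(\theta)$, thus solving equation
(41) for the function $f(t)$, if

$g(s) = \int_0^\infty K(s/\lambda)\, C(\lambda)\, d\lambda = \delta(s). \qquad (45)$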

Let us find $C(\lambda)$ for which the distributional equation (45) is satisfied. Remembering
that the Fourier image of the autocorrelation function $K(z)$ is $2\pi|\tilde\psi(\omega)|^2$,
we get the equation

$\tilde g(\omega) = 2\pi \int_0^\infty |\tilde\psi(\omega\lambda)|^2\, \lambda C(\lambda)\, d\lambda = \frac{1}{2\pi} \qquad (46)$

equivalent to equation (45). To eliminate the dependence of the above integral on
$\omega$ we shall select $C(\lambda)$ so that

$\lambda C(\lambda) = \frac{1}{D\lambda}. \qquad (47)$

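In this case, (46) becomes

$\frac{2\pi}{D} \int_0^\infty |\tilde\psi(\omega\lambda)|^2\, \frac{d\lambda}{\lambda} = \frac{1}{2\pi}, \qquad (48)$

where

$D = 4\pi^2 \int_0^\infty |\tilde\psi(\omega\lambda)|^2\, \frac{d\lambda}{\lambda} \qquad (49)$

is the normalizing constant that can be calculated from (48) by introducing the new
variable of integration $\kappa = \omega\lambda$ to get

$D = 4\pi^2 \int_0^\infty |\tilde\psi(\kappa)|^2\, \frac{d\kappa}{\kappa}. \qquad (50)$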

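Putting together (44), (47) and (49) we get that

$B(\lambda) = \frac{1}{D \lambda^3 A(\lambda)}, \qquad (51)$

so that, from (51) and (42),

$\frac{1}{D} \int_0^\infty \frac{I(\lambda, \theta)\, d\lambda}{\lambda^3 A(\lambda)} = f(\theta).$

Substituting expression (37) for $I(\lambda, t)$ we finally obtain the inverse continuous
wavelet transform

$f(t) = \frac{1}{D} \int_0^\infty \frac{d\lambda}{\lambda^3 A(\lambda)} \int d\tau\, \hat f(\lambda, \tau)\, \psi\!\left(\frac{t - \tau}{\lambda}\right). \qquad (52)$

In particular, if the continuous wavelet transform is defined by (4-5), then the
inversion formula takes the form

$f(t) = \frac{1}{D} \int_0^\infty \frac{d\lambda}{\lambda^2 \sqrt{\lambda}} \int d\tau\, \hat f(\lambda, \tau)\, \psi\!\left(\frac{t - \tau}{\lambda}\right). \qquad (53)$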
However, the above inversion formulas require several caveats.


Remark 3. First of all we have to admit that in the process of making our
calculations transparent we cheated quite a bit. The observant reader will have
noticed that the passage from (48) to (50) is justified only if $|\tilde\psi(\omega)|^2$ is an even
function. For that reason formulas (52-53) are valid only for two-sided mother
wavelets, as mother wavelets with an even squared modulus of the Fourier image are called.
To this class belong all the purely real-valued mother wavelets such as the Mexican
hat (15-16). On the other hand, the complex-valued Morlet wavelet (9-10) is not
of this type. For that reason mathematicians often work with one-sided mother
wavelets whose Fourier image satisfies

$\tilde\psi(\omega) \equiv 0, \quad \omega \le 0. \qquad (54)$

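For such mother wavelets, instead of (46) we have the equality

$\tilde g(\omega) = \frac{2\pi}{D} \int_0^\infty |\tilde\psi(\omega\lambda)|^2\, \frac{d\lambda}{\lambda} = \frac{1}{2\pi}\, \chi(\omega). \qquad (55)$

To explain its consequences let us express the right-hand side of (42) in terms of
the Fourier images $\tilde f(\omega)$ and $\tilde g(\omega)$:

$\int_0^\infty I(\lambda, \theta)\, B(\lambda)\, d\lambda = 2\pi \int \tilde f(\omega)\, \tilde g(\omega)\, e^{i\omega\theta}\, d\omega.$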

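Substituting here (37), (51) and (55), we arrive at the relation that replaces equality
(52) for one-sided mother wavelets:

$\int \tilde f(\omega)\, \chi(\omega)\, e^{i\omega\theta}\, d\omega = \frac{1}{D} \int_0^\infty \frac{d\lambda}{\lambda^3 A(\lambda)} \int d\tau\, \hat f(\lambda, \tau)\, \psi\!\left(\frac{\theta - \tau}{\lambda}\right).$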

As we have shown in Section 6.6, the Fourier integral on the left-hand side is, up
to the coefficient 1/2, equal to the analytic signal

$F(\theta) = \frac{2}{D} \int_0^\infty \frac{d\lambda}{\lambda^3 A(\lambda)} \int d\tau\, \hat f(\lambda, \tau)\, \psi\!\left(\frac{\theta - \tau}{\lambda}\right) \qquad (56)$

corresponding to the original signal $f(t)$. Remembering that the real part of the
analytic signal coincides with $f(t)$, we arrive at the inversion formula for the
continuous wavelet transform for one-sided mother wavelets:

$f(t) = \frac{2}{D}\, \mathrm{Re} \int_0^\infty \frac{d\lambda}{\lambda^3 A(\lambda)} \int d\tau\, \hat f(\lambda, \tau)\, \psi\!\left(\frac{t - \tau}{\lambda}\right). \qquad (57)$

Example 6. Poisson wavelets. As an example of one-sided mother wavelets
consider

$\psi_m(z) = (1 - iz)^{-m-1}, \quad m > 0, \qquad (58)$

which are called Poisson wavelets. Their Fourier images

$\tilde\psi_m(\omega) = \frac{1}{2\pi} \int \frac{e^{-i\omega z}}{(1 - iz)^{m+1}}\, dz \qquad (59)$

can be calculated by means of the residue method to be

$\tilde\psi_m(\omega) = \frac{1}{\Gamma(m + 1)}\, \omega^m e^{-\omega} \chi(\omega). \qquad (60)$

Poisson wavelets can be used to identify open and hidden singularities of the signal
$f(t)$ and, for $m = 2$, like the Mexican hat, in the search for edges between different
regimes of the original function $f(t)$. Indeed, for $m = 2$, the Poisson wavelet is

$\psi_2(z) = -\frac{1}{2} \frac{d^2}{dz^2} \frac{1}{1 - iz}. \qquad (61)$

Its real part

$\mathrm{Re}\, \psi_2(z) = -\frac{1}{2} \frac{d^2}{dz^2} \frac{1}{1 + z^2}$

has a shape similar to that of the Mexican hat and possesses, for $\lambda \to 0$, the same
differentiating properties. •
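The identity for the real part of the $m = 2$ Poisson wavelet is easy to confirm numerically. The following is a minimal sketch, not part of the original text: the helper name `poisson_wavelet`, the grid, and the use of a central second difference in place of the exact derivative are all choices made for the illustration.

```python
import numpy as np

def poisson_wavelet(z, m):
    """Poisson wavelet (58): psi_m(z) = (1 - i z)^(-m-1)."""
    return (1 - 1j * z) ** (-(m + 1))

z = np.linspace(-5.0, 5.0, 2001)
h = z[1] - z[0]
lorentz = 1.0 / (1.0 + z**2)

# Central second difference approximates d^2/dz^2 at the interior grid points
second_derivative = (lorentz[2:] - 2 * lorentz[1:-1] + lorentz[:-2]) / h**2
from_formula = -0.5 * second_derivative

# Direct evaluation of Re psi_2 at the same interior points
direct = poisson_wavelet(z[1:-1], 2).real

max_error = np.max(np.abs(from_formula - direct))
print("largest discrepancy:", max_error)
print("value at z = 0:", direct[len(direct) // 2], " minimum:", direct.min())
```

The real part equals 1 at the origin and dips below zero on both sides, which is the sign-changing, Mexican-hat-like shape referred to above.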
Remark 4. The above derivation indicates that the necessary condition for the existence
of the inversion formulas is finiteness of the coefficient $D$ (49), or equivalently,
the inequality

$\int |\tilde\psi(\kappa)|^2\, \frac{d\kappa}{|\kappa|} < \infty. \qquad (62)$

Mother wavelets satisfying condition (62) are called admissible wavelets. The
complex-valued Morlet wavelet (9-10) is not admissible since its Fourier image
does not vanish for $\omega = 0$ and, consequently, the integral (62) diverges. Nevertheless,
in practice this is not a serious obstacle. First of all, the inversion formula is
not always needed, and, secondly, for sufficiently large values of the "goodness"
parameter $Q$, the Fourier image of the Morlet wavelet takes a very small value at
$\omega = 0$, and it is not difficult to adjust it a little bit to make it admissible.
Remark 5. It follows from condition (62) that the Fourier images of admissible
wavelets satisfy the condition $\tilde\psi(\omega = 0) = 0$, which is equivalent to the condition

$\int \psi(z)\, dz = 0. \qquad (63)$

This in turn implies that any admissible wavelet has to have an oscillatory (sign-changing)
nature, which provides a partial explanation of the term wavelet.
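Condition (63) can be illustrated with a short numerical check. The sketch below assumes the common form $(1 - z^2)\exp(-z^2/2)$ for the Mexican hat (the text's own definition (15-16) may differ by a constant factor) and contrasts it with a plain Gaussian, which never changes sign and therefore cannot satisfy (63).

```python
import numpy as np

z = np.linspace(-12.0, 12.0, 48001)
dz = z[1] - z[0]

mexican_hat = (1 - z**2) * np.exp(-z**2 / 2)   # assumed form of the Mexican hat
gaussian = np.exp(-z**2 / 2)                   # positive everywhere, no oscillation

# Riemann sums; accurate here because both functions decay rapidly
mean_hat = mexican_hat.sum() * dz
mean_gauss = gaussian.sum() * dz

print("integral of the Mexican hat:", mean_hat)
print("integral of the Gaussian:   ", mean_gauss)
```

The first integral vanishes to rounding accuracy, while the second equals $\sqrt{2\pi}$, so only the sign-changing function is a candidate for an admissible wavelet.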
Remark 6. The fact that the continuous wavelet image $\hat f(\lambda, \tau)$ of the function
$f(t)$ of a single variable depends on two variables indicates that the continuous
wavelet transform contains redundant information and is overdetermined. One of
the consequences of this fact is that the inversion formula is not unique. This is
easily seen by multiplying (1) not by the mother wavelet, as was done earlier, but
by another function

$\varphi\!\left(\frac{\theta - \tau}{\lambda}\right),$

and integrating the resulting equation over all $\tau$. For the sake of simplicity of the
argument let us assume that $\varphi(z)$ has a one-sided Fourier image, that is $\tilde\varphi(\omega) \equiv 0$ for
$\omega \le 0$. As a result we get an analog of expression (41), the only difference being
that instead of the autocorrelation function (39) one enters the cross-correlation
function

$K(z) = \int \varphi(s)\, \psi^*(s - z)\, ds \qquad (64)$

whose Fourier image is $2\pi \tilde\varphi(\omega) \tilde\psi^*(\omega)$. Replacing $|\tilde\psi(\omega)|^2$ by $\tilde\varphi(\omega) \tilde\psi^*(\omega)$ in all the
preceding formulas, we arrive at an infinite variety of continuous wavelet inversion
formulas:

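$f(t) = \mathrm{Re}\, \frac{2}{D} \int_0^\infty \frac{d\lambda}{\lambda^3 A(\lambda)} \int d\tau\, \hat f(\lambda, \tau)\, \varphi\!\left(\frac{t - \tau}{\lambda}\right), \qquad (65)$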

which are all well defined as long as

(66)

The coefficient $D$ entering formula (65), which is in general complex, is equal to

Notice that the condition (66) can be fulfilled not only for admissible mother
wavelets but also if $\tilde\varphi(\omega)$ converges sufficiently fast to zero as $\omega \to 0+$. This means
that the inversion formula (65) remains valid also in cases when the formulas (52-53)
do not make sense. For example, formula (65) permits recovery of the original
function $f(t)$ from the continuous Morlet wavelet image $\hat f(\lambda, \tau)$ (9-10) if one
takes $\varphi(z)$ to be the Poisson wavelet (58).
The question of how to make wavelet transforms more economical and less
redundant, while preserving their good scale and time localization properties, is
a subtle mathematical problem. For the usual Fourier transform and the Fourier
series the lack of overdetermination, and the uniqueness of the inverse Fourier
transform (or Fourier coefficient sequences), is guaranteed by the Hilbert space $L^2$
orthogonality of complex exponentials (or trigonometric functions) on the interval
$[0, 2\pi]$, that is, by the condition that

$\int_0^{2\pi} e^{imt} e^{-int}\, dt = 0, \quad \text{if } m \ne n.$

The mathematically difficult task of constructing orthogonal wavelet systems will
be discussed at some length in the next three sections.

7.5 Haar wavelets and multiresolution analysis


In this section we will take a look at a special (one can say digital) series representation
for real-valued signals in terms of the so-called Haar wavelets. This
idealized system provides a good and easy introduction to the concepts of wavelet
transforms and multiresolution analysis. Each term of the expansion will provide
information about both the time and the frequency localization of the signal. The
Haar wavelets will be obtained from a single prototype (a mother wavelet) by
translations in time and frequency, although the explicit shift in frequency will be
replaced by a dilation (rescaling, stretching) in time, which is more natural in this case. This
will guarantee that all the wavelets have the same shape. To eliminate redundancy
and overdetermination, we will make the wavelet system orthogonal.

FIGURE 7.5.1
The Haar mother wavelet $\psi(t)$, and a wavelet of order (1,3), $\psi_{1,3}(t)$.

The Haar mother wavelet is defined as follows:

$\psi(t) = \begin{cases} 1, & \text{for } 0 \le t < 1/2; \\ -1, & \text{for } 1/2 \le t < 1; \\ 0, & \text{otherwise.} \end{cases} \qquad (1)$

The Haar wavelet

$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n) \qquad (2)$

of order $(m, n)$, $m, n = \dots, -1, 0, 1, \dots$, is obtained by rescaling (dilating or
compressing) the time in the mother wavelet $\psi(t)$ by a factor of $2^m$ and then
translating the resulting wavelet by an integer multiple $n$ of $2^{-m}$. The dilation
makes the wavelet $\psi_{m,n}(t)$ fit in an interval of length $2^{-m}$, and the translation
places its support in the interval $[2^{-m} n, 2^{-m}(n + 1)]$ (see Fig. 7.5.1). We
will call the parameter $m$ the level of resolution of the wavelet, and the parameter $n$
the location parameter of the wavelet. Then the number $2^{-m}$ can be seen as its
resolution, and $2^{-m} n$ as its location.
The coefficient $2^{m/2}$ in the definition (2) was selected to make all the Haar
wavelets normalized in $L^2(R)$, that is, to compensate for the dilation operation and
guarantee that

$\|\psi_{m,n}\|^2 = \int \psi_{m,n}^2(t)\, dt = 1. \qquad (3)$
It turns out that:

The system of Haar wavelets

$\psi_{m,n}(t), \quad m, n = \dots, -2, -1, 0, 1, 2, \dots \qquad (4)$

is orthogonal, that is

$(\psi_{j,k}, \psi_{m,n}) = \int \psi_{j,k}(t)\, \psi_{m,n}(t)\, dt = 0, \quad \text{if } (j, k) \ne (m, n), \qquad (5)$

and complete in $L^2(R)$. The latter means that any function $f \in L^2(R)$ has an
$L^2$-convergent representation

$f = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} w_{m,n}\, \psi_{m,n}, \qquad (6)$

where, in view of the orthonormality, the expansion coefficients are

$w_{m,n} = w_{m,n}[f] = (f, \psi_{m,n}) = \int f(t)\, \psi_{m,n}(t)\, dt. \qquad (7)$

The above properties of orthogonality and completeness parallel properties of
the trigonometric system of functions on a finite interval (say, $[0, 2\pi]$) which give
rise to the usual Fourier series expansions.
The orthogonality (5) can be shown as follows. For a fixed resolution level
$j = m$, if the location parameters $k$, $n$ are different, then the wavelets $\psi_{j,k}(t)$ and $\psi_{j,n}(t)$
have disjoint supports, and the integral of their product is clearly zero. At different
resolution levels, say $j < m$, either the supports of $\psi_{j,k}(t)$ and $\psi_{m,n}(t)$ are disjoint
and the previous argument applies, or the support of $\psi_{m,n}(t)$ sits entirely within an
interval where $\psi_{j,k}(t)$ is constant (either $+2^{j/2}$ or $-2^{j/2}$), and again the integral
of their product vanishes because

$\int \psi_{j,k}(t)\, \psi_{m,n}(t)\, dt = \pm 2^{j/2} \int \psi_{m,n}(t)\, dt = 0.$ •
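The orthonormality of the Haar system can also be verified numerically for a handful of wavelets. The sketch below is an illustration under stated assumptions: it uses the standard form $2^{m/2}\psi(2^m t - n)$ for the wavelet of order $(m, n)$, the function names `haar_mother` and `haar` are made up for the example, and only a small, arbitrary set of orders is tested. Because all the wavelets involved are constant on the cells of the dyadic grid, the Riemann sums equal the integrals up to rounding.

```python
import numpy as np

def haar_mother(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 otherwise."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

def haar(m, n, t):
    """Haar wavelet of order (m, n): 2^(m/2) * psi(2^m t - n)."""
    return 2.0 ** (m / 2) * haar_mother(2.0 ** m * t - n)

# Dyadic grid on [-8, 8): every wavelet used below is constant on each cell
dt = 2.0 ** -6
t = np.arange(-8.0, 8.0, dt)

orders = [(m, n) for m in (-2, -1, 0, 1, 2) for n in (-2, -1, 0, 1)]
gram = np.array([[np.sum(haar(j, k, t) * haar(m, n, t)) * dt
                  for (m, n) in orders] for (j, k) in orders])

print("Gram matrix equals the identity:", np.allclose(gram, np.eye(len(orders))))
```

The Gram matrix of inner products comes out as the identity: the off-diagonal zeros reflect the orthogonality argument above, and the unit diagonal reflects the normalizing factor $2^{m/2}$.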


The completeness of the Haar wavelet system (4) is more difficult to establish, and
the proof relies on demonstrating that if all the wavelet coefficients $w_{m,n}[f] = 0$,
then the function $f$ is necessarily 0 in $L^2$. We will give a flavor of the proof by showing
that this is indeed the case if $f \in L^1 \cap L^2$. So, assume that $w_{m,n} = 0$ for all $m, n = \dots, -1, 0, 1, \dots$.
Since $w_{0,0} = 0$, then

$\int_0^{1/2} f(t)\, dt = \int_{1/2}^{1} f(t)\, dt = \frac{1}{2} \int_0^1 f(t)\, dt.$

However, since $w_{-1,0} = 0$,

$\int_0^1 f(t)\, dt = \int_1^2 f(t)\, dt = \frac{1}{2} \int_0^2 f(t)\, dt$

and, by induction, for any $n$

$\int_0^{1/2} f(t)\, dt = \frac{1}{2^{n+1}} \int_0^{2^n} f(t)\, dt = \lim_{n \to \infty} \frac{1}{2^{n+1}} \int_0^{2^n} f(t)\, dt = 0$

since we assumed the finiteness of the integral $\int |f(t)|\, dt$ ($f \in L^1$). Clearly, the
same argument can be repeated for any dyadic interval of the form $[2^{-m} n, 2^{-m}(n + 1)]$,
so that, by approximation, for any interval $[a, b]$

$\int_{[a,b]} f(t)\, dt = 0.$

This implies that $f = 0$ in $L^1 \cap L^2$, and the proof of completeness of the Haar
wavelets is done. •

Remark 1. An alert reader will observe the seemingly paradoxical nature of
expansion (6), where an arbitrary square integrable function in $L^1$, for which in
general $\int f(t)\, dt \ne 0$, has an expansion into a series of Haar wavelets for which
$\int \psi_{m,n}(t)\, dt = 0$. The explanation is that the convergence of the series (6) is
in $L^2$ (that is, in the mean square sense), so that the integrals themselves need
not be preserved in the limit. To avoid this phenomenon one sometimes considers
the functions $I_{[0,1]}(t - n)$, $n = \dots, -1, 0, 1, \dots$, in combination with the Haar wavelet
subsystem $\psi_{m,n}$ for $m \ge 0$. We will return to this theme later.

Note that the inner series in the expansion (6) consists of wavelets of fixed
resolution $2^{-m}$, that is, it represents a function with constant values on dyadic
intervals $[2^{-m} n, 2^{-m}(n + 1)]$, $n = \dots, -1, 0, 1, \dots$, and gives the contents of
the function $f$ at the fixed resolution level $m$ (see Fig. 7.5.2). Then the partial sum

$f_{R,S}(t) = \sum_{m=R}^{S} \sum_{n=-\infty}^{\infty} w_{m,n}\, \psi_{m,n}(t) \qquad (8)$

of the expansion (6) gives an approximation of the function $f(t)$ at resolutions finer
than $2^{-R}$ and coarser than $2^{-S}$ (see Fig. 7.5.3).
Thus expansion (6) may be interpreted as a multiresolution analysis of the function
space $L^2(R)$.
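A partial sum of the form (8) is straightforward to compute on a dyadic grid. The following sketch is an illustration, not the authors' computation: it takes $f(t) = t\exp(-t^2/2)$ on $[0, 4)$ with $R = -2$, $S = 2$, assumes the standard form $2^{m/2}\psi(2^m t - n)$ for the wavelets, and approximates the coefficients (7) by Riemann sums. Only wavelets supported inside $[0, 4)$ are needed, since for these levels all other wavelets vanish on that interval.

```python
import numpy as np

def haar(m, n, t):
    """Haar wavelet of order (m, n) sampled on the array t."""
    u = 2.0 ** m * t - n
    shape = ((u >= 0) & (u < 0.5)).astype(float) - ((u >= 0.5) & (u < 1.0)).astype(float)
    return 2.0 ** (m / 2) * shape

dt = 2.0 ** -10
t = np.arange(0.0, 4.0, dt)
f = t * np.exp(-t**2 / 2)

R, S = -2, 2
partial = np.zeros_like(t)
for m in range(R, S + 1):
    for n in range(int(4 * 2.0 ** m)):      # wavelets supported inside [0, 4)
        psi = haar(m, n, t)
        w = np.sum(f * psi) * dt            # coefficient (7) as a Riemann sum
        partial += w * psi

gap = np.mean(f - partial)                  # constant offset between f and the partial sum
residual = np.max(np.abs(f - partial - gap))
print("mean of partial sum:", np.mean(partial))
print("vertical gap:", gap, " residual after removing it:", residual)
```

The partial sum has zero mean on $[0, 4)$ because every Haar wavelet does, so it sits below $f$ by a constant (about 0.25 here); once that constant is added back, what remains is the small error of a piecewise constant approximation on cells of length $1/8$.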
Remark 2. Scaling function. We have already observed in Remark 1 that the
multiresolution analysis of functions in $L^2(R)$ can be accomplished by means of
a slightly different system that starts out with the scaling function

$\varphi(t) = I_{[0,1]}(t)$

and its integer translates

$\varphi_n(t) = \varphi(t - n), \quad n = \dots, -1, 0, 1, \dots,$

and supplements them with Haar functions $\psi_{m,n}(t)$ with nonnegative resolution
levels $m = 0, 1, 2, \dots$ and arbitrary integer location parameter $n$. Note that the
resulting system is still orthonormal and complete, and gives a multiresolution
expansion of a function $f \in L^2(R)$ of the form

$f = \sum_{n=-\infty}^{\infty} w_n\, \varphi_n(t) + \sum_{m=0}^{\infty} \sum_{n=-\infty}^{\infty} w_{m,n}\, \psi_{m,n}, \qquad (9)$

with coefficients

$w_n = w_n(f) = \int f(t)\, \varphi_n(t)\, dt, \qquad (10)$

and $w_{m,n}$ as in formula (7).

FIGURE 7.5.2
Function $f(t) = t\exp(-t^2/2)$ and its contents at resolution levels $m = -2, -1, 0$.

FIGURE 7.5.3
(a) Approximation of the function $f(t) = t\exp(-t^2/2)$ with resolution finer than
$2^2$ and coarser than $2^{-2}$. In view of the definition (1) of the Haar mother
wavelet, $\int_0^4 f_{-2,2}(t)\, dt = 0$, while $\int_0^4 f(t)\, dt \ne 0$, which leaves a vertical gap
between $f$ and its approximation $f_{-2,2}$ (see Remarks 1 and 2). (b) Addition
of the constant $C_R = \sum_{m=-\infty}^{-R-1} w_{m,0}\, 2^{m/2}$ (in our case, $R = 2$) removes the gap.

Remark 3. Self-similar (fractal) properties of Haar wavelets. The crucial observation
for the general theory of wavelets (to be discussed in the next section) is that
the scaling function $\varphi(t)$ (the indicator function of the interval [0,1]) is self-similar
in the sense that it satisfies the scaling relation

$\varphi(t) = \varphi(2t) + \varphi(2t - 1), \qquad (11)$

and that the mother wavelet $\psi(t)$ can be obtained from the scaling function via the
formula

$\psi(t) = \varphi(2t) - \varphi(2t - 1). \qquad (12)$

The scaling relation (11) asserts that the scaling function is a certain linear combination
of its own dilations and translations. It completely characterizes the indicator
function $\varphi(t)$ up to a constant multiplier. Indeed, given values of $\varphi(t)$ at $t = 0$
and 1, the scaling relation (11) permits computation of values of $\varphi$ at all dyadic
rationals, i.e., real numbers of the form $2^{-m} n$.
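Both relations of this remark can be checked pointwise on a dyadic grid. The sketch below makes one assumption that the text leaves open: the indicator is taken on the half-open interval $[0, 1)$, so that the two dilated copies do not overlap at $t = 1/2$. The function names are chosen for the illustration.

```python
import numpy as np

def phi(t):
    """Indicator of the half-open interval [0, 1) (endpoint convention assumed)."""
    t = np.asarray(t, dtype=float)
    return ((t >= 0) & (t < 1)).astype(float)

def psi(t):
    """Haar mother wavelet built from the scaling function: phi(2t) - phi(2t - 1)."""
    return phi(2 * t) - phi(2 * t - 1)

t = np.arange(-2.0, 3.0, 2.0 ** -7)

# Scaling relation: phi(t) = phi(2t) + phi(2t - 1)
scaling_ok = np.array_equal(phi(t), phi(2 * t) + phi(2 * t - 1))

# The wavelet built this way is +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere
expected_psi = np.where((t >= 0) & (t < 0.5), 1.0,
                        np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))
wavelet_ok = np.array_equal(psi(t), expected_psi)

print("scaling relation holds:", scaling_ok)
print("wavelet formula holds: ", wavelet_ok)
```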

7.6 Continuous Daubechies' wavelets


The Haar wavelets discussed in the previous section enjoyed many useful properties
such as orthonormality, completeness, compact support and self-similarity
but, as elegant as was their construction, they were anything but smooth. As a
matter of fact they were not even continuous, a property important in many applications.
So, in the present section we will explore the possibility of constructing
smoother wavelets.
Since the scaling relation (7.5.11) characterizes the indicator scaling function
$I$ and thus the Haar wavelets, more complex scaling relations will have to be
allowed. It turns out that one can find smooth scaling functions which satisfy a
scaling relation

$\varphi(t) = \sum_{k=0}^{N} a_k\, \varphi(2t - k)$

for some positive integer $N > 2$ and coefficients $a_k$ (by (7.5.11), $N = 2$ was
necessary and sufficient for Haar wavelets). Then the mother wavelet can be
selected to be

$\psi(t) = \sum_{k=0}^{N} (-1)^k a_{N-k}\, \varphi(2t - k),$

and the corresponding wavelet system can be built with its help via formula (7.5.2).
Such an approach was suggested by Ingrid DAUBECHIES in 1988, and the resulting
wavelets are called Daubechies wavelets.
Conceptually, the above construction is a clear-cut generalization of the construction
of Haar wavelets from the scaling function $I$ provided in the previous section.
However, for $N > 2$, the selection of coefficients $a_k$ becomes highly nontrivial.
Also, as a rule, the smoother one wants the wavelets to be, the larger the $N$
one has to take.
Below, we provide a sketch of the relatively simple construction of continuous
Daubechies wavelets which is due to David POLLEN (1992). Their scaling function
$\varphi(t)$ satisfies the scaling relation

$\varphi(t) = a\varphi(2t) + (1 - \bar a)\varphi(2t - 1) + (1 - a)\varphi(2t - 2) + \bar a\varphi(2t - 3), \qquad (1)$

where

$a = \frac{1 + \sqrt 3}{4} \qquad (2)$

and where, for real numbers of the form $\alpha + \beta\sqrt 3$ with (dyadic) rational $\alpha$, $\beta$, the
overline indicates the "conjugation" operation

$\overline{\alpha + \beta\sqrt 3} = \alpha - \beta\sqrt 3.$

The support of the resulting $\varphi(t)$ is contained in the interval [0, 3] and, additionally,

$\sum_{k=-\infty}^{\infty} \varphi(k) = 1. \qquad (3)$

Assume that there exists a scaling function $\varphi(t)$ supported by [0,3] and satisfying
(1) and (3) for integer values of the argument $t$. The scaling relation (1) written
for $t = 0, 1, 2, 3$ becomes a matrix equation

$\begin{pmatrix} \varphi(0) \\ \varphi(1) \\ \varphi(2) \\ \varphi(3) \end{pmatrix} = \begin{pmatrix} a & 0 & 0 & 0 \\ 1 - a & 1 - \bar a & a & 0 \\ 0 & \bar a & 1 - a & 1 - \bar a \\ 0 & 0 & 0 & \bar a \end{pmatrix} \begin{pmatrix} \varphi(0) \\ \varphi(1) \\ \varphi(2) \\ \varphi(3) \end{pmatrix}$

which, in view of condition (3), has exactly one solution:

$\varphi(0) = 0, \quad \varphi(1) = \frac{1 + \sqrt 3}{2}, \quad \varphi(2) = \frac{1 - \sqrt 3}{2}, \quad \varphi(3) = 0.$

Starting with these prescribed values and using the scaling relation (1) one can
produce values of the scaling function $\varphi(t)$ for any dyadic rational $t$. For example,

$\varphi(1/2) = \frac{2 + \sqrt 3}{4}, \quad \varphi(3/2) = 0, \quad \varphi(5/2) = \frac{2 - \sqrt 3}{4},$

and so on.
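The computation of values at dyadic rationals can be sketched as a short recursion. The code below is an illustration under stated assumptions: it takes the four coefficients of the scaling relation to be $a$, $1 - \bar a$, $1 - a$, $\bar a$ with $\bar a = (1 - \sqrt 3)/4$, starts from the integer values quoted above, and works in floating point rather than exactly; the names `phi`, `COEFFS` and `INTEGER_VALUES` are made up for the example.

```python
import math
from fractions import Fraction
from functools import lru_cache

SQRT3 = math.sqrt(3.0)
A = (1 + SQRT3) / 4              # the coefficient a
A_BAR = (1 - SQRT3) / 4          # its "conjugate"
COEFFS = (A, 1 - A_BAR, 1 - A, A_BAR)
INTEGER_VALUES = {1: (1 + SQRT3) / 2, 2: (1 - SQRT3) / 2}

@lru_cache(maxsize=None)
def phi(t):
    """Scaling function at a dyadic rational t, given as a Fraction."""
    if t <= 0 or t >= 3:
        return 0.0
    if t.denominator == 1:
        return INTEGER_VALUES[int(t)]
    # Scaling relation: phi(t) = sum_k c_k phi(2t - k); 2t - k has a smaller denominator
    return sum(c * phi(2 * t - k) for k, c in enumerate(COEFFS))

for t in (Fraction(1, 2), Fraction(3, 2), Fraction(5, 2)):
    print(t, phi(t))

t = Fraction(3, 8)
print("phi(t) + phi(t+1) + phi(t+2) at t = 3/8:", phi(t) + phi(t + 1) + phi(t + 2))
```

Each call halves the denominator of the argument, so the recursion terminates at the integers; the last line checks that the three translates covering a point of $[0, 1]$ sum to 1.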
The values of $\varphi(t)$ for dyadic $t$ are clearly of the form $\alpha + \beta\sqrt 3$ with dyadic $\alpha$
and $\beta$. One can also prove (see the Bibliographical Notes) that they satisfy
two extended partition of unity formulas (see also (3))

$\sum_{k=-\infty}^{\infty} \varphi(t - k) = 1$

and

$\sum_{k=-\infty}^{\infty} \left( \frac{3 - \sqrt 3}{2} + k \right) \varphi(t - k) = t.$

Since the support of $\varphi(t)$ is contained in [0,3], the above properties also give the
interval translation properties for dyadic $t \in [0, 1]$:

$2\varphi(t) + \varphi(t + 1) = t + \frac{1 + \sqrt 3}{2},$

$2\varphi(t + 2) + \varphi(t + 1) = -t + \frac{3 - \sqrt 3}{2},$

$\varphi(t) - \varphi(t + 2) = t + \frac{-1 + \sqrt 3}{2}.$

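Combining them with the scaling relation (1) gives the scaling relations for dyadic
$t \in [0, 1]$:

$\varphi\!\left(\frac{0 + t}{2}\right) = a\varphi(t);$

$\varphi\!\left(\frac{1 + t}{2}\right) = \bar a\varphi(t) + at + \frac{2 + \sqrt 3}{4};$

$\varphi\!\left(\frac{2 + t}{2}\right) = a\varphi(1 + t) + \bar a t + \frac{\sqrt 3}{4};$

$\varphi\!\left(\frac{3 + t}{2}\right) = \bar a\varphi(1 + t) - at + \frac{1}{4}; \qquad (4)$

$\varphi\!\left(\frac{4 + t}{2}\right) = a\varphi(2 + t) - \bar a t + \frac{3 - 2\sqrt 3}{4};$

$\varphi\!\left(\frac{5 + t}{2}\right) = \bar a\varphi(2 + t).$

Compared with the original scaling relation (1), they have a clear advantage: the
values of $\varphi(t)$ at the next resolution level depend only on one value at the previous
resolution level (instead of four in (1)).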
The above formulas form a basis for the following recursive construction of
the continuous version of the scaling function on the whole interval [0,3]. Start
with the function $g_0(t)$ which is equal to $\varphi(t)$ at the integers 0, 1, 2, 3, and which linearly
interpolates $\varphi$ in between these integers. Clearly, $g_0(t)$ is continuous. In the next
step, form $g_1(t)$ at the second resolution level by applying the (right-hand sides
of) scaling relations (4) to $g_0$. More precisely, for $t \in [0, 1]$, define

$g_1\!\left(\frac{0 + t}{2}\right) = a g_0(t);$

$g_1\!\left(\frac{1 + t}{2}\right) = \bar a g_0(t) + at + \frac{2 + \sqrt 3}{4};$

$g_1\!\left(\frac{2 + t}{2}\right) = a g_0(1 + t) + \bar a t + \frac{\sqrt 3}{4};$

$g_1\!\left(\frac{3 + t}{2}\right) = \bar a g_0(1 + t) - at + \frac{1}{4};$

$g_1\!\left(\frac{4 + t}{2}\right) = a g_0(2 + t) - \bar a t + \frac{3 - 2\sqrt 3}{4};$

$g_1\!\left(\frac{5 + t}{2}\right) = \bar a g_0(2 + t).$

Outside [0, 3] set $g_1(t) = 0$. The function $g_1(t)$ is continuous and coincides with
$\varphi(t)$ at dyadic points with resolution $2^{-1}$ (in between, it again provides a linear
interpolation). Continuing this procedure we obtain a sequence $g_n$ of continuous,
piecewise linear functions (zero outside [0,3]) which agree with $\varphi(t)$ at dyadic
points of the form $k 2^{-n}$.

FIGURE 7.6.1
Values of the Daubechies' scaling function computed at dyadic points
$t = n \cdot 2^{-6}$, $0 \le t \le 3$, via the scaling relation (1).

Notice that $|g_n(t)| \le 3$ for all $n = 1, 2, \dots$, and since $0 \le |\bar a| \le a < 1$
(see (2)), we get that

$\max_t |g_k(t) - g_{k+j}(t)| \le a^k \max_t |g_0(t) - g_j(t)| \le 6 a^k.$

Hence the sequence of functions $g_k(t)$ satisfies the Cauchy condition uniformly,
and the limit

$\varphi(t) = \lim_{n \to \infty} g_n(t)$

is a continuous function. This is the scaling function we were searching for.
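The recursive construction can be sketched in a few lines if one tracks only the values of $g_n$ at the dyadic points $k 2^{-n}$ (the linear interpolation in between is left implicit). The code below is an illustration under stated assumptions: the six one-term relations are used with coefficients $a$ and $\bar a = (1 - \sqrt 3)/4$ placed as $a, \bar a, a, \bar a, a, \bar a$ on the six half-intervals, and the function name `refine` is made up for the example.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
A, A_BAR = (1 + SQRT3) / 4, (1 - SQRT3) / 4

def refine(g):
    """One step g_n -> g_{n+1}, acting on samples at t = k/N, k = 0..3N."""
    N = (len(g) - 1) // 3
    t = np.arange(N + 1) / N                       # dyadic points of [0, 1]
    g0, g1, g2 = g[:N + 1], g[N:2 * N + 1], g[2 * N:]
    pieces = [
        A * g0,                                    # values on [0, 1/2]
        A_BAR * g0 + A * t + (2 + SQRT3) / 4,      # values on [1/2, 1]
        A * g1 + A_BAR * t + SQRT3 / 4,            # values on [1, 3/2]
        A_BAR * g1 - A * t + 1 / 4,                # values on [3/2, 2]
        A * g2 - A_BAR * t + (3 - 2 * SQRT3) / 4,  # values on [2, 5/2]
        A_BAR * g2,                                # values on [5/2, 3]
    ]
    # Neighboring pieces share an endpoint; keep it only once
    return np.concatenate([p[:-1] for p in pieces[:-1]] + [pieces[-1]])

g = np.array([0.0, (1 + SQRT3) / 2, (1 - SQRT3) / 2, 0.0])   # values at 0, 1, 2, 3
for _ in range(6):
    g = refine(g)

N = (len(g) - 1) // 3
print("grid step:", 1 / N, " value at t = 1/2:", g[N // 2])
print("partition of unity on [0, 1]:", np.allclose(g[:N + 1] + g[N:2 * N + 1] + g[2 * N:], 1.0))
```

Six refinements give samples at the points $n \cdot 2^{-6}$, the resolution used in Fig. 7.6.1; each new value depends on a single value from the previous level, which is the advantage of the one-term relations noted above.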


Remark 1. Note that the scaling function $\varphi(t)$ is not differentiable because

$\lim_{j \to \infty} \frac{\varphi(2^{-j}) - \varphi(0)}{2^{-j}} = \lim_{j \to \infty} \frac{\varphi(2^{-j})}{2^{-j}} = \lim_{j \to \infty} \frac{a^j \varphi(1)}{2^{-j}} = \lim_{j \to \infty} (2a)^j \varphi(1) = \infty,$

since $2a > 1$ and $\varphi(1) \ne 0$.

With some additional work one can now establish that

$\int \varphi(t)\, dt = 1,$

and that the integer translations of $\varphi(t)$ form an orthonormal system, that is

$\int \varphi(t)\, \varphi(t - k)\, dt = \begin{cases} 0, & \text{if } k \ne 0; \\ 1, & \text{if } k = 0. \end{cases}$

FIGURE 7.6.2
Values of the Daubechies' mother wavelet computed at dyadic points
$t = n \cdot 2^{-6}$, $0 \le t \le 3$, via the formula (5).

Following the general scheme explained in detail for the Haar wavelets in Section
7.5, we can now define the mother wavelet $\psi(t)$ via the equality

$\psi(t) = -\bar a\varphi(2t) + (1 - a)\varphi(2t - 1) - (1 - \bar a)\varphi(2t - 2) + a\varphi(2t - 3), \qquad (5)$

and check that

$\int \psi(t)\, dt = 0.$

The integer shifts of the mother wavelet are orthonormal, that is

$\int \psi(t)\, \psi(t - k)\, dt = \begin{cases} 0, & \text{if } k \ne 0; \\ 1, & \text{if } k = 0. \end{cases}$

Moreover, the scaling function $\varphi$ and the mother wavelet $\psi$ are orthogonal as well,
that is

$\int \varphi(t)\, \psi(t - k)\, dt = 0.$

Thus, again, by an argument similar to that used for the Haar wavelets, the set
of Daubechies wavelets

$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n), \quad m, n = \dots, -1, 0, 1, \dots,$

forms a complete orthonormal basis in $L^2(R)$, and so does the set of functions

$\varphi_n(t) = \varphi(t - n), \quad n = \dots, -1, 0, 1, \dots,$

$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n), \quad m = 0, 1, 2, \dots, \quad n = \dots, -1, 0, 1, \dots.$
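The orthogonality statements of this section have simple counterparts at the level of the four coefficients, which can be checked by hand or by machine. The sketch below is an illustration under stated assumptions: the scaling coefficients are taken as $h = (a, 1 - \bar a, 1 - a, \bar a)$ and the wavelet coefficients as $g = (-\bar a, 1 - a, -(1 - \bar a), a)$, with $\bar a = (1 - \sqrt 3)/4$. The identities tested are the usual necessary conditions for a relation of the form $\varphi(t) = \sum_k h_k \varphi(2t - k)$; they do not by themselves prove the orthonormality of the functions.

```python
import math

SQRT3 = math.sqrt(3.0)
a, a_bar = (1 + SQRT3) / 4, (1 - SQRT3) / 4

h = [a, 1 - a_bar, 1 - a, a_bar]             # scaling coefficients (assumed placement of overlines)
g = [-a_bar, 1 - a, -(1 - a_bar), a]         # wavelet coefficients (assumed placement of overlines)

sum_h = sum(h)                               # expected 2: integral of phi is preserved
sum_g = sum(g)                               # expected 0: integral of psi vanishes
norm_h = sum(x * x for x in h)               # expected 2: phi has unit norm
shift_h = h[0] * h[2] + h[1] * h[3]          # expected 0: phi(t) orthogonal to phi(t - 1)
norm_g = sum(x * x for x in g)               # expected 2: psi has unit norm
shift_g = g[0] * g[2] + g[1] * g[3]          # expected 0: psi(t) orthogonal to psi(t - 1)
cross_0 = sum(x * y for x, y in zip(h, g))   # expected 0: phi(t) orthogonal to psi(t)
cross_1 = h[2] * g[0] + h[3] * g[1]          # expected 0: phi(t) orthogonal to psi(t - 1)

print(sum_h, sum_g, norm_h, shift_h, norm_g, shift_g, cross_0, cross_1)
```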

7.7 Wavelets and distributions


The scaling relation

$\Phi(t) = 2^{1/2} \sum_k a_k\, \Phi(2t - k) \qquad (1)$

can also have distributional, rather than just function, solutions (following our
convention we denote distributions by capital letters). In the most trivial case
when (1) has only one nonzero term, say $a_0 = \sqrt 2$, such a solution may be guessed
immediately:

$\Phi(t) = \delta(t),$

since $\delta(t) = 2\delta(2t)$.
However, in the case of the scaling relation (1) with finitely many (but at least
two) nonzero coefficients $a_k$, the scaling distribution $\Phi$ can no longer be a linear
combination of Dirac deltas, that is, of the form

$\Phi(t) = \sum_{k=0}^{n} c_k\, \delta(t - t_k). \qquad (2)$

This can be seen as follows. Suppose that the nonzero coefficients are $a_0$ and $a_1$,
and perhaps some others. Since the support of $\Phi$ must be contained in the interval
[0, 1], we may take $0 \le t_0 < t_1 < \dots < t_n \le 1$ in (2). The scaling relation (1)
then forces the equation

$\sum_{k=0}^{n} c_k\, \delta(t - t_k) = 2^{1/2} \sum_{k=0}^{n} \left[ a_0 \frac{c_k}{2}\, \delta\!\left(t - \frac{t_k}{2}\right) + a_1 \frac{c_k}{2}\, \delta\!\left(t - \frac{t_k + 1}{2}\right) + \dots \right].$

Comparing coefficients for the same Dirac deltas on either side of the above equation,
we get in the case $t_0 = 0$ that

$c_0 = 2^{1/2} a_0 \frac{c_0}{2},$

and since $a_0 \ne 2^{1/2}$ we get that $c_0 = 0$. If $t_0 > 0$ then

so that again $c_0 = 0$. By the same argument we get that $c_1 = c_2 = \dots = c_n = 0$
because $t_k/2 < t_k < (t_k + 1)/2$. So, there are no solutions of the form (2).
The situation is better if infinitely many nonzero coefficients are permitted in
the scaling relation (1), and we will indicate some avenues that can be pursued in
such a case.
One option is to seek a solution $\Phi \in S'$ of (1) which is a distribution with
compact support. Then its Fourier transform $\tilde\Phi(\omega)$ is an analytic function in the
entire complex plane C, and the scaling relation (1) translates into the following
relation for $\tilde\Phi(\omega)$:

$\tilde\Phi(\omega) = 2^{-1/2} \sum_k a_k \exp(-i\omega k/2)\, \tilde\Phi(\omega/2).$

We, however, will not follow this route, and instead will construct the scaling distribution
(and the corresponding wavelets) by demanding that its integer translates
form an orthonormal basis at the zero resolution level.
A mathematical aside: orthogonality of distributions.² To make a rigorous discussion of the
orthogonality of distributions possible we have to select an inner product of (at least some) distributions.
One such possibility is the inner product

$(T, S) = \frac{1}{2\pi} \int \frac{\tilde T(\omega)\, \tilde S^*(\omega)}{\omega^2 + 1}\, d\omega, \qquad (3)$

² This material may be skipped by the first-time reader.

which is well defined for all the distributions in the Sobolev space

$\mathcal{H} := \left\{ f \in S' : \|f\|^2 = \int \frac{|\tilde f(\omega)|^2}{\omega^2 + 1}\, d\omega < \infty \right\}, \qquad (4)$

which is a subspace of the space of tempered distributions $S'$. Clearly, the Dirac delta $\delta(t) \in \mathcal{H}$
since its Fourier transform is identically equal to $1/2\pi$.
Now, our job is to construct a multiresolution analysis of the Sobolev space $\mathcal{H}$, and the first
step is to find a scaling distribution $\Phi$, the integer translates of which would form an orthonormal basis
at the zero resolution level, that is for

$V_0 = \{ f \in \mathcal{H} : \mathrm{supp}\, f \subset Z \}.$

Distributions in $V_0$ are of the form

$T(t) = \sum_k c_k\, \delta(t - k). \qquad (5)$

Any distribution in $\mathcal{H}$ can then be approximated by partial sums of dilations of (5).
The first difficulty one encounters is that the usual dilations do not preserve the norm $\|.\|$ in $\mathcal{H}$.
So to give ourselves some leeway, we will allow use of other inner products in $\mathcal{H}$, and introduce
the family of inner products

$(T, S)_\alpha = \frac{1}{2\pi} \int \frac{\tilde T(\omega)\, \tilde S^*(\omega)}{\omega^2 + \alpha^2}\, d\omega, \qquad (6)$

parameterized by the parameter $\alpha > 0$. They all generate norms $\|.\|_\alpha$ equivalent to the original norm
$\|.\| = \|.\|_1$, and the orthogonality notions for all of them are equivalent. Their role will become
clear later on.
If $\Phi$ is of the form (5), then the orthonormality condition gives that

Thus $\delta_{0k}$ are Fourier coefficients (in $L^2(R)$) of a function appearing as a fraction in the last integral,
which therefore must be equal (in $L^2(R)$) to its Fourier series


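Since $\tilde\Phi$ has period $2\pi$, the desired scaling function in $V_0 \subset \mathcal{H}$ has the Fourier transform

$\tilde\Phi(\omega, \alpha) = \left[ \sum_n \frac{1}{(\omega + 2\pi n)^2 + \alpha^2} \right]^{-1/2} = \left[ \frac{2\alpha(\cosh\alpha - \cos\omega)}{\sinh\alpha} \right]^{1/2}. \qquad (7)$

That the sum of the above infinite series is as indicated can be seen as follows (see also formula
(8.7.7)).
By expanding $\exp(-ixt)$ on [0,1] in the Fourier series, and evaluating it at $t = 0$, we get that

$\sum_{n=-\infty}^{\infty} \frac{1}{x + 2\pi n} = \frac{\cos(x/2)}{2\sin(x/2)}$

so that

$\sigma(\omega, \alpha) := \sum_n \frac{1}{(\omega + 4\pi n)^2 + \alpha^2}$

$= \frac{i}{4\alpha} \left[ \sum_n \frac{1}{(\omega + \alpha i)/2 + 2\pi n} - \sum_n \frac{1}{(\omega - \alpha i)/2 + 2\pi n} \right]$

$= \frac{i}{8\alpha} \left[ \frac{\cos((\omega + i\alpha)/4)}{\sin((\omega + i\alpha)/4)} - \frac{\cos((\omega - i\alpha)/4)}{\sin((\omega - i\alpha)/4)} \right]$

$= \frac{i}{8\alpha}\, \frac{\sin(-i\alpha/2)}{\sin((\omega + i\alpha)/4)\, \sin((\omega - i\alpha)/4)}$

$= \frac{1}{4\alpha}\, \frac{\sinh(\alpha/2)}{\cosh(\alpha/2) - \cos(\omega/2)}.$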
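The closed form of the lattice sum appearing in (7) can be confirmed numerically by truncating the series. The sketch below is only a check, with arbitrarily chosen test values of $\omega$ and $\alpha$ and a truncation length `terms` picked so that the neglected tail (which decays like the reciprocal of the number of terms) is well below the tolerance; the function names are made up for the example.

```python
import numpy as np

def lattice_sum(omega, alpha, terms=200000):
    """Truncated sum over n of 1 / ((omega + 2 pi n)^2 + alpha^2)."""
    n = np.arange(-terms, terms + 1)
    return np.sum(1.0 / ((omega + 2 * np.pi * n) ** 2 + alpha ** 2))

def closed_form(omega, alpha):
    """sinh(alpha) / (2 alpha (cosh(alpha) - cos(omega)))."""
    return np.sinh(alpha) / (2 * alpha * (np.cosh(alpha) - np.cos(omega)))

for omega, alpha in [(0.7, 0.5), (2.0, 1.0), (-1.3, 0.25)]:
    print(omega, alpha, lattice_sum(omega, alpha), closed_form(omega, alpha))
```

The two columns agree to about six digits, the accuracy allowed by the truncation; the reciprocal square root of this quantity is the Fourier transform of the scaling distribution.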

The Fourier transform $\tilde\Phi$ given by (7) cannot be simply inverted. However, it turns out that
the construction of the corresponding mother wavelet distribution $\Psi$ leads to an invertible Fourier
transform. Indeed, since

$\Psi(t) = \sum_k d_k \sqrt 2\, \Phi(2t - k) \qquad (8)$

for some coefficients $d_k$, and since the integer translations of the mother wavelet have to be
orthonormal and orthogonal to the scaling function $\Phi$ (that is, orthogonal to all integer translations
of the Dirac delta), we obtain the following conditions on the Fourier transform $\tilde\Psi$:

In view of the scaling relation (8), $\tilde\Psi$ has period $4\pi$, and both series can be simplified by separating
odd and even terms, which leads to equations

$|\tilde\Psi(\omega)|^2 \sigma(\omega, \alpha) + |\tilde\Psi(\omega + 2\pi)|^2 \sigma(\omega + 2\pi, \alpha) = 1,$

$\tilde\Psi(\omega)\, \sigma(\omega, \alpha) + \tilde\Psi(\omega + 2\pi)\, \sigma(\omega + 2\pi, \alpha) = 0.$

Their easily identifiable solution is

$\tilde\Psi(\omega, \alpha) = e^{-i\omega/2} \left( \frac{\sigma(\omega + 2\pi, \alpha)}{\sigma(\omega, \alpha)\, (\sigma(\omega + 2\pi, \alpha) + \sigma(\omega, \alpha))} \right)^{1/2} = e^{-i\omega/2} \left( \cosh(\alpha/2) - \cos(\omega/2) \right) \left( \frac{4\alpha}{\sinh\alpha} \right)^{1/2}$

(taking just the positive root does not work here as it did in (7), since it does not satisfy the second
of the above pair of equations).

For $\alpha = 2^{-m}$ (and this is just what we need for the construction of a wavelet
basis in $\mathcal{H}$) the easily evaluated inverse transform of $\tilde\Psi(\omega, \alpha)$ gives the following
formula for the mother wavelet

$\Psi(t, 2^{-m}) = C \left[ 2\cosh(2^{-m-1})\, \delta(t - 1/2) - \delta(t) - \delta(t - 1) \right]$

for a certain constant $C$.

Now, we are finally ready for the construction of the orthonormal wavelet basis
in the Sobolev distribution space $\mathcal{H}$, using the mother wavelet $\Psi$ as the starting
point. Here, the advantage of using the family of norms $\|.\|_\alpha$ becomes obvious.
As we observed earlier, the classical rescaling

does not preserve the norm $\|.\|$, but

Therefore, the correct rescaling in $\mathcal{H}$ which produces the first level resolution
wavelet is

and the complete multiresolution wavelet basis for $\mathcal{H}$ is provided by the integer
translations of the scaling distribution

$\Phi_n(t) = \Phi(t - n), \quad n = \dots, -1, 0, 1, \dots,$

and the wavelets

$\Psi_{m,n}(t) = 2^{3m/2}\, \Psi(2^m t - n, 2^{-m}), \quad n = \dots, -1, 0, 1, \dots, \quad m = 0, 1, \dots,$

or, alternatively, by the wavelet system

complemented by linear combinations of the Dirac delta $\delta$.

Remark 1. One can also expand distributions in weakly convergent series
of smooth function wavelets. In this case, the typical result is that a tempered
distribution which is the $r$-th derivative of a function (measure) of polynomial
growth has a wavelet expansion with the coefficients $a_n = O(|n|^k)$, for some
integer $k$.
More precisely, assume that the scaling function $\varphi$ is in $S_r$, that is, it has $r$
continuous and rapidly decreasing derivatives. In other words,

$$|\varphi^{(k)}(t)| \le C_{p,k}(1 + |t|)^{-p}, \quad k = 0, 1, \dots, r, \quad p \in \mathbf{N}, \quad t \in \mathbf{R}.$$

The spaces $S_r$ contain the space $S$ of rapidly decreasing test functions. Then the
mother wavelet $\psi$ is also in $S_r$, and for any tempered distribution $T \in S_r'$ (of order
$r$) the expansion with respect to $\varphi(t - n)$ and $\psi(t - n)$ exists and the coefficients
$a_n = O(|n|^k)$ for some integer $k$.
As a result, for $T \in S_r'$, we have the usual wavelet expansion

where

and where the convergence is in $S_r'$ or, alternatively,

$$T(t) = \sum_n a_n \varphi(t - n) + \sum_{m \ge 0,\, n} b_{m,n} \psi_{m,n}(t).$$

7.8 Exercises
Function spaces:
1. Prove the inequality $\int |f(t)g(t)|\,dt \le \|f\|\,\|g\|$ which was used, among
others, in the proof of the uncertainty principle in Section 7.2. Compare with the Schwartz
inequality (7.1.9).
2. Prove the triangle inequality $\|f + g\| \le \|f\| + \|g\|$ for the functional Hilbert space
$L^2$.
Windowed Fourier transform:
3. Let $\tilde f(\omega, \tau)$ be the windowed Fourier transform of the signal $f(t)$. Denote by
$\tilde f'(\omega, \tau)$ the windowed Fourier transform of the derivative $f'(t)$. Express $\tilde f'$ in terms of
$\tilde f$.
4. Signal $x(t)$ is a solution of the differential equation

$$\frac{dx(t)}{dt} + h\,x(t) = f(t),$$

where $f(t)$ is a signal with known windowed Fourier image $\tilde f(\omega, \tau)$ and $h$ is a (real or
complex) constant. Express the windowed Fourier image of $x(t)$ in terms of $\tilde f(\omega, \tau)$.
5. Let $\tilde f(\omega, \tau)$ be the windowed Fourier image of signal $f(t)$ (i.e., $f(t) \mapsto \tilde f(\omega, \tau)$).
Find the windowed Fourier images of signals (a) $f(t)e^{i\omega_0 t}$, and (b) $f(t + \theta)$.
6. Find the windowed Fourier transform $\tilde f(\omega, \tau)$ of function $f(t) = e^{\nu t}$.

7. Utilizing results of the previous exercises, find the windowed Fourier image of the
signal $f(t) = e^{\nu t} \cos \omega_0 t$.
8. Assume that the values of the windowed Fourier image $\tilde f(\omega, \tau)$ of signal $f(t)$ are
known outside the interval $\tau \in [0, T]$ only. Is it possible, on the basis of this incomplete
information, to recover values of signal $f(t)$ for all values of $t$?
Wavelets:
9. Provide the expression for coefficient $D$ in (7.4.49) if one employs definition (7.3.15)
of the Fourier transform, which is more popular in the mathematical literature.
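10. The distributional relation

$$\frac{1}{D} \int_0^\infty K\!\left(\frac{s}{\lambda}\right) \frac{d\lambda}{\lambda^2} = \delta(s) \qquad (1)$$

played the principal role in derivation of formulas for the inverse continuous wavelet
transform. In practice it is impossible to carry out the integration all the way to $\lambda = 0$.
Show that the regularized integral

$$g_\varepsilon(s) = \frac{1}{D} \int_\varepsilon^\infty K\!\left(\frac{s}{\lambda}\right) \frac{d\lambda}{\lambda^2}$$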

converges weakly to the Dirac delta as $\varepsilon \to 0$.


11. Find an explicit formula for function $\Phi(x) = ((6 - x^2)/(8\sqrt{\pi})) \exp(-x^2/4)$ (see
answer to the Exercise 10) in the case of the Mexican hat mother wavelet (7.4.15).
12. Obtain, in the case of two-sided mother wavelets, a formula connecting 1/12 and
1/12, analogous to the Parseval formula for the ordinary Fourier transform.
13. Denote by $\tilde f_a(\lambda, \tau)$ the continuous wavelet image of signal $f(at)$, $a > 0$,
compressed ($a > 1$) or dilated ($a < 1$) in comparison with the original signal $f(t)$. Find out
how $\tilde f_a(\lambda, \tau)$ is related to the continuous wavelet image of signal $f(t)$ itself in the case
of wavelet transform definition (7.4.5).
14. Find $\tilde f(\lambda, \tau)$ (7.4.1) for the self-similar signal $f(t) = |t|^\alpha$.
15. Let

$$f(t) = (5 - 9t + 4t^2)/(5 - 12t + 8t^2)^3.$$

Find numerically and graphically the Haar wavelet expansion of $f(t)$ with resolution
level coarser than 0 and finer than 6. Graph the resolution level $n$ contents of $f(t)$ for
$n = 0, 1, \dots, 6$. Use your computer in order to estimate numerically the maximum error
of your approximation.
16. Use your computer and the defining scaling relations to produce numerical values
of the Daubechies scaling function and mother wavelet at the dyadic points up to resolution
level 6.
Chapter 8
Summation of Divergent Series and
Integrals

The theory of distributions has a flavor similar to the theory of summation of
divergent series and integrals and, as we have seen in Chapter 6, is closely related
to the theory of singular integrals. A generic alternating series

$$1 - 1 + 1 - 1 + \dots = \sum \pm 1$$
is a good example here. It seems to make no sense to assign a specific value to
this infinite sum. Nevertheless, mathematicians have produced certain reasonable
rules of summation that assign to it the value 1/2. Such an assignment is in complete
agreement with the intuition of physicists who encounter similar series. In this
chapter, we will see how one can sum this, and even stranger, divergent series
and integrals. To gain a better insight into the essence of this problem, let us begin
with elementary examples and recall basic notions and theorems of the ordinary
theory of convergent infinite series.

8.1 Zeno's "paradox" and convergence of infinite series

8.1.1. Geometric series. Recall the celebrated 5th century B.C. Achilles-and-tortoise
"paradox" due to the Greek philosopher Zeno. At time $t = 0$ Achilles is
at point $x = 0$ and the tortoise at $x = 1$. Achilles begins to chase the tortoise with
velocity $+1$, and the tortoise begins to run away with velocity $0 < v < 1$.
To catch the tortoise Achilles has to first reach point $x = 1$. This will happen
at time $\tau_0 = 1$. By that time the tortoise will have moved by distance $v\tau_0 = v$.
To cover that distance Achilles needs time $\tau_1 = v$, during which the tortoise will
have moved further to the right by distance $v\tau_1 = v^2$ (see Fig. 8.1.1).
The pattern continues ad infinitum, seemingly indefinitely delaying the moment
FIGURE 8.1.1
A graph representing paths of Achilles chasing the tortoise. They will meet
at the point of intersection of the corresponding straight lines.

when Achilles catches the tortoise. Of course, time $t$ needed by Achilles to catch
the tortoise is the sum of an infinite series

$$t = \tau_0 + \tau_1 + \tau_2 + \dots,$$

where $\tau_m = v^m$, $m = 0, 1, 2, \dots$, so that

$$t = 1 + v + v^2 + v^3 + \dots = \sum_{m=0}^{\infty} v^m \qquad (1)$$

is the sum of a geometric series.


Leaving aside philosophical significance of Zeno's "paradox" concerning the
nature of space and time, let us underline the basic mathematical difficulty en-
countered in formula (1): we are trying to perform infinitely many mathematical
operations, each of which takes a certain amount of time. At a naive level, this
seems to be an impossible task if we want to do it exactly.
An insight into how this difficulty can be overcome is gained by a look at the
problem from the physical viewpoint. To find time t when Achilles catches the
tortoise, it suffices to solve the system of equations of motion for both of them:

x = vt + 1, (2a)

x = t. (2b)

The solution is

$$t = \frac{1}{1 - v}, \qquad (3)$$

which, together with formula (1), gives

$$\sum_{m=0}^{\infty} v^m = \frac{1}{1 - v}. \qquad (4)$$

Equation (4) attaches to the infinite series on its left-hand side, a number from the
analytic expression on its right-hand side. Note that the latter has a well defined
mathematical and physical meaning, even if the left-hand side does not form a
convergent series. For example, for v = -1, equation (4) states that

$$1 - 1 + 1 - 1 + \dots = \frac{1}{2}, \qquad (5)$$

an equality anticipated in the preamble to this chapter. The question is: How can we
provide a mathematical justification of this type of identities? The answer is given
by the theory of summability of divergent series which formalizes procedures that
are consistent with the physical principle of infinitesimal relaxation. On the other
hand, we'll be able to elucidate rigorous mathematical constructs of summability
theory by looking at their physical roots. Observe that series (1) which appears on
the left-hand side of equality (4) arises in an attempt to solve equations of motion
(2-3) by the method of consecutive approximations in parameter v.
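
The stage-by-stage description of the chase can be checked numerically. The following Python sketch accumulates the stage durations $\tau_m$ one by one and compares the result with formula (3). The function name `zeno_catch_time` and the number of stages are illustrative choices of ours, not part of the original argument.

```python
def zeno_catch_time(v, stages):
    """Add up the durations tau_m = v**m of the first `stages` stages of the chase."""
    t = 0.0
    gap = 1.0                 # initial distance between Achilles and the tortoise
    for _ in range(stages):
        tau = gap             # Achilles runs with speed 1, so covering `gap` takes time `gap`
        t += tau
        gap = v * tau         # during that time the tortoise moves by v * tau
    return t


v = 0.5
print(zeno_catch_time(v, 60), 1.0 / (1.0 - v))   # partial sum versus formula (3)

# For v = -1 the partial sums oscillate between 1 and 0,
# although the right-hand side of (4) equals 1/2.
print([zeno_catch_time(-1.0, n) for n in range(1, 7)])
```

For $0 < v < 1$ the loop settles quickly on $1/(1 - v)$; for $v = -1$ it never settles, which is exactly the situation that the summability methods of this chapter are meant to handle.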

8.1.2. Criteria for convergence. The above example illustrates the basic difference
between finite sums

$$a_0 + a_1 + \dots + a_n = \sum_{m=0}^{n} a_m \qquad (6)$$

and the infinite series

$$a_0 + a_1 + a_2 + \dots = \sum_{m=0}^{\infty} a_m. \qquad (7)$$

The value of a finite sum can always be explicitly computed (at least in principle),
whereas it is not always possible to assign a numerical value to an infinite series.
However, for some series, the sequence of partial sums

$$S_n = \sum_{m=0}^{n} a_m \qquad (8)$$

converges, as $n \to \infty$, to a finite limit

$$S = \lim_{n \to \infty} S_n, \qquad (9)$$

and then it is natural to think about S as the sum of the infinite series, the value of
which is being approximated by computable finite partial sums. Such series are
called convergent.
One learns in calculus that it is possible to find out whether the series converges
or not without actually computing the limit of its partial sums. One such approach
is based on the so-called Cauchy criterion which states:
Series $S$ converges if and only if for any given number $\varepsilon > 0$, one can find
an integer $N = N(\varepsilon)$ such that, for any $n \ge N$ and arbitrary positive integer
$k = 1, 2, \dots$,

$$|S_{n+k} - S_n| < \varepsilon. \qquad (10)$$

If the series converges, then its remainder

$$R_n = S - S_n = \sum_{m=n+1}^{\infty} a_m$$

is a well defined number, and its absolute value measures the error of approximation
of the infinite series $S$ by its finite partial sum $S_n$. In particular, if $n \ge N(\varepsilon)$ then
$|R_n| \le \varepsilon$.
Example 1. Geometric series. Let us return to the geometric series

$$S = \sum_{m=0}^{\infty} q^m = 1 + q + q^2 + \dots.$$

Its partial sums $S_n$ obviously satisfy identity

$$S_n - q S_n = 1 - q^{n+1},$$

from which we immediately get that

$$S_n = \frac{1 - q^{n+1}}{1 - q}.$$

Hence, the partial sums sequence $S_n$ converges if and only if $|q| < 1$. If it does
converge, its limit

$$S = \lim_{n \to \infty} S_n = \frac{1}{1 - q}.$$

The remainder $R_n$ is also easily computable:

$$R_n = \frac{1}{1 - q} - S_n = \frac{q^{n+1}}{1 - q}.$$
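
Since the remainder of the geometric series is known in closed form, the number of terms needed for a prescribed accuracy can be found explicitly. The sketch below is only an illustration under that assumption: the helper names `partial_sum` and `smallest_n` are hypothetical, and the search is a plain loop rather than an optimized formula.

```python
def partial_sum(q, n):
    """S_n = 1 + q + ... + q**n, computed term by term."""
    return sum(q ** m for m in range(n + 1))


def smallest_n(q, eps):
    """Smallest n with |R_n| = |q|**(n+1) / |1 - q| < eps (requires |q| < 1)."""
    n = 0
    while abs(q) ** (n + 1) / abs(1 - q) >= eps:
        n += 1
    return n


q, eps = 0.9, 1e-6
n = smallest_n(q, eps)
S = 1.0 / (1.0 - q)
print(n, abs(S - partial_sum(q, n)))                             # the error stays below eps
print(partial_sum(q, n), (1 - q ** (n + 1)) / (1 - q))           # closed form for S_n
```

The closer $|q|$ is to 1, the larger the returned index becomes, which reflects the slow convergence of the geometric series near the boundary of its convergence region.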

A comparison of terms of an arbitrary series with corresponding terms of the
geometric series gives a handy sufficient condition of convergence of general series.
If, for all sufficiently large $m$, we have that $|a_m|^{1/m} < q < 1$, then the series
$\sum_{m=0}^{\infty} a_m$ converges.
If a series does not converge then we call it a divergent series. One obvious
example is the geometric series with $|q| \ge 1$.
Example 2. Harmonic series. Consider the harmonic series

$$H = \sum_{m=0}^{\infty} \frac{1}{m + 1} = 1 + \frac{1}{2} + \frac{1}{3} + \dots.$$

As we have observed in Section 4.1, its partial sums $H_n$ satisfy an asymptotic
relation

$$H_n \sim \ln(n + 1) + \gamma, \quad (n \to \infty) \qquad (11)$$

which immediately proves the divergence of the harmonic series.
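
Relation (11) is easy to observe numerically. In the sketch below we assume that $\gamma$ in (11) denotes Euler's constant, whose value is hard-coded as `EULER_GAMMA`; the function name `H` simply mirrors the notation of the text.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, assumed to be the gamma of (11)


def H(n):
    """Partial sum H_n = sum_{m=0}^{n} 1/(m+1) of the harmonic series."""
    return sum(1.0 / (m + 1) for m in range(n + 1))


for n in (10, 1000, 100000):
    # The difference H_n - ln(n+1) should approach gamma as n grows.
    print(n, H(n), H(n) - math.log(n + 1) - EULER_GAMMA)
```

The partial sums grow without bound, but only logarithmically, so a naive numerical experiment with a few thousand terms can easily give the false impression of convergence.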


Note, however, that if we alternate signs of the harmonic series to obtain

$$L = \sum_{m=0}^{\infty} \frac{(-1)^m}{m + 1} = 1 - \frac{1}{2} + \frac{1}{3} - \dots,$$

then the latter series converges. This is due to the general phenomenon which
makes all the alternating series of the form

$$D = \sum_{m=0}^{\infty} (-1)^m c_m,$$

with positive terms $c_m$ decreasing monotonically to zero, converge. The corresponding
formal statement is known in calculus as the Leibniz Theorem.
The alternating harmonic series is an example of a convergent series $\sum_m a_m$
for which the series of absolute values $\sum_m |a_m|$ diverges. Series $\sum_m a_m$ which
converges together with the series $\sum_m |a_m|$ of absolute values is called absolutely
convergent. Obviously, any geometric series with $|q| < 1$ converges absolutely.

8.1.3. Conditional and absolute convergence. Summation of convergent infinite
series is an associative operation. This means that if we group the terms of a
convergent series into finite blocks, then the sum of the whole series is equal to
the sum of the series consisting of these blocks. More precisely, the associativity
means that if we select an increasing sequence of integers

$$0 < k_0 < k_1 < k_2 < k_3 < \dots < k_m < \dots$$

and denote by

$$a'_0 = a_0 + \dots + a_{k_0}, \qquad a'_m = a_{k_{m-1}+1} + \dots + a_{k_m}, \quad m = 1, 2, \dots,$$

the sums in finite blocks generated by the above sequence, then the series

$$S' = \sum_{m=0}^{\infty} a'_m$$

also converges and $S = S'$. This immediately follows from the fact that the
sequence of partial sums $\{S'_0, S'_1, \dots, S'_m, \dots\}$ forms a subsequence of the original
sequence of partial sums $\{S_0, S_1, \dots, S_n, \dots\}$.
In contrast, summation of convergent series is not necessarily a commutative
operation. A series formed by a permutation (perhaps infinite) of terms of a
convergent series may converge to a different limit, or may even diverge. If a
series converges to the same limit after any permutation of its terms, then we call it
unconditionally convergent. It turns out that the necessary and sufficient condition
for the unconditional convergence of a series is its absolute convergence. If a
series converges, but not absolutely, then in view of the above statement, one says
that it converges conditionally. For a conditionally convergent series, the Riemann
Theorem asserts that any number is the sum of a certain permutation of the original
series. This striking theorem will not be proved here (see Bibliography for the
proof), but we will illustrate it by a concrete example.

Example 3. A conditionally convergent series. Select positive integers $p$ and $q$,
and consider a permutation of the alternating harmonic series in which $q$ negative
terms follow $p$ positive terms:

$$L' = 1 + \frac{1}{3} + \dots + \frac{1}{2p - 1} - \frac{1}{2} - \dots - \frac{1}{2q} + \frac{1}{2p + 1} + \dots + \frac{1}{4p - 1} - \dots.$$

Consider partial sums $L'_{n(p+q)}$ containing the first $n(p + q)$ terms of the rearranged
series. Since finite summation is commutative,

$$L'_{n(p+q)} = \sum_{m=0}^{np-1} \frac{1}{2m + 1} - \sum_{m=0}^{nq-1} \frac{1}{2(m + 1)}.$$

Now,

$$\sum_{m=0}^{np-1} \frac{1}{2m + 1} = \sum_{m=0}^{2np-1} \frac{1}{m + 1} - \frac{1}{2} \sum_{m=0}^{np-1} \frac{1}{m + 1}$$

and

$$\sum_{m=0}^{nq-1} \frac{1}{2(m + 1)} = \frac{1}{2} \sum_{m=0}^{nq-1} \frac{1}{m + 1},$$

which implies that the sums in the above expression for $L'_{n(p+q)}$ can be expressed
through partial sums of the harmonic series. In view of the asymptotic relation
(11), as $n \to \infty$,

$$\sum_{m=0}^{2np-1} \frac{1}{m + 1} \sim \ln(2np) + \gamma, \qquad \sum_{m=0}^{np-1} \frac{1}{m + 1} \sim \ln(np) + \gamma,$$

so that

$$L'_{n(p+q)} \sim \ln(2np) - \frac{1}{2} \ln(np) - \frac{1}{2} \ln(nq), \quad (n \to \infty).$$

Hence, finally,

$$\lim_{n \to \infty} L'_{n(p+q)} = \ln\left(2\sqrt{p/q}\right). \qquad (12)$$

The reader can complete the example by proving that not only subsequence $L'_{n(p+q)}$
but also the entire series $L'$ converges to the same limit.
It is clear that, if we choose different integers $p'$ and $q'$, thus selecting a different
rearrangement of the alternating harmonic series, then we'll get different limits as
long as $p/q \ne p'/q'$. In particular, for $p = q = 1$, we get the well known formula

$$L = 1 - \frac{1}{2} + \frac{1}{3} - \dots = \ln 2$$

but for $p = 2$, $q = 1$, we obtain that

$$L' = 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \dots = \frac{3}{2} \ln 2.$$ •
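
Formula (12) can be tested directly on a computer. The following sketch sums the first $n$ blocks of the rearranged series and compares the result with $\ln(2\sqrt{p/q})$; the function names and the block count are our own illustrative choices.

```python
import math


def rearranged_partial_sum(p, q, n):
    """Sum of the first n blocks, each made of p positive terms 1/(2m+1)
    followed by q negative terms -1/(2m+2) of the alternating harmonic series."""
    positives = sum(1.0 / (2 * m + 1) for m in range(n * p))
    negatives = sum(1.0 / (2 * (m + 1)) for m in range(n * q))
    return positives - negatives


def predicted_limit(p, q):
    """Right-hand side of formula (12)."""
    return math.log(2.0 * math.sqrt(p / q))


for p, q in ((1, 1), (2, 1), (1, 2), (5, 3)):
    print(p, q, rearranged_partial_sum(p, q, 20000), predicted_limit(p, q))
```

Because finite sums are commutative, the code may add all positive terms first and all negative terms afterwards; only the proportion $p/q$ in which the two kinds of terms are consumed affects the limit.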
The above example gives a taste of the proof of the Riemann Theorem. Although
the latter seems to contradict common sense, it is sometimes used (although without
crediting Riemann) by dishonest financiers taking new loans to repay growing
old debts. The same argument explains illusory prosperity of declining nations
which issue new paper money to cover inflationary budgets.

8.1.4. Functional series and uniform convergence. In what follows we will
often work with infinite series

$$S(x) = \sum_{m=0}^{\infty} a_m(x), \qquad (13)$$

where the terms are not numbers but functions. For such series the notion of
pointwise convergence, that is, the notion of convergence for every point $x$ separately,
is not sufficient. For that reason one introduces a stronger notion of uniform
convergence which will be discussed below.
Assume that series (13) converges for each $x \in [a, b]$. Then, by the definition
of pointwise convergence, for any $x \in [a, b]$ and for any given number $\varepsilon > 0$,
one can find an integer $N = N(\varepsilon, x)$ such that, for any $n \ge N(\varepsilon, x)$ and for an
arbitrary positive integer $k = 1, 2, \dots$, $|S_{n+k}(x) - S_n(x)| < \varepsilon$.
If we can make the selection of $N = N(\varepsilon, x)$ independent of $x \in [a, b]$ (one
says "select $N$ uniformly over $[a, b]$"), then the series is said to converge uniformly
over $[a, b]$. Notice that the condition of uniform convergence is equivalent to the
condition that

$$N(\varepsilon) = \max_{x \in [a, b]} N(\varepsilon, x) < \infty.$$

The importance of uniform convergence becomes obvious when we try to investigate
properties of functions $a_m(x)$ that are inherited by the sum $S(x)$ of
the whole series. It is easy to see that, in general, the continuity of $a_m(x)$
does not imply the continuity of $S(x)$ if the series (13) converges only pointwise.
Indeed, if $S_n(x) = x^n$, $x \in [0, 1]$, then although the corresponding terms
$a_m(x) = S_m(x) - S_{m-1}(x) = x^m - x^{m-1}$ are continuous, the limit $S(x)$, which
equals 0 for $0 \le x < 1$, and 1 for $x = 1$, is a discontinuous function.
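
The failure of uniform convergence in this example can be made explicit: for every $n$ there is a point $x_n < 1$ at which the partial sum is still far from its limit. The sketch below uses the particular choice $x_n = (1/2)^{1/n}$, which is our own illustration and not taken from the text.

```python
def S_n(x, n):
    """Partial sum S_n(x) = x**n from the example above."""
    return x ** n


def S(x):
    """Pointwise limit on [0, 1]: 0 for 0 <= x < 1 and 1 for x = 1."""
    return 1.0 if x == 1.0 else 0.0


for n in (1, 10, 100, 1000):
    x_n = 0.5 ** (1.0 / n)          # a point of [0, 1) chosen so that x_n**n = 1/2
    print(n, x_n, abs(S_n(x_n, n) - S(x_n)))
```

The printed deviation stays near 1/2 no matter how large $n$ is, so no single index $N(\varepsilon)$ can serve all points of $[0, 1]$ once $\varepsilon < 1/2$.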
However, the uniform convergence of the series precludes the above situation,
and we have the following theorem:
If functions $a_m(x)$, $m = 0, 1, 2, \dots$, are continuous on $[a, b]$ and the series
$\sum_m a_m(x)$ converges uniformly on $[a, b]$, then $S(x)$ is continuous on that interval.
Similarly, pointwise convergence is not sufficient to permit differentiation and
integration of the functional series term by term or, in other words, interchanging
the order of infinite summation with operations of differentiation and integration.
However, in the presence of uniform convergence we have the following two useful
theorems:
If functions $a_m(x)$, $m = 0, 1, 2, \dots$, are integrable on $[a, b]$ and the series
$\sum_m a_m(x)$ converges uniformly on $[a, b]$, then

$$\int_a^b S(x)\,dx = \sum_{m=0}^{\infty} \int_a^b a_m(x)\,dx.$$

Assume that functions $a_m(x)$, $m = 0, 1, 2, \dots$, are defined on $[a, b]$ and are
continuously differentiable on $(a, b)$. Then, if the series $\sum_m a_m(x_0)$ converges for
an $x_0 \in [a, b]$, and the series of derivatives $\sum_m a'_m(x)$ converges uniformly on
$(a, b)$, then

$$S'(x) = \sum_{m=0}^{\infty} a'_m(x)$$

for any $x \in (a, b)$.


Checking the uniform convergence of a functional series is not always easy
but there exist several criteria that can be helpful. One of the most useful is the
following Weierstrass criterion.
Let $c_m > 0$ be a sequence of numbers such that the series $\sum_m c_m$ converges.
If functions $a_m(x)$, $m = 0, 1, 2, \dots$, are defined on $[a, b]$ and satisfy, for all
$m = 0, 1, 2, \dots$, inequalities

$$|a_m(x)| \le c_m, \quad x \in [a, b],$$

then the series $\sum_m a_m(x)$ converges uniformly on $[a, b]$.


Example 4. An application of the Weierstrass criterion. Let $|q| < 1$. The
above Weierstrass criterion, combined with properties of the geometric series,
immediately yields the uniform convergence of the functional series

$$\sum_{m=0}^{\infty} q^m \cos(mx)$$

on the whole real line.
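
The uniform bound behind this example can be observed numerically. The sketch below assumes $0 < q < 1$ and uses the closed form $\sum_m q^m \cos(mx) = (1 - q\cos x)/(1 - 2q\cos x + q^2)$, obtained as the real part of the geometric series in $q e^{ix}$; the grid over one period and the function names are our own choices.

```python
import math


def partial_sum(q, x, n):
    """S_n(x) = sum_{m=0}^{n} q**m * cos(m*x)."""
    return sum(q ** m * math.cos(m * x) for m in range(n + 1))


def exact_sum(q, x):
    """Real part of 1/(1 - q*exp(i*x)), the sum of the whole series."""
    return (1 - q * math.cos(x)) / (1 - 2 * q * math.cos(x) + q * q)


def max_error(q, n, points=721):
    """Largest |S(x) - S_n(x)| over a uniform grid on [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * j / (points - 1) for j in range(points)]
    return max(abs(exact_sum(q, x) - partial_sum(q, x, n)) for x in xs)


q, n = 0.8, 30
weierstrass_bound = q ** (n + 1) / (1 - q)   # tail of the majorizing series sum q**m
print(max_error(q, n), weierstrass_bound)
```

The observed maximal error never exceeds the tail of the majorizing geometric series, and that tail does not depend on $x$, which is precisely what uniform convergence requires.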

8.2 Summation of divergent series

In this section we will study the main topic of this chapter: methods of summability
of divergent series

$$\sum_{m=0}^{\infty} a_m. \qquad (1)$$

We have encountered divergent series before. One example was the series $\sum \pm 1$,
which can be obtained by specifying $a_m = (-1)^m$ in expression (1). We also
discussed physical situations where divergent series arise. The solution of equations
of motion (8.1.2), with $v = -1$, by the method of successive approximation,
provides such an example. In mathematics, the study of divergent series is also
motivated by other reasons, for example, by the fact that the product
of two convergent series can be a divergent series. Quite reasonably, one would
like to assign to this product a value equal to the product of sums of the two factor
series.
The basic idea, underlying some of the most useful methods of summation of
divergent series, is contained in the relatively innocent looking Abel's Theorem:
If series $\sum_{m=0}^{\infty} a_m$ converges to sum $S$ then, for any $0 < q < 1$, the power series

$$S(q) = \sum_{m=0}^{\infty} a_m q^m \qquad (2)$$

also converges and, additionally,

$$\lim_{q \to 1-} S(q) = S. \qquad (3)$$

However, sometimes it happens that the series $\sum a_m$ diverges, but the power
series (2) converges for any $0 < q < 1$, and the sums $S(q)$ have a finite limit as
$q \to 1-$. In such a situation, $S$ can be defined by formula (3), and is called the
generalized sum of series (1) in the Poisson-Abel sense. Often, the above approach
is just called the Abel method of summation, although Abel himself never worked
on the summability of divergent series.
Example 1. Summing $\sum \pm 1$ by the Abel method. For the series $\sum \pm 1$, the
auxiliary power series (2) has the sum

$$S(q) = \sum_{m=0}^{\infty} (-1)^m q^m = \frac{1}{1 + q}, \quad |q| < 1.$$

As $q \to 1 - 0$, the above sum $S(q)$ has a finite limit 1/2. Thus, the Abel method
leads to the familiar formula (8.1.5)

$$1 - 1 + 1 - 1 + \dots = 1/2. \qquad (4)$$
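
The Abel procedure is straightforward to imitate numerically: one sums the auxiliary power series (2) for values of $q$ approaching 1 from below. In the sketch below the infinite series is truncated after a fixed number of terms; this truncation, as well as the name `abel_sum`, is an assumption of the illustration and is harmless only while $q^{\text{terms}}$ is negligible.

```python
def abel_sum(a, q, terms=100000):
    """Approximate S(q) = sum_m a(m) * q**m by its first `terms` terms (0 < q < 1)."""
    total, power = 0.0, 1.0
    for m in range(terms):
        total += a(m) * power
        power *= q
    return total


def alternating(m):
    """Terms a_m = (-1)**m of the series 1 - 1 + 1 - 1 + ..."""
    return (-1) ** m


for q in (0.9, 0.99, 0.999):
    print(q, abel_sum(alternating, q), 1.0 / (1.0 + q))
```

As $q$ increases towards 1 the computed values approach 1/2, in agreement with (4).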

Another approach, known as the Cesaro method of generalized summation,
depends on the formation of arithmetic averages

$$A_0 = S_0, \quad A_1 = \frac{S_0 + S_1}{2}, \quad \dots, \quad A_n = \frac{S_0 + S_1 + \dots + S_n}{n + 1}, \quad \dots \qquad (5)$$

of partial sums of the original series (1), and on the investigation of their convergence.
If the sequence $S_n$ itself converges, then the sequence of Cesaro averages $A_n$
always converges to the same limit. But sometimes $A_n$ converges even though $S_n$
diverges. In this case, we shall say that the original series (1) is Cesaro summable
and the limit of the sequence $A_n$ is called the generalized sum of (1) in the Cesaro
sense.
Example 2. Summing $\sum \pm 1$ by the Cesaro method. In the generic example of
the series $\sum \pm 1$, the partial sums are $S_{2n} = 1$, $S_{2n+1} = 0$. Consequently,

$$A_{2k} = \frac{k + 1}{2k + 1}, \qquad A_{2k+1} = \frac{1}{2},$$

so that $A_k \to 1/2$. Thus, the Cesaro method leads to the same generalized sum as
the Abel method of summation. •
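
The Cesaro averages (5) can be generated with a single pass over the terms of the series. The following sketch (the function name `cesaro_means` is hypothetical) applies this to $\sum \pm 1$ and prints a few averages next to the closed forms found in Example 2.

```python
def cesaro_means(terms):
    """Return the list A_0, A_1, ... of arithmetic means of the partial sums."""
    means = []
    partial = 0.0      # current partial sum S_n
    running = 0.0      # S_0 + S_1 + ... + S_n
    for n, a in enumerate(terms):
        partial += a
        running += partial
        means.append(running / (n + 1))
    return means


A = cesaro_means([(-1) ** m for m in range(2001)])
print(A[0], A[1], A[2])                 # compare with 1, 1/2, 2/3
print(A[1999], A[2000], 1001 / 2001)    # odd and even indices approach 1/2
```

The partial sums themselves keep jumping between 1 and 0, while their running averages settle near 1/2.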
There exists a large number of other summability methods. To make them
useful, mathematicians usually demand that they satisfy regularity and linearity
conditions. By definition, a summability method is called regular if it assigns
to an already convergent series a generalized sum equal to its usual sum $S$. The
method is called linear if a generalized sum for the series $\sum (p a_m + q b_m)$ is equal
to $pS + qT$, provided series $\sum a_m$ and $\sum b_m$ have generalized sums $S$ and $T$,
respectively.
Note that the regularity of the Abel method follows from the Abel Theorem.
Its linearity is obvious. It is also not difficult to prove regularity and linearity of
the Cesaro method.
Naturally, not all series have finite generalized sums. For example, series
$1 + 1 + 1 + \dots$ has infinite Abel and Cesaro generalized sums, that is, it is not
Abel or Cesaro summable. Summable series with alternating signs are sometimes
called semiconvergent series.

8.3 Tiring Achilles and the principle of infinitesimal relaxation

As we already noticed, properties of finite and infinite sums can be drastically
different. Just recall the Riemann Theorem which implies that the addition in
conditionally convergent infinite series is not commutative whereas in finite sums
it is. For semiconvergent series addition is not associative either; such series are
sensitive not only to rearrangements of their terms, but also to their grouping. For
example, for the series $\sum \pm 1$, we have

$$(1 - 1) + (1 - 1) + (1 - 1) + \dots = 0 + 0 + 0 + \dots = 0, \qquad (1a)$$

but

$$1 - (1 - 1) - (1 - 1) - \dots = 1 - 0 - 0 - \dots = 1. \qquad (1b)$$

A comparison of this result with the result of Abel (or Cesaro) summation of the
same series (see (8.1.5)) may give you second thoughts about the summability
methods introduced in the preceding section. One can argue that (1a) and (1b)
taken together, although ambiguous, make more sense than the answer 1/2 obtained
before, since one would like the sum of integers to be an integer as well. Similar
misgivings can help motivate the search for additional summability methods based
on physical arguments.
To illustrate what we have in mind we shall return to the Achilles and tortoise
example. However, this time we will closely watch Achilles' consecutive steps
along the $x$-axis, rather than just the overall time he needs to catch the tortoise.
Since Achilles' speed is equal to 1, his total displacement is given by series (8.1.1)
with $t$ replaced by $x$:

$$x = \sum_{m=0}^{\infty} v^m = 1 + v + v^2 + \dots \qquad (2)$$

Let us imagine the chase as a series of physical stages, each consisting of Achilles
reaching the previous position of the tortoise and resting for an instant (as Zeno
prescribed originally), before embarking on the next stage.
In contrast to formula (8.1.1), series (2) has a physical meaning also for negative
velocities $v$ of the tortoise. If $v < 0$, the tortoise always runs towards Achilles.
For example, if $v = -1$, Achilles reaches point $x = 1$ at the end of the first stage
having met the tortoise on his way and, at the same time, the tortoise halts at $x = 0$.
Then, Achilles turns around, and at the end of the second stage finds himself at
point $x = 0$, while the tortoise has returned to point $x = 1$, and so on. The graph
of Achilles' motion is presented in Fig. 8.3.1a.
Now let us strip Achilles of his demigod status and assume, realistically, that he
is getting tired chasing the tortoise back and forth. As his, now human, strength
is sapped, after each turn he slows down by a factor of $q$. The graph of Achilles'
motion is presented in Fig. 8.3.1b, and his full displacement is given by the sum
of an absolutely convergent series

$$x = \sum_{m=0}^{\infty} (-1)^m q^m = \frac{1}{1 + q}.$$

Note that in the case $q \to 1 - 0$ of infinitesimally slow relaxation of Achilles'
strength, the above sum is equal to 1/2; the infinitesimally tiring Achilles will end
his chase in the middle of the interval originally separating him from the tortoise.
The calculation of Achilles' total displacement, provided above in the case of
infinitesimal relaxation of his speed, can be applied to an arbitrary series (8.2.1)
FIGURE 8.3.1
The graphs of motion of tiring Achilles.

in which case it coincides with the Abel generalized summation procedure (8.2.2)
and (8.2.3).

The Cesaro method can also be "supported" by a similar physical argument.
Suppose that the never tiring Achilles covers the distance $a_m$ in the $m$-th stage, but his
tiring alter ego decreases his speed linearly (as opposed to the exponential decay
seen in the above example related to the Abel method). In this case

$$v_m = 1 - \frac{m}{n + 1}$$

and he comes to a full stop only after $n$ steps. Then his total displacement

$$x_n = \sum_{m=0}^{n} a_m \left(1 - \frac{m}{n + 1}\right) = \frac{S_0 + S_1 + \dots + S_n}{n + 1} \qquad (3)$$

is equal to the arithmetic mean (8.2.5) of partial sums of series (8.2.1). For $a_m =
(-1)^m$ and $n \to \infty$ (the case of infinitesimal relaxation of Achilles' speed), we
see that $x_n \to 1/2$. In other words, the Cesarean Achilles will eventually collapse
from exhaustion alongside the Abelian Achilles.
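
The equality between the linearly relaxed sum and the arithmetic mean of partial sums holds for arbitrary terms $a_m$, since $a_m$ enters exactly $n - m + 1$ of the partial sums $S_0, \dots, S_n$. The sketch below checks both sides numerically; the two function names are our own, and the test sequence is the series $\sum \pm 1$.

```python
def relaxed_displacement(a, n):
    """x_n = sum_{m=0}^{n} a[m] * (1 - m/(n+1)): displacement with linearly decaying speed."""
    return sum(a[m] * (1.0 - m / (n + 1)) for m in range(n + 1))


def mean_of_partial_sums(a, n):
    """(S_0 + S_1 + ... + S_n) / (n + 1) for the partial sums of the sequence a."""
    partial, total = 0.0, 0.0
    for m in range(n + 1):
        partial += a[m]
        total += partial
    return total / (n + 1)


a = [(-1) ** m for m in range(1001)]
for n in (10, 100, 1000):
    print(n, relaxed_displacement(a, n), mean_of_partial_sums(a, n))
```

Both columns coincide up to rounding and drift towards 1/2 as $n$ grows.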

8.4 Achilles chasing the tortoise in presence of head winds

The principle of infinitesimal relaxation provides a physical interpretation for
the equality $1 - 1 + 1 - 1 + \dots = 1/2$ (8.1.5), and also for the general framework of
Abel and Cesaro summation methods. However, a more detailed analysis of that
principle reveals that the ambiguity in assigning values to the series $1 - 1 + 1 - 1 + \dots$
and to other semiconvergent series has not been entirely removed. In the following
example, a regular linear method of generalized summation assigns to the series
$1 - 1 + 1 - 1 + \dots$ values 1/2 (as in (8.2.4)), 0, or 1 (as in (8.3.1)), depending on the
selection of infinitesimal relaxation rates.
Example 1. Summation with variable relaxation rates. Achilles chases the
tortoise in asymmetric conditions that slow him down at different rates depending
on whether he is running up or down the $x$-axis (think about the wind blowing in
the negative direction along the $x$-axis). As a result, Achilles' speed up the $x$-axis
decreases by a factor of $p$ at each stage, and his speed down decreases (increases)
by a factor $q$, $p \ne q$. During the $m$-th stage his speed is

$$v_{2k} = (qp)^k, \qquad v_{2k+1} = p(qp)^k, \qquad k = 0, 1, 2, \dots. \qquad (1)$$

Thus, the original divergent series $1 - 1 + 1 - 1 + \dots$ is replaced by a series

$$x = \sum_{m=0}^{\infty} (-1)^m v_m, \qquad (2)$$

which, in view of the Cauchy criterion, converges absolutely for $0 < pq < 1$.
Since the terms of an absolutely convergent series can be rearranged without
changing their sum,

$$x = \sum_{k=0}^{\infty} (qp)^k - p \sum_{k=0}^{\infty} (qp)^k,$$

where we grouped together the terms of series (2) with even and odd indices,
respectively. Formula (8.1.4) for the sum of a geometric series yields

$$x = \frac{1 - p}{1 - qp}.$$
Now, following the principle of infinitesimal relaxation, we let $qp \to 1 - 0$. That
can be accomplished in different ways, and it turns out that the final answer is
sensitive to the choice of rates at which $q$ and $p$ converge to 1.
If, for instance, $p = q^r$, then

$$x = \frac{1 - q^r}{1 - q^{1+r}},$$

and in the limit $q \to 1-$, applying L'Hospital's rule we get that

$$x = \frac{r}{r + 1}. \qquad (3)$$

For $r = 1$, that is when $p = q$ and the conditions of running to the right and
to the left are the same, we get that $x = 1/2$, the result obtained by the Abel and
Cesaro methods.
However, if we let $r \ne 1$ vary then the limiting $x$ can take other values as well.
To see this, it is useful to distinguish between the cases of two-sided and one-sided
relaxations. If $r > 0$ and $q < 1$, then $p = q^r < 1$ and Achilles' speed decreases
independently of whether he moves to the right or to the left. In this case we say that
a two-sided relaxation takes place, and Achilles' total displacement $x$, given by
formula (3), can take any value from the interval $(0, 1)$. For $r \to 0$, we have $x \to 0$
which corresponds (see (8.3.1a)) to the rearrangement $(1 - 1) + (1 - 1) + \dots = 0$
of the series $1 - 1 + 1 - 1 + \dots$. For $r \to \infty$, we have $x \to 1$ which corresponds
(see (8.3.1b)) to the rearrangement $1 - (1 - 1) - (1 - 1) - \dots = 1$.
Note that the condition $qp = q^{1+r} < 1$, equivalent to the absolute convergence
of series (2), can also be fulfilled in the case when $p > 1$ and $p < 1/q$. This is
the model of one-sided relaxation corresponding to the values $-1 < r < 0$ and Achilles
running in presence of head winds blowing in the direction of the negative $x$-axis.
Then, the limiting $x < 0$. The result has a transparent physical meaning. Running
to the left Achilles goes further than running to the right. The accumulating drift of
infinitesimal displacements leads to the full displacement which can be anywhere
in the interval $(-\infty, 0)$. In this way the limit value $x = -\infty$ can be interpreted as
achievable as a result of a perturbation of the rearrangement $1 - (1 - 1) - (1 - 1) - \dots$
of the series $1 - 1 + 1 - 1 + \dots$, in which each expression in parentheses is replaced
by a very small number $\varepsilon$, so that $\varepsilon + \varepsilon + \dots = \infty$. •
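
The dependence of the limit on the exponent $r$ can be seen in a direct simulation of the stages. The sketch below assumes the speeds (1) with $p = q^r$, sums the alternating series (2) stage by stage, and compares the outcome with the closed form $(1 - q^r)/(1 - q^{1+r})$ and with the limit value $r/(r + 1)$; the function names and the truncation after a finite number of stages are our own choices.

```python
def displacement(q, r, stages=20000):
    """Sum x = sum_m (-1)**m * v_m with v_{2k} = (q*p)**k, v_{2k+1} = p*(q*p)**k, p = q**r."""
    p = q ** r
    x, v, sign = 0.0, 1.0, 1.0
    for m in range(stages):
        x += sign * v
        v *= p if m % 2 == 0 else q   # v_{2k+1} = p * v_{2k},  v_{2k+2} = q * v_{2k+1}
        sign = -sign
    return x


def closed_form(q, r):
    """x = (1 - p) / (1 - q*p) with p = q**r."""
    return (1.0 - q ** r) / (1.0 - q ** (1.0 + r))


q = 0.99
for r in (1.0, 2.0, 0.1, -0.5):
    print(r, displacement(q, r), closed_form(q, r), r / (r + 1.0))
```

With $q$ close to 1 the simulated displacement is close to $r/(r + 1)$: one half for $r = 1$, values inside $(0, 1)$ for other positive $r$, and negative values for $-1 < r < 0$.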

8.5 Separation of scales condition

Examples provided in Section 8.4 demonstrated the existence of regular, linear
methods of summation satisfying the principle of infinitesimal relaxation, which
give sums different from those provided by the Abel and Cesaro methods. In
other words, the principle of infinitesimal relaxation by itself does not produce a
unique sum for a semiconvergent series. In this context it seems desirable to seek
additional natural conditions which would eliminate the above ambiguities.
In this section we introduce the separation of scales condition which will restrict
unwanted arbitrariness in the generalized methods of summation discussed so far.
Roughly speaking, the condition requires that the relaxation of semiconvergent
series terms should proceed at a much slower rate than the rate of internal oscillations
around zero of the series itself. Without explicitly formulating the scales
separation condition we will proceed to discuss it using a revealing example.
Consider the $\varphi$-summation method which associates with the series $\sum a_m$ an
auxiliary series

$$S_\varphi(\delta) = \sum_{m=0}^{\infty} a_m \varphi(\delta m), \qquad (1)$$

where $\varphi(x)$, $x > 0$, is a certain function. If series (1) converges for $\delta > 0$ and its
sums have a finite limit

$$S_\varphi = \lim_{\delta \to 0+} S_\varphi(\delta), \qquad (2)$$

then $S_\varphi$ is taken as a generalized sum of series $\sum a_m$.

To make the $\varphi$-summation method work, function $\varphi(x)$ will be assumed to be
continuous and satisfy the following conditions:
• "Sufficiently" many derivatives $\varphi'(x)$, $\varphi''(x)$, $\dots$, are continuous and integrable
over the half-line $(0, \infty)$;
• $\varphi(0) = 1$;
• For "sufficiently" large $n = 1, 2, \dots$,

$$\lim_{x \to \infty} x^n \varphi(x) = 0. \qquad (3)$$

The conditions were deliberately stated in a somewhat vague form, but their specific
role will become clear later on.
The method is obviously linear. Its regularity is assured by the continuity of $\varphi(x)$
at 0 which yields $\lim_{x \to 0} \varphi(x) = 1$. Function $\varphi(x)$, which completely determines
the generalized $\varphi$-summation method, describes the relaxation law. Condition (3)
demands that $\varphi(x)$ rapidly decreases to 0 at infinity, thus guaranteeing the relaxation
principle. The separation of scales condition is fulfilled: as $\delta \to 0$, the
multiplier $\varphi(\delta m)$ varies more and more slowly with $m$.

Example 1. Abel and Cesaro methods as $\varphi$-summation. Formula (1) generates
the Abel and the Cesaro methods as special cases. Indeed, taking $\varphi(x) = \exp(-x)$
we arrive at the auxiliary series

$$S_\varphi(\delta) = \sum_{m=0}^{\infty} a_m \left(e^{-\delta}\right)^m, \tag{4}$$

which gives formula (8.2.2) of the Abel method after substituting $q = \exp(-\delta)$.
On the other hand, if we select $\varphi$ to be the triangular function

$$\varphi(x) = \begin{cases} 1 - x, & \text{for } 0 < x < 1; \\ 0, & \text{for } x \ge 1, \end{cases} \tag{5}$$

and replace $\delta \to 0$ by a sequence $\delta_n = 1/(n + 1)$, $n \to \infty$, then we arrive at
the Cesaro summation method. Either function fulfills the conditions imposed on
function $\varphi$ above and, as a result, Abel and Cesaro methods satisfy the condition
of separation of scales. •
As promised at the beginning of this section, the condition of separation of scales
guarantees uniqueness of the sum of series $\sum \pm 1$. More precisely, the generalized
sum (2) of series $\sum \pm 1$ does not depend on the selection of the relaxation function
$\varphi(x)$, as long as it satisfies the above listed conditions. To see this, substitute
$a_m = (-1)^m$ in (1), and group the terms in pairs to get

$$S_\varphi(\delta) = \sum_{k=0}^{\infty} \left[\varphi(2k\delta) - \varphi(2k\delta + \delta)\right]. \tag{6}$$

In view of the Mean Value Theorem, for a continuously differentiable $f$,

$$f(b) - f(a) = f'(c)(b - a) \tag{7}$$

for a certain $c \in (a, b)$. Thus, sum (6) can be rewritten in the form

$$S_\varphi(\delta) = -\delta \sum_{k=0}^{\infty} \varphi'(2k\delta + c_k), \qquad 0 < c_k < \delta.$$

The above series is an approximate sum, with the partition points $x_k = 2k\delta + c_k$, for
the integral of function $\varphi'(x)$ over interval $(0, \infty)$. Hence, the assumed integrability
of the derivative $\varphi'(x)$ implies that

$$\lim_{\delta \to 0} \delta \sum_{k=0}^{\infty} \varphi'(2k\delta + c_k) = \frac{1}{2} \int_0^{\infty} \varphi'(x)\,dx.$$

The factor 1/2 is due to the length of the partition intervals being $2\delta$. Finally,

$$S_\varphi = \lim_{\delta \to 0} S_\varphi(\delta) = -\frac{1}{2} \int_0^{\infty} \varphi'(x)\,dx = -\frac{1}{2} \varphi(x) \Big|_0^{\infty} = \frac{1}{2},$$

which proves our assertion that under the scales separation condition all generalized
$\varphi$-summation methods (1-3) give the same sum $1 - 1 + 1 - 1 + \dots = 1/2$.
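The independence of the limit from the relaxation function can be checked numerically. The following Python sketch evaluates the auxiliary series (1) for $a_m = (-1)^m$ with three relaxation functions. The helper name `phi_sum`, the truncation cutoff, and the Gaussian choice of $\varphi$ are illustrative assumptions and do not come from the text.

```python
import math

def phi_sum(a, phi, delta, cutoff=60.0):
    """Truncated auxiliary series S_phi(delta) = sum_m a(m) * phi(delta * m).

    Terms with delta * m > cutoff are dropped; this assumes phi is
    negligible there (true for the three examples below).
    """
    m_max = int(cutoff / delta)
    return math.fsum(a(m) * phi(delta * m) for m in range(m_max + 1))

def alternating(m):
    """Terms of the series 1 - 1 + 1 - 1 + ..."""
    return 1.0 if m % 2 == 0 else -1.0

RELAXATIONS = {
    "abel": lambda x: math.exp(-x),          # phi(x) = exp(-x)
    "cesaro": lambda x: max(0.0, 1.0 - x),   # triangular function (5)
    "gauss": lambda x: math.exp(-x * x),     # extra smooth choice (assumption)
}

if __name__ == "__main__":
    for name, phi in RELAXATIONS.items():
        for delta in (0.1, 0.01, 0.001):
            print(name, delta, phi_sum(alternating, phi, delta))
```

For decreasing `delta` all three columns should settle near 1/2, in line with the argument above.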
Remark 1. Divergent improper integrals. In a calculus course, the usual approach
is to approximate integrals over finite intervals by finite sums. Extension of
the $\varphi$-summation method of approximation to integrals over infinite intervals
requires certain precautions, and it would be a good exercise for the reader to provide
a rigorous proof here, and find assumptions that have to be imposed on functions
$\varphi(x)$ and $\varphi'(x)$ to make the approximation work in this case.
Remark 2. Rapidly divergent semiconvergent series. Note that our proof of the
validity of the generalized summation method (1-3) for the series $\sum \pm 1$ depended
only on the integrability of the first derivative $\varphi'(x)$ over the interval $(0, \infty)$. This
condition was obviously satisfied by the relaxation function $\varphi(x) = \exp(-x)$ of
the Abel method, as well as the triangular function (5) of the Cesaro method.
However, the stronger the growth of oscillations (as $m \to \infty$) of the terms of a
semiconvergent series, the stronger smoothness conditions will have to be imposed
on the relaxation function $\varphi(x)$. For that reason the Cesaro method does not work
well for series diverging stronger than the series $\sum \pm 1$.
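Example 2. A $\varphi$-summation for a rapidly diverging series. Consider series

$$\sum_{m=1}^{\infty} (-1)^m m = -1 + 2 - 3 + 4 - \dots$$

The corresponding auxiliary series in (1) is

$$S_\varphi(\delta) = \sum_{m=1}^{\infty} (-1)^m m\,\varphi(\delta m) = \frac{1}{\delta} \sum_{m=0}^{\infty} (-1)^m g(\delta m),$$

where $g(x) = x\varphi(x)$. As before, we will pair up the terms of this series to get

$$S_\varphi(\delta) = \frac{1}{\delta} \sum_{k=0}^{\infty} \left[g(2k\delta) - g(2k\delta + \delta)\right]. \tag{8}$$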

But in this case, the factor $1/\delta$ makes the Mean Value Theorem useless in
determining the behavior of $S_\varphi(\delta)$ as $\delta \to 0$. What is needed is the more precise
information contained in the Taylor formula which states that if function $f(x)$ has a
continuous derivative of order $n + 1$ then, for some $c$ between $a$ and $x$,

$$f(x) = \sum_{m=0}^{n} \frac{f^{(m)}(a)}{m!} (x - a)^m + \frac{f^{(n+1)}(c)}{(n + 1)!} (x - a)^{n+1}. \tag{9}$$

For $n = 1$ we get, in particular, that

$$g(2k\delta) - g(2k\delta + \delta) = -g'(2k\delta)\,\delta - g''(c_k)\,\delta^2/2,$$

for a $c_k \in (2k\delta, 2k\delta + \delta)$ which in combination with (8) gives

$$S_\varphi(\delta) = -\sum_{k=0}^{\infty} g'(2k\delta) - \frac{\delta}{2} \sum_{k=0}^{\infty} g''(c_k). \tag{10}$$

As a simple consequence of Taylor's formula (9)

$$\int_a^{a+\Delta} f(x)\,dx = f(a)\Delta + \frac{1}{2} f'(c)\Delta^2, \qquad c \in (a, a + \Delta),$$

applied to function $f(x) = g'(x)$, $a = 2k\delta$ and $\Delta = 2\delta$, we get that

$$\int_{2k\delta}^{(2k+2)\delta} g'(x)\,dx = g'(2k\delta)\,2\delta + g''(e_k)\,2\delta^2, \tag{11}$$

for some $e_k \in (2k\delta, (2k + 2)\delta)$. This equality can now be used to eliminate $g'(2k\delta)$
from (10) and arrive at

$$S_\varphi(\delta) = -\frac{1}{2\delta} \int_0^{\infty} g'(x)\,dx + \delta \sum_{k=0}^{\infty} g''(e_k) - \frac{\delta}{2} \sum_{k=0}^{\infty} g''(c_k).$$

Observe that, for any $\delta$ and for a suitable selection of $e_k$'s and $c_k$'s, the above
formula is exact. The integral

$$\int_0^{\infty} g'(x)\,dx = x\varphi(x) \Big|_{x=0}^{\infty} = 0$$

in view of assumptions on function $\varphi$. Integrability of $\varphi''(x)$ assures then the
convergence, as $\delta \to 0$, of the remaining two sums to the corresponding integrals,

so that finally

$$\lim_{\delta \to 0} S_\varphi(\delta) = \frac{1}{4} \int_0^{\infty} g''(x)\,dx = -\frac{1}{4} g'(0) = -\frac{1}{4}.$$

Thus the principle of infinitesimal relaxation combined with the separation of scales
condition permitted us to arrive, after changing the signs of all terms, at a striking result

$$1 - 2 + 3 - 4 + \dots = 1/4. \tag{12}$$
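A quick numerical experiment illustrates both the result (12) and Remark 2. The sketch below sums $\sum (-1)^{m-1} m\,\varphi(\delta m)$ for the Abel function, for a Gaussian (an extra smooth choice that is an assumption of this example), and for the triangular function (5). The function names and the cutoff are hypothetical helpers introduced only for this illustration.

```python
import math

def relaxed_sum(phi, delta, cutoff=60.0):
    """sum_{m>=1} (-1)^(m-1) * m * phi(delta*m), truncated at delta*m <= cutoff."""
    m_max = int(cutoff / delta)
    return math.fsum((-1) ** (m - 1) * m * phi(delta * m)
                     for m in range(1, m_max + 1))

def abel(x):
    return math.exp(-x)

def gauss(x):
    # Smooth relaxation function chosen for illustration only.
    return math.exp(-x * x)

def triangular(x):
    # Cesaro relaxation function (5); its derivative is discontinuous at x = 1.
    return max(0.0, 1.0 - x)

if __name__ == "__main__":
    for delta in (0.1, 0.01):
        print("abel ", delta, relaxed_sum(abel, delta))
        print("gauss", delta, relaxed_sum(gauss, delta))
    # The triangular function does not settle: compare neighbouring values of delta.
    print("triangular", relaxed_sum(triangular, 1 / 100), relaxed_sum(triangular, 1 / 101))
```

With the smooth relaxation functions the values approach 1/4, while the triangular function jumps between values near 1/2 and 0 for $\delta = 1/100$ and $\delta = 1/101$, which is the failure of the Cesaro method mentioned in Remark 2.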

In some physical situations the separation of scales condition is not satisfied.
That was the case in the analysis of Achilles chasing the tortoise in the presence of
head winds. Another, more convincing example of violation of the scales
separation condition will be encountered later on when we study the summability
of semiconvergent integrals. However, in physical applications, such cases are rare
and the condition is widely applied.

8.6 Series of complex exponentials


Many problems of the theory of divergent (and also convergent) series can be
solved by a study of the complex exponential series

$$\sum_{m=0}^{\infty} e^{imz}, \tag{1}$$

where $z = x + iy$. For $y > 0$ the series converges absolutely and

$$\sum_{m=0}^{\infty} e^{imz} = \frac{1}{1 - e^{iz}}.$$

In addition, for $y \to 0+$, the expression on the right gives the Abel summation
formula of a divergent series

$$\sum_{m=0}^{\infty} e^{imx} = \frac{1}{1 - e^{ix}}, \tag{2}$$

which, for $x = \pi$, reduces to the familiar equality $1 - 1 + 1 - 1 + \dots = 1/2$.
Separating the real and the imaginary parts in (2), we arrive at the well known
formulas of generalized summation: if $x \ne \pm 2\pi n$, $n = 0, 1, 2, \dots$,

$$\sum_{m=0}^{\infty} \cos mx = \frac{1}{2} \tag{3a}$$

and

$$\sum_{m=0}^{\infty} \sin mx = \frac{1}{2} \cot\frac{x}{2}. \tag{3b}$$
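Formulas (3a) and (3b) can be checked by evaluating the closed form of the damped series for a small $y > 0$. The following Python sketch does this; the function names and the test point $x = 1$ are illustrative assumptions.

```python
import cmath
import math

def abel_exponential_sum(x, y):
    """Closed form 1 / (1 - exp(i z)), z = x + i y, of sum_{m>=0} exp(i m z), y > 0."""
    return 1.0 / (1.0 - cmath.exp(1j * complex(x, y)))

def partial_exponential_sum(x, y, terms):
    """Direct partial sum of the damped series, for comparison with the closed form."""
    z = complex(x, y)
    return sum(cmath.exp(1j * m * z) for m in range(terms))

if __name__ == "__main__":
    x = 1.0
    s = abel_exponential_sum(x, 1e-8)
    print("real part:", s.real, "expected 1/2")
    print("imag part:", s.imag, "expected", 0.5 / math.tan(x / 2))
```

The real part tends to 1/2 and the imaginary part to $\frac{1}{2}\cot(x/2)$ as $y \to 0+$, for any $x$ that is not a multiple of $2\pi$.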

If $x = \pm 2\pi n$, $n = 0, 1, 2, \dots$, then the series (2) becomes a real-valued series
$1 + 1 + 1 + \dots$ which diverges to $+\infty$, and formulas (3) can be complemented by
formulas

$$\sum_{m=0}^{\infty} \cos mx = +\infty, \qquad \sum_{m=0}^{\infty} \sin mx = 0, \qquad x = \pm 2\pi n, \quad n = 0, 1, 2, \dots$$

A more detailed investigation of the character of singularities of series (2) in the
neighborhood of these points will be pursued at the beginning of the next section.
For $y > 0$, the absolute convergence of the series obtained by term-by-term
differentiation of series (1) justifies formulas

$$\sum_{m=0}^{\infty} m^n e^{imz} = (-i)^n \frac{d^n}{dz^n} \frac{1}{1 - e^{iz}},$$

for $n = 1, 2, \dots$. If, for $y \to 0+$, the right-hand sides of these equalities converge
to finite limits, then we will take them as generalized sums of the corresponding
divergent series on the left-hand side. Thus, putting $x = \pi$, we obtain that

$$\sum_{m=0}^{\infty} (-1)^m m^n = (-1)^n \frac{d^n f(y)}{dy^n} \bigg|_{y=0},$$

where

$$f(y) = \frac{1}{1 + e^{-y}} = \frac{1}{2} + \frac{1}{2} \tanh\left(\frac{y}{2}\right).$$

The Taylor expansion of the hyperbolic tangent function is of the form

$$\frac{1}{2} \tanh\left(\frac{y}{2}\right) = \sum_{k=1}^{\infty} \frac{2^{2k} - 1}{(2k)!} B_{2k}\, y^{2k-1}, \qquad |y| < \pi,$$

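where coefficients $B_n$ entering in the above formula are called Bernoulli numbers
and can be determined from the Taylor expansion

$$\frac{x}{e^x - 1} = -\frac{x}{2} + \frac{x}{2} \coth\frac{x}{2} = \sum_{k=0}^{\infty} \frac{B_k}{k!} x^k.$$

In particular,

$$B_2 = \frac{1}{6}, \quad B_4 = -\frac{1}{30}, \quad B_6 = \frac{1}{42}, \quad B_8 = -\frac{1}{30}, \quad B_{10} = \frac{5}{66}, \quad B_{12} = -\frac{691}{2730}, \quad B_{14} = \frac{7}{6}.$$

Hence,

$$\sum_{m=0}^{\infty} (-1)^m m^{2k} = 0, \qquad \sum_{m=1}^{\infty} (-1)^{m-1} m^{2k-1} = \frac{2^{2k} - 1}{2k} B_{2k}, \qquad k = 1, 2, \dots \tag{4}$$

For $k = 1$, (4) gives equality $1 - 2 + 3 - 4 + \dots = 1/4$ (8.5.12), and for $k = 2$
we get that

$$\sum_{m=1}^{\infty} (-1)^{m-1} m^3 = 1 - 8 + 27 - 64 + \dots = -\frac{1}{8}.$$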
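The values obtained from the Bernoulli numbers can be compared with a direct Abel-damped summation. The sketch below is a minimal check under the assumption that the generalized sums have the form $\frac{2^{2k}-1}{2k} B_{2k}$ for odd powers and vanish for even powers; the helper names and the damping value are hypothetical.

```python
import math
from fractions import Fraction

# Bernoulli numbers B_2 and B_4 (standard values).
BERNOULLI = {2: Fraction(1, 6), 4: Fraction(-1, 30)}

def bernoulli_value(k):
    """(2^(2k) - 1) / (2k) * B_{2k}: generalized sum of (-1)^(m-1) m^(2k-1)."""
    return Fraction(2 ** (2 * k) - 1, 2 * k) * BERNOULLI[2 * k]

def abel_power_sum(power, y, cutoff=80.0):
    """sum_{m>=1} (-1)^(m-1) m^power exp(-m y), truncated where m*y > cutoff."""
    m_max = int(cutoff / y)
    return math.fsum((-1) ** (m - 1) * m ** power * math.exp(-m * y)
                     for m in range(1, m_max + 1))

if __name__ == "__main__":
    y = 0.01
    print("m^1:", abel_power_sum(1, y), "vs", float(bernoulli_value(1)))
    print("m^3:", abel_power_sum(3, y), "vs", float(bernoulli_value(2)))
    print("m^2:", abel_power_sum(2, y), "vs 0")
```

For small `y` the damped sums approach 1/4, -1/8 and 0 respectively.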

Differentiation of series (1) led to the generalized summation formulas (4). The
integration of the similar series

$$\sum_{m=1}^{\infty} e^{-my} e^{i(m-1)x} = \frac{1}{e^y - e^{ix}}, \qquad y > 0, \tag{5}$$

also gives rise to useful relations. Multiplying both sides of (5) by a function $f(y)$
and integrating them term-by-term with respect to $y$ over $(0, \infty)$, we get

$$\sum_{m=1}^{\infty} F(m)\, e^{i(m-1)x} = \int_0^{\infty} \frac{f(y)\,dy}{e^y - e^{ix}},$$

where $F(t) = \int_0^{\infty} f(y) e^{-ty}\,dy$.



In particular, for $f(y) = y^{s-1}$, we get that $F(t) = \Gamma(s) t^{-s}$, where $\Gamma(s)$ is the
gamma function (4.4.3). Hence, we obtain the equality

$$\sum_{m=1}^{\infty} \frac{1}{m^s} e^{i(m-1)x} = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{y^{s-1}\,dy}{e^y - e^{ix}}. \tag{6}$$

For $x = 0$, the above expression gives the well known formula for the Riemann
zeta function

$$\zeta(s) = \sum_{m=1}^{\infty} \frac{1}{m^s} = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{y^{s-1}\,dy}{e^y - 1}, \qquad s > 1,$$

and, for $x = \pi$, we get

$$\eta(s) = \sum_{m=1}^{\infty} \frac{(-1)^{m-1}}{m^s} = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{y^{s-1}\,dy}{e^y + 1}, \qquad s > 0.$$

If we substitute $s = 1$ in (6), then we get

$$\sum_{m=1}^{\infty} \frac{1}{m} e^{imx} = e^{ix} \int_0^{\infty} \frac{dy}{e^y - e^{ix}} = \int_{1 - e^{ix}}^{1} \frac{dz}{z},$$

where we introduced a new variable $z = 1 - e^{ix} e^{-y}$. After evaluation of the
integral, we finally obtain that

$$\sum_{m=1}^{\infty} \frac{1}{m} e^{imx} = -\ln(1 - e^{ix}) = \frac{1}{2} \ln\frac{1}{2(1 - \cos x)} + \frac{i}{2}(\pi - x), \qquad 0 < x < 2\pi.$$

To conclude this section, we derive another useful formula by integrating (5)
with respect to $y$ over $(p, \infty)$, $p > 0$, and then substituting $x = 0$. This gives that

$$\sum_{m=1}^{\infty} \frac{e^{-mp}}{m} = \ln(1 + \coth(p/2)) - \ln 2, \qquad p > 0.$$

For small $p$ this series can be interpreted as "quasiharmonic" in which the
contribution of large terms is damped by the exponential multiplier. In particular, for
$p \to 0$ we get the asymptotic formula

$$\sum_{m=1}^{\infty} \frac{e^{-mp}}{m} \sim \ln(1/p),$$

which indicates that the main contribution to the sum is made by the first $N \sim 1/p$
terms.
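The closed form and the logarithmic asymptotics of the quasiharmonic series are easy to confirm numerically. The following is a minimal sketch; the function names and the cutoff are assumptions made for this illustration.

```python
import math

def quasiharmonic_sum(p, cutoff=60.0):
    """Direct sum of exp(-m p) / m, truncated where m * p > cutoff."""
    m_max = int(cutoff / p)
    return math.fsum(math.exp(-m * p) / m for m in range(1, m_max + 1))

def closed_form(p):
    """ln(1 + coth(p/2)) - ln 2, which equals -ln(1 - exp(-p))."""
    return math.log(1.0 + 1.0 / math.tanh(p / 2.0)) - math.log(2.0)

if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        print(p, quasiharmonic_sum(p), closed_form(p), math.log(1.0 / p))
```

The first two columns agree to rounding error, and both approach $\ln(1/p)$ as $p$ decreases.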

8.7 Periodic Dirac deltas


In this section we will consider the infinite series of Dirac deltas and the
functional series

$$U(x, y) = 1 + 2 \sum_{m=1}^{\infty} e^{-my} \cos(mx) = 2\,\mathrm{Re} \sum_{m=0}^{\infty} e^{imz} - 1, \qquad z = x + iy, \tag{1}$$

which will play a key role in our analysis. For $y > 0$, the latter converges, and
following Section 8.6, one can find its sum to be

$$U(x, y) = \frac{\sinh y}{\cosh y - \cos x}. \tag{2}$$

Note that for $y \to 0+$ the above function gives the Abel generalized sum of the
divergent trigonometric series

$$U(x) = U(x, y = 0+) = 1 + 2 \sum_{m=1}^{\infty} \cos(mx). \tag{3}$$

The same equality may be rewritten in the complex form to get

$$U(x) = U(x, 0+) = \sum_{m=-\infty}^{\infty} e^{imx}. \tag{4}$$

Setting $x = \pi$ in (3), we get a numerical series $U(\pi) = 1 - 2 + 2 - 2 + \dots$,
and we already know (see (8.2.4)) that its Abel sum is $U(\pi) = 1 - 1 = 0$. The
same answer $U(x) \equiv 0$ is obtained for any other $x$ if we put formally $y = 0$ in (2).
However, a closer inspection of the right-hand side of equality (4) indicates that for
$x = 2n\pi$, $n = 0, \pm 1, \pm 2, \dots$, the limit $U(2n\pi) = \dots + 1 + 1 + 1 + \dots = \infty$. The
situation is elucidated by Fig. 8.7.1, where the graph of function $U(x, y)$ is shown
for small $y > 0$. It hugs the x-axis everywhere except for small neighborhoods of
points $x = 2n\pi$, where it has sharp peaks.

FIGURE 8.7.1
The graph of function $U(x, y)$ for small $y > 0$.

To fully understand the fine structure of the series (4), consider an auxiliary series

$$\Phi(x, y) = \sum_{n=-\infty}^{\infty} \frac{2y}{(x - 2\pi n)^2 + y^2}. \tag{5}$$

For each $y > 0$, by its very structure, $\Phi(x, y)$ is an even, continuous and periodic
function of $x$ with period $2\pi$, which can be expanded into a Fourier series

$$\Phi(x, y) = c_0 + \sum_{m=1}^{\infty} c_m \cos(mx). \tag{6}$$

The zeroth coefficient

$$c_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Phi(x, y)\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_{n=-\infty}^{\infty} \frac{2y\,dx}{(x - 2\pi n)^2 + y^2}.$$

After changes of variables, and gluing the integrals together, we obtain a formula
containing a single integral that can be explicitly evaluated to get

$$c_0 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{2y\,dx}{x^2 + y^2} = 1.$$

Similarly,

$$c_m = \frac{1}{\pi} \int_{-\pi}^{\pi} \Phi(x, y) \cos mx\,dx = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{2y \cos(mx)\,dx}{x^2 + y^2} = 2e^{-my}.$$

Hence, series (6) coincides with series (1), so that, for any $y > 0$, we have equality
$\Phi(x, y) \equiv U(x, y)$ or, equivalently,

$$\frac{\sinh y}{\cosh y - \cos x} = 1 + 2 \sum_{m=1}^{\infty} e^{-my} \cos(mx) = \sum_{n=-\infty}^{\infty} \frac{2y}{(x - 2\pi n)^2 + y^2}. \tag{7}$$
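The three expressions in (7) can be compared numerically at a sample point. The sketch below is only an illustration: the function names, the truncation limits and the sample point $(x, y) = (1, 0.5)$ are assumptions. The Lorentzian series converges slowly (its tail decays like $1/n$), so it needs many more terms than the exponentially damped Fourier series.

```python
import math

def u_closed(x, y):
    """Closed form sinh(y) / (cosh(y) - cos(x))."""
    return math.sinh(y) / (math.cosh(y) - math.cos(x))

def u_fourier(x, y, cutoff=60.0):
    """1 + 2 * sum exp(-m y) cos(m x), truncated where m * y > cutoff."""
    m_max = int(cutoff / y)
    return 1.0 + 2.0 * math.fsum(math.exp(-m * y) * math.cos(m * x)
                                 for m in range(1, m_max + 1))

def u_lorentz(x, y, n_max=200000):
    """Symmetric partial sum of series (5); the neglected tail is O(y / n_max)."""
    return math.fsum(2.0 * y / ((x - 2.0 * math.pi * n) ** 2 + y * y)
                     for n in range(-n_max, n_max + 1))

if __name__ == "__main__":
    x, y = 1.0, 0.5
    print(u_closed(x, y), u_fourier(x, y), u_lorentz(x, y))
```

All three printed values should agree up to the truncation error of the Lorentzian sum.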

Now, if we let $y \to 0+$, the last series weakly converges to a periodic Dirac delta
distribution

$$\lim_{y \to 0+} \sum_{n=-\infty}^{\infty} \frac{2y}{(x - 2\pi n)^2 + y^2} = 2\pi \sum_{n=-\infty}^{\infty} \delta(x - 2\pi n),$$

and, in view of (4) and (7), we obtain a distributional Abel summation equality

$$\sum_{m=-\infty}^{\infty} e^{imx} = 2\pi \sum_{n=-\infty}^{\infty} \delta(x - 2\pi n), \tag{8}$$

where the left-hand side is the Fourier series representation of the periodic Dirac
delta with period $2\pi$. An obvious extension of this formula for the case of an
arbitrary period $2\pi/\Delta$ has the form

$$\sum_{m=-\infty}^{\infty} e^{im\Delta x} = \frac{2\pi}{\Delta} \sum_{n=-\infty}^{\infty} \delta\left(x - \frac{2\pi n}{\Delta}\right), \tag{9}$$

which should be compared with the already familiar formula (3.3.3) for the Fourier
image of the Dirac delta.
Equality (7), used here as a tool in deriving distributional formulas (8) and (9),
can also be used as a summation tool for more ordinary series often appearing in
physical applications. For example, setting $x = 0$ and $x = \pi$, we obtain that

$$\sum_{n=1}^{\infty} \frac{4y}{(2\pi n)^2 + y^2} = \coth\frac{y}{2} - \frac{2}{y}, \qquad \sum_{n=1}^{\infty} \frac{4y}{\pi^2 (2n - 1)^2 + y^2} = \tanh\frac{y}{2}.$$


8.8 Poisson summation formula


A limited supply of elementary and special functions leads to a situation in which
analytic solutions of many physical and engineering problems can only be written
with the help of series of elementary or special functions. For example, the well-
known method of separation of variables in partial differential equations leads to
solutions representable in the form of functional series, and the situation is similar
for solutions obtained by the method of successive approximations. Often it turns
out that a series obtained in this way converges poorly or does not converge at all.
In such cases it is desirable to find a transformation accelerating convergence of
that series, or an outright analytic expression for its sum. Sometimes, this goal can
be achieved through the Poisson summation formula

$$\sum_{m=-\infty}^{\infty} \tilde f(m\Delta) = \frac{1}{\Delta} \sum_{n=-\infty}^{\infty} f\left(\frac{2\pi n}{\Delta}\right), \tag{1}$$

which immediately follows from (8.7.9) by multiplying both sides by $f(x)$ and
integrating them over the whole x-axis.
Observe the main feature of formula (1). The slower the function $f(x)$ on the
right-hand side varies, the faster the Fourier image $\tilde f(\omega)$ on the left-hand side
decays to zero. This means that the more terms one needs on the right-hand
side for good approximation of the infinite series, the fewer terms of the series are
necessary on the left-hand side for accurate computation of its sum. So the Poisson
summation formula is capable of transforming poorly convergent series into rapidly
convergent ones, and many of its applications rest on the above phenomenon.
Relying on properties (3.1.3) and (3.2.5) of the Fourier transform and on formula
(1) we can rewrite the Poisson formula in the form

$$\frac{\Delta}{2\pi} \sum_{m=-\infty}^{\infty} f(m\Delta) \exp(-im\Delta s) = \sum_{n=-\infty}^{\infty} \tilde f\left(s + \frac{2\pi n}{\Delta}\right). \tag{2}$$

The left-hand side of (2) represents the discrete Fourier transform of function $f(t)$,
and the right-hand side expresses it in terms of the ordinary Fourier image $\tilde f(\omega)$.
Hence formula (2) is useful for interpreting results of the computer implementation
of the discrete Fourier transform.
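The trade-off between the two sides of the Poisson summation formula can be seen on a Gaussian. The sketch below assumes the Fourier image convention $\tilde f(\omega) = \frac{1}{2\pi}\int f(t) e^{-i\omega t}\,dt$ (the convention that appears in formula (3.1.3a) as quoted in Section 8.10) and the form $\sum_m \tilde f(m\Delta) = \frac{1}{\Delta}\sum_n f(2\pi n/\Delta)$ of the formula. The function names and the step values are illustrative.

```python
import math

def f(t):
    """Test function f(t) = exp(-t^2 / 2)."""
    return math.exp(-0.5 * t * t)

def f_image(w):
    """Fourier image of f under the (1 / 2 pi) * integral f(t) exp(-i w t) dt convention."""
    return math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)

def poisson_lhs(step, terms):
    """sum over |m| <= terms of f_image(m * step)."""
    return math.fsum(f_image(m * step) for m in range(-terms, terms + 1))

def poisson_rhs(step, terms):
    """(1 / step) * sum over |n| <= terms of f(2 pi n / step)."""
    return math.fsum(f(2.0 * math.pi * n / step)
                     for n in range(-terms, terms + 1)) / step

if __name__ == "__main__":
    # Small step: the left side needs hundreds of terms, the right side only n = 0.
    print(poisson_lhs(0.1, 200), poisson_rhs(0.1, 0))
    # Moderate step: both sides need a few terms.
    print(poisson_lhs(2.0, 20), poisson_rhs(2.0, 20))
```

For the step 0.1 a single term on the right reproduces a sum of several hundred terms on the left, which is the acceleration effect described above.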
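Example 1. Poisson summation formula for series of rational functions. Consider
the series

$$S(\alpha) = \sum_{m=-\infty}^{\infty} \frac{2\alpha\delta\, m^2}{(m^2 - \alpha^2)^2 + 4\alpha^2\delta^2 m^2} \tag{3}$$

of rational functions which is often encountered in physical applications. A use of
the Poisson summation formula gives that

$$S(\alpha) = \sum_{m=-\infty}^{\infty} \tilde f(m, \alpha) = \sum_{n=-\infty}^{\infty} f(2\pi n, \alpha), \tag{4}$$

where

$$\tilde f(\omega, \alpha) = \frac{2\alpha\delta\, \omega^2}{(\omega^2 - \alpha^2)^2 + 4\alpha^2\delta^2 \omega^2}$$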

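and

$$f(t, \alpha) = \int \tilde f(\omega, \alpha) \cos(\omega t)\,d\omega$$

in view of the evenness of $\tilde f$. The above integral can be evaluated by the method
of residues or by checking the tables of integrals. The result is

$$f(t, \alpha) = \frac{\pi}{\gamma} e^{-\beta|t|} \left(\gamma \cos(\gamma t) - \beta \sin(\gamma|t|)\right), \tag{5}$$

where

$$\beta = \alpha\delta, \qquad \gamma = \alpha\sqrt{1 - \delta^2}.$$

A substitution of (5) into (4) gives

$$S(\alpha) = \frac{2\pi}{\gamma} \left[\frac{\gamma}{2} + \gamma \sum_{n=1}^{\infty} e^{-2\pi\beta n} \cos(2\pi\gamma n) - \beta \sum_{n=1}^{\infty} e^{-2\pi\beta n} \sin(2\pi\gamma n)\right],$$

which is recognizable as a familiar series of complex exponentials. The computation
of its sums reduces to the formula

$$Q(\lambda, \nu) = \sum_{n=1}^{\infty} \exp[-(\lambda + i\nu)n] = \frac{1}{\exp(\lambda + i\nu) - 1}$$

for the sum of a geometric series. Since, clearly,

$$S(\alpha) = \frac{2\pi}{\gamma} \left[\frac{\gamma}{2} + \gamma\,\mathrm{Re}\,Q(2\pi\beta, 2\pi\gamma) + \beta\,\mathrm{Im}\,Q(2\pi\beta, 2\pi\gamma)\right],$$

an explicit calculation of the real and the imaginary parts of function $Q(\lambda, \nu)$ gives

$$S(\alpha) = \frac{\pi}{\gamma} \cdot \frac{\gamma \sinh(2\pi\beta) - \beta \sin(2\pi\gamma)}{\cosh(2\pi\beta) - \cos(2\pi\gamma)}. \tag{6}$$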
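Formula (6) can be tested against a brute-force summation. The summand used below, $2\alpha\delta m^2 / ((m^2 - \alpha^2)^2 + 4\alpha^2\delta^2 m^2)$, is an assumption of this sketch: it is the Fourier image that corresponds to the function (5) under the cosine representation $f(t,\alpha) = \int \tilde f(\omega,\alpha)\cos(\omega t)\,d\omega$, and may differ in presentation from the series (3) as printed. Function names and parameter values are illustrative.

```python
import math

def s_direct(alpha, damping, m_max=200000):
    """Symmetric partial sum of the rational series (the m = 0 term vanishes).

    The summand is an assumed reconstruction, see the accompanying text.
    The neglected tail is roughly 4 * alpha * damping / m_max.
    """
    a2 = alpha * alpha
    c = 4.0 * a2 * damping * damping
    return 2.0 * math.fsum(
        2.0 * alpha * damping * m * m / ((m * m - a2) ** 2 + c * m * m)
        for m in range(1, m_max + 1))

def s_closed(alpha, damping):
    """Closed form (6) with beta = alpha * delta, gamma = alpha * sqrt(1 - delta^2)."""
    beta = alpha * damping
    gamma = alpha * math.sqrt(1.0 - damping * damping)
    num = gamma * math.sinh(2 * math.pi * beta) - beta * math.sin(2 * math.pi * gamma)
    den = math.cosh(2 * math.pi * beta) - math.cos(2 * math.pi * gamma)
    return math.pi / gamma * num / den

if __name__ == "__main__":
    for alpha in (1.0, 1.3, 2.5):
        print(alpha, s_direct(alpha, 0.05), s_closed(alpha, 0.05))
```

The direct sum needs hundreds of thousands of terms to match the closed form to five or six digits, which illustrates why the transformed expression is preferable in practice.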

This, and related formulas, can also be found by other methods, but the Poisson
formula provides, as a rule, the fastest path to the goal. The typical graph of $S(\alpha)$
(6) is pictured in Fig. 8.8.1.

FIGURE 8.8.1
Graph of series (3) in case of $\delta = 0.05$ evaluated with the help of formula (6).
The larger $\alpha$, the more terms in (3) are needed to maintain the validity of the
result.

8.9 Summation of divergent geometric series


The Poisson summation formula was used in Example 1 of the previous
Section 8.8 to explicitly sum a nontrivial but absolutely convergent series; later on
we will see its applications to accelerate the convergence of already convergent
series. In this section, however, we will utilize it to solve a more exotic summability
problem for the everywhere divergent series

$$S(z) = 1 + 2 \sum_{m=1}^{\infty} \cos(mz), \tag{1}$$

where $z$ is a complex variable, and where the Abel method and thus even more so
the Cesaro method, fail.

Let us form an auxiliary perturbed series

$$S(z, \varepsilon) = 1 + 2 \sum_{m=1}^{\infty} e^{-\varepsilon m^2} \cos(mz) = \sum_{m=-\infty}^{\infty} \exp(-\varepsilon m^2 + imz), \tag{2}$$

which, obviously, absolutely converges for any $\varepsilon > 0$. If the limit

$$S(z) = \lim_{\varepsilon \to 0+} S(z, \varepsilon)$$

exists for some $z$, then we will take it as a generalized sum of the divergent series
(1).
To find the set of $z$'s for which the above limit exists we will transform series
(2) by means of the Poisson summation formula

$$\sum_{m=-\infty}^{\infty} f(m) = 2\pi \sum_{n=-\infty}^{\infty} \tilde f(2\pi n), \tag{3}$$

with $f(t) = \exp(-\varepsilon t^2 + itz)$. The left-hand side of (3) coincides with series (2),
and the right-hand side contains function

$$\tilde f(\omega) = \frac{1}{2\pi} \int \exp(-\varepsilon t^2 + it(z - \omega))\,dt = \frac{1}{2\sqrt{\pi\varepsilon}} \exp\left[-\frac{(z - \omega)^2}{4\varepsilon}\right].$$

Hence,

$$S(z, \varepsilon) = \sqrt{\frac{\pi}{\varepsilon}} \sum_{n=-\infty}^{\infty} \exp\left[-\frac{(z - 2\pi n)^2}{4\varepsilon}\right]. \tag{4}$$

Notice that

$$\left|\exp\left[-\frac{(z - 2\pi n)^2}{4\varepsilon}\right]\right| = \exp\left[\frac{y^2 - (x - 2\pi n)^2}{4\varepsilon}\right].$$

It means that for any $z = x + iy$ from the set (blackened out in Fig. 8.9.1)

$$G = \{z \in \mathbb{C} : |y| < |x - 2\pi n|, \ n = 0, \pm 1, \pm 2, \dots\},$$

all the terms of series (4) converge to zero as $\varepsilon \to 0+$. Correspondingly, it is
easy to prove that for $z \in G$ we have $S(z) = \lim_{\varepsilon \to 0+} S(z, \varepsilon) = 0$. Thus, we
demonstrated that

$$S(z) = 1 + 2 \sum_{m=1}^{\infty} \cos(mz) = 0, \qquad z \in G.$$

FIGURE 8.9.1
Region G in the complex plane.

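In particular, for $x = \pi$, we get that

$$1 + 2 \sum_{m=1}^{\infty} (-1)^m \cosh(my) = 0, \qquad |y| < \pi.$$

Substituting $q = \exp y$, the above equality can be rewritten in the form

$$\sum_{m=0}^{\infty} (-1)^m \left[q^m + (1/q)^m\right] = 1, \qquad e^{-\pi} < q < e^{\pi}. \tag{5}$$

For $q > 1$, the series

$$\sum_{m=0}^{\infty} (-1)^m (1/q)^m = \frac{q}{1 + q}$$

converges absolutely. In view of the linearity and regularity of our summation
method, we obtain the following striking extension of the familiar formula for the
sum of a geometric series:

$$\sum_{m=0}^{\infty} (-1)^m q^m = \frac{1}{1 + q}, \qquad 1 < q < e^{\pi}.$$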
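The Gaussian regularization $e^{-\varepsilon m^2}$ used in this section can be tried directly on the divergent geometric series. The sketch below is an illustration under the assumption $1 < q < e^{\pi}$; the function name is hypothetical. Double-precision arithmetic limits how small $\varepsilon$ can be taken, because the largest terms grow like $\exp(\ln^2 q / (4\varepsilon))$ and the sum relies on cancellation between them.

```python
import math

def regularized_geometric(q, eps, m_max=2000):
    """sum_{m>=0} (-1)^m q^m exp(-eps m^2).

    The exponent m * ln(q) - eps * m^2 is combined before calling exp
    so that no intermediate overflow occurs.
    """
    log_q = math.log(q)
    return math.fsum((-1) ** m * math.exp(m * log_q - eps * m * m)
                     for m in range(m_max + 1))

if __name__ == "__main__":
    q = 2.0
    for eps in (0.1, 0.03, 0.01):
        print(eps, regularized_geometric(q, eps), "target", 1.0 / (1.0 + q))
```

For $q = 2$ the regularized sums approach $1/3$ as $\varepsilon$ decreases; pushing $\varepsilon$ much below 0.01 would require extended-precision arithmetic.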

8.10 Shannon's sampling theorem


In this section, we will obtain another useful result with the help of the Poisson
summation formula. Shannon's sampling theorem, which is of importance in
information theory, will follow as a particular case.
Consider a function $g(t)$ given by an infinite series, uniformly convergent on the
entire real axis,

$$g(t) = \sum_{m=-\infty}^{\infty} f(m\Delta)\,\psi(t - m\Delta). \tag{1}$$

Here $f(t)$ and $\psi(t)$ are known functions with Fourier images $\tilde f(\omega)$ and $\tilde\psi(\omega)$,
respectively. Let us apply the Fourier transform (3.1.1) to both sides of (1). In view
of the uniform convergence of the series, the integration and infinite summation
operations can be interchanged so that, taking into account the formula (3.1.3a)

$$\frac{1}{2\pi} \int \psi(t - m\Delta) e^{-i\omega t}\,dt = \tilde\psi(\omega) e^{-i\omega m\Delta},$$

we have

$$\tilde g(\omega) = \tilde\psi(\omega) \sum_{m=-\infty}^{\infty} f(m\Delta) e^{-i\omega m\Delta}. \tag{2}$$

Finally, transforming the sum in (2) by means of the Poisson formula (8.8.2), we
get

$$\tilde g(\omega) = \frac{2\pi}{\Delta} \tilde\psi(\omega) \sum_{n=-\infty}^{\infty} \tilde f\left(\omega + \frac{2\pi n}{\Delta}\right). \tag{3}$$

Now, let $\tilde f(\omega)$ be a function with compact support, identically equal to zero for

$$|\omega| \ge \pi/\Delta. \tag{4}$$

Then the supports of the summands in (3) have empty intersections and, for a
given frequency $\omega$, only one term of the infinite series is different from zero. In
particular, for $|\omega| \le \pi/\Delta$, equality (3) is equivalent to the equality

$$\tilde g(\omega) = \frac{2\pi}{\Delta} \tilde\psi(\omega) \tilde f(\omega). \tag{5}$$

Assume additionally that $\tilde\psi(\omega)$ is of the form

$$\tilde\psi_0(\omega) = \frac{\Delta}{2\pi} \Pi(\Delta\omega/\pi), \tag{6}$$

where the rectangular function

$$\Pi(\nu) = \chi(\nu + 1) - \chi(\nu - 1). \tag{7}$$

Then, equality (5) is valid for any $\omega$ and assumes the form

$$\tilde g(\omega) = \tilde f(\omega). \tag{8}$$

The equality of the Fourier images implies the equality of the functions themselves.
Hence, substituting in (1) the inverse Fourier image

$$\psi_0(t) = \frac{\sin\tau}{\tau}, \qquad \tau = \frac{\pi t}{\Delta}, \tag{9}$$

of the rectangular function (6), we arrive at the equality

$$f(t) = \sum_{m=-\infty}^{\infty} f(m\Delta) \frac{\sin((\pi t/\Delta) - \pi m)}{(\pi t/\Delta) - \pi m}, \tag{10}$$

which expresses the contents of Shannon's sampling theorem:

Assume that the Fourier image of function $f(t)$ vanishes outside the interval
$|\omega| \le \pi/\Delta$. Then, for any $t$, the values of $f(t)$ are completely determined via
formula (10) by the values of this function at discrete time instants

$$t_m = m\Delta. \tag{11}$$
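Formula (10) is straightforward to try out on a band-limited test signal. The sketch below uses $f(t) = (\sin t / t)^2$, whose Fourier image is a triangle supported on $|\omega| \le 2$, so any step $\Delta \le \pi/2$ is admissible. The signal, the function names and the truncation of the infinite sum are assumptions made for this illustration.

```python
import math

def sinc(u):
    """sin(u) / u with the removable singularity at u = 0 filled in."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def band_limited(t):
    """Test signal (sin t / t)^2; its Fourier image vanishes for |omega| >= 2."""
    return sinc(t) ** 2

def shannon_reconstruct(f, t, step, m_max=2000):
    """Truncated series (10): sum over |m| <= m_max of f(m*step) * sinc(pi*t/step - pi*m)."""
    return math.fsum(f(m * step) * sinc(math.pi * t / step - math.pi * m)
                     for m in range(-m_max, m_max + 1))

if __name__ == "__main__":
    for t in (0.3, 1.7, 4.2):
        print(t, band_limited(t), shannon_reconstruct(band_limited, t, step=1.0))
```

With the step 1.0 the reconstructed values match the signal between the sampling instants up to the truncation error of the sum.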

The theorem has numerous applications in physics and information theory which
will not be discussed here. However, we will take a closer look at some of its
modifications and extensions which will permit us to grasp the meaning of this
result at a deeper level.
First of all, note that if $\tilde\psi(\omega)$ is taken to be a one-sided rectangular function (as
opposed to the symmetric function (6))

$$\tilde\psi(\omega) = \frac{\Delta}{\pi} \left[\chi(\omega) - \chi(\omega - \pi/\Delta)\right], \tag{12}$$

then the right-hand side of (8) becomes

$$\tilde g(\omega) = 2\tilde f(\omega)\chi(\omega).$$

As we observed in Section 6.6, the right-hand side is the Fourier image of an analytic
signal $F(t)$ corresponding to function $f(t)$. The original function corresponding
to the Fourier image (12) is

$$\psi(t) = \frac{\Delta}{\pi i t} \left[\exp(i\pi t/\Delta) - 1\right].$$

Substituting this expression into (1), we find an explicit formula for the complex
function

$$F(t) = \sum_{m=-\infty}^{\infty} f(m\Delta) \frac{\exp((i\pi t/\Delta) - i\pi m) - 1}{i\left[(\pi t/\Delta) - \pi m\right]},$$

representing the analytic signal corresponding to function $f(t)$.


Let us rewrite formula (3), replacing $\Delta$ by $\delta$:

$$\tilde g(\omega) = \frac{2\pi}{\delta} \tilde\psi(\omega) \sum_{n=-\infty}^{\infty} \tilde f(\omega + 2\pi n/\delta), \tag{13}$$

and observe that if

$$\delta < \Delta, \tag{14}$$

and $\tilde f(\omega)$, as before, identically vanishes for $|\omega| \ge \pi/\Delta$, then the supports of
summands in the series (13) are separated by gaps of length $2\pi(1/\delta - 1/\Delta)$. As a
result, to pass from (13) to (8), it suffices to select $\tilde\psi(\omega)$ from a broad class of
functions

$$\tilde\psi(\omega) = \begin{cases} \delta/(2\pi), & \text{for } |\omega| \le \pi/\Delta; \\ 0, & \text{for } |\omega| \ge \pi(2/\delta - 1/\Delta); \\ \text{arbitrary}, & \text{for } \pi/\Delta < |\omega| < \pi(2/\delta - 1/\Delta). \end{cases} \tag{15}$$

The schematic plot of one of possible Fourier images $\tilde\psi(\omega)$ for which the equality
(8) is valid is shown in Fig. 8.10.1.
Taking one of the Fourier images (15) and calculating the corresponding original
function $\psi(t)$ we arrive at a formula more general than (10):

$$f(t) = \sum_{m=-\infty}^{\infty} f(m\delta)\,\psi(t - m\delta). \tag{16}$$

FIGURE 8.10.1
The plot of one of possible Fourier images $\tilde\psi(\omega)$ for which the equality (8) is
valid. The triangles symbolize here the summands in the series (13).

In particular, setting the Fourier image (15) to zero for $|\omega| > \pi/\Delta$ we arrive at the
most widely used form of Shannon's formula

$$f(t) = \varepsilon \sum_{m=-\infty}^{\infty} f(m\delta) \frac{\sin((\pi t/\Delta) - \varepsilon\pi m)}{(\pi t/\Delta) - \varepsilon\pi m}, \qquad \varepsilon = \delta/\Delta < 1. \tag{17}$$

It would seem that formulas (16) and (17) have, in comparison with the simplest
formula (10), the shortcoming of requiring the knowledge of values $f(t)$ at more
densely distributed time instants $t_m = \delta m$. However, the indicated drawback is partly
compensated by the great flexibility in the selection of function $\psi(t)$. The freedom to
impose values of the Fourier image $\tilde\psi(\omega)$ in the intervals

$$\frac{\pi}{\Delta} < |\omega| < \pi\left(\frac{2}{\delta} - \frac{1}{\Delta}\right) \tag{18}$$

can be utilized to improve the speed of convergence of the series appearing on
the right-hand side of the generalized Shannon's theorem. Indeed, the experience
gathered by the reader while studying the Fourier transform's asymptotics in
Chapter 4 suggests that to achieve that goal one should choose $\tilde\psi(\omega)$ decaying to zero
inside the intervals (18) as smoothly as possible. An infinitely differentiable (in
the classical sense) $\tilde\psi(\omega)$ would be ideal. In this case, the corresponding original
function $\psi(t)$ would decay for $|t| \to \infty$ faster than any power function $1/t^n$. As a
result, it may turn out that in computing values of $f(t)$ with a given accuracy fewer
terms are needed in the series (16) than in (10). An example of a damped Fourier
image $\tilde\psi(\omega)$ and the corresponding rapidly decaying function $\psi(t)$ is provided in
Exercises at the end of this chapter.
In electrical engineering applications one often deals with narrow-band $f(t)$, that
is, with functions whose Fourier images are concentrated in a narrow neighborhood
of specific frequencies $\pm\omega_0$ and vanish outside the intervals

$$|\omega - \omega_0| < \Omega, \qquad |\omega + \omega_0| < \Omega.$$

Using the simplest Shannon formula (10) we should impose values of function
$f(t)$ in intervals of length

$$\Delta = \frac{\pi}{\omega_0 + \Omega} \tag{19}$$

because the Fourier image $\tilde f(\omega)$ of a narrow-band function $f(t)$ is identically equal
to zero only for frequencies satisfying condition

$$|\omega| \ge \omega_0 + \Omega,$$

analogous to (4).
At the same time it is intuitively clear that for a narrow-band signal for which
$\Omega \ll \omega_0$, one should have a more adequate version of formula (16), with sufficiently
large intervals $\delta$ between the readings: $\delta \sim 2\pi/\Omega \gg \pi/\omega_0$. Let us derive it.
To begin with, note that the compactly supported Fourier image of a narrow-band
signal can be represented in the form of two components

$$\tilde f(\omega) = \tilde f_+(\omega) + \tilde f_-(\omega), \tag{20}$$

concentrated in the neighborhoods of the central frequencies $+\omega_0$ and $-\omega_0$,
respectively. Choose $\delta$ so that the central frequencies of components $\tilde f_+$ in the sum
(13) would be located halfway in between adjacent central frequencies of the
components $\tilde f_-$. To accomplish this it is necessary that for some positive integer $l$ we
have the equality

$$-\omega_0 + \frac{2\pi}{\delta} l = \omega_0 - \frac{\pi}{\delta}.$$

Consequently,

$$\delta = \pi(l + 1/2)/\omega_0. \tag{21}$$

Select the value of $l$ in such a way that, for any $\omega$, only one of the summands in
(13) is different from zero. To achieve this it is sufficient to demand that $\delta$ satisfies
the inequality $\pi/2\delta > \Omega$. Hence by (21) it follows that

$$l + \frac{1}{2} < \frac{\omega_0}{2\Omega}. \tag{22}$$

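It remains to select as $\tilde\psi(\omega)$ a function that would turn equality (13) into

$$\tilde g(\omega) = \tilde f_+(\omega) + \tilde f_-(\omega).$$

By analogy with (15), it is clear that here it is sufficient to let

$$\tilde\psi(\omega) = \begin{cases} \delta/(2\pi), & \text{for } |\omega \pm \omega_0| \le \Omega; \\ 0, & \text{for } |\omega \pm \omega_0| \ge \pi/\delta - \Omega; \\ \text{arbitrary}, & \text{for } \Omega < |\omega \pm \omega_0| < \pi/\delta - \Omega. \end{cases} \tag{23}$$

In particular, setting $\tilde\psi(\omega)$ to be identically zero for frequencies satisfying the two
inequalities $|\omega \pm \omega_0| > \Omega$, that is selecting

$$\tilde\psi_0(\omega) = \frac{\delta}{2\pi} \left[\Pi\left(\frac{\omega - \omega_0}{\Omega}\right) + \Pi\left(\frac{\omega + \omega_0}{\Omega}\right)\right], \tag{24}$$

where $\Pi(\nu)$ is given by (7), we can calculate the corresponding original function
to be

$$\psi_0(t) = \frac{2\delta}{\pi t} \sin(\Omega t) \cos(\omega_0 t). \tag{25}$$

Substituting it into (16), we arrive at the simplest variant of the narrow-band
Shannon's theorem:

$$f(t) = \frac{2\delta}{\pi} \sum_{m=-\infty}^{\infty} f(m\delta) \frac{\sin(\Omega t - m\delta\Omega)}{t - m\delta} \cos(\omega_0 t - m\omega_0\delta). \tag{26}$$

Note that, if $f(t)$ is a real narrow-band process, that is if $\Omega \ll \omega_0$ then, without
violating inequality (22), one can select $l \gg 1$, and make the distance $\delta$ (21) between
readings $t_m$ much larger than $\Delta$ (19), as in the case of the standard Shannon's
theorem.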
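The narrow-band formula (26) can be exercised on a synthetic signal. The sketch below assumes a carrier $\omega_0 = 10$, a half-width $\Omega = 1$ and $l = 2$, so that $l + 1/2 < \omega_0/(2\Omega)$ holds; the test signal is an envelope $(\sin(at)/(at))^2$ with $a = \Omega/2$ multiplied by $\cos(\omega_0 t)$, whose Fourier image is supported on $|\omega \pm \omega_0| \le \Omega$. All numerical values and names are illustrative choices, not taken from the text.

```python
import math

OMEGA0 = 10.0    # carrier frequency omega_0 (illustrative value)
BAND = 1.0       # half-width Omega of each spectral band (illustrative value)
L_INDEX = 2      # integer l; satisfies l + 1/2 < omega_0 / (2 * Omega)
STEP = math.pi * (L_INDEX + 0.5) / OMEGA0   # sampling step delta from (21)

def signal(t):
    """Narrow-band test signal: (sin(a t)/(a t))^2 * cos(omega_0 t), a = Omega / 2."""
    a = BAND / 2.0
    envelope = 1.0 if t == 0.0 else (math.sin(a * t) / (a * t)) ** 2
    return envelope * math.cos(OMEGA0 * t)

def kernel(t):
    """psi_0(t) from (25); its limit at t = 0 is 2 * delta * Omega / pi."""
    if t == 0.0:
        return 2.0 * STEP * BAND / math.pi
    return 2.0 * STEP / (math.pi * t) * math.sin(BAND * t) * math.cos(OMEGA0 * t)

def narrow_band_reconstruct(t, m_max=3000):
    """Truncated series (26) written as sum_m f(m * delta) * psi_0(t - m * delta)."""
    return math.fsum(signal(m * STEP) * kernel(t - m * STEP)
                     for m in range(-m_max, m_max + 1))

if __name__ == "__main__":
    standard_step = math.pi / (OMEGA0 + BAND)   # step required by formula (10)
    print("narrow-band step:", STEP, "standard step:", standard_step)
    for t in (0.3, 1.1, 2.9):
        print(t, signal(t), narrow_band_reconstruct(t))
```

In this configuration the sampling step is about 2.75 times larger than the step $\pi/(\omega_0 + \Omega)$ needed by the simplest formula (10), and the reconstruction still matches the signal up to the truncation error.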

8.11 Divergent integrals


The summability problem for divergent integrals is close in spirit to that for divergent
infinite series. Let us demonstrate this using the Fourier transform

$$\tilde\chi(\omega) = \frac{1}{2\pi} \int_0^{\infty} e^{-i\omega t}\,dt$$

of the Heaviside function $\chi(t)$ as an example. It is a typical divergent integral.
Evaluating it by the infinitesimal relaxation method gives

$$\int_0^{\infty} e^{-i\omega t}\,dt = \lim_{\alpha \to 0+} \int_0^{\infty} e^{-i\omega t - \alpha t}\,dt = \frac{1}{i\omega + 0} \tag{1}$$

and illustrates an application of the Abel summation method to divergent integrals.
The generalized sum (1) of the above divergent integral is quite stable with
respect to a wide class of regularizing functions. Indeed, let us replace the above
exponential regularizing function $\exp(-\alpha t)$ by an arbitrary, absolutely integrable
and continuously differentiable function $f(\alpha t)$, and consider the integral

$$\int_0^{\infty} e^{-i\omega t} f(\alpha t)\,dt. \tag{2}$$

The change of variable $x = \alpha t$ transforms (2) into the integral

$$\frac{1}{\alpha} \int_0^{\infty} e^{-ipx} f(x)\,dx,$$

where we also introduced a new parameter $p = \omega/\alpha$. If $\alpha \to 0$ then $p \to \infty$.
Hence, to evaluate the above integral, we can employ results of Chapter 4
on the asymptotic behavior of Fourier transforms of discontinuous functions. In
particular, by analogy with (4.3.3), we have that

$$\frac{1}{\alpha} \int_0^{\infty} e^{-ipx} f(x)\,dx \sim \frac{1}{i\alpha p} f(0) = \frac{1}{i\omega} f(0), \qquad \alpha \to 0, \quad \omega \ne 0. \tag{3}$$

The regularity assumption for our summation method requires that $f(0) = 1$. So,
for $\omega \ne 0$, the summation result is the same as in (1).
The above summation method obviously satisfies the separation of scales
condition (as discussed in the context of series summability methods). However, one
can easily produce examples of methods which violate it and thus can potentially
give nonunique answers.
Example 1. Divergent integral with nonunique generalized sum. Consider the
divergent integral

$$\int_0^{\infty} \sin x\,dx. \tag{4}$$

Summation of this integral following the prescription given in (2) gives value 1,
which can be also found by formally writing the integral as a series $\sum a_n$ with
terms

$$a_n = \int_{\pi n}^{\pi(n+1)} \sin x\,dx = 2(-1)^n,$$

so that

$$\int_0^{\infty} \sin x\,dx = 2 - 2 + 2 - 2 + \dots = 1.$$

However, if one considers a method of summation based on the integral

$$I(\alpha) = \int_0^{\infty} e^{-\alpha x} \sin x\,[1 + 2\alpha q \sin x]\,dx, \tag{5}$$

one obtains a result different from 1. Indeed, the integral (5) can be easily evaluated
to give

$$I(\alpha) = \frac{1}{\alpha^2 + 1} + \frac{4q}{\alpha^2 + 4}.$$

As $\alpha \to 0$, the integrand in (5) converges uniformly on bounded sets of $x$'s to
$\sin x$, the integrand in (4). However, for any $q \ne 0$,

$$\lim_{\alpha \to 0} I(\alpha) = 1 + q \ne 1,$$

and for different values of parameter $q$, one gets different summation results. This
is related to the fact that the relaxation function $e^{-\alpha x}[1 + 2\alpha q \sin x]$ is not of the
form $f(\alpha x)$ and violates the principle of separation of scales.
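The dependence of the limit on $q$ can be confirmed by numerical quadrature. The sketch below integrates (5) with a composite Simpson rule on a truncated interval and compares the result with the closed form. The helper names, the cutoff and the grid density are assumptions of this example.

```python
import math

def relaxed_integrand(x, alpha, q):
    """Integrand of (5): exp(-alpha x) sin x [1 + 2 alpha q sin x]."""
    return math.exp(-alpha * x) * math.sin(x) * (1.0 + 2.0 * alpha * q * math.sin(x))

def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    total += 4.0 * math.fsum(func(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    total += 2.0 * math.fsum(func(a + 2 * k * h) for k in range(1, n // 2))
    return total * h / 3.0

def relaxed_integral(alpha, q, cutoff=40.0, points_per_unit=100):
    """Numerical value of I(alpha), truncated where alpha * x > cutoff."""
    upper = cutoff / alpha
    n = 2 * int(upper * points_per_unit / 2)
    return simpson(lambda x: relaxed_integrand(x, alpha, q), 0.0, upper, n)

def closed_form(alpha, q):
    """1 / (alpha^2 + 1) + 4 q / (alpha^2 + 4)."""
    return 1.0 / (alpha ** 2 + 1.0) + 4.0 * q / (alpha ** 2 + 4.0)

if __name__ == "__main__":
    for q in (0.0, 0.5, -1.0):
        print(q, relaxed_integral(0.1, q), closed_form(0.1, q), "limit", 1.0 + q)
```

For $q = 0$ the value tends to 1 as $\alpha \to 0$, while any other $q$ shifts the limit to $1 + q$.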

8.12 Exercises
1. Find the sum of the series

$$S(x, y) = \sum_{m=1}^{\infty} e^{-my} \sin(mx), \qquad y > 0.$$

2. Find the sum of the divergent series

$$S = \sum_{m=1}^{\infty} m \sin(mx).$$

3. Using the Poisson summation formula find the functional action of the distribution

$$E = \sum_{m=1}^{\infty} \sin(mx).$$

4. The analysis of waves propagating in resonators and waveguides leads to series of
the following type:

$$S(\omega, \Delta) = \sum_{m=-\infty}^{\infty} f(m\Delta) \exp(-im g(\omega)),$$

where $f(x)$ is an absolutely integrable function such that $f(0) = 1$. Transform the above
series by means of the Poisson summation formula and find its weak limit for $\Delta \to 0$.

5. Using formula (8.7.8), transform the series

$$P(\mathbf{x}) = \sum_{\mathbf{m}} \exp\left(i(\mathbf{m} \cdot \mathbf{a}(\mathbf{x}))\right),$$

where the summation is extended to all vectors $\mathbf{m}$ with integer components $(m_1, m_2, m_3)$,
and $\mathbf{y} = \mathbf{a}(\mathbf{x})$ is an infinitely differentiable vector field which provides a one-to-one
mapping of the x-space into the y-space such that the Jacobian $J(\mathbf{x}) = |\partial\mathbf{a}/\partial\mathbf{x}|$ of the
transformation is everywhere positive and continuous. The inner product appearing under
the summation is $(\mathbf{a}(\mathbf{x}) \cdot \mathbf{m}) = a_1(\mathbf{x})m_1 + a_2(\mathbf{x})m_2 + a_3(\mathbf{x})m_3$.
6. The 3-D Poisson summation formula turns out to be useful in solid state physics
and, in particular, in the study of crystal properties. Using the result from Exercise 5,
derive the right-hand side of the formula if the left-hand side is
00

F(x) =
L
ml,m2,m3=-oo

where $a_1, a_2, a_3$ are three non-coplanar vectors, $x = (x_1, x_2, x_3)$ in a certain Cartesian
coordinate system, and $f(x)$ is an absolutely integrable function with 3-D Fourier image
$\tilde f(k)$.
7. Calculate the discrete Fourier transform of the rectangular function

$$\Pi(t) = \chi(t + 1) - \chi(t - 1). \qquad (1)$$

8. Find the discrete Fourier transform of the function

$$f(t) = \Pi(t) \cos^4(\pi t / 2)$$

and then compare it with its usual Fourier image (4.3.22).

9. Find the discrete Fourier image of the periodic function $f(t)$, with period $T = M\Delta$,
where $M$ is a positive integer.
10. Find the sum of the functional series

$$S(\Delta) = \sum_{m=1}^{\infty} \frac{\sin(m\Delta)}{m}$$

with the help of the Poisson summation formula.

11. The functions
$$f_N(t) = \sum_{m=1}^{N} \frac{\sin(mt)}{m}$$
are partial sums of the Fourier series of the periodic function $f(t) = (\pi - t)/2 + \pi \lfloor t/(2\pi) \rfloor$
(see solution to Exercise 10). Compare functions $f_N(t)$ and $f(t)$, and investigate the
asymptotic behavior (for $N \to \infty$) of $f_N(t)$ in the vicinity of the discontinuities $t = 2\pi n$
of $f(t)$.
12. The Gibbs phenomenon discovered in Exercise 11 is undesirable in many physical
and engineering applications. Find the method of summation of the first $N$ terms of the
series
$$\tilde f_N(t) = \sum_{m=1}^{N} h(m, N) \frac{\sin(mt)}{m}$$
which, at the continuity points of $f(t)$ from Exercise 11, would guarantee convergence
of $\tilde f_N(t)$ to $f(t)$, and would avoid the Gibbs phenomenon at the discontinuity points of
$f(t)$.
13. Derive a formula analogous to the formula (8.9.26) for the analytic signal $F(t)$
corresponding to the narrow-band signal $f(t)$, whose Fourier image is identically zero
for $|\omega \pm \omega_0| \ge \Omega$.
14. Construct a function $\psi(t)$ entering into the Shannon series (8.9.16), and possessing
a Fourier image of the type (8.9.15). Use common sense.
15. Find the maximal distance $\Delta$ between readings in the narrow-band Shannon's
formula (8.9.25) for $\Omega = \omega_0/10$.
16. Suppose that $f(t)$ is a narrow-band function with the Fourier image vanishing
for $|\omega \pm \omega_0| \ge \Omega = \omega_0/10$. Find a function $\psi(t)$ entering into the generalized Shannon
formula (8.9.16), which decays sufficiently rapidly as $|t| \to \infty$. Utilize results of
Exercises 14 and 15.
Appendix A
Answers and Solutions

A.1 Chapter 1. Definitions and operations


1. (a) (./1i/2)8'(x);
(b) «-rr/2)8'(x);
(c) The "zero distribution" 0;
(d) X(x).
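2. $\int f(x)\,dx = 0$, $\int F(x)\,dx = 1$, where $F(x) = \int_{-\infty}^{x} f(y)\,dy$.
3. (a) $\chi'(ax) = \delta(x)$ and is independent of $a$.
(b)
$$\chi'(e^{\lambda x} \sin ax) = \sum_{n=-\infty}^{\infty} (-1)^n \delta\left(x - \frac{\pi n}{a}\right)$$
and is independent of $\lambda$.
4. $f'(x) = \delta(x - 1) - \delta(x + 1)$, and

$$\lim_{\varepsilon \to 0+} f'(x/\varepsilon)/\varepsilon^2 = -2\delta'(x).$$

5. $y(x) = A\delta(x - 1) + B\delta'(x - 1) + C\delta(x)$, where $A$, $B$, $C$ are arbitrary constants.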


6. Taking into account the multiplier probing property, we have

8. $\delta_\sigma = \delta(g(x) - a)\,|\nabla g(x)|$. The action of this distribution as a functional on an
arbitrary test function $\phi \in \mathcal{D}(\mathbb{R}^3)$ is

$$\delta_\sigma[\phi] = \int \delta(g(x) - a)\,|\nabla g(x)|\,\phi(x)\,d^3x = \int_\sigma \phi\,d\sigma.$$

9.
$$\delta_\ell = \bigl|[\nabla g_1(x) \times \nabla g_2(x)]\bigr| \prod_{k=1}^{2} \delta(g_k(x) - a_k).$$

10. $\int |\nabla\psi(x)|\,\delta(\psi(x) - c)\,d^2x$.

11. $P[\phi]$ is equal to the flux of the vector field $\phi$ inside the region bounded by a level
surface $\psi(x) = c$:
$$P[\phi] = \oint_\sigma (\phi \cdot n)\,d\sigma,$$
where $n$ is the interior unit normal vector to the level surface $\sigma$. The interior points $x$ are
those for which $\psi(x) > c$.

A.2 Chapter 2. Basic applications


Ordinary differential equations
1. $\ddot y + \gamma\dot y + \omega^2 y = (b + \gamma a)\delta(t) + a\dot\delta(t)$.
2. $y(0) = 1$, $\dot y(0) = -\gamma$.

3. $y(t) = (1/2)\sin|t|$.

4. $y(t) = \chi(t)(a + (b - a)\cos t)$.

5. $\ddot y + y = \delta(t)$.
6. $y = Ae^{ht} + B\chi(t)e^{ht}$.
Wave equation
7. The Green function $G(x, t)$ is a solution of the following initial value problem:

$$\frac{\partial^2 G}{\partial t^2} = c^2\frac{\partial^2 G}{\partial x^2}, \qquad G(x, t = 0) = 0, \qquad \frac{\partial}{\partial t}G(x, t = 0) = \delta(x).$$
Using the D'Alembert formula we obtain that

$$G(x, t) = \frac{1}{2c}\chi(t)\bigl(\chi(x + ct) - \chi(x - ct)\bigr).$$

Hence the solution of a nonhomogeneous wave equation is of the form

$$u(x, t) = \frac{1}{2c}\int_{-\infty}^{t} d\tau\int_{x - c(t - \tau)}^{x + c(t - \tau)} dy\,f(y, \tau).$$

8. $u(x, t) = g(x) * \partial G(x, t)/\partial t + h(x) * G(x, t)$, where $*$ denotes the convolution in
the spatial variable.

Continuity equation
9.
$$\rho(x, t) = \int \rho_0(y)\,\delta(x - ye^{gt})\,dy = e^{-gt}\rho_0(xe^{-gt}),$$

$$C(x, t) = \int C_0(y)\,\delta(y - xe^{-gt})\,dy = C_0(xe^{-gt}).$$

For $g > 0$, as time increases, the density at each point $x$ decreases to 0, and the concentration
asymptotically becomes homogeneous and converges everywhere to a constant. If
the total mass of the passive tracer is equal to $m$, then for $g < 0$ and $t \to \infty$ the density
weakly converges to the distribution $m\delta(x)$ and the concentration to the zero distribution.
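
The two regimes described in answer 9 can be illustrated numerically. The following is a minimal sketch under stated assumptions: the Gaussian initial density `rho0`, the integration window, and the step count are hypothetical choices made only for the illustration and do not come from the text. It checks that the total mass is conserved for both signs of $g$, while the density at the origin decays for $g > 0$ and grows for $g < 0$.

```python
import math

def rho0(y):
    """Hypothetical initial density: a unit-mass Gaussian (an assumption)."""
    return math.exp(-y * y) / math.sqrt(math.pi)

def rho(x, t, g):
    """Density from answer 9: exp(-g t) * rho0(x * exp(-g t))."""
    s = math.exp(-g * t)
    return s * rho0(x * s)

def total_mass(t, g, half_width=60.0, n=120000):
    """Midpoint-rule integral of rho over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return h * sum(rho(-half_width + (k + 0.5) * h, t, g) for k in range(n))

if __name__ == "__main__":
    for g in (0.5, -0.5):
        print(g, total_mass(3.0, g), rho(0.0, 3.0, g))
```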

10. $\rho(x, t) = \int \rho_0(y)\,\delta(x - y - gyt)\,dy = \frac{1}{1 + gt}\,\rho_0\!\left(\frac{x}{1 + gt}\right)$. For $g > 0$
the density converges everywhere to zero but much slower than in the previous problem.
However, for $g < 0$ the density becomes singular in finite time $t^* = -1/g$.
11. $\rho(x, t) = \int \rho_0 v\,\delta(x - v\tau - g\tau^2/2)\,d\tau = \rho_0(1 + 2gx/v^2)^{-1/2}$. One can see a
similar phenomenon watching the stream of water flowing out of a faucet. The further it
is from the faucet, the thinner it gets.

12. The equation in question has the integral $\rho(x)v(x) = \text{const} = \rho_0 v$. Eliminating
the time $t$ from the equations of motion

$$x = vt + gt^2/2, \qquad v(t) = v + gt,$$

we get $v(x) = (v^2 + 2gx)^{1/2}$. Thus, $\rho(x) = \rho_0 v/(v^2 + 2gx)^{1/2}$.
13. In this case the $x$ axis is simply reversed upwards and we are not solving the "rain"
problem but a "fountain" problem where the droplets move upwards. In this case the
density of droplets is
$$\rho(x) = 2\rho_0\left(1 - 2gx/v^2\right)^{-1/2}$$

if $x < v^2/2g$, and it is $= 0$ if $x > v^2/2g$. The infinite singularity of the density as
$x \to v^2/2g - 0$ describes the effect of concentration of droplets at the point where rising
droplets reach their highest elevation. By the way, explain why an "extra" coefficient 2
appeared in the numerator of the above formula.

14. The continuity equation

$$\frac{\partial\rho}{\partial t} + \frac{\partial\Psi}{\partial x_2}\frac{\partial\rho}{\partial x_1} - \frac{\partial\Psi}{\partial x_1}\frac{\partial\rho}{\partial x_2} = 0$$
has the integral of motion $\rho = g(\Psi(x_1, x_2))$, where $g(\Psi)$ is an arbitrary function. If we
additionally impose a physical requirement that $g \ge 0$ then we obtain a class of densities
of the passive tracer that will be independent of time.

15. Since in this example the dimension of the Dirac delta is $[\delta] = [t]/[x^2]$, we get that
$[a] = [m]/[t]$. Also, $m = a\oint dl/|v|$, where the integration is performed on the contour
(or contours) given by equation $\Psi(x) = b$.

16. The Liouville equation for the particle density is

$$\frac{\partial f}{\partial t} + (v \cdot \nabla_x)f + (\nabla_v \cdot g(x, v)f) = 0.$$

17. In this case the Liouville equation is

$$\frac{\partial f}{\partial t} + (v \cdot \nabla_x)f = h(\nabla_v \cdot vf).$$
Its Green function is

$$G(x, v, y, u, t) = \delta\bigl(x - y - (u/h)(1 - e^{-\tau})\bigr)\,\delta(v - ue^{-\tau}).$$

$\tau = ht$ introduced above is dimensionless time. Using the Green function we can write
the solution in the form

$$f(x, v, t) = \int f_0(y, u)\,\delta\bigl(x - y - (u/h)(1 - e^{-\tau})\bigr)\,\delta(v - ue^{-\tau})\,d^3y\,d^3u$$

$$= e^{3\tau}f_0\bigl(x - (v/h)(e^{\tau} - 1),\ ve^{\tau}\bigr).$$

18. $\rho(x, t) = \int f(x, v, t)\,d^3v = m\gamma^3(t)\,w(x\gamma(t))$, where $\gamma(t) = he^{\tau}/(e^{\tau} - 1)$.

19. $\lim_{t \to \infty} f(x, v, t) = mh^3w(xh)\,\delta(v)$.

20. Density $\rho(x, t)$ satisfies the continuity equation and velocity $v(x, t)$ satisfies the
(nonlinear) Riemann equation
$$\frac{\partial v}{\partial t} + (v \cdot \nabla)v = 0.$$

21. Replacing the substantial derivative by its expression in Eulerian coordinates we
obtain the Riemann equation with external forcing acting on particles in the hydrodynamic
flow
$$\frac{\partial v}{\partial t} + (v \cdot \nabla)v = g(x, v, t).$$

22. $v(x, t) = \int v f(x, v, t)\,d^3v / \rho(x, t)$.


23. The analogue of formula (2.4.11) in the I-D case is

Po x
~
_ ( ) =!!.l.~ A
usg
(x - f3(il:lS»)
I .
1=-00

For l:l --+- 0 and smooth functions g(z) and f3(y), the sum converges to the integral

- (x ) = !!.I
Po f g (x - f3(y»)
I d y.

Now, observe that for I -+ 0, the function g (x / I) / I weakly converges to the Dirac delta
8(x). In this fashion the sought limit function is

Po(x) = p / 8(x - f3(y»dy = pa'(x),

where y = a(x) is the function inverse to the function x = f3(y). In our case, for x >
anda(x) = (x -1 + ../x 2 +s2)/2,
°
_ p ( IXI)
Po(x) = '2 1 + ../x2 + s2 .

° °
Remark. The condition ll. = 0(/) for I -+ permitted us to correctly choose the order of
the limit passages: first let ll. -+ for a positive I, and then I -+ 0.
Pragmatic approach
24. $\mu = (|\alpha| + |\beta|)/|2\alpha\beta|$, if one assumes that the Dirac delta is even.
25. $f = -(\alpha + \beta)/2$.
26. $y(0_-) = y(0_+)$, $\dot y(0_-) = e^{\gamma}\dot y(0_+)$.
27. $y(0_-) = e^{\gamma}y(0_+)$, $y'(0_-) = y'(0_+)$.

28. (a) R(y) = 8(y) J~oo q,(x) dx + x (y)q,(y);


(b) R(y) = x(lyl- 1)q,«lyl-l);
(c) R(y) = 8(y) J~l q,(x) dx + q,(lyl + 1).
29. The generalized particle density is

$$\rho(x, t) = m(t)\delta(x) + \bar\rho(x, t),$$

where $m(t) = \rho_0\sqrt{wt}$ is the mass of the cluster of particles sticking at the origin, and
$$\bar\rho(x, t) = \frac{\rho_0}{2}\left(1 + \frac{|x|}{\sqrt{x^2 + wt}}\right)$$
is the continuous component of the density. It has a minimum $\bar\rho(0, t) = \rho_0/2$ at $x = 0$,
and increases to its original value $\rho_0$ as $|x| \to \infty$ where the particles do not move.
The mass conservation law is here reduced to the requirement that the mass of the
cluster is equal to the deficit of the mass of unglued particles:

$$m(t) = \int [\rho_0 - \bar\rho(x, t)]\,dx.$$
A direct substitution leads to the equality

$$\int_0^\infty \frac{dx}{\sqrt{x^2 + 1}\,\bigl(x + \sqrt{x^2 + 1}\bigr)} = 1,$$
one of the standard definite integrals that can be found in the tables of integrals or via
Mathematica. Here it was derived via a "physical" argument.
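
The value of the definite integral can also be confirmed by a short numerical computation. The sketch below is an illustration only: the substitution $x = t/(1 - t)$ and the midpoint rule are choices made here, not taken from the text, and the function names are hypothetical.

```python
import math

def integrand(x):
    """Integrand 1 / (sqrt(x^2 + 1) * (x + sqrt(x^2 + 1)))."""
    r = math.sqrt(x * x + 1.0)
    return 1.0 / (r * (x + r))

def integral_on_half_line(n=20000):
    """Midpoint rule after mapping (0, infinity) to (0, 1) via x = t / (1 - t)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = t / (1.0 - t)
        # dx = dt / (1 - t)^2 is the Jacobian of the substitution
        total += integrand(x) / (1.0 - t) ** 2
    return total * h

if __name__ == "__main__":
    print(integral_on_half_line())
```

The same result follows analytically from the substitution $x = \sinh u$, which turns the integrand into $e^{-u}\,du$.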

A.3 Chapter 3. Fourier transform


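1. We have
$$\tilde f_+(\omega) = \frac{1}{2\pi}\int_0^\infty e^{-(\gamma + i\omega)t}\,dt = \frac{1}{2\pi}\,\frac{1}{i\omega + \gamma}.$$

2. In view of the properties of the Fourier transform the answer can be obtained from
the answer to the previous exercise:
$$\tilde f(\omega) = 2\,\mathrm{Re}\,\tilde f_+(\omega) = \frac{1}{\pi}\,\frac{\gamma}{\omega^2 + \gamma^2}.$$

3. $\tilde f(\omega) = (1/2\gamma)\,e^{-\gamma|\omega|}$.

4. $\tilde f(\omega) = \delta(\omega) - (\gamma/2)\,e^{-\gamma|\omega|}$.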
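5. The answer is

$$\tilde f(\omega) = \frac{1}{2\pi}\sum_{n=0}^{\infty} e^{(i\omega - \gamma)n} = \frac{1}{2\pi}\,\frac{1}{1 - \exp(i\omega - \gamma)}.$$

6. We have
$$\tilde f_o(\omega) = \mathrm{Im}\,\tilde f(\omega) = \frac{1}{4\pi}\,\frac{\sin\omega}{\cosh\gamma - \cos\omega},$$

$$\tilde f_e(\omega) = \mathrm{Re}\,\tilde f(\omega) = \frac{1}{4\pi}\left[\frac{\sinh\gamma}{\cosh\gamma - \cos\omega} + 1\right].$$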
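
The geometric sum in answer 5 and its real and imaginary parts in answer 6 are easy to cross-check numerically. The sketch below is an illustration under the assumption $\gamma > 0$; the function names and the truncation length of the partial sum are hypothetical.

```python
import cmath
import math

def partial_sum(omega, gamma, terms=2000):
    """Truncated series (1/2pi) * sum_{n>=0} exp((i*omega - gamma) * n)."""
    z = cmath.exp(1j * omega - gamma)
    return sum(z**n for n in range(terms)) / (2.0 * math.pi)

def closed_form(omega, gamma):
    """Closed form (1/2pi) / (1 - exp(i*omega - gamma)) from answer 5."""
    return 1.0 / (2.0 * math.pi * (1.0 - cmath.exp(1j * omega - gamma)))

def odd_part(omega, gamma):
    """Imaginary part given in answer 6."""
    return math.sin(omega) / (4.0 * math.pi * (math.cosh(gamma) - math.cos(omega)))

def even_part(omega, gamma):
    """Real part given in answer 6."""
    ratio = math.sinh(gamma) / (math.cosh(gamma) - math.cos(omega))
    return (ratio + 1.0) / (4.0 * math.pi)

if __name__ == "__main__":
    w, g = 0.7, 0.3
    print(closed_form(w, g), even_part(w, g), odd_part(w, g))
```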

7. $r(\gamma) = \coth(\gamma/2) \to \infty$ for $\gamma \to 0$.

8. The Fourier image of the derivative is $i\omega\tilde f(\omega) = (1/2)(e^{i\omega} - e^{-i\omega})$. Hence, by the
shift theorem, $f'(t) = \pi[\delta(t + 1) - \delta(t - 1)]$. Integrating in $t$, and assuming that $f(-\infty) = 0$,
we get that $f(t) = \pi\Pi(t)$, where $\Pi(t) = 1$ for $|t| \le 1$, and $= 0$ for $|t| > 1$.
9. The second derivative is $f''(t) = 2(\delta(t + 1) + \delta(t - 1) - \Pi(t))$. Using the answer to the
previous problem we get that its Fourier transform is $(2/\pi)(\cos\omega - \sin\omega/\omega)$. Therefore
$\tilde f(\omega) = (1/\omega^2)(2/\pi)(\sin\omega/\omega - \cos\omega)$.
10. The third derivative is

$$f'''(t) = \delta(t + 2) - 2\delta(t + 1) + 2\delta(t - 1) - \delta(t - 2).$$

The corresponding Fourier image is

$$\tilde f'''(\omega) = \frac{i}{\pi}\bigl(\sin(2\omega) - 2\sin(\omega)\bigr).$$

Dividing this expression by $(i\omega)^3 = -i\omega^3$ we finally obtain

$$\tilde f(\omega) = \frac{2\sin\omega\,(1 - \cos\omega)}{\pi\omega^3}.$$

11. The second derivative

$$f_l''(t) = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty} \Delta^2 f(n\Delta)\,\delta(t - n\Delta),$$

where we used the standard notation

$$\Delta^2 f(n\Delta) = f((n + 1)\Delta) - 2f(n\Delta) + f((n - 1)\Delta)$$

for the second-order difference of function $f$. The corresponding Fourier image

$$\tilde f_l(\omega) = -\frac{1}{2\pi\Delta\omega^2}\sum_{n=-\infty}^{\infty} \Delta^2 f(n\Delta)\,e^{-i\omega n\Delta}.$$


Regrouping terms of the above series, we arrive at

which is more convenient for calculations.


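12. First, observe that the sought Fourier image is related to the Fourier image of
function $h(t)$ via the formula $\tilde f(\omega) = 2\pi|\tilde h(\omega)|^2$. Calculate first the Fourier image of
function $h(t)$. Its second derivative is

$$h'' = \delta(t) - \delta(t - \theta) - \theta\delta'(t - \theta).$$

Thus the Fourier image of function $h(t)$ is

$$\tilde h(\omega) = \frac{\theta^2}{2\pi\Omega^2}\left[e^{-i\Omega}(1 + i\Omega) - 1\right],$$

where $\Omega = \omega\theta$ is a new dimensionless argument. In this way,

$$\tilde f(\omega) = \frac{\theta^4}{2\pi\Omega^4}\left|e^{-i\Omega}(1 + i\Omega) - 1\right|^2,$$

or, finally,
$$\tilde f(\omega) = \frac{\theta^4}{2\pi\Omega^4}\left[\Omega^2 + 2(1 - \cos\Omega - \Omega\sin\Omega)\right].$$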
13. f(4)(t) = 8(t) - 8(1tl- (}) - (}8'(ltl- ()) - (}28"(t).


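14. Apply the Fourier transform to both sides of the recurrence relation to obtain
$$\tilde f(\omega; p) = \frac{p(p - 1)}{p^2 - \omega^2}\,\tilde f(\omega; p - 2).$$

In the case of even $p = 2n$, the previous identity implies that

$$\tilde f(\omega; 2n) = (2n)!\,\tilde f(\omega; 0)\prod_{m=1}^{n}\frac{1}{4m^2 - \omega^2}.$$

Substituting
$$\tilde f(\omega; 0) = \frac{1}{\pi}\int_0^{\pi/2}\cos(\omega t)\,dt = \frac{\sin(\pi\omega/2)}{\pi\omega},$$
we obtain
$$\tilde f(\omega; 2n) = (2n)!\,\frac{\sin(\pi\omega/2)}{\pi\omega}\prod_{m=1}^{n}\frac{1}{4m^2 - \omega^2}.$$
Similarly, for odd $p = 2n + 1$,
$$\tilde f(\omega; 2n + 1) = (2n + 1)!\,\tilde f(\omega; 1)\prod_{m=1}^{n}\frac{1}{(2m + 1)^2 - \omega^2}.$$

A substitution

$$\tilde f(\omega; 1) = \frac{1}{\pi}\int_0^{\pi/2}\cos t\cos(\omega t)\,dt = \frac{\cos(\pi\omega/2)}{\pi(1 - \omega^2)}$$

gives, finally,

$$\tilde f(\omega; 2n + 1) = (2n + 1)!\,\frac{\cos(\pi\omega/2)}{\pi}\prod_{m=0}^{n}\frac{1}{(2m + 1)^2 - \omega^2}.$$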
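
The product formulas can be compared with a direct quadrature. The sketch below rests on an assumption inferred from the expressions for $\tilde f(\omega; 0)$ and $\tilde f(\omega; 1)$ above, namely that $f(t; p) = \cos^p t$ for $|t| \le \pi/2$ and zero elsewhere, so that $\tilde f(\omega; p) = (1/\pi)\int_0^{\pi/2}\cos^p t\cos(\omega t)\,dt$. The function names and the Simpson step are hypothetical.

```python
import math

def product_formula(omega, p):
    """Closed-form products of answer 14 (omega must avoid the poles)."""
    if p % 2 == 0:
        n = p // 2
        value = math.factorial(p) * math.sin(math.pi * omega / 2) / (math.pi * omega)
        for m in range(1, n + 1):
            value /= 4 * m * m - omega * omega
    else:
        n = (p - 1) // 2
        value = math.factorial(p) * math.cos(math.pi * omega / 2) / math.pi
        for m in range(0, n + 1):
            value /= (2 * m + 1) ** 2 - omega * omega
    return value

def direct_quadrature(omega, p, n=2000):
    """Simpson rule for (1/pi) * integral_0^{pi/2} cos^p(t) cos(omega t) dt."""
    h = (math.pi / 2) / n

    def g(t):
        return math.cos(t) ** p * math.cos(omega * t)

    total = g(0.0) + g(math.pi / 2)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return total * h / (3.0 * math.pi)

if __name__ == "__main__":
    for p in range(6):
        print(p, product_formula(0.7, p), direct_quadrature(0.7, p))
```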

A.4 Chapter 4. Asymptotics of Fourier transforms


1. $f(x) \sim x^2$ $(x \to 0)$.

2. $f(x) \sim x/3$ $(x \to 0)$, $f(x) \sim 1/x$ $(x \to \infty)$.

3. $f(x) \sim x/3$ $(x \to 0)$.

4. It is clear that the root $x(a) \to 0$ as $a \to \infty$. Hence we can use the asymptotic
formula $\cot x = 1/x - x/3 + O(x^3)$ and replace the original equation by a simpler
quadratic equation
$$x^2 - \frac{3}{2}ax + \frac{3}{2} = 0, \qquad (1)$$
which gives the following asymptotic behavior of the root:

$$x(a) \sim \frac{3}{4}\left(a - \sqrt{a^2 - \frac{8}{3}}\right) \qquad (a \to \infty). \qquad (2)$$

Remark. A related question of finding positive roots of the transcendental equation

$$x = \tan x \qquad (3)$$

also arises in several mathematical physics problems.

FIGURE A.4.1
Graphs of the functions $x$ and $\tan x$. The first two roots, $x_1$, $x_2$, of equation (3) are
marked on the x-axis.

The roots $x_n$ (see Fig. A.4.1) can be expressed via the solution $x(a)$ of the equation (1)
as follows:
$$x_n = a_n - x(a_n),$$
where
$$a_n = \frac{\pi(2n + 1)}{2}.$$
Hence, in view of (2), the corresponding approximate values of the roots of (3) are

$$x_n \approx \frac{a_n}{4} + \frac{3}{4}\sqrt{a_n^2 - \frac{8}{3}}. \qquad (4)$$

The larger $n$ is, the more accurate the approximation. However, even for $n = 1$, the
approximate value $x_1 \approx 4.493397$ given by (4) differs from the true solution by less than
$1.2 \cdot 10^{-5}$. So in many practical situations expressions (4) are as good as exact analytic
solutions of the equation (3).
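
The accuracy claim for (4) can be reproduced with a few lines of code. The sketch below is an illustration: the bisection solver, which works with the equivalent equation $\sin x - x\cos x = 0$ on the interval $(n\pi, a_n)$, is a choice made here and not part of the text, and the function names are hypothetical.

```python
import math

def approx_root(n):
    """Approximation (4): a_n/4 + (3/4)*sqrt(a_n^2 - 8/3), a_n = pi*(2n+1)/2."""
    a_n = math.pi * (2 * n + 1) / 2
    return a_n / 4 + 0.75 * math.sqrt(a_n * a_n - 8.0 / 3.0)

def exact_root(n, iterations=200):
    """Bisection for sin(x) - x*cos(x) = 0 on (n*pi, a_n), i.e. x = tan(x)."""
    lo, hi = n * math.pi, math.pi * (2 * n + 1) / 2

    def g(x):
        return math.sin(x) - x * math.cos(x)

    g_lo = g(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, approx_root(n), exact_root(n))
```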
5. Taking logarithms of both sides to replace the product by a sum, we obtain that
$\ln f(N) = \sum_{n=1}^{N}\ln(1 + a_n/N)$. The boundedness of the sequence $\{a_n\}$ implies that $a_n/N \to 0$
as $N \to \infty$, uniformly in $n$. This permits use of the asymptotics $\ln(1 + x) \sim x$ for each
term separately, and adding them up. As a result, we get $\ln f(N) \sim (1/N)\sum_{n=1}^{N} a_n$.
Hence, $f(N) \sim \exp[(1/N)\sum_{n=1}^{N} a_n]$.
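
A quick numerical experiment illustrates how the logarithm of the product approaches the arithmetic mean of the $a_n$. In the sketch below the bounded sequence $a_n = 1 + \sin n$ is a hypothetical example chosen only for the illustration, and the function names are likewise assumptions.

```python
import math

def log_product(a, N):
    """ln f(N) = sum_{n=1}^{N} ln(1 + a(n)/N), summed accurately."""
    return math.fsum(math.log1p(a(n) / N) for n in range(1, N + 1))

def log_asymptotic(a, N):
    """Leading-order estimate (1/N) * sum_{n=1}^{N} a(n)."""
    return math.fsum(a(n) for n in range(1, N + 1)) / N

def bounded_sequence(n):
    """Hypothetical bounded sequence used for the demonstration."""
    return 1.0 + math.sin(n)

if __name__ == "__main__":
    for N in (1000, 100000):
        print(N, log_product(bounded_sequence, N) - log_asymptotic(bounded_sequence, N))
```

The printed differences shrink roughly like $1/N$, in line with the next term $-x^2/2$ of the expansion of $\ln(1 + x)$.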
6. The answer is

$$\prod_{n}\bigl(1 + \Delta\varphi(n\Delta)\bigr) = \exp\left(\int_0^t \varphi(\tau)\,d\tau\right)(1 - R),$$

$$0 < R < \Delta\int_0^t \varphi^2(\tau)\,d\tau.$$

We draw your special attention to this formula as it forms a basis of numerous scientific
(physical, chemical and biological) laws.

7. Integral $S(\omega)$ decays, as $\omega \to \infty$, more rapidly than any power of $\omega$. Integral $C(\omega)$
has a power asymptotics $C(\omega) \sim -(a/\omega^2)$ $(\omega \to \infty)$.

8. jew) = -(6a/7rW4) + O(w- 10 ).


9. We have
- n!A. 7r
few) '" 7rw"+1 sm(wr: - '2 n ) (Iwl .... 00).
The trigonometric factor in the above formula describes, as physicists say, the interference
contributions from hidden discontinuities of the function at two points t = ±r:.

10. J(w) '" -(3/w2)cosw.

11. jew) '" sinw/(i7rw2), (w .... 00).


12. The answer is

r(p + 1) exp (.7r


f-(w) '" 27rlwIJ:l+l ''2(1- P)'Slgnw,
)
(w .... 00).

13. One obtains


- sin Iwl - cos Iwl
few) '" 2.Jil'Wflwl ' (Iwl .... 00).

A.5 Chapter 5. Stationary phase and related methods


1. Apply the general asymptotic formula (5.2.3). In our case, take x = kp .... 00,
pet) = cosh(t), f(t) = -1/27r. There exists only one stationary point r: = 0 where
pl/(O) = 1. Consequently, there remains only one term in the formula (5.2.3), and that
term has to be multiplied by 1/2 because the stationary point coincides with the lower
limit of integration. Hence,

1 .
G(p) '" - . exp[-Ikp), (kp .... 00),
"j8mkp
an expression equivalent with (9.4.3).

2. Initially, it is easier to find asymptotics of the complex function Dw(z) described in


Chapter 4. To make use of it notice that Dw(z) (5.6.1) is the Fourier transform of function

f(t/» = 2eizsinif>[X(t/» - X(t/> - 7r»),



which has two jumps: Lf(O)l = 2 and Lf(Jr)l = -2. So, in view of the asymptotic
relation (4.3.3), we get
1 .
Dw(z) - - . (1 - e'W1r), (cu -+ 00).
JrCUI

Now, separate the real and imaginary parts of this expression to obtain the sought asymp-
totics of Anger and Weber functions:

1.W _ sin(cuJr) , EW _ 1- cosCUJr , (CU -+ 00).


CUJr CUJr

3. Again it is simpler to find initially the asymptotics of Dw(z). Separation of it real and
imaginary parts is straightforward. The asymptotics is obtained by the stationary phase
method. There is only one stationary point rp* = Jr /2. Therefore, with help of formula
(5.3.1), we get

4. In our case, calculation of integral (5.6.1) in the Fresnel approximation is reduced


to an application of approximate equality

where, as in Exercise 3, the stationary point rp* = Jr /2, and to the extension of the domain
of integration to the entire line. Thus, in the Fresnel approximation, the exact integral
(5.6.1) is replaced by the approximate expression

The last integral can be evaluated in closed form using the standard formula (3.2.3) which
gives

Dw(z) ~ V{2
;zi exp [(
i z - 2 CUJr)
+ 2z
2
iCU ]
' (z » 1).

Notice that as z -+ 00 the above expression tends to the expression obtained in Exercise
3.
5. Let us write integral (5.6.1) in the form

(2)

where
p(rp) = rp - psinrp, (3)
which is more convenient for asymptotic analysis. In our case (0 < p < 1) function p(rp)
is strictly monotone so that there exists the unique and strictly monotone and smooth

inverse function cP = q(s). Consequently, it is possible to write the integral (2) in the
familiar Fourier transform form:

(4)

where
!(s) = q'(s)[X(s) - Xes -n-)]. (5)
This function has two jumps

L!(On = 2/ P'(O) = 2/(1 - p), L!(n-)l = 2/ P'(n-) = 2/(1 + p).


Thus, in view of (4.3.3), we get

Dw(pw) _ ~
n-WI
(_1__
1- p
e- iW1r )
1+P
, (W _ 00). (6)

Notice that this relation is a natural generalization of the answer to Exercise 2 (in the case
z < W (p < 1)) which follows from (6) by taking p _ O.
6. Observe that in the case p = 1, the inverse function cp =
q(s) is no longer smooth
over the entire interval cp E [0, n-]. Indeed, in the vicinity of cp = 0 the original function
P(cp) and its derivative P'(cp) have the following asymptotic behavior:

=
Consequently, cp(s) - (6s)1/3 and the function !(s) (5) has at s 0 a singularity of the
order
16 )1/3
!(s)- ( 9s 2 =
Asa - 1, (s _ 0+),

where
A = (16/9)1/3, a = 1/3.
Thus, the asymptotics of Dw(w) is described by (4.6.2), i.e.,
r(a)
Dw(w) - A 2n-(iw)a' (w - 00).

Inserting the numerical values of constants A and a and utilizing the symmetrization
formula r(a)r(1 - a) = n-/ sin(n-a) one gets

2 ) 1/3 .j3 _ i
( 9w (w _ 00).
Dw(w) - "J273 , (7)

=
7. In this case function P(cp) (3) has a simple stationary point cp* arccos(1/ p). For
convenience, express p via an auxiliary variable 9: p =
1/ cos 9, 0 < 9 < n-/2. Then
p* = 9 and elementary calculations yield

P(cp·) =9 - tan 9, P"(cp*) = tan 9.



A substitution of these quantities into (5.3.1) gives

1 .
D(Apw) '" exp [I (w(tan9 - 9) -",/4)]. pcos9 = 1, (w -4 (0). (8)
../2rrwtan9

Noticing that
wtan9 = J Z2 - w2 , 9 = arctan Jz2/w2 -1,
it is easy to see that (8) is a natural generalization of the answer to Exercise 3 in the case
z > w (p > 1).
8. Transform (5.6.3) applying the following "physical" argument: if t represents
the time and w-the frequency, then it is natural to analyze the integral (5.6.3) in the
=
dimensionless variable of integration u wt and rewrite the former as

1
I(w) = ~A(y). (9)
'\I 2rrw

The deliberately separated factor 1/../21rw has the dimensionality of the original integral
and
(10)

is a dimensionless function of a dimensionless variable y = .fWT. The substitution


v=Ju+y 2 Ieadsto
A(y) = e'Y• 2 <I> (y), (11)

i€1
where
1 ~ ~
<I>(y) = -
°o·2
e- IV dv = M"! - C(v 2/rry) + i S(v 2/rry)
rr Y '\121

is the complex Fresnel integral expressed through the Fresnel sine and cosine integrals
discussed in Sections 5.3-4. The discussions of Section 5.4 indicate that

<1>( ) '" { 1/..tiI, (y -4 0);


y 1/(iy..[iii)e-iy2 , (y -4 (0).

Consequently, it follows from (9-11) that

I (w) '" { 1/(2../rriw), (w-r -4 0); (12)


1/(2rriwv'L), (WT -4 (0).

Having arrived this far, we already have noticed the remarkable fact that the above formula
contains the main asymptotics of our integral: for T = 0 (5.6.5) and for T > 0 (5.6.4).
Although, for any T > 0, the formula (16) implies the asymptotic power law with a = 1
(w -4 (0), for w « T, we already witness the appearance of the intermediate asymptotic
power law with a = 1/2 (5.6.5). For T -4 0, the region of frequencies where the power
law with a = 1/2 obtains, expands towards large w's, and for T 0 the power law with=
a = 1/2 is valid everywhere.

9. Introduce a new function

$$x! = \int_0^\infty t^x e^{-t}\,dt, \qquad (13)$$

which is well defined for any real $x > -1$. For $x = n$, $n = 1, 2, \dots$, it coincides with the
factorial $n!$. Let us rewrite the integral (13) in the form suitable for asymptotic analysis
by introducing a new variable of integration $\tau$ such that $t = x\tau$. Then

$$x! = x^{x+1}\int_0^\infty \exp[-xP(\tau)]\,d\tau,$$

where
$$P(\tau) = \tau - \ln\tau.$$

This function has a unique minimum at $\tau = 1$, where $P(1) = P''(1) = 1$. Hence, in view
of (5.5.3),
$$x! \sim \sqrt{2\pi x}\,x^x e^{-x} \qquad (x \to \infty),$$

which, in particular, gives the Stirling formula. Note that the asymptotic Stirling formula
gives a good approximation of the factorial for any finite $n$. For $1! = 1$ we get a decent
approximation 0.9221, and even for $(1/2)! = \sqrt{\pi}/2 \approx 0.8862$, the Stirling formula gives
0.7602. In a certain sense one can claim that Stirling's formula is most precise for
$x = 1$. The relative error decreases with the growth of $n$ but the absolute error increases!
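
The numerical values quoted above, and the opposite trends of the relative and absolute errors, can be reproduced as follows. This is a minimal sketch: the function names are hypothetical, and `math.gamma(x + 1)` from the Python standard library is used as the reference value of $x!$.

```python
import math

def stirling(x):
    """Leading-order Stirling approximation sqrt(2*pi*x) * x^x * exp(-x)."""
    return math.sqrt(2.0 * math.pi * x) * x**x * math.exp(-x)

def factorial_real(x):
    """Reference value of x! for real x > -1 via the gamma function."""
    return math.gamma(x + 1.0)

def errors(x):
    """Return (absolute error, relative error) of the Stirling approximation."""
    exact = factorial_real(x)
    absolute = abs(exact - stirling(x))
    return absolute, absolute / exact

if __name__ == "__main__":
    for x in (0.5, 1, 5, 10):
        print(x, stirling(x), factorial_real(x), errors(x))
```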

10. Introduce in the integral (5.6.9) a new variable of integration

y =x - v(x, t)t. (14a)

Taking into account equation (5.6.6) satisfied by the solution v, the old variable of inte-
gration is expressed in terms of the new variable via the equality

x = P(y, t) := y + vo(y)t, (14b)

and the Fourier integral takes the form

V(K, t) = ~
2:n:
f vo(y)e-iKP(y,t)dP(y, t).

Integrating by parts we arrive at a more convenient for asymptotic analysis expression

ii(K, t) = _1._
2:n:1K
f v' (y)e-iKP(y,t)dy.
0
(15)

Here, and below, the prime denotes differentiation with respect to y. As long as 0 :::: t <
-l/u, the function P(y, t) is a strictly monotone smooth function, with P'(y, t) > 0 for
any y, and the Fourier image (15) decays to zero, as K --+ 00, more rapidly than any power
function. However, at the time t = -l/u, there appears on the y-axis a point y = z where
P'(z, -l/u) = O. It means that the behavior of P(y, -l/u) irI the vicinity of this point

might have the power asymptotics of the integral (15) as K ~ 00. To find this asymptotics
expand P(y, -l/u) in the Taylor series in powers of (y - z):

P(y, -l/u) = P(z, -l/u) 1 " (z, -l/u)(y -


+ '2P 1 11/ (z, -l/u)(y - z) 3 + ...
z) 2 + 6"P

The term of order 1 has disappeared since, in view of the assumption of this exercise,
V~(z) = -u and P'(z, -l/u) = O. Moreover, since z is the minimum of function vo(z),
we also have P"(z. -l/u) = O. Suppose that

Then the above Taylor expansion gives the asymptotic relation

P(y, -l/u) - P(z. -l/u) '" P . (y - d. (y -+ z).

Since (y - Z)3 is an odd function of (y - z) we have to use the asymptotic formula (5.1.9)
and obtain that

V(K. . exp [-IK


_ -l/u) '" - ur(4/3) . P (z, -1 /u)] Re (p.IK 4)-1/3 , (K ~ 00). (16)
2:7r1

In the concrete case (5.6.10), where

z=O, u=-l, P(z,l) =0, P=l,

we get

(K -+ 00).

Remark. Physicists and engineers usually do not work with the complex Fourier image
of the signal but with the real and nonnegative spectral density of the signal's energy

J
In this fashion, its integral over the entire K -axis gives the "total energy" v 2 (x , t) dx of
the signal. The corresponding asymptotics of the energy density at the time t = -l/u is
then

(K -+ 00).

A.6 Chapter 6. Singular integrals and fractal calculus


Principal value
1. Transforming the original partial differential equation by the Fourier transform in the
space and time coordinates $x$ and $t$ produces the algebraic equation

$$\tilde u\,(k^2 - \omega^2) = \frac{i\omega}{2\pi}\tilde f(\omega).$$

Its solution is
$$\tilde u(k, \omega) = \frac{i}{4\pi}\tilde f(\omega)\left[\frac{1}{k - \omega} - \frac{1}{k + \omega}\right].$$
First let us calculate the inverse Fourier image in $x$,

$$\tilde u(x, \omega) = \int \tilde u(k, \omega)e^{ikx}\,dk,$$

of each of the two terms in the brackets and write

$$\tilde u(x, \omega) = \tilde u_-(x, \omega) - \tilde u_+(x, \omega),$$

where
$$\tilde u_\pm(x, \omega) = \frac{i}{4\pi}\tilde f(\omega)\int\frac{e^{ikx}\,dk}{k \pm \omega}.$$

To obtain $\tilde u_-(x, \omega)$ observe that in view of the causality principle the frequency $\omega$ appearing
inside the integral should be replaced by $\omega - i0$. The resulting integral is then
evaluated with the help of (6.2.4), which gives

$$\int\frac{e^{ikx}\,dk}{k - \omega + i0} = PV\int\frac{e^{ikx}\,dk}{k - \omega} - i\pi e^{i\omega x}.$$

To calculate the above principal value integral note that

$$PV\int\frac{e^{ikx}\,dk}{k - \omega} = e^{i\omega x}\,PV\int\frac{e^{isx}}{s}\,ds = e^{i\omega x}\,i\int\frac{\sin(sx)}{s}\,ds = i\pi e^{i\omega x}\,\mathrm{sign}(x).$$

Substituting the right-hand side of this equation into the preceding expression we obtain

$$\int\frac{e^{ikx}\,dk}{k - \omega + i0} = -2\pi i\,e^{i\omega x}\chi(-x).$$

So
$$\tilde u_-(x, \omega) = \frac{1}{2}\tilde f(\omega)e^{i\omega x}\chi(-x).$$
Similar calculations give
$$\tilde u_+(x, \omega) = -\frac{1}{2}\tilde f(\omega)e^{-i\omega x}\chi(x).$$
Therefore
$$\tilde u(x, \omega) = \frac{1}{2}\tilde f(\omega)\exp(-i\omega|x|).$$
Finally, taking the inverse Fourier transform in $\omega$ we obtain
$$u(x, t) = \frac{1}{2}f(t - |x|).$$
As expected, the obtained solution satisfies the radiation condition which, in this case,
means that the waves generated by a point source at the origin should run away from the
origin.
Hilbert transform
2. $\psi(t) = (1/\pi)\ln|(\tau - t)/t|$.

3. The problem can be solved by passing to the Fourier images of the corresponding
functions. We know that $\tilde\varphi(\omega) = \exp(-|\omega|\tau)/2\tau$. According to (6.5.2),

$$\psi(t) = -2\,\mathrm{Im}\int_0^\infty \tilde\varphi(\omega)e^{i\omega t}\,d\omega. \qquad (1)$$

Hence,
$$\psi(t) = -\frac{t}{\tau(t^2 + \tau^2)}.$$
4. $\psi(t) = ie^{i\Omega t}\,\mathrm{sign}(\Omega)$, $\Omega \neq 0$. In particular, if $\varphi(t) = \cos\Omega t$, then $\psi(t) =
-\sin|\Omega|t$. If $\varphi(t) = \sin|\Omega|t$, then $\psi(t) = \cos\Omega t$.
5. If one remembers that the Fourier image is

$$\tilde\varphi(\omega) = \frac{1}{2\nu}\bigl[\chi(\omega + \nu) - \chi(\omega - \nu)\bigr],$$

then the simplest way to find $\psi(t)$ is to utilize the formula (1), which holds true for any
real function $\varphi(t)$. Thus,

$$\psi(t) = -\frac{1}{\nu}\int_0^\nu \sin\omega t\,d\omega = \frac{\cos\nu t - 1}{\nu t}.$$

Analytic signals
7. Recall that the Fourier image of the original function is

~(w) = 411' (e-lw-nIT + e-1w+nIT).


The corresponding Fourier image of the analytic signal (6.6.1) is equal to

~(w) = ~(w)X(w).

Utilizing the inverse Fourier transform

t(f) = 21 ~(w)eiwt
00
dw

yields

8. Two cases should be considered separately:

$$\zeta(t) = \frac{\sin\nu t}{\nu t}\,e^{i\Omega t}, \qquad \Omega > \nu,$$
and
$$\zeta(t) = \frac{1}{i\nu t}\left(e^{i\nu t}\cos\Omega t - 1\right), \qquad \Omega < \nu.$$

Notice that in the first case the analytic signal $\zeta(t)$ is obtained from the original function
$\xi(t)$ just by replacing $\cos\Omega t$ by $e^{i\Omega t}$. This is a particular case of a more general result
that is the subject of Exercise 8. In the case $\Omega = \nu$ both expressions coincide.

9. Proof. It is evident that

$$\xi(t) = \frac{1}{2}[f(t) - ig(t)]e^{i\Omega t} + \frac{1}{2}[f(t) + ig(t)]e^{-i\Omega t}.$$
Now it follows from the assumptions that the Fourier image of the second summand is
equal to zero for $\omega > 0$, hence the Fourier image of the first summand coincides with
one-half of the Fourier image of the analytic signal. As a consequence, the analytic signal
is twice the first summand.
Remark 1. Recall that the imaginary part of the analytic signal (6.6.2) is equal to minus
the Hilbert transform (6.5.1) of $\xi(t)$. This implies the following corollary to the above
result: If $\varphi = f\cos\Omega t + g\sin\Omega t$ then its Hilbert transform is $\psi = g\cos\Omega t - f\sin\Omega t$.
Remark 2. Signals with finite-support Fourier image seldom appear in electrical en-
gineering applications. However, for narrow-band signals the replacement of the actual
analytic signal by the expression (6.9.3) gives a rather good approximation. For example,
the signal from Exercise 6 has an unbounded Fourier image but is narrow-band if 0. » 1.
In this case, it is easy to see that the approximate expression

is very close to ~(t).

10. Function g(t) has the Fourier image g(w) = iB'(w) supported by the single point
w = O. Hence, according to Exercise 8. ~(t) = -itemt . Generally, if in formula (6.11.1)
/(t) = Pn(t), g(t) = Qm(t), are polynomials of degrees nand m, respectively, then

because the Fourier transforms of these polynomials have a one-point support w = O.

11. The corresponding analytic signal is

~(t) = A(t)ei«l>(t) = Alem}t - iA2e m2t .



Consequently,
A(t) = J Ai + A~ + 2AIA2 sin(02 - 01)t.
Assuming, for the sake of concretness, that Al > A2 (p = A2/ Al < 1), we get

A.( n ( pCOS(02 - 01)t )


t) = .'It - a r c t a n . ,
1 + p sm(02 - 01)t

and

12. For convenience, let us replace the original function ~(t) by its complex twin
~c = X (t)e iflr and recall that the sought function 11(t) is equal to minus Hilbert transform
(6.6.2) of the original function. So the corresponding complex twin

11c(t) = --pv
1
7r
f ~c(s)
- ds
s- t
= -eiflr -PV
1
7r
1-Or
00
e
-dO.
0
i9

Here we have introduced the new integration variable 0 = O(s - t). After splitting the
integral into real and imaginary parts one gets

11c(t) = e iflr [~Ci(IOtl) - i (~+ Si(Ot»)] ,

. l
where

Ci(x) = - 1x
00 coso
- - dO,
0
x> 0, SI (x) = 0
x sinO
-0- dO,

are, respectively, the integral cosine and sine functions introduced before. Clearly, the
imaginary part of $\eta_c(t)$ coincides with the sought function $\eta(t)$ so that finally

$$\eta(t) = \frac{1}{\pi}\,\mathrm{Ci}(|\Omega t|)\sin\Omega t - \left(\frac{1}{2} + \frac{1}{\pi}\,\mathrm{Si}(\Omega t)\right)\cos\Omega t.$$

Notice that, in a sense, the corresponding analytic signal violates the causality principle.
Indeed, at $t < 0$, when the original signal $\xi(t)$ is identically equal to zero, the analytic
signal $\zeta(t)$, along with $\eta(t)$, is already nonzero.
13. The Fourier image (4.4.10) of $\xi(t)$ (4.4.9) has already been found. So, the corresponding
analytic signal is equal to

(2)

The last integral obviously reduces to integral (4.4.8) which we will rewrite in another,
more suitable for our purposes, form (0 < {3 < 1):

1 o
00 wfJ-leiwr dw = f({3)1tI-fJ x { ~~p
I
~f t > 0
,1ft < O.

FIGURE A.6.1
Graphs of (a) $\xi$ and (b) $\eta$ as functions of the dimensionless variable $x = \Omega t$.

Taking P = 1 - ot and substituting the result into (2) we obtain

r()_ Itl a - 1 {Sin7rot+iCOS7rot, ift>O;


.. t - .
SID7rot
x .
-I,
if t < 0.

14. It follows from the Hilbert transform property (c) of Exercise 5 that

~e(t) = W) + ~*(-t),
where the asterisk means the complex conjugate and ~(t) is the analytic signal from the
previous exercise. Thus

15. Similar to ~(t), the sought function is a singular distribution. In order to find it, let
us multiply the formal equality

71(t) = --7r1 PV f -8'(s) ds


S - t

by a test function cp(l) and integrate it over all II's. Mter simple distribution-theoretical

f
transformations we get that

f cp(t)'1(t) dt = -rr1 'PV -cp'(s) ds.


s- t
The last functional has been described in detail in Section 6.7 and the results of that section
imply that

'1(t) = ~ 'PV (t; ).


16. Applying the Fourier transform to the last equality we get an algebraic equation

CU > o. (3)

It has no ordinary continuous non-zero solutions $\tilde\varphi(\omega)$, but there are distributional singular
solutions like solution (1.5.5) of the equation (1.5.6). Indeed, if $\omega_n$ is a root of the
"dispersion equation" $i = e^{i\omega T}$, then $\delta(\omega - \omega_n)$ is a solution of equation (3). Clearly,

$$\omega_n = \frac{1}{T}\left(\frac{\pi}{2} + 2\pi n\right), \qquad n = 0, 1, 2, \dots,$$

so that we obtain

$$\varphi(t) = \exp\left(i\frac{\pi}{2T}\,t\right)\sum_{n=0}^{\infty} a_n \exp\left(i\frac{2\pi n}{T}\,t\right), \qquad (4)$$

where $a_n$ are arbitrary constants. In other words, we have found solutions

$$\varphi(t) = \exp\left(i\frac{\pi}{2T}\,t\right)\zeta(t), \qquad (5)$$

where $\zeta(t)$ is an arbitrary periodic analytic signal with period $T$. Furthermore, since $\hat H$ is
a real operator, both $\mathrm{Re}\,\varphi(t)$ and $\mathrm{Im}\,\varphi(t)$ are also solutions to this exercise. The simplest
example here is obtained for $\zeta(t) \equiv 1$. Then $\varphi(t) = e^{i\Omega t}$, $\mathrm{Re}\,\varphi = \cos\Omega t$, $\mathrm{Im}\,\varphi = \sin\Omega t$, $\Omega = \pi/2T$. A slightly more complicated example is obtained by setting $a_n = a^n$, $|a| < 1$.
Then, for instance,

$$\mathrm{Re}\,\varphi(t) = \frac{\cos(\Omega t) - a\cos(3\Omega t)}{1 + a^2 - 2a\cos(4\Omega t)}.$$
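The closed form for $\mathrm{Re}\,\varphi(t)$ with $a_n = a^n$ can be checked numerically. The following is only a minimal sketch: the function names, the truncation at 200 terms, and the sample values $a = 0.5$, $T = 1$ are illustrative assumptions, not part of the original solution.

```python
import cmath
import math


def phi_series(t, a, T, terms=200):
    """Truncated series (4) with the illustrative choice a_n = a**n, |a| < 1."""
    carrier = cmath.exp(1j * math.pi * t / (2 * T))
    harmonics = sum(a**n * cmath.exp(2j * math.pi * n * t / T) for n in range(terms))
    return carrier * harmonics


def re_phi_closed(t, a, T):
    """Closed form quoted in the text, with Omega = pi / (2 T)."""
    omega = math.pi / (2 * T)
    numerator = math.cos(omega * t) - a * math.cos(3 * omega * t)
    denominator = 1 + a * a - 2 * a * math.cos(4 * omega * t)
    return numerator / denominator


if __name__ == "__main__":
    # Print the truncated series next to the closed form at a few sample times.
    for t in (0.0, 0.3, 1.1, 2.7):
        print(t, phi_series(t, 0.5, 1.0).real, re_phi_closed(t, 0.5, 1.0))
```

The two columns should agree to rounding error, because the geometric series $\sum a^n e^{i4\Omega n t}$ sums to $1/(1 - a e^{i4\Omega t})$.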

Fractal calculus
17. As it often happens, it is simpler to derive a more general formula involving the
integral

(6)
This is a linear functional of $g(t)$ satisfying the causality principle and as such it has an
integral representation

$$x(t) = \int_{-\infty}^{t} h(t, s)\,g(s)\,ds,$$

the kernel thereof can be found by substituting $g(t) = \delta(t - s)$ in (6):

In particular

18. We shall utilize the identity

$$g(b) = \int g(u)\,\frac{\partial}{\partial u}\chi(u - b)\,du.$$

Substituting it in the original integral and changing the order of integration we get

$$I = \int g(u)\,\frac{\partial}{\partial u}\left[\int\!\cdots\!\int_A a(x_1, \dots, x_n)\,\chi\bigl(u - b(x_1, \dots, x_n)\bigr)\,dx_1 \cdots dx_n\right]du.$$

Note that the inner n-tuple integral is an integral of function $a$ over the (not necessarily
connected) domain $C_u = A \cap B_u$ created by intersection of domains $A$ and $B_u$, where the
latter is the set of points satisfying the inequality

$$b(x_1, \dots, x_n) < u.$$

Denote that integral by

$$\psi(u) = \int\!\cdots\!\int_{C_u} a(x_1, \dots, x_n)\,dx_1 \cdots dx_n.$$

So, the desired formula (also called the Catalan formula) has the form

$$I = \int g(u)\,\frac{d}{du}\psi(u)\,du.$$

If $b(x_1, \dots, x_n)$ is a bounded continuous function with minimum $m$ and maximum $M$
then the Catalan formula simplifies to

$$I = \int_m^M g(u)\,d\psi(u).$$
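A small worked instance may help to see the Catalan formula in action. The sketch below assumes a hypothetical example that is not taken from the book: $A$ is the unit square, $a \equiv 1$, $b(x, y) = x + y$ and $g(u) = u^2$. Then $\psi(u)$ is the area of the part of the square where $x + y < u$, so $\psi'(u) = u$ on $[0, 1]$ and $2 - u$ on $[1, 2]$, and both sides should equal $7/6$.

```python
def direct_integral(g, n=400):
    """Midpoint rule for the double integral of g(x + y) over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += g(x + y)
    return total * h * h


def psi_prime(u):
    """Derivative of psi(u) = area{(x, y) in unit square : x + y < u}."""
    if 0.0 <= u <= 1.0:
        return u
    if 1.0 < u <= 2.0:
        return 2.0 - u
    return 0.0


def catalan_integral(g, n=20000):
    """Midpoint rule for the one-dimensional integral of g(u) psi'(u) over [0, 2]."""
    h = 2.0 / n
    return sum(g((k + 0.5) * h) * psi_prime((k + 0.5) * h) for k in range(n)) * h


if __name__ == "__main__":
    g = lambda u: u * u
    print(direct_integral(g), catalan_integral(g), 7.0 / 6.0)
```

The point of the formula is visible here: the two-dimensional integral collapses to a one-dimensional integral against $d\psi(u)$.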

A.7 Chapter 7. Uncertainty principle and wavelet transform


Function spaces:
1. Consider an auxiliary function $I(x) = \int (|f| + x|g|)^2\,dt \ge 0$. Clearly,
$$I(x) = \|g\|^2 x^2 + 2Px + \|f\|^2, \qquad P = \int |f|\,|g|\,dt,$$
so that the discriminant of this quadratic polynomial must be nonpositive, i.e.,
$$P^2 \le \|f\|^2\,\|g\|^2.$$

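2. The norm's definition gives
$$\|f + g\|^2 = \int |f + g|^2\,dt \le \|f\|^2 + \|g\|^2 + 2P,$$
where $P$ was defined in Exercise 1. Applying the Schwarz inequality to the last term on the
right-hand side we get
$$\|f + g\|^2 \le \|f\|^2 + \|g\|^2 + 2\|f\|\,\|g\| = (\|f\| + \|g\|)^2,$$
which implies the triangle inequality.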

Windowed Fourier transform.


3.
1-, (w, .) = -
iwl(w, .)
a I(w,
+ -a. - .).

4. Applying the windowed Fourier transform to both sides of the differential equation
we obtain
- + (h + IW)X
di . - = I(w,
d.
- •.)

Its solution is
i(w, .) = 1 00
j(w,. - 0)e-(h+iw)6 dO.

5. (a) l(t)ei"'Or ..... j(w - CI.l(), .);

(b) I(t + 0) ..... j(w, • + O)e iw6 •


6. If the time-window $g(t)$ is sufficiently well localized to guarantee the existence of
the Fourier image
$$\tilde\varphi(\omega) = \frac{1}{2\pi}\int g(t)\,e^{\nu t - i\omega t}\,dt$$
of function $\varphi(t) = g(t)e^{\nu t}$, then

Notice that in this case the Fourier image of the original function does not exist even in
the distributional sense. However, the windowed Fourier image of $f(t)$ exists and can be
measured experimentally.

7.
j(w, .) = e VT ~ [iP(w - CI.l()e-i(W-"'O)T + iP(w + CI.l()e-i(w+ClJQ)T] •
If VA « 1 then approximately,

where jo(w, t') is the windowed Fourier image of the monochromatic signal fo(t) =
coswot:

8. We know all the values of the function

F(w, t') = jew, t')[X(-t') + X(t' - T)].

The latter, in view of equality (49), is related to the original signal as follows:

f(t)g(t - t')[X(-t') + X(t' - T)] = f F(w, t')eiCllt dw.

After multiplication of both sides by g*(t - t') and integration over all t' we arrive at the

f f
equality
f(t)A(t) = dt' dw F(w, t')g*(t - t')e iwt .

= [1.: + [00]
IT the coefficient
r
A(t) Ig2(0)1 dO

is different from zero everywhere then the value of the signal can be recovered via the
formula
f(t) = --
1
A(t)
f f -
dt' dw F(w, t')g*(t - t')e'wt. .
Note that for the validity of the above formula it suffices that $g(t) \neq 0$ for all points of a
certain interval of length A > T.
Wavelet transforms
9.

10. Passing in the integral to the new variable of integration $\theta = s/\lambda$ we get

$$g_\varepsilon(s) = \frac{1}{D|s|}\,F\!\left(\frac{|s|}{\varepsilon}\right),$$

where
$$F(x) = \int_0^x K(z)\,dz.$$
The fact that the autocorrelation function of a real-valued mother wavelet is even was
taken into account here. To write $g_\varepsilon(x)$ in the standard form weakly converging to the
Dirac delta let us introduce another notation:
$$\Phi(x) = \frac{1}{D|x|}\,F(|x|).$$
Then
$$g_\varepsilon(s) = \frac{1}{\varepsilon}\,\Phi\!\left(\frac{s}{\varepsilon}\right)$$
and the necessary and sufficient condition of the weak convergence of the above function
to the Dirac delta is that

$$0 < D = 2\int_0^\infty \frac{F(s)}{s}\,ds < \infty.$$

Note that, in the process, we have obtained a new expression for the coefficient D in (49).
11. It is clear that in this case
$$K(z) = \frac{d^4}{dz^4}K_0(z),$$

where
$$K_0(z) = \int \varphi(s)\,\varphi(z - s)\,ds = \sqrt{\pi}\,\exp(-z^2/4).$$
Hence,
$$F(x) = \sqrt{\pi}\,\frac{d^3}{dx^3}\exp(-x^2/4) = \frac{\sqrt{\pi}}{8}\,(6x - x^3)\,e^{-x^2/4}.$$
Dividing this expression by $x$ and norming it properly we get
$$\Phi(x) = \frac{6 - x^2}{8\sqrt{\pi}}\,\exp(-x^2/4).$$

The graph of this function is displayed in Fig. A.7.1.

FIGURE A.7.1
The graph of function <t>(x) from Exercise 11.
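The normalization in Exercise 11 can be verified numerically. The sketch below is an illustration under stated assumptions: it takes $F(x) = (\sqrt{\pi}/8)(6x - x^3)e^{-x^2/4}$ and $\Phi(x) = (6 - x^2)e^{-x^2/4}/(8\sqrt{\pi})$ as given, uses a plain midpoint rule (the helper `integrate` and the cut-off at 40 are choices made here), and checks that $D = 2\int_0^\infty F(s)/s\,ds$ equals $\pi$ and that $\Phi$ integrates to one, as a function converging weakly to the Dirac delta should.

```python
import math


def F(x):
    """Integrated autocorrelation function from Exercise 11."""
    return math.sqrt(math.pi) / 8 * (6 * x - x**3) * math.exp(-x * x / 4)


def Phi(x):
    """Normalized kernel from Exercise 11."""
    return (6 - x * x) / (8 * math.sqrt(math.pi)) * math.exp(-x * x / 4)


def integrate(f, a, b, n=20000):
    """Midpoint rule; the midpoints never hit s = 0, so F(s)/s is safe to evaluate."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


if __name__ == "__main__":
    D = 2 * integrate(lambda s: F(s) / s, 0.0, 40.0)
    mass = integrate(Phi, -40.0, 40.0)
    print("D =", D, "pi =", math.pi)
    print("integral of Phi =", mass)
```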

12. First. let us write out the square modulus of the continuous wavelet transform
(7.4.1):

Integrating over all T and utilizing equality (7.4.40) we obtain

f II(A, T}1 2 dT = f f dtl dt2 f(tl}/* (t2}AIA(A) 12 K Cl ~ t2) .

Now it suffices to integrate both sides of the equality over A > O. To use the distributional
relation (1) and the local probing property of the Dirac delta we shall divide both sides by
DA3 IS(A}1 2. SO, if the continuous wavelet transform is defined by formula (7.4.5) then

IIfll2 =~
10
(Xl ~~ 1 00

-00
II(A, T}12dT.

In this case,
1 2
Ef(A, T) = A2If (A, T}I
A

has a natural interpretation as the energy density of the signal in (A, T}-domain.

13.

14.
A A (A)
f(A, T} = Aa +1 f/J(T/A},

where f/J(x} = J Isla1/l*(s - x}ds.

A.8 Chapter 8. Summation of divergent series and integrals


1.
$$S(x, y) = \mathrm{Im}\sum_{m=1}^{\infty} e^{imz} = \mathrm{Im}\,\frac{1}{1 - e^{iz}}, \qquad z = x + iy.$$
Separating the imaginary part we finally get
$$S(x, y) = \frac{1}{2}\,\frac{\sin x}{\cosh y - \cos x}.$$
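As a sanity check, the closed form can be compared with a truncated version of the series. This is a sketch only: it assumes $y > 0$, so that $\mathrm{Im}\sum e^{imz} = \sum e^{-my}\sin(mx)$ converges, and the function names and the truncation length are illustrative choices.

```python
import math


def s_series(x, y, terms=2000):
    """Truncated series: Im sum_{m>=1} exp(i m z) = sum exp(-m y) sin(m x), for y > 0."""
    return sum(math.exp(-m * y) * math.sin(m * x) for m in range(1, terms + 1))


def s_closed(x, y):
    """Closed form obtained by separating the imaginary part."""
    return 0.5 * math.sin(x) / (math.cosh(y) - math.cos(x))


if __name__ == "__main__":
    for x, y in ((0.4, 0.3), (2.0, 0.1), (5.5, 1.2)):
        print(x, y, s_series(x, y), s_closed(x, y))
```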

2. The series represents a distribution

$$S = -\frac{1}{2}\,\frac{d}{dx}\sum_{m=-\infty}^{\infty} e^{imx}.$$

Recalling the generalized equality (8.7.8) we finally get

$$S = -\pi\sum_{n=-\infty}^{\infty}\delta'(x - 2\pi n).$$

The "prime" indicates differentiation with respect to $x$. If $\varphi(x)$ is a test function in $\mathcal{D}$
then the functional corresponding to $S$ acts

$$S[\varphi] = \pi\sum_{n=-\infty}^{\infty}\varphi'(2\pi n).$$


The compact support of rp (x) assures that there are only finitely many non-zero terms in
the above series.

3. The Poisson summation formula (8.8.2) gives

L L
00 00

f(m~) exp( -imx) = 21l' j(x + 21l'n).


m=-oo n=-OO

In our case, f(x) = (1/2) sign (x). Its generalized Fourier image (see Section 4.5) is

- 1
f(w) = 'PV-;-.
IW

f
Hence,
1 00
E[rp] = -i 'PV ~ n~oo rp(x + 21l'n)dx.

4.
S(w, ~) = 21l' L
00 1
~ j«g(w) - 21l'n)/ ~).
n=-oo

Since the weak limit (~ .... 0) of function j (x / A) / ~ is equal to 8 (x), we have, weakly,
that
lim S(w, A) =
d-+O
Ig~(w) I L 8(w -
n
wn ),

where the summation is extended over all roots Wn ofthe equation g(w) = 21l'n.
5. The triple sum under consideration splits into the product of single sums

P(x) = nL 3

k=l mk=-oo
00

exp(imkak(x».

Applying formula (8.7.8) to each of these sums we get


00

L
In view of the discussion in Section 1.9, each of the products of the 1-D Dirac deltas
represents the 3-D Dirac delta
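
$$\prod_{k=1}^{3}\delta\bigl(a_k(\mathbf{x}) - 2\pi n_k\bigr) = \delta\bigl(\mathbf{a}(\mathbf{x}) - 2\pi\mathbf{n}\bigr),$$

where $\mathbf{n}$ is a vector with integer components $(n_1, n_2, n_3)$. As a result,

$$P(\mathbf{x}) = (2\pi)^3\sum_{\mathbf{n}}\delta\bigl(\mathbf{a}(\mathbf{x}) - 2\pi\mathbf{n}\bigr),$$

where summation extends over all vectors $\mathbf{n}$. Recalling relation (1.8.1), we can rewrite
the last formula in the form

$$P(\mathbf{x}) = \frac{(2\pi)^3}{J(\mathbf{x})}\sum_{\mathbf{n}}\delta\bigl(\mathbf{x} - \mathbf{b}(2\pi\mathbf{n})\bigr),$$
where $\mathbf{x} = \mathbf{b}(\mathbf{y})$ is a vector field inverse to $\mathbf{y} = \mathbf{a}(\mathbf{x})$.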
6. Expressing the components of the above triple integral through the 3-D Fourier

fff
image
f(x) = j(k)exp(i(k.x»d3k.

fff
we get
F(x) = d 3kj(k)exp(i(k·x»x

L
00
x exp[i(m1(a1 . k) +m2(a2 ·k) +m3(a3' k»)].
ml,m2,m3=-00

Let us write the triple sum inside the integral in the vector form used in the previous
exercise:

L + m2(a2 . k) + m3(a3 . k»)] = L


00
exp[i (m1(a1 . k) exp(i(m. a(k»).
mlom2,m3=-00 m

where the vector field p =


a(k) has coordinates a/(k) = (at· k). I = 1.2.3. The
corresponding Jacobian of the transformation of k into P is the mixed triple product
J = (a1 . [a2 x a3])' In view of the results of Exercise 5, we have

Lm exp(i(m. a(k») = (~·~x~)


~211')3 ] Ln c5(k - b(21I'n»).
where k = bCp) is the vector field inverse to the vector field p = a(k). Let us find it,
noting that by definition bCp) satisfies the identity p = a(bCp». It is clear that bCp) is a
linear homogeneous function of p representable in the form

1
bCp) = 211' L3

• =1
bsp•.

Constant vectors blo~. b3 can be determined by substituting the last equality in the
previous one. This gives

where 8/s is the Kronecker symbol, equal to 1 when the indices coincide and zero otherwise.
The last equality means, in particular, that vector bl is perpendicular to the vectors a2 and
a3. Consequently, b l = C[a2 x a3]. The coefficient C can be found from the condition
(al . bl ) = 21r. Proceeding similarly in the case of vectors bz and b 3 we get

In this fashion,
(21r)3
Lexp ( i(m·a(k» ) = ( . [ ]) L8(k-21rb(n».
m ~ ~x~ n

Substituting the right-hand side of this equality in the original expression for F(x) and
using the probing property of the Dirac delta we finally obtain

L
00

nl ,n2 ,n3=-00
j(blnl + b2n2 + b) n3) exp[i (nl (bl . x) + n2(bz . x) + n3(b3 . x») l
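7. Recall that the discrete Fourier transform of function $f(t)$ is given by the series

$$\tilde f_\Delta(\omega) = \frac{\Delta}{2\pi}\sum_{m=-\infty}^{\infty} f(m\Delta)\exp(-im\Delta\omega). \qquad (1)$$

In our case,
$$\tilde\Pi_\Delta(\omega) = \frac{\Delta}{2\pi}\sum_{m=-N}^{N}\exp(-im\Delta\omega),$$
where $N = \lfloor 1/\Delta\rfloor$ is the greatest integer $\le 1/\Delta$. Clearly,

$$\tilde\Pi_\Delta(\omega) = \frac{\Delta}{2\pi}\left[-1 + 2\,\mathrm{Re}\sum_{m=0}^{N}\exp(-im\Delta\omega)\right]
= \frac{\Delta}{2\pi}\left[-1 + 2\,\mathrm{Re}\,\frac{1 - \exp(-i\Delta\omega(N+1))}{1 - \exp(-i\Delta\omega)}\right]
= \frac{\Delta}{2\pi}\,\frac{\sin(\omega\Delta(N + 1/2))}{\sin(\omega\Delta/2)}, \qquad (2)$$

in view of the standard formula $1 + q + \dots + q^N = (1 - q^{N+1})/(1 - q)$ for the geometric
sum.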
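Formula (2) is easy to confirm numerically. The sketch below compares the finite sum over $m = -N, \dots, N$ with the closed form $(\Delta/2\pi)\sin(\omega\Delta(N + 1/2))/\sin(\omega\Delta/2)$; the function names and the sample values of $\Delta$, $N$ and $\omega$ are assumptions made for illustration.

```python
import cmath
import math


def rect_dft_sum(w, delta, N):
    """Direct evaluation of (delta / 2 pi) * sum_{m=-N}^{N} exp(-i m delta w)."""
    total = sum(cmath.exp(-1j * m * delta * w) for m in range(-N, N + 1))
    return delta / (2 * math.pi) * total.real


def rect_dft_closed(w, delta, N):
    """Closed form (2); valid wherever sin(w * delta / 2) is not zero."""
    return delta / (2 * math.pi) * math.sin(w * delta * (N + 0.5)) / math.sin(w * delta / 2)


if __name__ == "__main__":
    delta, N = 0.2, 5
    for w in (0.7, 2.3, 5.1):
        print(w, rect_dft_sum(w, delta, N), rect_dft_closed(w, delta, N))
```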
Of course one has to remember that there is a certain indeterminacy embedded in any
discrete Fourier transform, and the discrete Fourier image corresponds to infinitely many
original functions $f(t)$. In particular, for a given value of $N$, the discrete Fourier image
(2) is the same for all rectangular functions $\Pi(t/\tau)$ as long as $\tau$ satisfies the inequalities
$\Delta N < \tau < \Delta(N + 1)$. On the other hand, fixing $\tau$ and varying $\Delta$ within the interval
$\tau/(N + 1) < \Delta < \tau/N$, we obtain different discrete Fourier images of the same original
function. The selection of a particular value is a matter of taste. Here, we choose $\Delta = 1/(N + 1/2)$, which gives (3) its simplest form
$$\tilde\Pi_\Delta(\omega) = \frac{\Delta}{2\pi}\,\frac{\sin\omega}{\sin(\omega\Delta/2)}.$$

8. Recall that

cos4 (Jrt /2) = 3/8 + (1/2) cos(Jrt) + (1/8) cos(2Jrt).


Hence, the discrete Fourier image has the representation

(3),

where
_ 11 N
!n(w) := 21r L exp(-iml1w) cos(nJr 11m).
m=-N
It is easy to see that

in(w) = ~ [lo(w + Jrn) + lo(w - Jrn)] , (4)

where lo(w) is given by (3). Choosing, for the sake of concreteness 11 = l/(N + 1), the
smallest value for which (3) remains valid, we get from (3) that

- 11
!o(w) = 2Jr [cot(l1w/2) sinw - cosw]. (5)

Substituting (5) into (4), and (4) into (3), we get

- 11
!,:l(w) = 21r sinw x

x H cot ( ~ w) - ~ [cot ( ~ (w + Jr)) + cot ( ~ (w - Jr)) ]

+ 1~ [cot (~(W + 2Jr)) + cot (~ (w - 2Jr)) ]} .

This formula is valid without any additional preconditions. Its drawback is that it is
difficult to compare with the ordinary Fourier image (4.3.22)

$$\tilde f(\omega) = \frac{3\pi^3}{2}\,\frac{\sin\omega}{\omega(\omega^2 - \pi^2)(\omega^2 - 4\pi^2)} \qquad (6)$$

of function !(t). However, after sweating through some heavy-duty trigonometric ma-
nipulations, we get the formula

(7)

2 + 3 cos( ti7r) + cos( 6.w)


x~~~~--~~~~~~~~~~~
[cos(6.7r) - cos(6.w)][cos(26.7r) - cos(6.w)]
the structure thereof transparently generalizes the structure of the Fourier image $\tilde f(\omega)$ to
the case of the periodic function $\tilde f_\Delta(\omega)$. In particular, expression (7) makes it easy to
trace the limiting passage from $\tilde f_\Delta(\omega)$ to $\tilde f(\omega)$ as $\Delta \to 0$. For specific values of $\Delta$, the
expression (7) can take an even simpler form. So, for example, for $\Delta = 1/3$ ($N = 2$), we
have
$$\tilde f_{1/3}(\omega) = \frac{1}{48\pi}\,\sin\omega\,\cot\!\left(\frac{\omega}{6}\right)\frac{7 + 2\cos(\omega/3)}{1 + 2\cos(2\omega/3)}. \qquad (8)$$
Now, we can compare the discrete Fourier image (7) with the Fourier image (6) in the
interval $0 < \omega < \Omega$, where $\Omega = \pi/\Delta = \pi(N + 1)$ is the half-period of the discrete
Fourier image (7). To complete the comparison let us denote the ratio

$$R_N(x) = \frac{\tilde f_\Delta(\omega)}{\tilde f(\omega)}, \qquad x = \frac{\omega}{\Omega} = \frac{\Delta\omega}{\pi} = \frac{\omega}{\pi(N + 1)}.$$
The graphs of $R_N(x)$, for $N = 2$ and $N = 10$, are shown in Fig. A.8.1.

FIGURE A.8.1
The plots, for N = 2 and N = 10, of discrete and ordinary Fourier transforms ratio
for function f(t) from Exercise 8. The function has hidden singularities. The plots
demonstrate that the discrete Fourier transform may be smaller than the ordinary
one.

The plots indicate an intuitively surprising argument: the "rougher" discrete Fourier
image can decay to zero, as w increases, faster than the ordinary Fourier image of function
f (t). There is a simple explanation for this phenomenon which is obvious and convincing
to an engineer and can have a heuristic value to a mathematician. The point is that the
discrete Fourier image (7) is the Fourier image of not only the original function f(t),
the fourth derivative thereof is discontinuous, but also of many other functions, including
some that are infinitely differentiable. Those, of course, have Fourier images $\tilde f(\omega)$ that
decay, as $\omega \to \infty$, faster than any power of $\omega$. As a result, the discrete Fourier image (7)

$$\tilde f_\Delta(\omega) = \sum_{n=-\infty}^{\infty}\tilde f(\omega + 2\pi n/\Delta),$$

in the interval $|\omega| < \Omega = \pi/\Delta$, can turn out to be smaller than the Fourier image (6).

9. Taking advantage of the periodicity of I(t) we can regroup the terms of the series
(1) and rewrite it in the form

Then, in view of the generalized equality (8.7.8),

L
00

it:.(w) = F(n)8(w - 2rrnjT), (9)


n=-oo

where

=L
M-l
F(n) I(ml!.) exp( -iZrrnmj M). (10)
m=O

10. Initially, let us take a look at the question of applying the Poisson summation
formula to evaluate the sum of the series

$$S(\Delta) = \sum_{m=1}^{\infty} f(m\Delta), \qquad \Delta > 0.$$

Selecting an even function $f(t)$, with values at $t = m\Delta$ coinciding with those of the
function $f$ in the above series, the equality can be rewritten in the form

$$S(\Delta) = -\frac{1}{2}f(0) + \frac{1}{2}\sum_{m=-\infty}^{\infty} f(m\Delta),$$

which is more suitable for an application of the Poisson summation formula. The latter
gives

$$S(\Delta) = -\frac{1}{2}f(0) + \frac{\pi}{\Delta}\sum_{n=-\infty}^{\infty}\tilde f(2\pi n/\Delta).$$

Since the Fourier image $\tilde f(\omega)$ of an even function $f(t)$ is also even, the above equality
can be rewritten in the form

$$S(\Delta) = \frac{\pi}{\Delta}\tilde f(0) - \frac{1}{2}f(0) + \frac{2\pi}{\Delta}\sum_{n=1}^{\infty}\tilde f(2\pi n/\Delta).$$

If $\tilde f(\omega)$ has a compact support, then the series on the right has only finitely many terms
different from zero. In our case it is convenient to choose
$$f(t) = \Delta\,\frac{\sin t}{t}, \qquad \tilde f(\omega) = \frac{\Delta}{2}\,[\chi(\omega + 1) - \chi(\omega - 1)].$$

Substituting these functions in the preceding equality we get

$$S(\Delta) = \frac{1}{2}(\pi - \Delta) + \pi\sum_{n=1}^{\infty}\chi(1 - 2\pi n/\Delta),$$

so that finally
$$\sum_{m=1}^{\infty}\frac{\sin(m\Delta)}{m} = \frac{1}{2}(\pi - \Delta) + \pi\left\lfloor\frac{\Delta}{2\pi}\right\rfloor,$$
where $\lfloor x\rfloor$ is the largest integer $\le x$. Note that the expression on the left-hand side
represents the Fourier series of the periodic function appearing on the right-hand side.
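The final identity, $\sum_{m \ge 1}\sin(m\Delta)/m = \tfrac12(\pi - \Delta) + \pi\lfloor\Delta/2\pi\rfloor$, can be checked against partial sums. The sketch below is illustrative only: the series converges conditionally and slowly, so a long partial sum (here 200000 terms, an arbitrary choice) and a loose tolerance are used, and $\Delta$ is kept away from multiples of $2\pi$.

```python
import math


def sine_series_partial(delta, terms=200000):
    """Partial sum of sum_{m>=1} sin(m * delta) / m."""
    return sum(math.sin(m * delta) / m for m in range(1, terms + 1))


def sine_series_closed(delta):
    """Closed form from the Poisson summation formula."""
    return 0.5 * (math.pi - delta) + math.pi * math.floor(delta / (2 * math.pi))


if __name__ == "__main__":
    for delta in (1.0, 4.0, 7.0, 13.0):
        print(delta, sine_series_partial(delta), sine_series_closed(delta))
```

For $\Delta$ larger than $2\pi$ the floor term is what restores the periodicity of the left-hand side.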
11. Counting on the curiosity of the reader we take a look at a more general problem
of the Fourier integral
(11)

where, in the case of a 21r-periodic function f(t), the Fourier image appearing inside the
integral

L
00

j(w) = jm~(w - m). (12)


m=-OO
If we replace (11) by the equality

$$f_N(t) = \int \tilde f(\omega)\,\tilde h(\omega, N)\,e^{i\omega t}\,d\omega, \qquad (13)$$

where $\tilde h(\omega, N)$ is the rectangular function, for example,

$$\tilde h(\omega, N) = \Pi(\omega/N), \qquad \Pi(t) = \chi(t + 1) - \chi(t - 1), \qquad (13a)$$

then the series (12) retains only the needed number $2N + 1$ of terms. The corresponding
original function is
$$h(t, N) = 2\sin(Nt)/t. \qquad (13b)$$
Using formula (3.2.6), we can rewrite the equality (13), with the help of the convolution

$$f_N(t) = \frac{1}{2\pi}\,f(t) * h(t, N). \qquad (14)$$

By (3.3.7), the function $h(t, N)/2\pi$ weakly converges, as $N \to \infty$, to the Dirac delta.
So, if $f(t)$ is sufficiently smooth, then $f_N(t)$ converges pointwise (and even uniformly)
to $f(t)$. However if, as is the case in this exercise, the original function $f(t)$ is only
piecewise continuous, then it is necessary to study in more detail the asymptotic behavior
(for $N \gg 1$) of the convolution integral (14) in the vicinity of discontinuities of the first
kind.

Assume that the piecewise continuous function $f(t)$ has a jump at $t = T$ of size
$\lfloor f\rceil = f(T + 0) - f(T - 0)$. Hence, asymptotically, as $t \to T$,

$$f(t) \sim \tfrac{1}{2}[f(T - 0) + f(T + 0)] + \tfrac{1}{2}\lfloor f\rceil\,\mathrm{sign}(t - T).$$

Substituting this expression into (13), we find that the behavior of $f_N(t)$ in the vicinity of
the discontinuity point is described by the following asymptotics:
$$f_N(t) \sim \tfrac{1}{2}[f(T - 0) + f(T + 0)] + \tfrac{1}{\pi}\lfloor f\rceil\,\mathrm{Si}(x), \qquad x = N(t - T), \qquad (15)$$

where the integral sine function

$$\mathrm{Si}(z) = \int_0^z \frac{\sin x}{x}\,dx. \qquad (16)$$

Observe certain features of the asymptotic formula (15). First of all, Si (0) = 0, which
means that at the discontinuity point t = T of function f(t), the truncated Fourier integral
(13) (and in our case, the series in Exercise 11) converges to the arithmetic average of the
one-sided limits of the function.
Secondly, if we remove the already analyzed first summand on the right-hand side of
(15), place the discontinuity at the origin $t = 0$ ($T = 0$), and rewrite (15) in the form

$$f_N(t) \sim \lfloor f\rceil\,g(tN), \qquad g(x) = \frac{1}{\pi}\,\mathrm{Si}(x),$$

then a new phenomenon, important in physics and engineering, can be observed. The odd
function $g(x)$ entering in the above formula, normalized by the size $\lfloor f\rceil$ of the jump,
describes the behavior of $f_N(t)$ in the vicinity of the discontinuity. For $x \to \pm\infty$, we have
$g(x) \to \pm 1/2$. However, since the integrand in (16) changes sign, the approach of $g(x)$
to the limit is not monotone. In particular, as is clear from the graph of function $g(x)$
shown in Fig. A.8.2,
$g(x)$ assumes the maximal value $g(\pi) \approx 0.59$ for $x = \pi$ ($t = \pi/N$). This means that,
for arbitrarily large $N$, at the distance $\pi/N$ to the left and to the right of the jump point
of function $f(t)$, the graph of function $f_N(t)$ has sharp up and down excursions. This
anomalous behavior of function $f_N(t)$ in the neighborhood of jump points of function
$f(t)$ is called the Gibbs phenomenon.
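The value $g(\pi) = \mathrm{Si}(\pi)/\pi \approx 0.59$ quoted above can be reproduced with a few lines of code. The sketch below evaluates the sine integral through its Maclaurin series $\mathrm{Si}(x) = \sum_{k \ge 0} (-1)^k x^{2k+1}/((2k+1)(2k+1)!)$; the function name and the number of terms are choices made here for illustration.

```python
import math


def sine_integral(x, terms=40):
    """Si(x) from its Maclaurin series; adequate for moderate |x| such as x = pi."""
    total = 0.0
    power_over_factorial = x  # holds x**(2k+1) / (2k+1)!
    for k in range(terms):
        total += (-1) ** k * power_over_factorial / (2 * k + 1)
        power_over_factorial *= x * x / ((2 * k + 2) * (2 * k + 3))
    return total


if __name__ == "__main__":
    g_max = sine_integral(math.pi) / math.pi
    print("g(pi) =", g_max)
    print("overshoot beyond the limit 1/2, as a fraction of the jump:", g_max - 0.5)
```

The overshoot $g(\pi) - 1/2$ does not shrink as $N$ grows; only its location $t = \pi/N$ moves toward the jump.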
12. As before, we shall rely on formula (14). Observe that the Gibbs phenomenon was
caused by the fact that function $h(t, N)$ entering (14) changed signs. So the phenomenon
can be avoided if we select a nonnegative function $h(t, N)$ such that its Fourier image
has a compact support. The latter is needed so that only finitely many terms of the series from
Exercise 11 are different from zero. So let us consider function

$$h(t, N) = A\,\frac{4\sin^2(Nt)}{t^2} \qquad (17)$$

as a candidate. Coefficient $A$ will be selected later from a normalization condition, which,
as is clear from (15), takes the form

$$\frac{1}{2\pi}\int h(t, N)\,dt = 1.$$


FIGURE A.8.2
Graph of the function (1/π) Si(x).

First of all, let us check that the Fourier image of $h$ has a compact support. In view of
(3.2.10), the Fourier image of the square of the original function is

$$\tilde h(\omega, N) = A\,\Pi(\omega/N) * \Pi(\omega/N).$$

The convolution of two functions with compact support also has compact support. In our
case
$$\tilde h(\omega, N) = 2AN\begin{cases}1 - |\omega|/2N, & \text{for } |\omega| \le 2N;\\ 0, & \text{for } |\omega| > 2N.\end{cases} \qquad (19)$$
With this information it is easy to find the correct norming $A$, which has to be such
that $\tilde h(0, N) = 1$. Thus $A = 1/2N$. It is not difficult to show that the above function
$h(t, N)/2\pi$ weakly converges to the Dirac delta and thus guarantees the convergence of
$f_N(t)$ to $f(t)$ at the continuity points of the latter. On the other hand, the positivity of
function (18) removes the Gibbs phenomenon.
Finally, it should be mentioned that function (19) substituted in (12) preserves in the
sum (13) $4N - 1$ terms: $2N - 1$ on the left and on the right of the term $m = 0$. To keep
only $N$ terms on both sides of $m = 0$ one has to replace $N$ by $(N + 1)/2$ in the preceding
formulas. As a result we obtain that
$$\tilde h(\omega, N) = \begin{cases}1 - |\omega|/(N + 1), & \text{for } |\omega| \le N + 1;\\ 0, & \text{for } |\omega| > N + 1;\end{cases}$$

$$h(t, N) = \frac{4}{N + 1}\,\frac{\sin^2((N + 1)t/2)}{t^2}.$$
The corresponding sum in the formulation of Exercise 12 assumes then the familiar shape

$$f_N(t) = \sum_{m=1}^{N}\left(1 - \frac{m}{N + 1}\right)\frac{\sin(mt)}{m} \qquad (20)$$

of the Cesaro sum for the series from Exercise 11. Fig. A.8.3 shows the graphs of functions
$f_N(t) = \sum_{m=1}^{N}\sin(mt)/m$ obtained by the simple summation of the first N = 25 terms
of the Fourier series, and by the Cesaro summation method. The plots clearly show how
the Cesaro method helps to avoid the Gibbs phenomenon.

FIGURE A.8.3
The graphs of functions $f_N(t) = \sum_{m=1}^{N}\sin(mt)/m$ obtained by the simple summation
of the first N = 25 terms of the Fourier series, and by the Cesaro summation method.
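The comparison shown in the figure can be reproduced numerically. The sketch below is a minimal illustration with $N = 25$: it evaluates the plain partial sum and the Cesaro sum (20) on a grid in $(0, \pi)$ and compares their maxima with $\pi/2$, the supremum of the limiting sawtooth $(\pi - t)/2$. The grid spacing and function names are assumptions made here.

```python
import math


def plain_sum(t, N):
    """Partial sum of the first N terms of sum sin(m t) / m."""
    return sum(math.sin(m * t) / m for m in range(1, N + 1))


def cesaro_sum(t, N):
    """Cesaro sum (20): the same terms weighted by 1 - m / (N + 1)."""
    return sum((1 - m / (N + 1)) * math.sin(m * t) / m for m in range(1, N + 1))


def maxima(N=25, points=2000):
    """Maxima of both sums on an interior grid of (0, pi)."""
    grid = [k * math.pi / points for k in range(1, points)]
    return (max(plain_sum(t, N) for t in grid),
            max(cesaro_sum(t, N) for t in grid))


if __name__ == "__main__":
    plain_max, cesaro_max = maxima()
    print("pi/2          =", math.pi / 2)
    print("plain maximum =", plain_max)   # exceeds pi/2: Gibbs overshoot
    print("Cesaro maximum =", cesaro_max)  # stays below pi/2
```

The Cesaro sum stays within the range of the limiting function because it is a convolution with a nonnegative kernel of unit mass, which is exactly the argument used in Exercise 12.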

13. It sufficies to select


- 8
1/Io(w) = -TI«w - WO)/O)·
1C

The inverse Fourier transform


. 215
1/Io(t) = e"·'Ot - sin(nt),
1C

substituted into (8.9.16), gives

F(t) = 2c5 eiwot ~ f(mc5) sin(nt - mc5n) e-imwo6 (21)


1C ~ t-mc5
m=-oo

Recall that one of the features of the analytic signal that is attractive for engineers is
that it uniquely determines its amplitude $A(t)$ and phase $\varphi(t)$ so that

$$f(t) = A(t)\cos(\omega_0 t + \varphi(t)).$$

To see this it is sufficient to write the complex signal in the form

$$F(t) = A(t)\exp(i\omega_0 t + i\varphi(t)),$$



separate the real and imaginary parts


F(t) = le(t) cos(wot) + ils(t) sin(wot),
and compare this equality with the previous one to obtain the slowly evolving in time
envelope

of the narrow-band signal.


14. Initially, let us construct an appropriate class of sufficiently smooth Fourier images
$\tilde\psi(\omega)$ such that the corresponding original function $\psi(t)$ decays to zero (as $|t| \to \infty$)
faster than $\psi_0(t)$ (8.9.6), and such that the series (8.9.16) converges faster than the standard
Shannon's series (8.9.17). Our experience with the Fourier transform suggests that it is
useful to write $\tilde\psi(\omega)$ in the form of the convolution
- ~-
1/I(w) = -1/Io(w)
l::!.
* -;P(J.tw/rc)
J.t
rc
of the rectangular function ~{fro(w)/ l::!., where {fro(w) is given by the equality (8.9.6),
and of the "enveloping" it compact-support function 1.t;P(l.tw/rc)/rc. The constant I.t is
determined from the equality rc/I.t = rc(l/~ - 1/l::!.). If ;p(v), q:{r) being the original
function, vanishes identically outside the interval Iv I ::: 1 and satisfies norming condition
qJ(O) = 1, then {fr (w) is a function of type (8.10.5). With such choice of {fr (w), the function
of interest
~ sin '['
1/I(t) = --qJ(l::!.'['/I.t), '[' = rct/l::!..
l::!. '['
As ;p(v) one can take, for example, the sufficiently smooth function

-(0) = { (4/3) cos4 (rcO/2), for 101 < 1;


qJ 0, for 101 > 1;
the graph thereof is shown of Fig. 4.3.3b. Then, in view of (4.3.22),
sin '['
qJ('[') = 4rc 4 '['('['2 _ rc2)('['2 _ 4rc2)
.

15. First, let us find the largest value of I which determines the length of the interval ~
between readings (8.9.21). In our case, it follows from (8.9.22) that I < 9/2. Therefore,
we choose I = 4. Thus, by (8.9.22), ~ = 9rc /2WO.
16. Taking advantage of the freedom to choose arbitrarily values of the Fourier image
{fr(w) in the intervals 0 <Iw ± wol < rc/~ - 0, the widths thereof are p = wo/45 in
our case, let us smooth out the Fourier image {fro(w) (8.9.24) with the help of convolution
with the function

{fr(w) = {fro(w) * ;P(l.tw/rc), I.t = 9Orc/wo.


Calculate the inverse Fourier image with the help of (8.9.25) to obtain

= !!:.qJ(rct
1/I(t)
rc
/ I.t) 2~ sin(Ot) cos(wot).
rct
Recall, that 0 = WO/10, ~ = 9rc/2wo, I.t = 9Orc/WO.
Appendix B
Bibliographical Notes

The history of distribution theory and its applications in physics and engineering goes
back to
[1] O. HEAVISIDE, On operators in mathematical physics, Proc. Royal
Soc. London, 52(1893),504-529, and 54 (1894),105-143,
and
[2] P. DIRAC, The physical interpretation of the quantum dynamics,
Proc. Royal Soc. A, London, 113(1926-7), 621-641.
A major step towards the rigorous theory and its application to weak solutions of partial
differential equations was made in the 1930s by
[3] J. LERAY, Sur le mouvement d'un liquide visqueux emplissant l'espace,
Acta Mathematica 63 (1934), 193-248.
[4] R. COURANT, D. HILBERT, Methoden der Mathematischen
Physik, Springer, Berlin 1937.
[5] S. SOBOLEV, Sur un théorème de l'analyse fonctionnelle,
Matematiceski Sbornik 4 (1938), 471-496.
The theory obtained its final elegant mathematical form (including the locally convex
linear topological spaces formalism) in a classic treatise of

[6] L. SCHWARTZ, Théorie des distributions, vol. I (1950), vol. II (1951),
Publications de l'Institut de Mathématique de l'Université de Strasbourg,

which reads well even today.


In its modern mathematical depth and richness the distribution theory and its application
to Fourier analysis and differential equations can be studied from many sources
starting with massive multivolume works by

[7] I.M. GELFAND et al. Generalized functions, 6 volumes, Moscow,


Nauka 1959-1966.

and

[8] L. HÖRMANDER, The Analysis of Linear Partial Differential Operators,
4 volumes, Springer, Berlin-Heidelberg-New York-Tokyo 1983-1985,

to smaller, one volume monographs from research oriented

[9] E.M. STEIN, G. WEISS, Introduction to Fourier Analysis on Euclidean
Spaces, Princeton University Press 1971,
[10] L.R. VOLEVICH, S.G. GINDIKIN, Generalized Functions and Convolution
Equations, Moscow, Nauka 1994,
to textbook style

[11] R. STRICHARTZ, A Guide to Distribution Theory and Fourier Transforms,
CRC Press, Boca Raton 1994,
[12] V.S. VLADIMIROV, Equations of Mathematical Physics, Moscow,
Nauka 1981,

An elementary, but rigorous, construction of distributions based on the notion of


equivalent sequences was developed by
[13] J. MIKUSINSKI, R. SIKORSKI, The Elementary Theory of Distributions,
I (1957), II (1961), PWN, Warsaw.
[14] P. ANTOSIK, J. MIKUSINSKI, R. SIKORSKI, Generalized Functions,
the Sequential Approach, Elsevier Scientific, Amsterdam 1973.
The applications of distribution theory have appeared in countless physical and
engineering books and papers. As far as more recent, application-oriented textbooks are
concerned, which have some affinity to our book, we would like to quote

[15] F. CONSTANTINESCU, Distributions and Their Applications in Physics,
Pergamon Press, Oxford 1980.
[16] T. SCHUCKER, Distributions, Fourier Transforms and Some of Their
Applications to Physics, World Scientific, Singapore 1991.
which, however, have a different spirit and do not cover some of the modern areas covered
by our book.

The classics on Fourier integrals are


[17] S. BOCHNER, Vorlesungen über Fouriersche Integrale, Akademische
Verlag, Leipzig 1932,
[18] E. C. TITCHMARSH, Introduction to the Theory of Fourier Integrals,
Clarendon Press, Oxford 1937,
with numerous modern books on the subject, including the above mentioned monograph
[9] and elegant expositions by

[19] H. BREMERMAN, Complex Variables and Fourier Transforms,
Addison-Wesley, Reading, Mass. 1965,
[20] H. DYM, H.P. McKEAN, Fourier Series and Integrals, Academic
Press, New York 1972,
[21] T. W. KORNER, Fourier Analysis, Cambridge University Press 1988.

The asymptotic problems (including the method of stationary phase) discussed in this
book are mostly classical. The well known reference is e.g.
[22] N.G. DE BRUIJN, Asymptotic Methods in Analysis, North-Holland,
Amsterdam 1958,
with the newer reference being
[23] M.B. FEDORYUK, Asymptotics, Integrals, Series, Nauka, Moscow
1987.
The special functions have a rich literature including
[24] H. BATEMAN, A. ERDÉLYI, Higher Transcendental Functions, 2
volumes, McGraw-Hill, New York 1963,
[25] F.W.J. OLVER, Asymptotics and Special Functions, New York 1974,
and their connections with harmonic analysis on groups are explained, e.g., in
[26] N. YA. VILENKIN, Special Functions and Group Representations,
Nauka, Moscow 1965.
As always, it is handy to keep around
[27] JAHNKE-EMDE, Tables of Higher Functions, Teubner-Verlag, Leipzig
1960,
[28] I.S. RYZHIK, I.M. GRADSHTEYN, Tables of Integrals, Sums, Series
and Products, FM, Moscow 1963,
[29] M. ABRAMOWITZ, I.A. STEGUN, Handbook of Mathematical Functions,
National Bureau of Standards, 1964
although the role of such compendia has been recently diminished with the introduction of
computer symbolic manipulation software such as Mathematica and Maple.

A good source on the mathematical theory of singular integrals is


[30] E.M. STEIN, Singular Integrals and Differentiability Property of
Functions, Princeton University Press 1970,
with vast literature spread through mathematical journals. The well known sources on
fractal calculus are
[31] A.H. ZEMANIAN, Generalized Integral Transformations, Interscience,
New York 1968,
[32] K.B. OLDHAM, J. SPANIER, The Fractional Calculus. Theory and
Applications of Differentiation and Integration to Arbitrary Order,
Academic Press, San Diego 1974,
[33] A.C. McBRIDE, Fractional Calculus and Integral Transforms of
Generalized Functions, Pitman, London 1979,
and more recent advances in the area can be gleaned from the collection of research papers
[34] A.C. McBRIDE, G.F. ROACH, Editors, Fractional Calculus, Re-
search Notes in Mathematics, Pitman, Boston 1985.
The wavelets have obtained recently several excellent expositions, and the reader can
benefit from consulting

[35] Y. MEYER, Wavelets and Operators, Cambridge University Press 1992,
[36] I. DAUBECHIES, Ten Lectures on Wavelets, SIAM, Philadelphia 1992.
[37] G. KAISER, A Friendly Guide to Wavelets, Birkhäuser-Boston 1994.

This is a very active research area and some new results have appeared in the following
volumes of articles
[38] I. DAUBECHIES, Editor, Different Perspectives on Wavelets, Ameri-
can Mathematical Society, Providence, R.I. 1993,
[39] C.K. CHUI, Editor, Wavelets: A Tutorial in Theory and Applications,
Academic Press, New York 1992.
The latter contains articles by D. Pollen on construction of Daubechies wavelets and G.G.
Walter on wavelets and distributions which were used in Chapter 7. An engineering
perspective can be found in
[40] A. COHEN, R.D. RYAN, Wavelets and Digital Signal Processing,
Chapman and Hall, New York 1995.

The classic text on divergent series is


[41] G.H. HARDY, Divergent Series, Clarendon Press, Oxford 1949,
but the problem has broader implications and connections with asymptotic expansions
and functional analytic questions concerning infinite matrix operators, see e.g.
[42] R.B. DINGLE, Asymptotic Expansions, Academic Press, New York
1973,
[43] I.J. MADDOX, Infinite Matrices of Operators, Springer-Verlag, Berlin
1973.
The modern viewpoint is presented in
[44] B. SHAWYER, B. WATSON, Borel's Methods of Summability: Theory
and Applications, Clarendon Press, Oxford 1994.

Finally, an exhaustive discussion of the Shannon's sampling theorem and related inter-
polation problems can be found in
[45] R.J. MARKS II, Introduction to Shannon's Sampling and Interpolation
Theory, Springer-Verlag, Berlin 1991.
Index

Abel method of summation, 254
Abramowitz, M., 177, 327
Achilles, 245
    tiring-, 255
Anger function, 146
Antosik, P., 326
approximation
    Born, 61
    diffusion-, 91
    Fresnel-, 14, 141
    geometric optics, 142
    weak-, 10
asymptotic equivalence, 93
asymptotic expansion, 103
    principal term of-, 104
asymptotics
    general scheme, 137ff
    of fractal relaxation, 178
autocorrelation function, 218

Bateman, H., 327
beads on a string, 37
beetle on a string, 96
Bernoulli numbers, 266
Bessel function, 102, 204
beta function, 169
Bochner, S., 326
Bremerman, H., 327

camel
    passing through needle's eye, 94
Catalan formula, 308
Cauchy criterion, 248
Cauchy density, 12
Cauchy formula, 157, 166
Cauchy integral
    principal value of-, 152
causality principle, 39, 153
Cesaro method of summation, 254, 261
characteristics, 51ff, 86
Chui, C.K., 328
Cohen, A., 328
Constantinescu, F., 326
continuity equation, 70
    in Lagrangian coordinates, 53
    of continuous medium, 47
    of single particle, 44
convergence
    absolute, 250
    conditional, 250
    uniform, 8, 101
    weak, 10
convolution, 27, 80
coordinates,
    Eulerian, 50
    Lagrangian, 50
Courant, R., 325

d'Alembert solution, 44
Daubechies, I., 328
Daubechies wavelet, 231ff
    scaling relation, 232
Dawson integral, 151
de Bruijn, N.G., 327
decibels, 107
delta-function, 3
derivative
    fractal, 170ff
    substantial, 52

differential equations
    fractal, 175
    ordinary, 39
    with singular coefficients, 60
    with time-dependent coefficients, 41
diffusion approximation, 88
dilation, 225
Dingle, R.B., 328
dipole, singular, 28, 38
Dirac, P., 6, 325
Dirac-delta, 6, 9
    analytic representation of-, 160
    on lines, 28ff
    on R^n, 28ff
    on surfaces, 28ff
    periodic-, 268
    selfsimilarity of-, 26
    with nonmonotonic argument, 61
Dirichlet integral, 83
discontinuities
    of the first kind, 101
    of the second kind, 112, 131
dispersion relation, 162
distribution
    convolution of-, 27
    derivative of-, 15
    direct product of-, 65
    elementary construction of-, 326
    integral of-, 20
    multiplication of-, 18
    nonlinear transformation of-, 63
    of composite argument, 24, 61
    on finite interval, 58
    orthogonality of-, 238
    parameter-dependent, 64
    principal value-, 150ff
    regular, 9
    singular, 9
    supersingular, 65
    support of-, 10
    tempered, 82
    vector-valued, 35
dual space, 6
Duhamel integral, 41
Dym, H., 327

electrodynamics, xii
energy
    density, 198
    total, 80
equation
    characteristic, 51, 86
    divergence form of-, 46
    Liouville-, 290
    Riemann-, 147
    wave-, 70
equivalence relation (asymptotic), 95
Erdélyi, A., 325
Euler
    constant, 97, 131
    integral, 115
Eulerian coordinates, 50

Fedoryuk, M.B., 327
filtering, 110
Fourier image, 75
Fourier transform, 75
    asymptotics, 93ff
    discrete-, 271
    generalized, 81
    inverse-, 78
    inverse windowed-, 206
    of convolution, 80
    of derivative, 78
    of fractal derivative, 174
    of Heaviside function, 125, 163
    of power function, 112, 123
    of smooth function, 78
    redundancy of windowed-, 207
    windowed, 193ff, 309
fractal (fractional)
    differential equation, 176
    differentiation, 170ff, 307
        causality, 174
        nonlocal character, 174
        scale invariance of-, 174
    integration, 166ff, 307
        iteration of-, 169
    Laplacian, 174
    relaxation, 175
frequency, 77
    angular-, 75
    current-, 202
    instantaneous-, 194
    localization, 191

    mean-, 199
    relaxation-, 175
    standard deviation of-, 199
Fresnel approximation, 141ff
function(s),
    Anger-, 146
    autocorrelation-, 219
    beta-, 169
    Bessel-, 100, 205
    Cauchy-, 12
    error-, 178
    Gamma-, 112, 115
    Gaussian-, 11
    Green's-, 39, 49
    Heaviside-, 21
    infinitely differentiable-, 7
    jump of-, 22
    locally integrable-, 9
    Lorentz-, 12
    Mittag-Leffler, 177
    of the order not greater than ..., 94
    of the order smaller than ..., 94
    of the same order, 94
    power-, 80
    pulse-, 109
    rapidly decreasing-, 82
    Riemann's zeta-, 267
    scaling-, 228
    stream-, 57
    transfer-, 39
    unit step-, 21
    Weber-, 146
    wide-band ambiguity-, 218
    windowing-, 195
    with compact support, 7
functional
    analysis, 189
    continuous-, 7, 8
    linear-, 5, 8

Gabor transform, 198
Gelfand, I.M., 325
geometric optics approximation, 142
Gibbs phenomenon, 285, 320
Gindikin, S.G., 326
Gradshteyn, I.M., 327
Green's function, 39, 49

Haar wavelet, 226ff
    location, 227
    orthogonality, 227
    resolution level, 227
    scaling relation, 231
    selfsimilarity, 231
Hardy, G.H., 328
harmonic oscillator, 41
    with damping, 41
Heaviside, O., 325
Heisenberg uncertainty principle, 192
Hermitian property, 186
Hilbert
    space, 183
        completeness of-, 189
    transform, 160ff, 303
Hilbert, D., 325
Hölder inequality, 170
Hörmander, L., 326

incompressible medium, 55
initial-value problem
    for 1-D wave equation, 43
integral
    collision-, 84
    cosine, 130, 305
    Dawson, 151
    Dirichlet-, 81, 85
    divergent-, 262ff, 309
    Duhamel-, 41
    Euler-, 115
    fractal, 166
    Fresnel sine and cosine-, 142
    Lebesgue, 187
    principal value of-, 125
    scattering-, 84
    sine, 305, 320-1
    singular-, 149ff
isomorphism, 54

Jahnke-Emde, 327
Jordan Lemma, 116
jumps
    of the first kind, 101

Keiser, G., 328
kernel, 6
Körner, T.W., 327

Lagrangian coordinates, 50
Laplace operator, 56
Laplace's method, 145
Lebesgue
    integral, 188
    spaces L^p, 190
Leibnitz formula, 17ff
lemma
    Riemann-Lebesgue, 98, 132
Leray, J., 325
linear combination, 17
linear functional, 6, 9
Liouville equation, 289
Lipschitz condition, 150
log-log scales, 105

Maddox, I.J., 328
majorant, 103
Marks II, R.J., 329
mass conservation law, 46
McBride, A.C., 328
McKean, H.P., 327
medium
    incompressible, 55
Meyer, Y., 328
Mikusinski, J., 326
multiresolution analysis, 225

norm, 184

observables
    momentum, 193
    position, 193
Oldham, K.B., 328
Olver, F.W.J., 327
operator
    Laplace, 56
    linear, 77

Parseval equality, 80, 192
particles
    sticky, 66
partition of unity, 233
passive tracer,
    concentration, 54
    density, 54
phase
    space, 46, 84
    transition, 147
Poisson summation formula, 271ff, 313ff
Pollen, D., 232, 328
potential
    scalar, 56
    vector, 56
power-type behavior, 107
principal value
    of Cauchy integral, 152
    of integral, 126, 149
principle
    causality-, 40, 153
    of asymptotic attenuation, 157
    of infinitesimal relaxation, 156, 255ff
    superposition-, 77
    uncertainty-, 183ff, 190
probing property, 4
    multiplier, 19

recursive method, 175
redundancy, 207, 223
regime
    multi-stream, 67
    single-stream, 66
relaxation
    asymptotics, 178
    fractal in frequency, 179
    fractal in time, 179
    Kohlrausch, 179
    of order 1/2, 177
remainder term, 103
renormalization, 127
Riemann equation, 147
Riemann Theorem, 250
Riemann-Lebesgue Lemma, 98, 132
Riemann's zeta function, 267
Riesz Theorem, 167
Roach, G.F., 328
Ryan, R.D., 328
Ryzhik, I.S., 327

Schucker, T., 326
Schwartz inequality, 169, 185, 189, 207, 215
Schwartz, L., 8, 325
self-similarity
    of Dirac delta, 26
    of function, 5

    of Haar wavelets, 231
separation of scales, 260ff
series
    Abel's summability of-, 254, 261
    Cesaro's summability of-, 254, 261
    convergence of-, 245ff
        absolute-, 249
        Cauchy criterion, 248
        conditional-, 249
        pointwise-, 252
        unconditional-, 250
        uniform-, 252
        Weierstrass criterion-, 253
    divergent-, 245ff, 312
        geometric-, 273ff
        summability of-, 253
    functional, 252
        φ-summability of-, 260ff
    geometric, 245, 248
        generalized sum of-, 253
    harmonic, 96, 249
    of complex exponentials, 264ff
Shannon's sampling theorem, 276ff, 320
Shawyer, B., 328
signals
    analytic-, 162, 303
    and systems, xii
    complex-structured-, 215
    input-, 39
    narrowband-, 162
    output-, 39
    smoothing and filtering of-, 110
    time-windowed-, 195
Sikorski, R., 326
simplest tune, 193
smoothing, 109
Sobolev, S., 8, 325
Sobolev-Schwartz space, 8
solution
    d'Alembert, 44
    fundamental-, 39
space,
    D, 8
    D', 8
    Hilbert-, 183ff
    inner product-, 184
    linear topological-, 31
    phase-, 46
    S, 82
    S', 82
    Sobolev-, 238
    Sobolev-Schwartz-, 8
Spanier, J., 328
stationary phase method, 140
    accuracy of-, 143
stationary point, 140
steepest descent method, 145
Stegun, I.A., 327
Stein, E.M., 325, 327
Stirling formula, 147
stream function, 57
Strichartz, R., 326
substantial derivative, 52

Taylor formula, 264
test function, 5
theorem
    Abel's-, 254
    Cauchy's-, 115, 157
    Fubini's-, 65
    fundamental- of vector calculus, 56
    Gauss divergence, 47
    Leibniz, 249
    mean value-, 261
    Riemann-, 250
    Shannon sampling-, 276ff
    Weierstrass-, 99
time-frequency localization, 190
Titchmarsh, E.C., 326
topology
    dual, 32
    weak, 32
tortoise, 245
transfer function, 39
transform
    Fourier-, 77
    Gabor-, 198
    Hilbert-, 160, 303
transport
    equation, 84ff
    phenomena, xii
triangle inequality, 185

uncertainty principle, 183ff
    Heisenberg-, 192
    strong-, 215

Vilenkin, N. Ya., 327
Vladimirov, V.S., 326
Volevich, L.R., 326

Walter, G.G., 326
Watson, B., 328
wave
    monochromatic, 153
    propagation, xi
    1-D equation, 43
wavelet
    admissible, 223
    Daubechies wavelet, 231ff
        scaling relation, 232
    Haar-, 226ff
        location, 226
        orthogonality, 226
        resolution level, 226
        scaling relation, 231
        selfsimilarity, 231
    image, 210
    Mexican hat-, 213
    Morlet-, 211
        efficiency factor of-, 212
    mother-, 210ff
        complex-structured-, 217
        one-sided-, 221
        two-sided-, 221
    Poisson, 222
    transform, 210ff, 310
        inversion, 219
wavenumber, 154
Weber function, 146
Weiss, G., 327
window
    finite memory-, 195
    relaxation-, 195
    Gaussian-, 195

Zemanian, A.H., 327

Zeno's paradox, 245
