
Ulm University

Mathematical Foundations of Quantum Mechanics

Stephan Fackler
Version: July 17, 2015
Contents

Introduction

1 A Crash Course in Measure Theory
  1.1 Measure Spaces
  1.2 The Lebesgue Integral
  1.3 Lebesgue Spaces

2 The Theory of Self-Adjoint Operators in Hilbert Spaces
  2.1 Basic Hilbert Space Theory
    2.1.1 Orthonormal Bases
    2.1.2 Bounded Operators on Hilbert Spaces
    2.1.3 Sobolev Spaces
    2.1.4 The Fourier Transform on L²(Rⁿ)
  2.2 Symmetric and Self-Adjoint Operators
    2.2.1 Unbounded Operators on Hilbert Spaces
    2.2.2 The Difference Between Symmetry and Self-Adjointness
    2.2.3 Basic Criteria for Self-Adjointness
    2.2.4 Self-Adjoint Extensions
    2.2.5 The Spectrum of Self-Adjoint Operators
  2.3 The Spectral Theorem for Self-Adjoint Operators
    2.3.1 The Spectral Theorem for Compact Self-Adjoint Operators
    2.3.2 Trace Class Operators
    2.3.3 The Spectral Theorem for Unbounded Self-Adjoint Operators
    2.3.4 Measurement in Quantum Mechanics
    2.3.5 The Schrödinger Equation and Stone's Theorem on One-Parameter Unitary Groups
  2.4 Further Criteria for Self-Adjointness
    2.4.1 von Neumann's criterion
    2.4.2 Kato–Rellich Theory
    2.4.3 Nelson's criterion

3 Distributions
  3.1 The Space of Distributions
  3.2 Tempered Distributions
    3.2.1 The Fourier Transform of Tempered Distributions
  3.3 The Nuclear Spectral Theorem
    3.3.1 Gelfand Triples

A The postulates of quantum mechanics

Bibliography
Introduction
These lecture notes were created as a companion to the lecture series held together with Kedar Ranade in the summer term 2015 under the same title. The lecture was aimed at both master students of physics and mathematics. We therefore required no prior exposure either to the apparatus of functional analysis or to quantum physics. The mathematical background was presented in my lectures, whereas the students were introduced to the physics of quantum mechanics in Kedar's part of the lecture.

The aim of the lectures was to present most of the mathematical results and concepts used in an introductory course in quantum mechanics in a rigorous way. Since physics students usually have no background in Lebesgue integration, a short primer on this topic without proofs is contained in the first chapter. Thereafter the fundamentals of the theory of Hilbert and Sobolev spaces and their connection with the Fourier transform are developed from scratch. There follows a detailed study of self-adjoint operators, and the self-adjointness of important quantum mechanical observables, such as the Hamiltonian of the hydrogen atom, is shown. Further, the notes contain a careful presentation of the spectral theorem for unbounded self-adjoint operators and a proof of Stone's theorem on unitary groups, which is central for the description of the time evolution of quantum mechanical systems. The spectral theory of self-adjoint operators and Hamiltonians is only covered in a very rudimentary manner.

In the last part a short introduction to the theory of distributions is given. Further, we present the nuclear spectral theorem, which gives the spectral decomposition of self-adjoint operators in a form very natural for physicists. The appendix contains precise mathematical statements of the postulates of quantum mechanics presented in the course, for easy reference.

Ulm, July 2015

1 A Crash Course in Measure Theory
In classical quantum mechanics a quantum mechanical system is described by some complex Hilbert space. For example, the (pure) states of a single one-dimensional particle can be described by elements in the Hilbert space L²(R) as introduced in introductory courses in quantum mechanics. A natural first attempt to mathematically define this space is the following:

    L²(R) = { f : R → C : f|_[−n,n] Riemann-integrable for all n ∈ N and ∫_{−∞}^{∞} |f(x)|² dx < ∞ }.
However, there are several issues. First of all, the natural choice

    ‖f‖₂ := ( ∫_{−∞}^{∞} |f(x)|² dx )^{1/2}

does not define a norm on L²(R) as there exist functions f ≠ 0 in L²(R) with ‖f‖₂ = 0. This problem can easily be solved by identifying two functions
f , g ∈ L2 (R) whenever kf − gk2 = 0. A more fundamental problem is that the
above defined space is not complete, i.e. there exist Cauchy sequences in L2 (R)
which do not converge in L2 (R). Therefore one has to replace L2 (R) as defined
above by its completion. This is perfectly legitimate from a mathematical
point of view. However, this approach has a severe shortcoming: we do not
have an explicit description of the elements in the completion. Even worse,
we do not even know whether these elements can be represented as functions.
To overcome these issues, we now introduce an alternative approach to integration, finally replacing the Riemann integral by the so-called Lebesgue integral. In order to introduce the Lebesgue integral we first need a rigorous method to measure the volume of subsets of Rⁿ, or of more abstract sets, which can then be used to define the Lebesgue integral.
The material covered in this chapter essentially corresponds to the basic definitions and results presented in an introductory course on measure theory. We just give the definitions with some basic examples to illustrate the concepts and then state the main theorems without proofs. More details and the proofs can be learned in any course on measure theory or from the many excellent textbooks, for example [Bar95] or [Rud87]. For further details we refer the interested reader to the monographs [Bog07].

1.1 Measure Spaces


For n ∈ N let P (Rn ) denote the set of all subsets of Rn . The measurement of
volumes can then be described by a mapping m : P (Rn ) → R≥0 ∪ {∞}. In order
to obtain a reasonable notion of volume one should at least require


(i) m(A ∪ B) = m(A) + m(B) for all A, B ⊂ Rn with A ∩ B = ∅,

(ii) m(A) = m(B) whenever A, B ⊂ Rⁿ are congruent, i.e. B can be obtained from A by a finite combination of rigid motions.

Intuitively, this sounds perfectly fine. However, there is the following


result published by S. Banach and A. Tarski in 1924.

Theorem 1.1.1 (Banach–Tarski paradox). Let n ≥ 3 and A, B ⊂ Rn be arbitrary


bounded subsets with non-empty interiors. Then A and B can be partitioned into a finite number of disjoint subsets

    A = A1 ∪ . . . ∪ Am and B = B1 ∪ . . . ∪ Bm

such that for all i = 1, . . . , m the sets Ai and Bi are congruent.

Using such paradoxical decompositions we see that m must take the same value on all bounded subsets of Rⁿ with non-empty interiors. For example, by splitting a cube Q into two smaller parts with non-empty interiors, we see that the assumption m(Q) ∈ (0, ∞) leads to a contradiction. Hence, it is impossible to measure the volume of arbitrary subsets of Rⁿ
in a reasonable way!

Remark 1.1.2. Of course, we all know that in physical reality such a paradox
does not occur. Indeed, the decompositions given by the Banach–Tarski
paradox are not constructive and therefore cannot be realized in the real
world. More precisely in mathematical terms, the proof of the Banach–Tarski
paradox requires some form of the axiom of choice.

Since we cannot measure the volume of arbitrary subsets of Rn in a con-


sistent reasonable way, it is necessary to restrict the volume measurement
to a subset of P (Rn ). This subset should be closed under basic set theoretic
operations. This leads to the following definition which can be given for
arbitrary sets Ω instead of Rn .

Definition 1.1.3 (σ -algebra). Let Ω be a set. A subset Σ ⊂ P (Ω) is called a


σ -algebra if

(a) ∅ ∈ Σ,

(b) Ac ∈ Σ for all A ∈ Σ,

(c) ∪n∈N An ∈ Σ whenever (An )n∈N ⊂ Σ.

The tuple (Ω, Σ) is called a measurable space and the elements of Σ are called
measurable.


Note that it follows from the definition that for A, B ∈ Σ one also has
A ∩ B ∈ Σ and B \ A ∈ Σ. The closedness of Σ under countable unions may
be the least intuitive of the above defining properties. It guarantees that
σ-algebras behave well under the limiting processes which lie at the heart of
analysis. We now give some elementary examples of σ-algebras.

Example 1.1.4. (i) Let Ω be an arbitrary set. Then the power set P (Ω) is
clearly a σ -algebra.

(ii) Let Ω be an arbitrary set. We define Σ as the set of subsets of Ω which


are countable or whose complement is countable. One then can check
that Σ is a σ -algebra. Here one has to use the fact that countable unions
of countable sets are again countable. Note that Σ does in general not
agree with P (Ω). For example, if Ω = R, then the interval [0, 1] is not
contained in Σ.
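For a finite set Ω the closure properties in Definition 1.1.3 can be made completely explicit on a computer: one simply closes a given family of subsets under complements and (finite, hence trivially countable) unions until nothing new appears. The following Python sketch is an added illustration, not part of the original notes; the function name generate_sigma_algebra is ad hoc.

from itertools import combinations

def generate_sigma_algebra(omega, generators):
    """Smallest sigma-algebra on the finite set omega containing the generators."""
    omega = frozenset(omega)
    sigma = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        for a in current:                      # close under complements
            if omega - a not in sigma:
                sigma.add(omega - a)
                changed = True
        for a, b in combinations(current, 2):  # close under (pairwise) unions
            if a | b not in sigma:
                sigma.add(a | b)
                changed = True
    return sigma

sigma = generate_sigma_algebra({1, 2, 3, 4}, [{1}, {2, 3}])
print(sorted(sorted(s) for s in sigma))
# contains exactly the 8 unions of the "atoms" {1}, {2,3}, {4}, together with the empty set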

We now give an important and non-trivial example of a σ -algebra which


will be frequently used in the following.

Example 1.1.5 (Borel σ-algebra). Let Ω be a subset of Rⁿ for n ∈ N, or more generally a normed, metric or topological space. Then the smallest σ-algebra that contains all open sets O of Ω,

    B(Ω) = ⋂_{Σ σ-algebra: Σ ⊃ O} Σ,

is called the Borel σ-algebra on Ω. One can show that B(Rⁿ) is the smallest σ-algebra that is generated by elements of the form [a1, b1) × · · · × [an, bn) for ai < bi, i.e. by products of half-open intervals.

Recall that a function f : Ω1 → Ω2 between two normed or more general


metric or topological spaces is continuous if and only if the preimage of every
open set under f is again open. This means that f preserves the topological
structure. In the same spirit measurable mappings are compatible with the
measurable structures on the underlying spaces.

Definition 1.1.6 (Measurable mapping). Let (Ω1 , Σ1 ) and (Ω2 , Σ2 ) be two


measurable spaces. A map f : Ω1 → Ω2 is called measurable if

f −1 (A) ∈ Σ1 for all A ∈ Σ2 .

A function f : Ω1 → Ω2 between two normed spaces (or more generally two


metric or topological spaces) is called measurable if f is a measurable map
between the measurable spaces (Ω1 , B(Ω1 )) and (Ω2 , B(Ω2 )).


It is often very convenient to consider functions f : Ω → R, where R


denotes the extended real line R = R ∪ {∞} ∪ {−∞}. In this case one calls f
measurable if and only if X∞ = {x ∈ Ω : f (x) = ∞} and X−∞ = {x ∈ Ω : f (x) =
−∞} are measurable and the restricted function f : Ω \ (X∞ ∪ X−∞ ) → R is
measurable in the sense just defined above. If a real-valued function takes the
values ∞ or −∞, we will implicitly always work with this definition. We will
often need the following sufficient conditions for a mapping to be measurable.

Proposition 1.1.7. Let Ω1 and Ω2 be two normed vector spaces or more generally
metric or topological spaces. Then every continuous mapping f : Ω1 → Ω2 is
measurable. Further, every monotone function f : R → R is measurable.

Furthermore, measurable functions are closed under the usual arithmetic


operations and under pointwise limits.

Proposition 1.1.8. Let (Ω, Σ) be a measurable space.

(a) Let f , g : Ω → C be measurable. Then f + g, f − g, f · g and, provided g(x) ≠ 0 for all x ∈ Ω, also f /g are measurable as well.

(b) Let fn : Ω → C be a sequence of measurable functions such that f (x) := lim_{n→∞} fn(x) exists for all x ∈ Ω. Then f is measurable.

We now assign a measure to a measurable space.

Definition 1.1.9 (Measure). Let (Ω, Σ) be a measurable space. A measure on


(Ω, Σ) is a mapping µ : Σ → R≥0 ∪ {∞} that satisfies

(i) µ(∅) = 0.
(ii) µ(⋃_{n∈N} An) = Σ_{n=1}^{∞} µ(An) for all pairwise disjoint (An)_{n∈N} ⊂ Σ.

The triple (Ω, Σ, µ) is a measure space. If µ(Ω) < ∞, then (Ω, Σ, µ) is called a
finite measure space. If µ(Ω) = 1, one says that (Ω, Σ, µ) is a probability space.

One can deduce from the above definition that a measure satisfies µ(A) ≤ µ(B) for all measurable A ⊂ B and µ(⋃_{n∈N} Bn) ≤ Σ_{n=1}^{∞} µ(Bn) for arbitrary (Bn)_{n∈N} ⊂ Σ. Moreover, one has µ(A \ B) = µ(A) − µ(B) for measurable B ⊂ A whenever µ(B) < ∞. We begin with some elementary examples of measure spaces.

Example 1.1.10. (i) Consider (Ω, P (Ω)) for an arbitrary set Ω and define
µ(A) as the number of elements in A whenever A is a finite subset and
µ(A) = ∞ otherwise. Then µ is a measure on (Ω, P (Ω)).


(ii) Let Ω be an arbitrary non-empty set and a ∈ Ω. Define δa : P (Ω) → R≥0 by

    δa(A) = 1 if a ∈ A and δa(A) = 0 else.

Then δa is a measure on (Ω, P (Ω)) and is called the Dirac measure in a.

We now come to the most important example for our purposes.

Theorem 1.1.11 (Lebesgue Measure). Let n ∈ N. There exists a unique Borel measure λ, i.e. a measure defined on (Rⁿ, B(Rⁿ)), that satisfies

    λ([a1, b1) × · · · × [an, bn)) = ∏_{k=1}^{n} (bk − ak)

for all products with ai < bi. The measure λ is called the Lebesgue measure on Rⁿ.

Of course, one can also restrict the Lebesgue measure to (Ω, B(Ω)) for subsets Ω ⊂ Rⁿ. The uniqueness in the above theorem is not trivial, but essentially follows from the fact that the products of half-open intervals used in the above definition generate the Borel σ-algebra and are closed under finite intersections. The existence is usually proved via Carathéodory's extension theorem.

1.2 The Lebesgue Integral


Given a measure space (Ω, Σ, µ), one can integrate certain functions f : Ω → C with respect to the measure µ. One extends the integral step by step to more general classes of functions. A function f : Ω → C is a simple function if there exist finitely many measurable sets A1, . . . , An ∈ Σ and a1, . . . , an ∈ C such that f = Σ_{k=1}^{n} ak 1_{Ak}. Here 1_{Ak} is the function defined by

    1_{Ak}(x) = 1 if x ∈ Ak and 1_{Ak}(x) = 0 if x ∉ Ak.

Definition 1.2.1 (Lebesgue integral). Let (Ω, Σ, µ) be a measure space.

(i) For a simple function f : Ω → R≥0 given by f = Σ_{k=1}^{n} ak 1_{Ak} as above one defines the Lebesgue integral as

    ∫_Ω f dµ = Σ_{k=1}^{n} ak µ(Ak).

(ii) For a measurable function f : Ω → R≥0 the Lebesgue integral is defined as

    ∫_Ω f dµ = sup { ∫_Ω g dµ : g simple, 0 ≤ g ≤ f }.

(iii) A general measurable function f : Ω → C can be uniquely decomposed into four non-negative measurable functions fi : Ω → R≥0 such that f = (f1 − f2) + i(f3 − f4). One says that f is integrable if ∫_Ω fi dµ < ∞ for i = 1, . . . , 4 and writes f ∈ L1(Ω, Σ, µ). In this case one sets the Lebesgue integral as

    ∫_Ω f dµ = ∫_Ω f1 dµ − ∫_Ω f2 dµ + i ( ∫_Ω f3 dµ − ∫_Ω f4 dµ ).

Moreover, for a measurable set A ∈ Σ we use the short-hand notation

    ∫_A f dµ := ∫_Ω 1_A f dµ

whenever the integral on the right hand side exists.
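Step (i) and (ii) of the definition translate directly into a numerical scheme: approximate a non-negative function from below by a simple function which is constant on level sets of f and sum the values times the measure of the level sets. The following Python sketch is an added illustration (not part of the notes); a fine uniform grid on [0, 1] stands in for the Lebesgue measure, and the function name lebesgue_integral is ad hoc.

import numpy as np

def lebesgue_integral(f, xs, levels):
    """Approximate int f dlambda for f >= 0 by a simple function built from level sets."""
    dx = xs[1] - xs[0]
    values = f(xs)
    cuts = np.linspace(0.0, values.max(), levels + 1)
    total = 0.0
    for a_k, b_k in zip(cuts[:-1], cuts[1:]):
        A_k = (values >= a_k) & (values < b_k)   # level set {a_k <= f < b_k}
        total += a_k * A_k.sum() * dx            # a_k * lambda(A_k)
    return total

xs = np.linspace(0.0, 1.0, 20_001)
print(lebesgue_integral(lambda x: x ** 2, xs, levels=500))   # close to 1/3

Note that the approximation slices the range of f horizontally, in contrast to the vertical slicing of the domain used for the Riemann integral.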

We will often use the following terminology. Let (Ω, Σ, µ) be a measure


space and P (x) a property for every x ∈ Ω. We say that P holds almost everywhere if there exists a set N ∈ Σ with µ(N ) = 0 such that P (x) holds for all x ∉ N . For example, on (R, B(R), λ) the function f (x) = cos(πx) satisfies |f (x)| < 1 almost everywhere because one can choose N = Z, which has zero Lebesgue measure. In the following we will often make use of the fact that the integrals of two measurable functions f and g agree whenever f (x) = g(x) almost everywhere.
Notice that we can now integrate a function f : [a, b] → C in two different
ways by either using the Riemann or the Lebesgue integral. These two inte-
grals however agree as soon as both make sense and the Lebesgue integral can
be considered as a true extension of the Riemann integral (except for some
minor measurability issues).

Theorem 1.2.2 (Lebesgue integral equals Riemann integral). The Riemann


and Lebesgue integral have the following properties.

(a) Let f : [a, b] → C be a Riemann integrable function. Then there exists


a measurable function g : [a, b] → C with f = g almost everywhere and
g ∈ L1 ([a, b], B([a, b]), λ). Moreover, one has
    ∫_a^b f (x) dx = ∫_{[a,b]} g dλ.


(b) Let f : I → C for some interval I ⊂ R be Riemann integrable in the improper sense. If

    sup_{K ⊂ I compact interval} ∫_K |f (x)| dx < ∞,

then there exists a measurable function g : I → C with f = g almost everywhere and g ∈ L1(I, B(I), λ). Moreover, one has

    ∫_I f (x) dx = ∫_I g dλ.

Moreover, if f is measurable (for example if f is continuous), one can choose g


equal to f .

For an example of a Lebesgue-integrable function which is not Riemann-integrable, consider f (x) = 1_{[0,1]∩Q}(x). Then f is not Riemann-integrable as on arbitrarily fine partitions of [0, 1] the function takes both values 0 and 1, whereas the Lebesgue integral can be easily calculated as ∫_{[0,1]} f dλ = λ([0, 1] ∩ Q) = 0.
Now suppose that one has given a sequence fn : Ω → C of measurable
functions such that limn→∞ fn (x) exists almost everywhere. Hence, there
exists a measurable set N with µ(N ) = 0 such that the limit exists for all x ∉ N . We now set

    f (x) = lim_{n→∞} fn(x) if this limit exists, and f (x) = 0 else.

One can show that the set C of all x ∈ Ω for which the above limit exists is measurable. It follows easily from this fact that the function f : Ω → C is measurable as well. Note further that because of Ω \ C ⊂ N one has µ(Ω \ C) = 0.
Hence, the Lebesgue integral of f is independent of the concrete choice of
the values at the non-convergent points and therefore the choice does not
matter for almost all considerations. We make the agreement that we will
always define the pointwise limit of measurable functions in the above way
whenever the limit exists almost everywhere. This is particularly useful for the
formulation of the following convergence theorems for the Lebesgue integral.

Theorem 1.2.3 (Monotone convergence theorem). Let (Ω, Σ, µ) be a measure


space and fn : Ω → R a sequence of measurable functions with fn+1 (x) ≥ fn (x) ≥
0 almost everywhere. Suppose further that f (x) = limn→∞ fn (x) exists almost
everywhere. Then
    lim_{n→∞} ∫_Ω fn dµ = ∫_Ω f dµ.


Note that the monotonicity assumption is crucial for the theorem. In fact,
in general one cannot switch the order of limits and integrals as the following
example shows.
    lim_{n→∞} ∫_R 1_{[n,n+1]} dλ = 1 ≠ 0 = ∫_R lim_{n→∞} 1_{[n,n+1]} dλ.

However, the following result holds for non-positive and non-monotone


sequences of functions.

Theorem 1.2.4 (Dominated convergence theorem). Let (Ω, Σ, µ) be a measure


space and fn : Ω → C a sequence of measurable functions for which there exists an
integrable function g : Ω → R such that for all n ∈ N one has |fn (x)| ≤ g(x) almost
everywhere. Further suppose that f (x) = limn→∞ fn (x) exists almost everywhere.
Then

    lim_{n→∞} ∫_Ω fn dµ = ∫_Ω f dµ.
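Both phenomena can be observed numerically. The following Python sketch is an added illustration (not part of the notes): it contrasts the escaping sequence 1_{[n,n+1]}, whose integrals stay at 1 although the pointwise limit is 0, with the dominated sequence fn(x) = sin(x/n)·e^{−x} on [0, ∞), whose integrals do converge to the integral of the pointwise limit 0, as the dominated convergence theorem (with dominating function g(x) = e^{−x}) predicts.

import numpy as np

xs = np.linspace(0.0, 60.0, 600_001)   # grid on [0, 60], standing in for [0, infinity)
dx = xs[1] - xs[0]

for n in (1, 5, 20, 50):
    escaping = ((xs >= n) & (xs < n + 1)).astype(float)   # 1_{[n, n+1]}, no dominating g
    dominated = np.sin(xs / n) * np.exp(-xs)              # |f_n(x)| <= exp(-x) =: g(x)
    print(n, escaping.sum() * dx, dominated.sum() * dx)
# the first column of integrals stays near 1, while the second tends to 0,
# the integral of the pointwise limit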

For the next result we need a finiteness condition on the underlying


measure space.

Definition 1.2.5 (σ -finite measure space). A measure space (Ω, Σ, µ) is called


σ -finite if there exists a sequence of measurable sets (An )n∈N ⊂ Σ such that
µ(An ) < ∞ for all n ∈ N and

    Ω = ⋃_{n=1}^{∞} An.

For example, (N, P (N)) together with the counting measure or the measure
spaces (Rn , B(Rn ), λ) for n ∈ N, where λ denotes the Lebesgue measure, are
σ -finite. Moreover, every finite measure space and a fortiori every probability
space is σ -finite. For an example of a non-σ -finite measure space consider
(R, P (R)) with the counting measure.

Definition 1.2.6 (Products of measure spaces). Consider the two measure


spaces (Ω1 , Σ1 , µ1 ) and (Ω2 , Σ2 , µ2 ).

(i) The σ -algebra on Ω1 × Ω2 generated by sets of the form A1 × A2 for


Ai ∈ Σi (i = 1, 2) (i.e. the smallest σ -algebra that contains these sets) is
called the product σ -algebra of Σ1 and Σ2 and is denoted by Σ1 ⊗ Σ2 .

(ii) A measure µ on the measurable space (Ω1 × Ω2 , Σ1 ⊗ Σ2 ) is called a


product measure of µ1 and µ2 if

µ(A1 × A2 ) = µ1 (A1 ) · µ2 (A2 ) for all A1 ∈ Σ1 , A2 ∈ Σ2 .

Here we use the convention that 0 · ∞ = ∞ · 0 = 0.


For example, one has B(Rn )⊗B(Rm ) = B(Rn+m ) which can be easily verified
using the fact that products of half-open intervals generate B(Rn ). It follows
from the characterizing property of the Lebesgue measure λn on (Rn , B(Rn ))
that for all n, m ∈ N the measure λn+m is a product measure of λn and λm . One
can show that there always exists a product measure for two arbitrary measure
spaces. In most concrete situations there exists a uniquely determined product
measure as the following theorem shows.

Theorem 1.2.7. Let (Ω1 , Σ1 , µ1 ) and (Ω2 , Σ2 , µ2 ) be two σ -finite measure spaces.
Then there exists a unique product measure on (Ω1 ×Ω2 , Σ1 ⊗Σ2 ) which is denoted
by µ1 ⊗ µ2 .

It is now a natural question how integration over product measures is


related to integration over the single measures. An answer is given by Fubini’s
theorem.

Theorem 1.2.8 (Fubini–Tonelli theorem). Let (Ω1 , Σ1 , µ1 ) and (Ω2 , Σ2 , µ2 ) be


two σ -finite measure spaces and f : (Ω1 ×Ω2 , Σ1 ⊗Σ2 ) → C a measurable function.
Then the functions
    y ↦ ∫_{Ω1} f (x, y) dµ1(x) and x ↦ ∫_{Ω2} f (x, y) dµ2(y)

are measurable functions Ω2 → C respectively Ω1 → C. If one of the three integrals

    ∫_{Ω1} ∫_{Ω2} |f (x, y)| dµ2(y) dµ1(x),  ∫_{Ω2} ∫_{Ω1} |f (x, y)| dµ1(x) dµ2(y)  or  ∫_{Ω1×Ω2} |f (x, y)| d(µ1 ⊗ µ2)(x, y)

is finite, then one has for the product and iterated integrals

    ∫_{Ω1×Ω2} f (x, y) d(µ1 ⊗ µ2)(x, y) = ∫_{Ω1} ∫_{Ω2} f (x, y) dµ2(y) dµ1(x) = ∫_{Ω2} ∫_{Ω1} f (x, y) dµ1(x) dµ2(y).

Moreover, if f is a non-negative function, one can omit the finiteness assumption


on the integrals and the conclusion is still valid (in this case all integrals can be
infinite).

Note that there are also variants of Fubini’s theorem (not in the above gen-
erality) for non σ -finite measure spaces. However, this case is more technical
and rarely used in practice and therefore we omit it.


1.3 Lebesgue Spaces


We now come back to the motivation at the beginning of this chapter. After
our preliminary work we can now define L2 (R) or more generally Lp (Ω) over
an arbitrary measure space (Ω, Σ, µ).

Definition 1.3.1 (ℒp-spaces). Let (Ω, Σ, µ) be a measure space. For p ∈ [1, ∞) we set

    ℒp(Ω, Σ, µ) := { f : Ω → K measurable : ∫_Ω |f |^p dµ < ∞ },

    ‖f ‖_p := ( ∫_Ω |f |^p dµ )^{1/p}.

For p = ∞ we set

    ℒ∞(Ω, Σ, µ) := { f : Ω → K measurable : ∃ C ≥ 0 : |f (x)| ≤ C almost everywhere },

    ‖f ‖_∞ := inf{ C ≥ 0 : |f (x)| ≤ C almost everywhere }.

Note that the space ℒ1(Ω, Σ, µ) agrees with the space L1(Ω, Σ, µ) previously defined in Definition 1.2.1. One can show that (ℒp(Ω, Σ, µ), ‖·‖_p) is a semi-normed vector space, i.e. ‖·‖_p satisfies all axioms of a norm except for definiteness. Here, the validity of the triangle inequality, the so-called Minkowski inequality, is a non-trivial fact. If one identifies two functions whenever they agree almost everywhere, one obtains a normed space.

Definition 1.3.2 (Lp-spaces). Let (Ω, Σ, µ) be a measure space and p ∈ [1, ∞]. The space Lp(Ω, Σ, µ) is defined as the space ℒp(Ω, Σ, µ) with the additional agreement that two functions f , g : Ω → K are identified with each other whenever f − g = 0 almost everywhere.

As a consequence of the above identification (Lp (Ω, Σ, µ), k·kp ) is a normed


vector space. In contrast to the variant using the Riemann integral these
spaces are complete.

Definition 1.3.3 (Banach space). A normed vector space which is complete


with respect to the given norm is called a Banach space.

Recall that a normed vector space or more generally a metric space is called
complete if every Cauchy sequence converges to an element in the space. A
sequence (xn )n∈N in a normed vector space (V , k·k) is called a Cauchy sequence
if for all ε > 0 there exists n0 ∈ N such that kxn − xm k ≤ ε for all n, m ≥ n0 .
Using this terminology we have

Theorem 1.3.4 (Riesz–Fischer). Let (Ω, Σ, µ) be a measure space and p ∈ [1, ∞].
Then Lp (Ω, Σ, µ) is a Banach space.


Let (fn )n∈N be a sequence in Lp (Ω, Σ, µ) with fn → f in Lp . One often says


that fn converges to f in the p-th mean which gives the right visual interpre-
tation for convergence in Lp -spaces. Note that the sequence 1[0,1] , 1[0,1/2] ,
1[1/2,1] , 1[0,1/4] , 1[1/4,1/2] and so on converges in Lp ([0, 1]) for all p ∈ [1, ∞) to
the zero function although fn (x) diverges for all x ∈ [0, 1]. Conversely, point-
wise convergence in general does not imply convergence in Lp . For example,
the sequence fn = 1[n,n+1] does not converge in Lp (R) although fn (x) → 0 for
all x ∈ R. In concrete situations one can often infer Lp -convergence from
pointwise convergence with the help of the dominated convergence theorem.
In the opposite direction one has the following useful result which actually
follows directly from the proof of the Riesz–Fischer theorem.
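The gap between Lp-convergence and pointwise convergence can be made visible numerically. The following Python sketch is an added illustration (not part of the notes): it runs through the "typewriter" sequence 1_{[0,1]}, 1_{[0,1/2]}, 1_{[1/2,1]}, 1_{[0,1/4]}, ... on a grid and shows that the L² norms shrink while the values at a fixed point keep returning to 1.

import numpy as np

xs = np.linspace(0.0, 1.0, 10_001)
dx = xs[1] - xs[0]
p = 2.0
x0 = np.searchsorted(xs, 1 / 3)          # a fixed point at which we watch f_n(x0)

norms, values_at_x0 = [], []
for m in range(7):                        # interval lengths 1, 1/2, 1/4, ..., 1/64
    for j in range(2 ** m):
        a, b = j / 2 ** m, (j + 1) / 2 ** m
        f = ((xs >= a) & (xs <= b)).astype(float)
        norms.append((np.sum(f ** p) * dx) ** (1 / p))
        values_at_x0.append(f[x0])

print(norms[-1])                   # ~ (1/64)^(1/2): the L^p norms tend to 0
print(max(values_at_x0[-2 ** 6:])) # but within every level some f_n equals 1 at x0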

Proposition 1.3.5. Let (Ω, Σ, µ) be a measure space and p ∈ [1, ∞). Further
suppose that fn → f in Lp (Ω, Σ, µ). Then there exist a subsequence (fnk )k∈N and
g ∈ Lp (Ω, Σ, µ) such that

(a) fnk (x) → f (x) almost everywhere;

(b) |fnk(x)| ≤ |g(x)| for all k ∈ N almost everywhere.

We will later need some further properties of Lp -spaces. The following


result is natural, but needs some effort to be proven rigorously.

Proposition 1.3.6. Let Ω ⊂ Rn be open and p ∈ [1, ∞). Then Cc (Ω), the space of
all continuous functions on Ω with compact support (in Ω), is a dense subspace of
Lp (Ω).

The Cauchy–Schwarz inequality for L2 -spaces generalizes to Hölder’s


inequality in the Lp -setting. In the following we use the agreement 1/∞ = 0.

Proposition 1.3.7 (Hölder's inequality). Let (Ω, Σ, µ) be a measure space. Further let p ∈ [1, ∞] and q ∈ [1, ∞] be its dual index given by 1/p + 1/q = 1. Then for f ∈ Lp(Ω, Σ, µ) and g ∈ Lq(Ω, Σ, µ) the product f · g lies in L1(Ω, Σ, µ) and satisfies

    ∫_Ω |f g| dµ ≤ ( ∫_Ω |f |^p dµ )^{1/p} ( ∫_Ω |g|^q dµ )^{1/q}.
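For the counting measure on {1, . . . , n}, i.e. for finite sequences, Hölder's inequality can be tested directly. The Python sketch below is an added illustration (not part of the notes): it samples random vectors and checks Σ|fᵢgᵢ| ≤ (Σ|fᵢ|^p)^{1/p}(Σ|gᵢ|^q)^{1/q}.

import numpy as np

rng = np.random.default_rng(0)
p = 3.0
q = p / (p - 1)                          # dual index, 1/p + 1/q = 1

for _ in range(5):
    f = rng.normal(size=1000)
    g = rng.normal(size=1000)
    lhs = np.sum(np.abs(f * g))
    rhs = np.sum(np.abs(f) ** p) ** (1 / p) * np.sum(np.abs(g) ** q) ** (1 / q)
    print(lhs <= rhs + 1e-12, lhs, rhs)  # the inequality holds in every sample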

As an important and direct consequence of Hölder’s inequality one has


the following inclusions between Lp -spaces.

Proposition 1.3.8. Let (Ω, Σ, µ) be a finite measure space, i.e. µ(Ω) < ∞. Then
for p ≥ q ∈ [1, ∞] one has the inclusion

Lp (Ω, Σ, µ) ⊂ Lq (Ω, Σ, µ).


Proof. We only deal with the case p ∈ (1, ∞) (the other cases are easy to show). It follows from Hölder's inequality because of p/q ≥ 1 that

    ( ∫_Ω |f |^q dµ )^{1/q} = ( ∫_Ω |f |^q · 1 dµ )^{1/q} ≤ ( ∫_Ω |f |^p dµ )^{1/p} ( ∫_Ω 1 dµ )^{(1−q/p)·1/q} = µ(Ω)^{1/q−1/p} ( ∫_Ω |f |^p dµ )^{1/p}.

A second application of Hölder’s inequality is the next important estimate


on convolutions of two functions.

Definition 1.3.9. Let f , g ∈ L1 (Rn ). We define the convolution of f and g by


    (f ∗ g)(x) = ∫_{Rn} f (y)g(x − y) dy.

Note that it is not clear that f ∗ g exists under the above assumptions. This
is indeed the case as the following argument shows. Note that the function
(x, y) 7→ f (y)g(x − y) is measurable as a map R2n → R by the definition of
product σ -algebras and the fact that the product and the composition of
measurable functions is measurable. It follows from Fubini’s theorem that
the function x 7→ (f ∗ g)(x) is measurable and satisfies
    ∫_{Rn} |f ∗ g|(x) dx ≤ ∫_{Rn} ∫_{Rn} |f (y)||g(x − y)| dy dx = ∫_{Rn} |f (y)| ∫_{Rn} |g(x − y)| dx dy
                        = ∫_{Rn} |f (y)| ∫_{Rn} |g(x)| dx dy = ‖f ‖₁ ‖g‖₁.

Hence, the function f ∗ g is finite almost everywhere. Moreover, we have


shown that f ∗ g ∈ L1 (Rn ) and that the pointwise formula in the definition
holds with finite values almost everywhere after taking representatives. It
follows from the next inequality that the convolution also exists as an Lp -
integrable function if one function is assumed to be in Lp .

Proposition 1.3.10 (Minkowski's inequality for convolutions). For some p ∈ [1, ∞] let g ∈ Lp(Rn) and f ∈ L1(Rn). Then one has

    ‖f ∗ g‖_p ≤ ‖f ‖₁ ‖g‖_p.

Proof. We only deal with the cases p ∈ (1, ∞) as the boundary cases are simple to prove. We apply Hölder's inequality to the functions |g(x − y)| and 1 for the measure µ = |f (y)| dy (i.e. µ(A) = ∫_A |f (y)| dy) and obtain

    |(f ∗ g)(x)| ≤ ( ∫_{Rn} |g(x − y)|^p |f (y)| dy )^{1/p} ( ∫_{Rn} |f (y)| dy )^{1/q},

where 1/p + 1/q = 1. Taking the Lp-norm in the above inequality, we obtain the desired inequality

    ‖f ∗ g‖_p ≤ ( ∫_{Rn} ∫_{Rn} |g(x − y)|^p |f (y)| dy ‖f ‖₁^{p/q} dx )^{1/p}
             = ‖f ‖₁^{1/q} ( ∫_{Rn} |f (y)| ∫_{Rn} |g(x − y)|^p dx dy )^{1/p} = ‖f ‖₁^{1/q} ‖f ‖₁^{1/p} ‖g‖_p
             = ‖f ‖₁ ‖g‖_p.
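On a discretized line the inequality ‖f ∗ g‖_p ≤ ‖f ‖₁ ‖g‖_p can be checked directly; the grid spacing h enters via ‖f ‖₁ ≈ h·Σ|fᵢ| and (f ∗ g)(xᵢ) ≈ h·Σⱼ f(yⱼ)g(xᵢ − yⱼ). The following Python sketch is an added illustration (not part of the notes) with two concrete sample functions.

import numpy as np

h = 0.01
xs = np.arange(-10.0, 10.0, h)
p = 2.0

f = np.exp(-np.abs(xs))                    # an L^1 function
g = np.where(np.abs(xs) < 1.0, 1.0, 0.0)   # an L^p function
conv = h * np.convolve(f, g, mode="same")  # approximates (f * g) on the same grid

lhs = (h * np.sum(np.abs(conv) ** p)) ** (1 / p)
rhs = (h * np.sum(np.abs(f))) * (h * np.sum(np.abs(g) ** p)) ** (1 / p)
print(lhs, rhs, lhs <= rhs)                # lhs <= ||f||_1 ||g||_p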

2 The Theory of Self-Adjoint Operators in Hilbert Spaces
2.1 Basic Hilbert Space Theory
By the postulates of quantum mechanics a quantum mechanical system is
described by some complex Hilbert space. Before going any further, we
therefore need some basic results from Hilbert space theory. In this section
we introduce Hilbert spaces and bounded operators between these spaces.
As important examples for the further development, we introduce Fourier
transforms and Sobolev spaces.
We follow the typical physical convention that an inner product on some
complex vector space is linear in the second and anti-linear in the first com-
ponent.

Definition 2.1.1. A Hilbert space H is a K-vector space endowed with an inner product ⟨·|·⟩ such that H is complete with respect to the norm ‖·‖_H := ⟨·|·⟩^{1/2} induced by the inner product (i.e. every Cauchy sequence in H converges to an element in H).

Recall that a sequence (xn )n∈N in a normed space (N , k·k) is called a Cauchy
sequence if for every ε > 0 there exists N ∈ N with kxn − xm k ≤ ε for all n, m ≥ N .
Note that the spaces Cn for n ∈ N are finite-dimensional Hilbert spaces with respect to the inner product ⟨x|y⟩ = Σ_{k=1}^{n} x̄_k y_k. We now give a first important infinite-dimensional example.

Example 2.1.2. Let (Ω, Σ, µ) be an arbitrary measure space. Then L2 (Ω, Σ, µ)


as defined in Definition 1.3.2 is a Hilbert space with respect to the inner
product

    ⟨f |g⟩_{L²} := ∫_Ω f̄(x) g(x) dµ(x).

Note that the space L2 (Ω, Σ, µ) is complete by the Riesz–Fischer Theorem 1.3.4.
Further, the finiteness of the scalar product is a consequence of Hölder’s
inequality. As a special case one can take for an open set Ω ⊂ Rn the measure
space (Ω, B(Ω), λ) and obtains the L2 -space L2 (Ω) = L2 (Ω, B(Ω), λ|B(Ω) ).

We now state some elementary concepts and properties of Hilbert spaces.

Proposition 2.1.3. Let H be a Hilbert space and x, y ∈ H. Then the Cauchy–


Schwarz inequality
    |⟨x|y⟩| ≤ ‖x‖_H ‖y‖_H    (CS)


holds. In particular, the scalar product seen as a mapping H×H → K is continuous,


i.e. xn → x and yn → y in H imply hxn |yn i → hx|yi.

2.1.1 Orthonormal Bases


Orthonormal bases are one of the most fundamental concepts in Hilbert space
theory and we will meet such bases in abundance while studying concrete
quantum mechanical systems.

Definition 2.1.4. A family (ei)_{i∈I} of elements in some Hilbert space H is called orthogonal if ⟨ei|ej⟩ = 0 for all i ≠ j ∈ I. If one additionally has ‖ei‖ = 1 for all i ∈ I, one says that (ei)_{i∈I} is orthonormal. If moreover the linear span of (ei)_{i∈I} is dense in H, the family (ei)_{i∈I} is called an orthonormal basis of H.

As a simple example, consider the Hilbert space Cn for some n ∈ N. Then the n unit vectors form an orthonormal basis of Cn, whereas every subset of the unit vectors is orthonormal. Here the linear span of e1, . . . , en clearly is not only dense in Cn, but even spans Cn completely. This changes in the infinite-dimensional setting. Consider ℓ², the space of all square summable sequences. Then the unit vectors (en)_{n∈N} again form an orthonormal basis as one can directly verify. However, the span of (en)_{n∈N} consists exactly of the sequences with finite support which is a dense proper subspace of ℓ². Let us now consider the space L²([0, 1]) for which it is slightly more difficult to give an example of an orthonormal basis.

Example 2.1.5 (Trigonometric system). Consider en(x) = e^{2πinx} for n ∈ Z. Then (en)_{n∈Z} forms an orthonormal set in L²([0, 1]) as one can verify by a direct computation. Moreover, if one considers the Fejér kernel defined by

    F_N(x) = 1/(2N + 1) Σ_{n=−N}^{N} Σ_{k=−n}^{n} e^{2πikx},

it follows from a standard theorem in analysis that for every continuous periodic function f ∈ C_per([0, 1]) the convolution F_N ∗ f converges uniformly to f (we will study the convolution of functions on the real line more closely in some later section). Note that it follows from

    ∫₀¹ |F_N ∗ f − f |² dx ≤ ‖F_N ∗ f − f ‖²_∞ → 0  (N → ∞)

that the convergence F_N ∗ f → f also holds in L². Since F_N ∗ f is explicitly given by

    (F_N ∗ f )(x) = 1/(2N + 1) Σ_{n=−N}^{N} Σ_{k=−n}^{n} f̂(k) e^{2πikx},

where f̂(k) for k ∈ Z denotes the k-th Fourier coefficient of f given by

    f̂(k) = ∫₀¹ f (x) e^{−2πikx} dx = ⟨ek|f ⟩,

one sees immediately that F_N ∗ f lies in the span of (en)_{|n|≤N}. Hence, the span of (en)_{n∈Z} is dense in C_per([0, 1]) with respect to the uniform norm and therefore also dense in L²([0, 1]) by a variant of Proposition 1.3.6 (note that L²((0, 1)) = L²([0, 1])) and we have shown that (en)_{n∈Z} forms an orthonormal basis of L²([0, 1]).
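The orthonormality relations ⟨en|em⟩ = δ_{nm} can be confirmed by numerical quadrature. The following Python sketch is an added illustration (not part of the notes); it approximates the inner products on a uniform grid, which for trigonometric functions reproduces the Kronecker delta essentially exactly.

import numpy as np

N = 1024
xs = np.arange(N) / N                     # uniform grid on [0, 1)

def e(n):
    return np.exp(2j * np.pi * n * xs)

def inner(f, g):
    # physics convention: conjugate-linear in the first argument
    return np.mean(np.conj(f) * g)        # Riemann sum of the integral over [0, 1]

for n in range(-2, 3):
    print([round(abs(inner(e(n), e(m))), 10) for m in range(-2, 3)])
# prints (up to rounding) the 5 x 5 identity matrix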

We have constructed orthonormal bases of fundamental examples of


Hilbert spaces. In fact, this is no coincidence as every Hilbert space has
an orthonormal basis. The most important infinite-dimensional case is when the index set of an orthonormal basis can be chosen as the natural numbers.

Definition 2.1.6. A normed space (N , k·k) is called separable if there exists a


countable dense subset of N .

Theorem 2.1.7. Every Hilbert space H has an orthonormal basis. Moreover and
more concretely, if H is infinite dimensional and separable, then there exists an
orthonormal basis (en )n∈N of H.

The proof of the above theorem in the separable case usually uses the
well-known Gram–Schmidt orthonormalization process known from linear
algebra applied to a dense countable subset of H. Orthonormal bases are a
fundamental tool in the study of Hilbert spaces as we see in the following.

Lemma 2.1.8. Let H be a Hilbert space and x, y ∈ H. Then

    ‖x + y‖² = ‖x‖² + ‖y‖² + 2 Re⟨x|y⟩.

Proof. This follows by a direct computation from the relation between the scalar product and the norm on a Hilbert space. Indeed, we have

    ‖x + y‖² = ⟨x + y|x + y⟩ = ⟨x|x⟩ + ⟨x|y⟩ + ⟨y|x⟩ + ⟨y|y⟩ = ‖x‖² + ‖y‖² + 2 Re⟨x|y⟩,

since ⟨y|x⟩ is the complex conjugate of ⟨x|y⟩.

Note that in particular if x and y are orthogonal, i.e. hx|yi = 0, then kx +


yk2 = kxk2 +kyk2 . As a direct application we can study the expansion of Hilbert
space elements with respect to orthonormal bases. All what follows can be
extended to arbitrary Hilbert spaces. However, for the sake of simplicity we
will only deal with the separable infinite-dimensional case.

Theorem 2.1.9. Let H be an infinite-dimensional separable Hilbert space and


(en )n∈N an orthonormal basis of H.


(a) For every x ∈ H one has

    x = Σ_{n=1}^{∞} ⟨en|x⟩ en,

where the convergence of the series is understood in the norm of H. Moreover, the above expansion is unique.

(b) For every x, y ∈ H Parseval's identity holds:

    ⟨x|y⟩ = Σ_{n=1}^{∞} ⟨x|en⟩⟨en|y⟩.

In particular one has

    ‖x‖² = Σ_{n=1}^{∞} |⟨en|x⟩|².

Proof. We start with (a). Suppose that x = Σ_{n=1}^{∞} an en is an arbitrary expansion of x. Then it follows from the continuity of the scalar product that for every k ∈ N one has

    ⟨ek|x⟩ = ⟨ek | Σ_{n=1}^{∞} an en⟩ = lim_{N→∞} ⟨ek | Σ_{n=1}^{N} an en⟩ = lim_{N→∞} Σ_{n=1}^{N} an ⟨ek|en⟩ = ak.

The uniqueness of the expansion follows directly from the above equation. For the existence of the expansion we have to show that the partial sums Σ_{n=1}^{N} ⟨en|x⟩en converge to x in H. By orthogonality we have

    ‖x‖² = ‖x − Σ_{n=1}^{N} ⟨en|x⟩en + Σ_{n=1}^{N} ⟨en|x⟩en‖² = ‖x − Σ_{n=1}^{N} ⟨en|x⟩en‖² + ‖Σ_{n=1}^{N} ⟨en|x⟩en‖²
         = ‖x − Σ_{n=1}^{N} ⟨en|x⟩en‖² + Σ_{n=1}^{N} |⟨en|x⟩|² ≥ Σ_{n=1}^{N} |⟨en|x⟩|².

This shows that the sequence (⟨en|x⟩)_{n∈N} is square summable and that

    Σ_{n=1}^{∞} |⟨en|x⟩|² ≤ ‖x‖².                                  (2.1)

In particular it follows that for m ≥ n one has by orthogonality

    ‖Σ_{k=1}^{m} ⟨ek|x⟩ek − Σ_{k=1}^{n} ⟨ek|x⟩ek‖² = ‖Σ_{k=n+1}^{m} ⟨ek|x⟩ek‖² = Σ_{k=n+1}^{m} |⟨ek|x⟩|² ≤ Σ_{k=n+1}^{∞} |⟨ek|x⟩|²,

which goes to zero as n → ∞. This shows that the sequence of partial sums forms a Cauchy sequence in H. By the completeness of H the sequence of


partial sums converges. Hence, Σ_{n=1}^{∞} ⟨en|x⟩en ∈ H. It remains to show that this series is indeed x. In order to show this define for N ∈ N

    P_N : x ↦ Σ_{n=1}^{N} ⟨en|x⟩en.

We have just shown that for every x ∈ H the limit of P_N x as N → ∞ exists. Moreover, it follows from the estimate (2.1) that P_N is a contraction, i.e. ‖P_N x‖ ≤ ‖x‖ for all x ∈ H. Moreover, if x is in the span of (en)_{n∈N}, then P_N x = x for N large enough. In particular, one has lim_{N→∞} P_N x = x for all x in the span of (en)_{n∈N} which is a dense subset of H by the definition of an orthonormal basis. In other words, we have lim_{N→∞} P_N = Id pointwise on a dense subset. It now follows from Lemma 2.1.10 applied to P_N − Id because of ‖P_N‖ ≤ 1 that this convergence indeed holds for all x ∈ H, i.e. one has the desired expansion

    x = Σ_{n=1}^{∞} ⟨en|x⟩en

for all x ∈ H. This finishes the proof of (a). Now Parseval's identity in (b) is an immediate consequence. Indeed, we have for x, y ∈ H

    ⟨x|y⟩ = ⟨Σ_{n=1}^{∞} ⟨en|x⟩en | Σ_{k=1}^{∞} ⟨ek|y⟩ek⟩ = lim_{N→∞} lim_{M→∞} Σ_{n=1}^{N} Σ_{k=1}^{M} ⟨x|en⟩⟨ek|y⟩⟨en|ek⟩
          = Σ_{n=1}^{∞} ⟨x|en⟩⟨en|y⟩.

We now prove the lemma left open in the previous proof.

Lemma 2.1.10. Let Tn : X → X be a sequence of uniformly bounded linear opera-


tors on some Banach space X, i.e. kTn xk ≤ C kxk for some C ≥ 0 and all n ∈ N and
x ∈ X. Further suppose that limn→∞ Tn x = 0 for all x in a dense subset M ⊂ X.
Then
lim Tn x = 0 for all x ∈ X.
n→∞

Proof. Let x ∈ X and ε > 0. Choose y ∈ M with ‖x − y‖ ≤ ε. Then we have

    lim sup_{n→∞} ‖Tn x‖ ≤ lim sup_{n→∞} ( ‖Tn(x − y)‖ + ‖Tn y‖ ) ≤ C‖x − y‖ ≤ Cε.

Since ε > 0 is arbitrary, it follows that lim_{n→∞} ‖Tn x‖ = 0.

If we apply Parseval’s identity to the trigonometric basis considered in


Example 2.1.5, we obtain Plancherel’s identity from Fourier analysis.


Corollary 2.1.11 (Plancherel’s identity). Let f ∈ L2 ([0, 1]) and let fˆ(n) for
n ∈ Z denote its n-th Fourier coefficient. Then
    ∫₀¹ |f (x)|² dx = Σ_{n∈Z} |f̂(n)|².
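Plancherel's identity can be tested numerically for a concrete function: compute finitely many Fourier coefficients by quadrature and compare Σ|f̂(n)|² with ∫₀¹|f(x)|² dx. The following Python sketch is an added illustration (not part of the notes), using f(x) = x, for which both sides equal 1/3.

import numpy as np

N = 2 ** 14
xs = np.arange(N) / N                 # Riemann grid on [0, 1)
f = xs                                # f(x) = x, a concrete element of L^2([0, 1])

lhs = np.mean(f ** 2)                 # approximates int_0^1 |f|^2 dx = 1/3

ks = np.arange(-500, 501)
coeffs = np.array([np.mean(f * np.exp(-2j * np.pi * k * xs)) for k in ks])
rhs = np.sum(np.abs(coeffs) ** 2)     # partial sum of sum_n |f^(n)|^2
print(lhs, rhs)                       # both values are close to 1/3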

2.1.2 Bounded Operators on Hilbert Spaces


We have already encountered bounded linear operators in Lemma 2.1.10.
This subsection is devoted to a closer study of such operators. Recall that
by the postulates of quantum mechanics a physical observable is modeled
by a self-adjoint operator on some Hilbert space. Such operators are often
unbounded in important examples. Nevertheless it is important as a first
step to understand the easier case of bounded self- and non-self-adjoint op-
erators. Moreover, the evolution operators of quantum mechanical systems
that we will consider in the study of the Schrödinger equation and orthog-
onal projections which are fundamental in the mathematical description of
the measurement process in quantum mechanics are important examples of
bounded operators.
Definition 2.1.12. A linear operator T : X → Y between two Banach spaces is
bounded if there exists a constant C ≥ 0 such that
kT xkY ≤ C kxkX for all x ∈ X .
The smallest constant C ≥ 0 such that the above inequality holds is called the
operator norm of T and is denoted by kT k. The space of all bounded linear
operators between X and Y is denoted by B(X, Y ). It is a Banach space with
respect to the operator norm.
The last fact is left as an exercise to the reader. Notice that a linear operator T is bounded if and only if it is continuous, i.e. xn → x in X implies T xn → T x in Y. This is also left as an exercise to the reader. Finally, the reader should
verify that
    ‖T ‖ = sup_{‖x‖≤1} ‖T x‖.
Suppose one has given two bounded linear operators T : X → Y and S : Y → Z
between Banach spaces X, Y and Z. Then one has for all x ∈ X
kST xk ≤ kSk kT xk ≤ kSk kT k kxk .
This shows by definition the fundamental operator inequality kST k ≤ kT k kSk,
the so-called submultiplicativity of the norm. As a consequence the composi-
tion operation B(X) × B(X) → B(X) is continuous, i.e. Sn → S and Tn → T in
B(X) implies Sn Tn → ST .
We continue with a fundamental class of examples of bounded operators.


Example 2.1.13 (Multipliers for orthonormal bases). We choose the Hilbert space H = ℓ² = ℓ²(N). Further let en = (δ_{nm})_{m∈N} denote the n-th unit vector. We have seen in the discussion before Example 2.1.5 that (en)_{n∈N} is an orthonormal basis of ℓ²(N). Let x = (xn)_{n∈N} be a sequence in ℓ²(N). Then one has the unique representation x = Σ_{n=1}^{∞} xn en with respect to the unit vector basis. Hence, for a bounded sequence (an)_{n∈N} ∈ ℓ^∞(N) we obtain a well-defined linear operator by setting

    T ( Σ_{n=1}^{∞} xn en ) = Σ_{n=1}^{∞} an xn en.

This follows from the fact that (an xn)_{n∈N} ∈ ℓ²(N) whenever (xn)_{n∈N} ∈ ℓ²(N). Moreover, with the help of Parseval's identity one obtains

    ‖Σ_{n=1}^{∞} an xn en‖² = Σ_{n=1}^{∞} |an xn|² ≤ (sup_{n∈N} |an|)² Σ_{n=1}^{∞} |xn|² = (sup_{n∈N} |an|)² ‖Σ_{n=1}^{∞} xn en‖².

This shows that T is bounded with ‖T ‖ ≤ sup_{n∈N} |an|. Conversely, for all ε > 0 there exists n₀ ∈ N such that |a_{n₀}| > sup_{n∈N} |an| − ε. Let ϕ ∈ [0, 2π) be such that e^{iϕ} a_{n₀} = |a_{n₀}|. Then one has ‖e^{iϕ} e_{n₀}‖ = 1 and

    ‖T (e^{iϕ} e_{n₀})‖ = |e^{iϕ} a_{n₀}| = |a_{n₀}| ≥ sup_{n∈N} |an| − ε.

Hence, ‖T ‖ ≥ sup_{n∈N} |an| − ε for all ε > 0. Since ε > 0 is arbitrary, the equality ‖T ‖ = sup_{n∈N} |an| follows. Notice that the same reasoning applies if one replaces ℓ² by an arbitrary infinite-dimensional separable Hilbert space H and (en)_{n∈N} by an arbitrary orthonormal basis of H.
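Truncating to finitely many coordinates, the multiplication operator above becomes a diagonal matrix, and the identity ‖T‖ = supₙ|aₙ| can be observed directly. The following Python sketch is an added illustration (not part of the notes).

import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(-1.0, 1.0, size=200)   # a bounded sequence (a_n), truncated
T = np.diag(a)                         # T(sum x_n e_n) = sum a_n x_n e_n on the truncation

# operator norm of the truncation = largest singular value = sup |a_n|
print(np.linalg.norm(T, 2), np.max(np.abs(a)))   # the two numbers agree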

Notice that the above example in particular shows that in general there
is no x in the unit ball of H such that kT xk = kT k, i.e. T does not attain its
norm. We now turn our attention to dual spaces, a concept fundamental for
the Dirac formulation of quantum mechanics.

Example 2.1.14 (Orthogonal projection). Let H be a separable Hilbert space of infinite dimension and (en)_{n∈N} an orthonormal basis of H. Using the expansion with respect to (en)_{n∈N} we define for N ∈ N the operator

    P_N ( Σ_{n=1}^{∞} an en ) = Σ_{n=1}^{N} an en.

Clearly, P_N is linear and we have implicitly shown in estimate (2.1) in the proof of Theorem 2.1.9 that P_N is a bounded operator with ‖P_N‖ ≤ 1. Because of P_N e₁ = e₁ one directly sees that ‖P_N‖ = 1. Note that P_N² = P_N. Hence, P_N is a projection onto span{e₁, . . . , e_N}. Moreover, one has ⟨P_N x|(Id − P_N)y⟩ = 0 for all x, y ∈ H. Indeed, one has

    ⟨Σ_{n=1}^{N} ⟨en|x⟩en | y − Σ_{k=1}^{N} ⟨ek|y⟩ek⟩ = Σ_{n=1}^{N} ⟨x|en⟩ ⟨en | y − Σ_{k=1}^{N} ⟨ek|y⟩ek⟩
    = Σ_{n=1}^{N} ⟨x|en⟩ (⟨en|y⟩ − ⟨en|y⟩) = 0.

Hence, the kernel and the image of P_N are orthogonal subspaces. Such a projection is called an orthogonal projection.

More generally, let M ⊂ H be a closed subspace of H. Since M is closed, M is complete with respect to the norm induced by the inherited scalar product of H. Hence, M is a Hilbert space as well. We assume that M is infinite-dimensional. The finite dimensional case is simpler and can be treated as above. Then M has an orthonormal basis (en)_{n∈N} by Theorem 2.1.7. Now define the linear operator

    P_M : x ↦ Σ_{n=1}^{∞} ⟨en|x⟩en.

By estimate (2.1) used in the proof of Theorem 2.1.9 one has ‖P_M x‖ ≤ ‖x‖. Moreover, taking again x = e₁ we see that P_M e₁ = e₁. Hence, ‖P_M‖ = 1. More generally, one has P_M x = x for all x ∈ M as in this case one has x = Σ_{n=1}^{∞} ⟨en|x⟩en by Theorem 2.1.9. Furthermore, P_M² = P_M holds and P_M is orthogonal. This can be shown as in the first part of the example. Hence, P_M is an orthogonal projection onto the closed subspace M.
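For a finite-dimensional subspace M the construction can be carried out explicitly: obtain an orthonormal basis of M (e.g. by a QR decomposition, which performs Gram–Schmidt), form P_M x = Σ⟨eₙ|x⟩eₙ, and check P_M² = P_M as well as the orthogonality of image and kernel. The following Python sketch is an added illustration (not part of the notes), working in R^d.

import numpy as np

rng = np.random.default_rng(2)
d, m = 8, 3
A = rng.normal(size=(d, m))          # columns span a 3-dimensional subspace M of R^8
Q, _ = np.linalg.qr(A)               # orthonormal basis e_1, ..., e_m of M (Gram-Schmidt)

P = Q @ Q.T                          # P x = sum_n <e_n|x> e_n

x, y = rng.normal(size=d), rng.normal(size=d)
print(np.allclose(P @ P, P))                              # P^2 = P
print(np.isclose((P @ x) @ ((np.eye(d) - P) @ y), 0.0))   # image and kernel are orthogonal
print(np.allclose(P @ A[:, 0], A[:, 0]))                  # P acts as the identity on M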

Definition 2.1.15 (Dual space). Let H be a Hilbert space. Then its (topological) dual space is defined as

    H′ := {ϕ : H → K : ϕ linear and continuous}.

Example 2.1.16. Let H be a Hilbert space. For y ∈ H one defines the functional ϕ_y(x) = ⟨y|x⟩ on H. It follows from the Cauchy–Schwarz inequality (Proposition 2.1.3) that for x ∈ H one has |ϕ_y(x)| ≤ ‖y‖ ‖x‖. On the other hand one has ϕ_y(y/‖y‖) = ‖y‖ provided y ≠ 0. This shows that ϕ_y ∈ H′ with norm ‖y‖ (the case y = 0 is obvious).

Remark 2.1.17. The space H′ is called the topological dual space of H because one requires its elements to be continuous. Sometimes one also considers the so-called algebraic dual space which consists of all linear functionals H → K. We will exclusively work with the topological dual space. Hence, no confusion can arise and we will often drop the term topological.


The Riesz representation theorem for Hilbert spaces says that indeed all elements of H′ are of the form considered in Example 2.1.16.

Theorem 2.1.18 (Riesz representation theorem). Let H be a Hilbert space and ϕ ∈ H′. Then there exists a unique y ∈ H such that ϕ = ϕ_y. Moreover, the map

    H → H′,  y ↦ ϕ_y

is an anti-linear isometric isomorphism between H and H′.

As an application of the Riesz representation theorem we now discuss the


bra–ket formalism (often also called the Dirac notation) frequently used by
physicists in quantum mechanics. In the following we will also work with the
adjoints of operators although we have not yet introduced this concept. You
can either ignore this part for the moment and return later or just work with
the adjoints as known from linear algebra (the standpoint of most physicists).
This causes no problems as long as all operators involved are bounded. We will later in the lecture introduce and discuss adjoints in a mathematically rigorous way.

Remark 2.1.19 (Bra–ket notation of quantum mechanics). Recall that by


the postulates of quantum mechanics a physical (pure) state of a quantum
mechanical system is described by an element in some Hilbert space H. In
physics notation such a state is often written as |ψi ∈ H, a so-called ket. Further,
elements in the dual space H′ are called bras, and one writes ⟨ϕ| ∈ H′. One can now apply the functional ⟨ϕ| to |ψ⟩. If the functional ⟨ϕ| is identified with a vector |ϕ⟩ ∈ H (which is possible in a unique way by the Riesz representation theorem) one has

    ⟨ϕ|(|ψ⟩) = ϕ_{|ϕ⟩}(|ψ⟩) = ⟨(|ϕ⟩)|(|ψ⟩)⟩.
To simplify notation, physicists usually use hϕ|ψi for the above expression.
Note that by the last equality this evaluation indeed agrees with the scalar
product of |ψi with hϕ| after hϕ| is identified with the state |ϕi via the isomor-
phism given by the Riesz representation theorem. Please note however that this argument only works if both sides can indeed be identified with elements of the Hilbert space H. Note that nevertheless physicists often use the above notation when this condition is violated. For example, physicists usually call the system (e^{ix·})_{x∈R} a generalized orthogonal basis and write

    ⟨e^{ix₁·}|e^{ix₂·}⟩ = δ(x₁ − x₂).

We will later give more sense to expressions as above by introducing Gelfand


triples and distributions.
The bra–ket notation can also be very convenient when working with
linear operators. Given a linear operator A (bounded or unbounded) one


denotes by A|ψ⟩ the value of |ψ⟩ under A in agreement with the notation used in mathematics. One extends the action of A to bras by defining

    (⟨ϕ|A)(|ψ⟩) = A(⟨ϕ|)(|ψ⟩) := (⟨ϕ|)(A|ψ⟩) = ⟨(|ϕ⟩)|(A|ψ⟩)⟩ =: ⟨ϕ|A|ψ⟩.

In fact, if A ∈ B(H) one has |A(⟨ϕ|)(|ψ⟩)| ≤ ‖⟨ϕ|‖ ‖A‖ ‖|ψ⟩‖ and therefore A(⟨ϕ|) ∈ H′ for all ⟨ϕ| ∈ H′.
Now suppose one has given A ∈ B(H) and |ψ⟩ ∈ H. Let us determine the bra ⟨ϕ| which corresponds to the ket |ϕ⟩ = A|ψ⟩. One has

    ⟨ϕ|η⟩ = ⟨(A|ψ⟩)|(|η⟩)⟩ = ⟨(|ψ⟩)|(A*|η⟩)⟩ = (⟨ψ|A*)(|η⟩).

One obtains ⟨ϕ| = ⟨ψ|A*. Hence, the adjoint formally acts on bras. In particular, if A is self-adjoint, then A acts in the same way both on kets and bras.
Note that the definition of the action of A on bras is made in a way such
that the value of the scalar product at the right hand side of the above equation
agrees no matter whether A is applied to a ket or to a bra. This justifies the use
of the notation hϕ|A|ψi. Note that if H is finite dimensional and A is identified
with a matrix everything can be computed using matrix multiplications.
One also often uses the so-called outer product of kets and bras. For a ket |ϕ⟩ ∈ H and a bra ⟨ψ| ∈ H′ we define the bounded linear operator

    (|ϕ⟩⟨ψ|)(|χ⟩) := ⟨ψ|χ⟩ · |ϕ⟩ for |χ⟩ ∈ H.

In the finite dimensional setting the operator |ϕ⟩⟨ψ| corresponds to the matrix obtained by multiplying the column vector |ϕ⟩ with the row vector ⟨ψ|. In particular, if (en)_{n∈N} is an orthonormal basis of a Hilbert space H, then the finite rank operator

    Σ_{n=1}^{N} |en⟩⟨en|

for N ∈ N is an orthogonal projection onto span{e1 , . . . , eN } as considered in


Example 2.1.14.
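In coordinates the outer product |ϕ⟩⟨ψ| is literally the product of a column vector with a conjugated row vector, and Σₙ|eₙ⟩⟨eₙ| over an orthonormal family reproduces the orthogonal projection of Example 2.1.14. The following Python sketch is an added illustration (not part of the notes); the helper name ket_bra is ad hoc.

import numpy as np

def ket_bra(phi, psi):
    """The operator |phi><psi|, acting as chi -> <psi|chi> * phi."""
    return np.outer(phi, np.conj(psi))

# an orthonormal family in C^5: the first three standard unit vectors
e = np.eye(5, dtype=complex)
P = sum(ket_bra(e[n], e[n]) for n in range(3))     # sum_n |e_n><e_n|

chi = np.array([1, 2, 3, 4, 5], dtype=complex)
print(P @ chi)                # [1, 2, 3, 0, 0]: projection onto span{e_1, e_2, e_3}
print(np.allclose(P @ P, P))  # True, P is a projection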

2.1.3 Sobolev Spaces


In this section we introduce an important class of Hilbert spaces, the so-called
Sobolev spaces. These play a fundamental role in the mathematical treatment
of differential operators and partial differential equations. We have seen that
many important quantum mechanical observables can be realized by self-
adjoint differential operators. The correct domain of these operators usually
is some sort of Sobolev space as we will soon see. Hence, for the study of
concrete quantum mechanical systems, we need some basic knowledge on
those spaces.


Sobolev spaces generalize the notion of classical derivatives in a way that


is very well adapted to the use of functional analytic methods such as Hilbert
space theory. Indeed, the theory of Sobolev spaces is the correct framework
for the study of partial differential equations.
Before introducing Sobolev spaces, we need some basic notation. For an
open subset Ω ⊂ Rn we denote by Cc∞ (Ω) the set of all infinitely differentiable
functions ϕ : Ω → K with compact support in Ω, i.e. the closure of {x ∈ Ω : ϕ(x) ≠ 0} (in the metric space Ω) is a compact subset of Ω. Note that Cc∞(Ω) contains plenty of functions. For example, in the one-dimensional case consider the function

    ψ(x) = e^{−1/x²} if x > 0 and ψ(x) = 0 if x ≤ 0.

Then one can verify that ψ ∈ C ∞ (R). Now, the function ϕ(x) = ψ(1 + x)ψ(1 − x)
lies in C ∞ (R) and vanishes outside (−1, 1). Hence, ϕ is a non-trivial element
of Cc∞ (R). By taking suitable products of translations and dilations of the
function ϕ just constructed one now easily obtains non-trivial elements of
Cc∞ (Ω).
In the following we use a short-hand notation for higher derivatives. Let Ω ⊂ Rⁿ be open and α = (α1, . . . , αn) ∈ N₀ⁿ be a multi-index. For a sufficiently differentiable function f : Ω → R we write

    D^α f = D₁^{α1} D₂^{α2} · · · Dₙ^{αn} f = ∂^{|α|} f / (∂x₁^{α1} · · · ∂xₙ^{αn}),

where |α| = α1 + · · · + αn. Sometimes we will also write D_x^α f to make clear that we differentiate with respect to the x-variables. Moreover, we use for p ∈ [1, ∞]

    L^p_loc(Ω) := { f : Ω → K measurable : ‖f 1_K‖_p < ∞ for all compact K ⊂ Ω }.

Note that by the Lp-inclusions for finite measure spaces (Proposition 1.3.8) we have L^p_loc(Ω) ⊂ L^q_loc(Ω) for all p ≥ q. In particular, we have the inclusion L^p_loc(Ω) ⊂ L¹_loc(Ω) for all p ≥ 1.

Definition 2.1.20. Let Ω ⊂ Rn be open, α ∈ Nn0 and f ∈ L1loc (Ω). A function


g ∈ L1loc (Ω) is called a weak α-th derivative of f if for all (real) ϕ ∈ Cc∞ (Ω)
    ∫_Ω f D^α ϕ = (−1)^{|α|} ∫_Ω g ϕ.

The space Cc∞ (Ω) is a so-called space of test functions. Such spaces will
later play an important role in the development of the mathematical theory
of distributions.
In the following example we show that the terminology weak derivative
makes sense.


Example 2.1.21. Let f ∈ C k (Ω) be a k times continuously differentiable func-


tion on some open subset Ω ⊂ Rn and α ∈ Nn0 with |α| ≤ k. Successively using
integrations by parts, we see that
    ∫_Ω f D^α ϕ = (−1)^{|α|} ∫_Ω D^α f ϕ    for all ϕ ∈ Cc∞(Ω).

This shows that D^α f is a weak α-th derivative of f . Notice that the continuity
of all factors in the integrands and the compact support of ϕ guarantee the
existence of both integrals.

Hence, we have generalized the concept of a classical derivative by turning the validity of the integration by parts formula into our definition of a weak derivative. Moreover, there exist functions which have a weak derivative but are not classically differentiable.

Example 2.1.22. Consider the function f ∈ L1loc ((−1, 1)) given by f (x) = |x|. It
is well-known that f is not differentiable in the origin. However, g(x) = sign x
is a weak derivative of f . Indeed, for ϕ ∈ Cc∞ ((−1, 1)) one has
    ∫_{−1}^{1} |x| ϕ′(x) dx = ∫₀¹ x ϕ′(x) dx − ∫_{−1}^{0} x ϕ′(x) dx
                            = [xϕ(x)]₀¹ − ∫₀¹ ϕ(x) dx − [xϕ(x)]_{−1}^{0} + ∫_{−1}^{0} ϕ(x) dx
                            = − ∫_{−1}^{1} sign x · ϕ(x) dx.
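The defining identity ∫|x|ϕ′(x) dx = −∫ sign(x)ϕ(x) dx can also be checked numerically for a concrete test function, for instance a (non-even) multiple of the bump function constructed above. The following Python sketch is an added illustration (not part of the notes).

import numpy as np

xs = np.linspace(-1.0, 1.0, 400_001)[1:-1]     # interior grid of (-1, 1)
dx = xs[1] - xs[0]

phi = xs * np.exp(-1.0 / (1.0 - xs ** 2))      # a test function in C_c^infty((-1, 1))
phi_prime = np.gradient(phi, dx)               # numerical derivative of phi

lhs = np.sum(np.abs(xs) * phi_prime) * dx      # int |x| phi'(x) dx
rhs = -np.sum(np.sign(xs) * phi) * dx          # -int sign(x) phi(x) dx
print(lhs, rhs)                                # the two values agree up to discretization error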

We now substantially generalize the above example.

Example 2.1.23. Let I = (a, b) ⊂ R be an open (not necessarily bounded) interval and g ∈ L¹(I). Now for some x₀ ∈ I consider the function f (x) = ∫_{x₀}^{x} g(y) dy. Then f is weakly differentiable because for ϕ ∈ Cc∞((a, b)) we have by Fubini's theorem

    ∫_a^b f (x)ϕ′(x) dx = ∫_a^b ∫_{x₀}^{x} g(y) dy ϕ′(x) dx = ∫_a^b g(y) ∫_y^b ϕ′(x) dx dy
                        = − ∫_a^b g(y)ϕ(y) dy.

This shows that the weak derivative of f is given by g, i.e. f ′ = g in the weak sense. Observe that the same argument works if g ∈ L¹_loc(I) provided one makes the restriction x₀ ∈ I.

Before going any further, we need some approximation results for contin-
uous and Lp functions.


Proposition 2.1.24. Let Ω ⊂ Rn be open, p ∈ [1, ∞) and f ∈ Lp (Ω) such that


f = 0 almost everywhere outside a subset A ⊂ Ω that has positive distance to ∂Ω.
Then there exists a sequence (ϕk )k∈N ⊂ Cc∞ (Ω) with ϕk → f in Lp (Ω). Moreover,
the sequence can be chosen to have the following properties.

(a) if f ≥ 0 almost everywhere, then ϕn (x) ≥ 0 for all n ∈ N;

(b) if f ∈ L∞(Ω), then ‖ϕn‖∞ ≤ ‖f ‖∞ for all n ∈ N.

Proof. Let ψ ∈ Cc∞(Rⁿ) be a non-negative function with ‖ψ‖₁ = 1 and support inside the unit ball. Now define ψ_k(x) = kⁿψ(kx). Then ψ_k ≥ 0 and ‖ψ_k‖₁ = 1 for all k ∈ N and ψ_k vanishes outside the ball B(0, 1/k). Now consider the convolution

    ϕ_k(x) = (f ∗ ψ_k)(x) = ∫_{Rn} f (y)ψ_k(x − y) dy,

where we extend f by zero outside Ω. Notice that the convolution exists, for example as a consequence of Proposition 1.3.10 or by observing that for fixed x ∈ Rⁿ both sides are integrable after passing to the compact support of the integrand. It follows from the dominated convergence theorem that ϕ_k ∈ C∞(Rⁿ). More precisely, one can verify that for a multi-index α ∈ N₀ⁿ one has

    (D^α ϕ_k)(x) = ∫_{Rn} f (y)(D^α ψ_k)(x − y) dy.
Moreover, it follows from the formula for the convolution that ϕk (x) vanishes
if |x − y| ≥ 1/k for all y ∈ A. Hence, ϕk vanishes outside A + B(0, 1/k). Since A
has positive distance to ∂Ω, we have ϕk ∈ Cc∞ (Ω) for sufficiently large k.
We now show that (ϕk )k∈N converges to f in Lp . We start with the case
when f additionally is a compactly supported continuous function. Let x ∈ Ω
and ε > 0. Since f is continuous in x, there exists δ > 0 such that |f (x)−f (y)| ≤ ε
for all |x − y| ≤ δ. Now if k > 1/δ
|f (x) − ϕk (x)| = |∫_{Rn} f (x)ψk (x − y) dy − ∫_{Rn} f (y)ψk (x − y) dy|
≤ ∫_{Rn} |f (x) − f (y)|ψk (x − y) dy = ∫_{Rn} |f (x) − f (x − y)|ψk (y) dy
= ∫_{B(0,1/k)} |f (x) − f (x − y)|ψk (y) dy ≤ ε ∫_{B(0,1/k)} ψk (y) dy = ε.

Hence, ϕk (x) → f (x) for all x ∈ Ω. As an intermediate step we now show
assertions (a) and (b). Note that (a) is obvious from our construction and that
part (b) follows from the estimate
|ϕk (x)| ≤ kf k∞ ∫_{Rn} ψk (x − y) dy = kf k∞ .


Since the functions ϕk are supported in A + B(0, 1) and are uniformly bounded
by kf k∞ (which is finite because f is continuous and has compact support), it
follows from the dominated convergence theorem that ϕk → f in Lp .
We now consider the case of arbitrary f ∈ Lp (Ω). Consider for k ∈ N the
linear bounded operator

Tk : Lp (Ω) → Lp (Rn )
f 7→ f ∗ ψk

It follows from Minkowski’s inequality for convolutions (Proposition 1.3.10)


that kTk k ≤ kψk k1 = 1. Since Cc (Ω) is dense in Lp (Ω) by Proposition 1.3.6,
Lemma 2.1.10 shows that Tk f → f 1Ω in Lp (Rn ) for all f ∈ Lp (Ω). This finishes
the proof.
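To make the mollification argument concrete, here is a small numerical sketch (assuming NumPy; the indicator function and the grid below are arbitrary choices) in which the smooth approximations f ∗ ψk are seen to converge to f in L2 as k grows.

    import numpy as np

    # Standard mollifier supported in [-1, 1], normalized to have integral 1.
    def psi(x):
        out = np.zeros_like(x)
        inside = np.abs(x) < 1
        out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
        return out

    x = np.linspace(-3, 3, 12001)
    dx = x[1] - x[0]
    norm = np.sum(psi(x)) * dx

    f = ((x > -1) & (x < 1)).astype(float)             # indicator of (-1, 1), a discontinuous L^2 function

    for k in (2, 8, 32):
        psi_k = k * psi(k * x) / norm                  # psi_k(x) = k psi(kx), still with integral 1
        f_k = np.convolve(f, psi_k, mode="same") * dx  # discretized convolution f * psi_k
        err = np.sqrt(np.sum((f_k - f) ** 2) * dx)
        print(k, err)                                  # the L^2 error decreases as k grows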

Notice that a priori a function f could have several different weak deriva-
tives. We now show that this is not the case. The following lemma is often
called the du Bois-Reymond lemma. In calculus of variations you have probably
encountered variants of this lemma which are usually called the fundamental
lemma of calculus of variations. There it is used to deduce the Euler–Lagrange
equations from the variational principle. Although intuitively clear, a rigorous
proof needs some effort because of measure theoretic difficulties.

Lemma 2.1.25 (du Bois-Reymond). Let Ω ⊂ Rn be open and f ∈ L1loc (Ω) with
∫_Ω f ϕ = 0   for all ϕ ∈ Cc∞ (Ω).

Then f = 0 almost everywhere.

Proof. First observe that it is sufficient to consider the case when f is a real
function. Indeed, the assumption implies
∫_Ω f ϕ = ∫_Ω Re f ϕ + i ∫_Ω Im f ϕ = 0

for all real ϕ ∈ Cc∞ (Ω). Since a complex number vanishes if and only if both
the real and imaginary part vanish, both summands in the above formula
must vanish. Hence, the complex case follows from the real case applied to
both Re f and Im f .
For n ∈ N let Ωn = {x ∈ Ω ∩ B(0, n) : dist(x, ∂(Ω ∩ B(0, n))) > 1/n}. Then Ωn is
open and bounded. Suppose we can show that f = 0 almost everywhere on Ωn .
Then it follows that f = 0 almost everywhere on Ω because of ∪n∈N Ωn = Ω
and the fact that the countable union of null sets is a null set.
Now assume that f = 0 does not hold almost everywhere on Ωn . This
means that |{x ∈ Ωn : |f (x)| > 0}| > 0. We may assume without loss of generality


that |{x ∈ Ωn : f (x) > 0}| > 0 (replace f by −f if necessary). It now follows that
there exists an ε > 0 and a measurable subset A ⊂ Ωn of positive measure with
f (x) ≥ ε for all x ∈ A.
Let B be an arbitary measurable subset of Ωn . Since Ωn has positive dis-
tance to the boundary of Ω, by Proposition 2.1.24 there exists a sequence
(ϕk )k∈N ⊂ Cc∞ (Ωn+1 ) with 0 ≤ ϕk ≤ 1 and ϕk → 1B in L1 (Ω). By Proposi-
tion 1.3.5 we can additionally assume after passing to a subsequence that
ϕk (x) → 1B (x) almost everywhere on Ω. Since Ωn has finite measure, it fol-
lows from the dominated convergence theorem because of |ϕk f | ≤ |f | 1Ωn+1
that
∫_Ω f 1B = lim_{k→∞} ∫_Ω f ϕk = 0.
Taking B = A, we however have by the considerations in the previous para-
graph
∫_Ω f 1A ≥ ε ∫_Ω 1A = ε |A| > 0,
which is a contradiction. Hence, we must have f = 0 almost everywhere on
Ωn .

The uniqueness of weak derivatives is now an immediate consequence of


the Du Bois–Reymond lemma.

Corollary 2.1.26. Let Ω ⊂ Rn be open, α ∈ Nn0 and f : Ω → K. Suppose that f


has two weak α-th derivatives g1 and g2 in L1loc (Ω). Then g1 = g2 as elements in
L1loc (Ω), i.e. g1 = g2 almost everywhere.

Proof. By definition, one has for all ϕ ∈ Cc∞ (Ω)


(−1)^{|α|} ∫_Ω g1 ϕ = ∫_Ω f D^α ϕ = (−1)^{|α|} ∫_Ω g2 ϕ.

Hence, for all ϕ ∈ Cc∞ (Ω) we obtain the identity


∫_Ω (g1 − g2 )ϕ = 0.

It now follows from the du Bois-Reymond Lemma 2.1.25 that g1 = g2 almost


everywhere.

Now suppose that f ∈ L1loc (Ω) has a weak α-th derivative. By the above
corollary a weak derivative is uniquely determined in L1loc (Ω). It makes
therefore sense to speak of the weak α-th derivative which we will denote
by D α f . Recall that we have seen in Example 2.1.21 that if f is classically
continuously differentiable, the weak derivative coincides with the classical
derivative. Thus there is no conflict in notation.
We now can finally define Sobolev spaces.


Definition 2.1.27 (Sobolev spaces). Let Ω ⊂ Rn be open, k ∈ N and p ∈ [1, ∞].


We define the Sobolev spaces

W k,p (Ω) B {f ∈ Lp (Ω) : f is weakly α-diff. and D^α f ∈ Lp (Ω) for all |α| ≤ k},

endowed with the norms


kf kW k,p (Ω) B ( Σ_{|α|≤k} kD^α f k_{Lp (Ω)}^p )^{1/p} .

In the particular interesting case of W k,2 -spaces, one naturally obtains a


Hilbert space structure.

Proposition 2.1.28. Let Ω ⊂ Rn be open, k ∈ N and p ∈ [1, ∞]. Then W k,p (Ω) is a Banach


space. In the case p = 2 the norm of W k,2 (Ω) is induced by the inner product
hf |giW k,2 (Ω) B Σ_{|α|≤k} hD^α f |D^α giL2 (Ω) ,

which endows W k,2 (Ω) with the structure of a Hilbert space. We will also use the
abbreviation H k (Ω) = W k,2 (Ω).

Proof. All assertions except for the completeness are obvious. For the com-
pleteness suppose that (fn )n∈N is a Cauchy sequence in W k,p (Ω). In particular,
(D α fn )n∈N is a Cauchy sequence in Lp (Ω) for all |α| ≤ k. Now, it follows from
the completeness of Lp (Ω) that for all |α| ≤ k there exists fα ∈ Lp (Ω) with
D α fn → fα .
We now show that f = f(0,...,0) is weakly α-differentiable for all |α| ≤ k. For
this observe that for all ϕ ∈ Cc∞ (Ω) one has
∫_Ω f D^α ϕ = lim_{n→∞} ∫_Ω fn D^α ϕ = lim_{n→∞} (−1)^{|α|} ∫_Ω D^α fn ϕ = (−1)^{|α|} ∫_Ω fα ϕ.
Here we have used twice the fact that for ψ ∈ Cc∞ (Ω) the functional f 7→ ∫_Ω f ψ
is continuous by Hölder’s inequality. Observe that the above calculation
shows that D α f = fα . From this it is now clear that f ∈ W k,p (Ω) and fn → f in
W k,p (Ω). Hence, W k,p (Ω) is complete.

Note that we have seen in Example 2.1.22 that there exist Sobolev functions
which are not classically differentiable. However, one has the following
denseness result.

Theorem 2.1.29. Let n ∈ N, p ∈ [1, ∞) and k ∈ N. Then the space Cc∞ (Rn ) is
dense in W k,p (Rn ).


Proof. Let f ∈ W k,p (Rn ). By Proposition 2.1.24 the sequence in Cc∞ (Rn ) given
by ϕk = f ∗ ψk , where ψk (x) = k^n ψ(kx) for some ψ ∈ Cc∞ (Rn ) with kψk1 = 1, con-
verges to f in Lp . Moreover, we have seen that for α ∈ Nn0
(D^α ϕk )(x) = ∫_{Rn} f (y)(Dx^α ψk )(x − y) dy.

Observe that (Dx^α ψk )(x − y) = (−1)^{|α|} (Dy^α ψk )(x − y) lies in Cc∞ (Rn ). Hence, for
|α| ≤ k we have by the definition of the weak derivative
(D^α ϕk )(x) = (−1)^{|α|} ∫_{Rn} f (y)(Dy^α ψk )(x − y) dy = ∫_{Rn} (D^α f )(y)ψk (x − y) dy
= (D^α f ∗ ψk )(x).

Hence, it follows again from Proposition 2.1.24 that D^α ϕk → D^α f in Lp .


Altogether this shows that ϕk → f in W k,p (Rn ). Hence, Cc∞ (Rn ) is dense in
W k,p (Rn ).

Sobolev spaces are a fundamental tool in the modern treatment of partial


differential equations. We only give one example to show the power of the
functional analytic apparatus for the treatment of such equations.

Example 2.1.30 (Weak solutions for −∆u + u = f ). On Rn consider the fol-


lowing elliptic problem for real functions closely related to Poisson’s equation.
We search a solution u to the inhomogeneous problem −∆u + u = f on Rn . For
the moment assume that u ∈ C 2 (Rn ) is a classical solution of the problem. In
particular, u and therefore f are continuous functions. Integrating both sides
against a test function w ∈ Cc∞ (Rn ) we obtain by integration by parts
∫_{Rn} (−∆u + u)w dx = ∫_{Rn} ∇u·∇w dx + ∫_{Rn} uw dx = ∫_{Rn} f w dx.

Conversely, suppose that u ∈ C 2 (Rn ) satisfies for some f ∈ L1loc (Rn )


∫_{Rn} ∇u·∇w dx + ∫_{Rn} uw dx = ∫_{Rn} f w dx   (WS)

for all w ∈ Cc∞ (Rn ). Then we again see by integration by parts that

∫_{Rn} (−∆u + u)w dx = ∫_{Rn} f w dx

for all w ∈ Cc∞ (Rn ). Hence, it follows from the du Bois-Reymond lemma


(Lemma 2.1.25) that −∆u + u = f , i.e. u is a classical solution of the equation.
For f ∈ L2 (Rn ) we say that u ∈ H 1 (Rn ) (we work over real Hilbert spaces
in this example) is a weak solution of the equation if u satisfies (WS) for all


w ∈ Cc∞ (Rn ). Note that if u is a classical solution such that f as well as u and
all of its partial derivatives of first order lie in L2 (Rn ), then (WS) holds for all
w ∈ H 1 (Rn ) by the density of Cc∞ (Rn ) in H 1 (Rn ). More generally, the validity of
(WS) extends from all w ∈ Cc∞ (Rn ) to all w ∈ H 1 (Rn ) provided u ∈ H 1 (Rn ) and
f ∈ L2 (Rn ).
The advantage of the concept of weak solutions lies in the fact that one
often can establish the existence and uniqueness of weak solutions with
functional analytic methods. In our case consider the functional

ϕ : H 1 (Rn ) → R
w 7→ ∫_{Rn} f w dx.

Notice that by the Cauchy–Schwarz inequality (CS) one has

|ϕ(w)| ≤ kf k2 kwk2 ≤ kf k2 kwkH 1 (Rn )

for all w ∈ H 1 (Rn ). This shows that ϕ ∈ H 1 (Rn )∗ and it now follows from
the Riesz representation theorem (Theorem 2.1.18) that there exists a unique
u ∈ H 1 (Rn ) such that
hu|wiH 1 (Rn ) = ∫_{Rn} ∇u·∇w dx + ∫_{Rn} uw dx = ∫_{Rn} f w dx = ϕ(w)

for all w ∈ H 1 (Rn ). Hence, we have shown that there exists a unique weak
solution u of the equation for all f ∈ L2 (Rn ). We will later see that this
solution already satisfies u ∈ C ∞ (Rn ) if the inhomogeneity additionally has
the regularity f ∈ C ∞ (Rn ) and therefore is a classical solution of −∆u + u = f .
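On the Fourier side (anticipating Section 2.1.4) the solution operator is simply division by the multiplier 1 + |ξ|^2. The following sketch is a numerical illustration only; it assumes NumPy, uses the discrete Fourier transform on an arbitrarily chosen periodic box as a stand-in for F, and takes a Gaussian as inhomogeneity.

    import numpy as np

    L, N = 40.0, 4096
    x = np.linspace(-L / 2, L / 2, N, endpoint=False)
    xi = 2 * np.pi * np.fft.fftfreq(N, d=L / N)          # frequencies of the discrete Fourier transform

    f = np.exp(-x ** 2)                                   # smooth, rapidly decaying inhomogeneity
    u = np.fft.ifft(np.fft.fft(f) / (1 + xi ** 2)).real   # divide by the multiplier 1 + |xi|^2

    # residual of -u'' + u = f, with u'' computed by the same discrete Fourier transform
    u_xx = np.fft.ifft(-(xi ** 2) * np.fft.fft(u)).real
    print(np.max(np.abs(-u_xx + u - f)))                  # of the order of machine precision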

2.1.4 The Fourier Transform on L2 (Rn )


The Fourier transform is an extremely important and powerful tool both in
mathematics and physics. In physics the Fourier transform is often used
to switch from the position space to the momentum space description of a
quantum mechanical system and vice versa. In mathematics and concrete
calculations the Fourier transform is extremely useful because it diagonalizes
differential operators (with constant coefficients). We start with the
definition of the Fourier transform on L1 .

Definition 2.1.31 (The Fourier Transform on L1 (Rn )). For f ∈ L1 (Rn ) we


define its Fourier transform as
(F f )(x) = (2π)^{−n/2} ∫_{Rn} f (y)e^{−ix·y} dy,
where x · y = Σ_{k=1}^{n} xk yk stands for the Euclidean scalar product in Rn .


For an open set Ω ⊂ Rn we introduce the space of all continuous functions


vanishing in infinity

C0 (Ω) = {f : Ω → K continuous : ∀ε > 0 ∃K ⊂ Ω compact: |f (x)| ≤ ε ∀x ∈ Ω \ K}.

Observe that C0 (Ω) becomes a Banach space when endowed with the norm
kf k∞ B supx∈Ω |f (x)|. This follows from the fact that the uniform limit of
continuous functions is continuous and respects the vanishing condition. The
Fourier transform has the following elementary but useful mapping property.

Lemma 2.1.32 (Riemann–Lebesgue). The Fourier transform F maps L1 (Rn )


into C0 (Rn ).

The above lemma can be explicitly verified for indicators of finite intervals
and hence extends to simple functions by linearity. The general case then
follows from a density argument and the fact that C0 (Rn ) is a closed subspace
of L∞ (Rn ). Note that for a function f ∈ L2 (Rn ) the above Fourier integral may
not converge. Nevertheless it is possible to extend the Fourier transform to
L2 (Rn ).

Definition 2.1.33. Let H be a Hilbert space. A bounded linear operator


U ∈ B(H) is called a unitary operator if

(i) U is surjective and

(ii) hU x|U yi = hx|yi for all x, y ∈ H.

It follows from the polarization identity

hx|yi = (1/4) Σ_{k=1}^{4} i^k kx + i^k yk^2

that a bounded linear surjective operator U is unitary if and only if kU xk = kxk


for all x ∈ H, i.e. U is a surjective isometry. In fact, the next theorem shows
that the Fourier transform can even be extended to a unitary operator on
L2 (Rn ).

Theorem 2.1.34. The Fourier transform F has the following properties.

(a) Let f ∈ L1 (Rn ) such that F f ∈ L1 (Rn ) as well. Then the Fourier inversion
formula
f (x) = (2π)^{−n/2} ∫_{Rn} (F f )(y)e^{ix·y} dy
holds.


(b) Let f ∈ L1 (Rn ) ∩ L2 (Rn ). Then F f ∈ L2 (Rn ) and


kF f k2^2 = ∫_{Rn} |(F f )(x)|^2 dx = ∫_{Rn} |f (x)|^2 dx = kf k2^2 .

Therefore the Fourier transform can be uniquely extended to a unitary operator


F2 : L2 (Rn ) → L2 (Rn ) which we will also denote by F (the density of F L1 (Rn ) in
L2 (Rn ) will be a consequence of Proposition 3.2.7).

Note that by the inversion formula we have (F^2 f )(x) = f (−x) and therefore
F^{−1} = F^3 ; in other words, the inverse Fourier transform differs from F only
by the reflection x 7→ −x. Hence, results such as the Riemann–Lebesgue lemma are also valid for
the inverse Fourier transform. The importance of the Fourier transform lies
in the fact that it diagonalizes differentiation operators. This is essentially the
content of the next result.

Proposition 2.1.35. Let n ∈ N. The following criterion for a function f ∈ L2 (Rn )


to be in H 1 (Rn ) holds:

f ∈ H 1 (Rn ) ⇔ xj F f ∈ L2 (Rn ) for all j = 1, . . . , n.

More generally, one has f ∈ H k (Rn ) if and only if xα F f lies in L2 (Rn ) for all
|α| ≤ k. More precisely, for α ∈ Nn the weak α-th partial derivative exists and lies
in L2 (Rn ) if and only if xα F f ∈ L2 (Rn ). In this case

kD α f kL2 (Rn ) = kxα F f kL2 (Rn ) .

Proof. First let f ∈ Cc∞ (Rn ). Using integration by parts we obtain for j = 1, . . . , n
and x ∈ Rn the identity
F (∂f /∂yj )(x) = (2π)^{−n/2} ∫_{Rn} (∂f /∂yj )(y)e^{−ix·y} dy = (2π)^{−n/2} ∫_{Rn} ixj f (y)e^{−ix·y} dy
= ixj (F f )(x).

Hence, F (∂f /∂yj ) = ixj F f . Since the space of test functions Cc∞ (Rn ) is dense

in H 1 (Rn ) by Proposition 2.1.29, it follows from a limiting argument that


the above identity extends to all f ∈ H 1 (Rn ). Indeed, for f ∈ H 1 (Rn ) let
(fn )n∈N ⊂ Cc∞ (Rn ) be a sequence with fn → f in H 1 (Rn ). Then it follows from
the continuity of the Fourier transform on L2 (Rn ) that
   
F (Dj f ) = lim_{n→∞} F (Dj fn ) = lim_{n→∞} ixj F fn

exists in L2 (Rn ). It follows from Proposition 1.3.5 that limn→∞ ixj F fn agrees
with ixj F f almost everywhere. Hence, ixj F f ∈ L2 (Rn ) with F (Dj f ) = ixj F f .


Conversely for f ∈ L2 (Rn ), assume that ixj F f is in L2 (Rn ) for all j = 1, . . . , n.


We show that F −1 (ixj F f ) is the weak j-th partial derivative for f ∈ L2 (Rn ).
Observe that for all ϕ ∈ Cc∞ (Rn ) we have
∫_{Rn} F^{−1}(ixj F f )ϕ = ∫_{Rn} ixj F f \overline{F ϕ} = − ∫_{Rn} F f \overline{F (Dj ϕ)} = − ∫_{Rn} f \overline{Dj ϕ} = − ∫_{Rn} f Dj ϕ.

Hence, by definition f ∈ H 1 (Rn ). The general case and the norm identity
follow along exactly the same lines of proof.
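Both the unitarity of F and the way it turns differentiation into multiplication by ixj can be observed numerically. The sketch below is an illustration only (assuming NumPy; the grids and the Gaussian are arbitrary choices); it approximates the integral defining F directly on a grid.

    import numpy as np

    y = np.linspace(-20, 20, 4001)
    dy = y[1] - y[0]
    xi = np.linspace(-5, 5, 201)

    f = np.exp(-y ** 2 / 2)                     # Gaussian, a fixed point of F
    f_prime = -y * np.exp(-y ** 2 / 2)          # its derivative

    kernel = np.exp(-1j * np.outer(xi, y))
    Ff = kernel @ f * dy / np.sqrt(2 * np.pi)   # (F f)(xi) = (2 pi)^(-1/2) * integral of f(y) e^{-i xi y}
    Ff_prime = kernel @ f_prime * dy / np.sqrt(2 * np.pi)

    print(np.max(np.abs(Ff - np.exp(-xi ** 2 / 2))))    # F f = f for this Gaussian
    print(np.max(np.abs(Ff_prime - 1j * xi * Ff)))      # F(f') = i xi F f
    print(np.sum(np.abs(f) ** 2) * dy,                  # ||f||_2^2 ...
          np.sum(np.abs(Ff) ** 2) * (xi[1] - xi[0]))    # ... equals ||F f||_2^2 (both approx sqrt(pi))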

As a direct consequence we obtain an embedding theorem for Sobolev


spaces into classical function spaces.

Theorem 2.1.36 (Sobolev embedding). Let k, n ∈ N with k > n/2. Then one has
the inclusion H k (Rn ) ⊂ C0 (Rn ). Hence, H k (Rn ) ⊂ C m (Rn ) ∩ C0 (Rn ) if k − m > n/2.
In particular ∩k∈N H k (Rn ) ⊂ C ∞ (Rn ).

Proof. Let k, n ∈ N be such that 2k > n. It follows from Proposition 2.1.35 that
x^α F f ∈ L2 (Rn ) for all |α| ≤ k. It then follows that for some constants cα ≥ 0
∫_{Rn} (1 + |x|^2 )^k |(F f )(x)|^2 dx = Σ_{|α|≤k} cα ∫_{Rn} (x^α )^2 |(F f )(x)|^2 dx < ∞.

This shows (1 + |x|2 )k/2 F f ∈ L2 (Rn ). Now we obtain with the Cauchy–
Schwarz inequality that

kF f k1 ≤ k(1 + |x|2 )k/2 F f k2 k(1 + |x|2 )−k/2 k2 .

By the above calculation the first factor is finite, whereas the second
factor is finite because of the assumption k > n/2 (note that (1 + |x|^2 )^{−α} is inte-
grable if and only if 2α > n). Hence, F f ∈ L1 (Rn ). It now follows from the
Riemann–Lebesgue lemma (Lemma 2.1.32) that f = F^{−1} F f ∈ C0 (Rn ).
For the proof of the higher inclusions observe that if k − 1 > n/2 we have
Dj f ∈ H k−1 (Rn ) for all j = 1, . . . , n. Applying the just shown result to the
partial derivatives, we obtain Dj f ∈ C0 (Rn ). This shows f ∈ C 1 (Rn ). The
general case now follows inductively.

Remark 2.1.37 (More general Sobolev embeddings). More generally one


can show that if Ω ⊂ Rn is open and bounded and has C 1 -boundary, then for
k, m ∈ N and p ∈ [1, ∞) with k − m > n/p one has the inclusion W k,p (Ω) ⊂ C m (Ω).
In this case there exists a universal constant C ≥ 0 such that

kf kC m (Ω) ≤ C kf kW k,p (Ω) .


In fact, the existence of such a constant follows abstractly from the em-
bedding result by the closed graph theorem (Theorem 2.2.9) that will be
introduced later in the lecture. The Fourier transform has a lot of important
applications as it gives a very easy description of very important operators in
analysis. We illustrate this fact with an easy example.

Example 2.1.38 (Elliptic regularity of the Laplacian on L2 ). Recall the in-


homogeneous problem −∆u + u = f on Rn considered in Example 2.1.30. We
have seen that for each f ∈ L2 (Rn ) this elliptic problem has a unique weak
solution u ∈ H 1 (Rn ), i.e. u satisfies
∫_{Rn} ∇u·∇ϕ + ∫_{Rn} uϕ = ∫_{Rn} f ϕ

for all test functions ϕ ∈ Cc∞ (Rn ). In analogy to the notion of weak derivatives,
we say a function w ∈ L1loc (Rn ) is the (unique) weak Laplacian of some function
u ∈ L1loc (Rn ) provided the integration by parts formula
∫_{Rn} u∆ϕ = ∫_{Rn} wϕ

holds for all test functions ϕ ∈ Cc∞ (Rn ). Using this terminology for the weak
solutions of −∆u + u = f , we see that because of
− ∫_{Rn} u∆ϕ = ∫_{Rn} ∇u·∇ϕ = ∫_{Rn} (f − u)ϕ

for all ϕ ∈ Cc∞ (Rn ) that the weak Laplacian of u lies in L2 (Rn ) and satisfies
−∆u = f − u. Since ∆u ∈ L2 (Rn ), it follows from Proposition 2.1.35 that |x|^2 F u
and therefore also (1 + |x|^2 )F u is square integrable. We now want to show
that all mixed second partial derivatives of u exist and are square integrable. By
Proposition 2.1.35 this is equivalent to xi xj F u ∈ L2 (Rn ) for all i, j = 1, . . . , n.
Fix such i, j. Then
∫_{Rn} |xi xj (F u)(x)|^2 dx = ∫_{Rn} (xi xj /(1 + |x|^2 ))^2 (1 + |x|^2 )^2 |(F u)(x)|^2 dx
≤ sup_{x∈Rn} (xi xj /(1 + |x|^2 ))^2 ∫_{Rn} (1 + |x|^2 )^2 |(F u)(x)|^2 dx ≤ ku − ∆uk_{L2 (Rn )}^2 = kf k_{L2 (Rn )}^2 .

A similar estimate holds for the first derivatives. Hence, u ∈ H 2 (Rn ) whenever
∆u ∈ L2 (Rn ). In particular, the weak solution u ∈ H 1 (Rn ) of −∆u + u = f
automatically has the higher regularity u ∈ H 2 (Rn ). Now suppose that the
inhomogeneity even satisfies f ∈ H 1 (Rn ). Then ∆u = u − f ∈ H 1 (Rn ). Using
a variant of the above arguments we see that u ∈ H 3 (Rn ). Iterating this


argument we see that f ∈ H k (Rn ) implies the higher regularity u ∈ H k+2 (Rn ).
This is the so-called elliptic regularity of the Laplace operator. Note that this in
particular implies by the Sobolev embedding Theorem 2.1.36 that u ∈ C ∞ (Rn )
whenever f ∈ C ∞ (Rn ) and additionally has suitable decay of all derivatives
such that f lies in Sobolev spaces of arbitrary high order.

2.2 Symmetric and Self-Adjoint Operators


2.2.1 Unbounded Operators on Hilbert Spaces
Let us now consider for a moment the most prototypical operators in quantum
mechanics, namely the position operator x̂ and the momentum operator p̂
for a single one-dimensional particle. Such a particle can be modeled in the
Hilbert space H = L2 (R). Explicitly one then has (ignoring physical constants)

x̂ : f 7→ [x 7→ x · f (x)],   p̂ : f 7→ [x 7→ i (d/dx)f (x)].
Clearly, both x̂ and p̂ are linear. However, they do not define bounded operators
on L2 (R). There are two (closely related) obstructions:

1. x̂(f ) does not lie in L2 (R) for all f ∈ L2 (R) (for example one can take
f (x) = (1/x) 1[1,∞) (x)). In the same spirit not every function in L2 (R) has a
(weak) derivative in L2 (R).

2. Both operators are not bounded when restricted to their maximal do-
mains of definition. For example, one has

sup kx̂(f )k = ∞.
kf k2 ≤1:xf ∈L2 (R)

For applications in quantum mechanics one therefore has to study un-


bounded linear operators which are not defined on the whole Hilbert space.

Definition 2.2.1. An unbounded (linear) operator (A, D(A)) on a Hilbert space


H is the datum of a linear subspace D(A) ⊂ H and a linear operator A : D(A) →
H.

In particular, we say that two unbounded operators (A, D(A)) and (B, D(B))
agree and one writes A = B if and only if D(A) = D(B) and Ax = Bx for
all x ∈ D(A) = D(B). If one has D(A) ⊂ D(B) and Bx = Ax for all x ∈ D(A),
we say that B is an extension of A. Moreover, we define the sum of two
unbounded operators (A, D(A)) and (B, D(B)) on the same Hilbert space H
in the natural way: (D(A + B), A + B) is given by D(A + B) = D(A) ∩ D(B) and
(A + B)x B Ax + Bx. In particular, the sum of an unbounded operator (A, D(A))


and a bounded operator B ∈ B(H) is defined on D(A). In a similar fashion


one defines the composition of A and B in the natural way with the domain
D(AB) = {x ∈ D(B) : Bx ∈ D(A)}.

Remark 2.2.2. Be careful: by our definition an unbounded operator (A, D(A))


can be bounded in the sense of Definition 2.1.12. Our definition should be
understood in the sense of a not necessarily bounded operator.

Recall that by the postulates of quantum mechanics the possible outcomes


of a physical measurement of a quantum mechanical system are determined
by the spectrum of the self-adjoint operator associated to this measurement.
Hence, the physical treatment crucially relies on the spectral properties of
those operators. It is now time to define in a precise way the spectrum of
unbounded operator. We first define the inverse of an unbounded operator.

Definition 2.2.3. Let (A, D(A)) be an unbounded operator on some Hilbert


space H. If A is injective, we can define the operator (A−1 , D(A−1 )) with
D(A−1 ) = Rg A and A−1 y = x if and only if Ax = y. The operator (A, D(A)) is
called invertible if for each y ∈ H there exists a unique x ∈ D(A) with Ax = y,
i.e. A : D(A) → H is bijective.

This now allows us to define the spectrum of an unbounded operator,


generalizing the concept of an eigenvalue in the finite dimensional case.

Definition 2.2.4 (Spectrum and resolvent set of an operator). Let (A, D(A))
be an (unbounded) operator on some Hilbert space H. We call the set

ρ(A) B {λ ∈ K : λ Id −A is invertible}

the resolvent set of A. Its complement σ (A) B K \ ρ(A) is called the (mathemat-
ical) spectrum of A.

Clearly, every eigenvalue λ ∈ C of A lies in the spectrum σ (A) because


λ Id −A is not injective in this case. If H is finite dimensional and A : H → H is
linear, then λ Id −A is injective if and only if λ Id −A is bijective. Hence, in the
finite dimensional case, the spectrum of A consists exactly of the eigenvalues
of A.
One can show that a bounded operator on a complex Hilbert space always
has non-empty spectrum. Moreover, one can show that the spectrum of an
operator is always a closed subset of the complex plane. If A is moreover
bounded, then σ (A) is bounded as well. This shows that a bounded operator
can never model a physical system in which arbitrary high measure outcomes,
e.g. arbitrary high energies, are possible. Further, it can happen that even a
bounded operator on a complex Hilbert space has no eigenvalues in contrast
to the finite dimensional case.


Example 2.2.5. On H = L2 ([0, 1]) consider the spatially bounded analogue of


the position operator given by

(T f )(x) = xf (x).

Then T ∈ B(H) with kT k = 1. However, T has no complex eigenvalue. For if


λ ∈ C is an eigenvalue of T , then T f = λf for some f ≠ 0. Hence, xf (x) = λf (x)
almost everywhere on [0, 1]. Equivalently, (x − λ)f (x) = 0 almost everywhere.
But this implies f = 0 as element in L2 ([0, 1]), in contradiction to our as-
sumption. Nevertheless 0 ∈ σ (T ) since T is not surjective. For example, take
g(x) = x^{1/2} ∈ L2 ([0, 1]). If f ∈ L2 ([0, 1]) such that T f = g, i.e. xf (x) = x^{1/2} , then
f (x) = x^{−1/2} . But we clearly have
∫_0^1 |f (x)|^2 dx = ∫_0^1 (1/x) dx = ∞,

so f ∉ L2 ([0, 1]). With similar arguments one can further show that [0, 1] ⊂
σ (T ). Note that on a (until now) formal level the Dirac measure δ0 at zero is
an eigenvector of T for the eigenvalue 0 because of

T δ0 = xδ0 = 0.

Here xδ0 is the Dirac measure with the density x, i.e. the Borel measure
A 7→ ∫_A x dδ0 . But please be aware that δ0 is not an element of L2 ([0, 1]) as it is
not even a function. At the end of the lecture we will give mathematical sense
to such expressions by introducing the theory of distributions. Moreover, note
that the same discussion applies to the unbounded position operator on L2 (R),
where one even obtains R ⊂ σ (x̂).
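Although T has no eigenvalues, every λ ∈ (0, 1) admits approximate eigenvectors, which is one way to see [0, 1] ⊂ σ(T). The following numerical sketch (assuming NumPy; λ = 0.3 and the grid are arbitrary choices) illustrates this with normalized indicator functions concentrating at λ.

    import numpy as np

    x = np.linspace(0, 1, 200001)
    dx = x[1] - x[0]
    lam = 0.3

    for n in (10, 100, 1000):
        f = (np.abs(x - lam) <= 1.0 / n).astype(float)     # indicator of [lam - 1/n, lam + 1/n]
        f /= np.sqrt(np.sum(f ** 2) * dx)                  # normalize in L^2([0, 1])
        err = np.sqrt(np.sum(((x - lam) * f) ** 2) * dx)   # ||(T - lam) f||_2
        print(n, err)                                      # tends to zero, roughly like 1/n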

We will almost exclusively work with the class of closed operators on some
Hilbert space.

Definition 2.2.6. An unbounded operator (A, D(A)) on a Hilbert space H is


called closed if the graph

G(A) = {(x, y) ∈ D(A) × H : Ax = y}

of A is a closed subspace of H × H. An unbounded operator (A, D(A)) is called


closable if G(A) is the graph of a (necessarily closed) unbounded operator
(A, D(A)) on H. In this case (A, D(A)) is called the closure of A.

Observe that if (A, D(A)) is an injective closed operator, then the graph of
A−1 is given by G(A−1 ) = {(y, x) : (x, y) ∈ G(A)}. This shows that an injective
operator A is closed if and only if A−1 is closed. In particular, if ρ(A) is non-
empty, then λ − A is invertible for some λ ∈ C. Hence, (λ − A)−1 is bounded


and a fortiori closed. Now, it follows that λ − A and therefore also A are closed.
This shows that ρ(A) ≠ ∅ implies that A is closed. Hence, a reasonable spectral
theory is only possible for closed operators.
As a consequence we only encounter closed operators in our study of
the mathematics behind quantum mechanics. Indeed, by the postulates of
quantum mechanics a physical quantum mechanical system is modeled by
a self-adjoint operator on some complex Hilbert space H. The mathematical
spectrum σ (H) of H corresponds by the postulates of quantum mechanics to
the possible outcomes of a physical measurement of the system. Naturally,
one requires σ (H) ⊂ R (reality condition). A fortiori one has ρ(H) ≠ ∅. In other
words, there exists a λ ∈ C such that (λ − H)−1 is a bounded operator on the Hilbert space.
By the above reasoning this implies that H must be closed.

Remark 2.2.7. One can easily verify that the definition of a closed operator
is equivalent to the following condition: for (xn )n∈N ⊂ D(A) with xn → x and
Axn → y in H one has x ∈ D(A) and Ax = y.

The following example shows that the domain of an operator plays a


crucial role for its structural properties.
Example 2.2.8. Let H = L2 ([−1, 1]) and A = d/dx be the first derivative with
domain
D(A) = {f ∈ L2 ([−1, 1]) ∩ C 1 ((−1, 1)) : f 0 ∈ L2 ([−1, 1])}.
Then (A, D(A)) is an unbounded operator which is not closed. For this consider
the smooth functions given by fn (x) = (x^2 + 1/n)^{1/2} for n ∈ N and the function
f (x) = |x|. Then one has by the dominated convergence theorem or a direct
calculation that
lim_{n→∞} ∫_{−1}^{1} |fn (x) − f (x)|^2 dx = 0

and by the same reasoning


lim_{n→∞} ∫_{−1}^{1} |fn′(x) − f ′(x)|^2 dx = lim_{n→∞} ∫_{−1}^{1} | x/(x^2 + 1/n)^{1/2} − sign x |^2 dx = 0,

where the derivative of f is understood in the weak sense. From this one sees
that (fn , Afn ) ∈ G(A) with (fn , Afn ) → (f , f 0 ) in H×H. Hence, G(A) is not closed
in H × H and (A, D(A)) does not define a closed operator.
However, if we choose B = d/dx with D(B) = H 1 ((−1, 1)), one can immediately
verify that B is a closed operator on H. Moreover, it follows from the above
calculations that G(A) ⊂ G(B). From this one sees that A is closable and that B
is an extension of A.
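The two limits used in this example are easy to confirm numerically. The sketch below (assuming NumPy; the grid is an arbitrary choice) computes the L2 distances of fn and fn′ to |x| and sign x on [−1, 1].

    import numpy as np

    x = np.linspace(-1, 1, 400001)
    dx = x[1] - x[0]

    for n in (10, 100, 1000, 10000):
        f_n = np.sqrt(x ** 2 + 1.0 / n)
        f_n_prime = x / np.sqrt(x ** 2 + 1.0 / n)
        e0 = np.sqrt(np.sum((f_n - np.abs(x)) ** 2) * dx)         # distance of f_n to |x| in L^2
        e1 = np.sqrt(np.sum((f_n_prime - np.sign(x)) ** 2) * dx)  # distance of f_n' to sign in L^2
        print(n, e0, e1)                                           # both errors tend to zero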


In particular notice that in the above example one only obtains a closed
operator if the domain consists of Sobolev and not classically differentiable
functions. This is the prototypical behaviour for all differential operators.
One has the following fundamental theorem on closed operators.

Theorem 2.2.9 (Closed Graph Theorem). Let (A, D(A)) be a closed operator on
some Hilbert space H with D(A) = H. Then A is bounded, i.e. A ∈ B(H).

Let us briefly discuss the consequences of the closed graph theorem for the
mathematical description of quantum mechanics. A typical quantum mechan-
ical system is described by a closed unbounded operator H (the Hamiltonian)
on some Hilbert space H. The closed graph theorem implies that one then
automatically has D(H) ( H, i.e. one has to restrict the domain of the operator.

2.2.2 The Difference Between Symmetry and Self-Adjointness


Recall that by the postulates of quantum mechanics a physical observable
is described by a self-adjoint operator some Hilbert space. More precisely,
one can only obtain elements of the spectrum as possible outcomes of the
measurement of a pure quantum mechanical state. This fact is guaranteed by
the postulated self-adjointness of the operator associated to the observable.
In analogy to what we have learned in course in linear algebra, this is often
defined as follows in the physics literature.

Definition 2.2.10. An operator (A, D(A)) on a Hilbert space H is called sym-


metric if
hx|Ayi = hAx|yi for all x, y ∈ D(A).

Now let 0 ≠ x ∈ D(A) be an eigenvector of A, i.e. Ax = λx for some λ ∈ C.


Then
λ̄hx|xi = hλx|xi = hAx|xi = hx|Axi = hx|λxi = λhx|xi.
Hence, (λ − λ̄)hx|xi = 0. Since x ≠ 0 we must have λ = λ̄ which is equivalent to
λ ∈ R. This shows that the symmetry of (A, D(A)) implies that all eigenvalues
of A are real. However, as we have already seen the spectrum σ (A) may not
contain one single eigenvalue. Hence, it is not clear that one has σ (A) ⊂ R for
a symmetric operator A. In fact, this is not even true.
The next example shows that the symmetry of an operator A is not suffi-
cient to guarantee that the spectrum of A is real.
Example 2.2.11. Let H B L2 ([0, ∞)) and consider A = i d/dx with domain D(A) =
Cc∞ ((0, ∞)). For f , g ∈ Cc∞ ((0, ∞)) we have by integration by parts
hf |Agi = ∫_0^∞ f (x)ig′(x) dx = i[f (x)g(x)]_0^∞ − i ∫_0^∞ f ′(x)g(x) dx


= ∫_0^∞ if ′(x)g(x) dx = hAf |gi.

This shows that A is symmetric. However, it follows along the same line of
arguments as in Example 2.2.8 that A is not closed. Hence, by the comments
after Definition 2.2.6 one has σ (A) = C.

Before going further, we prove a very useful lemma for symmetric opera-
tors.

Lemma 2.2.12. Let (A, D(A)) be a symmetric operator on some Hilbert space H.
Then for all λ ∈ C we have

k(λ − A)xk ≥ |Im λ| kxk for all x ∈ D(A).

Proof. Let x ∈ D(A). By symmetry, we have hAx|xi = hx|Axi. Further, by the


properties of a scalar product one also has that hAx|xi is the complex conjugate of hx|Axi. This shows that
hx|Axi must be a real number. Hence,

Im(hx|(λ − A)xi) = Im λhx|xi.

Taking absolutes values and using the Cauchy–Schwarz inequality we obtain

|Im λ| kxk2 ≤ |hx|(λ − A)xi| ≤ k(λ − A)xk kxk .

Now, if x = 0 the assertion trivially holds. In the case x ≠ 0 the inequality


follows by dividing both sides with kxk.
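In finite dimensions the estimate is easy to observe: for a Hermitian matrix A the operator λ Id − A is normal, so its smallest singular value equals min_j |λ − a_j | ≥ |Im λ| over the (real) eigenvalues a_j . A small numerical sketch (assuming NumPy; the random 6 × 6 matrix and the value of λ are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
    A = (B + B.conj().T) / 2                                  # a Hermitian (bounded symmetric) operator on C^6

    lam = 0.7 + 2.0j
    smin = np.linalg.svd(lam * np.eye(6) - A, compute_uv=False).min()
    print(smin, abs(lam.imag))                                # smin >= |Im lam| = 2.0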

The above lemma has the following very useful consequences which sim-
plify the study of symmetric operators substantially.

Corollary 2.2.13. Let (A, (D(A)) be a symmetric operator on some Hilbert space
H. Then the following are equivalent for λ ∈ C \ R.

(i) λ ∈ ρ(A);

(ii) λ − A is surjective.

Proof. The first condition implies the second by definition. For the converse
observe that λ − A is injective by Lemma 2.2.12 or the fact that a symmetric
operator does not have any eigenvalues. Hence, λ − A is bijective. Now, it
follows from the estimate in Lemma 2.2.12 that for y ∈ H with (λ − A)x = y we
have

k(λ − A)−1 yk = kxk ≤ |Im λ|−1 k(λ − A)xk = |Im λ|−1 kyk.

This shows that k(λ − A)−1 k ≤ |Im λ|−1 . Hence, λ ∈ ρ(A).


Corollary 2.2.14. Let (A, D(A)) be a symmetric operator on some Hilbert space
H. If λ − A is surjective for some λ ∈ C \ R, then D(A) is dense in H.

Proof. Assume that this is not the case. Then the closure of D(A) is a proper
subspace of H. Hence, there exists y ∈ H \ D(A). Let P be the orthogonal
projection onto D(A). Replacing y by y − P y if necessary, we may assume
that y is orthogonal to all elements of D(A). By assumption, there exists
0 ≠ x ∈ D(A) with (λ − A)x = y. Now, on the one hand

0 = hx|yi = hx|(λ − A)xi

On the other hand we have shown in Lemma 2.2.12 that

|Imhx|(λ − A)xi| = |Im λ| kxk2 > 0.

This is a contradiction. Hence, D(A) must be dense in H.

Note that by Remark 2.1.37 one has the Sobolev embedding H 1 ((0, 1)) ⊂
C([0, 1]). More precisely, this means that every function f ∈ H 1 ((0, 1)) has an
(automatically unique) representative in C([0, 1]). Note that in particular it
then makes sense to talk about point evaluations of f , e.g. f (0) or f (1). Please
remember that point evaluations are not well-defined for general functions
in L2 ([0, 1]). The above inclusion into the continuous functions allows us to
define the following subspace of H 1 ((0, 1)).

Definition 2.2.15. Let a < b ∈ R. Then we define the Sobolev space with
vanishing boundary values as

H01 ([a, b]) = {f ∈ H 1 ((a, b)) : f (a) = f (b) = 0}.

In the case of a one-sided unbounded interval we set in a similar manner

H01 ([a, ∞)) = {f ∈ H 1 ((a, ∞)) : f (a) = 0}.

In an analogous way one also defines the spaces H01 ((−∞, b]).

Remark 2.2.16. Using the methods of the proof of Theorem 2.1.36 and a trun-
cation argument one can show that Cc∞ ((0, ∞)) is a dense subset of H01 ([0, ∞)).
Analogously, Cc∞ ((a, b)) is a dense subset of H01 ([a, b]).

In Example 2.2.11, one may argue that the essential obstruction is the fact
that A is not closed. Indeed, this fact was used to show σ (A) = C. However,
there are even more obstructions.


Example 2.2.17 (Example 2.2.11 continued). Now consider the same map-
ping A = i d/dx, but with the different domain D(A) = H01 ([0, ∞)). Now the
derivative has to be understood in the weak sense. Then A is a closed operator.
This can be checked as in the previous examples: for the additional prob-
lem of the boundary values please observe that fn → f in H 1 ((0, ∞)) implies
fn (x) → f (x) for all x ∈ [0, ∞) by the estimate for the Sobolev embedding
theorem (see end of Remark 2.1.37). It is not clear that A is still symmetric
on the bigger domain which we use now. Until now, we only know from
Example 2.2.11 that
hf |Agi = hAf |gi
for all f , g ∈ Cc∞ ((0, ∞)). Now let f , g ∈ H01 ([0, ∞)). Since Cc∞ ((0, ∞)) is dense
in this space by the remark made before this example, there exist sequences (fn )n∈N and
(gn )n∈N in Cc∞ ((0, ∞)) such that fn → f and gn → g in H01 ([0, ∞)). A fortiori,
we have fn → f and gn → g in L2 ([0, ∞)). Moreover, for h ∈ H01 ([0, ∞)) we have

kAhk2 = kh′ k2 ≤ khkH01 ([0,∞)) .

This shows that A is continuous as a mapping from D(A) = H01 ([0, ∞)) to
L2 ([0, ∞)). In particular fn → f in H01 ([0, ∞)) implies Afn → Af in L2 ([0, ∞)).
Using the continuity of the scalar product we obtain

hf |Agi = lim hfn |Agn i = lim hAfn |gn i = hAf |gi.


n→∞ n→∞

Hence, (A, D(A)) is still symmetric. Note that loosely spoken the symmetry
gets increasingly difficult to achieve for bigger domains. For example, if we
would use D(A) = H 1 ([0, ∞)) instead, in the integration by parts argument the
term f (0)g(0) would not vanish in general. Hence, A is not symmetric with
this even bigger domain.
We now deal with the spectrum of A. Let us first consider λ ∈ C with
Im λ < 0. Note that by Corollary 2.2.13 the essential question is whether λ − A
is surjective. For this let g ∈ L2 ([0, ∞)). We have to find a (necessarily unique)
f ∈ H01 ([0, ∞)) with
λf − Af = λf − if 0 = g.
Using the well-known variation of parameters method we can write down the
very reasonable candidate (ignoring problems such as weak differentiation
for a moment)
f (t) = i ∫_0^t g(s)e^{−iλ(t−s)} ds.
We now check that f is square integrable. For this observe that by the Cauchy–
Schwarz inequality
|f (t)|^2 ≤ ( ∫_0^t |g(s)| e^{Im λ(t−s)} ds )^2 ≤ ∫_0^t |g(s)|^2 e^{Im λ(t−s)} ds · ∫_0^t e^{Im λ(t−s)} ds


≤ (1/|Im λ|) ∫_0^t |g(s)|^2 e^{Im λ(t−s)} ds.

Integration of this inequality yields with Fubini’s theorem


∫_0^∞ |f (t)|^2 dt ≤ (1/|Im λ|) ∫_0^∞ ∫_0^t |g(s)|^2 e^{Im λ(t−s)} ds dt
= (1/|Im λ|) ∫_0^∞ |g(s)|^2 ∫_s^∞ e^{Im λ(t−s)} dt ds = (1/|Im λ|) ∫_0^∞ |g(s)|^2 ∫_0^∞ e^{Im λ t} dt ds
= (1/|Im λ|^2 ) ∫_0^∞ |g(s)|^2 ds < ∞.

Hence, f ∈ L2 ([0, ∞)) with kf k2 ≤ |Im λ|−1 kgk2 . Now a calculation similar to
that of Example 2.1.23 shows that f is weakly differentiable with if 0 = λf − g.
It follows that f 0 ∈ L2 ([0, ∞)) and therefore f ∈ H 1 ([0, ∞)). In particular, f is
continuous (this can also be easily verified directly by using the dominated
convergence theorem) with f (0) = 0. Altogether this shows that f ∈ H01 ([0, ∞))
with (λ − A)f = g. Corollary 2.2.13 shows that λ ∈ ρ(A) whenever Im λ < 0.
Hence, the open lower half-plane is contained in ρ(A).
However, the situation is different for the upper half-plane. For example,
take λ = i and g(t) = e−t ∈ L2 ([0, ∞)). In this case the resolvent equation is the
ordinary differential equation −if (t) − if 0 (t) = e−t with the initial condition
f (0) = 0 whose unique solution (note that the right hand side lies in Sobolev
spaces of arbitrary order, by bootstrapping this shows that a solution f ∈
H 1 ([0, ∞)) automatically satisfies f ∈ C ∞ ((0, ∞)) and therefore is a classical
solution of the ODE) is given by
f (t) = i ∫_0^t e^{−s} e^{t−s} ds = ie^t ∫_0^t e^{−2s} ds = (i/2) e^t (1 − e^{−2t} )

which satisfies |f (t)| → ∞ as t → ∞ and therefore cannot be square integrable.


As a consequence we obtain i ∈ σ (A). A similar argument applies for all λ ∈ C
with positive imaginary part. Since the spectrum is always closed, this shows
that σ (A) = {λ ∈ C : Im λ ≥ 0}.
Note that if we would have worked with B = −i d/dx and the same domain
instead, then σ (B) = {λ ∈ C : Im λ ≤ 0}. Concluding our discussion, we see
that although A and B are closed symmetric operators, the spectrum may
consist of non-real numbers. By using the operator A ⊕ B and the direct sum
L2 ([0, ∞)) ⊕ L2 ([0, ∞)) one moreover obtains an example of a closed symmetric
operator for which the spectrum consists of the entire complex plane.
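For the concrete inhomogeneity g(t) = e^{−t} the candidate formula can be evaluated in closed form: λ = −i (in the lower half-plane) gives f(t) = i t e^{−t} ∈ L2([0, ∞)), while λ = i gives the growing solution found above. The following sketch (assuming NumPy; the grid length 40 is an arbitrary truncation of [0, ∞)) compares the two L2 norms numerically.

    import numpy as np

    t = np.linspace(0, 40, 400001)
    dt = t[1] - t[0]

    f_minus = 1j * t * np.exp(-t)                          # candidate for lam = -i (lower half-plane)
    f_plus = 0.5j * np.exp(t) * (1 - np.exp(-2 * t))       # candidate for lam = +i (upper half-plane)

    print(np.sqrt(np.sum(np.abs(f_minus) ** 2) * dt))      # finite (equal to 1/2)
    print(np.sqrt(np.sum(np.abs(f_plus) ** 2) * dt))       # huge, and it grows with the truncation length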

Remark 2.2.18 (Physical interpretation). Let us try to give a physical expla-


nation of the above result from the naive standpoint of a mathematician. In
fact, the above result may be surprising from a physical point of view. The


Hilbert space L2 ([0, ∞)) should be associated to a quantum system which


describes a one-sided well given by the potential

V (x) = 0 for x > 0,   V (x) = ∞ for x ≤ 0.

Indeed, we have shown that the momentum operator p̂ = −i d/dx on the half-line
[0, ∞) with zero boundary condition is not self-adjoint and therefore is not a
physical reasonable observable. Even worse, we will later show that the oper-
ator −i d/dx defined on Cc∞ ((0, ∞)) has no self-adjoint extensions. Clearly, if ψ is
a quantum mechanical wave function which is localized in a compact subset
of (0, ∞), the time evolution of the system should be given for sufficiently
small times by the time evolution of a free particle, i.e. by translation as we
have seen in the physics part of the lecture. Hence, on Cc∞ ((0, ∞)) the single
choice we have is to set p̂ = −i d/dx. But since this operator does not have any
self-adjoint extensions, by the postulates of quantum mechanics, there does
not exist a well-defined momentum observable. Clearly, this odd behaviour
should arise from the infinite high well at zero.
Since the well is infinitely high, we would expect that a wave function
ψ is totally reflected at zero (or at least with some phase shift), i.e. the time
evolution gets an immediate phase shift of π. You probably all know this kind
of reasoning from classical mechanics where this argument works perfectly
fine. But now for simplicity think of a wave function of the form ψ(x) = 1[0,1] .
Then one has kψk = 1. Let us for simplicity assume that the time evolution
operators U (t) of the momentum operator are given by translation of one per
time unit. Assuming total reflection, the state has evolved after a time span of
1/2 to the state U (1/2)ψ which satisfies

kU (1/2)ψk2^2 = k2 · 1[0,1/2] k2^2 = 2 ≠ 1.

Hence, U (t) does not preserve the norm of the state and therefore violates a
fundamental principle of quantum mechanics. Again physical and mathemat-
ical reasoning fits perfectly together!
The occurrence of half-planes as the spectrum of symmetric operators is
no coincidence as the next result shows. Before that we introduce the resolvent
of an operator and some technical tools.
Definition 2.2.19 (Resolvent). Let (A, D(A)) be an unbounded operator on
some Hilbert space H. Then the mapping

R(·, A) : ρ(A) → B(H)


λ 7→ R(λ, A) B (λ − A)−1

is called the resolvent of A.


The next elementary result plays a fundamental role in the study of opera-
tors on Banach or Hilbert spaces.

Lemma 2.2.20 (Neumann series). Let X be a Banach space and T ∈ B(X) a


bounded operator. If kT k < 1, then Id −T is invertible. Moreover, one has

(Id −T )^{−1} = Σ_{n=0}^{∞} T^n .

In particular, if T ∈ B(H) is invertible and kT − Sk < kT −1 k−1 for some S ∈ B(H),


then S is invertible as well.

Proof. For the first assertion, simply calculate


(Id −T ) Σ_{n=0}^{N} T^n = Σ_{n=0}^{N} T^n − Σ_{n=0}^{N} T^{n+1} = Id −T^{N +1} ,

and the series Σ T^n converges absolutely because of kT k < 1. The first assertion now follows by taking
limits. For the second assertion write

S = T + S − T = T (Id +T −1 (S − T )).

Note that T is invertible because of the assumptions and that Id +T −1 (S − T )


is invertible because of the first part of this lemma and the estimate

kT −1 (S − T )k ≤ kT −1 k kT − Sk < 1.

Hence, S is invertible as the composition of two invertible operators.
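A finite-dimensional sketch of the Neumann series (assuming NumPy; the random 5 × 5 matrix, rescaled to operator norm 0.9, is an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(1)
    T = rng.standard_normal((5, 5))
    T *= 0.9 / np.linalg.norm(T, 2)               # rescale so that the operator norm is 0.9 < 1

    inv = np.linalg.inv(np.eye(5) - T)            # (Id - T)^{-1} computed directly
    S, P = np.zeros((5, 5)), np.eye(5)
    for n in range(200):                          # partial sums of the series sum_n T^n
        S += P
        P = P @ T
    print(np.linalg.norm(S - inv, 2))             # tiny: the series sums to the inverse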

Note that it follows from the above lemma that the set of invertible opera-
tors is open in B(X) for every Banach space X. Moreover, one obtains directly
the previously stated fact that the resolvent set ρ(A) of an unbounded operator
(A, D(A)) is open (and as a consequence that the spectrum σ (A) is closed). We
are now ready to describe the spectrum of a symmetric operator.

Proposition 2.2.21. Let (A, D(A)) be a symmetric operator on some Hilbert space
H. If λ Id −A is surjective for some λ ∈ C with Im λ > 0 (resp. Im λ < 0), then the
whole upper (resp. lower) half-plane is contained in the resolvent set ρ(A) of A.

Proof. We assume without loss of generality that λ Id −A is surjective for some


λ ∈ C with Im λ > 0. It follows from Corollary 2.2.13 that λ ∈ ρ(A). Moreover,
we have seen in Lemma 2.2.12 that k(λ − A)xk ≥ |Im λ| kxk for all x ∈ D(A).
Taking inverses, we see that kR(λ, A)k = k(λ − A)−1 k ≤ 1/ |Im λ|. Further note
that for µ in the upper half plane

(µ − A)R(λ, A) = (λ − A + µ − λ)R(λ, A) = Id +(µ − λ)R(λ, A).


Hence, it follows from the Neumann series (Lemma 2.2.20) that the right
hand side and therefore µ − A is surjective and consequently µ ∈ ρ(A) by
Corollary 2.2.13 provided

|µ − λ| kR(λ, A)k < 1 ⇔ kR(λ, A)k < 1/|µ − λ| .

This is satisfied if
1/|Im λ| < 1/|µ − λ| ⇔ |µ − λ| < |Im λ| .

Hence, for every λ ∈ ρ(A) the ball with center λ and radius |Im λ| is contained
in ρ(A) as well. By taking bigger and bigger disks we see that the whole upper
half plane is contained in the resolvent set ρ(A).

As a direct corollary we obtain the following spectral properties of sym-


metric operators.

Corollary 2.2.22. Let (A, D(A)) be a symmetric operator on some Hilbert space
H. Then the spectrum σ (A) of A is given by one of the following four possibilities:
a closed subset of the real line, the upper or lower closed half-plane or the entire
complex plane.

Proof. By Proposition 2.2.21 the resolvent set ρ(A) contains a whole open half-
plane as soon as a single point of this half-plane is contained in ρ(A). Passing
to complements this means that either every point or no point of the open upper
half-plane {λ ∈ C : Im λ > 0} is contained in the spectrum σ (A). The same of course applies
to the lower half-plane. Since the spectrum σ (A) is closed, only the above listed
cases remain possible.

In fact, one can construct for each of the above listed closed sets a closed (!)
densely defined symmetric operator on some Hilbert space whose spectrum is
exactly that set. Hence, the above statement is the best possible in the general
case of a symmetric operator. In order to guarantee that the spectrum of an
unbounded operator is real we therefore need a stronger condition than mere
symmetry. It is now finally time to introduce the adjoint of a densely defined
operator on some Hilbert space.

Definition 2.2.23. Let (A, D(A)) be a densely defined operator on some Hilbert
space H. The adjoint A∗ of A is the operator defined by

D(A∗ ) = {x ∈ H : ∃y ∈ H with hx|Azi = hy|zi for all z ∈ D(A)}


A∗ x = y.


We now give some comments on the definition of the adjoint operator.


Let x ∈ D(A∗ ). Then it follows from the definition of D(A∗ ) that the linear
functional

ϕ : D(A) → K
z 7→ hx|Azi

is continuous, i.e. there exists a constant C ≥ 0 such that |ϕ(z)| ≤ C kzkH for all
z ∈ D(A). Since D(A) is dense in H, there exists a unique extension ϕ̃ ∈ H∗ of
ϕ to all of H. In fact by the definition, one has ϕ̃(z) = hy|zi for all z ∈ H. Note
that by the Riesz representation theorem (Theorem 2.1.18) each continuous
functional on H can be uniquely determined by such an element in H, in our
case by the element y. Hence, an equivalent definition of the domain is

D(A∗ ) = {x ∈ H : hx|A·i is continuous with respect to H}.

Note that by the above definition the element y is uniquely determined. This
shows that A∗ is well-defined. Moreover, it follows from a direct computation
that A∗ is indeed linear. Further, the adjoint is always closed.

Lemma 2.2.24. Let (A, D(A)) be a densely defined operator on some Hilbert space
H. Then the adjoint operator (A∗ , D(A∗ )) is closed.

Proof. Let (xn )n∈N ⊂ D(A∗ ) with xn → x and A∗ xn → y ∈ H. Then one has for
all z ∈ D(A)
hx|Azi = lim hxn |Azi = lim hA∗ xn |zi = hy|zi.
n→∞ n→∞
Hence, x ∈ D(A∗ ) and A∗ x = y. This shows that A∗ is a closed operator.

Let us as a first example calculate the adjoint operator of the variant of


the momentum operator considered in Example 2.2.11.

Example 2.2.25. Recall that the operator from Example 2.2.17 (the continuation
of Example 2.2.11) was given by A = i d/dx with domain D(A) = H01 ([0, ∞)). Assume
that f ∈ D(A∗ ). Then we have for all g ∈ D(A)
hA∗ f |gi = hf |Agi = ∫_0^∞ f (x)ig′(x) dx.

In particular, if we only consider real ϕ = g ∈ Cc∞ ((0, ∞)) and use the anti-
symmetry of the scalar product, we obtain
− ∫_0^∞ iA∗ f ϕ dx = − ∫_0^∞ f (x)ϕ′(x) dx.

Hence, f is weakly differentiable with f 0 = −iA∗ f . This shows that f ∈


H 1 ([0, ∞)) and A∗ f = if 0 . Altogether these arguments shows that D(A∗ ) ⊂


H 1 ([0, ∞)) and that A∗ is a restriction of the differential operator i d/dx with
domain H 1 ([0, ∞)). We now show that D(A∗ ) = H 1 ([0, ∞)). For this let
f ∈ H 1 ([0, ∞)). Then one has for all g ∈ D(A) = H01 ([0, ∞)) using integration by
parts that
hf |Agi = ∫_0^∞ f (x)ig′(x) dx = [if (x)g(x)]_0^∞ − i ∫_0^∞ f ′(x)g(x) dx
= ∫_0^∞ if ′(x)g(x) dx = hif ′|gi.

Note that the equality relying on integration by parts holds for all test functions
g ∈ Cc∞ ((0, ∞)) and extends to all g ∈ H01 ([0, ∞)) by continuity because the
test functions Cc∞ ((0, ∞)) are dense in H01 ([0, ∞)) as stated in Remark 2.2.16.
The above identity implies that H 1 ([0, ∞)) ⊂ D(A∗ ) and A∗ f = if 0 for all
f ∈ H 1 ([0, ∞)). Altogether, we have shown that A∗ = i d/dx with domain D(A∗ ) =
H 1 ([0, ∞)). Hence, we do not have A = A∗ . However, A∗ is an extension of A.
Note that (A∗ , D(A∗ )) is not symmetric although (A, D(A)) is symmetric.
Indeed, let f (x) = g(x) = e−x/2 ∈ H 1 ([0, ∞)). Then by integration by parts we
see that the symmetry is violated as
hf |A∗ gi = hf |ig′i = −i f (0)g(0) + hif ′|gi = −i + hA∗ f |gi ≠ hA∗ f |gi,
where the extra term −i stems from the boundary term f (0)g(0) = 1 in the integration by parts.

It is no coincidence that the adjoint operator A∗ in the last example is an


extension of A as the next lemma shows.

Lemma 2.2.26. Let (A, D(A)) be a densely defined symmetric operator on some
Hilbert space. Then (A∗ , D(A∗ )) is a closed extension of A. In particular, (A, D(A))
is closable.

Proof. We have seen in Lemma 2.2.24 that the adjoint is a closed operator. It
remains to show that A∗ extends A. Let x ∈ D(A). Then one has for all y ∈ D(A)
by the symmetry of A
hx|Ayi = hAx|yi.

Hence, x ∈ D(A∗ ) with A∗ x = Ax as desired.

It is now finally time to introduce self-adjoint operators. We will soon see


that for this definition the behavior is exactly as desired.

Definition 2.2.27. A densely defined operator (A, D(A)) on a Hilbert space


H is self-adjoint if A = A∗ . Moreover, a densely defined symmetric operator
(A, D(A)) is called essentially self-adjoint if its closure is self-adjoint.


Using this definition, we have shown in Example 2.2.25 that the operator
d
A = i dx with domain D(A) = H 1 ([0, ∞)) is not self-adjoint as the adjoint is a
true extension of A.
We now give some basic positive examples of self-adjoint operators.
Example 2.2.28 (Multiplication Operators). Let (Ω, Σ, µ) be a measure space
and m : Ω → R a measurable function. One defines the multiplication opera-
tor Mm on L2 (Ω) = L2 (Ω, Σ, µ) as
D(Mm ) = {f ∈ L2 (Ω) : m · f ∈ L2 (Ω)}
Mm f = m · f .
First observe that Mm is densely defined. For this consider the set An = {x ∈ Ω :
|m(x)| ≤ n} for n ∈ N. Since m is bounded on An , one has f 1An ∈ D(Mm ) for all
f ∈ L2 (Ω) and all n ∈ N. Moreover, it follows from the dominated convergence
theorem that f 1An → f in L2 (Ω) for all f ∈ L2 (Ω). Hence, D(Mm ) = L2 (Ω).
Note that since D(Mm ) is dense, we can now speak of the adjoint.
Now let us compute the adjoint of Mm . First assume that f , g ∈ D(Mm ).
Then one has
hf |Mm gi = ∫_Ω f mg dµ = ∫_Ω mf g dµ = hMm f |gi.
This shows that Mm is symmetric and (therefore automatically) Mm is a restric-
tion of (Mm )∗ . It remains to show that the converse inclusion (Mm )∗ ⊆ Mm
holds as well. Suppose that f ∈ D((Mm )∗ ). Then one has for all g ∈ D(Mm )
by definition of the adjoint
∫_Ω (Mm∗ f ) g dµ = hMm∗ f |gi = hf |Mm gi = ∫_Ω f mg dµ ⇔ ∫_Ω (Mm∗ f − mf ) g dµ = 0.

Since D(Mm ) is dense in L2 (Ω),


the above identity extends to all g ∈ L2 (Ω) by
continuity of the scalar product. In particular, we may choose g = Mm∗ f − mf
and obtain
∫_Ω |Mm∗ f − mf |^2 dµ = 0.

This shows that Mm∗ f = mf . In particular, mf ∈ L2 (Ω) and therefore f ∈
D(Mm ) with Mm f = mf = Mm∗ f . This shows (Mm )∗ ⊂ Mm . Altogether, we have
Mm = Mm∗ and Mm is self-adjoint.

Note that if we choose (R, B(R), λ) as measure space and m(x) = x as multi-
plier, we obtain the (one-dimensional) position operator from quantum me-
chanics. Similarly, for (R3 , B(R3 ), λ) and mj (x) = xj for j = 1, 2, 3 we obtain the
self-adjointness of the position operators x̂, ŷ and ẑ in three-dimensional
space. It is now time to treat some more basic operators / observables from
quantum mechanics which are given by differential operators. We start with
momentum operators.


Example 2.2.29 (Momentum operators). On Rn we consider the momentum



operator A = −i ∂/∂xj with domain

D(A) = {f ∈ L2 (Rn ) : Dj f exists in the weak sense and is in L2 (Rn )}.

It follows from the mapping properties of the Fourier transform on L2 (Rn )


stated in Proposition 2.1.35 that for f ∈ D(A) one has

−i ∂f /∂xj = −i F^{−1} Mixj F f = F^{−1} Mxj F f .


Moreover, it follows that the unitary operator F maps the domain of −i ∂/∂xj
bijectively onto the domain of the multiplication operator Mxj . This shows
that A = F −1 Mxj F with the appropriate domain mapping condition D(Mxj ) =
F (D(A)). Since the Fourier transform F on L2 (Rn ) is a unitary operator

by Theorem 2.1.34, we have shown that −i ∂/∂xj is unitarily equivalent to the
multiplication operator Mxj . Hence, the self-adjointness of the momentum
operators directly follows from the self-adjointness of the multiplication
operators shown in Example 2.2.28.

Notice that in the above example we have in particular considered the op-
erator −i d/dx on L2 (R) with domain H 1 (R) = H01 (R) which is a self-adjoint oper-
ator, whereas for the analogous situation on L2 ([0, ∞)) with domain H01 ([0, ∞))
one does not obtain a self-adjoint operator as shown in Example 2.2.25.
Using the same methods one can treat the Hamiltonian of a free particle
in Rn .

Example 2.2.30 (Hamiltonian of a free particle). On Rn consider the Hamil-


tonian of a free particle. In classical mechanics the Hamiltonian (the energy)
is given by H = (1/2m) Σ_{k=1}^{n} p_k^2 , where m denotes the mass of the particle and
the pk are the momenta in the respective directions. Hence, using the formal
quantization rules we obtain that the Hamiltonian is given by
Ĥ = (1/2m) Σ_{k=1}^{n} p̂_k^2 .

As usual in mathematics, ignoring physical constants or setting them equal to one,


one has
Ĥ = Σ_{k=1}^{n} ( i ∂/∂xk )^2 = − Σ_{k=1}^{n} ∂^2/∂xk^2 = −∆,

where ∆ denotes the Laplace operator. Of course, we are not done with this
formal calculation because we must also deal with the domain of Ĥ. Let us


start with the domain of ∂^2/∂xk^2 . For a function f ∈ L2 (Rn ) to be in its domain
we require that the k-th partial derivative of f exists and again lies in L2 (Rn ).
Moreover, also the weak k-th partial derivative of ∂f /∂xk must exist and lie in
L2 (Rn ). Taking the Fourier description obtained in Proposition 2.1.35 we see
that xk2 F f ∈ L2 (Rn ) for all k = 1, . . . , n. Equivalently, (1 + |x|2 )F f ∈ L2 (Rn ). As
already used in the proof of the Sobolev embedding theorem (Theorem 2.1.36)
this is equivalent to f ∈ H 2 (Rn ). Hence, Ĥ = −∆ with domain H 2 (Rn ). Arguing
as in the previous example, we see that −∆ is unitarily equivalent to the
multiplication operator M|x|2 via the Fourier transform. Hence, −∆ with
domain H 2 (Rn ) is a self-adjoint operator.

We now shortly comment on the easier case of adjoints of bounded opera-


tors.

Remark 2.2.31 (Adjoints of bounded operators). Let A ∈ B(H) be a bounded


operator on some Hilbert space H. In particular, we have D(A) = H. Then
it follows from the Cauchy–Schwarz inequality and the boundedness of the
operator that
|hf |Agi| ≤ kf k kAk kgk .
This shows that D(A∗ ) = H. Moreover, one has

kA∗ k = sup_{kf k≤1} kA∗ f k = sup_{kf k≤1} sup_{kgk≤1} |hg|A∗ f i| = sup_{kf k≤1} sup_{kgk≤1} |hAg|f i|
= sup_{kgk≤1} sup_{kf k≤1} |hAg|f i| = sup_{kgk≤1} kAgk = kAk .

In particular, A∗ is bounded if A is bounded. Moreover, this shows that ev-


ery bounded symmetric operator is already self-adjoint. Hence, the difficulties
vanish in the bounded case.

Let us also determine the adjoint of a bounded operator.

Example 2.2.32 (Adjoint of orthogonal projections). Let H be a Hilbert


space and M ⊂ H a closed subspace. We denote by P the orthogonal projection
onto M as introduced in Example 2.1.14. Then for every x ∈ H we have the
decomposition x = P x + (x − P x) into the orthogonal subspaces Im P and Ker P .
Now, for x, y ∈ H we have

hy|P xi = hP y + (y − P y)|P xi = hP y|P xi = hP y|P x + (x − P x)i = hP y|xi.

Hence, we have P = P ∗ , i.e. P is self-adjoint. In fact, the self-adjointness


characterizes orthogonal projections among all bounded projections. This is
left as an exercise to the reader.


2.2.3 Basic Criteria for Self-Adjointness


Until now we have checked the self-adjointness of concrete symmetric opera-
tors by direct calculations. As this can be a quite difficult endeavour, we now
give some characterizations and criteria for self-adjoint operators which will
simplify our life.
We will use the following lemma which is proved by an argument analo-
gous to the one used in the proof of Corollary 2.2.14.

Lemma 2.2.33. Let (A, D(A)) be a densely defined operator on some Hilbert space
H. Then the following are equivalent.

(i) A∗ is injective;

(ii) A has dense range.

Proof. (ii) ⇒ (i): Let x ∈ D(A∗ ) with A∗ x = 0. Then one has for all y ∈ D(A)

0 = hA∗ x|yi = hx|Ayi.

Since A has dense range, we conclude that 0 = hx|zi for all z ∈ H. Choosing
z = x, this shows x = 0. Hence, A∗ is injective.
(i) ⇒ (ii): Assume that A does not have dense range. Then the closure Rg A
is a proper closed subspace of H, so there exists some x ∈ H \ Rg A. By
replacing x by x − P x, where P is the orthogonal projection onto Rg A, we may
assume that x ≠ 0 and that x is orthogonal to Rg A. Then we have for all y ∈ D(A)
the identity 0 = hx|Ayi. This shows that x ∈ D(A∗ ) and A∗ x = 0. Since x ≠ 0 by
construction, A∗ cannot be injective, which contradicts the assumption.

The following theorem gives a very convenient characterization of self-


adjoint operators.

Theorem 2.2.34. Let (A, D(A)) be a symmetric operator on some Hilbert space H.
Then the following are equivalent.

(i) A is densely defined and self-adjoint;

(ii) A is closed, densely defined and Ker(A∗ ± i) = 0;

(iii) Rg(A ± i) = H.

Proof. (i) ⇒ (ii): If A is self-adjoint, one has A = A∗ . Since the latter is closed
by Proposition 2.2.24, A is a closed operator. Moreover, one has Ker(A∗ ± i) =
Ker(A ± i) = 0, since the latter is injective by Lemma 2.2.12.
(ii) ⇒ (iii): By Lemma 2.2.33 the operators A ± i have dense range. We now
show that this already implies that A ± i are surjective. We only treat the case
of A + i as the other case works completely analogously. Recall that one has


the estimate k(A + i)xk ≥ kxk for all x ∈ D(A) by Lemma 2.2.12. Now let y ∈ H.
Since Rg(A + i) is dense, there exist yn ∈ H and xn ∈ D(A) with (A + i)xn = yn
and yn → y in H. One has

kxn − xm k ≤ k(A + i)(xn − xm )k = kyn − ym k

Since the right-hand side converges, (yn )n∈N is a Cauchy sequence. By the above
estimate (xn )n∈N is a Cauchy sequence as well. Hence, there exists x ∈ H with
xn → x. To summarize we have xn → x and (A + i)xn → y. Since A is closed,
this implies x ∈ D(A) and (A + i)x = y. This shows that A + i is surjective.
(iii) ⇒ (i): Note first that by Lemma 2.2.14 the domain D(A) is dense. It
therefore makes sense to consider the adjoint. By Lemma 2.2.26, the adjoint
A∗ is an extension of A. It therefore remains to show that A extends A∗ . For
this let y ∈ D(A∗ ). By assumption, there exists x ∈ D(A) with (A + i)x = (A∗ + i)y.
Since A∗ extends A, we have (A∗ + i)x = (A∗ + i)y. It follows from Lemma 2.2.33
that (A − i)∗ = A∗ + i is injective. Hence, y = x ∈ D(A) and A extends A∗ .

We now show using the example of real multiplication operators that this
result simplifies verifying self-adjointness.

Example 2.2.35. Let (Ω, Σ, µ) be a measure space and Mm the multiplication


operator on L2 (Ω, Σ, µ) for some measurable m : Ω → R as in Example 2.2.28.
The symmetry (the easy part) can be checked by a direct calculation. We now
check the range condition of Theorem 2.2.34. For this we have to find for a
given g ∈ L2 a function f ∈ L2 with mf ∈ L2 such that
$$(M_m \pm i)f = (m \pm i)f = g \iff f = \frac{1}{m \pm i}\, g.$$
Note that |m ± i| ≥ 1 because m is a real function and therefore

kf k2 ≤ k1/(m ± i)k∞ kgk2 ≤ kgk2 .

This shows f ∈ L2 . Moreover, one has km/(m ± i)k∞ ≤ 1 which implies f ∈


D(Mm ). Hence, Mm is a symmetric operator with Rg(Mm ± i) = L2 (Ω). Theo-
rem 2.2.34 now implies that Mm is self-adjoint.

Note that the above argument works without checking the denseness of the
domain or determining the adjoint. With a similar condition one can also
check whether a symmetric operator is essentially self-adjoint or not.

Theorem 2.2.36. Let (A, D(A)) be a densely defined symmetric operator on some
Hilbert space H. Then the following are equivalent.

(i) A is essentially self-adjoint;

(ii) Ker(A∗ ± i) = 0;


(iii) Rg(A ± i) is dense in H.

We will only give a sketch of the proof of the theorem here and leave
the details as an exercise to the reader. One can show that for a symmetric
operator A∗∗ is again symmetric and the closure of A. Moreover, one always
has A∗ = A∗∗∗ . Hence, the above result follows from Theorem 2.2.34 applied to
A∗∗ and Lemma 2.2.33.

Remark 2.2.37. Recall that in Example 2.2.25 we have seen a closed densely
defined symmetric operator A whose adjoint A∗ is not symmetric. In fact,
this is no coincidence. Suppose that A is a closed densely defined symmetric
operator whose adjoint A∗ is also symmetric. It follows from the symmetry of
A that A ⊂ A∗ . Analogously, it follows from the symmetry of A∗ that A∗ ⊂ A∗∗ .
Since A is closed, it follows from the comments before this remark that
A = A∗∗ . Hence, A∗ ⊂ A and therefore A is self-adjoint.

2.2.4 Self-Adjoint Extensions


It may be difficult to explicitly write down the domain of a self-adjoint op-
erator. For example, this situation arises in the study of Hamiltonians if the
potential is non-trivial. Therefore, as a first step, one often defines a symmetric
operator with some dense domain, e.g. some class of smooth functions.
After that one asks whether this symmetric operator can be extended to a
self-adjoint operator. In this direction we have given a characterization of
essentially self-adjoint operators in the last section. However, it may happen
that a symmetric operator has several different self-adjoint extensions. In this
section we give a characterization of all self-adjoint extensions of a symmetric
operator and give some physical examples.
We extend the notion of an isometry in a natural way to unbounded
operators.

Definition 2.2.38 (Isometry). An unbounded operator (U , D(U )) on some


Hilbert space H is called an isometry if

hU x|U yi = hx|yi for all x, y ∈ D(U ).

Note that as after Definition 2.1.33 it follows from the polarization identity
that (U , D(U )) is an isometry if and only if kU xk = kxk for all x ∈ D(U ). As a
motivation for the following argument we take a quick look at the Möbius
transform
$$z \mapsto \frac{z - i}{z + i}.$$
It is a well-known fact from complex analysis that this defines a bijection
between the real line and the unit circle in the complex plane without the


point 1 (considered in the extended complex plane, the point ∞ is mapped to
1). Indeed, for z ∈ R the absolute values of the numerator and the denominator
agree, and the inverse can easily be calculated as
$$w \mapsto i\,\frac{1 + w}{1 - w}.$$
We now apply the above transformation to a symmetric operator. This trans-
formation is called the Cayley transformation. In the next proposition we study
its basic properties.

Proposition 2.2.39 (Properties of the Cayley transform). Let (A, D(A)) be a


symmetric operator on some Hilbert space H. Then the Cayley transform of A

V ≔ (A − i)(A + i)−1

is a well-defined isometry from Rg(A + i) onto Rg(A − i) ⊂ H.

Proof. Note that for x ∈ D(A) one has by the symmetry of A

k(A ± i)xk2 = h(A ± i)x|(A ± i)xi = hAx|Axi + hx|xi ± ihAx|xi ∓ ihx|Axi


= kAxk2 + kxk2 ± ihAx|xi ∓ ihAx|xi = kAxk2 + kxk2 .

This again shows that A ± i is injective and therefore that the inverse (A + i)−1
is well-defined as an unbounded operator with domain Rg(A + i). Moreover,
for x ∈ D(A) we have
k(A + i)xk2 = k(A − i)xk2 .
Now, it follows for x ∈ Rg(A + i) and y = (A + i)−1 x that

kV xk2 = k(A − i)yk2 = k(A + i)yk2 = kxk2 .

This shows that V is an isometry. Moreover, if y ∈ Rg(A − i), then x = (A +


i)(A − i)−1 y satisfies V x = y. This finishes the proof.

Lemma 2.2.40. Let (A, D(A)) be a symmetric operator on some Hilbert space H.
Then the Cayley transform of A is unitary on H if and only if A is densely defined
and self-adjoint.

Proof. First assume that A is self-adjoint. Then it follows from Theorem 2.2.34
that Rg(A ± i) = H. By Proposition 2.2.39 the Cayley transform V is an
isometric operator from H = Rg(A + i) onto H = Rg(A − i). This yields that V is
a surjective isometric operator H → H. It now follows from the polarization
identity that V is unitary on H.
Conversely, if V : H → H is unitary, then Rg(A ± i) = H by Proposi-
tion 2.2.39. The self-adjointness now follows from Theorem 2.2.34.


Suppose that V is the Cayley transform of some symmetric operator A.


Then one would expect that one can recover A from V by the natural formula
A = i(Id +V )(Id −V )−1 . Indeed, one can show by direct calculations that if A
is an unbounded operator such that A + i is injective, then Id −V is injective,
Rg(Id −V ) = D(A) and A = i(Id +V )(Id −V )−1 as unbounded operators. More-
over, if some given V : H → H is unitary such that Id −V is injective, one can
always write V as the Cayley transform of some self-adjoint operator.
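In finite dimensions the Cayley transform and its inverse can be computed directly. The following sketch (assuming NumPy; the random Hermitian matrix is only an illustration) verifies that V = (A − i)(A + i)⁻¹ is unitary and that A = i(Id + V)(Id − V)⁻¹ recovers the original operator.

```python
# Sketch (finite dimensions, assuming NumPy): the Cayley transform of a
# Hermitian matrix is unitary, and the matrix can be recovered from it.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2                          # a self-adjoint (Hermitian) matrix
I = np.eye(4)

V = (A - 1j * I) @ np.linalg.inv(A + 1j * I)      # Cayley transform V = (A - i)(A + i)^{-1}
print(np.allclose(V.conj().T @ V, I))             # V is unitary

A_back = 1j * (I + V) @ np.linalg.inv(I - V)      # A = i(Id + V)(Id - V)^{-1}
print(np.allclose(A_back, A))
```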

Lemma 2.2.41. Let H be a Hilbert space and V ∈ B(H) a unitary operator such
that Id −V is injective. Then there exists a self-adjoint operator (A, D(A)) on H
such that V is the Cayley transform of A.

Proof. It is sufficient to find a symmetric operator on H whose Cayley trans-


form is V : the self-adjointness of A then follows directly from Lemma 2.2.40.
Since Id −V is injective, the operator

A = i(Id +V )(Id −V )−1

is well-defined with D(A) = Rg(Id −V ). We show that A is symmetric. For this


let x, y ∈ Rg(Id −V ). There exist w, z ∈ H with x = (Id −V )w and y = (Id −V )z.
We have

hAx|yi = hi(Id +V )(Id −V )−1 x|yi = hi(Id +V )w|(Id −V )zi


= −ihw|zi + ihV w|V zi − ihV w|zi + ihw|V zi = ihw|V zi − ihV w|zi
= ihw|zi − ihV w|V zi − ihV w|zi + ihw|V zi = h(Id −V )w|i(Id +V )zi
= hx|Ayi.

Now, one can verify with a direct computation that the Cayley transform of A
is indeed V , a task which is left to the reader.

Note that the last step of the above proof only used the fact that V is
isometric. Hence, the proof shows that for every isometry U there exists a
symmetric operator A whose Cayley transform is U .
By the above arguments the problem of finding self-adjoint extensions of a
symmetric operator A is reduced to the problem of finding unitary extensions
of the Cayley transform V : Rg(A + i) → Rg(A − i). Now working with orthog-
onal complements becomes quite handy, a notion which we already have used
implicitly several times.

Definition 2.2.42 (Orthogonal complement). Let M ⊂ H be a subset of some


Hilbert space. Then its orthogonal complement is defined as the closed subspace

M ⊥ ≔ {x ∈ H : hx|yi = 0 for all y ∈ M}.


The orthogonal complement M ⊥ has a straightforward geometrical inter-


pretation. It consists of all vectors which are perpendicular to all elements
of M. Note that for a closed subspace M ⊂ H one always has the orthogonal
direct decomposition H = M ⊕ M ⊥ . For x ∈ H this decomposition is given
by x = P x + (Id −P )x, where P is the orthogonal projection onto M defined in
Example 2.1.14. We will use the following relation for unbounded operators.

Lemma 2.2.43. Let (A, D(A)) be a densely defined unbounded operator on some
Hilbert space H. Then
Ker A∗ = Rg(A)⊥ .

Proof. Suppose x ∈ Ker A∗ . Then we have for all Az = y ∈ Rg(A)

hx|yi = hx|Azi = hA∗ x|zi = 0.

Hence, x ∈ Rg(A)⊥ . Conversely, let x ∈ Rg(A)⊥ . Then we have for all y ∈ D(A)
the identity hx|Ayi = 0. This implies that x ∈ D(A∗ ) and A∗ x = 0. Therefore
x ∈ Ker A∗ .

We now return to the above extension problem. Note that if such a unitary
extension U : H → H exists, then U splits as a componentwise isomorphism

Rg(A + i) ⊕ Rg(A + i)⊥ → Rg(A − i) ⊕ Rg(A − i)⊥ .

Hence, Rg(A + i)⊥ and Rg(A − i)⊥ are isomorphic as Hilbert spaces. Conversely,
one sees directly that an isometry Rg(A + i) → Rg(A − i) extends uniquely to an
isometry between the closures. Hence, a unitary isomorphism between Rg(A +
i)⊥ and Rg(A − i)⊥ yields a unitary operator U : H → H extending the Cayley
transform of A. One can check that for such an extension U the operator Id −U
is injective and therefore is the Cayley transform of some self-adjoint operator
B by Lemma 2.2.41. In fact, this gives a one-to-one correspondence between
unitary extensions of the Cayley transform and self-adjoint extensions of A. A
crucial role is played by the dimensions of the orthogonal complements just
used.

Definition 2.2.44 (Deficiency indices). The deficiency indices d+ (A) and d− (A)
of a densely defined symmetric operator (A, D(A)) are defined as

d+ (A) = dim Ker(A∗ + i) and d− (A) = dim Ker(A∗ − i).

Note that by Lemma 2.2.43 one has d+ (A) = dim Rg(A − i)⊥ and d− (A) =
dim Rg(A + i)⊥ . We have just sketched a proof of the following theorem (we
omit a detailed proof).

Theorem 2.2.45. Let (A, D(A)) be a densely defined symmetric operator on some
Hilbert space H. Then


(a) A has self-adjoint extensions if and only if d+ (A) = d− (A).

(b) There is a one-to-one correspondence between self-adjoint extensions of A


and unitary extensions H → H of its Cayley transform, i.e. unitary operators
between Ker(A∗ − i) and Ker(A∗ + i).

In particular, A admits a unique self-adjoint extension if and only if d+ (A) =


d− (A) = 0, i.e. if and only if A is essentially self-adjoint by Theorem 2.2.36.
Moreover, if U : Ker(A∗ −i) → Ker(A∗ +i) is unitary, the corresponding self-adjoint
extension AU is given by

D(AU ) = {f + g − U g : f ∈ D(A), g ∈ Ker(A∗ − i)}


AU (f + g − U g) = Af + i(g + U g).

In fact, more generally one can show that there is a one-to-one correspondence
between isometries (U , D(U )) into Ker(A∗ + i) with D(U ) ⊂ Ker(A∗ − i)
and closed symmetric extensions of A. Let us illustrate the theorem with some
examples.

Example 2.2.46 (Self-adjoint extensions of the momentum operator on the


half-line). Let us again consider the momentum operator $A = -i\frac{d}{dx}$ with
domain $D(A) = C_c^\infty((0, \infty))$ on $L^2([0, \infty))$. The calculations in Example 2.2.25
show that its adjoint is $A^* = -i\frac{d}{dx}$ with $D(A^*) = H^1((0, \infty))$. Let us determine
the deficiency indices of A. First we consider the equation

(A∗ + i)f = −if 0 + if = 0 ⇔ f 0 − f = 0.

Every solution of this equation is a solution in the classical sense: we have
f ∈ H 1 ((0, ∞)) and the above equation implies that f ′ ∈ H 1 ((0, ∞)) as well.
Therefore f ∈ H 2 ((0, ∞)). Iterating this argument shows that f ∈ H k ((0, ∞))
for all k ∈ N. The Sobolev embedding theorems (Remark 2.1.37) show that
f ∈ C ∞ ((0, ∞)) and therefore f is a classical solution. This shows that all solutions
of the above ordinary differential equation are scalar multiples of f (t) = et .
However, these solutions are not square-integrable. This shows d+ (A) = 0. For
d− (A) we have to solve

(A∗ − i)f = −if 0 − if = 0 ⇔ f 0 + f = 0.

As above all solutions of the above equation are scalar multiples of f (t) = e−t
which are square-integrable. Hence, d− (A) = 1. Altogether we have d+ (A) ≠
d− (A) and we see from Theorem 2.2.45 that A has no self-adjoint extensions.

Please recall that we have already discussed the non-existence of self-


adjoint extensions of the momentum operator on the half-line from a physical
perspective in Remark 2.2.18.


Example 2.2.47 (Self-adjoint extensions of the momentum operator on a


bounded interval). We now study a slight variation of the previous example.
In fact, we consider the momentum operator $A = -i\frac{d}{dx}$ on $L^2([0, 1])$ with
domain $D(A) = C_c^\infty((0, 1))$. Similarly as in the previous example one calculates
its adjoint to be $A^* = -i\frac{d}{dx}$ with $D(A^*) = H^1((0, 1))$. For d+ (A) we solve

(A∗ + i)f = 0 ⇔ f 0 − f = 0.

Every solution of the above equation is a scalar multiple of f (t) = et . Hence,


d+ (A) = 1. An analogous calculation shows that d− (A) = 1. By Theorem 2.2.45,
A has self-adjoint extensions. Moreover, the self-adjoint extensions are in one-
to-one correspondence to unitaries between C ' Ker(A∗ −i) and C ' Ker(A∗ +i).
Every unitary map between these one-dimensional spaces is uniquely determined
by a real number α ∈ [0, 2π), i.e. it is of the form z 7→ eiα z. Now let us determine
the self-adjoint extensions of A concretely. For this we first need to determine the
closure of A. Notice that the norm k·k + kA·k is equivalent to the norm of H 1 ((0, 1)).
It follows from the denseness of $C_c^\infty((0, 1))$ in $H_0^1((0, 1))$ (see Remark 2.2.16)
that $\overline{A} = -i\frac{d}{dx}$ with $D(\overline{A}) = H_0^1((0, 1))$. To write down the unitary mappings between Ker(A∗ − i)
and Ker(A∗ + i) we need normalized elements in the kernels. For this we
calculate
$$\int_0^1 e^{-2t}\, dt = \frac{1}{2}\bigl(1 - e^{-2}\bigr) = \frac{1}{2}e^{-2}\bigl(e^2 - 1\bigr) \quad\text{and}\quad \int_0^1 e^{2t}\, dt = \frac{1}{2}\bigl(e^2 - 1\bigr).$$
The normalized elements are therefore given by
$$f_-(t) = \frac{\sqrt{2}\, e}{\sqrt{e^2 - 1}}\, e^{-t} \quad\text{and}\quad f_+(t) = \frac{\sqrt{2}}{\sqrt{e^2 - 1}}\, e^{t}.$$
Hence, the unitary Uα : Ker(A∗ − i) → Ker(A∗ + i) is determined by f− 7→ eiα f+ .
Every element of D(AUα ) can be written as the sum g + λ(f− − Uα f− ) for some
g ∈ H01 ((0, 1)). Evaluating at the boundaries gives
$$f_-(0) - e^{i\alpha} f_+(0) = \frac{\sqrt{2}\, e}{\sqrt{e^2 - 1}} - e^{i\alpha}\, \frac{\sqrt{2}}{\sqrt{e^2 - 1}},$$
$$f_-(1) - e^{i\alpha} f_+(1) = \frac{\sqrt{2}}{\sqrt{e^2 - 1}} - e^{i\alpha}\, \frac{\sqrt{2}\, e}{\sqrt{e^2 - 1}}.$$
The quotient of both has modulus one:
$$\left|\frac{e - e^{i\alpha}}{1 - e^{i\alpha} e}\right| = \left|\frac{e - e^{i\alpha}}{e^{-i\alpha} - e}\right| = 1.$$

Moreover, one can directly verify or use the theory of Möbius transformations
that z 7→ (e − z)/(1 − ez) restricts to a bijection on the unit circle. Hence, there


exists a β ∈ [0, 2π) (uniquely determined by α) such that for every f ∈ D(AUα )
we have f (0) = eiβ f (1). Conversely, one sees that every f ∈ H 1 ((0, 1)) with
f (0) = eiβ f (1) can be written as g + λ(f− − Uα f− ) for some g ∈ H01 ((0, 1)) and some
λ ∈ C. Hence, the self-adjoint operators (Aβ , D(Aβ )) defined by

D(Aβ ) = {f ∈ H 1 ((0, 1)) : f (0) = eiβ f (1)}


Aβ f = −if 0 .

for some β ∈ [0, 2π) are exactly the self-adjoint extensions of A.
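The dependence of the extensions on β can be made concrete with a small numerical experiment. The sketch below (assuming NumPy; the central-difference discretization and the grid size are illustrative choices, not part of the notes) encodes the boundary condition f(0) = e^{iβ}f(1) in a twisted wrap-around entry; for every β the resulting matrix is Hermitian, and its spectrum is a discretization of the exact eigenvalues 2πn − β of Aβ.

```python
# Sketch (assuming NumPy): a central-difference discretization of -i d/dx on [0, 1]
# with the twisted boundary condition f(0) = exp(i*beta) f(1).  For every beta the
# matrix is Hermitian, and its spectrum is {N*sin((2*pi*n - beta)/N) : n = 0..N-1},
# a discretization of the exact eigenvalues 2*pi*n - beta of A_beta.
import numpy as np

def momentum_matrix(N, beta):
    h = 1.0 / N
    A = np.zeros((N, N), dtype=complex)
    for j in range(N):
        A[j, (j + 1) % N] += -1j / (2 * h)
        A[j, (j - 1) % N] += 1j / (2 * h)
    A[N - 1, 0] *= np.exp(-1j * beta)   # encode f(0) = exp(i*beta) f(1)
    A[0, N - 1] *= np.exp(1j * beta)
    return A

N = 200
for beta in (0.0, 1.0, 2.5):
    A = momentum_matrix(N, beta)
    assert np.allclose(A, A.conj().T)                  # Hermitian for every beta
    ev = np.sort(np.linalg.eigvalsh(A))
    expected = np.sort(N * np.sin((2 * np.pi * np.arange(N) - beta) / N))
    print(beta, np.allclose(ev, expected))             # spectrum depends on beta
```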

It is again time to discuss the physical relevance of the above calculations.


This time we describe a particle which is located in a finite spatial region due
to the potential
$$V(x) = \begin{cases} 0 & \text{if } 0 \le x \le 1 \\ \infty & \text{else} \end{cases}.$$

Given a compactly supported wave function ψ ∈ Cc∞ ((0, 1)) the time evolution
described by the momentum operator should be given by translation of ψ. But
this of course does not completely describe the time evolution for all times
because after some time the wave function will meet the boundary. Since the
time evolution must be described by some unitary operator, we cannot lose
probability mass of the function. Hence, what goes out at one side of the well
should come in at the other side. However, we have the freedom to choose a
phase shift for the outgoing wave. By the superposition principle, this phase
shift is independent of the wave function. In this way one exactly obtains the
just calculated self-adjoint extensions, where eiβ is the global phase shift of
the outgoing waves.
What can we learn from this example? Different self-adjoint extensions
really correspond to different physics! Hence, the concrete choice of a self-
adjoint extension of a given symmetric operator is not just some mathematical
freedom which is irrelevant for the description of the physical world, but has
real physical consequences.

Example 2.2.48 (The Hamiltonian on the half-line). We now consider the


free Hamiltonian associated to the momentum operator on the half-line (Example 2.2.46),
i.e. $A = -\frac{d^2}{dx^2}$ with $D(A) = C_c^\infty((0, \infty))$. Let us calculate its adjoint. For this let
f ∈ D(A∗ ). Then one has for all g ∈ Cc∞ ((0, ∞)) that
$$\int_0^\infty \overline{(A^* f)(x)}\, g(x)\, dx = \langle A^* f|g\rangle = \langle f|Ag\rangle = -\int_0^\infty \overline{f(x)}\, g''(x)\, dx.$$
Taking complex conjugates, we obtain for real g ∈ Cc∞ ((0, ∞))
$$\int_0^\infty (A^* f)(x)\, g(x)\, dx = -\int_0^\infty f(x)\, g''(x)\, dx.$$


This shows that the second derivative of f exists in the weak sense and is
given by −A∗ f . Conversely, if the second derivative of f exists in the weak
sense and lies in L2 ((0, ∞)), the same calculations read backwards show that
f ∈ D(A∗ ) and A∗ f = −f 00 . Now let us determine the deficiency indices of A.
For d+ (A) we solve

(A∗ + i)f = −f 00 + if = 0.

It follows from the elliptic regularity theory presented in Example 2.1.38 (the
half-line case follows from the real line case via a reduction argument: simply
extend all solutions to the real line by reflecting the solution at the
origin; in fact this argument also shows D(A∗ ) = H 2 ((0, ∞))) that a solution of
the equation in the weak sense already satisfies f ∈ C ∞ ((0, ∞)) and therefore is
a classical solution. Every classical solution of this second-order ordinary differential
equation is of the form
$$\lambda_1 \exp\bigl((1 + i)x/\sqrt{2}\bigr) + \lambda_2 \exp\bigl(-(1 + i)x/\sqrt{2}\bigr)$$

for some λ1 , λ2 ∈ C. This solution is in L2 ((0, ∞)) if and only if λ1 = 0. Hence,


d+ (A) = 1. An analogous calculation shows d− (A) = 1. Hence, it follows from
Theorem 2.2.45 that A has self-adjoint extensions. In fact, along the reasoning
of Example 2.2.47 the self-adjoint extensions of A are precisely the operators
$-\frac{d^2}{dx^2}$ with domains

D(Aa ) = {f ∈ H 2 ((0, ∞)) : f 0 (0) + af (0) = 0}

for a ∈ R and in the formal limit case a = ∞

D(A∞ ) = {f ∈ H 2 ((0, ∞)) : f (0) = 0}.

This result is very interesting from a physical perspective. Although


one cannot define a reasonable momentum observable in the half-line case
as shown in Example 2.2.46, the situation changes when one considers the
Hamiltonian as its formal square. In fact, the previous example shows that
there are infinitely many self-adjoint realizations of the Hamiltonian with
different boundary conditions. Let us show that these realizations again
correspond to different physics. Let us for the following moment ignore the
behaviour of the wave function at infinity, i.e. all integrability issues. Then
the plane waves e±ikx for k ∈ R do not lie in D(Aa ) since they do not satisfy the
boundary conditions. However, if we consider a superposition of both incoming
and outgoing waves ψ(x) = e−ikx + λeikx we have

$$\psi'(0) + a\psi(0) = -ik + ik\lambda + a + \lambda a = 0 \iff \lambda = \frac{ik - a}{ik + a}.$$


Note that |λ| = 1. Hence, Aa generates a dynamic in which a plane wave is


reflected at the origin with a phase change depending on the value of a and the
momentum k. The limit case a = ∞ corresponds to the case of total reflection
on a hard wall where the phase change is −1 independent of the momentum.
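A two-line check of this reflection coefficient (a sketch assuming NumPy) confirms that |λ| = 1 for all momenta and that the hard-wall limit a = ∞ gives λ = −1:

```python
# Sketch (assuming NumPy): the reflection coefficient lambda = (ik - a)/(ik + a)
# has modulus one; only its phase depends on a and on the momentum k.
import numpy as np

k = np.linspace(0.1, 10.0, 5)
for a in (0.5, 2.0, np.inf):
    lam = -np.ones_like(k) if a == np.inf else (1j * k - a) / (1j * k + a)
    print(a, np.abs(lam), np.angle(lam))   # |lambda| = 1; hard wall: lambda = -1
```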
We now come back to an old example from the other part of the lecture.
Sadly, this example is more difficult than the previous ones and is best treated
within the theory of distributions which we will introduce in the next chapter.
We will state all necessary facts explicitly. The reader may skip the example
and return later or simply believe the used results.
Example 2.2.49. On L2 (R) consider the operator $A = -i x^3 \frac{d}{dx} - i \frac{d}{dx} x^3$. We have
already studied this operator in the physics part of the lecture. Using naive
reasoning, it seems that
$$f(x) = \frac{1}{|x|^{3/2}}\, e^{-1/(4x^2)} \in L^2(\mathbb{R})$$
satisfies Af = −if , i.e. f is an eigenvector of A for the imaginary eigenvalue −i. The
problem is that, once precisely written down, f does not lie in the domain D(A). We now
give a mathematically precise study of the operator with the aim to get another
perspective on it with the help of the theory just developed. For
this we use the domain D(A) = Cc∞ (R). Let us first show that the operator
(A, D(A)) is symmetric. Using integration by parts, for f , g ∈ D(A) we obtain
the desired identity
$$\langle f|Ag\rangle = \int_{-\infty}^{\infty} \overline{f(x)}\,(Ag)(x)\, dx = -i\int_{-\infty}^{\infty} \overline{f(x)}\,\bigl(x^3 g'(x) + (x^3 g(x))'\bigr)\, dx$$
$$= i\int_{-\infty}^{\infty} \Bigl((\overline{f(x)}\,x^3)'\, g(x) + \overline{f'(x)}\, x^3 g(x)\Bigr)\, dx$$
$$= \int_{-\infty}^{\infty} \overline{-i\bigl(x^3 f'(x) + (f(x)x^3)'\bigr)}\, g(x)\, dx = \langle Af|g\rangle.$$

It is somewhat difficult to describe the domain of the adjoint more explicitly


without using the theory of distributions that we will cover in the next chapter.
For the moment we assume that we can give sense to the derivative of an
arbitrary L1loc -function and that these derivatives can be multiplied with
smooth functions of polynomial growth. The domain D(A∗ ) is then given
by those functions f ∈ L2 (R) for which the distribution x3 f 0 (x) + (x3 f (x))0 =
2x3 f 0 (x) + 3x2 f (x) lies in L2 (R) (assuming the usual rules of calculus). Let us
now calculate the deficiency indices. For d+ (A) we obtain the equation

(A∗ + i)f = 0 ⇔ f (x) − 2x3 f 0 (x) − 3x2 f (x) = (1 − 3x2 )f (x) − 2x3 f 0 (x) = 0.

We can restrict the above equation to the open right and left half-line (this
is also valid for distributions). For x > 0 respectively x < 0 we obtain the


equation
$$f'(x) = \frac{1 - 3x^2}{2x^3}\, f(x).$$
We now treat the case x > 0. If f ∈ L2 (R) solves the equation on [ε, ∞) for
some ε > 0, then it follows from the fact that the factor on the right-hand side is
bounded that f 0 ∈ L2 ([ε, ∞)). Hence, f ∈ H 1 ([ε, ∞)) because we will see from
the definition of distributions that if the distributional derivative exists as a
locally integrable function, then the weak derivative exists and agrees with the
function representing the distributional derivative. Iterating this argument
and using the Sobolev embeddings we see that f ∈ C ∞ ((ε, ∞)). Hence, f is
smooth away from zero, and away from zero it is sufficient to work with
classical solutions of the differential equation. Using separation of variables,
we see that all solutions are scalar multiples of
$$f_+(x) = \exp\left(\int^x \frac{1 - 3y^2}{2y^3}\, dy\right) = \exp\left(-\frac{1}{4x^2} - \frac{3}{2}\log x\right) = \frac{\exp(-1/(4x^2))}{x^{3/2}}.$$
Analogously, one obtains for the case x < 0 that the solution is a multiple of
$$f_-(x) = \exp\left(\int^x \frac{1 - 3y^2}{2y^3}\, dy\right) = \exp\left(-\frac{1}{4x^2} - \frac{3}{2}\log|x|\right) = \frac{\exp(-1/(4x^2))}{|x|^{3/2}}.$$
One now sees directly that both are square-integrable on the respective half-lines.
Hence, λ1 f+ 1[0,∞) + λ2 f− 1(−∞,0] ∈ Ker(A∗ + i) for λ1 , λ2 ∈ C. This shows
d+ (A) = 2.
Let us continue with d− (A). For this we have to find all solutions of

(A∗ − i)f = 0 ⇔ f (x) + 2x3 f ′(x) + 3x2 f (x) = (1 + 3x2 )f (x) + 2x3 f ′(x) = 0.

For x ≠ 0 we therefore have by the same reasoning
$$f'(x) = -\frac{1 + 3x^2}{2x^3}\, f(x).$$
Integrating this equation we obtain that on both open half-lines one must
have scalar multiples of the functions
$$f(x) = \exp\left(-\int^x \frac{1 + 3y^2}{2y^3}\, dy\right) = \exp\left(\frac{1}{4x^2} - \frac{3}{2}\log|x|\right) = \frac{\exp(1/(4x^2))}{|x|^{3/2}}.$$
However, these functions have non-square integrable singularities at zero and
therefore cannot be combined to a non-zero L2 (R)-function. This shows that
d− (A) = 0.
Altogether we see from Theorem 2.2.45 that A does not have any self-
adjoint extensions. Moreover, observe that the potential eigenvector for the eigenvalue −i does
not lie in the domain of A. It does, however, lie in the domain of A∗ , as we have just
shown. But the domain D(A∗ ) is too big in order for A∗ to be self-adjoint.


2.2.5 The Spectrum of Self-Adjoint Operators


Recall that our actual interest for self-adjoint operators stems from the fact
that self-adjoint operators describe quantum mechanical observables by the
postulates of quantum mechanics. In particular, a key issue here is the fact
that the spectrum of a self-adjoint operator should be real because the spec-
trum of an observable corresponds to its possible measurement outcomes.
Recall that for symmetric operators the spectrum can contain complex num-
bers. We now show that the reality condition on the spectrum characterizes
self-adjoint operators among the symmetric operators. Hence, self-adjointness
as introduced before is exactly the right concept for the mathematical treat-
ment of quantum mechanics.
Theorem 2.2.50. Let (A, D(A)) be a densely defined symmetric operator on some
Hilbert space. Then
σ (A) ⊂ R ⇔ A is self-adjoint.
Proof. Suppose first that A is self-adjoint. Then it follows from the characteri-
zation of self-adjoint operators given in Theorem 2.2.34 (iii) and Lemma 2.2.13
that ±i ∈ ρ(A). By the structural properties of the spectrum of a symmetric
operator proved in Corollary 2.2.22, the spectrum of A must be contained in
the real axis.
Conversely, if σ (A) ⊂ R, then we have ±i ∈ ρ(A). The self-adjointness of A
then follows directly from Proposition 2.2.21.
Nevertheless it can take some work to determine the spectrum of self-
adjoint operators explicitly. We now do this for the important case of real
multiplication operators.
Example 2.2.51 (Spectrum of multiplication operators). Consider again the
multiplication operator Mm f = mf on the Hilbert space L2 (Ω, Σ, µ) for some
σ -finite measure space (Ω, Σ, µ) and a measurable function m : Ω → R with
domain D(Mm ) = {f ∈ L2 : mf ∈ L2 }. We now want to determine the spectrum
σ (Mm ) of Mm . For this we introduce the essential range of m as
essim(m) ≔ {z ∈ C : for all ε > 0 one has µ({ω : |m(ω) − z| < ε}) > 0}.
We claim that σ (Mm ) = essim(m). For this let us first assume that z ∉ essim(m).
Then there exists ε > 0 such that µ({ω : |m(ω) − z| < ε}) = 0. Hence, |m(ω) − z| ≥
ε almost everywhere. We show that z ∈ ρ(Mm ). For this we have to solve
(z − m)f = g for g ∈ L2 (Ω). In fact, the unique solution of this equation is given
by f (ω) = g(ω)/(z − m(ω)) almost everywhere. This solution is indeed square
integrable because of
$$\int_\Omega |f(\omega)|^2\, d\omega = \int_\Omega \frac{|g(\omega)|^2}{|z - m(\omega)|^2}\, d\omega \le \varepsilon^{-2} \int_\Omega |g(\omega)|^2\, d\omega.$$


This shows σ (Mm ) ⊆ essim(m). It remains to show the converse inclusion
essim(m) ⊂ σ (Mm ). For this let z ∈ essim(m). Then for all n ∈ N there exist
cn > 0 with µ(An ) = cn , where An = {ω : |m(ω) − z| < 4−n }. Now consider the
function $g = \sum_{n=1}^{\infty} c_n^{-1/2}\, 2^{-n}\, \mathbf{1}_{A_n}$. Then
$$\|g\|_2 \le \sum_{n=1}^{\infty} c_n^{-1/2}\, 2^{-n}\, \|\mathbf{1}_{A_n}\|_2 = \sum_{n=1}^{\infty} c_n^{-1/2}\, 2^{-n}\, c_n^{1/2} = 1.$$

Let f (ω) = g(ω)/(z − m(ω)) be the unique solution of the resolvent equation.
Then for all n ∈ N we have

$$\|f\|_2^2 \ge \int_{A_n} \frac{|g(\omega)|^2}{|z - m(\omega)|^2}\, d\omega \ge \int_{A_n} \frac{c_n^{-1} 4^{-n}}{|z - m(\omega)|^2}\, d\omega \ge 4^{n} c_n^{-1} \int_{A_n} d\omega = 4^n.$$

This shows that f is not square integrable and that z − Mm is not surjective. A
fortiori, z ∈ σ (Mm ) and the converse inclusion essim(m) ⊂ σ (Mm ) is shown.
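The essential-range criterion lends itself to a quick numerical check. The following sketch (assuming NumPy; the grid, the function m and the fixed tolerance ε are illustrative simplifications of the "for all ε > 0" condition) tests a few candidate points for m(x) = x² on [−1, 1]:

```python
# Sketch (assuming NumPy): approximate check of sigma(M_m) = essim(m) for the
# multiplication operator by m(x) = x^2 on L^2([-1, 1]); essim(m) = [0, 1].
import numpy as np

x = np.linspace(-1.0, 1.0, 200_001)      # fine grid standing in for ([-1, 1], Lebesgue)
m = x**2

def in_essential_range(z, eps=1e-3):
    # positive measure of {|m - z| < eps}, approximated by the fraction of grid points
    return np.mean(np.abs(m - z) < eps) > 0

for z in (0.5, 1.05, -0.2):
    print(z, in_essential_range(z))      # True, False, False
```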

As particular examples we can determine the spectrum of the position


and momentum operators and the Laplace operator on Rn .

Example 2.2.52. For j = 1, . . . , n let xˆj and pˆj be the position and momentum
operators on Rn as introduced before and in Example 2.2.29. These operators
are self-adjoint and satisfy

σ (p̂j ) = σ (x̂j ) = R for all j = 1, . . . , n.

Proof. The case of the position operators is a particular instance of Example 2.2.51.
For the momentum operators recall that these are unitarily
equivalent via the Fourier transform to the multiplication operators with
m(x) = xj . Since the spectrum is invariant under similarity transforms, the
result again follows from Example 2.2.51.

Note that the full spectrum of the position and momentum operators
is exactly the result we would expect from physics: if one measures the
momentum (or the position) of a particle, one can obtain arbitrary real values
as measurement outcomes.
An analogous approach yields the spectrum of the negative Laplacian −∆
with domain H 2 (Rn ).

Example 2.2.53 (Spectrum of −∆). For n ∈ N consider the negative Laplacian


−∆ with domain H 2 (Rn ). We have seen that this operator is unitarily equiva-
lent via the Fourier transform to the multiplication operator with m(x) = |x|2 .
Hence, σ (−∆) = [0, ∞) by Example 2.2.51.


Note that the result σ (−∆) = [0, ∞) perfectly corresponds to the physical
intuition. In fact, recall that $-\Delta = \sum_{j=1}^n \hat{p}_j^2$ is the Hamiltonian for the free
particle on Rn if we ignore all physical constants. As the spectrum of Ĥ


corresponds to the possible measurable energy values of the particle, we
expect that the spectrum is non-negative because there should not be a particle
with negative energy (using the natural gauge). Moreover, since the particle
is free and therefore no constraints whatsoever apply, in principle all energy
values should be possible. Hence, by physical reasoning one has σ (−∆) = [0, ∞)
which is exactly our mathematical result.
Note that for more difficult Hamiltonians it can be extremely difficult to
exactly determine their spectrum. Therefore it is desirable to at least have
some information on the structure of the spectrum, for example whether there
exists an orthonormal basis of eigenvectors. We will deal to some extent with
such problems in the next section. However, we will not have the time to give
a systematic study of the spectral properties of typical Hamiltonians of the
form −∆ + V .

2.3 The Spectral Theorem for Self-Adjoint Operators


Let us shortly recall the spectral theorem in the finite dimensional setting.
Suppose that A is a symmetric operator on some finite dimensional complex
inner product space V . Then the spectral theorem guarantees the existence of
an orthonormal basis of V consisting of eigenvectors of A. With respect to this
orthonormal basis, A is represented as a diagonal matrix. This is in general not
possible for self-adjoint operators in the infinite dimensional setting. In fact,
we have seen that there exist bounded operators on some infinite dimensional
Hilbert space which do not have a single eigenvalue. However, in some important
physical examples, for example the Hamiltonians for an infinite high well or
the harmonic oscillator, there exists an orthonormal basis of eigenvectors.

2.3.1 The Spectral Theorem for Compact Self-Adjoint Operators


In fact, with an additional assumption one obtains such an orthonormal
basis. We deal with this case first before we proceed with general self-adjoint
operators.

Definition 2.3.1 (Compact subsets). A subset K ⊂ H of some Hilbert space


(or more generally of some metric space) is called compact if one of the follow-
ing equivalent conditions holds.

(i) For every sequence (xn )n∈N ⊂ K there exists a subsequence (xnk )k∈N and
x ∈ K with xnk → x for k → ∞.


(ii) Let K ⊂ ∪i∈I Ui be a covering of K with open subsets Ui ⊂ H. Then


there exist finitely many i1 , . . . , in ∈ I with K ⊂ ∪nk=1 Uik , i.e. every open
covering of K has a finite sub-covering.

The second point is the definition of compact subsets for general topological
spaces, whereas the (in the case of metric spaces equivalent) first
point is called sequential compactness. You have learned in calculus that a
subset K of some finite dimensional normed vector space V is compact if and
only if K is closed and bounded. This changes fundamentally in the infinite
dimensional setting.

Example 2.3.2 (Non-compact unit ball). Let B be the closed unit ball of some
infinite dimensional separable Hilbert space H. Then B clearly is bounded
and closed by definition. However, we now show that B is not compact.
Choose an orthonormal basis (en )n∈N of H which exists by Theorem 2.1.7. By
orthogonality we have for n ≠ m
$$\|e_n - e_m\|^2 = \|e_n\|^2 + \|e_m\|^2 = 2.$$
Since the mutual distance of two arbitrary different elements is equal to $\sqrt{2}$,
the sequence (en )n∈N cannot have a convergent subsequence. Hence, B is not
compact.

It is an instructive exercise for the reader to explicitly construct an infinite


disjoint open covering of the closed unit ball of some infinite-dimensional
Hilbert space. We now introduce the important class of compact operators.

Definition 2.3.3 (Compact operator). A linear operator T : H1 → H2 be-


tween two Hilbert spaces (or more generally between two Banach spaces) H1
and H2 is compact if T (B) is a compact subset of H2 , where B denotes the
closed unit ball in H1 . We denote the space of all compact operators between
H1 and H2 by K(H1 , H2 ).

The reader should verify the following elementary facts: a compact linear
operator T : H1 → H2 is automatically bounded and T (B) is relatively compact for
arbitrary bounded subsets B of H1 . Moreover, the compact operators form
a subspace of B(H1 , H2 ) and the composition of a bounded and a compact
operator is again a compact operator (hence, in algebraic terms, K(H) is an
ideal in B(H)).
We start our journey through the spectral theorems with the easiest (infi-
nite dimensional) case, namely the spectral theorem for compact self-adjoint
or more generally normal operators.

Definition 2.3.4 (Normal operator). An operator N ∈ B(H) is called normal


if N N ∗ = N ∗ N .


Theorem 2.3.5 (The spectral theorem for compact normal operators). Let
T : H → H be a compact normal operator on some Hilbert space H. Then there
exists a countable orthonormal system (en )n∈N and (λn )n∈N ⊂ C \ {0} such that

$$H = \operatorname{Ker} T \oplus \overline{\operatorname{span}}\{e_n : n \in \mathbb{N}\}$$
and
$$Tx = \sum_{n\in\mathbb{N}} \lambda_n \langle e_n|x\rangle\, e_n \quad\text{for all } x \in H.$$
Moreover, the only possible accumulation point of (λn )n∈N is 0, every non-zero
eigenvalue of T has finite multiplicity, i.e. dim Ker(T − λ) < ∞ for all λ ≠ 0, and
all eigenvalues λn are real if T is self-adjoint.
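In finite dimensions the statement reduces to the diagonalization of Hermitian matrices, and the decomposition can be checked directly. A minimal sketch (assuming NumPy; the random matrix is only an illustration of the formula Tx = Σ λn⟨en|x⟩en):

```python
# Finite-dimensional sketch (assuming NumPy): for a Hermitian matrix the spectral
# theorem reads T x = sum_n lambda_n <e_n|x> e_n with an orthonormal eigenbasis.
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
T = (B + B.conj().T) / 2                       # a (finite rank) self-adjoint operator

lam, E = np.linalg.eigh(T)                     # columns of E: orthonormal eigenvectors
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)

Tx = sum(lam[n] * np.vdot(E[:, n], x) * E[:, n] for n in range(5))
print(np.allclose(Tx, T @ x))                  # the decomposition reproduces T
print(np.allclose(E.conj().T @ E, np.eye(5)))  # (e_n) is an orthonormal system
```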

In particular, after choosing an orthonormal basis of Ker T , we have an


orthonormal basis of H consisting of eigenvectors of T . The above formulation
has the advantage that it is valid arbitrary Hilbert spaces. Ultimately, we are
mostly interested in the spectral theory of unbounded self-adjoint operators
which will not be bounded. Nevertheless the spectral theorem for compact
self-adjoint operators has direct consequences for the study of unbounded
operators, such as some Hamiltonians, as the next corollary shows.

Corollary 2.3.6. Let (A, D(A)) be a self-adjoint operator on some separable infinite-
dimensional Hilbert space H with compact resolvent, i.e. R(λ, A) is a compact
operator for some λ in the resolvent set ρ(A). Then there exists an orthonormal
basis (en )n∈N of H and a sequence (λn )n∈N ⊂ R with limn→∞ |λn | = ∞ such that
$$D(A) = \left\{ x \in H : \sum_{n=1}^{\infty} |\lambda_n|^2\, |\langle e_n|x\rangle|^2 < \infty \right\},$$
$$Ax = \sum_{n=1}^{\infty} \lambda_n \langle e_n|x\rangle\, e_n.$$

Proof. Consider the bounded operator R(λ, A) = (λ − A)−1 . One can verify
that its adjoint is given by R(λ, A)∗ = R(λ̄, A). It is easy to see that the resolvents
at different values commute; in particular we have R(λ, A)R(λ̄, A) =
R(λ̄, A)R(λ, A). Hence, R(λ, A) is a compact normal operator. By the spec-
tral theorem for such operators (Theorem 2.3.5) there exists an orthonormal
system (en )n∈N of H and sequence (µn )n∈N with limn→∞ µn = 0 such that

$$R(\lambda, A)x = \sum_{n=1}^{\infty} \mu_n \langle e_n|x\rangle\, e_n.$$

Since R(λ, A) is invertible, its kernel is trivial and therefore (en )n∈N must be
an orthonormal basis of H. For λn = λ − µ−1 n define the operator B as in the


assertion of the theorem, i.e.
$$D(B) = \left\{ x \in H : \sum_{n=1}^{\infty} |\lambda_n|^2\, |\langle e_n|x\rangle|^2 < \infty \right\},$$
$$Bx = \sum_{n=1}^{\infty} \lambda_n \langle e_n|x\rangle\, e_n.$$

Note that by definition of B we have λ ∈ ρ(B) and R(λ, A) = R(λ, B). In other
words, (λ − A)−1 = (λ − B)−1 . Hence, λ − A = λ − B as unbounded operators and
therefore A = B. Further observe that since A = B is self-adjoint, all eigenvalues
λn of A must be real. Further limn→∞ |µn | = 0 implies limn→∞ |λn | = ∞.

We now give a physical example that illustrates the above methods. Before
doing so, we present a powerful characterization of compact subsets of
Lp -spaces. Its proof uses a smooth approximation of the members of F to
which the Arzelà–Ascoli theorem is applied.

Theorem 2.3.7 (Fréchet–Kolmogorov). Let n ∈ N, p ∈ [1, ∞) and F ⊂ Lp (Rn )


be a bounded subset. Suppose that F satisfies
$$\lim_{h\to 0} \int_{\mathbb{R}^n} |f(x + h) - f(x)|^p\, dx = 0 \quad\text{uniformly on } F.$$

Then the closure of F|Ω is compact in Lp (Ω) for any measurable subset Ω ⊂ Rn of
finite measure. Moreover F is compact in Lp (Rn ) if one additionally has
$$\lim_{r\to\infty} \int_{|x|\ge r} |f(x)|^p\, dx = 0 \quad\text{uniformly on } F.$$

Note that F|Ω = {g : Ω → K : g = f|Ω for some f ∈ F }. Further the first


assumption can be rephrased as: for all ε > 0 there exists δ > 0 such that
$\int_{\mathbb{R}^n} |f(x + h) - f(x)|^p\, dx \le \varepsilon$ for all |h| ≤ δ and all f ∈ F . An analogous version
with quantifiers can of course be formulated for the second condition.
We now apply the Fréchet–Kolmogorov theorem to some basic examples
and show its consequences for concrete quantum mechanical systems. We
start with compact embeddings of Sobolev spaces.

Example 2.3.8 (Compact embedding of H 1 ((a, b))). Consider F = BR , the


closed ball of radius R > 0 in H 1 ((a, b)) for some −∞ < a < b < ∞. For f ∈ F
define
$$\tilde f(x) = \begin{cases} (1 - (a - x))^2 f(a) & x \in [a - 1, a] \\ f(x) & x \in (a, b) \\ (1 - (x - b))^2 f(b) & x \in [b, b + 1] \\ 0 & \text{else} \end{cases}.$$


We have f˜ ∈ H 1 (R) for all f ∈ F . Moreover, verify with the help of Re-
mark 2.1.37 that there exists a universal constant C > 0 such that kf˜kH 1 (R) ≤
C kf kH 1 ((a,b)) . We set F˜ = {f˜ : f ∈ F }. Note that F˜|(a,b) = F . We now check that
F˜ satisfies the condition of the Fréchet–Kolmogorov theorem (Theorem 2.3.7).
First assume that f ∈ C 1 (R)∩ F˜ . Then the Cauchy–Schwarz inequality implies
$$|f(x + h) - f(x)| \le \int_x^{x+h} |f'(y)|\, dy \le |h|^{1/2} \left(\int_x^{x+h} |f'(y)|^2\, dy\right)^{1/2} \le |h|^{1/2}\, \|f'\|_2 \le |h|^{1/2}\, \|f\|_{H^1(\mathbb{R})}.$$

Hence, for |h| < 1 we have


$$\int_{\mathbb{R}} |f(x + h) - f(x)|^2\, dx \le \|f\|_{H^1(\mathbb{R})}^2 \int_{a-2}^{b+2} |h|\, dx \le |h|\, \|f\|_{H^1(\mathbb{R})}^2\, (b - a + 4).$$

Now let f˜ ∈ F˜ be arbitrary. Then there exists a sequence (fn )n∈N ⊂ C 1 (R) ∩ F˜
with fn → f˜ in H 1 (R). In particular we have

kfn (· + h) − fn − (f˜(· + h) − f˜)k2 ≤ k(fn − f˜)(· + h)k2 + kfn − f˜k2


= 2kfn − f˜k2 −−−−−→ 0.
n→∞

It then follows that


$$\int_{\mathbb{R}} |\tilde f(x + h) - \tilde f(x)|^2\, dx = \lim_{n\to\infty} \int_{\mathbb{R}} |f_n(x + h) - f_n(x)|^2\, dx \le |h|\,(b - a + 4) \lim_{n\to\infty} \|f_n\|_{H^1(\mathbb{R})}^2$$
$$= |h|\,(b - a + 4)\, \|\tilde f\|_{H^1(\mathbb{R})}^2 \le C^2\, |h|\,(b - a + 4)\, \|f\|_{H^1((a,b))}^2.$$

From this inequality we immediately get that the left hand side converges
uniformly to zero as h → 0. This shows that the closures of bounded sets in
H 1 ((a, b)) are compact subsets of L2 ((a, b)). This result can also be rephrased
in the following way: the natural inclusion

ι : H 1 ((a, b)) ,→ L2 ((a, b))

is a compact operator. Further note that similar arguments apply to Sobolev spaces
on rectangles or other higher dimensional bounded subsets of Rn . The only
difficulty that arises for more complicated sets is the fact that it is difficult to
extend Sobolev functions to all of Rn . In fact, this may not be possible if
the boundary of the set has bad regularity.

Moreover, one can alternatively directly apply the Arzelà–Ascoli theorem


to show the compactness of the embedding ι : H 1 ((a, b)) ,→ L2 ((a, b)). In fact,


the Arzelà–Ascoli theorem gives the stronger fact that the Sobolev embedding
H 1 ((a, b)) ,→ C([a, b]) is compact.
We now apply the above embedding to a particle in an infinitely high
square well.

Example 2.3.9 (Hamiltonian for an infinitely high square well). Consider


an one-dimensional particle in an infinitely high square well, i.e. a particle
subject to the potential

0, x ∈ [a, b]


V (x) = 
∞, x < [a, b]

for some −∞ < a < b < ∞. Ignoring physical constants, the Hamiltonian of this
system is a self-adjoint extension of $A = -\frac{d^2}{dx^2}$ with D(A) = Cc∞ ((a, b)) and the
underlying Hilbert space H = L2 ((a, b)). The most common choice here is

{f ∈ H 2 ((a, b)) : f (a) = f (b) = 0},

for which we have already solved the eigenvalue problem in the physics parts
of the lecture. We leave it to the reader to verify the self-adjointness of the
operator with this domain. However, there are also other possible self-adjoint
extensions corresponding to different physics. Let (B, D(B)) be an arbitrary
self-adjoint extension of A. Then we have D(B) ⊂ H 2 ((a, b)). In particular, we
have
R(i, B)L2 ((a, b)) ⊂ D(B) ⊂ H 2 ((a, b)) ⊂ H 1 ((a, b)).
and therefore R(i, B) is compact as the composition of the bounded opera-
tor R(i, B) : L2 ((a, b)) → H 1 ((a, b)) and the compact operator ι : H 1 ((a, b)) →
L2 ((a, b)). Hence, it follows from Corollary 2.3.6 that L2 ((a, b)) has an orthonor-
mal basis consisting of eigenvectors of B.
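Numerically, the orthonormal eigenbasis for the standard choice of domain can be approximated with a finite-difference discretization. The following sketch (assuming NumPy; the grid size and the interval (0, 1) are illustrative choices) reproduces the familiar eigenvalues (nπ)² of the Dirichlet Laplacian:

```python
# Sketch (assuming NumPy): finite-difference Dirichlet Laplacian on (0, 1).
# Its low eigenvalues approximate (n*pi)^2 and the eigenvectors are orthonormal.
import numpy as np

N = 500
h = 1.0 / (N + 1)
main = 2.0 * np.ones(N) / h**2
off = -1.0 * np.ones(N - 1) / h**2
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)   # -d^2/dx^2 with f(0)=f(1)=0

ev, V = np.linalg.eigh(H)
print(ev[:4])                             # approximately pi^2, (2*pi)^2, (3*pi)^2, ...
print((np.pi * np.arange(1, 5))**2)
print(np.allclose(V.T @ V, np.eye(N)))    # numerically orthonormal eigenvectors
```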

It is interesting to compare this with the situation of a particle in a finitely


high square well.

Remark 2.3.10 (Hamiltonian for a finitely high square well). In physics


one usually considers the easiest case, i.e. the potential is constant outside the
box and one therefore has

0, x ∈ [a, b]


V (x) = 
V0 , x < [a, b]

for some V0 > 0. The Hamiltonian is then given as Ĥ = −∆ + V with domain


H 2 (R). Note that H 2 (R) is the natural domain of −∆, where V (x) is a multipli-
cation operator with a bounded function and therefore a bounded operator on
L2 (R). We will later see that the sum of a self-adjoint operator with a bounded


symmetric (and therefore self-adjoint) operator is always a self-adjoint opera-


tor with the domain of the unbounded self-adjoint operator. In contrast to the
case of a subset of finite measure, H 2 (R) is not compactly embedded in L2 (R).
We therefore cannot deduce that Ĥ has an orthonormal basis of eigenvectors.
In fact, it is well-known from physics that although there may exist some
eigenvectors of Ĥ representing bound states of the system whose energy is
smaller than V0 , the other part of the spectrum corresponding to states whose
energy is bigger than V0 is continuous. Hence, as long as the energy is smaller
than the potential barrier, the part of the spectrum looks like the spectrum for
a constrained particle, whereas when the energy is bigger than the barrier, the
part of the spectrum looks like the spectrum of a free particle. This situation
is prototypical for systems with a potential.
Moreover, notice the following fun fact: when one solves the above eigenvalue
problem explicitly in the usual way, one obtains three differential equations in
the spatial regions (−∞, a), [a, b] and (b, ∞) with solutions ψ1 , ψ2 and ψ3 . One
then usually argues with some (maybe mysterious) hand-waving that ψ1 (a) =
ψ2 (a) and ψ2 (b) = ψ3 (b) and the same holds for the first derivatives. From a
mathematical perspective this is very clear: the solution ψ must lie in D(Ĥ) =
H 2 (R). By the Sobolev embeddings a solution ψ of the eigenvalue problem must
lie in H 2 (R) ⊂ C 1 (R). But ψ and its first derivative are continuous at a and b if
and only if the above conditions are satisfied! However, the second derivative
of an H 2 (R)-function need not be continuous and therefore we do not require
compatibility conditions such as ψ1′′(a) = ψ2′′(a).

2.3.2 Trace Class Operators


The spectral theorem for compact self-adjoint operators also allows us to
give sense to the definition of trace-class operators. Recall that there are two
different types of states of a quantum mechanical system: pure states, which
are often modeled as elements in the underlying Hilbert space and the more
general mixed states which are thought of as (infinite) convex combinations of
pure states. Mixed states are modeled as trace class operators with trace equal
to one. With this motivation in our minds we now define trace class operators.
We only deal with the infinite dimensional case. The obvious modifications
for the finite dimensional case are left to the reader. Note that if A is a
compact linear operator, the compact operator A∗ A has an orthonormal basis
of eigenvectors (en )n∈N by the spectral theorem (Theorem 2.3.5). Moreover,
each eigenvalue λn is non-negative. This allows us to define the bounded
operator
$$(A^* A)^{1/2} x = \sum_{n=1}^{\infty} \sqrt{\lambda_n}\, \langle e_n|x\rangle\, e_n.$$


This is a particularly easy instance of the so-called functional calculus for


(unbounded) self-adjoint operators studied in the next section.

Definition 2.3.11 (Trace class operator). Let H be a separable Hilbert space


of infinite dimension. An operator A ∈ K(H) is called trace class if for one
(equiv. all) orthonormal bases (en )n∈N of H
$$\|A\|_1 \coloneqq \sum_{n=1}^{\infty} \bigl\langle (A^* A)^{1/2} e_n \big| e_n \bigr\rangle < \infty.$$

In this case the sum



$$\operatorname{Tr} A \coloneqq \sum_{n=1}^{\infty} \langle A e_n | e_n \rangle$$
is independent of the orthonormal basis and called the trace of A.

One can show that the space of trace class operators endowed with the
norm k·k1 is a Banach space in which the finite rank operators are dense.
Note that every linear operator on a finite dimensional Hilbert space is
trace class and its trace coincides with the usual trace for matrices. Using
the general functional calculus for self-adjoint operators, we will soon see
that we can give sense to the square root of A∗ A for arbitrary bounded operators.
One can then define trace class operators without assuming the compactness
of A. Nevertheless one obtains the same class of operators as one can show
that trace class implies compactness. Moreover, observe that if a trace-class
operator A has a basis of eigenvectors (en )n∈N with corresponding eigenvalues
(λn )n∈N , we obtain

$$\operatorname{Tr} A = \sum_{n=1}^{\infty} \langle A e_n | e_n \rangle = \sum_{n=1}^{\infty} \lambda_n.$$
Note that it makes no sense to consider unbounded operators in connection
with the trace. Even if an unbounded operator A has an orthonormal basis
of eigenvectors, the associated eigenvalues cannot be summable: otherwise they
would be bounded (they would even form a zero sequence) and would therefore define a
bounded (even compact) operator.
One now can precisely define a mixed state as a positive trace class oper-
ator ρ ∈ K(H) with Tr ρ = 1. Here positive means that ρ is self-adjoint and
all eigenvalues of ρ are non-negative. It follows from the spectral theorem
for compact self-adjoint operators (Theorem 2.3.5) that there exists an or-
thonormal basis (ψn )n∈N of eigenvectors with eigenvalues pn ≥ 0 such that
$\rho(\varphi) = \sum_{n=1}^{\infty} p_n \langle \psi_n|\varphi\rangle \psi_n$ for all ϕ ∈ H. Hence,
$$\operatorname{Tr}\rho = \sum_{n=1}^{\infty} p_n = 1.$$


The state ρ can be interpreted as a discrete probability distribution: the sys-


tem is in the state ψn with probability pn . Such a situation arises naturally
when one considers an entangled state ψ on a composite system described by
a tensor product H1 ⊗ H2 for which only the information on H1 is accessible
to the observer due to some limitations. Alternatively, one works with mixed
states if the system under consideration has an uncertain probabilistic prepa-
ration history, for example if you measure the unpolarized light emitted by a
light bulb. In physics ρ is also often called a density matrix or density operator.
Recall that a pure state |ψi ∈ H can be represented as the density matrix

ρ = |ψihψ|,

i.e. as the one-dimensional projection onto the span of |ψi (provided |ψi is
normalized).

Remark 2.3.12. Please be careful: there are two different notions in quantum
mechanics related with probabilistic measurements. Let us illustrate the two
concepts in a two-dimensional Hilbert space C2 with pure states ψ1 = (1, 0)T
and ψ2 = (0, 1)T which are both eigenvectors for the observable Ĥ = diag(1, 2).
Both states are eigenstates for Ĥ and therefore have a certain measurement
outcome, namely 1 for ψ1 and 2 for ψ2 . Now consider the superposition
$$\psi = \frac{1}{\sqrt{2}}(\psi_1 + \psi_2).$$
This is again a pure state. Measuring the observable Ĥ, we will obtain both
outcomes 1 and 2 with probability equal to 1/2 each. Nevertheless we will know for
sure that the system is in the state ψ. Now consider the density matrix
$$\rho = \frac{1}{2}\bigl(\psi_1 \psi_1^T + \psi_2 \psi_2^T\bigr) = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.$$

Now measuring Ĥ gives


$$\operatorname{Tr}(\rho \hat H) = \frac{1}{2}(1 + 2) = \frac{3}{2}.$$
This time ρ models a mixed state which is in state ψ1 and in state ψ2 with
chances equal to 1/2 each, i.e. a statistical mixture of both states. In contrast,
the state ψ has zero probability to be in one of the states ψ1 and ψ2 ; it is definitely
in the state ψ. The quantity Tr(ρĤ) gives the expectation value of the
measurement outcome of Ĥ for such an ensemble.
The difference of |ψihψ| and ρ may be seen experimentally if one takes
the measurement of some non-commuting observable. In fact, consider the
observable
$$\hat I = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$


The operator Î has the eigenvalues 1 and −1 with normalized eigenvectors
$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \psi = \frac{1}{\sqrt{2}}(\psi_1 + \psi_2) \quad\text{and}\quad \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{\sqrt{2}}(\psi_1 - \psi_2).$$
If one considers the pure state |ψihψ| the measurement outcome for Î is equal
to 1 with probability 1. If one however considers the mixed state ρ, one has a
50% chance to be in the state ψ1 or ψ2 . For both states one has a 50% chance
to obtain the measurement outcome 1 for Î, e.g. for ψ1 one calculates the
square of the norm of
$$(|\psi\rangle\langle\psi|)(\psi_1) = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
as being equal to 1/2. Altogether, for the mixed state ρ we have a 50% chance
to measure 1 for Î, in contrast to the pure state |ψihψ|.
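The numbers in this remark are easily reproduced. A small sketch (assuming NumPy) computes the expectation values Tr(ρĤ) and Tr(ρÎ) for the pure state |ψihψ| and the mixed state ρ; the observable Ĥ cannot distinguish them, while Î can:

```python
# Sketch (assuming NumPy) reproducing the numbers of this remark: the pure state
# psi and the mixed state rho give the same expectation for H_hat but can be told
# apart by the non-commuting observable I_hat.
import numpy as np

psi1 = np.array([1.0, 0.0]); psi2 = np.array([0.0, 1.0])
psi = (psi1 + psi2) / np.sqrt(2)

H = np.diag([1.0, 2.0])
I_hat = np.array([[0.0, 1.0], [1.0, 0.0]])

rho_pure = np.outer(psi, psi)                                # |psi><psi|
rho_mixed = 0.5 * (np.outer(psi1, psi1) + np.outer(psi2, psi2))

print(np.trace(rho_pure @ H), np.trace(rho_mixed @ H))          # both 1.5
print(np.trace(rho_pure @ I_hat), np.trace(rho_mixed @ I_hat))  # 1.0 versus 0.0
```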

Remark 2.3.13. If you still feel unsure about pure and mixed states, the fol-
lowing analogy to classical mechanics may help: in classical mechanics the
state of a system (say particle) is determined by its coordinate in the phase
space, e.g. by its position and momentum, whereas in analogy a quantum
mechanical systems is determined by its state ψ ∈ H. The time evolution of
these states is then governed by the Hamiltonian respectively Schrödinger
equations. In practice, one however often has to work with huge ensembles of
individual particles. In classical mechanics this is usually seen as a collection
of individual systems in different states and modelled via a probability dis-
tribution over the phase space. The quantum mechanical analogue for such
ensembles are mixed states which are represented by density operators.

Remark 2.3.14. It may seem unsatisfactory to only allow discrete probability


distributions for density matrices. This is what is usually done in most
introductory texts for quantum mechanics and allows the description of
most physical phenomena. Ultimately one can generalize the algebras of
observables from Hilbert space operators to abstract operator algebras such
as C ∗ - and von Neumann algebras. In this context one also obtains a general
concept of pure and mixed states. This is the so called algebraic formulation
of quantum mechanics. The introduction of abstract algebras is motivated by
the by now mostly accepted fact that not all self-adjoint operators can indeed
be realized as physical observables. Moreover, the algebraic approach has
become a very useful tool in the study of quantum systems with infinitely many
degrees of freedom, for example in quantum statistical mechanics and quantum field
theory.

We now state some basic properties of trace class operators which are
commonly used in practice.


Proposition 2.3.15 (Properties of trace class operators). Let H be a (separa-


ble) Hilbert space. The trace class operators have the following properties.

(a) They form a subspace of B(H) and the composition of a bounded and a trace
class operator is again trace class.

(b) The trace mapping from the trace class operators is linear and satisfies

Tr(ST ) = Tr(T S)

for all S ∈ B(H) and all trace class operators T .

2.3.3 The Spectral Theorem for Unbounded Self-Adjoint


Operators
We now come to one of the central results of this lecture, the spectral theorem
for unbounded self-adjoint operators. We will motivate and give two different
equivalent formulations of the spectral theorem. For the first version observe
that the spectral theorem for a compact self-adjoint operator T on an infinite-
dimensional separable Hilbert space (Theorem 2.3.5) can be reformulated
as follows. Let (λn )n∈N be the sequence of eigenvalues with respect to the
orthonormal basis of eigenvectors (en )n∈N given by the spectral theorem.
Define the multiplication operator

M : ` 2 (N) → ` 2 (N)
(xn )n∈N 7→ (λn xn )n∈N

Then the spectral theorem says that T is unitarily equivalent to M, i.e. T =


U −1 MU for U : x 7→ (hen |xi)n∈N . For general self-adjoint operators this cannot
remain true as stated above, however it holds true if one allows different L2 -
spaces than ` 2 (N). This is the following multiplicative version of the spectral
theorem.

Theorem 2.3.16 (Spectral theorem for self-adjoint operators – multiplica-


tive version). Let (A, D(A)) be a self-adjoint operator on some Hilbert space H.
Then there exists a measure space (Ω, Σ, µ) (σ -finite if H is separable), a measurable
function m : Ω → C and a unitary operator U : H → L2 (Ω, Σ, µ) such that

(a) x ∈ D(A) if and only if m · U x ∈ L2 (Ω, Σ, µ);

(b) U AU −1 f = m · f for all f ∈ D(Mm ) = {f ∈ L2 (Ω, Σ, µ) : mf ∈ L2 (Ω, Σ, µ)}.

Recall that we have seen in Example 2.2.28 that Mm with the above domain
is a self-adjoint operator. The spectral theorem shows that up to unitary
equivalence every self-adjoint operator is of this form. The spectral theorem is


usually first proved for bounded normal operators. The case of an unbounded
self-adjoint operator A is then reduced to the bounded normal case via the
Cayley transform of A. However, we do not want to delve into the details and
refer to the literature. Instead, we present an example.

Example 2.3.17 (The spectral theorem for −∆). We have seen that −∆ with
domain H 2 (Rn ) is a self-adjoint operator in Example 2.2.30. As exploited in
the argument for its self-adjointness, −∆ is equivalent to the multiplication
operator Mm on L2 (Rn ) with m(x) = |x|2 via the unitary operator given by the
Fourier transform F : L2 (Rn ) → L2 (Rn ).

Note that the spectral theorem allows one to define a functional calculus
for self-adjoint operators. In fact, if A is a self-adjoint operator and f : σ (A) →
C is a measurable function, we may define the closed operator

f (A)x = U −1 ((f ◦ m) · U x) with D(f (A)) = {x ∈ H : (f ◦ m) · U x ∈ L2 (Ω, Σ, µ)}.

We will discuss this functional calculus in more detail soon when considering
the other variant of the spectral theorem via spectral measures.
Before giving the exact definition of such measures, let us again motivate
this version of the spectral theorem. Let A be a self-adjoint operator on Cn .
Let P1 , . . . , Pk denote the orthogonal projections onto the pairwise orthogonal
eigenspaces of A for the eigenvalues λ1 , . . . , λk . Since by the spectral theorem
these eigenspaces span the complete space Cn , we obtain
A = A · I = A (Σ_{l=1}^{k} P_l) = Σ_{l=1}^{k} λ_l P_l.

Hence, A can be decomposed as the sum of the orthogonal projections onto its
eigenspaces. The natural generalization of the family (P1 , . . . , Pk ) is the concept
of a projection-valued measure.

Definition 2.3.18 (Projection-valued measure). Let H be a Hilbert space. A


projection-valued measure on R is a mapping P : B(R) → B(H) satisfying the
following properties.

(i) For every Ω ∈ B(R) the operator P (Ω) is an orthogonal projection;

(ii) P (∅) = 0 and P (R) = Id;

(iii) For (Ωn )n∈N ⊂ B(R) pairwise disjoint one has

P(∪_{n∈N} Ω_n) x = lim_{N→∞} Σ_{n=1}^{N} P(Ω_n) x   for all x ∈ H.


The last property in the definition is called the strong continuity of P . Of


course, analogously one can define a projection-valued measure on Rn . One
can deduce from the above properties the non-trivial fact that

P (Ω1 )P (Ω2 ) = P (Ω1 ∩ Ω2 ) for all Ω1 , Ω2 ∈ B(R).

First observe that for Ω1 , Ω2 ∈ B(R) we have

P (Ω1 ∩ Ω2 ) + P (Ω1 ∪ Ω2 ) = P (Ω1 ∩ Ω2 ) + P (Ω1 ∪ (Ω2 \ (Ω1 ∩ Ω2 )))


= P (Ω1 ∩ Ω2 ) + P (Ω1 ) + P (Ω2 \ (Ω1 ∩ Ω2 ))
= P (Ω1 ∩ Ω2 ) + P (Ω1 ) + P (Ω2 ) − P (Ω1 ∩ Ω2 )
= P (Ω1 ) + P (Ω2 ).

Now, if Ω1 and Ω2 are disjoint, we obtain by taking squares in the last


equation

P(Ω1 ∪ Ω2) = P(Ω1 ∪ Ω2)² = (P(Ω1) + P(Ω2))²
= P(Ω1) + P(Ω2) + P(Ω1)P(Ω2) + P(Ω2)P(Ω1)
= P(Ω1 ∪ Ω2) + P(Ω1)P(Ω2) + P(Ω2)P(Ω1).

This shows that


P (Ω1 )P (Ω2 ) + P (Ω2 )P (Ω1 ) = 0.
Multiplying both sides of the equation with P (Ω2 ) from the right we see
that P (Ω1 )P (Ω2 ) = −P (Ω2 )P (Ω1 )P (Ω2 ), which is self-adjoint (an orthogonal
projection satisfies P = P ∗ ). This shows that

P (Ω1 )P (Ω2 ) = (P (Ω1 )P (Ω2 ))∗ = P (Ω2 )P (Ω1 ) = 0.

Now, for the general case we obtain as desired

P (Ω1 )P (Ω2 ) = (P (Ω1 \ Ω2 ) + P (Ω1 ∩ Ω2 ))(P (Ω2 \ Ω1 ) + P (Ω1 ∩ Ω2 ))


= P 2 (Ω1 ∩ Ω2 ) = P (Ω1 ∩ Ω2 ).

Further, we leave it as an exercise to the reader to check that a projection-valued measure is monotone in the sense that Ω1 ⊂ Ω2 for two Borel sets implies ⟨P(Ω1)x|x⟩ ≤ ⟨P(Ω2)x|x⟩ for all x ∈ H. In particular, P(Ω2) = 0 implies P(Ω1) = 0 for Ω1 ⊂ Ω2.
To a projection-valued measure P on R there corresponds a projection-
valued function λ 7→ Pλ = P ((−∞, λ)), a so called projection-valued resolution of
the identity. This function is characterized by the following properties.

(a) Pλ Pµ = Pmin{λ,µ} for all λ, µ ∈ R;

(b) limλ→−∞ Pλ x = 0 and limλ→∞ Pλ x = x for all x ∈ H;


(c) limµ↑λ Pµ x = Pλ x for all x ∈ H.

Note that the above properties imply that for all x ∈ H the function λ ↦ ⟨x|P_λ x⟩ is a distribution function of the bounded measure Ω ↦ ⟨x|P(Ω)x⟩. If ‖x‖ = 1, it is the distribution function of a probability measure. Conversely, a standard result from measure and probability theory says that for every bounded distribution function there exists a unique bounded measure with the given distribution function. This allows one to recover the spectral measure P from its set of distribution functions λ ↦ ⟨x|P_λ x⟩ for x ∈ H.
For general x, y ∈ H we moreover obtain (complex-valued!) measures related
to the just considered distribution functions via the polarization identity by
⟨y|P(Ω)x⟩ = (1/4) Σ_{k=1}^{4} i^k ⟨x + i^k y | P(Ω)(x + i^k y)⟩

for Ω ∈ B(R).
Note that for x ∈ H the map Ω 7→ hx|P (Ω)xi defines a Borel measure in the
usual sense of measure theory because of

hx|P (Ω)xi = hx|P 2 (Ω)xi = hP (Ω)x|P (Ω)xi ≥ 0 for all x ∈ H.

Hence, we can apply the usual theory of Lebesgue integration to these measures. In particular, a measurable function f : R → K is said to be finite almost everywhere with respect to P if it is finite almost everywhere with respect to all the measures Ω ↦ ⟨x|P(Ω)x⟩ for x ∈ H.
We now can state the spectral theorem in its version for projection-valued
measures and its various consequences. To motivate the domains involved
below, consider a step function f = Σ_{k=1}^{n} a_k 1_{Ω_k} for some pairwise disjoint Ω_k ∈ B(R) and a_k ∈ C. Then as for the usual Lebesgue integral we can directly define the bounded operator

∫_R f(λ) dP(λ) ≔ Σ_{k=1}^{n} a_k P(Ω_k).

Note that for the value of the norm of this operator evaluated at some x ∈ H
we obtain
‖∫_R f(λ) dP(λ) x‖² = ⟨Σ_{k=1}^{n} a_k P(Ω_k)x | Σ_{l=1}^{n} a_l P(Ω_l)x⟩
= Σ_{k=1}^{n} Σ_{l=1}^{n} ā_k a_l ⟨P(Ω_k)x|P(Ω_l)x⟩ = Σ_{k=1}^{n} Σ_{l=1}^{n} ā_k a_l ⟨x|P(Ω_k)P(Ω_l)x⟩
= Σ_{k=1}^{n} Σ_{l=1}^{n} ā_k a_l ⟨x|P(Ω_k ∩ Ω_l)x⟩ = Σ_{k=1}^{n} |a_k|² ⟨x|P(Ω_k)x⟩
= ∫_R |f(λ)|² d⟨x|P(λ)x⟩.

The above isometry allows one to extend the integral on the left hand side
pointwise for a given x ∈ H for all measurable functions f : R → C which are
square-integrable with respect to the measure Ω 7→ hx|P (Ω)xi = kP (Ω)xk2 via
a limiting argument. Moreover, this calculation together with the Cauchy–Schwarz inequality also shows the weaker estimate

|∫_R f(λ) d⟨y|P(λ)x⟩| ≤ ‖f‖_∞ ‖x‖ ‖y‖

for all x, y ∈ H which is fundamental for the bounded functional calculus of


A. However, in order to avoid dealing with such operator-valued integrals we
will work with the measures Ω 7→ hy|P (Ω)xi instead.

Theorem 2.3.19 (Spectral theorem for self-adjoint operators – version with


projection-valued measures). For every self-adjoint operator (A, D(A)) on some
Hilbert space H there exists a unique projection-valued measure P with the follow-
ing properties.

(a) We have

D(A) = {x ∈ H : ∫_R λ² d⟨x|P(λ)x⟩ < ∞}

and for every x ∈ D(A) and y ∈ H we have

⟨y|Ax⟩ = ∫_R λ d⟨y|P(λ)x⟩.

Further, for the spectrum we have

λ ∈ σ(A) ⇔ P((λ − ε, λ + ε)) ≠ 0 for all ε > 0.

(b) Let f : R → C be measurable and finite almost everywhere with respect to P. Then one obtains a densely defined linear operator f(A) with

D(f(A)) = {x ∈ H : ∫_R |f(λ)|² d⟨x|P(λ)x⟩ < ∞}

which is defined for x ∈ D(f (A)) and y ∈ H by


⟨y|f(A)x⟩ = ∫_R f(λ) d⟨y|P(λ)x⟩.

Moreover, the correspondence f 7→ f (A) has the following properties: one


has f(A)∗ = f̄(A), where f̄ denotes the complex conjugate of f, and f(A) is bounded if and only if f is bounded on σ(A).
Further, f 7→ f (A) is an algebra homomorphism from the space of bounded
measurable functions on R to B(H).


Let us now give some examples of projection-valued measures and the


associated functional calculus.
Example 2.3.20 (The PVM for the position operator). Let H = L2 (R) and
for Ω ∈ B(R) let P (Ω)u = 1Ω u for all u ∈ L2 (R). We leave it to the reader to
verify that P is a projection-valued measure (for the third property use the
dominated convergence theorem). We have for all u, w ∈ L2 (R) and Ω ∈ B(R)
⟨u|P(Ω)w⟩ = ∫_R ū(x) 1_Ω(x) w(x) dx = ∫_Ω ū(x) w(x) dx.

Hence, the measure is given by the density ūw. For the integral over the
projection-valued measure we obtain for a bounded measurable function
f : R → C that (if you do not know this identity check its validity first for
simple functions and then write a general function as the monotone limit of
step functions and use the monotone convergence theorem; you can take the
more general arguments in the next example as a guideline)

∫_R f(λ) d⟨u|P(λ)w⟩ = ∫_R f(x) ū(x) w(x) dx.

This is exactly hu|Tf wi, where Tf w = f · w, i.e. Tf acts as multiplication with


f . It now follows, for example by comparing resolvents or carefully repeating
the above argument for unbounded functions, that P is the resolution of the
identity for the position operator on R.
Example 2.3.21 (PVMs for multipliers). Let (X, Σ, µ) be a measure space
and m : X → R a measurable function. We have already seen that we have
a functional calculus for the self-adjoint multiplication operator Mm : to a
bounded measurable function f : R → R it associates the bounded operator
f(M_m)u = (f ◦ m) · u. In particular, if Ω ∈ B(R), we obtain the orthogonal projection 1_Ω(M_m)u = (1_Ω ◦ m) · u = 1_{m^{-1}(Ω)} u. We therefore expect that the family of
these projections is the projection-valued measure for Mm . Let us verify this
explicitly.
Hence, generalizing the previous example, for Ω ∈ B(R) we let P (Ω)u =
1m−1 (Ω) u for u ∈ L2 (X, Σ, µ). With the same arguments as in the previous
example one can directly verify that P is a projection-valued measure on
L2 (X, Σ, µ). Now, let u = 1A and w = 1B for some A, B ∈ Σ and Ω ∈ B(R). Then
⟨u|P(Ω)w⟩ = ∫_X 1_A(x) 1_B(x) 1_{m^{-1}(Ω)}(x) dµ(x) = µ(A ∩ B ∩ m^{-1}(Ω)).

Now, if f = Σ_{k=1}^{n} a_k 1_{Ω_k} for some a_k ≥ 0 and disjoint Ω_k ∈ B(R) is a step function, we obtain

∫_R f(λ) d⟨u|P(λ)w⟩ = Σ_{k=1}^{n} a_k ⟨u|P(Ω_k)w⟩ = Σ_{k=1}^{n} a_k µ(A ∩ B ∩ m^{-1}(Ω_k))
= Σ_{k=1}^{n} a_k ∫_{A∩B} 1_{m^{-1}(Ω_k)} dµ = Σ_{k=1}^{n} a_k ∫_{A∩B} 1_{Ω_k} ◦ m dµ = ∫_{A∩B} f ◦ m dµ
= ∫_X 1_{A∩B} f ◦ m dµ = ∫_X 1_A 1_B f ◦ m dµ.

Using the monotone convergence theorem we see that the above identity
extends to all positive functions f : R → R. By linearity one then can pass to
all functions which are integrable with respect to the measure Ω 7→ hu|P (Ω)wi.
In particular, the identity holds for all bounded functions if A and B have
finite measure. Since both sides of the equality are sesquilinear, the identity
therefore holds for all integrable step functions u and w and all bounded
f . Since the step functions are dense in L2 (X, Σ, µ), the identity extends to
all square integrable functions u and w. Writing this out, we have for all u, w ∈ L2(X, Σ, µ) and all bounded measurable f : R → R that

∫_R f(λ) d⟨u|P(λ)w⟩ = ∫_X ū w (f ◦ m) dµ = ⟨u|M_{f◦m} w⟩_{L²(X,Σ,µ)}.

Again, by comparing resolvents, we see that P is the projection-valued measure


associated to the multiplication operator Mm on L2 (X, Σ, µ).

In fact, notice that this example actually gives a proof of the spectral
theorem in the version with projection-valued measures based on the mul-
tiplicative version of the spectral theorem. In fact, one can verify that one
version of the spectral theorem implies the other. The arguments for the con-
verse implication can be found in [Tes12, Lemma 3.3 and following results].

Example 2.3.22 (PVM for orthonormal bases of eigenvectors). Let (A, D(A))
be a self-adjoint operator on a separable infinite-dimensional Hilbert space
and (en )n∈N an orthonormal basis of eigenvectors for A for the real eigenvalues
(λn )n∈N . For Ω ∈ B(R) we define
P(Ω) = Σ_{n: λ_n ∈ Ω} |e_n⟩⟨e_n|.

Then P (Ω) is the orthogonal projection onto the span of the eigenspaces for
the eigenvalues which lie in Ω. Moreover, P is a projection-valued measure.
In fact, after a unitary transformation this example is again a special case of the previous one. Nevertheless let us verify the required properties directly. In fact, everything is clear except for the σ-additivity. For this note that one has a direct decomposition

H = ⊕_k H_k,


where Hk are the pairwise orthogonal eigenspaces for the different eigen-
values λ_k of A. Let P_k be the orthogonal projection onto H_k. Then P_k = Σ_{n: λ_n = λ_k} |e_n⟩⟨e_n|. It follows from orthogonality that

‖x‖² = Σ_k ‖P_k x‖²   for all x ∈ H.   (2.2)

Now, let (Ωn )n∈N ⊂ B(R) be pairwise disjoint. Then we have for x ∈ H

‖Σ_{n=1}^{N} P(Ω_n)x − P(∪_{n∈N} Ω_n)x‖² = ‖Σ_{k: λ_k ∈ ∪_{n=1}^{N} Ω_n} P_k x − Σ_{k: λ_k ∈ ∪_{n=1}^{∞} Ω_n} P_k x‖²
= ‖Σ_{k: λ_k ∈ ∪_{n=N+1}^{∞} Ω_n} P_k x‖² = Σ_{k: λ_k ∈ ∪_{n=N+1}^{∞} Ω_n} ‖P_k x‖² → 0 as N → ∞

because of the absolute square-summability of (2.2). We leave it to the reader


to work out the concrete representation of the functional calculus for P . For
example by comparing resolvents, one then can see that P is the projection-
valued measure for A.
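As a purely illustrative aside (not part of the original notes), the construction of this example is easy to reproduce in finite dimensions, where a symmetric matrix plays the role of A and its eigenprojections build the projection-valued measure. The following hedged Python sketch assumes numpy is available; all identifiers are hypothetical stand-ins:

import numpy as np

rng = np.random.default_rng(1)
n = 6
X = rng.standard_normal((n, n))
A = (X + X.T) / 2                      # symmetric matrix playing the role of A
lam, E = np.linalg.eigh(A)             # eigenvalues lambda_k, orthonormal eigenvectors E[:, k]

def P(indicator):
    # spectral projection P(Omega) = sum of |e_k><e_k| over eigenvalues in Omega
    cols = E[:, indicator(lam)]
    return cols @ cols.T

P1 = P(lambda x: x > 0)
P2 = P(lambda x: np.abs(x) < 1)
P12 = P(lambda x: (x > 0) & (np.abs(x) < 1))

assert np.allclose(P1 @ P1, P1)                      # P(Omega) is an orthogonal projection
assert np.allclose(P1 @ P2, P12)                     # P(Omega_1)P(Omega_2) = P(Omega_1 ∩ Omega_2)
assert np.allclose(P(lambda x: x > -np.inf), np.eye(n))   # P(R) = Id

# functional calculus: f(A) = sum_k f(lambda_k) |e_k><e_k|
f = np.exp
fA = sum(f(l) * np.outer(E[:, k], E[:, k]) for k, l in enumerate(lam))
assert np.allclose(fA, E @ np.diag(f(lam)) @ E.T)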

2.3.4 Measurement in Quantum Mechanics


The spectral theorem allows us to model quantum mechanical measurements
in a mathematically exact way. From an abstract point of view a physical
theory (for example classical mechanics or quantum mechanics) is described
by a set of observables A and a set of states S in which the physical system
can be. A process of measurement is then the assignment A × S ∋ (A, ρ) ↦ µ_A,
where µA is a probability measure on (R, B(R)). For every Borel set Ω ⊂ R the
value µA (Ω) is then the probability that for a system in the state ρ the result
of a measurement of the observable A belongs to Ω.

The Born–von Neumann formula We have seen that for quantum mechanics
S is the set of all trace class operators on some fixed (separable) Hilbert
space H and A is the set of all self-adjoint operators on H. Let PA be the
projection-valued measure of A. Then µA is given by the Born–von Neumann
formula
µA (Ω) = Tr PA (Ω)ρ for Ω ∈ B(R).

Note that the trace is well-defined because PA (Ω)ρ is trace class as the compo-
sition of a bounded and a trace class operator. However, one still has to check
that µA is a probability measure. The only non-trivial fact is the σ -additivity
of µA . For this let (Ωk )k∈N be pairwise disjoint. Then for an orthonormal basis


(e_n)_{n∈N} we have

Σ_{k=1}^{N} µ_A(Ω_k) = Tr(P_A(∪_{k=1}^{N} Ω_k)ρ) = Σ_{n=1}^{∞} ⟨e_n|P_A(∪_{k=1}^{N} Ω_k)ρ e_n⟩.

Note that for all n ∈ N the sequence PA (∪Nk=1 Ωk )en converges to PA (∪k∈N Ωk )en
by the properties of a projection-valued measure. Since moreover the n-th
summand is dominated by hen |ρen i which is summable because ρ is trace
class, the series converges to Tr(PA (∪k∈N Ωk )ρ) by the dominated convergence
theorem.
Further, the expectation value of the observable A in the state ρ is (provided it exists)

⟨A⟩_ρ = ∫_R λ dµ_A(λ).
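Purely as an illustration (not part of the lecture), the Born–von Neumann formula can be tested in finite dimensions, where every observable is a Hermitian matrix and a pure state is a rank-one density matrix. A minimal Python sketch, assuming numpy is available and with all names chosen ad hoc:

import numpy as np

rng = np.random.default_rng(2)
n = 4
X = rng.standard_normal((n, n))
A = (X + X.T) / 2                              # observable
lam, E = np.linalg.eigh(A)

psi = rng.standard_normal(n); psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi)                       # pure state |psi><psi| with Tr(rho) = 1

def mu(indicator):
    # Born-von Neumann: mu_A(Omega) = Tr(P_A(Omega) rho)
    cols = E[:, indicator(lam)]
    P_Omega = cols @ cols.T
    return np.trace(P_Omega @ rho)

p_minus, p_plus = mu(lambda x: x <= 0), mu(lambda x: x > 0)
assert np.isclose(p_minus + p_plus, 1.0) and p_minus >= 0 and p_plus >= 0

# expectation value <A>_rho = integral of lambda dmu_A = Tr(A rho)
expectation = sum(l * mu(lambda x, l=l: np.isclose(x, l)) for l in lam)
assert np.isclose(expectation, np.trace(A @ rho))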

Some useful results We now state some useful related results without
proofs which are often used in physics. The first result can be verified by
simply checking the definitions and shows that under mild assumptions the
expectation value of a state can be calculated as shown in the physics part of
the lecture.

Proposition 2.3.23. Let A be an observable and ρ a state such that hAiρ exists (as
finite value) and Im ρ ⊂ D(A). Then Aρ is trace class and

hAiρ = Tr(Aρ).

In particular, if ρ = |ψihψ| is a pure state and ψ ∈ D(A), then

hAiρ = hAψ|ψi and hA2 iρ = kAψk2 .

We now come to the simultaneous measurement of several observables. We already know from physics that this is only possible if the observables commute. Note that it is not a priori clear what we mean by commuting observables in the unbounded case. The spectral theorem helps us here again.

Definition 2.3.24. We say that two self-adjoint operators A and B commute if


the corresponding projection-valued measures PA and PB commute, i.e.

PA (Ω1 )PB (Ω2 ) = PB (Ω2 )PA (Ω1 ) for all Ω1 , Ω2 ∈ B(R).

One can show that the above definition is equivalent to the fact that the re-
solvents of A and B commute, i.e. one has R(λ, A)R(µ, B) = R(µ, B)R(λ, A) for all
λ, µ ∈ C \ R. For commuting observables one has the following generalization
of the spectral theorem.


Proposition 2.3.25. Let A1 , . . . , An be finitely many pairwise commuting self-


adjoint operators on some Hilbert space H. Then there exists a unique projection-
valued measure P on B(Rn ) with the following properties.

(a) For every Ω = Ω1 × · · · × Ωn ∈ B(Rn ) we have

P (Ω) = PA1 (Ω1 ) · · · PAn (Ωn ).

(b) If λk is the k-th coordinate functional in Rn , i.e. λk (x1 , . . . , xn ) = xk , we have


for all k = 1, . . . , n
D(A_k) = {x ∈ H : ∫_{R^n} |λ_k|² d⟨x|P(λ)x⟩ < ∞}

and for every x ∈ D(A_k) and y ∈ H we have

⟨y|A_k x⟩ = ∫_{R^n} λ_k d⟨y|P(λ)x⟩.

Moreover, as in the case of a single operator one obtains a joint functional calculus
by integrating measurable functions on Rn against the spectral measure.

Note that as in the case of one observable the simultaneous measurement


of observables A = {A1 , . . . , An } of a quantum mechanical system in a state
ρ should be described by the probability measure µA on Rn given by the
following natural generalization of the Born-von Neumann formula:

µA (Ω) = Tr(PA1 (Ω1 ) · · · PAn (Ωn )ρ) for all Ω = Ω1 × · · · × Ωn ∈ B(Rn ).

However, for the above formula to define a probability measure for all states
ρ and all Ω as above we need that Ω1 × · · · × Ωn 7→ PA1 (Ω1 ) · · · PAn (Ωn ) extends
to a projection-valued measure on Rn . Since the product of two orthogonal
projections is an orthogonal projection if and only if the projections commute,
we see that the observables must commute pairwise. From a physical perspec-
tive this agrees with the requirement that the simultaneous measurement of
several observables should be independent of the order of the measurements
of the individual observables.
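As an illustrative aside (not part of the notes), commuting observables and the resulting joint probabilities are easy to exhibit numerically with two simultaneously diagonalizable matrices. The following Python sketch assumes numpy is available; the matrices and the chosen sets are hypothetical examples:

import numpy as np

rng = np.random.default_rng(5)
n = 4
D1 = np.diag(rng.standard_normal(n))
D2 = np.diag(rng.standard_normal(n))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
A1, A2 = Q @ D1 @ Q.T, Q @ D2 @ Q.T                # commuting: same eigenbasis Q
assert np.allclose(A1 @ A2, A2 @ A1)

def proj(A, indicator):
    # spectral projection of A onto the eigenvalues selected by the indicator
    lam, E = np.linalg.eigh(A)
    cols = E[:, indicator(lam)]
    return cols @ cols.T

P1 = proj(A1, lambda x: x > 0)
P2 = proj(A2, lambda x: x > 0)
assert np.allclose(P1 @ P2, P2 @ P1)               # the spectral projections commute
assert np.allclose((P1 @ P2) @ (P1 @ P2), P1 @ P2) # so their product is again a projection

# joint probability of {A1 > 0} x {A2 > 0} in a pure state psi
psi = rng.standard_normal(n); psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi)
p_joint = np.trace(P1 @ P2 @ rho)
assert -1e-10 <= p_joint <= 1 + 1e-10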

2.3.5 The Schrödinger Equation and Stone’s Theorem on


One-Parameter Unitary Groups
We now come to the time evolution of quantum mechanical systems. Recall
from the postulates of quantum mechanics that although measurements of
states are of a probabilistic nature, the time evolution of a state is deterministic.
In fact, given a system described by the Hilbert space H, the time evolution of


its states is completely determined by a special observable, i.e. a self-adjoint


operator, H on H. This operator is called the Hamiltonian of the system. This
time evolution is governed for a pure state ψ ∈ H by the Schrödinger equation


iℏ ∂/∂t ψ(t) = Hψ(t).
From a mathematical perspective there are several problems with the above
equation. First of all, the equation does not make sense in the usual way if the initial value ψ_0 or the solution ψ(t) does not lie in the domain of H. So how should one interpret the above equation? And even if ψ_0 ∈ D(H), why
does the above equation have a unique solution? In fact, if H is unbounded,
the usual theorems for existence and uniqueness of solutions involving a
Lipschitz condition do not apply.

The case of bounded Hamiltonians Nevertheless let us first take a look at


the case of a bounded Hamiltonian, i.e. H ∈ B(H), to get a feeling for the
problem. From now on we again ignore all physical constants. As for linear
systems of ordinary differential equations, we can explicitly write down the
solution of the Schrödinger equation. For arbitrary ψ0 ∈ H a solution is given
by

ψ(t) = e^{−iHt} ψ_0,   where e^{−iHt} = Σ_{k=0}^{∞} (−iHt)^k / k!.

Note that the series is absolutely convergent in B(H). In fact, because of the boundedness of H we have for t ∈ R

Σ_{k=0}^{∞} ‖(−iHt)^k‖ / k! ≤ Σ_{k=0}^{∞} ‖H‖^k |t|^k / k! = e^{|t| ‖H‖}.

Moreover, the solution is unique by the Picard–Lindelöf theorem which also


holds on Banach spaces with the same proof or by a direct argument as in the
case of systems of linear ordinary differential equations. Note that the above
argument does not work in the case of unbounded operators.

The case of unbounded Hamiltonians Note that in the case of a bounded


Hamiltonian the solution operators U(t) = e^{−iHt} are unitary and
satisfy the exponential law U (t + s) = U (t)U (s) for all t, s ∈ R. Moreover, one
has U (0) = Id. This leads to the following abstract definition.

Definition 2.3.26 (Unitary Group). A family of unitary operators (U (t))t∈R


on a Hilbert space H is called a unitary group if

(a) U (0) = Id;


(b) U (t + s) = U (t)U (s) for all t, s ∈ R.


Further, (U (t))t∈R is called strongly continuous if t 7→ U (t)x is continuous for
all x ∈ H.
From a physical perspective it is very reasonable to describe the time
evolution of a quantum mechanical system by a unitary group. In fact, the
existence of a family of mappings satisfying the exponential law follows from the fact that the time evolution of a system is uniquely determined. Moreover,
by the superposition principle these mappings should be linear. Further, by
the probabilistic interpretation of states each member of this family should
preserve the norm of pure states, i.e. should be given by isometries. The
surjectivity of these isometries comes from the requirement that the history
of each state can be traced back and the assumption that each pure state (or
at least a dense set) is physically realizable.
Moreover, it is natural to require some regularity on the map t 7→ U (t).
A result of J. von Neumann says that a unitary group (U (t))t∈R is already
strongly continuous if the orbits are weakly measurable, i.e. t 7→ hy|U (t)xi is
measurable for all x, y ∈ H. Hence, we would expect that for each Hamiltonian
there is an associated unitary group. This is indeed true.
Proposition 2.3.27. Let (A, D(A)) be a self-adjoint operator on a Hilbert space.
Then (U (t))t∈R defined by
U (t) = eitA for t ∈ R
by the functional calculus of A is a strongly continuous unitary group.
Proof. For t ∈ R define f_t(λ) = e^{itλ}. Then f_t is a bounded function on R. By the
functional calculus for self-adjoint operators U (t) = ft (A) defines a bounded
operator on H. Since the bounded functional calculus is compatible with
multiplication of functions, we have U (t)U (s) = ft (A)fs (A) = ft+s (A) = U (t + s)
and we have verified the exponential law. Further, we have clearly U (0) = Id.
Moreover, we have for all x, y ∈ H and t ∈ R

⟨y|U(t)∗U(t)x⟩ = ⟨y|f_t(A)∗ f_t(A)x⟩ = ∫_R |f_t(λ)|² d⟨y|P(λ)x⟩ = ∫_R 1 d⟨y|P(λ)x⟩ = ⟨y|x⟩.
Hence, U (t) is unitary. Finally, for the strong continuity observe that for x ∈ H
we have
‖U(t)x − U(s)x‖² = ∫_R |e^{iλt} − e^{iλs}|² d⟨x|P(λ)x⟩ = ∫_R |e^{iλ(t−s)} − 1|² d⟨x|P(λ)x⟩.
The right hand side goes to zero as t → s because of the dominated convergence
theorem. Altogether (U (t))t∈R is a unitary group.
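As an aside (not part of the original argument), for matrices the group (e^{itA}) can be computed directly and the properties just established can be checked numerically. A minimal Python sketch, assuming numpy and scipy are available and using a random Hermitian matrix as a hypothetical stand-in for A:

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
n = 5
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (X + X.conj().T) / 2                      # Hermitian ("self-adjoint") matrix

def U(t):
    return expm(1j * t * A)                    # U(t) = exp(itA)

t, s = 0.7, -1.3
assert np.allclose(U(t) @ U(t).conj().T, np.eye(n))   # unitarity
assert np.allclose(U(t + s), U(t) @ U(s))              # exponential law
assert np.allclose(U(0.0), np.eye(n))                  # U(0) = Id

# the generator can be recovered as -i d/dt U(t)|_{t=0} (finite-difference check)
h = 1e-6
assert np.allclose(-1j * (U(h) - np.eye(n)) / h, A, atol=1e-4)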


Conversely, we want to recover the self-adjoint operator out of the group


U (t) = eitA . In the bounded case this is again easy: the group mapping
R → B(H), t ↦ U(t), is differentiable and one has A = −i (d/dt) U(t)|_{t=0}.
the unbounded case because A is simply not bounded. However, the approach
works pointwise.

Definition 2.3.28 (Infinitesimal generator). Let H be a Hilbert space. The


infinitesimal generator of the strongly continuous unitary group (U (t))t∈R on
H is the unbounded operator (A, D(A)) defined as
D(A) = {x ∈ H : lim_{h→0} (U(h)x − x)/h exists},
Ax = −i lim_{h→0} (U(h)x − x)/h.

Moreover, one would expect a relation between solutions of the differential


equation ẋ(t) = −iAx(t), i.e. the Schrödinger equation, and the unitary group.
This is studied in the next result.

Proposition 2.3.29. Let (U(t))_{t∈R} be a strongly continuous unitary group on a Hilbert space H with infinitesimal generator A. Then for every x_0 ∈ D(A) the function x(t) = U(t)x_0 satisfies x(t) ∈ D(A) for all t > 0. Moreover, x(t) is the unique (classical) solution of the problem

ẋ(t) = −iAx(t)   (t ≥ 0),
x(0) = x_0.

Proof. Let x0 ∈ D(A). Then we have for all t > 0


lim_{h→0} (U(t+h)x_0 − U(t)x_0)/h = lim_{h→0} (U(h)U(t)x_0 − U(t)x_0)/h = U(t) lim_{h→0} (U(h)x_0 − x_0)/h.

Hence, U (t)x0 ∈ D(A) and x(t) is differentiable with ẋ(t) = −iAU (t)x0 =
−iU (t)Ax0 . Now, let y be a second solution of the problem. For t > 0 consider
the function z(s) = U (t − s)y(s). Then it follows from a variant of the product
rule that

ż(s) = iAU (t − s)y(s) + U (t − s)ẏ(s) = iU (t − s)Ay(s) − iU (t − s)Ay(s) = 0.

This shows that ż(s) = 0 for all s ∈ R. By reducing to the case of the functions
s 7→ hw|z(s)i for w ∈ H, we see from the scalar case that the function z is
constant. In particular, we have U (t)x0 = z(0) = z(t) = y(t). This establishes
the uniqueness of the solutions.


Hence, for elements in the domain of the generator we obtain classical solu-
tions of the Schrödinger equation. In the general case x0 ∈ H we may therefore
interpret U (t)x0 as a generalized solution of the Schrödinger equation.
As one would hope the infinitesimal generator of every strongly continu-
ous unitary group is self-adjoint. As a preliminary result in this direction we
show the following lemma.

Lemma 2.3.30. Let (U (t))t∈R be a strongly continuous unitary group on some


Hilbert space H. Then its infinitesimal generator A is essentially self-adjoint.

Proof. For x, y ∈ D(A) we have because of U ∗ (t) = U (−t) for all t ∈ R


* + * +
U (h)x − x U (−h)y − y
hy|Axi = y −i lim = lim i x = hAy|xi.
h→0 h h→0 h

This shows that A is symmetric. Hence, A is closable by Lemma 2.2.26. Now


let x ∈ H. Note that as in the scalar case we can define the Riemann integral of
continuous functions as the limit of approximating sums. We have for R > 0
and x_R = ∫_0^R e^{−t} U(t)x dt

e^{−h} (U(h)x_R − x_R)/h + (e^{−h} x_R − x_R)/h = (e^{−h} U(h)x_R − x_R)/h
= (1/h) ∫_0^R (e^{−(t+h)} U(t+h)x − e^{−t} U(t)x) dt
= (1/h) ∫_R^{R+h} e^{−t} U(t)x dt − (1/h) ∫_0^h e^{−t} U(t)x dt → e^{−R} U(R)x − x as h → 0.

Hence, x_R ∈ D(A) and (iA − 1)x_R = e^{−R} U(R)x − x. Note that both x_R and (iA − 1)x_R converge as R → ∞. Since the closure Ā of A is closed, this implies that the improper Riemann integral ∫_0^∞ e^{−t} U(t)x dt lies in the domain of Ā and

(iĀ − 1) ∫_0^∞ e^{−t} U(t)x dt = lim_{R→∞} e^{−R} U(R)x − x = −x.

Hence, iĀ − 1 and therefore Ā + i are surjective. For the self-adjointness of Ā it remains to show that Ā − i is surjective as well.
Now consider the adjoint group (U ∗ (t))t∈R . Clearly, one has U ∗ (0) = Id∗ =
Id and for all x, y ∈ H and t, s ∈ R

hy|U ∗ (t + s)xi = hU (t + s)y|xi = hU (s)U (t)y|xi = hy|U ∗ (t)U ∗ (s)xi.

This shows that (U ∗ (t))t∈R is a unitary group. We now show that the group is
strongly continuous. For this observe that for all x, y ∈ H

hy|U ∗ (s)xi = hU (s)y|xi −−−→ hU (t)y|xi = hy|U ∗ (t)xi


s→t


by the strong continuity of (U (t))t∈R . We obtain that for all x ∈ H

kU ∗ (t)x − U ∗ (s)xk2 = kU ∗ (t)xk2 + kU ∗ (s)xk2 − 2 RehU ∗ (t)x|U ∗ (s)xi


= 2 kxk2 − 2 Rehx|U ∗ (s − t)xi −−−→ 0.
s→t

Since (U∗(t))_{t∈R} is strongly continuous, we may calculate its infinitesimal generator B. Note that because of

−i (U∗(h)x − x)/h = −i (U(−h)x − x)/h

one has B = −A. By the first part of the proof we know that −Ā + i and therefore also Ā − i are surjective. The self-adjointness of Ā now follows from Theorem 2.2.34. But this is by definition equivalent to A being essentially self-adjoint.

We now determine the infinitesimal generator of the unitary groups de-


fined by the functional calculus for self-adjoint operators.

Lemma 2.3.31. Let (A, D(A)) be a self-adjoint operator on a Hilbert space H.


Then A is the infinitesimal generator of the strongly continuous unitary group
U (t) = eitA .

Proof. Let B denote the infinitesimal generator of (U (t))t∈R . By Lemma 2.3.30


the operator B is essentially self-adjoint. For x ∈ D(A) we have by the spectral
theorem for self-adjoint operators (Theorem 2.3.19) that
‖−i (U(h)x − x)/h − Ax‖² = ∫_R |−i (e^{ihλ} − 1)/h − λ|² d⟨x|P(λ)x⟩
= ∫_R |(e^{ihλ} − 1)/h − iλ|² d⟨x|P(λ)x⟩ → 0 as h → 0

by the dominated convergence theorem. Hence, x ∈ D(B) and A ⊂ B. This


shows that B is a symmetric extension of the self-adjoint operator A. But if A
satisfies the range condition for self-adjoint operators (Theorem 2.2.34 (iii)),
then so does B. This shows (i + A)−1 = (i + B)−1 , which implies A = B.

We now come to the fundamental theorem of this section which shows that
the description of the evolution via the Schrödinger equation is equivalent to
the description via unitary groups.

Theorem 2.3.32 (Stone’s theorem). Let H be a Hilbert space. There is a one-to-


one correspondence between self-adjoint operators on H and strongly continuous
unitary groups on H given by

A 7→ (eitA )t∈R .


Proof. Let (U(t))_{t∈R} be a strongly continuous unitary group. Then its infinitesimal generator A is essentially self-adjoint by Lemma 2.3.30. Therefore the self-adjoint operator Ā generates the strongly continuous unitary group (e^{itĀ})_{t∈R}. Since A is essentially self-adjoint, it is in particular densely defined, i.e. D(A) is dense in H. Note that for x_0 ∈ D(A) both unitary groups yield classical solutions of the problem ẋ(t) = −iĀx(t) with initial value x_0. By the uniqueness of the solutions shown in Proposition 2.3.29, the operators e^{itĀ} and U(t) therefore agree on the dense subset D(A) for all t ∈ R. Since these operators are bounded, we indeed have e^{itĀ} = U(t) for all t ∈ R. This shows (e^{itĀ})_{t∈R} = (U(t))_{t∈R}. In particular, as the groups agree, so do their generators, and by Lemma 2.3.31 we obtain Ā = A, i.e. A is itself self-adjoint and U(t) = e^{itA}. Altogether we have shown that the map A ↦ (e^{itA})_{t∈R} is onto. For the injectivity simply note that if (e^{itA})_{t∈R} and (e^{itB})_{t∈R} define the same unitary group, then one clearly has A = B (again by Lemma 2.3.31).

We now study the unitary groups generated by the position and momen-
tum operators. Since both are particular instances of multiplication operators,
we study this class first.

Example 2.3.33 (Multiplication semigroup). Let (Ω, Σ, µ) be an arbitrary


measure space and m : Ω → R a measurable function. We have seen in Ex-
ample 2.2.28 that the multiplication operator Mm is self-adjoint. By Stone’s
theorem (Theorem 2.3.32) U (t) = eitMm for t ∈ R is the associated unitary
group. We have seen in Example 2.3.21 that

(eitMm f )(x) = eitm(x) f (x) for all f ∈ L2 (Ω, Σ, µ).

In particular, we obtain for the position operators the following.

Corollary 2.3.34. Let n ∈ N and for j = 1, . . . , n let x̂j denote the position operator
on L2 (Rn ) in direction j. Then x̂j generates the unitary group

(U (t)f )(x) = eitxj f (x) for all f ∈ L2 (Rn ).

Hence, these groups simply act on pure states by a uniform phase rotation.
After a Fourier transform an analogous argument applies to the momentum
operators.

Corollary 2.3.35. Let n ∈ N and for j = 1, . . . , n let p̂_j = −i ∂/∂x_j be the momentum

operator on L2 (Rn ) in direction j. Then p̂j generates the unitary group

(U (t)f )(x) = f (x + tej ) for all f ∈ L2 (Rn ),

where ej denotes the j-th unit vector.


Proof. Recall that we have seen in Example 2.2.29 that under the unitary
Fourier transform p̂j becomes the multiplication operator with xj . Hence, it
follows that for all f ∈ L2 (Rn ) one has almost everywhere

(U (t)f )(x) = F −1 (eitxj F f )(x) = f (x + tej ).
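As a numerical illustration (not part of the notes) of the last two corollaries, the action of e^{it p̂} as a translation can be checked on a periodic grid, where the discrete Fourier transform stands in for F; the grid parameters and test function below are arbitrary choices. A minimal Python sketch, assuming numpy is available:

import numpy as np

N, L = 1024, 40.0
dx = L / N
x = -L / 2 + dx * np.arange(N)                 # periodic grid
f = np.exp(-(x - 1.0) ** 2)                    # smooth, well localized test function

t = 5 * dx                                     # translate by an exact multiple of dx
xi = 2 * np.pi * np.fft.fftfreq(N, d=dx)       # discrete Fourier variable

Utf = np.fft.ifft(np.exp(1j * t * xi) * np.fft.fft(f))   # F^{-1} e^{it xi} F f
shifted = np.roll(f, -5)                                  # samples of f(x + 5 dx)

assert np.allclose(Utf.real, shifted, atol=1e-10)
assert np.isclose(np.linalg.norm(Utf), np.linalg.norm(f))  # the discrete U(t) preserves norms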

We now come back to the abstract description of the time evolution of a


quantum mechanical system. Let A be a self-adjoint operator and (U (t))t∈R
the unitary group generated by A. Then a quantum mechanical state initially
given by ψ evolves into the state U (t)ψ after time t. Written as a density
matrix, we obtain

|U (t)ψihU (t)ψ| = U (t)|ψihψ|U ∗ (t).

Requiring linearity of the time evolution for mixed states, the same formula
holds for (non-normalized) mixed states made of finitely many pure states.
Assuming again continuity of the time evolution, it follows from the density of
the finite rank operators in the space of trace class operators that the identity
indeed holds for all trace class operators. In particular, it holds for all density
operators. Hence, we obtain the following mathematical description of the
time evolution in quantum mechanics.

Time evolution in Schrödinger’s picture. The dynamics of a quantum sys-


tem under a self-adjoint operator H is described by the strongly continuous
unitary group (U (t))t∈R associated to H via Stone’s theorem. Quantum observ-
ables do not depend on time and the evolution of states is given by

ρ 7→ U (t)ρU (t)−1 .

2.4 Further Criteria for Self-Adjointness


Recall that until now we introduced self-adjoint operators and checked self-
adjointness for several elementary examples, such as the position and momen-
tum operators or the Hamiltonian of a free particle on Rn provided one works
with the right domains. However, most Hamiltonians such as for the harmonic
oscillator or the hydrogen atom are more complicated as they additionally
involve potentials, i.e. are of the form

−∆ + V .

In this general case we cannot rely on simple Fourier transform techniques


and more advanced criteria for self-adjointness are needed. In particular, it
can be very difficult to determine the domains explicitly. We will now present
several criteria for self-adjointness – partially with proofs – and apply them
to concrete quantum mechanical operators of physical importance.


2.4.1 von Neumann’s criterion


A sometimes very handy criterion is von Neumann’s criterion. In concrete
situations it is sometimes trivial to verify this criterion. However, it has the
disadvantage that it only shows the existence of self-adjoint extensions and
not the essential self-adjointness of symmetric operators. Von Neumann's
criterion is formulated in terms of conjugations.

Definition 2.4.1 (Conjugation). A surjective map V : H → H on a Hilbert


space H is called anti-unitary if

(i) V is anti-linear, i.e. V(λx + µy) = λ̄ Vx + µ̄ Vy for all λ, µ ∈ C and x, y ∈ H.

(ii) V is anti-isometric, i.e. ⟨Vx|Vy⟩ = ⟨y|x⟩ for all x, y ∈ H.

Moreover, an anti-unitary operator C : H → H is called a conjugation if C² = Id.

Now, von Neumann’s criterion reads as follows.

Theorem 2.4.2 (von Neumann’s criterion). Let (A, D(A)) be a densely defined
symmetric operator on a Hilbert space H. If there is a conjugation C : H → H such
that C(D(A)) ⊂ D(A) and

AC = CA on D(A),

then A has self-adjoint extensions.

Proof. We first show that C(D(A∗ )) ⊂ D(A∗ ) and that A∗ C = CA∗ . For this let
x ∈ D(A) and y ∈ D(A∗ ). Then by assumption

hCy|Axi = hy|CAxi = hy|ACxi = hA∗ y|Cxi = hCA∗ y|xi.

This shows that Cy ∈ D(A∗) and A∗Cy = CA∗y. Hence, A∗C = CA∗ on D(A∗).


Now let x ∈ Ker(A∗ + i). Hence,

(A∗ + i)x = 0 ⇒ C(A∗ + i)x = A∗ Cx − iCx = (A∗ − i)Cx = 0.

Since C is an isometry, C induces an injective map Ker(A∗ + i) → Ker(A∗ − i).


We now show that this map is onto. In fact, as above for y ∈ Ker(A∗ − i) we
have for x = Cy that x ∈ D(A∗ ), Cx = C 2 y = y and

(A∗ + i)Cy = C(A∗ y − iy) = C((A∗ − i)y) = 0.

Since C preserves angles, it maps orthonormal bases to orthonormal bases.


Hence, Ker(A∗ + i) and Ker(A∗ − i) are isomorphic as Hilbert spaces. This
shows that the deficiency indices d+ (A) and d− (A) agree. It follows from
Theorem 2.2.45 that A has self-adjoint extensions.


Von Neumann’s criterion immediately gives the following results on Hamil-


tonians with potential terms.

Example 2.4.3 (Hamiltonians with potential terms). Let us consider the


Hamiltonian −∆ + V for a real-valued function V ∈ L1loc (Rn ) on the Hilbert
space L²(R^n). Using the test functions C_c^∞(R^n) as domain, we have for f, g ∈ C_c^∞(R^n)

⟨g|(−∆ + V)f⟩ = −∫_{R^n} ḡ(x) ∆f(x) dx + ∫_{R^n} ḡ(x) V(x) f(x) dx
= −∫_{R^n} ∆ḡ(x) f(x) dx + ∫_{R^n} V(x) ḡ(x) f(x) dx = ⟨(−∆ + V)g|f⟩.

Here we have used integration by parts twice for the first summand. Hence,
−∆ + V is a densely defined symmetric operator on L2 (Rn ). Now observe that
the complex conjugation C : f ↦ f̄ clearly is a conjugation that leaves C_c^∞(R^n) invariant and satisfies AC = CA on C_c^∞(R^n). Hence, by von Neumann's
criterion (Theorem 2.4.2) the operator A has self-adjoint extensions.

Note that although a conjugation as required by von Neumann’s criterion


for a given symmetric operator may exist, it can be somewhat difficult to find
it. For example, you might want to try to find conjugations for the momentum
operators both on a bounded interval and on R with domains equal to the
space of test functions.

2.4.2 Kato–Rellich Theory


Note that it is however more difficult to establish essentially self-adjointness
for Hamiltonians with potential terms. This will be our goal now and is the
content of the so-called Kato–Rellich theory. The central idea is to see the
potential term as a perturbation of the self-adjoint operator −∆. In fact, as one
would expect, perturbations preserve self-adjointness if they are small in a
certain sense. This is shown in the next abstract result.

Theorem 2.4.4 (Kato–Rellich). Let (A, D(A)) and (B, D(B)) be two self-adjoint
operators on a Hilbert space H. Suppose that D(A) ⊂ D(B) and that there exist
constants 0 < a < 1 and 0 < b such that

kBxk ≤ a kAxk + b kxk for all x ∈ D(A).

Then (A + B, D(A)) is a self-adjoint operator on H.

Proof. Choose a sufficiently large positive number µ ∈ R such that a + b/µ < 1.
This is possible because of a < 1. Now for all x ∈ D(A) we have

(A + B + iµ)x = (B(A + iµ)−1 + Id)(A + iµ)x.


The second factor on the right hand side is invertible because A is self-adjoint.
Now let us deal with the first factor. Observe that because of D(A) ⊂ D(B) this
factor is a closed operator which is defined on the whole Hilbert space. For all
x ∈ H we have by the assumption and the estimate of Lemma 2.2.12 that

kB(A + iµ)−1 xk ≤ akA(A + iµ)−1 xk + bk(A + iµ)−1 xk


b
≤ akA(A + iµ)−1 xk + kxk .
µ
Further note that because of the self-adjointness of A one has hAz|zi ∈ R for
all z ∈ D(A) and therefore

‖x‖² = ‖A(A + iµ)^{-1}x + iµ(A + iµ)^{-1}x‖² = ‖A(A + iµ)^{-1}x‖² + ‖µ(A + iµ)^{-1}x‖²
+ 2 Re⟨A(A + iµ)^{-1}x | iµ(A + iµ)^{-1}x⟩ = ‖A(A + iµ)^{-1}x‖² + ‖µ(A + iµ)^{-1}x‖².

Forgetting the second summand at the right hand side, we therefore obtain

‖B(A + iµ)^{-1}x‖ ≤ (a + b/µ) ‖x‖.

Since the factor on the right is smaller than 1 by the choice of µ, using the
Neumann series (Lemma 2.2.20) we see that B(A + iµ)−1 + Id and therefore
A + B + iµ = µ(µ−1 A + µ−1 B + i) are invertible. Of course, the same argument
applies to (A + B − iµ). It now follows from Theorem 2.2.34 that A + B is
self-adjoint with domain D(A).

Furthermore, one can show with similar arguments that (A + B, D) is


essentially self-adjoint whenever (A, D) is essentially self-adjoint. We now
apply the Kato–Rellich theorem to Hamiltonians with potentials.

Theorem 2.4.5. Let n ≤ 3 and V : Rn → R be measurable such that V = V1 + V2


with V1 ∈ L2 (Rn ) and V2 ∈ L∞ (Rn ). Then −∆ + V is self-adjoint on H 2 (Rn ).

Proof. We want to apply the Kato–Rellich theorem (Theorem 2.4.4) with


A = −∆ and domain D(A) = H 2 (Rn ) and B the multiplication operator with
the potential V . Note that in particular we have to check that D(A) ⊂ D(B).
Let f ∈ H 2 (Rn ) and ε > 0. One now argues almost analogously to the proof of
the Sobolev embedding theorem (Theorem 2.1.34): we have by the Cauchy–
Schwarz inequality

‖F f‖_1 ≤ ‖(1 + ε|x|²) F f‖_2 ‖(1 + ε|x|²)^{-1}‖_2 = C_ε ‖f − ε∆f‖_2 ≤ C_ε (ε ‖∆f‖_2 + ‖f‖_2).

Here we have used the fact that the integral ∫_{R^n} 1/(1 + ε|x|²)² dx is finite for n ≤ 3. Since F f ∈ L¹(R^n), it follows that

‖f‖_∞ ≤ ‖F f‖_1 ≤ C_ε (ε ‖∆f‖_2 + ‖f‖_2).


Hence, for all ε > 0 there exists a constant cε > 0 such that

kf k∞ ≤ ε k∆f k2 + cε kf k2 .

Using this estimate, we obtain for all f ∈ H 2 (Rn ) that

kV f k2 ≤ kV1 f k2 + kV2 f k2 ≤ kV1 k2 kf k∞ + kV2 k∞ kf k2


≤ ε kV1 k2 k∆f k2 + (cε kV1 k2 + kV2 k∞ ) kf k2
= ε kV1 k2 kAf k2 + (cε kV1 k2 + kV2 k∞ ) kf k2 < ∞.

This shows that f ∈ D(B). Moreover, if we choose ε < ‖V_1‖_2^{-1}, then the assumptions of the Kato–Rellich theorem are fulfilled and therefore −∆ + V with domain H²(R^n) is self-adjoint.

One can show that the theorem still holds for n ≥ 4 if one replaces the
assumption V1 ∈ L2 (Rn ) by V1 ∈ Lp (Rn ) for some p > n/2. For a proof see
[RS75, Theorem X.20]. As a particular instance of the above theorem we
obtain the self-adjointness of the hydrogen atom.

Example 2.4.6 (Hydrogen atom). Let n = 3 and consider the Hamiltonian of


the hydrogen atom given by −∆ + 1/ |x|, once again ignoring physical constants.
This means that the potential is given by 1/ |x|. We now use the decomposition

1 1
V (x) = V1 (x) + V2 (x) = 1|x|≤1 + 1|x|>1 .
|x| |x|

Then clearly V2 ∈ L∞ (R3 ). Moreover, for V1 we have using polar coordinates


∫_{R³} |V_1(x)|² dx = ∫_{|x|≤1} 1/|x|² dx = ∫_0^1 ∫_{S²} (1/r²) r² dθ dr = 4π < ∞.

Hence, it follows from Theorem 2.4.5 that −∆ + 1/|x| is self-adjoint with domain H²(R³).
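As a small illustrative check (not part of the notes), the polar-coordinate computation above factors into three elementary one-dimensional integrals whose product is 4π; the following Python sketch, assuming scipy is available, simply reproduces this numerically:

import numpy as np
from scipy.integrate import quad

# radial part: the integrand (1/r^2) * r^2 = 1 on [0, 1]
r_part, _ = quad(lambda r: 1.0, 0.0, 1.0)
theta_part, _ = quad(np.sin, 0.0, np.pi)           # polar-angle part
phi_part, _ = quad(lambda phi: 1.0, 0.0, 2.0 * np.pi)

assert np.isclose(r_part * theta_part * phi_part, 4.0 * np.pi)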

With more effort one can prove a variant of Theorem 2.4.5 due to T. Kato
which allows a wider range of potentials. However, in this case one only
obtains essentially self-adjointness on the space of test functions and one has
no exact information on the domain of the closure. We omit the proof because
of its complexity and refer to [RS75, Theorem X.29].

Theorem 2.4.7 (Kato). Let n ≤ 3 and V : Rn → R be measurable such that


V = V1 + V2 + V3 with V1 ∈ L2 (Rn ), V2 ∈ L∞ (Rn ) and 0 ≤ V3 ∈ L2loc (Rn ). Then
−∆ + V is essentially self-adjoint on Cc∞ (Rn ).


The theorem again holds for n ≥ 4 if one replaces the condition V1 ∈ L2 (Rn )
with V1 ∈ Lp (Rn ) for some p > n/2. Note that the above theorem in particular
applies for non-negative continuous potentials. For example, −∆ + x4 is
essentially self-adjoint on Cc∞ (Rn ). The sign of the potential here plays a
crucial role as the next example makes clear. We do not discuss it here and
refer the reader to [Hal13, Section 9.10] where a detailed exposition is given.

Example 2.4.8. The operator −∆ − x4 with domain Cc∞ (Rn ) is not essentially
self-adjoint.

In contrast to this negative result recall that we have seen in Example 2.4.3
that −∆ − x4 has self-adjoint extensions. In fact, since −∆ − x4 on Cc∞ (Rn ) is not
essentially self-adjoint, the above example even yields that there must exist
several self-adjoint extensions.

2.4.3 Nelson’s criterion


We now come to our last criterion, namely Nelson’s criterion. This result is
particularly useful in connection with the algebraic method of raising and
lowering operators. In fact, it sometimes allows one to verify self-adjointness with
the usual arguments used by physicists. For the criterion we need the notion
of analytic vectors.

Definition 2.4.9. Let (A, D(A)) be an unbounded operator on a Hilbert space


H.

(i) An element x ∈ D(A) is called a C ∞ -vector for A if An x ∈ D(A) for all


n ∈ N.

(ii) A C ∞ -vector x ∈ D(A) is called an analytic vector for A if



Σ_{n=0}^{∞} (‖Aⁿx‖ / n!) tⁿ < ∞   for some t > 0.

Observe that the defining condition for an analytic vector implies that the map z ↦ Σ_{n=0}^{∞} (‖Aⁿx‖/n!) zⁿ is well-defined and analytic on a disc around zero. In particular, it follows from the fact that derivatives of analytic functions are analytic and from the triangle inequality that span{Aⁿx : n ∈ N} entirely consists of analytic vectors if x is analytic.
Let us consider a self-adjoint operator (A, D(A)) on H with the projection-
valued measure P . Then for M > 0 the operator A leaves P ([−M, M])H invari-
ant and therefore restricts to a bounded operator on P ([−M, M])H. Therefore
the exponential series Σ_{n=0}^{∞} (Aⁿ/n!) tⁿ converges absolutely in operator norm on P([−M, M])H for all t ∈ R. This implies that every x in the dense set ∪_{M∈N} P([−M, M])H is a C^∞-vector with

Σ_{n=0}^{∞} (‖Aⁿx‖/n!) tⁿ < ∞   for all t > 0.

Hence, for a self-adjoint operator the set of its analytic vectors is dense. In fact,
this property characterizes self-adjoint operators. This is Nelson’s criterion
for self-adjointness which we will not prove here due to time constraints. A
proof can be found in [Mor13, Theorem 5.47] and [RS75, Theorem X.39].

Theorem 2.4.10 (Nelson’s criterion). Let (A, D(A)) be a symmetric operator on


a Hilbert space H. Suppose that D(A) contains a set of analytic vectors for A whose
span is dense in H. Then A is essentially self-adjoint.

Let us give a fundamental example of Nelson’s criterion, the quantum


mechanical one-dimensional harmonic oscillator.

Example 2.4.11 (Harmonic oscillator). Recall from the physics part that
the quantum mechanical harmonic oscillator is given by −d²/dx² + x², ignoring physical constants. This time the space of test functions C_c^∞(R) is not the optimal choice, as we have already seen in the physics part that the operator has Hermite functions as eigenvectors. Therefore we work with the space of Schwartz functions S(R) instead. This is the space of all C^∞-functions with
rapid decay at infinity. This space plays a fundamental role in the theory of
distributions where you can also find an exact definition (Definition 3.2.1).
A direct calculation now shows that −d²/dx² + x² with domain S(R) is a densely defined symmetric operator on L²(R). Using the ladder operators

a = x + d/dx   and   a† = x − d/dx

with domain S(R), we obtain the factorization H = a†a + 1 on S(R). Now,


repeating the calculations done in the physics part we obtain for all n ∈ N the vectors ψ_n(x) = p_n(x) e^{−x²/2} for some polynomial p_n of degree n which, appropriately normalized, form an orthonormal system of eigenvectors for −d²/dx² + x². In fact, the polynomials p_n are – ignoring possible scaling issues – the Hermite polynomials. Note that we have used the notation a† instead of a∗ because a† is not the adjoint of a. Moreover, there are no domain problems because for arbitrary polynomial expressions in a and a† their domain is again S(R) if a and a† are chosen with domain S(R) because S(R) is left invariant
by those operators.
We now sketch how one can show that (ψn )n∈N in fact forms an orthonor-
mal basis of L²(R). For this we must show that V = span{ψ_n : n ∈ N} is dense in L²(R). Note that it follows from the fact that the polynomials p_n have degree n that V = {p(x) e^{−x²/2} : p polynomial}. One now shows that for all α ∈ C

Σ_{n=0}^{N} (αⁿ xⁿ / n!) e^{−x²/2} → e^{αx} e^{−x²/2} in L²(R) as N → ∞.

Hence, if ψ is orthogonal to every element in the closure of V, we have

∫_R e^{−ikx} e^{−x²/2} ψ(x) dx = 0   for all k ∈ R.
This means that the Fourier transform of e^{−x²/2} ψ(x) is identically zero. By Plancherel's theorem (Theorem 2.1.34) this implies that e^{−x²/2} ψ(x) and therefore also ψ(x) are the zero function. Hence, the orthogonal complement of V is {0}, which is equivalent to the denseness of V in L²(R).
Now observe that eigenvectors of an operator A are clearly analytic vectors for A. Hence, (ψ_n)_{n∈N} is a set of analytic vectors for −d²/dx² + x² whose span is dense in L²(R). Nelson's criterion (Theorem 2.4.10) now shows that −d²/dx² + x² with domain S(R) is essentially self-adjoint.


Of course, the fact that −d²/dx² + x² with domain S(R) is essentially self-adjoint also follows directly from Theorem 2.4.7.
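As an illustrative numerical aside (not part of the notes), the eigenvalues 2n + 1 of −d²/dx² + x² can be recovered approximately by discretizing the operator on a large interval with finite differences; the grid below is an ad hoc choice and the sketch assumes numpy is available:

import numpy as np

N, L = 2000, 20.0                       # grid points, interval [-L/2, L/2]
x = np.linspace(-L / 2, L / 2, N)
h = x[1] - x[0]

# second-order finite differences for -d^2/dx^2 plus the potential x^2 on the diagonal
main = 2.0 / h ** 2 + x ** 2
off = -1.0 / h ** 2 * np.ones(N - 1)
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

eigenvalues = np.linalg.eigvalsh(H)[:5]
print(eigenvalues)                       # approximately [1, 3, 5, 7, 9]
assert np.allclose(eigenvalues, [1, 3, 5, 7, 9], atol=1e-2)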

3 Distributions
We have already seen that the domains of self-adjoint realizations of differen-
tial operators contain functions which are not classically differentiable, i.e.
non-differentiable Sobolev functions. However, there are still many locally in-
tegrable functions which do not have a derivative even in the weak sense. This
disadvantage has already made some arguments very difficult or impossible,
for example in Example 2.1.38 or Example 2.2.49. In this section we give an
introduction to the theory of distributions which also can be differentiated and
have the huge advantage that they are closed with respect to differentiation.
Moreover, we have formally seen (see Example 2.2.5) that such generalized
functions can naturally arise as some kind of generalized eigenfunctions for
self-adjoint operators which may not have eigenvalues in the usual sense. In
fact, the spectral theorem can be generalized to this situation and we will see
that typical quantum mechanical operators have a complete system of eigenvectors if one allows for eigenvectors in this generalized sense. This is the statement of the main result of this chapter, the so-called nuclear spectral theorem.

3.1 The Space of Distributions


We now introduce distributions. Later on we will mainly work with a special
class of distributions, the so-called tempered distributions, for which some technical issues are easier to handle. Nevertheless, for the sake of completeness we first briefly introduce general distributions. The general
idea of the theory of distributions is to replace functions by functionals which
act in some continuous way on a given class of particularly nice functions
which are often called test functions. The exact mathematical objects one
obtains depend on the concrete choice of the space of test functions. A first
natural choice is here the class of all smooth functions with compact support.

Definition 3.1.1 (Space of Test Functions). Let Ω ⊂ Rn be open. We say that


a sequence (fn )n∈N ⊂ Cc∞ (Ω) converges to f ∈ Cc∞ (Ω) if

(i) there exists a compact subset K ⊂ Ω such that supp fn ⊂ K for all n ∈ N
and

(ii) for all α ∈ Nn0 the α-th derivative ∂α fn converges uniformly in K to ∂α f ,


i.e. supx∈K |∂α (fn − f )| → 0.

We call C_c^∞(Ω) with this notion of convergence the space of all test functions on Ω and write D(Ω).


More precisely, there exists a locally convex topology on the vector space
C_c^∞(Ω) for which the convergence of a sequence (f_n)_{n∈N} is equivalent to (i) and
(ii). However, we will ignore such topological concepts to a great extent and
instead work directly with the notion of convergence induced by the topology.
Distributions are then defined as continuous linear functionals on D(Ω).

Definition 3.1.2 (Distribution). A distribution is a continuous linear func-


tional on D(Ω), i.e. a linear function u : D(Ω) → C for which

fn → f in D(Ω) implies hu, fn i → hu, f i.

Let us start with the most prominent example of all, the infamous Dirac
distribution.

Example 3.1.3 (Dirac distribution). Let Ω ⊂ Rn open and a ∈ Ω. We define

δa : D(Ω) → C
ϕ 7→ ϕ(a).

Then δa is a distribution, the Dirac distribution in a. Indeed, let fn → f in


D(Ω). A fortiori, one has fn (a) → f (a) which shows hδa , fn i → hδa , f i.

One has the following useful criterion for a linear functional u : D(Ω) → C
to be a distribution.

Proposition 3.1.4. Let u : D(Ω) → C be linear. Then u is a distribution if and


only if for every compact subset K ⊂ Ω there exists a constant CK ≥ 0 and n ∈ N
such that
|⟨u, ϕ⟩| ≤ C_K Σ_{|α|≤n} ‖∂^α ϕ‖_∞   for all ϕ ∈ D(Ω) with supp ϕ ⊂ K.

Let us apply the criterion to some concrete and important examples of


distributions.

Example 3.1.5 (Principal value). Let Ω = R. We define


PV(1/x) : D(R) → C,
ϕ ↦ lim_{ε↓0} ∫_{|x|≥ε} ϕ(x)/x dx.

Note that for ϕ ∈ D(R) one has

lim_{ε↓0} ∫_{|x|≥ε} ϕ(x)/x dx = lim_{ε↓0} ∫_ε^∞ (ϕ(x) − ϕ(−x))/x dx.


Note that the integrand can be extended to a continuous function on R by


setting its value at zero to 2ϕ′(0). Together with the compactness of the support of ϕ it follows that the improper integral exists. Let K ⊂ R be compact. Applying the mean value theorem, we obtain for ϕ ∈ D(R) with supp ϕ ⊂ K

|∫_0^∞ (ϕ(x) − ϕ(−x))/x dx| ≤ ∫_0^∞ |ϕ(x) − ϕ(−x)|/x dx ≤ 2λ(K) sup_{x∈R} |ϕ′(x)|.

Hence, Proposition 3.1.4 shows that PV(1/x) is a distribution.
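As an illustrative aside (not part of the notes), the defining limit of PV(1/x) can be evaluated numerically via the symmetrized integrand used above. The following Python sketch assumes scipy is available; the chosen test function is a hypothetical stand-in (a Gaussian bump rather than a compactly supported function):

import numpy as np
from scipy.integrate import quad

def pv_one_over_x(phi, support=50.0):
    """<PV(1/x), phi> computed as the integral of (phi(x) - phi(-x))/x over (0, inf)."""
    integrand = lambda x: (phi(x) - phi(-x)) / x
    value, _ = quad(integrand, 0.0, support, limit=200)
    return value

phi = lambda x: np.exp(-(x - 1.0) ** 2)      # smooth bump centered away from 0
print(pv_one_over_x(phi))

# for an even test function the principal value vanishes by symmetry
even_phi = lambda x: np.exp(-x ** 2)
assert abs(pv_one_over_x(even_phi)) < 1e-10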

The space of locally integrable functions can be naturally identified with


a subspace of the space of distributions.

Example 3.1.6 (Locally integrable functions). Let Ω ⊂ Rn be open. For


f ∈ L1loc (Ω) consider the linear functional uf : D(Ω) → C defined via
⟨u_f, ϕ⟩ ≔ ∫_Ω f(x) ϕ(x) dx.

This is well-defined because f is integrable over compact subsets of Ω. More-


over, one has for ϕ ∈ D(Ω) with supp ϕ ⊂ K for some K ⊂ Ω compact
Z
|hu, ϕi| ≤ sup |ϕ(x)| |f (x)| dx.
x∈Ω K

This shows that uf is a distribution. Moreover, it follows from the Du Bois-


Reymond lemma (Lemma 2.1.25) that the mapping f 7→ uf from L1loc (Ω) into
the space of distributions is injective.

Along the same line one can identify finite Borel measures with distribu-
tions.

Example 3.1.7 (Borel measures). Let Ω ⊂ Rn be open and µ : B(Ω) → C be


a locally finite Borel measure, i.e. µ(K) < ∞ for all compact subsets K ⊂ Ω.
Consider the functional uµ : D(Ω) → C defined via
⟨u_µ, ϕ⟩ ≔ ∫_Ω ϕ dµ.

Then uµ is a distribution because for ϕ ∈ D(Ω) with supp ϕ ⊂ K one has


Z
|huµ , ϕi| ≤ sup |ϕ(x)| dµ = sup |ϕ(x)| µ(K).
x∈K K x∈Ω

One can show that the embedding µ 7→ uµ from the locally finite Borel mea-
sures into the space of distributions is injective. This can be seen as a stronger


variant of the Du Bois-Reymond lemma (Lemma 2.1.25) as every locally


integrable function defines a locally finite Borel measure. The injectivity es-
sentially is a generalized version of the Riesz–Markov representation theorem:
there is a one-to one correspondence between positive linear functionals on
D(Ω) and locally finite Borel measures on Ω.

Now, that we have got a basic feeling for distributions, we introduce a


notion of convergence on the space of all distributions.

Definition 3.1.8. Let Ω ⊂ Rn be open. We denote by D′(Ω) the space of all distributions. We say that a sequence of distributions (u_n)_{n∈N} converges to u ∈ D′(Ω) in D′(Ω) if

lim hun , ϕi = hu, ϕi for all ϕ ∈ D(Ω).


n→∞

Again, we do not define a topology on the space of distributions. However,


if you are more interested in these issues, read the next remarks. Otherwise
simply ignore it.

Remark 3.1.9 (For experts). The notation D′(Ω) for the space of distributions actually has a deeper mathematical meaning. It stands for the topological dual of the space D(Ω), i.e. the space of all continuous linear functionals on D(Ω). The above definition says that a sequence of distributions converges in D′(Ω) if and only if it converges with respect to the weak∗-topology. However, for deeper results on distributions it may be more suitable not to endow D′(Ω) with the weak∗-topology. Indeed, often D′(Ω) is defined with a different topology (the so-called strong topology) which however agrees on bounded sets (note that we have not and will not define the notion of boundedness on topological vector spaces) with the weak∗-topology. Hence, the notion of convergence of sequences is independent of the choice between these topologies.

In physics (and also in mathematics) one often works with approximating


δ-functions. In our newly introduced language this simply means that the
δ-distribution can be written as the limit of ordinary functions in the space of
distributions.

Example 3.1.10 (Approximations of the δ-distribution). Let f be a positive


and normalized function in L1 (R). We define fn (x) = nf (nx). Then one has
∫_{−∞}^{∞} f_n(x) dx = ∫_{−∞}^{∞} f(x) dx = 1

for all n ∈ N. Suppose further that for all ε > 0 one has

lim_{n→∞} ∫_{|x|≥ε} f_n(x) dx = 0.


This means that the masses of the functions f_n concentrate on arbitrarily small intervals as n goes to infinity. Under these assumptions one has f_n → δ_0 in D′(R). Indeed, for ϕ ∈ D(R) one has

|⟨δ_0, ϕ⟩ − ⟨f_n, ϕ⟩| = |ϕ(0) − ∫_{−∞}^{∞} f_n(x) ϕ(x) dx| = |∫_{−∞}^{∞} f_n(x)(ϕ(0) − ϕ(x)) dx|
≤ ∫_{−∞}^{∞} f_n(x) |ϕ(0) − ϕ(x)| dx

and therefore one obtains for all ε > 0


lim sup_{n→∞} |⟨δ_0, ϕ⟩ − ⟨f_n, ϕ⟩| ≤ lim sup_{n→∞} ( sup_{x∈[−ε,ε]} |ϕ(0) − ϕ(x)| ∫_{|x|≤ε} f_n(x) dx + 2‖ϕ‖_∞ ∫_{|x|≥ε} f_n(x) dx )
≤ sup_{x∈[−ε,ε]} |ϕ(0) − ϕ(x)|,

where the second term vanishes in the limit by the assumption on f_n.

Since ϕ is continuous, the right hand side tends to zero as ε → 0. Hence,


we have shown that fn converges to the delta distribution in the sense of
distributions. As concrete examples of such approximations one has, for f(x) = π^{−1/2} e^{−x²} and g(x) = π^{−1}(1 + x²)^{−1},

f_n(x) = (n/√π) exp(−(nx)²)   and   g_n(x) = (1/π) · n/(1 + (nx)²),

or, using the substitutions ε = n^{−2} for f_n and ε = n^{−1} for g_n, one obtains the approximations for ε → 0

f_ε(x) = (1/√(πε)) exp(−x²/ε)   and   g_ε(x) = (1/π) · ε/(ε² + x²).
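As a numerical illustration (not part of the notes), one can check that the pairing of the Gaussian f_ε with a fixed test function tends to ϕ(0). The following Python sketch assumes scipy is available; the test function and parameters are arbitrary choices:

import numpy as np
from scipy.integrate import quad

phi = lambda x: np.cos(x) * np.exp(-x ** 2 / 10.0)    # smooth test function with phi(0) = 1

def pairing(eps):
    # <f_eps, phi> for the Gaussian approximation f_eps(x) = (pi*eps)^(-1/2) exp(-x^2/eps)
    f_eps = lambda x: np.exp(-x ** 2 / eps) / np.sqrt(np.pi * eps)
    value, _ = quad(lambda x: f_eps(x) * phi(x), -5.0, 5.0, points=[0.0], limit=200)
    return value

for eps in [1.0, 0.1, 0.01]:
    print(eps, pairing(eps))          # approaches phi(0) = 1 as eps -> 0
assert abs(pairing(0.01) - phi(0.0)) < 1e-2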

The main advantage of distributions is that classical analytical operations


like taking derivatives have very well-behaved extensions to distributions.
Indeed, these operations are bounded with respect to the topology of distribu-
tions and therefore by density can be extended from D(Ω) to D′(Ω). However,
it is more practical to define the extensions directly by using duality.

Definition 3.1.11. Let Ω ⊂ Rn be open and u ∈ D′(Ω). For α ∈ N_0^n we define


the α-th derivative of u as

hD α u, ϕi = (−1)|α| hu, ∂α ϕi for ϕ ∈ D(Ω).

Clearly, this is well-defined, i.e. D α u is a distribution. Note that the above


definition extends the concept of weak derivatives in the context of Sobolev
spaces given in Definition 2.1.20. We now give some elementary examples of
derivatives of distributions which are often used in physics.


Example 3.1.12 (Derivative of the delta function). Consider the Dirac dis-
tribution δ_0 ∈ D′(R). For ϕ ∈ D(R) and n ∈ N one has

⟨Dⁿδ_0, ϕ⟩ = (−1)ⁿ ⟨δ_0, ∂ⁿϕ⟩ = (−1)ⁿ (∂ⁿϕ/∂xⁿ)(0).

Hence, the n-th distributional derivative of the Dirac distribution is given by (−1)ⁿ times the evaluation of the n-th derivative at zero.

Example 3.1.13 (Derivative of the signum function). We come back to the
signum function considered in Example 2.1.22. As we have seen, this function
does not have a derivative in the sense of Sobolev spaces. However, considered
as a distribution we have for ϕ ∈ D(R)
\[
\langle D\operatorname{sign}, \varphi\rangle = -\langle \operatorname{sign}, \varphi'\rangle
= -\int_{-\infty}^{\infty} \operatorname{sign}(x)\varphi'(x)\,dx
= \int_{-\infty}^{0} \varphi'(x)\,dx - \int_{0}^{\infty} \varphi'(x)\,dx
= \varphi(0) - (-\varphi(0)) = 2\langle\delta_0, \varphi\rangle.
\]

Hence, the distributional derivative of sign is given by two times the Dirac
distribution.
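As a quick numerical plausibility check of this duality computation (the test function ϕ(x) = e^{-x^2} is an arbitrary choice), one can evaluate −⟨sign, ϕ′⟩ and compare it with 2ϕ(0):

```python
import numpy as np
from scipy.integrate import quad

def phi(x):
    return np.exp(-x**2)              # test function with phi(0) = 1

def dphi(x):
    return -2.0 * x * np.exp(-x**2)   # its classical derivative

# <D sign, phi> = -<sign, phi'> = -int sign(x) phi'(x) dx; split at 0 (kink of |x|)
left, _ = quad(lambda x: -np.sign(x) * dphi(x), -np.inf, 0.0)
right, _ = quad(lambda x: -np.sign(x) * dphi(x), 0.0, np.inf)
print(left + right)    # approximately 2.0
print(2 * phi(0.0))    # = 2 <delta_0, phi>
```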

Remark 3.1.14 (Non-commuting derivatives). It is a well-known fact from
real analysis that there exist continuous functions f : R^2 → R whose mixed
second partial derivatives exist everywhere but satisfy D_x D_y f ≠ D_y D_x f at
some point, i.e. the mixed partial derivatives do not commute. For example,
take f : R^2 → R defined as
\[
f(x,y) =
\begin{cases}
\dfrac{xy(x^2-y^2)}{x^2+y^2} & \text{if } (x,y) \neq (0,0), \\
0 & \text{if } (x,y) = (0,0).
\end{cases}
\]
However, the situation changes when one considers f as a distribution.
Now it follows from the mere definition of the derivative and the fact that
the order of differentiation can be changed for smooth functions that for all
ϕ ∈ D(R^2)
\[
\langle D_x D_y f, \varphi\rangle = \langle f, D_y D_x\varphi\rangle = \langle f, D_x D_y\varphi\rangle = \langle D_y D_x f, \varphi\rangle.
\]
Hence, D_x D_y f = D_y D_x f as elements of D′(R^2). In particular this means that if
D_x D_y f and D_y D_x f exist in the classical sense and define L^1_loc(R^2) functions,
then one has as a consequence of the Du Bois-Reymond lemma (Lemma 2.1.25)
that D_x D_y f = D_y D_x f almost everywhere.

We now want to define the multiplication of distributions with smooth
functions. For this note that for m ∈ C^∞(Ω), a function f ∈ L^1_loc(Ω) and a test
function ϕ ∈ D(Ω) we have
\[
\langle u_{mf}, \varphi\rangle = \int_\Omega m(x)f(x)\varphi(x)\,dx = \int_\Omega f(x)\,m(x)\varphi(x)\,dx = \langle u_f, m\cdot\varphi\rangle.
\]


Note that if ϕ ∈ D(Ω) and m is smooth, then the product m · ϕ is again a test
function. This allows us to extend the above identity as a definition to general
distributions.

Definition 3.1.15 (Multiplication of distributions with smooth functions).


Let u ∈ D0 (Ω) be a distribution and m ∈ C ∞ (Ω). Then their product is the
distribution defined via

hm · u, ϕi = hu, m · ϕi for ϕ ∈ D(Ω).

Note that one sees directly that m·u is linear and continuous and therefore
a well-defined distribution. Let us give a very elementary example to see how
one works with such multiplications.

Example 3.1.16. Let δ_0 ∈ D′(R) be the Dirac distribution and let m : R → R
be an arbitrary smooth function. Then we have for ϕ ∈ D(R)
\[
\langle m\delta_0, \varphi\rangle = \langle\delta_0, m\varphi\rangle = m(0)\varphi(0) = \langle m(0)\delta_0, \varphi\rangle.
\]
Hence, we have mδ_0 = m(0)δ_0, as one would intuitively expect. However,
things can also be slightly more complicated. For example, one has
\[
\langle m\delta_0', \varphi\rangle = \langle\delta_0', m\varphi\rangle = -\langle\delta_0, (m\varphi)'\rangle = -m'(0)\varphi(0) - m(0)\varphi'(0)
= \langle -m'(0)\delta_0, \varphi\rangle + \langle m(0)\delta_0', \varphi\rangle.
\]
Hence, m·δ_0' = −m'(0)δ_0 + m(0)δ_0'.

3.2 Tempered Distributions


We now come to a special class of distributions, the so-called tempered dis-
tributions. These are the right space for extending the Fourier transform to
distributions and are moreover even easier to handle from a technical
perspective. For the tempered distributions we work with a different class of
test functions than D(Ω), namely the Schwartz functions.

Definition 3.2.1 (Schwartz functions). Let n ∈ N. For α, β ∈ N_0^n and a
smooth function f ∈ C^∞(R^n) we define
\[
\|f\|_{\alpha,\beta} := \sup_{x\in R^n} |x^\alpha D^\beta f(x)|.
\]
We then define the Schwartz space or the space of all rapidly decreasing functions
as
\[
S(R^n) := \{f \in C^\infty(R^n) : \|f\|_{\alpha,\beta} < \infty \text{ for all } \alpha, \beta \in N_0^n\}.
\]


More intuitively, the above definition means that f and all of its derivatives
exist and decay faster than any inverse power of x. As for the test functions
D(Ω) we need a notion of convergence for Schwartz functions.
Definition 3.2.2 (Convergence of Schwartz functions). We say that a se-
quence of Schwartz functions (fm )m∈N ⊂ S(Rn ) converges to f ∈ S(Rn ) if
kfm − f kα,β → 0 for all α, β ∈ Nn0 .
It is a good exercise to check that (f_m)_{m∈N} ⊂ S(R^n) tends to f ∈ S(R^n) as
m → ∞ if and only if (f_m)_{m∈N} converges to f with respect to the metric
\[
d(f,g) := \sum_{\alpha,\beta\in N_0^n} 2^{-(|\alpha|+|\beta|)}\, \frac{\|f-g\|_{\alpha,\beta}}{1+\|f-g\|_{\alpha,\beta}}
\qquad (f,g \in S(R^n)).
\]

Moreover, one can show that (S(Rn ), d) is a complete metric space. These
properties show in mathematical terms that S(Rn ) is a so-called Fréchet space.
Let us give the exact definition for later use. First we need the concept of a
semi-norm which we have already met in our study of Lp -spaces.
Definition 3.2.3 (Semi-norm). Let V be a K-vector space. A map p : V →
[0, ∞) is called a semi-norm on V if
(i) p is absolutely homogeneous, i.e. p(λx) = |λ| p(x) for all x ∈ V and λ ∈ K;

(ii) p satisfies the triangle inequality, i.e. p(x + y) ≤ p(x) + p(y) for all x, y ∈ V .
Hence, the only difference between a norm and a semi-norm is the fact
that a semi-norm p may not satisfy the definiteness condition p(x) = 0 ⇔ x = 0.
Fréchet spaces are defined in terms of a countable family of semi-norms.
Definition 3.2.4 (Fréchet space). Let G be a vector space and (pk )k∈N a count-
able family of semi-norms on G with the following properties:
(i) If x ∈ G satisfies pk (x) = 0 for all k ∈ N, then x = 0.

(ii) If (xn )n∈N ⊂ G is Cauchy with respect to each semi-norm pk , i.e. for all
k ∈ N and for all ε > 0 there exists N ∈ N such that
pk (xn − xm ) < ε for all n, m ≥ N ,
then there exists x ∈ G such that (xn )n∈N converges to x with respect to
each semi-norm, i.e. pk (xn − x) → 0 as n → ∞ for all k ∈ N.
Then G together with the datum of these semi-norms is called a Fréchet space.
In this case (G, d) is a complete metric space for the metric
\[
d(f,g) := \sum_{k=1}^{\infty} 2^{-k}\, \frac{p_k(f-g)}{1+p_k(f-g)}
\qquad (f,g \in G).
\]


If G is a Fréchet space, we agree that all topological notions such as conver-
gence, closedness or compactness are understood with respect to the metric
induced by the family of semi-norms.

Remark 3.2.5. One can always replace the family (p_k)_{k∈N} by an increasing
family of semi-norms (q_k)_{k∈N} by setting q_k := sup_{n≤k} p_n. In fact, the family
(q_k)_{k∈N} induces an equivalent metric on G, and we are only interested in the
topological properties and not in the concrete values of the metric itself.
Hence, we may assume that the semi-norms satisfy
\[
q_1(x) \le q_2(x) \le q_3(x) \le \cdots \qquad \text{for all } x \in G.
\]

Note that, in contrast to the situation for S(R^n), one can show that the
convergence in D(Ω) for ∅ ≠ Ω ⊂ R^n is not induced by a translation-invariant
metric.
As for distributions we define the tempered distributions as the space of
all continuous functionals on S(Rn ).

Definition 3.2.6 (Tempered distribution). A tempered distribution is a contin-


uous linear functional u : S(Rn ) → C. The space of all tempered distributions
is denoted by S 0 (Rn ). We say that a sequence of tempered distributions
(um )m∈N ⊂ S 0 (Rn ) converges to u ∈ S 0 (Rn ) if

hum , ϕi → hu, ϕi for all ϕ ∈ S(Rn ).

All comments made for distributions also apply to tempered distributions:


the definition of convergence of tempered distributions means in mathe-
matical terms that we endow the space of tempered distributions with the
weak∗-topology. Notice further that with almost identical proofs one can
see that locally integrable functions define tempered distributions as long as
they have polynomial growth. A similar statement also holds for measures.
Moreover, distributional derivatives of tempered distributions are tempered
distributions, and the product of a function with polynomial growth and a
tempered distribution is a tempered distribution as well.
For an example of a distribution in D′(R) which is not tempered one can
take u_f for f(x) = e^{|x|}.

3.2.1 The Fourier Transform of Tempered Distributions


Now let us come to the Fourier transform of tempered distributions which
will ultimately allow us to take for example the Fourier transform of Delta-
functions. For this we first need to understand the Fourier transform on
Schwartz functions.


Proposition 3.2.7 (Fourier transform on Schwartz functions). For all n ∈ N


the Fourier transform restricts to a bijection of S(Rn ). Moreover, one has F fm →
F f in S(Rn ) if fm → f in S(Rn ).

This follows from the fact that under the Fourier transform differentiation
becomes multiplication. The details should be verified by the reader. Note
moreover that for Schwartz functions f, ϕ ∈ S(R^n) we have by Plancherel's
formula (Theorem 2.1.34)
\[
\langle u_{\mathcal{F}f}, \varphi\rangle
= \int_{R^n} (\mathcal{F}f)(x)\varphi(x)\,dx
= \int_{R^n} f(x)\,\overline{(\mathcal{F}^{-1}\overline{\varphi})(x)}\,dx
= \int_{R^n} f(x)(\mathcal{F}\varphi)(x)\,dx
= \langle u_f, \mathcal{F}\varphi\rangle.
\]

This equality can now be used to extend the Fourier transform to tempered
distributions.

Definition 3.2.8 (The Fourier transform of tempered distributions). Let


u ∈ S 0 (Rn ) be a tempered distribution. Its Fourier transform is defined as

hF u, ϕi = hu, F ϕi for ϕ ∈ S(Rn ).

Note that since the Fourier transform is continuous on S(R^n) by Proposi-
tion 3.2.7, F u is a well-defined tempered distribution. Further, the Fourier
transform on the tempered distributions is a bijection.

Proposition 3.2.9. The Fourier transform F : S′(R^n) → S′(R^n) is a bijection. Its
inverse is given by the extension of the inverse Fourier transform via
\[
\langle \mathcal{F}^{-1}u, \varphi\rangle := \langle u, \mathcal{F}^{-1}\varphi\rangle \qquad\text{for } \varphi \in S(R^n).
\]
Proof. As in the case of the Fourier transform, one has a well-defined mapping
F^{-1} : S′(R^n) → S′(R^n). We now show that, as the notation already indicates,
F^{-1} is the inverse of F. In fact, for u ∈ S′(R^n) and ϕ ∈ S(R^n) we have,
using the mere definitions,
\[
\langle \mathcal{F}^{-1}\mathcal{F}u, \varphi\rangle = \langle \mathcal{F}u, \mathcal{F}^{-1}\varphi\rangle = \langle u, \mathcal{F}\mathcal{F}^{-1}\varphi\rangle = \langle u, \varphi\rangle.
\]
This shows that F^{-1}F = Id_{S′(R^n)}. An analogous calculation also shows that
F F^{-1} = Id_{S′(R^n)}.

Let us now give some examples which are particularly relevant for physics.

Example 3.2.10 (Fourier transform of Delta function). Let us calculate the
Fourier transform of the Delta distribution δ_0. For this observe that for all
ϕ ∈ S(R^n) we have
\[
\langle \mathcal{F}\delta_0, \varphi\rangle = \langle \delta_0, \mathcal{F}\varphi\rangle = (\mathcal{F}\varphi)(0)
= \frac{1}{(2\pi)^{n/2}} \int_{R^n} \varphi(x)\,dx = (2\pi)^{-n/2}\,\langle 1, \varphi\rangle.
\]
Hence, we have shown that F δ_0 = (2π)^{-n/2}·1. Physically this means that one
obtains the delta function if one takes all frequencies with equal (normalized)
strength.
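Numerically this can be seen by Fourier transforming one of the approximations f_ε of δ_0 from Example 3.1.10: for small ε the transform becomes nearly the constant (2π)^{-1/2} (here in one dimension). A small sketch, assuming the normalization (F ϕ)(ξ) = (2π)^{-1/2} ∫ ϕ(x) e^{-iξx} dx used above:

```python
import numpy as np
from scipy.integrate import quad

def f_eps(x, eps):
    # Gaussian approximation of the delta distribution from Example 3.1.10
    return np.exp(-(x / eps) ** 2) / (np.sqrt(np.pi) * eps)

def fourier(xi, eps):
    # (F f_eps)(xi) = (2*pi)^(-1/2) * int f_eps(x) e^{-i xi x} dx;
    # the imaginary part vanishes by symmetry, so only the cosine part remains
    integrand = lambda x: f_eps(x, eps) * np.cos(xi * x)
    val, _ = quad(integrand, -20 * eps, 20 * eps, limit=200)
    return val / np.sqrt(2 * np.pi)

for eps in (1.0, 0.1, 0.01):
    print(eps, [round(fourier(xi, eps), 4) for xi in (0.0, 1.0, 5.0)])
# as eps -> 0 every value approaches (2*pi)^(-1/2) ~ 0.3989,
# i.e. F(delta_0) = (2*pi)^(-1/2) * 1
```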

Essentially in the converse direction, let us take a look at the Fourier
transform of plane waves.

Example 3.2.11 (Fourier transform of a plane wave). For k ∈ R^n consider the
plane wave f(x) = e^{ik·x} ∈ L^1_loc(R^n). By the Fourier inversion formula we obtain
for all ϕ ∈ S(R^n)
\[
\langle \mathcal{F}f, \varphi\rangle = \langle f, \mathcal{F}\varphi\rangle
= \int_{R^n} e^{ik\cdot x}(\mathcal{F}\varphi)(x)\,dx
= (2\pi)^{n/2}\, (\mathcal{F}^{-1}\mathcal{F}\varphi)(k) = (2\pi)^{n/2}\varphi(k) = \langle (2\pi)^{n/2}\delta_k, \varphi\rangle.
\]
Hence, we have shown that F(e^{ik·}) = (2π)^{n/2} δ_k. Physically this gives the
obvious fact that one obtains a plane wave if one picks out a single frequency (via
a delta distribution).

We now briefly sketch an application of tempered distributions to differ-
ential operators, a topic which is often used in physics courses. As a starting
point we define linear differential operators with constant coefficients, the
class of differential operators to be considered from now on.

Definition 3.2.12 (Linear differential operator with constant coefficients).
Let n ∈ N. A linear operator L : C^∞(R^n) → C^∞(R^n) is called a linear differential
operator with constant coefficients if there exists a polynomial P in n variables
such that
\[
Lf = P\Bigl(\frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n}\Bigr) f \qquad\text{for all } f \in C^\infty(R^n).
\]
Of course, the Laplace operator is an example of a differential operator
with constant coefficients. Such differential operators can be treated very
well with the help of distributions. An important concept here is that of a
fundamental solution.

Definition 3.2.13 (Fundamental solution). Let L be a linear differential op-


erator with constant coefficients. A distribution u ∈ D0 (Rn ) is called a funda-
mental solution for L if u satisfies

Lu = δ0 .

One may wonder whether fundamental solutions exist at all. Indeed, for
general differential operators a fundamental solution need not exist. However,
the Malgrange–Ehrenpreis theorem states that every non-zero linear differential
operator with constant coefficients has a fundamental solution.


In order to apply fundamental solutions to the study of differential equa-
tions, we need to define convolutions of tempered distributions with Schwartz
functions. For this consider first three Schwartz functions f, g, ϕ ∈ S(R^n). Let
us set g̃(x) = g(−x). Then we have
\[
\langle u_{f*g}, \varphi\rangle = \int_{R^n} (f*g)(x)\varphi(x)\,dx
= \int_{R^n}\int_{R^n} f(y)g(x-y)\,dy\, \varphi(x)\,dx
= \int_{R^n}\int_{R^n} g(x-y)\varphi(x)\,dx\, f(y)\,dy
= \int_{R^n}\int_{R^n} \tilde g(y-x)\varphi(x)\,dx\, f(y)\,dy
= \langle u_f, \tilde g * \varphi\rangle.
\]

Again, the right hand side makes sense if uf is replaced by an arbitrary


tempered distribution u ∈ S 0 (Rn ) because one can verify that the convolution
of two Schwartz functions is again a Schwartz function. Therefore it can be
used to extend the definition of convolutions of tempered distributions with
Schwartz functions.

Definition 3.2.14 (Convolution of tempered distributions with Schwartz


functions). Let u ∈ S 0 (Rn ) and g ∈ S(Rn ). Then u ∗ g is the distribution which
for ϕ ∈ S(Rn ) is defined as

hu ∗ g, ϕi = hu, g̃ ∗ ϕi.

Note that for u ∗ g to be a well-defined distribution, we need that u ∗ g is


a continuous functional, a fact which should be checked by the reader. The
usefulness of fundamental solutions lies in the following observation.

Proposition 3.2.15. Let n ∈ N be fixed and let L : C^∞(R^n) → C^∞(R^n) be a linear
differential operator with constant coefficients. Further, let u ∈ S′(R^n) be a fun-
damental solution for L. Then for f ∈ S(R^n) a distributional solution of the
inhomogeneous problem
\[
Lw = f
\]
is given by the convolution w = u ∗ f.

Proof. The statement follows by a direct calculation. Let L = P(∂_1, ..., ∂_n) and
let L^∗ be the formal adjoint obtained by replacing ∂_i by −∂_i for all i = 1, ..., n. Then
for ϕ ∈ S(R^n) we have
\[
\langle Lw, \varphi\rangle = \langle L(u*f), \varphi\rangle = \langle u*f, L^*\varphi\rangle = \langle u, \tilde f * L^*\varphi\rangle
= \langle u, L^*(\tilde f * \varphi)\rangle = \langle Lu, \tilde f * \varphi\rangle
= \langle \delta_0, \tilde f * \varphi\rangle = (\tilde f * \varphi)(0) = \int_{R^n} f(y)\varphi(y)\,dy = \langle f, \varphi\rangle.
\]
This shows that Lw = f and finishes the proof.
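The following one-dimensional sketch illustrates the proposition for L = d^2/dx^2, whose fundamental solution is u(x) = |x|/2 (indeed u′ = (1/2) sign and hence u″ = δ_0 by Example 3.1.13): a discrete convolution w = u ∗ f reproduces f after applying a second difference quotient. The grid and the right-hand side are arbitrary choices.

```python
import numpy as np

# grid and a rapidly decaying right-hand side f
h = 0.01
x = np.arange(-20.0, 20.0 + h, h)
f = np.exp(-x**2)

# u(x) = |x|/2 is the fundamental solution of d^2/dx^2 in one dimension;
# w(x_i) ~ h * sum_j u(x_i - x_j) f(x_j) is a Riemann sum for (u * f)(x_i)
w = h * np.array([np.sum(0.5 * np.abs(xi - x) * f) for xi in x])

# the second difference quotient of w reproduces f in the interior of the grid
w_xx = (w[2:] - 2.0 * w[1:-1] + w[:-2]) / h**2
print(np.max(np.abs(w_xx - f[1:-1])))   # tiny, only floating point rounding remains
```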


We now show for the example of the three-dimensional Laplace operator
how a fundamental solution for a linear differential operator with constant
coefficients can in principle be obtained.

Example 3.2.16 (Laplace operator in three dimensions). Let L = ∆ be the
Laplace operator in R^3. Suppose that u ∈ S′(R^3) is a fundamental solution
for ∆, i.e. ∆u = δ_0. Taking Fourier transforms on both sides we obtain for
ϕ ∈ S(R^3)
\[
\langle (2\pi)^{-3/2}\,1, \varphi\rangle = \langle \mathcal{F}(\delta_0), \varphi\rangle = \langle \mathcal{F}(\Delta u), \varphi\rangle = \langle \Delta u, \mathcal{F}\varphi\rangle = \langle u, \Delta\mathcal{F}\varphi\rangle
= \langle u, \mathcal{F}(-|\cdot|^2\varphi(\cdot))\rangle = \langle \mathcal{F}u, -|\cdot|^2\varphi(\cdot)\rangle = \langle -|x|^2\,\mathcal{F}u, \varphi\rangle.
\]
Hence, (2π)^{-3/2}·1 = −|x|^2 F u in S′(R^3). Therefore we can try the
choice (F u)(x) = −(2π)^{-3/2}|x|^{-2}. Since the Fourier transform is a bijection
on the space of tempered distributions, this equation is equivalent to u =
−(2π)^{-3/2} F^{-1}(|x|^{-2}). Note that
\[
\int_{|x|\le R} \frac{1}{|x|^2}\,dx = \int_0^R \int_{S^2} r^{-2}\, r^2\,d\theta\,dr = 4\pi R < \infty.
\]

This shows that x ↦ |x|^{-2} ∈ L^1_loc(R^3). Therefore its Fourier transform exists in
the sense of tempered distributions. Moreover, one has |x|^{-2} 1_{B(0,R)} → |x|^{-2} in
S′(R^3) as R → ∞. This is a direct consequence of the dominated convergence
theorem. By the continuity of the Fourier transform, we therefore obtain
F^{-1}(|x|^{-2} 1_{B(0,R)}) → F^{-1}(|x|^{-2}) as R → ∞. Hence for ϕ ∈ S(R^3), we can determine
the action of u on ϕ via the calculation
\[
\begin{aligned}
\langle u, \varphi\rangle &= \langle -(2\pi)^{-3/2}\,\mathcal{F}^{-1}(|\cdot|^{-2}), \varphi\rangle
= -\lim_{R\to\infty} (2\pi)^{-3/2}\, \langle \mathcal{F}^{-1}(|\cdot|^{-2}\,1_{B(0,R)}), \varphi\rangle \\
&= -\lim_{R\to\infty} (2\pi)^{-3/2}\, \langle |\cdot|^{-2}\,1_{B(0,R)}, \mathcal{F}^{-1}\varphi\rangle
= -\lim_{R\to\infty} (2\pi)^{-3/2} \int_{R^3} |x|^{-2}\,1_{B(0,R)}(x)\, (\mathcal{F}^{-1}\varphi)(x)\,dx \\
&= -\frac{1}{(2\pi)^3}\lim_{R\to\infty} \int_{|x|\le R} |x|^{-2} \int_{R^3} \varphi(y)\, e^{ix\cdot y}\,dy\,dx
= -\frac{1}{(2\pi)^3}\lim_{R\to\infty} \int_{R^3} \int_{|x|\le R} |x|^{-2}\, e^{ix\cdot y}\,dx\, \varphi(y)\,dy.
\end{aligned}
\]
We now deal with the inner integral. For R > 0 we have by a change to spherical
coordinates
\[
\int_{|x|\le R} |x|^{-2}\, e^{ix\cdot y}\,dx = \int_0^R \int_{S^2} r^{-2}\, e^{ir\theta\cdot y}\,d\theta\, r^2\,dr
= \int_0^R \int_{S^2} \cos(r\,\theta\cdot y)\,d\theta\,dr,
\]
where the imaginary part vanishes because of the symmetry θ ↦ −θ of the sphere.

Let us again first take a look at the inner angular integral. The following argument
holds for all r > 0 and |y| ≠ 0: Choose an orthogonal matrix O such that Oe_z =
y/|y|. Then, using the substitution s = θ_z = cos φ in spherical coordinates
(φ ∈ [0, π], α ∈ [0, 2π), for which the surface measure of S^2 becomes dθ = ds dα),
we obtain
\[
\begin{aligned}
\int_{S^2} \cos(r\,\theta\cdot y)\,d\theta
&= \int_{S^2} \cos(r|y|\,\theta\cdot Oe_z)\,d\theta
= \int_{S^2} \cos(r|y|\,O^{-1}\theta\cdot e_z)\,d\theta
= \int_{S^2} \cos(r|y|\,\theta_z)\,d\theta \\
&= \int_0^{2\pi}\int_{-1}^{1} \cos(r|y|s)\,ds\,d\alpha
= 2\pi \int_{-1}^{1} \cos(r|y|s)\,ds
= 2\pi\, \Bigl[\frac{\sin(r|y|s)}{r|y|}\Bigr]_{s=-1}^{s=1}
= 4\pi\, \frac{\sin(r|y|)}{r|y|}.
\end{aligned}
\]
Recall the identity \(\int_0^\infty \frac{\sin x}{x}\,dx = \pi/2\) for the sinc function. By the substitution
formula we therefore obtain for |y| ≠ 0
\[
\int_0^\infty \frac{\sin(r|y|)}{r|y|}\,dr = \frac{\pi}{2|y|}.
\]

Putting all calculations together, we obtain
\[
\begin{aligned}
\langle u, \varphi\rangle
&= -\frac{1}{(2\pi)^3} \lim_{R\to\infty} \int_{R^3} 4\pi \int_0^R \frac{\sin(r|y|)}{r|y|}\,dr\, \varphi(y)\,dy \\
&= -\frac{1}{(2\pi)^3} \int_{R^3} 2\pi^2\, \frac{1}{|y|}\, \varphi(y)\,dy
= -\frac{1}{4\pi} \int_{R^3} \frac{1}{|y|}\, \varphi(y)\,dy.
\end{aligned}
\]
Here we can exchange the limit and the integral with the help of the dominated
convergence theorem, where we use the fact that there exists a universal
constant M > 0 such that
\[
\Bigl|\int_0^R \frac{\sin x}{x}\,dx\Bigr| \le M \text{ for all } R > 0
\quad\Longrightarrow\quad
\Bigl|\int_0^R \frac{\sin(ax)}{ax}\,dx\Bigr| \le \frac{M}{a} \text{ for all } a, R > 0.
\]
Hence, altogether we have shown that u(y) = −(4π)^{-1}|y|^{-1} is a fundamental
solution for the Laplace operator on R^3.

By a similar calculation one can see that an analogous formula holds for
all dimensions n ≥ 3, whereas for n = 2 one obtains a logarithmic term.
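As a numerical sanity check of the defining property ⟨u, ∆ϕ⟩ = ⟨∆u, ϕ⟩ = ϕ(0) for u(y) = −(4π|y|)^{-1}, one can take the radial test function ϕ(x) = e^{-|x|^2} (an arbitrary choice, with ∆ϕ(x) = (4|x|^2 − 6)e^{-|x|^2}) and reduce the pairing to a one-dimensional integral in spherical coordinates:

```python
import numpy as np
from scipy.integrate import quad

# phi(x) = exp(-|x|^2), so phi(0) = 1 and (Delta phi)(x) = (4|x|^2 - 6) exp(-|x|^2)
def laplacian_phi(r):
    return (4.0 * r**2 - 6.0) * np.exp(-r**2)

# <u, Delta phi> with u(x) = -1/(4 pi |x|), written in spherical coordinates:
# int_0^inf  -1/(4 pi r) * laplacian_phi(r) * 4 pi r^2  dr
val, _ = quad(lambda r: -r * laplacian_phi(r), 0.0, np.inf)
print(val)   # approximately 1.0 = phi(0), i.e. Delta u = delta_0
```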

3.3 The Nuclear Spectral Theorem


We now come back to the considerations related to the spectral theorem. For
this let us again consider the momentum operator A = −i d/dx with D(A) =
H^1(R). We have seen that the spectrum is given by σ(A) = R and that A has
no eigenvectors in L^2(R). Nevertheless one can find non-square integrable
eigenfunctions. In fact, for each ξ ∈ R the function f_ξ(x) = e^{ixξ} ∉ L^2(R) satisfies
\[
(Af_\xi)(x) = -i\cdot i\xi\, e^{ix\xi} = \xi\, e^{ix\xi} = \xi f_\xi(x).
\]
Hence, in a generalized sense A has a complete system of generalized eigen-
functions.
A similar argument can also be applied to the position operator M_x. In
fact, for x_0 ∈ R the delta distribution δ_{x_0} ∈ S′(R) satisfies in the sense of
distributions
\[
M_x(\delta_{x_0}) = x\cdot\delta_{x_0} = x_0\,\delta_{x_0}.
\]
Intuitively, this again means that there exists a complete system of generalized
eigenfunctions in a certain sense. One sees, however, that in order to obtain
these eigenfunctions one needs to extend the operator to a larger class of
functions or distributions.

3.3.1 Gelfand Triples


This can be formalized with the help of Gelfand triples. For this we must again
consider Fréchet spaces (actually, it is more natural to formulate most of the
concepts in the context of general locally convex topological vector spaces;
however, we do not want to delve too deeply into this theory). Recall that a
Fréchet space can be described by a complete metric which is invariant with
respect to translations, i.e. one has d(x + z, y + z) = d(x, y) for all x, y, z ∈ G. All
topological notions such as convergence or closedness are understood with
respect to this metric. As for Hilbert spaces we now define dual spaces of
Fréchet spaces.

Definition 3.3.1 (Dual of Fréchet spaces). Let G be a Fréchet space. Its dual
space G′ is defined as
\[
G' := \{\varphi : G \to C \text{ linear and continuous}\}.
\]
We endow G′ with the weak∗-topology, i.e. a sequence (ϕ_n)_{n∈N} ⊂ G′ satisfies
ϕ_n → ϕ ∈ G′ if and only if ϕ_n(g) → ϕ(g) for all g ∈ G.

Further, consider additionally a Hilbert space H together with a continu-
ous linear embedding i : G → H with dense range. Here continuous as usual
means that x_n → x in G implies i(x_n) → i(x) in H. Observe that i induces a
continuous anti-linear embedding i† : H → G′ given by
\[
i^\dagger : x \mapsto [\,g \mapsto \langle x | i(g)\rangle\,].
\]
In fact, g_n → g in G implies i(g_n) → i(g) in H and therefore i†(x)(g_n) → i†(x)(g)
by the continuity of the scalar product. With the same argument one sees
that x_n → x in H implies i†(x_n)(g) → i†(x)(g) for all g ∈ G. This shows the
continuity of the map i† : H → G′. Furthermore, the injectivity of i† follows
from the density of the range of i in H because an element of the dual space H∗
is uniquely determined by its values on a dense subset.

Definition 3.3.2 (Gelfand triple). Let G be a Fréchet space and H a Hilbert
space together with a continuous linear embedding i : G → H with dense
range. The induced structure
\[
G \overset{i}{\hookrightarrow} H \overset{i^\dagger}{\hookrightarrow} G'
\]
is called the Gelfand triple associated to (G, H, i).

In the following we will often ignore the embedding i and directly identify
elements of G with elements in H. However, one has to be careful when one
identifies elements of H with elements of G 0 because the embedding i † is
anti-linear. The next example is probably the most important Gelfand triple.
Example 3.3.3. Consider S(R^n) ↪ L^2(R^n) ↪ S′(R^n), where the embedding
i : S(R^n) → L^2(R^n) is the natural inclusion and i† : L^2(R^n) → S′(R^n) is given by
\[
i^\dagger : f \mapsto \Bigl[\, S(R^n) \ni \varphi \mapsto \int_{R^n} \overline{f(x)}\,\varphi(x)\,dx \,\Bigr].
\]
Hence, except for the complex conjugate, i† is the restriction of
the inclusion of L^1_loc-functions into the space of distributions.

From now on let G ↪ H ↪ G′ be a fixed Gelfand triple. Before we can
talk in a mathematically rigorous way about generalized eigenvectors, we must
extend the action of a self-adjoint operator (A, D(A)) on H to the Gelfand triple.
For this we make the additional assumption that G ⊆ D(A) and AG ⊆ G, i.e. A
leaves G invariant. This means that A restricts to a linear operator
on G (which is automatically continuous by the variant of the closed graph
theorem for Fréchet spaces). One then obtains the dual map
\[
A' : G' \to G', \qquad
\varphi \mapsto [\, g \mapsto \varphi(Ag) =: \langle\varphi, Ag\rangle_{G',G} \,].
\]
We leave it to the reader to verify that this map is well-defined.

Definition 3.3.4 (Generalized eigenvector). Let (A, D(A)) be a self-adjoint
operator on some Hilbert space H and G ↪ H ↪ G′ a Gelfand triple with
G ⊆ D(A) and AG ⊆ G. Then the eigenvectors of A′ : G′ → G′ are called the
generalized eigenvectors of A.


Suppose that y ∈ D(A) is an eigenvector of A, i.e. Ay = λy for some λ ∈ R.
Then we have for all x ∈ G
\[
\langle A' i^\dagger y, x\rangle_{G',G} = \langle i^\dagger y, Ax\rangle_{G',G} = \langle y|Ax\rangle = \lambda\langle y|x\rangle = \langle \lambda\, i^\dagger y, x\rangle_{G',G}.
\]
Hence, A′i†y = λ i†y. This shows that every eigenvector of A in the classical
sense is also a generalized eigenvector of A. More structurally speaking, by
adapting the above reasoning we see that for y ∈ D(A) and x ∈ G
\[
\langle A' i^\dagger y, x\rangle_{G',G} = \langle i^\dagger y, Ax\rangle_{G',G} = \langle y|Ax\rangle = \langle Ay|x\rangle = \langle i^\dagger Ay, x\rangle_{G',G},
\]
which shows that (in a certain sense) A′ is an extension of A to G′.


Observe that until now we have only defined generalized eigenvectors, but
we do not know yet when a system of generalized eigenvectors is complete.
For λ ∈ R let E(λ) := Ker(λ Id − A′) ⊆ G′ be the corresponding eigenspace of A′.
We now define a generalized variant of the Fourier transform for elements of G.

Definition 3.3.5 (Generalized Fourier transform). Let G ↪ H ↪ G′ be a
Gelfand triple and (A, D(A)) a self-adjoint operator on H with G ⊆ D(A) and
AG ⊆ G. Then one defines the generalized Fourier transform
\[
\widehat{\;\cdot\;} : G \to \prod_{\lambda\in R} E(\lambda)', \qquad
g \mapsto \hat g = \bigl( E(\lambda) \ni \varphi \mapsto \langle\varphi, g\rangle_{G',G} \bigr)_{\lambda\in R}.
\]
We say that A has a complete system of generalized eigenvectors in G′ if the
generalized Fourier transform is injective, i.e. ĝ = 0 implies g = 0.

Of course, E(λ)′ may be trivial for certain values of λ ∈ R. One can show
that A has a complete system of generalized eigenvectors if and only if the
span of all E(λ) is dense in G′ with respect to the weak∗-topology. As before we
will ignore these topological issues and will instead consider some examples.
First we consider the momentum operator in one dimension.

Example 3.3.6 (Generalized eigenvectors for the momentum operator). We
consider the momentum operator A = −i d/dx with D(A) = H^1(R). We choose
G = S(R). Then G′ = S′(R) is the space of all tempered distributions. We
choose i : S(R) ↪ L^2(R) as the canonical inclusion. Then we obtain the
Gelfand triple S(R) ↪ L^2(R) ↪ S′(R) considered in Example 3.3.3. Observe
that S(R) ⊂ H^1(R) and AS(R) ⊂ S(R). Hence, there exists the extension
A′ : S′(R) → S′(R) of A. Explicitly, one has for u ∈ S′(R) and ϕ ∈ S(R)
\[
\langle A'u, \varphi\rangle_{S'(R),S(R)} = \langle u, A\varphi\rangle_{S'(R),S(R)} = \langle u, -i\varphi'\rangle_{S'(R),S(R)}
= \langle u', i\varphi\rangle_{S'(R),S(R)} = \langle iu', \varphi\rangle_{S'(R),S(R)}.
\]


This shows that A′u = iu′ in the sense of distributions. Note that for ϕ ∈ S(R)
one has A′i†ϕ = i(i†ϕ)′ = i†(−iϕ′) = i†Aϕ, as it should hold in general. Now let us
determine the generalized real eigenvalues of A, i.e. the eigenvalues of A′. For
this we must find, for given λ ∈ R, all distributional solutions u ∈ S′(R) of
\[
A'u = iu' = \lambda u.
\]
One can show that all distributional solutions of the above equation already
are smooth functions (we have already used and proved an easier variant of
this result where we additionally assumed that u′ ∈ L^1_loc(R)). Hence, every
solution of the above equation is of the form u(x) = c e^{-iλx} for some c ∈ C.
In other words, this shows that E(λ) = {c e^{-iλx} : c ∈ C} for all λ ∈ R. We now
determine the generalized Fourier transform with respect to A. For ϕ ∈ S(R)
we have
\[
c\, e^{-i\lambda x} \mapsto \langle c\, e^{-i\lambda x}, \varphi\rangle_{S'(R),S(R)} = c \int_R \varphi(x)\, e^{-i\lambda x}\,dx.
\]
Hence, if we identify linear mappings C → C with complex numbers, we
obtain ∏_{λ∈R} E(λ)′ ≅ ∏_{λ∈R} C, which is the space of all functions f : R → C.
Using this identification, for ϕ ∈ S(R) the generalized Fourier transform
ϕ̂ agrees with the function λ ↦ ∫_R ϕ(x) e^{-iλx} dx, i.e. up to a normalization
constant with the usual Fourier transform. Since a Schwartz function is
uniquely determined by its Fourier transform, we see that A has a complete
system of generalized eigenvectors.

As a second example we consider the position operator in one dimension.

Example 3.3.7 (Generalized eigenvectors for the position operator). We
consider the position operator M_x on L^2(R). We again work with the Gelfand
triple S(R) ↪ L^2(R) ↪ S′(R) as in the last example. Note that one has S(R) ⊂
D(M_x) and M_x S(R) ⊂ S(R). The extension M_x′ : S′(R) → S′(R) is explicitly
given for u ∈ S′(R) and ϕ ∈ S(R) by
\[
\langle M_x'u, \varphi\rangle_{S'(R),S(R)} = \langle u, M_x\varphi\rangle_{S'(R),S(R)} = \langle u, x\cdot\varphi\rangle_{S'(R),S(R)} = \langle x\cdot u, \varphi\rangle_{S'(R),S(R)}.
\]
Hence, M_x′ : u ↦ x·u is the multiplication of the given distribution by the
function x ↦ x. Now, M_x′ δ_{x_0} = xδ_{x_0} = x_0 δ_{x_0} holds in a mathematically
rigorous way. Moreover, one can show that all solutions of the eigenvalue equation
\[
M_x'u = x\cdot u = x_0 u \quad\Longleftrightarrow\quad (x-x_0)\cdot u = 0
\]
are constant multiples of δ_{x_0}. In fact, passing to the Fourier transform (for
tempered distributions) the problem reduces to the eigenvalue problem of the
previous example. Put differently, we have E(λ) = {cδ_λ : c ∈ C} for all λ ∈ R.


Let us now calculate the generalized Fourier transform with respect to M_x. For
η = cδ_λ ∈ E(λ) and ϕ ∈ S(R) we have
\[
c\delta_\lambda \mapsto \langle c\delta_\lambda, \varphi\rangle_{S'(R),S(R)} = c\,\varphi(\lambda).
\]
Hence, if we again identify linear mappings C → C with complex numbers,
we obtain ∏_{λ∈R} E(λ)′ ≅ ∏_{λ∈R} C, the space of all functions f : R → C. Using
this identification, for ϕ ∈ S(R) the generalized Fourier transform ϕ̂ agrees
with the function λ ↦ ϕ(λ), that is ϕ̂ = ϕ, and the generalized Fourier transform
is the identity mapping. Of course, from this it follows immediately that M_x
has a complete system of generalized eigenvectors.

The nuclear spectral theorem says that one can always find a complete
system of generalized eigenvectors for a self-adjoint operator A which is com-
patible with a Gelfand triple (G, H, i), provided the Fréchet space G is nice
enough. This niceness condition is, in mathematical terms, that G is a so-called
nuclear space. This is a rather abstract mathematical concept from the the-
ory of locally convex topological vector spaces and needs some work to be
properly presented. Physics students who feel overwhelmed by the amount
of definitions may simply work with the fact that the space of Schwartz
functions S(R^n) is nuclear and jump directly to the statement of the nuclear
spectral theorem.
From now on let G be a Fréchet space given by a countable family of
increasing semi-norms (pn )n∈N . For such a semi-norm we can define its local
Banach space.

Definition 3.3.8 (Local Banach space). Let G be a Fréchet space and p one of
its defining semi-norms. Then for the null space N_p = {x ∈ G : p(x) = 0} define
the quotient vector space
\[
G_p := G/N_p = \{x + N_p : x \in G\}.
\]
Then G_p, endowed with the norm induced by p, is a normed vector space and
its completion Ĝ_p is called the local Banach space for p.

Note that for p ≤ q one has p(x) = 0 whenever q(x) = 0. Hence, Nq ⊆ Np


and one obtains a natural contractive linear map Gq → Gp sending x + Nq to
x + Np . Passing to the completions, we obtain a contractive operator Ĝq → Ĝp
between the two local Banach spaces. In particular in our situation, using the
shorthand notation Gk = Gpk and so on, we obtain natural maps Ĝl → Ĝk for
l ≥ k.
We now generalize the concept of trace class operator to mappings between
general Banach spaces. This concept goes back to the work of A. Grothendieck.


Definition 3.3.9. A linear map T : X → Y between two Banach spaces is
called nuclear if there exist an absolutely summable sequence (λ_n)_{n∈N} and
sequences (x_n′)_{n∈N} ⊂ X′ and (y_n)_{n∈N} ⊂ Y with ‖x_n′‖, ‖y_n‖ ≤ 1 for all n ∈ N
such that
\[
Tx = \sum_{n=1}^{\infty} \lambda_n\, \langle x_n', x\rangle_{X',X}\; y_n \qquad\text{for all } x \in X.
\]

We now can define nuclear spaces.

Definition 3.3.10. A Fréchet space G given by an increasing family (pk )k∈N


of semi-norms is called nuclear if for all k ∈ N there is some l > k such that
the natural map Ĝl → Ĝk is nuclear.

Intuitively, a nuclear map has a very small image. Hence, in a nuclear
space the unit balls with respect to the semi-norms decrease rapidly. As a rule
of thumb one can say that all naturally occurring important function spaces
which are Fréchet spaces are either Banach spaces or nuclear spaces. Nuclear
Fréchet spaces – in contrast to infinite dimensional Banach spaces – behave in
many regards very similarly to finite dimensional spaces. For example, the
Heine–Borel characterization of compact subsets holds for nuclear Fréchet spaces.
Let us illustrate all this with a concrete example.

Example 3.3.11 (The space of smooth functions). Let us consider the space
G = C^∞([0,1]). We naturally want that a sequence (f_n)_{n∈N} ⊂ C^∞([0,1]) con-
verges to some f ∈ C^∞([0,1]) if and only if D^k f_n → D^k f uniformly on [0,1] for
all k ∈ N_0. This is achieved if we require that (f_n)_{n∈N} converges with respect
to all semi-norms p_k(f) := sup_{x∈[0,1]} |(D^k f)(x)| for k ∈ N_0. One sees that the
family (p_k)_{k∈N_0} satisfies all requirements in the definition of a Fréchet space.
Hence, (p_k)_{k∈N_0} gives C^∞([0,1]) the structure of a Fréchet space. Alternatively,
one can also work with the norms q_k(f) := ‖f‖_{H^k(0,1)} for k ∈ N_0. Note that
p_k(f_n − f) → 0 for all k ∈ N_0 if and only if q_k(f_n − f) → 0 for all k ∈ N_0. This is
a consequence of the fact that, by the Sobolev embedding theorems (Theorem
2.1.36), convergence in H^m((0,1)) implies convergence in C^k([0,1]) provided m
is large enough. Observe that the family (q_k)_{k∈N_0} is monotone, i.e. q_k ≤ q_{k+1}
for all k ∈ N_0.
We now show that C^∞([0,1]) is a nuclear space. Since q_k is already a norm,
the local Banach spaces are given by Ĝ_k = H^k((0,1)). For l > k the natural
map Ĝ_l → Ĝ_k is the natural inclusion H^l((0,1)) ↪ H^k((0,1)) of Sobolev spaces.
We must show that for l large enough these inclusions are nuclear mappings.
For this let us consider the discrete Fourier transform F : L^2([0,1]) → ℓ^2(Z),
which is an isomorphism because the trigonometric system (e^{2πim·})_{m∈Z} is an
orthonormal basis of L^2([0,1]). Analogously to the continuous case, the image
of H^k((0,1)) under F is the space of all sequences (x_n)_{n∈Z} for which
((1+n^2)^{k/2} x_n)_{n∈Z} lies in ℓ^2(Z), or equivalently (x_n)_{n∈Z} lies in the weighted
space ℓ^2(Z, w_k) with weights w_k(n) = (1+n^2)^k. Hence, under the Fourier
transform the inclusion H^l((0,1)) ↪ H^k((0,1)) becomes the identity mapping
\[
\ell^2(Z, w_l) \hookrightarrow \ell^2(Z, w_k).
\]
Choose y_n = w_k(n)^{-1/2} e_n, where e_n is the n-th unit vector, and x_n = w_l(n)^{-1/2} e_n.
Then ‖x_n‖_{ℓ^2(Z,w_l)} = ‖y_n‖_{ℓ^2(Z,w_k)} = 1 and (x_n) and (y_n) form orthonormal
bases of the respective spaces. Hence, for all z ∈ ℓ^2(Z, w_l) we have
\[
z = \sum_{n\in Z} \langle x_n | z\rangle_{\ell^2(Z,w_l)}\, x_n
= \sum_{n\in Z} (1+n^2)^{(k-l)/2}\, \langle x_n | z\rangle_{\ell^2(Z,w_l)}\, y_n.
\]
This shows that for l ≥ k + 2 the identity mapping between the weighted spaces
is nuclear, because in this case the sequence ((1+n^2)^{(k-l)/2})_{n∈Z} is absolutely
summable. Hence, C^∞([0,1]) is a nuclear Fréchet space.
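The summability condition behind this nuclearity argument can be made very concrete: the coefficients (1+n^2)^{(k-l)/2} of the nuclear representation behave like |n|^{k-l}, which is summable over Z precisely when l ≥ k + 2. A small numerical illustration of this threshold (the truncation values below are arbitrary):

```python
import numpy as np

def coefficient_sum(k, l, N):
    # partial sum over |n| <= N of the nuclear coefficients (1 + n^2)^((k - l) / 2)
    n = np.arange(-N, N + 1, dtype=float)
    return np.sum((1.0 + n**2) ** ((k - l) / 2.0))

for l in (2, 3, 4):
    print(l, coefficient_sum(1, l, 10**3), coefficient_sum(1, l, 10**6))
# l = k + 1 = 2: the partial sums keep growing (logarithmic divergence),
# l >= k + 2:    the partial sums stabilize, so the coefficients are summable
```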

With the above result one can further easily deduce that the Fréchet space
C^∞(R) is nuclear. The ambitious reader may try to prove that the space of
Schwartz functions S(Rn ) is a nuclear Fréchet space as well, a fact which
we will use soon. The less ambitious readers can find a proof of this fact in
[Kab14, p. 279].
We now come to the final result of our lecture, the nuclear spectral theo-
rem.

Theorem 3.3.12 (Nuclear spectral theorem). Let G be a nuclear Fréchet space,
G ↪ H ↪ G′ a Gelfand triple and (A, D(A)) a self-adjoint operator on H with
G ⊆ D(A) and AG ⊆ G. Then for some index set K the operator A has a complete
system {η_{λ,k} : λ ∈ R, k ∈ K} of generalized eigenvectors in G′. Further, there exist
finite Borel measures (µ_k)_{k∈K} such that for x ∈ G one has the expansions
\[
x = \sum_{k\in K} \int_R \langle \eta_{\lambda,k}, x\rangle\, \eta_{\lambda,k}\,d\mu_k(\lambda)
\]
and
\[
Ax = \sum_{k\in K} \int_R \lambda\, \langle \eta_{\lambda,k}, x\rangle\, \eta_{\lambda,k}\,d\mu_k(\lambda).
\]
Further, for x ∈ G one has Plancherel's formula
\[
\|x\|_H^2 = \sum_{k\in K} \int_R |\langle \eta_{\lambda,k}, x\rangle|^2\,d\mu_k(\lambda).
\]

We now sketch some main steps of the proof. For a complete proof see
[Kab14, Satz 16.41]. One first establishes the following refined variant of
the spectral theorem for unbounded self-adjoint operators: for a self-adjoint
operator (A, D(A)) on H there exist a family of finite Borel measures (µ_k)_{k∈K}
on σ(A) and a unitary operator U : H → ⊕_{k∈K} L^2(σ(A), B(σ(A)), µ_k) such that
U A U^{-1} is the multiplication operator
\[
\bigoplus_{k\in K} L^2(\sigma(A), \mathcal{B}(\sigma(A)), \mu_k) \to \bigoplus_{k\in K} L^2(\sigma(A), \mathcal{B}(\sigma(A)), \mu_k),
\qquad (f_k)_{k\in K} \mapsto (\lambda \mapsto \lambda f_k(\lambda))_{k\in K}.
\]

This version of the spectral theorem can first be proven for bounded normal
operators and thereafter the self-adjoint case can be deduced with the help
of the Cayley transform. An elegant way to prove the spectral theorem for
a normal operator T ∈ B(H) is to establish a continuous functional calculus
C(σ(T)) → B(H) for T. One then decomposes the above representation into
a direct sum of cyclic representations which can be shown to be unitarily
equivalent to multiplication operators. For proofs of these facts we refer to
the literature on functional analysis given in the bibliography.
With the above unitary transform one can essentially reduce the prob-
lem to the case of multiplication operators. One now tries to define delta-
distributions δ_{k,λ} for k ∈ K and λ ∈ R. These distributions would be general-
ized eigenfunctions for A provided they are well-defined elements of G′. In
order to prove that these delta-distributions are indeed well-defined one uses
the fact that G is nuclear. In fact, one can show that if V is a nuclear Fréchet
space and one has a Gelfand triple of the form V ↪ L^2(Ω, Σ, µ) ↪ V′ for some
measure space (Ω, Σ, µ), then for almost all ω ∈ Ω the delta-distributions δ_ω
are well-defined elements of V′.
As a particular instance of the nuclear spectral theorem we obtain the
following corollary for differential operators which includes some quantum
mechanical operators.

Corollary 3.3.13 (Generalized eigenfunctions for differential operators).
Let m, n ∈ N and let a_α ∈ C^∞(R^n) be smooth real functions of polynomial growth
for |α| ≤ m. Then the symmetric differential operator D = Σ_{|α|≤m} a_α D^α on L^2(R^n)
with domain D(D) = S(R^n) can be extended to a self-adjoint operator A on L^2(R^n).
Further, A has a complete system of generalized eigenfunctions in S′(R^n).

Recall that the existence of self-adjoint extensions in the above situation is


guaranteed by von Neumann’s criterion (Theorem 2.4.2). Hence, the corollary
is a direct consequence of the nuclear spectral theorem. The above Corollary
applies to some quantum mechanical operators, however it does not apply
to Hamiltonians of the form −∆ + V for singular potentials V . A prominent
example for this situation is the Hamiltonian of the hydrogen atom.
However, the definition of nuclear spaces extends to general locally convex
topological vector spaces (you know how to do this if you know the theory
of locally convex spaces). Further, the nuclear spectral theorem in fact holds
for such general nuclear locally convex spaces, which need not be Fréchet
spaces. We have avoided general locally convex spaces in this lecture to
reduce topological difficulties. In fact, the topology of a general locally convex
space is not induced by a (translation-invariant) metric, in contrast to the
case of Fréchet spaces, and concepts such as continuity are more difficult to
formulate correctly. A prominent and important example of a nuclear space
which is not a Fréchet space is the space of test functions D(Ω) for some open
Ω ≠ ∅. Taking the nuclear spectral theorem for granted in the case of D(Ω),
we obtain the following strengthening of the previous corollary.

Corollary 3.3.14. Let m, n ∈ N, Ω ⊂ R^n open and a_α ∈ C^∞(Ω) smooth real
functions for |α| ≤ m. Then the symmetric differential operator D = Σ_{|α|≤m} a_α D^α
on L^2(Ω) with domain D(D) = D(Ω) can be extended to a self-adjoint operator A
on L^2(Ω). Further, A has a complete system of generalized eigenfunctions in D′(Ω).

Note that the above corollary can in fact be applied to the Hamiltonian
Ĥ = −∆ − |x|^{-1} of the hydrogen atom. In fact, we can choose Ω = R^3 \ {0}, as the
potential is smooth outside the singularity at zero. The above corollary then
shows that there exists a complete system of generalized eigenfunctions in
D′(R^3 \ {0}).

A The postulates of quantum mechanics
In this short appendix we present all postulates of quantum mechanics used
in our lectures in a condensed and mathematical manner. Our formulation of
the postulates closely follows the presentation in [Tak08].

Postulate 1. A quantum system is described by a complex Hilbert space H. The


Hilbert space of a composite system is the tensor product of Hilbert spaces of
the single component systems.

Postulate 2. The set of observables A of a quantum mechanical system with


Hilbert space H consists of all self-adjoint operators on H.

Postulate 3. The set of states S of a quantum system with a Hilbert space H


consists of all positive trace class operators ρ with Tr ρ = 1. Pure states are
projection operators onto one-dimensional subspaces of H. All other states
are called mixed states.

Postulate 4. The process of measurement in quantum mechanics is described
by the assignment
\[
\mathcal{A} \times \mathcal{S} \ni (A, \rho) \mapsto \mu_A,
\]
where µA is the probability measure on (R, B(R)) given by the Born–von Neu-
mann formula
µA (Ω) = Tr PA (Ω)ρ for Ω ∈ B(R),
where PA is the projection-valued measure associated to the self-adjoint opera-
tor A. For every Borel set Ω ⊆ R, the quantity 0 ≤ µA (Ω) ≤ 1 is the probability
that for a quantum system in the state ρ the result of a measurement of the
observable A belongs to Ω.
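A minimal finite-dimensional illustration of the Born–von Neumann formula (the observable and the state below are arbitrary 2×2 examples): the spectral projections of a Hermitian matrix yield the outcome probabilities Tr P_A({λ})ρ.

```python
import numpy as np

# an observable (Hermitian matrix) and a mixed state (positive, trace one)
A = np.array([[1.0, 1.0], [1.0, -1.0]])
rho = np.array([[0.7, 0.2], [0.2, 0.3]])

eigvals, eigvecs = np.linalg.eigh(A)
probs = []
for j, lam in enumerate(eigvals):
    v = eigvecs[:, [j]]
    P = v @ v.conj().T                    # spectral projection onto the eigenspace of lam
    probs.append(np.trace(P @ rho).real)  # Born-von Neumann: mu_A({lam}) = Tr P_A({lam}) rho

print(eigvals, probs)          # possible outcomes and their probabilities
print(sum(probs))              # = 1, since mu_A is a probability measure
print(np.trace(A @ rho).real)  # expectation value of A in the state rho
print(sum(l * p for l, p in zip(eigvals, probs)))  # equals the same expectation value
```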

Postulate 5. A finite set of observables A = {A_1, ..., A_n} can be measured si-
multaneously (simultaneously measured observables) if and only if they form
a commutative family. The simultaneous measurement of the commutative
family A in the state ρ ∈ S is described by the probability measure µA on Rn
given by
µA (Ω) = Tr PA (Ω)ρ for Ω ∈ B(Rn ),
where PA is the projection-valued measure associated to the family A. For
every Borel set Ω ⊆ Rn the quantity 0 ≤ µA (Ω) ≤ 1 is the probability that for a
quantum system in the state ρ the result of the simultaneous measurement of
the observables A1 , . . . , An belongs to Ω.


Postulate 6 (Schrödinger’s picture of time evolution). The dynamics of a


quantum system is described by a strongly continuous unitary group (U (t))t∈R
on the Hilbert space H of the system. Quantum observables do not depend
on time and the evolution of states is given by

S 3 ρ 7→ ρ(t) = U (t)ρU (t)−1 .

Recall that by Stone's theorem (Theorem 2.3.32) there is a one-to-one
correspondence between self-adjoint operators and strongly continuous uni-
tary groups on H. This gives the connection to the well-known Schrödinger
equation.
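In finite dimensions this correspondence can be made completely explicit: U(t) = e^{-itH} is a strongly continuous (here even norm continuous) unitary group for every Hermitian matrix H, and states evolve by conjugation. A small sketch with an arbitrary 2×2 Hamiltonian and state:

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5], [0.5, -1.0]])          # a "Hamiltonian" (Hermitian matrix)
rho = np.array([[0.7, 0.2], [0.2, 0.3]])         # a state: positive with trace one

def U(t):
    return expm(-1j * t * H)                     # the unitary group generated by H

t = 0.7
Ut = U(t)
print(np.allclose(Ut @ Ut.conj().T, np.eye(2)))  # U(t) is unitary
print(np.allclose(U(0.3) @ U(0.4), Ut))          # group property U(s)U(t) = U(s + t)

rho_t = Ut @ rho @ Ut.conj().T                   # Schroedinger picture evolution of the state
print(np.trace(rho_t).real)                      # trace is preserved, so rho_t is again a state
print(np.linalg.eigvalsh(rho_t))                 # eigenvalues stay nonnegative
```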
One may add additional postulates if one considers particles with spin or
systems of identical particles with spins (bosons and fermions), physical phe-
nomena which we have not discussed thoroughly in our lectures. Further, for
concrete physics one would like to have a mapping which assigns a quantum
observable to its classical counterpart. Such mappings are called quantization
rules and were discussed in Kedar’s part of the lecture.

Bibliography
[Bar95] Robert G. Bartle. The elements of integration and Lebesgue measure.
Wiley Classics Library. Containing a corrected reprint of the 1966
original The elements of integration. A Wiley-Interscience Publication.
John Wiley & Sons, Inc., New York, 1995, pp. xii+179.
[Bog07] V. I. Bogachev. Measure theory. Vol. I, II. Springer-Verlag, Berlin,
2007, Vol. I: xviii+500 pp., Vol. II: xiv+575.
[Hal13] Brian C. Hall. Quantum theory for mathematicians. Graduate Texts
in Mathematics, vol. 267. Springer, New York, 2013, pp. xvi+554.
[Kab14] Winfried Kaballo. Aufbaukurs Funktionalanalysis und Operatortheo-
rie. Springer-Verlag, 2014.
[Mor13] Valter Moretti. Spectral theory and quantum mechanics. With an
introduction to the algebraic formulation. Unitext, vol. 64, La
Matematica per il 3+2. Springer, Milan, 2013, pp. xvi+728.
[RS75] Michael Reed and Barry Simon. Methods of modern mathemati-
cal physics. II. Fourier analysis, self-adjointness. Academic Press
[Harcourt Brace Jovanovich, Publishers], New York-London, 1975,
pp. xv+361.
[Rud87] Walter Rudin. Real and complex analysis. Third. McGraw-Hill Book
Co., New York, 1987, pp. xiv+416.
[Tak08] Leon A. Takhtajan. Quantum mechanics for mathematicians. Gradu-
ate Studies in Mathematics, vol. 95. American Mathematical Society,
Providence, RI, 2008, pp. xvi+387.
[Tes12] Gerald Teschl. Ordinary differential equations and dynamical systems.
Graduate Studies in Mathematics, vol. 140. American Mathematical
Society, Providence, RI, 2012, pp. xii+356.
