Legendre-Fenchel transforms in a nutshell


Hugo Touchette
School of Mathematical Sciences, Queen Mary, University of London,
London E1 4NS, UK
Started: July 11, 2005; last compiled: August 14, 2007

The aim of this report is to list and explain the basic properties of the Legendre-Fenchel
transform, which is a generalization of the Legendre transform commonly encountered
in physics. The precise way in which the Legendre-Fenchel transform generalizes the
Legendre transform is carefully explained and illustrated with many examples and pictures. The understanding of the difference between the two transforms is important because the general transform which arises in statistical mechanics is the Legendre-Fenchel
transform, not the Legendre transform.
All the results contained in this report can be found with much more mathematical
detail and rigor in [1]. The proofs of these results can also be found in that reference.

1. Definitions

Consider a function $f(x): \mathbb{R} \to \mathbb{R}$. We define the Legendre-Fenchel (LF) transform of $f(x)$ by the variational formula

$$f^*(k) = \sup_{x \in \mathbb{R}} \{kx - f(x)\}. \tag{1}$$

We also express this transform by $f^*(k) = (f(x))^*$ or, more compactly, by $f^* = (f)^*$, where the star stands for the LF transform.
The LF transform of $f^*(k)$ is

$$f^{**}(x) = \sup_{k \in \mathbb{R}} \{kx - f^*(k)\}. \tag{2}$$

This also corresponds to the double LF transform of $f(x)$. The double-star notation comes from our compact notation for the LF transform:

$$f^{**} = (f^*)^* = ((f)^*)^*. \tag{3}$$
Remark 1. LF transforms can also be defined using an infimum (min) rather than a supremum (max):

$$g^*(k) = \inf_{x \in \mathbb{R}} \{kx - g(x)\}. \tag{4}$$

Transforming one version of the LF transform to the other is just a matter of introducing minus signs at the right place. Indeed,

$$-f^*(k) = -\sup_{x} \{kx - f(x)\} = \inf_{x} \{-kx + f(x)\}, \tag{5}$$

so that

$$g^*(q) = \inf_{x} \{qx - g(x)\}, \tag{6}$$

making the transformations $g(x) = -f(x)$ and $g^*(q) = -f^*(k = -q)$.


Remark 2. The Legendre-Fenchel transform is often referred to in physics as the Legendre transform. This does not do justice to Fenchel, who explicitly studied the variational formula (1) and applied it to nondifferentiable as well as nonconvex functions. What Legendre actually considered is the transform defined by

$$f^*(k) = k x_k - f(x_k), \tag{7}$$

where $x_k$ is determined by solving

$$f'(x) = k. \tag{8}$$

This form is more limited in scope than the LF transform, as it applies only to differentiable functions and, as we shall see later, convex functions. In this sense, the LF transform is a generalization of the Legendre transform, which reduces to the Legendre transform when applied to convex, differentiable functions. We shall comment more on this later.
Remark 3. The LF transform is not necessarily self-inverse (we also say involutive); that is to say, $f^{**}$ need not necessarily be equal to $f$. The equality $f^{**} = f$ is taken for granted too often in physics; we'll see later in which cases it actually holds and in which cases it does not.
Remark 4. The definition of the LF transform can trivially be generalized to functions defined on higher-dimensional spaces (i.e., functions $f(x): \mathbb{R}^n \to \mathbb{R}$, with $n$ a positive integer) by replacing the normal real-number product $kx$ by the scalar product $k \cdot x$, where $k$ is a vector having the same dimension as $x$.

Remark 5. (Steepest-descent or Laplace approximation). Consider the definite integral

$$F(k, n) = \int_{\mathbb{R}} e^{n[kx - f(x)]} \, dx. \tag{9}$$

In the limit $n \to \infty$, it is possible to approximate this integral using Laplace's method (or the steepest-descent method if $x \in \mathbb{C}$) by locating the maximum value of the integrand, which corresponds to the maximum value of the exponent $kx - f(x)$ (assume there is only one such value). This yields

$$F(k, n) \approx \exp\left( n \sup_{x \in \mathbb{R}} \{kx - f(x)\} \right). \tag{10}$$

It can be proved that the corrections to this approximation are subexponential in $n$, i.e.,

$$\ln F(k, n) = n \sup_{x \in \mathbb{R}} \{kx - f(x)\} + o(n). \tag{11}$$
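As a rough numerical illustration of (11), one can evaluate ln F(k, n)/n for growing n and watch it approach the supremum. The sketch below assumes f(x) = x^2/2 and k = 1 (so the supremum is 1/2), and approximates the integral by a Riemann sum on a finite grid; the function names are hypothetical. For this particular f the integral is Gaussian, so the o(n) correction is known exactly: ln F = n k^2/2 + (1/2) ln(2 pi / n).

```python
import math


def f(x):
    """Illustrative choice of f (an assumption, not taken from the report)."""
    return 0.5 * x * x


def log_integral(k, n, xs):
    """ln F(k, n) from Eq. (9), using a Riemann sum on the uniform grid xs.
    The largest exponent is factored out to avoid overflow (log-sum-exp)."""
    dx = xs[1] - xs[0]
    exponents = [n * (k * x - f(x)) for x in xs]
    top = max(exponents)
    return top + math.log(sum(math.exp(e - top) for e in exponents) * dx)


xs = [-10 + 0.001 * i for i in range(20001)]
k = 1.0
sup_value = 0.5 * k * k                      # sup_x {k x - x^2/2} = k^2/2

rates = {n: log_integral(k, n, xs) / n for n in (10, 100, 1000)}
gaps = [abs(rates[n] - sup_value) for n in (10, 100, 1000)]
for n, gap in zip((10, 100, 1000), gaps):
    print(f"n = {n:5d}: |ln F / n - sup| = {gap:.5f}")
```

The gap shrinks like ln(n)/n here, which is consistent with a subexponential correction.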
Remark 6. (The LF transform in statistical mechanics). Let $U$ be the energy function of an $n$-body system. In general, the density $\Omega_n(u)$ of microscopic states of the system having a mean energy $u = U/n$ scales exponentially with $n$, which is to say that

$$\ln \Omega_n(u) = n s(u) + o(n), \tag{12}$$

where $s(u)$ is the microcanonical entropy function of the system. (This can be taken as a definition of the microcanonical entropy.) Defining the canonical partition function of the system in the usual way, i.e.,

$$Z_n(\beta) = \int \Omega_n(u) \, e^{-n \beta u} \, du, \tag{13}$$

we can use Laplace's method to write

$$\varphi(\beta) = \lim_{n \to \infty} -\frac{1}{n} \ln Z_n(\beta) = \inf_{u} \{\beta u - s(u)\}. \tag{14}$$

Physically, $\varphi(\beta)$ represents the free energy of the system in the canonical ensemble. So, what the above result shows is that the canonical free energy is the LF transform of the microcanonical entropy ($\varphi = s^*$). The inverse result, namely $s = \varphi^*$, is not always true, as will become clear later.
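A toy model makes (14) concrete. The sketch below assumes n independent two-level units with energies 0 and 1, which is an illustrative model not discussed in the report. For it, the number of states with mean energy u is a binomial coefficient, so s(u) = -u ln u - (1 - u) ln(1 - u), and the partition function factorizes as (1 + e^{-beta})^n, so the free energy is -ln(1 + e^{-beta}). The code compares this closed form with the infimum in (14) taken over a grid of u values; all function names are hypothetical.

```python
import math


def entropy(u):
    """Assumed s(u) for independent two-level units with energies 0 and 1."""
    if u in (0.0, 1.0):
        return 0.0
    return -u * math.log(u) - (1 - u) * math.log(1 - u)


def free_energy_lf(beta, grid_size=20001):
    """Right-hand side of Eq. (14): inf over u of beta*u - s(u),
    with the infimum replaced by a minimum over a grid on [0, 1]."""
    us = (i / (grid_size - 1) for i in range(grid_size))
    return min(beta * u - entropy(u) for u in us)


def free_energy_exact(beta):
    """-(1/n) ln Z_n with Z_n = (1 + exp(-beta))^n for the same toy model."""
    return -math.log(1 + math.exp(-beta))


betas = [-2.0, -0.5, 0.0, 0.5, 2.0]
errors = [abs(free_energy_lf(b) - free_energy_exact(b)) for b in betas]
print(max(errors))
```

Because this s(u) is concave and differentiable, this is a case where the inverse relation also holds; the cases where it fails involve nonconcave entropies.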

2. Theory of LF transforms

The theory of LF transforms deals mainly with two questions:

Q1: How is the shape of $f^*(k)$ determined by the shape of $f(x)$, and vice versa?

Q2: When is the LF transform involutive? That is, when does $f^{**} = ((f)^*)^* = f$?

We'll see next that these two questions are answered by using a fundamental concept of convex analysis known as a supporting line.

2.1. Supporting lines

We say that the function $f: \mathbb{R} \to \mathbb{R}$ has or admits a supporting line at $x \in \mathbb{R}$ if there exists $\alpha \in \mathbb{R}$ such that

$$f(y) \geq f(x) + \alpha (y - x), \tag{15}$$

for all $y \in \mathbb{R}$. The parameter $\alpha$ is the slope of the supporting line. We further say that a supporting line is strictly supporting at $x$ if

$$f(y) > f(x) + \alpha (y - x) \tag{16}$$

holds for all $y \neq x$. For these definitions to make sense, we obviously need to have $f < \infty$.
Remark 7. For convenience, it is useful to replace the expression "$f$ admits a supporting line at $x$" by "$f$ is convex at $x$". So, from now on, the two expressions mean the same (this is a definition). If $f$ does not admit a supporting line at $x$, then we shall say that $f$ is nonconvex at $x$.
The geometrical interpretation of supporting lines is shown in Figure 1. In this figure, we see that

- The point $a$ admits a supporting line ($f$ is convex at $a$). The supporting line has the property that it touches $f$ at the point $(a, f(a))$ and lies beneath the graph of $f(x)$ for all $x$; hence the term "supporting".
- The supporting line at $a$ is strictly supporting because it touches the graph of $f(x)$ only at $a$. In this case, we say that $f$ is strictly convex at $a$.
- The point $b$ does not admit any supporting lines; any line passing through $(b, f(b))$ must cross the graph of $f(x)$ at some point. In this case, we also say that $f$ is nonconvex at $b$.
- The point $c$ admits a supporting line which is non-strictly supporting, as it touches another point ($d$) of the graph of $f(x)$. (The points $c$ and $d$ share the same supporting line.)

Figure 1: Geometric interpretation of supporting lines.

From this picture, we easily deduce the following result:

Proposition 1. If $f$ admits a supporting line at $x$ and $f'(x)$ exists, then the slope of the supporting line must be equal to $f'(x)$. In other words, for differentiable functions, a supporting line is also a tangent line.
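Condition (15) can be checked on sampled data: every chord to a point on the left gives a lower bound on the slope, every chord to a point on the right gives an upper bound, and a supporting line exists exactly when these bounds are compatible. The sketch below is one way to do this on a grid; the helper `supporting_slopes` and the double-well test function f(x) = (x^2 - 1)^2 are assumptions made for illustration. The three sampled points play roles similar to b, a and c in Figure 1.

```python
def supporting_slopes(xs, fs, i):
    """Interval (lo, hi) of slopes alpha satisfying Eq. (15) at xs[i] against
    every other sampled point, or None if no slope works (nonconvex point)."""
    x0, f0 = xs[i], fs[i]
    lo, hi = float("-inf"), float("inf")
    for x, fx in zip(xs, fs):
        if x < x0:
            lo = max(lo, (fx - f0) / (x - x0))   # left chords bound alpha from below
        elif x > x0:
            hi = min(hi, (fx - f0) / (x - x0))   # right chords bound alpha from above
    return (lo, hi) if lo <= hi else None


# Illustrative nonconvex function (an assumption): a double well on [-2, 2].
xs = [-2 + 0.01 * i for i in range(401)]
fs = [(x * x - 1) ** 2 for x in xs]

at_zero = supporting_slopes(xs, fs, 200)     # x = 0, top of the barrier
at_1_5 = supporting_slopes(xs, fs, 350)      # x = 1.5, where f'(x) = 7.5
at_one = supporting_slopes(xs, fs, 300)      # x = 1, a minimum shared with x = -1

print(at_zero, at_1_5, at_one)
```

On a grid the admissible slopes form a small interval rather than a single value; in line with Proposition 1, that interval brackets the derivative at the two convex points.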

2.2. Convexity properties

Before answering Q1 and Q2, let us pause briefly for two important results, which we
state without proofs.

Theorem 2. $f^*(k)$ is always a convex function of $k$ (independently of the shape of $f$).

Corollary 3. $f^{**}(x)$ is always a convex function of $x$ (again, independently of the shape of $f$).

The precise meaning of convex here is that $f^*$ (or $f^{**}$) admits a supporting line at all $k$ (all $x$, respectively). More simply, it means that $f^*$ and $f^{**}$ are $\cup$-shaped (see footnote 1).
Note that these results tell us already that the LF transform cannot always be involutive. Indeed, $f^{**}(x)$ is convex even if $f(x)$ is not, so that $f^{**} \neq f$ if $f$ is not everywhere convex. We'll see later that this is the only problematic case.
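Theorem 2 can be observed numerically: a maximum of functions that are linear in k is convex in k, so the discrete transform of any sampled function has nonnegative second differences. The sketch below assumes a double-well test function and a uniform k-grid; both, along with the helper name, are illustrative.

```python
def lf_transform(xs, fs, ks):
    """Discrete LF transform: max over the grid of k*x - f(x), for each k."""
    return [max(k * x - fx for x, fx in zip(xs, fs)) for k in ks]


def second_differences(values):
    """Centered second differences of a sequence sampled on a uniform grid."""
    return [values[j - 1] - 2 * values[j] + values[j + 1]
            for j in range(1, len(values) - 1)]


# Illustrative nonconvex function (an assumption): a double well on [-2, 2].
xs = [-2 + 0.01 * i for i in range(401)]
fs = [(x * x - 1) ** 2 for x in xs]
ks = [-5 + 0.05 * j for j in range(201)]

f_star = lf_transform(xs, fs, ks)

min_curvature_f = min(second_differences(fs))           # negative: f is nonconvex
min_curvature_f_star = min(second_differences(f_star))  # nonnegative up to rounding

print(min_curvature_f, min_curvature_f_star)
```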
Footnote 1: There seems to be some confusion in the literature about the definitions of concave and convex. The Webster (7th Edition), for one, defines a $\cup$-shaped function to be concave rather than convex. However, most mathematical textbooks will agree in defining the same function to be convex.

Figure 2: Illustration of the duality property for supporting lines: points of $f$ are transformed into slopes of $f^*$, and slopes of $f$ are transformed into points of $f^*$.

2.3. Supporting line duality


We now answer our first question (Q1): How is the shape of $f^*(k)$ determined by the shape of $f(x)$, and vice versa? A partial answer is provided by the following result:
Theorem 4. If $f$ admits a supporting line at $x$ with slope $k$, then $f^*$ admits a supporting line at $k$ with slope $x$.
This theorem is illustrated in Figure 2. The next theorem covers the special case of strict convexity.

Theorem 5. If $f$ admits a strict supporting line at $x$ with slope $k$, then $f^*$ admits a tangent supporting line at $k$ with slope $f^{*\prime}(k) = x$. (Hence $f^*$ is differentiable in this case, in addition to admitting a supporting line.)
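The duality between points and slopes in Theorem 5 can be checked numerically: the slope of the transform at k, estimated by a finite difference, should match the point x_k at which the supremum is attained. The sketch below assumes the strictly convex test function f(x) = x^4/4, for which x_k is the cube root of k; the helper name and step sizes are illustrative.

```python
def lf_with_argmax(xs, fs, k):
    """Return (f*(k), x_k): the discrete supremum of k*x - f(x) and the
    grid point x_k at which it is attained."""
    return max((k * x - fx, x) for x, fx in zip(xs, fs))


# Illustrative strictly convex function (an assumption): f(x) = x^4 / 4.
xs = [-3 + 0.001 * i for i in range(6001)]
fs = [0.25 * x ** 4 for x in xs]

h = 0.01                                   # step for the central difference in k
gaps = []
for j in range(26):
    k = 0.5 + 0.1 * j                      # slopes from 0.5 to 3.0
    _, x_k = lf_with_argmax(xs, fs, k)
    upper, _ = lf_with_argmax(xs, fs, k + h)
    lower, _ = lf_with_argmax(xs, fs, k - h)
    slope = (upper - lower) / (2 * h)      # numerical estimate of the slope of f* at k
    gaps.append(abs(slope - x_k))

max_gap = max(gaps)
last_point_error = abs(x_k - k ** (1 / 3))  # for the last k in the loop
print(max_gap, last_point_error)
```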

2.4. Inversion of LF transforms

The answer to Q2 ($f^{**} = f$) is provided by the following result:

Theorem 6. $f(x) = f^{**}(x)$ if and only if $f$ admits a supporting line at $x$.

Thus, from the point of view of $f(x)$, we have that the LF transform is involutive at $x$ if and only if $f$ is convex at $x$ (in the sense of supporting lines). Changing our point of view to $f^*(k)$, we have the following:

Theorem 7. If $f^*$ is differentiable at $k$, then $f = f^{**}$ at $x = f^{*\prime}(k)$.

We'll see later with a specific example that the differentiability property of $f^*$ is sufficient (as stated) but not necessary for $f = f^{**}$. For now, we note the following obvious corollary:

Corollary 8. If $f^*(k)$ is everywhere differentiable, then $f(x) = f^{**}(x)$ for all $x$.

This says in words that the LF transform is completely involutive if $f^*(k)$ is everywhere differentiable.
We end this section with another corollary and a result which helps us visualize the meaning of $f^{**}(x)$.

Corollary 9. A convex function can always be written as the LF transform of another function. (This is not true for nonconvex functions.)

Theorem 10. $f^{**}(x)$ is the largest convex function satisfying $f^{**}(x) \leq f(x)$.

Because of this result, we call $f^{**}(x)$ the convex envelope or convex hull of $f(x)$. We'll make the meaning of these expressions precise in the next section.
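Theorems 6 and 10 can be illustrated by applying a discrete transform twice. The sketch below assumes a double-well function f(x) = (x^2 - 1)^2 sampled on [-2, 2] (so f is treated as infinite outside that interval) and a k-grid wide enough to cover its slopes; all names and grid sizes are illustrative. For this f, the points with |x| >= 1 admit supporting lines while those with |x| < 1 do not, and the convex envelope is flat at 0 between the two wells.

```python
def lf_transform(xs, fs, ks):
    """Discrete LF transform: max over the grid of k*x - f(x), for each k."""
    return [max(k * x - fx for x, fx in zip(xs, fs)) for k in ks]


# Illustrative nonconvex function (an assumption): a double well on [-2, 2].
xs = [-2 + 0.01 * i for i in range(401)]
fs = [(x * x - 1) ** 2 for x in xs]
ks = [-30 + 0.05 * j for j in range(1201)]    # covers the slopes f'(-2) = -24 to f'(2) = 24

f_star = lf_transform(xs, fs, ks)             # f*(k)
f_2star = lf_transform(ks, f_star, xs)        # f**(x): transform of f* back onto xs

# Largest |f** - f| over the convex region |x| >= 1 (up to rounding in the grid).
gap_convex_region = max(abs(v - fx)
                        for x, fx, v in zip(xs, fs, f_2star) if abs(x) >= 0.995)
# Largest |f**| over the nonconvex region, where the envelope should be flat at 0.
envelope_inside = max(abs(v) for x, v in zip(xs, f_2star) if abs(x) < 0.995)
never_above = all(v <= fx + 1e-9 for fx, v in zip(fs, f_2star))

print(gap_convex_region, envelope_inside, never_above)
```

The residual gap in the convex region comes only from the finite resolution of the k-grid.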

3. Some particular cases

We consider in this section a number of examples which help visualize the meaning
and application of all the results presented in the previous section. All of the examples
considered arise in statistical mechanics.

3.1. Differentiable, convex functions

The LF transform

$$f^*(k) = \sup_{x} \{kx - f(x)\} \tag{17}$$

is in general evaluated by finding the critical points $x_k$ (there could be more than one) which maximize the function

$$F(x, k) = kx - f(x). \tag{18}$$

In mathematical notation, we express $x_k$ in the following manner:

$$x_k = \arg\sup_{x} F(x, k) = \arg\sup_{x} \{kx - f(x)\}, \tag{19}$$

where "arg sup" reads "arguments of the supremum" and means in words "points at which the maximum occurs".
Now, assume that $f(x)$ is everywhere differentiable. Then, we can find the maximum of $F(x, k)$ using the common rules of calculus by solving

$$\frac{\partial}{\partial x} F(x, k) = 0, \tag{20}$$

for a fixed value of $k$. Given the form of $F(x, k)$, this is equivalent to solving

$$k = f'(x) \tag{21}$$

for $x$ given $k$. As noted before, there could be more than one critical point of $F(x, k)$ solving the above equation. To make sure that there is actually only one solution for every $k \in \mathbb{R}$, we need to impose the following two conditions on $f$:

1. $f'(x)$ is continuous and monotonically increasing for increasing $x$;

2. $f'(x) \to \infty$ for $x \to \infty$ and $f'(x) \to -\infty$ for $x \to -\infty$.

Given these, we are assured that there exists a unique value $x_k$ for each $k \in \mathbb{R}$ satisfying $k = f'(x_k)$ and which maximizes $F(x, k)$. As a result, we can write

$$f^*(k) = k x_k - f(x_k), \tag{22}$$

where

$$f'(x_k) = k. \tag{23}$$

These two equations define precisely what the Legendre transform of $f(x)$ is (as opposed to the LF transform, which is defined with the sup formula). Accordingly, we have proved that the LF transform reduces to the Legendre transform for differentiable and strictly convex functions. (The strictly convex property results from the monotonicity of $f'(x)$.) Since $f(x)$ at this point is convex by assumption, we must have $f^{**} = f$ for all $x$. Therefore, the Legendre transform must be involutive (always), and the inverse Legendre transform is the Legendre transform itself; in symbols,

$$f(x) = k_x x - f^*(k_x), \tag{24}$$

where $k_x$ is the unique solution of

$$f^{*\prime}(k) = x. \tag{25}$$
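The recipe in (22)-(23) can be carried out with a simple root finder and compared against the sup formula. The sketch below assumes f(x) = cosh(x), whose derivative sinh(x) is continuous, increasing and unbounded in both directions, and whose transform has the closed form k asinh(k) - sqrt(1 + k^2). The bisection helper and its bracketing interval are illustrative choices. The last lines apply the same recipe to the transform itself, as in (24)-(25), to recover f.

```python
import math


def solve_increasing(g, target, lo=-50.0, hi=50.0, iterations=200):
    """Bisection for g(x) = target, assuming g is continuous and increasing
    on [lo, hi] and that the solution lies inside that interval."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def legendre(f, f_prime, k):
    """Eqs. (22)-(23): f*(k) = k*x_k - f(x_k), where x_k solves f'(x_k) = k."""
    x_k = solve_increasing(f_prime, k)
    return k * x_k - f(x_k)


def f_star_exact(k):
    """Closed-form transform of cosh for comparison."""
    return k * math.asinh(k) - math.sqrt(1 + k * k)


ks = [-3.0, -1.0, 0.0, 0.5, 2.0]
legendre_values = [legendre(math.cosh, math.sinh, k) for k in ks]
exact_values = [f_star_exact(k) for k in ks]

# The same values from the sup formula (17), with the sup taken over a grid.
xs = [-5 + 0.001 * i for i in range(10001)]
sup_values = [max(k * x - math.cosh(x) for x in xs) for k in ks]

# Inverse transform, Eqs. (24)-(25): apply the same recipe to f* to recover f.
recovered = legendre(f_star_exact, math.asinh, 0.7)
print(recovered, math.cosh(0.7))
```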

3.2. Function having a nondifferentiable point


What happens if $f(x)$ has one or more nondifferentiable points? Figure 3 shows a particular example of a function $f(x)$ which is nondifferentiable at $x_c$. What does its LF transform $f^*(k)$ look like?
The answer is provided by what we have learned about supporting lines. Let's consider the differentiable and nondifferentiable parts of $f(x)$ separately:

- Differentiable points of $f$: Each point $(x, f(x))$ on the differentiable branches of $f(x)$ admits a strict supporting line with slope $f'(x) = k$. From the results of the previous section, we then know that these points are transformed at the level of $f^*(k)$ into points $(k, f^*(k))$ admitting supporting lines of slope $f^{*\prime}(k) = x$. For example, the differentiable branch of $f(x)$ on the left (branch $a$ in Figure 3) is transformed into a differentiable branch of $f^*(k)$ (branch $a'$) which extends over all $k \in (-\infty, k_l]$. This range of $k$-values arises because the slopes of the left branch of $f(x)$ range from $-\infty$ to $k_l$. Similarly, the differentiable branch of $f(x)$ on the right (branch $b$) is transformed into the right differentiable branch of $f^*(k)$ (branch $b'$), which extends from $k_h$ to $+\infty$. (Note that, for the two differentiable branches, the LF transform reduces to the Legendre transform.)
- Nondifferentiable point of $f$: The nondifferentiable point $x_c$ admits not one but infinitely many supporting lines with slopes in the range $[k_l, k_h]$. As a result, each point of $f^*(k)$ with $k \in [k_l, k_h]$ must admit a supporting line with constant slope $x_c$ (branch $c'$). That is, $f^*(k)$ must have a constant slope $f^{*\prime}(k) = x_c$ in the interval $[k_l, k_h]$. We say in this case that $f^*(k)$ is affine or linear over $(k_l, k_h)$. (The affinity interval is always the open version of the interval over which $f^*$ has constant slope.)

Figure 3: Function having a nondifferentiable point; its LF transform is affine.

The case of functions having more than one nondifferentiable point is treated similarly by considering each nondifferentiable point separately.
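A concrete instance may help here. The sketch below assumes f(x) = |x - 1| + (x - 1)^2/2, which has a single kink at x_c = 1 with one-sided slopes k_l = -1 and k_h = 1. A short calculation gives the transform k + max(|k| - 1, 0)^2 / 2, which is affine with slope x_c = 1 between k_l and k_h. The function, grid and helper name are illustrative and not taken from the report.

```python
def lf_transform(xs, fs, ks):
    """Discrete LF transform: max over the grid of k*x - f(x), for each k."""
    return [max(k * x - fx for x, fx in zip(xs, fs)) for k in ks]


# Illustrative convex function with a kink at x_c = 1 (an assumption).
xs = [-9 + 0.01 * i for i in range(2001)]          # grid on [-9, 11], contains x = 1
fs = [abs(x - 1) + 0.5 * (x - 1) ** 2 for x in xs]
ks = [-3 + 0.25 * j for j in range(25)]            # slopes from -3 to 3

f_star = lf_transform(xs, fs, ks)
exact = [k + 0.5 * max(abs(k) - 1, 0.0) ** 2 for k in ks]
max_error = max(abs(a - b) for a, b in zip(f_star, exact))

# Slope of f* between k = -0.5 (index 10) and k = 0.5 (index 14), both inside (k_l, k_h).
slope_inside = (f_star[14] - f_star[10]) / (ks[14] - ks[10])
print(max_error, slope_inside)
```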

3.3. Affine function

Since $f(x)$ in the previous example is convex, $f^{**}(x) = f(x)$ for all $x$, and so the roles of $f$ and $f^*$ can be inverted to obtain the following: a convex function $f(x)$ having an affine part has an LF transform $f^*(k)$ having one nondifferentiable point; see Figure 4. More precisely, if $f(x)$ is affine over $(x_l, x_h)$ with slope $k_c$ in that interval, then $f^*(k)$ will have a nondifferentiable point at $k_c$ with left- and right-derivatives at $k_c$ given by $x_l$ and $x_h$, respectively.

Figure 4: Nondifferentiable points are transformed into affine parts under the action of the LF transform and vice versa.

3.4. Bounded-domain function with infinite slopes at boundaries

Consider the function $f(x)$ shown in Figure 5. This function has the particularity of being defined only on a bounded domain of $x$-values, which we denote by $[x_l, x_h]$. Furthermore, $|f'(x)| \to \infty$ as $x \to x_l + 0$ and $x \to x_h - 0$ (the derivative of $f$ blows up at the boundaries). Outside the interval of definition of $f(x)$, we formally set $f(x) = \infty$.
To determine the shape of $f^*(k)$, we use again what we know about supporting lines of $f$ and $f^*$. All points $(x, f(x))$ with $x \in (x_l, x_h)$ admit a strict supporting line with slope $k(x)$. These points are represented at the level of $f^*$ by points $(k(x), f^*(k(x)))$ having a supporting line of slope $x$. As $x$ approaches $x_l$ from the right, the slope of $f(x)$ diverges to $-\infty$. At the level of $f^*$, this implies that the slope of the supporting line of $f^*$ reaches $x_l$ as $k \to -\infty$. Similarly, since the slope of $f(x)$ goes to $+\infty$ as $x \to x_h$, the slope of the supporting line of $f^*$ reaches the value $x_h$ as $k \to +\infty$; see Figure 5.
Note, finally, that $f^{**} = f$ since $f$ is convex. This means that we can invert the roles of $f$ and $f^*$ in this example just like in the previous one to obtain the following: the LF transform of a convex function which is asymptotically linear is a convex function which is finite on a bounded domain with diverging slopes at the boundaries.
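One function with exactly these features is the lower half of the unit circle, f(x) = -sqrt(1 - x^2) on [-1, 1], with f infinite elsewhere. This choice is an assumption made for illustration. Its derivative diverges at both boundaries, and its transform works out to sqrt(1 + k^2), which behaves like |k| for large |k|, i.e., it is asymptotically linear with slopes x_l = -1 and x_h = 1. Restricting the grid to [-1, 1] plays the role of setting f to infinity outside the domain.

```python
import math


def lf_transform(xs, fs, ks):
    """Discrete LF transform: max over the grid of k*x - f(x), for each k."""
    return [max(k * x - fx for x, fx in zip(xs, fs)) for k in ks]


# Illustrative bounded-domain function (an assumption): lower unit half-circle.
xs = [-1 + 0.0001 * i for i in range(20001)]
# max(0.0, ...) guards against tiny negative values caused by rounding at x = 1.
fs = [-math.sqrt(max(0.0, 1 - x * x)) for x in xs]
ks = [-20.0, -10.0, -1.0, 0.0, 1.0, 10.0, 20.0]

f_star = lf_transform(xs, fs, ks)
max_error = max(abs(v - math.sqrt(1 + k * k)) for v, k in zip(f_star, ks))

# Average slope of f* between k = 10 and k = 20; it should be close to x_h = 1.
asymptotic_slope = (f_star[6] - f_star[5]) / (ks[6] - ks[5])
print(max_error, asymptotic_slope)
```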

3.5. Bounded-domain function with finite slopes at boundaries

Consider now a variation of the previous example. Rather than having diverging slopes at the boundaries $x_l$ and $x_h$, we assume that $f(x)$ has finite slopes at these points. We denote the right-derivative of $f$ at $x_l$ by $k_l$ and its left-derivative at $x_h$ by $k_h$.
For this example, everything works as in the previous example except that we have to be careful about the boundary points. As in the case of nondifferentiable points, $f$ at $x_h$ admits not one but infinitely many supporting lines with slopes taking values in the range $[k_h, \infty)$. At the level of $f^*$, this means that all points $(k, f^*(k))$ with $k \in [k_h, \infty)$ have supporting lines with constant slope $x_h$; that is, $f^*(k)$ is affine past $k_h$ with slope $x_h$. Likewise, $f$ at $x_l$ admits an infinite number of supporting lines with slopes now ranging from $-\infty$ to $k_l$. As a consequence, $f^*$ must be affine over the range $(-\infty, k_l)$ with constant slope $x_l$; see Figure 6.

Figure 5: Function defined on a bounded domain with diverging slopes at boundaries; its LF transform is asymptotically linear as $|k| \to \infty$.

Figure 6: Function defined on a bounded domain with finite slopes at boundaries; its LF transform has affine parts outside some interior domain.
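A simple example of this case is f(x) = x^2/2 restricted to [-1, 1], with f infinite elsewhere; this choice is an assumption made for illustration. The boundary slopes are k_l = -1 and k_h = 1, and the transform is k^2/2 for |k| <= 1 and |k| - 1/2 beyond, so it is affine with slope x_h = 1 past k_h and slope x_l = -1 below k_l. The sketch checks this on a grid confined to the domain.

```python
def lf_transform(xs, fs, ks):
    """Discrete LF transform: max over the grid of k*x - f(x), for each k."""
    return [max(k * x - fx for x, fx in zip(xs, fs)) for k in ks]


def f_star_exact(k):
    """Closed form for the transform of x^2/2 restricted to [-1, 1]."""
    return 0.5 * k * k if abs(k) <= 1 else abs(k) - 0.5


# Illustrative bounded-domain function with finite boundary slopes (an assumption).
xs = [-1 + 0.01 * i for i in range(201)]     # grid confined to [-1, 1]
fs = [0.5 * x * x for x in xs]
ks = [-4 + 0.5 * j for j in range(17)]       # slopes from -4 to 4

f_star = lf_transform(xs, fs, ks)
max_error = max(abs(v - f_star_exact(k)) for v, k in zip(f_star, ks))

# Slopes of the affine parts: k from 2 to 4 (indices 12, 16) and from -4 to -2 (indices 0, 4).
slope_right = (f_star[16] - f_star[12]) / (ks[16] - ks[12])
slope_left = (f_star[4] - f_star[0]) / (ks[4] - ks[0])
print(max_error, slope_left, slope_right)
```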

3.6. Nonconvex function

Our last example is quite interesting, as it illustrates the precise case for which the LF transform is not involutive, namely nonconvex functions.
The function that we consider is shown in Figure 7; it has three branches having the following properties:

- Branch $a$: The points on this branch, which extends from $x = -\infty$ to $x_l$, admit strict supporting lines. This branch is thus transformed into a differentiable branch at the level of $f^*$ (branch $a'$).
- Branch $b$: Similarly as for branch $a$.
- Branch $c$: None of the points on this branch, which extends over $(x_l, x_h)$, admit supporting lines. This means that these points are not represented at the level of $f^*$. In other words, there is not one point of $f^*$ which admits a supporting line with slope in the range $(x_l, x_h)$. (That would contradict the fact that $f^*$ has a supporting line at $k$ with slope $x$ if and only if $f$ admits a supporting line at $x$ with slope $k$.)

Figure 7: Nonconvex function; its LF transform has a nondifferentiable point.

These three observations have two important consequences (see Figure 7):

1. $f^*(k)$ must have a nondifferentiable point at $k_c$, with $k_c$ equal to the slope of the supporting line connecting the two points $(x_l, f(x_l))$ and $(x_h, f(x_h))$. This follows since $x_l$ and $x_h$ share the same supporting line of slope $k_c$. Thus, in a way, $f^*$ must have two slopes at $k_c$.
2. Define the convex extrapolation of $f(x)$ to be the function obtained by replacing the nonconvex branch of $f(x)$ (branch $c$) by the supporting line connecting the two convex branches of $f$ ($a$ and $b$). Then, both the LF transforms of $f$ and its convex extrapolation yield $f^*$. This is evident from our earlier treatment of nondifferentiable and affine functions. It should also be evident from the example of nondifferentiable functions that the convex extrapolation of $f$ is nothing but $f^{**}$, the double LF transform of $f$. This explains why we call $f^{**}$ the convex envelope of $f$.
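Both consequences can be checked on a double-well example, f(x) = (x^2 - 1)^2, which is an illustrative choice rather than the function of Figure 7. Its two minima at x_l = -1 and x_h = 1 share a supporting line of slope k_c = 0, so the transform should have one-sided slopes close to -1 and +1 at k = 0. Its convex extrapolation replaces f by 0 on [-1, 1], and should have the same transform as f.

```python
def lf_transform(xs, fs, ks):
    """Discrete LF transform: max over the grid of k*x - f(x), for each k."""
    return [max(k * x - fx for x, fx in zip(xs, fs)) for k in ks]


# Illustrative nonconvex function (an assumption): a double well on [-2, 2].
xs = [-2 + 0.001 * i for i in range(4001)]
fs = [(x * x - 1) ** 2 for x in xs]

# Consequence 1: one-sided difference quotients of f* around k_c = 0.
h = 0.01
at_zero, at_plus, at_minus = lf_transform(xs, fs, [0.0, h, -h])
right_slope = (at_plus - at_zero) / h        # should be close to x_h = +1
left_slope = (at_zero - at_minus) / h        # should be close to x_l = -1

# Consequence 2: the convex extrapolation (flat between the wells) has the same transform.
hull = [0.0 if abs(x) <= 1 else fx for x, fx in zip(xs, fs)]
ks = [-5 + 0.5 * j for j in range(21)]
max_difference = max(abs(a - b) for a, b in
                     zip(lf_transform(xs, fs, ks), lf_transform(xs, hull, ks)))

print(left_slope, right_slope, max_difference)
```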

To summarize, note that, as a result of Point 2 above, we have

$$(f^{**})^* = (f)^* = f^*. \tag{26}$$

Also, for the example considered, we have

$$(f^*)^* = f^{**} \neq f. \tag{27}$$

Overall, this means that the LF transform has the following structure:

$$f \to f^* \rightleftarrows f^{**}, \tag{28}$$

where the arrows stand for the LF transform; see Figure 8. This diagram clearly shows that the LF transform is non-involutive in general. For convex functions, i.e., functions admitting supporting lines everywhere, the diagram reduces to

$$f \rightleftarrows f^*. \tag{29}$$

That is, in this case, the LF transform is involutive (see Section 2.4).

Figure 8: Structure of the LF transform for nonconvex functions.

4. Important results to remember

- The LF transform yields only convex functions: $f^* = (f)^*$ is convex and so is $f^{**} = (f^*)^*$.
- The shape of $f^*$ is determined from the shape of $f$ by using the duality relationship which exists between the supporting lines of $f$ and those of $f^*$.
  - Points of $f$ are transformed into slopes of $f^*$, and slopes of $f$ are transformed into points of $f^*$.
  - Nondifferentiable points of $f$ are transformed, through the action of the LF transform, into affine branches of $f^*$.
  - Affine or nonconvex branches of $f$ are transformed into nondifferentiable points of $f^*$. These are the only two cases producing nondifferentiable points.
- The involution (self-inverse) property of the LF transform is determined from the supporting line properties of $f$ or from the differentiability properties of $f^*$.
  - $f = f^{**}$ at $x$ if and only if $f$ admits a supporting line at $x$.
  - If $f^*$ is differentiable at $k$, then $f = f^{**}$ at $x = f^{*\prime}(k)$.
- The double LF transform $f^{**}$ of $f$ corresponds to the convex envelope of $f$.
- The complete structure of the LF transform for general functions goes as follows:

  $$f \to f^* \rightleftarrows f^{**}, \tag{30}$$

  where the arrows denote the LF transform. For convex functions ($f = f^{**}$), this reduces to

  $$f \rightleftarrows f^*; \tag{31}$$

  i.e., in this case, the LF transform is involutive.
- The LF transform is more general than the Legendre transform because it applies to nonconvex functions as well as nondifferentiable functions.
- The LF transform reduces to the Legendre transform in the case of convex, differentiable functions.

Acknowledgments

This report was written while visiting the Rockefeller University, New York, in July
2005. I wish to thank E.G.D. Cohen for his invitation to visit Rockefeller, as well as
for his many comments on the content and presentation of this report, and for his contagious desire to make things clear. My work is supported by the Natural Sciences and
Engineering Research Council of Canada (Post-Doctoral Fellowship).

References

[1] R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, 1970.
