Constrained Optimization & Lagrange Multipliers
The plan for this lesson is to now go deeper into more advanced topics in calculus of
variations. In particular, the topic of this lesson is constrained optimization.
Simply put, constrained optimization is the area of math concerned with finding minima and
maxima of functions (or functionals in our context) that are subject to some additional
constraint. A constraint here means some sort of restriction on what values the function can
take on.
As for this lesson, we will first begin by looking at how constrained optimization works in
ordinary multivariable calculus. This will introduce us to a new tool called Lagrange
multipliers. Then, we'll move on to looking at constrained optimization in variational
calculus, the main topic of today's lesson.
Lesson Contents:
1. Constraints In Multivariable Calculus
2. Constraints In Calculus of Variations
3. Example: The Catenary Problem
1. Constraints In Multivariable Calculus
For example, we could have a function f(x, y) = xy + x² that we want to find the maximum or minimum points of. In addition, we only want values of the function that lie on the unit circle x² + y² = 1.
At surface level, it seems like we have two separate problems here - one is, we want to find
the stationary points of the function and the other is, we want to find the values of the
function that satisfy the equation of the unit circle. How can we combine these problems and
get a solution that is consistent with both?
Well, that is what constrained optimization answers - it gives us exactly the tools to solve such a problem. In this example, it would be for solving the problem of optimizing the function f(x, y) = xy + x² under the constraint x² + y² = 1.
We'll first develop the techniques to solve such a problem and look at an example. After that,
we'll look at the intuition behind why the method works and what it's actually based on at a
geometric level. In particular, the method we're going to use to solve a constrained
optimization problem is known as the Lagrange multiplier method.
To understand the Lagrange multiplier method, we first have to understand how to find
the minima or maxima of a multivariable function without any constraints. As an
example, let's take the function f(x, y) = xy + x², which we want to find the extremal points of.
To do that, all we do is calculate the gradient of f, ∇f, and set it equal to zero. The
intuition behind this is exactly the same as for finding extremal points of a single-variable
function, where we would calculate the derivative of the function and set it equal to zero.
Now, setting the gradient equal to zero, in this case, gives us two equations since the
gradient ∇f is a vector with components ∂f / ∂x and ∂f / ∂y:
∇f = (∂f/∂x)x̂ + (∂f/∂y)ŷ = 0 ⇒ ∂f/∂x = 0 and ∂f/∂y = 0
To find the extremal points of f = xy + x², both of these equations have to be satisfied simultaneously. If we calculate these partial derivatives, we get the following:
∂f/∂x = 0 ⇒ y + 2x = 0

∂f/∂y = 0 ⇒ x = 0
This set of equations has a single solution, the point (x, y) = (0, 0). So, the extremal point of the function f = xy + x² is (0, 0).
The key point in the above example is that to find the extremal points of a multivariable
function without constraints, we set its gradient equal to zero and solve the equation
∇f = 0.
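To make this concrete, here's a minimal sketch of this procedure in Python (assuming the SymPy library is available) for the same function f(x, y) = xy + x² as above:

    import sympy as sp

    x, y = sp.symbols("x y", real=True)
    f = x*y + x**2

    # The gradient of f is the vector of partial derivatives (∂f/∂x, ∂f/∂y).
    grad_f = [sp.diff(f, x), sp.diff(f, y)]     # [y + 2*x, x]

    # Setting the gradient equal to zero and solving gives the extremal points.
    print(sp.solve(grad_f, [x, y], dict=True))  # [{x: 0, y: 0}]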
Now, what happens when we do have a constraint? Well, the key behind this lies in how we
would represent a constraint in the first place - a constraint is always represented by some
relation or equation between the variables we are interested in. Such an equation is usually
called, not surprisingly, a constraint equation.
For example, if we want to optimize some function f(x, y) under the constraint that its
extremal values must lie on the unit circle, then our constraint equation would simply be the
equation defining a unit circle, x² + y² = 1. This means that the possible solutions to our
constrained optimization problem would have to satisfy this equation, as well as also being
extremal points of the function f(x, y).
Now, let's try to develop our theory in a bit more generality. The general form of a
constraint equation - for a problem involving two variables x and y - is usually taken to be
as follows:

g(x, y) = C

All this equation says is that we have some function of our relevant variables set equal to a constant - in other words, an equation restricting the values of the variables in some form. For example, for the constraint equation x² + y² = 1, we have g(x, y) = x² + y² and C = 1.
Okay, so we want to optimize a function f(x, y), such that it obeys the constraint
g(x, y) = C. How could this be done? Well, we can understand the problem geometrically
by noting that both of the functions, f(x, y) and g(x, y), describe surfaces - like multivariable functions generally do - if we place them in a Cartesian coordinate system.
Now, let’s say the value of the function f(x, y) is k - this is the value we want to optimize.
In other words, f(x, y) = k is some curve on the graph of f(x, y). This is generally called a
contour line.
But, our constraint g(x, y) = C is also a contour line, namely the curve where the value of
g(x, y) is equal to C - this is the curve along which the constraint is satisfied.
Now, in order for the value k to be a stationary value of f(x, y) such that the constraint
g(x, y) = C is also satisfied, we must have two things - first, the contour lines f(x, y) = k
and g(x, y) = C must meet at some point in order to both have a simultaneous solution.
Second, for k to specifically be a stationary value of f(x, y), the contour lines f(x, y) = k
and g(x, y) = C must be tangent to one another - they must be going in the same
direction at that point.
The nice thing about these contour lines is that the gradient of a multivariable function
always points perpendicular to its contour lines. Now, at this stationary point, the contour lines are tangential to one another, so the gradient vectors at this point must also point in the same direction. In other words, the gradients at this point are proportional to each other:

∇f ∝ ∇g
An equivalent way of stating this would be by expressing these gradient vectors as multiples
of one another by adding in a proportionality factor, which we’ll call λ. So, the fundamental
condition for a multivariable function f to be stationary under a constraint g = C is:
∇f = λ∇g
The extremal points (x, y) of f(x, y) under the constraint g(x, y) = C are then the
solutions of this equation - or actually, equations since there is one equation for each
component of the gradients.
The proportionality factor λ here is called a Lagrange multiplier. We'll talk about it more
soon, but for now, it's simply another variable in our equations to solve for. So, in addition to
solving for the values of the variables x and y at the extremal points, we also have to solve
for λ to find a unique solution.
You might notice that when solving for the variables x, y and λ, we only have two equations
coming from the gradient condition, ∇f = λ∇g - one equation for each of the components
of the gradient:
∂f/∂x = λ ∂g/∂x and ∂f/∂y = λ ∂g/∂y
However, we have three variables to solve for, so we need one more equation. This additional
equation is simply the constraint equation, g = C. In other words, we have three variables
to solve for from three equations. The solutions (x, y and λ) we get from these are the extremal points of the function we want to optimize, as well as an expression for the Lagrange multiplier that tells us what λ has to be in order to obtain a unique solution.
Just as a short summary before we get to looking at an example, the steps for solving a
constrained optimization problem in multivariable calculus are as follows:
1. Identify the function f(x, y) to be optimized and write the constraint in the form g(x, y) = C.
2. Calculate the gradients ∇f and ∇g.
3. Set the gradients proportional to one another, ∇f = λ∇g. You'll then have a set of three equations:

∂f/∂x = λ ∂g/∂x
∂f/∂y = λ ∂g/∂y
g = C
4. Use these three equations to solve for x, y and λ. The solution (x, y) you get is the
stationary point of the original function f(x, y).
It's worth noting that these steps work exactly the same for a multivariable function of any
number of variables, such as f(x, y, z). In that case, you would just get an extra equation
from the gradient, along with the additional variable z to solve for.
Let's consider again the function f(x, y) = xy + x². Earlier, we found that the extremal point of this function is at (0, 0). However, what if we now want to find the extremal point of this function that lies on the unit circle x² + y² = 1?
Here, we have a constraint described by the constraint equation x² + y² = 1. This has exactly the form of g(x, y) = C with g(x, y) = x² + y² and C = 1. We can then find the gradients of these functions:
the gradients of these functions:
∇f = (∂f/∂x)x̂ + (∂f/∂y)ŷ = (y + 2x)x̂ + x ŷ

∇g = (∂g/∂x)x̂ + (∂g/∂y)ŷ = 2x x̂ + 2y ŷ
The set of equations we want to solve is then obtained from the components of these gradients as:

∇f = λ∇g ⇒ y + 2x = 2λx and x = 2λy
So, we then have the following set of three equations (remember the constraint
equation!).
We need to now solve for x, y and λ from:
y + 2x = 2λx
x = 2λy
x² + y² = 1
If you want, you can go through the algebra of solving these or simply just plug the set
of equations into a calculator. In any case, the solutions you'll get are:
x = -(√2 + 1)√(2 - √2)/2 , y = -√(2 - √2)/2 , λ = (1 + √2)/2

or

x = -(√2 - 1)√(2 + √2)/2 , y = √(2 + √2)/2 , λ = (1 - √2)/2

or

x = (√2 - 1)√(2 + √2)/2 , y = -√(2 + √2)/2 , λ = (1 - √2)/2

or

x = (√2 + 1)√(2 - √2)/2 , y = √(2 - √2)/2 , λ = (1 + √2)/2
These describe all the extremal points of f(x, y) = xy + x² that lie on the unit circle x² + y² = 1, which was our constraint. Moreover, these solutions also tell us what the Lagrange multiplier λ has to be in order for the constraint to be satisfied.
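If you'd rather not grind through the algebra by hand, here's a small sketch (again assuming SymPy) of solving the full system - the two gradient equations plus the constraint - symbolically:

    import sympy as sp

    x, y, lam = sp.symbols("x y lambda", real=True)
    f = x*y + x**2    # the function to optimize
    g = x**2 + y**2   # the constraint function, with C = 1

    # The Lagrange condition ∇f = λ∇g plus the constraint equation g = 1.
    equations = [
        sp.Eq(sp.diff(f, x), lam * sp.diff(g, x)),  # y + 2x = 2λx
        sp.Eq(sp.diff(f, y), lam * sp.diff(g, y)),  # x = 2λy
        sp.Eq(g, 1),                                # x² + y² = 1
    ]

    # This prints the four (x, y, λ) solutions listed above.
    for solution in sp.solve(equations, [x, y, lam], dict=True):
        print(solution)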
When using the Lagrange multiplier method, these λ-variables called Lagrange multipliers
show up. In a practical sense, they are just that - additional variables to solve for in order to
solve the given problem.
However, there is also an intuitive geometric way to view these Lagrange multipliers.
Essentially, the Lagrange multipliers tell us how much the extremal values of a given function
change as the constraint is changed by a small amount.
The way to see this is to consider again the constraint condition, ∇f = λ∇g. We can see
from this that the Lagrange multiplier λ tells us "how much" the gradients are pointing in
the same direction - it's the proportionality factor between the gradient of the function to
be optimized and the gradient of the constraint function.
• If the Lagrange multiplier is large, changing the constraint by just a small amount will
cause a big change in the maximum or minimum of the function that is to be
optimized.
• If the Lagrange multiplier is small, changing the constraint by just a small amount will
instead cause only a small change in the maximum or minimum of the function that is
to be optimized.
• If the Lagrange multiplier is zero, we can interpret this as saying that a change in the
constraint does not affect the minima or maxima of the function to be optimized. That
is, the extrema of the function are independent of the constraint.
The easiest way to understand why the above is true is to consider an optimization problem
in only one dimension, say for functions f(x) and g(x). In this case, the condition ∇f = λ∇g
becomes just:
df/dx = λ dg/dx
Now, we'll do something that seems a bit out of place but will give us something interesting.
We're going to think of f as a function f(g(x)) - in other words, it is a function of the
constraint, which itself depends on the variable x. This makes sense, because the value of f
does indeed depend on the constraint if we make the requirement that the constraint also
must be satisfied.
We can then consider the chain rule for taking the derivative d / dx of this function f, which
would be of the form:
df/dx = (df/dg)(dg/dx)
With the notation used previously, we had the extremal value of a function f denoted as k
and the value of the constraint function g denoted as C.
Using this notation, we would write the equation above as:
df/dx = (dk/dC)(dg/dx)
Now, compare this to the constraint condition, df/dx = λ dg/dx, from above. Clearly, if
these two are to describe the same solutions - which they should - we must have:
df/dx = λ dg/dx ⇔ df/dx = (dk/dC)(dg/dx)

⇒ λ = dk/dC
What does this mean? Well, this directly describes "how much the maximum value of a
function, k, changes as we change the value of the constraint, C". That's the way to interpret
the Lagrange multipliers.
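Here's a quick numerical sanity check of this interpretation for our earlier example (a sketch assuming NumPy and SciPy are available): we maximize f(x, y) = xy + x² on the circle x² + y² = C, nudge C slightly, and compare the resulting change in the maximum value k to the Lagrange multiplier λ = (1 + √2)/2 ≈ 1.207 found above:

    import numpy as np
    from scipy.optimize import minimize

    def k(C):
        # Maximum of f(x, y) = xy + x² subject to x² + y² = C,
        # found by minimizing -f from a starting point near the maximum.
        constraint = {"type": "eq", "fun": lambda v: v[0]**2 + v[1]**2 - C}
        result = minimize(lambda v: -(v[0]*v[1] + v[0]**2),
                          x0=[np.sqrt(C), 0.1], constraints=[constraint])
        return -result.fun

    # Finite-difference estimate of dk/dC around C = 1.
    dC = 1e-3
    print((k(1 + dC) - k(1 - dC)) / (2 * dC))  # ≈ 1.207
    print((1 + np.sqrt(2)) / 2)                # λ ≈ 1.2071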
In practice, however, when doing any type of constrained optimization problem with
Lagrange multipliers, it's enough to treat the Lagrange multipliers λ as just some constants
that need to be solved for.
Lagrange multipliers are also quite important in Lagrangian mechanics, where we use
variational calculus to find equations of motion for various physical systems. There,
Lagrange multipliers can be used to constrain the dynamics of a system and make the
system behave in a particular way.
The physical interpretation of the Lagrange multipliers would then be, essentially, "how
much the solutions to the equations of motion change when changing the constraints of
the system by a small amount". This corresponds to what so-called constraint forces do
- so, the Lagrange multipliers represent forces of constraint in Lagrangian mechanics.
2. Constraints In Calculus of Variations
We're now ready to get to the main topic of this section - constraints in variational
calculus.
This section will be kept a bit on the lighter side in terms of derivations and rigorous math.
The main thing we'll focus on is the intuition behind constraints and how they can be
applied in calculus of variations problems.
Now, let's begin by drawing some analogies between optimization in multivariable calculus
and in calculus of variations, based on what we discussed previously.
In multivariable calculus, when we simply want to find the extremal points of a function
f(x, y) without any constraints, we set the gradient of f equal to zero and solve the
following equation for the values of x and y:
∇f = 0
In calculus of variations, on the other hand, when we want to find the extremal "points" of a functional F = ∫_{x₁}^{x₂} f(x, y, y') dx, we set its functional derivative equal to zero:

𝛿F/𝛿y = 0
This then leads us to the Euler-Lagrange equation, which we solve to get the curve y(x) that extremizes the functional F:

𝛿F/𝛿y = 0 ⇒ d/dx(∂f/∂y') - ∂f/∂y = 0
Following the analogy, is there something similar for constraints in calculus of variations?
Well, the first thing to note is that in calculus of variations, constraints are also expressed as
constraint equations but in terms of functionals. Above, we have a general constraint
function g(x, y) and similarly, in calculus of variations, we have a constraint functional of the general form:

G = ∫_{a}^{b} g(x, y, y') dx
This is simply a functional that describes a particular constraint imposed on the problem
when we set it equal to a constant, just like we did earlier. Let's call this constant C. A
constraint equation in calculus of variations would then have the form G = C or:
∫_{a}^{b} g(x, y, y') dx = C
Now, when we introduced constraints in the multivariable case, the right-hand side of our
"optimization equation", ∇f = 0, became non-zero - we set the gradients of the original
function and the constraint function proportional to one another using a Lagrange multiplier
λ and got the equation ∇f = λ∇g. Exactly the same thing happens in calculus of variations - we set the functional derivatives of F and G proportional to one another with a Lagrange multiplier λ:

𝛿F/𝛿y = λ 𝛿G/𝛿y
This is the condition for the functional F to be stationary under the constraint described by
G = C. From this condition, we can solve for the desired function y(x).
2.1. The Euler-Lagrange Equation With Constraints
Following up with the above condition, we'll now use the formula for the functional
derivative we derived in the lesson Functional Derivatives & Variations. Since both of our
functionals, F and G, are of the correct form to apply this formula to, we have the following:
𝛿F/𝛿y = ∂f/∂y - d/dx(∂f/∂y')

𝛿G/𝛿y = ∂g/∂y - d/dx(∂g/∂y')

𝛿F/𝛿y = λ 𝛿G/𝛿y ⇒ ∂f/∂y - d/dx(∂f/∂y') = λ(∂g/∂y - d/dx(∂g/∂y'))
By convention, we multiply both sides by -1 and "absorb" the minus sign into the λ on the
right-hand side - λ is just an arbitrary constant at this point, anyway. We then get the
following:
d/dx(∂f/∂y') - ∂f/∂y = λ(∂g/∂y - d/dx(∂g/∂y'))
We now have an equation we can solve for the function y(x) that makes the original functional F stationary. Being able to do this relies on writing both functionals in the form of definite integrals and identifying the integrands f and g. If we are able to do this, we can then apply the constrained Euler-Lagrange equation.
Using this equation works pretty much the same as previously - we calculate the relevant
partial derivatives and then solve the resulting differential equation for y(x). The only
difference here is that we also need to solve for the Lagrange multiplier λ.
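If you want to check an equation like this symbolically, one option is a small SymPy sketch. It uses the standard observation that the constrained Euler-Lagrange equation above is equivalent to the ordinary Euler-Lagrange equation applied to the combination f - λg (since λ is just a constant). The integrands below are the catenary ones from later in this lesson, with the constant 𝜌g set to 1 for brevity:

    import sympy as sp
    from sympy.calculus.euler import euler_equations

    x, lam = sp.symbols("x lambda")
    y = sp.Function("y")

    # Example integrands (from the catenary problem, with ρg = 1):
    f = y(x) * sp.sqrt(1 + y(x).diff(x)**2)
    g = sp.sqrt(1 + y(x).diff(x)**2)

    # The Euler-Lagrange equation of f - λg is exactly the constrained
    # Euler-Lagrange equation, rearranged, since λ is a constant.
    equation = euler_equations(f - lam * g, [y(x)], [x])[0]
    print(sp.simplify(equation))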
2.2. The Beltrami Identity With Constraints
Before we get to practical examples, it's also worth mentioning that there exists a version of
the Beltrami identity with constraints as well. You'll find the full derivation below, but this is what it looks like:
y'(∂f/∂y') - f = C - λ(y'(∂g/∂y') - g)
The Beltrami identity with constraints applies only when both the conditions ∂f / ∂x = 0 and
∂g / ∂x = 0 are true.
The derivation of the Beltrami identity with constraints is quite similar to the case with no
constraints - the only difference is that we start with the Euler-Lagrange equation with
constraints, as might be expected:
d/dx(∂f/∂y') - ∂f/∂y = λ(∂g/∂y - d/dx(∂g/∂y'))
What we do next is pretty much just following the same steps we did when deriving the
Beltrami identity in the lesson The Euler-Lagrange Equation & Beltrami Identity - first,
we'll multiply both sides by y':
y' d/dx(∂f/∂y') - y' ∂f/∂y = λ(y' ∂g/∂y - y' d/dx(∂g/∂y'))
Now, consider the following expressions obtained from calculating the total derivatives
df / dx and dg / dx using the chain rule (these are exactly the same expressions we had
in the derivation of the Beltrami identity with no constraints):
y' ∂f/∂y = df/dx - ∂f/∂x - y'' ∂f/∂y'

y' ∂g/∂y = dg/dx - ∂g/∂x - y'' ∂g/∂y'

Substituting these into the equation above gives:

y' d/dx(∂f/∂y') - df/dx + ∂f/∂x + y'' ∂f/∂y' = λ dg/dx - λ ∂g/∂x - λ y'' ∂g/∂y' - λ y' d/dx(∂g/∂y')

⇒ y' d/dx(∂f/∂y') + y'' ∂f/∂y' - df/dx + ∂f/∂x = λ dg/dx - λ ∂g/∂x - λ(y' d/dx(∂g/∂y') + y'' ∂g/∂y')

By the product rule, the grouped terms on each side are total derivatives:

d/dx(y' ∂f/∂y') = y' d/dx(∂f/∂y') + y'' ∂f/∂y'

d/dx(y' ∂g/∂y') = y' d/dx(∂g/∂y') + y'' ∂g/∂y'

Using these, the equation becomes:

d/dx(y' ∂f/∂y') - df/dx + ∂f/∂x = λ dg/dx - λ ∂g/∂x - λ d/dx(y' ∂g/∂y')
Now, we haven't yet used the condition for the Beltrami identity - namely that
∂f / ∂x = ∂g / ∂x = 0. Applying these now, we get:
d/dx(y' ∂f/∂y') - df/dx = λ dg/dx - λ d/dx(y' ∂g/∂y')
We now have a total derivative d / dx on each of these terms, so we can integrate both
sides to get the final result (remember - λ here is just some constant, so it can be pulled
outside of the integral):
∫ d/dx(y' ∂f/∂y') dx - ∫ (df/dx) dx = λ ∫ (dg/dx) dx - λ ∫ d/dx(y' ∂g/∂y') dx

⇒ y'(∂f/∂y') - f = λg - λy'(∂g/∂y') + C

⇒ y'(∂f/∂y') - f = C - λ(y'(∂g/∂y') - g)
The C here is again an integration constant that will be specific to any given problem.
As a summary, the steps for solving a constrained optimization problem in calculus of variations are as follows:

1. Define the functional F = ∫_{x₁}^{x₂} f(x, y, y') dx that is to be optimized, as well as the constraint equation ∫_{x₁}^{x₂} g(x, y, y') dx = C describing the problem.
2. Identify the integrand function f(x, y, y') and the constraint function g(x, y, y').
3. Insert these into the Euler-Lagrange equation with constraints (or, if both f and g are independent of x, into the Beltrami identity with constraints).
4. Solve the resulting differential equation for the function y(x) and for the Lagrange multiplier λ. Note that you'll need (at least) two equations to do this, which you get either from the boundary conditions or from the constraint equation ∫_{x₁}^{x₂} g(x, y, y') dx = C.
With all this being said, we can now begin looking at some examples!
3. Example: The Catenary Problem
Next, we'll look at a pretty famous constrained optimization problem in calculus of variations
that uses exactly the stuff discussed throughout this lesson. This is called the catenary
problem.
Essentially, the catenary problem consists of a uniform chain or rope that is suspended to
hang between two points. We'll take these two points to be at the same height h here and
the horizontal separation between the points as d. The goal is then to find the exact shape of this hanging chain (and the function describing it) when gravity is pulling down on it with
gravitational acceleration g.
We can place the hanging chain in an x, y -coordinate system with the ends of the chain at
the points (-d / 2, h) and (d / 2, h). The shape of the chain itself then makes some curve in
the x, y -plane that we can treat as a function y(x):
Now, what determines the shape of this chain when it's hanging? The answer is gravity.
Naturally, when an object is pulled down by gravity, the object will minimize its gravitational
potential energy. This is, for example, why objects naturally fall down to the ground due to
gravity.
Therefore, the chain will take on a shape that specifically minimizes its gravitational potential
energy. To find this shape, we have to first find the potential energy of the chain.
The total gravitational potential energy of the chain is given by V = mgh, where m is the
total mass of the chain and h is its height. But what height? It doesn't really make sense to
talk about a specific height for the entire chain, since each part of it is hanging at a different
height. Therefore, each part (or point) of the chain also has a different potential energy.
The way to go about this is by dividing the chain into infinitesimal parts, each with mass dm
and potential energy dV. Now, any given "small part" (point) of the chain is at a height given
exactly by the value of the curve y. So, each small part of the chain has potential energy
dV = gydm.
Since we have a uniform chain, its mass density is the same at all points. The mass density 𝜌
of any given "small piece" of the chain is defined as the mass of that small piece, dm, divided
by the length of the piece, ds. We can solve for the mass from this:
𝜌 = dm/ds ⇒ dm = 𝜌 ds
Now, this "piece of length", ds, is just a small distance along the curve y(x), which is given by
the Pythagorean theorem. We've seen this quantity a bunch of times before in the previous
lessons:
ds = √(dx² + dy²) = √(1 + (dy/dx)²) dx = √(1 + y'²) dx
So, we have that dm = 𝜌√(1 + y'²) dx. We can insert this to then get the potential energy dV as:

dV = gy dm = 𝜌gy√(1 + y'²) dx
The total potential energy of the whole chain is then obtained by integrating all these dV's
between the end points of the chain:
V = ∫ dV = ∫_{-d/2}^{d/2} 𝜌gy√(1 + y'²) dx
This potential energy should be minimized. Now, this clearly has the form of a functional -
therefore, we need to use the tools of variational calculus to minimize V.
However, we also have a constraint in this problem - the total length of the chain should
remain constant no matter the shape of the chain, since we're assuming the chain to not
stretch here. We'll call the total length of the chain L (it is a constant); it is given by integrating all the "little distances" ds along the chain:

∫_{-d/2}^{d/2} √(1 + y'²) dx = L

This is exactly a constraint equation of the form G = C. Comparing the potential energy functional V and this constraint equation with our general forms, we can identify the integrand functions f and g as:

f = 𝜌gy√(1 + y'²)

g = √(1 + y'²)
Let's now get to deriving the differential equation for the chain and solving it. Since the above functions are both independent of x, we can apply the Beltrami identity with constraints here:

y'(∂f/∂y') - f = C - λ(y'(∂g/∂y') - g)

⇒ 𝜌gy·y'²/√(1 + y'²) - 𝜌gy√(1 + y'²) = C - λ(y'²/√(1 + y'²) - √(1 + y'²))
Both sides simplify nicely here - the left-hand side reduces to -𝜌gy/√(1 + y'²) and the parenthesis on the right to -1/√(1 + y'²), so multiplying through by √(1 + y'²) gives -𝜌gy - λ = C√(1 + y'²). If we square both sides and solve for y' = dy/dx, we get:

(-𝜌gy - λ)² = C²(1 + y'²)

⇒ (1/C²)(𝜌gy + λ)² = 1 + y'²

⇒ y' = √((1/C²)(𝜌gy + λ)² - 1)

⇒ dy/dx = √(((𝜌gy + λ)/C)² - 1)

This is a separable differential equation, which we can write as:

∫ dy/√(((𝜌gy + λ)/C)² - 1) = ∫ dx
We can do a substitution here of the form cosh u = (𝜌gy + λ)/C, from which we can find dy as:

cosh u = (𝜌gy + λ)/C

⇒ y = (C cosh u - λ)/𝜌g

⇒ dy = (dy/du) du = d/du[(C cosh u - λ)/𝜌g] du = (C/𝜌g) sinh u du
Here, I've used the fact that the derivative of cosh u is sinh u. With these, the integral
now turns into:
∫ dy/√(((𝜌gy + λ)/C)² - 1) ⇒ ∫ (C/𝜌g) sinh u du/√(cosh²u - 1)
We can use the identity cosh²u - 1 = sinh²u here to get:

∫ (C/𝜌g) sinh u du/√(sinh²u) = ∫ (C/𝜌g) du = (C/𝜌g)u
Substituting back in cosh u = (𝜌gy + λ)/C ⇒ u = arccosh((𝜌gy + λ)/C), we then get the solution to our original integral as:

∫ dy/√(((𝜌gy + λ)/C)² - 1) = (C/𝜌g) arccosh((𝜌gy + λ)/C)
Our separated equation from earlier then becomes:

∫ dy/√(((𝜌gy + λ)/C)² - 1) = ∫ dx ⇒ (C/𝜌g) arccosh((𝜌gy + λ)/C) = x + 𝜑
Here, 𝜑 is some integration constant that we'll determine soon. Now, solving this for
y(x), we finally get:
y(x) = (C/𝜌g) cosh(𝜌g(x + 𝜑)/C) - λ/𝜌g
The above is nearly our final solution - however, we still need to solve for λ.
We can do this by first setting the constant 𝜑 = 0. Why? Well, 𝜑 just corresponds to a shift
of the entire curve along the x-axis. Setting 𝜑 = 0 then corresponds to the curve being
centered at the origin, which is what we have in this problem. Thus, we have:
y(x) = (C/𝜌g) cosh(𝜌gx/C) - λ/𝜌g
We can now find λ from the boundary conditions for our problem. Namely, requiring that
y(d / 2) = h gives us:
y(d/2) = h

⇒ (C/𝜌g) cosh(𝜌gd/(2C)) - λ/𝜌g = h

⇒ λ/𝜌g = (C/𝜌g) cosh(𝜌gd/(2C)) - h
This expression determines what the Lagrange multiplier λ has to be in order for the
constraint - the length of the chain being constant - to be satisfied. So, our solution is then:
y(x) = (C/𝜌g) cosh(𝜌gx/C) - λ/𝜌g

⇒ y(x) = (C/𝜌g) cosh(𝜌gx/C) - ((C/𝜌g) cosh(𝜌gd/(2C)) - h)

⇒ y(x) = (C/𝜌g)(cosh(𝜌gx/C) - cosh(𝜌gd/(2C))) + h
To simplify this a bit, we could define a new constant 𝛽 = C / 𝜌g, so that this becomes:
y(x) = 𝛽(cosh(x/𝛽) - cosh(d/(2𝛽))) + h
This is our final solution - the curve y(x), which minimizes the potential energy of the hanging chain and thus describes the natural shape of the chain under gravity.
Now, this still has the unknown constant 𝛽 (or C), which, in principle, could be solved for from the constraint equation:
∫_{-d/2}^{d/2} √(1 + y'²) dx = L ⇒ ∫_{-d/2}^{d/2} √(1 + sinh²(x/𝛽)) dx = L
We can use here the fact that 1 + sinh²(x/𝛽) = cosh²(x/𝛽) to calculate this integral:

∫_{-d/2}^{d/2} √(cosh²(x/𝛽)) dx = ∫_{-d/2}^{d/2} cosh(x/𝛽) dx = L ⇒ 2𝛽 sinh(d/(2𝛽)) = L
The problem here is that this equation cannot be solved analytically for 𝛽 - numerically, it certainly can be. So, we're left with the solution in the form:

y(x) = 𝛽(cosh(x/𝛽) - cosh(d/(2𝛽))) + h, where 𝛽 is determined by 2𝛽 sinh(d/(2𝛽)) = L.
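As a concrete illustration, here's a small numerical sketch (assuming NumPy and SciPy, with made-up values for d, h and L) that finds 𝛽 with a root finder and then verifies the boundary conditions and the length constraint:

    import numpy as np
    from scipy.optimize import brentq
    from scipy.integrate import quad

    d, h, L = 2.0, 1.0, 3.0  # hypothetical span, end height and chain length (L > d)

    # Solve the length condition 2β sinh(d/(2β)) = L for β numerically.
    beta = brentq(lambda b: 2 * b * np.sinh(d / (2 * b)) - L, 0.1, 100.0)

    def y(x):
        # The catenary solution derived above.
        return beta * (np.cosh(x / beta) - np.cosh(d / (2 * beta))) + h

    print(y(-d / 2), y(d / 2))  # both ends should sit at height h = 1.0

    # The arc length ∫√(1 + y'²)dx with y' = sinh(x/β) should equal L = 3.0.
    length, _ = quad(lambda x: np.sqrt(1 + np.sinh(x / beta)**2), -d / 2, d / 2)
    print(length)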
The more important question here is what the shape of the hanging chain actually looks like. Well, it is described by the curve y(x), which is a hyperbolic cosine function connecting the points (-d/2, h) and (d/2, h), so something like this:
In general, this kind of hyperbolic cosine curve is called a catenary. Catenaries are an incredibly important shape in, for example, engineering - this is because pretty much any non-rigid structure that is connected at two ends will naturally take on the shape of a catenary, just like we've seen here (as long as there are no forces other than gravity).