
Notes for Advanced Topics in Mathematical Engineering

Nicolás A Barnafi
October 22, 2024

Context
These notes exist as backup material for a course on some deeper topics in Mathematical Engineering at Pontificia Universidad Católica de Chile, during the second semester of 2024. The idea is to provide mathematical tools that give students the ability to assess the difficulty of mathematical problems, mainly within the world of Partial Differential Equations (PDEs). The target is ultimately to implement these models, so all tools are oriented towards having solid foundations that allow one to trust a computational model. Informally speaking, the main mathematical concepts to haunt us throughout these notes are:

• Existence and uniqueness: It is a natural baseline in the mathematician’s world to try to solve only problems that have a solution. Otherwise, things might be as pointless as developing an iterative method for finding real numbers such that x² = −1. Uniqueness is a further luxury, but sometimes two different methods give two different solutions, and having only those at hand can make it difficult to distinguish whether that is a bug or a feature of the model. There exist some root-isolation methods that allow one to find solutions of a problem that are different from a given one. This is out of the scope of this course.
• Stability: The intuitive idea behind this is that small perturbations in the data give rise to small changes in the solution.
This typically looks like
∥u∥_X ≤ ∥f∥_{X′},
where u is the solution of a problem that depends on f , and X is some functional (hopefully Hilbert) space. More
rigorously, this means that the solution map f 7→ u( f ) is bounded, or continuous in the linear case. Stability also
sometimes refers to time dynamics and the fact that a discrete solution stays within a certain distance of the real solution
throughout a simulation. In the continuous setting, it might also mean that there are no finite-time singularities. In
general, stability is not a well defined term, but still a widely understood one to anyone who has struggled to get a
code to run correctly, and a highly desired property.
All other properties (or at least most of them anyway) are ways to guarantee that a problem enjoys one of these nice
properties. There are ways to handle problems that do not have those properties, but they are almost always extremely
problem dependent, and the person studying such problems should dive deep into the sectorial knowledge to see how
certain communities deal with such issues. This is an aspect that mathematically oriented people almost always disregard,
which has some severe mathematical (and social) consequences. In fact, some extremely classical models in engineering are
still far from understood mathematically, such as the Navier-Stokes equations. This has not prevented the CFD community
from solving these models with extreme efficiency, and from further leveraging them for industrial applications which,
unsurprisingly, work fantastically. Discovering the amazing ways in which mathematically oblivious communities solve
mathematically hard problems is, and will probably be for very long, a beautiful opportunity for collaboration.

1 Analysis preliminaries
In this section we will review some important properties of functional spaces and operators. These things should be deemed
as ’review’ material. Intrinsically new things will start appearing in Section 2. Most, if not all, results will be coming from
the amazing book Linear and nonlinear functional analysis by PG Ciarlet.

1.1 Functional spaces
Banach and Hilbert spaces Throughout the entire manuscript, we will rely on Banach spaces, Hilbert spaces, and their
duals. Despite the existence of a flexible theory of Banach space formulations, we will mostly rely on Hilbert spaces because
of their many nice properties. For now, let’s simply review some relevant properties:
• Banach spaces are complete normed vector spaces. For a given Banach space X, its (topological) dual is the space X′ of continuous linear functionals X → R. The action of an element in the dual space is sometimes denoted as ⟨T, x⟩_{X′×X}, so as to resemble the notation of an inner product. One can identify a part of the bidual space X′′ through the evaluation operator T_f : X′ → R, defined as T_f(L) = L(f). This embedding is in general not surjective.
• Continuous linear operators acting on Banach spaces have an induced norm: if T : X → Y, then
∥T∥ = sup_{x ∈ X, x ≠ 0} ∥Tx∥_Y / ∥x∥_X.
Some people write this space as L(X, Y).
• Hilbert spaces are Banach spaces with respect to the distance induced by a dot (inner) product, i.e. a bilinear form
⟨·, ·⟩ : X × X 7→ R such that:
– It is symmetric: ⟨ x, y⟩ = ⟨y, x ⟩
– It is linear in its first argument: ⟨αx1 + βx2 , y⟩ = α⟨ x1 , y⟩ + β⟨ x2 , y⟩
– It is positive definite: ⟨ x, x ⟩ ≥ 0, where it is 0 only if x = 0.
• The inner product yields the fantastic Riesz map, which is actually an isometry. This is given as follows: Consider a
Hilbert space H with inner product ⟨·, ·⟩ H , then a Riesz map is an operator R H : H 7→ H ′ such that for any x, y in H it
holds that ⟨ R H ( x ), y⟩ H ′ × H = ⟨ x, y⟩ H . Notably, ∥ R H ( x )∥ H ′ = ∥ x ∥ H .
• Inner products are mostly used to define projections. This means that, in the same way that we can orthogonalize a vector x with respect to y, we can also do this in the Hilbert space setting analogously as
x⊥ := x − (⟨x, y⟩_H / ⟨y, y⟩_H) y.
It can be quickly verified that x⊥ is indeed perpendicular to y in the sense that ⟨x⊥, y⟩_H = 0.
One fundamental aspect of Hilbert spaces is that they provide some intuitive properties related to projections, which we
recall through the following results:
Theorem 1 (Best approximation). Set U ⊂ H a closed subspace of a Hilbert space H and set f in H. Then there exists a unique g in U such that
∥f − g∥_H = inf_{u ∈ U} ∥f − u∥_H.
Using this, we can uniquely define the orthogonal complement of a set U:
U ⊥ := {v ∈ H : (v, u) H = 0 ∀ u ∈ U }.
Theorem 2. Set U a closed subspace of a Hilbert space H. Then, if f is in H, there exists a unique pair (u, v) in U × U ⊥ such that
f = u + v.
Some common and/or simple examples: Helmholtz decomposition, zero average functions, zero trace tensors, symmet-
ric tensors. Note that the orthogonal complement is defined with respect to a given inner product.
Example: Assume we want to orthogonalize with respect to U = R. Then, we have that there is a constant c such
that for f in a Hilbert space H we can write
f = h + c,
with h ⊥ U. Noting that a function x satisfies x ⊥ U iff ( x, 1) H = 0, then we can use the previous expression to
obtain
(f, 1)_H = (c, 1)_H = c (1, 1)_H,
which gives
c = (f, 1)_H / (1, 1)_H.
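A minimal numerical sketch of this projection (not part of the original notes): take H = L²(0, 1), f(x) = x², and approximate the inner products with the trapezoidal rule; these choices are illustrative.

import numpy as np

x = np.linspace(0.0, 1.0, 1001)
f = x**2
one = np.ones_like(x)

def inner(u, v):
    # (u, v)_H for H = L^2(0, 1), approximated with the trapezoidal rule
    w = u * v
    return np.sum((w[1:] + w[:-1]) * np.diff(x)) / 2.0

c = inner(f, one) / inner(one, one)   # c = (f, 1)_H / (1, 1)_H
h = f - c                             # orthogonal part, so f = h + c

print(c)                              # ~ 1/3, the mean value of x^2 on (0, 1)
print(inner(h, one))                  # ~ 0, i.e. h is orthogonal to the constants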

The most important spaces for us will be the Lebesgue spaces L^p(Ω; R^d), given by measurable functions f : Ω → R^d such that
∫_Ω |f|^p_{R^d} dx < ∞.

It will be important to know that if |Ω| < ∞, then these spaces form an ordered inclusion:

L∞ (Ω) ⊂ L p (Ω) ⊂ ... ⊂ L1 (Ω).

A simple way to remember this is to split a function as f = I_{|f|≤1} f + I_{|f|≥1} f and note that |x|^p ≤ |x|^{p+ϵ} for ϵ > 0 whenever |x| ≥ 1, while the part with |f| ≤ 1 is integrable over any set of finite measure.

Distributions and derivatives To formulate differential equations in Banach/Hilbert spaces, it will be important to be
able to define derivatives in such spaces. This is done through the language of distributions, invented (discovered) by L
Schwartz. For this, we require the notion of ’test functions’, i.e. functions on which we can discharge derivatives of abstract
objects through integration by parts. Consider then a function f in C0∞ (Rd ), the space of infinitely differentiable scalar
functions with compact support in Rd , then a distribution is simply an element T in the dual space (C0∞ (Rd ))′ , whose action
can be written as ⟨T, f⟩_{(C_0^∞)′ × C_0^∞}, or sometimes simply as ⟨T, f⟩ if it is clear from context. The idea is to generalize the notion
of action through integration, so that for sufficiently smooth functions f , their induced distribution is T f given by
⟨T_f, g⟩ = ∫_{R^d} f g dx.

This integral approach, when thinking about integration by parts formulas, allows us to define distribution derivatives as

⟨∂i T, f ⟩ := −⟨ T, ∂i f ⟩,

as given by integration by parts. This is known as a weak derivative. Arbitrary order differential operators can be defined
analogously, most importantly ∇, div, curl, given by

⟨div T, f ⟩ := −⟨ T, ∇ f ⟩
⟨curl T, f ⟩ := ⟨ T, curl f ⟩.

Example: All classical (or strong) derivatives coincide with the weak derivatives, as seen from the integration by
parts formulas. Also, consider the Dirac delta distribution given by

⟨δx , f ⟩ = f ( x ),
sometimes written as δ_x(f), or also simply as ∫_Ω δ_x f dx (with a not too mild abuse of notation). Then, its derivative is
given by
⟨δ_x′, f⟩ = −⟨δ_x, f′⟩ = −f′(x).

Naturally, weak derivatives and the classical ones coincide under differentiability assumptions. The notion of weak derivatives allows us to define Sobolev spaces, given by

W 1,p (Ω) := { f ∈ L p (Ω) : ∇ f ∈ L p (Ω)}.

These are Banach spaces with norm


∥ x ∥W 1,p (Ω) := ∥ x ∥ L p (Ω) + ∥ ∇ x ∥ L p (Ω) .

This is the graph norm of the ∇ operator, and it can be further seen as the ℓ¹ norm of the two-dimensional vector (∥x∥_{L^p(Ω)}, ∥∇x∥_{L^p(Ω)}), and thus all vector norms for such a vector induce equivalent norms for Sobolev spaces. It is very common to use the fol-
lowing notations:
• H 1 (Ω) = W 1,2 (Ω).

• ∥ x ∥ L2 (Ω) = ∥ x ∥0,Ω , or even simply ∥ x ∥0 , depending on the laziness of the person writing.

• ∥ x ∥ H 1 (Ω) = ∥ x ∥1,Ω .
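As a small numerical illustration of the graph norm defined above (the choices here are mine: u(x) = sin(πx) on (0, 1), finite differences and trapezoidal quadrature, not part of the notes), the ℓ¹ and ℓ² combinations of the two components give comparable values, as expected from norm equivalence.

import numpy as np

x = np.linspace(0.0, 1.0, 2001)
u = np.sin(np.pi * x)
du = np.gradient(u, x)                      # approximate derivative

def trapz(y):
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

l2_u  = np.sqrt(trapz(u**2))                # ~ 1/sqrt(2)
l2_du = np.sqrt(trapz(du**2))               # ~ pi/sqrt(2)

graph_norm_l1 = l2_u + l2_du                # the norm written above
graph_norm_l2 = np.sqrt(l2_u**2 + l2_du**2) # an equivalent choice
print(graph_norm_l1, graph_norm_l2)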

The space H 1 (Ω) is very important, as it is a Hilbert space with inner product

⟨ x, y⟩ H1 (Ω) := ⟨ x, y⟩ L2 (Ω) + ⟨∇ x, ∇ y⟩ L2 (Ω) .

Analogously, we can define the spaces


H(div; Ω) = { f ∈ [L²(Ω)]^d : div f ∈ L²(Ω) }
and
H(curl; Ω) = { f ∈ [L²(Ω)]³ : curl f ∈ [L²(Ω)]³ }.

Their application depends on the context, so we only keep here their definition. They are also Hilbert spaces, with the
inner product defined as the one in H 1 but with the corresponding differential operators. Note that H 1 functions belong to
both H (div) and H (curl), but inclusions among them are not clear. We conclude this section with the celebrated Sobolev
embedding results:
Theorem 3 (Sobolev embeddings (continuous)). Consider Ω a bounded Lipschitz domain in Rd , set m, j two non-negative integers,
and p in [1, ∞]. Then the following embeddings hold:
1. If mp < d, then for p ≤ q ≤ dp/(d − mp):
W j+m,p (Ω) ,→ W j,q (Ω).

2. If mp = d, then for p ≤ q < ∞:


W j+m,p (Ω) ,→ W j,q (Ω).

3. If mp > d ≥ (m − 1) p, then
W j+m,p (Ω) ,→ C j (Ω̄).

Theorem 4 (Sobolev embeddings (compact)). Consider Ω a bounded Lipschitz domain in Rd , m ≥ 1 integer, j ≥ 0 integer and let
p in [1, ∞). Then the following embeddings are compact:
1. If mp ≤ d, then for q in [1, dp/(d − mp)):
W j+m,p (Ω) ,→ W j,q (Ω).

2. If mp > d, then
W j+m,p (Ω) ,→ C j (Ω̄).

These inclusions hold also if the arrival domain is an arbitrary subdomain of Ω.


These theorems hold in greater generality, and are typically used together with the weak compactness of the unit ball to show the existence of certain strongly convergent subsequences. We will see examples of this further ahead. One fundamental consequence is that H¹ is compactly embedded in L².

1.2 Traces
Traces or trace operators are the ones that restrict a function defined on a domain in R^d to some (d−1)-dimensional set, most commonly the boundary of the domain. They are fundamental to adequately define boundary conditions. For the presentation of this section, we follow
[Gat14] and [Mon03]. Some details about Sobolev spaces are drawn from [AF03]. The fundamental difficulty of defining
trace operators is that the domain where the boundary condition is defined has measure 0 in the measure of the starting
domain, so some regularity of the function is required to guarantee that this operation makes sense. We will not enter the
details of how a Lipschitz boundary is defined, see [Mon03] for further details.
There are several definitions and constructions here that are needed for everything to make sense. We will follow them in
a reasonable order, but this might be a very personal vision, so please read other formulations to have a more well-rounded
vision. We will denote with C0∞ ( X ) the space of functions with compact support in X, and also with D(Ω̄) the functions
in C0∞ (Rd ) with support in an open set U such that Ω̄ ⊂ U. This belongs to a wider set known as the Schwartz class of
functions. If the set X is open, we may denote D( X ) as C0∞ ( X ) with a bit of an abuse of notation.

• Classic densities: D(Ω̄) is dense in L p (Ω) if Ω is bounded and Lipschitz.

• C ∞ (Ω̄) is dense in W s,p (Ω) for s a positive integer and p ∈ [1, ∞).
• For s a positive integer and p ∈ (1, ∞), we have that there is a continuous linear extension Π : W s,p (Ω) → W s,p (Rd )
such that Πu|Ω = u for all u in W s,p (Ω).

Some technicalities arise when Ω is unbounded. For the sake of this course, all domains are bounded and Lipschitz
unless otherwise stated. The following theorem allows us to extend the trace operator, which we initially define as γ0 :
D(Ω̄) → C ∞ (∂Ω), given by
γ0 u = u|∂Ω .
An important property that we will use many times is the trace inequality, which states that there exists C positive such that

∥γ0 f ∥0,∂Ω ≤ C ∥ f ∥1,Ω ∀ f ∈ D(Ω̄).

Theorem 5 (Trace theorem). Set Ω a bounded and Lipschitz domain. Then, considering 1/p < s ≤ 1, there exists a continuous extension of γ0 given by γ0 : W^{s,p}(Ω) → W^{s−1/p, p}(∂Ω).
Proof. We refer to [AF03] for a complete proof. We will simply see how to extend γ0 from smooth functions to a linear
bounded operator from H 1 (Ω) to L2 (∂Ω). This result requires the density of D in H 1 and the trace inequality. Consider
thus a Cauchy sequence { φi }i in D(Ω̄) converging to some v in H 1 (Ω). Using linearity and continuity, we get that for some
pair of indexes i, j:
∥γ0 φi − γ0 φ j ∥0,∂Ω = ∥γ0 ( φi − φ j )∥0,∂Ω ≤ C ∥ φi − φ j ∥1,Ω .
This states that {γ0 φi }i is a Cauchy sequence in L2 , and thus has a limit ξ in L2 (∂Ω). Before setting ξ as our extension, we
need to check it is independent of the chosen sequence. This can be simply done by choosing another sequence such that
{ φ̃i }i converges also to v. Then, it holds that

∥γ0 φ̃i − ξ∥ ≤ ∥γ0(φ̃i − φi) + γ0 φi − ξ∥ ≤ ∥γ0(φ̃i − φi)∥ + ∥γ0 φi − ξ∥.

The second term goes to zero as we showed previously. For the first one, we use the trace inequality again to obtain

∥γ0 ( φi − φ̃i )∥ ≤ C ∥ φi − φ̃i ∥,

which concludes the proof.


This trace is sometimes referred to as the Dirichlet trace, as it is used to define Dirichlet boundary conditions. We can
now define the Sobolev spaces ”with boundary conditions”, i.e.
W_0^{1,p}(Ω) = {u ∈ L^p(Ω) : ∇u ∈ [L^p(Ω)]^d and γ0 u = 0}.

Here, we used the standard notation [ X (Ω)]d := X (Ω) × . . . × X (Ω). We thus get the extensively used spaces:

H01 (Ω) = W01,2 (Ω)


H −1 (Ω) = [ H01 (Ω)]′
H 1/2 (∂Ω) = W 1/2,2 (∂Ω)
H −1/2 (∂Ω) = [ H 1/2 (∂Ω)]′ ,

where the space H 1/2 (∂Ω) is the trace space associated to H 1 (Ω), and the kernel of γ0 is given by H01 (Ω).

The trace spaces This space has some nice properties, which we detail now. Its norm¹ is given by

∥u∥1/2,∂Ω := inf{∥U ∥1,Ω : U ∈ H 1 (Ω) and u = γ0 U },


which naturally yields the following continuity estimate for the trace operator:

∥γ0 U ∥1/2,∂Ω ≤ ∥U ∥1,Ω .


¹ Sobolev spaces of fractional order are a beast of their own. The rigorous definition can be given using either Fourier transforms or the Slobodeckij seminorm. Both are very cumbersome and seldom used.

The trace space H 1/2 can be seen as a quotient space derived from H 1 , so it is also a Hilbert space. A natural question is
what the inner product looks like. To do that, we consider for a given u in H 1/2 (∂Ω), an element that yields the norm, i.e.
U in H 1 (Ω) such that γ0 U = u and ∥u∥1/2,∂Ω = ∥U ∥1,Ω . In such a case, we can consider the following inner product:

(v1 , v2 )1/2,∂Ω := (V1 , V2 )1,Ω ,


where V_i are the extension functions. Finally, it will be useful to know that H01(Ω) (and indeed also W_0^{s,p}(Ω)) can be defined as
a closure in terms of the H 1 norm:
H01(Ω) := closure of C0∞(Ω) with respect to the norm ∥·∥_{1,Ω}.

Note on integration by parts formulas These formulas will be important to define the normal and tangential traces. All
formulas stem from the divergence theorem:
Theorem 6 (Divergence Theorem). Consider a bounded Lipschitz domain Ω in Rd=2,3 and consider a vector field F : Rd → Rd in
[C1 (Ω̄)]d . Then it holds that
∫_Ω div F dx = ∫_∂Ω F · n ds,
where n is the outwards normal vector, dx is the volume measure and ds is the surface measure.
The relevant formulas are the following:

• If ξ in C¹(Ω̄) and u in [C¹(Ω̄)]^d:
∫_Ω (div u) ξ dx = − ∫_Ω u · ∇ξ dx + ∫_∂Ω (u · n) ξ ds.

• (Green’s [first] identity)² If ξ in C¹(Ω̄) and p in C²(Ω̄):
− ∫_Ω (∆p) ξ dx = ∫_Ω ∇p · ∇ξ dx − ∫_∂Ω (∇p · n) ξ ds.

• Consider u, ϕ in [C¹(Ω̄)]^d:
∫_Ω (curl u) · ϕ dx = ∫_Ω u · (curl ϕ) dx + ∫_∂Ω (n × u) · ϕ ds.

H (div) and the normal trace In this space, we have a first simple density result:
Theorem 7. Consider a bounded and Lipschitz domain Ω in Rd , then H (div; Ω) is the closure of [C (Ω̄)]d in the H (div) norm.
Proof. Sketch: The main idea is to show that the orthogonal complement of [C(Ω̄)]^d in H(div; Ω) is trivial. Then, one notes that the orthogonality condition
(u, ϕ) + (div u, div ϕ) = 0
for all ϕ in [C(Ω)]^d implies that ∇ div u = u is in L², and thus div u is in H¹. An adequate extension to all R^d and a density
argument concludes the proof. For details, see [Mon03, Thm 3.22].

The normal trace is simply given for a smooth function as

γ N v = v|∂Ω · n,

where n is the outwards normal vector. This can be extended up to functions in H (div) as stated in the following theorem.
Theorem 8 (Normal trace). Consider a bounded Lipschitz domain Ω in Rd with outwards unit normal n. Then, the mapping γ N
can be extended to a continuous linear map γ N : H (div; Ω) → H −1/2 (∂Ω), and the following integration by parts formula holds:

⟨γ N v, ϕ⟩−1/2,1/2 = (v, ∇ ϕ) + (div v, ϕ) ∀v ∈ H (div; Ω), ϕ ∈ H 1 (Ω).


2 See [Mon03] for the second one. It is useful to derive Boundary Element (BEM) methods.

Proof. First, we note that the integration by parts formula, which holds for C ∞ functions initially, can be extended by density
to functions ϕ in H 1 (Ω):

⟨v · n, ϕ⟩ = (v, ∇ϕ) + (div v, ϕ) ∀v ∈ H(div; Ω), ϕ ∈ H¹(Ω).

Cauchy-Schwartz further yields


|⟨v · n, ϕ⟩| ≤ ∥v∥div ∥ϕ∥1 .
In particular, using the surjectivity of the Dirichlet trace, we get that for all µ in H 1/2 (∂Ω) it also holds that

|⟨v · n, µ⟩| ≤ ∥v∥div ∥µ∥1/2 ,

which naturally yields


∥v · n∥−1/2 ≤ ∥v∥div .
This all implies that γ N is a bounded linear map from [C (Ω̄)]d to H −1/2 (∂Ω), meaning that it can be extended by density
to be defined in H (div; Ω) as well. We are only missing the surjectivity, for which we consider a generic function η in
H −1/2 (∂Ω), and define the following problem: Find ϕ in H 1 (Ω) such that

(∇ ϕ, ∇ ψ) + (ϕ, ψ) = ⟨η, γ0 ψ⟩−1/2,1/2 ∀ ψ ∈ H 1 ( Ω ).

This in particular implies that


(∇ϕ, ∇ψ0) + (ϕ, ψ0) = 0 ∀ψ0 ∈ H01(Ω),
so that, as in the previous proof, div ∇ϕ = ϕ distributionally, and thus v := ∇ϕ belongs to H(div; Ω) and is the function we were looking for.
Finally, we will use the space of functions with null normal trace, so that we initially define
H0(div; Ω) := closure of [C0∞(Ω)]^d with respect to ∥·∥_div,
i.e. the closure in the div norm. The following result makes all the trouble worth it.
Theorem 9. Consider a bounded Lipschitz domain Ω in Rd . Then:

H0 (div; Ω) = {v ∈ H (div; Ω) : γ N v = 0}.

Proof. The proof is done through the orthogonal complement. All techniques have been already shown here, so we simply
refer the interested reader to [Mon03, Thm 3.25].

H (curl) and the tangential trace In the proofs involving H (div), we heavily used the fact that the ∇ and div are the trans-
pose of one another. This is not the case for the curl operator, so the proofs for this case are notoriously more complicated.
Instead, we will be happy with simply stating the related results:

H0(curl; Ω) := closure of [C0∞(Ω)]^d with respect to ∥·∥_curl.

Theorem 10. For a bounded Lipschitz domain Ω, it holds that [C (Ω̄)]3 is dense in H (curl; Ω).

• For a bounded Lipschitz domain Ω it holds that, if u in H (curl; Ω) is such that

(curl u, ϕ) − (u, curl ϕ) = 0

for all ϕ in [C ∞ (Ω̄)]3 , then u is actually in H0 (curl; Ω).


• The trace operators here are two:

γt v = n × v,
γT v = (n × v) × n.

Theorem 11. For a bounded Lipschitz domain Ω it holds that the extension γt : H (curl; Ω) → H −1/2 (∂Ω) is bounded and
linear, with the following integration by parts formula:

(curl v, ϕ) − (v, curl ϕ) = ⟨γt v, ϕ⟩ ∀v ∈ H(curl; Ω), ϕ ∈ [H¹(Ω)]³.
Note that the test space here is the vector-valued one, often written in bold as H¹(Ω). We will use this often.
The operator γt is not surjective simply because it is the limit of tangential vectors which will never have a normal
component (at least intuitively).
• Characterizing γT is out of scope in this course.


Theorem 12. Consider a bounded Lipschitz domain Ω. Then,

H0 (curl; Ω) = {v ∈ H (curl; Ω) : γt v = 0}.

Example: One interesting context in which these trace operators show up is when considering normal or tangential
boundary conditions. As we will see, one typically requires boundary information on all components of the solution
on the boundary, but how this is done is highly context dependent. One nice example are slip boundary conditions in
fluid dynamics, where only the normal component of the fluid velocity is required to be 0:

u · n = 0.

This has to be complemented with conditions in the tangential direction. One possibility would be to prescribe some
tangential velocity:
(I − n ⊗ n)u = (n × u) × n = vτ ,
but one can also look at the tangential components of the stress tensor, thus generating a ”tangential” Neumann
boundary condition:
( I − n ⊗ n)[σ(u)n] = gτ .

1.3 Weak formulations


A weak formulation refers to an integral form of a PDE, understood distributionally. Deriving it is typically a systematic procedure that should not be too difficult, and it helps in revealing what the adequate boundary conditions for a given problem are.
The main tool for this will be the integration by parts formulas. Our test problem will be the Laplace problem, given by the
−∆ operator. The minus sign will be better justified in the following section. Consider then the problem of finding u such
that
−∆u = f in Ω.
Take an arbitrary smooth function v, then integration by parts yields
−∫_Ω ∆u v dx = −∫_∂Ω (γ_D v)(γ_N ∇u) ds + ∫_Ω ∇u · ∇v dx
for all v. This function is typically called a test function. The surface term suggests the boundary conditions:
∫_∂Ω (γ_D v)(γ_N ∇u) ds,
where γ_D v corresponds to the Dirichlet BC and γ_N ∇u to the Neumann BC,

so that we can have boundary conditions on the function itself

u = g,

or on its normal derivative


∇ u · n = h.

This can be combined, so that for a given partition of the boundary into two sets ΓD and ΓN such that ∂Ω = ΓD ∪ ΓN , one can
have a Dirichlet boundary condition on ΓD and a Neumann boundary condition on ΓN . For this type of boundary condition,
one must define a solution space given by

Vg = {v ∈ H 1 (Ω) : γD v = g on ΓD },

but let us focus first on spaces with null Dirichlet boundary condition (V0 ). In this case, the boundary conditions will give
∫_∂Ω (γ_D v)(γ_N ∇u) ds = ∫_{Γ_N} h γ_D v ds,

and thus the integral form of the equation will be given by


−∫_Ω ∆u v dx = −∫_{Γ_N} v h ds + ∫_Ω ∇u · ∇v dx = ∫_Ω f v dx,

for all smooth v. Now, we note that (i) this formulation is well defined for u, v in H 1 (and thus can be extended to hold
for all v in H 1 by density as long as v satisfies the Dirichlet boundary conditions), (ii) the γD operator has been omitted
from the surface integral for convenience, and (iii) that the Dirichlet boundary condition does not appear anywhere in the
formulation. This justifies naming Dirichlet boundary conditions essential, and Neumann boundary conditions natural. The
weak formulation of the problem thus refers to the following statement: Find u in V0 such that

(∇ u, ∇ v)0,Ω = ⟨ f , v⟩ + ⟨h, v⟩ ∀v ∈ V0 ,

for given functionals f in V0′ and h in (γ_D V0)′. Note the following:


• The space of the solution and the test functions is the same. This is not mandatory, but it is common and can be better
motivated by interpreting the Laplace problem as the first order equations related to the following minimization
problem:
min_{v ∈ V0} (1/2) ∫_Ω |∇v|² dx − ⟨f, v⟩.

Then, one simply infers the spaces of each function from the definition of the Gateaux derivative.
• The solution u was formulated in a space without boundary condition. This is important because the regularity
theory will depend on the solution space being a Hilbert space, and the space Vg is not even a vector space as it is not
closed under addition. This can be solved by defining adequate lifting operators, i.e. a function G in H 1 (Ω) such that
γD G = g that allows us to write u in Vg as
u = u0 + G,
where u0 belongs to V0 . We can then rewrite the problem in Vg as a problem in V0 (and I encourage the reader to do
this procedure at least once in their life). The existence of a lifting function in this case is given by the surjectivity of
the Dirichlet trace, but it can be tricky in other contexts. This is also tricky in nonlinear problems, which justifies that
nonlinear problems are typically studied with homogeneous boundary conditions.
• We note that the Laplacian is now being interpreted as a distribution, and thus the strong problem (including the
boundary conditions) yields the definition of the action of the distribution. In particular, this means that the action
of the distribution naturally changes with the boundary conditions. This observation is fundamental to understand
Discontinuous Galerkin methods, or other formulations defined on broken spaces (i.e. spaces that allow for disconti-
nuities).
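To make the weak formulation above concrete, here is a minimal numerical sketch (the choices are illustrative and not part of the original notes): P1 finite elements for −u″ = f on (0, 1) with homogeneous Dirichlet conditions and f = 1, for which the exact solution is u(x) = x(1 − x)/2. The assembled matrix is exactly the discrete version of (∇u, ∇v)_{0,Ω}, and the essential boundary condition is imposed by restricting trial and test functions to the interior nodes, mirroring the choice of V0.

import numpy as np

n = 32                          # number of elements (illustrative)
h = 1.0 / n
nodes = np.linspace(0.0, 1.0, n + 1)

A = np.zeros((n + 1, n + 1))    # stiffness matrix (grad u, grad v)
b = np.zeros(n + 1)             # load vector (f, v), with f = 1
for k in range(n):              # assemble element by element
    A[k:k+2, k:k+2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    b[k:k+2] += h / 2.0         # exact for f = 1 and P1 hat functions

# essential (Dirichlet) boundary condition: keep only interior trial/test functions
interior = np.arange(1, n)
u = np.zeros(n + 1)
u[interior] = np.linalg.solve(A[np.ix_(interior, interior)], b[interior])

u_exact = nodes * (1.0 - nodes) / 2.0
print(np.max(np.abs(u - u_exact)))   # tiny; in 1D, P1 is nodally exact for this problem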

Example: We encourage the reader to try to compute the weak formulation of the Poisson problem in mixed form. To do this, one must define the auxiliary variable σ := ∇u, so that the strong form of the problem now becomes

− div σ = f in Ω,
σ −∇u = 0 in Ω,
γD u = g on ΓD ,
γN σ = h on ΓN .

This problem will be studied in detail further ahead.

1.4 Poincaré inequalities and Lax-Milgram
Using all of the previous definitions, we can finally look at actual problems and some first well-posedness results. For all of
them, the Poincaré inequality will be fundamental.
Lemma 1 (Generalized Poincaré inequality). There exists a positive constant C such that for a non-empty portion Γ ⊆ ∂Ω of the boundary it holds that
∥u∥_{0,Ω} ≤ C ( |u|_{1,Ω} + | ∫_Γ u ds | )   ∀u ∈ H¹(Ω).
The result also holds in L^p and W_0^{1,p}.
We provide an incomplete proof to show some of the related techniques used to prove this result. Most developments
come from [BS08].
Proof. The main result to be used is the Bramble-Hilbert Lemma, which establishes (among other things) that if B is a sufficiently big ball in Ω such that Ω is star-shaped with respect to it, then the average over B given by ū = (1/|B|) ∫_B u dx satisfies
∥u − ū∥_{0,Ω} ≤ C |u|_{1,Ω},
∥u − ū∥0,Ω ≤ C |u|1,Ω ,

where C is a positive constant and | · |1,Ω is the H 1 semi-norm. The case B = Ω is known as the Friedrich’s inequality. To
recover Poincaré’s inequality, one notes the two following properties:

∥v∥0,Ω ≤ ∥v − v̄∥0 + ∥v̄∥0 ≤ C |v|1,Ω + ∥v̄∥0 ,

where the first term was controlled using the Friedrich inequality. The second term is controlled by forcing another triangle
inequality:
∥v̄∥_0 = (|Ω|^{1/2}/|Γ|) | ∫_Γ v̄ ds | ≤ (|Ω|^{1/2}/|Γ|) ( | ∫_Γ (v̄ − v) ds | + | ∫_Γ v ds | ),

and finally by using the trace inequality plus another application of the Friedrich inequality one gets the desired result.
The case in which u is restricted to be in H01 (Ω) is the well-known Poincaré inequality. Note that in that case the boundary
integral disappears. For the case of pure Neumann boundary conditions, we will require the following inequality.
Lemma 2 (Poincaré-Wirtinger inequality). Consider an open, connected, bounded domain Ω as previously. If we define the volumetric average as ū = (1/|Ω|) ∫_Ω u dx, then it holds that

∥u − ū∥ L p (Ω) ≤ C ∥ ∇ u∥ L p (Ω) .

Proof. The proof has been taken from [Eva22]. By contradiction, we assume that there exists a sequence of functions (uk )k
in W 1,p (Ω) such that
∥uk − ūk ∥ p > k∥ ∇ uk ∥ p .
We can renormalize the sequence by setting
vk = (uk − ūk)/∥uk − ūk∥_p,
so that v̄k = 0 and ∥vk ∥ = 1. Our hypothesis yields

∥∇vk∥_p = ∥uk − ūk∥_p^{−1} ∥∇(uk − ūk)∥_p < 1/k,
which in particular means that (vk)k is a bounded sequence in W^{1,p}(Ω). This means that (vk)k has a weakly convergent subsequence in W^{1,p}(Ω), and as this space is compactly embedded in L^p(Ω) (Rellich-Kondrachov), there exists an element v in L^p(Ω) such that, possibly up to a further subsequence, v_{k_j} → v strongly in L^p(Ω). This implies that v̄ = 0 and ∥v∥_p = 1. Using the bound on
∇ vk , one can also show that for smooth functions ϕ,
∫_Ω v ∂i ϕ dx = lim_k ∫_Ω vk ∂i ϕ dx = − lim_k ∫_Ω ∂i vk ϕ dx = 0.

This establishes that ∇ v = 0, which implies that v is constant (it is not trivial to check that null weak derivatives implies
being a constant), and the null average condition yields that v = 0. This contradicts the initial hypothesis.

Probably the most important consequence of this is that the semi-norm given by |x| := ∥∇x∥_{0,Ω}, sometimes referred to as the H01 semi-norm, is equivalent to the H¹ norm on the solution spaces used in the following cases: (i) homogeneous Dirichlet boundary conditions, (ii) homogeneous Neumann boundary conditions (on the zero-average subspace), and (iii) mixed homogeneous Dirichlet and Neumann boundary conditions. Verifying this is a simple but fundamental exercise. A fundamental property to be verified is the following: a bilinear
form a(·, ·) defined on a Hilbert space X is said to be elliptic if there exists a constant α such that

a( x, x ) ≥ α∥ x ∥2X ∀ x ∈ X.

Lemma 3 (Lax-Milgram). Consider a bounded bilinear form a : H × H → R defined on a Hilbert space H that is elliptic with
constants C and α respectively, and a linear functional f in H ′ . Then, there exists a unique u in H such that

a(u, v) = f (v) ∀v ∈ H.

This solution is continuous with respect to the data, in the sense that
∥u∥_H ≤ (1/α) ∥f∥_{H′}.
This is typically referred to as the a priori estimate.

The Poisson problem Consider f in H −1 (Ω) and g in H 1/2 ( Γ ) with Γ := ∂Ω. The Poisson problem in strong form is given
as the following PDE:

−∆u = f in Ω
γ0 u = g on Γ.

Note that the strong form must be understood in the distributional sense, i.e. as an equation in H −1 (Ω). To derive the weak
formulation, consider a function v in H01 (Ω), then using the boundary conditions we obtain that

−⟨∆u, v⟩ = (∇ u, ∇ v),

where (·, ·) is the L2 (Ω) product. Thus the weak formulation reads: Find u in H01 (Ω) such that
∫_Ω ∇u · ∇v dx = ⟨f, v⟩ ∀v ∈ H01(Ω).

This problem can be shown to be well-posed using Lax-Milgram’s lemma and the Poincaré inequality. Small exercise:
Extend the proof to the case of non-homogeneous Dirichlet boundary conditions.
In the case of having a boundary condition defined only on a portion ΓD of the boundary, the formulation changes,
because (i) we need further information regarding the Neumann trace on the complement of the boundary, (ii) the test
space looks different. In particular, we define the solution space given by

V0 = {v ∈ H 1 (Ω) : v=0 on ΓD },

on which, using the generalized Poincaré inequality, the bilinear form can be shown to still satisfy an ellipticity estimate.

The pure Neumann problem In general, having Neumann boundary conditions is problematic for two reasons: (i) it results in a data compatibility condition, and (ii) it results in having a non-trivial kernel in the problem. The problem in general reads:
Find u in H 1 (Ω) such that
−∆u = f
∇ u · n = h.
The weak formulation is
(∇u, ∇v) = ⟨f, v⟩ + ⟨h, v⟩_{∂Ω} ∀v ∈ H¹(Ω),
where it is easy to see that if u is a solution, then u + c is also a solution for all c ∈ R. This means that the problem has a
kernel, which is given by the space of constant functions, i.e. span({1}). The other problem is that, when one considers a
test function in the kernel of the problem, this yields the following:

(∇u, ∇1) = 0 = ⟨f, 1⟩ + ⟨h, 1⟩_{∂Ω}.

This is a compatibility condition on the data, and it shows that having compatible data is necessary for having a well-posed
formulation. Because of these reasons, one considers a solution (and test) space that is orthogonal to the kernel:
V = {u ∈ H¹(Ω) : ∫_Ω u dx = 0},

where the null average condition can be seen as


∫_Ω u dx = (u, 1)_0 = (u, 1)_0 + (∇u, ∇1)_0 = (u, 1)_1,

and thus the orthogonality is being considered with respect to the natural space H¹(Ω). With it, the weak formulation is given as: Consider compatible data f in H⁻¹(Ω) and h, then find u in V such that
(∇u, ∇v) = ⟨f, v⟩ + ⟨h, v⟩_{∂Ω} ∀v ∈ V.
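As an illustration of the pure Neumann problem, here is a minimal sketch under my own choices (not from the notes): Ω = (0, 1), h = 0 and the compatible datum f(x) = cos(πx), whose exact zero-average solution is u(x) = cos(πx)/π². The zero-average condition defining V is enforced with a Lagrange multiplier.

import numpy as np

n = 64
h = 1.0 / n
nodes = np.linspace(0.0, 1.0, n + 1)
f = np.cos(np.pi * nodes)

A = np.zeros((n + 1, n + 1))
b = np.zeros(n + 1)
c = np.zeros(n + 1)                 # c_i = (phi_i, 1), so c @ u = integral of u_h
for k in range(n):
    A[k:k+2, k:k+2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    b[k:k+2] += h / 2.0 * f[k:k+2]  # trapezoidal approximation of (f, phi_i)
    c[k:k+2] += h / 2.0

# augmented (saddle point) system enforcing the zero mean of u_h
K = np.zeros((n + 2, n + 2))
K[:n+1, :n+1] = A
K[:n+1, n+1] = c
K[n+1, :n+1] = c
rhs = np.append(b, 0.0)

u = np.linalg.solve(K, rhs)[:n+1]
u_exact = np.cos(np.pi * nodes) / np.pi**2
print(np.max(np.abs(u - u_exact)))  # small, and decreases under mesh refinement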

1.5 Galerkin schemes for elliptic problems


The idea here is that, instead of discretizing the differential operator as one would do in the case of finite differences to obtain
discrete derivatives, one considers discrete functional spaces, with the idea that the discrete space somehow converges to
the continuous space. This is known as a Galerkin scheme. The schemes here will be conforming in the sense that the
discrete space Vh is a subset of V, but this is not necessary, and indeed all Discontinuous Galerkin (DG) formulations are
basically non-conforming schemes.
Consider thus an abstract differential problem given by finding u in V such that

a(u, v) = L(v) ∀v ∈ V,

such that the hypotheses of Lax-Milgram hold. In this case, we can consider a discrete space Vh in V, and thus define the
following discrete problem:
a(uh , vh ) = L(vh ) ∀vh ∈ Vh .
The most notable aspect of Lax-Milgram is that all of its hypotheses hold also in Vh , which implies that the discrete problem
is also invertible, and the a priori estimate holds as well. The natural question is whether the discrete solution uh converges
to the continuous solution u, which is studied through the error equation. This is obtained by restricting the continuous test functions to Vh ⊂ V and then subtracting both problems:

a(eh , vh ) = 0 ∀vh ∈ Vh ,

where eh = u − uh. This property is known as the Galerkin orthogonality, and it can be used to compute the error estimate by
considering an arbitrary function zh in Vh :

α∥eh ∥2V ≤ a(eh , eh )


= a(eh , u − zh ) (Galerkin orth.)
≤ C ∥ e h ∥V ∥ u − z h ∥V (a continuous),

where one obtains that for all zh it holds that
∥eh∥_V ≤ (C/α) ∥u − zh∥_V.
Taking the infimum over zh one obtains the celebrated Céa estimate:
∥u − uh∥_V ≤ (C/α) inf_{vh ∈ Vh} ∥u − vh∥_V.

This inequality can reveal many things. For example, if the number C/α is very big, it can hint at a very wide gap between
the optimal solution (i.e. the projection) and the discrete one computed from the space Vh . A more precise characterization
of the approximation properties of a space can be given by the Kolmogorov width, which has been studied in [EBBH09].

1.6 Finite elements
The idea, for this course, of studying specifically finite elements will be that of having more concrete ways of expressing the projection error present in Céa’s estimate:
inf_{vh ∈ Vh} ∥u − vh∥_V.

A typical use of this inequality will be that of bounding the projection error through a Finite Element (FEM) interpolant,
such that one gets inequalities such as
inf_{vh ∈ Vh} ∥u − vh∥_V ≤ ∥u − Ih u∥_V ≤ h^s ∥u∥_W,

where W is some space of higher regularity, s is some (hopefully positive) exponent and Ih : V → Vh is an interpolation
operator. Our idea is that of producing FEM spaces for all of our relevant Hilbert spaces: L2 , H 1 , H (div), and H (curl).
A finite element is defined as a triple (K, PK , ΣK ), where K is a geometric domain (interval, triangle, etc), PK is a space
of functions, and ΣK is a set of linear functionals defined on PK known as degrees of freedom. We will typically assume
that the finite element is unisolvent in the sense that the degrees of freedom uniquely characterize a function in PK . One
simple example in the interval (0, 1) would be considering as PK the space of linear functions and as degrees of freedom
the functions l0 ( p) = p(0) and l1 ( p) = p(1). Associated with a finite element is an interpolant, which is the operator
πK : C (K ) → PK such that
l (πK (u) − u) = 0 ∀l ∈ ΣK ,
for all sufficiently smooth functions u. In words, the polynomial in PK that matches u in the degrees of freedom. This is
known as an interpolation operator.
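A tiny sketch of this simplest finite element (the example function below is my own illustrative choice): K = (0, 1), P_K the linear functions, degrees of freedom l0(p) = p(0) and l1(p) = p(1), and the associated interpolant. Unisolvence is visible in the fact that the two endpoint values determine the linear function uniquely.

import numpy as np

def interpolate_p1(u):
    """Return coefficients (a, b) of pi_K(u)(x) = a + b*x on (0, 1)."""
    u0, u1 = u(0.0), u(1.0)      # evaluate the degrees of freedom l0, l1
    return u0, u1 - u0           # unique linear function with those endpoint values

u = lambda x: np.exp(x)          # a smooth function on [0, 1]
a, b = interpolate_p1(u)
pi_u = lambda x: a + b * x

# the degrees of freedom of the interpolation error vanish: l(pi_K(u) - u) = 0
print(pi_u(0.0) - u(0.0), pi_u(1.0) - u(1.0))   # both ~ 0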
A FEM space is a global space in the sense that it is defined on all of the domain Ω, or on some approximation Ωh of it, and it does not depend on the degrees of freedom. We will mostly use the following definitions: P_K^k is the space of polynomials of degree at most k defined on an element K, and the geometry is discretized through a tessellation of elements such that
Ω = ⋃_j K_j,

and thus a scalar FEM space is given by
X_h^k = { vh ∈ C(Ω) : vh|_K ∈ P_K^k ∀K }.

This space is said to be conforming in V if Xhk belongs to V. Naturally, a global FEM space will be conforming to either
H 1 , H (div), or H (curl) depending on the continuity imposed on the degrees of freedom. We summarize the (somewhat
expected) relevant continuity requirements in the following lemma.

Lemma 4 (Conforming spaces). Consider two non-overlapping Lipschitz domains K1 and K2 such that they meet at a common
surface Σ.
• Consider two scalar functions p1 in H 1 (K1 ) and p2 in H 1 (K2 ), and glue them as p = p1 IK1 + p2 IK2 . If p1 |Σ = p2 |Σ , then p
belongs to H 1 (K1 ∪ K2 ∪ Σ).
• Consider two vector functions u1 in H (div; K1 ) and u2 in H (div; K2 ), and glue them as u = u1 IK1 + u2 IK2 . Then, if u1 · n =
u2 · n it holds that u belongs to H (div; K1 ∪ K2 ∪ Σ).

• Consider two vector functions u1 in H (curl; K1 ) and u2 in H (curl; K2 ), and glue them as u = u1 IK1 + u2 IK2 . Then, if u1 × n =
u2 × n it holds that u belongs to H (curl; K1 ∪ K2 ∪ Σ).
Proof. Point (1) is proved in [Gat14], (2) is in [Mon03], and (3) is homework :) .
We now simply show the fundamental FEM bases and the related interpolation operators that we will use to recover
most convergence estimates. We follow the presentation given in [Mon03], but similar results can be found in [EG04].

H (div): Here we consider the local polynomial space in R^d given by
Dk = [P_{k−1}]^d ⊕ P̃_{k−1} x,

where x is the identity map, and P̃_k is the set of homogeneous³ polynomials of degree exactly k. If k = 1, this space
will have d + 1 degrees of freedom, characterized in a triangle/tetrahedron by degrees of freedom defined through the
normal component of the polynomial on its facets. These elements are known as Raviart-Thomas elements. Thus, given a
triangulation of the geometry Th , this allows us to define the space

Wh = {uh ∈ H (div; Ω) : uh |K ∈ Dk ∀K ∈ Th } ,

where Wh is conforming in H (div). In addition, we get the following result regarding an interpolation operator:
Theorem 13 (H (div) interpolation). Consider 0 < δ < 1/2 and s ∈ [1/2 + δ, k ]. Then, if u belongs to H s (Ω), there exists an
interpolation operator wh and a positive constant C such that

∥u − wh u∥0 ≤ Chs ∥u∥ H s (Ω)

and
∥ div (u − wh u) ∥0 ≤ Chs ∥ div u∥ H s (Ω) .

Naturally, this result implies the estimate in the H(div) norm.

H (curl): The procedure for this space is roughly similar to the previous one. For it, we define the space

Sk = { p ∈ [P̃_k]³ : p · x = 0 },

which is somehow the orthogonal complement to the space P̃_{k−1} x. With it, we define the local space as

Rk = [Pk−1 ]3 ⊕ Sk ,

and given a triangulation Th of the geometry, we consider the space

Vh = {uh ∈ H (curl; Ω) : uh |K ∈ Rk ∀K ∈ Th } .

This space is conforming in H (curl). In addition, we also get some nice interpolation properties.
Theorem 14 (H (curl) interpolation). The theorem states:

• Consider δ > 0 and s in [1/2 + δ, k]. Then, if u is in H s (Ω), there exists a positive constant C and an interpolation operator rh
such that
∥u − rh u∥0 + ∥ curl(u − rh u)∥0 ≤ Chs (∥u∥ H s + ∥ curl u∥ H s ) .

• Consider 0 < δ ≤ 1/2. If u belongs to H^{1/2+δ}(Ω) and [curl u]|_K belongs to Dk in all elements K, then we further have that
∥u − rh u∥_0 ≤ C ( h_K^{1/2} ∥u∥_{H^{1/2+δ}} + h_K ∥curl u∥_0 ).

H1: From the Lemma regarding conforming spaces, we can immediately see that the following is a conforming space in
H1:
Uh = { v h ∈ H 1 ( Ω ) : v h | K ∈ P k ∀ K ∈ T h },
with a nice interpolation operator:
Theorem 15. There exists a positive constant C and an interpolation operator πh such that, for a positive δ:
∥p − πh p∥_1 ≤ C h^{s−1} ∥p∥_{H^s},   3/2 + δ ≤ s ≤ k + 1.


³ A homogeneous polynomial is one whose non-zero terms all have the same degree, such as x² + xy.

L2 : We can finally write the space
Zh = { ph ∈ L2 (Ω) : ph |K ∈ Pk−1 ∀ K ∈ T h },
and there exists an interpolation operator P0 such that for v in W^{l,p}(Ω) and 0 ≤ l ≤ k, p ∈ [1, ∞], there is some positive C such that
∥v − P0 v∥ L p ≤ Chl |v|W l,p .
For sufficiently smooth functions (δ > 0 in the above), these spaces induce a de Rham complex that can be extremely useful
in FEM analysis. For this, we can consider the following spaces for δ > 0:
U = H 3/2+δ (Ω)
V = {v ∈ H 1/2+δ (Ω) : curl v ∈ H 1/2+δ (Ω)},
W = {w ∈ H 1/2+δ (Ω) : div w ∈ L2 (Ω)}
which yield the following commuting diagram:

    U ⊂ H¹(Ω)   --∇-->   V ⊂ H(curl; Ω)   --curl-->   W ⊂ H(div; Ω)   --div-->   L²(Ω)
        | πh                  | rh                         | wh                     | P0
        v                     v                            v                        v
        Uh       --∇-->       Vh            --curl-->      Wh          --div-->     Zh
There is an intrinsic relationship between this structure and inf-sup conditions. It additionally commutes, and thus one can
obtain identities such as
rh ∇ = ∇ πh ,
which further relates the interpolation results.

1.7 Inverse inequalities


All of the discrete spaces considered are finite dimensional, which means that all their norms are equivalent. Of course,
these relationships don’t hold in the continuous setting, so there is bound to be some dependence of these constants on h.
The great thing about inverse inequalities is that in the discrete settings we can sometimes get away with operations that
would be otherwise not feasible, but using adequate bounds can yield convergence anyway. We provide a general result
first and then show some important consequences. For details, see [EG04]. We will require the following definition: we say
that a family of affine meshes in Rd is shape regular if there exists σ0 such that
∀h : σK := hK/ρK ≤ σ0 ∀K ∈ Th,
where ρK is the diameter of the largest ball that can be inscribed in K, and hK is the diameter of K. It is quasi-uniform if it is
shape-regular and there is some C such that
∀h : hK ≥ Ch ∀K ∈ Th .
Theorem 16 (Global inverse inequality). Consider a finite element (K̂, P̂, Σ̂), l ≥ 0 such that P̂ ⊂ W l,∞ (K̂ ), a family of shape-
regular and quasi-uniform meshes {Th }h>0 with h < 1, and set
Wh = {vh : vh ◦ TK ∈ P̂ ∀ K ∈ T h },
where TK is the affine mapping from an element K to the reference element K̂; i.e. the family of discrete functions that locally belong to
the finite element considered. Then, there is some positive C such that for all vh in Wh and m ∈ [0, l ] it holds that
( ∑_K ∥vh∥^p_{W^{l,p}(K)} )^{1/p} ≤ C h^{m−l+min(0, d/p−d/q)} ( ∑_K ∥vh∥^q_{W^{m,q}(K)} )^{1/q}.

In particular this shows that


∥vh ∥W 1,p ≤ Ch−1 ∥vh ∥ L p .
One can also obtain the following local estimate in 2D for simplices [WH03]:
∥vh∥_{0,F} ≤ C h_K^{−1/2} ∥vh∥_{0,K},
where F is a facet (line or triangle) and K is the element (triangle or tetrahedron).
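A small numerical check of the inverse inequality ∥vh∥_{W^{1,2}} ≤ C h⁻¹ ∥vh∥_{L²} for P1 functions on uniform meshes of (0, 1) (an illustration under my own choices, not from the notes): the sharp constant is the square root of the largest eigenvalue of the generalized problem S v = λ M v, with S and M the stiffness and mass matrices, so sqrt(λ_max)·h should stay roughly constant under refinement.

import numpy as np
from scipy.linalg import eigh

def p1_matrices(n):
    h = 1.0 / n
    S = np.zeros((n + 1, n + 1))
    M = np.zeros((n + 1, n + 1))
    for k in range(n):
        S[k:k+2, k:k+2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        M[k:k+2, k:k+2] += h * np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0
    return S, M

for n in (8, 16, 32, 64):
    S, M = p1_matrices(n)
    lam_max = eigh(S, M, eigvals_only=True)[-1]
    # sqrt(lam_max) * h stays roughly constant, confirming the h^{-1} scaling
    print(n, np.sqrt(lam_max) * (1.0 / n))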

2 Beyond ellipticity
In the previous section we have thoroughly studied elliptic problems and many approximation properties. Still, one might
rightfully notice that ellipticity can be quickly broken. For example, the operator −∆ : H01 (Ω) → H −1 (Ω) is elliptic, but if
we remove the minus sign, it loses that property. As this case is linear, it is possible to remap the unknown with u 7→ −u,
but it still feels unsatisfactory that the property is so fragile. Because of this, in this section we review two important theories
that go beyond ellipticity: the theory of Fredholm operators and the inf-sup theory.

2.1 Fredholm operators


Most of the forms in which we present our problems come from the fantastic notes by Andrea Moiola on time-harmonic
acoustic waves [Moi21]. Our reference problem will be the Helmholtz equation, given by the following strong form:

−∆u − k2 u = f , Ω,


u = 0, ∂Ω,
for some f ∈ H⁻¹(Ω) and some boundary condition. This equation can be recovered in the following way: consider the wave equation
ü − ∆u = 0,
and consider a variable separation procedure with u(t, x) = V(t)U(x). This is a common technique, and it results in
V̈/V = ∆U/U = −k².
Multiplying by U gives the desired result. Regarding well-posedness, let’s first explore what we can conclude with Lax-
Milgram. We will assume u in H01 (Ω) for simplicity, which yields the following weak form: Find u in H01 (Ω) such that

a(u, v) = (∇u, ∇v) − k2 (u, v) = ⟨ f , v⟩, ∀v ∈ H01 .


Using H01 (Ω) with the norm induced by the seminorm |v|1 = ∥∇v∥0 , Poincaré inequality gives us ∥u∥0 ≤ CΩ ∥∇u∥0 ,
and so the Lax-Milgram hypotheses look as follows:

• Boundedness:
a(u, v) ≤ ∥∇u∥_0 ∥∇v∥_0 + k² ∥u∥_0 ∥v∥_0 ≤ (1 + k² C_Ω²) |u|_1 |v|_1.

• Coercivity:
a(v, v) = ∥∇v∥²_{0,Ω} − k² ∥v∥²_{0,Ω}.
We note that, by the Poincaré inequality,
k² ∥v∥²_0 ≤ k² C_Ω² ∥∇v∥²_0,
and thus,
a(v, v) ≥ ∥∇v∥²_0 − k² C_Ω² ∥∇v∥²_0 = (1 − k² C_Ω²) ∥∇v∥²_{0,Ω}.

In other words, the problem is well-posed if
1 − k² C_Ω² > 0 ⇐⇒ k² < 1/C_Ω².

This is a very limited answer, so we will now answer what happens for arbitrary k ∈ R.
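Before moving on, a small numerical sketch (illustrative choices, not from the notes) of how restrictive this condition is on Ω = (0, 1) with homogeneous Dirichlet conditions: the sharp Poincaré constant satisfies 1/C_Ω² = λ₁, the first Dirichlet eigenvalue of −∆ (equal to π² here), which we can approximate with a P1 discretization.

import numpy as np
from scipy.linalg import eigh

n = 200
h = 1.0 / n
S = np.zeros((n + 1, n + 1))
M = np.zeros((n + 1, n + 1))
for k in range(n):
    S[k:k+2, k:k+2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    M[k:k+2, k:k+2] += h * np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0

interior = np.arange(1, n)                      # enforce u = 0 at both endpoints
idx = np.ix_(interior, interior)
lam_1 = eigh(S[idx], M[idx], eigvals_only=True)[0]

print(lam_1, np.pi**2)   # ~ 9.87: Lax-Milgram only covers wavenumbers k^2 < pi^2 here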
Definition 1. A linear operator K : H1 → H1 is compact if the image of a bounded sequence admits a converging subsequence.

Definition 2. A bounded linear operator is a Fredholm operator if it is the sum of an invertible and a compact operator.
Theorem 17 (Fredholm alternative). A Fredholm operator is injective if and only if it is surjective. In such case, it has a bounded
inverse.
A simpler way of showing that an operator is Fredholm, is through a Gårding inequality.
Definition 3. Consider H ⊂ V Hilbert spaces with a continuous embedding H ,→ V. A bilinear form a : H × H → R satisfies a
Gårding inequality if there exist two positive constants α, CV such that
a(v, v) ≥ α∥v∥²_H − CV ∥v∥²_V ∀v ∈ H.


Proposition: Assume the inclusion H ,→ V is compact and let A : H → H∗ be the operator associated to a : H × H → R, i.e.
⟨Ax, y⟩ = a(x, y).
If a satisfies a Gårding inequality, then A is Fredholm.
Proof. We note that, by the Gårding inequality, the bilinear form a(u, v) + CV (u, v)V is elliptic, so its associated operator is invertible because of Lax-Milgram. We now try to write
the equation in operator form. The form (u, v)V is handled as follows: set T : V → H ∗ as

⟨ Tv, w⟩ H ∗ × H = ”(v, w)V ” ≡ (v, iw)V ,


where i : H → V is the compact embedding. T is clearly bounded:

∥Tv∥_{H∗} = sup_{w ∈ H} (v, iw)_V / ∥w∥_H
          ≤ sup_{w ∈ H} ∥v∥_V ∥iw∥_V / ∥w∥_H
          ≤ sup_{w ∈ H} ∥v∥_V ∥i∥ ∥w∥_H / ∥w∥_H
          ≤ ∥i∥ ∥v∥_V.
Then, the operator associated to the problem is B := A + CV T ◦ i, where:
• B is invertible,
• T is continuous,
• i is compact.
The last two imply that T ◦ i is compact, as the composition of continuous and compact operators is compact. It follows
that A = B − CV T ◦ i is Fredholm.

2.2 Galerkin Stability


We have seen that elliptic problems have the following a-priori stability estimate

α∥u∥ H ≤ ∥ f ∥ H ∗ for a(u, v) = ⟨ f , v⟩, ∀v ∈ H.


which carries naturally to the discrete problem:

α∥uh∥_H ≤ ∥f∥_{H∗}   for   a(uh, vh) = ⟨f, vh⟩, ∀vh ∈ Hh.
This can be rewritten as follows: denote by Πh : H → Hh the orthogonal projection, and by R_H : H → H∗ the Riesz map.
Then:

a(u, v) = ⟨ f , v⟩ H ∗ × H ∀v ∈ H,
⇐⇒ ⟨ Au, v⟩ H ′ × H = ⟨ f , v⟩ H ′ × H ∀v ∈ H,
⇐⇒ Au = f in H ′ and ( R−1 ◦ A)u = R−1 f in H.

Also:

a(uh, vh) = ⟨f, vh⟩ ∀vh ∈ Hh,
⇐⇒ ((R⁻¹ ◦ A) uh, vh)_H = (R⁻¹ f, vh)_H ∀vh ∈ Hh,
⇐⇒ Πh Ã uh = Πh F,

where Ã := R⁻¹ ◦ A and F := R⁻¹ f. The stability estimate gives

∥uh∥ ≤ (1/α) ∥Πh F∥ = (1/α) ∥Πh Ã uh∥.

Finally, by duality, we get

∥uh∥ ≤ (1/α) ∥Πh Ã uh∥ = (1/α) sup_{vh} (Πh Ã uh, vh)/∥vh∥ = (1/α) sup_{vh} ⟨A uh, vh⟩_{H′×H}/∥vh∥ = (1/α) sup_{vh} a(uh, vh)/∥vh∥.

We call this result,
∥uh∥ ≤ C sup_{vh ∈ Hh} a(uh, vh)/∥vh∥ ∀uh ∈ Hh,
an inf-sup condition, which we will later use to prove surjectivity and injectivity separately. One may readily see that this condition implies (discrete) injectivity, as Auh = 0 implies uh = 0.
We will require the following Lemma regarding discrete stability of Fredholm operators. Details are provided in [SBH19].

Lemma 5. Consider a bilinear form a : H × H → R associated to an injective Fredholm operator A + K. Then, there exist C, h0 > 0 such that
∥uh∥_H ≤ C sup_{vh ∈ Hh} a(uh, vh)/∥vh∥ ∀uh ∈ Hh, h ≤ h0.
In other words, discrete stability holds only for sufficiently fine meshes.
Consider the discrete problem
a(uh , vh ) + b(uh , vh ) = ⟨ f , vh ⟩
where a and b are the bilinear forms associated to an elliptic and a compact operator respectively, and consider the
Galerkin projection Gh : H → Hh defined by
a(Gh u, vh) + b(Gh u, vh) = a(u, vh) + b(u, vh) ∀vh ∈ Hh.
Under the previous hypotheses, for h ≤ h0 we have
∥Gh u∥ ≤ C sup_{vh} [a(Gh u, vh) + b(Gh u, vh)]/∥vh∥ ≤ C ∥A + K∥ ∥u∥.
Then Gh is bounded. We observe that, as Gh is a projection, it holds that Gh Πh = Πh, and thus:

∥ u − Gh u ∥ ≤ ∥ u − Πh u ∥ + ∥ Πh u − Gh u ∥ ,
= ∥u − Πh u∥ + ∥ Gh (Πh u − u)∥,
≤ (1 + C ∥ A + K ∥)∥u − Πh u∥,

which implies

∥u − uh∥ ≤ (1 + C ∥A + K∥) inf_{vh ∈ Hh} ∥u − vh∥,

which is a Céa estimate for sufficiently small h.

2.3 Inf-sup conditions
2.4 Saddle point problems
In this section we will study the well-posedness theory of saddle point problems. The presentation has been taken, mostly
verbatim, from [Gat14]. A saddle point problem has the form:

    [ A   B^T ] [ u ]   [ f ]
    [ B    0  ] [ p ] = [ g ].

To grasp the relevance of this formulations, let’s first see some examples:
• Stokes: The variational formulation of Stokes reads:
(∇u, ∇v) − (p, div v) = ⟨f, v⟩ ∀v,
(div u, q) = 0 ∀q.

By inspection we see that A = −∆, B = div.


• Darcy (or mixed Poisson): We are dealing with the problem

−∆u = f in Ω

Introducing the variable σ = −∇u, the problem now reads:


div σ = f in Ω,
σ + ∇u = 0 in Ω.
Testing the equations with τ and v yields:
(σ, τ) − (u, div τ) = 0 ∀τ,
(div σ, v) = ⟨f, v⟩ ∀v,
where A is the identity operator and B = div as before.
• Primal-Mixed Poisson (Dirichlet with multipliers) Consider the problem
−∆u = f in Ω,
u = g on ∂Ω.

Integration by parts yields:


(∇u, ∇v) − ⟨γ N u, γD v⟩ = ⟨ f , v⟩.
Define ξ = −γ N u and impose Dirichlet boundary conditions weakly. That is,

∀λ ∈ H −1/2 (∂Ω) : ⟨λ, γD u⟩ = ⟨λ, g⟩

Writing everything together shows a saddle point problem:


(∇u, ∇v) + ⟨ξ, γD v⟩ = ⟨f, v⟩ ∀v,
⟨λ, γD u⟩ = ⟨λ, g⟩ ∀λ.

Now that we know some examples of saddle point problems, a natural question is to ask for conditions for the existence and uniqueness of solutions. Fortunately, this has already been done and it is known as the Ladyzhenskaya-Babuška-Brezzi theory, typically denoted as LBB theory.
Theorem 18. Consider the problem
    [ A   B^T ] [ u ]   [ f ]
    [ B    0  ] [ p ] = [ g ].
Let V = ker B and Π : H → V the orthogonal projector. Suppose that

• ΠA : V → V is a bijection;
• the bilinear form b (associated to B) satisfies the inf-sup condition with constant β.
Then, for each (f, g) in H′ × Q′ there exists a unique pair (u, p) ∈ H × Q solving the saddle point problem above. Moreover, there is a positive constant
C = C (∥ A∥, ∥(ΠA)−1 ∥, β) such that
∥(u, p)∥ ≤ C (∥ f ∥ + ∥ g∥)
Proof. Exercise :)

Common example: Darcy We will present the worked example of Darcy’s problem, to be analyzed using the LBB theory.
Its weak form is given as follows: Consider H = H (div; Ω) ∩ {u · n = 0} and Q = L2 (Ω), then find (u, p) in H × Q such
that
(K −1 u, v) − (div v, p) = ⟨ f , v⟩ ∀v ∈ H
(div u, q) = ⟨ g, q⟩ ∀q ∈ Q,
where K is symmetric and positive definite (k1 | x|2 ≤ x · K −1 x ≤ k2 | x|2 ), f is in H ′ and g is in Q′ . We omit details regarding
continuity as it is a simple application of the Cauchy-Schwarz inequality. The bilinear forms to be studied here are a(u, v) =
(K −1 u, v) and b(v, q) = (div v, q).
• First, we need to show that ΠA|V is invertible. To better understand this operator, we recall that Π is the orthogonal
projector of H into V = ker B, so let’s look into all the pieces to make sense out of it. First, B : H → Q is the operator
given by
( Bv, q) = (div v, q),
and thus, as L2 can be identified with its dual, we have that simply B = div and thus V = {v ∈ H : div v = 0}, i.e.
the space of solenoidal functions in H (div; Ω) with null normal component on the boundary. Second, the projection
Π is surjective, and thus
⟨ΠA|V u, v⟩,
which is defined for all u, v in V, can be written analogously as

⟨ΠAu, v⟩

for u in V and v in H, where Au belongs to H. Finally, Π : H → V is an orthogonal projector, meaning that if we


denote H = V ⊕ V ⊥ and v = v0 + v⊥ , we obtain
⟨ Au, v0 ⟩
with u in V and v0 in V . In other words, the operator ΠA|V is simply the restricted bilinear form a : V × V → R,
given by
a(u, v) = (K −1 u, v) ∀u, v ∈ V .
In such a space, the form a is elliptic:

a(u, u) = (K −1 u, u) ≥ k1 ∥u∥20 = k1 ∥u∥2div ∀u ∈ V ,

where the term ∥ div u∥ can be trivially added as div u = 0 for u in V . This shows that ΠA|V is invertible using the
Lax-Milgram lemma.
• We now show the inf-sup property. For this, we will use the technique of the auxiliary problem. Consider the following
problem
−∆z = q in Ω,
∇z · n = 0 on ∂Ω,
which we have already shown to be invertible in the subspace of H 1 that is orthogonal to the constants. Thus, it holds
that the function ũ = − ∇ z is such that it belongs to H and satisfies that ∥ũ∥0 ≤ C ∥q∥0 , which comes from the a-priori
(or stability) estimate. Also, as div ũ = q, we have ∥ũ∥div ≤ (C + 1)∥q∥. Going back to the original problem, we want
to show that there exists β > 0 such that

sup_{v ∈ H} b(v, q)/∥v∥ ≥ β ∥q∥_0 ∀q ∈ Q.

Using our previously constructed solution, we obtain

sup_{v ∈ H} b(v, q)/∥v∥ ≥ (div ũ, q)/∥ũ∥_div ≥ ∥q∥_0² / ((C + 1)∥q∥_0) = β̃ ∥q∥_0,

where β̃ = 1/(C + 1). This concludes the proof. We note that it was actually sufficient to show that for each q there existed an element ũ in the desired space, as it proves the surjectivity of the operator B, but showing the complete
inf-sup estimate was more instructive.
Given that we have shown the required properties for a and b, then there exists a unique and stable solution (u, p) of Darcy’s
problem.
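To see the saddle point structure in practice, here is a minimal sketch (illustrative choices, not from the notes) of the one-dimensional mixed Poisson problem, i.e. Darcy with K = I on (0, 1) and f = 1, discretized with continuous P1 fluxes (the one-dimensional Raviart-Thomas space) and piecewise constant pressures; the exact solution is u(x) = x(1 − x)/2 and σ(x) = x − 1/2.

import numpy as np

n = 32
h = 1.0 / n
A = np.zeros((n + 1, n + 1))            # (sigma, tau): P1 mass matrix
B = np.zeros((n, n + 1))                # (div sigma, v): P0 test functions
for k in range(n):
    A[k:k+2, k:k+2] += h * np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0
    B[k, k], B[k, k + 1] = -1.0, 1.0    # integral of sigma' over element k

F = h * np.ones(n)                      # (f, v) with f = 1

# block saddle point system [[A, -B^T], [B, 0]] [sigma; u] = [0; F]
K = np.block([[A, -B.T], [B, np.zeros((n, n))]])
rhs = np.concatenate([np.zeros(n + 1), F])
sol = np.linalg.solve(K, rhs)
sigma, u = sol[:n + 1], sol[n + 1:]

mid = (np.arange(n) + 0.5) * h
print(np.max(np.abs(sigma - (np.linspace(0, 1, n + 1) - 0.5))))  # flux matches x - 1/2 at nodes
print(np.max(np.abs(u - mid * (1 - mid) / 2.0)))                 # small, decreases with h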

2.5 Discretization of saddle point problems


We consider finite dimensional and conforming spaces { Hh }h ⊂ H and { Qh }h ⊂ Q. Then, given F in H ′ and G in Q′ , the
discrete problem reads: Find (uh , ph ) in Hh × Qh such that

a(uh , vh ) + b(vh , ph ) = ⟨ F, vh ⟩ ∀ v h ∈ Hh
b(uh , qh ) = ⟨ G, qh ⟩ ∀qh ∈ Qh .

The previous theory can be used in this context essentially unchanged, with only some mild changes in the definition of the operators, which we describe before stating the result. Consider the induced operators Ah : Hh → Hh and
Bh : Hh → Qh , defined using convenient Riesz operators as done previously, and define the kernel space Vh = ker Bh =
{ v h ∈ Hh : b ( v h , q h ) = 0 ∀ q h ∈ Q h } .
Theorem 19. Consider the orthogonal projection Πh : Hh → Vh. Then, if

• The operator Πh Ah |Vh : Vh → Vh is injective (or surjective), and


• the bilinear form b : Hh × Qh → R satisfies an inf-sup condition, then
for each pair of functions ( F, G ) there is a unique solution (uh , ph ) in Hh × Qh such that
 
∥(uh , ph )∥ H ×Q ≤ Ch ∥ F | Hh ∥ H ′ + ∥ G |Qh ∥Q′ ,
h h

where Ch = Ch (∥ Ah ∥, ∥(Πh A)−1 ∥, β h ).


From this result, it is easy to see that the Galerkin projection Gh : H × Q → Hh × Qh is well-posed:

a(ΠH ◦ Gh (u, p), vh ) + b(vh , ΠQ ◦ Gh (u, p)) = a(u, vh ) + b( p, vh ) ∀ v h ∈ Hh


b(ΠH ◦ Gh (u, p), qh ) = b(u, qh ) ∀qh ∈ Qh ,

where ΠH and ΠQ are simply component projections, i.e. ΠH (u, p) = u and ΠQ (u, p) = p. One can further prove a Céa
estimate:
∥u − uh ∥ ≤ C1 ∥ inf ∥u − ζ h ∥ H + C2 inf ∥ p − wh ∥
ζ h ∈ Hh wh ∈ Qh

∥ p − ph ∥ ≤ C3 ∥ inf ∥u − ζ h ∥ H + C4 inf ∥ p − wh ∥.
ζ h ∈ Hh wh ∈ Qh

Note that the discrete inf-sup condition does not follow from the continuous one, so it is typically an additional difficulty
during the analysis. Still, there is a classical lemma that allows one to infer the discrete inf-sup under some conditions.

Lemma 6 (Fortin’s Lemma). Consider b : H × Q → R that satisfies an inf-sup condition with constant β > 0. If there exists a
family of discrete projectors Πh : H → Hh such that

∥Πh ∥ ≤ C̃ ∀h and b(Πh u, qh ) = b(u, qh ) ∀u ∈ H, qh ∈ Qh ,

then the discrete inf-sup condition for b : Hh × Qh → R holds with constant β̃ = β/C̃.
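Complementary to Fortin's lemma, the discrete inf-sup constant of a given pair ( Hh , Qh ) can also be probed numerically. A
minimal sketch of this standard inf-sup test is shown below, under the assumption that B is the matrix of b(·, ·) and MH , MQ
are the Gram matrices of the H- and Q-norms; here they are random placeholders rather than assembled finite element matrices.

```python
# Sketch of the numerical inf-sup test: beta_h^2 is the smallest eigenvalue of
#   B M_H^{-1} B^T q = lambda M_Q q,
# where B is the matrix of b(.,.) and M_H, M_Q are the Gram matrices of the
# H- and Q-norms. All matrices below are random placeholders.
import numpy as np
from scipy.linalg import eigh, solve

rng = np.random.default_rng(1)
n, m = 10, 4
X = rng.standard_normal((n, n)); M_H = X @ X.T + n * np.eye(n)
Y = rng.standard_normal((m, m)); M_Q = Y @ Y.T + m * np.eye(m)
B = rng.standard_normal((m, n))

S = B @ solve(M_H, B.T)                     # Schur-type matrix B M_H^{-1} B^T
lam = eigh(S, M_Q, eigvals_only=True)       # generalized symmetric eigenproblem
beta_h = np.sqrt(lam.min())
print(beta_h)   # a stable pair keeps beta_h bounded away from 0 under refinement
```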

3 Beyond linearity
We will now cover the notion of differentiability in Banach spaces. The presentation closely follows that of [AP95], with
substantially less detail.
Definition 4. Set u ∈ U ⊂ X with U an open set, and F : U → Y. We say F is Fréchet differentiable at u if there exists a linear
operator A ∈ L( X, Y ) such that
F (u + h) = F (u) + A(h) + o (∥h∥),
where L( X, Y ) denotes the space of bounded linear operators from X to Y, and the residual term o (∥h∥) corresponds to a function f (h) such that
∥ f (h)∥/∥h∥ → 0 when h → 0. We denote the Fréchet derivative at u in the direction h as dF (u)[h] := A(h).
This construction yields two properties about Fréchet differentiability:
1. For a given F : U → Y, dF (u) is unique. To prove this, assume A ̸= B are two Fréchet derivatives of F, that is,
F ( x + h) = F ( x ) + Ah + o (∥h∥)
F ( x + h) = F ( x ) + Bh + o (∥h∥).
Subtracting these equations, we have Ah − Bh = o (∥h∥). Since A ̸= B, there exists a direction h∗ such that Ah∗ ̸= Bh∗ .
Setting h = th∗ , with t ∈ R, we note that
∥ Ah − Bh∥ / ∥h∥ = t∥ Ah∗ − Bh∗ ∥ / (t∥h∗ ∥) = ∥ Ah∗ − Bh∗ ∥ / ∥h∗ ∥ = const. ↛ 0,
since the quotient does not depend on t (and hence not on ∥h∥), and thus Ah − Bh ̸= o (∥h∥), which is a contradiction.
2. If F : U → Y is Fréchet differentiable, then F is continuous.
Furthermore, we recover some classical properties of differentiation:
1. (Linear combination) If F and G are Fréchet differentiable, then for any a, b ∈ R, aF + bG is Fréchet differentiable.
2. (Chain rule) Let F : U → Y and G : V → Z, with U ⊂ X an open set and F (U ) ⊂ V ⊂ Y. Then, if F is Fréchet
differentiable at u ∈ U and G is Fréchet differentiable at F (u) ∈ V, then the composition G ◦ F : U → Z is Fréchet
differentiable at u, and we have
d( G ◦ F )(u)[h] = dG ( F (u))[dF (u)[h]].

We now define the Fréchet derivative map.


Definition 5. Let F : U → Y be a Fréchet differentiable function in U (i.e. Fréchet differentiable at every point u ∈ U). The map
F ′ : U → L( X, Y )
u 7→ dF (u)

is called the Fréchet derivative of F. If F ′ is continuous, we say that F ∈ C1 and write F ∈ C1 (U, Y ).
We note that the definition of the Fréchet derivative gives no hint as to how to actually compute it for a given F. We begin by
defining a different notion of differentiability.
Definition 6. We define the Gâteaux derivative of F : U → Y at a fixed u ∈ U as
dG F (u)[h] := lim_{ε→0} ( F (u + εh) − F (u) ) / ε = d/dε ( F (u + εh) ) |ε=0 .
This corresponds to the directional derivative of F in the direction h.
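As a worked example (introduced here only for illustration), consider the functional
Π (u) = ∫Ω ( 1/2 |∇u|² + 1/4 u⁴ − f u ) dx,    u ∈ H₀¹(Ω),
in dimension d ≤ 3, say, so that the quartic term makes sense by Sobolev embedding. Expanding Π (u + εh) and differentiating at
ε = 0 gives
dG Π (u)[h] = ∫Ω ( ∇u · ∇h + u³ h − f h ) dx,
which is linear and bounded in h; once its continuity with respect to u is verified, the equivalence result proved below
guarantees that it is also the Fréchet derivative.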
In order to show the equivalence between Fréchet and Gâteaux derivatives, we require the mean value theorem.
Theorem 20 (Mean value). Let F : U → Y be a function that is Gâteaux differentiable in U (i.e. at every point u ∈ U). Define the
segment (the set of convex combinations)
[u, v] := {tu + (1 − t)v, t ∈ [0, 1]}.
Then, we have
∥ F (u) − F (v)∥ ≤ ( sup_{w∈[u,v]} ∥dG F (w)∥ ) ∥u − v∥.

Proof. Assume that F (u) − F (v) ̸= 0 (otherwise there is nothing to prove). We represent the norm using a corollary of the Hahn-Banach theorem: there exists ψ ∈ Y ∗ with ∥ψ∥ = 1
such that
⟨ψ, F (u) − F (v)⟩ = ∥ F (u) − F (v)∥.
Define γ(t) = tu + (1 − t)v, with t ∈ [0, 1], and h(t) = ⟨ψ, F (γ(t))⟩. Then, by linearity,

( h(t + τ ) − h(t) ) / τ = ⟨ψ, ( F (γ(t) + τ (u − v)) − F (γ(t)) ) / τ ⟩.
By definition of the Gâteaux derivative, we have

lim_{τ→0} ( h(t + τ ) − h(t) ) / τ = h′ (t) = ⟨ψ, dG F (γ(t))[u − v]⟩,
and the scalar mean value theorem yields h(1) − h(0) = h′ (θ ) for some θ ∈ (0, 1). Computing all the terms, we get

h(1) − h(0) = ⟨ψ, F (u) − F (v)⟩ = ∥ F (u) − F (v)∥,

and
h′ (θ ) = ⟨ψ, dG F (γ(θ ))[u − v]⟩ ≤ ∥ψ∥ ∥dG F (γ(θ ))∥∥u − v∥ = ∥dG F (γ(θ ))∥∥u − v∥,
where we used the duality bound |⟨ψ, y⟩| ≤ ∥ψ∥∥y∥ together with ∥ψ∥ = 1 and the boundedness of dG F (γ(θ )). Taking the supremum over θ ∈ [0, 1] completes the
proof.
Theorem 21 (Equivalence of Gâteaux and Fréchet derivatives). If the Gâteaux derivative dG F of a function F : X → Y is
continuous at u∗ ∈ X, then F is Fréchet differentiable at u∗ , and they coincide, i.e. dF (u∗ ) = dG F (u∗ ).
Proof. Because of uniqueness, we simply need to verify that the Gâteaux derivative is indeed the Fréchet one. Fix u ∈ U and
define R(h) = F (u + h) − F (u) − dG F (u)[h]. We now need to prove R(h) = o (∥ h∥), so that Ah = dG F (u)[h] in the definition
of the Fréchet derivative. The Gâteaux derivative of R at h in the direction k is
dG R(h)[k] = d/dε ( R(h + εk) ) |ε=0
= d/dε ( F (u + h + εk) − F (u) − dG F (u)[h + εk] ) |ε=0
= dG F (u + h)[k] − dG F (u)[k].
Thus, by the mean value theorem, we get

∥ R(h)∥ = ∥ R(h) − R(0)∥ ≤ ( sup_{w∈[0,h]} ∥dG R(w)∥ ) ∥h∥
= ( sup_{t∈[0,1]} ∥dG R(th)∥ ) ∥h∥
= ( sup_{t∈[0,1]} ∥dG F (u + th) − dG F (u)∥ ) ∥h∥,

and thus dividing by ∥ h∥ and taking the limit ∥h∥ → 0 yields

lim_{∥h∥→0} ∥ R(h)∥ / ∥h∥ ≤ lim_{∥h∥→0} sup_{t∈[0,1]} ∥dG F (u + th) − dG F (u)∥ = 0,

because of the continuity of dG F.


In practice, one typically computes the Gâteaux derivative and then checks (or hopes) that it is continuous, so that by the previous theorem it is also the Fréchet derivative.
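A simple way to gain confidence in a hand-computed Gâteaux derivative is a numerical remainder (Taylor) test: the quantity
F (u + th) − F (u) − t dF (u)[h] should vanish faster than t. The sketch below does this for a toy map on R⁵ chosen only for
illustration.

```python
# Remainder ("Taylor") test for a candidate derivative: R(t h) should be o(t).
import numpy as np

def F(u):                        # toy nonlinear map R^n -> R^n
    return u**3 + np.sin(u)

def dF(u, h):                    # its hand-computed directional derivative
    return 3 * u**2 * h + np.cos(u) * h

rng = np.random.default_rng(2)
u, h = rng.standard_normal(5), rng.standard_normal(5)

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    R = F(u + t * h) - F(u) - t * dF(u, h)
    print(t, np.linalg.norm(R) / t)   # tends to 0 (here at rate O(t)) as t -> 0
```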
For higher order derivatives, we start from the Fréchet derivative F ′ (u) := dF (u) ∈ L( X, Y ), where F : U → Y.
Repeating the calculation, we have
d2 F (u) = dF ′ (u),
where F ′ : U → L( X, Y ), and so dF ′ (u) ∈ L( X, L( X, Y )). Notably, the space L( X, L( X, Y )) is isometric to the space of
bounded bilinear maps X × X → Y (denoted, with a slight abuse of notation, L( X × X, Y )) through the isometry
ΨA (u1 , u2 ) = [ A(u1 )](u2 ),

for A ∈ L( X, L( X, Y )). For this reason, most people write the second derivative as d2 F (u)[h1 , h2 ], instead of (d2 F (u)[h1 ])[h2 ].
In calculus of variations, it is common to study the second variation of a functional Π. This is simply d2 Π (u)[h, h], as seen in
the second-order term of the Taylor expansion
f ( x + h) ≈ f ( x ) + ∇ f ( x ) · h + (1/2) h⊤ ( H f )( x ) h,
where we wrote d2 f = H f for the Hessian of f at x.
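Continuing the illustrative functional Π (u) = ∫Ω ( 1/2 |∇u|² + 1/4 u⁴ − f u ) dx used earlier (again, only an example), a second
differentiation gives the second variation
d2 Π (u)[h, k] = ∫Ω ( ∇h · ∇k + 3u² h k ) dx,
a symmetric bilinear form in (h, k). In a Newton method for the associated Euler-Lagrange equation, this is exactly the bilinear
form that is assembled at every iteration.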

3.1 Local analysis


3.2 Fixed point theorems
3.3 Monotone operators

4 Time dependent problems


4.1 Faedo-Galerkin and the method of lines
4.2 Space and time discretization

5 Poroelasticity
5.1 Equilibrium equations
5.2 Constitutive modeling
5.3 Darcy and Biot equations

References
[AF03] RA Adams and JF Fournier. Sobolev spaces. Elsevier, 2003.
[AP95] A Ambrosetti and G Prodi. A primer of nonlinear analysis. Number 34. Cambridge University Press, 1995.
[BS08] SC Brenner and R Scott. The mathematical theory of finite element methods. Springer, 2008.

[EBBH09] JA Evans, Y Bazilevs, I Babuška, and TJR Hughes. n-widths, sup–infs, and optimality ratios for the k-version of
the isogeometric finite element method. Computer Methods in Applied Mechanics and Engineering, 198(21-26):1726–
1741, 2009.
[EG04] A Ern and J-L Guermond. Theory and practice of finite elements, volume 159. Springer, 2004.

[Eva22] LC Evans. Partial differential equations, volume 19. American Mathematical Society, 2022.
[Gat14] GN Gatica. A simple introduction to the mixed finite element method. Theory and Applications. Springer Briefs in
Mathematics. Springer, London, 2014.
[Moi21] A Moiola. Scattering of time-harmonic acoustic waves: Helmholtz equation, boundary integral equations and
BEM. Lecture notes for the “Advanced numerical methods for PDEs” class, University of Pavia, Department of Mathemat-
ics, 2021.
[Mon03] P Monk. Finite element methods for Maxwell’s equations. Oxford University Press, 2003.
[SBH19] FJ Sayas, TS Brown, and ME Hassell. Variational techniques for elliptic partial differential equations: Theoretical tools
and advanced applications. CRC Press, 2019.

[WH03] T Warburton and JS Hesthaven. On the constants in hp-finite element trace inverse inequalities. Computer methods
in applied mechanics and engineering, 192(25):2765–2773, 2003.
