Applications of the Discrete Adjoint Method in Computational Fluid Dynamics
by
René Schneider
April 2006
The candidate confirms that the work submitted is his own and that the appropriate
credit has been given where reference has been made to the work of others.
This copy has been supplied on the understanding that it is copyright material and
that no quotation from the thesis may be published without proper
acknowledgement.
Abstract
The discrete adjoint method allows efficient evaluation of the derivative of a function
I(s) with respect to parameters s in situations where I depends on s indirectly, via an in-
termediate variable ω(s), which is computationally expensive to evaluate. In this thesis
two applications of this method in the context of computational fluid dynamics are con-
sidered. The first is shape optimisation, where the discrete adjoint approach is employed
to compute the derivatives with respect to shape parameters for a performance functional
depending on the solution of a mathematical flow model which has the form of a discre-
tised system of partial differential equations. In this context particular emphasis is given
to efficient solution strategies for the linear systems arising in the discretisation of the
flow models. Numerical results for two example problems are presented, demonstrating
the utility of the approach.
The second application, in adaptive mesh design, allows efficient evaluation of the
derivatives of an a posteriori error estimate with respect to the positions of the nodes in a
finite element mesh. This novel approach makes additional information available which
may be utilised to guide the automatic design of adaptive meshes. Special emphasis is
given to problems with anisotropic solution features, for which adaptive anisotropic mesh
refinement can deliver significant performance improvements over existing adaptive h-
refinement approaches. Two adaptive solution algorithms are presented and compared to
existing approaches by applying them to a reaction-diffusion model problem.
Acknowledgements
Firstly, I would like to thank the sponsors which made this work possible by providing
the required financial support: the University of Leeds, Advantage CFD (CASE award)
and the Engineering and Physical Sciences Research Council (fees only award). I am also
grateful to Chemnitz University of Technology for allowing me to finish this thesis under
their employment.
I’d like to thank my supervisor, Professor Peter Jimack, for enabling me to undertake
these studies, for providing me with an interesting and challenging topic, for countless
pointers to relevant literature and interesting problems, and for his enormous patience
with regard to my English language skills.
Special thanks go to Advantage CFD for their support in the development of the finite
volume discretisation used in this work, and the inspiration they gave to me by allowing
me to see some real, challenging engineering applications of PDE-based simulation.
I would also like to thank all my colleagues from the School of Computing and all my
friends in Leeds who made these three years of living in a foreign country a happy and
unforgettable experience.
Further, many thanks to my parents and to my future parents-in-law, especially for
their support during January–March 2006, when I was struggling to care for my ill girl-
friend and our eight month old baby, whilst teaching at Chemnitz University of Technol-
ogy and trying to finish this thesis.
Finally, and most of all, I would like to thank my beloved Verena. Without her this
time in Leeds would never have been such an enjoyable experience. I thank her for bearing
with me when I was too deep in thought regarding this work or one of those little other
projects, and especially I thank her for giving us the most wonderful souvenir, our lovely
son Finley who was born in Leeds in June 2005.
Declarations
Some parts of the work presented in this thesis have been published in the following
articles:
[75] R. Schneider and P.K. Jimack, Efficient preconditioning of the discrete adjoint equa-
tions for the incompressible Navier-Stokes equations, International Journal for Numeri-
cal Methods in Fluids, 47:1277–1283, 2005.
[76] R. Schneider and P.K. Jimack, Toward anisotropic mesh adaption based upon sensitivity
of a posteriori estimates, School of Computing Research Report Series 2005.03, University
of Leeds, 2005. Available at http://www.comp.leeds.ac.uk/research/pubs/reports/2005/2005_03.pdf.
Contents
1 Introduction
1.1 The Discrete Adjoint Method
2.5.2.3 Treatment of boundary cond. and ZMPC by projections
2.5.2.4 Geometric multigrid using the box-smoother
2.5.3 Finite volume method
2.5.3.1 The Fp preconditioner and the FV scheme
2.5.3.2 Geometric multigrid using the box-smoother
2.6 Application of the discrete adjoint method
2.6.1 The disc. adj. method and regularisation of incomp. Navier-Stokes
2.6.2 The discrete adjoint method applied in the finite element context
2.6.2.1 Derivatives with respect to node positions
2.6.2.2 Influence of the mesh deformation mapping
2.6.2.3 Efficient solution of the discrete adjoint equations
2.6.3 The discrete adjoint method applied in the finite volume context
2.7 Numerical examples
2.7.1 Optimisation examples
2.7.1.1 Example 2.1: Multiply connected cavity
2.7.1.2 Example 2.2: Obstacle in a channel
2.7.2 Validation of the software
2.7.2.1 Validation of the discretisation techniques
2.7.2.2 Validation of the derivatives
2.8 Conclusions
3.4.2.1 Hanging nodes
3.4.2.2 Adjoint equations for derivatives of the error estimate
3.4.2.3 Optimisation
3.4.3 Assessment of the quality of the error estimate
3.4.4 Results
4 Conclusions
4.1 Conclusions regarding efficient solvers for the linear systems
4.2 Conclusions regarding adaptive mesh design
4.3 Opportunities for future research
Bibliography
Index
List of Figures
2.22 Galerkin FEM velocity profiles (denoted by feins) for the lid driven cavity problem along the lines x = 0.5 and y = 0.5 for Re = 10 and Re = 100 (and close-up view).
2.23 SDFEM velocity profiles (denoted by feins) for the lid driven cavity problem along the lines x = 0.5 and y = 0.5 for Re = 100 and Re = 1000 (and close-up view).
2.24 FV velocity profiles for the lid driven cavity problem along the lines x = 0.5 and y = 0.5 for Re = 100 and Re = 1000 (and close-up view).
List of Tables
2.12 Iteration counts for box-smoother multigrid preconditioned GMRES, driven cavity at Re = 100, FEM discretisation, Picard linearisation, W-cycle, smoother variant sorted p-dir noVweights, various damping parameters γ
2.13 Iteration counts for box-smoother multigrid preconditioned GMRES, driven cavity at Re = 1000, FEM discretisation, Picard linearisation, W-cycle, smoother variant sorted p-dir noVweights, various damping parameters γ
2.14 Iteration counts for the FV solver
2.15 Comparison forward and adjoint solves with Fp preconditioner, cavity with obstacle (Example 2.1) at Re = 100
2.16 Example 2.1: FE-solver optimisation results for varying Reynolds numbers
2.17 Example 2.1: FV-solver optimisation results for varying Reynolds numbers
2.18 Example 2.1: FE-solver optimisation results varying mesh refinement
2.19 Example 2.1: FV-solver optimisation results for varying mesh refinement
2.20 Example 2.2: Optimisation results
2.21 Verification of adjoint derivative evaluation in the FE discretisation by comparison to finite difference values
2.22 Verification of adjoint derivative evaluation in the FV discretisation by comparison to finite difference values
Chapter 1
Introduction
The discrete adjoint method is a technique allowing efficient evaluation of the sensitivities
(derivatives) of a functional depending on the solution of a discretised partial differential
equation (PDE). In this thesis we consider two applications of this technique in the context
of computational fluid dynamics, a field which is concerned with computer methods for
approximating the solutions to mathematical flow models. Typically these mathematical
flow models are PDEs, whose solutions characterise the behaviour of the fluid in the flow
domain.
The two applications considered here are shape optimisation and adaptive mesh de-
sign. Both are fundamental in computational fluid dynamics, although on different levels.
The desire to control fluid flow, i.e. to influence how a fluid (e.g. water or air) behaves in
a flow domain, is the driving force behind all attempts to analyse and describe fluid flow.
Traditionally computational methods have been viewed as approaches to analysing fluid
behaviour (e.g. [40]), although recent work concentrates increasingly on the computational
treatment of control problems (e.g. [41]). The shape of objects in the flow domain,
or surrounding the flow domain, is one means of controlling the flow. This type of con-
trol is applied in all sorts of situations, ranging from the building of irrigation systems in
ancient times (shape=form of the duct), through water taps found in every modern house
(shape=opening/closing the valve), to sophisticated aircraft wings designed specifically
to provide the best possible performance in certain flight conditions. If one is interested
in finding the shape for such an object that delivers the best possible performance of the
fluid flow system in a quantitatively measurable sense, this leads to a shape optimisation
problem.
At the other end of the scale most computational methods for analysing flow rely
on the discretisation of PDEs. Such discretisations are defined based on a mesh, which
denotes a partitioning of the geometrical flow domain into small elements of simple ge-
ometry (e.g. triangles, quadrangles, tetrahedra, hexahedra). The size of the elements
influences the quality of the approximation to the solution of the PDE. In essence, the
smaller the elements, the more accurately the solution can be approximated. However,
making the elements smaller generally means that their number grows, implying that the
computer resources required to define and solve the resulting discrete problems grow as
well. Naturally such resources available to solve a specific problem are limited, and thus
the number of elements which may be used is limited. Adaptive mesh design essentially
seeks to find meshes which allow for the best possible approximation of the PDE solution
with a given maximum number of mesh elements, or to achieve a desired accuracy with
the smallest possible number of elements. Thus this is one way to improve the efficiency of
computer methods for fluid flow analysis.
The outline of this thesis is now described, see Figure 1.1. In the following section the
discrete adjoint method is introduced in a general form, allowing application in many pos-
sible scenarios. As the applications considered in this thesis are fundamentally different
in character, they are discussed (almost) independently, each forming the topic for one of
the two major chapters of this thesis, Chapter 2 for shape optimisation and Chapter 3 for
adaptive mesh design. Both of these chapters contain an introduction to their respective
area and define the discretisations on which the considerations are based. This is followed
by a discussion of the application of the discrete adjoint method to the respective problem.
The chapters are closed with numerical examples illustrating the utility of the approaches
under consideration. Finally, in Chapter 4, conclusions regarding both applications are
drawn, also highlighting how the techniques developed in this thesis may fit together in
order to construct efficient algorithms suited to deal with real engineering problems.
[Figure 1.1 (thesis outline). Box labels: Introduction; Discrete Adjoint Method (DAM); Examples; Examples; Conclusions.]
1.1 The Discrete Adjoint Method
method. A more detailed discussion with references to origins and related approaches
can, for example, be found in [36].
Consider a scalar-valued function, I, of an independent vector variable s, such that
\[ I(s) := \tilde{I}(\omega(s), s), \tag{1.1.1} \]
where the vector ω(s) is defined implicitly by the (possibly nonlinear) system
\[ R(\omega(s), s) = 0. \tag{1.1.2} \]
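Consider the effect of small perturbations δs of s in (1.1.1) and (1.1.2). Discarding higher
order derivative terms, such a perturbation results in a perturbation δI in I,
\[ \delta I = \frac{\partial \tilde{I}}{\partial \omega}\,\delta\omega + \frac{\partial \tilde{I}}{\partial s}\,\delta s, \tag{1.1.3} \]
and, since (1.1.2) must continue to hold,
\[ 0 = \delta R = \frac{\partial R}{\partial \omega}\,\delta\omega + \frac{\partial R}{\partial s}\,\delta s. \tag{1.1.4} \]
Because δR = 0, the term Ψ^T δR may be subtracted from (1.1.3) for an arbitrary vector Ψ, giving
\[ \delta I = \left( \frac{\partial \tilde{I}}{\partial \omega} - \Psi^T \frac{\partial R}{\partial \omega} \right) \delta\omega + \left( \frac{\partial \tilde{I}}{\partial s} - \Psi^T \frac{\partial R}{\partial s} \right) \delta s. \tag{1.1.5} \]
The first term vanishes if Ψ is chosen to satisfy
\[ \left[ \frac{\partial R}{\partial \omega} \right]^T \Psi = \left[ \frac{\partial \tilde{I}}{\partial \omega} \right]^T, \tag{1.1.6} \]
and if ω(s) is well-defined by (1.1.2) then Equation (1.1.6) uniquely defines Ψ which is
of the same dimension as ω. Equation (1.1.6) is known as the adjoint equation and Ψ as
the adjoint solution. With this choice of Ψ the perturbation δI is
\[ \delta I = \left( \frac{\partial \tilde{I}}{\partial s} - \Psi^T \frac{\partial R}{\partial s} \right) \delta s, \]
so that the total derivative of I with respect to s is
\[ \frac{DI}{Ds} = \frac{\partial \tilde{I}}{\partial s} - \Psi^T \frac{\partial R}{\partial s}. \tag{1.1.7} \]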
The importance of this representation is that, once the original equation (1.1.2) is
solved and I(s) evaluated from (1.1.1), DI/Ds may be evaluated for little more than the
cost of a single solve of the linear system (1.1.6) and a single matrix-vector product in
(1.1.7), regardless of the dimension of s. In contrast, other methods of evaluating
DI/Ds typically require one solution of (1.1.2) (or of a linearised version) per
component of s.
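The cost argument can be illustrated with a minimal sketch of (1.1.6) and (1.1.7) on a small made-up problem; it is not code from this work. It assumes a linear residual R(ω, s) = A(s)ω − b with A(s) = A0 + s_1 A_1 + s_2 A_2 and the functional Ĩ(ω, s) = ½ ω·ω + ½ s·s. The matrices, the helper `solve` and all function names are hypothetical. A single adjoint solve yields the whole gradient, which is then compared with central finite differences that need two state solves per component of s.

```python
# Toy problem (all data hypothetical): R(w, s) = A(s) w - b = 0,
# A(s) = A0 + s[0]*A_K[0] + s[1]*A_K[1],  I~(w, s) = 0.5*w.w + 0.5*s.s
A0 = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
A_K = [
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 1.0, 0.0], [-1.0, 0.0, 1.0], [0.0, -1.0, 0.0]],
]
B = [1.0, 2.0, 3.0]
N = len(B)


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(N):
        p = max(range(c, N), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, N):
            f = M[r][c] / M[c][c]
            for k in range(c, N + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * N
    for r in reversed(range(N)):
        x[r] = (M[r][N] - sum(M[r][k] * x[k] for k in range(r + 1, N))) / M[r][r]
    return x


def system_matrix(s):
    return [[A0[i][j] + sum(sk * Ak[i][j] for sk, Ak in zip(s, A_K))
             for j in range(N)] for i in range(N)]


def state(s):
    """Solve the state equation (1.1.2) for w."""
    return solve(system_matrix(s), B)


def functional(s):
    w = state(s)
    return 0.5 * sum(wi * wi for wi in w) + 0.5 * sum(sk * sk for sk in s)


def adjoint_gradient(s):
    """Gradient DI/Ds from one state solve and one adjoint solve."""
    w = state(s)
    A_T = [list(col) for col in zip(*system_matrix(s))]
    psi = solve(A_T, w)  # (1.1.6): (dR/dw)^T psi = (dI~/dw)^T = w
    grad = []
    for sk, Ak in zip(s, A_K):
        dR_dsk = [sum(Ak[i][j] * w[j] for j in range(N)) for i in range(N)]
        # (1.1.7): dI~/ds_k - psi^T dR/ds_k, with dI~/ds_k = s_k
        grad.append(sk - sum(p * r for p, r in zip(psi, dR_dsk)))
    return grad


def finite_difference_gradient(s, h=1e-6):
    """Reference gradient: two state solves per component of s."""
    grad = []
    for k in range(len(s)):
        up, dn = list(s), list(s)
        up[k] += h
        dn[k] -= h
        grad.append((functional(up) - functional(dn)) / (2 * h))
    return grad


s = [0.3, -0.2]
print("adjoint:           ", adjoint_gradient(s))
print("finite differences:", finite_difference_gradient(s))
```

With only two parameters the saving is negligible; the point of the sketch is that the work in `adjoint_gradient` does not grow with the number of linear solves as the dimension of s increases, whereas the finite difference loop does.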
Evaluating the derivative by means of (1.1.6) and (1.1.7) is called the discrete adjoint
method if R(., .) is the discretisation of a PDE and ω the discrete representation of the
solution of the PDE. The designation as discrete adjoint is used in order to distinguish this
approach from a related one, which applies the same ideas on the PDE level, resulting in
an adjoint PDE as an analogue to (1.1.6), defining an adjoint solution which is a function
defined on the PDE domain. Since the adjoint technique is applied on the continuous
level, this approach is denoted the continuous adjoint method, see [36] and references
therein for an introduction.
This relatively simple idea of the discrete adjoint method gives rise to powerful appli-
cations, two of which will be discussed in the following two chapters.
Chapter 2
This chapter is concerned with an application for which the discrete adjoint method is
particularly well suited: shape optimisation. After a brief introduction, defining the gen-
eral form of a shape optimisation problem in the context of this thesis and introducing
the mathematical models for fluid flow, different numerical approaches to shape optimi-
sation are discussed in Section 2.2. As we will see, the discrete adjoint method forms one
possible component for a class of such approaches, gradient based optimisation.
Two discretisation methods for the mathematical flow models, finite elements and
finite volumes, are introduced in Sections 2.3 and 2.4. In the context of shape optimisation
the resulting equation systems have to be solved repeatedly for variations of the domain
geometry. Thus, efficient solution strategies for these systems are of special importance.
Section 2.5 discusses two such solution strategies and the length of the section reflects its
importance, as well as the degree of difficulty this issue poses.
Application of the discrete adjoint method for the two discretisations is discussed in
Section 2.6 and the chapter is closed by numerical examples in Section 2.7 illustrating the
utility of the approaches presented in this chapter.
Chapter 2 8 Shape Optimisation and CFD
2.1 Introduction
In Subsection 2.1.1 a general shape optimisation problem is defined, which forms the
object of consideration of this whole chapter. This definition contains a general PDE sub-
problem, examples of which are given in Subsection 2.1.2 in the context of fluid dynamics.
Amongst these is the primary PDE model for this chapter, the stationary incompressible
Navier-Stokes equations.
where F denotes the shape, D the set of admissible shapes (which may contain some
geometrical and other simple constraints), L the differential operator of the PDE, Ω(F)
the PDE domain as a function of the shape, B an operator defining the boundary conditions,
and f and g given functions.
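Read together with the later references to (2.1.1b) and (2.1.1c), the symbols defined here suggest a problem of roughly the following form. This is a sketch inferred from the surrounding text rather than a quotation: the names I for the performance criterion and u for the PDE solution are taken from later sections, and writing the boundary as ∂Ω(F) is an assumption.

```latex
\begin{align*}
  \min_{\mathcal{F} \in \mathcal{D}} \quad & I(u, \mathcal{F}) \\
  \text{subject to} \quad & L(u) = f \quad \text{in } \Omega(\mathcal{F}), \\
  & B(u) = g \quad \text{on } \partial\Omega(\mathcal{F}).
\end{align*}
```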
According to [66, Section 2.6] consideration of such problems goes as far back as
the year 1910 when Hadamard, in the context of the calculus of variations, published a
formula for a shape derivative of the Green’s function of the Laplace operator. Pironneau’s
work [66] itself represents a significant contribution to this area and contains some further
historical background information, see [66, Section 2.6].
An abstract theory for problems of this kind can be found in, for example, [44], an-
swering questions regarding the existence of solutions for a broad class of problems as
well as discussing the possibilities for numerical analysis. The sufficient conditions for
the existence of solutions, as given in [44], contain two key ingredients:
The proofs for the continuity of the mapping u(F) are usually rather technical and depend,
of course, on the properties of the PDE. The importance of the compactness of the set
of admissible shapes D is demonstrated by a counterexample at the end of Chapter 1 in
[44]¹.
In the area of computational fluid dynamics (CFD) the PDE part of (2.1.1) is a set
of equations describing the motion of the fluid in the domain Ω(F). Before we continue
the consideration of problem (2.1.1) we discuss a few selected models of fluid flow and
specify which one will be used for the remainder of this work.
velocity u = [u_1, ..., u_d]^T,
temperature T,
pressure p,
density ρ and
energy E.
¹ This counterexample can be summarised as a heat transfer problem, in which a positive
constant heat production density is assumed in a domain of constant volume. If one seeks to
minimise the integral of the square of the temperature in the domain by optimising the shape
of a part of the boundary where homogeneous Dirichlet boundary conditions are applied, one
finds that oscillating boundaries result in low values of the performance function. It turns
out that the shorter the wavelength of the oscillations, the lower the value of the performance
function. Yet no boundary shape with oscillations of infinitely short wavelength exists.
Navier-Stokes Equations. The Navier-Stokes equations are considered the most general
model for fluid flow. They comprise a momentum conservation equation (2.1.2a),
a mass conservation equation (2.1.2b) and a constitutive law for the fluid behaviour, for
example (2.1.2c) for perfect gases under normal conditions:
\[ \rho \left( \frac{\partial u}{\partial t} + \sum_{i=1}^{d} u_i \frac{\partial u}{\partial x_i} \right) - \mu \Delta u - (3\lambda + \mu) \nabla (\nabla \cdot u) + \nabla p = f, \tag{2.1.2a} \]
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho u) = 0, \tag{2.1.2b} \]
\[ k \rho^{\gamma} = p. \tag{2.1.2c} \]
The form given above can be found in [93, Section 1] for example. Note that (2.1.2c)
imposes strong assumptions on the particular fluid being a gas in a narrow temperature
range. It can be replaced by other thermodynamic models for more general situations
(which may also be PDEs). Also note that (2.1.2a) assumes the fluid to be Newtonian, so
stress is assumed to be proportional to the velocity gradient. The constants λ , µ, k and γ
are model parameters.
For many flow situations specialised flow models can be derived from the general
Navier-Stokes equations. The following three models all fall into this category.
Compressible Euler Equations. Here the main assumption is that viscous effects are
negligible, for example because gases generally have a very low viscosity. In this situation
one can set µ = λ = 0 in (2.1.2), which gives
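\[ \rho \left( \frac{\partial u}{\partial t} + \sum_{i=1}^{3} u_i \frac{\partial u}{\partial x_i} \right) + \nabla p = f, \tag{2.1.3a} \]
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho u) = 0, \tag{2.1.3b} \]
\[ k \rho^{\gamma} = p. \tag{2.1.3c} \]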
The Euler equations (2.1.3) are considered to provide a good model for fluids with low
viscosity in high speed flows (e.g. air around a wing at high speeds, if viscous effects in
the boundary layer are not of interest). Again we emphasise that (2.1.3c) imposes strong
assumptions on the particular fluid being a gas in a narrow temperature range and can be
replaced by other thermodynamic models if necessary.
Incompressible Navier-Stokes Equations. For liquids in general and for gases in slow
flows (typically at Mach numbers below about 0.3) incompressibility of the fluid may be
assumed, leading to ρ = const as replacement for the constitutive law (2.1.2c). This
simplifies the Navier-Stokes equations (2.1.2) to
\[ \frac{\partial u}{\partial t} - \nu \Delta u + u \cdot \nabla u + \nabla p = f, \tag{2.1.4a} \]
\[ \nabla \cdot u = 0. \tag{2.1.4b} \]
Again we see a momentum equation (2.1.4a) and an equation of mass conservation (2.1.4b).
The constant ν is the kinematic viscosity, defined as ν = µ/ρ, where µ is the viscosity
parameter which appeared in (2.1.2). The constant ν is closely related to the Reynolds
number Re = Vℓρ/µ, a dimensionless constant characterising the flow. If the characteristic
length ℓ and characteristic velocity V of the flow are constant, ν is proportional to
1/Re.
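As a small worked example of these two definitions, the following sketch evaluates ν and Re for illustrative values. The numbers are assumptions chosen to be roughly water-like and to give a moderate Reynolds number; they are not taken from the examples in this work, and the function names are hypothetical.

```python
import math


def kinematic_viscosity(mu, rho):
    """nu = mu / rho."""
    return mu / rho


def reynolds_number(V, ell, rho, mu):
    """Re = V * ell * rho / mu, equivalently V * ell / nu."""
    return V * ell * rho / mu


# Illustrative values only (SI units): density, dynamic viscosity,
# characteristic velocity and characteristic length.
rho, mu = 1000.0, 1.0e-3
V, ell = 0.01, 0.01

nu = kinematic_viscosity(mu, rho)
Re = reynolds_number(V, ell, rho, mu)
print(f"nu = {nu:.3e} m^2/s, Re = {Re:.1f}")

# Halving the viscosity at fixed V and ell doubles the Reynolds number.
print(math.isclose(reynolds_number(V, ell, rho, mu / 2), 2 * Re))
```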
Very often a slightly different form of the incompressible Navier-Stokes equations can
be found in the literature, which uses a nonlinear convection term ∇.(u u^T) that is derived
in a different way,
\[ \frac{\partial u}{\partial t} + \nabla p + \nabla \cdot (u u^T) = \nu \Delta u + f, \tag{2.1.5a} \]
\[ \nabla \cdot u = 0. \tag{2.1.5b} \]
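That the two convection terms agree for divergence-free velocity fields can be seen from a short componentwise calculation. The following is a standard identity, given as a sketch under the assumptions that u is sufficiently smooth and that the divergence of the tensor u u^T is taken over its first index (which is immaterial here, since u u^T is symmetric).

```latex
\begin{align*}
\left[ \nabla \cdot (u u^T) \right]_j
  &= \sum_{i=1}^{d} \frac{\partial (u_i u_j)}{\partial x_i}
   = \sum_{i=1}^{d} u_i \frac{\partial u_j}{\partial x_i}
     + u_j \sum_{i=1}^{d} \frac{\partial u_i}{\partial x_i} \\
  &= \left[ u \cdot \nabla u \right]_j + u_j \, (\nabla \cdot u).
\end{align*}
```

The last term vanishes whenever ∇.u = 0 holds, so in that case the convection term of (2.1.5a) coincides with that of (2.1.4a).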
ary state for t → ∞ and one is only interested in the long term behaviour of the system
rather than possible initial temporal effects, then it makes sense to drop the time deriva-
tive term from (2.1.5) (or (2.1.4) respectively) and only analyse the resulting stationary
incompressible Navier-Stokes equations.
Stokes Equations. The Stokes equations deal with the opposite extreme to the Euler
equations. Here one assumes that viscous effects dominate to such an extent that inertia
effects are negligible. In such regimes compressible effects can usually be neglected as
well, so it is appropriate to further simplify the incompressible models by neglecting the
inertia terms u.∇u in (2.1.4) (or ∇.(u u^T) in (2.1.5)), leading to
\[ \rho \frac{\partial u}{\partial t} - \mu \Delta u + \nabla p = f, \tag{2.1.6a} \]
\[ \nabla \cdot u = 0. \tag{2.1.6b} \]
Note that, of the fluid flow models presented in this section, the Stokes equations (2.1.6)
are the only one featuring a fully linear differential operator; all other models are nonlinear
PDEs.
sidered, which attempt to model the effects of sub-grid time and length scale effects by
analytical as well as empirical models. This is the area of turbulence modelling. Various
approaches exist, some of which claim to be successful for a wide variety of engineering
applications, see [105, Section 1.13] for a list of references. However, for the purpose
of this work we will not utilise such turbulence models but restrict ourselves to low or
moderate Reynolds number regimes, where these additional models are not necessary,
and where stable steady state solutions exist.
2.2.1 Preliminaries
Discretisation of problem (2.1.1) generally requires two distinct sub-problems to be con-
sidered:
For each part of the discretisation various possibilities arise, resulting in different ap-
proaches with varying degrees of interaction between the discretisations. In this section
we consider a selection of possible approaches and highlight some of their advantages
and disadvantages.
The need for discretisation of the shape may appear surprising at first, so we briefly
illustrate this point. The shape of a domain is generally defined by its boundary. For
a two dimensional domain this boundary is a curve and for three dimensional domains
it is a surface (not necessarily simply connected). Assuming sufficient smoothness, parts
of the boundary can be considered as functions of the form ℝ^{d−1} → ℝ^d fulfilling certain
smoothness criteria. As spaces of such functions are generally infinite dimensional, it
becomes clear that some kind of discretisation is required in order to allow numerical
calculations in finite dimensional spaces.
If the discretisation of the shape is treated separately from the discretisation of the
PDE, parametric domain or parametric shape definitions result, leading to parametric PDE
problems. In many engineering situations parametric shape definitions may be the most
natural choice anyway, as manufacturing capabilities restrict the variety of possible shapes
from the outset, i.e. the parameterisation may be given by the manufacturing process. In
the general case one can define a family of shape parameterisations F_κ, with κ > 0 denoting
a discretisation parameter, such that approximation properties improve as κ → +0.
This approach is for example considered in [44], from which a key result regarding the
convergence of the discrete solutions to solutions of the continuous problem is reproduced
in Appendix A. In essence, if the continuous and discrete PDE problems are well-posed
for all feasible domains, the PDE discretisation is compatible with the shape discretisa-
tion, the discrete shapes are asymptotically dense in the space of feasible shapes, the set of
feasible shapes is compact and the performance criterion is continuous, then any sequence
of optimal discrete shapes {F_κ}, κ → +0, contains a subsequence which converges to
a locally optimal solution of (2.1.1) and any accumulation point of such a sequence is a
locally optimal solution of (2.1.1).
From here on the shape will be understood as a parametric shape, unless stated oth-
erwise. The terms shape and shape parameters will be used as synonyms and the symbol
F will denote a shape parameter vector.
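The idea of a family of shape parameterisations whose approximation properties improve as κ → +0 can be illustrated by a minimal sketch. It assumes, purely for illustration, that part of the boundary is the graph of a function over [0, 1] and that the shape parameter vector F holds its heights at m + 1 equally spaced points joined by straight segments, with κ = 1/m. The target curve and all function names are hypothetical and are not the parameterisations used in this work.

```python
import math


def boundary(x, F):
    """Height of a piecewise-linear boundary at x in [0, 1] for parameters F."""
    m = len(F) - 1
    i = min(int(x * m), m - 1)   # index of the segment containing x
    t = x * m - i                # local coordinate within the segment
    return (1.0 - t) * F[i] + t * F[i + 1]


def fit_parameters(target, m):
    """Shape parameters obtained by sampling a smooth target boundary."""
    return [target(i / m) for i in range(m + 1)]


def max_deviation(target, F, samples=1000):
    """Largest sampled distance between the parametric and the target boundary."""
    return max(abs(boundary(j / samples, F) - target(j / samples))
               for j in range(samples + 1))


def target(x):
    """A smooth bump, standing in for some ideal boundary shape."""
    return 0.1 * math.sin(math.pi * x)


for m in (2, 4, 8, 16):
    F = fit_parameters(target, m)
    print(f"kappa = 1/{m}: {len(F)} parameters, "
          f"max deviation {max_deviation(target, F):.2e}")
```

The deviation shrinks as more parameters are used, at the price of a higher dimensional optimisation problem.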
A systematic overview of some of the most important choices in designing a shape
optimisation algorithm is given in Figure 2.1 in the form of a decision tree. The following
subsections discuss each of the possible choices in Figure 2.1 in some more detail and
define the approach taken in this work. In part we follow a similar discussion to that of
[41, Chapter 2], while we aim to give a short presentation highlighting properties of the
approaches from the viewpoint required for the work described in this thesis.
Note that most of the choices are based on assumptions concerning the typical proper-
ties of problem (2.1.1). Thus, for a given specific problem a different choice may be more
favourable if the properties of this specific problem do not coincide with assumptions of
a typical problem made in this work. Such assumptions will be stated and justified where
they are used.
simplex method [59], genetic algorithms (GA) or evolutionary algorithms (EA) which
hold a population of competing solution candidates at each step (e.g. [25] for an introduc-
tion), or methods based on continuously updated low order polynomial approximations
of the cost function (e.g. [24, 67]). All of these methods have in common that they use
stored data of objective function evaluations for different parameter values in order to
determine new candidate solutions by some mechanism. In a sense the local behaviour
of the function is deduced from evaluated function values at different points in parameter
space. Thus, in high dimensional parameter spaces these methods require a rather large
number of function evaluations to even identify a downhill direction, let alone get close
to an extremum.
An advantage generally attributed to these methods (e.g. [24, 25]) is that they tend to
be harder to distract by local extrema and more likely to find solutions that are good in a
global sense, while gradient based methods tend to get attracted to a local extremum close
to the initial guess.
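The dependence of gradient based methods on the initial guess is easy to reproduce on a one-dimensional function with two minima. The following is a minimal sketch using plain gradient descent with a fixed step length; the function, the step length and the starting points are made up for illustration only.

```python
def f(x):
    """Objective with a local minimum near x = 1.13 and a lower one near x = -1.30."""
    return x**4 - 3.0 * x**2 + x


def df(x):
    """Derivative of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0


def gradient_descent(x0, step=0.01, iterations=2000):
    """Plain gradient descent with a fixed step length."""
    x = x0
    for _ in range(iterations):
        x -= step * df(x)
    return x


x_right = gradient_descent(2.0)    # settles in the nearby, higher minimum
x_left = gradient_descent(-2.0)    # settles in the lower minimum
print(f"start  2.0 -> x = {x_right:.4f}, f = {f(x_right):.4f}")
print(f"start -2.0 -> x = {x_left:.4f}, f = {f(x_left):.4f}")
```

Both runs converge to a stationary point, but only the run started at −2.0 reaches the lower of the two minima.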
rants further comment here, is the method of sequential quadratic programming (SQP).
This method forms the basis for the most successful general optimisation codes which can
handle general nonlinear objective functions and nonlinear equality as well as nonlinear
inequality constraints [61, Chapter 18]. In SQP methods an auxiliary optimisation prob-
lem, with a quadratic objective function and linear constraints, is solved at each step of
the iterative process, defining a search direction for a line search, which in turn defines the
new iterate. These auxiliary quadratic optimisation problems are defined by the Hessian
of the objective function (or the BFGS approximation of it) and the first order derivatives
of the objective function and constraints. Thus, the problem of solving a general nonlinear
optimisation problem is reduced to solving a sequence of quadratic optimisation problems
which are well understood and for which various solution methods exist [61].
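In the notation common in the general optimisation literature (the symbols below are assumptions and are not used elsewhere in this text), the auxiliary problem solved at the iterate x_k for an objective f with equality constraints c_i(x) = 0, i ∈ E, and inequality constraints c_i(x) ≥ 0, i ∈ I, can be sketched as

```latex
\begin{align*}
  \min_{d} \quad & \nabla f(x_k)^T d + \tfrac{1}{2}\, d^T H_k\, d \\
  \text{subject to} \quad
    & \nabla c_i(x_k)^T d + c_i(x_k) = 0, \quad i \in \mathcal{E}, \\
    & \nabla c_i(x_k)^T d + c_i(x_k) \ge 0, \quad i \in \mathcal{I},
\end{align*}
```

where H_k denotes the Hessian or its BFGS approximation (in many formulations this is the Hessian of the Lagrangian rather than of the objective alone). The solution d_k is the search direction, and a line search along it yields the new iterate x_{k+1} = x_k + α_k d_k.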
• as an optimisation problem where the solution operator for the PDE (2.1.1b) and
(2.1.1c) is part of the objective function.
Further constraints resulting from the set of admissible shapes D are set aside for now, to
concentrate on the way in which the PDE constraints are incorporated into the problem.
This classification applies both to the continuous problem (2.1.1), with continuous shape
F and PDE constraints, and to the analogous discrete problem with discrete shape
F_κ and discretised PDEs as constraints. Depending on which understanding of problem
(2.1.1) is adopted, different approaches to solving the problem emerge.
The second category in the classification above can be understood as eliminating the
equality constraint from the optimisation problem by redefining the objective function as
a combination of the solution operator and the original objective function. This approach
has long been known in the optimisation community [61], with both advantages and dis-
advantages associated with it. The main advantage is that it reduces the dimension of the
optimisation problem, since the dependent variables u are eliminated from it. However,
the original structure of the problem is lost and ill-conditioning can be introduced into the
problem [61, Section 15.2]. In our context we call this the uncoupled approach, whereas
we call the treatment as an equality constrained optimisation problem the fully coupled
approach.
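The structure of the uncoupled approach can be sketched on a toy problem with one state variable and one parameter. The sketch assumes a made-up nonlinear state equation R(u, F) = u + F u³ − 1 = 0 standing in for the discretised PDE and a made-up performance criterion (u − 0.5)²; the golden section search stands in for an off-the-shelf optimiser, and all names are hypothetical. The optimiser only ever sees the reduced objective, inside which the state equation is solved to high accuracy.

```python
def residual(u, F):
    """Toy state equation R(u, F) = 0, standing in for the discretised PDE."""
    return u + F * u**3 - 1.0


def solve_state(F, tol=1e-12, max_iter=50):
    """Solution operator: Newton's method for R(u, F) = 0 with F fixed."""
    u = 1.0
    for _ in range(max_iter):
        r = residual(u, F)
        if abs(r) < tol:
            break
        u -= r / (1.0 + 3.0 * F * u**2)   # dR/du = 1 + 3 F u^2
    return u


def performance(u, F):
    """Original objective, depending on the state u (and possibly on F)."""
    return (u - 0.5) ** 2


def reduced_objective(F):
    """Objective with the state eliminated through the solution operator."""
    return performance(solve_state(F), F)


def golden_section(fun, a, b, tol=1e-6):
    """Derivative-free minimisation of a unimodal function on [a, b]."""
    g = (5 ** 0.5 - 1) / 2
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if fun(c) < fun(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)


F_opt = golden_section(reduced_objective, 0.0, 10.0)
print(f"F_opt = {F_opt:.5f}, u(F_opt) = {solve_state(F_opt):.5f}")
```

For this toy problem the state u = 0.5 is attained at F = 4, which the nested solve recovers. Every evaluation of `reduced_objective` hides a full nonlinear solve, which is the cost of the modularity.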
In the context of shape optimisation problems the advantages and disadvantages of
both approaches are more pronounced than for moderately dimensional general problems.
A major argument in favour of the fully coupled approach is that it may potentially require
fewer solves of linearised equations than the uncoupled approach. This is because each
iteration of the optimisation solver requires essentially only one such solve, while in the
uncoupled approach the potentially nonlinear PDE (or its discretisation) has to be solved
to a relatively high accuracy, usually requiring multiple solves of linear systems per step
of the optimisation solver. However, the nesting of nonlinear solves in the uncoupled
case does have significant advantages as well. First of all this approach is more modular,
which simplifies code development. Further, the linear systems that have to be solved are
usually much smaller due to the un-coupling.
The modularity can be a huge advantage. For example it may be possible to re-use
existing simulation codes inside “off the shelf” optimisation software, e.g. DONLP2 [84].
Thus one can rely on robust existing codes. In contrast, the potential advantage of the
fully coupled approach, of requiring fewer solves, may fade away quickly if the coupled
solve converges slowly or does not even converge at all. The convergence properties of
this coupled system may be fundamentally different to those of the individual systems.
Overall we conclude that the modularity and simplicity of the uncoupled approach
outweighs the drawbacks of this approach, although this is clearly a subjective judgement.
Thus, the fully coupled approach is not considered further in this work. However, we
remark that the interested reader may find material on the fully coupled approach in [41]
and the individual articles in [16], for example.
respect to the shape parameters F in some way. The gradient of the performance crite-
rion I is then evaluated by applying the chain rule of differentiation, using the sensitivity
∂ u/∂ F , i.e.
\[
\frac{DI(u(F), F)}{DF} = \frac{\partial I}{\partial u}\,\frac{\partial u}{\partial F} + \frac{\partial I}{\partial F}.
\]
A detailed discussion of this approach can be found for example in [86], which also
answers theoretical questions concerning the differentiability of u with respect to the pa-
rameters.
In contrast to the sensitivity equation approach, adjoint equation methods omit the
sensitivities of u and instead compute an adjoint solution Ψ as explained in Section 1.1.
The relationship (1.1.7)
\[
\frac{DI(u(F), F)}{DF} = \frac{\partial I}{\partial F} - \Psi^T \frac{\partial R}{\partial F}
\]
suggests that the adjoint solution can be thought of as a sensitivity measure of the per-
formance criterion I with respect to changes in the residuals of the (possibly discretised)
PDE. Computation of the gradient is then completed by computing the derivatives of the
(discretised) PDE residual and weighting them with Ψ.
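The step from the chain rule to the adjoint form of the gradient can be written out in two lines. The following sketch assumes that the (discretised) PDE is written as R(u(F), F) = 0, that ∂R/∂u is non-singular, and that Ψ is defined by an adjoint equation of the form shown; the notation of Section 1.1 may differ, but the signs are chosen to agree with (1.1.7).

\[
\frac{\partial R}{\partial u}\,\frac{\partial u}{\partial F} = -\frac{\partial R}{\partial F},
\qquad
\left(\frac{\partial R}{\partial u}\right)^T \Psi = \left(\frac{\partial I}{\partial u}\right)^T,
\]
\[
\frac{\partial I}{\partial u}\,\frac{\partial u}{\partial F}
= \Psi^T \frac{\partial R}{\partial u}\,\frac{\partial u}{\partial F}
= -\Psi^T \frac{\partial R}{\partial F}
\quad\Longrightarrow\quad
\frac{DI}{DF} = \frac{\partial I}{\partial F} - \Psi^T \frac{\partial R}{\partial F}.
\]

The first identity is obtained by differentiating R(u(F), F) = 0 with respect to F; the sensitivity ∂u/∂F then drops out of the final expression.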
Let us assume for one moment that the gradients of more than one, say m, criteria Ii
are required in the optimisation algorithm. This may for example be the case if there are
multiple objectives for the optimisation (see [58] for an introduction), or if the derivatives
of constraints which depend on the PDE solution u have to be evaluated as well as those
of the performance criterion.
The important difference between the two approaches in this subsection is that the
adjoint solution Ψ is specific to the performance criterion, i.e. one Ψ is computed per
Ii and used in the computation of all components of the gradient DIi /DF , while the
sensitivities of u are specific to the parameter components, i.e. one sensitivity ∂ u/∂ F j is
computed for each component of the parameter vector F and can be used to calculate the
derivatives of any performance criterion DIi /DF j . The costs for solving the adjoint or
sensitivity equations are comparable, and are the most significant costs for either method.
Thus, it is advantageous to use the sensitivity equation approach if m is greater than the
number of parameters F j , while it is more efficient to use the adjoint approach if the
number of parameters F j is greater than m.
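The cost argument above can be illustrated on a small linear model. The sketch below is not taken from the thesis: the model R(u, F) = A u − B F, the criteria I_i = C[i]·u + D[i]·F, all matrix names and the sizes are assumptions chosen for illustration, and the adjoint equation is assumed in the form (∂R/∂u)^T Ψ = (∂I/∂u)^T. It shows that both approaches give the same gradient, while the number of linear solves is the number of parameters for the sensitivity approach and m for the adjoint approach.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_param, m = 8, 5, 1   # illustrative sizes: states, parameters, criteria

# Hypothetical linear model: R(u, F) = A u - B F = 0, I_i(u, F) = C[i] . u + D[i] . F
A = rng.standard_normal((n_state, n_state)) + n_state * np.eye(n_state)  # dR/du
B = rng.standard_normal((n_state, n_param))
C = rng.standard_normal((m, n_state))    # rows are dI_i/du
D = rng.standard_normal((m, n_param))    # rows are the partial derivatives dI_i/dF
dR_dF = -B

# Sensitivity approach: one linear solve per parameter (n_param right-hand sides).
du_dF = np.linalg.solve(A, -dR_dF)
grad_sensitivity = C @ du_dF + D

# Adjoint approach: one linear solve per criterion (m right-hand sides).
Psi = np.linalg.solve(A.T, C.T)          # columns are the adjoint solutions
grad_adjoint = D - Psi.T @ dR_dF

solves_sensitivity, solves_adjoint = n_param, m
print(np.allclose(grad_sensitivity, grad_adjoint), solves_sensitivity, solves_adjoint)
```

With m = 1 and five parameters the adjoint route needs one solve with the transposed matrix, whereas the sensitivity route needs five solves with the original matrix.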
The case of multi-objective optimisation can be reduced to a series of single-objective
optimisation problems by means of the weighting method [58], so this will not be
considered further in this work. Moreover, the general form of the shape optimisation problem
considered in this thesis, (2.1.1), contains only one performance criterion I and no con-
straints depending on the PDE solution u. Thus, m = 1 for the problems of interest. At
the same time there can be a large number of shape parameters resulting from the shape
discretisation. Therefore, for this type of problem the adjoint approach is usually the more
efficient choice.
However, it has to be mentioned that for time dependent problems the adjoint solu-
tion has to be calculated backward in time, starting from the final time of the simulation.
For nonlinear problems the primal solution u will be required to formulate the linearised
problems at each time-step. This means that either the solution has to be stored during the
forward solve, resulting in enormous storage requirements, or has to be recomputed by
additional forward solves during the adjoint solve stage, resulting in prohibitive computa-
tional costs. This can be a major drawback, rendering the adjoint approach less attractive
for time dependent problems. A good approach to limit the resulting problems is to store
the solution at appropriately selected points in time and recompute it over small time
intervals. This way the storage, as well as the computational costs, can be bounded to
moderate values [46].
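The trade-off between storage and recomputation can be sketched for a scalar model problem. The example below is an illustration only: the model du/dt = −u², the explicit Euler stepping, the criterion I = u_N and the uniform checkpoint spacing are all assumptions, and the schedule is a simple one rather than necessarily the strategy of [46]. The adjoint sweep runs backward in time and needs the primal state at every step, which is either stored in full or recomputed segment by segment from stored checkpoints.

```python
def step(u, tau):
    """One explicit Euler step of du/dt = -u**2 (illustrative nonlinear model)."""
    return u - tau * u * u

def dstep_du(u, tau):
    """Derivative of step() with respect to u; it needs the primal state u."""
    return 1.0 - 2.0 * tau * u

def adjoint_full_storage(u0, tau, n_steps):
    """Store the whole forward trajectory, then sweep backward."""
    trajectory = [u0]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], tau))
    lam = 1.0  # dI/du_N for the criterion I = u_N
    for k in reversed(range(n_steps)):
        lam *= dstep_du(trajectory[k], tau)
    return lam, len(trajectory)

def adjoint_checkpointed(u0, tau, n_steps, stride):
    """Store every stride-th state; recompute each segment during the backward sweep."""
    checkpoints = {}
    u = u0
    for k in range(n_steps):
        if k % stride == 0:
            checkpoints[k] = u
        u = step(u, tau)
    lam, peak_segment = 1.0, 0
    for start in sorted(checkpoints, reverse=True):
        end = min(start + stride, n_steps)
        segment = [checkpoints[start]]
        for _ in range(end - start - 1):       # short forward re-solve
            segment.append(step(segment[-1], tau))
        peak_segment = max(peak_segment, len(segment))
        for u_k in reversed(segment):          # backward sweep over the segment
            lam *= dstep_du(u_k, tau)
    return lam, len(checkpoints) + peak_segment

grad_full, stored_full = adjoint_full_storage(1.0, 0.01, 100)
grad_ckpt, stored_ckpt = adjoint_checkpointed(1.0, 0.01, 100, 10)
print(grad_full, grad_ckpt, stored_full, stored_ckpt)
```

In this sketch the checkpointed variant holds at most 20 states at a time instead of 101, at the price of roughly one additional forward solve.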
Finally we comment on how finite difference and automatic differentiation fit into this
classification. If the derivatives DI/DF are to be approximated by finite differences, then
the PDE has to be solved for perturbations of each parameter, F̃_j = F_j ± h, resulting in
perturbed values of the solution u and performance criterion I. Even though approxi-
mations to ∂ u/∂ F j may not be computed explicitly, this can be classed as a sensitivity
equation approach, because an equation is solved for each parameter F j and the perturbed
solutions are an analogue to the sensitivity ∂ u/∂ F j .
Automatic differentiation, i.e. the differentiation of computer programs by software
techniques, can be used to implement either of these classes depending on the method that
is used. In automatic differentiation the so-called forward and reverse modes exist (e.g.
[39]), where the forward mode corresponds to the sensitivity equation approach and the
reverse mode to the adjoint equation approach.
the continuous level, using either of the approaches in Subsection 2.2.4. This results in a
set of PDEs whose solution is used to determine the derivatives DI/DF . These PDEs are
then discretised and solved to obtain approximations to DI/DF .
In contrast, in the discretise then differentiate approach the PDE system (2.1.1b) and
(2.1.1c) is discretised first, resulting in a set of algebraic equations and algebraic expres-
sions for Ih , which serves as an approximation for I. These algebraic equations, or even
a computer program that solves them, are then differentiated using one of the approaches
from Subsection 2.2.4. This way a set of equations is obtained which determines an ap-
proximation of DI/DF .
One apparent drawback of the disc-diff variant is that not only the performance func-
tional I but also the discretisation error (Ih − I) is differentiated. Thus if the sole interest
is to approximate DI/DF this approach is not necessarily the best one. However, as the
resulting approximation is the derivative of Ih , this approach results in a derivative which
is consistent with Ih , allowing good performance for optimisation algorithms for Ih . In
contrast, if the diff-disc variant is employed, then the approximation (DI/DF )h is not
necessarily the derivative of Ih . Thus, if this approximation of the gradient is used, the
optimisation algorithm is likely to run into difficulties because the descent direction
indicated by (DI/DF )h is not really a descent direction for Ih . Typically this leads to a
slowdown in progress and ultimately to a breakdown of the optimisation algorithm.
If an adjoint approach is employed, the diff-disc approach may present an advantage,
because in this approach the mesh for the adjoint problem may be different to the mesh
used for the forward simulation. This allows one to use different adaptive meshes to
obtain good approximations for I as well as DI/DF .
From a software development point of view the disc-diff approach possesses several
major advantages over the diff-disc variant. A logical first step in attempting shape op-
timisation for a problem is to develop software that allows one to approximate I for a
given geometry, which is usually done by discretising the PDE problem. Once this step is
mastered one is presented with the choice of diff-disc or disc-diff. The diff-disc approach
requires one to differentiate the PDE, which results in a new set of PDEs. The structure
of these PDEs may be different to that of the original PDE, thus requiring one to develop
a new discretisation scheme that is suitable for them. As a result one may end up writing
an essentially new code for the differentiated PDEs.
In the case of the disc-diff variant it is easier to build upon the existing code. The solu-
tion strategy that is employed for the primal simulation can be adapted in a straightforward
manner to solve the differentiated problem. Also, the definition of the differentiated prob-
lem builds upon the existing software, as all that is required of the additional parts is to
deliver derivatives of the functionalities found in the existing code. This situation is ideal
for the application of automatic differentiation software. Even if automatic differentiation
cannot be applied, the correspondence to the original code is an enormous advantage, as
it allows verification of the developed new routines by comparing the computed derivative
values to those obtained by applying finite differences to the original routines. Thus every
step in the development can be verified independently.
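The verification step described above can be sketched as follows. The residual function, its hand-coded Jacobian and all names are hypothetical stand-ins for routines of an existing code; only the procedure, comparing a hand-coded derivative against central finite differences of the original routine, reflects the text.

```python
import numpy as np

def residual(u, s):
    """Hypothetical two-equation residual depending on state u and parameter s."""
    return np.array([u[0] ** 2 + s * u[1] - 1.0,
                     np.sin(u[0]) + s ** 2 * u[1]])

def residual_du(u, s):
    """Hand-coded Jacobian dR/du: the new routine that is to be verified."""
    return np.array([[2.0 * u[0], s],
                     [np.cos(u[0]), s ** 2]])

def fd_jacobian(fun, u, h=1e-6):
    """Central finite difference approximation of d fun / d u, column by column."""
    u = np.asarray(u, dtype=float)
    columns = []
    for j in range(u.size):
        e = np.zeros_like(u)
        e[j] = h
        columns.append((fun(u + e) - fun(u - e)) / (2.0 * h))
    return np.column_stack(columns)

u_test, s_test = np.array([0.7, -1.3]), 0.4
J_exact = residual_du(u_test, s_test)
J_fd = fd_jacobian(lambda u: residual(u, s_test), u_test)
max_deviation = np.max(np.abs(J_exact - J_fd))
print(max_deviation)
```

A deviation far above the expected finite difference error would point to a bug in the derivative routine, and the check can be repeated for each routine in isolation.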
This, along with the above observations concerning the consistency of the gradients,
are the reasons that we prefer the disc-diff approach and therefore choose this approach
for the remainder of this work.
Finally we remark that both the automatic differentiation and finite differences ap-
proaches from Subsection 2.2.4 are inherently of the disc-diff type.
as discussed in [87] for example. Unfortunately, such deformations may decrease the
mesh quality, e.g. interior angles of triangles may increase with the deformation and get
close to π. For strong deformations cells may collapse or even change orientation. Even
though the application of the equations of linear elasticity does not prevent such degra-
dation of mesh quality, it guarantees that for sufficiently small changes of the boundary
geometry, the degradation in the interior is small as well. Thus a neighbourhood of the
base geometry may be implicitly defined in which the quality changes of the mesh are
tolerable. Furthermore, to compute such a deformation may be significantly faster than
generating a whole new mesh for each geometry.
A practical shape optimisation algorithm may use a hybrid approach, using mesh de-
formation so long as it produces meshes of tolerable quality, switching to a new base mesh
as and when the deformations become too strong. This way it is guaranteed that for any
given discrete shape Fκ there exists a neighbourhood in which the discrete performance
function Ih (Fκ ) is smooth. Discontinuities only occur when the base geometry changes.
Whether these discontinuities present a problem for the optimisation algorithm or not de-
pends on the local behaviour of the underlying smooth function I(Fκ ) and the size of the
jump that is caused by the discontinuity.
To illustrate this, the top part of Figure 2.2 contains a hypothetical example of a
smooth relationship between a parameter and the performance criterion I(Fκ ) (smooth
function), a discontinuous discrete approximation Ih (Fκ ) which may result from discon-
tinuous dependency of the mesh on the parameter (non-smooth-approx), and a piecewise
smoothed approximation Ih (Fκ ) which may result from the hybrid approach as described
above (deformation-smoothed). An attempted step of an optimisation algorithm using
the hybrid approach may then successfully pass one of the remaining discontinuities if
it yields a reduction of the piecewise smoothed performance criterion Ih (Fκ ) as in the
top part of Figure 2.2, or it may fail to pass the discontinuity because the approximated
performance at the new candidate solution is increased, even though for the underlying
smooth performance criterion the parameter change would result in a reduction of the per-
formance criterion as in the bottom part of Figure 2.2. A situation similar to that depicted
in the bottom part of Figure 2.2 may even cause a break-down of the optimisation algo-
rithm, if even arbitrarily small steps in the downhill direction indicated by the gradient of
the smoothed function pass a jump discontinuity which increases the approximated per-
formance function. Thus, the hybrid approach does not resolve the problem completely,
but represents an improvement compared to using automatic mesh generation alone. Note
that the accuracy of the results of optimising the discrete performance function is always
limited by the accuracy of this discrete approximation to the continuous performance
function. The better the accuracy of the latter approximation, the fewer difficulties of this
kind can arise.
[Figure 2.2: two panels, each showing the curves "smooth function", "non-smooth-approx" and "deformation-smoothed" for parameter values between 1.3 and 1.9 and function values between −1.01 and −0.94. Top panel: successful step, marked points (1.362, −0.9781) and (1.508, −0.997). Bottom panel: non-successful step, marked points (1.459, −0.9955) and (1.483, −0.9946).]
For the d dimensional domain Ω the bilinear forms a(., .) and b(., .) are defined as
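\[
a(u, v) := \int_\Omega \nu \, \operatorname{grad} u : \operatorname{grad} v \, d\Omega \tag{2.3.2}
\]
\[
\phantom{a(u, v)} := \int_\Omega \nu \sum_{i,j=1}^{d} \frac{\partial u_i}{\partial x_j}\,\frac{\partial v_i}{\partial x_j} \, d\Omega, \tag{2.3.3}
\]
\[
b(u, q) := \int_\Omega q \, \operatorname{div} u \, d\Omega. \tag{2.3.4}
\]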
A discretisation of (2.3.1) is defined in the usual way, i.e. by replacing the infinite
dimensional function spaces H^1_g(Ω) and L^2_0(Ω) by finite dimensional subspaces, V^h_g ⊂ H^1_g(Ω)
and S^h_0 ⊂ L^2_0(Ω) respectively. Thus, the discrete problem is: find (u_h, p_h) ∈ V^h_g × S^h_0 such
that
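\[
V^h = \operatorname{span}(\Phi^h), \qquad
\Phi^h := \left[\varphi_1^1, \dots, \varphi_N^1, \varphi_1^2, \dots, \varphi_N^2, \dots, \varphi_N^d\right]^T, \tag{2.3.8a}
\]
\[
S^h = \operatorname{span}(\Theta^h), \qquad
\Theta^h := \left[\theta_1, \dots, \theta_M\right]^T, \tag{2.3.8b}
\]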
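\[
\exists \gamma > 0, \text{ independent of } h: \quad
\inf_{\substack{q_h \in S_0^h \\ q_h \neq 0}} \;
\sup_{\substack{v_h \in V_0^h \\ v_h \neq 0}}
\frac{b(v_h, q_h)}{|v_h|_1 \, \|q_h\|_0} \geq \gamma,
\]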
to allow a stable discretisation. A number of pairs of spaces Vhg and S0h are known to
produce unstable discretisations, like for example the combination of piecewise linear
velocity and piecewise linear pressure approximations on the same mesh. A further dis-
cussion of this matter, including techniques for proving div-stability, error estimates, an
overview of known stable and unstable element pairs, stabilisation methods for unstable
discretisations and references to the original research papers can for example be found in
[40].
For the purpose of this work the div-stable Taylor-Hood element pair is employed,
i.e. continuous piecewise quadratic velocity and continuous piecewise linear pressure
\[
Pe := \frac{h \, \|u\|_\infty}{2\nu},
\]
i.e. if Pe > 1 the Galerkin FE discretisation may be unstable and oscillations may occur.
For problems where viscosity is very small it may be impractical or even impossible to
refine the mesh far enough to guarantee Pe ≤ 1. To overcome this problem, stabilised
discretisations such as Streamline-Upwind Petrov-Galerkin (SUPG) and Sub-Grid
Stabilisation (SGS) techniques have been proposed as a remedy, see e.g.
[47] for an overview of different approaches. These approaches use modified weak forms
with improved stability properties to allow somewhat more meaningful approximations
even if the mesh cannot be refined sufficiently to capture all the effects involved.
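As a small worked example of the criterion Pe ≤ 1, the following sketch evaluates the mesh Péclet number from the definition above and the largest mesh size that still satisfies the criterion. The function names and the numerical values are illustrative assumptions.

```python
def mesh_peclet(h, u_max, nu):
    """Mesh Peclet number Pe = h * ||u||_inf / (2 * nu)."""
    return h * u_max / (2.0 * nu)

def max_mesh_size(u_max, nu):
    """Largest mesh size h for which Pe <= 1 holds."""
    return 2.0 * nu / u_max

pe_coarse = mesh_peclet(h=0.1, u_max=1.0, nu=0.01)
h_required = max_mesh_size(u_max=1.0, nu=0.01)
print(pe_coarse, h_required)
```

For these values the mesh with h = 0.1 gives Pe of about 5, so the mesh size would have to be reduced to about 0.02 before the standard Galerkin discretisation can be expected to be free of oscillations.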
The necessity for this kind of stabilisation has been questioned as it is often feasible
to get good approximations without stabilisation if the mesh is well chosen, see [90] for
example. Critics (e.g. [71]), in turn, argue that this approach is not sufficiently robust as the
stability may be very sensitive to the choice of the mesh, and the information necessary
for the construction of such meshes (e.g. position of interior layers) may not always be
available a priori. Therefore a combined approach is probably still a favourable choice.
However, if the viscosity constant ν is very small, or equivalently the Reynolds num-
ber is very large, other problems occur as well, i.e. stationary solutions may not be unique
(bifurcation phenomena), they may not describe the behaviour well as the physical system
does not approach a steady (∂ /∂t = 0) state for t → ∞, but oscillations remain, or indeed,
steady solutions may not even exist.
For the purpose of this work we restrict ourselves to problems where sufficient vis-
cosity is present such that the criterion Pe ≤ 1 can be fulfilled. Thus application of stan-
dard Galerkin discretisation is possible and the stabilised discretisations are only used for
coarse grid discretisations in the context of multigrid techniques, see Subsection 2.5.2.2.
Finally, we may observe that the left-hand side of the weak form (2.3.1) is nonlinear
in the velocity u, as u appears twice as an argument of the trilinear term c(., ., .).
Thus the discretisation will result in a set of nonlinear algebraic equations. We will discuss
techniques to deal with this nonlinearity in the following subsection.
2.3.2 Linearisation
The canonical approach to solving nonlinear systems is to use Newton’s method, which
applied to (2.3.7) yields
Using an appropriate starting guess uh(0) this iterative method may be used to approximate
the solution of the nonlinear system of algebraic equations. The well known properties
of the Newton method also apply in this case [40, Section 6.1]: close to a solution it is
quadratically convergent, and it may fail to converge if the initial guess is too far from a
solution.
An obvious but interesting observation is that for the Galerkin FE discretisation of
(2.3.1) the approaches “linearise then discretise” and “discretise then linearise” coincide,
so long as Newton’s method and the same FE function spaces are used. This allows
some insight into the properties of the Jacobian as its correspondence to linear differential
operators can be used.
Unfortunately, the requirement of a “good” initial guess can be quite problematic and
so gives rise to the need for other approaches to solve the nonlinear system. Probably the
most widespread such approach is to use Picard iteration,
Like the Newton iteration this iterative solution process requires an initial velocity guess
uh(0) which may well be chosen to be zero everywhere apart from the Dirichlet bound-
ary, where it should satisfy the boundary conditions for consistency. In [40, Section 6.3],
where this method is referred to as the simple iteration method, it is stated that this method
is linearly convergent for arbitrary initial guess uh(0) . This method may be regarded as an
inexact Newton method, since it discards some of the terms in (2.3.9), i.e. the reaction
terms (zeroth order terms). Another approach to motivate this linearisation can be de-
rived from the time-dependent problem, where it may be argued that convection of the
velocities, rather than a reaction, is the natural process underlying the nonlinear convec-
tive term. Using a backward Euler time discretisation and taking the time-step size to the
infinite limit yields the Picard iteration.
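The difference between the two linearisations can be illustrated on a much simpler model than the Navier-Stokes system. The sketch below is an assumption-laden stand-in: it uses a one dimensional Burgers-type problem −ν u'' + u u' = 0 with Dirichlet data, central finite differences instead of finite elements, and illustrative values of ν and the mesh size. Both iterations are written in correction form; the Picard matrix freezes the convecting velocity, and the Newton matrix adds the zeroth order (reaction) term that Picard discards.

```python
import numpy as np

nu, n = 0.5, 49                  # illustrative viscosity and number of interior nodes
h = 1.0 / (n + 1)
u_left, u_right = 1.0, 0.0       # Dirichlet boundary values

def with_boundary(u):
    return np.concatenate(([u_left], u, [u_right]))

def residual(u):
    U = with_boundary(u)
    diffusion = nu * (-U[:-2] + 2.0 * U[1:-1] - U[2:]) / h**2
    convection = U[1:-1] * (U[2:] - U[:-2]) / (2.0 * h)
    return diffusion + convection

def picard_matrix(w):
    """Linearised operator with the convecting velocity frozen at w."""
    main = np.full(n, 2.0 * nu / h**2)
    lower = -nu / h**2 - w[1:] / (2.0 * h)
    upper = -nu / h**2 + w[:-1] / (2.0 * h)
    return np.diag(main) + np.diag(lower, -1) + np.diag(upper, 1)

def newton_matrix(u):
    """Full Jacobian: Picard matrix plus the zeroth order (reaction) term."""
    U = with_boundary(u)
    return picard_matrix(u) + np.diag((U[2:] - U[:-2]) / (2.0 * h))

def iterate(matrix, tol=1e-8, max_its=200):
    u = np.zeros(n)              # zero apart from the Dirichlet boundary values
    for its in range(max_its + 1):
        r = residual(u)
        if np.max(np.abs(r)) < tol:
            return u, its
        u = u + np.linalg.solve(matrix(u), -r)
    raise RuntimeError("no convergence within max_its iterations")

u_picard, its_picard = iterate(picard_matrix)
u_newton, its_newton = iterate(newton_matrix)
print(its_picard, its_newton)
```

For this mildly nonlinear model both iterations converge from the zero initial guess, with Newton needing noticeably fewer iterations; for harder problems the Newton iteration would typically need the better initial guess discussed in the text.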
Hybrid solution strategies, combining the global convergence of the Picard iteration to
find a “good” initial guess for the faster convergent Newton method, are of course an even
better approach to solving the nonlinear system. However, for the Reynolds number range
considered here, the approach of first computing as high a Re solution as possible on a
coarse mesh using Newton linearisation, and then interpolating this solution and using it
as an initial guess for a Newton iteration on the finer mesh for higher Re, was found to
be sufficient to obtain initial guesses for the solution on each respective level. Once the
desired Reynolds number can be used on the fine meshes, the interpolated solution on a
finer level may even be a good enough approximation such that a single Newton iteration
is enough to solve the nonlinear system to the required accuracy, as demonstrated in [22]
for example.
Often (e.g. [96, Section 3.3]) Picard iteration is favoured over Newton linearisation
because the resulting linear problems are better conditioned, especially for large Reynolds
number (i.e. small viscosity constant ν). However, for the range of relatively small
Reynolds numbers considered here, the fast convergence of the Newton method and the fact that
it is required to formulate the discrete adjoint problem make it the preferable choice, even
though the involved linear systems are more demanding, as the results in Section 2.5.2
demonstrate.
Ultimately the application of the Discrete Adjoint Method requires the solution of a
system of equations with the transpose of the Jacobian of the discrete system as coeffi-
cient matrix. Thus, every effort that is put into solving systems with this Jacobian matrix
efficiently pays double, as the techniques can be used for the adjoint equation as well.
\[
\frac{\partial q}{\partial t} + \operatorname{div}(f(q)) = 0 \quad \text{in } [0, T] \times \Omega \tag{2.4.1}
\]
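\[
0 = \int_V \frac{\partial q}{\partial t} \, dV + \int_V \operatorname{div}(f(q)) \, dV
  = \frac{\partial}{\partial t} \int_V q \, dV + \int_{\partial V} f(q) \cdot n \, dA. \tag{2.4.2}
\]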
Here ∂V denotes the surface of the volume V and n the outward normal of this surface. If
a sufficiently smooth solution exists, (2.4.1) is equivalent to (2.4.2) holding for all V ⊂ Ω
and t ∈ [0, T ]. The smoothness requirements of the latter formulation are not as strong
as those of (2.4.1) and therefore it is called a weak formulation. In most cases from
theoretical mechanics (e.g. for the Navier-Stokes equations) the conservation equation
(2.4.1) is derived from the formulation (2.4.2) in the first place, so the latter is even a
more natural formulation, as it does not increase smoothness requirements.
In order to discretise the equations, the domain Ω is split into a set of finite,
non-overlapping, non-empty control volumes V_i, Ω = ⋃_i V_i, and (2.4.2) is applied to each of
these control volumes only, rather than all possible volumes V ⊂ Ω. A discrete
representation of the dependent functions q and an associated way of evaluating the terms in (2.4.2)
are chosen, which complete the definition of a set of algebraic equations forming the
discrete system. There are many ways these final steps can be done and different discreti-
sation schemes emerge from the precise definition of the Vi , the data representation and
the way the terms in (2.4.2) are evaluated. Naturally the properties of the variants differ,
and an appropriate scheme has to be chosen considering the properties and requirements
of the application, otherwise one may easily end up with an unstable discretisation.
It is not the subject of this work to give an overview of the different approaches
that have been developed; the interested reader is referred to [31, 105], for example.
The main subject of this work is the discrete adjoint method (DAM). Therefore only one
particular FV scheme is selected here, and the application of the DAM is investigated for
this scheme, in the hope of gaining more insight into the DAM in general.
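\[
\frac{\partial u}{\partial t} + \nabla p + \operatorname{div}\left(u \, u^T\right) - \nu \operatorname{div} \operatorname{grad} u = 0, \tag{2.4.3a}
\]
\[
\frac{\partial p}{\partial t} + \beta \operatorname{div}(u) = 0, \tag{2.4.3b}
\]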
with the constant parameter β > 0 chosen suitably [6]. Steady state solutions of these
equations satisfy the incompressibility condition div(u) = 0. However, transient solutions
do not necessarily possess this property. This presents no obstacle if, as in this work, this
scheme is only used to compute steady state solutions.
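From this the vector of dependent variables q and the flux f (q) can be identified as
\[
q := \begin{bmatrix} u \\ p \end{bmatrix}, \tag{2.4.4}
\]
\[
f_{(:,i)} := \begin{bmatrix}
\delta_{i,1}\, p + u_1 u_i - \nu \dfrac{\partial u_1}{\partial x_i} \\
\delta_{i,2}\, p + u_2 u_i - \nu \dfrac{\partial u_2}{\partial x_i} \\
\vdots \\
\delta_{i,d}\, p + u_d u_i - \nu \dfrac{\partial u_d}{\partial x_i} \\
\beta u_i
\end{bmatrix}, \qquad i = 1, \dots, d, \tag{2.4.5}
\]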
where δi, j is the Kronecker δ . Hence, the normal flux F(q, n) := f (q).n from (2.4.2) is
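\[
F(q, n) := f(q) \cdot n = \begin{bmatrix}
p\, n_1 + u_1 (u \cdot n) - \nu \dfrac{\partial u_1}{\partial n} \\
p\, n_2 + u_2 (u \cdot n) - \nu \dfrac{\partial u_2}{\partial n} \\
\vdots \\
p\, n_d + u_d (u \cdot n) - \nu \dfrac{\partial u_d}{\partial n} \\
\beta\, u \cdot n
\end{bmatrix}. \tag{2.4.6}
\]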
The FV scheme that has been chosen for the purpose of this work is a first order cell
centred finite volume scheme, utilising Roe’s flux difference splitting approach to sta-
bilise the discrete formulation, as in [6] and [95, pp. 333–338] for example. For further
simplicity considerations are restricted to block Cartesian meshes.
The scheme is called cell centred because the discrete function values of q are con-
sidered to be localised at the centres of mesh cells. The integral over the boundary ∂Vi of
mesh cell Vi is approximated as
\[
\int_{\partial V_i} f(q) \cdot n \, dA \approx \sum_{j : A_j \subset \partial V_i} F_j \, |A_j|, \tag{2.4.7}
\]
where the discrete flux Fj is an approximation to the normal flux Fj ≈ F(q, n) on the
cell interface A j (edge in two dimensions, face in three). The evaluation of (2.4.2) is
performed in a cell interface oriented way. The discrete flux Fj is evaluated for each cell
interface, and the contribution Fj |A j | is added to the residual vector for both adjacent cells
with opposite sign, of course, as the outward normal has opposing directions for the two
neighbouring cells.
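The interface oriented assembly can be sketched in one space dimension. The example is illustrative only: the cell states, the unit interface area and the simple upwind flux used in place of the discrete flux F_j are assumptions. Each interior interface is visited once, and its contribution is added to the residual of the left cell and subtracted from that of the right cell.

```python
import numpy as np

n_cells = 10
q = np.sin(np.linspace(0.0, 3.0, n_cells))   # arbitrary cell-centred states
area = 1.0                                    # |A_j|; interfaces are points in 1D

def numerical_flux(q_left, q_right):
    """Hypothetical discrete flux F_j: upwind flux for advection with unit speed."""
    return q_left

residual = np.zeros(n_cells)
# Interior interface j separates cell j (left) from cell j + 1 (right);
# its normal points from the left cell to the right cell.
for j in range(n_cells - 1):
    F_j = numerical_flux(q[j], q[j + 1])
    residual[j] += F_j * area        # outward normal of the left cell
    residual[j + 1] -= F_j * area    # outward normal of the right cell is reversed

interior_sum = residual.sum()
print(interior_sum)
```

Because every interior flux enters twice with opposite sign, the contributions cancel in the sum over all cells, which is the discrete conservation property of the scheme; only boundary fluxes (omitted here) would change the total.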
For the purpose of defining the discrete flux Fj , both it and F(q, n) are split into
Figure 2.3: Notation for the definition of the discrete flux F_j at interface A_j (states q_{L,j} and q_{R,j} on either side of A_j, normal n)
which are treated separately due to their fundamentally different character. An important
observation is that the normal flux F(q, n) has to be approximated on the cell interface
while the state variables q are considered located at the cell centres. Let qL, j and qR, j
denote the states in the cells left and right of interface A j respectively, see Figure 2.3.
The normal derivatives of the viscous flux (2.4.8) can be approximated on the interface
by using the central difference
\[
\frac{\partial u}{\partial n} \approx \frac{u_{R,j} - u_{L,j}}{h_j}, \tag{2.4.10}
\]
where h j denotes the distance of the cell centres of the left and right cells. If the interface
is exactly half-way between the centres of the adjacent cells, this results in a second order
consistent approximation, otherwise only first order is achieved.
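The statement about the order of consistency can be checked numerically. In the sketch below the test function sin(x), the interface position and the mesh sizes are assumptions; the parameter left_fraction places the left cell centre at distance left_fraction·h from the interface, so 0.5 corresponds to an interface exactly half-way between the cell centres.

```python
import math

def interface_derivative(u, x_face, h, left_fraction):
    """(u_R - u_L) / h with cell centres a distance h apart on either side of x_face."""
    x_left = x_face - left_fraction * h
    x_right = x_face + (1.0 - left_fraction) * h
    return (u(x_right) - u(x_left)) / h

def error_ratio(left_fraction, h=1e-2, x_face=1.0):
    """Ratio of the errors for mesh sizes h and h/2: about 2**order."""
    exact = math.cos(x_face)
    err_h = abs(interface_derivative(math.sin, x_face, h, left_fraction) - exact)
    err_half = abs(interface_derivative(math.sin, x_face, h / 2.0, left_fraction) - exact)
    return err_h / err_half

ratio_centred = error_ratio(0.5)    # interface half-way between the centres
ratio_offset = error_ratio(0.25)    # interface closer to the left centre
print(ratio_centred, ratio_offset)
```

Halving h reduces the error by a factor of about four in the centred case (second order) and by a factor of about two in the offset case (first order).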
Definition of the discrete inviscid flux is more demanding. A straightforward
approach to approximate F(q, n) at the interface would be to use the arithmetic mean of left
and right cell evaluations, 1/2 (F(q_{L,j}, n) + F(q_{R,j}, n)). Unfortunately this would result
in an unstable discretisation (in a similar way as central difference schemes yield an un-
stable discretisation of convection equations, see [31, Section 4.4.2]). Stabilisation may
be achieved using Roe’s approach, which modifies the arithmetic mean by an upwinding
term
\[
F_{j,i} := \frac{1}{2}\bigl(F(q_{L,j}, n) + F(q_{R,j}, n)\bigr) - \frac{1}{2}\, |J| \, (q_{R,j} - q_{L,j}),
\]
with
\[
|J| := R \, |\Lambda| \, L.
\]
Here |Λ| is the diagonal matrix with the absolute values of the eigenvalues of the Jacobian
∂ F(q, n)/∂ q, and R and L are defined by diagonalising this Jacobian,
\[
R \, \Lambda \, L = \frac{\partial F(q, n)}{\partial q}.
\]
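The construction of |J| and of the stabilised flux can be sketched for a one dimensional model with q = [u, p] and n = 1, for which the inviscid part of (2.4.6) reduces to [u² + p, βu]. The value of β, the states and the function names are assumptions, and so is the choice of evaluating the Jacobian at the arithmetic mean of the left and right states, since the text above does not state at which state the Jacobian is taken.

```python
import numpy as np

beta = 1.0   # artificial compressibility parameter (illustrative value)

def normal_flux(q):
    """Inviscid normal flux of the 1D model, q = [u, p], n = 1."""
    u, p = q
    return np.array([u * u + p, beta * u])

def flux_jacobian(q):
    """Jacobian dF/dq of normal_flux."""
    u, _ = q
    return np.array([[2.0 * u, 1.0],
                     [beta, 0.0]])

def abs_jacobian(q):
    """|J| = R |Lambda| L with L = R^{-1}, from the eigendecomposition of dF/dq."""
    eigenvalues, R = np.linalg.eig(flux_jacobian(q))
    return R @ np.diag(np.abs(eigenvalues)) @ np.linalg.inv(R)

def roe_type_flux(q_left, q_right):
    # Assumption: the Jacobian is evaluated at the arithmetic mean of the states.
    q_mean = 0.5 * (q_left + q_right)
    central = 0.5 * (normal_flux(q_left) + normal_flux(q_right))
    return central - 0.5 * abs_jacobian(q_mean) @ (q_right - q_left)

q_l, q_r = np.array([0.8, 0.1]), np.array([0.5, 0.3])
flux = roe_type_flux(q_l, q_r)
print(flux)
```

The eigenvalues of this Jacobian are u ± sqrt(u² + β), which are real and distinct for β > 0, so the diagonalisation is well defined. For equal left and right states the upwinding term vanishes and the flux reduces to F(q, n).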
The authors in [95] provide an interpretation of this spatial discretisation scheme in terms
of equivalent differential operators on a given two dimensional grid, i.e. the stabilised
(nonlinear) operator becomes
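\[
\left[\begin{array}{ccc}
-\nu\Delta + 2u\partial_x + v\partial_y
 & u\partial_y
 & \partial_x \\[0.5ex]
-\dfrac{h}{2}\Bigl(|v|\partial_{yy} + \dfrac{2u^2+\beta}{\sqrt{u^2+\beta}}\,\partial_{xx}\Bigr)
 & -\dfrac{h}{2}\,\dfrac{uv}{v^2+\beta}\bigl(2\sqrt{v^2+\beta} - |v|\bigr)\partial_{yy}
 & -\dfrac{h}{2}\,u\Bigl(\dfrac{1}{\sqrt{u^2+\beta}}\,\partial_{xx} + \dfrac{\sqrt{v^2+\beta}-|v|}{v^2+\beta}\,\partial_{yy}\Bigr) \\[2.5ex]
v\partial_x
 & -\nu\Delta + u\partial_x + 2v\partial_y
 & \partial_y \\[0.5ex]
-\dfrac{h}{2}\,\dfrac{uv}{u^2+\beta}\bigl(2\sqrt{u^2+\beta} - |u|\bigr)\partial_{xx}
 & -\dfrac{h}{2}\Bigl(|u|\partial_{xx} + \dfrac{2v^2+\beta}{\sqrt{v^2+\beta}}\,\partial_{yy}\Bigr)
 & -\dfrac{h}{2}\,v\Bigl(\dfrac{1}{\sqrt{v^2+\beta}}\,\partial_{yy} + \dfrac{\sqrt{u^2+\beta}-|u|}{u^2+\beta}\,\partial_{xx}\Bigr) \\[2.5ex]
\beta\partial_x
 & \beta\partial_y
 & \\[0.5ex]
-\dfrac{h}{2}\,\dfrac{\beta u}{\sqrt{u^2+\beta}}\,\partial_{xx}
 & -\dfrac{h}{2}\,\dfrac{\beta v}{\sqrt{v^2+\beta}}\,\partial_{yy}
 & -\dfrac{h}{2}\,\beta\Bigl(\dfrac{1}{\sqrt{u^2+\beta}}\,\partial_{xx} + \dfrac{1}{\sqrt{v^2+\beta}}\,\partial_{yy}\Bigr)
\end{array}\right]. \tag{2.4.11}
\]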
The operator is written like a matrix that would be multiplied from the right by the vector
of flow variables q := [u, v, p]T and the entries of the matrix are each written across two
lines, the first line forming the original operators from the Navier-Stokes equations and
the second denoting the stabilisation terms. As all the stabilisation terms are of order
O(h), the overall spatial discretisation is only first order accurate.
The boundary conditions are also implemented in different ways for the inviscid and
viscous fluxes. The inviscid fluxes require the definition of the velocities and pressure on
the boundary. The velocities are prescribed by the boundary conditions while the pressure
is extrapolated linearly from the two adjacent inner cells, see the left-hand side of Figure
2.4 for an illustration. This procedure guarantees that the total mass flux through the
boundary of the domain is exactly that described by the boundary conditions, i.e. if the
boundary conditions dictate balancing in and out flux then the net mass flux through the
boundary will be zero. The viscous flux requires the normal derivatives of the velocities
on the boundary. This has been implemented using “ghost” cells, as illustrated in the
right-hand side of Figure 2.4. Utilising the velocity values at the two adjacent inner cells
and the values on the boundary, quadratic extrapolation is used to define values at the
“ghost” cell. A central difference for the internal and extrapolated “ghost” cell values is
then used as the normal derivative, reusing the code for the internal cells. This approach
reproduces the solution for channel flow exactly.
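The exactness for channel flow follows because the channel profile is quadratic in the wall-normal coordinate, so a quadratic extrapolation reproduces it. The sketch below assumes a uniform cell size h near the wall, with the inner cell centres at distances h/2 and 3h/2 from the boundary and the ghost cell centre at distance h/2 outside; the extrapolation weights are derived from these positions and the function names are illustrative.

```python
def ghost_value(u_boundary, u_inner1, u_inner2):
    """Quadratic extrapolation through the boundary value and the two adjacent
    inner cell centres, evaluated at the ghost cell centre (uniform cell size)."""
    return (8.0 * u_boundary - 6.0 * u_inner1 + u_inner2) / 3.0

def wall_derivative(u_boundary, u_inner1, u_inner2, h):
    """Central difference between the first inner cell and the ghost cell,
    taken in the direction pointing into the domain."""
    return (u_inner1 - ghost_value(u_boundary, u_inner1, u_inner2)) / h

def channel_profile(y):
    """Parabolic channel profile with walls at y = 0 and y = 1."""
    return y * (1.0 - y)

h = 0.1
du_dy_wall = wall_derivative(channel_profile(0.0),
                             channel_profile(0.5 * h),
                             channel_profile(1.5 * h), h)
print(du_dy_wall)
```

For the parabolic profile the computed value agrees with the exact derivative at the wall, which is 1 in the inward direction; the outward normal derivative has the opposite sign.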
This completes the definition of the spatial discretisation. Defining the vector of dis-
crete states ω and the vector of discrete residuals R(ω), this yields
\[
M \frac{\partial \omega}{\partial t} + R(\omega) = 0,
\]
where M is a diagonal matrix with the cell volumes as entries. Temporal discretisation
\[
M \, \frac{\omega^{(k+1)} - \omega^{(k)}}{t^{(k+1)} - t^{(k)}} + R(\omega^{(k)}) = 0,
\]
which implies CFL-type restrictions on the time-step size τ := t^(k+1) − t^(k) (e.g. [31,
Section 6.3.1]). As the focus here is on steady solutions, the implicit Euler scheme
\[
M \, \frac{\omega^{(k+1)} - \omega^{(k)}}{t^{(k+1)} - t^{(k)}} + R(\omega^{(k+1)}) = 0, \tag{2.4.12}
\]
is advantageous, since it allows larger time-steps and thus allows one to arrive at a steady
state in fewer (but more expensive) time-steps.
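The trade-off between the two time discretisations can be illustrated with a linear stand-in for the residual. In the sketch below the residual R(ω) = Aω − b from a one dimensional diffusion problem, the identity matrix in place of M, the time-step sizes and the tolerance are all assumptions. Since this R is linear, each implicit step needs exactly one linear solve; for a nonlinear R each step would itself require a linearisation.

```python
import numpy as np

n = 20
h = 1.0 / (n + 1)
# Hypothetical linear residual R(w) = A w - b from a 1D diffusion problem.
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
b = np.ones(n)
M = np.eye(n)   # stands in for the diagonal matrix of cell volumes

def residual(w):
    return A @ w - b

def march(step, tol=1e-8, max_steps=100000):
    """Time-march from zero until the steady state residual is below tol."""
    w = np.zeros(n)
    for k in range(max_steps):
        if np.max(np.abs(residual(w))) < tol:
            return w, k
        w = step(w)
    raise RuntimeError("steady state not reached")

tau_explicit = 0.4 * h**2   # respects the stability limit tau < h^2 / 2 of this model
tau_implicit = 1.0          # no stability restriction for the implicit scheme

def explicit_step(w):
    return w - tau_explicit * np.linalg.solve(M, residual(w))

implicit_matrix = M / tau_implicit + A

def implicit_step(w):
    # M (w_new - w) / tau + R(w_new) = 0, rearranged for the update w_new - w
    return w + np.linalg.solve(implicit_matrix, -residual(w))

w_explicit, steps_explicit = march(explicit_step)
w_implicit, steps_implicit = march(implicit_step)
print(steps_explicit, steps_implicit)
```

Both schemes reach the same steady state, but the explicit scheme is limited to very small time-steps and needs far more of them, while each implicit step costs a linear solve with the matrix M/τ + A.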
\[
R(\omega) = R(\omega + \alpha e_p)
\]
for any vector ω of discrete flow variables and any constant α, while e_p denotes a vector
containing zeros for the velocity variables and ones for the pressure variables. Thus the
solution of R(ω) = 0 can only be well defined up to an additive constant pressure. If no
attention is paid to this, inaccuracies can build up in the iterative solution procedure which
can potentially result in very large constant pressure components with the adverse effects
these imply in finite precision arithmetic. To avoid this, the pressure is normalised after
each time-step by subtracting the mean pressure, p := p − e (w^T p), where e is a vector
of all ones, and w is a weight vector consisting of the cell volumes, scaled such that the
sum of the weights is one, i.e. w_i := dV_i / V, which implies w^T e = 1. This is equivalent
to projecting the pressure such that the integral of the pressure over the whole domain
is zero. Applying the projection step at each solution update can be seen as to seek a
solution ω to " #
R(ω)
0= ,
wTp ω
where w p denotes a vector containing zeros for the velocity variables and the weight from
w for the pressure variables. As this is a system of N + 1 equations in N unknowns, it
is not yet quite complete. However, by defining an extended state vector ω e := (ω, λ ), a
regularised discrete system may be defined as
" #
R(ω) + λ w p
0 = R( e :=
e ω)
T
. (2.4.13)
wp ω
The additional term λ w p ensures that the system R( e = f is solvable for arbitrary right-
e ω)
hand side, removing the irregularity resulting from the “constant mass production” issues
discussed at the beginning of this section. Using Matlab, it has been confirmed that the
Jacobian Jeof R,
e " # " #
∂ R
e J w p ∂R
J :=
e = T , J := ,
∂ωe wp 0 ∂ω
is non-singular at the evaluated solution, thus this solution is (at least) locally unique.
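The effect of the bordering can be checked on a small stand-in problem. The following sketch assumes a pressure-only toy Jacobian (a 1D Neumann-type Laplacian, whose null vector is the constant vector) and made-up cell volumes; neither is taken from the flow solver described here.

```python
import numpy as np

n = 6
# Hypothetical stand-in for a Jacobian with a constant null vector:
# 1D Neumann-type Laplacian, J @ ones == 0.
J = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
J[0, 0] = J[-1, -1] = 1.0

dV = np.array([1.0, 2.0, 1.0, 3.0, 1.0, 2.0])   # made-up cell volumes
w = dV / dV.sum()                                # weights, w @ ones == 1

# Extended Jacobian [[J, w], [w^T, 0]] of the regularised system.
J_ext = np.block([[J, w[:, None]], [w[None, :], np.zeros((1, 1))]])

rank_J = np.linalg.matrix_rank(J)        # deficient by one
rank_ext = np.linalg.matrix_rank(J_ext)  # full rank

f = np.arange(1.0, n + 2.0)              # arbitrary right-hand side
sol = np.linalg.solve(J_ext, f)          # solvable despite singular J
print(rank_J, rank_ext)
```

For this toy matrix the left and right null vectors coincide, so the bordered matrix is non-singular as long as the weights do not sum to zero.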
$$ \omega = [u_1, u_2, \ldots, u_N, v_1, \ldots, v_N, p_1, \ldots, p_N], $$
and the equations are ordered in an analogous way, the Jacobian of the scheme can be
written in the block structure
$$ J := \frac{\partial R}{\partial \omega} = \begin{bmatrix} F & B^T \\ B & C \end{bmatrix}. \qquad (2.4.14) $$
This singularity is the discrete analogue to the non-uniqueness of the static pressure p
for the steady state of the continuous system. The stabilisation terms in (2.4.11) imply
that the upper-right block in (2.4.14) is not exactly the transpose of B for the Jacobian of this scheme, which allows for the corresponding left
eigenvector to be different. However, this left eigenvector has been evaluated for different
discrete problems using Matlab, and it was found to be identical to the right eigenvector.
This is somewhat surprising as the constant mass production issues discussed at the
beginning of Section 2.4.3 would point toward an eigenvector of the structure $w_p$ rather than
the observed $e_p$.
To explain why one would expect left eigenvectors of the form $w_p$, we observe that
the left eigenvector to eigenvalue λ = 0 has to be orthogonal to the image of J. Thus
it has to be a vector that is not obtainable as a product Jy for any $y \in \mathbb{R}^N$. Looking at
the differential operator defining the B block, we know that div(u) = c = const is only
possible for c = 0 due to the definition of the boundary conditions. Therefore the integral
over a constant should not be in the image of the pressure space residual, i.e. $w_p$ should
be a left eigenvector to eigenvalue λ = 0.
It is common practice in FV solvers to avoid the problems resulting from this zero
eigenvalue simply by using finite time-steps of length $\tau_k = t^{(k+1)} - t^{(k)}$, even when a
steady-state solution is sought. For backward Euler time discretisation and uniform spatial
meshes this is equivalent to adding $(1/\tau_k)|V_h|$ times the identity matrix to the Jacobian,
which shifts the spectrum and thereby makes the resulting system non-singular (for an
appropriate time-step $\tau_k$). This in itself presents a dilemma. If $\tau_k$ is chosen to be very
small, many time-steps will be needed before the solution approaches a steady state. On
the other hand, if $\tau_k$ is chosen very large then $1/\tau_k$ will be very small. Therefore,
for $\tau_k \to \infty$ the properties of the system deteriorate towards the singular behaviour that is
seen when no time terms are added. Note that the absence of the time terms can also be
interpreted as $\tau_k = \infty$.
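The dilemma can be made concrete with a small numerical sketch. It assumes a toy singular matrix (a 1D Neumann-type Laplacian) in place of the flow Jacobian and unit cell volumes, so that the time term is simply (1/τ) times the identity; both are illustrative assumptions.

```python
import numpy as np

n = 8
# Toy singular "Jacobian": J @ ones == 0 (hypothetical stand-in).
J = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
J[0, 0] = J[-1, -1] = 1.0

def shifted_condition(J, tau):
    """2-norm condition number of J + (1/tau) I (unit cell volumes assumed)."""
    return np.linalg.cond(J + np.eye(len(J)) / tau)

conds = {tau: shifted_condition(J, tau) for tau in (1e-1, 1e1, 1e3, 1e5)}
for tau, c in conds.items():
    print(f"tau = {tau:8.1e}   cond = {c:10.3e}")
```

For this symmetric example the condition number is 1 + τ λ_max(J), so it grows linearly with the time-step and the shifted system approaches the singular one as τ → ∞.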
2.5.1 Introduction
As already stated above, the size of the systems of equations resulting from PDE
discretisation can be arbitrarily large. For meshes defined by uniform refinement of a given
coarse mesh, the number of degrees of freedom (DOFs) N grows like $N \sim h^{-d}$, where d
is again the spatial dimension and h the discretisation length scale. The more accurately a
solution to the PDE is required, the smaller h has to be chosen and the larger the size N of the
linear systems becomes. Fortunately, discretisation by finite elements or finite volumes results in
sparse linear systems. That means that most of the entries of the system matrix are zero,
apart from a few entries in each row or column. As the number of non-zero entries per
row is bounded, with a bound depending on the discretisation method and the base mesh,
the memory requirements for storing the system matrices are O(N), rather than $N^2$ in the
where $r_0 := Ax_0 - b$ is the residual for the initial guess $x_0$ of the solution. The individual
methods arise from the different ways in which the projection onto the subspace is defined. CG
uses a projection orthogonal to $K_k(A, r_0)$ while GMRES uses a projection orthogonal to
$AK_k(A, r_0)$. For reference we provide basic versions of the preconditioned variants of
these algorithms as Algorithm 2 (CG in the notation of [57]) and Algorithm 3 (GMRES$^4$).
Both algorithms require one matrix-vector product Ax per iteration, which, due to the
sparsity of A, can be implemented in O(N) operations. To define the projection, CG
requires a constant number of vector-vector operations as well, whose overall computational
cost is O(N). The symmetry of the matrix A is used in CG to define a sequence
of orthogonal search directions without storing more than one of these directions at any
time. A similar recurrence definition of the search directions is not known in the general
case, so GMRES builds and stores a basis of orthogonal search directions explicitly. Each
new search direction has to be orthogonalised to all previous search directions. Thus, its
memory requirements as well as the required operations per iteration grow with each
iteration. The overall cost for m iterations of GMRES is $O(m^2 N)$ in contrast to $O(mN)$ for
m iterations of CG. So, if m, the number of iterations to achieve the required accuracy$^5$,
can be kept small in relation to N, then these solvers are far more efficient than standard
factorisation approaches.
$^4$ Note that for simplicity of presentation this version does not check the norm of the residual at each step.
A real implementation specifies a maximal number of steps m, computes the Hessenberg matrix H and the
orthogonal search directions $v_i$ according to Algorithm 3 and, at each step, applies rotations to transform
the Hessenberg matrix H into an upper triangular matrix R and $\beta e_1$ into $g_j$. The resulting triangular least
squares problem $\min(\|g_j - R y_j\|)$ provides the residual after the j-th step directly and, when a stopping
criterion is fulfilled and the solution approximation $x^{(m)}$ has to be computed, allows easy computation of
$y_m$. See [73, Subsection 6.5.3] for a detailed discussion of these implementation issues.
At this point the condition number of the matrices becomes important. The well
known result for CG (e.g. [73, Section 6.11.3])
$$ \|x^* - x_m\|_A \le 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^m \|x^* - x_0\|_A \qquad (2.5.1) $$
relates the A-norm of the error of the m-th iterate $x^* - x_m$ to that of the initial error $x^* - x_0$.
The error bound (2.5.1) shows exponential convergence with respect to m, with a constant
that depends on the condition number κ, which is defined as the ratio of the largest to the
smallest eigenvalue of A. Thus, the number of iterations required to reduce the error to a
given fraction of the initial error is dependent on κ. Unfortunately, for the PDE
discretisations which are of interest here, the condition number κ grows as the discretisation is
refined, e.g. for the Poisson equation it is well known (e.g. [88, Section 5.2] or [19, Kapitel
IV, §2]) that $\kappa \sim O(h^{-2})$. Thus, as h is reduced, not only is the size N of the equation
system increased, but the number of iterations required to achieve a given reduction of the
error increases too.
$^5$ As the equation systems arise in the discretisation of PDEs, which implies a certain discretisation error,
solution of the linear equation systems to the same order of accuracy as the discretisation error is sufficient
to maintain the convergence order of the PDE discretisation.
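The growth of both quantities can be observed on a model problem. The sketch below assumes the standard 1D finite difference Poisson matrix as a stand-in for the discretisations discussed here, and evaluates the iteration count predicted by the bound (2.5.1) for a hypothetical error reduction of $10^{-6}$; the function names are illustrative.

```python
import numpy as np

def poisson_1d(n):
    """Dense 1D Poisson matrix (Dirichlet), n interior nodes, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def cg_iteration_bound(kappa, reduction=1e-6):
    """Smallest m with 2 * ((sqrt(kappa)-1)/(sqrt(kappa)+1))**m <= reduction."""
    rho = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
    return int(np.ceil(np.log(reduction / 2.0) / np.log(rho)))

results = []
for n in (15, 31, 63, 127):            # h is halved from one entry to the next
    kappa = np.linalg.cond(poisson_1d(n))
    results.append((n, kappa, cg_iteration_bound(kappa)))

for n, kappa, m in results:
    print(f"n = {n:4d}   kappa = {kappa:10.1f}   iterations (bound) = {m}")
```

Each halving of h multiplies κ by roughly four, and the iteration bound roughly doubles, in line with the $O(h^{-2})$ behaviour quoted above.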
This effect of ill-conditioning can be averted by appropriate preconditioning. In the
case of the FEM discretised Poisson equation, for example, very successful solution
strategies have been developed. One such approach is to combine CG with the BPX
preconditioner [20], which relies on hierarchical meshes. The condition numbers of the
preconditioned system $C^{-1}Ax = C^{-1}b$ are of order O(1), i.e. independent of h. Such
an approach is of course only useful if the preconditioner $C^{-1}$ itself can be implemented
in an efficient way. For the BPX preconditioner this is the case and the overall cost for
solving the systems is O(N), which is optimal.
Geometric multigrid (GMG) or algebraic multigrid (AMG) techniques (e.g. [95] for
an introduction) have been proven to provide optimal (O(N)) solution strategies for the
Poisson equation as well. These techniques use not only the equation system as it is
given on the current mesh, but also a sequence of equation systems corresponding to different
refinement levels. In the case of GMG this sequence of equation systems is generated
by building the system matrices for a hierarchical sequence of increasingly fine meshes,
while in the case of AMG the coarse versions of the equation system are constructed from
the system matrix A itself. In both cases simple iterative methods, like Gauß-Seidel for
example, are used to reduce the highest frequency components of the solution error on
the finest mesh (smoothing). The remaining residual is then restricted to a coarse-grid
representation, which is used to obtain an improved approximation of the low frequency
components of the solution. This coarse grid solution is then interpolated onto the fine
mesh and used to update the fine solution approximation (coarse grid correction). A
further smoothing step is then usually used to reduce high frequency error components
introduced by the coarse grid correction. This basic process can either be repeated until
the residual of the equation system becomes sufficiently small, i.e. using multigrid as a
solver, or it can be used as preconditioner inside a Krylov subspace solver, like CG for
example. The latter variant, using multigrid as a preconditioner in a Krylov subspace
solver, is found to be more robust [95, Section 7.8].
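The smoothing and coarse grid correction steps described above can be sketched as a two-grid cycle. This is a minimal illustration for the 1D Poisson model problem only, assuming dense matrices, one lexicographic Gauß-Seidel sweep for pre- and post-smoothing, linear interpolation, full-weighting restriction and a direct coarse solve; all function names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Dense 1D Poisson matrix (Dirichlet), n interior nodes, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def gauss_seidel(A, x, b):
    """One lexicographic Gauss-Seidel sweep (updates x in place)."""
    for i in range(len(b)):
        x[i] += (b[i] - A[i] @ x) / A[i, i]
    return x

def prolongation(nc):
    """Linear interpolation from nc coarse to 2*nc+1 fine interior nodes."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2 * j, j] = 0.5
        P[2 * j + 1, j] = 1.0
        P[2 * j + 2, j] = 0.5
    return P

def two_grid_cycle(A, Ac, P, x, b):
    x = gauss_seidel(A, x, b)                  # pre-smoothing
    r_coarse = 0.5 * P.T @ (b - A @ x)         # restrict residual (full weighting)
    x = x + P @ np.linalg.solve(Ac, r_coarse)  # coarse grid correction
    return gauss_seidel(A, x, b)               # post-smoothing

def solve_two_grid(n_coarse, tol=1e-8, max_cycles=100):
    """Number of cycles needed to reduce the residual by the factor tol."""
    n = 2 * n_coarse + 1
    A, Ac, P = poisson_1d(n), poisson_1d(n_coarse), prolongation(n_coarse)
    b, x = np.ones(n), np.zeros(n)
    for cycle in range(1, max_cycles + 1):
        x = two_grid_cycle(A, Ac, P, x, b)
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return cycle
    return max_cycles

cycles = [solve_two_grid(nc) for nc in (15, 31, 63)]
print(cycles)
```

Applying the same cycle recursively to the coarse system, instead of the direct solve, gives a multigrid V-cycle; the coarse matrix is rediscretised on the coarse mesh here, as in the geometric variant.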
The improved approximation of the low frequency components of the solution on the
coarse mesh can be obtained the same way, i.e. pre-smoothing, coarse grid correction (if
there is a coarser grid), post-smoothing, because the highest frequency components on the
coarse grid have lower frequency than those on the fine mesh. Often the equation systems
on the coarsest grid are solved by direct solvers. Different variants of the basic multigrid
idea emerge as the operators for the smoothing, interpolation, and restriction are defined,
and by choosing the sequence and the number of times the pre-smoothing, coarse grid
If the discrete variables from the linearised FE discretisation are ordered such that all
the velocity variables come before the pressure variables, and the equations are ordered
in an analogous way, then the system matrices K for (2.3.9) and (2.3.10) have the block
structure
$$ K = \begin{bmatrix} F & B^T \\ B & 0 \end{bmatrix}, \qquad (2.5.2) $$
where the F block results from the a(., .) and c(., ., .) terms while the B and BT blocks
arise from the b(., .) terms of the linearisations of weak form (2.3.7). For discretisations
which are not inherently div-stable but stabilised in some way, the pressure-pressure block
of K might be non-zero as well. But as the div-stable Taylor-Hood element pair is used in
this work, such stabilisation is not necessary here.
In [29, 52, 101] a right preconditioner is presented based upon approximating the
inverse of the block triangular matrix
$$ \tilde{C}_R := \begin{bmatrix} F & B^T \\ 0 & -S \end{bmatrix}, \qquad S := B F^{-1} B^T. \qquad (2.5.3) $$
In order to derive an analogous left preconditioner, the same approach is applied to the
matrix
$$ \tilde{C}_L := \begin{bmatrix} F & 0 \\ B & -S \end{bmatrix}. \qquad (2.5.4) $$
where for computations the latter factorisation is advantageous as it demonstrates the three
major steps necessary to apply this inverse to a vector: solve with F, multiply by B and
finally solve with −S.
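The three steps can be sketched for a small dense example. The blocks F and B below are random made-up matrices, and the exact Schur complement S is used instead of an approximation, purely to check the algebra; `apply_CL_inverse` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_p = 6, 3
G = rng.standard_normal((n_u, n_u))
F = G @ G.T + n_u * np.eye(n_u)          # made-up non-singular F block
B = rng.standard_normal((n_p, n_u))      # made-up full-rank B block
S = B @ np.linalg.solve(F, B.T)          # S := B F^{-1} B^T

def apply_CL_inverse(r):
    """Apply the inverse of [[F, 0], [B, -S]] to r = (r_u, r_p)."""
    r_u, r_p = r[:n_u], r[n_u:]
    y_u = np.linalg.solve(F, r_u)        # 1. solve with F
    t = r_p - B @ y_u                    # 2. multiply by B
    y_p = np.linalg.solve(-S, t)         # 3. solve with -S
    return np.concatenate([y_u, y_p])

K = np.block([[F, B.T], [B, np.zeros((n_p, n_p))]])
CL = np.block([[F, np.zeros((n_u, n_p))], [B, -S]])

# Left-preconditioned matrix CL^{-1} K, built column by column.
PK = np.column_stack([apply_CL_inverse(K[:, j]) for j in range(K.shape[1])])
print(np.round(PK[n_u:, :], 12))         # lower block rows: [0, I]
```

With the exact S the preconditioned matrix is upper block triangular with identity diagonal blocks, so all its eigenvalues equal one; any approximation of S perturbs this ideal situation.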
The most important ingredient of the $F_p$ preconditioning technique is to use an
approximation $X^{-1} \approx -S^{-1}$ instead of the inverse of S. Solving equation systems with S
would be very expensive, as the definition of S already contains the inverse of F. Thus,
if iterative solvers like GMRES were used to solve with S, for each multiplication by
S a system with F would have to be solved. The particular approximation $X^{-1}$ used in
the $F_p$ preconditioner is motivated by the following argument: S is a discretisation of the
differential operators
where the constant σ is used to distinguish between the F operator arising from Newton
linearisation σ = 1 and that from Picard iteration σ = 0. Now, if for the purpose of
motivation it is assumed that the operators are commutative, and if the dimensions of the
various arguments are ignored, the expression
can be derived. This would suggest using the inverse of a Laplacian $A_p$ and multiplication
by an F-like discrete operator $F_p$ as an approximation to $-S^{-1}$,
$$ X^{-1} := M_p^{-1} F_p A_p^{-1}. \qquad (2.5.7) $$
Here $M_p$ is a mass matrix (discretisation of the identity operator) which is added to
improve scaling. The subscript $*_p$ indicates that these operators are to be applied to pressure
space vectors, respectively functions from $S_0^h$. A rigorous analysis of the resulting
preconditioner can be found in [29, 52, 101]. Here we focus on practical issues related to the
application of this approach.
Let us draw attention to the fact that the preconditioner, as it is written in (2.5.5) and
(2.5.7), contains the inverse matrices of F, $A_p$ and $M_p$. Of course computing these inverse
matrices would be computationally very expensive, and would render the approach
about as expensive as direct solution methods for the whole matrix K. However, as this
preconditioner is to be used within a Krylov subspace solver, it is sufficient to supply
routines which apply the preconditioner to a given vector. Therefore it suffices to be
able to solve equation systems with each of the matrices F, $A_p$ and $M_p$, representing the
action of the inverse matrices. Such solves do not necessarily have to be very accurate,
and a preconditioned Krylov subspace solver may be used to approximate the solutions
to a specified tolerance. Of course the efficiency of the outer solution technique, for
the linearised Navier-Stokes problems, strongly depends on the efficiency of the methods
used to solve the subproblems which arise in the preconditioner. That is, an optimal solver
can only be achieved if the subproblem solvers are optimal.
Very efficient methods for solving equation systems arising from discretisations of
the Laplacian operator are available, as already discussed in Subsection 2.5.1. The
implementation in this work uses CG with the BPX preconditioner to solve the systems with
$A_p$. This results in computational costs linear in the number of unknowns, and thus is
considered an optimal solver for this subproblem. The subproblems with the mass matrix
require even less effort, as it is known [104] that the condition number of $\mathrm{diag}(M_p)^{-1} M_p$
is mesh independent. Therefore applying CG preconditioned by the inverse of the
diagonal of $M_p$ to solve these problems results in an optimal solver. However, the remaining
subproblem, application of $F^{-1}$, turns out to be more challenging. Detailed discussion of
this part is left to Subsection 2.5.2.2. For now let us assume that a good preconditioner
for these systems is available such that GMRES with this preconditioner allows robust,
efficient solution of equation systems with F.
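The mesh independence of the diagonally scaled mass matrix can be observed in a simple setting. The sketch below assumes linear finite elements on a uniform 1D mesh, which is an illustrative stand-in for the pressure space mass matrix; it uses the symmetric scaling $D^{-1/2} M D^{-1/2}$, which has the same eigenvalues as $D^{-1}M$ with D the diagonal of M.

```python
import numpy as np

def mass_matrix_1d(n):
    """Linear FE mass matrix on a uniform mesh of [0, 1] with n elements."""
    h = 1.0 / n
    M = np.zeros((n + 1, n + 1))
    Me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])   # element mass matrix
    for e in range(n):
        M[e:e + 2, e:e + 2] += Me
    return M

def scaled_condition(M):
    """Condition number of D^{-1/2} M D^{-1/2} with D = diag(M)."""
    d = 1.0 / np.sqrt(np.diag(M))
    return np.linalg.cond(d[:, None] * M * d[None, :])

kappas = [scaled_condition(mass_matrix_1d(n)) for n in (8, 32, 128)]
print(kappas)
```

For this element type the scaled eigenvalues lie in [1/2, 3/2], so the condition number stays below 3 on every mesh and the number of CG iterations does not grow under refinement.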
When iterative solution techniques are used for the inner solves, to approximate the
application of the inverse operators, $F^{-1}$, $A_p^{-1}$, $M_p^{-1}$, care has to be taken that the
resulting outer preconditioner (2.5.5) remains a linear operator. If, for example, a fixed relative
reduction of the initial residual of the inner equation systems is used as a stopping
criterion in the inner GMRES or CG solvers, this will generally result in a different number
of iterations required to fulfil this criterion, depending on the right-hand side to which
the algorithms are applied. These different iteration numbers imply that slightly different
approximations of the inverse matrices are used in each call of the outer preconditioner,
i.e. 20 steps of GMRES will result in a better approximation of $F^{-1}$ than 10 steps,
provided the same preconditioner is used. Thus, one would apply the outer GMRES solver
to a preconditioned matrix $C_{L,(k)}^{-1} K$ which is not constant, as $C_{L,(k)}^{-1}$ is slightly different in
each call k. This is a situation for which the GMRES algorithm has not been designed
and consequently it will fail, or at least return less accurate results than predicted. On the
other hand, if a fixed number of iterations and initial guess $x_0 = 0$ is used for each of the
inner solves, these become constant linear operators, since the solution approximations
resulting from m steps of GMRES [73, Lemma 6.6] as well as those from m steps of CG [73,
Lemma 6.5] can be written as
$$ x_m = x_0 + q_m(A)(b - Ax_0), $$
This preconditioning technique was first developed for the systems arising in the
Picard iteration [52], where it performs best, and was later applied to the systems from
Newton linearisation [29]. In the case of the Newton linearisation the performance
is worse in two ways. Firstly, the number of iterations for the outer solver deteriorates
more strongly as the Reynolds number Re increases, as analysed in [29]. Secondly, and
equally importantly, the performance of the multigrid solvers for the subsystems with the
F-block deteriorates badly with increasing Re, as is explained in more detail in Subsection
2.5.2.2. The combined increase makes this solver approach less attractive for higher Re,
but it is still competitive due to the fast convergence of the Newton linearisation in terms
of the nonlinear system. These issues are illustrated in Table 2.1, where for comparison a
driven cavity problem has been solved with both linearisation techniques. In both cases,
Picard linearisation and Newton linearisation, the Fp preconditioner has been utilised,
with the same accuracy settings for the inner and outer solves. The desired Reynolds
number Re = 100 could not be used on the coarse meshes due to stability issues explained
at the end of Section 2.3.1. The table lists the mesh refinement level ℓ, the Reynolds
number Re used on that level, the number of degrees of freedom (#dof), the number of
linear steps (#steps) to approximate the nonlinear solution to the specified accuracy, the
maximal number of $F_p$ preconditioned GMRES iterations required to solve the linear
systems (max(#it), the maximum is over the linearisation steps), the maximal number
of inner GMRES iterations for the F block per outer GMRES iteration (max(#inner F / #outer)),
the maximal time it took to solve one of the (outer) linear equation systems on the level
(max(t_solve)) and the total time for the computations from the coarsest to the finest level.
In both cases the most efficient multigrid preconditioning variant for the F-block solves
has been used, i.e. Stab0 for the Picard linearisation, and Stab3 for Newton linearisation,
see Section 2.5.2.2 for a detailed discussion.
Picard linearisation
  ℓ      Re     #dof   #steps   max(#it)   max(#inner F / #outer)   max(t_solve)
  1     5.9       59        2         11                        2        5.0e-04
  2    17.9      187        2         17                        9        1.2e-02
  3    20.8      659        2         18                        9        8.7e-02
  4    32.3     2467        2         20                       11        6.6e-01
  5    40.0     9539        2         21                       13        3.4e+00
  6    76.9    37507        4         24                       18        2.5e+01
  7   100.0   148739        4         25                       22        1.4e+02
  8   100.0   592387        4         25                       22        6.1e+02
  total time                                                             2.6e+03

Newton linearisation
  ℓ      Re     #dof   #steps   max(#it)   max(#inner F / #outer)   max(t_solve)
  1     5.9       59        2         11                        2        4.7e-04
  2    17.9      187        2         17                       10        1.8e-02
  3    20.8      659        2         20                       10        1.4e-01
  4    32.3     2467        2         23                       14        1.3e+00
  5    40.0     9539        2         25                       16        6.9e+00
  6    76.9    37507        3         30                       25        6.0e+01
  7   100.0   148739        3         34                       32        4.1e+02
  8   100.0   592387        3         34                       32        1.8e+03
  total time                                                             4.7e+03

Table 2.1: Comparison of Picard and Newton linearisations with $F_p$ preconditioner, driven
cavity at Re = 100
[Figure: eigenvalue spectra in the complex plane, three panels (axes: Real, Imag); one panel is titled "FEM level 4, 1089 pressure nodes". Results for driven cavity, Re = 20.]
Note that the increase in the iteration numbers as the mesh is refined is mainly related
to the increasing Re. Once the desired Re can be used on the mesh (stability constraints
are fulfilled), further refinement does not result in increased iteration numbers. The effects
explained above are clearly visible in this example, i.e. the Newton linearisation requires
fewer steps to converge to the solution of the nonlinear system, but the solution of the
linear systems arising in the Picard linearisation is computationally more efficient. Note
that for the same problem at Re = 10 there would be no noticeable difference between the
two linearisation methods, due to the dominance of the linear terms in this case.
Yet, besides the fast convergence of the nonlinear systems, there is another important
reason that makes the Newton linearisation a more favourable choice: it is required for
the discrete adjoint method.
We conclude this subsection by illustrating the effects of the Schur complement
preconditioner in figures 2.5, 2.6 and 2.7 for a driven cavity model problem at Re = 20
(ν = 0.05). Figure 2.5 shows the spectra of the Schur complement $S := BF^{-1}B^T$ for three
levels of mesh refinement in the case of Newton linearisation. The eigenvalues are
distributed over a region which depends on the refinement level. This is contrasted by the
spectra of the preconditioned Schur complement $X_{F_p}^{-1} S$ in figures 2.6 and 2.7 in the case
of Picard linearisation and Newton linearisation respectively. In both cases clustering of
the eigenvalues in two small regions may be observed. Note that in all three of these
figures a zero eigenvalue is present. This stems from the rank deficiency of the B-block,
which is not treated explicitly at this point, but will be discussed in detail in Subsection
2.5.2.3. Note also that in contrast to the case of CG, the spectrum of the matrix alone is
not sufficient to describe the convergence behaviour of GMRES [82], even though it is of
importance.
[Figure: eigenvalue spectra in the complex plane, three panels (axes: Real, Imag); one panel is titled "FEM level 4, 1089 pressure nodes". Results for driven cavity, Re = 20.]
[Figure: eigenvalue spectra in the complex plane, three panels (axes: Real, Imag); one panel is titled "FEM level 4, 1089 pressure nodes". Results for driven cavity, Re = 20.]
where the reaction part with reaction coefficient (∇u) is only present in the case of Newton
linearisation (σ = 1 in this case, σ = 0 otherwise). There are two terms in this operator
which make efficient solution difficult, the reaction term and the convection term.
While reaction-diffusion equations with non-negative reaction coefficients can, for
example, be efficiently solved by CG with BPX preconditioning [20], the non-negativity
condition can clearly not be guaranteed for ∇u. With increasing Reynolds number the
velocity components u of the solutions of the Navier-Stokes equations usually develop
boundary layers, where u changes rapidly from its free-stream behaviour to conform
with the boundary conditions. This change occurs in a layer of thickness O(ν) [65]
around Dirichlet boundaries. Thus the reaction coefficients can be expected to be of order
$O(\nu^{-1})$ in these areas, while more moderate values can be expected in the rest of the
domain. Negative reaction coefficients can produce oscillatory behaviour of the continu-
ous solution [45]. Such oscillations may be impossible to approximate on coarse meshes,
which may form an obstacle for successful multigrid solution of these equations. Further,
standard theory regarding existence and uniqueness of solutions of elliptic problems uses
the positivity of the bilinear part of the weak form of the PDE. This positivity can easily
be violated if negative reaction coefficients are permitted. Thus, most publications restrict
themselves to non-negative reaction coefficients, for example [33].
If the reaction terms in (2.5.8) are ignored, another model operator, convection-diffusion,
$$ -\nu\Delta + w \cdot \nabla, \qquad (2.5.9) $$
$$ Pe := \frac{h\|w\|}{2\nu}, \qquad (2.5.10) $$
i.e. if Pe > 1 the Galerkin FE discretisation may be unstable and oscillations may occur
which are not typical of the continuous solutions. In the context of multigrid, good ap-
proximations on coarse meshes are required, such that obtaining Pe ≤ 1 is not generally
feasible on all meshes. To avoid the spurious oscillations, stabilised discretisations such as
the Streamline-Upwind Petrov Galerkin (SUPG) and Sub-Grid Stabilisation (SGS) techniques
have been proposed as a remedy; see e.g. [47] for an overview
of different approaches. These schemes modify the discrete weak form by adding terms
which improve stability for Pe > 1 and vanish as h → +0. For example in the SUPG
method, the Galerkin discrete form
is equivalent to diffusion along direction w, i.e. the streamline. This together with the
classification as a Petrov-Galerkin discretisation, due to the use of discrete test functions
$v_h + \delta w \cdot \nabla v_h$ which are from a different discrete function space than the ansatz functions
$u_h$, gives this method its name, streamline-upwind$^7$ Petrov Galerkin method. Note that
testing with $(v_h + \delta w \cdot \nabla v_h)$ implies terms of the form $(-\nu\Delta u_h, w \cdot \nabla v_h)$ which vanish
for linear elements, but not for higher order elements. The choice of the stabilisation
parameter δ ≥ 0 is thereby crucial for the success of the approach. Large values of δ
smooth out important solution features like boundary layers, while too small values result
in unstable discretisations. A typical choice in the case of SUPG is, for example [68],
$$ \delta := \frac{\bar{h}}{\|\bar{w}\|} \max\left\{ 0, \frac{1}{2} - \frac{\nu}{\bar{h}\|\bar{w}\|} \right\}, \qquad (2.5.14) $$
where $\bar{h}$ is a measure of the element size along the wind direction w, and $\|\bar{w}\|$ is the
Euclidean norm of the wind vector at the centre of the element. The stabilisation allows
meaningful (i.e. non-oscillatory) approximations on relatively coarse meshes and,
because the stabilisation terms vanish as h → +0, asymptotic convergence is maintained.
Thereby the h order of the stabilisation terms influences the overall convergence order
of the discretisation and has to be chosen to allow fast convergence while maintaining
stability.
$^7$ The term up-winding is generally used for stabilisation approaches for convection discretisations. It
stems from finite difference approaches where forward differences in direction of the wind are unstable
while backward differences according to the wind direction are stable.
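A small sketch may help to see how such a parameter behaves. It assumes that the SUPG parameter has the form δ = (h̄/‖w̄‖) max(0, 1/2 − ν/(h̄‖w̄‖)) and uses the Peclet number Pe = h‖w‖/(2ν) from (2.5.10); the function names and sample values are hypothetical.

```python
def peclet(h, w_norm, nu):
    """Mesh Peclet number Pe := h |w| / (2 nu)."""
    return h * w_norm / (2.0 * nu)

def supg_delta(h_bar, w_norm, nu):
    """SUPG parameter (h/|w|) * max(0, 1/2 - nu/(h |w|)), assumed form."""
    if w_norm == 0.0:
        return 0.0                      # no wind, no streamline diffusion
    return h_bar / w_norm * max(0.0, 0.5 - nu / (h_bar * w_norm))

# Diffusion dominated element (Pe < 1): no stabilisation is added.
print(peclet(0.1, 1.0, 0.1), supg_delta(0.1, 1.0, 0.1))
# Convection dominated element (Pe > 1): delta approaches h / (2 |w|).
print(peclet(0.1, 1.0, 0.001), supg_delta(0.1, 1.0, 0.001))
```

Written this way, 1/2 − ν/(h̄‖w̄‖) equals (1 − 1/Pe)/2, so the stabilisation switches on exactly where the Galerkin discretisation may become unstable.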
In [68] the SUPG approach has been used to develop a multigrid preconditioner for
GMRES, to solve convection-diffusion problems in an h-independent number of GMRES
steps. Three different prolongation operators have been considered: standard FE
interpolation, and two types of operator dependent prolongation. Restriction was chosen to be
the transpose of the prolongation operator in each of the cases. For geometric multigrid,
i.e. the coarse grid matrices are assembled on coarse meshes rather than constructed from
the fine grid matrices directly, the tests in [68] indicate that the standard interpolation
performed at least as well as the operator dependent prolongations. Further, the results in
[68] demonstrate that in order to achieve mesh independent iteration counts for GMRES,
appropriate choice of the stabilisation parameter is crucial.
The choice of the smoother for convection-diffusion problems deserves more atten-
tion than in the Poisson case. In the case of Gauß-Seidel smoothing the node numbering,
or alternatively the order in which the rows of the equation system are traversed by the
Gauß-Seidel sweep, is far more important than in the Poisson case. The convection wind
w dictates a preferred direction for the transport of information. For pure convection, a
forcing term located at a specific node has strong influence on the solution at all nodes in
downstream direction of that location, while it has no influence on the solution upstream
of that node. For convection-diffusion with dominating convection this effect is not as
strong, but still important. Thus, it is advantageous in a Gauß-Seidel sweep if the solution
approximation at the upstream nodes is improved first, because any changes to it affect
all downstream nodes, whereas changes at the downstream nodes have little influence on
the upstream solution. For some problems, e.g. flow in a pipe, it is very easy to number
the nodes of the mesh accordingly, by simply sorting them with increasing distance from
the inflow boundary. However, in most interesting flows this is not as easy, because recir-
culations occur. Due to the recirculation some nodes of the mesh are upstream of them-
selves. Simple geometrical ordering according to the distance from the inflow boundary
is not optimal in these cases. Also for some problems, like the driven cavity problems
for example, there is no inflow boundary, as the boundary conditions specify tangential
velocities at all parts of the boundary. As a remedy, two Gauß-Seidel sweeps are used
as smoother in [68], the first ordering the nodes left-to-right, the second top-to-bottom.
Typically the Gauß-Seidel sweeps for the pre- and post-smooths are done in opposing
directions, thus the pre-smoothing might be done left-to-right then top-to-bottom, and the
post-smoothing bottom-to-top then right-to-left. This way the information can travel in
any direction within one V-cycle.
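The influence of the sweep direction can be reproduced with a minimal experiment. The sketch assumes a 1D convection-diffusion model problem with first-order upwind finite differences and wind from left to right, which is a stand-in for the finite element systems considered here; all names and parameter values are illustrative.

```python
import numpy as np

def upwind_convection_diffusion(n, nu, w):
    """1D -nu u'' + w u' (w > 0), first-order upwind, Dirichlet, n interior nodes."""
    h = 1.0 / (n + 1)
    lower = -nu / h**2 - w / h
    diag = 2.0 * nu / h**2 + w / h
    upper = -nu / h**2
    return diag * np.eye(n) + lower * np.eye(n, k=-1) + upper * np.eye(n, k=1)

def gauss_seidel(A, b, order, sweeps):
    """Gauss-Seidel sweeps visiting the rows in the given order, from x = 0."""
    x = np.zeros(len(b))
    for _ in range(sweeps):
        for i in order:
            x[i] += (b[i] - A[i] @ x) / A[i, i]
    return x

n = 50
A = upwind_convection_diffusion(n, nu=1e-4, w=1.0)
b = np.ones(n)
downstream = list(range(n))       # follow the wind (left to right)
upstream = downstream[::-1]       # sweep against the wind

res_down = np.linalg.norm(b - A @ gauss_seidel(A, b, downstream, sweeps=5))
res_up = np.linalg.norm(b - A @ gauss_seidel(A, b, upstream, sweeps=5))
print(res_down, res_up)
```

After the same number of sweeps the residual of the downstream ordering is smaller by several orders of magnitude, since a sweep against the wind only transports information by one node per sweep.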
Inspired by the success of the multigrid preconditioned GMRES solver in [68] this
strategy has also been adopted in this present work. However, in contrast to the bilinear
quadrangular elements used in [68], quadratic triangular elements are used here in the
definition of the discrete systems corresponding to the F-block. Even though this type
of element is frequently used in practice, e.g. as part of the Taylor-Hood element pair
in this present work, the author has been unable to find any reference in the literature
that investigates stabilisation of convection-diffusion or multigrid solution of convection-
diffusion for this type of element. Yet for linear and bilinear elements there are numerous
publications on this type of stabilisation, e.g. [17, 23, 33, 45, 47, 62, 63, 68] (and many
more).
Often the analysis of these stabilisation approaches contains terms which vanish for the
low order elements for which they are used, e.g. [47, Equation (77)] proposes a scheme
with
$$ a_\delta(u_h, v_h) - a(u_h, v_h) = (\mathcal{L} u_h - f, \delta(-\mathcal{L}^* \nabla v_h)), \qquad (2.5.15) $$
where $\mathcal{L}$ denotes the differential operator of the PDE, $\mathcal{L}^*$ its adjoint and the inner product
is evaluated element-wise (ignoring jump terms at element interfaces). In the case of
convection-diffusion $\mathcal{L} = -\nu\Delta + w \cdot \nabla$, the resulting integral contains terms of the form
$\Delta u_h$ and $\Delta v_h$ which are zero for linear and bilinear elements. This approach has been tested
by the author during the preparation of this thesis, together with the proposed stabilisation
parameter
$$ \delta_{\mathrm{Hughes}} := \frac{h}{2\|w\|} \left( \coth(Pe) - \frac{1}{Pe} \right). \qquad (2.5.16) $$
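Evaluating such a parameter numerically needs some care for small Peclet numbers, where coth(Pe) and 1/Pe nearly cancel. The following sketch assumes the form δ = h/(2‖w‖) (coth(Pe) − 1/Pe) with Pe = h‖w‖/(2ν); the switch to a Taylor expansion below a threshold of 1e-4 is an implementation choice made here, and the function name is hypothetical.

```python
import math

def delta_hughes(h, w_norm, nu):
    """delta := h / (2 |w|) * (coth(Pe) - 1/Pe), with Pe := h |w| / (2 nu)."""
    if w_norm == 0.0:
        return 0.0
    pe = h * w_norm / (2.0 * nu)
    if pe < 1e-4:
        xi = pe / 3.0                   # coth(x) - 1/x = x/3 - x^3/45 + ...
    else:
        xi = 1.0 / math.tanh(pe) - 1.0 / pe
    return h / (2.0 * w_norm) * xi

print(delta_hughes(0.1, 1.0, 1e-6))    # Pe >> 1: close to h / (2 |w|)
print(delta_hughes(0.1, 1.0, 1e3))     # Pe << 1: close to h^2 / (12 nu)
```

The two limits show the intended behaviour: full streamline diffusion of size h/(2‖w‖) for convection dominated elements, and a parameter that vanishes like h²/(12ν) when diffusion dominates.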
If the coarse grid discretisations were defined that way for the convection-diffusion sub-
problems that arise when the Fp preconditioner is applied to the Picard linearisation dis-
without stabilisation, is included in this list under the name Stab0. The simplest approach
to stabilisation is to modify the differential operator by increasing the viscosity constant
ν such that the stable regime is maintained. This approach is generally referred to as
artificial diffusion (e.g. [51]) and is known to give poor approximations because it changes
the properties of the differential operator. However, as the stabilised approaches are only
used in the coarse grid discretisations which are used in a multigrid preconditioner here,
it is not necessary that these discretisations give the most accurate discrete solution, but
only something that approximates the low frequencies of the solution well. The diffusion
parameter ν is set to a mesh wide constant value, which is increased by a factor of two
with each mesh coarsening. This stabilisation variant is denoted by Stab1.
A refined approach is to add diffusion only in the streamline direction, which is
achieved by using formula (2.5.12). In the case of linear elements this can be seen as
a Petrov-Galerkin discretisation, but for quadratic elements this does not hold. Instead,
(2.5.12) can be seen as a perturbation of the Galerkin formulation. This approach is de-
noted by Stab2. A simplification is possible by using linear elements for the coarse grid
discretisations. This can be implemented using linear element functions on the finest
mesh as the second finest discretisation in the hierarchy, and applying coarsening to de-
fine the remaining levels of the hierarchy. An important advantage of this approach is that
for linear elements the stiffness matrices have fewer non-zero entries per row and thus
storage requirements are lower and fewer floating point operations are required for the
multigrid ingredients of smoothing, and evaluation of the residuals. Further, the stabilised
methods are readily available for linear elements. The resulting approach is denoted by
Stab3. If the reaction terms are present, i.e. for Newton linearisation, a further simplifica-
tion can be made by dropping the reaction terms on the coarse grids, resulting in a further
reduction of the memory requirements as the coupling between the velocity components
is removed. This approach is referred to as Stab4.
Finally, application of the Sub-Grid Stabilisation (SGS) approach [47] is attempted,
denoted by Stab5 in the overview Table 2.2. However, as already described earlier in this
subsection, this approach failed outright even in the tests for Picard linearisation at low
Re. Thus, no results for Stab5 are included in the test result tables below.
Now that all these aspects of the multigrid preconditioning of the F-block subprob-
lems have been discussed we present some exemplary results for the driven cavity prob-
lem. Both Picard and Newton linearisation are considered, for two Reynolds number
regimes, Re = 10 and Re = 100. The nonlinear solver steps are performed until both the
Euclidean norm of the residual and the update are less than $10^{-1}$ for the coarse levels and
less than $10^{-6}$ on the finest mesh. The stopping criterion for the outer linear solves is set
Name Description
Stab0 No stabilisation. (Standard Galerkin discretisation.)
Stab1 Artificial diffusion. The viscosity constant ν is doubled with
every level of mesh coarsening.
Stab2 Streamline diffusion. Only the streamline diffusion term
(2.5.12) with δ according to (2.5.14) is added to the standard
Galerkin discretisation. (Not a Petrov-Galerkin discretisation.)
Stab3 Linear elements and SUPG. Only linear elements are used
in the coarse meshes and SUPG according to (2.5.12) and
(2.5.14). (True Petrov-Galerkin discretisation.)
Stab4 Same as Stab3, but if reaction terms are present (Newton lin-
earisation) these are switched off on coarse meshes.
Stab5 Sub-grid Stabilisation. The stabilisation term (2.5.15) is
used in conjunction with (2.5.14).
Table 2.2: Stabilisation approaches for the coarse grid representations of the F-block
                 Stab1      Stab2      Stab3
total time       9.5e+02    9.3e+02    9.3e+02
Table 2.3: Inner iteration counts F-block for driven cavity at Re = 10, Newton linearisation,
V-cycle, various stabilisations
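                            itF /itouter
     ν          #dof        Stab0      Stab2      Stab3      Stab4
-------------------------------------------------------------------
  *  5.6e-02       162          9         10          8          8
  *  5.6e-02       162          8         10          8          8
-------------------------------------------------------------------
  *  4.8e-02       578          9         11          9          9
  *  4.8e-02       578          9         10          9          9
-------------------------------------------------------------------
  *  3.1e-02      2178         11         12         11         11
  *  3.1e-02      2178         11         13         11         11
-------------------------------------------------------------------
  *  2.5e-02      8450         12         14         13         13
  *  2.5e-02      8450         13         15         13         13
-------------------------------------------------------------------
  *  1.3e-02     33282         18         21         19         19
  *  1.3e-02     33282         18         21         19         19
  *  1.3e-02     33282         18         21         19         19
  *  1.3e-02     33282         18         21         18         18
-------------------------------------------------------------------
     1.0e-02    132098         22         26         22         22
     1.0e-02    132098         22         26         22         22
     1.0e-02    132098         22         25         22         22
     1.0e-02    132098         22         26         22         22
-------------------------------------------------------------------
     1.0e-02    526338         22         26         22         22
     1.0e-02    526338         22         26         23         23
     1.0e-02    526338         22         26         23         23
     1.0e-02    526338         22         26         23         23
-------------------------------------------------------------------
  total time              2.6e+03    3.4e+03    2.8e+03    2.8e+03
Table 2.4: Inner iteration counts F-block for driven cavity at Re = 100, Picard linearisation,
V-cycle, various stabilisations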
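                            itF /itouter
     ν          #dof        Stab0      Stab2      Stab3      Stab4
-------------------------------------------------------------------
  *  5.6e-02       162         10         11         10         10
  *  5.6e-02       162         10         11         10         10
-------------------------------------------------------------------
  *  4.8e-02       578         10         12         10         11
  *  4.8e-02       578         11         12         10         12
-------------------------------------------------------------------
  *  3.1e-02      2178         13         15         14         15
  *  3.1e-02      2178         14         15         14         15
-------------------------------------------------------------------
  *  2.5e-02      8450         16         17         16         18
  *  2.5e-02      8450         16         17         16         18
-------------------------------------------------------------------
  *  1.3e-02     33282         26         27         25         27
  *  1.3e-02     33282         26         27         24         28
  *  1.3e-02     33282         26         27         25         27
-------------------------------------------------------------------
     1.0e-02    132098         63         37         32         37
     1.0e-02    132098         54         37         31         35
     1.0e-02    132098         54         35         32         39
-------------------------------------------------------------------
     1.0e-02    526338         39         29         27         29
     1.0e-02    526338         54         37         32         38
     1.0e-02    526338         52         38         31         34
-------------------------------------------------------------------
  total time              8.3e+03    5.5e+03    4.7e+03    5.0e+03
Table 2.5: Inner iteration counts F-block for driven cavity at Re = 100, Newton linearisation,
V-cycle, various stabilisations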
to $10^{-5}$ improvement in the norm of the preconditioned residual. For the inner solves a
relative improvement by $10^{-6}$ in the preconditioned residual is required
for the first call of the inner solver within the solve of each outer system, and the resulting
number of iterations is fixed for all consecutive calls. Iteration counts for the inner
subproblems with the F-block are presented in Tables 2.3, 2.4 and 2.5. In each case the table
lists the actual ν used on each refinement level, the number of degrees of freedom (#dof)
in the outer nonlinear system, and for each multigrid preconditioner variant the number
of inner iterations for the F-block per outer iteration (itF /itouter ). The refinement levels
are separated by lines while the individual rows within each block constitute the steps of
the nonlinear solver.
In Table 2.3 the results are presented for Newton linearisation in the case of Re = 10.
Apart from the stabilisation Stab1 (artificial viscosity) all the columns show textbook
multigrid performance. The degradation in the case of Stab1 shows that this approach is
unsuitable for constructing a competitive multigrid preconditioner. A column for Stab0,
i.e. no stabilisation, as well as a similar table for Picard iteration have been omitted be-
cause the results are virtually identical. However, the situation changes if Re is increased
to Re = 100, see tables 2.4 and 2.5. Here it is actually the case that without increasing ν
the stability condition Pe < 1 is only fulfilled on the two finest meshes used in the tests.
The levels where ν was increased to guarantee this stability condition are marked with a
* in the first column. For Picard linearisation, Table 2.4, a rather surprising result is ob-
tained. The variant without stabilisation of the coarse grid discretisations, Stab0, actually
performs better than any of the stabilised variants. However, the variants Stab3 and Stab4
show almost the same performance as Stab0, while Stab2 results in slightly higher itera-
tion numbers. Mesh independent iteration numbers are obtained in all four stabilisation
variants as there is only an increase between the blocks where the diffusion parameter ν is
decreased. Thus optimal performance is achieved for all four variants, but a deterioration
is observed as the Reynolds number is increased.
For the Newton linearisation the situation is similar in the sense that the strongest
differences in iteration numbers occur where the diffusion parameter ν is decreased, see
Table 2.5. But here the differences between the stabilisation approaches are more pro-
nounced, and the deterioration as Re is increased is also more significant. The non-
stabilised variant Stab0 actually performs worst, while among the stabilised variants
Stab3, streamline diffusion on linear elements, performs best in multiple ways: it re-
sults in the lowest overall time to solution, the lowest iteration numbers and the smallest
variations between consecutive Newton steps. Thus we conclude that Stab3 is the most
preferable of the listed stabilisation approaches.
Note that the variant Stab1 has been omitted for the Re = 100 tests due to the poor
results in the Re = 10 test case. Note also that this is only one example problem, but it
shows the general trends observed in more extensive tests carried out by the author. In
one such test the ordering of the nodes described in [68] was tried, i.e. performing
two ordered Gauß-Seidel sweeps, one sorted by the x component of the node positions and
the second sweep sorted by the y component. This resulted in slightly decreased iteration
numbers, but this did not compensate for the higher cost due to the multiple smoother
sweeps. The deterioration of the multigrid performance was unaffected by these ordered
smoother sweeps.
Considering the partial character of the success of the geometric multigrid (GMG)
approach for the problems considered in this subsection, the question remains if other
approaches might provide a better base for a preconditioner. In [106] both geometric and
algebraic multigrid (AMG) approaches have been considered for convection-diffusion
problems, with the result that both approaches show deficiencies as the diffusion param-
eter is decreased. However, there it was found that for convection dominated problems
GMRES with an AMG preconditioner was the best of the solvers considered. Another
recent approach is to utilise hierarchical matrix ($\mathcal{H}$-matrix) approximate LU factorisations
[15] as preconditioner for GMRES. For the streamline-diffusion FE discretisation
of the convection-diffusion equation the approximate LU decompositions can be com-
puted with a given fixed accuracy in logarithmic-linear complexity in both memory and
computational cost [15], rendering this an attractive approach. However, both of these ap-
proaches have so far only been applied to stabilised discretisations of convection-diffusion
problems. Application to the F-block subproblems considered in this work is a possible
future research direction. The $\mathcal{H}$-matrix approach in particular may be promising for the
subproblems arising in the Newton linearisation, as it allows the accuracy of the
approximate LU factorisation to be specified even in the general case, i.e. for arbitrary system matrices.
The special properties of the system matrices, which arise due to the properties of the dis-
cretised PDEs, are required to guarantee the boundedness of memory and computational
cost, but the approach should work reliably (albeit more expensively) for more general
systems as well.
An important aspect that has been largely ignored in the previous subsections is the effi-
cient implementation of two side-constraints to the linearised Navier-Stokes systems: the
\[ K u = b, \]
with
\[ K = \left[ a(\varphi_j, \varphi_i) \right]_{i,j=1}^{n-k}, \qquad b = \left[ b(\varphi_i) - a(g_0, \varphi_i) \right]_{i=1}^{n-k}, \]
$\{\varphi_1, \dots, \varphi_{n-k}\}$ denoting a basis of the subspace $V_h^0$, $g_0$ denoting a function fulfilling the
Dirichlet boundary conditions which is extended by zero to all nodes in the
interior of the domain in the case of Dirichlet boundary conditions, and $g_0 = 0$ in the case
of the ZMPC. In most cases it is disadvantageous to assemble the stiffness matrix K and
right-hand side b directly. For example in the case of Dirichlet boundary conditions this
would involve consideration of special cases for each element, i.e. checking if one or more
nodes of the element are boundary nodes. The inclusion of such conditional statements
in what is a nested loop of computationally intensive operations has a significant negative
impact on the speed of execution! Instead, the matrix $\tilde{K}$ is often assembled ignoring the
constraints, i.e.
\[ \tilde{K} = \left[ a(\varphi_j, \varphi_i) \right]_{i,j=1}^{n}, \]
and the constraints are then imposed by
\[ K = P^T \tilde{K} P, \tag{2.5.17a} \]
\[ b = P^T \tilde{b} - P^T \tilde{K} \mathbf{g}_0, \tag{2.5.17b} \]
where P denotes an appropriate projector onto the discrete subspace $V_h^0 \subset V_h$ and $\mathbf{g}_0$ the
coefficient vector for the function $g_0$. In the case of Dirichlet boundary conditions, for example,
this amounts to deleting rows and columns from the stiffness matrix, deleting rows from
the right-hand side vector and adjusting the right-hand side to incorporate $a(\cdot, g_0)$. Thus,
\[ P := \begin{bmatrix} I_{n-k} \\ 0 \end{bmatrix} \in \mathbb{R}^{n,n-k}. \]
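The reduction (2.5.17a), (2.5.17b) for this particular P can be sketched with dense NumPy arrays as below. The function name `reduce_system` and its arguments are hypothetical and chosen for this illustration only; a practical implementation would use sparse storage and would not form P explicitly.

```python
import numpy as np


def reduce_system(K_full, b_full, g0, n_free):
    """Form K = P^T K~ P and b = P^T b~ - P^T K~ g0 for P = [I; 0].

    K_full : unconstrained stiffness matrix (n x n), boundary nodes last
    b_full : unconstrained right-hand side (length n)
    g0     : coefficient vector of the boundary function (zero on free nodes)
    n_free : number n - k of unconstrained degrees of freedom
    """
    n = K_full.shape[0]
    # P consists of an identity block on top of a zero block.
    P = np.vstack([np.eye(n_free), np.zeros((n - n_free, n_free))])
    K = P.T @ K_full @ P
    b = P.T @ b_full - P.T @ (K_full @ g0)
    return K, b, P
```

After solving the reduced system for u, the full coefficient vector is recovered as `P @ u + g0` in this sketch.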
However, this modification of the matrix requires the size of the matrix to change and a
reordering of entries if the boundary nodes are not as conveniently numbered as in the
notation here. Further this approach requires the handling of coefficient vectors of size
n − k as well as of size n, including the boundary nodes.
An alternative approach, circumventing these difficulties, is taken in [57] in the con-
text of hanging nodes in adaptive mesh refinement, remarking on its application for the
handling of Dirichlet boundary conditions as well. The first modification this approach
makes is to define the projector P as $P \in \mathbb{R}^{n,n}$, i.e. it projects into a lower dimensional
subspace of $\mathbb{R}^n$ rather than the lower dimensional space $\mathbb{R}^{n-k}$. It may then be observed that if
appropriate projection steps are introduced in an iterative solver (only CG is considered in
[57]) it can be achieved that only residuals of the projected type $r = P^T(\tilde{K}x - \tilde{b})$ are ever
used in the solver and that if the initial guess for the solution is within the subspace $V_h^0$ and
all search directions are projected, $w = P\hat{w}$, then all the iterates for the solution remain
within the subspace $V_h^0$. Since all the arithmetic of the solver is performed in $V_h^0$ this is
actually equivalent to applying the iterative solver to a lower dimensional problem. This
approach implies a small computational overhead because all operations are performed in
$\mathbb{R}^n$ rather than $\mathbb{R}^{n-k}$, but it saves on the cost for setting up the problem.
In [57] it is found that this approach can be implemented using a standard CG imple-
mentation, simply by using a projected version of the preconditioner for the larger space,
(This includes the possibility $\tilde{C}^{-1} = I$.) Analysing this idea it was found that the
optimality of the BPX preconditioner for the Poisson problem is preserved by this approach (for
the hanging nodes treatment). Unfortunately the technique of proof, the fictitious space
lemma, is not directly applicable to the systems under consideration in this present work
here, due to their non-symmetric character.
The question whether implementation in GMRES can be done in the same simple
way, by using the projected preconditioner (2.5.18) can be answered in the affirmative.
Substituting the preconditioner in Algorithm 3 by (2.5.18), one can immediately see that
only projected residuals are ever considered and that all search directions conform with
the subspace Vh0 .
In the remainder of this subsection, first the application of this approach to the Dirich-
let boundary conditions and later its use to implement the ZMPC will be considered in
more detail.
\[ P = \mathrm{diag}([\alpha_1, \dots, \alpha_n]), \]
with
\[ \alpha_i := \begin{cases} 0 & \text{if index } i \text{ is associated with a Dirichlet node}, \\ 1 & \text{otherwise.} \end{cases} \]
A special feature of this case is that if the initial guess of the Krylov subspace solver is
chosen to fulfil non-homogeneous boundary conditions, i.e. $x^{(0)} = g_0$ with $g_0$ as defined
above, then all iterates are in the affine subspace $(V_h^0 + g_0) \subset V_h$. However, as all updates
to this $x^{(0)}$ are in $V_h^0$, this is equivalent to solving in the subspace $V_h^0$ with a right-hand
side equivalent to (2.5.17b).
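To make the projection idea concrete, the following sketch runs an unpreconditioned CG iteration in the full space, projecting every residual with the diagonal P described above. It is an illustration under assumptions, not the implementation of [57] or of the thesis: the function name is hypothetical, the matrix is assumed symmetric positive definite on the free degrees of freedom, and the residual is written as b − Kx, i.e. with the opposite sign to the convention used in the text.

```python
import numpy as np


def projected_cg(K, b, alpha, x0, tol=1e-12, maxit=500):
    """CG in R^n using only projected residuals.

    K, b  : system assembled ignoring the Dirichlet conditions
    alpha : diagonal of P (0.0 for Dirichlet DOFs, 1.0 otherwise)
    x0    : initial guess carrying the Dirichlet values (x0 = g0)
    """
    x = x0.astype(float).copy()
    r = alpha * (b - K @ x)       # projected residual P^T (b - K x)
    w = r.copy()                  # first search direction, lies in V_h^0
    rr = r @ r
    for _ in range(maxit):
        if np.sqrt(rr) <= tol:
            break
        Kw = alpha * (K @ w)      # P^T K w, with w = P w by construction
        step = rr / (w @ Kw)
        x += step * w             # the update never touches Dirichlet DOFs
        r -= step * Kw
        rr_new = r @ r
        w = r + (rr_new / rr) * w
        rr = rr_new
    return x
```

Because every search direction has zero entries at the Dirichlet degrees of freedom, the boundary values contained in the initial guess are carried through unchanged.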
Numerical experiments indicate that preservation of optimality of multigrid as a pre-
conditioner, as proven for BPX and the hanging node treatment in [57], is not trivially
achieved. If the boundary conditions are ignored for the coarse grid representations the
iteration numbers required to achieve a prescribed relative residual reduction in GMRES
increase. This is improved if due consideration is given to the boundary conditions on the
coarse meshes. In a manner similar to the basic projection idea this can easily be achieved
by a slight modification of the Gauß-Seidel smoother. If the vector $d = [d_1, \dots, d_n]$ of
reciprocals of the diagonal entries of the system matrix A is stored,
\[ d_i := \frac{1}{A_{i,i}}, \tag{2.5.19} \]
then the update step of the i-th degree of freedom in the Gauß-Seidel sweep can be
implemented as
\[ x_i := x_i + d_i \Big( b_i - \sum_{k=1}^{n} A_{i,k} x_k \Big). \]
Besides the advantage that this saves a division at each step, which on current hardware
is computationally more expensive than a multiplication, this approach also allows
the values of $d_i$ to be set to zero for all boundary degrees of freedom, i.e. $d := Pd$. As this can
easily be done on all levels of the multigrid hierarchy, the V-cycle with these projected
smoothers becomes equivalent to standard Gauß-Seidel smoothing for the systems where
the boundary degrees of freedom have been removed. This minor modification improves
the performance of multigrid as projected preconditioner (2.5.18). This is demonstrated
in Table 2.6, where iteration counts are given for the inner iterations of the F blocks of
the $F_p$ preconditioner. The table is organised in the same manner as the result tables at the
end of Subsection 2.5.2.2, see page 61 for a detailed description. The variant normal diag
denotes the implementation of the projected preconditioner without projection of d on all
levels, while projected diag denotes the results for the same problem with the projected
smoother on all levels, $d = Pd$. Note that the difference lies only in different variants of
$\tilde{C}^{-1}$.
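The projected smoother described above might be sketched as follows for a dense matrix. Both function names are hypothetical, and the dense row product stands in for the sparse row loop a real implementation would use.

```python
import numpy as np


def projected_reciprocal_diagonal(A, alpha):
    """Return d := P d with d_i = 1 / A_ii and P = diag(alpha), cf. (2.5.19)."""
    return alpha / np.diag(A)


def gauss_seidel_sweep(A, b, x, d):
    """One in-place sweep x_i := x_i + d_i (b_i - sum_k A_ik x_k).

    Entries with d_i = 0 (boundary DOFs) are left unchanged, so the sweep
    acts like Gauss-Seidel on the system with those DOFs removed.
    """
    for i in range(len(b)):
        x[i] += d[i] * (b[i] - A[i, :] @ x)
    return x
```

If x initially carries the Dirichlet values, repeated sweeps in this sketch converge to the solution of the reduced system while the boundary entries stay fixed, which is the behaviour the text attributes to the variant projected diag.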
If the coarsest mesh of the hierarchy is finer than in the example given in Table 2.6,
it becomes important to apply a direct solver on the coarsest grid, because very low fre-
quency components of the error are not sufficiently removed by smoothing alone if the
coarsest grid is too fine. To apply a direct solver it is necessary to implement the Dirich-
let boundary conditions directly on the coarsest mesh. This variant is denoted projected
diag exactCGS. As the values in Table 2.6 demonstrate, this gives only a very slight im-
provement in this case where the coarsest grid consists of only 5 × 5 nodes. However, for
examples with finer and less regular coarse grids noticeable improvements are obtained
by the direct coarse grid solves.
Even though this slight modification is necessary in order to achieve the best multigrid
performance, the projection approach is beneficial for the implementation of the multigrid
components. It simplifies the implementation of restriction and prolongation operators,
since these operators can be exactly the same for interior and boundary elements, due to
the projected smoother.
The zero mean pressure condition. While the zero-mean-pressure condition (ZMPC)
$p \in L_0^2(\Omega)$, i.e. $\int_\Omega p \, d\Omega = 0$, is frequently used in the theory on the existence and
uniqueness of the solutions to Dirichlet problems of the Stokes and Navier-Stokes equations,
often very little is said on how this or similar conditions are enforced for the discrete
problems. The purpose of this condition is to enforce uniqueness of the pressure part of
the solution to these problems, as without any further constraints p is only defined up to
an additive constant. One common approach is to fix the pressure to a specified value,
usually zero, at one node of the mesh, e.g. [38, Vol. 2, Section 3.8.2, Remark 1]. While
this approach is reasonable if this node is chosen well, it may be difficult in general to
make such a choice. If, for example, the pressure is fixed at a location where a pressure
spike occurs, this may result in relatively large pressure values in parts of the domain
where p shows very little variation. This combined with the limitations of finite precision
arithmetic may result in unnecessary cancellation effects.
[Table 2.6: Inner iteration counts F-block for driven cavity at Re = 10, Picard linearisation,
V-cycle, Stab2, implementations of boundary conditions on coarse grids. Table body missing;
only the entry "total time 9.5e+02" is preserved.]
Formulation of the ZMPC for the discrete systems directly leads to an equation of the
form
\[ w^T p = 0, \tag{2.5.20} \]
where w is a weight vector, containing the integrals over the pressure basis functions,
and p is the coefficient vector of the FE pressure solution. One possible approach to
incorporate this equation into the discrete system is to use a Lagrange multiplier approach,
leading to systems of the form
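\[ \begin{bmatrix} F & B^T & 0 \\ B & 0 & w \\ 0 & w^T & 0 \end{bmatrix} \begin{bmatrix} u \\ p \\ \lambda \end{bmatrix} = \begin{bmatrix} f_u \\ f_p \\ 0 \end{bmatrix}, \tag{2.5.21} \]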
where the scalar λ is an auxiliary variable, the Lagrange multiplier. While this approach
is feasible, its effect on the conditioning of the system may be disadvantageous. Further,
it has the disadvantage of changing the system size, which results in significant overhead
in the setup of the problem and during the solution process. Incorporating the weight
vector wT into the sparse system matrix may be disadvantageous due to the relatively
high number of non-zeros of the resulting row.⁹ An equivalent system is obtained by
replacing one row, e.g. the last, of the last block row of (2.5.2) with (2.5.20), as it has
for example been studied in our previous work [75] for the formulation of the discrete
adjoint system. Proof of the equivalence of these two formulations is given in Appendix
B. While this approach avoids the increased size of the system, the issue of its effects on
the conditioning of the resulting problem remains.
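For orientation, the Lagrange multiplier system (2.5.21) can be assembled from dense blocks as in the sketch below. The function name and the dense storage are assumptions made purely for illustration; as noted above, the dense row and column containing w are exactly what makes this formulation unattractive for sparse solvers.

```python
import numpy as np


def augment_with_zmpc(F, B, w):
    """Assemble the block matrix of (2.5.21).

    F : velocity block (nu x nu)
    B : divergence block (m x nu)
    w : weight vector of the pressure basis function integrals (length m)
    """
    nu, m = F.shape[0], B.shape[0]
    K_aug = np.zeros((nu + m + 1, nu + m + 1))
    K_aug[:nu, :nu] = F
    K_aug[:nu, nu:nu + m] = B.T
    K_aug[nu:nu + m, :nu] = B
    K_aug[nu:nu + m, -1] = w      # last column: weights in the pressure rows
    K_aug[-1, nu:nu + m] = w      # last row: the constraint w^T p = 0
    return K_aug
```

In this sketch the multiplier vanishes whenever the right-hand side is compatible with the original singular system, so the velocity and pressure parts of the solution also solve the unconstrained equations.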
In this subsection two more approaches are studied, which turn out to be related. The
first is a “do nothing” approach which under certain conditions can be interpreted as a
projection approach, but with respect to a different subspace than that induced by the
ZMPC. The second approach is based on a projection onto the ZMPC subspace.
In order to analyse these approaches a property of the B-block is required. For this we
recall the definition of B as derived from (2.3.7b) and (2.3.8),
\[ B = [b_{i,j}] \in \mathbb{R}^{M, dN}, \qquad b_{i,j} = b(\varphi^{(j)}, \theta_i) = \int_\Omega \theta_i \, \mathrm{div}(\varphi^{(j)}) \, d\Omega, \tag{2.5.22} \]
⁹ Sparse matrix data structures with constant maximum number of non-zeros per row or column cannot
be used in this case or have to be used with disadvantageous parameters. Banded matrix storage is also not
applicable.
where the θi are pressure space basis functions, M the dimension of the discrete pressure
space Sh , the ϕ( j) are velocity space basis functions and d N the dimension of the velocity
space Vh0 . Note that the notation ϕ( j) is used here to symbolise the j-th basis function of
Vh0 , avoiding the additional index ϕk` which was used in (2.3.8) to distinguish the compo-
nents of the velocity vector. Let us assume Dirichlet boundary conditions for the velocity
on the whole boundary ∂ Ω and that these boundary conditions have been treated accord-
ingly, i.e. ϕ( j) are a basis of Vh0 rather than Vh , or equivalently ϕ( j) (x) = 0 ∀x ∈ ∂ Ω. Let
us also assume that the ZMPC is ignored, that is the θi form a basis of Sh rather than S0h .
For the usual nodal basis functions and $e = [1, 1, \dots, 1]^T$, i.e. a vector of ones, this implies
\[ e^T B = \left[ b(\varphi^{(j)}, 1) \right]_{j=1}^{dN}. \]
Thus,
\[ e^T B = 0. \tag{2.5.23} \]
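The step leading to (2.5.23) can be spelled out as follows. This is a sketch under two assumptions: that the nodal pressure basis functions form a partition of unity, $\sum_i \theta_i \equiv 1$, and that the divergence theorem is applicable on $\Omega$; the symbol $\mathbf{n}$ for the outward unit normal is introduced only for this derivation.

```latex
\begin{align*}
(e^T B)_j = \sum_{i=1}^{M} b_{i,j}
  &= \int_\Omega \Big( \sum_{i=1}^{M} \theta_i \Big) \, \mathrm{div}(\varphi^{(j)}) \, d\Omega
   = \int_\Omega \mathrm{div}(\varphi^{(j)}) \, d\Omega \\
  &= \int_{\partial\Omega} \varphi^{(j)} \cdot \mathbf{n} \, ds = 0,
\end{align*}
```

where the last equality uses $\varphi^{(j)}(x) = 0$ for all $x \in \partial\Omega$.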
Let $e_p$ be defined as the vector containing zeros for all velocity degrees of freedom (DOFs)
and ones for all pressure DOFs. Then (2.5.23) implies for the matrix K as defined in
(2.5.2)
\[ K e_p = 0 \quad \text{and} \quad e_p^T K = 0. \tag{2.5.24} \]
and let $w \in \mathbb{R}^n$ be such that $v_0^T w = 1$. Then $P := I - v_0 w^T$ defines a projector for which it
holds
\[ P^T A P = A \tag{2.5.26} \]
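\begin{align*}
P^T A P &= \left( I - w v_0^T \right) A P \\
        &= \left( A - w (v_0^T A) \right) P \\
        &= A \left( I - v_0 w^T \right) \\
        &= A - (A v_0) w^T \\
        &= A.
\end{align*}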
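The identity (2.5.26) is easy to check numerically. The sketch below assumes, as the chain of equalities above suggests, that $v_0$ is both a right and a left singular vector of A, i.e. $A v_0 = 0$ and $v_0^T A = 0$; both helper functions are hypothetical and serve only to construct a test case.

```python
import numpy as np


def singular_matrix(v0, M):
    """Return (I - v0 v0^T) M (I - v0 v0^T) for a unit vector v0.

    By construction A v0 = 0 and v0^T A = 0.
    """
    Pi = np.eye(len(v0)) - np.outer(v0, v0)
    return Pi @ M @ Pi


def oblique_projector(v0, w):
    """Return P = I - v0 w^T, assuming v0^T w = 1."""
    return np.eye(len(v0)) - np.outer(v0, w)
```

With any w satisfying $v_0^T w = 1$, the matrix P in this sketch is idempotent and maps every vector into the subspace orthogonal to w.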
and
For the second part, we observe that the set of vectors $\{v_0\}$ can be extended to an
orthonormal basis of $\mathbb{R}^n$ by orthogonalising the canonical basis. The resulting orthonormal
basis $\{v_0, v_1, \dots, v_{n-1}\}$ defines an orthogonal matrix Q as
\[ Q^T := \begin{bmatrix} v_0 & v_1 & \dots & v_{n-1} \end{bmatrix}. \tag{2.5.28} \]
Equation (2.5.27) is then easily verified by applying (2.5.25). The rank of a matrix is
invariant under multiplication with non-singular matrices, thus
Remark 2. Note that the condition $\|v_0\| = 1$ was only used to simplify the definition of
Q. For the definition of P it is sufficient to have $v_0 \neq 0$ and $v_0^T w = 1$.
\[ w^T (C^{-1} v) = 0 \quad \forall v \in \mathbb{R}^n : v^T v_0 = 0, \tag{2.5.29} \]
is used where $w^T v_0 = 1$, then all iterates will be in the subspace orthogonal to w, because
the search space is spanned by $\{ (C^{-1} A)^k r_0 \}_{k=0}^{m}$. The corresponding projector is then
Remark 3. Note that $C^{-1}$ is not necessarily invertible and thus not necessarily the inverse
of any $C \in \mathbb{R}^{n,n}$. The notation $C^{-1}$ is used to highlight its use as preconditioner.
\[ y = \hat{P}^T y = \hat{P} y \quad \forall y \in \mathbb{R}^n : v_0^T y = 0 \]
and observing
" #
0 0
vT0 Ay = 0, and vT0 Q y=0 ∀y ∈ Rn ,
0 Ce−1
Thus, if $\tilde{C}^{-1}$ is non-singular, then the product $C^{-1} A$ is non-singular on the subspace
orthogonal to $v_0$.
These ideas from the general case can now be applied to analyse the performance of
the projected $F_p$-preconditioner. For A = K the singular vector is $v_0 = e_p / \|e_p\|$ due to
(2.5.24). Under the assumption that the velocity boundary conditions are implemented
by reducing the dimension of the FE ansatz space and due to the LBB-condition, the
assumption rank(K) = n − 1 holds. Thus, Lemma 1 is applicable to this case. In order to
apply (2.5.31), it has to be verified that the $F_p$-preconditioner (2.5.5) with (2.5.7) applied
with the modified pressure space has the structure
\[ C_{F_p}^{-1} = Q \begin{bmatrix} 0 & 0 \\ 0 & \tilde{C}^{-1} \end{bmatrix} Q^T, \]
or equivalently
\[ e_p^T C_{F_p}^{-1} = 0 \quad \text{and} \quad 0 = C_{F_p}^{-1} e_p. \tag{2.5.32} \]
Recalling,
" #
F −1 0
CF−1 = (2.5.33)
p
−X −1 B F −1 −X −1
" #" #" #
I 0 I 0 F −1 0
= , (2.5.34)
0 −X −1 −B I 0 I
and that $e_p = [0^T, e^T]^T$ in this block notation, (2.5.32) is equivalent to
\[ e^T X^{-1} = 0 \quad \text{and} \quad 0 = X^{-1} e. \tag{2.5.35} \]
Lemma 2. Let $K \in \mathbb{R}^{n,n}$ be the discrete linearised Navier-Stokes system (2.5.2) with
Dirichlet boundary conditions implemented directly and standard FE pressure space,
i.e. ZMPC or equivalent conditions not incorporated into this space, and let $v_0 = e_p / \|e_p\|$,
$\hat{P} = I - v_0 v_0^T$. Let $Q^T := [v_0, v_1, \dots, v_{n-1}]$ be defined as in the proof of Lemma 1 and let
$\tilde{C}^{-1} \in \mathbb{R}^{n-1,n-1}$ be a preconditioner with
\[ \underline{\gamma} \le \frac{\| \tilde{C}^{-1} \tilde{A} x \|}{\| x \|} \le \overline{\gamma} \quad \forall x \in \mathbb{R}^{n-1} : x \neq 0. \tag{2.5.36} \]
¹⁰ Note that $e^T X^{-1} = 0$ does not hold otherwise.
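Then
\[ \underline{\gamma} \le \frac{\| C^{-1} A x \|}{\| x \|} \le \overline{\gamma} \quad \forall x \in \mathbb{R}^n : x \neq 0,\ v_0^T x = 0, \tag{2.5.37} \]
Proof. The assertion is implied by (2.5.31) and the construction of Q as an orthogonal
matrix.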
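\[ w = \frac{\tilde{w}}{\tilde{w}^T e_p}, \tag{2.5.39} \]
where
\[ \tilde{w} := [\tilde{w}_i]_{i=1}^{n}, \qquad \tilde{w}_i := \begin{cases} 0 & \text{for velocity DOFs,} \\ \int_\Omega \theta_i \, d\Omega & \text{for pressure DOFs.} \end{cases} \]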
does preserve optimality of the Fp preconditioner and guarantees that GMRES constructs
[Figure: sketch with the labels $w^T x = 0$, $e_p^T x = 0$, $x$, $e_p$ and two projections $Px$ of $x$.]
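A projection onto the ZMPC subspace built from the weight vector (2.5.39) might look as follows. This is a sketch under assumptions: the projector is taken to be of the form $P = I - e_p w^T$, i.e. the projector of Lemma 1 with the unnormalised vector $e_p$ in place of $v_0$ (which Remark 2 permits since $e_p^T w = 1$), and the function names as well as the ordering "velocity DOFs first, pressure DOFs last" are hypothetical.

```python
import numpy as np


def zmpc_weight(theta_integrals, n_velocity):
    """Return w = w~ / (w~^T e_p), cf. (2.5.39), together with e_p.

    theta_integrals : integrals of the pressure basis functions over the domain
    n_velocity      : number of velocity DOFs (assumed to be numbered first)
    """
    w_tilde = np.concatenate([np.zeros(n_velocity), theta_integrals])
    e_p = np.concatenate([np.zeros(n_velocity), np.ones(len(theta_integrals))])
    return w_tilde / (w_tilde @ e_p), e_p


def project_zero_mean(x, w, e_p):
    """Apply P = I - e_p w^T without forming the matrix."""
    return x - e_p * (w @ x)
```

Applying the projection costs one inner product and one vector update, leaves the velocity part of x untouched and shifts the pressure part by a constant so that $w^T x = 0$.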
Implementing a geometric multigrid (GMG) solution procedure for the linearised Navier-
Stokes equations is a relatively difficult task compared to simpler operators such as the
Laplacian [95]. In particular selecting a good smoother is not trivial. For example Gauß-
Seidel smoothing, a very common choice for elliptic self-adjoint PDEs, is not possible
because of the zero diagonal entries in the pressure-pressure block of the Jacobian K.¹¹
Even if this difficulty is avoided, for example by the use of artificial compressibility or
stabilised methods, the velocity-pressure coupling is not properly accounted for.
The basic idea of the smoother. In [95, pp. 320–322] the so-called box smoother
for the Navier-Stokes equations is discussed. In contrast to the Gauß-Seidel smoothing
procedure, where only one component of the solution vector is updated at a time, the
box smoother updates all unknowns related to a small part of the mesh simultaneously
by solving a local (small) system of equations. The basic procedure is summarised in
¹¹ For inherently stable discretisations the entire pressure-pressure block is zero.
Algorithm 4. Note that this algorithm is based on a more general Jacobian matrix K than
that arising from a stable FE discretisation, to allow its use in the FV setting later on
as well. In a sense this procedure is a generalisation of the Gauß-Seidel smoother, but
unfortunately it is computationally more expensive. Note that in other publications this
smoother idea is often referred to as Vanka smoother, honouring its first appearance in
[97].
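The idea of the box smoother can be sketched generically as below. This is not a transcription of Algorithm 4: the function name, the dense storage, the representation of the index sets as lists of DOF numbers and the damping parameter `omega` are assumptions for illustration.

```python
import numpy as np


def box_smoother_sweep(K, b, x, index_sets, omega=1.0):
    """One damped sweep of a box (Vanka-type) smoother, updating x in place.

    For each local index set the current residual is restricted to the
    local DOFs, the small dense local system is solved, and the local
    unknowns are updated simultaneously.
    """
    for idx in index_sets:
        idx = np.asarray(idx)
        r_loc = b[idx] - K[idx, :] @ x        # local part of the residual
        K_loc = K[np.ix_(idx, idx)]           # small dense local matrix
        x[idx] += omega * np.linalg.solve(K_loc, r_loc)
    return x
```

With index sets that each contain a single degree of freedom and `omega = 1`, this sketch reduces to the Gauß-Seidel sweep, which illustrates in what sense the procedure generalises it.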
Definition of the local systems. The definition of the local degrees of freedom (LDOFs)
is an important issue. Since general dense matrix solvers are used, the cost for setting up
and solving the local systems grows like $n^3$, where n is the number of local degrees of
freedom. Thus n should be kept very small. On the other hand, it is necessary to
define these systems such that sufficient smoothing takes place. In [95] this smoother is
applied to a staggered finite difference discretisation. In this setting use of the degrees of
freedom associated with the individual mesh cells (see Figure 2.9) yields very small
systems (9 by 9) and the number of systems (i.e. $|\mathcal{I}|$) is equal to the number of cells, but the
resulting procedure still possesses very good smoothing properties. For the FE discretisa-
tion employed in this thesis, i.e. using Taylor-Hood elements, the definition of the local
degrees of freedom is not as straightforward. References applying this type of smoother
to triangular Taylor-Hood elements are rare.
[Figure: mesh cell with the associated u,v and p degrees of freedom]
The method is much more frequently applied
to lowest order stabilisations, e.g. the $Q_1^{\mathrm{rot}} - Q_0$ (Rannacher-Turek) elements, see for
example [96]. In [50] this type of smoother has been applied in multigrid schemes to various
types of finite elements for incompressible flow. The main topic of [50] was higher order
discretisation schemes, of which the triangular Taylor-Hood elements are one example.
The local DOFs in [50] have been defined as
b. a pressure DOF and all velocity degrees of freedom related to it in the case of
continuous pressure FE spaces.
Both these approaches are applied to the (continuous pressure) elements considered in
this thesis, as summarised in Figure 2.10. In order to refine the definition of “all velocity
degrees of freedom related” to a pressure node, two distinct possible cases are considered.
If actually all velocity DOFs of the elements containing a pressure node are included in
the definition of a local problem, see the p-support Neumann part of Figure 2.10, no
actual physical meaning can be associated with the local problem. It is neither a local
Dirichlet nor a local Neumann problem.¹² Nevertheless the notation p-support Neumann
is used here because the problems are more similar to the Neumann case. A physically
meaningful local problem, i.e. a local Dirichlet problem, is obtained if the velocity DOFs
corresponding to the boundary of the local element patch are dropped, see the p-support
Dirichlet part of Figure 2.10. This definition also has the advantage of a reduced size for
the local problems, resulting in lower costs for solving them.
[Figure 2.10, panel labels:
 p-support Dirichlet: index set: pressure nodes
 p-support Neumann: index set: pressure nodes; local DOFs: (p,u,v) at the node, (u,v) of adjacent element
 Element based: index set: elements; local DOFs: all DOFs of the element]
Stabilisation. The issue of stabilisation arises again in this context, because even if the
finest mesh of the hierarchy is chosen such that the Galerkin FE discretisation of the
convection operator is stable, the discretisations on the coarser meshes may be unstable.
In this case not only the convection-reaction-diffusion operator has to be stabilised, but the
whole Navier-Stokes system. For this purpose the Streamline Diffusion Finite Element
Method (SDFEM) as described and analysed in [94] together with the parameter choice
of [34] is used here. In this work only the steady state equations are considered and
the pressure FE space consists of continuous functions. Thus [94, eq.(4.6)], modified
according to [94, Remark 3.2], simplifies to: find $(u^h, p^h) \in V^h_g \times S^h_0$ such that
for all $(v^h, q^h) \in V^h_0 \times S^h_0$. Equation (2.5.42) replaces the Galerkin FE discretisation
(2.3.7). Similar to (2.3.7), (2.5.42) can be expressed in the notation of [40], i.e. using
the bilinear forms a(., .), b(., .) for the diffusive and divergence terms respectively and
the trilinear form c(., ., .) for the convective terms. The stabilisation terms are denoted
by d1 (., ., ., ., ., .) and d2 (., .), which are also linear in each of their arguments respectively
and are only present in the SDFEM formulation. So (2.5.42) becomes
In order to simplify the notation the superscript h will be omitted for the remainder of this
subsection, as only discrete functions are considered anyway.
Application of Newton’s method to solve the nonlinear system (2.5.43) requires one
to solve
However, in the computational experiments performed in this work the Newton iteration
(2.5.44) was found to be divergent even for moderate Re. Thus for the results presented
later in this subsection the more robust Picard iteration has been applied:
The parameter choice of [34, equation (17)], with constant according to [34, Experi-
ment 5.5], gives in the notation of (2.5.42)
\[ \alpha = \frac{1}{\nu^2}, \qquad \delta = \nu. \tag{2.5.46} \]
In order to verify the correct implementation of this discretisation scheme it has been
applied to the standard test problem of driven cavity flow, see Section 2.7.2.1.
Variants arising by the choice of smoother parameters. In our tests all three smoother
variants, p-support Dirichlet, p-support Neumann and elem based, required significant
damping in order to get robust and fast convergence for the multigrid preconditioned
GMRES solver. As the most successful schemes are obtained for damping factors of 0.6
up to 0.8 (see result tables later in this subsection), the question arises if this might be
due to a systematic requirement of the approach. Indeed, such a systematic scaling of the
updates can be found in [79], where the smoothing properties of this type of smoother
applied in a Jacobi fashion (rather than in a Gauß-Seidel fashion) have been analysed for
mixed FEM discretisations of the Stokes equations. There the velocity updates are scaled
according to the number of appearances of each velocity DOF in a smoother sweep. To
achieve this in a symmetry-preserving manner (2.5.41) is replaced by
where D is a diagonal matrix with $\sqrt{\text{number of appearances}}$ as entries for the velocity
DOFs and ones for the pressure DOFs. As this introduces weighting of the velocity DOFs
into the smoother, this variant is denoted by Vweights in the tables at the end of this
subsection.
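The construction of the diagonal of D can be illustrated by a minimal sketch. It assumes that each local problem is given as a list of global DOF indices; the function name `vweights_diagonal` and the toy numbering are hypothetical and not taken from the thesis code. Only D itself is formed here, not the modified smoother update.

```python
import math
from collections import Counter

def vweights_diagonal(patches, velocity_dofs, n_dofs):
    """Diagonal of D: sqrt(number of appearances) for velocity DOFs, 1 otherwise."""
    # Count how often each DOF appears over all local problems of one sweep.
    counts = Counter(dof for patch in patches for dof in patch)
    return [math.sqrt(counts[d]) if d in velocity_dofs and counts[d] > 0 else 1.0
            for d in range(n_dofs)]

# Hypothetical toy numbering: DOFs 0..3 are velocities, DOFs 4..5 are pressures.
patches = [[0, 1, 4], [1, 2, 3, 5]]
diag = vweights_diagonal(patches, velocity_dofs={0, 1, 2, 3}, n_dofs=6)
```

In this toy setting only velocity DOF 1 appears in two local problems, so it is the only entry that differs from one.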
Further, due to the potentially dominating convection, the influence of the order in
which the local problems appear in the Gauß-Seidel sweeps may again be of importance,
as in Subsection 2.5.2.2. In order to analyse this issue, tests have been performed with
the cell patches in their natural order created by the refinement algorithm13 (label nosort)
and for comparison sorted by increasing x coordinate first and for equal x by increasing
y coordinate (no label). Motivated by the approach in [68] a third sorting approach is
also considered in the tests, for which after a sweep sorted by increasing x and increasing
y, a second smoother sweep is performed sorted by decreasing y and for equal y sorted
by decreasing x. This third sorting mode is labelled 2sort in the result tables. Since the
post-smoothing sweeps the mesh in the opposite order to the pre-smoothing, information can
travel in all four basic directions (right, down, up, left) on each level, within one multigrid
cycle.
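The three orderings can be sketched as follows, assuming each local problem is represented by a reference point (x, y). The function name `sweep_orders` and the dictionary key "sorted" for the unlabelled variant are illustrative choices, not names from the thesis code.

```python
def sweep_orders(centres):
    """Return, for each sorting mode, the list of sweeps (each a list of patch indices)."""
    idx = range(len(centres))
    natural = list(idx)  # order created by the refinement algorithm (nosort)
    # Increasing x first, for equal x increasing y.
    by_xy = sorted(idx, key=lambda i: (centres[i][0], centres[i][1]))
    # Decreasing y first, for equal y decreasing x (second sweep of 2sort).
    by_yx_desc = sorted(idx, key=lambda i: (-centres[i][1], -centres[i][0]))
    return {"nosort": [natural], "sorted": [by_xy], "2sort": [by_xy, by_yx_desc]}

centres = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0), (1.0, 1.0)]
orders = sweep_orders(centres)
```

For the four points above the x-sorted sweep visits the patches in the order 2, 1, 0, 3 and the second sweep of 2sort in the order 3, 1, 0, 2.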
The final parameter in this multigrid preconditioner is the damping parameter γ ∈
(0, 1] of Algorithm 4. As the undamped variant (γ = 1) was unreliable in all tests performed
in the preparation of this thesis, only the values γi = i/10, i = 1, . . . , 9 have been
considered in the production of the result tables at the end of this subsection.
Table 2.7 summarises all the parameters considered in the result tables.
Test results. The smoother described in this subsection has been tested in a multigrid
preconditioner for a GMRES solver for the SDFEM discretised, Picard linearised incom-
13 New nodes and elements are always inserted at the end of the current lists, while one of the four child
elements replaces the parent element.
Table 2.8: Iteration counts for box-smoother multigrid preconditioned GMRES, driven cavity at Re = 100, FEM discretisation, Picard
smoother variant           dam    max(it)                                      total time
2sort p-neum noVweights    0.7    13         15         16         17         2.6e+02
2sort p-dir noVweights     0.6    13         16         17         18         2.4e+02
2sort elem noVweights      0.8    13         16         17         18         4.5e+02
p-neum noVweights          0.6    16         19         20         21         1.6e+02
p-neum Vweights            0.7    20         23         24         25         1.8e+02
p-dir noVweights           0.8    15         18         19         20         1.4e+02
p-dir Vweights             0.6    21 16 21   25 20 25   25 20 26   26 20 26   1.7e+02
Table 2.9: Iteration counts for box-smoother multigrid preconditioned GMRES, driven cavity at Re = 1000, FEM discretisation, Picard
smoother variant           dam    max(it)                                        total time
2sort p-neum noVweights    0.7    44     75       102           117           2.7e+03
2sort p-dir noVweights     0.7    46     76       103           119           2.4e+03
2sort elem noVweights      0.*    0      0        0             0             0.0e+00
p-neum noVweights          0.7    52     87       122           141           1.6e+03
p-neum Vweights            0.7    65     108      147           167           2.0e+03
p-dir noVweights           0.7    55     95       126           143           1.5e+03
p-dir Vweights             0.6    71     91 119   153 120 162   174 136 171   1.9e+03
γ             0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9
#dof          max(it)
659           31        24        21        19        17        16        16        0         15
2467          36        28        25        22        21        20        19        0         18
9539          39        30        25        23        21        20        20        0         19
37507         39        30        26        24        22        21        20        0         19
total time    2.5e+02   1.9e+02   1.6e+02   1.5e+02   1.5e+02   1.4e+02   1.4e+02   0.0e+00   1.4e+02
Table 2.10: Iteration counts for box-smoother multigrid preconditioned GMRES, driven
cavity at Re = 100, FEM discretisation, Picard linearisation, W-cycle, smoother variant
nosort p-dir noVweights, various damping parameters γ
γ             0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9
#dof          max(it)
659           98        82        72        66        62        59        57        56        54
2467          174       138       121       110       103       100       97        93        90
9539          262       182       163       150       141       135       129       126       123
37507         273       187       171       193       166       149       146       144       142
total time    2.8e+03   2.2e+03   1.9e+03   2.1e+03   1.7e+03   1.6e+03   1.5e+03   1.5e+03   1.5e+03
Table 2.11: Iteration counts for box-smoother multigrid preconditioned GMRES, driven
cavity at Re = 1000, FEM discretisation, Picard linearisation, W-cycle, smoother variant
nosort p-dir noVweights, various damping parameters γ
reduce the number of iterations required, and thus the additional cost for this approach
cannot be justified on the basis of these tests. Note that the 2sort elem noVweights column
in Table 2.9 shows zero iteration counts, because for all tested damping parameters
the solver failed or took excessively long.
The only major difference between the results of the two best smoothing variants
nosort p-dir noVweights and p-dir noVweights lies in the most successful damping parameter.
To allow assessment of the robustness of these methods, Tables 2.10, 2.11, 2.12
and 2.13 list the results for all tested damping parameters. Both variants fail outright for
some damping parameters in the Re = 100 case. The fact that these failing damping parameters
are in a similar range as the most successful parameters indicates that even these
best parameter choices may not be particularly reliable.
γ             0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9
#dof          max(it)
659           31        0         21        0         17        16        0         15        15
2467          36        0         24        0         21        20        0         18        18
9539          39        0         25        0         21        20        0         19        19
37507         39        0         26        0         22        21        0         20        19
total time    2.6e+02   0.0e+00   1.7e+02   0.0e+00   1.5e+02   1.5e+02   0.0e+00   1.5e+02   1.4e+02
Table 2.12: Iteration counts for box-smoother multigrid preconditioned GMRES, driven
cavity at Re = 100, FEM discretisation, Picard linearisation, W-cycle, smoother variant
sorted p-dir noVweights, various damping parameters γ
γ             0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9
#dof          max(it)
659           98        82        72        65        60        57        55        54        52
2467          174       138       120       109       101       98        95        90        87
9539          261       180       160       147       138       131       126       122       119
37507         273       186       170       159       152       147       143       141       139
total time    2.8e+03   2.2e+03   1.9e+03   1.8e+03   1.6e+03   1.6e+03   1.5e+03   1.5e+03   1.4e+03
Table 2.13: Iteration counts for box-smoother multigrid preconditioned GMRES, driven
cavity at Re = 1000, FEM discretisation, Picard linearisation, W-cycle, smoother variant
sorted p-dir noVweights, various damping parameters γ
It may be observed that even the best of these tested smoother variants does not pro-
duce uniform behaviour with respect to the Reynolds number Re. While for this test
problem the solver performs reasonably well for Re = 100, iteration numbers increase
drastically for Re = 1000. In addition more Picard iterations are required as well for the
higher Re due to the stronger dominance of the nonlinear convection term in this case.
We emphasise that all the results in this subsection are for the Picard iteration case
only, but for the discrete adjoint method the true Jacobian of the nonlinear discretisation
scheme has to be used in order to obtain sensitivity values which are consistent with the
discrete functional evaluations themselves. Thus it has to be concluded that in conjunction
with the finite element discretisation at hand this approach does not appear to be suitable
to construct an efficient solver base for an adjoint code. However, in Section 2.5.3.2 it will
be demonstrated that this solver approach does indeed have applications where it produces
a reliable and efficient base for an adjoint code, i.e. for the finite volume discretisation
described in Section 2.4.
Unfortunately the Fp preconditioning technique which has proved quite successful for the
FE discretisation turns out not to deliver efficient results for the FV scheme presented in
Section 2.4.2. To investigate the reason for this, consider the (3, 3) block of the stabilised
operator (2.4.11), which is essentially a (directionally-scaled) diffusion operator scaled by
hβ /2. If the directional scaling is ignored (or assumed u = v), then this becomes a Lapla-
cian operator, scaled proportional to h, acting on the pressure. This observation forms
the basis for our analysis of why the Fp preconditioner does not produce h-independent
behaviour for the problems arising from the Newton linearisation of the FV scheme under
consideration.
Let us recall a few features of this preconditioning technique. As usual, let the or-
dering of the discrete variables be such that the linearised discrete system has the block
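form
\[ \begin{bmatrix} F & B^T \\ B & C \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix} = \begin{bmatrix} r_u \\ r_p \end{bmatrix}. \tag{2.5.48} \]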
In [52] the Fp preconditioner is compared to optimal Schur complement preconditioning
techniques for the Stokes systems. The Fp preconditioner’s most significant ingredient is
to use
\[ X_{F_p}^{-1} = M_p^{-1} F_p A_p^{-1} \tag{2.5.49} \]
\[ S = B F^{-1} B^T - C. \tag{2.5.50} \]
This can be seen as a natural extension from the Stokes case, where
\[ X_M^{-1} := \nu M_p^{-1} \tag{2.5.51} \]
\[ \underline{\gamma} \le \frac{p^T (B A^{-1} B^T - C)\, p}{p^T M_p\, p} \le \overline{\gamma} \tag{2.5.52} \]
unstable finite element discretisations the non-stabilised version (C=0) of (2.5.52) holds
only for $\underline{\gamma} = 0$. In [102] a way to overcome this is suggested, based upon splitting the
pressure space Ph into a stable part Qh and mode part QTh such that Ph is the orthogonal
sum of the two spaces and (2.5.52) holds on Qh even with C = 0. Once this is achieved
the following restrictions
\[ \varphi^2 \le \frac{p^T C p}{p^T M_p\, p} \le \Phi^2 \tag{2.5.53} \]
\[ \frac{p^T (B A^{-1} B^T)\, p}{p^T M_p\, p} \le \alpha^2 \tag{2.5.54} \]
for all p ∈ QTh , with h-independent constants φ > 0, Φ, α ensure that (2.5.52) holds on
Ph \ {0}.
It is well established that one of the unstable pressure modes of standard methods on
i.e. neighbouring nodes have pressure of opposite sign but the same absolute value (= 1).
Let the corresponding pressure vector be denoted as pm and consider this mode for a FEM
discretisation of the operator
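\[ \begin{bmatrix} -\Delta & & \partial_x \\ & -\Delta & \partial_y \\ \partial_x & \partial_y & -c_1 h \Delta \end{bmatrix}, \tag{2.5.55} \]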
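\begin{align*}
\frac{p_m^T C p_m}{p_m^T M_p p_m}
  &= \frac{c_1 h \int_\Omega \|\nabla p_m^h\|^2 \, d\Omega}{\int_\Omega |p_m^h|^2 \, d\Omega} \\
  &\approx \frac{c_1 h \, \frac{1}{h^2} \int_K \|\nabla p_m^h\|^2 \, d\Omega}{\frac{1}{h^2} \int_K |p_m^h|^2 \, d\Omega} \\
  &\approx \frac{c_1 h \, \frac{1}{h^2} \, 4}{\frac{1}{h^2} \, \frac{1}{6} \, h^2} \\
  &= O(h^{-1}), \tag{2.5.56}
\end{align*}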
which implies that the upper bound in (2.5.53) can not hold uniformly in h. This indicates
that Fp -type preconditioning will not achieve h-uniform performance if the stabilisation
term C is a multiple of a Laplacian scaled by h.
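The arithmetic behind the estimate (2.5.56) can be checked with a short sketch. It evaluates the last approximate quotient for a sequence of mesh sizes, using the per-cell values 4 and h²/6 from (2.5.56); the choice c1 = 1 and the function name `stabilisation_ratio` are assumptions made purely for illustration.

```python
def stabilisation_ratio(c1, h):
    """Quotient from (2.5.56) with the per-cell estimates inserted."""
    n_cells = 1.0 / h**2                    # approximate number of cells
    numerator = c1 * h * n_cells * 4.0      # gradient integral per cell estimated as 4
    denominator = n_cells * (h**2 / 6.0)    # mass integral per cell estimated as h^2/6
    return numerator / denominator          # simplifies to 24*c1/h

# Halving h repeatedly: the quotient doubles each time, i.e. it grows like 1/h.
ratios = [stabilisation_ratio(1.0, 2.0**-k) for k in range(3, 7)]
growth = [b / a for a, b in zip(ratios, ratios[1:])]
```

Under these assumptions the quotient equals 24 c1 / h, so each halving of h doubles it.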
Even though we have highly simplified the problem, the resulting $O(h^{-1})$ upper bound
appears to be representative and numerical experiments confirm a spreading of the spectrum
of that order. This is demonstrated in Figure 2.12 where the eigenvalues of the
preconditioned Schur complement $X_{F_p}^{-1} S$ for different levels of refinement of the mesh are
plotted in the complex plane. It may be observed that the real part of the eigenvalue of
largest absolute value (the left-most in each plot) roughly doubles with each refinement:
[Figure 2.12: eigenvalue plots in the complex plane (axes: Real, Imag), one panel per refinement level, the last panel being level 5, 1024 cells. Results for driven cavity, Re = 10.]
ca. −12, −22, −43. In particular, the plot for the finest mesh, level 5, demonstrates
that there is no significant clustering of the eigenvalues as has been observed in the finite
element case (Figures 2.6 and 2.7).
We remark that in [102, Section 4.2] a stabilisation for FE methods is discussed which
is basically $C = c_2 h^2 A_p$ and that for this second order stabilisation the above arguments
involving $p_m$ revert to O(1) and good performance of this type of preconditioning has
been reported, e.g. in [103]. This leaves the possibility that Fp preconditioning might still
perform well for a second order FV scheme. However, as the multigrid method which is
considered in Section 2.5.3.2 performs optimally, this hypothesis is not followed up here.
A further problem with the Fp preconditioning technique arises from the pressure time
terms in the FV scheme. In [52] it is recommended to use only homogeneous Neumann
boundary conditions in the definition of the operators $F_p$ and $A_p$, in order to reflect the
property of the incompressible Navier-Stokes equations that the pressure is only defined
up to an additive constant. For infinite time-steps this makes both of these operators
singular, but the vectors that $A_p^{-1}$ is applied to are in the range of $A_p$, so the solution is
well defined up to a constant and this additive constant does not influence the result of the
subsequent multiplication by $F_p$.
The FV approach under consideration, with artificial compressibility and finite time-steps
in the pressure terms, results in a somewhat different situation. The time terms
uniquely define the pressure at the next time-step, even though additive constants have no
influence on the residual with respect to steady state. Time terms enter $F_p$ quite naturally
but not $A_p$. Applying $A_p^{-1}$ with Neumann boundary conditions requires the right-hand
side to be in the range of $A_p$, thus care needs to be taken of this issue.
The motivation arguments (2.5.6) (as taken from [52]), used in the derivation of the
Fp preconditioner, would suggest something like
−1 −1
−1
Xwith p time term = Fp A p + αM p , (2.5.57)
because the time term adds a zero order operator S = BF −1 BT + αM p . The expression
(2.5.57) can be approximated for very large or very small time-steps by neglecting the
appropriate term. Yet, for time-steps of intermediate size, no satisfactory approximation
is known to the author.
\[ q(x) = \frac{9}{16}\, q(a) + \frac{3}{16}\, q(b) + \frac{1}{16}\, q(c) + \frac{3}{16}\, q(d), \]
see Figure 2.14. The interpolation for the remaining child cells is defined by rotating the
setup from Figure 2.14 accordingly. The restriction operator is defined as the transpose of
[Figure 2.14: interpolation setup; surviving labels: cell i, a, b.]
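The interpolation formula above can be written down as a small sketch. The weights are exactly those of the formula, in the order (a, b, c, d); the function names are hypothetical. The transpose action shown in `restrict_contribution` assumes that the restriction is the transpose of this interpolation, which is how the truncated sentence above is read here.

```python
from fractions import Fraction as Fr

# Weights from the interpolation formula, in the order (a, b, c, d).
WEIGHTS = (Fr(9, 16), Fr(3, 16), Fr(1, 16), Fr(3, 16))

def interpolate_child(qa, qb, qc, qd):
    """Value in one child cell from the four coarse values q(a), q(b), q(c), q(d)."""
    return sum(w * q for w, q in zip(WEIGHTS, (qa, qb, qc, qd)))

def restrict_contribution(r_child):
    """Transpose action: distribute one child-cell value back to a, b, c, d."""
    return tuple(w * r_child for w in WEIGHTS)

# The weights are the tensor product of the one-dimensional weights (3/4, 1/4).
tensor = (Fr(3, 4) * Fr(3, 4), Fr(3, 4) * Fr(1, 4),
          Fr(1, 4) * Fr(1, 4), Fr(1, 4) * Fr(3, 4))
```

Since the weights sum to one, constant fields are reproduced exactly by the interpolation.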
current mesh is reduced to $10^{-8}$ of its initial norm. It is evident that for higher Re more
time-steps are necessary. This reflects the increasingly nonlinear nature of the problem
and longer time-scales for the diffusion of initial perturbations. Further, it becomes more
pronounced on the finer meshes where artificial diffusion becomes small.
All tests listed in Table 2.14 have been performed with damping parameter γ = 1.0,
that is without damping. The first three columns of the table show results for the use of
a simple V-cycle as preconditioner. While this performs well at low Re, its performance
deteriorates significantly for higher Re. This deterioration may be attributed to the bad
approximation of even the low frequency solution components on the coarse meshes due
to the low order stabilisation terms introduced by the Roe scheme (2.4.11). Examples
of the convergence of the solution of a lid driven cavity problem are given in Section
2.7.2.1 where the Re = 1000 results demonstrate this effect. To verify that this is actually
the source of the deterioration the results of the simple V-cycle (a) (with one pre- and
one post-smoothing step) are compared to those of a V-cycle with more smoothing (b)
(one pre- and two post-smoothing steps) and a W-cycle (c) (pre-smoothing, coarse grid
correction (CGC), post- and pre-smoothing, one more CGC, post-smoothing), see the last
three columns in Table 2.14.
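The structure of the three cycle types can be made explicit with a short recursive sketch, following the legend of Table 2.14 (a: down, CGC, up; b: down, CGC, up, down; c: 2×(down, CGC, up)). The function name `cycle_ops`, the direct solve on the coarsest level and the use of the same cycle type on all levels are assumptions of this illustration.

```python
def cycle_ops(kind, level):
    """List the operations of one multigrid cycle started on `level` (0 = coarsest)."""
    if level == 0:
        return ["coarse solve"]
    down, up = f"down@{level}", f"up@{level}"      # the two smoother sweeps
    cgc = lambda: cycle_ops(kind, level - 1)       # coarse grid correction (CGC)
    if kind == "a":                                # V-cycle: down, CGC, up
        return [down] + cgc() + [up]
    if kind == "b":                                # V-cycle with one more smoothing sweep
        return [down] + cgc() + [up, down]
    if kind == "c":                                # W-cycle: 2 x (down, CGC, up)
        return 2 * ([down] + cgc() + [up])
    raise ValueError(f"unknown cycle type: {kind}")
```

For example, `cycle_ops("c", 2)` visits the coarsest level four times, which illustrates the higher cost per W-cycle compared to the V-cycle variants.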
The improved smoothing in (b) produces lower iteration counts for GMRES but is
not sufficient to avoid deterioration as the mesh is refined. On the other hand the more
expensive W-cycle (c) achieves this and results in what appear to be mesh independent
iteration counts, compensating for the higher cost per cycle. This is illustrated by the total
CPU time in each of the cases, which is the sum of the times taken on each level listed.
Note that the performance of the smoother for convection-dominated problems (high
Re) may be improved by ordering the cells stream-wise, from inflow to outflow. However
for the driven cavity problem it is difficult to define such an ordering due to the recircu-
lation and the absence of in- and outflow boundaries. Therefore the results presented in
Table 2.14 were produced with a simple coordinate based ordering which conforms with
the flow direction at the boundary that drives the flow (the lid).
Overall we conclude that the box smoother used in a W-cycle as preconditioner for a
GMRES solver provides a satisfactory basis for a solver in the sense that the costs for the
linear solves are linear in the number of unknowns and, in the Reynolds number range
considered here, almost independent of Re.
                  Re
level   cells     10        100       1000       1000       1000
2       64        5×3       6×3       6×3        6×3        6×2.8
3       256       4×3       6×4       7×5        7×3.7      7×4.1
4       1024      4×3       6×4       8×6        8×4.8      8×4.8
5       4096      4×3       6×5.5     9×7.3      9×9        9×5.2
6       16384     4×4       6×6       11×10.5    11×8.5     11×6.1
7       65536     4×4.5     6×7       12×15.7    12×12.1    12×6.2
8       262144    6×5.5     6×7       21×21.1    14×15.1    14×5.1
                  (time-steps) × (average GMRES iterations)
total CPU time    32m52s    42m13s    197m38s    228m03s    118m41s
final residual    6.1e-09   1.4e-10   9.6e-11    9.5e-11    9.5e-11
cycle             a         a         a          b          c
cycle types:
  a   V-cycle   down, CGC, up
  b   V-cycle   down, CGC, up, down
  c   W-cycle   2×(down, CGC, up)
smoother:
  up     box smoother, cell ordering forward
  down   box smoother, cell ordering backward
Table 2.14: Iteration counts for the FV solver
The regularising weight vector $w_p$ must be non-orthogonal to both the singular left and
right eigenvectors of the system. Then $\tilde{R} = 0$ defines a unique solution $\tilde{\omega}$ because $R = 0$
defines a solution $\omega$ unique up to an additive constant pressure, which is fixed by the
condition $w_p^T \omega = 0$ and the resulting unique solution is extended to $\tilde{\omega}$ by $\lambda = 0$. Thus the
case $\lambda \neq 0$ can only arise if no solution exists to the non-regularised system $R = 0$. Since
$\tilde{R}$ defines a unique solution, its Jacobian $\tilde{J}$ is non-singular. It has the block structure
\[ \tilde{J} := \frac{\partial \tilde{R}}{\partial \tilde{\omega}} = \begin{bmatrix} J & w_p \\ w_p^T & 0 \end{bmatrix}, \tag{2.6.2} \]
where $J := \partial R / \partial \omega$.
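The effect of the regularisation can be seen in a tiny example. The 2×2 matrix below is a hypothetical stand-in for a singular Jacobian J whose null vector is (1, 1); it is not derived from the flow discretisation. Bordering it with a weight vector that is not orthogonal to the null vector yields a non-singular matrix, while an orthogonal weight vector does not.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def bordered(J, w):
    """Build the block matrix [[J, w], [w^T, 0]] as in (2.6.2) for a 2x2 matrix J."""
    return [J[0] + [w[0]], J[1] + [w[1]], w + [0.0]]

J = [[1.0, -1.0], [-1.0, 1.0]]               # singular, J applied to (1, 1) gives 0
det_J = J[0][0] * J[1][1] - J[0][1] * J[1][0]
det_good = det3(bordered(J, [1.0, 1.0]))     # w not orthogonal to the null vector
det_bad = det3(bordered(J, [1.0, -1.0]))     # w orthogonal to the null vector
```

Only the first choice of the weight vector removes the singularity, in line with the non-orthogonality requirement stated above.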
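Now let us apply the discrete adjoint technique to the augmented system $\tilde{R} = 0$. As
described in Section 1.1, the overall derivatives DI/DF of the performance criterion
$I := \tilde{I}(\tilde{\omega}(F), F)$ can be evaluated by solving the discrete adjoint equation
\[ \left[ \frac{\partial \tilde{R}}{\partial \tilde{\omega}} \right]^T \tilde{\Psi} = \frac{\partial \tilde{I}}{\partial \tilde{\omega}} \tag{2.6.3a} \]
and computing
\[ \frac{DI}{DF} = \frac{\partial \tilde{I}}{\partial F} - \tilde{\Psi}^T \frac{\partial \tilde{R}}{\partial F}. \tag{2.6.3b} \]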
\[ e_p^T \frac{\partial \tilde{I}}{\partial \omega} = 0. \]
\[ J^T \Psi_\omega = \frac{\partial \tilde{I}}{\partial \omega} \tag{2.6.5} \]
is orthogonal to the singular vector $v_R = e_p$ and hence the system has a solution
\[ \Psi_\omega = \Psi_\omega^* + s\, v_L, \tag{2.6.6} \]
Now that this common issue has been discussed the remaining two subsections will
detail some of the aspects which are specific to the application of the discrete adjoint
method for the individual discretisations considered in this work.
2.6.2 The discrete adjoint method applied in the finite element context
Since the finite element (FE) discretisation in this work uses unstructured triangular meshes
combined with automatic mesh generation and the mesh deformation approach, as
briefly outlined in Section 2.2.6, it can be applied to a wide range of geometries. This
approach implies that the dependency of the discrete approximation of the performance
functional I on the shape parameters is actually realised by a chain of dependencies. The
shape parameter vector F defines a mesh with node positions s, which defines the discrete
Navier-Stokes equations R(ω, s) = 0 with discrete solution ω, which in turn defines the
discrete performance functional $I(s) := \tilde{I}(\omega(s), s)$. Just like this chain of definitions in
the forward problem, the derivatives DI/DF are defined as a chain, but in the opposite
direction. The discrete adjoint method is actually applied to the dependency of the discrete
functional on the node positions of the mesh. In Subsection 2.6.2.1 an important ingredient
of this approach is discussed, the derivatives of the FE discretisation with respect to the
node positions of the mesh. Subsection 2.6.2.2 is devoted to the next step in the adjoint
chain, the derivatives of the node positions with respect to the shape parameters, using
this opportunity to describe the mesh deformation approach in more detail. The efficient
solution of the discrete adjoint equation is finally discussed in Subsection 2.6.2.3.
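The chain of dependencies and its reversal can be illustrated with a scalar toy problem. Everything in the sketch below is hypothetical: the "mesh" s(F) = 1 + F², the residual R(ω, s) = sω − 1 and the functional ω² + s are chosen only so that all derivatives can be written down by hand; they are not the models of this thesis. The adjoint-based derivative is compared to a central finite difference of the full forward chain.

```python
def mesh(F):
    """Hypothetical 'mesh': one node position s as a function of the shape parameter F."""
    return 1.0 + F * F

def solve_state(s):
    """Solve the toy state equation R(omega, s) = s*omega - 1 = 0."""
    return 1.0 / s

def functional(omega, s):
    """Toy performance functional depending on the state and the node position."""
    return omega * omega + s

def total_derivative(F):
    """DI/DF via the adjoint: first DI/Ds, then the chain rule with ds/dF."""
    s = mesh(F)
    omega = solve_state(s)
    psi = (2.0 * omega) / s        # adjoint equation: (dR/domega)^T psi = dI/domega
    dI_ds = 1.0 - psi * omega      # dI/ds - psi^T dR/ds, where dR/ds = omega
    return dI_ds * (2.0 * F)       # multiply by ds/dF = 2F

F, eps = 0.5, 1e-6
forward = lambda f: functional(solve_state(mesh(f)), mesh(f))
fd = (forward(F + eps) - forward(F - eps)) / (2.0 * eps)
adjoint = total_derivative(F)
```

Both values agree up to the finite difference error; the adjoint variant needs one state solve and one adjoint solve regardless of the number of parameters.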
Evaluation of the total derivative DI/Ds by way of the discrete adjoint method (1.1.7)
requires evaluation of terms of the form
\[ \frac{DI}{Ds} = \frac{\partial \tilde{I}}{\partial s} - \Psi^T \frac{\partial R}{\partial s}, \tag{2.6.7} \]
i.e. partial derivatives of the discrete residual vector R with respect to the node positions
in the mesh as well as partial derivatives of the performance criterion $\tilde{I}$ with respect to the
node positions. In essence this requires differentiation of the algorithms which perform
the computation of R and $\tilde{I}$. There are three common approaches to this task: applying
finite differences, using automatic differentiation software [39] and hand calculations.
While the first two of these approaches offer the possibility to avoid the work-intensive
hand calculations, they may result in significant performance loss. Generally, all these
approaches should be applied on the element level only, i.e. for the local contributions to
R rather than the whole vector R, in order to avoid a possible increase in computational
complexity. Automatic differentiation software (see e.g. [39]) is designed to differentiate
computer programs by generating a modified source code. Unfortunately mature tools
of this kind are not available for all programming languages. Thus, “hand coding” the
derivatives remains an important alternative. Further, if approached in a systematic man-
ner, exploiting the knowledge about the formulas that are implemented by the software,
then it is a simpler task than one might think. In this subsection such a systematic ap-
proach is demonstrated for a simplified model problem. The model problem is chosen in
order to keep notation simple and the focus on the general methodology. Extension to the
incompressible Navier-Stokes equations is straightforward.
As model problem the Poisson equation with constant forcing term f ≡ 1 is chosen,
The integrals contained in the definition of a(., .) and b(.) are usually computed using
numerical integration on each element and then summing over the elements. Hence the
i-th row of the discrete residual vector R may be expressed as
\[ [R]_i = \sum_{\substack{T \in \mathcal{T}_h \\ s_i \in T}} \sum_{j=1}^{n} \sum_{\ell=1}^{m} w_\ell \left( \nabla_T \varphi^{(g_T(j))} \cdot \nabla_T \varphi^{(i)} - 1\, \hat{\varphi}^{(i)}(\hat{x}_\ell) \right) |\det(J_T)| \, u_{g_T(j)}, \tag{2.6.9} \]
where the outermost sum is over those elements $T \in \mathcal{T}_h$ which have node i as a vertex, n
is the number of Lagrange basis functions per element, $\hat{x}_\ell$ ($\ell = 1, \dots, m$) are the quadrature
points on the master element and $w_\ell$ are the quadrature weights. Further, $g_T(j)$ is the
global node number of vertex j of element T, $\nabla_T$ denotes the gradient restricted to element
T evaluated at the point $x_\ell$ (which corresponds to $\hat{x}_\ell$ via the isoparametric element mapping),
$\hat{\varphi}^{(j)}(\hat{x})$ is the basis function corresponding to node j evaluated at point $\hat{x}$ within the
master element and $J_T$ is the Jacobian of the element mapping. In this expression the parts
that are non-constant with respect to the node positions are $\nabla_T \varphi^{(\cdot)}$ and $|\det(J_T)|$
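in each element T. Applying the product rule of differentiation gives for the derivative of
$[R]_i$ with respect to $[s_k]_t$
\begin{align*}
\frac{\partial [R]_i}{\partial [s_k]_t} = \sum_{\substack{T \in \mathcal{T}_h \\ s_i \in T}} \sum_{j=1}^{n} \Bigg[
  & \sum_{\ell=1}^{m} w_\ell \left( \frac{\partial \nabla_T \varphi^{(g_T(j))}}{\partial [s_k]_t} \cdot \nabla_T \varphi^{(i)}
    + \nabla_T \varphi^{(g_T(j))} \cdot \frac{\partial \nabla_T \varphi^{(i)}}{\partial [s_k]_t} \right) |\det(J_T)| \tag{2.6.10} \\
  & + \sum_{\ell=1}^{m} w_\ell \left( \nabla_T \varphi^{(g_T(j))} \cdot \nabla_T \varphi^{(i)} - 1\, \hat{\varphi}^{(i)}(\hat{x}_\ell) \right)
    \frac{\partial |\det(J_T)|}{\partial [s_k]_t} \Bigg] u_{g_T(j)}.
\end{align*}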
Assuming that each of the individual terms in (2.6.10) may be evaluated it is now a
straightforward task to compute $\Psi^T \partial R/\partial s$.
Remark 4. Note that it is more efficient to compute and sum the element contributions to
$\Psi^T \partial R/\partial s$ because this avoids the cost, and the higher memory requirements, of assembling
the sparse Jacobian matrix $\partial R/\partial s$. This matrix would be used for only one matrix
vector product anyway.
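The element-wise accumulation described in Remark 4 can be sketched as follows. The callback `local_dR_ds`, which returns the residual rows, the coordinate columns and the small dense derivative block of one element, is a hypothetical interface, and the 1D toy blocks are placeholders rather than derivatives of an actual discretisation.

```python
def adjoint_product(elements, local_dR_ds, psi, n_coords):
    """Accumulate g = Psi^T dR/ds element by element, without a global matrix."""
    g = [0.0] * n_coords
    for elem in elements:
        rows, cols, block = local_dR_ds(elem)   # local rows, columns and dense block
        for a, i in enumerate(rows):
            for b, c in enumerate(cols):
                g[c] += psi[i] * block[a][b]    # scatter psi_i * (dR_i/ds_c)
    return g

def toy_local_block(elem):
    """Placeholder element derivative block for a two-node element (assumed values)."""
    return list(elem), list(elem), [[1.0, -1.0], [-1.0, 1.0]]

elements = [(0, 1), (1, 2)]
psi = [1.0, 2.0, 3.0]
g = adjoint_product(elements, toy_local_block, psi, n_coords=3)
```

The result coincides with multiplying Ψ^T by the assembled matrix of the toy blocks, but only one small dense block is held in memory at a time.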
It remains to derive expressions for the derivatives of the gradient $\nabla_T \varphi^{(i)}$ and the
determinant $|\det(J_T)|$. For the impatient we present the result of the rather technical derivation
in advance.
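Proposition 1.
\[ \frac{\partial [\nabla_T \varphi^{(i)}]_u}{\partial [s_k]_t} = -\left[ \nabla_T \varphi^{(k)} \right]_u \left[ \nabla_T \varphi^{(i)} \right]_t \tag{2.6.11} \]
\[ \frac{\partial |\det(J_T)|}{\partial s_{g_T(k)}} = |\det(J_T)| \, J_T^{-T} \hat{\nabla} \hat{\varphi}^{(k)}(\hat{x}_\ell). \tag{2.6.12} \]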
Note that similar representations of these derivatives have for example been presented
in [43], but only for linear triangular elements, while the derivation that follows is valid
for isoparametric elements of arbitrary degree and shape. Note also that the conditions
for these expressions are different to similar formulas in [49], because we consider the
evaluation point as fixed in the reference element, whereas [49] considered it fixed in
the world element and restricted themselves to elements of simplex type, i.e. excluding
curved element boundaries.
The remainder of this subsection is devoted to the proof of this proposition.
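Before turning to the proof, the two formulas of Proposition 1 can be checked numerically for the simplest case. The sketch below assumes a linear triangle with reference gradients (−1, −1), (1, 0), (0, 1), for which the gradients are constant on the element; the node coordinates and all function names are illustrative. Central finite differences with respect to every node coordinate are compared to (2.6.11) and (2.6.12).

```python
def geometry(nodes):
    """Return (|det J_T|, physical gradients) of a linear triangle with the given nodes."""
    (x0, y0), (x1, y1), (x2, y2) = nodes
    a, b, c, d = x1 - x0, x2 - x0, y1 - y0, y2 - y0   # J_T = [[a, b], [c, d]]
    det = a * d - b * c
    ref = [(-1.0, -1.0), (1.0, 0.0), (0.0, 1.0)]      # reference element gradients
    # grad = J_T^{-T} ref_grad, with J_T^{-T} = (1/det) * [[d, -c], [-b, a]]
    grads = [((d * gx - c * gy) / det, (-b * gx + a * gy) / det) for gx, gy in ref]
    return abs(det), grads

def perturbed(nodes, k, t, eps):
    """Copy of the node list with coordinate t of node k shifted by eps."""
    out = [list(p) for p in nodes]
    out[k][t] += eps
    return [tuple(p) for p in out]

def max_errors(nodes, eps=1e-6):
    """Largest deviation between finite differences and (2.6.11), (2.6.12)."""
    det, g = geometry(nodes)
    err_grad = err_det = 0.0
    for k in range(3):
        for t in range(2):
            dp, gp = geometry(perturbed(nodes, k, t, eps))
            dm, gm = geometry(perturbed(nodes, k, t, -eps))
            # (2.6.12): d|det J_T| / d[s_k]_t = |det J_T| [grad phi_k]_t
            err_det = max(err_det, abs((dp - dm) / (2 * eps) - det * g[k][t]))
            for i in range(3):
                for u in range(2):
                    fd = (gp[i][u] - gm[i][u]) / (2 * eps)
                    # (2.6.11): d[grad phi_i]_u / d[s_k]_t = -[grad phi_k]_u [grad phi_i]_t
                    err_grad = max(err_grad, abs(fd + g[k][u] * g[i][t]))
    return err_grad, err_det

err_grad, err_det = max_errors([(0.0, 0.0), (2.0, 0.3), (0.4, 1.5)])
```

Both error measures stay at the level of the finite difference accuracy, which is consistent with the proposition for this element type.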
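\[ x = M_T(\hat{x}) := \sum_{k=1}^{n} s_k \, \hat{\varphi}^{(k)}(\hat{x}), \tag{2.6.13} \]
\[ J_T = \frac{\partial x}{\partial \hat{x}} = \sum_{k=1}^{n} s_k \left[ \hat{\nabla} \hat{\varphi}^{(k)}(\hat{x}) \right]^T, \tag{2.6.14} \]
\[ \varphi^{(i)}(x) := \hat{\varphi}^{(i)}(M_T^{-1}(x)), \tag{2.6.15} \]
\[ \nabla_T \varphi^{(i)}(x) = J_T^{-T} \hat{\nabla} \hat{\varphi}^{(i)}(M_T^{-1}(x)). \tag{2.6.16} \]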
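Here $s_k$ denotes the coordinate vector of the k-th node of the element T and $\hat{\nabla}$ denotes the
gradient on the reference element, i.e. the partial derivatives with respect to the reference
element coordinates $\hat{x}$. Two immediate consequences are
\begin{align*}
\frac{\partial [J_T^T]_{(v,j)}}{\partial [s_k]_t}
  &= \sum_{r=1}^{n} \delta_{k,r} \, \delta_{t,j} \left[ \hat{\nabla} \hat{\varphi}^{(r)}(\hat{x}) \right]_v && \text{(by (2.6.14))} \\
  &= \delta_{t,j} \left[ \hat{\nabla} \hat{\varphi}^{(k)}(\hat{x}) \right]_v, \tag{2.6.17}
\end{align*}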
Since $\hat{\nabla} \hat{\varphi}^{(i)}(\hat{x}_\ell)$ is independent of the node positions, $\nabla_T \varphi^{(i)}$ depends on the node positions
only via the transpose of the element Jacobian $J_T^T$. The derivatives with respect to the
node positions can be calculated using the implicit function theorem on the reformulated
(2.6.18),
\[ 0 = J_T^T([s_k]_t) \, \nabla_T \varphi^{(i)} - \hat{\nabla} \hat{\varphi}^{(i)}(\hat{x}_\ell). \tag{2.6.19} \]
This gives
\[ \frac{\partial \nabla_T \varphi^{(i)}}{\partial [s_k]_t} = -J_T^{-T} \, \frac{\partial J_T^T}{\partial [s_k]_t} \, \nabla_T \varphi^{(i)}, \tag{2.6.20} \]
where the derivative of the transposed Jacobian is in the component wise sense. Taking a
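\begin{align*}
\frac{\partial [\nabla_T \varphi^{(i)}]_u}{\partial [s_k]_t}
  &= -\sum_{v=1}^{d} \left[ J_T^{-T} \right]_{(u,v)} \sum_{j=1}^{d} \frac{\partial [J_T^T]_{(v,j)}}{\partial [s_k]_t} \left[ \nabla_T \varphi^{(i)} \right]_j \\
  &= -\sum_{v,j=1}^{d} \left[ J_T^{-T} \right]_{(u,v)} \frac{\partial [J_T^T]_{(v,j)}}{\partial [s_k]_t} \left[ \nabla_T \varphi^{(i)} \right]_j \\
  &= -\sum_{v,j=1}^{d} \left[ J_T^{-T} \right]_{(u,v)} \delta_{t,j} \left[ \hat{\nabla} \hat{\varphi}^{(k)}(\hat{x}_\ell) \right]_v \left[ \nabla_T \varphi^{(i)} \right]_j && \text{(by (2.6.17))} \\
  &= -\sum_{v=1}^{d} \left[ J_T^{-T} \right]_{(u,v)} \left[ \hat{\nabla} \hat{\varphi}^{(k)}(\hat{x}_\ell) \right]_v \left[ \nabla_T \varphi^{(i)} \right]_t \\
  &= -\left[ \nabla_T \varphi^{(k)} \right]_u \left[ \nabla_T \varphi^{(i)} \right]_t && \text{(by (2.6.18))},
\end{align*}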
which follows from expansion to sub-determinants and the adjoint representation of the
inverse of the matrix A. Applying this formula and an analogue of (2.6.17) we get
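\begin{align*}
\frac{\partial}{\partial [s_k]_t} |\det(J_T)|
  &= |\det(J_T)| \sum_{u,v=1}^{d} \left[ J_T^{-T} \right]_{u,v} \frac{\partial [J_T]_{u,v}}{\partial [s_k]_t} \\
  &= |\det(J_T)| \sum_{u,v=1}^{d} \left[ J_T^{-T} \right]_{u,v} \delta_{t,u} \left[ \hat{\nabla} \hat{\varphi}^{(k)}(\hat{x}_\ell) \right]_v \\
  &= |\det(J_T)| \sum_{v=1}^{d} \left[ J_T^{-T} \right]_{t,v} \left[ \hat{\nabla} \hat{\varphi}^{(k)}(\hat{x}_\ell) \right]_v.
\end{align*}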
As already discussed in Section 2.2.6, automatic mesh generators, such as [77, 80] for
example, may result in meshes which depend in a discontinuous manner on the shape
parameters F , even the number of elements may not be constant. As a compromise
between the smoothness properties of parametric meshes and the geometric flexibility of
automatically generated meshes, we proposed to use a hybrid approach of defining a base
mesh by means of an automatic mesh generator and then use deformed versions of it as
a parametric mesh in a neighbourhood of the base shape parameters. The definition of
the deformations as well as a discussion of how these have to be taken into account in the
evaluation of the derivatives with respect to the shape parameters are the subject of this
subsection.
The stationary Lamé equations (linear elasticity) are a mathematical model for elastic
deformations of solid bodies under internal or external forces, see e.g. [19]. The pure
Dirichlet problem, find $u \in H^1_g(\Omega)^2$ such that
can be used to propagate a deformation g of the boundary of a domain into the interior,
resulting in a deformation vector field u of minimal potential energy. This model contains
two material constants λ > 0 and µ > 0 (Lamé constants) which form a continuous ana-
logue to spring constants in a discrete network of springs. In this present work they are
chosen to take the values for steel,
Finite element discretisation of (2.6.21) on the reference mesh allows a node-wise specifi-
cation of a deformation gh of the boundary, i.e. the displacement vectors for each bound-
ary node. In order to define the displacement of the interior nodes an equation system of
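the form
\[ \begin{bmatrix} K_{i,i} & K_{i,b} \\ 0 & I \end{bmatrix} \begin{bmatrix} u_i \\ u_b \end{bmatrix} = \begin{bmatrix} 0 \\ g \end{bmatrix} \tag{2.6.24} \]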
has to be solved, where the subscript ∗i denotes a block corresponding to interior nodes
while ∗b denotes a block corresponding to boundary nodes. The vector s of node positions
in the deformed mesh is defined as the positions in the base mesh x0 plus the displace-
ments due to the deformation,
s = x0 + u. (2.6.25)
This basic linear elasticity approach has one major disadvantage: regions of the mesh
with relatively small elements (e.g. to resolve boundary layers) are treated the same way
as those with large elements, even though deformations in these regions of small ele-
ments have a much stronger influence on the mesh quality than in the coarser regions of
the mesh. In [87] a modification to the discretised Lamé equations was proposed in order
to address this issue. From a modelling point of view this modification adapts the param-
eters λ and µ such that the material is more rigid in regions of small elements, thus the
deformations in those areas will tend to be smaller than in regions of large elements. This
can easily be implemented by choosing λ and µ to be element-wise constant and multiplying
the base values (2.6.23) by |A_k|^α, where α ∈ R is a parameter and |A_k| denotes
the surface area of element k. The tests in [87] suggest that α = −1 forms a reasonable
compromise, balancing the deformations between areas of small elements and areas of
larger elements. Conveniently this parameter choice is equivalent to dropping the term
|A_k| completely from the assembly routines^14, thus it even simplifies the computations.
For these reasons the choice α = −1 is used in this present work as well. Note that even
with these modifications a multigrid preconditioned CG solver performs optimally for
the symmetric reformulation of (2.6.24). Thus, the costs for solving these problems are
comparatively small in the context of the Navier-Stokes solver.
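The effect of the choice α = −1 can be made concrete with a small sketch. It only illustrates the cancellation of the area factor: the element area |A_k| that multiplies the element integrand is combined with the scaled material parameters. The base values `LAMBDA_BASE` and `MU_BASE` are arbitrary placeholders (they are not the values (2.6.23)), and the function names are hypothetical.

```python
LAMBDA_BASE = 2.0  # placeholder base value, not taken from (2.6.23)
MU_BASE = 1.0      # placeholder base value, not taken from (2.6.23)


def element_lame(area, alpha=-1.0, lam=LAMBDA_BASE, mu=MU_BASE):
    """Element-wise constant Lame parameters: base values times |A_k|^alpha."""
    scale = area ** alpha
    return lam * scale, mu * scale


def area_weighted_parameters(area, alpha=-1.0):
    """Factors |A_k| * lambda_k and |A_k| * mu_k that multiply the element integrand."""
    lam_k, mu_k = element_lame(area, alpha)
    return area * lam_k, area * mu_k


for area in (1e-4, 1e-2, 1.0):
    unscaled = area_weighted_parameters(area, alpha=0.0)  # constant material
    scaled = area_weighted_parameters(area)               # alpha = -1
    print(area, unscaled, scaled)
```

With α = 0 the factor shrinks with the element area, while with α = −1 it equals the base values for every element, which is why the area term can simply be omitted from the assembly.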
To illustrate the resulting mesh deformations, Figure 2.15 shows a close-up of two
superimposed versions of the mesh around an interior obstacle in a channel (Example
2.2 from Section 2.7.1.2): one before and one after the mesh deformation. For moderate
deformations of the boundary the mesh quality in terms of the size of the interior angles
of the elements is usually maintained. However, strong deformations may result in degenerating
meshes and thus re-definition of the base mesh may be required as discussed in
Section 2.2.6.
Remark 5. Note that deforming only the boundary of the mesh is not appropriate,
because this would drastically restrict the amount by which the shape parameters can
change before remeshing becomes necessary. For example a mesh cell can degenerate if
the boundary is moved inward. Further, the quality of the boundary cells would strongly
depend on the shape parameters, and the changes in this quality might even affect the
quality of the discrete approximation of the performance functional I more strongly than
the actual change of the boundary geometry. On the other hand, if the interior mesh nodes
are moved in an appropriate manner as well, then these effects are far less pronounced.
To illustrate these issues we ask the reader to look again at Figure 2.15. Note that some of
the smaller elements at the top of the obstacle would have almost collapsed if the interior
nodes of the mesh were not moved as well. Yet, the fully deformed mesh is of similar
quality as the initial mesh.
14 The surface area |Ak | is equivalent to the term |det(JT )| in (2.6.9), for example.
[Figure 2.15: close-up of the initial and deformed mesh around the obstacle; legend: initial, deformed]
An implication of equations (2.6.25) and (2.6.24) is that for the deformed mesh the
node positions of the interior nodes are linearly dependent on the boundary displacements
g,
\[
s_i = x_{0,i} - K_{i,i}^{-1} K_{i,b}\, g \quad\text{and}\quad s_b = x_{0,b} + g.
\]
This dependency has to be taken into account in the computation of the derivatives DI/DF .
This is easily done applying the adjoint equation approach, yielding
\[
\frac{DI}{Dg} = \frac{DI}{Ds_b} - K_{i,b}^{T} K_{i,i}^{-T} \frac{DI}{Ds_i}, \tag{2.6.26}
\]
where in analogy to (2.6.24) the vector of node positions s is split into interior and bound-
ary parts. The derivatives with respect to the node positions in the mesh DI/Ds are
computed using the discrete adjoint method with the approaches discussed in Subsec-
tion 2.6.2.1. The vector of boundary displacements g itself is a function of the shape
parameters F . Thus the computation of DI/DF is completed by
\[
\frac{DI}{DF} = \frac{DI}{Dg}\,\frac{\partial g}{\partial F}.
\]
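Formula (2.6.26) and the subsequent chain rule can be sketched in a few lines. The example is a toy under stated assumptions: a small symmetric tridiagonal matrix stands in for K_{i,i}, the functional I(s) = Σ s_k^2 is invented for illustration, gradients are treated as column vectors, and all helper names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def transpose(A):
    return [list(col) for col in zip(*A)]


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def gradient_wrt_g(K_ii, K_ib, dI_ds_i, dI_ds_b):
    """DI/Dg = DI/Ds_b - K_ib^T K_ii^{-T} DI/Ds_i, see (2.6.26)."""
    psi = solve(transpose(K_ii), dI_ds_i)  # a single solve with K_ii^T
    correction = matvec(transpose(K_ib), psi)
    return [b - c for b, c in zip(dI_ds_b, correction)]


def gradient_wrt_F(dI_dg, dg_dF):
    """Chain rule DI/DF = DI/Dg * dg/dF, with dg_dF[k][j] = d g_k / d F_j."""
    return matvec(transpose(dg_dF), dI_dg)


# Toy data (assumptions): spring-chain stiffness blocks and deformed positions.
K_ii = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
K_ib = [[-1.0, 0.0], [0.0, 0.0], [0.0, -1.0]]
s_i, s_b = [0.3, 0.6, 0.9], [0.0, 1.2]
dI_ds_i = [2.0 * s for s in s_i]  # derivative of the toy functional sum(s_k^2)
dI_ds_b = [2.0 * s for s in s_b]

dI_dg = gradient_wrt_g(K_ii, K_ib, dI_ds_i, dI_ds_b)
# One hypothetical shape parameter F that moves the right end point: g = (0, F).
dI_dF = gradient_wrt_F(dI_dg, [[0.0], [1.0]])
print(dI_dg, dI_dF)
```

The point of the design is that only one linear solve with the transposed interior block is needed, independent of the number of boundary displacements or shape parameters.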
Remark 6. Note that this deformation approach has advantages in the context of multi-
grid solution techniques as well. Successful application of multigrid requires a hierarchy
of meshes, with the coarsest meshes in the hierarchy consisting of a relatively small num-
ber of elements or mesh cells only. Such hierarchical meshes are most easily created
by refining an initial coarse mesh. However, for domains with curved boundaries some
means of adjusting the refined mesh to fit the domain is required, for which the mesh
deformation approach can be easily applied.
One advantage of the discrete adjoint method compared to the continuous adjoint is that
preconditioners for the original problem can be utilised for the adjoint problem as well,
because the adjoint equations are defined by the transpose of the stiffness matrix of the
forward problem. In the context of non-symmetric problems like the incompressible
Navier-Stokes equations, it has to be observed that the transpose of a right preconditioner C_R^{-1}
for the original problem becomes a left preconditioner for the adjoint equations and vice
versa, simply because
\[
\left(K C_R^{-1}\right)^{T} = C_R^{-T} K^{T}.
\]
Thus in order to define a left Fp preconditioner for the adjoint problem a right precondi-
tioner for the forward problem should be transposed.
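The transposition identity behind this observation is easy to check numerically. The following sketch uses arbitrary 2 × 2 matrices as stand-ins for K and C_R (they are illustrative assumptions, not matrices from the flow discretisation), and the helper names are hypothetical.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


K = [[4.0, 1.0], [-2.0, 3.0]]    # non-symmetric "forward" matrix (toy)
C_R = [[2.0, 0.5], [0.0, 1.0]]   # toy right preconditioner

C_R_inv = inverse_2x2(C_R)
# Right-preconditioned forward operator, transposed ...
lhs = transpose(matmul(K, C_R_inv))
# ... equals the adjoint operator K^T preconditioned from the left by C_R^{-T}.
rhs = matmul(transpose(C_R_inv), transpose(K))
print(lhs, rhs)
```

In practice one would of course never form these products explicitly; the sketch only confirms on which side the transposed preconditioner has to be applied.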
The zero mean pressure condition (ZMPC), which for the forward problem has been
dealt with by means of the projection approach, see Subsection 2.5.2.3, has to be studied
closely again in the formulation and solution of the adjoint problem. Application of the
projection approach implies that the ZMPC becomes a “hidden” constraint in the sense
that it does not appear in the matrix defining the linear equation system. In our previous
publication [75] this issue has been dealt with explicitly by including this constraint into
the system matrix before it is transposed. This was done by replacing the last row of
the equation system by the ZMPC, which takes the form w^T p = 0, where w is a weight
vector and p the coefficient vector for the discrete pressure approximation, see Subsection
2.5.2.3. In the context of the adjoint equation even more disadvantages of this approach
arise in addition to those discussed in Subsection 2.5.2.3. Since the degrees of freedom of
the adjoint equations (adjoint variables Ψ) correspond to the individual equations of the
forward problem, the variable corresponding to the ZMPC has a special meaning which
the Fp preconditioner does not account for in its standard form. To overcome this, in
[75] we modified the preconditioner by adding an additional projection step in analogy
to the projection approach from Subsection 2.5.2.3. With this modification the adjoint
2.6.3 The discrete adjoint method applied in the finite volume context
Due to the Cartesian block structured meshes used in the finite volume (FV) discretisation
studied in this work, the mesh deformation approach cannot be applied to this
discretisation. Instead the geometries are restricted to what can be handled by parametric
meshes of the block Cartesian type. Note that this limitation does not apply to FV dis-
cretisations in general, as this discretisation technique has been developed specifically to
allow discretisations on unstructured meshes. For those interested in the issues related to
more flexible meshes and geometries we remark that the methods presented for the finite
element discretisation are applicable for more flexible FV discretisations as well.
It remains to present a solution procedure for the adjoint equations and to discuss the
derivatives of the FV discretisation with respect to the mesh parameters.
For solving the adjoint equation (2.6.3a) an analogous solution procedure to that for
the forward problem can be used, i.e. time-stepping in the case of the FV discretisation of
Section 2.4. The solution procedure is summarised in Algorithm 5, with the same notation
as used in Section 2.4.5. Compared to Algorithm 1, the nonlinear residual R(ω, F ) of the
forward problem is replaced by the residual of the adjoint equation r(Ψ) := J^T Ψ − ∂Ĩ/∂ω
and the Jacobian J is transposed. Note that this time-stepping procedure performed more
efficiently in our tests than solving the (linear) adjoint equation directly. It appears
likely that the quadratic growth of the cost of the GMRES algorithm with the number of
iterations is the source of this phenomenon. The relatively inaccurate solves of the
time-step systems (J^T + D)y = r require few GMRES steps. Performing these cheap solves
multiple times in order to converge to the solution of the adjoint equation is still more
efficient than performing the higher number of GMRES steps to directly solve the adjoint
equation accurately.
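The time-stepping idea can be sketched as follows. This is not a reproduction of Algorithm 5 but a minimal illustration under assumptions that should be read as such: D is taken as a scaled identity d·I, the update is written as Ψ ← Ψ − y, a direct solve stands in for the inexact GMRES solve, and the matrix J and the right-hand side are toy data. All function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def adjoint_time_stepping(J, dI_dw, d, tol=1e-10, max_steps=500):
    """Iterate on J^T psi = dI/dw using time-step systems (J^T + d I) y = r."""
    n = len(dI_dw)
    JT = [list(col) for col in zip(*J)]
    step_matrix = [[JT[r][c] + (d if r == c else 0.0) for c in range(n)]
                   for r in range(n)]
    psi = [0.0] * n
    for step in range(max_steps):
        # Residual of the adjoint equation: r(psi) = J^T psi - dI/dw
        r = [sum(JT[i][k] * psi[k] for k in range(n)) - dI_dw[i] for i in range(n)]
        if max(abs(v) for v in r) < tol:
            return psi, step
        y = solve(step_matrix, r)  # stand-in for a few GMRES steps
        psi = [p - v for p, v in zip(psi, y)]
    return psi, max_steps


J = [[4.0, 1.0], [-2.0, 3.0]]  # toy non-symmetric Jacobian
psi, steps = adjoint_time_stepping(J, [1.0, 2.0], d=1.0)
print(psi, steps)
```

With D = d·I the error is multiplied by (J^T + D)^{-1} D in every step, so the iteration converges whenever the eigenvalues of J have positive real part; a smaller d (larger time step) converges in fewer steps but makes each time-step system closer to the full adjoint system.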
Like in the finite element case, the author recommends computing the term ΨTω [∂ R/∂ F ]
by accumulating local contributions of the form
\[
\Psi_{\mathrm{loc}}^{T} \frac{\partial R_{\mathrm{loc}}}{\partial F},
\]
rather than assembling the sparse matrix ∂ R/∂ F first, because ∂ R/∂ F would be used
for this one matrix vector product only. This allows considerable savings of time and
memory. In the FV case the local contributions are naturally cell interface based, like the
computation of the residual R(ω) itself. For the block Cartesian meshes considered in
this work the parameters F are the coordinates of the corners of each of the mesh blocks.
These enter the computation in two ways only: via the measure of the cell interface |A_j| in
formula (2.4.7), and via the distance of neighbouring cell centres h_j in formula (2.4.10).
Thus evaluation of the local derivatives ∂ Rloc /∂ F is straightforward.
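The accumulation of the product from interface-local contributions, without ever assembling ∂R/∂F, can be sketched as below. The setting is a deliberately simplified assumption: a one-dimensional diffusive flux f_j = |A_j| (ω_R − ω_L) / h_j, with a single interface measure A and a single spacing h acting directly as the two parameters (in this work the parameters are block corner coordinates instead). All names are hypothetical.

```python
def interface_block(w_left, w_right, A, h):
    """Derivatives of one interface's residual contributions w.r.t. (A, h).

    The flux f = A * (w_right - w_left) / h is added to the left cell's
    residual and subtracted from the right cell's residual (toy convention).
    """
    df_dA = (w_right - w_left) / h
    df_dh = -A * (w_right - w_left) / h ** 2
    return [[df_dA, df_dh], [-df_dA, -df_dh]]


def psi_T_dR_dF(psi, w, A, h):
    """Accumulate Psi^T [dR/dF] interface by interface."""
    result = [0.0, 0.0]
    for left in range(len(w) - 1):
        cells = (left, left + 1)
        block = interface_block(w[left], w[left + 1], A, h)
        for a, cell in enumerate(cells):
            for p in range(2):
                result[p] += psi[cell] * block[a][p]
    return result


w = [1.0, 3.0, 6.0]    # toy cell values
psi = [1.0, 2.0, 4.0]  # toy adjoint values
print(psi_T_dR_dF(psi, w, A=2.0, h=0.5))
```

Only a small dense block per interface exists at any time, which is where the savings in memory come from.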
Problem formulation: A modified version of the lid driven cavity problem serves as
a first example. A square obstacle of fixed size is placed in the cavity, see Figure 2.16.
The average velocity component in the negative x-direction V in the area ΩB , below the
obstacle, is used as performance criterion. The aim is to maximise this average velocity by
choosing the position for the obstacle appropriately. This may for example be of interest
if such a cavity is used in a coating process where the coating material contains very small
particles which fall out as sediment if the flow velocity becomes too small.
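\[
\Omega := (0,1)^2 \setminus \left[\tfrac14 + F_1, \tfrac34 + F_1\right] \times \left[\tfrac14 + F_2, \tfrac34 + F_2\right],
\]
\[
\Omega_B := \left(\tfrac14 + F_1, \tfrac34 + F_1\right) \times \left(0, \tfrac14 + F_2\right),
\]
\[
I := V := \frac{1}{|\Omega_B|} \int_{\Omega_B} -u_1 \, d\Omega. \tag{2.7.1}
\]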
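The geometry and the performance functional of this example can be sketched in a few lines. The sketch assumes that a velocity component u1(x, y) is available as a plain function (in the actual computations it is the discrete flow solution), uses a midpoint rule for the average, and all names are hypothetical.

```python
def obstacle_box(F):
    """Bounds (x_min, x_max, y_min, y_max) of the square obstacle for F = (F1, F2)."""
    F1, F2 = F
    return 0.25 + F1, 0.75 + F1, 0.25 + F2, 0.75 + F2


def region_below(F):
    """Bounds of Omega_B, the region between the obstacle and the cavity floor."""
    x_min, x_max, y_min, _ = obstacle_box(F)
    return x_min, x_max, 0.0, y_min


def performance(u1, F, n=64):
    """Average of -u1 over Omega_B, approximated by an n x n midpoint rule."""
    x0, x1, y0, y1 = region_below(F)
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += -u1(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
    return total / (n * n)


# Invented velocity field u1(x, y) = -y, used only to exercise the code.
print(performance(lambda x, y: -y, (0.0, 0.0)))
```

For the invented linear field the midpoint rule is exact, so the result is simply the mean height of Ω_B; moving the obstacle changes both the region and, in the real problem, the flow field itself.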
[Figure: square cavity Ω with lid velocity u = (1, 0)^T, the obstacle, and the region Ω_B below it in which the average velocity V is evaluated]
Figure 2.16: Example 2.1: Multiply connected cavity
The parameters F = (F_1, F_2)^T are the deviations of the centre of the obstacle from the
centre of the cavity. The position of the obstacle is constrained such that a gap remains
between the obstacle and the boundary of the cavity,
Due to the simple rectangular geometry this problem is well suited to be treated with
parametric meshes and both the finite element as well as the finite volume discretisation
of this work can be applied.
Results and discussion: The optimisation problem has been solved for Re = 10 (ν =
10^{-1}), Re = 20 (ν = 5 × 10^{-2}), Re = 100 (ν = 10^{-2}) and Re = 200 (ν = 5 × 10^{-3}).
In all four cases the optimisation solver was started with F = (0, 0)^T and converged to a
solution in the interior of the parameter space, i.e. the constraints (2.7.6) were not active
at the optimal solution. Figure 2.17 shows the velocity profiles along the lines x = 0.5
and y = 0.5 for both the initial as well as the optimised geometry. An increase in the
velocity at the bottom part of the domain can be observed in all four cases and the optimal
geometries show little difference between the different Reynolds number regimes.
Tables 2.16 and 2.17 list the corresponding values of the viscosity parameter ν (ν =
1/Re), the initial performance I_ini, the optimal performance I_opt, the optimal solution F_opt,
the number of iterations of the DONLP2 [84] optimisation solver (#it), the total number of
evaluations of the performance function by the optimisation solver (#eval) and the num-
ber of degrees of freedom in the discretised Navier-Stokes equations (#DOFs). It can be
observed in these tables that the first component of F_opt shows a clear dependency on ν
and that the FE and FV solvers compute optimal parameters that are very close to each
other. However, the initial as well as the optimal values of I differ significantly
between the two solvers. For this optimisation problem in two variables DONLP2
converges in only 6–9 iterations. However, the number of function
evaluations is slightly larger because the line search steps are not counted as iterations.
These numbers demonstrate the need for efficient solution strategies if optimisation is to
be considered.
In order to investigate the reason for the differing values of the initial and optimised
performance values I between the FE and FV discretisations, Tables 2.18 and 2.19 list these
quantities again for different levels of mesh refinement. Note that, due to the higher con-
vergence order, the FE discretisation produces very accurate approximations of the func-
tional even on the coarsest mesh, while the first order convergent FV scheme shows sig-
nificant differences even between the computed performance values on the finer meshes.
In order to show the difference between the finite element results for differing refinement
levels the results have to be listed to more digits than in the finite volume case. Due to
stability limitations (Pe ≤ 1) the FE discretisation can only be applied to the ν = 5 × 10^{-3}
case on the finest mesh with 446208 degrees of freedom. Thus no comparison between
refinement levels is possible in this case.
[Figure: four panels showing the velocity component orthogonal to the lines x = 0.5 and y = 0.5, for Re = 10, Re = 20, Re = 100 and Re = 200; curves: ini, opt]
Figure 2.17: Initial (F = (0, 0)^T) and optimised velocity profiles for Example 2.1
[Figure: geometry of the domain for Example 2.2, with dimension labels d1, a, d2 and b]
I := F1 , (2.7.8)
where n is the outward normal of the fluid domain Ω, which can also be seen as an inward
normal of the surface of Q. To keep the y-component of the acting force equal to zero, the
shape of the object Q is restricted to be symmetrical about the centre line of the domain.
The shape of Q is parameterised by cubic Bezier-splines (B-splines, [44, Appendix B]
or [30]) as demonstrated in Figure 2.19. In each spline segment these curves are cubic polynomials,
\[
x(t) = \sum_{i=0}^{3} x_i B_i^3(t),
\]
where x_i ∈ R^2 are so-called control points and B_i^n denote the Bernstein polynomials
\[
B_i^n(t) := \binom{n}{i} t^i (1 - t)^{n-i}.
\]
The curve segments start at their respective control point x_0 and end at x_3. The tangents
at the beginning and end points of each segment are defined by the lines x_0 x_1 and x_2 x_3
respectively. By defining the control points at a segment interface such that
(x_3 − x_2)|_left = (x_1 − x_0)|_right, continuously differentiable curves (B-splines) are
constructed. The red diamonds in Figure 2.19 mark the interface control points between
spline segments (x_0 and x_3) and the endpoints of the tangential lines mark the positions
of the interior control points (x_1 and x_2) of each segment. Due to the construction as
continuously differentiable curves only one of these interior control points can be chosen
arbitrarily per segment, apart from the last segment. The free control points are marked
by red circles in Figure 2.19.
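The construction described above can be sketched with a direct evaluation of the Bernstein form. The control points below are invented for illustration (they are not the control points of Figure 2.19) and the function names are hypothetical; the tangent formula used is the standard derivative of a Bézier curve.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein polynomial B_i^n(t) = C(n, i) t^i (1 - t)^(n - i)."""
    return comb(n, i) * t ** i * (1.0 - t) ** (n - i)


def bezier_point(ctrl, t):
    """Point x(t) = sum_i x_i B_i^n(t) of a Bezier segment with control points ctrl."""
    n = len(ctrl) - 1
    return tuple(sum(bernstein(n, i, t) * p[d] for i, p in enumerate(ctrl))
                 for d in range(2))


def bezier_tangent(ctrl, t):
    """Derivative x'(t) = n * sum_i (x_{i+1} - x_i) B_i^{n-1}(t)."""
    n = len(ctrl) - 1
    return tuple(
        n * sum(bernstein(n - 1, i, t) * (ctrl[i + 1][d] - ctrl[i][d]) for i in range(n))
        for d in range(2)
    )


# Two cubic segments sharing the interface point (3, 0.5). The interior control
# points are chosen such that (x3 - x2) of the left segment equals (x1 - x0)
# of the right segment.
left = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0), (3.0, 0.5)]
right = [(3.0, 0.5), (4.0, 0.0), (5.0, 0.0), (6.0, 0.0)]

print(bezier_point(left, 1.0), bezier_point(right, 0.0))      # same interface point
print(bezier_tangent(left, 1.0), bezier_tangent(right, 0.0))  # same tangent vector
```

Because the tangent at the interface is fixed by the neighbouring segment, only one interior control point per segment remains free, in line with the parameter count discussed above.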
The shape parameters for the optimisation problem are chosen to be the vertical po-
sitions of the control points (red circles and diamonds). The horizontal position of the
control points, as well as the vertical position of the two points on the symmetry axis, are
fixed. Thus, the parameterisation as illustrated in Figure 2.19 results in ten parameters.
The use of B-splines has several advantages. They allow great control of the resulting
curve in terms of geometrical constraints due to the convex hull property [30]. The Bezier
curve segments of the spline are guaranteed to be contained in the convex hull of the
control points of this specific segment. This allows simple geometric constraints on the
resulting curve to be expressed as constraints on the position of the control points of the
spline, which in turn form the parameters of the shape. The possibility of self overlapping
of the domain, and consequently of the mesh, is avoided by constraining the shape such
that it contains the green polygon in Figure 2.19, which is obtained as the convex hull of
the front and rear tip of the object, and two points at 0.1 a and 0.9 a horizontally, deviated
vertically from the centreline of Q by 0.02 a.
The application of parametric meshes for this problem is challenging and, while it
may be possible, it would not necessarily provide more insight than the previous example.
Instead, the more general approach of using the general mesh-generator Triangle [80]
in combination with mesh deformation is taken, see Sections 2.2.6 and 2.6.2.2. Due to the
non-rectangular geometry of the obstacle the finite volume discretisation of this present
work is not applicable, so only results for the finite element discretisation can be provided.
So the overall problem is:
Illustration of the adjoint solution and sensitivity: In order to illustrate the character
of the solution and the discrete adjoint solution, Figure 2.20 shows both of them in the
vicinity of the obstacle for Re = 10. It is interesting to observe that the discrete adjoint
solution Ψ behaves like a PDE solution, but only in the interior of the domain. Close to
the obstacle boundary effects from the definition of the adjoint problem in a discrete sense
are visible. The forcing terms ∂ I/∂ ω of the adjoint equation act in a very narrow region
around the boundary while effectively homogeneous Dirichlet boundary conditions are in
place. Corresponding to the singularity-like behaviour of the adjoint velocity variables
Ψ_u near the obstacle boundary, the adjoint pressure Ψ_p also shows extreme peaks in
the vicinity of the obstacle. The sensitivity DI/DF is visualised in the bottom part of
Figure 2.20 by adding the scaled gradient DI/DF to the current shape parameters. Thus
the illustrated shape modification shows a modification increment in the direction of the
strongest increase in the performance functional I.
Results: Optimisation has been performed for two Reynolds number regimes, Re = 10
and Re = 20. For the computations the dimensions of the domain have been set to b = 4,
d1 = 2, d2 = 6 and a = 1. The initial geometry of the obstacle was set to a B-spline
interpolation of the NACA0040 profile. Thus the width of the initial geometry is 0.4.
The volume of the obstacle is restricted to be greater than or equal to the volume of the initial
geometry, that is |Q| ≥ 4.2633. The results of the optimisation are summarised in Table
2.20, where the viscosity parameter ν, the corresponding Reynolds number Re, the initial
performance Iini , the optimised performance Iopt , the number of iterations of the DONLP2
optimisation solver (#it), the total number of performance function evaluations (#evals)
and the number of degrees of freedom in the discrete Navier-Stokes system (#DOFs) are
listed. Note that due to the automatic mesh generation the number of DOFs is not fixed but
depends on the current base mesh. The values given specify the range that was observed.
The optimised obstacle geometries are shown in Figure 2.19.
[Figure: obstacle shapes; legend: Initial (NACA0040), Optimised for Re = 10, Optimised for Re = 20]
[Figure 2.20, sensitivity plot; legend: shape, shape + a*grad]
Figure 2.20: Example 2.2: Solution, adjoint solution, and (scaled) sensitivity for the
obstacle at Re = 10
Discussion: The optimised shapes in Figure 2.19 may appear surprising, as one may
expect a more tear-drop-like shape to result in a lower drag, or maybe with the volume
and length constraints one may expect a cigar-like shape such as is used for airships.
However, these (expected) geometries are known for their good low drag performance in
high Reynolds number regimes (Re ≈ 10^4 to 10^6), in which different physical phenomena
are of importance than in the regimes for which these geometries have been optimised.
To put the results into perspective let us consider two scenarios where water flow past
such an obstacle is characterised by Reynolds number Re = 10. The kinematic viscosity ν
for water is ν_water = 1.1 × 10^{-6} m^2 s^{-1} [65]. The Reynolds number Re can be expressed in
terms of ν, characteristic length scale ℓ (length of the obstacle) and characteristic velocity
V (inflow velocity),
\[
Re = \frac{V \ell}{\nu}.
\]
Assuming a length of the obstacle of 1 m, the inflow velocity would have to be V = 1.1 ×
10^{-5} m s^{-1} in order to have Re = 10. On the other hand, if the inflow velocity is set to
V = 10 km/h ≈ 2.78 m s^{-1}, the length of the obstacle would have to be ℓ ≈ 3.96 × 10^{-6} m.
Thus it is probably more realistic to think of the fluid to be a very viscous one, like honey
for example.
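Both scenarios are rearrangements of Re = V ℓ / ν and can be checked with a few lines of arithmetic. The sketch below uses the kinematic viscosity quoted above for water; the function names are hypothetical, and the conversion from km/h to m/s divides by 3.6.

```python
NU_WATER = 1.1e-6  # kinematic viscosity of water in m^2/s, as quoted in the text


def reynolds(velocity, length, nu=NU_WATER):
    """Re = V * l / nu."""
    return velocity * length / nu


def velocity_for(re, length, nu=NU_WATER):
    """Inflow velocity in m/s needed to reach a given Reynolds number."""
    return re * nu / length


def length_for(re, velocity, nu=NU_WATER):
    """Obstacle length in m needed to reach a given Reynolds number."""
    return re * nu / velocity


def kmh_to_ms(v_kmh):
    """Convert km/h to m/s."""
    return v_kmh / 3.6


v_slow = velocity_for(10, 1.0)              # obstacle of length 1 m
l_small = length_for(10, kmh_to_ms(10.0))   # inflow velocity of 10 km/h
print(v_slow, l_small)
```

Either the flow is extremely slow or the obstacle is microscopic, which is why a much more viscous fluid is the more natural physical interpretation of these regimes.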
The reduction in the drag I achieved by the optimisation is relatively mild. This may
be attributed to the relatively good initial shape, which makes large improvements diffi-
cult.
In order to verify that the implementation of the discretisations is correct all three discreti-
sation schemes, Galerkin FEM, SDFEM and FVM, are applied to a standard test problem:
lid driven cavity flow, see Figure 2.21. The results are compared with those from Ghia et
al. [35], which are widely acknowledged as a good reference solution. Figures 2.22, 2.23
and 2.24 show velocity profiles along the lines x = 0.5 and y = 0.5, for Re = 100 and
Re = 1000. The results are presented for a sequence of meshes and compared to those
from Ghia et al.^15 Note that for Galerkin FEM Re = 1000 would require an extremely
fine mesh due to the stability requirements, and solving would be rather inefficient due
to the degrading efficiency of the Fp preconditioner and the F-block multigrid as
discussed in Section 2.5.2. Thus results for Re = 1000 are not presented for this case but
for Re = 10 instead, although for Re = 10 no values are presented in [35].
15 The results in [35] were computed on a mesh with 129 × 129 nodes, using a second order finite
difference scheme with first order upwinding for the convective terms.
[Figure: square cavity Ω with lid velocity u = (1, 0)^T]
Figure 2.21: Test setup: Lid driven cavity
In all cases the solutions appear to converge to values close to those from Ghia et al.,
although with differing convergence rates. The slow convergence of the first order FV
scheme is evident in Figure 2.24. Even at Re = 100 the first order scheme requires a very
fine mesh to get accurate results. At Re = 1000 even the finest mesh with more than 16
million cells produces a solution that is far from converged. Nevertheless, this comparison
shows the behaviour expected from a first order scheme.
Judging by the distance to the reference solution and the distance between sequentially
refined meshes, the SDFEM stabilised finite element discretisation achieves accuracy similar to
that of the FV discretisation with far fewer degrees of freedom; compare, for example,
the lines bcfvs 256^2 cells in Figure 2.24 with feins 257^2 nodes^16 in Figure 2.23. Note that
oscillations remain for the coarse mesh SDFEM solutions, in the Re = 100 regime and
even more pronounced in the Re = 1000 regime. In both regimes the two finest meshes
shown in Figure 2.23 show no visible difference, thus the solution can be regarded as well
converged, even though converged to slightly different values than the reference solution.
16 FEINS is the name of the FE solver of this work. The name is an abbreviation for Finite Elements for
Incompressible Navier Stokes. The FV solver has been given the name BCFVS, Block Cartesian Finite Volume
Solver.
[Figure: u-velocity component along the line x = 0.5 and v-velocity component along the line y = 0.5 for Re = 10 and Re = 100, with close-up views of the v-velocity; curves: Ghia et al. and feins on meshes with 17^2 to 2049^2 nodes]
Figure 2.22: Galerkin FEM velocity profiles (denoted by feins) for the lid driven cavity
problem along the lines x = 0.5 and y = 0.5 for Re = 10 and Re = 100 (and close-up
view).
[Figure: u-velocity component along the line x = 0.5 and v-velocity component along the line y = 0.5, including close-up views for Re = 1000; curves: Ghia et al. and feins on meshes with 9^2 to 513^2 nodes]
Figure 2.23: SDFEM velocity profiles (denoted by feins) for the lid driven cavity problem
along the lines x = 0.5 and y = 0.5 for Re = 100 and Re = 1000 (and close-up view).
[Figure: u-velocity component along the line x = 0.5 and v-velocity component along the line y = 0.5 for Re = 100 and Re = 1000, with close-up views; legend entries include bcfvs with 32^2, 64^2, 128^2 and 256^2 cells]
Figure 2.24: FV velocity profiles for the lid driven cavity problem along the lines x = 0.5
and y = 0.5 for Re = 100 and Re = 1000 (and close-up view).
In the case of the Galerkin finite element discretisation fast convergence is observed
as well, see Figure 2.22. Note that in both Reynolds number regimes there is no visible
difference between the four finest meshes, not even in the close-up views at the bottom of
the figure.
For both shape optimisation examples the derivative values DI/DF computed by the
adjoint approach have been verified by comparing them to those computed by the finite
difference approximation
\[
\frac{DI}{DF_i} \approx \frac{I(F_i + h) - I(F_i - h)}{2h}
\]
(central difference) with h = 10^{-5}. Note that smaller h led to unreliable results for the
finite difference approximation due to cancellation effects and the inexact solves of the
discretised Navier-Stokes systems. Tables 2.21 and 2.22 list the derivative values for the
finite element discretisation and finite volume discretisation as computed by the adjoint
method, finite differences (FD) and the relative difference between them. The tests have
been performed at Re = 20 for the initial geometries used in the optimisation runs for
Example 2.1 and Example 2.2, on meshes producing the number of degrees of freedom as
listed in the tables (#DOFs). The derivative values compare well and the most significant
relative errors occur only for those components of the gradient with small absolute value.
In those cases the relative error of the finite difference approximation is largest due to
cancellation effects. Table 2.21 also shows the total time taken by either method. In
Example 2.2 where the parameter vector is of dimension ten, the advantage of the adjoint
method is clearly visible.
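The verification procedure itself is generic and can be sketched independently of the flow solvers. In the sketch below an invented smooth function with a known analytic gradient stands in for the performance functional and its adjoint-based derivative; the function names are hypothetical.

```python
import math


def central_difference(I, F, h=1e-5):
    """Approximate DI/DF_i by (I(F_i + h) - I(F_i - h)) / (2 h) for each i."""
    grad = []
    for i in range(len(F)):
        F_plus, F_minus = list(F), list(F)
        F_plus[i] += h
        F_minus[i] -= h
        grad.append((I(F_plus) - I(F_minus)) / (2.0 * h))
    return grad


def relative_difference(adj, fd):
    """Column (adj-FD)/|FD| as reported in the verification tables."""
    return [(a - f) / abs(f) for a, f in zip(adj, fd)]


# Invented stand-ins: a smooth functional and its exact gradient, which plays
# the role of the adjoint-based derivative.
def toy_functional(F):
    return math.sin(F[0]) * F[1] ** 2


def toy_gradient(F):
    return [math.cos(F[0]) * F[1] ** 2, 2.0 * math.sin(F[0]) * F[1]]


F0 = [0.3, 1.5]
fd = central_difference(toy_functional, F0)
rel = relative_difference(toy_gradient(F0), fd)
print(fd, rel)
```

Note that the finite difference gradient needs two functional evaluations per parameter, whereas the adjoint approach needs one additional linear solve in total, which explains the growing advantage of the adjoint method as the number of parameters increases.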
Example 2.1, Re = 20 (ν = 0.05), #DOFs = 112512
adjoint        FD             (adj-FD)/|FD|
-1.6259e-02    -1.6261e-02     1.5167e-04
-5.1754e-01    -5.1754e-01     9.1737e-06
t_adj = 5.2e+02, t_FD = 1.7e+03

Example 2.2, Re = 20 (ν = 0.05), #DOFs = 588736
adjoint        FD             (adj-FD)/|FD|
 7.6111e+00     7.6111e+00     9.3887e-07
 2.8666e+00     2.8666e+00    -8.6047e-08
 1.1410e+01     1.1410e+01     4.8489e-07
 9.5121e+00     9.5121e+00     6.3536e-07
-1.6896e+00    -1.6896e+00    -3.2079e-07
-1.7203e-02    -1.7195e-02    -4.1383e-04
-1.7018e+00    -1.7018e+00     4.2783e-06
 1.3484e+00     1.3484e+00     1.0735e-06
 1.7395e+00     1.7395e+00    -2.2238e-07
 2.0839e-01     2.0840e-01    -3.0649e-05
t_adj = 2.9e+01, t_FD = 3.5e+02
Table 2.21: Verification of adjoint derivative evaluation in the FE discretisation by com-
parison to finite difference values
Example 2.1, Re = 20 (ν = 0.05), #DOFs = 147456
adjoint        FD             (adj-FD)/|FD|
-1.5851e-02    -1.5851e-02    -4.9369e-07
-5.0701e-01    -5.0701e-01     7.4693e-09
Table 2.22: Verification of adjoint derivative evaluation in the FV discretisation by com-
parison to finite difference values
2.8 Conclusions
In this chapter the application of the discrete adjoint method to shape optimisation for
fluid dynamics problems has been discussed. The method has been successfully applied
for two discretisation techniques, namely finite elements and finite volumes. Since shape
optimisation invariably requires repetitive evaluation of the performance criterion and its
derivatives, particular emphasis has been given to the need for efficient solution tech-
niques for the linear systems arising in both the forward and the adjoint problems. It
has been demonstrated that efficient solution strategies for the forward problem lead to
efficient solution methods for the adjoint problem in a natural way, requiring only a few
modifications. Various aspects arising in the application of the discrete adjoint method
have been discussed in Section 2.6, including the computation of partial derivatives of the
discretisation with respect to the mesh. The utility of the presented approaches has been
demonstrated by successfully applying them to two shape optimisation example prob-
lems.
A more detailed discussion of conclusions regarding this chapter, including a discus-
sion of opportunities for future research, will be given in Chapter 4 together with the
conclusions for the second major part of this thesis. This second part deals with an en-
tirely different application of the discrete adjoint method: adaptive mesh design.
Chapter 3
Adaptive mesh design
In this chapter we consider adaptive mesh design as a topic of its own, not limited to
problems from fluid dynamics. Thus, the focus is on PDEs in general, using a model
problem and the finite element discretisation to develop a (hopefully) general approach.
The discussion is meant to be independent of the previous chapter, apart from Section
3.4.2.2, where some results from Chapter 2 regarding the application of the discrete ad-
joint method in the finite element discretisation are used.
3.1 Introduction
The use of a posteriori error estimation in order to guide local mesh refinement is now
common in the finite element (FE) solution of PDEs [3, 13, 74, 107]. Such techniques
not only provide a reliable indication of the overall accuracy of a computed solution,
but also indicate reliably which regions of the computational domain contribute most
(and least) to the overall error in a given solution. Recently, this approach has been
augmented by the development of a posteriori error estimation techniques for quantities
of interest which are derived from the solution of the PDE. Typical examples are described
in [11, 37]. This development is significant since it is frequently the case that
quantities such as drag, lift, local fluxes, etc., are of more interest to the user than
the overall solution. Hence the numerical solution procedure should seek to approximate
these chosen quantities as efficiently as possible.
Chapter 3 133 Adaptive mesh design
For the purposes of developing efficient adaptive finite element software, reliable error
estimation, whilst necessary, is not sufficient. Some mechanism is also required for using
this information in order to construct an improved trial space from which to seek a new
solution. One possibility is to locally enrich the polynomial order of the FE trial space
(p-refinement) in regions of the domain where the error is largest and the solution is
judged to be sufficiently smooth [2, 9]. Alternatively, isotropic local refinement
(h-refinement) of the FE mesh may be used [70, 83], and recently significant research has
focused on the efficient combination of these two [4, 78]. In this work our focus is only on
techniques for improving the trial space based upon adapting the FE mesh, rather than
enriching the polynomial degree, which we take to be piecewise linear throughout this chapter.
For many problems isotropic local refinement of the current mesh provides a per-
fectly satisfactory mechanism for the adaptive procedure. However, this is not always the
case. A significant number of practical problems have solutions which possess features
that are highly anisotropic (shocks or boundary layers for example). In such situations,
unless one starts from a mesh designed using prior knowledge concerning the location
and orientation of such features, regular mesh refinement is generally far from optimal.
Numerous authors have considered mechanisms for addressing this problem through the
development of techniques for anisotropic refinement. Examples include the approach of
[32], which makes use of a solution-dependent metric based upon an approximation to the
Hessian matrix, or the technique described in [81], which is suitable for tensor product
meshes. See also the work of Kunert (e.g. [53, 54]) on a posteriori error estimation for
anisotropic meshes or the experimental results of [7].
This chapter investigates an alternative approach to the problem of automatically
adapting a mesh, or meshes, based upon a posteriori error estimates for problems whose
solutions exhibit anisotropic behaviour. This is based upon not only computing an error
estimate, but also calculating the sensitivity of this estimate to the positions of the nodes
in the current mesh, which can be done efficiently with the discrete adjoint method.
This sensitivity information may then be used to improve the quality (in the sense of
reducing the estimated error) of the existing mesh without increasing the dimension of
the FE trial space. The ultimate goal would be to adjust the mesh in order to position the
nodes in locally optimal locations, by employing techniques from mathematical optimi-
sation for example. However, this is still an extremely demanding and computationally
expensive task, even when the sensitivity information can be computed inexpensively.
Furthermore, it is essential to ensure that the error estimate itself remains reliable on the
meshes produced and so a number of constraints are proposed to reflect this requirement.
An approach has been developed therefore that seeks to balance the goal of reducing the
estimated error for a given mesh connectivity with the need to maintain control over both
the computational cost and the quality of the error estimate.
The developed approach falls into the class of r-refinement methods (e.g. [10, 12, 48,
56, 99]), which generally covers adaptivity by node redistribution (hence r-refinement).
However, the use of the sensitivity of a posteriori error estimates distinguishes it from
previous approaches, which either construct the mesh movement as the solution of a
nonlinear PDE (e.g. [10, 48, 56]) or solve sequences of local optimisation problems
only (e.g. [12, 99] solve small optimisation problems for each node in a Gauß-Seidel-like
fashion). We refer to [10] for a review of r-refinement approaches.
The new approach has been implemented and tested for the solution of a class of
singularly-perturbed reaction-diffusion equations. These problems were selected because
a priori analysis is available to guide the design of anisotropic meshes, for example using
so-called Shishkin meshes (see e.g. [8] and the survey article [71]), thus allowing com-
parisons to be made against our new, more general a posteriori approach. The extension
to wider classes of problems is also discussed in the concluding remarks in Section 4.2.
Note that the material in this chapter is in large parts a reproduction of our previous
work [76].
\[ e = u - u_h. \tag{3.2.1} \]
Error estimation generally refers to attempts to estimate this quantity, usually by seeking
bounds on localised (e.g. to an element) or global norms. There are two main motivations
to do this. The first is to provide the user with some indication of the accuracy of the com-
puted approximation. The second is to provide information as to how the approximation
may be improved.
Generally one distinguishes two approaches to error estimation: a priori and a pos-
teriori. The first, a priori error estimation, utilises properties of the differential operator,
the domain and the approximation method, but not the approximated solution, in order
to derive estimates (e.g. [21]). These estimates are of an asymptotic nature, describing
the behaviour of the error in terms of discretisation parameters such as the mesh size pa-
rameter h. The constants appearing in these estimates are generally unknown, thus the
very small minimal angles in the triangular elements. If, for example, an error estimator
grossly underestimates the error for certain deformations of the mesh, the optimisation is
likely to steer towards meshes which show these deformations. Thus, the results would
be useless in such a case.
Robust energy norm error estimation techniques for anisotropic meshes have, for ex-
ample, been introduced in [54]. Unfortunately this estimate fails the criterion of differen-
tiability. It describes element dimensions in terms of the longest edge and the dimension
perpendicular to it. These terms are obviously not continuously differentiable with respect
to the node positions. The estimator is a generalisation of the basic residual estimator [3,
Section 2.2] to anisotropic meshes. The estimation of the local error contributions in
terms of characteristic element dimensions hT in general is unlikely to be a good choice
for optimisation by means of node movement, because this choice discards directional
information on the element and on the error behaviour. However, this directional infor-
mation is crucial for anisotropic problems.
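The loss of differentiability caused by a "longest edge" term can be seen in a small
numerical sketch. The example is illustrative only: the triangle, the helper name
`longest_edge` and the step size `h` are assumptions made here and are not taken from [54].
The apex of an isosceles triangle is moved through the height at which the longest edge
switches from the base to a side, and the one-sided slopes on either side are compared.

```python
import math

def longest_edge(apex_y):
    """Longest edge of the triangle (0,0), (1,0), (0.5, apex_y)."""
    base = 1.0
    side = math.hypot(0.5, apex_y)  # both sides have the same length
    return max(base, side)

# Height at which the two sides become as long as the base.
y_kink = math.sqrt(0.75)
h = 1e-6  # assumed step for the one-sided difference quotients

# One-sided difference quotients with respect to the apex position.
slope_below = (longest_edge(y_kink) - longest_edge(y_kink - h)) / h
slope_above = (longest_edge(y_kink + h) - longest_edge(y_kink)) / h

print(f"slope just below the kink: {slope_below:.4f}")
print(f"slope just above the kink: {slope_above:.4f}")
```

The two slopes differ by a finite amount, so a gradient-based optimiser would see a kink in
any estimate built from this quantity.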
In this sense an error estimate which is computed by using the element geometry directly,
without using artificial quantities such as hT, is more desirable¹. One class of
such energy norm error estimates is defined by the local problem error estimators, e.g.
[98, Section 1.3] and [3, Section 3]. For the construction of these estimators the mesh
is subdivided into small patches (which may even be the individual elements). The
estimator requires the solution of auxiliary PDE problems on these patches with enriched
finite element spaces in order to get an approximation of the local error. Since these
local problems have a very similar structure to the original finite element approximation,
differentiability is preserved and, moreover, the same techniques can be used for the
computation of the derivatives, e.g. the discrete adjoint method. Unfortunately the
analysis of this estimator in [98], as well as in [3], is based on assumptions incompatible
with anisotropic meshes. Thus, its robustness with respect to mesh deformations is
not guaranteed. However, this does not imply that the estimation approach is unsuitable
for anisotropic meshes. It is quite possible that a different analysis could establish its
robustness under assumptions compatible with anisotropic meshes.
We will not consider this error estimation technique further in this thesis, but it
remains an example of an energy norm error estimator which is sufficiently smooth to be
used as an optimisation criterion for the optimisation techniques envisaged in this work.
¹ This only concerns the formulation of the estimate, i.e. the formula defining it. Analysis of the estimate
in terms of characteristic length-scales is of course still possible.
where $g$ is a kernel function and $u \in H^1_{g_D,\Gamma_D}$ is the unknown solution to a PDE
whose weak form is
\[ a(u, v) = b(v) \quad \forall v \in H^1_{0,\Gamma_D}. \tag{3.2.3} \]
Here $\Gamma_D$ denotes the Dirichlet part of the boundary of the domain $\Omega$. Let $u_h$ be
defined as the solution of the FE discretisation of the weak form (3.2.3), hence
$u_h \in V_{g_D,h}$ such that
\[ a(u_h, v_h) = b(v_h) \quad \forall v_h \in V_{0,h}, \tag{3.2.4} \]
where $V_{f,h}$ denotes the FE function space $V_{f,h} \subset H^1_{f,\Gamma_D}$, and let the
discretisation error $e$ be defined as in (3.2.1). The dual in the Dual Weighted Residual
method comes from its use of the solution $z$ of the dual problem, find
$z \in H^1_{0,\Gamma_D}$ for which
\[ a(\varphi, z) = J(\varphi) \quad \forall \varphi \in H^1_{0,\Gamma_D}, \tag{3.2.5} \]
which is well defined since $J$ is a bounded linear functional and we assume the forward
problem (3.2.3) to be well defined. Furthermore the linearity of $J$ implies that the error in
the quantity of interest is $J(u) - J(u_h) = J(e)$, and from (3.2.5)
\begin{align*}
J(e) &= a(e, z) \\
     &= a(u, z) - a(u_h, z) \\
     &= b(z) - a(u_h, z). \tag{3.2.6}
\end{align*}
Note that the use of any approximation $z_h \in V_{0,h}$ for $z$ in (3.2.6) does not provide any
information on $J(e)$ since $b(z_h) - a(u_h, z_h) = 0$ $\forall z_h \in V_{0,h}$ by Galerkin
orthogonality. This implies that the FE approximation to the dual solution is not sufficient
to provide a successful error estimator, at least so long as it is computed from the same
space as $u_h$. Of course computing an exact solution to the dual problem (3.2.5) is just as
difficult as computing an exact solution to the original problem (3.2.3), and therefore it is
necessary to consider an approximate solution for $z$, which implies that instead of the true
error $J(e)$ it is only possible to obtain an estimate for it. To emphasise that this
approximation is different from $z_h$ it will, from here on, be denoted by $z_{app}$, and the
corresponding approximation to $J(e)$ shall be denoted by
\[ J_{est} := b(z_{app}) - a(u_h, z_{app}). \tag{3.2.7} \]
Several ways of defining zapp in order to obtain such an estimate Jest have been proposed,
e.g.
1. solve the dual problem with a higher order method on the same mesh,
2. solve the dual problem with the same order method on a uniformly refined mesh,
3. use the same trial space to solve the dual problem, but use a higher order interpolant
of zh as zapp.
Solving the dual problem with a higher order method or on a uniformly refined mesh
means that the error estimate will be more expensive to obtain than the solution of the
original discretised equations. In the former case the overhead results from more degrees of
freedom and larger matrix stencils, as well as the need for more accurate integration
formulae. In the latter case the number of unknowns in the discrete problem will be increased
by a factor of four or eight (in 2D and 3D respectively). However, it does have the advantage
of using the same order method, hence significant parts of the original code should be
directly reusable. Furthermore, the estimated error Jest can be used to get an improved
approximation to J, thus effectively delivering the accuracy that one would get if one solved
the problem on the finer mesh.
The third alternative of using a higher order interpolant of zh as zapp is described as
a strong competitor in [11]. However, the quality of this approximation Jest relies on a
super-convergence property which in turn relies on uniform meshes and smoothness of
the dual solution z. As this work concentrates on anisotropic meshes this approach is less
suitable, as demonstrated by the results in Section 3.4.3 of this thesis, where the second
and third methods are compared for an anisotropic problem for which the meshes are
stretched accordingly. It is found that the robustness of the second method justifies its
additional expense.
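The error identity (3.2.6) and the role of Galerkin orthogonality can be illustrated with a
purely algebraic analogue. This is a sketch under stated assumptions, not the finite element
setting of this thesis: the matrix `A` stands in for the bilinear form, the columns of `P`
span a hypothetical trial and test space, the extra columns in `P_rich` play the role of an
enriched space for zapp, and all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 12, 4   # assumed sizes of the "continuous" and the discrete space

# Stand-in for a(u, v) = v^T A u: nonsymmetric with a positive definite symmetric part.
A = (2.1 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
     + 0.5 * (np.eye(n, k=1) - np.eye(n, k=-1)))
b = rng.standard_normal(n)        # load functional b(v) = v^T b
j = rng.standard_normal(n)        # quantity of interest J(u) = j^T u
P = rng.standard_normal((n, m))   # columns span the trial/test subspace

u = np.linalg.solve(A, b)                            # "exact" primal solution
u_h = P @ np.linalg.solve(P.T @ A @ P, P.T @ b)      # Galerkin solution in range(P)
z = np.linalg.solve(A.T, j)                          # "exact" dual solution
z_h = P @ np.linalg.solve(P.T @ A.T @ P, P.T @ j)    # dual solved in the same space

residual = b - A @ u_h            # represents v -> b(v) - a(u_h, v)
true_error = j @ (u - u_h)        # J(e)
dwr_exact = z @ residual          # b(z) - a(u_h, z), the right-hand side of (3.2.6)
dwr_same_space = z_h @ residual   # vanishes by Galerkin orthogonality

# Dual approximation from a larger space, giving a computable, non-trivial estimate.
P_rich = np.hstack([P, rng.standard_normal((n, 4))])
z_app = P_rich @ np.linalg.solve(P_rich.T @ A.T @ P_rich, P_rich.T @ j)
J_est = z_app @ residual

print("J(e)                      :", true_error)
print("b(z) - a(u_h, z)          :", dwr_exact)
print("same-space dual weighting :", dwr_same_space)
print("enriched-space estimate   :", J_est)
```

The first two printed values agree to rounding error and the third is zero to rounding
error. How close the last value is to J(e) depends entirely on how well the enriched space
captures z, which is the point made in the discussion of the three alternatives above.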
Error estimates such as the DWR estimate are commonly used to guide local mesh
refinement in regions which contribute most to the estimated error. This may be achieved,
for example, by evaluating the right-hand side of (3.2.7) separately on each element and
then refining those elements with the largest contributions. This refinement is normally
done by subdividing elements into smaller ones of the same or similar shape. In two
dimensions this typically involves division into two [14] or four [70] sub-triangles, and
for tetrahedra in three dimensions into either two or eight sub-elements (as in [74] or [83]
respectively).
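Selecting "those elements with the largest contributions" can be sketched as follows. The
fixed-fraction rule, the function name `mark_elements` and the sample values are assumptions
made for illustration; the text above does not prescribe a particular marking strategy.

```python
import numpy as np

def mark_elements(contributions, fraction=0.25):
    """Return the indices of the elements with the largest |contribution|.

    `fraction` is the share of elements to refine (an assumed tuning parameter).
    """
    eta = np.abs(np.asarray(contributions, dtype=float))
    n_mark = max(1, int(np.ceil(fraction * eta.size)))
    # argsort is ascending, so the last n_mark entries are the largest.
    return np.sort(np.argsort(eta)[-n_mark:])

# Hypothetical element-wise contributions to the error estimate (signs may differ).
contrib = [1e-4, -3e-2, 5e-3, 2e-2, -1e-5, 7e-4, 9e-3, -2e-6, 4e-2, 1e-3]
marked = mark_elements(contrib, fraction=0.25)
print("elements to refine:", marked)
```

Absolute values are used because the element-wise terms of a dual weighted residual can
cancel in the sum while still indicating where the discretisation is inadequate.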
For the remainder of this chapter attention is restricted to problems in two dimensions
and h-refinement based upon sub-division of triangles into four child elements. This
method of local refinement is referred to as isotropic local mesh refinement in order to
distinguish it from other approaches where an element may be refined differently in dif-
ferent spatial directions. Treatment of hanging nodes, which come about when an element
is refined but a neighbouring element is not, is briefly discussed in Subsection 3.4.2.1.
3.3.1 Overview
The approach discussed in this section may be summarised as seeking to utilise the deriva-
tives, with respect to the positions of the nodes of the finite element mesh, of an a poste-
riori error estimate in order to guide anisotropic local mesh refinement. These derivatives
may be computed at relatively low cost via the discrete adjoint technique described in
Section 1.1. Numerous possibilities exist for using this derivative information, but this
work concentrates on moving nodes of an existing mesh in order to reduce the estimated
error (and therefore, one hopes, the actual error) in the quantity of interest. Naturally one
must impose certain constraints on this node movement in order to ensure that the mesh
remains suitable for finite element calculations (e.g. tangling of the elements should be
avoided). However, restrictions are also necessary to ensure that the error estimate itself
remains a reliable approximation of the error.
Of course the use of node movement alone can only redistribute a constant number of
degrees of freedom and so for most practical problems it will be necessary to combine this
with isotropic local mesh refinement in some way. Such a hybrid approach will not generally
improve the asymptotic convergence properties of the underlying FE discretisation;
however, realistic goals are to yield a better constant for this asymptotic behaviour and/or
to reach the asymptotic regime more quickly (i.e. with fewer degrees of freedom). The
remainder of this section considers a number of important issues that must be addressed in
the design of this type of hybrid algorithm. These include consideration of: how the node
movement may be defined, what constitutes a good quality measure in order to drive the
refinement, what restrictions on the mesh are required, the implementation of appropriate
restrictions on the node movement, and techniques for combining node relocation with
isotropic local refinement.
Minimise Je,est (s), with respect to the node positions s, subject to:
Constraints 1, 2 and 3 are standard geometric restrictions. Note that bounding the an-
gles from below is not appropriate since this prevents the possibility of strongly anisotropic
meshes. Condition 4 is introduced as a safeguard to ensure the reliability of the DWR er-
ror estimate, as will be explained in Section 3.3.5. The realisation of these constraints and
a suitable choice of Je,est are discussed in detail in the following subsections.
It should be noted from the outset that, regardless of the precise quantitative realisation
of the four constraints or the precise definition of Je,est , Problem 1 is highly nonlinear and
its solution is extremely challenging for all but the most trivial finite element meshes. For
a practical adaptive algorithm it is therefore essential to limit the computational effort that
goes into attempting to find a solution, if one even exists.
Here Je,T represents the right-hand side of (3.2.7) evaluated on element T , of triangulation
T , only. This choice of Je,est
Equidistribution of the error would imply that the resulting mesh would form a good basis
for further uniform mesh refinement. However, since true equidistribution is unlikely to
be achieved, further adaptive isotropic local refinement, as discussed in Subsection 3.3.6,
will be a better choice.
3.3.5 Constraints
The approach of moving nodes in order to reduce a particular error estimate requires a
high level of reliability from that error estimate. Specifically, it must work for any mesh
that arises during the adaptive process. If this were not the case then it would be possible
to reach situations where the optimisation process was driving down the estimate of the
error but the quality of this estimate was deteriorating in such a way that the true error
would be allowed to go up rather than down. There has been a significant amount of
research on the topic of reliable a posteriori error estimation for highly anisotropic prob-
lems in recent years (e.g. [54, 55]) and typical results demonstrate the quality of such
estimates provided the mesh is well aligned with the solution and rapid changes in the
discretisation step sizes do not occur inside the layers. Similarly, in this work we have
found that provided the aspect ratio of neighbouring elements is not permitted to change
too quickly, the estimate (3.2.7) is also very reliable (as we will illustrate in Section 3.4.3).
Consequently, this proviso is imposed as Constraint 4 in Problem 1. This constraint may
be realised by bounding the ratio of the areas of elements sharing a common edge.
Specifically, for each edge in the interior of the FE mesh, let $A_1$ and $A_2$ be the areas
of the two elements that share that edge. Then it is necessary that
\[ c_4 A_1 - A_2 \ge 0 \quad \text{and} \quad c_4 A_2 - A_1 \ge 0 \]
for some predetermined constant $c_4 > 0$. For the examples described in this thesis the
constant is typically chosen to be $c_4 := 5$.
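A check of this area-ratio condition on a given triangulation might look like the sketch
below. The data layout (an array of node coordinates and a list of vertex index triples) and
the function names are assumptions made for illustration and do not describe the
implementation used for the results in this thesis.

```python
import numpy as np

def triangle_area(p0, p1, p2):
    """Unsigned area of a triangle given its three vertices."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def area_ratio_violations(nodes, triangles, c4=5.0):
    """Interior edges whose two neighbours violate c4*A1 - A2 >= 0 or c4*A2 - A1 >= 0."""
    areas = [triangle_area(*(nodes[i] for i in tri)) for tri in triangles]
    edge_to_tris = {}
    for t, tri in enumerate(triangles):
        for k in range(3):
            edge = tuple(sorted((tri[k], tri[(k + 1) % 3])))
            edge_to_tris.setdefault(edge, []).append(t)
    bad = []
    for edge, tris in edge_to_tris.items():
        if len(tris) == 2:  # an interior edge is shared by exactly two elements
            a1, a2 = areas[tris[0]], areas[tris[1]]
            if c4 * a1 - a2 < 0 or c4 * a2 - a1 < 0:
                bad.append(edge)
    return bad

# Two triangles sharing the edge (1, 2); the second is ten times smaller than the first.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.55, 0.55]])
triangles = [(0, 1, 2), (1, 3, 2)]
violations = area_ratio_violations(nodes, triangles, c4=5.0)
print("edges violating the area-ratio bound:", violations)
```

Both inequalities are linear in the areas, and each area is a polynomial in the node
coordinates, so the condition is smooth in the node positions.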
\[ \cos(\alpha_i) - \cos_{\max} \ge 0, \]
which is easily evaluated from the vertex locations of the triangular element.
Since SQP is used as an iterative solver for the optimisation problem, it is possible to
aid the treatment of Constraint 2, the non-self-overlapping (NSO) condition, by applying
a trust region approach. Such approaches define at each iteration a region of trust for
the local search and it is only this region of the search space that is explored. This trust
region may be defined for each node individually: allowing the local search to move each
node only within a small region as illustrated in Figure 3.1. For this purpose an additional
parameter ctrust is defined and the trust region for each node is taken to be the polygon that
is defined by a set of linear inequality constraints, where for each edge opposing the node
in an element an inequality is defined such that the node may travel only up to ctrust of its
distance orthogonal to the straight line defined by the edge. The resulting linear inequality
constraints are enforced for the QP. This trust region approach may not be sufficient to
guarantee NSO on its own, but allows the exclusion of a large proportion of NSO-violating
node movements from the outset. Additionally, the NSO condition has to be verified for all
iterations, and any violating movements have to be scaled down until they satisfy NSO. If
each of the iterates is non-self-overlapping, Constraint 2 is guaranteed for the computed
solution.
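The two safeguards described above can be sketched for a small mesh as follows. This is an
illustration under assumptions: the function names, the counter-clockwise orientation
convention, and the step-halving rule used to scale down a violating movement are choices
made here, since the text does not fix how the scaling is carried out.

```python
import numpy as np

def signed_area(p0, p1, p2):
    """Signed area; positive for counter-clockwise vertex ordering."""
    return 0.5 * ((p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1]))

def trust_region_rows(nodes, triangles, node, c_trust=0.5):
    """Pairs (n, d) such that the displacement delta of `node` must satisfy n . delta <= d."""
    rows = []
    for tri in triangles:
        if node not in tri:
            continue
        a, b = (nodes[i] for i in tri if i != node)   # the edge opposing the node
        edge = b - a
        normal = np.array([-edge[1], edge[0]]) / np.linalg.norm(edge)
        dist = normal @ (nodes[node] - a)             # orthogonal distance to the edge line
        if dist < 0:
            normal, dist = -normal, -dist             # make the normal point towards the node
        rows.append((-normal, c_trust * dist))        # limit travel towards the edge
    return rows

def scale_to_nso(nodes, triangles, step, shrink=0.5, max_tries=30):
    """Largest factor shrink**k for which nodes + factor*step keeps every area positive."""
    factor = 1.0
    for _ in range(max_tries):
        trial = nodes + factor * step
        if all(signed_area(*(trial[i] for i in tri)) > 0 for tri in triangles):
            return factor
        factor *= shrink
    return 0.0

# Unit square with a centre node, split into four counter-clockwise triangles.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]

rows = trust_region_rows(nodes, triangles, node=4, c_trust=0.5)
step = np.zeros_like(nodes)
step[4] = [0.9, 0.0]            # would push the centre node outside the square
factor = scale_to_nso(nodes, triangles, step)
print("trust region bounds:", [d for _, d in rows])
print("accepted step factor:", factor)
```

For the centre node the four inequalities describe a square of half-width 0.25, so the
proposed step would already be excluded by the trust region; the orientation check is the
fallback for movements that the linear constraints do not catch.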
This approach allows for generalisation to more general domains with curved boundaries,
although this would be computationally more expensive and has not been undertaken here.
[Flowcharts of the two adaptive algorithms. First chart: adaptive solver, solve adjoint,
solve adjoint of dual, evaluate derivatives, move nodes, loop, refine mesh. Second chart:
adaptive solver, refine mesh, solve adjoint, solve adjoint of dual, evaluate derivatives,
move nodes.]
certain cases whilst a priori information regarding the solution behaviour is available in
other cases. This allows the quality of the opt-adapt and adapt-opt algorithms to
be contrasted against well established a priori results. It is envisaged, however, that the
a posteriori approach used here will be much more generally applicable than the a priori
theory.
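where $\varepsilon > 0$, $\Gamma_D \cup \Gamma_N$ is the boundary $\partial\Omega$ of the domain
$\Omega$, $|\Gamma_D \cap \Gamma_N| = 0$ and $|\Gamma_D| > 0$. As quantity of interest $J(u)$
consider the integral of the normal derivative of $u$ over the Dirichlet boundary,
\[ J(u) := \int_{\Gamma_D} \frac{\partial u}{\partial n} \, d\Gamma. \tag{3.4.2} \]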
Numerous alternative choices could have been made but this is selected partly for its
simplicity and partly for its similarity to the viscous drag terms that are frequently of
interest in fluid dynamics problems, e.g. Example 2.2 from Chapter 2. The weak form
of problem (3.4.1) is to find $u \in H^1_{0,\Gamma_D}$ such that
\[ a(u, v) = b(v) \quad \forall v \in H^1_{0,\Gamma_D}, \]
where
\[ a(u, v) := \int_\Omega \nabla u \cdot \nabla v + \frac{1}{\varepsilon^2} u v \, d\Omega, \]
\[ b(v) := \frac{1}{\varepsilon^2} \int_\Omega v f \, d\Omega + \int_{\Gamma_N} v g_N \, d\Gamma. \]
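holds for the solution $u$ of (3.4.1) for all $v \in H^1$. Re-arranging the above equation yields
\[ \int_{\Gamma_D} v \frac{\partial u}{\partial n} \, d\Gamma
   = \int_\Omega \nabla u \cdot \nabla v + \frac{1}{\varepsilon^2} u v \, d\Omega
   - \frac{1}{\varepsilon^2} \int_\Omega v f \, d\Omega
   - \int_{\Gamma_N} v \frac{\partial u}{\partial n} \, d\Gamma, \]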
and selecting v ≡ 1 gives two different expressions for J(u). Whilst these are equivalent
for the exact solution, the second expression provides a better order of convergence for
the FE solution and is therefore used from here on. This can conveniently be formulated
as
J(u) = a(u, 1) − b(1). (3.4.3)
At this point it should be noted that (3.4.3) is not a linear functional in $u$, but an
affine one. This does not present any significant difficulties, however, since, following
[11, Chapter 6] for example, the dual problem that must be solved for the error estimation
simply becomes
\[ a(\varphi, z) = a(\varphi, 1) \quad \forall \varphi \in H^1_{0,\Gamma_D}. \tag{3.4.4} \]
That is, it utilises the linear part of the functional only. Otherwise the application of the
DWR method is as described in Section 3.2.2.
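That the linear part suffices can be checked in two lines. The following is a short sketch
that uses only (3.4.3) applied to both $u$ and $u_h$, the definition of $e$ in (3.2.1), the
bilinearity of $a$, and the fact that $e$ vanishes on $\Gamma_D$ so that it is an admissible
test function in (3.4.4):

```latex
\begin{align*}
J(u) - J(u_h) &= \bigl( a(u, 1) - b(1) \bigr) - \bigl( a(u_h, 1) - b(1) \bigr)
               = a(u - u_h, 1) = a(e, 1), \\
a(e, z)       &= a(e, 1) = J(u) - J(u_h)
               \qquad \text{(taking } \varphi = e \text{ in (3.4.4))}.
\end{align*}
```

The constant term $b(1)$ cancels in the difference, so the error in the quantity of interest
is again $a(e, z)$ and the steps of (3.2.6) apply unchanged.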
For the purposes of this numerical assessment three example cases are selected with
different choices of Ω, f , ΓD , ΓN and gN .
Example 3.1.
Domain: Unit square, $\Omega = (0, 1)^2$.
Boundary conditions:
\[ u = 0 \quad \forall (x, y) : x = 1, \]
\[ \frac{\partial u}{\partial n} = 0 \quad \forall (x, y) : y = 1 \lor y = 0, \]
\[ \frac{\partial u}{\partial n} = \frac{1}{\varepsilon} e^{-1/\varepsilon} \quad \forall (x, y) : x = 0. \]
RHS function:
\[ f = 1. \]
Exact solution:
\[ u(x, y) = 1 - e^{(x-1)/\varepsilon}, \]
which, when ε is small, involves a steep boundary layer next to the x = 1 boundary.
Furthermore, using the representation (3.4.2), it is easy to show that J(u) = −1/ε for this
example. The solution for this example problem is illustrated in Figure 3.4 (left) for the
case ε = 1/10.
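For this example the two representations of J(u) can be compared numerically. The sketch
below is a plain check on the exact solution, not a finite element computation; the value of
`eps` and the 40-point Gauss-Legendre rule are arbitrary choices made here.

```python
import numpy as np

eps = 0.1

def u(x):
    """Exact solution of Example 3.1 (independent of y)."""
    return 1.0 - np.exp((x - 1.0) / eps)

f = 1.0
g_N = np.exp(-1.0 / eps) / eps        # Neumann data on the boundary x = 0

# Gauss-Legendre rule mapped from [-1, 1] to [0, 1].
xg, wg = np.polynomial.legendre.leggauss(40)
x, w = 0.5 * (xg + 1.0), 0.5 * wg

# v = 1 has zero gradient, so a(u, 1) reduces to eps^-2 times the integral of u over
# the unit square; u does not depend on y, so a one-dimensional rule is sufficient.
a_u1 = np.sum(w * u(x)) / eps**2
# b(1) = eps^-2 * integral of f + integral of g_N over x = 0 (other Neumann data vanish).
b_1 = f / eps**2 + g_N

J_weak = a_u1 - b_1                   # expression (3.4.3)
J_direct = -1.0 / eps                 # normal derivative integrated over x = 1, cf. (3.4.2)
print("a(u,1) - b(1) =", J_weak)
print("-1/eps        =", J_direct)
```

Both expressions agree for the exact solution, as stated above; the difference between them
only shows up once u is replaced by a finite element approximation.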
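Example 3.2.
Domain: Unit square, $\Omega = (0, 1)^2$.
Boundary conditions:
\[ u = 0 \quad \forall (x, y) : x = 1 \lor y = 1, \]
\[ \frac{\partial u}{\partial n} = \frac{1}{\sqrt{2}\,\varepsilon} \,
   e^{-1/(\sqrt{2}\,\varepsilon)} \, e^{(y-1)/(\sqrt{2}\,\varepsilon)}
   \quad \forall (x, y) : x = 0, \]
\[ \frac{\partial u}{\partial n} = \frac{1}{\sqrt{2}\,\varepsilon} \,
   e^{(x-1)/(\sqrt{2}\,\varepsilon)} \, e^{-1/(\sqrt{2}\,\varepsilon)}
   \quad \forall (x, y) : y = 0. \]
RHS function:
\[ f(x, y) = \frac{1}{2} \left( 1 - e^{(x-1)/(\sqrt{2}\,\varepsilon)}
   + 1 - e^{(y-1)/(\sqrt{2}\,\varepsilon)} \right). \]
This example is more challenging than the first, with a truly two dimensional solution.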
[Figure 3.4: two surface plots of u over (x, y).]
Figure 3.4: Solutions for Example 3.1 (left) and Example 3.2 (right) both for ε = 1/10
The solution for Example 3.2 is illustrated in Figure 3.4 (right) for the case ε = 1/10.
Example 3.3.
Domain: Square with a square hole,
\[ \Omega = (-1, 1)^2 \setminus \left( -\tfrac{1}{5}, \tfrac{1}{5} \right)^2. \]
Boundary conditions:
\[ u = 0 \quad \forall (x, y) \in \left[ -\tfrac{1}{5}, \tfrac{1}{5} \right]^2, \]
\[ \frac{\partial u}{\partial n} = 0 \quad \forall (x, y) : (x = \pm 1) \lor (y = \pm 1). \]
RHS function:
\[ f = 1. \]
This is a problem for which an exact solution is not known; however, to illustrate the
domain and the character of the solution, Figure 3.5 shows a mesh and a computed solution
for ε = 1/10.
[Figure 3.5: a mesh plot of the domain and a surface plot of u over (x, y).]
Figure 3.5: Example mesh and corresponding solution for Example 3.3 (ε = 1/10)
A difficulty related to local mesh refinement is the introduction of hanging nodes, which
come about when an element is refined but a neighbouring element is not. In order to ensure
that the resulting function space is conforming, three different approaches are commonly
used. The first uses temporary closure elements, as in [83, 91] for example; the second
constructs special basis functions for elements with hanging nodes, e.g. [92, 100]; whilst
the third simply restricts the values at the hanging nodes so as to ensure a conforming
solution, e.g. [69, 72]. In this case care needs to be taken to ensure that no element
consists of hanging nodes only, since such a refinement would not enhance the solution
quality. In any case the number of hanging nodes per element needs to be controlled in order
to avoid degradation of the mesh or function space. For the results described in this
chapter the common rule is adopted that the refinement level of neighbouring elements
must not differ by more than one, otherwise the coarser element is refined. A variant of
the third approach similar to that in [69] is employed, whereby the solution value at each
hanging node is constrained to be the linear interpolant of the values at the nodes that lie
at the ends of the edge from which it hangs. This is achieved here by explicitly modifying
the finite element system to restrict it to the conforming subspace of the non-conforming
FE function space. However, an alternative would be to follow [57], where the local
element matrices are assembled without regard for the hanging nodes and then the linear
interpolation constraints are imposed as part of the iterative solution procedure (through
This section provides an outline of how the discrete adjoint technique may be applied
to evaluate the derivative of the mesh-quality measure Je,est (s) with respect to the node
positions. Here Je,est takes the role of Ie in (1.1.1) and it is considered to be a function of
u, the coefficient vector of the primal finite element solution, and z, the coefficient vector
of the finite element solution of the dual problem. These quantities u and z are themselves
dependent upon the vector s of node coordinates for the underlying FE mesh. To complete
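the notation of Section 1.1 therefore, let
\[ \omega := \begin{bmatrix} u \\ z \end{bmatrix} \tag{3.4.5} \]
and
\[ R(\omega, s) := \begin{bmatrix} K(s) & 0 \\ 0 & K_{dual}(s) \end{bmatrix} \omega
   - \begin{bmatrix} b(s) \\ b_{dual}(s) \end{bmatrix}, \tag{3.4.6} \]
where $K(s)$ and $b(s)$ are the stiffness matrix and right-hand side from the FE discretisation
of the primal problem and $K_{dual}(s)$ and $b_{dual}(s)$ are those of the dual problem utilised
in the error estimate. The adjoint equation (1.1.6) therefore becomes
\[ \begin{bmatrix} K^T & 0 \\ 0 & K_{dual}^T \end{bmatrix} \Psi
   = \frac{\partial J_{e,est}}{\partial \omega} \tag{3.4.7} \]
and the total derivatives $DJ_{e,est}/Ds$ can be evaluated according to (1.1.7) as
\[ \frac{D J_{e,est}}{D s} = \frac{\partial J_{e,est}}{\partial s}
   - \Psi^T \begin{bmatrix} \dfrac{\partial (K u - b)}{\partial s} \\[2ex]
     \dfrac{\partial (K_{dual} z - b_{dual})}{\partial s} \end{bmatrix}. \tag{3.4.8} \]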
The partial derivatives with respect to the node positions s, ∂ Je,est /∂ s ,∂ (Ku − b)/∂ s
and ∂ (Kdual z − bdual )/∂ s, can be evaluated using the approach from Section 2.6.2.1, i.e.
by formulating the terms as sums according to quadrature formulas, applying the chain
rule of differentiation to reduce the problem to the evaluation of simpler terms and using
explicit formulae for these simpler terms as given in Proposition 1. As the integrals consist
exclusively of terms which have been present in the model problem in Section 2.6.2.1, a
more detailed derivation is omitted at this point. Again, the Jacobians ∂ (Ku − b)/∂ s
and ∂ (Kdual z − bdual )/∂ s form sparse matrices which are only used for one matrix-vector
product in (3.4.8). Thus, time and memory may be saved by computing this post-product
with $\Psi^T$ directly. For this purpose the adjoint solution vector $\Psi$ can be treated the
same way as $\omega$, as a concatenation of $u$ and $z$ components:
\[ \Psi = \begin{bmatrix} \Psi_u \\ \Psi_z \end{bmatrix}. \]
As a final remark on the adjoint equations themselves note that the location of hanging
nodes is determined by the position of their parents and so should not be included in
the vector s. Furthermore, any implementation of the expression (3.4.8) may have its
correctness verified by comparing the values of DJe,est /Ds to those computed, at much
greater expense, through the use of finite differences. For the implementation in this
thesis this has been done, confirming the correct implementation of the discrete adjoint
method.
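The structure of (3.4.7) and (3.4.8), together with the finite difference check, can be
sketched on a small synthetic problem. Everything specific in the example is an assumption
made for illustration: the systems depend affinely on s, the matrices are dense and random,
and the stand-in measure `J` is not the DWR-based quantity Je,est used in this thesis. Only
the pattern is the point: two independent transposed solves, a product with Ψ formed without
storing the Jacobians, and a comparison with central differences.

```python
import numpy as np

rng = np.random.default_rng(1)
n_u, n_z, n_s = 5, 7, 3   # assumed sizes of u, z and of the parameter vector s

def affine_family(n):
    """Hypothetical s-dependent system: K(s) = K0 + sum_k s_k K_k, b(s) = b0 + sum_k s_k b_k."""
    K0 = 4.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
    Ks = 0.1 * rng.standard_normal((n_s, n, n))
    b0 = rng.standard_normal(n)
    bs = rng.standard_normal((n_s, n))
    return K0, Ks, b0, bs

primal, dual = affine_family(n_u), affine_family(n_z)
M = rng.standard_normal((n_z, n_u))   # couples u and z in the stand-in measure

def solve_state(family, s):
    K0, Ks, b0, bs = family
    K = K0 + np.tensordot(s, Ks, axes=1)
    return K, np.linalg.solve(K, b0 + s @ bs)

def J(s):
    """Stand-in quality measure J(u(s), z(s), s) = z^T M u + |s|^2 / 2."""
    _, u = solve_state(primal, s)
    _, z = solve_state(dual, s)
    return z @ M @ u + 0.5 * s @ s

def adjoint_gradient(s):
    K, u = solve_state(primal, s)
    Kd, z = solve_state(dual, s)
    # Block-diagonal adjoint system, cf. (3.4.7): two independent transposed solves.
    psi_u = np.linalg.solve(K.T, M.T @ z)    # right-hand side dJ/du = M^T z
    psi_z = np.linalg.solve(Kd.T, M @ u)     # right-hand side dJ/dz = M u
    # Psi^T dR/ds, cf. (3.4.8), accumulated per parameter without forming the Jacobians.
    (_, Ks, _, bs), (_, Kds, _, bds) = primal, dual
    psi_dR = np.array([psi_u @ (Ks[k] @ u - bs[k]) + psi_z @ (Kds[k] @ z - bds[k])
                       for k in range(n_s)])
    return s - psi_dR                        # the partial derivative of J w.r.t. s is s

s0 = np.array([0.2, -0.1, 0.3])
grad_adj = adjoint_gradient(s0)
h = 1e-6
grad_fd = np.array([(J(s0 + h * e) - J(s0 - h * e)) / (2 * h) for e in np.eye(n_s)])
print("adjoint       :", grad_adj)
print("FD            :", grad_fd)
print("(adj-FD)/|FD| :", (grad_adj - grad_fd) / np.abs(grad_fd))
```

The adjoint route needs two linear solves in addition to the state solves, whatever the
length of s, whereas the central difference check needs two further pairs of state solves
per parameter.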
3.4.2.3 Optimisation
[Figure 3.6: two parametric meshes on the unit square and, for each mesh, two log-scale plots
titled "DWR method 1" and "DWR method 2" showing J(e), Jest, Je,est and "ref a" against the
mesh parameter a between 0.8 and 1.]
Figure 3.6: Error estimates on parametric meshes for Example 3.1, $\varepsilon = 10^{-2}$
3.4.4 Results
Figures 3.8, 3.9, 3.10, 3.11, 3.12 and 3.13 illustrate the performance of the opt-adapt
and adapt-opt algorithms for the three example problems considered. These are contrasted
with the performance of global and local isotropic mesh refinement and with
Shishkin meshes that are based upon a priori analysis (e.g. [8, 71] and references therein).
For each example four cases are considered, with $\varepsilon$ ranging from $10^{-1}$ to $10^{-4}$, and in
each case the relative error $|J_{\mathrm{est}}|/|J(u)|$ is plotted against the number of nodes in the primal
mesh. In the first two examples, for which the exact solution is known, $|J(e)|/|J(u)|$
is also plotted.
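When reading such convergence histories, the observed rate can be estimated as the slope of a straight-line fit in the log-log plane and compared with the reference slopes used in the figures (error proportional to $1/\sqrt{n}$ or to $1/n$). A minimal sketch follows; the function name `observed_rate` is hypothetical and the data are synthetic, chosen only to illustrate the fit, and are not results from this thesis.

```python
import numpy as np

def observed_rate(n_nodes, errors):
    """Least-squares slope of log(error) against log(number of nodes)."""
    slope, _intercept = np.polyfit(np.log(n_nodes), np.log(errors), 1)
    return slope

# Synthetic data (an assumption for illustration): error = 3 / n,
# i.e. behaviour parallel to the 1/n reference line.
n_nodes = np.array([1.0e2, 4.0e2, 1.6e3, 6.4e3])
errors = 3.0 / n_nodes

rate = observed_rate(n_nodes, errors)
print(f"observed rate: {rate:.3f}")
```

A fitted slope near −1 corresponds to the $1/n$ reference line and a slope near −0.5 to the $1/\sqrt{n}$ reference line; in practice only the points in the asymptotic regime should be included in the fit.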
One of the most significant observations concerning these results is that the estimated
error and the actual error follow each other closely for all of the meshes used in the
production of Figures 3.8, 3.9, 3.10 and 3.11. It should also be noted that all of the methods
perform in a similar, optimal, manner in the mildly anisotropic regime for $\varepsilon = 10^{-1}$.
Furthermore, for smaller values of $\varepsilon$, denoting increasingly anisotropic behaviour, both
opt-adapt and adapt-opt provide a significant advantage over the adaptive isotropic
refinement approaches, and this advantage becomes more pronounced as $\varepsilon$ gets smaller.
Indeed, both approaches achieve the desired goal of reaching the asymptotic regime more
quickly (with fewer degrees of freedom) and with a better constant. Of the two new
approaches opt-adapt is superior for these examples. This would become even
clearer if the comparison were made with respect to CPU time, since adapt-opt requires
a relatively large total number of SQP steps for increasingly large problems. The cost
of this approach therefore significantly exceeds that of opt-adapt. It should be noted,
however, that the implementations used to produce the results shown here are based
[Figure 3.8 panels: "history for eps=1.0e−01" and "history for eps=1.0e−02"; log-log plots of the relative error |J(e)/J| against #nodes, with J(e) and Jest curves for adapt-opt, opt-adapt, iso adapt, Shishkin and uniform meshes, and reference lines o(h)=o(1/sqrt(n)) and o(h2)=o(1/n).]
Figure 3.8: Convergence histories for Example 3.1, $\varepsilon = 10^{-1}$ and $\varepsilon = 10^{-2}$
[Figure 3.9 panels: "history for eps=1.0e−03" and "history for eps=1.0e−04"; log-log plots of the relative error |J(e)/J| against #nodes, with J(e) and Jest curves for adapt-opt, opt-adapt, iso adapt and uniform meshes, and reference lines o(h)=o(1/sqrt(n)) and o(h2)=o(1/n).]
Figure 3.9: Convergence histories for Example 3.1, $\varepsilon = 10^{-3}$ and $\varepsilon = 10^{-4}$
[Figure 3.10 panels: "history for eps=1.0e−01" and "history for eps=1.0e−02"; log-log plots of the relative error |J(e)/J| against #nodes, with J(e) and Jest curves for adapt-opt, opt-adapt, iso adapt, Shishkin and uniform meshes, and reference lines o(h)=o(1/sqrt(n)) and o(h2)=o(1/n).]
Figure 3.10: Convergence histories for Example 3.2, $\varepsilon = 10^{-1}$ and $\varepsilon = 10^{-2}$
[Figure 3.11 panels: "history for eps=1.0e−03" and "history for eps=1.0e−04"; log-log plots of the relative error |J(e)/J| against #nodes, with J(e) and Jest curves for adapt-opt, opt-adapt, iso adapt and uniform meshes, and reference lines o(h)=o(1/sqrt(n)) and o(h2)=o(1/n).]
Figure 3.11: Convergence histories for Example 3.2, $\varepsilon = 10^{-3}$ and $\varepsilon = 10^{-4}$
[Figure 3.12 panels: "history for eps=1.0e−01" and "history for eps=1.0e−02"; log-log plots of the relative error |J(e)/J| against #nodes, with Jest curves for adapt-opt, opt-adapt, iso adapt, Shishkin and uniform meshes, and reference lines o(h)=o(1/sqrt(n)) and o(h2)=o(1/n).]
Figure 3.12: Convergence histories for Example 3.3, $\varepsilon = 10^{-1}$ and $\varepsilon = 10^{-2}$
[Figure 3.13 panels: "history for eps=1.0e−03" and "history for eps=1.0e−04"; log-log plots of the relative error |J(e)/J| against #nodes, with Jest curves for adapt-opt, opt-adapt, iso adapt, Shishkin and uniform meshes, and reference lines o(h)=o(1/sqrt(n)) and o(h2)=o(1/n).]
Figure 3.13: Convergence histories for Example 3.3, $\varepsilon = 10^{-3}$ and $\varepsilon = 10^{-4}$
[Figure 3.14 panels: two meshes plotted over x and y, each axis running from −1 to 1.]
Figure 3.14: Initial and optimised coarse mesh for Example 3.3
[Figure panels: "opt−adapt final mesh for eps=1.0e−03", "adapt−opt final mesh for eps=1.0e−03" and "iso adapt final mesh for eps=1.0e−03", each shown over x, y from −1 to 1 and as a close-up for x from −0.21 to −0.19 and y from 0.19 to 0.21.]
Chapter 4
Conclusions
The application of the discrete adjoint method has been the main topic of this thesis. This
technique allows efficient evaluation of the derivative of a function I(s) with respect to
parameters s in situations where I depends on s indirectly, via an intermediate variable
ω(s), which is computationally expensive to evaluate. The method has been applied in
the context of shape optimisation for fluid dynamics systems and in adaptive mesh design.
In both cases the utility of the approach has been demonstrated by numerical experiments.
Both presented applications of this technique require the evaluation of partial deriva-
tives of the residuals of the discretised PDEs with respect to parameters of the discretisa-
tion, i.e. the underlying mesh. In the case of the finite volume discretisation considered
in this work (Section 2.4), this turned out to be trivial due to the simple structure of the
meshes considered, see Section 2.6.3. For the case of finite element discretisations using
isoparametric elements of arbitrary degree on unstructured meshes a general approach to
evaluating these derivatives has been introduced in Section 2.6.2.1, improving in general-
ity on previously published evaluation approaches (e.g. [43, 49]).
In Section 2.5 the need for efficient solution techniques for the linear systems arising
from the PDE discretisation has been emphasised and for the problems considered such
techniques have been discussed. In Section 2.6 it was then demonstrated that efficient
solution methods for the forward problem can be translated into efficient solution methods
for the discrete adjoint equations.
The pure Dirichlet problem of the stationary incompressible Navier-Stokes equations
allows for non-uniqueness of the pressure solution. The influence of the regularisation of
the resulting singularity of the Navier-Stokes system in the context of the discrete adjoint
method has been analysed in Section 2.6.1. A significant contribution of this work is
that it has been demonstrated that the adjoint systems possess the same properties as the
original systems, leading to the conclusion that the same techniques used in the treatment
of the singularities in the original problem can be used for the adjoint equations.
Two particularly important aspects which have been given consideration in this thesis
are the efficient solution of the discrete linearised Navier-Stokes systems and adaptive
mesh design. Conclusions regarding these topics are given in the following two short
sections. Finally, opportunities for future research resulting from the exposition in this
thesis are summarised in Section 4.3.
and decreased multigrid efficiency for the F-block inner solves, represents one open
subproblem. As a possible approach to improve on this, it would be interesting to see if
algebraic multigrid approaches [106] or $\mathcal{H}$-matrix approximate LU decompositions [15]
constitute more efficient preconditioners for the F-block systems in the $F_p$ preconditioner,
as remarked in Section 2.5.2.2. More efficient geometric multigrid techniques may also
be possible. Further, as the failure of the $F_p$ preconditioner in the finite volume
discretisation appears to result from the first order stabilisation terms, it would be interesting to
apply this preconditioner to a second order accurate finite volume discretisation.
For the second candidate approach, using geometric multigrid directly for the whole
linearised Navier-Stokes system, the construction of robust, efficient, discretisation in-
dependent smoothers remains an open problem. In particular, a good smoother for the
Taylor-Hood finite element discretisation is still unknown to the author. The discussion in
[50] indicates that the box-smoother performs best for lowest order discretisations. Thus,
one possible approach might be to construct such a low order discretisation for the sole
purpose of the smoother, i.e. using a lowest order smoother for the higher order discreti-
sations on all levels of the mesh hierarchy.
In the context of the automatic anisotropic mesh adaption approach introduced in this
work, which has been demonstrated to be feasible for a relatively small class of test prob-
lems, it is now necessary to extend the work to larger classes of PDE and a wider range of
a posteriori error estimates. It has already been demonstrated here that a key factor is to
ensure the reliability of the error estimate so as to ensure that, as the mesh becomes more
anisotropic, this does not deteriorate and lead to optimisation of a meaningless quantity.
Other problem classes that may be considered include convection-diffusion problems and
the Navier-Stokes equations, whose solutions may involve very steep layers. In terms
of error estimation techniques, the local problem estimator (e.g. [98, Section 1.3], [3,
Section 3]) appears to be a good alternative. The use of more complex geometries or
richer finite element spaces may be considered as well. In principle however all of the
techniques developed here can be extended to these situations.
Application of the most efficient adaptive mesh discretisations is of course also of
interest in the context of shape optimisation in order to keep the costs for individual per-
formance functional evaluations as low as possible. Thus, the refinement approach from
Chapter 3 could in theory also be applied for the problems in Chapter 2. Error estima-
tion for the incompressible Navier-Stokes equations using the DWR-approach has already
been developed, see for example [11, Chapter 11], so one step for the application of the
presented approach to this situation is already done. However, if adaptive meshes are
used, the consistency of the shape gradient (evaluated by means of the discrete adjoint)
and the discrete performance functional may be lost, see the discussion in Section 2.2.5.
Thus, a further topic for future research may be to see if the approaches can be combined
in such a way that this consistency is retained.
Redistribution of the mesh alone as a means of adaptive refinement has its limitations,
as discussed in Section 3.3. In addition to the combination with adaptive locally uniform
refinement (Section 3.3.6), changing the mesh connectivity (e.g. edge swapping) may
also provide a means to improve the quality of the solution approximation, see [56, 99]
for example. Further, if connectivity changes are considered, the quality of the deformed
meshes may also be improved by the removal of nodes (coarsening) in regions of small
errors. However, both these extensions have not been considered in this thesis and remain
as topics for future research.
Appendix A
This appendix is concerned with the following question: under which conditions can
it be guaranteed that the solutions of the discrete optimal shape problems converge to
solutions of the continuous optimal shape problem? In the following a key result from [44,
Section 2.4] is reproduced. The exposition is meant to provide an approach to obtaining
an answer to this question rather than giving a final answer itself. Thus not all details are
present, but an overview of the approach is given. The interested reader is referred to [44]
for a detailed discussion of the topic.
From the outset it is clear that solutions to the discrete optimisation problems may ap-
proximate different continuous solutions, if the continuous problem possesses more than
one locally optimal solution. Indeed, an optimisation algorithm for the discrete problems
may easily converge to a different local optimum after refinement of the discretisation.
This complicates the definition of meaningful convergence rates, and instead the theory
reproduced here only states that for a sequence of increasingly fine discrete solutions there
exists a subsequence converging to a solution of the continuous problem.
Let us introduce some notation required for this section and give a clear statement
of the discrete optimisation problem. Let a family of shape parameterisations with
parameter $\kappa > 0$ be defined, such that the number of shape parameters $\dim(F_\kappa)$ is
uniquely defined by $\kappa$. Let $D_\kappa$ denote the set of admissible discrete shapes, where
$D_\kappa \subset \widetilde{D}$ for some superset $\widetilde{D} \supset D$, for all $\kappa > 0$, but it is not necessary that $D_\kappa \subset D$.
Let every $F_\kappa \in D_\kappa$ uniquely define a computational domain $\Omega_{\kappa h}$, where the domain
discretisation parameter $h > 0$ is a monotone function of $\kappa$ such that
\[
h \to +0 \iff \kappa \to +0. \tag{A.0.1}
\]
Appendix A 172 Convergence of the disc. optimal shape
A distinction between discrete shape and computational domain is made to allow,
for example, for B-spline definitions of the discrete shape while the computational domain
may be polygonal with a triangulation of discretisation length-scale $h$. Let $V(\Omega)$ denote
the function space associated with the weak formulation of the PDE (2.1.1b), (2.1.1c)
on domain $\Omega$ and let $V_h(\Omega_{\kappa h})$ denote a finite dimensional function space (finite element
space) such that $V_h(\Omega_{\kappa h}) \subset V(\Omega_{\kappa h})$. Then let $u_h := u_h(F_\kappa) := u_h(\Omega_{\kappa h}) \in V_h(\Omega_{\kappa h})$
denote the uniquely defined discrete approximation to the unique solution of the PDE
(2.1.1b), (2.1.1c) on the computational domain $\Omega_{\kappa h}$ corresponding to discrete shape $F_\kappa$.
Thus, for a given discrete shape $F_\kappa \in D_\kappa$ the following chain of mappings defines the
approximated performance,
which in turn defines (locally) optimal discrete solutions $F_\kappa^*$, as well as the corresponding
$\Omega_{\kappa h}^*$ and $u_h^* := u_h(F_\kappa^*)$.
In order to analyse convergence properties of (A.0.3), the following assumptions are
introduced.
\[
\Omega_{\kappa h} \longrightarrow \Omega \quad \text{as } \kappa, h \to +0.
\]
Assumption 2 (Compactness).
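For any sequence of discrete shapes $\{F_\kappa\}$, $F_\kappa \in D_\kappa$, with corresponding computational
domains $\{\Omega_{\kappa h}\}$, there exist subsequences $F_{\kappa_j}$ with corresponding $\Omega_{\kappa_j h_j}$ and an
\[
\Omega_{\kappa_j h_j} \longrightarrow \Omega
\quad \text{and} \quad
u_{h_j}(\Omega_{\kappa_j h_j}) \longrightarrow u(\Omega) \quad \text{as } j \to \infty.
\]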
\[
\Omega(F) \subset \widetilde{\Omega} \quad \forall F \in \widetilde{D}.
\]
Theorem 1. Let assumptions 1, 2 and 3 be satisfied. Then for any sequence $\{(F_\kappa^*, u_h(F_\kappa^*))\}$
of optimal pairs of (A.0.3), $\kappa \to +0$, there exists a subsequence $\{(F_{\kappa_j}^*, u_h(F_{\kappa_j}^*))\}$ such
that
\[
\begin{cases}
\Omega_{\kappa_j h_j}(F_{\kappa_j}^*) \longrightarrow \Omega(F^*), \\
u_{h_j}(\Omega_{\kappa_j h_j}(F_{\kappa_j}^*)) \longrightarrow u(\Omega(F^*))
\end{cases}
\tag{A.0.4}
\]
for $j \to \infty$. In addition, $(F^*, u(F^*))$ is an optimal pair of (2.1.1). Any accumulation
point of $\{(F_\kappa^*, u_h(F_\kappa^*))\}$ in the sense of (A.0.4) possesses this property.
Discussion. In order to apply this theory to the shape optimisation problem for the sta-
tionary incompressible Navier-Stokes (INS) equations it has to be verified that the as-
sumptions made are indeed fulfilled for this case. Before we comment on assumptions
1, 2 and 3, the question of existence and uniqueness of the solution to the INS equa-
tions should be considered. Sufficient criteria for existence and uniqueness of solutions
are presented, for example, in [93, Section 10] for the continuous problem and [40] may
be consulted for the discretised (FEM) problems. Even though these conditions are not
explicitly verified here, they give a good indication that, for the relatively low Reynolds
number regimes considered in this work, unique solutions should exist.
Assumption 1, density of the discrete shapes, is clearly fulfilled if the shape parame-
terisations are chosen appropriately. For sufficiently smooth variable boundary sections,
spline approximations allow arbitrarily close approximation, as does the subsequent
interpolation by the piecewise linear functions which results from triangulation of the domain. The
continuity of I, Assumption 3, is a natural requirement on the performance functional I.
If it were not fulfilled, optimal solutions would be practically useless, as even the slightest
perturbation of the parameters, e.g. due to inaccuracies of a manufacturing process,
could result in dramatically different performance. All the performance criteria used in
this thesis fulfil this property.
Verifying Assumption 2, compactness, is rather technical. In [44, Section 2.5] this is
demonstrated for the Poisson equation with various boundary conditions and also for a
stream-function formulation of the Stokes equations.
It is worth noting that the above theory requires that the PDE discretisation parameter
$h$ goes to zero as the shape discretisation parameter $\kappa$ goes to zero. An obvious
requirement is that the triangulations of the domains $\Omega(F_\kappa)$ are sufficiently fine that changes in
the shape parameters $F_\kappa$ do actually affect the computational domains $\Omega_{\kappa h}$. Moreover,
the meshes should be sufficiently fine that the resulting changes in the continuous
solution $u(\Omega(F_\kappa))$ are reasonably well resolved by the discrete solutions $u_h(\Omega_{\kappa h})$.
Of course taking the limit κ → +0 is mainly a theoretical approach and in practical
situations bounds on computational resources mean that only a certain finest κ0 > 0 can
be achieved at reasonable cost. However, the result indicates that good approximations to
solutions may be obtained if a sufficiently small discretisation parameter κ can be used.
Appendix B
In this appendix we demonstrate that the system that arises when the last row of (2.5.2)
is replaced by the zero mean pressure condition (ZMPC) is equivalent to the augmented
system that is obtained when the ZMPC is applied using a Lagrange multiplier approach
(2.5.21). This is expressed as the following proposition where, for clarity, we assume
$\sum_{i=1}^{m} w_i = 1$ (which is always achievable with appropriate scaling) and that Dirichlet
boundary conditions are incorporated explicitly (see below).
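The flavour of this equivalence can be illustrated numerically on a much simpler analogue. The sketch below is an assumption-laden toy, not the Navier-Stokes system (2.5.2): it uses a symmetric singular matrix (a path-graph Laplacian, whose null space consists of constant vectors, mimicking the undetermined pressure constant) with a compatible right-hand side, and compares replacing the last equation by a weighted zero-mean condition with the Lagrange multiplier formulation. All variable names are hypothetical.

```python
import numpy as np

n = 6
# Path-graph Laplacian: symmetric and singular, null space spanned by constants
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A[0, 0] = A[-1, -1] = 1.0

rng = np.random.default_rng(0)
b = rng.standard_normal(n)
b -= b.mean()                 # compatibility: b is orthogonal to the null space
w = rng.uniform(0.5, 1.5, n)
w /= w.sum()                  # weights scaled so that sum(w) = 1

# (a) Replace the last equation by the weighted zero-mean condition w^T p = 0
A_row = A.copy()
A_row[-1, :] = w
b_row = b.copy()
b_row[-1] = 0.0
p_row = np.linalg.solve(A_row, b_row)

# (b) Augmented system with Lagrange multiplier lam:
#     [A   w] [p  ]   [b]
#     [w^T 0] [lam] = [0]
A_aug = np.block([[A, w[:, None]],
                  [w[None, :], np.zeros((1, 1))]])
sol = np.linalg.solve(A_aug, np.append(b, 0.0))
p_lag, lam = sol[:-1], sol[-1]

print("max difference between the two solutions:", np.max(np.abs(p_row - p_lag)))
print("Lagrange multiplier:", lam)
```

In this toy setting the multiplier vanishes because the right-hand side is compatible, which plays the role that consistency of the Dirichlet data with mass conservation plays in the proposition below.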
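Proposition 2. Provided the Dirichlet boundary conditions are consistent with mass
conservation, i.e. $\int_{\partial\Omega} n^T u \, d\Gamma = 0$, the linear system
\[
\begin{bmatrix}
F_{1,1\mathrm{int}} & F_{1,1\mathrm{bc}} & F_{1,2\mathrm{int}} & F_{1,2\mathrm{bc}} & B_{1\mathrm{int}}^T \\
0 & I_{1\mathrm{bc}} & 0 & 0 & 0 \\
F_{2,1\mathrm{int}} & F_{2,1\mathrm{bc}} & F_{2,2\mathrm{int}} & F_{2,2\mathrm{bc}} & B_{2\mathrm{int}}^T \\
0 & 0 & 0 & I_{2\mathrm{bc}} & 0 \\
\widetilde{B}_{1\mathrm{int}} & \widetilde{B}_{1\mathrm{bc}} & \widetilde{B}_{2\mathrm{int}} & \widetilde{B}_{2\mathrm{bc}} & 0 \\
0 & 0 & 0 & 0 & w^T
\end{bmatrix}
\begin{bmatrix}
u_{1\mathrm{int}} \\ u_{1\mathrm{bc}} \\ u_{2\mathrm{int}} \\ u_{2\mathrm{bc}} \\ p
\end{bmatrix}
=
\begin{bmatrix}
f_{1\mathrm{int}} \\ f_{1\mathrm{bc}} \\ f_{2\mathrm{int}} \\ f_{2\mathrm{bc}} \\ 0 \\ 0
\end{bmatrix}
\tag{B.0.1}
\]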
Appendix B 176 Equiv. of lin. Navier-Stokes systems
where the subscript $(\cdot)_{*\mathrm{int}}$ denotes a block corresponding to interior nodes, $(\cdot)_{*\mathrm{bc}}$ a block
corresponding to Dirichlet nodes, $u_{1*}$ the expansion coefficient of the first velocity
component with respect to the FE ansatz functions and $u_{2*}$ the second component. The
equations are ordered analogously to the degrees of freedom. The vectors $f_{1\mathrm{bc}}$ and $f_{2\mathrm{bc}}$ store
the Dirichlet values for the first and second components of $u$ on the boundary.
Then the first four and the last block rows of (B.0.2) are clearly satisfied. Furthermore,
the first $N_p - 1$ equations of the fifth block row of (B.0.2) are also immediately satisfied,
where $N_p$ denotes the number of (linear) pressure ansatz functions. Hence we need only
consider the $N_p$-th equation in this block. The left-hand side of this is
\[
\int_\Omega q_{N_p} (\nabla \cdot u_h) \, d\Omega.
\]
Hence
\begin{align*}
\int_\Omega q_{N_p} (\nabla \cdot u_h) \, d\Omega
&= \sum_{i=1}^{N_p} \int_\Omega q_i (\nabla \cdot u_h) \, d\Omega
 = \int_\Omega \Big( \sum_{i=1}^{N_p} q_i \Big) (\nabla \cdot u_h) \, d\Omega \\
&= \int_\Omega 1 \, (\nabla \cdot u_h) \, d\Omega
 = \sum_{T \in \mathcal{T}} \int_T \nabla \cdot u_h \, d\Omega \\
&= \sum_{T \in \mathcal{T}} \int_{\partial T} n^T u_h \, d\Gamma
 = \int_{\partial\Omega} n^T u_h \, d\Gamma \\
&= 0
\end{align*}
(due to the Dirichlet boundary conditions). All the equations in (B.0.2) are therefore
satisfied by (B.0.3).
Next suppose that (B.0.2) holds. Then the $i$-th equation in the fifth block row of (B.0.2)
is
\[
\int_\Omega q_i (\nabla \cdot u_h) \, d\Omega + \lambda w_i = 0 \quad \forall i = 1, \dots, N_p,
\]
but, using the same argument as above, the Dirichlet boundary conditions imply that
$\lambda = 0$. Given that $\lambda = 0$ it follows trivially that
[2] M. Ainsworth and D. Kay. The approximation theory for the p-version finite ele-
ment method and application to nonlinear elliptic PDEs. Numerische Mathematik,
82(3):351–388, 1999.
[3] M. Ainsworth and J.T. Oden. A Posteriori Error Estimation in Finite Element
Analysis. Wiley, 2000.
[4] M. Ainsworth and W. Senior. An adaptive refinement strategy for h-p finite element
computations. Applied Numerical Mathematics, 26:165–178, 1997.
[6] W.K. Anderson, R.D. Rausch, and D.L. Bonhaus. Implicit/multigrid algorithms for
incompressible turbulent flows on unstructured grids. Journal of Computational
Physics, 128(2):391–408, 1996.
[7] T. Apel, S. Grosman, P.K. Jimack, and A. Meyer. A new methodology for
anisotropic mesh refinement based upon error gradients. Applied Numerical Math-
ematics, 50:329–341, 2004.
[8] T. Apel and G. Lube. Anisotropic mesh refinement for a singularly perturbed re-
action diffusion model problem. Applied Numerical Mathematics, 26:415–433,
1998.
[9] I. Babuska, B.A. Szabo, and I.N. Katz. The p-version of the finite element method.
SIAM Journal on Numerical Analysis, 18:515–545, 1981.
[10] M.J. Baines. Grid adaptation via node movement. Applied Numerical Mathematics,
26:77–96, 1998.
179 BIBLIOGRAPHY
[11] W. Bangerth and R. Rannacher. Adaptive Finite Element Methods for Differential
Equations. Birkhäuser Verlag, 2003.
[12] R.E. Bank and R.K. Smith. Mesh smoothing using a posteriori error estimates.
SIAM Journal on Numerical Analysis, 34(3):979–997, 1997.
[13] R.E. Bank and A. Weiser. Some a posteriori error estimates for elliptic partial
differential equations. Mathematics of Computation, 44:283–301, 1985.
[16] L.T. Biegler, O. Ghattas, M. Heinkenschloss, and B. van Bloemen Waanders, edi-
tors. Large-Scale PDE-Constrained Optimization. Springer, 2003.
[17] H. Blank, M. Rudgyard, and A. Wathen. Stabilised finite element methods for
steady incompressible flow. Computer Methods in Applied Mechanics and Engi-
neering, 174:91–105, 1999.
[18] D. Braess. Finite Elemente – Theorie, schnelle Löser und Anwendungen in der
Elastizitätstheorie. Springer Lehrbuch, Berlin, 1997.
[19] D. Braess. Finite Elemente – Theorie, schnelle Löser und Anwendungen in der
Elastizitätstheorie. Springer Lehrbuch, Berlin, 3rd edition, 2003.
[20] J.H. Bramble, J.E. Pasciak, and J. Xu. Parallel multilevel preconditioners. Mathe-
matics of Computation, 55(191):1–22, July 1990.
[21] P.G. Ciarlet. The Finite Element Method for Elliptic Problems. Classics in Applied
Mathematics. SIAM, Philadelphia, 2002.
[22] A.L. Codd, T.A. Manteuffel, and S.F. McCormick. Multilevel first-order system
least squares for nonlinear elliptic partial differential equations. SIAM Journal on
Numerical Analysis, 46(6):2197–2209, 2003.
[23] R. Codina. On stabilized finite element methods for linear systems of convection-
diffusion-reaction equations. Computer Methods in Applied Mechanics and Engi-
neering, 188:61–82, 2000.
[24] A.R. Conn, K. Scheinberg, and P.L. Toint. Recent progress in unconstrained non-
linear optimization without derivatives. Mathematical Programming, 79:397–414,
1997.
[26] J.W. Demmel, S.C. Eisenstat, J.R. Gilbert, X.S. Li, and J.W.H. Liu. A supernodal
approach to sparse partial pivoting. SIAM Journal on Matrix Analysis and Applications,
20(3):720–755, 1999. Software available at
http://crd.lbl.gov/~xiaoye/SuperLU.
[27] H.C. Elman. Preconditioning for the steady-state Navier-Stokes equations with low
viscosity. SIAM Journal on Scientific Computing, 20:1299–1316, 1999.
[28] H.C. Elman. Preconditioning strategies for models of incompressible flow. Technical
Report UMCP-CSD:CS-TR-4543, UMIACS, University of Maryland, 2003.
Available at
http://www.cs.umd.edu/Library/TRs/CS-TR-4543/CS-TR-4543.ps.
[29] H.C. Elman, D. Loghin, and A.J. Wathen. Preconditioning techniques for New-
ton’s method for the incompressible Navier-Stokes equations. BIT, 43(5):961–974,
2003.
[30] G. Farin. Curves and Surfaces for Computer Aided Geometric Design: A Practical
Guide. Academic Press, 1988.
[31] J.H. Ferziger and M. Peric. Computational Methods for Fluid Dynamics. Springer,
1996.
[33] L.P. Franca and F. Valentin. On an improved unusual stabilized finite element
method for the advective-reactive-diffusive equation. Computer Methods in Applied
Mechanics and Engineering, 190:1785–1800, 2000.
[34] T. Gelhard, G. Lube, M.A. Olshanskii, and J.-H. Starcke. Stabilized finite element
schemes with LBB-stable elements for incompressible flows. Journal of Compu-
tational and Applied Mathematics, 177:243–267, 2005.
[35] U. Ghia, K. N. Ghia, and C. T. Shin. High-Re solutions for incompressible flow us-
ing the Navier-Stokes equations and a multigrid method. Journal of Computational
Physics, 48:387–411, 1982.
[36] M.B. Giles and N.A. Pierce. An introduction to the adjoint approach to design.
Flow Turbulence and Combustion, 65(3–4):393–415, 2000.
[37] M.B. Giles, N.A. Pierce, and E. Süli. Progress in adjoint error correction for inte-
gral functionals. Computing and Visualisation in Science, 6:113–121, 2004.
[38] P.M. Gresho, R.L. Sani, and M.S. Engelman. Incompressible Flow and the Finite
Element Method: Advection-Diffusion and Isothermal Laminar Flow. Wiley, 1998.
[40] M.D. Gunzburger. Finite Element Methods for Viscous Incompressible Flows. Aca-
demic Press, 1989.
[41] M.D. Gunzburger. Perspectives in Flow Control and Optimization. SIAM, 2003.
[43] J. Hämäläinen, R.A.E. Mäkinen, and P. Tarvainen. Optimal design of paper ma-
chine headboxes. International Journal for Numerical Methods in Fluids, 34:685–
700, 2000.
[45] G. Hauke. A simple subgrid scale stabilized method for the advection-diffusion-
reaction equation. Computer Methods in Applied Mechanics and Engineering,
191:2925–2947, 2002.
[48] P.K. Jimack. A best approximation property of the moving finite element method.
SIAM Journal on Numerical Analysis, 33(6):2286–2302, December 1996.
[49] P.K. Jimack and A.J. Wathen. Temporal derivatives in the finite-element method on
continuously deforming grids. SIAM Journal on Numerical Analysis, 28(4):990–
1003, August 1991.
[52] D. Kay, D. Loghin, and A. Wathen. A preconditioner for the steady-state
Navier-Stokes equations. SIAM Journal on Scientific Computing, 24(1):237–256, 2002.
[53] G. Kunert. Toward anisotropic mesh construction and error estimation in the
finite element method. Numerical Methods for Partial Differential Equations,
18(5):625–648, 2002.
[55] G. Kunert and R. Verfürth. Edge residuals dominate a posteriori error estimates
for linear finite element methods on anisotropic triangular and tetrahedral meshes.
Numerische Mathematik, 86:283–303, 2000.
[56] J. Lang, W. Cao, W. Huang, and R.D. Russell. A two-dimensional moving finite
element method with local refinement based on a posteriori error estimates. Applied
Numerical Mathematics, 46:75–94, 2003.
[57] A. Meyer. Projection techniques embedded in the PCGM for handling hanging
nodes and boundary restrictions. In B.H.V. Topping and Z. Bittnar, editors,
Engineering Computational Technology, pages 147–165. Saxe-Coburg Publications,
2002.
[59] J.A. Nelder and R. Mead. A simplex method for function minimization. The
Computer Journal, 7:308–313, 1965.
[61] J. Nocedal and S.J. Wright. Numerical Optimization. Springer Series in Operations
Research. Springer, 1999.
[64] S.V. Patankar and D.B. Spalding. Calculation procedure for heat, mass and
momentum-transfer in 3-dimensional parabolic flows. International Journal of
Heat and Mass Transfer, 15(10):1787–1806, 1972.
[65] A.R. Paterson. A First Course in Fluid Dynamics. Cambridge University Press,
1983.
[66] O. Pironneau. Optimal Shape Design for Elliptic Systems. Springer-Verlag, 1984.
[69] W.C. Rheinboldt and C.K. Mesztenyi. On a data structure for adaptive finite ele-
ment mesh refinements. ACM Transactions on Mathematical Software, 6(2):166–
187, 1980.
[71] H.G. Roos. Layer-adapted grids for singular perturbation problems. Zeitschrift für
Angewandte Mathematik und Mechanik (ZAMM), 78(5):291–309, 1998.
[73] Y. Saad. Iterative Methods for Sparse Linear Systems, Second Edition. SIAM
(Society for Industrial and Applied Mathematics), 2003.
[74] A. Schmidt and K.G. Siebert. Albert - software for scientific computations and
applications. Acta Mathematica Universitatis Comenianae, 70:105–122, 2001.
[75] R. Schneider and P.K. Jimack. Efficient preconditioning of the discrete adjoint
equations for the incompressible Navier-Stokes equations. International Journal
for Numerical Methods in Fluids, 47:1277–1283, 2005.
[76] R. Schneider and P.K. Jimack. Toward anisotropic mesh adaption based upon
sensitivity of a posteriori estimates. School of Computing Research Report Series
2005.03, University of Leeds, 2005. Available at
http://www.comp.leeds.ac.uk/research/pubs/reports/2005/2005_03.pdf.
[78] C. Schwab and M. Suri. The p and hp versions of the finite element method
for problems with boundary layers. Mathematics of Computation, 65:1402–1429,
1996.
[79] J. Schöberl and W. Zulehner. On Schwarz-type smoothers for saddle point prob-
lems. Numerische Mathematik, 95:377–399, 2003.
[80] J.R. Shewchuk. Triangle: Engineering a 2D Quality Mesh Generator and Delaunay
Triangulator. In M.C. Lin and D. Manocha, editors, Applied Computational Geometry:
Towards Geometric Engineering, volume 1148 of Lecture Notes in Computer
Science, pages 203–222. Springer-Verlag, 1996. Article and software available at
http://www.cs.cmu.edu/~quake/triangle.html.
[83] W. Speares and M. Berzins. A 3-d unstructured mesh adaptation algorithm for
time-dependent shock dominated problems. International Journal for Numerical
Methods in Fluids, 25:81–104, 1997.
[86] L.G. Stanley and D.L. Stewart. Design Sensitivity Analysis. SIAM, 2002.
[87] K. Stein, T. Tezduyar, and R. Benney. Mesh moving techniques for fluid-structure
interactions with large displacements. Journal of Applied Mechanics, 70:58–63,
2003.
[88] G. Strang and G.J. Fix. An Analysis of the Finite Element Method. Prentice-Hall,
1973.
[89] J.C. Strikwerda. Finite Difference Schemes and Partial Differential Equations.
Wadsworth & Brooks/Cole, 1989.
[96] S. Turek. Efficient Solvers for Incompressible Flow Problems, An Algorithmic and
Computational Approach. Lecture Notes in Computational Science and
Engineering. Springer, 1999.
[99] M. Walkley, P.K. Jimack, and M. Berzins. Anisotropic adaptivity for finite
element solutions of 3-d convection-dominated problems. International Journal for
Numerical Methods in Fluids, 40:551–559, 2002.
[100] W. Wang. Special bilinear quadrilateral elements for locally refined finite element
grids. SIAM Journal on Scientific Computing, 22(6):2029–2050, 2001.
[101] A. Wathen, D. Loghin, D. Kay, H.C. Elman, and D. Silvester. A preconditioner for
the 3D Oseen equations. Technical Report NA-02/04, Oxford University
Computing Laboratory, February 2002. Available at
http://web.comlab.ox.ac.uk/oucl/publications/natr/na-02-04.html.
[102] A. Wathen and D. Silvester. Fast iterative solution of stabilised Stokes systems,
part I: Using simple diagonal preconditioners. SIAM Journal on Numerical
Analysis, 30(3):630–649, June 1993.
[103] A. Wathen and D. Silvester. Fast iterative solution of stabilised Stokes systems,
part II: Using general block preconditioners. SIAM Journal on Numerical Analysis,
31(5):1352–1367, October 1994.
[104] A.J. Wathen. Realistic eigenvalue bounds for the Galerkin mass matrix. IMA
Journal of Numerical Analysis, 7:449–457, 1987.
[106] C.T. Wu and H.C. Elman. Analysis and comparison of geometric and algebraic
multigrid for convection-diffusion equations. Technical report, University of
Maryland, 2004. Available at
http://www.cs.umd.edu/~elman/papers/mg-amg-paper.pdf.
[107] O.C. Zienkiewicz and J.Z. Zhu. A simple error estimator and adaptive procedure
for practical engineering analysis. International Journal for Numerical Methods in
Engineering, 24:337–357, 1987.
Index
parametric meshes, 23
partial differential equation (PDE), 8
PDE, 2, 8
Picard iteration, 29
preconditioning, 45
projected preconditioner, 68
projected smoothers, 69
restriction, 45
SDFEM, 81
sequential quadratic programming (SQP), 18
shape discretisation, 14
shape optimisation, 8
smoother, 45
smoothing, 45
SQP, 18
stabilisation, 56
stabilisation parameter δ, 57
stabilised discretisations, 56
stationary, 11
Stokes equations, 12
Streamline Diffusion Finite Element Method (SDFEM), 81
streamline-upwind Petrov Galerkin (SUPG), 56
SUPG, 56